ABSTRACT
The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors, which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving its useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
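The geometric idea behind the debiasing can be sketched in a few lines. Below is a toy, simplified version of the "neutralize" step: a gender direction is estimated from a definitional pair (the paper uses several pairs together with PCA), and the component of a gender-neutral word vector along that direction is projected out. The three-dimensional vectors here are hand-made for illustration, not real embeddings.

```python
# Minimal sketch of neutralizing a gender-neutral word vector.
# Toy 3-d vectors; real embeddings have hundreds of dimensions.

def sub(u, v):   return [a - b for a, b in zip(u, v)]
def dot(u, v):   return sum(a * b for a, b in zip(u, v))
def scale(u, c): return [a * c for a in u]
def unit(u):
    n = dot(u, u) ** 0.5
    return [a / n for a in u]

# Estimate the gender direction from one definitional pair
# (the paper aggregates several pairs such as she-he, woman-man).
she = [0.9, 0.1, 0.3]
he  = [0.1, 0.9, 0.3]
g = unit(sub(she, he))                 # gender direction

# A gender-neutral occupation word that has picked up a gender component.
receptionist = [0.6, 0.2, 0.5]

# Neutralize: subtract the projection onto g, then renormalize.
proj = scale(g, dot(receptionist, g))
debiased = unit(sub(receptionist, proj))

print(abs(dot(debiased, g)))           # ~0.0: no gender component remains
```

Gender-definitional words such as queen are left untouched by this step, which is how the queen-female association is preserved while the receptionist-female association is removed.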