research-article

To "See" is to Stereotype: Image Tagging Algorithms, Gender Recognition, and the Accuracy-Fairness Trade-off

Published: 05 January 2021

Abstract

Machine-learned computer vision algorithms for tagging images are increasingly used by developers and researchers, having been popularized as easy-to-use "cognitive services." Yet these tools struggle with gender recognition, particularly when processing images of women, people of color, and non-binary individuals. Socio-technical researchers have cited data bias as a key problem: training datasets often over-represent images of people and contexts that convey social stereotypes. The social psychology literature explains that people learn social stereotypes, in part, by observing others in particular roles and contexts, and can inadvertently learn to associate gender with scenes, occupations, and activities. We therefore study the extent to which image tagging algorithms mimic this phenomenon. We design a controlled experiment to examine the interdependence between algorithmic recognition of context and of the depicted person's gender. In the spirit of auditing to understand machine behaviors, we create a highly controlled dataset of images of people superimposed on gender-stereotyped backgrounds. Our methodology is reproducible and our code is publicly available. Evaluating five proprietary algorithms, we find that in three of them, gender inference is hindered when a background is introduced. Of the two that "see" both backgrounds and gender, the one whose output is most consistent with human stereotyping processes is superior in recognizing gender. We discuss the accuracy-fairness trade-off, as well as the importance of auditing black boxes to better understand this double-edged sword.


Published in

Proceedings of the ACM on Human-Computer Interaction, Volume 4, Issue CSCW3
CSCW
December 2020
1822 pages
EISSN: 2573-0142
DOI: 10.1145/3446568

Copyright © 2021 Owner/Author

Publisher

Association for Computing Machinery
New York, NY, United States
