ABSTRACT
Controversies around race and machine learning have sparked debate among computer scientists over how to design machine learning systems that guarantee fairness. These debates rarely engage with how racial identity is embedded in our social experience, which gives it sociological and psychological complexity. This complexity challenges the paradigm of treating fairness as a formal property of supervised learning with respect to protected personal attributes. Racial identity is not simply a personal subjective quality. For people labeled "Black," it is an ascribed political category whose consequences for social differentiation are embedded in systemic patterns of social inequality produced through both social and spatial segregation. In the United States, racial classification is best understood as a system of inherently unequal status categories that places whites in the most privileged category while marking the Negro/black category as stigmatized. Social stigma is maintained through the unequal distribution of societal rewards and goods along racial lines, a distribution reinforced by state, corporate, and civic institutions and practices. This creates a dilemma for society and designers: either be blind to racial group disparities and thereby reify racialized social inequality by no longer measuring systemic inequality, or be conscious of racial categories in a way that itself reifies race. We propose a third option. By preceding group fairness interventions with unsupervised learning that dynamically detects patterns of segregation, machine learning systems can mitigate the root cause of social disparities (social segregation and stratification) without further anchoring status categories of disadvantage.
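The closing proposal describes a concrete pipeline: an unsupervised step first detects segregated groupings from relational data, and only then is a group fairness intervention applied to the detected groupings rather than to ascribed racial categories. Below is a minimal sketch of that idea, assuming a social graph in networkx, modularity-based community detection as the unsupervised step, and Newman-style assortativity as the segregation measure; the function names and toy graph are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch (not the authors' implementation): detect patterns of
# segregation in a social graph with unsupervised community detection, then
# quantify them, before any downstream group-fairness intervention.
import networkx as nx
from networkx.algorithms import community


def detect_segregated_groups(G):
    """Partition the graph into densely connected communities, used here as a
    stand-in for dynamically detected (rather than ascribed) groups."""
    groups = community.greedy_modularity_communities(G)
    return {node: idx for idx, grp in enumerate(groups) for node in grp}


def segregation_score(G, membership):
    """Assortativity of the detected grouping: values near 1 mean ties stay
    within groups, i.e., a strongly segregated network."""
    nx.set_node_attributes(G, membership, "group")
    return nx.attribute_assortativity_coefficient(G, "group")


if __name__ == "__main__":
    # Toy network: two dense clusters joined by a single bridging edge.
    G = nx.barbell_graph(10, 0)
    membership = detect_segregated_groups(G)
    print("detected groups:", sorted(set(membership.values())))
    print("segregation (assortativity):", round(segregation_score(G, membership), 3))
    # The detected group labels, not fixed racial categories, would then be
    # passed to a group-fairness constraint (e.g., equalized odds) downstream.
```

In this sketch the fairness intervention never sees a protected attribute; it operates on groupings inferred from observed patterns of association, which is one way to read the paper's proposed third option.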