Abstract
We study fairness of scoring in online job marketplaces. We focus on group fairness and aim to algorithmically explore how a scoring function, through which individuals are ranked for jobs, treats different demographic groups. Previous work on group-level fairness has focused on the case where groups are pre-defined or where they are defined using a single protected attribute (e.g., whites vs. blacks or males vs. females). In this article, we argue for the need to examine fairness for groups of people defined by any combination of protected attributes (the so-called subgroup fairness). Existing work also assumes the availability of workers' data (i.e., data transparency) and of the scoring function (i.e., process transparency). We relax these assumptions in this work and run user studies to assess the effect of different data and process transparency settings on the ability to assess fairness.
To quantify the fairness of a scoring of a group of individuals, we formulate an optimization problem to find a partitioning of those individuals on their protected attributes that exhibits the highest unfairness with respect to the scoring function. The scoring function yields one score histogram per group in the partitioning, and we rely on the Earth Mover's Distance, a measure commonly used to compare histograms, to quantify unfairness. Since the number of ways to partition individuals is exponential in the number of their protected attributes, we propose a heuristic algorithm to navigate the space of all possible partitionings and identify the one with the highest unfairness. We evaluate our algorithm using a simulation of a crowdsourcing platform and show that it can effectively quantify unfairness of various scoring functions. We additionally run experiments to assess the applicability of our approach in other, less transparent data and process settings. Finally, we demonstrate the effectiveness of our approach in assessing fairness of scoring on a real dataset crawled from the online job marketplace TaskRabbit.
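The core quantification step described above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes workers are records with a `score` field and categorical protected attributes, enumerates candidate partitionings exhaustively (the article instead proposes a heuristic to avoid this exponential enumeration), and measures unfairness as the largest pairwise 1-D Earth Mover's Distance between per-group score histograms.

```python
import numpy as np
from itertools import combinations

def emd(p, q):
    # 1-D Earth Mover's Distance between two normalized histograms
    # over the same bins: the L1 distance between their CDFs.
    return np.abs(np.cumsum(p - q)).sum()

def histogram(scores, bins):
    h, _ = np.histogram(scores, bins=bins)
    return h / max(h.sum(), 1)  # normalize to a distribution

def max_unfairness(workers, protected, bins):
    """Exhaustive sketch: for every subset of protected attributes,
    split workers into groups by their values on that subset and
    return the largest pairwise EMD between group score histograms,
    together with the attribute subset that induced it."""
    worst = (0.0, None)
    for r in range(1, len(protected) + 1):
        for subset in combinations(protected, r):
            groups = {}
            for w in workers:
                key = tuple(w[a] for a in subset)
                groups.setdefault(key, []).append(w["score"])
            hists = [histogram(np.array(s), bins) for s in groups.values()]
            for i in range(len(hists)):
                for j in range(i + 1, len(hists)):
                    d = emd(hists[i], hists[j])
                    if d > worst[0]:
                        worst = (d, subset)
    return worst
```

For example, on four workers where males are scored high and females low, `max_unfairness(workers, ["gender"], np.linspace(0, 1, 6))` flags `("gender",)` as the most unfair partitioning. The exhaustive loop over attribute subsets is exactly what makes the problem exponential and motivates the heuristic search.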
Index Terms
Fairness of Scoring in Online Job Marketplaces