DOI: 10.1145/3459637.3482342

Fairness-Aware Training of Decision Trees by Abstract Interpretation

Published: 30 October 2021

Abstract

We study the problem of formally verifying individual fairness of decision tree ensembles, as well as training tree models that maximize both accuracy and individual fairness. In our approach, fairness verification and fairness-aware training both rely on a notion of stability of a classifier, which generalizes the standard notion of robustness to input perturbations used in adversarial machine learning. Our verification and training methods leverage abstract interpretation, a well-established mathematical framework for designing computable, correct, and precise approximations of potentially infinite behaviors. We implemented our fairness-aware learning method by building on a tool for adversarial training of decision trees, and we evaluated it on the reference datasets of the fairness-in-machine-learning literature. The experimental results show that our approach trains tree models exhibiting a high degree of individual fairness compared with standard state-of-the-art CART trees and random forests. Moreover, as a by-product, these fairness-aware decision trees turn out to be significantly more compact, which naturally enhances their interpretability.
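
The stability notion at the heart of both the verification and the training method can be illustrated on a single decision tree: abstract the similarity neighborhood of an input as a box (one interval per feature), propagate the box through the tree, and collect every leaf label it can reach; if only one label is reachable, the classifier is provably stable (hence individually fair) on that neighborhood. The following is a minimal sketch of this idea using the interval abstract domain; the dict-based tree encoding and the names `reachable_labels` and `is_stable` are illustrative, not the paper's actual tool:

```python
def reachable_labels(node, box):
    """Collect all leaf labels reachable from the hyperrectangle `box`,
    given as a dict mapping feature name -> (lo, hi) interval."""
    if "label" in node:                      # leaf node
        return {node["label"]}
    lo, hi = box[node["feature"]]
    t = node["threshold"]
    labels = set()
    if lo <= t:                              # some point in the box goes left (x <= t)
        left_box = dict(box)
        left_box[node["feature"]] = (lo, min(hi, t))
        labels |= reachable_labels(node["left"], left_box)
    if hi > t:                               # some point in the box goes right (x > t)
        right_box = dict(box)
        right_box[node["feature"]] = (max(lo, t), hi)
        labels |= reachable_labels(node["right"], right_box)
    return labels

def is_stable(tree, x, eps):
    """Sound check: True guarantees that every x' with |x' - x|_inf <= eps
    receives the same label as x. Being an over-approximation, the check
    may conservatively return False on some stable inputs."""
    box = {f: (v - eps, v + eps) for f, v in x.items()}
    return len(reachable_labels(tree, box)) == 1

# Hypothetical one-split tree over a single 'income' feature:
tree = {"feature": "income", "threshold": 50,
        "left": {"label": 0}, "right": {"label": 1}}
```

With this toy tree, a point far from the split threshold (e.g. income 60 with eps 5) is certified stable, while a point whose neighborhood straddles the threshold (income 52 with eps 5) is not: both leaf labels are reachable from its box.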


Cited By

  • (2024) Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey. ACM Journal on Responsible Computing 1(2), 1–52. https://doi.org/10.1145/3631326
  • (2024) An Interpretable Multivariate Time-Series Anomaly Detection Method in Cyber–Physical Systems Based on Adaptive Mask. IEEE Internet of Things Journal 11(2), 2728–2740. https://doi.org/10.1109/JIOT.2023.3293860
  • (2024) Robustness verification of k-nearest neighbors by abstract interpretation. Knowledge and Information Systems 66(8), 4825–4859. https://doi.org/10.1007/s10115-024-02108-4
  • (2024) Abstract Interpretation-Based Feature Importance for Support Vector Machines. Verification, Model Checking, and Abstract Interpretation, 27–49. https://doi.org/10.1007/978-3-031-50524-9_2
  • (2023) Explainable Global Fairness Verification of Tree-Based Classifiers. 2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), 1–17. https://doi.org/10.1109/SaTML54575.2023.00011
  • (2023) Robustness Certification of k-Nearest Neighbors. 2023 IEEE International Conference on Data Mining (ICDM), 110–119. https://doi.org/10.1109/ICDM58522.2023.00020

Published In

CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
October 2021, 4966 pages
ISBN: 9781450384469
DOI: 10.1145/3459637

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. decision tree
  2. fairness verification
  3. fairness-aware learning
  4. individual fairness
  5. machine learning

Qualifiers

  • Research-article

Funding Sources

  • Facebook Research award
  • University of Padova
  • Italian Ministry of University and Research

Conference

CIKM '21
Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions (22%)


Article Metrics

  • Downloads (last 12 months): 68
  • Downloads (last 6 weeks): 3

Reflects downloads up to 23 Sep 2024.
