research-article
Open Access

Probabilistic verification of fairness properties via concentration

Published: 10 October 2019

Abstract

As machine learning systems are increasingly used to make real-world legal and financial decisions, it is critical to develop algorithms that verify these systems do not discriminate against minorities. We design a scalable algorithm for verifying fairness specifications. Our algorithm obtains strong correctness guarantees from adaptive concentration inequalities, which let it keep drawing samples until it has enough data to make a decision. We implement our algorithm in a tool called VeriFair and show that it scales to large machine learning models, including a deep recurrent neural network that is more than five orders of magnitude larger than the largest previously verified neural network. Because our technique relies on random samples, it gives only probabilistic guarantees, but we show that the probability of error can be chosen to be extremely small.
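The adaptive sampling idea in the abstract can be illustrated with a minimal sketch. This is not VeriFair's implementation: it swaps in a plain Hoeffding bound with a union bound over sample counts in place of the adaptive concentration inequality the abstract refers to, and it assumes a ratio-style specification (the minority acceptance rate must be at least `1 - c` times the majority rate). The names `verify_fairness`, `sample_minority`, `sample_majority`, `c`, `delta`, and `max_samples` are hypothetical and chosen only for illustration.

```python
import math
import random


def verify_fairness(sample_minority, sample_majority, c=0.2, delta=1e-6,
                    max_samples=1_000_000):
    """Sequentially test whether mu_min >= (1 - c) * mu_maj.

    sample_minority / sample_majority are zero-argument callables that
    return 0 or 1 (the model's decision on one random individual).
    Assumed specification, not taken from the abstract: fairness holds
    iff E[Z] >= 0, where Z = X_min - (1 - c) * X_maj.
    Returns (verdict, samples_used); the verdict is wrong with
    probability at most delta.
    """
    width = 2.0 - c  # Z always lies in [-(1 - c), 1]
    total = 0.0
    for n in range(1, max_samples + 1):
        total += sample_minority() - (1.0 - c) * sample_majority()
        mean = total / n
        # Spend delta / (n * (n + 1)) at step n; these budgets sum to
        # delta over all n, so the intervals hold simultaneously.
        delta_n = delta / (n * (n + 1))
        # Two-sided Hoeffding radius for a variable with range `width`.
        eps = width * math.sqrt(math.log(2.0 / delta_n) / (2.0 * n))
        if mean - eps > 0:
            return "fair", n
        if mean + eps < 0:
            return "unfair", n
    return "unknown", max_samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a model: both groups are accepted half the time.
    verdict, used = verify_fairness(lambda: int(rng.random() < 0.5),
                                    lambda: int(rng.random() < 0.5))
    print(verdict, used)
```

The loop stops as soon as the confidence interval around the running mean excludes zero, so clearly fair or clearly unfair models need few samples, while borderline ones need many. Lowering `delta` enlarges the interval only logarithmically, which is consistent with the abstract's remark that the error probability can be made extremely small.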


Supplemental Material

a118-bastani

Presentation at OOPSLA '19


    • Published in

      Proceedings of the ACM on Programming Languages  Volume 3, Issue OOPSLA
      October 2019
      2077 pages
      EISSN:2475-1421
      DOI:10.1145/3366395

      Copyright © 2019 Owner/Author

      Publisher

      Association for Computing Machinery

      New York, NY, United States
