A Programming Language for Data Privacy with Accuracy Estimations

Published: 08 June 2021

Abstract

Differential privacy offers a formal framework for reasoning about the privacy and accuracy of computations on private data. It also offers a rich set of building blocks for constructing private data analyses. When carefully calibrated, these analyses simultaneously guarantee the privacy of the individuals contributing their data, and the accuracy of the data analysis results, inferring useful properties about the population. The compositional nature of differential privacy has motivated the design and implementation of several programming languages to ease the implementation of differentially private analyses. Even though these programming languages provide support for reasoning about privacy, most of them disregard reasoning about the accuracy of data analyses. To overcome this limitation, we present DPella, a programming framework providing data analysts with support for reasoning about privacy, accuracy, and their trade-offs. The distinguishing feature of DPella is a novel component that statically tracks the accuracy of different data analyses. To provide tight accuracy estimations, this component leverages taint analysis for automatically inferring statistical independence of the different noise quantities added for guaranteeing privacy. We evaluate our approach by implementing several classical queries from the literature and showing how data analysts can calibrate the privacy parameters to meet the accuracy requirements, and vice versa.
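
To make the privacy/accuracy trade-off concrete, the following is a minimal sketch for the standard Laplace mechanism. It is not DPella's API, and the function names are hypothetical and chosen for illustration only. It relies on the tail bound for Laplace noise with scale Δ/ε, namely Pr[|noise| ≥ α] = exp(−αε/Δ), so that with probability at least 1 − β the error is at most α = (Δ/ε)·ln(1/β). It also shows the generic union bound for a sum of k noisy answers, which assumes nothing about independence; knowing that the noise terms are independent is what permits tighter estimates than this one.

```python
import math


def laplace_error_bound(epsilon, beta, sensitivity=1.0):
    """Error alpha such that Laplace(sensitivity / epsilon) noise stays
    within alpha in absolute value with probability at least 1 - beta."""
    return (sensitivity / epsilon) * math.log(1.0 / beta)


def epsilon_for_error(alpha, beta, sensitivity=1.0):
    """Inverse direction: the epsilon needed so that the error is at most
    alpha with probability at least 1 - beta."""
    return (sensitivity / alpha) * math.log(1.0 / beta)


def union_bound_sum_error(k, epsilon, beta, sensitivity=1.0):
    """Error bound for the sum of k noisy answers, each using epsilon.

    Union bound: give each answer failure probability beta / k and add
    the k individual error bounds. No independence is assumed.
    """
    return k * laplace_error_bound(epsilon, beta / k, sensitivity)


if __name__ == "__main__":
    eps, beta = 0.5, 0.05
    alpha = laplace_error_bound(eps, beta)
    print("error bound for one count:", alpha)
    print("epsilon recovered from that bound:", epsilon_for_error(alpha, beta))
    print("union bound for a sum of 4 counts:", union_bound_sum_error(4, eps, beta))
```

Reading the first two functions together shows both directions of calibration mentioned in the abstract: fixing ε and β yields an error bound, and fixing a target error and β yields the ε that must be spent.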

Published in

ACM Transactions on Programming Languages and Systems, Volume 43, Issue 2
June 2021
197 pages
ISSN: 0164-0925
EISSN: 1558-4593
DOI: 10.1145/3470134

Copyright © 2021 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 8 June 2021
• Revised: 1 February 2021
• Accepted: 1 February 2021
• Received: 1 July 2020
