Abstract
Differential privacy offers a formal framework for reasoning about the privacy and accuracy of computations on private data. It also offers a rich set of building blocks for constructing private data analyses. When carefully calibrated, these analyses simultaneously guarantee the privacy of the individuals contributing their data, and the accuracy of the data analysis results, inferring useful properties about the population. The compositional nature of differential privacy has motivated the design and implementation of several programming languages to ease the implementation of differentially private analyses. Even though these programming languages provide support for reasoning about privacy, most of them disregard reasoning about the accuracy of data analyses. To overcome this limitation, we present DPella, a programming framework providing data analysts with support for reasoning about privacy, accuracy, and their trade-offs. The distinguishing feature of DPella is a novel component that statically tracks the accuracy of different data analyses. To provide tight accuracy estimations, this component leverages taint analysis for automatically inferring statistical independence of the different noise quantities added for guaranteeing privacy. We evaluate our approach by implementing several classical queries from the literature and showing how data analysts can calibrate the privacy parameters to meet the accuracy requirements, and vice versa.
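As a rough illustration of calibrating privacy parameters against an accuracy requirement, consider a single query answered with the standard Laplace mechanism. The sketch below is not DPella's API: the function names and the default sensitivity of 1 are assumptions made for this example. It uses the Laplace tail bound Pr[|noise| > α] = exp(-α·ε/Δ) for noise of scale Δ/ε, so fixing any two of ε, α, and β determines the third.

```python
import math


def laplace_error_bound(epsilon: float, beta: float, sensitivity: float = 1.0) -> float:
    """Error alpha such that Laplace noise of scale sensitivity/epsilon
    exceeds alpha in absolute value with probability at most beta."""
    if epsilon <= 0 or sensitivity <= 0 or not 0 < beta <= 1:
        raise ValueError("need epsilon > 0, sensitivity > 0, and 0 < beta <= 1")
    return (sensitivity / epsilon) * math.log(1.0 / beta)


def epsilon_for_error(alpha: float, beta: float, sensitivity: float = 1.0) -> float:
    """Smallest epsilon for which the Laplace mechanism's error stays
    within alpha with probability at least 1 - beta (the inverse calibration)."""
    if alpha <= 0 or sensitivity <= 0 or not 0 < beta <= 1:
        raise ValueError("need alpha > 0, sensitivity > 0, and 0 < beta <= 1")
    return (sensitivity / alpha) * math.log(1.0 / beta)


if __name__ == "__main__":
    # Privacy budget fixed: what error should the analyst expect?
    alpha = laplace_error_bound(epsilon=0.5, beta=0.05)
    print(f"epsilon=0.5 gives error <= {alpha:.3f} with probability 0.95")

    # Accuracy requirement fixed: how much budget is needed?
    eps = epsilon_for_error(alpha=10.0, beta=0.05)
    print(f"error <= 10 with probability 0.95 needs epsilon >= {eps:.3f}")
```

This covers only one noisy answer. Analyses that combine several noisy quantities need a bound on the combined error, which is where knowing that the noise terms are statistically independent can give tighter estimates than a plain union bound.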
A Programming Language for Data Privacy with Accuracy Estimations