Abstract
Applying differential privacy at scale requires convenient ways to check that programs computing with sensitive data appropriately preserve privacy. We propose here a fully automated framework for testing differential privacy, adapting a well-known “pointwise” technique from informal proofs of differential privacy. Our framework, called DPCheck, requires no programmer annotations, handles all previously verified or tested algorithms, and is the first fully automated framework to distinguish correct and buggy implementations of PrivTree, a probabilistically terminating algorithm that has not previously been mechanically checked.
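To make the "pointwise" condition concrete, here is a minimal sketch that is not DPCheck's implementation. It assumes the usual reading of the technique: a mechanism is ε-differentially private if, for neighboring inputs, the probability (or density) of each individual output differs by a factor of at most e^ε. The function names, the grid of outputs, and the choice of the Laplace mechanism with a sensitivity-1 query are all illustrative assumptions.

```python
import math


def laplace_pdf(x: float, mu: float, scale: float) -> float:
    """Density of the Laplace distribution centered at mu."""
    return math.exp(-abs(x - mu) / scale) / (2.0 * scale)


def pointwise_check(q1: float, q2: float, epsilon: float,
                    scale: float, outputs) -> bool:
    """Hypothetical helper: check the pointwise bound at each output r.

    q1 and q2 are the true query answers on two neighboring databases
    (assumed to differ by at most 1). The mechanism adds Laplace noise
    with the given scale. Returns True if, at every r in outputs, the
    two output densities are within a factor of e^epsilon of each other.
    """
    # Small relative slack so floating-point rounding at the exact
    # boundary is not reported as a violation.
    bound = math.exp(epsilon) * (1.0 + 1e-9)
    for r in outputs:
        p1 = laplace_pdf(r, q1, scale)
        p2 = laplace_pdf(r, q2, scale)
        if p1 > bound * p2 or p2 > bound * p1:
            return False
    return True


epsilon = 0.5
grid = [i * 0.5 for i in range(41)]  # outputs 0.0, 0.5, ..., 20.0

# Correct calibration: scale = sensitivity / epsilon = 2.0.
correct = pointwise_check(10.0, 11.0, epsilon, 1.0 / epsilon, grid)

# Buggy calibration: half the required noise, scale = 1.0.
buggy = pointwise_check(10.0, 11.0, epsilon, 1.0 / (2.0 * epsilon), grid)

print(correct, buggy)
```

Under these assumptions the correctly calibrated mechanism passes at every grid point, while the under-noised variant fails. A real tester cannot evaluate densities in closed form like this, which is part of what makes automated testing of differential privacy nontrivial.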
We analyze the probability that DPCheck mistakenly accepts a non-private program and prove that this false-acceptance probability can be made exponentially small by a suitable choice of test size.
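As a rough illustration of why a bound of this shape is plausible, and not the paper's actual theorem, suppose (as an assumption) that each of n independent test trials detects a given privacy violation with probability at least p > 0. The symbols n, p, and δ are introduced here for illustration only.

```latex
\Pr[\text{all } n \text{ trials miss the violation}]
  \;\le\; (1 - p)^{n}
  \;\le\; e^{-pn},
\qquad\text{so}\qquad
n \;\ge\; \frac{\ln(1/\delta)}{p}
\;\implies\;
\Pr[\text{false acceptance}] \;\le\; \delta .
```

Under this simplified model the false-acceptance probability decays exponentially in the test size n, and a target failure probability δ costs only logarithmically many additional trials.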
We demonstrate DPCheck’s utility empirically by implementing all benchmark algorithms from prior work on mechanical verification of differential privacy, plus several others and their incorrect variants, and show DPCheck accepts the correct implementations and rejects the incorrect variants.
We also demonstrate how DPCheck can be deployed in a practical workflow to test differential privacy for the 2020 US Census Disclosure Avoidance System (DAS).
Index Terms
Testing differential privacy with dual interpreters