ABSTRACT
Differential privacy is a robust privacy standard that has been successfully applied to a range of data analysis tasks. But despite much recent work, optimal strategies for answering a collection of related queries are not known.
We propose the matrix mechanism, a new algorithm for answering a workload of predicate counting queries. Given a workload, the mechanism requests answers to a different set of queries, called a query strategy, which are answered using the standard Laplace mechanism. Noisy answers to the workload queries are then derived from the noisy answers to the strategy queries. This two-stage process can produce a more complex, correlated noise distribution that still preserves differential privacy while increasing accuracy.
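The two-stage process described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: the database is assumed to be a vector of cell counts `x`, the workload and strategy are assumed to be matrices `W` and `A` whose rows are counting queries, `A` is assumed to have full column rank, and least squares is assumed as the way workload answers are derived from the noisy strategy answers. The function names, the prefix-sum workload, and the binary-tree strategy are hypothetical choices made for the example.

```python
import numpy as np


def l1_sensitivity(A):
    """Largest absolute column sum of A: the most one record can shift A @ x in L1 norm."""
    return float(np.abs(A).sum(axis=0).max())


def matrix_mechanism(W, A, x, epsilon, rng):
    """Answer workload W on histogram x through strategy A (assumed full column rank)."""
    # Stage 1: Laplace mechanism on the strategy queries.
    scale = l1_sensitivity(A) / epsilon
    noisy_strategy = A @ x + rng.laplace(loc=0.0, scale=scale, size=A.shape[0])
    # Stage 2: least-squares estimate of x, then derive the workload answers.
    x_hat, *_ = np.linalg.lstsq(A, noisy_strategy, rcond=None)
    return W @ x_hat


# Hypothetical 4-cell histogram and a workload of prefix-sum (cumulative) counts.
x = np.array([10.0, 20.0, 5.0, 15.0])
W = np.tril(np.ones((4, 4)))

# Illustrative strategy: a binary tree of counts (total, two halves, four cells).
A = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)

rng = np.random.default_rng(0)
noisy_answers = matrix_mechanism(W, A, x, epsilon=0.5, rng=rng)
```

Because every workload answer is computed from the same estimate `x_hat`, the noise in the answers is correlated across queries, which is the effect the abstract refers to. Only stage 1 touches the private data; stage 2 is post-processing of already-noisy values.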
We provide a formal analysis of the error of query answers produced by the mechanism and investigate the problem of computing the optimal query strategy in support of a given workload. We show this problem can be formulated as a rank-constrained semidefinite program. Finally, we analyze two seemingly distinct techniques, whose similar behavior is explained by viewing them as instances of the matrix mechanism.
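As a rough illustration of what an error analysis of this kind involves, the sketch below computes the expected total squared error of a workload under a given strategy. It rests on assumptions that the abstract does not spell out: independent Laplace noise of scale `sensitivity / epsilon` on each strategy query (variance `2 * scale**2`), a full-column-rank strategy `A`, and least-squares derivation of the workload answers, which together give an error of `noise_variance * trace(W (A^T A)^{-1} W^T)`. The helper names and the two candidate strategies are hypothetical, and the sketch only evaluates fixed strategies; it does not attempt the rank-constrained semidefinite program for finding an optimal one.

```python
import numpy as np


def expected_total_error(W, A, epsilon):
    """Expected sum of squared errors of W's answers when derived from strategy A."""
    sensitivity = np.abs(A).sum(axis=0).max()
    # A Laplace variable with scale b has variance 2 * b**2.
    noise_variance = 2.0 * (sensitivity / epsilon) ** 2
    # The least-squares estimate of x has covariance noise_variance * (A^T A)^{-1}.
    gram_inverse = np.linalg.inv(A.T @ A)
    return float(noise_variance * np.trace(W @ gram_inverse @ W.T))


def all_range_queries(n):
    """Workload with one row per contiguous range of cells [i, j]."""
    rows = []
    for i in range(n):
        for j in range(i, n):
            query = np.zeros(n)
            query[i:j + 1] = 1.0
            rows.append(query)
    return np.array(rows)


W = all_range_queries(4)

# Two illustrative strategies: ask each cell count directly, or ask a binary tree of counts.
identity = np.eye(4)
tree = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)

errors = {
    "identity": expected_total_error(W, identity, epsilon=1.0),
    "tree": expected_total_error(W, tree, epsilon=1.0),
}
```

Comparing the entries of `errors` for a fixed workload shows how the choice of strategy alone changes accuracy at the same privacy level, which is what makes strategy selection an optimization problem.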
Optimizing linear counting queries under differential privacy