ABSTRACT
"Privacy" and "utility" are words that frequently appear in the literature on statistical privacy. But what do these words really mean? In recent years, many problems with intuitive notions of privacy and utility have been uncovered. Thus more formal notions of privacy and utility, which are amenable to mathematical analysis, are needed. In this paper we present our initial work on an axiomatization of privacy and utility. In particular, we study how these concepts are affected by randomized algorithms. Our analysis yields new insights into the construction of both privacy definitions and mechanisms that generate data according to such definitions. In particular, it characterizes a class of relaxations of differential privacy and shows that desirable outputs of a differentially private mechanism are best interpreted as certain graphs rather than query answers or synthetic data.
Index Terms
Towards an axiomatization of statistical privacy and utility