ABSTRACT
We study privacy-preserving query answering over data containing relationships. A social network is a prime example of such data, where the nodes represent individuals and edges represent relationships. Nearly all interesting queries over social networks involve joins, and for such queries, existing output perturbation algorithms severely distort query answers. We propose an algorithm that significantly improves utility over competing techniques, typically reducing the error bound from polynomial in the number of nodes to polylogarithmic. The algorithm is, to the best of our knowledge, the first to answer such queries with acceptable accuracy, even for worst-case inputs.
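To see why joins are hard for output perturbation, consider a minimal sketch of the standard Laplace mechanism (Dwork et al., TCC 2006) applied to triangle counting, which is a self-join of the edge relation. This is an illustration of the baseline difficulty, not the algorithm proposed in the paper; the choice of triangle counting and all function names are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def count_triangles(n, edges):
    """Count triangles in an undirected graph on nodes 0..n-1.

    `edges` must be a collection of distinct undirected pairs (u, v).
    """
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Each triangle is seen once from each of its three edges.
    total = sum(len(adj[u] & adj[v]) for u, v in edges)
    return total // 3


def worst_case_edge_sensitivity(n):
    """One edge can close a triangle with every other node: n - 2."""
    return n - 2


def perturbed_answer(true_answer, sensitivity, epsilon, rng):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    return true_answer + laplace_noise(sensitivity / epsilon, rng)


# Worst-case input (hypothetical): nodes 0 and 1 share all other
# nodes as neighbors, so the single edge (0, 1) closes n - 2 triangles.
n = 10
base = [(0, k) for k in range(2, n)] + [(1, k) for k in range(2, n)]
without_edge = count_triangles(n, base)
with_edge = count_triangles(n, base + [(0, 1)])

rng = random.Random(0)
noisy = perturbed_answer(with_edge, worst_case_edge_sensitivity(n), 0.5, rng)
```

Because one relationship can change the true count by n - 2, noise calibrated to this worst case has magnitude that grows linearly with the number of nodes, which matches the polynomial error the abstract attributes to existing techniques.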
The improved utility is achieved by relaxing the privacy condition. Instead of ensuring strict differential privacy, we guarantee a weaker (but still quite practical) condition based on adversarial privacy. To explain precisely the nature of our relaxation in privacy, we provide a new result that characterizes the relationship between ε-indistinguishability (a variant of the differential privacy definition) and adversarial privacy, which is of independent interest: an algorithm is ε-indistinguishable iff it is private for a particular class of adversaries (defined precisely herein). Our perturbation algorithm guarantees privacy against adversaries in this class whose prior distribution is numerically bounded.
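For reference, ε-indistinguishability is commonly stated as follows (this is the standard formulation from Dwork et al., TCC 2006; the symbols A, I, I', and o are notation chosen here, and the paper's exact formulation may differ in detail). A randomized algorithm A is ε-indistinguishable if, for every pair of instances I and I' that differ in a single tuple and every possible output o:

```latex
e^{-\varepsilon} \;\le\; \frac{\Pr[A(I) = o]}{\Pr[A(I') = o]} \;\le\; e^{\varepsilon}
```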
Full version: http://www.cs.washington.edu/homes/vibhor/relationship_privacy.pdf
Index Terms
Relationship privacy: output perturbation for queries with joins