Abstract
Conventional private data publication schemes publish sensitive datasets either after k-anonymization or under differential privacy constraints. These schemes are typically designed to retain as much utility as possible for aggregate queries while protecting the privacy of individual records. Such an approach suits the release of aggregate information as public datasets, but it is inapplicable when users have different levels of access to the same data. We argue that, when some users have more access privileges than others, existing schemes either disclose more private information or reduce utility. In this article, we present an anonymization framework for publishing large datasets that provides different levels of utility to users based on their access privilege levels. We design and implement the proposed multilevel utility-controlled anonymization schemes in the context of large association graphs, considering three levels of user utility: (1) users having access to only the graph structure, (2) users having access to the graph structure and aggregate query results, and (3) users having access to the graph structure, aggregate query results, and individual associations. Our experiments on large real-world association graphs show that the proposed techniques are effective and scalable and that they yield the required level of privacy and utility at each user privacy and access privilege level.
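The three levels of user utility can be pictured as nested views over the same association graph. The Python sketch below is only an illustration of that nesting and is not the anonymization scheme of the article: the toy data (`ASSOCIATIONS`, `CITY`), the helper `pseudonyms`, and the function `view_for` are all hypothetical names, and the sketch offers no k-anonymity or differential privacy guarantee.

```python
import random
from collections import Counter

# Hypothetical toy association graph: (customer, product) pairs.
ASSOCIATIONS = [
    ("alice", "insulin"),
    ("alice", "aspirin"),
    ("bob", "aspirin"),
    ("carol", "insulin"),
]
# Hypothetical customer attribute used only for aggregate queries.
CITY = {"alice": "Pittsburgh", "bob": "Atlanta", "carol": "Atlanta"}


def pseudonyms(names, prefix, rng):
    """Map each distinct name to an opaque label such as 'u0' or 'v1'."""
    distinct = sorted(set(names))
    ids = list(range(len(distinct)))
    rng.shuffle(ids)  # hide the alphabetical order of the real names
    return {name: f"{prefix}{i}" for name, i in zip(distinct, ids)}


def view_for(level, seed=0):
    """Return the data visible at access level 1, 2, or 3 (assumed tiers)."""
    rng = random.Random(seed)
    left = pseudonyms((c for c, _ in ASSOCIATIONS), "u", rng)
    right = pseudonyms((p for _, p in ASSOCIATIONS), "v", rng)

    # Level 1: graph structure only, with every node relabeled.
    view = {"structure": sorted((left[c], right[p]) for c, p in ASSOCIATIONS)}

    # Level 2: additionally, counts of associations per (city, product).
    if level >= 2:
        view["aggregates"] = dict(
            Counter((CITY[c], p) for c, p in ASSOCIATIONS)
        )

    # Level 3: additionally, the individual associations themselves.
    if level >= 3:
        view["associations"] = list(ASSOCIATIONS)
    return view
```

Calling `view_for(1)` returns only relabeled edges, `view_for(2)` adds the per-city product counts, and `view_for(3)` adds the original pairs. Each higher level strictly extends the one below it, which is the property the three-level description relies on.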
Index Terms
Privacy-Preserving Publishing of Multilevel Utility-Controlled Graph Datasets