research-article

Evaluating the Strength of Genomic Privacy Metrics

Published: 09 January 2017

Abstract

The genome is a unique identifier for human individuals. It also contains highly sensitive information, which creates a high potential for misuse of genomic data (for example, genetic discrimination). In this article, we investigate how genomic privacy can be measured in scenarios where an adversary aims to infer a person’s genomic markers by constructing probability distributions on the values of genetic variations. We measure the strength of privacy metrics by requiring that metrics be monotonic with increasing adversary strength, and we uncover serious problems with several existing metrics currently used to measure genomic privacy. We provide suggestions on metric selection, interpretation, and visualization, and illustrate the workflow using case studies for three real-world diseases.
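
To make the monotonicity requirement concrete, the following is a minimal Python sketch and not the article's actual evaluation code. It assumes a hypothetical adversary whose estimate for one genetic variation (with three possible values) blends a uniform guess with the true value according to a `strength` parameter, and it uses Shannon entropy as one example of a privacy metric. The names `adversary_estimate`, `entropy`, and `is_monotonic` are illustrative assumptions.

```python
import math


def adversary_estimate(true_value, strength, num_values=3):
    """Hypothetical adversary model (an assumption for illustration).

    Returns a probability distribution over the possible values of one
    genetic variation. strength=0 gives a uniform guess; strength=1 puts
    all probability on the true value.
    """
    uniform = 1.0 / num_values
    return [
        (1.0 - strength) * uniform + (strength if v == true_value else 0.0)
        for v in range(num_values)
    ]


def entropy(dist):
    """Shannon entropy in bits of the adversary's distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def is_monotonic(values, tol=1e-12):
    """True if the sequence never changes direction (within tolerance)."""
    pairs = list(zip(values, values[1:]))
    non_increasing = all(b <= a + tol for a, b in pairs)
    non_decreasing = all(b >= a - tol for a, b in pairs)
    return non_increasing or non_decreasing


# Evaluate the metric at increasing adversary strengths.
strengths = [i / 10 for i in range(11)]
entropies = [entropy(adversary_estimate(2, s)) for s in strengths]
print(is_monotonic(entropies))
```

Under this toy adversary the entropy falls steadily as the adversary gets stronger, so the check passes. A metric whose values rise and fall across strength levels would fail the same check, which is the kind of problem the abstract refers to.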

Published in

ACM Transactions on Privacy and Security, Volume 20, Issue 1
February 2017
99 pages
ISSN: 2471-2566
EISSN: 2471-2574
DOI: 10.1145/3038258

Copyright © 2017 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 9 January 2017
• Revised: 1 November 2016
• Accepted: 1 November 2016
• Received: 1 June 2016

Qualifiers

• research-article
• Research
• Refereed
