research-article

Attribute Inference Attacks in Online Social Networks

Published: 02 January 2018

Abstract

We propose new privacy attacks to infer attributes (e.g., locations, occupations, and interests) of online social network users. Our attacks leverage seemingly innocent user information that is publicly available in online social networks to infer missing attributes of targeted users. Given the increasing availability of (seemingly innocent) user information online, our results have serious implications for Internet privacy—private attributes can be inferred from users’ publicly available data unless we take steps to protect users from such inference attacks. To infer attributes of a targeted user, existing inference attacks leverage either the user’s publicly available social friends or the user’s behavioral records (e.g., the web pages that the user has liked on Facebook, the apps that the user has reviewed on Google Play), but not both. As we will show, such inference attacks achieve limited success rates. However, the problem becomes qualitatively different if we consider both social friends and behavioral records. To address this challenge, we develop a novel model to integrate social friends and behavioral records, and design new attacks based on our model. We theoretically and experimentally demonstrate the effectiveness of our attacks. For instance, we observe that, in a real-world large-scale dataset with 1.1 million users, our attack can correctly infer the cities a user lived in for 57% of the users; via confidence estimation, we are able to increase the attack success rate to over 90% if the attacker selectively attacks half of the users. Moreover, we show that our attack can correctly infer attributes for significantly more users than previous attacks.
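The selective attack mentioned above (attacking only the users whose predictions are most confident) can be illustrated with a minimal sketch. The abstract does not say how confidence is estimated, so the sketch assumes each prediction already comes with a confidence score; the function name `selective_success_rate` and the toy data are hypothetical and are not taken from the paper.

```python
def selective_success_rate(predictions, confidences, truths, fraction):
    """Success rate when only the most confident `fraction` of users is attacked.

    predictions, confidences and truths are parallel lists, one entry per user.
    The confidence scores are assumed to be given (hypothetical input).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n = len(predictions)
    k = max(1, int(n * fraction))  # number of users selected for the attack
    # Rank users from most to least confident and keep the top k.
    ranked = sorted(range(n), key=lambda i: confidences[i], reverse=True)
    chosen = ranked[:k]
    correct = sum(predictions[i] == truths[i] for i in chosen)
    return correct / k


# Toy data, invented for illustration only.
predictions = ["NYC", "SF", "LA", "NYC", "SF", "LA"]
truths = ["NYC", "SF", "SF", "NYC", "LA", "LA"]
confidences = [0.9, 0.8, 0.3, 0.7, 0.2, 0.4]

overall = selective_success_rate(predictions, confidences, truths, 1.0)
top_half = selective_success_rate(predictions, confidences, truths, 0.5)
print(overall, top_half)
```

In this toy example the wrong predictions happen to carry the lowest confidence, so restricting the attack to the top half raises the success rate. The sketch only shows the mechanism and says nothing about the figures reported in the abstract.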


    • Published in

      ACM Transactions on Privacy and Security, Volume 21, Issue 1
      February 2018
      148 pages
      ISSN:2471-2566
      EISSN:2471-2574
      DOI:10.1145/3171591

      Copyright © 2018 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 2 January 2018
      • Accepted: 1 October 2017
      • Revised: 1 August 2017
      • Received: 1 October 2016

      Qualifiers

      • research-article
      • Research
      • Refereed
