Abstract
We propose new privacy attacks to infer attributes (e.g., locations, occupations, and interests) of online social network users. Our attacks leverage seemingly innocent user information that is publicly available in online social networks to infer missing attributes of targeted users. Given the increasing availability of such information online, our results have serious implications for Internet privacy: private attributes can be inferred from users’ publicly available data unless we take steps to protect users from such inference attacks. To infer attributes of a targeted user, existing inference attacks leverage either the user’s publicly available social friends or the user’s behavioral records (e.g., the web pages that the user has liked on Facebook, the apps that the user has reviewed on Google Play), but not both. As we will show, such inference attacks achieve limited success rates. However, the problem becomes qualitatively different if we consider both social friends and behavioral records. To address this challenge, we develop a novel model to integrate social friends and behavioral records, and design new attacks based on our model. We theoretically and experimentally demonstrate the effectiveness of our attacks. For instance, on a real-world large-scale dataset with 1.1 million users, our attack correctly infers the cities a user has lived in for 57% of the users; via confidence estimation, we increase the attack success rate to over 90% if the attacker selectively attacks half of the users. Moreover, we show that our attack can correctly infer attributes for significantly more users than previous attacks.
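The selective-attack idea in the abstract (attacking only the users whose inferences look most reliable) can be illustrated with a minimal sketch. It assumes each inference comes with a numeric confidence score; the function name `selective_success_rate` and the toy data are hypothetical and are not the paper's confidence estimator or dataset.

```python
def selective_success_rate(predictions, fraction):
    """Success rate when only the most confident inferences are used.

    predictions: list of (confidence, is_correct) pairs, one per user.
                 These are illustrative inputs, not the paper's data.
    fraction:    share of users the attacker chooses to attack, in (0, 1].
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not predictions:
        raise ValueError("predictions must not be empty")
    # Rank users from most to least confident inference.
    ranked = sorted(predictions, key=lambda p: p[0], reverse=True)
    # Attack only the top `fraction` of users (at least one).
    k = max(1, int(len(ranked) * fraction))
    attacked = ranked[:k]
    return sum(1 for _, correct in attacked if correct) / k


# Toy, made-up scores: higher confidence tends to mean a correct inference.
toy = [(0.95, True), (0.90, True), (0.80, True), (0.70, False),
       (0.60, True), (0.40, False), (0.30, False), (0.20, True)]

overall = selective_success_rate(toy, 1.0)   # attack every user
top_half = selective_success_rate(toy, 0.5)  # attack the most confident half
print(overall, top_half)
```

On this toy input the success rate rises when the attacker restricts the attack to the more confident half, which is the qualitative effect the abstract reports; how large the gain is depends on how well the confidence scores track correctness.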
Index Terms
Attribute Inference Attacks in Online Social Networks