Abstract
With the proliferation of online social networks (OSNs), a user often holds accounts on multiple sites. In this article, we study the emerging “cross-site linking” function available on mainstream OSN services including Foursquare, Quora, and Pinterest. We first conduct a data-driven analysis of the crawled profiles and social connections of all 61.39 million Foursquare users to obtain a thorough understanding of this function. Our analysis shows that the cross-site linking function is adopted by 57.10% of all Foursquare users, and that users who have enabled this function are more active than others. We also find that enabling cross-site linking might lead to privacy risks. Based on cross-site links between Foursquare and external OSN sites, we formulate cross-site information aggregation as a problem that uses cross-site links to stitch together site-local information fields for OSN users. Using large datasets collected from Foursquare, Facebook, and Twitter, we demonstrate the usefulness and the challenges of cross-site information aggregation. In addition to the measurements, we carry out a survey collecting detailed user feedback on cross-site linking. This survey studies why people choose to enable or not to enable cross-site linking, as well as their motivations for and concerns about enabling this function.