ABSTRACT
Personalized recommendation products at Twitter target a multitude of heterogeneous items: Tweets, Events, Topics, Hashtags, and users. Each of these targets varies in its cardinality (which affects the scale of the problem) and its "shelf life" (which constrains the latency of generating the recommendations). Although Twitter has built a variety of recommendation systems over the past decade, the broader problem has mostly been tackled piecemeal. In this paper, we present SimClusters, a general-purpose representation layer based on overlapping communities into which users as well as heterogeneous content can be captured as sparse, interpretable vectors to support a multitude of recommendation tasks. We propose a novel algorithm for community discovery based on Metropolis-Hastings sampling, which is both more accurate and significantly faster than off-the-shelf alternatives. SimClusters scales to networks with billions of users and has been effective across a variety of deployed applications at Twitter.
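To make the idea of "sparse, interpretable vectors" over communities concrete, the following is a minimal sketch, not the SimClusters implementation. It assumes that community memberships have already been discovered, that an item can be represented by summing the vectors of users who engaged with it, and that cosine similarity is used for scoring. The community names, user names, weights, and the helper functions `aggregate` and `cosine` are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical sparse user vectors: community id -> membership weight.
# Only nonzero entries are stored, so each dimension names a community.
user_vectors = {
    "alice": {"c_sports": 0.9, "c_music": 0.1},
    "bob": {"c_sports": 0.7},
    "carol": {"c_music": 0.8, "c_politics": 0.2},
}


def aggregate(engaged_users):
    """Assumed item representation: sum of the engaging users' sparse vectors."""
    item_vector = defaultdict(float)
    for user in engaged_users:
        for community, weight in user_vectors[user].items():
            item_vector[community] += weight
    return dict(item_vector)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(community, 0.0) for community, weight in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# A hypothetical Tweet engaged with by alice and bob lands mostly in c_sports.
tweet_vector = aggregate(["alice", "bob"])

# Rank candidate users for this Tweet by similarity in community space.
ranking = sorted(
    user_vectors,
    key=lambda user: cosine(user_vectors[user], tweet_vector),
    reverse=True,
)
print(ranking)
```

Because users and items share the same community space in this sketch, the same scoring function could in principle serve different item types, which is the sense in which one representation layer can support several recommendation tasks.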
SimClusters: Community-Based Representations for Heterogeneous Recommendations at Twitter