Abstract
The vector space model is among the most popular data representation models used in the processing of large networks. Unfortunately, it suffers from the so-called curse of dimensionality, a phenomenon in which data become extremely sparse because the volume of the data space grows exponentially with the number of dimensions. Dimensionality reduction techniques are therefore necessary to make large networks represented in the vector space model available for analysis and processing. Most dimensionality reduction techniques focus on the principal components present in the data, effectively disregarding local relationships that may exist between objects. This is a significant drawback, because these local relationships are crucial for maintaining high accuracy in many network analysis tasks, such as link prediction or community detection. To address this drawback, we propose Progressive Random Indexing, a new dimensionality reduction technique. Built upon Reflective Random Indexing, our method significantly reduces the dimensionality of the vector space model while retaining the important local relationships between objects. The key element of the Progressive Random Indexing technique is the use of the gain value at each reflection step, which determines how much information about local relationships should be included in the space of reduced dimensionality. Our experiments indicate that when applied to large real-world networks (Facebook social network, MovieLens movie recommendations), Progressive Random Indexing outperforms state-of-the-art methods in link prediction tasks.
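The general idea of reflection-based random indexing on a network can be illustrated with a minimal sketch. The abstract does not specify the update rule, so this is not the authors' algorithm: the function names, the parameters `dim`, `nonzeros`, and `reflections`, and in particular the way `gain` blends a node's current vector with the sum of its neighbors' vectors are all assumptions made for illustration.

```python
import numpy as np


def random_index_vectors(n_items, dim, nonzeros, rng):
    """Sparse ternary index vectors: a few +1/-1 entries, zeros elsewhere."""
    vectors = np.zeros((n_items, dim))
    for row in vectors:  # each row is a view, so assignment edits `vectors`
        positions = rng.choice(dim, size=nonzeros, replace=False)
        row[positions] = rng.choice([-1.0, 1.0], size=nonzeros)
    return vectors


def normalize_rows(matrix):
    """Scale every row to unit length, leaving all-zero rows untouched."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    return matrix / norms


def reflective_embedding(adjacency, dim=64, nonzeros=4,
                         reflections=3, gain=0.5, seed=0):
    """Embed the nodes of a graph by repeated reflection over its edges."""
    rng = np.random.default_rng(seed)
    vectors = normalize_rows(
        random_index_vectors(adjacency.shape[0], dim, nonzeros, rng))
    for _ in range(reflections):
        # One reflection: each node collects the vectors of its neighbors.
        neighbor_sum = normalize_rows(adjacency @ vectors)
        # ASSUMPTION (hypothetical rule, not taken from the paper): `gain`
        # sets how much of the node's own vector survives the reflection.
        vectors = normalize_rows(gain * vectors + (1.0 - gain) * neighbor_sum)
    return vectors


# Toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adjacency = np.zeros((6, 6))
for u, v in edges:
    adjacency[u, v] = adjacency[v, u] = 1.0

embedding = reflective_embedding(adjacency, dim=16)
scores = embedding @ embedding.T  # cosine similarities between nodes
```

In a link prediction setting, entries of `scores` for node pairs that are not yet connected could serve as candidate link rankings. With `gain=1.0` this sketch leaves the initial random vectors unchanged, and with `gain=0.0` each reflection replaces a node's vector entirely by its neighborhood.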
Progressive Random Indexing: Dimensionality Reduction Preserving Local Network Dependencies