
Progressive Random Indexing: Dimensionality Reduction Preserving Local Network Dependencies

Published: 24 March 2017

Abstract

The vector space model is undoubtedly among the most popular data representation models used in the processing of large networks. Unfortunately, the vector space model suffers from the so-called curse of dimensionality, a phenomenon where data become extremely sparse due to an exponential growth of the data space volume caused by a large number of dimensions. Thus, dimensionality reduction techniques are necessary to make large networks represented in the vector space model available for analysis and processing. Most dimensionality reduction techniques tend to focus on principal components present in the data, effectively disregarding local relationships that may exist between objects. This behavior is a significant drawback of current dimensionality reduction techniques, because these local relationships are crucial for maintaining high accuracy in many network analysis tasks, such as link prediction or community detection. To rectify the aforementioned drawback, we propose Progressive Random Indexing, a new dimensionality reduction technique. Built upon Reflective Random Indexing, our method significantly reduces the dimensionality of the vector space model while retaining all important local relationships between objects. The key element of the Progressive Random Indexing technique is the use of the gain value at each reflection step, which determines how much information about local relationships should be included in the space of reduced dimensionality. Our experiments indicate that when applied to large real-world networks (Facebook social network, MovieLens movie recommendations), Progressive Random Indexing outperforms state-of-the-art methods in link prediction tasks.
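To make the idea of a gain-controlled reflection step more concrete, the following is a minimal sketch rather than the authors' algorithm. The abstract does not state the update rule, so the way `gain` mixes the previous embedding with the neighbourhood aggregate is purely an assumption. The function names, the sparse ternary index vectors, and the parameter defaults are likewise hypothetical and chosen only for illustration.

```python
import numpy as np


def random_index_vectors(n_nodes, dim, nonzeros, rng):
    """Sparse ternary index vectors: a few +1/-1 entries per row, zeros elsewhere."""
    vectors = np.zeros((n_nodes, dim))
    for row in vectors:  # each row is a view, so assignment edits `vectors` in place
        positions = rng.choice(dim, size=nonzeros, replace=False)
        row[positions] = rng.choice([-1.0, 1.0], size=nonzeros)
    return vectors


def normalize_rows(matrix):
    """Scale every non-zero row to unit Euclidean length."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # leave all-zero rows untouched
    return matrix / norms


def reflect_with_gain(adjacency, dim=16, reflections=3, gain=0.5, seed=0):
    """Hypothetical reflection loop; the gain-weighted mix is an assumption."""
    rng = np.random.default_rng(seed)
    n_nodes = adjacency.shape[0]
    embedding = normalize_rows(random_index_vectors(n_nodes, dim, 4, rng))
    for _ in range(reflections):
        # One reflection: each node aggregates the vectors of its neighbours.
        neighbours = normalize_rows(adjacency @ embedding)
        # Assumed role of the gain: how much neighbourhood information is mixed in.
        embedding = normalize_rows((1.0 - gain) * embedding + gain * neighbours)
    return embedding


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adjacency = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    adjacency[u, v] = adjacency[v, u] = 1.0

embedding = reflect_with_gain(adjacency)
# Cosine similarities between node vectors can serve as link-prediction scores.
scores = embedding @ embedding.T
```

In this sketch a `gain` of 0 leaves the initial random vectors unchanged, while a `gain` of 1 replaces each vector with its normalized neighbourhood aggregate at every step, so intermediate values trade off the two.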



Published in

ACM Transactions on Internet Technology, Volume 17, Issue 2
Special Issue on Advances in Social Computing and Regular Papers
May 2017
249 pages
ISSN: 1533-5399
EISSN: 1557-6051
DOI: 10.1145/3068849
Editor: Munindar P. Singh

Copyright © 2017 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

  • Published: 24 March 2017
  • Accepted: 1 September 2016
  • Revised: 1 July 2016
  • Received: 1 February 2016


Qualifiers

  • research-article
  • Research
  • Refereed
