research-article

ImmortalGraph: A System for Storage and Analysis of Temporal Graphs

Published: 24 July 2015

Abstract

Temporal graphs that capture graph changes over time are attracting increasing interest from research communities, for functions such as understanding temporal characteristics of social interactions on a time-evolving social graph. ImmortalGraph is a storage and execution engine designed and optimized specifically for temporal graphs. Locality is at the center of ImmortalGraph’s design: temporal graphs are carefully laid out in both persistent storage and memory, taking into account data locality in both time and graph-structure dimensions. ImmortalGraph introduces the notion of locality-aware batch scheduling in computation, so that common “bulk” operations on temporal graphs are scheduled to maximize the benefit of in-memory data locality. The design of ImmortalGraph explores an interesting interplay among locality, parallelism, and incremental computation in supporting common mining tasks on temporal graphs. The result is a high-performance temporal-graph system that is up to 5 times more efficient than existing database solutions for graph queries. The locality optimizations in ImmortalGraph offer up to an order of magnitude speedup for temporal iterative graph mining compared to a straightforward application of existing graph engines on a series of snapshots.
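As a rough illustration of the batching idea described above, the sketch below computes a simple per-vertex metric (out-degree) for several snapshots in one pass over each vertex's time-ordered activity list, instead of rebuilding every snapshot separately. It is not ImmortalGraph's implementation or API: the activity record format, the function names, and the sample data are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical edge-activity log: (time, src, dst, is_add).
# is_add=False marks an edge deletion. This format is an assumption.
ACTIVITIES = [
    (1, "a", "b", True),
    (2, "a", "c", True),
    (3, "b", "c", True),
    (4, "a", "b", False),  # edge a->b deleted at time 4
    (5, "c", "a", True),
]


def group_by_vertex(activities):
    """Group activities by source vertex, each group sorted by time."""
    layout = defaultdict(list)
    for time, src, dst, is_add in sorted(activities):
        layout[src].append((time, dst, is_add))
    return layout


def out_degrees_batched(layout, snapshot_times):
    """Out-degree of every source vertex at every snapshot time.

    Each vertex's activity list is scanned once for all snapshots,
    so the state built for an earlier snapshot is reused by later ones.
    """
    times = sorted(snapshot_times)
    result = {t: {} for t in times}
    for vertex, acts in layout.items():
        neighbors = set()
        i = 0
        for t in times:
            # Apply all activities with timestamp <= t.
            while i < len(acts) and acts[i][0] <= t:
                _, dst, is_add = acts[i]
                if is_add:
                    neighbors.add(dst)
                else:
                    neighbors.discard(dst)
                i += 1
            result[t][vertex] = len(neighbors)
    return result


def out_degrees_single_snapshot(activities, t):
    """Baseline: rebuild the snapshot at time t from scratch, then count."""
    edges = set()
    for time, src, dst, is_add in sorted(activities):
        if time > t:
            break
        if is_add:
            edges.add((src, dst))
        else:
            edges.discard((src, dst))
    degrees = defaultdict(int)
    for src, _ in edges:
        degrees[src] += 1
    return dict(degrees)


if __name__ == "__main__":
    layout = group_by_vertex(ACTIVITIES)
    for t, degrees in out_degrees_batched(layout, [2, 3, 5]).items():
        print(t, degrees)
```

The batched version touches each vertex's activities once regardless of how many snapshots are requested, whereas the baseline replays the log once per snapshot. This only hints at the interplay between locality and incremental computation that the abstract mentions; the actual system's layouts and scheduling are described in the paper itself.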

Published in

ACM Transactions on Storage, Volume 11, Issue 3
July 2015
117 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/2809503

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

• Published: 24 July 2015
• Accepted: 1 November 2014
• Received: 1 May 2014

Qualifiers

• research-article
• Research
• Refereed
