research-article

Queryable Compression on Time-evolving Web and Social Networks with Streaming

Published: 21 December 2021

Abstract

Time-evolving web and social network graphs are modeled as a set of pages or individuals (nodes) and the arcs (links or relationships) between them, which change over time. As these networks have grown in popularity, their graphs have become massive in their numbers of nodes and arcs and in the length of their lifetimes. Nevertheless, they remain extremely sparse throughout their lifetimes. For example, Facebook is estimated to have over a billion vertices, yet at any point in time it has far fewer than 0.001% of all possible relationships. Stored with conventional representations such as a series of adjacency matrices or adjacency lists, these large sparse graphs may not fit in the main memory of most machines.
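
To make the sparsity figures above concrete, the following back-of-the-envelope calculation may help. It is only an illustrative sketch: it assumes exactly 10^9 vertices, a directed graph with one bit per adjacency-matrix cell, and an arc list that stores two 4-byte node identifiers per arc. None of these constants comes from the article.

```python
# Rough size estimate for ONE frame of a graph with a billion vertices.
# All constants are illustrative assumptions, not figures from the article.

n = 10**9                      # assumed number of vertices
cells = n * n                  # cells in a directed adjacency matrix
matrix_bytes = cells // 8      # one bit per cell

# 0.001% of all possible relationships is 1 part in 100,000.
max_arcs = cells // 100_000    # upper bound on arcs at that density
arc_list_bytes = max_arcs * 8  # two 4-byte node ids per arc (assumed)

print("adjacency matrix:", matrix_bytes / 10**15, "PB")
print("arc upper bound: ", max_arcs)
print("plain arc list:  ", arc_list_bytes / 10**12, "TB")
```

Under these assumptions a single uncompressed adjacency matrix is far beyond main-memory scale, and even a plain arc list at the 0.001% upper bound is very large; a time-evolving graph multiplies the cost by the number of frames.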

We propose a compressed data structure that keeps one compressed binary tree for each row of each adjacency matrix of the time-evolving graph. We never construct the adjacency matrices explicitly; our algorithms build the structure directly from the time-evolving arc list. The structure supports directed and undirected graphs, faster arc and neighborhood queries, and streaming operations, in which arcs and frames are added to or removed from the compressed structure directly. In experiments on publicly available network data sets such as Flickr, Yahoo!, and Wikipedia, our technique performs as well as or better than our benchmarks on all data sets in terms of compressed size and other vital metrics.
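
The general idea of a per-row binary tree built from an arc list can be sketched as follows. This is a simplified, uncompressed illustration and not the encoding used in the article: the class names `RowTree` and `TemporalGraph`, the `(frame, u, v)` arc format, and the use of a Python set of heap indices are all assumptions made for the example. Each tree covers the columns of one row in one frame and keeps a node only when its column range contains at least one arc, so empty regions of the row cost nothing.

```python
class RowTree:
    """Binary tree over the columns of one adjacency-matrix row (hypothetical).

    Nodes use heap numbering: root is 1, children of i are 2i and 2i+1,
    and column c is the leaf size + c. Only non-empty nodes are stored.
    """

    def __init__(self, width):
        self.size = 1
        while self.size < width:   # round up to a power of two
            self.size *= 2
        self.present = set()

    def add(self, col):
        i = self.size + col
        # Stop early: if a node is present, so are all of its ancestors.
        while i >= 1 and i not in self.present:
            self.present.add(i)
            i //= 2

    def remove(self, col):
        i = self.size + col
        if i not in self.present:
            return
        self.present.discard(i)
        i //= 2
        # Prune ancestors whose subtrees have become empty.
        while i >= 1 and 2 * i not in self.present and 2 * i + 1 not in self.present:
            self.present.discard(i)
            i //= 2

    def has(self, col):
        return self.size + col in self.present

    def columns(self):
        """Return the stored columns in increasing order, skipping empty subtrees."""
        out, stack = [], [1]
        while stack:
            i = stack.pop()
            if i not in self.present:
                continue
            if i >= self.size:
                out.append(i - self.size)
            else:
                stack.append(2 * i + 1)  # right pushed first, so left pops first
                stack.append(2 * i)
        return out


class TemporalGraph:
    """One RowTree per (frame, source node), built from an arc list."""

    def __init__(self, num_nodes):
        self.n = num_nodes
        self.rows = {}  # (frame, u) -> RowTree

    def add_arc(self, frame, u, v):
        key = (frame, u)
        if key not in self.rows:
            self.rows[key] = RowTree(self.n)
        self.rows[key].add(v)

    def remove_arc(self, frame, u, v):
        tree = self.rows.get((frame, u))
        if tree is not None:
            tree.remove(v)
            if not tree.present:
                del self.rows[(frame, u)]

    def has_arc(self, frame, u, v):
        tree = self.rows.get((frame, u))
        return tree is not None and tree.has(v)

    def neighbors(self, frame, u):
        tree = self.rows.get((frame, u))
        return tree.columns() if tree is not None else []

    def remove_frame(self, frame):
        for key in [k for k in self.rows if k[0] == frame]:
            del self.rows[key]


# Build directly from a time-evolving arc list of (frame, u, v) triples.
arcs = [(0, 0, 1), (0, 0, 3), (0, 2, 1), (1, 0, 3)]
g = TemporalGraph(num_nodes=4)
for frame, u, v in arcs:
    g.add_arc(frame, u, v)
```

In this sketch an undirected graph could be handled by inserting each arc in both directions, and the streaming operations correspond to `add_arc`, `remove_arc`, and `remove_frame` acting on the trees in place. A real implementation would replace the set of indices with a compact bit-level encoding.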


    • Published in

      ACM Transactions on the Web, Volume 16, Issue 2
      May 2022
      148 pages
      ISSN: 1559-1131
      EISSN: 1559-114X
      DOI: 10.1145/3506669

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 21 December 2021
      • Revised: 1 October 2021
      • Accepted: 1 October 2021
      • Received: 1 May 2019

      Qualifiers

      • research-article
      • Refereed
