
In-Memory Subgraph Matching: An In-depth Study

Published: 31 May 2020

Abstract

We study the performance of eight representative in-memory subgraph matching algorithms. Specifically, we put QuickSI, GraphQL, CFL, CECI, DP-iso, RI and VF2++ in a common framework to compare them on the following four aspects: (1) the method of filtering candidate vertices in the data graph; (2) the method of ordering query vertices; (3) the method of enumerating partial results; and (4) other optimization techniques. Then, we compare the overall performance of these algorithms with Glasgow, an algorithm based on constraint programming. Through experiments, we find that (1) the filtering method of GraphQL is competitive with that of the latest algorithms CFL, CECI and DP-iso in terms of pruning power; (2) the ordering methods in GraphQL and RI are usually the most effective; (3) the set-intersection-based local candidate computation in CECI and DP-iso performs the best in the enumeration; and (4) the failing-set pruning in DP-iso can significantly improve performance when queries become large. Our source code is publicly available at https://github.com/RapidsAtHKUST/SubgraphMatching.
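
To make the first three aspects concrete, the following is a minimal Python sketch of a filter, order, enumerate pipeline for labeled, undirected graphs. It is not taken from the paper or its repository. The function names, the plain label-and-degree filter, and the greedy "fewest candidates first" order are illustrative assumptions, much simpler than the filters and orderings of GraphQL, CFL, CECI, DP-iso or RI. Only the enumeration step follows the idea named in the abstract: local candidates are obtained by intersecting the candidate set with the neighbor sets of already-matched vertices. Failing-set pruning is omitted, and the query graph is assumed to be connected.

```python
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[int, Set[int]]  # vertex -> set of neighbors (undirected)


def build_graph(n: int, edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build undirected adjacency sets for vertices 0..n-1."""
    adj: Graph = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def filter_candidates(q_adj: Graph, q_label: Dict[int, str],
                      g_adj: Graph, g_label: Dict[int, str]) -> Dict[int, Set[int]]:
    """Assumed simple filter: same label and degree at least as large."""
    return {
        u: {v for v in g_adj
            if g_label[v] == q_label[u] and len(g_adj[v]) >= len(q_adj[u])}
        for u in q_adj
    }


def matching_order(q_adj: Graph, cand: Dict[int, Set[int]]) -> List[int]:
    """Assumed greedy order: fewest candidates first, staying connected.

    Requires a connected query graph.
    """
    order = [min(q_adj, key=lambda u: (len(cand[u]), u))]
    while len(order) < len(q_adj):
        placed = set(order)
        frontier = [u for u in q_adj if u not in placed and q_adj[u] & placed]
        order.append(min(frontier, key=lambda u: (len(cand[u]), u)))
    return order


def enumerate_matches(q_adj: Graph, g_adj: Graph,
                      cand: Dict[int, Set[int]], order: List[int]) -> List[Dict[int, int]]:
    """Backtracking enumeration with set-intersection local candidates."""
    matches: List[Dict[int, int]] = []
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(depth: int) -> None:
        if depth == len(order):
            matches.append(dict(mapping))
            return
        u = order[depth]
        local = set(cand[u])
        # Intersect with the neighbors of every already-matched query neighbor.
        for w in q_adj[u]:
            if w in mapping:
                local &= g_adj[mapping[w]]
        for v in sorted(local):
            if v in used:  # keep the mapping injective
                continue
            mapping[u] = v
            used.add(v)
            extend(depth + 1)
            del mapping[u]
            used.discard(v)

    extend(0)
    return matches


# Query: a triangle with labels A, B, C.
q_adj = build_graph(3, [(0, 1), (1, 2), (0, 2)])
q_label = {0: "A", 1: "B", 2: "C"}

# Data graph: two triangles sharing edge 0-1, plus a pendant vertex 4.
g_adj = build_graph(5, [(0, 1), (1, 2), (0, 2), (0, 3), (1, 3), (0, 4)])
g_label = {0: "A", 1: "B", 2: "C", 3: "C", 4: "C"}

cand = filter_candidates(q_adj, q_label, g_adj, g_label)
order = matching_order(q_adj, cand)
matches = enumerate_matches(q_adj, g_adj, cand, order)
```

In this toy input, data vertex 4 carries the right label for query vertex 2 but has degree 1, so the filter discards it before enumeration starts. The two remaining candidates, 2 and 3, both survive the intersection with the neighbors of the vertices matched to query vertices 0 and 1.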



    Published In

    SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
    June 2020
    2925 pages
    ISBN:9781450367356
    DOI:10.1145/3318464

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. comparison and analysis
    2. graph
    3. subgraph matching

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '20

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%
