ABSTRACT
Graph data appears in a variety of application domains, and many uses of it, such as querying, matching, and transforming data, naturally result in incompletely specified graph data, i.e., graph patterns. While queries need to be posed against such data, techniques for querying patterns are generally lacking, and properties of such queries are not well understood.
Our goal is to study the basics of querying graph patterns. We first identify key features of patterns, such as node and label variables and edges specified by regular expressions, and define a classification of patterns based on them. We then study standard graph queries on graph patterns, and give precise characterizations of both data and combined complexity for each class of patterns. Where complexity is high, we further analyze the features that lead to intractability, as well as restrictions that lower the complexity. We introduce a new automata model for query answering with two modes of acceptance: one captures queries returning nodes, and the other captures queries returning paths. We study properties of such automata and the key computational tasks associated with them. Finally, we provide additional restrictions for tractability, and show that some intractable cases can be naturally cast as instances of the constraint satisfaction problem.
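To make the notion of an edge specified by a regular expression concrete, the following is a minimal sketch and is not taken from the paper. It assumes a concrete edge-labelled graph given as (source, label, target) triples and a regular expression already compiled by hand into a nondeterministic finite automaton; the function name `regex_edge_pairs`, the sample graph, and the automaton are all hypothetical. It computes the node pairs connected by a path whose label sequence the automaton accepts, using the standard search over the product of graph and automaton.

```python
from collections import deque

def regex_edge_pairs(edges, nfa_delta, nfa_start, nfa_final):
    """Return all (u, v) such that some path from u to v spells a word the NFA accepts."""
    # Adjacency list: node -> list of (label, successor).
    adj = {}
    nodes = set()
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))
        nodes.update((u, v))

    result = set()
    for source in nodes:
        # Breadth-first search over (graph node, automaton state) pairs.
        start = (source, nfa_start)
        seen = {start}
        queue = deque([start])
        while queue:
            node, state = queue.popleft()
            if state in nfa_final:
                result.add((source, node))
            for label, nxt in adj.get(node, []):
                for nxt_state in nfa_delta.get((state, label), ()):
                    pair = (nxt, nxt_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return result

# Hypothetical data: a three-node graph and an NFA for the expression a b*.
edges = [("n1", "a", "n2"), ("n2", "b", "n3"), ("n3", "b", "n3")]
delta = {(0, "a"): {1}, (1, "b"): {1}}
pairs = regex_edge_pairs(edges, delta, 0, {1})
print(sorted(pairs))
```

The sketch only covers matching a single regular-expression edge against a fully specified graph. Patterns with node or label variables, and the complexity questions the abstract refers to, are beyond what it illustrates.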