ABSTRACT
Graph databases have gained renewed interest in recent years, due to their applications in areas such as the Semantic Web and social network analysis. We study the problem of querying graph databases and, in particular, the expressiveness and the complexity of evaluation of several general-purpose query languages, such as regular path queries and their extensions with conjunctions and inverses. We distinguish between two semantics for these languages. The first, based on simple paths, easily leads to intractability, while the second, based on arbitrary paths, allows tractable evaluation for an expressive family of languages.
We also study two recent extensions of these languages that have been motivated by modern applications of graph databases. The first allows paths to be treated as first-class citizens, while the second permits queries that combine the topology of the graph with its underlying data.
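As an illustration of why the arbitrary-path semantics admits tractable evaluation, the following is a minimal sketch of one standard technique: searching the product of the graph with a finite automaton for the regular expression. It is not taken from the paper. The function name `eval_rpq`, the edge-triple encoding, the hand-written automaton, and the toy graph are all assumptions made for this example.

```python
from collections import deque


def eval_rpq(edges, nfa_delta, nfa_start, nfa_accept, source):
    """Return every node v such that some path from `source` to v
    (arbitrary, so nodes may repeat) spells a word accepted by the NFA.

    edges:      iterable of (node, label, node) triples
    nfa_delta:  dict mapping (state, label) -> set of successor states
    """
    # Adjacency list: node -> list of (label, target).
    adj = {}
    for u, label, v in edges:
        adj.setdefault(u, []).append((label, v))

    # Breadth-first search over (graph node, automaton state) pairs.
    start = (source, nfa_start)
    seen = {start}
    queue = deque([start])
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in nfa_accept:
            answers.add(node)
        for label, target in adj.get(node, []):
            for next_state in nfa_delta.get((state, label), ()):
                pair = (target, next_state)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return answers


# Hypothetical toy graph: a 3-cycle of "knows" edges plus one "worksAt" edge.
edges = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("c", "worksAt", "x"),
]

# Automaton for the expression knows* worksAt (state 1 is accepting).
knows_star_works = {(0, "knows"): {0}, (0, "worksAt"): {1}}
result = eval_rpq(edges, knows_star_works, 0, {1}, "a")

# Automaton for exactly four "knows" edges (state 4 is accepting).
four_knows = {(i, "knows"): {i + 1} for i in range(4)}
result_four = eval_rpq(edges, four_knows, 0, {4}, "a")
```

Each (node, state) pair is visited at most once, so the search is polynomial in the size of the graph and of the automaton. In the second query the only matching path, a, b, c, a, b, revisits nodes, so `b` is an answer under arbitrary-path semantics but would not be under simple-path semantics.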
Index Terms
Querying graph databases