research-article

A Learning-Based Framework for Improving Querying on Web Interfaces of Curated Knowledge Bases

Published: 05 February 2018

Abstract

Knowledge bases (KBs) are fundamental components of Semantic Web applications because they provide facts and relationships that machines can interpret automatically. Curated KBs usually use the Resource Description Framework (RDF) as their data representation model. To query the RDF-represented knowledge in curated KBs, Web interfaces are built on SPARQL Endpoints. Querying SPARQL Endpoints currently suffers from problems such as network instability and latency, which reduce query efficiency. To address these issues, we propose a client-side caching framework, the SPARQL Endpoint Caching Framework (SECF), which aims to accelerate the overall querying speed over SPARQL Endpoints. SECF identifies queries that are likely to be issued by leveraging the querying patterns learned from clients’ historical queries, and it prefetches and caches these queries. In particular, we develop a distance function based on graph edit distance to measure the similarity of SPARQL queries. We propose a feature modelling method that transforms SPARQL queries into vector representations, which are fed into machine-learning algorithms. A time-aware, smoothing-based method, Modified Simple Exponential Smoothing (MSES), is developed for cache replacement. Extensive experiments on real-world queries demonstrate the effectiveness of our approach, which outperforms the state-of-the-art work in terms of the overall querying speed.
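
The abstract names a time-aware, smoothing-based cache replacement method but does not give its formula. As a rough illustration of the general idea only, the following Python sketch scores cached queries with plain simple exponential smoothing, where each hit counts as an observation of 1 and each idle time step as 0. This is not the paper's MSES definition: the class name `SmoothedCache`, the `alpha` value, the integer time ticks, and the `fetch` callback are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SmoothedCache:
    """Toy query cache; eviction uses an exponentially smoothed hit score.

    Hypothetical illustration, not the MSES method from the paper.
    """
    capacity: int
    alpha: float = 0.2  # smoothing weight (assumed value)
    store: dict = field(default_factory=dict)     # query -> cached result
    score: dict = field(default_factory=dict)     # query -> smoothed score
    last_hit: dict = field(default_factory=dict)  # query -> time of last access

    def _decayed(self, query, now):
        # Time-aware decay: the score shrinks by (1 - alpha) per elapsed tick.
        elapsed = now - self.last_hit[query]
        return self.score[query] * (1.0 - self.alpha) ** elapsed

    def access(self, query, now, fetch):
        """Return the result for `query`, fetching and caching it on a miss."""
        if query in self.store:
            # Hit: blend a new observation (1.0) with the decayed history.
            self.score[query] = self.alpha + self._decayed(query, now)
            self.last_hit[query] = now
            return self.store[query]
        if len(self.store) >= self.capacity:
            # Evict the entry whose decayed score is lowest right now.
            victim = min(self.store, key=lambda q: self._decayed(q, now))
            for table in (self.store, self.score, self.last_hit):
                del table[victim]
        self.store[query] = fetch(query)
        self.score[query] = self.alpha
        self.last_hit[query] = now
        return self.store[query]


# Usage: "a" is requested three times, "b" once, so "b" is evicted for "c".
cache = SmoothedCache(capacity=2)
for t in (0, 1, 2):
    cache.access("a", t, str.upper)
cache.access("b", 3, str.upper)
cache.access("c", 4, str.upper)
```

With a small `alpha`, frequently requested queries keep a high score for longer; a larger `alpha` makes the policy behave more like recency-based replacement.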



        • Published in

          ACM Transactions on Internet Technology, Volume 18, Issue 3
          Special Issue on Artificial Intelligence for Security and Privacy and Regular Papers
          August 2018
          314 pages
          ISSN:1533-5399
          EISSN:1557-6051
          DOI:10.1145/3185332
          • Editor: Munindar P. Singh

          Copyright © 2018 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 5 February 2018
          • Accepted: 1 October 2017
          • Revised: 1 September 2017
          • Received: 1 January 2017


          Qualifiers

          • research-article
          • Research
          • Refereed
