A bottom-up, knowledge-aware approach to integrating and querying web data services

Published: 01 November 2013

Abstract

As a wealth of data services is becoming available on the Web, building and querying Web applications that effectively integrate their content is increasingly important. However, schema integration and ontology matching with the aim of registering data services often require a knowledge-intensive, tedious, and error-prone manual process.

We tackle this issue by presenting a bottom-up, semi-automatic service registration process that refers to an external knowledge base and uses simple text processing techniques in order to minimize and possibly avoid the contribution of domain experts in the annotation of data services. The first by-product of this process is a representation of the domain of data services as an entity-relationship diagram, whose entities are named after concepts of the external knowledge base matching service terminology rather than being manually created to accommodate an application-specific ontology. Second, a three-layer annotation of service semantics (service interfaces, access patterns, service marts) describing how services “play” with such domain elements is also automatically constructed at registration time. When evaluated against heterogeneous existing data services and with a synthetic service dataset constructed using Google Fusion Tables, the approach yields good results in terms of data representation accuracy.
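As a rough illustration of the registration idea described above, the following Python sketch matches service terminology against a toy knowledge base and files each service under a three-layer structure (service mart, access pattern, service interface). It is not the authors' implementation: the knowledge-base entries, the crude tokenization rule, the `register` function, and the example URLs are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Toy stand-in for an external knowledge base: concept -> surface forms.
# These entries are invented for illustration only.
KNOWLEDGE_BASE = {
    "Movie": {"movie", "film"},
    "Theater": {"theater", "theatre", "cinema"},
    "City": {"city", "town"},
}

def normalize(term: str) -> List[str]:
    """Split camelCase / snake_case names into lowercase tokens and
    drop a trailing 's' from tokens longer than three letters."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", term)
    tokens = [t.lower() for t in re.split(r"[^A-Za-z]+", spaced) if t]
    return [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens]

def match_concept(term: str) -> Optional[str]:
    """Return the first knowledge-base concept whose surface forms contain a token."""
    for token in normalize(term):
        for concept, forms in KNOWLEDGE_BASE.items():
            if token in forms:
                return concept
    return None

@dataclass
class ServiceInterface:      # concrete service (hypothetical endpoint)
    name: str
    endpoint: str

@dataclass
class AccessPattern:         # input/output signature shared by interfaces
    inputs: List[str]
    outputs: List[str]
    interfaces: List[ServiceInterface] = field(default_factory=list)

@dataclass
class ServiceMart:           # domain entity named after a knowledge-base concept
    concept: str
    patterns: List[AccessPattern] = field(default_factory=list)

def register(name: str, endpoint: str, inputs: List[str], outputs: List[str],
             marts: Dict[str, ServiceMart]) -> Optional[ServiceMart]:
    """File a service under the mart of the first concept matched by its
    name, outputs, or inputs (in that order); return None if nothing matches."""
    candidates = (match_concept(t) for t in [name] + outputs + inputs)
    concept = next((c for c in candidates if c), None)
    if concept is None:
        return None  # in this sketch, such a service would need manual annotation
    mart = marts.setdefault(concept, ServiceMart(concept))
    pattern = next((p for p in mart.patterns
                    if p.inputs == inputs and p.outputs == outputs), None)
    if pattern is None:
        pattern = AccessPattern(inputs, outputs)
        mart.patterns.append(pattern)
    pattern.interfaces.append(ServiceInterface(name, endpoint))
    return mart

marts: Dict[str, ServiceMart] = {}
register("getMoviesByCity", "https://example.org/movies", ["city"], ["movieTitle", "year"], marts)
register("FilmFinder", "https://example.org/films", ["city"], ["movieTitle", "year"], marts)
```

In this sketch the two example services land in the same service mart and share one access pattern, because their names resolve to the same concept and their input/output signatures coincide. A real system would need a far richer matching step than the token lookup used here.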

We subsequently demonstrate that natural language processing methods can be used to decompose and match simple queries to the data services represented in three layers according to the preceding methodology with satisfactory results. We show how semantic annotations are used at query time to convert the user's request into an executable logical query. Globally, our findings show that the proposed registration method is effective in creating a uniform semantic representation of data services, suitable for building Web applications and answering search queries.
