Integrating Multi-level Tag Recommendation with External Knowledge Bases for Automatic Question Answering

Published: 07 May 2019

Abstract

We focus on using unstructured, natural-language textual Knowledge Bases (KBs) to answer questions from community-based Question-and-Answer (Q&A) websites. We propose a novel framework that integrates multi-level tag recommendation with external KBs to retrieve the KB articles most relevant to user-posted questions. Unlike many existing efforts that rely primarily on the Q&A sites’ own historical data (e.g., user answers), retrieving answers from authoritative external KBs (e.g., online programming documentation repositories) has the potential to provide rich information that helps users better understand the problem, acquire the relevant knowledge, and hence avoid asking similar questions in the future. The proposed multi-level tag recommendation leverages the rich tag information by first categorizing tags into different semantic levels based on their usage frequencies. A post-tag co-clustering model, augmented by a two-step tag recommender, then predicts tags at different levels for a given user-posted question. A KB article retrieval component uses the recommended multi-level tags to select the appropriate KBs and to search and rank the matching articles within them. We conduct extensive experiments using real-world data from a Q&A site and multiple external KBs to demonstrate the effectiveness of the proposed question-answering framework.
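
The first step named in the abstract, categorizing tags into levels by usage frequency, can be illustrated with a minimal sketch. The abstract does not give the actual categorization rule, so the function name `assign_tag_levels`, the count thresholds, and the sample posts below are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def assign_tag_levels(posts_tags, thresholds=(3, 2)):
    """Bucket tags into levels by how many posts use them.

    `thresholds` holds hypothetical minimum post counts, most general
    level first. A tag used in at least thresholds[0] posts gets level 0,
    one used in at least thresholds[1] posts gets level 1, and every
    remaining tag gets the last (most specific) level.
    """
    # Count each tag once per post so duplicates inside a post do not inflate it.
    counts = Counter(tag for tags in posts_tags for tag in set(tags))
    levels = {}
    for tag, n in counts.items():
        level = len(thresholds)  # default: most specific level
        for i, minimum in enumerate(thresholds):
            if n >= minimum:
                level = i
                break
        levels[tag] = level
    return levels


# Toy data (invented for this sketch): the tag lists of three posts.
posts_tags = [
    ["python", "django", "django-forms"],
    ["python", "django", "orm"],
    ["python", "numpy"],
]
levels = assign_tag_levels(posts_tags)
```

With this toy input, a frequently used tag such as `python` lands in the most general level and rarely used tags such as `orm` in the most specific one. In a pipeline like the one described, the general tags would plausibly drive KB selection and the specific tags would narrow the article search, but that mapping is also an assumption here.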

      • Published in

        ACM Transactions on Internet Technology, Volume 19, Issue 3
        Special Section on Advances in Internet-Based Collaborative Technologies
        August 2019
        289 pages
        ISSN:1533-5399
        EISSN:1557-6051
        DOI:10.1145/3329912
        • Editor: Ling Liu

        Copyright © 2019 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 7 May 2019
        • Accepted: 1 March 2019
        • Revised: 1 January 2019
        • Received: 1 November 2017
        Qualifiers

        • research-article
        • Research
        • Refereed
