DOI: 10.1145/2872427.2883080
research-article

Table Cell Search for Question Answering

Published: 11 April 2016

Abstract

Tables are pervasive on the Web. Informative web tables cover a wide variety of topics and can naturally serve as a significant resource for satisfying user information needs. Motivated by these observations, this paper investigates an important yet largely under-addressed problem: given millions of tables, how can we precisely retrieve the table cells that answer a user question? We propose a novel table cell search framework to address this problem. We first formulate the concept of a relational chain, which connects two cells in a table and represents the semantic relation between them. With the help of search engine snippets, our framework generates a set of relational chains pointing to potentially correct answer cells. We then employ deep neural networks to perform finer-grained inference on which relational chains best match the input question, and finally extract the corresponding answer cells. Using millions of tables crawled from the Web, we evaluate our framework in the open-domain question answering (QA) setting, on both the well-known WebQuestions dataset and user queries mined from Bing search engine logs. On WebQuestions, our framework is comparable to state-of-the-art QA systems based on knowledge bases (KBs), while on Bing queries it outperforms other systems with a 56.7% relative gain. Moreover, when combined with results from our framework, KB-based QA performance obtains a relative improvement of 28.1% to 66.7%, demonstrating that web tables supply rich knowledge that might not exist in, or is difficult to identify in, existing KBs.
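To make the idea of a relational chain more concrete, the following is a minimal sketch, not the paper's implementation. It assumes a chain links a topic cell mentioned in the question to another cell in the same row, labeled by the two column headers. Candidate generation here uses plain string matching rather than search engine snippets, and a toy token-overlap score stands in for the deep neural matcher. The names `RelationalChain`, `candidate_chains`, `overlap_score`, and `answer` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationalChain:
    """Hypothetical chain: topic cell -> answer cell within one table row."""
    topic_header: str
    answer_header: str
    topic_cell: str
    answer_cell: str


def candidate_chains(header, rows, question):
    """Build chains starting from any non-empty cell whose text occurs in the question.

    Assumption: simple substring matching replaces the snippet-based
    candidate generation described in the abstract.
    """
    q = question.lower()
    chains = []
    for row in rows:
        for i, cell in enumerate(row):
            if cell and cell.lower() in q:
                for j, other in enumerate(row):
                    if j != i:
                        chains.append(
                            RelationalChain(header[i], header[j], cell, other)
                        )
    return chains


def overlap_score(chain, question):
    """Toy stand-in for the neural matcher: shared tokens between the
    question and the header of the answer column."""
    q_tokens = set(question.lower().replace("?", "").split())
    h_tokens = set(chain.answer_header.lower().split())
    return len(q_tokens & h_tokens)


def answer(header, rows, question):
    """Return the answer cell of the best-scoring chain, or None if no chain exists."""
    chains = candidate_chains(header, rows, question)
    if not chains:
        return None
    best = max(chains, key=lambda c: overlap_score(c, question))
    return best.answer_cell


# Illustrative table (made-up example data, not from the paper).
header = ["Country", "Capital", "Currency"]
rows = [["France", "Paris", "Euro"], ["Japan", "Tokyo", "Yen"]]
print(answer(header, rows, "What is the capital of Japan?"))  # Tokyo
```

In this toy setting the question mentions "Japan", which yields two chains (Country to Capital and Country to Currency); the Capital chain wins because its header shares a token with the question.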

Published In

WWW '16: Proceedings of the 25th International Conference on World Wide Web
April 2016
1482 pages
ISBN:9781450341431

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

International World Wide Web Conferences Steering Committee

Republic and Canton of Geneva, Switzerland

Author Tags

  1. knowledge bases
  2. question answering
  3. table cell search
  4. web search

Qualifiers

  • Research-article

Conference

WWW '16
Sponsor:
  • IW3C2
WWW '16: 25th International World Wide Web Conference
April 11 - 15, 2016
Montréal, Québec, Canada

Acceptance Rates

WWW '16 Paper Acceptance Rate 115 of 727 submissions, 16%;
Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2024) A Table Question Alignment based Cell-Selection Method for Table-Text QA. Journal of Natural Language Processing 31:1 (189-211). DOI: 10.5715/jnlp.31.189. Online publication date: 2024
  • (2024) Table-GPT: Table Fine-tuned GPT for Diverse Table Tasks. Proceedings of the ACM on Management of Data 2:3 (1-28). DOI: 10.1145/3654979. Online publication date: 30-May-2024
  • (2024) The power and potentials of Flexible Query Answering Systems: A critical and comprehensive analysis. Data & Knowledge Engineering 149 (102246). DOI: 10.1016/j.datak.2023.102246. Online publication date: Jan-2024
  • (2024) Boolean interpretation, matching, and ranking of natural language queries in product selection systems. Discover Computing 27:1. DOI: 10.1007/s10791-024-09432-x. Online publication date: 3-Apr-2024
  • (2023) SAND: Semantic Annotation of Numeric Data in Web Tables. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (2342-2351). DOI: 10.1145/3583780.3615046. Online publication date: 21-Oct-2023
  • (2023) Exploring Chart Question Answering for Blind and Low Vision Users. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (1-15). DOI: 10.1145/3544548.3581532. Online publication date: 19-Apr-2023
  • (2023) Detecting Semantic Errors in Tables using Textual Evidence. 2023 IEEE International Conference on Big Data (BigData) (292-303). DOI: 10.1109/BigData59044.2023.10386632. Online publication date: 15-Dec-2023
  • (2023) Techniques, datasets, evaluation metrics and future directions of a question answering system. Knowledge and Information Systems 66:4 (2235-2268). DOI: 10.1007/s10115-023-02019-w. Online publication date: 22-Dec-2023
  • (2023) Dependency-Aware Core Column Discovery for Table Understanding. The Semantic Web – ISWC 2023 (159-178). DOI: 10.1007/978-3-031-47240-4_9. Online publication date: 27-Oct-2023
  • (2022) Classification of Layout vs. Relational Tables on the Web: Machine Learning with Rendered Pages. ACM Transactions on the Web 17:1 (1-23). DOI: 10.1145/3555349. Online publication date: 20-Dec-2022
