Abstract
Owing to the availability of various large-scale Machine Reading Comprehension (MRC) datasets, building effective models that extract answer spans from a passage has been well studied in prior work. In practice, however, some questions cannot be answered from the passage at all, which makes the task considerably harder. In this article, we propose an Interactive Gated Decoder (IG Decoder), which models the interaction between answer span prediction and no-answer prediction through a gating mechanism. We also propose a simple but effective approach for automatically generating pseudo training data, aiming to enrich the training data for unanswerable questions. Experimental results on the popular SQuAD 2.0 and NewsQA benchmarks show that the proposed approaches yield consistent improvements over strong BERT-large and ALBERT-xxlarge baseline systems. We also provide detailed ablations of the proposed method and an error analysis on hard samples, which may be helpful for future research.
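To make the gating idea concrete, here is a minimal sketch of how span prediction and no-answer prediction can interact through a gate, assuming BERT-style token representations of shape (batch, seq_len, hidden). The module layout, the single-layer gate, and the use of the [CLS] vector as the global no-answer view are illustrative assumptions, not the paper's exact architecture.

```python
# A minimal sketch, not the paper's exact IG Decoder: a sigmoid gate lets the
# global ([CLS]-based) no-answer view modulate per-token span representations.
import torch
import torch.nn as nn

class InteractiveGatedDecoder(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.span_head = nn.Linear(hidden_size, 2)       # start/end logits per token
        self.na_head = nn.Linear(hidden_size, 1)         # no-answer score from [CLS]
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden), e.g., the top BERT layer
        cls = hidden_states[:, 0]                        # global [CLS] representation
        expanded = cls.unsqueeze(1).expand_as(hidden_states)
        # Gate value per token: how much to keep the local token view
        # versus the global no-answer view.
        g = torch.sigmoid(self.gate(torch.cat([hidden_states, expanded], dim=-1)))
        gated = g * hidden_states + (1.0 - g) * expanded

        start_logits, end_logits = self.span_head(gated).split(1, dim=-1)
        na_logit = self.na_head(cls)                     # higher => more likely unanswerable
        return start_logits.squeeze(-1), end_logits.squeeze(-1), na_logit.squeeze(-1)
```

At inference time, one would typically compare the no-answer logit against the best span score (with a threshold tuned on the development set) to decide whether to abstain, following the standard SQuAD 2.0 practice.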