Outline Extraction with Question-Specific Memory Cells

Published: 29 March 2020

Abstract

Outline extraction has been widely applied in online consultation to help experts quickly understand individual cases. Given a specific case described as unstructured plain text, outline extraction aims to summarize the case by answering a set of questions, which constitutes a new type of machine reading comprehension task. Inspired by recently popular memory networks, we propose a novel question-specific memory cell network (QSMCN) that extracts information related to multiple questions on the fly as it reads the text. QSMCN constructs a specific memory cell for each question, which is sequentially expanded in recurrent-neural-network style. Each cell contains three question-specific vectors that first identify whether the current input is related to the corresponding question and then update the question-specific case representation. We add a penalization term to the loss function to make the extracted knowledge more reasonable and interpretable. To support this study, we construct a new outline extraction corpus, InjuryCase, composed of 3,995 real Chinese occupational injury cases. Experimental results show that our method achieves significant improvements. We further apply the proposed framework to two multi-aspect extraction tasks and find that it also remarkably outperforms existing state-of-the-art aspect extraction methods.
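The core idea described in the abstract — one memory cell per question, each holding question-specific vectors that gate whether the current token is relevant and then update a question-specific representation — can be illustrated with a minimal sketch. This is a hypothetical parameterization in NumPy (the vector names `key`, `W_gate`, and `W_cand` and the exact update rule are assumptions, not the paper's equations); it shows the control flow of reading a token sequence once while maintaining a separate memory per question.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8            # token embedding size (assumed)
M = 8            # memory size per question (assumed)
N_QUESTIONS = 3  # number of outline questions
SEQ_LEN = 5      # tokens in the case text

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class QuestionMemoryCell:
    """One cell per question, holding three question-specific
    parameter blocks: a key to score relevance of the current token,
    a gate projection, and a candidate projection for the update."""
    def __init__(self, d, m, rng):
        self.key = rng.normal(size=d)                     # relevance scoring
        self.W_gate = rng.normal(size=(m, d + m)) * 0.1   # update gate
        self.W_cand = rng.normal(size=(m, d + m)) * 0.1   # candidate memory
        self.memory = np.zeros(m)                         # question-specific state

    def step(self, x):
        # (1) Is the current input related to this question?
        relevance = sigmoid(self.key @ x)
        # (2) Gated, RNN-style update of the question-specific memory.
        h = np.concatenate([x, self.memory])
        gate = relevance * sigmoid(self.W_gate @ h)
        cand = np.tanh(self.W_cand @ h)
        self.memory = (1.0 - gate) * self.memory + gate * cand
        return relevance

cells = [QuestionMemoryCell(D, M, rng) for _ in range(N_QUESTIONS)]
tokens = rng.normal(size=(SEQ_LEN, D))   # stand-in for embedded case text

for x in tokens:                          # single pass over the text,
    for cell in cells:                    # all question memories updated on the fly
        cell.step(x)

# One representation per question, usable to answer the outline questions.
summaries = np.stack([c.memory for c in cells])
print(summaries.shape)
```

Because each memory is a convex combination of its previous value and a `tanh` candidate, the question-specific representations stay bounded; the relevance score acts as the per-question filter the abstract describes, and the paper's penalization term would be added on top of this during training.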



Published in
ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 4, July 2020, 291 pages
ISSN: 2375-4699, EISSN: 2375-4702
DOI: 10.1145/3391538

          Copyright © 2020 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Publication History
• Received: 1 July 2018
• Revised: 1 September 2019
• Accepted: 1 December 2019
• Published: 29 March 2020
