Abstract
Abstractive text summarization is a challenging problem, and sequence-to-sequence models have substantially improved performance on the task. However, the generated summaries are often semantically inconsistent with the source content: when generating a summary, the model selects words that are semantically unrelated to the source as the most probable output. The problem can be attributed to the heuristically constructed training data, in which summaries may be unrelated to the source content and thus contain semantically unrelated words and spurious word correspondences. In this article, we propose a regularization approach for the sequence-to-sequence model that uses what the model has already learned to regularize the learning objective, alleviating the effect of this problem. In addition, we propose a practical human evaluation method to address the fact that existing automatic evaluation methods do not properly assess semantic consistency with the source content. Experimental results demonstrate the effectiveness of the proposed approach, which outperforms almost all existing models; in particular, it improves semantic consistency by 4% in terms of human evaluation.
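The regularization idea described above — using what the model has already learned to soften the training signal from noisy hard labels — can be sketched as a blended loss between the hard target and a soft target distribution produced by the model itself (e.g. an earlier snapshot of its own output distribution). This is a minimal illustrative sketch, not the paper's exact formulation; the function name, the blending weight `alpha`, and the source of the soft target are all assumptions:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a vocabulary-sized logit vector.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def regularized_loss(logits, target_idx, soft_target, alpha=0.5):
    """Blend of two cross-entropy terms (hypothetical formulation):
    - a hard term against the reference summary word `target_idx`,
    - a soft term against `soft_target`, a distribution the model has
      learned previously, so that a spurious hard label (a summary word
      unrelated to the source) pulls the parameters less strongly.
    """
    p = softmax(logits)
    hard = -np.log(p[target_idx] + 1e-12)            # standard MLE term
    soft = -np.sum(soft_target * np.log(p + 1e-12))  # cross-entropy to soft target
    return (1.0 - alpha) * hard + alpha * soft
```

With `alpha = 0` this reduces to ordinary maximum-likelihood training; larger `alpha` shifts weight toward the model's own learned distribution, which is the hedge against unreliable training labels.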
Regularizing Output Distribution of Abstractive Chinese Social Media Text Summarization for Improved Semantic Consistency