
Regularizing Output Distribution of Abstractive Chinese Social Media Text Summarization for Improved Semantic Consistency

Published: 30 April 2019

Abstract

Abstractive text summarization is a highly difficult problem, and the sequence-to-sequence model has shown success in improving performance on the task. However, the generated summaries are often semantically inconsistent with the source content: when generating summaries, the model selects words that are semantically unrelated to the source content as the most probable output. The problem can be attributed to the heuristically constructed training data, in which summaries can be unrelated to the source content and thus contain semantically unrelated words and spurious word correspondences. In this article, we propose a regularization approach for the sequence-to-sequence model that makes use of what the model has learned to regularize the learning objective and alleviate the effect of the problem. In addition, we propose a practical human evaluation method to address the problem that existing automatic evaluation methods do not properly evaluate semantic consistency with the source content. Experimental results demonstrate the effectiveness of the proposed approach, which outperforms almost all existing models. In particular, the proposed approach improves semantic consistency by 4% in terms of human evaluation.
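As a rough illustration of the idea (not the authors' code), the sketch below regularizes the standard cross-entropy objective with the model's own output distribution, in the spirit of soft-target regularization: the reference-token likelihood term is mixed with a KL term that keeps the predicted distribution close to a distribution the model itself produced earlier. The function name, the mixing weight alpha, and the PyTorch setting are illustrative assumptions, not the paper's exact formulation.

    import torch.nn.functional as F

    def regularized_summarization_loss(logits, targets, soft_targets, alpha=0.5, pad_id=0):
        # logits:       (batch, seq_len, vocab) decoder scores for the generated summary
        # targets:      (batch, seq_len) reference summary token ids (possibly noisy)
        # soft_targets: (batch, seq_len, vocab) output distribution produced earlier by
        #               the model itself; detached so it acts as a fixed regularizer
        log_probs = F.log_softmax(logits, dim=-1)

        # Maximum-likelihood term on the heuristically paired reference summary.
        nll = F.nll_loss(log_probs.transpose(1, 2), targets, ignore_index=pad_id)

        # Regularization term: stay close to what the model has already learned,
        # which down-weights spurious word correspondences in the training pairs.
        kl = F.kl_div(log_probs, soft_targets.detach(), reduction="batchmean")

        # alpha is a hypothetical mixing hyperparameter for this sketch.
        return (1.0 - alpha) * nll + alpha * kl

In such a self-regularization setting, soft_targets could come, for example, from a frozen copy of the model at an earlier stage of training, so that the gold one-hot targets are no longer the only supervision signal.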

