research-article

Implicit Discourse Relation Recognition for English and Chinese with Multiview Modeling and Effective Representation Learning

Published: 17 March 2017

Abstract

Discourse relations between two text segments play an important role in many Natural Language Processing (NLP) tasks. Connectives strongly indicate the sense of a discourse relation, but a large proportion of discourse relations have no connective at all; these are implicit discourse relations. Compared with explicit relations, implicit relations are much harder to detect and have drawn significant attention. Many studies have focused on English implicit discourse relations, but few address implicit relation recognition in Chinese, even though implicit discourse relations are more common in Chinese than in English. Our work addresses both English and Chinese. The key to implicit relation prediction is to properly model the semantics of the two discourse arguments, as well as the contextual interaction between them. To this end, we propose a neural-network-based framework that consists of two hierarchies. The first is the model hierarchy, in which we propose a max-margin learning method to explore the implicit discourse relation from multiple views. The second is the feature hierarchy, in which we learn multilevel distributed representations from words, arguments, and syntactic structures to sentences. We conducted experiments on the standard English and Chinese benchmarks, and the results show that, compared with several other methods, our proposed method achieves the best performance in most cases.
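
The abstract names a max-margin objective over multiple views but does not specify it. As a rough illustration only, the sketch below shows a generic multiclass hinge loss applied to relation scores summed across views. The function names (`combine_views`, `max_margin_loss`), the summation rule, the margin value, and the example scores are all assumptions made for this sketch, not the formulation used in the article.

```python
from typing import Dict, List


def combine_views(view_scores: List[Dict[str, float]]) -> Dict[str, float]:
    """Sum per-relation scores across views (a hypothetical combination rule)."""
    combined: Dict[str, float] = {}
    for scores in view_scores:
        for relation, score in scores.items():
            combined[relation] = combined.get(relation, 0.0) + score
    return combined


def max_margin_loss(scores: Dict[str, float], gold: str, margin: float = 1.0) -> float:
    """Generic multiclass hinge loss.

    The loss is zero only when the gold relation outscores every other
    relation by at least `margin`; otherwise it equals the largest violation.
    """
    gold_score = scores[gold]
    worst_violation = max(
        (margin + s - gold_score for r, s in scores.items() if r != gold),
        default=0.0,
    )
    return max(0.0, worst_violation)


# Illustrative scores from two hypothetical views of one argument pair.
views = [
    {"Comparison": 0.2, "Contingency": 1.0},
    {"Comparison": 0.5, "Contingency": 0.4},
]
combined = combine_views(views)
loss = max_margin_loss(combined, gold="Contingency")
```

In this toy case the gold relation leads by 0.7, which is less than the margin of 1.0, so the loss is positive and training would push the two scores further apart.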


    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 16, Issue 3
      September 2017
      167 pages
      ISSN:2375-4699
      EISSN:2375-4702
      DOI:10.1145/3041821

      Copyright © 2017 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 17 March 2017
      • Accepted: 1 December 2016
      • Revised: 1 September 2016
      • Received: 1 May 2016


      Qualifiers

      • research-article
      • Research
      • Refereed
