Abstract
By leveraging self-supervised tasks, pre-trained language models (PLMs) have made significant progress in machine reading comprehension (MRC). However, in classical Chinese MRC (CCMRC), the passage is typically written in classical Chinese, while the question and options are given in modern Chinese. Existing pre-training methods seldom model the relationship between the two styles, which leads to an overall misunderstanding of the passage. In this paper, we propose a contrastive learning method between classical and modern Chinese to reach a deep understanding of both styles. In particular, we define a novel pre-training task and an enhanced co-matching network: (1) the synonym discrimination (SD) task identifies whether a modern Chinese meaning corresponds to a given classical Chinese expression; (2) the enhanced dual co-matching (EDCM) network enables a more interactive understanding of the classical passage and the modern options. Experimental results show that the proposed method improves language understanding and outperforms existing PLMs on the Haihua, CCLUE, and ChID datasets.
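To make the synonym discrimination (SD) idea concrete, the sketch below shows one plausible way such a pre-training objective could be set up: an encoder reads a classical Chinese span paired with a modern Chinese gloss and predicts whether the gloss is a correct paraphrase or a sampled distractor. This is a minimal illustration only; the model name, pairing scheme, and negative-sampling strategy are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a synonym-discrimination (SD) style objective.
# All names (sd_head, sd_loss) and the base checkpoint are illustrative.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")
sd_head = nn.Linear(encoder.config.hidden_size, 2)  # match / no-match

def sd_loss(classical_spans, modern_glosses, labels):
    # Encode each (classical span, modern gloss) pair as one sequence:
    # [CLS] classical [SEP] modern [SEP]
    batch = tokenizer(classical_spans, modern_glosses,
                      padding=True, truncation=True, return_tensors="pt")
    cls_repr = encoder(**batch).last_hidden_state[:, 0]  # [CLS] vectors
    logits = sd_head(cls_repr)
    return nn.functional.cross_entropy(logits, labels)

# Toy usage: one positive pair and one sampled negative gloss.
classical = ["吾日三省吾身", "学而时习之"]
modern    = ["我每天多次反省自己", "喜欢吃苹果"]
labels    = torch.tensor([1, 0])
loss = sd_loss(classical, modern, labels)
loss.backward()
```

Framing SD as a pairwise discrimination over (classical, modern) inputs is one natural reading of the abstract; a contrastive formulation with in-batch negatives would be an equally plausible alternative.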