New Vietnamese Corpus for Machine Reading Comprehension of Health News Articles

Published: 23 September 2022
Abstract

Machine reading comprehension (MRC) is a natural language understanding task in which a computing system must read a text and then find the answer to a specific question posed by a human. Large-scale, high-quality corpora are necessary for evaluating MRC models. Furthermore, MRC for the health sector has potential for practical applications; nevertheless, MRC research in this domain is currently scarce. This article presents UIT-ViNewsQA, a new Vietnamese corpus for evaluating MRC models in the healthcare textual domain. The corpus consists of 22,057 human-generated question-answer pairs. Crowd-workers created the questions and answers from a collection of 4,416 online Vietnamese healthcare news articles, where each answer is a textual span extracted from the corresponding article. We introduce a process for creating a high-quality corpus for the Vietnamese machine reading comprehension task. Linguistically, our corpus accommodates diversity in question and answer types. In addition, we conduct experiments comparing the effectiveness of different MRC methods based on neural network and transformer architectures. Experimental results on our corpus show that the MRC system based on the ALBERT architecture outperforms both the neural network architectures and the BERT-based approach, achieving an exact match score of 65.26% and an F1-score of 84.89%. The best machine model's F1-score is still about 10.90% lower than human performance, which shows that building models that surpass humans on UIT-ViNewsQA remains a challenge for future research. Our corpus is publicly available for research purposes on our website: http://nlp.uit.edu.vn/datasets.
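The exact match and F1 scores reported above are the standard metrics for span-extraction MRC. As a rough illustration (not the authors' evaluation script), a minimal SQuAD-style computation of both metrics might look like the sketch below; the normalization here lowercases, strips punctuation, and collapses whitespace, and deliberately omits English article removal since that step does not apply to Vietnamese:

```python
import string
from collections import Counter

def normalize(text):
    # Lowercase, strip punctuation, collapse whitespace.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())

def exact_match(prediction, gold):
    # 1.0 if the normalized strings are identical, else 0.0.
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    # Token-overlap F1 between predicted and gold answer spans.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, a prediction that contains the gold answer plus two extra tokens gets an F1 of 0.5 (precision 1/3, recall 1) but an exact match of 0. In practice each question has several gold answers and the maximum score over them is taken per question, then averaged over the corpus.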


Published in

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 5, September 2022, 486 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3533669

Publisher

Association for Computing Machinery, New York, NY, United States

        Publication History

        • Published: 23 September 2022
        • Online AM: 2 May 2022
        • Accepted: 3 February 2022
        • Revised: 26 November 2021
        • Received: 29 May 2020

        Qualifiers

        • research-article
        • Refereed
