research-article

From Softmax to Nucleusmax: A Novel Sparse Language Model for Chinese Radiology Report Summarization

Published: 16 June 2023

Abstract

Chinese radiology report summarization is a crucial component of smart healthcare: language models summarize the key findings of radiology reports and communicate them to physicians. However, most language models for radiology report summarization use a softmax transformation in their output layer, which yields dense alignments and strictly positive output probabilities. This density is inefficient: it reduces model interpretability and assigns probability mass to many unrealistic outputs. To tackle this issue, we propose a novel approach named nucleusmax. Nucleusmax mitigates dense outputs and improves model interpretability by truncating the unreliable tail of the probability distribution. In addition, we combine nucleusmax with a copy mechanism, a useful technique for avoiding errors in professional terminology in the generated diagnostic opinions. To further promote research on radiology report summarization, we have also created a freely available Chinese radiology report summarization dataset. Experimental results from both automatic and human evaluation show that the proposed approach substantially improves the sparsity and overall quality of outputs over competitive softmax models, producing radiology summaries that approach the quality of those authored by physicians. Overall, our work demonstrates the feasibility and promise of applying language models to radiology and smart healthcare.
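
To make the idea concrete, below is a minimal PyTorch sketch of a nucleus-style sparse output transformation, assuming that nucleusmax follows the top-p (nucleus) truncation popularized for sampling: keep the smallest head of the softmax distribution whose cumulative mass reaches a threshold p, zero out the tail, and renormalize. The function name, the hyperparameter p, and all implementation details here are illustrative assumptions, not the authors' exact formulation.

    import torch

    def nucleusmax(logits: torch.Tensor, p: float = 0.9) -> torch.Tensor:
        """Sketch of a nucleus-style sparse transformation (assumed form):
        keep the smallest set of tokens whose cumulative softmax mass
        reaches p, zero out the tail, and renormalize."""
        probs = torch.softmax(logits, dim=-1)
        sorted_probs, sorted_idx = torch.sort(probs, descending=True, dim=-1)
        cumulative = torch.cumsum(sorted_probs, dim=-1)
        # Mask tokens once the mass accumulated *before* them already
        # exceeds p; the shift keeps the token that crosses the threshold.
        tail_mask = cumulative - sorted_probs > p
        sorted_probs = sorted_probs.masked_fill(tail_mask, 0.0)
        # Scatter back to the original vocabulary order and renormalize.
        sparse = torch.zeros_like(probs).scatter(-1, sorted_idx, sorted_probs)
        return sparse / sparse.sum(dim=-1, keepdim=True)

    # Example: a peaked five-token distribution keeps only its head.
    logits = torch.tensor([3.0, 2.5, 0.1, -1.0, -2.0])
    print(nucleusmax(logits, p=0.9))

In this example only the two most probable tokens survive and are renormalized to sum to one; every other token receives exactly zero probability, in contrast to softmax, whose outputs are strictly positive everywhere.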

• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 6 (June 2023), 635 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3604597

          Publisher

          Association for Computing Machinery

          New York, NY, United States

Publication History

• Received: 19 September 2022
• Revised: 28 January 2023
• Accepted: 29 April 2023
• Online AM: 13 May 2023
• Published: 16 June 2023
