research-article

Deep Neural Network with Embedding Fusion for Chinese Named Entity Recognition

Published: 23 March 2023

Abstract

Chinese Named Entity Recognition (NER) is an essential task in natural language processing, and its performance directly affects downstream tasks. The main challenges in Chinese NER are the strong dependence of named entities on context and the lack of explicit word boundary information. Integrating relevant knowledge into the corresponding entity representations is therefore the primary task for Chinese NER. Neither the lattice LSTM model nor the WC-LSTM model makes full use of contextual information; moreover, the lattice LSTM model has a complex structure and does not exploit word information well. To address these problems, we propose a Chinese NER method based on a deep neural network with multiple forms of embedding fusion. First, we use a convolutional neural network to incorporate the contextual information of the input sequence and apply a self-attention mechanism to integrate lexicon knowledge, compensating for the missing word boundaries. A word feature, a context feature, a bigram feature, and a bigram context feature are obtained for each character. Second, these four features are fused at the embedding layer, and the four resulting embeddings are combined through concatenation. Last, the fused feature representation is fed into the encoding and decoding layers. Experiments on three datasets show that our model effectively improves the performance of Chinese NER.
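The fusion step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature extractors (CNN context encoding, self-attention over lexicon matches) are replaced by random stand-in embeddings, and all dimensions are hypothetical. It shows only the concatenation of the four per-character feature embeddings into one fused representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions; the abstract does not fix these values.
d_word, d_ctx, d_bi, d_bictx = 50, 50, 30, 30
seq_len = 4  # e.g., a 4-character sentence

# Stand-ins for the four per-character features named in the abstract:
# word feature, context feature, bigram feature, bigram context feature.
word_feat   = rng.normal(size=(seq_len, d_word))
ctx_feat    = rng.normal(size=(seq_len, d_ctx))
bigram_feat = rng.normal(size=(seq_len, d_bi))
bictx_feat  = rng.normal(size=(seq_len, d_bictx))

# Fuse at the embedding layer by concatenating along the feature axis;
# the result is one vector per character, passed on to the encoder.
fused = np.concatenate([word_feat, ctx_feat, bigram_feat, bictx_feat], axis=-1)
print(fused.shape)  # one (d_word + d_ctx + d_bi + d_bictx)-dim vector per character
```

Concatenation keeps each feature's subspace intact, so the downstream encoder can learn how to weight word, context, and bigram information rather than having it averaged away before encoding.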



• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 3
  March 2023
  570 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3579816


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 23 March 2023
      • Online AM: 10 February 2023
      • Accepted: 24 October 2022
      • Revised: 14 August 2022
      • Received: 31 December 2021
Published in TALLIP Volume 22, Issue 3
