Abstract
Chatbots such as Xiaoice have gained huge popularity in recent years. Users frequently mention their favorite works, such as songs and movies, in conversations with chatbots. Detecting these entities can help design better chat strategies and improve user experience. Existing named entity recognition methods are mainly designed for formal texts, and their performance on informal chatbot conversation texts may not be optimal. In addition, these methods rely on massive manually annotated data for model training. In this article, we propose a neural approach to detect entities of works for Chinese chatbots. Our approach is based on a language model (LM), long short-term memory (LSTM), convolutional neural network (CNN), and conditional random field (CRF) framework, or LM-LSTM-CNN-CRF, which contains a language model to generate context-aware character embeddings, a Bi-LSTM network to learn contextual character representations from global contexts, a CNN to learn character representations from local contexts, and a CRF layer to jointly decode the character label sequence. In addition, we propose an automatic text annotation method via quote marks to reduce the effort of manual annotation. Furthermore, we propose an iterative data purification method to improve the quality of the automatically constructed labeled data. Extensive experiments on a real-world dataset validate that our approach achieves good performance on entity detection for Chinese chatbots.
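The quote-mark-based annotation idea can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes works are enclosed in Chinese title marks 《...》, that the marks themselves are dropped from the training text, and that the character-level tag set is BIO with the hypothetical labels B-WORK/I-WORK/O.

```python
def auto_annotate(text, open_mark="《", close_mark="》"):
    """Automatically label work-entity mentions enclosed in 《...》.

    Returns (chars, tags): the characters of `text` with the title marks
    removed, and a character-level BIO tag for each remaining character.
    The label names B-WORK/I-WORK are illustrative assumptions.
    """
    chars, tags = [], []
    inside = False       # currently between an open and close mark?
    first = False        # next in-span character starts the entity?
    for ch in text:
        if ch == open_mark:
            inside, first = True, True
        elif ch == close_mark:
            inside = False
        else:
            chars.append(ch)
            if inside:
                tags.append("B-WORK" if first else "I-WORK")
                first = False
            else:
                tags.append("O")
    return chars, tags

# Example: "I like the movie 《Titanic》"
chars, tags = auto_annotate("我喜欢《泰坦尼克号》这部电影")
```

Labels produced this way are noisy (quote marks are also used for emphasis and other titles in informal chat), which is why the abstract pairs this step with an iterative data purification method.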
Detecting Entities of Works for Chinese Chatbot