Abstract
This article presents an approach to response selection and message-response (MR) database expansion, using unstructured data from psychological consultation websites, for a retrieval-based question answering (QA) system in the constrained domain of emotional support and comforting. First, we manually construct an initial MR database from articles collected from psychological consultation websites. The Chinese Knowledge and Information Processing (CKIP) probabilistic context-free grammar is adopted to obtain the semantic dependency graphs (SDGs) of all messages and responses in the initial MR database. For each sentence in the MR database, all semantic dependencies, each composed of two words and their semantic relation, are extracted from the sentence's SDG to form a semantic dependency set. Finally, a matrix whose elements represent the correlation between the semantic dependencies of the messages and those of their corresponding responses is constructed as a semantic dependency pair model (SDPM) for response selection. Because the number of MR pairs on psychological consultation websites grows day by day, the MR database in the QA system should be expanded to meet users' needs. For MR database expansion, unstructured data from the message board are collected automatically. Supervised latent Dirichlet allocation is applied to the collected data for event detection, and the event-based delta Bayesian information criterion is then used to segment message and response articles. Each extracted message segment is fed to the constructed retrieval-based QA system to find the best-matched response segment, and the matching score is estimated to verify whether the new MR pair is suitable for inclusion in the expanded MR database.
Fivefold cross-validation was employed to evaluate the performance of the proposed SDPM-based retrieval QA system over the expanded MR database. Compared with the vector space model-based method, the Okapi BM25 model, and the deep learning-based sequence-to-sequence model with attention, the proposed approach achieved more favorable performance according to a statistical significance test. The retrieval accuracy based on MR expansion was also evaluated, and a satisfactory result was obtained, confirming the effectiveness of the expanded MR database. In addition, user satisfaction with the proposed system was evaluated using Cronbach's alpha, and the satisfaction score of the proposed SDPM was higher than those of the compared methods.
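As a rough illustration of the idea behind the SDPM, the following Python sketch builds a correlation table between message dependencies and response dependencies and uses it to rank candidate responses. The abstract does not state how the correlation is computed, so the row-normalized co-occurrence estimate, the averaging in `match_score`, the acceptance `threshold`, all function names, and the English toy dependencies are assumptions made for this example only, not the authors' method.

```python
from collections import defaultdict

def build_sdpm(mr_pairs):
    """Build a table model[m][r]: the share of m's co-occurrences that involve r.

    Each MR pair is (message_dependencies, response_dependencies), where a
    dependency is a (word, word, relation) tuple. Row-normalized co-occurrence
    counts are an assumed stand-in for the paper's correlation measure.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for msg_deps, resp_deps in mr_pairs:
        for m in msg_deps:
            for r in resp_deps:
                counts[m][r] += 1
    return {
        m: {r: c / sum(row.values()) for r, c in row.items()}
        for m, row in counts.items()
    }

def match_score(model, msg_deps, resp_deps):
    """Average table value over all message/response dependency pairs."""
    if not msg_deps or not resp_deps:
        return 0.0
    total = sum(model.get(m, {}).get(r, 0.0) for m in msg_deps for r in resp_deps)
    return total / (len(msg_deps) * len(resp_deps))

def select_response(model, msg_deps, candidates):
    """Return the id of the candidate response with the highest match score."""
    return max(candidates, key=lambda rid: match_score(model, msg_deps, candidates[rid]))

def accept_pair(model, msg_deps, resp_deps, threshold=0.3):
    """Hypothetical expansion check: keep a new MR pair only if it scores well."""
    return match_score(model, msg_deps, resp_deps) >= threshold

# Toy data (hypothetical dependencies, in English for readability).
msg1 = {("sleep", "cannot", "negation"), ("feel", "anxious", "state")}
resp1 = {("try", "relaxation", "goal"), ("keep", "schedule", "theme")}
msg2 = {("feel", "lonely", "state")}
resp2 = {("join", "group", "goal")}

model = build_sdpm([(msg1, resp1), (msg2, resp2)])
query = {("feel", "lonely", "state")}
candidates = {"r1": resp1, "r2": resp2}
best = select_response(model, query, candidates)
print(best)
```

The same score serves two purposes in this sketch: ranking candidates for response selection, and gating whether a newly extracted message and its best-matched response segment would be added to the database.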
Index Terms
Response Selection and Automatic Message-Response Expansion in Retrieval-Based QA Systems using Semantic Dependency Pair Model