Novel Character Identification Utilizing Semantic Relation with Animate Nouns in Korean

Published: 21 July 2018

Abstract

Extracting character names and nominals is indispensable for identifying the speakers of quoted speech and for extracting social networks from literature. However, detecting proper nouns in novels translated into or written in Korean is harder than in English, because Korean has no capitalization. In addition, it is almost impossible for any proper noun dictionary to include every character name that authors have created or will create. Fortunately, a previous study shows that utilizing postpositions for animate nouns is a simple and effective tool for character identification in Korean novels, requiring neither a proper noun dictionary nor a training corpus. In this article, we propose a character identification method that utilizes the semantic relation with known animate nouns. For 80 novels in Korean, the proposed method increases the micro- and macro-average recall by 13.68% and 11.86%, respectively, while decreasing the micro-average precision by 0.28% and increasing the macro-average precision by 0.07%, compared to the previous study. If we focus on characters that are responsible for more than 1% of the character name mentions in each novel, the micro- and macro-average F-measures of the proposed method are 96.98% and 97.32%, respectively.
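The abstract reports both micro- and macro-averaged scores, which can diverge when novels differ in size. The following is a minimal sketch of the standard definitions, not the authors' evaluation code: micro-averaging pools the counts of all novels before computing a score, while macro-averaging computes a score per novel and then takes the mean. The function names and the per-novel counts are hypothetical and chosen only for illustration.

```python
# Each novel is a tuple of hypothetical counts:
# (true_positives, false_positives, false_negatives).

def precision(tp, fp):
    """Fraction of predicted characters that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Fraction of true characters that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def micro_average(novels):
    """Pool counts over all novels, then compute precision and recall once."""
    tp = sum(n[0] for n in novels)
    fp = sum(n[1] for n in novels)
    fn = sum(n[2] for n in novels)
    return precision(tp, fp), recall(tp, fn)

def macro_average(novels):
    """Compute precision and recall per novel, then take the unweighted mean."""
    precisions = [precision(n[0], n[1]) for n in novels]
    recalls = [recall(n[0], n[2]) for n in novels]
    return sum(precisions) / len(novels), sum(recalls) / len(novels)

# Illustrative counts only (not data from the article):
# one large novel with good results, one small novel with poor results.
novels = [(90, 10, 10), (5, 5, 15)]

micro_p, micro_r = micro_average(novels)
macro_p, macro_r = macro_average(novels)
print(f"micro: P={micro_p:.4f} R={micro_r:.4f}")
print(f"macro: P={macro_p:.4f} R={macro_r:.4f}")
```

In this toy input the large novel dominates the micro-average, whereas the macro-average weights both novels equally, which is why a method can gain on one average and lose slightly on the other.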

