Confidence Indexing of Automated Detected Synsets: A Case Study on Contemporary Turkish Dictionary

Published: 01 November 2021
Abstract

In this study, a novel confidence indexing algorithm is proposed to minimize the human labor needed to verify the reliability of synsets extracted automatically from a non-machine-readable monolingual dictionary. The Contemporary Turkish Dictionary of the Turkish Language Association is used as the monolingual dictionary data. First, synonym relations are extracted from dictionary definitions by traditional text processing methods, and a graph is built in a Lemma-Sense network architecture. After each synonym relation is labeled with an appropriate confidence index, synonym pairs with the desired confidence indexes are analyzed to detect synsets with a spanning tree-based method. This approach labels each synset with one of three cumulative confidence levels (CL-1, CL-2, and CL-3). The synsets at each confidence level are then compared with KeNet, the only open-access Turkish WordNet. Most matches with KeNet synsets occur at the CL-1 and CL-2 levels, while the synsets detected at the CL-3 level reveal errors in the dictionary definitions. The approach therefore not only estimates the reliability of automatically detected synsets but can also point to errors in the dictionary definitions behind them.
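
The idea of cumulative confidence levels can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how the confidence index is computed, so the sketch assumes each synonym relation already carries an integer index from 1 (most reliable) to 3 (least reliable), and that level CL-k keeps every relation with an index of at most k. The sense identifiers, the `RELATIONS` data, and the `detect_synsets` function are hypothetical, and union-find stands in for the spanning tree-based step by building a spanning forest over the kept relations.

```python
from collections import defaultdict

# Hypothetical synonym relations between sense identifiers, each tagged with
# an assumed confidence index (1 = most reliable, 3 = least reliable).
RELATIONS = [
    ("ev#1", "konut#1", 1),
    ("konut#1", "mesken#1", 2),
    ("ak#1", "beyaz#1", 1),
    ("beyaz#1", "temiz#2", 3),  # deliberately doubtful link
]


def detect_synsets(relations, max_level):
    """Group senses linked by relations whose index is <= max_level.

    Union-find builds a spanning forest over the kept edges; each tree with
    two or more senses is reported as one candidate synset.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right, index in relations:
        if index <= max_level:  # cumulative: CL-k keeps indexes 1..k
            root_left, root_right = find(left), find(right)
            if root_left != root_right:
                parent[root_right] = root_left

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


for level in (1, 2, 3):
    print(f"CL-{level}:", detect_synsets(RELATIONS, level))
```

In this toy data the candidate synsets only grow as the level rises, and the doubtful link joins a group only at CL-3. That loosely mirrors the abstract's observation that the lowest-confidence level is where questionable dictionary definitions surface.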

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 1
      January 2022
      442 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3494068

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 November 2021
      • Accepted: 1 June 2021
      • Revised: 1 January 2021
      • Received: 1 March 2020

      Qualifiers

      • research-article
      • Refereed