research-article

Synonymy Expansion Using Link Prediction Methods: A Case Study of Assamese WordNet

Published: 02 November 2021

Abstract

WordNets for low-resource languages such as Assamese are often built with the expansion methodology, which can leave lexical entries and synonymy relations missing. The Assamese WordNet was built by expansion from the Hindi WordNet, and it too has missing synonymy relations. Because a WordNet can be viewed as a network of unique words connected by synonymy relations, link prediction from complex network analysis is an effective way to predict the relations missing from that network. In the current work, link prediction methods were therefore used to predict missing synonyms in the Assamese WordNet, and they proved effective. It is also observed that, for discovering missing relations in the Assamese WordNet, simple local proximity-based methods might be more effective than global and complex supervised models that use network embedding. Further, some of the retrieved words, though not synonyms per se, are semantically related to the target word and may be categorized as semantic cohorts.
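
To make the idea of local proximity-based link prediction concrete, the sketch below scores unlinked word pairs in a toy synonymy graph with three standard local indices (common neighbours, Jaccard, and Adamic-Adar). It is only an illustration under stated assumptions: the abstract does not say which indices the authors used, the English words are hypothetical placeholders rather than Assamese WordNet data, and the function names are invented for this example.

```python
import math
from itertools import combinations

# Hypothetical toy synonymy graph (placeholder English words,
# NOT data from the Assamese WordNet).
edges = [
    ("big", "large"),
    ("big", "huge"),
    ("large", "huge"),
    ("large", "great"),
    ("huge", "enormous"),
    ("great", "grand"),
]


def build_adjacency(edge_list):
    """Build an undirected adjacency map: word -> set of linked words."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, u, v):
    """Number of words linked to both u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbours divided by the size of the combined neighbourhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def adamic_adar(adj, u, v):
    """Sum of 1/log(degree) over shared neighbours.

    A shared neighbour of two distinct nodes has degree >= 2,
    so log(degree) is always positive here.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])


def rank_missing_links(adj, score):
    """Score every currently unlinked pair and sort by descending score."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    scored = [(score(adj, u, v), u, v) for u, v in candidates]
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


adj = build_adjacency(edges)
ranking = rank_missing_links(adj, adamic_adar)
for s, u, v in ranking[:3]:
    print(f"{u} - {v}: {s:.3f}")
```

The highest-scoring unlinked pairs are the candidate missing synonymy relations. In practice such candidates would still need review by a lexicographer, since a high score may indicate a semantically related word rather than a true synonym.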



    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 21, Issue 1
      January 2022
      442 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3494068

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 2 November 2021
      • Accepted: 1 May 2021
      • Revised: 1 December 2020
      • Received: 1 July 2020
      Published in tallip Volume 21, Issue 1


      Qualifiers

      • research-article
      • Refereed
