Abstract
Numerous attempts at hypernymy detection (e.g., dog "is-a" animal) have been made for resource-rich languages like English, whereas efforts for low-resource languages are scarce, primarily due to the lack of gold-standard datasets and suitable distributional models. Therefore, we introduce four gold-standard datasets for hypernymy detection for each of two languages, namely, Hindi and Bengali, and two gold-standard datasets for Amharic. Another major contribution of this work is to prepare distributional thesaurus (DT) embeddings for all three languages using three different network embedding methods (DeepWalk, role2vec, and M-NMF), applied to these languages for the first time, and to show their utility for hypernymy detection. Posing this problem as a binary classification task, we experiment with supervised classifiers such as Support Vector Machine and Random Forest, and we show that these classifiers, fed with DT embeddings, obtain promising results when evaluated against the proposed gold-standard datasets, specifically in an experimental setup that counteracts lexical memorization. We further combine DT embeddings with pre-trained fastText embeddings using two different hybrid approaches, both of which perform strongly. Additionally, we validate our methodology on gold-standard English datasets as well, where we reach performance comparable to state-of-the-art models for hypernymy detection.
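The pair-classification setup described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual pipeline: the toy vocabulary, the random "DT embeddings", and the bare-bones perceptron are stand-ins for the real DT network embeddings (DeepWalk, role2vec, M-NMF) and the SVM/Random Forest classifiers. The split deliberately keeps train and test vocabularies disjoint, mirroring the setup that counteracts lexical memorization.

```python
# Hypothetical sketch: word pairs are represented by concatenating the two
# words' embeddings, and a supervised classifier decides "hypernym or not".
import random

random.seed(0)

# Toy stand-ins for DT embeddings (in the paper these come from network
# embedding methods run on the distributional thesaurus graph).
DIM = 4
vocab = ["dog", "cat", "animal", "car", "vehicle", "rose", "flower", "oak"]
emb = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in vocab}

def pair_features(hypo, hyper):
    """Concatenate the two embeddings to form the pair representation."""
    return emb[hypo] + emb[hyper]

# Labelled pairs: 1 = hypernymy holds, 0 = it does not (illustrative only).
pairs = [
    ("dog", "animal", 1), ("cat", "animal", 1), ("rose", "flower", 1),
    ("car", "vehicle", 1), ("dog", "car", 0), ("cat", "vehicle", 0),
    ("rose", "oak", 0), ("oak", "flower", 0),
]

# Lexical-memorization-aware split: no word occurs in both train and test,
# so the classifier cannot simply memorise "animal is always a hypernym".
train_words = {"dog", "cat", "animal", "car", "vehicle"}
train = [p for p in pairs if p[0] in train_words and p[1] in train_words]
test = [p for p in pairs
        if p[0] not in train_words and p[1] not in train_words]

# A bare-bones perceptron stands in for the SVM / Random Forest classifiers.
w = [0.0] * (2 * DIM)
b = 0.0
for _ in range(50):
    for hypo, hyper, y in train:
        x = pair_features(hypo, hyper)
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        if pred != y:  # standard perceptron update on a mistake
            w = [wi + (y - pred) * xi for wi, xi in zip(w, x)]
            b += (y - pred)

def predict(hypo, hyper):
    x = pair_features(hypo, hyper)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

print([(h1, h2, predict(h1, h2)) for h1, h2, _ in test])
```

The hybrid approaches mentioned in the abstract would additionally concatenate (or otherwise combine) fastText vectors with the DT embeddings before forming the pair features; that step is omitted here for brevity.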
Supplemental Material
Available for download: supplemental movie, appendix, image, and software files for "Hypernymy Detection for Low-resource Languages: A Study for Hindi, Bengali, and Amharic".