ABSTRACT
Distant supervision (DS) has been widely used to automatically construct (noisy) labeled data for relation extraction (RE). To address the noisy-label problem, most models adopt the multi-instance learning paradigm, representing each entity pair as a bag of sentences. However, this strategy rests on several assumptions (e.g., that all sentences in a bag express the same relation), which may not hold in real-world applications. Moreover, it performs poorly on long-tail entity pairs that have few supporting sentences in the dataset. In this work, we propose a new paradigm named retrieval-augmented distantly supervised relation extraction (ReadsRE), which incorporates large-scale open-domain knowledge (e.g., Wikipedia) into the retrieval step. ReadsRE seamlessly integrates a neural retriever and a relation predictor in an end-to-end framework. We demonstrate the effectiveness of ReadsRE on the well-known NYT10 dataset. The experimental results verify that ReadsRE can effectively retrieve meaningful sentences (i.e., denoise) and relieve the long-tail entity-pair problem in the original dataset by incorporating an external open-domain corpus. Comparisons show that ReadsRE outperforms other baselines on this task.
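The retrieve-then-predict pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration with random toy embeddings, not the authors' actual architecture: the function names, the softmax-weighted pooling of retrieved sentences, and the linear relation classifier are all assumptions made for the sketch.

```python
import numpy as np

def retrieve(query_vec, corpus_vecs, k=2):
    """Dense retrieval: score corpus sentences against the entity-pair
    query by inner product and return the indices/scores of the top-k."""
    scores = corpus_vecs @ query_vec
    topk = np.argsort(-scores)[:k]          # indices sorted by descending score
    return topk, scores[topk]

def predict_relation(retrieved_vecs, retrieval_scores, relation_weights):
    """Pool the retrieved sentences into one bag representation, weighting
    each sentence by its softmax-normalized retrieval score, then apply a
    linear relation classifier."""
    w = np.exp(retrieval_scores - retrieval_scores.max())
    w = w / w.sum()
    bag = (w[:, None] * retrieved_vecs).sum(axis=0)
    logits = relation_weights @ bag
    return int(np.argmax(logits))

# Toy setup: 4 "sentence" embeddings of dim 8, and 3 relation classes.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(4, 8))        # open-domain sentence embeddings
query = rng.normal(size=8)              # embedding of the entity pair
rel_W = rng.normal(size=(3, 8))         # hypothetical linear relation classifier

idx, scores = retrieve(query, corpus, k=2)
rel = predict_relation(corpus[idx], scores, rel_W)
```

In the full end-to-end model the retriever and predictor would share gradients, so the retrieval scores themselves are trained to surface sentences that help the relation classifier; the sketch above only shows the forward pass.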
ReadsRE: Retrieval-Augmented Distantly Supervised Relation Extraction