Enhancing RDF Verbalization with Descriptive and Relational Knowledge

Abstract
RDF verbalization, which aims to generate natural language descriptions of a knowledge base, has received increasing interest. Sequence-to-sequence models based on the Transformer achieve strong performance when equipped with pre-trained language models such as BART and T5. However, despite the general gains introduced by pre-training, performance on this task remains limited by the small scale of the training datasets. To address this problem, we propose two orthogonal strategies to enhance the representation learning of RDF triples, introducing two types of knowledge: descriptive knowledge and relational knowledge. Descriptive knowledge captures the semantic information of an entity's own definition, while relational knowledge captures the semantic information learned from the entity's structural context. We further combine the two types of knowledge to enhance representation learning. Experimental results on the WebNLG and SemEval-2010 datasets show that both types of knowledge improve model performance, and their combination yields further gains in most cases, establishing new state-of-the-art results.
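To make the input side of such a pipeline concrete, the sketch below linearizes RDF triples into a flat text sequence of the kind fed to a pre-trained seq2seq model (e.g., BART or T5), and shows how descriptive knowledge (entity definitions) could be appended to the input. This is a minimal illustration, not the paper's actual implementation: the special tags (`<S>`, `<P>`, `<O>`, `<D>`) and the `linearize` function are assumed names, and relational knowledge (learned from graph neighbors) is not modeled here.

```python
# Sketch: linearizing (subject, predicate, object) RDF triples into a flat
# input string for a pre-trained seq2seq model, optionally augmented with
# descriptive knowledge. Tag names and function names are illustrative
# assumptions, not the paper's implementation.

def linearize(triples, descriptions=None):
    """Turn RDF triples into one input string; append entity definitions."""
    parts = [f"<S> {s} <P> {p} <O> {o}" for s, p, o in triples]
    text = " ".join(parts)
    if descriptions:
        # Descriptive knowledge: a short definition per entity.
        defs = " ".join(f"<D> {e} : {d}" for e, d in descriptions.items())
        text = f"{text} {defs}"
    return text

triples = [("Alan_Bean", "occupation", "Test_pilot"),
           ("Alan_Bean", "birthPlace", "Wheeler,_Texas")]
descriptions = {"Alan_Bean": "American astronaut"}
print(linearize(triples, descriptions))
```

In a full system, the resulting string would be tokenized and passed to the encoder, while relational knowledge would instead be injected through the triples' graph structure (e.g., via a graph encoder or knowledge-graph embeddings).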