
Turkish Data-to-Text Generation Using Sequence-to-Sequence Neural Networks

Published: 27 December 2022

Abstract

End-to-end data-driven approaches have led to rapid progress in language generation and dialogue systems. Although they require large amounts of well-organized data, these approaches jointly learn multiple components of the traditional generation pipeline without costly human intervention. They also enable the use of loosely aligned parallel datasets in system development by relaxing the degree of semantic correspondence required between training data representations and text spans. However, their potential for Turkish language generation has not yet been fully exploited. In this work, we apply sequence-to-sequence (Seq2Seq) neural models to Turkish data-to-text generation, where input data given in the form of a meaning representation is verbalized. We explore encoder-decoder architectures with an attention mechanism in unidirectional, bidirectional, and stacked recurrent neural network (RNN) configurations. Our models generate one-sentence biographies and dining-venue descriptions using a crowdsourced dataset in which all field-value pairs appearing in meaning representations are fully captured in reference sentences. To extend this work, we also evaluate our models on a more challenging dataset, where the content of a meaning representation is too large to fit into a single sentence and hence content selection and surface realization must be learned jointly. This dataset was built by coupling introductory sentences of person-related Turkish Wikipedia articles with their infobox tables. Our experiments on both datasets demonstrate that Seq2Seq models can generate coherent and fluent biographies and venue descriptions from field-value pairs. We argue that the wealth of knowledge residing in our datasets and the insights obtained from this study can support the development of new end-to-end generation approaches for Turkish and other morphologically rich languages.
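As context for the setup described above, a common preprocessing step in data-to-text Seq2Seq systems is to linearize a meaning representation of field-value pairs into a flat token sequence for the encoder. The sketch below is illustrative only and is not the authors' code; the field-marker convention (`<field>` tokens) and the example record are assumptions for the sake of the example.

```python
def linearize(meaning_representation):
    """Flatten field-value pairs into an encoder input token sequence.

    Each field name becomes a marker token (e.g. "<name>"), followed by
    the whitespace-split tokens of its value. This is one common
    convention in data-to-text work, shown here as an assumed example.
    """
    tokens = []
    for field, value in meaning_representation:
        tokens.append(f"<{field}>")   # field marker token
        tokens.extend(value.split())  # value tokens
    return tokens


# Hypothetical one-sentence biography input (illustrative data only):
mr = [("name", "Ahmet Yilmaz"),
      ("birth_date", "12 March 1970"),
      ("occupation", "mühendis")]
encoder_input = linearize(mr)
# e.g. ["<name>", "Ahmet", "Yilmaz", "<birth_date>", "12", ...]
```

A decoder with attention can then learn to attend to the marker and value tokens when verbalizing each field, which is one way the joint learning of content selection and surface realization mentioned above becomes possible.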

  102. [102] Birant Çağdaş Can, Koşaner Özgün, and Aktaş Özlem. 2016. A survey to text summarization methods for Turkish. International Journal of Computer Applications 144, 6 (2016), 2328.Google ScholarGoogle ScholarCross RefCross Ref
  103. [103] Çıtamak Begüm, Kuyu Menekşe, Erdem Aykut, and Erdem Erkut. 2019. MSVD-Turkish: A large-scale dataset for video captioning in Turkish. In Proceedings of the 27th Signal Processing and Communications Applications Conference (SIU). IEEE, Sivas, Turkey.Google ScholarGoogle Scholar


Published in
ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 2 (February 2023), 624 pages.
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3572719

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher
Association for Computing Machinery, New York, NY, United States

Publication History
• Received: 30 March 2021
• Revised: 4 March 2022
• Accepted: 29 May 2022
• Online AM: 8 July 2022
• Published: 27 December 2022

        Qualifiers

        • research-article
        • Refereed