Abstract
Persian poetry has long expressed its philosophy, wisdom, speech, and rationale through couplets, which makes it an enigmatic language of its own for native and non-native speakers alike. Nevertheless, the noticeable gap between Persian prose and poetry has left the two forms without a medium connecting them. Having curated a parallel corpus of prose and equivalent poems, we introduce a novel Neural Machine Translation approach for translating prose into ancient Persian poetry using transformer-based language models in an exceptionally low-resource setting. The task presents two primary challenges: the translation must convey the same context as the input prose, and it must also satisfy poetic standards. We therefore designed a three-stage method. First, we trained a transformer model from scratch to obtain initial translations of the input prose. Next, we designed a set of heuristics that leverage these contextually rich initial translations to produce a poetic masked template. In the last stage, we pretrained different variations of BERT on a poetry corpus and used masked language modelling to obtain the final translations. We evaluated the results with both automatic and human assessment. The final results demonstrate the eligibility and creativity of our heuristically aided approach, as judged by literature professionals and non-professionals, in generating novel Persian poems.
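To make the second and third stages concrete, here is a minimal sketch of turning an initial translation into a masked template and then filling the masks. Everything in it is an assumption for illustration: the abstract does not specify the heuristics, so the keep-set rule, the function names `build_masked_template` and `fill_template`, the English example sentence, and the `toy_predict` stand-in (used in place of a BERT model pretrained on poetry) are all hypothetical and are not the authors' implementation.

```python
from typing import Callable, List, Sequence, Set

MASK = "[MASK]"


def build_masked_template(tokens: Sequence[str], keep: Set[str]) -> List[str]:
    """Hypothetical heuristic: keep context-bearing tokens, mask the rest."""
    return [tok if tok in keep else MASK for tok in tokens]


def fill_template(
    template: Sequence[str],
    predict: Callable[[List[str], int], str],
) -> List[str]:
    """Fill masks left to right so each prediction sees the earlier fills."""
    filled = list(template)
    for position, tok in enumerate(filled):
        if tok == MASK:
            filled[position] = predict(filled, position)
    return filled


def toy_predict(context: List[str], position: int) -> str:
    """Stand-in for a masked language model; returns a placeholder token."""
    return f"w{position}"


# Stage 1 output is assumed to be given (here, an invented English example).
initial_translation = "the nightingale sings to the rose at dawn".split()

# Stage 2: keep an assumed set of context-bearing words, mask everything else.
template = build_masked_template(
    initial_translation, keep={"nightingale", "rose", "dawn"}
)

# Stage 3: fill each masked slot with the (toy) predictor.
poem = fill_template(template, toy_predict)
```

In a real system the `predict` callable would query a masked language model for the most suitable token at the given position; the left-to-right filling order is one simple design choice among several.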
Prose2Poem: The Blessing of Transformers in Translating Prose to Persian Poetry