Abstract
Transliteration removes script barriers. Punjabi is written in four different scripts, namely Gurmukhi, Shahmukhi, Devanagari, and Latin, and the Latin script is understood by nearly all sections of the Punjabi community. The objective of our work is to transliterate the Punjabi Gurmukhi script into Latin script. There has been considerable progress in Punjabi to Latin transliteration, but the accuracy of present-day systems is less than 50% (Google Translator has approximately 45% accuracy). No rich parallel corpus is available for Punjabi, so the corpus-based machine learning techniques that are currently popular cannot be used. Existing transliteration systems follow a grapheme-based approach, which is unable to handle many phenomena such as tones, inherent schwa, glottal stops, nasalization, and gemination. In this article, grapheme-based transliteration is augmented with phonetic rectification: the Punjabi text is rectified phonetically before character-to-character mapping is applied. Handling the inherent short vowel schwa was the major challenge in phonetic rectification. Instead of following a fixed syllabic pattern, we devised a generic finite state transducer to insert schwa. The accuracy of our transliteration system is approximately 96.82%.
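The idea of inserting schwa with a transducer before character-to-character mapping can be illustrated with a deliberately small sketch. It is not the transducer described in the article: the two states, the function name `transliterate`, and the tiny mapping tables are all assumptions made for illustration, and the only rule it encodes is "emit a schwa between two adjacent consonants, and none after a word-final consonant".

```python
# Toy illustration only, not the article's transducer.
# Mapping tables are small hypothetical subsets of Gurmukhi.
CONSONANTS = {
    "\u0a15": "k",  # GURMUKHI LETTER KA
    "\u0a2e": "m",  # GURMUKHI LETTER MA
    "\u0a32": "l",  # GURMUKHI LETTER LA
    "\u0a28": "n",  # GURMUKHI LETTER NA
    "\u0a30": "r",  # GURMUKHI LETTER RA
}
VOWEL_SIGNS = {
    "\u0a3e": "ā",  # GURMUKHI VOWEL SIGN AA
    "\u0a3f": "i",  # GURMUKHI VOWEL SIGN I
    "\u0a40": "ī",  # GURMUKHI VOWEL SIGN II
}

# Transducer states: START (nothing pending) and AFTER_CONSONANT
# (the last symbol emitted was a consonant with no vowel yet).
START, AFTER_CONSONANT = 0, 1


def transliterate(word):
    """Map a Gurmukhi word to Latin, inserting schwa between consonants."""
    state = START
    out = []
    for ch in word:
        if ch in CONSONANTS:
            if state == AFTER_CONSONANT:
                out.append("a")  # inherent schwa of the previous consonant
            out.append(CONSONANTS[ch])
            state = AFTER_CONSONANT
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])  # explicit vowel replaces the schwa
            state = START
        else:
            out.append(ch)  # pass unknown characters through unchanged
            state = START
    # A word-final consonant emits no schwa in this toy rule.
    return "".join(out)


print(transliterate("\u0a15\u0a2e\u0a32"))  # KA MA LA
print(transliterate("\u0a28\u0a3e\u0a2e"))  # NA, sign AA, MA
```

A rule this simple over-inserts schwa in words where a medial schwa is dropped in speech, and it ignores tones, nasalization, and gemination, which is the kind of case the article's phonetic rectification is meant to address.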
Index Terms
Punjabi to ISO 15919 and Roman Transliteration with Phonetic Rectification