research-article

TAM GAN: Tamil Text to Naturalistic Image Synthesis Using Conventional Deep Adversarial Networks

Published: 08 May 2023

Abstract

Text-to-image synthesis has recently emerged as a promising area of improvement for computer vision applications. Image synthesis models build on prominent neural network architectures such as Generative Adversarial Networks (GANs). Current text-to-image generation approaches can nominally reflect the meaning of the text in the generated images, but they still struggle to render fine details and expressive object features. Intelligent systems have been trained for text-to-image synthesis in various languages; however, their application to regional languages remains largely unexplored. Autoencoders can also synthesize images, but their outputs tend to be blurry, obscuring the clear structure and essential features of the picture. Given textual descriptions, a GAN model can produce realistic, high-quality images usable in applications such as fashion design, photo editing, computer-aided design, and educational platforms. The proposed method uses two-stage processing: a language model built from a BERT variant called TAM-BERT and the existing MuRIL BERT, followed by image synthesis using a GAN. The work was conducted on the Oxford-102 dataset, and the model's efficiency was evaluated using the F1-score measure.
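The abstract's second stage (text-conditioned image synthesis with a GAN) can be illustrated with a minimal sketch. The code below is a hypothetical DCGAN-style conditional generator, not the authors' implementation: it assumes a 768-dimensional sentence embedding (the size produced by BERT-base models such as MuRIL or the paper's TAM-BERT) and uses a random tensor as a stand-in for that embedding; the layer sizes and the `ConditionalGenerator` name are illustrative choices.

```python
import torch
import torch.nn as nn

TEXT_EMB_DIM = 768   # BERT-base / MuRIL sentence-embedding size
NOISE_DIM = 100      # latent noise vector, a common DCGAN choice
COND_DIM = 128       # compressed conditioning vector (assumed size)

class ConditionalGenerator(nn.Module):
    """Sketch of a generator conditioned on a text embedding."""

    def __init__(self):
        super().__init__()
        # Project the text embedding down to a small conditioning vector.
        self.condition = nn.Sequential(
            nn.Linear(TEXT_EMB_DIM, COND_DIM),
            nn.LeakyReLU(0.2),
        )
        # Upsample the concatenated noise + condition to a 64x64 RGB image.
        self.net = nn.Sequential(
            nn.ConvTranspose2d(NOISE_DIM + COND_DIM, 256, 4, 1, 0),  # 4x4
            nn.BatchNorm2d(256), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),  # 8x8
            nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1),   # 16x16
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1),    # 32x32
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1),     # 64x64
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, noise, text_emb):
        cond = self.condition(text_emb)
        # Concatenate along channels, then reshape to a 1x1 spatial map.
        z = torch.cat([noise, cond], dim=1).unsqueeze(-1).unsqueeze(-1)
        return self.net(z)

gen = ConditionalGenerator()
noise = torch.randn(2, NOISE_DIM)
text_emb = torch.randn(2, TEXT_EMB_DIM)  # stand-in for a real MuRIL embedding
images = gen(noise, text_emb)
print(images.shape)  # torch.Size([2, 3, 64, 64])
```

In a full pipeline, `text_emb` would come from encoding the Tamil caption with the stage-one language model, and a matching conditional discriminator would score image-text pairs during adversarial training.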



• Published in

  ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 5
  May 2023, 653 pages
  ISSN: 2375-4699
  EISSN: 2375-4702
  DOI: 10.1145/3596451


      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 8 May 2023
      • Online AM: 16 February 2023
      • Accepted: 4 February 2023
      • Revised: 29 December 2022
      • Received: 13 January 2022

