Melody Generation from Lyrics with Local Interpretability

Published: 27 February 2023
Abstract

Melody generation aims to learn the distribution of real melodies in order to generate new melodies conditioned on lyrics, a long-standing topic at the intersection of artificial intelligence and music. However, a challenging issue still limits the quality and reliability of lyrics-conditioned melody generation: how to make the relationship between the input lyrics and the generated melody interpretable to humans. To address this issue, in this article we propose a model for melody generation from lyrics with local interpretability, which makes two significant contributions: (i) the mutual information between the input lyrics and the generated melody is exploited to guide the training of the network, which prevents loss of content consistency during training; and (ii) a Transformer is employed to efficiently extract semantic features from lyrics sequences, providing more interpretable correlations between different syllables in the lyrics. Experiments on a large-scale dataset of paired lyrics and melodies demonstrate that the proposed approach generates higher-quality melodies from lyrics than existing methods.
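Contribution (i) uses mutual information as a training signal for lyric-melody consistency. As a toy illustration only (the table, weight `lam`, and loss form below are hypothetical; the paper's actual estimator would operate on learned network features, not count tables), the sketch computes the mutual information between discrete syllable classes and pitch bins from a joint count table, and shows how such a term could be subtracted from a generator loss to reward consistency:

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(X;Y) in nats from a joint count/probability table."""
    joint = joint / joint.sum()                     # normalize to a distribution
    px = joint.sum(axis=1, keepdims=True)           # marginal over rows
    py = joint.sum(axis=0, keepdims=True)           # marginal over columns
    mask = joint > 0                                # avoid log(0) terms
    return float((joint[mask] * np.log(joint[mask] / (px @ py)[mask])).sum())

# Hypothetical joint counts between 3 syllable classes and 3 pitch bins;
# mass on the diagonal means syllables and pitches co-occur consistently.
counts = np.array([[30., 5., 5.],
                   [5., 30., 5.],
                   [5., 5., 30.]])
mi = mutual_information(counts)

# An MI term would be added (weighted by lam) so that maximizing
# lyric-melody mutual information lowers the generator's loss.
lam = 0.1
base_loss = 1.0                                    # placeholder adversarial loss
generator_loss = base_loss - lam * mi
```

With an independent (uniform) joint table the MI term vanishes, so the penalty only rewards genuine statistical dependence between lyrics and melody.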



• Published in

  ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 3 (May 2023), 514 pages.
  ISSN: 1551-6857; EISSN: 1551-6865
  DOI: 10.1145/3582886
  Editor: Abdulmotaleb El Saddik


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 15 June 2022
• Accepted: 6 November 2022
• Online AM: 29 November 2022
• Published: 27 February 2023
