Abstract
Melody generation from lyrics aims to learn the distribution of real melodies so that new melodies can be generated conditioned on lyrics, and it has attracted considerable interest in the area of artificial intelligence and music. However, one challenging issue still limits the quality and reliability of lyrics-conditioned melody generation: how to make the relationship between the input lyrics and the generated melodies interpretable, so that humans can understand it. To address this issue, we propose a model for melody generation from lyrics with local interpretability, which makes two main contributions: (i) Mutual information between the input lyrics and the generated melody is exploited to guide the training of the network, which avoids the loss of content consistency during training. (ii) A Transformer is used to efficiently extract semantic features from lyrics sequences, which provides more interpretable correlations between different syllables in the lyrics. Experiments on a large-scale dataset of paired lyrics and melodies demonstrate that the proposed approach generates higher-quality melodies from lyrics than existing methods.
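To make the mutual-information idea in contribution (i) concrete, the following is a minimal sketch of a plug-in estimate of I(X; Y) from paired discrete samples. It is an illustration only: the abstract does not state how mutual information is estimated or used during training, so the function name `empirical_mutual_information`, the toy labels, and the choice of a counting-based estimator are all assumptions and not the proposed model's method.

```python
import math
from collections import Counter


def empirical_mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Assumed toy data: a stress class per syllable paired with a duration class per note.
stress = ["strong", "weak", "strong", "weak"]
dependent = ["long", "short", "long", "short"]    # duration fully determined by stress
independent = ["long", "long", "short", "short"]  # duration unrelated to stress

mi_dependent = empirical_mutual_information(stress, dependent)
mi_independent = empirical_mutual_information(stress, independent)
print(mi_dependent, mi_independent)
```

In this toy setting the estimate is higher when the melody attribute follows the lyric attribute and zero when the two are unrelated, which is the sense in which a mutual-information term can reward content consistency between lyrics and melody.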
Melody Generation from Lyrics with Local Interpretability