μ-Forcing: Training Variational Recurrent Autoencoders for Text Generation

Published: 13 July 2019

Abstract

It has been previously observed that training Variational Recurrent Autoencoders (VRAEs) for text generation suffers from a serious uninformative-latent-variable problem: the model collapses into a plain language model that ignores the latent variables entirely and can only generate repetitive, dull samples. In this article, we examine the cause of this issue and propose an effective regularizer-based approach to address it. The proposed method injects extra constraints on the posteriors of the latent variables directly into the learning process of the VRAE, which flexibly and stably controls the tradeoff between the Kullback-Leibler (KL) term and the reconstruction term, so that the model learns dense and meaningful latent representations. Experimental results show that the proposed method outperforms several strong baselines, learns interpretable latent variables, and generates diverse, meaningful sentences. Furthermore, the proposed method performs well without auxiliary strategies such as KL annealing.
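The abstract describes constraining the posteriors so the KL term cannot collapse to zero. As a minimal illustrative sketch (not the paper's exact μ-forcing formulation, which may differ), one common way to impose such a constraint is to place a floor on the KL contribution to the negative ELBO, in the spirit of "free bits"; the function names and the `kl_floor` value below are illustrative assumptions:

```python
import numpy as np

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims.
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0, axis=-1)

def floored_neg_elbo(recon_nll, mu, logvar, kl_floor=3.0):
    # Negative ELBO whose KL term is clipped from below at `kl_floor`:
    # once the KL drops to the floor, its gradient vanishes, so the
    # optimizer has no incentive to push the posterior all the way
    # onto the prior (the collapse mode described above).
    kl = gaussian_kl(mu, logvar)
    return recon_nll + np.maximum(kl, kl_floor)

# When the posterior exactly matches the prior, the KL is 0 but the
# objective still pays the floor, so collapsing buys nothing:
loss = floored_neg_elbo(1.0, np.zeros(4), np.zeros(4))  # 1.0 + 3.0 = 4.0
```

A hinge like this is only one member of the family of regularizers the article discusses; the key shared idea is that the KL/reconstruction tradeoff is controlled explicitly rather than left to KL annealing schedules.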

