Abstract
Topic modelling is an important unsupervised machine learning approach that automatically extracts the main “topics” from large collections of documents. It also estimates the topic proportions of each individual document, which can help in organizing the collections. Many topic modelling algorithms have been proposed to date, including several that leverage advanced techniques such as variational inference and deep autoencoders. However, topic modelling has so far made limited use of reinforcement learning, a framework that has achieved considerable success in many other unsupervised learning tasks. For this reason, in this article we propose training a neural topic model with a reinforcement learning objective and minimizing that objective with the recently proposed REBAR gradient estimator. Experiments on two probing datasets show that the proposed model improves over all the compared models in both model perplexity and topic coherence, and produces topics that appear qualitatively informative and consistent.
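To make the perplexity criterion concrete, the following is a minimal sketch of how perplexity is commonly computed for a topic model from per-document topic proportions and per-topic word distributions. It is not the article's evaluation code: the function name `perplexity`, the toy numbers, and the choice of corpus-level normalization by total token count are assumptions made for illustration (some papers average per-document perplexities instead).

```python
import math


def perplexity(doc_word_counts, doc_topic, topic_word):
    """Corpus-level perplexity of a mixture-of-topics model (illustrative).

    doc_word_counts: list of {word_id: count} dicts, one per document.
    doc_topic: list of topic-proportion lists, one per document.
    topic_word: list of word-probability lists, one per topic.
    """
    total_log_lik = 0.0
    total_tokens = 0
    for counts, theta in zip(doc_word_counts, doc_topic):
        for word_id, count in counts.items():
            # p(word | doc) = sum over topics of p(topic | doc) * p(word | topic)
            p_word = sum(t * beta[word_id] for t, beta in zip(theta, topic_word))
            total_log_lik += count * math.log(p_word)
            total_tokens += count
    # Exponentiated negative average log-likelihood per token.
    return math.exp(-total_log_lik / total_tokens)


# Hypothetical toy corpus: 2 topics, vocabulary of 3 words, 2 documents.
toy_topic_word = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
toy_doc_topic = [[0.9, 0.1], [0.2, 0.8]]
toy_docs = [{0: 3, 1: 1}, {2: 4}]
print(perplexity(toy_docs, toy_doc_topic, toy_topic_word))
```

Lower values indicate that the model assigns higher probability to the observed words. As a sanity check, a model whose topics are all uniform over a vocabulary of V words has perplexity V under this definition.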
Neural Topic Model Training with the REBAR Gradient Estimator