Abstract
In probabilistic programming languages (PPLs), a critical step in optimization-based inference methods is constructing, for a given model program, a trainable guide program. The soundness and effectiveness of inference rely on constructing good guides, but the expressive power of a universal PPL makes this challenging. This paper introduces an approach to automatically generating guides for deep amortized inference in a universal PPL. Guides are generated by a type-directed translation based on a novel behavioral type system. Guide generation extracts and exploits independence structures using a syntactic approach to conditional independence, with a semantic account left to future work. Despite the control-flow expressiveness allowed by the universal PPL, generated guides are guaranteed to satisfy a critical soundness condition and, moreover, consistently improve training and inference over state-of-the-art baselines on a suite of benchmarks.
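To make the model/guide relationship concrete, the following is a minimal sketch in plain Python, not the paper's type system or translation. It assumes the soundness condition is the usual one for guides (the guide and the model must draw the same latent sites with the same supports on every run), which the abstract does not spell out. The names `run`, `traces_agree`, `good_guide`, and `bad_guide` are hypothetical, the distributions carry no trainable parameters, and the check is a randomized test rather than a static guarantee.

```python
import random

# Support of each toy distribution family (names are assumptions of this sketch).
SUPPORT = {"bernoulli": "bool", "normal": "real", "uniform01": "unit_interval"}


def draw(family, rng):
    """Draw one value from a fixed, parameter-free toy distribution."""
    if family == "bernoulli":
        return rng.random() < 0.5
    if family == "normal":
        return rng.gauss(0.0, 1.0)
    if family == "uniform01":
        return rng.random()
    raise ValueError(family)


def run(program, rng, replay=None):
    """Run `program` and return its trace as {site name: (support, value)}.

    When `replay` is given, values for sites it contains are reused, so the
    program follows the same control flow as the replayed trace.
    """
    trace = {}

    def sample(name, family):
        if replay is not None and name in replay:
            value = replay[name][1]
        else:
            value = draw(family, rng)
        trace[name] = (SUPPORT[family], value)
        return value

    program(sample)
    return trace


def model(sample):
    # Stochastic control flow: which sites exist depends on a sampled value.
    if sample("is_far", "bernoulli"):
        sample("distance", "normal")
    else:
        sample("offset", "uniform01")


def good_guide(sample):
    # Mirrors the model's branching, so it visits the same sites.
    if sample("is_far", "bernoulli"):
        sample("distance", "normal")
    else:
        sample("offset", "uniform01")


def bad_guide(sample):
    # Ignores the branch: always proposes "distance" and never "offset".
    sample("is_far", "bernoulli")
    sample("distance", "normal")


def signature(trace):
    """Keep only site names and supports, dropping the sampled values."""
    return {name: support for name, (support, _) in trace.items()}


def traces_agree(model_fn, guide_fn, runs=200, seed=0):
    """Randomized check that guide traces are also valid model traces."""
    rng = random.Random(seed)
    for _ in range(runs):
        guide_trace = run(guide_fn, rng)
        model_trace = run(model_fn, rng, replay=guide_trace)
        if signature(guide_trace) != signature(model_trace):
            return False
    return True
```

Calling `traces_agree(model, good_guide)` succeeds, while `traces_agree(model, bad_guide)` fails as soon as `is_far` is sampled as false, because the model then needs an `offset` site that the guide never proposes. A hand-written guide can go wrong in exactly this way once the model has sampled control flow.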
Type-Preserving, Dependence-Aware Guide Generation for Sound, Effective Amortized Probabilistic Inference