Abstract
We present a static analysis that discovers differentiable, or more generally smooth, parts of a given probabilistic program, and show how the analysis can improve the pathwise gradient estimator, one of the most popular methods for posterior inference and model learning. Our improvement extends the scope of the estimator from differentiable models to non-differentiable ones without requiring manual intervention by the user: the improved estimator automatically identifies the differentiable parts of a given probabilistic program using our static analysis, applies the pathwise gradient estimator to those parts, and falls back to a more general but less efficient estimator, the score estimator, for the rest of the program. The soundness argument for our analysis is surprisingly subtle, partly because some target smoothness properties misbehave when viewed from the perspective of a program-analysis designer. For instance, some smoothness properties, such as partial differentiability and partial continuity, are not preserved by function composition, which makes it difficult to analyse sequential composition soundly without heavily sacrificing precision. We formulate five assumptions on a target smoothness property, prove the soundness of our analysis under those assumptions, and show that our leading example properties satisfy them. We also show that, by using information from our analysis instantiated for differentiability, our improved gradient estimator satisfies an important differentiability requirement and therefore computes the correct estimate on average (i.e., returns an unbiased estimate) under a regularity condition. Our experiments with representative probabilistic programs in the Pyro language show that our static analysis accurately identifies the smooth parts of those programs, and that it enables our improved pathwise gradient estimator to exploit all the opportunities for high performance in those programs.
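To make the trade-off between the two estimators concrete, the following is a minimal, self-contained sketch (not the paper's implementation) comparing the pathwise (reparameterisation) estimator and the score (REINFORCE) estimator on a toy problem: estimating d/dθ of E_{x ~ N(θ, 1)}[x²], whose true value is 2θ. All function names here are illustrative.

```python
# Toy comparison of the pathwise and score gradient estimators
# for d/dtheta E_{x ~ N(theta, 1)}[x^2] = 2*theta.
import random
import statistics

def pathwise_estimates(theta, n, rng):
    # Pathwise: reparameterise x = theta + eps with eps ~ N(0, 1), then
    # differentiate f(x) = x^2 through the sample:
    # d/dtheta (theta + eps)^2 = 2*(theta + eps).
    # Requires f to be differentiable, but has low variance.
    return [2.0 * (theta + rng.gauss(0.0, 1.0)) for _ in range(n)]

def score_estimates(theta, n, rng):
    # Score (REINFORCE): f(x) * d/dtheta log N(x; theta, 1) = x^2 * (x - theta).
    # Needs no derivative of f (so it works for non-differentiable f),
    # but typically has much higher variance.
    out = []
    for _ in range(n):
        x = theta + rng.gauss(0.0, 1.0)
        out.append(x * x * (x - theta))
    return out

rng = random.Random(0)
theta, n = 1.5, 100_000
pw = pathwise_estimates(theta, n, rng)
sc = score_estimates(theta, n, rng)
print(statistics.mean(pw), statistics.variance(pw))  # mean near 2*theta = 3.0
print(statistics.mean(sc), statistics.variance(sc))  # mean near 3.0, far larger variance
```

Both estimators are unbiased here, but the score estimator's variance is roughly an order of magnitude larger; this gap is what motivates applying the pathwise estimator wherever the program is provably differentiable and reserving the score estimator for the remaining parts.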
Index Terms
Smoothness Analysis for Probabilistic Programs with Application to Optimised Variational Inference