Abstract
We present a modular semantic account of Bayesian inference algorithms for probabilistic programming languages, as used in data science and machine learning. Sophisticated inference algorithms are often explained in terms of composition of smaller parts. However, neither their theoretical justification nor their implementation reflects this modularity. We show how to conceptualise and analyse such inference algorithms as manipulating intermediate representations of probabilistic programs using higher-order functions and inductive types, and their denotational semantics.
Semantic accounts of continuous distributions use measurable spaces. However, our use of higher-order functions presents a substantial technical difficulty: it is impossible to define a measurable space structure over the collection of measurable functions between arbitrary measurable spaces that is compatible with standard operations on those functions, such as function application. We overcome this difficulty using quasi-Borel spaces, a recently proposed mathematical structure that supports both function spaces and continuous distributions.
We define a class of semantic structures for representing probabilistic programs, and semantic validity criteria for transformations of these representations in terms of distribution preservation. We develop a collection of building blocks for composing representations. We use these building blocks to validate common inference algorithms such as Sequential Monte Carlo and Markov Chain Monte Carlo. To emphasise the connection between the semantic manipulation and its traditional measure-theoretic origins, we use Kock's synthetic measure theory. We demonstrate its usefulness by proving a quasi-Borel counterpart to the Metropolis-Hastings-Green theorem.
Index Terms
Denotational validation of higher-order Bayesian inference