Affine Monads and Lazy Structures for Bayesian Programming

Published: 11 January 2023

Abstract

We show that streams and lazy data structures are a natural idiom for programming with infinite-dimensional Bayesian methods such as Poisson processes, Gaussian processes, jump processes, Dirichlet processes, and Beta processes. The crucial semantic idea, inspired by developments in synthetic probability theory, is to work with two separate monads: an affine monad of probability, which supports laziness, and a commutative, non-affine monad of measures, which does not. (Affine means that T(1) ≅ 1.) We show that the separation is important from a decidability perspective, and that the recent model of quasi-Borel spaces supports these two monads.
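
To make the distinction between the two monads concrete, the following is a minimal, self-contained Haskell sketch. It is an illustration under stated assumptions, not LazyPPL's actual API: the names `Tree`, `mkTree`, `Prob`, `Meas`, `uniform`, `iid`, `score`, and `sample` are hypothetical, and the congruential step in `mkTree` is only a crude stand-in for a real splittable random number generator. The sketch treats a probabilistic computation as a function of a lazily built infinite tree of random numbers, and a measure as a probabilistic computation that also carries a weight.

```haskell
import Control.Monad (ap, liftM)

-- Hypothetical toy names throughout; this is not LazyPPL's actual API.

-- An infinite, lazily built binary tree of pseudo-random numbers in [0,1).
data Tree = Tree Double Tree Tree

-- Grow the tree from a non-negative seed using a simple congruential step.
-- This is a crude stand-in for a real splittable generator.
mkTree :: Int -> Tree
mkTree s = Tree (fromIntegral s' / fromIntegral m)
                (mkTree (2 * s' + 1))
                (mkTree (2 * s' + 2))
  where
    m  = 2147483647 :: Int
    s' = (s * 48271 + 11) `mod` m

-- Probability: a deterministic function of the random tree.
newtype Prob a = Prob { runProb :: Tree -> a }

instance Functor Prob where fmap = liftM
instance Applicative Prob where
  pure x = Prob (const x)
  (<*>)  = ap
instance Monad Prob where
  -- Split the tree: the left half feeds m, the right half feeds the rest.
  Prob m >>= k = Prob (\(Tree _ l r) -> runProb (k (m l)) r)

-- Read the number stored at the root of the tree.
uniform :: Prob Double
uniform = Prob (\(Tree u _ _) -> u)

-- An infinite stream of draws; only the demanded prefix is ever computed.
iid :: Prob a -> Prob [a]
iid p = do { x <- p; xs <- iid p; return (x : xs) }

-- Measures: a probabilistic computation paired with a multiplicative weight.
newtype Meas a = Meas { runMeas :: Prob (a, Double) }

instance Functor Meas where fmap = liftM
instance Applicative Meas where
  pure x = Meas (pure (x, 1))
  (<*>)  = ap
instance Monad Meas where
  -- Run both steps and multiply their weights.
  Meas m >>= k = Meas $ do
    (x, w1) <- m
    (y, w2) <- runMeas (k x)
    return (y, w1 * w2)

-- Multiply the current weight by w.
score :: Double -> Meas ()
score w = Meas (pure ((), w))

-- Embed a probabilistic computation with weight 1.
sample :: Prob a -> Meas a
sample p = Meas (fmap (\x -> (x, 1)) p)
```

In GHCi, `take 3 (runProb (iid uniform) (mkTree 42))` returns three numbers even though the stream is infinite, and `runProb (iid uniform >> return "done") (mkTree 0)` returns `"done"` without touching the stream: discarding a `Prob` computation has no observable effect, which is the affine property in miniature. By contrast, `score 0 >> return ()` in `Meas` carries weight 0 while `return ()` carries weight 1, so a weighted computation cannot be discarded silently.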

To perform Bayesian inference with these examples, we introduce new inference methods that are specially adapted to laziness; they are proven correct by reference to the Metropolis-Hastings-Green method. Our theoretical development is implemented as a Haskell library, LazyPPL.
