skip to main content
research-article
Public Access

Symbolic Disintegration with a Variety of Base Measures

Published:19 May 2020Publication History
Skip Abstract Section

Abstract

Disintegration is a relation on measures and a transformation on probabilistic programs that generalizes density calculation and conditioning, two operations widely used for exact and approximate inference. Existing program transformations that find a disintegration or density automatically are limited to a fixed base measure that is an independent product of Lebesgue and counting measures, so they are of no help in practical cases that require tricky reasoning about other base measures. We present the first disintegrator that handles variable base measures, including discrete-continuous mixtures, dependent products, and disjoint sums. By analogy with type inference, our disintegrator can check a given base measure as well as infer an unknown one that is principal. We derive the disintegrator and prove it sound by equational reasoning from semantic specifications. It succeeds in a variety of applications where disintegration and density calculation had not been previously mechanized.

References

  1. Nathanael L. Ackerman, Cameron E. Freer, and Daniel M. Roy. 2011. Noncomputable conditional distributions. In Proceedings of the 26th Symposium on Logic in Computer Science (LICS’11). IEEE Computer Society Press, 107--116.Google ScholarGoogle Scholar
  2. Nathanael L. Ackerman, Cameron E. Freer, and Daniel M. Roy. 2017. On computability and disintegration. Math. Struct. Comput. Sci. 27, 8 (2017), 1287--1314.Google ScholarGoogle ScholarCross RefCross Ref
  3. Hadi Mohasel Afshar, Scott Sanner, and Christfried Webers. 2016. Closed-form Gibbs sampling for graphical models with algebraic constraints. In Proceedings of the 30th AAAI Conference on Artificial Intelligence. AAAI Press.Google ScholarGoogle Scholar
  4. Amal Ahmed and Matthias Blume. 2011. An equivalence-preserving CPS translation via multi-language semantics. In Proceedings of the ACM SIGPLAN International Conference on Functional Programming, Manuel M. T. Chakravarty, Zhenjiang Hu, and Olivier Danvy (Eds.). ACM Press, 431--444.Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Sooraj Bhat, Ashish Agarwal, Richard Vuduc, and Alexander Gray. 2012. A type theory for probability density functions. In Proceedings of the 39th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, 545--556.Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Sooraj Bhat, Johannes Borgström, Andrew D. Gordon, and Claudio V. Russo. 2013. Deriving probability density functions from probabilistic functional programs. In Proceedings of the 19th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’13), Nir Piterman and Scott A. Smolka (Eds.). Springer, 508--522.Google ScholarGoogle Scholar
  7. Richard S. Bird and Oege de Moor. 1996. Algebra of Programming. Prentice-Hall.Google ScholarGoogle Scholar
  8. Anders Bondorf. 1992. Improving binding times without explicit CPS-conversion. In Proceedings of the ACM Conference on Lisp and Functional Programming (Lisp Pointers), William D. Clinger (Ed.), Vol. V(1). ACM Press, 1--10.Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, and Marcin Szymczak. 2016. A lambda-calculus foundation for universal probabilistic programming. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Jacques Garrigue, Gabriele Keller, and Eijiro Sumii (Eds.). ACM Press, 33--46.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Wray L. Buntine. 1994. Operations for learning with graphical models. J. Artific. Intell. Res. 2 (1994), 159--225.Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Jacques Carette and Chung-chieh Shan. 2016. Simplifying probabilistic programs using computer algebra. In Proceedings of the 18th International Symposium on Practical Aspects of Declarative Languages (PADL’16) (Lecture Notes in Computer Science), Marco Gavanelli and John H. Reppy (Eds.). Springer, 135--152.Google ScholarGoogle ScholarCross RefCross Ref
  12. Bob Carpenter, Andrew Gelman, Matthew Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. Stan: A probabilistic programming language. J. Stat. Softw. 76, 1 (2017), 1--32.Google ScholarGoogle ScholarCross RefCross Ref
  13. Joseph T. Chang and David Pollard. 1997. Conditioning as disintegration. Statistica Neerlandica 51, 3 (1997), 287--317.Google ScholarGoogle ScholarCross RefCross Ref
  14. Thomas M. Cover and Joy A. Thomas. 2006. Elements of Information Theory (2nd ed.). Wiley.Google ScholarGoogle Scholar
  15. Ryan Culpepper and Andrew Cobb. 2017. Contextual equivalence for probabilistic programs with continuous random variables and scoring. In Proceedings of the 26th European Symposium on Programming Languages and Systems (ESOP’17) (Lecture Notes in Computer Science), Yang Hongseok (Ed.). Springer, 368--392.Google ScholarGoogle ScholarCross RefCross Ref
  16. Marco F. Cusumano-Towner, Feras A. Saad, Alexander K. Lew, and Vikash K. Mansinghka. 2019. Gen: A general-purpose probabilistic programming system with programmable inference. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, Kathryn S. McKinley and Kathleen Fisher (Eds.). ACM Press, 221--236.Google ScholarGoogle Scholar
  17. Olivier Danvy and Andrzej Filinski. 1990. Abstracting control. In Proceedings of the ACM Conference on Lisp and Functional Programming. ACM Press, 151--160.Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Bruno de Finetti. 1970. Teoria delle Probabilità: Sintesi Introduttiva con Appendice Critica. Vol. 1. Giulio Einaudi, Torino. Translated as de Finetti 1974.Google ScholarGoogle Scholar
  19. Bruno de Finetti. 1972. Probability, Induction, and Statistics. Wiley.Google ScholarGoogle Scholar
  20. Bruno de Finetti. 1974. Theory of Probability: A Critical Introductory Treatment. Vol. 1. Wiley.Google ScholarGoogle Scholar
  21. Luc Devroye. 1986. Non-Uniform Random Variate Generation. Springer.Google ScholarGoogle Scholar
  22. Jean Dieudonné. 1947–1948. Sur le Théorème de Lebesgue-Nikodym (III). Annales de l’université de Grenoble 23 (1947–1948), 25--53. http://eudml.org/doc/84619.Google ScholarGoogle Scholar
  23. Sebastian Fischer, Josep Silva, Salvador Tamarit, and Germán Vidal. 2008. Preserving sharing in the partial evaluation of lazy functional programs. In Proceedings of the 17th International Symposium on Logic-based Program Synthesis and Transformation (LOPSTR’07) (Lecture Notes in Computer Science), Andy King (Ed.). Springer, 74--89.Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Weihao Gao, Sreeram Kannan, Sewoong Oh, and Pramod Viswanath. 2017. Estimating mutual information for discrete-continuous mixtures. In Advances in Neural Information Processing Systems, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). Curran Associates, 5986--5997.Google ScholarGoogle Scholar
  25. Timon Gehr, Sasa Misailovic, and Martin T. Vechev. 2016. PSI: Exact symbolic inference for probabilistic programs. In Proceedings of the 28th International Conference on Computer Aided Verification, Part I (Lecture Notes in Computer Science), Swarat Chaudhuri and Azadeh Farzan (Eds.). Springer, 62--83.Google ScholarGoogle Scholar
  26. Alan E. Gelfand, Adrian F. M. Smith, and Tai-Ming Lee. 1992. Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J. Amer. Statist. Assoc. 87, 418 (1992), 523--532.Google ScholarGoogle ScholarCross RefCross Ref
  27. Stuart Geman and Donald Geman. 1984. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6, 6 (1984), 721--741.Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Michèle Giry. 1982. A categorical approach to probability theory. In Proceedings of an International Conference on Categorical Aspects of Topology and Analysis, Bernhard Banaschewski (Ed.). Springer, 68--85.Google ScholarGoogle ScholarCross RefCross Ref
  29. Noah D. Goodman, Vikash K. Mansinghka, Daniel Roy, Keith Bonawitz, and Joshua B. Tenenbaum. 2008. Church: A language for generative models. In Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence, David Allen McAllester and Petri Myllymäki (Eds.). AUAI Press, 220--229.Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. N. J. Gordon, D. J. Salmond, and A. F. M. Smith. 1993. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F (Radar and Signal Processing) 140, 2 (1993), 107--113.Google ScholarGoogle ScholarCross RefCross Ref
  31. Peter J. Green. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 4 (1995), 711--732.Google ScholarGoogle ScholarCross RefCross Ref
  32. W. Keith Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 1 (1970), 97--109.Google ScholarGoogle ScholarCross RefCross Ref
  33. Chris Heunen, Ohad Kammar, Sam Staton, and Hongseok Yang. 2017. A convenient category for higher-order probability theory. In Proceedings of the 32nd Symposium on Logic in Computer Science (LICS’17). IEEE Computer Society Press, 1--12.Google ScholarGoogle ScholarCross RefCross Ref
  34. Gérard Huet. 1976. Résolution d’Équations dans des Langages d’Ordre . Thèse de doctorat es sciences mathématiques. Université Paris VII.Google ScholarGoogle Scholar
  35. Gérard Huet and Bernard Lang. 1978. Proving and applying program transformations expressed with second-order patterns. Acta Informatica 11, 1 (1978), 31--55.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. John Hughes. 1995. The design of a pretty-printing library. In Proceedings of the 1st International Spring School on Advanced Functional Programming Techniques, Johan Jeuring and Erik Meijer (Eds.). Number 925 in Lecture Notes in Computer Science. Springer, 53--96.Google ScholarGoogle Scholar
  37. Graham Hutton and Erik Meijer. 1996. Back to basics: Deriving representation changers functionally. J. Funct. Program. 6, 1 (1996), 181--188.Google ScholarGoogle ScholarCross RefCross Ref
  38. Neil D. Jones, Carsten K. Gomard, and Peter Sestoft. 1993. Partial Evaluation and Automatic Program Generation. Prentice-Hall.Google ScholarGoogle Scholar
  39. Jesper Jørgensen. 1992. Generating a compiler for a lazy language by partial evaluation. In Proceedings of the 19th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Ravi Sethi (Ed.). ACM Press, 258--268.Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Herman Kahn and T. E. Harris. 1951. Estimation of particle transmission by random sampling. Nat. Bureau Stand. Appl. Math. Ser. 12 (1951), 27--30.Google ScholarGoogle Scholar
  41. Anders Kock. 2012. Commutative monads as a theory of distributions. Theory Appl. Cat. 26, 4 (2012), 97--131.Google ScholarGoogle Scholar
  42. Julia L. Lawall and Olivier Danvy. 1994. Continuation-based partial evaluation. In Proceedings of the ACM Conference on Lisp and Functional Programming. ACM Press, 227--238.Google ScholarGoogle Scholar
  43. David J. Lunn, Andrew Thomas, Nicky Best, and David Spiegelhalter. 2000. WinBUGS—A Bayesian modelling framework: Concepts, structure, and extensibility. Statist. Comput. 10, 4 (2000), 325--337.Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. David J. C. MacKay. 1998. Introduction to Monte Carlo methods. In Learning and Inference in Graphical Models, Michael I. Jordan (Ed.). Kluwer.Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Nicholas Metropolis, Arianna W. Rosenbluth, Marshall N. Rosenbluth, Augusta H. Teller, and Edward Teller. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 6 (1953), 1087--1092.Google ScholarGoogle ScholarCross RefCross Ref
  46. Wazim Mohammed Ismail and Chung-chieh Shan. 2016. Deriving a probability density calculator (functional pearl). In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Jacques Garrigue, Gabriele Keller, and Eijiro Sumii (Eds.). ACM Press, 47--59.Google ScholarGoogle Scholar
  47. Praveen Narayanan, Jacques Carette, Wren Romano, Chung-chieh Shan, and Robert Zinkov. 2016. Probabilistic inference by program transformation in Hakaru (system description). In Proceedings of the 13th International Symposium on Functional and Logic Programming (FLOPS’16) (Lecture Notes in Computer Science), Oleg Kiselyov and Andy King (Eds.). Springer, 62--79.Google ScholarGoogle ScholarCross RefCross Ref
  48. Praveen Narayanan and Chung-chieh Shan. 2017. Symbolic conditioning of arrays in probabilistic programs. Proc. ACM Program. Lang. 1, ICFP (2017), 11:1–11:25.Google ScholarGoogle ScholarDigital LibraryDigital Library
  49. Otton Nikodym. 1930. Sur une Généralisation des Intégrales de M. J. Radon. Fundamenta Mathematicae 15 (1930), 131--179.Google ScholarGoogle ScholarCross RefCross Ref
  50. Avi Pfeffer. 2009. CTPPL: A continuous time probabilistic programming language. In Proceedings of the 21st International Joint Conference on Artificial Intelligence, Craig Boutilier (Ed.). 1943--1950.Google ScholarGoogle Scholar
  51. Benjamin C. Pierce. 2002. Types and Programming Languages. MIT Press.Google ScholarGoogle Scholar
  52. Gordon Plotkin and John Power. 2003. Algebraic operations and generic effects. Appl. Categor. Struct. 11, 1 (2003), 69--94.Google ScholarGoogle ScholarCross RefCross Ref
  53. Gordon D. Plotkin. 1975. Call-by-name, call-by-value and the -calculus. Theoret. Comput. Sci. 1, 2 (1975), 125--159.Google ScholarGoogle ScholarCross RefCross Ref
  54. David Pollard. 2001. A User’s Guide to Measure Theoretic Probability. Cambridge University Press.Google ScholarGoogle Scholar
  55. Norman Ramsey and Avi Pfeffer. 2002. Stochastic lambda calculus and monads of probability distributions. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, 154--165.Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. John C. Reynolds. 1972. Definitional interpreters for higher-order programming languages. In Proceedings of the ACM National Conference, Vol. 2. ACM Press, 717--740. Reprinted with introduction in Higher-Order and Symbolic Computation 11, 4 (1998), 363–397.Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. David A. Roberts, Marcus Gallagher, and Thomas Taimre. 2019. Reversible jump probabilistic programming. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS’19) (Proceedings of Machine Learning Research), Kamalika Chaudhuri and Masashi Sugiyama (Eds.). 634--643.Google ScholarGoogle Scholar
  58. Halsey L. Royden. 1988. Real Analysis (3rd ed.). Prentice-Hall.Google ScholarGoogle Scholar
  59. Tetsuya Sato, Alejandro Aguirre, Gilles Barthe, Marco Gaboardi, Deepak Garg, and Justin Hsu. 2019. Formal verification of higher-order probabilistic programs: Reasoning about approximation, convergence, Bayesian inference, and optimization. Proc. ACM Program. Lang. 3, POPL (2019), 38:1–38:30.Google ScholarGoogle ScholarDigital LibraryDigital Library
  60. Adam Ścibior, Ohad Kammar, Matthijs Vákár, Sam Staton, Hongseok Yang, Yufei Cai, Klaus Ostermann, Sean K. Moss, Chris Heunen, and Zoubin Ghahramani. 2018. Denotational validation of higher-order Bayesian inference. Proc. ACM Program. Lang. 2, POPL (2018), 60:1–60:29.Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. Chung-chieh Shan and Norman Ramsey. 2017. Exact Bayesian inference by symbolic disintegration. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, Giuseppe Castagna and Andrew D. Gordon (Eds.). ACM Press, 130--144.Google ScholarGoogle Scholar
  62. Robert J. Shiller. 1999. The ET interview: Professor James Tobin. Econometric Theory 15, 6 (1999), 867--900.Google ScholarGoogle ScholarCross RefCross Ref
  63. Sam Staton. 2017. Commutative semantics for probabilistic programming. In Proceedings of the 26th European Symposium on Programming Languages and Systems (ESOP’17) (Lecture Notes in Computer Science), Yang Hongseok (Ed.). Springer, 855--879.Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Robert D. Tennent. 1973. Mathematical semantics of SNOBOL 4. In Proceedings of the ACM Symposium on Principles of Programming Languages, Patrick C. Fischer and Jeffrey D. Ullman (Eds.). ACM Press, 95--107.Google ScholarGoogle Scholar
  65. Hayo Thielecke. 2003. From control effects to typed continuation passing. In Proceedings of the 30th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, 139--149.Google ScholarGoogle ScholarDigital LibraryDigital Library
  66. Luke Tierney. 1998. A note on Metropolis-Hastings kernels for general state spaces. Ann. Appl. Probabil. 8, 1 (1998), 1--9.Google ScholarGoogle ScholarCross RefCross Ref
  67. James Tobin. 1958. Estimation of relationships for limited dependent variables. Econometrica 26, 1 (1958), 24--36.Google ScholarGoogle ScholarCross RefCross Ref
  68. Matthijs Vákár, Ohad Kammar, and Sam Staton. 2019. A domain theory for statistical probabilistic programming. Proc. ACM Program. Lang. 3, POPL (2019), 36:1–36:29.Google ScholarGoogle Scholar
  69. Matthijs Vákár and Luke Ong. 2018. On s-finite measures and kernels. e-Print 1810.01837. Retrieved from https://arxiv.org/abs/1810.01837Google ScholarGoogle Scholar
  70. Rajan Walia, Praveen Narayanan, Jacques Carette, Sam Tobin-Hochstadt, and Chung-chieh Shan. 2019. From high-level inference algorithms to efficient code. Proc. ACM Program. Lang. 3, ICFP (2019), 98:1–98:30.Google ScholarGoogle ScholarDigital LibraryDigital Library
  71. Mitchell Wand. 1987a. Complete type inference for simple objects. In Proceedings of the Symposium on Logic in Computer Science (LICS’87). IEEE Computer Society Press, 37--44.Google ScholarGoogle Scholar
  72. Mitchell Wand. 1987b. A simple algorithm and proof for type inference. Fundamenta Informaticae 10, 2 (1987), 115--122.Google ScholarGoogle ScholarCross RefCross Ref
  73. Mitchell Wand, Ryan Culpepper, Theophilos Giannakopoulos, and Andrew Cobb. 2018. Contextual equivalence for a probabilistic language with continuous random variables and recursion. Proc. ACM Program. Lang. 2, ICFP (2018), 87:1–87:30.Google ScholarGoogle ScholarDigital LibraryDigital Library
  74. David Wingate, Andreas Stuhlmüller, and Noah D. Goodman. 2011. Lightweight implementations of probabilistic programming languages via transformational compilation. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS’11) (JMLR Workshop and Conference Proceedings), Geoffrey Gordon, David Dunson, and Miroslav Dudík (Eds.). 770--778.Google ScholarGoogle Scholar
  75. Frank Wood, Jan Willem van de Meent, and Vikash Mansinghka. 2014. A new approach to probabilistic programming inference. In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS’14) (JMLR Workshop and Conference Proceedings). 1024--1032.Google ScholarGoogle Scholar
  76. Yi Wu, Siddharth Srivastava, Nicholas Hay, Simon Du, and Stuart Russell. 2018. Discrete-continuous mixtures in probabilistic programming: Generalized semantics and inference algorithms. In Proceedings of the 35th International Conference on Machine Learning (Proceedings of Machine Learning Research), Jennifer Dy and Andreas Krause (Eds.), Vol. 80. 5339--5348.Google ScholarGoogle Scholar
  77. Robert Zinkov and Chung-chieh Shan. 2017. Composing inference algorithms as program transformations. In Proceedings of the Conference on Uncertainty in Artificial Intelligence.Google ScholarGoogle Scholar

Index Terms

  1. Symbolic Disintegration with a Variety of Base Measures

          Recommendations

          Comments

          Login options

          Check if you have access through your login credentials or your institution to get full access on this article.

          Sign in

          Full Access

          • Published in

            cover image ACM Transactions on Programming Languages and Systems
            ACM Transactions on Programming Languages and Systems  Volume 42, Issue 2
            June 2020
            286 pages
            ISSN:0164-0925
            EISSN:1558-4593
            DOI:10.1145/3395960
            Issue’s Table of Contents

            Copyright © 2020 Owner/Author

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 19 May 2020
            • Online AM: 7 May 2020
            • Accepted: 1 November 2019
            • Revised: 1 July 2019
            • Received: 1 February 2019
            Published in toplas Volume 42, Issue 2

            Permissions

            Request permissions about this article.

            Request Permissions

            Check for updates

            Qualifiers

            • research-article
            • Research
            • Refereed

          PDF Format

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader

          HTML Format

          View this article in HTML Format .

          View HTML Format
          About Cookies On This Site

          We use cookies to ensure that we give you the best experience on our website.

          Learn more

          Got it!