Abstract
Disintegration is a relation on measures and a transformation on probabilistic programs that generalizes density calculation and conditioning, two operations widely used for exact and approximate inference. Existing program transformations that find a disintegration or density automatically are limited to a fixed base measure that is an independent product of Lebesgue and counting measures, so they are of no help in practical cases that require tricky reasoning about other base measures. We present the first disintegrator that handles variable base measures, including discrete-continuous mixtures, dependent products, and disjoint sums. By analogy with type inference, our disintegrator can check a given base measure as well as infer an unknown one that is principal. We derive the disintegrator and prove it sound by equational reasoning from semantic specifications. It succeeds in a variety of applications where disintegration and density calculation had not been previously mechanized.
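As a rough illustration of why the base measure matters, consider a discrete-continuous mixture. The sketch below is not taken from the paper: the clamped variable Y = max(0, X) with X normally distributed, and the names `clamped_density` and `total_mass`, are assumptions chosen for illustration. Y has no density with respect to Lebesgue measure alone, because it equals 0 with positive probability, but it does have one with respect to the sum of Lebesgue measure and a point mass at 0.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of Normal(mu, sigma) with respect to Lebesgue measure."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of Normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def clamped_density(y, mu=0.0, sigma=1.0):
    """Density of Y = max(0, X), X ~ Normal(mu, sigma).

    The base measure (an assumption of this sketch) is Lebesgue measure
    plus a Dirac point mass at 0.
    """
    if y < 0.0:
        return 0.0                        # Y is never negative
    if y == 0.0:
        return normal_cdf(0.0, mu, sigma)  # weight of the atom: P(X <= 0)
    return normal_pdf(y, mu, sigma)        # continuous part on (0, infinity)


def total_mass(mu=0.0, sigma=1.0, upper=12.0, steps=20000):
    """Integrate clamped_density against the mixed base measure.

    The Dirac part contributes the density's value at 0; the Lebesgue part
    is approximated by a midpoint rule on (0, upper).
    """
    atom = clamped_density(0.0, mu, sigma)
    h = upper / steps
    continuous = sum(
        clamped_density((i + 0.5) * h, mu, sigma) for i in range(steps)
    ) * h
    return atom + continuous
```

Integrating this density against the mixed base measure should give a total mass close to 1, whereas integrating the continuous part against Lebesgue measure alone would miss the probability sitting at 0.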
Symbolic Disintegration with a Variety of Base Measures