Abstract
A Bayesian model is based on a pair of probability distributions, known as the prior and sampling distributions. A wide range of fundamental machine learning tasks, including regression, classification, clustering, and many others, can all be seen as Bayesian models. We propose a new probabilistic programming abstraction, a typed Bayesian model, which is based on a pair of probabilistic expressions for the prior and sampling distributions. A sampler for a model is an algorithm to compute synthetic data from its sampling distribution, while a learner for a model is an algorithm for probabilistic inference on the model. Models, samplers, and learners form a generic programming pattern for model-based inference. They support the uniform expression of common tasks including model testing, and generic compositions such as mixture models, evidence-based model averaging, and mixtures of experts. A formal semantics supports reasoning about model equivalence and implementation correctness. By developing a series of examples and three learner implementations based on exact inference, factor graphs, and Markov chain Monte Carlo, we demonstrate the broad applicability of this new programming pattern.
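To make the pattern concrete, here is a minimal Python sketch of a model as a pair of prior and sampling distributions, with a generic sampler and one exact-inference learner. It is an illustration under stated assumptions, not the paper's API: the names `Model`, `sample`, `coin`, and `BetaBernoulliLearner` are hypothetical, and the learner covers only the coin-flip model, where Beta-Bernoulli conjugacy gives the posterior in closed form.

```python
import random
from dataclasses import dataclass
from typing import Callable, Generic, List, Sequence, Tuple, TypeVar

W = TypeVar("W")  # parameter type
X = TypeVar("X")  # input type
Y = TypeVar("Y")  # output type


@dataclass(frozen=True)
class Model(Generic[W, X, Y]):
    """Hypothetical model record: a prior over parameters w, and a
    sampling distribution for an output y given (w, x)."""
    prior: Callable[[random.Random], W]
    gen: Callable[[random.Random, W, X], Y]


def sample(model: Model, xs: Sequence, rng: random.Random) -> Tuple[object, List]:
    """Sampler: draw w from the prior, then one synthetic output per input."""
    w = model.prior(rng)
    return w, [model.gen(rng, w, x) for x in xs]


# Example model (an assumption for illustration): a coin with unknown bias.
# Prior: bias ~ Beta(1, 1). Sampling distribution: flip ~ Bernoulli(bias).
coin = Model(
    prior=lambda rng: rng.betavariate(1.0, 1.0),
    gen=lambda rng, w, _x: rng.random() < w,
)


@dataclass
class BetaBernoulliLearner:
    """Exact-inference learner for the coin model only.
    Holds the Beta(a, b) distribution over the bias."""
    a: float = 1.0
    b: float = 1.0

    def train(self, ys: Sequence[bool]) -> None:
        """Conjugate update: add observed heads to a and tails to b."""
        heads = sum(1 for y in ys if y)
        self.a += heads
        self.b += len(ys) - heads

    def posterior_mean(self) -> float:
        """Mean of the current Beta(a, b) posterior over the bias."""
        return self.a / (self.a + self.b)

    def predict(self) -> float:
        """Posterior predictive probability that the next flip is heads."""
        return self.posterior_mean()
```

A typical use is to draw synthetic data with `sample(coin, [None] * 20, random.Random(0))` and pass the resulting flips to `BetaBernoulliLearner().train(...)`. Comparing the learner's posterior mean with the sampled bias is one simple form of the model testing the abstract mentions. A learner based on factor graphs or Markov chain Monte Carlo would expose the same `train` and `predict` operations but compute the posterior differently.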
A model-learner pattern for Bayesian reasoning
POPL '13: Proceedings of the 40th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages