PyMC3 - Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano

  •        120

PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning which focuses on advanced Markov chain Monte Carlo and variational fitting algorithms. Its flexibility and extensibility make it applicable to a large suite of problems.Note: Running pip install pymc will install PyMC 2.3, not PyMC3, from PyPI.



Related Projects

Pyro - Deep universal probabilistic programming with Python and PyTorch

Pyro is a universal probabilistic programming language (PPL) written in Python and supported by PyTorch on the backend. Pyro enables flexible and expressive deep probabilistic modeling, unifying the best of modern deep learning and Bayesian modeling.

pymc - PyMC: Bayesian Stochastic Modelling in Python (for PyMC3:

NOTE: The development version of PyMC (version 3) has been moved to its own repository called pymc3.PyMC is a python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo. Its flexibility and extensibility make it applicable to a large suite of problems. Along with core sampling functionality, PyMC includes methods for summarizing output, plotting, goodness-of-fit and convergence diagnostics.

probability - Probabilistic reasoning and statistical analysis in TensorFlow

TensorFlow Probability is a library for probabilistic reasoning and statistical analysis in TensorFlow. As part of the TensorFlow ecosystem, TensorFlow Probability provides integration of probabilistic methods with deep networks, gradient-based inference via automatic differentiation, and scalability to large datasets and models via hardware acceleration (e.g., GPUs) and distributed computation. Our probabilistic machine learning tools are structured as follows.

Bios8366 - Advanced Statistical Computing at Vanderbilt University's Department of Biostatistics

Course covers numerical optimization, statistical machine learning, Markov Chain Monte Carlo (MCMC), variational inference (VI) algorithms, data augmentation algorithms with applications for model fitting and techniques for dealing with missing data. Prerequisites: Bios 6341 (Fundamentals of Probability), Bios 6342 (Contemporary Statistical Inference), or permission of instructor. Students must be familiar with basic probability, have some formal programming experience, and be comfortable using the Git version control system.

Bayesian Phylogenetic Diagnostics

This project will focus on the development of tools and analyses for the diagnosis of convergence in Markov Chain Monte Carlo (MCMC) processes, with a main focus on inference of phylogeny using Bayesian statistical methods.

iaf - Code for reproducing key results in the paper "Improving Variational Inference with Inverse Autoregressive Flow"

Code for reproducing key results in the paper Improving Variational Inference with Inverse Autoregressive Flow by Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. Set floatX = float32 in the [global] section of Theano config (usually ~/.theanorc). Alternatively you could prepend THEANO_FLAGS=floatX=float32 to the python commands below.


JProGraM (PRObabilistic GRAphical Models in Java) is a statistical machine learning library. It supports statistical modeling and data analysis along three main directions: (1) probabilistic graphical models (Bayesian networks, Markov random fields, dependency networks, hybrid random fields); (2) parametric, semiparametric, and nonparametric density estimation (Gaussian models, nonparanormal estimators, Parzen windows, Nadaraya-Watson estimator); (3) generative models for random networks (


VIBES stands for Variational Inference in BayES Nets. It consists of a graphical Bayes Net editor and an inference engine which allows variational inference to be applied automatically using Variational Message Passing.

Pattern Analysis Library in C++

The Pattern Analysis Library (PALib) is a C++ class library for pattern classification/recognition. PALib consists of a wide range of machine learning routines such as Bayesian decision theory, artificial neural networks, and fuzzy inference systems.

financial-analysis-python-tutorial - Financial Analysis in Python tutorial

You can view the video of the talk here. Thomas Wiecki is a Quantitative Researcher at Quantopian Inc -- a Boston based startup providing you with the first browser based algorithmic trading platform -- and a PhD student at Brown University where he studies Computational Cognitive Neuroscience. He specializes in Bayesian Inference, Machine Learning, Scientific Computing in Python, algorithmic trading and Computational Psychiatry.

CausalImpact - An R package for causal inference in time series

This R package implements an approach to estimating the causal effect of a designed intervention on a time series. For example, how many additional daily clicks were generated by an advertising campaign? Answering a question like this can be difficult when a randomized experiment is not available. The package aims to address this difficulty using a structural Bayesian time-series model to estimate how the response metric might have evolved after the intervention if the intervention had not occurred.As with all approaches to causal inference on non-experimental data, valid conclusions require strong assumptions. The CausalImpact package, in particular, assumes that the outcome time series can be explained in terms of a set of control time series that were themselves not affected by the intervention. Furthermore, the relation between treated series and control series is assumed to be stable during the post-intervention period. Understanding and checking these assumptions for any given application is critical for obtaining valid conclusions.

statistical-analysis-python-tutorial - Statistical Data Analysis in Python

Chris Fonnesbeck is an Assistant Professor in the Department of Biostatistics at the Vanderbilt University School of Medicine. He specializes in computational statistics, Bayesian methods, meta-analysis, and applied decision analysis. He originally hails from Vancouver, BC and received his Ph.D. from the University of Georgia. This tutorial will introduce the use of Python for statistical data analysis, using data stored as Pandas DataFrame objects. Much of the work involved in analyzing data resides in importing, cleaning and transforming data in preparation for analysis. Therefore, the first half of the course is comprised of a 2-part overview of basic and intermediate Pandas usage that will show how to effectively manipulate datasets in memory. This includes tasks like indexing, alignment, join/merge methods, date/time types, and handling of missing data. Next, we will cover plotting and visualization using Pandas and Matplotlib, focusing on creating effective visual representations of your data, while avoiding common pitfalls. Finally, participants will be introduced to methods for statistical data modeling using some of the advanced functions in Numpy, Scipy and Pandas. This will include fitting your data to probability distributions, estimating relationships among variables using linear and non-linear models, and a brief introduction to bootstrapping methods. Each section of the tutorial will involve hands-on manipulation and analysis of sample datasets, to be provided to attendees in advance.

Bayesian Network Tools in Java (BNJ)

Java/XML toolkit for research using Bayesian networks and other graphical models of probability (exact and approximate inference, structure learning, etc.)


This is a Python library for approximate unsupervised inference in Bayesian Hidden Markov Models (HMMs) and explicit-duration Hidden semi-Markov Models (HSMMs), focusing on the Bayesian Nonparametric extensions, the HDP-HMM and HDP-HSMM, mostly with weak-limit approximations. You may need to install a compiler with -std=c++11 support, like gcc-4.7 or higher.


RevBayes provides an R-like environment for evolutionary analysis using Bayesian inference.

ai-resources - Selection of resources to learn Artificial Intelligence / Machine Learning / Statistical Inference / Deep Learning / Reinforcement Learning

Update April 2017: It’s been almost a year since I posted this list of resources, and over the year there’s been an explosion of articles, videos, books, tutorials etc on the subject — even an explosion of ‘lists of resources’ such as this one. It’s impossible for me to keep this up to date. However, the one resource I would like to add is ( led by Gene Kogan. It’s specifically aimed at artists and the creative coding community. This is a very incomplete and subjective selection of resources to learn about the algorithms and maths of Artificial Intelligence (AI) / Machine Learning (ML) / Statistical Inference (SI) / Deep Learning (DL) / Reinforcement Learning (RL). It is aimed at beginners (those without Computer Science background and not knowing anything about these subjects) and hopes to take them to quite advanced levels (able to read and understand DL papers). It is not an exhaustive list and only contains some of the learning materials that I have personally completed so that I can include brief personal comments on them. It is also by no means the best path to follow (nowadays most MOOCs have full paths all the way from basic statistics and linear algebra to ML/DL). But this is the path I took and in a sense it's a partial documentation of my personal journey into DL (actually I bounced around all of these back and forth like crazy). As someone who has no formal background in Computer Science (but has been programming for many years), the language, notation and concepts of ML/SI/DL and even CS was completely alien to me, and the learning curve was not only steep, but vertical, treacherous and slippery like ice.

bogofilter -- Fast Bayesian Spam Filter

Bogofilter is a mail filter that classifies mail as spam or ham (non-spam) by a statistical analysis of the message's header and content (body). The program is able to learn from the user's classifications and corrections. Bogofilter provides processing for plain text and HTML. It supports multi-part MIME messages with decoding of base64, quoted-printable, and uuencoded text and ignores attachments, such as images.


msBayes allows complex and flexible phylogeographic inference. More specifically, you can test the simultaneous divergence (TSD) of multiple population (species) pairs. It uses approximate Bayesian computation (ABC) under a hierarchical model.


Mocapy++ is a Dynamic Bayesian Network toolkit, implemented in C++. It supports discrete, multinomial, Gaussian, Kent, Von Mises and Poisson nodes. Inference and learning is done by Gibbs sampling/Stochastic-EM.

notes-on-dirichlet-processes - :game_die: Notes explaining Dirichlet Processes, HDPs, and Latent Dirichlet Allocation

I taught myself Dirichlet processes and Hierarchical DPs in the spring of 2015 in order to understand nonparametric Bayesian models and related inference algorithms. In the process, I wrote a bunch of code and took a bunch of notes. I preserved those notes here for the benefit of others trying to learn this material.