Weather forecasts often make statements like “there is a 70% chance of snow tomorrow”. On the surface, such a statement simply gives the probability of snow, but could there be more to it? Pinning down what it actually means turns out to be surprisingly difficult. This is where the renowned Bayesian approach comes into play: it interprets probability as a degree of belief in an event.

This approach stands out because, while we can never be completely sure about events, we do have a certain level of confidence that they will take place, and that belief strengthens as we obtain more data. A scientist, for instance, is trained to look at every piece of data with a critical eye and to update their beliefs as new evidence comes in. Bayesian inference works in much the same way, which is what makes it so intuitive.

That said, Bayesian inference is also conceptually and computationally challenging, no matter how familiar you are with it. In most cases, you need long and complicated mathematical computations to get results. Even the most experienced mathematicians find such computations tedious, particularly when all they want is a quick answer to the problem at hand.

This is where a package known as PyMC3 comes into play. Known for its ability to perform numerical Bayesian inference effectively, PyMC3 has been a godsend in the field of data science and will only continue to evolve.

PyMC3 – How it Works

At its core, PyMC3 is a handy package that helps with Bayesian inference, but how does it actually work? The example below shows PyMC3 in action:

Let’s say, for instance, that you flip a coin three times and the results are 0, 0, and 1, where 0 means the coin landed on tails and 1 means it landed on heads. Can we confidently claim that the coin is fair? To make things concrete, let θ denote the probability that the coin lands on heads: would this evidence be enough to support the claim that θ = 1/2?

Well, since we don’t know anything about the coin beyond the result of the experiment above, it is difficult to be certain. If we look at things from a frequentist’s point of view, the point estimate of θ would be

θ = number of heads / number of trials = 1/3

While there is no denying that this figure makes sense, the frequentist approach does not inspire much confidence: if we performed more trials, we would quite likely end up with a different point estimate for θ. The Bayesian approach can improve on this. The idea is straightforward: since we know nothing about θ, we assume its value could be anywhere between 0 and 1.

Mathematically speaking, our prior belief is that θ follows a uniform distribution on the interval [0, 1]. We can then use our observations as evidence to update our belief about θ’s distribution.
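To make this concrete, here is a minimal sketch of how the coin-flip model could be written with PyMC3’s standard API; the variable names and sampler settings are illustrative, not prescriptive. The model places a uniform prior on θ and a Bernoulli likelihood over the three observed flips.

```python
import numpy as np
import pymc3 as pm

# Observed flips: 0 = tails, 1 = heads
data = np.array([0, 0, 1])

with pm.Model() as coin_model:
    # Prior: we know nothing about theta, so assume it is uniform on [0, 1]
    theta = pm.Uniform("theta", lower=0, upper=1)

    # Likelihood: each flip is a Bernoulli trial with probability theta of heads
    pm.Bernoulli("obs", p=theta, observed=data)

    # Draw posterior samples to update our belief about theta
    trace = pm.sample(2000, tune=1000)

# Summarize the posterior distribution of theta
print(pm.summary(trace))
```

The posterior mean will sit between the prior mean of 0.5 and the raw frequency of heads, reflecting how little evidence three flips provide.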

PyMC3 and Theano  

Theano’s original authors announced in 2017 that they planned to end development of the library for good. This left PyMC3 in a strange position, as it depended on Theano for its backend computations. It prompted the developers to turn their focus towards PyMC4, a rewrite that relied on TensorFlow as its main base.

That process eventually made the developers realize that building a probabilistic programming library on top of TensorFlow was not as straightforward as they had initially thought. This encouraged them to extend PyMC3’s life instead, starting with taking over Theano’s maintenance.

Working with Theano’s code base led to the realization that everything the team required was already there. They also noted that they could extend the code in useful ways, such as adding support for JAX and similar execution backends. The main idea is that Theano builds a computational graph of the operations in an expression, which can then be executed in sequence.
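As a rough illustration of this idea, the snippet below uses Theano’s standard API to define two symbolic scalars, build an expression from them, and only then compile the resulting graph into a callable function; nothing is computed until the compiled function is called. The variable names are arbitrary.

```python
import theano
import theano.tensor as tt

# Symbolic inputs: no values yet, just nodes in a graph
x = tt.dscalar("x")
y = tt.dscalar("y")

# Building the expression only adds nodes to the computational graph
z = x * y + tt.exp(x)

# Compiling the graph turns it into an executable function
f = theano.function([x, y], z)

print(f(1.0, 2.0))  # 1.0 * 2.0 + e^1 ≈ 4.718
```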

This graph-based structure is useful for several reasons. First, it makes optimizations easy: particular operations can be replaced with faster or more numerically stable alternatives. Second, the same graph can be compiled against a variety of execution backends. Theano supports two execution backends (Op implementations), C and Python, and the choice between them can be steered at compile time, as sketched below.
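Here is a small sketch of that choice, relying on Theano’s built-in compilation modes: FAST_COMPILE applies only a few graph optimizations and uses the Python Op implementations, while the default FAST_RUN mode applies the full optimization pipeline and uses the C implementations where available.

```python
import theano
import theano.tensor as tt

x = tt.dvector("x")
y = (x ** 2).sum()

# Python-oriented mode: few optimizations, Python Op implementations
f_py = theano.function([x], y, mode="FAST_COMPILE")

# Default mode: full optimizations, C Op implementations where available
f_c = theano.function([x], y, mode="FAST_RUN")

print(f_py([1.0, 2.0, 3.0]))  # 14.0
print(f_c([1.0, 2.0, 3.0]))   # 14.0, same graph compiled differently
```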

As any developer would expect, the Python backend is relatively slow, since it simply runs the graph by stringing together NumPy function calls. For speed, Theano relies on the C backend: after the graph has been simplified and transformed, the resulting operations are compiled to their C equivalents, and the generated C source files are compiled into a shared library.

Although this makes execution incredibly fast, maintaining the C backend is a massive burden. It also holds Theano back from benefiting from advances in compiler technology and processor architectures, because developers have to hand-write the C code for each operation themselves.

In most cases, PyMC3 models already work with the Theano-PyMC master branch using the SMC and NUTS samplers. While the experiments have been limited, the C backend is a bit faster than its JAX variant on smaller models. However, there is a good chance that this performance will improve.
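For a feel of what those experiments involve, here is an illustrative model (the data and priors are made up for the example) sampled with both NUTS and SMC through PyMC3’s standard sampling API.

```python
import numpy as np
import pymc3 as pm

# Synthetic data purely for illustration
data = np.random.normal(loc=1.0, scale=2.0, size=100)

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)
    sigma = pm.HalfNormal("sigma", sigma=5.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=data)

    # Gradient-based NUTS sampler (the default for continuous models)
    nuts_trace = pm.sample(1000, tune=1000, step=pm.NUTS())

    # Sequential Monte Carlo sampler
    smc_trace = pm.sample_smc(1000)
```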

What Does the Future Hold?

With the ability to compile Theano graphs to JAX, combined with JAX-based MCMC samplers, PyMC3 is set to undergo a major transformation. Without any massive changes to PyMC3’s code base, it will be possible to switch the backend to JAX and use JAX-based samplers to get fast sampling on models from small to large.

The best part is that users will not have to change their existing PyMC3 model code to run their models on the latest hardware and backends with JAX-based samplers and enjoy the improved speeds.
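As a hedged sketch of what that could look like, recent PyMC3 releases ship an experimental pymc3.sampling_jax module; assuming that module (plus JAX and NumPyro) is available, the model is defined exactly as before and only the sampling call changes.

```python
import numpy as np
import pymc3 as pm
import pymc3.sampling_jax  # experimental JAX-based samplers

data = np.random.normal(size=200)

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=1.0)
    pm.Normal("obs", mu=mu, sigma=1.0, observed=data)

    # The model definition is unchanged; only the sampler call differs
    trace = pm.sampling_jax.sample_numpyro_nuts(draws=1000, tune=1000)
```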