While watching the news, you’ve probably come across weather forecasts stating that the chances of snowfall are 75%. Have you ever wondered how they come up with this percentage? Or do you know what they mean? These are complicated statements, especially from the perspective of frequentist. You may find it difficult to explain whether snow will fall or not.

According to Bayesian, this approach is explainable, and they provide a solution to these types of statements. Bayesians’ approach towards probability explains that you can measure probability based on the believability of an event. That is why Bayesian are confident about the probability of an event.

In simple words, a Bayesian is not completely sure about what they believe, but they are confident about the event. However, as a Bayesian, you become more confident about what you believe after collecting the data and interpreting the connection between that data.

What is PyMC3

PP or probabilistic programming enables you to code the specification of Bayesian models. PyMC3 is an open-source and updated version of PyMC2. PyMC3 is a framework for probabilistic programming. With the help of the PyMC3 framework, you can describe your models with a powerful, readable, and intuitive syntax. The syntax is similar to natural syntax statisticians.

This framework allows you to feature Markov chain Monte Carlo algorithms. These algorithms are a next-generation tool for sampling and segmentation—for instance, No-U-Turn Sampler, Hamilton’s self-tuning variant Monte Carlo, etc. You can work on these samplers with the complex posterior and high dimensional distribution. These algorithms help you with complex and detailed models without the need for extensive knowledge and specialization in complicated algorithms.

Bayesian Network with PyMC3

You can calculate the probabilities of the event and define the relationship between different variables with the help of probabilistic models. For instance, if you are creating a model covering all the possible cases and probabilities, you need large amounts of data. The network will simplify the assumptions by conditional independence and increasing the effectiveness of all random variables. Although the steps are simple, you can take the example of Naïve Bayes in this case.

You can create a model to preserve the conditional dependence that you already know among random conditional independence and variables in different cases.

Bayesian networks are models for probabilistic graphics, capturing the conditional dependence that you know and directing the values in a graphical representation. You can find the missing connections with the help of the model’s conditional independencies.

Bayesian Networks enable you with useful tools so you can generate the domain’s graph and visualize the probabilistic model. Furthermore, it helps you to review the relationship of your random variables. Moreover, it indicates the reason for these probabilities with evidence. The Bayesian network has two major benefits:

  • Firstly, the Bayesian network enables you to understand how you can create and fit your model according to your requirement. However, before performing this activity, you need to understand the concept and the working mechanism behind the model.
  • Secondly, you can check the model’s performance where we already have the output data. For instance, you can check the uncertainty and values. This helps us understand the accuracy of the model.

Below, we will understand how Python stimulates your data with the help of properties that we already have.

PyMC3 with Python

Using Python for Probabilistic programming gives you numerous benefits. Here are some advantages of PyMC3 with Python:

  • Compatibility with multiple platforms
  • Readable, clean, and expressive syntax
  • Extensibility with Cython, Fortron, C, and C++
  • Easy integration with scientific libraries

With the help of these features and the quality of PyMC3, you can easily write custom statistical distribution, transformation function, and samples for Bayesian analysis.

Example of PyMC3 with Python

You can check the following example from pomegranate, which is a Python package. In this example, we use the data of COVID patients and Smokers to find out how many people will end up in the hospital. You can understand the concept from the following example:

import pomegranate aspg

 

smokeD = pg.DiscreteDistribution({‘yes’: 0.25, ‘no’: 0.75})

covidD = pg.DiscreteDistribution({‘yes’: 0.1, ‘no’: 0.9})

hospitalD = pg.ConditionalProbabilityTable(

    [[‘yes’, ‘yes’, ‘yes’, 0.9], [‘yes’, ‘yes’, ‘no’, 0.1],

[‘yes’, ‘no’, ‘yes’, 0.1], [‘yes’, ‘no’, ‘no’, 0.9],

[‘no’, ‘yes’, ‘yes’, 0.9], [‘no’, ‘yes’, ‘no’, 0.1],

     [‘no’, ‘no’, ‘yes’, 0.01], [‘no’, ‘no’, ‘no’, 0.99]],

    [smokeD, covidD])

 

smoke = pg.Node(smokeD, name=”smokeD”)

covid = pg.Node(covidD, name=”covidD”)

hospital = pg.Node(hospitalD, name=”hospitalD”)

 

model = pg.BayesianNetwork(“Covid Collider”)

model.add_states(smoke, covid, hospital)

model.add_edge(smoke, hospital)

model.add_edge(covid, hospital)

model.bake()

You could then calculate P(covid|smoking, hospital) = 0.5 with

model.predict_proba({‘smokeD’: ‘yes’, ‘hospitalD’: ‘yes’})

and P(covid|¬smoking, hospital)=0.91 with

model.predict_proba({‘smokeD’: ‘no’, ‘hospitalD’: ‘yes’})

 

Future of PyMC3

We await major transformation in the PyMC3 approach. We can now gather the Theano graphs in JAX and use an MCMC sampler based on JAX. This signifies that we can change the backend coding to JAX without any change in PyMC3 codes. Furthermore, we can utilize lightning-fast sampling for huge models with the help of JAX-based samplers.

The amazing part about this opportunity is that you don’t have to change the existing code for the PyMC3 model to run your models on modern hardware, modern backend, and JAX-based samplers. This approach and change will provide amazing speed without additional expenses.

With the increasing focus of TF and PyTorch on dynamic graphs, Python no longer has adequate static libraries. Furthermore, there are numerous benefits of static over dynamic graphs. That is why we consider Theano as a powerful and mature library for the future. This framework will gain more traction in the future as you can access and modify all kinds of graph representation. You can execute Theano for numerous modern backend interface.

Conclusion

PyMC3 helps you solve basic Bayesian statistical predictions and inference problems. In this article, we understand PyMC3 and its uses. Furthermore, we understand how Bayesian Networks and Python help you create a sophisticated model with PyMC3.