Simulating a Brownian motion with tensorflow probability

Dec 20, 2020 • Written by Rene-Jean Corneille

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import tensorflow as tf
import tensorflow_probability as tfp

from tensorflow_probability.python.internal import assert_util
from tensorflow_probability.python.internal import tensor_util
from tensorflow_probability.python.math.psd_kernels.internal import util

Tensorflow has gained popularity in the past five years because of its applications in deep learning, but at its core it is a numerical computation library. It was historically described as a "library for numerical computation", while the most recent iterations present it as a machine learning library, which quite frankly understates how versatile it is.

Besides the main tensorflow library, other libraries have flourished that augment tensorflow's capabilities:

  • tensorflow-text: adds better text processing capabilities
  • tensorflow-addons: the most recent advances in deep learning can be found in this library
  • tensorflow-io: I have used this library mainly to create tf.data.Dataset objects from a bigquery connection (which is very convenient)
  • tensorflow-gan: defines wrapper objects (very similar to a tf.Estimator) for generative adversarial networks
  • tensorflow-probability: adds probabilistic computing capabilities.

I have grown particularly interested in tensorflow probability. I am still exploring what the library can do, but one thing is sure: I am very optimistic! I remember working on a few Bayesian estimation projects during my graduate studies and seeing first-hand how computationally expensive these methods can be. Fast forward a few years and the use of accelerator hardware has become more and more common (I don't think it is mainstream yet, but it is getting there). Since research and advances in deep learning have made it possible to train ever larger models, I wonder what the possibilities are for Bayesian inference if these tools can be adapted to it. The growing number of probabilistic programming libraries for python will make Bayesian methods, which until a few years ago were prohibitive because of their computational cost, more and more accessible. The reason I am excited about tensorflow probability is that Google has invested an incredible amount of resources to develop and continuously improve tensorflow, and this will ultimately benefit anyone interested in probabilistic programming (I wonder how much easier those projects would have been with tensorflow probability and a GPU).

Anyway, during my experiments I found out that tensorflow probability has a tfp.distributions.GaussianProcess class. Gaussian processes are an extension of the normal distribution to function spaces. They are powerful models for Bayesian statistics as they provide flexible priors over functions, but simulating them is quite expensive (a topic that comes back often in probabilistic programming).

A stochastic process is defined as a collection of random variables $\{h(\mathbf{x}), \mathbf{x}\in X\}$ with $X\subset\mathbb{R}^d$ ($d\geq1$) referred to as the input space. The process is said to be Gaussian if all of its finite-dimensional distributions are Gaussian, i.e.:

$$\forall x_1,...,x_N \in X, \quad \mathbf{h} = [h(x_1),...,h(x_N)]^T \sim \mathcal{N}(\mathbf{m},\mathbf{K})$$

with:

$$m_i = m(x_i) = \mathbb{E}[h(x_i)]$$

$$K_{i,j} = k(x_i,x_j) = \mathbb{E}[(h(x_i) - m(x_i))(h(x_j) - m(x_j))]$$

where $m: X \rightarrow \mathbb{R}$ is called the mean function and $k: X \times X \rightarrow \mathbb{R}$ the kernel (or covariance) function. Gaussian processes have been particularly successful in time series modelling and are the basis of the theory of continuous stochastic calculus, with many applications (finance, physics, ...). For instance, the Wiener process is a Gaussian process with input space $\mathbb{R}^+$, mean function $m = 0$ and kernel function $k(s,t) = \min(s,t)$.
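
To make the Wiener process example concrete, here is a small numpy sketch (an illustration of mine, not part of the derivation above) of the covariance matrix implied by $k(s,t) = \min(s,t)$ for three time points:

import numpy as np

# Covariance matrix of a standard Wiener process observed at t = 0.25, 0.5, 1.0,
# built entry-wise from the kernel k(s, t) = min(s, t).
times = np.array([0.25, 0.5, 1.0])
K = np.minimum(times[:, None], times[None, :])
# K = [[0.25, 0.25, 0.25],
#      [0.25, 0.50, 0.50],
#      [0.25, 0.50, 1.00]]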

And this is the trick I use to simulate a Brownian motion with the tensorflow probability API. A Gaussian process must be initialized with a kernel function; the abstractions for kernel functions can be found in the tfp.math.psd_kernels module (positive semi-definite kernels). I subclassed the PositiveSemidefiniteKernel class to add a "Brownian kernel", i.e. $k(s,t) = \sigma^2 \min(s,t)$.

class BrownianKernel(tfp.math.psd_kernels.PositiveSemidefiniteKernel):
    """Brownian Kernel
    """

    def __init__(self,
                volatility=1.0,
                feature_ndims=1,
                validate_args=False,
                parameters=None,
                name="Brownian"):
        """Construct a Brownian kernel instance.
        Notes:
            As tensorflow_probability already defines a GaussianProcess distribution,
            we only need to define a custom kernel and pass it to that distribution.
            
        Args:
            volatility: Positive Python `float` or `Tensor`, the volatility
                parameter of the kernel.
                Default Value: 1.0
            feature_ndims: Python `int` number of rightmost dims to include in
                kernel computation.
                Default Value: 1
            validate_args: If `True`, parameters are checked for validity despite
                possibly degrading runtime performance.
                Default Value: `False`
            parameters: For subclasses, a dict of constructor arguments.
            name: Python `str` name prefixed to Ops created by this class.
                Default Value: `'Brownian'`
        """
        parameters = dict(locals()) if parameters is None else parameters
        with tf.name_scope(name):
            dtype = util.maybe_get_common_dtype([volatility])
            self._volatility = tensor_util.convert_nonref_to_tensor(
                volatility, name='volatility', dtype=dtype)
        super(BrownianKernel, self).__init__(
            feature_ndims,
            dtype=dtype,
            name=name,
            validate_args=validate_args,
            parameters=parameters)

    @property
    def volatility(self):
        """Volatility parameter."""
        return self._volatility

    def _apply(self, x1, x2, example_ndims=0):
        volatility = tf.convert_to_tensor(self.volatility)
        volatility = util.pad_shape_with_ones(
            volatility, example_ndims + self.feature_ndims)
        return volatility * volatility * util.sum_rightmost_ndims_preserving_shape(
            tf.math.minimum(x1, x2),
            ndims=self.feature_ndims)

    def _parameter_control_dependencies(self, is_init):
        if not self.validate_args:
            return []
        assertions = []
        for arg_name, arg in dict(volatility=self.volatility).items():
            if arg is not None and is_init != tensor_util.is_ref(arg):
                assertions.append(assert_util.assert_positive(
                    arg,
                    message='{} must be positive.'.format(arg_name)))
        return assertions
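
As a quick sanity check of the implementation (an illustrative snippet of mine, not from the original post), the kernel's apply method should return $\sigma^2 \min(s,t)$:

# Quick check: k(s, t) should equal volatility^2 * min(s, t).
kernel = BrownianKernel(volatility=2.0)
s = tf.constant([[0.3], [0.8]])
t = tf.constant([[0.7], [0.5]])
print(kernel.apply(s, t).numpy())
# [1.2, 2.0], i.e. 4 * min(0.3, 0.7) and 4 * min(0.8, 0.5)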

With the kernel defined, we can instantiate a Gaussian process with it and simulate one Brownian sample path.

When computing the standard deviation of the one-step returns of the simulated path, I get 0.02386895, which is close to what I expected. Looking at the tfp code, I see that sampling works by building the marginal distribution of the process at the requested time steps (which is, by definition, a multivariate Gaussian) and drawing from it.

The snippet below runs the simulation and plots the sample path with seaborn and matplotlib:

# Brownian kernel with volatility 0.02 over the time grid [0, 1].
brownian_kernel = BrownianKernel(volatility=0.02)

# 1000 evenly spaced time points, shaped [1000, 1] to match feature_ndims=1.
steps = tf.expand_dims(tf.linspace(0.0, 1.0, 1000), 1)

gp = tfp.distributions.GaussianProcess(brownian_kernel, steps)

# Draw one sample path.
experiment = tf.squeeze(gp.sample()).numpy()
df = pd.DataFrame(
    data={
        "steps": tf.squeeze(steps).numpy(),
        "simulation": experiment
    }
)
sns.lineplot(x="steps", y="simulation", data=df)
plt.show()
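
As an extra sanity check (again a sketch of mine, not part of the original experiment), the spread of the terminal values across many sampled paths should be close to the volatility parameter, since the kernel gives $\mathrm{Var}[h(1)] = \sigma^2 \min(1,1) = \sigma^2$:

# Draw several paths at once; the sample standard deviation of the values at
# t = 1 should be close to the volatility parameter (0.02 here).
paths = gp.sample(500).numpy()   # shape: [500, 1000]
print(paths[:, -1].std())        # expected to be roughly 0.02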

What's next

This is cool, but it's just a simple simulation. The next step is to look into parameter estimation using Bayesian sampling methods. Since I used the kernel trick instead of defining a Brownian motion distribution from scratch, I think this will make it easier to use the built-in tfp MCMC classes (a rough sketch of what that could look like is below).
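
Here is a rough sketch of that idea, reusing the BrownianKernel, steps and experiment defined above. The Normal prior on the log-volatility, the HMC step size and the chain lengths are arbitrary illustrative choices, and a coarser time grid would make this much faster since every leapfrog step factorizes the full covariance matrix.

# Sketch: sample the posterior of the (log-)volatility given the simulated
# path with the built-in Hamiltonian Monte Carlo kernel.

def target_log_prob_fn(log_volatility):
    # Unnormalized posterior: Normal prior on log-volatility plus the
    # Gaussian process likelihood of the observed path.
    prior = tfp.distributions.Normal(loc=-4.0, scale=2.0)
    kernel = BrownianKernel(volatility=tf.exp(log_volatility))
    gp = tfp.distributions.GaussianProcess(kernel, steps)
    return prior.log_prob(log_volatility) + gp.log_prob(experiment)

hmc = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=target_log_prob_fn,
    step_size=0.05,
    num_leapfrog_steps=3)

samples, is_accepted = tfp.mcmc.sample_chain(
    num_results=500,
    num_burnin_steps=200,
    current_state=tf.constant(-4.0),
    kernel=hmc,
    trace_fn=lambda _, results: results.is_accepted)

# The posterior mean of the volatility should be in the vicinity of 0.02.
print(tf.exp(samples).numpy().mean())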

Ideally, I am looking into implementing a diffusion process distribution (and maybe a jump-diffusion process) that can be sampled from and whose parameters can be estimated using MCMC methods! This can be particularly useful in financial applications. Then I want to run some experiments to measure the speed-ups of Monte Carlo methods running on CPU, GPU and TPU.

