Probability & Statistics in Quant Finance

TQT 2024

Siddharth Vishwanath

A little about me

Visiting Assistant Professor at the Department of Mathematics
PhD in Statistics from Penn State
Quant research at Goldman Sachs
Model validation Quant at Nomura

A little bit about my research

Broadly, I’m interested in uncertainty quantification & inference for complex systems

You discover circles? 🥱 😴 What does that have to do with finance?

ARIMA(1,1,2)

Cox-Ingersoll-Ross

Statistics & finance

Introduction

A Tale of Two Probabilities

${\mathbb P}$ vs. $\widetilde{\mathbb P}$

The forward probability ${\mathbb P}$

Goal: Model the future
Uses: Risk management, investing
Reference: the “real world” probability
Machinery: High-dimensional statistics, machine learning, etc.

The risk neutral probability $\widetilde{\mathbb P}$

Goal: Extrapolate the present
Uses: Pricing, hedging
Reference: the “risk neutral” probability
Machinery: Itô calculus, PDEs

What is the risk-neutral probability?

Consider the following setup:

For time points \[ 0=t_0 < t_1 < t_2 < \dots < t_n \]

At each time $t_n$ you have access to:
- A stock with price $S_n$.
- A bond (risk-free asset) with price $B_n$

From time $t_n$ to $t_{n+1}$ the bond always gives you the risk-free return, i.e., \[ B_{n+1} = (1+r)B_n \] where $r$ is the interest rate, for example the $0.1\%$ your bank gives you 🤣

Portfolio

A portfolio is a collection of assets you own at any given time.

In this case, a portfolio is two numbers $\alpha_n, \beta_n$, the number of shares of the stock and of the bond you hold (a negative number means a short or borrowed position), such that \[ V_n = \alpha_n S_n + \beta_n B_n \] where $V_n$ is your net value at time $t_n$.

Your investment strategy is simple.

At each time $t_n$, choose values $\alpha_{n+1}, \beta_{n+1}$ such that your net value $V_{n+1} = \alpha_{n+1} S_{n+1} + \beta_{n+1} B_{n+1}$ is maximized 💰🤑💰

European Call Option

A European call option $X$ is a derivative where the payoff at time $t_n$ is \[ X_n = \max(S_n - K, 0) \] where $K$ is called the strike price.

Now consider the stock price $S_n$ at time $t_n$. At time $t_{n+1}$ suppose one of two things can happen:

\[ S_{n+1} = \begin{cases} (1+u) \times S_n & \text{with probability } p\\ (1-d) \times S_n & \text{with probability } 1-p \end{cases} \]

Here $p$ is the “real world” probability of an up move.

Since $X_n$ depends on $S_n$, let’s choose a portfolio which mimics the payoff of $X_{n+1}$ at time $t_{n+1}$, i.e.,

\[ \begin{aligned} X^u_{n+1} &= \Big(\alpha_{n+1} \times (1+u) S_n\Big) + \Big(\beta_{n+1} \times (1+r) B_n\Big)\\ X^d_{n+1} &= \Big(\alpha_{n+1} \times (1-d) S_n\Big) + \Big(\beta_{n+1} \times (1+r) B_n\Big) \end{aligned} \]

Here, we can solve for $\alpha_{n+1}$ and $\beta_{n+1}$

\[ \begin{aligned} \alpha_{n+1} &= \frac{X^u_{n+1} - X^d_{n+1}}{(u+d)S_n}\\ \beta_{n+1} &= \frac{1}{(1+r)B_n}\Big(\frac{(1+u)X^d_{n+1} - (1-d)X^u_{n+1}}{u+d}\Big) \end{aligned} \]

Since you buy $\alpha_{n+1}$ shares of the stock and $\beta_{n+1}$ shares of the bond at time $t_n$, your net value at time $t_{n}$ needs to be \[ \begin{aligned} V_n &= \alpha_{n+1} S_n + \beta_{n+1} B_n\\ &= \dots\\ &= \frac{1}{1+r} \Big( \frac{r+d}{u+d} X^u_{n+1} + \frac{u-r}{u+d} X^d_{n+1} \Big )\\ &= \frac{1}{1+r} \Big( \tilde p X^u_{n+1} + (1-\tilde p) X^d_{n+1} \Big ) \end{aligned} \]

In other words, \[ \begin{aligned} &\overbrace{(1+r) V_n}^{\text{If you took the money and invested it all in bonds at time $t_n$}}\\ &= \underbrace{\tilde p X^u_{n+1} + (1-\tilde p) X^d_{n+1}={\mathbb E}_{\tilde p}(X_{n+1})}_{\text{expected returns from the call option at time $t_{n+1}$}} \end{aligned} \]

Here $\tilde p = \frac{r+d}{u+d}$ is the risk-neutral probability.
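The one-step argument above can be checked numerically. The sketch below is only an illustration: the function names and the numbers ($S_n = 100$, $B_n = 1$, $u = d = 0.1$, $r = 0.02$, $K = 100$) are assumptions, not values from the slides. It solves the two replication equations, with the bond holding counted in units of a bond priced at $B_n$, and compares the cost of the replicating portfolio with the discounted expectation under $\tilde p$.

```python
def replicate_one_step(s_n, b_n, u, d, r, x_up, x_down):
    """Solve the two replication equations for the stock and bond holdings."""
    alpha = (x_up - x_down) / ((u + d) * s_n)
    beta = ((1 + u) * x_down - (1 - d) * x_up) / ((u + d) * (1 + r) * b_n)
    return alpha, beta


def risk_neutral_price(u, d, r, x_up, x_down):
    """Discounted expectation of the payoff under p_tilde = (r + d) / (u + d)."""
    p_tilde = (r + d) / (u + d)
    return (p_tilde * x_up + (1 - p_tilde) * x_down) / (1 + r)


# Hypothetical inputs, chosen only for illustration.
s_n, b_n = 100.0, 1.0
u, d, r = 0.10, 0.10, 0.02
strike = 100.0

# Call payoffs in the up and down states.
x_up = max((1 + u) * s_n - strike, 0.0)
x_down = max((1 - d) * s_n - strike, 0.0)

alpha, beta = replicate_one_step(s_n, b_n, u, d, r, x_up, x_down)
v_replication = alpha * s_n + beta * b_n
v_risk_neutral = risk_neutral_price(u, d, r, x_up, x_down)

# The portfolio reproduces the option payoff in both states.
payoff_up = alpha * (1 + u) * s_n + beta * (1 + r) * b_n
payoff_down = alpha * (1 - d) * s_n + beta * (1 + r) * b_n

print(alpha, beta, v_replication, v_risk_neutral)
```

With these inputs the bond holding comes out negative, i.e., the replication borrows at the risk-free rate, and the real-world probability $p$ never enters the calculation.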

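When the stock price $S_n$ doesn’t just go up/down but can take a range of values (like a normal distribution), e.g., \[ S_{n+1} \mid S_n \sim N(S_n, \sigma^2) \equiv {\mathbb P} \] then $\widetilde{\mathbb P}$ shifts the mean so that the stock earns the risk-free rate on average, e.g., \[ S_{n+1} \mid S_n \sim N\big((1+r)S_n, \sigma^2\big) \equiv \widetilde{\mathbb P} \]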

When the time step $t_{n+1} - t_n \approx dt$ becomes infinitesimally small, everything moves to continuous time:
- You have to model the assets using stochastic differential equations (SDEs)
- Brownian Motion $W_t$
- Girsanov’s theorem: There exists a risk-neutral Brownian Motion $\tilde W_t$ associated with the real-world Brownian Motion $W_t$
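As a sketch of what Girsanov’s theorem gives in the simplest case, assume the stock follows a geometric Brownian motion with drift $\mu$ and volatility $\sigma$ under ${\mathbb P}$, and that $r$ is a continuously compounded rate (the model and the symbol $\mu$ are illustrative assumptions, not taken from the slides). Then:

\[ \begin{aligned} dS_t &= \mu S_t \, dt + \sigma S_t \, dW_t && \text{(under ${\mathbb P}$)}\\ \tilde W_t &= W_t + \frac{\mu - r}{\sigma} \, t\\ dS_t &= r S_t \, dt + \sigma S_t \, d\tilde W_t && \text{(under $\widetilde{\mathbb P}$)} \end{aligned} \]

Substituting the second line into the first gives the third: the drift $\mu$ is replaced by $r$ while the volatility $\sigma$ is unchanged, which is the continuous-time analogue of replacing $p$ by $\tilde p$.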

Derivative Pricing

Use the real-world probability ${\mathbb P}$ to model the stock price $S_t$.
Use the risk-neutral probability $\widetilde{\mathbb P}$ to price the derivative $X_t$.
The price of the derivative $X_0$ at time $t=0$ is the discounted expected value of its payoff $X_T$ at time $T$ under the risk-neutral probability $\widetilde{\mathbb P}$, i.e.,

\[ X_0 = e^{-rT} \times {\mathbb E}_{\widetilde{\mathbb P}}(X_T) \]
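One way to read this formula is as a recipe for simulation. The sketch below estimates the price of a European call by Monte Carlo, assuming the stock is lognormal under $\widetilde{\mathbb P}$ with constant volatility $\sigma$, i.e., $S_T = S_0 \exp\big((r - \sigma^2/2)T + \sigma\sqrt{T} Z\big)$ with $Z \sim N(0,1)$. That model, the function name `mc_call_price`, and the input values are assumptions made for illustration.

```python
import math
import random


def mc_call_price(s0, strike, r, sigma, maturity, n_paths=200_000, seed=0):
    """Monte Carlo estimate of exp(-rT) * E[max(S_T - K, 0)] under the risk-neutral measure."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * maturity
    vol = sigma * math.sqrt(maturity)
    total_payoff = 0.0
    for _ in range(n_paths):
        # Terminal stock price under the assumed risk-neutral lognormal model.
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total_payoff += max(s_t - strike, 0.0)
    # Discount the average payoff back to time 0.
    return math.exp(-r * maturity) * total_payoff / n_paths


# Hypothetical inputs: at-the-money call, 5% rate, 20% volatility, 1 year.
mc_price = mc_call_price(s0=100.0, strike=100.0, r=0.05, sigma=0.2, maturity=1.0)
print(round(mc_price, 2))
```

The estimate carries sampling error that shrinks like $1/\sqrt{\text{n\_paths}}$, so it should land near, but not exactly on, the closed-form value for the same model.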

Example: The Black-Scholes-Merton formula

\[ X_0 = S_0 \times \Phi(d_1) - K \times e^{-rT} \times \Phi(d_2) \] where $\Phi$ is the standard normal CDF, $\sigma$ is the volatility of the stock, $d_2 = d_1 - \sigma\sqrt{T}$, and \[ d_1 = \frac{\log(S_0/K) + (r + \sigma^2/2)T}{\sigma\sqrt{T}} \]
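The formula translates almost line by line into code. This is a minimal sketch using only the standard library; the function names and the example inputs are assumptions, and $\Phi$ is computed from the error function.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_call(s0, strike, r, sigma, maturity):
    """Black-Scholes-Merton price of a European call."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * maturity) / (
        sigma * math.sqrt(maturity)
    )
    d2 = d1 - sigma * math.sqrt(maturity)
    return s0 * norm_cdf(d1) - strike * math.exp(-r * maturity) * norm_cdf(d2)


# Hypothetical inputs: at-the-money call, 5% rate, 20% volatility, 1 year.
call_price = bsm_call(s0=100.0, strike=100.0, r=0.05, sigma=0.2, maturity=1.0)
print(round(call_price, 4))
```

For these inputs $d_1 = 0.35$ and $d_2 = 0.15$, and the price is roughly $10.45$.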

Statistical Modeling

What about ${\mathbb P}$?

The whole discussion about $\widetilde{\mathbb P}$ is based on the assumption that we know the real-world probability ${\mathbb P}$.
But how do we know ${\mathbb P}$? 🤔

We don’t. We have to estimate it from data.

Estimating ${\mathbb P}$

Let’s look at a simple example where we only have one asset $r_t$ at time $t$. Here:

$r_t$ is a (stochastic) interest rate
It:
- is used to evaluate bond prices
- is used to create interest rate swaps, and
- underlies almost every other financial derivative.

A common model for $r_t$ is to assume it follows a Vašíček model, i.e., \[ dr_t = (\alpha - \beta r_t) dt + \sigma dW_t \] where $W_t$ is a Brownian motion.

In other words, for a small time step $\Delta t$, approximately

\[ r_{t+ \Delta t} \mid r_t \sim N\Big(r_t + (\alpha - \beta r_t) \cdot \Delta t, \ \ \ \sigma^2 \cdot \Delta t\Big) \]
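The normal step above is also a recipe for simulating a path. The sketch below draws each new rate from that conditional normal distribution; the function name `simulate_vasicek` and the parameter values (daily steps, $\alpha = 0.1$, $\beta = 2$, $\sigma = 0.02$) are assumptions chosen for illustration.

```python
import math
import random


def simulate_vasicek(r0, alpha, beta, sigma, dt, n_steps, seed=0):
    """Simulate a rate path by repeatedly drawing from the one-step normal distribution."""
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n_steps):
        r_t = path[-1]
        mean = r_t + (alpha - beta * r_t) * dt
        std = sigma * math.sqrt(dt)
        path.append(rng.gauss(mean, std))
    return path


# Hypothetical parameters: 20 years of daily steps.
dt = 1.0 / 252.0
path = simulate_vasicek(r0=0.03, alpha=0.1, beta=2.0, sigma=0.02, dt=dt, n_steps=252 * 20)

# After the first year, the path should hover around alpha / beta = 0.05.
later_rates = path[252:]
average_rate = sum(later_rates) / len(later_rates)
print(round(average_rate, 4))
```

The drift pulls $r_t$ toward $\alpha/\beta$ whenever $\beta > 0$, which is why the long-run average settles near that level.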

For times $t_0 < t_1 < t_2 < \dots < t_n$
You collect data $r_{t_0}, r_{t_1}, r_{t_2}, \dots, r_{t_n}$
How do you estimate the parameters $\alpha, \beta, \sigma$?

Let $f(r_{t_i} \mid r_{t_{i-1}}; \alpha, \beta, \sigma)$ be the probability density function of $r_{t_i}$ given the previous observation $r_{t_{i-1}}$, i.e., the normal density above with $\Delta t = t_i - t_{i-1}$

Then the likelihood of the data (given $r_{t_0}$) is \[ L(\alpha, \beta, \sigma) = \prod_{i=1}^n f(r_{t_i} \mid r_{t_{i-1}}; \alpha, \beta, \sigma) \]

The maximum likelihood estimate (MLE) of $\alpha, \beta, \sigma$ is:

\[ \hat \alpha, \hat \beta, \hat \sigma = \arg\max_{\alpha, \beta, \sigma} L(\alpha, \beta, \sigma) \]

This is a standard optimization problem.
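For the normal one-step approximation with equally spaced observations, the optimization can even be done by hand: maximizing the Gaussian likelihood is the same as a least-squares regression of the increments $r_{t_{i+1}} - r_{t_i}$ on $r_{t_i}$. The sketch below takes that shortcut rather than calling a numerical optimizer. The function names, the simulated data, and the parameter values are assumptions for illustration.

```python
import math
import random


def simulate_vasicek(r0, alpha, beta, sigma, dt, n_steps, seed=0):
    """Generate synthetic data from the one-step normal approximation."""
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n_steps):
        r_t = path[-1]
        path.append(rng.gauss(r_t + (alpha - beta * r_t) * dt, sigma * math.sqrt(dt)))
    return path


def fit_vasicek_mle(rates, dt):
    """MLE under the normal one-step model: regress increments on the current rate."""
    x = rates[:-1]
    y = [nxt - cur for cur, nxt in zip(rates[:-1], rates[1:])]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx                 # estimates -beta * dt
    intercept = y_bar - slope * x_bar   # estimates alpha * dt
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    alpha_hat = intercept / dt
    beta_hat = -slope / dt
    sigma_hat = math.sqrt(rss / (n * dt))  # MLE of the variance divides by n
    return alpha_hat, beta_hat, sigma_hat


# Hypothetical "true" parameters used to create the data.
dt = 1.0 / 252.0
rates = simulate_vasicek(r0=0.03, alpha=0.1, beta=2.0, sigma=0.02, dt=dt, n_steps=50_000)
alpha_hat, beta_hat, sigma_hat = fit_vasicek_mle(rates, dt)
print(alpha_hat, beta_hat, sigma_hat)
```

In runs like this, $\hat\sigma$ is typically pinned down very tightly while $\hat\beta$ remains noisy even with a long history, which is one reason uncertainty quantification matters here.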

The statistical advantage

You have data $x_1, x_2, \dots, x_t$
You assume a model $x_{t + \Delta t} = F(x_t \mid \theta)$
- Where $\theta$ are some parameters with interpretation

If you are willing to dip your toes into the math, you can:
- Estimate $\hat\theta$ using a dinosaur computer
- Use $\hat\theta$ to make predictions about the future
- Quantify how much uncertainty you have in your predictions
- Quantify the effect that changing $\hat\theta \mapsto \hat\theta + \Delta\theta$ has on your predictions, etc.
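As a small illustration of the last three points, the sketch below reuses the normal one-step interest rate model as the example of $F(x_t \mid \theta)$, with $\theta = (\alpha, \beta, \sigma)$. The function name and all parameter values are hypothetical, and the interval treats $\hat\theta$ as known, so it ignores estimation error in the parameters.

```python
import math


def one_step_forecast(r_t, alpha, beta, sigma, dt, z=1.96):
    """Predictive mean and an approximate 95% interval for the next observation."""
    mean = r_t + (alpha - beta * r_t) * dt
    half_width = z * sigma * math.sqrt(dt)
    return mean, mean - half_width, mean + half_width


dt = 1.0 / 252.0

# Prediction and uncertainty with hypothetical fitted parameters.
mean, lower, upper = one_step_forecast(r_t=0.03, alpha=0.1, beta=2.0, sigma=0.02, dt=dt)

# Sensitivity: bump beta and see how the prediction moves.
bumped_mean, _, _ = one_step_forecast(r_t=0.03, alpha=0.1, beta=2.5, sigma=0.02, dt=dt)

print(mean, (lower, upper), bumped_mean - mean)
```

Because each parameter has an interpretation, the effect of the bump is easy to read: a larger $\beta$ with the same $\alpha$ lowers the level $\alpha/\beta$ the rate is pulled toward, so the predicted rate moves down.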

21st Century Forecasting

You have data $x_1, x_2, \dots, x_t$
You take a deep learning architecture $x_t = F(t \mid \theta)$
- Where $\theta$ are some parameters of the network

If you have enough compute, you can:
- Estimate $\hat\theta$ using state of the art GPUs
- Use $\hat\theta$ to make predictions about the future
- But it comes at the price of uncertainty quantification 😭
- But you don’t have to worry about the math 🙂👍

Which is better?

The answer is: it depends.

Philosophically:

The first method is based on assumptions.
- Assumptions have consequences!

The second method is based on data.
- Garbage in, garbage out!

Realistically:

If you are a mathematician, you might enjoy the first approach.
- It’s like solving a puzzle.

If you are a computer scientist, you might enjoy the second approach.
- It’s like playing a video game.


Questions?
