\( \newcommand{\matr}[1] {\mathbf{#1}} \newcommand{\vertbar} {\rule[-1ex]{0.5pt}{2.5ex}} \newcommand{\horzbar} {\rule[.5ex]{2.5ex}{0.5pt}} \newcommand{\E} {\mathrm{E}} \)
\( \newcommand{\cat}[1] {\mathrm{#1}} \newcommand{\catobj}[1] {\operatorname{Obj}(\mathrm{#1})} \newcommand{\cathom}[1] {\operatorname{Hom}_{\cat{#1}}} \newcommand{\multiBetaReduction}[0] {\twoheadrightarrow_{\beta}} \newcommand{\betaReduction}[0] {\rightarrow_{\beta}} \newcommand{\betaEq}[0] {=_{\beta}} \newcommand{\string}[1] {\texttt{"}\mathtt{#1}\texttt{"}} \newcommand{\symbolq}[1] {\texttt{`}\mathtt{#1}\texttt{'}} \newcommand{\groupMul}[1] { \cdot_{\small{#1}}} \newcommand{\groupAdd}[1] { +_{\small{#1}}} \newcommand{\inv}[1] {#1^{-1} } \newcommand{\bm}[1] { \boldsymbol{#1} } \require{physics} \require{ams} \require{mathtools} \)
Math and science::INF ML AI

Multivariate Gaussian distribution

This card derives the general multivariate normal distribution from the standard multivariate normal distribution.

Standard multivariate Gaussian/normal distribution

Let \( (\Omega, \mathrm{F}, \mathbb{P}) \) be a probability space. Let \( X : \Omega \to \mathbb{R}^K \) be a continuous random vector. \( X \) is said to have a standard multivariate normal distribution iff its joint probability density function is:

\[ f_X(x) = \left(\frac{1}{\sqrt{2\pi}}\right)^{K} e^{-\frac{ x^\mathsf{T}x }{2} } \]
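A quick numeric check of this density, sketched with NumPy/SciPy (the point \( z \) is arbitrary; SciPy's `multivariate_normal` with zero mean and identity covariance is the standard multivariate normal):

```python
import numpy as np
from scipy.stats import multivariate_normal

K = 3
z = np.array([0.5, -1.0, 2.0])  # an arbitrary point in R^K

# The card's formula: (1 / sqrt(2*pi))^K * exp(-z^T z / 2).
f_z = (1 / np.sqrt(2 * np.pi)) ** K * np.exp(-(z @ z) / 2)

# SciPy's multivariate normal with zero mean and identity covariance
# is the standard multivariate normal.
f_ref = multivariate_normal(mean=np.zeros(K), cov=np.eye(K)).pdf(z)

print(np.isclose(f_z, f_ref))  # True
```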

As a vector of random variables

\( X \) can be considered to be a vector of \( K \) independent random variables, each having a standard normal distribution. The proof of this formulation is on the reverse side.

General multivariate

The general multivariate normal distribution is best understood as the distribution that results from applying an affine transformation (a linear map followed by a shift) to a random vector having a standard multivariate normal distribution.

General multivariate normal distribution

Let \( (\Omega, \mathrm{F}, \mathbb{P}) \) be a probability space, and let \( Z : \Omega \to \mathbb{R}^K \) be a random vector with a multivariate standard normal distribution. Let \( X = \mu + \Sigma Z \) be another random vector, where \( \mu \in \mathbb{R}^K \) and \( \Sigma \) is an invertible \( K \times K \) matrix. \( X \) has a density \( f_X : \mathbb{R}^K \to \mathbb{R} \) which is a transformed version of \( Z \)'s density, \( f_Z : \mathbb{R}^K \to \mathbb{R} \):

\[ \begin{alignat*}{3}f_X(x) &= \frac{1}{\text{rescaling factor} } \; &&f_Z(\text{$z$ in terms of $x$}) \\&= \frac{1}{\lvert\det(\Sigma)\rvert} &&f_Z(\Sigma^{-1}(x - \mu)) \\ &= \frac{1}{\lvert\det(\Sigma)\rvert} &&\left(\frac{1}{\sqrt{2\pi}}\right)^{K} e^{\textstyle -\frac{(\Sigma^{-1}(x - \mu))^\mathsf{T}(\Sigma^{-1}(x - \mu))}{2} } \\ \end{alignat*}\]
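A numeric sanity check of this change of variables, sketched with NumPy/SciPy (the matrix below is an arbitrary, almost surely invertible stand-in for \( \Sigma \); note that SciPy parameterizes the distribution by its covariance matrix \( \Sigma\Sigma^\mathsf{T} \), not by \( \Sigma \) itself):

```python
import numpy as np
from scipy.stats import multivariate_normal

K = 3
rng = np.random.default_rng(0)

mu = rng.normal(size=K)
Sigma = rng.normal(size=(K, K))  # an arbitrary (almost surely invertible) transform
x = rng.normal(size=K)           # an arbitrary evaluation point

# Density via the card's change-of-variables formula.
z = np.linalg.solve(Sigma, x - mu)  # Sigma^{-1} (x - mu)
f_x = (1 / np.sqrt(2 * np.pi)) ** K * np.exp(-(z @ z) / 2) / abs(np.linalg.det(Sigma))

# SciPy parameterizes the same distribution by its covariance, Sigma Sigma^T.
f_ref = multivariate_normal(mean=mu, cov=Sigma @ Sigma.T).pdf(x)

print(np.isclose(f_x, f_ref))  # True
```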

Standard multivariate normal as a vector of independent random variables. Proof.

Proof that the above probability density function represents \( K \) independent standard normal random variables. For clarity, only the case \( K = 3 \) is shown:
\[ \begin{aligned} f_Z(z) &= \frac{1}{(2\pi)^{\frac{3}{2}}} e^{-\frac{z^\mathsf{T}z}{2}} \\
&= \frac{1}{(2\pi)^{\frac{1}{2}}} \frac{1}{(2\pi)^{\frac{1}{2}}} \frac{1}{(2\pi)^{\frac{1}{2}}} e^{-\frac{z^\mathsf{T}z}{2}} \\
&= \frac{1}{(2\pi)^{\frac{1}{2}}} \frac{1}{(2\pi)^{\frac{1}{2}}} \frac{1}{(2\pi)^{\frac{1}{2}}} e^{-\frac{z_1^2 + z_2^2 + z_3^2}{2}} \\
&= \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{2\pi}} e^{-\frac{z_1^2}{2}} e^{-\frac{z_2^2}{2}} e^{-\frac{z_3^2}{2}} \\
&= \frac{1}{\sqrt{2\pi}} e^{-\frac{z_1^2}{2}} \frac{1}{\sqrt{2\pi}} e^{-\frac{z_2^2}{2}} \frac{1}{\sqrt{2\pi}} e^{-\frac{z_3^2}{2}} \\&= f(z_1) f(z_2) f(z_3) \end{aligned}\]

Thus, the probability distribution for \( Z \) represents 3 independent random variables, each having the individual density \( \frac{1}{\sqrt{2\pi}}e^{-\frac{z_i^2}{2}} \), which is the standard normal distribution for a single random variable.
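The factorization can be restated numerically (SciPy's `norm.pdf` is the univariate standard normal density):

```python
import numpy as np
from scipy.stats import norm

z = np.array([0.3, -1.2, 0.7])

# The joint density from the card's formula...
joint = (1 / np.sqrt(2 * np.pi)) ** 3 * np.exp(-(z @ z) / 2)

# ...equals the product of three univariate standard normal densities.
print(np.isclose(joint, np.prod(norm.pdf(z))))  # True
```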

Elementwise conceptualization

When the covariance matrix \( V \) is diagonal, the quantity \( [X - \mu]^\mathsf{T} V^{-1} [X-\mu] \) in the exponent can be thought of as a component-wise operation, sum(\( \frac{[X - \mu ]^2}{\sigma^2} \)), where \( \sigma \) is the column vector of standard deviations and the squaring and division are element-wise. Another alternative is to imagine the component-wise methods that would be called in a linear algebra/DL library: accumulate(mult(invert(pow(\( \sigma \), 2)), pow(\( X - \mu \), 2))).
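A sketch of this element-wise view in NumPy, using a hypothetical diagonal covariance (the equivalence only holds when the covariance matrix is diagonal):

```python
import numpy as np

mu = np.array([1.0, -2.0, 0.5])
sigma = np.array([0.5, 2.0, 1.5])  # per-component standard deviations
V = np.diag(sigma ** 2)            # diagonal covariance matrix
x = np.array([1.2, 0.0, -0.3])

# Full quadratic form with the inverse covariance matrix...
quad = (x - mu) @ np.linalg.inv(V) @ (x - mu)

# ...matches the element-wise accumulate(mult(invert(pow(sigma, 2)), pow(x - mu, 2))).
elementwise = np.sum((x - mu) ** 2 / sigma ** 2)

print(np.isclose(quad, elementwise))  # True
```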

\(V = \Sigma^{\mathsf{T}} \Sigma \) form. Proof.

The Statlect formulation simplifies the expression.
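A sketch of the simplification, reconstructed here rather than copy-pasted, assuming the covariance matrix is \( V = \Sigma\Sigma^\mathsf{T} \) (which coincides with \( \Sigma^\mathsf{T}\Sigma \) when \( \Sigma \) is symmetric, e.g. the matrix square root of \( V \)):

\[ \begin{aligned} (\Sigma^{-1}(x-\mu))^\mathsf{T}(\Sigma^{-1}(x-\mu)) &= (x-\mu)^\mathsf{T} (\Sigma^{-1})^\mathsf{T} \Sigma^{-1} (x-\mu) \\ &= (x-\mu)^\mathsf{T} (\Sigma\Sigma^\mathsf{T})^{-1} (x-\mu) \\ &= (x-\mu)^\mathsf{T} V^{-1} (x-\mu), \end{aligned} \]

and \( \lvert\det(\Sigma)\rvert = \sqrt{\det(\Sigma)\det(\Sigma^\mathsf{T})} = \sqrt{\det(V)} \), so the density becomes the familiar

\[ f_X(x) = \frac{1}{\sqrt{(2\pi)^K \det(V)}} \, e^{-\frac{(x-\mu)^\mathsf{T} V^{-1} (x-\mu)}{2}}. \]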

Dispelling the mystery of the covariance matrix

Here is a screenshot of some notes on the perspective of a multivariate Gaussian random variable being a composition of two standard normal variables. It tries to dispel some of the mystery about why the covariance matrix comes up the way it does.



Example
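A minimal sketch of the construction \( X = \mu + \Sigma Z \) in NumPy (all values hypothetical); the sample covariance of the draws should approach \( \Sigma\Sigma^\mathsf{T} \):

```python
import numpy as np

rng = np.random.default_rng(42)
K, N = 2, 100_000

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.0],
                  [1.0, 0.5]])  # a hypothetical invertible transform

Z = rng.standard_normal(size=(N, K))  # N draws from the standard multivariate normal
X = mu + Z @ Sigma.T                  # X = mu + Sigma Z, applied row-wise

# The sample covariance should approach V = Sigma Sigma^T = [[4, 2], [2, 1.25]].
print(np.cov(X, rowvar=False))
print(Sigma @ Sigma.T)
```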