Libres pensées d'un mathématicien ordinaire

Back to basics: Student and Barenblatt

William Sealy Gosset (1876 – 1937)

This tiny post is about a class of distributions that plays a role in Statistics, Engineering, and the Analysis of PDEs. They are heavy-tailed and give the Gaussians as weak limits.

Multivariate Student t-distribution. In Statistics, it is the law of the random vector of $\mathbb{R}^d$
\begin{equation}
X:=x_0+\frac{Y}{\sqrt{\frac{Z}{r}} }
\quad\text{where}\quad
\begin{cases}
Y&\sim\mathcal{N}(0,\Sigma)\\
Z&\sim\chi^2(r)=\mathrm{Gamma}(\frac{r}{2},\frac{1}{2})
\end{cases}
\quad\text{are independent}.
\end{equation} Here $x_0\in\mathbb{R}^d$ is a vector (the location), $r>0$ is a real parameter (the degrees of freedom), and $\Sigma$ is a $d\times d$ positive-definite symmetric matrix. The probability density function is
\begin{equation}
x\in\mathbb{R}^d\mapsto
\frac{C}{\bigl(1+\frac{1}{r}(x-x_0)^\top\Sigma^{-1}(x-x_0)\bigr)^{\frac{r+d}{2}}},
\quad
C=\frac{\Gamma(\frac{r+d}{2})}
{\Gamma(\frac{r}{2})\sqrt{\det(r\pi\Sigma)}}.
\end{equation} It has a mean (respectively covariance) iff $r>1$ (respectively $r>2$), given respectively by
\begin{equation}\label{eq:x0Sigma}
x_0\quad \text{and}\quad \frac{r}{r-2}\Sigma.
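\end{equation} The case $r=1$ is also known as the multivariate Cauchy distribution. At fixed $d$, the multivariate Student t-distribution tends as $r\to\infty$ to $\mathcal{N}(x_0,\Sigma)$: by the law of large numbers $\frac{Z}{r}\to1$ in probability, and the Slutsky lemma then gives $X\to x_0+Y$ in law. The law of $X$ is elliptically contoured around $x_0$, and it is rotationally invariant around $x_0$ when $\Sigma$ is a multiple of the identity. The real random variable $\frac{1}{d}(X-x_0)^\top\Sigma^{-1}(X-x_0)$ follows a Fisher-Snedecor F-distribution with $(d,r)$ degrees of freedom, since it is the ratio of the independent variables $\frac{1}{d}Y^\top\Sigma^{-1}Y$ and $\frac{Z}{r}$, with $Y^\top\Sigma^{-1}Y\sim\chi^2(d)$. The Gaussian, the chi-square, the Student t, and the Fisher-Snedecor F distributions are at the heart of classical hypothesis testing in Statistics.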
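
The construction above is easy to simulate. Here is a minimal sketch, using only the Python standard library and assuming the isotropic case $\Sigma=\sigma^2\mathrm{id}_d$. The function name `sample_student` and the numerical values of $x_0$, $\sigma$, $r$ are illustrative choices, not part of the original text. It compares the empirical mean and coordinate variances with $x_0$ and $\frac{r}{r-2}\sigma^2$.

```python
import math
import random


def sample_student(x0, sigma, r, rng):
    """Draw X = x0 + Y / sqrt(Z / r), with Y ~ N(0, sigma^2 I_d) and
    Z ~ chi^2(r) independent (isotropic case only)."""
    y = [rng.gauss(0.0, sigma) for _ in x0]
    # chi^2(r) is Gamma with shape r/2 and rate 1/2, i.e. scale 2;
    # random.gammavariate takes (shape, scale).
    z = rng.gammavariate(r / 2, 2.0)
    scale = math.sqrt(z / r)
    return [a + b / scale for a, b in zip(x0, y)]


rng = random.Random(12345)  # fixed seed so that the run is reproducible
x0, sigma, r, n_samples = [1.0, -2.0, 0.5], 1.5, 10.0, 100000
samples = [sample_student(x0, sigma, r, rng) for _ in range(n_samples)]

d = len(x0)
means = [sum(s[i] for s in samples) / n_samples for i in range(d)]
variances = [
    sum((s[i] - means[i]) ** 2 for s in samples) / n_samples for i in range(d)
]
predicted_variance = r / (r - 2) * sigma ** 2

print("empirical means:    ", means)
print("empirical variances:", variances)
print("predicted variance: ", predicted_variance)
```

The value $r=10$ is chosen so that the fourth moment is finite, which keeps the empirical variance stable. For $r\leq 4$ the same code runs but the variance estimate fluctuates much more.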

In the isotropic case $\Sigma=\sigma^2\mathrm{id}_d$ with $\sigma^2>0$, the density becomes, with $C:=\frac{\Gamma(\frac{r+d}{2})}
{\Gamma(\frac{r}{2})\sqrt{r^d\pi^d\sigma^{2d}}}$,
\begin{equation}
\frac{C}{\bigl(1+\frac{1}{r\sigma^2}\left|x-x_0\right|^2\bigr)^{\frac{r+d}{2}}}
= \frac{1}{(C^{-\frac{2}{r+d}}+C^{-\frac{2}{r+d}}\frac{1}{r\sigma^2}\left|x-x_0\right|^2)^{\frac{r+d}{2}}}.
\end{equation} The Student t-distribution takes its name from an article published in 1908 by William Sealy Gosset in Biometrika under the pseudonym Student (he was an employee of the Guinness brewery). It turns out that the formula above is a special case of the Barenblatt profile.

Barenblatt profiles. These are the probability density functions on $\mathbb{R}^d$ given by
\begin{equation}\label{eq:B}
B(x):={\Bigl(c+\alpha\frac{1-m}{2m}|x|^2\Bigr)_+^{\frac{1}{m-1}}},
\quad
\alpha:=\frac{1}{2-d(1-m)}
=\frac{1}{d(m-\frac{d-2}{d})}.
\end{equation} It makes sense for $m > \frac{d-2}{d}$, which is exactly the condition $\alpha>0$. The constant $c=c_{d,m}>0$ is such that $B$ is normalized to one. When $\frac{d-2}{d} < m <1$, it is a multivariate Student t-distribution (see above). When $m\to 1$, it boils down to a Gaussian. For $m>1$, it has compact support and is related to spherical projections (see below). The Barenblatt profile takes its name from Grigory Isaakovich Barenblatt, who introduced it in 1952 as the profile of an exact self-similar solution of the nonlinear evolution equation $\partial_tu=\Delta(u^m)$, known as the fast diffusion equation when $m < 1$ and as the porous medium equation when $m > 1$. For $m=1$, we recover the heat equation, which has a Gaussian solution.
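
As a worked check of the link with the Student t-distribution, here is a sketch of the parameter matching. It is not taken from the references, and it assumes $x_0=0$ and the isotropic case $\Sigma=\sigma^2\mathrm{id}_d$. Matching the exponent of the isotropic Student density with the exponent of $B$ gives
\begin{equation}
\frac{1}{1-m}=\frac{r+d}{2}
\quad\Longleftrightarrow\quad
m=1-\frac{2}{r+d}
\quad\Longleftrightarrow\quad
r=\frac{2}{1-m}-d=\frac{1}{\alpha(1-m)},
\end{equation}
so that $r>0$ iff $m>\frac{d-2}{d}$. Since $\alpha(1-m)=\frac{1}{r}$, the coefficient of $|x|^2$ in $B$ is $\frac{1}{2mr}$. Writing $C$ for the normalizing constant of the isotropic Student density, and matching the constant term and the coefficient of $|x|^2$, we get
\begin{equation}
c=C^{-\frac{2}{r+d}}
\quad\text{and}\quad
\frac{c}{r\sigma^2}=\frac{1}{2mr},
\quad\text{in other words}\quad
\sigma^2=2mc.
\end{equation}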

Compactly supported Barenblatt profile and projected spherical law. If $X$ is a random vector of $\mathbb{R}^d$, $d\geq2$, uniformly distributed on the sphere $\{x\in\mathbb{R}^d:|x|=R\}$ of radius $R>0$, then for all $1\leq n\leq d-1$, the law of the random vector $Y:=(X_1,\ldots,X_n)$ of $\mathbb{R}^n$ has density
\[
y\in\mathbb{R}^n\mapsto
C(R^2-|y|^2)_+^{\frac{d-n-2}{2}}
=C(R-|y|)^{\frac{d-n-2}{2}}(R+|y|)^{\frac{d-n-2}{2}}\mathbf{1}_{|y|\leq R}
\]where
\[
C:=\frac{\Gamma(\frac{d}{2})}{\pi^{\frac{n}{2}}\,\Gamma(\frac{d-n}{2})\,R^{d-2}}.
\]We recognize a compactly supported Barenblatt profile, with the exponent $p=\frac{d-n-2}{2}$ playing the role of $\frac{1}{m-1}$, in other words a special radial or multivariate symmetric Beta distribution. When $n=1$, this is also known as the Funk–Hecke formula in Harmonic Analysis. We have $Y=\pi_n(X)$ where $\pi_n(x_1,\ldots,x_d)=(x_1,\ldots,x_n)$ is the projection of $\mathbb{R}^d$ onto the first $n$ coordinates. Note that if we use instead the stereographic projection, then we end up with a radially symmetric distribution on $\mathbb{R}^{d-1}$ which is the deformation of a full space Barenblatt profile by a radial power weight.

To make a link with the Gaussian construction at the top of this post, if $Z$ follows the standard Gaussian distribution $\mathcal{N}(0,I_d)$ on $\mathbb{R}^d$ then $U:=Z/|Z|$ follows the uniform distribution on the unit sphere of $\mathbb{R}^d$, and $R(U_1,\ldots,U_n)$ follows the compactly supported Barenblatt profile above. Here we divide a Gaussian vector $Z$ by $|Z|\sim\chi(d)$, but this time the numerator and the denominator are not independent.

In another direction, it is worth noting that the Wigner semicircle distribution on $[-1,1]$ is nothing else but a Beta distribution on $[-1,1]$, namely a univariate Barenblatt distribution, with density proportional to $\sqrt{1-x^2}=(1-x)^{1/2}(1+x)^{1/2}$. It is known to solve a quadratic McKean-Vlasov Fokker-Planck evolution equation, and its Cauchy-Stieltjes transform solves a quadratic complex Burgers evolution equation in relation with free probability.
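
The projected spherical law can be probed numerically. The following sketch uses only the Python standard library and is restricted to the case $n=1$ with $d\geq3$. The function names and the values $d=4$, $R=2$ are illustrative assumptions. It samples the uniform law on the sphere by normalizing a Gaussian vector, then compares the empirical distribution function of the first coordinate with the profile $(R^2-y^2)_+^{\frac{d-3}{2}}$, normalized numerically so that no closed-form constant is needed. With $d=4$ the exponent is $\frac{1}{2}$, which is the semicircle shape.

```python
import math
import random


def uniform_on_sphere(d, radius, rng):
    """Uniform sample on the sphere of the given radius in R^d,
    obtained by normalizing a standard Gaussian vector."""
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(t * t for t in z))
    return [radius * t / norm for t in z]


def profile_cdf(y, d, radius, steps=20000):
    """CDF at y of the density proportional to (R^2 - t^2)^((d-3)/2)
    on [-R, R] (case n = 1, d >= 3), normalized with the midpoint rule."""
    p = (d - 3) / 2
    h = 2 * radius / steps
    total = 0.0
    below = 0.0
    for k in range(steps):
        t = -radius + (k + 0.5) * h
        w = (radius * radius - t * t) ** p
        total += w
        if t <= y:
            below += w
    return below / total


rng = random.Random(0)  # fixed seed so that the run is reproducible
d, radius, n_samples = 4, 2.0, 50000
first_coords = [uniform_on_sphere(d, radius, rng)[0] for _ in range(n_samples)]

y = 0.5 * radius
empirical = sum(v <= y for v in first_coords) / n_samples
predicted = profile_cdf(y, d, radius)
# By symmetry, E[X_1^2] = R^2 / d since X_1^2 + ... + X_d^2 = R^2.
second_moment = sum(v * v for v in first_coords) / n_samples

print("empirical CDF at y:", empirical)
print("profile   CDF at y:", predicted)
print("second moment:", second_moment, "vs R^2/d =", radius ** 2 / d)
```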

Grigory Isaakovich Barenblatt (1927 – 2018)

Further reading.

  • Student (pseudonym of William Sealy Gosset, employee at Guinness brewery)
    The probable error of a mean
    Biometrika (1908)
  • Samuel Kotz and Saralees Nadarajah
    Multivariate t Distributions and Their Applications
    Cambridge University Press (2004)
  • Grigory Isaakovich Barenblatt
    On some unsteady motions of a liquid and gas in a porous medium
    Akad. Nauk SSSR. Prikl. Mat. Meh. (1952)
  • Grigory Isaakovich Barenblatt
    Scaling, self-similarity, and intermediate asymptotics
    Cambridge University Press (1996)
  • Jérôme Demange
    Des équations à diffusion rapide aux inégalités de Sobolev sur les modèles de la géométrie
    Thèse de doctorat, sous la direction de Dominique Bakry (2005)
  • Yan Doumerc
    Matrices aléatoires, processus stochastiques et groupes de réflexions
    Thèse de doctorat, sous la direction de Michel Ledoux (2005)