# Libres pensées d'un mathématicien ordinaire Posts

Université Paris-Dauphine owes its name to its proximity to the porte Dauphine. According to some sources [1], this gate owes its name to the avenue Dauphine. The avenue Dauphine, a former road of the communes of Passy and Neuilly, is said to have been created in 1826 in honor of the dauphine of France, Marie-Thérèse, duchess of Angoulême (1778-1851). Its extension to the bois de Boulogne was reworked in 1854 when the avenue de l’Impératrice was created (renamed avenue du Bois in 1875, then avenue Foch in 1929). Since 1864 it has borne the name of Marshal Bugeaud (1784-1849).

According to other sources [2], the porte Dauphine owes its name to Marie-Antoinette, wife of the dauphin and future Louis XVI, who is said to have had it built at the end of the rue de la Faisanderie, around 1770, when she was living at the château de la Muette. Would it then rather be the gate that gave its name to the former avenue?

Note. The idea of writing this post came from a pertinent question, asked by a newly recruited colleague, about the exact origin of the name of the Paris gate that gave its name to the university.

This post is devoted to a quick proof of a version of the law of large numbers for non-negative random variables in $\mathrm{L}^2$. It requires neither independence nor identical distribution. It is a variant of the famous one for independent random variables bounded in $\mathrm{L}^4$.

The statement. If $X_1,X_2,\ldots$ are non-negative $\mathrm{L}^2$ random variables on $(\Omega,\mathcal{A},\mathbb{P})$ such that, with $S_n:=X_1+\cdots+X_n$, $\lim_{n\to\infty}\frac{1}{n}\mathbb{E}(S_n)=\ell\in\mathbb{R}$ and $\mathrm{Var}(S_n)=\mathcal{O}(n)$, then $$\lim_{n\to\infty}\frac{S_n}{n}=\ell\text{ almost surely}.$$

A proof. We have $$\mathbb{E}\Bigl(\sum_n\Bigl(\frac{S_{n^2}-\mathbb{E}(S_{n^2})}{n^2}\Bigr)^2\Bigr)=\sum_n\frac{\mathrm{Var}(S_{n^2})}{n^4}=\mathcal{O}\Bigl(\sum_n\frac{1}{n^2}\Bigr)<\infty.$$ It follows that $\sum_n\Bigl(\frac{S_{n^2}-\mathbb{E}(S_{n^2})}{n^2}\Bigr)^2<\infty$ almost surely, which implies that $\frac{S_{n^2}-\mathbb{E}(S_{n^2})}{n^2}\to0$ almost surely. Since $\frac{1}{n^2}\mathbb{E}(S_{n^2})\to\ell$, this gives that $\frac{1}{n^2}S_{n^2}\to\ell$ almost surely. It remains to use the sandwich $$\frac{S_{n^2}}{n^2}\frac{n^2}{(n+1)^2}\leq\frac{S_k}{k}\leq\frac{S_{(n+1)^2}}{(n+1)^2}\frac{(n+1)^2}{n^2}$$ valid if $k$ is such that $n^2\leq k\leq (n+1)^2$ (here we use that $S_n$ is non-negative and non-decreasing in $n$, which holds when the $X_n$ are non-negative), together with $\frac{(n+1)^2}{n^2}\to1$.
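The statement can be illustrated numerically with a sequence that is non-negative but not independent. The sketch below is only an illustration under assumptions that are not in the post: the choice $X_i=U_iU_{i+1}$ with $U_i$ independent and uniform on $[0,1]$, and the hypothetical function name `running_mean_of_dependent_products`. For this choice $\mathbb{E}(X_i)=1/4$, and $\mathrm{Var}(S_n)=\mathcal{O}(n)$ because $X_i$ and $X_j$ are independent as soon as $|i-j|\geq2$, so the theorem predicts $S_n/n\to1/4$ almost surely.

```python
import random


def running_mean_of_dependent_products(n, seed=0):
    """Return S_n / n for X_i = U_i * U_{i+1}, with U_i i.i.d. uniform on [0, 1].

    The X_i are non-negative and 1-dependent (not independent), with mean 1/4.
    """
    rng = random.Random(seed)
    total = 0.0
    previous = rng.random()
    for _ in range(n):
        current = rng.random()
        total += previous * current  # X_i = U_i * U_{i+1} >= 0
        previous = current
    return total / n


if __name__ == "__main__":
    # Same seed, hence the same sample path observed at increasing times n.
    for n in (10**2, 10**3, 10**4, 10**5):
        print(n, running_mean_of_dependent_products(n))
```

With a fixed seed the successive calls follow the same sample path, so the printed values show $S_n/n$ along one trajectory rather than across independent experiments.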

Note and further reading. I learnt this proof recently from my friend Arnaud Guyader, who found it in a paper (Theorem 5) by Bernard Delyon and François Portier. It is probably a classic, however. It is possible to drop the non-negativity assumption by using $X=X_+-X_-$, but this would require modifying the remaining assumptions, which increases the complexity; see for instance Theorem 6.2 in Chapter 3 of Erhan Çinlar's book (and the comments on the blog post). The proof above has the advantage of being quick and beautiful.

This tiny post is about a basic characterization of Gaussian distributions.

The theorem. A random vector of dimension two or more has independent components and is rotationally invariant if and only if its components are Gaussian, centered, with the same variance.

In other words, for all $n\geq2$, a probability measure on $\mathbb{R}^n$ is at the same time a product measure and rotationally invariant if and only if it is a Gaussian distribution $\mathcal{N}(0,\sigma^2I_n)$ for some $\sigma\geq0$.

Note that this does not work for $n=1$. In a sense it is a purely multivariate phenomenon.

A proof. For all $\sigma\geq0$, the Gaussian distribution $\mathcal{N}(0,\sigma^2I_n)$ is product and rotationally invariant, and if $\sigma>0$, its density is, denoting $|x|:=\sqrt{x_1^2+\cdots+x_n^2}$, $$x\in\mathbb{R}^n\mapsto\mathrm{exp}\Bigl(-\frac{|x|^2}{2\sigma^2}-n\log\sqrt{2\pi\sigma^2}\Bigr).$$ Conversely, suppose that $\mu$ is a rotationally invariant product probability distribution on $\mathbb{R}^n$. We can assume without loss of generality that it has a smooth positive density $f:\mathbb{R}^n\to(0,\infty)$, since otherwise we can consider the probability measure $\mu*\mathcal{N}(0,\varepsilon I_n)$ for $\varepsilon>0$, which is also product and rotationally invariant. By rotational invariance, $\log f(x)=g(|x|^2)$, and thus $$\partial_i\log f(x)=2g'(|x|^2)x_i.$$ On the other hand, since $\mu$ is product, we have $\log f (x)=h(x_1)+\cdots+h(x_n)$ and thus $$\partial_i\log f (x)=h'(x_i).$$ Hence $\partial_i\log f(x)$, which depends on $|x|$ via $g'(|x|^2)$, depends only on $x_i$. Since $n\geq2$, we can change $|x|^2$ by moving another coordinate while keeping $x_i\neq0$ fixed, and it follows that $g'$ is constant. Therefore there exist $a,b\in\mathbb{R}$ such that $g(u)=au+b$ for all $u$, and thus $f(x)=\mathrm{e}^{a|x|^2+b}$ for all $x\in\mathbb{R}^n$. Since $f$ is a density, $a<0$ and $\mathrm{e}^b=(-a/\pi)^{n/2}$.
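As a consistency check of the normalization at the end of the proof, here is the computation written out, using only the notation above and the classical Gaussian integral $\int_{\mathbb{R}}\mathrm{e}^{-ct^2}\mathrm{d}t=\sqrt{\pi/c}$ for $c>0$. For $a<0$, $$\int_{\mathbb{R}^n}\mathrm{e}^{a|x|^2}\mathrm{d}x=\Bigl(\int_{\mathbb{R}}\mathrm{e}^{-|a|t^2}\mathrm{d}t\Bigr)^n=\Bigl(\frac{\pi}{|a|}\Bigr)^{n/2},$$ so that $\int f=1$ forces $\mathrm{e}^{b}=(|a|/\pi)^{n/2}$. Writing $a=-\frac{1}{2\sigma^2}$ gives $\mathrm{e}^{b}=(2\pi\sigma^2)^{-n/2}$, in other words $b=-n\log\sqrt{2\pi\sigma^2}$, which matches the density of $\mathcal{N}(0,\sigma^2I_n)$ displayed at the beginning of the proof.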

History. This was probably known before Maxwell, maybe by Carl Friedrich Gauss (1777 – 1855) himself. The proof above is roughly the reasoning followed by James Clerk Maxwell (1831 – 1879) to derive the distribution of velocities in an ideal gas at equilibrium. In his case $n=3$, and the distribution is known in statistical physics as the Maxwellian distribution. This was a source of inspiration for Ludwig Boltzmann (1844 – 1906) for the derivation of his kinetic evolution equation and his H-theorem about entropy.

Characterizations. This characterization of Gaussian laws among product distributions using invariance by the action of transformations (rotations) leads to the same characterization for the heat semi-group and for the Laplacian operator. There are of course other remarkable characterizations of the Gaussian, for instance as being an eigenvector of the Fourier transform, and also, following Boltzmann, as being the maximum entropy distribution at fixed variance.

Further reading. Robert Robson, Timon Mehrling, and Jens Osterhoff, Great moments in kinetic theory: 150 years of Maxwell’s (other) equations, European Journal of Physics 38(6), 2017.

Maxwell characterization for unitary invariant random matrices. A random $n\times n$ Hermitian matrix has at the same time independent entries and a law invariant by conjugacy with respect to unitary matrices if and only if it has a Gaussian law with density of the form $$H\mapsto\exp(a\mathrm{Tr}(H^2)+b\mathrm{Tr}(H)+c).$$ Note that the unitary invariance implies that the density depends only on the spectrum and is actually a symmetric function of the eigenvalues. A complete solution can be found for instance in Madan Lal Mehta's book on Random matrices (Theorem 2.6.3). It is based on the following lemma due to Hermann Weyl: all the invariants of an $n\times n$ matrix $H$ under non-singular similarity transformations $H\mapsto UHU^*$ can be expressed in terms of traces of the first $n$ powers of $H$. The assumption about the independence of entries kills all powers above $2$.

Complement. It is not difficult to show that if $X$ is a random vector of $\mathbb{R}^n$, $n\geq1$, with independent Gaussian and centered components of the same positive variance, then $\mathbb{P}(X=0)=0$ and $X/|X|$ is uniformly distributed on the sphere. Conversely, it was shown by my former teacher and colleague Gérard Letac in The Annals of Statistics (1981) that if a random vector $X$ of $\mathbb{R}^n$, $n\geq3$, has independent components and is such that $\mathbb{P}(X=0)=0$ and $X/|X|$ is uniformly distributed on the sphere, then $X$ is Gaussian and in particular its components are Gaussian with zero mean and same positive variance. Moreover there are counterexamples for $n=1$ and $n=2$. When $n\geq3$, this result of Letac implies the Maxwell theorem.
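The direct statement can be checked numerically in dimension $n=2$, where uniformity of $X/|X|$ on the circle means that the angle of $X$ is uniform on $[0,2\pi)$. The sketch below is only an illustration; the function name `angle_histogram`, the use of 12 angular bins, and the comparison with uniform components on $[-1,1]$ are assumptions made here, not part of the post. Twelve bins are used because coarser splits (quadrants, octants) are balanced for any product law with the symmetries of the square, so they would not distinguish the Gaussian case.

```python
import math
import random


def angle_histogram(sample_coordinate, samples=60000, bins=12, seed=0):
    """Empirical frequencies of the angle of (X1, X2) in equal angular bins.

    sample_coordinate(rng) draws one coordinate; the two coordinates are
    drawn independently from the same law.
    """
    rng = random.Random(seed)
    counts = [0] * bins
    width = 2 * math.pi / bins
    for _ in range(samples):
        x = sample_coordinate(rng)
        y = sample_coordinate(rng)
        angle = math.atan2(y, x) % (2 * math.pi)  # angle in [0, 2*pi)
        # min(...) guards against rounding at the upper edge of the range.
        counts[min(int(angle / width), bins - 1)] += 1
    return [c / samples for c in counts]


if __name__ == "__main__":
    gaussian = angle_histogram(lambda rng: rng.gauss(0.0, 1.0))
    uniform = angle_histogram(lambda rng: rng.uniform(-1.0, 1.0))
    print("target frequency per bin:", 1 / 12)
    print("gaussian components:", [round(p, 3) for p in gaussian])
    print("uniform components: ", [round(p, 3) for p in uniform])
```

For Gaussian components all frequencies should be close to $1/12$, while for uniform components the bins adjacent to the coordinate axes are visibly under-represented, since the direction of a point drawn uniformly in a square favors the diagonals.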

This tiny post is devoted to a proof of the almost sure convergence of martingales bounded in $\mathrm{L}^1$. The proof given below relies on the almost sure convergence of martingales bounded in $\mathrm{L}^2$, after a truncation step. In order to keep the martingale property after truncation, we truncate with a stopping time. The boundedness in $\mathrm{L}^1$ is used to show via the maximal inequality that the martingale is almost surely bounded. Note that this proof differs from the classical and historical proof from scratch, which is based on upcrossings or oscillations.

The martingales are either in discrete time or in continuous time with continuous paths.

The theorem. Let $M={(M_t)}_{t\geq0}$ be a continuous martingale bounded in $\mathrm{L}^1$. Then there exists $M_\infty\in\mathrm{L}^1$ such that $\lim_{t\to\infty}M_t=M_\infty$ almost surely. Moreover the convergence holds in $\mathrm{L}^1$ if and only if $M$ is uniformly integrable.

A proof. The fact that $M_\infty\in\mathrm{L}^1$ follows without effort from the almost sure convergence, the boundedness in $\mathrm{L}^1$, and the Fatou lemma, namely
$\mathbb{E}(|M_\infty|) =\mathbb{E}(\varliminf_{t\to\infty}|M_t|) \leq\varliminf_{t\to\infty}\mathbb{E}(|M_t|) \leq C<\infty.$ Moreover, it is a general fact that a sequence of random variables that converges almost surely to a limit belonging to $\mathrm{L}^1$ does converge in $\mathrm{L}^1$ if and only if it is uniformly integrable.

It remains to prove almost sure convergence. By the Doob maximal inequality with $p=1$, for all $t\geq0$ and $r>0$,

$$\mathbb{P}\Bigr(\sup_{s\in[0,t]}|M_s|\geq r\Bigr) \leq\frac{\mathbb{E}(|M_t|)}{r}.$$
By monotone convergence, with $C:=\sup_{t\geq0}\mathbb{E}(|M_t|)<\infty$, for all $r>0$,
$\mathbb{P}\Bigr(\sup_{t\geq0}|M_t|\geq r\Bigr) \leq\frac{C}{r}.$
It follows that $\mathbb{P}\Bigr(\sup_{t\geq0}|M_t|=\infty\Bigr)\leq\lim_{r\to\infty}\mathbb{P}\Bigr(\sup_{t\geq0}|M_t|\geq r\Bigr)=0.$
In other words almost surely ${(M_t)}_{t\geq0}$ is bounded.
As a consequence, on an almost sure event, say $\Omega'$, for large enough $n$,
$T_n:=\inf\{t\geq0:|M_t|\geq n\}=\infty.$
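The maximal inequality used above can be checked by simulation on a discrete time example. The sketch below is an illustration under assumptions that are not in the post: the martingale $M_k=\prod_{i\leq k}2U_i$ with $U_i$ independent and uniform on $[0,1]$, and the hypothetical function name `sup_exceedance_frequency`. This $M$ is a non-negative martingale with $M_0=1$ and $\mathbb{E}(|M_k|)=1$ for all $k$, so the inequality predicts $\mathbb{P}(\sup_{k\leq m}M_k\geq r)\leq 1/r$.

```python
import random


def sup_exceedance_frequency(r, steps=200, paths=5000, seed=0):
    """Monte Carlo estimate of P(max_{1<=k<=steps} M_k >= r), for r > 1,

    where M_k = prod_{i<=k} 2*U_i and the U_i are i.i.d. uniform on [0, 1].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(paths):
        m = 1.0  # M_0 = 1
        for _ in range(steps):
            m *= 2.0 * rng.random()  # multiplicative step with mean 1
            if m >= r:
                hits += 1
                break  # the running maximum has reached level r
    return hits / paths


if __name__ == "__main__":
    for r in (2.0, 5.0, 10.0):
        print(r, sup_exceedance_frequency(r), "bound:", 1.0 / r)
```

The estimated frequencies should stay below $1/r$ up to Monte Carlo error. The inequality is not expected to be tight here, because in discrete time the martingale jumps strictly above the level $r$ when it crosses it.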

On the other hand, by the Doob stopping theorem, for all $n\geq0$, ${(M_{t\wedge T_n})}_{t\geq0}$ is a martingale and $\sup_{t\geq0}|M_{t\wedge T_n}|\leq n$. Since it is bounded in $\mathrm{L}^2$, there exists $M^{(n)}_\infty\in\mathrm{L}^2$ such that $\lim_{t\to\infty}M_{t\wedge T_n}=M^{(n)}_\infty$ almost surely (and in $\mathrm{L}^2$ but this is useless here). Let us denote by $\Omega_n$ the almost sure event on which this convergence holds. Then, on the almost sure event $\Omega'\cap(\cap_n\Omega_n)$, we have $T_n=\infty$ and thus $M_{t\wedge T_n}=M_t$ for all large enough $n$, hence $M^{(n)}_\infty=M^{(m)}_\infty=:M_\infty$ for all large enough $m,n$, and

$\lim_{t\to\infty}M_t=M_\infty.$
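The theorem can be illustrated on a discrete time martingale that is bounded in $\mathrm{L}^1$ but not uniformly integrable. The sketch below rests on assumptions that are not in the post: the example $M_k=\prod_{i\leq k}2U_i$ with $U_i$ independent and uniform on $(0,1]$, and the hypothetical function name `log_product_martingale`. Here $\mathbb{E}(M_k)=1$ for all $k$, while $\mathbb{E}(\log(2U_i))=\log2-1<0$, so by the law of large numbers $\log M_k\to-\infty$ and $M_k\to M_\infty=0$ almost surely. Since $\mathbb{E}(M_\infty)=0\neq1$, the convergence does not hold in $\mathrm{L}^1$. The code works with $\log M_k$ to avoid floating point underflow.

```python
import math
import random


def log_product_martingale(steps, seed=0):
    """Return the path (log M_1, ..., log M_steps) for M_k = prod_{i<=k} 2*U_i."""
    rng = random.Random(seed)
    log_m = 0.0  # log M_0 = log 1
    path = []
    for _ in range(steps):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        log_m += math.log(2.0 * u)
        path.append(log_m)
    return path


if __name__ == "__main__":
    path = log_product_martingale(5000)
    for k in (10, 100, 1000, 5000):
        print(k, path[k - 1])
    print("drift per step predicted by E(log(2U)):", math.log(2.0) - 1.0)
```

The printed values of $\log M_k$ should decrease roughly linearly in $k$ with slope close to $\log2-1$, which is the almost sure convergence to $0$ seen on a logarithmic scale.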

About truncation. Truncation is very natural to increase integrability. It is for instance used in the proof of the strong law of large numbers for independent random variables in $\mathrm{L}^1$ in order to reduce the problem to variables in $\mathrm{L}^p$ with $p>1$, the case $p=4$ being particularly simple.

Final comments. The ingredients of the proof should be established beforehand and without using this theorem, namely the maximal inequalities for martingales, the almost sure convergence of martingales bounded in $\mathrm{L}^2$, and the stopping theorem for martingales and arbitrary stopping times.
