
Libres pensées d'un mathématicien ordinaire Posts


This post is about a remarkable projective property of certain Boltzmann-Gibbs measures.

The model. Let $d\geq1$ and $g:\mathbb{R}^d\to(-\infty,+\infty]$ be continuous with $g(x)<\infty$ for all $x\neq0$. For all $\beta>0$, $n\geq2$, let $P_n$ be the probability measure on $(\mathbb{R}^d)^n$ with density proportional to $$(x_1,\ldots,x_n)\in(\mathbb{R}^d)^n\mapsto\exp(-\beta H(x_1,\ldots,x_n))$$ where $$H(x_1,\ldots,x_n)=\sum_{i=1}^n\frac{1}{2}|x_i|^2+\sum_{i\neq j}g(x_i-x_j).$$ Let $$X=(X_{n,1},\ldots,X_{n,n})\sim P_n.$$

A projection. Let $p:\mathbb{R}^d\to\mathbb{R}^d$ be an orthogonal projection on a subspace $E\subset\mathbb{R}^d$. Let $\pi$ and $\pi^\perp$ be the orthogonal projections on the subspaces $$L=\{(p(z),\ldots,p(z))\in(\mathbb{R}^d)^n:z\in\mathbb{R}^d\}\quad\text{and}\quad L^\perp.$$ We have, for all $x\in(\mathbb{R}^d)^n$,

$$\pi(x)=(p(s(x)),\ldots,p(s(x)))\quad\text{where}\quad s(x):=\frac{x_1+\cdots+x_n}{n}\in\mathbb{R}^d.$$

Indeed, we have $$L^\perp=\{(x_1,\ldots,x_n)\in(\mathbb{R}^d)^n:(x_1+\cdots+x_n)\cdot p(z)=0\text{ for all }z\in\mathbb{R}^d\},$$
and for all $x=(x_1,\ldots,x_n)\in(\mathbb{R}^d)^n$ and all $(z,\ldots,z)\in L$, $z\in E$, we have
\begin{align}((x_1,\ldots,x_n)-(p(s(x)),\ldots,p(s(x))))\cdot(z,\ldots,z) &=\sum_{i=1}^nx_i\cdot z-np(s(x))\cdot z\\&=\sum_{i=1}^nx_i\cdot z-\sum_{i=1}^np(x_i)\cdot z\\&=\sum_{i=1}^n(x_i-p(x_i))\cdot z\\&=\sum_{i=1}^n0=0.\end{align} 

Example 1: if $E=\mathbb{R}^d$ then $p(x)=x$.

Example 2: if $E=\mathbb{R}z$ for $z\in\mathbb{R}^d$ with $|z|=1$, then $p(x)=(x\cdot z)z$.
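The formula for $\pi$ can be checked numerically. Here is a minimal sketch in plain Python, assuming $d=3$, $n=4$ and the rank-one projection of Example 2. The helper names (`dot`, `proj_p`, `proj_pi`, `sq_norm`) are illustrative and not part of the post.

```python
import math
import random

random.seed(0)
d, n = 3, 4  # illustrative sizes (assumption)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Example 2: p(x) = (x . z) z for a unit vector z
z = [random.gauss(0.0, 1.0) for _ in range(d)]
norm_z = math.sqrt(dot(z, z))
z = [zi / norm_z for zi in z]

def proj_p(x):
    c = dot(x, z)
    return [c * zi for zi in z]

def proj_pi(xs):
    """pi(x) = (p(s(x)), ..., p(s(x))) where s(x) is the mean of x_1, ..., x_n."""
    s = [sum(x[k] for x in xs) / n for k in range(d)]
    ps = proj_p(s)
    return [list(ps) for _ in range(n)]

def sq_norm(vs):
    return sum(dot(v, v) for v in vs)

xs = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
pi_x = proj_pi(xs)
perp_x = [[a - b for a, b in zip(x, q)] for x, q in zip(xs, pi_x)]

# x - pi(x) is orthogonal to an arbitrary element (p(w), ..., p(w)) of L
w = [random.gauss(0.0, 1.0) for _ in range(d)]
pw = proj_p(w)
inner = sum(dot(r, pw) for r in perp_x)

# Pythagoras: |x|^2 = |pi(x)|^2 + |pi_perp(x)|^2
gap = sq_norm(xs) - sq_norm(pi_x) - sq_norm(perp_x)
print(abs(inner) < 1e-9, abs(gap) < 1e-9)
```

Both quantities vanish up to floating-point rounding, which is what the orthogonality computation above predicts.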

The statement.  If all the ingredients are as above, then:

  • $\pi(X)$ and $\pi^\perp(X)$ are independent random vectors;
  • $\pi(X)$ is Gaussian with law $\mathcal{N}(0,\frac{1}{\beta}I_{\mathrm{dim}(E)})$ in an orthonormal basis of $L$;
  • $\pi^\perp(X)$ has a law with density proportional to $x\in L^\perp\mapsto\mathrm{e}^{-\beta H(x)}$ with respect to the trace of the Lebesgue measure on the linear subspace $L^\perp$ of $(\mathbb{R}^d)^n$.

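A proof. For all $x\in(\mathbb{R}^d)^n$, from $x=\pi(x)+\pi^\perp(x)$ we get $$|x|^2=|\pi(x)|^2+|\pi^\perp(x)|^2,$$ and on the other hand, for all $i,j\in\{1,\ldots,n\}$, since all the components of $\pi(x)$ are equal to $p(s(x))$, $$x_i-x_j=\pi^\perp(x)_i-\pi^\perp(x)_j.$$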


Since the confinement term of $H$ is quadratic, it follows that for all $x=(x_1,\ldots,x_n)\in(\mathbb{R}^d)^n$, $$H(x)=\frac{1}{2}|x|^2+\sum_{i\neq j}g(x_i-x_j)=\frac{1}{2}|\pi(x)|^2+H(\pi^\perp(x)).$$
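The decomposition of $H$ can be tested on a random configuration. This is a minimal sketch, assuming the model of the beginning of the post (confinement $\frac{1}{2}|x_i|^2$), the projection of Example 1 ($E=\mathbb{R}^d$, so that $\pi(x)=(s(x),\ldots,s(x))$), and a hypothetical interaction $g(v)=-\log|v|$ chosen only for illustration.

```python
import math
import random

random.seed(1)
d, n = 2, 5  # illustrative sizes (assumption)

def g(v):
    # hypothetical interaction: g(v) = -log|v|, finite for v != 0
    return -math.log(math.sqrt(sum(c * c for c in v)))

def H(xs):
    quad = 0.5 * sum(c * c for x in xs for c in x)
    inter = sum(g([a - b for a, b in zip(xs[i], xs[j])])
                for i in range(n) for j in range(n) if i != j)
    return quad + inter

xs = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

# Example 1 (p = identity): pi(x) = (s, ..., s) with s the mean of the x_i
s = [sum(x[k] for x in xs) / n for k in range(d)]
perp = [[a - b for a, b in zip(x, s)] for x in xs]

lhs = H(xs)
# |pi(x)|^2 = n |s|^2
rhs = 0.5 * n * sum(c * c for c in s) + H(perp)
print(lhs, rhs)
```

The two printed values agree up to rounding because the interaction only sees the differences $x_i-x_j$, which are unchanged by subtracting $\pi(x)$.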

Let $u_1,\ldots,u_{dn}$ be an orthonormal basis of $(\mathbb{R}^d)^n=\mathbb{R}^{dn}$ such that $u_1,\ldots,u_{\mathrm{dim}(E)}$ is an orthonormal basis of $L$. For all $x\in(\mathbb{R}^d)^n$ we write $x=\sum_{i=1}^{dn}t_i(x)u_i$. We have $$\pi(x)=\sum_{i=1}^{\mathrm{dim}(E)}t_i(x)u_i\quad\text{and}\quad\pi^\perp(x)=\sum_{i=\mathrm{dim}(E)+1}^{dn}t_i(x)u_i.$$ For all bounded measurable $f:L\to\mathbb{R}$ and $h:L^\perp\to\mathbb{R}$,
\begin{align}\mathbb{E}(f(\pi(X))h(\pi^\perp(X))) &=Z^{-1}\int_{(\mathbb{R}^d)^n}f(\pi(x))h(\pi^\perp(x))\mathrm{e}^{-\frac{\beta}{2}|\pi(x)|^2}\mathrm{e}^{-\beta H(\pi^\perp(x))}\mathrm{d}x_1\cdots\mathrm{d}x_n\\&=Z^{-1}\Bigl(\int_{\mathbb{R}^{\mathrm{dim}(E)}}f(t')\mathrm{e}^{-\frac{\beta}{2}|t'|^2}\mathrm{d}t'\Bigr) \Bigl(\int_{\mathbb{R}^{dn-\mathrm{dim}(E)}}h(t'') \mathrm{e}^{-\beta H(t'')}\mathrm{d}t''\Bigr)\end{align}

where $t':=\sum_{i=1}^{\mathrm{dim}(E)}t_iu_i$, $\mathrm{d}t':=\prod_{i=1}^{\mathrm{dim}(E)}\mathrm{d}t_i$, $t'':=\sum_{i=\mathrm{dim}(E)+1}^{dn}t_iu_i$, $\mathrm{d}t'':=\prod_{i=\mathrm{dim}(E)+1}^{dn}\mathrm{d}t_i$, and where $Z$ is the normalizing constant of $P_n$. This factorization gives the independence of $\pi(X)$ and $\pi^\perp(X)$. The first factor is, up to normalization, the Gaussian law $\mathcal{N}(0,\frac{1}{\beta}I_{\mathrm{dim}(E)})$, and the second one is the law with density proportional to $\mathrm{e}^{-\beta H}$ on $L^\perp$.

Further reading. arXiv:1805.00708



Aspects of Beta Ensembles

Dyson Absolute cordless vacuum cleaner.
This is not a Dyson ensemble.

This post is devoted to computations for beta ensembles from random matrix theory.

Real case. Following MR2813333, or MR2325917 and MR1936554 with a different scaling, for all $\beta>0$ and $n\geq2$, the $\beta$ Hermite ensemble is the probability measure on $\mathbb{R}^n$ defined by $$\mathrm{d}P_{\beta,n}(x)=\frac{\mathrm{e}^{-n\frac{\beta}{4}(x_1^2+\cdots+x_n^2)}}{C_{\beta,n}}\prod_{i<j}|x_i-x_j|^\beta\,\mathrm{d}x.$$ Let $X_n=(X_{n,1},\ldots,X_{n,n})\sim P_{\beta,n}$. The normalization $C_{\beta,n}$ can be explicitly computed in terms of Gamma functions via a Selberg integral. Following MR1936554, the law $P_{\beta,n}$ is the distribution of the ordered eigenvalues of the random tridiagonal symmetric $n\times n$ matrix
$$M_{\beta,n}=\frac{1}{\sqrt{\beta n}}
\begin{pmatrix}
\mathcal{N}(0, 2) & \chi_{(n-1) \beta} & & &\\
\chi_{(n-1) \beta} & \mathcal{N}(0, 2) & \chi_{(n-2) \beta} & & \\
& \ddots & \ddots & \ddots & \\
& & \chi_{2\beta} & \mathcal{N}(0,2) & \chi_{\beta} \\
& & & \chi_{\beta} & \mathcal{N}(0,2)
\end{pmatrix}$$
where, up to the scaling prefactor $1/\sqrt{\beta n}$, the entries in the upper triangle including the diagonal are independent, follow a Gaussian law $\mathcal{N}(0,2)$ on the diagonal, and $\chi$-laws just above the diagonal with a parameter decreasing by steps of $\beta$ from $(n-1)\beta$ to $\beta$. In particular $$X_{n,1}+\cdots+X_{n,n}=\mathrm{Trace}(M_{\beta,n})\sim\mathcal{N}\left(0,\frac{2}{\beta}\right).$$

Using standard algebra on Gamma distributions we also get $$X_{n,1}^2+\cdots+X_{n,n}^2=\mathrm{Trace}(M_{\beta,n}^2)\sim\mathrm{Gamma}\left(\frac{n}{2}+\frac{\beta n(n-1)}{4},\frac{\beta n}{4}\right).$$ Here the parameters of the Gamma law are the shape and the rate. In particular, we have $$\mathbb{E}((X_{n,1}+\cdots+X_{n,n})^2)=\frac{2}{\beta}\quad\text{and}\quad\mathbb{E}(X_{n,1}^2+\cdots+X_{n,n}^2)=\frac{2}{\beta}+n-1.$$
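The two moment formulas can be checked by Monte Carlo on the tridiagonal model, without computing any eigenvalue, since only the trace of $M_{\beta,n}$ and of its square are needed. This is a minimal sketch with the standard library only. The function name `sample_tridiagonal` and the values of $n$, $\beta$ and the number of runs are assumptions made for illustration, and $\chi_k$ is sampled as the square root of a Gamma variable with shape $k/2$ and scale $2$.

```python
import math
import random

random.seed(2)

def sample_tridiagonal(n, beta):
    """Return (diagonal, off_diagonal) of M_{beta,n}, scaling prefactor included."""
    scale = 1.0 / math.sqrt(beta * n)
    # N(0, 2) has standard deviation sqrt(2)
    diag = [scale * random.gauss(0.0, math.sqrt(2.0)) for _ in range(n)]
    # chi_k = sqrt(chi^2_k), and chi^2_k is Gamma(shape k/2, scale 2)
    off = [scale * math.sqrt(random.gammavariate(k * beta / 2.0, 2.0))
           for k in range(n - 1, 0, -1)]
    return diag, off

n, beta, runs = 4, 2.0, 20000  # illustrative values (assumption)
traces, traces_sq = [], []
for _ in range(runs):
    diag, off = sample_tridiagonal(n, beta)
    traces.append(sum(diag))
    # Trace(M^2) is the sum of the squares of all entries of the symmetric matrix
    traces_sq.append(sum(a * a for a in diag) + 2.0 * sum(b * b for b in off))

second_moment_trace = sum(t * t for t in traces) / runs
mean_trace_sq = sum(traces_sq) / runs
print(second_moment_trace, 2.0 / beta)
print(mean_trace_sq, 2.0 / beta + n - 1)
```

Each printed pair should be close, up to the Monte Carlo error, which is of order $1/\sqrt{\text{runs}}$.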

The mean and covariance of the random vector $X_n$ are given, for all $1\leq i\neq j\leq n$, by
$$\int x_i \, \mathrm{d}P_{\beta,n} =0, \quad
\int x_i^2 \, \mathrm{d}P_{\beta,n} = \frac{n-1}{n} + \frac{2}{n\beta},
\quad \int x_ix_j \, \mathrm{d}P_{\beta,n} = -\frac{1}{n}.$$

Dynamics. It is also possible to compute these laws using the overdamped Langevin dynamics associated to the Boltzmann-Gibbs measure $P_{\beta,n}$. This is known as the Dyson-Ornstein-Uhlenbeck dynamics. Namely, the law $P_{\beta,n}$ is invariant for the operator $$Lf(x)=\Delta f(x)-\nabla H(x)\cdot\nabla f(x)$$ where $$H(x)=n\frac{\beta}{4}(x_1^2+\cdots+x_n^2)-\frac{\beta}{2}\sum_{i\neq j}\log|x_i-x_j|.$$ Since $\partial_{x_i}H(x)=n\frac{\beta}{2}x_i-\beta\sum_{j\neq i}\frac{1}{x_i-x_j}$, we find $$L=\sum_{i=1}^n\partial_{x_i}^2-n\frac{\beta}{2}\sum_{i=1}^nx_i\partial_{x_i}+\frac{\beta}{2}\sum_{i\neq j}\frac{\partial_{x_i}-\partial_{x_j}}{x_i-x_j}.$$ The first two terms form an Ornstein-Uhlenbeck operator, while the last term leaves the set of symmetric polynomials globally invariant. Certain symmetric polynomials are eigenvectors. For instance the function $x_1+\cdots+x_n$ is an eigenvector, since $$L(x_1+\cdots+x_n)=-n\frac{\beta}{2}(x_1+\cdots+x_n).$$ Similarly, the function $x_1^2+\cdots+x_n^2+c$ is an eigenvector for the choice $c=-(\frac{2}{\beta}+n-1)$, since $$L(x_1^2+\cdots+x_n^2)=2n+\beta n(n-1)-n\beta(x_1^2+\cdots+x_n^2).$$
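The two eigenvector identities can be checked at a given point by finite differences. This is a minimal sketch, assuming illustrative values of $n$ and $\beta$ and a hand-picked configuration of distinct points. The helper `generator` is a hypothetical name, and it approximates $Lf=\Delta f-\nabla H\cdot\nabla f$ with central differences.

```python
import math

n, beta = 4, 1.5  # illustrative values (assumption)

def H(x):
    conf = n * beta / 4.0 * sum(xi * xi for xi in x)
    inter = sum(math.log(abs(x[i] - x[j]))
                for i in range(n) for j in range(n) if i != j)
    return conf - beta / 2.0 * inter

def sum_of_squares(x):
    return sum(xi * xi for xi in x)

def generator(func, x, h=1e-4):
    """Approximate L func(x) = Laplacian(func)(x) - grad H(x) . grad func(x)."""
    total = 0.0
    for i in range(n):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        d2f = (func(xp) - 2.0 * func(x) + func(xm)) / (h * h)
        df = (func(xp) - func(xm)) / (2.0 * h)
        dH = (H(xp) - H(xm)) / (2.0 * h)
        total += d2f - dH * df
    return total

x = [-1.3, -0.2, 0.7, 2.1]  # distinct points (assumption)

lhs_linear = generator(sum, x)
rhs_linear = -n * beta / 2.0 * sum(x)

lhs_square = generator(sum_of_squares, x)
rhs_square = 2 * n + beta * n * (n - 1) - n * beta * sum_of_squares(x)

print(lhs_linear, rhs_linear)
print(lhs_square, rhs_square)
```

The agreement is only up to the discretization and rounding errors of the finite differences, so it is a sanity check rather than a proof.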

Let $S={(S_t)}_{t\geq0}$ be the stochastic process on $\mathbb{R}^n$ solving the stochastic differential equation $$\mathrm{d}S_t=\sqrt{2}\mathrm{d}B_t-\nabla H(S_t)\mathrm{d}t$$ where $B={(B_t)}_{t\geq0}$ is a standard Brownian motion on $\mathbb{R}^n$. The Itô formula gives, for $f\in\mathcal{C}^2(\mathbb{R}^n,\mathbb{R})$, $$\mathrm{d}f(S_t)=\sqrt{2}\nabla f(S_t)\cdot \mathrm{d}B_t+(Lf)(S_t)\mathrm{d}t.$$

With $f(x_1,\ldots,x_n)=x_1+\cdots+x_n$ we get that $U_t:=S_{t,1}+\cdots+S_{t,n}$ solves $$\mathrm{d}U_t=\sqrt{2 n}\mathrm{d}W_t-n\frac{\beta}{2}U_t\mathrm{d}t$$ where $W={(\frac{B_{t,1}+\cdots+B_{t,n}}{\sqrt{n}})}_{t\geq0}$ is a standard Brownian motion. Thus $U$ is an Ornstein-Uhlenbeck process. Since $U_\infty$ has the law of $X_{n,1}+\cdots+X_{n,n}$, we recover in this way the formula $X_{n,1}+\cdots+X_{n,n}\sim\mathcal{N}(0,2\beta^{-1})$.
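The stationary variance $2/\beta$ of $U$ can be observed on a simulated path. This is a minimal sketch which uses the exact Gaussian transition of the one-dimensional Ornstein-Uhlenbeck equation for $U$, not a simulation of the full particle system $S$. The values of $n$, $\beta$, the time step and the number of steps are illustrative assumptions.

```python
import math
import random

random.seed(3)
n, beta = 5, 2.0        # illustrative values (assumption)
a = n * beta / 2.0      # mean-reversion rate of U
sigma2 = 2.0 * n        # squared diffusion coefficient of U
h = 1.0 / a             # time step; the scheme below is exact for any h > 0

# Exact transition of dU = sqrt(sigma2) dW - a U dt over a time step h:
# U_{t+h} = exp(-a h) U_t + sqrt(sigma2 / (2a) * (1 - exp(-2 a h))) * N(0, 1)
decay = math.exp(-a * h)
noise_std = math.sqrt(sigma2 / (2.0 * a) * (1.0 - math.exp(-2.0 * a * h)))

u, steps, acc = 0.0, 50000, 0.0
for _ in range(steps):
    u = decay * u + noise_std * random.gauss(0.0, 1.0)
    acc += u * u

empirical_var = acc / steps  # time average of U^2 along the path
print(empirical_var, 2.0 / beta)
```

The time average of $U_t^2$ approaches the variance $\sigma^2/(2a)=2/\beta$ of the invariant Gaussian law, whatever the value of $n$.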

With $f(x_1,\ldots,x_n)=x_1^2+\cdots+x_n^2$, we get that $V_t:=S_{t,1}^2+\cdots+S_{t,n}^2=|S_t|^2$ solves $$\mathrm{d}V_t=2\sqrt{2}\sqrt{V_t}\mathrm{d}W_t+n\beta\left(\frac{2}{\beta}+n-1-V_t\right)\mathrm{d}t$$ where $W$ is the process defined by $\mathrm{d}W_t=\frac{S_t}{|S_t|}\cdot\mathrm{d}B_t$, which is a standard Brownian motion by the Lévy characterization. Thus $V$ is a Cox-Ingersoll-Ross process. The generator of such a process is a Laguerre operator, and the invariant distribution is a Gamma law. Since $V_\infty$ has the law of $|X_n|^2$, we recover the formula $|X_n|^2\sim\mathrm{Gamma}(\ldots)$.
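The parameters of the invariant Gamma law can be read off the coefficients of the equation. This is a minimal sketch, assuming the standard fact that the Cox-Ingersoll-Ross equation $\mathrm{d}V_t=\sigma\sqrt{V_t}\mathrm{d}W_t+a(b-V_t)\mathrm{d}t$ has invariant law Gamma with shape $2ab/\sigma^2$ and rate $2a/\sigma^2$. The function name and the values of $n$ and $\beta$ are illustrative.

```python
import math

def cir_stationary_gamma(sigma, a, b):
    """(shape, rate) of the invariant Gamma law of
    dV = sigma * sqrt(V) dW + a * (b - V) dt."""
    return 2.0 * a * b / sigma ** 2, 2.0 * a / sigma ** 2

n, beta = 6, 1.5  # illustrative values (assumption)
shape, rate = cir_stationary_gamma(
    sigma=2.0 * math.sqrt(2.0), a=n * beta, b=2.0 / beta + n - 1)

# Parameters of the law of |X_n|^2 obtained from the tridiagonal matrix model
shape_expected = n / 2.0 + beta * n * (n - 1) / 4.0
rate_expected = beta * n / 4.0
print(shape, shape_expected)
print(rate, rate_expected)
```

The two routes, matrix model and dynamics, give the same shape and rate.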

Complex case. For all $\beta>0$ and $n\geq2$, we consider the probability measure on $\mathbb{C}^n$ defined by $$\mathrm{d}P_{\beta,n}(x)=\frac{\mathrm{e}^{-n\frac{\beta}{2}(|x_1|^2+\cdots+|x_n|^2)}}{C_{\beta,n}}\prod_{i<j}|x_i-x_j|^\beta\,\mathrm{d}x.$$ Let $X_n=(X_{n,1},\ldots,X_{n,n})\sim P_{\beta,n}$. To our knowledge, there is no useful matrix model with independent entries valid for all $\beta$. However, it is possible as in the real case to use the eigenvectors of an (overdamped) Langevin dynamics, namely $$Lf(x)=\Delta f(x)-\nabla f(x)\cdot \nabla H(x)$$ where $$H(x)=n\frac{\beta}{2}(|x_1|^2+\cdots+|x_n|^2)-\frac{\beta}{4}\sum_{i\neq j}\log|x_i-x_j|^2.$$
Now $\nabla_{x_i}H(x)=n\beta x_i-\beta\sum_{j\neq i}\frac{x_i-x_j}{|x_i-x_j|^2}$, which gives $$L=\sum_{i=1}^n\partial^2_{x_i}-n\beta\sum_{i=1}^nx_i\cdot\partial_{x_i}+\frac{\beta}{2}\sum_{j\neq i}\frac{(x_i-x_j)\cdot(\partial_{x_i}-\partial_{x_j})}{|x_i-x_j|^2}.$$ As in the real case, the first two terms still form an Ornstein-Uhlenbeck operator. Certain special symmetric polynomials are eigenvectors, such as $\Re(x_1+\cdots+x_n)$, $\Im(x_1+\cdots+x_n)$ and $|x_1|^2+\cdots+|x_n|^2+c$ for a suitable constant $c$. More precisely, we have $$ L(\Re(x_1+\cdots+x_n))=-n\beta\Re(x_1+\cdots+x_n)$$ and $$L(\Im(x_1+\cdots+x_n))=-n\beta\Im(x_1+\cdots+x_n).$$ Similarly we find
$$L(|x_1|^2+\cdots+|x_n|^2)=4n-2n\beta(|x_1|^2+\cdots+|x_n|^2)+\beta n(n-1).$$ Let $S={(S_t)}_{t\geq0}$ be the stochastic process on $\mathbb{C}^n=\mathbb{R}^{2n}$ solving the stochastic differential equation $$\mathrm{d}S_t=\sqrt{2}\mathrm{d}B_t-\nabla H(S_t)\mathrm{d}t$$ where $B={(B_t)}_{t\geq0}$ is a standard Brownian motion on $\mathbb{R}^{2n}$. By the Itô formula, for all $f\in\mathcal{C}^2(\mathbb{R}^{2n},\mathbb{R})$, $$\mathrm{d}f(S_t)=\sqrt{2}\nabla f(S_t)\cdot \mathrm{d}B_t+(Lf)(S_t)\mathrm{d}t.$$

If $f(x_1,\ldots,x_n)=\Re(x_1+\cdots+x_n)$ then $U_t:=\Re(S_{t,1}+\cdots+S_{t,n})$ solves $$\mathrm{d}U_t=\sqrt{2 n}\mathrm{d}W_t-n\beta U_t\mathrm{d}t$$ where $W={(\frac{\Re(B_{t,1}+\cdots+B_{t,n})}{\sqrt{n}})}_{t\geq0}$ is a standard Brownian motion. Thus $U$ is an Ornstein-Uhlenbeck process. Since $U_\infty$ has the law of $\Re(X_{n,1}+\cdots+X_{n,n})$, we get that $\Re(X_{n,1}+\cdots+X_{n,n})\sim\mathcal{N}(0,\beta^{-1})$. Doing the same for $\Im$, and using the fact that the real and imaginary parts of $B_{t,1}+\cdots+B_{t,n}$ are independent Brownian motions, we get that $\Re(X_{n,1}+\cdots+X_{n,n})$ and $\Im(X_{n,1}+\cdots+X_{n,n})$ are independent with the same law $\mathcal{N}(0,\beta^{-1})$, and therefore $X_{n,1}+\cdots+X_{n,n}\sim\mathcal{N}(0,\beta^{-1}I_2)$.

If $f(x_1,\ldots,x_n)=|x_1|^2+\cdots+|x_n|^2$ then $V_t:=|S_{t,1}|^2+\cdots+|S_{t,n}|^2=|S_t|^2$ solves $$\mathrm{d}V_t=2\sqrt{2}\sqrt{V_t}\mathrm{d}W_t+2n\beta\left(\frac{2}{\beta}+\frac{n-1}{2}-V_t\right)\mathrm{d}t$$ where $W$ is the process defined by $\mathrm{d}W_t=\frac{S_t}{|S_t|}\cdot\mathrm{d}B_t$, which is a standard Brownian motion by the Lévy characterization. Thus $V$ is a Cox-Ingersoll-Ross process. The generator of such a process is a Laguerre operator, and the invariant distribution is a Gamma law. Since $V_\infty$ has the law of $|X_n|^2$, we obtain the formula $$|X_n|^2\sim\mathrm{Gamma}\left(n+\frac{\beta n(n-1)}{4},\frac{n\beta}{2}\right).$$ When $\beta=2$, this is compatible with the observation of Kostlan that $n|X_n|^2$ has the law of a sum of $n$ independent random variables of law $\mathrm{Gamma}(1,1),\ldots,\mathrm{Gamma}(n,1)$.
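The compatibility with the observation of Kostlan can be illustrated by sampling. This is a minimal sketch, assuming $\beta=2$, an illustrative value of $n$, and the shape and rate convention for the Gamma law. It compares the Gamma law of $|X_n|^2$ rescaled by $n$ with a sum of independent $\mathrm{Gamma}(k,1)$ variables, $1\leq k\leq n$, through their first two moments only.

```python
import random

random.seed(4)
n, beta, runs = 4, 2.0, 20000  # beta = 2 is the case of Kostlan's observation

shape = n + beta * n * (n - 1) / 4.0
rate = n * beta / 2.0

# n |X_n|^2 with |X_n|^2 ~ Gamma(shape, rate); gammavariate takes (shape, scale)
lhs = [n * random.gammavariate(shape, 1.0 / rate) for _ in range(runs)]
# sum of independent Gamma(1, 1), ..., Gamma(n, 1)
rhs = [sum(random.gammavariate(k, 1.0) for k in range(1, n + 1))
       for _ in range(runs)]

def mean(values):
    return sum(values) / len(values)

def var(values):
    m = mean(values)
    return sum((t - m) ** 2 for t in values) / len(values)

print(mean(lhs), mean(rhs), n * (n + 1) / 2.0)
print(var(lhs), var(rhs))
```

Both samples follow the law $\mathrm{Gamma}(n(n+1)/2,1)$, since $n+\frac{n(n-1)}{2}=1+2+\cdots+n$, so the empirical means and variances agree up to the Monte Carlo error.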

We can quickly compute the mean using the invariance of $P_{\beta,n}$ with respect to $L$, namely $$0=4n+\beta n(n-1)-2n\beta\mathbb{E}(|X_{n,1}|^2+\cdots+|X_{n,n}|^2),$$ which gives $\mathbb{E}(|X_{n,1}|^2+\cdots+|X_{n,n}|^2)=\frac{2}{\beta}+\frac{n-1}{2}$.

Note. A squared Ornstein-Uhlenbeck (OU) process is a Cox-Ingersoll-Ross (CIR) process. In this sense CIR processes play for OU processes the role played for BM by squared Bessel processes.

Further reading. 

  • Chafaï & Lehec, On Poincaré and logarithmic Sobolev inequalities for a class of singular Gibbs measures, arXiv:1805.00708
  • Bolley & Chafaï & Fontbona, Dynamics of a planar Coulomb gas, arXiv:1706.08776




Avenue de l'Impératrice, photographed by Charles Marville in 1858.

Université Paris-Dauphine owes its name to its proximity to the Porte Dauphine. According to some [1], this gate owes its name to the avenue Dauphine. The avenue Dauphine, a former road of the communes of Passy and Neuilly, is said to have been created in 1826 in honor of the Dauphine of France, Marie-Thérèse, duchesse d'Angoulême (1778-1851). Its extension to the Bois de Boulogne was reworked in 1854 when the avenue de l'Impératrice was created (it became the avenue du Bois in 1875, then the avenue Foch in 1929). Since 1864 it has borne the name of Maréchal Bugeaud (1784-1849).

[1] Auguste Doniol, Histoire du XVIe arrondissement de Paris (1902). BNF.

According to other sources [2], the Porte Dauphine owes its name to Marie-Antoinette, wife of the Dauphin and future Louis XVI, who is said to have had it built at the end of the rue de la Faisanderie, around 1770, when she was living at the Château de la Muette. Would it then rather be the gate that gave its name to the former avenue?

[2] Marquis de Rochegude, Promenade dans toutes les rues de Paris (1910). BNF.

Note. The idea of writing this post came to me after a pertinent question from a newly recruited colleague about the exact origin of the name of the Paris gate that gave its name to the university.


Law of large numbers

This post is devoted to a quick proof of a version of the law of large numbers for non-negative random variables in $\mathrm{L}^2$. It does not require independence or identical distribution. It is a variant of the famous one for independent random variables bounded in $\mathrm{L}^4$.

The statement. If $X_1,X_2,\ldots$ are non-negative $\mathrm{L}^2$ random variables on $(\Omega,\mathcal{A},\mathbb{P})$ such that, with $S_n:=X_1+\cdots+X_n$, we have $\lim_{n\to\infty}\frac{1}{n}\mathbb{E}(S_n)=\ell\in\mathbb{R}$ and $\mathrm{Var}(S_n)=\mathcal{O}(n)$, then $$\lim_{n\to\infty}\frac{S_n}{n}=\ell\text{ almost surely}.$$

A proof. We have $$\mathbb{E}\Bigl(\sum_n\Bigl(\frac{S_{n^2}-\mathbb{E}(S_{n^2})}{n^2}\Bigr)^2\Bigr)=\sum_n\frac{\mathrm{Var}(S_{n^2})}{n^4}=\mathcal{O}\Bigl(\sum_n\frac{1}{n^2}\Bigr)<\infty.$$ It follows that $\sum_n\Bigl(\frac{S_{n^2}-\mathbb{E}(S_{n^2})}{n^2}\Bigr)^2<\infty$ almost surely, which implies that $\frac{S_{n^2}-\mathbb{E}(S_{n^2})}{n^2}\to0$ almost surely. Since $\frac{1}{n^2}\mathbb{E}(S_{n^2})\to\ell$, this gives that $\frac{1}{n^2}S_{n^2}\to\ell$ almost surely. It remains to use the sandwich $$\frac{S_{n^2}}{n^2}\frac{n^2}{(n+1)^2}\leq\frac{S_k}{k}\leq\frac{S_{(n+1)^2}}{(n+1)^2}\frac{(n+1)^2}{n^2}$$ valid if $k$ is such that $n^2\leq k\leq (n+1)^2$ (here we use the non-negative nature of the $X$'s, which makes $k\mapsto S_k$ non-negative and non-decreasing).
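The statement and the sandwich can be illustrated on dependent variables. This is a minimal sketch, assuming the hypothetical example $X_i=Y_iY_{i+1}$ with $Y_1,Y_2,\ldots$ independent and uniform on $[0,2]$. These variables are non-negative and not independent, with $\mathbb{E}(X_i)=1$, so that $\ell=1$, and $\mathrm{Var}(S_n)=\mathcal{O}(n)$ since $X_i$ and $X_j$ are independent when $|i-j|>1$.

```python
import random

random.seed(5)
N = 40000  # number of terms, a perfect square (illustrative)

# Non-negative, dependent terms: X_i = Y_i * Y_{i+1}, Y_i i.i.d. uniform on [0, 2]
ys = [random.uniform(0.0, 2.0) for _ in range(N + 1)]
S = [0.0]  # S[k] = X_1 + ... + X_k
for i in range(N):
    S.append(S[-1] + ys[i] * ys[i + 1])

# Check the sandwich for all n^2 <= k <= (n+1)^2 with (n+1)^2 <= N
sandwich_ok = True
n = 1
while (n + 1) ** 2 <= N:
    lower = S[n * n] / (n + 1) ** 2
    upper = S[(n + 1) ** 2] / n ** 2
    for k in range(n * n, (n + 1) ** 2 + 1):
        if not (lower <= S[k] / k <= upper):
            sandwich_ok = False
    n += 1

print(sandwich_ok, S[N] / N)
```

The sandwich holds on every block because the partial sums are non-decreasing, and the last ratio $S_N/N$ is close to $\ell=1$.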

Note and further reading. I recently learned this proof from my friend Arnaud Guyader, who found it in a paper (Theorem 5) by Bernard Delyon and François Portier. It is probably a classic, however. It is possible to drop the non-negativity assumption by using $X=X_+-X_-$, but this would require modifying the remaining assumptions, increasing the complexity, see for instance Theorem 6.2 in Chapter 3 of Erhan Çinlar's book (and the blog post comments below). The proof above has the advantage of being quick and beautiful.
