December 2016 – Libres pensées d'un mathématicien ordinaire

Poincaré. Recently I have spent some time thinking about the following problem: for any integer \( {N\geq1} \), what is the best constant in the Poincaré inequality for the probability measure \( {\gamma_N} \) on the convex set \( {\Lambda_N:=\{x\in\mathbb{R}^N:x_1\leq\cdots\leq x_N\}} \) with density proportional to

\[ x\in\Lambda_N\mapsto\mathrm{e}^{-\frac{N}{2}\sum_{i=1}^Nx_i^2}\prod_{j<k}(x_j-x_k)^2 ? \]

In other words, for every \( {N\geq1} \), find the best (in fact the largest) constant \( {\rho_N\geq0} \) such that for every \( {\mathcal{C}^\infty} \) test function \( {f:\Lambda_N\rightarrow\mathbb{R}} \) with compact support,

\[ \rho_N\mathrm{Var}_{\gamma_N}(f) \leq\int |\nabla f|^2\mathrm{d}\gamma_N, \]

where \( {\mathrm{Var}_{\gamma_N}(f):=\int f^2\mathrm{d}\gamma_N-\left(\int f\mathrm{d}\gamma_N\right)^2} \) and where \( {|\nabla f|^2:=(\partial_1f)^2+\cdots+(\partial_Nf)^2} \).

The reader familiar with random matrix theory knows that \( {\gamma_N} \) is the law of the eigenvalues of a random matrix drawn from the Gaussian Unitary Ensemble (GUE) with density proportional to \( {\mathrm{e}^{-\frac{N}{2}\mathrm{tr}(H^2)}} \) on \( {N\times N} \) Hermitian matrices.

Note that \( {\gamma_N} \) is a Boltzmann-Gibbs measure, with density \( {Z_N^{-1}\mathrm{e}^{-E_N(x)}} \) with

\[ E_N(x) :=\frac{N}{2}\sum_{i=1}^Nx_i^2+\sum_{j<k}\log\frac{1}{(x_k-x_j)^2}. \]

Class of test functions. Far beyond \( {\mathcal{C}^\infty} \) with compact support, standard approximation arguments give that the Poincaré inequality for \( {\gamma_N} \) with constant \( {\rho_N} \) remains valid for any test function in the Sobolev space \( {H^2(\gamma_N):=W^{1,2}(\gamma_N)} \).

Symmetrization. Let \( {\widetilde{\gamma}_N} \) be the probability measure on \( {\mathbb{R}^N} \) instead of \( {\Lambda_N} \) with same density as \( {\gamma_N} \) up to a multiplicative normalizing factor. It is exchangeable in other words invariant by permutation of the coordinates since its density is a symmetric function of the coordinates. Obviously the normalizing factor for \( {\widetilde\gamma_N} \) is \( {N!} \) times the one of \( {\gamma_N} \). For every bounded and measurable function \( {f:\mathbb{R}^N\rightarrow\mathbb{R}} \),

\[ \int f\mathrm{d}\widetilde{\gamma}_N =\int f_*\mathrm{d}\gamma_N \]

where \( {f_*} \) is the symmetrization of \( {f} \) defined by

\[ f_*(x_1,\ldots,x_N)=\frac{1}{N!}\sum_{\sigma\in\Sigma_N}f(x_{\sigma(1)},\ldots,x_{\sigma(N)}) \]

where \( {\Sigma_N} \) is the symmetric group of permutations of \( {\{1,\ldots,N\}} \). Also \( {\gamma_N=\widetilde\gamma_N} \) on symmetric functions. Conversely for any bounded and measurable \( {f:\Lambda_N\rightarrow\mathbb{R}} \),

\[ \int f\mathrm{d}\gamma_N =\int f(x_{(1)},\ldots,x_{(N)})\mathrm{d}\widetilde\gamma_N \]

where \( {x_{(1)}<\cdots<x_{(N)}} \) is the reordering of \( {x_1,\ldots,x_N} \).

It is possible to compute moments of \( {\widetilde\gamma_N} \) by using the link with the GUE. For instance

\[ \int_{\mathbb{R}^N}(x_1+\cdots+x_N)\widetilde\gamma_N(\mathrm{d}x) =\mathbb{E}(\mathrm{tr}(H))=N\mathrm{H_{11}}=0 \]

where \( {H\sim\mathrm{GUE}} \) since \( {H_{11}\sim\mathcal{N}(0,N^{-1})} \), while using \( {H_{12}\sim\mathcal{N}(0,(2N)^{-1}I_2)} \),

\[ \int_{\mathbb{R}^N}(x_1^2+\cdots+x_N^2)\widetilde\gamma_N(\mathrm{d}x) =\mathbb{E}(\mathrm{tr}(H^2)) =N\mathbb{E}(H_{11}^2)+N(N-1)\mathbb{E}(|H_{12}|^2) =N. \]

Since \( {\widetilde\gamma_N} \) is exchangeable it follows that

\[ \int x_1\widetilde\gamma_N(\mathrm{d}x)=\cdots=\int x_N\mathrm{d}\widetilde\gamma_N =\frac{1}{N}\int(x_1+\cdots+x_N)\widetilde\gamma_N(\mathrm{d}x) =0 \]

and

\[ \int x_1^2\widetilde\gamma_N(\mathrm{d}x)=\cdots=\int x_N^2\widetilde\gamma_N(\mathrm{d}x) =\frac{1}{N}\int(x_1^2+\cdots+x_N^2)\widetilde\gamma_N(\mathrm{d}x) =1. \]

To compute a mixed moment \( {x_jx_k} \) with \( {j\neq k} \), we can start from

\[ \int(x_1+\cdots+x_N)^2\widetilde\gamma_N(\mathrm{d}x) =\mathbb{E}(\mathrm{tr}(H)^2)=1 \]

since \( {\mathrm{tr}(H)\sim\mathcal{N}(0,1)} \), and use

\[ \int(x_1+\cdots+x_N)^2\widetilde\gamma_N(\mathrm{d}x) =N(N-1)\int x_1x_2\widetilde\gamma_N(\mathrm{d}x) +N\int\!x_1^2\widetilde\gamma_N(\mathrm{d}x) \]

to get, for any \( {j\neq k} \),

\[ \int x_jx_j\widetilde\gamma_N(\mathrm{d}x) =\int x_1x_2\widetilde\gamma_N(\mathrm{d}x) =-\frac{1}{N}. \]

This shows that \( {\widetilde\gamma_N} \) is asymptotically isotropic, in the sense that its mean is zero while its covariance matrix is close to the identity \( {I_N} \) as \( {N\rightarrow\infty} \). Note that the interior of the support of \( {\widetilde\gamma_N} \), which is \( {\cap_{j\neq k}\{x\in\mathbb{R}^N:x_j\neq x_k\}} \), is not connected, and this suggests that the probability measure \( {\widetilde\gamma_N} \) is in some sense artificial.

How about \( {\gamma_N} \) defined on the convex set \( {\Lambda_N} \)? In fact, this distribution is not isotropic, even when \( {N\gg1} \). As a matter of fact, let us recall that if \( {Z\sim\gamma_N} \) then a well known result of random matrix theory states that almost surely, regardless of the way we choose the common probability space,

\[ Z_1\underset{N\rightarrow\infty}{\longrightarrow}-2 \quad\mbox{and}\quad Z_N\underset{N\rightarrow\infty}{\longrightarrow}2 \quad\mbox{while}\quad \frac{1}{N}\sum_{i=1}^N\delta_{Z_i} \underset{N\rightarrow\infty}{\overset{\mathrm{weak}}{\longrightarrow}} \frac{\sqrt{4-x^2}}{2\pi}\mathrm{1}_{[-2,2]}(x)\mathrm{d}x. \]

Nevertheless, since \( {\gamma_N} \) and \( {\widetilde\gamma_N} \) agree on symmetric functions, we still have

\[ \int(x_1+\cdots+x_N)\gamma_N(\mathrm{d}x)=0,\quad \int(x_1+\cdots+x_N)^2\gamma_N(\mathrm{d}x)=1, \]

and

\[ \int(x_1^2+\cdots+x_N^2)\gamma_N(\mathrm{d}x)=N. \]

Dyson. The probability measure \( {\gamma_N} \) is invariant for the irreducible Markov diffusion process \( {{(X_t)}_{t\geq0}} \) on \( {\Lambda_N} \) solution of the stochastic differential equation

\[ \mathrm{d}X_t=\sqrt{2}\mathrm{d}B_t-\nabla E_N(X_t)\mathrm{d}t \]

where \( {{(B_t)}_{t\geq0}} \) is a standard Brownian motion on \( {\mathbb{R}^N} \), in other words

\[ \mathrm{d}X_{t,i} =\sqrt{2}\mathrm{d}B_{t,i} -NX_{t,i}\mathrm{d}t -2\sum_{j\neq i}\frac{1}{X_{t,j}-X_{t,i}}\mathrm{d}t, 1\leq i\leq N. \]

We may call it the Dyson Ornstein-Uhlenbeck process. The equation above is a system of stochastic differential equations of interacting particles, which can be seen (via the empirical measure) as a mean-field linear approximation of a McKean-Vlasov semilinear evolution equation with singular interaction, but this is another story.

Markov. The semigroup \( {{(P_t)}_{t\geq0}} \) of the process is defined by

\[ P_t(f)(x):=\mathbb{E}(f(X_t)\mid X_0=x) \]

for any \( {t\geq0} \), any bounded measurable \( {f:\mathbb{R}^N\rightarrow\mathbb{R}} \), and any \( {x\in\mathbb{R}^N} \). For any \( {t\geq0} \) the linear operator \( {P_t} \) is a contraction of \( {\mathrm{L}^p(\gamma_N)} \) for any \( {p\in[1,\infty]} \). In \( {\mathrm{L}^2(\gamma_N)} \), the infinitesimal generator of this semigroup is the differential operator

\[ Af=\lim_{t\rightarrow0^+}\frac{P_t(f)-f}{t}=\Delta f-\nabla E_N\cdot \nabla f \]

for any smooth enough test function \( {f} \) (\( {A} \) is unbouded with a domain). The Poincaré inequality for \( {\gamma_N} \) is equivalent to an exponential decay of the variance: \( {\forall f, \forall t\geq0} \),

\[ \mathrm{Var}_{\gamma_N}(P_t(f))\leq\mathrm{e}^{-\rho_N t}\mathrm{Var}_{\gamma_N}(f). \]

The Poincaré inequality is also equivalent to state that \( {A} \) has a spectral gap:

\[ \mathrm{spectrum}(A)\subset(-\infty,-\rho_N]\cup\{0\}. \]

Convexity. On the convex set \( {\Lambda_N} \), the energy \( {E_N} \) of the Boltzmann-Gibbs measure \( {\gamma_N} \) is convex, and therefore \( {\gamma_N} \) is log-concave (but is not isotropic). Note that in contrast \( {\widetilde\gamma_N} \) is not log-concave (but is almost isotropic as \( {N\rightarrow\infty} \)). Thanks to a result by Bobkov, it follows that the Poincaré constant of \( {\gamma_N} \) is positive:

\[ \rho_N>0. \]

It may depend on \( {N} \) however. Let us compute the Hessian matrix of \( {E_N} \):

\[ \partial^2_{i,i}E_N(x) =N+2(N-1)\sum_{k\neq i}\frac{1}{(x_i-x_k)^2} \quad\mbox{and}\quad \partial^2_{j,k}E_N(x) =-\frac{2(N-1)}{(x_j-x_k)^2},\quad j\neq k. \]

This shows in particular that for any \( {x\in\mathbb{R}^N} \), as quadratic forms,

\[ \mathrm{Hess}(E_N)(x)\geq NI_N. \]

Thus by the Brascamp-Lieb inequality or by the Bakry-Émery criterion or by the Caffarelli theorem, it follows that the Poincaré constant of \( {\gamma_N} \) satisfies

\[ \rho_N\geq N. \]

This bound comes from the strong convexity of the confinement term \( {N\sum_{i=1}^Nx_i^2} \) and the convexity of the interaction term \( {\sum_{j<k}\log\frac{1}{(x_j-x_k)^2}} \). The interaction term is strongly convex when \( {x} \) in away from infinity, but this is not seen by global criteria.

The spectral gap is \( {N} \). The Poincaré inequality for \( {\gamma_N} \) written for the special test function \( {f(x)=x_1+\cdots+x_N} \) writes

\[ \rho_N\mathrm{Var}(Z_1+\cdots+Z_N)\leq N \]

where \( {Z\sim\gamma_N} \). But if \( {H\sim\mathrm{GUE}} \) then \( {Z_1+\cdots+Z_N=\mathrm{tr}(H)\sim\mathcal{N}(0,1)} \) and thus

\[ \rho_N\leq N. \]

Combining this fact with the previous lower bound we obtain remarkably that

\[ \rho_N=N. \]

Actually, it turns out that the function \( {f(x)=x_1+\cdots+x_N} \) is an eigenfunction of the infinitesimal generator \( {A} \) associated to the spectral value \( {N} \), indeed

\[ Af(x) =\sum_{i=1}^N\partial_iE_N(x) =\sum_{i=1}^N\left(Nx_i+2\sum_{j\neq i}\frac{1}{x_j-x_i}\right) =Nf(x). \]

This gives a conservation law: \( {\sum_{i=1}^NX_{t,i}=\sum_{i=1}^NX_{0,i}} \) for any \( {t\geq0} \).

Projections. Let us consider the hyperplane of \( {\mathbb{R}^N} \)

\[ H_N:=\{x\in\mathbb{R}^N:x_1+\cdots+x_N=0\} \]

orthogonal to \( {(1,\ldots,1)} \). Let \( {\pi_N} \) and \( {\pi_N^\perp} \) be the orthogonal projections on \( {H_N} \) and \( {H_N^\perp=\mathbb{R}(1,\ldots,1)} \). The Itô formula shows that the projected processes \( {{(\pi_N(X_t))}_{t\geq0}} \) and \( {{(\pi_N^\perp(X_t))}_{t\geq0}} \) are independent, and that moreover the first one is a Markov diffusion process on \( {H_N} \) while the second is a Brownian motion on \( {H_N^\perp\equiv\mathbb{R}} \).

Similarly if \( {Z\sim\gamma_N} \) then \( {\pi_N(Z)} \) and \( {\pi_N^\perp(Z)} \) are independent, the first one follows a Boltzmann-Gibbs measure with same density as \( {\gamma_N} \) but on \( {H_N} \), while the second follows a Gaussian distribution of dimension \( {1} \).

The quadratic term in the exponential and the Vandermonde determinant which both appear in the density of \( {\gamma_N} \) play an essential role in these projection properties. More precisely, the Pythagoras theorem gives

\[ x_1^2+\cdots+x_N^2 =|x|^2 =|x-\pi_N(x)|^2+|\pi_N(x)|^2 =|\pi_N^\perp(x)|^2+|\pi_N(x)|^2, \]

while the formulas \( {\pi_N^\perp(x)=\frac{x_1+\cdots+x_N}{N}(1,\ldots,1)} \) and \( {\pi_N(x)=x-\pi_N^\perp(x)} \) give for \( {j<k} \)

\[ x_j-x_k =x_j-\frac{x_1+\cdots+x_N}{N}-\left(x_k-\frac{x_1+\cdots+x_N}{N}\right) =\pi_N(x)_j-\pi_N(x)_k, \]

which yield together finally the factorization

\[ \mathrm{e}^{-\frac{N}{2}\sum_{i=1}^Nx_i^2}\prod_{j<k}(x_j-x_k)^2 =\mathrm{e}^{-\frac{N}{2}|\pi_N^\perp(x)|^2}\times \mathrm{e}^{-\frac{N}{2}|\pi_N(x)|^2}\prod_{j<k}(\pi_N(x)_j-\pi_N(x)_k)^2. \]

It is natural to ask about the Poincaré constant for the projected process. This constant is necessarily \( {\geq N} \) and one can check on quadratic functions that this bound is in fact sharp, therefore the spectral gap is not improved by this projection and remains equal to \( {N} \). However, the Dyson conjecture formulated by Dyson in 1962 suggests that the Poincaré constant for a class of sufficiently local test functions might be much larger than \( {N} \).

Hoffman-Wielandt. The Hoffman-Wielandt theorem for Hermitian matrices states that for any \( {N\geq1} \) if \( {A} \) and \( {B} \) are two \( {N\times N} \) Hermitian matrices with eigenvalues \( {x_1(A)\leq\cdots\leq x_N(A)} \) and \( {x_1(B)\leq\cdots\leq x_N(B)} \) then

\[ \sum_{i=1}^N(x_i(A)-x_i(B))^2\leq\sum_{j,k=1}^N(A_{jk}-B_{jk})^2. \]

This shows that the ordered vector of eigenvalues is a Lipschitz function the matrix entries, and this explains why the Gaussian nature of the GUE induces nice properties for \( {\gamma_N} \). This can also be seen as an alternative to the Caffarelli theorem.

Logarithmic Sobolev inequalities. Thanks to the Hoffman-Wielandt inequality, the Gaussian nature of the GUE should induces a sub-Gaussian nature for \( {\gamma_N} \). Indeed again the Bakry-Émery criterion or the Caffarelli theorem give that \( {\gamma_N} \) satisfies to a logarithmic Sobolev inequality with constant \( {\geq N/2} \), and thus \( {=N/2} \) since it implies the Poincaré inequality with twice the constant.

Beyond the Gaussian case. Many properties considered above remain valid if one replaces, in the exponential in the density of \( {\gamma_N} \), the quadratic term \( {x_i^2} \) by \( {V(x_i)} \) where \( {V:\mathbb{R}\rightarrow\mathbb{R}} \) is convex with \( {\inf_{\mathbb{R}}V”>0} \). However certain rigid properties are lost, such as the fact that \( {x_1+\cdots+x_N} \) is an eigenvector of \( {A} \), and in particular the projected process is no longer necessarily a Markov process.

Beyond GUE. Certain aspects remain valid if one replaces, in the density of \( {\gamma_N} \), the term \( {(x_j-x_k)^2} \) by \( {|x_j-x_k|^\beta} \) for some \( {\beta>0} \). The cases \( {\beta=1} \) and \( {\beta=4} \) correspond respectively to the Gaussian Orthogonal Ensemble (GOE) and the Gaussian Simplectic Ensemble (GSE). The condition \( {\beta\geq1} \) seems to be necessary in order to ensure that the diffusion process \( {{(X_t)}_{t\geq0}} \) does not explode in finite time.

Kannan-Lovász-Simonovits. Log-concave probability measures play a central role in the analysis and geometry of convex bodies, as being functional generalization of uniform distributions on convex bodies. In this context, the Kannan-Lovász-Simonovits (KLS) conjecture formulated in 1995 states that there exists a universal constant \( {\rho\in(0,\infty)} \) such that for every \( {N\geq1} \) and every log-concave probability measure \( {\mu} \) on \( {\mathbb{R}^N} \) with mean \( {0} \) and covariance matrix \( {I_N} \) (isotropy), the Poincaré constant of \( {\mu} \) is larger than or equal to \( {\rho} \). The feature here is the uniformity of the bound in particular with respect to the dimension \( {N} \). Several mathematicians have tried to prove the conjecture. The best bound for now is \( {N^{-1/4}} \). Possible counter examples should be searched among non-product probability measures due to the stability by tensor product of the Poincaré inequality and the fact that the Poincaré constant of a one dimensional log-concave distribution is controlled by its second moment. The KLS conjecture is related to other important conjectures in geometric functional analysis such that the hyperplane and the thin-shell conjectures.

Note. This post is inspired from numerous conversations with my colleague Joseph Lehec on Dyson Brownian motion and related topics.

Further reading.

2 Comments

Month: December 2016

Mind the gap!