About rigidity for Markov diffusions

Dominique Bakry (1954 -- )

This tiny post is about a rigidity property for certain Markov diffusion operators, using solely integration by parts and convexity, in particular without using semigroups, processes, functional inequalities, or Gamma calculus. This minimalism is for fun, but it may also help to better understand the phenomenon.

In a paper, between the abstract and the non-sense, there is the introduction!

Dominique Bakry, private joke during a seminar talk, Toulouse (2002).

Markov diffusion. Let $V:\mathbb{R}^d\to\mathbb{R}$ be convex, $\mathcal{C}^\infty$, with $\lim_{|x|\to\infty}V(x)=+\infty$, and with slow growth, meaning that $V$ and its derivatives of all orders have at most polynomial growth. A basic example is given by $V(x)=P(|x|^2)$ where $P$ is a non-constant polynomial with non-negative coefficients. Let us consider the differential operator
\[
\mathrm{T}=\Delta -\langle \nabla V,\nabla \rangle=\sum_{i=1}^d\partial^2_{ii}-\sum_{i=1}^d(\partial_iV)\partial_i.
\]By adding a constant to $V$, we can assume without loss of generality that
\[
\mathrm{d}\mu(x)=\mathrm{e}^{-V(x)}\mathrm{d}x
\] is a probability measure. We see $\mathrm{T}$ as an unbounded operator on $L^2(\mu)$.

Thanks to the regularity assumptions on $V$, the space $\mathcal{S}$ of $\mathcal{C}^\infty$ rapidly decaying functions, stable under $\nabla$ and $\Delta$, is also stable under $\mathrm{T}$, as is the space $\mathcal{P}$ of $\mathcal{C}^\infty$ functions with slow growth. The space $\mathcal{S}$ is convenient for integration by parts thanks to its vanishing behavior at infinity, while $\mathcal{P}$ has the advantage of containing the polynomials, including the constants.

The letter $\mathrm{T}$ is for transform, which is natural for an operator. We could also use $\mathrm{L}$ like Laplacian.

Integration by parts. The operator $\mathrm{T}$ is symmetric and $-\mathrm{T}$ is positive. For all $f$ and $g$ in $\mathcal{S}$,
\[
\int f\mathrm{T}g\mathrm{d}\mu=-\int\langle\nabla f,\nabla g\rangle\mathrm{d}\mu=\int g\mathrm{T}f\mathrm{d}\mu
\quad\text{in particular}\quad
\int f(-\mathrm{T})f\mathrm{d}\mu\geq0.
\]This comes from integration by parts, namely from the fact that $\Delta=\mathrm{div}\nabla$, $\mathrm{div}$ is the adjoint of $-\nabla$ in $L^2(\mathrm{d}x)$, and $\nabla\mathrm{e}^{-V}=-\nabla V\mathrm{e}^{-V}$. The adjoint of $\mathrm{T}$ in $L^2(\mathrm{d}x)$ is $\mathrm{div}(\nabla+\nabla V)$.
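
The identity above can be checked numerically in dimension $d=1$. The following is only a minimal sketch under assumptions made for illustration: the potential $V(x)=\frac{x^2}{2}+\frac{x^4}{4}$, the two test functions `f` and `g`, and the trapezoidal quadrature on $[-10,10]$ are hypothetical choices, not taken from the text. The density is left unnormalized since the constant cancels on both sides.

```python
import math

# Assumed potential for this illustration: V(x) = x^2/2 + x^4/4 (convex, polynomial growth).
def V(x):
    return x**2 / 2 + x**4 / 4

def dV(x):
    return x + x**3

def weight(x):
    # Unnormalized density e^{-V}; the normalizing constant cancels in the identity.
    return math.exp(-V(x))

# Two rapidly decaying test functions (hypothetical choices), with exact derivatives.
def f(x):   return math.exp(-x**2)
def df(x):  return -2 * x * f(x)
def d2f(x): return (4 * x**2 - 2) * f(x)

def g(x):   return math.exp(-(x - 1)**2)
def dg(x):  return -2 * (x - 1) * g(x)
def d2g(x): return (4 * (x - 1)**2 - 2) * g(x)

def Tf(x): return d2f(x) - dV(x) * df(x)   # T f = f'' - V' f'
def Tg(x): return d2g(x) - dV(x) * dg(x)   # T g = g'' - V' g'

def integrate(h, a=-10.0, b=10.0, n=20000):
    # Composite trapezoidal rule; the integrands are negligible outside [a, b].
    step = (b - a) / n
    total = 0.5 * (h(a) + h(b))
    for k in range(1, n):
        total += h(a + k * step)
    return total * step

lhs = integrate(lambda x: f(x) * Tg(x) * weight(x))        # int f Tg dmu
mid = -integrate(lambda x: df(x) * dg(x) * weight(x))      # -int f' g' dmu
rhs = integrate(lambda x: g(x) * Tf(x) * weight(x))        # int g Tf dmu
energy = integrate(lambda x: f(x) * (-Tf(x)) * weight(x))  # int f (-T) f dmu

print(lhs, mid, rhs, energy)
```

The three first quantities should agree up to quadrature error, and the last one should be positive, in line with the symmetry of $\mathrm{T}$ and the positivity of $-\mathrm{T}$.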

The spectrum of $-\mathrm{T}$ is included in $\{0\}\cup[\lambda_1,\infty)$ for some $\lambda_1>0$ called the spectral gap. Moreover, the spectrum is discrete, $\lambda_1$ is an eigenvalue and the associated eigenfunctions are $\mathcal{C}^\infty$. Note however that they do not belong to $\mathcal{S}$ in general. The kernel or eigenspace associated to the eigenvalue $0$ is formed by the constant functions.

Ornstein-Uhlenbeck operator. It is obtained with $V=\frac{\rho}{2}\left|\cdot\right|^2$, for a constant $\rho>0$. The law $\mu$ is then the Gaussian or normal law $\mathcal{N}(0,\frac{1}{\rho}\mathrm{I}_d)$. The eigenvalues of $-\mathrm{T}$ are then the integer multiples of $\rho$, namely $0,\rho,2\rho,3\rho,\ldots$, and the eigenfunctions associated to the eigenvalue $n\rho$ are the multivariate Hermite polynomials of degree $n$, which do not belong to $\mathcal{S}$, but belong to $\mathcal{P}$. In particular $\lambda_1=\rho$ and the associated eigenfunctions are linear: $f(x)=\langle a,x\rangle$, $a\in\mathbb{R}^d\setminus\{0\}$.
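
As a sanity check in dimension $d=1$, where $\mathrm{T}f=f''-\rho xf'$, the case $n=2$ can be verified by hand. The normalization $H_2(x)=x^2-\frac{1}{\rho}$ is a choice made for this illustration:
\[
\mathrm{T}H_2(x)=2-\rho x\cdot 2x=-2\rho\Bigl(x^2-\frac{1}{\rho}\Bigr)=-2\rho H_2(x),
\]so that $-\mathrm{T}H_2=2\rho H_2$. Similarly $\mathrm{T}x=-\rho x$ gives the eigenvalue $\rho$ for $n=1$.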

Polynomials. Beyond the OU case, it is natural to ask if $\mathrm{T}$ can admit polynomial eigenfunctions. If $V$ is a polynomial, then the set of polynomials $\mathbb{R}[X_1,\ldots,X_d]$ is left stable by $\mathrm{T}$, but the degree is not increased only if $V$ has degree two, in other words only if $\nabla V$ is affine, which means that $\mathrm{T}$ is an affine deformation of OU. An important class of non-Gaussian and non-product examples beyond pure OU is
\[
V(x)=\frac{\rho}{2}|x|^2+W(x), \quad x\in\mathbb{R}^d,
\]where $\rho>0$ and where $W:\mathbb{R}^d\to\mathbb{R}$ is convex and translation invariant in the direction $(1,\ldots,1)\in\mathbb{R}^d$, namely for all $u\in\mathbb{R}$ and all $x\in\mathbb{R}^d$,
\[
W(x+u(1,\ldots,1))=W(x).
\]This is the case for example when for some convex even function $h:\mathbb{R}\to\mathbb{R}$,
\[
W(x)=\sum_{i< j}h(x_i-x_j),\quad x\in\mathbb{R}^d.
\]The translation invariance of $W$ in the direction $(1,\ldots,1)$ gives $\nabla W(x)\perp(1,\ldots,1)$ for all $x\in\mathbb{R}^d$. Now it follows from this fact that
the linear polynomial $f(x)=x_1+\cdots+x_d$, for which $\nabla f(x)=(1,\ldots,1)$, is an eigenfunction of $\mathrm{T}$ associated to the eigenvalue $\rho$, indeed
\[
\mathrm{T}(f)=\Delta f-\rho\langle x,\nabla f(x)\rangle-\langle\nabla W(x),\nabla f(x)\rangle=0-\rho f(x)-0=-\rho f(x).
\]This gives a wide class of non-polynomial $V$ for which $\mathrm{T}$ admits at least one polynomial eigenfunction. This eigenfunction does not depend on $W$: it is the first multivariate Hermite polynomial, the one of OU. Moreover, if $\pi$ and $\pi^\perp$ are the orthogonal projections on $\mathbb{R}(1,\ldots,1)$ and on its orthogonal complement, respectively, then $|x|^2=|\pi(x)|^2+|\pi^\perp(x)|^2$, while the translation invariance of $W$ in the direction $(1,\ldots,1)$ gives $W(x)=W(\pi(x)+\pi^\perp(x))=W(\pi^\perp(x))$, hence the splitting
\[
\mathrm{e}^{-V(x)}
=\mathrm{e}^{-\frac{\rho}{2}|\pi(x)|^2}\mathrm{e}^{-W(\pi^\perp(x))-\frac{\rho}{2}|\pi^\perp(x)|^2}
\]meaning that $\mu$ is product, up to a rotation, with a one-dimensional Gaussian factor $\mathcal{N}(0,\frac{1}{\rho})$.
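
The eigenfunction computation above for $f(x)=x_1+\cdots+x_d$ can be checked on a concrete case. The sketch below assumes, purely for illustration, the pair interaction $h(t)=t^4$ (convex and even), the value $\rho=2$, and an arbitrary integer point `x`, so that all arithmetic is exact. The function names `grad_W` and `T_of_sum` are hypothetical.

```python
def grad_W(x):
    # Gradient of W(x) = sum_{i<j} h(x_i - x_j) with the assumed choice h(t) = t^4.
    # Since h' is odd, the k-th partial derivative is sum_{j != k} h'(x_k - x_j).
    return [
        sum(4 * (xk - xj) ** 3 for j, xj in enumerate(x) if j != k)
        for k, xk in enumerate(x)
    ]

def T_of_sum(x, rho):
    # f(x) = x_1 + ... + x_d has gradient (1, ..., 1) and zero Laplacian,
    # hence T f(x) = -<grad V(x), (1, ..., 1)> with V = (rho/2)|x|^2 + W.
    grad_V = [rho * xk + gk for xk, gk in zip(x, grad_W(x))]
    return -sum(grad_V)

x = [3, -1, 4, 1, -5]   # arbitrary integer point, so the check is exact
rho = 2

print(sum(grad_W(x)))                   # <grad W(x), (1, ..., 1)>
print(T_of_sum(x, rho), -rho * sum(x))  # T f(x) versus -rho f(x)
```

The first printed value is the scalar product of $\nabla W(x)$ with $(1,\ldots,1)$, which vanishes by antisymmetry of $h'$, and the last two values coincide, as the identity $\mathrm{T}f=-\rho f$ predicts.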

In dimension $d=1$, the condition on $W$ forces $W$ to be constant, and thus $\mathrm{T}$ to be OU. More generally, an observation that dates back to Dominique Bakry in the 1990s is that among one-dimensional diffusion operators of the form $\sigma(x)^2f''(x)-b(x)f'(x)$, and up to translation and dilation, the cases for which the eigenfunctions are polynomials, which are then the orthogonal polynomials of the associated measure $\mu$, are the OU or Hermite case ($\sigma^2=1$ constant and $b(x)=x$) on $\mathbb{R}$, the Laguerre case ($\sigma(x)^2=x$ and $b$ is affine) on $\mathbb{R}_+$, and the Jacobi case ($\sigma(x)^2=1-x^2$ and $b$ is affine) on $[-1,1]$. Moreover the Jacobi case gives the Hermite and Laguerre cases by deformation and passage to the limit, just like for the three integrable cases of the Selberg integral in random matrix theory. In higher dimension, the situation is less rigid and much more subtle, and Dominique Bakry devoted a significant part of his late scientific life to it, unfortunately disconnected from the works of Michel Lassalle and Peter Forrester on multivariate orthogonal polynomials in combinatorics and mathematical physics.

Strong convexity and Bochner formula. For all $\rho>0$, the following properties are equivalent:

  1. $V-\frac{\rho}{2}\left|\cdot\right|^2$ is convex, in other words $V$ is $\rho$-convex
  2. $\mathrm{Hess}(V)(x)\geq\rho\mathrm{I}_d$ as quadratic forms, for all $x\in\mathbb{R}^d$
  3. $\langle\mathrm{Hess}(V)\nabla f,\nabla f\rangle\geq\rho|\nabla f|^2$, for all $f\in\mathcal{S}$
  4. $\langle(\mathrm{T}\nabla-\nabla\mathrm{T})(f),\nabla f\rangle\geq\rho|\nabla f|^2$, for all $f\in\mathcal{S}$.

The equality case $V=\frac{\rho}{2}\left|\cdot\right|^2$ corresponds to the Ornstein-Uhlenbeck operator.

The third property gives the $\rho$-convexity by approximating linear (or affine) functions by elements of $\mathcal{S}$. The fourth property is a reformulation of the third one thanks to the Bochner formula:
\[
\nabla\mathrm{T}-\mathrm{T}\nabla=-\mathrm{Hess}(V)\nabla.
\]This deformed commutation can be interpreted as a curvature. It is at the heart of the geometry of the probabilistic functional analysis developed notably by Dominique Bakry and his followers.
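
The Bochner formula above can be recovered coordinate by coordinate, simply by differentiating $\mathrm{T}f=\Delta f-\langle\nabla V,\nabla f\rangle$ and using the product rule. For all $1\leq i\leq d$,
\[
\partial_i(\mathrm{T}f)
=\Delta\partial_if-\sum_{j=1}^d(\partial_jV)\partial_j\partial_if-\sum_{j=1}^d(\partial^2_{ij}V)\partial_jf
=\mathrm{T}(\partial_if)-(\mathrm{Hess}(V)\nabla f)_i,
\]where $\mathrm{T}\nabla f$ is understood as $\mathrm{T}$ acting on each coordinate of $\nabla f$.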

The fourth property looks like a sort of abstract non-sense here, but actually its goal is to provide a reformulation of $\rho$-convexity suitable to make the link with the eigenfunctions of $\mathrm{T}$, notably after taking the average with respect to $\mu$ of both sides of the inequality.

By using integration by parts we get, denoting $\left\|\cdot\right\|_{\mathrm{HS}}$ the Hilbert-Schmidt norm,
\[
\int\langle\mathrm{T}\nabla f,\nabla f\rangle\mathrm{d}\mu
=\sum_{i=1}^d\int(\mathrm{T}\partial_if)\partial_if\mathrm{d}\mu
=-\sum_{i=1}^d\int|\nabla\partial_if|^2\mathrm{d}\mu
=-\sum_{i,j=1}^d\int(\partial^2_{ij}f)^2\mathrm{d}\mu
=-\int\|\mathrm{Hess}(f)\|_{\mathrm{HS}}^2\mathrm{d}\mu.
\]As a consequence, if $V$ is $\rho$-convex then for all $f\in\mathcal{S}$,
\[
-\int\langle\nabla\mathrm{T}f,\nabla f\rangle\mathrm{d}\mu
\geq\rho\int|\nabla f|^2\mathrm{d}\mu+\int\|\mathrm{Hess}(f)\|_{\mathrm{HS}}^2\mathrm{d}\mu.
\]
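
In more detail, this last inequality is obtained by integrating the fourth property with respect to $\mu$ and then using the integration by parts identity for $\int\langle\mathrm{T}\nabla f,\nabla f\rangle\mathrm{d}\mu$:
\[
\rho\int|\nabla f|^2\mathrm{d}\mu
\leq\int\langle\mathrm{T}\nabla f,\nabla f\rangle\mathrm{d}\mu-\int\langle\nabla\mathrm{T}f,\nabla f\rangle\mathrm{d}\mu
=-\int\|\mathrm{Hess}(f)\|_{\mathrm{HS}}^2\mathrm{d}\mu-\int\langle\nabla\mathrm{T}f,\nabla f\rangle\mathrm{d}\mu.
\]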

Spectral gap. If $V$ is $\rho$-convex and if $-\mathrm{T}f=\lambda_1f$, then, approximating $f$ by elements of $\mathcal{S}$,
\[
\lambda_1\int|\nabla f|^2\mathrm{d}\mu
\geq\rho\int|\nabla f|^2\mathrm{d}\mu+\int\|\mathrm{Hess}(f)\|_{\mathrm{HS}}^2\mathrm{d}\mu,
\]hence
\[
\lambda_1\geq\rho.
\]This lower bound dates back at least to the works of André Lichnerowicz in the 1950s. Equality is achieved in the case of the Ornstein-Uhlenbeck operator, for which $V=\frac{\rho}{2}\left|\cdot\right|^2$. This bound should be understood as a comparison: if $V$ has at least the convexity of the OU case, then the spectral gap is at least the one of OU.
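
The passage from the integrated inequality to $\lambda_1\geq\rho$ can be spelled out as follows. Since $-\mathrm{T}f=\lambda_1f$ gives $-\nabla\mathrm{T}f=\lambda_1\nabla f$, and since $\int|\nabla f|^2\mathrm{d}\mu=-\int f\mathrm{T}f\mathrm{d}\mu=\lambda_1\int f^2\mathrm{d}\mu>0$, we get
\[
(\lambda_1-\rho)\int|\nabla f|^2\mathrm{d}\mu
\geq\int\|\mathrm{Hess}(f)\|_{\mathrm{HS}}^2\mathrm{d}\mu
\geq0.
\]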

Rigidity and splitting. Let us consider the equality case, namely $V$ is $\rho$-convex and $\lambda_1=\rho$, in other words $V$ is $\lambda_1$-convex. It follows from the above that if $-\mathrm{T}f=\lambda_1f$ then we get $\mathrm{Hess}(f)=0$, and since $f$ is smooth, this gives that $f$ is linear, just like for OU!

Moreover, for such an $f$, we get, from the Bochner formula,
\[
0
=\int\Bigl(\langle(\mathrm{T}\nabla-\nabla\mathrm{T})f,\nabla f\rangle-\rho|\nabla f|^2\Bigr)\mathrm{d}\mu
=\int\Bigl(\langle\mathrm{Hess}(V)\nabla f,\nabla f\rangle-\rho|\nabla f|^2\Bigr)\mathrm{d}\mu.
\]Now the integrand in the right hand side is continuous and non-negative since $V$ is $\rho$-convex, hence it vanishes identically. Since $\mathrm{Hess}(V)(x)-\rho\mathrm{I}_d$ is positive semi-definite, the constant vector $\nabla f$ (recall that $f$ is linear) is in its kernel for all $x\in\mathbb{R}^d$. Since $f$ is not constant, it follows that $\nabla f$ is not zero. As a consequence, up to rotation, $\mu$ is a product measure, with a univariate Gaussian factor $\mathcal{N}(0,\frac{1}{\rho})$, times a $\rho$-convex factor. The splitting also says that $\mathrm{T}$ is, up to rotation, the direct sum of an OU operator $f''-\rho xf'$ and an operator with $\rho$-convex $V$.
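
The splitting can be made explicit as follows. This is a sketch assuming that $f$ is exactly linear, $f(x)=\langle a,x\rangle$ with $a\neq0$; an additive constant in $f$ would only translate the Gaussian factor. The notation $e=a/|a|$, $U$, and $\pi_e^\perp$ (the orthogonal projection on the orthogonal complement of $\mathbb{R}e$) is introduced here for illustration. Since $\Delta f=0$, the equation $-\mathrm{T}f=\rho f$ reads $\langle\nabla V(x),a\rangle=\rho\langle a,x\rangle$, and therefore
\[
U(x)=V(x)-\frac{\rho}{2}\langle e,x\rangle^2
\quad\text{satisfies}\quad
\langle\nabla U(x),e\rangle=\langle\nabla V(x),e\rangle-\rho\langle e,x\rangle=0.
\]Hence $U$ is invariant by translation in the direction $e$, in other words $U=U\circ\pi_e^\perp$, and
\[
\mathrm{e}^{-V(x)}
=\mathrm{e}^{-\frac{\rho}{2}\langle e,x\rangle^2}\mathrm{e}^{-U(\pi_e^\perp(x))},
\]while $\mathrm{Hess}(U)=\mathrm{Hess}(V)-\rho\,ee^\top\geq\rho(\mathrm{I}_d-ee^\top)$ shows that $U$ is $\rho$-convex on the orthogonal complement of $\mathbb{R}e$.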

Such links between convexity and eigenfunctions were studied notably by Dominique Bakry and Zhongmin Qian. The splitting related to rigidity was explored notably by Xu Cheng and Detang Zhou in a geometric context, as well as by Guido De Philippis and Alessio Figalli using optimal transport. My own motivation comes initially from the study of the Dyson-Ornstein-Uhlenbeck operator that emerges from random matrix theory. Rigidity has a nice application to the cutoff phenomenon related to the trend to the equilibrium of the associated stochastic process.

