Press "Enter" to skip to content

Cutoff for high-dimensional curved diffusions

Michel Ledoux (1958 -- ), a great explorer of Markov semigroups and Gaussian analysis

This post is inspired by the last part of a recent work [CF], in collaboration with Max Fathi, about the cutoff phenomenon for curved diffusions in high dimension.

Diffusion. Let ${(X_t)}_{t\geq0}$ be the solution of the stochastic differential equation \[ \mathrm{d}X_t = -\nabla V(X_t)\mathrm{d}t + \sqrt{2}\mathrm{d}B_t \] where ${(B_t)}_{t\geq0}$ is a standard Brownian motion in $\mathbb{R}^d$, $V:\mathbb{R}^d\to\mathbb{R}$ is strictly convex and $\mathcal{C}^2$ with $\lim_{|x|\to\infty}V(x)=+\infty$, and $\left|\cdot\right|$ is the Euclidean norm of $\mathbb{R}^d$. In statistical physics, this drift-diffusion is also known as an overdamped Langevin process with potential $V$. By adding a constant to $V$, we can assume without loss of generality that $\mu_V=\mathrm{e}^{-V}$, namely \[ \mathrm{d}\mu_V(x)=\mathrm{e}^{-V(x)}\mathrm{d}x, \] is a probability measure. It is the unique invariant law of the process, and it is moreover reversible. The associated infinitesimal generator is the linear differential operator \[ \mathrm{L} = \Delta - \nabla V \cdot \nabla \] acting on smooth functions. It is symmetric in $L^2(\mu_V)$, and its kernel is the set of constant functions. Moreover, its spectrum is included in $(-\infty,-\lambda_1]\cup\{0\}$ for some $\lambda_1 > 0$ called the spectral gap of $\mathrm{L}$. When $V(x)=\frac{\rho}{2}|x|^2$ for some $\rho > 0$, then $X$ is the Ornstein-Uhlenbeck (OU) process, $\mu_V$ is Gaussian, $\lambda_1=\rho$, and $\mathrm{Hess}(V)(x)=\rho \mathrm{Id}$ for all $x\in\mathbb{R}^d$.
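For concreteness, the dynamics can be simulated with the Euler-Maruyama scheme. Here is a minimal Python sketch; the quartic potential, step size, and initial condition are illustrative choices, not taken from [CF]:

```python
import numpy as np

def euler_maruyama(grad_V, x0, dt, n_steps, rng):
    """Simulate dX_t = -grad V(X_t) dt + sqrt(2) dB_t by Euler-Maruyama."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(n_steps):
        x += -grad_V(x) * dt + np.sqrt(2 * dt) * rng.standard_normal(x.shape)
        traj.append(x.copy())
    return np.array(traj)

# Illustrative rho-convex potential: V(x) = (rho/2)|x|^2 + sum_i x_i^4,
# so that Hess(V)(x) >= rho Id for all x.
rho, d = 1.0, 10
grad_V = lambda x: rho * x + 4 * x**3
rng = np.random.default_rng(0)
traj = euler_maruyama(grad_V, x0=5.0 * np.ones(d), dt=1e-3, n_steps=20_000, rng=rng)
```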

When $V$ is $\rho$-convex for some $\rho > 0$, namely if $V-\frac{\rho}{2}\left|\cdot\right|^2$ is convex, then the spectrum of $-\mathrm{L}$ is discrete, the spectral gap is an eigenvalue, and $\mathrm{Hess}(V)(x)\geq\rho\mathrm{Id}$ for all $x\in\mathbb{R}^d$.

Trend to equilibrium. Let us denote $\mu_t:=\mathrm{Law}(X_t)$. We know that for every initial law $\mu_0$, \[ \mu_t \xrightarrow[t\to\infty]{\mathrm{d}} \mu_V. \]

Functional inequalities. In order to quantify this trend to equilibrium, we use the total variation distance, the Kantorovich--Wasserstein quadratic cost distance, the Kullback-Leibler relative entropy, and the Fisher information. Recall the formulas \begin{eqnarray*} \mathrm{d}_{\mathrm{TV}}(\nu,\mu) &=&\inf_{\substack{(U,U')\\U\sim\mu,U'\sim\nu}}\mathbb{P}(U\neq U')\\ \mathrm{W}_2(\nu,\mu) &=&\inf_{\substack{(U,U')\\U\sim\mu,U'\sim\nu}}\sqrt{\mathbb{E}(|U-U'|^2)}\\ \mathrm{H}(\nu\mid\mu) &=&\displaystyle\int\log\frac{\mathrm{d}\nu}{\mathrm{d}\mu}\mathrm{d}\nu\\ \mathrm{I}(\nu\mid\mu) &=&\displaystyle\int\Bigl|\nabla\log\frac{\mathrm{d}\nu}{\mathrm{d}\mu}\Bigr|^2\mathrm{d}\nu \end{eqnarray*} where the infima run over all couplings, namely random pairs $(U,U')$ with the prescribed marginals. They take their values in $[0,+\infty]$, but $\mathrm{d}_{\mathrm{TV}}\leq1$. They are comparable, generically or under certain conditions, and these comparisons are known as functional inequalities. The simplest and most well known is the Pinsker or Csiszár-Kullback inequality \[ \mathrm{d}_{\mathrm{TV}}(\nu,\mu)^2\leq 2\mathrm{H}(\nu\mid\mu). \] A more recent and sophisticated functional inequality is the Otto-Villani HWI inequality, valid for all $\nu$ and for $\mu_V=\mathrm{e}^{-V}$ with a $\rho$-convex $V$, $\rho > 0$: \[ \mathrm{H}(\nu\mid\mu_V) \leq\mathrm{W}_2(\nu,\mu_V)\sqrt{\mathrm{I}(\nu\mid\mu_V)} -\frac{\rho}{2}\mathrm{W}_2(\nu,\mu_V)^2. \] See [BGL] for more information. It contains the Talagrand inequality \[ \frac{\rho}{2}\mathrm{W}_2(\nu,\mu_V)^2\leq\mathrm{H}(\nu\mid\mu_V) \] as well as the logarithmic Sobolev inequality \[ 2\rho\mathrm{H}(\nu\mid\mu_V)\leq\mathrm{I}(\nu\mid\mu_V). \] By linearization, the latter implies a Poincaré inequality, which is equivalent to $\lambda_1\geq\rho$.
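As a sanity check, all these inequalities can be verified numerically in the one-dimensional Gaussian case, where the four quantities are available in closed form (total variation is computed by quadrature; the parameter values below are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

# mu = N(0, 1/rho) has a rho-convex potential; nu = N(m, s2) is a test measure.
rho, m, s2 = 2.0, 1.5, 0.4  # illustrative values

H  = 0.5 * (rho * s2 - 1 - np.log(rho * s2)) + 0.5 * rho * m**2   # relative entropy
W2 = np.sqrt(m**2 + (np.sqrt(s2) - 1 / np.sqrt(rho))**2)          # quadratic cost
I  = rho**2 * m**2 + (rho * s2 - 1)**2 / s2                       # Fisher information

gauss = lambda x, mean, var: np.exp(-(x - mean)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
TV = 0.5 * quad(lambda x: abs(gauss(x, m, s2) - gauss(x, 0, 1 / rho)), -20, 20)[0]

assert TV**2 <= 2 * H                          # Pinsker / Csiszar-Kullback
assert rho / 2 * W2**2 <= H                    # Talagrand
assert 2 * rho * H <= I                        # logarithmic Sobolev
assert H <= W2 * np.sqrt(I) - rho / 2 * W2**2  # Otto-Villani HWI
```

For these parameter values, Talagrand and logarithmic Sobolev are nearly saturated, reflecting their sharpness in the Gaussian setting.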

Cutoff phenomenon. Let $S\subset\mathcal{P}(\mathbb{R}^d)$ be an arbitrary non-empty set of probability measures. Let $\eta\in(0,1)$ be an arbitrary fixed threshold which does not depend on $d$. Suppose that there exists a positive constant $\rho$ that may depend on $d$ such that $\mathrm{Hess}(V)(x)\geq\rho\mathrm{Id}$ for all $x$, and that the following curvature product condition holds: \[ \lim_{d\to\infty}\rho T=+\infty \quad\text{where}\quad T := \inf\bigl\{t\in\mathbb{R}_+:\sup_{\mu_0\in S} \mathrm{d}_{\mathrm{TV}}(\mu_t, \mu_V) \leq\eta\bigr\}. \] Then there is cutoff at critical time $T$ in the sense that for all fixed $\varepsilon\in(0,1)$, \begin{eqnarray*} \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{d}_{\mathrm{TV}}(\mu_{t_d},\mu_V) &=& \begin{cases} 1 & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}\\ \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{I}(\mu_{t_d}\mid\mu_V) &=& \begin{cases} +\infty & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}\\ \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{H}(\mu_{t_d}\mid\mu_V) &=& \begin{cases} +\infty & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}\\ \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{W}_2(\mu_{t_d},\mu_V) &=& \begin{cases} +\infty & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}. \end{eqnarray*} It is a high-dimensional phenomenon. Note that $X$, $V$, $S$, $\rho$, and $T$ all depend on $d$, while $\eta$ and $\varepsilon$ do not.

This mathematical formulation expresses the abrupt transition, at the critical time $T$, of the distance or divergence from its maximum value to its minimum value.
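In the OU case everything is explicit: starting from a Dirac mass at $x_0$ with $|x_0|=c\sqrt{d}$, the law $\mu_t$ is Gaussian and $\mathrm{H}(\mu_t\mid\mu_V)$ has a closed form, so the sharpening of the transition around $T=\frac{\log(d)}{2\rho}$ as $d$ grows can be observed directly. A minimal sketch with the illustrative choice $\rho=c=1$:

```python
import numpy as np

# OU case V(x) = (rho/2)|x|^2, started from a Dirac mass at x0 with |x0| = c sqrt(d):
# mu_t = N(x0 e^{-rho t}, (1 - e^{-2 rho t})/rho Id), so H(mu_t | mu_V) is explicit.
def entropy_ou(t, d, rho=1.0, c=1.0):
    a = np.exp(-2 * rho * t)
    return 0.5 * d * (-a - np.log1p(-a)) + 0.5 * rho * c**2 * d * a

for d in [10, 10**3, 10**5]:
    T = np.log(d) / 2  # cutoff time log(d)/(2 rho) for rho = 1
    for factor in [0.8, 1.0, 1.2]:
        print(f"d = {d:>6}, t = {factor} T, H = {entropy_ou(factor * T, d):.3g}")
```

As $d$ grows, the printed entropy blows up before $T$ and vanishes after $T$, which is exactly the cutoff profile stated above.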

Proof. The Bakry-Émery version of the Lichnerowicz inequality gives $\lambda_1\geq\rho$, thus \begin{equation} \lim_{d\to\infty}\lambda_1 T=+\infty, \end{equation} which is the Peres product condition in Corollary 1 of [S], hence the cutoff in total variation distance. It remains to prove cutoff in the other cases. Let us start with the relative entropy lower bound. The Pinsker or Csiszár-Kullback inequality gives \begin{equation} \varliminf_{d\to\infty} \sup_{\mu_0\in S} \mathrm{H}(\mu_{(1-\frac{\varepsilon}{2})T}\mid\mu_V) \geq\frac{\eta^2}{2}. \end{equation} On the other hand, by the Bakry-Émery curvature theorem, for all $t'\geq t\geq0$, \[ \mathrm{H}(\mu_{t'}\mid\mu_V)\leq\mathrm{e}^{-2\rho(t'-t)}\mathrm{H}(\mu_t\mid\mu_V). \] Taking $t' = (1-\frac{\varepsilon}{2})T$ and $t = (1-\varepsilon)T$, and using $\lim_{d\to\infty}\rho T=+\infty$, we get \begin{equation} \varliminf_{d\to\infty}\sup_{\mu_0\in S}\mathrm{H}(\mu_{(1-\varepsilon)T}\mid\mu_V) \geq\varliminf_{d\to\infty}\mathrm{e}^{\varepsilon\rho T} \frac{\eta^2}{2}=+\infty. \end{equation} For the upper bound, a careful reading of the proof of Theorem 1 in [S] shows that \begin{equation} \varlimsup_{d\to\infty} \sup_{\mu_0\in S} \mathrm{H}\bigl(\mu_{(1+\frac{\varepsilon}{2})T}\mid\mu_V\bigr) \leq C_\varepsilon < \infty. \end{equation} Using the exponential decay of the relative entropy and $\lim_{d\to\infty}\rho T=+\infty$ again, we get \begin{equation} \varlimsup_{d\to\infty}\sup_{\mu_0\in S}\mathrm{H}(\mu_{(1+\varepsilon)T}\mid\mu_V) \leq\varlimsup_{d\to\infty}\mathrm{e}^{-\varepsilon\rho T} C_\varepsilon =0. \end{equation}
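The first step, $\lambda_1\geq\rho$, can be checked numerically in dimension one: the ground-state transform $f\mapsto f\mathrm{e}^{-V/2}$ sends $-\mathrm{L}$ unitarily to the Schrödinger operator $-\Delta+\frac{1}{4}|\nabla V|^2-\frac{1}{2}\Delta V$, whose two lowest eigenvalues are $0$ and $\lambda_1$. A sketch with the illustrative $\rho$-convex potential $V(x)=\frac{\rho}{2}x^2+x^4$:

```python
import numpy as np

rho = 1.0
Vp  = lambda x: rho * x + 4 * x**3  # V'  for V(x) = rho x^2/2 + x^4
Vpp = lambda x: rho + 12 * x**2     # V'' (so Hess V >= rho everywhere)

# Discretize the Schrodinger operator -d^2/dx^2 + (V'^2/4 - V''/2) on a grid
# with Dirichlet boundary conditions; its lowest eigenvalues are 0, lambda_1, ...
n, box = 1000, 5.0
x = np.linspace(-box, box, n)
h = x[1] - x[0]
U = Vp(x)**2 / 4 - Vpp(x) / 2
Hmat = (np.diag(2 / h**2 + U)
        - np.diag(np.ones(n - 1) / h**2, 1)
        - np.diag(np.ones(n - 1) / h**2, -1))
e = np.linalg.eigvalsh(Hmat)  # sorted ascending
lambda1 = e[1] - e[0]         # spectral gap (e[0] is approximately 0)
assert lambda1 >= rho         # Bakry-Emery / Lichnerowicz bound
print(f"lambda_1 = {lambda1:.4f} >= rho = {rho}")
```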

For the Wasserstein distance, the upper bound comes from the one for the relative entropy via the Talagrand inequality $\rho\mathrm{W}_2(\mu_t,\mu_V)^2 \leq 2\mathrm{H}(\mu_t\mid\mu_V)$, while the lower bound comes from the Wasserstein regularization inequality (see [BGL]) \[ \mathrm{H}(\mu_t\mid\mu_V) \leq \frac{\rho\mathrm{e}^{-2\rho t}}{1-\mathrm{e}^{-2\rho t}}\mathrm{W}_2(\mu_0, \mu_V)^2 \leq\frac{1}{2t}\mathrm{W}_2(\mu_0, \mu_V)^2, \] where the second inequality follows from $\mathrm{e}^{u}-1\geq u$, used with $t = \varepsilon T$ and combined with the Markov semigroup property.
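Both inequalities of this chain can be checked on the one-dimensional OU flow started from a Dirac mass, for which $\mathrm{H}$ and $\mathrm{W}_2$ are explicit (the parameters are illustrative):

```python
import numpy as np

rho, x0 = 1.0, 3.0  # 1D OU flow started from a Dirac mass at x0 (illustrative)

def H_flow(t):
    # mu_t = N(x0 e^{-rho t}, (1 - e^{-2 rho t})/rho), mu_V = N(0, 1/rho)
    m, s2 = x0 * np.exp(-rho * t), (1 - np.exp(-2 * rho * t)) / rho
    return 0.5 * (rho * s2 - 1 - np.log(rho * s2)) + 0.5 * rho * m**2

W2sq_0 = x0**2 + 1 / rho  # W2(delta_{x0}, mu_V)^2: the coupling is deterministic

for t in [0.1, 0.5, 1.0, 2.0]:
    a = np.exp(-2 * rho * t)
    assert H_flow(t) <= rho * a / (1 - a) * W2sq_0  # Wasserstein regularization
    assert rho * a / (1 - a) <= 1 / (2 * t)         # since e^u - 1 >= u
```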

Finally, for the Fisher information, the lower bound comes from the one for the relative entropy via the logarithmic Sobolev inequality $2\rho\mathrm{H}(\mu_t\mid\mu_V) \leq \mathrm{I}(\mu_t\mid \mu_V)$, while to upper bound $\mathrm{I}(\mu_{t_1}\mid\mu_V)$, we write, for all $0 < t_0 < t_1$, \[ \mathrm{H}(\mu_{t_0}\mid\mu_V)-\mathrm{H}(\mu_{t_1}\mid\mu_V) =\int_{t_0}^{t_1}\mathrm{I}(\mu_s\mid\mu_V)\mathrm{d}s \geq(t_1-t_0)\mathrm{I}(\mu_{t_1}\mid\mu_V), \] where we have used the de Bruijn identity and the monotonicity of $t\mapsto\mathrm{I}(\mu_t\mid\mu_V)$. Combining this with the exponential decay of $\mathrm{H}$ between times $1$ and $t_0$ and the Wasserstein regularization inequality at time $1$ gives, when $t_0 > 1$, the regularization \[ \mathrm{I}(\mu_{t_1}\mid\mu_V) \leq\frac{\mathrm{H}(\mu_{t_0}\mid\mu_V)}{t_1-t_0} \leq\frac{\mathrm{e}^{-2\rho(t_0-1)}}{2(t_1-t_0)}\mathrm{W}_2(\mu_0, \mu_V)^2. \] This proof blends arguments from [CSC], [S], and [CF]. It is inspired by what is done in [BCL], with a simpler regularization procedure. Note that $\mathrm{H}(\mu_0\mid\mu_V)=+\infty$ and $\mathrm{I}(\mu_0\mid\mu_V)=+\infty$ when $\mu_0$ is a Dirac mass, which is not the case for $\mathrm{W}_2$. The $\rho$-convexity of $V$ is used several times: exponential decay of $\mathrm{H}$, monotonicity of $\mathrm{I}$, regularization with $\mathrm{W}_2$.
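Again, the three ingredients (exponential decay of $\mathrm{H}$, monotonicity of $\mathrm{I}$, de Bruijn regularization) can be checked against the Gaussian closed forms of the one-dimensional OU flow; a minimal sketch with illustrative parameters:

```python
import numpy as np

rho, x0 = 1.0, 4.0  # 1D OU flow started from a Dirac mass at x0 (illustrative)

def moments(t):
    # mu_t = N(x0 e^{-rho t}, (1 - e^{-2 rho t})/rho) along the OU semigroup
    return x0 * np.exp(-rho * t), (1 - np.exp(-2 * rho * t)) / rho

def H(t):  # relative entropy H(mu_t | mu_V), Gaussian closed form
    m, s2 = moments(t)
    return 0.5 * (rho * s2 - 1 - np.log(rho * s2)) + 0.5 * rho * m**2

def I(t):  # Fisher information I(mu_t | mu_V), Gaussian closed form
    m, s2 = moments(t)
    return rho**2 * m**2 + (rho * s2 - 1)**2 / s2

ts = np.linspace(0.5, 5.0, 200)
Hs, Is = np.array([H(t) for t in ts]), np.array([I(t) for t in ts])
assert np.all(np.diff(Is) <= 1e-12)  # monotonicity of I along the flow
assert np.all(Hs[1:] <= np.exp(-2 * rho * np.diff(ts)) * Hs[:-1] + 1e-12)  # decay
t0, t1 = 1.0, 2.0
assert I(t1) <= H(t0) / (t1 - t0)    # de Bruijn regularization bound
assert 2 * rho * H(t1) <= I(t1)      # logarithmic Sobolev along the flow
```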

Rigidity. Following Cheng and Zhou [CZ] and De Philippis and Figalli [DPF], in the special case $\rho=\lambda_1$, the eigenfunctions of $-\mathrm{L}$ associated to $\lambda_1$ are affine, and $\mu_V$ factorizes into the product of a 1D Gaussian factor of variance $\frac{1}{\rho}$ with a log-concave factor whose potential is $\rho$-convex. This is known as rigidity. In this case, it is shown in [CF] that if $S=B(m_V,c\sqrt{d})$ or $S=m_V+[-c,c]^d$, where $m_V$ is the mean of $\mu_V$ and $c > 0$ is an arbitrary constant, then \[ T\asymp\frac{\log(d)}{2\rho}. \] This is for instance the case when \[ V(x)=\frac{\rho}{2}|x|^2+W(x), \quad x\in\mathbb{R}^d, \] where $\rho > 0$ and $W:\mathbb{R}^d\to\mathbb{R}$ is convex and translation invariant in the direction $(1,\ldots,1)\in\mathbb{R}^d$, namely for all $u\in\mathbb{R}$ and all $x\in\mathbb{R}^d$, $W(x+u(1,\ldots,1))=W(x)$. For example, one can take, for some convex even function $h:\mathbb{R}\to\mathbb{R}$, \[ W(x)=\sum_{i < j}h(x_i-x_j),\quad x\in\mathbb{R}^d. \] If $\pi$ and $\pi^\perp$ are the orthogonal projections onto $\mathbb{R}(1,\ldots,1)$ and its orthogonal complement, respectively, then $|x|^2=|\pi(x)|^2+|\pi^\perp(x)|^2$, while the translation invariance of $W$ in the direction $(1/\sqrt{d},\ldots,1/\sqrt{d})$ gives $W(x)=W(\pi(x)+\pi^\perp(x))=W(\pi^\perp(x))$, therefore \[ \mathrm{e}^{-V(x)} =\mathrm{e}^{-\frac{\rho}{2}|\pi(x)|^2}\mathrm{e}^{-(W(\pi^\perp(x))+\frac{\rho}{2}|\pi^\perp(x)|^2)}, \] which means that $\mu_V$ is, up to a rotation, a product measure, and splits into a 1D Gaussian factor $\mathcal{N}(0,\frac{1}{\rho})$ and a log-concave factor with a $\rho$-convex potential.
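The mechanism behind this factorization is elementary and easy to test numerically; a sketch with the illustrative choice $h(x)=x^2$:

```python
import numpy as np

d, rho = 6, 1.0
h = lambda t: t**2  # illustrative convex even choice of h
W = lambda x: sum(h(x[i] - x[j]) for i in range(d) for j in range(i + 1, d))

rng = np.random.default_rng(1)
x, u, ones = rng.standard_normal(d), 0.7, np.ones(d)
pi = lambda x: (x @ ones / d) * ones  # orthogonal projection onto R(1,...,1)

assert np.isclose(W(x + u * ones), W(x))  # translation invariance of W
assert np.isclose(W(x), W(x - pi(x)))     # W only depends on pi_perp(x)
assert np.isclose(x @ x, pi(x) @ pi(x) + (x - pi(x)) @ (x - pi(x)))  # Pythagoras
```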

This covers as a special degenerate case the Dyson-OU (DOU) process studied in [CL,CF], corresponding to the choice \[ h(x)= \begin{cases} -\beta\log(x) & \text{if $x > 0$}\\ +\infty & \text{if $x\leq0$} \end{cases} \quad\text{for an arbitrary constant $\beta\geq0$,} \] the degeneracy being equivalent to defining the DOU process on the convex domain $\{x\in\mathbb{R}^d:x_1 > \cdots>x_d\}$ instead of on the whole space $\mathbb{R}^d$, in order to exploit convexity. In this case, the symmetric Hermite polynomial $x_1+\cdots+x_d$ is an eigenfunction associated to $\lambda_1$.
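From the formula for $V$, the DOU drift is $-\rho x_i+\beta\sum_{j\neq i}(x_i-x_j)^{-1}$ on the ordered domain. A naive Euler-Maruyama sketch (parameters and the crude handling of the ordering are illustrative; a careful scheme would need more work near collisions):

```python
import numpy as np

def dou_step(x, rho, beta, dt, rng):
    """One naive Euler-Maruyama step of the DOU process on x_1 > ... > x_d.
    In continuous time the particles never collide for beta large enough;
    here a small dt keeps the naive scheme ordered in practice (illustrative
    only, no guarantee near collisions)."""
    diff = x[:, None] - x[None, :]
    np.fill_diagonal(diff, np.inf)  # drop the j = i term (1/inf = 0)
    drift = -rho * x + beta * np.sum(1.0 / diff, axis=1)
    return x + drift * dt + np.sqrt(2 * dt) * rng.standard_normal(x.size)

rng = np.random.default_rng(2)
d, rho, beta = 8, 1.0, 2.0
x = np.sort(rng.standard_normal(d))[::-1] * 5.0  # ordered initial condition
for _ in range(10_000):
    x = dou_step(x, rho, beta, dt=1e-4, rng=rng)
```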

Geometry. These cutoff and rigidity estimates extend, beyond Euclidean space, to positively curved diffusions on Riemannian manifolds, see [CF] for more information.

Open question. What about cutoff via stability, beyond rigidity?

Further reading.

  • [SC] Laurent Saloff-Coste
    Precise estimates on the rate at which certain diffusions tend to equilibrium
    Mathematische Zeitschrift (1994)
  • [CSC] Guan-Yu Chen and Laurent Saloff-Coste
    The cutoff phenomenon for ergodic Markov processes
    Electronic Journal of Probability (2008)
    See also https://djalil.chafai.net/blog/2024/01/27/cutoff-for-markov-processes/
  • [BGL] Dominique Bakry, Ivan Gentil, and Michel Ledoux
    Analysis and geometry of Markov diffusion operators
    Springer (2014)
  • [CZ] Xu Cheng and Detang Zhou
    Eigenvalues of the drifted Laplacian on complete metric measure spaces
    Communications in Contemporary Mathematics (2017)
  • [DPF] Guido De Philippis and Alessio Figalli
    Rigidity and stability of Caffarelli's log-concave perturbation theorem
    Nonlinear Analysis Theory Methods and Applications (2017)
  • [CL] Djalil Chafaï and Joseph Lehec
    On Poincaré and logarithmic Sobolev inequalities for a class of singular Gibbs measures
    Geometric aspects of functional analysis. Vol. I
    Lecture Notes in Mathematics, Springer (2020)
  • [BCL] Jeanne Boursier, Djalil Chafaï, and Cyril Labbé
    Universal cutoff for Dyson Ornstein Uhlenbeck process
    Probability Theory and Related Fields (2023)
  • [CF] Djalil Chafaï and Max Fathi
    On cutoff via rigidity for high dimensional curved diffusions
    arXiv:2412.15969v2 (2024)
  • [S] Justin Salez
    Cutoff for non-negatively curved diffusions
    arXiv:2501.01304v1 (2025)

Group photo of an ANR Conviviality meeting in Lyon, June 2025, including Justin Salez, Max Fathi, Ivan Gentil, and Joseph Lehec.