
This post is inspired by the last part of a recent work [CF], in collaboration with Max Fathi, on the cutoff phenomenon for curved diffusions in high dimension.
Diffusion. Let ${(X_t)}_{t\in\mathbb{R}_+}$ be the solution of the stochastic differential equation \[ \mathrm{d}X_t = -\nabla V(X_t)\mathrm{d}t + \sqrt{2}\mathrm{d}B_t \] where ${(B_t)}_{t\geq0}$ is a standard Brownian motion in $\mathbb{R}^d$, $V:\mathbb{R}^d\to\mathbb{R}$ is strictly convex and $\mathcal{C}^2$ with $\lim_{|x|\to\infty}V(x)=+\infty$, and $\left|\cdot\right|$ is the Euclidean norm of $\mathbb{R}^d$. In Statistical Physics, this drift-diffusion is also known as an overdamped Langevin process with potential $V$. By adding a constant to $V$, we can assume without loss of generality that $\mu_V=\mathrm{e}^{-V}$ namely \[ \mathrm{d}\mu_V(x)=\mathrm{e}^{-V(x)}\mathrm{d}x \] is a probability measure. It is the unique invariant law of the process, and it is moreover reversible. The associated infinitesimal generator is the linear differential operator \[ \mathrm{L} = \Delta - \nabla V \cdot \nabla \] acting on smooth functions. It is symmetric in $L^2(\mu_V)$, and its kernel is the set of constant functions. Moreover, its spectrum is included in $(-\infty,-\lambda_1]\cup\{0\}$, for some $\lambda_1 > 0$ called the spectral gap of $\mathrm{L}$. When $V(x)=\frac{\rho}{2}|x|^2$ for some $\rho > 0$ then $X$ is the Ornstein-Uhlenbeck (OU) process, $\mu_V$ is Gaussian, $\lambda_1=\rho$, and $\mathrm{Hess}(V)(x)=\rho \mathrm{Id}$ for all $x\in\mathbb{R}^d$.
When $V$ is $\rho$-convex for some $\rho > 0$, namely if $V-\frac{\rho}{2}\left|\cdot\right|^2$ is convex, then the spectrum of $-\mathrm{L}$ is discrete, the spectral gap is an eigenvalue, and $\mathrm{Hess}(V)(x)\geq\rho\mathrm{Id}$ for all $x\in\mathbb{R}^d$.
Trend to the equilibrium. Let us denote $\mu_t:=\mathrm{Law}(X_t)$. We know that for all $\mu_0$, \[ \mu_t \xrightarrow[t\to\infty]{\mathrm{d}} \mu_V. \]
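The dynamics and its trend to equilibrium can be explored numerically. The following is a minimal sketch, not taken from [CF]: it uses an Euler-Maruyama discretization in dimension 1 with the OU potential $V(x)=\frac{\rho}{2}x^2$, and the function names, step size, horizon, and number of paths are illustrative assumptions. For large $t$ the empirical mean and variance should be close to those of $\mu_V=\mathcal{N}(0,\frac{1}{\rho})$, up to discretization and Monte Carlo errors.

```python
import math
import random


def euler_maruyama(grad_V, x0, dt, n_steps, rng):
    """Terminal value of one Euler-Maruyama path of dX = -V'(X) dt + sqrt(2) dB in 1D."""
    x = x0
    for _ in range(n_steps):
        x = x - grad_V(x) * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
    return x


def sample_terminal_values(rho=2.0, x0=3.0, dt=0.01, n_steps=300, n_paths=2000, seed=0):
    """Independent approximate samples of X_t, t = n_steps * dt, for V(x) = rho x^2 / 2."""
    rng = random.Random(seed)

    def grad_V(x):
        # Gradient of the (assumed) OU potential.
        return rho * x

    return [euler_maruyama(grad_V, x0, dt, n_steps, rng) for _ in range(n_paths)]


def mean_and_variance(xs):
    """Empirical mean and (biased) empirical variance of a list of floats."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v


if __name__ == "__main__":
    mean, var = mean_and_variance(sample_terminal_values())
    print(f"empirical mean {mean:.3f}, empirical variance {var:.3f}, target variance {1 / 2.0:.3f}")
```

Replacing `grad_V` by the gradient of another strictly convex potential gives the same kind of experiment beyond the OU case, at the price of losing the explicit Gaussian target.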
Functional inequalities. In order to quantify this trend to the equilibrium, we use the total variation distance, the Kantorovich-Wasserstein quadratic cost distance, the Kullback-Leibler relative entropy, and the Fisher information. Recall the formulas \begin{eqnarray*} \mathrm{d}_{\mathrm{TV}}(\nu,\mu) &=&\inf_{\substack{(U,V)\\U\sim\mu,V\sim\nu}}\mathbb{P}(U\neq V)\\ \mathrm{W}_2(\nu,\mu) &=&\inf_{\substack{(U,V)\\U\sim\mu,V\sim\nu}}\sqrt{\mathbb{E}(|U-V|^2)}\\ \mathrm{H}(\nu\mid\mu) &=&\displaystyle\int\log\frac{\mathrm{d}\nu}{\mathrm{d}\mu}\mathrm{d}\nu\\ \mathrm{I}(\nu\mid\mu) &=&\displaystyle\int\Bigl|\nabla\log\frac{\mathrm{d}\nu}{\mathrm{d}\mu}\Bigr|^2\mathrm{d}\nu \end{eqnarray*} They take their values in $[0,+\infty]$, but $\mathrm{d}_{\mathrm{TV}}\leq1$. They are comparable, generically or under certain conditions, and these comparisons are known as functional inequalities. The simplest and best known is the Pinsker or Csiszár-Kullback inequality \[ \mathrm{d}_{\mathrm{TV}}(\nu,\mu)^2\leq 2\mathrm{H}(\nu\mid\mu). \] A more recent and sophisticated functional inequality is the Otto-Villani HWI inequality, valid for any $\nu$ and for $\mu_V=\mathrm{e}^{-V}$ with a $\rho$-convex $V$, $\rho > 0$: \[ \mathrm{H}(\nu\mid\mu_V) \leq\mathrm{W}_2(\nu,\mu_V)\sqrt{\mathrm{I}(\nu\mid\mu_V)} -\frac{\rho}{2}\mathrm{W}_2(\nu,\mu_V)^2. \] See [BGL] for more information. It contains the Talagrand inequality \[ \frac{\rho}{2}\mathrm{W}_2(\nu,\mu_V)^2\leq\mathrm{H}(\nu\mid\mu_V) \] as well as the logarithmic Sobolev inequality \[ 2\rho\mathrm{H}(\nu\mid\mu_V)\leq\mathrm{I}(\nu\mid\mu_V). \] By linearization, the latter implies a Poincaré inequality which is equivalent to $\lambda_1\geq\rho$.
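These inequalities can be sanity-checked on an explicit family. The sketch below is an illustration under stated assumptions, not part of [CF] or [BGL]: it takes $\mu_V=\mathcal{N}(0,\frac{1}{\rho}\mathrm{Id})$ (the OU case) and $\nu=\mathcal{N}(m,s^2\mathrm{Id})$, for which $\mathrm{H}$, $\mathrm{W}_2$, and $\mathrm{I}$ have closed forms obtained by direct Gaussian computation. The function names and the tolerance `tol` are hypothetical choices. When $s^2=\frac{1}{\rho}$ (pure translation), the Talagrand, logarithmic Sobolev, and HWI inequalities all become equalities.

```python
import math


def gaussian_functionals(d, rho, m_norm, s):
    """Return (H, W2^2, I) of nu = N(m, s^2 Id) relative to mu = N(0, Id / rho) in R^d.

    m_norm is the Euclidean norm |m| of the mean of nu.
    """
    sigma = 1.0 / math.sqrt(rho)
    # Relative entropy between two isotropic Gaussians.
    H = 0.5 * (rho * (d * s * s + m_norm ** 2) - d - d * math.log(rho * s * s))
    # Squared quadratic Wasserstein distance (the covariances commute).
    W2_sq = m_norm ** 2 + d * (s - sigma) ** 2
    # Fisher information: nu-expectation of |(rho - 1/s^2)(x - m) + rho m|^2.
    I = d * s * s * (rho - 1.0 / (s * s)) ** 2 + (rho * m_norm) ** 2
    return H, W2_sq, I


def check_inequalities(d, rho, m_norm, s, tol=1e-9):
    """Booleans (Talagrand, log-Sobolev, HWI) for the Gaussian pair above."""
    H, W2_sq, I = gaussian_functionals(d, rho, m_norm, s)
    talagrand = 0.5 * rho * W2_sq <= H + tol
    log_sobolev = 2.0 * rho * H <= I + tol
    hwi = H <= math.sqrt(W2_sq * I) - 0.5 * rho * W2_sq + tol
    return talagrand, log_sobolev, hwi


if __name__ == "__main__":
    for params in [(1, 1.0, 0.0, 2.0), (3, 2.0, 1.5, 0.4), (10, 0.5, 4.0, 1.0)]:
        print(params, check_inequalities(*params))
```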
Cutoff phenomenon. Let $S\subset\mathcal{P}(\mathbb{R}^d)$ be an arbitrary non-empty set of probability measures. Let $\eta\in(0,1)$ be an arbitrary fixed threshold which does not depend on $d$. Suppose that there exists a positive constant $\rho$ that may depend on $d$ such that $\mathrm{Hess}(V)(x)\geq\rho\mathrm{Id}$ for all $x$, and that the following curvature product condition holds: \[ \lim_{d\to\infty}\rho T=+\infty \quad\text{where}\quad T := \inf\bigl\{t\in\mathbb{R}_+:\sup_{\mu_0\in S} \mathrm{d}_{\mathrm{TV}}(\mu_t, \mu_V) \leq\eta\bigr\}. \] Then there is cutoff at critical time $T$ in the sense that for all fixed $\varepsilon\in(0,1)$, \begin{eqnarray*} \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{d}_{\mathrm{TV}}(\mu_{t_d},\mu_V) &=& \begin{cases} 1 & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}\\ \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{I}(\mu_{t_d}\mid\mu_V) &=& \begin{cases} +\infty & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}\\ \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{H}(\mu_{t_d}\mid\mu_V) &=& \begin{cases} +\infty & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}\\ \lim_{d\to\infty} \sup_{\mu_0\in S} \mathrm{W}_2(\mu_{t_d},\mu_V) &=& \begin{cases} +\infty & \text{if $t_d=(1-\varepsilon)T$}\\ 0 & \text{if $t_d=(1+\varepsilon)T$} \end{cases}. \end{eqnarray*} It is a high-dimensional phenomenon. Note that $X$, $V$, $\rho$, $S$, and $T$ depend on $d$.
This mathematical formulation is a way to express the abrupt transition at critical time $T$ from the maximum value to the minimum value of the distance or divergence.
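The abrupt transition can be observed on the OU process, where everything is explicit. The following sketch rests on several assumptions that are not in the statement above: $V(x)=\frac{\rho}{2}|x|^2$, a Dirac initial condition at distance $c\sqrt{d}$ from the origin, and $\frac{\log(d)}{2\rho}$ used as a stand-in for the critical time (the order of magnitude quoted in the Rigidity paragraph), rather than the exact $T$ defined via total variation. By the Mehler formula, $\mu_t=\mathcal{N}(\mathrm{e}^{-\rho t}x_0,\frac{1-\mathrm{e}^{-2\rho t}}{\rho}\mathrm{Id})$, so the relative entropy is a closed-form function of $d$, $\rho$, $c$, and $t$. The function names are hypothetical.

```python
import math


def ou_entropy_from_dirac(d, rho, c, t):
    """H(mu_t | mu_V) for the OU process in R^d started at x0 with |x0| = c sqrt(d), t > 0."""
    a = math.exp(-2.0 * rho * t)
    # Covariance part (d/2)(-a - log(1 - a)) plus mean part rho |x0|^2 a / 2.
    return 0.5 * d * (-a - math.log1p(-a)) + 0.5 * rho * c * c * d * a


def entropy_around_critical_time(d, rho=1.0, c=1.0, eps=0.5):
    """Entropy at (1 - eps) T and (1 + eps) T with the proxy T = log(d) / (2 rho)."""
    T = math.log(d) / (2.0 * rho)
    before = ou_entropy_from_dirac(d, rho, c, (1.0 - eps) * T)
    after = ou_entropy_from_dirac(d, rho, c, (1.0 + eps) * T)
    return before, after


if __name__ == "__main__":
    for d in (10, 10**3, 10**5, 10**7):
        before, after = entropy_around_critical_time(d)
        print(f"d={d:>8}  before: {before:12.3f}  after: {after:.3e}")
```

In this example the entropy slightly before the proxy time grows with $d$, while the entropy slightly after it shrinks, which is the window picture described above.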
Proof. The Bakry-Émery version of the Lichnerowicz inequality gives $\lambda_1\geq\rho$, thus \begin{equation} \lim_{d\to\infty}\lambda_1 T=+\infty, \end{equation} which is the Peres product condition in Corollary 1 of [S], hence the cutoff in total variation distance. It remains to prove cutoff for the other cases. Let us start with the relative entropy lower bound. The Pinsker or Csiszár-Kullback inequality gives \begin{equation} \varliminf_{d\to\infty} \sup_{\mu_0\in S} \mathrm{H}(\mu_{(1-\frac{\varepsilon}{2})T}\mid\mu_V) \geq\frac{\eta^2}{2}. \end{equation} On the other hand, by the Bakry-Émery curvature theorem, for all $t'\geq t\geq0$, \[ \mathrm{H}(\mu_{t'}\mid\mu_V)\leq\mathrm{e}^{-2\rho(t'-t)}\mathrm{H}(\mu_t\mid\mu_V). \] Taking $t' = (1-\frac{\varepsilon}{2})T$, $t = (1-\varepsilon)T$, so that $2\rho(t'-t)=\varepsilon\rho T$, and using $\lim_{d\to\infty}\rho T=+\infty$, we get \begin{equation} \varliminf_{d\to\infty}\mathrm{H}(\mu_{(1-\varepsilon)T}\mid\mu_V) \geq\mathrm{e}^{\varepsilon\varlimsup_{d\to\infty}\rho T} \frac{\eta^2}{2}=+\infty. \end{equation} For the upper bound, a careful reading of the proof of Theorem 1 in [S] shows that \begin{equation} \varlimsup_{d\to\infty} \mathrm{H}\bigl(\mu_{(1+\frac{\varepsilon}{2})T}\mid\mu_V\bigr) \leq C_\varepsilon < \infty. \end{equation} Using the exponential decay of the relative entropy and $\lim_{d\to\infty}\rho T=+\infty$, we get \begin{equation} \varlimsup_{d\to\infty}\mathrm{H}(\mu_{(1+\varepsilon)T}\mid\mu_V) \leq\mathrm{e}^{-\varepsilon\varliminf_{d\to\infty}\rho T} C_\varepsilon =0. \end{equation}
For the Wasserstein distance, the upper bound comes from the one for the relative entropy via the Talagrand inequality $\rho\mathrm{W}_2(\mu_t,\mu_V)^2 \leq 2\mathrm{H}(\mu_t\mid\mu_V)$, while the lower bound comes from the Wasserstein regularization inequality (see [BGL]) \[ \mathrm{H}(\mu_t\mid\mu_V) \leq \frac{\rho\mathrm{e}^{-2\rho t}}{1-\mathrm{e}^{-2\rho t}}\mathrm{W}_2(\mu_0, \mu_V)^2 \leq\frac{1}{2t}\mathrm{W}_2(\mu_0, \mu_V)^2 \] used with $t = \varepsilon T$ and combined with the Markov semigroup property.
Finally, for the Fisher information, the lower bound comes from the one for the relative entropy via the logarithmic Sobolev inequality $2\rho\mathrm{H}(\mu_t\mid\mu_V) \leq \mathrm{I}(\mu_t\mid \mu_V)$, while to upper bound $\mathrm{I}(\mu_{t_1}\mid\mu_V)$, we write, for all $0 < t_0 < t_1$, \[ \mathrm{H}(\mu_{t_0}\mid\mu_V)-\mathrm{H}(\mu_{t_1}\mid\mu_V) =\int_{t_0}^{t_1}\mathrm{I}(\mu_s\mid\mu_V)\mathrm{d}s \geq(t_1-t_0)\mathrm{I}(\mu_{t_1}\mid\mu_V) \] where we have used the monotonicity of $\mathrm{I}$, which gives, when $t_0 > 1$, the regularization \[ \mathrm{I}(\mu_{t_1}\mid\mu_V) \leq\frac{\mathrm{H}(\mu_{t_0}\mid\mu_V)}{t_1-t_0} \leq\frac{\mathrm{e}^{-\rho(t_0-1)}}{2(t_1-t_0)}\mathrm{W}_2(\mu_0, \mu_V)^2. \] This proof blends arguments from [CSC], [S], and [CF]. It is inspired by what is done in [BCL], with a simpler regularization procedure. Note that $\mathrm{H}(\mu_0\mid\mu_V)=+\infty$ and $\mathrm{I}(\mu_0\mid\mu_V)=+\infty$ when $\mu_0$ is a Dirac mass, which is not the case for $\mathrm{W}_2$. The $\rho$-convexity of $V$ is used several times: exponential decay of $\mathrm{H}$, monotonicity of $\mathrm{I}$, regularization with $\mathrm{W}_2$.
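The two regularization steps of the proof can be checked on the OU process started from a Dirac mass, a case where $\mathrm{H}(\mu_0\mid\mu_V)=+\infty$ but $\mathrm{W}_2(\mu_0,\mu_V)$ is finite. This is only a sketch under assumptions: $V(x)=\frac{\rho}{2}|x|^2$, $\mu_0=\delta_{x_0}$, closed forms obtained from the Gaussian law of $X_t$, and hypothetical function names. It tests the chain $\mathrm{H}(\mu_t\mid\mu_V)\leq\frac{\rho\mathrm{e}^{-2\rho t}}{1-\mathrm{e}^{-2\rho t}}\mathrm{W}_2(\mu_0,\mu_V)^2\leq\frac{1}{2t}\mathrm{W}_2(\mu_0,\mu_V)^2$ and the bound $\mathrm{I}(\mu_{t_1}\mid\mu_V)\leq\frac{\mathrm{H}(\mu_{t_0}\mid\mu_V)}{t_1-t_0}$.

```python
import math


def ou_entropy_and_fisher(d, rho, x0_norm, t):
    """(H, I) of mu_t relative to mu_V for the OU process in R^d started at x0, t > 0."""
    a = math.exp(-2.0 * rho * t)
    one_minus_a = -math.expm1(-2.0 * rho * t)
    H = 0.5 * d * (-a - math.log(one_minus_a)) + 0.5 * rho * x0_norm ** 2 * a
    I = d * rho * a * a / one_minus_a + (rho * x0_norm) ** 2 * a
    return H, I


def w2_sq_dirac_to_equilibrium(d, rho, x0_norm):
    """W_2(delta_{x0}, N(0, Id / rho))^2 = |x0|^2 + d / rho."""
    return x0_norm ** 2 + d / rho


def wasserstein_regularization_holds(d, rho, x0_norm, t, rel_tol=1e-12):
    """Check H(mu_t | mu_V) <= sharp bound <= W_2^2 / (2 t)."""
    H, _ = ou_entropy_and_fisher(d, rho, x0_norm, t)
    w2_sq = w2_sq_dirac_to_equilibrium(d, rho, x0_norm)
    a = math.exp(-2.0 * rho * t)
    sharp = rho * a / (-math.expm1(-2.0 * rho * t)) * w2_sq
    crude = w2_sq / (2.0 * t)
    return H <= sharp * (1.0 + rel_tol) and sharp <= crude * (1.0 + rel_tol)


def fisher_regularization_holds(d, rho, x0_norm, t0, t1):
    """Check I(mu_{t1} | mu_V) <= H(mu_{t0} | mu_V) / (t1 - t0) for 0 < t0 < t1."""
    H0, _ = ou_entropy_and_fisher(d, rho, x0_norm, t0)
    _, I1 = ou_entropy_and_fisher(d, rho, x0_norm, t1)
    return I1 <= H0 / (t1 - t0)


if __name__ == "__main__":
    print(wasserstein_regularization_holds(5, 1.0, 2.0, 0.5))
    print(fisher_regularization_holds(5, 1.0, 2.0, 0.5, 1.0))
```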
Rigidity. Following Cheng and Zhou [CZ] or De Philippis and Figalli [DPF], in the special case $\rho=\lambda_1$, the eigenfunctions of $-\mathrm{L}$ associated to $\lambda_1$ are affine and $\mu_V$ factorizes into the product of a 1D Gaussian factor of variance $\frac{1}{\rho}$ with a $\rho$-convex factor. This is known as rigidity. In this case, it is shown in [CF] that if $S=B(m_V,c\sqrt{d})$ or $S=m_V+[-c,c]^d$ where $m_V$ is the mean of $\mu_V$ and $c > 0$ is an arbitrary constant, then \[ T\asymp\frac{\log(d)}{2\rho}. \] This is for instance the case when \[ V(x)=\frac{\rho}{2}|x|^2+W(x), \quad x\in\mathbb{R}^d, \] where $\rho > 0$ and $W:\mathbb{R}^d\to\mathbb{R}$ is convex and translation invariant in the direction $(1,\ldots,1)\in\mathbb{R}^d$, namely for all $u\in\mathbb{R}$ and all $x\in\mathbb{R}^d$, $W(x+u(1,\ldots,1))=W(x)$. This is the case for example when for some convex even function $h:\mathbb{R}\to\mathbb{R}$, \[ W(x)=\sum_{i < j}h(x_i-x_j),\quad x\in\mathbb{R}^d. \] If $\pi$ and $\pi^\perp$ are the orthogonal projections on $\mathbb{R}(1,\ldots,1)$ and its orthogonal complement, respectively, then $|x|^2=|\pi(x)|^2+|\pi^\perp(x)|^2$, while the translation invariance of $W$ in the direction $(1/\sqrt{d},\ldots,1/\sqrt{d})$ gives $W(x)=W(\pi(x)+\pi^\perp(x))=W(\pi^\perp(x))$, therefore \[ \mathrm{e}^{-V(x)} =\mathrm{e}^{-\frac{\rho}{2}|\pi(x)|^2}\mathrm{e}^{-(W(\pi^\perp(x))+\frac{\rho}{2}|\pi^\perp(x)|^2)} \] which means that $\mu_V$ is, up to a rotation, a product measure, and splits into a 1D Gaussian factor $\mathcal{N}(0,\frac{1}{\rho})$ and a log-concave factor with a $\rho$-convex potential.
This covers, as a special degenerate case, the Dyson-OU (DOU) process studied in [CL,CF], obtained with \[ h(x)= \begin{cases} -\beta\log(x) & \text{if $x > 0$}\\ +\infty & \text{if $x\leq0$} \end{cases},\quad\text{for an arbitrary constant $\beta\geq0$}, \] the degeneracy being equivalent to defining the DOU process on the convex domain $\{x\in\mathbb{R}^d:x_1 > \cdots>x_d\}$ instead of on the whole space $\mathbb{R}^d$, to exploit convexity. In this case, the symmetric Hermite polynomial $x_1+\cdots+x_d$ is an eigenfunction associated to $\lambda_1$.
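The eigenfunction claim can be checked numerically for a smooth pair interaction. The sketch below is an illustration under assumptions, not the DOU case itself: it uses the hypothetical smooth convex even choice $h(u)=u^4$ instead of the singular logarithm, and a central finite difference with a hypothetical step size. For $f(x)=x_1+\cdots+x_d$ we have $\Delta f=0$ and $\nabla f=(1,\ldots,1)$, so $\mathrm{L}f=-\nabla V\cdot(1,\ldots,1)$, and the translation invariance of $W$ leaves only the quadratic part, giving $\mathrm{L}f=-\rho f$.

```python
def pair_interaction(x, h):
    """W(x) = sum over i < j of h(x_i - x_j)."""
    n = len(x)
    return sum(h(x[i] - x[j]) for i in range(n) for j in range(i + 1, n))


def potential(x, rho, h):
    """V(x) = (rho / 2) |x|^2 + W(x)."""
    return 0.5 * rho * sum(xi * xi for xi in x) + pair_interaction(x, h)


def generator_on_coordinate_sum(x, rho, h, step=1e-5):
    """L f(x) for f(x) = x_1 + ... + x_d.

    Since f is linear, L f = -<grad V, (1, ..., 1)>, approximated here by a
    central finite difference of V along the direction (1, ..., 1).
    """
    plus = [xi + step for xi in x]
    minus = [xi - step for xi in x]
    return -(potential(plus, rho, h) - potential(minus, rho, h)) / (2.0 * step)


def quartic(u):
    # Hypothetical smooth convex even interaction, chosen only for illustration.
    return u ** 4


if __name__ == "__main__":
    x = [0.3, -1.2, 0.7, 2.0]
    rho = 1.5
    print("L f(x)    =", generator_on_coordinate_sum(x, rho, quartic))
    print("-rho f(x) =", -rho * sum(x))
```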
Geometry. These cutoff and rigidity estimates extend, beyond Euclidean space, to positively curved diffusions on Riemannian manifolds, see [CF] for more information.
Open questions. How about cutoff via stability beyond rigidity?
Further reading.
- [SC] Laurent Saloff-Coste
Precise estimates on the rate at which certain diffusions tend to equilibrium
Mathematische Zeitschrift (1994)
- [CSC] Guan-Yu Chen and Laurent Saloff-Coste
The cutoff phenomenon for ergodic Markov processes
Electronic Journal of Probability (2008)
See also https://djalil.chafai.net/blog/2024/01/27/cutoff-for-markov-processes/
- [BGL] Dominique Bakry, Ivan Gentil, and Michel Ledoux
Analysis and geometry of Markov diffusion operators
Springer (2014)
- [CZ] Xu Cheng and Detang Zhou
Eigenvalues of the drifted Laplacian on complete metric measure spaces
Communications in Contemporary Mathematics (2017)
- [DPF] Guido De Philippis and Alessio Figalli
Rigidity and stability of Caffarelli's log-concave perturbation theorem
Nonlinear Analysis Theory Methods and Applications (2017)
- [CL] Djalil Chafaï and Joseph Lehec
On Poincaré and logarithmic Sobolev inequalities for a class of singular Gibbs measures
Geometric aspects of functional analysis. Vol. I
Lecture Notes in Mathematics, Springer (2020)
- [BCL] Jeanne Boursier, Djalil Chafaï, and Cyril Labbé
Universal cutoff for Dyson Ornstein Uhlenbeck process
Probability Theory and Related Fields (2023)
- [CF] Djalil Chafaï and Max Fathi
On cutoff via rigidity for high dimensional curved diffusions
arXiv:2412.15969v2 (2024)
- [S] Justin Salez
Cutoff for non-negatively curved diffusions
arXiv:2501.01304v1 (2025)