# Month: November 2013

The idea of writing this tiny post came after a coffee break discussion with MAP432 colleagues. All is about the Jensen inequality, one of my favorite inequality, which connects convexity and probability in a rigid way. Equality cases and strict convexity are not considered here.

Jensen inequality. In its simplest form, the Jensen inequality states that if ${\varphi:\mathbb{R}\rightarrow\mathbb{R}}$ is a convex function and if ${X}$ is a real random variable such that ${X}$ and ${\varphi(X)}$ are integrable, then

$\varphi(\mathbb{E}(X))\leq\mathbb{E}(\varphi(X)).$

Geometric proof. The function ${\varphi}$ is convex iff it is equal to the envelope of all its affine lower bounds (of course, only those in contact are relevant, which leads to Legendre transform and convex duality, but we do not need these subtleties here). Namely

$\varphi=\sup_{f\in\mathcal{F}}f$

where ${\mathcal{F}}$ is the family of all affine functions ${f}$ such that ${f\leq\varphi}$. But for any affine function ${f}$,

$f(\mathbb{E}(X))=\mathbb{E}(f(X)),$

and therefore

$\varphi(\mathbb{E}(X)) =\sup_{f\in\mathcal{F}}f(\mathbb{E}(X)) =\sup_{f\in\mathcal{F}}\mathbb{E}(f(X)) \leq\mathbb{E}(\sup_{f\in\mathcal{F}}f(X)) =\mathbb{E}(\varphi(X)).$

This proof can be extended to the case where ${\varphi}$ may take the value ${+\infty}$, by reduction to the convex set ${\{\varphi<\infty\}}$. It can be also extended to the multivariate case where ${\varphi}$ is defined on ${\mathbb{R}^d}$ and ${X}$ is a random vector of ${\mathbb{R}^d}$, by taking affine forms (i.e. affine hyperplanes) instead.

Probabilistic proof. We have already seen this in a previous post. Let ${{(X_n)}_{n\geq1}}$ be i.i.d. copies of ${X}$. Since ${\varphi}$ is convex, for any ${n\geq1}$,

$\varphi\left(\frac{X_1+\cdots+X_n}{n}\right) \leq\frac{\varphi(X_1)+\cdots+\varphi(X_n)}{n}.$

Now, since ${X}$ and ${\varphi(X)}$ are integrable, it remains to use the strong law of large numbers for both sides, for an ${\omega}$ lying in the intersection of the left side and right side almost sure sets (necessarily not empty!). For the left hand side, we also need to use the fact that ${\varphi}$ is continuous, which follows from convexity.

Here again, the proof can be extended to the case where ${\varphi}$ may take the value ${+\infty}$, and to the multivariate case. This proof is very quick, but relies on the strong law of large numbers, which is a non trivial result. One can use the weak law of large numbers instead, but the proof is then less beautiful. More generally, the law of large numbers is only used to produce an empirical probability measure which converges to the law of ${X}$, and the randomness is a nuisance, not a feature.

Integrability. Let ${X}$ be a real random variable and let ${\varphi:\mathbb{R}\rightarrow\mathbb{R}}$ be a convex function. It turns out that if ${X}$ is integrable then ${\mathbb{E}(\varphi(X))}$ makes sense. Conversely, if ${\varphi(X)}$ is integrable and ${\varphi}$ is not constant then ${\mathbb{E}(X)}$ makes sense.

To see it, let us assume that ${X}$ is integrable, and let us show then that ${\varphi(X)_-}$ is integrable, in other words that ${\mathbb{E}(\varphi(X))}$ makes always sense in ${\mathbb{R}\cup\{+\infty\}}$. Since ${\varphi}$ is convex, there exists an affine function ${f}$ (possibly constant) such that ${f\leq\varphi}$. Since ${X}$ is integrable and ${f}$ is affine, it follows that ${f(X)}$ is integrable. Therefore, ${\varphi(X)}$ admits an integrable lower bound. Let us define

$U:=f(X)\leq\varphi(X)=:V.$

We have

$V_-=-V\mathbf{1}_{V\leq0}\leq -U\mathbf{1}_{V\leq0}\leq|U|$

and therefore ${V_-}$ is integrable, and thus ${\mathbb{E}(V)}$ has always a meaning in ${\mathbb{R}\cup\{+\infty\}}$.

Conversely, let us assume that ${\varphi(X)}$ is integrable and that ${\varphi}$ is not constant. Since ${\varphi}$ is convex and non constant, we have ${\lim_{x\rightarrow-\infty}\varphi(x)=+\infty}$ or ${\lim_{x\rightarrow+\infty}\varphi(x)=+\infty}$. Let us show then that

1. ${X_+}$ is integrable when ${\lim_{x\rightarrow+\infty}\varphi(x)=+\infty}$ (example: ${\varphi(x)=e^x}$);
2. ${X_-}$ is integrable when ${\lim_{x\rightarrow-\infty}\varphi(x)=+\infty}$ (example: ${\varphi(x)=e^{-x}}$);
3. ${X}$ is integrable when ${\lim_{|x|\rightarrow+\infty}\varphi(x)=+\infty}$ (example: ${\varphi(x)=|x|^p}$, ${p\geq1}$).

The last statement means ${\mathrm{L}^1\subset\mathrm{L}^{\varphi}}$ when ${\varphi}$ is convex with ${\lim_{|x|\rightarrow+\infty}\varphi(x)=+\infty}$. Obviously, one may deduce the integrability of ${X}$ from the integrability of ${\varphi(X)}$ when ${\varphi}$ is affine, except if ${\varphi}$ is constant. More generally, if ${\varphi}$ is convex and not constant, then there exits a non constant affine function ${f}$ such that ${f(X)\leq\varphi(X)}$. If ${f}$ is increasing (equivalently if ${\lim_{x\rightarrow+\infty}\varphi(x)=+\infty}$) then ${X}$ has an integrable upper bound ${V}$ and thus ${X_+}$ is integrable because

$X_+=X\mathbf{1}_{X\geq0}\leq V\mathbf{1}_{X\geq0}\leq|V|.$

If ${f}$ is decreasing (equivalently if ${\lim_{x\rightarrow-\infty}\varphi(x)=+\infty}$) then ${X}$ has an integrable lower bound ${V}$ and thus ${X_-}$ is integrable because

$X_-=-X\mathbf{1}_{X\leq0}\leq-V\mathbf{1}_{X\leq0}\leq|V|.$

Remark. Suppose that ${X}$ is integrable and that we do not know if ${\varphi(X)}$ is integrable or not. For every deterministic threshold ${n\geq0}$, the function ${\varphi_n=\max(-n,\varphi)}$ is convex as a maximum of convex functions. Since ${\varphi_n\geq-n}$, the quantity ${\mathbb{E}(\varphi_n(X))}$ makes sense in ${[-n,+\infty]}$, and we obtain, say by using the envelope approach above,

$\varphi(\mathbb{E}(X)) \leq\varphi_n(\mathbb{E}(X)) \leq\mathbb{E}(\varphi_n(X)).$

We have ${\varphi_n=\varphi_+-\min(n,\varphi_-)}$, and thus, if ${\varphi_+(X)}$ is integrable then by monotone convergence, ${\mathbb{E}(\varphi_-(X))\leq\mathbb{E}(\varphi_+(X))-\varphi(\mathbb{E}(X))<\infty}$, and then ${\varphi(X)}$ is integrable.

I have uploaded recently the final version of arXiv:1304.7569 entitled First order global asymptotics for confined particles with singular pair repulsion, written in collaboration with Nathaël Gozlan and Pierre André Zitt, and to appear in the Annals of Applied Probability. The former title was First order global asymptitics for Calogero-Sutherland gases, but we decided to follow one of the reviewers since our work goes beyond Calogero-Sutherland models.

We study a physical system of ${N}$ interacting particles in ${\mathbb{R}^d}$, ${d\geq1}$, subject to pair repulsion and confined by an external field. We establish a large deviations principle for their empirical distribution as ${N}$ tends to infinity. In the case of Riesz interaction, including Coulomb interaction in arbitrary dimension ${d>2}$, the rate function is strictly convex and admits a unique minimum, the equilibrium measure, characterized via its potential. It follows that almost surely, the empirical distribution of the particles tends to this equilibrium measure as ${N}$ tends to infinity. In the more specific case of Coulomb interaction in dimension ${d>2}$, and when the external field is a convex or increasing function of the radius, then the equilibrium measure is supported in a ring. With a quadratic external field, the equilibrium measure is uniform on a ball.

Particles and configuration energy. The system is made with ${N}$ particles at positions ${x_1,\ldots,x_N\in\mathbb{R}^d}$, ${d\geq1}$, with identical “charge” ${q_N=1/N}$, subject to a confining potential ${V:\mathbb{R}^d\rightarrow\mathbb{R}}$ coming from an external field and acting on each particle, and to an interaction potential

$W:\mathbb{R}^d\times\mathbb{R}^d\rightarrow(-\infty,+\infty]$

acting on each pair of particles. The function ${W}$ is finite outside the diagonal and symmetric: for all ${x,y\in\mathbb{R}^d}$ with ${x\neq y}$, we have ${W(x,y)=W(y,x)<\infty}$. The energy ${H_N(x_1,\ldots,x_N)}$ of the configuration ${(x_1,\ldots,x_N)\in(\mathbb{R}^d)^N}$ takes the form

$\begin{array}{rcl} \notag H_N(x_1,\ldots,x_N) &=& \sum_{i=1}^Nq_NV(x_i) +\sum_{i<j}q_N^2W(x_i,x_j) \\ \notag &=& \frac{1}{N}\sum_{i=1}^NV(x_i) +\frac{1}{N^2}\sum_{i<j}W(x_i,x_j) \\ &=&\int\!V(x)\,d\mu_N(x)+\frac{1}{2}\iint_{\neq}\!W(x,y)\,d\mu_N(x)\,d\mu_N(y) \end{array}$

where

$\mu_N=\frac{1}{N}\sum_{i=1}^N\delta_{x_i}$

is the empirical measure of the particles, and where the subscript “${\neq}$” indicates that the double integral is off-diagonal. The energy ${H_N:(\mathbb{R}^d)^N\rightarrow\mathbb{R}\cup\{+\infty\}}$ is a quadratic form functional in the variable ${\mu_N}$. We denote by ${\left|\cdot\right|}$ the Euclidean norm of ${\mathbb{R}^d}$ and we make the following additional assumptions:

• (H1) The function ${W:\mathbb{R}^d\times\mathbb{R}^d\rightarrow(-\infty,+\infty]}$ is continuous on ${\mathbb{R}^d\times \mathbb{R}^d}$, symmetric, takes finite values on ${\mathbb{R}^d\times \mathbb{R}^d \setminus \{(x,x) ; x\in \mathbb{R}^d\}}$ and satisfies the following integrability condition: for all compact subset ${K\subset\mathbb{R}^d}$, the function

$z\in\mathbb{R}^d\mapsto \sup \{W(x,y); \left|x-y\right|\geq|z|, x,y\in K\}$

is locally Lebesgue-integrable on ${\mathbb{R}^d}$;

• (H2) The function ${V:\mathbb{R}^d\rightarrow\mathbb{R}}$ is continuous and such that ${\lim_{|x| \rightarrow+ \infty } V(x)=+\infty}$ and

$\int_{\mathbb{R}^d} \exp(-V(x))\,dx<\infty.$

• (H3) There exists constants ${c\in\mathbb{R}}$ and ${\varepsilon_o \in (0,1)}$ such that for every ${x,y\in\mathbb{R}^d}$,

$W(x,y)\geq c-\varepsilon_o(V(x)+V(y)).$

(This must be understood as “${V}$ dominates ${W}$ at infinity”).

Boltzmann-Gibbs distribution. Let ${{(\beta_N)}_{N}}$ be a sequence of positive real numbers such that ${\beta_N\rightarrow+\infty}$ as ${N\rightarrow\infty}$. Under (H2)-(H3), there exists an integer ${N_0}$ depending on ${\varepsilon_o}$ such that for any ${N\geq N_0}$, we have

$Z_N=\int_{\mathbb{R}^d}\cdots\int_{\mathbb{R}^d}\! \exp\left(-\beta_NH_N(x_1,\ldots,x_N)\right)\,dx_1\cdots{}dx_N<\infty,$

so that we can define the Boltzmann-Gibbs probability measure ${P_N}$ on ${(\mathbb{R}^d)^N}$ by

$dP_N(x_1,\ldots,x_N) =\frac{\exp\left(-\beta_N H_N(x_1,\ldots,x_N)\right)}{Z_N}\,dx_1\cdots dx_N.$

The law ${P_N}$ is the equilibrium distribution of a system of ${N}$ interacting Brownian particles in ${\mathbb{R}^d}$, at inverse temperature ${\beta_N}$, with equal individual “charge” ${1/N}$, subject to a confining potential ${V}$ acting on each particle, and to an interaction potential ${W}$ acting on each pair of particles. For ${\beta_N=N^2}$, the quantity ${\beta_NH_N}$ can also be interpreted as the distribution of a system of ${N}$ particles living in ${\mathbb{R}^d}$, with unit “charge”, subject to a confining potential ${NV}$ acting on each particle, and to an interaction potential ${W}$ acting on each pair of particles.

Physical control problem. Our work is motivated by the following physical control problem: given the (internal) interaction potential ${W}$, for instance a Coulomb potential, a target probability measure ${\mu_\star}$ on ${\mathbb{R}^d}$, for instance the uniform law on the unit ball, and a cooling scheme ${\beta_N\rightarrow+\infty}$, for instance ${\beta_N=N^2}$, can we tune the (external) confinement potential ${V}$ (associated to an external confinement field) such that ${\mu_N\rightarrow\mu_\star}$ as ${N\rightarrow\infty}$? In this direction, we provide some partial answers in our main results stated in the sequel.

Limiting energy. Let ${\mathcal{M}_1(\mathbb{R}^d)}$ be the set of probability measures on ${\mathbb{R}^d}$. The mean-field symmetries of the model suggest to study, under the exchangeable measure ${P_N}$, the behavior as ${N\rightarrow\infty}$ of the empirical measure ${\mu_N}$, which is a random variable on ${\mathcal{M}_1(\mathbb{R}^d)}$. With this asymptotic analysis in mind, we introduce the functional ${I:\mathcal{M}_1(\mathbb{R}^d)\rightarrow (-\infty,+\infty]}$ given by

$I(\mu) = \frac{1}{2}\iint\!\left(V(x)+V(y)+W(x,y)\right)\,d\mu(x)d\mu(y).$

The assumptions (H2)(H3) imply that the function under the integral is bounded from below, so that the integral defining ${I}$ makes sense in ${\mathbb{R}\cup\{+\infty\}=(-\infty,+\infty]}$. If it is finite, then ${\int\!Vd\mu}$ and ${\iint Wd\mu^2}$ both exist, so that

$I(\mu) = \int\!V d\mu + \frac{1}{2} \iint W d\mu^2.$

The energy ${H_N}$ is “almost” given by ${I(\mu_N)}$, where the infinite terms on the diagonal are forgotten.

Large deviations principle. Theorem 1 below is our first main result. It is of topological nature, inspired from the available results for logarithmic Coulomb gases in random matrix theory BenArous and Guionnet, BenArous and Zeitouni, Hiai and Petz, Hardy. We equip ${\mathcal{M}_1(\mathbb{R}^d)}$ with the weak topology, defined by duality with bounded continuous functions. For any set ${A\subset\mathcal{M}_1(\mathbb{R}^d)}$ we denote by ${\mathrm{int}{A}}$, ${\mathrm{clo}{A}}$ the interior and closure of ${A}$ with respect to this topology. This topology can be metrized by the Fortet-Mourier distance defined by (see also Rachev and Rüschendorf):

$d_{\mathrm{FM}}(\mu,\nu)= \sup_{\max(|f|_\infty,|f|_{\mathrm{Lip}})\leq 1}\left\{\int\!f\,d\mu-\int\!f\,d\nu\right\},$

where

$|f|_\infty=\sup|f| \quad\mbox{and}\quad |f|_{\mathrm{Lip}}=\sup_{x\neq y}\frac{|f(x)-f(y)|}{|x-y|}.$

To formulate the large deviations result we need to introduce the following additional technical assumption:

• (H4) For all ${\nu \in \mathcal{M}_1(\mathbb{R}^d)}$ such that ${I(\nu)<+\infty}$, there is a sequence ${(\nu_n)_{n\in \mathbb{N}}}$ of probability measures, absolutely continuous with respect to Lebesgue, such that ${\nu_n}$ converges weakly to ${\nu}$ and ${I(\nu_n) \rightarrow I(\nu),}$ when ${n\rightarrow\infty.}$

It turns out that assumption (H4) is satisfied for a very large class of potentials ${V,W}$, including the special case in which the function ${I}$ is convex, which is typically the case for the Coulomb and Riesz intercations.

In all the paper, if ${(a_N)_{N}}$ and ${(b_N)_{N}}$ are non negative sequences, the notation ${a_N \gg b_N}$ means that ${a_N=b_Nc_N}$, for some ${c_N}$ that goes to ${+\infty}$ when ${N\rightarrow\infty.}$

Theorem 1 (Large Deviations Principle) Suppose that

$\beta_N\gg N\log(N).$

If (H1)-(H2)-(H3) are satisfied then

1. ${I}$ has compact level sets (and is thus lower semi-continuous) and

$\inf_{\mathcal{M}_1(\mathbb{R}^d)}I>-\infty;$

2. Under ${(P_N)_N}$, the sequence ${{(\mu_N)}_{N}}$ of random elements of ${\mathcal{M}_1(\mathbb{R}^d)}$ equipped with the weak topology has the following asymptotic properties. For every Borel subset ${A}$ of ${\mathcal{M}_1(\mathbb{R}^d)}$,

$\limsup_{N\rightarrow\infty}\frac{\log Z_NP_N(\mu_N\in A)}{\beta_N} \leq-\inf_{\mu\in\mathrm{clo}{A}}I(\mu)$

and

$\liminf_{N\rightarrow\infty}\frac{\log Z_NP_N(\mu_N\in A)}{\beta_N} \geq -\inf \{I(\mu); \mu\in\mathrm{int}{A}, \mu \ll \mathrm{Lebesgue}\}.$

3. Under the additional assumption ${\textbf{(H4)}}$, the full Large Deviation Principle (LDP) at speed ${\beta_N}$ holds with the rate function

$I_\star=I-\inf_{\mathcal{M}_1(\mathbb{R}^d)}I.$

More precisely, for all Borel set ${A \subset \mathcal{M}_1(\mathbb{R}^d)}$,

$-\inf_{\mu \in \mathrm{int}{A}} I_\star(\mu) \leq \liminf_{N\rightarrow\infty} \frac{\log P_N(\mu_N \in A)}{\beta_N} \\ \leq \limsup_{N\rightarrow\infty} \frac{\log P_N(\mu_N \in A)}{\beta_N} \leq -\inf_{\mu \in \mathrm{clo}{A}} I_\star(\mu).$

In particular, by taking ${A=\mathcal{M}_1(\mathbb{R}^d)}$, we get

$\lim_{N\rightarrow\infty}\frac{\log Z_N}{\beta_N} =\inf_{\mathcal{M}_1(\mathbb{R}^d)}I_\star.$

4. Let ${I_{\text{min}}=\{\mu\in\mathcal{M}_1:I_\star(\mu)=0\}\neq \emptyset}$. If ${\textbf{(H4)}}$ is satisfied and if ${{(\mu_N)}_{N}}$ are constructed on the same probability space, and if ${d}$ stands for the Fortet-Mourier distance, then we have, almost surely,

$\lim_{N\rightarrow\infty}d_{\mathrm{FM}}(\mu_N,I_{\text{min}})=0.$

A careful reading of the proof of Theorem 1 indicates that if ${I_\text{min}=\{\mu_\star\}}$ is a singleton, and if (H4) holds for ${\nu=\mu_\star}$, then ${\mu_N\rightarrow\mu_\star}$ almost surely as ${N\rightarrow\infty}$.

Link with Sanov theorem. If we set ${W=0}$ then the particles become i.i.d. and ${P_N}$ becomes a product measure ${\eta_N^{\otimes N}}$ where ${\eta_N\propto e^{-(\beta_N/N)V}}$, where the symbol “${\propto}$” means ”proportional to”. When ${\beta_N=N}$ then ${\eta_N\propto e^{-V}}$ does not depend on ${N}$, and we may denote it ${\eta}$. To provide perspective, recall that the classical Sanov theorem for i.i.d. sequences means in our settings that if ${W=0}$ and ${\beta_N=N}$ then ${{(\mu_N)}_N}$ satisfies to a large deviations principle on ${\mathcal{M}_1(\mathbb{R}^d)}$ at speed ${N}$ and with good rate function (Kullback-Leibler relative entropy or free energy)

$\mu\mapsto K(\mu|\eta)= \int\!f\log(f)\,d\eta$

if ${\mu\ll\eta}$, with ${f=\frac{d\mu}{d\eta}}$, and ${+\infty}$ otherwise. This large deviations principle corresponds to the convergence ${\lim_{N\rightarrow\infty}d_{\mathrm{FM}}(\mu_N,\eta)=0}$. Note that, if ${\mu}$ is absolutely continuous with respect to Lebesgue measure with density function ${g}$, then ${K(\mu|\eta)}$ can be decomposed in two terms

$K(\mu|\eta) = \int\!V\,d\mu-H(\mu)+\log Z_V,$

where

$Z_V=\int_{\mathbb{R}^d}\!e^{-V(x)}\,dx$

and where ${H(\mu)}$ is the Boltzmann-Shannon “continuous” entropy

$H(\mu) = -\int\!g(x)\log(g(x))\,dx;$

therefore at the speed ${\beta_N = N}$, the energy factor ${\int\!V\,d\mu}$ and the Boltzmann-Shannon entropy factor ${H(\mu)}$ both appear in the rate function. In contrast, note that Theorem 1 requires a higher inverse temperature ${\beta_N\gg N\log(N)}$. If we set ${W=0}$ in Theorem 1, then ${P_N}$ becomes a product measure, the particles are i.i.d. but their common law depends on ${N}$, the function ${\mu\mapsto I_*(\mu)=\int\!V\,d\mu-\inf V}$ is affine, its minimizers ${I_{\text{min}}}$ over ${\mathcal{M}_1(\mathbb{R}^d)}$ coincide with

$\mathcal{M}_V=\{\mu\in\mathcal{M}_1(\mathbb{R}^d):\mathrm{supp}(\mu)\subset\arg\inf V\},$

and Theorem 1 boils down to a sort of Laplace principle, which corresponds to the convergence ${\lim_{N\rightarrow\infty}d_{\mathrm{FM}}(\mu_N,\mathcal{M}_V)=0}$. It is worthwhile to notice that the main difficulty in Theorem 1 lies in the fact that ${W}$ can be infinite on the diagonal (short scale repulsion). If ${W}$ is continuous and bounded on ${\mathbb{R}^d\times\mathbb{R}^d}$, then one may deduce the large deviations principle for ${{(\mu_N)}_{N}}$ from the case ${W=0}$ by using the Laplace-Varadhan. To complete the picture, let us mention that if ${\beta_N=N}$ and if ${W}$ is bounded and continuous, then the Laplace-Varadhan lemma and the Sanov theorem would yield to the conclusion that ${(\mu_N)_N}$ verifies a large deviations principle on ${\mathcal{M}_1(\mathbb{R}^d)}$ at speed ${N}$ with rate function ${R-\inf_{\mathcal{M}_1(\mathbb{R}^d)}R}$ where the functional ${R}$ is defined by

$\begin{array}{rcl} R(\mu) &=& K(\mu|\eta) + \frac{1}{2}\iint\!W(x,y)\,d\mu(x)d\mu(y) \\ &=& -H(\mu) + I(\mu)+\log Z_V; \end{array}$

once more, the Boltzmann-Shannon entropy factor ${H(\mu)}$ reappears at this rate. For an alternative point of view, we refer to Messer and Spohn, Caglioti and Lions and Marchioro and Pulvirenti, Bodineau and Guionnet.

Equilibrium measure for Coulomb and Riesz interactions. Our second main result, expressed in Theorem 2 and Corollary 3 below is of differential nature. It is based on an instance of the general Gauss problem in potential theory Frostman, Landkof, Zorii. It concerns special choices of ${V}$ and ${W}$ for which ${I_\star}$ achieves its minimum ${0}$ for a unique and explicit ${\mu_\star\in\mathcal{M}_1(\mathbb{R}^d)}$. Recall that the Coulomb interactions correspond to the choice ${W(x,y)=k_\Delta(x-y)}$ where ${k_\Delta}$ is the Coulomb kernel (opposite in sign to the Newton kernel) defined on ${\mathbb{R}^d}$, ${d\geq1}$, by

$k_\Delta(x)= \left\{ \begin{array}{cc} -|x| & d=1,\\ \log\frac{1}{|x|} & d=2,\\ \frac{1}{|x|^{d-2}}& d\geq3. \end{array} \right.$

This is, up to a multiplicative constant, the fundamental solution of the Laplace equation. In other words, denoting ${\Delta=\partial_{x_1}^2+\cdots+\partial_{x_d}^2}$ the Laplacian, we have, in a weak sense, in the space of Schwartz-Sobolev distributions ${\mathcal{D}'(\mathbb{R}^d)}$,

$-c\Delta k_\Delta=\delta_0 \quad\text{with}\quad c= \left\{ \begin{array}{cc} \frac{1}{2} & d=1,\\ \frac{1}{2\pi} & d=2,\\ \frac{1}{d(d-2)\omega_d} & d\geq3, \end{array} \right.$

where ${\omega_d=\frac{\pi^{d/2}}{\Gamma(1+d/2)}}$ is the volume of the unit ball of ${\mathbb{R}^d}$. Our notation is motivated by the fact that ${-\Delta}$ is a nonnegative operator. The case of Coulomb interactions in dimension ${d=2}$ is known as “logarithmic potential with external field” and is widely studied in the literature: see Hiai and Petz, Saff and Totik, Anderson and Guionnet and Zeitouni, Hardy. To focus on novelty, we will not study the Coulomb kernel for ${d\leq2}$. We refer to Lenard, Edwards and Lenard, Lenard, Brascamp and Lieb, Aizenman and Martin, Sandier and Serfaty and references therein for the Coulomb case in dimension ${d=1}$, to BenArous and Guionnet Anderson and Guionnet and Zeitouni, Hardy to the Coulomb case in dimension ${d=2}$ with support restriction on a line, to BenArous and Zeitouni, Hiai and Petz, Hiai and Petz, Hardy, Saff and Totik, Sandier and Serfaty, Yattselev for the Coulomb case in dimension ${d=2}$. We also refer to Berman for the asymptotic analysis in terms of large deviations of Coulomb determinantal point processes on compact manifolds of arbitrary dimension.

The asymptotic analysis of ${\mu_N}$ as ${N\rightarrow\infty}$ for Coulomb interactions in dimension ${d\geq3}$ motivates our next result, which is stated for the more general Riesz interactions in dimension ${d\geq1}$. The Riesz interactions correspond to the choice ${W(x,y)=k_{\Delta_\alpha}(x-y)}$ where ${k_{\Delta_\alpha}}$, ${0<\alpha<d}$, ${d\geq1}$, is the Riesz kernel defined on ${\mathbb{R}^d}$, by

$k_{\Delta_\alpha}(x)=\frac{1}{|x|^{d-\alpha}}.$

Up to a multiplicative constant, this is the fundamental solution of a fractional Laplace equation (which is the true Laplace equation when ${\alpha=2}$), namely

$-c_\alpha\Delta_\alpha k_{\Delta_\alpha}=\mathcal{F}^{-1}(1)=\delta_0 \quad\text{with}\quad c_\alpha=\frac{\pi^{\alpha-\frac{d}{2}}}{4\pi^2} \frac{\Gamma(\frac{d-\alpha}{2})}{\Gamma(\frac{\alpha}{2})},$

where the Fourier transform ${\mathcal{F}}$ and the fractional Laplacian ${\Delta_\alpha}$ are given by

$\mathcal{F}(k_{\Delta_\alpha})(\xi)=\int_{\mathbb{R}^d}\!e^{2i\pi\xi\cdot x}\,k_{\Delta_\alpha}(x)\,dx =\frac{1}{c_\alpha4\pi^2|\xi|^\alpha} \quad\text{and}\quad \Delta_\alpha f = -4\pi^2\mathcal{F}^{-1}(|\xi|^\alpha\mathcal{F}(f)).$

Note that ${\Delta_2=\Delta}$ while ${\Delta_\alpha}$ is a non-local integro-differential operator when ${\alpha\neq2}$. When ${d\geq3}$ and ${\alpha=2}$ then Riesz interactions coincide with Coulomb interactions and the constants match. Beware that our notations differ slightly from the ones of Landkof. Several aspects of the Gauss problem in the Riesz case are studied in Dragnev and Saff, Zorii.

In the Riesz case, ${0<\alpha<d}$, one associates to any probability measure ${\mu}$ on ${\mathbb{R}^d}$ a function ${U_\alpha^\mu:\mathbb{R}^d\mapsto[0,+\infty]}$ called the potential of ${\mu}$ as follows

$U_\alpha^\mu(x)= (k_{\Delta_\alpha}*\mu)(x) =\int\!k_{\Delta_\alpha}(x-y)\,d\mu(y),\qquad \forall x\in \mathbb{R}^d.$

In classical potential theory, a property is said to hold quasi everywhere if it holds outside a set of zero capacity. The following theorem is essentially the analogue in ${\mathbb{R}^d}$ of a result of Dragnev and Saff on spheres. The analogue problem on compact subsets, without external field, was initially studied by Frostman (in his PhD thesis, advised by Riesz, 1934!), see also the book of Landkof. A confinement (by an external field or by a support constraint) is always needed for such type of results.

Theorem 2 (Riesz gases) Suppose that ${W}$ is the Riesz kernel ${W(x,y)= k_{\Delta_\alpha}(x-y)}$. Then:

1. The functional ${I}$ is strictly convex where it is finite;
2. (H1)-(H2)-(H3)-(H4) are satisfied and Theorem 1 applies;
3. There exists a unique ${\mu_\star\in\mathcal{M}_1(\mathbb{R}^d)}$ such that

$I(\mu_\star)=\inf_{\mu\in\mathcal{M}_1(\mathbb{R}^d)}I(\mu);$

4. If we define ${(\mu_N)_N}$ on a unique probability space (for a sequence ${\beta_N\gg N\log(N)}$) then with probability one,

$\lim_{N\rightarrow\infty}\mu_N=\mu_\star.$

If we denote by ${C_\star}$ the real number

$C_\star = \int\!\left(U_\alpha^{\mu_\star} + V\right)\,d\mu_\star = J(\mu_\star) + \int\!Vd\mu_\star,$

then the following additional properties hold:

1. The minimizer ${\mu_\star}$ has compact support, and satisfies

$U_\alpha^{\mu_\star} + V \geq C_\star$

quasi everywhere, with equality on the support of ${\mu_\star}$;

2. If a compactly supported measure ${\mu}$ creates a potential ${U_\alpha^\mu}$ such that, for some constant ${C\in\mathbb{R}}$,

$U_\alpha^\mu + V \geq C$

quasi everywhere, with equality on the support of ${\mu}$, then ${C = C_\star}$ and ${\mu=\mu_\star}$. The same is true under the weaker assumptions:

$U_\alpha^\mu + V \leq C$

on the support of ${\mu}$, and

$U_\alpha^\mu + V \geq C$

quasi everywhere on the support of ${\mu_\star}$.

3. If ${\alpha \leq 2}$, for any measure ${\mu}$, the following “converse” holds:

$\sup_{\mathrm{supp}(\mu)} \left(U_\alpha^\mu + V\right) \geq C_\star,$

and

$“\inf_{\mathrm{supp}(\mu_\star)}”\, \left(U_\alpha^\mu(x) + V(x)\right) \leq C_\star,$

where the ${“\inf”}$ means that the infimum is taken quasi-everywhere.

The constant ${C_\star}$ is called the “modified Robin constant”, see e.g. Saff and Totik for the analogous result for the logarithmic potential in dimension ${2}$. The minimizer ${\mu_\star}$ is called the equilibrium measure.

Corollary 3 (Equilibrium of Coulomb gases with radial external fields in dimension ${\geq3}$) Suppose that for a fixed real parameter ${\beta>0}$, and for every ${x,y\in\mathbb{R}^d}$, ${d\geq3}$,

$V(x)=v(|x|)\quad\text{and}\quad W(x,y)=\beta k_{\Delta}(x-y),$

where ${v}$ is two times differentiable. Denote by ${d\sigma_r}$ the Lebesgue measure on the sphere of radius ${r}$, and let ${\sigma_d}$ be the total mass of ${d\sigma_1}$ (i.e. the surface of the unit sphere of ${\mathbb{R}^d}$). Let ${w(r) = r^{d-1}v'(r)}$, and suppose either that ${v}$ is convex, or that ${w}$ is increasing. Define two radii ${r_0<R_0}$ by:

$r_0 = \inf\left\{r>0 ; v'(r)>0\right\} \quad\text{and}\quad w(R_0) = \beta(d-2).$

Then the equilibrium measure ${\mu_\star}$ is supported on the ring ${\left\{x; |x|\in [r_0,R_0]\right\}}$, and is absolutely continuous with respect to Lebesgue measure:

$d\mu(r) = M(r)\,d\sigma_r dr \quad\text{where}\quad M(r) = \frac{w'(r)}{\beta(d-2)\sigma_d r^{d-1}} \mathbf{1}_{[r_0,R_0]}(r).$

In particular, when ${v(t)=t^2}$ then ${\mu_\star}$ is the uniform distribution on the centered ball of radius

$\left(\beta\frac{d-2}{2}\right)^{\frac{1}{d}}.$

The result provided by Corollary 3 on Coulomb gases with radial external fields can be found for instance in Lopez Garcia. It follows quickly from the Gauss averaging principle and the variational characterization. By using Theorem 2 with ${\alpha=2}$ together with Corollary 3, we obtain that the empirical measure of a Coulomb gas with quadratic external field in dimension ${d\geq3}$ tends almost surely to the uniform distribution on a ball when ${N\rightarrow\infty}$. This phenomenon is the analogue in arbitrary dimension ${d\geq3}$ of the well known result in dimension ${d=2}$ for the logarithmic potential with quadratic radial external field (where the uniform law on the disc or “circular law” appears as a limit for the Complex Ginibre Ensemble, see for instance BenArous and Zeitouni, Hiai and Petz). The study of the equilibrium measure for Coulomb interaction with non radially symmetric external fields was initiated recently in dimension ${d=2}$ by Bleher and Kuijlaars in a beautiful work Bleher and Kuijlaars by using orthogonal polynomials.

The following proposition shows that in the Riesz case, it is possible to construct a good confinement potential ${V}$ so that the equilibrium measure is prescribed in advance.

Corollary 4 (Riesz gases: external field for prescribed equilibrium measure) Let ${0<\alpha<d}$, ${d\geq1}$, and ${W(x,y)=k_{\Delta_\alpha}}$. Let ${\mu_\star}$ be a probability measure with a compactly supported density ${f_\star \in\mathbb{L}^p(\mathbb{R}^d)}$ for some ${p>d/\alpha.}$ Define the confinement potential

$V(x)= -U_\alpha^{\mu_\star}(x) + [|x|^2-R]_+,\qquad x\in \mathbb{R}^d,$

where ${U_\alpha^{\mu_\star}}$ is the Riesz potential created by ${\mu_\star}$ and ${R>0}$ is such that ${\mathrm{supp}(\mu_\star)\subset B(0,R).}$ Then the couple of functions ${(V,W)}$ satisfy (H1)-(H2)-(H3)-(H4) and the functional

$\mu\in\mathcal{M}_1(\mathbb{R}^d)\mapsto I(\mu)=\int\!V\,d\mu + \frac{1}{2}\iint\!k_{\Delta_\alpha}(x-y)\,d\mu(x)d\mu(y) \in\mathbb{R}\cup\{+\infty\}$

admits ${\mu_\star}$ as unique minimizer. In particular, the probability ${\mu_\star}$ is the almost sure limit of the sequence ${{(\mu_N)}_{N}}$ (constructed on the same probability space), as soon as ${\beta_N\gg N\log(N)}$.

Non-compactly supported equilibrium measures. The assumptions made on the external field ${V}$ in Theorem 1 and Theorem 2 explain why the equilibrium measure ${\mu_\star}$ is compactly supported. If one allows a weaker behavior of ${V}$ at infinity, then one may produce equilibrium measures ${\mu_\star}$ which are not compactly supported (and may even be heavy tailed). This requires to adapt some of the arguments, and one may use compactification as in Hardy. This might allow to extend Corollary 4 beyond the compactly supported case.

Equilibrium measure for Riesz interaction with radial external field. To the knowledge of the authors, the computation of the equilibrium measure for Riesz interactions with radial external field, beyond the more specific Coulomb case of Corollary 3, is an open problem, due to the lack of the Gauss averaging principle when ${\alpha\neq2}$.

Beyond the Riesz and Coulomb interactions. Theorem 2 concerns the minimization of the Riesz interaction potential with an external field ${V}$, and includes the Coulomb interaction if ${d\geq3}$. In classical Physics, the problem of minimization of the Coulomb interaction energy with an external field is known as the Gauss variational problem Frostman, Landkof, Zorii. Beyond the Riesz and Coulomb potentials, the driving structural idea behind Theorem 2 is that if ${W}$ is of the form ${W(x,y)=k_D(x-y)}$ where ${k_D}$ is the fundamental solution of an equation ${-Dk_D=\delta_0}$ for a local differential operator ${D}$ such as ${\Delta_\alpha}$ with ${\alpha=2}$, and if ${V}$ is super-harmonic for ${D}$, i.e. ${DV\geq0}$, then the density of ${\mu_\star}$ is roughly given by ${DV}$ up to support constraints. This can be easily understood formally with Lagrange multipliers. The limiting measure ${\mu_\star}$ depends on ${V}$ and ${W}$, and is thus non-universal in general.

Second order asymptotic analysis. The asymptotic analysis of ${\mu_N-\mu_\star}$ as ${N\rightarrow\infty}$ is a natural problem, which can be studied on various classes of tests functions. It is well known that a repulsive interaction may affect dramatically the speed of convergence, and make it dependent over the regularity of the test function. In another direction, one may take ${\beta_N=\beta N^2}$ and study the low temperature regime ${\beta\rightarrow\infty}$ at fixed ${N}$. In the Coulomb case, this leads to Fekete points. We refer to Serfaty, Borodin and Serfaty, Sandier and Serfaty for the analysis of the second order when both ${\beta\rightarrow\infty}$ and ${N\rightarrow\infty}$. In the one-dimensional case, another type of local universality inside the limiting support is available in Götze and Venker.

Edge behavior. Suppose that ${V}$ is radially symmetric and that ${\mu_\star}$ is supported in the centered ball of radius ${r}$, like in Corollary 3. Then one may ask if the radius of the particle system ${\max_{1\leq k\leq n}|x_k|}$ converges to the edge ${r}$ of the limiting support as ${N\rightarrow\infty}$. This is not provided by the weak convergence of ${\mu_N}$. The next question is the fluctuation. In the two-dimensional Coulomb case, a universality result is available for a class of external fields in arXiv:1310.0727.

Topology. It is known that the weak topology can be upgraded to a Wasserstein topology in the classical Sanov theorem for empirical measures of i.i.d. sequences, see Wang and Wang and Wu, provided that tails are strong exponentially integrable. It is then quite natural to ask about such an upgrade for Theorem 1.

Connection to random matrices. Our initial inspiration came, when writing the survey on the circular law, from the role played by the logarithmic potential in the analysis of the Ginibre ensemble. When ${d=2}$, ${\beta_N=N^2}$, ${V(x)=|x|^2}$ and ${W(x,y)=\beta k_\Delta(x-y)=\beta \log\frac{1}{|x-y|}}$ with ${\beta=2}$ then ${P_N}$ is the law of the (complex) eigenvalues of the complex Ginibre ensemble:

$dP_N(x)=Z_N^{-1}e^{-N\sum_{i=1}^N|x_i|^2}\prod_{i<j}|x_i-x_j|^2dx.$

(here ${\mathbb{R}^2\equiv\mathbb{C}}$ and ${P_N}$ is the law of the eigenvalues of a random ${N\times N}$ matrix with i.i.d. complex Gaussian entries of covariance ${\frac{1}{2N}I_2}$). For a non-quadratic ${V}$, we may see ${P_N}$ as the law of the spectrum of random normal matrices such as the ones studied in Ameur and Hedenmalm, and Makarov. On the other hand, in the case where ${d=1}$ and ${V(x)=|x|^2}$ and ${W(x,y)=\beta\log\frac{1}{|x-y|}}$ with ${\beta>0}$ then

$dP_N(x)=Z_N^{-1}e^{-N\sum_{i=1}^N|x_i|^2}\prod_{i<j}|x_i-x_j|^\beta\,dx.$

This is known as the ${\beta}$-Ensemble in Random Matrix Theory. For ${\beta=1}$, we recover the law of the eigenvalues of the Gaussian Orthogonal Ensemble (GOE) of random symmetric matrices, while for ${\beta=2}$, we recover the law of the eigenvalues of the Gaussian Unitary Ensemble (GUE) of random Hermitian matrices. It is worthwhile to notice that ${-\log|\cdot|}$ is the Coulomb potential in dimension ${d=2}$, and not in dimension ${d=1}$. For this reason, we may interpret the eigenvalues of GOE/GUE as being a system of charged particles in dimension ${d=2}$, experiencing Coulomb repulsion and an external quadratic field, but constrained to stay on the real axis. We believe this type of support constraint can be incorporated in our initial model, at the price of a bit heavier notations and analysis.

Simulation problem and numerical approximation of the equilibrium measure. It is natural to ask about the best way to simulate the probability measure ${P_N}$. A pure rejection algorithm is too naive. Some exact algorithms are available in the determinantal case such as for ${d=2}$ and ${W(x,y)=-2\log|x-y|}$ (algorithm 18 in Hough and Khrishnapur and Perez and Virag), and Scardicchio and Zachary and Torquato, and also the more recent Decreusefond and Flit and Low. One may prefer to use a non exact algorithm such as a Hastings-Metropolis algorithm. One may also use an Euler scheme to simulate a stochastic process for which ${P_N}$ is invariant, or use a Metropolis adjusted Langevin approach (MALA) Roberts and Rosenthal. In this context, a very natural way to approximate numerically the equilibrium measure ${\mu_\star}$ is to use a simulated annealing stochastic algorithm.

More general energies. The density of ${P_N}$ takes the form

$\prod_{i=1}^Nf_1(x_i)\prod_{1\leq i<j\leq N}f_2(x_i,x_j),$

which comes from the structure of ${H_N}$. One may study more general energies with many bodies interactions, of the form, for some prescribed symmetric ${W_k:(\mathbb{R}^d)^k\mapsto\mathbb{R}}$, ${1\leq k\leq K}$, ${K\geq1}$,

$H_N(x_1,\ldots,x_N) =\sum_{k=1}^K \sum_{i_1<\cdots<i_k}N^{-k}W_k(x_{i_1},\ldots,x_{i_k}).$

This leads to the following candidate for the asymptotic first order global energy functional:

$\mu\mapsto\sum_{k=1}^K 2^{-k}\int\!\cdots\int\!W_k(x_1,\ldots,x_k)\,d\mu(x_1)\cdots d\mu(x_k).$

Stochastic processes. Under general assumptions on ${V}$ and ${W}$, see for instance Royer, the law ${P_N}$ is the invariant probability measure of a well defined (the absence of explosion comes from the assumptions on ${V}$ and ${W}$) reversible Markov diffusion process ${{(X_t)}_{t\in\mathbb{R}_+}}$ with state space

$\{x\in(\mathbb{R}^d)^N:H_N(x)<\infty\} =\{x\in(\mathbb{R}^d)^N:\sum_{i<j}W(x_i,x_j)<\infty\},$

solution of the system of Kolmogorov stochastic differential equations

$dX_t=\sqrt{2\frac{\alpha_N}{\beta_N}}\,dB_t-\alpha_N\nabla H_N(X_t)\,dt$

where ${{(B_t)}_{t\geq0}}$ is a standard Brownian motion on ${(\mathbb{R}^d)^N}$, and where ${\alpha_N>0}$ is an arbitrary scale parameter (natural choices being ${\alpha_N=1}$ and ${\alpha_N=\beta_N}$). The law ${P_N}$ is the equilibrium distribution of a system of ${N}$ interacting Brownian particles ${{(X_{1,t})}_{t\geq0},\ldots,{(X_{N,t})}_{t\geq0}}$ in ${\mathbb{R}^d}$ at inverse temperature ${\beta_N}$, with equal individual “charge” ${q_N=1/N}$, subject to a confining potential ${\alpha_N V}$ acting on each particle and to an interaction potential ${\alpha_N W}$ acting on each pair of particles, and one can rewrite the stochastic differential equation above as the system of coupled stochastic differential equations (${1\leq i\leq N}$)

$dX_{i,t} =\sqrt{2\frac{\alpha_N}{\beta_N}}\,dB_{i,t} -q_N\alpha_N\nabla V(X_{i,t}) -\sum_{j\neq i}q_N^2\alpha_N\nabla_1W(X_{i,t},X_{j,t})\,dt$

where ${{(B_t^{(1)})}_{t\geq0},\ldots,{(B_t^{(N)})}_{t\geq0}}$ are i.i.d. standard Brownian motions on ${\mathbb{R}^d}$. From a partial differential equations point of view, the probability measure ${P_N}$ is the steady state solution of the Fokker-Planck evolution equation ${\partial_t-L=0}$ where ${L}$ is the elliptic Markov diffusion operator (second order linear differential operator without constant term)

$L=\frac{\alpha_N}{\beta_N}\left(\Delta-\beta_N\nabla H_N\cdot\nabla\right),$

acting as ${Lf=\frac{\alpha_N}{\beta_N}(\Delta f-\left<\beta_N\nabla H_N,\nabla f\right>)}$. This self-adjoint operator in ${\mathrm{L}^2(P_N)}$ is the infinitesimal generator of the Markov semigroup ${{(P_t)}_{t\geq0}}$, ${P_t(f)(x)=\mathbb{E}(f(X_t)|X_0=x)}$. Let us take ${\alpha_N=\beta_N}$ for convenience. In the case where ${V(x)=|x|^2}$ and ${W\equiv0}$ (no interaction) then ${P_N}$ is a standard Gaussian law ${\mathcal{N}(0,I_{dN})}$ on ${(\mathbb{R}^d)^N}$ and ${{(X_t)}_{t\geq0}}$ is an Ornstein-Uhlenbeck Gaussian process; while in the case where ${d=1}$ and ${V(x)=|x|^2}$ and ${W(x,y)=-\beta\log|x-y|}$ of some fixed parameter ${\beta>0}$ then ${P_N}$ is the law of the spectrum of a ${\beta}$-Ensemble of random matrices and ${{(X_t)}_{t\geq0}}$ is a so called Dyson Brownian motion Anderson and Guionnet and Zeitouni. If ${\mu_{N,t}}$ is the law of ${X_t}$ then ${\Delta\mu_{N,t}\rightarrow\Delta\mu_N}$ weakly as ${t\rightarrow\infty}$. The study of the dynamic aspects is an interesting problem connected to McKean-Vlasov models Cépa and Lépingle, Fontbona, Li and Li and Xie, Osada, Osada.

Calogero-(Moser-)Sutherland Schrödinger operators. Let us keep the notation used above. We define ${U_N=\beta_NH_N}$ and we take ${\beta_N=N^2}$ for simplicity. Let us consider the isometry ${\Theta:\mathrm{L}^2(P_N)\rightarrow\mathrm{L}^2(dx)}$ defined by

$\Theta(f)(x)=f(x)\sqrt{\frac{dP_N(x)}{dx}}=f(x)e^{-\frac{1}{2}(U_N(x)+\log(Z_N))}.$

The differential operator ${S=-\Theta L \Theta^{-1}}$ is a Schrödinger operator:

$S=-\Theta L \Theta^{-1}=-\Delta+Q, \quad Q=\frac{1}{4}|\nabla U_N|^2-\frac{1}{2}\Delta U_N$

which acts as ${S f=-\Delta f+Qf}$. The operator ${S}$ is self-adjoint in ${\mathrm{L}^2(dx)}$. Being isometrically conjugated, the operators ${-L}$ and ${S}$ have the same spectrum, and their eigenspaces are isometric. In the case where ${V(x)=|x|^2}$ and ${W\equiv0}$ (no interactions), we find that and ${Q=\frac{1}{2}(1-V)}$, and ${S}$ is a harmonic oscillator. On the other hand, following Proposition 11.3.1 in Forrester, in the case ${d=1}$ and ${W(x,y)=-\log|x-y|}$ (Coulomb interaction), then ${S}$ is a Calogero-(Moser-)Sutherland Schrödinger operator:

$S=-\Delta -E_0+\frac{1}{4}\sum_{i=1}^Nx_i^2 -\frac{1}{2}\sum_{1\leq i<j\leq N}\frac{1}{(x_i-x_j)^2}, \quad E_0=\frac{N}{2}+\frac{N(N-1)}{2}.$

More examples are given in Proposition 11.3.2 of MR2641363, related to classical ensembles of random matrices. The study of the spectrum and eigenfunctions of such operators is a wide subject, connected to Dunkl operators. These models attracted some attention due to the fact that for several natural choices of the potentials ${V,W}$, they are exactly solvable (or integrable). We refer to section 11.3.1 of Forrester, Section 9.6 of Dunkl and Xue, and Section 2.7 of Chybiryakov et al.

Résumé. Les revues de mathématiques françaises peuvent-elles adopter le modèle Open Access Diamond, gratuit à la fois pour les auteurs et pour les lecteurs ? Oui, et cet objectif peut parfaitement être atteint, avec des éditeurs publics ou privés, grâce à une baisse des coûts et à de nouveaux circuits de l’argent.

Mots-clés. Coût des publications; gratuité; électronique; circuit de l’argent.

Texte. Le problème du coût des publications, à la fois international et pluridisciplinaire, est difficile à résoudre au seul niveau de la France ou des mathématiques. Néanmoins, il est parfaitement possible, dès maintenant, de changer le modèle utilisé par les revues de mathématiques françaises. Nous avons là une opportunité exceptionnelle pour donner l’exemple à l’échelle européenne et mondiale. Les mathématiques françaises sont fortes et structurées, et bénéficient d’institutions puissantes. Nous pouvons, si nous le souhaitons, faire preuve d’audace et choisir notre avenir.

Ce texte part du constat qu’en matière de publication scientifique, le temps est décidément à l’électronique, et aborde le problème suivant: comment faire vivre de manière durable des revues mathématiques françaises purement électroniques, de bonne qualité, d’accès gratuit à la fois pour auteurs et lecteurs?

Plusieurs solutions ont été proposées à ce problème. Deux exemples concrets ont été présentés récemment dans la Gazette: Electronic Journal of Probability (EJP) dans [C] et Journal de l’École Polytechnique (JEP) dans [S]. L’analyse de ces deux exemples révèle à quel point les coûts et la quantité des tâches à accomplir peuvent être réduits de manière drastique, grâce notamment au passage au tout électronique, à l’utilisation d’un logiciel de gestion éditoriale, à l’absence de secrétariat, à l’absence de gestion d’abonnements, mais aussi grâce au soutien d’organisations dont c’est en partie la mission comme le CNRS(Mathrice), le Centre de diffusion de revues académiques mathématiques (CEDRAM), Public Knowledge Project (PKP), etc.

Les solutions du type EJP ou JEP sont cependant critiquées ici et là. Elles seraient difficilement généralisables car elles reposent de manière cruciale sur un bénévolat informatique assuré par un mathématicien «geek» sur son temps libre. Même si ce sont les auteurs qui adaptent leur article au format LaTeX de la revue, il reste toute l’ingénierie autour des meta-données notamment, effectuée par un mathématicien bénévole. Ce sont les fameux «coûts cachés», qui seraient couverts précisément par les abonnements dans le modèle traditionnel. La critique est fondée. Certains rétorquent que les comités éditoriaux reposent eux aussi de manière cruciale sur un bénévolat assuré par des mathématiciens sur leur temps libre. Cependant, ce travail éditorial est plus naturel pour les mathématiciens. Il serait certainement idéal que le travail informatique indispensable soit accompli par un professionnel de ce genre de choses.

Soulignons qu’aujourd’hui, la plupart des revues de mathématiques sont électroniques, et que celles qui n’utilisent pas de logiciel de gestion éditoriale sont de plus en plus rares. La spécificité des revues comme EJP ou JEP est d’être purement électroniques, d’accès gratuit pour auteurs et lecteurs, et de fonctionner avec un budget très limité, sans secrétariat. Leur prestige leur permettrait sans doute de monter en gamme de luxe, en sollicitant une institution généreuse et bienveillante. Cependant, il nous faut souligner également que le logiciel de gestion éditoriale utilisé vaut vraiment mieux qu’un secrétariat trop ordinaire. Il reste surtout à trouver comment faire pour susciter ou remplacer le bénévolat «geek». En fait, ce ne sont pas les moyens techniques ou financiers qui manquent, mais plutôt les solutions clé en main pour les comités éditoriaux. Voici quelques pistes.

Rôle du CNRS. Le CNRS a une mission nationale, et des services efficaces comme Mathrice et le Centre pour la Communication Scientifique Directe (CCSD). Il s’agit là d’une richesse rare et remarquable dont nous devons tirer partie. Le CNRS via l’INSMI et la Cellule de coordination documentaire nationale pour les maths (MathDoc), pourrait mettre en place progressivement une plate-forme nationale pour les revues électroniques française de mathématiques. Cela aurait l’avantage d’être mutualisé et professionnel, et pourrait se faire en cohérence avec le Centre de diffusion de revues académiques mathématiques (CEDRAM), ainsi qu’avec les organismes d’archivage comme Numdam, et d’archivage pérenne comme le Centre Informatique National de l’Enseignement Supérieur CINES. Cela reviendrait en quelque sorte à mettre en place des presses électroniques nationales. Cela serait compatible avec le développement du projet Episciences.org : une revue électronique hébergée par la plate-forme du CNRS pourraient devenir, si elle le souhaite, une épirevue. L’offre de service aux revues de la plate-forme pourrait aller du simple hébergement comme pour JEP, à la prise en charge intégrale.

Rôle des sociétés savantes. Les sociétés savantes pourraient impulser un changement en phase avec les grands mouvements à l’échelle européenne (La commission européenne a décidé que l’OpenAccess devait s’imposer avant 2020) en déclarant par exemple qu’à l’horizon 2016, toutes les revues de mathématiques françaises doivent passer au tout électronique à accès gratuit pour auteurs et lecteurs.

Rôle des établissements. Un certain nombre de revues françaises sont naturellement associées à un établissement éponyme: Journal de l’École Polytechnique, Annales de l’ÉNS, Annales de l’Institut Fourier, etc. En devenant purement électroniques et libres d’accès pour auteurs et lecteurs, elles simplifieraient leur mode de fonctionnement, baisseraient leur prix de revient, et pourraient se tourner vers leur établissement pour obtenir un soutien, ainsi que vers le CNRS (cf. ci-dessus). Par ailleurs, il est bien connu qu’un mathématicien enseignant-chercheur effectuant une tâche administrative importante bénéficie en général d’une décharge de service d’enseignement. Pourquoi ne pas utiliser ce mécanisme pour soutenir un mathématicien «geek» enseignant-chercheur qui souhaite effectuer le travail informatique nécessaire au fonctionnement d’une revue électronique ? Un établissement suffisamment doté peut alternativement soutenir une revue en affectant (à temps partiel) un professionnel de l’informatique à la gestion technique de la revue électronique. Curieusement, beaucoup de revues françaises ont bénéficié ou bénéficient encore de moyens humains non négligeables, dont l’énergie est absorbée en partie par un mode de fonctionnement obsolète, incluant gestion d’une version papier et abonnements.

Rôle des fondations. Des fondations richement dotées ont fait leur apparition dans le paysage mathématique français ces dernières années, comme par exemple la Fondation des Sciences Mathématiques de Paris (FSMP), et la Fondation Mathématique Jacques Hadamard (FMJH). Elles pourraient naturellement contribuer à soutenir des revues purement électroniques d’accès gratuit pour auteurs et lecteurs. La FMJH s’est par exemple déjà engagée à soutenir JEP si nécessaire.

Cas des comptes rendus de l’Académie des Sciences. Il faut payer (cher) un abonnement pour consulter électroniquement les notes aux comptes rendus de l’Académie des Sciences (CRAS). Cette publication française est éditée par Elsevier. Les CRAS, au moins pour les mathématiques, pourraient devenir, disons avant 2016, purement électroniques, d’accès gratuit pour auteurs et lecteurs, en faisant appel éventuellement au CNRS et aux fondations (cf. ci-dessus) si l’Académie n’en a pas les moyens. Une telle conversion serait symboliquement importante.

Cas des annales de l’Institut Henri Poincaré. La revue Annales de l’Institut Poincaré série A (Physique Mathématique & Théorique) est éditée par Birkhäuser, tandis que la série C (Analyse Non Linéaire) est éditée par Elsevier. La série B (Probabilités et Statistique) est passée récemment de Elsevier à Institut of Mathematical Statistics (IMS), une institution des USA à but non lucratif, qui soutient de nombreuses autres revues (dont EJP). Les comités éditoriaux de ces trois revues n’ont semble-t-il pas de solution clé en main à la française. Ces trois revues pourraient passer avant 2016 au tout électronique, à accès gratuit pour auteurs et lecteurs. Cela simplifierait leur gestion et baisserait leur prix de revient. Elles pourraient se tourner vers le CNRS (cf. ci-dessus). Alternativement, elles pourraient mettre en place une solution spécifique à l’IHP avec le soutien financier éventuel des fondations parisiennes (cf. ci-dessus). L’IHP est une institution dont la mission est nationale mais qui est très soutenue par les établissement parisiens, et il n’est pas choquant que les riches payent pour le bien de la nation.

Revues des sociétés savantes. La transformation des revues de la SMF et de la SMAI avant 2016 en revues purement électroniques à accès gratuit pour auteurs et lecteurs est plus délicate en raison des salariés hors CNRS employés par ces sociétés savantes. Ici encore, cette transformation simplifie la gestion et diminue considérablement le prix de revient. La sauvegarde des emplois pourrait se faire par une subvention du CNRS, ou par le versement volontaire d’une contribution par les bibliothèques. Une autre solution consisterait à faire parrainer chaque revue par un établissement particulier. Mais la solution la plus audacieuse, évoquée plus loin, serait sans doute la création d’une fondation pour changer le circuit de l’argent. Dans tous les cas, en Europe, le passage à un OpenAccess est obligatoire à l’horizon 2020, et les mathématiciens ne veulent pas du système auteur-payeur. Il faudra donc bien se restructurer peu ou prou.

Autres revues françaises. L’algorithme serait toujours le même: avant 2016, devenir purement électronique d’accès gratuit pour auteurs et lecteurs. Se tourner vers la plate-forme nationale du CNRS (cf. ci-dessus), et éventuellement vers une institution ou un établissement naturellement associé pour obtenir un soutien. La situation est parfois délicate, comme par exemple pour les Annales de l’Institut Fourier, gérées par une association, qui a mis en place de nombreux échanges avec d’autres revues pour garnir la bibliothèque de l’Institut Fourier. Néanmoins, dans tous les cas, en Europe, le passage à un OpenAccess est obligatoire à l’horizon 2020.

Facture des abonnements. Le changement de modèle des revues de mathématiques françaises proposé ci-dessus est ambitieux mais n’affecte pas (directement) le modèle des revues hors de France. En particulier, cela ne résout pas vraiment le problème des dépenses des bibliothèques de mathématiques françaises pour leurs abonnements. Cela dépasse le cadre français et les seules mathématiques. Cependant, une audace des mathématiques françaises pourrait inciter les européens à opter avant 2020 pour une alternative au système auteur-payeur.

Changer le circuit de l’argent. Les abonnements des revues traditionnelles sont payés par les lecteurs via leur institution. Pour être compatible avec la gratuité pour auteurs et lecteurs, il suffirait que ces institutions versent directement dans un fond commun géré par une fondation, qui financerait directement le fonctionnement des revues. Cela n’empêcherait pas les revues de sous-traiter le travail informatique à des éditeurs classiques si elles le souhaitent. Dans un tel système par fondation, ce ne sont plus les lecteurs ni les auteurs qui payent, mais plutôt les institutions, qui du reste payent depuis toujours les abonnements! Cette solution a plusieurs avantages:

• elle est compatible avec les solutions déjà évoquées dans ce texte;
• elle mutualise les moyens, et permet par exemple à des organismes riches de payer plus et à des organismes pauvres de payer moins. C’est une solution de ce type qui permet le financement durable de arXiv.org ;
• elle protège les publications fragiles et maximise la diffusion de la science;
• elle reste compatible avec les éditeurs à but lucratif, qui peuvent être financés par le fond à condition d’être en accès gratuit pour auteurs et lecteurs;
• elle est compatible avec l’organisation des publications des sociétés savantes (nul besoin de licencier, mais peut-être de réorganiser et de mutualiser);
• elle est compatible avec un système du même type pour d’autres disciplines, à l’échelle française, européenne ou mondiale;
• elle est compatible avec le projet Episciences.org;
• elle est compatible avec l’existence de divers degrés de luxe des revues, en fonction de leur prestige par exemple, et n’impose pas forcément le bas coût de EJP et de JEP (qui du reste pourraient monter en gamme de luxe!);
• elle redonne le pouvoir aux scientifiques dans le monde de l’édition;
• elle respecte le métier des éditeurs, en le finançant autrement;
• elle ne change que la destination du flux financier sortant des organismes, et n’impose que la gratuité pour auteurs et lecteurs;
• elle est compatible avec les multiples revues existantes, à condition de ne financer que la partie électronique. Si certains organismes tiennent absolument à une version papier, la revue peut très bien la leur vendre si elle le souhaite. Réciproquement, si une revue veut absolument produire du papier, elle peut toujours rechercher des organismes qui seraient prêts à se l’offrir.

Une telle fondation pourrait être reconnue d’utilité publique, et pourrait être administrée par des représentants des sociétés savantes, des organismes, des universités, etc. La mise en place pourrait être graduelle, avec une montée en puissance sur plusieurs années, sans grande révolution précipitée. Pourquoi ne pas tenter l’expérience ? Il ne tient qu’aux sociétés savantes de faire preuve d’audace en la matière. L’avenir nous dira si un tel système peut voir le jour, peut-être à l’échelle européenne. Ce mode de fonctionnement est déjà celui de arXiv.org à l’heure actuelle, avec la participation d’institutions du monde entier, y compris françaises.

Mot de la fin. La situation de l’écosystème des publications mathématiques françaises contraste avec l’organisation et la détermination des éditeurs à but lucratif, qui n’ont même pas besoin de diviser pour régner. Il est urgent pour notre communauté de se hisser à la hauteur des enjeux. Attendre que le système change pour s’y adapter est la meilleure manière de ne pas choisir son avenir.

Post-scriptum. Ce texte n’aborde pas certains problèmes importants liés aux publications, comme le problème de l’archivage, et le problème du droit d’auteur par exemple. Certains aspects de ces problèmes sont abordés dans [C], [P], et [S].

Références.

Note. Ce texte est disponible au format PDF. Il a été soumis pour publication à Matapli et à la Gazette des mathématiciens. L’auteur remercie Jean Dolbeault, Maria Esteban, Arnaud Guillin, Michel Ledoux, Claude Sabbah, et Christoph Sorger pour leurs commentaires sur des versions préliminaires.