From Boltzmann to random matrices and beyond

Persi during his talk – Toulouse, March 2014.

I have recently uploaded to the French open archive HAL a preprint numbered hal-00987177 (arXiv:1405.1003), modestly entitled From Boltzmann to random matrices and beyond. This is my first upload in the History and Overview (math.HO) section. This initial version has 26 pages. Remarkably, the number $26$, used for Applied Maths by the French Conseil national des universités, is the unique integer lying directly between a square ($25=5^2$) and a cube ($27=3^3$), cf. [GB].

This text forms the written notes of a talk entitled “About confined particles with singular pair repulsion”, given on the occasion of the workshop “Talking Across Fields” on convergence to equilibrium of Markov chains. This exciting workshop was organized in Toulouse from 24 to 28 March 2014 by Laurent Miclo on the occasion of the CIMI Excellence research chair for Persi Diaconis.

Regarding the content, these expository notes propose to follow, across fields, some aspects of the concept of entropy. Starting from the work of Boltzmann in the kinetic theory of gases, various universes are visited, including Markov processes and their Helmholtz free energy, the Shannon monotonicity problem in the central limit theorem, the Voiculescu free probability theory and the free central limit theorem, random walks on regular trees, the circular law for the complex Ginibre ensemble of random matrices, and finally the asymptotic analysis of mean-field particle systems in arbitrary dimension, confined by an external field and experiencing singular pair repulsion. The text is written in an informal style driven by energy and entropy. It aims to be recreational and to provide curious readers with entry points into the literature, and connections across boundaries.

De la Vallée Poussin on Uniform Integrability

Charles de la Vallée-Poussin.

“Belgium, victim of cruel violence, has met, in its ordeals, with great and numerous expressions of sympathy. The University of Louvain has received its share of them. Its professors, scattered after the disaster of the city, were invited to foreign universities. Many found refuge there, some even the possibility of continuing their teaching. Thus I was called to give lectures at Harvard University in America last year, then this year at the Collège de France. The present volume contains the material of the lectures that I gave at the Collège de France between December 1915 and March 1916. It is, after many others, a modest memento of these events. … Part of the results established in this article were in the third edition of Volume II of my Cours d’Analyse, which was in press at the time of the burning of Louvain. The Germans burned this work: all the facilities of my publisher indeed shared the fate of the University library. …” Charles de la Vallée Poussin in Intégrales de Lebesgue, fonctions d’ensemble, classes de Baire, Leçons professées au Collège de France (1916). Excerpt from the introduction, translated from the French.

This post is devoted to some probabilistic aspects of uniform integrability, a basic concept that I like very much. Let \( {\Phi} \) be the class of non-decreasing functions \( {\varphi:\mathbb{R}_+\rightarrow\mathbb{R}_+} \) such that

\[ \lim_{x\rightarrow+\infty}\frac{\varphi(x)}{x}=+\infty. \]

This class contains for instance the convex functions \( {x\mapsto x^p} \) with \( {p>1} \) and \( {x\mapsto x\log(x)} \) (the latter up to a modification on \( {[0,1]} \), where it is negative). Let us fix a probability space \( {(\Omega,\mathcal{F},\mathbb{P})} \), and denote by \( {L^\varphi} \) the set of random variables \( {X} \) such that \( {\varphi(|X|)\in L^1=L^1((\Omega,\mathcal{F},\mathbb{P}),\mathbb{R})} \). We have \( {L^\varphi\subset L^1} \), since \( {x\leq\varphi(x)} \) for \( {x} \) large enough. Clearly, if \( {{(X_i)}_{i\in I}\subset L^1} \) is bounded in \( {L^\varphi} \) with \( {\varphi\in\Phi} \) then \( {{(X_i)}_{i\in I}} \) is bounded in \( {L^1} \).

Uniform integrability. For any family \( {{(X_i)}_{i\in I}\subset L^1} \), the following three properties are equivalent. When one (and thus all) of these properties holds true, we say that the family \( {{(X_i)}_{i\in I}} \) is uniformly integrable (UI). The first property can be seen as a natural definition of uniform integrability.

  1. (definition of UI) \( {\lim_{x\rightarrow+\infty}\sup_{i\in I}\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x})=0} \);
  2. (epsilon-delta) the family is bounded in \( {L^1} \): \( {\sup_{i\in I}\mathbb{E}(|X_i|)<\infty} \), and moreover \( {\forall\varepsilon>0} \), \( {\exists\delta>0} \), \( {\forall A\in\mathcal{F}} \), \( {\mathbb{P}(A)\leq\delta\Rightarrow\sup_{i\in I}\mathbb{E}(|X_i|\mathbf{1}_A)\leq\varepsilon} \);
  3. (de la Vallée Poussin) there exists \( {\varphi\in\Phi} \) such that \( {\sup_{i\in I}\mathbb{E}(\varphi(|X_i|))<\infty} \).

The second property is often referred to as the “epsilon-delta” criterion. The third and last property expresses boundedness in \( {L^\varphi\subset L^{1}} \) and is due to Charles-Jean Étienne Gustave Nicolas de la Vallée Poussin (1866 – 1962), a famous Belgian mathematician, also well known for his proof of the prime number theorem.
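To make the first property concrete, here is a minimal Python sketch. The two families \( {X_n=n\mathbf{1}_{U<1/n}} \) and \( {Y_n=\sqrt{n}\mathbf{1}_{U<1/n}} \), with \( {U} \) uniform on \( {[0,1]} \), are illustrative choices made for this example, and the supremum over \( {n} \) is replaced by a maximum over a finite range, which is only a proxy.

```python
import math

def tail_x(n: int, x: float) -> float:
    """E(|X_n| 1_{|X_n| >= x}) for X_n = n * 1_{U < 1/n}, U uniform on [0, 1]."""
    # X_n equals n with probability 1/n, so the tail expectation is n * (1/n) = 1.
    return 1.0 if n >= x else 0.0

def tail_y(n: int, x: float) -> float:
    """E(|Y_n| 1_{|Y_n| >= x}) for Y_n = sqrt(n) * 1_{U < 1/n}."""
    return math.sqrt(n) / n if math.sqrt(n) >= x else 0.0

def sup_tail(tail, x: float, n_max: int = 10_000) -> float:
    """Maximum over 1 <= n <= n_max, a finite proxy for the sup over all n."""
    return max(tail(n, x) for n in range(1, n_max + 1))

for x in (1.0, 10.0, 50.0):
    print(x, sup_tail(tail_x, x), sup_tail(tail_y, x))
```

Both families are bounded in \( {L^1} \), but the tail supremum of the first one does not decrease with \( {x} \), so it is not UI. The second one is bounded in \( {L^2} \) since \( {\mathbb{E}(Y_n^2)=1} \), and its tail supremum decays like \( {1/x} \), in line with the third property used with \( {\varphi(x)=x^2} \).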

Proof of \( {1\Rightarrow2} \). For the boundedness in \( {L^1} \), we write, for any \( {i\in I} \), and some \( {x\geq0} \) large enough,

\[ \mathbb{E}(|X_i|) \leq \mathbb{E}(|X_i|\mathbf{1}_{|X_i|<x})+\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x}) \leq x+\sup_{i\in I}\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x})<\infty. \]

Next, by assumption, for any \( {\varepsilon>0} \), there exists \( {x_\varepsilon} \) such that \( {\sup_{i\in I}\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x})\leq\varepsilon} \) for any \( {x\geq x_\varepsilon} \). If \( {\mathbb{P}(A)\leq \delta_\varepsilon:=\varepsilon/x_\varepsilon} \) then for any \( {i\in I} \),

\[ \mathbb{E}(|X_i|\mathbf{1}_A) =\mathbb{E}(|X_i|\mathbf{1}_{|X_i|<x_\varepsilon}\mathbf{1}_A) +\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x_\varepsilon}\mathbf{1}_A) \leq x_\varepsilon\mathbb{P}(A)+\varepsilon\leq 2\varepsilon. \]

Proof of \( {2\Rightarrow1} \). Since \( {{(X_i)}_{i\in I}} \) is bounded in \( {L^1} \), for every \( {j\in I} \), if \( {A:=A_j:=\{|X_j|\geq x\}} \), then, by the Markov inequality, \( {\mathbb{P}(A)\leq x^{-1}\sup_{i\in I}\mathbb{E}(|X_i|)\leq \delta} \) for \( {x} \) large enough, uniformly in \( {j\in I} \), and the assumption gives \( {\lim_{x\rightarrow+\infty}\sup_{i,j\in I}\mathbb{E}(|X_i|\mathbf{1}_{|X_j|\geq x})=0} \). Restricting to \( {i=j} \) gives the first property.

Proof of \( {3\Rightarrow1} \). For any \( {\varepsilon>0} \), since \( {\varphi\in\Phi} \), by definition of \( {\Phi} \) there exists \( {x_\varepsilon\geq0} \) such that \( {x\leq\varepsilon \varphi(x)} \) for every \( {x\geq x_\varepsilon} \), and therefore

\[ \sup_{i\in I}\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x_\varepsilon}) \leq \varepsilon\sup_{i\in I}\mathbb{E}(\varphi(|X_i|)\mathbf{1}_{|X_i|\geq x_\varepsilon}) \leq \varepsilon\sup_{i\in I}\mathbb{E}(\varphi(|X_i|)). \]

Proof of \( {1\Rightarrow 3} \). Let us seek a piecewise linear function \( {\varphi} \) of the form

\[ \varphi(x)=\int_0^x\varphi'(t)dt \quad\mbox{with}\quad \varphi'=\sum_{n=1}^\infty u_n\mathbf{1}_{[n,n+1[} \quad\mbox{and}\quad u_n=\sum_{m\geq1}\mathbf{1}_{x_m\leq n} \]

for a sequence \( {{(x_m)}_{m\geq1}\nearrow+\infty} \) to be constructed. We have \( {\varphi\in\Phi} \) since \( {\lim_{n\rightarrow\infty}u_n=+\infty} \). Moreover, since \( {\varphi\leq\sum_{n=1}^\infty(u_1+\cdots+u_n)\mathbf{1}_{[n,n+1[}} \), we get, for any \( {i\in I} \),

\[ \begin{array}{rcl} \mathbb{E}(\varphi(|X_i|)) &\leq& \sum_{n=1}^\infty (u_1+\cdots+u_n)\mathbb{P}(n\leq |X_i|<n+1) \\ &=& \sum_{n=1}^\infty u_n\mathbb{P}(|X_i|\geq n)\\ &=& \sum_{m\geq1}\sum_{n\geq x_m}\mathbb{P}(|X_i|\geq n)\\ &\leq & \sum_{m\geq1}\sum_{n\geq x_m}n\mathbb{P}(n\leq |X_i|<n+1)\\ &\leq & \sum_{m\geq1}\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x_m})\\ &\overset{*}{\leq} & \sum_{m\geq1}2^{-m}<\infty \end{array} \]

where \( {\overset{*}{\leq}} \) holds if for every \( {m} \) we select \( {x_m} \) such that \( {\sup_{i\in I}\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x_m})\leq 2^{-m}} \), which is allowed by assumption (we may replace \( {2^{-m}} \) by anything summable in \( {m} \)).

This completes the proof of the equivalence of the three properties: \( {1\Leftrightarrow 2\Leftrightarrow 3} \).

Alternative proof of \( {1\Rightarrow 3} \). Following a suggestion made by Nicolas Fournier on an earlier version of this post, one can simply construct \( {\varphi} \) as follows:

\[ \varphi(x)=\sum_{m\geq 1} (x-x_m)_+. \]

This function is non-decreasing and convex, and

\[ \frac{\varphi(x)}{x} = \sum_{m\geq1} \left(1-\frac{x_m}{x}\right)_+ \underset{x\rightarrow\infty}{\nearrow} \sum_{m\geq1} 1=\infty. \]

It remains to note that

\[ \mathbb{E}(\varphi(|X_i|)) \leq \sum_{m\geq1}\mathbb{E}(|X_i|\mathbf{1}_{|X_i|\geq x_m}) \leq 1. \]
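As an illustration of this construction, the following Python sketch computes the thresholds \( {x_m} \) and the function \( {\varphi} \) for a singleton family \( {\{X\}} \). The choice of an exponential law of parameter \( {1} \) for \( {X} \) is an assumption made only for this example, because it gives the closed forms \( {\mathbb{E}(X\mathbf{1}_{X\geq x})=(x+1)e^{-x}} \) and \( {\mathbb{E}((X-a)_+)=e^{-a}} \). The sum over \( {m} \) is truncated, and the \( {x_m} \) are restricted to integers.

```python
import math

def tail_expectation(x: float) -> float:
    """E(X 1_{X >= x}) for X ~ Exp(1), which equals (x + 1) * exp(-x)."""
    return (x + 1.0) * math.exp(-x)

def thresholds(m_max: int) -> list[int]:
    """Smallest integers x_m with E(X 1_{X >= x_m}) <= 2**(-m), m = 1..m_max."""
    xs, x = [], 0
    for m in range(1, m_max + 1):
        # The tail expectation is non-increasing in x, so we can resume the
        # search from the previous threshold.
        while tail_expectation(x) > 2.0 ** (-m):
            x += 1
        xs.append(x)
    return xs

def phi(x: float, xs: list[int]) -> float:
    """Truncated version of phi(x) = sum_m (x - x_m)_+."""
    return sum(max(x - xm, 0.0) for xm in xs)

xs = thresholds(30)
# For X ~ Exp(1), E((X - a)_+) = exp(-a), so E(phi(X)) is explicit.
expected_phi = sum(math.exp(-xm) for xm in xs)
print(xs[:5], expected_phi)
print([phi(x, xs) / x for x in (5.0, 10.0, 20.0)])
```

The first printed line shows the first thresholds and the value of \( {\mathbb{E}(\varphi(X))} \), which stays below \( {1} \) as in the bound above, while the second line shows that \( {\varphi(x)/x} \) increases with \( {x} \).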

Convexity and moderate growth. In the de la Vallée Poussin criterion, one can always assume that the function \( {\varphi} \) is convex (in particular continuous), with moderate growth in the sense that for every \( {x\geq0} \),

\[ \varphi(x)\leq x^2. \]

Indeed, the construction of \( {\varphi} \) that we gave above provides a function \( {\varphi} \) with piecewise constant and non-decreasing derivative, thus the function is convex (it is a Young function). Following Paul-André Meyer, to derive the moderate growth property, we may first observe that thanks to the way we constructed \( {\varphi} \), for every \( {x\geq0} \),

\[ \varphi'(2x)\leq c\varphi'(x) \]

where one can take \( {c=2} \). This follows from the fact that we have the freedom to take the \( {x_m} \)’s as large as we want, for instance in such a way that for every \( {n\geq1} \),

\[ u_{2n}\leq u_n+\sum_{m\geq1}\mathbf{1}_{n<x_m\leq 2n}\leq 2u_n. \]

Consequently, the function \( {\varphi} \) itself also has moderate growth since, denoting \( {C:=2c} \),

\[ \varphi(2x)=\int_0^{2x}\!\varphi'(s)\,ds=\int_0^x\!\varphi'(2t)\,2dt\leq 2c\varphi(x)=C\varphi(x). \]

Now \( {\varphi(x)\leq C^{1+k}\varphi(2^{-k}x)} \) for any \( {k\geq1} \), and taking \( {k=k_x=\lceil\log_2(x)\rceil} \) we obtain

\[ \varphi(x)\leq C^2C^{\log_2(x)}\varphi(1)=C^2\varphi(1)2^{\log_2(C)\log_2(x)}=C^2\varphi(1)x^{\log_2(C)}. \]

Since one can take \( {c=2} \) we get \( {C=4} \), which allows \( {\varphi(x)\leq x^2} \) by scaling.

Examples of UI families.

  • Any finite subset of \( {L^1} \) is UI;
  • More generally, if \( {\sup_{i\in I}|X_i|\in L^1} \) (domination: \( {|X_i|\leq X} \) for every \( {i\in I} \), with \( {X\in L^1} \)) then \( {{(X_i)}_{i\in I}} \) is UI. To see it, we may first observe that the singleton family \( {\{\sup_{i\in I}|X_i|\}} \) is UI, and thus, by the de la Vallée Poussin criterion, there exists \( {\varphi\in\Phi} \) such that \( {\varphi(\sup_{i\in I}|X_i|)\in L^1} \), and therefore

    \[ \sup_{i\in I}\mathbb{E}(\varphi(|X_i|))\leq\mathbb{E}(\sup_{i\in I}\varphi(|X_i|))\leq\mathbb{E}(\varphi(\sup_{i\in I}|X_i|))<\infty, \]

    which implies, by the de la Vallée Poussin criterion again, that \( {{(X_i)}_{i\in I}} \) is UI;

  • If \( {\mathcal{U}_1\subset L^1,\ldots,\mathcal{U}_n\subset L^1} \) is a finite collection of UI families then the union \( {\mathcal{U}_1\cup\cdots\cup\mathcal{U}_n} \) is UI, and also the convex hull \( {\mathrm{conv}(\mathcal{U}_1\cup\cdots\cup\mathcal{U}_n)} \) is UI (the full vector span is not, since it is unbounded in \( {L^1} \) as soon as it contains a nonzero element);
  • If \( {{(X_n)}_{n\geq1},X\in L^1} \) and \( {X_n\overset{L^1}{\rightarrow} X} \) then \( {{(X_n)}_{n\geq1}\cup\{X\}} \) is UI and \( {{(X_n-X)}_{n\geq1}} \) is UI. To see it, for any \( {\varepsilon>0} \), we first select \( {N} \) large enough such that \( {\sup_{k\geq N}\mathbb{E}(|X_k-X|)\leq\varepsilon} \), and then \( {\delta>0} \) with the epsilon-delta criterion for the finite family \( {\{X_1-X,\ldots,X_N-X\}} \), which gives, for any \( {A\in\mathcal{F}} \) such that \( {\mathbb{P}(A)\leq\delta} \),

    \[ \sup_{n\geq1}\mathbb{E}(|X_n-X|\mathbf{1}_A) \leq \max\left(\max_{1\leq k\leq N}\mathbb{E}(|X_k-X|\mathbf{1}_A) ; \sup_{k\geq N}\mathbb{E}(|X_k-X|) \right) \leq \varepsilon. \]

    The family \( {{(X_n)}_{n\geq1}\cup\{X\}} \) is then UI as well, since \( {X_n=(X_n-X)+X} \) and \( {\{X\}} \) is UI;

  • The de la Vallée Poussin criterion is often used with \( {\varphi(x)=x^2} \), and means in this case that every bounded subset of \( {L^2} \) is UI.

Integrability. The de la Vallée Poussin criterion, when used with a singleton UI family \( {\{X\}} \), states that \( {X\in L^1} \) implies that \( {\varphi(|X|)\in L^1} \) for some \( {\varphi\in\Phi} \). In other words, for every random variable, integrability can always be improved in a sense. There is no paradox here since \( {\varphi} \) actually depends on \( {X} \). Topologically, integrability is in a sense an open statement rather than a closed statement. An elementary instance of this phenomenon is visible for Riemann series, in the sense that if \( {\sum_{n\geq1}n^{-s}<\infty} \) for some \( {s>0} \) then \( {\sum_{n\geq1}n^{-s'}<\infty} \) for some \( {s'<s} \), because the convergence condition of the series is “\( {s>1} \)”, which is an open condition.

Integrability of the limit. If \( {{(X_n)}_{n\geq1}} \) is UI and \( {X_n\rightarrow X} \) almost surely as \( {n\rightarrow\infty} \) then \( {X\in L^1} \). Indeed, by the Fatou Lemma,

\[ \mathbb{E}(|X|) =\mathbb{E}(\varliminf_{n\rightarrow\infty}|X_n|) \leq\varliminf_{n\rightarrow\infty}\mathbb{E}(|X_n|) \leq\sup_{n\geq1}\mathbb{E}(|X_n|)<\infty. \]

Dominated convergence. For any \( {{(X_n)}_{n\geq1},X\in L^1} \), we have

\[ X_n\overset{ L^1}{\rightarrow}X \quad\text{if and only if}\quad X_n\overset{\mathbb{P}}{\rightarrow}X \text{ and } {(X_n)}_{n\geq1}\text{ is UI} \]

and in fact, for any \( {{(X_n)}_{n\geq1}\subset L^1} \) and any random variable \( {X} \),

\[ X\in L^1\text{ and }X_n\overset{ L^1}{\rightarrow}X \quad\text{if and only if}\quad X_n\overset{\mathbb{P}}{\rightarrow}X \text{ and } {(X_n)}_{n\geq1}\text{ is UI}. \]

This can be seen as an improved dominated convergence theorem (since \( {{(X_n)}_{n\geq1}} \) is UI when \( {\sup_n|X_n|\in L^1} \)). The proof may go as follows. We already know that if \( {X_n\rightarrow X} \) in \( {L^1} \) then \( {X_n\rightarrow X} \) in probability (Markov inequality) and \( {{(X_n)}_{n\geq1}\cup\{X\}} \) is UI (see above). Conversely, if \( {X_n\rightarrow X} \) in probability and \( {{(X_n)}_{n\geq1}} \) is UI, then \( {X\in L^1} \) (extract a subsequence converging almost surely and use the Fatou lemma as above), hence \( {{(X_n-X)}_{n\geq1}} \) is UI, and thus, using the convergence in probability and the epsilon-delta criterion, we obtain that for any \( {\varepsilon>0} \) and large enough \( {n} \),

\[ \mathbb{E}(|X_n-X|) =\mathbb{E}(|X_n-X|\mathbf{1}_{|X_n-X|\geq\varepsilon}) +\mathbb{E}(|X_n-X|\mathbf{1}_{|X_n-X|<\varepsilon}) \leq 2\varepsilon. \]

Martingales. The American mathematician Joseph Leo Doob (1910 – 2004) showed that if a sub-martingale \( {{(M_n)}_{n\geq1}} \) is bounded in \( {L^1} \) then there exists \( {M_\infty\in L^1} \) such that \( {M_n\rightarrow M_\infty} \) almost surely. Moreover, in the case of martingales, this convergence also holds in \( {L^1} \) if and only if \( {{(M_n)}_{n\geq1}} \) is UI. In the same spirit, and a bit more precisely, if \( {{(M_n)}_{n\geq1}} \) is a martingale for a filtration \( {{(\mathcal{F}_n)}_{n\geq1}} \), then the following two properties are equivalent:

  • \( {{(M_n)}_{n\geq1}} \) is UI;
  • \( {{(M_n)}_{n\geq1}} \) is closed, meaning that there exists \( {M_\infty\in L^1} \) such that \( {M_n=\mathbb{E}(M_\infty|\mathcal{F}_n)} \) for all \( {n\geq1} \), and moreover \( {M_n\rightarrow M_\infty} \) almost surely and in \( {L^1} \).

Are there non-UI martingales? Yes, but they are necessarily unbounded: \( {\sup_{n\geq0}|M_n|\not\in L^1} \), otherwise we may apply dominated convergence. A nice counterexample is given by the critical Galton-Watson branching process, defined recursively by \( {M_0=1} \) and

\[ M_{n+1}=X_{n+1,1}+\cdots+X_{n+1,M_n}, \]

where \( {X_{n+1,j}} \) is the number of offspring of individual \( {j} \) in generation \( {n} \), and \( {{(X_{j,k})}_{j,k\geq1}} \) are i.i.d. random variables on \( {\mathbb{N}} \) with law \( {\mu} \) of mean \( {1} \) and such that \( {\mu(0)>0} \). The sequence \( {{(M_n)}_{n\geq0}} \) is a non-negative martingale, and thus it converges almost surely to some \( {M_\infty\in L^1} \). It is also a Markov chain with state space \( {\mathbb{N}} \). The state \( {0} \) is absorbing and all the remaining states can lead to \( {0} \) and are thus transient. It follows then that almost surely, either \( {{(M_n)}_{n\geq0}} \) converges to \( {0} \) or to \( {+\infty} \), and since \( {M_\infty \in L^1} \), it follows that \( {M_\infty=0} \). However, the convergence cannot hold in \( {L^1} \) since this leads to the contradiction \( {1=\mathbb{E}(M_n)\rightarrow\mathbb{E}(M_\infty)=0} \) (note that \( {{(M_n)}_{n\geq0}} \) is bounded in \( {L^1} \)).
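This loss of mass can be observed numerically. The Python sketch below simulates the process under an assumed offspring law, chosen only for the example: each individual has \( {0} \) or \( {2} \) children with probability \( {1/2} \) each, which has mean \( {1} \) and satisfies \( {\mu(0)>0} \). The number of generations, the number of runs and the seed are arbitrary.

```python
import random

def next_generation(m: int, rng: random.Random) -> int:
    """One step M_n -> M_{n+1}: each of the m individuals has 0 or 2 children
    with probability 1/2 each (an assumed offspring law with mean 1)."""
    return sum(2 for _ in range(m) if rng.random() < 0.5)

def simulate(n_generations: int, rng: random.Random) -> int:
    """Return M_n for a critical Galton-Watson process started at M_0 = 1."""
    m = 1
    for _ in range(n_generations):
        m = next_generation(m, rng)
        if m == 0:  # the state 0 is absorbing
            break
    return m

rng = random.Random(0)
samples = [simulate(50, rng) for _ in range(2000)]
extinct_fraction = sum(s == 0 for s in samples) / len(samples)
empirical_mean = sum(samples) / len(samples)
print(extinct_fraction, empirical_mean)
```

Most trajectories are already extinct after 50 generations, while the empirical mean is a noisy estimate of \( {\mathbb{E}(M_{50})=1} \): the expectation is carried by a few large surviving populations, which is exactly what uniform integrability forbids.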

Topology. The Dunford-Pettis theorem, due to the American mathematicians Nelson James Dunford (1906 – 1986) and Billy James Pettis (1913 – 1979), states that for every family \( {{(X_i)}_{i\in I}\subset L^1} \), the following propositions are equivalent.

  • \( {{(X_i)}_{i\in I}} \) is UI;
  • \( {{(X_i)}_{i\in I}} \) is relatively compact for the weak \( {\sigma(L^1,L^\infty)} \) topology;
  • \( {{(X_i)}_{i\in I}} \) is relatively sequentially compact for the weak \( {\sigma(L^1,L^\infty)} \) topology.

The proof, which is not given in this post, can be found for instance in Dellacherie and Meyer or in Diestel. We just recall that a sequence \( {{(X_n)}_{n\geq1}} \) in \( {L^1} \) converges to \( {X\in L^1} \) for the weak \( {\sigma(L^1,L^\infty)} \) topology when \( {\lim_{n\rightarrow\infty}\ell(X_n)=\ell(X)} \) for every \( {\ell\in (L^1)'=L^\infty} \), in other words when \( {\lim_{n\rightarrow\infty}\mathbb{E}(YX_n)=\mathbb{E}(YX)} \) for every \( {Y\in L^\infty} \).

The Dunford-Pettis theorem opens the door for the fine analysis of closed (possibly linear) subsets of \( {L^1} \), a deep subject in functional analysis and Banach spaces.

Tightness. If \( {{(X_i)}_{i\in I}\subset L^1} \) is bounded in \( {L^\varphi} \) for \( {\varphi:\mathbb{R}_+\rightarrow\mathbb{R}_+} \) non-decreasing and such that \( {\lim_{x\rightarrow\infty}\varphi(x)=+\infty} \), then the Markov inequality gives

\[ \sup_{i\in I}\mathbb{P}(|X_i|\geq R)\leq \frac{\sup_{i\in I}\mathbb{E}(\varphi(|X_i|))}{\varphi(R)}\underset{R\rightarrow\infty}{\longrightarrow}0 \]

and thus the family of distributions \( {{(\mathbb{P}_{X_i})}_{i \in I}} \) is tight, in the sense that for every \( {\varepsilon>0} \), there exists a compact subset \( {K_\varepsilon} \) of \( {\mathbb{R}} \) such that \( {\inf_{i\in I}\mathbb{P}(X_i\in K_\varepsilon)\geq 1-\varepsilon} \). The Prohorov theorem states that tightness is equivalent to being relatively compact for the narrow topology (which is by the way metrizable by the bounded-Lipschitz Fortet-Mourier distance). The Prohorov and the Dunford-Pettis theorems correspond to different topologies on different objects (distributions or random variables). The tightness of \( {{(\mathbb{P}_{X_i})}_{i\in I}} \) is strictly weaker than the UI of \( {{(X_i)}_{i\in I}} \).

UI functions with respect to a family of laws. The UI property for a family \( {{(X_i)}_{i\in I}\subset L^1} \) depends actually only on the marginal distributions \( {{(\mathbb{P}_{X_i})}_{i\in I}} \) and does not feel the dependence between the \( {X_i} \)’s. In this spirit, if \( {(\eta_i)_{i\in I}} \) is a family of probability measures on a Borel space \( {(E,\mathcal{E})} \) and if \( {f:E\rightarrow\mathbb{R}} \) is a Borel function then we say that \( {f} \) is UI for \( {(\eta_i)_{i\in I}} \) on \( {E} \) when

\[ \lim_{t\rightarrow\infty}\sup_{i\in I}\int_{\{|f|>t\}}\!|f|\,d\eta_i=0. \]

This means that \( {{(f(X_i))}_{i\in I}} \) is UI, where \( {X_i\sim\eta_i} \) for every \( {i\in I} \). This property is often used in applications as follows: if \( {\eta_n\rightarrow\eta} \) narrowly as \( {n\rightarrow\infty} \) for some probability measures \( {{(\eta_n)}_{n\geq1}} \) and \( {\eta} \) and if \( {f} \) is continuous and UI for \( {(\eta_n)_{n\geq1}} \) then

\[ \int\!|f|\,d\eta<\infty \quad\text{and}\quad \lim_{n\rightarrow\infty}\int\!f\,d\eta_n=\int\!f\,d\eta. \]
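The role of the UI assumption can be checked on a toy example. In the Python sketch below, the measures \( {\eta_n=(1-\frac{1}{n})\delta_0+\frac{1}{n}\delta_n} \), which converge narrowly to \( {\eta=\delta_0} \), and the two test functions \( {f(x)=x} \) and \( {f(x)=\sqrt{x}} \) are assumptions chosen for illustration.

```python
import math

def integral(f, n: int) -> float:
    """Integral of f against eta_n = (1 - 1/n) * delta_0 + (1/n) * delta_n."""
    return (1.0 - 1.0 / n) * f(0.0) + (1.0 / n) * f(float(n))

def identity(x: float) -> float:
    return x

# The narrow limit is delta_0, so the limiting integral is f(0) = 0 in both cases.
for n in (10, 100, 10_000):
    print(n, integral(identity, n), integral(math.sqrt, n))
```

The function \( {f(x)=x} \) is not UI for \( {(\eta_n)_{n\geq1}} \) and its integrals stay equal to \( {1} \), far from the limit value \( {f(0)=0} \). The function \( {f(x)=\sqrt{x}} \) is UI for \( {(\eta_n)_{n\geq1}} \) and its integrals \( {n^{-1/2}} \) do converge to \( {0} \).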

Logarithmic potential. What follows is extracted from the survey Around the circular law (random matrices). We have already devoted a previous post to the logarithmic potential. Let \( {\mathcal{P}(\mathbb{C})} \) be the set of probability measures on \( {\mathbb{C}} \) which integrate \( {\log\left|\cdot\right|} \) in a neighborhood of infinity. The logarithmic potential \( {U_\mu} \) of \( {\mu\in\mathcal{P}(\mathbb{C})} \) is the function \( {U_\mu:\mathbb{C}\rightarrow(-\infty,+\infty]} \) defined for all \( {z\in\mathbb{C}} \) by

\[ U_\mu(z)=-\int_{\mathbb{C}}\!\log|z-w|\,d\mu(w) =-(\log\left|\cdot\right|*\mu)(z). \]
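For a concrete instance of this definition, the Python sketch below estimates \( {U_\mu} \) by Monte Carlo when \( {\mu} \) is the uniform law on the unit disc, the limiting law in the circular law. This choice of \( {\mu} \), the sample size and the seed are assumptions of the example. For this law the potential has the classical closed form \( {U_\mu(z)=\frac{1}{2}(1-|z|^2)} \) if \( {|z|\leq1} \) and \( {U_\mu(z)=-\log|z|} \) if \( {|z|>1} \), which is used as a reference.

```python
import math
import random

def sample_unit_disc(rng: random.Random) -> complex:
    """Draw a point uniformly on the unit disc of the complex plane."""
    # 1 - random() lies in (0, 1], so the radius is never exactly zero.
    radius = math.sqrt(1.0 - rng.random())
    angle = 2.0 * math.pi * rng.random()
    return complex(radius * math.cos(angle), radius * math.sin(angle))

def log_potential_mc(z: complex, n_samples: int, rng: random.Random) -> float:
    """Monte Carlo estimate of U_mu(z) = -int log|z - w| dmu(w)."""
    total = 0.0
    for _ in range(n_samples):
        total += math.log(abs(z - sample_unit_disc(rng)))
    return -total / n_samples

def log_potential_exact(z: complex) -> float:
    """Closed form of U_mu for the uniform law on the unit disc."""
    r = abs(z)
    return (1.0 - r * r) / 2.0 if r <= 1.0 else -math.log(r)

rng = random.Random(0)
for z in (0j, 0.5 + 0.5j, 2 + 0j):
    print(z, log_potential_mc(z, 20_000, rng), log_potential_exact(z))
```

The estimate is only accurate up to Monte Carlo error, and the logarithmic singularity at \( {w=z} \) is integrable, which is why plain averaging works here.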

Let \( {\mathcal{D}'(\mathbb{C})} \) be the set of Schwartz-Sobolev distributions on \( {\mathbb{C}} \). We have \( {\mathcal{P}(\mathbb{C})\subset\mathcal{D}'(\mathbb{C})} \). Since \( {\log\left|\cdot\right|} \) is Lebesgue locally integrable on \( {\mathbb{C}} \), the Fubini-Tonelli theorem implies that \( {U_\mu} \) is a Lebesgue locally integrable function on \( {\mathbb{C}} \). In particular, we have \( {U_\mu<\infty} \) almost everywhere and \( {U_\mu\in\mathcal{D}'(\mathbb{C})} \). By using Green’s or Stokes’ theorems, one may show, for instance via the Cauchy-Pompeiu formula, that for any smooth and compactly supported function \( {\varphi:\mathbb{C}\rightarrow\mathbb{R}} \),

\[ \int_{\mathbb{C}}\!\Delta\varphi(z)\log|z|\,dxdy=2\pi\varphi(0) \]

where \( {z=x+iy} \). This can be written, in \( {\mathcal{D}'(\mathbb{C})} \),

\[ \Delta\log\left|\cdot\right| = 2\pi\delta_0. \]

In other words, \( {\frac{1}{2\pi}\log\left|\cdot\right|} \) is the fundamental solution of the Laplace equation on \( {\mathbb{R}^2} \). Note that \( {\log\left|\cdot\right|} \) is harmonic on \( {\mathbb{C}\setminus\{0\}} \). It follows that in \( {\mathcal{D}'(\mathbb{C})} \),

\[ \Delta U_\mu=-2\pi\mu, \]

i.e. for every smooth and compactly supported “test function” \( {\varphi:\mathbb{C}\rightarrow\mathbb{R}} \),

\[ \langle\Delta U_\mu,\varphi\rangle_{\mathcal{D}'} =\int_{\mathbb{C}}\!\Delta\varphi(z)U_\mu(z)\,dxdy =-2\pi\int_{\mathbb{C}}\!\varphi(z)\,d\mu(z) =-\langle2\pi\mu,\varphi\rangle_{\mathcal{D}'} \]

where \( {z=x+iy} \). Also \( {-\frac{1}{2\pi}U_\cdot} \) is the Green operator on \( {\mathbb{R}^2} \) (Laplacian inverse). For every \( {\mu,\nu\in\mathcal{P}(\mathbb{C})} \), we have

\[ U_\mu=U_\nu\text{ almost everywhere }\Rightarrow \mu=\nu. \]

To see it, since \( {U_\mu=U_\nu} \) in \( {\mathcal{D}'(\mathbb{C})} \), we get \( {\Delta U_\mu=\Delta U_\nu} \) in \( {\mathcal{D}'(\mathbb{C})} \), and thus \( {\mu=\nu} \) in \( {\mathcal{D}'(\mathbb{C})} \), and finally \( {\mu=\nu} \) as measures since \( {\mu} \) and \( {\nu} \) are Radon measures. (Note that this remains valid if \( {U_\mu=U_\nu+h} \) for some harmonic \( {h\in\mathcal{D}'(\mathbb{C})} \).) As for the Fourier transform, the pointwise convergence of logarithmic potentials along a sequence of probability measures implies the narrow convergence of the sequence to a probability measure. We need however some strong tightness. More precisely, if \( {{(\mu_n)}_{n\geq1}} \) is a sequence in \( {\mathcal{P}(\mathbb{C})} \) and if \( {U:\mathbb{C}\rightarrow(-\infty,+\infty]} \) is such that

  • (i) for a.a. \( {z\in\mathbb{C}} \), \( { \lim_{n\rightarrow\infty}U_{\mu_n}(z)=U(z)} \);
  • (ii) \( {\log(1+\left|\cdot\right|)} \) is UI for \( {(\mu_n)_{n \geq 1}} \);

then there exists \( {\mu \in \mathcal{P}(\mathbb{C})} \) such that

  • (j) \( {U_\mu=U} \) almost everywhere;
  • (jj) \( {\mu = -\frac{1}{2\pi}\Delta U} \) in \( {\mathcal{D}'(\mathbb{C})} \);
  • (jjj) \( {\mu_n\rightarrow\mu} \) narrowly.

Let us give a proof inspired by an article by Goldsheid and Khoruzhenko on random tridiagonal matrices. From the de la Vallée Poussin criterion, assumption (ii) implies that for every real number \( {r\geq1} \), there exists \( {\varphi\in\Phi} \), which may depend on \( {r} \), which is moreover convex and has moderate growth \( {\varphi(x)\leq 1+x^2} \), and such that

\[ \sup_n \int\!\varphi(\log(r+|w|))\,d\mu_{n}(w)<\infty. \]

Let \( {K\subset \mathbb{C}} \) be an arbitrary compact set. Take \( {r = r(K) \geq 1} \) large enough so that the ball of radius \( {r-1 } \) contains \( {K} \). Therefore for every \( {z\in K} \) and \( {w\in\mathbb{C}} \),

\[ \varphi(|\log|z-w||) \leq (1 + |\log|z-w||^2)\mathbf{1}_{\{|w|\leq r\}} +\varphi(\log(r+|w|))\mathbf{1}_{\{|w|>r\}}. \]

The couple of inequalities above, together with the local Lebesgue integrability of \( {(\log\left|\cdot\right|)^2} \) on \( {\mathbb{C}} \), imply, by using the Jensen inequality and the Fubini-Tonelli theorem,

\[ \sup_n\int_K\!\varphi(|U_{\mu_n}(z)|)\,dxdy \leq \sup_n\iint\!\mathbf{1}_K(z)\varphi(|\log|z-w||)\,d\mu_n(w)\,dxdy<\infty, \]

where \( {z=x+iy} \) as usual. Since the de la Vallée Poussin criterion is necessary and sufficient for UI, this means that the sequence \( {{(U_{\mu_n})}_{n\geq1}} \) is locally Lebesgue UI. Consequently, from (i) it follows that \( {U} \) is locally Lebesgue integrable and that \( {U_{\mu_n}\rightarrow U} \) in \( {\mathcal{D}'(\mathbb{C})} \). Since the differential operator \( {\Delta} \) is continuous in \( {\mathcal{D}'(\mathbb{C})} \), we find that \( {\Delta U_{\mu_n}\rightarrow\Delta U} \) in \( {\mathcal{D}'(\mathbb{C})} \). Since \( {\Delta U\leq0} \), it follows that \( {\mu:=-\frac{1}{2\pi}\Delta U} \) is a measure (see e.g. Hörmander). Since for a sequence of measures, convergence in \( {\mathcal{D}'(\mathbb{C})} \) implies narrow convergence, we get \( {\mu_n=-\frac{1}{2\pi}\Delta U_{\mu_n}\rightarrow\mu=-\frac{1}{2\pi}\Delta U} \) narrowly, which is (jj) and (jjj). Moreover, by assumption (ii) we get additionally that \( {\mu\in\mathcal{P}(\mathbb{C})} \). It remains to show that \( {U_\mu=U} \) almost everywhere. Indeed, for any smooth and compactly supported \( {\varphi:\mathbb{C}\rightarrow\mathbb{R}} \), since the function \( {\log\left|\cdot\right|} \) is locally Lebesgue integrable, the Fubini-Tonelli theorem gives

\[ \int\!\varphi(z)U_{\mu_n}(z)\,dz =-\int\!\left(\int\!\varphi(z)\log|z-w|\,dz\right)\,d\mu_n(w). \]

Now \( {\varphi*\log\left|\cdot\right|:w\in\mathbb{C}\mapsto\int\!\varphi(z)\log|z-w|\,dz} \) is continuous and is \( {\mathcal{O}(\log(1+\left|\cdot\right|))} \). Therefore, by (i-ii), \( {U_{\mu_n}\rightarrow U_\mu} \) in \( {\mathcal{D}'(\mathbb{C})} \), thus \( {U_\mu=U} \) in \( {\mathcal{D}'(\mathbb{C})} \) and then almost everywhere, giving (j).

Confined particles with singular pair repulsion

Charles Augustin Coulomb

I have recently uploaded the final version of arXiv:1304.7569, entitled First order global asymptotics for confined particles with singular pair repulsion, written in collaboration with Nathaël Gozlan and Pierre-André Zitt, and to appear in the Annals of Applied Probability. The former title was First order global asymptotics for Calogero-Sutherland gases, but we decided to follow the suggestion of one of the reviewers, since our work goes beyond Calogero-Sutherland models.

We study a physical system of \( {N} \) interacting particles in \( {\mathbb{R}^d} \), \( {d\geq1} \), subject to pair repulsion and confined by an external field. We establish a large deviations principle for their empirical distribution as \( {N} \) tends to infinity. In the case of Riesz interaction, including Coulomb interaction in arbitrary dimension \( {d>2} \), the rate function is strictly convex and admits a unique minimum, the equilibrium measure, characterized via its potential. It follows that almost surely, the empirical distribution of the particles tends to this equilibrium measure as \( {N} \) tends to infinity. In the more specific case of Coulomb interaction in dimension \( {d>2} \), and when the external field is a convex or increasing function of the radius, then the equilibrium measure is supported in a ring. With a quadratic external field, the equilibrium measure is uniform on a ball.

Particles and configuration energy. The system consists of \( {N} \) particles at positions \( {x_1,\ldots,x_N\in\mathbb{R}^d} \), \( {d\geq1} \), with identical “charge” \( {q_N=1/N} \), subject to a confining potential \( {V:\mathbb{R}^d\rightarrow\mathbb{R}} \) coming from an external field and acting on each particle, and to an interaction potential

\[ W:\mathbb{R}^d\times\mathbb{R}^d\rightarrow(-\infty,+\infty] \]

acting on each pair of particles. The function \( {W} \) is finite outside the diagonal and symmetric: for all \( {x,y\in\mathbb{R}^d} \) with \( {x\neq y} \), we have \( {W(x,y)=W(y,x)<\infty} \). The energy \( {H_N(x_1,\ldots,x_N)} \) of the configuration \( {(x_1,\ldots,x_N)\in(\mathbb{R}^d)^N} \) takes the form

\[ \begin{array}{rcl} H_N(x_1,\ldots,x_N) &=& \sum_{i=1}^Nq_NV(x_i) +\sum_{i<j}q_N^2W(x_i,x_j) \\ &=& \frac{1}{N}\sum_{i=1}^NV(x_i) +\frac{1}{N^2}\sum_{i<j}W(x_i,x_j) \\ &=&\int\!V(x)\,d\mu_N(x)+\frac{1}{2}\iint_{\neq}\!W(x,y)\,d\mu_N(x)\,d\mu_N(y) \end{array} \]

where

\[ \mu_N=\frac{1}{N}\sum_{i=1}^N\delta_{x_i} \]

is the empirical measure of the particles, and where the subscript “\( {\neq} \)” indicates that the double integral is off-diagonal. The energy \( {H_N:(\mathbb{R}^d)^N\rightarrow\mathbb{R}\cup\{+\infty\}} \) is a quadratic functional of the variable \( {\mu_N} \). We denote by \( {\left|\cdot\right|} \) the Euclidean norm of \( {\mathbb{R}^d} \) and we make the following additional assumptions:

  • (H1) The function \( {W:\mathbb{R}^d\times\mathbb{R}^d\rightarrow(-\infty,+\infty]} \) is continuous on \( {\mathbb{R}^d\times \mathbb{R}^d} \), symmetric, takes finite values on \( {\mathbb{R}^d\times \mathbb{R}^d \setminus \{(x,x) ; x\in \mathbb{R}^d\}} \) and satisfies the following integrability condition: for every compact subset \( {K\subset\mathbb{R}^d} \), the function

    \[ z\in\mathbb{R}^d\mapsto \sup \{W(x,y); \left|x-y\right|\geq|z|, x,y\in K\} \]

    is locally Lebesgue-integrable on \( {\mathbb{R}^d} \);

  • (H2) The function \( {V:\mathbb{R}^d\rightarrow\mathbb{R}} \) is continuous and such that \( {\lim_{|x| \rightarrow+ \infty } V(x)=+\infty} \) and

    \[ \int_{\mathbb{R}^d} \exp(-V(x))\,dx<\infty. \]

  • (H3) There exist constants \( {c\in\mathbb{R}} \) and \( {\varepsilon_o \in (0,1)} \) such that for every \( {x,y\in\mathbb{R}^d} \),

    \[ W(x,y)\geq c-\varepsilon_o(V(x)+V(y)). \]

    (This must be understood as “\( {V} \) dominates \( {W} \) at infinity”).
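The two expressions of the configuration energy \( {H_N} \) given above can be compared numerically. The Python sketch below is only an illustration: the dimension \( {d=1} \), the quadratic field \( {V(x)=x^2} \), the logarithmic repulsion \( {W(x,y)=-\log|x-y|} \) and the sample configuration are assumptions made for the example, and the function names are hypothetical.

```python
import math
from itertools import combinations

def energy_pairs(xs, V, W):
    """H_N = (1/N) sum_i V(x_i) + (1/N^2) sum_{i<j} W(x_i, x_j)."""
    n = len(xs)
    confinement = sum(V(x) for x in xs) / n
    interaction = sum(W(x, y) for x, y in combinations(xs, 2)) / n**2
    return confinement + interaction

def energy_empirical(xs, V, W):
    """Same energy written with the empirical measure mu_N:
    int V dmu_N + (1/2) * off-diagonal double integral of W."""
    n = len(xs)
    confinement = sum(V(x) for x in xs) / n
    off_diagonal = sum(
        W(xs[i], xs[j]) for i in range(n) for j in range(n) if i != j
    ) / n**2
    return confinement + 0.5 * off_diagonal

# Illustrative choices (assumptions, d = 1): quadratic field, logarithmic repulsion.
def V(x):
    return x * x

def W(x, y):
    return -math.log(abs(x - y)) if x != y else math.inf

points = [-1.0, 0.0, 0.5, 2.0]
print(energy_pairs(points, V, W), energy_empirical(points, V, W))
```

Both functions return the same value up to rounding, since \( {W} \) is symmetric and each unordered pair is counted twice in the off-diagonal double sum.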

Boltzmann-Gibbs distribution. Let \( {{(\beta_N)}_{N}} \) be a sequence of positive real numbers such that \( {\beta_N\rightarrow+\infty} \) as \( {N\rightarrow\infty} \). Under (H2)-(H3), there exists an integer \( {N_0} \) depending on \( {\varepsilon_o} \) such that for any \( {N\geq N_0} \), we have

\[ Z_N=\int_{\mathbb{R}^d}\cdots\int_{\mathbb{R}^d}\! \exp\left(-\beta_NH_N(x_1,\ldots,x_N)\right)\,dx_1\cdots{}dx_N<\infty, \]

so that we can define the Boltzmann-Gibbs probability measure \( {P_N} \) on \( {(\mathbb{R}^d)^N} \) by

\[ dP_N(x_1,\ldots,x_N) =\frac{\exp\left(-\beta_N H_N(x_1,\ldots,x_N)\right)}{Z_N}\,dx_1\cdots dx_N. \]

The law \( {P_N} \) is the equilibrium distribution of a system of \( {N} \) interacting Brownian particles in \( {\mathbb{R}^d} \), at inverse temperature \( {\beta_N} \), with equal individual “charge” \( {1/N} \), subject to a confining potential \( {V} \) acting on each particle, and to an interaction potential \( {W} \) acting on each pair of particles. For \( {\beta_N=N^2} \), the quantity \( {\beta_NH_N} \) can also be interpreted as the energy of a system of \( {N} \) particles living in \( {\mathbb{R}^d} \), with unit “charge”, subject to a confining potential \( {NV} \) acting on each particle, and to an interaction potential \( {W} \) acting on each pair of particles.
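To get a feeling for \( {P_N} \), one can sample from it approximately. The Python sketch below uses a random-walk Metropolis chain, which is a generic sampling technique swapped in for illustration and not the Brownian particle dynamics mentioned above. The dimension \( {d=1} \), the choices \( {V(x)=x^2} \), \( {W(x,y)=-\log|x-y|} \), \( {\beta_N=N^2} \), the step size, the number of steps and the initial configuration are all assumptions of this example.

```python
import math
import random

def energy(xs):
    """H_N with V(x) = x^2 and W(x, y) = -log|x - y| (assumed choices, d = 1)."""
    n = len(xs)
    total = sum(x * x for x in xs) / n
    for i in range(n):
        for j in range(i + 1, n):
            gap = abs(xs[i] - xs[j])
            if gap == 0.0:
                return math.inf  # coincident particles: infinite energy
            total -= math.log(gap) / n**2
    return total

def metropolis(n_particles, n_steps, step_size=0.1, seed=0):
    """Random-walk Metropolis chain targeting exp(-beta_N * H_N), beta_N = N^2.

    Requires n_particles >= 2. Returns the final configuration and its energy.
    """
    rng = random.Random(seed)
    beta = float(n_particles**2)
    # Start from equally spaced points on [-1, 1].
    xs = [-1.0 + 2.0 * i / (n_particles - 1) for i in range(n_particles)]
    h = energy(xs)
    for _ in range(n_steps):
        i = rng.randrange(n_particles)
        proposal = list(xs)
        proposal[i] += rng.gauss(0.0, step_size)  # move one particle
        h_new = energy(proposal)
        # Accept with probability min(1, exp(-beta * (h_new - h))).
        if rng.random() < math.exp(min(0.0, -beta * (h_new - h))):
            xs, h = proposal, h_new
    return xs, h

xs, h = metropolis(n_particles=8, n_steps=2000)
print(sorted(xs), h)
```

The chain recomputes the full energy at each step, which costs \( {O(N^2)} \) operations per move. This keeps the sketch short but is not efficient for large \( {N} \).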

Physical control problem. Our work is motivated by the following physical control problem: given the (internal) interaction potential \( {W} \), for instance a Coulomb potential, a target probability measure \( {\mu_\star} \) on \( {\mathbb{R}^d} \), for instance the uniform law on the unit ball, and a cooling scheme \( {\beta_N\rightarrow+\infty} \), for instance \( {\beta_N=N^2} \), can we tune the (external) confinement potential \( {V} \) (associated to an external confinement field) such that \( {\mu_N\rightarrow\mu_\star} \) as \( {N\rightarrow\infty} \)? In this direction, we provide some partial answers in our main results stated in the sequel.

Limiting energy. Let \( {\mathcal{M}_1(\mathbb{R}^d)} \) be the set of probability measures on \( {\mathbb{R}^d} \). The mean-field symmetries of the model suggest to study, under the exchangeable measure \( {P_N} \), the behavior as \( {N\rightarrow\infty} \) of the empirical measure \( {\mu_N} \), which is a random variable on \( {\mathcal{M}_1(\mathbb{R}^d)} \). With this asymptotic analysis in mind, we introduce the functional \( {I:\mathcal{M}_1(\mathbb{R}^d)\rightarrow (-\infty,+\infty]} \) given by

\[ I(\mu) = \frac{1}{2}\iint\!\left(V(x)+V(y)+W(x,y)\right)\,d\mu(x)d\mu(y). \]

The assumptions (H2)-(H3) imply that the function under the integral is bounded from below, so that the integral defining \( {I} \) makes sense in \( {\mathbb{R}\cup\{+\infty\}=(-\infty,+\infty]} \). If \( {I(\mu)} \) is finite, then \( {\int\!Vd\mu} \) and \( {\iint Wd\mu^2} \) both exist, so that

\[ I(\mu) = \int\!V d\mu + \frac{1}{2} \iint W d\mu^2. \]

The energy \( {H_N} \) is “almost” given by \( {I(\mu_N)} \), where the infinite terms on the diagonal are forgotten.
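To make this remark concrete, here is a minimal Python sketch. It assumes the normalization \( {H_N(x)=\frac{1}{N}\sum_iV(x_i)+\frac{1}{N^2}\sum_{i<j}W(x_i,x_j)} \), which is the one consistent with the Ginibre example discussed further down, and it takes \( {d=3} \), \( {V(x)=|x|^2} \) and the Coulomb kernel purely as illustrative choices. The function names are hypothetical. It checks that \( {I(\mu_N)} \), once the infinite terms \( {W(x_i,x_i)} \) are dropped, coincides with \( {H_N} \).

```python
import math
import random

def V(x):
    """Quadratic confinement |x|^2 (illustrative choice)."""
    return sum(c * c for c in x)

def W(x, y):
    """Coulomb kernel 1/|x-y|^(d-2) for d = 3 (illustrative choice)."""
    return 1.0 / math.dist(x, y)

def energy_H(points):
    """Assumed normalization: (1/N) sum_i V(x_i) + (1/N^2) sum_{i<j} W(x_i, x_j)."""
    n = len(points)
    conf = sum(V(x) for x in points) / n
    pair = sum(W(points[i], points[j]) for i in range(n) for j in range(i + 1, n))
    return conf + pair / n**2

def I_off_diagonal(points):
    """I(mu_N) for the empirical measure, with the infinite terms W(x_i, x_i) dropped."""
    n = len(points)
    conf = sum(V(x) for x in points) / n  # equals (1/2) * double integral of V(x) + V(y)
    pair = sum(W(points[i], points[j]) for i in range(n) for j in range(n) if i != j)
    return conf + 0.5 * pair / n**2

rng = random.Random(0)
pts = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(20)]
print(energy_H(pts), I_off_diagonal(pts))
```

The factor \( {1/2} \) in \( {I} \) compensates for counting each unordered pair twice in the double integral, which is why the two numbers agree up to rounding.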

Large deviations principle. Theorem 1 below is our first main result. It is of topological nature, inspired by the available results for logarithmic Coulomb gases in random matrix theory (BenArous and Guionnet, BenArous and Zeitouni, Hiai and Petz, Hardy). We equip \( {\mathcal{M}_1(\mathbb{R}^d)} \) with the weak topology, defined by duality with bounded continuous functions. For any set \( {A\subset\mathcal{M}_1(\mathbb{R}^d)} \) we denote by \( {\mathrm{int}{A}} \), \( {\mathrm{clo}{A}} \) the interior and closure of \( {A} \) with respect to this topology. This topology can be metrized by the Fortet-Mourier distance defined by (see also Rachev and Rüschendorf):

\[ d_{\mathrm{FM}}(\mu,\nu)= \sup_{\max(|f|_\infty,|f|_{\mathrm{Lip}})\leq 1}\left\{\int\!f\,d\mu-\int\!f\,d\nu\right\}, \]

where

\[ |f|_\infty=\sup|f| \quad\mbox{and}\quad |f|_{\mathrm{Lip}}=\sup_{x\neq y}\frac{|f(x)-f(y)|}{|x-y|}. \]

To formulate the large deviations result we need to introduce the following additional technical assumption:

  • (H4) For all \( {\nu \in \mathcal{M}_1(\mathbb{R}^d)} \) such that \( {I(\nu)<+\infty} \), there is a sequence \( {(\nu_n)_{n\in \mathbb{N}}} \) of probability measures, absolutely continuous with respect to Lebesgue, such that \( {\nu_n} \) converges weakly to \( {\nu} \) and \( {I(\nu_n) \rightarrow I(\nu),} \) when \( {n\rightarrow\infty.} \)

It turns out that assumption (H4) is satisfied for a very large class of potentials \( {V,W} \), including the special case in which the function \( {I} \) is convex, which is typically the case for the Coulomb and Riesz interactions.

Throughout the paper, if \( {(a_N)_{N}} \) and \( {(b_N)_{N}} \) are nonnegative sequences, the notation \( {a_N \gg b_N} \) means that \( {a_N=b_Nc_N} \), for some \( {c_N} \) that goes to \( {+\infty} \) when \( {N\rightarrow\infty.} \)

Theorem 1 (Large Deviations Principle) Suppose that

\[ \beta_N\gg N\log(N). \]

If (H1)-(H2)-(H3) are satisfied then

  1. \( {I} \) has compact level sets (and is thus lower semi-continuous) and

    \[ \inf_{\mathcal{M}_1(\mathbb{R}^d)}I>-\infty; \]

  2. Under \( {(P_N)_N} \), the sequence \( {{(\mu_N)}_{N}} \) of random elements of \( {\mathcal{M}_1(\mathbb{R}^d)} \) equipped with the weak topology has the following asymptotic properties. For every Borel subset \( {A} \) of \( {\mathcal{M}_1(\mathbb{R}^d)} \),

    \[ \limsup_{N\rightarrow\infty}\frac{\log Z_NP_N(\mu_N\in A)}{\beta_N} \leq-\inf_{\mu\in\mathrm{clo}{A}}I(\mu) \]

    and

    \[ \liminf_{N\rightarrow\infty}\frac{\log Z_NP_N(\mu_N\in A)}{\beta_N} \geq -\inf \{I(\mu); \mu\in\mathrm{int}{A}, \mu \ll \mathrm{Lebesgue}\}. \]

  3. Under the additional assumption \( {\textbf{(H4)}} \), the full Large Deviation Principle (LDP) at speed \( {\beta_N} \) holds with the rate function

    \[ I_\star=I-\inf_{\mathcal{M}_1(\mathbb{R}^d)}I. \]

    More precisely, for every Borel set \( {A \subset \mathcal{M}_1(\mathbb{R}^d)} \),

    \[ -\inf_{\mu \in \mathrm{int}{A}} I_\star(\mu) \leq \liminf_{N\rightarrow\infty} \frac{\log P_N(\mu_N \in A)}{\beta_N} \leq \limsup_{N\rightarrow\infty} \frac{\log P_N(\mu_N \in A)}{\beta_N} \leq -\inf_{\mu \in \mathrm{clo}{A}} I_\star(\mu). \]

    In particular, by taking \( {A=\mathcal{M}_1(\mathbb{R}^d)} \), we get

    \[ \lim_{N\rightarrow\infty}\frac{\log Z_N}{\beta_N} =-\inf_{\mathcal{M}_1(\mathbb{R}^d)}I. \]

  4. Let \( {I_{\text{min}}=\{\mu\in\mathcal{M}_1(\mathbb{R}^d):I_\star(\mu)=0\}} \), which is non-empty. If \( {\textbf{(H4)}} \) is satisfied, if \( {{(\mu_N)}_{N}} \) are constructed on the same probability space, and if \( {d_{\mathrm{FM}}} \) stands for the Fortet-Mourier distance, then we have, almost surely,

    \[ \lim_{N\rightarrow\infty}d_{\mathrm{FM}}(\mu_N,I_{\text{min}})=0. \]

A careful reading of the proof of Theorem 1 indicates that if \( {I_\text{min}=\{\mu_\star\}} \) is a singleton, and if (H4) holds for \( {\nu=\mu_\star} \), then \( {\mu_N\rightarrow\mu_\star} \) almost surely as \( {N\rightarrow\infty} \).

Link with Sanov theorem. If we set \( {W=0} \) then the particles become i.i.d. and \( {P_N} \) becomes a product measure \( {\eta_N^{\otimes N}} \) where \( {\eta_N\propto e^{-(\beta_N/N)V}} \), where the symbol “\( {\propto} \)” means “proportional to”. When \( {\beta_N=N} \) then \( {\eta_N\propto e^{-V}} \) does not depend on \( {N} \), and we may denote it \( {\eta} \). To provide perspective, recall that the classical Sanov theorem for i.i.d. sequences means in our setting that if \( {W=0} \) and \( {\beta_N=N} \) then \( {{(\mu_N)}_N} \) satisfies a large deviations principle on \( {\mathcal{M}_1(\mathbb{R}^d)} \) at speed \( {N} \) and with good rate function (Kullback-Leibler relative entropy or free energy)

\[ \mu\mapsto K(\mu|\eta)= \int\!f\log(f)\,d\eta \]

if \( {\mu\ll\eta} \), with \( {f=\frac{d\mu}{d\eta}} \), and \( {+\infty} \) otherwise. This large deviations principle corresponds to the convergence \( {\lim_{N\rightarrow\infty}d_{\mathrm{FM}}(\mu_N,\eta)=0} \). Note that, if \( {\mu} \) is absolutely continuous with respect to Lebesgue measure with density function \( {g} \), then \( {K(\mu|\eta)} \) can be decomposed in two terms

\[ K(\mu|\eta) = \int\!V\,d\mu-H(\mu)+\log Z_V, \]

where

\[ Z_V=\int_{\mathbb{R}^d}\!e^{-V(x)}\,dx \]

and where \( {H(\mu)} \) is the Boltzmann-Shannon “continuous” entropy

\[ H(\mu) = -\int\!g(x)\log(g(x))\,dx; \]

therefore at the speed \( {\beta_N = N} \), the energy factor \( {\int\!V\,d\mu} \) and the Boltzmann-Shannon entropy factor \( {H(\mu)} \) both appear in the rate function. In contrast, note that Theorem 1 requires a higher inverse temperature \( {\beta_N\gg N\log(N)} \). If we set \( {W=0} \) in Theorem 1, then \( {P_N} \) becomes a product measure, the particles are i.i.d. but their common law depends on \( {N} \), the function \( {\mu\mapsto I_\star(\mu)=\int\!V\,d\mu-\inf V} \) is affine, its minimizers \( {I_{\text{min}}} \) over \( {\mathcal{M}_1(\mathbb{R}^d)} \) coincide with

\[ \mathcal{M}_V=\{\mu\in\mathcal{M}_1(\mathbb{R}^d):\mathrm{supp}(\mu)\subset\arg\inf V\}, \]

and Theorem 1 boils down to a sort of Laplace principle, which corresponds to the convergence \( {\lim_{N\rightarrow\infty}d_{\mathrm{FM}}(\mu_N,\mathcal{M}_V)=0} \). It is worthwhile to notice that the main difficulty in Theorem 1 lies in the fact that \( {W} \) can be infinite on the diagonal (short scale repulsion). If \( {W} \) is continuous and bounded on \( {\mathbb{R}^d\times\mathbb{R}^d} \), then one may deduce the large deviations principle for \( {{(\mu_N)}_{N}} \) from the case \( {W=0} \) by using the Laplace-Varadhan lemma. To complete the picture, let us mention that if \( {\beta_N=N} \) and if \( {W} \) is bounded and continuous, then the Laplace-Varadhan lemma and the Sanov theorem would yield the conclusion that \( {(\mu_N)_N} \) satisfies a large deviations principle on \( {\mathcal{M}_1(\mathbb{R}^d)} \) at speed \( {N} \) with rate function \( {R-\inf_{\mathcal{M}_1(\mathbb{R}^d)}R} \) where the functional \( {R} \) is defined by

\[ \begin{array}{rcl} R(\mu) &=& K(\mu|\eta) + \frac{1}{2}\iint\!W(x,y)\,d\mu(x)d\mu(y) \\ &=& -H(\mu) + I(\mu)+\log Z_V; \end{array} \]

once more, the Boltzmann-Shannon entropy factor \( {H(\mu)} \) reappears at this rate. For an alternative point of view, we refer to Messer and Spohn, Caglioti and Lions and Marchioro and Pulvirenti, Bodineau and Guionnet.
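The decomposition \( {K(\mu|\eta)=\int\!V\,d\mu-H(\mu)+\log Z_V} \) used above can be checked on a case where everything is explicit. The following Python sketch is only an illustration under stated assumptions: it takes \( {d=1} \), \( {V(x)=x^2} \) (so that \( {\eta=\mathcal{N}(0,1/2)} \) and \( {Z_V=\sqrt{\pi}} \)) and a Gaussian \( {\mu=\mathcal{N}(m,s^2)} \), and compares the classical closed form of the relative entropy between two Gaussians with the three-term decomposition. The function names are hypothetical.

```python
import math

def kl_gaussians(m, s, sigma):
    """Closed-form relative entropy K(N(m, s^2) | N(0, sigma^2))."""
    return math.log(sigma / s) + (s**2 + m**2) / (2 * sigma**2) - 0.5

def decomposition(m, s):
    """int V dmu - H(mu) + log Z_V for V(x) = x^2 and mu = N(m, s^2) on the real line."""
    energy = s**2 + m**2                                   # int x^2 dmu
    entropy = 0.5 * math.log(2 * math.pi * math.e * s**2)  # Boltzmann-Shannon entropy of mu
    log_Z = 0.5 * math.log(math.pi)                        # Z_V = int exp(-x^2) dx = sqrt(pi)
    return energy - entropy + log_Z

m, s = 0.7, 1.3
lhs = kl_gaussians(m, s, sigma=math.sqrt(0.5))  # eta = N(0, 1/2) has density exp(-x^2)/sqrt(pi)
rhs = decomposition(m, s)
print(lhs, rhs)
```

Both expressions reduce to \( {s^2+m^2-\frac{1}{2}\log(2s^2)-\frac{1}{2}} \), so the two printed values agree up to rounding.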

Equilibrium measure for Coulomb and Riesz interactions. Our second main result, expressed in Theorem 2 and Corollary 3 below, is of differential nature. It is based on an instance of the general Gauss problem in potential theory (Frostman, Landkof, Zorii). It concerns special choices of \( {V} \) and \( {W} \) for which \( {I_\star} \) achieves its minimum \( {0} \) for a unique and explicit \( {\mu_\star\in\mathcal{M}_1(\mathbb{R}^d)} \). Recall that the Coulomb interactions correspond to the choice \( {W(x,y)=k_\Delta(x-y)} \) where \( {k_\Delta} \) is the Coulomb kernel (opposite in sign to the Newton kernel) defined on \( {\mathbb{R}^d} \), \( {d\geq1} \), by

\[ k_\Delta(x)= \left\{ \begin{array}{cc} -|x| & d=1,\\ \log\frac{1}{|x|} & d=2,\\ \frac{1}{|x|^{d-2}}& d\geq3. \end{array} \right. \]

This is, up to a multiplicative constant, the fundamental solution of the Laplace equation. In other words, denoting \( {\Delta=\partial_{x_1}^2+\cdots+\partial_{x_d}^2} \) the Laplacian, we have, in a weak sense, in the space of Schwartz-Sobolev distributions \( {\mathcal{D}'(\mathbb{R}^d)} \),

\[ -c\Delta k_\Delta=\delta_0 \quad\text{with}\quad c= \left\{ \begin{array}{cc} \frac{1}{2} & d=1,\\ \frac{1}{2\pi} & d=2,\\ \frac{1}{d(d-2)\omega_d} & d\geq3, \end{array} \right. \]

where \( {\omega_d=\frac{\pi^{d/2}}{\Gamma(1+d/2)}} \) is the volume of the unit ball of \( {\mathbb{R}^d} \). Our notation is motivated by the fact that \( {-\Delta} \) is a nonnegative operator. The case of Coulomb interactions in dimension \( {d=2} \) is known as “logarithmic potential with external field” and is widely studied in the literature: see Hiai and Petz, Saff and Totik, Anderson and Guionnet and Zeitouni, Hardy. To focus on novelty, we will not study the Coulomb kernel for \( {d\leq2} \). We refer to Lenard, Edwards and Lenard, Lenard, Brascamp and Lieb, Aizenman and Martin, Sandier and Serfaty and references therein for the Coulomb case in dimension \( {d=1} \), to BenArous and Guionnet, Anderson and Guionnet and Zeitouni, Hardy for the Coulomb case in dimension \( {d=2} \) with support restriction on a line, and to BenArous and Zeitouni, Hiai and Petz, Hiai and Petz, Hardy, Saff and Totik, Sandier and Serfaty, Yattselev for the Coulomb case in dimension \( {d=2} \). We also refer to Berman for the asymptotic analysis in terms of large deviations of Coulomb determinantal point processes on compact manifolds of arbitrary dimension.

The asymptotic analysis of \( {\mu_N} \) as \( {N\rightarrow\infty} \) for Coulomb interactions in dimension \( {d\geq3} \) motivates our next result, which is stated for the more general Riesz interactions in dimension \( {d\geq1} \). The Riesz interactions correspond to the choice \( {W(x,y)=k_{\Delta_\alpha}(x-y)} \) where \( {k_{\Delta_\alpha}} \), \( {0<\alpha<d} \), \( {d\geq1} \), is the Riesz kernel defined on \( {\mathbb{R}^d} \), by

\[ k_{\Delta_\alpha}(x)=\frac{1}{|x|^{d-\alpha}}. \]

Up to a multiplicative constant, this is the fundamental solution of a fractional Laplace equation (which is the true Laplace equation when \( {\alpha=2} \)), namely

\[ -c_\alpha\Delta_\alpha k_{\Delta_\alpha}=\mathcal{F}^{-1}(1)=\delta_0 \quad\text{with}\quad c_\alpha=\frac{\pi^{\alpha-\frac{d}{2}}}{4\pi^2} \frac{\Gamma(\frac{d-\alpha}{2})}{\Gamma(\frac{\alpha}{2})}, \]

where the Fourier transform \( {\mathcal{F}} \) and the fractional Laplacian \( {\Delta_\alpha} \) are given by

\[ \mathcal{F}(k_{\Delta_\alpha})(\xi)=\int_{\mathbb{R}^d}\!e^{2i\pi\xi\cdot x}\,k_{\Delta_\alpha}(x)\,dx =\frac{1}{c_\alpha4\pi^2|\xi|^\alpha} \quad\text{and}\quad \Delta_\alpha f = -4\pi^2\mathcal{F}^{-1}(|\xi|^\alpha\mathcal{F}(f)). \]

Note that \( {\Delta_2=\Delta} \) while \( {\Delta_\alpha} \) is a non-local integro-differential operator when \( {\alpha\neq2} \). When \( {d\geq3} \) and \( {\alpha=2} \) then Riesz interactions coincide with Coulomb interactions and the constants match. Beware that our notations differ slightly from the ones of Landkof. Several aspects of the Gauss problem in the Riesz case are studied in Dragnev and Saff, Zorii.
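The claim that the constants match when \( {\alpha=2} \) and \( {d\geq3} \) is easy to test numerically. The short Python sketch below evaluates the Coulomb constant \( {c=\frac{1}{d(d-2)\omega_d}} \) and the Riesz constant \( {c_\alpha} \) at \( {\alpha=2} \) directly from the formulas above; the function names are hypothetical and chosen only for this illustration.

```python
import math

def unit_ball_volume(d):
    """omega_d = pi^(d/2) / Gamma(1 + d/2)."""
    return math.pi ** (d / 2) / math.gamma(1 + d / 2)

def coulomb_constant(d):
    """c such that -c * Laplacian(k) = delta_0, for d >= 3."""
    return 1.0 / (d * (d - 2) * unit_ball_volume(d))

def riesz_constant(d, alpha):
    """c_alpha = pi^(alpha - d/2) / (4 pi^2) * Gamma((d - alpha)/2) / Gamma(alpha/2)."""
    return (math.pi ** (alpha - d / 2) / (4 * math.pi**2)
            * math.gamma((d - alpha) / 2) / math.gamma(alpha / 2))

for d in range(3, 8):
    print(d, coulomb_constant(d), riesz_constant(d, 2.0))
```

For \( {d=3} \) both columns give \( {\frac{1}{4\pi}} \), the familiar constant of the Newtonian potential.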

In the Riesz case, \( {0<\alpha<d} \), one associates to any probability measure \( {\mu} \) on \( {\mathbb{R}^d} \) a function \( {U_\alpha^\mu:\mathbb{R}^d\rightarrow[0,+\infty]} \) called the potential of \( {\mu} \) as follows

\[ U_\alpha^\mu(x)= (k_{\Delta_\alpha}*\mu)(x) =\int\!k_{\Delta_\alpha}(x-y)\,d\mu(y),\qquad \forall x\in \mathbb{R}^d. \]

In classical potential theory, a property is said to hold quasi everywhere if it holds outside a set of zero capacity. The following theorem is essentially the analogue in \( {\mathbb{R}^d} \) of a result of Dragnev and Saff on spheres. The analogous problem on compact subsets, without external field, was initially studied by Frostman (in his PhD thesis, advised by Riesz, 1934!), see also the book of Landkof. A confinement (by an external field or by a support constraint) is always needed for this type of result.

Theorem 2 (Riesz gases) Suppose that \( {W} \) is the Riesz kernel \( {W(x,y)= k_{\Delta_\alpha}(x-y)} \). Then:

  1. The functional \( {I} \) is strictly convex where it is finite;
  2. (H1)-(H2)-(H3)-(H4) are satisfied and Theorem 1 applies;
  3. There exists a unique \( {\mu_\star\in\mathcal{M}_1(\mathbb{R}^d)} \) such that

    \[ I(\mu_\star)=\inf_{\mu\in\mathcal{M}_1(\mathbb{R}^d)}I(\mu); \]

  4. If we define \( {(\mu_N)_N} \) on a common probability space (for a sequence \( {\beta_N\gg N\log(N)} \)) then with probability one,

    \[ \lim_{N\rightarrow\infty}\mu_N=\mu_\star. \]

If we denote by \( {C_\star} \) the real number

\[ C_\star = \int\!\left(U_\alpha^{\mu_\star} + V\right)\,d\mu_\star = J(\mu_\star) + \int\!Vd\mu_\star, \]

then the following additional properties hold:

  1. The minimizer \( {\mu_\star} \) has compact support, and satisfies

    \[ U_\alpha^{\mu_\star} + V \geq C_\star \]

    quasi everywhere, with equality on the support of \( {\mu_\star} \);

  2. If a compactly supported measure \( {\mu} \) creates a potential \( {U_\alpha^\mu} \) such that, for some constant \( {C\in\mathbb{R}} \),

    \[ U_\alpha^\mu + V \geq C \]

    quasi everywhere, with equality on the support of \( {\mu} \), then \( {C = C_\star} \) and \( {\mu=\mu_\star} \). The same is true under the weaker assumptions:

    \[ U_\alpha^\mu + V \leq C \]

    on the support of \( {\mu} \), and

    \[ U_\alpha^\mu + V \geq C \]

    quasi everywhere on the support of \( {\mu_\star} \).

  3. If \( {\alpha \leq 2} \), for any measure \( {\mu} \), the following “converse” holds:

    \[ \sup_{\mathrm{supp}(\mu)} \left(U_\alpha^\mu + V\right) \geq C_\star, \]

    and

    \[ “\inf_{\mathrm{supp}(\mu_\star)}”\, \left(U_\alpha^\mu(x) + V(x)\right) \leq C_\star, \]

    where the \( {“\inf”} \) means that the infimum is taken quasi-everywhere.

The constant \( {C_\star} \) is called the “modified Robin constant”, see e.g. Saff and Totik for the analogous result for the logarithmic potential in dimension \( {2} \). The minimizer \( {\mu_\star} \) is called the equilibrium measure.

Corollary 3 (Equilibrium of Coulomb gases with radial external fields in dimension \( {\geq3} \)) Suppose that for a fixed real parameter \( {\beta>0} \), and for every \( {x,y\in\mathbb{R}^d} \), \( {d\geq3} \),

\[ V(x)=v(|x|)\quad\text{and}\quad W(x,y)=\beta k_{\Delta}(x-y), \]

where \( {v} \) is twice differentiable. Denote by \( {d\sigma_r} \) the Lebesgue measure on the sphere of radius \( {r} \), and let \( {\sigma_d} \) be the total mass of \( {d\sigma_1} \) (i.e. the surface area of the unit sphere of \( {\mathbb{R}^d} \)). Let \( {w(r) = r^{d-1}v'(r)} \), and suppose either that \( {v} \) is convex, or that \( {w} \) is increasing. Define two radii \( {r_0<R_0} \) by:

\[ r_0 = \inf\left\{r>0 ; v'(r)>0\right\} \quad\text{and}\quad w(R_0) = \beta(d-2). \]

Then the equilibrium measure \( {\mu_\star} \) is supported on the ring \( {\left\{x; |x|\in [r_0,R_0]\right\}} \), and is absolutely continuous with respect to Lebesgue measure:

\[ d\mu_\star(r) = M(r)\,d\sigma_r dr \quad\text{where}\quad M(r) = \frac{w'(r)}{\beta(d-2)\sigma_d r^{d-1}} \mathbf{1}_{[r_0,R_0]}(r). \]

In particular, when \( {v(t)=t^2} \) then \( {\mu_\star} \) is the uniform distribution on the centered ball of radius

\[ \left(\beta\frac{d-2}{2}\right)^{\frac{1}{d}}. \]
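The quadratic case can be verified by hand from the formulas of Corollary 3: with \( {v(t)=t^2} \) one has \( {w(r)=2r^d} \), \( {r_0=0} \), and \( {M} \) is constant on \( {[0,R_0]} \). The following Python sketch is a numerical illustration under these assumptions (function names are hypothetical): it computes \( {R_0} \), integrates \( {M(r)\sigma_dr^{d-1}} \) over \( {[0,R_0]} \) with a midpoint rule to confirm that the total mass is one, and checks that the constant density equals the inverse of the volume of the ball of radius \( {R_0} \).

```python
import math

def sphere_area(d):
    """Surface area sigma_d of the unit sphere of R^d."""
    return 2 * math.pi ** (d / 2) / math.gamma(d / 2)

def outer_radius(d, beta):
    """R_0 solving w(R_0) = beta (d - 2) with w(r) = r^(d-1) v'(r) = 2 r^d."""
    return (beta * (d - 2) / 2) ** (1 / d)

def radial_density(r, d, beta):
    """M(r) = w'(r) / (beta (d-2) sigma_d r^(d-1)) with w'(r) = 2 d r^(d-1)."""
    return 2 * d * r ** (d - 1) / (beta * (d - 2) * sphere_area(d) * r ** (d - 1))

def total_mass(d, beta, steps=20000):
    """Midpoint rule for the integral of M(r) sigma_d r^(d-1) over [0, R_0]."""
    h = outer_radius(d, beta) / steps
    mass = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        mass += radial_density(r, d, beta) * sphere_area(d) * r ** (d - 1) * h
    return mass

d, beta = 3, 1.0
R0 = outer_radius(d, beta)
ball_volume = sphere_area(d) * R0**d / d
print(R0, total_mass(d, beta), radial_density(0.5 * R0, d, beta) * ball_volume)
```

The last two printed numbers should both be close to one, which is the statement that \( {\mu_\star} \) is the uniform probability measure on the ball of radius \( {R_0} \).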

The result provided by Corollary 3 on Coulomb gases with radial external fields can be found for instance in Lopez Garcia. It follows quickly from the Gauss averaging principle and the variational characterization. By using Theorem 2 with \( {\alpha=2} \) together with Corollary 3, we obtain that the empirical measure of a Coulomb gas with quadratic external field in dimension \( {d\geq3} \) tends almost surely to the uniform distribution on a ball when \( {N\rightarrow\infty} \). This phenomenon is the analogue in arbitrary dimension \( {d\geq3} \) of the well known result in dimension \( {d=2} \) for the logarithmic potential with quadratic radial external field (where the uniform law on the disc or “circular law” appears as a limit for the Complex Ginibre Ensemble, see for instance BenArous and Zeitouni, Hiai and Petz). The study of the equilibrium measure for Coulomb interaction with non radially symmetric external fields was initiated recently in dimension \( {d=2} \) in a beautiful work of Bleher and Kuijlaars, using orthogonal polynomials.

The following corollary shows that in the Riesz case, it is possible to construct a good confinement potential \( {V} \) so that the equilibrium measure is prescribed in advance.

Corollary 4 (Riesz gases: external field for prescribed equilibrium measure) Let \( {0<\alpha<d} \), \( {d\geq1} \), and \( {W(x,y)=k_{\Delta_\alpha}(x-y)} \). Let \( {\mu_\star} \) be a probability measure with a compactly supported density \( {f_\star \in\mathbb{L}^p(\mathbb{R}^d)} \) for some \( {p>d/\alpha.} \) Define the confinement potential

\[ V(x)= -U_\alpha^{\mu_\star}(x) + [|x|^2-R]_+,\qquad x\in \mathbb{R}^d, \]

where \( {U_\alpha^{\mu_\star}} \) is the Riesz potential created by \( {\mu_\star} \) and \( {R>0} \) is such that \( {\mathrm{supp}(\mu_\star)\subset B(0,R).} \) Then the pair of functions \( {(V,W)} \) satisfies (H1)-(H2)-(H3)-(H4) and the functional

\[ \mu\in\mathcal{M}_1(\mathbb{R}^d)\mapsto I(\mu)=\int\!V\,d\mu + \frac{1}{2}\iint\!k_{\Delta_\alpha}(x-y)\,d\mu(x)d\mu(y) \in\mathbb{R}\cup\{+\infty\} \]

admits \( {\mu_\star} \) as unique minimizer. In particular, the probability measure \( {\mu_\star} \) is the almost sure limit of the sequence \( {{(\mu_N)}_{N}} \) (constructed on the same probability space), as soon as \( {\beta_N\gg N\log(N)} \).

Non-compactly supported equilibrium measures. The assumptions made on the external field \( {V} \) in Theorem 1 and Theorem 2 explain why the equilibrium measure \( {\mu_\star} \) is compactly supported. If one allows a weaker behavior of \( {V} \) at infinity, then one may produce equilibrium measures \( {\mu_\star} \) which are not compactly supported (and may even be heavy tailed). This requires adapting some of the arguments, and one may use compactification as in Hardy. This might allow one to extend Corollary 4 beyond the compactly supported case.

Equilibrium measure for Riesz interaction with radial external field. To the knowledge of the authors, the computation of the equilibrium measure for Riesz interactions with radial external field, beyond the more specific Coulomb case of Corollary 3, is an open problem, due to the lack of the Gauss averaging principle when \( {\alpha\neq2} \).

Beyond the Riesz and Coulomb interactions. Theorem 2 concerns the minimization of the Riesz interaction potential with an external field \( {V} \), and includes the Coulomb interaction if \( {d\geq3} \). In classical physics, the problem of minimization of the Coulomb interaction energy with an external field is known as the Gauss variational problem (Frostman, Landkof, Zorii). Beyond the Riesz and Coulomb potentials, the driving structural idea behind Theorem 2 is that if \( {W} \) is of the form \( {W(x,y)=k_D(x-y)} \) where \( {k_D} \) is the fundamental solution of an equation \( {-Dk_D=\delta_0} \) for a local differential operator \( {D} \) such as \( {\Delta_\alpha} \) with \( {\alpha=2} \), and if \( {V} \) is sub-harmonic for \( {D} \), i.e. \( {DV\geq0} \), then the density of \( {\mu_\star} \) is roughly given by \( {DV} \) up to support constraints. This can be easily understood formally with Lagrange multipliers. The limiting measure \( {\mu_\star} \) depends on \( {V} \) and \( {W} \), and is thus non-universal in general.

Second order asymptotic analysis. The asymptotic analysis of \( {\mu_N-\mu_\star} \) as \( {N\rightarrow\infty} \) is a natural problem, which can be studied on various classes of test functions. It is well known that a repulsive interaction may dramatically affect the speed of convergence, and make it dependent on the regularity of the test function. In another direction, one may take \( {\beta_N=\beta N^2} \) and study the low temperature regime \( {\beta\rightarrow\infty} \) at fixed \( {N} \). In the Coulomb case, this leads to Fekete points. We refer to Serfaty, Borodin and Serfaty, Sandier and Serfaty for the analysis of the second order when both \( {\beta\rightarrow\infty} \) and \( {N\rightarrow\infty} \). In the one-dimensional case, another type of local universality inside the limiting support is available in Götze and Venker.

Edge behavior. Suppose that \( {V} \) is radially symmetric and that \( {\mu_\star} \) is supported in the centered ball of radius \( {r} \), as in Corollary 3. Then one may ask if the radius of the particle system \( {\max_{1\leq k\leq N}|x_k|} \) converges to the edge \( {r} \) of the limiting support as \( {N\rightarrow\infty} \). This is not provided by the weak convergence of \( {\mu_N} \). The next question is the fluctuation. In the two-dimensional Coulomb case, a universality result is available for a class of external fields in arXiv:1310.0727.

Topology. It is known that the weak topology can be upgraded to a Wasserstein topology in the classical Sanov theorem for empirical measures of i.i.d. sequences, see Wang and Wang and Wu, provided that the tails satisfy a strong exponential integrability condition. It is then quite natural to ask about such an upgrade for Theorem 1.

Connection to random matrices. Our initial inspiration came, when writing the survey on the circular law, from the role played by the logarithmic potential in the analysis of the Ginibre ensemble. When \( {d=2} \), \( {\beta_N=N^2} \), \( {V(x)=|x|^2} \) and \( {W(x,y)=\beta k_\Delta(x-y)=\beta \log\frac{1}{|x-y|}} \) with \( {\beta=2} \) then \( {P_N} \) is the law of the (complex) eigenvalues of the complex Ginibre ensemble:

\[ dP_N(x)=Z_N^{-1}e^{-N\sum_{i=1}^N|x_i|^2}\prod_{i<j}|x_i-x_j|^2dx. \]

(here \( {\mathbb{R}^2\equiv\mathbb{C}} \) and \( {P_N} \) is the law of the eigenvalues of a random \( {N\times N} \) matrix with i.i.d. complex Gaussian entries of covariance \( {\frac{1}{2N}I_2} \)). For a non-quadratic \( {V} \), we may see \( {P_N} \) as the law of the spectrum of random normal matrices such as the ones studied in Ameur, Hedenmalm, and Makarov. On the other hand, in the case where \( {d=1} \) and \( {V(x)=|x|^2} \) and \( {W(x,y)=\beta\log\frac{1}{|x-y|}} \) with \( {\beta>0} \) then

\[ dP_N(x)=Z_N^{-1}e^{-N\sum_{i=1}^N|x_i|^2}\prod_{i<j}|x_i-x_j|^\beta\,dx. \]

This is known as the \( {\beta} \)-Ensemble in Random Matrix Theory. For \( {\beta=1} \), we recover the law of the eigenvalues of the Gaussian Orthogonal Ensemble (GOE) of random symmetric matrices, while for \( {\beta=2} \), we recover the law of the eigenvalues of the Gaussian Unitary Ensemble (GUE) of random Hermitian matrices. It is worthwhile to notice that \( {-\log|\cdot|} \) is the Coulomb potential in dimension \( {d=2} \), and not in dimension \( {d=1} \). For this reason, we may interpret the eigenvalues of GOE/GUE as being a system of charged particles in dimension \( {d=2} \), experiencing Coulomb repulsion and an external quadratic field, but constrained to stay on the real axis. We believe this type of support constraint can be incorporated in our initial model, at the price of slightly heavier notation and analysis.

Simulation problem and numerical approximation of the equilibrium measure. It is natural to ask about the best way to simulate the probability measure \( {P_N} \). A pure rejection algorithm is too naive. Some exact algorithms are available in the determinantal case such as for \( {d=2} \) and \( {W(x,y)=-2\log|x-y|} \) (algorithm 18 in Hough, Krishnapur, Peres, and Virág), and Scardicchio, Zachary, and Torquato, and also the more recent Decreusefond, Flint, and Low. One may prefer to use a non-exact algorithm such as a Hastings-Metropolis algorithm. One may also use an Euler scheme to simulate a stochastic process for which \( {P_N} \) is invariant, or use a Metropolis adjusted Langevin approach (MALA) Roberts and Rosenthal. In this context, a very natural way to approximate numerically the equilibrium measure \( {\mu_\star} \) is to use a simulated annealing stochastic algorithm.
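As an illustration of the non-exact route, here is a minimal Python sketch of a random-walk Hastings-Metropolis sampler targeting a density proportional to \( {\exp(-\beta_NH_N)} \). It is a sketch under assumptions, not a tuned implementation: it assumes \( {H_N(x)=\frac{1}{N}\sum_i|x_i|^2+\frac{1}{N^2}\sum_{i<j}|x_i-x_j|^{-1}} \) (Coulomb interaction in \( {d=3} \) with quadratic confinement), moves one particle at a time with a Gaussian proposal, and recomputes the full energy at each step for simplicity. The function names, the step size and the number of sweeps are all hypothetical choices.

```python
import math
import random

def energy(points):
    """Assumed H_N = (1/N) sum |x_i|^2 + (1/N^2) sum_{i<j} 1/|x_i - x_j| in d = 3."""
    n = len(points)
    conf = sum(sum(c * c for c in x) for x in points) / n
    pair = sum(1.0 / math.dist(points[i], points[j])
               for i in range(n) for j in range(i + 1, n))
    return conf + pair / n**2

def metropolis(n, beta_n, sweeps, step, rng):
    """Random-walk Metropolis chain whose invariant density is proportional to exp(-beta_n * H_N)."""
    pts = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n)]
    current = energy(pts)
    accepted = 0
    for _ in range(sweeps * n):
        i = rng.randrange(n)
        old = pts[i]
        pts[i] = [c + step * rng.gauss(0.0, 1.0) for c in old]  # symmetric proposal
        proposal = energy(pts)
        # Accept with probability min(1, exp(-beta_n * (proposal - current))).
        if math.log(1.0 - rng.random()) < -beta_n * (proposal - current):
            current = proposal
            accepted += 1
        else:
            pts[i] = old
    return pts, current, accepted / (sweeps * n)

rng = random.Random(42)
sample, final_energy, rate = metropolis(n=10, beta_n=100.0, sweeps=200, step=0.3, rng=rng)
print(final_energy, rate)
```

Replacing the fixed `beta_n` by a slowly increasing schedule would turn the same loop into a basic simulated annealing scheme, at the price of choosing that schedule.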

More general energies. The density of \( {P_N} \) takes the form

\[ \prod_{i=1}^Nf_1(x_i)\prod_{1\leq i<j\leq N}f_2(x_i,x_j), \]

which comes from the structure of \( {H_N} \). One may study more general energies with many-body interactions, of the form, for some prescribed symmetric \( {W_k:(\mathbb{R}^d)^k\rightarrow\mathbb{R}} \), \( {1\leq k\leq K} \), \( {K\geq1} \),

\[ H_N(x_1,\ldots,x_N) =\sum_{k=1}^K \sum_{i_1<\cdots<i_k}N^{-k}W_k(x_{i_1},\ldots,x_{i_k}). \]

This leads to the following candidate for the asymptotic first order global energy functional:

\[ \mu\mapsto\sum_{k=1}^K \frac{1}{k!}\int\!\cdots\int\!W_k(x_1,\ldots,x_k)\,d\mu(x_1)\cdots d\mu(x_k). \]

Stochastic processes. Under general assumptions on \( {V} \) and \( {W} \), see for instance Royer, the law \( {P_N} \) is the invariant probability measure of a well defined (the absence of explosion comes from the assumptions on \( {V} \) and \( {W} \)) reversible Markov diffusion process \( {{(X_t)}_{t\in\mathbb{R}_+}} \) with state space

\[ \{x\in(\mathbb{R}^d)^N:H_N(x)<\infty\} =\{x\in(\mathbb{R}^d)^N:\sum_{i<j}W(x_i,x_j)<\infty\}, \]

solution of the system of Kolmogorov stochastic differential equations

\[ dX_t=\sqrt{2\frac{\alpha_N}{\beta_N}}\,dB_t-\alpha_N\nabla H_N(X_t)\,dt \]

where \( {{(B_t)}_{t\geq0}} \) is a standard Brownian motion on \( {(\mathbb{R}^d)^N} \), and where \( {\alpha_N>0} \) is an arbitrary scale parameter (natural choices being \( {\alpha_N=1} \) and \( {\alpha_N=\beta_N} \)). The law \( {P_N} \) is the equilibrium distribution of a system of \( {N} \) interacting Brownian particles \( {{(X_{1,t})}_{t\geq0},\ldots,{(X_{N,t})}_{t\geq0}} \) in \( {\mathbb{R}^d} \) at inverse temperature \( {\beta_N} \), with equal individual “charge” \( {q_N=1/N} \), subject to a confining potential \( {\alpha_N V} \) acting on each particle and to an interaction potential \( {\alpha_N W} \) acting on each pair of particles, and one can rewrite the stochastic differential equation above as the system of coupled stochastic differential equations (\( {1\leq i\leq N} \))

\[ dX_{i,t} =\sqrt{2\frac{\alpha_N}{\beta_N}}\,dB_{i,t} -q_N\alpha_N\nabla V(X_{i,t})\,dt -\sum_{j\neq i}q_N^2\alpha_N\nabla_1W(X_{i,t},X_{j,t})\,dt \]

where \( {{(B_{1,t})}_{t\geq0},\ldots,{(B_{N,t})}_{t\geq0}} \) are i.i.d. standard Brownian motions on \( {\mathbb{R}^d} \). From a partial differential equations point of view, the probability measure \( {P_N} \) is the steady state solution of the Fokker-Planck evolution equation \( {\partial_t-L=0} \) where \( {L} \) is the elliptic Markov diffusion operator (second order linear differential operator without constant term)

\[ L=\frac{\alpha_N}{\beta_N}\left(\Delta-\beta_N\nabla H_N\cdot\nabla\right), \]

acting as \( {Lf=\frac{\alpha_N}{\beta_N}(\Delta f-\left<\beta_N\nabla H_N,\nabla f\right>)} \). This self-adjoint operator in \( {\mathrm{L}^2(P_N)} \) is the infinitesimal generator of the Markov semigroup \( {{(P_t)}_{t\geq0}} \), \( {P_t(f)(x)=\mathbb{E}(f(X_t)|X_0=x)} \). Let us take \( {\alpha_N=\beta_N} \) for convenience. In the case where \( {V(x)=|x|^2} \) and \( {W\equiv0} \) (no interaction) then \( {P_N} \) is a centered Gaussian law \( {\mathcal{N}(0,\frac{N}{2\beta_N}I_{dN})} \) on \( {(\mathbb{R}^d)^N} \) and \( {{(X_t)}_{t\geq0}} \) is an Ornstein-Uhlenbeck Gaussian process; while in the case where \( {d=1} \) and \( {V(x)=|x|^2} \) and \( {W(x,y)=-\beta\log|x-y|} \) for some fixed parameter \( {\beta>0} \) then \( {P_N} \) is the law of the spectrum of a \( {\beta} \)-Ensemble of random matrices and \( {{(X_t)}_{t\geq0}} \) is a so-called Dyson Brownian motion (Anderson and Guionnet and Zeitouni). If \( {\mu_{N,t}} \) is the law of \( {X_t} \) then \( {\mu_{N,t}\rightarrow P_N} \) weakly as \( {t\rightarrow\infty} \). The study of the dynamic aspects is an interesting problem connected to McKean-Vlasov models (Cépa and Lépingle, Fontbona, Li and Li and Xie, Osada, Osada).
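The non-interacting case gives a simple way to test an Euler scheme for the stochastic differential equation above. The Python sketch below is an illustration under assumptions: it takes \( {W\equiv0} \) and \( {V(x)=|x|^2} \), so that each coordinate solves \( {dX_t=\sqrt{2\alpha_N/\beta_N}\,dB_t-\alpha_N\frac{2}{N}X_t\,dt} \), an Ornstein-Uhlenbeck equation whose invariant law is centered Gaussian with variance \( {\frac{N}{2\beta_N}} \) (this follows from the density \( {\exp(-\frac{\beta_N}{N}|x|^2)} \)). The function names, the step size and the run length are hypothetical choices.

```python
import math
import random

def stationary_variance(n, beta_n):
    """Per-coordinate variance of P_N when V(x) = |x|^2 and W = 0: N / (2 beta_N)."""
    return n / (2 * beta_n)

def euler_variance(n, beta_n, alpha_n, h, steps, rng):
    """Euler scheme for one coordinate of dX = sqrt(2 alpha/beta) dB - alpha (2/N) X dt.

    Returns the time average of X^2 after discarding the first tenth of the run.
    """
    noise = math.sqrt(2 * alpha_n / beta_n * h)  # standard deviation of one Brownian increment term
    drift = alpha_n * 2.0 / n                    # linear restoring rate
    x, total, count = 0.0, 0.0, 0
    burn_in = steps // 10
    for k in range(steps):
        x += -drift * x * h + noise * rng.gauss(0.0, 1.0)
        if k >= burn_in:
            total += x * x
            count += 1
    return total / count

rng = random.Random(1)
n, beta_n = 2, 4.0  # beta_N = N^2
estimate = euler_variance(n, beta_n, alpha_n=1.0, h=0.01, steps=400000, rng=rng)
print(estimate, stationary_variance(n, beta_n))
```

The time discretization introduces a small bias of order \( {h} \), and the estimate carries Monte Carlo noise, so the two printed values should be close but not equal.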

Calogero-(Moser-)Sutherland Schrödinger operators. Let us keep the notation used above. We define \( {U_N=\beta_NH_N} \) and we take \( {\beta_N=N^2} \) for simplicity. Let us consider the isometry \( {\Theta:\mathrm{L}^2(P_N)\rightarrow\mathrm{L}^2(dx)} \) defined by

\[ \Theta(f)(x)=f(x)\sqrt{\frac{dP_N(x)}{dx}}=f(x)e^{-\frac{1}{2}(U_N(x)+\log(Z_N))}. \]

The differential operator \( {S=-\Theta L \Theta^{-1}} \) is a Schrödinger operator:

\[ S=-\Theta L \Theta^{-1}=-\Delta+Q, \quad Q=\frac{1}{4}|\nabla U_N|^2-\frac{1}{2}\Delta U_N \]

which acts as \( {S f=-\Delta f+Qf} \). The operator \( {S} \) is self-adjoint in \( {\mathrm{L}^2(dx)} \). Being isometrically conjugated, the operators \( {-L} \) and \( {S} \) have the same spectrum, and their eigenspaces are isometric. In the case where \( {V(x)=|x|^2} \) and \( {W\equiv0} \) (no interactions), we find that \( {Q=\frac{1}{2}(1-V)} \), and \( {S} \) is a harmonic oscillator. On the other hand, following Proposition 11.3.1 in Forrester, in the case \( {d=1} \) and \( {W(x,y)=-\log|x-y|} \) (Coulomb interaction), then \( {S} \) is a Calogero-(Moser-)Sutherland Schrödinger operator:

\[ S=-\Delta -E_0+\frac{1}{4}\sum_{i=1}^Nx_i^2 -\frac{1}{2}\sum_{1\leq i<j\leq N}\frac{1}{(x_i-x_j)^2}, \quad E_0=\frac{N}{2}+\frac{N(N-1)}{2}. \]

More examples are given in Proposition 11.3.2 of Forrester, related to classical ensembles of random matrices. The study of the spectrum and eigenfunctions of such operators is a wide subject, connected to Dunkl operators. These models attracted some attention due to the fact that for several natural choices of the potentials \( {V,W} \), they are exactly solvable (or integrable). We refer to Section 11.3.1 of Forrester, Section 9.6 of Dunkl and Xu, and Section 2.7 of Chybiryakov et al.
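The conjugation between \( {L} \) and \( {S} \) can be checked numerically in the simplest setting. Since \( {L} \) kills constants, the function \( {\Theta(1)\propto e^{-U_N/2}} \) must satisfy \( {S\,\Theta(1)=0} \). The Python sketch below tests this in dimension one, assuming \( {\alpha_N=\beta_N} \) (so that \( {L=\Delta-\nabla U\cdot\nabla} \)) and the illustrative potential \( {U(x)=x^2} \), for which \( {Q(x)=x^2-1} \). The second derivative is approximated by a centered finite difference, and all function names are hypothetical.

```python
import math

def U(x):
    """Illustrative potential: dP/dx is proportional to exp(-U)."""
    return x * x

def dU(x):
    return 2.0 * x

def d2U(x):
    return 2.0

def Q(x):
    """Q = (1/4) |U'|^2 - (1/2) U''."""
    return 0.25 * dU(x) ** 2 - 0.5 * d2U(x)

def ground_state(x):
    """Theta applied to the constant function 1, up to normalization: exp(-U/2)."""
    return math.exp(-0.5 * U(x))

def apply_S(psi, x, h=1e-3):
    """(S psi)(x) = -psi''(x) + Q(x) psi(x), with a centered finite difference for psi''."""
    second = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / h**2
    return -second + Q(x) * psi(x)

residuals = [abs(apply_S(ground_state, x)) for x in (-1.5, -0.3, 0.0, 0.8, 2.0)]
print(max(residuals))
```

The residuals are of the order of the finite difference error, which confirms that \( {e^{-U/2}} \) is annihilated by \( {S} \) and is therefore a ground state at energy zero.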


Spectrum of Markov generators of random graphs

Spectrum of 50 iid copies of random generators in dimension 500 with exponential off diagonal entries

I had the pleasure to upload recently on arXiv and on HAL a collaborative work with Charles Bordenave and Pietro Caputo, entitled Spectrum of Markov generators on sparse random graphs.

Let \( {X=(X_{ij})_{1\leq i,j\leq n}} \) be an \( {n\times n} \) random matrix whose entries are i.i.d. complex random variables with mean \( {m} \), covariance matrix \( {K=\mathrm{Cov}(\Re X_{11},\Im X_{11})} \), and variance \( {\mathrm{Tr}(K)=1} \). The sparse regime is obtained by allowing the law of \( {X_{11}} \) to depend on \( {n} \) with \( {\mathrm{Tr}(K)\rightarrow0} \) as \( {n\rightarrow\infty} \), the basic example being the adjacency matrix of Erdős-Rényi random graphs. The analysis of the sparse regime was promoted by Charles Bordenave, who has a pretty Hungarian mathematical soul. However, to simplify the exposition, this blog post restricts to the non-sparse regime, which captures most of the rigid algebraic-geometric structure: we thus assume that the law of \( {X_{11}} \) does not depend on \( {n} \). We consider the random matrix defined by

\[ L=X-D \]

where \( {D} \) is the diagonal matrix obtained from the row sums of \( {X} \), namely \( {D_{ii}=\sum_{k=1}^nX_{ik}} \). If \( {X} \) is interpreted as the adjacency matrix of a weighted oriented graph, then \( {L} \) is the associated Laplacian matrix, with zero row sums. In particular, if the weights \( {X_{ij}} \) take values in \( {[0,\infty)} \), then \( {L} \) is the infinitesimal generator of the continuous time random walk on that graph, and properties of the spectrum of \( {L} \) can be used to study its long-time behavior. Clearly, \( {L} \) has non-independent entries but independent rows. A related model is obtained by considering the stochastic matrix \( {P=D^{-1}X} \), which corresponds to the discrete time random walk (considered in arXiv:0808.1502). In order to analyze the spectrum of \( {L} \), it is more convenient to introduce the following affine transformation of \( {L} \):

\[ M =\frac{L+nmI}{\sqrt{n}} =\frac{X}{\sqrt{n}}-\frac{D-nmI}{\sqrt{n}}. \]

By the central limit theorem, the distribution of \( {n^{-1/2}(D_{ii}-nm)} \) converges to the Gaussian law \( {\mathcal{N}(0,K)} \). Combined with the circular law for \( {n^{-1/2}X} \), this suggests the interpretation of the spectral distribution of \( {M} \), in the limit \( {n\rightarrow\infty} \), as an additive Gaussian deformation of the circular law. In a sense, our model is a non-Hermitian analogue of a model already studied by Bryc, Dembo, and Jiang years ago with the method of moments.
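The construction above is easy to simulate. The following is a minimal sketch, assuming NumPy and assuming exponential entries of parameter 1 (so that \( {m=1} \), \( {X_{11}} \) is real, and \( {K=\mathrm{diag}(1,0)} \)); the function names `markov_generator` and `rescale` are made up for this illustration.

```python
import numpy as np

def markov_generator(n, rng):
    """Return (X, D, L) with L = X - D, for X with i.i.d. Exp(1) entries.

    Assumption of this sketch: Exp(1) entries, which have mean m = 1 and variance 1.
    """
    X = rng.exponential(scale=1.0, size=(n, n))
    D = np.diag(X.sum(axis=1))  # D_ii is the i-th row sum of X
    return X, D, X - D

def rescale(L, m):
    """Affine transformation M = (L + n m I) / sqrt(n)."""
    n = L.shape[0]
    return (L + n * m * np.eye(n)) / np.sqrt(n)

rng = np.random.default_rng(0)
n, m = 200, 1.0
X, D, L = markov_generator(n, rng)
M = rescale(L, m)

eig_L = np.linalg.eigvals(L)  # contains the trivial eigenvalue 0 (zero row sums)
eig_M = np.linalg.eigvals(M)  # same spectrum, shifted by n m and scaled by 1/sqrt(n)
```

Plotting `eig_M` in the complex plane for several independent copies should give a picture of the same kind as the graphic at the top of this post, under the assumptions stated above.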

Basic notations and concepts. Recall that if \( {A} \) is an \( {n\times n} \) matrix, we denote by \( {\lambda_1(A),\ldots,\lambda_n(A)} \) its eigenvalues, i.e. the roots in \( {\mathbb{C}} \) of its characteristic polynomial. We label them in such a way that \( {|\lambda_1(A)|\geq\cdots\geq|\lambda_n(A)|} \). We denote by \( {s_1(A),\ldots,s_n(A)} \) the singular values of \( {A} \), i.e. the eigenvalues of the Hermitian positive semidefinite matrix \( {| A |= \sqrt{A^*A}} \), labeled so that \( {s_1(A)\geq\cdots \geq s_n(A) \geq 0} \). The operator norm of \( {A} \) is \( {\| A \|= s_1(A)} \) while the spectral radius is \( {|\lambda_1(A)|} \). We define the discrete probability measures

\[ \mu_A=\frac{1}{n}\sum_{k=1}^n\delta_{\lambda_k(A)} \quad\text{and}\quad \nu_A=\mu_{|A|} =\frac{1}{n}\sum_{k=1}^n\delta_{s_k(A)}. \]
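As a small illustration of these conventions, here is a sketch assuming NumPy; the helper names are hypothetical. Integrating a test function against \( {\mu_A} \) or \( {\nu_A} \) is just an average over the eigenvalues or the singular values.

```python
import numpy as np

def eigenvalues_by_modulus(A):
    """Eigenvalues of A labeled so that |lambda_1| >= ... >= |lambda_n|."""
    lam = np.linalg.eigvals(A)
    return lam[np.argsort(-np.abs(lam))]

def singular_values(A):
    """Singular values s_1 >= ... >= s_n >= 0 (NumPy returns them in this order)."""
    return np.linalg.svd(A, compute_uv=False)

def integrate(atoms, h):
    """Integral of h against the uniform discrete measure on the given atoms."""
    return np.mean(h(atoms))

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 30))
lam = eigenvalues_by_modulus(A)
s = singular_values(A)

operator_norm = s[0]              # ||A|| = s_1(A)
spectral_radius = np.abs(lam[0])  # |lambda_1(A)|
second_moment = integrate(s, lambda t: t**2)  # integral of t^2 against nu_A
```

The spectral radius never exceeds the operator norm, and the second moment of \( {\nu_A} \) equals \( {\frac{1}{n}\mathrm{Tr}(A^*A)} \).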

In the sequel, \( {G} \) is a Gaussian random variable on \( {\mathbb{R}^2\cong\mathbb{C}} \) with law \( {\mathcal{N}(0,K)} \) i.e. mean \( {0} \) and covariance matrix \( {K} \). This law has a Lebesgue density on \( {\mathbb{R}^2} \) if and only if \( {K} \) is invertible, in which case the density is \( {z=(x,y)\mapsto(2\pi\sqrt{\det(K)})^{-1}\exp(-\frac{1}{2}(x,y)K^{-1}(x,y)^{\top})} \). Note that \( {K} \) is not invertible when \( {X_{11}} \) is supported in \( {\mathbb{R}} \).

Singular values of shifts. Our first result concerns the singular values of shifts of the matrix \( {M} \), a useful proxy to the eigenvalues. It states that for every \( {z\in\mathbb{C}} \), there exists a probability measure \( {\nu_z} \) on \( {\mathbb{R}_+} \) which depends only on \( {z} \) and \( {K} \) such that with probability one,

\[ \nu_{M-zI} \underset{n\rightarrow\infty}{\longrightarrow} \nu_z. \]

Moreover, the limiting law \( {\nu_z} \) is characterized as follows: its symmetrization \( {\check \nu_z} \) is the unique symmetric probability measure on \( {\mathbb{R}} \) with Cauchy-Stieltjes transform satisfying, for every \( {\eta\in\mathbb{C}_+=\{w\in\mathbb{C}:\Im(w)>0\}} \),

\[ S_{\check \nu_z}(\eta) =\int_{\mathbb{R}}\!\frac{1}{t-\eta}\,d\check\nu_z(t) =\mathbb{E}\left(\frac{S_{\check \nu_z}(\eta)+\eta } {|G-z|^2-(\eta+ S_{\check \nu_z}(\eta))^2}\right). \]

It is in fact classical to express the Cauchy-Stieltjes transform of the limiting singular values distribution as a fixed point of a non linear equation, which comes from a recursion on the trace of the resolvent exploiting the recursive structure of the model. The real difficulty in the proof of the result above lies in the fact that the entries of \( {L} \) are dependent (but asymptotically independent).
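The fixed point equation above can be solved numerically by plain iteration. The sketch below is only an illustration, not the method of the paper: it assumes NumPy, assumes that \( {G} \) is a real standard Gaussian (the case \( {K=\mathrm{diag}(1,0)} \)) discretized by Gauss-Hermite quadrature, and takes \( {\Im\eta} \) large enough for the iteration to converge quickly. The names `gaussian_nodes` and `solve_stieltjes` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_nodes(deg=80):
    """Nodes and normalized weights approximating the law N(0, 1) on the real line."""
    x, w = hermegauss(deg)  # quadrature for the weight exp(-x^2 / 2)
    return x, w / w.sum()

def solve_stieltjes(z, eta, nodes, weights, iters=200):
    """Iterate S <- E[(S + eta) / (|G - z|^2 - (eta + S)^2)] starting from S = i."""
    a = np.abs(nodes - z) ** 2
    S = 1j
    for _ in range(iters):
        w = eta + S
        S = np.sum(weights * w / (a - w**2))
    return S

# Degenerate check: G = 0 and z = 0 reduce the equation to S^2 + eta S + 1 = 0,
# the equation of the Cauchy-Stieltjes transform of the semicircle law on [-2, 2].
S0 = solve_stieltjes(0.0, 2j, np.array([0.0]), np.array([1.0]))

# Real Gaussian case with a shift z off the real axis.
nodes, weights = gaussian_nodes()
z, eta = 0.5 + 0.3j, 2j
S = solve_stieltjes(z, eta, nodes, weights)
```

In the degenerate case the symmetrized limit is the semicircle law, which is consistent with the quarter circular law for the singular values of \( {n^{-1/2}X} \). For \( {\eta=2i} \) the solution in \( {\mathbb{C}_+} \) is \( {i(\sqrt{2}-1)} \).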

Eigenvalues convergence. The next result concerns the eigenvalues of \( {M} \):

\[ \mu_{M} \underset{n\rightarrow\infty}{\longrightarrow} \mu \]

where \( {\mu} \) is the probability measure on \( {\mathbb{C}} \) defined by

\[ \mu=\frac{1}{2\pi}\Delta\int_0^\infty\!\log(t)\,d\nu_z(t), \]

where the Laplacian \( {\Delta=4\partial_z\partial_{\overline z}=\partial_x^2+\partial_y^2} \) is taken in the sense of Schwartz-Sobolev distributions in the space \( {\mathcal{D}'(\mathbb{R}^2)} \). The limiting distribution \( {\mu} \) is independent of the mean \( {m} \) of \( {X_{11}} \), and this is rather natural since shifting the entries produces a deterministic rank one perturbation. As in other known circumstances, a rank one additive perturbation produces essentially a single outlier, and therefore does not affect the limiting spectral distribution. Our proof of the convergence to \( {\mu} \) is inspired by the logarithmic potential approach developed by Tao and Vu for the standard circular law (see also arXiv:1109.3343). As usual, the main difficulty lies in the control of the small singular values of shifts \( {M-zI} \), in particular the norm of the resolvent. We solve this difficulty by using essentially the techniques developed by Rudelson and Vershynin and others.
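The link between eigenvalues and singular values of shifts rests on the elementary identity \( {\int\log|\lambda-z|\,d\mu_A(\lambda)=\frac{1}{n}\log|\det(A-zI)|=\int_0^\infty\log(t)\,d\nu_{A-zI}(t)} \), valid whenever \( {z} \) is not an eigenvalue of \( {A} \). Here is a quick numerical check, a sketch assuming NumPy and exponential entries as in the graphic at the top of this post; the function names are hypothetical.

```python
import numpy as np

def log_integral_singular_values(A, z):
    """Integral of log(t) against nu_{A - zI}."""
    n = A.shape[0]
    s = np.linalg.svd(A - z * np.eye(n), compute_uv=False)
    return np.mean(np.log(s))

def log_integral_eigenvalues(A, z):
    """Integral of log|lambda - z| against mu_A."""
    lam = np.linalg.eigvals(A)
    return np.mean(np.log(np.abs(lam - z)))

rng = np.random.default_rng(0)
n, m = 60, 1.0
X = rng.exponential(scale=1.0, size=(n, n))  # assumption: Exp(1) entries, m = 1
L = X - np.diag(X.sum(axis=1))
M = (L + n * m * np.eye(n)) / np.sqrt(n)

z = 0.3 + 0.2j
lhs = log_integral_eigenvalues(M, z)
rhs = log_integral_singular_values(M, z)
```

Both quantities agree up to rounding errors. The logarithm is singular at zero, which is why the small singular values of \( {M-zI} \) matter so much when passing to the limit.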

Rigid analysis of the limit. To obtain further properties of \( {\mu} \), we turn to a flavor of free probability extended to possibly unbounded operators to interpret \( {n^{-1/2}(D-nmI)} \) as \( {n\rightarrow\infty} \). Following Brown and Haagerup and Schultz, one can define a large operator \( {\star} \)-algebra in which each element \( {a} \) has a Brown spectral measure denoted \( {\mu_a} \), that is the probability measure on \( {\mathbb{C}} \) given by

\[ \mu_a=\frac{1}{2\pi}\Delta\int_0^\infty\!\log(s)\,d\nu_{|a-z|}(s) \]

where \( {|b|=\sqrt{b^*b}} \) (i.e. the square root of the self-adjoint operator \( {b^*b} \)). If \( {a} \) is normal (i.e. \( {aa^*=a^*a} \)) then its Brown measure \( {\mu_a} \) coincides with its usual spectral measure. Now let \( {c} \) and \( {g} \) be \( {\star} \)-free operators with \( {c} \) circular, and \( {g} \) normal with spectral measure equal to the Gaussian law \( {\mathcal{N}(0,K)} \). Then we are able to show that

\[ \nu_z = \mu_{|c + g - z|} \quad \text{and} \quad \mu = \mu_{c+g}. \]

Having identified the limit law \( {\mu} \), we obtain some additional information on it. Namely, we use the concept of subordination developed by Biane and Voiculescu, which allows us to show that the support of \( {\mu} \) is given by

\[ \mathrm{Supp}(\mu) = \left\{z \in \mathbb{C} : \mathbb{E}\left(\frac{1}{|G-z|^2}\right)\geq 1\right\}. \]

Moreover, there exists a unique function \( {f : \mathrm{Supp} (\mu) \rightarrow [0,1]} \), which is \( {C^\infty} \) in the interior of \( {\mathrm{Supp} (\mu)} \), such that for all \( {z\in \mathrm{Supp} (\mu)} \),

\[ \mathbb{E}\left[\frac{1}{|G-z|^2 + f(z)^2}\right]=1. \]

Moreover \( {\mu} \) is absolutely continuous in \( {\mathbb{C}} \) with density given by

\[ z\mapsto \frac{1}{\pi} f(z)^2 \mathbb{E}\left[\Phi(G,z)\right] + \frac{1}{\pi} \frac{\left|\mathbb{E}\left[(G-z)\Phi(G,z)\right]\right|^2}{\mathbb{E}\left[\Phi(G,z)\right]} \]

where

\[ \Phi(w,z):=\frac{1}{(|w-z|^2 + f(z)^2)^2}. \]

It can be seen that \( {\mu} \) is rotationally invariant when \( {K} \) is a multiple of the identity, while this is not the case if \( {X_{11}} \) is supported in \( {\mathbb{R}} \), in which case \( {K_{22}=K_{12}=K_{21}=0} \) (in this case \( {G} \) does not have a density on \( {\mathbb{C}} \) since \( {K} \) is not invertible). Note also that the support of \( {\mu} \) is unbounded since it contains the support of \( {\mathcal{N}(0,K)} \), and thus \( {\mathrm{Supp}(\mu)=\mathbb{C}} \) if \( {K} \) is invertible. If \( {K} \) is not invertible, it can be checked that the boundary of \( {\mathrm{Supp}(\mu)} \) is

\[ \left\{z \in \mathbb{C} : \mathbb{E}\left(\frac{1}{|G-z|^2}\right) = 1\right\} \]

On this set, \( {f(z) = 0} \), but the formula for the density above shows that the density does not vanish there. This phenomenon, not unusual for Brown measures, occurs for the circular law and more generally for \( {R} \)-diagonal operators, see Haagerup and Larsen. Our formula above for the density is slightly more explicit than the formulas given in Biane and Lehner. The subordination formula that we use can also be used to compute more general Brown measures of the form \( {\mu_{a + c}} \) with \( {a, c} \) \( {\star} \)-free and \( {c} \) circular.
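The formulas above for the support, for \( {f} \), and for the density can be evaluated numerically. The sketch below assumes NumPy and the real case \( {K=\mathrm{diag}(1,0)} \), so that \( {G} \) is a real standard Gaussian, with expectations replaced by a Riemann sum on a fine grid. The function names are hypothetical, and the grid is too coarse when \( {z} \) is very close to the real axis or far from the origin.

```python
import math
import numpy as np

# Quadrature grid for a real standard Gaussian G (assumption: K = diag(1, 0)).
GRID = np.linspace(-12.0, 12.0, 48001)
STEP = GRID[1] - GRID[0]
WEIGHTS = np.exp(-GRID**2 / 2) / math.sqrt(2 * math.pi) * STEP

def expect(values):
    """Approximate E[h(G)] from the values of h on GRID."""
    return np.sum(values * WEIGHTS)

def resolvent_mean(z, f=0.0):
    """E[1 / (|G - z|^2 + f^2)] for real G and complex z."""
    return expect(1.0 / ((GRID - z.real) ** 2 + z.imag**2 + f**2))

def in_support(z):
    """Support criterion E[1 / |G - z|^2] >= 1 (use z off the real axis)."""
    return resolvent_mean(z) >= 1.0

def solve_f(z, iters=60):
    """Bisection on [0, 1] for E[1 / (|G - z|^2 + f^2)] = 1, for z in the support."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if resolvent_mean(z, mid) > 1.0:
            lo = mid  # the mean is decreasing in f, so the root is above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def density(z):
    """Direct transcription of the density formula with Phi(w, z)."""
    f = solve_f(z)
    phi = 1.0 / ((GRID - z.real) ** 2 + z.imag**2 + f**2) ** 2
    e_phi = expect(phi)
    e_gphi = expect((GRID - z) * phi)
    return (f**2 * e_phi + abs(e_gphi) ** 2 / e_phi) / math.pi

inside, outside = in_support(0.5j), in_support(1.0j)
f_half = solve_f(0.5j)
rho_half = density(0.5j)
```

For purely imaginary \( {z=iy} \) with \( {y>0} \), the expectation has the closed form \( {\mathbb{E}(|G-iy|^{-2})=\sqrt{\pi/2}\,y^{-1}e^{y^2/2}\,\mathrm{erfc}(y/\sqrt{2})} \), which gives a way to check the quadrature. With these assumptions the point \( {0.5i} \) lies in the support while \( {i} \) does not.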

Spectrum localization. The convergence of \( {\mu_M} \) suggests that the bulk of the spectrum of \( {L} \) is concentrated around the value \( {-mn} \) in a two-dimensional window of width \( {\sqrt{n}} \). Actually, it is possible to localize more precisely the support of the spectrum, by controlling the extremal eigenvalues of \( {L} \). Recall that \( {L} \) always has the trivial eigenvalue \( {0} \). We define for convenience the centered matrices \( {\underline X = X-\mathbb{E}X} \) and \( {\underline D=D-\mathbb{E}D} \) and \( {\underline L=L-\mathbb{E}L=\underline X-\underline D} \). If \( {J} \) stands for the \( {n\times n} \) matrix with all entries equal to \( {1} \), then

\[ \mathbb{E}L = L-\underline L= mJ - mnI. \]

Now the idea is that if \( {\mathbb{E}(|X_{11}|^4)<\infty} \) then by the Bai-Yin theorem, the operator norm of \( {\underline X} \) is \( {\sqrt{n}\,(2+o(1))} \). On the other hand, from the central limit theorem one expects that the operator norm and the spectral radius of the diagonal matrix \( {\underline D} \) are of order \( {\sqrt{2n\log(n)}\,(1+o(1))} \) as for the maximum of i.i.d. Gaussian random variables. We show indeed that if \( {X_{11}} \) is supported in \( {\mathbb{R}_+} \) and if \( {\mathbb{E}(|X_{11}|^4)<\infty} \) then with probability one, for \( {n \gg1} \), every eigenvalue \( { \underline \lambda} \) of \( {\underline L} \) satisfies

\[ |\Re \underline \lambda| \leq \sqrt{2 n\log(n)}\,(1+o(1)) \quad\text{and}\quad |\Im \underline \lambda| \leq \sqrt{n}(2+o(1)) . \]

Moreover, with probability one, for \( {n \gg 1} \), every eigenvalue \( {\lambda\neq 0} \) of \( {L} \) satisfies

\[ |\Re \lambda + m n| \leq \sqrt{2 n \log(n)}\,(1+o(1)) \quad\text{and}\quad |\Im \lambda| \leq \sqrt{n}(2+o(1)). \]

Our proof is simple and relies on classical perturbative methods in matrix analysis: refined Gershgorin and the Bauer-Fike theorems. If one defines a spectral gap \( {\kappa} \) of the Markov generator \( {L} \) as the minimum of \( {|\Re \lambda|} \) for \( {\lambda\neq 0} \) in the spectrum of \( {L} \), then it follows that a.s.

\[ \kappa\geq mn - \sqrt{2n\log(n)}\,(1+o(1)). \]
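These bounds are asymptotic, but one can look at the corresponding ratios on a simulation. The following sketch assumes NumPy and exponential entries of parameter 1 (so \( {m=1} \)). The dimension \( {n=300} \) is an arbitrary choice, for which the \( {o(1)} \) terms are not negligible.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 300, 1.0
X = rng.exponential(scale=1.0, size=(n, n))  # assumption: Exp(1) entries
L = X - np.diag(X.sum(axis=1))

lam = np.linalg.eigvals(L)
lam = lam[np.argsort(np.abs(lam))][1:]  # drop the trivial eigenvalue 0

# Ratios expected to be at most 1 + o(1) and 2 + o(1) respectively as n grows.
ratio_re = np.max(np.abs(lam.real + m * n)) / np.sqrt(2 * n * np.log(n))
ratio_im = np.max(np.abs(lam.imag)) / np.sqrt(n)

# Spectral gap: minimum of |Re(lambda)| over the nonzero eigenvalues.
kappa = np.min(np.abs(lam.real))
```

Dropping the eigenvalue of smallest modulus is legitimate here because all the other eigenvalues sit near \( {-mn} \), far from the origin.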

Invariant measure. We turn to the properties of the invariant measure of \( {L} \). If \( {X_{11}} \) is supported in \( {\mathbb{R}_+} \) and if \( {L} \) is irreducible, then from the Perron-Frobenius theorem, the kernel of \( {L} \) has dimension \( {1} \) and there is a unique vector \( {\Pi\in(0,1)^n} \) such that \( {L^\top\Pi = 0} \) and \( {\sum_{i=1}^n\Pi_i =1} \). The vector \( {\Pi} \) is the invariant measure of the Markov process with infinitesimal generator \( {L} \). Actually, we show that a.s. for \( {n \gg 1} \), the Markov generator \( {L} \) is irreducible and

\[ \left\Vert\Pi - U_n\right\Vert_1 = \mathcal{O} \left(\sqrt{\frac{\log(n)}{n}}\right)=o(1). \]

where \( {U_n = \frac1n(1 ,\ldots,1)^\top} \) is the uniform probability distribution on the finite set \( {\{1, \ldots , n\}} \) and where \( {\left\Vert\cdot\right\Vert_1} \) is the total variation norm. Our proof relies on the remarkable Sylvester determinant theorem which states that if \( {A\in\mathcal{M}_{p,q}} \) and \( {B\in\mathcal{M}_{q,p}} \) are two rectangular matrices with swapped dimensions then

\[ \det(I_p+AB)=\det(I_q+BA). \]

To understand it, recall that \( {AB} \) and \( {BA} \) have the same spectrum up to the multiplicity of the eigenvalue zero, and thus, their characteristic polynomials in \( {z} \) are identical up to a multiplication by a power of \( {z} \), which gives the result taking \( {z=1} \). In particular, this formula allows one to pass from a high dimensional problem to a one dimensional problem.
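Both the invariant measure and the Sylvester identity are easy to explore numerically. The sketch below assumes NumPy and exponential entries of parameter 1. It computes \( {\Pi} \) from the kernel of \( {L^\top} \) by a singular value decomposition, which is a convenient numerical shortcut and not the argument used in the proof. The function names are hypothetical.

```python
import numpy as np

def invariant_measure(L):
    """Probability vector Pi with L^T Pi = 0, from the kernel of L^T."""
    # The last right singular vector of L^T spans its (one dimensional) kernel.
    _, _, vh = np.linalg.svd(L.T)
    v = vh[-1]
    return v / v.sum()

def sylvester_sides(A, B):
    """Both sides of det(I_p + AB) = det(I_q + BA) for A of size p x q, B of size q x p."""
    p, q = A.shape
    return np.linalg.det(np.eye(p) + A @ B), np.linalg.det(np.eye(q) + B @ A)

rng = np.random.default_rng(2)
n = 300
X = rng.exponential(scale=1.0, size=(n, n))  # assumption: Exp(1) entries
L = X - np.diag(X.sum(axis=1))

Pi = invariant_measure(L)
U = np.full(n, 1.0 / n)
distance = np.sum(np.abs(Pi - U))  # the norm ||Pi - U_n||_1 used above

A = rng.standard_normal((3, 7))
B = rng.standard_normal((7, 3))
left, right = sylvester_sides(A, B)  # a 3 x 3 determinant versus a 7 x 7 one
```

With \( {p=1} \) the left hand side is a scalar, which is the sense in which the identity reduces a high dimensional determinant to a one dimensional computation.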

Interpolation. Our results for the matrix \( {L=X-D} \) can be extended with minor modifications to the case of the matrix \( {L_{(t)}=X-tD} \), where \( {t\in\mathbb{R}} \), provided that the law \( {\mathcal{N}(0,K)} \) characterizing our limiting spectral distributions is replaced by \( {\mathcal{N}(0,t^2K)} \). This gives back the circular law for \( {t=0} \). One may also interpolate between the Gaussian and the circular laws by considering \( {(1-t)X-tD} \) with \( {t\in[0,1]} \) or other parametrizations.

Open problems.

  • Almost sure convergence. The mode of convergence of spectral distributions is the weak convergence in probability. We believe that this can be upgraded to almost sure weak convergence, but this requires stronger bounds on the smallest singular values of \( {M-zI} \) (i.e. norm of the resolvent). This is not a problem if \( {X_{11}} \) has a bounded density.
  • Sparsity. We are able to show that the results remain essentially valid in the sparse case in which the law of \( {X_{11}} \) depends on \( {n} \). However, our treatment is not optimal. Note that an optimal answer in the sparse case is still pending even for the circular law model.
  • Heavy tails. A different model for random Markov generators is obtained when the law of \( {X_{11}} \) has heavy tails, with e.g. infinite first moment. In this context, we refer to arXiv:1006.1713 and to arXiv:1109.3343 for the spectral analysis of non-Hermitian matrices with i.i.d. entries, and to arXiv:0903.3528 for the case of reversible Markov transition matrices. It is natural to expect that, in contrast with the cases considered here, there is no asymptotic independence of the matrices \( {X} \) and \( {D} \) in the heavy tailed case.
  • Spectral edge and spectral gap. Concerning the localization of the spectrum, it seems natural to conjecture the asymptotic behavior \( {\kappa= mn - \sqrt{2n\log(n)}\,(1+o(1))} \) for the spectral gap, but we do not have a proof of the corresponding upper bound. In the same spirit, we believe that with probability one, with \( {\underline L=L-\mathbb{E}L} \),

    \[ \lim_{n\rightarrow\infty} \frac{s_1(\underline L)}{\sqrt{2 n\log(n)}} =\lim_{n\rightarrow\infty}\frac{|\lambda_1(\underline L)|}{\sqrt{2 n\log(n)}}=1, \]

    which contrasts with the behavior of \( {\underline X} \) for which \( {s_1/|\lambda_1|\rightarrow2} \) as \( {n\rightarrow\infty} \).

Kernel estimator for the data used in the upper graphic
