Press "Enter" to skip to content

Month: November 2012

About the Wigner theorem

Recall that a random \( {n\times n} \) Hermitian matrix \( {M} \) belongs to the Gaussian Unitary Ensemble (GUE) when it admits a density proportional to

\[ M\mapsto \exp\left(-\frac{n}{2}\mathrm{Tr}(M^2)\right). \]

If \( {\lambda_1,\ldots,\lambda_n} \) are the eigenvalues of \( {M} \) then the Wigner theorem gives

\[ \mathbb{E}\left(\frac{\mathrm{Card}\{1\leq i\leq n:\lambda_i\in I\}}{n}\right) \underset{n\rightarrow\infty}{\longrightarrow} \mu_{\mathrm{SC}}(I) \]

for every interval \( {I\subset\mathbb{R}} \), where \( {\mu_{\mathrm{SC}}} \) is the semicircle distribution with density \( {\frac{1}{2\pi}\sqrt{4-x^2}\mathbf{1}_{[-2,2]}(x)} \). In other words, the averaged empirical spectral distribution

\[ \mu_n :=\mathbb{E}\left(\frac{1}{n}\sum_{i=1}^n\delta_{\lambda_i}\right) \]

tends weakly to \( {\mu_{\mathrm{SC}}} \) as \( {n\rightarrow\infty} \). There are many proofs of this result. In this post we present a simple analytic proof due to Pastur.
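
As a quick numerical illustration (a minimal sketch, not part of the original argument; the interval \( {I=[-1,1]} \), the seed, and the size \( {n} \) are arbitrary choices), one can sample a GUE matrix with the normalization above and compare the proportion of eigenvalues in \( {I} \) with \( {\mu_{\mathrm{SC}}(I)} \):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# GUE with density proportional to exp(-(n/2) Tr(M^2)):
# diagonal entries are real N(0, 1/n), off-diagonal entries are
# complex Gaussians with E|M_pq|^2 = 1/n.
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
M = (A + A.conj().T) / np.sqrt(2 * n)

eigs = np.linalg.eigvalsh(M)

# mu_SC([-1,1]) = 1/3 + sqrt(3)/(2 pi) ~ 0.6090, from the primitive
# (1/2pi)(x sqrt(4-x^2)/2 + 2 arcsin(x/2)) of the semicircle density.
sc_mass = 1 / 3 + np.sqrt(3) / (2 * np.pi)

print("empirical :", np.mean(np.abs(eigs) <= 1))
print("semicircle:", sc_mass)
```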

Cauchy-Stieltjes transform. Set \( {\mathbb{C}_+=\{z\in\mathbb{C}:\mathfrak{I}z>0\}} \). The Cauchy-Stieltjes transform \( {s_\mu:\mathbb{C}_+\rightarrow\mathbb{C}} \) of a probability measure \( {\mu} \) on \( {\mathbb{R}} \) is given by

\[ s_\mu(z):=\int\!\frac{1}{\lambda-z}\,d\mu(\lambda). \]

The knowledge of \( {s_\mu} \) on \( {\mathbb{C}_+} \) (and even on a small ball of \( {\mathbb{C}_+} \), by analyticity) is enough to characterize \( {\mu} \). We always have the universal bound, for any \( {z\in\mathbb{C}_+} \):

\[ |s_\mu(z)|\leq\frac{1}{\mathfrak{I}z}. \]

Furthermore, if \( {{(\mu_n)}} \) is a sequence of probability measures on \( {\mathbb{R}} \) such that \( {s_{\mu_n}(z)\rightarrow s(z)} \) as \( {n\rightarrow\infty} \) for every \( {z\in\mathbb{C}_+} \), and if \( {s} \) is the Cauchy-Stieltjes transform of a probability measure \( {\mu} \), then \( {\mu_n\rightarrow\mu} \) weakly as \( {n\rightarrow\infty} \).
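
As a quick check (a sketch; the quadrature grid and the point \( {z} \) are arbitrary choices), one can evaluate \( {s_{\mu_{\mathrm{SC}}}} \) by a Riemann sum and verify the universal bound:

```python
import numpy as np

def s_semicircle(z, grid=200_000):
    """Cauchy-Stieltjes transform of the semicircle law by Riemann sum."""
    lam = np.linspace(-2.0, 2.0, grid)
    dens = np.sqrt(np.maximum(4 - lam**2, 0)) / (2 * np.pi)
    return np.sum(dens / (lam - z)) * (lam[1] - lam[0])

z = 1.0 + 0.5j
s = s_semicircle(z)
print(s)                        # value at some point of C_+
print(abs(s) <= 1 / z.imag)     # True: |s_mu(z)| <= 1 / Im(z)
```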

Our problem thus reduces to showing that \( {s_{\mu_n}(z)\rightarrow s_{\mu_{\mathrm{SC}}}(z)} \) for every \( {z\in\mathbb{C}_+} \).

Resolvent. The resolvent \( {G} \) of a Hermitian matrix \( {H} \) at point \( {z\in\mathbb{C}_+} \) is

\[ G(z)=(H-zI)^{-1}. \]

The map \( {z\mapsto G(z)} \) is holomorphic on \( {\mathbb{C}_+} \) and

\[ \left\Vert G(z)\right\Vert_{2\rightarrow2}\leq \frac{1}{\mathfrak{I}z}. \]

Denoting \( {\tau:=\frac{1}{n}\mathrm{Tr}} \) and \( {\mu:=\frac{1}{n}\sum_{k=1}^n\delta_{\lambda_k(H)}} \), we have

\[ s_\mu(z)=\tau(G(z)). \]
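
Concretely (a minimal sketch with an arbitrary Hermitian matrix and an arbitrary \( {z} \)), the normalized trace of the resolvent coincides with the Cauchy-Stieltjes transform of the empirical spectral distribution:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (B + B.conj().T) / 2                 # an arbitrary Hermitian matrix
z = 0.3 + 1.0j

G = np.linalg.inv(H - z * np.eye(n))     # resolvent G(z) = (H - zI)^{-1}
lhs = np.trace(G) / n                    # tau(G(z))
rhs = np.mean(1 / (np.linalg.eigvalsh(H) - z))  # s_mu(z), mu empirical
print(np.allclose(lhs, rhs))             # True
```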

The resolvent identity states that if \( {G_1} \) and \( {G_2} \) are the resolvent at point \( {z} \) of two Hermitian matrices \( {H_1} \) and \( {H_2} \) then

\[ G_2-G_1=-G_1(H_2-H_1)G_2. \]

Using this with \( {H_1=M} \) and \( {H_2=0} \) (so that \( {G_2=-z^{-1}I} \)), multiplying both sides by \( {z} \), and taking expectations, we get, denoting by \( {G} \) the resolvent of \( {M} \) at point \( {z} \),

\[ -I-z\mathbb{E}(G)=-\mathbb{E}(GM). \]
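
Note that before taking expectations this identity is deterministic: \( {G(M-zI)=I} \) gives \( {GM=I+zG} \), that is \( {-I-zG=-GM} \). A quick self-contained check (a sketch, with arbitrary \( {n} \) and \( {z} \)):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
M = (A + A.conj().T) / np.sqrt(2 * n)    # a GUE sample, as before
z = 0.7 + 0.8j

G = np.linalg.inv(M - z * np.eye(n))
print(np.allclose(-np.eye(n) - z * G, -G @ M))   # True: -I - zG = -GM
```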

From this identity, we will obtain an equation satisfied by \( {s_{\mu_n}=\mathbb{E}(\tau(G))} \).

Integration by parts. For a random vector \( {X} \) with density \( {e^{-V}} \) on \( {\mathbb{R}^d} \) and a smooth \( {F:\mathbb{R}^d\rightarrow\mathbb{C}} \), an integration by parts gives

\[ \mathbb{E}(F(X)\nabla V(X))=\mathbb{E}(\nabla F(X)) \]

as soon as \( {e^{-V(x)}F(x)\rightarrow0} \) as \( {\left\Vert x\right\Vert\rightarrow\infty} \). In particular, for every \( {1\leq p,q\leq n} \), denoting \( {\partial_{M_{pq}}:=\frac{1}{2}(\partial_{\mathfrak{R} M_{pq}}-i\partial_{\mathfrak{I} M_{pq}})} \) when \( {p\neq q} \),

\[ \mathbb{E}(F(M)M_{pq}) =\frac{1}{n}\mathbb{E}(\partial_{M_{qp}}F(M)). \]
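
In dimension one, with \( {V(x)=x^2/(2\sigma^2)} \), the formula reads \( {\mathbb{E}(F(X)X)=\sigma^2\mathbb{E}(F'(X))} \). Here is a Monte Carlo sanity check of this scalar instance (a sketch; the test function and the choice \( {\sigma^2=1/n} \), mimicking the variance of a single matrix entry, are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma2 = 1.0 / 100                       # variance of a single GUE entry, n = 100
X = rng.normal(0.0, np.sqrt(sigma2), size=10**6)

F = lambda x: np.sin(x) + x**2           # any smooth test function
dF = lambda x: np.cos(x) + 2 * x         # its derivative

print(np.mean(F(X) * X))                 # ~ sigma2 * E(F'(X))
print(sigma2 * np.mean(dF(X)))
```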

Let us examine the case \( {F(M)=G_{ab}} \) with \( {1\leq a,b\leq n} \). The resolvent identity used twice gives

\[ \begin{array}{rcl} G_2-G_1 &=&-G_1(M_2-M_1)G_2 \\ &=&-G_1(M_2-M_1)(G_1-G_1(M_2-M_1)G_2)\\ &=&-G_1(M_2-M_1)G_1+o(\left\Vert M_2-M_1\right\Vert) \end{array} \]

which gives in particular that for every \( {1\leq a,b,p,q\leq n} \),

\[ \partial_{M_{pq}}G_{ab}=-G_{ap}G_{qb}. \]

As a consequence, using the integration by parts formula above, we get, for every \( {1\leq p\leq n} \),

\[ \mathbb{E}(GM)_{pp} =\sum_{q=1}^n\mathbb{E}(G_{pq}M_{qp}) =\frac{1}{n}\sum_{q=1}^n\mathbb{E}(\partial_{M_{pq}}G_{pq}) =-\frac{1}{n}\sum_{q=1}^n\mathbb{E}(G_{pp}G_{qq}) =-\mathbb{E}(G_{pp}\tau(G)). \]

Summing up, we obtain, after taking normalized traces,

\[ -1-zs_{\mu_n}(z)=\mathbb{E}(\tau(G)^2). \]
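
This relation can be watched numerically (a Monte Carlo sketch; the dimension, sample size, and \( {z} \) are arbitrary): the quantity \( {-1-zs_{\mu_n}(z)} \) matches the empirical mean of \( {\tau(G)^2} \), which is itself already close to \( {s_{\mu_n}(z)^2} \):

```python
import numpy as np

rng = np.random.default_rng(4)
n, samples = 200, 200
z = 0.5 + 1.0j

taus = []
for _ in range(samples):
    A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    M = (A + A.conj().T) / np.sqrt(2 * n)
    taus.append(np.mean(1 / (np.linalg.eigvalsh(M) - z)))   # tau(G(z))
taus = np.array(taus)

s_n = taus.mean()                                 # Monte Carlo s_{mu_n}(z)
print(abs(-1 - z * s_n - (taus**2).mean()))       # ~ 0 up to Monte Carlo error
print(abs((taus**2).mean() - s_n**2))             # ~ Var(tau(G)), small
```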

Poincaré inequality. To compare \( {\mathbb{E}(\tau(G)^2)} \) in the right hand side with \( {\mathbb{E}(\tau(G))^2=s_{\mu_n}(z)^2} \), we may use concentration or, alternatively, following Pastur, a Poincaré inequality. Namely, we set \( {f(M):=\tau(G)} \) and we write

\[ \left|\mathbb{E}(\tau(G)^2)-(\mathbb{E}\tau(G))^2\right| = \left|\mathbb{E}((\tau(G)-\mathbb{E}\tau(G))^2)\right| \leq \mathbb{E}(\left|\tau(G)-\mathbb{E}\tau(G)\right|^2) =\mathrm{Var}(f(M)). \]

Now \( {M} \) is Gaussian with a covariance matrix \( {C} \) such that \( {\left\Vert C\right\Vert_{2\rightarrow2}=\mathcal{O}(n^{-1})} \). Thus, the Poincaré inequality for the law of \( {M} \) and the function \( {f} \) gives

\[ \mathrm{Var}(f(M)) \leq \mathcal{O}(n^{-1})\mathbb{E}(\left\Vert \nabla f(M)\right\Vert_2^2). \]

But

\[ \partial_{M_{pq}}\tau(G) =\frac{1}{n}\sum_a\partial_{M_{pq}}G_{aa} =-\frac{1}{n}\sum_aG_{ap}G_{qa} =-\frac{1}{n}(G^2)_{qp} \]

and therefore

\[ \left\Vert \nabla f\right\Vert_2^2 =\sum_{p,q}\left|\partial_{M_{pq}}\tau(G)\right|^2 =\frac{1}{n^2}\left\Vert G^2\right\Vert_{\mathrm{HS}}^2 \leq \frac{1}{n}\left\Vert G^2\right\Vert_{2\rightarrow2}^2 \leq \frac{1}{n}\left\Vert G\right\Vert_{2\rightarrow2}^4 \leq \frac{1}{n(\mathfrak{I} z)^4}. \]

Putting everything together, we finally get

\[ \left|s_{\mu_n}(z)^2+zs_{\mu_n}(z)+1\right| = \mathcal{O}\left(\frac{1}{n^2(\mathfrak{I} z)^4}\right) \underset{n\rightarrow\infty}{\longrightarrow}0. \]

Conclusion. The equation \( {s^2+zs+1=0} \) in \( {s} \) has two roots \( {s_\pm(z):=\frac{1}{2}(-z\pm\sqrt{z^2-4})} \). Since \( {\mathfrak{I}s_{\mu}(z)>0} \) whenever \( {\mathfrak{I}z>0} \), the limit must be the root with positive imaginary part on \( {\mathbb{C}_+} \). This gives \( {\lim_{n\rightarrow\infty}s_{\mu_n}(z)=\frac{1}{2}(-z+\sqrt{z^2-4})} \) for every \( {z\in\mathbb{C}_+} \), which is precisely the Cauchy-Stieltjes transform of \( {\mu_{\mathrm{SC}}} \). We conclude that \( {\mu_n} \) converges weakly to \( {\mu_{\mathrm{SC}}} \) as \( {n\rightarrow\infty} \). One may drop the expectation in \( {\mu_n} \) and obtain almost sure weak convergence by using concentration and the first Borel-Cantelli lemma.
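
Numerically (a sketch; the branch of \( {\sqrt{z^2-4}} \) with \( {\mathfrak{I}s_+(z)>0} \) on \( {\mathbb{C}_+} \) is obtained here through the factorization \( {\sqrt{z-2}\sqrt{z+2}} \) with principal square roots), the trace of the resolvent of a single large GUE sample is already close to this limit:

```python
import numpy as np

def s_sc(z):
    # Root of s^2 + zs + 1 = 0 with Im(s) > 0 on C_+; the factored square
    # roots select the branch of sqrt(z^2 - 4) behaving like z at infinity.
    return (-z + np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

rng = np.random.default_rng(5)
n = 2000
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
M = (A + A.conj().T) / np.sqrt(2 * n)
z = -0.8 + 0.6j

s_emp = np.mean(1 / (np.linalg.eigvalsh(M) - z))   # tau(G(z)) for one sample
print(s_emp, s_sc(z))                              # close for large n
```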

Further reading. This method is quite powerful and can be extended to many models of random matrices, including Wigner matrices (not necessarily Gaussian), as explained in the book by Pastur and Shcherbina.

About the first Borel-Cantelli lemma

The first Borel-Cantelli lemma is one of the rare elementary providers of almost sure convergence in probability theory. It states that if \( {(A_n)} \) is a sequence of events in a probability space \( {(\Omega,\mathcal{F},\mathbb{P})} \) such that \( {\sum_n\mathbb{P}(A_n)<\infty} \) then \( {\mathbb{P}(\varlimsup_nA_n)=0} \). The classical proof consists in noticing that \( {B_n=\cup_{m\geq n}A_m} \) is non-increasing in \( {n} \), and thus, since \( {\sum_n\mathbb{P}(A_n)<\infty} \),

\[ \mathbb{P}(\varlimsup A_n) =\mathbb{P}(\cap_nB_n) =\lim_n\mathbb{P}(B_n) \leq \lim_{n}\sum_{m\geq n}\mathbb{P}(A_m)=0. \]
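
As a toy illustration (a sketch, with arbitrary parameters), take independent events \( {A_n} \) with \( {\mathbb{P}(A_n)=n^{-2}} \), so that \( {\sum_n\mathbb{P}(A_n)<\infty} \): on every simulated sample path only finitely many events occur, and events with large index are already rare:

```python
import numpy as np

rng = np.random.default_rng(6)
paths, N = 200, 10**4
ns = np.arange(1, N + 1)

# Independent events A_n with P(A_n) = 1/n^2 on each sample path.
occurrences = rng.random((paths, N)) < 1.0 / ns**2

print(occurrences.sum(axis=1).max())              # total occurrences stay small
print(occurrences[:, 100:].any(axis=1).mean())    # ~ sum_{n>100} 1/n^2 ~ 0.01
```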

There is another proof of this lemma that I prefer, and which goes as follows: by the Fubini-Tonelli theorem or by the monotone convergence theorem,

\[ \mathbb{E}(\sum_n\mathbf{1}_{A_n}) =\sum_n\mathbb{E}(\mathbf{1}_{A_n}) =\sum_n\mathbb{P}(A_n) <\infty \]

therefore \( {\mathbb{P}(\sum_n\mathbf{1}_{A_n}=\infty)=0} \). Now \( {\{\sum_n\mathbf{1}_{A_n}=\infty\}=\varlimsup A_n} \). I like this proof very much because it concentrates several ingredients that are useful for students:

  • we count by summing indicators of events, and \( {\{\sum_n\mathbf{1}_{A_n}=\infty\}=\varlimsup A_n} \)
  • \( {\mathbb{E}(\mathbf{1}_A)=\mathbb{P}(A)} \)
  • the Fubini-Tonelli theorem allows us to swap \( {\mathbb{E}} \) and \( {\sum} \)
  • if \( {X\geq0} \) and \( {\mathbb{E}(X)<\infty} \) then \( {\mathbb{P}(X=\infty)=0} \)

A very similar argument may be used to obtain quickly a strong law of large numbers. Namely, if \( {(X_n)} \) are independent centered random variables bounded in \( {\mathbf{L}^4} \) (i.e. \( {\sup_n\mathbb{E}(|X_n|^4)<\infty} \)) and if we set \( {S_n:=\frac{1}{n}(X_1+\cdots+X_n)} \), then by expanding \( {S_n^4} \) and using independence and centering (only the \( {n} \) terms \( {\mathbb{E}(X_i^4)} \) and the \( {\mathcal{O}(n^2)} \) terms \( {\mathbb{E}(X_i^2)\mathbb{E}(X_j^2)} \) survive), we get \( {\mathbb{E}(S_n^4)=\mathcal{O}(n^{-2})} \), which gives

\[ \mathbb{E}(\sum_nS_n^4) =\sum_n\mathbb{E}(S_n^4) <\infty \]

and thus \( {\mathbb{P}(S_n\rightarrow0)\geq\mathbb{P}(\sum_nS_n^4<\infty)=1} \). To me, this is a bit more elegant than using the first Borel-Cantelli lemma and the Markov inequality.
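
A quick simulation (a sketch, with Rademacher signs as a particular case of variables bounded in \( {\mathbf{L}^4} \)) shows \( {S_n} \) flattening to \( {0} \):

```python
import numpy as np

rng = np.random.default_rng(7)
N = 10**6
X = rng.choice([-1.0, 1.0], size=N)   # independent, centered, bounded in L^4

S = np.cumsum(X) / np.arange(1, N + 1)
for n in (10**2, 10**4, 10**6):
    print(n, S[n - 1])                # -> 0 almost surely
```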
