Archive for May 15th, 2010

Probability & Geometry in High Dimensions

May 15th, 2010 5 comments

Together with my colleagues Olivier Guédon, Guillaume Lecué, and Alain Pajor, of Marne-la-Vallée, I am organizing a one-week workshop on Probability & Geometry in High Dimensions.

The aim of this workshop is to reflect on recent developments in Probability and Geometry in High Dimensions with emphasis on interactions with other fields of mathematics such as compressed sensing, sparse statistical problems, random matrices, and empirical processes.

The workshop will take place on the campus of Université Paris-Est Marne-la-Vallée, May 17-21, 2010. It turns out that Michel Talagrand will give a talk on this occasion.

Poster of the workshop

When the central limit theorem fails… Sparsity and localization.

May 15th, 2010 No comments

This post discusses some basic aspects of the Central Limit Theorem (CLT) in relation to the notions of localization and sparsity. Let \( {G\sim\mathcal{N}(0,1)} \) and let \( {(X_n)_{n\geq1}} \) be a sequence of independent real random variables with, for every \( {n\geq1} \),

\[ \mathbb{E}(X_n)=0\quad\text{and}\quad \sigma_n^2:=\mathbb{E}(X_n^2) \]

Let us define

\[ S_n:=\frac{X_1+\cdots+X_n}{s_n} \quad\text{where}\quad s_n^2:=\mathrm{Var}(X_1+\cdots+X_n)=\sigma_1^2+\cdots+\sigma_n^2. \]

The Lindeberg CLT states that if, for every \( {\varepsilon>0} \),

\[ \lim_{n\rightarrow\infty}\frac{1}{s_n^2}\sum_{k=1}^n \mathbb{E}(X_k^2\mathbf{1}_{\{|X_k|>\varepsilon s_n\}})=0 \]

then \( {(S_n)_{n\geq1}} \) converges in distribution to the standard Gaussian \( {\mathcal{N}(0,1)} \), in other words,

\[ \lim_{n\rightarrow\infty}\mathbb{P}(S_n\leq x)=\mathbb{P}(G\leq x) \quad\text{for all }x\in\mathbb{R}. \]

Moreover, the Feller criterion states that if

\[ \lim_{n\rightarrow\infty}\max_{1\leq k\leq n}\frac{\sigma_k^2}{s_n^2}=0 \]

then the Lindeberg condition is necessary and sufficient for the convergence of \( {(S_n)_{n\geq1}} \) in distribution to the standard Gaussian \( {\mathcal{N}(0,1)} \). The Feller condition means that each single variance \( {\sigma_k^2} \) represents an asymptotically negligible portion of the total variance \( {s_n^2} \), as \( {n} \) goes to infinity. In other words, the total variance is spread as \( {n} \) goes to infinity.
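To get a feeling for the Feller condition, here is a minimal Python sketch that computes the ratio \( {\max_{1\leq k\leq n}\sigma_k^2/s_n^2} \) for two illustrative sequences. The function name `feller_ratio` and the two sequences (\( {\sigma_k=1} \) and \( {\sigma_k=2^k} \)) are assumptions chosen for this example; they are not part of the statement above.

```python
def feller_ratio(sigmas):
    """Return max_k sigma_k^2 / s_n^2 for a finite sequence of standard deviations."""
    variances = [s * s for s in sigmas]
    return max(variances) / sum(variances)


# sigma_k = 1: the variance is spread evenly over the n terms.
flat = [1.0] * 1000

# sigma_k = 2^k (hypothetical choice): the last term dominates the total variance.
geometric = [2.0 ** k for k in range(1, 31)]

print(feller_ratio(flat))       # 1/n, here 0.001
print(feller_ratio(geometric))  # approximately 0.75
```

For the constant sequence the ratio equals \( {1/n} \) and tends to zero, whereas for the geometric sequence the last variance carries about three quarters of the total, so the Feller condition fails.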

On the contrary, and quite intuitively, one can guess that if \( {(\sigma_n)_{n\geq1}} \) is localized then \( {S_n} \) is very close to the sum of a few terms for arbitrarily large \( {n} \), and the CLT may fail due to a lack of averaging (homogenization). Of course, if the sequence \( {(\sigma_n)_{n\geq1}} \) is sparse (extreme localization!), i.e. \( {\mathrm{Card}\{n\geq1:\sigma_n\neq0\}<\infty} \), then the CLT fails (unless the finitely many nonzero terms are themselves Gaussian). Beyond sparsity, let us look for a more subtle example for which one can check immediately from scratch that the CLT fails. Let us pick a sequence \( {(\sigma_n)_{n\geq1}} \) of positive real numbers, and a sequence \( {(U_n)_{n\geq1}} \) of bounded i.i.d. random variables on \( {[-c,c]} \) with mean \( {0} \) and variance \( {1} \). If we define the random variable \( {X_k:=\sigma_kU_k} \) then

\[ \mathbb{E}(X_k)=0 \quad\text{and}\quad \mathbb{E}(X_k^2)=\sigma_k^2 \]

and

\[ c^{-1}|S_n|\leq \frac{\left\Vert(\sigma_1,\ldots,\sigma_n)\right\Vert_1} {\left\Vert(\sigma_1,\ldots,\sigma_n)\right\Vert_2} =:\rho_n. \]
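As a quick numerical illustration of the ratio \( {\rho_n} \) just defined, here is a minimal Python sketch. The function name `norms_ratio` and the three test vectors (constant, summable geometric, single nonzero entry) are assumptions made for this example only.

```python
import math


def norms_ratio(sigmas):
    """Return rho_n = ||sigmas||_1 / ||sigmas||_2 for a nonzero finite sequence."""
    l1 = sum(abs(s) for s in sigmas)
    l2 = math.sqrt(sum(s * s for s in sigmas))
    return l1 / l2


n = 100
flat = [1.0] * n                                  # sigma_k = 1
summable = [2.0 ** -k for k in range(1, n + 1)]   # sigma_k = 2^{-k}
one_hot = [0.0] * (n - 1) + [1.0]                 # a single nonzero entry

for name, sigmas in [("flat", flat), ("summable", summable), ("one_hot", one_hot)]:
    print(name, norms_ratio(sigmas))
```

In general \( {1\leq\rho_n\leq\sqrt{n}} \): the lower bound is attained by a vector with a single nonzero entry and the upper bound by a constant vector. For the geometric sequence the ratio stays close to \( {\sqrt{3}} \) whatever the value of \( {n} \).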

Now, if \( {(\rho_n)_{n\geq1}} \) is bounded then \( {(S_n)_{n\geq1}} \) is bounded and thus the CLT fails. The norms-ratio \( {\rho_n} \) measures the delocalization of the vector \( {(\sigma_1,\ldots,\sigma_n)} \). Note that \( {(\rho_n)_{n\geq1}} \) is bounded if \( {(\sigma_n)_{n\geq1}} \) grows too fast or decays too fast. For instance, \( {(\rho_n)_{n\geq1}} \) is bounded if \( {(\sigma_n)_{n\geq1}\in\ell^1} \). On the other hand, since \( {s_n^2=s_{n-1}^2+\sigma_n^2} \), the Cauchy-Schwarz inequality gives

\[ \rho_n = \frac{1}{s_n}\sum_{k=1}^{n-1}\sigma_k +\frac{\sigma_n}{s_n} \leq \sqrt{n-1}\,\frac{s_{n-1}}{\sigma_n}+1 \]

and \( {(\rho_n)_{n\geq1}} \) is bounded e.g. if \( {\sigma_n=s_{n-1}\sqrt{n}} \). Control of delocalization is an essential aspect of the CLT. The Berry-Esseen theorem, which constitutes a quantitative CLT, also involves a norms-ratio measuring localization: if \( {(X_n)_{n\geq1}} \) are independent real random variables with

\[ \mathbb{E}(X_n)=0 \quad\text{and}\quad \sigma^2_n:=\mathbb{E}(X_n^2) \quad\text{and}\quad \tau_n^3:=\mathbb{E}(|X_n|^3) \]

and if \( {V_n:=(X_1,\ldots,X_n)} \) and \( {S_n} \) is defined from \( {(X_n)_{n\geq1}} \) as before, then for all \( {n\geq1} \),

\[ \sup_{x\in\mathbb{R}} \left|\mathbb{P}\left(S_n\leq x\right)-\mathbb{P}(G\leq x)\right| \leq 6\frac{\mathbb{E}(\left\Vert V_n\right\Vert_3^3)} {\mathbb{E}(\left\Vert V_n\right\Vert_2^2)^{3/2}} = 6\frac{\tau_1^3+\cdots+\tau_n^3}{(\sigma_1^2+\cdots+\sigma_n^2)^{3/2}}. \]
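To see how this bound reacts to localization, here is a minimal Python sketch that evaluates its right-hand side \( {6(\tau_1^3+\cdots+\tau_n^3)/(\sigma_1^2+\cdots+\sigma_n^2)^{3/2}} \) for \( {X_k=\sigma_kU_k} \). It assumes, for illustration only, that the \( {U_k} \) are symmetric random signs \( {\pm1} \), so that \( {\tau_k^3=\sigma_k^3} \). The function names `berry_esseen_bound` and `localized_sigmas` are hypothetical, and the localized sequence is the example \( {\sigma_n=s_{n-1}\sqrt{n}} \) considered above, started from \( {\sigma_1=1} \).

```python
import math


def berry_esseen_bound(sigmas, third_abs_moment=1.0, constant=6.0):
    """Right-hand side of the bound for X_k = sigma_k * U_k.

    Here tau_k^3 = sigma_k^3 * E|U|^3, and E|U|^3 = 1 for random signs
    (an assumption made for this illustration).
    """
    numerator = third_abs_moment * sum(s ** 3 for s in sigmas)
    denominator = sum(s * s for s in sigmas) ** 1.5
    return constant * numerator / denominator


def localized_sigmas(n):
    """sigma_1 = 1 and sigma_k = s_{k-1} * sqrt(k) for 2 <= k <= n."""
    sigmas = [1.0]
    s2 = 1.0  # running value of s_k^2
    for k in range(2, n + 1):
        sigma = math.sqrt(s2 * k)
        sigmas.append(sigma)
        s2 += sigma * sigma
    return sigmas


for n in (10, 20, 40):
    flat = berry_esseen_bound([1.0] * n)                 # equals 6 / sqrt(n)
    localized = berry_esseen_bound(localized_sigmas(n))  # stays above 5
    print(n, flat, localized)
```

For the constant sequence the bound decays like \( {6/\sqrt{n}} \), while for the localized sequence it is at least \( {6(n/(n+1))^{3/2}} \), hence larger than \( {1} \), and carries no information.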

You may take a look at the recent work of Klartag and Sodin on the role of delocalization in the Berry-Esseen theorem.

Measuring (de)localization with norms-ratios is a classical trick in mathematics and physics. It plays a role, for instance, for eigenvectors in the formalization of the Anderson localization phenomenon for random Schrödinger operators, and in the recent work of Erdős, Schlein, Ramírez, Yau, Tao, and Vu on the universality of eigenvalue spacings for models of random matrices. The norms-ratio is also related to embeddings in the local theory of Banach spaces.

This post was inspired by a question asked by my friend Sébastien Blachère.

Categories: Probability, Statistics