The empirical method of Maurey – Libres pensées d'un mathématicien ordinaire

A famous theorem due to Constantin Carathéodory states that if \( {A\subset\mathbb{R}^d} \) then every point \( {x} \) in the convex hull of \( {A} \) is the mean of a discrete probability measure supported by at most \( {d+1} \) atoms in \( {A} \). In other words, there exist \( {x_1,\ldots,x_{d+1}\in A} \) and \( {\lambda_1,\ldots,\lambda_{d+1}\in[0,1]} \) such that \( {\lambda_1+\cdots+\lambda_{d+1}=1} \) and

\[ x=\sum_{i=1}^{d+1}\lambda_ix_i. \]

The simplex case \( {A=\{0,e_1,\ldots,e_d\}} \), where \( {e_1,\ldots,e_d} \) is the canonical basis of \( {\mathbb{R}^d} \), shows that one cannot replace \( {d+1} \) by a smaller number in general. The proof is not difficult. Namely, since \( {x} \) belongs to the convex hull of \( {A} \), it is the convex combination of say \( {n} \) points \( {x_1,\ldots,x_n} \) of \( {A} \), namely \( {x=\sum_{i=1}^n\lambda_ix_i} \) with \( {\lambda_1,\ldots,\lambda_n\in[0,1]} \) and \( {\lambda_1+\cdots+\lambda_n=1} \). Now if \( {n>d+1} \) then the \( {n-1} \) vectors \( {x_1-x_n,\ldots,x_{n-1}-x_n} \) of \( {\mathbb{R}^d} \) are linearly dependent, which means that \( {\sum_{i=1}^n\mu_ix_i=0} \) for some reals \( {\mu_1,\ldots,\mu_{n-1}} \) not all equal to zero, with \( {\mu_n:=-\sum_{i=1}^{n-1}\mu_i} \). This gives \( {x=\sum_{i=1}^n(\lambda_i+\theta\mu_i)x_i} \) for any real \( {\theta} \). Now we may select \( {\theta} \) in order to obtain an expression of \( {x} \) as a convex combination of only \( {n-1} \) points. The desired result follows then by repeating this argument \( {n-(d+1)} \) times. This proof is constructive as soon as we start with an explicit \( {x} \).

Following Bernard Maurey, one may drop the dependence over the dimension by relaxing the desired result: if \( {H} \) is a Hilbert space and \( {A\subset H} \) (say bounded for simplicity) then there exists a constant \( {c=c(A)} \) such that for every \( {x} \) in the convex hull of \( {A} \) and for every \( {n\geq1} \), there exists \( {x_1,\ldots,x_n\in A} \) such that

\[ \left\Vert x-\frac{1}{n}\sum_{i=1}^nx_i\right\Vert \leq \frac{c}{\sqrt{n}}. \]

The proof of Maurey is probabilistic. Namely, since \( {x} \) belongs to the convex hull of \( {A} \), there exists a probability measure \( {\mu} \) supported in \( {A} \) with mean \( {x} \). Let \( {(X_i)_{i\geq1}} \) be independent and identically distributed random variables with law \( {\mu} \). We have \( {\mathbb{E}(X_i)=x} \). For every \( {n\geq1} \),

\[ \mathbb{E}\left(\left\Vert x-\frac{1}{n}\sum_{i=1}^nX_i\right\Vert^2\right) =\frac{1}{n}\mathbb{E}(\left\Vert X_1-x\right\Vert^2) = \frac{c^2}{n}. \]

Therefore, there exists at least one \( {\omega} \) such that

\[ \left\Vert x-\frac{1}{n}\sum_{i=1}^nX_i(\omega)\right\Vert \leq \frac{c}{\sqrt{n}}. \]

It remains to take \( {x_i=X_i(\omega)} \). This proof is a simple instance of what is known as the empirical method. It is not constructive, as many probabilistic proofs. The method of Maurey has many useful applications such as covering numbers, see for instance Pisier, Remarques sur un résultat non publié de B. Maurey. Séminaire d’analyse fonctionnelle (“Maurey-Schwartz”, Polytechnique) (1980-1981), exp. no 5, p. 1-12. A recent application is given in Vershynin, How close is the sample covariance matrix to the actual covariance matrix?, arXiv:1004.3484 [math.PR] and in Gaïffas and Lecué, Sharp oracle inequalities for the prediction of a high-dimensional matrix, arXiv:1008.4886v1 [math.ST].

Bernard Maurey during the party after the defense of the Habilitation à Diriger des Recherches of Joseph Lehec in Paris-Dauphine – November, 15 2016.

Some other posts:

3 Comments

Nicolas 2011-06-28
Salut Djalil,
Je ne crois pas que la méthode de Maurey soit appliqué dans le papier de Srivastava et Vershynin (en tout cas je ne vois pas où). Tu as peut-être confondu avec le papier de Rudelson et Vershynin “On sparse reconstruction from Fourier and Gaussian measurements”.
Djalil Chafaï 2011-06-28
Salut Nicolas,
oui, je me suis trompé, c’était dans le papier d’avant que Vershynin a écrit tout seul. C’est corrigé. Merci d’avoir signalé cette coquille.
Djalil.
Yihong Wu 2016-03-15
Dear Djalil,
Thanks for the exposition. I have a couple of questions:
1. you mentioned “one may drop the dependence over the dimension by relaxing the desired result” but the constant c=c(A) depends on the set A hence also the dimension. I understand in the special case of simplex it turns out to be a constant. In general can c be made a dimension-independent?
2. from the proof the constant c depends on both A and x. When A is bounded it is clear that c can be bounded by radius of A. Is it possible to get round this? You mentioned “(say bounded for simplicity)” so I presume there is a way to drop the boundedness assumption.
Many thanks.
Yihong

This site uses Akismet to reduce spam. Learn how your comment data is processed.