
This tiny back to basics post is devoted to a couple of bits of Probability and Statistics.
The central limit theorem cannot hold in probability. Let $(X_n)_{n\geq1}$ be iid real random variables with zero mean and unit variance. The central limit theorem (CLT) states that as $n\to\infty$,
$$Z_n=\frac{X_1+\cdots+X_n}{\sqrt{n}}\overset{\text{law}}{\longrightarrow}\mathcal{N}(0,1).$$
A question frequently asked by good students is whether one can replace the convergence in law by the (stronger) convergence in probability. The answer is negative, and in particular the convergence cannot hold almost surely or in $L^p$. Let us examine why. Recall that convergence in probability is stable under linear combinations and under subsequence extraction.
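We proceed by contradiction. Suppose that $Z_n\to Z_\infty$ in probability. Then necessarily $Z_\infty\sim\mathcal{N}(0,1)$. Now, on the one hand, $Z_{2n}-Z_n\to0$ in probability, while
$$Z_{2n}-Z_n=\frac{1-\sqrt{2}}{\sqrt{2}}Z_n+\frac{X_{n+1}+\cdots+X_{2n}}{\sqrt{2n}}=\frac{1-\sqrt{2}}{\sqrt{2}}Z_n+\frac{1}{\sqrt{2}}Z'_n.$$
On the other hand, $Z'_n=(X_{n+1}+\cdots+X_{2n})/\sqrt{n}$ is an independent copy of $Z_n$. Thus the CLT used twice gives $Z_{2n}-Z_n\overset{\text{law}}{\longrightarrow}\mathcal{N}(0,\sigma^2)$ with $\sigma^2=(1-\sqrt{2})^2/2+1/2=2-\sqrt{2}\neq0$, hence the contradiction.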
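The non-degenerate limit of $Z_{2n}-Z_n$ can be observed numerically. The following minimal sketch is only an illustration under assumptions that are not in the text above: the $X_i$ are taken to be random signs $\pm1$ (zero mean, unit variance), and the names `rademacher_sum`, `sample_gap` and `gap_variance` are hypothetical helpers. It estimates $\mathbb{E}[(Z_{2n}-Z_n)^2]$ by Monte Carlo and compares it with $2-\sqrt{2}$.

```python
import math
import random


def rademacher_sum(rng, n):
    """Sum of n iid signs +1/-1 (assumed law of the X_i), drawn via n random bits."""
    ones = bin(rng.getrandbits(n)).count("1")
    return 2 * ones - n


def sample_gap(rng, n):
    """Draw one realisation of Z_{2n} - Z_n."""
    s_n = rademacher_sum(rng, n)          # X_1 + ... + X_n
    s_2n = s_n + rademacher_sum(rng, n)   # add X_{n+1} + ... + X_{2n}
    return s_2n / math.sqrt(2 * n) - s_n / math.sqrt(n)


def gap_variance(n=1000, trials=20000, seed=0):
    """Monte Carlo estimate of E[(Z_{2n} - Z_n)^2] (the mean of the gap is zero)."""
    rng = random.Random(seed)
    return sum(sample_gap(rng, n) ** 2 for _ in range(trials)) / trials


if __name__ == "__main__":
    print("Monte Carlo estimate:", gap_variance())
    print("2 - sqrt(2)         :", 2 - math.sqrt(2))
```

The estimate should stay close to $2-\sqrt{2}\approx0.586$ whatever the value of $n$, instead of shrinking to zero as convergence in probability would require.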
Alternative proof. Set $S_n=X_1+\cdots+X_n$, and observe that
$$\frac{S_{2n}-S_n}{\sqrt{n}}=\sqrt{2}Z_{2n}-Z_n.$$
Now, if the CLT held in probability, the right hand side would converge in probability to $\sqrt{2}Z_\infty-Z_\infty$, which follows the law $\mathcal{N}(0,(\sqrt{2}-1)^2)$. On the other hand, since $S_{2n}-S_n$ has the law of $S_n$, by the CLT, the left hand side converges in law to $\mathcal{N}(0,1)$, the law of $Z_\infty$, hence the contradiction. This "reversed" proof was kindly suggested by Michel Ledoux.
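Yet another proof. If $Z_n\to Z$ in probability then $Z_n\to Z$ in $L^1$ since $(Z_n)_n$ is uniformly integrable (it is bounded in $L^2$), but this convergence in $L^1$ is impossible since $(Z_n)_n$ does not satisfy the Cauchy criterion in $L^1$: the sequence $(Z_{2n}-Z_n)_n$ is bounded in $L^2$ and converges in law to $\mathcal{N}(0,2-\sqrt{2})$ as above, so that $\mathbb{E}|Z_{2n}-Z_n|$ converges to a positive limit.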
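The failure of the Cauchy criterion can also be checked by simulation. The sketch below, again assuming random signs $\pm1$ for the $X_i$ (an illustrative choice, with the hypothetical helper name `mean_abs_gap`), estimates $\mathbb{E}|Z_{2n}-Z_n|$ for several $n$ and compares it with $\mathbb{E}|G|=\sqrt{2\sigma^2/\pi}$ for $G\sim\mathcal{N}(0,\sigma^2)$ and $\sigma^2=2-\sqrt{2}$.

```python
import math
import random

# E|G| for G ~ N(0, 2 - sqrt(2)), the limiting law of Z_{2n} - Z_n.
LIMIT = math.sqrt(2 * (2 - math.sqrt(2)) / math.pi)


def rademacher_sum(rng, n):
    """Sum of n iid signs +1/-1 (assumed law of the X_i), drawn via n random bits."""
    return 2 * bin(rng.getrandbits(n)).count("1") - n


def mean_abs_gap(n, trials=20000, seed=1):
    """Monte Carlo estimate of E|Z_{2n} - Z_n|."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        s_n = rademacher_sum(rng, n)
        s_2n = s_n + rademacher_sum(rng, n)
        total += abs(s_2n / math.sqrt(2 * n) - s_n / math.sqrt(n))
    return total / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(mean_abs_gap(n), 3), "limit:", round(LIMIT, 3))
```

The printed estimates should hover around the same positive value (about $0.61$) as $n$ grows, rather than decrease to zero.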
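Intermezzo: Slutsky lemma. The Slutsky lemma asserts that if
$$X_n\overset{\text{law}}{\longrightarrow}X\quad\text{and}\quad Y_n\overset{\text{law}}{\longrightarrow}c$$
with $c$ constant, then
$$(X_n,Y_n)\overset{\text{law}}{\longrightarrow}(X,c),$$
and in particular, $f(X_n,Y_n)\overset{\text{law}}{\longrightarrow}f(X,c)$ for every continuous $f$.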
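Let us prove it. Since $Y_n\overset{\text{law}}{\longrightarrow}c$ and $c$ is constant, we have $Y_n\to c$ in probability, and since for all $t\in\mathbb{R}$, the function $y\mapsto e^{ity}$ is uniformly continuous on $\mathbb{R}$, we have that for all $s,t\in\mathbb{R}$ and all $\varepsilon>0$, there exists $\eta>0$ such that for large enough $n$,
$$|\mathbb{E}(e^{isX_n+itY_n})-\mathbb{E}(e^{isX_n+itc})|\leq\mathbb{E}(|e^{itY_n}-e^{itc}|\mathbf{1}_{|Y_n-c|\leq\eta})+2\mathbb{P}(|Y_n-c|>\eta)\leq\varepsilon+2\varepsilon.$$
Alternatively we can use the Lipschitz property instead of the uniform continuity:
$$|\mathbb{E}(e^{isX_n+itY_n})-\mathbb{E}(e^{isX_n+itc})|\leq\mathbb{E}(|e^{itY_n}-e^{itc}|\mathbf{1}_{|Y_n-c|\leq\eta})+2\mathbb{P}(|Y_n-c|>\eta)\leq|t|\eta+2\varepsilon.$$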
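On the other hand, since $X_n\overset{\text{law}}{\longrightarrow}X$, we have, for all $s,t\in\mathbb{R}$, as $n\to\infty$,
$$\mathbb{E}(e^{isX_n+itc})=e^{itc}\mathbb{E}(e^{isX_n})\longrightarrow e^{itc}\mathbb{E}(e^{isX})=\mathbb{E}(e^{isX+itc}).$$
Combining the two facts, the characteristic function of $(X_n,Y_n)$ converges pointwise to the one of $(X,c)$, and the Lévy continuity theorem gives the desired convergence in law.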
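A classical use of the Slutsky lemma is studentization. The sketch below is a hedged illustration, not part of the argument above: it assumes random signs $\pm1$ for the data, takes $X_n=S_n/\sqrt{n}$ (which converges in law to $\mathcal{N}(0,1)$), $Y_n$ the empirical standard deviation (which converges in probability to $c=1$), and $f(x,y)=x/y$. The names `studentized_mean` and `simulate` are hypothetical.

```python
import math
import random
import statistics


def studentized_mean(rng, n):
    """Return f(X_n, Y_n) = X_n / Y_n for n iid signs +1/-1 (assumed law)."""
    s_n = 2 * bin(rng.getrandbits(n)).count("1") - n
    mean = s_n / n
    x_n = s_n / math.sqrt(n)             # converges in law to N(0, 1)
    y_n = math.sqrt(1.0 - mean * mean)   # empirical std, since each X_i^2 = 1
    return x_n / y_n


def simulate(n=1000, trials=20000, seed=2):
    """Draw independent copies of the studentized mean."""
    rng = random.Random(seed)
    return [studentized_mean(rng, n) for _ in range(trials)]


if __name__ == "__main__":
    samples = simulate()
    inside = sum(abs(t) <= 1.96 for t in samples) / len(samples)
    print("empirical variance:", statistics.pvariance(samples))
    print("P(|T_n| <= 1.96)  :", inside)
```

If the lemma applies as expected, the empirical variance should be close to $1$ and the proportion of samples in $[-1.96,1.96]$ close to $0.95$, as for $f(X,c)=X\sim\mathcal{N}(0,1)$.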
The delta-method. Bizarrely this basic result, very useful in Statistics, appears to be unknown to many young probabilists. Suppose that as $n\to\infty$,
$$a_n(Z_n-b_n)\overset{\text{law}}{\longrightarrow}L,$$
where $(Z_n)_{n\geq1}$ is a sequence of real random variables, $L$ a probability distribution, and $(a_n)_{n\geq1}$ and $(b_n)_{n\geq1}$ deterministic sequences such that $a_n\to\infty$ and $b_n\to b$. Then for any $\mathcal{C}^1$ function $f:\mathbb{R}\to\mathbb{R}$ such that $f'(b)\neq0$, we have
$$\frac{a_n}{f'(b)}(f(Z_n)-f(b_n))\overset{\text{law}}{\longrightarrow}L.$$
The typical usage in Statistics is for the fluctuations of estimators, say for $a_n(Z_n-b_n)=\sqrt{n}(\hat{\theta}_n-\theta)$. Note that the rate in $n$ and the fluctuation law are not modified! Let us give a proof. By a Taylor formula, or here the mean value theorem,
$$f(Z_n)-f(b_n)=f'(W_n)(Z_n-b_n)$$
where $W_n$ is a random variable lying between $b_n$ and $Z_n$. Since $a_n\to\infty$, the Slutsky lemma gives $Z_n-b_n\to0$ in law, and thus in probability since the limit is deterministic. As a consequence $W_n-b_n\to0$ in probability and thus $W_n\to b$ in probability. The continuity of $f'$ at point $b$ provides $f'(W_n)\to f'(b)$ in probability, hence $f'(W_n)/f'(b)\to1$ in probability, and again by the Slutsky lemma,
$$\frac{a_n}{f'(b)}(f(Z_n)-f(b_n))=\frac{f'(W_n)}{f'(b)}a_n(Z_n-b_n)\overset{\text{law}}{\longrightarrow}L.$$
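Here is a small numerical illustration of the delta-method, under assumptions chosen only for the example: $Z_n=\hat{p}_n$ is the empirical frequency of heads in $n$ fair coin tosses, so that $a_n=\sqrt{n}$, $b_n=b=1/2$ and $L=\mathcal{N}(0,1/4)$, and $f(x)=x^3$ with $f'(b)=3/4$. The names `normalized_fluctuation` and `simulate` are hypothetical.

```python
import math
import random
import statistics

B = 0.5  # b_n = b = 1/2 for every n in this example


def f(x):
    """Smooth transformation applied to the estimator (illustrative choice)."""
    return x ** 3


def f_prime(x):
    """Derivative of f; f_prime(B) = 3/4 is non-zero."""
    return 3 * x ** 2


def normalized_fluctuation(rng, n):
    """One draw of a_n / f'(b) * (f(Z_n) - f(b_n)) with a_n = sqrt(n)."""
    p_hat = bin(rng.getrandbits(n)).count("1") / n  # frequency of heads
    return math.sqrt(n) * (f(p_hat) - f(B)) / f_prime(B)


def simulate(n=2000, trials=20000, seed=3):
    """Draw independent copies of the normalized fluctuation."""
    rng = random.Random(seed)
    return [normalized_fluctuation(rng, n) for _ in range(trials)]


if __name__ == "__main__":
    samples = simulate()
    print("empirical mean    :", statistics.fmean(samples))
    print("empirical variance:", statistics.pvariance(samples))
    print("variance of L     :", 0.25)
```

The empirical variance should be close to $1/4$, the variance of $L$: dividing by $f'(b)$ restores the fluctuation law of $\sqrt{n}(\hat{p}_n-1/2)$.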
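If $f'(b)=0$ then one has to use a higher order Taylor formula, and the rate and fluctuation will be deformed by a power. Namely, suppose for simplicity that $b_n=b$ for all $n$, that $f$ is $\mathcal{C}^r$, and that $f^{(1)}(b)=\cdots=f^{(r-1)}(b)=0$ while $f^{(r)}(b)\neq0$. Then, denoting by $L_r$ the push forward of $L$ by $x\mapsto x^r$, we get
$$\frac{a_n^r\,r!}{f^{(r)}(b)}(f(Z_n)-f(b_n))\overset{\text{law}}{\longrightarrow}L_r.$$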
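The degenerate case can be illustrated in the same spirit. In the sketch below, all choices are assumptions made for the example: $Z_n=\hat{p}_n$ is the frequency of heads in $n$ fair coin tosses, $a_n=\sqrt{n}$, $b_n=b=1/2$ for every $n$, $L=\mathcal{N}(0,1/4)$, and $f(x)=\cos(x-1/2)$, for which $f'(b)=0$ and $f''(b)=-1$, so that $r=2$. The limit $L_2$ is then the law of $G^2$ with $G\sim\mathcal{N}(0,1/4)$, which has mean $1/4$ and is supported on $[0,\infty)$.

```python
import math
import random
import statistics

B = 0.5          # b_n = b = 1/2 for every n in this example
R = 2            # first order at which the derivative of f at b is non-zero
F_R_AT_B = -1.0  # f''(b) = -cos(0) for the illustrative f below


def f(x):
    """Illustrative function with f'(B) = 0 and f''(B) = -1."""
    return math.cos(x - B)


def normalized_fluctuation(rng, n):
    """One draw of a_n^r * r! / f^(r)(b) * (f(Z_n) - f(b_n)) with a_n = sqrt(n)."""
    p_hat = bin(rng.getrandbits(n)).count("1") / n  # frequency of heads
    a_n = math.sqrt(n)
    return a_n ** R * math.factorial(R) / F_R_AT_B * (f(p_hat) - f(B))


def simulate(n=2000, trials=20000, seed=4):
    """Draw independent copies of the normalized fluctuation."""
    rng = random.Random(seed)
    return [normalized_fluctuation(rng, n) for _ in range(trials)]


if __name__ == "__main__":
    samples = simulate()
    print("empirical mean:", statistics.fmean(samples))
    print("mean of L_2   :", 0.25)
    print("minimum       :", min(samples))
```

Note that the rate is now $a_n^2=n$ instead of $\sqrt{n}$, and that the simulated values are non-negative, in line with a limit obtained by squaring.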
The delta-method can of course be generalized to sequences of random vectors, etc.