The usual proof of the Central Limit Theorem (CLT) is based on characteristic functions (Fourier transform). This tiny post is devoted to a less usual yet useful approach by coupling.
Lindeberg replacement principle (based on coupling). Let $X_1,\ldots,X_n$ be independent real random variables with $\mathbb{E}(|X_k|^3)<\infty$ for all $k$. Set $$m_k:=\mathbb{E}(X_k),\quad\sigma^2_k:=\mathbb{E}(|X_k-m_k|^2),\quad\tau_k^3:=\mathbb{E}(|X_k-m_k|^3).$$ Let $Y_1,\ldots,Y_n$ be independent random variables, independent of $X_1,\ldots,X_n$, such that $$Y_k\sim\mathcal{N}(m_k,\sigma^2_k).$$ Then for all $f\in\mathcal{C}^3(\mathbb{R},\mathbb{R})$ with $f,f',f'',f^{(3)}$ bounded, $$|\mathbb{E}(f(X_1+\cdots+X_n))-\mathbb{E}(f(Y_1+\cdots+Y_n))|\leq(\tau_1^3+\cdots+\tau_n^3)\frac{{\|f^{(3)}\|}_\infty}{2}.$$
This Lindeberg coupling inequality, applied to the rescaled variables $(X_k-m_k)/\sqrt{\sigma_1^2+\cdots+\sigma_n^2}$, whose Gaussian counterparts sum to a standard Gaussian, implies immediately that if $$S_n:=\frac{X_1-m_1+\cdots+X_n-m_n}{\sqrt{\sigma_1^2+\cdots+\sigma_n^2}}\quad\text{and}\quad G\sim\mathcal{N}(0,1)$$ then for all $f\in\mathcal{C}^3(\mathbb{R},\mathbb{R})$ with $f,f',f'',f^{(3)}$ bounded, $$|\mathbb{E}(f(S_n))-\mathbb{E}(f(G))|\leq\frac{\tau_1^3+\cdots+\tau_n^3}{\sqrt{\sigma_1^2+\cdots+\sigma_n^2}^3}\frac{{\|f^{(3)}\|}_\infty}{2}.$$
In the iid case, $m_k$, $\sigma_k$, and $\tau_k$ no longer depend on $k$, and $$\frac{\tau_1^3+\cdots+\tau_n^3}{\sqrt{\sigma_1^2+\cdots+\sigma_n^2}^3}=\frac{\tau^3}{\sigma^3\sqrt{n}}.$$ We thus obtain a quantitative (non-asymptotic) version of the CLT, in the spirit of the Berry-Esseen inequality. In terms of asymptotic analysis, this also leads in particular to the classical iid CLT under finite third moment, by approximating indicators by smooth functions: for all $x\in\mathbb{R}$ and all $\varepsilon>0$, there exist $f_\varepsilon,g_\varepsilon\in\mathcal{C}^3(\mathbb{R},\mathbb{R})$, bounded together with their first three derivatives, such that $$\mathbf{1}_{(-\infty,x-\varepsilon]}\leq f_\varepsilon\leq\mathbf{1}_{(-\infty,x]}\leq g_\varepsilon\leq\mathbf{1}_{(-\infty,x+\varepsilon]}.$$
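As a sanity check of the iid bound, here is a minimal Python sketch for one case where everything is explicit. It assumes $X_k=\pm1$ with probability $1/2$ (so $m=0$, $\sigma=1$, $\tau^3=1$) and the test function $f=\cos$ (so ${\|f^{(3)}\|}_\infty=1$). Both expectations then have closed forms, $\mathbb{E}(\cos(S_n))=\cos(1/\sqrt{n})^n$ and $\mathbb{E}(\cos(G))=e^{-1/2}$. The function name is hypothetical, chosen for this illustration only.

```python
import math


def rademacher_gap_and_bound(n):
    """Return (gap, bound) for iid symmetric +-1 variables and f = cos.

    gap   = |E cos(S_n) - E cos(G)|, computed in closed form
    bound = tau^3 / (sigma^3 sqrt(n)) * ||f'''||_inf / 2 = 1 / (2 sqrt(n))
    """
    # E cos(S_n) = cos(1/sqrt(n))^n by independence and symmetry of the X_k.
    e_cos_sn = math.cos(1.0 / math.sqrt(n)) ** n
    # E cos(G) = exp(-1/2) for G standard Gaussian.
    e_cos_g = math.exp(-0.5)
    gap = abs(e_cos_sn - e_cos_g)
    bound = 1.0 / (2.0 * math.sqrt(n))
    return gap, bound


if __name__ == "__main__":
    for n in (1, 4, 25, 100, 10000):
        gap, bound = rademacher_gap_and_bound(n)
        print(n, gap, bound)
```

In this symmetric example the gap is much smaller than the bound, since the third moments of $X_k$ and $Y_k$ also match. The inequality is a worst-case statement over all laws with the given $\sigma$ and $\tau$.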
Let us prove the Lindeberg coupling inequality above. Replacing $f$ by $f(\cdot+m_1+\cdots+m_n)$, whose third derivative has the same supremum norm, we can assume without loss of generality that $m_k=0$ for all $1\leq k\leq n$. The idea now is to replace, in the sum $X_1+\cdots+X_n$, the $X_k$ by the $Y_k$, step by step. Namely, introducing $$Z_k:=X_1+\cdots+X_{k-1}+Y_{k+1}+\cdots+Y_n,$$ we get the telescopic sum $$f(X_1+\cdots+X_n)-f(Y_1+\cdots+Y_n)=\sum_{k=1}^n(f(Z_k+X_k)-f(Z_k+Y_k)).$$ Now, the Taylor-Lagrange formula applied at $Z_k$ at order $2$ gives $$f(Z_k+X_k)=f(Z_k)+f'(Z_k)X_k+f''(Z_k)\frac{X_k^2}{2!}+f^{(3)}(A_k)\frac{X_k^3}{3!}$$ where $A_k$ lies between $Z_k$ and $Z_k+X_k$. Similarly, $$f(Z_k+Y_k)=f(Z_k)+f'(Z_k)Y_k+f''(Z_k)\frac{Y_k^2}{2!}+f^{(3)}(B_k)\frac{Y_k^3}{3!}$$ where $B_k$ lies between $Z_k$ and $Z_k+Y_k$. Taking the expectation, using the independence of $(X_k,Y_k)$ and $Z_k$, and using the fact that the first two moments of $X_k$ and $Y_k$ match, the terms of order $0$, $1$, and $2$ cancel, and we get $$|\mathbb{E}(f(Z_k+X_k))-\mathbb{E}(f(Z_k+Y_k))|\leq\frac{{\|f^{(3)}\|}_\infty}{3!}\mathbb{E}(|X_k|^3+|Y_k|^3).$$ It remains to note that $Y_k$ has the law of $\sigma_kG_k$ where $G_k\sim\mathcal{N}(0,1)$, hence, using $\mathbb{E}(|G_k|^3)=2\sqrt{2/\pi}\leq2$ and the Jensen inequality, $$\mathbb{E}(|Y_k|^3)=\mathbb{E}(|X_k|^2)^{3/2}\mathbb{E}(|G_k|^3)\leq2\mathbb{E}(|X_k|^3).$$ The $k$-th term of the telescopic sum is thus at most $\frac{1+2}{3!}\tau_k^3{\|f^{(3)}\|}_\infty=\frac{\tau_k^3}{2}{\|f^{(3)}\|}_\infty$ in expectation, and summing over $k$ gives the desired inequality.
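The swapping step is purely algebraic and holds for every realization, which can be checked numerically. The following Python sketch assumes $n=8$, symmetric $\pm1$ values for the $X_k$, standard Gaussian values for the $Y_k$, and $f=\cos$. The helper name `telescoping_terms` is hypothetical.

```python
import math
import random


def telescoping_terms(f, xs, ys):
    """Return the n terms f(Z_k + X_k) - f(Z_k + Y_k), for k = 1, ..., n."""
    terms = []
    for k in range(len(xs)):
        # Z_k = X_1 + ... + X_{k-1} + Y_{k+1} + ... + Y_n (0-based slices here).
        z_k = sum(xs[:k]) + sum(ys[k + 1:])
        terms.append(f(z_k + xs[k]) - f(z_k + ys[k]))
    return terms


rng = random.Random(0)  # fixed seed, for reproducibility only
xs = [rng.choice([-1.0, 1.0]) for _ in range(8)]
ys = [rng.gauss(0.0, 1.0) for _ in range(8)]

# Left-hand side: f(X_1 + ... + X_n) - f(Y_1 + ... + Y_n).
lhs = math.cos(sum(xs)) - math.cos(sum(ys))
# Right-hand side: sum of the one-by-one replacement terms.
rhs = sum(telescoping_terms(math.cos, xs, ys))
```

Up to floating-point rounding, `lhs` and `rhs` agree. The probabilistic content of the proof lies entirely in bounding the expectation of each term.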
For sums of independent square integrable random variables, beyond the finite third moment condition, the CLT holds as soon as the Lindeberg truncation condition holds: for all $\varepsilon>0$, $$\lim_{n\to\infty}\frac{1}{\sigma_1^2+\cdots+\sigma_n^2}\sum_{k=1}^n\mathbb{E}\Bigl((X_k-m_k)^2\mathbf{1}_{\{|X_k-m_k|\geq\varepsilon\sqrt{\sigma_1^2+\cdots+\sigma_n^2}\}}\Bigr)=0.$$ This holds in particular if the Lyapunov moment condition is satisfied: for some $p>1$, $$\lim_{n\to\infty}\frac{1}{{(\sigma_1^2+\cdots+\sigma_n^2)}^{p}}\sum_{k=1}^n\mathbb{E}(|X_k-m_k|^{2p})=0.$$ These versions of the CLT for sums of independent random variables extend to martingales.
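As a quick illustration of the Lyapunov condition, consider the iid case, assuming $\mathbb{E}(|X_1|^{2p})<\infty$ for some $p>1$ and writing $m$ and $\sigma^2>0$ for the common mean and variance. The ratio is then explicit: $$\frac{1}{{(n\sigma^2)}^{p}}\sum_{k=1}^n\mathbb{E}(|X_k-m|^{2p})=\frac{\mathbb{E}(|X_1-m|^{2p})}{\sigma^{2p}}\,n^{1-p}\underset{n\to\infty}{\longrightarrow}0.$$ For $p=3/2$ this is exactly the quantity $\frac{\tau^3}{\sigma^3\sqrt{n}}$ appearing in the quantitative bound obtained from the Lindeberg coupling inequality.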
Comments. The replacement principle can be used beyond the realm of sums of independent random variables, typically for nonlinear stochastic models involving independent ingredients, for instance for the high dimensional asymptotic analysis of the eigenvalues of random matrices.
Further reading.
- Feller, William, An Introduction to Probability Theory and Its Applications, Wiley (1968)
- Billingsley, Patrick, Probability and Measure, Wiley (1995)
- Chatterjee, Sourav, A generalization of the Lindeberg principle, Annals of Probability (2006)
- Tao, Terence, Least singular value, circular law, and Lindeberg exchange, Random matrices, American Mathematical Society (2019)
- On this blog, When the central limit theorem fails… Sparsity and localization (2010)