I enjoy beautiful elementary mathematical proofs. I see them as small jewels that I collect from time to time. In this spirit, this post proposes probabilistic proofs of a couple of basic results.
Jensen inequality. The Jensen inequality states that if \( {X} \) is an integrable random variable on \( {\mathbb{R}^n} \) and \( {\varphi:\mathbb{R}^n\rightarrow\mathbb{R}} \) is a convex function such that \( {\varphi(X)} \) is integrable, then
\[ \varphi(\mathbb{E}(X))\leq \mathbb{E}(\varphi(X)). \]
To prove it, we start from the convexity of \( {\varphi} \), which gives, by a straightforward induction, for every integer \( {k\geq1} \) and every sequence \( {x_1,\ldots,x_k} \) in \( {\mathbb{R}^n} \),
\[ \varphi\left(\frac{x_1+\cdots+x_k}{k}\right) \leq \frac{\varphi(x_1)+\cdots+\varphi(x_k)}{k}. \]
Now we use the integrability of \( {X} \) and \( {\varphi(X)} \): we take \( {X_1,X_2,\ldots} \) independent and distributed as \( {X} \), we apply the strong law of large numbers to both sides, we use the continuity of \( {\varphi} \) (a finite convex function on \( {\mathbb{R}^n} \) is automatically continuous) for the left-hand side, and we conclude with the fact that if \( {\mathbb{P}(A)=\mathbb{P}(B)=1} \) then \( {A\cap B\neq\varnothing} \). I also appreciate the proof based on the equality for affine functions and on the variational expression of a convex function as the envelope of its tangent hyperplanes, together with the fact that the supremum of expectations is less than or equal to the expectation of the supremum.
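As a quick aside, here is a Monte Carlo illustration of this law-of-large-numbers argument, a minimal sketch in Python/NumPy; the choices of \( {\varphi=\exp} \) and of a standard Gaussian \( {X} \) are of course arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

phi = np.exp                     # a convex function on R
x = rng.normal(size=10**6)       # i.i.d. copies of X, standard Gaussian

lhs = phi(x.mean())              # phi of the empirical mean -> phi(E(X)) = 1
rhs = phi(x).mean()              # empirical mean of phi(X) -> E(phi(X)) = e^(1/2)

print(lhs, rhs)                  # roughly 1.0 <= 1.65
assert lhs <= rhs
```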
Schur-Hadamard product and cone of positive matrices. The Schur-Hadamard product of two square matrices \( {A,B\in\mathcal{M}_n(\mathbb{R})} \) is the matrix \( {A\circ B\in\mathcal{M}_n(\mathbb{R})} \) defined by
\[ (A\circ B)_{ij}:=A_{ij}B_{ij} \]
for every \( {1\leq i,j\leq n} \). This entrywise product is denoted .* in Matlab/Octave/Freemat/Scilab.
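In Python/NumPy, say, the same entrywise product is simply the `*` operator on arrays; a minimal illustrative sketch:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

print(A * B)   # [[ 5. 12.]
               #  [21. 32.]]  i.e. (A o B)_ij = A_ij * B_ij
```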
Obviously, the Schur-Hadamard product preserves the cone of symmetric matrices. It is, however, not obvious that if \( {A} \) and \( {B} \) are symmetric positive semidefinite (i.e. with nonnegative spectrum) then \( {A\circ B} \) is also symmetric positive semidefinite.
To prove this remarkable statement, let us recall that the set of symmetric positive semidefinite matrices coincides with the set of covariance matrices of random vectors (and even of Gaussian random vectors). Next, let us consider two independent centered random vectors \( {X} \) and \( {Y} \) of \( {\mathbb{R}^n} \) with respective covariance matrices \( {A} \) and \( {B} \). Now, the random vector \( {Z=X\circ Y} \) of \( {\mathbb{R}^n} \) defined by \( {Z_i:=X_iY_i} \) for every \( {1\leq i\leq n} \) has covariance matrix \( {A\circ B} \): indeed, since \( {X} \) and \( {Y} \) are independent and centered, \( {\mathbb{E}(Z_i)=\mathbb{E}(X_i)\mathbb{E}(Y_i)=0} \) and \( {\mathbb{E}(Z_iZ_j)=\mathbb{E}(X_iX_j)\mathbb{E}(Y_iY_j)=A_{ij}B_{ij}} \). The matrix \( {A\circ B} \) is thus necessarily symmetric positive semidefinite! Any simpler proof?
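For the skeptical reader, here is a quick numerical sanity check, a minimal sketch in Python/NumPy; building the positive semidefinite matrices as \( {MM^\top} \) is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

M = rng.normal(size=(n, n))
N = rng.normal(size=(n, n))
A = M @ M.T                    # symmetric positive semidefinite by construction
B = N @ N.T

C = A * B                      # Schur-Hadamard (entrywise) product
eigs = np.linalg.eigvalsh(C)   # spectrum of the symmetric matrix C

print(eigs)                    # all nonnegative, up to numerical precision
assert eigs.min() >= -1e-8
```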
The second proof is definitely really elegant. I wasn't aware that any scalar product can be realized as the covariance of random vectors; stated this way it appears stunningly remarkable, hence the proof. Nice!
I also like the second proof.