Press "Enter" to skip to content

Singular values and row distances

Any matrix \( {M\in\mathcal{M}_{m,n}} \) maps the unit sphere to an ellipsoid. The singular values of \( {M} \) are the half-lengths of the principal axes of this ellipsoid. We denote them

\[ s_1(M)\geq\cdots\geq s_{\min(n,m)}(M) \]

the \( {\max(m,n)-\min(m,n)} \) remaining half-lengths of the ellipsoid being zero. If \( {M^*} \) is the conjugate transpose of \( {M} \), then the positive semidefinite Hermitian matrices \( {\sqrt{MM^*}} \) and \( {\sqrt{M^*M}} \) share the same spectrum up to the multiplicity of \( {0} \), given by the singular values of \( {M} \). The singular value decomposition (SVD) theorem states that there exist unitary matrices \( {U} \) (\( {m\times m} \)) and \( {V} \) (\( {n\times n} \)) such that

\[ UMV=\mathrm{diag}(s_1(M),\ldots,s_{\min(n,m)}(M)) \]

where we pad with as many zeros as needed. The largest singular value

\[ s_1(M)=\max_{\left|x\right|_2=1}\left|Mx\right|_2=\left|M\right|_2 \]

is the operator norm of \( {M} \). For the smallest, provided that \( {n\leq m} \) (if \( {n>m} \) the kernel of \( {M} \) is nontrivial and the minimum below vanishes), we have

\[ s_{\min(n,m)}(M)=\min_{\left|x\right|_2=1}\left|Mx\right|_2. \]

The matrix \( {M} \) has \( {\mathrm{rank}(M)} \) nonzero singular values. The matrix \( {M} \) is invertible iff \( {m=n} \) and \( {s_n(M)>0} \), and in this case \( {s_n(M)=\left|M^{-1}\right|_2^{-1}} \). If \( {m=n} \) and \( {M} \) is normal (i.e. \( {MM^*=M^*M} \)) then the singular values of \( {M} \) are the moduli of the eigenvalues of \( {M} \).
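As a quick numerical sanity check of these facts (my addition, not part of the original text), here is a small numpy sketch; the random matrices and the seed are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(0)

    # Rectangular case: singular values = spectrum of sqrt(M*M), and operator norm.
    m, n = 5, 3
    M = rng.standard_normal((m, n))
    s = np.linalg.svd(M, compute_uv=False)        # s_1 >= ... >= s_min(m,n)
    assert np.allclose(s, np.sqrt(np.sort(np.linalg.eigvalsh(M.T @ M))[::-1]))
    assert np.isclose(s[0], np.linalg.norm(M, 2))  # largest = operator norm

    # Square invertible case: s_n(M) = |M^{-1}|_2^{-1}.
    A = rng.standard_normal((4, 4))
    sA = np.linalg.svd(A, compute_uv=False)
    assert np.isclose(sA[-1], 1 / np.linalg.norm(np.linalg.inv(A), 2))

    # Normal case: singular values are the moduli of the eigenvalues.
    H = rng.standard_normal((4, 4))
    N = H - H.T                                    # skew-symmetric, hence normal
    assert np.allclose(np.linalg.svd(N, compute_uv=False),
                       np.sort(np.abs(np.linalg.eigvals(N)))[::-1])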

If \( {m=n} \) and \( {M} \) has rows \( {R_1,\ldots,R_n} \), then, viewing \( {\left|\det(M)\right|} \) as the volume of a hyperparallelepiped,

\[ |\det(M)|=\prod_{k=1}^ns_k(M) =\prod_{k=1}^n\mathrm{dist}(R_k,\mathrm{span}\{R_1,\ldots,R_{k-1}\}). \]
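This identity is easy to test numerically. The following sketch (my addition) computes each distance by least squares; the seed and dimensions are arbitrary.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 4
    M = rng.standard_normal((n, n))

    s = np.linalg.svd(M, compute_uv=False)
    dists = [np.linalg.norm(M[0])]                 # dist(R_1, span{}) = |R_1|_2
    for k in range(1, n):
        B = M[:k]                                  # rows R_1, ..., R_k
        # distance of R_{k+1} to span{R_1, ..., R_k} via least squares
        coef, *_ = np.linalg.lstsq(B.T, M[k], rcond=None)
        dists.append(np.linalg.norm(M[k] - B.T @ coef))

    assert np.isclose(abs(np.linalg.det(M)), np.prod(s))
    assert np.isclose(abs(np.linalg.det(M)), np.prod(dists))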

Here are two additional nice facts relating singular values to row distances, due to Rudelson and Vershynin [CPAM 2009] and Tao and Vu [AoP 2010].

Lemma 1 (Rudelson and Vershynin) If \( {M\in\mathcal{M}_{m,n}(K)} \) with \( {1\leq m\leq n} \) has rows \( {R_1,\ldots,R_m} \), then, denoting \( {R_{-i}=\mathrm{span}\{R_j:j\neq i\}} \), we have

\[ m^{-1/2}\min_{1\leq i\leq m}\mathrm{dist}_2(R_i,R_{-i}) \leq s_{\min(m,n)}(M) \leq \min_{1\leq i\leq m}\mathrm{dist}_2(R_i,R_{-i}). \]

Proof: Since \( {M} \) and \( {M^\top} \) have the same singular values, it suffices to prove the analogous statement for the columns \( {C_1,\ldots,C_n} \) of a matrix \( {M\in\mathcal{M}_{m,n}(K)} \) with \( {n\leq m} \), where \( {C_{-i}=\mathrm{span}\{C_j:j\neq i\}} \). For every vector \( {x\in K^n} \) and every \( {i\in\{1,\ldots,n\}} \), the identity \( {Mx=x_1C_1+\cdots+x_n C_n} \) and the fact that \( {0\in C_{-i}} \) give

\[ \left|Mx\right|_2 \geq \mathrm{dist}_2(Mx,C_{-i}) =\min_{y\in C_{-i}} \left|Mx-y\right|_2 =\min_{y\in C_{-i}} \left|x_iC_i-y\right|_2 =\vert x_i\vert\mathrm{dist}_2(C_i,C_{-i}). \]

If \( {\left|x\right|_2 =1} \) then necessarily \( {\vert x_i\vert \geq n^{-1/2}} \) for some \( {i\in\{1,\ldots,n\}} \), and therefore

\[ s_{\min(m,n)}(M) =\min_{\left|x\right|_2 =1}\left|Mx\right|_2 \geq n^{-1/2}\min_{1\leq i\leq n}\mathrm{dist}_2(C_i,C_{-i}). \]

Conversely, for any \( {i\in\{1,\ldots,n\}} \), there exists a vector \( {y\in K^n} \) with \( {y_i=1} \) such that

\[ \mathrm{dist}_2(C_i,C_{-i}) =\left|y_1C_1+\cdots+y_nC_n\right|_2 =\left|My\right|_2 \geq \left|y\right|_2 \min_{\left|x\right|_2=1}\left|Mx\right|_2 \geq s_{\min(m,n)}(M) \]

where we used the fact that \( {\left|y\right|^2_2 = |y_1|^2+\cdots+|y_n|^2\geq |y_i|^2=1} \). ☐
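A numerical illustration of Lemma 1 (my sketch, not from the cited papers): draw a random \( {m\times n} \) matrix with \( {m\leq n} \) and check the two-sided bound, computing each row distance by least squares.

    import numpy as np

    def row_distances(M):
        """dist_2(R_i, span{R_j : j != i}) for every row R_i of M."""
        d = np.empty(M.shape[0])
        for i in range(M.shape[0]):
            B = np.delete(M, i, axis=0)            # all rows but R_i
            coef, *_ = np.linalg.lstsq(B.T, M[i], rcond=None)
            d[i] = np.linalg.norm(M[i] - B.T @ coef)
        return d

    rng = np.random.default_rng(2)
    m, n = 4, 6                                    # m <= n as in the lemma
    M = rng.standard_normal((m, n))
    s_min = np.linalg.svd(M, compute_uv=False)[-1]  # s_min(m,n)(M)
    d_min = row_distances(M).min()
    assert d_min / np.sqrt(m) <= s_min <= d_min + 1e-12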

Lemma 2 (Tao and Vu) Let \( {1\leq m\leq n} \) and \( {M\in\mathcal{M}_{m,n}(K)} \) with rows \( {R_1,\ldots,R_m} \). If \( {\mathrm{rank}(M)=m} \) then, denoting \( {R_{-i}=\mathrm{span}\{R_j:j\neq i\}} \), we have

\[ \sum_{i=1}^ms_i^{-2}(M)=\sum_{i=1}^m\mathrm{dist}_2(R_i,R_{-i})^{-2}. \]

Proof: The orthogonal projection of \( {R_i} \) on \( {R_{-i}} \) is \( {B^*(BB^*)^{-1}BR_i^*} \) where \( {B} \) is the \( {(m-1)\times n} \) matrix obtained from \( {M} \) by removing the row \( {R_i} \). In particular, we have

\[ \left| R_i\right|_2^2-\mathrm{dist}_2(R_i,R_{-i})^2 =\left| B^*(BB^*)^{-1}BR_i^*\right|_2^2 = (BR_i^*)^*(BB^*)^{-1}BR_i^* \]

by the Pythagoras theorem. On the other hand, the Schur block inversion formula states that if \( {A} \) is an \( {m\times m} \) matrix and \( {\{1,\ldots,m\}=I\cup I^c} \) is a partition such that \( {A} \) and \( {A_{I^c,I^c}} \) are invertible, then

\[ (A^{-1})_{I,I}=(A_{I,I}-A_{I,I^c}(A_{I^c,I^c})^{-1}A_{I^c,I})^{-1}. \]
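Here is a quick numpy check of this block inversion identity (my addition); the shift by \( {m\,\mathrm{Id}} \) just keeps the random matrix and its lower-right block safely invertible for this seed.

    import numpy as np

    rng = np.random.default_rng(3)
    m = 5
    A = rng.standard_normal((m, m)) + m * np.eye(m)   # comfortably invertible
    I = [0, 2]                                        # the set I; Ic is its complement
    Ic = [j for j in range(m) if j not in I]

    lhs = np.linalg.inv(A)[np.ix_(I, I)]              # (A^{-1})_{I,I}
    schur = (A[np.ix_(I, I)]
             - A[np.ix_(I, Ic)] @ np.linalg.inv(A[np.ix_(Ic, Ic)]) @ A[np.ix_(Ic, I)])
    assert np.allclose(lhs, np.linalg.inv(schur))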

Now we take \( {A=MM^*} \), which is invertible since \( {\mathrm{rank}(M)=m} \), and \( {I=\{i\}} \), and we note that \( {(MM^*)_{i,j}=R_i\cdot R_j} \), which gives

\[ ((MM^*)^{-1})_{i,i}=(R_i\cdot R_i-(BR_i^*)^*(BB^*)^{-1}BR_i^*)^{-1} =\mathrm{dist}_2(R_i,R_{-i})^{-2}. \]

The desired formula follows by taking the sum over \( {i\in\{1,\ldots,m\}} \). ☐
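To close, a numerical check of Lemma 2 together with the key identity \( {((MM^*)^{-1})_{i,i}=\mathrm{dist}_2(R_i,R_{-i})^{-2}} \) from the proof (my sketch; row_distances is the least squares helper from the sketch after Lemma 1, repeated here so the block is self-contained).

    import numpy as np

    def row_distances(M):
        """dist_2(R_i, span{R_j : j != i}) for every row R_i of M."""
        d = np.empty(M.shape[0])
        for i in range(M.shape[0]):
            B = np.delete(M, i, axis=0)
            coef, *_ = np.linalg.lstsq(B.T, M[i], rcond=None)
            d[i] = np.linalg.norm(M[i] - B.T @ coef)
        return d

    rng = np.random.default_rng(4)
    m, n = 3, 5
    M = rng.standard_normal((m, n))                   # rank m almost surely

    s = np.linalg.svd(M, compute_uv=False)            # s_1 >= ... >= s_m
    d = row_distances(M)
    G_inv = np.linalg.inv(M @ M.T)                    # (MM^*)^{-1}, real case
    assert np.allclose(np.diag(G_inv), d**-2.0)       # identity from the proof
    assert np.isclose((s**-2.0).sum(), (d**-2.0).sum())  # Lemma 2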
