# Libres pensées d'un mathématicien ordinaire Posts

Let $latex X$ be an $latex n\times n$ complex matrix. The eigenvalues $latex \lambda_1(X), \ldots, \lambda_n(X)$ of $latex X$ are the roots in $latex \mathbb{C}$ of its characteristic polynomial. We label them in such a way that

$latex \displaystyle |\lambda_1(X)|\geq\cdots\geq|\lambda_n(X)|$

with growing phases. The spectral radius of $latex X$ is $latex \rho(X):=|\lambda_1(X)|$. The singular values

$latex \displaystyle s_1(X)\geq\cdots\geq s_n(X)$

of $latex X$ are the eigenvalues of the positive semi-definite Hermitian matrix $latex \sqrt{XX^*}$ where $latex X^*$ is the conjugate-transpose of $latex X$. Thus $latex s_k(X)=\sqrt{\lambda_k(XX^*)}$. The operator norm of $latex X$ is $latex \Vert X\Vert_{2\to2}=s_1(X)$. Geometrically, the matrix $latex X$ maps the unit sphere to an ellipsoid, and the half-lengths of the principal axes of this ellipsoid are exactly the singular values of $latex X$.
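
These identities are easy to check numerically. Here is a quick NumPy sanity check on a random complex matrix (an illustration only, not part of the mathematics):

```python
# Check that the singular values of X are the square roots of the
# eigenvalues of X X*, and that the operator norm equals s_1(X).
import numpy as np

rng = np.random.default_rng(0)
n = 5
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

s = np.linalg.svd(X, compute_uv=False)                    # s_1 >= ... >= s_n
lam = np.sort(np.linalg.eigvalsh(X @ X.conj().T))[::-1]   # eigenvalues of XX*

assert np.allclose(s, np.sqrt(lam))            # s_k = sqrt(lambda_k(XX*))
assert np.isclose(np.linalg.norm(X, 2), s[0])  # operator norm = s_1
```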

H. Weyl showed in 1949 that for every $latex k\in\{1,\ldots,n\}$, we have

$latex \displaystyle \prod_{i=k}^ns_i(X)\leq\prod_{i=k}^n|\lambda_i(X)| \quad\text{and}\quad \prod_{i=1}^k|\lambda_i(X)|\leq\prod_{i=1}^ks_i(X).$

This expresses a two-way logarithmic majorization. Note that we recover the radius/norm bound $latex \rho(X)\leq\Vert X\Vert_{2\to2}$. Note also that one may deduce the second family of Weyl inequalities from the first by applying it to the inverse of $latex X+\varepsilon I$ (and conversely). Note moreover that

$latex \displaystyle\prod_{i=1}^n|\lambda_i(X)|=\prod_{i=1}^ns_i(X)=|\det(X)|$.
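
The two families of Weyl inequalities and this determinant identity can be verified numerically on a random complex matrix (a NumPy sketch):

```python
# Check the Weyl log-majorization inequalities and the determinant identity.
import numpy as np

rng = np.random.default_rng(1)
n = 6
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

s = np.linalg.svd(X, compute_uv=False)        # s_1 >= ... >= s_n
lam = np.linalg.eigvals(X)
lam = lam[np.argsort(-np.abs(lam))]           # |lambda_1| >= ... >= |lambda_n|

slack = 1 + 1e-9                              # numerical tolerance
for k in range(1, n + 1):
    # prod_{i<=k} |lambda_i| <= prod_{i<=k} s_i
    assert np.prod(np.abs(lam[:k])) <= slack * np.prod(s[:k])
    # prod_{i>=k} s_i <= prod_{i>=k} |lambda_i|
    assert np.prod(s[k - 1:]) <= slack * np.prod(np.abs(lam[k - 1:]))

# prod |lambda_i| = prod s_i = |det(X)|
assert np.isclose(np.prod(s), np.abs(np.linalg.det(X)))
```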

In 1954, A. Horn showed that any sequences of numbers in $latex \mathbb{C}$ and $latex \mathbb{R}_+$ satisfying the Weyl inequalities and the ordering that we used are actually the eigenvalues and singular values of some complex matrix. This solves positively a problem formulated by S. Sherman.

We say that $latex X$ is Markov (or stochastic) if $latex X$ has entries in $latex [0,1]$ and each row sums to $latex 1$. The spectrum of $latex X$ is then contained in the closed unit disc of the complex plane, contains $latex 1$, and is symmetric with respect to the real axis. With the Weyl-Sherman-Horn inverse problem in mind, one can ask for the possible sets of singular values of a Markov matrix with prescribed eigenvalues. One may also reverse the question. One has to take into account the additional symmetries due to the period. Actually, instead of the singular values of $latex X$, it is maybe more natural, from a Markovian point of view, to consider the spectrum of $latex \sqrt{XX^{*\mu}}$ where $latex X^{*\mu}$ is the adjoint of $latex X$ in $latex \ell^2(\mu)$, $latex \mu$ being the invariant measure of $latex X$ (when $latex X$ is irreducible). This remark leads to many other questions.

The content of this post is inspired from informal discussions with Laurent Miclo during the EVOL workshop held in Hammamet in May 2010.

Let $latex M$ be an $latex n\times n$ Markov matrix (i.e. a stochastic matrix). This means that $latex M$ has entries in $latex [0,1]$ and each row sums to $latex 1$. Let $latex \lambda_1,\ldots,\lambda_n$ be the roots in $latex \mathbb{C}$ of its characteristic polynomial. Let us order them in such a way that $latex |\lambda_1|\geq\cdots\geq|\lambda_n|$ with growing phases. It is not difficult to show that $latex \{\lambda_1,\ldots,\lambda_n\}\subset\{z\in\mathbb{C}:|z|\leq1\}$, that $latex \lambda_1=1$, and that the spectrum of $latex M$ is symmetric with respect to the real axis. A good reference for the spectrum of Markov matrices is the book of Seneta.
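
These three facts (spectrum in the closed unit disc, $latex 1$ in the spectrum, conjugation symmetry) are easy to observe numerically on a random Markov matrix (a NumPy sketch):

```python
# Spectrum of a random Markov matrix: inside the closed unit disc,
# contains 1, and stable under complex conjugation.
import numpy as np

rng = np.random.default_rng(2)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)   # rows sum to 1: a Markov matrix

lam = np.linalg.eigvals(M)
lam = lam[np.argsort(-np.abs(lam))]

assert np.all(np.abs(lam) <= 1 + 1e-12)   # spectrum in the closed unit disc
assert np.isclose(lam[0], 1.0)            # lambda_1 = 1
# conjugation symmetry: the spectrum equals its conjugate as a multiset
assert np.allclose(np.sort_complex(lam), np.sort_complex(np.conj(lam)))
```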

Suppose from now on that $latex M$ is irreducible. Then the eigenspace associated with $latex \lambda_1$ has dimension $latex 1$ and $latex M$ admits a unique invariant probability measure $latex \mu$. One can show that the spectrum of $latex M$ is invariant by the rotation of angle $latex 2\pi/d$ where $latex d\in\{1,2,\ldots\}$ is called the period of $latex M$. In particular, there are exactly $latex d$ eigenvalues of unit modulus, and if $latex M$ is reversible, then its spectrum is real and $latex -1$ belongs to the spectrum if and only if $latex d=2$.

From now on, we suppose additionally that $latex M$ is aperiodic: $latex d=1$. Then $latex |\lambda_2|<1$. In this case, the spectral gap $latex 1-|\lambda_2|$ describes the worst-case exponential speed of convergence to the equilibrium $latex \mu$ in $latex \ell^2(\mu)$ for a Markov chain with kernel $latex M$. This corresponds to controlling the decay of the powers of $latex M$, restricted to the orthogonal complement of the constants, by the spectral radius of this restriction.
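
Numerically, one can watch this convergence: the powers $latex M^k$ converge to the rank-one matrix whose rows are all equal to $latex \mu$, at a speed governed by $latex |\lambda_2|$. A NumPy sketch (the random matrix below has positive entries, hence is irreducible and aperiodic):

```python
# Spectral gap of a random Markov matrix, and exponential convergence of
# the powers M^k to the rank-one projection on the invariant measure mu.
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)   # positive entries: irreducible, aperiodic

lam = np.linalg.eigvals(M)
lam = lam[np.argsort(-np.abs(lam))]
gap = 1 - np.abs(lam[1])            # spectral gap 1 - |lambda_2|

# invariant probability measure: left eigenvector of M for the eigenvalue 1
w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1))])
mu /= mu.sum()

# sup-norm distance between M^k and the matrix with all rows equal to mu
err = [np.abs(np.linalg.matrix_power(M, k) - np.outer(np.ones(n), mu)).max()
       for k in (2, 4, 8)]

assert gap > 0
assert err[0] > err[1] > err[2]     # the error decays as k grows
```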

Beyond the spectral gap, one may ask about some dynamical interpretation of the remaining eigenvalues. If $latex M$ is reversible, then $latex M$ is self-adjoint in $latex \ell^2(\mu)$ and $latex 1-|\lambda_k|$ for $latex \lambda_k\not\in\{0,1\}$ describes the worst-case exponential decay to the equilibrium $latex \mu$ in $latex \ell^2(\mu)$ when the initial law of the chain is orthogonal to the eigenspaces of $latex M^\top$ associated with the eigenvalues of larger modulus. One may also think about the Chen and Saloff-Coste criterion for the $latex \ell^2$ cutoff of reversible Markov chains. Beyond reversibility, this simple interpretation is still valid if $latex M$ is normal, i.e. if $latex MM^*=M^*M$ where $latex M^*$ is the adjoint of $latex M$ in $latex \ell^2(\mu)$.

The empirical spectral distribution of $latex M$ is the discrete probability measure

$latex \displaystyle\nu:=\frac{1}{n}\delta_{\lambda_1}+\cdots+\frac{1}{n}\delta_{\lambda_n}.$

For every $latex k\in\{0,1,2,\ldots\}$, we have the global spectral identity

$latex \displaystyle \int\!z^k\,d\nu(z)=\frac{1}{n}\sum_{i=1}^n\lambda_i^k=\frac{1}{n}\mathrm{Tr}(M^k)=\frac{1}{n}\sum_{i=1}^nM^k_{i,i}=\frac{1}{n}\sum_{i=1}^np(i,k)$

where $latex p(i,k):=M^k_{i,i}$ is the probability of a loop of length $latex k$ rooted at $latex i$ for a Markov chain with kernel $latex M$. This gives a global interpretation of the spectrum: the moments of the empirical spectral distribution of $latex M$ are the mean loop probabilities of the chain.
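
This identity between the moments of $latex \nu$ and the mean loop probabilities is immediate to check numerically (NumPy sketch):

```python
# Moments of the empirical spectral distribution = mean loop probabilities.
import numpy as np

rng = np.random.default_rng(4)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)   # a random Markov matrix

lam = np.linalg.eigvals(M)
for k in range(6):
    moment = np.mean(lam ** k)                           # int z^k dnu(z)
    loops = np.trace(np.linalg.matrix_power(M, k)) / n   # mean loop probability
    assert np.isclose(moment, loops)
```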

Let $latex f:\{1,\ldots,n\}\to\mathbb{R}$ be some observable and $latex (X_n)_{n\geq0}$ be a Markov chain on $latex \{1,\ldots,n\}$ with kernel $latex M$. The ergodic theorem (strong law of large numbers for additive functionals) states that regardless of the initial law of the chain, we have

$latex \displaystyle \frac{f(X_0)+\cdots+f(X_{n-1})}{n}\overset{\text{a.s.}}{\underset{n\to\infty}{\longrightarrow}}\mu(f):=\int\!f\,d\mu=\sum_{k=1}^nf(k)\mu(k).$
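
A small simulation illustrates the ergodic theorem (a sketch: the kernel and the observable $latex f$ below are arbitrary choices):

```python
# Ergodic theorem: the empirical average of f along the chain converges
# to mu(f), whatever the initial state.
import numpy as np

rng = np.random.default_rng(5)
n = 4
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)

# invariant measure via the left eigenvector of M for the eigenvalue 1
w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1))])
mu /= mu.sum()

f = np.arange(1, n + 1, dtype=float)   # an arbitrary observable

N = 200_000
x, total = 0, 0.0
for _ in range(N):
    total += f[x]
    x = rng.choice(n, p=M[x])          # one step of the chain

assert abs(total / N - f @ mu) < 0.05  # empirical average close to mu(f)
```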

Let $latex g:\{1,\ldots,n\}\to\mathbb{R}$ be such that $latex (M-I)g=f-\mu(f)$. Note that $latex L:=M-I$ is the generator of $latex M$ and $latex g$ solves the Poisson equation. Then the central limit theorem states that

$latex \displaystyle \sqrt{n}\left(\frac{f(X_0)+\cdots+f(X_{n-1})}{n}-\mu(f)\right)\overset{\text{law}}{\underset{n\to\infty}{\longrightarrow}}\mathcal{N}(0,\sigma(f)^2)$

where $latex \sigma(f)^2:=\mu(g^2)-\mu(g)^2$. Thus, if $latex M$ is reversible then

$latex \displaystyle\sigma(f)^2\leq \frac{\mu(f^2)-\mu(f)^2}{\mathrm{dist}(1,\{\lambda_2,\ldots,\lambda_n\})^2}.$
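
Since the state space is finite, the Poisson equation $latex (M-I)g=f-\mu(f)$ can be solved numerically. The system is singular (the constants lie in the kernel of $latex M-I$), but $latex f-\mu(f)$ is orthogonal to the left kernel spanned by $latex \mu$, so an exact solution exists. A NumPy sketch:

```python
# Solve the Poisson equation (M - I) g = f - mu(f) on a finite state space.
import numpy as np

rng = np.random.default_rng(6)
n = 4
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)

w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1))])
mu /= mu.sum()

f = np.arange(1, n + 1, dtype=float)
fbar = f - mu @ f                      # the centred observable f - mu(f)

# M - I is singular; lstsq returns the minimum-norm solution, which is
# an exact solution here since fbar is orthogonal to mu
g, *_ = np.linalg.lstsq(M - np.eye(n), fbar, rcond=None)

assert np.allclose((M - np.eye(n)) @ g, fbar)
```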

Note that when $latex M$ and $latex \mu$ are not known, one may estimate $latex \sigma(f)^2$ empirically. See e.g. the work of Trevezas and Limnios and references therein.

The algorithm (several instances may run in parallel):

1. be curious and open-minded; retain concepts, beyond techniques
2. when you catch a new idea (for instance in a seminar), try it on all your problems
3. if the problem is already sparsely solved by others, write down an expository synthesis
4. if not, never hesitate to learn new techniques to solve your problem

Markov Chain Monte Carlo (MCMC for short) methods are widely used in practice for the approximate computation of integrals on various types of spaces. More precisely, let $latex \mu$ be a probability measure on $latex E$, known only up to a multiplicative constant. Let $latex K$ be an irreducible Markov kernel on $latex E$. Then, by using a classical Metropolis-Hastings type construction, one cooks up a computable ergodic Markov kernel $latex M$ on $latex E$ which admits $latex \mu$ as its unique reversible invariant law. One can then simulate a Markov chain $latex (X_n)_{n\geq1}$ with state space $latex E$ and kernel $latex M$. When $latex n$ is large, the law of the random variable $latex X_n$ is approximately $latex \mu$. The construction of $latex K$ relies in general on the knowledge of the local structure of $latex E$. The quantitative control of the speed of convergence is a delicate problem. Many research articles are devoted to specific models. You may take a look at the expository articles of Laurent Michel and of Persi Diaconis.

It is well known that nonreversible ergodic Markov chains can avoid diffusive effects and may converge faster than reversible ergodic Markov chains. It is thus tempting to think about nonreversible MCMC algorithms. This program is currently under development. You may take a look at the recent work of Diaconis, Miclo, and Zuñiga on the spectral analysis of second order Markov chains, i.e. $latex \mathcal{L}(X_{n+1}|X_n,X_{n-1},\ldots)=\mathcal{L}(X_{n+1}|X_n,X_{n-1})$.
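
To fix ideas, here is a minimal Metropolis sketch on a finite state space with a symmetric nearest-neighbour proposal (an illustration only: the unnormalised weight vector below is an arbitrary choice, not a construction from the post). Note that only the ratio $latex \mu(y)/\mu(x)$ is needed, so $latex \mu$ may indeed be known only up to a multiplicative constant:

```python
# A minimal Metropolis sketch: target mu proportional to 'weights'
# (known up to a constant), symmetric random-walk proposal on a cycle.
import numpy as np

rng = np.random.default_rng(7)
n = 10
weights = np.arange(1, n + 1, dtype=float)  # unnormalised target measure

def mh_step(x):
    y = (x + rng.choice([-1, 1])) % n       # propose a neighbour on the cycle
    # accept with probability min(1, mu(y)/mu(x)): only the ratio matters
    return y if rng.random() < weights[y] / weights[x] else x

N = 400_000
counts = np.zeros(n)
x = 0
for _ in range(N):
    x = mh_step(x)
    counts[x] += 1

mu = weights / weights.sum()
assert np.abs(counts / N - mu).max() < 0.03  # empirical law close to mu
```

The chain is reversible with respect to $latex \mu$ by construction (detailed balance with the symmetric proposal), which is precisely the setting that nonreversible MCMC tries to improve upon.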