Let M be an n×n Markov matrix (i.e. a stochastic matrix). This means that M has entries in [0,1] and that each row sums to 1. Let λ1,…,λn be the roots in C of its characteristic polynomial, ordered in such a way that |λ1|≥⋯≥|λn|, with growing phases in case of equal moduli. It is not difficult to show that {λ1,…,λn}⊂{z∈C:|z|≤1}, that λ1=1, and that the spectrum of M is symmetric with respect to the real axis. A good reference for the spectrum of Markov matrices is the book of Seneta.
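For concreteness, here is a small numerical sketch in Python (NumPy is assumed; the matrix size, entries, and seed are arbitrary illustrations) checking these facts on a random Markov matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)       # normalize rows: M is a Markov matrix

lam = np.linalg.eigvals(M)
lam = lam[np.argsort(-np.abs(lam))]     # decreasing modulus

print(np.abs(lam))                       # all moduli are <= 1
print(np.isclose(np.abs(lam[0]), 1.0))   # lambda_1 = 1 (modulus 1 is attained)
# symmetry with respect to the real axis: conjugation leaves the spectrum unchanged
print(np.allclose(np.sort_complex(lam), np.sort_complex(np.conj(lam))))
```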
Suppose from now on that M is irreducible. Then the eigenspace associated with λ1 has dimension 1 and M admits a unique invariant probability measure μ. One can show that the spectrum of M is invariant under the rotation of angle 2π/d, where d∈{1,2,…} is called the period of M. In particular, there are exactly d eigenvalues of unit modulus, and if M is reversible, then its spectrum is real and −1 belongs to the spectrum if and only if d=2.
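As an illustration (NumPy again; the helper invariant_measure and the matrices below are only for this sketch), one can extract μ as a left eigenvector for the eigenvalue 1, and observe that a deterministic cycle of length d has the d-th roots of unity as spectrum, which is indeed invariant under the rotation of angle 2π/d:

```python
import numpy as np

def invariant_measure(M):
    """Left eigenvector of M for the eigenvalue 1, normalized to a probability (illustrative helper)."""
    w, V = np.linalg.eig(M.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return v / v.sum()

rng = np.random.default_rng(1)
M = rng.random((5, 5))
M /= M.sum(axis=1, keepdims=True)            # positive entries: irreducible
mu = invariant_measure(M)
print(np.allclose(mu @ M, mu), mu)           # mu M = mu

d = 4
C = np.roll(np.eye(d), 1, axis=1)            # deterministic cycle i -> i+1 mod d: period d
print(np.sort_complex(np.linalg.eigvals(C))) # the d-th roots of unity
```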
From now on, we suppose additionally that M is aperiodic: d=1. Then |λ2|<1. In this case, the spectral gap 1−|λ2| describes the worst exponential decay to equilibrium μ in ℓ2(μ) for a Markov chain with kernel M. This corresponds to controlling the decay of the powers of M, once the rank-one projection onto the equilibrium has been subtracted, by the spectral radius of what remains, namely |λ2|.
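A rough numerical sketch of this decay (NumPy assumed; the random matrix is an arbitrary illustration) consists in comparing the distance to μ of the law at time k of a chain started at a fixed state with |λ2|^k:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)            # irreducible and aperiodic

w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()
lam2 = np.sort(np.abs(w))[-2]                # |lambda_2|: second largest modulus

nu0 = np.zeros(n)
nu0[0] = 1.0                                 # chain started at a fixed state
for k in (1, 5, 10, 20):
    dist = np.abs(nu0 @ np.linalg.matrix_power(M, k) - mu).sum()
    print(k, dist, lam2 ** k)                # same order of magnitude
```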
Beyond the spectral gap, one may ask for a dynamical interpretation of the remaining eigenvalues. If M is reversible, then M is self-adjoint in ℓ2(μ), and 1−|λk| for λk∉{0,1} describes the worst exponential decay to equilibrium μ in ℓ2(μ) when the initial law of the chain is orthogonal to the eigenspaces of M⊤ associated with the eigenvalues of larger modulus. One may also think about the Chen and Saloff-Coste criterion for the ℓ2 cutoff of reversible Markov chains. Beyond reversibility, this simple interpretation remains valid if M is normal, i.e. if MM∗=M∗M where M∗ is the adjoint of M in ℓ2(μ).
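To see reversibility as self-adjointness concretely, one may check numerically (NumPy assumed; the birth-death kernel below is just a simple reversible example) that D^{1/2}MD^{−1/2} is symmetric when D = diag(μ), and that the spectrum is then real:

```python
import numpy as np

n, p, q = 6, 0.3, 0.5
M = np.zeros((n, n))
for i in range(n):                           # tridiagonal birth-death kernel: reversible
    if i + 1 < n:
        M[i, i + 1] = p
    if i > 0:
        M[i, i - 1] = q
    M[i, i] = 1.0 - M[i].sum()               # lazy part keeps the rows stochastic

w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()

S = np.diag(np.sqrt(mu)) @ M @ np.diag(1.0 / np.sqrt(mu))
print(np.allclose(S, S.T))                               # self-adjointness in l2(mu)
print(np.allclose(np.imag(np.linalg.eigvals(M)), 0.0))   # real spectrum
```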
The empirical spectral distribution of M is the discrete probability measure
\[
\nu := \frac{1}{n}\delta_{\lambda_1} + \cdots + \frac{1}{n}\delta_{\lambda_n}.
\]
For every k∈{0,1,2,…}, we have the global spectral identity
\[
\int z^k \, d\nu(z)
= \frac{1}{n}\sum_{i=1}^n \lambda_i^k
= \frac{1}{n}\operatorname{Tr}(M^k)
= \frac{1}{n}\sum_{i=1}^n (M^k)_{i,i}
= \frac{1}{n}\sum_{i=1}^n p(i,k)
\]
where p(i,k) := (M^k)_{i,i} is the probability of a loop of length k rooted at i for a Markov chain with kernel M. This gives a global interpretation of the spectrum: the moments of the empirical spectral distribution of M are the mean loop probabilities of the chain.
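This identity is easy to check numerically, for instance as follows (NumPy assumed; the matrix is an arbitrary random Markov matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)
lam = np.linalg.eigvals(M)

for k in range(5):
    moment = np.mean(lam ** k)                              # k-th moment of nu
    mean_loop = np.trace(np.linalg.matrix_power(M, k)) / n  # mean loop probability
    print(k, np.real_if_close(moment), mean_loop)
```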
Let f:{1,…,n}→R be some observable and let (Xn)n≥0 be a Markov chain on {1,…,n} with kernel M. The ergodic theorem (strong law of large numbers for additive functionals) states that, regardless of the initial law of the chain, we have
\[
\frac{f(X_0)+\cdots+f(X_{n-1})}{n}
\xrightarrow[n\to\infty]{\text{a.s.}}
\mu(f) := \int f \, d\mu = \sum_{k=1}^n f(k)\,\mu(k).
\]
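A short simulation (NumPy assumed; the observable f, the kernel M, the initial state, and the trajectory length are arbitrary choices) illustrates this convergence:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)
w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()

f = rng.random(n)                            # an arbitrary observable on {1,...,n}
T = 100_000
x, total = 0, 0.0                            # arbitrary initial state
for _ in range(T):
    total += f[x]
    x = rng.choice(n, p=M[x])                # one step of the chain with kernel M
print(total / T, f @ mu)                     # empirical mean vs mu(f)
```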
Let g:{1,…,n}→R be such that (M−I)g=f−μ(f). Note that L:=M−I is the generator of M and g solves the Poisson equation. Then the central limit theorem states that
\[
\sqrt{n}\left(\frac{f(X_0)+\cdots+f(X_{n-1})}{n}-\mu(f)\right)
\xrightarrow[n\to\infty]{\text{law}}
\mathcal{N}(0,\sigma(f)^2)
\]
where σ(f)^2 := μ(g^2) − μ(g)^2. Thus, if M is reversible, then
\[
\sigma(f)^2 \leq \frac{\mu(f^2)-\mu(f)^2}{\operatorname{dist}(1,\{\lambda_2,\ldots,\lambda_n\})^2}.
\]
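Here is a rough numerical sketch of these last formulas for a reversible chain (NumPy assumed; the birth-death kernel and the observable are arbitrary illustrations): solve the Poisson equation by least squares, form σ(f)^2 as defined above, and check the variance bound:

```python
import numpy as np

n, p, q = 6, 0.3, 0.5
M = np.zeros((n, n))
for i in range(n):                           # reversible birth-death kernel
    if i + 1 < n:
        M[i, i + 1] = p
    if i > 0:
        M[i, i - 1] = q
    M[i, i] = 1.0 - M[i].sum()

w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()

lam = np.sort(np.real(np.linalg.eigvals(M)))[::-1]  # real spectrum since M is reversible
dist1 = np.min(np.abs(1.0 - lam[1:]))               # dist(1, {lambda_2, ..., lambda_n})

rng = np.random.default_rng(6)
f = rng.random(n)
g, *_ = np.linalg.lstsq(M - np.eye(n), f - f @ mu, rcond=None)  # a solution of the Poisson equation

sigma2 = (g ** 2) @ mu - (g @ mu) ** 2       # sigma(f)^2 = mu(g^2) - mu(g)^2
bound = ((f ** 2) @ mu - (f @ mu) ** 2) / dist1 ** 2
print(sigma2, bound, sigma2 <= bound)
```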
Note that when M and μ are not known, one may estimate σ(f)2 empirically. See e.g. the work of Trevezas and Limnios and references therein.
See also this paper.