Let M be an n×n Markov matrix (i.e. a stochastic matrix). This means that M has entries in [0,1] and that each row sums to 1. Let λ1,…,λn be the roots in C of its characteristic polynomial, ordered in such a way that |λ1|≥⋯≥|λn|, with growing phases in case of equal moduli. It is not difficult to show that {λ1,…,λn}⊂{z∈C:|z|≤1}, that λ1=1, and that the spectrum of M is symmetric with respect to the real axis. A good reference for the spectrum of Markov matrices is the book of Seneta.
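For concreteness, here is a small numerical sketch in Python (NumPy is assumed; the matrix size, entries, and seed are arbitrary illustrations) checking these facts on a random Markov matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)       # normalize rows: M is a Markov matrix

lam = np.linalg.eigvals(M)
lam = lam[np.argsort(-np.abs(lam))]     # decreasing modulus

print(np.abs(lam))                       # all moduli are <= 1
print(np.isclose(np.abs(lam[0]), 1.0))   # lambda_1 = 1 (modulus 1 is attained)
# symmetry with respect to the real axis: conjugation leaves the spectrum unchanged
print(np.allclose(np.sort_complex(lam), np.sort_complex(np.conj(lam))))
```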
Suppose from now on that M is irreducible. Then the eigenspace associated with λ1 has dimension 1 and M admits a unique invariant probability measure μ. One can show that the spectrum of M is invariant under the rotation of angle 2π/d, where d∈{1,2,…} is called the period of M. In particular, there are exactly d eigenvalues of unit modulus, and if M is reversible, then its spectrum is real and −1 belongs to the spectrum if and only if d=2.
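As an illustration (NumPy again; the helper invariant_measure and the matrices below are only for this sketch), one can extract μ as a left eigenvector for the eigenvalue 1, and observe that a deterministic cycle of length d has the d-th roots of unity as spectrum, which is indeed invariant under the rotation of angle 2π/d:

```python
import numpy as np

def invariant_measure(M):
    """Left eigenvector of M for the eigenvalue 1, normalized to a probability (illustrative helper)."""
    w, V = np.linalg.eig(M.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    return v / v.sum()

rng = np.random.default_rng(1)
M = rng.random((5, 5))
M /= M.sum(axis=1, keepdims=True)            # positive entries: irreducible
mu = invariant_measure(M)
print(np.allclose(mu @ M, mu), mu)           # mu M = mu

d = 4
C = np.roll(np.eye(d), 1, axis=1)            # deterministic cycle i -> i+1 mod d: period d
print(np.sort_complex(np.linalg.eigvals(C))) # the d-th roots of unity
```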
From now on, we suppose additionally that M is aperiodic: d=1. Then |λ2|<1. In this case, the spectral gap 1−|λ2| describes the worst exponential decay to equilibrium μ in ℓ2(μ) for a Markov chain with kernel M. This corresponds to controlling the decay of the powers of M, once the rank-one projection onto the equilibrium has been subtracted, by the spectral radius of what remains, namely |λ2|.
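A rough numerical sketch of this decay (NumPy assumed; the random matrix is an arbitrary illustration) consists in comparing the distance to μ of the law at time k of a chain started at a fixed state with |λ2|^k:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)            # irreducible and aperiodic

w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()
lam2 = np.sort(np.abs(w))[-2]                # |lambda_2|: second largest modulus

nu0 = np.zeros(n)
nu0[0] = 1.0                                 # chain started at a fixed state
for k in (1, 5, 10, 20):
    dist = np.abs(nu0 @ np.linalg.matrix_power(M, k) - mu).sum()
    print(k, dist, lam2 ** k)                # same order of magnitude
```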
Beyond the spectral gap, one may ask for a dynamical interpretation of the remaining eigenvalues. If M is reversible, then M is self-adjoint in ℓ2(μ), and 1−|λk| for λk∉{0,1} describes the worst exponential decay to equilibrium μ in ℓ2(μ) when the initial law of the chain is orthogonal to the eigenspaces of M⊤ associated with the eigenvalues of larger modulus. One may also think about the Chen and Saloff-Coste criterion for the ℓ2 cutoff of reversible Markov chains. Beyond reversibility, this simple interpretation remains valid if M is normal, i.e. if MM∗=M∗M where M∗ is the adjoint of M in ℓ2(μ).
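To see reversibility as self-adjointness concretely, one may check numerically (NumPy assumed; the birth-death kernel below is just a simple reversible example) that D^{1/2}MD^{−1/2} is symmetric when D = diag(μ), and that the spectrum is then real:

```python
import numpy as np

n, p, q = 6, 0.3, 0.5
M = np.zeros((n, n))
for i in range(n):                           # tridiagonal birth-death kernel: reversible
    if i + 1 < n:
        M[i, i + 1] = p
    if i > 0:
        M[i, i - 1] = q
    M[i, i] = 1.0 - M[i].sum()               # lazy part keeps the rows stochastic

w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()

S = np.diag(np.sqrt(mu)) @ M @ np.diag(1.0 / np.sqrt(mu))
print(np.allclose(S, S.T))                               # self-adjointness in l2(mu)
print(np.allclose(np.imag(np.linalg.eigvals(M)), 0.0))   # real spectrum
```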
The empirical spectral distribution of M is the discrete probability measure
\[
\nu := \frac{1}{n}\delta_{\lambda_1} + \cdots + \frac{1}{n}\delta_{\lambda_n}.
\]
For every k∈{0,1,2,…}, we have the global spectral identity
\[
\int z^k \, d\nu(z)
= \frac{1}{n}\sum_{i=1}^n \lambda_i^k
= \frac{1}{n}\operatorname{Tr}(M^k)
= \frac{1}{n}\sum_{i=1}^n (M^k)_{i,i}
= \frac{1}{n}\sum_{i=1}^n p(i,k)
\]
where p(i,k) := (M^k)_{i,i} is the probability of a loop of length k rooted at i for a Markov chain with kernel M. This gives a global interpretation of the spectrum: the moments of the empirical spectral distribution of M are the mean loop probabilities of the chain.
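This identity is easy to check numerically, for instance as follows (NumPy assumed; the matrix is an arbitrary random Markov matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)
lam = np.linalg.eigvals(M)

for k in range(5):
    moment = np.mean(lam ** k)                              # k-th moment of nu
    mean_loop = np.trace(np.linalg.matrix_power(M, k)) / n  # mean loop probability
    print(k, np.real_if_close(moment), mean_loop)
```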
Let f:{1,…,n}→R be some observable and let (Xn)n≥0 be a Markov chain on {1,…,n} with kernel M. The ergodic theorem (strong law of large numbers for additive functionals) states that, regardless of the initial law of the chain, we have
\[
\frac{f(X_0)+\cdots+f(X_{n-1})}{n}
\xrightarrow[n\to\infty]{\text{a.s.}}
\mu(f) := \int f \, d\mu = \sum_{k=1}^n f(k)\,\mu(k).
\]
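A short simulation (NumPy assumed; the observable f, the kernel M, the initial state, and the trajectory length are arbitrary choices) illustrates this convergence:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
M = rng.random((n, n))
M /= M.sum(axis=1, keepdims=True)
w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()

f = rng.random(n)                            # an arbitrary observable on {1,...,n}
T = 100_000
x, total = 0, 0.0                            # arbitrary initial state
for _ in range(T):
    total += f[x]
    x = rng.choice(n, p=M[x])                # one step of the chain with kernel M
print(total / T, f @ mu)                     # empirical mean vs mu(f)
```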
Let g:{1,…,n}→R be such that (M−I)g=f−μ(f). Note that L:=M−I is the generator of M and g solves the Poisson equation. Then the central limit theorem states that
\[
\sqrt{n}\left(\frac{f(X_0)+\cdots+f(X_{n-1})}{n}-\mu(f)\right)
\xrightarrow[n\to\infty]{\text{law}}
\mathcal{N}(0,\sigma(f)^2)
\]
where σ(f)^2 := μ(g^2) − μ(g)^2. Thus, if M is reversible, then
\[
\sigma(f)^2 \leq \frac{\mu(f^2)-\mu(f)^2}{\operatorname{dist}(1,\{\lambda_2,\ldots,\lambda_n\})^2}.
\]
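Here is a rough numerical sketch of these last formulas for a reversible chain (NumPy assumed; the birth-death kernel and the observable are arbitrary illustrations): solve the Poisson equation by least squares, form σ(f)^2 as defined above, and check the variance bound:

```python
import numpy as np

n, p, q = 6, 0.3, 0.5
M = np.zeros((n, n))
for i in range(n):                           # reversible birth-death kernel
    if i + 1 < n:
        M[i, i + 1] = p
    if i > 0:
        M[i, i - 1] = q
    M[i, i] = 1.0 - M[i].sum()

w, V = np.linalg.eig(M.T)
mu = np.real(V[:, np.argmin(np.abs(w - 1.0))])
mu /= mu.sum()

lam = np.sort(np.real(np.linalg.eigvals(M)))[::-1]  # real spectrum since M is reversible
dist1 = np.min(np.abs(1.0 - lam[1:]))               # dist(1, {lambda_2, ..., lambda_n})

rng = np.random.default_rng(6)
f = rng.random(n)
g, *_ = np.linalg.lstsq(M - np.eye(n), f - f @ mu, rcond=None)  # a solution of the Poisson equation

sigma2 = (g ** 2) @ mu - (g @ mu) ** 2       # sigma(f)^2 = mu(g^2) - mu(g)^2
bound = ((f ** 2) @ mu - (f @ mu) ** 2) / dist1 ** 2
print(sigma2, bound, sigma2 <= bound)
```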
Note that when M and μ are not known, one may estimate σ(f)2 empirically. See e.g. the work of Trevezas and Limnios and references therein.
See also this paper.