Press "Enter" to skip to content

Month: July 2012

An observation on permutation matrices

Multiplication table of the 3x3 permutation matrices (source: Wikipedia)

This post is devoted to certain properties of random permutation matrices. Let \( {\mathfrak{S}_n} \) be the symmetric group, \( {n\geq2} \), and let \( {\mathcal{P}_n} \) be the group of \( {n\times n} \) permutation matrices, obtained by the isomorphism \( {\sigma\in\mathfrak{S}_n\mapsto P_\sigma={(\mathbf{1}_{j=\sigma(i)})}_{1\leq i,j\leq n}} \). We know that \( {\mathcal{P}_n} \) is a subgroup of the orthogonal group \( {\mathbb{O}_n} \). It is also the set of extremal points of the Birkhoff polytope \( {\mathcal{B}_n} \) formed by doubly stochastic matrices (these matrices with entries in \( {[0,1]} \) with rows and columns all summing to one). The uniform probability measure on \( {\mathcal{P}_n} \), which gives mass \( {1/n!} \) to every atom, is a Haar measure (left and right translation invariant). We denote by \( {J} \) the \( {n\times n} \) matrix with all entries equal to \( {1} \). We have \( {\frac{1}{n}J\in\mathcal{B}_n} \).

Theorem 1 (Moments and fixed points) Let \( {k\geq1} \) and \( {\psi_n(k):=\mathrm{Card}\{1\leq r\leq n:r \vert k\}} \). If \( {P} \) is a random matrix uniformly distributed on \( {\mathcal{P}_n} \), and \( {F_k:=\mathbb{E}(P^k)} \), then

\[ \mathrm{diag}(F_k)=\frac{\psi_n(k)}{n}I \quad\text{and in particular}\quad \mathrm{Tr}(F_k)=\psi_n(k). \]

If \( {\sigma} \) is a random permutation uniformly distributed on \( {\mathfrak{S}_n} \), then

\[ \mathbb{E}(\mathrm{Card}\{1\leq i\leq n:\sigma^k(i)=i\}) =\psi_n(k). \]

Note that \( {n\mapsto\psi_n(k)} \) grows until \( {\psi_k(k)=\mathrm{Card}\{1\leq r\leq k:r\vert k\}} \).

Proof: The second part follows from the first part since if \( {P_\sigma:={(\mathbf{1}_{j=\sigma(i)})}_{1\leq i,j\leq n}} \) then \( {P^k\overset{d}{=}P_\sigma^k=P_{\sigma^k}} \), and \( {\mathrm{Tr}(P_{\sigma^k})=\mathrm{Card}\{1\leq i\leq n:\sigma^k(i)=i\}} \), and thus

\[ \mathrm{Tr}(F_k) =\mathbb{E}(\mathrm{Tr}(P_{\sigma^k})) =\mathbb{E}(\mathrm{Card}\{1\leq i\leq n:\sigma^k(i)=i\}). \]

To establish the first part, we write, for all \( {1\leq i\leq n} \),

\[ (F_k)_{i,i} =\mathbb{E}((P_{\sigma^k})_{i,i}) =\mathbb{E}(\mathbf{1}_{\sigma^k(i)=i}) =\mathbb{P}(\sigma^k(i)=i) =\frac{1}{n!}\sum_{\sigma\in\mathfrak{S}_n}\mathbf{1}_{\sigma^k(i)=i}. \]

Now \( {i} \) is a fixed point of \( {\sigma^k} \) iff \( {i} \) belongs to a cycle of \( {\sigma} \) of length \( {r} \) such that \( {r\vert k} \). There are

\[ A_{n-1,r-1}=\frac{(n-1)!}{(n-1-(r-1))!}=\frac{(n-1)!}{(n-r)!} \]

ways to choose this cycle, and \( {(n-r)!} \) ways to permute the elements outside this cycle. Therefore

\[ \frac{1}{n!}\sum_{\sigma\in\mathfrak{S}_n}\mathbf{1}_{\sigma^k(i)=i} =\frac{1}{n!}\sum_{\substack{1\leq r\leq k\\r\vert k}}\frac{(n-1)!}{(n-r)!}(n-r)! =\frac{\psi_n(k)}{n}. \]

Theorem 2 (Structure of \( {F_k} \)) With the settings of theorem 1, we have

\[ F_k =\frac{\psi_n(k)-1}{n-1}I +\left(1-\frac{\psi_n(k)-1}{n-1}\right)\frac{1}{n}J. \]

In particular, the matrix \( {F_k} \) belongs to the interval \( {[I,\frac{1}{n}J]} \) in \( {\mathcal{B}_n} \).

Proof: Let \( {\sigma} \) be a random permutation following the uniform distribution on \( {\mathfrak{S}_n} \), as in theorem 1. For any fixed \( {\tau\in\mathfrak{S}_n} \), we have \( {\tau\sigma\overset{d}{=}\sigma\overset{d}{=}\sigma\tau} \), and thus

\[ P_\tau C_\tau=F_k=C_\tau P_\tau \]

where \( {C_\tau=\mathbb{E}(P_\sigma\underbrace{P_\tau P_\sigma\cdots P_\tau P_\sigma}_{k-1\text{ times}})} \). Therefore, for any \( {\tau\in\mathfrak{S}_n} \),

\[ P_{\tau^{-1}}F_k=F_kP_{\tau^{-1}}=C_\tau \ \ \ \ \ (1) \]

and thus, for any \( {\tau\in\mathfrak{S}_n} \)

\[ P_{\tau^{-1}}F_kP_{\tau}=F_k. \]

In other words, for any \( {\tau\in\mathfrak{S}_n} \) and any \( {1\leq i,j\leq n} \),

\[ (F_k)_{\tau(i),\tau(j)}=(F_k)_{i,j}. \]

Thus \( {(F_k)_{i,i}=(F_k)_{1,1}} \) and \( {(F_k)_{i,j}=(F_k)_{1,2}} \) for any \( {1\leq i\neq j\leq n} \). Since \( {F_k} \) is a stochastic matrix, we have \( {1=F_{1,1}+(n-1)F_{1,2}} \). Now theorem 1 gives \( {F_{1,1}=\frac{\psi_n(k)}{n}} \), and the value of \( {F_{1,2}}\) follows. ☐

Algebra. For \( {k=1} \), the formula (1) allows to determine \( {F_1} \) without using theorem 1. Namely, for \( {k=1} \), we have \( {C_\tau=\mathbb{E}(P_\sigma)=F_1} \), and thus \( {P_\tau F_1=F_1=F_1P_\tau} \). Therefore \( {F_1} \) has all its rows identical, and all its columns identical, and thus all its entries identical. Since \( {F_1} \) is a stochastic matrix, we obtain \( {F_1=\frac{1}{n}J} \) and we recover in particular that \( {\mathrm{Tr}(F_1)=1=\psi_n(1)} \). For \( {k>1} \), the matrix \( {C_\tau} \) depends on \( {\tau} \), and we ignore how to deduce \( {F_k} \) by using simply the translation invariance of the law of \( {\sigma} \), without using the combinatorics of the cycles structure of \( {\sigma} \).

Blocs and cycles. We may play with the cycles decomposition of \( {\sigma} \) using matrices. Namely, we know that \( {P_\sigma} \) is conjugated in \( {\mathcal{P}_n} \) to \( {\oplus_{i=1}^\ell B_i} \) where \( {B_1\in\mathcal{P}_{n_1},\ldots,B_\ell\in\mathcal{P}_{n_\ell}} \) are the matrices of the \( {\ell} \) cycles of \( {\sigma} \), of lengths \( {n_1,\ldots,n_\ell} \). Thus \( {P_\sigma^k} \) is conjugated to \( {\oplus_{i=1}^\ell B_i^k} \). But \( {\mathrm{Tr}(B_i^k)=n_i\mathbf{1}_{n_i\vert k}} \) and thus

\[ \mathrm{Tr}(P_\sigma^k) =\sum_{i=1}^\ell\mathrm{Tr}(B_i^k) =\sum_{i=1}^\ell n_i\mathbf{1}_{n_i\vert k}. \]

Denoting \( {A_1,\ldots,A_n} \) the number of cycles of \( {\sigma} \) of lengths \( {1,\ldots,n} \), we get

\[ \mathrm{Tr}(P_\sigma^k)=\sum_{j=1}^n jA_j\mathbf{1}_{j\vert k}. \]

Therefore, theorem 1 gives

\[ \sum_{j=1}^n j\mathbb{E}(A_j)\mathbf{1}_{j\vert k} =\mathbb{E}(\mathrm{Tr}(P_\sigma^k)) =\psi_n(k). \]

Not a surprise for the readers knowing that \( {\mathbb{E}(A_j)=\frac{1}{j}} \) for any \( {1\leq j\leq n} \).

Note. This post comes from discussions with Yan Doumerc.

Leave a Comment

Simulating random walks bridges again

A random walk loop of length 1000Let \( {E_{d,n}} \) be the set of paths of the simple random walk on \( {\mathbb{Z}^d} \) with \( {2n} \) steps starting and ending at the origin. That is, \( {\gamma=(\gamma_0,\ldots,\gamma_{2n})\in E_{d,n}} \) iff \( {\gamma_0=\gamma_{2n}=0} \) and \( {\gamma_i-\gamma_{i-1}\in\{\pm e_1,\ldots,\pm e_d\}} \) for every \( {1\leq i\leq 2n} \), where \( {e_1,\ldots,e_d} \) is the canonical basis. The uniform law on \( {E_{d,n}} \) is the law induced by the simple random walk \( {{(X_k)}_{k\geq0}} \) on \( {\mathbb{Z}^d} \) conditional on \( {\{X_0=0,X_{2n}=0\}} \). We have already discussed the simulation problem of the uniform law on \( {E_{d,n}} \) in a previous post.

It is quite natural to ask about the simulation of the uniform law on the subset \( {F_{d,n}} \) of \( {E_{d,n}} \) constituted by the paths of the simple random walk on \( {\mathbb{Z}^d} \) with \( {2n} \) steps touching the origin only at times \( {0} \) and \( {2n} \). That is, \( {\gamma=(\gamma_0,\ldots,\gamma_{2n})\in F_{d,n}} \) iff \( {\gamma\in E_{d,n}} \) and \( {\gamma_i\neq0} \) for every \( {1\leq i\leq 2n-1} \). Here again, the uniform law on \( {F_{d,n}} \) is the law induced by the simple random walk \( {{(X_k)}_{k\geq0}} \) on \( {\mathbb{Z}^d} \) conditional on \( {\{X_0=0,X_1\neq0,\ldots,X_{2n-1}\neq0,X_{2n}=0\}} \). We have of course \( {F_{1,n}=E_{1,n}} \) while \( {F_{d,n}\varsubsetneq E_{d,n}} \) if \( {d>1} \).

One may observe that if \( {\gamma\in E_{d,n}} \) and if \( {\left\Vert\gamma_i\right\Vert_1=r} \), then \( {\left\Vert\gamma_{i+1}\right\Vert_1=r\pm 1} \), meaning that \( {\gamma} \) follows a one-dimensional random walk on \( {\ell^1} \) spheres. Actually, when \( {\gamma} \) runs over \( {F_{d,n}} \) then \( {\left\Vert\gamma\right\Vert_1} \) runs over the non negative part of \( {F_{1,n}} \). This observation provides an algorithm producing all the elements of \( {F_{d,n}} \). Unfortunately, this does not help to simulate the uniform law on \( {F_{d,n}} \) since the law of \( {\left\Vert\gamma\right\Vert_1} \) is complicated. This is due to the special role of the corners of the \( {\ell^1} \) spheres (these points with some null coordinates), from which the path has more possibilities to increase his \( {\ell^1} \) norm, and the importance of this phenomenon increases with \( {d} \) (we see here clearly the transcient behavior in action).

The method used to simulate the uniform law on \( {E_{d,n}} \) does not work for the uniform law on \( {F_{d,n}} \). One may of course cook up a Metropolis-Hastings algorithm but this will not produce an exact simulation. The best after all is probably to proceed using the rejection method: simulate random elements of \( {E_{d,n}} \) according to the uniform law on \( {E_{d,n}} \) until we fall into \( {F_{d,n}} \). The number of rejections is then random and follows a geometric distribution with parameter

\[ p_{d,n} =\frac{\mathrm{card}(F_{d,n})}{\mathrm{card}(E_{d,n})} =\frac{\mathbb{P}_0(\tau_d=2n)}{\mathbb{P}_0(X_{2n}=0)} =\mathbb{P}_0(\tau_d=2n|X_{2n}=0) \]

where \( {\tau_d=\inf\{k>0:X_k=0\}} \) and \( {\mathbb{P}_0=\mathbb{P}(\cdot|X_0=0)} \). For a fixed \( {n} \), we have \( {p_{d,n}\rightarrow1} \) as \( {d\rightarrow\infty} \), and this is in favor of the rejection algorithm.

Note. The sequences \( {{(\mathrm{card}(F_{2,n}))}_{n\geq1}} \) and \( {{(\mathrm{card}(F_{3,n}))}_{n\geq1}} \) are known as sequences A054474 and A049037 in The On-Line Encyclopedia of Integer Sequences. There are formulas for the generating function, expressing essentially the fact that the elements of \( {F_{d,n}} \) can be extracted from the elements of \( {E_{d,n}} \). It is unclear that these sequences satisfy to an algebraic recursion.

Note. This post benefited from informal discussions with Charles Bordenave, Mireille Bousquet-Mélou, Pietro Caputo, and Pierre Del Moral at various occasions. The rejection method is also at the heart of the best algorithm for the simulation of the uniform law on the set of \( {d} \)-regular graphs (the so called configuration model based on random permutations).

1 Comment