Press "Enter" to skip to content

The ballot theorem

A random walk loop of length 1000

The classical ballot theorem states that in an election with two candidates \( {A} \) and \( {B} \) in which \( {A} \) obtains \( {a} \) votes and B obtains \( {b \leq a} \) votes, the probability that \( {A} \) stays above \( {B} \) during the counting of the votes is \( {(a-b)/(a+b)} \). In probabilistic terms, setting \( {n=a+b} \) and \( {k=a-b} \), if \( {{(S_n)}_{n\geq0}} \) is a simple random walk on \( {\mathbb{Z}} \), i.e. \( {(S_{n+1}-S_n)_{n\geq0}} \) are i.i.d. \( {\pm1} \) Bernoulli r.v. taking the value \( {+1} \) with probability \( {p\in(0,1)} \) and \( {-1} \) with probability \( {1-p} \), then

\[ \mathbb{P}(S_1>0,\ldots,S_n>0|S_0=0,S_n=k)=\frac{k}{n}. \]

A proof. Let \( {P_{n,k}} \) be the set of paths of length \( {n} \) of the simple random walk starting from \( {0} \) and ending at \( {k} \), and let \( {P^+_{n,k}} \) be the set of such paths staying positive. Note that \( {0\leq k\leq n} \) and \( {n-k} \) is odd. We have

\[ \mathbb{P}(S_1>0,\ldots,S_n>0|S_0=0,S_n=k) =|P^+_{n,k}|\frac{p^a(1-p)^b}{\mathbb{P}(S_n=k|S_0=0)}. \]

On the other hand, \( {\mathbb{P}(S_n=k|S_0=0)=|P_{n,k}|p^a(1-p)^b} \) and thus

\[ \mathbb{P}(S_1>0,\ldots,S_n>0|S_0=0,S_n=k) =\frac{|P^+_{n,k}|}{|P_{n,k}|} \]

which remarkably does not depend on \( {p} \). We also have

\[ |P_{n,k}|=\binom{a+b}{a}=\binom{n}{(n+k)/2} \]

and in particular \( {|P_{n+1,k+1}|=|P_{n,k+2}|+|P_{n,k}|} \). The ballot theorem states that

\[ |P^+_{n,k}|=\frac{k}{n}|P_{n,k}|=\frac{a-b}{a+b}\binom{a+b}{a}. \]

To prove the ballot theorem, one may observe by a simple analysis of the last step that \( {|P^+_{n,k}|} \) satisfies to the same recurrence relation as \( {|P_{n,k}|} \), namely

\[ |P^+_{n+1,k+1}|=|P^+_{n,k+2}|+|P^+_{n,k}|. \]

This leads by induction on \( {n} \) to \( {n|P^+_{n,k}|=k|P_{n,k}|} \), and we are done.

A combinatorial proof. There is actually an elegant proof of the ballot theorem due to Désiré André. Namely, \( {P_{n-1,k-1}} \) is in bijection with the set of elements of \( {P_{n,k}} \) starting with an increment \( {-1} \). On the other hand, this set is itself in bijection with the set of elements of \( {P_{n,k}\setminus P^+_{n,k}} \) starting with an increment \( {+1} \) (it suffices to reflect the path before reaching zero). It follows then that \( {|P_{n,k}|-|P^+_{n,k}|=2|P_{n-1,k-1}|} \), and thus \( {|P^+_{n,k}|=(k/n)|P_{n,k}|} \). This combinatorial trick provides also the famous Catalan number formula:

\[ \mathbb{P}(S_1>0,\ldots,S_{2n-1}>0|S_0=0,S_{2n}=0)=\frac{1}{n+1}\binom{2n}{n}. \]

Universal statement. Set \( {p=1/2} \). It can be shown that for every \( {k\in\mathbb{Z}} \),

\[ \mathbb{P}(S_n=k|S_0=0)=n^{-1/2}\varphi(n^{-1/2}k)+o(n^{-1/2}) \]

where \( {\varphi(t)=(2\pi)^{-1/2}e^{-\frac{1}{2}t^2}} \) is the density of the standard Gaussian law \( {\mathcal{N}(0,1)} \). This follows for instance from the de Moivre-Laplace theorem, which is a local central limit theorem for Bernoulli random variables. As a consequence, from the ballot theorem, for some constants \( {C_2\geq C_1>0} \), as \( {n\gg1} \) with \( {k=\mathcal{O}(\sqrt{n})} \),

\[ C_1\frac{k}{n^{3/2}} \leq \mathbb{P}(L_n(0)=0,S_n=k|S_0=0) \leq C_2\frac{k}{n^{3/2}}. \]

where \( {L_n(x)=\mathbf{1}_{\{S_1=x\}}+\cdots+\mathbf{1}_{\{S_n=x\}}} \) is the local time in \( {x} \) up to time \( {n} \). This type of statement can be extended to sums of i.i.d. lattice random variables, and even to more general random variables.

Some further reading. The ballot theorem goes back at least to Bertrand (1887), and is discussed in the Probability Theory classical books of Feller and of Shiryaev for instance. There are many variants and extensions, see for example the article Ballot theorems, old and new, by Addario-Berry and Reed (2007).

    Leave a Reply

    Your email address will not be published.

    This site uses Akismet to reduce spam. Learn how your comment data is processed.

    Syntax · Style · .