Convergence of discrete time martingales

Joseph Leo Doob (1910 – 2004) as president of the AMS (1963 – 1964)

It is tempting to think that discrete time martingales are deeper and more elementary than continuous martingales, and that most statements on continuous martingales can be reduced by approximation to statements on discrete time martingales. But the truth is that some statements on continuous martingales can be proved with genuinely continuous methods, which can be more elegant or simpler than discrete methods. The best for a probabilist is probably to be comfortable on both sides and to focus on the probabilistic intuition, contrary to pure analysts! Even if most of the physics of the phenomena is the same, there are specific aspects related to continuities and discontinuities and their links by passage to the limit, which cannot be reduced completely to technical aspects.

This post is a discrete time counterpart of a previous post on the almost sure convergence of martingales. The argument that we have used for a continuous martingale \( {{(M_t)}_{t\geq0}} \) with \( {M_0=0} \) relies on the fact that if for some threshold \( {R} \) we define \( {T=\inf\{t\geq0:|M_t|\geq R\}} \), then \( {|M_T|\leq R} \). Due to a possible jump at time \( {T} \), this is no longer valid when \( {M} \) is discontinuous. In particular, the argument is not valid for discrete time martingales.

In this post, we provide a proof of almost sure convergence of submartingales bounded in \( {\mathrm{L}^1} \), by reduction to the almost sure convergence of nonnegative supermartingales, itself reduced to the convergence of martingales bounded in \( {\mathrm{L}^2} \), which uses the Doob decomposition of adapted integrable processes as well as the Doob maximal inequality. We do not use the Doob stopping theorem (only the germ of it). What is remarkable here is that the whole approach is an alternative to the classical proof from scratch with upcrossings, which goes back to Joseph Leo Doob.

Submartingales bounded in \( {\mathrm{L}^1} \). If \( {{(X_n)}_{n\geq0}} \) is a submartingale bounded in \( {\mathrm{L}^1} \) then there exists \( {X_\infty\in\mathrm{L}^1} \) such that \( {\lim_{n\rightarrow\infty}X_n=X_\infty} \) almost surely.

Proof. The fact that \( {X_\infty\in\mathrm{L}^1} \) follows by the Fatou lemma since

\[ \mathbb{E}(|X_\infty|) =\mathbb{E}(\varliminf_n|X_n|) \leq\varliminf_n\mathbb{E}(|X_n|) \leq\sup_n\mathbb{E}(|X_n|)<\infty. \]

Set \( {C=\sup_n\mathbb{E}(|X_n|)<\infty} \). To get almost sure convergence it suffices to show that

\[ X=Y-Z \]

where \( {{(Y_n)}_{n\geq0}} \) and \( {{(Z_n)}_{n\geq0}} \) are both nonnegative supermartingales and to use the theorem of convergence for nonnegative supermartingales. Since \( {(\bullet)^+=\max(\bullet,0)} \) is convex and nondecreasing, \( {X_n^+=\max(X_n,0)} \) defines a submartingale. Let

\[ X_n^+=X_0^++M_n+A_n \]

be its Doob decomposition. We know that \( {0\leq A_n\nearrow A_\infty} \) as \( {n\rightarrow\infty} \) almost surely where \( {A_\infty} \) takes its values in \( {[0,+\infty]} \). But since \( {\mathbb{E}(A_n)=\mathbb{E}(X_n^+)-\mathbb{E}(X_0^+)\leq C} \), it follows by monotone convergence that \( {\mathbb{E}(A_\infty)\leq C} \). Let us define

\[ Y_n=X_0^++M_n+\mathbb{E}(A_\infty\mid\mathcal{F}_n). \]

The process \( {{(Y_n)}_{n\geq0}} \) is a martingale. It is nonnegative since

\[ Y_n\geq X_0^++M_n+A_n=X_n^+\geq0. \]

Finally \( {Z_n=Y_n-X_n} \) defines a supermartingale as the difference of a martingale and a submartingale, and \( {Z_n\geq X_n^+-X_n=X_n^-\geq0} \).

Nonnegative supermartingales. If \( {{(X_n)}_{n\geq0}} \) is a nonnegative supermartingale then there exists \( {X_\infty} \) taking values in \( {[0,+\infty]} \) such that \( {\lim_{n\rightarrow\infty}X_n=X_\infty} \) almost surely.

Proof. Since \( {\mathrm{e}^{-\bullet}} \) is nonincreasing and convex, the Jensen inequality gives that \( {Y_n=\mathrm{e}^{-X_n}} \) defines a submartingale. Let us write its Doob decomposition

\[ Y_n=Y_0+M_n+A_n \]

where \( {M} \) is a martingale and \( {A} \) is nonnegative and predictable, and \( {M_0=A_0=0} \). We have \( {0\leq A_n\nearrow A_\infty} \) as \( {n\rightarrow\infty} \) almost surely where \( {A_\infty} \) takes its values in \( {[0,+\infty]} \). It suffices now to show that \( {M} \) is a martingale bounded in \( {\mathrm{L}^2} \) and to use the theorem about the convergence of martingales bounded in \( {\mathrm{L}^2} \). The martingale property gives, for all \( {n,m} \), denoting \( {\Delta M_k=M_k-M_{k-1}} \),

\[ \mathbb{E}((M_{n+m}-M_n)^2) =\sum_{k=n+1}^{n+m}\mathbb{E}((\Delta M_k)^2). \]

Let us write \( {Y_n^2=Y_0^2+\sum_{k=1}^n(Y_k^2-Y_{k-1}^2)} \). Since \( {Y_k=Y_{k-1}+\Delta M_k+\Delta A_k} \), we get

\[ Y_n^2=Y_0^2+\sum_{k=1}^n\left[(\Delta M_k)^2+(\Delta A_k)^2+2Y_{k-1}\Delta M_k+2Y_{k-1}\Delta A_k+2\Delta M_k\Delta A_k\right]. \]

Now \( {Y_0^2+\sum_k(\Delta A_k)^2\geq0} \) and \( {2\sum_kY_{k-1}\Delta A_k\geq0} \) since \( {Y\geq0} \) and \( {\Delta A\geq0} \). Thus

\[ \sum_{k=1}^n(\Delta M_k)^2 +2\sum_{k=1}^n(Y_{k-1}+\Delta A_k)\Delta M_k \leq Y_n^2\leq1. \]

At this step, we note that

\[ \mathbb{E}((Y_{k-1}+\Delta A_k)\Delta M_k) =\mathbb{E}((Y_{k-1}+\Delta A_k)\mathbb{E}(\Delta M_k\mid\mathcal{F}_{k-1})) =0. \]

It follows that \( {\mathbb{E}(M_n^2)=\mathbb{E}((M_n-M_0)^2)=\sum_{k=1}^n\mathbb{E}((\Delta M_k)^2)\leq1} \).
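To illustrate the statement, here is a minimal simulation sketch, under assumptions that are ours and not part of the proof: we take the product martingale \( {X_n=U_1\cdots U_n} \) where the \( {U_k} \) are independent and equal to \( {1/2} \) or \( {3/2} \) with probability \( {1/2} \) each. It is a nonnegative martingale, hence a nonnegative supermartingale, and since \( {\mathbb{E}(\log U_k)<0} \) the law of large numbers gives \( {X_n\rightarrow0} \) almost surely, even though \( {\mathbb{E}(X_n)=1} \) for all \( {n} \). The function name, the seed, and the number of steps are arbitrary choices.

```python
import math
import random


def product_martingale_log_path(n_steps, rng):
    """Return log X_0, ..., log X_n for X_n = U_1 * ... * U_n and X_0 = 1.

    The factors U_k are i.i.d., equal to 1/2 or 3/2 with probability 1/2 each,
    so E(U_k) = 1 and (X_n) is a nonnegative martingale (illustrative choice).
    """
    log_x = [0.0]
    for _ in range(n_steps):
        factor = 0.5 if rng.random() < 0.5 else 1.5
        log_x.append(log_x[-1] + math.log(factor))
    return log_x


rng = random.Random(0)  # fixed seed, only for reproducibility
final_log_values = [product_martingale_log_path(2000, rng)[-1] for _ in range(5)]

# E(log U_k) = log(3/4) / 2 < 0, so log X_n drifts to -infinity.
drift = 0.5 * math.log(0.75)
print(drift, final_log_values)
```

We work with \( {\log X_n} \) to avoid floating point underflow. This example also shows that the almost sure convergence of a nonnegative supermartingale does not imply convergence in \( {\mathrm{L}^1} \).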

Martingales bounded in \( {\mathrm{L}^2} \). If \( {{(M_n)}_{n\geq0}} \) is a martingale bounded in \( {\mathrm{L}^2} \), then there exists \( {M_\infty\in\mathrm{L}^2} \) such that \( {\lim_{n\rightarrow\infty}M_n=M_\infty} \) almost surely and in \( {\mathrm{L}^2} \).

Proof. For all \( {n\geq2} \) and all \( {1\leq k<n} \), we have, denoting \( {\Delta M_k=M_k-M_{k-1}} \),

\[ \mathbb{E}(\Delta M_k\Delta M_n) =\mathbb{E}(\mathbb{E}(\Delta M_k\Delta M_n\mid\mathcal{F}_{n-1})) =\mathbb{E}(\Delta M_k\mathbb{E}(\Delta M_n\mid\mathcal{F}_{n-1})) =0. \]

This orthogonality of successive increments gives, for all \( {n,m\geq0} \),

\[ \mathbb{E}((M_{n+m}-M_n)^2) =\sum_{k=n+1}^{n+m}\mathbb{E}((\Delta M_k)^2). \]

In particular, since \( {\sup_{n\geq0}\mathbb{E}(M_n^2)<\infty} \), we get \( {\sup_{n\geq0}\mathbb{E}((M_n-M_0)^2)<\infty} \), and thus \( {\sum_{k\geq1}\mathbb{E}((\Delta M_k)^2)<\infty} \). Moreover \( {{(M_n)}_{n\geq0}} \) is a Cauchy sequence in \( {\mathrm{L}^2} \), and thus it converges in \( {\mathrm{L}^2} \) to some \( {M_\infty\in\mathrm{L}^2} \). It remains to establish almost sure convergence. It suffices to show that \( {{(M_n)}_{n\geq0}} \) is almost surely a Cauchy sequence. Let us define

\[ X_n=\sup_{i,j\geq n}|M_i-M_j|. \]

Now it suffices to show that almost surely \( {\lim_{n\rightarrow\infty}X_n=0} \). Actually \( {0\leq X_n\searrow X_\infty} \) as \( {n\rightarrow\infty} \) almost surely where \( {X_\infty\geq0} \). Hence it suffices to show that \( {\mathbb{E}(X_\infty^2)=0} \) where the square is for computational convenience later on. By monotone convergence it suffices to show that \( {\lim_{n\rightarrow\infty}\mathbb{E}(X_n^2)=0} \). We have \( {X_n\leq 2Y_n} \) where

\[ Y_n=\sup_{k\geq n}|M_k-M_n|. \]

It suffices to show that \( {\lim_{n\rightarrow\infty}\mathbb{E}(Y_n^2)=0} \). But the Doob maximal inequality for the martingale \( {{(M_{n+k}-M_n)}_{k\geq 0}} \) gives

\[ \mathbb{E}(Y_n^2) \leq 4\sup_{k\geq n}\mathbb{E}((M_k-M_n)^2) =4\sum_{k=n+1}^\infty\mathbb{E}((\Delta M_k)^2), \]

and we already know that the right hand side is the remainder of a convergent series!

Finally note that the limit in \( {\mathrm{L}^2} \) and the almost sure limit coincide, either by using uniform integrability and the uniqueness of the limit in \( {\mathrm{L}^2} \), or by extracting an almost surely convergent subsequence from the \( {\mathrm{L}^2} \) convergence and using the uniqueness of the almost sure limit.
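As a numerical illustration, here is a small sketch with an example of our own choosing: \( {M_n=\sum_{k=1}^n\varepsilon_k/k} \) where the \( {\varepsilon_k} \) are independent fair signs. This martingale is bounded in \( {\mathrm{L}^2} \) since \( {\mathbb{E}((\Delta M_k)^2)=1/k^2} \). The code prints, for a few values of \( {n} \), the oscillation \( {\sup_{k\geq n}|M_k-M_n|} \) along one simulated finite path, next to the truncated bound \( {4\sum_{k>n}1/k^2} \) on \( {\mathbb{E}(Y_n^2)} \). The two printed quantities are not directly comparable path by path, since the bound concerns an expectation. The function names, the seed, and the truncation level are assumptions made for the illustration.

```python
import random


def harmonic_sign_martingale(n_steps, rng):
    """Return M_0, ..., M_n with M_n = sum_{k=1}^n eps_k / k, eps_k fair signs."""
    m = [0.0]
    for k in range(1, n_steps + 1):
        eps = 1 if rng.random() < 0.5 else -1
        m.append(m[-1] + eps / k)
    return m


def tail_variance(n, n_max):
    """Truncated tail sum_{k=n+1}^{n_max} E((Delta M_k)^2) = sum of 1/k^2."""
    return sum(1.0 / k**2 for k in range(n + 1, n_max + 1))


def oscillation_after(path, n):
    """max over k >= n of |M_k - M_n| along the simulated (finite) path."""
    return max(abs(x - path[n]) for x in path[n:])


rng = random.Random(1)  # fixed seed, only for reproducibility
path = harmonic_sign_martingale(10_000, rng)
for n in (10, 100, 1000):
    print(n, oscillation_after(path, n), 4 * tail_variance(n, 10_000))
```

The tail sum decreases like \( {1/n} \), which is the quantitative content of the Cauchy property used in the proof.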

Doob maximal inequalities.

If \( {{(X_n)}_{n\geq0}} \) is a nonnegative submartingale then for all \( {n\geq0} \) and all \( {r>0} \),

\[ \mathbb{P}(\max_{0\leq k\leq n}X_k\geq r)\leq\frac{\mathbb{E}(X_n)}{r} \]

If \( {{(M_n)}_{n\geq0}} \) is a martingale then for all \( {n\geq0} \) and all \( {p>1} \),

\[ \mathbb{E}\left(\sup_{0\leq k\leq n}|M_k|^p\right) \leq\left(\frac{p}{p-1}\right)^p\mathbb{E}(|M_n|^p) \]

in particular by monotone convergence we get

\[ \mathbb{E}\left(\sup_{n\geq0}|M_n|^p\right) \leq\left(\frac{p}{p-1}\right)^p\sup_{n\geq0}\mathbb{E}(|M_n|^p). \]

Note that \( {q=p/(p-1)} \) is the Hölder conjugate of \( {p} \). For \( {p=2} \) we get \( {(p/(p-1))^p=4} \).

Proof. For the first inequality, we set \( {T=\inf\{n\geq0:X_n\geq r\}} \). For all \( {k\leq n} \), we have \( {\{T=k\}=\{X_0<r,\ldots,X_{k-1}<r,X_k\geq r\}\in\mathcal{F}_k} \). Also

\[ r\mathbf{1}_{T=k} \leq X_k\mathbf{1}_{T=k} \leq \mathbb{E}(X_n\mid\mathcal{F}_k)\mathbf{1}_{T=k} =\mathbb{E}(X_n\mathbf{1}_{T=k}\mid\mathcal{F}_k) \]

hence

\[ r\mathbb{P}(T=k)\leq\mathbb{E}(X_n\mathbf{1}_{T=k}) \]

and summing over all \( {k\leq n} \) gives

\[ r\mathbb{P}(T\leq n)\leq\mathbb{E}(X_n\mathbf{1}_{T\leq n}). \]

It remains to note that \( {\{\max_{0\leq k\leq n}X_k\geq r\}=\{T\leq n\}} \) to get the first inequality.

For the second inequality, we use the proof of the first part with the nonnegative submartingale \( {{(|M_n|)}_{n\geq0}} \). This gives, for all \( {r>0} \), denoting \( {S_n=\max_{0\leq k\leq n}|M_k|} \),

\[ r\mathbb{P}(S_n\geq r)\leq\mathbb{E}(|M_n|\mathbf{1}_{S_n\geq r}). \]

Now

\[ \int_0^\infty r\mathbb{P}(S_n\geq r)pr^{p-2}\mathrm{d}r \leq\int_0^\infty\mathbb{E}(|M_n|\mathbf{1}_{S_n\geq r})pr^{p-2}\mathrm{d}r. \]

Now by the Fubini–Tonelli theorem, this rewrites

\[ \mathbb{E}\int_0^\infty r\mathbf{1}_{S_n\geq r}pr^{p-2}\mathrm{d}r \leq\mathbb{E}\int_0^\infty|M_n|\mathbf{1}_{S_n\geq r}pr^{p-2}\mathrm{d}r \]

namely

\[ \mathbb{E}\int_0^{S_n} pr^{p-1}\mathrm{d}r \leq\frac{p}{p-1}\mathbb{E}\int_0^{S_n}|M_n|(p-1)r^{p-2}\mathrm{d}r \]

in other words

\[ \mathbb{E}(S_n^p)\leq\frac{p}{p-1}\mathbb{E}(|M_n|S_n^{p-1}). \]

The right hand side is bounded by the Hölder inequality as

\[ \mathbb{E}(|M_n|S_n^{p-1}) \leq\mathbb{E}(|M_n|^p)^{1/p}\mathbb{E}(S_n^p)^{1-1/p}, \]

hence

\[ \mathbb{E}(S_n^p)\leq\left(\frac{p}{p-1}\right)^p\mathbb{E}(|M_n|^p). \]
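Both inequalities can be checked exactly on a small example. The sketch below is not part of the proof. It assumes the simple symmetric random walk \( {M_k=\varepsilon_1+\cdots+\varepsilon_k} \) with fair signs, restricts to integer \( {p\geq2} \) so that exact rational arithmetic applies, and enumerates all \( {2^n} \) sign sequences. The function name and the parameter values are hypothetical choices.

```python
from fractions import Fraction
from itertools import product


def check_doob_inequalities(n, r, p=2):
    """Exact check on the simple random walk M_k = eps_1 + ... + eps_k, k <= n.

    Returns (P(S_n >= r), E|M_n| / r, E(S_n^p), (p/(p-1))^p E|M_n|^p)
    with S_n = max_{0<=k<=n} |M_k|, by enumerating the 2^n sign sequences.
    The integer p >= 2 keeps all computations exact.
    """
    weight = Fraction(1, 2**n)
    prob_max = Fraction(0)
    mean_abs_end = Fraction(0)
    mean_max_p = Fraction(0)
    mean_end_p = Fraction(0)
    for signs in product((-1, 1), repeat=n):
        position, running_max = 0, 0
        for s in signs:
            position += s
            running_max = max(running_max, abs(position))
        if running_max >= r:
            prob_max += weight
        mean_abs_end += weight * abs(position)
        mean_max_p += weight * running_max**p
        mean_end_p += weight * abs(position) ** p
    constant = Fraction(p, p - 1) ** p
    return prob_max, mean_abs_end / r, mean_max_p, constant * mean_end_p


lhs1, rhs1, lhs2, rhs2 = check_doob_inequalities(n=10, r=3)
print(lhs1 <= rhs1, lhs2 <= rhs2)  # True True
```

For \( {p=2} \) and \( {n=10} \) the right hand side of the second inequality equals \( {4\,\mathbb{E}(M_{10}^2)=40} \).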

Doob decomposition. If \( {{(X_n)}_{n\geq0}} \) is adapted, and integrable in the sense that \( {\mathbb{E}(|X_n|)<\infty} \) for all \( {n} \), then there exist a martingale \( {M} \) and a predictable process \( {A} \) such that

\[ M_0=A_0=0\quad\text{and}\quad X=X_0+M+A. \]

Moreover this decomposition is unique. Furthermore if \( {X} \) is a submartingale then \( {A} \) is nondecreasing and there exists \( {A_\infty} \) taking values in \( {[0,+\infty]} \) such that almost surely

\[ 0\leq A_n\underset{n\rightarrow\infty}{\nearrow} A_\infty. \]

Recall that predictable means that \( {A_n} \) is \( {\mathcal{F}_{n-1}} \) measurable for all \( {n\geq1} \).

The process \( {A} \) is the compensator of \( {X} \) in the sense that \( {X-A} \) is a martingale. For a martingale \( {N} \), the compensator of the submartingale \( {X=N^2} \) is the increasing process of \( {N} \).
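As a simple worked example, under assumptions that are ours, take \( {N_n=\varepsilon_1+\cdots+\varepsilon_n} \) with \( {N_0=0} \), where the \( {\varepsilon_k} \) are independent, centered, with common variance \( {\sigma^2<\infty} \), and \( {\mathcal{F}_n} \) is generated by \( {\varepsilon_1,\ldots,\varepsilon_n} \). Since \( {N_{n+1}^2-N_n^2=2N_n\varepsilon_{n+1}+\varepsilon_{n+1}^2} \), we get

\[ \mathbb{E}(N_{n+1}^2-N_n^2\mid\mathcal{F}_n) =2N_n\mathbb{E}(\varepsilon_{n+1})+\mathbb{E}(\varepsilon_{n+1}^2) =\sigma^2. \]

Hence \( {{(N_n^2-n\sigma^2)}_{n\geq0}} \) is a martingale, and by uniqueness the compensator of \( {N^2} \) is the deterministic process \( {A_n=n\sigma^2} \).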

There is a continuous time analogue known as the Doob–Meyer decomposition.

Proof. Note that \( {A} \) is necessarily integrable too. For the uniqueness, if \( {X=X_0+M+A} \) then

\[ \mathbb{E}(X_{n+1}-X_n\mid\mathcal{F}_n)=A_{n+1}-A_n, \]

and since \( {A_0=0} \) we get, for all \( {n\geq1} \),

\[ A_n=\sum_{k=0}^{n-1}\mathbb{E}(X_{k+1}-X_k\mid\mathcal{F}_k), \]

and \( {M_n=X_n-X_0-A_n} \). For the existence, we set \( {A_0=M_0=0} \) and we use the formulas above to define \( {A_n} \) and \( {M_n} \) for all \( {n\geq1} \). Since \( {X} \) is adapted, \( {A_{n+1}} \) and \( {M_n} \) are \( {\mathcal{F}_n} \) measurable. By definition \( {A_n} \) is integrable and since \( {X_n} \) is integrable we also have that \( {M_n} \) is integrable. Moreover \( {\mathbb{E}(M_{n+1}-M_n\mid\mathcal{F}_n)=0} \) because

\[ M_{n+1}-M_n =X_{n+1}-X_n-(A_{n+1}-A_n) =X_{n+1}-X_n-\mathbb{E}(X_{n+1}-X_n\mid\mathcal{F}_n). \]

Finally, when \( {X} \) is a submartingale then for all \( {n\geq0} \) we have

\[ \begin{array}{rcl} A_{n+1}-A_n &=&\mathbb{E}(A_{n+1}-A_n\mid\mathcal{F}_n)\\ &=&\mathbb{E}(X_{n+1}-X_n\mid\mathcal{F}_n) -\mathbb{E}(M_{n+1}-M_n\mid\mathcal{F}_n)\\ &=&\mathbb{E}(X_{n+1}-X_n\mid\mathcal{F}_n) \geq0. \end{array} \]
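The formula for \( {A_n} \) is easy to turn into code once the conditional increments are known. Here is a minimal sketch, with a hypothetical helper `doob_decomposition` that takes the conditional expected increment as a user-supplied function, since this quantity depends on the model. We apply it to \( {X_n=|S_n|} \) where \( {S} \) is the simple symmetric random walk started at \( {0} \), for which \( {\mathbb{E}(|S_{k+1}|-|S_k|\mid\mathcal{F}_k)=\mathbf{1}_{S_k=0}} \).

```python
def doob_decomposition(x_path, conditional_increment):
    """Doob decomposition X = X_0 + M + A along one path.

    x_path: values X_0, ..., X_n along the path.
    conditional_increment(k, x_path): value of E(X_{k+1} - X_k | F_k) on this
    path; it must be supplied by the user since it depends on the model.
    Returns the lists (M_0, ..., M_n) and (A_0, ..., A_n).
    """
    a = [0.0]
    for k in range(len(x_path) - 1):
        a.append(a[-1] + conditional_increment(k, x_path))
    m = [x - x_path[0] - a_k for x, a_k in zip(x_path, a)]
    return m, a


# Example: X_n = |S_n| for the simple symmetric random walk S started at 0.
# Then E(|S_{k+1}| - |S_k| | F_k) equals 1 if S_k = 0 and 0 otherwise.
steps = [1, -1, 1, 1]
s = [0]
for step in steps:
    s.append(s[-1] + step)
x = [abs(v) for v in s]
m, a = doob_decomposition(x, lambda k, path: 1.0 if path[k] == 0 else 0.0)
print(a)  # [0.0, 1.0, 1.0, 2.0, 2.0]
print(m)  # [0.0, 0.0, -1.0, -1.0, 0.0]
```

In this example \( {A_n} \) counts the visits of \( {S} \) to \( {0} \) before time \( {n} \), which is nondecreasing as expected for a submartingale.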

Curiosity. In the special case of nonnegative martingales bounded in \( {L\log L} \), there is an information theoretic argument due to Andrew R. Barron that somewhat resembles the one that we have used for continuous martingales in a previous post. This is written in an apparently unpublished document available online.

Thanks. This post benefited from discussions with Nicolas Fournier and Nathaël Gozlan.
