
Libres pensées d'un mathématicien ordinaire

Time regained

Paul Magendie – La cage ouverte.

This year of crisis will at least have given some people time. That is how I was able to read even more often than usual. Here are a few readings related to current events:

For my part, I modestly told myself, in March 2020, that the measures imposed on the population were too poorly targeted, then in early May 2020 that youth had been sacrificed for the benefit of old age, which died anyway. The somewhat silly wait for vaccines to save the world seemed to me to miss important societal issues such as our relationship to death and to the end of life. But I quickly realized how socially unacceptable these thoughts were, including at the university, and that a collective digestion would take time. To this day, I have not changed my analysis. Besides, on the professional side, I took the time to express myself, as a mathematician, on mathematical modeling and on quantitative analysis. Too few mathematicians question their discipline.

To conclude, and speaking of miserable years, do you know the story of the year without a summer?


RNBM, zbMATH, …

Since January, and for the next four years, I am serving as the scientific director of the French network of mathematical libraries (RNBM: réseau national des bibliothèques de mathématiques). My principal collaborator for this mission is Elisabeth Kneller from CNRS, the librarian at the head of the network. My predecessor is Frédéric Hélein. The RNBM has several projects. In particular, the RNBM encourages colleagues and departments to favor zbMATH over MathSciNet.

 


Fisher information between two Gaussians

Ronald A. Fisher (1890 – 1962) in 1951.

Fisher information. The Fisher information or divergence of a positive Borel measure \( {\nu} \) with respect to another one \( {\mu} \) on the same space is

\[ \mathrm{Fisher}(\nu\mid\mu) =\int\left|\nabla\log\textstyle\frac{\mathrm{d}\nu}{\mathrm{d}\mu}\right|^2\mathrm{d}\nu =\int\frac{|\nabla\frac{\mathrm{d}\nu}{\mathrm{d}\mu}|^2}{\frac{\mathrm{d}\nu}{\mathrm{d}\mu}}\mathrm{d}\mu =4\int\left|\nabla\sqrt{\textstyle\frac{\mathrm{d}\nu}{\mathrm{d}\mu}}\right|^2\mathrm{d}\mu \in[0,+\infty] \]

if \( {\nu} \) is absolutely continuous with respect to \( {\mu} \), and \( {\mathrm{Fisher}(\nu\mid\mu)=+\infty} \) otherwise.

It plays a role in the analysis and geometry of statistics, information, partial differential equations, and Markov diffusion stochastic processes. It is named after Ronald Aylmer Fisher (1890 – 1962), a British scientist who is also the Fisher of many other objects and concepts including for instance:

However, he should not be confused with for instance:

Let us denote \( {|x|=\sqrt{x_1^2+\cdots+x_n^2}} \) and \( {x\cdot y=x_1y_1+\cdots+x_ny_n} \) for all \( {x,y\in\mathbb{R}^n} \).

Explicit formula for Gaussians. For all \( {n\geq1} \), all vectors \( {m_1,m_2\in\mathbb{R}^n} \), and all \( {n\times n} \) covariance matrices \( {\Sigma_1} \) and \( {\Sigma_2} \), we have

\[ \mathrm{Fisher}(\mathcal{N}(m_1,\Sigma_1)\mid\mathcal{N}(m_2,\Sigma_2)) =|\Sigma_2^{-1}(m_1-m_2)|^2+\mathrm{Tr}(\Sigma_2^{-2}\Sigma_1-2\Sigma_2^{-1}+\Sigma_1^{-1}). \]

When \( {\Sigma_1} \) and \( {\Sigma_2} \) commute, this reduces to the following, closer to the univariate case,

\[ \mathrm{Fisher}(\mathcal{N}(m_1,\Sigma_1)\mid\mathcal{N}(m_2,\Sigma_2)) =|\Sigma_2^{-1}(m_1-m_2)|^2+\mathrm{Tr}(\Sigma_2^{-2}(\Sigma_2-\Sigma_1)^2\Sigma_1^{-1}). \]

In the univariate case, this reads, for all \( {m_1,m_2\in\mathbb{R}} \) and \( {\sigma_1^2,\sigma_2^2\in(0,\infty)} \),

\[ \mathrm{Fisher}(\mathcal{N}(m_1,\sigma_1^2)\mid\mathcal{N}(m_2,\sigma_2^2)) =\frac{(m_1-m_2)^2}{\sigma_2^4}+\frac{(\sigma_2^2-\sigma_1^2)^2}{\sigma_1^2\sigma_2^4}. \]
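As a sanity check, the explicit formula with \( {n=1} \) can be compared with a direct numerical integration of the definition. The sketch below is only illustrative: the function names, the midpoint rule, and the truncation of the integral to twelve standard deviations are assumptions of this example, not part of the argument.

```python
import math


def fisher_gauss_1d(m1, v1, m2, v2):
    """Closed form for Fisher(N(m1, v1) | N(m2, v2)) with variances v1, v2 > 0.

    This is |S2^{-1}(m1 - m2)|^2 + Tr(S2^{-2} S1 - 2 S2^{-1} + S1^{-1})
    specialised to the 1x1 matrices S1 = v1 and S2 = v2.
    """
    return ((m1 - m2) / v2) ** 2 + v1 / v2**2 - 2.0 / v2 + 1.0 / v1


def fisher_quadrature_1d(m1, v1, m2, v2, half_width=12.0, steps=20_000):
    """Midpoint-rule approximation of the integral of |(log dnu/dmu)'|^2 dnu."""
    s1 = math.sqrt(v1)
    lower = m1 - half_width * s1
    dx = 2.0 * half_width * s1 / steps
    total = 0.0
    for k in range(steps):
        x = lower + (k + 0.5) * dx
        # Derivative in x of log(dnu/dmu) for nu = N(m1, v1) and mu = N(m2, v2).
        score = -(x - m1) / v1 + (x - m2) / v2
        density = math.exp(-((x - m1) ** 2) / (2.0 * v1)) / math.sqrt(2.0 * math.pi * v1)
        total += score * score * density * dx
    return total


if __name__ == "__main__":
    print(fisher_gauss_1d(1.0, 2.0, -0.5, 3.0))
    print(fisher_quadrature_1d(1.0, 2.0, -0.5, 3.0))
```

The two printed numbers should agree to many digits, since the integrand is smooth and decays like a Gaussian.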

A proof. If \( {X\sim\mathcal{N}(m,\Sigma)} \) then, for all \( {1\leq i,j\leq n} \),

\[ \mathbb{E}(X_iX_j)=\Sigma_{ij}+m_im_j, \]

hence, for all \( {n\times n} \) symmetric matrices \( {A} \) and \( {B} \),

\[ \begin{array}{rcl} \mathbb{E}(AX\cdot BX) &=&\mathbb{E}\sum_{i,j,k=1}^nA_{ij}X_jB_{ik}X_k\\ &=&\sum_{i,j,k=1}^nA_{ij}B_{ik}\mathbb{E}(X_jX_k)\\ &=&\sum_{i,j,k=1}^nA_{ij}B_{ik}(\Sigma_{jk}+m_jm_k)\\ &=&\mathrm{Trace}(A\Sigma B)+Am\cdot Bm, \end{array} \]

and thus for all \( {n} \)-dimensional vectors \( {a} \) and \( {b} \),

\[ \begin{array}{rcl} \mathbb{E}(A(X-a)\cdot B(X-b)) &=&\mathbb{E}(AX\cdot BX)+A(m-a)\cdot B(m-b)-Am\cdot Bm\\ &=&\mathrm{Trace}(A\Sigma B)+A(m-a)\cdot B(m-b). \end{array} \]
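The two identities above are purely algebraic once \( {\mathbb{E}((X_j-a_j)(X_k-b_k))=\Sigma_{jk}+(m_j-a_j)(m_k-b_k)} \) is known, so they can be checked numerically by comparing the triple sum with the matrix expression. The following is a minimal sketch with hypothetical helper names and small hand-picked symmetric matrices; taking \( {a=b=0} \) recovers the first identity.

```python
def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]


def matvec(p, v):
    """Product of a matrix with a vector."""
    return [sum(p[i][j] * v[j] for j in range(len(v))) for i in range(len(p))]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def trace(p):
    return sum(p[i][i] for i in range(len(p)))


def expectation_index_sum(A, B, S, m, a, b):
    """Triple sum of A_ij B_ik (S_jk + (m_j - a_j)(m_k - b_k))."""
    n = len(m)
    return sum(A[i][j] * B[i][k] * (S[j][k] + (m[j] - a[j]) * (m[k] - b[k]))
               for i in range(n) for j in range(n) for k in range(n))


def expectation_matrix_form(A, B, S, m, a, b):
    """Trace(A S B) + A(m - a) . B(m - b), valid for symmetric B."""
    ma = [mj - aj for mj, aj in zip(m, a)]
    mb = [mj - bj for mj, bj in zip(m, b)]
    return trace(matmul(matmul(A, S), B)) + dot(matvec(A, ma), matvec(B, mb))


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, 3.0]]
    B = [[1.0, -1.0], [-1.0, 4.0]]
    S = [[2.0, 0.5], [0.5, 1.0]]
    m, a, b = [1.0, -2.0], [0.5, 0.0], [0.0, 1.0]
    print(expectation_index_sum(A, B, S, m, a, b))
    print(expectation_matrix_form(A, B, S, m, a, b))
```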

Now, with \( {\Gamma_i=\mathcal{N}(m_i,\Sigma_i)} \), and using the notation \( {q_i(x)=\Sigma_i^{-1}(x-m_i)\cdot(x-m_i)} \) and \( {|\Sigma_i|=\det(\Sigma_i)} \),

\[ \begin{array}{rcl} \mathrm{Fisher}(\Gamma_1\mid\Gamma_2) &=&\displaystyle4\frac{\sqrt{|\Sigma_2|}}{\sqrt{|\Sigma_1|}}\int\Bigl|\nabla\mathrm{e}^{-\frac{q_1(x)}{4}+\frac{q_2(x)}{4}}\Bigr|^2\frac{\mathrm{e}^{-\frac{q_2(x)}{2}}}{\sqrt{(2\pi)^n|\Sigma_2|}}\mathrm{d}x\\ &=&\displaystyle\int|\Sigma_2^{-1}(x-m_2)-\Sigma_1^{-1}(x-m_1)|^2\frac{\mathrm{e}^{-\frac{q_1(x)}{2}}}{\sqrt{(2\pi)^n|\Sigma_1|}}\mathrm{d}x\\ &=&\displaystyle\int(|\Sigma_2^{-1}(x-m_2)|^2\\ &&\qquad-2\Sigma_2^{-1}(x-m_2)\cdot\Sigma_1^{-1}(x-m_1)\\ &&\qquad+|\Sigma_1^{-1}(x-m_1)|^2)\frac{\mathrm{e}^{-\frac{q_1(x)}{2}}}{\sqrt{(2\pi)^n|\Sigma_1|}}\mathrm{d}x\\ &=&\mathrm{Trace}(\Sigma_2^{-1}\Sigma_1\Sigma_2^{-1})+|\Sigma_2^{-1}(m_1-m_2)|^2-2\mathrm{Trace}(\Sigma_2^{-1})+\mathrm{Trace}(\Sigma_1^{-1})\\ &=&\mathrm{Trace}(\Sigma_2^{-2}\Sigma_1-2\Sigma_2^{-1}+\Sigma_1^{-1})+|\Sigma_2^{-1}(m_1-m_2)|^2. \end{array} \]

The formula when \( {\Sigma_1\Sigma_2=\Sigma_2\Sigma_1} \) follows immediately.
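To spell out this last step (a short verification, using only that commuting matrices also commute with each other's inverses), one can expand the square:

\[ \Sigma_2^{-2}(\Sigma_2-\Sigma_1)^2\Sigma_1^{-1} =\Sigma_2^{-2}(\Sigma_2^2-2\Sigma_2\Sigma_1+\Sigma_1^2)\Sigma_1^{-1} =\Sigma_1^{-1}-2\Sigma_2^{-1}+\Sigma_2^{-2}\Sigma_1, \]

and taking the trace of both sides gives the commuting formula.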

Other distances. Recall that the Hellinger distance between probability measures \( {\mu} \) and \( {\nu} \) with densities \( {f_\mu} \) and \( {f_\nu} \) with respect to the same reference measure \( {\lambda} \) is

\[ \mathrm{Hellinger}(\mu,\nu) =\Bigr(\int(\sqrt{f_\mu}-\sqrt{f_\nu})^2\mathrm{d}\lambda\Bigr)^{1/2} =\Bigr(2-2\int\sqrt{f_\mu f_\nu}\mathrm{d}\lambda\Bigr)^{1/2} \in[0,\sqrt{2}]. \]

This quantity does not depend on the choice of \( {\lambda} \).

The Kullback–Leibler divergence or relative entropy is defined by

\[ \mathrm{Kullback}(\nu\mid\mu) =\int\log{\textstyle\frac{\mathrm{d}\nu}{\mathrm{d}\mu}}\mathrm{d}\nu =\int{\textstyle\frac{\mathrm{d}\nu}{\mathrm{d}\mu}} \log{\textstyle\frac{\mathrm{d}\nu}{\mathrm{d}\mu}}\mathrm{d}\mu \in[0,+\infty] \ \ \ \ \ (1) \]

if \( {\nu} \) is absolutely continuous with respect to \( {\mu} \), and \( {\mathrm{Kullback}(\nu\mid\mu)=+\infty} \) otherwise.

The Wasserstein–Kantorovich–Monge transportation distance of order \( {2} \) with respect to the underlying Euclidean distance is defined for all probability measures \( {\mu} \) and \( {\nu} \) on \( {\mathbb{R}^n} \) by

\[ \mathrm{Wasserstein}(\mu,\nu)=\Bigr(\inf_{(X,Y)}\mathbb{E}(\left|X-Y\right|^2)\Bigr)^{1/2} \in[0,+\infty] \ \ \ \ \ (2) \]

where the inf runs over all couples \( {(X,Y)} \) with \( {X\sim\mu} \) and \( {Y\sim\nu} \).

Now, for all \( {n\geq1} \), \( {m_1,m_2\in\mathbb{R}^n} \), and all \( {n\times n} \) covariance matrices \( {\Sigma_1,\Sigma_2} \), denoting

\[ \Gamma_1=\mathcal{N}(m_1,\Sigma_1) \quad\mbox{and}\quad \Gamma_2=\mathcal{N}(m_2,\Sigma_2), \]

we have

\[ \begin{array}{rcl} \mathrm{Hellinger}^2(\Gamma_1,\Gamma_2) &=&2-2\frac{\det(\Sigma_1\Sigma_2)^{1/4}}{\det(\frac{\Sigma_1+\Sigma_2}{2})^{1/2}}\mathrm{exp}\Bigr(-\frac{1}{4}(\Sigma_1+\Sigma_2)^{-1}(m_2-m_1)\cdot(m_2-m_1)\Bigr),\\ 2\mathrm{Kullback}(\Gamma_1\mid\Gamma_2) &=&\Sigma_2^{-1}(m_1-m_2)\cdot(m_1-m_2)+\mathrm{Tr}(\Sigma_2^{-1}\Sigma_1-\mathrm{Id})+\log\det(\Sigma_2\Sigma_1^{-1}),\\ \mathrm{Fisher}(\Gamma_1\mid\Gamma_2) &=&|\Sigma_2^{-1}(m_1-m_2)|^2+\mathrm{Tr}(\Sigma_2^{-2}\Sigma_1-2\Sigma_2^{-1}+\Sigma_1^{-1}),\\ \mathrm{Wasserstein}^2(\Gamma_1,\Gamma_2) &=&|m_1-m_2|^2+\mathrm{Tr}\Bigr(\Sigma_1+\Sigma_2-2\sqrt{\sqrt{\Sigma_1}\Sigma_2\sqrt{\Sigma_1}}\Bigr), \end{array} \]

and if \( {\Sigma_1} \) and \( {\Sigma_2} \) commute, \( {\Sigma_1\Sigma_2=\Sigma_2\Sigma_1} \), then we find the simpler formulas

\[ \begin{array}{rcl} \mathrm{Fisher}(\Gamma_1\mid\Gamma_2) &=&|\Sigma_2^{-1}(m_1-m_2)|^2+\mathrm{Tr}(\Sigma_2^{-2}(\Sigma_2-\Sigma_1)^2\Sigma_1^{-1})\\ \mathrm{Wasserstein}^2(\Gamma_1,\Gamma_2) &=&|m_1-m_2|^2+\mathrm{Tr}((\sqrt{\Sigma_1}-\sqrt{\Sigma_2})^2). \end{array} \]
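For diagonal covariance matrices, which always commute, all four formulas reduce to sums over coordinates and are easy to evaluate. The following sketch assumes this diagonal case only; the function names are hypothetical, and each Gaussian is described by its mean vector and the list of its diagonal variances.

```python
import math


def hellinger2_diag(m1, d1, m2, d2):
    """Squared Hellinger distance between N(m1, diag(d1)) and N(m2, diag(d2))."""
    log_det_ratio = sum(0.25 * math.log(a * b) - 0.5 * math.log(0.5 * (a + b))
                        for a, b in zip(d1, d2))
    quad = sum((y - x) ** 2 / (a + b) for x, a, y, b in zip(m1, d1, m2, d2))
    return 2.0 - 2.0 * math.exp(log_det_ratio - 0.25 * quad)


def kullback_diag(m1, d1, m2, d2):
    """Kullback(N(m1, diag(d1)) | N(m2, diag(d2)))."""
    twice = sum((x - y) ** 2 / b + a / b - 1.0 + math.log(b / a)
                for x, a, y, b in zip(m1, d1, m2, d2))
    return 0.5 * twice


def fisher_diag(m1, d1, m2, d2):
    """Fisher(N(m1, diag(d1)) | N(m2, diag(d2))), commuting formula."""
    return sum(((x - y) / b) ** 2 + (b - a) ** 2 / (a * b * b)
               for x, a, y, b in zip(m1, d1, m2, d2))


def wasserstein2_diag(m1, d1, m2, d2):
    """Squared Wasserstein distance of order 2, commuting formula."""
    return sum((x - y) ** 2 + (math.sqrt(a) - math.sqrt(b)) ** 2
               for x, a, y, b in zip(m1, d1, m2, d2))


if __name__ == "__main__":
    args = ([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0])
    print(hellinger2_diag(*args), kullback_diag(*args),
          fisher_diag(*args), wasserstein2_diag(*args))
```

For two Gaussians with identity covariance whose means differ by a unit vector, as in the usage lines, the Kullback divergence is \( {1/2} \) while the Fisher information and the squared Wasserstein distance are both equal to \( {1} \).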

Fisher as an infinitesimal Kullback. The Boltzmann–Shannon entropy is in a sense the opposite of the Kullback divergence with respect to the Lebesgue measure \( {\lambda} \), namely

\[ \mathrm{Entropy}(\mu) =-\int\frac{\mathrm{d}\mu}{\mathrm{d}\lambda} \log\frac{\mathrm{d}\mu}{\mathrm{d}\lambda}\mathrm{d}\lambda =-\mathrm{Kullback}(\mu\mid\lambda). \]

It was discovered by Nicolaas Govert de Bruijn (1918 – 2012) that the Fisher information appears as the differential version of the entropy under Gaussian noise. More precisely, the de Bruijn identity states that if \( {X} \) is a random vector of \( {\mathbb{R}^n} \) with finite entropy and if \( {Z\sim\mathcal{N}(0,I_n)} \) is independent of \( {X} \) then

\[ \frac{\mathrm{d}}{\mathrm{d}t}\Bigr\vert_{t=0} \mathrm{Kullback}(\mathrm{Law}(X+\sqrt{t}Z)\mid\lambda) =-\frac{1}{2}\mathrm{Fisher}(\mathrm{Law}(X)\mid\lambda). \]

In other words, if \( {\mu_t} \) is the law at time \( {t} \) of a standard \( {n} \)-dimensional Brownian motion, with generator \( {\frac{1}{2}\Delta} \), started from a random initial condition \( {X} \) then

\[ \frac{\mathrm{d}}{\mathrm{d}t}\Bigr\vert_{t=0} \mathrm{Kullback}(\mu_t\mid\lambda) =-\frac{1}{2}\mathrm{Fisher}(\mu_0\mid\lambda). \]

The Lebesgue measure is the invariant (and reversible) measure of Brownian motion. More generally, let us consider the stochastic differential equation

\[ \mathrm{d}X_t=\sqrt{2}\mathrm{d}B_t-\nabla V(X_t)\mathrm{d}t \]

on \( {\mathbb{R}^n} \), where \( {V:\mathbb{R}^n\to\mathbb{R}} \) is \( {\mathcal{C}^2} \) and where \( {{(B_t)}_{t\geq0}} \) is a standard Brownian motion. If we assume that \( {V-\frac{\rho}{2}\left|\cdot\right|^2} \) is convex for some \( {\rho\in\mathbb{R}} \) then the equation admits a solution \( {{(X_t)}_{t\geq0}} \) known as the overdamped Langevin process, which is a Markov diffusion process. If we further assume that \( {\mathrm{e}^{-V}} \) is integrable with respect to the Lebesgue measure, then the probability measure \( {\mu} \) with density proportional to \( {\mathrm{e}^{-V}} \) is invariant and reversible. Now, denoting \( {\mu_t=\mathrm{Law}(X_t)} \), the analogue of the de Bruijn identity reads, for all \( {t\geq0} \),

\[ \frac{\mathrm{d}}{\mathrm{d}t} \mathrm{Kullback}(\mu_t\mid\mu) =-\mathrm{Fisher}(\mu_t\mid\mu) \]

but this requires that \( {\mu_0} \) is chosen in such a way that \( {t\mapsto\mathrm{Kullback}(\mu_t\mid\mu)} \) is well defined and differentiable. This condition is easily checked in the example of the Ornstein–Uhlenbeck process which corresponds to \( {V=\frac{1}{2}\left|\cdot\right|^2} \) and for which \( {\mu=\mathcal{N}(0,I_n)} \).

Ornstein–Uhlenbeck. If \( {{(X_t^x)}_{t\geq0}} \) is an \( {n} \)-dimensional Ornstein–Uhlenbeck process solution of the stochastic differential equation

\[ X_0^x=x\in\mathbb{R}^n, \quad\mathrm{d}X^x_t=\sqrt{2}\mathrm{d}B_t-X^x_t\mathrm{d}t \]

where \( {{(B_t)}_{t\geq0}} \) is a standard \( {n} \)-dimensional Brownian motion, then the invariant law is \( {\gamma=\mathcal{N}(0,I_n)} \) and the Mehler formula reads

\[ X^x_t=x\mathrm{e}^{-t}+\sqrt{2}\int_0^t\mathrm{e}^{s-t}\mathrm{d}B_s\sim\mathcal{N}(x\mathrm{e}^{-t},(1-\mathrm{e}^{-2t})I_n), \]

and the explicit formula for the Fisher information for Gaussians gives

\[ \mathrm{Fisher}(\mathrm{Law}(X^x_t)\mid\gamma) =\mathrm{Fisher}(\mathcal{N}(x\mathrm{e}^{-t},(1-\mathrm{e}^{-2t})I_n)\mid\gamma) =|x|^2\mathrm{e}^{-2t}+n\frac{\mathrm{e}^{-4t}}{1-\mathrm{e}^{-2t}}. \]
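Since both the Kullback divergence and the Fisher information are explicit along the Ornstein–Uhlenbeck flow, the identity \( {\frac{\mathrm{d}}{\mathrm{d}t}\mathrm{Kullback}(\mu_t\mid\gamma)=-\mathrm{Fisher}(\mu_t\mid\gamma)} \) can be checked numerically for \( {t>0} \). The sketch below does this with a central finite difference; the function names and the step size are assumptions of this example.

```python
import math


def ou_law(x, t):
    """Mean vector and common variance of Law(X_t^x) = N(x e^{-t}, (1 - e^{-2t}) I_n)."""
    return [xi * math.exp(-t) for xi in x], 1.0 - math.exp(-2.0 * t)


def kullback_to_standard(mean, var):
    """Kullback(N(mean, var I_n) | N(0, I_n)) from the Gaussian formula."""
    n = len(mean)
    return 0.5 * (sum(c * c for c in mean) + n * (var - 1.0) - n * math.log(var))


def fisher_to_standard(mean, var):
    """Fisher(N(mean, var I_n) | N(0, I_n)) from the Gaussian formula."""
    n = len(mean)
    return sum(c * c for c in mean) + n * (1.0 - var) ** 2 / var


def kullback_derivative(x, t, h=1e-5):
    """Central finite difference of t -> Kullback(Law(X_t^x) | gamma)."""
    plus = kullback_to_standard(*ou_law(x, t + h))
    minus = kullback_to_standard(*ou_law(x, t - h))
    return (plus - minus) / (2.0 * h)


if __name__ == "__main__":
    x, t = [1.0, -2.0], 0.7
    print(kullback_derivative(x, t))
    print(-fisher_to_standard(*ou_law(x, t)))
```

The two printed values should coincide up to the finite difference error. The check is only meaningful for \( {t>0} \), since the law at time \( {0} \) is a Dirac mass.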


Back to basics: the Dubins-Schwarz theorem

Lester Eli Dubins (1920-2010)

The Dubins-Schwarz theorem is an important result of stochastic calculus. It states essentially that continuous local martingales, and in particular continuous martingales, are time-changed Brownian motions. It is named after the American mathematician Lester Dubins (1920-2010) and the Israeli mathematician and statistician Gideon E. Schwarz (1933-2007), who is also at the origin of the Bayesian information criterion (BIC) in statistics. He is neither the famous German mathematician Karl Hermann Amandus Schwarz (1843-1921) nor the famous French mathematician Laurent Schwartz (1915-2002). The Dubins-Schwarz theorem was also discovered independently by the Russian mathematician K.È. Dambis, who apparently published a single article, in Russian, in 1965, the same year as the paper by Dubins and Schwarz.

Dubins-Schwarz theorem. Let \( {M} \) be a continuous local martingale with respect to a filtration \( {{(\mathcal{F}_t)}_{t\geq0}} \), such that \( {M_0=0} \) and \( {\langle M\rangle_\infty=\infty} \) almost surely. For all \( {t\geq0} \), let

\[ T_t=\inf\{s\geq0:\langle M\rangle_s>t\}=\langle M\rangle_t^{-1} \]

be the generalized inverse of the non-decreasing process \( {\langle M\rangle} \) issued from \( {0} \). Then

  1. \( {B={(M_{\langle M\rangle_t^{-1}})}_{t\geq0}} \) is a Brownian motion with respect to the filtration \( {{(\mathcal{F}_{T_t})}_{t\geq0}} \)
  2. \( {{(B_{\langle M\rangle_t})}_{t\geq0}={(M_t)}_{t\geq0}} \).

For instance, if \( {M=\alpha W} \) where \( {\alpha>0} \) is a constant and \( {W} \) is a Brownian motion issued from the origin, then for all \( {t\geq0} \) we have \( {\langle M\rangle_t=\alpha^2t} \) and \( {T_t=\alpha^{-2}t} \), and the process

\[ B={(M_{T_t})}_{t\geq0}={(\alpha W_{\alpha^{-2}t})}_{t\geq0} \]

is a Brownian motion with respect to \( {{(\mathcal{F}_{\alpha^{-2}t})}_{t\geq0}} \). In this example the change of time is deterministic, but in general it is random. For instance, if \( {M_t=\int_0^tW_s\mathrm{d}W_s} \) where \( {{(W_t)}_{t\geq0}} \) is a Brownian motion, then \( {\langle M\rangle_t=\int_0^tW_s^2\mathrm{d}s} \), which is random.
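As a quick check of the deterministic example \( {M=\alpha W} \) (a verification added for illustration), the process \( {B} \) is a continuous centered Gaussian process and, for all \( {s,t\geq0} \),

\[ \mathbb{E}(B_sB_t) =\alpha^2\,\mathbb{E}(W_{\alpha^{-2}s}W_{\alpha^{-2}t}) =\alpha^2\min(\alpha^{-2}s,\alpha^{-2}t) =\min(s,t), \]

which is indeed the covariance of a standard Brownian motion. This is the usual scaling property of Brownian motion.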

Flatness lemma. Since \( {\langle M\rangle} \) can be flat on an interval, the map \( {t\mapsto T_t} \) can be discontinuous. But this does not contradict the continuity of \( {t\mapsto M_{T_t}} \). Indeed, the flatness lemma states that \( {M} \) and \( {\langle M\rangle} \) are constant on the same intervals in the sense that almost surely, for all \( {0\leq a<b} \),

\[ \forall t\in[a,b], M_t=M_a \quad\text{if and only if}\quad \langle M\rangle_b=\langle M\rangle_a. \]
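To visualize how a flat stretch of \( {\langle M\rangle} \) creates a jump of \( {t\mapsto T_t} \), one can compute the generalized inverse of a deterministic toy function. In the sketch below, `flat_bracket` is a hypothetical stand-in for a path of \( {\langle M\rangle} \) that is flat on \( {[1,2]} \), and the bisection routine is an assumption of this example: it relies on the function being continuous, non-decreasing, and exceeding \( {t} \) at the point `upper`.

```python
def generalized_inverse(bracket, t, upper, tol=1e-10):
    """Approximate T_t = inf{s >= 0 : bracket(s) > t} by bisection on [0, upper].

    Assumes bracket is continuous and non-decreasing, with
    bracket(0) <= t < bracket(upper).
    """
    low, high = 0.0, upper
    while high - low > tol:
        mid = 0.5 * (low + high)
        if bracket(mid) > t:
            high = mid  # mid belongs to {s : bracket(s) > t}
        else:
            low = mid   # the infimum is at or beyond mid
    return high


def flat_bracket(s):
    """Toy non-decreasing function starting from 0 and flat on [1, 2]."""
    if s <= 1.0:
        return s
    if s <= 2.0:
        return 1.0
    return s - 1.0


if __name__ == "__main__":
    for t in (0.5, 0.999999, 1.0, 1.5):
        print(t, generalized_inverse(flat_bracket, t, upper=10.0))
```

Just before \( {t=1} \) the inverse is close to \( {1} \), while at \( {t=1} \) it equals \( {2} \): the map \( {t\mapsto T_t} \) is right continuous and jumps over the flat interval.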

Proof of the flatness lemma. Since \( {M} \) and \( {\langle M\rangle} \) are continuous, it suffices to show that for all \( {0\leq a\leq b} \), almost surely,

\[ \{\forall t\in[a,b]:M_t=M_a\}=\{\langle M\rangle_b=\langle M\rangle_a\}. \]

The inclusion \( {\subset} \) comes from the approximation of the quadratic variation \( {\langle M\rangle=[M]} \). Let us prove the converse. To this end, we consider the continuous local martingale \( {{(N_t)}_{t\geq0}={(M_t-M_{t\wedge a})}_{t\geq0}} \). We have

\[ \langle N\rangle =\langle M\rangle-2\langle M,M^a\rangle+\langle M^a\rangle =\langle M\rangle-2\langle M\rangle^a+\langle M\rangle^a =\langle M\rangle-\langle M\rangle^a. \]

For all \( {\varepsilon>0} \), we set the stopping time \( {T_\varepsilon=\inf\{t\geq0:\langle N\rangle_t>\varepsilon\}} \). The continuous local martingale \( {N^{T_\varepsilon}} \) satisfies \( {N^{T_\varepsilon}_0=0} \) and \( {\langle N^{T_\varepsilon}\rangle_\infty=\langle N\rangle_{T_\varepsilon}\leq\varepsilon} \). It follows that \( {N^{T_\varepsilon}} \) is a martingale bounded in \( {\mathrm{L}^2} \), and for all \( {t\geq0} \),

\[ \mathbb{E}(N^2_{t\wedge T_\varepsilon}) =\mathbb{E}(\langle N\rangle_{t\wedge T_\varepsilon}) \leq\varepsilon. \]

Let us define the event \( {A=\{\langle M\rangle_b=\langle M\rangle_a\}} \). Then \( {A\subset\{T_\varepsilon\geq b\}} \) and, for all \( {t\in[a,b]} \),

\[ \mathbb{E}(\mathbf{1}_AN^2_t) =\mathbb{E}(\mathbf{1}_AN^2_{t\wedge T_\varepsilon}) \leq\mathbb{E}(N^2_{t\wedge T_\varepsilon}) \leq\varepsilon. \]

By sending \( {\varepsilon} \) to \( {0} \) we obtain \( {\mathbb{E}(\mathbf{1}_AN^2_t)=0} \) and thus \( {N_t=0} \) almost surely on \( {A} \). This ends the proof of the flatness lemma, which is of independent interest.

Proof of the Dubins-Schwarz theorem. For all \( {t\geq0} \), the random variable \( {T_t} \) is a stopping time with respect to \( {{(\mathcal{F}_u)}_{u\geq0}} \), and \( {s\mapsto T_s} \) is non-decreasing. It follows that for all \( {0\leq s\leq t} \), \( {\mathcal{F}_{T_s}\subset\mathcal{F}_{T_t}} \), and thus \( {{(\mathcal{F}_{T_u})}_{u\geq0}} \) is a filtration. Moreover for all \( {t\geq0} \), \( {\langle M\rangle_t} \) is a stopping time for the filtration \( {{(\mathcal{F}_{T_u})}_{u\geq0}} \), since \( {\{\langle M\rangle_t\leq u\}=\{T_u\geq t\}\in\mathcal{F}_{T_u}} \) for all \( {u\geq0} \). We have \( {T_t<\infty} \) for all \( {t\geq0} \) on the almost sure event \( {\{\langle M\rangle_\infty=\infty\}} \). By construction \( {{(T_t)}_{t\geq0}} \) is right continuous, non-decreasing (and thus with left limits), and adapted with respect to \( {{(\mathcal{F}_{T_t})}_{t\geq0}} \). Since \( {M} \) is continuous, \( {B={(M_{T_t})}_{t\geq0}} \) is right continuous with left limits. Moreover, for all \( {t\geq0} \),

\[ B_{t^-}=\lim_{s\underset{<}{\rightarrow}t}B_s=M_{T_{t^-}}. \]

By the flatness lemma, almost surely \( {B_{t^-}=B_t} \) for all \( {t\geq0} \), hence \( {B} \) is continuous.

Let us show that \( {B} \) is a Brownian motion for \( {{(\mathcal{F}_{T_t})}_{t\geq0}} \). For all \( {n\geq0} \), \( {M^{T_n}} \) is a continuous local martingale issued from the origin and \( {\langle M^{T_n}\rangle_\infty=\langle M\rangle_{T_n}=n} \) almost surely. It follows that for all \( {n\geq0} \), the processes

\[ M^{T_n} \quad\mbox{and}\quad (M^{T_n})^2-\langle M\rangle^{T_n} \]

are uniformly integrable martingales. Now, for all \( {0\leq s\leq t\leq n} \), and by the Doob stopping theorem for uniformly integrable martingales, using \( {T_s\leq T_t\leq T_n} \),

\[ \mathbb{E}(B_t\mid\mathcal{F}_{T_s}) =\mathbb{E}(M^{T_n}_{T_t}\mid\mathcal{F}_{T_s}) =M^{T_n}_{T_s} =M_{T_n\wedge T_s} =B_{s} \]

and similarly, using additionally the property \( {\langle M\rangle^{T_n}_{T_t}=\langle M\rangle_{T_n\wedge T_t}=\langle M\rangle_{T_t}=t} \),

\[ \mathbb{E}(B_t^2-t\mid\mathcal{F}_{T_s}) =\mathbb{E}((M^{T_n}_{T_t})^2-\langle M^{T_n}\rangle_{T_t}\mid\mathcal{F}_{T_s}) =(M^{T_n}_{T_s})^2-\langle M^{T_n}\rangle_{T_s} =B_s^2-s. \]

Thus \( {B} \) and \( {{(B_t^2-t)}_{t\geq0}} \) are martingales with respect to the filtration \( {{(\mathcal{F}_{T_t})}_{t\geq0}} \). It follows now from the Lévy characterization that \( {B} \) is a Brownian motion for \( {{(\mathcal{F}_{T_t})}_{t\geq0}} \).

Let us show that \( {M=B_{\langle M\rangle}} \). By definition of \( {B} \), almost surely, for all \( {t\geq0} \),

\[ B_{\langle M\rangle_t}=M_{T_{\langle M\rangle_t}}. \]

Now \( {T_{\langle M\rangle_t^-}\leq t\leq T_{\langle M\rangle_t}} \), and since \( {\langle M\rangle} \) takes the same value at \( {T_{\langle M\rangle_t^-}} \) and \( {T_{\langle M\rangle_t}} \), it is constant on the interval between them, which contains \( {t} \). The flatness lemma then gives \( {M_t=M_{T_{\langle M\rangle_t}}} \) for all \( {t\geq0} \) almost surely. In other words, using the definition of \( {B} \), this means that almost surely, for all \( {t\geq0} \),

\[ M_t=M_{T_{\langle M\rangle_t}}=B_{\langle M\rangle_t}. \]

This ends the proof of the Dubins-Schwarz theorem.

Warnings about the Dubins-Schwarz theorem.

  • The Dubins-Schwarz theorem does not state that \( {B_{\langle M\rangle}=M} \) for a Brownian motion \( {B} \) with respect to the filtration for which \( {M} \) is a local martingale.
  • The Dubins-Schwarz theorem is not valid for semi-martingales.

Ornstein-Uhlenbeck process. For an arbitrary \( {x\in\mathbb{R}} \), let us consider the Ornstein-Uhlenbeck process \( {{(Z_t)}_{t\geq0}} \) issued from \( {x} \) and given for all \( {t\geq0} \) by

\[ Z_t=x\mathrm{e}^{-t}+\mathrm{e}^{-t}M_t \quad\mbox{where}\quad M_t=\sqrt{2}\int_0^t\mathrm{e}^s\mathrm{d}B_s \]

where \( {B={(B_t)}_{t\geq0}} \) is a Brownian motion in \( {\mathbb{R}} \) with respect to \( {{(\mathcal{F}_t)}_{t\geq0}} \). The process \( {{(Z_t)}_{t\geq0}} \) is the unique square integrable continuous semi-martingale solution of the stochastic differential equation \( {Z_t=x+\sqrt{2}B_t-\int_0^tZ_s\mathrm{d}s} \), \( {t\geq0} \).

The process \( {{(M_t)}_{t\geq0}} \) is Gaussian and for all \( {t\geq0} \), \( {M_t\sim\mathcal{N}(0,\langle M\rangle_t)} \) (Wiener integral) with \( {\langle M\rangle_t=\int_0^t(\sqrt{2}\mathrm{e}^{s})^2\mathrm{d}s=\mathrm{e}^{2t}-1} \). Hence, for all \( {t\geq0} \), we have the equality in law

\[ Z_t\overset{\mathrm{d}}{=}x\mathrm{e}^{-t}+\mathrm{e}^{-t}B_{\mathrm{e}^{2t}-1}. \]

The processes \( {{(Z_t)}_{t\geq0}} \) and \( {{(x\mathrm{e}^{-t}+\mathrm{e}^{-t}B_{\mathrm{e}^{2t}-1})}_{t\geq0}} \) have the same one-dimensional marginal distributions, but they are not equal, since the second is not adapted to \( {{(\mathcal{F}_t)}_{t\geq0}} \).

However, since \( {{(M_t)}_{t\geq0}} \) is a continuous local martingale with respect to \( {{(\mathcal{F}_t)}_{t\geq0}} \) for which \( {M_0=0} \) and \( {\langle M\rangle_\infty=\infty} \), the Dubins-Schwarz theorem states that there exists a Brownian motion \( {{(W_t)}_{t\geq0}} \) with respect to \( {{(\mathcal{F}_{T_t})}_{t\geq0}} \) where

\[ T_t =\inf\{s\geq0:\langle M\rangle_s>t\} =\frac{\log(t+1)}{2} \]

such that

\[ {(Z_t)}_{t\geq0} = {(x\mathrm{e}^{-t}+\mathrm{e}^{-t}W_{\langle M\rangle_t})}_{t\geq0} = {(x\mathrm{e}^{-t}+\mathrm{e}^{-t}W_{\mathrm{e}^{2t}-1})}_{t\geq0}. \]
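This representation suggests a simple way to sample a process with the law of the Ornstein-Uhlenbeck process on a time grid: draw independent Gaussian increments of a Brownian motion at the times \( {\mathrm{e}^{2t}-1} \) and rescale. The sketch below is a minimal illustration with hypothetical function names. The Brownian motion is generated afresh here, so the output has the law of \( {Z} \) but is not built from a given \( {M} \).

```python
import math
import random


def time_change(t):
    """Quadratic variation <M>_t = exp(2t) - 1."""
    return math.exp(2.0 * t) - 1.0


def inverse_time_change(u):
    """T_u = log(u + 1) / 2, the inverse of time_change."""
    return 0.5 * math.log(1.0 + u)


def sample_ou_via_time_change(x, times, rng):
    """Sample x e^{-t} + e^{-t} W_{e^{2t} - 1} at the increasing times given."""
    w, clock, path = 0.0, 0.0, []
    for t in times:
        new_clock = time_change(t)
        # Brownian increment over [clock, new_clock] on the changed time scale.
        w += math.sqrt(new_clock - clock) * rng.gauss(0.0, 1.0)
        clock = new_clock
        path.append(x * math.exp(-t) + math.exp(-t) * w)
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_ou_via_time_change(1.0, [0.5, 1.0, 1.5], rng))
```

At a fixed time \( {t} \), the samples should have mean \( {x\mathrm{e}^{-t}} \) and variance \( {\mathrm{e}^{-2t}(\mathrm{e}^{2t}-1)=1-\mathrm{e}^{-2t}} \), in agreement with the equality in law stated earlier.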

About Gideon E. Schwarz. Born 1933 in Salzburg, Austria. Escaped in 1938, after the Anschluss, to Palestine, today Israel. M.Sc. in Mathematics at the Hebrew University, Jerusalem in 1956. Ph.D. in Mathematical Statistics at Columbia University in 1961. Research fellowships: Miller Institute 1964-66, Institute for Advanced Studies on Mt. Scopus 1975-76. Visiting appointments: Stanford University, Tel Aviv University, University of California in Berkeley. Since 1961, Fellow of the Institute of Mathematical Statistics. Presently, Professor of Statistics at the Hebrew University. Taken from his paper The dark side of the Moebius strip, Amer. Math. Monthly 97 (1990), no. 10, 890-897.

Gideon E. Schwarz (1933-2007)