Libres pensées d'un mathématicien ordinaire

Completeness and right-continuity of filtrations

Nikolai Nikolaevich Luzin (1883 – 1950)

If you believe that the notions of completion and right-continuity of filtrations are typical abstract nonsense of the general theory of stochastic processes, useless and obscure, you may be missing something interesting. Contrary to discrete time/space processes, continuous time/space processes lead naturally to measurability issues, for instance when considering natural objects such as running suprema or stopping times.

Negligible sets and completeness. In a probability space \( {(\Omega,\mathcal{F},\mathbb{P})} \), we say that \( {A\subset\Omega} \) is negligible when there exists \( {A'\in\mathcal{F}} \) with \( {A\subset A'} \) and \( {\mathbb{P}(A')=0} \). We say that \( {(\Omega,\mathcal{F},\mathbb{P})} \) is complete when \( {\mathcal{F}} \) contains the negligible subsets of \( {\Omega} \). A filtration \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \) on \( {(\Omega,\mathcal{F},\mathbb{P})} \) is complete when \( {\mathcal{F}_0} \) contains the negligible subsets of \( {\mathcal{F}} \). Completeness emerges naturally via almost sure events, which are complements of negligible subsets.

We say that a process \( {{(X_t)}_{t\in\mathbb{R}_+}} \) taking values in a topological space equipped with its Borel \( {\sigma} \)-field is continuous when it has almost surely continuous trajectories.

Measurability of running supremum from completeness. Let \( {{(X_t)}_{t\in\mathbb{R}_+}} \) be continuous, defined on a probability space \( {(\Omega,\mathcal{F},\mathbb{P})} \), and taking values in a topological space \(E\) equipped with its Borel \( {\sigma} \)-field \(\mathcal{E}\). Let \( {f:E\rightarrow\mathbb{R}} \) be a continuous function, so that the supremum of \( {f(X_s)} \) over \( {[0,t]} \) can be computed along the rationals.

  • If \( {(\Omega,\mathcal{F},\mathbb{P})} \) is complete then \( {\sup_{s\in[0,t]}f(X_s)} \) is measurable for all \( {t\in\mathbb{R}_+} \).
  • If \( {X} \) is adapted for a complete \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \) then \( {{(\sup_{s\in[0,t]}f(X_s))}_{t\in\mathbb{R}_+}} \) is adapted.

Proof. Let \( {\Omega'\in\mathcal{F}} \) be an a.s. event on which \( {X} \) is continuous. Set \( {S_t=\sup_{s\in[0,t]}f(X_s)} \).

  • For all \( {t\in\mathbb{R}_+} \) and all Borel subsets \( {A\subset\mathbb{R}} \), we have

    \[ \Omega'\cap\{S_t\in A\} =\Omega'\cap \bigl\{\sup_{s\in[0,t]\cap\mathbb{Q}}f(X_s)\in A\bigr\}\in\mathcal{F}, \]

    while \( {(\Omega\setminus\Omega')\cap\{S_t\in A\}\subset\Omega\setminus\Omega'} \) is negligible and thus in \( {\mathcal{F}} \) by completeness.

  • Same argument as before with \( {\mathcal{F}_t} \) instead of \( {\mathcal{F}} \).

Universal completeness. The notion of completeness is relative to the probability measure \( {\mathbb{P}} \). There is also a notion of universal completeness, see Dellacherie and Meyer 1978, that does not depend on the probability measure, which is less useful in probability.

Stopping times. A map \( {T:\Omega\rightarrow[0,+\infty]} \) is a stopping time for \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \) when

\[ \{T\leq t\}\in\mathcal{F}_t \]

for all \( {t\in\mathbb{R}_+} \). Contrary to discrete time filtrations, the notion of stopping time for continuous time filtrations leads naturally to the notions of complete filtration and right-continuous filtration. This is visible notably with hitting times as follows.

Hitting times as archetypal examples of stopping times. Let \( {X={(X_t)}_{t\in\mathbb{R}_+}} \) be a continuous and adapted process defined on \( {(\Omega,\mathcal{F},\mathbb{P})} \) with respect to a complete filtration \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \), and taking its values in a metric space \( {G} \) equipped with its Borel \( {\sigma} \)-field. Then, for every closed subset \( {A\subset G} \), the hitting time \( {T_A:\Omega\rightarrow[0,+\infty]} \) of \( {A} \), given by

\[ T_A=\inf\{t\in\mathbb{R}_+:X_t\in A\}, \]

with convention \( {\inf\varnothing=+\infty} \), is a stopping time.

Proof. Let \( {\Omega'} \) be the a.s. event on which \( {X} \) is continuous. On \( {\Omega'} \), since \( {X} \) is continuous and \( {A} \) is closed, we have \( {\{t\in\mathbb{R}_+:X_t\in A\}=\{t\in\mathbb{R}_+:\mathrm{dist}(X_t,A)=0\}} \), the map \( {t\in\mathbb{R}_+\mapsto\mathrm{dist}(X_t,A)} \) is continuous, and the \( {\inf} \) in the definition of \( {T_A} \) is a \( {\min} \). Hence \( {T_A\leq t} \) if and only if this continuous map vanishes somewhere on \( {[0,t]} \), in other words if and only if its infimum over the dense subset \( {[0,t]\cap\mathbb{Q}} \) is zero. Now, since \( {X} \) is adapted, we have, for all \( {t\in\mathbb{R}_+} \),

\[ \Omega'\cap\{T_A\leq t\} =\Omega'\cap\Bigl\{\inf_{s\in[0,t]\cap\mathbb{Q}}\mathrm{dist}(X_s,A)=0\Bigr\} \in\mathcal{F}_t, \]

where we have also used \( {\Omega'\in\mathcal{F}_t} \) for all \( {t\in\mathbb{R}_+} \) since \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \) is complete. Moreover \( {(\Omega\setminus\Omega')\cap\{T_A\leq t\}\subset\Omega\setminus\Omega'} \) is negligible, and thus in \( {\mathcal{F}_t} \) by completeness.
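As an illustration only, here is a minimal Julia sketch of a hitting time computed on a sampled trajectory. The helper `hitting_time`, the grid `times`, and the trajectory `path` are hypothetical choices made for this example, and a grid can only approximate the continuous-time infimum.

```julia
# Hypothetical helper: first grid time at which a sampled path enters a closed set.
# `inA` is the indicator function of the closed set A.
function hitting_time(times, path, inA)
    for (t, x) in zip(times, path)
        inA(x) && return t   # first entrance time on the grid
    end
    return Inf               # convention inf ∅ = +∞
end

times = 0:0.01:2
path = sin.(pi .* times)                       # a continuous trajectory, sampled on a grid
T = hitting_time(times, path, x -> x >= 0.5)   # A = [0.5, +∞) is closed

# Whether {T ≤ t} occurs is decided by the trajectory on [0, t] only.
t = 0.3
decided_by_past = any(x -> x >= 0.5, path[times .<= t])
println((T, decided_by_past == (T <= t)))
```

The last line checks, on this single trajectory, that the event \( {\{T_A\leq t\}} \) can be read off from the values of the path up to time \( {t} \), which is the content of the stopping time property.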

Right-continuity. A filtration \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \) is right-continuous if for all \( {t\in\mathbb{R}_+} \) we have

\[ \mathcal{F}_t=\mathcal{F}_{t+} \quad\text{where}\quad \mathcal{F}_{t+} =\bigcap_{\varepsilon>0}\mathcal{F}_{t+\varepsilon} =\bigcap_{s>t}\mathcal{F}_s. \]

Alternative definition of stopping times. If \( {T:\Omega\rightarrow[0,+\infty]} \) is a stopping time with respect to a filtration \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \) then \( {\{T<t\}\in\mathcal{F}_t} \) for all \( {t\in\mathbb{R}_+} \). Conversely this property implies that \( {T} \) is a stopping time when the filtration is right-continuous. Indeed, if \( {T} \) is a stopping time then for all \( {t\in\mathbb{R}_+} \) we have

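\[ \{T<t\} =\bigcup_{n=1}^\infty\{T\leq t-{\textstyle\frac{1}{n}}\} \in\mathcal{F}_t. \]

Conversely, if \( {\{T<t\}\in\mathcal{F}_t} \) for all \( {t\in\mathbb{R}_+} \), then \( {\{T\leq t\}\in\cap_{s>t}\mathcal{F}_s=\mathcal{F}_{t+}} \), which is equal to \( {\mathcal{F}_t} \) by right-continuity, since for all \( {s>t} \),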

\[ \{T\leq t\} =\bigcap_{n=1}^\infty\{T<(t+{\textstyle\frac{1}{n}})\wedge s\} \in\mathcal{F}_{s}. \]

Note that if \( {T} \) is a stopping time then \( {\{T=t\}=\{T\leq t\}\cap\{T<t\}^c\in\mathcal{F}_t} \).

Progressively measurable processes. Recall that a process \( {{(X_t)}_{t\in\mathbb{R}_+}} \) defined on a probability space \( {(\Omega,\mathcal{F},\mathbb{P})} \) is progressively measurable for \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \) when for all \( {t\in\mathbb{R}_+} \) the map \( {(\omega,s)\in\Omega\times[0,t]\mapsto X_s(\omega)} \) is measurable for \( {\mathcal{F}_t\otimes\mathcal{B}_{[0,t]}} \). Examples of progressively measurable processes include adapted right-continuous processes.

Hitting time of Borel sets. Let \( {X={(X_t)}_{t\in\mathbb{R}_+}} \) be a progressively measurable process defined on a probability space \( {(\Omega,\mathcal{F},\mathbb{P})} \) equipped with a right-continuous and complete filtration \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \), and taking its values in a measurable space \( {G} \). Then for every measurable subset \( {A\subset G} \), the hitting time \( {T_A:\Omega\rightarrow[0,+\infty]} \) defined by

\[ T_A=\inf\{t\in\mathbb{R}_+:X_t\in A\}, \]

with convention \( {\inf\varnothing=+\infty} \), is a stopping time.

Proof. The debut \( {D_B} \) of any \( {B\in\mathcal{F}\otimes\mathcal{B}(\mathbb{R}_+)} \) is defined for all \( {\omega\in\Omega} \) by

\[ D_B(\omega)=\inf\{t\in\mathbb{R}_+:(\omega,t)\in B\}\in[0,+\infty]. \]

If \( {B} \) is progressive, then \( {D_B} \) is a stopping time (this is known as the debut theorem). Indeed, for all \( {t\in\mathbb{R}_+} \) the set \( {\{D_B<t\}} \) is then the projection on \( {\Omega} \) of

\[ C=\{(\omega,s)\in\Omega\times[0,t):(\omega,s)\in B\}, \]

which belongs to \( {\mathcal{F}_t\otimes\mathcal{B}(\mathbb{R}_+)} \) since \( {B} \) is progressive. Since the filtration is right-continuous and complete, this projection belongs to \( {\mathcal{F}_t} \), see for instance Dellacherie and Meyer Theorem IV.50 page 116. Now \( {\{D_B<t\}\in\mathcal{F}_t} \) for all \( {t\in\mathbb{R}_+} \) implies that \( {D_B} \) is a stopping time since the filtration is right-continuous. Finally it remains to note that

\[ T_A=D_B\quad\text{with}\quad B=\{(\omega,t):X_t\in A\}, \]

which is progressive as the pre-image of \( {A} \) by \( {(\omega,t)\mapsto X_t(\omega)} \) (\( {X} \) is progressive).

A famous mistake. This is related to a famous mistake made by Henri Lebesgue (1875 – 1941) on the measurability of projections of measurable sets in product spaces, that motivated Nikolai Luzin (1883 – 1950) and his student Mikhail Yakovlevich Suslin (1894 – 1919) to forge the concept of analytic set and descriptive set theory.

[…] Although the term \( {\sigma} \)-field was not in use at the time, it seemed to Borel that no operation of analysis would ever lead outside the Borel \( {\sigma} \)-field. This was also the opinion of Lebesgue, and he believed he had proved it in 1905. Rarely has a mistake been more fruitful. At the beginning of 1917, the Comptes rendus published two notes by the Russians Nikolai Luzin and M. Ya. Suslin. […] The projection of a Borel set is not necessarily a Borel set. Classical analysis thus forces us to leave the Borel \( {\sigma} \)-field. Between the Borel \( {\sigma} \)-field and the Lebesgue \( {\sigma} \)-field lies the Luzin \( {\sigma} \)-field, made up of the sets that Luzin calls analytic and which are continuous images of Borel sets. During the 1920s an extremely brilliant mathematical school developed in Moscow, of which Luzin was the founder. Thus Lebesgue and his work were much better known in Moscow than they were in France. Hungary, Poland and Russia were the centers from which the thought of Lebesgue and his legacy radiated. […]

Jean-Pierre Kahane (2001)

Canonical filtration. It is customary to assume that the underlying filtration is right-continuous and complete. For a given filtration \( {{(\mathcal{F}_t)}_{t\in\mathbb{R}_+}} \), it is always possible to consider its completion \( {(\sigma_t)}_{t\in\mathbb{R}_+}={(\sigma(\mathcal{N}\cup\mathcal{F}_t))}_{t\in\mathbb{R}_+} \) where \( {\mathcal{N}} \) is the collection of negligible subsets of \( {\mathcal{F}} \). It is also customary to consider the right-continuous version \( {{(\sigma_{t+})}_{t\in\mathbb{R}_+}} \), called the canonical filtration. A process is always adapted with respect to the canonical filtration constructed from its completed natural filtration.

Subtleties about right-continuity of filtrations. The natural filtration of a right-continuous process is not right-continuous in general. A counterexample is given by \( {X_t=tZ} \) for all \( {t\in\mathbb{R}_+} \) where \( {Z} \) is a non-constant random variable. Indeed, we have \( {\sigma(X_0)=\{\varnothing,\Omega\}} \) while \( {\sigma(X_{0+\varepsilon}:\varepsilon>0)=\sigma(Z)\neq\sigma(X_0)} \). However it can be shown that the completion of the natural filtration of a Feller Markov process, including all Lévy processes and in particular Brownian motion, is always right-continuous.
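To make the counterexample explicit, here is a short worked computation, written under the assumption that the natural filtration is \( {\mathcal{F}_t=\sigma(X_s:0\leq s\leq t)} \):

\[ \mathcal{F}_0=\sigma(X_0)=\{\varnothing,\Omega\}, \quad \mathcal{F}_{\varepsilon}=\sigma(sZ:0\leq s\leq\varepsilon)=\sigma(Z)\ \text{for all}\ \varepsilon>0, \quad\text{hence}\quad \mathcal{F}_{0+}=\bigcap_{\varepsilon>0}\mathcal{F}_{\varepsilon}=\sigma(Z)\neq\mathcal{F}_0. \]

The middle equality holds because \( {Z=X_\varepsilon/\varepsilon} \) is measurable with respect to \( {\sigma(X_\varepsilon)} \) for every \( {\varepsilon>0} \), while \( {\sigma(Z)} \) is not trivial since \( {Z} \) is non-constant.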


Sinkhorn and circular law

Sinkhorn spectrum for initial matrix with iid exponential entries

The Sinkhorn factorization of a square matrix $A$ with positive entries takes the form $$A = D_1 S D_2$$ where $D_1$ and $D_2$ are diagonal matrices with positive diagonal entries and where $S$ is a doubly stochastic matrix, meaning that it has entries in $[0,1]$ and that each row and each column sums up to one. It is named after Richard Dennis Sinkhorn (1934 – 1995) who worked on the subject in the 1960s. A natural algorithm for the numerical approximation of this factorization consists in iteratively normalizing each row and each column, and this is referred to as the Sinkhorn-Knopp algorithm. Its convergence is fast. Actually this is a special case of the iterative proportional fitting (IPF) algorithm, classic in probability and statistics, which goes back at least to the analysis of the contingency tables in the 1920s, and which was reinvented many times, see for instance the article by Friedrich Pukelsheim and references therein. It is in particular well known that $S$ is the projection of $A$, with respect to the Kullback-Leibler divergence, on the set of doubly stochastic matrices (Birkhoff polytope). The Sinkhorn factorization and its algorithm became fashionable in the domain of computational optimal transportation due to a relation with entropic regularization. It is also considered in the domain of quantum information theory.

Here is a simple implementation using the Julia programming language.

using LinearAlgebra # for Diagonal and norm
#
function sinkhorn(A, tol = 1E-6, maxiter = 1E6)
 # Sinkhorn factorization A = D1 S D2 where D1 and D2 are diagonal and S doubly 
 # stochastic using Sinkhorn-Knopp algorithm which consists in iterative rows 
 # and columns sums normalizations. This code is not optimized.
 m, n = size(A,1), size(A,2)
 D1, D2 = ones(m), ones(n) # diagonal entries of D1 and D2, stored as vectors
 S = A
 iter, err = 0, +Inf
 while ((err > tol) && (iter < maxiter))
   RS = vec(sum(S, dims = 2)) # vector of rows sums
   D1 = D1 .* RS
   S = Diagonal(1 ./ RS) * S 
   CS = vec(sum(S, dims = 1)) # vector of columns sums
   D2 = CS .* D2
   S = S * Diagonal(1 ./ CS)
   iter += 1
   err = norm(RS .- 1) + norm(CS .- 1)
 end
 return S, Diagonal(D1), Diagonal(D2), iter
end # function

Circular law. Suppose now that $A$ is a random matrix, $n\times n$, with independent and identically distributed positive entries of mean $m$ and variance $\sigma^2$, for instance following the uniform or the exponential distribution. Let $S$ be the random doubly stochastic matrix provided by the Sinkhorn factorization. It is tempting to conjecture that the empirical spectral distribution of $$\frac{m\sqrt{n}}{\sigma}S$$ converges weakly as $n\to\infty$ to the uniform distribution on the unit disc of the complex plane, with a single outlier equal to $\frac{m\sqrt{n}}{\sigma}$. The circular law for $S$ is inherited from the circular law for $A$ and the law of large numbers. This can be easily guessed from the first iteration of the Sinkhorn-Knopp algorithm, which is reminiscent of the circular law analysis of random Markov matrices inspired by the Dirichlet Markov Ensemble and its decomposition. The circular law for $A$ is a universal high dimensional phenomenon that holds as soon as the variance of the entries is finite. One can also explore the case of heavy tails, and wonder whether the circular law for $S$ still holds!
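Before plotting anything, the outlier can be checked numerically. The following self-contained sketch is only an illustration: the helper `sinkhorn_sketch`, the fixed number of sweeps, the seed, and the size $n=200$ are choices made for this example and are not part of the code of this post.

```julia
using LinearAlgebra   # for eigvals and norm
using Random          # for a reproducible random matrix

# Hypothetical compact variant of Sinkhorn-Knopp: a fixed number of sweeps,
# each sweep normalizing the rows and then the columns.
function sinkhorn_sketch(A; sweeps = 200)
    S = float.(A)
    for _ in 1:sweeps
        S = S ./ sum(S, dims = 2)   # normalize rows
        S = S ./ sum(S, dims = 1)   # normalize columns
    end
    return S
end

Random.seed!(1)
n = 200
A = -log.(rand(n, n))               # iid exponential entries, m = 1 and sigma = 1
S = sinkhorn_sketch(A)
moduli = sort(abs.(eigvals(S)), rev = true)
println("rows sums error          : ", norm(sum(S, dims = 2) .- 1))
println("largest modulus          : ", moduli[1])
println("second modulus * sqrt(n) : ", moduli[2] * sqrt(n))
```

Since $S$ is doubly stochastic with positive entries, its largest eigenvalue in modulus is $1$, which gives the outlier $\frac{m\sqrt{n}}{\sigma}$ after scaling. If the conjecture is right, the second largest modulus multiplied by $\sqrt{n}$ should be close to $1$ here.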

The following Julia code produces the graphics at the top and the bottom of this post.

using Printf # for @printf
using Plots # for scatter and plot
using LinearAlgebra # for eigvals and norm

function sinkhornplot(A,m,sigma,filename)
    S, D1, D2, iter = sinkhorn(A)
    @printf("%s\n",filename)
    @printf(" ‖A-D1 S D2‖ = %s\n",norm(A-D1*S*D2))
    @printf(" ‖RowsSums(S)-1‖ = %s\n",norm(sum(S,dims=2).-1))
    @printf(" ‖ColsSums(S)-1‖ = %s\n",norm(sum(S,dims=1).-1))
    @printf(" SK iterations = %d\n",iter)
    spec = eigvals(S) * m * sqrt(size(S,1)) / sigma
    maxval, maxloc = findmax(abs.(spec))
    deleteat!(spec, maxloc)
    scatter(real(spec),imag(spec),aspect_ratio=:equal,legend=false)
    x = LinRange(-1,1,100)
    plot!(x,sqrt.(1 .- x.^2),linewidth=2,linecolor=:steelblue)
    plot!(x,-sqrt.(1 .- x.^2),linewidth=2,linecolor=:steelblue)
    savefig(filename)    
end #function

sinkhornplot(rand(500,500),1/2,1/sqrt(12),"sinkhorn-unif.png")
sinkhornplot(-log.(rand(500,500)),1,1,"sinkhorn-expo.png")
sinkhornplot(abs.(randn(500,500)./randn(500,500)),1,1,"sinkhorn-heavy.png")
Sinkhorn spectrum for initial matrix with iid heavy tailed entries

Example of program output.

sinkhorn-unif.png
 ‖A-D1 S D2‖ = 6.704194265149758e-14
 ‖RowsSums(S)-1‖ = 8.209123420903459e-11
 ‖ColsSums(S)-1‖ = 3.326966272203901e-15
 SK iterations = 4
sinkhorn-expo.png
 ‖A-D1 S D2‖ = 1.9956579227110527e-13
 ‖RowsSums(S)-1‖ = 5.155302149020719e-11
 ‖ColsSums(S)-1‖ = 3.3454392974351074e-15
 SK iterations = 5
sinkhorn-heavy.png
 ‖A-D1 S D2‖ = 8.423054036449791e-10
 ‖RowsSums(S)-1‖ = 4.5152766460217607e-7
 ‖ColsSums(S)-1‖ = 3.8778423131653425e-15
 SK iterations = 126


Landen transformation of complete elliptic integrals

John Landen manuscript (1764)

This post is devoted to some aspects of the Landen transformation, essentially \( {x\mapsto(1-x)/(1+x)} \) or \( {x\mapsto 4x/(1+x)^2} \), used for certain special functions. It was introduced by John Landen (1719 – 1790) for expressing a hyperbolic arc in terms of two elliptic arcs. It is useful for numerical evaluation. Since its invention, infinitesimal calculus has been systematically used to compute geometric quantities such as the length of an arc of an ellipse, and in particular the circumference of an ellipse. There is no closed formula as in the special case of the circle, and this leads to integrals known as elliptic integrals. Nowadays, they belong to the vast zoo of special functions.

Complete elliptic integrals of first and second kind. They are given, for \( {\rho\in[0,1]} \), by

\[ K(\rho) :=\int_0^{\frac{\pi}{2}}\frac{\mathrm{d}\theta}{\sqrt{1-\rho\sin^2(\theta)}} =\int_0^1\frac{\mathrm{d}t}{\sqrt{1-\rho t^2}\sqrt{1-t^2}} \]

and

\[ E(\rho):=\int_0^{\frac{\pi}{2}}\sqrt{1-\rho\sin^2(\theta)}\mathrm{d}\theta =\int_0^1\frac{\sqrt{1-\rho t^2}}{\sqrt{1-t^2}}\mathrm{d}t. \]

The incomplete elliptic integrals are given by the same formula after replacing \( {\frac{\pi}{2}} \) by an arbitrary angle. The inverses of incomplete elliptic integrals are known as elliptic functions. Geometrically, the length of the arc of an ellipse can be expressed using the elliptic integral of the second kind, while the surface measure of an ellipsoid involves a combination of elliptic integrals of first and second kind. Elliptic integrals and functions appear at many places in mathematics, physics, and engineering, and were studied by several mathematicians, including historically, among others, Leonhard Euler (1707 – 1783), Adrien-Marie Legendre (1752 – 1833), Johann Carl Friedrich Gauss (1777 – 1855), Niels Henrik Abel (1802 – 1829), Carl Gustav Jacob Jacobi (1804 – 1851), Karl Weierstrass (1815 – 1897), and Arthur Cayley (1821 – 1895). Elliptic integrals and functions are, like most classical special functions, well supported by software packages specialized in mathematics such as Maple and Mathematica.

Landen transformation. For all \( {x\in[0,1]} \),

\[ K\Bigr(\Bigr(\frac{1-x}{1+x}\Bigr)^2\Bigr) =\frac{1+x}{2}K(1-x^2) \]

and

\[ E\Bigr(\Bigr(\frac{1-x}{1+x}\Bigr)^2\Bigr) =\frac{1}{1+x}E(1-x^2)+\frac{2x}{(1+x)^2}K\Bigr(\Bigr(\frac{1-x}{1+x}\Bigr)^2\Bigr). \]

Reformulation. If \( {x_1:=\frac{1-x}{1+x}} \) then \( {1+x_1=\frac{2}{1+x}} \) and \( {\frac{4x_1}{(1+x_1)^2}=1-x^2} \), and we get

\[ K\Bigr(\frac{4x_1}{(1+x_1)^2}\Bigr) = (1+x_1)K(x_1^2). \]

Note that \( {x_1} \) runs over \( {[0,1]} \) when \( {x} \) runs over \( {[0,1]} \). Similarly we get

\[ E\Bigr(\frac{4x_1}{(1+x_1)^2}\Bigr) = \frac{2}{1+x_1}E(x_1^2)-(1-x_1)K(x_1^2). \]

For all \( {x\geq1} \), the identity \( {\frac{4x}{(1+x)^2}=\frac{4x^{-1}}{(1+x^{-1})^2}} \) allows one to express \( {K} \) and \( {E} \) at \( {\frac{4x}{(1+x)^2}} \) for all \( {x\geq1} \).
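The two Landen formulas can be checked numerically. The sketch below uses a plain midpoint rule, and the helpers `K_quad` and `E_quad` are hypothetical names introduced for this example (they follow the convention of this post, in which the argument \( {\rho} \) multiplies \( {\sin^2(\theta)} \)).

```julia
# Midpoint-rule approximations of K(rho) and E(rho) on [0, pi/2].
function K_quad(rho; N = 2000)
    h = (pi / 2) / N
    return h * sum(1 / sqrt(1 - rho * sin((i - 0.5) * h)^2) for i in 1:N)
end
function E_quad(rho; N = 2000)
    h = (pi / 2) / N
    return h * sum(sqrt(1 - rho * sin((i - 0.5) * h)^2) for i in 1:N)
end

x = 0.3
x1 = (1 - x) / (1 + x)
# Landen formula for K
lhsK = K_quad(x1^2)
rhsK = (1 + x) / 2 * K_quad(1 - x^2)
# Landen formula for E
lhsE = E_quad(x1^2)
rhsE = E_quad(1 - x^2) / (1 + x) + 2 * x / (1 + x)^2 * K_quad(x1^2)
println(abs(lhsK - rhsK), "  ", abs(lhsE - rhsE))
```

Both printed differences should be at the level of rounding errors, since the midpoint rule is very accurate for these smooth periodic integrands when \( {\rho<1} \).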

Ivory derivation via hypergeometric series. It is possible to derive the formulas by using powerful changes of variables, presented later in this post, which remain valid for more general formulas for incomplete elliptic integrals. Nevertheless, following James Ivory (1765 – 1842), for complete elliptic integrals, it is more efficient to proceed by using a hypergeometric series expansion. Namely, by using the trick

\[ 1+2x\cos(\alpha)+x^2=(1+x\mathrm{e}^{\mathrm{i}\alpha})(1+x\mathrm{e}^{-\mathrm{i}\alpha}) \]

and the Newton binomial theorem

\[ \frac{1}{(1-z)^{\alpha}}=\sum_{n=0}^\infty\frac{(\alpha)_n}{n!}z^n, \]

where \( {(\alpha)_n:=\alpha(\alpha+1)\cdots(\alpha+n-1)} \) is the rising factorial, we get, for \( {0\leq x<1} \),

\[ \begin{array}{rcl} K\Bigr(\frac{4x}{(1+x)^2}\Bigr) &=&\frac{1}{2}\int_0^{\pi} \Bigr(1-\frac{4x}{(1+x)^2}\sin^2(\theta)\Bigr)^{-1/2} \mathrm{d}\theta\\ &=&\frac{1}{2}\int_0^{\pi} \Bigr(1-\frac{2x}{(1+x)^2}(1-\cos(2\theta))\Bigr)^{-1/2} \mathrm{d}\theta\\ &=&\frac{1+x}{2}\int_0^{\pi} \Bigr(1+x^2+2x\cos(2\theta)\Bigr)^{-1/2} \mathrm{d}\theta\\ &=&\frac{1+x}{2}\int_0^{\pi} \Bigr(1+x\mathrm{e}^{2\mathrm{i}\theta}\Bigr)^{-1/2} \Bigr(1+x\mathrm{e}^{-2\mathrm{i}\theta}\Bigr)^{-1/2} \mathrm{d}\theta\\ &=&\frac{1+x}{2}\sum_{m=0}^\infty \frac{(\frac{1}{2})_m(-x)^m}{m!} \sum_{n=0}^\infty \frac{(\frac{1}{2})_n(-x)^n}{n!} \int_0^{\pi}\mathrm{e}^{2\mathrm{i}(m-n)\theta}\mathrm{d}\theta\\ &=&\frac{\pi}{2}(1+x)\sum_{n=0}^\infty\frac{(\frac{1}{2})_n^2x^{2n}}{n!^2}\\ &=&(1+x)\frac{\pi}{2}F_{2,1}\Bigr(\frac{1}{2},\frac{1}{2};1;x^2\Bigr)\\ &=&(1+x)K(x^2). \end{array} \]

This can be seen as a formula for hypergeometric series: if \( {0\leq x<1} \) then

\[ F_{2,1}\Bigr(\frac{1}{2},\frac{1}{2};1;\frac{4x}{(1+x)^2}\Bigr) =(1+x)F_{2,1}\Bigr(\frac{1}{2},\frac{1}{2};1;x^2\Bigr). \]

Similarly, using in the last step

\[ K(x^2)=\frac{\pi}{2}F_{2,1}\Bigr(\frac{1}{2},\frac{1}{2};1;x^2\Bigr) \quad\text{and}\quad E(x^2)=\frac{\pi}{2}F_{2,1}\Bigr(-\frac{1}{2},\frac{1}{2};1;x^2\Bigr), \]

we get

\[ \begin{array}{rcl} (1+x)E\Bigr(\frac{4x}{(1+x)^2}\Bigr) &=&\frac{1+x}{2}\int_0^{\pi} \Bigr(1-\frac{4x}{(1+x)^2}\sin^2(\theta)\Bigr)^{1/2} \mathrm{d}\theta\\ &=&\frac{1+x}{2}\int_0^{\pi} \Bigr(1-\frac{2x}{(1+x)^2}(1-\cos(2\theta))\Bigr)^{1/2} \mathrm{d}\theta\\ &=&\frac{1}{2}\int_0^{\pi} \Bigr(1+x^2+2x\cos(2\theta)\Bigr)^{1/2} \mathrm{d}\theta\\ &=&\frac{1}{2}\int_0^{\pi} \Bigr(1+x\mathrm{e}^{2\mathrm{i}\theta}\Bigr)^{1/2} \Bigr(1+x\mathrm{e}^{-2\mathrm{i}\theta}\Bigr)^{1/2} \mathrm{d}\theta\\ &=&\frac{1}{2}\sum_{m=0}^\infty \frac{(-\frac{1}{2})_m(-x)^m}{m!} \sum_{n=0}^\infty \frac{(-\frac{1}{2})_n(-x)^n}{n!} \int_0^{\pi}\mathrm{e}^{2\mathrm{i}(m-n)\theta}\mathrm{d}\theta\\ &=&\frac{\pi}{2}\sum_{n=0}^\infty\frac{(-\frac{1}{2})_n^2x^{2n}}{n!^2}\\ &=&\frac{\pi}{2}F_{2,1}\Bigr(-\frac{1}{2},-\frac{1}{2};1;x^2\Bigr)\\ &=&\frac{\pi}{2}\Bigr(2F_{2,1}\Bigr(-\frac{1}{2},\frac{1}{2};1;x^2\Bigr) -(1-x^2)F_{2,1}\Bigr(\frac{1}{2},\frac{1}{2};1;x^2\Bigr)\Bigr)\\ &=&2E(x^2)-(1-x^2)K(x^2). \end{array} \]

This corresponds to the hypergeometric identity

\[ \begin{array}{rcl} (1+x)F_{2,1}\Big(-\frac{1}{2},\frac{1}{2};1;\frac{4x}{(1+x)^2}\Bigr) &=&F_{2,1}\Bigr(-\frac{1}{2},-\frac{1}{2};1;x^2\Bigr)\\ &=&2F_{2,1}\Bigr(-\frac{1}{2},\frac{1}{2};1;x^2\Bigr) -(1-x^2)F_{2,1}\Bigr(\frac{1}{2},\frac{1}{2};1;x^2\Bigr). \end{array} \]
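These hypergeometric identities can be tested with partial sums of the series. The following sketch is only a sanity check: `hyp2f1_series` is a hypothetical helper written for this example, and the number of terms is an arbitrary choice which is enough for the values of \( {x} \) used here.

```julia
# Partial sum of the Gauss hypergeometric series F_{2,1}(a, b; c; z) for |z| < 1.
function hyp2f1_series(a, b, c, z; terms = 400)
    term, total = 1.0, 1.0
    for n in 0:terms-1
        # ratio of consecutive terms of the series
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
    end
    return total
end

x = 0.2
z = 4 * x / (1 + x)^2
# identity attached to K
dK = hyp2f1_series(0.5, 0.5, 1, z) - (1 + x) * hyp2f1_series(0.5, 0.5, 1, x^2)
# identity attached to E, in two steps
lhs = (1 + x) * hyp2f1_series(-0.5, 0.5, 1, z)
mid = hyp2f1_series(-0.5, -0.5, 1, x^2)
rhs = 2 * hyp2f1_series(-0.5, 0.5, 1, x^2) - (1 - x^2) * hyp2f1_series(0.5, 0.5, 1, x^2)
println(dK, "  ", lhs - mid, "  ", mid - rhs)
```

The three printed numbers should vanish up to rounding errors.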

Invariance of Cayley elliptic integral. For all \( {a,b>0} \), the Cayley elliptic integral

\[ I(a,b) :=\int_0^{\frac{\pi}{2}} \frac{1}{\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}} \mathrm{d}\theta \]

is left unchanged if we replace \( {a,b} \) by their arithmetic and geometric means, namely

\[ I(a,b)=I\left(\frac{a+b}{2},\sqrt{ab}\right). \]

Similarly if we define

\[ J(a,b):= \int_0^{\frac{\pi}{2}} \sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)} \mathrm{d}\theta \]

then

\[ 2J\Bigr(\frac{a+b}{2},\sqrt{ab}\Bigr) =J(a,b) + ab I(a,b) \]
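Both invariance formulas are easy to test numerically before proving them. In the sketch below, `cayley` is a hypothetical helper which returns midpoint-rule approximations of \( {I(a,b)} \) and \( {J(a,b)} \), and the values of \( {a} \) and \( {b} \) are arbitrary.

```julia
# Midpoint-rule approximations of the pair (I(a, b), J(a, b)).
function cayley(a, b; N = 4000)
    h = (pi / 2) / N
    delta(theta) = sqrt(a^2 * cos(theta)^2 + b^2 * sin(theta)^2)
    nodes = ((i - 0.5) * h for i in 1:N)
    return h * sum(1 / delta(t) for t in nodes), h * sum(delta(t) for t in nodes)
end

a, b = 3.0, 1.0
I0, J0 = cayley(a, b)
I1, J1 = cayley((a + b) / 2, sqrt(a * b))
println(abs(I0 - I1), "  ", abs(2 * J1 - (J0 + a * b * I0)))
```

The first number tests the invariance of \( {I} \) and the second one tests the formula for \( {J} \). Both should be at the level of rounding errors.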

Link with Landen transformation for \( {K} \). If \( {a>b>0} \) then

\[ I(a,b) =\int_0^{\frac{\pi}{2}} \frac{1}{\sqrt{a^2-(a^2-b^2)\sin^2(\theta)}} \mathrm{d}\theta =\frac{1}{a}K\left(\frac{a^2-b^2}{a^2}\right) \]

and

\[ I\left(\frac{a+b}{2},\sqrt{ab}\right) =\frac{1}{\frac{a+b}{2}} K\left(\frac{\left(\frac{a+b}{2}\right)^2-ab}{\left(\frac{a+b}{2}\right)^2}\right) =\frac{2}{a+b}K\left(\frac{(a+b)^2-4ab}{(a+b)^2}\right) =\frac{2}{a+b}K\left(\left(\frac{a-b}{a+b}\right)^2\right), \]

and thus, by the invariance of Cayley elliptic integrals,

\[ K\left(\left(\frac{a-b}{a+b}\right)^2\right) =\frac{a+b}{2a}K\left(\frac{a^2-b^2}{a^2}\right). \]

Setting \( {x:=\frac{a-b}{a+b}} \) we get \( {1+x=\frac{2a}{a+b}} \) and \( {\frac{4x}{(1+x)^2}=\frac{a^2-b^2}{a^2}} \), hence

\[ (1+x)K(x^2) = K\left(\frac{4x}{(1+x)^2}\right). \]

We have thus obtained, from the invariance formula for the Cayley integral \( {I(a,b)} \), an alternative proof of the Landen transform formula involving \( {K} \). Similarly, the invariance formula for \( {J(a,b)} \) leads to the Landen transform formula involving \( {E} \) and \( {K} \).

Link with arithmetic-geometric mean. If, for \( {a\geq b>0} \), we define \( {a_0:=a} \), \( {b_0:=b} \), and

\[ a_{n+1}:=\frac{a_n+b_n}{2} \quad\text{and}\quad b_{n+1}:=\sqrt{a_nb_n} \]

for all \( {n\geq0} \), then it can be shown that

\[ b=b_0\leq b_1\leq\cdots\leq b_{n+1}\leq a_{n+1}\leq\cdots\leq a_1\leq a_0=a \]

and that actually both sequences converge as \( {n\rightarrow\infty} \) to a common limit, the arithmetic-geometric mean (AGM) \( {M(a,b)} \). Indeed, if \( {c_n:=\sqrt{a_n^2-b_n^2}} \) then

\[ c_{n+1}=\frac{a_n-b_n}{2}\quad\text{and}\quad c_n^2=(a_n-b_n)(a_n+b_n)=4c_{n+1}a_{n+1}, \]

hence \( {c_n} \) decreases to \( {0} \) as \( {n\rightarrow\infty} \) and

\[ M(a,b):= \lim_{n\rightarrow\infty}a_n=\lim_{n\rightarrow\infty}b_n. \]

Using the invariance of Cayley integrals and observing that \( {(a,b)\mapsto I(a,b)} \) is continuous and \( {I(c,c)=\frac{\pi}{2c}} \) for all \( {c} \), we get

\[ I(a,b)=I(M(a,b),M(a,b))=\frac{\pi}{2M(a,b)}. \]

Moreover since \( {I(a,b)=\frac{1}{a}K\Bigr(\frac{a^2-b^2}{a^2}\Bigr)} \) we get

\[ M(a,b)=\frac{a\pi}{2K\Bigr(\frac{a^2-b^2}{a^2}\Bigr)}. \]

If we set \( {x:=\frac{\sqrt{a^2-b^2}}{a}} \) then using

\[ M(a,b)=aM\Bigr(1,\frac{b}{a}\Bigr)=aM(1,\sqrt{1-x^2})=aM(1-x,1+x) \]

we get

\[ M(1-x,1+x)=\frac{\pi}{2K(x^2)}. \]
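This formula gives a fast way to evaluate \( {K} \) numerically. Here is a minimal sketch, in which `agm`, `K_agm`, and `K_quad` are hypothetical names introduced for this example, and in which the stopping rule of the iteration is an arbitrary choice.

```julia
# Arithmetic-geometric mean M(a, b), by iterating the two means.
function agm(a, b; maxiter = 60)
    for _ in 1:maxiter
        abs(a - b) <= 1e-15 * max(a, b) && break
        a, b = (a + b) / 2, sqrt(a * b)
    end
    return (a + b) / 2
end

# K(x^2) from the identity M(1 - x, 1 + x) = pi / (2 K(x^2)).
K_agm(x) = pi / (2 * agm(1 - x, 1 + x))

# Midpoint-rule reference value of K(rho), for comparison only.
function K_quad(rho; N = 2000)
    h = (pi / 2) / N
    return h * sum(1 / sqrt(1 - rho * sin((i - 0.5) * h)^2) for i in 1:N)
end

x = 0.6
println(K_agm(x), "  ", K_quad(x^2))
```

Since \( {c_{n+1}=c_n^2/(4a_{n+1})} \), the convergence is quadratic, and a handful of iterations is enough to reach machine precision.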

Proof of the invariance of Cayley elliptic integrals. The change of variable

\[ b\tan\theta=x \]

gives \( {\mathrm{d}\theta=\frac{\cos^2(\theta)}{b}\mathrm{d}x} \), and since \( {\cos^2(\theta)=\frac{b^2}{x^2+b^2}} \),

\[ I(a,b) =\int_0^\infty \frac{1}{\sqrt{\cos^2(\theta)}\sqrt{(a^2+x^2)}}\frac{b}{x^2+b^2} \mathrm{d}x =\int_0^\infty\frac{1}{\sqrt{(a^2+x^2)(x^2+b^2)}} \mathrm{d}x. \]

The further substitution \( {x=t+\sqrt{t^2+ab}} \) satisfies

\[ \mathrm{d}x=(1+\frac{t}{\sqrt{t^2+ab}})\mathrm{d}t=\frac{x}{\sqrt{t^2+ab}}\mathrm{d}t, \]

and the identity

\[ \sqrt{(x^2+a^2)(x^2+b^2)} =2x\sqrt{t^2+\left(\frac{a+b}{2}\right)^2}, \]

which can be checked using the simpler identity \( {x^2=2tx+ab} \), gives

\[ I(a,b) = \frac{1}{2}\int_{-\infty}^\infty \frac{1}{\sqrt{t^2+\left(\frac{a+b}{2}\right)^2}\sqrt{t^2+ab}} \mathrm{d}t = \int_0^\infty \frac{1}{\sqrt{\left(t^2+\left(\frac{a+b}{2}\right)^2\right) \left(t^2+\sqrt{ab}^2\right)}} \mathrm{d}t \]

which is equal to \( {I\left(\frac{a+b}{2},\sqrt{ab}\right)} \) according to the preceding formula with \( {\frac{a+b}{2},\sqrt{ab}} \) instead of \( {a,b} \). Alternatively, following Jean-Pierre Demailly (1957 – 2022), using the change of variable

\[ \varphi=\theta+\arctan\Bigr(\frac{b}{a}\tan(\theta)\Bigr) \]

which is an increasing bijection from \( {[0,\pi/2)} \) to \( {[0,\pi)} \), we have

\[ \frac{\mathrm{d}\varphi}{\mathrm{d}\theta} =1+\frac{\frac{b}{a}(1+\tan^2(\theta))}{1+\frac{b^2}{a^2}\tan^2(\theta)} =\frac{(a+b)(a\cos^2(\theta)+b\sin^2(\theta))}{a^2\cos^2(\theta)+b^2\sin^2(\theta)}. \]

On the other hand

\[ \varphi=\theta+\alpha\quad\text{where}\quad\alpha:=\arctan\Bigr(\frac{b}{a}\tan(\theta)\Bigr)\in[0,\pi/2). \]

We have \( {\tan(\alpha)=\frac{b}{a}\tan(\theta)} \) and since

\[ \cos(\alpha)=\frac{1}{\sqrt{1+\tan^2(\alpha)}} \quad\text{and}\quad \sin(\alpha)=\frac{\tan(\alpha)}{\sqrt{1+\tan^2(\alpha)}} \]

we get

\[ \begin{array}{rcl} \cos(\varphi) &=&\cos(\theta)\cos(\alpha)-\sin(\theta)\sin(\alpha)\\ &=&\cos(\theta)\frac{1}{\sqrt{1+\Bigr(\frac{b}{a}\Bigr)^2\tan^2(\theta)}} -\sin(\theta)\frac{\frac{b}{a}\tan(\theta)}{\sqrt{1+\Bigr(\frac{b}{a}\Bigr)^2\tan^2(\theta)}} =\frac{a\cos^2(\theta)-b\sin^2(\theta)}{\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}} \end{array} \]

and

\[ \begin{array}{rcl} \sin(\varphi) &=&\sin(\theta)\cos(\alpha)+\cos(\theta)\sin(\alpha)\\ &=&\sin(\theta)\frac{1}{\sqrt{1+\Bigr(\frac{b}{a}\Bigr)^2\tan^2(\theta)}} +\cos(\theta)\frac{\frac{b}{a}\tan(\theta)}{\sqrt{1+\Bigr(\frac{b}{a}\Bigr)^2\tan^2(\theta)}} =\frac{(a+b)\sin(\theta)\cos(\theta)}{\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}}. \end{array} \]

This gives

\[ \begin{array}{rcl} a_1^2\cos^2(\varphi)+b_1^2\sin^2(\varphi) &=&\Bigr(\frac{a+b}{2}\Bigr)^2\Bigr(\cos^2(\varphi)+\frac{4ab}{(a+b)^2}\sin^2(\varphi)\Bigr)\\ &=&\Bigr(\frac{a+b}{2}\Bigr)^2 \frac{(a\cos^2(\theta)-b\sin^2(\theta))^2+4ab\sin^2(\theta)\cos^2(\theta)} {a^2\cos^2(\theta)+b^2\sin^2(\theta)}\\ &=&\Bigr(\frac{a+b}{2}\Bigr)^2 \frac{(a\cos^2(\theta)+b\sin^2(\theta))^2} {a^2\cos^2(\theta)+b^2\sin^2(\theta)}. \end{array} \]

In other words, setting \( {\Delta(\theta):=\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}} \), we get

\[ \Delta_1(\varphi) :=\sqrt{a_1^2\cos^2(\varphi)+b_1^2\sin^2(\varphi)} =\frac{a+b}{2}\frac{a\cos^2(\theta)+b\sin^2(\theta)}{\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}} =\frac{a+b}{2}\frac{a\cos^2(\theta)+b\sin^2(\theta)}{\Delta(\theta)} \]

and thus, combining this with a formula above for \( {\mathrm{d}\varphi/\mathrm{d}\theta} \) we obtain

\[ \frac{1}{2}\frac{\mathrm{d}\varphi}{\Delta_1(\varphi)} =\frac{1}{2}\frac{\mathrm{d}\varphi}{\sqrt{a_1^2\cos^2(\varphi)+b_1^2\sin^2(\varphi)}} =\frac{\mathrm{d}\theta}{\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}} =\frac{\mathrm{d}\theta}{\Delta(\theta)} \]

hence finally

\[ I\Big(\frac{a+b}{2},\sqrt{ab}\Bigr)=I(a,b). \]

Next, for the formula concerning \( {J} \), we observe that

\[ \Delta_1(\varphi)+\frac{a-b}{2}\cos(\varphi) =\frac{a^2\cos^2(\theta)+b^2\sin^2(\theta)}{\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}} =\Delta(\theta) \]

and

\[ \Delta_1(\varphi)-\frac{a-b}{2}\cos(\varphi) =\frac{ab\cos^2(\theta)+ab\sin^2(\theta)}{\sqrt{a^2\cos^2(\theta)+b^2\sin^2(\theta)}} =\frac{ab}{\Delta(\theta)}, \]

hence

\[ 2\Delta_1(\varphi)=\Delta(\theta)+\frac{ab}{\Delta(\theta)}, \]

while a formula above reads \( {\mathrm{d}\varphi=2\frac{\Delta_1(\varphi)}{\Delta(\theta)}\mathrm{d}\theta} \) hence

\[ \Delta_1(\varphi)\mathrm{d}\varphi +\frac{a-b}{2}\cos(\varphi)\mathrm{d}\varphi =\Delta(\theta)\mathrm{d}\varphi =2\Delta_1(\varphi)\mathrm{d}\theta =\Bigr(\Delta(\theta)+\frac{ab}{\Delta(\theta)}\Bigr)\mathrm{d}\theta. \]

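Finally, since \( {\varphi} \) runs over \( {[0,\pi]} \) when \( {\theta} \) runs over \( {[0,\pi/2]} \), and since \( {\int_0^{\pi}\cos(\varphi)\mathrm{d}\varphi=0} \) and \( {\int_0^{\pi}\Delta_1(\varphi)\mathrm{d}\varphi=2J(a_1,b_1)} \), we get as expected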

\[ 2J(a_1,b_1)=J(a,b)+abI(a,b). \]

Historical proof by change of variable due to Landen. This proof works also for incomplete elliptic integrals, and admits a geometrical interpretation. Let

\[ x\in[0,1],\quad x'=\sqrt{1-x^2},\quad x_1=\frac{1-x'}{1+x'}. \]

We start from the formula

\[ 2K(x_1^2) =\int_0^{\pi}\frac{\mathrm{d}\theta_1}{\sqrt{1-x_1^2\sin^2(\theta_1)}}. \]

We would like to set \( {x_1\sin(\theta_1)=\sin(\alpha)} \) in such a way that \( {x_1^2\sin^2(\theta_1)=\sin^2(\alpha)} \). But it turns out that it is better to use another change of variable, namely replace \( {\theta_1} \) by \( {\theta} \) with

\[ 2\theta=\theta_1+\alpha=\theta_1+\arcsin(x_1\sin(\theta_1)) \]

where \( {\arcsin} \) takes its values in \( {[0,\pi/2]} \). Then \( {\theta} \) runs over \( {[0,\pi/2]} \) when \( {\theta_1} \) runs over \( {[0,\pi]} \), and the derivative of \( {2\theta} \) with respect to \( {\theta_1} \) is \( {1+x_1\cos(\theta_1)/\sqrt{1-x_1^2\sin^2(\theta_1)}} \), which is positive when \( {x_1<1} \), and which vanishes only for \( {\theta_1=\pi} \) when \( {x_1=1} \). The crucial point now is the identity

\[ (1+x_1)\sqrt{1-x^2\sin^2(\theta)} =\sqrt{1-x_1^2\sin^2(\theta_1)} +x_1\cos(\theta_1) \]

Differentiating both sides of this identity, we get

\[ (1+x_1)x^2 \frac{\sin(\theta)\cos(\theta)} {\sqrt{1-x^2\sin^2(\theta)}} \mathrm{d}\theta = \left( x_1^2 \frac{\sin(\theta_1)\cos(\theta_1)} {\sqrt{1-x_1^2\sin^2(\theta_1)}} +x_1\sin(\theta_1) \right)\mathrm{d}\theta_1 \]

hence

\[ \frac{1+x_1}{2x_1}x^2 \frac{\sin(2\theta)} {\sqrt{1-x^2\sin^2(\theta)}} \mathrm{d}\theta = \left( \frac{x_1\sin(\theta_1)\cos(\theta_1) +\sin(\theta_1)\sqrt{1-x_1^2\sin^2(\theta_1)}} {\sqrt{1-x_1^2\sin^2(\theta_1)}} \right)\mathrm{d}\theta_1 \]

hence

\[ (1+x') \frac{\sin(2\theta)} {\sqrt{1-x^2\sin^2(\theta)}} \mathrm{d}\theta = \left( \frac{\sin(\alpha)\cos(\theta_1) +\sin(\theta_1)\cos(\alpha)} {\sqrt{1-x_1^2\sin^2(\theta_1)}} \right)\mathrm{d}\theta_1 \]

hence

\[ (1+x')\frac{\sin(2\theta)} {\sqrt{1-x^2\sin^2(\theta)}} \mathrm{d}\theta = \left( \frac{\sin(\alpha+\theta_1)} {\sqrt{1-x_1^2\sin^2(\theta_1)}} \right)\mathrm{d}\theta_1 \]

and finally

\[ (1+x') \frac{1} {\sqrt{1-x^2\sin^2(\theta)}} \mathrm{d}\theta = \frac{1} {\sqrt{1-x_1^2\sin^2(\theta_1)}} \mathrm{d}\theta_1. \]

Integrating over the range of \( {\theta} \) and \( {\theta_1} \) we obtain

\[ (1+x') \int_0^{\frac{\pi}{2}} \frac{\mathrm{d}\theta} {\sqrt{1-x^2\sin^2(\theta)}} = \int_0^{\pi} \frac{\mathrm{d}\theta_1} {\sqrt{1-x_1^2\sin^2(\theta_1)}} \]

which gives \( {(1+x')K(x^2)=2K(x_1^2)} \), in other words \( {(1+x')K(1-x'^2)=2K(x_1^2)} \) which is the desired formula with \( {x'} \) instead of \( {x} \). To get the formula involving \( {E} \), we write

\[ 1-x^2\sin^2(\theta) =1-x^2\frac{1-\cos(2\theta)}{2} =1-\frac{x^2}{2}+\frac{x^2}{2}\cos(2\theta), \]

and since

\[ \cos(2\theta) =\cos(\theta_1)\cos(\alpha)-\sin(\theta_1)\sin(\alpha) =\cos(\theta_1)\sqrt{1-x_1^2\sin^2(\theta_1)}-x_1\sin^2(\theta_1) \]

we get

\[ 1-x^2\sin^2(\theta) =1-\frac{x^2}{2}-\frac{x^2}{2x_1} +\frac{x^2}{2}\cos(\theta_1)\sqrt{1-x_1^2\sin^2(\theta_1)} +\frac{x^2}{2x_1}(1-x_1^2\sin^2(\theta_1)). \]

Now since \( {\frac{x^2}{2x_1}=\frac{(1+x')^2}{2}} \) and \( {1-\frac{x^2}{2}-\frac{x^2}{2x_1}=-x'} \) we obtain

\[ 1-x^2\sin^2(\theta) =-x' +\frac{x^2}{2}\cos(\theta_1)\sqrt{1-x_1^2\sin^2(\theta_1)} +\frac{(1+x')^2}{2}(1-x_1^2\sin^2(\theta_1)). \]

Combining this identity with a previous identity we obtain

\[ (1+x') \int_0^{\frac{\pi}{2}} \sqrt{1-x^2\sin^2(\theta)} \mathrm{d}\theta = \int_0^{\pi} \frac{-x' +\frac{x^2}{2}\cos(\theta_1)\sqrt{1-x_1^2\sin^2(\theta_1)} +\frac{(1+x')^2}{2}(1-x_1^2\sin^2(\theta_1))} {\sqrt{1-x_1^2\sin^2(\theta_1)}} \mathrm{d}\theta_1, \]

hence \( {(1+x')E(x^2)=-2x'K(x_1^2)+0+\frac{(1+x')^2}{2}2E(x_1^2)} \) which rewrites as \( {\frac{1}{1+x'}E(1-x'^2)=-\frac{2x'}{(1+x')^2}K(x_1^2)+E(x_1^2)} \) which is the desired formula with \( {x'} \) instead of \( {x} \).
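Since \( {\frac{2}{1+x'}=1+x_1} \), the formula for \( {K} \) reads \( {K(x^2)=(1+x_1)K(x_1^2)} \), and iterating it until the argument is numerically zero, where \( {K(0)=\frac{\pi}{2}} \), gives a way to evaluate \( {K} \). The following sketch illustrates this use of the Landen transformation for numerical evaluation. The name `K_landen` and the iteration cap are hypothetical choices made for this example.

```julia
# K(x^2) for 0 <= x < 1 by iterating x -> x1 = (1 - x') / (1 + x'), x' = sqrt(1 - x^2),
# and accumulating the factors (1 + x1), until x is numerically zero.
function K_landen(x; maxiter = 20)
    factor = 1.0
    for _ in 1:maxiter
        x == 0 && break
        xp = sqrt(1 - x^2)
        x = (1 - xp) / (1 + xp)
        factor *= 1 + x
    end
    return pi / 2 * factor
end

println(K_landen(sqrt(0.5)))
```

The sequence of arguments decreases very fast, roughly like \( {x_1\approx x^2/4} \) for small \( {x} \), so that very few iterations are needed.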

Alternative expression of the change of variable. It reads

\[ \tan(\alpha) =\frac{x_1\sin(\theta_1)}{\sqrt{1-x_1^2\sin^2(\theta_1)}} =\frac{x_1\sin(2\theta)}{1+x_1\cos(2\theta)}. \]

Indeed, we have

\[ \cos(2\theta) =\cos(\theta_1)\cos(\alpha)-\sin(\theta_1)\sin(\alpha) =\cos(\theta_1)\sqrt{1-x_1^2\sin^2(\theta_1)}-x_1\sin^2(\theta_1) \]

thus

\[ 1+x_1\cos(2\theta) =\sqrt{1-x_1^2\sin^2(\theta_1)} \Bigr(x_1\cos(\theta_1)+\sqrt{1-x_1^2\sin^2(\theta_1)}\Bigr) \]

and therefore

\[ \sqrt{1-x_1^2\sin^2(\theta_1)} =\frac{1+x_1\cos(2\theta)} {x_1\cos(\theta_1)+\sqrt{1-x_1^2\sin^2(\theta_1)}} =\frac{1+x_1\cos(2\theta)} {(1+x_1)\sqrt{1-x^2\sin^2(\theta)}}. \]

Similarly we get

\[ \sin(2\theta) =\sin(\theta_1)\cos(\alpha)+\cos(\theta_1)\sin(\alpha) =\sin(\theta_1)\sqrt{1-x_1^2\sin^2(\theta_1)}+x_1\sin(\theta_1)\cos(\theta_1) \]

and therefore

\[ x_1\sin(\theta_1) =\frac{x_1\sin(2\theta)} {\sqrt{1-x_1^2\sin^2(\theta_1)}+x_1\cos(\theta_1)} =\frac{x_1\sin(2\theta)} {(1+x_1)\sqrt{1-x^2\sin^2(\theta)}} \]

Finally writing \( {\tan(\alpha)=\frac{\sin(\alpha)}{\cos(\alpha)}=\frac{x_1\sin(\theta_1)}{\sqrt{1-x_1^2\sin^2(\theta_1)}}} \) and taking the ratio of the two preceding formulas leads to the desired expression.


Final words. A famous quote says “Be careful about reading health books. You may die of a misprint”. We could say the same about books on elliptic integrals!
