
Some few moments with the problem of moments

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}x^{2n}e^{-\frac{x^2}{2}}\,dx=\frac{(2n)!}{n!\,2^n}.$$

This post is an invitation to the moments problem, taken from old notes. Let $P$ be a probability distribution on $\mathbb{R}$. The sequence of absolute moments $(M_n)$ of $P$ is given for every $n\in\mathbb{N}$ by

$$M_n=\int_{\mathbb{R}}|x|^n\,dP(x)\in[0,\infty].$$

When $M_n<\infty$, the associated moment $m_n$ is given by

$$m_n=\int_{\mathbb{R}}x^n\,dP(x),$$

and we have $|m_n|\leq M_n$. The sequence of moments $(m_n)$ of $P$ is well defined if and only if $P$ has finite absolute moments, in other words if and only if $\mathbb{R}[X]\subset L^1(P)$. Notice that $(M_n)=(m_n)$ when $P$ is supported in $\mathbb{R}_+$. We say that a probability distribution $P$ on $\mathbb{R}$ (resp. $\mathbb{R}_+$) is characterized by its moments if and only if $P$ is the unique probability distribution on $\mathbb{R}$ (resp. $\mathbb{R}_+$) with sequence of moments $(m_n)$. Moments problems probably go back to Tchebychev, Markov, and Stieltjes, and can be subdivided into several subproblems including:

  1. existence. Under which conditions is a sequence of real numbers $(m_n)$ the sequence of moments of a probability distribution?
  2. uniqueness. Under which conditions is a probability distribution characterized by its moments?
  3. structure. How can one describe the convex set of all probability distributions sharing the same sequence of moments?

Notice that existence and uniqueness problems are in general sensitive to additional constraints. For the moments problem, uniqueness on $\mathbb{R}_+$ does not imply uniqueness on $\mathbb{R}$, whereas existence on $\mathbb{R}$ does not imply existence on $\mathbb{R}_+$. Stieltjes studied moments problems on $\mathbb{R}_+$. He obtained a necessary and sufficient condition for existence, and studied uniqueness. His methods involve for instance continued fractions. Later, Hamburger continued the work of Stieltjes and studied moments problems on $\mathbb{R}$. Actually, moments problems were studied by many people including among others Marcel Riesz, Krein, Hausdorff, Hamburger, and Carleman. Nowadays, there is less activity around moments problems, but one may read Simon and also Diaconis and Freedman for a revival.

1. Existence

The moments problem is a linear problem, with a positivity constraint. The following result is due to Hamburger.

Theorem 1 (Hamburger) A sequence $(m_n)$ of real numbers with $m_0=1$ is the sequence of moments of a probability distribution on the real line if and only if the infinite Hankel matrix

$$H=\begin{pmatrix}m_0&m_1&m_2&\cdots\\m_1&m_2&m_3&\cdots\\m_2&m_3&m_4&\cdots\\\vdots&\vdots&\vdots&\ddots\end{pmatrix}$$

is positive semi-definite: $\sum_{n,n'}m_{n+n'}u_n\overline{u_{n'}}\geq0$ for any sequence $(u_n)$ of complex numbers such that $u_n=0$ except for finitely many values of $n$.

Notice that $H_{n,n'}=m_{n+n'}$ and that the condition on $H$ means that every finite leading principal submatrix $(m_{i+j})_{0\leq i,j\leq n}$ of $H$ is positive semi-definite. There exists a necessary and sufficient condition in terms of matrices for the Stieltjes moment problem. The reader will find more details on Stieltjes and Hamburger moments problems in Shohat and Tamarkin, Akhiezer, and Krein and Nudel'man.
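As a small sanity check of the Hankel criterion, the sketch below computes exact leading principal minors of the Hankel matrix built from the standard Gaussian moments $m_{2n}=(2n)!/(n!\,2^n)$, $m_{2n+1}=0$. The helper names `gaussian_moment` and `hankel_minor` are ours. Strictly positive leading minors (Sylvester's criterion) only certify positive definiteness of the finite sections examined, so this illustrates the theorem and proves nothing.

```python
from fractions import Fraction
import math


def gaussian_moment(n):
    """n-th moment of the standard Gaussian: (n)!/((n/2)! 2^(n/2)) if n is even, else 0."""
    if n % 2:
        return 0
    h = n // 2
    return math.factorial(n) // (math.factorial(h) * 2**h)


def hankel_minor(moments, size):
    """Exact determinant of the leading size x size Hankel block (m_{i+j})."""
    a = [[Fraction(moments[i + j]) for j in range(size)] for i in range(size)]
    det = Fraction(1)
    for c in range(size):
        # Find a nonzero pivot in column c (row swap flips the sign).
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for j in range(c, size):
                a[r][j] -= factor * a[c][j]
    return det


gauss = [gaussian_moment(n) for n in range(12)]
minors = [hankel_minor(gauss, s) for s in range(1, 7)]
print(minors)                          # all strictly positive
print(hankel_minor([1, 0, -1], 2))     # negative: (1, 0, -1, ...) is not a moment sequence
```

The second call shows how the criterion rejects a sequence with $m_2<0$, which cannot be the second moment of any probability distribution.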

2. Uniqueness

One can derive sufficient conditions for uniqueness in terms of Hankel determinants. We give in the sequel some aspects of a more general approach based on quasi-analytic functions. In the sequel, and unless explicitly mentioned, we consider uniqueness on $\mathbb{R}$, not on $\mathbb{R}_+$. Following Hausdorff, the density of the polynomials for the uniform topology (Weierstrass-Bernstein theorem) implies that any probability distribution $P$ supported in a compact interval $[a,b]\subset\mathbb{R}$ is characterized by its moments.

Remark 1 (A counter-example by C.C. Heyde) The log-normal distribution is not characterized by its moments. Namely, a real random variable $X$ follows the log-normal distribution if and only if $\log(X)$ follows the standard Gaussian distribution. The associated Lebesgue density is $f:x\mapsto(2\pi)^{-1/2}x^{-1}\exp(-(\log(x))^2/2)\mathbf{1}_{\mathbb{R}_+}(x)$. Now, for any fixed real number $a\in[-1,+1]$, let us consider the Lebesgue density $f_a$ defined by $f_a(x)=f(x)(1+a\sin(2\pi\log(x)))$ for every $x>0$ and $f_a(x)=0$ otherwise. It turns out that $f$ and $f_a$ share the same sequence of moments, see for instance page 227 in Feller.
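The claim of Remark 1 can be checked numerically. After the substitution $y=\log(x)$, the $n$-th moment of $f_a$ is $\int e^{ny}\phi(y)\,dy+a\int e^{ny}\phi(y)\sin(2\pi y)\,dy$ with $\phi$ the standard Gaussian density, and completing the square shows that the second integral is $e^{n^2/2}\,\mathbb{E}[\sin(2\pi(Z+n))]=0$ for every integer $n$. The sketch below approximates both integrals with a plain trapezoidal rule; the function name, the truncation window and the step count are our own choices.

```python
import math


def perturbed_lognormal_terms(n, half_width=12.0, steps=20_000):
    """Return (moment, perturbation) where the n-th moment of f_a is moment + a * perturbation.

    Both integrals are taken in the variable y = log(x), on a window centred at y = n,
    which is where the integrand exp(n*y - y*y/2) concentrates.
    """
    h = 2.0 * half_width / steps
    moment = 0.0
    perturbation = 0.0
    for i in range(steps + 1):
        y = n - half_width + i * h
        weight = h * (0.5 if i in (0, steps) else 1.0)  # trapezoidal weights
        g = math.exp(n * y - 0.5 * y * y) / math.sqrt(2.0 * math.pi)
        moment += weight * g
        perturbation += weight * g * math.sin(2.0 * math.pi * y)
    return moment, perturbation


for n in range(6):
    m, p = perturbed_lognormal_terms(n)
    print(n, m, math.exp(n * n / 2), p / m)
```

The first two printed columns agree with the log-normal moments $e^{n^2/2}$, and the relative size of the perturbation term is at the level of rounding error, whatever the value of $a$.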

The following result is due to Tchakaloff, see also Bayer and Teichmann.

Theorem 2 (Tchakaloff, Bayer and Teichmann) Let $P$ be a probability distribution on $\mathbb{R}^d$ with some finite absolute moments: $\max_{1\leq k\leq n}M_k<\infty$ for some integer $n\geq1$. Let $\mathbb{R}_n[X]$ be the vector space of polynomial functions of total degree less than or equal to $n$. Then there exists a finitely supported probability distribution $P_k=p_1\delta_{x_1}+\cdots+p_k\delta_{x_k}$ with $\mathrm{supp}(P_k)=\{x_1,\ldots,x_k\}\subset\mathrm{supp}(P)$ and $k\leq\dim\mathbb{R}_n[X]$ such that for every $f\in\mathbb{R}_n[X]$,

$$\int_{\mathbb{R}^d}f(x)\,dP(x)=\int_{\mathbb{R}^d}f(x)\,dP_k(x)=\sum_{i=1}^kp_if(x_i).$$

Proof: The proof uses basic separation properties of convex sets, the extremal points theorem of Minkowski-Carathéodory, and the transformation of the measure into a measure on the space of polynomials of bounded degree. This kind of result is also referred to as quadrature or cubature formulas. See also Putinar and Curto and Fialkow. There is a link between Hamburger moments problems and the density of polynomials in Lebesgue spaces, see for instance Stoyanov, Bakan, Bakan, Berg, Fialkow and Petrovic, and Stochel, Szafraniec, Putinar, Vasilescu. ☐
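A concrete one-dimensional instance may help. For the standard Gaussian on $\mathbb{R}$ and $n=5$, the three-point distribution with atoms $-\sqrt{3},0,\sqrt{3}$ and weights $1/6,2/3,1/6$ (the classical three-point Gauss-Hermite rule, which we pick as an illustration and which is not taken from the references above) reproduces the integral of every polynomial of degree at most $5$, with $k=3\leq\dim\mathbb{R}_5[X]=6$. The sketch checks this on monomials; the helper names are ours.

```python
import math


def gaussian_moment(n):
    """n-th moment of the standard Gaussian (0 for odd n)."""
    if n % 2:
        return 0
    h = n // 2
    return math.factorial(n) // (math.factorial(h) * 2**h)


# Atoms and weights of the finitely supported distribution P_k (here k = 3).
nodes = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
weights = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def discrete_moment(n):
    """n-th moment of the three-point distribution."""
    return sum(p * x**n for p, x in zip(weights, nodes))


for n in range(7):
    print(n, gaussian_moment(n), discrete_moment(n))
```

The moments agree up to degree $5$ and differ at degree $6$, in line with the fact that the theorem only controls polynomials of bounded degree.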

Theorem 3 (Analyticity of the Fourier transform and the moments problem) Let $P$ be a probability distribution on $\mathbb{R}$ with well defined moments $(m_n)$ and Fourier transform $\varphi_P$. The following propositions are equivalent.

  1. $\varphi_P$ is analytic on a neighborhood of the origin;
  2. $\varphi_P$ is analytic on $\mathbb{R}$;
  3. $\varlimsup_n\left(\frac{1}{n!}|m_n|\right)^{\frac{1}{n}}<\infty$.

Moreover, if they hold true, then $P$ is characterized by its moments $(m_n)$. It is the case in particular when $P$ is compactly supported or when

$$\varlimsup_n\frac{1}{n}|m_n|^{\frac{1}{n}}<\infty.$$

Proof: For every $n$, we have $M_n<\infty$, and thus $\varphi_P$ is $n$ times differentiable on $\mathbb{R}$. Moreover $\varphi_P^{(n)}$ is continuous on $\mathbb{R}$ and for every $t\in\mathbb{R}$,

$$\varphi_P^{(n)}(t)=\int_{\mathbb{R}}(ix)^ne^{itx}\,dP(x).$$

In particular, $\varphi_P^{(n)}(0)=i^nm_n$, and the Taylor series of $\varphi_P$ at the origin is determined by the sequence $(m_n)$. Recall that the radius of convergence $r$ of the power series $\sum_na_nz^n$ associated to the sequence of complex numbers $(a_n)$ is given by the Hadamard formula

$$r^{-1}=\varlimsup_n|a_n|^{\frac{1}{n}},$$

and consequently, $1\Leftrightarrow3$ (just take $a_n=i^nm_n/n!$). On the other hand, for any $n\in\mathbb{N}$ and any $s,t\in\mathbb{R}$, we have

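$$\left|e^{isx}\left(e^{itx}-1-\frac{itx}{1!}-\cdots-\frac{(itx)^{n-1}}{(n-1)!}\right)\right|\leq\frac{|tx|^n}{n!},$$

see for instance p. 512 and 514 in Feller, and thus, for any $n\in\mathbb{N}$ and any $s,t\in\mathbb{R}$,

$$\left|\varphi_P(s+t)-\varphi_P(s)-\frac{t}{1!}\varphi_P'(s)-\cdots-\frac{t^{n-1}}{(n-1)!}\varphi_P^{(n-1)}(s)\right|\leq M_n\frac{|t|^n}{n!},$$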

which implies $1\Leftrightarrow2$. By the Stirling formula, if $\varlimsup_n\frac{1}{n}|m_n|^{\frac{1}{n}}<\infty$ then condition 3 holds true. If $P$ is compactly supported, then $\sup_n|m_n|^{\frac{1}{n}}<\infty$ and thus condition 3 holds true. Suppose now that conditions 1-3 hold true. From condition 2, the analytic continuation principle states that $\varphi_P$ admits a maximal simply connected analytic continuation to a neighborhood of $\mathbb{R}$ in $\mathbb{C}$, which is thus holomorphic. Next, the sequence of moments $(m_n)$ uniquely characterizes the Taylor series at the origin, and thus uniquely characterizes the analytic continuation of $\varphi_P$ by virtue of the isolated zeros theorem. In particular, the sequence of moments characterizes the function $\varphi_P$ on $\mathbb{R}$, and thus $P$, since a probability distribution is determined by its Fourier transform. ☐
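To get a feeling for condition 3, one can evaluate the quantity $(|m_n|/n!)^{1/n}$ along two explicit moment sequences: the even moments $m_{2n}=(2n)!/(n!\,2^n)$ of the standard Gaussian, and the moments $m_n=e^{n^2/2}$ of the log-normal distribution. The sketch below works with logarithms to avoid overflow; the function names are ours.

```python
import math


def hadamard_term_gaussian(n):
    """(m_{2n} / (2n)!)^{1/(2n)} for the standard Gaussian, where m_{2n}/(2n)! = 1/(n! 2^n)."""
    log_ratio = -(math.lgamma(n + 1) + n * math.log(2.0))
    return math.exp(log_ratio / (2 * n))


def hadamard_term_lognormal(n):
    """(m_n / n!)^{1/n} for the log-normal distribution, where m_n = exp(n^2 / 2)."""
    return math.exp((0.5 * n * n - math.lgamma(n + 1)) / n)


for n in (1, 5, 50, 500):
    print(n, hadamard_term_gaussian(n), hadamard_term_lognormal(n))
```

The Gaussian terms decrease towards zero, as expected for an entire Fourier transform, while the log-normal terms blow up, so condition 3 fails for the log-normal distribution.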

It turns out that a probability distribution on $\mathbb{R}$ can be characterized by its moments without having an analytic Fourier transform. Actually, in view of the moments problem, the main useful property here regarding analyticity is that an analytic function on $\mathbb{R}$ is uniquely determined by its value and the values of all its derivatives at the origin. Quasi-analytic functions have this property. These functions were introduced by Borel and Hadamard, and were later brilliantly studied by Denjoy and Carleman, see for instance the memoir of Carleman, or chapter 19 in Rudin and section 4.2 in Bélisle, Massé, and Ransford. The Carleman condition appearing below is strictly weaker than the Hadamard condition.

For any sequence of positive real numbers $(c_n)$ and any bounded interval $[a,b]\subset\mathbb{R}$, we denote by $C([a,b],(c_n))$ the class of infinitely differentiable functions $f:[a,b]\subset\mathbb{R}\to\mathbb{C}$ such that

$$\sup_{[a,b]}|f^{(n)}|\leq r^nc_n$$

for any $n\in\mathbb{N}$ and for some positive real constant $r$ which may depend on $f$. The Hadamard problem consists in finding conditions on $(c_n)$ such that any pair of functions $f$ and $g$ in $C([a,b],(c_n))$ that are equal together with all their derivatives at some fixed point of $[a,b]$ are equal on the whole interval $[a,b]$. Such functions are called quasi-analytic. The analytic functions on $[a,b]$ correspond to the class $C([a,b],(n!))$.

Theorem 4 (Denjoy-Carleman characterization of quasi-analyticity) For any sequence of positive real numbers $(c_n)$ and any bounded interval $[a,b]\subset\mathbb{R}$, the class $C([a,b],(c_n))$ is quasi-analytic if and only if

$$\sum_{n=1}^\infty\left(\inf_{k\geq n}|c_k|^{\frac{1}{k}}\right)^{-1}=\infty.$$

Analyticity implies quasi-analyticity but the converse is false. The Carleman condition is satisfied if the Hadamard condition is satisfied.

Corollary 5 (Carleman condition for the moments problem) Let $P$ be a probability distribution on $\mathbb{R}$ with finite absolute moments $(M_n)$ and moments $(m_n)$. If at least one of the following conditions is satisfied

  1. $\sum_{n=1}^\infty M_{2n}^{-\frac{1}{2n}}=\infty$;
  2. $\sum_{n=1}^\infty M_n^{-\frac{1}{n}}=\infty$;
  3. $\sum_{n=1}^\infty|m_n|^{-\frac{1}{n}}=\infty$

then $P$ is characterized by its moments.

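Proof: We have $|m_n|\leq M_n$ for every $n\in\mathbb{N}$. On the other hand, the elementary bound

$$2|u|^{2n+1}\leq|u|^{2n}+|u|^{2n+2}$$

valid for every $u\in\mathbb{R}$ and $n\in\mathbb{N}$ implies that

$$2M_{2n+1}\leq M_{2n}+M_{2n+2}$$

for every $n\in\mathbb{N}$. Consequently, we obtain the following cascading Carleman-like conditions:

$$\sum_{n=1}^\infty M_{2n}^{-\frac{1}{2n}}=\infty\quad\Rightarrow\quad\sum_{n=1}^\infty M_n^{-\frac{1}{n}}=\infty\quad\Rightarrow\quad\sum_{n=1}^\infty|m_n|^{-\frac{1}{n}}=\infty.$$

They imply that $\varphi_P$ is quasi-analytic, and that it is characterized by the sequence $(m_n)$ since $\varphi_P^{(n)}(0)=i^nm_n$ for every $n\in\mathbb{N}$. The desired result follows then from the Denjoy-Carleman theorem with $c_n=|m_n|$ by using the bound $\inf_{k\geq n}c_k^{1/k}\leq c_n^{1/n}$. ☐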

If $\varphi_P$ is analytic on a neighborhood of the origin then the Hadamard condition $\varlimsup_nn^{-1}|m_n|^{\frac{1}{n}}<\infty$ holds true and implies the Carleman condition $\sum_{n=1}^\infty|m_n|^{-\frac{1}{n}}=\infty$.
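The Carleman series can be explored numerically on the same two examples. For the standard Gaussian, $M_{2n}=(2n)!/(n!\,2^n)$ and the partial sums of $\sum_nM_{2n}^{-1/(2n)}$ keep growing, roughly like $\sqrt{N}$. For the log-normal distribution, $M_{2n}=e^{2n^2}$, so the series is the geometric series $\sum_ne^{-n}=1/(e-1)$ and converges. The sketch below uses our own function names. Since the Carleman condition is only sufficient, its failure for the log-normal distribution is consistent with Remark 1 but does not by itself prove non-uniqueness.

```python
import math


def carleman_partial_sum_gaussian(terms):
    """Partial sum of M_{2n}^{-1/(2n)}, n = 1..terms, for the standard Gaussian."""
    total = 0.0
    for n in range(1, terms + 1):
        # log M_{2n} = log((2n)!) - log(n!) - n log 2, computed via lgamma to avoid overflow.
        log_m2n = math.lgamma(2 * n + 1) - math.lgamma(n + 1) - n * math.log(2.0)
        total += math.exp(-log_m2n / (2 * n))
    return total


def carleman_partial_sum_lognormal(terms):
    """Partial sum of M_{2n}^{-1/(2n)} = exp(-n), n = 1..terms, for the log-normal distribution."""
    return sum(math.exp(-n) for n in range(1, terms + 1))


for terms in (10, 100, 1000, 10000):
    print(terms, carleman_partial_sum_gaussian(terms), carleman_partial_sum_lognormal(terms))
```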

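Let us move now to the multivariate moment problem. We define the sequence of absolute moments $(M_n)$ of a probability distribution $P$ on $\mathbb{R}^d$ with $d>1$ by

$$M_n=\int_{\mathbb{R}^d}\left\Vert x\right\Vert^n\,dP(x)\in[0,\infty]$$

for any $n\in\mathbb{N}$. If $P_{\left\Vert\cdot\right\Vert}$ denotes the image distribution of $P$ by the map $x\mapsto\left\Vert x\right\Vert$, then $P_{\left\Vert\cdot\right\Vert}$ is a probability distribution on $\mathbb{R}_+$ with sequence of moments $(M_n)$. By Hölder's inequality, if $M_n<\infty$ for every $n\in\mathbb{N}$, then for any multi-index $k\in\mathbb{N}^d$, one can define the moment $m_k$ of $P$ by

$$m_k=m_{k_1,\ldots,k_d}=\int_{\mathbb{R}^d}x_1^{k_1}\cdots x_d^{k_d}\,dP(x).$$

We say that $P$ is characterized by its moments if and only if $P$ is the unique probability distribution on $\mathbb{R}^d$ with moments $(m_k)$. Notice that when $M_n<\infty$ for every $n\in\mathbb{N}$, the Fourier transform $\varphi_P$ is infinitely Fréchet differentiable on $\mathbb{R}^d$ and for every $t\in\mathbb{R}^d$ and $k\in\mathbb{N}^d$,

$$\partial_{t_1}^{k_1}\cdots\partial_{t_d}^{k_d}\varphi_P(t_1,\ldots,t_d)=i^{k_1+\cdots+k_d}\int_{\mathbb{R}^d}x_1^{k_1}\cdots x_d^{k_d}e^{i\left<t,x\right>}\,dP(x),$$

and in particular the value at $t=0$ is $i^{k_1+\cdots+k_d}m_k$.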

It is delicate to make use of analyticity in dimension strictly bigger than 1 due to the lack of a multidimensional isolated zeros theorem (e.g. Hartogs-type phenomena). In some sense, the Carleman moments condition turns out to be more flexible. If $P$ is a law on $\mathbb{R}^d$, $X\sim P$, and $x\in\mathbb{R}^d$, then we denote by $P_{\left<x\right>}$ the law of the real random variable $\left<x,X\right>$.

Theorem 6 (Multidimensional case) Let $P$ be a probability distribution on $\mathbb{R}^d$ with Fourier transform $\varphi_P$ and finite absolute moments $(M_n)$ and moments $(m_k)$. If at least one of the following propositions holds true

  1. $P_{\left<x\right>}$ is characterized by its moments for every $x\in\mathbb{S}(\mathbb{R}^d)$;
  2. $P$ satisfies the Carleman condition

    $$\sum_{n=1}^\infty(M_n)^{-\frac{1}{n}}=\infty;$$

  3. for every $x\in\mathbb{S}(\mathbb{R}^d)$, the function $t\in\mathbb{R}\mapsto\varphi_P(tx)$ is analytic on $\mathbb{R}$;

then $P$ is characterized by its moments $(m_k)$.

Proof: For every $x\in\mathbb{S}(\mathbb{R}^d)$, the moments of the unidimensional probability distribution $P_{\left<x\right>}$ are uniquely determined by the sequence $(m_k)$ since for every $n\in\mathbb{N}$,

$$\int_{\mathbb{R}}\!u^n\,dP_{\left<x\right>}(u)=\int_{\mathbb{R}^d}\!\left<x,y\right>^n\,dP(y) =\sum_{k_1+\cdots+k_d=n}\binom{n}{k_1,\ldots,k_d} x_1^{k_1}\cdots x_d^{k_d}m_{k_1,\ldots,k_d}.$$

By the Cramér-Wold theorem, condition 1 implies that $P$ is characterized by $(m_k)$. $2\Rightarrow1$. For every $x\in\mathbb{S}(\mathbb{R}^d)$, the unidimensional probability distribution $P_{\left<x\right>}$ satisfies in turn the Carleman condition since for every $n\in\mathbb{N}$,

$$\int_{\mathbb{R}}\!|u|^n\,dP_{\left<x\right>}(u)= \int_{\mathbb{R}^d}\!|\left<x,y\right>|^n\,dP(y) \leq \left\Vert x\right\Vert^{n}\int_{\mathbb{R}^d}\!\left\Vert y\right\Vert^{n}\,dP(y) =M_n,$$

so that Corollary 5 applies. $3\Rightarrow1$. For every $x\in\mathbb{S}(\mathbb{R}^d)$, we have $\varphi_{P_{\left<x\right>}}(t)=\varphi_P(tx)$ for every $t\in\mathbb{R}$, so that Theorem 3 applies. ☐
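The multinomial identity used at the start of the proof is easy to test on a toy example. The sketch below takes $d=2$ and a three-atom distribution whose atoms and weights are arbitrary illustrative values, and compares the moments of the projection $\left<x,Y\right>$ computed directly with the same moments rebuilt from the mixed moments $m_{k_1,k_2}$. All names are ours.

```python
import math

# A toy distribution on R^2: three atoms with their probabilities (illustrative values).
points = [(1.0, 2.0), (-0.5, 0.3), (0.7, -1.1)]
probs = [0.2, 0.5, 0.3]


def mixed_moment(k1, k2):
    """m_{k1,k2} = E[Y_1^{k1} Y_2^{k2}]."""
    return sum(p * y1**k1 * y2**k2 for p, (y1, y2) in zip(probs, points))


def projected_moment_direct(x, n):
    """E[<x, Y>^n], computed from the definition."""
    return sum(p * (x[0] * y1 + x[1] * y2) ** n for p, (y1, y2) in zip(probs, points))


def projected_moment_from_mixed(x, n):
    """Same quantity rebuilt from the mixed moments via the binomial expansion (d = 2)."""
    return sum(
        math.comb(n, k1) * x[0] ** k1 * x[1] ** (n - k1) * mixed_moment(k1, n - k1)
        for k1 in range(n + 1)
    )


direction = (math.cos(0.4), math.sin(0.4))  # a point of the unit circle
for n in range(7):
    print(n, projected_moment_direct(direction, n), projected_moment_from_mixed(direction, n))
```

The two columns coincide up to rounding, which is the finite-dimensional content of the statement that the moments of every $P_{\left<x\right>}$ are determined by $(m_k)$.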

2 Comments

  1. Ti 2024-05-30

    Dear Prof. Chafaï,

    thank you for your post, Existence and the associated counter example on the log normal are new to me... So the following question would seem a bit naive, anyway :
    I was wondering if such results would exist with cumulants ?

    On the moment problem, do you know any book that gather the theorems you mentioned ?

    Kind regards
