Press "Enter" to skip to content

Category: Uncategorized

Eigenvalue repulsion

Eugene Wigner and Edward Teller

GOE. The most well-known model of random matrices is probably the Gaussian Orthogonal Ensemble (GOE) of symmetric $n\times n$ random matrices, with Gaussian density proportional to $$X\mapsto\mathrm{e}^{-\tfrac{1}{2}\mathrm{Trace}(X^2)}.$$ The eigen-decomposition $X=ODO^\top$ and the change of variable $X\mapsto (O,D)$ show that the eigenvalues of the GOE have density on $\mathbb{R}^n$ proportional to $$\mathrm{e}^{-\tfrac{1}{2}\sum_{i=1}^n\lambda_i^2}\prod_{i<j}|\lambda_i-\lambda_j|=\mathrm{e}^{-\Bigl(\tfrac{1}{2}\sum_{i=1}^n\lambda_i^2+\sum_{i<j}\log\frac{1}{|\lambda_i-\lambda_j|}\Bigr)}.$$ The term $\prod_{i<j}|\lambda_i-\lambda_j|$ is the absolute value of the determinant of the Jacobian of the change of variable. It expresses the way the Lebesgue measure on symmetric matrices is deformed by the change of variable. It vanishes when two eigenvalues coincide, revealing an eigenvalue repulsion. What is the simplest way to understand this repulsion phenomenon?

2×2 fact. It is actually a basic non-linear fact, checkable on fixed $2\times2$ matrices: $$X=\begin{pmatrix} x_1 & x_3\\x_3 & x_2\end{pmatrix},\quad x_1,x_2,x_3\in\mathbb{R}.$$ Namely, the eigenvalues $\lambda_1$ and $\lambda_2$, the roots of $\lambda^2-(x_1+x_2)\lambda+x_1x_2-x_3^2=0$, are $$\frac{x_1+x_2\pm\sqrt{(x_1+x_2)^2-4(x_1x_2-x_3^2)}}{2},$$ hence the eigenvalue spacing is given by $$|\lambda_1-\lambda_2|=\sqrt{(x_1-x_2)^2+4x_3^2}.$$ Now if $x_3=0$ then the matrix is diagonal and $|\lambda_1-\lambda_2|=|x_1-x_2|$, while if $x_3\neq0$ then $|\lambda_1-\lambda_2|\geq2|x_3|>0$, which is precisely an eigenvalue repulsion phenomenon.
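The spacing formula can be checked numerically; here is a minimal sketch using numpy (the values are arbitrary):

```python
import numpy as np

# Sanity check of the spacing formula on a random symmetric 2x2 matrix.
rng = np.random.default_rng(0)
x1, x2, x3 = rng.normal(size=3)
X = np.array([[x1, x3],
              [x3, x2]])
l1, l2 = np.linalg.eigvalsh(X)  # eigenvalues in increasing order
assert np.isclose(abs(l2 - l1), np.sqrt((x1 - x2)**2 + 4 * x3**2))
```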

Surmise. Let us now put a Gaussian law on this $2\times2$ matrix, making it a GOE matrix, with density proportional to $$\mathrm{e}^{-\tfrac{1}{2}\mathrm{Trace}\begin{pmatrix}x_1 & x_3\\x_3 & x_2\end{pmatrix}^2}=\mathrm{e}^{-\tfrac{1}{2}(x_1^2+x_2^2+2x_3^2)}.$$ Then the law of the spacing $S:=|\lambda_1-\lambda_2|$ can be computed as follows: $$S\overset{d}{=}\sqrt{(X_1-X_2)^2+4X_3^2},\quad\text{with } X_1,X_2,\sqrt{2}X_3\text{ iid }\mathcal{N}(0,1),$$ therefore
$$S^2\overset{d}{=}(X_1-X_2)^2+4X_3^2\sim\Gamma(\tfrac{1}{2},\tfrac{1}{4})*\Gamma(\tfrac{1}{2},\tfrac{1}{4})=\Gamma(1,\tfrac{1}{4})=\mathrm{Exp}(\tfrac{1}{4}),$$ and thus the spacing $S$ has density proportional to $$s\mathrm{e}^{-\tfrac{1}{4}s^2}=\mathrm{e}^{-(\tfrac{1}{4}s^2-\log(s))}.$$ This density vanishes at the origin $s=0$, which corresponds to a logarithmic repulsion in the energy. This simple computation, carried out by Wigner, led him to his famous surmise (or conjecture) on the distribution of the eigenvalue spacings of high-dimensional random matrices.
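This computation is easy to confront with a simulation; here is a minimal Monte Carlo sketch using numpy, checking the first two moments of $S$ (under the normalized density $\tfrac{s}{2}\mathrm{e}^{-s^2/4}$ we get $\mathbb{E}[S]=\sqrt{\pi}$ and $\mathbb{E}[S^2]=4$):

```python
import numpy as np

# Monte Carlo check of the Wigner surmise for the 2x2 GOE spacing.
rng = np.random.default_rng(0)
n = 10**6
x1, x2 = rng.normal(size=(2, n))             # diagonal entries, N(0,1)
x3 = rng.normal(scale=np.sqrt(0.5), size=n)  # off-diagonal entry, N(0,1/2)
s = np.sqrt((x1 - x2)**2 + 4 * x3**2)        # spacing |lambda_1 - lambda_2|

print(s.mean(), np.sqrt(np.pi))  # ~1.772: E[S] under (s/2)exp(-s^2/4)
print((s**2).mean(), 4.0)        # S^2 ~ Exp(1/4) has mean 4
```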

Non-linear effect and removable singularity. The squared spacing $S^2$ follows the exponential law of density proportional to $t\mapsto\mathrm{e}^{-t/4}$, which does not vanish at $t=0$: the singularity at the origin of the density of the spacing $S$ can thus be removed by a non-linear change of variable. More generally, if $t\mapsto f(t)$ is a density which does not vanish at $t=0$, then the change of variable $s=\sqrt{t}$ leads to a density proportional to $s\mapsto sf(s^2)$, which vanishes at $s=0$. In the case of a diagonal matrix, $x_3=0$, we have $S^2=(X_1-X_2)^2\sim\Gamma(\tfrac{1}{2},\tfrac{1}{4})$, which has density proportional to $t\mapsto\mathrm{e}^{-t/4}/\sqrt{t}$, while $S$ has density proportional to $s\mapsto\mathrm{e}^{-s^2/4}$, which does not vanish at $s=0$: no repulsion. What matters is the vanishing or non-vanishing of the density of the spacing $S$ itself, because the squared spacing $S^2$ is not a linear function of the coordinates of interest (the $\lambda$’s).

Jacobian. Let us check on the $2\times2$ example that the absolute value of the determinant of the Jacobian of the change of variable $X=ODO^\top\mapsto (O,D)$ equals $|\lambda_1-\lambda_2|=\lambda_+-\lambda_-$ (here $n=2$). The matrix $X$ has $3=\frac{n^2-n}{2}+n$ degrees of freedom: $x_1,x_2,x_3$, while for $(O,D)$ we have two degrees of freedom $\lambda_\pm$, defining the diagonal matrix $D$, and another degree of freedom, say $\theta$, defining the $2\times2$ orthogonal matrix $O$, which can be parametrized as $$O=\begin{pmatrix}\cos(\theta) & -\sin(\theta)\\\sin(\theta) & \cos(\theta)\end{pmatrix}.$$ Now $X=ODO^\top$ gives \begin{align*}x_1&=\lambda_+\cos(\theta)^2+\lambda_-\sin(\theta)^2\\x_2&=\lambda_+\sin(\theta)^2+\lambda_-\cos(\theta)^2\\x_3&=(\lambda_+-\lambda_-)\cos(\theta)\sin(\theta)\end{align*} and the determinant of the Jacobian is then \begin{align*}\det J&=\begin{vmatrix}\partial_{\lambda_+}x_1 & \partial_{\lambda_-}x_1 & \partial_{\theta}x_1\\\partial_{\lambda_+}x_2 & \partial_{\lambda_-}x_2 & \partial_{\theta}x_2\\\partial_{\lambda_+}x_3 & \partial_{\lambda_-}x_3 & \partial_{\theta}x_3\end{vmatrix}\\&=\begin{vmatrix}\cos(\theta)^2&\sin(\theta)^2&-2(\lambda_+-\lambda_-)\cos(\theta)\sin(\theta)\\\sin(\theta)^2&\cos(\theta)^2&2(\lambda_+-\lambda_-)\cos(\theta)\sin(\theta)\\\cos(\theta)\sin(\theta)&-\cos(\theta)\sin(\theta)&(\lambda_+-\lambda_-)(\cos(\theta)^2-\sin(\theta)^2)\end{vmatrix}\\&=\lambda_+-\lambda_-.\end{align*}
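This determinant can also be verified symbolically; a minimal sketch with sympy:

```python
import sympy as sp

# Symbolic check that det J = lambda_+ - lambda_- for the 2x2 spectral map.
lp, lm, t = sp.symbols('lambda_p lambda_m theta', real=True)
O = sp.Matrix([[sp.cos(t), -sp.sin(t)],
               [sp.sin(t),  sp.cos(t)]])
D = sp.diag(lp, lm)
X = O * D * O.T
x1, x2, x3 = X[0, 0], X[1, 1], X[0, 1]
J = sp.Matrix(3, 3, lambda i, j: sp.diff([x1, x2, x3][i], [lp, lm, t][j]))
print(sp.simplify(J.det()))  # lambda_p - lambda_m
```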

2×2 fact, non-Hermitian version. Let us now consider the fixed matrix $$X=\begin{pmatrix} x_1 & x_3\\x_4 & x_2\end{pmatrix},\quad x_1,x_2,x_3,x_4\in\mathbb{C}.$$ The eigenvalues $\lambda_1$ and $\lambda_2$, the roots in $\mathbb{C}$ of $\lambda^2-(x_1+x_2)\lambda+x_1x_2-x_3x_4=0$, are $$\frac{x_1+x_2\pm\sqrt{(x_1+x_2)^2-4(x_1x_2-x_3x_4)}}{2},$$ hence the eigenvalue spacing is given by $$|\lambda_1-\lambda_2|=\sqrt{|(x_1-x_2)^2+4x_3x_4|}.$$ Now if $x_3x_4=0$ then the matrix is triangular and $|\lambda_1-\lambda_2|=|x_1-x_2|$, while if $x_3x_4\neq0$ then the situation is more subtle than in the case of symmetric or Hermitian matrices.

This post benefited from useful comments by Guillaume Dubach.


Little ell p

Stefan Banach (1892 – 1945)

I teach topology and differential calculus this semester. This is the occasion to play with many nice mathematical concepts, including some of my favorite Banach spaces: the $\ell^p$ spaces. These spaces are at the same time simple, important, and subtle. This tiny post collects some basic properties of these spaces, just for pleasure. I will improve this tiny post from time to time.

In what follows, $\mathbb{K}\in\{\mathbb{R},\mathbb{C}\}$.

$\ell^p$ spaces. For all $p\in[1,\infty)$, $\ell^p=\ell^p(\mathbb{N},\mathbb{K})$ is the set of sequences $(x_n)_{n\in\mathbb{N}}$ in $\mathbb{K}$ such that $$\|x\|_p:=\Bigl(\sum_n|x_n|^p\Bigr)^{1/p}<\infty.$$ We also define $\ell^\infty=\ell^\infty(\mathbb{N},\mathbb{K})$ as the set of sequences $(x_n)_{n\in\mathbb{N}}$ in $\mathbb{K}$ such that $$\|x\|_\infty:=\sup_n|x_n|<\infty.$$ These spaces naturally generalize $\mathbb{R}^n$ to $\mathbb{R}^\infty$. Note by the way that $[0,1]^\infty\subset\ell^\infty$, but the topology induced by $\ell^\infty$ on $[0,1]^\infty$ is not the product topology, it is stronger: it corresponds to uniform convergence rather than to the convergence of each coordinate.

We always have $\ell^{p_1}\subsetneq\ell^{p_2}$ for all $1\leq p_1<p_2\leq\infty$, since $\|x\|_{p_2}\leq\|x\|_{p_1}$, and the inclusion is strict: for instance $x_n:=(n+1)^{-1/p_1}$ belongs to $\ell^{p_2}$ but not to $\ell^{p_1}$.

Three spaces are of special importance: $\ell^1$, $\ell^2$, and $\ell^\infty$.

Functional point of view. A sequence $(x_n)_{n\in\mathbb{N}}$ in $\mathbb{K}$ is a function $\mathbb{N}\to\mathbb{K}$, $x(n):=x_n$. If we equip $\mathbb{N}$ with the discrete topology, the discrete $\sigma$-field, and the counting measure $\mathrm{d}n$, then $$\ell^p(\mathbb{N},\mathbb{K})=L^p(\mathbb{N},\mathrm{d}n,\mathbb{K})\quad\text{and}\quad \|x\|_p^p=\int |x(n)|^p\mathrm{d}n,$$ while $\ell^\infty=L^\infty(\mathbb{N},\mathrm{d}n,\mathbb{K})=\mathcal{C}_b(\mathbb{N},\mathbb{K})$. But let us study these spaces from scratch.

For all $n\in\mathbb{N}$, we set $e_n:=\mathbf{1}_{\{n\}}$, in such a way that for all $x\in\ell^p$ with $p\in[1,\infty)$, $x=\sum_nx_ne_n$, the series converging in $\ell^p$.

Completeness. The spaces $\ell^p$, $p\in[1,\infty]$, are Banach spaces: complete normed vector spaces. The vector space structure and the norm axioms are not difficult to check. To establish the completeness of $\ell^p$ when $p\in[1,\infty)$, we consider a Cauchy sequence $(x^{(m)})$ in $\ell^p$. For all $n$, $(x^{(m)}_n)_m$ is a Cauchy sequence in $\mathbb{K}$, which is complete, hence we get a sequence $x^*=(x^*_n)$ such that $x^{(m)}_n\to x^*_n$ as $m\to\infty$, for all $n$. Next, since $(x^{(m)})$ is bounded in $\ell^p$, for all $N$, $$\sum_{n=0}^N|x^*_n|^p=\lim_{m\to\infty}\sum_{n=0}^N|x^{(m)}_n|^p\leq\sup_m\|x^{(m)}\|_p^p<\infty,$$ thus $x^*\in\ell^p$ by taking the limit $N\to\infty$. Next, $$\sum_{n=0}^N|x^{(m)}_n-x^*_n|^p=\lim_{k\to\infty}\sum_{n=0}^N|x^{(m)}_n-x^{(k)}_n|^p\leq\varlimsup_{k\to\infty}\|x^{(m)}-x^{(k)}\|_p^p,$$ which implies that $x^{(m)}\to x^*$ in $\ell^p$ since $(x^{(m)})$ is Cauchy in $\ell^p$. The same argument, with sums replaced by suprema, works for $\ell^\infty$.

For $p=2$, we get a Hilbert space $\ell^2$ with dot product $x\cdot y:=\sum_nx_n\overline{y_n}$, while when $p\neq2$, the norm $\left\|\cdot\right\|_p$ does not satisfy the parallelogram identity and is thus not Hilbertian.

Not locally compact. For $\ell^p$, $p\in[1,\infty]$, and any $r>0$, the points $x_n:=re_n$ satisfy $\|x_n\|_p=r$ and $\|x_n-x_m\|_p=r2^{1/p}$ if $n\neq m$. Now if the closed ball $\overline{B}(0,r)$ were covered by a finite number of balls of radius $\varepsilon>0$, then one of these balls would contain at least two distinct points $x_n$ and $x_m$, giving $r2^{1/p}=\|x_n-x_m\|_p\leq2\varepsilon$, which is impossible when $\varepsilon<r2^{1/p-1}$. Hence closed balls are not compact. The argument can be adapted to any infinite-dimensional normed vector space.

Non-separability of $\ell^\infty$. For $I\subset\mathbb{N}$, set $e_I:=\sum_{i\in I}e_i$. Then for $I\neq J$, we have $\|e_I-e_J\|_\infty=1$, thus $B(e_I,\tfrac{1}{2})\cap B(e_J,\tfrac{1}{2})=\varnothing$, and $\ell^\infty$ thus contains an uncountable family of pairwise disjoint non-empty open balls, since the set of subsets of $\mathbb{N}$ is uncountable. A dense subset must have a point in each of these balls, and can therefore not be countable. This argument does not work for $\ell^p$ with $p\in[1,\infty)$, since in this case $e_I\in\ell^p$ imposes that $I$ is finite, and the set of finite subsets of $\mathbb{N}$ is countable.

Separability of $\ell^p$, $p\in[1,\infty)$. It suffices to consider $A:=\cup_nA_n$ where $$A_n:=\Bigl\{\sum_{i=0}^nq_ie_i:q\in\mathbb{Q}_{\mathbb{K}}^{n+1}\Bigr\}.$$ Indeed, for all $x\in\ell^p$ and all $\varepsilon>0$, denoting $\pi_n(x):=\sum_{i=0}^{n-1}x_ie_i$, we have $\|x-\pi_n(x)\|_p\leq\varepsilon$ for a large enough $n$, and then $\|\pi_n(x)-y\|_p\leq\varepsilon$ for some $y\in A_n$, thanks to the density of $\mathbb{Q}_{\mathbb{K}}$ in $\mathbb{K}$. Note that the approximation $\|x-\pi_n(x)\|_p\leq\varepsilon$ of $x$ by the finitely supported sequence $\pi_n(x)$ does not work in $\ell^\infty$. Actually the closure in $\ell^\infty$ of the set of finitely supported sequences is the set of sequences which tend to $0$ at $\infty$, denoted $\ell^\infty_0$, which is strictly smaller than $\ell^\infty$, and which is a separable Banach subspace of $\ell^\infty$.

Hölder inequality. For all $p\in[1,\infty]$ and its conjugate $q:=p/(p-1)\in[1,\infty]$, and all $x\in\ell^p$, $y\in\ell^q$, $$\sum_n|x_ny_n|\leq\|x\|_p\|y\|_q.$$ It follows from the Hölder inequality on $\mathbb{R}^n$ applied to $\pi_n(x)=\sum_{i=0}^{n-1}x_ie_i$ and $\pi_n(y)=\sum_{i=0}^{n-1}y_ie_i$, by letting $n\to\infty$. Equality is achieved when $|x_n|^p$ and $|y_n|^q$ are proportional (possibly asymptotically).
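As a quick illustration, here is a minimal numerical sketch of the inequality and of its equality case, with arbitrary truncated sequences:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3.0
q = p / (p - 1)  # Hölder conjugate of p
n = np.arange(1, 1001)
x = rng.normal(size=1000) / n  # a generic element of l^p (truncated)
y = rng.normal(size=1000) / n

lhs = np.sum(np.abs(x * y))
rhs = np.sum(np.abs(x)**p)**(1/p) * np.sum(np.abs(y)**q)**(1/q)
assert lhs <= rhs

# Equality case: |y_n|^q proportional to |x_n|^p (here equal).
y_eq = np.sign(x) * np.abs(x)**(p / q)
lhs_eq = np.sum(np.abs(x * y_eq))
rhs_eq = np.sum(np.abs(x)**p)**(1/p) * np.sum(np.abs(y_eq)**q)**(1/q)
assert np.isclose(lhs_eq, rhs_eq)
```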

Dual of $\ell^p$, $p\in[1,\infty)$. Recall that the dual of a normed vector space $X$ is the normed vector space $X’=L(X,\mathbb{K})$ of continuous linear forms, namely continuous linear mappings $X\to\mathbb{K}$.

For all $p\in[1,\infty)$, denoting $q:=1/(1-1/p)=p/(p-1)\in(1,\infty]$ the Hölder conjugate of $p$, the map $\Phi:\ell^q\to(\ell^p)’$ defined for all $y\in\ell^q$ and $x\in\ell^p$ by $$\Phi(y)(x):=\sum_nx_ny_n$$ is a bijective linear isometry. In particular $$(\ell^p)’\equiv\ell^q.$$ In particular $(\ell^2)’\equiv\ell^2$, $(\ell^1)’\equiv\ell^\infty$, and $((\ell^p)’)’\equiv\ell^p$ (reflexivity) for $p\in(1,\infty)$.

Proof. The map $\Phi$ is well defined thanks to the Hölder inequality, with $\|\Phi(y)\|_{(\ell^p)’}\leq\|y\|_q$. Let us establish the equality $\|\Phi(y)\|_{(\ell^p)’}=\|y\|_q$, which implies that $\Phi$ is an isometry and hence injective, and then that $\Phi$ is surjective.

Let us consider the case $p=1$ ($q=\infty$). Let $y\in\ell^\infty$, $y\neq0$. There exists $(n_k)$ with $y_{n_k}\neq0$ such that $|y_{n_k}|\to\|y\|_\infty$ when $k\to\infty$. For all $k$, $x_k:=\tfrac{|y_{n_k}|}{y_{n_k}}e_{n_k}\in\ell^1$ satisfies $\|x_k\|_1=1$ and $\Phi(y)(x_k)=|y_{n_k}|\to\|y\|_\infty$ when $k\to\infty$, hence $\|\Phi(y)\|_{(\ell^1)’}=\|y\|_\infty$, and $\Phi$ is an isometry and thus injective. For the surjectivity, if $\varphi\in(\ell^1)’$ then for all $x\in\ell^1$, $\varphi(x)=\sum_ny_nx_n$ with $y_n:=\varphi(e_n)$, but $|y_n|\leq\|\varphi\|\|e_n\|_1=\|\varphi\|<\infty$, hence $y\in\ell^\infty$ and $\varphi=\Phi(y)$.

Let us consider the case $p>1$ ($q<\infty$). Let $y\in\ell^q$, and let $x$ be given by the equality case in the Hölder inequality, namely $x_n:=\tfrac{|y_n|}{y_n}|y_n|^{q-1}$ if $y_n\neq0$ and $x_n:=0$ otherwise. It satisfies $\|x\|_p^p=\|y\|_q^q$ since $p(q-1)=q$, hence $\Phi(y)(x)=\|y\|_q^q=\|y\|_q\|x\|_p$, hence $\|\Phi(y)\|_{(\ell^p)’}=\|y\|_q$, and $\Phi$ is an isometry, hence injective. Let us show that $\Phi$ is surjective. Let $\varphi\in(\ell^p)’$, then for all $x\in\ell^p$, $\varphi(x)=\sum_nx_ny_n$ where $y_n:=\varphi(e_n)$. Suppose by contradiction that $y\not\in\ell^q$. Denoting $\pi_N(z):=(z_0,z_1,\ldots,z_{N-1},0,0,\ldots)$ and taking $x$ as above, since $y\not\in\ell^q$,
$$
\frac{\varphi(\pi_N(x))}{\|\pi_N(x)\|_p}
=\|\pi_N(y)\|_q\underset{N\to\infty}{\longrightarrow}\infty,
\quad\text{which contradicts the boundedness of $\varphi$}.
$$

Dual of $\ell^\infty$. The dual of $\ell^\infty=(\ell^1)’$ is strictly larger than $\ell^1$. More precisely, the map $\Phi:\ell^1\to(\ell^\infty)’$ defined by $\Phi(y)(x):=\sum_nx_ny_n$ is a linear isometry which is injective but not surjective. In other words, $\ell^1\subsetneq(\ell^\infty)’=((\ell^1)’)’$, that is, $\ell^1$ is not reflexive.

Proof. The isometry (and thus the injectivity) comes from the Hölder inequality
$|\Phi(y)(x)|\leq\|x\|_\infty\|y\|_1$ and its equality case $x_n=\tfrac{|y_n|}{y_n}$ (say $x_n:=1$ if $y_n=0$), which gives $\|\Phi(y)\|=\|y\|_1$. It remains to establish that $\Phi$ is not surjective. Consider the following subspace of $\ell^\infty$:
\[
S:=\Bigl\{(x_n):x_*:=\lim_{n\to\infty}x_n\text{ exists}\Bigr\}.
\]
The linear form $x\mapsto x_*$ is bounded with unit norm on $S$. Thanks to the Hahn–Banach theorem (which does not require separability, and $\ell^\infty$ is indeed not separable) we can extend it to a linear form $L:\ell^\infty\to\mathbb{K}$ in such a way that $|Lx|\leq\|x\|_\infty=\sup_n|x_n|$ and $Lx=\lim_{n\to\infty}x_n$ if $(x_n)$ converges. We have thus constructed a “limit” for each bounded sequence, which respects linearity, and which coincides with the usual limit for converging sequences. In particular this proves that $(\ell^\infty)’\supsetneq\ell^1$. Indeed, if we had $L(x)=\sum_nx_ny_n$ for some $(y_n)\in\ell^1$, then, defining, for a fixed $m$ and $\ell\neq0$, $x_n:=0$ if $n<m$ and $x_n:=\ell$ if $n\geq m$, we would get a contradiction: $$\ell=L(x)=\ell\sum_{n\geq m}y_n\xrightarrow[m\to\infty]{}0.$$

The space $S\subset\ell^\infty$ above can be seen as the space of continuous functions on the Alexandrov compactification $\overline{\mathbb{N}}=\mathbb{N}\cup\{\infty\}$ of $\mathbb{N}$. The topology of this space is metrizable, and by a Riesz representation theorem, every continuous linear form on $S$ can be seen as a measure on $\overline{\mathbb{N}}$. The restriction of the linear form $L$ above to $S$ then clearly corresponds to the Dirac mass at $\infty$. This functional respects the additive structure but not necessarily the multiplicative one; in other words we do not necessarily have $L(xy)=L(x)L(y)$, and similarly we do not necessarily have $L(f(x))=f(L(x))$ for all $f$.

Representation.

  • Every separable Banach space is isomorphic to a quotient $\ell^1/S$ where $S$ is a closed subspace of $\ell^1$.
  • Every separable Hilbert space of infinite dimension is isomorphic to $\ell^2$ (via a Hilbert basis).

Bibliothèque de mathématiques de l’ÉNS

What future is there for the mathematics libraries of French universities and grandes écoles? The shift of mathematical publishing to electronic form and the progressive disengagement of the CNRS mean that more and more of these local research libraries are disappearing or being absorbed into a central library. Here is an interview with Bernard Teissier, conducted by Nathalie Queyroux on March 22, 2019, at the Bibliothèque de mathématiques et informatique de l'ÉNS. Curious but hurried listeners can easily play it at 1.5x speed. To learn more: https://www.oralemens.ens.fr/s/PPM/page/bernard-teissier

 


Mellin transform and Riesz potentials

Robert Hjalmar Mellin (1854 – 1933)

Even if the Mellin transform is just a deformed Fourier or Laplace transform, it plays a pleasant role in all the mathematics involving powers and integral transforms, such as Dirichlet series in number theory, Riesz power kernels in potential theory, mathematical statistics, etc. In particular, the famous Meijer G-function is defined as the inverse Mellin transform of ratios of products of Gamma functions. This tiny post is about an application of the Mellin transform to Riesz integral formulas, taken from an article by Bartłomiej Dyda, Alexey Kuznetsov, and Mateusz Kwaśnicki.

The Mellin transform and its inverse. Quoting Davies’s book on integral transforms (chapter 12), we recall that the Fourier transform pair may be written in the form\begin{align}
A(\theta)&:=\int_{\mathbb{R}}a(t)\mathrm{e}^{\mathrm{i}\theta t}\mathrm{d}t,\quad\alpha<\Im\theta<\beta,\\
a(t)&=\frac{1}{2\pi}\int_{\mathrm{i}c+\mathbb{R}}A(\theta)\mathrm{e}^{-\mathrm{i}\theta t}\mathrm{d}\theta,\quad\alpha<c<\beta.
\end{align} The Mellin transform and its inverse follow if we introduce the change of variables
\begin{equation}
z=\mathrm{i}\theta,\quad x=\mathrm{e}^t,\quad f(x)=a(\log(x)),
\end{equation} so that we obtain the reciprocal pair of integral transforms, for $f:(0,+\infty)\to\mathbb{R}$,
\begin{align}
F(z)&:=\int_0^\infty f(x)x^{z-1}\mathrm{d}x,\quad\alpha<\Re z<\beta,\\
f(x)&=\frac{1}{2\pi\mathrm{i}}\int_{c+\mathrm{i}\mathbb{R}}F(z)x^{-z}\mathrm{d}z,\quad\alpha<c<\beta.
\end{align} These are the Mellin transform, and the Mellin inversion formula. The integral defining the transform normally exists only in the strip $\alpha<\Re(z)<\beta$; therefore the inversion contour must be placed in this strip.
For convenience we also denote by $\mathcal{M}f=F$ the Mellin transform of $f$.

The Mellin transform of $x\mapsto\mathrm{e}^{-x}$ is the Euler $\Gamma$ function. Its poles are $0,-1,-2,-3,\ldots$ In the same spirit, the Mellin transform of $x\mapsto(1-x)_+^{b-1}$ at point $z$ is $$\int_0^1x^{z-1}(1-x)^{b-1}\mathrm{d}x=\mathrm{Beta}(z,b).$$Recall the definition of the Euler Beta function $\mathrm{Beta}(a,b):=\int_0^1t^{a-1}(1-t)^{b-1}\mathrm{d}t=\frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$.
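These two transforms are easy to check numerically; a minimal sketch with mpmath, at arbitrary points:

```python
import mpmath as mp

z, b = mp.mpf('2.5'), mp.mpf('1.7')  # arbitrary, with Re(z), Re(b) > 0

# Mellin transform of exp(-x) is Gamma(z).
M_exp = mp.quad(lambda x: mp.exp(-x) * x**(z - 1), [0, mp.inf])
print(M_exp, mp.gamma(z))

# Mellin transform of (1-x)_+^{b-1} is Beta(z, b).
M_pow = mp.quad(lambda x: x**(z - 1) * (1 - x)**(b - 1), [0, 1])
print(M_pow, mp.beta(z, b))
```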

Lemma (Riesz potential of radial functions). Suppose that
\[
x\in\mathbb{R}^d\mapsto f(x)=\varphi(|x|^2)
\] where $\varphi:(0,+\infty)\to\mathbb{R}$ is given by the absolutely convergent inverse Mellin transform
\[
\varphi(r):=\frac{1}{2\pi\mathrm{i}}
\int_{\lambda+\mathrm{i}\mathbb{R}}
\mathcal{M}\varphi(z)r^{-z}\mathrm{d}z,
\quad
\text{for some }\lambda\in\mathbb{R}.
\] If $0<\alpha<2\lambda<d$ then the Riesz potential $(\left|\cdot\right|^{-(d-\alpha)}*f)(x)$ is well defined for $x\neq0$, and
\[
(\left|\cdot\right|^{-(d-\alpha)}*f)(x)
=\psi(|x|^2)
\] where \[
\psi(r):=
\frac{1}{2\pi\mathrm{i}}
\frac{\pi^{\frac{d}{2}}\Gamma(\frac{\alpha}{2})}{\Gamma(\frac{d-\alpha}{2})}
\int_{\lambda-\frac{\alpha}{2}+\mathrm{i}\mathbb{R}}
\frac{\Gamma(z)\Gamma(\frac{d-\alpha}{2}-z)}
{\Gamma(\frac{\alpha}{2}+z)\Gamma(\frac{d}{2}-z)}
\mathcal{M}\varphi(z+\tfrac{\alpha}{2})
r^{-z}\mathrm{d}z.
\] In other words, the Mellin transform of $\psi$ satisfies
\[
\mathcal{M}\psi(z)
=
\frac{\pi^{\frac{d}{2}}\Gamma(\frac{\alpha}{2})}
{\Gamma(\frac{d-\alpha}{2})}
\frac{\Gamma(z)\Gamma(\frac{d-\alpha}{2}-z)}
{\Gamma(\frac{\alpha}{2}+z)\Gamma(\frac{d}{2}-z)}
\mathcal{M}\varphi(z+\tfrac{\alpha}{2}).
\]

Proof of Lemma (Riesz potential of radial functions). The Lemma is Proposition 2 in Dyda-Kuznetsov-Kwaśnicki with $V\equiv1$ and $l=0$. The idea is to use the inverse Mellin transform of $\varphi$ to reduce the problem, via the Fubini theorem, to the Riesz potential of inverse powers of the norm, which is immediate from the semigroup property of the Riesz kernel. Namely, following eq. (1.1.12) in Landkof’s book or eq. (8) p. 118 in Stein’s book, on $\mathbb{R}^d$, the semigroup property for Riesz kernels reads, for all $\alpha,\beta\in\mathbb{C}$ such that $\Re\alpha,\Re\beta>0$ and $\Re\alpha+\Re\beta<d$, \begin{equation}\left|\cdot\right|^{-(d-\alpha)}*\left|\cdot\right|^{-(d-\beta)}
=\frac{c_d(\alpha)c_d(\beta)}{c_d(\alpha+\beta)}
\left|\cdot\right|^{-(d-(\alpha+\beta))}
\quad\text{where}\quad
c_d(z)
:=\frac{2^z\pi^{\frac{d}{2}}\Gamma(\frac{z}{2})}{\Gamma(\frac{d-z}{2})}.
\end{equation}Now, by the inverse Mellin transform of $\varphi$, the Fubini theorem, and the semigroup property,
\begin{align*}
(\left|\cdot\right|^{-(d-\alpha)}*f)(x)
&=\frac{1}{2\pi\mathrm{i}}
\int_{\lambda+\mathrm{i}\mathbb{R}}
(\left|\cdot\right|^{-(d-\alpha)}*\left|\cdot\right|^{-2z})(x)\,\mathcal{M}\varphi(z)\mathrm{d}z\\
&=\frac{1}{2\pi\mathrm{i}}
\int_{\lambda+\mathrm{i}\mathbb{R}}
\frac{c_d(\alpha)c_d(d-2z)}{c_d(d+\alpha-2z)}
\mathcal{M}\varphi(z)|x|^{-(2z-\alpha)}\mathrm{d}z\\
&=\frac{1}{2\pi\mathrm{i}}
\int_{\lambda-\frac{\alpha}{2}+\mathrm{i}\mathbb{R}}
\frac{c_d(\alpha)c_d(d-\alpha-2w)}{c_d(d-2w)}
\mathcal{M}\varphi(w+\tfrac{\alpha}{2})|x|^{-2w}\mathrm{d}w,
\end{align*} which is the announced formula with $r=|x|^2$.

Theorem (Riesz integral formula). Here is a Riesz integral formula mentioned in the Appendix of Landkof's book, in eq. (1.6) of C.-Saff-Womersley, and in Remark 1 of Dyda-Kuznetsov-Kwaśnicki: let $d\geq1$ and $0<s<d$, and define, on $\mathbb{R}^d$, \[f:=(1-\left|\cdot\right|^2)_+^{\frac{s-d}{2}}.
\] Then, for all $x\in\mathbb{R}^d$ such that $0<|x|<1$,
\[
(\left|\cdot\right|^{-s}*f)(x)
=\pi^{\frac{d}{2}}\frac{\Gamma(\frac{d-s}{2})\Gamma(1-\frac{d-s}{2})}{\Gamma(\frac{d}{2})}
=\frac{\pi^{\frac{d}{2}+1}}{\Gamma(\frac{d}{2})\sin(\frac{d-s}{2}\pi)}.
\] The last equality simply comes from the Euler reflection formula $\Gamma(z)\Gamma(1-z)=\frac{\pi}{\sin(\pi z)}$.
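Before the proof, here is a minimal numerical sanity check of the formula in dimension $d=1$ (with arbitrary values $s=1/2$ and $x=3/10$), using mpmath:

```python
import mpmath as mp

# Check the Riesz integral formula for d = 1, 0 < s < 1, 0 < |x| < 1.
d, s, x = 1, mp.mpf('0.5'), mp.mpf('0.3')
integrand = lambda y: abs(x - y)**(-s) * (1 - y**2)**((s - d) / 2)
lhs = mp.quad(integrand, [-1, x, 1])  # split at the integrable singularity y = x
rhs = mp.pi**(d / 2 + 1) / (mp.gamma(mp.mpf(d) / 2) * mp.sin((d - s) / 2 * mp.pi))
print(lhs, rhs)  # both ~ pi / sin(pi/4) ~ 4.4429
```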

Proof of the Riesz integral formula. The theorem is a special case of Corollary 4 in Dyda-Kuznetsov-Kwaśnicki, namely with $V\equiv1$, $l=0$, $\alpha=s-d$, $\delta=d$, $\rho = \sigma = -\frac{d-s}{2}$, and $d-2<s<d$ (which implies $\sigma>-1$). Let us give the proof extracted from there. With $\varphi(r):=(1-r)_+^{\frac{s-d}{2}}$, we have
\begin{equation}
\mathcal{M}\varphi(z)=\int_0^1r^{z-1}(1-r)^{\frac{s-d}{2}}\mathrm{d}r=\mathrm{Beta}(z,1-\tfrac{d-s}{2}),
\end{equation}and by the Lemma above,
\begin{align}
\mathcal{M}\psi(z)
&=\frac{\pi^{\frac{d}{2}}\Gamma(\frac{d-s}{2})\Gamma(1-\frac{d-s}{2})}
{\Gamma(\frac{s}{2})}
\frac{\Gamma(z)\Gamma(\frac{s}{2}-z)}{\Gamma(\frac{d}{2}-z)\Gamma(z+1)}.
\end{align} Now, if the vertical line $\lambda+\mathrm{i}\mathbb{R}$ separates the poles of $z\mapsto\Gamma(z)$ and of $z\mapsto\Gamma(\frac{s}{2}-z)$, then
\begin{align}
\frac{1}{2\pi\mathrm{i}}\int_{\lambda+\mathrm{i}\mathbb{R}}
\frac{\Gamma(z)\Gamma(\frac{s}{2}-z)}{\Gamma(\frac{d}{2}-z)\Gamma(z+1)}
x^{-z}\mathrm{d}z
&=\sum_{k=0}^\infty\mathrm{Residue}_{z=-k}\Bigl(\frac{\Gamma(z)\Gamma(\frac{s}{2}-z)}{\Gamma(\frac{d}{2}-z)\Gamma(z+1)}x^{-z}\Bigr)\\
&=\sum_{k=0}^\infty\frac{\Gamma(\frac{s}{2}+k)}{\Gamma(\frac{d}{2}+k)\Gamma(-k+1)}x^{k}\mathrm{Residue}_{z=-k}(\Gamma(z))\\
&=\sum_{k=0}^\infty\frac{\Gamma(\frac{s}{2}+k)}{\Gamma(\frac{d}{2}+k)\Gamma(-k+1)}\frac{(-x)^{k}}{k!} =\frac{\Gamma(\frac{s}{2})}{\Gamma(\frac{d}{2})},
\end{align} where only the $k=0$ term survives, since $\frac{1}{\Gamma(1-k)}=0$ for all $k\geq1$. It follows that $\psi$ is constant on $(0,1)$, equal to $$\pi^{\frac{d}{2}}\frac{\Gamma(\frac{d-s}{2})\Gamma(1-\frac{d-s}{2})}{\Gamma(\frac{s}{2})}\frac{\Gamma(\frac{s}{2})}{\Gamma(\frac{d}{2})}=\pi^{\frac{d}{2}}\frac{\Gamma(\frac{d-s}{2})\Gamma(1-\frac{d-s}{2})}{\Gamma(\frac{d}{2})},$$ as announced.

Recall that the $\Gamma$ function has no zeros: indeed a zero would lead, via $\Gamma(z)=(z-1)\Gamma(z-1)$, to infinitely many zeros to the left, and then, via $\Gamma(z)\Gamma(1-z)=\frac{\pi}{\sin(\pi z)}$, to infinitely many poles to the right, which would contradict the analyticity of $\Gamma$ on $\Re z>0$.

Recall also that the $\Gamma$ function is meromorphic on the complex plane, its poles are the non-positive integers, and are simple. Moreover, using $(z+n)\Gamma(z)=\frac{\Gamma(z+n+1)}{z(z+1)\cdots(z+n-1)}$ we get $$\mathrm{Residue}_{z=-n}(\Gamma(z)):=\lim_{z\to-n}(z-(-n))\Gamma(z)=\frac{(-1)^n}{n!}.$$
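A minimal numerical check of this residue, with mpmath:

```python
import mpmath as mp

# Residue of Gamma at z = -n: evaluate (z + n) Gamma(z) near z = -n.
mp.mp.dps = 30
eps = mp.mpf('1e-20')
for n in range(5):
    approx = eps * mp.gamma(-n + eps)       # (z + n) Gamma(z) at z = -n + eps
    exact = (-1)**n / mp.factorial(n)
    print(n, approx, exact)
```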

Meijer G-functions. A key point of the proof above is the computation of the inverse Mellin transform of a certain ratio of products of Gamma functions (the Mellin transform of $\psi$). This is actually the definition of Meijer G-functions. If for example $$f(x):=(1-|x|^2)_+^\sigma\ {}_2F_1(a,b;c;1-|x|^2)=\varphi(|x|^2)$$ where $\varphi(r)=(1-r)_+^\sigma\ {}_2F_1(a,b;c;1-r)$, then it is possible, by using the same method as above, to express $\varphi$ as a Meijer G-function, and to deduce that the Riesz potential of $f$ on the unit ball is equal to another Meijer G-function, which reduces to a hypergeometric function in certain cases. This is explored in Dyda-Kuznetsov-Kwaśnicki and references therein.

Goody: Mellin transform of Gauss hypergeometric function. Recall the definition $${}_2F_1(a,b;c;z):=\sum_{n=0}^\infty\frac{(a)_n(b)_n}{(c)_n}\frac{z^n}{n!}$$ where $(a)_n:=a(a+1)\cdots(a+n-1)$ if $n\geq1$ and $(a)_0:=1$, in other words $(a)_n=\Gamma(a+n)/\Gamma(a)$. Now, if $f(x):={}_2F_1(a,b;c;-x)$ then \[\mathcal{M}f(z)
=\frac{\mathrm{Beta}(z,a-z)\mathrm{Beta}(z,b-z)}{\mathrm{Beta}(z,c-z)}
=\frac{\Gamma(c)}{\Gamma(a)\Gamma(b)}\frac{\Gamma(z)\Gamma(a-z)\Gamma(b-z)}{\Gamma(c-z)}.\] To see it, we can start from the Euler integral formula (which can be proved by a series expansion of $(1+xt)^{-b}$ using the Newton binomial series $(1-z)^{-\alpha}=\sum_{n=0}^\infty(\alpha)_n\frac{z^n}{n!}$) : \[{}_2F_1(a,b;c;-x) =\frac{\Gamma(c)}{\Gamma(a)\Gamma(c-a)} \int_0^1t^{a-1}(1-t)^{c-a-1}(1+xt)^{-b}\mathrm{d}t. \] This mixture representation gives, by the Fubini theorem, with $g_{t,b}(x):=(1+xt)^{-b}$,
\[
\mathcal{M}f(z)
=\frac{\Gamma(c)}{\Gamma(a)\Gamma(c-a)}
\int_0^1t^{a-1}(1-t)^{c-a-1}\mathcal{M}g_{t,b}(z)\mathrm{d}t.
\] But by using the change of variable $y=(1+xt)^{-1}$ we get
\[
\mathcal{M}g_{t,b}(z)
=\int_0^\infty x^{z-1}g_{t,b}(x)\mathrm{d}x
=t^{-z}\int_0^1(1-y)^{z-1}y^{b-1-z}\mathrm{d}y
=t^{-z}\mathrm{Beta}(z,b-z).
\] It remains to note that $\int_0^1t^{a-1}(1-t)^{c-a-1}t^{-z}\mathrm{d}t=\mathrm{Beta}(c-a,a-z)$ and \[
\frac{\Gamma(c)}{\Gamma(a)\Gamma(c-a)}
\mathrm{Beta}(z,b-z)
\mathrm{Beta}(c-a,a-z)
=\frac{\Gamma(c)}{\Gamma(a)\Gamma(b)}\frac{\Gamma(z)\Gamma(a-z)\Gamma(b-z)}{\Gamma(c-z)}.
\] Alternatively, we could compute the inverse Mellin transform by using the residue formula.
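This Mellin transform can also be checked numerically, for instance with mpmath and arbitrary admissible parameters (we need $0<\Re z<\min(a,b)$ for convergence):

```python
import mpmath as mp

a, b, c, z = 1.5, 2.0, 3.0, 0.7  # arbitrary, with 0 < z < min(a, b)
M = mp.quad(lambda x: x**(z - 1) * mp.hyp2f1(a, b, c, -x), [0, 1, mp.inf])
pred = (mp.gamma(c) / (mp.gamma(a) * mp.gamma(b))
        * mp.gamma(z) * mp.gamma(a - z) * mp.gamma(b - z) / mp.gamma(c - z))
print(M, pred)
```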

Motivation. The original proof by Riesz of the integral formula, largely geometric, is also given in the Appendix of Landkof’s book and of C.-Saff-Womersley. We were happy to locate a relatively short analytic proof, due to Dyda-Kuznetsov-Kwaśnicki, which is the subject of this post.

