{"id":10897,"date":"2018-12-14T16:07:06","date_gmt":"2018-12-14T15:07:06","guid":{"rendered":"http:\/\/djalil.chafai.net\/blog\/?p=10897"},"modified":"2025-05-01T09:28:20","modified_gmt":"2025-05-01T07:28:20","slug":"maxwell-characterization-of-gaussian-distributions","status":"publish","type":"post","link":"https:\/\/djalil.chafai.net\/blog\/2018\/12\/14\/maxwell-characterization-of-gaussian-distributions\/","title":{"rendered":"Maxwell characterization of Gaussian distributions"},"content":{"rendered":"<p>&nbsp;<\/p>\r\n\r\n<div class=\"wp-block-image\">\r\n<figure class=\"aligncenter\"><img loading=\"lazy\" width=\"252\" height=\"326\" class=\"wp-image-10915\" src=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/12\/Maxwell.jpeg\" alt=\"\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/12\/Maxwell.jpeg 252w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/12\/Maxwell-232x300.jpeg 232w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/12\/Maxwell-116x150.jpeg 116w\" sizes=\"(max-width: 252px) 100vw, 252px\" \/>\r\n<figcaption><a href=\"https:\/\/en.wikipedia.org\/wiki\/James_Clerk_Maxwell\">James Clerk Maxwell (1831 - 1879)<\/a><\/figcaption>\r\n<\/figure>\r\n<\/div>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\">This tiny post is about a basic characterization of Gaussian distributions.<\/p>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>The theorem.<\/strong>\u00a0A random vector of dimension two or more has independent components and is rotationally invariant if and only if its components are independent centered Gaussians with the same variance.<\/p>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\">In other words, for all $n\\geq2$, a probability measure on $\\mathbb{R}^n$ is at the same time a product measure and rotationally invariant if and only if it is a Gaussian distribution $\\mathcal{N}(0,\\sigma^2I_n)$ for some $\\sigma\\geq0$.<\/p>\r\n\r\n\r\n\r\n<p>Note that this does not 
work for $n=1$, since any symmetric non-Gaussian law on $\\mathbb{R}$ is then a counterexample. In a sense it is a purely multivariate phenomenon.<\/p>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>A proof.<\/strong>\u00a0For all $\\sigma\\geq0$, the Gaussian distribution $\\mathcal{N}(0,\\sigma^2I_n)$ is a product measure and is rotationally invariant, and if $\\sigma&gt;0$, its density is, denoting $|x|:=\\sqrt{x_1^2+\\cdots+x_n^2}$, $$x\\in\\mathbb{R}^n\\mapsto\\exp\\Bigl(-\\frac{|x|^2}{2\\sigma^2}-n\\log\\sqrt{2\\pi\\sigma^2}\\Bigr).$$ Conversely, suppose that $\\mu$ is a rotationally invariant product probability distribution on $\\mathbb{R}^n$. We can assume without loss of generality that it has a smooth positive density $f:\\mathbb{R}^n\\to(0,\\infty)$, since otherwise we can consider the probability measure $\\mu*\\mathcal{N}(0,\\varepsilon I_n)$ for $\\varepsilon&gt;0$, which is also product and rotationally invariant, and which is Gaussian only if $\\mu$ is Gaussian, as can be checked on characteristic functions. By rotational invariance, $\\log f(x)=g(|x|^2)$, and thus $$\\partial_i\\log f(x)=2g'(|x|^2)x_i.$$ On the other hand, since $\\mu$ is product, we have $\\log f(x)=h_1(x_1)+\\cdots+h_n(x_n)$ and thus $$\\partial_i\\log f(x)=h_i'(x_i).$$ Hence $g'(|x|^2)=h_i'(x_i)\/(2x_i)$ when $x_i\\neq0$, in other words $g'(|x|^2)$, which depends on $x$ via $|x|^2$, depends only on $x_i$. Since $n\\geq2$, we can move $|x|^2$ freely in $[x_i^2,\\infty)$ by varying the other coordinates while keeping $x_i$ fixed, and it follows that $g'$ is constant. Therefore there exist $a,b\\in\\mathbb{R}$ such that $g(u)=au+b$ for all $u\\geq0$, and thus $f(x)=\\mathrm{e}^{a|x|^2+b}$ for all $x\\in\\mathbb{R}^n$. Since $f$ is a density, $a&lt;0$ and $\\mathrm{e}^b=(\\pi\/|a|)^{-n\/2}$, which is the density of $\\mathcal{N}(0,\\sigma^2I_n)$ with $\\sigma^2=1\/(2|a|)$.<\/p>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>Another proof (for the converse).<\/strong>\u00a0Suppose that $X_1,\\ldots,X_n$ are $n\\geq2$ independent random variables, such that the random vector $(X_1,\\ldots,X_n)$ is rotationally invariant. Since for all $1\\leq i\\neq j\\leq n$, there exists a rotation $R$ such that $Re_i=\\pm e_j$, it follows that $X_1,\\ldots,X_n$ are independent and identically distributed, with a common law $\\nu$ which is symmetric. This reduces the problem to showing that $\\nu$ is Gaussian. 
This also means that it suffices to solve the problem for $n=2$. Now, for all $\\theta\\in\\mathbb{R}$, the rotation of angle $\\theta$ in $\\mathrm{span}\\{e_1,e_2\\}$ gives that $\\cos(\\theta)X_1-\\sin(\\theta)X_2$ has the law of $X_1$. This indicates that $\\nu$ is (symmetric) stable. But denoting $\\varphi$ its characteristic function, and using the independence, we obtain $\\varphi(\\cos(\\theta)t)\\varphi(-\\sin(\\theta)t)=\\varphi(t)$ for all $t\\in\\mathbb{R}$. Using the expression $\\varphi(t)=\\mathrm{e}^{-c|t|^\\alpha}$, $c\\geq0$, $0&lt;\\alpha\\leq2$, of the characteristic function of symmetric stable distributions, this gives $c(|\\cos(\\theta)|^\\alpha+|\\sin(\\theta)|^\\alpha)=c$ for all $\\theta$, which forces $\\alpha=2$ when $c&gt;0$, and leads to the Gaussianity of $\\nu$. Note that without using stability, this shows also that all the cumulants of order $k\\neq2$ are zero when $\\nu$ has all its moments finite, since the cumulant $\\kappa_k$ of order $k$ satisfies $\\kappa_k(\\cos^k(\\theta)+(-\\sin(\\theta))^k)=\\kappa_k$ for all $\\theta$. This alternative proof without regularization is inspired by the idea of reduction to $n=2$ due to Dinh-Toan Nguyen, PhD student, communicated by my colleague Laure Dumaz.<\/p>\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>History.<\/strong> The first proof above is roughly the reasoning followed by <a href=\"https:\/\/en.wikipedia.org\/wiki\/James_Clerk_Maxwell\">James Clerk Maxwell<\/a>\u00a0(1831 - 1879) to derive the distribution of velocities in an ideal gas at equilibrium. In his case $n=3$, and the distribution is known in statistical physics as the <em>Maxwellian distribution<\/em>. This was a source of inspiration for <a href=\"https:\/\/en.wikipedia.org\/wiki\/Ludwig_Boltzmann\">Ludwig Boltzmann<\/a>\u00a0(1844 - 1906) for the derivation of his kinetic evolution equation and his H-theorem about entropy. This characterization was apparently known before Maxwell, for instance by John Herschel (1792 - 1871) in 1850 in his commentaries on the work of Adolphe Quetelet (1796 - 1874) in social statistics and probabilities. 
But this was perhaps also known to\u00a0<a href=\"https:\/\/en.wikipedia.org\/wiki\/Carl_Friedrich_Gauss\">Carl Friedrich Gauss<\/a>\u00a0(1777 - 1855) himself.<\/p>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>Characterizations.<\/strong> This characterization of Gaussian laws among product distributions using invariance under the action of transformations (rotations) leads to the same characterization for the heat semi-group and for the Laplacian operator. There are of course other remarkable characterizations of the Gaussian, for instance as being an eigenvector of the Fourier transform, and also, following Boltzmann, as being the maximum entropy distribution at fixed variance.<\/p>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>Maxwell characterization for unitary invariant random matrices.<\/strong>\u00a0A random $n\\times n$ Hermitian matrix has at the same time independent entries and a law invariant under conjugation by unitary matrices if and only if it has a Gaussian law with density of the form $$H\\mapsto\\exp(a\\mathrm{Tr}(H^2)+b\\mathrm{Tr}(H)+c),\\quad a&lt;0.$$ Note that the unitary invariance implies that the density depends only on the spectrum and is actually a symmetric function of the eigenvalues. A complete solution can be found for instance in the book Random Matrices (Theorem 2.6.3) by <a href=\"https:\/\/en.wikipedia.org\/wiki\/Madan_Lal_Mehta\">Madan Lal Mehta<\/a>, who attributes the result to Charles E. Porter and Norbert Rosenzweig (~1960). It is related to a lemma due to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hermann_Weyl\">Hermann Weyl<\/a>: all the invariants of an $n\\times n$ matrix $H$ under non-singular similarity transformations $H\\mapsto UHU^*$ can be expressed in terms of traces of the first $n$ powers of $H$. 
The assumption about the independence of entries kills all powers above $2$.<\/p>\r\n\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>Letac observation.<\/strong> It is not difficult to show that if $X$ is a random vector of $\\mathbb{R}^n$, $n\\geq1$, with independent centered Gaussian components of the same positive variance, then $\\mathbb{P}(X=0)=0$ and $X\/|X|$ is uniformly distributed on the sphere. Conversely, it was shown by my former teacher and colleague G\u00e9rard Letac (1940 - ) that if a random vector $X$ of $\\mathbb{R}^n$, $n\\geq3$, has independent components and is such that $\\mathbb{P}(X=0)=0$ and $X\/|X|$ is uniformly distributed on the sphere, then $X$ is Gaussian and in particular its components are Gaussian with zero mean and the same positive variance. Moreover there are counterexamples for $n=1$ and $n=2$. When $n\\geq3$, this result of Letac implies the Maxwell theorem.<\/p>\r\n\r\n<p style=\"text-align: justify;\"><strong>L\u00e9vy observation.<\/strong><\/p>\r\n<blockquote style=\"text-align: justify;\">It is well known that the kinetic theory of gases rests on Maxwell's law: if one picks at random a molecule of a homogeneous mass of gas (by which I mean that all the molecules are of the same nature), the three components of its velocity are three independent Gaussian random variables. The justification of this law in Boltzmann's classical treatise is terribly complicated. Two simpler explanations had been given, but in my opinion they are worthless.<\/blockquote>\r\n<blockquote style=\"text-align: justify;\">One of them is based on an exact mathematical fact: if the magnitude of the velocity is independent of its direction, and if its three components are independent, then the law can only be Maxwell's law. 
But there is a priori no reason to believe in this independence; if for instance the speed rarely exceeds one thousand meters per second, and we observe a horizontal speed of 980 meters per second, may we not think that the motion is almost horizontal and that consequently the vertical component is small? It would then not be independent of the horizontal component.<\/blockquote>\r\n<blockquote style=\"text-align: justify;\">On the other hand, if a mass of gas contains n molecules, one can regard the 3n components of their velocities as a point of a Euclidean space of dimension 3n. By the kinetic energy theorem (forces vives), this point lies on a sphere of this space and, assuming that the probability is uniformly distributed on this surface, one obtains Maxwell's law, all the more accurately as n is larger. \u00c9mile Borel, who seems to have been the first to make this remark, saw in it an explanation of the role of this law. I do not share this opinion. I see no reason to regard two equal elements of this sphere as equally probable, if one of them entails the concentration of almost all the energy on a single molecule, while the other entails that it is, on the contrary, evenly distributed.<\/blockquote>\r\n<blockquote style=\"text-align: justify;\">I then thought of using the reversibility of the laws of mechanics. According to these laws, if a motion of the particles of a gas is possible, then the reverse motion, obtained by going back in time, is also compatible with the laws of collision. There are however irreversible phenomena in nature. 
But it has long been known (since Gibbs, I believe) that, for phenomena governed by the laws of mechanics, this impossibility of the reverse motion is not absolute. It is only a very improbable phenomenon. For instance, let us connect a tank filled with nitrogen and a tank filled with oxygen; the two gases will mix. They will no longer be able to separate. Yet the reversed motion of the molecules is possible; but, if their initial positions and velocities are chosen at random, the number of centuries one would have to wait for this to have an appreciable chance of happening is so large that it would take (even for fairly small masses of gas) many billions of digits to write it down. Naturally, when two gases are mixed, there is a transitional period of greater or lesser length. But one can readily admit that an equilibrium state is eventually reached, in which the velocities are distributed according to a well determined law, and the distribution then obviously remains the same if one goes back in time. If one considers the collisions of two molecules, treated for simplicity as elastic spheres, one can then say that the number of collisions of a given kind (the kind being defined by the orientation of the plane tangent to the molecules at the point of contact and by the velocities of the two molecules) is approximately equal to the number of collisions of the opposite kind (obtained by exchanging the normal components of the velocities of the two molecules). Once this point is granted, Maxwell's law follows easily. 
This is what I showed in the last chapter of my 1925 book.<\/blockquote>\r\n<blockquote style=\"text-align: justify;\">Am I the first to have pointed out this method, which is commonly taught today? I dare not claim so. What is certain is that it was hardly known in 1925. Physics professors have told me that they understood the kinetic theory of gases only thanks to me. I add that, having sent my book to the illustrious Italian mathematician Levi-Civita, I received a letter from him telling me that (of course) it was the chapter on the kinetic theory of gases that had interested him the most. This is what made me think that my method must be new.<\/blockquote>\r\n<p style=\"text-align: right;\">Paul L\u00e9vy (1886 - 1971), Quelques aspects de la pens\u00e9e d'un math\u00e9maticien, 1970 (translated from the French).<\/p>\r\n\r\n\r\n<p style=\"text-align: justify;\"><strong>Further reading.<\/strong><\/p>\r\n<ul>\r\n<li>John Frederick William Herschel<br \/>\r\n<a>Quetelet on probabilities<\/a><br \/>\r\nEdinburgh Rev., 92, 1\u201357 (1850)<\/li>\r\n<li>James Clerk Maxwell<br \/>\r\n<a>Illustrations of the dynamical theory of gases<\/a><br \/>\r\nPhilosophical Magazine. 4th Series. 19: 390\u2013393 (1860)<\/li>\r\n<li>Robert Robson, Timon Mehrling, and Jens Osterhoff<br \/>\r\n<a>Great moments in kinetic theory: 150 years of Maxwell's (other) equations<\/a><br \/>\r\nEuropean Journal of Physics 38(6) (2017)<\/li>\r\n<li>Bal\u00e1zs Gyenis<br \/>\r\n<a>Maxwell and the normal distribution: A colored story of probability, independence, and tendency toward equilibrium<\/a><br \/>\r\nStudies in History and Philosophy of Science Part B: Studies in History and Philosophy of Modern Physics. 
57: 53\u201365 (2017)<\/li>\r\n<li>Paul L\u00e9vy<br \/>\r\n<a>Quelques aspects de la pens\u00e9e d'un math\u00e9maticien<\/a><br \/>\r\nAlbert Blanchard (1970)<\/li>\r\n<li>William Feller<br \/>\r\n<a>An Introduction to Probability Theory and its Applications. Vol. II, section III.4<\/a><br \/>\r\nWiley (1971)<\/li>\r\n<li>Norbert Rosenzweig and Charles E. Porter<br \/>\r\n<a>Repulsion of energy levels in complex atomic spectra<\/a><br \/>\r\nPhys. Rev., 120:1698\u20131714 (1960)<\/li>\r\n<li>Madan Lal Mehta<br \/>\r\n<a>Random Matrices<\/a><br \/>\r\nElsevier (2004)<\/li>\r\n<li>Hermann Weyl<br \/>\r\n<a>The Classical Groups: Their Invariants and Representations<\/a><br \/>\r\nPrinceton University Press (1966)<\/li>\r\n<li>G\u00e9rard Letac<br \/>\r\n<a>Isotropy and sphericity: Some characterisations of the normal distribution<\/a><br \/>\r\nThe Annals of Statistics, 9(2):408\u2013417 (1981)<\/li>\r\n<\/ul>\r\n\r\n\r\n","protected":false},"excerpt":{"rendered":"<p>This tiny post is about a basic characterization of Gaussian distributions. The theorem.&nbsp;A random vector of dimension two or more has independent components and&#8230;<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/djalil.chafai.net\/blog\/2018\/12\/14\/maxwell-characterization-of-gaussian-distributions\/\">Continue reading<span class=\"screen-reader-text\">Maxwell characterization of Gaussian 
distributions<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":1301},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/10897"}],"collection":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/comments?post=10897"}],"version-history":[{"count":136,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/10897\/revisions"}],"predecessor-version":[{"id":21937,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/10897\/revisions\/21937"}],"wp:attachment":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/media?parent=10897"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/categories?post=10897"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/tags?post=10897"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}