{"id":1171,"date":"2011-01-29T22:44:51","date_gmt":"2011-01-29T20:44:51","guid":{"rendered":"http:\/\/djalil.chafai.net\/blog\/?p=1171"},"modified":"2021-03-16T09:31:36","modified_gmt":"2021-03-16T08:31:36","slug":"the-marchenko-pastur-law","status":"publish","type":"post","link":"https:\/\/djalil.chafai.net\/blog\/2011\/01\/29\/the-marchenko-pastur-law\/","title":{"rendered":"The Marchenko-Pastur law"},"content":{"rendered":"<p style=\"text-align: justify;\"><a href=\"\/blog\/wp-content\/uploads\/2011\/01\/marpas2.png\"><img loading=\"lazy\" class=\"size-medium wp-image-1232 aligncenter\" title=\"Marchenko-Pastur distribution\" src=\"\/blog\/wp-content\/uploads\/2011\/01\/marpas2-300x210.png\" alt=\"Marchenko-Pastur distributions\" width=\"300\" height=\"210\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2011\/01\/marpas2-300x210.png 300w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2011\/01\/marpas2.png 750w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p style=\"text-align: justify;\">The <a href=\"http:\/\/en.wikipedia.org\/wiki\/Vladimir_Marchenko\">Marchenko<\/a> - <a href=\"http:\/\/www.google.com\/search?q=Leonid+Pastur\">Pastur<\/a> law \\( {\\mu_\\rho} \\) with shape parameter \\( {\\rho&gt;0} \\) is the probability distribution<\/p>\n<p style=\"text-align: center;\">\\[ \\left(1-\\frac{1}{\\rho}\\right)_+\\delta_0+\\frac{1}{\\rho 2\\pi x}\\sqrt{(b-x)(x-a)}\\,\\mathbf{1}_{[a,b]}(x)dx \\]<\/p>\n<p style=\"text-align: justify;\">where<\/p>\n<p style=\"text-align: center;\">\\[ a=(1-\\sqrt{\\rho})^2 \\quad\\text{and}\\quad b=(1+\\sqrt{\\rho})^2. \\]<\/p>\n<p style=\"text-align: justify;\">It has mean \\( {1} \\) and variance \\( {\\rho} \\). 
It has an atom at point \\( {0} \\) if and only if \\( {\\rho&gt;1} \\).<\/p>\n<blockquote style=\"background: white; border: solid thick #e4e5e7; text-align: justify; padding-left: 1em;\"><p><b>Lemma 1 (Moments of the Marchenko-Pastur law)<\/b> <em>For all \\( {r\\geq1} \\),<\/em><\/p>\n<p style=\"text-align: center;\"><em>\\[ \\int\\!x^r\\,d\\mu_\\rho(x) =\\sum_{k=0}^{r-1}\\frac{\\rho^k}{k+1}\\binom{r}{k}\\binom{r-1}{k}. \\]<\/em><\/p>\n<\/blockquote>\n<p style=\"text-align: justify;\"><em>Proof:<\/em> Since \\( {a+b=2(1+\\rho)} \\) and \\( {ab=(1-\\rho)^2} \\), we have<\/p>\n<p style=\"text-align: center;\">\\[ \\sqrt{(b-x)(x-a)} =\\sqrt{\\frac{(a+b)^2}{4}-ab-\\left(x-\\frac{a+b}{2}\\right)^2} =\\sqrt{4\\rho-(x-(1+\\rho))^2}. \\]<\/p>\n<p style=\"text-align: justify;\">The change of variable \\( {y=(x-(1+\\rho))\/\\sqrt{\\rho}} \\) gives<\/p>\n<p style=\"text-align: center;\">\\[ \\int\\!x^r\\,d\\mu_\\rho(x) =\\frac{1}{2\\pi} \\int_{-2}^2\\!(\\sqrt{\\rho}y+1+\\rho)^{r-1}\\sqrt{4-y^2}\\,dy. \\]<\/p>\n<p style=\"text-align: justify;\">Recall that the moments of the semicircle law are given by the Catalan numbers:<\/p>\n<p style=\"text-align: center;\">\\[ \\frac{1}{2\\pi}\\int_{-2}^2\\!y^{2k+1}\\,\\sqrt{4-y^2}\\,dy =0 \\quad\\text{and}\\quad \\frac{1}{2\\pi}\\int_{-2}^2\\!y^{2k}\\,\\sqrt{4-y^2}\\,dy =\\frac{1}{1+k}\\binom{2k}{k}. 
\\]<\/p>\n<p style=\"text-align: justify;\">This gives, by using binomial expansions and <a href=\"http:\/\/www.google.com\/search?q=Vandermonde+identity\">Vandermonde's identity<\/a>:<\/p>\n<p style=\"text-align: center;\">\\[ \\begin{array}{rcl} \\int\\!x^r\\,d\\mu_\\rho(x) &amp;=&amp;\\sum_{k=0}^{\\lfloor(r-1)\/2\\rfloor} \\rho^k(1+\\rho)^{r-1-2k}\\binom{r-1}{2k}\\binom{2k}{k}\\frac{1}{1+k}\\\\ &amp;=&amp;\\sum_{k=0}^{\\lfloor(r-1)\/2\\rfloor} \\rho^k(1+\\rho)^{r-1-2k}\\frac{(r-1)!}{(r-1-2k)!k!(k+1)!}\\\\ &amp;=&amp;\\sum_{k=0}^{\\lfloor(r-1)\/2\\rfloor} \\sum_{s=0}^{r-1-2k} \\rho^{k+s}\\frac{(r-1)!}{k!(k+1)!(r-1-2k-s)!s!}\\\\ &amp;=&amp;\\sum_{t=0}^{r-1}\\rho^t\\sum_{k=0}^{\\min(t,r-1-t)} \\frac{(r-1)!}{k!(k+1)!(r-1-t-k)!(t-k)!}\\\\ &amp;=&amp;\\frac{1}{r}\\sum_{t=0}^{r-1}\\rho^t\\binom{r}{t}\\sum_{k=0}^{\\min(t,r-1-t)} \\binom{t}{k}\\binom{r-t}{k+1}\\\\ &amp;=&amp;\\frac{1}{r}\\sum_{t=0}^{r-1}\\rho^t\\binom{r}{t}\\binom{r}{t+1} \\\\ &amp;=&amp;\\sum_{t=0}^{r-1}\\frac{\\rho^t}{t+1}\\binom{r}{t}\\binom{r-1}{t}. \\end{array} \\]<\/p>\n<p style=\"text-align: justify;\">\u2610<\/p>\n<p style=\"text-align: justify;\">The Marchenko-Pastur law appears in the famous Marchenko-Pastur theorem. Namely, let \\( {X_1,\\ldots,X_n} \\) be independent and identically distributed random column vectors of \\( {\\mathbb{R}^m} \\), with mean \\( {0} \\) and covariance matrix \\( {I_m} \\). Let us consider the \\( {m\\times m} \\) empirical covariance matrix<\/p>\n<p style=\"text-align: center;\">\\[ \\frac{1}{n}\\sum_{i=1}^nX_iX_i^\\top=\\frac{1}{n}YY^\\top \\quad\\text{where}\\quad Y=\\left(X_1\\cdots X_n\\right). \\]<\/p>\n<p style=\"text-align: justify;\">We have \\( {\\mathbb{E}(\\frac{1}{n}YY^\\top)=I_m} \\) and the strong law of large numbers indicates that with probability one,<\/p>\n<p style=\"text-align: center;\">\\[ \\lim_{n\\rightarrow\\infty}\\frac{1}{n}YY^\\top = I_m. \\]<\/p>\n<p style=\"text-align: justify;\">What happens if \\( {m} \\) grows with \\( {n} \\)? 
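<\/p>
<p style=\"text-align: justify;\">Before addressing this question, note that Lemma 1 is easy to check numerically: one can integrate \\( {x^r} \\) against the density of \\( {\\mu_\\rho} \\) on \\( {[a,b]} \\) and compare with the closed formula. The sketch below (in Python with numpy; the values of \\( {\\rho} \\) and \\( {r} \\) are arbitrary illustrative choices) does this with a midpoint rule.<\/p>

```python
# Sanity check of Lemma 1: compare the closed-form moments of the
# Marchenko-Pastur law mu_rho with a midpoint-rule integration of its
# density; the values of rho and r are arbitrary illustrative choices.
import numpy as np
from math import comb

def mp_moment_formula(r, rho):
    # Closed form from Lemma 1: sum_{k=0}^{r-1} rho^k/(k+1) binom(r,k) binom(r-1,k)
    return sum(rho**k / (k + 1) * comb(r, k) * comb(r - 1, k) for k in range(r))

def mp_moment_numeric(r, rho, npts=200000):
    # Integrate x^r against the density on [a, b]; the possible atom at 0
    # (present when rho exceeds 1) contributes nothing to moments of order 1 or more.
    a, b = (1 - np.sqrt(rho))**2, (1 + np.sqrt(rho))**2
    dx = (b - a) / npts
    x = a + dx * (np.arange(npts) + 0.5)   # midpoints, safely inside (a, b)
    density = np.sqrt((b - x) * (x - a)) / (rho * 2 * np.pi * x)
    return float(np.sum(x**r * density) * dx)

for rho in (0.25, 1.0, 2.0):
    for r in (1, 2, 3, 4):
        print(f'rho={rho}, r={r}:', mp_moment_formula(r, rho), mp_moment_numeric(r, rho))
```

<p style=\"text-align: justify;\">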
Let us assume for simplicity that the components of \\( {X_1} \\) are independent (automatic if \\( {X_1} \\) is Gaussian).<\/p>\n<blockquote style=\"background: white; border: solid thick #e4e5e7; text-align: justify; padding-left: 1em;\"><p><b>Theorem 2 (Marchenko-Pastur)<\/b> <em>Suppose that \\( {m=m_n} \\) depends on \\( {n} \\) in such a way that<\/em><\/p>\n<p style=\"text-align: center;\"><em>\\[ \\lim_{n\\rightarrow\\infty}\\frac{m_n}{n}=\\rho\\in(0,\\infty). \\]<\/em><\/p>\n<p><em>Let \\( {\\lambda_{n,1},\\ldots,\\lambda_{n,m_n}\\in[0,\\infty)} \\) be the eigenvalues of the \\( {m_n\\times m_n} \\) empirical covariance matrix \\( {\\frac{1}{n}YY^\\top} \\), and let us consider the empirical spectral distribution<\/em><\/p>\n<p style=\"text-align: center;\"><em>\\[ \\mu_n:=\\frac{1}{m_n}\\sum_{k=1}^{m_n}\\delta_{\\lambda_{n,k}}. \\]<\/em><\/p>\n<p><em>Then with probability one, for every bounded continuous \\( {f:\\mathbb{R}\\rightarrow\\mathbb{R}} \\),<\/em><\/p>\n<p style=\"text-align: center;\"><em>\\[ \\lim_{n\\rightarrow\\infty}\\int\\!f\\,d\\mu_n=\\int\\!f\\,d\\mu_\\rho. \\]<\/em><\/p>\n<\/blockquote>\n<p style=\"text-align: justify;\">In other words, with probability one, for every \\( {x\\in\\mathbb{R}} \\) (\\( {x\\neq0} \\) if \\( {\\rho&gt;1} \\)), denoting \\( {I=(-\\infty,x]} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\lim_{n\\rightarrow\\infty}\\mu_n(I)=\\mu_\\rho(I). \\]<\/p>\n<p style=\"text-align: justify;\">Note that<\/p>\n<p style=\"text-align: center;\">\\[ \\int\\!x\\,d\\mu_n=\\frac{1}{m_n}\\sum_{k=1}^{m_n}\\lambda_{n,k} =\\frac{1}{m_n}\\mathrm{Tr}\\left(\\frac{1}{n}YY^\\top\\right) =\\frac{1}{nm_n}\\sum_{\\substack{1\\leq i\\leq m_n\\\\1\\leq j\\leq n}}Y_{ij}^2 \\]<\/p>\n<p style=\"text-align: justify;\">and thus, by the strong law of large numbers, with probability one,<\/p>\n<p style=\"text-align: center;\">\\[ \\lim_{n\\rightarrow\\infty} \\int\\!x\\,d\\mu_n = 1 =\\int\\!x\\,d\\mu_\\rho(x). 
\\]<\/p>\n<p style=\"text-align: justify;\">This shows the almost sure convergence of the first moment. It also shows that the sequence \\( {{(\\mu_n)}_{n\\geq1}} \\) is tight with probability one since, by Markov's inequality, for all \\( {r&gt;0} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mu_n([0,r]^c) \\leq \\frac{1}{r}\\int\\!x\\,d\\mu_n. \\]<\/p>\n<p style=\"text-align: justify;\">The Marchenko-Pastur theorem can be proved by using the <b>method of moments<\/b> or the <b>Cauchy-Stieltjes resolvent method<\/b>. In the Gaussian case, one may also use <b>orthogonal polynomials<\/b> (in this case the empirical covariance matrix belongs to the <b>Laguerre Ensemble<\/b> and its law is <b>Wishart<\/b>). Useful references include the recent <a href=\"http:\/\/www.ams.org\/mathscinet-getitem?mr=2567175\">book by Bai and Silverstein<\/a> and the forthcoming <a href=\"http:\/\/www.google.fr\/search?q=Pastur+Shcherbina+Eigenvalue+Distribution+of+Large+Random+Matrices+Mathematical+Surveys+and+Monographs+AMS\">book of Pastur and Shcherbina<\/a>. Regarding orthogonal polynomials, there is a differential operator approach due to Haagerup and Thorbj\u00f8rnsen, developed by <a href=\"http:\/\/www.ams.org\/mathscinet-getitem?mr=2041832\">Ledoux<\/a>, in which the Marchenko-Pastur distribution is recovered as a mixture of uniform and arcsine distributions. 
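<\/p>
<p style=\"text-align: justify;\">Independently of the proof technique, the theorem is easy to illustrate by simulation: the empirical spectral moments of \\( {\\frac{1}{n}YY^\\top} \\) approach the moments of \\( {\\mu_\\rho} \\) computed in Lemma 1. A sketch with Gaussian entries (the sizes \\( {m} \\) and \\( {n} \\) below are arbitrary illustrative choices):<\/p>

```python
# Numerical illustration of the Marchenko-Pastur theorem (a sketch with
# Gaussian entries and arbitrary sizes m, n): empirical spectral moments
# of (1/n) Y Y^T against the moments of mu_rho given by Lemma 1.
import numpy as np
from math import comb

rng = np.random.default_rng(0)
n, rho = 2000, 0.5
m = int(rho * n)

Y = rng.standard_normal((m, n))          # columns X_1, ..., X_n with covariance I_m
eigs = np.linalg.eigvalsh(Y @ Y.T / n)   # spectrum of the empirical covariance matrix

for r in (1, 2, 3):
    empirical = float(np.mean(eigs**r))
    # Moment of mu_rho from Lemma 1
    limit = sum(rho**k / (k + 1) * comb(r, k) * comb(r - 1, k) for k in range(r))
    print(f'r={r}: empirical={empirical:.4f}, Marchenko-Pastur={limit:.4f}')
```

<p style=\"text-align: justify;\">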
In free probability theory, the Marchenko-Pastur law is known as the free Poisson law; see for instance the <a href=\"http:\/\/www.ams.org\/mathscinet-getitem?mr=2266879\">book by Nica and Speicher<\/a> or the <a href=\"http:\/\/www.ams.org\/mathscinet-getitem?mr=1746976\">book by Hiai and Petz<\/a>.<\/p>\n<p style=\"text-align: justify;\">If \\( {\\rho=1} \\) then \\( {Y} \\) is asymptotically square, \\( {a=0} \\), \\( {b=4} \\), and the change of variable \\( {t\\mapsto\\sqrt{t}} \\) shows that with probability one, the empirical spectral distribution of<\/p>\n<p style=\"text-align: center;\">\\[ \\sqrt{\\frac{1}{n}YY^\\top} \\]<\/p>\n<p style=\"text-align: justify;\">converges as \\( {n\\rightarrow\\infty} \\) to the quarter circle law with density<\/p>\n<p style=\"text-align: center;\">\\[ \\frac{1}{\\pi}\\sqrt{4-x^2}\\,\\mathbf{1}_{[0,2]}(x)dx. \\]<\/p>\n<p style=\"text-align: justify;\">Concerning the support of the spectrum, one may first order the spectrum of \\( {\\frac{1}{n}YY^\\top} \\) as follows:<\/p>\n<p style=\"text-align: center;\">\\[ \\lambda_{n,1}\\geq\\cdots\\geq\\lambda_{n,m}. \\]<\/p>\n<p style=\"text-align: justify;\">Note that \\( {\\lambda_{n,k}=0} \\) if \\( {k&gt;\\min(m,n)} \\). We then have the following:<\/p>\n<blockquote style=\"background: white; border: solid thick #e4e5e7; text-align: justify; padding-left: 1em;\"><p><b>Theorem 3 (Bai-Yin)<\/b> <em>Under the assumptions of the Marchenko-Pastur theorem, and if additionally \\( {X_{11}} \\) has a finite fourth moment, then with probability one<\/em><\/p>\n<p style=\"text-align: center;\"><em>\\[ \\lim_{n\\rightarrow\\infty}\\lambda_{n,\\min(m,n)}=a \\quad\\text{and}\\quad \\lim_{n\\rightarrow\\infty}\\lambda_{n,1}=b. \\]<\/em><\/p>\n<\/blockquote>\n<p style=\"text-align: justify;\">The Marchenko-Pastur theorem remains valid even if \\( {X_{11}} \\) does not have a zero mean, in contrast with the Bai-Yin theorem which fails in that case. 
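<\/p>
<p style=\"text-align: justify;\">Both phenomena can be observed by simulation. The sketch below (Gaussian entries, arbitrary illustrative sizes) computes the extreme eigenvalues for centered entries, and then for entries shifted to mean one, in which case the largest eigenvalue escapes far above \\( {b} \\):<\/p>

```python
# Numerical illustration of the Bai-Yin theorem, and of its failure when
# the entries have a nonzero mean (a sketch with Gaussian entries and
# arbitrary illustrative sizes m and n).
import numpy as np

rng = np.random.default_rng(1)
n, rho = 2000, 0.25
m = int(rho * n)
a, b = (1 - np.sqrt(rho))**2, (1 + np.sqrt(rho))**2

# Centered entries: the extreme eigenvalues converge to the edges a and b.
Y = rng.standard_normal((m, n))
eigs = np.linalg.eigvalsh(Y @ Y.T / n)
print('centered entries:', eigs.min(), eigs.max(), 'vs a, b =', a, b)

# Entries with mean 1: a rank-one perturbation of Y sends the largest
# eigenvalue far above b, while the bulk of the spectrum is unchanged.
eigs_shifted = np.linalg.eigvalsh((Y + 1.0) @ (Y + 1.0).T / n)
print('shifted entries :', eigs_shifted.min(), eigs_shifted.max())
```

<p style=\"text-align: justify;\">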
The edge of the spectrum of \\( {\\frac{1}{n}YY^\\top} \\) is sensitive to finite rank perturbations of \\( {Y} \\) while the bulk of the spectrum is not, asymptotically.<\/p>\n<p style=\"text-align: justify;\"><b>Dimensional noise.<\/b> The Marchenko-Pastur law has mean \\( {1} \\) and variance \\( {\\rho} \\). It is enlightening to set \\( {1_\\pm=(1\\pm\\sqrt{\\rho})^2} \\), making apparent that the support \\( {[a,b]=[1_-,1_+]} \\) of the Marchenko-Pastur law is an interval around the mean \\( {1} \\), which is the spectrum of the population covariance matrix \\( {I_m} \\). This noise is a high dimensional effect due to the fact that \\( {m} \\) as well as \\( {n} \\) go to infinity.<\/p>\n<p style=\"text-align: justify;\"><b>Quarter circular law.<\/b> If \\( {\\rho=1} \\) then \\( {[1_-,1_+]=[0,4]} \\) and the Marchenko-Pastur law becomes<\/p>\n<p style=\"text-align: center;\">\\[ \\frac{\\sqrt{4-x}}{2\\pi\\sqrt{x}}\\mathbf{1}_{x\\in[0,4]}\\mathrm{d}x, \\]<\/p>\n<p style=\"text-align: justify;\">and its image by the map \\( {x\\mapsto\\sqrt{x}} \\) is then the quarter circle law<\/p>\n<p style=\"text-align: center;\">\\[ \\frac{\\sqrt{4-x^2}}{\\pi}\\mathbf{1}_{x\\in[0,2]}\\mathrm{d}x. \\]<\/p>\n<p style=\"text-align: justify;\"><b>Shape parameter.<\/b> The image of the Marchenko-Pastur law by the map \\( {x\\mapsto\\frac{x-(1+\\rho)}{\\sqrt{\\rho}}} \\) is<\/p>\n<p style=\"text-align: center;\">\\[ \\frac{\\sqrt{4-x^2}}{2\\pi(1+\\rho+\\sqrt{\\rho}x)}\\mathbf{1}_{x\\in[-2,2]}\\mathrm{d}x, \\]<\/p>\n<p style=\"text-align: justify;\">which reveals that \\( {\\rho} \\) is a shape rather than a scale parameter around the mean. 
This also makes apparent the convergence to the Wigner semicircle law as \\( {\\rho\\rightarrow0} \\).<\/p>\n<figure id=\"attachment_1190\" aria-describedby=\"caption-attachment-1190\" style=\"width: 447px\" class=\"wp-caption aligncenter\"><a href=\"\/blog\/wp-content\/uploads\/2011\/01\/marchenko-and-pastur.jpg\"><img loading=\"lazy\" class=\"size-full wp-image-1190\" title=\"Vladimir Marchenko and Leonid Pastur\" src=\"\/blog\/wp-content\/uploads\/2011\/01\/marchenko-and-pastur.jpg\" alt=\"\" width=\"447\" height=\"284\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2011\/01\/marchenko-and-pastur.jpg 447w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2011\/01\/marchenko-and-pastur-300x190.jpg 300w\" sizes=\"(max-width: 447px) 100vw, 447px\" \/><\/a><figcaption id=\"caption-attachment-1190\" class=\"wp-caption-text\">Vladimir Marchenko and Leonid Pastur<\/figcaption><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>The Marchenko - Pastur law \\( {\\mu_\\rho} \\) with shape parameter \\( {\\rho&gt;0} \\) is the probability distribution \\[ \\left(1-\\frac{1}{\\rho}\\right)_+\\delta_0+\\frac{1}{\\rho 2\\pi x}\\sqrt{(b-x)(x-a)}\\,\\mathbf{1}_{[a,b]}(x)dx \\] where \\[&#8230;<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/djalil.chafai.net\/blog\/2011\/01\/29\/the-marchenko-pastur-law\/\">Continue reading<span class=\"screen-reader-text\">The Marchenko-Pastur 
law<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":6614},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/1171"}],"collection":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/comments?post=1171"}],"version-history":[{"count":73,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/1171\/revisions"}],"predecessor-version":[{"id":14866,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/1171\/revisions\/14866"}],"wp:attachment":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/media?parent=1171"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/categories?post=1171"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/tags?post=1171"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}