{"id":10041,"date":"2018-01-12T12:41:18","date_gmt":"2018-01-12T11:41:18","guid":{"rendered":"http:\/\/djalil.chafai.net\/blog\/?p=10041"},"modified":"2024-05-30T22:07:49","modified_gmt":"2024-05-30T20:07:49","slug":"concentration-without-moments","status":"publish","type":"post","link":"https:\/\/djalil.chafai.net\/blog\/2018\/01\/12\/concentration-without-moments\/","title":{"rendered":"Concentration without moments"},"content":{"rendered":"<figure id=\"attachment_10042\" aria-describedby=\"caption-attachment-10042\" style=\"width: 195px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Wassily_Hoeffding\"><img loading=\"lazy\" class=\"size-medium wp-image-10042\" src=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Hoeffding_Wassily-195x300.jpg\" alt=\"Wassily Hoeffding (1914-1991)\" width=\"195\" height=\"300\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Hoeffding_Wassily-195x300.jpg 195w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Hoeffding_Wassily-98x150.jpg 98w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Hoeffding_Wassily.jpg 340w\" sizes=\"(max-width: 195px) 100vw, 195px\" \/><\/a><figcaption id=\"caption-attachment-10042\" class=\"wp-caption-text\">Wassily Hoeffding (1914 -- 1991)<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">This post presents an inequality for self-normalized sums without moment assumptions, due to Bradley Efron (1969), that I have learnt from La\u00ebtitia Comminges.<\/p>\n<p style=\"text-align: justify;\"><b>Symmetric laws.<\/b> Recall that a probability distribution is symmetric when \\( {X} \\) and \\( {-X} \\) are equally distributed if \\( {X} \\) is a random variable following this distribution. 
In this case, if moreover the law has no atom at \\( {0} \\), then \\( {\\varepsilon=\\mathrm{sign}(X)} \\) and \\( {|X|} \\) are independent and \\( {\\varepsilon} \\) follows a symmetric Rademacher distribution: \\( {\\mathbb{P}(\\varepsilon=\\pm1)=1\/2} \\).<\/p>\n<p style=\"text-align: justify;\"><b>Concentration.<\/b> Let \\( {X_1,\\ldots,X_n} \\) be independent real random variables with symmetric law and without atom at \\( {0} \\). Then for any real \\( {r&gt;0} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mathbb{P}\\left(\\frac{X_1+\\cdots+X_n}{\\sqrt{X_1^2+\\cdots+X_n^2}}\\geq r\\right) \\leq\\mathrm{e}^{-\\frac{r^2}{2}}. \\]<\/p>\n<p style=\"text-align: justify;\">Note that this holds without any moment assumption on the random variables.<\/p>\n<p style=\"text-align: justify;\"><b>Proof.<\/b> Thanks to the independence and symmetry assumptions, the random variables \\( {\\varepsilon_1=\\mathrm{sign}(X_1),\\ldots,\\varepsilon_n=\\mathrm{sign}(X_n)} \\) are iid, follow the symmetric Rademacher distribution, and are independent of \\( {|X_1|,\\ldots,|X_n|} \\). Now by conditioning we get<\/p>\n<p style=\"text-align: center;\">\\[ \\mathbb{P}\\left(\\frac{X_1+\\cdots+X_n}{\\sqrt{X_1^2+\\cdots+X_n^2}}\\geq r\\right) =\\mathbb{E}(\\varphi_r(|X_1|,\\ldots,|X_n|)) \\]<\/p>\n<p style=\"text-align: justify;\">where \\( {\\varphi_r(c_1,\\ldots,c_n)=\\mathbb{P}((\\varepsilon_1c_1+\\cdots+\\varepsilon_nc_n)\/\\sqrt{c_1^2+\\cdots+c_n^2}\\geq r)} \\). We can assume that \\( {c_i&gt;0} \\) since \\( {\\mathbb{P}(X_i=0)=0} \\). It remains to use the Hoeffding inequality, which states that if \\( {Z_1,\\ldots,Z_n} \\) are independent, centered, and bounded real random variables then for any real \\( {r&gt;0} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mathbb{P}\\left(Z_1+\\cdots+Z_n\\geq r\\right) \\leq\\exp\\left(-\\frac{2r^2}{\\mathrm{osc}(Z_1)^2+\\cdots+\\mathrm{osc}(Z_n)^2}\\right), \\]<\/p>\n<p style=\"text-align: justify;\">where \\( {\\mathrm{osc}(Z)=\\max(Z)-\\min(Z)} \\). 
Here we use it with, for any \\( {i=1,\\ldots,n} \\),<\/p>\n<p style=\"text-align: center;\">\\[ Z_i=\\frac{c_i}{\\sqrt{c_1^2+\\cdots+c_n^2}}\\varepsilon_i \\quad\\text{for which}\\quad \\mathrm{osc}(Z_i)^2=\\frac{4c_i^2}{c_1^2+\\cdots+c_n^2}. \\]<\/p>\n<p style=\"text-align: justify;\">Indeed this gives \\( { \\varphi_r(c_1,\\ldots,c_n) =\\mathbb{P}\\left(Z_1+\\cdots+Z_n\\geq r\\right) \\leq\\mathrm{e}^{-\\frac{r^2}{2}}} \\).<\/p>\n<p style=\"text-align: justify;\"><b>Probabilistic interpretation.<\/b> When \\( {X_1,\\ldots,X_n} \\) are iid and in \\( {L^2} \\), then their mean is zero, and their variance is say \\( {\\sigma^2&gt;0} \\). The law of large numbers gives \\( {\\sqrt{X_1^2+\\cdots+X_n^2}=\\sqrt{n}(\\sigma+o_{n\\rightarrow\\infty}(1))} \\) almost surely. Therefore by the central limit theorem and Slutsky's lemma we get \\( {(X_1+\\cdots+X_n)\/\\sqrt{X_1^2+\\cdots+X_n^2}\\overset{\\text{law}}{\\longrightarrow}\\mathcal{N}(0,1)} \\) as \\( {n\\rightarrow\\infty} \\).<\/p>\n<p style=\"text-align: justify;\"><b>Geometric interpretation.<\/b> If \\( {X_1,\\ldots,X_n} \\) are iid standard Gaussian, then<\/p>\n<p style=\"text-align: center;\">\\[ \\frac{X_1+\\cdots+X_n}{\\sqrt{X_1^2+\\cdots+X_n^2}} =\\langle U_n,\\theta_n\\rangle \\]<\/p>\n<p style=\"text-align: justify;\">where<\/p>\n<p style=\"text-align: center;\">\\[ U_n=\\sqrt{n}\\frac{(X_1,\\ldots,X_n)}{\\sqrt{X_1^2+\\cdots+X_n^2}} \\quad\\text{and}\\quad \\theta_n=\\frac{(1,\\ldots,1)}{\\sqrt{n}}. \\]<\/p>\n<p style=\"text-align: justify;\">The random vector \\( {U_n} \\) is uniformly distributed on the sphere of \\( {\\mathbb{R}^n} \\) of radius \\( {\\sqrt{n}} \\), while the vector \\( {\\theta_n} \\) belongs to the unit sphere. 
Note that \\( {\\langle U_n,\\theta_n\\rangle} \\) has the law of the sum of the coordinates of a row or column of a uniform random orthogonal matrix.<\/p>\n<p style=\"text-align: justify;\"><b>Relation to Studentization.<\/b> The result above can be related to the Studentized version of the empirical mean. Indeed, if one defines the empirical mean and the empirical variance<\/p>\n<p style=\"text-align: center;\">\\[ \\overline{X}_n=\\frac{X_1+\\cdots+X_n}{n} \\quad\\text{and}\\quad \\widehat\\sigma^2_n=\\frac{(X_1-\\overline{X}_n)^2+\\cdots+(X_n-\\overline{X}_n)^2}{n-1} \\]<\/p>\n<p style=\"text-align: justify;\">then using<\/p>\n<p style=\"text-align: center;\">\\[ (n-1)\\widehat\\sigma_n^2 =X_1^2+\\cdots+X_n^2-\\frac{(X_1+\\cdots+X_n)^2}{n} \\]<\/p>\n<p style=\"text-align: justify;\">we get, for any \\( {r\\geq0} \\), after some algebra,<\/p>\n<p style=\"text-align: center;\">\\[ \\left\\{\\sqrt{n}\\frac{\\overline{X}_n}{\\widehat\\sigma_n}\\geq r\\right\\} =\\left\\{\\frac{X_1+\\cdots+X_n}{\\sqrt{X_1^2+\\cdots+X_n^2}} \\geq r\\sqrt{\\frac{n}{n-1+r^2}}\\right\\}. \\]<\/p>\n<p style=\"text-align: justify;\">It then follows from the concentration inequality above that if \\( {X_1,\\ldots,X_n} \\) are independent, with symmetric law without atom at \\( {0} \\), then for any \\( {r\\geq0} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mathbb{P}\\left(\\sqrt{n}\\frac{\\overline{X}_n}{\\widehat\\sigma_n}\\geq r\\right) \\leq\\exp\\left(-\\frac{nr^2}{2(n-1+r^2)}\\right). 
\\]<\/p>\n<p style=\"text-align: justify;\">If \\( {X_1,\\ldots,X_n} \\) are iid standard Gaussian then \\( {\\sqrt{n}\\overline{X}_n\\sim\\mathcal{N}(0,1)} \\) and \\( {(n-1)\\widehat{\\sigma}^2_n\\sim\\chi^2(n-1)} \\) are independent, and the ratio \\( {\\sqrt{n}\\overline{X}_n\/\\widehat\\sigma_n} \\) follows the Student \\( {t(n-1)} \\) law, of density proportional to \\( {t\\mapsto 1\/(1+t^2\/(n-1))^{n\/2}} \\), which is in particular heavy tailed.<\/p>\n<p style=\"text-align: justify;\"><b>Further reading.<\/b><\/p>\n<ul>\n<li>Bradley Efron<br \/> <a href=\"https:\/\/doi.org\/10.2307\/2286068\">Student's t-Test Under Symmetry Conditions<\/a><br \/> Journal of the American Statistical Association 64(328) 1278-1302 (1969)<\/li>\n<li>Sergey Bobkov and Friedrich G\u00f6tze<br \/> <a href= \"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=2278452\">Concentration inequalities and limit theorems for randomized sums<\/a><br \/> Probability Theory and Related Fields 137(1-2) 49-81 (2007)<\/li>\n<li>Qi-Man Shao and Qiying Wang<br \/> <a href= \"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=3161676\">Self-normalized limit theorems: a survey<\/a> (includes Cram\u00e9r large deviations)<br \/> Probability Surveys 10 69-93 (2013)<\/li>\n<\/ul>\n<figure id=\"attachment_20316\" aria-describedby=\"caption-attachment-20316\" style=\"width: 320px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/fr.wikipedia.org\/wiki\/Bradley_Efron\"><img loading=\"lazy\" src=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Efron1972.jpg\" alt=\"\" width=\"320\" height=\"320\" class=\"size-full wp-image-20316\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Efron1972.jpg 320w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Efron1972-300x300.jpg 300w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2018\/01\/Efron1972-80x80.jpg 80w\" sizes=\"(max-width: 320px) 100vw, 320px\" \/><\/a><figcaption 
id=\"caption-attachment-20316\" class=\"wp-caption-text\">Bradley Efron (1938 -) in 1972<\/figcaption><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>This post presents an inequality for self-normalized sums without moment assumptions, due to Bradley Efron (1969), that I have learnt from La&euml;titia Comminges. Symmetric laws.&#8230;<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/djalil.chafai.net\/blog\/2018\/01\/12\/concentration-without-moments\/\">Continue reading<span class=\"screen-reader-text\">Concentration without moments<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":1417},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/10041"}],"collection":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/comments?post=10041"}],"version-history":[{"count":25,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/10041\/revisions"}],"predecessor-version":[{"id":20318,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/10041\/revisions\/20318"}],"wp:attachment":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/media?parent=10041"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/categories?post=10041"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/tags?post=10041"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}