{"id":15124,"date":"2021-08-28T12:46:49","date_gmt":"2021-08-28T10:46:49","guid":{"rendered":"https:\/\/djalil.chafai.net\/blog\/?p=15124"},"modified":"2021-09-12T21:50:19","modified_gmt":"2021-09-12T19:50:19","slug":"sinkhorn-and-circular-law","status":"publish","type":"post","link":"https:\/\/djalil.chafai.net\/blog\/2021\/08\/28\/sinkhorn-and-circular-law\/","title":{"rendered":"Sinkhorn and circular law"},"content":{"rendered":"<figure id=\"attachment_15168\" aria-describedby=\"caption-attachment-15168\" style=\"width: 600px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-expo.png\"><img loading=\"lazy\" class=\"size-full wp-image-15168\" src=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-expo.png\" alt=\"Circular law for sinkhorn\" width=\"600\" height=\"400\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-expo.png 600w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-expo-300x200.png 300w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/a><figcaption id=\"caption-attachment-15168\" class=\"wp-caption-text\">Sinkhorn spectrum for initial matrix with iid exponential entries<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">The <a href=\"https:\/\/en.wikipedia.org\/wiki\/Sinkhorn%27s_theorem\">Sinkhorn factorization<\/a> of a square matrix $A$ with positive entries takes the form $$A = D_1 S D_2$$ where $D_1$ and $D_2$ are diagonal matrices with positive diagonal entries and where $S$ is a doubly stochastic matrix meaning that it has entries in $[0,1]$ and each row and column sums up to one. It is named after <a href=\"https:\/\/www.mathgenealogy.org\/id.php?id=8396\">Richard Dennis Sinkhorn (1934 - 1995)<\/a> who worked on the subject in the 1960s. A natural algorithm for the numerical approximation of this factorization consists in iteratively normalize each row and each column, and this is referred to as the <strong>Sinkhorn-Knopp algorithm<\/strong>. Its convergence is fast. Actually this is a special case of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Iterative_proportional_fitting\">iterative proportional fitting<\/a> (IPF) algorithm, classic in probability and statistics, which goes back at least to the analysis of the contingency tables in the 1920s, and which was reinvented many times, see for instance the article by Friedrich Pukelsheim and references therein. It is in particular well known that $S$ is the projection of $A$, with respect to the Kullback-Leibler divergence, on the set of doubly stochastic matrices (Birkhoff polytope). The Sinkhorn factorization and its algorithm became fashionable in the domain of computational optimal transportation due to a relation with entropic regularization. It is also considered in the domain of quantum information theory.<\/p>\n<p style=\"text-align: justify;\">Here is a simple implementation using the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Julia_(programming_language)\">Julia programming language<\/a>.<\/p>\n<pre class=\"prettyprint\">using LinearAlgebra # for Diagonal\r\n#\r\nfunction sinkhorn(A, tol = 1E-6, maxiter = 1E6)\r\n # Sinkhorn factorization A = D1 S D2 where D1 and D2 are diagonal and S doubly \r\n # stochastic using Sinkhorn-Knopp algorithm which consists in iterative rows \r\n # and columns sums normalizations. This code is not optimized.\r\n m , n = size(A,1), size(A,2) \r\n D1, D2 = ones(1,n), ones(1,n)\r\n S = A\r\n iter, err = 0, +Inf\r\n while ((err &gt; tol) &amp;&amp; (iter &lt; maxiter))\r\n   RS = vec(sum(S, dims = 2)) # column vector of rows sums\r\n   D1 = D1 .* RS\r\n   S = Diagonal(1 .\/ RS) * S \r\n   CS = vec(sum(S, dims = 1)) # row vector of columns sums\r\n   D2 = CS .* D2\r\n   S = S * Diagonal(1 .\/ CS)\r\n   iter += 1\r\n   err = norm(RS .- 1) + norm(CS .- 1)\r\n end\r\n return S, Diagonal(D1), Diagonal(D2), iter\r\nend # function\r\n<\/pre>\n<p style=\"text-align: justify;\"><strong>Circular law.<\/strong> Suppose now that $A$ is a random matrix, $n\\times n$, with independent and identically distributed positive entries of mean $m$ and variance $\\sigma^2$, for instance following the uniform or the exponential distribution. Let $S$ be the random doubly stochastic matrix provided by the Sinkhorn factorization. It is tempting to conjecture that the empirical spectral distribution of $$\\frac{m\\sqrt{n}}{\\sigma}S$$ converges weakly as $n\\to\\infty$ to the uniform distribution on the unit disc of the complex plane, with a single outlier equal to $\\frac{m\\sqrt{n}}{\\sigma}$. The circular law for $S$ is inherited from the circular law for $A$ and the law of large numbers. This can be easily guessed from the first iteration of the Sinkhorn-Knopp algorithm, that reminds the circular law analysis of random Markov matrices inspired from the Dirichlet Markov Ensemble and its decomposition. The circular law for $A$ is a universal high dimensional phenomenon that holds as soon as the variance of the entries is finite. One can also explore the case of heavy tails, and wonder if the circular law for $S$ still remains true!<\/p>\n<p style=\"text-align: justify;\">The following Julia code produces the graphics at the top and the bottom of this post.<\/p>\n<pre class=\"prettyprint\">using Printf # for @printf\r\nusing Plots # for scatter and plot\r\nusing LinearAlgebra # for LinRange\r\n\r\nfunction sinkhornplot(A,m,sigma,filename)\r\n    S, D1, D2, iter = sinkhorn(A)\r\n    @printf(\"%s\\n\",filename)\r\n    @printf(\" \u2016A-D1 S D2\u2016 = %s\\n\",norm(A-D1*S*D2))\r\n    @printf(\" \u2016RowsSums(S)-1\u2016 = %s\\n\",norm(sum(S,dims=2).-1))\r\n    @printf(\" \u2016ColsSums(S)-1\u2016 = %s\\n\",norm(sum(S,dims=1).-1))\r\n    @printf(\" SK iterations = %d\\n\",iter)\r\n    spec = eigvals(S) * m * sqrt(size(S,1)) \/ sigma\r\n    maxval, maxloc = findmax(abs.(spec))\r\n    deleteat!(spec, maxloc)\r\n    scatter(real(spec),imag(spec),aspect_ratio=:equal,legend=false)\r\n    x = LinRange(-1,1,100)\r\n    plot!(x,sqrt.(1 .- x.^2),linewidth=2,linecolor=:steelblue)\r\n    plot!(x,-sqrt.(1 .- x.^2),linewidth=2,linecolor=:steelblue)\r\n    savefig(filename)    \r\nend #function\r\n\r\nsinkhornplot(rand(500,500),1\/2,1\/sqrt(12),\"sinkhorn-unif.png\")\r\nsinkhornplot(-log.(rand(500,500)),1,1,\"sinkhorn-expo.png\")\r\nsinkhornplot(abs.(randn(500,500).\/randn(500,500)),1,1,\"sinkhorn-heavy.png\")\r\n<\/pre>\n<figure id=\"attachment_15176\" aria-describedby=\"caption-attachment-15176\" style=\"width: 600px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-heavy.png\"><img loading=\"lazy\" class=\"size-full wp-image-15176\" src=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-heavy.png\" alt=\"\" width=\"600\" height=\"400\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-heavy.png 600w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2021\/08\/sinkhorn-heavy-300x200.png 300w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/><\/a><figcaption id=\"caption-attachment-15176\" class=\"wp-caption-text\">Sinkhorn spectrum for initial matrix with iid heavy tailed entries<\/figcaption><\/figure>\n<p style=\"text-align: justify;\"><strong>Example of program output.<\/strong><\/p>\n<pre class=\"prettyprint\">sinkhorn-unif.png\r\n \u2016A-D1 S D2\u2016 = 6.704194265149758e-14\r\n \u2016RowsSums(S)-1\u2016 = 8.209123420903459e-11\r\n \u2016ColsSums(S)-1\u2016 = 3.326966272203901e-15\r\n SK iterations = 4\r\nsinkhorn-expo.png\r\n \u2016A-D1 S D2\u2016 = 1.9956579227110527e-13\r\n \u2016RowsSums(S)-1\u2016 = 5.155302149020719e-11\r\n \u2016ColsSums(S)-1\u2016 = 3.3454392974351074e-15\r\n SK iterations = 5\r\nsinkhorn-heavy.png\r\n \u2016A-D1 S D2\u2016 = 8.423054036449791e-10\r\n \u2016RowsSums(S)-1\u2016 = 4.5152766460217607e-7\r\n \u2016ColsSums(S)-1\u2016 = 3.8778423131653425e-15\r\n SK iterations = 126\r\n<\/pre>\n<p style=\"text-align: justify;\"><strong>Further reading.<\/strong><\/p>\n<ul>\n<li>Richard Sinkhorn<br \/>\n<a href=\"https:\/\/doi.org\/10.1214%2Faoms%2F1177703591\">A relationship between arbitrary positive matrices and doubly stochastic matrices<\/a><br \/>\nAnn. Math. Statist. 35:876-879 (1964)<\/li>\n<li>Richard Sinkhorn and Paul Knopp<br \/>\n<a href=\"https:\/\/doi.org\/pjm\/1102992505\">Concerning nonnegative matrices and doubly stochastic matrices<\/a><br \/>\nPacific J. Math. 21:343-348 (1967)<\/li>\n<li>Friedrich Pukelsheim<br \/>\n<a href=\"http:\/\/doi.org\/10.1007\/s10479-013-1468-3\">Biproportional scaling of matrices and the iterative proportional fitting procedure<\/a><br \/>\nAnnals of Operations Research 215(1):269-283 (2014)<\/li>\n<li>Julie Champion<br \/>\n<a href=\"http:\/\/thesesups.ups-tlse.fr\/2036\/1\/2013TOU30083.pdf\">Sur les algorithmes de projections en entropie relative avec contraintes marginales<\/a><br \/>\nTh\u00e8se de doctorat, universit\u00e9 de Toulouse (2013)<\/li>\n<li>Djalil Chafa\u00ef<br \/>\n<a href=\"https:\/\/dx.doi.org\/10.1016\/j.jmva.2009.10.013\">The Dirichlet Markov Ensemble<\/a><br \/>\nJournal of Multivariate Analysis 101:555-567 (2010)<\/li>\n<li>Charles Bordenave, Pietro Caputo, and Djalil Chafa\u00ef<br \/>\n<a href=\"https:\/\/dx.doi.org\/10.1007\/s00440-010-0336-1\">Circular Law Theorem for Random Markov Matrices<\/a><br \/>\nProbability Theory and Related Fields 152(3-4):751-779 (2012)<\/li>\n<li>Hoi Nguyen<br \/>\n<a href=\"https:\/\/doi.org\/10.1214\/13-AOP877\">Random doubly stochastic matrices: the circular law<\/a><br \/>\nAnn. Probab. 42(3):1161-1196 (2014)<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The Sinkhorn factorization of a square matrix $A$ with positive entries takes the form $$A = D_1 S D_2$$ where $D_1$ and $D_2$ are diagonal&#8230;<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/djalil.chafai.net\/blog\/2021\/08\/28\/sinkhorn-and-circular-law\/\">Continue reading<span class=\"screen-reader-text\">Sinkhorn and circular law<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":515},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/15124"}],"collection":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/comments?post=15124"}],"version-history":[{"count":56,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/15124\/revisions"}],"predecessor-version":[{"id":15254,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/15124\/revisions\/15254"}],"wp:attachment":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/media?parent=15124"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/categories?post=15124"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/tags?post=15124"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}