{"id":12689,"date":"2020-03-15T18:06:04","date_gmt":"2020-03-15T17:06:04","guid":{"rendered":"http:\/\/djalil.chafai.net\/blog\/?p=12689"},"modified":"2020-03-18T06:37:34","modified_gmt":"2020-03-18T05:37:34","slug":"coupling-divergences-and-markov-kernels","status":"publish","type":"post","link":"https:\/\/djalil.chafai.net\/blog\/2020\/03\/15\/coupling-divergences-and-markov-kernels\/","title":{"rendered":"Coupling, divergences, and Markov kernels"},"content":{"rendered":"<figure id=\"attachment_12711\" aria-describedby=\"caption-attachment-12711\" style=\"width: 300px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Coupling_(British_TV_series)\"><img loading=\"lazy\" src=\"http:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2020\/03\/Coupling_title_card.jpg\" alt=\"Coupling (British TV series)\" width=\"300\" height=\"169\" class=\"size-full wp-image-12711\" srcset=\"https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2020\/03\/Coupling_title_card.jpg 300w, https:\/\/djalil.chafai.net\/blog\/wp-content\/uploads\/2020\/03\/Coupling_title_card-150x85.jpg 150w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><figcaption id=\"caption-attachment-12711\" class=\"wp-caption-text\">Coupling (British TV series)<\/figcaption><\/figure>\n<p style=\"text-align: justify;\">Let \\( {{(X_t)}_{t\\geq0}} \\) be a Markov process on a state space \\( {E} \\). Let us define<\/p>\n<p style=\"text-align: center;\">\\[ P_t(x,\\cdot)=\\mathrm{Law}(X_t\\mid X_0=x),\\quad x\\in E, t\\geq0. \\]<\/p>\n<p style=\"text-align: justify;\">It follows that if \\( {X_0\\sim\\mu} \\) then \\( {X_t\\sim\\mu P_t} \\) where<\/p>\n<p style=\"text-align: center;\">\\[ \\mu P_t = \\int P_t(x,\\cdot)\\mathrm{d}\\mu(x). \\]<\/p>\n<p style=\"text-align: justify;\">In this post a <b>divergence<\/b> between probability measures \\( {\\mu} \\) and \\( {\\nu} \\) on \\( {E} \\) is a quantitative way to measure the difference between \\( {\\mu} \\) and \\( {\\nu} \\). 
A divergence can be a distance such as the total variation or the Wasserstein distance. We study nice examples later on.<\/p>\n<p style=\"text-align: justify;\">Suppose that, for some divergence between probability measures and some quantity \\( {\\varphi_t(x,y)} \\) depending on \\( {x,y,t} \\), we have, typically by using a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Coupling_(probability)\">coupling<\/a>,<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}(P_t(x,\\cdot),P_t(y,\\cdot)) \\leq\\varphi_t(x,y). \\]<\/p>\n<p style=\"text-align: justify;\">In this post, we explain how to <b>deduce<\/b> that for all probability measures \\( {\\mu,\\nu} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}(\\mu P_t,\\nu P_t) \\leq\\int\\varphi_t(x,y)\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y). \\]<\/p>\n<p style=\"text-align: justify;\">The initial inequality corresponds to taking \\( {\\mu=\\delta_x} \\) and \\( {\\nu=\\delta_y} \\). All this is about notions of <b>couplings<\/b>, <b>divergences<\/b>, and <b>functional inequalities<\/b>.<\/p>\n<p style=\"text-align: justify;\"><b>Coupling.<\/b> Let \\( {\\mathcal{P}(E)} \\) be the set of probability measures on \\( {E} \\). If \\( {\\mu} \\) and \\( {\\nu} \\) are in \\( {\\mathcal{P}(E)} \\), then a coupling of \\( {\\mu} \\) and \\( {\\nu} \\) is an element \\( {\\pi} \\) of \\( {\\mathcal{P}(E\\times E)} \\) with marginal distributions \\( {\\mu} \\) and \\( {\\nu} \\). The set of couplings is convex, and is not empty since it contains the product measure \\( {\\mu\\otimes\\nu} \\).<\/p>\n<p style=\"text-align: justify;\"><b>Supremum divergence.<\/b> Let \\( {\\mathcal{F}} \\) be a class of bounded measurable functions \\( {f:E\\rightarrow\\mathbb{R}} \\). For all \\( {\\mu,\\nu\\in\\mathcal{P}(E)} \\), we define the quantity<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_{\\mathcal{F}}(\\mu,\\nu) =\\sup_{f\\in\\mathcal{F}}\\int f\\mathrm{d}(\\mu-\\nu)\\in(-\\infty,+\\infty]. 
\\]<\/p>\n<p style=\"text-align: justify;\">This is not necessarily a distance. We give nice examples later.<\/p>\n<p style=\"text-align: justify;\"><b>Inequality.<\/b> Let \\( {P:E\\rightarrow\\mathcal{P}(E)} \\) be a Markov kernel. Recall that for all \\( {\\mu\\in\\mathcal{P}(E)} \\), \\( {\\mu P\\in\\mathcal{P}(E)} \\) is defined by \\( {\\mu P=\\int P(x,\\cdot)\\mathrm{d}\\mu(x)} \\). Then, for all \\( {\\mu} \\) and \\( {\\nu} \\) in \\( {\\mathcal{P}(E)} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_{\\mathcal{F}}(\\mu P,\\nu P) \\leq \\inf_\\pi\\int \\mathrm{div}_{\\mathcal{F}}(P(x,\\cdot),P(y,\\cdot))\\mathrm{d}\\pi(x,y) \\]<\/p>\n<p style=\"text-align: justify;\">where the infimum runs over all couplings of \\( {\\mu} \\) and \\( {\\nu} \\). Taking \\( {\\pi=\\mu\\otimes\\nu} \\) gives<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_{\\mathcal{F}}(\\mu P,\\nu P) \\leq \\int \\mathrm{div}_{\\mathcal{F}}(P(x,\\cdot),P(y,\\cdot))\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y), \\]<\/p>\n<p style=\"text-align: justify;\">and in particular<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_{\\mathcal{F}}(\\mu P,\\nu P) \\leq \\sup_{x,y}\\mathrm{div}_{\\mathcal{F}}(P(x,\\cdot),P(y,\\cdot)). \\]<\/p>\n<p style=\"text-align: justify;\"><b>A proof.<\/b> The idea is to introduce a coupling and then to proceed by conditioning or disintegration. Namely, if \\( {\\pi} \\) is a coupling of \\( {\\mu} \\) and \\( {\\nu} \\), for instance \\( {\\mu\\otimes\\nu} \\), then, by the marginal property of \\( {\\pi} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\int f\\mathrm{d}(\\mu P-\\nu P) =\\int\\Bigl(\\int f\\mathrm{d}(P(x,\\cdot)-P(y,\\cdot))\\Bigr)\\mathrm{d}\\pi(x,y). \\]<\/p>\n<p style=\"text-align: justify;\">As a consequence,<\/p>\n<p style=\"text-align: center;\">\\[ \\sup_{f\\in\\mathcal{F}}\\int f\\mathrm{d}(\\mu P-\\nu P) \\leq \\int\\Bigl(\\sup_{f\\in\\mathcal{F}}\\int f\\mathrm{d}(P(x,\\cdot)-P(y,\\cdot))\\Bigr)\\mathrm{d}\\pi(x,y). 
\\]<\/p>\n<p style=\"text-align: justify;\">Taking the infimum over all couplings \\( {\\pi} \\) gives the desired inequality.<\/p>\n<p style=\"text-align: justify;\"><b>Infimum divergence.<\/b> For a given map \\( {c:E\\times E\\rightarrow[0,+\\infty]} \\) that we call a <b>cost<\/b>, we define, for all \\( {\\mu} \\) and \\( {\\nu} \\) in \\( {\\mathcal{P}(E)} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_c(\\mu,\\nu)=\\inf_\\pi\\int c(x,y)\\mathrm{d}\\pi(x,y)\\in[0,+\\infty] \\]<\/p>\n<p style=\"text-align: justify;\">where the infimum runs over all couplings of \\( {\\mu} \\) and \\( {\\nu} \\). This is also known as the transportation or coupling distance, even though it is not always a distance. We give nice examples later on.<\/p>\n<p style=\"text-align: justify;\"><b>Inequality.<\/b> For every Markov kernel \\( {P:E\\rightarrow\\mathcal{P}(E)} \\) and all \\( {\\mu} \\) and \\( {\\nu} \\) in \\( {\\mathcal{P}(E)} \\),<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_c(\\mu P,\\nu P) \\leq\\inf_\\pi\\int \\mathrm{div}_c(P(x,\\cdot),P(y,\\cdot))\\mathrm{d}\\pi(x,y), \\]<\/p>\n<p style=\"text-align: justify;\">where the infimum runs over all couplings of \\( {\\mu} \\) and \\( {\\nu} \\). Taking \\( {\\pi=\\mu\\otimes\\nu} \\) gives<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_c(\\mu P,\\nu P) \\leq\\int \\mathrm{div}_c(P(x,\\cdot),P(y,\\cdot))\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y), \\]<\/p>\n<p style=\"text-align: justify;\">and in particular<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_c(\\mu P,\\nu P) \\leq\\sup_{x,y} \\mathrm{div}_c(P(x,\\cdot),P(y,\\cdot)). \\]<\/p>\n<p style=\"text-align: justify;\"><b>A proof.<\/b> Let \\( {\\pi_{x,y}} \\) be a coupling of \\( {P(x,\\cdot)} \\) and \\( {P(y,\\cdot)} \\), measurable in \\( {(x,y)} \\). Then \\( {\\int\\pi_{x,y}(\\cdot,\\cdot)\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y)} \\) is a coupling of \\( {\\mu P} \\) and \\( {\\nu P} \\). 
Indeed, for instance for the first marginal, we have<\/p>\n<p style=\"text-align: center;\">\\[ \\begin{array}{rcl} \\int_{y'}\\int_{x,y}\\pi_{x,y}(\\cdot,\\mathrm{d}y')\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y) &amp;=&amp;\\int_{x,y}\\int_{y'}\\pi_{x,y}(\\cdot,\\mathrm{d}y')\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y)\\\\ &amp;=&amp;\\int_{x,y}P(x,\\cdot)\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y)\\\\ &amp;=&amp;\\mu P. \\end{array} \\]<\/p>\n<p style=\"text-align: justify;\">Now, for all \\( {\\varepsilon&gt;0} \\) there exists a coupling \\( {\\pi_{x,y}} \\) of \\( {P(x,\\cdot)} \\) and \\( {P(y,\\cdot)} \\) such that<\/p>\n<p style=\"text-align: center;\">\\[ \\begin{array}{rcl} \\int_{x',y'} c(x',y')\\mathrm{d}\\pi_{x,y}(x',y')-\\varepsilon &amp;\\leq&amp;\\inf_{\\pi}\\int c(x',y')\\mathrm{d}\\pi(x',y')\\\\ &amp;=&amp;\\mathrm{div}_c(P(x,\\cdot),P(y,\\cdot)), \\end{array} \\]<\/p>\n<p style=\"text-align: justify;\">and thus<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_c(\\mu P,\\nu P) -\\varepsilon \\leq\\int \\mathrm{div}_c(P(x,\\cdot),P(y,\\cdot))\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y). \\]<\/p>\n<p style=\"text-align: justify;\">Since \\( {\\varepsilon&gt;0} \\) is arbitrary, this gives the desired inequality.<\/p>\n<p style=\"text-align: justify;\"><b>Playing with Markov kernels.<\/b> Let us consider the identity Markov kernel defined by<\/p>\n<p style=\"text-align: center;\">\\[ P(x,\\cdot)=\\delta_x\\quad\\mbox{for all}\\quad x\\in E. \\]<\/p>\n<p style=\"text-align: justify;\">Then \\( {\\mu P=\\mu} \\) for all \\( {\\mu\\in\\mathcal{P}(E)} \\), hence the name. Next, since \\( {\\mathrm{div}_c(\\delta_x,\\delta_y)=c(x,y)} \\), the inequality above for the infimum divergence gives in this case the tautology \\( {\\mathrm{div}_c(\\mu,\\nu)=\\mathrm{div}_c(\\mu,\\nu)} \\). 
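<\/p>\n<p style=\"text-align: justify;\">As a complementary toy example, consider the constant kernel \\( {P(x,\\cdot)=\\rho} \\) for all \\( {x\\in E} \\), where \\( {\\rho\\in\\mathcal{P}(E)} \\) is fixed. Then \\( {\\mu P=\\rho} \\) for all \\( {\\mu\\in\\mathcal{P}(E)} \\), so that, as soon as the cost \\( {c} \\) vanishes on the diagonal, the diagonal coupling of \\( {\\rho} \\) with itself gives<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_c(\\mu P,\\nu P) =\\mathrm{div}_c(\\rho,\\rho) \\leq\\int c(x,x)\\mathrm{d}\\rho(x) =0. \\]<\/p>\n<p style=\"text-align: justify;\">A kernel that mixes perfectly in one step thus erases any divergence between the initial laws, and both sides of the inequality vanish. Let us come back to the identity kernel.<\/p>\n<p style=\"text-align: justify;\">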
In contrast, the inequality for the supremum divergence gives<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_{\\mathcal{F}}(\\mu,\\nu) \\leq \\inf_\\pi\\int c(x,y)\\mathrm{d}\\pi(x,y) =\\mathrm{div}_c(\\mu,\\nu) \\]<\/p>\n<p style=\"text-align: justify;\">where the infimum runs over all couplings of \\( {\\mu} \\) and \\( {\\nu} \\) and where the cost is<\/p>\n<p style=\"text-align: center;\">\\[ c(x,y) =\\mathrm{div}_{\\mathcal{F}}(\\delta_x,\\delta_y) =\\sup_{f\\in\\mathcal{F}}(f(x)-f(y)). \\]<\/p>\n<p style=\"text-align: justify;\"><b>Kantorovich-Rubinstein duality.<\/b> When the cost \\( {(x,y)\\mapsto c(x,y)} \\) is a distance making \\( {E} \\) a metric space, this duality theorem states that<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}_c =\\mathrm{div}_{\\mathcal{F}} \\]<\/p>\n<p style=\"text-align: justify;\">where \\( {\\mathcal{F}} \\) is the class of functions \\( {f:E\\rightarrow\\mathbb{R}} \\) such that<\/p>\n<p style=\"text-align: center;\">\\[ \\left\\Vert f\\right\\Vert_{\\mathrm{Lip}} =\\sup_{x\\neq y}\\frac{|f(x)-f(y)|}{c(x,y)} \\leq1. \\]<\/p>\n<p style=\"text-align: justify;\">In the case of the discrete distance \\( {c(x,y)=\\mathbf{1}_{x\\neq y}} \\), this identity becomes<\/p>\n<p style=\"text-align: center;\">\\[ \\inf_{\\substack{(X,Y)\\\\X\\sim\\mu\\\\Y\\sim\\nu}}\\mathbb{P}(X\\neq Y) =\\sup_{\\substack{f:E\\rightarrow\\mathbb{R}\\\\\\left\\Vert f\\right\\Vert_\\infty\\leq 1\/2}}\\int f\\mathrm{d}(\\mu-\\nu) \\]<\/p>\n<p style=\"text-align: justify;\">and this matches the <b>total variation distance<\/b><\/p>\n<p style=\"text-align: center;\">\\[ \\left\\Vert \\mu-\\nu\\right\\Vert_{\\mathrm{TV}} =\\sup_{B\\subset E}|\\mu(B)-\\nu(B)| \\]<\/p>\n<p style=\"text-align: justify;\">(here \\( {\\geq} \\) is immediate, while \\( {\\leq} \\) requires approximation or structure on \\( {E} \\)).<\/p>\n<p style=\"text-align: justify;\"><b>Bounded-Lipschitz<\/b> or <b>Fortet-Mourier distance<\/b>. 
Again when \\( {E} \\) is a metric space, this distance corresponds to \\( {\\mathrm{div}_{\\mathcal{F}}} \\) when \\( {\\mathcal{F}} \\) is the class of \\( {f:E\\rightarrow\\mathbb{R}} \\) such that<\/p>\n<p style=\"text-align: center;\">\\[ \\left\\Vert f\\right\\Vert_{\\mathrm{Lip}}\\leq1\\quad\\mbox{(implies continuity)}\\quad \\mbox{and}\\quad\\left\\Vert f\\right\\Vert_\\infty\\leq1. \\]<\/p>\n<p style=\"text-align: justify;\"><b>(Monge-Kantorovich-)Wasserstein distances.<\/b> When \\( {E} \\) is a metric space equipped with a distance \\( {d} \\), and when \\( {p\\in[1,\\infty)} \\), the \\( {W_p} \\) distance is defined by<\/p>\n<p style=\"text-align: center;\">\\[ W_p(\\mu,\\nu)=\\mathrm{div}_c(\\mu,\\nu)^{1\/p} \\quad\\mbox{with}\\quad c(x,y)=d(x,y)^p. \\]<\/p>\n<p style=\"text-align: justify;\">It is finite when \\( {\\mu} \\) and \\( {\\nu} \\) have finite \\( {p} \\)-th order moment in the sense that for some (and thus any) \\( {x\\in E} \\) we have \\( {\\int d(x,y)^p\\mathrm{d}\\mu(y)&lt;\\infty} \\) and \\( {\\int d(x,y)^p\\mathrm{d}\\nu(y)&lt;\\infty} \\). 
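<\/p>\n<p style=\"text-align: justify;\">As a concrete illustration, on \\( {E=\\mathbb{R}} \\) with \\( {d(x,y)=|x-y|} \\) and \\( {p=2} \\), the distance between two Gaussian laws is explicit, namely<\/p>\n<p style=\"text-align: center;\">\\[ W_2(\\mathcal{N}(m_1,\\sigma_1^2),\\mathcal{N}(m_2,\\sigma_2^2))^2 =(m_1-m_2)^2+(\\sigma_1-\\sigma_2)^2, \\]<\/p>\n<p style=\"text-align: justify;\">the optimal coupling being the image of \\( {\\mathcal{N}(m_1,\\sigma_1^2)} \\) under the monotone affine map \\( {x\\mapsto m_2+\\frac{\\sigma_2}{\\sigma_1}(x-m_1)} \\) when \\( {\\sigma_1&gt;0} \\).<\/p>\n<p style=\"text-align: justify;\">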
On this subset of \\( {\\mathcal{P}(E)} \\), \\( {W_p} \\) indeed turns out to be a true distance.<\/p>\n<p style=\"text-align: justify;\">In the case \\( {p=1} \\), the Kantorovich-Rubinstein duality can be used for \\( {W_1=\\mathrm{div}_c} \\) with \\( {c(x,y)=d(x,y)} \\) since it is a distance on \\( {E} \\), giving \\( {W_1=\\mathrm{div}_{\\mathcal{F}}} \\) where \\( {\\mathcal{F}} \\) is the class of bounded (this condition can be relaxed) and Lipschitz functions \\( {f:E\\rightarrow\\mathbb{R}} \\) with \\( {\\left\\Vert f\\right\\Vert_{\\mathrm{Lip}}\\leq1} \\).<\/p>\n<p style=\"text-align: justify;\">When \\( {p\\neq 1} \\), the cost is no longer a distance, but we still have the variational formula<\/p>\n<p style=\"text-align: center;\">\\[ W_p(\\mu,\\nu)=\\sup\\left(\\int f\\mathrm{d}\\mu-\\int g\\mathrm{d}\\nu\\right)^{1\/p} \\]<\/p>\n<p style=\"text-align: justify;\">where the supremum runs over all bounded and Lipschitz \\( {f,g:E\\rightarrow\\mathbb{R}} \\) such that \\( {f(x)-g(y)\\leq d(x,y)^p} \\). In other words,<\/p>\n<p style=\"text-align: center;\">\\[ W_p(\\mu,\\nu)=\\sup\\left(\\int Q(f)\\mathrm{d}\\mu-\\int f\\mathrm{d}\\nu\\right)^{1\/p} \\]<\/p>\n<p style=\"text-align: justify;\">where the supremum runs over bounded Lipschitz \\( {f:E\\rightarrow\\mathbb{R}} \\) and where \\( {Q(f)} \\) is the <b>infimum convolution<\/b> of \\( {f} \\) with the cost \\( {d(\\cdot,\\cdot)^p} \\) defined by<\/p>\n<p style=\"text-align: center;\">\\[ Q(f)(x)=\\inf_{y\\in E}\\Bigl(f(y)+d(x,y)^p\\Bigr). \\]<\/p>\n<p style=\"text-align: justify;\">Note that \\( {W_p} \\) defines the same topology as the <b>Zolotarev distance<\/b> \\( {\\mathrm{div}_{\\mathcal{F}}^{1\/p}} \\) where \\( {\\mathcal{F}} \\) is the class of functions with growth at most like \\( {d^p(x,\\cdot)} \\) for some arbitrary \\( {x} \\). 
They coincide when \\( {p=1} \\) and differ metrically when \\( {p\\neq1} \\).<\/p>\n<p style=\"text-align: justify;\"><b>Trend to equilibrium.<\/b> In the study of the trend to equilibrium, that is, the long time behavior of the Markov process \\( {X} \\), we typically have \\( {\\lim_{t\\rightarrow\\infty}\\varphi_t(x,y)=0} \\) for all \\( {x,y} \\). Also, if \\( {\\nu} \\) is invariant, meaning that \\( {\\nu P_t=\\nu} \\) for all \\( {t} \\), then<\/p>\n<p style=\"text-align: center;\">\\[ \\mathrm{div}(\\mu P_t,\\nu)\\leq\\int\\varphi_t(x,y)\\mathrm{d}\\mu(x)\\mathrm{d}\\nu(y) \\underset{t\\rightarrow\\infty}{\\longrightarrow}0 \\]<\/p>\n<p style=\"text-align: justify;\">provided that \\( {\\sup_t\\varphi_t} \\) is \\( {\\mu\\otimes\\nu} \\) integrable (dominated convergence).<\/p>\n<p style=\"text-align: justify;\"><b>Further reading.<\/b><\/p>\n<ul>\n<li><a href=\"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=1105086\">Rachev - Probability metrics and the stability of stochastic models. (1991)<\/a><\/li>\n<li><a href=\"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=1619171\">Rachev and R\u00fcschendorf - Mass transportation problems I and II. (1998)<\/a><\/li>\n<li><a href=\"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=1964483\">Villani - Topics in optimal transportation. (2003)<\/a><\/li>\n<li><a href=\"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=2459454\">Villani - Optimal transport. Old and new. (2009)<\/a><\/li>\n<li><a href=\"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=3409718\">Santambrogio - Optimal transport for applied mathematicians. Calculus of variations, PDEs, and modeling. (2015)<\/a><\/li>\n<li><a href=\"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=2895086\">Gozlan and L\u00e9onard - Transport inequalities. A survey. (2010)<\/a><\/li>\n<li><a href=\"https:\/\/mathscinet.ams.org\/mathscinet-getitem?mr=1924231\">Lindvall - Lectures on the coupling method. Revised and corrected edition. 
(2002)<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Let \\( {{(X_t)}_{t\\geq0}} \\) be a Markov process on a state space \\( {E} \\). Let us define \\[ P_t(x,\\cdot)=\\mathrm{Law}(X_t\\mid X_0=x),\\quad x\\in E, t\\geq0. \\]&#8230;<\/p>\n<div class=\"more-link-wrapper\"><a class=\"more-link\" href=\"https:\/\/djalil.chafai.net\/blog\/2020\/03\/15\/coupling-divergences-and-markov-kernels\/\">Continue reading<span class=\"screen-reader-text\">Coupling, divergences, and Markov kernels<\/span><\/a><\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":514},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/12689"}],"collection":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/comments?post=12689"}],"version-history":[{"count":17,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/12689\/revisions"}],"predecessor-version":[{"id":12714,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/posts\/12689\/revisions\/12714"}],"wp:attachment":[{"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/media?parent=12689"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/categories?post=12689"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/djalil.chafai.net\/blog\/wp-json\/wp\/v2\/tags?post=12689"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}