
If you know what Poincaré and log-Sobolev inequalities are, you may ask what makes variance and entropy so special. This is a natural and old question. This short post is about Φ-entropies, a subject explored two or three decades ago, which provides an answer based on convexity. Convexity plays an important role in probabilistic functional analysis: it cannot be reduced to the Cauchy-Schwarz inequality, and it interacts with the integration by parts formula and the Bochner formula. In particular, it is useful for the study of the analysis and geometry of discrete and continuous Markov processes.
Phi-entropy. Let \( \Phi:I\to\mathbb{R}\cup\{+\infty\} \) be strictly convex, defined on a closed interval \( I\subset\mathbb{R} \) with non-empty interior, finite and \( \mathcal{C}^4 \) on the interior of \( I \). Let \( (\Omega,\mathcal{A},\mu) \) be a probability space, and \( f:\Omega\to I \) such that \( f\in\mathrm{L}^1(\mu) \) and \( \Phi(f)\in\mathrm{L}^1(\mu) \). Following for instance [L], the Φ-entropy (actually a relative entropy) of \( f \) with respect to \( \mu \) is defined by
\[ \mathrm{Ent}^\Phi_\mu(f)=\int\Phi(f)\,\mathrm{d}\mu-\Phi\Bigl(\int f\,\mathrm{d}\mu\Bigr). \]
The Jensen inequality gives \( \mathrm{Ent}^\Phi_\mu(f)\geq0 \), with equality iff \( f \) is constant \( \mu \)-almost-everywhere. Conversely, we can recover the convexity of \( \Phi \) from the non-negativity of \( \mathrm{Ent}^\Phi_\mu \) by considering the probability measure \( \mu=(1-t)\delta_u+t\delta_v \) for arbitrary \( t\in[0,1] \) and \( u,v\in I \).
The Φ-entropy can be seen as a Jensen divergence. It is linear with respect to Φ, and invariant under the addition of an affine function to Φ. Basic examples include the following:
- Variance: \( \Phi(u)=u^2 \) on \( I=\mathbb{R} \);
- Entropy: \( \Phi(u)=u\log(u) \) on \( I=[0,+\infty) \);
- Interpolation in between: \( \Phi(u)=\frac{u^p-u}{p-1} \) on \( I=[0,+\infty) \), \( 1<p\leq2 \).
The case \( p>2 \) produces a valid Φ-entropy; however, as explained in the sequel, the Φ-entropy is then no longer convex with respect to its functional argument, and therefore it admits neither a variational formula nor the tensorization property.
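To make the definition concrete, here is a minimal numerical sketch (Python/NumPy; the data and function names are arbitrary) computing the Φ-entropy on a finite probability space for the three examples above, and checking that it is non-negative and vanishes for constant functions.

```python
import numpy as np

mu = np.array([0.2, 0.5, 0.3])      # a finite probability space
f  = np.array([0.4, 1.0, 2.5])      # a function with values in I

def phi_entropy(phi, f, mu):
    """Ent_mu^Phi(f) = int Phi(f) dmu - Phi(int f dmu)."""
    return mu @ phi(f) - phi(mu @ f)

phis = {
    "variance  Phi(u)=u^2          ": lambda u: u**2,
    "entropy   Phi(u)=u log u      ": lambda u: u*np.log(u),
    "Beckner   Phi(u)=(u^p-u)/(p-1)": lambda u: (u**1.5 - u)/0.5,   # p = 1.5
}

for name, phi in phis.items():
    print(name, phi_entropy(phi, f, mu))                   # >= 0 by Jensen
    print(name, phi_entropy(phi, np.full(3, 1.7), mu))     # ~ 0 for constant f
```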
A convex functional subset. The following set is convex:
\[ \mathrm{L}_\Phi(\mu)=\{f\in\mathrm{L}^1(\mu):\Phi(f)\in\mathrm{L}^1(\mu)\}. \]
This suggests studying the convexity of \( f\in\mathrm{L}_\Phi(\mu)\mapsto\mathrm{Ent}^\Phi_\mu(f) \), which we do in the sequel.
Indeed, let \( f,g\in\mathrm{L}_\Phi(\mu) \) and \( \lambda\in[0,1] \). Then \( \lambda f+(1-\lambda)g\in\mathrm{L}^1(\mu) \). Next, since \( \Phi \) is convex and since \( x\in\mathbb{R}\mapsto x_+=\max(0,x) \) is convex and non-decreasing, we get
\[ \Phi(\lambda f+(1-\lambda)g)_+\leq(\lambda\Phi(f)+(1-\lambda)\Phi(g))_+\leq\lambda\Phi(f)_++(1-\lambda)\Phi(g)_+. \]
Therefore \( \Phi(\lambda f+(1-\lambda)g)_+\in\mathrm{L}^1(\mu) \). On the other hand, since \( \Phi \) is convex, there exists an affine function \( \psi \) such that \( \Phi\geq\psi \), hence \( \Phi(\lambda f+(1-\lambda)g)\geq\psi(\lambda f+(1-\lambda)g)\in\mathrm{L}^1(\mu) \), which implies \( \Phi(\lambda f+(1-\lambda)g)\in\mathrm{L}^1(\mu) \). We have proved that \( \lambda f+(1-\lambda)g\in\mathrm{L}_\Phi(\mu) \).
Approximation. Let \( \mathcal{F} \) be the set of measurable functions \( f:\Omega\to I \) such that \( f(\Omega) \) is a compact subset of the interior of \( I \). In particular, if \( f\in\mathcal{F} \) then both \( f \) and \( \Phi(f) \) are bounded. We have \( \mathcal{F}\subset\mathrm{L}_\Phi(\mu) \) and \( \mathcal{F} \) is convex. If \( (\Omega,\mathcal{A}) \) is nice enough, for instance \( \mathbb{R}^d \) with its Borel σ-field, then \( \mathcal{F} \) is dense in the sense that for all \( f\in\mathrm{L}_\Phi(\mu) \), there exists a sequence \( (f_n) \) in \( \mathcal{F} \) such that \( f_n\to f \) and \( \Phi(f_n)\to\Phi(f) \) in \( \mathrm{L}^1(\mu) \). In particular \( \mathrm{Ent}^\Phi_\mu(f_n)\to\mathrm{Ent}^\Phi_\mu(f) \). We always assume in the sequel that this approximation property is satisfied.
Characterization of convexity. The following properties are equivalent:
- \( f\mapsto\mathrm{Ent}^\Phi_\mu(f) \) is convex on \( \mathrm{L}_\Phi(\mu) \), for all \( (\Omega,\mathcal{A},\mu) \);
- \( (u,v)\mapsto C^\Phi(u,v)=\Phi''(u)v^2 \) is convex on \( J=\{(u,v)\in\mathbb{R}^2:u\in I,u+v\in I\} \).
Indeed, the convexity of \( \mathrm{Ent}^\Phi_\mu \) is a univariate property: it is equivalent to saying that \( t\in[0,1]\mapsto\alpha(t)=\mathrm{Ent}^\Phi_\mu((1-t)f+tg) \) is convex for all \( f,g\in\mathcal{F} \). This is in turn equivalent to \( \alpha''\geq0 \) for all \( f,g\in\mathcal{F} \), and since \( f \) and \( g \) are arbitrary, it is equivalent to \( \alpha''(0)\geq0 \) for all \( f,g\in\mathcal{F} \). Now
\[ \alpha'(t)=\int\Phi'(f+t(g-f))(g-f)\,\mathrm{d}\mu-\Phi'\Bigl(\int(f+t(g-f))\,\mathrm{d}\mu\Bigr)\int(g-f)\,\mathrm{d}\mu=\int\Bigl(\Phi'(f+t(g-f))-\Phi'\Bigl(\int(f+t(g-f))\,\mathrm{d}\mu\Bigr)\Bigr)(g-f)\,\mathrm{d}\mu, \]
\[ \alpha''(t)=\int\Phi''(f+t(g-f))(g-f)^2\,\mathrm{d}\mu-\Phi''\Bigl(\int(f+t(g-f))\,\mathrm{d}\mu\Bigr)\Bigl(\int(g-f)\,\mathrm{d}\mu\Bigr)^2. \]
Hence \( \alpha''(0) \) is actually nothing else but the \( C^\Phi \)-entropy of the bivariate function \( (f,g-f) \):
\[ \alpha''(0)=\int\Phi''(f)(g-f)^2\,\mathrm{d}\mu-\Phi''\Bigl(\int f\,\mathrm{d}\mu\Bigr)\Bigl(\int(g-f)\,\mathrm{d}\mu\Bigr)^2=\mathrm{Ent}^{C^\Phi}_\mu((f,g-f)). \]
Now the convexity of \( C^\Phi \) implies, via the Jensen inequality, \( \alpha''(0)\geq0 \), and thus the convexity of \( \mathrm{Ent}^\Phi_\mu \). Conversely, if \( \mathrm{Ent}^\Phi_\mu \) is convex for all \( \mu \), then the \( C^\Phi \)-entropy is \( \geq0 \) for all \( \mu \); used in particular with \( \mu=(1-t)\delta_{(u,v)}+t\delta_{(u',v')} \), \( t\in[0,1] \), \( (u,v),(u',v')\in J \), this gives the convexity of \( C^\Phi \).
More convexity. The following properties are equivalent:
- \( (u,v)\mapsto A^\Phi(u,v)=\Phi(u+v)-\Phi(u)-\Phi'(u)v \) is convex on \( J \);
- \( (u,v)\mapsto B^\Phi(u,v)=(\Phi'(u+v)-\Phi'(u))v \) is convex on \( J \);
- \( (u,v)\mapsto C^\Phi(u,v)=\Phi''(u)v^2 \) is convex on \( J \);
- either \( \Phi \) is affine on \( I \), or \( \Phi''>0 \) and \( 1/\Phi'' \) is concave on \( I \).
Indeed, the equivalence between the convexity of \( A^\Phi \), \( B^\Phi \), and \( C^\Phi \) comes from
\[ A^\Phi(u,v)=\int_0^1(1-p)C^\Phi(u+pv,v)\,\mathrm{d}p,\qquad B^\Phi(u,v)=\int_0^1C^\Phi(u+pv,v)\,\mathrm{d}p, \]
\[ A^\Phi(u,\varepsilon v)=\tfrac12C^\Phi(u,v)\varepsilon^2+o(\varepsilon^2),\qquad B^\Phi(u,\varepsilon v)=C^\Phi(u,v)\varepsilon^2+o(\varepsilon^2). \]
Finally, for the equivalence between the convexity of \( A^\Phi \) and the concavity of \( 1/\Phi'' \), we start from
\[ \mathrm{Hess}\,A^\Phi(u,v)=\begin{pmatrix}A^{\Phi''}(u,v)&\Phi''(u+v)-\Phi''(u)\\\Phi''(u+v)-\Phi''(u)&\Phi''(u+v)\end{pmatrix}. \]
If \( A^\Phi \) is convex, then the diagonal elements of \( \mathrm{Hess}\,A^\Phi(u,v) \) are \( \geq0 \), and thus \( \Phi''\geq0 \). Moreover the convexity of \( A^\Phi \) yields \( \det(\mathrm{Hess}\,A^\Phi(u,v))\geq0 \). Now, if \( \Phi''(u+v)=0 \), then \( \det(\mathrm{Hess}\,A^\Phi(u,v))=-\Phi''(u)^2 \), and thus \( \Phi''(u)=0 \); therefore \( \{w:\Phi''(w)=0\} \) is either empty or equal to \( I \) (in which case \( \Phi \) is affine). If \( \Phi''>0 \) on \( I \), then it turns out that
\[ \det(\mathrm{Hess}\,A^\Phi(u,v))=\Phi''(u+v)\Phi''(u)^2A^{-1/\Phi''}(u,v), \]
which is \( \geq0 \) since \( A^\Phi \) is convex. Since \( \Phi''>0 \), we get \( A^{-1/\Phi''}\geq0 \), thus \( -1/\Phi'' \) is convex, in other words \( 1/\Phi'' \) is concave.
Conversely, if \( \Phi \) is affine, then \( A^\Phi\equiv0 \), hence convex. If \( \Phi''>0 \) and \( -1/\Phi'' \) is convex, then
\[ (-1/\Phi'')''=\frac{\Phi''''\Phi''-2\Phi'''^2}{\Phi''^3}\geq0, \]
hence \( \Phi''''\Phi''\geq2\Phi'''^2 \), thus \( \Phi''''\geq0 \), which gives \( A^{\Phi''}\geq0 \), and therefore the diagonal elements of \( \mathrm{Hess}\,A^\Phi(u,v) \) are \( \geq0 \). On the other hand,
\[ \det(\mathrm{Hess}\,A^\Phi(u,v))=\Phi''(u+v)\Phi''(u)^2A^{-1/\Phi''}(u,v)\geq0, \]
therefore \( A^\Phi \) is convex.
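The determinant identity used above can be checked symbolically. Here is a small sketch with SymPy for a generic smooth Φ (it should print 0); the symbol names are of course arbitrary.

```python
import sympy as sp

u, v = sp.symbols('u v')
Phi = sp.Function('Phi')

# A^Phi(u,v) = Phi(u+v) - Phi(u) - Phi'(u) v  (Bregman remainder)
A = Phi(u + v) - Phi(u) - sp.diff(Phi(u), u) * v
H = sp.hessian(A, (u, v))

phi2_u  = sp.diff(Phi(u), u, 2)        # Phi''(u)
phi2_uv = sp.diff(Phi(u + v), u, 2)    # Phi''(u+v)
G = -1 / phi2_u                        # G = -1/Phi''
A_G = (-1 / phi2_uv) - G - sp.diff(G, u) * v   # A^{-1/Phi''}(u,v)

# det Hess A^Phi = Phi''(u+v) Phi''(u)^2 A^{-1/Phi''}(u,v)
print(sp.simplify(H.det() - phi2_uv * phi2_u**2 * A_G))    # expected: 0
```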
By moving only along the \( u \) variable, we see that the convexity of \( C^\Phi \) implies that of \( \Phi'' \). However the converse is false: the convexity of \( \Phi'' \) does not imply that of \( C^\Phi \). For instance if \( \Phi(u)=u^4 \) and \( I=[0,+\infty) \), then \( \Phi''(u)=12u^2 \) is convex while \( 1/\Phi''(u)=1/(12u^2) \) is not concave.
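As a quick numerical sanity check (a sketch, not part of the argument), one can test the positive semi-definiteness of the Hessian of \( C^\Phi(u,v)=\Phi''(u)v^2 \) at a few points: it holds for \( \Phi(u)=u\log(u) \) but fails for \( \Phi(u)=u^4 \).

```python
import numpy as np

def hess_C(phi2, phi3, phi4, u, v):
    """Hessian of C(u,v) = Phi''(u) v^2, given the derivatives of Phi."""
    return np.array([[phi4(u)*v**2, 2*phi3(u)*v],
                     [2*phi3(u)*v,  2*phi2(u)  ]])

points = [(0.5, 1.0), (2.0, -0.7), (1.3, 0.2)]

# Phi(u) = u log u : C^Phi is convex (1/Phi'' = u is concave)
for u, v in points:
    H = hess_C(lambda u: 1/u, lambda u: -1/u**2, lambda u: 2/u**3, u, v)
    print("u log u :", np.linalg.eigvalsh(H).min() >= -1e-12)   # True

# Phi(u) = u^4 : Phi'' is convex, yet C^Phi is not (1/Phi'' is not concave)
for u, v in points:
    H = hess_C(lambda u: 12*u**2, lambda u: 24*u, lambda u: 24.0, u, v)
    print("u^4     :", np.linalg.eigvalsh(H).min() >= -1e-12)   # False
```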
Variational formula. The map \( f\mapsto\mathrm{Ent}^\Phi_\mu(f) \) is convex iff for all \( f\in\mathcal{F} \),
\[ \mathrm{Ent}^\Phi_\mu(f)=\sup_{g\in\mathcal{F}}\Bigl(\mathrm{Ent}^\Phi_\mu(g)+\int\Bigl(\Phi'(g)-\Phi'\Bigl(\int g\,\mathrm{d}\mu\Bigr)\Bigr)(f-g)\,\mathrm{d}\mu\Bigr), \]
the supremum being achieved when \( g=f \). Indeed, this formula expresses \( \mathrm{Ent}^\Phi_\mu \) as a supremum of affine functions of \( f \), showing that \( \mathrm{Ent}^\Phi_\mu \) is convex. The convexity on \( \mathcal{F} \) and on \( \mathrm{L}_\Phi(\mu) \) are equivalent by approximation. Conversely, if \( \mathrm{Ent}^\Phi_\mu \) is convex, then for all \( f,g\in\mathcal{F} \), \( \alpha:t\in[0,1]\mapsto\alpha(t)=\mathrm{Ent}^\Phi_\mu((1-t)f+tg) \) is convex and differentiable, hence equal to the envelope of its affine tangents. In particular,
\[ \alpha(0)=\sup_{t\in[0,1]}(\alpha(t)+\alpha'(t)(0-t)), \]
the supremum being achieved at \( t=0 \), as well as at \( t=1 \) if \( g=f \). But recall that
\[ \alpha'(t)=\int\Bigl(\Phi'(f+t(g-f))-\Phi'\Bigl(\int(f+t(g-f))\,\mathrm{d}\mu\Bigr)\Bigr)(g-f)\,\mathrm{d}\mu. \]
Finally, the desired variational formula comes from
\[ \mathrm{Ent}^\Phi_\mu(f)=\alpha(0)=\sup_{g\in\mathcal{F}}(\alpha(1)+\alpha'(1)(0-1)). \]
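A small numerical illustration of the variational formula on a finite probability space, with \( \Phi(u)=u\log(u) \) (a sketch; the supremum is approximated over random \( g \) and is attained at \( g=f \)).

```python
import numpy as np

rng  = np.random.default_rng(0)
mu   = np.array([0.3, 0.4, 0.3])
phi  = lambda u: u*np.log(u)
dphi = lambda u: np.log(u) + 1.0

def ent(h):
    return mu @ phi(h) - phi(mu @ h)

def lower_bound(f, g):
    # Ent(g) + int (Phi'(g) - Phi'(int g dmu)) (f - g) dmu
    return ent(g) + mu @ ((dphi(g) - dphi(mu @ g)) * (f - g))

f = np.array([0.5, 1.0, 2.0])
bounds = [lower_bound(f, rng.uniform(0.2, 3.0, size=3)) for _ in range(10_000)]

print(max(bounds) <= ent(f) + 1e-12)            # True: all affine minorants lie below
print(np.isclose(lower_bound(f, f), ent(f)))    # True: equality at g = f
```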
Tensorization inequality. If \( f\mapsto\mathrm{Ent}^\Phi_\mu(f) \) is convex for all \( (\Omega,\mathcal{A},\mu) \), then for all \( n\geq1 \), all \( \mu=\mu_1\otimes\cdots\otimes\mu_n \) on a product space \( (\Omega_1\times\cdots\times\Omega_n,\mathcal{A}_1\otimes\cdots\otimes\mathcal{A}_n) \), and all \( f\in\mathcal{F} \), denoting \( \mathrm{Ent}^\Phi_{\mu_i}(f) \) the partial Φ-entropy of \( f \) with respect to the \( i \)-th variable,
\[ \mathrm{Ent}^\Phi_\mu(f)\leq\sum_{i=1}^n\int\mathrm{Ent}^\Phi_{\mu_i}(f)\,\mathrm{d}\mu. \]
Indeed, it suffices to prove the case \( n=2 \). Now, denoting \( g_1=\int g\,\mathrm{d}\mu_1 \), we note that
\[ \int\mathrm{Ent}^\Phi_{\mu_1}(g)\,\mathrm{d}\mu_2+\int\mathrm{Ent}^\Phi_{\mu_2}(g_1)\,\mathrm{d}\mu_1=\mathrm{Ent}^\Phi_\mu(g). \]
Therefore, by using the variational formula for \( \mu_1,g \) and for \( \mu_2,g_1 \), and noting that \( \int(f-g_1)\,\mathrm{d}\mu_1=\int(f-g)\,\mathrm{d}\mu_1 \), we get
\[ \begin{aligned} \int\mathrm{Ent}^\Phi_{\mu_1}(f)\,\mathrm{d}\mu_2+\int\mathrm{Ent}^\Phi_{\mu_2}(f)\,\mathrm{d}\mu_1 &\geq\int\mathrm{Ent}^\Phi_{\mu_1}(g)\,\mathrm{d}\mu_2+\int\Bigl(\Phi'(g)-\Phi'\Bigl(\int g\,\mathrm{d}\mu_1\Bigr)\Bigr)(f-g)\,\mathrm{d}\mu_1\mathrm{d}\mu_2\\ &\quad+\int\mathrm{Ent}^\Phi_{\mu_2}(g_1)\,\mathrm{d}\mu_1+\int\Bigl(\Phi'(g_1)-\Phi'\Bigl(\int g_1\,\mathrm{d}\mu_2\Bigr)\Bigr)(f-g_1)\,\mathrm{d}\mu_2\mathrm{d}\mu_1\\ &=\mathrm{Ent}^\Phi_\mu(g)+\int(\Phi'(g)-\Phi'(g_1))(f-g)\,\mathrm{d}\mu_1\mathrm{d}\mu_2+\int\Bigl(\Phi'(g_1)-\Phi'\Bigl(\int g\,\mathrm{d}\mu\Bigr)\Bigr)(f-g)\,\mathrm{d}\mu_2\mathrm{d}\mu_1\\ &=\mathrm{Ent}^\Phi_\mu(g)+\int\Bigl(\Phi'(g)-\Phi'\Bigl(\int g\,\mathrm{d}\mu\Bigr)\Bigr)(f-g)\,\mathrm{d}\mu. \end{aligned} \]
It remains to take the supremum over \( g \) and to use the variational formula for \( \mu \).
The tensorization inequality is actually equivalent to the convexity of \( \mathrm{Ent}^\Phi_\mu \). More precisely, on the product space \( \{0,1\}\times\Omega \) equipped with \( ((1-p)\delta_0+p\delta_1)\otimes\mu \), the tensorization applied to \( f \) defined by \( f(0,y)=g_1(y) \) and \( f(1,y)=g_2(y) \) gives, after rearrangement,
\[ \mathrm{Ent}^\Phi_\mu((1-p)g_1+pg_2)\leq(1-p)\mathrm{Ent}^\Phi_\mu(g_1)+p\,\mathrm{Ent}^\Phi_\mu(g_2). \]
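And a numerical illustration of the tensorization inequality on a product of two finite spaces, with \( \mu=\mu_1\otimes\mu_2 \) and \( \Phi(u)=u\log(u) \) (a sketch with arbitrary data).

```python
import numpy as np

phi = lambda u: u*np.log(u)
mu1 = np.array([0.6, 0.4])                # first marginal
mu2 = np.array([0.2, 0.5, 0.3])           # second marginal
mu  = np.outer(mu1, mu2)                  # product measure mu1 (x) mu2

f = np.array([[0.5, 1.2, 2.0],
              [1.5, 0.7, 3.0]])           # f(x1, x2) > 0

def ent_last_axis(h, w):
    """Phi-entropy along the last axis, with weights w."""
    return (phi(h) * w).sum(axis=-1) - phi((h * w).sum(axis=-1))

lhs = (phi(f) * mu).sum() - phi((f * mu).sum())            # Ent_mu^Phi(f)
rhs = (ent_last_axis(f, mu2) * mu1).sum() \
    + (ent_last_axis(f.T, mu1) * mu2).sum()                # sum of partial entropies

print(lhs, rhs, lhs <= rhs + 1e-12)                        # tensorization holds
```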
Variance and entropy.
- If \( \Phi(u)=u^2 \) and \( I=\mathbb{R} \), then \( 1/\Phi''(u)=1/2 \), which is concave.
- If \( \Phi(u)=u\log(u) \) and \( I=[0,+\infty) \), then \( 1/\Phi''(u)=u \), which is concave.
- If \( \Phi(u)=\frac{u^p-u}{p-1} \), \( p>1 \), \( I=[0,+\infty) \), then \( 1/\Phi''(u)=\frac{u^{2-p}}{p} \), which is concave iff \( p\leq2 \).
This shows that the entropy and variance cases are actually critical or extremal: they correspond to \( p\to1^+ \) and \( p=2 \) respectively. Even if it is not apparent on \( 1/\Phi'' \), the linearity of \( A^\Phi,B^\Phi,C^\Phi \) with respect to \( \Phi \) implies that the set of convex \( \Phi \) for which they are convex is a convex cone, in other words it is stable under linear combinations with non-negative coefficients (finite sums or, more generally, integrals with respect to a positive measure). In particular \( \Phi(u)=au^2+bu\log(u) \), \( a,b\geq0 \), works.
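To illustrate the criticality of \( p=2 \), here is a concrete two-point counterexample (a sketch; the values are chosen by hand) showing that for \( \Phi(u)=\frac{u^3-u}{2} \), that is \( p=3 \), the Φ-entropy is not convex in its functional argument: along the segment from \( f \) to \( g \) below, \( t\mapsto\mathrm{Ent}^\Phi_\mu((1-t)f+tg) \) is locally concave near \( t=0 \).

```python
import numpy as np

phi = lambda u: (u**3 - u)/2        # Beckner Phi with p = 3, outside (1, 2]
mu  = np.array([0.5, 0.5])          # uniform measure on a two-point space

def ent(h):
    return mu @ phi(h) - phi(mu @ h)

f = np.array([0.1, 1.0])
g = np.array([1.1, 1.0])
alpha = lambda t: ent((1 - t)*f + t*g)

# Midpoint convexity fails near t = 0:
print(alpha(0.05), (alpha(0.0) + alpha(0.1))/2)
print(alpha(0.05) > (alpha(0.0) + alpha(0.1))/2)   # True: not convex
```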
Phi-Sobolev inequalities for diffusions. Let \( (X_t)_{t\geq0} \) be the Markov process solving the stochastic differential equation (also known as an overdamped Langevin diffusion)
\[ \mathrm{d}X_t=\sqrt{2}\,\mathrm{d}B_t-\nabla V(X_t)\,\mathrm{d}t, \]
where \( B \) is a Brownian motion on \( \mathbb{R}^d \) and \( V:\mathbb{R}^d\to\mathbb{R} \) is \( \mathcal{C}^2 \) with \( V-\frac{\rho}{2}|\cdot|^2 \) convex for some \( \rho\in\mathbb{R} \). This ensures that there is no explosion in finite time, and the process \( (X_t)_{t\geq0} \) is well defined. In the special case \( V=\frac{\rho}{2}|\cdot|^2 \) for a constant \( \rho\geq0 \), we get the Ornstein-Uhlenbeck process, for which we have the explicit Mehler formula
\[ \mathrm{Law}(X_t\mid X_0=x)=\mathcal{N}\Bigl(xe^{-\rho t},\frac{1-e^{-2\rho t}}{\rho}I_d\Bigr). \]
For any bounded measurable \( f:\mathbb{R}^d\to\mathbb{R} \) and \( x\in\mathbb{R}^d \), we set
\[ P_t(f)(x)=\mathbb{E}(f(X_t)\mid X_0=x),\quad\text{in other words}\quad P_t(\cdot)(x)=\mathrm{Law}(X_t\mid X_0=x). \]
This also defines a linear operator \( P_t:f\mapsto P_t(f) \) on bounded measurable functions. We have \( P_0=\mathrm{Id} \), and the Markov nature of \( X \) translates into the semigroup property
\[ P_{t+s}=P_tP_s=P_sP_t,\quad s,t\geq0. \]
The semigroup acts on functions (on the right) and on measures (on the left):
\[ \mathbb{E}(f(X_t))=\int P_t(f)\,\mathrm{d}\nu=\nu P_tf\quad\text{when}\quad X_0\sim\nu. \]
Since the stochastic differential equation involves only the gradient of \( V \), we can add a constant to \( V \) in order to make \( \mu=e^{-V}\mathrm{d}x \) a probability measure on \( \mathbb{R}^d \). This probability measure is invariant: if \( X_0\sim\mu \) then \( X_t\sim\mu \) for all \( t\geq0 \), in other words \( \mu P_t=\mu \) for all \( t\geq0 \). The semigroup \( (P_t)_{t\geq0} \) leaves \( \mathrm{L}^p(\mu) \) invariant for all \( p\in[1,\infty] \). The infinitesimal generator of this semigroup is the linear differential operator \( L=\Delta-\nabla V\cdot\nabla \), namely \( \partial_tP_tf=LP_tf=P_tLf \). Integration by parts gives, for all rapidly decaying \( \mathcal{C}^2 \) functions \( f \) and \( g \),
\[ \int fLg\,\mathrm{d}\mu=-\int\nabla f\cdot\nabla g\,\mathrm{d}\mu=\int gLf\,\mathrm{d}\mu. \]
We recover the invariance, \( \partial_t\int P_t(f)\,\mathrm{d}\mu=0 \), and moreover
\[ \partial_t\mathrm{Ent}^\Phi_\mu(P_tf)=\int\Phi'(P_tf)LP_tf\,\mathrm{d}\mu=-\int\Phi''(P_tf)|\nabla P_tf|^2\,\mathrm{d}\mu=-\int C^\Phi(P_tf,|\nabla P_tf|)\,\mathrm{d}\mu\leq0. \]
This can be seen as a sort of Boltzmann H-theorem for the evolution equation \( \partial_tf_t=Lf_t \), where \( f_t=P_tf \) is the density of \( X_t \) with respect to \( \mu \) when \( X_0 \) has density \( f \) with respect to \( \mu \). Now, following Dominique Bakry, the Bochner commutation formula \( \nabla L=L\nabla-(\mathrm{Hess}\,V)\nabla \) gives \( |\nabla P_tf|\leq e^{-\rho t}P_t|\nabla f| \), hence, by the bivariate Jensen inequality for the convex function \( C^\Phi \) and the law \( P_t(\cdot)(x) \),
\[ C^\Phi(P_tf,|\nabla P_tf|)\leq e^{-2\rho t}C^\Phi(P_tf,P_t|\nabla f|)\leq e^{-2\rho t}P_tC^\Phi(f,|\nabla f|). \]
This gives, using again the invariance \( \mu P_t=\mu \) for the last inequality,
\[ \begin{aligned} \mathrm{Ent}^\Phi_\mu(f)-\mathrm{Ent}^\Phi_\mu(P_Tf) &=-\int_0^T\partial_t\mathrm{Ent}^\Phi_\mu(P_tf)\,\mathrm{d}t=\int_0^T\!\!\int C^\Phi(P_tf,|\nabla P_tf|)\,\mathrm{d}\mu\,\mathrm{d}t\\ &\leq\Bigl(\int_0^Te^{-2\rho t}\,\mathrm{d}t\Bigr)\int C^\Phi(f,|\nabla f|)\,\mathrm{d}\mu=\frac{1-e^{-2\rho T}}{2\rho}\int C^\Phi(f,|\nabla f|)\,\mathrm{d}\mu. \end{aligned} \]
Alternatively, instead of using the Jensen inequality with \( C^\Phi \), we could use the Cauchy-Schwarz inequality and the Jensen inequality for the concave function \( 1/\Phi'' \), as
\[ (P_t|\nabla f|)^2\leq P_t(\Phi''(f)|\nabla f|^2)\,P_t\Bigl(\frac{1}{\Phi''(f)}\Bigr)\leq\frac{P_t(\Phi''(f)|\nabla f|^2)}{\Phi''(P_tf)}. \]
Now when \( \rho>0 \), the process is ergodic: \( \mathrm{Law}(X_t)\to\mu \) as \( t\to\infty \), regardless of \( X_0 \). In other words \( P_t(\cdot)(x)\to\mu \) as \( t\to\infty \), for all \( x \). In particular \( P_Tf\to\int f\,\mathrm{d}\mu \), which is constant, as \( T\to\infty \), giving the following Φ-Sobolev inequality for \( \mu \):
\[ \mathrm{Ent}^\Phi_\mu(f)\leq\frac{1}{2\rho}\int C^\Phi(f,|\nabla f|)\,\mathrm{d}\mu. \]
This is a Poincaré inequality when \( \Phi(u)=u^2 \), a logarithmic Sobolev inequality when \( \Phi(u)=u\log(u) \), and a Beckner inequality when \( \Phi(u)=\frac{u^p-u}{p-1} \), \( 1<p\leq2 \).
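As a sanity check of the Φ-Sobolev inequality at equilibrium, here is a quadrature sketch for the standard Gaussian case \( V(x)=\frac{x^2}{2} \), \( \rho=1 \), \( d=1 \), with an arbitrary positive test function.

```python
import numpy as np

x  = np.linspace(-12.0, 12.0, 200_001)
dx = x[1] - x[0]
g  = np.exp(-x**2/2) / np.sqrt(2*np.pi)        # density of mu = N(0, 1)
E  = lambda h: np.sum(h * g) * dx              # integral against mu (Riemann sum)

f  = 1.0 + 0.5*np.tanh(x)                      # positive test function
fp = 0.5 / np.cosh(x)**2                       # its derivative

# Poincare, Phi(u) = u^2, C^Phi(f,|f'|) = 2|f'|^2:  Var(f) <= (1/2) int 2|f'|^2 dmu
var = E(f**2) - E(f)**2
print(var <= E(fp**2) + 1e-12)                 # True

# log-Sobolev, Phi(u) = u log u, C^Phi(f,|f'|) = |f'|^2/f:  Ent(f) <= (1/2) int |f'|^2/f dmu
ent = E(f*np.log(f)) - E(f)*np.log(E(f))
print(ent <= 0.5*E(fp**2/f) + 1e-12)           # True
```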
Local inequalities for diffusions. It is also possible to get similar inequalities, even when \( \rho=0 \), for \( P_t(\cdot)(x) \) instead of \( \mu=P_\infty(\cdot)(x) \), by using the interpolation \( s\mapsto P_s(\Phi(P_{t-s}f)) \), and by replacing the integration by parts formula by the diffusion property
\[ L(\Phi(f))-\Phi'(f)Lf=\Phi''(f)|\nabla f|^2=C^\Phi(f,|\nabla f|). \]
Namely, for all \( t\in\mathbb{R}_+ \), all \( x\in\mathbb{R}^d \), and all \( f:\mathbb{R}^d\to I \),
\[ \mathrm{Ent}^\Phi_{P_t(\cdot)(x)}(f)=P_t(\Phi(f))(x)-\Phi(P_t(f)(x))=\int_0^t\partial_sP_s(\Phi(P_{t-s}f))(x)\,\mathrm{d}s. \]
Dropping the notation \( (x) \) and denoting \( g=P_{t-s}f \), the diffusion property gives
\[ \partial_sP_s\Phi(P_{t-s}f)=P_s(L\Phi(g)-\Phi'(g)Lg)=P_s(\Phi''(g)|\nabla g|^2)=P_sC^\Phi(g,|\nabla g|). \]
Now recall that the Bochner formula gives \( |\nabla g|\leq e^{-\rho(t-s)}P_{t-s}|\nabla f| \), hence, since \( C^\Phi(u,v) \) is quadratic in \( v \), by the Jensen inequality for the bivariate convex function \( C^\Phi \) and the semigroup property,
\[ P_sC^\Phi(g,|\nabla g|)\leq e^{-2\rho(t-s)}P_sC^\Phi(P_{t-s}f,P_{t-s}|\nabla f|)\leq e^{-2\rho(t-s)}P_sP_{t-s}C^\Phi(f,|\nabla f|)=e^{-2\rho(t-s)}P_tC^\Phi(f,|\nabla f|). \]
This finally gives the following local Φ-Sobolev inequality:
\[ \mathrm{Ent}^\Phi_{P_t(\cdot)}(f)\leq\Bigl(\int_0^te^{-2\rho(t-s)}\,\mathrm{d}s\Bigr)P_tC^\Phi(f,|\nabla f|)=\frac{1-e^{-2\rho t}}{2\rho}P_tC^\Phi(f,|\nabla f|). \]
The formula is still valid when \( \rho=0 \), with the natural convention \( \frac{1-e^{-2\rho t}}{2\rho}=t \). When \( \rho>0 \) and \( t\to\infty \), we recover the inequality for the invariant law \( \mu=P_\infty(\cdot)(x) \). Moreover, it can be checked that the constants in front of the right hand side are optimal (smallest) in the case of Brownian motion and of the Ornstein-Uhlenbeck process.
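Similarly, the local inequality can be checked numerically for the Ornstein-Uhlenbeck semigroup with \( \rho=1 \), for which \( P_t(\cdot)(x)=\mathcal{N}(xe^{-t},1-e^{-2t}) \); a quadrature sketch with \( \Phi(u)=u\log(u) \) and an arbitrary test function:

```python
import numpy as np

rho, t, x0 = 1.0, 0.5, 0.3
m, s2 = x0*np.exp(-rho*t), (1 - np.exp(-2*rho*t))/rho     # Mehler: N(m, s2)

y  = np.linspace(m - 12*np.sqrt(s2), m + 12*np.sqrt(s2), 200_001)
dy = y[1] - y[0]
w  = np.exp(-(y - m)**2/(2*s2)) / np.sqrt(2*np.pi*s2)
P  = lambda h: np.sum(h * w) * dy              # P_t(h)(x0): mean under N(m, s2)

f  = 1.0 + 0.5*np.tanh(y)
fp = 0.5 / np.cosh(y)**2

ent_local = P(f*np.log(f)) - P(f)*np.log(P(f))             # Ent^Phi_{P_t(.)(x0)}(f)
bound     = (1 - np.exp(-2*rho*t))/(2*rho) * P(fp**2/f)    # local constant times P_t C^Phi
print(ent_local <= bound + 1e-12)                          # True
```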
Modified inequalities for Poisson. Fix \( \lambda\geq0 \) and consider the Poisson law
\[ \pi_\lambda=e^{-\lambda}\sum_{x\in\mathbb{N}}\frac{\lambda^x}{x!}\delta_x. \]
Then, for all convex \( \Phi \) on \( I\subset\mathbb{R} \) as before (so that \( A^\Phi \) is convex), and all bounded \( f:\mathbb{N}\to I \),
\[ \mathrm{Ent}^\Phi_{\pi_\lambda}(f)\leq\lambda\,\mathbb{E}_{\pi_\lambda}(A^\Phi(f,\mathrm{D}f)) \quad\text{where}\quad\mathrm{D}(f)(x)=f(x+1)-f(x). \]
Following [C1, C2], let us give a simple proof blending the semigroup approach of Dominique Bakry and the bivariate convexity of Liming Wu. Let \( (X_t)_{t\in\mathbb{R}_+} \) be the simple Poisson process on \( \mathbb{N} \) with intensity \( \lambda>0 \). Then
\[ P_t(\cdot)(x)=\mathrm{Law}(X_t\mid X_0=x)=\delta_x*\pi_{\lambda t}. \]
In other words \( P_t(f)(x)=\mathbb{E}(f(X_t)\mid X_0=x)=\mathbb{E}(f(x+Z_t)) \) where \( Z_t\sim\pi_{\lambda t} \). The infinitesimal generator is the difference operator given for \( f:\mathbb{N}\to\mathbb{R} \) and \( x\in\mathbb{N} \) by
\[ L(f)(x)=\lambda(f(x+1)-f(x))=\lambda(\mathrm{D}f)(x). \]
Now
\[ \mathrm{Ent}^\Phi_{P_t(\cdot)(x)}(f)=P_t(\Phi(f))(x)-\Phi(P_t(f)(x))=\int_0^t\partial_sP_s(\Phi(P_{t-s}f))(x)\,\mathrm{d}s. \]
Dropping the notation \( (x) \) and setting \( g=P_{t-s}f \), we get
\[ \partial_sP_s(\Phi(P_{t-s}f))=P_s(L\Phi(g)-\Phi'(g)Lg)=\lambda P_s(A^\Phi(g,\mathrm{D}g)). \]
The commutation \( \mathrm{D}L=L\mathrm{D} \) gives \( \mathrm{D}P_t=P_t\mathrm{D} \), in particular \( \mathrm{D}g=P_{t-s}\mathrm{D}f \), hence, by the Jensen inequality for the bivariate convex function \( A^\Phi \) and the semigroup property,
\[ P_sA^\Phi(g,\mathrm{D}g)=P_s(A^\Phi(P_{t-s}f,P_{t-s}\mathrm{D}f))\leq P_s(P_{t-s}A^\Phi(f,\mathrm{D}f))=P_tA^\Phi(f,\mathrm{D}f). \]
Finally we obtain a local Φ-Sobolev inequality for the simple Poisson process:
\[ \mathrm{Ent}^\Phi_{P_t(\cdot)}(f)\leq\Bigl(\int_0^t\lambda\,\mathrm{d}s\Bigr)P_tA^\Phi(f,\mathrm{D}f)=\lambda t\,P_tA^\Phi(f,\mathrm{D}f). \]
In particular, for \( x=0 \) and \( t=1 \), we get a Φ-Sobolev inequality for the Poisson law:
\[ \mathrm{Ent}^\Phi_{\pi_\lambda}(f)\leq\lambda\,\mathbb{E}_{\pi_\lambda}(A^\Phi(f,\mathrm{D}f)). \]
It can be checked that the constant \( \lambda \) in the right hand side is optimal (smallest).
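Here is a quick numerical check of this modified inequality for the Poisson law with \( \Phi(u)=u\log(u) \), for which \( A^\Phi(f,\mathrm{D}f)=(f+\mathrm{D}f)\log\bigl(\frac{f+\mathrm{D}f}{f}\bigr)-\mathrm{D}f \) (a sketch with a truncated support and an arbitrary test function).

```python
import numpy as np
from math import lgamma

lam = 2.0
x = np.arange(0, 100)
w = np.exp(-lam + x*np.log(lam) - np.array([lgamma(k + 1) for k in x]))   # pi_lambda
E = lambda h: np.sum(h * w)                    # expectation under pi_lambda (truncated)

f  = 1.0 + 0.5*np.sin(x)                       # positive bounded test function on N
Df = np.roll(f, -1) - f                        # D f(x) = f(x+1) - f(x)
Df[-1] = 0.0                                   # truncation boundary, negligible mass

ent = E(f*np.log(f)) - E(f)*np.log(E(f))
A   = (f + Df)*np.log((f + Df)/f) - Df         # A^Phi(f, Df) for Phi(u) = u log u
print(ent <= lam*E(A) + 1e-12)                 # True
```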
The lack of a chain rule in discrete spaces produces a lack of diffusion property, which is circumvented here by using \( A^\Phi \) and convexity. This \( A^\Phi \)-based modified inequality can be generalized to Poisson point processes by using stochastic calculus, see [W1].
Modified inequalities for Poisson again. We have seen above that \( C^\Phi \)-based Φ-Sobolev inequalities for the Gaussian law can be obtained by semigroup interpolation and the Bochner formula via two distinct methods:
- See the Gaussian law as the invariant law of the Ornstein-Uhlenbeck process, the overdamped Langevin diffusion with \( V=\frac12|\cdot|^2 \). This approach involves an integration by parts for the generator. The obtained inequality is at equilibrium for the process.
- See the Gaussian law as the law at time \( t \) of Brownian motion, the overdamped Langevin diffusion with \( V\equiv0 \). This approach involves the diffusion property of the generator. The obtained inequality is a local inequality for the process.
The simple Poisson process plays for the Poisson law the role played by Brownian motion for the Gaussian law. As we have seen, this gives an \( A^\Phi \)-based Φ-Sobolev inequality for the Poisson law via a local \( A^\Phi \)-based inequality for the simple Poisson process. Is there an analogue of the Ornstein-Uhlenbeck process for the Poisson law? Following [C2], it turns out that the answer is positive: it is the M/M/∞ queue, for which the Poisson law is invariant. It allows one to get a \( B^\Phi \)-based Φ-Sobolev inequality for the Poisson law.
More precisely, the M/M/∞ queue is a Markov process \( (X_t)_{t\in\mathbb{R}_+} \) with state space \( \mathbb{N} \) and infinitesimal generator given for \( f:\mathbb{N}\to\mathbb{R} \) and \( x\in\mathbb{N} \) by
\[ L(f)(x)=\lambda\mathrm{D}(f)(x)+x\mu\mathrm{D}^*(f)(x), \]
where \( \mathrm{D}(f)(x)=f(x+1)-f(x) \), \( \mathrm{D}^*(f)(x)=f(x-1)-f(x) \), and where \( \lambda>0 \) and \( \mu>0 \) are parameters. We have a discrete analogue of the Mehler formula:
\[ P_t(\cdot)(x)=\mathrm{Law}(X_t\mid X_0=x)=\mathrm{Binomial}(x,e^{-\mu t})*\mathrm{Poisson}(\rho(1-e^{-\mu t})), \]
where \( \rho=\lambda/\mu \). In particular the Poisson law \( \pi_\rho=\mathrm{Poisson}(\rho) \) is invariant. Moreover it is reversible and we have the integration by parts formula
\[ \int fLg\,\mathrm{d}\pi_\rho=-\lambda\int(\mathrm{D}f)(\mathrm{D}g)\,\mathrm{d}\pi_\rho=\int gLf\,\mathrm{d}\pi_\rho. \]
Furthermore, we have the following discrete Bochner formula:
\[ \mathrm{D}L=L\mathrm{D}-\mu\mathrm{D}\quad\text{and}\quad\mathrm{D}P_t=e^{-\mu t}P_t\mathrm{D}. \]
Now, denoting \( f_t=P_tf \), and using the invariance and the integration by parts formula,
\[ -\partial_t\mathrm{Ent}^\Phi_{\pi_\rho}(f_t)=-\int\Phi'(f_t)Lf_t\,\mathrm{d}\pi_\rho=\lambda\int\mathrm{D}(\Phi'(f_t))\mathrm{D}f_t\,\mathrm{d}\pi_\rho=\lambda\int B^\Phi(f_t,\mathrm{D}f_t)\,\mathrm{d}\pi_\rho. \]
Next, by using the discrete Bochner formula, the inequality \( B^\Phi(u,pv)\leq pB^\Phi(u,v) \) for \( p\in[0,1] \), and the Jensen inequality for \( P_t(\cdot)(x) \) and the convex function \( B^\Phi \), we get
\[ B^\Phi(f_t,\mathrm{D}f_t)=B^\Phi(f_t,e^{-\mu t}P_t\mathrm{D}f)\leq e^{-\mu t}B^\Phi(P_tf,P_t\mathrm{D}f)\leq e^{-\mu t}P_tB^\Phi(f,\mathrm{D}f). \]
Therefore, by using \( \mathrm{Ent}^\Phi_{\pi_\rho}(f_\infty)=0 \) and the invariance of \( \pi_\rho \), we get (recall that \( f_0=f \))
\[ \mathrm{Ent}^\Phi_{\pi_\rho}(f)=-\int_0^\infty\partial_t\mathrm{Ent}^\Phi_{\pi_\rho}(f_t)\,\mathrm{d}t\leq\lambda\Bigl(\int_0^\infty e^{-\mu t}\,\mathrm{d}t\Bigr)\int B^\Phi(f,\mathrm{D}f)\,\mathrm{d}\pi_\rho=\rho\int B^\Phi(f,\mathrm{D}f)\,\mathrm{d}\pi_\rho. \]
We have thus obtained a \( B^\Phi \)-based Φ-Sobolev inequality for the Poisson law \( \pi_\rho \):
\[ \mathrm{Ent}^\Phi_{\pi_\rho}(f)\leq\rho\,\mathbb{E}_{\pi_\rho}(B^\Phi(f,\mathrm{D}f)). \]
It can be checked that the constant in front of the right hand side is optimal (smallest).
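And the analogous check of the \( B^\Phi \)-based inequality for \( \pi_\rho \), again with \( \Phi(u)=u\log(u) \), for which \( B^\Phi(f,\mathrm{D}f)=\mathrm{D}f\,\mathrm{D}(\log f) \) (same sketch as before, only the transform changes).

```python
import numpy as np
from math import lgamma

rho = 2.0                                      # rho = lambda/mu
x = np.arange(0, 100)
w = np.exp(-rho + x*np.log(rho) - np.array([lgamma(k + 1) for k in x]))   # pi_rho
E = lambda h: np.sum(h * w)

f  = 1.0 + 0.5*np.sin(x)
Df = np.roll(f, -1) - f
Df[-1] = 0.0

ent = E(f*np.log(f)) - E(f)*np.log(E(f))
B   = Df*(np.log(f + Df) - np.log(f))          # B^Phi(f, Df) = Df * D(log f)
print(ent <= rho*E(B) + 1e-12)                 # True
```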
Convex Φ-transforms and modified Φ-Sobolev inequalities.
\[ A^\Phi(f,\mathrm{D}f)=\mathrm{D}\Phi(f)-\Phi'(f)\mathrm{D}f,\qquad B^\Phi(f,\mathrm{D}f)=\mathrm{D}(\Phi'(f))\,\mathrm{D}f,\qquad C^\Phi(f,\mathrm{D}f)=\Phi''(f)(\mathrm{D}f)^2. \]
\begin{array}{lll} \boldsymbol{\Phi(u)} & \boldsymbol{A^\Phi(u,v)} & \boldsymbol{A^\Phi(f,\mathrm{D} f)} \\ u\log(u) & \displaystyle{(u+v)(\log(u+v)-\log(u))-v} & (f+\mathrm{D} f)\mathrm{D}(\log f)-\mathrm{D} f\\ u^2 & v^2 & (\mathrm{D} f)^2 \\ u^p & \displaystyle{(u+v)^p-u^p-p u^{p-1}v} & \mathrm{D}(f^p)-p f^{p-1}\mathrm{D} f \\ & \boldsymbol{B^\Phi(u,v)} & \boldsymbol{B^\Phi(f,\mathrm{D} f)} \\ u\log(u) & \displaystyle{v(\log(u+v)-\log(u))} & \mathrm{D}(f)\mathrm{D}(\log f) \\ u^2 & 2v^2 & 2(\mathrm{D} f)^2 \\ u^p & \displaystyle{p v((u+v)^{p-1}-u^{p-1})} & p \mathrm{D}(f)\mathrm{D}(f^{p-1}) \\ & \boldsymbol{C^\Phi(u,v)} & \boldsymbol{C^\Phi(f,\mathrm{D} f)} \\ u\log(u) & \displaystyle{v^2 u^{-1}} & (\mathrm{D} f)^2 f^{-1} \\ u^2 & 2v^2 & 2(\mathrm{D} f)^2 \\ u^p & \displaystyle{p(p-1)v^2u^{p-2}} & p(p-1)(\mathrm{D} f)^2 f^{p-2} \end{array}
The function \( A^\Phi \) is known in convex analysis as a Bregman divergence.
The tensorization can be used to obtain Φ-entropy inequalities for the Gaussian and Poisson laws from the two-point space, as Gross did for the log-Sobolev inequality in [G]; see [C2]. This can be pushed further to infinite dimension (Wiener measure and Poisson space).
Further comments. This post is a revival of an article written in 2010 for an online encyclopedia of functional inequalities, a MoinMoin wiki run by the former EVOL ANR research project. Well, this blog will also disappear at some point! The content of this post is taken from [C1-C2], with corrections and simplifications. The main motivation of [C1] was to explore a unification and generalization, incorporating as much as possible [H], [LO], [W1], among other works, using convexity and the semigroup approach of Dominique Bakry. Unfortunately, the main results in [C2] are buried in Section 4, never do that! The writing of [C1] and [C2] was done in parallel with [BCR] and [M], and without being aware of [AMTU] and [BT] respectively. The variational formula and the tensorization of Φ-entropies are also considered or mentioned to some extent in [L] and [BT]. There are now many works on the topic, including for instance [Co] and [LRS].
Further reading.
- [ABD] Arnold, A., Bartier, J.-Ph., and Dolbeault, J. Interpolation between logarithmic Sobolev and Poincaré inequalities. Commun. Math. Sci. 5 (2007), no. 4, 971--979.
- [AMTU] Arnold, A., Markowich, P., Toscani, G., and Unterreiter, A. On convex Sobolev inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. Comm. Partial Differential Equations 26 (2001), no. 1-2, 43--100.
- [B] Beckner, W. A generalized Poincaré inequality for Gaussian measures. Proc. Amer. Math. Soc. (1989), 397--400.
- [BCR] Barthe, F., Cattiaux, P., and Roberto, C. Interpolated inequalities between exponential and Gaussian, Orlicz hypercontractivity and isoperimetry. Rev. Mat. Iberoam. 22 (2006), no. 3, 993--1067.
- [BE] Bakry, D. and Émery, M. Hypercontractive diffusions. Lecture Notes in Math. 1123, Springer, 1985, 177--206.
- [BGL] Bakry, D., Gentil, I., and Ledoux, M. Analysis and geometry of Markov diffusion operators. Springer, 2014, xx+552 pp.
- [BaL] Bakry, D. and Ledoux, M. Lévy-Gromov's isoperimetric inequality for an infinite-dimensional diffusion generator. Invent. Math. 123 (1996), no. 2, 259--281.
- [BoL] Bobkov, S. G. and Ledoux, M. On modified logarithmic Sobolev inequalities for Bernoulli and Poisson measures. J. Funct. Anal. 156 (1998), no. 2, 347--365.
- [BT] Bobkov, S. and Tetali, P. Modified logarithmic Sobolev inequalities in discrete settings. J. Theoret. Probab. 19 (2006), no. 2, 289--336.
- [Br] Brègman, L. M. A relaxation method of finding a common point of convex sets and its application to the solution of problems in convex programming. Ž. Vyčisl. Mat. i Mat. Fiz. 7 (1967), 620--631.
- [C1] Chafaï, D. Entropies, convexity, and functional inequalities: on Φ-entropies and Φ-Sobolev inequalities. J. Math. Kyoto Univ. 44 (2004), no. 2, 325--363.
- [C2] Chafaï, D. Binomial-Poisson entropic inequalities and the M/M/∞ queue. ESAIM Probab. Stat. 10 (2006), 317--339.
- [CL] Chafaï, D. and Lehec, J. Logarithmic Sobolev Inequalities Essentials. Master 2 lecture notes (2017), available online.
- [Co] Conforti, G. A probabilistic approach to convex Φ-entropy decay for Markov chains. Ann. Appl. Probab. 32 (2022), no. 2, 932--973.
- [Cs] Csiszár, I. A class of measures of informativity of observation channels. Collection of articles dedicated to the memory of Alfréd Rényi, I. Period. Math. Hungar. 2 (1972), 191--213.
- [G] Gross, L. Logarithmic Sobolev inequalities. Amer. J. Math. 97 (1975), no. 4, 1061--1083.
- [H] Hu, Y.-Z. A unified approach to several inequalities for Gaussian and diffusion measures. Séminaire de Probabilités XXXIV, 329--335, Lecture Notes in Math. 1729, Springer, Berlin, 2000.
- [L] Ledoux, M. On Talagrand's deviation inequalities for product measures. ESAIM Probab. Statist. 1 (1995/97), 63--87.
- [LO] Latała, R. and Oleszkiewicz, K. Between Sobolev and Poincaré. Geometric aspects of functional analysis, 147--168, Lecture Notes in Math. 1745, Springer, Berlin, 2000.
- [LRS] López-Rivera, P. and Shenfeld, Y. The Poisson transport map. Preprint arXiv:2407.02359 (2024).
- [M] Massart, P. Concentration inequalities and model selection. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6--23, 2003. With a foreword by Jean Picard. Lecture Notes in Mathematics 1896, Springer, Berlin, 2007, xiv+337 pp.
- [MC] Malrieu, F. and Collet, J.-F. Logarithmic Sobolev Inequalities for Inhomogeneous Semigroups. ESAIM Probab. Stat. 12 (2008), 492--504.
- [W1] Wu, L. A new modified logarithmic Sobolev inequality for Poisson point processes and several applications. Probab. Theory Related Fields 118 (2000), no. 3, 427--438.
- [W2] Wu, L. A Φ-entropy contraction inequality for Gaussian vectors. J. Theoret. Probab. 22 (2009), no. 4, 983--991.