
The aim of this short post is to explain why the maximum entropy principle is better seen as a minimum relative entropy principle, in other words as an entropic projection.
Relative entropy. Let λ be a reference measure on some measurable space E. The relative entropy with respect to λ is defined, for every measure μ on E with density $\frac{d\mu}{d\lambda}$, by
$$H(\mu\mid\lambda):=\int\frac{d\mu}{d\lambda}\log\frac{d\mu}{d\lambda}\,d\lambda.$$
If the integral is not well defined, we simply set $H(\mu\mid\lambda):=+\infty$.
- An important case is when λ is a probability measure: $H$ then becomes the Kullback-Leibler divergence, and the Jensen inequality for the strictly convex function $u\mapsto u\log u$ gives $H(\mu\mid\lambda)\geq0$, with equality if and only if $\mu=\lambda$.
- Another important case is when λ is the Lebesgue measure on $\mathbb{R}^n$ or the counting measure on a discrete set: then $-H(\mu\mid\lambda)$ is the Boltzmann-Shannon entropy of μ. Beware that when $E=\mathbb{R}^n$, this entropy takes its values in the whole of $(-\infty,+\infty)$, since for every scale factor $\sigma>0$, denoting $\mu_\sigma$ the push-forward of μ by the dilation $x\mapsto\sigma x$, we have $H(\mu_\sigma\mid\lambda)=H(\mu\mid\lambda)-n\log\sigma$. Both cases are illustrated numerically in the sketch after this list.
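To make the definition concrete, here is a minimal Python sketch for measures on a finite set (the helper name relative_entropy and the sample vectors are ours, purely for illustration). It recovers the Kullback-Leibler divergence when λ is a probability measure, and minus the Boltzmann-Shannon entropy when λ is the counting measure.

```python
import numpy as np

def relative_entropy(mu, lam):
    """H(mu | lam) on a finite set: sum of (dmu/dlam) log(dmu/dlam) dlam over atoms."""
    mu, lam = np.asarray(mu, float), np.asarray(lam, float)
    if np.any((lam == 0) & (mu > 0)):
        return np.inf  # mu is not absolutely continuous w.r.t. lam
    s = mu > 0
    return float(np.sum(mu[s] * np.log(mu[s] / lam[s])))

# When lam is a probability measure, H(.|lam) is the Kullback-Leibler divergence:
mu  = np.array([0.5, 0.3, 0.2])
lam = np.array([1/3, 1/3, 1/3])
print(relative_entropy(mu, lam))   # > 0 since mu != lam (Jensen inequality)
print(relative_entropy(lam, lam))  # = 0, the equality case

# When lam is the counting measure, -H(.|lam) is the Boltzmann-Shannon entropy:
print(-relative_entropy(mu, np.ones(3)))  # Shannon entropy of mu
```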
Boltzmann-Gibbs probability measures. Such a probability measure $\mu_{V,\beta}$ takes the form
$$d\mu_{V,\beta}:=\frac{e^{-\beta V}}{Z_{V,\beta}}\,d\lambda$$
where $V:E\to(-\infty,+\infty]$, $\beta\in[0,+\infty)$, and $Z_{V,\beta}:=\int e^{-\beta V}\,d\lambda<\infty$ is the normalizing factor. The larger β is, the more $\mu_{V,\beta}$ puts its probability mass on the regions where $V$ is low. The corresponding asymptotic analysis, known as the Laplace method, states that as $\beta\to\infty$ the probability measure $\mu_{V,\beta}$ concentrates on the minimizers of $V$.
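As a quick illustration of this concentration, here is a minimal Python sketch on a finite set (the potential V and the helper gibbs are our own hypothetical choices):

```python
import numpy as np

V = np.array([2.0, 0.5, 1.0, 0.5, 3.0])  # hypothetical potential, two minimizers

def gibbs(V, beta, lam=None):
    """Boltzmann-Gibbs weights exp(-beta V) lam / Z on a finite set."""
    lam = np.ones_like(V) if lam is None else lam
    w = np.exp(-beta * V) * lam
    return w / w.sum()

for beta in [0.0, 1.0, 10.0, 100.0]:
    print(beta, np.round(gibbs(V, beta), 4))
# beta = 0 gives the normalized reference measure; as beta grows, the mass
# concentrates on argmin V, here split between the two minimizers (Laplace method).
```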
The mean of $V$, or $V$-moment, of $\mu_{V,\beta}$ reads
$$\int V\,d\mu_{V,\beta}=-\frac{1}{\beta}H(\mu_{V,\beta}\mid\lambda)-\frac{1}{\beta}\log Z_{V,\beta}.$$
Indeed $\log\frac{d\mu_{V,\beta}}{d\lambda}=-\beta V-\log Z_{V,\beta}$, so integrating against $\mu_{V,\beta}$ gives $H(\mu_{V,\beta}\mid\lambda)=-\beta\int V\,d\mu_{V,\beta}-\log Z_{V,\beta}$.
In thermodynamics, $-\frac{1}{\beta}\log Z_{V,\beta}$ appears as a Helmholtz free energy, since it equals $\int V\,d\mu_{V,\beta}$ (mean energy) minus $\frac{1}{\beta}\times(-H(\mu_{V,\beta}\mid\lambda))$ (temperature times entropy).
When β ranges from $-\infty$ to $+\infty$, the $V$-moment of $\mu_{V,\beta}$ decreases from $\sup V$ down to $\inf V$, since
$$\partial_\beta\int V\,d\mu_{V,\beta}=\Bigl(\int V\,d\mu_{V,\beta}\Bigr)^2-\int V^2\,d\mu_{V,\beta}\leq0,$$
the right-hand side being minus the variance of $V$ under $\mu_{V,\beta}$. If $\lambda(E)<\infty$ then $\mu_{V,0}=\frac{1}{\lambda(E)}\lambda$ and its $V$-moment is $\frac{1}{\lambda(E)}\int V\,d\lambda$.
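Continuing with the same finite toy model (a sketch under the same hypothetical potential V as above), one can check numerically both the $V$-moment identity and the fact that the β-derivative of the $V$-moment is minus the variance:

```python
import numpy as np

V = np.array([2.0, 0.5, 1.0, 0.5, 3.0])   # same hypothetical potential as above
lam = np.ones_like(V)                      # counting measure as reference

def gibbs(beta):
    w = np.exp(-beta * V) * lam
    return w / w.sum()

beta = 1.5
mu = gibbs(beta)
Z = np.sum(np.exp(-beta * V) * lam)
H = np.sum(mu * np.log(mu / lam))          # relative entropy H(mu | lam)
mean_V = np.sum(V * mu)

# V-moment identity: int V dmu = -(1/beta) H(mu|lam) - (1/beta) log Z
print(mean_V, -(H + np.log(Z)) / beta)     # the two values agree

# Derivative identity: d/dbeta int V dmu = (int V dmu)^2 - int V^2 dmu <= 0
eps = 1e-6
finite_diff = (np.sum(V * gibbs(beta + eps)) - np.sum(V * gibbs(beta - eps))) / (2 * eps)
print(finite_diff, mean_V**2 - np.sum(V**2 * mu))  # equal, and negative
```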
Variational principle. Let $\beta\geq0$ be such that $Z_{V,\beta}<\infty$ and $c:=\int V\,d\mu_{V,\beta}<\infty$. Then, among all the probability measures μ on E with the same $V$-moment as $\mu_{V,\beta}$, the relative entropy $H(\mu\mid\lambda)$ is minimized by the Boltzmann-Gibbs measure $\mu_{V,\beta}$. In other words,
$$\min_{\int V\,d\mu=c}H(\mu\mid\lambda)=H(\mu_{V,\beta}\mid\lambda).$$
Indeed we have
$$\begin{aligned}
H(\mu\mid\lambda)-H(\mu_{V,\beta}\mid\lambda)
&=\int\log\frac{d\mu}{d\lambda}\,d\mu-\int\log\frac{d\mu_{V,\beta}}{d\lambda}\,d\mu_{V,\beta}\\
&=\int\log\frac{d\mu}{d\lambda}\,d\mu+\int(\log Z_{V,\beta}+\beta V)\,d\mu_{V,\beta}\\
&=\int\log\frac{d\mu}{d\lambda}\,d\mu+\int(\log Z_{V,\beta}+\beta V)\,d\mu\\
&=\int\log\frac{d\mu}{d\lambda}\,d\mu-\int\log\frac{d\mu_{V,\beta}}{d\lambda}\,d\mu\\
&=H(\mu\mid\mu_{V,\beta})\geq0,
\end{aligned}$$
with equality if and only if $\mu=\mu_{V,\beta}$. The crucial point is that μ and $\mu_{V,\beta}$ integrate identically the test functions of the form $a+bV$, where $a,b$ are arbitrary real constants, since both are probability measures with the same $V$-moment by assumption.
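As a numerical sanity check of this variational principle, here is a small Python sketch (the three-point potential and the parametrization are our own hypothetical choices): among all probability vectors on a three-point space with the prescribed $V$-moment, the relative entropy with respect to the counting measure is minimized exactly at the Boltzmann-Gibbs measure.

```python
import numpy as np

V = np.array([0.0, 1.0, 2.0])              # hypothetical potential on a 3-point space
beta = 1.0
w = np.exp(-beta * V)
mu_gibbs = w / w.sum()
c = np.sum(V * mu_gibbs)                   # the V-moment to match

def H(p):
    """Relative entropy w.r.t. the counting measure: sum of p log p."""
    p = p[p > 0]
    return float(np.sum(p * np.log(p)))

# With sum(p) = 1 and p1 + 2 p2 = c, the admissible probability vectors form
# the one-parameter family p = (1 - c + t, c - 2t, t), p >= 0 componentwise.
best_H, best_p = np.inf, None
for t in np.linspace(0.0, c / 2.0, 100001):
    p = np.array([1.0 - c + t, c - 2.0 * t, t])
    if np.all(p >= 0) and H(p) < best_H:
        best_H, best_p = H(p), p

print(best_H, H(mu_gibbs))                 # numerically equal
print(best_p, mu_gibbs)                    # the minimizer is the Gibbs measure
```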
- When λ is the Lebesgue measure on $\mathbb{R}^n$ or the counting measure on a discrete set, we recover the usual maximum Boltzmann-Shannon entropy principle
$$\max_{\int V\,d\mu=c}\bigl(-H(\mu\mid\lambda)\bigr)=-H(\mu_{V,\beta}\mid\lambda).$$
In particular, Gaussians maximize the Boltzmann-Shannon entropy under a variance constraint (take for $V$ a quadratic form), while uniform measures maximize the Boltzmann-Shannon entropy under a support constraint (take $V$ constant on a set of finite λ-measure, and $+\infty$ elsewhere); see the numerical check after this list. Maximum entropy is thus minimum relative entropy with respect to the Lebesgue or counting measure: a way to find, among the probability measures satisfying a moment constraint, the closest to the Lebesgue or counting measure.
- When λ is a probability measure, we recover the fact that the Boltzmann-Gibbs measures realize the projection, in the sense of least Kullback-Leibler divergence, of λ onto the set of probability measures with a given $V$-moment. This is the Csiszár I-projection.
- There are other interesting applications, for instance when λ is a Poisson point process.
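The Gaussian case mentioned above can be checked with closed-form differential entropies. A minimal sketch, comparing three unit-variance densities on the real line (the formulas for the Gaussian, Laplace, and uniform entropies are standard):

```python
import numpy as np

# Closed-form differential entropies of three unit-variance densities on R:
sigma2 = 1.0
h_gaussian = 0.5 * np.log(2 * np.pi * np.e * sigma2)   # N(0, sigma2)
h_laplace  = 1.0 + np.log(2 * np.sqrt(sigma2 / 2))     # Laplace with b = sqrt(sigma2/2)
h_uniform  = np.log(np.sqrt(12 * sigma2))              # uniform on a length sqrt(12 sigma2) interval

print(h_gaussian, h_laplace, h_uniform)  # ~1.419 > ~1.347 > ~1.242: the Gaussian wins
```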
Note. The concept of maximum entropy was studied notably by
- Rudolf Julius Emanuel Clausius (1822 - 1888)
- Ludwig Boltzmann (1844 - 1906)
- Hermann von Helmholtz (1821 - 1894)
- Josiah Willard Gibbs (1839 - 1903)
- Claude Elwood Shannon (1916 - 2001)
- Solomon Kullback (1907 - 1994)
- Richard Leibler (1914 - 2003)
and by Edwin Thompson Jaynes (1922 - 1998), in connection with thermodynamics, statistical physics, statistical mechanics, information theory, and Bayesian statistics. The concept of I-projection, or minimum relative entropy, was studied notably by Imre Csiszár (1938 - ).
Related.
- On this blog: Entropy ubiquity, LPMO (2015), with Olivier Darrigol, Roger Balian, Christian Maes, Félix Ritort, Thibault Damour
- L'entropie, Séminaire Poincaré or Bourbaphy, a Bourbaki of Physics (2003)