The work of Boltzmann on entropy in the years 1865-1905 is really amazing. Beyond its important combinatorial aspects, one of the general ideas behind this work is that along certain dynamics some functional is monotonic, so that the long-time equilibrium, if it exists, is related to the optimum of this functional under the constraints imposed by the conservation laws. One may call such functionals entropies. For the original Boltzmann equation $\partial_tf_t=A(f_t)$, which comes from the kinetic modelling of gases, the entropy is $H(f)=-\int\!f(x)\log f(x)dx$, and it is maximized by Gaussians under a variance constraint. Here the variance constraint corresponds to the conservation of energy. Boltzmann was the first to use a partial differential equation to describe the evolution of a probability density function, decades before the rigorous analysis of such concepts in mathematics.
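The Gaussian maximality can be checked in a few lines. Here is a sketch, assuming for simplicity dimension one and a centered density $f$ with variance $\sigma^2$ and finite entropy; the notation $g$ for the centered Gaussian density with the same variance is introduced here for illustration. Since $\log g(x)=-\frac{x^2}{2\sigma^2}-\frac{1}{2}\log(2\pi\sigma^2)$ is quadratic in $x$, and since $f$ and $g$ have the same mass and the same second moment, we have $\int\!f\log g\,dx=\int\!g\log g\,dx$, and therefore

$$
H(g)-H(f)=\int\!f(x)\log\frac{f(x)}{g(x)}dx\geq0,
\qquad
H(g)=\frac{1}{2}\log(2\pi e\sigma^2).
$$

The integral in the middle is the relative entropy of $f$ with respect to $g$. It is nonnegative by the Jensen inequality, and it vanishes only when $f=g$.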
Of course, for nonlinear dynamics, the initial data may play a subtle role. The same idea is present in the notion of gradient flow equations. Beyond statistical physics, the maximum entropy principle plays a role in Bayesian statistics. It also has something to do with the consistency of the maximum likelihood estimator.
For an ergodic and reversible Markov process, the equilibrium is typically a Gibbs measure, and the free energy plays the role of the entropy and is monotonic. A Gibbs measure maximizes the entropy under an average energy constraint involving its potential. The monotonicity does not contradict the reversibility, because reversibility is a property of the equilibrium, and has nothing to do with the initial data.
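To make the link between entropy, free energy, and Gibbs measures concrete, here is a sketch under assumptions that are not in the text above: the state space is $\mathbb{R}^d$, the Gibbs measure has density $\mu_\beta(x)=Z_\beta^{-1}e^{-\beta V(x)}$ with potential $V$, inverse temperature $\beta>0$, and finite normalizing constant $Z_\beta$, and the free energy of a probability density $f$ is defined as the average energy minus the entropy divided by $\beta$. Expanding $\log\mu_\beta=-\beta V-\log Z_\beta$ gives

$$
F(f)=\int\!V(x)f(x)dx-\frac{1}{\beta}H(f)
=\frac{1}{\beta}\int\!f(x)\log\frac{f(x)}{\mu_\beta(x)}dx-\frac{1}{\beta}\log Z_\beta.
$$

Up to a factor and an additive constant, the free energy is thus the relative entropy of $f$ with respect to $\mu_\beta$, so it is minimal exactly when $f=\mu_\beta$. In particular, among all densities $f$ with the same average energy $\int\!Vf\,dx=\int\!V\mu_\beta\,dx$, the Gibbs measure $\mu_\beta$ has maximal entropy.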
Another interesting problem involving entropy and monotonicity emerged from information theory and was stated by Shannon: is the Boltzmann entropy monotonic along the standard central limit theorem? What about the speed? Here the dynamics is related to independence and convolution, and the conserved quantity is the variance. This problem was solved decades later by several authors, including Artstein, Ball, Barthe, and Naor. The central limit theorem is available in many contexts beyond the classical Abelian case, including the operator algebras of the free probability of Voiculescu (where Shlyakhtenko has solved the problem positively) and Lie groups. It is tempting to formulate the Shannon conjecture on (non-compact) Lie groups with dilation. The answer is unknown, even for the Heisenberg group.
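In the classical case, the statement can be written as follows. This is a sketch in dimension one, with notation introduced here for illustration: $X_1,X_2,\ldots$ are independent and identically distributed real random variables with a density, mean $0$, and variance $1$, and $S_n$ is the normalized sum, so that $S_n$ has variance $1$ for every $n$, which is the conservation law. With $H(S_n)$ denoting the entropy of the density of $S_n$ and $G$ a standard Gaussian random variable, the monotonicity reads

$$
S_n=\frac{X_1+\cdots+X_n}{\sqrt{n}},
\qquad
H(S_n)\leq H(S_{n+1})\leq H(G)=\frac{1}{2}\log(2\pi e),
\qquad n\geq1.
$$

The upper bound is the maximality of the Gaussian under a variance constraint, while the monotonicity in $n$ is the part proved by Artstein, Ball, Barthe, and Naor.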