# Libres pensées d'un mathématicien ordinaire Posts

La comptabilité analytique fait peu à peu son apparition dans les administrations des universités françaises. À Paris-Dauphine, le calcul du coût par étudiant en fonction du département est mené au moyen d’un modèle de répartition subtilement paramétré. Bien que le modèle soit arbitraire et discutable, les résultats obtenus sont intéressants car source de questionnements sur le mode d’organisation et sur l’efficacité des composantes. Ainsi, pour le département MIDO – licences et masters en mathématiques et informatique – le prix moyen d’un étudiant par an serait de l’ordre de 18000 Euros. Ceci inclut la masse salariale, sauf celle des chercheurs qui n’enseignent pas comme ceux du CNRS. L’ordre de grandeur est facile à retrouver. En effet, en divisant le budget de l’université, d’environ 110 Millions d’Euro, par le nombre d’étudiants, environ 10000, on obtient environ 11000 Euros par étudiant et par an. Voici le résultat de cette division toute simple menée pour quelques autres établissements d’enseignement supérieur français. Les données – budgets et effectifs – sont extraits des pages de Wikipédia.

Établissement    Budget(M€)    Étudiants      B/E
-------------------------------------------------
ÉNS-Paris              102         2400   42500 €
Polytechnique          100         2899   34495 €
HEC                    127         4000   31750 €
Sciences-Po            173        13000   13308 €
Paris-Dauphine         110        10320   10659 €
Strasbourg             500        48011   10414 €
Toulouse-III           328        31987   10254 €
UPMC                   345        33789   10210 €
Paris-Diderot          244        25000    9760 €
Aix-Marseille          722        74000    9757 €
Lyon-I                 421        45000    9356 €
Paris-Sud              275        30200    9106 €


À Paris-Dauphine, le feuilleton des frais de scolarité se poursuit, notamment au sein du département MIDO, qui a finalement décidé de temporiser. Au delà du département MIDO, certains défendent l’idée que des frais de scolarité modulés en fonction du revenu risquent de faire fuir les riches, qui ne sont pas captifs comme dans la situation des cantines scolaires. Il s’agit là d’une idée souvent agitée par la droite contre les impôts progressifs défendus par la gauche. La réalité du phénomène ne fait aucun doute, et c’est plutôt son ampleur et son impact final qui font débat. Excités par cette idée, certains vont même plus loin et pensent qu’il faudrait afficher le vrai prix des études, les fameux 18000 Euros, et attribuer des bourses « virtuelles » qui effacent plus ou moins ce montant en fonction du revenu. Cela modifierait la perception de la part des riches, et ne changerait rien pour les pauvres. Il resterait tout de même à doser la progressivité pour les classes moyennes. On retrouve la question initiale. Elle alimente chez les collègues une sourde angoisse du déclassement. Il est vrai qu’ils sont les dominants et bénéficiaires du secteur de l’éducation.

Le schéma ci-dessus a été préparé par la direction des systèmes d’information (DSI) de l’université Paris-Dauphine. Il est assez semblable à ce que l’on trouve dans la plupart des établissements d’enseignement supérieur français, à quelques spécificités dauphinoises près. Il permet de visualiser les logiciels utilisés en 2015 par les personnels de l’université. Les cases rouges correspondent à des obsolescences. Le schéma ci-dessous correspond à l’année 2016, et permet de visualiser l’ampleur de l’effort accompli. La tâche n’est pas simple : plus de 130 logiciels, faisant appel à plus de 20 technologies différentes. Ces logiciels communiquent mal ou pas du tout entre eux, et certains ne sont pas vraiment conçus pour l’usage qu’on en fait.

Comment en est-on arrivé là ? Sans doute par une lente sédimentation, au fil des besoins, des arbitrages, et des compromis. Cette complexion fait penser à celle des législations des vielles démocraties. Il n’est pas simple de réformer et de simplifier, même partiellement.

Les discussions actuelles à Paris-Dauphine autour des logiciels DFIS et Adum font bien comprendre les mécanismes qui mènent à une augmentation de complexion. L’essentiel se résume en quelques mots : imperfection et utilité sont parfaitement compatibles.

Ci-dessus la situation en 2016.

Poincaré. Recently I have spent some time thinking about the following problem: for any integer ${N\geq1}$, what is the best constant in the Poincaré inequality for the probability measure ${\gamma_N}$ on the convex set ${\Lambda_N:=\{x\in\mathbb{R}^N:x_1\leq\cdots\leq x_N\}}$ with density proportional to

$x\in\Lambda_N\mapsto\mathrm{e}^{-\frac{N}{2}\sum_{i=1}^Nx_i^2}\prod_{j<k}(x_j-x_k)^2 ?$

In other words, for every ${N\geq1}$, find the best (in fact the largest) constant ${\rho_N\geq0}$ such that for every ${\mathcal{C}^\infty}$ test function ${f:\Lambda_N\rightarrow\mathbb{R}}$ with compact support,

$\rho_N\mathrm{Var}_{\gamma_N}(f) \leq\int |\nabla f|^2\mathrm{d}\gamma_N,$

where ${\mathrm{Var}_{\gamma_N}(f):=\int f^2\mathrm{d}\gamma_N-\left(\int f\mathrm{d}\gamma_N\right)^2}$ and where ${|\nabla f|^2:=(\partial_1f)^2+\cdots+(\partial_Nf)^2}$.

The reader familiar with random matrix theory knows that ${\gamma_N}$ is the law of the eigenvalues of a random matrix drawn from the Gaussian Unitary Ensemble (GUE) with density proportional to ${\mathrm{e}^{-\frac{N}{2}\mathrm{tr}(H^2)}}$ on ${N\times N}$ Hermitian matrices.

Note that ${\gamma_N}$ is a Boltzmann-Gibbs measure, with density ${Z_N^{-1}\mathrm{e}^{-E_N(x)}}$ with

$E_N(x) :=\frac{N}{2}\sum_{i=1}^Nx_i^2+\sum_{j<k}\log\frac{1}{(x_k-x_j)^2}.$

Class of test functions. Far beyond ${\mathcal{C}^\infty}$ with compact support, standard approximation arguments give that the Poincaré inequality for ${\gamma_N}$ with constant ${\rho_N}$ remains valid for any test function in the Sobolev space ${H^2(\gamma_N):=W^{1,2}(\gamma_N)}$.

Symmetrization. Let ${\widetilde{\gamma}_N}$ be the probability measure on ${\mathbb{R}^N}$ instead of ${\Lambda_N}$ with same density as ${\gamma_N}$ up to a multiplicative normalizing factor. It is exchangeable in other words invariant by permutation of the coordinates since its density is a symmetric function of the coordinates. Obviously the normalizing factor for ${\widetilde\gamma_N}$ is ${N!}$ times the one of ${\gamma_N}$. For every bounded and measurable function ${f:\mathbb{R}^N\rightarrow\mathbb{R}}$,

$\int f\mathrm{d}\widetilde{\gamma}_N =\int f_*\mathrm{d}\gamma_N$

where ${f_*}$ is the symmetrization of ${f}$ defined by

$f_*(x_1,\ldots,x_N)=\frac{1}{N!}\sum_{\sigma\in\Sigma_N}f(x_{\sigma(1)},\ldots,x_{\sigma(N)})$

where ${\Sigma_N}$ is the symmetric group of permutations of ${\{1,\ldots,N\}}$. Also ${\gamma_N=\widetilde\gamma_N}$ on symmetric functions. Conversely for any bounded and measurable ${f:\Lambda_N\rightarrow\mathbb{R}}$,

$\int f\mathrm{d}\gamma_N =\int f(x_{(1)},\ldots,x_{(N)})\mathrm{d}\widetilde\gamma_N$

where ${x_{(1)}<\cdots<x_{(N)}}$ is the reordering of ${x_1,\ldots,x_N}$.

It is possible to compute moments of ${\widetilde\gamma_N}$ by using the link with the GUE. For instance

$\int_{\mathbb{R}^N}(x_1+\cdots+x_N)\widetilde\gamma_N(\mathrm{d}x) =\mathbb{E}(\mathrm{tr}(H))=N\mathrm{H_{11}}=0$

where ${H\sim\mathrm{GUE}}$ since ${H_{11}\sim\mathcal{N}(0,N^{-1})}$, while using ${H_{12}\sim\mathcal{N}(0,(2N)^{-1}I_2)}$,

$\int_{\mathbb{R}^N}(x_1^2+\cdots+x_N^2)\widetilde\gamma_N(\mathrm{d}x) =\mathbb{E}(\mathrm{tr}(H^2)) =N\mathbb{E}(H_{11}^2)+N(N-1)\mathbb{E}(|H_{12}|^2) =N.$

Since ${\widetilde\gamma_N}$ is exchangeable it follows that

$\int x_1\widetilde\gamma_N(\mathrm{d}x)=\cdots=\int x_N\mathrm{d}\widetilde\gamma_N =\frac{1}{N}\int(x_1+\cdots+x_N)\widetilde\gamma_N(\mathrm{d}x) =0$

and

$\int x_1^2\widetilde\gamma_N(\mathrm{d}x)=\cdots=\int x_N^2\widetilde\gamma_N(\mathrm{d}x) =\frac{1}{N}\int(x_1^2+\cdots+x_N^2)\widetilde\gamma_N(\mathrm{d}x) =1.$

To compute a mixed moment ${x_jx_k}$ with ${j\neq k}$, we can start from

$\int(x_1+\cdots+x_N)^2\widetilde\gamma_N(\mathrm{d}x) =\mathbb{E}(\mathrm{tr}(H)^2)=1$

since ${\mathrm{tr}(H)\sim\mathcal{N}(0,1)}$, and use

$\int(x_1+\cdots+x_N)^2\widetilde\gamma_N(\mathrm{d}x) =N(N-1)\int x_1x_2\widetilde\gamma_N(\mathrm{d}x) +N\int\!x_1^2\widetilde\gamma_N(\mathrm{d}x)$

to get, for any ${j\neq k}$,

$\int x_jx_j\widetilde\gamma_N(\mathrm{d}x) =\int x_1x_2\widetilde\gamma_N(\mathrm{d}x) =-\frac{1}{N}.$

This shows that ${\widetilde\gamma_N}$ is asymptotically isotropic, in the sense that its mean is zero while its covariance matrix is close to the identity ${I_N}$ as ${N\rightarrow\infty}$. Note that the interior of the support of ${\widetilde\gamma_N}$, which is ${\cap_{j\neq k}\{x\in\mathbb{R}^N:x_j\neq x_k\}}$, is not connected, and this suggests that the probability measure ${\widetilde\gamma_N}$ is in some sense artificial.

How about ${\gamma_N}$ defined on the convex set ${\Lambda_N}$? In fact, this distribution is not isotropic, even when ${N\gg1}$. As a matter of fact, let us recall that if ${Z\sim\gamma_N}$ then a well known result of random matrix theory states that almost surely, regardless of the way we choose the common probability space,

$Z_1\underset{N\rightarrow\infty}{\longrightarrow}-2 \quad\mbox{and}\quad Z_N\underset{N\rightarrow\infty}{\longrightarrow}2 \quad\mbox{while}\quad \frac{1}{N}\sum_{i=1}^N\delta_{Z_i} \underset{N\rightarrow\infty}{\overset{\mathrm{weak}}{\longrightarrow}} \frac{\sqrt{4-x^2}}{2\pi}\mathrm{1}_{[-2,2]}(x)\mathrm{d}x.$

Nevertheless, since ${\gamma_N}$ and ${\widetilde\gamma_N}$ agree on symmetric functions, we still have

$\int(x_1+\cdots+x_N)\gamma_N(\mathrm{d}x)=0,\quad \int(x_1+\cdots+x_N)^2\gamma_N(\mathrm{d}x)=1,$

and

$\int(x_1^2+\cdots+x_N^2)\gamma_N(\mathrm{d}x)=N.$

Dyson. The probability measure ${\gamma_N}$ is invariant for the irreducible Markov diffusion process ${{(X_t)}_{t\geq0}}$ on ${\Lambda_N}$ solution of the stochastic differential equation

$\mathrm{d}X_t=\sqrt{2}\mathrm{d}B_t-\nabla E_N(X_t)\mathrm{d}t$

where ${{(B_t)}_{t\geq0}}$ is a standard Brownian motion on ${\mathbb{R}^N}$, in other words

$\mathrm{d}X_{t,i} =\sqrt{2}\mathrm{d}B_{t,i} -NX_{t,i}\mathrm{d}t -2\sum_{j\neq i}\frac{1}{X_{t,j}-X_{t,i}}\mathrm{d}t, 1\leq i\leq N.$

We may call it the Dyson Ornstein-Uhlenbeck process. The equation above is a system of stochastic differential equations of interacting particles, which can be seen (via the empirical measure) as a mean-field linear approximation of a McKean-Vlasov semilinear evolution equation with singular interaction, but this is another story.

Markov. The semigroup ${{(P_t)}_{t\geq0}}$ of the process is defined by

$P_t(f)(x):=\mathbb{E}(f(X_t)\mid X_0=x)$

for any ${t\geq0}$, any bounded measurable ${f:\mathbb{R}^N\rightarrow\mathbb{R}}$, and any ${x\in\mathbb{R}^N}$. For any ${t\geq0}$ the linear operator ${P_t}$ is a contraction of ${\mathrm{L}^p(\gamma_N)}$ for any ${p\in[1,\infty]}$. In ${\mathrm{L}^2(\gamma_N)}$, the infinitesimal generator of this semigroup is the differential operator

$Af=\lim_{t\rightarrow0^+}\frac{P_t(f)-f}{t}=\Delta f-\nabla E_N\cdot \nabla f$

for any smooth enough test function ${f}$ (${A}$ is unbouded with a domain). The Poincaré inequality for ${\gamma_N}$ is equivalent to an exponential decay of the variance: ${\forall f, \forall t\geq0}$,

$\mathrm{Var}_{\gamma_N}(P_t(f))\leq\mathrm{e}^{-\rho_N t}\mathrm{Var}_{\gamma_N}(f).$

The Poincaré inequality is also equivalent to state that ${A}$ has a spectral gap:

$\mathrm{spectrum}(A)\subset(-\infty,-\rho_N]\cup\{0\}.$

Convexity. On the convex set ${\Lambda_N}$, the energy ${E_N}$ of the Boltzmann-Gibbs measure ${\gamma_N}$ is convex, and therefore ${\gamma_N}$ is log-concave (but is not isotropic). Note that in contrast ${\widetilde\gamma_N}$ is not log-concave (but is almost isotropic as ${N\rightarrow\infty}$). Thanks to a result by Bobkov, it follows that the Poincaré constant of ${\gamma_N}$ is positive:

$\rho_N>0.$

It may depend on ${N}$ however. Let us compute the Hessian matrix of ${E_N}$:

$\partial^2_{i,i}E_N(x) =N+2(N-1)\sum_{k\neq i}\frac{1}{(x_i-x_k)^2} \quad\mbox{and}\quad \partial^2_{j,k}E_N(x) =-\frac{2(N-1)}{(x_j-x_k)^2},\quad j\neq k.$

This shows in particular that for any ${x\in\mathbb{R}^N}$, as quadratic forms,

$\mathrm{Hess}(E_N)(x)\geq NI_N.$

Thus by the Brascamp-Lieb inequality or by the Bakry-Émery criterion or by the Caffarelli theorem, it follows that the Poincaré constant of ${\gamma_N}$ satisfies

$\rho_N\geq N.$

This bound comes from the strong convexity of the confinement term ${N\sum_{i=1}^Nx_i^2}$ and the convexity of the interaction term ${\sum_{j<k}\log\frac{1}{(x_j-x_k)^2}}$. The interaction term is strongly convex when ${x}$ in away from infinity, but this is not seen by global criteria.

The spectral gap is ${N}$. The Poincaré inequality for ${\gamma_N}$ written for the special test function ${f(x)=x_1+\cdots+x_N}$ writes

$\rho_N\mathrm{Var}(Z_1+\cdots+Z_N)\leq N$

where ${Z\sim\gamma_N}$. But if ${H\sim\mathrm{GUE}}$ then ${Z_1+\cdots+Z_N=\mathrm{tr}(H)\sim\mathcal{N}(0,1)}$ and thus

$\rho_N\leq N.$

Combining this fact with the previous lower bound we obtain remarkably that

$\rho_N=N.$

Actually, it turns out that the function ${f(x)=x_1+\cdots+x_N}$ is an eigenfunction of the infinitesimal generator ${A}$ associated to the spectral value ${N}$, indeed

$Af(x) =\sum_{i=1}^N\partial_iE_N(x) =\sum_{i=1}^N\left(Nx_i+2\sum_{j\neq i}\frac{1}{x_j-x_i}\right) =Nf(x).$

This gives a conservation law: ${\sum_{i=1}^NX_{t,i}=\sum_{i=1}^NX_{0,i}}$ for any ${t\geq0}$.

Projections. Let us consider the hyperplane of ${\mathbb{R}^N}$

$H_N:=\{x\in\mathbb{R}^N:x_1+\cdots+x_N=0\}$

orthogonal to ${(1,\ldots,1)}$. Let ${\pi_N}$ and ${\pi_N^\perp}$ be the orthogonal projections on ${H_N}$ and ${H_N^\perp=\mathbb{R}(1,\ldots,1)}$. The Itô formula shows that the projected processes ${{(\pi_N(X_t))}_{t\geq0}}$ and ${{(\pi_N^\perp(X_t))}_{t\geq0}}$ are independent, and that moreover the first one is a Markov diffusion process on ${H_N}$ while the second is a Brownian motion on ${H_N^\perp\equiv\mathbb{R}}$.

Similarly if ${Z\sim\gamma_N}$ then ${\pi_N(Z)}$ and ${\pi_N^\perp(Z)}$ are independent, the first one follows a Boltzmann-Gibbs measure with same density as ${\gamma_N}$ but on ${H_N}$, while the second follows a Gaussian distribution of dimension ${1}$.

The quadratic term in the exponential and the Vandermonde determinant which both appear in the density of ${\gamma_N}$ play an essential role in these projection properties. More precisely, the Pythagoras theorem gives

$x_1^2+\cdots+x_N^2 =|x|^2 =|x-\pi_N(x)|^2+|\pi_N(x)|^2 =|\pi_N^\perp(x)|^2+|\pi_N(x)|^2,$

while the formulas ${\pi_N^\perp(x)=\frac{x_1+\cdots+x_N}{N}(1,\ldots,1)}$ and ${\pi_N(x)=x-\pi_N^\perp(x)}$ give for ${j<k}$

$x_j-x_k =x_j-\frac{x_1+\cdots+x_N}{N}-\left(x_k-\frac{x_1+\cdots+x_N}{N}\right) =\pi_N(x)_j-\pi_N(x)_k,$

which yield together finally the factorization

$\mathrm{e}^{-\frac{N}{2}\sum_{i=1}^Nx_i^2}\prod_{j<k}(x_j-x_k)^2 =\mathrm{e}^{-\frac{N}{2}|\pi_N^\perp(x)|^2}\times \mathrm{e}^{-\frac{N}{2}|\pi_N(x)|^2}\prod_{j<k}(\pi_N(x)_j-\pi_N(x)_k)^2.$

It is natural to ask about the Poincaré constant for the projected process. This constant is necessarily ${\geq N}$ and one can check on quadratic functions that this bound is in fact sharp, therefore the spectral gap is not improved by this projection and remains equal to ${N}$. However, the Dyson conjecture formulated by Dyson in 1962 suggests that the Poincaré constant for a class of sufficiently local test functions might be much larger than ${N}$.

Hoffman-Wielandt. The Hoffman-Wielandt theorem for Hermitian matrices states that for any ${N\geq1}$ if ${A}$ and ${B}$ are two ${N\times N}$ Hermitian matrices with eigenvalues ${x_1(A)\leq\cdots\leq x_N(A)}$ and ${x_1(B)\leq\cdots\leq x_N(B)}$ then

$\sum_{i=1}^N(x_i(A)-x_i(B))^2\leq\sum_{j,k=1}^N(A_{jk}-B_{jk})^2.$

This shows that the ordered vector of eigenvalues is a Lipschitz function the matrix entries, and this explains why the Gaussian nature of the GUE induces nice properties for ${\gamma_N}$. This can also be seen as an alternative to the Caffarelli theorem.

Logarithmic Sobolev inequalities. Thanks to the Hoffman-Wielandt inequality, the Gaussian nature of the GUE should induces a sub-Gaussian nature for ${\gamma_N}$. Indeed again the Bakry-Émery criterion or the Caffarelli theorem give that ${\gamma_N}$ satisfies to a logarithmic Sobolev inequality with constant ${\geq N/2}$, and thus ${=N/2}$ since it implies the Poincaré inequality with twice the constant.

Beyond the Gaussian case. Many properties considered above remain valid if one replaces, in the exponential in the density of ${\gamma_N}$, the quadratic term ${x_i^2}$ by ${V(x_i)}$ where ${V:\mathbb{R}\rightarrow\mathbb{R}}$ is convex with ${\inf_{\mathbb{R}}V”>0}$. However certain rigid properties are lost, such as the fact that ${x_1+\cdots+x_N}$ is an eigenvector of ${A}$, and in particular the projected process is no longer necessarily a Markov process.

Beyond GUE. Certain aspects remain valid if one replaces, in the density of ${\gamma_N}$, the term ${(x_j-x_k)^2}$ by ${|x_j-x_k|^\beta}$ for some ${\beta>0}$. The cases ${\beta=1}$ and ${\beta=4}$ correspond respectively to the Gaussian Orthogonal Ensemble (GOE) and the Gaussian Simplectic Ensemble (GSE). The condition ${\beta\geq1}$ seems to be necessary in order to ensure that the diffusion process ${{(X_t)}_{t\geq0}}$ does not explode in finite time.

Kannan-Lovász-Simonovits. Log-concave probability measures play a central role in the analysis and geometry of convex bodies, as being functional generalization of uniform distributions on convex bodies. In this context, the Kannan-Lovász-Simonovits (KLS) conjecture formulated in 1995 states that there exists a universal constant ${\rho\in(0,\infty)}$ such that for every ${N\geq1}$ and every log-concave probability measure ${\mu}$ on ${\mathbb{R}^N}$ with mean ${0}$ and covariance matrix ${I_N}$ (isotropy), the Poincaré constant of ${\mu}$ is larger than or equal to ${\rho}$. The feature here is the uniformity of the bound in particular with respect to the dimension ${N}$. Several mathematicians have tried to prove the conjecture. The best bound for now is ${N^{-1/4}}$. Possible counter examples should be searched among non-product probability measures due to the stability by tensor product of the Poincaré inequality and the fact that the Poincaré constant of a one dimensional log-concave distribution is controlled by its second moment. The KLS conjecture is related to other important conjectures in geometric functional analysis such that the hyperplane and the thin-shell conjectures.

Note. This post is inspired from numerous conversations with my colleague Joseph Lehec on Dyson Brownian motion and related topics.