Recueil de modèles stochastiques

October 29th, 2014 No comments

Recueil de modèles stochastiquesJ’ai l’immense joie ce soir de terminer la mise au point de la version préliminaire du livre intitulé «Recueil de modèles stochastiques », écrit en collaboration avec mon vieil ami Florent Malrieu (sauriez-vous nous identifier sur la photo de Saint-Flour 1999 ?). Extrait de l’avant-propos:

Ce recueil de modèles stochastiques puise sa source dans les cours de Master de ma- thématiques et de préparation à l’agrégation de mathématiques, que nous avons dispensés, pendant plusieurs années, aux étudiants des universités de Toulouse, Rennes, Marne-la-Vallée, Paris-Dauphine, et Tours. Le parti pris est de polariser la rédaction par les modèles plutôt que par les outils, et de consacrer chaque séance ou chapitre à un modèle. Bien que parfois reliés, les chapitres sont essentiellement autonomes. Ils ne contiennent pas de rappels de cours sur les outils fondamentaux comme les théorèmes limites, les martingales, et les chaînes de Markov, pour lesquels d’excellentes références sont disponibles. Chaque chapitre commence par une liste de mots-clés et se termine par quelques notes et commentaires. La liste des thèmes abordés n’a rien de canonique ni d’exhaustif, mais constitue un panorama varié, que nous espérons enrichissant.

Le livre compte environ 280 pages, et comporte 26 chapitres, une bibliographie, et un index :

  1. Pile, face, coupons
  2. Marches aléatoires
  3. Modèle d’Ehrenfest
  4. Modèle de Fisher-Wright
  5. Branchement
  6. Percolation
  7. Renforcement
  8. Simulation de lois discrètes
  9. Restaurants chinois
  10. Généalogies et coalescence
  11. Agrégation limitée par diffusion interne
  12. Chaînes de Markov cachées
  13. Algorithme EM et mélanges
  14. Records, extrêmes, et recrutements !
  15. Polymères dirigés en environnement aléatoire
  16. Matrices aléatoires
  17. Problème du voyageur de commerce
  18. File d’attente M/M/Infini
  19. Ruine d’une compagnie d’assurance
  20. Naissances et assassinats
  21. Croissance et fragmentation
  22. Modèle du télégraphe
  23. Problème de Dirichlet
  24. Processus d’Ornstein-Uhlenbeck
  25. Modèles de diffusion
  26. Des chaînes de Markov aux processus de diffusion

C’est l’occasion de rappeler que le nombre $26$, utilisé en France pour la section Mathématiques appliquées et applications des mathématiques du Conseil national des universités, est l’unique nombre coincé entre un carré ($25=5^2$) et un cube ($27=3^3$), cf. [GB].

Pour ceux qui cherchent les graphes aléatoires, il y en a dans les chapitres 5, 6, 7, 8.

Categories: Books, Probability, Teaching

Linux kernel 3.17 getrandom()

October 13th, 2014 No comments
Linux Tux from Wikimedia

Linux Tux

Linus Torvalds has just released the Linux kernel version 3.17. Among other things, it comes with a new system call for random numbers named getrandom() introduced by Theodore Ts’o for the needs of applications such as LibreSSL. This new system call can be used to emulate the getentropy() of OpenBSD. Such random numbers are not algorithmic, and their unpredictability is useful for cryptographic and security applications. More information is available on Linux Weekly News.

Excerpt from the commit in the Git repository of the Linux kernel source:

The getrandom(2) system call was requested by the LibreSSL Portable
developers. It is analoguous to the getentropy(2) system call in

The rationale of this system call is to provide resiliance against
file descriptor exhaustion attacks, where the attacker consumes all
available file descriptors, forcing the use of the fallback code where
/dev/[u]random is not available. Since the fallback code is often not
well-tested, it is better to eliminate this potential failure mode

The other feature provided by this new system call is the ability to
request randomness from the /dev/urandom entropy pool, but to block
until at least 128 bits of entropy has been accumulated in the
/dev/urandom entropy pool. Historically, the emphasis in the
/dev/urandom development has been to ensure that urandom pool is
initialized as quickly as possible after system boot, and preferably
before the init scripts start execution.

This is because changing /dev/urandom reads to block represents an
interface change that could potentially break userspace which is not
acceptable. In practice, on most x86 desktop and server systems, in
general the entropy pool can be initialized before it is needed (and
in modern kernels, we will printk a warning message if not). However,
on an embedded system, this may not be the case. And so with this new
interface, we can provide the functionality of blocking until the
urandom pool has been initialized. Any userspace program which uses
this new functionality must take care to assure that if it is used
during the boot process, that it will not cause the init scripts or
other portions of the system startup to hang indefinitely.

Notes. LibreSSL is a free version of the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols, forked from OpenSSL cryptographic software library in April 2014 by OpenBSD developers after the Heartbleed security vulnerability in OpenSSL.


Deux petites productions pédagogiques du mois de septembre

October 8th, 2014 No comments

Wigner about level spacing and Wishart

September 26th, 2014 2 comments
Figure IIB1.1 Probability of a level spacing X.

Original picture in Wigner’s proceedings.

Have you ever opened The Oxford Handbook of Random Matrix Theory edited by Akemann, Baik, and Di Francesco? The short foreword by Freeman Dyson is fun. I was intrigued, page 16, by a sentence in chapter 2 (“History – An overview”, by Bohigas and Weidenmüller):

… That search and, as he says (see page 225 of Ref. [Por65]), the ‘accidental’ discovery of the Wishart ensemble of random matrices in an early version of the book by Wilks [Wil62], motivated Wigner to use random matrices.

I already knew Wilks’s book [Wil62], but not [Por65], which turned out to be a book:

Statistical Theories of Spectra: Fluctuations, A Collection of Reprints and Original Papers. With an introductory Review by Charles E. Porter, Brookhaven National Laboratory, New York. Academic Press, 1965.

I’ve bought a used copy of this book. Wigner’s paper is a conference proceedings p. 223-225:

Proceedings of the International Conference on the Neutron Interactions with the Nucleus. Held at Columbia University, New York, September 9-13, 1957. Published by the United States Atomic Energy Commission, Technical Information Service Extension, Oak Ridge, Tennessee.

Talk IIB1. E. P. Wigner, Princeton University. Presented by E. P. Wigner.

Distribution of Neutron Resonance Level Spacing.

Session IIB. Interpretation of Low Energy Neutron Spectroscopy.

Chairman: W. W. Havens, Jr.

The problem of the spacing of levels is neither a terribly important one nor have I solved it. That is really the point which I want to make very definitely. As we go up in the energy scale it is evident that the detailed analyses which we have seen for low energy levels is not possible, and we can only make statistical statements about the properties of the energy levels. Statistical properties refer particularly to two quantities, to wit, the distribution of the widths of energy levels and the spacing of energy levels. As far as the widths of the energy levels is concerned, this is an important problem which has been solved experimentally principally by Hughes, Harvey and the group which was at the time at Brookhaven. The theoretical work, which resulted in a relatively minor correction, was done by Scott, Porter and Thomas. The distribution of widths is rather important, as has been shown by Mr. Dresner, since the use of different distributions can introduce a factor of 3 or more in the determination of the average cross sections. Nothing of similar importance can be said about the spacing except that it is tantalizing not to know what the probability of a certain spacing of the energy levels is.

Now, if we have an average spacing, let us say D, what is the probability of a spacing, X. This question is the one which I have not solved, but perhaps I can tell you something about this which is relatively easy to see. It was pointed out at the Amsterdam conference last year that the probability is 0 at X = 0. This is rather obvious, and follows from the fact that if you have a symmetric matrix, the probability that two energy levels coincide is governed by the requirement that two parameters be 0, rather than that one parameter be 0. This was pointed out by Dr. Von Neuman and myself in an article which everybody has probably forgotten, but it is a very easy thing to show. If I take, for Instance, a two-dimensional matrix then the secular equation reads: $$E =1/2(E_1+ E_2) \pm 1/2\sqrt{(E_1 — E_2)^2 + 4V_{(12)}^2}.$$ The two roots can coincide only if the square root is equal to zero, which requires that both $E_1 – E_2 = 0$ and $V_{(12)} =0$. If you have a third order equation, which is slightly more complicated, that tells you two roots coincide, it again amounts to two equations rather than one equation. As a result, the probability is 0 and, therefore, the probability curve looks something like that in Figure 1. This is of course what is also observed. The only question is how does this look for large spacings, and the answer to this question is what I don’t know.

Dr. Hughes presented a curve which I have not seen, in which he compared an experiment with a curve which I suggested some time ago, a form $Xe^{-X^2/2}$ or something like this. Hughes found reasonably good agreement, but the agreement was good in the region of Small X, and not very good in the region of large X. Agreement in this large X region would not look so bad if the experimental curve were the theoretical curve and the theoretical curve, the experimental curve. Because if you could calculate that there should be four cases with a given spacing and you find only one, then that’s unfortunate, but it is a reasonable statistical deviation. However, if you calculate that there should only be one and find four, that is a much more serious deviation and much less probable statistically.

Let me say only one more word. It is very likely that the curve in Figure 1 is a universal function. in other words, it doesn’t depend on the details of the model with which you are working. There is one particular model in which the probability of the energy levels can be written down exactly. I mentioned this distribution already in Gatlinburg. It is called the Wishart distribution. Consider a set of symmetric matrixes in such a way that the diagonal element $m_{11}$ has a distribution $\exp(-m_{11}^2/4)$. In other words, the probability that this diagonal element shall assume the value $m_{11}$ is proportional to $\exp(-m_{11}^2/4)$. Then as I mentioned, and this was shown a long time ago by Wishart, the probability for the characteristic roots to be $\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_n$, if this is an $n$ dimensional matrix, is given by the expression: $$\exp(-1/2(\lambda_1^2+\lambda_2^2+..+\lambda_n^2))[(\lambda_1-\lambda_2)(\lambda_1-\lambda_3).......(\lambda_{n-1}-\lambda_n)].$$ You see again this characteristic factor which shows that the probability of two roots coinciding is 0. If you want to calculate from this formula, the probability that two successive roots have a distance X, then you have to integrate over all of them except two. This is very easy to do for the first integration, possible to do for the second integration, but when you get to the third, fourth and fifth, etc., integrations you have the same problem as in statistical mechanics, and presumably the solution of the problem will be accomplished by one of the methods of statistical mechanics. Let me only mention that I did integrate over all of them except one, and the result is $\frac{1}{2\pi}\sqrt{4n-\lambda^2}$. This is the probability that the root shall be $\lambda$. All I have to do is to integrate over one less variable than I have integrated over, but this I have not been able to do so far.


W. HAVENS: Where does one find out about a Wishart distribution?

E. WIGNER: A Wishart distribution is given in S.S. Wilks book about statistics and I found it just by accident.



Eugene Wigner and Edward Teller

Eugene Wigner and Edward Teller – Two Hungarian-born American theoretical physicists.

Note. Edward Teller is among other things credited as being the father of the US H-bomb. He is also one of the authors of the the famous 1953 paper entitled Equation of State Calculations by Fast Computing Machines, together with Nicholas Metropolis, Arianna Rosenbluth, Marshall Rosenbluth, and Augusta Teller, which introduced the so-called Metropolis-Hastings algorithm. Eugene Wigner received a Nobel Prize in Physics in 1963 “for his contributions to the theory of the atomic nucleus and the elementary particles, particularly through the discovery and application of fundamental symmetry principles”. Wigner is often considered as a hidden genius, in comparison with Einstein.