This post gives the Mathematical Citation Quotient (MCQ) from 2000 to 2018 for journals in probability, statistics, analysis, and general mathematics. The numbers were obtained using home-brewed scripts and MathSciNet data. The graphics were created with LibreOffice.

Recall that the MCQ is a ratio of two counts for a selected journal and a selected year. The MCQ for year $Y$ and journal $J$ is given by the formula $\mathrm{MCQ}=m/n$ where

$m$ is the total number of citations of papers published in journal $J$ in years $Y-1$,…,$Y-5$ by papers published in year $Y$ in any journal known to MathSciNet;

$n$ is the total number of papers published in journal $J$ in years $Y-1$,…,$Y-5$.
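The definition above can be turned into a few lines of code. The following is only a minimal sketch, not the scripts used for this post: the `Citation` record, the `mcq` function, and the `papers_per_year` mapping are hypothetical names introduced for illustration, and real MathSciNet data would need its own extraction step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One citation (hypothetical record layout, not a MathSciNet format)."""
    citing_year: int    # publication year of the citing paper
    cited_journal: str  # journal of the cited paper
    cited_year: int     # publication year of the cited paper


def mcq(journal, year, citations, papers_per_year):
    """Return m/n for `journal` and `year`, following the definition above.

    `papers_per_year` maps a year to the number of papers that `journal`
    published in that year (an assumed input format).
    """
    window = range(year - 5, year)  # the years Y-5, ..., Y-1
    # m: citations made in year Y to papers of the journal in the window
    m = sum(
        1
        for c in citations
        if c.citing_year == year
        and c.cited_journal == journal
        and c.cited_year in window
    )
    # n: papers published by the journal in the window
    n = sum(papers_per_year.get(y, 0) for y in window)
    if n == 0:
        raise ValueError("no papers in the five-year window")
    return m / n
```

Citations made in another year, to another journal, or to papers outside the five-year window are simply ignored by the count $m$.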

Mathematical Reviews computes the MCQ every year for every indexed journal, and makes it available on MathSciNet. This formula is very similar to that of the five-year impact factor, the main differences being the population of journals, which is specifically mathematical for the MCQ (reference list journals), and the way the citations are extracted. Both biases are negative.

The MCQ is a rough measurement of the social scientific value of journals. The results are quite compatible with what we have in mind. The trends are sometimes intriguing. For probability journals, for instance, it seems that there are three groups. This is reminiscent of reinforcement or self-organized criticality. The first group is AOP-PTRF-CMP-JFA-AAP, with some hesitations and an “Annals” naming effect. The second group is AIHP-EJP-SPA-Bernoulli, and the third group is ALEA-ECP-AdAP-JAP-JTP-ESAIM. We observe some transitions from one group to another: for instance, since 2010, AAP moved to the first group while ALEA moved to the third group. The case of ECP is very special since its papers are half the standard size. The MCQ probably underestimates the social value of ECP by a factor of roughly 2, which is logical if we compare with EJP.

Yes, there are more robust ways to measure the social value of a journal, such as for instance the (recursive) eigenfactor, and it could be interesting to check if the three groups are stable!

Let $X=(X_1,\ldots,X_n)$ be a random vector of $(\mathbb{R}^d)^n$ with density proportional to $$(x_1,\ldots,x_n)\in(\mathbb{R}^d)^n\mapsto\mathrm{e}^{-\beta\sum_{i=1}^nV(x_i)}\prod_{i<j}W(x_i-x_j),$$ where $\beta>0$ and $V,W:\mathbb{R}^d\to\mathbb{R}$ are homogeneous functions, with $W\geq0$. This means that there exist $a>0$ and $b\geq0$ such that for all $\lambda\geq0$ and $x\in\mathbb{R}^d$, $V(\lambda x)=\lambda^a V(x)$ and $W(\lambda x)=\lambda^bW(x)$. Now, for all $\theta>0$, by the change of variable $x_i=\sqrt[a]{\beta/(\theta+\beta)}y_i$, for which the volume element contributes the exponent $nd/a$ and each of the $n(n-1)/2$ factors $W(x_i-x_j)$ contributes the exponent $b/a$, \begin{multline*} \int_{(\mathbb{R}^d)^n}\mathrm{e}^{-(\theta+\beta)\sum_iV(x_i)}\prod_{i<j}W(x_i-x_j)\mathrm{d}x\\ =\Bigl(\frac{\beta}{\theta+\beta}\Bigr)^{\frac{nd}{a}+\frac{n(n-1)b}{2a}} \int_{(\mathbb{R}^d)^n}\mathrm{e}^{-\beta\sum_iV(y_i)}\prod_{i<j}W(y_i-y_j)\mathrm{d}y. \end{multline*} The integral in the right-hand side is the normalizing constant of the density of $X$, so dividing both sides by it gives the Laplace transform $\mathbb{E}(\mathrm{e}^{-\theta\sum_iV(X_i)})$. We recognize the Laplace transform of a Gamma distribution, since \[ \int_0^\infty\mathrm{e}^{-\theta u}u^{\alpha-1}\mathrm{e}^{-\beta u}\mathrm{d}u =\int_0^\infty u^{\alpha-1}\mathrm{e}^{-(\theta+\beta)u}\mathrm{d}u =\Bigl(\frac{\beta}{\theta+\beta}\Bigr)^\alpha\frac{\Gamma(\alpha)}{\beta^\alpha}, \]and we obtain \[ \sum_iV(X_i)\sim\mathrm{Gamma}\Bigl(\frac{nd}{a}+\frac{n(n-1)b}{2a},\beta\Bigr). \] A remarkable general fact! The case $V=\frac{1}{2}\left|\cdot\right|^2$ and $W=\left|\cdot\right|^\beta$ corresponds to the beta Ginibre gas of random matrix theory. The case $V=\frac{n+1}{2}\log(1+\left|\cdot\right|^2)$ and $W=\left|\cdot\right|^2$ corresponds to the Forrester–Krishnapur spherical gas of random matrix theory.
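As a sanity check of the scaling argument, here is a small sketch in Python. It computes the Gamma shape parameter by counting exponents in the change of variable: $d/a$ per particle from the volume element and $b/a$ per pair from the homogeneity of $W$. It then tests the simplest case by Monte Carlo, namely $W\equiv1$ (so $b=0$) and $V=\frac{1}{2}\left|\cdot\right|^2$ (so $a=2$), for which the coordinates are independent centered Gaussians of variance $1/\beta$. The function names are hypothetical, and the simulation covers only this non-interacting case, not a genuine gas.

```python
import random


def gamma_shape(n, d, a, b):
    """Shape parameter from the scaling argument.

    The volume element contributes d/a per particle, and the homogeneity
    of W contributes b/a for each of the n(n-1)/2 pairs.
    """
    return n * d / a + n * (n - 1) * b / (2 * a)


def mean_energy_free_gaussian(n, d, beta, samples=20000, seed=0):
    """Monte Carlo mean of sum_i V(X_i) for V = |.|^2 / 2 and W = 1.

    In this case the density is proportional to exp(-beta * sum |x_i|^2 / 2),
    so the n*d coordinates are independent N(0, 1/beta).
    """
    rng = random.Random(seed)
    sigma = beta ** -0.5
    total = 0.0
    for _ in range(samples):
        total += sum(0.5 * rng.gauss(0.0, sigma) ** 2 for _ in range(n * d))
    return total / samples


if __name__ == "__main__":
    n, d, beta = 3, 2, 2.0
    shape = gamma_shape(n, d, a=2, b=0)
    # A Gamma(shape, beta) law (beta being the rate) has mean shape / beta.
    print("predicted mean:", shape / beta)
    print("simulated mean:", mean_energy_free_gaussian(n, d, beta))
```

For $d=2$, $a=2$, $b=2$ the same function returns $n(n+1)/2$ as the shape parameter.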

We could generalize even more, and replace $(x_1,\ldots,x_n)\mapsto\sum_iV(x_i)$ by a homogeneous $(x_1,\ldots,x_n)\mapsto V(x_1,\ldots,x_n)$ and $(x_1,\ldots,x_n)\mapsto\prod_{i<j}W(x_i-x_j)$ by a homogeneous $(x_1,\ldots,x_n)\mapsto W(x_1,\ldots,x_n)$, in the sense that for some $a>0$, $b\geq0$ and all $\lambda\geq0$, $x\in(\mathbb{R}^d)^n$, $V(\lambda x)=\lambda^aV(x)$ and $W(\lambda x)=\lambda^bW(x)$. In this case $X=(X_1,\ldots,X_n)$ has density proportional to $x\in(\mathbb{R}^d)^n\mapsto\mathrm{e}^{-\beta V(x)}W(x)$. This would hide the structure of exchangeable gas with pair-interaction that we had in mind for the examples. But the same change of variable, in which the volume element contributes the exponent $nd/a$ and $W$ contributes the exponent $b/a$, would give $$V(X)=V(X_1,\ldots,X_n)\sim\mathrm{Gamma}\Bigl(\frac{nd+b}{a},\beta\Bigr).$$

Suppose that we would like to describe mathematically the convergence of a sequence ${(X_n)}_n$ of random variables towards a limiting random variable $X_\infty$, as $n\to\infty$. We have to select a notion of convergence. If we decide to use almost sure convergence, we need to define all the $X_n$’s as well as the limit $X_\infty$ on a common probability space in order to give a meaning to $$\mathbb{P}(\lim_{n\to\infty}X_n=X_\infty)=1.$$ This means that we need to couple the random variables. If we decide to use convergence in probability or in $L^p$, we have to define, for all $n$, both $X_n$ and $X_\infty$ on the same probability space in order to give a meaning to $\mathbb{P}(|X_n-X_\infty|>\varepsilon)$ and $\mathbb{E}(|X_n-X_\infty|^p)$ respectively, and therefore we end up defining all the $X_n$’s as well as $X_\infty$ on a common probability space. However, if we decide to use convergence in law (i.e. in distribution), then we do not need at all to define the random variables on a common probability space.

In the special case where $X_\infty$ is deterministic, convergence in probability or in $L^p$ no longer requires defining the random variables on the same probability space. However, almost sure convergence still requires a common probability space. Moreover, if we impose that the almost sure convergence holds regardless of the way we define the random variables on the same probability space (i.e. for arbitrary couplings), then we end up with the important notion of complete convergence, which is equivalent, thanks to the Borel-Cantelli lemmas, to a summable convergence in probability. Note that when the limit is deterministic, we also know that convergence in law is equivalent to convergence in probability. Moreover, we know in general from the Borel-Cantelli lemma that a summable convergence in probability implies almost sure convergence. Furthermore, the convergence in probability becomes easily summable under moment conditions.
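To make this concrete, here is the usual way to write it down for a deterministic limit, denoted $c$ here only for illustration. The sequence ${(X_n)}_n$ converges completely to $c$ when, for all $\varepsilon>0$, $$\sum_{n}\mathbb{P}(|X_n-c|>\varepsilon)<\infty.$$ This condition involves only the marginal laws of the $X_n$’s, so it does not depend on the coupling. By the first Borel-Cantelli lemma, for every $\varepsilon>0$, almost surely only finitely many of the events $\{|X_n-c|>\varepsilon\}$ occur, and taking $\varepsilon=1/k$ for all integers $k\geq1$ gives $X_n\to c$ almost surely, whatever the coupling. Conversely, if the $X_n$’s are coupled independently, the second Borel-Cantelli lemma shows that almost sure convergence forces the series to converge. As for moments, the Markov inequality gives, for all $p>0$, $$\mathbb{P}(|X_n-c|>\varepsilon)\leq\frac{\mathbb{E}(|X_n-c|^p)}{\varepsilon^p},$$ so that complete convergence holds as soon as $\sum_n\mathbb{E}(|X_n-c|^p)<\infty$ for some $p>0$.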

Following Hsu & Robbins, if we consider $X_n=\frac{1}{n}(Z_1+\cdots+Z_n)$ where $Z_1,\ldots,Z_n$ are independent copies of some $Z$ of mean $m$, then the sequence ${(X_n)}_n$ converges completely towards $m$ as soon as $Z$ has a finite second moment, and this condition is almost necessary. This sheds an interesting light on the law of large numbers for triangular arrays.

Some people refuse to consider almost sure convergence as a true mode of convergence, in the sense that it is not associated with a metric, contrary to the other modes of convergence. In some sense, it appears as a critical notion in the law of large numbers, when we lower the concentration, typically via integrability (moment conditions). Of course there are plenty of concrete situations, for instance with martingales, in which the coupling is in fact imposed and for which almost sure convergence towards a non-constant random variable holds very naturally. Famous examples are Pólya urns and Galton-Watson branching processes. The Marchenko-Pastur theorem in random matrix theory provides an example of natural coupling with a limiting object which is deterministic, and the convergence is complete via concentration of measure provided that the ingredients have enough finite moments.

Note. The idea of writing this tiny post came from a discussion with my friend Adrien Hardy.

Recently, during a coffee break, a discussion emerged about the presence of probability and statistics in top journals such as Annals of Mathematics, Acta Mathematica, Inventiones Mathematicae, or Journal of the AMS. Well, the question has an interest from the point of view of the sociology and history of science. Let us use the Primary and Secondary Mathematical Subject Classification (MSC) codes of each article in order to detect Probability (60x) or Statistics (62x). Here is the data from MathSciNet/zbMath:

Annals of Mathematics published 4464 papers in total from 1938 to 2019. Among them, 76 (1.7%) have Primary MSC 60x, and 112 (2.5%) have Primary or Secondary MSC 60x. Moreover, only 2 have Primary or Secondary MSC 62x.

Acta Mathematica published 1297 papers in total from 1938 to 2017. Among them, 44 (3.4%) have Primary MSC 60x, and 63 (4.9%) have Primary or Secondary MSC 60x. Moreover, only 4 have Primary or Secondary MSC 62x.

Inventiones Mathematicae published 4311 papers in total from 1966 to 2019. Among them, 52 (1.2%) have Primary MSC 60x, and 95 (2.2%) have Primary or Secondary MSC 60x. Moreover, only 2 have Primary or Secondary MSC 62x.

Journal of the AMS published 963 papers in total from 1988 to 2019. Among them, 28 (2.9%) have Primary MSC 60x, and 49 (5.1%) have Primary or Secondary MSC 60x. Moreover, only 5 have Primary or Secondary MSC 62x.
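The detection rule and the percentages above are easy to reproduce. Here is a rough sketch, assuming that each article comes with one primary MSC code and a list of secondary codes given as strings such as `"60F05"`. The function names and this record layout are hypothetical, and the counts in `COUNTS` are simply the ones quoted above.

```python
def has_msc(primary, secondary, prefix, include_secondary=True):
    """Tell whether an article carries an MSC code starting with `prefix`.

    `primary` is a string such as "60F05" and `secondary` a list of such
    strings (an assumed layout, not an actual MathSciNet export format).
    """
    codes = [primary]
    if include_secondary:
        codes.extend(secondary)
    return any(code.startswith(prefix) for code in codes)


def share(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# journal: (total papers, primary 60x, primary or secondary 60x)
COUNTS = {
    "Annals of Mathematics": (4464, 76, 112),
    "Acta Mathematica": (1297, 44, 63),
    "Inventiones Mathematicae": (4311, 52, 95),
    "Journal of the AMS": (963, 28, 49),
}

if __name__ == "__main__":
    for journal, (total, primary, any_code) in COUNTS.items():
        print(journal, share(primary, total), share(any_code, total))
```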

The presence of probability is low, while the one of statistics is microscopic. A scandal.

AO(P|S). Annals of Probability (AOP) and Annals of Statistics (AOS) were founded only in 1973.

1938. Annals of Mathematics is historically American whereas Acta Mathematica is European. They started respectively in 1892 and 1882. According to MathSciNet, it seems that the first article classified 60x in these journals was published in 1938. The MSC itself was introduced at the end of the thirties, and many articles published before 1940 are not classified in MathSciNet at the time of writing. Note that N. Wiener published in the twenties while A. N. Kolmogorov published in the thirties.

Why. The phenomenon probably has multiple explanations. Among them we could mention for instance the possible effects of utilitarianism and anti-utilitarianism in the mathematical elite, in particular during the fifties and sixties, and the possible overweight of some kind of “snobbish pure mathematics or mathematicians” in the boards of top journals. We could also see AOP and AOS as some sort of mathematical ghettos and think about self-censorship. We could moreover think about generational effects. Finally we have to keep in mind that some probability papers were published without any primary or secondary 60x code, such as for instance this one or that one.

Here is some additional data provided by MathSciNet for Annals of Mathematics:

Graphics for Annals of Mathematics.

Graphics for Acta Mathematica.

Graphics for Inventiones Mathematicae.

Graphics for Journal of the AMS.

JMPA. We could think that a journal such as Journal de mathématiques pures et appliquées, founded in 1872, is at the same time relatively prestigious, generalist, and more open to applied mathematics in general and to probability and statistics in particular. Here is the data for all MSC codes, taken from MathSciNet. We see an obvious overweight for partial differential equations. Meanwhile, the situation of probability is better than before, while the presence of statistics is still microscopic.

CPAM. Finally, here is the same data for Communications on Pure and Applied Mathematics. This journal, established in 1948, is truly open to applied mathematics in general and to probability theory in particular. However, the presence of statistics is still extremely low.

| MSC | Description | Count |
|-----|-------------|-------|
| 35 | Partial differential equations | 898 |
| 76 | Fluid mechanics | 234 |
| 58 | Global analysis, analysis on manifolds | 182 |
| 60 | Probability theory and stochastic processes | 177 |
| 53 | Differential geometry | 97 |
| 65 | Numerical analysis | 92 |
| 82 | Statistical mechanics, structure of matter | 92 |
| 34 | Ordinary differential equations | 85 |
| 47 | Operator theory | 65 |
| 49 | Calculus of variations and optimal control; optimization | 64 |
| 37 | Dynamical systems and ergodic theory | 58 |
| 78 | Optics, electromagnetic theory | 58 |
| 20 | Group theory and generalizations | 49 |
| 46 | Functional analysis | 48 |
| 81 | Quantum theory | 43 |
| 10 | Number theory | 39 |
| 73 | Mechanics of solids | 37 |
| 30 | Functions of a complex variable | 29 |
| 36 | Other | 25 |
| 32 | Several complex variables and analytic spaces | 24 |
| 57 | Manifolds and cell complexes | 24 |
| 11 | Number theory | 23 |
| 74 | Mechanics of deformable solids | 23 |
| 94 | Information and communication, circuits | 22 |
| 42 | Harmonic analysis on Euclidean spaces | 20 |
| 03 | Mathematical logic and foundations | 15 |
| 31 | Potential theory | 15 |
| 45 | Integral equations | 15 |
| 62 | Statistics | 15 |
| 14 | Algebraic geometry | 14 |
| 55 | Algebraic topology | 14 |
| 70 | Mechanics of particles and systems | 14 |
| 83 | Relativity and gravitational theory | 14 |
| 92 | Biology and other natural sciences | 14 |
| 01 | History and biography | 13 |
| 15 | Linear and multilinear algebra; matrix theory | 13 |
| 52 | Convex and discrete geometry | 13 |
| 44 | Integral transforms, operational calculus | 12 |
| 22 | Topological groups, Lie groups | 11 |
| 26 | Real functions | 9 |
| 80 | Classical thermodynamics, heat transfer | 9 |
| 85 | Astronomy and astrophysics | 8 |
| 86 | Geophysics | 8 |
| 05 | Combinatorics | 7 |
| 43 | Abstract harmonic analysis | 7 |
| 12 | Field theory and polynomials | 6 |
| 28 | Measure and integration | 6 |
| 90 | Operations research, mathematical programming | 6 |
| 00 | General | 5 |
| 39 | Difference and functional equations | 5 |
| 68 | Computer science | 5 |
| 93 | Systems theory; control | 5 |
| 41 | Approximations and expansions | 4 |

91

Game theory, economics, social and behavioral sciences