Empirical measure

The motivation for studying empirical measures is that it is often impossible to know the true underlying probability measure $P$. We collect observations $X_{1},X_{2},\dots ,X_{n}$ and compute relative frequencies. We can then estimate $P$, or a related distribution function $F$, by means of the empirical measure or the empirical distribution function, respectively. Under certain conditions these estimates are uniformly good, and theorems in the area of empirical processes provide rates for this convergence.

Definition

Let $X_{1},X_{2},\dots$ be a sequence of independent identically distributed random variables with values in the state space S with probability measure P.

Definition

The empirical measure $P_{n}$ is defined for measurable subsets of S and given by

P_{n}(A)={\frac {1}{n}}\sum _{i=1}^{n}I_{A}(X_{i})={\frac {1}{n}}\sum _{i=1}^{n}\delta _{X_{i}}(A)

where $I_{A}$ is the indicator function and $\delta _{X}$ is the Dirac measure.

For a fixed measurable set A, $nP_{n}(A)$ is a binomial random variable with mean $nP(A)$ and variance $nP(A)(1-P(A))$.
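The definition can be illustrated with a minimal Python sketch. The function name `empirical_measure`, the sample values, and the choice of the set A = [0, 1] are assumptions made for illustration only; the set A is represented by its indicator function.

```python
from typing import Callable, Sequence


def empirical_measure(sample: Sequence[float],
                      indicator: Callable[[float], bool]) -> float:
    """Return P_n(A) = (1/n) * sum_i I_A(X_i), with A given by `indicator`."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    # Count the observations that fall in A and divide by the sample size.
    return sum(1 for x in sample if indicator(x)) / n


# Hypothetical observations X_1, ..., X_5 (illustrative values only).
sample = [0.2, 1.5, -0.7, 3.1, 0.9]

# A = [0, 1], described by its indicator function.
in_unit_interval = lambda x: 0.0 <= x <= 1.0

p_n = empirical_measure(sample, in_unit_interval)
print(p_n)
```

Here $nP_{n}(A)$ is simply the number of observations that land in A, which is why it is binomial for iid data.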

Definition

${\bigl (}P_{n}(c){\bigr )}_{c\in {\mathcal {C}}}$ is called the empirical measure indexed by ${\mathcal {C}}$, a collection of measurable subsets of S.

To generalize this notion further, observe that the empirical measure $P_{n}$ maps measurable functions $f:S\to \mathbb {R}$ to their empirical mean,

f\mapsto P_{n}f=\int _{S}fdP_{n}={\frac {1}{n}}\sum _{i=1}^{n}f(X_{i})

In particular, the empirical measure of A is simply the empirical mean of the indicator function, $P_{n}(A)=P_{n}I_{A}$ .

For a fixed measurable function f, $P_{n}f$ is a random variable with mean $\mathbb {E} f$ and variance ${\frac {1}{n}}\mathbb {E} (f-\mathbb {E} f)^{2}$ .
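As a small sketch of the functional view, the following Python snippet computes $P_{n}f$ for a function f and recovers $P_{n}(A)$ by plugging in an indicator function. The name `empirical_mean`, the sample, and the choices f(x) = x² and A = (-∞, 2] are illustrative assumptions.

```python
from typing import Callable, Sequence


def empirical_mean(sample: Sequence[float],
                   f: Callable[[float], float]) -> float:
    """Return P_n f = (1/n) * sum_i f(X_i)."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")
    return sum(f(x) for x in sample) / n


# Hypothetical observations (illustrative values only).
sample = [1.0, 2.0, 3.0, 4.0]

# Empirical mean of f(x) = x^2: (1 + 4 + 9 + 16) / 4.
mean_of_square = empirical_mean(sample, lambda x: x * x)

# The empirical measure of A = (-inf, 2] is the empirical mean of I_A.
indicator_A = lambda x: 1.0 if x <= 2.0 else 0.0
measure_of_A = empirical_mean(sample, indicator_A)

print(mean_of_square, measure_of_A)
```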

By the strong law of large numbers, $P_{n}(A)$ converges to P(A) almost surely for fixed A. Similarly, $P_{n}f$ converges to $\mathbb {E} f$ almost surely for a fixed measurable function f. The problem of uniform convergence of $P_{n}$ to P was open until Vapnik and Chervonenkis solved it in 1968.
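For a fixed f with $\mathbb {E} f^{2}<\infty$ , the variance of $P_{n}f$ also yields a rough quantitative bound through Chebyshev's inequality. This is only a sketch added for illustration: it gives convergence in probability, which is weaker than the almost sure statement. For every $\varepsilon >0$ ,

\mathbb {P} {\bigl (}|P_{n}f-\mathbb {E} f|\geq \varepsilon {\bigr )}\leq {\frac {\mathbb {E} (f-\mathbb {E} f)^{2}}{n\varepsilon ^{2}}}\to 0\quad {\text{as }}n\to \infty .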

If the class ${\mathcal {C}}$ (or ${\mathcal {F}}$ ) is Glivenko-Cantelli with respect to P, then $P_{n}$ converges to P uniformly over $c\in {\mathcal {C}}$ (or $f\in {\mathcal {F}}$ ), that is, with probability 1 we have

\|P_{n}-P\|_{\mathcal {C}}=\sup _{c\in {\mathcal {C}}}|P_{n}(c)-P(c)|\to 0,

\|P_{n}-P\|_{\mathcal {F}}=\sup _{f\in {\mathcal {F}}}|P_{n}f-\mathbb {E} f|\to 0.

Empirical distribution function

The empirical distribution function provides an example of an empirical measure. For real-valued iid random variables $X_{1},\dots ,X_{n}$ it is given by

F_{n}(x)=P_{n}((-\infty ,x])=P_{n}I_{(-\infty ,x]}.

In this case, the empirical measures are indexed by the class ${\mathcal {C}}=\{(-\infty ,x]:x\in \mathbb {R} \}.$ It has been shown that ${\mathcal {C}}$ is a uniformly Glivenko-Cantelli class; in particular,

\sup _{F}\|F_{n}(x)-F(x)\|_{\infty }\to 0

with probability 1.
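The empirical distribution function and its uniform distance to F can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names `ecdf`, `sup_distance`, and `uniform_cdf` are hypothetical, F is taken to be the uniform distribution on [0, 1], and the supremum formula used here assumes F is continuous.

```python
import bisect
import random
from typing import Callable, Sequence


def ecdf(sample: Sequence[float]) -> Callable[[float], float]:
    """Return F_n, where F_n(x) is the fraction of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must be non-empty")
    # bisect_right counts the observations that are <= x.
    return lambda x: bisect.bisect_right(ordered, x) / n


def sup_distance(sample: Sequence[float],
                 cdf: Callable[[float], float]) -> float:
    """Return sup_x |F_n(x) - F(x)| for a continuous distribution function F.

    F_n is a step function, so the supremum is approached at an order
    statistic or immediately to its left.
    """
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must be non-empty")
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n)
        for i, x in enumerate(ordered, start=1)
    )


def uniform_cdf(x: float) -> float:
    """Distribution function of the uniform law on [0, 1] (assumed F)."""
    return min(1.0, max(0.0, x))


# Small fixed sample (illustrative values only).
fixed = [0.1, 0.4, 0.6, 0.8]
F_n = ecdf(fixed)
value_at_half = F_n(0.5)
fixed_distance = sup_distance(fixed, uniform_cdf)

# Simulated uniform samples of growing size, with a fixed seed.
rng = random.Random(0)
distances = {
    n: sup_distance([rng.random() for _ in range(n)], uniform_cdf)
    for n in (100, 10_000)
}
print(value_at_half, fixed_distance, distances)
```

In the simulation, the distance for the larger sample would typically be much smaller than for the smaller one, in line with the uniform convergence stated above; a single run illustrates this but does not prove it.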

References

P. Billingsley, Probability and Measure, John Wiley and Sons, New York, third edition, 1995.
M.D. Donsker, Justification and extension of Doob's heuristic approach to the Kolmogorov-Smirnov theorems, Annals of Mathematical Statistics, 23:277--281, 1952.
R.M. Dudley, Central limit theorems for empirical measures, Annals of Probability, 6(6): 899–929, 1978.
R.M. Dudley, Uniform Central Limit Theorems, Cambridge Studies in Advanced Mathematics, 63, Cambridge University Press, Cambridge, UK, 1999.
J. Wolfowitz, Generalization of the theorem of Glivenko-Cantelli. Annals of Mathematical Statistics, 25, 131-138, 1954.
