
True variance


Inspection of the keyboard of a scientific calculator will often show a key engraved σ² and a key engraved s². The σ² key computes the true variance, sometimes also called the population variance or the biased variance. The s² key computes the unbiased variance. Enter a few numbers, and the s² key will return a value (the unbiased variance) greater than the value (the true variance) returned by the σ² key. The more numbers you enter, the smaller this difference becomes; beyond about 30 numbers it is negligible. To make the two values identical, however, you would have to enter infinitely many numbers.
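As a minimal illustration (a sketch, not part of the article's exposition), the Python standard library exposes both conventions: statistics.pvariance divides by n, while statistics.variance divides by n − 1. The data vector below is an arbitrary example:

    import statistics

    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    n = len(data)

    sigma2 = statistics.pvariance(data)  # the "σ² key": divide by n -> 4.0
    s2 = statistics.variance(data)       # the "s² key": divide by n - 1 -> ~4.571

    # s² exceeds σ² by the factor n/(n - 1), which shrinks toward 1 as n grows,
    # so the two keys agree only in the limit of infinitely many numbers.
    assert abs(s2 - sigma2 * n / (n - 1)) < 1e-12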

Definitions of the true and unbiased variances


Mathematical formulae defining the true and the unbiased variance use the Greek letter Σ, which denotes summation over all values of a variable. The variable in this context is the lowercase Latin letter x, which denotes the deviation scores. The number of values of the variable X is signified by n. The values of the variable X are the obtained values, sometimes also called the obtained scores, i.e., values of the variable X obtained by quantifying properties of some entity or attribute. The deviation values, also called the deviation scores (values that deviate from the mean), are obtained from the obtained scores X by subtracting the arithmetic mean, M, from every value of X; i.e., x = X − M. The convention of signifying the deviation scores by a lowercase letter reflects the notion that the obtained scores, signified by capital letters, are 'diminished' in size by subtraction of the arithmetic mean. The true variance is defined as

\sigma^2 = \frac{\sum x^2}{n} = \frac{\sum (X - M)^2}{n}
and the unbiased variance is defined as

s^2 = \frac{\sum x^2}{n - 1} = \frac{\sum (X - M)^2}{n - 1}
where the expression n − 1 signifies the number of degrees of freedom, sometimes also signified by the Greek character ν (nu). For instance, a variable X ∈ { 1, 2, 3, 4, 5 } (obtained scores) can be transformed into the vector of deviation scores x = [ −2, −1, 0, 1, 2 ] by subtracting the mean of the variable X (3.0). The squared deviation scores are summed (10.0) and divided by n (5) to obtain the true variance (2.0), or divided by n − 1 (4) to obtain the unbiased variance (2.5).
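A short Python sketch of this worked example (the names M, x, and ss are illustrative, not from the article):

    X = [1, 2, 3, 4, 5]               # obtained scores
    n = len(X)
    M = sum(X) / n                    # arithmetic mean: 3.0

    x = [Xi - M for Xi in X]          # deviation scores: [-2, -1, 0, 1, 2]
    ss = sum(xi ** 2 for xi in x)     # sum of squared deviation scores: 10.0

    true_variance = ss / n            # 10.0 / 5 = 2.0
    unbiased_variance = ss / (n - 1)  # 10.0 / 4 = 2.5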

Changing true variance to unbiased variance and vice versa

The true variance can easily be changed to the unbiased variance, as

s^2 = \sigma^2 \cdot \frac{n}{n - 1} ,

and the unbiased variance to the true variance, as

\sigma^2 = s^2 \cdot \frac{n - 1}{n} .
For the example, the true variance (2.0) can be changed to the unbiased variance as 2.0 × (5/4) = 2.5, and the unbiased variance (2.5) back to the true variance as 2.5 × (4/5) = 2.0.
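The same conversion as a Python sketch (variable names are illustrative):

    n = 5
    true_var = 2.0

    unbiased_var = true_var * n / (n - 1)        # 2.0 * (5/4) = 2.5
    recovered_true = unbiased_var * (n - 1) / n  # 2.5 * (4/5) = 2.0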

True variance and all possible differences between values of a variable

Using all possible differences between values of a variable as a foundation of statistical theory was contemplated by Kendall (1943, p. 47), who defined a coefficient, called here u², as

u^2 = \iint (x - y)^2 \, dF(x) \, dF(y) .
For the discontinuous infinite case, the above equation can be written as

u^2 = \sum_{i} \sum_{j} (x_i - x_j)^2 f_i f_j ,

where f_i denotes the relative frequency of the value x_i,
and for the finite case as

u^2 = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_j)^2 ,
where the summed term in the above equation runs over all possible differences between elements of the variable x. Pointing out that the value of the u² coefficient depends on the spread of the variate-values among themselves and not on the deviations from some central value, Kendall (1943, p. 47) shows that u² = 2σ², concludes that the initial defining formula is nothing but twice the variance, and abandons the idea. One can only wonder which direction statistics could have taken if Kendall had realized that matrices of differences between all values of a variable are not just another way to compute variance, but also reflect the information content of the variable and the hierarchical relationships between its elements.
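The identity u² = 2σ² is easy to verify numerically. The following Python sketch (not from Kendall; the data vector reuses the deviation-score example above) averages the squared differences over all ordered pairs, including i = j:

    x = [-2.0, -1.0, 0.0, 1.0, 2.0]
    n = len(x)

    mean = sum(v for v in x) / n
    sigma2 = sum((v - mean) ** 2 for v in x) / n           # true variance: 2.0

    # Mean squared difference over all n² ordered pairs of values.
    u2 = sum((a - b) ** 2 for a in x for b in x) / n ** 2  # 4.0

    assert abs(u2 - 2 * sigma2) < 1e-12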

Matrices of differences

Taking the major difference of a vector results in a skew-symmetric matrix whose elements describe all possible differences between its values. For instance, consider the major difference of the vector x = [0, 1, 2, 3], with true variance equal to 1.25,

\begin{bmatrix}
0 & -1 & -2 & -3 \\
1 & 0 & -1 & -2 \\
2 & 1 & 0 & -1 \\
3 & 2 & 1 & 0
\end{bmatrix}
Binarization of the vector x into its adjacent implicational matrix, shown below

and subtraction of the transpose of this binarized implicational matrix from itself (cf., matrix subtraction)

results in the same skew-symmetric matrix as the major difference of the vector x. This matrix can be triangularized,

\begin{bmatrix}
0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
2 & 1 & 0 & 0 \\
3 & 2 & 1 & 0
\end{bmatrix} ,

into a skew-asymmetric matrix. This matrix contains not only the information necessary to compute the true variance, defined as the sum of the squared elements of a skew-asymmetric matrix divided by the square of its order (for the above example, 1² + 2² + 1² + 3² + 2² + 1² = 20 and 20/4² = 1.25), but also information about the number of bits contained in the data. Conceptualizing this matrix as the adjacency matrix of an ordered graph facilitates the visualization of the hierarchical relationships among the elements of the data matrices.
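A Python sketch of this computation (the names D, T, and ss are illustrative, not from the article):

    x = [0, 1, 2, 3]
    n = len(x)

    # Major difference: skew-symmetric matrix of all pairwise differences.
    D = [[xi - xj for xj in x] for xi in x]

    # Triangularization: keep the lower triangle, zero the rest.
    T = [[D[i][j] if i > j else 0 for j in range(n)] for i in range(n)]

    # Sum of squared triangular elements divided by the squared order.
    ss = sum(T[i][j] ** 2 for i in range(n) for j in range(n))  # 20
    true_variance = ss / n ** 2                                 # 20 / 16 = 1.25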

Conventional language of computation

In statistics, the term true variance is often used to refer to the unobservable variance of a whole population, as distinguished from an observable statistic based on a sample. Suppose a number, such as a person's height or income or age or cholesterol level, is assigned to every member of a population of n individuals. Let xᵢ be the number assigned to the ith individual, for i = 1, ..., n. Then the variance is

\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \qquad (1)
where

\mu = \frac{1}{n} \sum_{i=1}^{n} x_i
is the population mean. If xᵢ were the ith member of a random sample rather than of the whole population, then one sometimes uses the same function seen in (1) above as an estimate of the "true variance" or "population variance" σ². But sometimes one replaces n with n − 1 or n + 1, or otherwise alters the expression (1), in order to estimate σ². In particular, using n − 1 makes the estimator unbiased, and in some commonly considered contexts, using n + 1 minimizes the mean squared error of estimation.
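A Monte Carlo sketch of these two claims for normally distributed data (the sample size, variance, and trial count are arbitrary choices, not from the text):

    import random

    random.seed(0)
    sigma2, n, trials = 4.0, 10, 100_000

    estimates = {n - 1: [], n: [], n + 1: []}
    for _ in range(trials):
        sample = [random.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
        m = sum(sample) / n
        ss = sum((v - m) ** 2 for v in sample)
        for d in estimates:                # try each candidate denominator
            estimates[d].append(ss / d)

    for d, est in estimates.items():
        bias = sum(est) / trials - sigma2  # ~0 only for denominator n - 1
        mse = sum((e - sigma2) ** 2 for e in est) / trials  # smallest for n + 1
        print(f"denominator {d}: bias = {bias:+.3f}, MSE = {mse:.3f}")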

Statisticians do not normally use the Greek letters μ and σ for estimates based on samples, but only for the (often) unobservable characteristics of whole populations. Because the "true" or "population" variance uses the denominator n rather than n − 1, it is conventional among those concerned with computation sometimes to call the expression (1), with denominator n, the "true variance", without regard to whether it is an estimate or a characteristic of a whole population or a random sample.

References

Kendall, M. G. (1943). The Advanced Theory of Statistics. London: Griffin. See also Stuart, A., & Ord, J. K. (1987). Kendall's Advanced Theory of Statistics (5th ed.). London: Griffin.
