Normal distribution
The normal or Gaussian distribution is a ubiquitous and extremely important probability distribution in statistics. It is actually a family of distributions of the same general form, differing only in their location and scale parameters, commonly called the mean and the standard deviation. The standard normal distribution is the normal distribution with a mean of zero and a standard deviation of one.
Probability density function
The probability density function of the normal distribution with mean μ and standard deviation σ (or variance σ²) is
- p(x) = (1/(√(2π) σ)) exp(−½((x−μ)/σ)²)
(see exponential function and pi). If a random variable X follows this distribution, we write X ~ N(μ, σ²). If μ = 0 and σ = 1, we speak of the standard normal distribution.
The graph of the probability density function of the standard normal distribution is symmetric about its mean value, and its shape resembles a bell, which has led to it being called the bell curve. About 68% of the area under the curve lies within one standard deviation of the mean, 95.5% within two standard deviations, and 99.7% within three standard deviations (the "68 - 95.5 - 99.7 rule"). The inflection points of the curve occur one standard deviation away from the mean. These statements are also true for non-standard normal distributions.
Standardizing Gaussian random variables
If X is a Gaussian random variable with mean μ and variance σ², then
- Z = (X − μ)/σ
is a standard normal random variable: Z ~ N(0, 1). Conversely, if Z is a standard normal random variable, then
- X = σZ + μ
is a Gaussian random variable with mean μ and variance σ².
The standard normal distribution has been tabulated, and every other normal distribution is a simple transformation of the standard one, as described above. Therefore, if one knows the mean and the standard deviation of a normal distribution, one can use such a table to answer all questions about that distribution.
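In software, one typically evaluates the standard normal cumulative distribution function directly rather than consulting a table. Below is a minimal Python sketch using the error function from the standard library; the function names and the example parameters (μ = 100, σ = 15) are illustrative choices, not a fixed API.

```python
from math import erf, sqrt

def standard_normal_cdf(z: float) -> float:
    """Phi(z): P(Z <= z) for Z ~ N(0, 1), expressed via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), by standardizing Z = (X - mu)/sigma."""
    return standard_normal_cdf((x - mu) / sigma)

# Example: for X ~ N(100, 15^2), the probability of a value within one
# standard deviation of the mean is about 0.6827, matching the rule above.
p = normal_cdf(115, 100, 15) - normal_cdf(85, 100, 15)
print(round(p, 4))  # 0.6827
```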
Occurrence
Approximately normal distributions occur in many situations, as a result of the central limit theorem. Simply stated, this theorem says that the sum of a large number of small, independent random variables is approximately normally distributed. Therefore, whenever there is reason to suspect the presence of a large number of small effects acting additively, it is reasonable to assume that observations will be approximately normal. The IQ score of an individual, for example, can be seen as the result of many small additive influences: many genes and many environmental factors all play a role.
- IQ scores and other ability scores are approximately normally distributed. For most IQ tests, the mean is 100 and the standard deviation is 15.
- the heights of adult specimens of an animal or plant species are approximately normally distributed, unless there is sexual dimorphism; the size of newborn children is approximately normal (but not weight, see below)
- Repeated measurements of the same quantity usually yield results which are approximately normally distributed (many little effects contribute additively to the measurement error). This is, in any case, the central assumption of the mathematical theory of errors.
- A binomial distribution with parameters n and p is approximately normal if n is big enough (the approximation is very good if both np and n(1 − p) are at least 5). The approximating normal distribution has mean μ = np and standard deviation σ = √(np(1 − p)); see the simulation sketch after this list.
- For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation σ = √(p(1 − p)/n). Large sample sizes n are good because the standard deviation shrinks, allowing a more precise estimate of the unknown parameter p.
- A Poisson distribution with parameter λ is approximately normal if λ is large enough (λ > 10 usually suffices). The approximating normal distribution has mean μ = λ and standard deviation σ = √λ.
- For instance, the number of edits per hour recorded on Wikipedia's Recent Changes page is approximately normal.
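The normal approximation to the binomial can be seen at work by simulating binomial draws and comparing empirical frequencies with the normal prediction. A minimal Python sketch (the values n = 100 and p = 0.3 are arbitrary illustrations):

```python
import random
from math import erf, sqrt

n, p = 100, 0.3  # arbitrary illustration values
mu, sigma = n * p, sqrt(n * p * (1 - p))

# Simulate many binomial draws; each draw counts successes in n trials.
draws = [sum(random.random() < p for _ in range(n)) for _ in range(20_000)]

# Compare the empirical probability of landing within one sigma of the mean
# with the normal prediction of roughly 0.68.
within = sum(mu - sigma <= d <= mu + sigma for d in draws) / len(draws)
phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
print(within, phi(1) - phi(-1))  # both close to 0.68
```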
It is important to realize, however, that small effects often act as multiplicative (rather than additive) increases. In that case, the assumption of normality is not justified, and it is the logarithm of the variable of interest that is normally distributed. The distribution of the directly observed variable is then called log-normal. Good examples of this behaviour are financial indicators such as interest rates or stock values. Also, in biology it has been observed that organism growth sometimes proceeds by multiplicative rather than additive increments, implying that the distribution of body sizes should be log-normal.
Other examples of variables that are not normally distributed:
- weight of humans: weight is approximately proportional to the third power of height, and the third power of a normally distributed variable is not normally distributed.
- blood pressure or height of adult humans: these are interesting cases which at first appear to yield normal distributions. In reality, they are mixtures of two normal variables: blood pressure (height) of males (which is normally distributed) and blood pressure (height) of females (which is also normally distributed, but with different mean). Generally, if there is a single characteristic (like sex) which has a large influence on the measured variable, the assumption of normality is not justified.
- lifetimes of humans or technical devices
Further properties
If X ~ N(μ, σ²) and a and b are real numbers, then aX + b ~ N(aμ + b, (aσ)²).
If X₁ ~ N(μ₁, σ₁²) and X₂ ~ N(μ₂, σ₂²), and X₁ and X₂ are independent, then X₁ + X₂ ~ N(μ₁ + μ₂, σ₁² + σ₂²).
If X₁, ..., Xₙ are independent standard normal variables, then X₁² + ... + Xₙ² follows a chi-squared distribution with n degrees of freedom.
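These closure properties are easy to check empirically. A minimal Monte Carlo sketch in Python (all parameter values are arbitrary illustrations):

```python
import random
import statistics

N = 200_000  # number of simulated draws

# aX + b for X ~ N(mu, sigma^2) should have mean a*mu + b and st. dev. |a|*sigma.
a, b, mu, sigma = 2.0, 1.0, 3.0, 0.5  # arbitrary illustration values
y = [a * random.gauss(mu, sigma) + b for _ in range(N)]
print(statistics.mean(y), statistics.stdev(y))  # ~7.0 and ~1.0

# Sum of independent normals: means add, and variances add.
s = [random.gauss(1.0, 2.0) + random.gauss(-1.0, 1.0) for _ in range(N)]
print(statistics.mean(s), statistics.variance(s))  # ~0.0 and ~5.0

# Sum of squares of n standard normals: chi-squared with mean n.
n = 4
c = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(N)]
print(statistics.mean(c))  # ~4.0
```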
Characteristic function
The characteristic function of a Gaussian random variable X ~ N(μ, σ²) is defined as the expected value of e^(izX) and can be written as
- φX(z) = E[e^(izX)] = ∫ (1/(√(2π) σ)) e^(−½((x−μ)/σ)²) e^(izx) dx = e^(iμz − ½σ²z²),
as can be seen by completing the square in the exponent.
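For completeness, the completing-the-square step can be sketched as follows, substituting x = μ + σt (the shift of the integration contour by iσz is standard and stated here without proof):

```latex
\varphi_X(z)
= e^{i\mu z}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,
  e^{-\frac{1}{2}t^2 + i\sigma z t}\,dt
= e^{i\mu z - \frac{1}{2}\sigma^2 z^2}
  \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,
  e^{-\frac{1}{2}(t - i\sigma z)^2}\,dt
= e^{i\mu z - \frac{1}{2}\sigma^2 z^2},
```

since the last integral equals 1.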
Generating Gaussian random variables
For computer simulations, it is often necessary to generate values that follow a Gaussian distribution. This is commonly done with the Box-Muller transform, which turns two independent values uniformly distributed on (0, 1) into two independent standard normal values; the uniform inputs can easily be generated by the computer's pseudorandom number generator.
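A minimal Python sketch of the basic (trigonometric) form of the transform; the function name and the example parameters are illustrative:

```python
import math
import random

def box_muller() -> tuple[float, float]:
    """Return two independent N(0, 1) samples from two Uniform(0, 1) inputs."""
    u1 = random.random()
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(1.0 - u1))  # 1 - u1 avoids log(0)
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

# Scale and shift to get X ~ N(mu, sigma^2), as in the standardization section.
z1, z2 = box_muller()
mu, sigma = 100.0, 15.0  # arbitrary illustration values
print(sigma * z1 + mu, sigma * z2 + mu)
```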
History
The normal distribution was first introduced by de Moivre in the second edition (1718) of his Doctrine of Chances, in the context of approximations of large binomial coefficients. His result was extended by Laplace in his book Analytical Theory of Probabilities (1812), and is now called the Theorem of de Moivre-Laplace. Around that time the analysis of errors of experiments was pioneered by Laplace, Legendre and Gauss. The distribution appearing in the theorem of de Moivre-Laplace was called Gaussian as a result of Gauss' work on the method of least squares, introduced by Legendre in the context of the theory of errors. Towards the end of the 19th century, Pearson established the priority of de Moivre, and Poincaré coined the name normal. This terminology is unfortunate, since it reflects and encourages the fallacy that "everything is Gaussian".
External links
- A. Kropinski's normal distribution tutorial: includes a number of standard normal distribution tables