Lebesgue integral

In the mathematical branch of real analysis, Lebesgue integration is a means of extending the usual notion of (Riemann-)integration to more functions and to more general settings.

Introduction

Integration is the mathematical operation which corresponds to finding the area under the graph of a function. A Riemann integral defines this operation by filling the area under the curve with smaller and smaller rectangles. As the rectangles become smaller and smaller, the total area of the rectangles becomes closer and closer to the area under the curve. Unfortunately, there are functions for which this method of finding areas does not work. For example, consider the function f(x) which is 0 when x is rational and 1 otherwise. Every interval, however short, contains both rational and irrational points, so rectangles drawn under the curve cannot approximate its area, and the Riemann approach fails.
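As an illustration of the rectangle picture, here is a minimal sketch in Python. The helper name riemann_sums and the choice of the increasing function x² on [0, 1] are assumptions made for this example only; they do not come from the text above.

```python
def riemann_sums(f, a, b, n):
    """Lower and upper sums of an increasing function f on [a, b], using n equal strips."""
    width = (b - a) / n
    # For an increasing f, the smallest value on each strip is at its left end
    # and the largest value is at its right end.
    lower = sum(f(a + i * width) for i in range(n)) * width
    upper = sum(f(a + (i + 1) * width) for i in range(n)) * width
    return lower, upper


def square(x):
    return x * x


for n in (10, 100, 1000):
    lower, upper = riemann_sums(square, 0.0, 1.0, n)
    print(n, lower, upper)
```

The two sums squeeze together around 1/3 as n grows. For the function that is 0 on the rationals and 1 elsewhere, every strip contains points of both kinds, so the lower sum is always 0 and the upper sum is always 1, and no common limit exists.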

This is where Lebesgue integration comes in. Instead of using limits of sums, Lebesgue integration uses least upper bounds (suprema). Take the function f(x) above. Its area is not known in advance, but it has to be at least as big as the area under every function which is smaller than or equal to f(x) across the interval of interest. The idea behind Lebesgue integration is to first build the integral for simple functions, and then define it for more complicated f(x) as the least upper bound of all the integrals of "simple" functions smaller than or equal to f(x).

The main advantage of the Lebesgue integral over the Riemann integral is that more functions become integrable, and that the integral can often be determined easily using convenient convergence theorems (listed below). Furthermore, the extension of the notion of integration to functions defined on general measure spaces, which include probability spaces, allows the proper formulation of the foundations of probability and statistics. A formal introduction of the concept follows.

Formal construction

Let μ be a (non-negative) measure on a sigma-algebra X over a set E. (In real analysis, E will typically be Euclidean n-space Rⁿ or some Lebesgue measurable subset of it, X will be the sigma-algebra of all Lebesgue measurable subsets of E, and μ will be the Lebesgue measure. In probability and statistics, μ will be a probability measure on a probability space E.) We build up an integral for real-valued functions defined on E as follows.

Fix a set S in X and let f be the function on E whose value is 0 outside of S and 1 inside of S (i.e., f(x) = 1 if x is in S, otherwise f(x) = 0). This is called the indicator or characteristic function of S and is denoted 1_S.

To assign a value to ∫1_S consistent with the given measure μ, the only reasonable choice is to set ∫1_S := μ(S).

We extend by linearity to the linear span of indicator functions: ∫ ∑ a_k 1_{S_k} := ∑ a_k μ(S_k) (where the sum is finite and the coefficients a_k are real numbers). Such a finite linear combination of indicator functions is called a simple function. Note that a simple function can be written in many ways as a linear combination of characteristic functions, but the integral will always be the same.
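The following Python sketch works through one case of this definition. It assumes μ is Lebesgue measure on the real line and that each set S_k is a finite union of disjoint intervals, so that μ(S_k) is a total length; the function names length and integrate_simple are hypothetical.

```python
from fractions import Fraction


def length(intervals):
    """Lebesgue measure of a finite union of disjoint intervals given as (a, b) pairs."""
    return sum(Fraction(b) - Fraction(a) for a, b in intervals)


def integrate_simple(terms):
    """Integral of sum a_k * 1_{S_k}; terms is a list of (a_k, S_k) pairs."""
    return sum(Fraction(a) * length(s) for a, s in terms)


# 2 * 1_[0,2] + 3 * 1_[1,3], written with overlapping sets.
overlapping = [(2, [(0, 2)]), (3, [(1, 3)])]
# The same function written with disjoint sets: 2 on [0,1), 5 on [1,2], 3 on (2,3].
disjoint = [(2, [(0, 1)]), (5, [(1, 2)]), (3, [(2, 3)])]

print(integrate_simple(overlapping))
print(integrate_simple(disjoint))
```

Both representations give the same value, 10. Endpoints are ignored here because single points have Lebesgue measure zero.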

Now the difficulties begin as we attempt to take limits so that we can integrate more general functions. It turns out that the following process works and is most fruitful.

If f is a non-negative function on E (we allow it to attain the value +∞, in other words, f takes values in the extended real number line) then we define ∫f (sometimes written more explicitly as ∫_E f dμ) to be the supremum of ∫s where s varies over all non-negative simple functions which are under f (that is, 0 ≤ s(x) ≤ f(x) for all x). This is analogous to the lower sums of Riemann. However, we will not build an upper sum, and this fact is important in getting a more general class of integrable functions.
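To see the supremum being approached from below, here is a sketch under stated assumptions: f(x) = x² on [0, 1] with Lebesgue measure, and the particular simple functions s_n = floor(2ⁿ f) / 2ⁿ, which lie under f. This family and the function name are choices made for the example; the definition itself takes the supremum over all simple functions under f.

```python
import math


def integral_of_dyadic_simple_function(n):
    """Integral of s_n = floor(2**n * f) / 2**n for f(x) = x*x on [0, 1].

    s_n is the sum over k = 1..2**n of 2**-n times the characteristic
    function of {x : f(x) >= k / 2**n} = [sqrt(k / 2**n), 1], and that
    set has length 1 - sqrt(k / 2**n).
    """
    steps = 2 ** n
    return sum((1.0 - math.sqrt(k / steps)) / steps for k in range(1, steps + 1))


approximations = [integral_of_dyadic_simple_function(n) for n in range(1, 11)]
for n, value in enumerate(approximations, start=1):
    print(n, value)
```

The printed values increase with n and stay below 1/3, the value of the integral. Note that the range of f is being sliced here, not its domain, which is the characteristic difference from the Riemann construction.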

There is the question of whether this definition makes sense (does a simple function or an indicator function keep the same integral under the new definition?). There is also the question of whether this corresponds in any way to a Riemann notion of integration. It is not so hard to prove that the answer to both questions is yes.

We have defined ∫f for any non-negative function on E; however for some functions ∫f will be infinite. Furthermore, desirable additive and limit properties of the integral are not satisfied, unless we require that all our functions are measurable, meaning that the pre-image of any interval is in X. We will make this assumption from now on.

To handle signed functions, we need a few more definitions. If f is a function from E to the reals (including ±∞), then we can write f = g - h, where g(x) = f(x) and h(x) = 0 if f(x) ≥ 0, and g(x) = 0 and h(x) = -f(x) if f(x) < 0. Note that both g and h are non-negative functions. Also note that |f| = g + h. If ∫|f| is finite, then f is called Lebesgue integrable. In this case, both ∫g and ∫h are finite, and it makes sense to define ∫f by ∫g - ∫h. It turns out that this definition is the correct one. Complex valued functions can be similarly integrated, by considering the real part and the imaginary part separately.
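The decomposition can be checked by hand on a small example. The sketch below assumes a hypothetical finite measure space with four points and made-up weights, where every integral is a finite weighted sum; none of these numbers come from the text.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..3, with mu[x] the measure of {x}.
mu = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(2), 3: Fraction(1, 4)}
# A signed function on that space.
f = {0: Fraction(3), 1: Fraction(-2), 2: Fraction(0), 3: Fraction(-4)}


def integral(func):
    """Integral over a finite measure space: sum of func(x) * mu({x})."""
    return sum(func[x] * mu[x] for x in mu)


g = {x: max(f[x], Fraction(0)) for x in f}   # positive part
h = {x: max(-f[x], Fraction(0)) for x in f}  # negative part
abs_f = {x: abs(f[x]) for x in f}

print(integral(g), integral(h), integral(g) - integral(h))
print(integral(abs_f))
```

Here ∫g = 3/2 and ∫h = 3, so ∫f = -3/2, while ∫|f| = ∫g + ∫h = 9/2.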

Theorems

Every reasonable notion of integral needs to be linear and monotone, and the Lebesgue integral is: if f and g are integrable functions and a and b are real numbers, then af + bg is integrable and ∫(af + bg) = a∫f + b∫g; if f ≤ g, then ∫f ≤ ∫g.

Two functions which only differ on a set of μ-measure zero have the same integral, or more precisely: if μ({x : f(x) ≠ g(x)}) = 0, then f is integrable if and only if g is, and in this case ∫ f = ∫ g.
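For example, the rational numbers in [0, 1] form a set of Lebesgue measure zero, so the function which is 0 on the rationals and 1 elsewhere has the same integral over [0, 1] as the constant function 1, namely 1. The sketch below illustrates the standard covering argument on a finite list of rationals: the n-th point is covered by an interval of length ε / 2ⁿ⁺¹, so the total length stays below ε however many points are listed. The function names are hypothetical.

```python
from fractions import Fraction


def rationals_in_unit_interval(max_denominator):
    """Distinct rationals p/q in [0, 1] with q <= max_denominator, in a fixed order."""
    seen = []
    for q in range(1, max_denominator + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.append(r)
    return seen


def cover_length(points, eps):
    """Total length when the n-th point (n = 0, 1, ...) gets an interval of length eps / 2**(n+1)."""
    return sum(eps / 2 ** (n + 1) for n in range(len(points)))


eps = Fraction(1, 100)
points = rationals_in_unit_interval(20)
print(len(points), float(cover_length(points, eps)))
```

Since ε can be chosen as small as desired and the bound does not depend on the number of points, the full countable set of rationals has measure zero.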

One of the most important advantages that the Lebesgue integral carries over the Riemann integral is the ease with which we can perform limit processes. Three theorems are key here.

The monotone convergence theorem states that if f_k is a sequence of non-negative measurable functions such that f_k(x) ≤ f_{k+1}(x) for all k and all x, and if f = lim f_k, then ∫f_k converges to ∫f as k goes to infinity. (Note: ∫f may be infinite here.)
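A small instance, sketched under the assumption that E is the set of positive integers with counting measure, so that integrals are sums. Take f(n) = 2⁻ⁿ and let f_k agree with f on 1, ..., k and vanish elsewhere; the f_k increase pointwise to f. The names below are chosen for this example.

```python
from fractions import Fraction


def f(n):
    """A non-negative function on the positive integers."""
    return Fraction(1, 2 ** n)


def integral_f_k(k):
    """Integral of f_k = f * 1_{1,...,k} against counting measure: a finite sum."""
    return sum(f(n) for n in range(1, k + 1))


integrals = [integral_f_k(k) for k in range(1, 11)]
for k, value in enumerate(integrals, start=1):
    print(k, value)
```

The integrals equal 1 - 2⁻ᵏ and increase to 1, which is ∫f, the sum of the full series.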

Fatou's Lemma states that if f_k is a sequence of non-negative measurable functions and if f = liminf f_k, then ∫f ≤ liminf ∫f_k. (Again, ∫f may be infinite.)
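The inequality can be strict. A standard example, sketched here with Lebesgue measure on the real line, is f_k = k on the open interval (0, 1/k) and 0 elsewhere. The function names are hypothetical.

```python
from fractions import Fraction


def f(k, x):
    """f_k equals k on the open interval (0, 1/k) and 0 elsewhere."""
    return k if 0 < x < Fraction(1, k) else 0


def integral_f(k):
    """Integral of f_k: height k times the length 1/k of (0, 1/k)."""
    return k * Fraction(1, k)


x = Fraction(1, 10)
print([f(k, x) for k in range(1, 15)])      # eventually 0 at this fixed x
print([integral_f(k) for k in range(1, 6)])  # always 1
```

At each fixed x the values f_k(x) are eventually 0, so liminf f_k = 0 and ∫f = 0, while every ∫f_k equals 1.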

The Dominated Convergence Theorem states that if f_k is a sequence of measurable functions with pointwise limit f, and if there is an integrable function g such that |f_k| ≤ g for all k, then f is integrable and ∫f_k converges to ∫f.
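The role of the dominating function g can be seen by comparing two sequences on (0, 1) with Lebesgue measure. Both examples, and the function names, are assumptions chosen for this sketch. The sequence xᵏ is dominated by the constant 1 and its integrals 1/(k+1) tend to 0, the integral of its limit. The spikes f_k = k on (0, 1/k) also tend to 0 pointwise, but each has integral 1; the code computes the integral of max(f_1, ..., f_K), which any dominating g would have to exceed.

```python
from fractions import Fraction


def integral_power(k):
    """Integral of x**k over [0, 1], a sequence dominated by g = 1."""
    return Fraction(1, k + 1)


def envelope_integral(K):
    """Integral over (0, 1) of max(f_1, ..., f_K), where f_k = k on (0, 1/k)."""
    total = K * Fraction(1, K)  # the maximum equals K on (0, 1/K)
    for k in range(1, K):
        # the maximum equals k on [1/(k+1), 1/k)
        total += k * (Fraction(1, k) - Fraction(1, k + 1))
    return total


print([integral_power(k) for k in range(1, 6)])
print([envelope_integral(K) for K in (1, 10, 100)])
```

The envelope integrals are the harmonic numbers 1 + 1/2 + ... + 1/K, which grow without bound, so no integrable g dominates the spikes and the theorem does not apply to them.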