Eigenvalues and eigenvectors

In mathematics, and in particular in linear algebra, the eigenvalues, eigenvectors, and eigenspaces of a transformation (from a vector space to itself) are important properties of this transformation. These key concepts play a major role in mathematics and applied disciplines.
The prefix eigen underlines that these properties are important characteristics of the transformation. In many common cases, knowing all eigenvalues and eigenvectors of a transformation is equivalent to explicit knowledge of the transformation. The word eigen is German for "own", "peculiar", or "individual": the most likely translation into English mathematical jargon would be "characteristic", and some older references do use expressions such as "characteristic value" and "characteristic vector", or even "Eigenwert", the German word for eigenvalue; but the more distinctive term "eigenvalue" has now become standard.
Definition
Transformations of space, such as translation, rotation, reflection, stretching, compression, or any combination of these (nonlinear transformations can also be considered), may be visualized by the effect they produce on vectors, i.e., one-dimensional arrows pointing from one point to another. In this context, eigenvectors of a transformation are vectors that are left unaffected, or merely scaled by a factor, by the transformation. This factor is called the eigenvalue. The corresponding eigenspace is the space formed by all eigenvectors associated with a particular eigenvalue. The geometric multiplicity of an eigenvalue is the dimension of the associated eigenspace. The spectrum of the transformation is the set of all its eigenvalues. For instance, an eigenvector of a rotation is a vector located along the axis around which the rotation is performed. The corresponding eigenvalue is one, and all vectors parallel to the axis form the associated eigenspace. Its geometric multiplicity is one, and it is the only eigenvalue of the spectrum that is a real number. Eigenplanes are defined as planes (2-dimensional subspaces) which are left invariant by the transformation, that is, vectors in such planes are mapped to other vectors of the same planes (see also eigenplane).
Examples
For example, as the Earth rotates, every arrow from the center of the Earth also rotates, apart from those arrows that lie on the axis of rotation. Consider the transformation of the Earth after one hour of rotation. An arrow from the center of the Earth to the South Pole would be an example of an eigenvector of this transformation, while an arrow from the center of the Earth to Paris would not be an eigenvector. Since the arrow pointing at the pole is not stretched by the rotation of the Earth, its eigenvalue would be 1.
As another example, consider a thin metal sheet expanding uniformly about a fixed point. The transformation that expands every point on the sheet to twice its original distance from the fixed point has eigenvalue 2. Every vector from the fixed point to a point on the sheet is an eigenvector, and the eigenspace is the set of all these vectors.
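The two examples above can be checked numerically. The following is a minimal sketch using NumPy (the rotation angle and the test vectors are arbitrary choices for the illustration): the arrow along the rotation axis is returned unchanged (eigenvalue 1), while under the uniform expansion every vector is doubled (eigenvalue 2).

```python
import numpy as np

theta = np.pi / 12                      # rotation by 15 degrees about the z-axis
rotation = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                     [np.sin(theta),  np.cos(theta), 0.0],
                     [0.0,            0.0,           1.0]])

axis = np.array([0.0, 0.0, 1.0])        # arrow along the rotation axis
print(rotation @ axis)                  # unchanged -> eigenvector with eigenvalue 1

scaling = 2.0 * np.eye(3)               # uniform expansion about the fixed point
v = np.array([1.0, -3.0, 0.5])          # any vector from the fixed point
print(scaling @ v)                      # exactly 2*v -> eigenvalue 2
```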

However, three-dimensional geometric space is not the only vector space. For example, consider a rope under tension fixed at both ends, like the vibrating strings of a string instrument. The distances of the atoms of the rope from their positions when the rope is at rest can be seen as the components of a vector in a space with as many dimensions as there are atoms in the rope. If one considers the transformation of the rope as time elapses, its eigenvectors (or eigenfunctions, if one assumes the rope is a continuous medium) are its standing waves, well known to musicians, who label them according to the notes they produce. The standing waves correspond to particular oscillations of the rope such that the shape of the rope is scaled by a factor (the eigenvalue) as time evolves. Each component of the vector associated with the rope is multiplied by this time-dependent factor. If one takes the damping of the oscillations into account, the amplitudes of the standing waves (their eigenvalues) decrease with time. One can then associate a lifetime with the eigenvector and relate the concept of an eigenvector to the concept of resonance, which is a key concept in physics.
Eigenvalue equation
Mathematically, the equation for the eigenvalues λ and eigenvectors $v_\lambda$ of a transformation $\mathcal{T}$ can be written
$$\mathcal{T}(v_\lambda) = \lambda\, v_\lambda,$$
where $\mathcal{T}(v_\lambda)$ stands for the vector obtained when applying the transformation $\mathcal{T}$ to $v_\lambda$.
In the case $\mathcal{T}$ is a linear transformation, i.e., if $\mathcal{T}(a\,v + b\,w) = a\,\mathcal{T}(v) + b\,\mathcal{T}(w)$ for all scalars a, b and vectors v, w, one can write, in a given basis set (if such a basis set exists) in which $\mathcal{T}$ is represented by the matrix (or two-dimensional array) T and $v_\lambda$ by the one-dimensional vertical array $\mathbf{v}_\lambda$, the eigenvalue equation in its matrix representation
$$T\,\mathbf{v}_\lambda = \lambda\,\mathbf{v}_\lambda,$$
which is a set of N linear equations, where N is the number of basis vectors in the basis set. In this equation both the eigenvalue λ and the N components of $\mathbf{v}_\lambda$ are unknown.
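As an illustration (a minimal sketch using NumPy; the 2-by-2 matrix below is chosen only for the example), a numerical routine returns pairs (λ, vλ) that satisfy the matrix eigenvalue equation above:

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # matrix representation of a linear map

eigenvalues, eigenvectors = np.linalg.eig(T)
for lam, v in zip(eigenvalues, eigenvectors.T):   # the columns are the eigenvectors
    # check the eigenvalue equation T v = lambda v
    print(lam, np.allclose(T @ v, lam * v))
```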
Sometimes, when $\mathcal{T}$ is a nonlinear transformation, or when it is difficult or impossible to provide a basis set (for instance if the dimension of the vector space is infinite), the eigenvalue equation cannot be written down in matrix form. In this case it can be advantageous to represent the eigenvalue equation as a set of nonlinear equations or of differential equations, depending on the nature of $\mathcal{T}$. In the latter case the eigenvectors are commonly called eigenfunctions of the differential operator representing $\mathcal{T}$.
A simple example is given by the theory of exponential decay or exponential growth. Any quantity which grows or decays proportionally to itself (for example an idealized population of rabbits) evolves according to the equation
$$\frac{dN}{dt} = \lambda N.$$
This differential equation is an eigenvalue equation. The solution $N(t) = N_0 e^{\lambda t}$, the exponential function, is an eigenfunction of the differential operator d/dt, the derivative with respect to time, with eigenvalue λ. If λ is negative the evolution of N corresponds to an exponential decay; if it is positive, to an exponential growth. The value of λ can be any complex number, so the spectrum of d/dt is the whole complex plane. In this example the vector space on which the operator d/dt acts is the space of differentiable functions of one variable. This space has infinite dimension, because it is not possible to express every differentiable function as a linear combination of a finite number of basis functions. However, the eigenspace associated with λ is one-dimensional: it is the set of all functions $N(t) = N_0 e^{\lambda t}$, where the arbitrary constant $N_0$ is the initial population at t = 0.
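The eigenvalue relation for d/dt can be checked numerically on a grid. The sketch below (using NumPy; the decay rate and initial value are arbitrary choices for the illustration) compares a finite-difference derivative of N(t) = N₀e^{λt} with λN(t):

```python
import numpy as np

lam, N0 = -0.5, 100.0                    # decay rate and initial population
t = np.linspace(0.0, 10.0, 1001)
N = N0 * np.exp(lam * t)                 # candidate eigenfunction of d/dt

dN_dt = np.gradient(N, t, edge_order=2)  # numerical derivative on the grid
print(np.allclose(dN_dt, lam * N, rtol=1e-3))   # dN/dt = lambda N, to grid accuracy
```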
Spectral theorem
The spectral theorem shows the importance of the eigenvalues and eigenvectors for characterizing a linear transformation in a unique way. In its simplest version, the spectral theorem states that, under precise conditions (see Spectral theorem), a linear transformation of a vector can be expressed as a linear combination of the eigenvectors, with coefficients equal to the eigenvalues times the scalar product (or dot product) of the eigenvectors with the vector on which the transformation is applied. Mathematically, it can be written as
$$\mathcal{T}(v) = \lambda_1 (v_1 \cdot v)\, v_1 + \lambda_2 (v_2 \cdot v)\, v_2 + \cdots,$$
where $v_1, v_2, \dots$ and $\lambda_1, \lambda_2, \dots$ stand for the eigenvectors and eigenvalues of $\mathcal{T}$. The simplest case in which the theorem is valid is the case where the linear transformation is given by a real symmetric matrix or a complex Hermitian matrix.
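As an illustration of the statement for a real symmetric matrix, the following sketch (using NumPy; the matrix and test vector are chosen only for the example) rebuilds the action of the matrix on a vector from its eigenpairs, term by term, as in the sum above:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # real symmetric matrix
eigenvalues, eigenvectors = np.linalg.eigh(A)   # orthonormal eigenvectors (columns)

x = np.array([1.0, -2.0, 0.5])
# Spectral theorem: A x = sum_i lambda_i (v_i . x) v_i
reconstructed = sum(lam * np.dot(v, x) * v
                    for lam, v in zip(eigenvalues, eigenvectors.T))
print(np.allclose(A @ x, reconstructed))        # True
```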
Eigenvalues of matrices
Computing eigenvalues of matrices
Suppose that we want to compute the eigenvalues of a given matrix. If the matrix is small, we can compute them symbolically using the characteristic polynomial. However, this is often impossible for larger matrices, in which case we must use a numerical method.
Symbolic computations using the characteristic polynomial
An important tool for describing eigenvalues of square matrices is the characteristic polynomial: saying that λ is an eigenvalue of A is equivalent to stating that the system of linear equations (A - λI) v = 0 (where I is the identity matrix) has a non-zero solution v (namely an eigenvector), and so it is equivalent to the determinant det(A - λI) being zero. The function p(λ) = det(A - λI) is a polynomial in λ since determinants are defined as sums of products. This is the characteristic polynomial of A: the eigenvalues of a matrix are the zeros of its characteristic polynomial.
(Sometimes the characteristic polynomial is taken to be det(λI − A) instead; this is the same polynomial when the dimension of A is even, and its negative when the dimension is odd. It has the slight advantage that its leading coefficient is always 1, rather than (−1)^n for an n-by-n matrix.)
It follows that we can compute all the eigenvalues of a matrix A by solving the equation p(λ) = det(A − λI) = 0. If A is an n-by-n matrix, then p has degree n, and A can therefore have at most n eigenvalues. Conversely, the fundamental theorem of algebra says that this equation has exactly n roots (zeroes), counted with multiplicity. All real polynomials of odd degree have a real number as a root, so for odd n, every real matrix has at least one real eigenvalue. For a real matrix, whether n is even or odd, the non-real eigenvalues come in conjugate pairs.
An example of a matrix with no real eigenvalues is the 90-degree rotation
$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},$$
whose characteristic polynomial is $\lambda^2 + 1$, and so its eigenvalues are the pair of complex conjugates i, −i.
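This example can be reproduced numerically. The sketch below (using NumPy; np.poly returns the coefficients of det(λI − A), the monic sign convention mentioned above) computes the characteristic polynomial of the rotation matrix, its roots, and the eigenvalues directly:

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])              # 90-degree rotation

coeffs = np.poly(A)                      # coefficients of det(lambda*I - A)
print(coeffs)                            # -> lambda^2 + 1, i.e. [1. 0. 1.] (up to rounding)
print(np.roots(coeffs))                  # roots i and -i
print(np.linalg.eigvals(A))              # the same pair, computed directly
```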
The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic polynomial, that is, $p(A) = 0$.
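A numerical check of the theorem (a sketch using NumPy; the matrix is an arbitrary 2-by-2 example) evaluates the characteristic polynomial at the matrix itself:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
c = np.poly(A)                           # characteristic polynomial coefficients, highest degree first

# Evaluate p(A) = A^2 + c[1]*A + c[2]*I by Horner's rule on matrices (c[0] = 1)
p_of_A = np.zeros_like(A)
for coeff in c:
    p_of_A = p_of_A @ A + coeff * np.eye(2)
print(np.allclose(p_of_A, 0))            # True: A satisfies its own characteristic polynomial
```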
Numerical computations
In practice, eigenvalues of large matrices are not computed using the characteristic polynomial. Computing the polynomial becomes expensive in itself, and exact (symbolic) roots of a high-degree polynomial can be difficult to compute and express (for example, the Abel-Ruffini theorem implies that they cannot in general be expressed simply using nth roots). Effective numerical algorithms for approximating roots of polynomials exist, but small errors in the eigenvalues can lead to large errors in the eigenvectors. Therefore, general algorithms to find eigenvectors and eigenvalues are iterative. The easiest method is the power method: we choose a random vector v and compute Av, $A^2 v$, $A^3 v$, ... This sequence will, after normalization, almost always converge to an eigenvector corresponding to the dominant eigenvalue. This algorithm is simple, but not very useful by itself. However, popular methods such as the QR algorithm are based on it.
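The following is a minimal sketch of the power method in NumPy (the matrix, the number of iterations, and the random seed are arbitrary choices for the illustration); NumPy's own eigensolvers use more sophisticated iterations such as the QR algorithm.

```python
import numpy as np

def power_method(A, iterations=1000, seed=0):
    """Approximate the dominant eigenpair of A by repeated multiplication."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])      # random starting vector
    for _ in range(iterations):
        v = A @ v
        v /= np.linalg.norm(v)               # normalize to avoid overflow/underflow
    eigenvalue = v @ A @ v                    # Rayleigh quotient of the converged vector
    return eigenvalue, v

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, v = power_method(A)
print(lam, np.allclose(A @ v, lam * v, atol=1e-6))
```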
Properties
Algebraic multiplicity
The (algebraic) multiplicity of an eigenvalue λ of A is the order of λ as a zero of the characteristic polynomial of A; in other words, it is the number of factors t − λ in the characteristic polynomial. An n-by-n matrix has n eigenvalues, counted according to their algebraic multiplicity, because its characteristic polynomial has degree n.
An eigenvalue of algebraic multiplicity 1 is called a simple eigenvalue.
Occasionally, in an article on matrix theory, one may read a statement like
- "the eigenvalues of a matrix A are 4,4,3,3,3,2,2,1,"
meaning that the algebraic multiplicity of 4 is two, of 3 is three, of 2 is two and of 1 is one. This style is used because algebraic multiplicity is the key to many mathematical proofs in matrix theory.
The algebraic multiplicity can also be thought of as a dimension: it is the dimension of the associated generalized eigenspace, which is the nullspace of the matrix $(\lambda I - A)^k$ for any sufficiently large k. That is, it is the space of generalized eigenvectors, where a generalized eigenvector is any vector which eventually becomes 0 when λI − A is applied to it enough times successively. Any eigenvector is a generalized eigenvector, and so each eigenspace is contained in the associated generalized eigenspace. This provides an easy proof that the geometric multiplicity is always less than or equal to the algebraic multiplicity. (Do not confuse these with the generalized eigenvalue problem, below.)
Consider for example the matrix
$$\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.$$
It has only one eigenvalue, namely λ = 1. The characteristic polynomial is $(\lambda - 1)^2$, so this eigenvalue has algebraic multiplicity 2. However, the associated eigenspace is spanned by $(1, 0)^T$, so the geometric multiplicity is only 1.
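Both multiplicities of this example can be obtained numerically (a sketch using NumPy): the repeated root gives the algebraic multiplicity, and the dimension of the nullspace of A − λI gives the geometric multiplicity.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0

# Algebraic multiplicity: how often lam appears as a root of the characteristic polynomial
print(np.linalg.eigvals(A))                        # [1. 1.] -> algebraic multiplicity 2

# Geometric multiplicity: dimension of the nullspace of (A - lam*I)
rank = np.linalg.matrix_rank(A - lam * np.eye(2))
print(A.shape[0] - rank)                           # 1 -> only one independent eigenvector
```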
Decomposition theorem
An n by n matrix A has n linearly independent eigenvectors if and only if it can be decomposed into the form
$$A = U \Lambda U^{-1},$$
where Λ is a diagonal matrix. In this case A is said to be diagonalizable. The columns of U form a basis of eigenvectors and the diagonal entries of Λ are the corresponding eigenvalues: thus, the entries of U can be chosen to be real (as opposed to complex) if and only if there are n linearly independent real eigenvectors. Even with complex coefficients, however, such a matrix U does not always exist; for example
$$\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$$
has only one 1-dimensional eigenspace.
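As an illustration (a sketch using NumPy; the matrices are standard small examples), a symmetric matrix is reconstructed from its eigenvector matrix U and the diagonal Λ, while the defective matrix above yields eigenvectors that are not linearly independent:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])                 # symmetric, hence diagonalizable
eigenvalues, U = np.linalg.eig(A)
Lambda = np.diag(eigenvalues)
print(np.allclose(A, U @ Lambda @ np.linalg.inv(U)))   # True: A = U Lambda U^-1

B = np.array([[1.0, 1.0],
              [0.0, 1.0]])                 # only a 1-dimensional eigenspace
_, V = np.linalg.eig(B)
print(np.linalg.det(V))                    # ~0: the eigenvectors are not independent,
                                           # so no invertible U exists for B
```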
There are several generalizations of this decomposition which can cope with the non-diagonalizable case, suited for different purposes:
- the singular value decomposition, $A = U \Sigma V^*$, where Σ is diagonal but U is not necessarily equal to V (see the sketch after this list);
- the Jordan normal form, where $A = U \Lambda U^{-1}$ but Λ is not quite diagonal;
- any matrix A can be written uniquely as A = S + N where S is semisimple (i.e. diagonalizable) and N is nilpotent, and S commutes with N. This is easily found from the Jordan form.
- any invertible matrix A can be written uniquely as A = SJ where S is semisimple and J is unipotent, and S commutes with J. This is found from the previous decomposition by taking $J = 1 + S^{-1}N$.
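As mentioned for the first item of the list, the singular value decomposition exists even for the non-diagonalizable matrix used above; a minimal NumPy sketch:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])                 # not diagonalizable, but it still has an SVD
U, s, Vh = np.linalg.svd(A)
print(np.allclose(A, U @ np.diag(s) @ Vh)) # True: A = U Sigma V*, with Sigma diagonal
```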
Similarly, an n by n matrix A has n linearly independent conjugate eigenvectors (such as arise when an alternative coordinate system is used) if and only if it can be decomposed into the form
$$A = \bar{U} \Lambda U^{-1},$$
where the columns of U are the coneigenvectors and Λ is the diagonal matrix of the corresponding coneigenvalues.
Other theorems
The spectrum is invariant under similarity transformations: the matrices A and $P^{-1}AP$ have the same eigenvalues for any matrix A and any invertible matrix P. The spectrum is also invariant under transposition: the matrices A and $A^T$ have the same eigenvalues.
A matrix is invertible if and only if zero is not an eigenvalue of the matrix.
A matrix is diagonalizable if and only if the algebraic and geometric multiplicities coincide for all its eigenvalues, which is the same as requiring that all generalized eigenvectors are eigenvectors. In particular, an n-by-n matrix which has n different eigenvalues is always diagonalizable.
The vector space on which the matrix acts is always the direct sum of the generalized eigenspaces (i.e., it is spanned by them and they are independent). This is true of the ordinary (non-generalized) eigenspaces if and only if they are equal to the generalized eigenspaces, i.e., if and only if the matrix is diagonalizable.
The location of the spectrum is often restricted if the matrix has a special form:
- All eigenvalues of a Hermitian matrix (A = A*) are real. Furthermore, all eigenvalues of a positive-definite matrix (v*Av > 0 for all vectors v) are positive.
- All eigenvalues of a skew-Hermitian matrix (A = −A*) are purely imaginary.
- All eigenvalues of a unitary matrix ($A^{-1} = A^*$) have absolute value one.
- The eigenvalues of a triangular matrix are the entries on the main diagonal. This holds a fortiori for diagonal matrices.
Generally, the trace of a matrix equals the sum of the eigenvalues, and the determinant equals the product of the eigenvalues (counted according to algebraic multiplicity).
Suppose that A is an m-by-n matrix, with m ≤ n, and that B is an n-by-m matrix. Then BA has the same eigenvalues as AB plus n − m eigenvalues equal to zero.
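Several of these statements can be checked numerically. The sketch below (using NumPy; the matrices are arbitrary small examples) illustrates the relation between the spectra of AB and BA, and the trace and determinant identities:

```python
import numpy as np

A = np.random.default_rng(1).standard_normal((2, 3))   # m-by-n with m = 2 <= n = 3
B = np.random.default_rng(2).standard_normal((3, 2))   # n-by-m

print(np.sort_complex(np.linalg.eigvals(A @ B)))        # two eigenvalues of AB
print(np.sort_complex(np.linalg.eigvals(B @ A)))        # the same pair plus n - m = 1 extra zero

C = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(np.trace(C), np.sum(np.linalg.eigvals(C)))        # both equal 5
print(np.linalg.det(C), np.prod(np.linalg.eigvals(C)))  # both equal 5
```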
Conjugate eigenvector
A conjugate eigenvector or coneigenvector is an element in the domain of the linear transformation that is sent to a scalar multiple of its conjugate, where the scalar is called the conjugate eigenvalue or coneigenvalue of the linear transformation. The coneigenvectors and coneigenvalues represent essentially the same information and meaning as the regular eigenvectors and eigenvalues, but arise when an alternative coordinate system is used. The corresponding equation is
$$A v = \lambda \bar{v}.$$
For example, in coherent electromagnetic scattering theory, the linear transformation A represents the action performed by the scattering object, and the eigenvectors represent polarization states of the electromagnetic wave. In optics, the coordinate system is defined from the wave's viewpoint, known as the Forward Scattering Alignment (FSA), and gives rise to a regular eigenvalue equation, whereas in radar, the coordinate system is defined from the radar's viewpoint, known as the Back Scattering Alignment (BSA), and gives rise to a coneigenvalue equation.
Generalized eigenvalue problem
A generalized eigenvalue problem is a problem of the form
$$A v = \lambda B v, \qquad (1)$$
where A and B are matrices (with complex entries). The generalized eigenvalues λ can be obtained by solving the equation
$$\det(A - \lambda B) = 0.$$
If B is invertible, then problem (1) can be written in the form
$$B^{-1}A\, v = \lambda v, \qquad (2)$$
which is a standard eigenvalue problem. However, in most situations it is preferable not to perform the inversion, but to solve the generalized eigenvalue problem as stated originally.
If A and B are symmetric matrices with real entries and B is positive definite, then problem (1) has real eigenvalues. This is not easy to see from the equivalent formulation (2), because the matrix $B^{-1}A$ is not necessarily symmetric even if A and B are.
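A sketch of a generalized eigenvalue problem solved without forming B⁻¹A, assuming SciPy is available (the matrices are arbitrary symmetric examples, with B positive definite so the eigenvalues come out real):

```python
import numpy as np
from scipy.linalg import eig, eigh

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])               # real symmetric
B = np.array([[4.0, 1.0],
              [1.0, 2.0]])               # real symmetric and positive definite

# Generalized problem A v = lambda B v, solved directly
eigenvalues, eigenvectors = eig(A, B)
print(eigenvalues)                        # real up to rounding, since B is positive definite

# eigh exploits the symmetric/definite structure of the pair (A, B)
print(eigh(A, B, eigvals_only=True))
```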
Eigenvalues of a matrix with entries from a ring
In the case of a square matrix A with entries in a ring, λ is called a right eigenvalue if there exists a nonzero column vector x such that Ax = λx, or a left eigenvalue if there exists a nonzero row vector y such that yA = yλ.
If the ring is commutative, the left eigenvalues are equal to the right eigenvalues and are simply called eigenvalues. If not, for instance if the ring is the ring of quaternions, they may differ.
Infinite-dimensional spaces: spectrum of an operator
If the vector space is infinite dimensional, it may be advantageous to define the concept of spectral values. The spectral values are the set of scalars λ for which the Green's operator $(\mathcal{T} - \lambda)^{-1}$ associated to the transformation $\mathcal{T}$ is not defined, that is, such that $\mathcal{T} - \lambda$ is not invertible (i.e., the inverse transformation to $\mathcal{T} - \lambda$ does not exist).
If λ is an eigenvalue of $\mathcal{T}$, λ is also a spectral value of it. However, the converse is not true: not every spectral value is an eigenvalue. There are operators on Hilbert or Banach spaces which have no eigenvectors at all. This can be seen in the following example: the bilateral shift on the Hilbert space $\ell^2(\mathbb{Z})$ (the space of all two-sided sequences of scalars $\dots, a_{-1}, a_0, a_1, a_2, \dots$ such that $\sum_i |a_i|^2$ converges) has no eigenvalues but does have spectral values.
In functional analysis, the spectrum of an operator is defined as the set of all its spectral values. This is a key concept in scattering theory.
Applications
Schrödinger equation

A particularly important example of an eigenvalue equation where the transformation is represented in terms of a differential operator is the time-independent Schrödinger equation in quantum mechanics
$$H \psi_E = E \psi_E,$$
where H, the Hamiltonian, is a second-order differential operator and $\psi_E$, the wavefunction, is one of its eigenfunctions corresponding to the eigenvalue E, interpreted as its energy.
However, in the case where we look only for the bound state solutions of the Schrödinger equation, as is usually the case in quantum chemistry, we look for $\psi_E$ within the space of square integrable functions. Since this space is a Hilbert space with a well-defined scalar product, we can introduce a basis set in which $\psi_E$ and H can be represented as a one-dimensional array and a matrix, respectively. This allows us to represent the Schrödinger equation in matrix form.
The Dirac notation often used in this context stresses the difference between the vector or state $|\psi_E\rangle$ and its representation, the function $\psi_E(x)$. In this context one writes the Schrödinger equation
$$H |\psi_E\rangle = E |\psi_E\rangle$$
and calls $|\psi_E\rangle$ an eigenstate of H (sometimes written $\hat{H}$ in introductory textbooks), which is seen as a transformation (see Observable) instead of a particular representation of it in terms of differential operators. In the equation above, $H |\psi_E\rangle$ is understood as the vector obtained by application of the transformation H to $|\psi_E\rangle$.
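As a numerical illustration (a sketch not taken from the text above), the Hamiltonian of a one-dimensional harmonic oscillator can be discretized on a grid by finite differences, turning the Schrödinger equation into an ordinary matrix eigenvalue problem whose lowest eigenvalues approximate the energies E = 1/2, 3/2, 5/2, ... (in units where ħ = m = ω = 1):

```python
import numpy as np

# Finite-difference Hamiltonian H = -1/2 d^2/dx^2 + x^2/2 on a grid
n, L = 1000, 10.0
x = np.linspace(-L, L, n)
dx = x[1] - x[0]

kinetic = (-0.5 / dx**2) * (np.diag(np.ones(n - 1), -1)
                            - 2.0 * np.eye(n)
                            + np.diag(np.ones(n - 1), 1))
potential = np.diag(0.5 * x**2)
H = kinetic + potential

energies = np.linalg.eigvalsh(H)[:4]      # lowest four eigenvalues of the matrix H
print(energies)                           # close to 0.5, 1.5, 2.5, 3.5
```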
Molecular orbitals
In quantum mechanics, and in particular in atomic and molecular physics, within the Hartree-Fock theory, the atomic and molecular orbitals can be defined by the eigenvectors of the Fock operator. The corresponding eigenvalues are interpreted as ionization potentials via Koopmans' theorem. In this case, the term eigenvector is used in a somewhat more general meaning, since the Fock operator is explicitly dependent on the orbitals and their eigenvalues. If one wants to underline this aspect, one speaks of an implicit eigenvalue equation. Such equations are usually solved by an iteration procedure, called in this case the self-consistent field method. In quantum chemistry, one often represents the Hartree-Fock equation in a non-orthogonal basis set. This particular representation is a generalized eigenvalue problem called the Roothaan equations.
Factor analysis
In factor analysis, the eigenvectors of a covariance matrix correspond to factors, and the eigenvalues to factor loadings. Factor analysis is a statistical technique used in the social sciences and in marketing, product management, operations research, and other applied sciences that deal with large quantities of data. The objective is to explain most of the variability among a number of observable random variables in terms of a smaller number of unobservable random variables called factors. The observable random variables are modeled as linear combinations of the factors, plus "error" terms.

Eigenfaces
In image processing, processed images of faces can be seen as vectors whose components are the brightnesses of the pixels. The dimension of this vector space is the number of pixels. The eigenvectors of the covariance matrix associated with a large set of normalized pictures of faces are called eigenfaces. They are very useful for expressing any face image as a linear combination of some of them. Eigenfaces provide a means of applying data compression to faces for identification purposes.
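A minimal sketch of the eigenface construction using NumPy (random vectors stand in for real face images, and the image size and the number of retained eigenfaces are arbitrary choices for the illustration):

```python
import numpy as np

# Toy stand-in for face images: 200 "images" of 32x32 = 1024 pixels each
rng = np.random.default_rng(0)
images = rng.standard_normal((200, 1024))

mean_face = images.mean(axis=0)
centered = images - mean_face
covariance = np.cov(centered, rowvar=False)           # 1024 x 1024 pixel covariance matrix

eigenvalues, eigenvectors = np.linalg.eigh(covariance)
eigenfaces = eigenvectors[:, ::-1][:, :20]             # 20 leading eigenvectors ("eigenfaces")

# Compress one image to 20 coefficients and reconstruct it approximately
coeffs = eigenfaces.T @ (images[0] - mean_face)
approx = mean_face + eigenfaces @ coeffs
print(coeffs.shape, approx.shape)                      # (20,) (1024,)
```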
Tensor of inertia
In mechanics, the eigenvectors of the inertia tensor define the principal axes of a rigid body. The tensor of inertia is a key quantity required in order to determine the rotation of a rigid body around its center of mass.
Stress tensor
In solid mechanics, the stress tensor is symmetric and so can be decomposed into a diagonal tensor with the eigenvalues on the diagonal and eigenvectors as a basis. Because it is diagonal, in this orientation, the stress tensor has no shear components; the components it does have are the principal components.
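The principal values and principal axes are obtained by diagonalizing the symmetric tensor; the same recipe applies to the inertia tensor of the previous section. A sketch using NumPy (the tensor entries are illustrative):

```python
import numpy as np

# Illustrative symmetric stress tensor (the same recipe applies to an inertia tensor)
sigma = np.array([[ 50.0, 30.0,  0.0],
                  [ 30.0, -20.0, 0.0],
                  [  0.0,  0.0, 10.0]])

principal_stresses, principal_axes = np.linalg.eigh(sigma)
print(principal_stresses)                 # eigenvalues: the principal components
print(principal_axes)                     # orthonormal eigenvectors: the principal directions

# In the principal-axis basis the tensor is diagonal (no shear components)
diagonal = principal_axes.T @ sigma @ principal_axes
print(np.allclose(diagonal, np.diag(principal_stresses)))   # True
```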
Eigenvalues of a graph
In spectral graph theory, an eigenvalue of a graph is defined as an eigenvalue of the graph's adjacency matrix A, or (increasingly) of the graph's Laplacian matrix $T - A$ or $I - T^{-1/2} A T^{-1/2}$, where T is a diagonal matrix holding the degree of each vertex, and in $T^{-1/2}$, 0 is substituted for $0^{-1/2}$.
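A small sketch using NumPy (the graph is a 4-vertex path chosen for the example) builds the adjacency matrix, the degree matrix T, and the Laplacian T − A, and computes their spectra:

```python
import numpy as np

# Adjacency matrix of a path graph on 4 vertices: 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

T = np.diag(A.sum(axis=1))                # degree matrix
laplacian = T - A

print(np.linalg.eigvalsh(A))              # eigenvalues of the graph (adjacency spectrum)
print(np.linalg.eigvalsh(laplacian))      # Laplacian spectrum; the smallest value is 0
```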
External links
- Videos of MIT Linear Algebra Course, fall 1999 - See Lecture #21: Eigenvalues and Eigenvectors
- MathWorld: Eigenvector
- Earliest Known Uses of Some of the Words of Mathematics: E - see eigenvector and related terms
- ARPACK is a collection of FORTRAN subroutines for solving large scale eigenvalue problems
- "Eigenvalue (of a matrix)". PlanetMath.