Functional derivative

In mathematics and theoretical physics, the functional derivative is a generalization of the directional derivative. The difference is that the latter differentiates in the direction of a vector, while the former differentiates in the direction of a function. Both of these can be viewed as extensions of the usual calculus derivative.

Two possible, restricted definitions suitable for certain computations are given here. There are more general definitions of functional derivatives.

For a functional F acting on functions φ (continuous, smooth, satisfying certain boundary conditions, etc.) from a manifold M to R or C, the functional derivative of F, denoted $\delta F$ or ${\delta F}/{\delta \phi (x)}$ , is a distribution such that for all test functions f,

\left\langle \delta F[\phi ],f\right\rangle =\left.{\frac {d}{d\epsilon }}F[\phi +\epsilon f]\right|_{\epsilon =0}.
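
As a minimal numerical sketch of this definition, the pairing can be checked on a grid. The functional $F[\phi ]=\int \phi (x)^{2}\,dx$ , the grid, the function φ and the test function f below are illustrative assumptions, not part of the definition. For this choice of F the derivative is the ordinary function 2φ, so the pairing reduces to $\int 2\phi f\,dx$ .

```python
import numpy as np

# Grid on [0, 1]; integrals are approximated by the trapezoidal rule.
x = np.linspace(0.0, 1.0, 2001)

def integrate(values):
    """Trapezoidal approximation of the integral of `values` over the grid x."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))

def F(phi):
    """Illustrative functional F[phi] = integral of phi(x)^2 dx (an assumption)."""
    return integrate(phi**2)

phi = np.sin(2.0 * np.pi * x) + 0.5   # function at which we differentiate
f = np.exp(-50.0 * (x - 0.3) ** 2)    # test function, i.e. the direction

# Left-hand side: d/d(eps) F[phi + eps*f] at eps = 0, by a central difference.
eps = 1e-5
directional = (F(phi + eps * f) - F(phi - eps * f)) / (2.0 * eps)

# Right-hand side: pairing of the candidate derivative 2*phi with f.
pairing = integrate(2.0 * phi * f)

print(directional, pairing)
```

The two printed numbers should agree up to rounding error, because this F is quadratic in φ and the central difference is then exact in ε.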

Sometimes physicists write the definition in terms of a limit and the Dirac delta function, δ:

{\frac {\delta F[\phi (x)]}{\delta \phi (y)}}=\lim _{\varepsilon \to 0}{\frac {F[\phi (x)+\varepsilon \delta (x-y)]-F[\phi (x)]}{\varepsilon }}.

However, the right-hand side is not well defined in general, since F need not be defined when its argument is a distribution such as $\phi (x)+\varepsilon \delta (x-y)$ .
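
One way to explore the limit formula numerically is to replace the delta function by a narrow normalized Gaussian, for which F is defined. The sketch below assumes the illustrative functional $F[\phi ]=\int \phi (x)^{2}\,dx$ , whose derivative at y is $2\phi (y)$ ; the width `sigma`, the step `eps` and the helper name `nascent_delta` are arbitrary choices made here.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 20001)
dx = x[1] - x[0]

def F(phi):
    """Illustrative functional F[phi] = integral of phi(x)^2 dx (Riemann sum)."""
    return float(np.sum(phi**2) * dx)

def nascent_delta(y, sigma):
    """Normalized Gaussian of width sigma centred at y, standing in for delta(x - y)."""
    return np.exp(-0.5 * ((x - y) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

phi = np.cos(np.pi * x)
y = 0.25
sigma = 0.01   # width of the smoothed delta (assumption)
eps = 1e-6     # step size, kept much smaller than sigma

# Difference quotient from the limit formula, with the smoothed delta.
quotient = (F(phi + eps * nascent_delta(y, sigma)) - F(phi)) / eps

# Value expected for this F: 2 * phi(y).
expected = 2.0 * np.cos(np.pi * y)

print(quotient, expected)
```

For this F the quotient contains an extra term $\varepsilon \int \delta _{\sigma }(x-y)^{2}\,dx=\varepsilon /(2\sigma {\sqrt {\pi }})$ , which grows as the width σ shrinks at fixed ε. In this sketch ε therefore has to be taken to zero before the width is, which is one way to see why the formula needs care.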

Formal description

The definition of a functional derivative can be made mathematically precise by specifying the space of functions more carefully. For example, when the space of functions is a Banach space, the functional derivative becomes known as the Fréchet derivative, while one uses the Gâteaux derivative on more general locally convex spaces. Note that a Hilbert space is a special case of a Banach space. The more formal treatment allows many theorems from ordinary calculus and analysis to be generalized to corresponding theorems in functional analysis, and allows numerous new theorems to be stated.

Relationship between the mathematical and physical definitions

The mathematicians' definition and the physicists' definition of the functional derivative mean slightly different things. The first describes how the entire functional, F, changes as a result of a small change in the function $\phi (x)$ ; the functional derivative is itself still a functional. A physicist, however, often wants to know how one quantity, say the density of charge at position 1, $n(y_{1})$ , is affected by changing another quantity, say the value of the electric potential at position 2, $U(y_{2})$ . (If the system contains many interacting charges, changing the potential at position 2 moves those charges, which changes the potential and the charge density at every other point in space, including position 1.) The density is a functional of the function $U(y)$ that describes the potential at each point in space. When this functional is evaluated for a specific potential, however, the density becomes a function, since it has a different value at each point in space.

In this case the physicist is interested in the "functional derivative" ${\frac {\delta n(y_{1})}{\delta U(y_{2})}}$ . To obtain this quantity, take the mathematician's functional derivative of the functional (the density) for the special case in which the variation of the function (the potential) is non-zero only at position 2, namely a delta function: $f(y)=\delta (y-y_{2})$ . The resulting functional derivative is still a functional of the potential. When the actual potential is substituted as the argument of this functional, the result is the function ${\frac {\delta n(y)}{\delta U(y_{2})}}$ , which describes the change in the charge density as a function of position. Finally, this function can be evaluated at position 1 to give ${\frac {\delta n(y_{1})}{\delta U(y_{2})}}$ , the change in density at position 1 due to changing the potential only at position 2.
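
The procedure can be sketched on a grid with a toy model. Everything specific below is a hypothetical choice made for illustration and is not a physical model from the text: the density is taken to be $n(y)=\int \chi (y-y')\tanh U(y')\,dy'$ with a Gaussian kernel χ, for which the exact answer is $\chi (y_{1}-y_{2})/\cosh ^{2}U(y_{2})$ . On the grid, the delta function at position 2 becomes a spike of height 1/dy at a single grid point.

```python
import numpy as np

y = np.linspace(-5.0, 5.0, 1001)
dy = y[1] - y[0]

def chi(s):
    """Hypothetical response kernel (a Gaussian), chosen only for illustration."""
    return np.exp(-s**2)

def density(U):
    """Toy model: n(y) = integral of chi(y - y') * tanh(U(y')) dy' (Riemann sum)."""
    kernel = chi(y[:, None] - y[None, :])
    return kernel @ np.tanh(U) * dy

U = 0.3 * np.sin(y)     # the "actual" potential (assumption)
i1, i2 = 450, 520       # grid indices standing for position 1 and position 2

# Grid version of the variation f(y) = delta(y - y2): height 1/dy at one point.
spike = np.zeros_like(y)
spike[i2] = 1.0 / dy

# Derivative of the density in the direction of the spike: this is the
# function delta n(y) / delta U(y2), sampled at every grid point y.
eps = 1e-6
response_curve = (density(U + eps * spike) - density(U - eps * spike)) / (2.0 * eps)

# Evaluate that function at position 1.
response = response_curve[i1]
exact = chi(y[i1] - y[i2]) / np.cosh(U[i2]) ** 2

print(response, exact)
```

The intermediate array `response_curve` plays the role of the function of position described above, and indexing it at `i1` corresponds to the final evaluation at position 1.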

Examples

We start with an example of how to use the definition.

Given a functional of the form

F[\rho ]=\int f(\mathbf {r} ,\rho (\mathbf {r} ),\nabla \rho (\mathbf {r} ))\,d^{3}r,

the functional derivative can be written

{\begin{matrix}\left\langle \delta F[\rho ],\phi \right\rangle &=&{\frac {d}{d\epsilon }}\left.\int f(\mathbf {r} ,\rho +\epsilon \phi ,\nabla \rho +\epsilon \nabla \phi )\,d^{3}r\right|_{\epsilon =0}\\&=&\int \left({\frac {\partial f}{\partial \rho }}\phi +{\frac {\partial f}{\partial \nabla \rho }}\cdot \nabla \phi \right)d^{3}r\\&=&\int \left[{\frac {\partial f}{\partial \rho }}\phi +\nabla \cdot \left({\frac {\partial f}{\partial \nabla \rho }}\phi \right)-\left(\nabla \cdot {\frac {\partial f}{\partial \nabla \rho }}\right)\phi \right]d^{3}r\\&=&\int \left[{\frac {\partial f}{\partial \rho }}\phi -\left(\nabla \cdot {\frac {\partial f}{\partial \nabla \rho }}\right)\phi \right]d^{3}r\\&=&\left\langle {\frac {\partial f}{\partial \rho }}-\nabla \cdot {\frac {\partial f}{\partial \nabla \rho }}\,,\phi \right\rangle \end{matrix}}

where the total divergence in the third line integrates to zero because $\phi =0$ is assumed on the boundary of the region of integration. Thus,

\delta F[\rho ]={\frac {\partial f}{\partial \rho }}-\nabla \cdot {\frac {\partial f}{\partial \nabla \rho }}

or, writing the expression more explicitly,

{\frac {\delta F[\rho ]}{\delta \rho (\mathbf {r} )}}={\frac {\partial }{\partial \rho }}f(\mathbf {r} ,\rho (\mathbf {r} ),\nabla \rho (\mathbf {r} ))-\nabla \cdot {\frac {\partial }{\partial \nabla \rho }}f(\mathbf {r} ,\rho (\mathbf {r} ),\nabla \rho (\mathbf {r} ))
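
As a numerical sanity check of this formula, the following sketch works in one dimension with the integrand $f={\tfrac {1}{2}}(\rho ')^{2}+\rho ^{3}$ , for which the formula gives $3\rho ^{2}-\rho ''$ . The integrand, the function ρ and the test function φ are assumptions chosen for illustration; φ is chosen to vanish at both ends of the interval, as the derivation requires.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 4001)

def integrate(values):
    """Trapezoidal approximation of the integral of `values` over the grid x."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))

def F(rho):
    """Illustrative functional with integrand f = rho'^2 / 2 + rho^3 (an assumption)."""
    drho = np.gradient(rho, x, edge_order=2)   # finite-difference derivative
    return integrate(0.5 * drho**2 + rho**3)

rho = np.cos(2.0 * x)
rho_xx = -4.0 * np.cos(2.0 * x)        # exact second derivative of rho
phi = np.sin(np.pi * x) ** 2           # test function, zero at x = 0 and x = 1

# Directional derivative of F at rho in the direction phi (central difference).
eps = 1e-5
directional = (F(rho + eps * phi) - F(rho - eps * phi)) / (2.0 * eps)

# Pairing of (df/drho - d/dx df/drho') = 3*rho^2 - rho'' with phi.
pairing = integrate((3.0 * rho**2 - rho_xx) * phi)

print(directional, pairing)
```

The two values should agree up to the discretization error of the finite-difference derivative, which shrinks as the grid is refined.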

The formula above applies to the particular case in which the integrand depends only on the function $\rho (\mathbf {r} )$ and its first derivative $\nabla \rho (\mathbf {r} )$ . In the more general case in which the integrand also depends on higher-order derivatives, i.e.,

F[\rho ]=\int f(\mathbf {r} ,\rho (\mathbf {r} ),\nabla \rho (\mathbf {r} ),\nabla ^{2}\rho (\mathbf {r} ),\cdots ,\nabla ^{N}\rho (\mathbf {r} ))\,d^{3}r,

where $\nabla ^{i}$ is a tensor whose $3^{i}$ components are partial derivative operators of order $i$ , i.e. operators of the form $\partial ^{i}/(\partial r_{1}^{i_{1}}\partial r_{2}^{i_{2}}\partial r_{3}^{i_{3}})$ with $i_{1}+i_{2}+i_{3}=i$ , an analogous application of the definition yields

{\frac {\delta F[\rho ]}{\delta \rho (\mathbf {r} )}}={\frac {\partial f}{\partial \rho }}-\nabla \cdot {\frac {\partial f}{\partial (\nabla \rho )}}+\nabla ^{2}\cdot {\frac {\partial f}{\partial \left(\nabla ^{2}\rho \right)}}-\cdots +(-1)^{N}\nabla ^{N}\cdot {\frac {\partial f}{\partial \left(\nabla ^{N}\rho \right)}}=\sum _{i=0}^{N}(-1)^{i}\nabla ^{i}\cdot {\frac {\partial f}{\partial \left(\nabla ^{i}\rho \right)}}.
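
The sign pattern can be checked numerically in the simplest higher-order case. The sketch below assumes a one-dimensional integrand $f={\tfrac {1}{2}}(\rho '')^{2}$ , for which only the $i=2$ term survives and the formula gives $+\rho ''''$ . The integrand, ρ and the test function φ are illustrative choices; φ and its first derivative both vanish at the ends of the interval, so that the two integrations by parts leave no boundary terms.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 4001)

def integrate(values):
    """Trapezoidal approximation of the integral of `values` over the grid x."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))

def second_derivative(g):
    """Finite-difference second derivative, obtained by applying np.gradient twice."""
    return np.gradient(np.gradient(g, x, edge_order=2), x, edge_order=2)

def F(rho):
    """Illustrative functional with integrand f = (rho'')^2 / 2 (an assumption)."""
    return integrate(0.5 * second_derivative(rho) ** 2)

rho = np.sin(2.0 * x)
rho_xxxx = 16.0 * np.sin(2.0 * x)     # exact fourth derivative of rho
phi = np.sin(np.pi * x) ** 2          # phi and phi' vanish at x = 0 and x = 1

# Directional derivative of F at rho in the direction phi (central difference).
eps = 1e-3
directional = (F(rho + eps * phi) - F(rho - eps * phi)) / (2.0 * eps)

# Pairing of (+1) * d^2/dx^2 (df/d rho'') = rho'''' with phi.
pairing = integrate(rho_xxxx * phi)

print(directional, pairing)
```

Agreement of the two values, up to discretization error, is consistent with the factor $(-1)^{2}=+1$ in front of the second-order term.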

It is also worthwhile to briefly discuss functional derivatives beyond their formal mathematical definition. Functional derivatives occur regularly in physical problems that obey a variational principle, so it is useful to show how they are computed in physically relevant examples.

Consider the Coulomb energy functional, $J[\rho ]$ ,