Chain rule


The chain rule in calculus states that if one variable y depends on a second variable u which in turn depends on a third variable x, then the rate of change of y with respect to x can be computed as the product of the rate of change of y with respect to u times the rate of change of u with respect to x.

In Leibniz's notation, this can be written as

dy/dx = (dy/du) * (du/dx)

A real-world example shows that this rule makes sense. Suppose you are climbing a mountain and gaining elevation at a rate of 0.5 kilometers per hour. The temperature is lower at higher elevations; suppose it decreases by 6 degrees per kilometer of elevation. How fast do you get colder? We multiply: 6 degrees per kilometer times 0.5 kilometers per hour makes 3 degrees per hour, so every hour you get three degrees colder. That is the heart of the chain rule.
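The arithmetic of the mountain example can be sketched in a few lines of Python. The function names and the starting temperature of 20 degrees are assumptions made only for this illustration; the two rates are the ones given in the text.

```python
# Rates taken from the mountain example in the text.
CLIMB_RATE_KM_PER_HOUR = 0.5    # du/dx: elevation gained per hour
COOLING_RATE_DEG_PER_KM = 6.0   # size of dy/du: degrees lost per kilometer

# Assumed for illustration only: temperature at the starting point.
START_TEMPERATURE_DEG = 20.0


def elevation_gain(hours):
    """Kilometers climbed after the given number of hours (u as a function of x)."""
    return CLIMB_RATE_KM_PER_HOUR * hours


def temperature(elevation_km):
    """Temperature after climbing the given elevation (y as a function of u)."""
    return START_TEMPERATURE_DEG - COOLING_RATE_DEG_PER_KM * elevation_km


# Chain rule: multiply the two rates.
drop_per_hour = COOLING_RATE_DEG_PER_KM * CLIMB_RATE_KM_PER_HOUR

# Direct check: compose the two functions and compare temperatures one hour apart.
observed_change = temperature(elevation_gain(1.0)) - temperature(elevation_gain(0.0))

print(drop_per_hour)     # 3.0
print(observed_change)   # -3.0
```

The composed function loses exactly as many degrees in one hour as the product of the two rates predicts; the minus sign only records that the temperature is falling.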

In the modern treatment, the chain rule is seen as a formula for the derivative of the composition of two functions. Suppose the real-valued function f is defined on some open subset of the real numbers containing the number x, and g is defined on some open subset of the reals containing f(x). If f is differentiable at x and g is differentiable at f(x), then the composition g o f is differentiable at x and the derivative can be computed as

(g o f)'(x) = g'(f(x)) * f'(x)

For example, in order to differentiate

h(x) = sin(x^2),
we write h(x) = g(f(x)) with g(u) = sin(u) and f(x) = x^2, and the chain rule then yields
h'(x) = cos(x^2) * 2x
since g'(u) = cos(u) and f'(x) = 2x.
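The result for h(x) = sin(x^2) can be checked numerically. The following is a minimal sketch that compares the chain-rule formula with a central difference quotient; the helper name central_difference, the step size, and the sample points are choices made for this illustration.

```python
import math


def f(x):
    """Inner function f(x) = x^2."""
    return x * x


def g(u):
    """Outer function g(u) = sin(u)."""
    return math.sin(u)


def h(x):
    """Composition h = g o f, that is h(x) = sin(x^2)."""
    return g(f(x))


def h_prime(x):
    """Derivative from the chain rule: g'(f(x)) * f'(x) = cos(x^2) * 2x."""
    return math.cos(f(x)) * 2 * x


def central_difference(func, x, step=1e-6):
    """Approximate func'(x) by a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)


for x in (0.0, 0.5, 1.0, 2.0):
    exact = h_prime(x)
    approx = central_difference(h, x)
    print(x, exact, approx)
```

At each sample point the two values should agree to several decimal places; the small remaining gap comes from the finite step size and floating-point rounding, not from the formula.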

The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which include the Euclidean spaces), f : E -> F and g : F -> G are functions, and x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative of the composition g o f at the point x is given by

D_x(g o f) = D_{f(x)}(g) o D_x(f)
Note that the derivatives here are linear maps and not numbers. If the linear maps are represented as matrices, the composition on the right-hand side turns into a matrix multiplication.
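In finite dimensions the matrices in question are Jacobian matrices. The sketch below illustrates this with two hypothetical maps chosen for this example, f(x, y) = (xy, x + y) and g(u, v) = u^2 + v; it multiplies their hand-computed Jacobians and compares the product with a finite-difference Jacobian of the composition. It uses plain Python lists rather than a matrix library.

```python
def f(p):
    """Example map f : R^2 -> R^2, f(x, y) = (x*y, x + y)."""
    x, y = p
    return [x * y, x + y]


def g(q):
    """Example map g : R^2 -> R^1, g(u, v) = u^2 + v."""
    u, v = q
    return [u * u + v]


def jacobian_f(p):
    """Matrix of the derivative of f at p (2 rows, 2 columns)."""
    x, y = p
    return [[y, x], [1.0, 1.0]]


def jacobian_g(q):
    """Matrix of the derivative of g at q (1 row, 2 columns)."""
    u, v = q
    return [[2.0 * u, 1.0]]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def numerical_jacobian(func, p, step=1e-6):
    """Approximate the Jacobian of func at p by central differences."""
    rows = len(func(p))
    result = [[0.0] * len(p) for _ in range(rows)]
    for j in range(len(p)):
        forward, backward = list(p), list(p)
        forward[j] += step
        backward[j] -= step
        f_plus, f_minus = func(forward), func(backward)
        for i in range(rows):
            result[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return result


point = [1.0, 2.0]

# Chain rule: the derivative of g at f(point), composed with the derivative of f at point.
chain_rule_jacobian = matmul(jacobian_g(f(point)), jacobian_f(point))

# Independent check on the composition g o f.
numerical = numerical_jacobian(lambda p: g(f(p)), point)

print(chain_rule_jacobian)  # [[9.0, 5.0]]
print(numerical)
```

Note the order of the factors: the Jacobian of g, evaluated at f(point), stands on the left, mirroring the order of the composition in the formula.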

A particularly nice formulation of the chain rule can be achieved in the most general setting: let M, N and P be C^k manifolds (or even Banach manifolds) and let f : M -> N and g : N -> P be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write

d(g o f) = dg o df
In this way, the formation of derivatives and tangent bundles is seen as a functor on the category of C^k manifolds.

Edited September 5, 2001 10:24 pm by AxelBoldt