The Dual Pythagorean Theorem [Extra]

Dual Vectors

Suppose we have an n-dimensional vector space, V, and a linear function f from V to the real numbers, R.

The kernel of f, written ker f, is the subspace of V on which f is zero. The dimension of ker f must be at least n–1 (and unless f is the zero function, it will be exactly n–1). For example, if n = 2 and f ≠ 0 then f will be zero on a line through the origin of V. If n = 3 and f ≠ 0 then f will be zero on a plane through the origin, and so on. This follows from a basic theorem of linear algebra known as the Rank-Nullity Theorem.
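
To make this concrete, here is a minimal Python sketch, assuming V is R^3 with vectors written as coordinate tuples. The function f(x, y, z) = 2x - y + 3z and the helper names make_linear and combo are illustrative choices, not anything defined in these notes. It checks that f vanishes on a two-dimensional subspace, which is what the Rank-Nullity Theorem requires of a non-zero f when n = 3.

```python
def make_linear(components):
    """Return the linear function f(v) = sum_i components[i] * v[i]."""
    def f(v):
        return sum(c * x for c, x in zip(components, v))
    return f

# Hypothetical example: f(x, y, z) = 2x - y + 3z on R^3.
f = make_linear((2, -1, 3))

# Two linearly independent vectors on which f vanishes; they span ker f.
k1 = (1, 2, 0)   # 2*1 - 2 + 3*0 = 0
k2 = (0, 3, 1)   # 2*0 - 3 + 3*1 = 0

def combo(s, t):
    """Return the linear combination s*k1 + t*k2."""
    return tuple(s * a + t * b for a, b in zip(k1, k2))

# f is zero on k1, on k2, and on any combination of them (a plane through the origin).
print(f(k1), f(k2), f(combo(5, -7)))

# f is not the zero function, so the kernel cannot be all of R^3.
print(f((1, 0, 0)))
```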

[Figure: Linear function on a two-dimensional vector space]

The diagram shows a two-dimensional example; here the function f is zero on a line through the origin spanned by the vector k.

If we pick a vector such as v, for which f(v) is non-zero, there will be a line in V passing, not through the origin but through the tip of v, on which f is uniformly equal to f(v). We form this line by adding arbitrary multiples of k to v:

f(v + s k) = f(v) + s f(k) = f(v) = 1

The last equality only follows because we happen to have chosen v such that f(v) = 1. But all the sets on which f takes on a constant value will be of this form: lines parallel to the vector k. And just as we draw different vectors on this diagram as arrows of various lengths and directions, we can draw different linear functions as stacks of lines of various spacings and orientations.
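
The same construction can be checked numerically. This is a small sketch, assuming V is R^2 in standard coordinates and picking the hypothetical function f(x, y) = x + 2y. The names f, k and v mirror the text, while shifted is an illustrative helper.

```python
def f(vec):
    """Hypothetical linear function on R^2: f(x, y) = x + 2y."""
    x, y = vec
    return x + 2 * y

k = (-2, 1)   # spans ker f, since f(k) = -2 + 2 = 0
v = (1, 0)    # chosen so that f(v) = 1

def shifted(s):
    """Return v + s k, a point on the line through the tip of v parallel to k."""
    return (v[0] + s * k[0], v[1] + s * k[1])

# f takes the same value, f(v), at every point of the line v + s k.
values = [f(shifted(s)) for s in (-3, 0, 0.5, 10)]
print(values)
```

Choosing a different v, say one with f(v) = 3, would pick out a different line in the same stack, still parallel to k.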

Larger vectors are drawn with larger arrows, but in this scheme a larger function would be drawn, not with greater distances between the lines, but with the lines packed closer together. For example, the function g = 2 f would be drawn with an extra line between every pair of lines in the stack for f, because it reaches 1 when f is just 1/2.

The value of f for any vector effectively counts the number of gaps in the stack of lines that the vector’s arrow crosses. For example, the arrow for w crosses two complete gaps in the stack for f, so f(w) = 2. We also need to account for arrows that cross the gaps backwards to yield a negative value for f (such as p), or those that don’t cross a whole number of gaps (such as q), so this is more of an appeal to geometric intuition than a rigorous mathematical scheme. Still, you can probably already see how it fits in with the examples on the main page, where we discuss how many furrows in a field or contour lines on a map are crossed when we move a given distance in a certain direction.

If V has more than two dimensions, instead of a stack of lines in V we will have a stack of planes, or hyperplanes, of dimension n–1.

Now, let’s consider the set of all linear functions from V to R. We can turn this set itself into an n-dimensional vector space in its own right, which is called the dual space to V, and written V*. We call elements of V*, such as f, dual vectors.

To make a set into a real vector space we need to be able to add its elements together, and multiply them by real numbers. We can do this to the set of linear functions on V in a pretty obvious way; if f and g are in V*, v is in V, and s is in R, we define the sum f + g and the scalar multiple s f by stating their values as functions:

(f + g)(v) = f(v) + g(v)
(s f)(v) = s f(v)
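
These two definitions translate almost word for word into code. The following is a sketch only, assuming linear functions on R^2 are represented as Python callables taking a coordinate tuple. The helper names add and scale, and the particular functions f and h, are hypothetical.

```python
def add(f, g):
    """Pointwise sum: (f + g)(v) = f(v) + g(v)."""
    return lambda v: f(v) + g(v)

def scale(s, f):
    """Scalar multiple: (s f)(v) = s * f(v)."""
    return lambda v: s * f(v)

# Two hypothetical linear functions on R^2.
f = lambda v: v[0] + 2 * v[1]
h = lambda v: 3 * v[0] - v[1]

v = (1, 0)
half_v = (0.5, 0.0)

g = scale(2, f)   # the function g = 2 f

print(add(f, h)(v))            # equals f(v) + h(v)
print(f(half_v), g(half_v))    # g reaches 1 where f is only 1/2
```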

As with any vector space, we can choose a basis for V*, consisting of a set of n linear functions in terms of which we can write any function in V*. We will write the elements of this basis as {e^1, e^2, ..., e^n}, and write:

f = f_1 e^1 + f_2 e^2 + ... + f_n e^n

where f_1 etc. are the components of f with respect to this basis. In these notes, while we usually label the components of vectors with superscripts (v^i), we will label the components of dual vectors with subscripts (f_i). And while we usually label the individual vectors in a basis with subscripts (e_i), we will label the individual dual vectors in a basis with superscripts (e^i).

Dual bases

If we have a particular basis for V, {e_1, e_2, ..., e_n}, we will say that a basis {e^1, e^2, ..., e^n} for V* is dual to the basis for V if the following condition holds:

e^i(e_j) = δ^i_j

where δ^i_j, known as the Kronecker delta symbol, is 1 if i = j and 0 if i ≠ j. Geometrically, what this means is that the function e^i for a particular i appears as a stack of lines or planes in V such that the vector e_i crosses exactly one gap between them, and all the other basis vectors e_j lie within the line or plane that passes through the origin, and don’t cross the stack at all. For example, in the diagram on the left the vector e_2 lies within the line e^1 = 0, and the vector e_1 lies within the line e^2 = 0.

Given our basis {e_1, e_2, ..., e_n} for V, we can find the components of any linear function f in V* with respect to the dual basis just by feeding each of the vectors e_i in the original basis to f. Because the two bases are duals, only the particular basis function e^i will be non-zero at e_i, and only the component that multiplies it, f_i, will remain in the sum.

f(e_i) = (f_1 e^1 + f_2 e^2 + ... + f_n e^n)(e_i) = f_i
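
Here is one way this could look in practice: a sketch, assuming V is R^2 in standard coordinates and the basis of V is deliberately not orthogonal. The helper dual_basis_2d, the basis vectors (1, 0) and (1, 1), and the function f(x, y) = 3x + 5y are all hypothetical choices. The dual basis functions are built from the rows of the inverse of the matrix whose columns are the basis vectors, which is a standard construction rather than anything specific to these notes.

```python
from fractions import Fraction

def dual_basis_2d(e1, e2):
    """Return the two linear functions dual to the basis (e1, e2) of R^2.

    Their coefficients are the rows of the inverse of the 2x2 matrix
    whose columns are e1 and e2.
    """
    det = Fraction(e1[0] * e2[1] - e2[0] * e1[1])
    if det == 0:
        raise ValueError("e1 and e2 are not linearly independent")
    row1 = (e2[1] / det, -e2[0] / det)
    row2 = (-e1[1] / det, e1[0] / det)

    def dual1(v):
        return row1[0] * v[0] + row1[1] * v[1]

    def dual2(v):
        return row2[0] * v[0] + row2[1] * v[1]

    return dual1, dual2

# A hypothetical non-orthogonal basis for R^2 and its dual.
e_1, e_2 = (1, 0), (1, 1)
dual1, dual2 = dual_basis_2d(e_1, e_2)

# Duality condition: each dual function is 1 on its partner, 0 on the other.
print(dual1(e_1), dual1(e_2), dual2(e_1), dual2(e_2))

# Components of a hypothetical f, found by feeding it the basis vectors.
f = lambda v: 3 * v[0] + 5 * v[1]
f_1, f_2 = f(e_1), f(e_2)

# The components rebuild f: compare f(v) with f_1 dual1(v) + f_2 dual2(v).
v = (4, -2)
print(f(v), f_1 * dual1(v) + f_2 * dual2(v))
```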

If the components of a vector v in V are v^i with respect to some chosen basis for V and the components of a dual vector f in V* are f_i with respect to the dual basis, then:

f(v) = f(v^i e_i) = v^i f(e_i) = f_i v^i

where we have used the Einstein summation convention to abbreviate sums over repeated indices (e.g. v^i e_i = v^1 e_1 + v^2 e_2 + ... + v^n e_n).
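
As a quick check that this sum gives f(v) whichever basis we use, the sketch below computes it in two different bases of R^2. The function f(x, y) = 3x + 5y, the skewed basis (1, 0), (1, 1), and its dual (written in by hand as x - y and y) are hypothetical choices, and contract is an illustrative helper that writes the Einstein sum as an explicit loop.

```python
def f(v):
    """Hypothetical linear function on R^2: f(x, y) = 3x + 5y."""
    return 3 * v[0] + 5 * v[1]

def contract(f_components, v_components):
    """The sum of f_i times v^i over i, written out explicitly."""
    return sum(fi * vi for fi, vi in zip(f_components, v_components))

v = (4, -2)

# Standard basis: e_1 = (1, 0), e_2 = (0, 1).
f_std = (f((1, 0)), f((0, 1)))   # components of f: f evaluated on each basis vector
v_std = v                        # components of v are just its coordinates

# Skewed basis: e_1 = (1, 0), e_2 = (1, 1). Its dual basis functions are
# (x, y) -> x - y and (x, y) -> y, and the components of v are their values at v.
f_skew = (f((1, 0)), f((1, 1)))
v_skew = (v[0] - v[1], v[1])

# All three numbers agree.
print(f(v), contract(f_std, v_std), contract(f_skew, v_skew))
```

The individual components differ between the two bases, but the contracted sum does not.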

Suppose we have a dot product on V. Then for any vector w in V, we can define a linear function f_w in V* by:

f_w(v) = v · w

Equally, given any linear function f in V*, we can find a vector w_f such that f(v) = v · w_f for every v in V. We choose an orthonormal basis {e_1, e_2, ..., e_n} for V, then we set:

w_f = f_i e_i

where we’re using the Einstein summation convention again, and the components f_i are with respect to the basis of V* dual to our orthonormal basis of V. Then:

v · w_f = v · [f_i e_i] = f_i [v · e_i] = f_i v^i = f(v)

So we can identify any element of V with a unique element of V*, and vice versa.

The vector f_i e_i that we associate with f this way will be orthogonal to any vector k that lies in the kernel of f, since the dot product of k with f_i e_i is just f(k) = 0. Since all the lines (or planes, etc.) in the stack we use to visualise f are parallel to the one through the origin that is the kernel of f, the vector f_i e_i will be perpendicular to the whole stack.
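
The identification of a linear function with a vector, and the orthogonality of that vector to the kernel, can both be checked with a short sketch. It assumes V is R^2 with the standard dot product. The function f(x, y) = 2x + y, the rotated orthonormal basis built from a 3-4-5 triangle, and the names dot, w_f, v and k are illustrative choices.

```python
from fractions import Fraction

def dot(a, b):
    """Standard dot product on R^2."""
    return a[0] * b[0] + a[1] * b[1]

def f(v):
    """Hypothetical linear function on R^2: f(x, y) = 2x + y."""
    return 2 * v[0] + v[1]

# An orthonormal basis rotated away from the axes (3-4-5 keeps the arithmetic exact).
e_1 = (Fraction(3, 5), Fraction(4, 5))
e_2 = (Fraction(-4, 5), Fraction(3, 5))

# Components of f are its values on the basis vectors; the associated
# vector is the sum of each component times the matching basis vector.
f_1, f_2 = f(e_1), f(e_2)
w_f = (f_1 * e_1[0] + f_2 * e_2[0], f_1 * e_1[1] + f_2 * e_2[1])

v = (7, -3)
k = (-1, 2)   # lies in ker f, since f(k) = -2 + 2 = 0

print(w_f)
print(f(v), dot(v, w_f))   # the two values agree
print(f(k), dot(k, w_f))   # both are zero: w_f is orthogonal to ker f
```

Although the components of f depend on the orthonormal basis chosen, the resulting vector comes out as (2, 1) here, the same as it would with the standard basis.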

We can put a dot product on V*, by declaring that any basis dual to an orthonormal basis for V is itself orthonormal. This lets us define the “length” or magnitude of a dual vector f via its square:

|f|^2 = f · f = (f_1 e^1 + f_2 e^2 + ... + f_n e^n) · (f_1 e^1 + f_2 e^2 + ... + f_n e^n) = (f_1)^2 + (f_2)^2 + ... + (f_n)^2

Here the f_i are components taken with respect to a basis {e^1, e^2, ..., e^n} of V* that is dual to an orthonormal basis for V.


According to our geometric interpretation of f, each component f_i = f(e_i) here is a count of the number of “gaps in the stack” that the basis vector e_i crosses. So these are like the counts of cycles of the wave crossed in the diagram on the right, when we move one metre along the x and y axes.

But now suppose we cross the stack perpendicularly, with a unit vector in the direction of f_i e_i (explicitly, f_i e_i / |f_i e_i|), which we know is orthogonal to the stack. The count from such a direct crossing is:

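f(f_i e_i / |f_i e_i|)
= f_i f(e_i) / |f_i e_i|
= f_i f_i / |f_i e_i|
= |f|^2 / |f|
= |f|

The fourth line uses the fact that the e_i are orthonormal, so |f_i e_i|^2 = (f_1)^2 + (f_2)^2 + ... + (f_n)^2 = |f|^2.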

This is another statement of the Dual Pythagorean Theorem! Crossing the stack directly with a unit vector gives a count, |f|, whose square we have seen is just the sum of squares of the counts obtained by crossing the stack with a unit vector in each of n orthogonal directions.
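
A numerical instance may help. This sketch assumes V is R^3 with the standard orthonormal basis and picks a hypothetical dual vector with components (2, 3, 6), chosen because its magnitude is exactly 7. The names f_components, axis_counts, unit and direct_count are illustrative.

```python
import math

# Components of a hypothetical dual vector f, taken with respect to the
# basis dual to the standard orthonormal basis of R^3.
f_components = (2.0, 3.0, 6.0)

def f(v):
    """Evaluate f on v by summing component-by-component products."""
    return sum(fi * vi for fi, vi in zip(f_components, v))

# Counts from crossing the stack along each orthonormal basis vector.
basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
axis_counts = [f(e) for e in basis]

# Count from crossing the stack perpendicularly with a unit vector.
norm = math.sqrt(sum(fi * fi for fi in f_components))
unit = tuple(fi / norm for fi in f_components)
direct_count = f(unit)

print(axis_counts, direct_count)

# The square of the direct count matches the sum of squares of the axis counts.
print(math.isclose(direct_count ** 2, sum(c * c for c in axis_counts)))
```

Floating-point arithmetic is used here, so the comparison is made with math.isclose rather than exact equality.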

Orthogonal / The Dual Pythagorean Theorem [Extra] / created Wednesday, 6 April 2011
If you link to this page, please use this URL: https://www.gregegan.net/ORTHOGONAL/01/DualPythagoreanExtra.html
Copyright © Greg Egan, 2011. All rights reserved.