Riemannian Calculus of Variations Using Strongly Typed Tensor Calculus

Dods, Victor

doi:10.3390/math10183231



by

Victor Dods

^†,‡

Independent Researcher, Seattle, WA 98106, USA

^†

Homepage: https://thedods.com/victor.

^‡

Website: https://itdont.work.

Mathematics 2022, 10(18), 3231; https://doi.org/10.3390/math10183231

Submission received: 23 July 2022 / Revised: 23 August 2022 / Accepted: 26 August 2022 / Published: 6 September 2022

(This article belongs to the Special Issue Variational Methods on Riemannian Manifolds: Theory and Applications)


Abstract:

In this paper, the notion of strongly typed language will be borrowed from the field of computer programming to introduce a calculational framework for linear algebra and tensor calculus for the purpose of detecting errors resulting from inherent misuse of objects and for finding natural formulations of various objects. A tensor bundle formalism, crucially relying on the notion of pullback bundle, will be used to create a rich type system with which to distinguish objects. The type system and relevant notation are designed to “telescope” to accommodate a level of detail appropriate to a set of calculations. Various techniques using this formalism will be developed and demonstrated with the goal of providing a relatively complete and uniform method of coordinate-free computation. The calculus of variations pertaining to maps between Riemannian manifolds will be formulated using the strongly typed tensor formalism and associated techniques. Energy functionals defined in terms of first-order Lagrangians are the focus of the second half of this paper, in which the first variation, the Euler–Lagrange equations, and the second variation of such functionals will be derived.

Keywords:

Riemannian manifolds; tensor and tensor field type system; calculus of variations; Euler–Lagrange equations

MSC:

49-02; 49Q10; 53C99; 58C99

1. Introduction

Many important differential equations have a variational origin, being derived as the Euler–Lagrange equations for a particular functional on some space of functions. The variational approach lends itself particularly to physics, in which conservation of energy or minimization of action is a central concept. The naturality of such formulations cannot be overstated, as solutions to such problems often depend critically on the inherent geometry of the underlying objects. For example, solutions to Laplace’s equation for a real-valued function (e.g., modeling steady-state heat flow) on a Riemannian manifold depend qualitatively on the topology of the manifold (e.g., harmonic functions on a closed Riemannian manifold are necessarily constant, which makes sense geometrically because there is no boundary through which heat can escape).

A central concept in the field of software design is that of information hiding [1], in which a computer program is organized into modules, each presenting an abstract public interface. Other parts of the program can interact only through the presented interface, and the details of how each module works are hidden, thereby preventing interference in the implementation details which are not required by the inherent structure of the module. This concept has clear usefulness in the field of mathematics as well. For example, there are several formulations of the real numbers (e.g., equivalence classes of Cauchy sequences of rational numbers, Dedekind cuts, decimal expansions, etc.), but their particulars are instances of what are known as implementation details, and the details of each particular implementation are irrelevant in most areas of mathematics, which only use the inherent properties of the real numbers as a complete, totally ordered field. Of course, at certain levels, it is useful or necessary to “open up the box” [go past the public interface] and work with a particular representation of the real numbers.

Information hiding is characteristic of abstract mathematics, in which general results are proved about abstract mathematical objects without using any particular implementation of said objects. These results can then be used modularly in other proofs, just as the functionality of a computer program is organized into modularized objects and functions. For example, the contraction mapping theorem is a fixed point theorem for contractive mappings on closed sets in Banach spaces, and a particular application of this theorem yields an existence and uniqueness theorem for first-order ODEs ([2], pp. 59, 62).

A loose conceptual analogy for modularity is that of diagonalizing a linear operator. A basis of eigenvectors is chosen so that the action of the linear operator on each eigenspace has a particularly simple expression, and distinct eigenspaces do not interact with respect to the operator’s action. In this analogy, the eigenvectors then correspond to individual lemmas, and the linear operator corresponds to a large theorem which uses each lemma. Decomposing the proof of the main result in terms of non-interacting lemmas simplifies the proof considerably, just as it simplifies the quantification of the linear operator. The term “orthogonal” has been borrowed by software design to describe two program modules whose functionality is independent ([3], Chapter 4, Section 2). Orthogonality in software design is highly desirable as it generally eases program implementation and program correctness verification, as the human designers are only capable of keeping track of a certain finite number of details simultaneously [4]. The scope of each detail level of the design is limited in complexity, making the overall design easier to comprehend.

This technique in software design carries over directly to proof design, where it is desirable (elegant) to write proofs and do calculations without introducing extraneous details, such as choice of bases in vector spaces or local coordinates in manifolds. Because such choices are generally non-unique, they can often obscure the inherent structure of the relevant objects by introducing artifacts arising from properties of the particular details used to implement said objects. For example, the choice of a particular local coordinate chart on a manifold artificially imposes an additive structure on a neighborhood of the manifold, but such a structure has nothing to do with the inherent geometry of the manifold. Furthermore, the descent to this “lower level” of calculation discards some type information, representing points in a manifold as Euclidean vectors, thereby losing the ability to distinguish points from different manifolds, or even different localities in the same manifold.

This paper places particular emphasis on natural formulations and calculations in order to expose the underlying geometric structures rather than relying on coordinate-based expressions. The constructions of the “full” direct sum and “full” tensor product bundles are used in combination with induced covariant derivatives to this end.

For more on relevant introductory theory on manifolds, bundles and Riemannian geometry, see [5,6,7,8].

2. Notation and Conventions

Let all vector spaces, manifolds and [fiber] bundles be real and finite-dimensional unless otherwise noted (this allows the canonical identification

V^{* *} ≅ V

for a vector space or vector bundle V), and let all tensor products be over

R

. The unqualified term “bundle” will mean “fiber bundle”. The Einstein summation convention will be assumed in situations when indexed tensors are used for computation.

Unary operators are understood to have higher binding precedence than binary operators, and superscripts and subscripts are understood to have the highest binding precedence. For example, the expression

\nabla X_{, M} \circ ϕ

would be parenthesized as

(\nabla (X_{, M})) \circ ϕ

.

Apart from the obvious purpose of providing a concise and central reference for the notation in this paper, the following notation index serves to illustrate the use of telescoping notation (see Section 3.2). The high-level (terse notation which requires the reader to do more work in type inference but is more agile), mid-level, and low-level (completely type-specified, requiring little work on the part of the reader) notations are presented side-by-side with their definitions.

Let

I \subseteq R

be a neighborhood of 0, let

ϵ, i

each be coordinates on I, let

A, A_{1}, \dots, A_{n}, B

be sets, let

M, N

be manifolds, let

ϕ \in C^{\infty} (M, N)

, let

π_{M}^{A} : A \to M

and

π_{N}^{H} : H \to N

be vector bundles, where

A = E, F, F_{1}, \dots, F_{n}, G

, let

U, V, V_{1}, \dots, V_{n}, W

be vector spaces, and let

c_{i} \in Γ (F_{i} \otimes_{M} T^{*} M)

such that

c_{1} \oplus_{M} \dots \oplus_{M} c_{n} \in Γ ((F_{1} \oplus_{M} \dots \oplus_{M} F_{n}) \otimes_{M} T^{*} M)

is a vector bundle isomorphism.

Table 1 and Table 2 are references which enumerate the notation used in this paper in its high-, mid-, and low-level forms.

3. Mathematical Setting

3.1. Using Strong Typing to Error-Check Calculations

Linear algebra is an excellent setting for discussion of the strong typing [9] of a language, a concept used in the design of computer programming languages. The idea is that when the human-readable source code of a program is compiled (translated into machine-readable instructions), the compiler (the program which performs this translation) or runtime (the software which executes the code) verifies that the program objects are being used in a well-defined way, producing an error for each operation that is not well-defined. For example, a vector-type value would not be allowed to be added to a permutation-type value, even though tuples of unsigned integers (i.e., bytes) are used by the computer to represent both, and the computer’s processing unit could add together their byte-valued representations. However, such an operation would be meaningless with respect to the types of the operands. The result of the operation would depend on the non-canonical choice of representation for each object. Strong type checking has the advantage of catching many programming errors, including most importantly those resulting from an inherent misuse of the program’s objects. Within this paper, certain type-explicit notations will be used to provide forms of type awareness conducive to error-checking.
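The vector/permutation example can be sketched in code. The following is a minimal illustration, assuming two hypothetical classes `Vector` and `Permutation` (names chosen here, not taken from any library) that share the same underlying representation, a tuple of integers, yet refuse to be added together. Python performs this check at runtime rather than at compile time.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Vector:
    """A vector with integer components, stored as a tuple of integers."""
    components: Tuple[int, ...]

    def __add__(self, other: object) -> "Vector":
        # Addition is only defined between two Vector values of equal length.
        if not isinstance(other, Vector):
            return NotImplemented
        if len(other.components) != len(self.components):
            raise TypeError("cannot add vectors of different dimension")
        return Vector(tuple(a + b for a, b in zip(self.components, other.components)))


@dataclass(frozen=True)
class Permutation:
    """A permutation of {0, ..., n-1}, also stored as a tuple of integers."""
    images: Tuple[int, ...]

    def __mul__(self, other: object) -> "Permutation":
        # Composition (self after other), defined only between permutations.
        if not isinstance(other, Permutation):
            return NotImplemented
        if len(other.images) != len(self.images):
            raise TypeError("cannot compose permutations of different sets")
        return Permutation(tuple(self.images[i] for i in other.images))


v = Vector((1, 0, 2))
p = Permutation((1, 0, 2))

# The two values have identical representations but different types.
same_representation = v.components == p.images

# Adding them is rejected, even though the tuples could be added entrywise.
try:
    v + p
    mixed_addition_rejected = False
except TypeError:
    mixed_addition_rejected = True
```

The rejection depends only on the declared types of the operands, not on their stored values, which is the behavior the paragraph above attributes to strong type checking.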

An important example of semi-strong typing in math is Penrose’s abstract index notation [10], modeled on Einstein’s summation convention, in which linear algebra and tensor calculus are implemented using indexed objects (tensors) having a certain number and order of “up” and “down” indices (an abstraction of the genuine basis/coordinate expressions in which the indexed objects are arrays of scalars/functions). A non-indexed tensor is a scalar value, a tensor having a single up or down index is a vector or covector value respectively, a tensor having an up and a down index is an endomorphism, and so forth. The tensors are contracted by pairing a certain number of up indices with the same number of down indices, resulting in an object having as indices the uncontracted indices.

For example, given a finite-dimensional inner product space

(V, g)

, where g is a

(\binom{0}{2})

-tensor (having the form

g_{i j}

, i.e., two down indices), a vector

v \in V

is a

(\binom{1}{0})

-tensor, and the length of v is

\sqrt{v^{i} g_{i j} v^{j}}

. If

dim V > 1

, then

⋀^{2} V

has positive dimension, its vectors each being

(\binom{2}{0})

-tensors, and

G_{i j k ℓ} : = g_{i k} g_{j ℓ} - g_{i ℓ} g_{j k}

is an inner product on

⋀^{2} V

(which must be a

(\binom{0}{4})

-tensor in order to contract with two

(\binom{2}{0})

-tensors).

Certain type errors are detected by use of abstract index notation in the form of index mismatch. For example, with

(V, g)

as above, if

α \in V^{*}

, then

α

is a

(\binom{0}{1})

-tensor. Because of the repeated j down indices, the expression

g_{i j} α_{j}

typically indicates a type error;

g_{i j}

cannot contract with

α_{j}

because of incompatible valence (valence being the number of up and down indices). Furthermore, multiplying a

(\binom{0}{2})

-tensor with a

(\binom{0}{1})

-tensor without contraction should result in a

(\binom{0}{3})

-tensor, which should be denoted using three indices, as in

g_{i j} α_{k}

.

The only explicit type information provided by abstract index notation is that of valence. The “semi” qualifier mentioned earlier is earned by the lack of distinction between the different spaces in which the tensors reside. For example, if

U, V, W

are finite-dimensional vector spaces, then linear maps

A : U \to V

and

B : V \to W

can be written as

(\binom{1}{1})

-tensors, and their composition

B \circ A : U \to W

is written as the tensor contraction

{(B \circ A)}_{j}^{i} = B_{k}^{i} A_{j}^{k}

. However, while the expression

A_{k}^{i} B_{j}^{k}

makes sense in terms of valence compatibility (i.e., grammatically), the composition “

A \circ B

” that it should represent is not well-defined. Thus this form of type error is not caught by abstract index notation, since the domains/codomains of the linear maps must be checked separately.
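This gap can be illustrated with a small sketch. It assumes that U, V and W all have dimension 2, so that every index contraction is grammatically valid, and it uses a hypothetical `LinearMap` class (introduced here for illustration only) that records a domain and codomain name alongside the matrix of coefficients.

```python
from dataclasses import dataclass


def contract(x, y):
    """Index contraction x^i_k y^k_j on bare arrays; only index ranges are checked."""
    if len(x[0]) != len(y):
        raise ValueError("index ranges do not match")
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0])))
        for i in range(len(x))
    )


@dataclass(frozen=True)
class LinearMap:
    """A matrix of coefficients tagged with the names of its domain and codomain."""
    domain: str
    codomain: str
    matrix: tuple

    def after(self, other: "LinearMap") -> "LinearMap":
        """Return self ∘ other, defined only if other's codomain is self's domain."""
        if other.codomain != self.domain:
            raise TypeError(
                f"cannot compose a map on {self.domain} after a map into {other.codomain}"
            )
        return LinearMap(other.domain, self.codomain, contract(self.matrix, other.matrix))


A = LinearMap("U", "V", ((1, 2), (3, 4)))
B = LinearMap("V", "W", ((0, 1), (1, 0)))

# Well-defined composition U -> V -> W.
BA = B.after(A)

# On bare arrays, the contraction A^i_k B^k_j goes through without complaint.
untyped = contract(A.matrix, B.matrix)

# With domain and codomain tracked, the ill-defined composition A ∘ B is rejected.
try:
    A.after(B)
    type_error_caught = False
except TypeError:
    type_error_caught = True
```

The bare contraction checks valence and index range only, whereas the tagged version also performs the domain/codomain check that the text says must otherwise be done separately.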

The use of dimensional analysis (the abstract use of units such as kilograms, seconds, etc.) in physics is an important precedent for strong typing. Each quantity has an associated “dimension” (this is a different meaning from the “dimension” of linear algebra) which is expressed as a fraction of powers of formal symbols. The ordinary algebraic rules for fractions and formal symbols are used for the dimensional components, with the further requirement that addition and equality may only occur between quantities having the same dimension.

For example, if E, M and C represent the dimensions of energy, mass and cost, respectively, and if the energy storage density

ρ E / M

of a battery manufacturing process is known (having dimensions energy per mass) and the manufacturing weight yield

w M / C

of the battery is known (having dimensions mass per cost), then under the algebraic conventions of dimensional analysis, calculating the energy storage per cost (which should have dimensions energy per cost) is simple;

(ρ \frac{E}{M}) (w \frac{M}{C}) = ρ w \frac{E M}{M C} = ρ w \frac{E}{C}

(the M symbols cancel in the fraction). Here, both

ρ

and w are real numbers, and besides using the well-definedness of real multiplication, no type-checking is done in the expression

ρ w

.

A contrasting example is the quantity

ρ / w

, having dimensions

E C / M^{2}

. However, these dimensions may be considered to be meaningless in the given context. The quantity’s type adds meaning to the real-valued quantity, and while the quantity is well-defined as a real number, the uselessness of the type may indicate that an error has been made in the calculations. For example, a type mismatch between the two sides of an equation is a strong indication of error.
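The battery example can be mimicked with a small sketch of dimension-tagged quantities. The class `Quantity` and the numerical values of ρ and w below are hypothetical and chosen only for illustration. Dimensions are stored as exponents of the formal symbols E, M and C.

```python
class Quantity:
    """A real value tagged with a dimension, stored as {symbol: exponent}."""

    def __init__(self, value, dim):
        self.value = value
        # Drop zero exponents so that equal dimensions compare equal.
        self.dim = {s: e for s, e in dim.items() if e != 0}

    def __mul__(self, other):
        # Multiplication adds exponents, as for formal fractions.
        dim = dict(self.dim)
        for s, e in other.dim.items():
            dim[s] = dim.get(s, 0) + e
        return Quantity(self.value * other.value, dim)

    def __truediv__(self, other):
        # Division subtracts exponents.
        dim = dict(self.dim)
        for s, e in other.dim.items():
            dim[s] = dim.get(s, 0) - e
        return Quantity(self.value / other.value, dim)

    def __add__(self, other):
        # Addition is only allowed between quantities of the same dimension.
        if self.dim != other.dim:
            raise TypeError("dimension mismatch in addition")
        return Quantity(self.value + other.value, self.dim)


rho = Quantity(200.0, {"E": 1, "M": -1})  # energy per mass (illustrative value)
w = Quantity(0.5, {"M": 1, "C": -1})      # mass per cost (illustrative value)

energy_per_cost = rho * w  # the M symbols cancel
ratio = rho / w            # dimensions E C / M^2

try:
    rho + w
    addition_rejected = False
except TypeError:
    addition_rejected = True
```

Note that `ratio` is computed without complaint. As in the text, it is the unexpected dimension of the result, not a failure of the arithmetic, that signals a probable error.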

This is also a convenient way to think about the chain rule of calculus. If

z (y)

,

y (x)

, and x measure real-valued quantities, then

z (y (x))

measures the quantity z with respect to quantity x. Using Z, Y and X for the dimensions of the quantities z, y and x respectively, the derivative

\frac{d z}{d x}

has units

Z / X

. When worked out, the dimensions for the quantities on either side of the equation

\frac{d z}{d x} = \frac{d z}{d y} \frac{d y}{d x}

will match exactly, having a non-coincidental similarity to the calculation in the battery product example.
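Written out, the dimension count for the chain rule reads as follows. The square bracket notation for “the dimension of” is an assumption introduced here for illustration and is not used elsewhere in this paper.

```latex
\left[ \frac{d z}{d y} \, \frac{d y}{d x} \right]
  = \frac{Z}{Y} \cdot \frac{Y}{X}
  = \frac{Z \, Y}{Y \, X}
  = \frac{Z}{X}
  = \left[ \frac{d z}{d x} \right]
```

The Y symbols cancel in exactly the way the M symbols cancel in the battery example.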

3.2. Telescoping Notation (Also Known as Do Not Fear the Verbosity)

Many of the computations developed in this paper will appear to be overly pedantic, owing to the decoration-heavy notation that will be introduced in Section 3.3. This decoration is largely for the purpose of tracking the myriad of types in the type system and to assist the human reader or writer in making sense of and error-checking the expressions involved. The pedantry in this paper plays the role of introducing the technique. The notation is designed to telescope (Credit for the notion of telescoping notation is due in part to David DeConde, during one of many enjoyable and insightful conversations.), meaning that there is a spectrum of notational decoration; from

pedantically type-specified, verbose, and decoration-heavy, where [almost] no types must be inferred from context and there is little work or expertise required on the part of the reader, to
somewhat decorated but more compact, where the reader must do a little bit of thinking to infer some types, all the way to
tersely notated with minimal type decoration, where [almost] all types must be inferred from context and the reader must either do a lot of thinking or be relatively experienced.

Additionally, some of the chosen symbols are meant to obey the same telescoping range of specificity. For example, compare n-fold tensor contraction

\cdot^{n}

with type-specified

\cdot_{V_{1} \otimes \dots \otimes V_{n}}

as discussed in Section 3.3, or the symbols ∇,

\nabla^{_{○}}

, and

\nabla^{|}

as discussed in Section 3.10. Tersely notated computations can be seen in Section 3.10, while fully-verbose computations abound in the careful exposition of Section 4.

3.3. Strongly-Typed Linear Algebra via Tensor Products

A fully strongly typed formulation of linear algebra will now be developed which enjoys a level of abstraction and flexibility similar to that of Penrose’s abstract index notation. Emphasis will be placed on notational and conceptual regularity via a tensor formalism, coupled with a notion of “untangled” expression which exploits and notationally depicts the associativity of linear composition.

If V denotes a finite-dimensional vector space, then let

\cdot_{V} : V^{*} \times V \to R, (α, v) \mapsto α (v)

denote the natural pairing on V, and denote

\cdot_{V} (α, v)

using the infix notation

α \cdot_{V} v

. The natural pairing is a nondegenerate bilinear form and its bilinearity gives the expression

α \cdot_{V} v

multiplicative semantics (distributivity and commutativity with scalar multiplication), thereby justifying the use of the infix · operator normally reserved for multiplication. The natural pairing subscript V is seemingly pedantic, but will prove to be an invaluable tool for articulating and navigating the rich type system of the linear algebraic and vector bundle constructions used in this paper. When clear from context, the subscript V may be omitted.

Because V is finite-dimensional, it is reflexive (i.e., the canonical injection

V \to V^{* *}, v \mapsto (α \mapsto α (v))

is a linear isomorphism). Thus the natural pairing

\cdot_{V^{*}}

on

V^{*}

can be written naturally as

\cdot_{V^{*}} : V \times V^{*} \to R, (v, α) \mapsto α (v) .

Note that

α \cdot_{V} v = v \cdot_{V^{*}} α

. Though subtle, the distinction between

\cdot_{V}

and

\cdot_{V^{*}}

is important within the type system used in this paper.

Through a universal mapping property of multilinear maps, the bilinear forms

\cdot_{V}

and

\cdot_{V^{*}}

descend to the natural trace maps

\begin{matrix} {tr}_{V} : V^{*} \otimes V & \to & R, & α \otimes v \mapsto α (v), & \text{and} \\ {tr}_{V^{*}} : V \otimes V^{*} & \to & R, & v \otimes α \mapsto α (v), & \end{matrix}

each extended linearly to non-simple tensors. These operations can also be called tensor contraction. Noting that

{(V^{*} \otimes V)}^{*}

and

{(V \otimes V^{*})}^{*}

are canonically isomorphic to

V \otimes V^{*}

and

V^{*} \otimes V

respectively, then for each

A \in V^{*} \otimes V

and

B \in V \otimes V^{*}

, it follows that

{tr}_{V} (A) = I_{V^{*}} \cdot_{V^{*} \otimes V} A

and

{tr}_{V^{*}} (B) = I_{V} \cdot_{V \otimes V^{*}} B

.

Definition 1

(Linear maps as tensors). Let V and W be finite-dimensional vector spaces, and let

Hom (V, W)

denote the space of vector space morphisms from V to W (i.e., linear maps). The linear isomorphism

\begin{matrix} W \otimes V^{*} & \to & Hom (V, W), \\ w \otimes α & \mapsto & (V \to W, v \mapsto w (α \cdot_{V} v)) \end{matrix}

(extended linearly to general tensors) will play a central conceptual role in the calculations employed in this paper, as it will facilitate constructions which would otherwise be awkward or difficult to express. Linear maps and appropriately typed tensor products will be identified via this isomorphism.

Given bases

v_{1}, \dots, v_{m} \in V

and

w_{1}, \dots, w_{n} \in W

, and dual bases

v^{1}, \dots, v^{m} \in V^{*}

and

w^{1}, \dots, w^{n} \in W^{*}

, a linear map

A : V \to W

can be written under the identification in (1) as

A = A_{j}^{i} w_{i} \otimes v^{j},

where

A_{j}^{i} = w^{i} \cdot_{W} A \cdot_{V} v_{j} \in R

, and in fact

[A_{j}^{i}] \in M_{n \times m} (R)

is the matrix representation of A with respect to the bases

v_{1}, \dots, v_{m} \in V

and

w_{1}, \dots, w_{n} \in W

, noting that the i and j indices denote the “output” and “input” components of A respectively. Tensors are therefore the strongly typed analog of matrices, where the

W \otimes V^{*}

type information is carried by the

w_{i} \otimes v^{j}

component. One particular example is the identity map on V, which has type

V \otimes V^{*}

and is expressed simply as

v_{i} \otimes v^{i}

(equivalently,

δ_{j}^{i} v_{i} \otimes v^{j}

). Throughout this paper, the identity map on V will be referred to as the identity tensor on V, or just the identity tensor if V is clear from context, and will be denoted as

I_{V}

.

One clarifying example of the tensor formulation is the adjoint operation of the natural pairing, also known as forming the dual of a linear map. It is straightforward to show that this operation is given by the map

\begin{matrix} * : W \otimes V^{*} & \to & V^{*} \otimes W, \\ w \otimes α & \mapsto & α \otimes w, \end{matrix}

(where the map is extended linearly to general tensors). This is literally the tensor abstraction of the matrix transpose operation; if

A = A_{j}^{i} w_{i} \otimes α^{j}

, then the dual A is

A^{*} = A_{i}^{j} α^{i} \otimes w_{j}

. The matrix of

A^{*}

is precisely the transpose of the matrix of A with respect to the relevant bases. The map * itself can be written as a 4-tensor

* \in V^{*} \otimes W \otimes W^{*} \otimes V

, where

A^{*} = * \cdot_{W \otimes V^{*}} A

.
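The statement that tensors are the strongly typed analog of matrices, and that the dual is the typed transpose, can be sketched as follows. The class `Tensor2` is hypothetical and introduced here for illustration only. It stores coefficients with respect to fixed bases together with the names of the two tensor factors, which play the role of the type information carried by the basis tensors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tensor2:
    """Coefficients of a 2-tensor together with the names of its two factors."""
    left: str
    right: str
    coeffs: tuple  # coeffs[i][j] multiplies (i-th basis of left) ⊗ (j-th basis of right)

    def dual(self) -> "Tensor2":
        """Swap the two factors: w ⊗ α ↦ α ⊗ w, extended linearly."""
        rows, cols = len(self.coeffs), len(self.coeffs[0])
        swapped = tuple(
            tuple(self.coeffs[i][j] for i in range(rows)) for j in range(cols)
        )
        return Tensor2(self.right, self.left, swapped)


# A linear map V -> W with dim V = 3 and dim W = 2, i.e., a tensor of type W ⊗ V*.
A = Tensor2("W", "V*", ((1, 2, 3), (4, 5, 6)))

# Its dual has type V* ⊗ W, and its coefficient array is the transpose.
A_dual = A.dual()
```

Unlike a bare matrix transpose, the result records that the factors have been exchanged, so that `A` and `A_dual` cannot be confused even when V and W have the same dimension.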

There is a notion of the natural pairing of tensor products, which implements composition and evaluation of linear maps, and can be thought of as a natural generalization of scalar multiplication in a field. If

U,

V,

and W are each finite-dimensional vector spaces, then the bilinear form

\begin{matrix} (U \otimes V^{*}) \times (V \otimes W) & \to & U \otimes R \otimes W ≅ U \otimes W, \\ (u \otimes α, v \otimes w) & \mapsto & u \otimes (α \cdot_{V} v) \otimes w = (α \cdot_{V} v) u \otimes w \end{matrix}

will be denoted also by the infix notation

\cdot_{V}

(i.e.,

(u \otimes α) \cdot_{V} (v \otimes w) = (α \cdot_{V} v) u \otimes w

). If V itself is a tensor product of n factors which are clear from context, then

\cdot_{V}

may be denoted by

\cdot^{n}

(think of an n-fold tensor contraction). If

n = 2

, then typically : is used in place of

\cdot^{2}

. For example, from above,

A^{*} = * \cdot_{W \otimes V^{*}} A = * : A

.
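A type-checked version of the natural pairing of tensor products can be sketched in a few lines. The class `Tensor2` and the helpers `dual_name` and `pair` are hypothetical names introduced here. Coefficients are taken with respect to fixed bases and their dual bases, and the subscript of the pairing is passed explicitly as the argument `over`.

```python
from dataclasses import dataclass


def dual_name(space: str) -> str:
    """Name of the dual space, using reflexivity: V <-> V*."""
    return space[:-1] if space.endswith("*") else space + "*"


@dataclass(frozen=True)
class Tensor2:
    """Coefficients of a 2-tensor together with the names of its two factors."""
    left: str
    right: str
    coeffs: tuple


def pair(s: Tensor2, t: Tensor2, over: str) -> Tensor2:
    """Natural pairing s ·_over t, contracting s's right factor with t's left factor."""
    if t.left != over or s.right != dual_name(over):
        raise TypeError(
            f"cannot pair {s.left} ⊗ {s.right} with {t.left} ⊗ {t.right} over {over}"
        )
    n = len(t.coeffs)
    coeffs = tuple(
        tuple(sum(s.coeffs[i][k] * t.coeffs[k][j] for k in range(n))
              for j in range(len(t.coeffs[0])))
        for i in range(len(s.coeffs))
    )
    return Tensor2(s.left, t.right, coeffs)


# Simple tensors u ⊗ α in U ⊗ V* and v ⊗ w in V ⊗ W, with
# u = (1, 2), α = (3, 4), v = (5, 6), w = (7, 8), so that α · v = 39.
S = Tensor2("U", "V*", ((3, 4), (6, 8)))
T = Tensor2("V", "W", ((35, 40), (42, 48)))

# (u ⊗ α) ·_V (v ⊗ w) = (α · v) u ⊗ w, a tensor of type U ⊗ W.
ST = pair(S, T, "V")

# Pairing over the wrong space is a type error.
try:
    pair(S, T, "W")
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True
```

The coefficient arithmetic is ordinary matrix multiplication. The subscript contributes only the check that the paired factors are dual to one another.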

Given a permutation

σ \in S_{n}

, define a right-action by

σ : V_{1} \otimes \dots \otimes V_{n} \to V_{σ^{- 1} (1)} \otimes \dots \otimes V_{σ^{- 1} (n)}

, mapping elements in the obvious way. For example,

(2 3 4)

acting on

v_{1} \otimes v_{2} \otimes v_{3} \otimes v_{4}

puts the second factor in the third position, the third factor in the fourth position, and the fourth factor in the second, giving

v_{1} \otimes v_{4} \otimes v_{2} \otimes v_{3}

. This permutation is itself a linear map and of course can be written as a tensor. However, because it is defined in terms of a right action, the “domain factors” will come on the left. Thus

σ

is written as a tensor of the form

V_{1}^{*} \otimes \dots \otimes V_{n}^{*} \otimes V_{σ^{- 1} (1)} \otimes \dots \otimes V_{σ^{- 1} (n)}

(i.e., as a

2 n

-tensor). Certain tensor constructions are conducive to using such permutations. In the above example, * can be written as

(1 2) \in W^{*} \otimes V \otimes V^{*} \otimes W

.

The permutation right-action also works naturally when notated using superscripts. For example, if

B \in U \otimes V \otimes W

, then

B^{(1 2)} : = B \cdot_{U^{*} \otimes V^{*} \otimes W^{*}} (1 2) \in V \otimes U \otimes W

and so

\begin{matrix} {(B^{(1 2)})}^{(2 3)} & = & (B \cdot_{U^{*} \otimes V^{*} \otimes W^{*}} (1 2)) \cdot_{V^{*} \otimes U^{*} \otimes W^{*}} (2 3) \\ & = & B \cdot_{U^{*} \otimes V^{*} \otimes W^{*}} ((1 2) \cdot_{V^{*} \otimes U^{*} \otimes W^{*}} (2 3)) \\ & = & B \cdot_{U^{*} \otimes V^{*} \otimes W^{*}} (1 2) (2 3) \\ & = & B \cdot_{U^{*} \otimes V^{*} \otimes W^{*}} (1 3 2) \in V \otimes W \otimes U . \end{matrix}

When multiplying the permutations

(1 2)

and

(2 3)

in the third line, it is important to note that they are read left-to-right, since they are acting on B on the right.
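The right-action convention can be checked mechanically on tuples of factor labels. The function `act` below is a hypothetical helper written for this illustration. It takes a cycle given as a tuple of 1-based positions and moves the factor in each listed position to the next position in the cycle.

```python
def act(factors, cycle):
    """Right action of a cycle on a tuple of tensor factors (1-based positions).

    The factor in position cycle[k] is moved to position cycle[k + 1],
    wrapping around at the end of the cycle.
    """
    result = list(factors)
    for k, source in enumerate(cycle):
        target = cycle[(k + 1) % len(cycle)]
        result[target - 1] = factors[source - 1]
    return tuple(result)


# (2 3 4) acting on v1 ⊗ v2 ⊗ v3 ⊗ v4.
four = act(("v1", "v2", "v3", "v4"), (2, 3, 4))

# Acting by (1 2) and then by (2 3), reading left to right ...
two_steps = act(act(("U", "V", "W"), (1, 2)), (2, 3))

# ... agrees with acting by the single cycle (1 3 2).
one_step = act(("U", "V", "W"), (1, 3, 2))
```

Here the labels stand either for the factors of a simple tensor or for the factor spaces of a type, since the action permutes both in the same way.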

The inline cycle notation is somewhat ambiguous in isolation because the number of factors in the domain/codomain is not specified, let alone their types. This information can sometimes be inferred from context, such as from the natural pairing subscripts, as in the following examples.

Example 1

(Linearizing the inversion map). Let

i : G L (V) \to G L (V), A \mapsto A^{- 1}

, i.e., the linear map inversion operator, where

G L (V)

is an open submanifold of

V \otimes V^{*}

via the isomorphism

V \otimes V^{*} ≅ Hom (V, V)

. Its linearization (derivative)

D i : G L (V) \to V \otimes V^{*} \otimes {(V \otimes V^{*})}^{*} ≅ V \otimes V^{*} \otimes V^{*} \otimes V

at

A \in G L (V)

in the direction

B \in T_{A} (G L (V)) ≅ V \otimes V^{*}

is

\begin{matrix} D i (A) \cdot_{V \otimes V^{*}} B & = & D i \cdot_{V \otimes V^{*}} δ (A + ϵ B) \\ & = & δ (i (A + ϵ B)) \\ & = & δ ({(A + ϵ B)}^{- 1}) \\ & = & δ ({((1 + ϵ B A^{- 1}) A)}^{- 1}) \\ & = & δ (A^{- 1} {(1 + ϵ B A^{- 1})}^{- 1}) \\ & = & δ (A^{- 1} \sum_{n = 0}^{\infty} {(- ϵ B A^{- 1})}^{n}) \\ & = & δ (A^{- 1} - ϵ A^{- 1} B A^{- 1} + O (ϵ^{2})) \\ & = & - A^{- 1} \cdot_{V} B \cdot_{V} A^{- 1} . \end{matrix}

(

|- ϵ B A^{- 1}|

is taken arbitrarily small due to the derivative

δ : = \frac{d}{d ϵ} ∣_{ϵ = 0}

being evaluated in an arbitrarily small neighborhood of ϵ = 0).
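The resulting formula can be sanity-checked numerically. The sketch below assumes V is two-dimensional, represents A and B by 2×2 matrices with arbitrarily chosen entries, and compares a central difference approximation of δ((A + ϵB)^{-1}) with the closed form −A^{-1} B A^{-1}. The helper names are hypothetical.

```python
def matmul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def axpy(x, s, y):
    """Entrywise x + s * y for 2x2 matrices."""
    return tuple(tuple(x[i][j] + s * y[i][j] for j in range(2)) for i in range(2))


def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


A = ((2.0, 1.0), (1.0, 3.0))    # an invertible matrix (illustrative values)
B = ((0.5, -1.0), (2.0, 0.25))  # a direction (illustrative values)
eps = 1e-6
zero = ((0.0, 0.0), (0.0, 0.0))

# Central difference approximation of d/d(eps) at eps = 0 of (A + eps B)^{-1}.
plus = inverse(axpy(A, eps, B))
minus = inverse(axpy(A, -eps, B))
numeric = axpy(zero, 1.0 / (2.0 * eps), axpy(plus, -1.0, minus))

# Closed form from the derivation: -A^{-1} B A^{-1}.
A_inv = inverse(A)
exact = axpy(zero, -1.0, matmul(matmul(A_inv, B), A_inv))

max_error = max(abs(numeric[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

The two results should agree up to the truncation and rounding error of the finite difference, which is far below the size of the entries themselves.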

In order to “move” the B parameter out so that it plays the same syntactical role as in the original expression

D i (A) \cdot B

, via adjacent natural pairing, some simple tensor manipulations can be done. The process is easily and accurately expressed via diagram. The following sequence of diagrams is a sequence of equalities. The diagram should be self-explanatory, but for reference, the number of boxes for a particular label denotes the rank of the tensor, with each box labeled with its type. The lines connecting various boxes are natural pairings, and the circles represent the unpaired “slots”, which comprise the type of the resulting expression.

The following step is nothing but moving the boxes for B out; the natural pairings still apply to the same slots, hence the cables dangling below.

In this setting, a tensor product amounts to flippantly gluing boxes together.

In order for B to be naturally paired in the same adjacent manner as in the original expression

D i (A) \cdot B

, the slots of

- A^{- 1} \otimes A^{- 1}

must be permuted; the second moves to the third, the third to the fourth, and the fourth to the second.

The first diagram equals the last one, thus

D i (A) \cdot_{V \otimes V^{*}} B = - {(A^{- 1} \otimes A^{- 1})}^{(2 3 4)} \cdot_{V \otimes V^{*}} B

, and by the nondegeneracy of the natural pairing on

V \otimes V^{*}

, this implies that

D i (A) = - {(A^{- 1} \otimes A^{- 1})}^{(2 3 4)}

, noting that the statement of this expression does not require the direction vector B. The permutation exponent

(2 3 4)

can be calculated easily using simple tensors, if not by the above diagrammatic manipulations;

\begin{matrix} (a_{1} \otimes a_{2}) \cdot (b_{1} \otimes b_{2}) \cdot (a_{3} \otimes a_{4}) & = & (a_{1} \otimes a_{4} \otimes a_{2} \otimes a_{3}) : (b_{1} \otimes b_{2}) \\ & = & {(a_{1} \otimes a_{2} \otimes a_{3} \otimes a_{4})}^{(2 3 4)} : (b_{1} \otimes b_{2}) . \end{matrix}

Here, the expression

(a_{1} \otimes a_{2}) \cdot (b_{1} \otimes b_{2}) \cdot (a_{3} \otimes a_{4})

represents the expression

A^{- 1} \cdot B \cdot A^{- 1}

.

The next example will later be extended to the setting of Riemannian manifolds and their metric tensor fields, and put to use to formulate what are known as harmonic maps (see (3)). However, first, a new tensor operation must be defined.

Definition 2

(Parallel tensor product). If

U, V, W, X

are vector spaces and

A \in U \otimes V

and

B \in W \otimes X

, then define their parallel tensor product

A ⊠ B

by

A ⊠ B : = {(A \otimes B)}^{(2 3)} \in (U \otimes W) \otimes (V \otimes X) .

The parentheses in the type specification are unnecessary, but hint at what the tensor decomposition for the quantity

A ⊠ B

should be, if used as an operand to ⊠ again (see below).

If A and B represent linear maps, then

A ⊠ B \in (U \otimes W) \otimes (V \otimes X)

represents their tensor product as linear maps (the parentheses are unnecessary but hint at what the domain and codomain are, and for use of

A ⊠ B

as an operand in another parallel tensor product), which is a “parallel” composition; if

α \in V^{*}

and

β \in X^{*}

, then

(A ⊠ B) \cdot_{V^{*} \otimes X^{*}} (α \otimes β) = (A \cdot_{V^{*}} α) \otimes (B \cdot_{X^{*}} β)

.
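The “parallel” composition property can be verified on coefficient arrays. The sketch below assumes fixed bases, stores A and B as nested lists with illustrative integer entries, and uses hypothetical helper names. The coefficients of A ⊠ B are indexed in the order (U, W, V, X), which is the order obtained from A ⊗ B by exchanging the second and third factors.

```python
def parallel(A, B):
    """Coefficients of A ⊠ B = (A ⊗ B)^{(2 3)}, indexed as [u][w][v][x]."""
    return [
        [
            [[A[u][v] * B[w][x] for x in range(len(B[0]))] for v in range(len(A[0]))]
            for w in range(len(B))
        ]
        for u in range(len(A))
    ]


def pair_with_simple(T, alpha, beta):
    """Pair the last two factors of T with the simple tensor alpha ⊗ beta."""
    return [
        [
            sum(
                T[u][w][v][x] * alpha[v] * beta[x]
                for v in range(len(alpha))
                for x in range(len(beta))
            )
            for w in range(len(T[0]))
        ]
        for u in range(len(T))
    ]


def apply(A, alpha):
    """Pair the second factor of A with alpha."""
    return [sum(A[u][v] * alpha[v] for v in range(len(alpha))) for u in range(len(A))]


A = [[1, 2], [3, 4], [5, 6]]  # type U ⊗ V with dim U = 3, dim V = 2
B = [[1, 0, 2], [0, 1, -1]]   # type W ⊗ X with dim W = 2, dim X = 3
alpha = [1, -1]               # an element of V*
beta = [2, 1, 0]              # an element of X*

# Left-hand side: (A ⊠ B) paired with alpha ⊗ beta.
lhs = pair_with_simple(parallel(A, B), alpha, beta)

# Right-hand side: (A · alpha) ⊗ (B · beta).
A_alpha = apply(A, alpha)
B_beta = apply(B, beta)
rhs = [[a * b for b in B_beta] for a in A_alpha]
```

Both sides are arrays of type U ⊗ W, and they agree entry by entry.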

There is a slight ambiguity in the notation coming from a lack of specification on how the tensor product of the operands is decomposed in the case when there is more than one such decomposition. Notation explicitly resolving this ambiguity will not be needed in this paper as the relevant tensor product is usually clear from context.

The parallel tensor product is associative; if Y and Z are also vector spaces and

C \in Y \otimes Z

, then

(A ⊠ B) ⊠ C = A ⊠ (B ⊠ C) \in (U \otimes W \otimes Y) \otimes (V \otimes X \otimes Z),

allowing multiply-parallel tensor products.

Example 2

(Tensor product of inner product spaces). If

(V, g)

and

(W, h)

are inner product spaces (noting that

g \in V^{*} \otimes V^{*}

and

h \in W^{*} \otimes W^{*}

are symmetric, i.e., literally invariant under

(1 2)

), then

W \otimes V^{*}

is an inner product space having induced inner product

k (A, B) : = {tr}_{V} (g^{- 1} \cdot_{V^{*}} A^{*} \cdot_{W^{*}} h \cdot_{W} B)

. Here, the “inputs” of A and B (the

V^{*}

factors) are being paired using

g^{- 1} \in V \otimes V

, while the “outputs” (the W factors) are being paired using

h \in W^{*} \otimes W^{*}

, and the trace is used to “complete the cycle” by plugging the output into the input, thereby producing a real number. The expression

k (A, B)

can be written in a more natural way, which takes advantage of the linear composition, as

A : k : B

(or, pedantically,

A \cdot_{W^{*} \otimes V} k \cdot_{W \otimes V^{*}} B

), instead of the more common but awkward trace expression mentioned earlier. In the tensor formalism, the inner product k should have type

W^{*} \otimes V \otimes W^{*} \otimes V

. Permuting the middle two components of the 4-tensor

h \otimes g^{- 1} \in W^{*} \otimes W^{*} \otimes V \otimes V

gives the correct type. In fact,

k = h ⊠ g^{- 1}

. A further advantage to this formulation is that if any or all of

A, k, B

are functions, there is a clear product rule for derivatives of the expression

A : k : B

. This is something that is used critically in Riemannian geometry in the form of covariant derivatives of tensor fields (see (4)).
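For comparison with abstract index notation, the induced inner product and its product rule can be written out as follows. The component indices (taken with respect to chosen bases of V and W) are an assumption made here for illustration, and δ denotes differentiation with respect to a parameter, as in Example 1.

```latex
\begin{aligned}
k (A, B) &= A : k : B
          = A^{i}_{a} \, h_{i j} \, {(g^{- 1})}^{a b} \, B^{j}_{b} , \\
δ (A : k : B) &= (δ A) : k : B + A : (δ k) : B + A : k : (δ B) .
\end{aligned}
```

The first line uses the symmetry of g^{-1} to match the trace expression term by term, and the second line is the ordinary product rule for an expression that is linear in each of A, k and B.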

In this paper, the main use of the tensor formulation of linear maps is twofold: to facilitate linear algebraic constructions which would otherwise be difficult or awkward (this includes the ability to express derivatives of [possibly vector or manifold-valued] maps without needing to “plug in” the derivative’s directional argument), and to make clear the product-rule behavior of many important differentiable constructions.

3.4. Bundle Constructions

In order to use the calculus of variations involving Lagrangians depending on tangent maps of maps between smooth manifolds, it suffices to consider Lagrangians defined on smooth vector bundle morphisms. Continuing in the style of the previous section, a “full” tensor product of smooth vector bundles (4) will be formulated which will then allow expression of smooth vector bundle morphisms as tensor fields, sometimes called two-point tensor fields ([11], p. 70). The full arsenal of tensor calculus can then be used to considerable advantage.

First, some definitions and simpler bundle constructions will be introduced. A smooth [fiber] bundle (hereafter referred to simply as a smooth bundle) is a 4-tuple

(E, E, π, N)

where

E

, E and N are smooth manifolds and

π : E \to N

is locally trivial, i.e., N is covered by open sets

\{U_{α}\}

such that

π^{- 1} (U_{α}) ≅ U_{α} \times E

as smooth manifolds. The manifolds

E

, E and N are called the typical fiber, the total space, and the base space respectively. The map

π

is called the bundle projection. The full 4-tuple specifying a bundle can be recovered from the bundle projection map, so a locally trivial smooth map can be said to define a smooth bundle. The dimension of the typical fiber of a bundle will be called its rank, and will be denoted by

rank π

or

rank E

when the bundle is understood from context.

The space of smooth sections of a smooth bundle defined by

π : E \to N

is

Γ (π) : = \{σ \in C^{\infty} (N, E) ∣ π \circ σ = {Id}_{N}\},

and may also be denoted by

Γ (E)

, if the bundle is clear from context. If nonempty,

Γ (π)

is generally an infinite-dimensional manifold (the exception being when the base space N is finite) [12].

Proposition 1

(Trivial bundle). Let M and N be smooth manifolds. With

M ⥇ N : = M \times N

and

π^{M ⥇ N} : = {pr}_{2}^{M \times N} : M ⥇ N \to N

, the map

π^{M ⥇ N}

defines a smooth bundle

(M, M ⥇ N, π^{M ⥇ N}, N)

, called a trivial bundle. Similarly, with

M ⬾ N : = M \times N

and

π^{M ⬾ N} : = {pr}_{1}^{M \times N} : M ⬾ N \to M

,

(N, M ⬾ N, π^{M ⬾ N}, M)

is a trivial bundle.

No proof is deemed necessary for (1), as each bundle projection trivializes globally in the obvious way. The symbols

⥇

and

⬾

are each a composite of × (indicating direct product) and an arrow, → or ←, pointing toward the base space.

If M and N are smooth manifolds as in (1), then there are two particularly useful natural identifications.

\begin{matrix} C^{\infty} (M, N) & ≅ Γ (M ⬾ N) & C^{\infty} (M, N) & ≅ Γ (N ⥇ M) \\ ϕ & \mapsto {Id}_{M} \times_{M} ϕ & ϕ & \mapsto ϕ \times_{M} {Id}_{M} \\ {pr}_{2}^{M \times N} \circ Φ & ↤ Φ & {pr}_{1}^{N \times M} \circ Φ & ↤ Φ \end{matrix}

These identifications can be thought of as identifying a map

ϕ \in C^{\infty} (M, N)

with its graph in

M \times N

and

N \times M

respectively. Furthermore, this allows bundle theory to be applied to reasoning about spaces of maps. The symbols

M ⬾ N

and

N ⥇ M

now carry a significant amount of meaning. Generally

N ⥇ M

will be used in this paper, for consistency with the

Hom (V, W) ≅ W \otimes V^{*}

convention discussed in Section 3.3. The symbols

⥇

and

⬾

are examples of telescoping notation, as they are built notationally on ×, and conceptually on the direct product, which is what is denoted by ×. The arrow portion of the symbols can be discarded when type-specificity is not needed.
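As a concrete illustration of the graph identification, consider the following sketch. The choice M = N = ℝ and ϕ(x) = x² is an assumption made only for this example.

```latex
% Assumed example: M = N = \mathbb{R}, \phi(x) = x^2.
\begin{aligned}
\phi &\in C^{\infty}(\mathbb{R}, \mathbb{R}), \qquad \phi(x) = x^{2}, \\
\phi \times_{M} \mathrm{Id}_{M} &\in \Gamma(N ⥇ M), \qquad x \mapsto (x^{2}, x) \in N \times M, \\
\pi^{N ⥇ M}(x^{2}, x) &= \mathrm{pr}_{2}^{N \times M}(x^{2}, x) = x
  \quad \text{(so this is indeed a section)}, \\
\mathrm{pr}_{1}^{N \times M}(x^{2}, x) &= x^{2} = \phi(x)
  \quad \text{(recovering } \phi \text{ from the section)}.
\end{aligned}
```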

Proposition 2

(Direct product bundle). Let

(E, E, π^{E}, M)

and

(F, F, π^{F}, N)

be smooth bundles. Then

π^{E} \times π^{F} : E \times F \to M \times N, (e, f) \mapsto (π^{E} (e), π^{F} (f))

defines a smooth bundle

(E \times F, E \times F, π^{E} \times π^{F}, M \times N)

. This bundle is called the direct product of $π^{E}$ and $π^{F}$ , and is not necessarily a trivial bundle.

Proof.

Let

Ψ^{E} : {(π^{E})}^{- 1} (U) \to U \times E

and

Ψ^{F} : {(π^{F})}^{- 1} (V) \to V \times F

trivialize

π^{E}

and

π^{F}

over open sets

U \subseteq M

and

V \subseteq N

respectively. Then

Ψ^{E} \times Ψ^{F} : {(π^{E})}^{- 1} (U) \times {(π^{F})}^{- 1} (V) \to U \times E \times V \times F

has inverse

{(Ψ^{E})}^{- 1} \times {(Ψ^{F})}^{- 1}

. Note that

\begin{matrix} {(π^{E})}^{- 1} (U) \times {(π^{F})}^{- 1} (V) & = \{(e, f) \in E \times F ∣ π^{E} (e) \in U, π^{F} (f) \in V\} \\ & = \{(e, f) \in E \times F ∣ (π^{E} \times π^{F}) (e, f) \in U \times V\} \\ & = {(π^{E} \times π^{F})}^{- 1} (U \times V), \end{matrix}

and that

P : (U \times E) \times (V \times F) \to (U \times V) \times (E \times F), ((u, e), (v, f)) \mapsto ((u, v), (e, f))

defines a diffeomorphism. Then

Ψ^{E \times F} : = P \circ (Ψ^{E} \times Ψ^{F}) : {(π^{E} \times π^{F})}^{- 1} (U \times V) \to (U \times V) \times (E \times F)

defines a diffeomorphism, and

\begin{matrix} {pr}_{1}^{(U \times V) \times (E \times F)} \circ Ψ^{E \times F} (e, f) \\ = & {pr}_{1}^{(U \times V) \times (E \times F)} \circ P \circ (Ψ^{E} \times Ψ^{F}) (e, f) \\ = & {pr}_{1}^{(U \times V) \times (E \times F)} \circ P (Ψ^{E} (e), Ψ^{F} (f)) \\ = & {pr}_{1}^{(U \times V) \times (E \times F)} \circ P (({pr}_{1}^{U \times E} \circ Ψ^{E} (e), {pr}_{2}^{U \times E} \circ Ψ^{E} (e)), ({pr}_{1}^{V \times F} \circ Ψ^{F} (f), {pr}_{2}^{V \times F} \circ Ψ^{F} (f))) \\ = & {pr}_{1}^{(U \times V) \times (E \times F)} (({pr}_{1}^{U \times E} \circ Ψ^{E} (e), {pr}_{1}^{V \times F} \circ Ψ^{F} (f)), ({pr}_{2}^{U \times E} \circ Ψ^{E} (e), {pr}_{2}^{V \times F} \circ Ψ^{F} (f))) \\ = & ({pr}_{1}^{U \times E} \circ Ψ^{E} (e), {pr}_{1}^{V \times F} \circ Ψ^{F} (f)) \\ = & (π^{E} (e), π^{F} (f)) \\ = & (π^{E} \times π^{F}) (e, f), \end{matrix}

showing that

Ψ^{E \times F}

trivializes

π^{E} \times π^{F}

over

U \times V \subseteq M \times N

. Since

M \times N

can be covered by such trivializing sets, this establishes that

π^{E} \times π^{F}

defines a smooth bundle. The typical fiber of

π^{E} \times π^{F}

is

E \times F

. □

A smooth vector bundle is a fiber bundle whose typical fiber is a vector space and whose local trivializations are linear isomorphisms when restricted to each fiber. If

(E, E, π, M)

is a smooth vector bundle, then its dual vector bundle

(E^{*}, E^{*}, π^{*}, M)

is a smooth vector bundle defined in the following way.

E^{*} : = \underset{p \in M}{∐} {(E_{p})}^{*}, π^{*} : E^{*} \to M, η_{p} \mapsto p .

Because

E

is a vector space, the notation

E^{*}

is already defined. In analogy with Section 3.3, there are natural pairings on a vector bundle and its dual, defined simply by evaluation. If

p \in M

,

η \in E_{p}^{*}

and

e \in E_{p}

, then

η \cdot_{E} e : = η \cdot_{E_{p}} e

and

e \cdot_{E} η : = e \cdot_{E_{p}} η

. Both expressions evaluate to

η (e)

. Natural traces and n-fold tensor contraction can be defined analogously. Again, while seemingly pedantic, the subscripted natural pairing notation will prove to be a valuable tool in articulating and error-checking calculations involving vector bundles. Generalizing the rest of Section 3.3 will require the definition of additional structures.

For the remainder of this section, let

(E, E, π^{E}, M)

and

(F, F, π^{F}, N)

now be smooth vector bundles. The following construction is essentially an alternate notation for

π^{E} \times π^{F} : E \times F \to M \times N

, but is one that takes advantage of the fact that

π^{E}

and

π^{F}

are vector bundles, and encodes in the notation the fact that the resulting construction is also a vector bundle. This is analogous to how

V \times W

is a vector space with a natural structure if V and W are vector spaces, except that this is usually denoted by

V \oplus W

.

Proposition 3

(“Full” direct sum vector bundle). If

E \oplus_{M \times N} F : = E \times F,

then

π^{E} \oplus_{M \times N} π^{F} : = π^{E} \times π^{F} : E \oplus_{M \times N} F \to M \times N

defines a smooth vector bundle

(E \oplus F, E \oplus_{M \times N} F, π^{E} \oplus_{M \times N} π^{F}, M \times N)

, called the full direct sum of $π^{E}$ and $π^{F}$ .

For each

(p, q) \in M \times N

, the vector space structure on

{(π^{E} \oplus_{M \times N} π^{F})}^{- 1} (p, q)

is given in the following way. Let

α \in R

and

(e_{1}, f_{1}), (e_{2}, f_{2}) \in {(π^{E} \oplus_{M \times N} π^{F})}^{- 1} (p, q)

. Then

α (e_{1}, f_{1}) + (e_{2}, f_{2}) = (α e_{1} + e_{2}, α f_{1} + f_{2}) .

It is critical to see (1) for remarks on notation.

Proof.

Let U, V,

E

,

F

, P,

Ψ^{E}

,

Ψ^{F}

and

Ψ^{E \times F}

be as in the proof of (2), and define

Ψ^{E \oplus_{M \times N} F} : = Ψ^{E \times F}

. Note that

Ψ^{E \oplus_{M \times N} F}

is a smooth bundle isomorphism over

{Id}_{U \times V}

, so to show that

Ψ^{E \oplus_{M \times N} F}

is a linear isomorphism in each fiber, it suffices to show that it is linear in each fiber. Let

α \in R

,

(p, q) \in U \times V

and

(e_{1}, f_{1}), (e_{2}, f_{2}) \in {(π^{E} \oplus_{M \times N} π^{F})}^{- 1} (p, q)

. Then

\begin{matrix} Ψ^{E \oplus_{M \times N} F} (α e_{1} + e_{2}, α f_{1} + f_{2}) = & P \circ (Ψ^{E} \times Ψ^{F}) (α e_{1} + e_{2}, α f_{1} + f_{2}) \\ = & P (Ψ^{E} (α e_{1} + e_{2}), Ψ^{F} (α f_{1} + f_{2})) \\ = & P (α Ψ^{E} (e_{1}) + Ψ^{E} (e_{2}), α Ψ^{F} (f_{1}) + Ψ^{F} (f_{2})) \\ (by trivial vector bundle structures on U \times E and V \times F) \\ = & α P (Ψ^{E} (e_{1}), Ψ^{F} (f_{1})) + P (Ψ^{E} (e_{2}), Ψ^{F} (f_{2})) \\ (by trivial vector bundle structure on (U \times V) \times (E \times F)) \\ = & α P \circ (Ψ^{E} \times Ψ^{F}) (e_{1}, f_{1}) + P \circ (Ψ^{E} \times Ψ^{F}) (e_{2}, f_{2}) \\ = & α Ψ^{E \oplus_{M \times N} F} (e_{1}, f_{1}) + Ψ^{E \oplus_{M \times N} F} (e_{2}, f_{2}) . \end{matrix}

Thus

Ψ^{E \oplus_{M \times N} F}

is linear in each fiber, and because it is invertible, it is a linear isomorphism in each fiber. In particular,

Ψ^{E \oplus_{M \times N} F}

is a smooth vector bundle isomorphism over

{Id}_{U \times V}

. Applying

{(Ψ^{E \oplus_{M \times N} F})}^{- 1}

to the above equation gives

(α e_{1} + e_{2}, α f_{1} + f_{2}) = α (e_{1}, f_{1}) + (e_{2}, f_{2}),

as desired. □

This construction differs from the Whitney sum of two vector bundles, as the base spaces of the bundles are kept separate, and are not even required to be the same. This allows the identification of

T (M \times N) \to M \times N

as

T M \oplus_{M \times N} T N \to M \times N

, which may be done without comment later in this paper. Some important related structures are

{pr}_{1}^{*} π_{M}^{T M} : {pr}_{1}^{*} T M \to M \times N

and

{pr}_{2}^{*} π_{N}^{T N} : {pr}_{2}^{*} T N \to M \times N

, where

{pr}_{i} : = {pr}_{i}^{M \times N}

.
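The identification above can be illustrated by comparing fibers and ranks. In this sketch the choice M = S¹, N = ℝ² is an assumption made only for the example.

```latex
% Assumed example: M = S^1, N = \mathbb{R}^2, so M \times N is a 3-manifold.
\begin{aligned}
(TM \oplus_{M \times N} TN)_{(p,q)} &= T_{p} M \oplus T_{q} N \cong T_{(p,q)}(M \times N), \\
\operatorname{rank}(TM \oplus_{M \times N} TN) &= \operatorname{rank} TM + \operatorname{rank} TN
  = 1 + 2 = 3 = \dim (M \times N), \\
(\mathrm{pr}_{1}^{*} TM)_{(p,q)} &\cong T_{p} M, \qquad
(\mathrm{pr}_{2}^{*} TN)_{(p,q)} \cong T_{q} N .
\end{aligned}
```

A Whitney sum TM ⊕ TN would not be defined here, because the two bundles have different base spaces.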

The next construction is what will be used in the implementation of smooth vector bundle morphisms as tensor fields.

Proposition 4

(“Full” tensor product bundle). If

E \otimes_{M \times N} F : = \underset{(p, q) \in M \times N}{∐} E_{p} \otimes F_{q} \quad \text{(disjoint union)},

then

π^{E} \otimes_{M \times N} π^{F} : E \otimes_{M \times N} F \to M \times N, α^{i j} e_{i} \otimes f_{j} \mapsto (π^{E} (e_{1}), π^{F} (f_{1})) \quad \text{(here, } α^{i j} \in R \text{)}

defines a smooth vector bundle

(E \otimes F, E \otimes_{M \times N} F, π^{E} \otimes_{M \times N} π^{F}, M \times N)

, called the full tensor product of $π^{E}$ and $π^{F}$ . (This construction is alluded to in ([13], p. 121), but is not defined or discussed there.)

It is critical to see (1) for remarks on notation.

Proof.

Since the argument

α^{i j} e_{i} \otimes f_{j}

in the definition of

π^{E} \otimes_{M \times N} π^{F}

is not necessarily unique, the well-definedness of

π^{E} \otimes_{M \times N} π^{F}

must be shown. Let

α^{i j} e_{i}^{1} \otimes f_{j}^{1} = β^{i j} e_{i}^{2} \otimes f_{j}^{2}

. Then in particular,

α^{i j} e_{i}^{1} \otimes f_{j}^{1}, β^{i j} e_{i}^{2} \otimes f_{j}^{2} \in E_{p} \otimes F_{q}

for some

(p, q) \in M \times N

, and therefore

e_{i}^{1}, e_{i}^{2} \in E_{p}

and

f_{j}^{1}, f_{j}^{2} \in F_{q}

for each index i and j. Thus

π^{E} (e_{1}^{1}) = p = π^{E} (e_{1}^{2})

and

π^{F} (f_{1}^{1}) = q = π^{F} (f_{1}^{2})

, so the expression defining

π^{E} \otimes_{M \times N} π^{F}

is well-defined.

The set

E \otimes_{M \times N} F

does not have an a priori global smooth manifold structure, as it is defined as the disjoint union of vector spaces. A smooth manifold structure compatible with that of the constituent vector spaces will now be defined.

Let

Ψ^{E} : {(π^{E})}^{- 1} (U) \to U \times E

and

Ψ^{F} : {(π^{F})}^{- 1} (V) \to V \times F

trivialize

π^{E}

and

π^{F}

over open sets

U \subseteq M

and

V \subseteq N

respectively, such that

Ψ^{E}

and

Ψ^{F}

are each linear in each fiber. Define

\begin{matrix} Ψ^{E \otimes_{M \times N} F} : {(π^{E} \otimes_{M \times N} π^{F})}^{- 1} (U \times V) & \to & (U \times V) \times (E \otimes F), \end{matrix}

with element mapping

\begin{matrix} X & \mapsto & ((π^{E} \otimes_{M \times N} π^{F}) (X), (({pr}_{2}^{U \times E} \circ Ψ^{E}) \otimes ({pr}_{2}^{V \times F} \circ Ψ^{F})) (X)) . \end{matrix}

The map

Ψ^{E \otimes_{M \times N} F}

is well-defined and smooth in each fiber by construction, since for each

(p, q) \in U \times V

,

({pr}_{2}^{U \times E} \circ Ψ^{E}) \otimes ({pr}_{2}^{V \times F} \circ Ψ^{F}) ∣_{E_{p} \otimes F_{q}} : E_{p} \otimes F_{q} \to E \otimes F

is a linear isomorphism by construction. Additionally,

Ψ^{E \otimes_{M \times N} F}

has been constructed so that

{pr}_{1}^{(U \times V) \times (E \otimes F)} \circ Ψ^{E \otimes_{M \times N} F} = π^{E} \otimes_{M \times N} π^{F}

on

{(π^{E} \otimes_{M \times N} π^{F})}^{- 1} (U \times V)

. Define the smooth structure on

{(π^{E} \otimes_{M \times N} π^{F})}^{- 1} (U \times V) \subseteq E \otimes_{M \times N} F

by declaring

Ψ^{E \otimes_{M \times N} F}

to be a diffeomorphism. The map

π^{E} \otimes_{M \times N} π^{F}

is trivialized over

U \times V

. The set

E \otimes_{M \times N} F

can be covered by such trivializing open sets. Thus

E \otimes_{M \times N} F

has been shown to be locally diffeomorphic to the direct product of smooth manifolds, and therefore it has been shown to be a smooth manifold. With respect to the smooth structure on

E \otimes_{M \times N} F

, the map

π^{E} \otimes_{M \times N} π^{F}

is smooth, and has therefore been shown to define a smooth vector bundle. □

Remark 1

(Notation regarding base space). The “full” direct sum (3) and “full” tensor product (4) bundle constructions allow direct sums and tensor products to be taken of vector bundles when the base spaces differ. If the base spaces are the same, then the construction “joins” them, producing a vector bundle over that shared base space. For example, if E and F are vector bundles over M, then

E \otimes_{M \times M} F

has base space

M \times M

, while

E \otimes F

has base space M. The base space can be specified in either case as a notational aid; the latter example would be written as

E \otimes_{M} F

. If no subscript is provided on the ⊗ symbol, then the base spaces are “joined” if possible (if they are the same space), otherwise they are kept separate, as in the “full” tensor product construction. This notational convention conforms to the standard Whitney sum and tensor product bundle notation, and uses the notion of telescoping notation to provide more specificity when necessary.
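The difference between the two subscripts can be made explicit by counting dimensions. In this sketch the symbols n, a and b are assumptions introduced only for the example.

```latex
% Assumed data: E, F vector bundles over M with \dim M = n, rank E = a, rank F = b.
\begin{aligned}
(E \otimes_{M \times M} F)_{(p,q)} &= E_{p} \otimes F_{q}, &
  \dim (E \otimes_{M \times M} F) &= 2n + ab, \\
(E \otimes_{M} F)_{p} &= E_{p} \otimes F_{p}, &
  \dim (E \otimes_{M} F) &= n + ab, \\
(E \otimes_{M} F)_{p} &= (E \otimes_{M \times M} F)_{(p,p)} . &&
\end{aligned}
```

In other words, the "joined" bundle consists of the fibers of the "full" bundle that lie over the diagonal of M × M.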

Given a fiber bundle, a natural vector bundle can be constructed “on top” of it, essentially quantifying the variations of bundle elements along each fiber. This is known as the vertical [tangent] bundle ([12], p. 43), and it plays a critical role in the development of Ehresmann connections, which provide the “horizontal complement” to the vertical bundle.

Proposition 5

(Vertical bundle). Let

π^{E} : E \to M

define a smooth [fiber] bundle. If

V E : = ker T π^{E} \leq T E

, then

π^{V E} : = π_{E}^{T E} ∣_{V E} : V E \to E

defines a smooth vector subbundle of

π_{E}^{T E} : T E \to E

, called the vertical bundle over E. Furthermore, the fiber over

e \in E

is

V_{e} E = T_{e} E_{π^{E} (e)} \leq T_{e} E

.

Proof.

Because

π^{E}

is a smooth surjective submersion,

V E \to E

is a subbundle of

T E \to E

having corank

dim M

and therefore rank equal to that of E. Furthermore, if

e \in E

and

ϵ \mapsto e_{ϵ} \in E_{π^{E} (e)}

, then

δ e_{ϵ}

represents an arbitrary element of

T_{e} E_{π^{E} (e)}

, and

T π^{E} (δ e_{ϵ}) = δ (π^{E} (e_{ϵ})) = δ (π^{E} (e)) = 0

, showing that

δ e_{ϵ} \in ker T π^{E}

, and therefore that

δ e_{ϵ} \in V_{e} E

. This shows that

T_{e} E_{π^{E} (e)} \subseteq V_{e} E

. Because

dim T_{e} E_{π^{E} (e)} = rank E

, this shows that

T_{e} E_{π^{E} (e)} = V_{e} E

. □

Given the extra structure that a vector bundle provides over a [fiber] bundle, there is a canonical smooth vector bundle isomorphism which adds significant value to the pullback bundle formalism used throughout this paper. This isomorphism is put to greatest use in Section 4, for example in the development of the first variation (see (1)).

Proposition 6

(Vertical bundle as pullback). If

π : E \to M

defines a smooth vector bundle, then

\begin{matrix} ι_{V E}^{π^{*} E} : π^{*} E & \to & V E, \\ (x, y) & \mapsto & δ_{ϵ} (x + ϵ y) \end{matrix}

is a smooth vector bundle isomorphism over

{Id}_{E}

, called the vertical lift, having inverse

\begin{matrix} ι_{π^{*} E}^{V E} : δ e_{ϵ} & \mapsto & (e_{0}, lim_{ϵ \to 0} \frac{e_{ϵ} - e_{0}}{ϵ}), \end{matrix}

where, without loss of generality,

e_{ϵ}

is an E-valued variation which lies entirely in a single fiber.

Proof.

It is clear that

ι_{V E}^{π^{*} E}

is linear and injective on each fiber. By a dimension counting argument, it is therefore an isomorphism on each fiber. Because it preserves the basepoint, it is a vector bundle isomorphism over

{Id}_{E}

. Because the map

(x, y, ϵ) \mapsto x + ϵ y

is smooth, so is the defining expression for

ι_{V E}^{π^{*} E}

, thereby establishing smoothness. That

ι_{π^{*} E}^{V E}

inverts

ι_{V E}^{π^{*} E}

is a trivial calculation. □
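For a trivial vector bundle the vertical lift can be written out explicitly. The following is a sketch under assumptions: E = M × ℝ^k is chosen only for the example, and δ_ϵ is read as the derivative with respect to ϵ at ϵ = 0.

```latex
% Assumed example: E = M \times \mathbb{R}^k with \pi(p, v) = p,
% so \pi^{*} E = \{ (x, y) \in E \times E \mid \pi(x) = \pi(y) \}.
\begin{aligned}
x &= (p, v), \qquad y = (p, w), \qquad x + \epsilon y = (p, v + \epsilon w), \\
\iota_{V E}^{\pi^{*} E}(x, y)
  &= \left. \tfrac{d}{d\epsilon} \right|_{\epsilon = 0} (p, v + \epsilon w)
   = (0_{p}, w) \in T_{p} M \oplus T_{v} \mathbb{R}^{k} \cong T_{(p,v)} E, \\
T\pi (0_{p}, w) &= 0_{p} \quad \text{(the lift is vertical)}, \\
\iota_{\pi^{*} E}^{V E} \big( \delta_{\epsilon} (p, v + \epsilon w) \big)
  &= \Big( (p, v), \lim_{\epsilon \to 0} \tfrac{(p, v + \epsilon w) - (p, v)}{\epsilon} \Big)
   = \big( (p, v), (p, w) \big) = (x, y) .
\end{aligned}
```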

3.5. Strongly-Typed Tensor Field Operations

Because vector bundles and the related operations can be thought of conceptually as “sheaves of linear algebra”, the constructions in Section 3.3, generalized earlier in this section, can be further generalized to the setting of sections of vector bundles.

If

E, F, G

are smooth vector bundles over M, then define the natural pairing of a tensor field with a vector:

\begin{matrix} \cdot_{F} : Γ (E \otimes_{M} F^{*}) \times F & \to & E, \\ (e \otimes_{M} ϕ, f) & \mapsto & e (π^{F} (f)) [ϕ (π^{F} (f)) \cdot_{F} f], \end{matrix}

extending linearly to general tensor fields. Further, define the natural pairing of tensor fields:

\begin{matrix} \cdot_{F} : Γ (E \otimes_{M} F^{*}) \times Γ (F \otimes_{M} G) & \to & Γ (E \otimes_{M} G), \\ (e \otimes_{M} ϕ, f \otimes_{M} g) & \mapsto & (p \mapsto e (p) \otimes_{M} (ϕ (p) \cdot_{F_{p}} f (p)) \otimes_{M} g (p)) \\ = (p \mapsto (ϕ (p) \cdot_{F_{p}} f (p)) (e \otimes_{M} g) (p)), \end{matrix}

extending linearly to general tensor fields. This multiple use of the

\cdot_{F}

symbol is a concept known as operator overloading in computer programming. No ambiguity is caused by this overloading, as the particular use can be inferred from the types of the operands. As before, the subscript F may be optionally omitted when clear from context.
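In local frames, both overloads reduce to the familiar index contraction. This is a sketch only; the frames and component names below are assumptions introduced for illustration.

```latex
% Assumed local frames: (e_i) for E, (f_j) for F with dual coframe (f^j), (g_k) for G.
\begin{aligned}
A &= A^{i}{}_{j} \, e_{i} \otimes_{M} f^{j} \in \Gamma(E \otimes_{M} F^{*}), \qquad
B = B^{jk} \, f_{j} \otimes_{M} g_{k} \in \Gamma(F \otimes_{M} G), \\
A \cdot_{F} v &= A^{i}{}_{j}(p) \, v^{j} \, e_{i}(p) \in E_{p}
  \qquad \text{for } v = v^{j} f_{j}(p) \in F_{p}, \\
A \cdot_{F} B &= A^{i}{}_{j} B^{lk} \, (f^{j} \cdot_{F} f_{l}) \, e_{i} \otimes_{M} g_{k}
  = A^{i}{}_{j} B^{jk} \, e_{i} \otimes_{M} g_{k} \in \Gamma(E \otimes_{M} G) .
\end{aligned}
```

The operand types alone determine which overload applies: a section paired with a vector gives a vector, while a section paired with a section gives a section.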

The permutations defined in Section 3.3 are generalized as tensor fields. If

F_{1}, \dots, F_{n}

are smooth vector bundles over M, and

σ \in S_{n}

is a permutation, then

σ

can act on

F_{1} \otimes_{M} \dots \otimes_{M} F_{n}

by permuting its factors, and therefore can be identified with a tensor field

σ \in Γ (F_{1}^{*} \otimes_{M} \dots \otimes_{M} F_{n}^{*} \otimes_{M} F_{σ^{- 1} (1)} \otimes_{M} \dots \otimes_{M} F_{σ^{- 1} (n)})

defined by

(f_{1} \otimes_{M} \dots \otimes_{M} f_{n}) \cdot_{F_{1}^{*} \otimes_{M} \dots \otimes_{M} F_{n}^{*}} σ : = f_{σ^{- 1} (1)} \otimes_{M} \dots \otimes_{M} f_{σ^{- 1} (n)} .

An important feature of such permutation tensor fields is that they are parallel with respect to covariant derivatives on the factors

F_{1}, \dots, F_{n}

(see (2) for more on this).

3.6. Pullback Bundles

The pullback bundle, defined below, is a crucial building block for many important bundle constructions, as it enriches the type system dramatically, and allows the tensor formulation of linear algebra to be extended to the vector bundle setting. In particular, the abstract, global formulation of the space of smooth vector bundle morphisms over a map

ϕ : M \to N

is achieved quite cleanly using a pullback bundle. Furthermore, the use of pullback bundles and pullback covariant derivatives simplifies what would otherwise be local coordinate calculations, thereby giving more insight into the geometric structure of the problem.

For the duration of this section, let

(F, F, π, N)

be a smooth bundle having rank r.

Proposition 7

(Pullback bundle). Let M and N be smooth manifolds and let

ϕ : M \to N

be smooth. If

ϕ^{*} F : = \{(m, f) \in M \times F ∣ ϕ (m) = π (f)\},

and

π^{ϕ^{*} F} : = {pr}_{1}^{M \times F} ∣_{ϕ^{*} F} : ϕ^{*} F \to M, (m, f) \mapsto m,

then

(F, ϕ^{*} F, π^{ϕ^{*} F}, M)

defines a smooth bundle. In particular,

ϕ^{*} F

is a smooth manifold having dimension

dim M + rank π

. The bundle defined by

π^{ϕ^{*} F}

is called the pullback of $π$ by $ϕ$ .

Proof.

Recalling that

F

denotes the typical fiber of

π

, let

Ψ : π^{- 1} (U) \to U \times F

trivialize

π

over open set

U \subseteq N

. Define

Ψ_{ϕ} : ϕ^{*} (π^{- 1} (U)) \to ϕ^{- 1} (U) \times F, (m, f) \mapsto (m, {pr}_{2}^{U \times F} \circ Ψ (f))

and

Ψ_{ϕ}^{- 1} : ϕ^{- 1} (U) \times F \to ϕ^{*} (π^{- 1} (U)), (m, f) \mapsto (m, Ψ^{- 1} (ϕ (m), f)) .

Claim (1):

Ψ_{ϕ}

and

Ψ_{ϕ}^{- 1}

are smooth. Proof:

ϕ^{*} (π^{- 1} (U)) \subseteq ϕ^{- 1} (U) \times π^{- 1} (U)

, and

Ψ_{ϕ}

is clearly smooth as a map defined on the larger manifold. Therefore it restricts to a smooth map on

ϕ^{*} (π^{- 1} (U))

. An analogous argument shows that

Ψ_{ϕ}^{- 1}

is smooth. Claim (1) proved.

Claim (2):

Ψ_{ϕ}^{- 1}

inverts

Ψ_{ϕ}

. Proof: Let

(m, f) \in ϕ^{*} (π^{- 1} (U))

. Then

\begin{matrix} Ψ_{ϕ}^{- 1} \circ Ψ_{ϕ} (m, f) & = Ψ_{ϕ}^{- 1} (m, {pr}_{2}^{U \times F} \circ Ψ (f)) \\ = (m, Ψ^{- 1} (ϕ (m), {pr}_{2}^{U \times F} \circ Ψ (f))) \\ = (m, Ψ^{- 1} (π (f), {pr}_{2}^{U \times F} \circ Ψ (f))) (since ϕ (m) = π (f)) \\ = (m, Ψ^{- 1} ({pr}_{1}^{U \times F} \circ Ψ (f), {pr}_{2}^{U \times F} \circ Ψ (f))) \\ = (m, Ψ^{- 1} \circ Ψ (f)) \\ = (m, f) . \end{matrix}

With

g \in F

,

\begin{matrix} Ψ_{ϕ} \circ Ψ_{ϕ}^{- 1} (m, g) & = Ψ_{ϕ} (m, Ψ^{- 1} (ϕ (m), g)) \\ & = (m, {pr}_{2}^{U \times F} \circ Ψ \circ Ψ^{- 1} (ϕ (m), g)) \\ & = (m, {pr}_{2}^{U \times F} (ϕ (m), g)) \\ & = (m, g), \end{matrix}

proving Claim (2).

Claim (3):

Ψ_{ϕ}

trivializes

π^{ϕ^{*} F}

over

ϕ^{- 1} (U) \subseteq M

. Proof: Let

(m, f) \in ϕ^{*} (π^{- 1} (U))

. Then

{pr}_{1}^{ϕ^{- 1} (U) \times F} \circ Ψ_{ϕ} (m, f) = {pr}_{1}^{ϕ^{- 1} (U) \times F} (m, {pr}_{2}^{U \times F} \circ Ψ (f)) = m = π^{ϕ^{*} F} (m, f),

and by claims (1) and (2),

Ψ_{ϕ}

is a diffeomorphism, so

Ψ_{ϕ}

trivializes

π^{ϕ^{*} F}

over

ϕ^{- 1} (U) \subseteq M

. Claim (3) proved.

Since M can be covered with sets as in claim (3) and since the typical fiber of

π^{ϕ^{*} F}

is diffeomorphic to

F

, this shows that

π^{ϕ^{*} F}

defines a smooth bundle

(F, ϕ^{*} F, π^{ϕ^{*} F}, M)

. Because

ϕ^{*} F

is locally diffeomorphic to the product of an open subset of M with

F

,

ϕ^{*} F

has been shown to be a smooth manifold having dimension

dim M + dim F = dim M + rank π

. □
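Two standard special cases may help fix ideas. They are sketched here under assumptions: the curve γ, the interval I and the point q are hypothetical and not taken from the text.

```latex
% Assumed example 1: a smooth curve \gamma : I \to N and the bundle F = TN.
\begin{aligned}
\gamma^{*} TN &= \{ (t, v) \in I \times TN \mid \gamma(t) = \pi(v) \}, \qquad
(\gamma^{*} TN)_{t} = \{t\} \times T_{\gamma(t)} N, \\
\dim \gamma^{*} TN &= \dim I + \operatorname{rank} TN = 1 + \dim N .
\end{aligned}

% Assumed example 2: the constant map \phi \equiv q \in N.
\begin{aligned}
\phi^{*} F &= \{ (m, f) \in M \times F \mid \pi(f) = q \} = M \times F_{q},
\end{aligned}
```

In the first case, sections of the pullback are the vector fields along γ; in the second, the pullback is a trivial bundle over M whose fiber is the single fiber of F over q.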

While the pullback bundle is constructed as a submanifold of a direct product, there is a natural bundle morphism into the pulled-back bundle, which serves as an interface to maps defined on the pulled-back bundle. Usually this morphism is notationally suppressed, just as naturally isomorphic spaces can be identified without explicit notation.

Corollary 1

(Pullback fiber projection bundle morphism). If

ϕ : M \to N

is smooth, then

\begin{matrix} ρ_{F}^{ϕ^{*} F} : ϕ^{*} F & \to & F, \\ (m, f) & \mapsto & f \end{matrix}

is a smooth bundle morphism over ϕ which is an isomorphism when restricted to any fiber of

ϕ^{*} F

.

Because

ρ_{F}^{ϕ^{*} F}

is the projection

{pr}_{F}^{M \times F} ∣_{ϕ^{*} F}

, its tangent map is also just the projection

{pr}_{T F}^{T M \oplus T F} ∣_{T ϕ^{*} F}

.

Proposition 8

(Bundle pullback is a contravariant functor). The map of categories

\begin{matrix} Pullback : Manifold & \to & \{Bundle (M) ∣ M \in Manifold\}, \\ M & \mapsto & Bundle (M), \\ (ϕ : M \to N) & \mapsto & (Bundle (N) \to Bundle (M), (F, F, π, N) \mapsto (F, ϕ^{*} F, π^{ϕ^{*} F}, M)) \end{matrix}

is a contravariant functor. Here, naturally isomorphic bundles in

Bundle (M)

, for each manifold M, are identified (along with the corresponding morphisms).

Proof.

Noting that

{Id}_{N}^{*} F = \{(n, f) \in N \times F ∣ {Id}_{N} (n) = π (f)\} ≅ F

and that

\begin{matrix} ({Id}_{N}^{*} π) (n, f) & = ({pr}_{1}^{N \times F} ∣_{{Id}_{N}^{*} F}) (n, f) = n = π (f) \\ \Rightarrow {Id}_{N}^{*} π & ≅ π, \end{matrix}

it follows that

Pullback ({Id}_{N}) = {Id}_{Bundle (N)} = {Id}_{Pullback (N)}

, i.e.,

Pullback

satisfies the identity axiom of functoriality.

For the contravariance axiom, let

ϕ : M \to N

and

ψ : L \to M

be smooth manifold morphisms and let

(F, F, π, N)

be a smooth bundle. Then

\begin{matrix} ψ^{*} ϕ^{*} F & = \{(ℓ, p) \in L \times ϕ^{*} F ∣ ψ (ℓ) = π^{ϕ^{*} F} (p)\} \\ = \{(ℓ, (m, f)) \in L \times (M \times F) ∣ ψ (ℓ) = π^{ϕ^{*} F} (m, f) and ϕ (m) = π (f)\} \\ = \{(ℓ, (m, f)) \in L \times (M \times F) ∣ ψ (ℓ) = m and ϕ (m) = π (f)\} \\ ≅ \{(ℓ, f) \in L \times F ∣ ϕ \circ ψ (ℓ) = π (f)\} \\ = {(ϕ \circ ψ)}^{*} F \end{matrix}

and

\begin{matrix} π^{ψ^{*} ϕ^{*} F} (ℓ, (m, f)) & = ({pr}_{1}^{L \times ϕ^{*} F} ∣_{ψ^{*} ϕ^{*} F}) (ℓ, (m, f)) = ℓ and \\ π^{{(ϕ \circ ψ)}^{*} F} (ℓ, f) & = ({pr}_{1}^{L \times F} ∣_{{(ϕ \circ ψ)}^{*} F}) (ℓ, f) = ℓ, \end{matrix}

showing that

π^{ψ^{*} ϕ^{*} F} ≅ π^{{(ϕ \circ ψ)}^{*} F}

, and therefore

Pullback (ψ) \circ Pullback (ϕ) = Pullback (ϕ \circ ψ),

establishing

Pullback

as a contravariant functor. □

The space of sections of a pullback bundle is easily quantified.

Γ (ϕ^{*} F) = \{σ \in C^{\infty} (M, ϕ^{*} F) ∣ π^{ϕ^{*} F} \circ σ = {Id}_{M}\} .

This space will be central in the theory developed in the rest of this paper. Furthermore, it is naturally identified with the space of sections along the pullback map ϕ:

Γ_{ϕ} (F) : = \{Σ \in C^{\infty} (M, F) ∣ π^{F} \circ Σ = ϕ\} .

These spaces are naturally isomorphic to one another, and therefore an identification can be made when convenient. While the former space is more correct from a strongly typed standpoint, the latter space is a convenient and intuitive representational form. The particular correspondence depends heavily on the fact that

ϕ^{*} F

is a submanifold of

M \times F

.

\begin{matrix} Γ (ϕ^{*} F) & ≅ & Γ_{ϕ} (F) \\ σ & \mapsto & {pr}_{2}^{M \times F} \circ σ, \\ {Id}_{M} \times_{M} Σ & ↤ & Σ . \end{matrix}

Furthermore, if

f \in Γ (F)

, then

f \circ ϕ \in Γ_{ϕ} (F)

. Note that it is not true that any

σ \in Γ_{ϕ} (F)

can be written as

f \circ ϕ

for some

f \in Γ (F)

, for example when there exist distinct

p, q \in M

such that

ϕ (p) = ϕ (q)

and

σ (p) \neq σ (q)

. Furthermore, the representation

f \circ ϕ

is generally non-unique; for example, when

ϕ

is not surjective, sections

f_{1}, f_{2} \in Γ (F)

which differ only away from the image of

ϕ

will still give

f_{1} \circ ϕ = f_{2} \circ ϕ

. Before developing the notion of a linear connection on a pullback bundle, it will be necessary to address these features which, while inconvenient, provide the strength of the pullback bundle and pullback covariant derivative (see (5)).
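A self-intersecting curve gives a concrete section along a map that is not of the form f ∘ ϕ. The figure-eight curve below is an assumed example, with Tℝ² identified with ℝ² × ℝ².

```latex
% Assumed example: \gamma : \mathbb{R} \to \mathbb{R}^2, a figure-eight curve, and F = T\mathbb{R}^2.
\begin{aligned}
\gamma(t) &= (\sin 2t, \sin t), \qquad
\Sigma(t) := \big( \gamma(t), (2 \cos 2t, \cos t) \big), \qquad
\pi \circ \Sigma = \gamma \ \Rightarrow\ \Sigma \in \Gamma_{\gamma}(T\mathbb{R}^{2}), \\
\gamma(0) &= (0, 0) = \gamma(\pi), \\
\Sigma(0) &= \big( (0,0), (2, 1) \big) \neq \big( (0,0), (2, -1) \big) = \Sigma(\pi) .
\end{aligned}
```

If Σ = f ∘ γ held for some vector field f on ℝ², then Σ(0) = f(0, 0) = Σ(π) would follow, which contradicts the last line. The velocity field of the curve is therefore a section along γ that no single vector field on ℝ² can represent.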

Lemma 1

(Local representation of

Γ_{ϕ} (F)

elements). Recall that r denotes the rank of smooth bundle F. If

σ \in Γ_{ϕ} (F)

then each point

p \in M

has some neighborhood U in which σ can be written locally as

σ ∣_{U} = σ^{i} f_{i} \circ ϕ ∣_{U}

, where

f_{1}, \dots, f_{r} \in Γ (F ∣_{ϕ (U)})

is a frame for

F ∣_{ϕ (U)}

, and

σ^{1}, \dots, σ^{r} \in C^{\infty} (U, R)

are defined by

σ^{i} = (f^{i} \circ ϕ ∣_{U}) \cdot_{F} σ ∣_{U}

.

Proof.

Let

p \in M

, let

V \subseteq N

be a neighborhood of

ϕ (p)

over which

F ∣_{V}

is trivial, and let

U = ϕ^{- 1} (V)

, so that U is a neighborhood of p. Let

f_{1}, \dots, f_{r} \in Γ (F ∣_{V})

be a frame for

F ∣_{V}

(i.e.,

F ∣_{ϕ (U)}

), and let

f^{1}, \dots, f^{r} \in Γ ({(F ∣_{V})}^{*})

be the corresponding coframe (i.e., the unique

f^{1}, \dots, f^{r}

such that

f^{i} \cdot_{F} f_{j} = δ_{j}^{i}

for each

i, j

). Define

σ^{i} \in C^{\infty} (U, R)

by

σ^{i} = (f^{i} \circ ϕ ∣_{U}) \cdot_{F} σ ∣_{U}

. Then

\begin{matrix} σ^{i} f_{i} \circ ϕ ∣_{U} & = (f^{i} \circ ϕ ∣_{U}) \cdot_{F} σ ∣_{U} f_{i} \circ ϕ ∣_{U} \\ = ((f_{i} \circ ϕ ∣_{U}) \otimes_{U} (f^{i} \circ ϕ ∣_{U})) \cdot_{F} σ ∣_{U} \\ = ((f_{i} \otimes_{V} f^{i}) \circ ϕ ∣_{U}) \cdot_{F} σ ∣_{U} \\ = (I_{F ∣_{V}} \circ ϕ ∣_{U}) \cdot_{F} σ ∣_{U} \\ = σ ∣_{U}, \end{matrix}

as desired. □

Some literature uses expressions of the form

f \circ ϕ \in Γ_{ϕ} (F)

along with an implicit use of the section-identifying isomorphism to write down particular sections of pullback bundles. In most cases, this tacit identification of spaces is harmless, but certain highly involved calculations may suffer from it. The section that

f \circ ϕ

corresponds to under said isomorphism is

{Id}_{M} \times_{M} (f \circ ϕ) \in Γ (ϕ^{*} F)

. However, this expression is unwieldy, and therefore a more compact and contextually meaningful expression is called for.

Definition 3

(Pullback section). If

f \in Γ (F)

and

ϕ : M \to N

is smooth, then define

ϕ^{*} f : = {Id}_{M} \times_{M} (f \circ ϕ) \in Γ (ϕ^{*} F) .

This is known as a pullback section.

The pullback section is deservedly named. If

ϕ : M \to N

and

ψ : L \to M

are smooth, then

ψ^{*} ϕ^{*} f ≅ {(ϕ \circ ψ)}^{*} f

in the sense of the proof of (8).
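The identification can be verified pointwise by unwinding Definition 3 twice. This is a sketch; ℓ denotes an arbitrary point of L.

```latex
\begin{aligned}
(\phi^{*} f)(m) &= \big( m, f(\phi(m)) \big) \in \phi^{*} F, \\
(\psi^{*} \phi^{*} f)(\ell)
  &= \big( \ell, (\phi^{*} f)(\psi(\ell)) \big)
   = \big( \ell, ( \psi(\ell), f(\phi(\psi(\ell))) ) \big) \in \psi^{*} \phi^{*} F, \\
\big( \ell, ( \psi(\ell), f(\phi(\psi(\ell))) ) \big)
  &\mapsto \big( \ell, f((\phi \circ \psi)(\ell)) \big)
   = \big( (\phi \circ \psi)^{*} f \big)(\ell) .
\end{aligned}
```

The last arrow is the isomorphism (ℓ, (m, f)) ↦ (ℓ, f) used in the contravariance part of the proof of (8).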

Proposition 9

(Bundle pullback commutes with tensor product). If E and F are smooth vector bundles over manifold N and

ϕ : M \to N

is smooth, then the map

\begin{matrix} ϕ^{*} E \otimes_{M} ϕ^{*} F & \to & ϕ^{*} (E \otimes_{N} F), \\ (m, e) \otimes_{M} (m, f) & \mapsto & (m, e \otimes_{N} f) \end{matrix}

(extended linearly to general tensors) is a smooth vector bundle isomorphism.

Proof.

Let c denote the above map. The well-definedness of c comes from the universal mapping property on multilinear forms which induces a linear map on a corresponding tensor product. If

c ((m, e) \otimes_{M} (m, f)) = 0

, then

e \otimes_{N} f = 0,

which implies that

e = 0

or

f = 0

, and therefore that

(m, e) \otimes_{M} (m, f) = 0

. Because there exists a basis for

{(ϕ^{*} E \otimes_{M} ϕ^{*} F)}_{m}

consisting only of simple tensors, this implies that c is injective, and by a dimensionality argument, that c is an isomorphism. The map is clearly smooth and respects the fiber structures of its domain and codomain. Thus c is a smooth vector bundle isomorphism. □

The contravariance of pullback and its naturality with respect to tensor product are two essential properties which provide some of the flexibility and precision of the strongly typed tensor formalism described in this paper. This will become quite apparent in Section 4.

Remark 2

(Tensor field formulation of smooth vector bundle morphisms). A particularly useful application of pullback bundles is in forming a rich type system for smooth vector bundle morphisms. This approach was inspired by ([14], p. 11). Let

π^{E} : E \to M

and

π^{F} : F \to N

be smooth vector bundles, and let

ϕ : M \to N

be smooth. Consider

{Hom}_{ϕ} (E, F)

, i.e., the space of smooth vector bundle morphisms over the map ϕ. There is a natural identification with another space which lets the base map ϕ play a more direct role in the space’s type. In particular,

\begin{matrix} {Hom}_{ϕ} (E, F) & ≅ & {Hom}_{{Id}_{M}} (E, ϕ^{*} F), \\ A & \mapsto & π^{E} \times_{E} A, \\ {pr}_{2}^{M \times F} \circ B & ↤ & B . \end{matrix}

This particular identification of smooth vector bundle morphisms over ϕ can now be directly translated into the tensor field formalism, analogously to (1).

\begin{matrix} Γ (ϕ^{*} F \otimes_{M} E^{*}) & \to & {Hom}_{{Id}_{M}} (E, ϕ^{*} F), \\ A & \mapsto & (e \mapsto A \cdot_{E} e) . \end{matrix}

The inverse image of

B \in {Hom}_{{Id}_{M}} (E, ϕ^{*} F)

is given locally; let

(e_{i})

and

(f_{i})

denote local frames for E and F in neighborhoods

U \subseteq M

and

V \subseteq N

respectively, with

ϕ (U) \subseteq V

, and let

(e^{i})

and

(f^{i})

denote their dual coframes. Then the tensor field corresponding to B is given locally in U by

B_{j}^{i} ϕ^{*} f_{i} \otimes_{M} e^{j}

, where

B_{j}^{i} : = ϕ^{*} d f^{i} \circ B \circ e_{j} \in C^{\infty} (U, R)

.

Quantifying smooth vector bundle morphisms as tensor fields lends itself naturally to doing calculus on vector and tensor bundles, as the relevant derivatives (covariant derivatives) take the form of tensor fields. The type information for a particular vector bundle morphism is encoded in the relevant tensor bundle.

3.7. Tangent Map as a Tensor Field

This section deals specifically with the tangent map operator by using concepts from Section 3.5 and Section 3.6 to place it in a strongly typed setting and to prepare to unify a few seemingly disparate concepts and notation for some tangible benefit (in particular, see Section 3.10).

Given a smooth map

ϕ : M \to N

, its tangent map

T ϕ : T M \to T N

is a smooth vector bundle morphism over

ϕ

, so by (2), is naturally identified with a tensor field

\nabla^{_{○}}^{M \to N} ϕ \in Γ (ϕ^{*} T N \otimes_{M} T^{*} M),

which may be denoted by

\nabla^{_{○}} ϕ

where type pedantry is deemed unnecessary. This construction is known as a two-point tensor field ([11], p. 70). In general, if

ψ : M \to N

, then

\nabla^{_{○}} ϕ

and

\nabla^{_{○}} ψ

have distinct types

Γ (ϕ^{*} T N \otimes_{M} T^{*} M)

and

Γ (ψ^{*} T N \otimes_{M} T^{*} M)

respectively, and therefore have no well-defined sum. Thus

\nabla^{_{○}}

is a nonlinear derivative. The inscribed ∘ symbol within the symbol

\nabla^{_{○}}

is meant to denote that nonlinearity, in particular distinguishing it from a linear covariant derivative.
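For orientation, the tensor field can be written out in coordinates. The following is a sketch under assumptions not made in the text above: (x^j) are local coordinates on an open set U ⊆ M, (y^i) are local coordinates on V ⊆ N with ϕ(U) ⊆ V, the component functions are ϕ^i := y^i ∘ ϕ, and the symbol \overset{\circ}{\nabla} stands for the tangent map operator.

```latex
\overset{\circ}{\nabla} \phi
  = \frac{\partial \phi^{i}}{\partial x^{j}} \,
    \phi^{*} \frac{\partial}{\partial y^{i}} \otimes_{M} d x^{j}
  \in \Gamma_{U} (\phi^{*} T N \otimes_{M} T^{*} M),
\qquad
\phi^{i} := y^{i} \circ \phi .
```

The coefficients are the entries of the Jacobian matrix, while the frame elements ϕ^* ∂/∂y^i carry the base map. For a different map ψ the analogous expression is written in the frame ψ^* ∂/∂y^i, which is the coordinate manifestation of the two tensor fields having distinct types.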

Remark 3

(Generalized covariant derivative). The well-known one-to-one correspondence between linear connections and linear covariant derivatives ([6], p. 520) generalizes to a one-to-one correspondence between Ehresmann connections and a generalized notion of covariant derivative. To give a partial definition for the purposes of utility, a generalized covariant derivative on a smooth [fiber] bundle

F \to N

is a map ∇ on

Γ (F)

such that

\nabla σ \in Γ (σ^{*} V F \otimes_{N} T^{*} N)

for each

σ \in Γ (F)

. The space of maps

C^{\infty} (M, N)

is naturally identified as

Γ (N ⥇ M)

, and there is a natural Ehresmann connection on the bundle

N ⥇ M

, whose corresponding covariant derivative is the tangent map operator. This is the subject of another of the author’s papers and will not be discussed here further. This is mentioned here to incorporate linear covariant derivatives (to be introduced and discussed in Section 3.8) and the tangent map operator (a nonlinear covariant derivative) under the single category “covariant derivative”.

There is a subtle issue regarding construction of the cotangent map of

ϕ

which is handled easily by the tensor field construction. In particular, while the cotangent map

T^{*} ϕ

is the pointwise adjoint of the tangent map

T ϕ

, i.e., for each

p \in M

,

T_{p} ϕ : T_{p} M \to T_{ϕ (p)} N

is linear and

T_{p}^{*} ϕ : T_{ϕ (p)}^{*} N \to T_{p}^{*} M

is the adjoint of

T_{p} ϕ

, it does not follow that

T^{*} ϕ \in Hom (T^{*} N, T^{*} M)

, being some sort of “total adjoint” of

T ϕ \in Hom (T M, T N)

. The obstruction is due to the fact that

ϕ

may not be surjective, so there may be some fiber

T_{q}^{*} N

that is not of the form

T_{ϕ (p)}^{*} N

, and therefore the domain could not be all of

T^{*} N

. Furthermore, even if

ϕ

were surjective, if it were not also injective, say

ϕ (p_{0}) = ϕ (p_{1})

for some distinct

p_{0}, p_{1} \in M

, then

T_{ϕ (p_{0})}^{*} N = T_{ϕ (p_{1})}^{*} N

, and

T_{p_{0}} M \neq T_{p_{1}} M

, so the action on the fiber

T_{ϕ (p_{0})}^{*} N

is not well-defined.

In the tensor field parlance, the cotangent map

T^{*} ϕ

simply takes the form

{(\nabla^{_{○}} ϕ)}^{(1 2)} \in Γ (T^{*} M \otimes_{M} ϕ^{*} T N) .

The permutation superscript

(1 2)

is used here instead of * to distinguish it notationally from pullback notation, which will be necessary in later calculations. The key concept is that the tensor field

{(\nabla^{_{○}} ϕ)}^{(1 2)}

encodes the base map

ϕ

; the basepoint

p \in M

is part of the domain

ϕ^{*} T^{*} N

itself.

The chain rule in the tensor field formalism makes use of the bundle pullback. If

ψ : L \to M

is smooth, then

\nabla^{_{○}}^{L \to N} (ϕ \circ ψ) = ψ^{*} \nabla^{_{○}}^{M \to N} ϕ \cdot_{ψ^{*} T M} \nabla^{_{○}}^{L \to M} ψ .

Because

\nabla^{_{○}} ψ \in Γ (ψ^{*} T M \otimes_{L} T^{*} L)

, to form a well-defined natural pairing, the use of the pullback

ψ^{*} \nabla^{_{○}} ϕ \in Γ (ψ^{*} (ϕ^{*} T N \otimes_{M} T^{*} M)) = Γ (ψ^{*} ϕ^{*} T N \otimes_{L} ψ^{*} T^{*} M) = Γ ({(ϕ \circ ψ)}^{*} T N \otimes_{L} ψ^{*} T^{*} M)

is necessary (instead of just

\nabla^{_{○}} ϕ \in Γ (ϕ^{*} T N \otimes_{M} T^{*} M)

).
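The role of the pullback in the chain rule can be checked numerically in the simplest setting. The sketch below assumes L = M = R^2 and N = R^3 with standard coordinates, so that each tangent map reduces to a Jacobian matrix; the maps `psi` and `phi` and the helpers `jacobian`, `matmul`, and `chain_rule_gap` are hypothetical names chosen for illustration and do not come from the paper. Evaluating the Jacobian of `phi` at `psi(x)` rather than at `x` is the coordinate shadow of pulling the tangent map of ϕ back along ψ.

```python
import math

def psi(x):
    # Hypothetical smooth map psi : L = R^2 -> M = R^2.
    u, v = x
    return [u * v, u + math.sin(v)]

def phi(y):
    # Hypothetical smooth map phi : M = R^2 -> N = R^3.
    a, b = y
    return [math.exp(a), a * b, b ** 2]

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian of f at x (rows: outputs, columns: inputs)."""
    n_out = len(f(x))
    cols = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        cols.append([(fp[i] - fm[i]) / (2 * h) for i in range(n_out)])
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(len(x))] for i in range(n_out)]

def matmul(A, B):
    """Plain matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def chain_rule_gap(x):
    """Largest entrywise difference between the two sides of the chain rule at x."""
    lhs = jacobian(lambda z: phi(psi(z)), x)
    # The Jacobian of phi is evaluated at psi(x): this is the pullback by psi.
    rhs = matmul(jacobian(phi, psi(x)), jacobian(psi, x))
    return max(abs(lhs[i][j] - rhs[i][j])
               for i in range(len(lhs)) for j in range(len(lhs[0])))

print(chain_rule_gap([0.3, -0.7]))
```

The printed gap is limited only by finite-difference error. Replacing `jacobian(phi, psi(x))` with `jacobian(phi, x)` would pair matrices based at different points, which is the kind of mistake the typed formulation rules out.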

Sometimes it is useful to discard some type information and write

\nabla^{_{○}} ϕ \in Γ_{ϕ \times_{M} {Id}_{M}} (T N \otimes_{N \times M} T^{*} M),

i.e.,

\nabla^{_{○}} ϕ : M \to T N \otimes_{N \times M} T^{*} M

such that

(π_{N}^{T N} \otimes_{N \times M} π_{M}^{T^{*} M}) \circ \nabla^{_{○}} ϕ = ϕ \times_{M} {Id}_{M}

. This is easily done by the canonical fiber projection available to all pullback bundle constructions;

ϕ^{*} T N \otimes_{M} T^{*} M ≅ {(ϕ \times_{M} {Id}_{M})}^{*} (T N \otimes_{N \times M} T^{*} M)

, and the canonical fiber projection is

ρ_{T N \otimes_{N \times M} T^{*} M}^{{(ϕ \times_{M} {Id}_{M})}^{*} (T N \otimes_{N \times M} T^{*} M)} : {(ϕ \times_{M} {Id}_{M})}^{*} (T N \otimes_{N \times M} T^{*} M) \to T N \otimes_{N \times M} T^{*} M,

as defined in (1). The granularity of the type system should reflect the weight of the calculations being performed. For demonstration of contrasting situations, see the discussion at the beginning of Section 3.8 and the computation of the first variation in (1).

It is important to have notation which makes the distinction between the smooth vector bundle morphism formalism and the tensor field formalism, because it may sometimes be necessary to mix the two, though this paper will not need this. An added benefit to the tensor field formulation of tangent maps is that certain notions regarding derivatives can be conceptually and notationally combined, for example in Section 3.10.

3.8. Linear Covariant Derivatives

As will be shown in the following discussion, a linear covariant derivative (commonly referred to in the standard literature without the “linear” qualifier) provides a way to generalize the notion in elementary calculus of the differential of a vector-valued function. The linear covariant derivative interacts naturally with the notion of the pullback bundle, and this interaction leads naturally to what could be called a covariant derivative chain rule, which provides a crucial tool for the tensor calculus computations seen later.

Let V and W be finite-dimensional vector spaces, let

U \subseteq V

be open, and let

ϕ : U \to W

be differentiable. Recall from elementary calculus the differential

D ϕ : U \to W \otimes V^{*}

(essentially matrix-valued). There is no base map information encoded in

D ϕ

(i.e.,

ϕ

cannot be recovered from

D ϕ

alone); it contains only derivative information. The vector space structure of V and W allows the trivializations

T U ≅ V ⥇ U

and

T W ≅ W ⥇ W

, where the first factors are the fibers and the second factors are the base spaces (see (1)). The tangent map

\nabla^{_{○}}^{U \to W} ϕ : U \to T W \otimes_{W \times U} T^{*} U

(see Section 3.7) has a codomain that can be trivialized similarly;

T W \otimes_{W \times U} T^{*} U ≅ (W ⥇ W) \otimes_{W \times U} (V^{*} ⥇ U) ≅ (W \otimes V^{*}) ⥇ (W \times U) .

Because

(W \otimes V^{*}) ⥇ (W \times U)

, as a set, is a direct product, it can be decomposed into two factors. Letting

{pr}_{1}

and

{pr}_{2}

be the projections onto the first and second factors respectively,

{pr}_{1} \circ \nabla^{_{○}} ϕ : U \to W \otimes V^{*} and {pr}_{2} \circ \nabla^{_{○}} ϕ : U \to W \times U .

The map

{pr}_{2} \circ \nabla^{_{○}} ϕ

is the element of

Γ (W ⥇ U)

identified with the base map

ϕ

itself;

{pr}_{W}^{W \times U} \circ {pr}_{2} \circ \nabla^{_{○}} ϕ = ϕ

. This base map information is discarded in defining the differential of

ϕ

as

D ϕ : = {pr}_{1} \circ \nabla^{_{○}} ϕ

; the fiber portion of

\nabla^{_{○}} ϕ

. This construction relies critically on the natural isomorphism

T W ≅ W ⥇ W

for a vector space W.

An analogous construction shows that the differential

D ϕ

of a map

ϕ

is well-defined even when its domain is a manifold. However, when the codomain of a map

ϕ

is only a manifold, there does not in general exist a natural trivialization of its tangent bundle (in contrast to the vector space case), and therefore

D ϕ

cannot be defined without additional structure. A linear covariant derivative provides the missing structure.

For the remainder of this section, let

π : E \to N

define a smooth vector bundle having rank r.

A linear covariant derivative on E provides a means of taking derivatives of sections of E (i.e., maps

σ : N \to E

such that

π \circ σ = {Id}_{N}

) without passing to a higher tangent bundle as would happen under the tangent map functor (i.e., if

σ \in Γ (E)

then

T σ : T N \to T E

and

\nabla^{_{○}}^{N \to E} σ : N \to T E \otimes_{E \times N} T^{*} N

). A linear covariant derivative provides an effective “trivialization” of

T E

analogous to the trivialization

T W ≅ W ⥇ W

as discussed above, discarding all but the “fiber” portion of the derivative, allowing the construction of an object known as the total linear covariant derivative analogous to the differential

D ϕ

as discussed above.

The notion of a linear covariant derivative on a vector bundle is arguably the crucial element of differential geometry (The Fundamental Lemma of Riemannian Geometry establishes the existence of the Levi-Civita connection ([8], p. 68), which is a linear covariant derivative satisfying certain naturality properties.). In particular, this operator implements the product rule property common to anything that can be called a derivation—a property which is particularly conducive to the operation of tensor calculus. The total linear covariant derivative of a vector field (i.e., section of a vector bundle) allows the generalization of many constructions in elementary calculus to the setting of smooth vector bundles equipped with linear covariant derivatives. For example, the divergence

div X : = tr D X

of a vector field X on

R^{n}

generalizes to the divergence

div X : = tr \nabla X

of a vector field X on N, which has an analogous divergence theorem among other qualitative similarities.
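As a worked instance of the generalized divergence, the following sketch assumes local coordinates (x^i) on N, a linear covariant derivative on T N, and Christoffel symbols defined by the convention \nabla_{\partial_j} \partial_k = \Gamma^{i}_{jk} \partial_i; the coordinates and this index convention are assumptions made only for illustration.

```latex
\nabla X
  = \left( \frac{\partial X^{i}}{\partial x^{j}} + \Gamma^{i}_{jk} X^{k} \right)
    \partial_{i} \otimes_{N} d x^{j},
\qquad
\operatorname{div} X
  = \operatorname{tr} \nabla X
  = \frac{\partial X^{i}}{\partial x^{i}} + \Gamma^{i}_{ik} X^{k} .
```

On R^n with standard coordinates and the flat covariant derivative, all Christoffel symbols vanish and the expression reduces to tr D X.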

Remark 4

(Natural linear covariant derivative on trivial line bundle). Before making the general definition for the linear covariant derivative, a natural linear covariant derivative will be introduced. With N denoting a smooth manifold as before, if

f \in C^{\infty} (N, R)

, then

d f \in Γ (T^{*} N)

is the differential of f. Let

\nabla^{|}^{N \to R} f : = d f .

Because

C^{\infty} (N, R)

is naturally identified with

Γ (R ⥇ N)

, this is essentially the natural linear covariant derivative on the trivial line bundle

R ⥇ N

. Note that there is an associated product rule; if

f, g \in C^{\infty} (N, R)

, then

f g \in C^{\infty} (N, R)

, and

\nabla^{|}^{N \to R} (f g) = d (f g) = g d f + f d g = g \nabla^{|}^{N \to R} f + f \nabla^{|}^{N \to R} g .

When clear from context, the superscript decoration can be omitted and the derivative denoted as

\nabla^{|} f

.

Definition 4

(Linear covariant derivative). A linear covariant derivative on a vector bundle defined by

π : E \to N

is an

R

-linear map

\nabla^{|}^{E} : Γ (E) \to Γ (E \otimes_{N} T^{*} N)

satisfying the product rule

\nabla^{|}^{E} (f \otimes_{N} σ) = σ \otimes_{N} \nabla^{|}^{N \to R} f + f \otimes_{N} \nabla^{|}^{E} σ,

(1)

where

f \in C^{\infty} (N, R)

and

σ \in Γ (E)

. The switch in order in the first term of the expression is necessary to form a tensor field of the correct type,

Γ (E \otimes_{N} T^{*} N)

. If

σ \in Γ (E)

, then the expression

\nabla^{|}^{E} σ

is known as the total [linear] covariant derivative of σ. If

\nabla^{|}^{E} σ = 0

[in a subset

U \subseteq N

], then σ is said to be parallel [on U]. The “linear” qualifier is implied in standard literature and is therefore often omitted.

The inscribed | in

\nabla^{|}

is to indicate that the covariant derivative is linear, and can be omitted when clear from context, or when it is unnecessary to distinguish it from the nonlinear tangent map operator whose decorated symbol is

\nabla^{_{○}}

. For the remainder of this section, this distinction will not be necessary, so an undecorated ∇ will be used.

For

V \in Γ (T N)

, it is customary to denote

\nabla^{E} σ \cdot V

by

\nabla_{V}^{E} σ

, where V indicates the “directional” component of the derivative. Following this convention, the product rule can be written in a more recognizable form:

\nabla_{V}^{E} (f \otimes_{N} σ) = \nabla_{V}^{N \to R} f \otimes_{N} σ + f \otimes_{N} \nabla_{V}^{E} σ .

Given a linear covariant derivative

\nabla^{E}

on E, there is a naturally induced linear covariant derivative

\nabla^{E^{*}}

on

E^{*}

satisfying the product rule for the natural pairing on E, namely,

\begin{matrix} \nabla_{X}^{N \to R} (α \cdot_{E} σ) & = & \nabla_{X}^{E^{*}} α \cdot_{E} σ + α \cdot_{E} \nabla_{X}^{E} σ, \end{matrix}

where

X \in Γ (T N)

,

α \in Γ (E^{*})

, and

σ \in Γ (E)

.
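The induced derivative on the dual bundle can be made explicit in a local frame. The sketch below assumes a local frame (e_i) for E on an open set U ⊆ N with dual coframe (e^i), and connection 1-forms \omega^{j}_{i} \in \Gamma_U(T^* N) defined by \nabla^{E} e_i = e_j \otimes_N \omega^{j}_{i}; the symbol ω is introduced here for illustration only. Because the pairing e^i \cdot_E e_j is a constant function, the product rule for the natural pairing determines the derivative of the coframe.

```latex
0 = \nabla_{X}^{N \to \mathbb{R}} (e^{i} \cdot_{E} e_{j})
  = \nabla_{X}^{E^{*}} e^{i} \cdot_{E} e_{j} + e^{i} \cdot_{E} \nabla_{X}^{E} e_{j}
  = \nabla_{X}^{E^{*}} e^{i} \cdot_{E} e_{j} + \omega^{i}_{j} (X)
\quad \Longrightarrow \quad
\nabla^{E^{*}} e^{i} = - e^{j} \otimes_{N} \omega^{i}_{j} .
```

The sign change and the transposed index placement are the familiar features of the dual connection.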

A covariant derivative is a local operator with respect to the base space N; if

p \in N

, then

(\nabla^{E} σ) (p)

depends only on the restriction of

σ

to an arbitrarily small neighborhood of p ([8], p. 50), and therefore the restriction

\nabla^{E ∣_{U}} : Γ_{U} (E) \to Γ_{U} (E \otimes_{N} T^{*} N)

makes sense, allowing calculations using local expressions. Furthermore, a covariant derivative can be constructed locally and glued together under certain conditions. See ([6], p. 503) for more on this, and as a reference for general theory on bundles, covariant derivatives, and connections.

Linear covariant derivatives on several vector bundle constructions will now be developed. In analogy to defining a linear map by its action on a generating subset (e.g., a basis or a dense subspace) and then extending using the linear structure, Lemma (3) allows a covariant derivative to be defined on a generating subset (which can be chosen to make the defining expression particularly natural) and then extending. In this case, the relevant space is the space of sections of the vector bundle, which is a module over the ring of smooth functions on a manifold, and the extension process is done via linearity and the product rule (see (4)). This approach will allow the local trivialization implementation details to be hidden within the proof of Lemma (3)—an example of information hiding—so that constructions of covariant derivatives can proceed clearly by focusing only on the natural properties of the relevant objects and then invoking the lemma to do the “dirty” work (see (10) and (11)).

A bit of useful notation will be introduced to simplify the next definition. If

G \subseteq Γ

is a subset of a

C^{\infty} (N, R)

-module

Γ

whose elements are functions on N (and therefore have a notion of restriction to a subset) and

U \subseteq N

is open, then let

G_{U}

denote the set of restrictions of the elements of G to the set U. Note that

G_{U} \subseteq Γ_{U}

by construction.

Definition 5

(Finitely generating subset). Say that a subset of a module finitely generates the module if the subset contains a finite set of generators for the module.

Definition 6

(Locally finitely generating subset). If Γ is a

C^{\infty} (N, R)

-module and

G \subseteq Γ

, then G is said to be a locally finitely generating subset of

Γ

if each point

q \in N

has a neighborhood

U \subseteq N

for which

G_{U}

finitely generates

Γ_{U}

.

The space of sections of a vector bundle is the archetype for the above definition. The locally trivial nature of

π : E \to N

allows local frames to be chosen in a neighborhood of each point of N, from which global smooth sections (though not necessarily a global frame) can be made using a partition of unity subordinate to the trivializing neighborhoods. The set of such global sections forms a locally finitely generating subset of

Γ (E)

.

Lemma 2.

If G is a locally finitely generating subset of

Γ (E)

, then each point in N has a neighborhood

U \subseteq N

and

e_{1}, \dots, e_{r} \in G_{U}

such that

e_{1}, \dots, e_{r}

forms a frame for

Γ_{U} (E)

. In other words, a local frame can be chosen out of G near each point in N.

Proof.

Let

q \in N

and let

V \subseteq N

be a neighborhood of q for which

G_{V} = \{g_{1}, \dots, g_{ℓ}\}

finitely generates

Γ_{V} (E)

(here,

ℓ \geq r

, recalling that

r = rank E

). Without loss of generality, let

g_{1} (q), \dots, g_{r} (q)

be linearly independent (this is possible because

\{g_{1} (q), \dots, g_{ℓ} (q)\}

spans the vector space

E_{q}

). Because

g_{i}

is continuous for each i and the linear independence of the sections

g_{1}, \dots, g_{r}

is an open condition (defined by

L^{- 1} (R \setminus \{0\})

where

L : N \to ⋀^{r} E_{q}, p \mapsto g_{1} (p) \land \dots \land g_{r} (p)

), there is a neighborhood

U \subseteq V

of q for which

\{g_{1} (p), \dots, g_{r} (p)\}

is a linearly independent set for each

p \in U

. Finally, letting

e_{i} : = g_{i} ∣_{U}

for

i \in \{1, \dots, r\}

, the sections

e_{1}, \dots, e_{r} \in G_{U}

form a frame for

Γ_{U} (E)

. □

The following lemma shows that defining a covariant derivative on a locally finitely generating subset of the space of sections of a vector bundle is sufficient to uniquely define a covariant derivative on the whole space. The particular generating subset can be chosen so the covariant derivative has a particularly natural expression within that subset.

Lemma 3

(Linear covariant derivative construction). Let G be a locally finitely generating subset of

Γ (E)

. If

\nabla^{G} : G \to Γ (E \otimes_{N} T^{*} N)

satisfies the linear covariant derivative axioms (What is meant by this is that the product rule need only be satisfied on

λ \otimes_{N} g

if

λ g \in G

, where

λ \in C^{\infty} (N, R)

and

g \in G

.), then there is a unique linear covariant derivative

\nabla^{E} : Γ (E) \to Γ (E \otimes_{N} T^{*} N)

whose restriction to G is

\nabla^{G}

.

Proof.

If

q \in N

, then by (2) there exists a neighborhood

U \subseteq N

of q for which there are

e_{1}, \dots, e_{r} \in G_{U}

forming a frame for

E ∣_{U}

. If

σ \in Γ (E)

, then

σ ∣_{U} = σ^{i} e_{i}

for some

σ^{1}, \dots, σ^{r} \in C^{\infty} (U, R)

(specifically,

σ^{i} = e^{i} \cdot_{E} σ ∣_{U}

, where

e^{1}, \dots, e^{r} \in Γ_{U} (E^{*})

denotes the dual coframe of

e_{1}, \dots, e_{r}

). Define

\nabla^{E} : Γ (E) \to Γ (E \otimes_{N} T^{*} N)

locally on

Γ_{U} (E)

so as to satisfy the product rule

\nabla^{E} (σ ∣_{U}) : = e_{i} \otimes_{N} \nabla^{N \to R} σ^{i} + σ^{i} \otimes_{N} \nabla^{G} e_{i} .

To show well-definedness, let

f_{1}, \dots, f_{r} \in G_{U}

be another frame for

E ∣_{U}

. Then

σ = τ^{i} f_{i}

for some

τ^{1}, \dots, τ^{r} \in C^{\infty} (U, R)

. Let

Ψ : Γ_{U} (E) \to Γ_{U} (E)

be the unique smooth vector bundle isomorphism such that

f_{i} = Ψ \cdot_{E} e_{i}

. Writing

Ψ

and

Ψ^{- 1}

with respect to the frame

(e_{i})

as

Ψ_{j}^{i} e_{i} \otimes e^{j}

and

{(Ψ^{- 1})}_{j}^{i} e_{i} \otimes e^{j}

respectively, it follows that

f_{i} = Ψ_{i}^{j} e_{j}

and

τ^{i} = σ^{j} {(Ψ^{- 1})}_{j}^{i}

. Then

\begin{matrix} \nabla^{E} (τ^{i} f_{i}) & = & f_{i} \otimes_{N} \nabla^{N \to R} τ^{i} + τ^{i} \otimes_{N} \nabla^{G} f_{i} \\ & = & Ψ_{i}^{j} e_{j} \otimes_{N} \nabla^{N \to R} (σ^{k} {(Ψ^{- 1})}_{k}^{i}) + σ^{j} {(Ψ^{- 1})}_{j}^{i} \otimes_{N} \nabla^{G} (Ψ_{i}^{k} e_{k}) \\ & = & Ψ_{i}^{j} e_{j} {(Ψ^{- 1})}_{k}^{i} \otimes_{N} \nabla^{N \to R} σ^{k} + Ψ_{i}^{j} e_{j} σ^{k} \otimes_{N} \nabla^{N \to R} {(Ψ^{- 1})}_{k}^{i} \\ & & + σ^{j} {(Ψ^{- 1})}_{j}^{i} e_{k} \otimes_{N} \nabla^{N \to R} Ψ_{i}^{k} + σ^{j} {(Ψ^{- 1})}_{j}^{i} Ψ_{i}^{k} \otimes_{N} \nabla^{G} e_{k} \\ & = & δ_{k}^{j} e_{j} \otimes_{N} \nabla^{N \to R} σ^{k} + σ^{j} δ_{j}^{k} \otimes_{N} \nabla^{G} e_{k} + σ^{ℓ} e_{k} \otimes_{N} \nabla^{N \to R} (Ψ_{i}^{k} {(Ψ^{- 1})}_{ℓ}^{i}) \\ & = & \nabla^{E} (σ^{i} e_{i}) . \end{matrix}

The last equality follows because

Ψ_{i}^{k} {(Ψ^{- 1})}_{ℓ}^{i} = δ_{ℓ}^{k}

, which is a constant function, so

\nabla^{N \to R} (Ψ_{i}^{k} {(Ψ^{- 1})}_{ℓ}^{i}) = 0 .

Thus the expression defining

\nabla^{E}

does not depend on the choice of local frame. This establishes the well-definedness of

\nabla^{E}

.

Clearly the restriction of

\nabla^{E}

to G is

\nabla^{G}

. This establishes the claim of existence. Uniqueness follows from the fact that

\nabla^{E}

is defined in terms of the maps

\nabla^{N \to R}

and

\nabla^{G}

. □

Lemma (3) is used in the proof of the following proposition to allow a natural formulation of the pullback covariant derivative with respect to a natural locally finitely generating subset of

Γ (ϕ^{*} E)

, in which the relevant derivative has a natural chain rule.

Proposition 10

(Pullback covariant derivative). If

ϕ : M \to N

is smooth and

\nabla^{E}

is a covariant derivative on E, then there is a unique covariant derivative

\nabla^{ϕ^{*} E}

on

ϕ^{*} E

satisfying the chain rule

\nabla^{ϕ^{*} E} ϕ^{*} e = ϕ^{*} \nabla^{E} e \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ

for all

e \in Γ (E)

.

Proof.

Let

G : = \{σ \in Γ (ϕ^{*} E) ∣ σ = ϕ^{*} e for some e \in Γ (E)\},

noting that a local frame

e_{1}, \dots, e_{rank E} \in Γ_{U} (E)

over open set

U \subseteq N

induces a local frame

ϕ^{*} e_{1}, \dots, ϕ^{*} e_{rank E} \in Γ_{ϕ^{- 1} (U)} (ϕ^{*} E)

, so G is a locally finitely generating subset of

Γ (ϕ^{*} E)

. Define

\begin{matrix} \nabla^{G} : G & \to & Γ (ϕ^{*} E \otimes_{M} T^{*} M), \\ ϕ^{*} e & \mapsto & ϕ^{*} \nabla^{E} e \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ . \end{matrix}

The well-definedness and

R

-linearity of

\nabla^{G}

come from those of

\nabla^{E}

. For the product rule, if

λ \in C^{\infty} (M, R)

and

e \in Γ (E)

, then the product

λ \otimes_{M} ϕ^{*} e

is an element of G if and only if

λ = ϕ^{*} μ

for some

μ \in C^{\infty} (N, R)

, in which case,

λ \otimes_{M} ϕ^{*} e

= ϕ^{*} μ \otimes_{M} ϕ^{*} e

= ϕ^{*} (μ \otimes_{N} e)

. Then it follows that

\begin{matrix} \nabla^{G} (λ \otimes_{M} ϕ^{*} e) & = & \nabla^{G} ϕ^{*} (μ \otimes_{N} e) \\ & = & ϕ^{*} \nabla^{E} (μ \otimes_{N} e) \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ \\ & = & ϕ^{*} (e \otimes_{N} \nabla^{N \to R} μ + μ \otimes_{N} \nabla^{E} e) \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ \\ & = & ϕ^{*} (e \otimes_{N} \nabla^{N \to R} μ) \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ + ϕ^{*} (μ \otimes_{N} \nabla^{E} e) \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ \\ & = & ϕ^{*} e \otimes_{M} (ϕ^{*} \nabla^{N \to R} μ \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ) + ϕ^{*} μ \otimes_{M} (ϕ^{*} \nabla^{E} e \cdot_{ϕ^{*} T N} \nabla^{_{○}}^{M \to N} ϕ) \\ & = & ϕ^{*} e \otimes_{M} \nabla^{M \to R} ϕ^{*} μ + ϕ^{*} μ \otimes_{M} \nabla^{G} ϕ^{*} e \\ & = & ϕ^{*} e \otimes_{M} \nabla^{M \to R} λ + λ \otimes_{M} \nabla^{G} ϕ^{*} e, \end{matrix}

which is exactly the required product rule. By (3), there exists a unique covariant derivative

\nabla^{ϕ^{*} E}

on

ϕ^{*} E

whose restriction to G is

\nabla^{G}

. □

The full notation

\nabla^{ϕ^{*} E}

is often cumbersome, so it may be denoted by

\nabla^{ϕ}

when the pulled-back bundle is clear from context.
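Combining the product rule with the chain rule of Proposition 10 gives a local expression for the pullback covariant derivative. The sketch below assumes a local frame (e_a) for E on an open set V ⊆ N, connection 1-forms \omega^{b}_{a} \in \Gamma_V(T^* N) defined by \nabla^{E} e_a = e_b \otimes_N \omega^{b}_{a}, and a section \sigma = \sigma^{a} \phi^{*} e_a of \phi^{*} E over \phi^{-1}(V); the symbols ω and \overset{\circ}{\nabla} (the tangent map operator) are notational assumptions made for this illustration.

```latex
\nabla^{\phi^{*} E} (\sigma^{a} \phi^{*} e_{a})
  = \phi^{*} e_{a} \otimes_{M} d \sigma^{a}
  + \sigma^{a} \, \phi^{*} e_{b} \otimes_{M}
    \left( \phi^{*} \omega^{b}_{a} \cdot_{\phi^{*} T N} \overset{\circ}{\nabla} \phi \right) .
```

The contracted factor in parentheses is the ordinary pullback of the 1-form \omega^{b}_{a} by ϕ, so it vanishes wherever the tangent map of ϕ vanishes, while the first term does not.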

Remark 5.

A pullback covariant derivative has an important feature in the case that the pullback map is not an immersion: the pullback covariant derivative may be nonzero even where the pullback map is singular. This fact can be obscured by a certain abuse of notation which often appears in the expression of the geodesic equations in differential geometry (see (4)). An example will illustrate this point.

Let

\nabla^{T M}

be a covariant derivative on

π_{M}^{T M} : T M \to M

. Let

Θ : R \to T M

be a unit-length vector field which describes the location of a person (the basepoint) and direction s/he is looking (the fiber portion) with respect to time (let

R

have standard coordinate t). Define

θ : R \to M

by

θ : = π_{M}^{T M} \circ Θ

, so that θ is the base map of Θ, i.e., θ has discarded the direction information and only encodes the location information. Say that for some closed interval

I \subseteq R

,

\frac{d θ}{d t} ∣_{I}

is identically zero (and so θ is not an immersion on I), but that

\frac{d Θ}{d t} ∣_{I}

is nonvanishing; see Figure 1. Mathematically, this means that during this time, Θ is varying only within a single fiber of

T M

. Physically, this means that during this time, the person is standing still but the direction s/he is looking is changing. Passing to a higher tangent space is often undesirable (note that

\frac{d Θ}{d t}

takes values in

T T M

), so to avoid this, a covariant derivative is used. In order to be meaningful, the covariant derivative must capture this fiber-only variation.

Because Θ is a vector field along θ, it can be written as

Θ \in Γ (θ^{*} T M)

, and the covariant derivative on

T M

induces a pullback covariant derivative on

θ^{*} T M

, which has base space

R

. In other words,

θ^{*} T M

is parameterized by time. Then

\nabla_{\frac{d}{d t}}^{θ^{*} T M} Θ \in Γ (θ^{*} T M)

is the desired covariant derivative of Θ with respect to time. A coordinate-based calculation will be made to make completely obvious why this pullback covariant derivative captures the desired information. Let

(x^{i})

be local coordinates on M and, for simplicity, assume that the image of θ lies entirely within this coordinate chart. Because

(\partial_{i})

is a local frame for

T M

,

(θ^{*} \partial_{i})

is a local frame for

θ^{*} T M

, by (1), and

Θ \in Γ (θ^{*} T M)

can be written locally as

Θ (t) = Θ^{i} (t) (θ^{*} \partial_{i}) (t)

for some functions

(Θ^{i} : R \to R)

. Then

\begin{matrix} \nabla_{\frac{d}{d t}}^{θ^{*} T M} Θ & = & \nabla_{\frac{d}{d t}}^{θ^{*} T M} (Θ^{i} θ^{*} \partial_{i}) \\ & = & (\nabla_{\frac{d}{d t}} Θ^{i}) θ^{*} \partial_{i} + Θ^{i} \nabla_{\frac{d}{d t}}^{θ^{*} T M} θ^{*} \partial_{i} \\ & = & \frac{d Θ^{i}}{d t} θ^{*} \partial_{i} + Θ^{i} θ^{*} \nabla^{T M} \partial_{i} \cdot_{θ^{*} T M} \nabla^{_{○}}^{R \to M} θ \cdot_{T R} \frac{d}{d t} \\ & = & \frac{d Θ^{i}}{d t} θ^{*} \partial_{i} + Θ^{i} θ^{*} \nabla^{T M} \partial_{i} \cdot_{θ^{*} T M} \frac{d θ}{d t} . \end{matrix}

Note that

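\nabla^{_{○}}^{R \to M} θ \in Γ (θ^{*} T M \otimes_{R} T^{*} R)

, so that

\frac{d θ}{d t} = \nabla^{_{○}}^{R \to M} θ \cdot_{T R} \frac{d}{d t} \in Γ (θ^{*} T M)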

. Within the interval I,

\frac{d θ}{d t}

vanishes, so the second term vanishes on I. However, because Θ is varying in a fiber-only direction within I, the basepoint is not changing and

\frac{d Θ^{i}}{d t} θ^{*} \partial_{i}

can be identified with an elementary vector space derivative (the fiber is a vector space and so an elementary derivative is well-defined there). This fiber-direction derivative is nonvanishing by assumption, so

\nabla_{\frac{d}{d t}}^{θ^{*} T M} Θ

is nonvanishing on I as desired.
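The coordinate formula above can be exercised numerically. The sketch below assumes, purely for illustration, that M is the punctured plane in polar coordinates (r, φ) with the Levi-Civita derivative of the flat metric, whose nonzero Christoffel symbols are Γ^r_{φφ} = -r and Γ^φ_{rφ} = Γ^φ_{φr} = 1/r. The names `christoffel`, `ddt`, `pullback_cov_deriv`, `standing`, `gaze`, `circle`, and `parallel` are hypothetical. The first case is the person standing still while turning; the second is a constant Cartesian vector carried around the unit circle, whose covariant derivative should vanish.

```python
import math

def christoffel(x):
    """G[i][j][k] = Gamma^i_{jk} for the flat metric in polar coordinates x = (r, phi)."""
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r        # Gamma^r_{phi phi}
    G[1][0][1] = 1.0 / r   # Gamma^phi_{r phi}
    G[1][1][0] = 1.0 / r   # Gamma^phi_{phi r}
    return G

def ddt(f, t, h=1e-6):
    """Central-difference derivative of a list-valued function of t."""
    return [(a - b) / (2 * h) for a, b in zip(f(t + h), f(t - h))]

def pullback_cov_deriv(theta, Theta, t):
    """Components of the covariant derivative of Theta along theta at time t:
    dTheta^i/dt + Gamma^i_{jk}(theta(t)) Theta^j dtheta^k/dt."""
    dtheta, dTheta = ddt(theta, t), ddt(Theta, t)
    G, comps = christoffel(theta(t)), Theta(t)
    n = len(comps)
    return [dTheta[i] + sum(G[i][j][k] * comps[j] * dtheta[k]
                            for j in range(n) for k in range(n))
            for i in range(n)]

# Case 1: the base curve is constant, while the unit vector rotates in one fiber.
standing = lambda t: [2.0, 0.5]
gaze = lambda t: [math.cos(t), math.sin(t) / 2.0]

# Case 2: the base curve is the unit circle, carrying the constant Cartesian
# vector e_x, written in the polar coordinate frame.
circle = lambda t: [1.0, t]
parallel = lambda t: [math.cos(t), -math.sin(t)]

print(pullback_cov_deriv(standing, gaze, 0.3))
print(pullback_cov_deriv(circle, parallel, 0.3))
```

In the first case the Christoffel term contributes nothing because the base velocity is zero, yet the result is nonzero, matching the discussion above. In the second case both the components and the basepoint change, and the two contributions cancel up to finite-difference error.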

Introducing a bit of natural notation which will be helpful for the next result, if

X \in Γ (E)

and

Y \in Γ (F)

, then define

X \oplus Y \equiv X \oplus_{M \times N} Y \in Γ (E \oplus_{M \times N} F)

and

X \otimes Y \equiv X \otimes_{M \times N} Y \in Γ (E \otimes_{M \times N} F)

by

(X \oplus_{M \times N} Y) (p, q) : = X (p) \oplus Y (q) and (X \otimes_{M \times N} Y) (p, q) : = X (p) \otimes Y (q)

for each

(p, q) \in M \times N

.

Proposition 11

(Induced covariant derivatives on

E \oplus_{M \times N} F

and

E \otimes_{M \times N} F

). If

\nabla^{E}

and

\nabla^{F}

are covariant derivatives on E and F respectively, then there are unique covariant derivatives

\nabla^{E \oplus_{M \times N} F} : Γ (E \oplus_{M \times N} F) \to Γ ((E \oplus_{M \times N} F) \otimes_{M \times N} (T^{*} M \oplus_{M \times N} T^{*} N))

and

\nabla^{E \otimes_{M \times N} F} : Γ (E \otimes_{M \times N} F) \to Γ ((E \otimes_{M \times N} F) \otimes_{M \times N} (T^{*} M \oplus_{M \times N} T^{*} N))

on

E \oplus F

and

E \otimes F

respectively, satisfying the sum rule

\nabla_{u \oplus v}^{E \oplus F} (X \oplus Y) = \nabla_{u}^{E} X \oplus \nabla_{v}^{F} Y

and the product rule

\nabla_{u \oplus v}^{E \otimes F} (X \otimes Y) = \nabla_{u}^{E} X \otimes Y + X \otimes \nabla_{v}^{F} Y,

respectively, where

X \in Γ (E)

,

Y \in Γ (F)

, and

u \oplus v \in T M \oplus T N

. Here,

T M \oplus T N \to M \times N

(and its dual) is used instead of the isomorphic vector bundle

T (M \times N) \to M \times N

(and its dual).

Proof.

Suppressing the pedantic use of the

M \times N

subscript to avoid unnecessary notational overload, the set

G : = \{e \oplus f ∣ e \in Γ (E), f \in Γ (F)\}

is a locally finitely generating subset of

Γ (E \oplus F)

, since local frames for

E \oplus F

take the form

\{e_{i} \oplus 0, 0 \oplus f_{j}\}

, where

\{e_{i}\}

and

\{f_{j}\}

are local frames for E and F respectively. Define

\begin{matrix} \nabla^{G} : G & \to & Γ ((E \oplus F) \otimes (T^{*} M \oplus T^{*} N)), \\ X \oplus Y & \mapsto & (u \oplus v \mapsto \nabla_{u}^{E} X \oplus \nabla_{v}^{F} Y), where u \oplus v \in T M \oplus T N . \end{matrix}

This map is well-defined and

R

-linear by construction, since the connections

\nabla^{E}

and

\nabla^{F}

are well-defined and

R

-linear. If

λ \in C^{\infty} (M \times N, R)

,

X \in Γ (E)

, and

Y \in Γ (F)

, then the product

λ \otimes (X \oplus Y)

is in G (i.e., has the form

\bar{X} \oplus \bar{Y}

for some

\bar{X} \in Γ (E)

and

\bar{Y} \in Γ (F)

) if and only if

λ

is constant. Thus the product rule (restricted to elements of G) reduces to

R

-linearity, which is already satisfied. By (3), there exists a unique connection

\nabla^{E \oplus F}

on

E \oplus F

whose restriction to G is

\nabla^{G}

.

Similarly, the set

H : = \{e \otimes f ∣ e \in Γ (E), f \in Γ (F)\}

is a locally finitely generating subset of

Γ (E \otimes F)

, since local frames for

E \otimes F

take the form

\{e_{i} \otimes f_{j}\}

, where

\{e_{i}\}

and

\{f_{j}\}

are local frames for E and F respectively. Define

\begin{matrix} \nabla^{H} : H & \to & Γ ((E \otimes F) \otimes (T^{*} M \oplus T^{*} N)), \\ X \otimes Y & \mapsto & (u \oplus v \mapsto \nabla_{u}^{E} X \otimes Y + X \otimes \nabla_{v}^{F} Y), where u \oplus v \in T M \oplus T N . \end{matrix}

This map is well-defined and

R

-linear by construction, since the connections

\nabla^{E}

and

\nabla^{F}

are well-defined and

R

-linear. For the product rule, with

λ \in C^{\infty} (M \times N, R)

,

X \in Γ (E)

, and

Y \in Γ (F)

, the product

λ \otimes (X \otimes Y)

is in H if and only if there exist

μ \in C^{\infty} (M, R)

and

ν \in C^{\infty} (N, R)

such that

λ = μ \otimes ν \in (R ⥇ M) \otimes (R ⥇ N)

(noting that then

λ \otimes_{M \times N} (X \otimes Y)

= (μ \otimes ν) \otimes_{M \times N} (X \otimes Y)

= (μ \otimes_{M} X) \otimes (ν \otimes_{N} Y)

). In this case, with

u \oplus v \in T M \oplus T N

,

\begin{matrix} \nabla_{u \oplus v}^{H} (λ \otimes_{M \times N} (X \otimes Y)) \\ = & \nabla_{u \oplus v}^{H} ((μ \otimes ν) \otimes_{M \times N} (X \otimes Y)) \\ = & \nabla_{u \oplus v}^{H} ((μ \otimes_{M} X) \otimes (ν \otimes_{N} Y)) \\ = & \nabla_{u}^{E} (μ \otimes_{M} X) \otimes (ν \otimes_{N} Y) + (μ \otimes_{M} X) \otimes \nabla_{v}^{F} (ν \otimes_{N} Y) \\ = & (\nabla_{u}^{M \to R} μ \otimes_{M} X) \otimes (ν \otimes_{N} Y) + (μ \otimes_{M} \nabla_{u}^{E} X) \otimes (ν \otimes_{N} Y) \\ + (μ \otimes_{M} X) \otimes (\nabla_{v}^{N \to R} ν \otimes_{N} Y) + (μ \otimes_{M} X) \otimes (ν \otimes_{N} \nabla_{v}^{F} Y) \\ = & (\nabla_{u}^{M \to R} μ \otimes ν + μ \otimes \nabla_{v}^{N \to R} ν) \otimes_{M \times N} (X \otimes Y) + λ \otimes_{M \times N} (\nabla_{u}^{E} X \otimes Y + X \otimes \nabla_{v}^{F} Y) \\ = & \nabla_{u \oplus v}^{M \times N \to R} λ \otimes_{M \times N} (X \otimes Y) + λ \otimes_{M \times N} \nabla_{u \oplus v}^{H} (X \otimes Y), \end{matrix}

which is exactly the required product rule. By (3), there exists a unique connection

\nabla^{E \otimes F}

on

E \otimes F

whose restriction to H is

\nabla^{H}

. □

Remark 6

(Naturality of the covariant derivatives on

E \oplus_{M \times N} F

and

E \otimes_{M \times N} F

). Letting

{pr}_{i} : = {pr}_{i}^{M \times N}

(

i \in \{1, 2\}

) for brevity, the maps

\begin{matrix} ξ : E \oplus_{M \times N} F & \to & {pr}_{1}^{*} E \oplus_{M \times N} {pr}_{2}^{*} F, \\ e \oplus f & \mapsto & ((π^{E} \oplus π^{F}) (e \oplus f), e) \oplus_{M \times N} ((π^{E} \oplus π^{F}) (e \oplus f), f) \end{matrix}

and

\begin{matrix} ψ : E \otimes_{M \times N} F & \to & {pr}_{1}^{*} E \otimes_{M \times N} {pr}_{2}^{*} F, \\ e \otimes f & \mapsto & ((π^{E} \otimes π^{F}) (e \otimes f), e) \otimes_{M \times N} ((π^{E} \otimes π^{F}) (e \otimes f), f), \end{matrix}

each extended linearly to the rest of their domains, are easily shown to be smooth vector bundle isomorphisms over

{Id}_{M \times N}

. Then

\nabla_{z}^{E \oplus F} (X \oplus Y) = ξ^{- 1} (\nabla_{z}^{{pr}_{1}^{*} E \oplus_{M \times N} {pr}_{2}^{*} F} ξ (X \oplus Y))

and

\nabla_{z}^{E \otimes F} (X \otimes Y) = ψ^{- 1} (\nabla_{z}^{{pr}_{1}^{*} E \otimes_{M \times N} {pr}_{2}^{*} F} ψ (X \otimes Y))

for all

X \in Γ (E)

,

Y \in Γ (F)

, and

z \in T (M \times N)

, showing that the connections on

E \oplus F

and

E \otimes F

are ξ- and ψ-related to the naturally induced connections on

{pr}_{1}^{*} E \oplus {pr}_{2}^{*} F

and

{pr}_{1}^{*} E \otimes_{M \times N} {pr}_{2}^{*} F

respectively, and are therefore in this sense natural. The sum

X \oplus Y \in Γ (E \oplus F)

and product

X \otimes Y \in Γ (E \otimes F)

correspond to

{pr}_{1}^{*} X \oplus_{M \times N} {pr}_{2}^{*} Y

and

{pr}_{1}^{*} X \otimes_{M \times N} {pr}_{2}^{*} Y \in Γ ({pr}_{1}^{*} E \otimes_{M \times N} {pr}_{2}^{*} F)

under ξ and ψ respectively.

Many important tensor constructions involve permutations. An extremely useful property of these permutations is that they commute with the covariant derivatives induced by the covariant derivatives on the tensor bundle factors, making them natural operators in the setting of covariant tensor calculus.

Proposition 12

(Transposition tensor fields are parallel). Let

E_{1}, E_{2}, E_{3}, E_{4}

be smooth vector bundles over M having covariant derivatives

\nabla^{E_{1}}, \nabla^{E_{2}}, \nabla^{E_{3}}, \nabla^{E_{4}}

respectively, let

A : = E_{1} \otimes_{M} E_{2} \otimes_{M} E_{3} \otimes_{M} E_{4}

and

B : = E_{1} \otimes_{M} E_{3} \otimes_{M} E_{2} \otimes_{M} E_{4}

, and let

\nabla^{A}

and

\nabla^{B}

denote the induced covariant derivatives.

If

(2 3) \in Γ (A^{*} \otimes_{M} B)

denotes the tensor field which maps

e_{1} \otimes_{M} e_{2} \otimes_{M} e_{3} \otimes_{M} e_{4} \in A

to

e_{1} \otimes_{M} e_{3} \otimes_{M} e_{2} \otimes_{M} e_{4} \in B

(i.e.,

(2 3)

transposes the second and third factors), then

(2 3)

is a parallel tensor field with respect to the covariant derivative induced on the vector bundle

A^{*} \otimes_{M} B \to M

, i.e.,

\nabla^{A^{*} \otimes_{M} B} (2 3) = 0

.

Proof.

Let

X \in Γ (T M)

. Then

\begin{matrix} (e_{1} \otimes_{M} e_{2} \otimes_{M} e_{3} \otimes_{M} e_{4}) \cdot_{A^{*}} \nabla_{X}^{A^{*} \otimes_{M} B} (2 3) \\ = & \nabla_{X}^{B} ((e_{1} \otimes_{M} e_{2} \otimes_{M} e_{3} \otimes_{M} e_{4}) \cdot_{A^{*}} (2 3)) - \nabla_{X}^{A} (e_{1} \otimes_{M} e_{2} \otimes_{M} e_{3} \otimes_{M} e_{4}) \cdot_{A^{*}} (2 3) \\ = & \nabla_{X}^{B} (e_{1} \otimes_{M} e_{3} \otimes_{M} e_{2} \otimes_{M} e_{4}) \\ - \nabla_{X}^{E_{1}} e_{1} \otimes_{M} e_{3} \otimes_{M} e_{2} \otimes_{M} e_{4} \\ - e_{1} \otimes_{M} \nabla_{X}^{E_{3}} e_{3} \otimes_{M} e_{2} \otimes_{M} e_{4} \\ - e_{1} \otimes_{M} e_{3} \otimes_{M} \nabla_{X}^{E_{2}} e_{2} \otimes_{M} e_{4} \\ - e_{1} \otimes_{M} e_{3} \otimes_{M} e_{2} \otimes_{M} \nabla_{X}^{E_{4}} e_{4} \\ = & \nabla_{X}^{B} (e_{1} \otimes_{M} e_{3} \otimes_{M} e_{2} \otimes_{M} e_{4}) - \nabla_{X}^{B} (e_{1} \otimes_{M} e_{3} \otimes_{M} e_{2} \otimes_{M} e_{4}) \\ = & 0 . \end{matrix}

Because X is arbitrary, this shows that

(e_{1} \otimes_{M} e_{2} \otimes_{M} e_{3} \otimes_{M} e_{4}) \cdot_{A^{*}} \nabla^{A^{*} \otimes_{M} B} (2 3) = 0

. This extends linearly to general tensors, so

\nabla^{A^{*} \otimes_{M} B} (2 3) = 0

, as desired. □

The fact that all transposition tensor fields are parallel implies that all permutation tensor fields are parallel, since every permutation is a product of transpositions. This gives as an easy corollary that a covariant derivative operation commutes with a permutation operation, which has quite a succinct statement using the permutation superscript notation.

Corollary 2

(Permutation tensor fields are parallel). Let

E_{1}, \dots, E_{k}

be smooth vector bundles over M each having a covariant derivative, and let

A : = E_{1} \otimes_{M} \dots \otimes_{M} E_{k}

and

B : = E_{σ^{- 1} (1)} \otimes_{M} \dots \otimes_{M} E_{σ^{- 1} (k)}

. If

σ \in S_{k}

is interpreted as the tensor field in

Γ (A^{*} \otimes_{M} B)

which maps

e_{1} \otimes_{M} \dots \otimes_{M} e_{k}

to

e_{σ^{- 1} (1)} \otimes_{M} \dots \otimes_{M} e_{σ^{- 1} (k)}

, then σ is a parallel tensor field. Stated using the superscript notation, with

X \in Γ (T M)

and

a \in Γ (A)

,

\nabla_{X}^{B} a^{σ} = {(\nabla_{X}^{A} a)}^{σ} .

Proof.

This follows from the fact that

σ

can be written as the product of transpositions;

\nabla_{X} σ = 0

because of the product rule and because each transposition is parallel. The claim regarding commutation with the superscript permutation follows easily from its definition.

\nabla_{X}^{B} a^{σ} = \nabla_{X}^{B} (a \cdot_{A^{*}} σ) = a \cdot_{A^{*}} \nabla_{X}^{A^{*} \otimes_{M} B} σ + \nabla_{X}^{A} a \cdot_{A^{*}} σ = {(\nabla_{X}^{A} a)}^{σ},

using the fact that

\nabla_{X}^{A^{*} \otimes_{M} B} σ = 0

, since

σ

is a parallel tensor field. □
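
As a minimal worked instance of the corollary, assume k = 2 and σ = (1 2), with a = e_1 ⊗_M e_2 a simple tensor field; these choices are illustrative and are not taken from the text above. Under these assumptions A = E_1 ⊗_M E_2 and B = E_2 ⊗_M E_1, and the product rule for the induced covariant derivatives gives the commutation directly:

```latex
% Illustrative special case (an assumption for this sketch):
% k = 2, \sigma = (1\,2), a = e_1 \otimes_M e_2, A = E_1 \otimes_M E_2, B = E_2 \otimes_M E_1.
\begin{aligned}
\nabla_X^{B} a^{(1\,2)}
  &= \nabla_X^{B} (e_2 \otimes_M e_1) \\
  &= \nabla_X^{E_2} e_2 \otimes_M e_1 + e_2 \otimes_M \nabla_X^{E_1} e_1 \\
  &= \bigl(\nabla_X^{E_1} e_1 \otimes_M e_2 + e_1 \otimes_M \nabla_X^{E_2} e_2\bigr)^{(1\,2)} \\
  &= \bigl(\nabla_X^{A} a\bigr)^{(1\,2)} .
\end{aligned}
```

The general statement follows the same pattern, with each transposition moving the differentiated factor along with the undifferentiated ones.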

Finally, any smooth vector bundle

E \to M

has a canonical identity tensor field

I_{E} \in Γ (E \otimes_{M} E^{*})

acting as the identity on

Γ (E)

, i.e.,

I_{E} \cdot_{E} σ = σ

for all

σ \in Γ (E)

. Given a local frame

e_{1}, \dots, e_{r} \in Γ_{U} (E)

over open set

U \subseteq M

, it has the expression

I_{E} = e_{i} \otimes e^{i}

. The identity tensor field is an invaluable tool in forming tensor field expressions and in phrasing other naturality conditions regarding covariant derivatives.

Proposition 13

(Identity tensor field is parallel). Let

E \to M

be a smooth vector bundle with a linear covariant derivative

\nabla^{E}

. Then

I_{E}

is parallel with respect to

\nabla^{E \otimes_{M} E^{*}}

, i.e.,

\nabla^{E \otimes_{M} E^{*}} I_{E} = 0

.

Proof.

Let

σ \in Γ (E)

. Then by definition,

0 = I_{E} \cdot_{E} σ - σ

. Taking the covariant derivative of both sides with respect to

X \in Γ (T M)

,

\begin{matrix} 0 & = & \nabla_{X}^{E} (I_{E} \cdot_{E} σ - σ) \\ = & \nabla_{X}^{E} (I_{E} \cdot_{E} σ) - \nabla_{X}^{E} σ \\ = & \nabla_{X}^{E \otimes_{M} E^{*}} I_{E} \cdot_{E} σ + I_{E} \cdot_{E} \nabla_{X}^{E} σ - \nabla_{X}^{E} σ \\ = & \nabla_{X}^{E \otimes_{M} E^{*}} I_{E} \cdot_{E} σ + \nabla_{X} σ - \nabla_{X} σ \\ = & \nabla_{X}^{E \otimes_{M} E^{*}} I_{E} \cdot_{E} σ \\ = & (\nabla^{E \otimes_{M} E^{*}} I_{E} \cdot_{T M} X) \cdot σ . \end{matrix}

Because

σ

is arbitrary, this implies that

\nabla^{E \otimes_{M} E^{*}} I_{E} \cdot_{T M} X = 0

. Because X is arbitrary, this implies that

\nabla^{E \otimes_{M} E^{*}} I_{E} = 0

. □
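
The same conclusion can be checked in a local frame. The sketch below assumes a local frame e_1, …, e_r of E with dual coframe e^1, …, e^r, and introduces (as notation for this example only) connection one-forms ω^j_i defined by ∇_X e_i = ω^j_i(X) e_j, so that the induced derivative on E^* satisfies ∇_X e^i = −ω^i_j(X) e^j:

```latex
% Assumed notation for this sketch: \nabla_X^{E} e_i = \omega^j{}_i(X)\, e_j,
% hence \nabla_X^{E^*} e^i = -\omega^i{}_j(X)\, e^j on the dual coframe.
\begin{aligned}
\nabla_X^{E \otimes_M E^*} I_E
  &= \nabla_X^{E} e_i \otimes e^i + e_i \otimes \nabla_X^{E^*} e^i \\
  &= \omega^j{}_i(X)\, e_j \otimes e^i - \omega^i{}_j(X)\, e_i \otimes e^j \\
  &= 0 \qquad \text{(relabel } i \leftrightarrow j \text{ in the second term).}
\end{aligned}
```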

3.9. Decomposition of $π_{E}^{T E} : T E \to E$

In using the calculus of variations on a manifold M where the Lagrangian is a function of

T M

(this form of Lagrangian is ubiquitous in mechanics), taking the first variation involves passing to

T T M

. Without a way to decompose variations into more tractable components, the standard integration-by-parts trick ([15], p. 16) cannot be applied. The notion of a local trivialization of

T T M

via choice of coordinates on M is one way to provide such a decomposition. A coordinate chart

(U, ϕ : U \to R^{n})

on M establishes a locally trivializing diffeomorphism

T T U ≅ ϕ (U) \times R^{n} \times R^{n} \times R^{n}

. However, such a trivialization imposes an artificial additive structure on

T T U

depending on the [non-canonical] choice of coordinates, only gives a local formulation of the relevant objects, and the ensuing coordinate calculations do not give clear insight into the geometric structure of the problem. The notion of the linear connection remedies this ([16]).

A linear connection on the vector bundle

π : E \to M

is a subbundle

H \to E

of

π_{E}^{T E} : T E \to E

such that

T E = H \oplus_{E} V E

and

T λ_{a} \cdot H_{x} = H_{a x}

for all

a \in R \setminus \{0\}

and

x \in E

, where

λ_{a} : E \to E, e \mapsto a e

is the scalar multiplication action of a on E ([6], p. 512). The bundle

H \to E

may also be called a horizontal space of the vector bundle

π_{E}^{T E} : T E \to E

(“a” is used instead of “the” because a choice of

H \to E

is generally non-unique). For convenience, define

h : = \nabla^{_{○}} π \in Γ (π^{*} T M \otimes_{E} T^{*} E),

noting then that

V E = ker h

.

A linear connection can equivalently be specified by what is known as a connection map; essentially a projection onto the vertical bundle. This is a slightly more active formulation than just the specification of a horizontal space, as a covariant derivative can be defined directly in terms of the connection map—see ([6], p. 518), ([17], p. 128), ([18], p. 173), and ([7], p. 208).

Proposition 14

(Connection map formulation of a linear connection). If

v \in Γ (π^{*} E \otimes_{E} T^{*} E)

(i.e.,

v : T E \to E

is a smooth vector bundle morphism over π) is a left-inverse for

ι_{V E}^{π^{*} E} \in Γ (V E \otimes_{E} π^{*} E^{*})

that is equivariant with respect to

T λ_{a}

and

λ_{a}

(i.e.,

v \cdot T λ_{a} = π^{*} λ_{a} \cdot v

) ([7], p. 245), then

H : = ker v \leq T E

defines a linear connection on the vector bundle

π : E \to M

. Such a map v is called the connection map associated to H. Conversely, given a linear connection H, there is exactly one connection map defining H in the stated sense.

Proof.

That v is a left-inverse for

ι_{V E}^{π^{*} E}

implies that v has full rank, so

H : = ker v

defines a subbundle of

π_{E}^{T E} : T E \to E

having the same rank as

T M

. Because v is smooth, H is a smooth subbundle. Furthermore, the condition implies that

V_{e} E \cap H_{e} = \{0\}

for each

e \in E

, and therefore

T E = H \oplus_{E} V E

by a rank-counting argument.

If

x \in T E

and

a \in R \setminus \{0\}

, then

v \cdot T λ_{a} \cdot x = π^{*} λ_{a} \cdot v \cdot x

, which equals zero if and only if

v \cdot x = 0

, i.e., if and only if

x \in H

. Thus

T λ_{a} \cdot H = H

. This establishes

H \to E

as a linear connection.

Conversely, if H is a linear connection and

v_{1}

and

v_{2}

are connection maps for H, then

v_{1} \cdot_{T E} ι_{V E}^{π^{*} E} = {Id}_{π^{*} E} = v_{2} \cdot_{T E} ι_{V E}^{π^{*} E}

. Then because the image of

ι_{V E}^{π^{*} E}

is all of

V E

, it follows that

v_{1} ∣_{V E} = v_{2} ∣_{V E}

. Since

v_{1} ∣_{H} = 0 = v_{2} ∣_{H}

by definition, and since

T E = H \oplus_{E} V E

, this shows that

v_{1} = v_{2}

. Uniqueness of connection maps has been established. To show existence, define

v : = ι_{π^{*} E}^{V E} \cdot_{V E} {pr}_{V E} \in Γ (π^{*} E \otimes_{E} T^{*} E)

, where

{pr}_{V E} : H \oplus_{E} V E \to V E

is the canonical projection, recalling that

H \oplus_{E} V E = T E

. It is easily shown that v is a connection map for H. □

Proposition 15

(Decomposing

π_{E}^{T E} : T E \to E

). If

v \in Γ (π^{*} E \otimes_{E} T^{*} E)

is a connection map, then

\begin{matrix} h \oplus_{E} v : T E & \to & π^{*} T M \oplus_{E} π^{*} E \end{matrix}

(2)

is a smooth vector bundle isomorphism over

{Id}_{E}

. See Figure 2.

Proof.

Because

T E = H \oplus_{E} V E

, and

H = ker v

and

V E = ker h

, the fiber-wise restriction

\begin{matrix} h \oplus_{E} v ∣_{T_{e} E} : T_{e} E & \to & {(π^{*} T M \oplus_{E} π^{*} E)}_{e} ≅ T_{π (e)} M \oplus E_{π (e)} \end{matrix}

is a linear isomorphism for each

e \in E

. The map is a smooth vector bundle morphism over

{Id}_{E}

by construction. It is therefore a smooth vector bundle isomorphism over

{Id}_{E}

. □

Remark 7

(Linear connection/covariant derivative correspondence). Given a covariant derivative

\nabla^{E}

on a smooth vector bundle

π : E \to M

, there is a naturally induced linear connection, defined via the connection map

\begin{matrix} v : T E & \to & E, \\ δ_{ϵ} Θ & \mapsto & \nabla_{δ_{ϵ}}^{{(π \circ Θ)}^{*} E} Θ, \end{matrix}

(3)

where

Θ : I \to E

is a variation of

θ \in E

. Here,

\nabla^{{(π \circ Θ)}^{*} E}

denotes the pullback of the covariant derivative

\nabla^{E}

through the map

π \circ Θ

(see (10)). Conceptually, all v does is replace an ordinary derivative (

δ_{ϵ}

) with the corresponding covariant one (

\nabla_{δ_{ϵ}}^{{(π \circ Θ)}^{*} E}

).

Conversely, given a connection map

v \in Γ (π^{*} E \otimes_{E} T^{*} E)

for a linear connection

H \to E

, there is a naturally induced covariant derivative

\nabla^{E}

on the smooth vector bundle

π : E \to M

, defined by

\begin{matrix} \nabla^{E} : Γ (E) & \to & Γ (E \otimes_{M} T^{*} M), \\ σ & \mapsto & σ^{*} v \cdot_{σ^{*} T E} \nabla^{_{○}}^{M \to E} σ . \end{matrix}

The scaling equivariance of v is critical for showing that this map actually defines a covariant derivative. Full type safety should be observed here; by the contravariance of the pullback of bundles (see (8)),

σ^{*} π^{*} E ≅ {(π \circ σ)}^{*} E = {Id}_{M}^{*} E ≅ E

, so

σ^{*} v \in Γ (σ^{*} (π^{*} E \otimes_{E} T^{*} E)) ≅ Γ (σ^{*} π^{*} E \otimes_{M} σ^{*} T^{*} E) ≅ Γ (E \otimes_{M} σ^{*} T^{*} E),

and therefore

σ^{*} v \cdot \nabla^{_{○}} σ \in Γ (E \otimes_{M} T^{*} M)

as desired. This connection map construction of a covariant derivative gives (10) as an immediate consequence via the chain rule for the tangent map.
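
For orientation, the connection map (3) can be written out in local coordinates. The following is a sketch under assumed local data that does not appear in the text: coordinates x^i on an open set U ⊆ M, a local frame ε_a of E over U with fiber coordinates y^a, and connection coefficients Γ^a_{ib} defined by ∇_{∂_i} ε_b = Γ^a_{ib} ε_a.

```latex
% Assumed local data (hypothetical names): x^i, \varepsilon_a, y^a, and
% \nabla^{E}_{\partial_i} \varepsilon_b = \Gamma^a_{ib}\, \varepsilon_a.
\begin{aligned}
\Theta(\epsilon) &= y^a(\epsilon)\, \varepsilon_a(x(\epsilon)), \qquad
\delta_\epsilon \Theta = \dot{x}^i \frac{\partial}{\partial x^i} + \dot{y}^a \frac{\partial}{\partial y^a}, \\
h \cdot \delta_\epsilon \Theta &= \dot{x}^i\, \frac{\partial}{\partial x^i}, \\
v \cdot \delta_\epsilon \Theta &= \bigl(\dot{y}^a + \Gamma^a_{ib}(x)\, \dot{x}^i\, y^b\bigr)\, \varepsilon_a .
\end{aligned}
```

In these coordinates the horizontal space H = ker v is cut out by ẏ^a = −Γ^a_{ib} ẋ^i y^b, and scaling (y, ẏ) by a nonzero constant a scales v by a, which is the equivariance required of a connection map.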

The following construction is an abstraction of taking partial derivatives of a function, inspired by ([11], p. 277). Instead of taking partial derivatives with respect to individual coordinates, partial covariant derivatives along distributions over the base manifold are formed, where the distributions (subbundles) decompose the base manifold’s tangent bundle into a direct sum. Such a construction conveniently captures the geometry of a map with respect to the geometry of its domain.

Proposition 16

(Partial covariant derivatives). Let

L \in C^{\infty} (M, R)

, and for each

i \in \{1, \dots, n\}

let

F_{i} \to M

be a smooth vector bundle. If, for each

i \in \{1, \dots, n\}

,

c_{i} \in Γ (F_{i} \otimes_{M} T^{*} M)

such that

c_{1} \oplus_{M} \dots \oplus_{M} c_{n} \in Γ ((F_{1} \oplus_{M} \dots \oplus_{M} F_{n}) \otimes_{M} T^{*} M)

is a smooth vector bundle isomorphism over

{Id}_{M}

, then there exist unique sections

L_{, c_{i}} \in Γ (F_{i}^{*})

for each

i \in \{1, \dots, n\}

such that

\nabla^{M \to R} L = L_{, c_{1}} \cdot_{F_{1}} c_{1} + \dots + L_{, c_{n}} \cdot_{F_{n}} c_{n} .

This decomposition of

\nabla L

provides what will be called partial covariant derivatives of L (with respect to the given decomposition).

Proof.

The following equivalences provide a formula for directly defining

L_{, c_{1}}, \dots, L_{, c_{n}}

.

\begin{matrix} \nabla L & = L_{, c_{1}} \cdot_{F_{1}} c_{1} + \dots + L_{, c_{n}} \cdot_{F_{n}} c_{n} \\ \Leftrightarrow \nabla L & = (L_{, c_{1}} \oplus_{M} \dots \oplus_{M} L_{, c_{n}}) \cdot_{F_{1} \oplus_{M} \dots \oplus_{M} F_{n}} (c_{1} \oplus_{M} \dots \oplus_{M} c_{n}) \\ \Leftrightarrow \nabla L \cdot_{T M} {(c_{1} \oplus_{M} \dots \oplus_{M} c_{n})}^{- 1} & = L_{, c_{1}} \oplus_{M} \dots \oplus_{M} L_{, c_{n}} . \end{matrix}

Existence and uniqueness are therefore proven. □

Corollary 3

(Horizontal/vertical derivatives). Let

h : = \nabla^{_{○}} π \in Γ (π^{*} T M \otimes_{E} T^{*} E)

as before. If

v \in Γ (π^{*} E \otimes_{E} T^{*} E)

is a connection map, and if

L : E \to R

is smooth, then there exist unique

L_{, h} \in Γ (π^{*} T^{*} M)

and

L_{, v} \in Γ (π^{*} E^{*})

such that

\nabla L = L_{, h} \cdot_{π^{*} T M} h + L_{, v} \cdot_{π^{*} E} v

.
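
A coordinate sketch may help make the horizontal and vertical derivatives concrete. It assumes local data not present in the text: coordinates x^i on M, fiber coordinates y^a on E with respect to a local frame, and connection coefficients Γ^a_{ib}, so that the components of h and v are the one-forms h^i = dx^i and v^a = dy^a + Γ^a_{ib} y^b dx^i on E.

```latex
% Assumed local expressions (hypothetical names): h^i = dx^i,
% v^a = dy^a + \Gamma^a_{ib}\, y^b\, dx^i.
\begin{aligned}
\nabla L = dL &= \frac{\partial L}{\partial x^i}\, dx^i + \frac{\partial L}{\partial y^a}\, dy^a, \\
(L_{,v})_a &= \frac{\partial L}{\partial y^a}, \qquad
(L_{,h})_i = \frac{\partial L}{\partial x^i} - \Gamma^a_{ib}\, y^b\, \frac{\partial L}{\partial y^a}, \\
(L_{,h})_i\, h^i + (L_{,v})_a\, v^a
  &= \frac{\partial L}{\partial x^i}\, dx^i + \frac{\partial L}{\partial y^a}\, dy^a = dL .
\end{aligned}
```

The vertical derivative is the ordinary fiber derivative, while the horizontal derivative is the base derivative corrected by the connection coefficients.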

It should be noted that the basepoint-preserving issue discussed in Section 3.7 plays a role in choosing to use the tensor field formulation of

h : T E \to T M

and

v : T E \to E

. In particular, without preserving the basepoint (via the

π

-pullback of

T M

and E to form

h \in Γ (π^{*} T M \otimes_{E} T^{*} E)

and

v \in Γ (π^{*} E \otimes_{E} T^{*} E)

), the map

h \oplus_{E} v

would not be a smooth bundle isomorphism, and the horizontal and vertical derivatives would be maps of the form

L_{, h} : E \to T^{*} M

and

L_{, v} : E \to E^{*}

, which, critically, would not be sections of smooth vector bundles and could only claim to be smooth [fiber] bundle morphisms. Derivative trivializations will be central in calculating the first and second variations of an energy functional having Lagrangian L (see (1) and (2)).

3.10. Curvature and Commutation of Derivatives

A ubiquitous consideration in mathematics is to determine when two operations commute. In the setting of tensor calculus, this often manifests itself in determining the commutativity (or lack thereof) of two covariant derivatives. Here, “covariant derivatives” may refer to both linear covariant derivatives and the tangent map operator (see (3)). This unified categorization of derivatives will now be leveraged to show that certain fiber bundles are flat (in a sense analogous to the vanishing of a curvature endomorphism) with respect to particular covariant derivatives. This reduces the work often done showing commutativity of derivatives in the derivation of the first variation of a function in the calculus of variations to the simple statement that a particular tensor field is symmetric, which comes as a corollary to the aforementioned flatness.

In this section, the symbol ∇ may denote

\nabla^{_{○}}

or

\nabla^{|}

, depending on context. This eases the expression of repeated covariant derivatives, such as the covariant Hessian of a section (see below), and is an example of telescoping notation as discussed in Section 3.3.

If

π : E \to M

defines a smooth [fiber] bundle whose space of sections

Γ (E)

has two repeated covariant derivatives defined, if

σ \in Γ (E)

, and if

\nabla^{T M}

is a symmetric linear covariant derivative (meaning

\nabla_{X} Y - \nabla_{Y} X = [X, Y]

for

X, Y \in Γ (T M)

), then the tensor contraction

\nabla^{2} σ : (X \otimes_{M} Y - Y \otimes_{M} X)

is an expression measuring the non-commutativity of the X and Y derivatives of

σ

. The quantity

\nabla^{2} σ

will be called the covariant Hessian of

σ

, because it generalizes the Hessian of elementary calculus; it contains only second-derivative information, and in the special case seen below, it is symmetric in the argument components. It should be noted that if

F \to M

is the vector bundle such that

\nabla σ \in Γ (F \otimes_{M} T^{*} M)

, then

\nabla^{2} σ \in Γ (F \otimes_{M} T^{*} M \otimes_{M} T^{*} M)

. Intentionally leaving the ∇ and · symbols undecorated in preference of contextual interpretation, unwinding the expression above gives

\begin{matrix} \nabla^{2} σ : (X \otimes_{M} Y - Y \otimes_{M} X) & = \nabla_{Y} \nabla σ \cdot X - \nabla_{X} \nabla σ \cdot Y \\ = \nabla_{Y} \nabla_{X} σ - \nabla σ \cdot \nabla_{Y} X - \nabla_{X} \nabla_{Y} σ + \nabla σ \cdot \nabla_{X} Y \\ = - \nabla_{X} \nabla_{Y} σ + \nabla_{Y} \nabla_{X} σ + \nabla σ \cdot [X, Y] \\ = - \nabla_{X} \nabla_{Y} σ + \nabla_{Y} \nabla_{X} σ + \nabla_{[X, Y]} σ, \end{matrix}

which is syntactically identical to the common definition for the [Riemannian] curvature endomorphism

R (X, Y) σ

. In the traditional setting, where

\nabla^{E}

is a linear covariant derivative on vector bundle E, the curvature endomorphism takes the form of a tensor field

R^{E} \in Γ (E \otimes_{M} E^{*} \otimes_{M} T^{*} M \otimes_{M} T^{*} M)

. In this setting however, because

\nabla^{E}

may be nonlinear (for example,

\nabla^{_{○}}^{M \to S}

when M and S are manifolds), such a tensorial formulation does not generally exist. Instead,

R^{E} (X, Y) : = - \nabla_{X} \nabla_{Y}^{E} + \nabla_{Y} \nabla_{X}^{E} + \nabla_{[X, Y]}^{E}

defines a second-order covariant differential operator (“covariant” meaning tensorial in the X and Y components). Put differently,

R^{E} (X, Y) σ = \nabla^{2} σ : (X \otimes_{M} Y - Y \otimes_{M} X),

which will be called the (possibly nonlinear) curvature operator, which in particular measures the non-commutativity of the X and Y derivatives of

σ

. If

R^{E}

is identically zero, then the bundle E is said to be flat with respect to the relevant connections/covariant derivatives.

There are two particularly important instances of flat bundles. The first is the trivial line bundle defined by

π^{R ⥇ S}

(whose space of smooth sections, as discussed in Section 3.4, is naturally identified with

C^{\infty} (S, R)

; S is a smooth manifold). In this case,

\nabla^{|}^{S \to R} f \in Γ (T^{*} S),

and

\nabla^{2} f \equiv \nabla^{|}^{T^{*} S} \nabla^{|}^{S \to R} f \in Γ (T^{*} S \otimes_{S} T^{*} S)

is the object referred to in most literature as the covariant Hessian of f. Here,

R^{S \to R} (X, Y) f

is a real-valued function on S.

Proposition 17

(Symmetry of covariant Hessian on functions). Let S be a smooth manifold and let

\nabla^{T S}

be a symmetric covariant derivative. If

f \in C^{\infty} (S, R)

, then

\nabla^{2} f \in Γ (T^{*} S \otimes_{S} T^{*} S)

is a symmetric tensor field (i.e., it has a

(1 2)

symmetry). Here, the covariant derivative on

C^{\infty} (S, R)

is

\nabla^{S \to R}

as defined above.

Proof.

Let

X, Y \in Γ (T S)

. Recall that

\nabla f \equiv d f \in Γ (T^{*} S)

. Then

\begin{matrix} \nabla^{2} f : (X \otimes_{S} Y - Y \otimes_{S} X) \\ = & \nabla_{Y} \nabla f \cdot X - \nabla_{X} \nabla f \cdot Y \\ = & \nabla_{Y} (\nabla f \cdot X) - \nabla f \cdot \nabla_{Y} X - \nabla_{X} (\nabla f \cdot Y) + \nabla f \cdot \nabla_{X} Y \\ = & \nabla (\nabla f \cdot X) \cdot Y - \nabla (\nabla f \cdot Y) \cdot X + \nabla f \cdot [X, Y] & (by symmetry of \nabla^{T S}) \\ = & - \nabla f \cdot [X, Y] + \nabla f \cdot [X, Y] & (by definition of [X, Y]) \\ = & 0 . \end{matrix}

Because

X \otimes_{S} Y

is pointwise-arbitrary in

T S \otimes_{S} T S

, this shows that

\nabla^{2} f

is symmetric. Equivalently stated,

R^{S \to R}

is identically zero, and therefore the relevant bundle is flat. □
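
In local coordinates the role of the symmetry assumption is visible at a glance. The sketch below assumes coordinates x^i on S and Christoffel symbols defined by ∇_{∂_j} ∂_i = Γ^k_{ji} ∂_k (notation introduced for this example), and uses the convention ∇²f : (X ⊗ Y) = ∇_Y ∇f · X from the calculation above.

```latex
% Assumed notation: \nabla^{TS}_{\partial_j} \partial_i = \Gamma^k_{ji}\, \partial_k.
\begin{aligned}
(\nabla^2 f)_{ij} := \nabla^2 f : (\partial_i \otimes_S \partial_j)
  &= \partial_j \partial_i f - \Gamma^k_{ji}\, \partial_k f, \\
(\nabla^2 f)_{ij} - (\nabla^2 f)_{ji}
  &= \bigl(\Gamma^k_{ij} - \Gamma^k_{ji}\bigr)\, \partial_k f .
\end{aligned}
```

Since coordinate vector fields commute, the connection is symmetric exactly when Γ^k_{ij} = Γ^k_{ji}, in which case the right-hand side vanishes for every f.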

The second important case involves the nonlinear covariant derivative

\nabla^{M \to S}

on

C^{\infty} (M, S)

. Here, if

ϕ \in C^{\infty} (M, S)

, then

\nabla^{2} ϕ \equiv \nabla^{|}^{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{_{○}}^{M \to S} ϕ \in Γ (ϕ^{*} T S \otimes_{M} T^{*} M \otimes_{M} T^{*} M),

so

R^{M \to S} (X, Y) ϕ \in Γ (ϕ^{*} T S)

.

Proposition 18

(Symmetry of covariant Hessian on maps). Let M and S be smooth manifolds and let

\nabla^{T M}

and

\nabla^{T S}

be symmetric covariant derivatives. If

ϕ \in C^{\infty} (M, S)

, then

\nabla^{2} ϕ \in Γ (ϕ^{*} T S \otimes_{M} T^{*} M \otimes_{M} T^{*} M)

is a tensor field which is symmetric in the two

T^{*} M

components (i.e., it has a

(2 3)

symmetry). Here, the covariant derivative on

C^{\infty} (M, S)

is

\nabla^{_{○}}^{M \to S}

as defined above.

Proof.

Let

X, Y \in Γ (T M)

and

f \in C^{\infty} (S, R)

, so that

ϕ^{*} \nabla f \in Γ (ϕ^{*} T^{*} S)

. Then

\begin{matrix} ϕ^{*} \nabla f \cdot_{ϕ^{*} T S} R^{M \to S} (X, Y) ϕ = & ϕ^{*} \nabla f \cdot (- \nabla_{X} \nabla_{Y} ϕ + \nabla_{Y} \nabla_{X} ϕ + \nabla_{[X, Y]} ϕ) \\ = & - \nabla_{X} (ϕ^{*} \nabla f \cdot \nabla ϕ \cdot Y) + \nabla_{X} ϕ^{*} \nabla f \cdot \nabla ϕ \cdot Y \\ + \nabla_{Y} (ϕ^{*} \nabla f \cdot \nabla ϕ \cdot X) - \nabla_{Y} ϕ^{*} \nabla f \cdot \nabla ϕ \cdot X \\ + ϕ^{*} \nabla f \cdot \nabla ϕ \cdot [X, Y] \\ = & - \nabla_{X} (\nabla ϕ^{*} f \cdot Y) + \nabla_{Y} (\nabla ϕ^{*} f \cdot X) + \nabla ϕ^{*} f \cdot [X, Y] \\ + (ϕ^{*} \nabla^{2} f \cdot \nabla ϕ \cdot X) \cdot \nabla ϕ \cdot Y - (ϕ^{*} \nabla^{2} f \cdot \nabla ϕ \cdot Y) \cdot \nabla ϕ \cdot X \\ = & - \nabla (\nabla ϕ^{*} f \cdot Y) \cdot X + \nabla (\nabla ϕ^{*} f \cdot X) \cdot Y + \nabla ϕ^{*} f \cdot [X, Y] \\ - ϕ^{*} \nabla^{2} f : ((\nabla ϕ \cdot X) \otimes_{M} (\nabla ϕ \cdot Y) - (\nabla ϕ \cdot Y) \otimes_{M} (\nabla ϕ \cdot X)) . \end{matrix}

By definition,

- \nabla (\nabla ϕ^{*} f \cdot Y) \cdot X + \nabla (\nabla ϕ^{*} f \cdot X) \cdot Y = - \nabla ϕ^{*} f \cdot [X, Y]

, which cancels out the other term. By (17),

\nabla^{2} f

is symmetric, so the final term is zero. Because

ϕ^{*} \nabla f

is pointwise-arbitrary in

ϕ^{*} T^{*} S

and X and Y are pointwise-arbitrary in

T M

, this shows that

R^{M \to S}

is identically zero, so the bundle defined by

π_{M}^{S \times M} : S \times M \to M

, whose space of sections is identified with

C^{\infty} (M, S)

, is flat, and therefore

\nabla^{2} ϕ

is symmetric in its two

T^{*} M

components. □
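
For comparison with coordinate-based treatments, the covariant Hessian of a map can be sketched in components. The assumptions here are not from the text: coordinates x^i on M and y^α on S, with Christoffel symbols ∇_{∂_j} ∂_i = Γ^k_{ji} ∂_k on M and ∇_{∂_β} ∂_γ = Γ̃^α_{βγ} ∂_α on S, and component functions ϕ^α = y^α ∘ ϕ.

```latex
% Assumed notation: \Gamma^k_{ji} on M, \tilde{\Gamma}^\alpha_{\beta\gamma} on S,
% convention \nabla^2 \phi : (X \otimes Y) = \nabla_Y \nabla \phi \cdot X.
\begin{aligned}
(\nabla^2 \phi)^{\alpha}{}_{ij}
  &= \partial_j \partial_i \phi^{\alpha}
   - \Gamma^k_{ji}\, \partial_k \phi^{\alpha}
   + \bigl(\tilde{\Gamma}^{\alpha}_{\beta\gamma} \circ \phi\bigr)\, \partial_j \phi^{\beta}\, \partial_i \phi^{\gamma}, \\
(\nabla^2 \phi)^{\alpha}{}_{ij} - (\nabla^2 \phi)^{\alpha}{}_{ji}
  &= \bigl(\Gamma^k_{ij} - \Gamma^k_{ji}\bigr)\, \partial_k \phi^{\alpha}
   + \bigl(\tilde{\Gamma}^{\alpha}_{\beta\gamma} - \tilde{\Gamma}^{\alpha}_{\gamma\beta}\bigr) \circ \phi \;\, \partial_j \phi^{\beta}\, \partial_i \phi^{\gamma} .
\end{aligned}
```

Both differences vanish when the two connections are symmetric, which is the coordinate form of the (2 3) symmetry.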

The construction used in (16) can be applied to nonlinear as well as linear covariant derivatives to considerable advantage. For example, if

ψ : M \times N \to L

, where

M, N, L

are smooth manifolds and

p_{M} : = {pr}_{1}^{M \times N}

and

p_{N} : = {pr}_{2}^{M \times N}

, then define

ψ_{, M} \in Γ (ψ^{*} T L \otimes_{M \times N} p_{M}^{*} T^{*} M)

and

ψ_{, N} \in Γ (ψ^{*} T L \otimes_{M \times N} p_{N}^{*} T^{*} N)

by

\nabla^{_{○}} ψ = \nabla^{_{○}}^{M \times N \to L} ψ = ψ_{, M} \cdot_{p_{M}^{*} T M} \nabla^{_{○}} p_{M} + ψ_{, N} \cdot_{p_{N}^{*} T N} \nabla^{_{○}} p_{N} .

This gives a convenient way to express partial covariant derivatives, which will be used heavily in Section 4 in calculating the first and second variations of an energy functional. Note that in this parlance,

ψ_{, (M \times N)}

is the full tangent map

\nabla^{_{○}} ψ

.

Defining second partial covariant derivatives

ψ_{, M M}

,

ψ_{, M N}

,

ψ_{, N M}

and

ψ_{, N N}

by

\begin{matrix} \nabla ψ_{, M} & = ψ_{, M M} \cdot \nabla^{_{○}} p_{M} + ψ_{, M N} \cdot \nabla^{_{○}} p_{N} and \\ \nabla ψ_{, N} & = ψ_{, N M} \cdot \nabla^{_{○}} p_{M} + ψ_{, N N} \cdot \nabla^{_{○}} p_{N}, \end{matrix}

the symmetry of the covariant Hessian of

ψ

can be used to show the various symmetries of these second derivatives.

Proposition 19

(Symmetries of partial covariant derivatives). With ψ and its second partial covariant derivatives as above,

ψ_{, M M} \in Γ (ψ^{*} T L \otimes_{M \times N} p_{M}^{*} T^{*} M \otimes_{M \times N} p_{M}^{*} T^{*} M)

and

ψ_{, N N}

(having analogous type) are

(2 3)

-symmetric (i.e.,

{(ψ_{, M M})}^{(2 3)} = ψ_{, M M}

and

{(ψ_{, N N})}^{(2 3)} = ψ_{, N N}

) and the mixed, second partial covariant derivatives

\begin{matrix} ψ_{, M N} & \in Γ (ψ^{*} T L \otimes_{M \times N} p_{M}^{*} T^{*} M \otimes_{M \times N} p_{N}^{*} T^{*} N) and \\ ψ_{, N M} & \in Γ (ψ^{*} T L \otimes_{M \times N} p_{N}^{*} T^{*} N \otimes_{M \times N} p_{M}^{*} T^{*} M) \end{matrix}

are mutually

(2 3)

-symmetric (i.e.,

ψ_{, M N} = {(ψ_{, N M})}^{(2 3)}

).

Proof.

Let

X, Y \in Γ (T M \oplus T N)

. If

T p_{N} \cdot X = 0

and

T p_{M} \cdot Y = 0

, then

\begin{matrix} 0 = & \nabla^{2} ψ : (X \otimes_{M \times N} Y - Y \otimes_{M \times N} X) (by (18)) \\ = & ψ_{, M M} : (\nabla^{_{○}} p_{M} \cdot X \otimes_{M \times N} \nabla^{_{○}} p_{M} \cdot Y - \nabla^{_{○}} p_{M} \cdot Y \otimes_{M \times N} \nabla^{_{○}} p_{M} \cdot X) \\ + ψ_{, M N} : (\nabla^{_{○}} p_{M} \cdot X \otimes_{M \times N} \nabla^{_{○}} p_{N} \cdot Y - \nabla^{_{○}} p_{M} \cdot Y \otimes_{M \times N} \nabla^{_{○}} p_{N} \cdot X) \\ + ψ_{, N M} : (\nabla^{_{○}} p_{N} \cdot X \otimes_{M \times N} \nabla^{_{○}} p_{M} \cdot Y - \nabla^{_{○}} p_{N} \cdot Y \otimes_{M \times N} \nabla^{_{○}} p_{M} \cdot X) \\ + ψ_{, N N} : (\nabla^{_{○}} p_{N} \cdot X \otimes_{M \times N} \nabla^{_{○}} p_{N} \cdot Y - \nabla^{_{○}} p_{N} \cdot Y \otimes_{M \times N} \nabla^{_{○}} p_{N} \cdot X) \\ = & ψ_{, M N} : (\nabla^{_{○}} p_{M} \cdot X \otimes_{M \times N} \nabla^{_{○}} p_{N} \cdot Y) - ψ_{, N M} : (\nabla^{_{○}} p_{N} \cdot Y \otimes_{M \times N} \nabla^{_{○}} p_{M} \cdot X) \\ = & (ψ_{, M N} - {(ψ_{, N M})}^{(2 3)}) : (\nabla^{_{○}} p_{M} \cdot X \otimes_{M \times N} \nabla^{_{○}} p_{N} \cdot Y) . \end{matrix}

Because

\nabla^{_{○}} p_{M} \cdot X

and

\nabla^{_{○}} p_{N} \cdot Y

are pointwise-arbitrary in

p_{M}^{*} T M

and

p_{N}^{*} T N

respectively, this implies that

ψ_{, M N} = {(ψ_{, N M})}^{(2 3)}

.

Analogous calculations (setting

\nabla^{_{○}} p_{M} \cdot X = 0

and

\nabla^{_{○}} p_{M} \cdot Y = 0

and then separately setting

\nabla^{_{○}} p_{N} \cdot X = 0

and

\nabla^{_{○}} p_{N} \cdot Y = 0

) show that

ψ_{, M M} = {(ψ_{, M M})}^{(2 3)}

and

ψ_{, N N} = {(ψ_{, N N})}^{(2 3)}

. □
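
In the simplest setting the proposition reduces to the equality of mixed partial derivatives. The following sketch assumes M = N = L = R with their standard flat connections, coordinates x on M and y on N, and the sample map ψ(x, y) = x² y; all of these choices are illustrative.

```latex
% Illustrative assumption: M = N = L = \mathbb{R} with flat connections, \psi(x, y) = x^2 y.
\begin{aligned}
\psi_{,M} &= \frac{\partial \psi}{\partial x} = 2 x y, &
\psi_{,N} &= \frac{\partial \psi}{\partial y} = x^2, \\
\psi_{,MM} &= \frac{\partial^2 \psi}{\partial x^2} = 2 y, &
\psi_{,NN} &= \frac{\partial^2 \psi}{\partial y^2} = 0, \\
\psi_{,MN} &= \frac{\partial}{\partial y} \frac{\partial \psi}{\partial x} = 2 x, &
\psi_{,NM} &= \frac{\partial}{\partial x} \frac{\partial \psi}{\partial y} = 2 x .
\end{aligned}
```

Here every bundle involved has rank one, so the (2 3) transposition acts trivially on components and the mutual symmetry reads ψ_{,MN} = ψ_{,NM}.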

There are two final results regarding the second covariant derivative that will be especially useful in the calculation of the first and second variations of an energy functional (see (1) and (3)).

Proposition 20

(Chain rule for covariant Hessian). Let

π : E \to N

define a bundle having a first and second covariant derivative (i.e., a section of E can be covariantly differentiated twice). If

ϕ : M \to N

and

e \in Γ (E)

, then

\nabla^{2} ϕ^{*} e = ϕ^{*} \nabla^{2} e :_{ϕ^{*} T N} (\nabla^{_{○}} ϕ ⊠_{M} \nabla^{_{○}} ϕ) + ϕ^{*} \nabla e \cdot_{ϕ^{*} T N} \nabla \nabla^{_{○}} ϕ .

Proof.

Let

X \in Γ (T M)

. Then

\begin{matrix} \nabla^{2} ϕ^{*} e \cdot X & = \nabla_{X} \nabla^{ϕ^{*} E} ϕ^{*} e \\ = \nabla_{X} (ϕ^{*} \nabla^{E} e \cdot_{ϕ^{*} T N} \nabla^{_{○}} ϕ) \\ = \nabla_{X} (ϕ^{*} \nabla e) \cdot_{ϕ^{*} T N} \nabla^{_{○}} ϕ + ϕ^{*} \nabla e \cdot_{ϕ^{*} T N} \nabla_{X} \nabla^{_{○}} ϕ \\ = (ϕ^{*} \nabla^{2} e \cdot_{ϕ^{*} T N} \nabla^{_{○}} ϕ \cdot X) \cdot_{ϕ^{*} T N} \nabla^{_{○}} ϕ + ϕ^{*} \nabla e \cdot_{ϕ^{*} T N} \nabla_{X} \nabla^{_{○}} ϕ \\ = [ϕ^{*} \nabla^{2} e :_{ϕ^{*} T N} (\nabla^{_{○}} ϕ ⊠_{M} \nabla^{_{○}} ϕ) + ϕ^{*} \nabla e \cdot_{ϕ^{*} T N} \nabla \nabla^{_{○}} ϕ] \cdot X . \end{matrix}

Because X is pointwise-arbitrary in

T M

, this establishes the desired equality. □
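
A familiar special case illustrates the chain rule. Assume (for this sketch only) that M = I is an interval with its standard flat connection, ϕ = γ : I → N is a smooth curve with velocity γ̇ = ∇γ · d/dt, and e = f ∈ C^∞(N, R) is a function. Evaluating both sides of the proposition on d/dt ⊗ d/dt gives the second derivative of f along γ:

```latex
% Illustrative assumption: M = I \subseteq \mathbb{R} (flat), \phi = \gamma, e = f \in C^\infty(N, \mathbb{R}).
\frac{d^2}{dt^2} (f \circ \gamma)
  = \gamma^{*} \nabla^2 f : (\dot{\gamma} \otimes \dot{\gamma})
  + \gamma^{*} \nabla f \cdot \nabla_{d/dt} \dot{\gamma} .
```

In particular, when γ is a geodesic of the connection on TN, the second term vanishes and the second derivative of f ∘ γ is the covariant Hessian of f evaluated on the velocity.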

Proposition 21

(Pullback curvature endomorphism). Let

π : E \to N

define a vector bundle having first and second covariant derivatives. If

ϕ : M \to N

, then

R^{ϕ^{*} T N} = ϕ^{*} R^{T N} :_{ϕ^{*} T N} (\nabla^{_{○}} ϕ ⊠_{M} \nabla^{_{○}} ϕ) .

Proof.

Note that

R^{ϕ^{*} T N} \in Γ (ϕ^{*} T N \otimes_{M} ϕ^{*} T^{*} N \otimes_{M} T^{*} M \otimes_{M} T^{*} M)

. Let

X, Y \in Γ (T M)

and let

Z \in Γ (T N)

, so that

ϕ^{*} Z \in Γ (ϕ^{*} T N)

. Then

\begin{matrix} (I_{ϕ^{*} T N} \otimes_{M} ϕ^{*} Z) \cdot_{ϕ^{*} T N \otimes_{M} ϕ^{*} T^{*} N} R^{ϕ^{*} T N} :_{T M} (X \otimes_{M} Y) \\ = & R^{ϕ^{*} T N} (X, Y) (ϕ^{*} Z) \\ = & \nabla^{2} ϕ^{*} Z :_{T M} (X \land_{M} Y) \\ = & ϕ^{*} \nabla^{2} Z :_{ϕ^{*} T N} (\nabla^{_{○}} ϕ ⊠_{M} \nabla^{_{○}} ϕ) :_{T M} (X \land_{M} Y) \\ + ϕ^{*} \nabla Z \cdot_{ϕ^{*} T N} \nabla \nabla^{_{○}} ϕ :_{T M} (X \land_{M} Y) \\ (by (20)) \\ = & ϕ^{*} \nabla^{2} Z :_{ϕ^{*} T N} ((\nabla^{_{○}} ϕ \cdot X) \land_{M} (\nabla^{_{○}} ϕ \cdot Y)) + ϕ^{*} \nabla Z \cdot_{ϕ^{*} T N} 0 \\ (by symmetry of \nabla \nabla^{_{○}} ϕ) \\ = & (ϕ^{*} ((I_{T N} \otimes_{N} Z) \cdot_{T N \otimes_{N} T^{*} N} R^{T N})) :_{ϕ^{*} T N} ((\nabla^{_{○}} ϕ \cdot X) \otimes_{M} (\nabla^{_{○}} ϕ \cdot Y)) \\ = & (I_{ϕ^{*} T N} \otimes_{M} ϕ^{*} Z) \cdot_{ϕ^{*} T N \otimes_{M} ϕ^{*} T^{*} N} ϕ^{*} R^{T N} :_{ϕ^{*} T N} (\nabla^{_{○}} ϕ ⊠_{M} \nabla^{_{○}} ϕ) :_{T M} (X \otimes_{M} Y), \end{matrix}

and because

X, Y

and

ϕ^{*} Z

are pointwise-arbitrary in their respective spaces, this establishes the desired equality. □

A common operation is to evaluate a covariant derivative along a single tangent vector. One can express a single tangent vector as a section of a particular pullback bundle, the map being the constant map evaluating to the basepoint of the vector. This allows the richly typed formalism of pullback bundles to be used to evaluate derivatives at a point, particularly noting that this safely deals with the overloading of the natural pairing operator · (see Section 3.5).

Proposition 22

(Evaluation commutes with non-involved derivatives). Let A and B be smooth manifolds and let

σ \in Γ (E)

for some smooth bundle

E \to A \times B

having a covariant derivative

\nabla^{E}

. If

b \in B

and the map

z : A \to A \times B, a \mapsto (a, b)

represents evaluation at b, then

z^{*} (σ_{, A}) = {(z^{*} σ)}_{, A},

i.e., evaluation in B commutes with a derivative along A.

Proof.

Let

X \in Γ (T A)

, and let

p_{A} : = {pr}_{1}^{A \times B}

and

p_{B} : = {pr}_{2}^{A \times B}

. Then

\begin{matrix} {(z^{*} σ)}_{, A} \cdot X & = \nabla^{z^{*} E} z^{*} σ \cdot X \\ = z^{*} \nabla^{E} σ \cdot_{z^{*} (T A \oplus T B)} \nabla^{_{○}} z \cdot_{T A} X \\ = z^{*} (\nabla^{E} σ \cdot_{T A \oplus T B} p_{A}^{*} X) \\ = z^{*} (σ_{, A} \cdot_{p_{A}^{*} T A} \nabla^{_{○}} p_{A} \cdot_{T A \oplus T B} p_{A}^{*} X) & (since \nabla^{_{○}} p_{B} \cdot_{T A \oplus T B} p_{A}^{*} X = 0) \\ = z^{*} σ_{, A} \cdot_{T A} X & (since z^{*} p_{A}^{*} X = {(p_{A} \circ z)}^{*} X = {Id}_{A}^{*} X = X), \end{matrix}

and because X is pointwise-arbitrary in

T A

, this implies that

z^{*} σ_{, A} = {(z^{*} σ)}_{, A}

as desired. □
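A minimal sanity check of Proposition 22, under assumptions chosen here only for illustration: take A = B = R with coordinates (a, b), let E be the trivial line bundle over A × B with the flat connection, and let σ(a, b) = a²b be a hypothetical section. Fix b₀ ∈ B, so that z(a) = (a, b₀). Differentiating after evaluating and evaluating after differentiating then give the same result:

```latex
\begin{aligned}
(z^{*}\sigma)(a) &= a^{2} b_{0},
  & (z^{*}\sigma)_{,A}(a) &= 2 a b_{0}\, da, \\
\sigma_{,A}(a, b) &= 2 a b\, da,
  & z^{*}(\sigma_{,A})(a) &= 2 a b_{0}\, da .
\end{aligned}
```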

Proposition 23.

Let

A, B, C

be smooth manifolds, let

ψ : A \times B \to C

be smooth, let

p_{A} : = {pr}_{1}^{A \times B}

and

p_{B} : = {pr}_{2}^{A \times B}

, and let

X, Y \in Γ (T A \oplus T B)

. If

\nabla^{_{○}} p_{B} \cdot X = 0

and

\nabla^{_{○}} p_{A} \cdot Y = 0

, then

ψ_{, A B} : ((\nabla^{_{○}} p_{A} \cdot X) \otimes_{A \times B} (\nabla^{_{○}} p_{B} \cdot Y)) = \nabla_{Y}^{ψ^{*} T C} \nabla_{X}^{A \times B \to C} ψ .

Proof.

The conditions

\nabla^{_{○}} p_{B} \cdot X = 0

and

\nabla^{_{○}} p_{A} \cdot Y = 0

imply that

\nabla_{Y} X = 0

in the product covariant derivative. Then since

p_{A} \times_{A \times B} p_{B} = {Id}_{A \times B}

, it follows that

\begin{matrix} \nabla \nabla^{_{○}} p_{A} \oplus_{A \times B} \nabla \nabla^{_{○}} p_{B} & = & \nabla (\nabla^{_{○}} p_{A} \oplus_{A \times B} \nabla^{_{○}} p_{B}) \\ & = & \nabla \nabla^{_{○}} (p_{A} \times_{A \times B} p_{B}) \\ & = & \nabla \nabla^{_{○}} {Id}_{A \times B} \\ & = & \nabla I_{T A \oplus T B} \\ & = & 0, \end{matrix}

where

I_{T A \oplus T B} \in Γ ((T A \oplus T B) \otimes {(T A \oplus T B)}^{*})

denotes the identity tensor field on

T A \oplus T B

, and therefore

\nabla_{Y} (\nabla^{_{○}} p_{A} \cdot X) = \nabla_{Y} \nabla^{_{○}} p_{A} \cdot X + \nabla^{_{○}} p_{A} \cdot \nabla_{Y} X = 0 \cdot X + \nabla^{_{○}} p_{A} \cdot 0 = 0 .

For the main calculation,

\begin{matrix} ψ_{, A B} : ((\nabla^{_{○}} p_{A} \cdot X) \otimes_{A \times B} (\nabla^{_{○}} p_{B} \cdot Y)) \\ = & (ψ_{, A B} \cdot_{p_{B}^{*} T B} \nabla^{_{○}} p_{B} \cdot_{T A \oplus T B} Y) \cdot_{p_{A}^{*} T A} \nabla^{_{○}} p_{A} \cdot_{T A \oplus T B} X \\ = & (\nabla_{Y}^{ψ \times_{A \times B} p_{A}} ψ_{, A}) \cdot_{p_{A}^{*} T A} \nabla^{_{○}} p_{A} \cdot_{T A \oplus T B} X & (since \nabla^{_{○}} p_{A} \cdot Y = 0) \\ = & \nabla_{Y}^{ψ} (ψ_{, A} \cdot \nabla^{_{○}} p_{A} \cdot X) - ψ_{, A} \cdot \nabla_{Y}^{p_{A}} (\nabla^{_{○}} p_{A} \cdot X) & (by reverse product rule) \\ = & \nabla_{Y}^{ψ^{*} T C} \nabla_{X}^{A \times B \to C} ψ & (since \nabla_{Y} (\nabla^{_{○}} p_{A} \cdot X) = 0), \end{matrix}

as desired. □
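In the flat case, Proposition 23 reduces to a statement about mixed partial derivatives. As a sketch under assumptions made only for illustration, take A = B = C = R with coordinates a and b on A and B, the flat connections, the identification of TC with R, the vector fields X = ∂_a and Y = ∂_b (which satisfy the two hypotheses), and the hypothetical map ψ(a, b) = a²b:

```latex
\begin{aligned}
\nabla_{X} \psi &= \partial_{a} \psi = 2 a b, \\
\nabla_{Y} \nabla_{X} \psi &= \partial_{b} (2 a b) = 2 a, \\
\psi_{,AB} : (\partial_{a} \otimes \partial_{b})
  &= \frac{\partial^{2} \psi}{\partial b \, \partial a} = 2 a .
\end{aligned}
```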

4. Riemannian Calculus of Variations

The use of the calculus of variations in the Riemannian setting to derive the geodesic equations and to study harmonic maps is well established. A more general formulation is required for more specific applications, such as continuum mechanics in Riemannian manifolds. The tools developed in Section 3 will now be used to formulate the first and second variations and Euler–Lagrange equations of an energy functional corresponding to a first-order Lagrangian. In particular, the bundle decomposition discussed in Section 3.9 will be needed to employ the standard integration-by-parts trick seen in the formulation of the analogous parts of the elementary calculus of variations. The seemingly heavy and pedantic formalism built up thus far will now show its usefulness.

In this part, let

(M, g)

and

(S, h)

be Riemannian manifolds with M compact. Calculations will be done formally in the space

C^{\infty} (M, S)

, noting that its completion under various norms will give various Sobolev spaces of maps from M to S, which are ultimately the spaces that must be considered when finding critical points of the relevant energy functionals. See [17,18] for details on the analytical issues. Let

d V_{g}

denote the Riemannian volume form corresponding to metric g, and let

d {\bar{V}}_{g}

be the induced volume form on

\partial M

. Let

ι : \partial M \to M

be the inclusion, and let

ν \in Γ (ι^{*} T^{*} M)

be the unit normal covector field on

\partial M

. Let

E : = T S \otimes_{S \times M} T^{*} M

and

π : = π_{S}^{T S} \otimes_{S \times M} π_{M}^{T^{*} M}

, making

π : E \to S \times M

a vector bundle.

The energy functionals in this section will be assumed to have the form

\begin{matrix} \mathcal{L} : C^{\infty} (M, S) & \to & R, \\ ϕ & \mapsto & \int_{M} L \circ \nabla^{_{○}} ϕ \, d V_{g}, \end{matrix}

where

L : E \to R

, referred to as the Lagrangian of the functional, is smooth. Here,

\nabla^{_{○}} ϕ

could be understood to take values either in

E = T S \otimes_{S \times M} T^{*} M

or

ϕ^{*} T S \otimes_{M} T^{*} M

. In the former case, the composition

L \circ \nabla^{_{○}} ϕ

is literal, while in the latter case, there is an implicit conversion from

ϕ^{*} T S \otimes_{M} T^{*} M

to

T S \otimes_{S \times M} T^{*} M

via a fiber projection bundle morphism (see (1)). Either way,

L \circ \nabla^{_{○}} ϕ : M \to R

. Let

\nabla^{T S}

and

\nabla^{T M}

denote the respective Levi-Civita connections, which induce a covariant derivative

\nabla^{E}

on E (see (11)). Define the connection map

v \in Γ (π^{*} E \otimes_{E} T^{*} E)

using

\nabla^{E}

as in (3). For convenience, the

S \times M

subscript will be suppressed on the “full” tensor product defining E from here forward.

4.1. Critical Points and Variations

One of the most pertinent properties of an energy functional is its set of critical points. Often, the solution to a problem in physics will take the form of minimizing a particular energy functional. Lagrangian mechanics is the quintessential example of this. This section will deal with some of the main considerations regarding such critical points.

Because the domain of a [real-valued] functional

\mathcal{L}

may be a nonlinear space, the relevant first derivative is the [real-valued] differential

d \mathcal{L}

, which is paired with the linearized variation of a map

ϕ \in C^{\infty} (M, S)

. In particular, a one-parameter variation of

ϕ

is a smooth map

Φ : M \times I \to S

, where the I component is the variational parameter. Letting i denote the standard coordinate on I, the linearized variation is then

δ_{i} Φ : M \to T S

, recalling that

δ_{i} : = \frac{\partial}{\partial i} ∣_{i = 0}

. Because

π_{S}^{T S} \circ δ_{i} Φ = ϕ

, it follows that

δ_{i} Φ \in Γ (ϕ^{*} T S)

, i.e.,

δ_{i} Φ

is a vector field along

ϕ

. The object

δ_{i} Φ

will be called a linearized variation. Call the elements of

Γ (ϕ^{*} T S)

linear variations.

Proposition 24

(Each linear variation is a linearized variation). Let

exp : U \to S

denote the exponential map associated to

\nabla^{T S}

, where

U \subseteq T S

is a neighborhood of the zero bundle in

T S

on which exp is defined, and let

λ : T S \times R \to T S, (s, ϵ) \mapsto ϵ s

denote the scalar multiplication structure on

T S

. If

A \in Γ (ϕ^{*} T S)

and if

Φ : U \to S

is defined by

Φ : = exp \circ λ \circ (A \times {Id}_{I}) ∣_{U}

, then

δ_{i} Φ = A

. In other words, every vector field over ϕ is realized as the linearization of a one-parameter variation of ϕ.

Proof.

The map

Φ

is well-defined and smooth by construction. Let

p \in M

. Then

\begin{matrix} (δ_{i} Φ) (p) & = δ_{i} (Φ (p, i)) \\ = δ_{i} (exp \circ λ \circ (A \times {Id}_{I}) (p, i)) \\ = \nabla^{_{○}} exp \cdot δ_{i} (λ (A (p), i)) \\ = \nabla^{_{○}} exp \cdot δ_{i} (i A (p)) \\ = \nabla^{_{○}} exp \cdot (ι_{V E}^{π^{*} E} ∣_{Z (π^{*} E)}) \cdot A (p) \\ = A (p), \end{matrix}

where

Z (π^{*} E)

denotes the zero subbundle of

π^{*} E

. The last equality follows from a naturality property of the exponential map ([5], p. 523). □

Thus each linear variation is a linearized variation, establishing a natural identification of

T_{ϕ} (C^{\infty} (M, S))

with

Γ (ϕ^{*} T S)

, which will be useful when calculating the differential of a functional on

C^{\infty} (M, S)

. In fact, the exponential map construction in Proposition 24 is a way to construct charts for the infinite-dimensional manifold

C^{\infty} (M, S)

([18], Theorem 5.2).
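As a sanity check of Proposition 24, consider the flat case S = Rⁿ with the Euclidean metric, an assumption made here purely for illustration. There the exponential map is exp_x(v) = x + v, so the variation constructed in the proposition is the familiar straight-line perturbation of ϕ in the direction A, with A(p) regarded as an element of Rⁿ:

```latex
\begin{aligned}
\Phi(p, i) &= \exp_{\phi(p)} \bigl( i \, A(p) \bigr) = \phi(p) + i \, A(p), \\
\delta_{i} \Phi(p) &= \left. \frac{\partial}{\partial i} \right|_{i = 0}
  \bigl( \phi(p) + i \, A(p) \bigr) = A(p) .
\end{aligned}
```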

4.2. First Variation

This section is devoted to calculating the first variation of the previously defined energy functional. Here the full richness of the type system developed earlier in the paper shows its power (and, arguably, its necessity). While the type-specifying notation may appear overly decorated and pedantic, subtle usage errors can be detected and avoided by keeping track of the myriad types of the relevant objects through the sub/superscripts on covariant derivatives and natural pairings; extremely complex constructions can be made and navigated without much trouble. By contrast, performing the ensuing calculations in coordinate trivializations would result in an intractable proliferation of Christoffel symbols and indexed expressions that would be difficult to read and highly prone to error.

Because the Lagrangian

L : E \to R

is defined on a vector bundle

π : E \to S \times M

over the product space

S \times M

, the decomposition in (3) can be slightly refined. The projection

π

can be decomposed into the factors

π_{S} : = {pr}_{S}^{S \times M} \circ π

and

π_{M} : = {pr}_{M}^{S \times M} \circ π

, so that

π = π_{S} \times_{E} π_{M}

. Then

h = \nabla^{_{○}} π = \nabla^{_{○}} π_{S} \oplus_{E} \nabla^{_{○}} π_{M}

. Let

σ : = \nabla^{_{○}} π_{S} \in Γ (π_{S}^{*} T S \otimes_{E} T^{*} E) and μ : = \nabla^{_{○}} π_{M} \in Γ (π_{M}^{*} T M \otimes_{E} T^{*} E) .

The letters sigma and mu have been chosen to reflect the fact that

L_{, σ} \in Γ (π_{S}^{*} T^{*} S)

and

L_{, μ} \in Γ (π_{M}^{*} T^{*} M)

give the “S component” (spatial) and “M component” (material) of the derivative

\nabla^{E \to R} L \in Γ (T^{*} E)

. The connection map v will be retained as is, giving

L_{, v} \in Γ (π^{*} E^{*})

, the “E component” (fiber) of

\nabla^{E \to R} L

. See Remark 8 for a discussion of how the quantities

L_{, μ}, L_{, σ}, L_{, v}

generalize the analogous structures in the elementary treatment of the calculus of variations.

Because a one-parameter variation of

ϕ \in C^{\infty} (M, S)

has the form

Φ : M \times I \to S

but the energy functional

\mathcal{L}

involves only the M derivative of its argument, the partial tangent map must be used here. For the purposes of calculating the first and second variations,

\mathcal{L}

must be written as

\mathcal{L} (ϕ) : = \int_{M} L \circ ϕ_{, M} \, d V_{g} .

Theorem 1

(First variation of

\mathcal{L}

). Let

\mathcal{L}

, L, σ, μ, v and ν all be defined as above. If

ϕ \in C^{\infty} (M, S)

and

A \in Γ (ϕ^{*} T S)

, then

d \mathcal{L} (ϕ) \cdot A = \int_{M} A \cdot_{ϕ^{*} T^{*} S} (ϕ_{, M}^{*} L_{, σ} - {div}_{M} (ϕ_{, M}^{*} L_{, v})) d V_{g} + \int_{\partial M} A \cdot_{ϕ^{*} T^{*} S} ϕ_{, M}^{*} L_{, v} \cdot_{T^{*} M} ν d {\bar{V}}_{g} .

The expression above is often called the first variation of

\mathcal{L}

. A type analysis here gives

ϕ_{, M}^{*} L_{, σ} \in Γ (ϕ^{*} T^{*} S)

and

ϕ_{, M}^{*} L_{, v} \in Γ (ϕ^{*} T^{*} S \otimes_{M} T M)

. Recall that because the domain of

ϕ

is M, it follows that

\nabla^{_{○}} ϕ \equiv ϕ_{, M}

.

Proof.

Supporting calculations will be made below in lemmas. Let

Φ : M \times I \to S

be as in Proposition 24, so that

δ_{i} Φ = A

. For tidiness, let

L_{, σ} : = ϕ_{, M}^{*} L_{, σ}

and

L_{, v} : = ϕ_{, M}^{*} L_{, v}

. Then

\begin{matrix} d \mathcal{L} (ϕ) \cdot A = & d \mathcal{L} (ϕ) \cdot δ_{i} Φ \\ = & δ_{i} (\mathcal{L} (Φ)) \\ = & \int_{M} δ_{i} (L \circ Φ_{, M}) d V_{g} \\ = & \int_{M} L_{, σ} \cdot_{ϕ^{*} T S} A + L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} A d V_{g} \\ (by Lemma 4) \\ = & \int_{M} A \cdot_{ϕ^{*} T^{*} S} (L_{, σ} - {div}_{M} L_{, v}) + {div}_{M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v}) d V_{g} \\ (by Lemma 4) \\ = & \int_{M} A \cdot_{ϕ^{*} T^{*} S} (L_{, σ} - {div}_{M} L_{, v}) d V_{g} + \int_{\partial M} A \cdot_{ϕ^{*} T^{*} S} L_{, v} \cdot_{T^{*} M} ν d {\bar{V}}_{g} \\ (divergence theorem), \end{matrix}

as desired.

As for the types of

ϕ_{, M}^{*} L_{, σ}

and

ϕ_{, M}^{*} L_{, v}

, the contravariance of bundle pullback allows significant simplification. Because

L_{, σ} \in Γ (π_{S}^{*} T^{*} S)

and

L_{, v} \in Γ (π^{*} E^{*})

,

\begin{matrix} ϕ_{, M}^{*} L_{, σ} & \in Γ (ϕ_{, M}^{*} π_{S}^{*} T^{*} S) = Γ ({(π_{S} \circ ϕ_{, M})}^{*} T^{*} S) = Γ (ϕ^{*} T^{*} S) and \\ ϕ_{, M}^{*} L_{, v} & \in Γ (ϕ_{, M}^{*} π^{*} E^{*}) = Γ ({(π \circ ϕ_{, M})}^{*} (T^{*} S \otimes T M)) = Γ (ϕ^{*} T^{*} S \otimes_{M} T M) . \end{matrix}

The supporting calculations follow. Define

z : M \to M \times I, m \mapsto (m, 0)

for the purpose of evaluation at

i = 0

via precomposition as in Proposition 22. Then

δ_{i}

is a section of a pullback bundle;

δ_{i} = z^{*} \partial_{i} \in Γ (z^{*} (T M \oplus T I))

. It should be noted that

Φ \circ z = ϕ

by definition, and that

z^{*} Φ_{, M} = {(z^{*} Φ)}_{, M} = ϕ_{, M}

by Proposition 22. □

Lemma 4.

Let L, Φ, A, σ, and v be as in Theorem 1. The variational derivative of

L \circ Φ_{, M}

decomposes in terms of the partial covariant derivatives

L_{, σ}

and

L_{, v}

and the linearized variation A;

δ_{i} (L \circ Φ_{, M}) = ϕ_{, M}^{*} L_{, σ} \cdot_{ϕ^{*} T S} δ_{i} Φ + ϕ_{, M}^{*} L_{, v} \cdot_{{(ϕ \times_{M} {Id}_{M})}^{*} E} \nabla^{ϕ} δ_{i} Φ .

The integration-by-parts trick as in the derivation of the first variation in elementary calculus of variations generalizes to the covariant setting;

L_{, σ} \cdot_{ϕ^{*} T S} A + L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ} A = A \cdot_{ϕ^{*} T^{*} S} (L_{, σ} - {div}_{M} L_{, v}) + {div}_{M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v}) .

Proof.

A wonderful string of equalities follows.

\begin{matrix} δ_{i} (L \circ Φ_{, M}) \\ = & z^{*} \nabla^{M \times I \to R} (L \circ Φ_{, M}) \cdot_{z^{*} (T M \oplus T I)} δ_{i} \\ (here, δ_{i} = z^{*} \partial_{i}) \\ = & z^{*} Φ_{, M}^{*} \nabla^{E \to R} L \cdot_{z^{*} Φ_{, M}^{*} T E} z^{*} {\nabla^{_{○}}}^{M \times I \to E} Φ_{, M} \cdot_{z^{*} (T M \oplus T I)} δ_{i} \\ (chain rule) \\ = & ϕ_{, M}^{*} (L_{, σ} \cdot_{π_{S}^{*} T S} σ + L_{, μ} \cdot_{π_{M}^{*} T M} μ + L_{, v} \cdot_{π^{*} E} v) \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} \\ (by (16) and because Φ_{, M} \circ z = ϕ_{, M}) \\ = & ϕ_{, M}^{*} L_{, σ} \cdot_{ϕ_{, M}^{*} π_{S}^{*} T S} ϕ_{, M}^{*} σ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} \\ + ϕ_{, M}^{*} L_{, μ} \cdot_{ϕ_{, M}^{*} π_{M}^{*} T M} ϕ_{, M}^{*} μ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} \\ + ϕ_{, M}^{*} L_{, v} \cdot_{ϕ_{, M}^{*} π^{*} E} ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} \\ = & ϕ_{, M}^{*} L_{, σ} \cdot_{ϕ^{*} T S} δ_{i} Φ + ϕ_{, M}^{*} L_{, v} \cdot_{{(ϕ \times_{M} {Id}_{M})}^{*} E} \nabla^{ϕ} δ_{i} Φ \\ (by Lemma 5) . \end{matrix}

Note that by (8),

Φ_{, M}^{*} π_{S}^{*} T S = {(π_{S} \circ Φ_{, M})}^{*} T S = Φ^{*} T S

,

Φ_{, M}^{*} π_{M}^{*} T M = {(π_{M} \circ Φ_{, M})}^{*} T M = {({pr}_{M}^{M \times I})}^{*} T M

and

Φ_{, M}^{*} π^{*} E = {(π \circ Φ_{, M})}^{*} E = {(Φ \times_{M \times I} {pr}_{M}^{M \times I})}^{*} E

. Replacing

δ_{i} Φ

with A gives

δ_{i} (L \circ Φ_{, M}) = L_{, σ} \cdot_{ϕ^{*} T S} A + L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ} A,

establishing the first equality.

For the second,

\begin{matrix} L_{, σ} \cdot_{ϕ^{*} T S} A + L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ} A \\ = & L_{, σ} \cdot_{ϕ^{*} T S} A + {tr}_{T^{*} M} (L_{, v} \cdot_{ϕ^{*} T S} \nabla^{ϕ} A) \\ (tracing T M separately) \\ = & A \cdot_{ϕ^{*} T^{*} S} L_{, σ} + {tr}_{T^{*} M} (\nabla (L_{, v} \cdot_{ϕ^{*} T S} A) - (\nabla^{ϕ \times_{M} {Id}_{M}} L_{, v}) \cdot_{ϕ^{*} T S} A) \\ (reverse product rule) \\ = & A \cdot_{ϕ^{*} T^{*} S} L_{, σ} - A \cdot_{ϕ^{*} T^{*} S} {tr}_{T^{*} M} \nabla^{ϕ \times_{M} {Id}_{M}} L_{, v} + {tr}_{T^{*} M} \nabla (A \cdot_{ϕ^{*} T^{*} S} L_{, v}) \\ (\cdot_{ϕ^{*} T S} commutes with {tr}_{T^{*} M}) \\ = & A \cdot_{ϕ^{*} T^{*} S} (L_{, σ} - {div}_{M} L_{, v}) + {div}_{M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v}) \\ (definition of {div}_{M}) . \end{matrix}

Note that

L_{, v} \in Γ (ϕ^{*} T^{*} S \otimes_{M} T M)

, so

{div}_{M} L_{, v} \in Γ (ϕ^{*} T^{*} S)

and

A \cdot_{ϕ^{*} T^{*} S} L_{, v} \in Γ (T M)

. □

Lemma 5.

The variation

δ_{i} Φ_{, M}

decomposes as follows.

\begin{matrix} ϕ_{, M}^{*} σ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = δ_{i} Φ \in Γ (ϕ^{*} T S), \\ ϕ_{, M}^{*} μ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = 0 \in Γ (T M), \\ ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = \nabla^{ϕ^{*} T S} δ_{i} Φ \in Γ (ϕ^{*} T S \otimes_{M} T^{*} M) . \end{matrix}

Proof.

This calculation determines the

σ

component of

δ_{i} Φ_{, M}

.

\begin{matrix} ϕ_{, M}^{*} σ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = ϕ_{, M}^{*} \nabla^{_{○}} π_{S} \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} \\ = δ_{i} (π_{S} \circ Φ_{, M}) \\ = δ_{i} ({pr}_{S}^{S \times M} \circ π \circ Φ_{, M}) \\ = δ_{i} Φ \in Γ (z^{*} Φ^{*} T S) ≅ Γ (ϕ^{*} T S) . \end{matrix}

This calculation determines the

μ

component of

δ_{i} Φ_{, M}

.

\begin{matrix} ϕ_{, M}^{*} μ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = ϕ_{, M}^{*} \nabla^{_{○}} π_{M} \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} \\ = δ_{i} (π_{M} \circ Φ_{, M}) \\ = δ_{i} ({pr}_{M}^{S \times M} \circ π \circ Φ_{, M}) \\ = δ_{i} {pr}_{M}^{M \times I} \\ = 0 \in Γ (z^{*} {({pr}_{M}^{M \times I})}^{*} T M) ≅ Γ (T M) . \end{matrix}

The last equality follows from the fact that

{pr}_{M}^{M \times I}

does not depend on the i coordinate.

This calculation determines the v component of

δ_{i} Φ_{, M}

. Let

p_{M} : = {pr}_{M}^{M \times I}

and

p_{I} : = {pr}_{I}^{M \times I}

. The left-hand side of the third equality claimed in the lemma will be examined before evaluating at

i = 0

;

\begin{matrix} Φ_{, M}^{*} v \cdot_{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} & = \nabla_{p_{I}^{*} \partial_{i}}^{{(π \circ Φ_{, M})}^{*} E} Φ_{, M} \\ = \nabla_{p_{I}^{*} \partial_{i}}^{{(Φ \times_{M \times I} p_{M})}^{*} (T S \otimes T^{*} M)} Φ_{, M} \in Γ (Φ^{*} T S \otimes_{M \times I} p_{M}^{*} T^{*} M) . \end{matrix}

Let

Y \in Γ (T M)

, noting that

p_{M}^{*} Y \in Γ (p_{M}^{*} T M)

. Then

\begin{matrix} (Φ_{, M}^{*} v \cdot_{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M}) \cdot_{p_{M}^{*} T M} p_{M}^{*} Y \\ = & \nabla_{p_{I}^{*} \partial_{i}}^{Φ^{*} T S \otimes_{M \times I} p_{M}^{*} T^{*} M} Φ_{, M} \cdot_{p_{M}^{*} T M} p_{M}^{*} Y \\ = & (Φ_{, M I} \cdot_{p_{I}^{*} T I} p_{I}^{*} \partial_{i}) \cdot_{p_{M}^{*} T M} p_{M}^{*} Y \\ = & (Φ_{, I M} \cdot_{p_{M}^{*} T M} p_{M}^{*} Y) \cdot_{p_{I}^{*} T I} p_{I}^{*} \partial_{i} \\ (by (19)) \\ = & {(Φ_{, I} \cdot_{p_{I}^{*} T I} p_{I}^{*} \partial_{i})}_{, M} \cdot_{p_{M}^{*} T M} p_{M}^{*} Y - Φ_{, I} \cdot_{p_{I}^{*} T I} ({(p_{I}^{*} \partial_{i})}_{, M} \cdot_{p_{M}^{*} T M} p_{M}^{*} Y) \\ = & {(\partial_{i} Φ)}_{, M} \cdot_{p_{M}^{*} T M} p_{M}^{*} Y \\ (since p_{I}^{*} \partial_{i} does not depend on M) . \end{matrix}

Recall that

{Id}_{M} = p_{M} \circ z

and that the pullback of bundles is contravariant. Then evaluating at

i = 0

via pullback by z renders

\begin{matrix} (ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M}) \cdot_{T M} Y \\ = & ({(Φ_{, M} \circ z)}^{*} v \cdot_{{(Φ_{, M} \circ z)}^{*} T E} z^{*} \partial_{i} Φ_{, M}) \cdot_{{(p_{M} \circ z)}^{*} T M} {(p_{M} \circ z)}^{*} Y \\ = & (z^{*} Φ_{, M}^{*} v \cdot_{z^{*} Φ_{, M}^{*} T E} z^{*} \partial_{i} Φ_{, M}) \cdot_{z^{*} p_{M}^{*} T M} z^{*} p_{M}^{*} Y \\ = & z^{*} ((Φ_{, M}^{*} v \cdot_{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M}) \cdot_{p_{M}^{*} T M} p_{M}^{*} Y) \\ = & z^{*} ({(\partial_{i} Φ)}_{, M} \cdot_{p_{M}^{*} T M} p_{M}^{*} Y) \\ = & z^{*} {(\partial_{i} Φ)}_{, M} \cdot_{z^{*} p_{M}^{*} T M} z^{*} p_{M}^{*} Y \\ = & {(z^{*} \partial_{i} Φ)}_{, M} \cdot_{{(p_{M} \circ z)}^{*} T M} {(p_{M} \circ z)}^{*} Y & (by Proposition 22) \\ = & {(δ_{i} Φ)}_{, M} \cdot_{T M} Y \\ = & \nabla^{ϕ^{*} T S} δ_{i} Φ \cdot_{T M} Y . \end{matrix}

The last equality is because

δ_{i} Φ \in Γ (ϕ^{*} T S)

, which is a bundle over M, and therefore

{(δ_{i} Φ)}_{, M}

is the total covariant derivative. Because Y is pointwise-arbitrary in

T M

, this implies that

ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} = \nabla^{ϕ^{*} T S} δ_{i} Φ

, i.e., the variational derivative

δ_{i}

commutes with the first material derivative, just as in the analogous situation in elementary calculus of variations. □

Corollary 4

(Euler–Lagrange equations). If

ϕ \in C^{\infty} (M, S)

is a critical point of

\mathcal{L}

(i.e., if

d \mathcal{L} (ϕ) \cdot A = 0

for all

A \in Γ (ϕ^{*} T S)

), then

\begin{matrix} ϕ_{, M}^{*} L_{, σ} - {div}_{M} (ϕ_{, M}^{*} L_{, v}) & = 0 & \text{on } M, \\ ϕ_{, M}^{*} L_{, v} \cdot_{T^{*} M} ν & = 0 & \text{on } \partial M . \end{matrix}

These are called the Euler–Lagrange equations for the energy functional

\mathcal{L}

. Recall that because the domain of ϕ is M,

\nabla^{_{○}} ϕ \equiv ϕ_{, M}

.

Proof.

This follows directly from Theorem 1 and the Fundamental Lemma of the Calculus of Variations ([15], p. 16). □

It should be noted that the boundary Euler–Lagrange equation is due to the fact that the admissible variations are entirely unrestricted. If, for example, the class of maps being considered had fixed boundary data, then any variation would vanish at the boundary, and there would be no boundary Euler–Lagrange equation; this is typically how geodesics and harmonic maps are formulated.

Remark 8

(Analogs in elementary calculus of variations). The quantities

L_{, μ}, L_{, σ}, L_{, v}

generalize the quantities

\frac{\partial L}{\partial x}, \frac{\partial L}{\partial z}, \frac{\partial L}{\partial p}

respectively of the elementary treatment of the calculus of variations for energy functional

(f : U \to R^{n}) \mapsto \int_{U} L (x, f (x), D f (x)) d x,

where

U \subset R^{m}

is compact and

U \times R^{n} \times R^{n \times m} ∋ (x, z, p) \mapsto L (x, z, p)

is the Lagrangian. Here,

\frac{\partial L}{\partial x} : U \times R^{n} \times R^{n \times m} \to R^{m},

\frac{\partial L}{\partial z} : U \times R^{n} \times R^{n \times m} \to R^{n},

and

\frac{\partial L}{\partial p} : U \times R^{n} \times R^{n \times m} \to R^{n \times m}

decompose the total derivative

d L

and are defined by the relation

d L (x, z, p) \cdot (u, v, w) = \frac{\partial L}{\partial x} (x, z, p) \cdot u + \frac{\partial L}{\partial z} (x, z, p) \cdot v + \frac{\partial L}{\partial p} (x, z, p) : w

for

u \in R^{m},

v \in R^{n},

and

w \in R^{n \times m}

. The Euler–Lagrange equation in this setting is

(\frac{\partial L}{\partial z} - {div}_{U} \frac{\partial L}{\partial p}) (x, f (x), D f (x)) = 0 \text{ for } x \in U,

noting that the left-hand side of the equation takes values in

R^{n}

.
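As a worked instance of the elementary Euler–Lagrange equation in this remark, consider a Lagrangian chosen here purely for illustration: L(x, z, p) = ½|p|² + W(z), where W : Rⁿ → R is a hypothetical smooth function and |p|² = p : p. The three partial derivatives and the resulting equation are:

```latex
\begin{aligned}
\frac{\partial L}{\partial x} &= 0, \qquad
\frac{\partial L}{\partial z} = \nabla W(z), \qquad
\frac{\partial L}{\partial p} = p, \\
0 &= \Bigl( \frac{\partial L}{\partial z}
      - \operatorname{div}_{U} \frac{\partial L}{\partial p} \Bigr)
      (x, f(x), Df(x))
   = \nabla W(f(x)) - \Delta f(x) .
\end{aligned}
```

With W = 0 this is Laplace's equation for each component of f.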

In most situations involving simpler calculations, it is desirable and acceptable to dispense with the highly decorated notation and use trimmed-down, context-dependent notation, leaving off type-specifying sub/superscripts when clear from context.

Proposition 25

(Conserved quantity). If M is a real interval,

ϕ \in C^{\infty} (M, S)

satisfies the Euler–Lagrange equation, and

L_{, μ} = 0

, then

H : = {(\nabla^{_{○}} ϕ)}^{*} L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{_{○}} ϕ - {(\nabla^{_{○}} ϕ)}^{*} L \in C^{\infty} (M, R)

is constant. If L is kinetic minus potential energy, then H is kinetic plus potential energy (the total energy), and is referred to as the Hamiltonian.

Proof.

Let t be the standard real coordinate. Note that because M is a real interval, it follows that

\nabla^{_{○}} ϕ = ϕ^{'} \otimes_{M} d t

. Terms appearing in the derivative of H can be simplified as follows. Note the repeated

\nabla^{_{○}}

derivatives;

\nabla^{_{○}} ϕ : M \to ϕ^{*} T S \otimes_{M} T^{*} M

but

\nabla^{_{○}} \nabla^{_{○}} ϕ : M \to {(\nabla^{_{○}} ϕ)}^{*} T (ϕ^{*} T S \otimes_{M} T^{*} M)

lands in a higher tangent space.

\begin{matrix} {(\nabla^{_{○}} ϕ)}^{*} σ \cdot \nabla^{_{○}} \nabla^{_{○}} ϕ \cdot \frac{d}{d t} & = {(\nabla^{_{○}} ϕ)}^{*} \nabla^{_{○}} π_{S} \cdot \nabla^{_{○}} \nabla^{_{○}} ϕ \cdot \frac{d}{d t} = \frac{d}{d t} (π_{S} \circ \nabla^{_{○}} ϕ) = ϕ^{'}, \\ {(\nabla^{_{○}} ϕ)}^{*} v \cdot \nabla^{_{○}} \nabla^{_{○}} ϕ \cdot \frac{d}{d t} & = {(\nabla^{_{○}} ϕ)}^{*} v \cdot \frac{d}{d t} \nabla^{_{○}} ϕ = \nabla_{\frac{d}{d t}} \nabla^{_{○}} ϕ, \\ \nabla_{\frac{d}{d t}} {(\nabla^{_{○}} ϕ)}^{*} L & = {(\nabla^{_{○}} ϕ)}^{*} \nabla L \cdot \nabla^{_{○}} \nabla^{_{○}} ϕ \cdot \frac{d}{d t} \\ = {(\nabla^{_{○}} ϕ)}^{*} L_{, σ} \cdot ϕ^{'} + {(\nabla^{_{○}} ϕ)}^{*} L_{, v} \cdot \nabla_{\frac{d}{d t}} \nabla^{_{○}} ϕ, \\ \nabla_{\frac{d}{d t}} ({(\nabla^{_{○}} ϕ)}^{*} L_{, v} : \nabla^{_{○}} ϕ) & = \nabla_{\frac{d}{d t}} {(\nabla^{_{○}} ϕ)}^{*} L_{, v} : (ϕ^{'} \otimes_{M} d t) + {(\nabla^{_{○}} ϕ)}^{*} L_{, v} : \nabla_{\frac{d}{d t}} \nabla^{_{○}} ϕ \\ = (\nabla_{\frac{d}{d t}} {(\nabla^{_{○}} ϕ)}^{*} L_{, v} \cdot d t) \cdot ϕ^{'} + {(\nabla^{_{○}} ϕ)}^{*} L_{, v} : \nabla_{\frac{d}{d t}} \nabla^{_{○}} ϕ . \end{matrix}

Again, because M is a real interval, the divergence is just the derivative, so the Euler–Lagrange equation is

\begin{matrix} 0 & = {(\nabla^{_{○}} ϕ)}^{*} L_{, σ} - {div}_{M} ({(\nabla^{_{○}} ϕ)}^{*} L_{, v}) \\ = {(\nabla^{_{○}} ϕ)}^{*} L_{, σ} - \nabla {(\nabla^{_{○}} ϕ)}^{*} L_{, v} : (d t \otimes_{M} \frac{d}{d t}) \\ = {(\nabla^{_{○}} ϕ)}^{*} L_{, σ} - \nabla_{\frac{d}{d t}} {(\nabla^{_{○}} ϕ)}^{*} L_{, v} \cdot d t, \end{matrix}

and therefore

\nabla_{\frac{d}{d t}} {(\nabla^{_{○}} ϕ)}^{*} L_{, v} \cdot d t = {(\nabla^{_{○}} ϕ)}^{*} L_{, σ}

. Thus

\nabla_{\frac{d}{d t}} H = \nabla_{\frac{d}{d t}} ({(\nabla^{_{○}} ϕ)}^{*} L_{, v} : \nabla^{_{○}} ϕ - {(\nabla^{_{○}} ϕ)}^{*} L) = (\nabla_{\frac{d}{d t}} {(\nabla^{_{○}} ϕ)}^{*} L_{, v} \cdot d t - {(\nabla^{_{○}} ϕ)}^{*} L_{, σ}) \cdot ϕ^{'}

which is zero because

ϕ

satisfies the Euler–Lagrange equation. This shows that H is constant along solutions of the Euler–Lagrange equation, and is therefore a conserved quantity. It should be noted that this proof relies on the fact that the divergence takes a particularly simple form when the domain M is a real interval; the result does not necessarily hold for a general choice of M. □
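To make the proposition concrete, the following sketch specializes it to classical mechanics. The assumptions are illustrative and not taken from the text above: S = Rⁿ with the Euclidean metric, M a real interval with coordinate t, a constant mass m > 0, a smooth potential V : Rⁿ → R, and the identification of the tangent map of ϕ with the velocity ϕ′ (so that covectors are also identified with vectors).

```latex
\begin{aligned}
L(z, p) &= \tfrac{1}{2} m |p|^{2} - V(z),
  \qquad L_{,\sigma} = -\nabla V(z), \quad L_{,\mu} = 0, \quad L_{,v} = m p, \\
0 &= L_{,\sigma} - \tfrac{d}{dt} L_{,v} = -\nabla V(\phi) - m \phi''
  \qquad \text{(Euler--Lagrange equation)}, \\
H &= m \phi' \cdot \phi' - \bigl( \tfrac{1}{2} m |\phi'|^{2} - V(\phi) \bigr)
   = \tfrac{1}{2} m |\phi'|^{2} + V(\phi), \\
\frac{dH}{dt} &= \bigl( m \phi'' + \nabla V(\phi) \bigr) \cdot \phi' = 0 .
\end{aligned}
```

Under these assumptions H is the kinetic plus potential energy, and its derivative vanishes exactly because of the Euler–Lagrange equation in the second line.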

Example 3

(Harmonic maps). Define a metric

k \in Γ (E^{*} \otimes_{S \times M} E^{*}) ≅ Γ ((T^{*} S \otimes T M) \otimes_{S \times M} (T^{*} S \otimes T M))

in a manner analogous to that in (2);

k : = h ⊠ g^{- 1} .

To clarify,

h \otimes g^{- 1} \in Γ ((T^{*} S \otimes_{S} T^{*} S) \otimes (T M \otimes_{M} T M))

, so permuting the middle two components (as in the definition of

h ⊠ g^{- 1}

) gives the correct type, including the necessary metric symmetry condition. If

A \in E

, then

{|A|}_{k}^{2}

is the quantity obtained by raising/lowering the indices of A and pairing it naturally with A. A useful fact is that

\nabla k = 0

; if

u \oplus v \in T S \oplus T M

, then permutation commutativity (2) and the product rule give

\nabla_{u \oplus v} k = \nabla_{u \oplus v} (h ⊠ g^{- 1}) = \nabla_{u} h ⊠ g^{- 1} + h ⊠ \nabla_{v} g^{- 1},

which equals zero because h and

g^{- 1}

are parallel with respect to

\nabla^{T S}

and

\nabla^{T^{*} M}

respectively.

With Lagrangian

L : E \to R, A \mapsto \frac{1}{2} {|A|}_{k}^{2}

and energy functional

\mathcal{E} (ϕ) : = \int_{M} L \circ \nabla^{_{○}} ϕ \, d V_{g}

(

\mathcal{E} (ϕ)

is called the energy of ϕ), the resulting Euler–Lagrange equations can be written down after calculating

L_{, σ}

and

L_{, v}

. It is worthwhile to note that L is a quadratic form

A \mapsto A : \frac{1}{2} k : A

on E, which will automatically imply that

L_{, v} (A) = A : k

. However, the calculation showing this will be carried out for demonstration purposes.

Let

A, B \in T S \otimes T^{*} M

. Then

ϵ \mapsto A + ϵ B

is a vertical variation of A, since

h \cdot δ_{ϵ} (A + ϵ B) = 0

, so

\begin{matrix} L_{, v} (A) : B & = L_{, v} (A) : v \cdot δ_{ϵ} (A + ϵ B) \\ = δ_{ϵ} (L (A + ϵ B)) \\ = δ_{ϵ} ((A + ϵ B) : \frac{1}{2} (π^{*} k (A + ϵ B)) : (A + ϵ B)) . \end{matrix}

The product rule gives three terms. The middle term is zero because

π (A + ϵ B) = π (A)

, and therefore does not depend on ϵ. The basepoint evaluation notation for

π^{*} k (A)

will be suppressed for brevity (see Section 3.5). Thus

L_{, v} (A) : B = B : \frac{1}{2} k : A + A : \frac{1}{2} k : B = A : k : B,

where the last equality results from the symmetry of k. By the nondegeneracy of the natural pairing on

T S \otimes T^{*} M

(which is denoted here by :), this implies that

L_{, v} (A) = A : k

.
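For readers who prefer indices, the same computation can be sketched in local coordinates. The coordinates x^i on M and y^α on S, and the component names below, are assumptions introduced here only for illustration; with A = A^α_i ∂_α ⊗ dx^i, the quadratic Lagrangian and its fiber derivative read:

```latex
\begin{aligned}
L(A) &= \tfrac{1}{2} \, h_{\alpha\beta} \, g^{ij} A^{\alpha}_{i} A^{\beta}_{j}, \\
\bigl( L_{,v}(A) \bigr)_{\alpha}^{\;i}
  &= \frac{\partial L}{\partial A^{\alpha}_{i}}
   = h_{\alpha\beta} \, g^{ij} A^{\beta}_{j}
   = (A : k)_{\alpha}^{\;i} .
\end{aligned}
```

The symmetry of h and of g⁻¹ is what combines the two product-rule terms into a single one, mirroring the symmetry of k used above.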

To calculate

L_{, σ}

, it is sufficient (and can be easier) to calculate

L_{, h}

, as

h = \nabla^{_{○}} π

,

π = π_{S} \times_{E} π_{M}

, so

h = σ \oplus_{E} μ

. Let

A (ϵ)

be a horizontal curve in

E = T S \otimes T^{*} M

; this means that

v \cdot \frac{d}{d ϵ} A = 0

. Recall that

v \cdot \frac{d}{d ϵ} A

is defined by

\nabla_{\frac{d}{d ϵ}}^{{(π \circ A)}^{*} E} A

. Then

\begin{matrix} L_{, h} (A) \cdot_{π^{*} (T S \oplus T M)} h \cdot_{T E} δ_{ϵ} A & = (L_{, h} \cdot_{π^{*} (T S \oplus T M)} h + L_{, v} \cdot_{π^{*} E} v) \cdot_{T E} δ_{ϵ} A \\ = \nabla L \cdot_{T E} δ_{ϵ} A \\ = δ_{ϵ} (L \circ A) \\ = δ_{ϵ} (A : \frac{1}{2} A^{*} π^{*} k : A) . \end{matrix}

As before, the product rule gives three terms. Using the contravariance of bundle pullback, the middle term is

\frac{1}{2} \nabla_{δ_{ϵ}}^{{(π \circ A)}^{*} (E^{*} \otimes_{S \times M} E^{*})} {(π \circ A)}^{*} k = \frac{1}{2} {(π \circ A)}^{*} \nabla^{E^{*} \otimes_{S \times M} E^{*}} k \cdot δ_{ϵ} (π \circ A),

which equals zero because

\nabla k = 0

. Thus

L_{, h} (A) \cdot h \cdot δ_{ϵ} A = \nabla_{δ_{ϵ}} A : \frac{1}{2} k : A + A : \frac{1}{2} k : \nabla_{δ_{ϵ}} A,

which equals zero because

\nabla_{δ_{ϵ}} A = v \cdot δ_{ϵ} A = 0

. The quantity

h \cdot δ_{ϵ} A

can take any value in

π^{*} (T S \oplus T M)

, showing that

L_{, h} = 0

. Finally,

h = σ \oplus_{E} μ

implies that

L_{, σ} = 0

and

L_{, μ} = 0

. This can be understood from the fact that L depends only on the fiber values of A, and has no explicit dependence on the basepoint; this relies crucially on the fact that

\nabla k = 0

.

Finally, the Euler–Lagrange equations can be written down. Recalling that the natural trace of a tensor (used in the divergence term in the Euler–Lagrange equation) is contraction with the appropriate identity tensor, let

(e_{i})

be a local frame for

T M

and let

(e^{i})

be its dual coframe, so that

e_{i} \otimes_{M} e^{i}

is a local expression for

I_{T M} \in Γ (T M \otimes_{M} T^{*} M)

. (It should be noted that while

I_{T M}

is being written as the local expression

e_{i} \otimes_{M} e^{i}

, no inherently local property is being used; this tensor decomposition is only used so that the product rule can be applied clearly in the following calculations.) The type-subscripted notation will be minimized except where it helps clarify. On M:

\begin{matrix} 0 & = {(\nabla^{_{○}} ϕ)}^{*} L_{, σ} - {div}_{M} ({(\nabla^{_{○}} ϕ)}^{*} L_{, v}) \\ = - tr \nabla (\nabla^{_{○}} ϕ : k) \\ = - \nabla_{e_{i}} (\nabla^{_{○}} ϕ : k) \cdot_{T^{*} M} e^{i} \\ = - \nabla_{e_{i}} \nabla^{_{○}} ϕ : k \cdot e^{i} - \nabla^{_{○}} ϕ : \nabla_{e_{i}} k \cdot e^{i} . \end{matrix}

The second term vanishes because

\nabla k = 0

. Unraveling the definition of k gives

\nabla_{e_{i}} \nabla^{_{○}} ϕ : k = ϕ^{*} h \cdot \nabla_{e_{i}} \nabla^{_{○}} ϕ \cdot g^{- 1}

. Contracting both sides of the above equation with

- ϕ^{*} h^{- 1}

gives

0 = \nabla_{e_{i}} \nabla^{_{○}} ϕ \cdot g^{- 1} \cdot e^{i} = \nabla_{e_{i}} \nabla^{_{○}} ϕ \cdot e_{i} = {tr}_{g} \nabla^{2} ϕ \in Γ (ϕ^{*} T S) .

The quantity

{tr}_{g} \nabla^{2} ϕ

is the g-trace of the covariant Hessian of ϕ and can rightfully be called the covariant Laplacian of ϕ and denoted by

Δ_{g} ϕ

(this is also referred to as the tension field of ϕ in other literature ([14], p. 13), which is denoted

τ (ϕ)

). Note that

Δ_{g} ϕ

is a vector field along ϕ. This makes sense because ϕ is not necessarily a scalar function; it takes values in S. In the case

S = R

,

Δ_{g} ϕ

is the ordinary covariant Laplacian on scalar functions.
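For comparison with the classical literature, the covariant Laplacian can be written out in local coordinates. The following is only a sketch: the coordinates (x^i) on M and (y^α) on S, the components ϕ^γ := y^γ ∘ ϕ, and the Christoffel symbols of the Levi-Civita connections of g and h (written with left superscripts g and h) are notation assumed here for illustration and are not used elsewhere in this paper.

```latex
% Components of the covariant Laplacian (tension field) of \phi : M \to S
% in assumed local coordinates (x^i) on M and (y^\alpha) on S.
(\Delta_g \phi)^{\gamma}
  = g^{ij} \left(
      \frac{\partial^2 \phi^{\gamma}}{\partial x^i \, \partial x^j}
      - {}^{g}\Gamma^{k}_{ij} \, \frac{\partial \phi^{\gamma}}{\partial x^k}
      + {}^{h}\Gamma^{\gamma}_{\alpha\beta}(\phi) \,
        \frac{\partial \phi^{\alpha}}{\partial x^i} \,
        \frac{\partial \phi^{\beta}}{\partial x^j}
    \right)
```

When S is the real line with its standard metric, the last term drops out and the expression reduces to the Laplace–Beltrami operator on scalar functions.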

A harmonic map is defined as a critical point of the energy functional

E (ϕ) : = \int_{M} \frac{1}{2} {|\nabla^{_{○}} ϕ|}_{k}^{2} d V_{M} .

Assuming a fixed boundary (so that the variations vanish on the boundary) eliminates the boundary Euler–Lagrange equation; the remaining equation is

Δ_{g} ϕ = 0 \text{ on the interior of } M,

which is the generalization of Laplace’s equation. Satisfying Laplace’s equation is a sufficient condition for a map to be a critical point of the energy functional. There is an abundance of literature concerning harmonic maps and the analysis thereof [14,15,19,20].

Example 4

(The geodesic equation). A fundamental problem in differential geometry is determining length-minimizing curves between given points. If M is a bounded, real interval, and t denotes the standard real coordinate, then the length functional on curves

ϕ : M \to S

is

L (ϕ) : = \int_{M} {|ϕ^{'}|}_{h} d t

. A topological metric

d : S \times S \to R

on S can be defined as

d (p, q) : = \inf \{L (ϕ) ∣ ϕ \text{ joins } p \text{ to } q\} .

It can be shown that the length functional

L (ϕ) : = \int_{M} {|ϕ^{'}|}_{h} d t

and the energy functional

E (ϕ) : = \int_{M} \frac{1}{2} {|ϕ^{'}|}_{h}^{2} d t

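have the same minimizers once curves are parametrized with constant speed (the length is invariant under reparametrization, whereas the energy is not). Note that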

ϕ^{'} \in Γ (ϕ^{*} T S)

. It is therefore sufficient to consider the analytically preferable energy functional.
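The relationship between the two functionals can be made explicit with the Cauchy–Schwarz inequality. The following sketch assumes that M is the interval [a, b] with a < b (the endpoint names are introduced here only for illustration) and writes the length and energy functionals with calligraphic letters.

```latex
% Cauchy–Schwarz applied to the product |\phi'|_h \cdot 1 on M = [a, b]:
\mathcal{L}(\phi)^2
  = \left( \int_a^b |\phi'|_h \cdot 1 \, dt \right)^2
  \le \left( \int_a^b |\phi'|_h^2 \, dt \right) \left( \int_a^b 1 \, dt \right)
  = 2 (b - a) \, \mathcal{E}(\phi),
% with equality if and only if |\phi'|_h is constant on [a, b].
```

If ϕ is a constant-speed length minimizer, then any competitor ψ with the same endpoints satisfies 2(b − a) E(ψ) ≥ L(ψ)² ≥ L(ϕ)² = 2(b − a) E(ϕ), so ϕ also minimizes the energy. Conversely, an energy minimizer must have constant speed and minimal length, since otherwise a constant-speed reparametrization or a shorter curve would have strictly smaller energy.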

In this case, the metric g on M is just scalar multiplication on

R

. Because M is one-dimensional and t is the standard real coordinate,

\frac{d}{d t}

is a global, parallel orthonormal frame for

T M

, and the g-trace of

\nabla^{2} ϕ

(i.e.,

Δ_{g} ϕ

) has a single term. The Euler–Lagrange equation, on the interior of M, is

0 = Δ_{g} ϕ = {tr}_{g} \nabla^{2} ϕ = \nabla \nabla^{_{○}} ϕ : (\frac{d}{d t}, \frac{d}{d t}) = \nabla_{\frac{d}{d t}} \nabla^{_{○}} ϕ \cdot \frac{d}{d t} = \nabla_{\frac{d}{d t}} (\nabla^{_{○}} ϕ \cdot \frac{d}{d t}) - \nabla^{_{○}} ϕ \cdot \nabla_{\frac{d}{d t}} \frac{d}{d t} .

However,

\nabla^{_{○}} ϕ \cdot \frac{d}{d t} = ϕ^{'}

and

\nabla_{\frac{d}{d t}} \frac{d}{d t} = 0

, giving the geodesic equation

\nabla_{\frac{d}{d t}}^{ϕ^{*} T S} ϕ^{'} = 0 \text{ on the interior of } M .

This is the covariant way to state that the acceleration of ϕ is identically zero. The geodesic equation is commonly notated as

0 = \nabla_{ϕ^{'}} ϕ^{'}

, though such notation is inaccurate because

ϕ^{'}

is not a vector field on S, but a vector field along ϕ, and therefore use of the pullback covariant derivative

\nabla^{ϕ^{*} T S}

is correct (see (5)).
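In local coordinates, the geodesic equation takes its familiar form. This is a sketch under assumed notation: (y^α) are local coordinates on S, ϕ^γ := y^γ ∘ ϕ, dots denote derivatives with respect to t, and Γ^γ_{αβ} are the Christoffel symbols of the Levi-Civita connection of h; none of these names are used elsewhere in this paper.

```latex
% Pullback covariant derivative of \phi' = \dot\phi^{\gamma} (\partial_{\gamma} \circ \phi),
% written in assumed local coordinates (y^\alpha) on S:
\nabla^{\phi^* TS}_{\frac{d}{dt}} \phi'
  = \left( \ddot\phi^{\gamma}
      + \Gamma^{\gamma}_{\alpha\beta}(\phi) \, \dot\phi^{\alpha} \, \dot\phi^{\beta}
    \right) \left( \frac{\partial}{\partial y^{\gamma}} \circ \phi \right)
  = 0
```

The Christoffel symbols are evaluated along ϕ, which reflects the fact that the derivative is taken in the pullback bundle rather than on S.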

While formulated using fixed boundary conditions (ϕ has p and q as its endpoints), the geodesic equation is a second-order ODE for which initial position and tangent vector conditions are sufficient to uniquely determine a solution.

4.3. Second Variation

A further consideration after finding critical points of the energy functional

L

is determining which critical points are extrema. This will involve calculating the second derivative of

L

. Let

C : = C^{\infty} (M, S)

, noting that

T_{ϕ} C ≅ Γ (ϕ^{*} T S)

for

ϕ \in C

. The first derivative of

L

is

\nabla^{C \to R} L : = d L

, as seen in the previous section. The second derivative is the covariant Hessian

\nabla^{T^{*} C} \nabla^{C \to R} L

, where the covariant derivative

\nabla^{T^{*} C}

is induced by

\nabla^{T S}

([18], Theorem 5.4).

For the remainder of this section, let

I, J \subseteq R

be neighborhoods of zero, let i and j be their respective standard coordinates, and extend the existing

δ

-style derivative-at-a-point notation by defining

δ_{i} : = \frac{\partial}{\partial i} ∣_{i = j = 0}

,

δ_{j} : = \frac{\partial}{\partial j} ∣_{i = j = 0}

, and the evaluation map

z : M \to M \times I \times J, m \mapsto (m, 0, 0)

. Then

δ_{i} = z^{*} \partial_{i}

and

δ_{j} = z^{*} \partial_{j}

; these will be used as in the calculation of the first variation.

Theorem 2

(Second variation of

L

). Let

L

, L, σ, μ, v and ν all be defined as above. If

ϕ \in C^{\infty} (M, S)

is a critical point of

L

and

A, B \in T_{ϕ} C ≅ Γ (ϕ^{*} T S)

, then the covariant Hessian of

L

is

\begin{matrix} \nabla^{2} L (ϕ) :_{T_{ϕ} C} (A \otimes B) \\ = & \int_{M} A \cdot_{ϕ^{*} T^{*} S} ϕ_{, M}^{*} L_{, σ σ} \cdot_{ϕ^{*} T S} B \\ + A \cdot_{ϕ^{*} T^{*} S} ϕ_{, M}^{*} L_{, σ v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ + \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} ϕ_{, M}^{*} L_{, v σ} \cdot_{ϕ^{*} T S} B \\ + \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} ϕ_{, M}^{*} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ - A \cdot_{ϕ^{*} T^{*} S} (ϕ_{, M}^{*} L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} ϕ_{, M})) \cdot_{ϕ^{*} T S} B d V_{g} . \end{matrix}

This is often called the second variation of

L

. Here,

R^{T S} \in Γ (T S \otimes_{S} T^{*} S \otimes_{S} T^{*} S \otimes_{S} T^{*} S)

denotes the Riemannian curvature endomorphism tensor for the Levi-Civita connection on

T S

.

Proof.

Let

Φ : M \times I \times J \to S

be a two-parameter variation such that

δ_{i} Φ = A

and

δ_{j} Φ = B

(e.g.,

Φ (m, i, j) : = exp (i A (m) + j B (m))

). The variation

Φ

can be naturally identified with a variation

\bar{Φ} : I \times J \to C, (i, j) \mapsto (m \mapsto Φ (m, i, j))

which is more conducive to the use of C as a manifold. The tensor products in the generally infinite-dimensional

T C

are taken formally. Let

\bar{z} : = (0, 0) \in I \times J

.

By (20), taking the algebra formally in the case of infinite-rank vector bundles,

\nabla^{2} (L \circ \bar{Φ}) = {\bar{Φ}}^{*} \nabla^{2} L :_{{\bar{Φ}}^{*} T C} (\nabla^{_{○}} \bar{Φ} ⊠_{I \times J} \nabla^{_{○}} \bar{Φ}) + {\bar{Φ}}^{*} \nabla L \cdot_{{\bar{Φ}}^{*} T C} \nabla \nabla^{_{○}} \bar{Φ},

so

\begin{matrix} (\nabla^{2} L \circ_{C} ϕ) :_{T_{ϕ} C} (A \otimes B) \\ = & (\nabla^{2} L \circ_{C} ϕ) :_{T_{ϕ} C} (δ_{i} \bar{Φ} \otimes δ_{j} \bar{Φ}) + (\nabla L \circ_{C} ϕ) \cdot_{T_{ϕ} C} \nabla_{δ_{j}}^{{\bar{Φ}}^{*} T C} \partial_{i} \bar{Φ} \\ (since \nabla L \circ_{C} ϕ = 0) \\ = & (\nabla^{2} L \circ_{C} \bar{Φ} \circ_{I \times J} \bar{z}) :_{{\bar{z}}^{*} {\bar{Φ}}^{*} T C} (δ_{i} \bar{Φ} \otimes δ_{j} \bar{Φ}) + (\nabla L \circ_{C} \bar{Φ} \circ_{I \times J} \bar{z}) \cdot_{{\bar{z}}^{*} {\bar{Φ}}^{*} T C} \nabla_{δ_{j}}^{{\bar{Φ}}^{*} T C} \partial_{i} \bar{Φ} \\ = & \nabla^{2} (L \circ_{C} \bar{Φ}) :_{T I \oplus T J} (δ_{i} \otimes_{I \times J} δ_{j}) \\ (by above) \\ = & δ_{j} \partial_{i} (L \circ_{C} \bar{Φ}) \\ = & \int_{M} δ_{j} \partial_{i} (L \circ Φ_{, M}) d V_{g} \\ = & \int_{M} \nabla^{2} (L \circ Φ_{, M}) :_{T M \oplus T I \oplus T J} (δ_{i} \otimes_{M \times I \times J} δ_{j}) d V_{g} \\ = & \int_{M} A \cdot_{ϕ^{*} T^{*} S} ϕ_{, M}^{*} L_{, σ σ} \cdot_{ϕ^{*} T S} B + A \cdot_{ϕ^{*} T^{*} S} ϕ_{, M}^{*} L_{, σ v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ + \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} ϕ_{, M}^{*} L_{, v σ} \cdot_{ϕ^{*} T S} B \\ + \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} ϕ_{, M}^{*} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ - A \cdot_{ϕ^{*} T^{*} S} (ϕ_{, M}^{*} L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} ϕ_{, M})) \cdot_{ϕ^{*} T S} B d V_{g} \\ (by Calculation (1)) . \end{matrix}

Supporting calculations follow.

Calculation (1): Abbreviate

ϕ_{, M}^{*} L_{, x y}

by

L_{, x y}

. By (20),

\begin{matrix} \nabla^{2} (L \circ Φ_{, M}) :_{T M \oplus T I \oplus T J} (δ_{i} \otimes_{M \times I \times J} δ_{j}) \\ = & ([Φ_{, M}^{*} \nabla^{2} L :_{Φ_{, M}^{*} T E} (\nabla^{_{○}} Φ_{, M} ⊠_{M \times I \times J} \nabla^{_{○}} Φ_{, M}) + Φ_{, M}^{*} \nabla L \cdot_{Φ_{, M}^{*} T E} \nabla \nabla^{_{○}} Φ_{, M}] \circ z) \\ :_{z^{*} (T M \oplus T I \oplus T J)} (δ_{i} \otimes_{M} δ_{j}) \\ = & z^{*} Φ_{, M}^{*} \nabla^{2} L :_{z^{*} Φ_{, M}^{*} T E} (δ_{i} Φ_{, M} \otimes_{M} δ_{j} Φ_{, M}) + z^{*} Φ_{, M}^{*} \nabla L \cdot_{z^{*} Φ_{, M}^{*} T E} \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} \\ (by Calculation (2)) \\ = & L_{, σ σ} :_{ϕ_{, M}^{*} π_{S}^{*} T S} (δ_{i} Φ \otimes_{M} δ_{j} Φ) + L_{, σ v} \cdot_{ϕ_{, M}^{*} π_{S}^{*} T S \otimes_{M} ϕ_{, M}^{*} π^{*} E} (δ_{i} Φ \otimes_{M} \nabla^{ϕ^{*} T S} δ_{j} Φ) \\ + L_{, v σ} \cdot_{ϕ_{, M}^{*} π^{*} E \otimes_{M} ϕ_{, M}^{*} π_{S}^{*} T S} (\nabla^{ϕ^{*} T S} δ_{i} Φ \otimes_{M} δ_{j} Φ) + L_{, v v} :_{ϕ_{, M}^{*} π^{*} E} (\nabla^{ϕ^{*} T S} δ_{i} Φ \otimes_{M} \nabla^{ϕ^{*} T S} δ_{j} Φ) \\ + L_{, σ} \cdot_{ϕ_{, M}^{*} π_{S}^{*} T S} \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ + L_{, v} \cdot_{ϕ_{, M}^{*} π^{*} E} \nabla^{ϕ^{*} T S} \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ \\ + L_{, v} \cdot_{ϕ_{, M}^{*} π^{*} E} ((I_{ϕ^{*} T S} \otimes_{M} δ_{i} Φ) \cdot_{ϕ^{*} T S \otimes_{M} ϕ^{*} T^{*} S} (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} δ_{j} Φ) \cdot_{ϕ^{*} T S} ϕ_{, M}) \\ (by Calculation (3)) . \end{matrix}

Note that

\nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ \in Γ (ϕ^{*} T S)

, and since

ϕ

is a critical point of

L

,

\int_{M} L_{, σ} \cdot_{ϕ_{, M}^{*} π_{S}^{*} T S} \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ + L_{, v} \cdot_{ϕ_{, M}^{*} π^{*} E} \nabla^{ϕ^{*} T S} \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ d V_{g} = 0 .

Thus

\begin{matrix} \int_{M} \nabla^{2} (L \circ Φ_{, M}) :_{T M \oplus T I \oplus T J} (δ_{i} \otimes_{M \times I \times J} δ_{j}) d V_{g} \\ = & \int_{M} L_{, σ σ} :_{ϕ_{, M}^{*} π_{S}^{*} T S} (δ_{i} Φ \otimes_{M} δ_{j} Φ) + L_{, σ v} \cdot_{ϕ_{, M}^{*} π_{S}^{*} T S \otimes_{M} ϕ_{, M}^{*} π^{*} E} (δ_{i} Φ \otimes_{M} \nabla^{ϕ^{*} T S} δ_{j} Φ) \\ + L_{, v σ} \cdot_{ϕ_{, M}^{*} π^{*} E \otimes_{M} ϕ_{, M}^{*} π_{S}^{*} T S} (\nabla^{ϕ^{*} T S} δ_{i} Φ \otimes_{M} δ_{j} Φ) + L_{, v v} :_{ϕ_{, M}^{*} π^{*} E} (\nabla^{ϕ^{*} T S} δ_{i} Φ \otimes_{M} \nabla^{ϕ^{*} T S} δ_{j} Φ) \\ + δ_{i} Φ \cdot_{ϕ^{*} T^{*} S} (L_{, v} \cdot_{ϕ_{, M}^{*} π^{*} E} ((ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} δ_{j} Φ) \cdot_{ϕ^{*} T S} ϕ_{, M})) d V_{g} \\ = & \int_{M} A \cdot_{ϕ^{*} T^{*} S} L_{, σ σ} \cdot_{ϕ^{*} T S} B + A \cdot_{ϕ^{*} T^{*} S} L_{, σ v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ + \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} L_{, v σ} \cdot_{ϕ^{*} T S} B + \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ - A \cdot_{ϕ^{*} T^{*} S} (L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} ϕ_{, M})) \cdot_{ϕ^{*} T S} B d V_{g} \\ (by antisymmetry of curvature tensor) . \end{matrix}

Calculation (2):

\begin{matrix} z^{*} \nabla \nabla^{_{○}} Φ_{, M} :_{z^{*} (T M \oplus T I \oplus T J)} (δ_{i} \otimes_{M} δ_{j}) \\ = & z^{*} \nabla^{Φ_{, M}^{*} T E \otimes_{M \times I \times J} (T^{*} M \oplus T^{*} I \oplus T^{*} J)} \nabla^{_{○}} Φ_{, M} :_{z^{*} (T M \oplus T I \oplus T J)} z^{*} (\partial_{i} \otimes_{M \times I \times J} \partial_{j}) \\ = & z^{*} (\nabla^{Φ_{, M}^{*} T E \otimes_{M \times I \times J} (T^{*} M \oplus T^{*} I \oplus T^{*} J)} \nabla^{_{○}} Φ_{, M} :_{T M \oplus T I \oplus T J} (\partial_{i} \otimes_{M \times I \times J} \partial_{j})) \\ = & z^{*} (\nabla_{\partial_{j}}^{Φ_{, M}^{*} T E \otimes_{M \times I \times J} (T^{*} M \oplus T^{*} I \oplus T^{*} J)} \nabla^{_{○}} Φ_{, M} \cdot_{T M \oplus T I \oplus T J} \partial_{i}) \\ = & z^{*} \nabla_{\partial_{j}}^{Φ_{, M}^{*} T E} (\nabla^{_{○}} Φ_{, M} \cdot_{T M \oplus T I \oplus T J} \partial_{i}) (since \nabla_{\partial_{j}}^{T M \oplus T I \oplus T J} \partial_{i} = 0) \\ = & \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} . \end{matrix}

Calculation (3): As calculated in the proof of (1),

\begin{matrix} ϕ_{, M}^{*} σ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = δ_{i} Φ \in Γ (ϕ^{*} T S), \\ ϕ_{, M}^{*} μ \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = 0 \in Γ (T M), \\ ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} δ_{i} Φ_{, M} & = \nabla^{ϕ^{*} T S} δ_{i} Φ \in Γ (ϕ^{*} T S \otimes_{M} T^{*} M) . \end{matrix}

Furthermore, letting

P : = {pr}_{M}^{M \times I \times J}

for brevity and noting that

P \circ z = {Id}_{M}

,

\begin{matrix} ϕ_{, M}^{*} σ \cdot_{ϕ_{, M}^{*} T E} \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} \\ = & z^{*} \nabla_{\partial_{j}}^{Φ_{, M}^{*} π_{S}^{*} T S} (Φ_{, M}^{*} σ \cdot_{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M}) & (since \nabla σ = 0) \\ = & z^{*} \nabla_{\partial_{j}}^{{(π_{S} \circ Φ_{, M})}^{*} T S} \partial_{i} Φ & (using calculation from (1)) \\ = & \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ \in Γ (z^{*} Φ^{*} T S) ≅ Γ (ϕ^{*} T S), \end{matrix}

\begin{matrix} ϕ_{, M}^{*} μ \cdot_{ϕ_{, M}^{*} T E} \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} \\ = & z^{*} \nabla_{\partial_{j}}^{Φ_{, M}^{*} π_{M}^{*} T M} (Φ_{, M}^{*} μ \cdot_{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M}) & (since \nabla μ = 0) \\ = & z^{*} \nabla_{\partial_{j}}^{{(π_{M} \circ Φ_{, M})}^{*} T M} 0 & (using calculation from (1)) \\ = & 0 \in Γ (z^{*} {(π_{M} \circ Φ_{, M})}^{*} T M) ≅ Γ (z^{*} P^{*} T M) ≅ Γ (T M), \end{matrix}

\begin{matrix} ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} \\ = & z^{*} \nabla_{\partial_{j}}^{Φ_{, M}^{*} π^{*} E} (Φ_{, M}^{*} v \cdot_{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M}) & (since \nabla v = 0) \\ = & z^{*} \nabla_{\partial_{j}}^{{(π \circ Φ_{, M})}^{*} E} {(\partial_{i} Φ)}_{, M} & (using calculation from (1)) . \end{matrix}

Note that

ϕ_{, M}^{*} v \in Γ (ϕ_{, M}^{*} π^{*} E \otimes_{M} ϕ_{, M}^{*} T^{*} E) ≅ Γ ({(ϕ \times_{M} {Id}_{M})}^{*} E \otimes_{M} ϕ_{, M}^{*} T^{*} E),

and therefore

ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} \in Γ ({(ϕ \times_{M} {Id}_{M})}^{*} E) ≅ Γ (ϕ^{*} T S \otimes_{M} T^{*} M),

so it suffices to examine its natural pairing with

T M

elements. Let

X \in Γ (T M)

, noting that

X = {Id}_{M}^{*} X = z^{*} P^{*} X

and that

P^{*} X = T P \cdot (X \oplus 0_{T I} \oplus 0_{T J}) \in Γ (P^{*} T M)

. Then

\begin{matrix} (ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M}) \cdot_{T M} X \\ = & z^{*} \nabla_{\partial_{j}}^{Φ^{*} T S \otimes_{M \times I \times J} P^{*} T^{*} M} {(\partial_{i} Φ)}_{, M} \cdot_{z^{*} P^{*} T M} z^{*} P^{*} X \\ = & z^{*} \nabla_{\partial_{j}}^{Φ^{*} T S} ({(\partial_{i} Φ)}_{, M} \cdot_{P^{*} T M} \nabla^{_{○}} P \cdot_{T M \oplus T I \oplus T J} (X \oplus 0_{T I} \oplus 0_{T J})) \\ - z^{*} ({(\partial_{i} Φ)}_{, M} \cdot \nabla_{\partial_{j}}^{P^{*} T M} (\nabla^{_{○}} P \cdot_{T M \oplus T I \oplus T J} (X \oplus 0_{T I} \oplus 0_{T J}))) \\ = & z^{*} (\nabla_{\partial_{j}}^{Φ^{*} T S} \nabla_{X \oplus 0_{T I} \oplus 0_{T J}}^{Φ^{*} T S} \partial_{i} Φ - {(\partial_{i} Φ)}_{, M} \cdot 0_{P^{*} T M}) \\ = & z^{*} (\nabla_{X \oplus 0_{T I} \oplus 0_{T J}}^{Φ^{*} T S} \nabla_{\partial_{j}}^{Φ^{*} T S} \partial_{i} Φ + \nabla_{[\partial_{j}, X \oplus 0_{T I} \oplus 0_{T J}]}^{Φ^{*} T S} \partial_{i} Φ - R^{Φ^{*} T S} (\partial_{j}, X \oplus 0_{T I} \oplus 0_{T J}) \partial_{i} Φ) \\ = & z^{*} (\nabla_{X \oplus 0_{T I} \oplus 0_{T J}}^{Φ^{*} T S} \nabla_{\partial_{j}}^{Φ^{*} T S} \partial_{i} Φ + \nabla_{0}^{Φ^{*} T S} \partial_{i} Φ) \\ - z^{*} ((I_{Φ^{*} T S} \otimes_{M \times I \times J} \partial_{i} Φ) \cdot_{Φ^{*} T S \otimes_{M \times I \times J} Φ^{*} T^{*} S} R^{Φ^{*} T S} :_{T M \oplus T I \oplus T J} (\partial_{j} \otimes_{M \times I \times J} (X \oplus 0_{T I} \oplus 0_{T J}))) \\ = & [\nabla^{ϕ^{*} T S} \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ + (I_{ϕ^{*} T S} \otimes_{M} δ_{i} Φ) \cdot_{ϕ^{*} T S \otimes_{M} ϕ^{*} T^{*} S} (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} δ_{j} Φ) \cdot_{ϕ^{*} T S} ϕ_{, M}] \cdot_{T M} X, \end{matrix}

where the last equality follows from Calculations (4) and (5). Because X is pointwise-arbitrary in

T M

, this shows that

\begin{matrix} ϕ_{, M}^{*} v \cdot_{ϕ_{, M}^{*} T E} \nabla_{δ_{j}}^{Φ_{, M}^{*} T E} \partial_{i} Φ_{, M} \\ = & {(\nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ)}_{, M} + (I_{ϕ^{*} T S} \otimes_{M} δ_{i} Φ) \cdot_{ϕ^{*} T S \otimes_{M} ϕ^{*} T^{*} S} (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} δ_{j} Φ) \cdot_{ϕ^{*} T S} ϕ_{, M} . \end{matrix}

Calculation (4):

\begin{matrix} z^{*} (\nabla_{X \oplus 0_{T I} \oplus 0_{T J}}^{Φ^{*} T S} \nabla_{\partial_{j}}^{Φ^{*} T S} \partial_{i} Φ + \nabla_{0}^{Φ^{*} T S} \partial_{i} Φ) \\ = & z^{*} {(\nabla_{\partial_{j}}^{Φ^{*} T S} \partial_{i} Φ)}_{, M} \cdot_{z^{*} P^{*} T M} z^{*} P^{*} X \\ = & {(\nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ)}_{, M} \cdot_{T M} X (by (22)) \\ = & \nabla^{ϕ^{*} T S} \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ \cdot_{T M} X (because \nabla_{δ_{j}}^{Φ^{*} T S} \partial_{i} Φ \in Γ (z^{*} Φ^{*} T S) ≅ Γ (ϕ^{*} T S)) . \end{matrix}

Calculation (5):

\begin{matrix} - z^{*} (R^{Φ^{*} T S} :_{T M \oplus T I \oplus T J} (\partial_{j} \otimes_{M \times I \times J} (X \oplus 0_{T I} \oplus 0_{T J}))) \\ = & z^{*} (R^{Φ^{*} T S} :_{T M \oplus T I \oplus T J} ((X \oplus 0_{T I} \oplus 0_{T J}) \otimes_{M \times I \times J} \partial_{j})) \\ (antisymmetry of R^{Φ^{*} T S}) \\ = & z^{*} (Φ^{*} R^{T S} :_{Φ^{*} T S} (\nabla^{_{○}} Φ ⊠_{M \times I \times J} \nabla^{_{○}} Φ) :_{T M \oplus T I \oplus T J} ((X \oplus 0_{T I} \oplus 0_{T J}) \otimes_{M \times I \times J} \partial_{j})) \\ (by (21)) \\ = & z^{*} (Φ^{*} R^{T S} :_{Φ^{*} T S} ((Φ_{, M} \cdot_{P^{*} T M} P^{*} X) \otimes_{M \times I \times J} \partial_{j} Φ)) \\ = & z^{*} ((Φ^{*} R^{T S} \cdot_{Φ^{*} T S} \partial_{j} Φ) \cdot_{Φ^{*} T S} Φ_{, M} \cdot_{P^{*} T M} P^{*} X) \\ = & (z^{*} Φ^{*} R^{T S} \cdot_{z^{*} Φ^{*} T S} z^{*} \partial_{j} Φ) \cdot_{z^{*} Φ^{*} T S} z^{*} Φ_{, M} \cdot_{z^{*} P^{*} T M} z^{*} P^{*} X \\ = & (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} δ_{j} Φ) \cdot_{ϕ^{*} T S} ϕ_{, M} \cdot_{T M} X . \end{matrix}

□
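As a consistency check, Theorem 2 can be compared with the classical second variation formula for harmonic maps. The sketch below takes the harmonic map Lagrangian, for which the mixed and σσ second derivatives are expected to vanish and the vv second derivative is k, and it is written in classical rather than strongly typed notation. The following are assumptions of this sketch: (e_i) is a local g-orthonormal frame on M, the angle brackets denote the fiber metrics induced by g and h, dϕ denotes the total derivative of ϕ, and the curvature sign convention is the one stated in the comment. The agreement of that convention with the index ordering of the curvature tensor used in Theorem 2 has not been verified here.

```latex
% Classical second variation of the energy at a harmonic map \phi,
% for A, B \in \Gamma(\phi^* TS); (e_i) is a local g-orthonormal frame on M.
% Assumed convention:
%   R^S(X, Y) Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X, Y]} Z.
\nabla^2 \mathcal{E}(\phi) : (A \otimes B)
  = \int_M \Big(
      \big\langle \nabla^{\phi^* TS} A, \, \nabla^{\phi^* TS} B \big\rangle
      - \sum_i \big\langle R^S(A, \, d\phi \cdot e_i) \, d\phi \cdot e_i, \; B \big\rangle
    \Big) \, dV_g
```

The first term corresponds to the vv term of Theorem 2 and the second to its curvature term. In particular, when S has nonpositive sectional curvature, the right-hand side is nonnegative for A = B.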

Theorem 3

(Second variation of

L

(alternate form)). Let

L

, L, σ, μ, v and ν all be defined as above. If

ϕ \in C^{\infty} (M, S)

is a critical point of

L

and

A, B \in Γ (ϕ^{*} T S)

, then

\begin{matrix} \nabla^{2} L (ϕ) :_{T_{ϕ} C} (A \otimes B) \\ = & \int_{M} A \cdot_{ϕ^{*} T^{*} S} L_{, σ σ} \cdot_{ϕ^{*} T S} B + A \cdot_{ϕ^{*} T^{*} S} L_{, σ v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ - A \cdot_{ϕ^{*} T^{*} S} {div}_{M} L_{, v σ} \cdot_{ϕ^{*} T S} B - A \cdot_{ϕ^{*} T^{*} S} L_{, v σ} \cdot_{T^{*} M \otimes_{M} ϕ^{*} T S} {(\nabla^{ϕ^{*} T S} B)}^{(1 2)} \\ - A \cdot_{ϕ^{*} T^{*} S} {div}_{M} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ - A \cdot_{ϕ^{*} T^{*} S} L_{, v v} \cdot_{T^{*} M \otimes_{M} ϕ^{*} T S \otimes_{M} T^{*} M} {(\nabla^{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B)}^{(1 2 3)} \\ - A \cdot_{ϕ^{*} T^{*} S} (L_{, v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} (ϕ^{*} R^{T S} \cdot_{ϕ^{*} T S} ϕ_{, M})) \cdot_{ϕ^{*} T S} B d V_{g} \\ + \int_{\partial M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v σ} \cdot_{ϕ^{*} T S} B) \cdot_{T^{*} M} ν + (A \cdot_{ϕ^{*} T^{*} S} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B) \cdot_{T^{*} M} ν d {\bar{V}}_{g} \end{matrix}

Proof.

This result follows essentially from (2) via several instances of integration by parts to express the integrand(s) entirely in terms of A and not its covariant derivatives. Abbreviate

ϕ_{, M}^{*} L_{, x y}

by

L_{, x y}

. Then, integrating by parts allows the covariant derivatives of A to be flipped across the natural pairings over

ϕ^{*} T S

.

\begin{matrix} \int_{M} \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} L_{, v σ} \cdot_{ϕ^{*} T S} B d V_{g} \\ = & \int_{M} {tr}_{T M} ({(\nabla^{ϕ^{*} T S} A)}^{(1 2)} \cdot_{ϕ^{*} T^{*} S} L_{, v σ} \cdot_{ϕ^{*} T S} B) d V_{g} & (T M trace is taken separately) \\ = & \int_{M} {tr}_{T M} (\nabla^{T M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v σ} \cdot_{ϕ^{*} T S} B)) \\ - {tr}_{T M} (A \cdot_{ϕ^{*} T^{*} S} \nabla^{ϕ^{*} T^{*} S \otimes_{M} T M \otimes_{M} ϕ^{*} T^{*} S} L_{, v σ} \cdot_{ϕ^{*} T S} B) \\ - {tr}_{T M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v σ} \cdot_{ϕ^{*} T S} \nabla^{ϕ^{*} T S} B) d V_{g} & (reverse product rule) \\ = & \int_{M} - A \cdot_{ϕ^{*} T^{*} S} {div}_{M} L_{, v σ} \cdot_{ϕ^{*} T S} B & (definition of divergence) \\ - A \cdot_{ϕ^{*} T^{*} S} L_{, v σ} \cdot_{T^{*} M \otimes_{M} ϕ^{*} T S} {(\nabla^{ϕ^{*} T S} B)}^{(1 2)} d V_{g} \\ + \int_{\partial M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v σ} \cdot_{ϕ^{*} T S} B) \cdot_{T^{*} M} ν d {\bar{V}}_{g} & (divergence theorem) . \end{matrix}

Similarly,

\begin{matrix} \int_{M} \nabla^{ϕ^{*} T S} A \cdot_{ϕ^{*} T^{*} S \otimes_{M} T M} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B d V_{g} \\ = & \int_{M} {tr}_{T M} ({(\nabla^{ϕ^{*} T S} A)}^{(1 2)} \cdot_{ϕ^{*} T^{*} S} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B) d V_{g} \\ = & \int_{M} {tr}_{T M} (\nabla^{T M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B)) \\ - {tr}_{T M} (A \cdot_{ϕ^{*} T^{*} S} \nabla^{ϕ^{*} T^{*} S \otimes_{M} T M \otimes_{M} ϕ^{*} T^{*} S \otimes_{M} T M} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B) \\ - {tr}_{T M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B) d V_{g} \\ = & \int_{M} - A \cdot_{ϕ^{*} T^{*} S} {div}_{M} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B \\ - A \cdot_{ϕ^{*} T^{*} S} L_{, v v} \cdot_{T^{*} M \otimes_{M} ϕ^{*} T S \otimes_{M} T^{*} M} {(\nabla^{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B)}^{(1 2 3)} d V_{g} \\ + \int_{\partial M} (A \cdot_{ϕ^{*} T^{*} S} L_{, v v} \cdot_{ϕ^{*} T S \otimes_{M} T^{*} M} \nabla^{ϕ^{*} T S} B) \cdot_{T^{*} M} ν d {\bar{V}}_{g} . \end{matrix}

Together with (2), this gives the desired result. □
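The integration-by-parts pattern used twice in this proof is easiest to see in its simplest instance. The sketch below assumes a smooth function a and a smooth vector field X on a compact manifold M with boundary, with ν the outward unit normal and the angle brackets denoting g; these names are introduced here only for illustration. The calculations above apply the same two steps with A in place of a and with tensor fields built from the second derivatives of L and from B in place of X.

```latex
% Product rule for the divergence, followed by the divergence theorem:
\operatorname{div}_M (a X) = da \cdot X + a \, \operatorname{div}_M X
\quad \Longrightarrow \quad
\int_M da \cdot X \, dV_g
  = - \int_M a \, \operatorname{div}_M X \, dV_g
    + \int_{\partial M} a \, \langle X, \nu \rangle \, d\bar{V}_g
```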

5. Discussion

This paper is a first pass at the development of a strongly typed tensor calculus formalism. The details of its workings are by no means complete or fully polished, and its landscape is riddled with many tempting rabbit holes which would certainly produce useful results upon exploration, but which were out of the scope of a first exposition. Here is a list of some topics which the author considers worthwhile to pursue, and which will likely be the subject of his future work. Hopefully some of these topics will be inspiring to other mathematicians, and ideally will start a conversation on the subject.

There are refinements to be made to the type system used in this paper in order to achieve better error-checking and possibly more insight into the relevant objects. There are still implicit type identifications being done (mostly the canonical identifications between different pullback bundles).
The calculations done in this paper are not in an optimally polished and refined state. With experience, certain common operations can be identified, abstract computational rules generated for these operations, and the relevant calculations simplified.
The language of Category Theory can be used to address the implicit/explicit handling of natural type identifications, for example, the identification used in showing the contravariance of bundle pullback: $ψ^{*} ϕ^{*} F ≅ {(ϕ \circ ψ)}^{*} F$.
The details of the particular implementation of the pullback bundle $ϕ^{*} F$ as a submanifold of the direct product $M \times F$ are used in this paper, but there is no reason to “open up the box” like this. For most purposes, the categorical definition of pullback bundle suffices; the pullback bundle can be worked with exclusively through its projection maps $π_{M}^{ϕ^{*} F}$ and $ρ_{F}^{ϕ^{*} F}$. Using this abstract interface often cleans up calculations involving pullback bundles significantly.
The type system used for any particular problem or calculation can be enriched or simplified to adjust to the level of detail appropriate for the situation. For example, if $γ \in C^{\infty} (R, M)$, then $\nabla^{_{○}} γ \in Γ (γ^{*} T M \otimes_{R} T^{*} R)$, but if t is the standard coordinate on $R$, then $\nabla^{_{○}} γ = γ^{'} \otimes_{R} d t$, where $γ^{'} \in Γ (γ^{*} T M)$ is given by $\nabla^{_{○}} γ \cdot \frac{d}{d t}$. This “primed” derivative has a simpler type than the total derivative, and would presumably lead to simpler calculations (e.g., in (25)). This “primed” derivative could also be used in the derivation of the first and second variations. While this would simplify the type system, it would diversify the notation and make the computational system less regularized. However, some situations may benefit overall from this.
The notion of strong typing comes from computer programming languages. The human-driven type-checking which is facilitated by the pedantically decorated notation in this paper can be done by computer by implementing the objects and operations of this tensor calculus formalism in a strongly typed language such as Haskell. This would be a step toward automated calculation checking, and could be considered a step toward automated proof checking from the top down (as opposed to from the bottom up, using a system such as the Coq Proof Assistant). Furthermore, a computer could display tensor expressions with whatever level of detail the user desires, showing low-, mid-, or high-level notation, or showing or suppressing identification isomorphisms.
Is there some sort of completeness result about the calculational tools and type system in this paper? In other words, is it possible to accomplish “everything” in a global, coordinate-free way using a certain set of tools, such as pullback bundles, covariant derivatives, chain rules, permutations, evaluation-by-pullback?
The alternate form of the second variation (see (3)) can be used to form a generalized Jacobi field equation for a particular energy functional. Analysis of this equation and its solutions may give insights analogous to the standard (geodesic-based) Jacobi field equation.
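For reference, the standard geodesic Jacobi field equation mentioned in the last item has the following form. This is the classical statement, not a consequence derived in this paper; J denotes a vector field along a geodesic ϕ, and the curvature sign convention stated in the comment is an assumption of this sketch.

```latex
% Jacobi field equation along a geodesic \phi : M \to S (M a real interval).
% Assumed convention:
%   R(X, Y) Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X, Y]} Z.
\nabla^{\phi^* TS}_{\frac{d}{dt}} \nabla^{\phi^* TS}_{\frac{d}{dt}} J
  + R^{TS}(J, \phi') \, \phi' = 0
```

Its solutions are the variation fields of one-parameter families of geodesics.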

Funding

This research was funded in part by a 2011–2012 ARCS (Achievement Rewards for College Scientists) Fellowship.

Acknowledgments

I would like to thank the ARCS Foundation for having granted me an ARCS Fellowship, and for their generous efforts to promote excellence in young scientists. I would like to thank my advisor Debra Lewis for trusting in my abilities and providing me with the freedom in which the creative endeavor that this paper required could flourish. I would like to thank David DeConde for the invaluable conversations at the Octagon in which imagination, creativity, and exploration were gladly fostered. Thanks to Chris Shelley for showing me how to create the tensor diagrams using Tikz. Finally, I would like to thank both Debra and David for their help in editing this paper.

Conflicts of Interest

The author declares no conflict of interest. The funders had no role in the research, writing, or publication of the manuscript, nor in the decision to publish the results.

References

1. Parnas, D. On the Criteria to Be Used in Decomposing Systems Into Modules. Commun. ACM 1972, 15, 1053–1058.
2. Walter, W. Ordinary Differential Equations; Springer: Berlin/Heidelberg, Germany, 1998; Volume 182.
3. Raymond, E.S. The Art of Unix Programming; Pearson Education, Inc.: Canada, 2003; Available online: http://www.catb.org/~esr/writings/taoup/html/ (accessed on 20 July 2022).
4. Miller, G.A. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information. Am. Psychol. Assoc. 1955, 101, 343–352.
5. Lee, J.M. Introduction to Smooth Manifolds; Springer: Berlin/Heidelberg, Germany, 2006; Volume 218.
6. Lee, J.M. Manifolds and Differential Geometry; American Mathematical Society, 2009; Volume 107, Available online: https://bookstore.ams.org/gsm-107 (accessed on 25 August 2022).
7. Michor, P.W. Topics in Differential Geometry; American Mathematical Society, 2008; Volume 93, Available online: https://www.mat.univie.ac.at/~michor/dgbook.pdf (accessed on 25 August 2022).
8. Lee, J.M. Riemannian Manifolds: An Introduction to Curvature; Springer: Berlin/Heidelberg, Germany, 1997; Volume 176.
9. Cardelli, L. Typeful Programming. 1991. (Revised 1993). Available online: http://www.lucacardelli.name/Papers/TypefulProg.pdf (accessed on 20 July 2022).
10. Penrose, R. The Road to Reality; Vintage Books: UK, 2004; Available online: https://en.wikipedia.org/wiki/The_Road_to_Reality (accessed on 25 August 2022).
11. Marsden, J.E.; Hughes, T.J.R. Mathematical Foundations of Elasticity; Prentice Hall, Inc.: New York, NY, USA, 1983; Available online: https://www.amazon.co.jp/Mathematical-Foundations-Elasticity-Mechanical-Engineering/dp/0486678652 (accessed on 25 August 2022).
12. Palais, R.S. Foundations of Global Non-Linear Analysis; W.A. Benjamin, Inc., 1968; Available online: https://vmm.math.uci.edu/PalaisPapers/FoundationsOfGlobalNonlinearAnalysis.pdf (accessed on 25 August 2022).
13. Kolár, I.; Michor, P.W.; Slovák, J. Natural Operations in Differential Geometry; Springer: Berlin/Heidelberg, Germany, 1993; Volume 434, Available online: http://www.mat.univie.ac.at/~michor/listpubl.html (accessed on 20 July 2022).
14. Xin, Y. Geometry of Harmonic Maps; Birkhäuser, 1996; Volume 23, Available online: https://link.springer.com/chapter/10.1007/978-1-4612-4084-6_4 (accessed on 25 August 2022).
15. Giaquinta, M.; Hildebrandt, S. Calculus of Variations I; Springer: Berlin/Heidelberg, Germany, 1996.
16. Dodson, C.; Radivoiovici, M. Second-Order Tangent Structures. Int. J. Theor. Phys. 1982, 21, 151–161.
17. Ebin, D.G.; Marsden, J.E. Groups of Diffeomorphisms and the Motion of an Incompressible Fluid. Ann. Math. Second. Ser. 1970, 92, 102–163.
18. Eliasson, H.I. Geometry of Manifolds of Maps. J. Differ. Geom. 1967, 1, 169–194.
19. Conference Board of the Mathematical Sciences Regional Conference Series in Mathematics. Selected Topics in Harmonic Maps; American Mathematical Society, 1983; Number 50; Available online: https://bookstore.ams.org/cdn-1655676954536/cbms-50/5 (accessed on 25 August 2022).
20. Nishikawa, S. Variational Problems in Geometry; American Mathematical Society, 2002; Volume 205, Available online: https://www.semanticscholar.org/paper/Variational-problems-in-geometry-Nishikawa/f493efd6d5b84cff294179450e85b358b766d265 (accessed on 25 August 2022).

Figure 1. A picture of the manifold $M$, path $θ$, and vector fields $θ^{'}$ and $Θ$. The blue dots represent $θ (t)$ at certain points $t \in R$, while the green and red arrows represent $θ^{'} (t)$ and $Θ (t)$ at these points, respectively. Note that $Θ$ is a unit-length vector field along $θ$ and varies within $I$, whereas $θ^{'}$ is a vector field along $θ$ that vanishes within $I$.

Figure 2. A diagram representing the decomposition of $T E \to E$ into horizontal and vertical subbundles. The vertical lines represent individual fibers of $E$, while $p \in M$, $e_{p} \in E_{p}$, $0_{p} \in E_{p}$ denotes the zero vector of $E_{p}$, and $Z E$ denotes the zero subbundle of $E$; $Z E ≅ M$. By the equivariance property of the linear connection, $Z E$ is a submanifold of $E$ which is entirely horizontal (its tangent space is entirely composed of horizontal vectors). The tangent spaces $T_{0_{p}} E$ and $T_{e_{p}} E$ are drawn; green arrows representing the vertical subspaces (“along” the fibers), red arrows representing the horizontal subspaces. Finally, $c$ is a horizontal curve passing through $e_{p}$.

Table 1. Notational Reference Part 1.

High-Level	Mid-Level	Low-Level	Description
Variations; variational derivatives; tangent vectors.
$m_{ϵ}$		m	Variation of a point in M; $I ∋ ϵ \mapsto m_{ϵ} \in M$ ; $m : I \to M$ .
$δ$		$δ_{ϵ}$	Variational derivative; $δ_{ϵ} : = \frac{\partial}{\partial ϵ} ∣_{ϵ = 0}$ .
$δ m_{ϵ}$		$δ_{ϵ} m$	Tangent vector; linearization of a variation;
			$δ m_{ϵ} \in T_{m_{0}} M$ ; $δ_{ϵ} m \in T_{m (0)} M$ .
Projection maps; canonical isomorphisms; bundle-related maps and spaces.
$pr$	${pr}_{i}$	${pr}_{i}^{A_{1} \times \dots \times A_{n}}$	Set-theoretic projection onto ith factor or named factor;
	${pr}_{A_{i}}$	${pr}_{A_{i}}^{A_{1} \times \dots \times A_{n}}$	${pr}_{i}^{A_{1} \times \dots \times A_{n}} : A_{1} \times \dots \times A_{n} \to A_{i}$ .
$ι$	$ι_{B}$ , $ι^{A}$	$ι_{B}^{A}$	Canonical isomorphism; $ι_{B}^{A} : A \to B$ ; $ι_{A}^{B} : = {(ι_{B}^{A})}^{- 1}$ .
$π$	$π_{M}$ , $π^{F}$	$π_{M}^{F}$	Bundle projection map; $π_{M}^{F} : F \to M$ .
$ρ$	$ρ_{H}$ , $ρ^{ϕ^{*} H}$	$ρ_{H}^{ϕ^{*} H}$	Pullback bundle fiber projection map; $ρ_{H}^{ϕ^{*} H} : ϕ^{*} H \to H$ .
$I$		$I_{V}$	Identity tensor on V; $I_{V} \in V \otimes V^{*}$ .
$I$		$I_{E}$	Identity tensor field on E; $I_{E} \in Γ (E \otimes E^{*})$ .
Trivial bundle constructions and projection maps.
$M \times N \to N$		$M ⥇ N \to N$	Trivial bundle over N; $M ⥇ N : = M \times N$ ;
			$π_{N}^{M ⥇ N} : M ⥇ N \to N, (m, n) \mapsto n$ .
$M \times N \to M$		$M ⬾ N \to M$	Trivial bundle over M; $M ⬾ N : = M \times N$ ;
			$π_{M}^{M ⬾ N} : M ⬾ N \to M, (m, n) \mapsto m$ .
Shared base-space bundle constructions and projection maps.
$E \times F \to M$		$E \times_{M} F \to M$	Direct product; $E \times_{M} F : = ∐_{m \in M} E_{m} \times F_{m}$ ;
			$π_{M}^{E \times_{M} F} (e, f) : = π_{M}^{E} (e) \equiv π_{M}^{F} (f)$ .
$E \oplus F \to M$		$E \oplus_{M} F \to M$	Whitney sum; $E \oplus_{M} F : = ∐_{m \in M} E_{m} \oplus F_{m}$ ;
			$π_{M}^{E \oplus_{M} F} (e \oplus f) : = π_{M}^{E} (e) \equiv π_{M}^{F} (f)$ .
$E \otimes F \to M$		$E \otimes_{M} F \to M$	Tensor product; $E \otimes_{M} F : = ∐_{m \in M} E_{m} \otimes F_{m}$ ;
			$π_{M}^{E \otimes_{M} F} (c^{i j} e_{i} \otimes f_{j}) : = π_{M}^{E} (e_{k}) \equiv π_{M}^{F} (f_{ℓ})$ (for any $k, ℓ$ ).
Separate base-space bundle constructions and projection maps.
$E \times H \to M \times N$		$E \times_{M \times N} H \to M \times N$	Direct product; $E \times_{M \times N} H : = ∐_{(m, n) \in M \times N} E_{m} \times H_{n}$ .
			$π_{M \times N}^{E \times_{M \times N} H} (e, h) : = (π_{M}^{E} (e), π_{N}^{H} (h))$ .
$E \oplus H \to M \times N$		$E \oplus_{M \times N} H \to M \times N$	Whitney sum; $E \oplus_{M \times N} H : = ∐_{(m, n) \in M \times N} E_{m} \oplus H_{n}$ .
			$π_{M \times N}^{E \oplus_{M \times N} H} (e \oplus h) : = (π_{M}^{E} (e), π_{N}^{H} (h))$ .
$E \otimes H \to M \times N$		$E \otimes_{M \times N} H \to M \times N$	Tensor product; $E \otimes_{M \times N} H : = ∐_{(m, n) \in M \times N} E_{m} \otimes H_{n}$ .
			$π_{M \times N}^{E \otimes_{M \times N} H} (c^{i j} e_{i} \otimes h_{j}) : = (π_{M}^{E} (e_{k}), π_{N}^{H} (h_{ℓ}))$ (for any $k, ℓ$ ).
Trace; natural pairing; tensor/tensor field contraction. Simple tensor expressions are extended linearly.
$tr$		${tr}_{V}$	Trace on V; ${tr}_{V} : V^{*} \otimes V \to R, α \otimes v \mapsto α (v)$ .
$α \cdot v$		$α \cdot_{V} v$	Natural pairing; $\cdot_{V} : V^{*} \times V \to R, (α, v) \mapsto α (v)$ .
$A \cdot B$		$A \cdot_{V} B$	Tensor contraction; $\cdot_{V} : (U \otimes V^{*}) \times (V \otimes W) \to U \otimes W,$
			$(u \otimes α) \cdot_{V} (v \otimes w) : = u \otimes (α \cdot_{V} v) \otimes w \equiv α (v) u \otimes w$ .
$S \cdot^{n} T$		$S \cdot_{V_{1} \otimes \dots \otimes V_{n}} T$	Alternate for $\cdot_{V}$ , where $V = V_{1} \otimes \dots \otimes V_{n}$ .
$S : T$ , $S \cdot^{2} T$		$S \cdot_{V_{1} \otimes V_{2}} T$	Special notation for $n = 2$ .
$tr$		${tr}_{F}$	Trace on $F \to M$ ; ${tr}_{F} : Γ (F^{*} \otimes_{M} F) \to C^{\infty} (M, R),$
			$[{tr}_{F} (σ \otimes_{M} f)] (m) : = σ (m) \cdot_{F_{m}} f (m)$ for $m \in M$ .
$σ \cdot f$		$σ \cdot_{F} f$	Natural pairing; $\cdot_{F} : Γ (F^{*}) \times Γ (F) \to C^{\infty} (M, R)$ ,
			$(σ \cdot_{F} f) (m) : = σ (m) \cdot_{F_{m}} f (m)$ for $m \in M$ .
$A \cdot f$		$A \cdot_{F} f$	Natural pairing; $\cdot_{F} : Γ (E \otimes_{M} F^{*}) \times F \to E,$
			$(e \otimes_{M} σ) \cdot_{F} f : = e (m) (σ (m) \cdot_{F_{m}} f) \in E_{m}$ ; $m : = π_{M}^{F} (f)$ .
$S \cdot T$		$S \cdot_{F} T$	Tensor field contraction; pointwise tensor contraction;
			$\cdot_{F} : Γ (E \otimes_{M} F^{*}) \times Γ (F \otimes_{M} G) \to Γ (E \otimes_{M} G),$
			$[(e \otimes σ) \cdot_{F} (f \otimes g)] (m) : = (σ (m) \cdot_{F_{m}} f (m)) e (m) \otimes g (m)$ .
$S \cdot^{n} T$		$S \cdot_{F_{1} \otimes_{M} \dots \otimes_{M} F_{n}} T$	Alternate for $\cdot_{F}$ , where $F = F_{1} \otimes_{M} \dots \otimes_{M} F_{n}$ .
$S : T$ , $S \cdot^{2} T$		$S \cdot_{F_{1} \otimes_{M} F_{2}} T$	Special notation for $n = 2$ .
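
The trace, natural pairing, and tensor contraction rows of Table 1 can be illustrated in coordinates. The following is a minimal sketch in plain Python, not part of the paper's formalism: the helper names `outer`, `pairing`, `trace`, and `contract` are hypothetical, tensors are stored as nested lists of coordinates with respect to assumed fixed bases, and the "type check" is only a dimension check, which is much weaker than the type system described in the paper.

```python
def outer(a, b):
    """Coordinates of the simple tensor a ⊗ b, stored as a nested list."""
    return [[x * y for y in b] for x in a]


def pairing(alpha, v):
    """Natural pairing alpha ·_V v = alpha(v) of a covector with a vector."""
    if len(alpha) != len(v):
        raise TypeError("covector and vector live over different spaces")
    return sum(a * x for a, x in zip(alpha, v))


def trace(t):
    """tr_V on V* ⊗ V, extended linearly from alpha ⊗ v -> alpha(v)."""
    if any(len(row) != len(t) for row in t):
        raise TypeError("trace needs an element of V* ⊗ V")
    return sum(t[i][i] for i in range(len(t)))


def contract(a, b):
    """A ·_V B for A in U ⊗ V* and B in V ⊗ W; the result lies in U ⊗ W."""
    dim_v = len(b)
    if any(len(row) != dim_v for row in a):
        raise TypeError("contracted slots V* and V do not match")
    dim_w = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(dim_v)) for j in range(dim_w)]
            for row in a]


# Illustrative data (assumed): U = R^2, V = R^3, W = R^2.
u, alpha = [1, 2], [3, 0, -1]
v, w = [2, 5, 4], [1, -1]

# (u ⊗ alpha) ·_V (v ⊗ w) should equal alpha(v) u ⊗ w.
lhs = contract(outer(u, alpha), outer(v, w))
rhs = [[pairing(alpha, v) * x for x in row] for row in outer(u, w)]
```

Here `lhs` and `rhs` agree, and `trace(outer(alpha, v))` returns the same number as `pairing(alpha, v)`. Swapping the factors, as in `contract(outer(u, alpha), outer(w, v))`, raises a `TypeError` because the contracted slots have different dimensions; this is a crude analogue of the error detection that the subscript on $\cdot_{V}$ is meant to support.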

Table 2. Notational Reference Part 2.

High-Level	Mid-Level	Low-Level	Description
Permutations of tensors and tensor fields.
$A^{σ}$ , $A \cdot^{n} σ$		$A \cdot_{V_{1}^{*} \otimes \dots \otimes V_{n}^{*}} σ$	Right-action of permutations on n-tensors/n-tensor fields;
			${(v_{1} \otimes \dots \otimes v_{n})}^{σ} : = v_{σ^{- 1} (1)} \otimes \dots \otimes v_{σ^{- 1} (n)}$ ; ${(A^{σ})}^{τ} = A^{σ τ}$ .
Spaces of sections of bundles.
$Γ (H)$ , $Γ (π_{N}^{H})$			Space of smooth sections of the bundle $π_{N}^{H}$ ;
			$Γ (H) : = \{h \in C^{\infty} (N, H) ∣ π_{N}^{H} \circ h = {Id}_{N}\}$ .
$Γ_{ϕ} (H)$ , $Γ_{ϕ} (π_{N}^{H})$			Space of smooth sections of $π_{N}^{H}$ along $ϕ$ ;
			$Γ_{ϕ} (H) : = \{h \in C^{\infty} (M, H) ∣ π_{N}^{H} \circ h = ϕ\}$ .
Vertical bundle, pullback bundle, projection maps, pullback of sections.
$V E \to E$			Vertical bundle over $E \to M$ ; $V E : = ker T π_{M}^{E} \leq T E$ .
			projection map $π_{E}^{V E} : = π_{E}^{T E} ∣_{V E}$ .
$ϕ^{*} H \to M$			Pullback bundle; $ϕ^{*} H : = \{(m, h) \in M \times H ∣ ϕ (m) = π (h)\}$ .
			$π_{M}^{ϕ^{*} H} (m, h) : = m$ ; $ρ_{H}^{ϕ^{*} H} (m, h) : = h$ .
$ϕ^{*} h$			Pullback of section $h \in Γ (H)$ ; $ϕ^{*} h \in Γ (ϕ^{*} H)$
			defined by $ρ_{H}^{ϕ^{*} H} \circ ϕ^{*} h = h \circ ϕ$ ; $h \in Γ (H)$ .
Covariant derivatives; partial covariant derivatives.
$\nabla L$	$\nabla^{\|} L$	${\nabla^{\|}}^{M \to R} L$	Natural linear covariant derivative; differential of functions;
	$\nabla^{M \to R} L$		${\nabla^{\|}}^{M \to R} L : = d L \in Γ (T^{*} M)$ , where $L \in C^{\infty} (M, R)$ .
$\nabla X$	$\nabla^{\|} X$	${\nabla^{\|}}^{E} X$	Linear covariant derivative on vector bundle $E \to M$ ;
	$\nabla^{E} X$		$\nabla^{E} X \in Γ (E \otimes_{M} T^{*} M)$ , where $X \in Γ (E)$ .
$\nabla α$	$\nabla^{\|} α$	${\nabla^{\|}}^{E^{*}} α$	Linear covariant derivative on dual vector bundle $E^{*} \to M$ ;
	$\nabla^{E^{*}} α$		$\nabla^{E^{*}} α \in Γ (E^{*} \otimes_{M} T^{*} M)$ , where $α \in Γ (E^{*})$ .
$\nabla ϕ$	$\nabla^{_{○}} ϕ$	${\nabla^{_{○}}}^{M \to N} ϕ$	Tangent map as tensor field;
	$\nabla^{M \to N} ϕ$		${\nabla^{_{○}}}^{M \to N} ϕ \in Γ (ϕ^{*} T N \otimes_{M} T^{*} M)$ , where $ϕ \in C^{\infty} (M, N)$ .
$\nabla σ$	$\nabla^{ϕ} σ$	$\nabla^{ϕ^{*} H} σ$	Pullback covariant derivative; $σ \in Γ (ϕ^{*} H)$ ;
			defined by $\nabla^{ϕ^{*} H} ϕ^{*} h = ϕ^{*} \nabla^{H} h \cdot_{ϕ^{*} T N} {\nabla^{_{○}}}^{M \to N} ϕ$ ; $h \in Γ (H)$ .
$L_{, c_{1}}, \dots, L_{, c_{n}}$			Partial differential of functions;
			$L_{, c_{i}} \in Γ (F_{i}^{*})$ , defined by ${\nabla^{\|}}^{M \to R} L = \sum_{i = 1}^{n} L_{, c_{i}} \cdot_{F_{i}} c_{i}$ .
$X_{, c_{1}}, \dots, X_{, c_{n}}$			Partial linear covariant derivative;
			$X_{, c_{i}} \in Γ (E \otimes_{M} F_{i}^{*})$ , defined by ${\nabla^{\|}}^{E} X = \sum_{i = 1}^{n} X_{, c_{i}} \cdot_{F_{i}} c_{i}$ .
$ϕ_{, M_{1}}, \dots, ϕ_{, M_{n}}$			Partial derivative decomposition of tangent map;
			$ϕ_{, M_{i}} \in Γ (ϕ^{*} T N \otimes_{M} {pr}_{i}^{*} T^{*} M_{i})$ ,
			where $M = M_{1} \times \dots \times M_{n}$ , ${pr}_{i} : = {pr}_{i}^{M}$ , and
			${\nabla^{_{○}}}^{M \to N} ϕ = \sum_{i = 1}^{n} ϕ_{, M_{i}} \cdot_{{pr}_{i}^{*} T M_{i}} {\nabla^{_{○}}}^{M \to M_{i}} {pr}_{i}$ .
Covariant Hessians.
$\nabla^{2} L$	$\nabla^{T^{*} M} \nabla^{M \to R} L$		Covariant Hessian of functions;
$\nabla^{\|} \nabla^{\|} L$	${\nabla^{\|}}^{T^{*} M} {\nabla^{\|}}^{M \to R} L$		$\nabla^{2} L \in Γ (T^{*} M \otimes T^{*} M)$ ; $L \in C^{\infty} (M, R)$ .
$\nabla^{2} X$	$\nabla^{E \otimes T^{*} M} \nabla^{E} X$		Covariant Hessian on vector bundle $E \to M$ ;
$\nabla^{\|} \nabla^{\|} X$	${\nabla^{\|}}^{E \otimes_{M} T^{*} M} {\nabla^{\|}}^{E} X$		$\nabla^{2} X \in Γ (E \otimes T^{*} M \otimes T^{*} M)$ ; $X \in Γ (E)$ .
$\nabla^{2} ϕ$	$\nabla^{ϕ^{*} T N \otimes T^{*} M} \nabla^{M \to N} ϕ$		Covariant Hessian of maps;
$\nabla^{\|} \nabla^{_{○}} ϕ$	${\nabla^{\|}}^{ϕ^{*} T N \otimes_{M} T^{*} M} {\nabla^{_{○}}}^{M \to N} ϕ$		$\nabla^{2} ϕ \in Γ (ϕ^{*} T N \otimes T^{*} M \otimes T^{*} M)$ ; $ϕ \in C^{\infty} (M, N)$ .
Derivative conventions.
$\nabla_{X} e$			Directional derivative notation; $\nabla_{X} e : = \nabla e \cdot_{T M} X$ .
$\nabla^{n} e \cdot^{n} (X_{1} \otimes \dots \otimes X_{n - 1} \otimes X_{n})$			Iterated covariant derivative convention;
			defined by $(\nabla_{X_{n}} \nabla^{n - 1} e) \cdot^{n - 1} (X_{1} \otimes \dots \otimes X_{n - 1})$ .
$R (X, Y) : = - \nabla_{X} \nabla_{Y} + \nabla_{Y} \nabla_{X} + \nabla_{[X, Y]}$			Curvature operator; $R (X, Y) e = \nabla^{2} e : (X \otimes Y - Y \otimes X)$ .
$z : M \to M \times I, m \mapsto (m, 0)$			Evaluation-at-zero map.
$z^{*} \partial_{i} = δ_{i}$			Pullback formulation of derivative-at-zero.
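
The permutation row of Table 2 states that ${(A^{σ})}^{τ} = A^{σ τ}$. A small plain-Python check on simple tensors, written as a hedged sketch, makes the convention behind that identity explicit. The function names `act`, `then`, and `after` are hypothetical, factors are represented by string labels rather than vectors, and the reading of $σ τ$ as "apply $σ$ first, then $τ$" is an assumption made here; under that reading the identity holds for the formula with $σ^{- 1}$ given in the table.

```python
from itertools import permutations


def act(factors, sigma):
    """Right action on a simple tensor given as a tuple of factors.

    Slot i of the result holds factors[sigma^{-1}(i)]; equivalently,
    the factor in slot j is moved to slot sigma[j] (0-based indices).
    """
    result = [None] * len(factors)
    for j, target in enumerate(sigma):
        result[target] = factors[j]
    return tuple(result)


def then(sigma, tau):
    """Product read left to right: apply sigma first, then tau (assumed)."""
    return tuple(tau[sigma[i]] for i in range(len(sigma)))


def after(sigma, tau):
    """Product read right to left: apply tau first, then sigma."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))


factors = ("v1", "v2", "v3")
all_perms = list(permutations(range(3)))

# (A^sigma)^tau == A^(sigma tau) with the left-to-right product.
law_holds = all(
    act(act(factors, s), t) == act(factors, then(s, t))
    for s in all_perms for t in all_perms
)

# The same check with the right-to-left product fails for some pairs.
reversed_law_holds = all(
    act(act(factors, s), t) == act(factors, after(s, t))
    for s in all_perms for t in all_perms
)
```

With these definitions `law_holds` is `True` and `reversed_law_holds` is `False`, since the symmetric group on three letters is not abelian. For instance, `act(factors, (1, 2, 0))` moves `v1` to the second slot, `v2` to the third, and `v3` to the first.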

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
