Article

Explicit Form for the Most General Lorentz Transformation Revisited

Santa Cruz Institute for Particle Physics, University of California, 1156 High Street, Santa Cruz, CA 95064, USA
Symmetry 2024, 16(9), 1155; https://doi.org/10.3390/sym16091155
Submission received: 8 August 2024 / Revised: 26 August 2024 / Accepted: 28 August 2024 / Published: 4 September 2024
(This article belongs to the Section Physics)

Abstract

Explicit formulae for the $4\times4$ Lorentz transformation matrices corresponding to a pure boost and a pure three-dimensional rotation are very well known. Significantly less well known is the explicit formula for a general Lorentz transformation with arbitrary non-zero boost and rotation parameters. We revisit this more general formula by presenting two different derivations. The first derivation (which is somewhat simpler than previous ones appearing in the literature) evaluates the exponential of a $4\times4$ real matrix $A$, where $A$ is a product of the diagonal matrix $\mathrm{diag}(+1,-1,-1,-1)$ and an arbitrary $4\times4$ real antisymmetric matrix. The formula for $\exp A$ depends only on the eigenvalues of $A$ and makes use of the Lagrange interpolating polynomial. The second derivation exploits the observation that the spinor product $\eta^\dagger\bar\sigma^\mu\chi$ transforms as a Lorentz four-vector, where $\chi$ and $\eta$ are two-component spinors. The advantage of the latter derivation is that the corresponding formula for a general Lorentz transformation $\Lambda$ reduces to the computation of the trace of a product of $2\times2$ matrices. Both computations are shown to yield equivalent expressions for $\Lambda$.

1. Introduction

In the theory of special relativity, space and time are combined into Minkowski spacetime (e.g., see Ref. [1]). Two different inertial reference frames (with coinciding origins fixed) are related through a Lorentz transformation. Equivalently, consider a four-vector, $x=(x^0;\mathbf{x})$, with squared-length $x^2\equiv\eta_{\mu\nu}x^\mu x^\nu=(x^0)^2-|\mathbf{x}|^2$ (with an implied double sum over the repeated indices $\mu,\nu\in\{0,1,2,3\}$), where $\eta_{\mu\nu}=\mathrm{diag}(+1,-1,-1,-1)$ is the Minkowski spacetime metric. One can also define the Lorentz transformation $\Lambda$ as a symmetry transformation of a four-vector, $x'=\Lambda x$, that preserves the length of $x$. Since the length of a four-vector is a scalar quantity and thus invariant under a Lorentz transformation, it follows that $\eta_{\alpha\beta}=\Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta\,\eta_{\mu\nu}$, which serves as the general definition of the $4\times4$ Lorentz transformation matrix [cf. Equations (30)–(32)]. Moreover, this same equation implies that $\eta_{\mu\nu}$ is an invariant tensor. Indeed, the Lorentz transformations (along with spacetime translations) are the maximally allowed symmetry transformations of Minkowski spacetime in which the spacetime metric is left invariant (e.g., see Ref. [2]).
Consider two inertial reference frames with coinciding origins, where one reference frame is moving with respect to the other with three-vector velocity $\mathbf{v}$. The corresponding Lorentz transformation is called a Lorentz boost. The boost parameters are defined by the components of the three-vector $\boldsymbol{\zeta}\equiv(\mathbf{v}/v)\tanh^{-1}(v/c)$, where $v\equiv|\mathbf{v}|$ and $c$ is the speed of light. However, this is not the most general Lorentz transformation. For example, let $R$ be an arbitrary $3\times3$ orthogonal matrix of unit determinant, i.e., a proper rotation matrix parametrized by the components of the three-vector $\boldsymbol{\theta}\equiv\theta\hat{\mathbf{n}}$ (such that $\theta$ is the angle of rotation, counterclockwise, about a fixed axis that lies along the unit vector $\hat{\mathbf{n}}$). Then, the transformation $x'^0=x^0$ and $\mathbf{x}'=R\mathbf{x}$ is also a Lorentz transformation as it leaves the Minkowski spacetime metric invariant. The corresponding matrix representations of the general Lorentz boost and three-dimensional rotation are quite well known [see Equations (22) and (26), respectively] and are reviewed in Section 2.
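A more general Lorentz transformation matrix, which shall henceforth be denoted by $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$, corresponds to a simultaneous boost and rotation. As shown in Section 3, $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ can be expressed as the exponential of a $4\times4$ matrix,
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})=\exp\begin{pmatrix}0&\zeta^1&\zeta^2&\zeta^3\\ \zeta^1&0&-\theta^3&\theta^2\\ \zeta^2&\theta^3&0&-\theta^1\\ \zeta^3&-\theta^2&\theta^1&0\end{pmatrix}.\tag{1}$$
In contrast to $\Lambda(\boldsymbol{\zeta},\mathbf{0})$ and $\Lambda(\mathbf{0},\boldsymbol{\theta})$, which correspond to a Lorentz boost matrix and a three-dimensional rotation, respectively, an explicit form for $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ is much less well known.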
The first published formula for $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ appeared in Ref. [3]. Subsequent derivations have also been given in Refs. [4,5,6]. These derivations are based on the Cayley–Hamilton theorem of linear algebra (e.g., see Section 8.4 of Ref. [7]), which asserts that any $n\times n$ matrix $A$ satisfies its own characteristic equation, $p(x)=\det(A-xI_n)=0$, where $I_n$ is the $n\times n$ identity matrix and $p(x)$ is an $n$th-order polynomial whose roots are the eigenvalues of $A$. That is, $p(A)$ is equal to the zero matrix. It follows that for any integer $k\geq n$, the matrix $A^k$ can be expressed as a linear combination of $I_n, A, A^2,\ldots,A^{n-1}$. In particular,
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})\equiv\exp A=\sum_{k=0}^{\infty}\frac{A^k}{k!}=c_0I_4+c_1A+c_2A^2+c_3A^3,\tag{2}$$
where each of the coefficients $c_k$ is an infinite series whose terms depend on the eigenvalues of $A$. Note that by setting either $\boldsymbol{\theta}=\mathbf{0}$ or $\boldsymbol{\zeta}=\mathbf{0}$ in Equation (1), one can easily compute the resulting matrix exponential to derive the well-known expressions given in Equations (22) and (26), respectively. In contrast, if both the boost vector and the rotation vector are non-zero, then the corresponding computation of the matrix exponential, which is carried out in Refs. [3,4], is significantly more difficult. In Ref. [5], this computation is performed by showing that a Lorentz transformation matrix $g$ exists such that the $4\times4$ matrix $\tilde{A}\equiv gAg^{-1}$ in block matrix form is made up of very simple $2\times2$ matrix blocks. The exponential $\exp\tilde{A}$ is then easy to evaluate directly via its Taylor series to obtain the coefficients $c_k$, and
$$\exp A=g^{-1}(\exp\tilde{A})\,g=g^{-1}\left(c_0I_4+c_1\tilde{A}+c_2\tilde{A}^2+c_3\tilde{A}^3\right)g=c_0I_4+c_1A+c_2A^2+c_3A^3.\tag{3}$$
Finally, Ref. [6] derives a system of four linear equations for the coefficients $c_k$ in Equation (2), whose solution provides the desired expression for $\exp A$.
In this paper, we shall provide a somewhat simpler and more straightforward evaluation of $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ as compared to the derivations given in Refs. [3,4,5,6]. In Section 2, we first exhibit the explicit forms for the general Lorentz boost and the three-dimensional rotation matrices of Minkowski spacetime, which correspond to special cases of the more general $4\times4$ Lorentz transformation matrix, as noted above. In Section 3, an expression for the most general Lorentz transformation is then derived. Indeed, it is sufficient to consider the set of all Lorentz transformations that are continuously connected to the identity, known as the proper orthochronous Lorentz transformations (e.g., see Ref. [1]). The matrix representation of any element of this latter set can be expressed in the form given by Equation (1), as discussed below Equation (40). In Section 4, we explicitly evaluate Equation (1) for arbitrary boost and rotation parameters. We then demonstrate that an alternative derivation of $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ can be given that only involves the manipulation of $2\times2$ matrices, by making use of two-component spinors. In particular, we show in Section 5 that the most general proper orthochronous Lorentz transformation matrix can be expressed as a trace of the product of four $2\times2$ matrices, which is then explicitly evaluated. Both methods for computing $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ are carried out in pedagogical detail. In Section 6, we check that both computations yield the same expression for $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$. Final remarks are presented in Section 7, and some related discussions are relegated to the appendices.

2. Lorentz Transformations—Special Cases

In a first encounter with special relativity, a student learns how the spacetime coordinates change between two inertial reference frames $K$ and $K'$. If the spacetime coordinates with respect to $K$ are $(ct;x,y,z)$ and the spacetime coordinates with respect to $K'$ are $(ct';x',y',z')$, where $K'$ is moving relative to $K$ with velocity $\mathbf{v}=v\hat{\mathbf{x}}$ in the $x$ direction, then
$$ct'=\gamma(ct-\beta x),\tag{4}$$
$$x'=\gamma(x-\beta ct),\tag{5}$$
$$y'=y,\tag{6}$$
$$z'=z,\tag{7}$$
where $c$ is the speed of light and
$$\beta\equiv\frac{v}{c},\qquad\gamma\equiv(1-\beta^2)^{-1/2}.\tag{8}$$
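It is straightforward to generalize the above results for an arbitrary velocity $\mathbf{v}$ by writing
$$\mathbf{x}=\mathbf{x}_\parallel+\mathbf{x}_\perp,\tag{9}$$
where $\mathbf{x}_\parallel$ is the projection of $\mathbf{x}$ along the direction of $\mathbf{v}\equiv c\boldsymbol{\beta}$, and $\mathbf{x}_\perp$ is perpendicular to $\mathbf{v}$ (so that $\mathbf{x}_\parallel\cdot\mathbf{x}_\perp=0$). The definition of $\mathbf{x}_\parallel$ implies that
$$\frac{\mathbf{x}_\parallel}{|\mathbf{x}_\parallel|}=\frac{\boldsymbol{\beta}}{\beta},\tag{10}$$
where $\beta\equiv|\boldsymbol{\beta}|$. Note that $0\leq\beta < 1$ for any particle of non-zero mass.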
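In light of Equation (10), Equations (4)–(7) are equivalent to
$$ct'=\gamma(ct-\boldsymbol{\beta}\cdot\mathbf{x}_\parallel),\tag{11}$$
$$\mathbf{x}'_\parallel=\gamma(\mathbf{x}_\parallel-\boldsymbol{\beta}ct),\tag{12}$$
$$\mathbf{x}'_\perp=\mathbf{x}_\perp,\tag{13}$$
where $\gamma\equiv(1-|\boldsymbol{\beta}|^2)^{-1/2}$. Note that $1\leq\gamma < \infty$ for any particle of non-zero mass. More explicitly,
$$\mathbf{x}_\parallel=\frac{\boldsymbol{\beta}\cdot\mathbf{x}}{\beta^2}\,\boldsymbol{\beta},\qquad\mathbf{x}_\perp=\mathbf{x}-\frac{\boldsymbol{\beta}\cdot\mathbf{x}}{\beta^2}\,\boldsymbol{\beta},\tag{14}$$
which yield $\boldsymbol{\beta}\cdot\mathbf{x}_\parallel=\boldsymbol{\beta}\cdot\mathbf{x}$ and $\boldsymbol{\beta}\cdot\mathbf{x}_\perp=0$ as required. Inserting the expressions given in Equation (14) back into Equations (11)–(13), we end up with the well-known result (e.g., see Equation (11.19) of Ref. [8]):
$$ct'=\gamma(ct-\boldsymbol{\beta}\cdot\mathbf{x}),\tag{15}$$
$$\mathbf{x}'=\mathbf{x}+\frac{(\gamma-1)}{\beta^2}(\boldsymbol{\beta}\cdot\mathbf{x})\boldsymbol{\beta}-\gamma\boldsymbol{\beta}ct.\tag{16}$$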
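Following Equation (11.20) of Ref. [8], it is convenient to introduce the boost parameter $\zeta$ (also called the rapidity),
$$\gamma=\cosh\zeta,\qquad\gamma\beta=\sinh\zeta,\tag{17}$$
since the definitions of $\beta$ and $\gamma$ are consistent with the relation $\cosh^2\zeta-\sinh^2\zeta=1$. In particular, note that $0\leq\zeta < \infty$. We then define the boost vector $\boldsymbol{\zeta}$ to be the vector of magnitude $\zeta$ that points in the direction of $\boldsymbol{\beta}$. Since Equation (17) yields $\beta=\tanh\zeta$, it follows that
$$\boldsymbol{\zeta}\equiv\frac{\boldsymbol{\beta}}{\beta}\tanh^{-1}\beta.\tag{18}$$
In terms of the boost vector $\boldsymbol{\zeta}$ and its magnitude $\zeta\equiv|\boldsymbol{\zeta}|$, Equations (15) and (16) yield
$$ct'=ct\cosh\zeta-\frac{\boldsymbol{\zeta}\cdot\mathbf{x}}{\zeta}\sinh\zeta,\tag{19}$$
$$\mathbf{x}'=\mathbf{x}-\frac{\boldsymbol{\zeta}}{\zeta}\left[ct\sinh\zeta-\frac{\boldsymbol{\zeta}\cdot\mathbf{x}}{\zeta}(\cosh\zeta-1)\right].\tag{20}$$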
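As a numerical cross-check of the passage from the $(\beta,\gamma)$ form of Equations (15) and (16) to the rapidity form of Equations (19) and (20), the following minimal sketch evaluates both for one sample point. The function names and the sample values are illustrative assumptions, not part of the paper, and only the Python standard library is used.

```python
import math

def boost_gamma_form(beta, ct, x):
    """Passive boost written with beta and gamma, as in Equations (15)-(16)."""
    b2 = sum(b * b for b in beta)
    gamma = 1.0 / math.sqrt(1.0 - b2)
    bx = sum(b * xi for b, xi in zip(beta, x))            # beta . x
    ct_new = gamma * (ct - bx)
    x_new = [xi + (gamma - 1.0) / b2 * bx * b - gamma * b * ct
             for xi, b in zip(x, beta)]
    return ct_new, x_new

def boost_rapidity_form(beta, ct, x):
    """The same boost written with the rapidity, as in Equations (18)-(20)."""
    b = math.sqrt(sum(bi * bi for bi in beta))
    zeta = math.atanh(b)                                  # magnitude of Equation (18)
    n = [bi / b for bi in beta]                           # unit vector zeta/|zeta|
    nx = sum(ni * xi for ni, xi in zip(n, x))             # (zeta . x)/|zeta|
    ch, sh = math.cosh(zeta), math.sinh(zeta)
    ct_new = ct * ch - nx * sh
    x_new = [xi - ni * (ct * sh - nx * (ch - 1.0)) for xi, ni in zip(x, n)]
    return ct_new, x_new

# Hypothetical sample values (non-zero velocity with |beta| below 1)
beta, ct, x = (0.3, -0.2, 0.5), 2.0, (1.0, 2.0, 3.0)
ct_g, x_g = boost_gamma_form(beta, ct, x)
ct_r, x_r = boost_rapidity_form(beta, ct, x)
```

Both forms should agree to rounding error, and the interval $(ct)^2-|\mathbf{x}|^2$ should be unchanged by the boost. The sketch assumes a non-zero velocity, since it divides by $\beta$.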
Before proceeding, it is instructive to distinguish between active and passive Lorentz transformations (e.g., see Ref. [1]). The Lorentz transformation discussed above is a passive transformation, since the reference frame $K$ (specified by the coordinate axes) is transformed into $K'$, while leaving the observer fixed. Equivalently, one can consider an active transformation, in which the coordinate axes are held fixed while the location of the observer in spacetime is boosted using the inverse of the transformation specified by Equations (19) and (20). That is, a spacetime point of the observer located at $(ct;\mathbf{x})$ is transformed by the boost to $(ct';\mathbf{x}')$ using Equations (19) and (20) with $\boldsymbol{\zeta}$ replaced by $-\boldsymbol{\zeta}$. Henceforth, all Lorentz transformations treated in this paper will correspond to active transformations.
The transformation that boosts the spacetime point $(ct;\mathbf{x})$ to $(ct';\mathbf{x}')$ is given by
$$\begin{pmatrix}ct'\\ x'^i\end{pmatrix}=\Lambda(\boldsymbol{\zeta},\mathbf{0})\begin{pmatrix}ct\\ x^j\end{pmatrix},\tag{21}$$
where the $4\times4$ matrix $\Lambda(\boldsymbol{\zeta},\mathbf{0})$ can be written in block matrix form as
$$\Lambda(\boldsymbol{\zeta},\mathbf{0})=\begin{pmatrix}\cosh\zeta&\dfrac{\zeta^j}{\zeta}\sinh\zeta\\[2mm] \dfrac{\zeta^i}{\zeta}\sinh\zeta&\delta^{ij}+\dfrac{\zeta^i\zeta^j}{|\boldsymbol{\zeta}|^2}(\cosh\zeta-1)\end{pmatrix},\tag{22}$$
after converting Equations (19) and (20) to an active transformation via $\boldsymbol{\zeta}\to-\boldsymbol{\zeta}$. In Equation (22),
$$\delta^{ij}=\begin{cases}1,&\text{if } i=j,\\ 0,&\text{if } i\neq j,\end{cases}\tag{23}$$
where the Latin indices $i,j\in\{1,2,3\}$ refer to the $x$, $y$, and $z$ components of the three-vector $\boldsymbol{\zeta}$, and there is an implicit sum over the repeated index $j$ on the right-hand side of Equation (21).
The matrix $\Lambda(\boldsymbol{\zeta},\mathbf{0})$ is sometimes inaccurately called the Lorentz transformation matrix. In fact, this matrix represents a special type of Lorentz transformation consisting of a boost without rotation [the latter is indicated by the second argument of $\Lambda(\boldsymbol{\zeta},\mathbf{0})$]. Furthermore, note that $\Lambda(\mathbf{0},\mathbf{0})=I_4$ is the $4\times4$ identity matrix. Any Lorentz transformation of the form $\Lambda(\boldsymbol{\zeta},\mathbf{0})$ can be continuously deformed into the identity matrix by continuously shrinking the vector $\boldsymbol{\zeta}$ to the zero vector.
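The block form of the active boost matrix in Equation (22) can be assembled directly and tested against the defining property of a Lorentz transformation, $\Lambda^{\mathsf T}G\Lambda=G$ with $G=\mathrm{diag}(+1,-1,-1,-1)$. The sketch below is an illustration under that assumption; the helper names (`boost_matrix`, `max_abs_diff`) and the sample boost vector are hypothetical.

```python
import math

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(X, Y):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def boost_matrix(zeta_vec):
    """Active boost matrix Lambda(zeta, 0) in the block form of Equation (22)."""
    z = math.sqrt(sum(c * c for c in zeta_vec))
    L = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    if z == 0.0:
        return L                          # Lambda(0, 0) is the identity
    n = [c / z for c in zeta_vec]         # unit vector zeta/|zeta|
    ch, sh = math.cosh(z), math.sinh(z)
    L[0][0] = ch
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = n[i] * sh
        for j in range(3):
            L[i + 1][j + 1] += n[i] * n[j] * (ch - 1.0)
    return L

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(4) for j in range(4))

# Hypothetical sample boost vector
L = boost_matrix((0.4, -0.3, 1.2))
invariance_error = max_abs_diff(matmul(transpose(L), matmul(G, L)), G)
```

For this sample, `invariance_error` should be at the level of rounding error, and the product of `boost_matrix(z)` with `boost_matrix` of the negated vector should reproduce the identity.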
Another example of a Lorentz transformation is a three-dimensional proper rotation of the vector $\mathbf{x}$ into the vector $\mathbf{x}'=R\mathbf{x}$ by an angle $\theta$, counterclockwise, about a fixed axis $\hat{\mathbf{n}}$, where $R$ is a $3\times3$ orthogonal matrix of unit determinant, and the time coordinate is not transformed. In this notation, $\hat{\mathbf{n}}=(n^1,n^2,n^3)$ is a unit vector (i.e., $\hat{\mathbf{n}}\cdot\hat{\mathbf{n}}=1$). It is then convenient to define a three-vector quantity called the rotation vector,
$$\boldsymbol{\theta}\equiv\theta\hat{\mathbf{n}},\tag{24}$$
where $0\leq\theta\leq\pi$. In the case of a proper three-dimensional rotation, the transformation of the spacetime point $(ct;\mathbf{x})$ to $(ct';\mathbf{x}')$ is given by
$$\begin{pmatrix}ct'\\ x'^i\end{pmatrix}=\Lambda(\mathbf{0},\boldsymbol{\theta})\begin{pmatrix}ct\\ x^j\end{pmatrix},\tag{25}$$
where the $4\times4$ matrix $\Lambda(\mathbf{0},\boldsymbol{\theta})$ can be written in block matrix form as
$$\Lambda(\mathbf{0},\boldsymbol{\theta})=\begin{pmatrix}1&0^j\\ 0^i&R^{ij}(\hat{\mathbf{n}},\theta)\end{pmatrix},\tag{26}$$
where $0^j$ [$0^i$] are the components of the zero row [column] vector (with $i,j\in\{1,2,3\}$), and
$$R^{ij}(\hat{\mathbf{n}},\theta)=\delta^{ij}\cos\theta+n^in^j(1-\cos\theta)-\epsilon^{ijk}n^k\sin\theta.\tag{27}$$
In Equation (27), the Levi–Civita symbol is defined by $\epsilon^{ijk}=+1$ [$-1$] when $ijk$ is an even [odd] permutation of 123, and $\epsilon^{ijk}=0$ if any two of the indices coincide. Equation (27) is known as Rodrigues’ rotation formula (e.g., see Refs. [9,10]). A clever proof of this formula is provided in Appendix A.
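Rodrigues' rotation formula, Equation (27), translates almost literally into code. The sketch below is a minimal illustration with zero-based indices; the function names are assumptions made here, not notation from the paper.

```python
import math

def levi_civita(i, j, k):
    """epsilon_{ijk} for indices in {0, 1, 2} (zero-based)."""
    return (i - j) * (j - k) * (k - i) // 2

def rodrigues(n, theta):
    """3x3 rotation matrix R^{ij}(n, theta) of Equation (27); n must be a unit vector."""
    c, s = math.cos(theta), math.sin(theta)
    return [[(c if i == j else 0.0)
             + n[i] * n[j] * (1.0 - c)
             - s * sum(levi_civita(i, j, k) * n[k] for k in range(3))
             for j in range(3)]
            for i in range(3)]

# A counterclockwise quarter turn about the z axis should carry x-hat into y-hat,
# so the first column of R should be (0, 1, 0) up to rounding.
R = rodrigues((0.0, 0.0, 1.0), math.pi / 2)
first_column = [R[i][0] for i in range(3)]
```

The sign of the $\epsilon^{ijk}n^k\sin\theta$ term is what makes the rotation counterclockwise for an active transformation, which is why the quarter turn about $\hat{\mathbf{z}}$ is a useful spot check.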

3. General Lorentz Transformations

Consider a four-vector $v^\mu=(v^0;\mathbf{v})$. Under an active Lorentz transformation, the spacetime components of the four-vector $v^\mu$ transform as
$$v'^\mu=\Lambda^\mu{}_\alpha v^\alpha,\tag{28}$$
where Greek indices such as $\mu$ and $\alpha$ take values in $\{0,1,2,3\}$, and there is an implied sum over any repeated upper/lower index pair. The quantities $\Lambda^\mu{}_\alpha$ can be viewed as the elements of a $4\times4$ real matrix, where $\mu$ labels the row and $\alpha$ labels the column. In special relativity, the metric tensor (in a rectangular coordinate system) is given by the diagonal matrix
$$\eta_{\mu\nu}=\mathrm{diag}(+1;-1,-1,-1),\tag{29}$$
where the so-called mostly minus convention for the metric tensor has been chosen.
To construct a Lorentz-invariant scalar quantity that is unchanged under a Lorentz transformation, one only needs to combine tensors in such a way that all upper/lower index pairs are summed over and no unsummed indices remain. For example,
$$\eta_{\mu\nu}v'^\mu v'^\nu=\eta_{\alpha\beta}v^\alpha v^\beta.\tag{30}$$
Using Equations (28) and (30), it follows that
$$\left(\eta_{\mu\nu}\Lambda^\mu{}_\alpha\Lambda^\nu{}_\beta-\eta_{\alpha\beta}\right)v^\alpha v^\beta=0.\tag{31}$$
Since the four-vector $v$ is arbitrary, it follows that
$$\Lambda^\mu{}_\alpha\,\eta_{\mu\nu}\,\Lambda^\nu{}_\beta=\eta_{\alpha\beta}.\tag{32}$$
Equation (32) defines the most general Lorentz transformation matrix $\Lambda$. The set of all such $4\times4$ Lorentz transformation matrices is a group (under matrix multiplication) and is denoted by $\mathrm{O}(1,3)$. Here, the notation $(1,3)$ refers to the number of plus and minus signs in the metric tensor $\eta_{\mu\nu}$ [cf. Equation (29)]. In particular, $\mathrm{O}(1,3)$ is a Lie group, appropriately called the Lorentz group (e.g., see Refs. [1,2,10]).
After taking the determinant of both sides of Equation (32), one obtains $(\det\Lambda)^2=1$. Hence,
$$\det\Lambda=\pm1.\tag{33}$$
Moreover, by setting $\alpha=\beta=0$ in Equation (32) and summing over $\mu$ and $\nu$, one obtains
$$(\Lambda^0{}_0)^2=1+(\Lambda^1{}_0)^2+(\Lambda^2{}_0)^2+(\Lambda^3{}_0)^2\quad\Longrightarrow\quad(\Lambda^0{}_0)^2\geq1.\tag{34}$$
The Lie group $\mathrm{SO}(1,3)$ is the group of proper Lorentz transformation matrices that satisfy $\det\Lambda=+1$. The elements of the subgroup of $\mathrm{SO}(1,3)$ that also satisfy $\Lambda^0{}_0\geq+1$ are continuously connected to the identity element [the $4\times4$ identity matrix, denoted by $I_4$] and constitute the set of proper orthochronous Lorentz transformations, which is often denoted by $\mathrm{SO}_0(1,3)$. Three examples of Lorentz transformations that are not continuously connected to the identity are as follows:
$$\Lambda_P=\mathrm{diag}(1,-1,-1,-1),\qquad\Lambda_T=\mathrm{diag}(-1,1,1,1),\qquad\Lambda_P\Lambda_T=\mathrm{diag}(-1,-1,-1,-1).\tag{35}$$
In particular, there is no way to continuously change the parameters of a proper orthochronous Lorentz transformation to yield a Lorentz transformation with $\det\Lambda=-1$ and/or $\Lambda^0{}_0\leq-1$ in light of Equations (33) and (34).
The complete list of Lorentz transformations is then given by
$$\left\{\Lambda,\ \Lambda_P\Lambda,\ \Lambda_T\Lambda,\ \Lambda_P\Lambda_T\Lambda\ \middle|\ \Lambda\in\mathrm{SO}_0(1,3)\right\}.\tag{36}$$
Consequently, to determine the explicit form of the most general Lorentz transformation, it suffices to consider the explicit form of the most general proper orthochronous Lorentz transformation.
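The four classes of Lorentz transformations listed in Equation (36) are distinguished by the sign of $\det\Lambda$ [Equation (33)] and the sign of $\Lambda^0{}_0$ [Equation (34)]. A minimal sketch of that bookkeeping follows; the function names and the returned labels are illustrative choices, and the input is assumed to already satisfy Equation (32).

```python
def det(M):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def classify(L):
    """Label a Lorentz matrix (assumed to satisfy Equation (32)) by its component."""
    proper = "proper" if det(L) > 0 else "improper"
    # Equation (34) guarantees |L[0][0]| >= 1, so only the sign matters here.
    chronous = "orthochronous" if L[0][0] > 0 else "non-orthochronous"
    return proper, chronous

identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
parity = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]       # Lambda_P
time_reversal = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]  # Lambda_T
full_inversion = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
```

Only the matrices labelled proper and orthochronous belong to $\mathrm{SO}_0(1,3)$; note that $\Lambda_P\Lambda_T$ has unit determinant but is still excluded because $\Lambda^0{}_0=-1$.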
The Lie algebra of the Lorentz group is obtained by considering an infinitesimal Lorentz transformation,
$$\Lambda\simeq I_4+A,\tag{37}$$
where $A$ is a $4\times4$ matrix that depends on infinitesimal Lorentz group parameters. In particular, terms that are quadratic or of higher order in the infinitesimal group parameters are neglected. Inserting Equation (37) into Equation (32), and denoting $G\equiv\mathrm{diag}(+1,-1,-1,-1)$ to be the $4\times4$ matrix whose matrix elements are $\eta_{\mu\nu}$, it follows that
$$(I_4+A)^{\mathsf T}\,G\,(I_4+A)=G.\tag{38}$$
Keeping only terms up to linear order in the infinitesimal group parameters, we conclude that $A^{\mathsf T}G=-GA$ or, equivalently (since $G$ is a diagonal matrix),
$$(GA)^{\mathsf T}=-GA.\tag{39}$$
That is, $GA$ is a $4\times4$ real antisymmetric matrix. Hence, the Lie algebra of the Lorentz group, henceforth denoted by $\mathfrak{so}(1,3)$, consists of all $4\times4$ real matrices $A$ such that $GA$ is an antisymmetric matrix.
To construct a proper orthochronous Lorentz transformation, one can choose any $4\times4$ real matrix $A$ that satisfies Equation (39), and consider a large positive integer $n$ such that $A/n$ is an infinitesimal quantity. Then, a proper orthochronous Lorentz transformation is obtained by applying a sequence of $n$ infinitesimal Lorentz transformations in the limit as $n\to\infty$,
$$\Lambda=\lim_{n\to\infty}\left(I_4+\frac{A}{n}\right)^{n}=\exp A.\tag{40}$$
Note that $\Lambda$ is continuously connected to the identity matrix since one can continuously deform $A$ into the zero matrix. Hence, it follows that $\Lambda\in\mathrm{SO}_0(1,3)$. However, one can make a stronger statement: the exponential map, $\exp:\mathfrak{so}(1,3)\to\mathrm{SO}_0(1,3)$, is surjective. A proof of this result can be found in Section 6.3 of Ref. [10]. That is, the set of all proper orthochronous Lorentz transformations consists of matrices of the form $\exp A$, where $GA$ is a $4\times4$ real antisymmetric matrix.
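The limit in Equation (40) can be watched converging numerically. The sketch below, offered as an illustration rather than as part of the derivation, builds $A$ in the form of Equation (1), evaluates $(I_4+A/n)^n$ for $n=2^{20}$ by repeated squaring, and compares it with a truncated Taylor series of $\exp A$. The helper names, the sample parameters, and the choice of $n$ are assumptions made here.

```python
def matmul(X, Y):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def build_A(zeta, theta):
    """Matrix A in the form of Equation (1); G A is antisymmetric by construction."""
    z1, z2, z3 = zeta
    t1, t2, t3 = theta
    return [[0.0, z1, z2, z3],
            [z1, 0.0, -t3, t2],
            [z2, t3, 0.0, -t1],
            [z3, -t2, t1, 0.0]]

def exp_taylor(A, terms=40):
    """Reference value: truncated Taylor series of exp A."""
    result, term = identity(), identity()
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def exp_limit(A, squarings=20):
    """(I + A/n)^n with n = 2**squarings, evaluated by repeated squaring."""
    n = 2 ** squarings
    M = [[(1.0 if i == j else 0.0) + A[i][j] / n for j in range(4)]
         for i in range(4)]
    for _ in range(squarings):
        M = matmul(M, M)
    return M

# Hypothetical sample parameters
A = build_A((0.3, -0.5, 0.2), (0.4, 0.1, -0.7))
limit_error = max(abs(p - q)
                  for rp, rq in zip(exp_limit(A), exp_taylor(A))
                  for p, q in zip(rp, rq))
```

The residual `limit_error` is expected to shrink roughly like $1/n$, so the finite product is a poor way to compute $\exp A$ in practice; it is shown only to make the limiting construction concrete.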
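Let us first reconsider the two special cases examined in Section 2. A matrix representation of an infinitesimal boost is obtained by evaluating Equation (22) to leading order in $\boldsymbol{\zeta}$,
$$\Lambda(\boldsymbol{\zeta},\mathbf{0})\simeq\begin{pmatrix}1&\zeta^j\\ \zeta^i&\delta^{ij}\end{pmatrix}=I_4-i\boldsymbol{\zeta}\cdot\mathbf{k}+\mathcal{O}(|\boldsymbol{\zeta}|^2),\tag{41}$$
where the three matrices $\mathbf{k}=(k^1,k^2,k^3)$ are defined by
$$k^1=i\begin{pmatrix}0&1&0&0\\ 1&0&0&0\\ 0&0&0&0\\ 0&0&0&0\end{pmatrix},\qquad k^2=i\begin{pmatrix}0&0&1&0\\ 0&0&0&0\\ 1&0&0&0\\ 0&0&0&0\end{pmatrix},\qquad k^3=i\begin{pmatrix}0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\\ 1&0&0&0\end{pmatrix}.\tag{42}$$
Similarly, a matrix representation of an infinitesimal rotation is obtained by evaluating Equations (26) and (27) to leading order in $\boldsymbol{\theta}$ (with $\theta^k\equiv\theta n^k$),
$$\Lambda(\mathbf{0},\boldsymbol{\theta})\simeq\begin{pmatrix}1&0^j\\ 0^i&\delta^{ij}-\epsilon^{ijk}\theta^k\end{pmatrix}=I_4-i\boldsymbol{\theta}\cdot\mathbf{s}+\mathcal{O}(|\boldsymbol{\theta}|^2),\tag{43}$$
where the three matrices $\mathbf{s}=(s^1,s^2,s^3)$ are defined by
$$s^1=i\begin{pmatrix}0&0&0&0\\ 0&0&0&0\\ 0&0&0&-1\\ 0&0&1&0\end{pmatrix},\qquad s^2=i\begin{pmatrix}0&0&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&-1&0&0\end{pmatrix},\qquad s^3=i\begin{pmatrix}0&0&0&0\\ 0&0&-1&0\\ 0&1&0&0\\ 0&0&0&0\end{pmatrix}.\tag{44}$$
The six matrices $\mathbf{k}=(k^1,k^2,k^3)$ and $\mathbf{s}=(s^1,s^2,s^3)$ satisfy the following commutation relations:
$$[s^i,s^j]=i\epsilon^{ij\ell}s^\ell,\qquad[k^i,k^j]=-i\epsilon^{ij\ell}s^\ell,\qquad[s^i,k^j]=i\epsilon^{ij\ell}k^\ell,\tag{45}$$
where $i,j,\ell\in\{1,2,3\}$ and there is an implicit sum over the repeated index $\ell$.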
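The commutation relations of Equation (45) can be confirmed by brute force from the explicit matrices of Equations (42) and (44). The following sketch constructs the six generators with zero-based indices and checks all nine index pairs; the helper names are hypothetical and the comparison uses a small numerical tolerance.

```python
def levi_civita(i, j, k):
    """epsilon_{ijk} for zero-based indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def zeros():
    return [[0j] * 4 for _ in range(4)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(4)] for i in range(4)]

# Boost generators k^i of Equation (42): i times a symmetric 0i / i0 pattern.
K = []
for i in range(3):
    k = zeros()
    k[0][i + 1] = k[i + 1][0] = 1j
    K.append(k)

# Rotation generators s^l of Equation (44): spatial block (s^l)^{ij} = -i eps^{ijl}.
S = []
for l in range(3):
    s = zeros()
    for i in range(3):
        for j in range(3):
            s[i + 1][j + 1] = -1j * levi_civita(i, j, l)
    S.append(s)

def eps_combination(i, j, gens, factor):
    """factor * sum_l eps^{ijl} gens[l], the right-hand sides of Equation (45)."""
    return [[factor * sum(levi_civita(i, j, l) * gens[l][r][c] for l in range(3))
             for c in range(4)] for r in range(4)]

def close(X, Y, tol=1e-14):
    return all(abs(X[r][c] - Y[r][c]) < tol for r in range(4) for c in range(4))

def check_algebra():
    for i in range(3):
        for j in range(3):
            if not close(commutator(S[i], S[j]), eps_combination(i, j, S, 1j)):
                return False
            if not close(commutator(K[i], K[j]), eps_combination(i, j, S, -1j)):
                return False
            if not close(commutator(S[i], K[j]), eps_combination(i, j, K, 1j)):
                return False
    return True
```

Calling `check_algebra()` should return `True`. The minus sign in the boost-boost commutator is the feature that distinguishes $\mathfrak{so}(1,3)$ from $\mathfrak{so}(4)$.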
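Using Equations (41) and (43), it follows that the matrix representation of a general infinitesimal Lorentz transformation, to linear order in the boost and rotation parameters, is given by
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})\simeq\Lambda(\mathbf{0},\boldsymbol{\theta})\Lambda(\boldsymbol{\zeta},\mathbf{0})\simeq\begin{pmatrix}1&\zeta^j\\ \zeta^i&\delta^{ij}-\epsilon^{ijk}\theta^k\end{pmatrix}\simeq I_4-i\boldsymbol{\theta}\cdot\mathbf{s}-i\boldsymbol{\zeta}\cdot\mathbf{k}.\tag{46}$$
Note that we also could have written $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})\simeq\Lambda(\boldsymbol{\zeta},\mathbf{0})\Lambda(\mathbf{0},\boldsymbol{\theta})$ in Equation (46), since the infinitesimal Lorentz transformations commute at linear order.
In light of the remarks below Equation (40), one can conclude that the most general proper orthochronous Lorentz transformation matrix $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ is a $4\times4$ matrix given by
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})=\exp\left(-i\boldsymbol{\theta}\cdot\mathbf{s}-i\boldsymbol{\zeta}\cdot\mathbf{k}\right).\tag{47}$$
Here, we follow the conventions of Refs. [11,12]. Note that in the notation of Ref. [8], $\mathbf{k}=i\mathbf{K}$ and $\mathbf{s}=i\mathbf{S}$, where the $4\times4$ matrix representations of $\mathbf{K}$ and $\mathbf{S}$ are given in Equation (11.91) of Ref. [8] and yield $\Lambda=\exp(\boldsymbol{\theta}\cdot\mathbf{S}+\boldsymbol{\zeta}\cdot\mathbf{K})$. The argument of exp differs by an overall sign from Equation (11.93) of Ref. [8], where a passive Lorentz transformation is employed, which amounts to replacing $\{\boldsymbol{\zeta},\boldsymbol{\theta}\}$ with $\{-\boldsymbol{\zeta},-\boldsymbol{\theta}\}$.
Equations (42), (44) and (47) imply that
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})=\exp A,\quad\text{where}\quad A\equiv-i\boldsymbol{\theta}\cdot\mathbf{s}-i\boldsymbol{\zeta}\cdot\mathbf{k}=\begin{pmatrix}0&\zeta^1&\zeta^2&\zeta^3\\ \zeta^1&0&-\theta^3&\theta^2\\ \zeta^2&\theta^3&0&-\theta^1\\ \zeta^3&-\theta^2&\theta^1&0\end{pmatrix}.\tag{48}$$
As anticipated in Equation (39), $GA$ is the most general $4\times4$ real antisymmetric matrix, which depends on six real independent parameters $\zeta^i$ and $\theta^i$ ($i\in\{1,2,3\}$). The $\{s^i,k^i\}$ satisfy the commutation relations [Equation (45)] of the real Lie algebra $\mathfrak{so}(1,3)$. As indicated in Equation (48), $A$ is a real linear combination of the six Lie algebra generators $\{-is^i,-ik^i\}$ and thus constitutes a general element of $\mathfrak{so}(1,3)$. In Section 4, we provide an explicit computation of $\exp A$.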
Before moving on, we shall introduce a useful notation that assembles the matrices $\{s^i,k^i\}$ into six independent non-zero matrices, $s^{\rho\lambda}=-s^{\lambda\rho}$ (with $\lambda,\rho\in\{0,1,2,3\}$) such that
$$s^i\equiv\tfrac12\epsilon^{ij\ell}s^{j\ell},\qquad k^i\equiv s^{0i}=-s^{i0}.\tag{49}$$
Note that Equation (49) implies that $s^{ij}=\epsilon^{ij\ell}s^\ell$, so that the six independent matrices can be taken to be $s^{ij}$ ($i < j$) and $s^{0i}$ ($i,j\in\{1,2,3\}$). The matrix elements of the $s^{\rho\lambda}$ are given by
$$(s^{\rho\lambda})^\mu{}_\nu=i\left(\eta^{\rho\mu}\delta^\lambda_\nu-\eta^{\lambda\mu}\delta^\rho_\nu\right),\tag{50}$$
where $\mu$ indicates the row and $\nu$ indicates the column of the corresponding matrix.
Using Equation (49), one can check that Equation (50) is equivalent to Equations (42) and (44). In addition, the $\mathfrak{so}(1,3)$ commutation relations exhibited in Equation (45) now take the following form:
$$[s^{\alpha\beta},s^{\rho\lambda}]=i\left(\eta^{\beta\rho}s^{\alpha\lambda}-\eta^{\alpha\rho}s^{\beta\lambda}-\eta^{\beta\lambda}s^{\alpha\rho}+\eta^{\alpha\lambda}s^{\beta\rho}\right).\tag{51}$$
One can also assemble the boost and rotation parameters $\{\zeta^i,\theta^i\}$ into a second-rank antisymmetric tensor $\theta^{\alpha\beta}$ by defining
$$\theta^{ij}\equiv\epsilon^{ij\ell}\theta^\ell,\qquad\theta^{i0}=-\theta^{0i}\equiv\zeta^i.\tag{52}$$
With this new notation, Equation (47) can be rewritten as
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})=\exp\left(-\tfrac12 i\,\theta_{\rho\lambda}s^{\rho\lambda}\right),\tag{53}$$
where $\theta_{\rho\lambda}\equiv\eta_{\rho\alpha}\eta_{\lambda\beta}\theta^{\alpha\beta}$. As usual, there is an implied sum over each pair of repeated upper/lower indices.

4. An Explicit Evaluation of $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})=\exp A$

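We now proceed to evaluate $\exp A$, where $A$ is given by Equation (48). First, we compute the characteristic polynomial of $A$,
$$p(x)\equiv\det(A-xI_4)=x^4+\left(|\boldsymbol{\theta}|^2-|\boldsymbol{\zeta}|^2\right)x^2-(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})^2\equiv(x^2+a^2)(x^2-b^2),\tag{54}$$
where
$$a^2b^2=(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})^2,\qquad a^2-b^2=|\boldsymbol{\theta}|^2-|\boldsymbol{\zeta}|^2.\tag{55}$$
Solving Equation (55) for $a^2$ and $b^2$ yields
$$a^2=\tfrac12\left[|\boldsymbol{\theta}|^2-|\boldsymbol{\zeta}|^2+\sqrt{\left(|\boldsymbol{\theta}|^2-|\boldsymbol{\zeta}|^2\right)^2+4(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})^2}\,\right],\tag{56}$$
$$b^2=\tfrac12\left[|\boldsymbol{\zeta}|^2-|\boldsymbol{\theta}|^2+\sqrt{\left(|\boldsymbol{\theta}|^2-|\boldsymbol{\zeta}|^2\right)^2+4(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})^2}\,\right].\tag{57}$$
Note that $a^2\geq0$ and $b^2\geq0$ so that $a,b\in\mathbb{R}$. The individual signs of $a$ and $b$ are not determined, but none of the results that follow depend on these signs. The eigenvalues of $A$, denoted by $\lambda_i$ ($i=1,2,3,4$), are the solutions of $p(x)=0$, which are given by
$$\lambda_i=ia,\ -ia,\ b,\ -b.\tag{58}$$
If $ab\neq0$, then the four eigenvalues of $A$ [Equation (58)] are distinct, which implies that $A$ is a diagonalizable matrix.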
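The quantities $a$ and $b$ of Equations (56) and (57) are easy to compute, and the factorization in Equation (54) can be tested through the Cayley–Hamilton theorem, which requires $(A^2+a^2I_4)(A^2-b^2I_4)=0$. The sketch below does both for one sample choice of parameters; the helper names and the sample values are assumptions, and non-negative roots are chosen for $a$ and $b$ since their signs are immaterial.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def build_A(zeta, theta):
    """Matrix A of Equation (48)."""
    z1, z2, z3 = zeta
    t1, t2, t3 = theta
    return [[0.0, z1, z2, z3],
            [z1, 0.0, -t3, t2],
            [z2, t3, 0.0, -t1],
            [z3, -t2, t1, 0.0]]

def eigen_parameters(zeta, theta):
    """Non-negative roots a, b of Equations (56) and (57)."""
    t2 = sum(t * t for t in theta)
    z2 = sum(z * z for z in zeta)
    td = sum(t * z for t, z in zip(theta, zeta))
    root = math.sqrt((t2 - z2) ** 2 + 4.0 * td * td)
    # max(0, .) guards against tiny negative values caused by rounding
    a = math.sqrt(max(0.0, 0.5 * (t2 - z2 + root)))
    b = math.sqrt(max(0.0, 0.5 * (z2 - t2 + root)))
    return a, b

# Hypothetical sample parameters
zeta, theta = (0.3, -0.5, 0.2), (0.4, 0.1, -0.7)
a, b = eigen_parameters(zeta, theta)
A = build_A(zeta, theta)
A2 = matmul(A, A)
plus = [[A2[i][j] + (a * a if i == j else 0.0) for j in range(4)] for i in range(4)]
minus = [[A2[i][j] - (b * b if i == j else 0.0) for j in range(4)] for i in range(4)]
# Cayley-Hamilton: p(A) = (A^2 + a^2 I)(A^2 - b^2 I) should vanish
residual = max(abs(v) for row in matmul(plus, minus) for v in row)
```

A vanishing `residual` confirms the coefficients of the characteristic polynomial without computing a determinant symbolically.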
To evaluate $\exp A$ for a diagonalizable matrix $A$, we shall make use of a formula [Equation (60) below] that is based on the Lagrange interpolating polynomial. Consider an $n\times n$ matrix $A$ with $n$ eigenvalues of which $m$ are distinct and denoted by $\lambda_i$ ($i=1,2,\ldots,m$). The matrix $A$ is diagonalizable if and only if (e.g., see Section 8.3.2 of Ref. [7] or Section 7.11 of Ref. [13])
$$\prod_{i=1}^{m}(A-\lambda_iI_n)=0,\tag{59}$$
where $I_n$ is the $n\times n$ identity matrix. Note that if $m=n$ (i.e., all $n$ eigenvalues are distinct), then $A$ is diagonalizable, since in this case Equation (59) is automatically satisfied due to the Cayley–Hamilton theorem.
Any function of a diagonalizable matrix $A$ is given by the following formula (e.g., see Equations (7.3.6) and (7.3.11) of Ref. [13], Equation (5.4.17) of Ref. [14], or Chapter V, Section 2.2 of Ref. [15]):
$$f(A)=\sum_{i=1}^{m}f(\lambda_i)K_i,\quad\text{where}\quad K_i=\prod_{\substack{j=1\\ j\neq i}}^{m}\frac{A-\lambda_jI_n}{\lambda_i-\lambda_j},\tag{60}$$
if $2\leq m\leq n$ and $K_1\equiv I_n$ if $m=1$. Note that $\sum_{i=1}^{m}K_i=I_n$.
Applying Equation (60) to $f(A)=\exp A$, where $A$ is given by Equation (48), under the assumption that $ab\neq0$, it follows that
$$\begin{aligned}\exp A={}&e^{ia}\left(\frac{A+iaI_4}{2ia}\right)\left(\frac{A-bI_4}{ia-b}\right)\left(\frac{A+bI_4}{ia+b}\right)+e^{-ia}\left(\frac{A-iaI_4}{-2ia}\right)\left(\frac{A-bI_4}{-ia-b}\right)\left(\frac{A+bI_4}{-ia+b}\right)\\ &+e^{b}\left(\frac{A-iaI_4}{b-ia}\right)\left(\frac{A+iaI_4}{b+ia}\right)\left(\frac{A+bI_4}{2b}\right)+e^{-b}\left(\frac{A-iaI_4}{-b-ia}\right)\left(\frac{A+iaI_4}{-b+ia}\right)\left(\frac{A-bI_4}{-2b}\right).\end{aligned}\tag{61}$$
Simplifying the above expression yields
$$\exp A=\frac{1}{a^2+b^2}\left\{(b^2I_4-A^2)\left[A\,\frac{\sin a}{a}+I_4\cos a\right]+(A^2+a^2I_4)\left[A\,\frac{\sinh b}{b}+I_4\cosh b\right]\right\}.\tag{62}$$
Combining terms, we end up with
$$\exp\begin{pmatrix}0&\zeta^1&\zeta^2&\zeta^3\\ \zeta^1&0&-\theta^3&\theta^2\\ \zeta^2&\theta^3&0&-\theta^1\\ \zeta^3&-\theta^2&\theta^1&0\end{pmatrix}=\frac{1}{a^2+b^2}\left[f_0(a,b)I_4+f_1(a,b)A+f_2(a,b)A^2+f_3(a,b)A^3\right],\tag{63}$$
where $a$ and $b$ are defined in Equation (55) and
$$f_0(a,b)=b^2\cos a+a^2\cosh b,\qquad f_1(a,b)=\frac{b^2}{a}\sin a+\frac{a^2}{b}\sinh b,\tag{64}$$
$$f_2(a,b)=\cosh b-\cos a,\qquad f_3(a,b)=\frac{\sinh b}{b}-\frac{\sin a}{a},\tag{65}$$
in agreement with the results previously obtained in Refs. [3,4,5,6].
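The closed form of Equations (63)–(65) can be compared against a direct summation of the exponential series. The sketch below is one way to do so under the stated assumption $ab\neq0$; the function names (`exp_closed_form`, `exp_taylor`) and the sample parameters are illustrative and not taken from the paper.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def build_A(zeta, theta):
    """Matrix A of Equation (48)."""
    z1, z2, z3 = zeta
    t1, t2, t3 = theta
    return [[0.0, z1, z2, z3],
            [z1, 0.0, -t3, t2],
            [z2, t3, 0.0, -t1],
            [z3, -t2, t1, 0.0]]

def exp_taylor(A, terms=40):
    """Reference value: truncated Taylor series of exp A."""
    result, term = identity(), identity()
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def exp_closed_form(zeta, theta):
    """exp A from Equations (63)-(65); only valid when a*b != 0."""
    t2 = sum(t * t for t in theta)
    z2 = sum(z * z for z in zeta)
    td = sum(t * z for t, z in zip(theta, zeta))
    root = math.sqrt((t2 - z2) ** 2 + 4.0 * td * td)
    a = math.sqrt(max(0.0, 0.5 * (t2 - z2 + root)))     # Equation (56)
    b = math.sqrt(max(0.0, 0.5 * (z2 - t2 + root)))     # Equation (57)
    if a == 0.0 or b == 0.0:
        raise ValueError("a*b = 0: use the degenerate formulas instead")
    f0 = b * b * math.cos(a) + a * a * math.cosh(b)
    f1 = b * b / a * math.sin(a) + a * a / b * math.sinh(b)
    f2 = math.cosh(b) - math.cos(a)
    f3 = math.sinh(b) / b - math.sin(a) / a
    A = build_A(zeta, theta)
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    norm = a * a + b * b
    return [[((f0 if i == j else 0.0) + f1 * A[i][j] + f2 * A2[i][j]
              + f3 * A3[i][j]) / norm
             for j in range(4)] for i in range(4)]

# Hypothetical sample parameters with theta . zeta != 0
zeta, theta = (0.3, -0.5, 0.2), (0.4, 0.1, -0.7)
closed = exp_closed_form(zeta, theta)
series = exp_taylor(build_A(zeta, theta))
closed_form_error = max(abs(p - q) for rp, rq in zip(closed, series)
                        for p, q in zip(rp, rq))
```

For parameters of order one, the two evaluations should agree to near machine precision. When $a$ or $b$ is very small but non-zero, the divisions in $f_1$ and $f_3$ may lose accuracy, so a production implementation would likely switch to the limiting formulas or to series expansions there.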
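The matrix $A$ and its powers can be conveniently written in block matrix form:
$$A=\begin{pmatrix}0&\zeta^j\\ \zeta^i&-\epsilon^{ijk}\theta^k\end{pmatrix},\qquad A^2=\begin{pmatrix}|\boldsymbol{\zeta}|^2&\epsilon^{jk\ell}\zeta^k\theta^\ell\\ -\epsilon^{ik\ell}\zeta^k\theta^\ell&\zeta^i\zeta^j+\theta^i\theta^j-\delta^{ij}|\boldsymbol{\theta}|^2\end{pmatrix},\tag{66}$$
and
$$A^3=\begin{pmatrix}0&\left(|\boldsymbol{\zeta}|^2-|\boldsymbol{\theta}|^2\right)\zeta^j+(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})\theta^j\\ \left(|\boldsymbol{\zeta}|^2-|\boldsymbol{\theta}|^2\right)\zeta^i+(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})\theta^i&\left(\epsilon^{jk\ell}\zeta^i-\epsilon^{ik\ell}\zeta^j\right)\zeta^k\theta^\ell+\epsilon^{ijk}\theta^k|\boldsymbol{\theta}|^2\end{pmatrix}.\tag{67}$$
The $ij$ element of $A^3$ can be simplified by noting that the $ij$ element of any $3\times3$ antisymmetric matrix must be of the form $\epsilon^{ijk}C^k$ (after summing over the repeated index $k$). Thus,
$$\left(\epsilon^{jk\ell}\zeta^i-\epsilon^{ik\ell}\zeta^j\right)\zeta^k\theta^\ell=\epsilon^{ijk}C^k.\tag{68}$$
Multiplying the above equation by $\epsilon^{ijm}$ and summing over $i$ and $j$ yields
$$\left(\delta^{i\ell}\delta^{km}-\delta^{ik}\delta^{\ell m}\right)\zeta^i\zeta^k\theta^\ell-\left(\delta^{jk}\delta^{\ell m}-\delta^{j\ell}\delta^{km}\right)\zeta^j\zeta^k\theta^\ell=2\delta^{km}C^k.\tag{69}$$
It follows that $C^m=(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})\zeta^m-|\boldsymbol{\zeta}|^2\theta^m$. That is, we have derived the identity
$$\left(\epsilon^{jk\ell}\zeta^i-\epsilon^{ik\ell}\zeta^j\right)\zeta^k\theta^\ell=\epsilon^{ijk}\left[(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})\zeta^k-|\boldsymbol{\zeta}|^2\theta^k\right].\tag{70}$$
Thus, the matrix $A^3$ [Equation (67)] can be rewritten in a more convenient form,
$$A^3=\begin{pmatrix}0&\left(|\boldsymbol{\zeta}|^2-|\boldsymbol{\theta}|^2\right)\zeta^j+(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})\theta^j\\ \left(|\boldsymbol{\zeta}|^2-|\boldsymbol{\theta}|^2\right)\zeta^i+(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})\theta^i&\epsilon^{ijk}\left[(\boldsymbol{\theta}\cdot\boldsymbol{\zeta})\zeta^k-\left(|\boldsymbol{\zeta}|^2-|\boldsymbol{\theta}|^2\right)\theta^k\right]\end{pmatrix}.\tag{71}$$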
Consider separately the case of $ab=0$. The eigenvalues given in Equation (58) are no longer distinct. If $a=0$ and $b\neq0$, then the matrix $A$ is diagonalizable since $A$ satisfies Equation (59), i.e., $A(A^2-b^2I_4)=0$. In particular, if $a=0$ then Equation (55) implies that $\boldsymbol{\theta}\cdot\boldsymbol{\zeta}=0$ and $b^2=|\boldsymbol{\zeta}|^2-|\boldsymbol{\theta}|^2$. Plugging these results into Equations (66) and (71) yields $A^3-b^2A=0$. Consequently, one can make use of Equation (60) with $m=3$ to obtain
$$\exp A=\left(\frac{A-bI_4}{-b}\right)\left(\frac{A+bI_4}{b}\right)+e^{b}\left(\frac{A}{b}\right)\left(\frac{A+bI_4}{2b}\right)+e^{-b}\left(\frac{A}{-b}\right)\left(\frac{A-bI_4}{-2b}\right)=I_4+\frac{\sinh b}{b}A+\frac{\cosh b-1}{b^2}A^2,\quad\text{for } a=0.\tag{72}$$
One can check that Equation (72) coincides with the $a\to0$ limit of Equations (63)–(65) after making use of $A^3=b^2A$.
Likewise, if $b=0$ and $a\neq0$, then the matrix $A$ is diagonalizable since $A$ satisfies Equation (59), i.e., $A(A^2+a^2I_4)=0$. In particular, if $b=0$, then Equation (55) implies that $\boldsymbol{\theta}\cdot\boldsymbol{\zeta}=0$ and $a^2=|\boldsymbol{\theta}|^2-|\boldsymbol{\zeta}|^2$. Plugging these results into Equations (66) and (71) yields $A^3+a^2A=0$. Consequently, one can make use of Equation (60) with $m=3$ to obtain
$$\exp A=\left(\frac{A-iaI_4}{-ia}\right)\left(\frac{A+iaI_4}{ia}\right)+e^{ia}\left(\frac{A}{ia}\right)\left(\frac{A+iaI_4}{2ia}\right)+e^{-ia}\left(\frac{A}{-ia}\right)\left(\frac{A-iaI_4}{-2ia}\right)=I_4+\frac{\sin a}{a}A+\frac{1-\cos a}{a^2}A^2,\quad\text{for } b=0.\tag{73}$$
One can check that Equation (73) coincides with the $b\to0$ limit of Equations (63)–(65) after making use of $A^3=-a^2A$.
Finally, in the case of $a=b=0$, Equation (55) yields $\boldsymbol{\theta}\cdot\boldsymbol{\zeta}=0$ and $|\boldsymbol{\zeta}|^2=|\boldsymbol{\theta}|^2$. Using Equation (71), it then follows that $A^3=0$. Thus, the Taylor series of the exponential terminates and one obtains
$$\exp A=I_4+A+\tfrac12A^2,\quad\text{for } a=b=0.\tag{74}$$
Although one cannot directly employ Equation (60) in this final case (since $A$ is no longer diagonalizable), one can still recover Equation (74) either by taking the $b\to0$ limit of Equation (72) or the $a\to0$ limit of Equation (73).
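The three degenerate cases of Equations (72)–(74) share the structure $\exp A=I_4+c_1A+c_2A^2$, with only the coefficients changing. A minimal sketch that selects the appropriate branch when $\boldsymbol{\theta}\cdot\boldsymbol{\zeta}=0$ is given below. The function name, the tolerance `tol`, and the sample inputs are assumptions made for illustration.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def build_A(zeta, theta):
    """Matrix A of Equation (48)."""
    z1, z2, z3 = zeta
    t1, t2, t3 = theta
    return [[0.0, z1, z2, z3],
            [z1, 0.0, -t3, t2],
            [z2, t3, 0.0, -t1],
            [z3, -t2, t1, 0.0]]

def exp_degenerate(zeta, theta, tol=1e-12):
    """exp A when theta . zeta = 0 (so that a*b = 0), via Equations (72)-(74)."""
    td = sum(t * z for t, z in zip(theta, zeta))
    if abs(td) > tol:
        raise ValueError("requires theta . zeta = 0")
    d = sum(z * z for z in zeta) - sum(t * t for t in theta)
    if d > tol:                       # a = 0, b^2 = d: Equation (72)
        b = math.sqrt(d)
        c1, c2 = math.sinh(b) / b, (math.cosh(b) - 1.0) / d
    elif d < -tol:                    # b = 0, a^2 = -d: Equation (73)
        a = math.sqrt(-d)
        c1, c2 = math.sin(a) / a, (1.0 - math.cos(a)) / (-d)
    else:                             # a = b = 0: Equation (74)
        c1, c2 = 1.0, 0.5
    A = build_A(zeta, theta)
    A2 = matmul(A, A)
    return [[(1.0 if i == j else 0.0) + c1 * A[i][j] + c2 * A2[i][j]
             for j in range(4)] for i in range(4)]

# Hypothetical sample inputs, one per branch
pure_boost = exp_degenerate((0.0, 0.0, 0.8), (0.0, 0.0, 0.0))
pure_rotation = exp_degenerate((0.0, 0.0, 0.0), (0.0, 0.0, 0.5))
null_case = exp_degenerate((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
```

The last sample has $\boldsymbol{\theta}\perp\boldsymbol{\zeta}$ with equal magnitudes, so $A^3=0$ and the series terminates after the quadratic term. Very small but non-zero values of $|d|$ would call for a series expansion of the coefficients, which this sketch does not attempt.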
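It is instructive to check the two limiting cases exhibited in Section 2. First, if $\boldsymbol{\theta}=\mathbf{0}$, then $a=0$ and $b=|\boldsymbol{\zeta}|\equiv\zeta$. It then follows that
$$A=\begin{pmatrix}0&\zeta^j\\ \zeta^i&0^{ij}\end{pmatrix},\qquad A^2=\begin{pmatrix}|\boldsymbol{\zeta}|^2&0^j\\ 0^i&\zeta^i\zeta^j\end{pmatrix},\tag{75}$$
where $0^{ij}$ is a $3\times3$ matrix of zeros. Using Equations (72) and (75), we obtain
$$\Lambda(\boldsymbol{\zeta},\mathbf{0})=\begin{pmatrix}\cosh\zeta&\dfrac{\zeta^j}{\zeta}\sinh\zeta\\[2mm] \dfrac{\zeta^i}{\zeta}\sinh\zeta&\delta^{ij}+\dfrac{\zeta^i\zeta^j}{|\boldsymbol{\zeta}|^2}(\cosh\zeta-1)\end{pmatrix},\tag{76}$$
in agreement with Equation (22).
Second, if $\boldsymbol{\zeta}=\mathbf{0}$, then $a=|\boldsymbol{\theta}|\equiv\theta$ and $b=0$. It follows that
$$A=\begin{pmatrix}0&0^j\\ 0^i&-\epsilon^{ijk}\theta^k\end{pmatrix},\qquad A^2=\begin{pmatrix}0&0^j\\ 0^i&\theta^i\theta^j-\delta^{ij}|\boldsymbol{\theta}|^2\end{pmatrix}.\tag{77}$$
Using Equations (73) and (77), we end up with
$$\Lambda(\mathbf{0},\boldsymbol{\theta})=\begin{pmatrix}1&0^j\\ 0^i&\delta^{ij}\cos\theta+n^in^j(1-\cos\theta)-\epsilon^{ijk}n^k\sin\theta\end{pmatrix},\tag{78}$$
after identifying $\theta^i=\theta n^i$. We have thus recovered Equation (26) and Rodrigues’ rotation formula [Equation (27)].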
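A final limiting case of interest is the most general orthochronous Lorentz transformation in $2+1$ spacetime dimensions. In this case, we can choose $\boldsymbol{\theta}=\theta\hat{\mathbf{z}}$ and $\boldsymbol{\zeta}=\zeta^1\hat{\mathbf{x}}+\zeta^2\hat{\mathbf{y}}$, which implies that $ab=0$ [cf. Equation (55)]. Without loss of generality, one can take $b=0$ and $a^2=\theta^2-|\boldsymbol{\zeta}|^2$, where $\theta^2$ is the square of the rotation angle $\theta$ (in two space dimensions, there is no danger in confusing $\theta^2$ with the second component of the vector $\boldsymbol{\theta}=\theta\hat{\mathbf{z}}$). Hence, Equation (73) yields
$$\exp\begin{pmatrix}0&\zeta^1&\zeta^2\\ \zeta^1&0&-\theta\\ \zeta^2&\theta&0\end{pmatrix}=I_3+\frac{\sin\sqrt{\theta^2-|\boldsymbol{\zeta}|^2}}{\sqrt{\theta^2-|\boldsymbol{\zeta}|^2}}\,A+\frac{1-\cos\sqrt{\theta^2-|\boldsymbol{\zeta}|^2}}{\theta^2-|\boldsymbol{\zeta}|^2}\,A^2,\tag{79}$$
where the $3\times3$ matrices $A$ and $A^2$ are given in block matrix form by
$$A=\begin{pmatrix}0&\zeta^j\\ \zeta^i&-\theta\epsilon^{ij}\end{pmatrix},\qquad A^2=\begin{pmatrix}|\boldsymbol{\zeta}|^2&-\theta\epsilon^{kj}\zeta^k\\ -\theta\epsilon^{ik}\zeta^k&\zeta^i\zeta^j-\theta^2\delta^{ij}\end{pmatrix},\tag{80}$$
with $i,j\in\{1,2\}$ (and an implied sum over $k=1,2$), $\epsilon^{12}=-\epsilon^{21}=1$, and $\epsilon^{11}=\epsilon^{22}=0$.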

5. An Explicit Evaluation of $\Lambda^\mu{}_\nu=\tfrac12\mathrm{Tr}\left(M^\dagger\bar\sigma^\mu M\sigma_\nu\right)$

In Section 3, we remarked that a general element of the Lie algebra $\mathfrak{so}(1,3)$ is a real linear combination of the six generators $\{-is^i,-ik^i\}$. In particular, the matrix $A$ defined in Equation (48) provides a four-dimensional matrix representation of $\mathfrak{so}(1,3)$. The corresponding $4\times4$ matrix that represents a general element of the proper orthochronous Lorentz group, $\mathrm{SO}_0(1,3)$, is then obtained by exponentiation, $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})=\exp A$. In this section, we will take advantage of the existence of a two-dimensional matrix representation of $\mathfrak{so}(1,3)$. It is noteworthy that by exponentiating this two-dimensional representation, one obtains a two-dimensional matrix representation of the group of complex $2\times2$ matrices with unit determinant, which defines the Lie group $\mathrm{SL}(2,\mathbb{C})$. Thus, the two-dimensional matrix representation of $\mathrm{SL}(2,\mathbb{C})$ provides representation matrices $M$ [defined in Equation (81) below] for the elements of $\mathrm{SO}_0(1,3)$. However, in this case, the $2\times2$ matrices $M$ and $-M$ of $\mathrm{SL}(2,\mathbb{C})$ represent the same element of $\mathrm{SO}_0(1,3)$ [cf. Equation (91)].
For example, consider the general element of the two-dimensional representation of $\mathrm{SL}(2,\mathbb{C})$ that is given by
$$M=\exp\left(-\tfrac12 i\boldsymbol{\theta}\cdot\boldsymbol{\sigma}-\tfrac12\boldsymbol{\zeta}\cdot\boldsymbol{\sigma}\right),\tag{81}$$
where $\boldsymbol{\zeta}$ and $\boldsymbol{\theta}$ are the boost and rotation vectors that parametrize an element of the proper orthochronous Lorentz group and $\boldsymbol{\sigma}=(\sigma^1,\sigma^2,\sigma^3)$ are the three Pauli matrices assembled into a vector whose components are the $2\times2$ matrices,
$$\sigma^1=\begin{pmatrix}0&1\\ 1&0\end{pmatrix},\qquad\sigma^2=\begin{pmatrix}0&-i\\ i&0\end{pmatrix},\qquad\sigma^3=\begin{pmatrix}1&0\\ 0&-1\end{pmatrix}.\tag{82}$$
It is convenient to define a fourth Pauli matrix, $\sigma^0=I_2$, where $I_2$ is the $2\times2$ identity matrix. We can then define the four Pauli matrices in a unified notation. Following the notation of Refs. [11,12], we define
$$\sigma^\mu=(I_2;\boldsymbol{\sigma}),\qquad\bar\sigma^\mu=(I_2;-\boldsymbol{\sigma}),\tag{83}$$
where $\mu\in\{0,1,2,3\}$. Note that these sigma matrices have been defined with an upper (contravariant) index. They are related to sigma matrices with a lower (covariant) index in the usual way:
$$\sigma_\mu=\eta_{\mu\nu}\sigma^\nu=(I_2;-\boldsymbol{\sigma}),\qquad\bar\sigma_\mu=\eta_{\mu\nu}\bar\sigma^\nu=(I_2;\boldsymbol{\sigma}).\tag{84}$$
However, the use of the spacetime indices $\mu$ and $\nu$ is slightly deceptive since the sigma matrices defined above are fixed matrices that do not change under a Lorentz transformation.
It is also convenient to introduce the set of $2\times2$ matrices,
$$\sigma^{\mu\nu}=-\sigma^{\nu\mu}\equiv\tfrac14 i\left(\sigma^\mu\bar\sigma^\nu-\sigma^\nu\bar\sigma^\mu\right).\tag{85}$$
One can then rewrite Equation (81) in the following form that is reminiscent of Equation (53),
$$M=\exp\left(-\tfrac12 i\,\theta_{\mu\nu}\sigma^{\mu\nu}\right).\tag{86}$$
That is, the six independent $-i\sigma^{\mu\nu}$ matrices are generators of the Lie algebra of $\mathrm{SL}(2,\mathbb{C})$, henceforth denoted by $\mathfrak{sl}(2,\mathbb{C})$. It is straightforward to check that the $2\times2$ matrices $\sigma^{\mu\nu}$ possess the same commutation relations as the $4\times4$ matrices $s^{\mu\nu}$ [cf. Equation (51)], which establishes the isomorphism $\mathfrak{so}(1,3)\cong\mathfrak{sl}(2,\mathbb{C})$.
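Under an active Lorentz transformation, a two-component spinor $\chi_\alpha$ (where $\alpha \in \{1,2\}$) transforms as
$$\chi'_\alpha = M_\alpha{}^\beta\,\chi_\beta, \qquad \alpha,\beta \in \{1,2\}. \tag{87}$$
Suppose that $\chi$ and $\eta$ are two-component spinors and consider the spinor product $\eta^\dagger\bar\sigma^\mu\chi$. Under a Lorentz transformation,
$$\eta^\dagger\bar\sigma^\mu\chi \longrightarrow (M\eta)^\dagger\bar\sigma^\mu(M\chi) = \eta^\dagger\left(M^\dagger\bar\sigma^\mu M\right)\chi. \tag{88}$$
We assert that the quantity $\eta^\dagger\bar\sigma^\mu\chi$ transforms as a Lorentz four-vector,
$$\eta^\dagger\bar\sigma^\mu\chi \longrightarrow \Lambda^\mu{}_\nu\,\eta^\dagger\bar\sigma^\nu\chi. \tag{89}$$
The standard proof of this assertion based on the analysis of infinitesimal Lorentz transformations is given in Appendix B. (See also Appendix C, where the corresponding result is obtained by employing the four-component spinor formalism.) Equations (88) and (89) imply that the following identity must be satisfied:
$$M^\dagger\bar\sigma^\mu M = \Lambda^\mu{}_\nu\,\bar\sigma^\nu. \tag{90}$$
Multiplying Equation (90) on the right by $\sigma_\rho$ and using $\operatorname{Tr}(\bar\sigma^\nu\sigma_\rho) = 2\delta^\nu_\rho$, it follows that
$$\Lambda^\mu{}_\nu = \tfrac12\operatorname{Tr}\left(M^\dagger\bar\sigma^\mu M\sigma_\nu\right). \tag{91}$$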
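As a numerical sanity check of Equation (91), the following minimal sketch builds $M$ from Equation (81) and assembles $\Lambda$ from the trace formula. It assumes NumPy is available. The helper names (`expm_series`, `spinor_matrix`, `lorentz_from_trace`) and the sample parameter values are illustrative choices that do not appear in the paper, and the matrix exponential is approximated by a truncated Taylor series rather than evaluated in closed form.

```python
import numpy as np

# Pauli matrices sigma^1, sigma^2, sigma^3 [Equation (82)].
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)
ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric diag(+1,-1,-1,-1)


def expm_series(A, terms=40):
    """Truncated Taylor series for exp(A); adequate for small-norm matrices."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result


def spinor_matrix(zeta, theta):
    """M = exp(-z.sigma/2) with z = zeta + i*theta [Equation (81)]."""
    z = np.asarray(zeta, dtype=complex) + 1j * np.asarray(theta, dtype=complex)
    return expm_series(-0.5 * np.einsum('i,ijk->jk', z, PAULI))


def lorentz_from_trace(M):
    """Lambda^mu_nu = (1/2) Tr(M^dagger sigmabar^mu M sigma_nu) [Equation (91)]."""
    sigma_bar_up = [I2] + [-s for s in PAULI]  # sigmabar^mu = (I2; -sigma)
    sigma_down = [I2] + [-s for s in PAULI]    # sigma_mu    = (I2; -sigma)
    Md = M.conj().T
    return np.array([[0.5 * np.trace(Md @ sigma_bar_up[mu] @ M @ sigma_down[nu])
                      for nu in range(4)] for mu in range(4)])


M = spinor_matrix([0.3, -0.2, 0.5], [0.4, 0.1, -0.7])
Lam = lorentz_from_trace(M)
print(np.max(np.abs(Lam.imag)))                       # imaginary parts vanish up to rounding
print(np.allclose(Lam.real.T @ ETA @ Lam.real, ETA))  # the metric is preserved
print(np.allclose(lorentz_from_trace(-M), Lam))       # M and -M give the same Lambda
```

The last check illustrates the two-to-one correspondence noted above: the trace formula is quadratic in $M$, so an overall sign of $M$ drops out.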
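It is now convenient to introduce the complex vector, $\boldsymbol{z} \equiv \boldsymbol{\zeta} + i\boldsymbol{\theta}$, and the associated quantity,
$$\Delta \equiv \left(\boldsymbol{z}\cdot\boldsymbol{z}\right)^{1/2} = \left(|\boldsymbol{\zeta}|^2 - |\boldsymbol{\theta}|^2 + 2i\,\boldsymbol{\theta}\cdot\boldsymbol{\zeta}\right)^{1/2}. \tag{92}$$
One can now evaluate the matrix exponential $M = \exp\left(-\tfrac12\boldsymbol{z}\cdot\boldsymbol{\sigma}\right)$ [cf. Equation (81)] by making use of Equation (60) if $\Delta \neq 0$. The corresponding eigenvalues of $-\tfrac12\boldsymbol{z}\cdot\boldsymbol{\sigma}$ are $\lambda = \pm\tfrac12\Delta$. Hence,
$$M = \exp\left(-\tfrac12\boldsymbol{z}\cdot\boldsymbol{\sigma}\right) = e^{\Delta/2}\,\frac{I_2\Delta - \boldsymbol{z}\cdot\boldsymbol{\sigma}}{2\Delta} + e^{-\Delta/2}\,\frac{I_2\Delta + \boldsymbol{z}\cdot\boldsymbol{\sigma}}{2\Delta} = I_2\cosh\left(\tfrac12\Delta\right) - \boldsymbol{z}\cdot\boldsymbol{\sigma}\,\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}. \tag{93}$$
Note that the limit as $\Delta \to 0$ is continuous and yields $M = I_2 - \tfrac12\boldsymbol{z}\cdot\boldsymbol{\sigma}$.
Since the Pauli matrices are hermitian,
$$M^\dagger = \exp\left(-\tfrac12\boldsymbol{z}^*\cdot\boldsymbol{\sigma}\right) = I_2\cosh\left(\tfrac12\Delta^*\right) - \boldsymbol{z}^*\cdot\boldsymbol{\sigma}\,\frac{\sinh\left(\tfrac12\Delta^*\right)}{\Delta^*}. \tag{94}$$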
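The closed form in Equation (93) can be compared against a direct evaluation of the exponential. The sketch below is only illustrative: it assumes NumPy, takes $\Delta \neq 0$, uses a truncated Taylor series as the reference exponential, and the function names and sample values are hypothetical.

```python
import numpy as np

# Pauli matrices sigma^1, sigma^2, sigma^3.
PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)


def expm_series(A, terms=40):
    """Truncated Taylor series for exp(A); adequate for small-norm matrices."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result


def spinor_matrix_closed(zeta, theta):
    """Closed form M = cosh(Delta/2) I2 - z.sigma sinh(Delta/2)/Delta (Delta != 0)."""
    z = np.asarray(zeta, dtype=complex) + 1j * np.asarray(theta, dtype=complex)
    delta = np.sqrt(z @ z)  # Delta = (z.z)^{1/2}; the dot product is NOT conjugated
    z_sigma = np.einsum('i,ijk->jk', z, PAULI)
    return np.cosh(delta / 2) * np.eye(2) - z_sigma * np.sinh(delta / 2) / delta


zeta = np.array([0.3, -0.2, 0.5])
theta = np.array([0.4, 0.1, -0.7])
z_sigma = np.einsum('i,ijk->jk', zeta + 1j * theta, PAULI)

M_closed = spinor_matrix_closed(zeta, theta)
M_series = expm_series(-0.5 * z_sigma)
print(np.allclose(M_closed, M_series))   # closed form agrees with the series
print(np.linalg.det(M_closed))           # unit determinant, as required for SL(2,C)
# Hermitian conjugation corresponds to z -> z*, i.e. theta -> -theta.
print(np.allclose(M_closed.conj().T, spinor_matrix_closed(zeta, -theta)))
```

Because the closed form is even in $\Delta$, the branch chosen by `np.sqrt` does not affect the result.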
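We shall evaluate $\Lambda^\mu{}_\nu$ in four separate cases depending on whether the spacetime index is 0 or $i \in \{1,2,3\}$. In particular, using block matrix notation, Equation (91) yields
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta}) = \begin{pmatrix} \Lambda^0{}_0 & \Lambda^0{}_j \\ \Lambda^i{}_0 & \Lambda^i{}_j \end{pmatrix} = \frac12\begin{pmatrix} \operatorname{Tr}(M^\dagger M) & -\operatorname{Tr}(M\sigma^j M^\dagger) \\ -\operatorname{Tr}(M^\dagger\sigma^i M) & \operatorname{Tr}(M^\dagger\sigma^i M\sigma^j) \end{pmatrix}, \tag{95}$$
where we have used $\sigma_j = -\sigma^j$ to obtain the final matrix expression above.
Plugging Equations (93) and (94) into Equation (91) and evaluating the traces with the help of
$$\operatorname{Tr}(\sigma^i\sigma^j) = 2\delta^{ij}, \tag{96}$$
$$\operatorname{Tr}(\sigma^i\sigma^j\sigma^k) = 2i\,\epsilon^{ijk}, \tag{97}$$
$$\operatorname{Tr}(\sigma^i\sigma^j\sigma^k\sigma^\ell) = 2\left(\delta^{ij}\delta^{k\ell} - \delta^{ik}\delta^{j\ell} + \delta^{i\ell}\delta^{jk}\right), \tag{98}$$
we end up with the following expressions:
$$\Lambda^0{}_0 = \left|\cosh\left(\tfrac12\Delta\right)\right|^2 + \left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\left(|\boldsymbol{\zeta}|^2 + |\boldsymbol{\theta}|^2\right), \tag{99}$$
$$\Lambda^0{}_j = \left[\cosh\left(\tfrac12\Delta^*\right)\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\,z^j + \mathrm{c.c.}\right] + i\left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\epsilon^{jk\ell}z^k z^{*\ell}, \tag{100}$$
$$\Lambda^i{}_0 = \left[\cosh\left(\tfrac12\Delta^*\right)\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\,z^i + \mathrm{c.c.}\right] + i\left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\epsilon^{ik\ell}z^{*k} z^{\ell}, \tag{101}$$
$$\Lambda^i{}_j = \left\{\left|\cosh\left(\tfrac12\Delta\right)\right|^2 - \left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\left(|\boldsymbol{\zeta}|^2 + |\boldsymbol{\theta}|^2\right)\right\}\delta^{ij} + \left(z^i z^{*j} + z^{*i}z^j\right)\left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2 + \left(\frac{i\sinh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right)}{\Delta}\,\epsilon^{ijk}z^k + \mathrm{c.c.}\right), \tag{102}$$
where c.c. means the complex conjugate of the previous term and $\Delta$ is defined in Equation (92). Note that since $\Delta$ is a complex quantity, $\left|\cosh\left(\tfrac12\Delta\right)\right|^2 = \cosh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right)$ and $\left|\sinh\left(\tfrac12\Delta\right)/\Delta\right|^2 = \sinh\left(\tfrac12\Delta\right)\sinh\left(\tfrac12\Delta^*\right)/|\Delta|^2$ in Equations (99)–(102).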
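The component formulae of Equations (99)–(102) can be checked against the trace formula of Equation (91). The following sketch assumes NumPy and $\Delta \neq 0$; the function names, the array `EPS` for the Levi–Civita tensor, and the parameter values are illustrative assumptions rather than anything defined in the paper.

```python
import numpy as np

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Levi-Civita tensor eps^{ijk}.
EPS = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[i, j, k] = 1.0
    EPS[i, k, j] = -1.0


def lorentz_from_trace(zeta, theta):
    """Reference value: (1/2) Tr(M^dagger sigmabar^mu M sigma_nu), M from Eq. (93)."""
    z = zeta + 1j * theta
    delta = np.sqrt(z @ z)
    M = (np.cosh(delta / 2) * I2
         - np.einsum('i,ijk->jk', z, PAULI) * np.sinh(delta / 2) / delta)
    bar_up = [I2] + [-s for s in PAULI]
    down = [I2] + [-s for s in PAULI]
    Md = M.conj().T
    return np.array([[0.5 * np.trace(Md @ bar_up[mu] @ M @ down[nu])
                      for nu in range(4)] for mu in range(4)]).real


def lorentz_explicit(zeta, theta):
    """Matrix elements written out as in Equations (99)-(102)."""
    z = zeta + 1j * theta
    delta = np.sqrt(z @ z)
    c = np.cosh(delta / 2)
    s = np.sinh(delta / 2) / delta
    norm2 = zeta @ zeta + theta @ theta
    lin = 2 * np.real(np.conj(c) * s * z)              # cosh(D*/2) sinh(D/2)/D z + c.c.
    cross = np.einsum('jkl,k,l->j', EPS, z, z.conj())  # eps^{jkl} z^k z*^l
    quad = np.real(1j * abs(s) ** 2 * cross)
    Lam = np.empty((4, 4))
    Lam[0, 0] = abs(c) ** 2 + abs(s) ** 2 * norm2
    Lam[0, 1:] = lin + quad   # Eq. (100)
    Lam[1:, 0] = lin - quad   # Eq. (101): eps^{ikl} z*^k z^l = -eps^{ikl} z^k z*^l
    Lam[1:, 1:] = ((abs(c) ** 2 - abs(s) ** 2 * norm2) * np.eye(3)
                   + 2 * abs(s) ** 2 * np.real(np.outer(z, z.conj()))
                   + 2 * np.real(1j * s * np.conj(c) * np.einsum('ijk,k->ij', EPS, z)))
    return Lam


zeta = np.array([0.3, -0.2, 0.5])
theta = np.array([0.4, 0.1, -0.7])
print(np.allclose(lorentz_explicit(zeta, theta), lorentz_from_trace(zeta, theta)))
```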
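We can check the results of Equations (99)–(102) in three special cases. First, consider the case of a pure boost, where $\boldsymbol{\theta} = \boldsymbol{0}$. Then, $\boldsymbol{z} = \boldsymbol{z}^* = \boldsymbol{\zeta}$ and $\Delta = |\boldsymbol{\zeta}| \equiv \zeta$. Plugging these values into Equations (99)–(102) yields the following block matrix form:
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{0}) = \begin{pmatrix} \cosh\zeta & \dfrac{\zeta^j}{\zeta}\sinh\zeta \\ \dfrac{\zeta^i}{\zeta}\sinh\zeta & \delta^{ij} + \dfrac{\zeta^i\zeta^j}{|\boldsymbol{\zeta}|^2}\left(\cosh\zeta - 1\right) \end{pmatrix}, \tag{103}$$
which again reproduces the result of Equation (22).
Second, consider the case of $\boldsymbol{\zeta} = \boldsymbol{0}$. Then, $\boldsymbol{z} = -\boldsymbol{z}^* = i\boldsymbol{\theta}$ and $\Delta = i\theta$. Plugging these values into Equations (99)–(102) and writing $\theta^i = \theta n^i$ yields
$$\Lambda(\boldsymbol{0},\boldsymbol{\theta}) = \begin{pmatrix} 1 & 0^j \\ 0^i & \delta^{ij}\cos\theta + n^i n^j\left(1 - \cos\theta\right) - \epsilon^{ijk}n^k\sin\theta \end{pmatrix}. \tag{104}$$
Once again, we have recovered Equation (26) and Rodrigues’ rotation formula [Equation (27)].
Third, one can check that Equations (99)–(102) reduce to the most general orthochronous Lorentz transformation in 2 + 1 spacetime dimensions for $i, j \in \{1,2\}$ if we take $z^1 = \zeta^1$, $z^2 = \zeta^2$, and $z^3 = i\theta$, which implies that $\Delta = \left(\boldsymbol{z}\cdot\boldsymbol{z}\right)^{1/2} = \left(|\boldsymbol{\zeta}|^2 - \theta^2\right)^{1/2}$. The resulting formulae reproduce the expressions obtained in Equations (79) and (80).
Finally, it is instructive to consider the case of an infinitesimal Lorentz transformation. Working to linear order in $\boldsymbol{\zeta}$ and $\boldsymbol{\theta}$, note that $\Delta \simeq 0$ in light of Equation (92), so that $\cosh\left(\tfrac12\Delta\right) \simeq 1$ and $\sinh\left(\tfrac12\Delta\right)/\Delta \simeq \tfrac12$. Hence, Equations (99)–(102) reduce to the following result given in block matrix form:
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta}) \simeq \begin{pmatrix} 1 & \zeta^j \\ \zeta^i & \delta^{ij} - \epsilon^{ijk}\theta^k \end{pmatrix}, \tag{105}$$
which coincides with Equation (46).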
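The special cases above lend themselves to a quick numerical check. The sketch below evaluates the trace formula of Equation (91) for a pure boost, a pure rotation, and a small combined transformation, and compares with Equations (103)–(105). It assumes NumPy; all function names and parameter values are hypothetical, and the exponential is approximated by a truncated Taylor series.

```python
import numpy as np

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)
EPS = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[i, j, k] = 1.0
    EPS[i, k, j] = -1.0


def lorentz(zeta, theta, terms=40):
    """Lambda from Eq. (91), with M = exp(-z.sigma/2) summed as a Taylor series."""
    A = -0.5 * np.einsum('i,ijk->jk', zeta + 1j * theta, PAULI)
    M, term = I2.copy(), I2.copy()
    for n in range(1, terms):
        term = term @ A / n
        M = M + term
    bar_up = [I2] + [-s for s in PAULI]
    down = [I2] + [-s for s in PAULI]
    return np.array([[0.5 * np.trace(M.conj().T @ bar_up[mu] @ M @ down[nu])
                      for nu in range(4)] for mu in range(4)]).real


def boost_matrix(zeta):
    """Pure boost, Equation (103)."""
    mag = np.linalg.norm(zeta)
    n = zeta / mag
    Lam = np.empty((4, 4))
    Lam[0, 0] = np.cosh(mag)
    Lam[0, 1:] = Lam[1:, 0] = n * np.sinh(mag)
    Lam[1:, 1:] = np.eye(3) + np.outer(n, n) * (np.cosh(mag) - 1)
    return Lam


def rotation_matrix(theta):
    """Pure rotation, Equation (104)."""
    ang = np.linalg.norm(theta)
    n = theta / ang
    Lam = np.eye(4)
    Lam[1:, 1:] = (np.cos(ang) * np.eye(3) + (1 - np.cos(ang)) * np.outer(n, n)
                   - np.sin(ang) * np.einsum('ijk,k->ij', EPS, n))
    return Lam


def first_order(zeta, theta):
    """Infinitesimal form, Equation (105)."""
    Lam = np.eye(4)
    Lam[0, 1:] = Lam[1:, 0] = zeta
    Lam[1:, 1:] -= np.einsum('ijk,k->ij', EPS, theta)
    return Lam


zeta = np.array([0.3, -0.2, 0.5])
theta = np.array([0.4, 0.1, -0.7])
zero = np.zeros(3)
print(np.allclose(lorentz(zeta, zero), boost_matrix(zeta)))
print(np.allclose(lorentz(zero, theta), rotation_matrix(theta)))
small = 1e-4
print(np.max(np.abs(lorentz(small * zeta, small * theta)
                    - first_order(small * zeta, small * theta))))  # second order in `small`
```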

6. Reconciling the Results of Section 4 and Section 5

In this section, we shall verify that the explicit expressions for $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ obtained, respectively, in Section 4 and Section 5 coincide in the general case of non-zero boost and rotation parameters.
First, it is convenient to rewrite Equations (56) and (57) as follows:
$$a^2 = \tfrac12\left[|\boldsymbol{\theta}|^2 - |\boldsymbol{\zeta}|^2 + |\Delta|^2\right], \qquad b^2 = \tfrac12\left[|\boldsymbol{\zeta}|^2 - |\boldsymbol{\theta}|^2 + |\Delta|^2\right], \tag{106}$$
where $\Delta$ is defined in Equation (92). As noted below Equation (57), $a, b \in \mathbb{R}$ but their undetermined signs have no impact on the expressions obtained for the matrix elements of $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$. Using Equation (55), we can fix the relative sign of $a$ and $b$ by choosing $ab = \boldsymbol{\theta}\cdot\boldsymbol{\zeta}$. It then follows that
$$(b + ia)^2 = b^2 - a^2 + 2iab = |\boldsymbol{\zeta}|^2 - |\boldsymbol{\theta}|^2 + 2i\,\boldsymbol{\theta}\cdot\boldsymbol{\zeta} = \Delta^2. \tag{107}$$
After taking the positive square root, the signs of $a$ and $b$ are now fixed by identifying
$$\Delta = b + ia. \tag{108}$$
One can check that Equations (99)–(102) are unchanged if $\Delta \to -\Delta$ and/or $\Delta^* \to -\Delta^*$. This reflects the fact that the expressions obtained for the matrix elements of $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ do not depend on the choice of signs for $a$ and $b$.
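The relations among $a$, $b$, and $\Delta$ can be illustrated numerically. The sketch below computes $a$ and $b$ from the boost and rotation vectors, fixes the relative sign through $ab = \boldsymbol{\theta}\cdot\boldsymbol{\zeta}$, and confirms that $(b + ia)^2 = \boldsymbol{z}\cdot\boldsymbol{z}$. It assumes NumPy; the function name `a_and_b`, the choice $b \geq 0$, and the sample values are illustrative assumptions.

```python
import numpy as np


def a_and_b(zeta, theta):
    """Real numbers a, b with a^2, b^2 as in Eq. (106) and a*b = theta.zeta (b >= 0)."""
    z = zeta + 1j * theta
    abs_delta_sq = abs(z @ z)  # |Delta|^2 = |z.z|
    a = np.sqrt(max(0.5 * (theta @ theta - zeta @ zeta + abs_delta_sq), 0.0))
    b = np.sqrt(max(0.5 * (zeta @ zeta - theta @ theta + abs_delta_sq), 0.0))
    if theta @ zeta < 0:
        a = -a  # enforce the relative sign a*b = theta.zeta
    return a, b


zeta = np.array([0.3, -0.2, 0.5])
theta = np.array([0.4, 0.1, -0.7])
a, b = a_and_b(zeta, theta)
delta = b + 1j * a
z = zeta + 1j * theta

print(np.isclose(a * b, theta @ zeta))   # relative sign convention
print(np.isclose(delta ** 2, z @ z))     # (b + i a)^2 = z.z
# Sum and difference of cosh b and cos a in terms of Delta/2.
print(np.isclose(np.cosh(b) + np.cos(a), 2 * abs(np.cosh(delta / 2)) ** 2))
print(np.isclose(np.cosh(b) - np.cos(a), 2 * abs(np.sinh(delta / 2)) ** 2))
```

The last two lines check the product-to-sum identities for hyperbolic functions of complex argument that are used in the comparison of the two derivations.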
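Thus, Equations (63)–(66) and (71) yield:
$$\Lambda^0{}_0 = \frac{1}{|\Delta|^2}\left[\left(b^2 - |\boldsymbol{\zeta}|^2\right)\cos a + \left(a^2 + |\boldsymbol{\zeta}|^2\right)\cosh b\right] = \tfrac12\left(\cosh b + \cos a\right) + \frac{|\boldsymbol{\zeta}|^2 + |\boldsymbol{\theta}|^2}{2|\Delta|^2}\left(\cosh b - \cos a\right), \tag{109}$$
after making use of Equation (106). We now employ the following two identities:
$$\cosh b + \cos a = 2\cosh\left(\frac{b + ia}{2}\right)\cosh\left(\frac{b - ia}{2}\right) = 2\left|\cosh\left(\frac{b + ia}{2}\right)\right|^2, \tag{110}$$
$$\cosh b - \cos a = 2\sinh\left(\frac{b + ia}{2}\right)\sinh\left(\frac{b - ia}{2}\right) = 2\left|\sinh\left(\frac{b + ia}{2}\right)\right|^2. \tag{111}$$
Hence, Equations (108) and (109) yield
$$\Lambda^0{}_0 = \left|\cosh\left(\tfrac12\Delta\right)\right|^2 + \left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\left(|\boldsymbol{\zeta}|^2 + |\boldsymbol{\theta}|^2\right), \tag{112}$$
in agreement with Equation (99).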
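Next, Equations (63)–(66) and (71) yield
$$\Lambda^0{}_j = \frac{1}{|\Delta|^2}\left\{\left[\frac{b^2}{a}\sin a + \frac{a^2}{b}\sinh b\right]\zeta^j + \left(\cosh b - \cos a\right)\epsilon^{jk\ell}\zeta^k\theta^\ell + \left[\frac{\sinh b}{b} - \frac{\sin a}{a}\right]\left[\left(|\boldsymbol{\zeta}|^2 - |\boldsymbol{\theta}|^2\right)\zeta^j + \left(\boldsymbol{\theta}\cdot\boldsymbol{\zeta}\right)\theta^j\right]\right\}. \tag{113}$$
Using Equation (55), it follows that $|\boldsymbol{\zeta}|^2 - |\boldsymbol{\theta}|^2 = b^2 - a^2$ and $\boldsymbol{\theta}\cdot\boldsymbol{\zeta} = ab$ [the latter with the sign conventions adopted above Equation (107)]. Inserting these results into Equation (113), we obtain
$$\Lambda^0{}_j = \frac{1}{|\Delta|^2}\left[\left(b\sinh b + a\sin a\right)\zeta^j + \left(a\sinh b - b\sin a\right)\theta^j + \left(\cosh b - \cos a\right)\epsilon^{jk\ell}\zeta^k\theta^\ell\right]. \tag{114}$$
We can rewrite Equation (114) with the help of some identities. It is straightforward to show that
$$b\sinh b + a\sin a = \Delta^*\sinh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right) + \mathrm{c.c.}, \tag{115}$$
$$a\sinh b - b\sin a = i\Delta^*\sinh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right) + \mathrm{c.c.}, \tag{116}$$
$$\left(\cosh b - \cos a\right)\epsilon^{jk\ell}\zeta^k\theta^\ell = i\left|\sinh\left(\tfrac12\Delta\right)\right|^2\epsilon^{jk\ell}z^k z^{*\ell}. \tag{117}$$
Collecting the results obtained above, we end up with
$$\Lambda^0{}_j = \left[\frac{\sinh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right)}{\Delta}\left(\zeta^j + i\theta^j\right) + \mathrm{c.c.}\right] + i\left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\epsilon^{jk\ell}z^k z^{*\ell}, \tag{118}$$
in agreement with Equation (100).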
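The computation of $\Lambda^i{}_0$ is nearly identical. The only change is due to the change in the sign multiplying the term proportional to the Levi–Civita tensor. Consequently, it is convenient to replace Equation (117) with an equivalent form:
$$\left(\cosh b - \cos a\right)\epsilon^{ik\ell}\zeta^k\theta^\ell = -i\left|\sinh\left(\tfrac12\Delta\right)\right|^2\epsilon^{ik\ell}z^{*k}z^{\ell}. \tag{119}$$
Hence, we end up with
$$\Lambda^i{}_0 = \frac{1}{|\Delta|^2}\left[\left(b\sinh b + a\sin a\right)\zeta^i + \left(a\sinh b - b\sin a\right)\theta^i - \left(\cosh b - \cos a\right)\epsilon^{ik\ell}\zeta^k\theta^\ell\right] = \left[\frac{\sinh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right)}{\Delta}\left(\zeta^i + i\theta^i\right) + \mathrm{c.c.}\right] + i\left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\epsilon^{ik\ell}z^{*k}z^{\ell}, \tag{120}$$
in agreement with Equation (101).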
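Finally, we use Equations (63)–(66) and (71) to obtain
$$\Lambda^i{}_j = \frac{1}{|\Delta|^2}\left\{\left(b^2\cos a + a^2\cosh b\right)\delta^{ij} - \left[\frac{b^2}{a}\sin a + \frac{a^2}{b}\sinh b\right]\epsilon^{ijk}\theta^k + \left(\cosh b - \cos a\right)\left(\zeta^i\zeta^j + \theta^i\theta^j - \delta^{ij}|\boldsymbol{\theta}|^2\right) + \left[\frac{\sinh b}{b} - \frac{\sin a}{a}\right]\epsilon^{ijk}\left[\left(\boldsymbol{\theta}\cdot\boldsymbol{\zeta}\right)\zeta^k + \left(|\boldsymbol{\theta}|^2 - |\boldsymbol{\zeta}|^2\right)\theta^k\right]\right\}. \tag{121}$$
The following identities can be derived:
$$\frac{1}{|\Delta|^2}\left(\cosh b - \cos a\right)\left(\zeta^i\zeta^j + \theta^i\theta^j\right) = \left(z^i z^{*j} + z^{*i}z^j\right)\left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2, \tag{122}$$
$$\frac{1}{|\Delta|^2}\left[\frac{\sinh b}{b} - \frac{\sin a}{a}\right]\boldsymbol{\theta}\cdot\boldsymbol{\zeta} = i\,\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\cosh\left(\tfrac12\Delta^*\right) + \mathrm{c.c.}, \tag{123}$$
$$\frac{1}{|\Delta|^2}\left[b^2\cos a + a^2\cosh b - \left(\cosh b - \cos a\right)|\boldsymbol{\theta}|^2\right] = \left|\cosh\left(\tfrac12\Delta\right)\right|^2 - \left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\left(|\boldsymbol{\zeta}|^2 + |\boldsymbol{\theta}|^2\right), \tag{124}$$
$$\frac{1}{|\Delta|^2}\left\{\left[\frac{\sinh b}{b} - \frac{\sin a}{a}\right]\left(|\boldsymbol{\theta}|^2 - |\boldsymbol{\zeta}|^2\right) - \left[\frac{b^2}{a}\sin a + \frac{a^2}{b}\sinh b\right]\right\} = -\left[\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\cosh\left(\tfrac12\Delta^*\right) + \mathrm{c.c.}\right]. \tag{125}$$
Note that the terms proportional to $\epsilon^{ijk}$ in Equation (121) combine nicely and yield
$$\frac{i\sinh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right)}{\Delta}\,\epsilon^{ijk}z^k + \mathrm{c.c.}, \tag{126}$$
after putting $z^k = \zeta^k + i\theta^k$.
Collecting the results obtained above, we end up with
$$\Lambda^i{}_j = \left\{\left|\cosh\left(\tfrac12\Delta\right)\right|^2 - \left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2\left(|\boldsymbol{\zeta}|^2 + |\boldsymbol{\theta}|^2\right)\right\}\delta^{ij} + \left(z^i z^{*j} + z^{*i}z^j\right)\left|\frac{\sinh\left(\tfrac12\Delta\right)}{\Delta}\right|^2 + \left[\frac{i\sinh\left(\tfrac12\Delta\right)\cosh\left(\tfrac12\Delta^*\right)}{\Delta}\,\epsilon^{ijk}z^k + \mathrm{c.c.}\right], \tag{127}$$
in agreement with Equation (102).
We have therefore verified by an explicit computation that the results obtained in Equations (63)–(65) are equivalent to Equations (99)–(102). In particular, we have established that
$$\Lambda^\mu{}_\nu(\boldsymbol{\zeta},\boldsymbol{\theta}) = \tfrac12\operatorname{Tr}\left(M^\dagger\bar\sigma^\mu M\sigma_\nu\right), \tag{128}$$
where $M = \exp\left\{-\tfrac12\left(\boldsymbol{\zeta} + i\boldsymbol{\theta}\right)\cdot\boldsymbol{\sigma}\right\}$.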
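The equivalence established in this section can also be probed numerically. The sketch below evaluates the matrix elements in the $(a, b)$ form of Equations (109), (114), (120), and (121) and compares them with the trace formula of Equation (128). It is a minimal illustration under stated assumptions: NumPy is available, both $a$ and $b$ are non-zero (Equation (121) divides by them), and the function names and parameter values are hypothetical.

```python
import numpy as np

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)
EPS = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[i, j, k] = 1.0
    EPS[i, k, j] = -1.0


def lorentz_from_trace(zeta, theta):
    """(1/2) Tr(M^dagger sigmabar^mu M sigma_nu) with M in closed form (Delta != 0)."""
    z = zeta + 1j * theta
    delta = np.sqrt(z @ z)
    M = (np.cosh(delta / 2) * I2
         - np.einsum('i,ijk->jk', z, PAULI) * np.sinh(delta / 2) / delta)
    bar_up = [I2] + [-s for s in PAULI]
    down = [I2] + [-s for s in PAULI]
    return np.array([[0.5 * np.trace(M.conj().T @ bar_up[mu] @ M @ down[nu])
                      for nu in range(4)] for mu in range(4)]).real


def lorentz_ab(zeta, theta):
    """Matrix elements in terms of real a, b [Eqs. (109), (114), (120), (121)]."""
    z2, t2, tz = zeta @ zeta, theta @ theta, theta @ zeta
    d2 = abs((zeta + 1j * theta) @ (zeta + 1j * theta))  # |Delta|^2 = a^2 + b^2
    a = np.sqrt(0.5 * (t2 - z2 + d2)) * (1.0 if tz >= 0 else -1.0)
    b = np.sqrt(0.5 * (z2 - t2 + d2))                    # so that a*b = theta.zeta
    ch, co, sh, si = np.cosh(b), np.cos(a), np.sinh(b), np.sin(a)
    lin = (b * sh + a * si) * zeta + (a * sh - b * si) * theta
    cross = (ch - co) * np.cross(zeta, theta)            # eps^{jkl} zeta^k theta^l
    eps_vec = (-(b * b / a * si + a * a / b * sh) * theta
               + (sh / b - si / a) * (tz * zeta + (t2 - z2) * theta))
    Lam = np.empty((4, 4))
    Lam[0, 0] = ((b * b - z2) * co + (a * a + z2) * ch) / d2
    Lam[0, 1:] = (lin + cross) / d2
    Lam[1:, 0] = (lin - cross) / d2
    Lam[1:, 1:] = ((b * b * co + a * a * ch - (ch - co) * t2) * np.eye(3)
                   + (ch - co) * (np.outer(zeta, zeta) + np.outer(theta, theta))
                   + np.einsum('ijk,k->ij', EPS, eps_vec)) / d2
    return Lam


zeta = np.array([0.3, -0.2, 0.5])
theta = np.array([0.4, 0.1, -0.7])
print(np.allclose(lorentz_ab(zeta, theta), lorentz_from_trace(zeta, theta)))
```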

7. Final Remarks

The main goal of this paper is to exhibit an explicit form for the $4\times 4$ proper orthochronous Lorentz transformation matrix as a function of general boost and rotation parameters $\boldsymbol{\zeta}$ and $\boldsymbol{\theta}$. Whereas the matrices $\Lambda(\boldsymbol{\zeta},\boldsymbol{0})$ and $\Lambda(\boldsymbol{0},\boldsymbol{\theta})$ are well known and appear in many textbooks, the explicit form for more general $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ is much less well known. Two different derivations are provided for $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$. One derivation evaluates the exponential of a real $4\times 4$ matrix $A$ that satisfies $(GA)^{\mathsf T} = -GA$ [where $G \equiv \operatorname{diag}(1,-1,-1,-1)$], and a second derivation evaluates $\tfrac12\operatorname{Tr}\left(M^\dagger\bar\sigma^\mu M\sigma_\nu\right)$, where the $2\times 2$ matrix $M = \exp\left\{-\tfrac12\left(\boldsymbol{\zeta} + i\boldsymbol{\theta}\right)\cdot\boldsymbol{\sigma}\right\}$. Although the results obtained by the two computations look somewhat different at first, we have verified by explicit calculation that these two results are actually equivalent.
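One can also obtain the most general proper orthochronous Lorentz transformation in another way by invoking the following theorem (e.g., see Section 1.5 of Ref. [1], Section 6.6 of Ref. [16], or Section 4.5 of Ref. [17]):
Every proper orthochronous Lorentz transformation $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ possesses a unique factorization into a product of a boost and a rotation in two different ways:
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta}) = \Lambda(\boldsymbol{\zeta}',\boldsymbol{0})\,\Lambda(\boldsymbol{0},\boldsymbol{\theta}') = \Lambda(\boldsymbol{0},\boldsymbol{\theta}'')\,\Lambda(\boldsymbol{\zeta}'',\boldsymbol{0}), \tag{129}$$
for an appropriate choice of parameters $\{\boldsymbol{\zeta}',\boldsymbol{\theta}'\}$ and $\{\boldsymbol{\zeta}'',\boldsymbol{\theta}''\}$, respectively. Equation (129) is called the polar decomposition of $\mathrm{SO}_0(1,3)$ in Refs. [10,18,19].
In particular, if none of the parameters are zero, then $\boldsymbol{\zeta} \neq \boldsymbol{\zeta}' \neq \boldsymbol{\zeta}''$ and $\boldsymbol{\theta} \neq \boldsymbol{\theta}' \neq \boldsymbol{\theta}''$ due to the fact that boosts and rotations do not commute [as a consequence of the commutation relations given in Equation (45)]. Indeed, for non-vanishing boost and rotation parameters,
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta}) = \exp\left(-i\boldsymbol{\theta}\cdot\boldsymbol{s} - i\boldsymbol{\zeta}\cdot\boldsymbol{k}\right) \neq \exp\left(-i\boldsymbol{\theta}\cdot\boldsymbol{s}\right)\exp\left(-i\boldsymbol{\zeta}\cdot\boldsymbol{k}\right) \neq \exp\left(-i\boldsymbol{\zeta}\cdot\boldsymbol{k}\right)\exp\left(-i\boldsymbol{\theta}\cdot\boldsymbol{s}\right). \tag{130}$$
In contrast to Equation (130), when considering infinitesimal Lorentz transformations, the boost matrix [Equation (41)] and the rotation matrix [Equation (43)] commute at linear order, which results in Equation (46). The effects of the noncommutativity appear first at quadratic order in the boost and rotation parameters.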
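The statement that the noncommutativity first appears at quadratic order can be illustrated numerically. The sketch below compares $\Lambda(\epsilon\boldsymbol{\zeta},\epsilon\boldsymbol{\theta})$ with the product of a pure boost and a pure rotation carrying the same parameters, and checks that the mismatch scales like $\epsilon^2$. It assumes NumPy, evaluates each $\Lambda$ through the trace formula with a Taylor-series exponential, and uses hypothetical function names and parameter values (with $\boldsymbol{\zeta}$ and $\boldsymbol{\theta}$ chosen non-parallel).

```python
import numpy as np

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def lorentz(zeta, theta, terms=40):
    """Lambda = (1/2) Tr(M^dagger sigmabar^mu M sigma_nu), M = exp(-z.sigma/2)."""
    A = -0.5 * np.einsum('i,ijk->jk', zeta + 1j * theta, PAULI)
    M, term = I2.copy(), I2.copy()
    for n in range(1, terms):  # truncated Taylor series for exp(A)
        term = term @ A / n
        M = M + term
    bar_up = [I2] + [-s for s in PAULI]
    down = [I2] + [-s for s in PAULI]
    return np.array([[0.5 * np.trace(M.conj().T @ bar_up[mu] @ M @ down[nu])
                      for nu in range(4)] for mu in range(4)]).real


zeta = np.array([0.3, -0.2, 0.5])
theta = np.array([0.4, 0.1, -0.7])
zero = np.zeros(3)


def mismatch(eps):
    """Norm of Lambda(eps*zeta, eps*theta) - Lambda(eps*zeta, 0) Lambda(0, eps*theta)."""
    full = lorentz(eps * zeta, eps * theta)
    product = lorentz(eps * zeta, zero) @ lorentz(zero, eps * theta)
    return np.linalg.norm(full - product)


boost_then_rot = lorentz(zeta, zero) @ lorentz(zero, theta)
rot_then_boost = lorentz(zero, theta) @ lorentz(zeta, zero)
print(mismatch(1.0))                                # clearly non-zero for finite parameters
print(np.allclose(boost_then_rot, rot_then_boost))  # the two orderings differ
print(mismatch(0.02) / mismatch(0.01))              # close to 4: quadratic scaling
```

Halving $\epsilon$ reduces the mismatch by roughly a factor of four, consistent with a leading correction that is bilinear in the boost and rotation parameters.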
Given the parameters $\{\boldsymbol{\zeta}',\boldsymbol{\theta}'\}$ (or $\{\boldsymbol{\zeta}'',\boldsymbol{\theta}''\}$), it would be quite useful to be able to obtain expressions for the corresponding parameters of $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$. The formulae that determine $\{\boldsymbol{\zeta},\boldsymbol{\theta}\}$ in Equation (129) are quite complicated [20], although they could in principle be derived by using the explicit matrix representations given in this paper. This is left as an exercise for the reader.

Funding

This research was partially supported by the U.S. Department of Energy Grant number DE-SC0010107.

Data Availability Statement

Data are contained within the article.

Acknowledgments

I am grateful to João P. Silva for discussions in which he challenged me to provide an explicit proof of Equation (128) and for his encouragements during the writeup of this work.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A. Rodrigues’ Rotation Formula

A proper rotation matrix $R(\hat{\boldsymbol{n}},\theta)$ [which satisfies $RR^{\mathsf T} = I_3$ and $\det R = 1$] represents an active transformation consisting of a counterclockwise rotation by an angle $\theta$ about an axis $\hat{\boldsymbol{n}}$ with respect to a fixed Cartesian coordinate system. For example, the matrix representation of the counterclockwise rotation by an angle $\theta$ about the z-axis is given by
$$R(\hat{\boldsymbol{z}},\theta) \equiv \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{A1}$$
The matrix elements of $R(\hat{\boldsymbol{n}},\theta)$ will be denoted by $R_{ij}$, where the indices of the tensors in this Appendix are written in the lowered position to simplify the typography of the presentation. The goal of this Appendix is to provide a simple derivation of Rodrigues’ formula for an active (counterclockwise) rotation by an angle $\theta$ about an axis that points along the unit vector $\hat{\boldsymbol{n}} = (n_1, n_2, n_3)$. Note that since $\hat{\boldsymbol{n}}$ is a unit vector, it follows that
$$n_1^2 + n_2^2 + n_3^2 = 1. \tag{A2}$$
The traditional approach to deriving Rodrigues’ rotation formula involves the computation of the exponential of an arbitrary $3\times 3$ real antisymmetric matrix (e.g., see Refs. [9,10]). Below, we provide an alternative derivation of the formula for $R_{ij}$ that makes use of the techniques of tensor algebra.
Consider how $R_{ij}$ changes under an orthogonal change of basis, which can be viewed as an orthogonal transformation of the coordinate axes. Using the well-known results derived in any textbook on matrices and linear algebra, one can check that the transformation of the components of $R_{ij}$ under a change of basis corresponds to the transformation law of a second-rank Cartesian tensor. Likewise, the $n_i$ are components of a vector (equivalently, a first-rank tensor). Two other important quantities of the analysis are the invariant tensors $\delta_{ij}$ (the Kronecker delta) and $\epsilon_{ijk}$ (the Levi–Civita tensor). If we invoke the covariance of Cartesian tensor equations, then one must be able to express $R_{ij}$ in terms of a second-rank tensor composed of $n_i$, $\delta_{ij}$ and $\epsilon_{ijk}$, as there are no other tensors in the problem that could provide a source of indices. Thus, the form of the formula for $R_{ij}$ must be
$$R_{ij} = a\,\delta_{ij} + b\,n_i n_j + c\,\epsilon_{ijk}n_k, \tag{A3}$$
where there is an implicit sum over the repeated index $k$ in the last term of Equation (A3). The numbers $a$, $b$ and $c$ are real scalar quantities. As such, $a$, $b$ and $c$ are functions of $\theta$, since the rotation angle is the only scalar variable in this problem.
We now determine the conditions that are satisfied by $a$, $b$ and $c$. The first condition is obtained by noting that
$$R(\hat{\boldsymbol{n}},\theta)\,\hat{\boldsymbol{n}} = \hat{\boldsymbol{n}}. \tag{A4}$$
This is clearly true, since $R(\hat{\boldsymbol{n}},\theta)$, when acting on a vector, rotates the vector around the axis $\hat{\boldsymbol{n}}$, whereas any vector parallel to the axis of rotation is invariant under the action of $R(\hat{\boldsymbol{n}},\theta)$. In terms of components,
$$R_{ij}n_j = n_i. \tag{A5}$$
To determine the consequence of this equation, we insert Equation (A3) into Equation (A5). In light of Equation (A2), it follows immediately that $n_i(a + b) = n_i$. Hence,
$$a + b = 1. \tag{A6}$$
Since the formula for $R_{ij}$ given by Equation (A3) must be completely general, it must hold for any special case. In particular, consider the case where $\hat{\boldsymbol{n}} = \hat{\boldsymbol{z}}$. In this case, Equations (A1) and (A3) yield
$$R(\hat{\boldsymbol{z}},\theta)_{11} = \cos\theta = a, \qquad R(\hat{\boldsymbol{z}},\theta)_{12} = -\sin\theta = c, \tag{A7}$$
after using $n_3 = \epsilon_{123} = 1$. Consequently, Equations (A6) and (A7) yield
$$a = \cos\theta, \qquad b = 1 - \cos\theta, \qquad c = -\sin\theta. \tag{A8}$$
Inserting these results into Equation (A3), we obtain Rodrigues’ rotation formula:
$$R_{ij}(\hat{\boldsymbol{n}},\theta) = \cos\theta\,\delta_{ij} + (1 - \cos\theta)\,n_i n_j - \sin\theta\,\epsilon_{ijk}n_k. \tag{A9}$$
Note that
$$R(\hat{\boldsymbol{n}},\theta + 2\pi k) = R(\hat{\boldsymbol{n}},\theta), \qquad k = 0, \pm 1, \pm 2, \ldots, \tag{A10}$$
$$\left[R(\hat{\boldsymbol{n}},\theta)\right]^{-1} = R(\hat{\boldsymbol{n}},-\theta) = R(-\hat{\boldsymbol{n}},\theta). \tag{A11}$$
Combining these two results, it follows that
$$R(\hat{\boldsymbol{n}},2\pi - \theta) = R(-\hat{\boldsymbol{n}},\theta), \tag{A12}$$
which implies that any three-dimensional proper rotation can be described by a counterclockwise rotation by an angle $\theta$ about some axis $\hat{\boldsymbol{n}}$, where $0 \leq \theta \leq \pi$.
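Rodrigues' rotation formula and the properties listed above are easy to exercise numerically. The following sketch implements Equation (A9) and checks Equations (A1), (A4), (A11), and (A12) for sample values. It assumes NumPy; the function name `rodrigues`, the array `EPS`, and the chosen axis and angle are illustrative.

```python
import numpy as np

# Levi-Civita tensor eps_{ijk}.
EPS = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[i, j, k] = 1.0
    EPS[i, k, j] = -1.0


def rodrigues(axis, angle):
    """R_ij = cos(t) delta_ij + (1 - cos(t)) n_i n_j - sin(t) eps_ijk n_k [Eq. (A9)]."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)  # normalize so that Eq. (A2) holds
    return (np.cos(angle) * np.eye(3)
            + (1 - np.cos(angle)) * np.outer(n, n)
            - np.sin(angle) * np.einsum('ijk,k->ij', EPS, n))


angle = 0.8
axis = np.array([1.0, -2.0, 0.5])
n_hat = axis / np.linalg.norm(axis)
R = rodrigues(axis, angle)

# Rotation about the z-axis reproduces Eq. (A1).
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle), np.cos(angle), 0.0],
               [0.0, 0.0, 1.0]])
print(np.allclose(rodrigues([0, 0, 1], angle), Rz))
print(np.allclose(R @ R.T, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))  # proper rotation
print(np.allclose(R @ n_hat, n_hat))                                       # Eq. (A4)
print(np.allclose(np.linalg.inv(R), rodrigues(axis, -angle)))              # Eq. (A11)
print(np.allclose(rodrigues(axis, 2 * np.pi - angle), rodrigues(-axis, angle)))  # Eq. (A12)
```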

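Appendix B. $\eta^\dagger\bar\sigma^\mu\chi$ Transforms as a Lorentz Four-Vector

Equation (89) asserts that the spinor product $\eta^\dagger\bar\sigma^\mu\chi$ transforms as a Lorentz four-vector. In light of Equation (88), it follows that Equation (90) must be satisfied (and vice versa). In this Appendix, we shall establish Equation (90) by demonstrating that both sides of this identity agree to first order in $\boldsymbol{\zeta}$ and $\boldsymbol{\theta}$.
In addition to the $\sigma^{\mu\nu}$ defined in Equation (85), it is convenient to introduce the set of $2\times 2$ matrices,
$$\bar\sigma^{\mu\nu} = -\bar\sigma^{\nu\mu} \equiv \tfrac14 i\left(\bar\sigma^\mu\sigma^\nu - \bar\sigma^\nu\sigma^\mu\right). \tag{A13}$$
Then, using the properties of the Pauli matrices, Equations (81) and (86) yield
$$M^\dagger = \exp\left(\tfrac12 i\,\theta_{\rho\lambda}\bar\sigma^{\rho\lambda}\right) = \exp\left(\tfrac12 i\,\boldsymbol{\theta}\cdot\boldsymbol{\sigma} - \tfrac12\boldsymbol{\zeta}\cdot\boldsymbol{\sigma}\right). \tag{A14}$$
Working to first order in the parameters $\theta_{\rho\lambda}$ and making use of Equations (50), (53), (86), and (A14),
$$\Lambda^\mu{}_\nu \simeq \delta^\mu_\nu + \tfrac12\left(\theta_{\lambda\nu}\eta^{\lambda\mu} - \theta_{\nu\rho}\eta^{\rho\mu}\right), \tag{A15}$$
$$M \simeq I_2 - \tfrac12 i\,\theta_{\rho\lambda}\sigma^{\rho\lambda}, \tag{A16}$$
$$M^\dagger \simeq I_2 + \tfrac12 i\,\theta_{\rho\lambda}\bar\sigma^{\rho\lambda}. \tag{A17}$$
It then follows that
$$M^\dagger\bar\sigma^\mu M \simeq \left(I_2 + \tfrac12 i\,\theta_{\rho\lambda}\bar\sigma^{\rho\lambda}\right)\bar\sigma^\mu\left(I_2 - \tfrac12 i\,\theta_{\rho\lambda}\sigma^{\rho\lambda}\right) \simeq \bar\sigma^\mu + \tfrac12 i\,\theta_{\rho\lambda}\left(\bar\sigma^{\rho\lambda}\bar\sigma^\mu - \bar\sigma^\mu\sigma^{\rho\lambda}\right). \tag{A18}$$
One can easily derive the following identity [11,12]:
$$\bar\sigma^{\rho\lambda}\bar\sigma^\mu - \bar\sigma^\mu\sigma^{\rho\lambda} = i\left(\eta^{\lambda\mu}\bar\sigma^\rho - \eta^{\rho\mu}\bar\sigma^\lambda\right). \tag{A19}$$
Hence, Equation (A18) yields
$$\begin{aligned} M^\dagger\bar\sigma^\mu M &\simeq \bar\sigma^\mu - \tfrac12\theta_{\rho\lambda}\left(\eta^{\lambda\mu}\bar\sigma^\rho - \eta^{\rho\mu}\bar\sigma^\lambda\right) \\ &\simeq \left[\delta^\mu_\nu - \tfrac12\theta_{\rho\lambda}\left(\eta^{\lambda\mu}\delta^\rho_\nu - \eta^{\rho\mu}\delta^\lambda_\nu\right)\right]\bar\sigma^\nu \\ &\simeq \left[\delta^\mu_\nu - \tfrac12\left(\theta_{\nu\lambda}\eta^{\lambda\mu} - \theta_{\rho\nu}\eta^{\rho\mu}\right)\right]\bar\sigma^\nu \\ &\simeq \left[\delta^\mu_\nu + \tfrac12\left(\theta_{\lambda\nu}\eta^{\lambda\mu} - \theta_{\nu\rho}\eta^{\rho\mu}\right)\right]\bar\sigma^\nu = \Lambda^\mu{}_\nu\,\bar\sigma^\nu, \end{aligned} \tag{A20}$$
after using the antisymmetry of $\theta_{\nu\lambda}$ in the penultimate step above. After employing Equation (A15) in the final step above, we conclude that
$$M^\dagger\bar\sigma^\mu M = \Lambda^\mu{}_\nu\,\bar\sigma^\nu, \tag{A21}$$
thereby confirming Equation (90). In particular, it follows that $\eta^\dagger\bar\sigma^\mu\chi$ transforms as a Lorentz four-vector in light of Equations (88) and (89), as previously noted. Equation (A21) is a statement of the well-known isomorphism $\mathrm{SO}_0(1,3) \cong \mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2$, since the $\mathrm{SL}(2,\mathbb{C})$ matrices $M$ and $-M$ correspond to the same Lorentz transformation $\Lambda$.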
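The identity of Equation (A19) and the finite form of Equation (A21) can both be checked by brute force. The sketch below loops over all index values for Equation (A19) and then verifies that $M^\dagger\bar\sigma^\mu M$ is the stated real linear combination of the $\bar\sigma^\nu$ for one sample $M$. It assumes NumPy, uses a truncated Taylor series for the exponential, and all names and sample values are hypothetical.

```python
import numpy as np
from itertools import product

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)
ETA = np.diag([1.0, -1.0, -1.0, -1.0])
S_UP = [I2] + list(PAULI)              # sigma^mu    = (I2; sigma)
SBAR_UP = [I2] + [-s for s in PAULI]   # sigmabar^mu = (I2; -sigma)
S_DOWN = [I2] + [-s for s in PAULI]    # sigma_mu    = (I2; -sigma)


def sig(m, n):
    """sigma^{mu nu} of Equation (85)."""
    return 0.25j * (S_UP[m] @ SBAR_UP[n] - S_UP[n] @ SBAR_UP[m])


def sigbar(m, n):
    """sigmabar^{mu nu} of Equation (A13)."""
    return 0.25j * (SBAR_UP[m] @ S_UP[n] - SBAR_UP[n] @ S_UP[m])


# Equation (A19) for every choice of rho, lambda, mu.
a19_holds = all(
    np.allclose(sigbar(r, l) @ SBAR_UP[m] - SBAR_UP[m] @ sig(r, l),
                1j * (ETA[l, m] * SBAR_UP[r] - ETA[r, m] * SBAR_UP[l]))
    for r, l, m in product(range(4), repeat=3))
print(a19_holds)

# Finite transformation: M = exp(-z.sigma/2) via a truncated Taylor series.
z = np.array([0.3, -0.2, 0.5]) + 1j * np.array([0.4, 0.1, -0.7])
A = -0.5 * np.einsum('i,ijk->jk', z, PAULI)
M, term = I2.copy(), I2.copy()
for n in range(1, 40):
    term = term @ A / n
    M = M + term

Md = M.conj().T
Lam = np.array([[0.5 * np.trace(Md @ SBAR_UP[mu] @ M @ S_DOWN[nu])
                 for nu in range(4)] for mu in range(4)]).real
a21_holds = all(
    np.allclose(Md @ SBAR_UP[mu] @ M, sum(Lam[mu, nu] * SBAR_UP[nu] for nu in range(4)))
    for mu in range(4))
print(a21_holds)
```

The second check is slightly stronger than the trace formula alone, since it confirms that the full $2\times 2$ matrix $M^\dagger\bar\sigma^\mu M$, and not just its projections, is reproduced.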
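Of course, the derivation of Equation (A21) is much simpler than a direct derivation of Equation (91), which requires the explicit evaluation of all the relevant matrix exponentials. Indeed, we can assert that having derived Equation (A21) to first order in $\theta_{\rho\lambda}$, this result must be true for arbitrary $\theta_{\rho\lambda}$. The reason that a derivation based on the infinitesimal forms of $\Lambda$, $M$ and $M^\dagger$ is sufficient is due to the strong constraints imposed by the group multiplication law of the Lorentz group near the identity element, which in light of the discussion following Equation (40) implies that a proper orthochronous Lorentz transformation can be expressed as an exponential of an element of the corresponding Lie algebra.
There is a second inequivalent two-dimensional matrix representation of $\mathrm{SL}(2,\mathbb{C})$ whose general element is represented by the matrix $(M^{-1})^\dagger$, as discussed in greater detail in Refs. [11,12]. This leads to a second identity that is similar to that of Equation (A21):
$$M^{-1}\sigma^\mu(M^{-1})^\dagger = \Lambda^\mu{}_\nu\,\sigma^\nu. \tag{A22}$$
One can derive Equation (A22) by again working to first order in the parameters $\theta_{\rho\lambda}$ and making use of Equations (A15)–(A17):
$$M^{-1}\sigma^\mu(M^{-1})^\dagger \simeq \left(I_2 + \tfrac12 i\,\theta_{\rho\lambda}\sigma^{\rho\lambda}\right)\sigma^\mu\left(I_2 - \tfrac12 i\,\theta_{\rho\lambda}\bar\sigma^{\rho\lambda}\right) \simeq \sigma^\mu + \tfrac12 i\,\theta_{\rho\lambda}\left(\sigma^{\rho\lambda}\sigma^\mu - \sigma^\mu\bar\sigma^{\rho\lambda}\right). \tag{A23}$$
In light of the identity [11,12],
$$\sigma^{\rho\lambda}\sigma^\mu - \sigma^\mu\bar\sigma^{\rho\lambda} = i\left(\eta^{\lambda\mu}\sigma^\rho - \eta^{\rho\mu}\sigma^\lambda\right), \tag{A24}$$
it follows that
$$\begin{aligned} M^{-1}\sigma^\mu(M^{-1})^\dagger &\simeq \sigma^\mu - \tfrac12\theta_{\rho\lambda}\left(\eta^{\lambda\mu}\sigma^\rho - \eta^{\rho\mu}\sigma^\lambda\right) \\ &\simeq \left[\delta^\mu_\nu - \tfrac12\theta_{\rho\lambda}\left(\eta^{\lambda\mu}\delta^\rho_\nu - \eta^{\rho\mu}\delta^\lambda_\nu\right)\right]\sigma^\nu \\ &\simeq \left[\delta^\mu_\nu - \tfrac12\left(\theta_{\nu\lambda}\eta^{\lambda\mu} - \theta_{\rho\nu}\eta^{\rho\mu}\right)\right]\sigma^\nu \\ &\simeq \left[\delta^\mu_\nu + \tfrac12\left(\theta_{\lambda\nu}\eta^{\lambda\mu} - \theta_{\nu\rho}\eta^{\rho\mu}\right)\right]\sigma^\nu = \Lambda^\mu{}_\nu\,\sigma^\nu, \end{aligned} \tag{A25}$$
which establishes Equation (A22) after employing Equation (A15) in the final step above.
Multiplying Equation (A22) on the right by $\bar\sigma_\rho$ and using $\operatorname{Tr}(\sigma^\nu\bar\sigma_\rho) = 2\delta^\nu_\rho$, it follows that
$$\Lambda^\mu{}_\nu = \tfrac12\operatorname{Tr}\left[M^{-1}\sigma^\mu(M^{-1})^\dagger\bar\sigma_\nu\right], \tag{A26}$$
which provides yet another formula for the most general orthochronous Lorentz transformation matrix. Using block matrix notation, Equation (A26) yields
$$\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta}) = \begin{pmatrix} \Lambda^0{}_0 & \Lambda^0{}_j \\ \Lambda^i{}_0 & \Lambda^i{}_j \end{pmatrix} = \frac12\begin{pmatrix} \operatorname{Tr}\left[M^{-1}(M^{-1})^\dagger\right] & \operatorname{Tr}\left[(M^{-1})^\dagger\sigma^j M^{-1}\right] \\ \operatorname{Tr}\left[M^{-1}\sigma^i(M^{-1})^\dagger\right] & \operatorname{Tr}\left[M^{-1}\sigma^i(M^{-1})^\dagger\sigma^j\right] \end{pmatrix}, \tag{A27}$$
after noting that $\bar\sigma_j = -\sigma_j = \sigma^j$ [cf. Equation (83)]. Comparing with Equation (95), we see that $M \to (M^{-1})^\dagger$ and $M^\dagger \to M^{-1}$, which results in $\boldsymbol{\theta} \to \boldsymbol{\theta}$ and $\boldsymbol{\zeta} \to -\boldsymbol{\zeta}$, or equivalently $\boldsymbol{z} \to -\boldsymbol{z}^*$ and $\Delta \to \Delta^*$. In addition, the block off-diagonal elements of $\Lambda(\boldsymbol{\zeta},\boldsymbol{\theta})$ have changed sign. Under these replacements, it is straightforward to check that the resulting expressions for $\Lambda^\mu{}_\nu$ are the same as those obtained previously in Equations (99)–(102). That is, Equation (A26) is established by explicit calculation.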
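The agreement between the two trace formulae, Equations (91) and (A26), can be confirmed numerically for a sample transformation. The sketch below assumes NumPy; the function names and parameter values are illustrative, and the exponential is approximated by a truncated Taylor series.

```python
import numpy as np

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def spinor_matrix(zeta, theta, terms=40):
    """M = exp(-(zeta + i theta).sigma/2), summed as a truncated Taylor series."""
    A = -0.5 * np.einsum('i,ijk->jk', np.asarray(zeta) + 1j * np.asarray(theta), PAULI)
    M, term = I2.copy(), I2.copy()
    for n in range(1, terms):
        term = term @ A / n
        M = M + term
    return M


def lorentz_first(M):
    """Equation (91): (1/2) Tr(M^dagger sigmabar^mu M sigma_nu)."""
    bar_up = [I2] + [-s for s in PAULI]   # sigmabar^mu = (I2; -sigma)
    down = [I2] + [-s for s in PAULI]     # sigma_mu    = (I2; -sigma)
    return np.array([[0.5 * np.trace(M.conj().T @ bar_up[mu] @ M @ down[nu])
                      for nu in range(4)] for mu in range(4)])


def lorentz_second(M):
    """Equation (A26): (1/2) Tr(M^{-1} sigma^mu (M^{-1})^dagger sigmabar_nu)."""
    up = [I2] + list(PAULI)               # sigma^mu    = (I2; sigma)
    bar_down = [I2] + list(PAULI)         # sigmabar_nu = (I2; sigma)
    Minv = np.linalg.inv(M)
    return np.array([[0.5 * np.trace(Minv @ up[mu] @ Minv.conj().T @ bar_down[nu])
                      for nu in range(4)] for mu in range(4)])


zeta = np.array([0.3, -0.2, 0.5])
theta = np.array([0.4, 0.1, -0.7])
M = spinor_matrix(zeta, theta)

print(np.allclose(lorentz_first(M), lorentz_second(M)))
# (M^{-1})^dagger is obtained from M by flipping the sign of the boost vector.
print(np.allclose(np.linalg.inv(M).conj().T, spinor_matrix(-zeta, theta)))
```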

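Appendix C. $\bar\Psi\gamma^\mu\Psi$ Transforms as a Lorentz Four-Vector

Most textbook treatments of the Dirac equation employ the more familiar four-component spinors and Dirac gamma matrices (e.g., see Ref. [21]). The relation between the two-component and four-component spinor formalisms is briefly presented in this Appendix. Further details can be found in Refs. [11,12].
One can construct four-component spinors
$$\Psi \equiv \begin{pmatrix} \chi \\ \eta \end{pmatrix}, \tag{A28}$$
in terms of a pair of two-component spinors $\chi$ and $\eta$. The Dirac gamma matrices are defined via their anticommutation relations:
$$\{\gamma^\mu, \gamma^\nu\} \equiv \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\eta^{\mu\nu}. \tag{A29}$$
In the so-called chiral representation of the gamma matrices,
$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}. \tag{A30}$$
It is convenient to introduce
$$\tfrac12\Sigma^{\mu\nu} \equiv \tfrac14 i\left[\gamma^\mu, \gamma^\nu\right] = \begin{pmatrix} \sigma^{\mu\nu} & 0 \\ 0 & \bar\sigma^{\mu\nu} \end{pmatrix}, \tag{A31}$$
where $[\gamma^\mu, \gamma^\nu] \equiv \gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu$. The Dirac adjoint spinor is defined by
$$\bar\Psi(x) \equiv \Psi^\dagger(x)\gamma^0 = \begin{pmatrix} \eta^\dagger & \chi^\dagger \end{pmatrix}. \tag{A32}$$
The matrices $\gamma^\mu$ and $\Sigma^{\mu\nu}$ satisfy
$$\gamma^0\gamma^\mu\gamma^0 = (\gamma^\mu)^\dagger, \tag{A33}$$
$$\gamma^0\Sigma^{\mu\nu}\gamma^0 = (\Sigma^{\mu\nu})^\dagger. \tag{A34}$$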
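Four-component spinors transform under an active Lorentz transformation as
$$\Psi' = \mathbb{M}\Psi, \tag{A35}$$
where
$$\mathbb{M} \equiv \begin{pmatrix} M & 0 \\ 0 & (M^{-1})^\dagger \end{pmatrix} = \exp\left(-\tfrac14 i\,\theta_{\mu\nu}\Sigma^{\mu\nu}\right) \tag{A36}$$
combines the two inequivalent two-dimensional matrix representations of $\mathrm{SL}(2,\mathbb{C})$,
$$M = \exp\left(-\tfrac12 i\,\theta_{\rho\lambda}\sigma^{\rho\lambda}\right) = \exp\left(-\tfrac12 i\,\boldsymbol{\theta}\cdot\boldsymbol{\sigma} - \tfrac12\boldsymbol{\zeta}\cdot\boldsymbol{\sigma}\right), \tag{A37}$$
$$(M^{-1})^\dagger = \exp\left(-\tfrac12 i\,\theta_{\rho\lambda}\bar\sigma^{\rho\lambda}\right) = \exp\left(-\tfrac12 i\,\boldsymbol{\theta}\cdot\boldsymbol{\sigma} + \tfrac12\boldsymbol{\zeta}\cdot\boldsymbol{\sigma}\right). \tag{A38}$$
To compute the corresponding matrix inverses, simply change the overall sign of the parameters $\theta_{\mu\nu}$. For example,
$$\mathbb{M}^{-1} = \exp\left(\tfrac14 i\,\theta_{\mu\nu}\Sigma^{\mu\nu}\right). \tag{A39}$$
In light of Equation (A34), one can easily check that the $4\times 4$ matrix $\mathbb{M}$ satisfies
$$\gamma^0\mathbb{M}\gamma^0 = (\mathbb{M}^{-1})^\dagger. \tag{A40}$$
Using Equations (A32) and (A35), it then follows that
$$\bar\Psi' \equiv \Psi'^\dagger\gamma^0 = \Psi^\dagger\mathbb{M}^\dagger\gamma^0 = \bar\Psi\gamma^0\mathbb{M}^\dagger\gamma^0. \tag{A41}$$
Finally, taking the hermitian conjugate of Equation (A40) and using Equation (A33) [which implies that $(\gamma^0)^\dagger = \gamma^0$ in light of Equation (A29)], we end up with
$$\bar\Psi' = \bar\Psi\,\mathbb{M}^{-1}, \tag{A42}$$
under an active Lorentz transformation.
It is now straightforward to verify that the identities, Equations (A21) and (A22), derived in Appendix B, are equivalent to
$$\mathbb{M}^{-1}\gamma^\mu\mathbb{M} = \Lambda^\mu{}_\nu\,\gamma^\nu, \tag{A43}$$
after employing Equations (A30) and (A36). Consequently, in light of Equations (A35), (A42) and (A43), it follows that under an active Lorentz transformation,
$$\bar\Psi\gamma^\mu\Psi \longrightarrow \bar\Psi\,\mathbb{M}^{-1}\gamma^\mu\mathbb{M}\,\Psi = \Lambda^\mu{}_\nu\,\bar\Psi\gamma^\nu\Psi. \tag{A44}$$
That is, under a Lorentz transformation, $\bar\Psi\gamma^\mu\Psi$ transforms as a four-vector. Moreover, using $\operatorname{Tr}(\gamma^\mu\gamma_\nu) = 4\delta^\mu_\nu$, Equation (A43) yields
$$\Lambda^\mu{}_\nu = \tfrac14\operatorname{Tr}\left(\mathbb{M}^{-1}\gamma^\mu\mathbb{M}\gamma_\nu\right). \tag{A45}$$
Of course, Equation (A45) is equivalent to Equations (91) and (A26) taken together.
Note that using Equation (A45) to obtain an explicit form for $\Lambda^\mu{}_\nu$ requires the evaluation of the trace of a product of four $4\times 4$ matrices. In contrast, the computation of $\Lambda^\mu{}_\nu$ presented in Section 5 is more straightforward and involves less duplication of effort.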
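The four-component relations of this Appendix can be verified numerically in the chiral representation. The sketch below builds the gamma matrices of Equation (A30) and the block-diagonal matrix of Equation (A36), and then checks Equations (A29), (A40), (A43), and (A45) against the two-component trace formula of Equation (91). It assumes NumPy; the variable names, the Taylor-series exponential, and the parameter values are illustrative choices.

```python
import numpy as np

PAULI = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
ETA = np.diag([1.0, -1.0, -1.0, -1.0])
S_UP = [I2] + list(PAULI)              # sigma^mu    = (I2; sigma)
SBAR_UP = [I2] + [-s for s in PAULI]   # sigmabar^mu = (I2; -sigma)
S_DOWN = [I2] + [-s for s in PAULI]    # sigma_mu    = (I2; -sigma)

# Chiral representation, Equation (A30), and lowered-index gamma_nu = eta_{nu mu} gamma^mu.
GAMMA = [np.block([[Z2, S_UP[mu]], [SBAR_UP[mu], Z2]]) for mu in range(4)]
GAMMA_DOWN = [ETA[mu, mu] * GAMMA[mu] for mu in range(4)]

# Two-component M = exp(-z.sigma/2) via a truncated Taylor series.
z = np.array([0.3, -0.2, 0.5]) + 1j * np.array([0.4, 0.1, -0.7])
A = -0.5 * np.einsum('i,ijk->jk', z, PAULI)
M, term = I2.copy(), I2.copy()
for n in range(1, 40):
    term = term @ A / n
    M = M + term

# Four-component transformation matrix, Equation (A36).
BIG_M = np.block([[M, Z2], [Z2, np.linalg.inv(M).conj().T]])
BIG_M_INV = np.linalg.inv(BIG_M)

lam_dirac = np.array([[0.25 * np.trace(BIG_M_INV @ GAMMA[mu] @ BIG_M @ GAMMA_DOWN[nu])
                       for nu in range(4)] for mu in range(4)])          # Eq. (A45)
lam_weyl = np.array([[0.5 * np.trace(M.conj().T @ SBAR_UP[mu] @ M @ S_DOWN[nu])
                      for nu in range(4)] for mu in range(4)])           # Eq. (91)

clifford_ok = all(
    np.allclose(GAMMA[m] @ GAMMA[n] + GAMMA[n] @ GAMMA[m], 2 * ETA[m, n] * np.eye(4))
    for m in range(4) for n in range(4))                                 # Eq. (A29)
a43_ok = all(
    np.allclose(BIG_M_INV @ GAMMA[mu] @ BIG_M,
                sum(lam_weyl[mu, nu] * GAMMA[nu] for nu in range(4)))
    for mu in range(4))                                                  # Eq. (A43)

print(clifford_ok, a43_ok)
print(np.allclose(GAMMA[0] @ BIG_M @ GAMMA[0], BIG_M_INV.conj().T))      # Eq. (A40)
print(np.allclose(lam_dirac, lam_weyl))
```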

References

  1. Sexl, R.U.; Urbantke, H.K. Relativity, Groups, Particles: Special Relativity and Relativistic Symmetry in Field and Particle Physics; Springer: Wien, Austria, 2001.
  2. Markoutsakis, M. Geometry, Symmetries, and Classical Physics—A Mosaic; CRC Press: Boca Raton, FL, USA, 2022.
  3. Zeni, J.R.; Rodrigues, W.A., Jr. The Exponential of the Generators of the Lorentz Group and the Solution of the Lorentz Force. Hadron. J. 1990, 3, 317–327.
  4. Geyer, C.M. Catadioptric Projective Geometry: Theory and Applications. Ph.D. Dissertation, University of Pennsylvania, Philadelphia, PA, USA, 2003.
  5. Dimitro, G.K.; Mladenov, I.M. A New Formula for the Exponents of the Generators of the Lorentz Group. In Proceedings of the Seventh International Conference on Geometry, Integrability and Quantization, Varna, Bulgaria, 2–10 June 2005; Mladenov, I.M., de León, M., Eds.; pp. 98–115.
  6. Andrica, D.; Rohan, R.-A. A new way to derive the Rodrigues formula for the Lorentz group. Carpathian J. Math. 2014, 30, 23–29.
  7. Carrell, J.B. Groups, Matrices, and Vector Spaces; Springer Science+Business Media, LLC: New York, NY, USA, 2017.
  8. Jackson, J.D. Classical Electrodynamics, 3rd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1999.
  9. Marsden, J.E.; Ratiu, T.S. Introduction to Mechanics and Symmetry—A Basic Exposition of Classical Mechanical Systems, 2nd ed.; Springer: New York, NY, USA, 1999.
  10. Gallier, J.; Quaintance, J. Differential Geometry and Lie Groups—A Computational Perspective; Springer Nature Switzerland AG: Cham, Switzerland, 2020.
  11. Dreiner, H.K.; Haber, H.E.; Martin, S.P. Two-component spinor techniques and Feynman rules for quantum field theory and supersymmetry. Phys. Rep. 2010, 494, 1–196.
  12. Dreiner, H.K.; Haber, H.E.; Martin, S.P. From Spinors to Supersymmetry; Cambridge University Press: Cambridge, UK, 2023.
  13. Meyer, C.D. Matrix Analysis and Applied Linear Algebra; SIAM: Philadelphia, PA, USA, 2000.
  14. Mehta, M.L. Matrix Theory—Selected Topics and Useful Results; Hindustan Publishing Corporation: New Delhi, India, 1989.
  15. Gantmacher, F.R. Theory of Matrices; Chelsea Publishing Company: New York, NY, USA, 1959; Volume I.
  16. Rao, K.N.S. The Rotation and Lorentz Groups and Their Representations for Physicists; Wiley Eastern Limited: New Delhi, India, 1988.
  17. Scheck, F. Mechanics: From Newton’s Laws to Deterministic Chaos, 6th ed.; Springer: Berlin, Germany, 2018.
  18. Moretti, V. The interplay of the polar decomposition theorem and the Lorentz group. Lect. Notes Semin. Interdiscip. Mat. 2006, 5, 153–171.
  19. Urbantke, H.K. Elementary Proof of Moretti’s Polar Decomposition Theorem for Lorentz Transformations. arXiv 2002, arXiv:math-ph/0211077.
  20. Karplyuk, K.S.; Kozak, M.I.; Zhmudskyy, O.O. Factorization of the Lorentz Transformations. Ukr. J. Phys. 2023, 68, 19–24.
  21. Peskin, M.E.; Schroeder, D.V. An Introduction to Quantum Field Theory; Westview Press: Boulder, CO, USA, 1995.

Share and Cite

MDPI and ACS Style

Haber, H.E. Explicit Form for the Most General Lorentz Transformation Revisited. Symmetry 2024, 16, 1155. https://doi.org/10.3390/sym16091155
