Geometry and Motion [Extra]


Energy Comparisons Between Observers

On the main page for geometry and motion, we defined the energy-momentum vector of an object to be a vector whose length was the mass of the object, pointing along the object’s world line. Of course there are two directions along any line, so the rule was given to choose the direction that was not more than 90° away from the time axis of the observer describing the object.

Now, that’s a useful scheme when you’re committed to describing everything in a single reference frame, but if you have two observers in relative motion, there will be cases where the direction rule causes them to choose opposite directions along the world line for the object’s energy-momentum vector, so the two descriptions won’t be compatible.

In the Lorentzian universe this problem doesn’t arise, because ordinary observers in relative motion still agree on the sign of the time component of any timelike vector. Of course we can imagine time-reversed observers, but they occupy a distinct class for whom all time components are negative compared to our own. But in the Riemannian universe, if you put arrows pointing in your positive time direction on a collection of world lines for objects with a range of velocities, and I’m moving at a significant speed relative to you, I might agree with your arrows in some cases while disagreeing in others. If we each add up all the energy-momentum vectors, the total vectors we get will bear no relation to each other. Of course we’ll be using different coordinate systems, but that’s not the problem; transforming vectors from one coordinate system to another is a straightforward matter. But if we set that aside and just think of the individual energy-momentum vectors $P_i$ as coordinate-independent geometric objects, your total might be $P_1 + P_2 + P_3 + P_4$, while mine is $P_1 + P_2 - P_3 - P_4$, so they’re completely different vectors.
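The disagreement can be made concrete with a small numerical sketch. It works only in the x–t plane, and the four world-line angles, the unit masses and the 60° tilt between the two observers' time axes are all illustrative assumptions rather than values from the text. Every vector is expressed in the first observer's coordinates, so any difference between the totals comes from the direction rule and not from a change of coordinates.

```python
import math

def unit(angle):
    """Unit vector (x, t) making the given angle with the t axis."""
    return (math.sin(angle), math.cos(angle))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def oriented(direction, time_axis, mass=1.0):
    """Energy-momentum vector of length `mass` along the world line, signed
    so that it is no more than 90 degrees from the observer's time axis."""
    sign = 1.0 if dot(direction, time_axis) >= 0 else -1.0
    return (sign * mass * direction[0], sign * mass * direction[1])

def total(directions, time_axis):
    vectors = [oriented(d, time_axis) for d in directions]
    return (sum(v[0] for v in vectors), sum(v[1] for v in vectors))

# Four world lines at assumed angles from the first observer's time axis.
world_lines = [unit(math.radians(a)) for a in (-10, 20, 60, 80)]

yours = total(world_lines, unit(0.0))                # first observer
mine = total(world_lines, unit(math.radians(-60)))   # time axis tilted by -60 degrees
```

With these angles the second observer reverses the arrows on the last two world lines, so `mine` is the sum of the first two vectors minus the last two, while `yours` is the sum of all four.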

Since there really is no physical basis[*] on which to choose a particular direction along the world line, the energy-momentum vector in four-space is undetermined up to a sign. The good news is, the stress-energy tensor for an object — which encapsulates information about the density and flow of energy and momentum — doesn’t depend on choosing a direction along the world line. The matrix of coordinates of the stress-energy tensor for an object with density ρ in its rest frame is given by:

$$T^{ab} = \rho\, u^a u^b$$

where u is the object’s four-velocity. Clearly it makes no difference here if we swap the sign of u! So if one observer computes the stress-energy tensor in their coordinates, and another observer computes it in their own, the two will be related by the normal geometric rules for transforming tensors.
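As a quick check of this sign independence, here is a minimal sketch that builds the components $\rho\, u^a u^b$ from a four-velocity and from its negative. The density of 2 and the 30° tilt of the world line are assumed sample values, and `stress_energy` is a hypothetical helper name.

```python
import math

def stress_energy(rho, u):
    """Components T^{ab} = rho * u^a * u^b for a four-velocity u."""
    return [[rho * ua * ub for ub in u] for ua in u]

# Assumed example: a unit four-velocity tilted 30 degrees from the t axis towards x.
angle = math.radians(30)
u = (math.sin(angle), 0.0, 0.0, math.cos(angle))   # (x, y, z, t) components
minus_u = tuple(-c for c in u)

T_plus = stress_energy(2.0, u)
T_minus = stress_energy(2.0, minus_u)   # same components as T_plus
```

Each component contains two factors of the four-velocity, so the two sign changes cancel.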

Conservation of energy-momentum is then embodied in the fact that the stress-energy tensor (for everything, including all matter and radiation) will have a divergence of zero.

Now, having said this there’s still nothing wrong with fixing a reference frame and computing the total energy and momentum of a system in that frame. We can do physics in a relativistic Hamiltonian formalism, which we describe below, where having chosen a time coordinate we go ahead and define a positive total energy that serves as the Hamiltonian. In that formalism, the total energy and momentum will be conserved quantities. The only mistake would be to combine those conserved quantities into a four-vector, and expect that four-vector to be related by a rotation to the vector someone else obtained the same way.

[*] If the object carries an electric charge, the two directions along its world line are no longer indistinguishable. This is discussed further in the section on Riemannian electromagnetism.


Suppose we assume, for the sake of simplicity, that a four-dimensional Riemannian universe is infinite in all directions and perfectly flat. This is an idealisation, but it’s a very useful one; a similar idealisation of our own universe, known as Minkowski spacetime, underlies special relativity and most relativistic quantum physics. We will call this idealised universe Euclidean four-space, because it obeys the postulates of Euclidean geometry.

[NB: In the physics literature, the term “Euclidean” is frequently applied to the versions of physical laws produced by Wick rotation. These are not the laws applicable in the universe of Orthogonal.]

We pick an event $O$ to be the origin of our coordinate system, four mutually perpendicular directions for our $x$, $y$, $z$ and $t$ axes, and a system of units to work in. We can then specify any event in this universe with a four-tuple of real numbers, $(x, y, z, t)$. The set of all such four-tuples is known to mathematicians as $\mathbb{R}^4$.

$\mathbb{R}^4$ is a vector space, and we find the length $|v|$ of any vector $v = (v_x, v_y, v_z, v_t)$ in $\mathbb{R}^4$ with the standard dot product:

$$|v|^2 = v \cdot v = (v_x)^2 + (v_y)^2 + (v_z)^2 + (v_t)^2$$

In a general, curved universe we can’t add and subtract the coordinates of two events, but in this flat universe it makes perfect sense. The vector from an event $a = (a_x, a_y, a_z, a_t)$ to an event $b = (b_x, b_y, b_z, b_t)$ is simply $b - a = (b_x - a_x, b_y - a_y, b_z - a_z, b_t - a_t)$, and the distance between the two events is $|b - a|$.

Now suppose our Euclidean four-space is full of various objects, satisfying laws that are essentially geometric in nature (an example would be the law of conservation of energy and momentum). We can gain a deeper understanding of what such geometric laws might be by looking at the set of symmetries of the space: operations that change the location or orientation of objects, while leaving their intrinsic geometric properties — such as the distance between two events, the angle between two world lines, or the spacing between two wavefronts — unchanged.

Formally, we’ll call a function $f: \mathbb{R}^4 \to \mathbb{R}^4$ a symmetry if for any two events $a$ and $b$, the distance between $f(a)$ and $f(b)$ equals the distance between $a$ and $b$:

$$|f(b) - f(a)| = |b - a|$$

Intuitively, it’s not hard to think of three kinds of functions that would satisfy this condition, described in geometrical terms: translations, rotations and reflections.

Translations are easy to describe mathematically. If s is any vector, we can define:

$$T_s(a) = a + s$$

$T_s$ will clearly be a symmetry, because:

$$T_s(b) - T_s(a) = b - a$$
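The same check can be run numerically. This is only a sketch: the two events and the translation vector are assumed sample values, and `distance` and `translate` are hypothetical helper names.

```python
import math

def distance(a, b):
    """Euclidean distance |b - a| between two events in R^4."""
    return math.sqrt(sum((bi - ai) ** 2 for ai, bi in zip(a, b)))

def translate(s):
    """Return the translation T_s, which maps an event a to a + s."""
    return lambda a: tuple(ai + si for ai, si in zip(a, s))

a = (1.0, 2.0, 0.0, 3.0)
b = (4.0, 6.0, 0.0, 3.0)
T = translate((10.0, -5.0, 2.0, 7.0))

before = distance(a, b)         # 5.0 for these events
after = distance(T(a), T(b))    # unchanged by the translation
```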

Since translations can shift the origin of the coordinates to any other event, we will focus on those rotations and reflections that leave the origin fixed. They can always be combined with translations to give the most general symmetries.

A symmetry $R$ that leaves the origin fixed will be a linear function on $\mathbb{R}^4$, which means we can add or subtract vectors, or multiply them by a number, either before or after applying it and expect the same result:

R(v + w) = R(v) + R(w)
R(s v) = s R(v)

Associated with any linear function on $\mathbb{R}^4$ is a 4×4 matrix $R_{ij}$, with:

$$R(v) = R(v_j e_j) = R_{ij} v_j e_i$$

where the vector components $v_j$ and the matrix components $R_{ij}$ are with respect to the standard basis $\{e_x, e_y, e_z, e_t\}$, and we have used the Einstein summation convention. In other words, we can reproduce any linear function on the vector space $\mathbb{R}^4$ by means of a matrix that we use to multiply the vector’s components.

If $R$ is a symmetry that leaves the origin fixed, it will preserve the standard dot product on $\mathbb{R}^4$, so for any two vectors $v$ and $w$:

R(v) · R(w) = v · w

This is just another way of saying that R will leave the lengths of vectors and the angles between them unchanged, since these things depend on the dot product.

Suppose we set $v = e_i$ and $w = e_j$, where $e_i$ is the $i$th basis vector in the standard basis on $\mathbb{R}^4$, namely a four-tuple whose $i$th component is 1 and all other components are zero. Then we have:

$$R(e_i) \cdot R(e_j) = e_i \cdot e_j$$

Now the standard basis is orthonormal, so $e_i \cdot e_j = \delta_{ij}$ (the Kronecker delta symbol, 1 if $i = j$, and 0 if $i \neq j$), which is also true of the $i, j$ component of the 4×4 identity matrix, $I_4$:

$$R(e_i) \cdot R(e_j) = e_i \cdot e_j = (I_4)_{ij}$$

What’s more:

$$R(e_i) \cdot R(e_j) = R_{ki} R_{kj} = (R^T)_{ik} R_{kj} = (R^T R)_{ij}$$

where $R^T$ is the transpose of $R$, the matrix made by flipping $R$ so that its rows become columns. Since all the components of $R^T R$ and $I_4$ agree, the two must be equal:

$$R^T R = I_4$$

and so $R^T$ is the inverse of $R$.

This gives us a way to characterise all the linear functions that are symmetries of Euclidean four-space: the transpose of the matrix for such a function will also be its inverse. The set of all such 4×4 matrices is known as O(4), or the 4-dimensional orthogonal group.

To give some concrete examples, this matrix:

$$\begin{pmatrix}
\cos\theta & -\sin\theta & 0 & 0 \\
\sin\theta & \cos\theta & 0 & 0 \\
0 & 0 & \cos\phi & -\sin\phi \\
0 & 0 & \sin\phi & \cos\phi
\end{pmatrix}$$

rotates by an angle of θ in the xy-plane and an angle of φ in the zt-plane. If we multiply it by its transpose, we get:

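$$\begin{pmatrix}
\cos\theta & \sin\theta & 0 & 0 \\
-\sin\theta & \cos\theta & 0 & 0 \\
0 & 0 & \cos\phi & \sin\phi \\
0 & 0 & -\sin\phi & \cos\phi
\end{pmatrix}
\begin{pmatrix}
\cos\theta & -\sin\theta & 0 & 0 \\
\sin\theta & \cos\theta & 0 & 0 \\
0 & 0 & \cos\phi & -\sin\phi \\
0 & 0 & \sin\phi & \cos\phi
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}$$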
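The product of the double rotation with its transpose can also be checked numerically for particular angles. The sketch below uses plain nested lists; the angles 0.7 and 1.9 radians are arbitrary assumed values, and the helper names are hypothetical.

```python
import math

def double_rotation(theta, phi):
    """Rotation by theta in the xy-plane and by phi in the zt-plane."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    return [[ct, -st, 0.0, 0.0],
            [st,  ct, 0.0, 0.0],
            [0.0, 0.0, cp, -sp],
            [0.0, 0.0, sp,  cp]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_orthogonal(m, tol=1e-12):
    """True if the transpose of m times m equals the identity to within tol."""
    product = matmul(transpose(m), m)
    n = len(m)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

R = double_rotation(0.7, 1.9)   # arbitrary assumed angles
```

Here `is_orthogonal(R)` holds for any choice of the two angles, up to rounding, whereas a matrix that stretches one axis fails the test.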

A simpler example is this matrix:

$$\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}$$

which is a reflection that reverses the x-coordinate. If we multiply it by its transpose (which is just the same matrix), we get:

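$$\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}$$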

We can distinguish pure rotations from reflections by taking the determinant of the matrix. Rotations will have a determinant of 1, reflections a determinant of –1. The set of all 4×4 matrices whose transpose is their inverse and whose determinant is 1 is known as SO(4), or the 4-dimensional special orthogonal group.
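The determinant test is easy to try on small examples. The sketch below computes the determinant by Laplace expansion, which is perfectly adequate for a 4×4 matrix; `det` and `classify` are hypothetical helper names, and `classify` assumes its argument is already known to be orthogonal.

```python
def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def classify(m):
    """Label an orthogonal matrix by the sign of its determinant."""
    return "rotation" if det(m) > 0 else "reflection"

# A quarter turn in the xy-plane, and the reflection that reverses x.
quarter_turn = [[0.0, -1.0, 0.0, 0.0],
                [1.0,  0.0, 0.0, 0.0],
                [0.0,  0.0, 1.0, 0.0],
                [0.0,  0.0, 0.0, 1.0]]
x_reflection = [[-1.0, 0.0, 0.0, 0.0],
                [0.0,  1.0, 0.0, 0.0],
                [0.0,  0.0, 1.0, 0.0],
                [0.0,  0.0, 0.0, 1.0]]

kinds = (classify(quarter_turn), classify(x_reflection))
```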

The group of transformations we obtain by combining arbitrary elements of O(4), the orthogonal group, with translations is known as the 4-dimensional Euclidean group, E(4). This group gives all the symmetries of 4-dimensional Euclidean space. If we restrict ourselves to rotations and translations — leaving out reflections — we have SE(4), the special Euclidean group.

It can be convenient to express the general symmetries of the Euclidean group, which include translations, in terms of matrices. We can do this by adopting a convention where we stick an extra coordinate of 1 on the end of the four coordinates for an event in $\mathbb{R}^4$. In those terms, we can combine a linear operation $R$ and a translation by a vector $s$ into a single matrix:

$$\begin{pmatrix} R & s \\ 0 & 1 \end{pmatrix}$$

This is a 5×5 matrix; we’re abbreviating a whole 4×4 matrix with the symbol R, the first four components in the final column of the matrix with the symbol s, and the first four components in the final row of the matrix with the symbol 0. Multiplying vectors with an extra coordinate of 1 then gives us:

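$$\begin{pmatrix} R & s \\ 0 & 1 \end{pmatrix} \begin{pmatrix} v \\ 1 \end{pmatrix} = \begin{pmatrix} R v + s \\ 1 \end{pmatrix}$$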

We can see how two such symmetries applied in succession interact by multiplying their matrices, with the matrix for the first symmetry on the right in the matrix product:

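$$\begin{pmatrix} R_2 & s_2 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} R_1 & s_1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} R_2 R_1 & R_2 s_1 + s_2 \\ 0 & 1 \end{pmatrix}$$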
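The composition rule can be tried out with explicit 5×5 matrices. In this sketch the two linear parts (a quarter turn in the xy-plane and a reflection of x), the two translation vectors and the test event are all assumed sample values, and the helper names are hypothetical.

```python
def affine(R, s):
    """Build the 5x5 matrix [[R, s], [0, 1]] from a 4x4 matrix R and a 4-vector s."""
    return [list(R[i]) + [s[i]] for i in range(4)] + [[0.0, 0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply_symmetry(M, v):
    """Apply a 5x5 symmetry matrix to an event v, using the extra coordinate 1."""
    column = [[c] for c in v] + [[1.0]]
    return tuple(row[0] for row in matmul(M, column)[:4])

R1 = [[0.0, -1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]    # quarter turn in xy
R2 = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]    # reflection of x
s1 = (1.0, 2.0, 3.0, 4.0)
s2 = (10.0, 20.0, 30.0, 40.0)

M1, M2 = affine(R1, s1), affine(R2, s2)
combined = matmul(M2, M1)       # the first symmetry goes on the right

v = (1.0, 0.0, 0.0, 0.0)
step_by_step = apply_symmetry(M2, apply_symmetry(M1, v))
all_at_once = apply_symmetry(combined, v)
```

Both routes send the event `v` to (9, 23, 33, 44), and the final column of `combined` holds the combined translation, which is the first translation transformed by the second linear part, plus the second translation.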

Hamiltonian and Lagrangian for a Free Classical Particle

In the introductory section on the energy-momentum vector, we considered a free classical particle with velocity v, and derived formulas for its total energy E and momentum p, defined as the t and x components of the energy-momentum vector:

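$$E = \frac{m}{\sqrt{1 + v^2}} \tag{1}$$

$$p = \frac{mv}{\sqrt{1 + v^2}} \tag{2}$$

Since these satisfy $E^2 + p^2 = m^2$, we can also write the energy in terms of the momentum:

$$E = \sqrt{m^2 - p^2} \tag{3}$$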

Our aim now will be to describe the same particle in terms of Hamiltonian and Lagrangian mechanics. This won’t tell us anything new about the behaviour of the particle, which we already know simply follows a straight world line through Euclidean four-space, but it will allow us to choose the correct signs and conventions to make everything work, which will help enormously when we want to use the Hamiltonian and Lagrangian formalisms for more sophisticated purposes.

First, we assume we’ve picked a system of orthogonal coordinates, x, y, z and t. We will simplify the analysis by further assuming that the particle is moving solely in the x direction, so we have only one spatial coordinate to consider; generalising to three spatial coordinates is straightforward.

In Hamiltonian mechanics, the Hamiltonian, $H$, is an expression for the total energy of the system as a function of the $n$ generalised coordinates $q_i$, $i = 1, \ldots, n$, that represent the degrees of freedom of the system, the $n$ momenta $p_i$ conjugate to those coordinates, and time $t$. Hamilton’s equations then give us the rates of change of the momenta and the coordinates[1]:

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \tag{4}$$

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i} \tag{5}$$

For our free particle moving in the x direction, x is the sole coordinate, and there will be a momentum p conjugate to it. So at first glance, we might think we can put H(x, p, t) = E from equation (3):

$$H(x, p, t) = \sqrt{m^2 - p^2} \tag{6}$$

with the momentum conjugate to the coordinate x simply being the x component of the energy-momentum vector.

However, this doesn’t work! Using this setup, the second of Hamilton’s equations, (5), would give us:

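$$\begin{aligned}
\frac{dx}{dt} &= \frac{\partial H}{\partial p} \\
&= \frac{\partial}{\partial p} \sqrt{m^2 - p^2} \\
&= -\frac{p}{\sqrt{m^2 - p^2}} \\
&= -v \qquad \text{[Wrong!]}
\end{aligned} \tag{7}$$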

Our mistake was to assume that the momentum defined geometrically, as the x component of the energy-momentum vector, would necessarily be the same as the momentum conjugate to x in the Hamiltonian formalism.

But this is easily corrected. If we define a “Hamiltonian momentum”:

$$p_H = -p = -\frac{mv}{\sqrt{1 + v^2}} \tag{8}$$

as the momentum conjugate to the coordinate x, then everything works out nicely:

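$$H(x, p_H, t) = \sqrt{m^2 - p_H^2} \tag{9}$$

$$\begin{aligned}
\frac{dx}{dt} &= \frac{\partial H}{\partial p_H} \\
&= \frac{\partial}{\partial p_H} \sqrt{m^2 - p_H^2} \\
&= -\frac{p_H}{\sqrt{m^2 - p_H^2}} \\
&= v \qquad \text{[Right!]}
\end{aligned} \tag{10}$$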

This might seem trivial, but getting the sign correct at this point is easier than waiting until we encounter the peculiarities of Riemannian thermodynamics, where temperatures can be either positive or negative and the right choice might not have been so obvious.
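The effect of the sign choice can be seen numerically by differentiating the Hamiltonian with a finite difference. This is only a sketch: the mass 2 and velocity 0.75 are assumed sample values, and the function names are hypothetical.

```python
import math

def energy(m, v):
    return m / math.sqrt(1.0 + v * v)            # equation (1)

def geometric_momentum(m, v):
    return m * v / math.sqrt(1.0 + v * v)        # equation (2)

def hamiltonian(m, p):
    return math.sqrt(m * m - p * p)              # same form whether p or p_H is used

def dH_dp(m, p, h=1e-6):
    """Central-difference estimate of the derivative of H with respect to p."""
    return (hamiltonian(m, p + h) - hamiltonian(m, p - h)) / (2.0 * h)

m, v = 2.0, 0.75                                 # assumed sample values
p = geometric_momentum(m, v)
p_H = -p                                         # the "Hamiltonian momentum"

velocity_from_p = dH_dp(m, p)       # comes out close to -v
velocity_from_pH = dH_dp(m, p_H)    # comes out close to +v
```

Only the second choice reproduces the particle's actual velocity from Hamilton's second equation, while the value of the Hamiltonian itself is the same in both cases.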

The first of Hamilton’s equations, (4), gives us:

$$\frac{dp_H}{dt} = -\frac{\partial H}{\partial x} = 0 \tag{11}$$

which simply tells us that the momentum of the particle is constant.

Next, we will look at the Lagrangian for a free particle. The Lagrangian, L, is a quantity whose integral over time — known as the action, S — is either a maximum or a minimum under variations of the particle’s trajectory[2]. This immediately gives us a clue: if we make L dt proportional to the length of the segment of the world line for the particle over the interval of time dt, then the action S = ∫L dt will be proportional to the length of the world line. Since the shortest path between two points is a straight line, minimising the action will produce straight world lines, which is exactly what we want for a free particle.
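A minimal sketch of the geometric idea: it compares the four-space length of a straight world line with that of a world line that bends part-way, between the same two events. The events are assumed sample values given as (t, x) pairs, and `world_line_length` is a hypothetical helper name.

```python
import math

def world_line_length(events):
    """Four-space length of a piecewise-straight world line through (t, x) events."""
    return sum(math.hypot(t1 - t0, x1 - x0)
               for (t0, x0), (t1, x1) in zip(events, events[1:]))

start, end = (0.0, 0.0), (4.0, 3.0)
straight = world_line_length([start, end])               # 5.0
detour = world_line_length([start, (2.0, 3.0), end])     # bends at t = 2, and is longer
```

Because the action is a constant multiple of this length, the straight world line is an extremum of the action either way: a minimum if the constant is positive, and a maximum if it is negative.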

The Lagrangian $L$ is formally defined as a function of the system’s generalised coordinates $q_i$, the rates of change with time of those coordinates $\dot{q}_i = dq_i / dt$, and time $t$. For our system, with a single coordinate $x$ whose rate of change with time is $v$, we have:

$$L(x, v, t)\, dt = C\, ds$$

$$L(x, v, t) = C\, \frac{ds}{dt} \tag{12}$$

where s is the four-space distance along the particle’s world line, and C is a constant yet to be determined.

For a particle moving with velocity v in our chosen coordinate system, the four-space distance s along the particle’s world line comes straight from Pythagoras’s Theorem: its square is the sum of the squares of the elapsed time t, and the distance through space, x = vt, that the particle has travelled:

$$\begin{aligned}
s(t) &= \sqrt{t^2 + v^2 t^2} \\
&= t \sqrt{1 + v^2}
\end{aligned} \tag{13}$$

This, along with (12), gives us:

$$\frac{ds}{dt} = \sqrt{1 + v^2}$$

$$L(x, v, t) = C \sqrt{1 + v^2} \tag{14}$$

The Euler-Lagrange equations in general are[2]:

$$\frac{d}{dt} \left[ \frac{\partial L}{\partial \dot{q}_i} \right] = \frac{\partial L}{\partial q_i} \tag{15}$$

which for our system becomes:

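$$\begin{aligned}
\frac{d}{dt} \left[ \frac{\partial L}{\partial v} \right] &= \frac{\partial L}{\partial x} \\
\frac{d}{dt} \left[ \frac{C v}{\sqrt{1 + v^2}} \right] &= 0 \\
\frac{C}{m} \frac{dp}{dt} &= 0
\end{aligned} \tag{16}$$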

So whatever we choose for the constant C this will just tell us once again that the momentum of the particle is constant. However, we can find the correct value for C by exploiting the relationship between Hamiltonian and Lagrangian mechanics, in which the Hamiltonian conjugate momenta are defined in terms of the Lagrangian[3]:

$$p_i = \frac{\partial L}{\partial \dot{q}_i} \tag{17}$$

which for our system becomes, making use of (8) and (14):

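$$\begin{aligned}
p_H &= \frac{\partial L}{\partial v} \\
-\frac{mv}{\sqrt{1 + v^2}} &= \frac{C v}{\sqrt{1 + v^2}} \\
C &= -m
\end{aligned}$$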

and so our final expression for the Lagrangian is:

$$L(x, v, t) = -m \sqrt{1 + v^2} \tag{18}$$

As a cross-check, we can verify that we have the correct relationship between the Hamiltonian itself and the Lagrangian, which in general is[1]:

$$H = p_i \dot{q}_i - L \tag{19}$$

and which for our system becomes:

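$$\begin{aligned}
H &= p_H v - L \\
&= v \left( -\frac{mv}{\sqrt{1 + v^2}} \right) - \left( -m \sqrt{1 + v^2} \right) \\
&= m \left[ \sqrt{1 + v^2} - \frac{v^2}{\sqrt{1 + v^2}} \right] \\
&= \frac{m}{\sqrt{1 + v^2}}
\end{aligned} \tag{20}$$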

in agreement with equation (1).
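The same cross-check can be done numerically, starting from the Lagrangian alone. In this sketch the conjugate momentum is estimated by a finite difference; the mass 2 and velocity 0.75 are assumed sample values, and the function names are hypothetical.

```python
import math

def lagrangian(m, v):
    return -m * math.sqrt(1.0 + v * v)            # equation (18)

def conjugate_momentum(m, v, h=1e-6):
    """Central-difference estimate of dL/dv, as in equation (17)."""
    return (lagrangian(m, v + h) - lagrangian(m, v - h)) / (2.0 * h)

def hamiltonian_from_lagrangian(m, v):
    """H = p_H v - L, as in equation (19)."""
    return conjugate_momentum(m, v) * v - lagrangian(m, v)

m, v = 2.0, 0.75                                   # assumed sample values
p_H = conjugate_momentum(m, v)                     # close to -m v / sqrt(1 + v^2) = -1.2
H = hamiltonian_from_lagrangian(m, v)              # close to m / sqrt(1 + v^2) = 1.6
```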

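So, to use the Hamiltonian and Lagrangian formalisms with Riemannian physics, we take the Hamiltonian to be the total energy, as in equation (9); we take the momentum conjugate to a coordinate to be the opposite of the corresponding component of the energy-momentum vector, as in equation (8); and we take the Lagrangian for a free particle to be $-m$ times the rate at which four-space distance along the world line accumulates with time, as in equation (18).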


[1] Mechanics by L.D. Landau and E.M. Lifshitz, Butterworth-Heinemann, 1976. Section 40.

[2] Landau and Lifshitz, op. cit., Section 2.

[3] Landau and Lifshitz, op. cit., Section 7.

Orthogonal / Geometry and Motion [Extra] / created Wednesday, 6 April 2011
If you link to this page, please use this URL: http://www.gregegan.net/ORTHOGONAL/02/MotionExtra.html
Copyright © Greg Egan, 2011. All rights reserved.