Orthogonal

Geometry and Waves [Extra]


Representations of the Symmetry Groups

In the notes on symmetries, we identified a number of groups of symmetries of Euclidean 4-space. The largest of these groups is the 4-dimensional Euclidean group, E(4), which includes every possible transformation that preserves the distance between points. Within that group are various subgroups, including the rotation groups SO(3) and SO(4), the larger groups O(3) and O(4) that also contain reflections, and the group of translations.

The transformations in these groups can be viewed in two ways. If various processes taking place in Euclidean four-space obey the laws of physics — whether we’re talking about classical particles colliding, waves moving from place to place, or some quantum-mechanical process — then the symmetry that we demand of the laws should guarantee that applying any transformation in E(4) to the original processes yields new ones that also obey the laws. To give a simple example, if we imagine the world lines of several particles colliding in a way that obeys the law of conservation of energy-momentum, then we ought to be able to rotate, reflect and translate all those world lines together and end up with a new collision that once again obeys the conservation law.

The other way of viewing these symmetries is to think of them as transformations from one coordinate system to another. If I use a certain coordinate system and you use a different one, then so long as we’re both using rectangular coordinates — with the x, y, z and t axes all orthogonal to each other, and the same scale used on all four axes — then the way I assign coordinate four-tuples in R4 to actual, physical points in four-space will differ from the way you do, but our two sets of coordinates will be related by a symmetry transformation in E(4). If we both describe the same physical processes, but merely assign different coordinates to things, then we should both be able to apply the laws of physics to our observations and reach the same verdict on any objective matter. Either energy-momentum is conserved in a collision or it isn’t — and the two of us, using our different coordinate systems to describe the same events, should come to the same conclusion on a question like that.

In either case, understanding the symmetries of Euclidean space will help us to understand the kind of laws that can satisfy those symmetries. And in understanding the physics of our own universe, one of the most powerful tools has turned out to be a branch of mathematics known as representation theory.

Representations

Suppose we have a group, G, and a vector space, V. If we consider linear functions from V to V (these are often called “linear operators”) we can take all such functions that have an inverse and form a group, using the composition of functions as the group multiplication. This group is called GL(V), the general linear group on V. A representation of G on V is a function ρ:G→GL(V), which satisfies the property that for any g, h in G:

ρ(g) ρ(h) = ρ(gh)

In other words, the two processes: multiplying g and h together in G and then applying ρ to the product, or applying ρ to g and h separately and then composing the two linear operators ρ(g) and ρ(h), must yield the same final result in order for ρ to be a representation of G. Another way of stating this is to say that ρ is a group homomorphism from G to GL(V).

Some groups are defined as (or are equivalent to) groups of n×n matrices: for example, SO(3) and SO(4). In that case, they have an obvious representation on Rn and Cn (if the matrices are real-valued), or just on Cn (if the matrices are complex-valued), where the linear operator on the vector space is just matrix multiplication of vectors. This is known as the fundamental representation of the group.

Here’s an example of a different kind of representation. The set of real numbers, R, forms a group with addition as the group operation; we’ll write this as (R,+) as a reminder of the operation we’re using. We’ll choose the two-dimensional vector space R2, and then GL(R2) consists of all invertible linear functions from R2 to R2. We can also think of such functions as the set of 2×2 matrices with inverses. Let’s define a representation ρ1 by declaring that for any number r in the group (R,+), ρ1(r) is the matrix that rotates vectors in R2 by an angle of r:

ρ1(r) =
[ cos(r)   –sin(r) ]
[ sin(r)    cos(r) ]
(1)

It’s not hard to see that ρ1(r+s) = ρ1(r) ρ1(s), where on the left-hand side we’re using addition in (R,+) as the group operation, and on the right-hand side we’re using matrix multiplication in GL(R2). We don’t really need to carry out the matrix multiplication to know that rotation by an angle s followed by rotation by an angle r is the same as rotation by an angle r+s.
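If you want to check this numerically, here is a minimal sketch in Python with numpy (an illustrative addition, not part of the original notes; the function name rho1 is mine):

import numpy as np

def rho1(r):
    # The matrix from equation (1): rotation of R2 by the angle r
    return np.array([[np.cos(r), -np.sin(r)],
                     [np.sin(r),  np.cos(r)]])

r, s = 0.7, 1.9
# The homomorphism property: rho1(r) rho1(s) = rho1(r + s)
assert np.allclose(rho1(r) @ rho1(s), rho1(r + s))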

Reducible and Irreducible Representations

In the example we just gave of a representation of (R,+) on R2, where ρ1 was defined via equation (1), it’s pretty clear that we “use” the whole vector space for the representation. That is to say, there’s no subspace of R2 such that the vectors within it stay within it when we apply the matrices that ρ1 gives us for the various members of the group (R,+).

But now consider a different representation of (R,+) on R2:

ρ2(r) =
[ exp(r)   0       ]
[ 0        exp(2r) ]
(2)

When multiplied by one of these matrices, any vector on the x-axis of R2 will remain on the x-axis, and any vector on the y-axis will remain on the y-axis. We describe the representation ρ2 as reducible, because it really consists of these two separate one-dimensional representations “stuck together” rather than a single, truly two-dimensional representation.

In contrast to this, we describe our earlier representation, ρ1, as irreducible, because there are no subspaces of R2 that are preserved by the action of ρ1(r) for every r. [It’s worth noting that exactly the same matrices acting on pairs of complex numbers, C2, give a reducible representation; whether we’re working with real or complex numbers can make all the difference.]
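To make that bracketed remark concrete, here is a small numpy sketch (an illustrative addition, not from the original notes) showing why the same matrices become reducible over C2: the complex vectors (1, i) and (1, –i) are eigenvectors of every ρ1(r), so each of the complex lines they span is an invariant subspace.

import numpy as np

def rho1(r):
    return np.array([[np.cos(r), -np.sin(r)],
                     [np.sin(r),  np.cos(r)]])

r = 0.8
v_plus  = np.array([1.0,  1j])   # spans one invariant complex line
v_minus = np.array([1.0, -1j])   # spans the other

# Each is an eigenvector of rho1(r), so its complex span is preserved
assert np.allclose(rho1(r) @ v_plus,  np.exp(-1j * r) * v_plus)
assert np.allclose(rho1(r) @ v_minus, np.exp( 1j * r) * v_minus)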

Irreducible representations are the fundamental building blocks from which other representations can be constructed, so from a purely mathematical standpoint it’s worth focusing on them. But in physics, it turns out that each fundamental particle is associated with a different irreducible representation of the relevant symmetry group! In the notes on vector waves, when we added a condition to ensure that we properly distinguished between vector and scalar waves, what we were doing was ensuring that we picked out an irreducible representation.

Unitary Representations

Quantum mechanics describes the states of physical systems in terms of vectors in a complex vector space. The vector spaces of quantum states are always equipped with what’s known as an inner product, which is an extension of the idea of the dot product to deal with complex numbers. For example, on the vector space of pairs of complex numbers, C2, the standard inner product is:

<(a, b), (c, d)> = a*c + b*d

This is the same as the standard dot product on R2, except we take the complex conjugate of the components of the first vector. (Sometimes the inner product is defined with the complex conjugate of the second vector, rather than the first. Which one is conjugated is just a convention, and all that really matters is being consistent about the choice.)

Note that <v, v> will always give us a non-negative real number, despite the components of v being arbitrary complex numbers. We use the inner product this way to assign a length or magnitude |v| to every vector. For example:

|(a, b)|2 = <(a, b), (a, b)> = a*a + b*b = |a|2 + |b|2

where |a| and |b| refer to the absolute value of the complex numbers a and b.

If a complex vector space with an inner product is also complete, then it’s called a Hilbert space. The complex vector spaces used in quantum mechanics are always Hilbert spaces.

If two unit vectors v and w represent two different quantum states of a physical system, then the probability that the system will pass a test for being in state w when it has been prepared to be in state v is given by:

P(passing test for w | prepared as v) = |<v, w>|2

Now, just as we call a matrix orthogonal if it preserves the dot product, we call a matrix or a linear operator unitary if it preserves the complex inner product:

<U(v), U(w)> = <v, w>

How do we know when a matrix describes a unitary operator? For an orthogonal matrix, the condition is MT = M–1, that is, the transpose of the matrix equals its inverse. The condition for a matrix to be unitary is very similar, but in place of the transpose we use the Hermitian adjoint, which is the complex conjugate of the transpose. We write this as M*, and a matrix will be unitary if M* = M–1.
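As a quick illustration (not part of the original notes), here is a numpy sketch that checks the unitarity condition M* = M–1 for one sample matrix, and confirms that it preserves the complex inner product; np.vdot conjugates its first argument, matching the convention above.

import numpy as np

def inner(v, w):
    # complex inner product, conjugating the first vector
    return np.vdot(v, w)

# A sample unitary matrix: a different phase in each coordinate direction
U = np.array([[np.exp(0.3j), 0],
              [0, np.exp(-0.3j)]])

assert np.allclose(U.conj().T, np.linalg.inv(U))      # M* = M^(-1)

v = np.array([1.0 + 2.0j, 0.5j])
w = np.array([0.2, 1.0 - 1.0j])
assert np.isclose(inner(U @ v, U @ w), inner(v, w))   # inner product preserved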

A representation ρ of a group G on a Hilbert space V is called unitary if ρ(g) is a unitary linear operator on V for every g in G.

Unitary representations of symmetry groups play a special role in quantum mechanics, because by preserving the inner product they preserve probabilities. If you and I use different coordinates on four-space, then we will generally end up describing the same physical states of some quantum system with different complex vectors in the system’s Hilbert space, V. But there ought to be a representation on V of the appropriate symmetry group which lets me convert your vectors for the quantum system into mine — and if that representation is unitary, we will agree on all the inner products between vectors, and hence the probabilities they predict.

Actually, though, because the probabilities are of the form |<v, w>|2, they will still agree even if my inner product and your inner product differ by a factor whose absolute value is 1. A complex number z with |z| = 1 is known as a phase, so we could also say our inner products “differ by a phase”. The exponential of any purely imaginary number has an absolute value of 1, so phases are often written as exp(iθ) with θ a real number. Functions ρP from G to GL(V) that satisfy a relationship where the right-hand side of the usual definition of a representation is multiplied by an extra phase:

ρP(g) ρP(h) = exp(iθ(g, h)) ρP(gh)

are known as projective representations[1] of G on V.

In our own universe it’s certainly the case that some physical systems obey projective representations of the relevant symmetry groups rather than “true” ones, and we’d expect that to happen in the Riemannian universe too. However, we won’t try to give a rigorous account here of the subtleties of dealing with projective representations, because we’re largely going to side-step them by dealing with double covers of two groups for which this issue arises.

The Double Covers of SO(3) and SO(4)

In our universe, if you rotate your coordinate system continuously around some axis by an angle θ, the state vector you use to describe an electron whose spin is aligned with that axis will be multiplied by a phase of exp(iθ/2). When θ reaches 2π and your coordinates are back where they started, the electron’s state vector will have changed by a factor of exp(iπ) = –1. That’s because the electron’s state vector obeys a projective representation of the group of rotations in three dimensions, SO(3), rather than a true representation.

The Topology of SO(3) and SO(4)

[Figure: non-contractible loops in SO(3)]
[Figure: contractible loops in SO(3)]

One way to picture SO(3) is as a solid, three-dimensional ball of radius π, centred on a point O. Any point P in the ball can be taken to represent a rotation by an angle equal to the distance OP, with the axis of the rotation being the line segment OP; take the rotation to be in the direction the fingers of your right hand curl if your thumb points from O to P. With that scheme, opposite points on the surface of the ball represent identical rotations, since the effects of rotating by π in either direction around the same axis are identical. So any diameter of this ball will actually be a loop in SO(3): its two endpoints on the surface of the ball will correspond to exactly the same rotation – so they should be treated as a single point, closing the loop. But unlike a loop in ordinary space, it’s impossible to continuously shrink this one down to a point. The endpoints of the diameter can move around together, so long as they’re always opposite each other, but they can’t be brought closer together.

Loops in SO(3) can come in either of two kinds: a contractible loop that can be continuously deformed down to a point, or a non-contractible one, which can’t. Continuously increasing the angle of rotation around some axis from 0 to 2π gives a non-contractible loop, which is just a diameter in our ball-of-rotations picture — for example, the straight blue line in the upper diagram on the right. However, if you keep going and increase the angle of rotation to 4π, you get a contractible loop! The almost-straight pair of green lines in the lower diagram on the right illustrate this, though for clarity they avoid repeating any rotations; following the sequence from green to yellow to red shows this path contracting down to a point.

We say that SO(3) is not simply connected. In a simply connected topological space, any two continuous paths from point A to point B can be continuously deformed into each other. In SO(3), there are always two distinct classes of paths from A to B, where paths in one class can’t be deformed into paths from the other.

Given a projective representation of a group, we could always try to turn it into a true representation by multiplying ρP(g) by a further phase; we can do that without altering the probabilities we get from the inner product between vectors. But it turns out that we can only guarantee that the phases can be eliminated completely if the group is simply connected. For a group such as SO(3), we’re stuck with some projective representations.

A mathematically convenient tactic for dealing with this is to construct a new group SO(3)~ that consists of equivalence classes of paths from the identity in SO(3) to the various elements of the group, rather than just the elements themselves. If p is some specific path, we will write [p] for the equivalence class of all paths that share the same endpoints and can be continuously deformed into p.

If we take the set of all continuous paths from I3 to g in SO(3), but then consider two paths to be equivalent if they can be continuously deformed into each other, we end up with precisely two equivalence classes of paths from I3 to g. So for each g in SO(3) the new group SO(3)~ will contain two elements, g1 and g2, which can be thought of as two different ways of moving continuously from the identity in SO(3) to g.

How do we multiply two elements in SO(3)~? Formally, a path through SO(3) is a function p:[0,1]→SO(3), where [0,1] is the set of real numbers between 0 and 1 inclusive. If we take representative paths p1 and p2 from the two equivalence classes we wish to multiply, we define the product of [p1] and [p2] to be the equivalence class of the path given by p3(t) = p1(t) p2(t). This captures details of both the multiplication of rotations and the “multiplication of classes of paths”.

Multiplying classes of paths in SO(3) is easier than it might sound: for fixed endpoints it’s exactly like multiplication in the two-element group {1,–1}, where squaring any element gives you the identity.

For a concrete example of this, consider the path we get by rotating around the z-axis by angles that range from 0 to 2π:

p(t) =
[ cos(2π t)   –sin(2π t)   0 ]
[ sin(2π t)    cos(2π t)   0 ]
[ 0            0           1 ]

This path is a loop from the identity back to the identity (that is, p(0) = p(1) = I3), but it cannot be continuously deformed into the identity path:

e(t) =
[ 1   0   0 ]
[ 0   1   0 ]
[ 0   0   1 ]

However, [p][p] is the equivalence class [p2], where:

p2(t) =
[ cos(4π t)   –sin(4π t)   0 ]
[ sin(4π t)    cos(4π t)   0 ]
[ 0            0           1 ]

This path rotates around the z-axis by angles that range from 0 to 4π, and it can be continuously deformed into the identity path. So [p][p] = [p2] = [e].

Still, rather than dealing with SO(3)~ exactly as we’ve defined it, it would be simpler to find a matrix group that’s isomorphic to SO(3)~. For the moment we will call this as-yet-unidentified matrix group “G3” — and we will refer to both G3 and SO(3)~ itself as the double cover of SO(3). The double cover will be a simply connected group, so we will only have to deal with its true representations, but in doing so we will effectively be dealing with both the true and the projective representations of SO(3).

To show that G3 is the double cover of SO(3), we will need a 2-to-1 function f3:G3→SO(3) such that for all g1, g2 in G3:

f3(g1) f3(g2) = f3(g1g2)

This shows that multiplication in G3 captures what it means to multiply rotations in SO(3); it is an example of a group homomorphism. To show that multiplication in G3 also captures the (very simple) structure of multiplication of classes of paths, we require of the two elements of G3 that f3 maps to the identity I3 in SO(3):

f3(i1) = f3(i2) = I3

that their multiplication table is just like that of the two-element group {1,–1}. In other words, one will be the identity in G3, and the other, when squared, will give the identity.

Note that for SO(3)~ itself we can easily construct a similar function, f3~:SO(3)~→SO(3). For any path q:[0,1]→SO(3), we define f3~([q]) = q(1), and the two elements that f3~ maps to the identity in SO(3) are [p] and [e] as defined above.

The group SO(4) is also not simply connected, and has all the same issues, so we would also like to find a double cover G4 for SO(4) and an analogous homomorphism f4:G4→SO(4). The topology of SO(4) is SO(3)×S3: that is, it’s homeomorphic to the direct product of SO(3) and a 3-sphere. To see this, for a given element R in SO(4), consider the vector R et to which it maps the unit vector in the t direction. The set of all such vectors will be the unit 3-sphere in R4, since obviously there will be some rotation taking et to any unit vector in R4. If we fix on a particular unit vector u, those elements R of SO(4) that take et to u will comprise a subgroup of SO(4); all the subgroups for different choices of u will be isomorphic to each other, and for the choice u = et the subgroup will, by definition, be SO(3). This is not a rigorous proof of our claim about the topology of SO(4), but it should make it intuitively plausible.

Constructing the 2-to-1 Homomorphisms

To identify double covers for both SO(3) and SO(4), we will start with four 2×2 matrices:

Hx =
[ 0    –i ]
[ –i    0 ]

Hy =
[ 0   –1 ]
[ 1    0 ]

Hz =
[ –i   0 ]
[ 0    i ]

Ht =
[ 1   0 ]
[ 0   1 ]

These four matrices all have a determinant of 1. If we multiply any two of them together, we get back either another matrix from this set, or its opposite. The multiplication table is given below, where we take the matrix in the leftmost column and multiply it on the right with the matrix in the top row. [If the multiplication table looks familiar to you, it might be because it gives a matrix representation of the quaternions.]

        Hx      Hy      Hz      Ht
Hx     –Ht      Hz     –Hy      Hx
Hy     –Hz     –Ht      Hx      Hy
Hz      Hy     –Hx     –Ht      Hz
Ht      Hx      Hy      Hz      Ht

For any point in R4, x = (x, y, z, t), we will define H(x) as:

H(x) = xμ Hμ
     = x Hx + y Hy + z Hz + t Ht
     =
[ t – z i     –y – x i ]
[ y – x i      t + z i ]

We can think of the set of 2×2 complex matrices as an 8-dimensional real vector space; we can add such matrices and multiply them by real numbers in the obvious way, and one basis for this space would consist of the eight matrices that have either 1 or i in one of the four locations in the matrix, and zeroes elsewhere. H is then an isomorphism between R4 and a four-real-dimensional subspace of that 8-dimensional vector space; we will also call this subspace H. We can see from the multiplication table for the matrices {Hx, Hy, Hz, Ht} that H is closed under matrix multiplication: the matrix product of any two elements of H will again be in H.

The determinant of H(x) is the squared length of x:

det H(x) = (t – z i)(t + z i) – (–y – x i)(y – x i)
         = x2 + y2 + z2 + t2
         = |x|2

and the trace of H(x), the sum of its diagonal elements, is twice the t coordinate of x:

tr H(x) = (t – z i) + (t + z i)
        = 2t

The Hermitian adjoint of H(x) is also in H. Specifically, H(x)* = H((–x, –y, –z, t)):

H(x)* =
[ t + z i      y + x i ]
[ –y + x i     t – z i ]

and the product of H(x) with its Hermitian adjoint is the squared length of x times the 2×2 identity matrix:

H(x) H(x)* =
[ x2 + y2 + z2 + t2    0                  ]
[ 0                    x2 + y2 + z2 + t2 ]
= |x|2 I2
This means that when |x| = 1, H(x) will be unitary. In fact the group of all 2×2 unitary matrices with a determinant of 1, known as SU(2), is precisely the subset of H consisting of matrices H(u) for unit vectors u in R4.

As well as the norm of x, we can extract the dot product between two vectors x and y from their images in H:

x · y = ½ tr(H(x) H(y)*)

This result can be verified by checking that, for the individual H matrices (putting an upper index on the Hermitian adjoints):

½ tr(Hμ Hν*) = δμν

This also allows us to describe the inverse of our map H as:

H–1(h) = ½ tr(h Hν*) eν

where {eν} is the coordinate basis of R4.
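The relations above are easy to verify numerically. This sketch (an illustrative addition, repeating the matrix definitions so it stands alone; the helper names are mine) checks the determinant, trace, dot product and inverse-map formulas for a couple of sample vectors.

import numpy as np

Hx = np.array([[0, -1j], [-1j, 0]])
Hy = np.array([[0, -1],  [1,   0]], dtype=complex)
Hz = np.array([[-1j, 0], [0,  1j]])
Ht = np.eye(2, dtype=complex)
basis = [Hx, Hy, Hz, Ht]

def H(x):
    # x = (x, y, z, t) -> x Hx + y Hy + z Hz + t Ht
    return sum(c * M for c, M in zip(x, basis))

def H_inv(h):
    # components recovered via (1/2) tr(h Hnu*)
    return np.array([0.5 * np.trace(h @ M.conj().T) for M in basis]).real

x = np.array([0.3, -1.2, 0.5, 2.0])
y = np.array([1.0,  0.4, -0.7, 0.1])

assert np.isclose(np.linalg.det(H(x)).real, x @ x)            # det H(x) = |x|^2
assert np.isclose(np.trace(H(x)).real, 2 * x[3])              # tr H(x) = 2t
assert np.allclose(H(x) @ H(x).conj().T, (x @ x) * np.eye(2)) # H(x) H(x)* = |x|^2 I2
assert np.isclose(0.5 * np.trace(H(x) @ H(y).conj().T).real, x @ y)  # dot product formula
assert np.allclose(H_inv(H(x)), x)                            # inverse of the map H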

Suppose g and h are in SU(2). We will define the function f4:SU(2)×SU(2)→SO(4) — that is, a function from the set of pairs of elements of SU(2) to SO(4) — as follows:

f4(g, h) x = H–1(g H(x) h–1)

Because g and h–1 are in SU(2), which is a subset of H, the product g H(x) h–1 will also be in H. So f4(g, h) is a well-defined linear operator on R4. Using the relationship between det(H(x)) and |x|2, and the fact that the determinant of a product of matrices is the product of their determinants:

|f4(g, h) x|2
= det(H(f4(g, h) x))
= det(g H(x) h–1)
= det(g) det(H(x)) det(h–1)
= det(H(x))
= |x|2

So f4(g, h) maps vectors in R4 to other vectors of the same length, making it either a rotation or a reflection. In fact it always gives a rotation, i.e. an element of SO(4): SU(2)×SU(2) is connected and f4(I2, I2) = I4, so the determinant of f4(g, h), which can only be 1 or –1, can never jump away from 1.

Can f4 give us every element of SO(4)? With a few calculations, it’s not hard to show that f4(g, h) can give us rotations by any angle we want in all six coordinate planes. By composing rotations of that form, we could construct any rotation whatsoever.

g                           h                           Coordinate plane for rotation (by 2θ)
cos(θ) Ht + sin(θ) Hx       cos(θ) Ht + sin(θ) Hx       yz plane
cos(θ) Ht + sin(θ) Hy       cos(θ) Ht + sin(θ) Hy       xz plane
cos(θ) Ht + sin(θ) Hz       cos(θ) Ht + sin(θ) Hz       xy plane
cos(θ) Ht – sin(θ) Hx       cos(θ) Ht + sin(θ) Hx       xt plane
cos(θ) Ht – sin(θ) Hy       cos(θ) Ht + sin(θ) Hy       yt plane
cos(θ) Ht – sin(θ) Hz       cos(θ) Ht + sin(θ) Hz       zt plane

We turn SU(2)×SU(2) into a single group in the obvious way, by multiplying the individual elements of the pairs: (g1, h1) (g2, h2) = (g1g2, h1h2). Using this multiplication, we find:

f4((g1, h1)(g2, h2)) x
= f4(g1g2, h1h2) x
= H–1( g1 g2 H(x) h2–1 h1–1 )
= H–1( g1 H(H–1(g2 H(x) h2–1)) h1–1 )
= f4(g1, h1) f4(g2, h2) x

We also have:

f4(I2, I2) = I4
f4(–I2, –I2) = I4

showing that two elements of SU(2)×SU(2) map to the identity in SO(4), and those two elements have the same group structure as {1,–1}. This fulfils the conditions we needed in order to show that SU(2)×SU(2) is the double cover of SO(4).

We began by discussing the idea of a double cover for SO(3), not SO(4), but everything we’ve done can easily be adapted to that purpose. If we define f3:SU(2)→SO(3) as follows:

f3(g) x = f4(g, g) x = H–1(g H(x) g–1)

then f3(g) will again act as a rotation on R4. But suppose we restrict x to R3 by setting the t coordinate to 0. Because of the relationship between t and the trace of H(x), this means tr H(x) = 0, and then from the cyclic property of the trace we have:

tr H(f3(g) x) = tr(g H(x) g–1) = tr(g–1 g H(x)) = tr H(x) = 0

So the t coordinate of f3(g) x is also 0. Since all vectors that start in the three-dimensional subspace of R4 with t = 0 stay in that subspace, f3(g) gives us a rotation in three dimensions, an element of SO(3).
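Here is a matching numpy sketch for f3 (an illustrative addition, not from the original notes): restricting to the t = 0 subspace gives a 3×3 rotation matrix, and g and –g give the same rotation, as the 2-to-1 property requires.

import numpy as np

Hx = np.array([[0, -1j], [-1j, 0]])
Hy = np.array([[0, -1],  [1,   0]], dtype=complex)
Hz = np.array([[-1j, 0], [0,  1j]])
Ht = np.eye(2, dtype=complex)
basis = [Hx, Hy, Hz, Ht]

def H(x):
    return sum(c * M for c, M in zip(x, basis))

def H_inv(h):
    return np.array([0.5 * np.trace(h @ M.conj().T) for M in basis]).real

def f3(g):
    # 3x3 matrix of x -> H^(-1)(g H(x) g^(-1)), restricted to vectors with t = 0
    g_inv = np.linalg.inv(g)
    cols = [H_inv(g @ H(np.append(e, 0.0)) @ g_inv)[:3] for e in np.eye(3)]
    return np.array(cols).T

a = 1.1
g = np.cos(a) * Ht + np.sin(a) * Hz        # a unit quaternion
R3 = f3(g)

assert np.allclose(R3.T @ R3, np.eye(3))   # orthogonal
assert np.isclose(np.linalg.det(R3), 1.0)  # an element of SO(3)
assert np.allclose(f3(-g), R3)             # g and -g give the same rotation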

To sum up, we’ve established that SU(2)×SU(2), together with the 2-to-1 homomorphism f4, is the double cover of SO(4), and that SU(2) itself, via the 2-to-1 homomorphism f3(g) = f4(g, g), is the double cover of SO(3).

Representations of SU(2)

Now that we have seen the connection between SU(2) and SO(3) and SO(4), the next step is to construct irreducible unitary representations of SU(2). From these representations, we can then build both true representations and projective representations of SO(3) and SO(4).

Spin-0

The simplest irreducible unitary representation of SU(2) is one that exists for every group, the one-dimensional trivial representation:

ρ0(g) = I1

The complex numbers, C, are a 1-complex-dimensional vector space, and the linear operator we associate with every element g in SU(2) is the 1×1 identity matrix. This is known as the spin-0 representation, and it describes the way scalar quantities transform under changes of coordinates: not at all!

Spin-½

The next simplest irreducible unitary representation of SU(2) is two-dimensional:

ρ½(g) = g

In other words, we simply use the SU(2) element g to define a linear operator on C2 via matrix multiplication of a vector in the usual way. This is known as the spin-½ representation.

Spin-1

We can construct a three-dimensional irreducible unitary representation of SU(2) on C3 as follows:

ρ1(g) = f3(g)

The function f3 that we used to show SU(2) to be a double cover of SO(3) gives us an element of SO(3), but there is no reason why we can’t treat that 3×3 real-valued matrix as a linear operator on the space of triples of complex numbers, C3. This is known as the spin-1 representation.

Spin-j

Next, we’ll give a general construction for the spin-j representation for any half-integer j; this is a (2j+1)-dimensional irreducible unitary representation of SU(2).

From any two vector spaces V and W of dimension n and m, we can construct a new vector space: the space of n×m matrices. We can add matrices and multiply them by real or complex numbers in the obvious way. The new vector space is known as the tensor product of V and W, and is written V⊗W.

If V has a basis {e1, ..., en} and W has a basis {e'1, ..., e'm}, we define ei⊗e'j to be an n×m matrix filled with zeroes except for a 1 in the jth column of the ith row. The set of all ei⊗e'j then constitutes a basis for V⊗W. We then define a tensor product between any vector v = vi ei in V and any vector w = wj e'j in W, as:

v⊗w = vi wj ei⊗e'j

where we have used the Einstein Summation Convention for repeated indices. In other words, we form the matrix for v⊗w by listing the components of v vertically beside an n×m matrix and the components of w horizontally above the same matrix, and filling in the components of the matrix with products of the vector components that lie in the same row and column. Of course a general element of V⊗W won’t be of this form, but it will always be possible to write it as a linear combination of tensor products.

We can construct tensor products of any number of vector spaces, U⊗V⊗W⊗ ... in the same way, creating spaces that we can think of as arrays similar to matrices that extend in three or more dimensions. If the individual vector spaces all have inner products defined on them, we can define an inner product on the tensor product by multiplying the individual ones:

<u⊗v, w⊗x> = <u, w> <v, x>

This inner product extends to all pairs of tensors by conjugate-linearity in the first argument and linearity in the second, matching the convention we chose for the inner product itself.

Now, we can form a representation of SU(2) on the 2j-fold tensor product of C2:

ρ½⊗2j(g) v1⊗v2⊗ ... ⊗v2j
= ρ½(g) v1 ⊗ ρ½(g) v2 ⊗ ... ⊗ ρ½(g) v2j
= gv1 ⊗ gv2 ⊗ ... ⊗ gv2j

This defines ρ½⊗2j(g) on the tensor products of 2j elements of C2, and then the result for any other element of C2⊗C2⊗C2⊗... comes from linearity. Note that:

ρ½⊗2j(–g) = (–1)2j ρ½⊗2j(g)

This is a unitary representation of SU(2), but it is not irreducible for j≥1. For example, for j=1 the one-dimensional subspace of C2⊗C2 consisting of antisymmetric 2×2 matrices will be preserved by the action of ρ½⊗½(g) for any g:

Writing g as the matrix with rows (p, q) and (r, s), and noting that the antisymmetric tensor with components

[ 0    a ]
[ –a   0 ]

can be written as a (1,0) ⊗ (0,1) – a (0,1) ⊗ (1,0), we have:

ρ½⊗½(g) [a (1,0) ⊗ (0,1) – a (0,1) ⊗ (1,0)]
= a (g (1,0)) ⊗ (g (0,1)) – a (g (0,1)) ⊗ (g (1,0))
= a (p, r) ⊗ (q, s) – a (q, s) ⊗ (p, r)

Written out as matrices, this difference of tensor products is:

a
[ p q   p s ]
[ r q   r s ]
–
a
[ q p   q r ]
[ s p   s r ]

=
a
[ 0           p s – r q ]
[ r q – p s   0         ]

=
[ 0    a ]
[ –a   0 ]

In the last step we’ve used the fact that the determinant of our matrix in SU(2), p s – r q, is equal to 1.

So, the 2j-fold tensor product of the spin-½ representation on C2 doesn’t give us an irreducible representation. However, it always contains a particular irreducible representation of interest to us: one which acts on the subspace of symmetric tensors. Just as we say that a matrix is symmetric if swapping the indices on any component gives us another component with the same value, so Mij = Mji, we say that a tensor is symmetric if swapping any pair of indices on any component gives us another component with the same value; for example, T ijklm = T jiklm. But if we can swap any pair of indices, we can permute the indices any way at all, so all n! permutations of the n indices must give us a component of the same value.

An equivalent condition is that a symmetric tensor can be written as a linear combination of terms each of which is the sum, over all permutations, of the tensor product of the same set of n vectors in each permuted order. For example, if n = 3, each term would take the form:

u⊗v⊗w + v⊗w⊗u + w⊗u⊗v + w⊗v⊗u + u⊗w⊗v + v⊗u⊗w

for some triple of vectors u, v and w. The n vectors need not be distinct, and if they’re not then some tensor products will appear more than once in the sum. We can construct a basis for the symmetric tensors in the n-fold tensor product of C2 simply by varying the number of times e2 appears in the set of vectors, with the count ranging from 0 to n, giving us n+1 vectors in all. For example, if n = 3 an orthonormal basis for the symmetric tensors in C2⊗C2⊗C2 would be:

{ e1e1e1,
(e2e1e1 + e1e2e1 + e1e1e2) / √3,
(e1e2e2 + e2e1e2 + e2e2e1) / √3,
e2e2e2 }

Since the 2j-fold tensor product of the spin-½ representation, ρ½⊗2j, just applies ρ½(g) to each individual vector in a tensor product, the result of applying it to a sum of permuted tensor products like this will be another term of the same form. So ρ½⊗2j maps symmetric tensors to symmetric tensors.

We define the spin-j representation of SU(2), which we’ll call ρj, to be ρ½⊗2j restricted to the subspace of symmetric tensors. This subspace has dimension 2j+1, and we’ll refer to it as Vj.
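For j = 1 this construction is easy to carry out explicitly. The following numpy sketch (an illustrative addition; the helper names are mine) builds the matrix of the spin-1 representation on the symmetric subspace of C2⊗C2, using the orthonormal basis {e1⊗e1, (e1⊗e2 + e2⊗e1)/√2, e2⊗e2}, and checks that the result is a unitary representation for which –g acts in the same way as g.

import numpy as np

def su2(a, b):
    # A general SU(2) element [[a, b], [-conj(b), conj(a)]], normalised so |a|^2 + |b|^2 = 1
    n = np.sqrt(abs(a)**2 + abs(b)**2)
    a, b = a / n, b / n
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])

e1, e2 = np.eye(2)
def t(v, w):
    # tensor product of two vectors in C2, flattened to a length-4 vector
    return np.kron(v, w)

# Orthonormal basis of the symmetric tensors in C2 (x) C2
B = np.column_stack([t(e1, e1),
                     (t(e1, e2) + t(e2, e1)) / np.sqrt(2),
                     t(e2, e2)]).astype(complex)

def spin1(g):
    # Restrict g (x) g to the symmetric subspace: a 3x3 matrix
    return B.conj().T @ np.kron(g, g) @ B

g = su2(0.3 + 0.4j, -0.2 + 0.8j)
h = su2(1.0 + 0.5j,  0.1 - 0.3j)

U = spin1(g)
assert np.allclose(U.conj().T @ U, np.eye(3))          # unitary
assert np.allclose(spin1(g) @ spin1(h), spin1(g @ h))  # a representation
assert np.allclose(spin1(-g), spin1(g))                # bosonic: (-1)^(2j) = 1 for j = 1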

The spin-j representations of SU(2) are irreducible unitary representations, and as you’ve probably guessed from their name they are associated with fundamental particles or other quantum-mechanical systems with j units of angular momentum. If j is an integer, the representation is called bosonic. If j is half an odd integer, it’s called fermionic.

Equivalent Representations

At the start of this section we gave definitions for the spin-0, spin-½ and spin-1 representations. It’s plain that the first two agree with our general definition for spin-j, but the spin-1 definition looks completely different. We said:

ρ1(g) = f3(g)

where f3(g) is a linear operator on R3 which we define as:

f3(g) x = H–1(g H(x) g–1)

We then treat the associated real 3×3 matrix as an element of GL(C3). But our general definition of spin-j specialises, for j=1, to:

ρ1(g) v⊗w = gv ⊗ gw

where v and w are in C2. We extend the definition by linearity to any tensor in C2⊗C2, but then restrict the representation to symmetric tensors. How can these two definitions possibly be describing the same thing?

Well, they’re clearly not describing exactly the same thing, but we say that two representations are equivalent if there’s an invertible linear function between the two vector spaces on which the representations act that commutes with the representations. That is to say, a representation ρA of group G on the vector space VA is equivalent to a representation ρB of the same group on the vector space VB if there’s an invertible linear function T:VA→VB such that, for every g in G:

T ρA(g) = ρB(g) T

Another way to write this is to say that for every g in G and every v in VA:

ρA(g) v = T–1 ρB(g) T v

Whatever ρA(g) does to v in VA, if we use T to send v off to VB, let ρB(g) act on its image there, then use T–1 to bring the result back to VA, we end up with the same final vector. So T is a kind of translator we use to prove that ρA(g) and ρB(g) are doing the same thing — under different names, in different places.

With this definition, the 2-to-1 homomorphism f3:SU(2)→SO(3) and the spin-1 representation ρ1 are equivalent.

Readers with some experience in the quantum mechanics of angular momentum might like to try the following exercise:

  1. Write a general element g of SU(2) in the form H((x, y, z, t)), with the condition that x2 + y2 + z2 + t2 = 1.
  2. Compute the matrix Udc in SO(3) that f3 gives for g, with respect to the standard basis of R3.
  3. Compute the matrix U1 in V1 that ρ1 gives for g, with respect to the orthonormal basis {e1e1, (e1e2 + e2e1)/√2, e2e2}.
  4. Find a matrix T such that Udc = T–1 U1 T. Hint: think of the basis vectors in V1 as corresponding to quantum states where the spin along the z-axis is –1, 0 and 1 respectively, and the basis vectors in R3 as corresponding to quantum states where the spin along the x-axis, y-axis and z-axis respectively is 0. If you know how to construct the latter states in terms of the former for a particle with a total spin of 1, that will give you T.

Dual Representations

Suppose we have a representation ρ of the group G on some complex vector space V. We write V* to denote the dual vector space of all linear functions from V to the complex numbers C. We define the dual representation ρ* of G on V* as follows:

(ρ*(g) f) (v) = f(ρ(g–1) v) for all f in V*, v in V, g in G

Here we are defining a new function, ρ*(g) f, by specifying its value on any vector v. We need to apply ρ to v with the inverse of g in this definition in order that ρ* works out to be a representation:

(ρ*(gh) f) (v) = f(ρ((gh)–1) v)
= f(ρ(h–1g–1) v)
= f(ρ(h–1) ρ(g–1) v)
= (ρ*(h) f) (ρ(g–1) v)
= (ρ*(g) ρ*(h) f) (v)

It turns out that all the spin-j representations of SU(2) are equivalent to their own duals.

To show this, let’s start with spin-½. The n-dimensional Levi-Civita symbol, written εij...k with n subscripts, is defined by:

εij...k = +1 when {i, j, ..., k} is an even permutation of {1, 2, ..., n}
εij...k = –1 when {i, j, ..., k} is an odd permutation of {1, 2, ..., n}
εij...k = 0 otherwise

We define an isomorphism T½:C2→C2* by:

(T½(v))(w) = εij vi wj

To establish that ρ½ is equivalent to ρ½*, we need to show that for all g in SU(2) and all v in C2, the following two functions in C2* are the same:

T½(ρ½(g) v) = ρ½*(g) T½(v)

Applying these two functions to the same element w of C2, this becomes:

T½(ρ½(g) v) (w) = (ρ½*(g) T½(v)) (w)
εik gij vj wk = T½(v) (ρ½(g–1) w)
εik gij vj wk = εjm (g–1)mk vj wk

We can write the determinant of any 2×2 matrix A as:

det A = εiq Ai1 Aq2

Since g is an element of SU(2) its determinant is 1, and so:

εiq gi1 gq2 = 1

Swapping the lower indices here just changes the sign:

εiq gi2 gq1 = εqi gi1 gq2 = –εiq gi1 gq2 = –1

Repeating the lower index would give a result of zero, because εiq is antisymmetric and gim gqm is symmetric:

εiq gi1 gq1 = εiq gi2 gq2 = 0

We can sum up all of this as showing that the Levi-Civita symbol ε is invariant when transformed by g:

εiq gij gqm = εjm

If we multiply through on the right by (g–1)mk we get:

εiq gij gqm (g–1)mk = εjm (g–1)mk
εiq gij δqk = εjm (g–1)mk
εik gij = εjm (g–1)mk

Applying this to any pair of vectors v and w in C2 gives the result we need:

εik gij vj wk = εjm (g–1)mk vj wk

So we have shown that the spin-½ representation of SU(2) is equivalent to its dual, via T½.
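The key identity here, εiq gij gqm = εjm, is the statement that gT ε g = ε for any matrix g of determinant 1, with ε viewed as a matrix. Here is a two-line numpy check (an illustrative addition, not from the original notes):

import numpy as np

eps = np.array([[0.0, 1.0], [-1.0, 0.0]])   # the 2-dimensional Levi-Civita symbol

def su2(a, b):
    n = np.sqrt(abs(a)**2 + abs(b)**2)
    a, b = a / n, b / n
    return np.array([[a, b], [-np.conj(b), np.conj(a)]])

g = su2(0.6 - 0.2j, 0.7 + 0.1j)

assert np.allclose(g.T @ eps @ g, eps)                  # eps_iq g_ij g_qm = eps_jm
assert np.allclose(g.T @ eps, eps @ np.linalg.inv(g))   # equivalently, g^T eps = eps g^(-1)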

Because the spin-j representation is built up as the symmetrised tensor product of copies of spin-½, we have essentially done all the work needed to show that ρj is also equivalent to its dual. For example, for j = 1 we define T1:C2⊗C2→(C2⊗C2)* as:

(T1(v⊗w))(x⊗y) = εij vi xj εkl wk yl

This extends by linearity to all elements of C2⊗C2. The actual representation space V1 consists of the symmetric tensors in C2⊗C2, but we can simply restrict the domain of T1 to give T1:V1→V1*. The fact that each ε is invariant when transformed by any matrix g with determinant 1 is enough to show that T1 will commute with the group action:

T1(ρ1(g) v⊗w) = ρ1*(g) T1(v⊗w)

It’s possible to choose a basis {b1, b2, b3} of V1:

b1 = i (–e1e1 + e2e2)/√2
b2 = –(e1e1 + e2e2)/√2
b3 = i (e1e2 + e2e1)/√2,

which is orthonormal according to the inner product:

< bi, bj > = δij

and which is also “orthonormal” in another sense, if we use T1 to define a symmetric bilinear pairing of vectors:

(T1(bi)) (bj) = δij

So the pairing of vectors we get from T1 is just the Euclidean dot product or metric in the basis {bi}. We know that the action of ρ1 is unitary, preserving the complex inner product on V1, but ρ1 also preserves this Euclidean metric:

(T1(ρ1(g) bi)) (ρ1(g) bj) = δij

These basis elements can be related to the matrices {Hx, Hy, Hz} via a map T:V1→H such that Hi = T(bi):

T(v⊗w) = (√2) v ⊗ (wT ε)

Here we have defined a matrix ε whose components are those of the Levi-Civita symbol εij, and used matrix multiplication between the row vector wT and ε to produce a new row vector. The tensor product between a column vector and a row vector can be thought of as a matrix, and this fits nicely with the usual notion of matrix multiplication as a linear operation on column vectors. We can express a previous result:

εik gij = εjm (g–1)mk

in terms of matrix multiplication rather than index notation as:

gT ε = ε g–1

If we describe the action of the double cover homomorphism f3 on the space of matrices spanned by {Hx, Hy, Hz}, rather than on R3, we have:

f3(g) H = g H g–1

while the action of ρ1 is:

ρ1(g) v⊗w = gv ⊗ gw

We can use T to show that these two representations are equivalent:

f3(g) T(v⊗w)
= g T(v⊗w) g–1
= (√2) g v ⊗ (wT ε) g–1
= (√2) g v ⊗ (wT gT ε)
= (√2) g v ⊗ ((g w)T ε)
= T(gv ⊗ gw)
= T(ρ1(g) v⊗w)

Representations of SO(3) and SO(4)

Spin-j Representations of SO(3)

For each integer j, the bosonic spin-j representation of SU(2) gives us an irreducible, unitary true representation of SO(3), of dimension 2j+1.

We obtain this representation with some help from our 2-to-1 function f3. For any given rotation R in SO(3) there will be two elements of SU(2) that are the opposite of each other, say g and –g, such that f3(g) = f3(–g) = R. However, since j is an integer 2j is even. An even number of minus signs will cancel out in the definition of ρj, giving us:

ρj(–g) = (–1)2j ρj(g) = ρj(g).

So it makes no difference whether we choose g or –g, and the map R → ρj(g) gives us a true representation of SO(3) of dimension 2j+1.

When j is half an odd integer, though — the fermionic case — the choice of g or –g does make a difference, because we have:

ρj(–g) = (–1)2j ρj(g) = –ρj(g).

In this case we can only get a projective (2j+1)-dimensional representation of SO(3) from ρj.

Spin-(j,k) Representations of SO(4)

Something very similar happens with SO(4). First, for some choice of two half-integers j and k, we use ρj and ρk, the spin-j and spin-k representations of SU(2) on spaces Vj and Vk of dimension 2j+1 and 2k+1 respectively, to define an irreducible unitary representation ρj, k of SU(2)×SU(2) on the tensor product Vj⊗Vk:

ρj, k(g, h) v⊗w = (ρj(g) v) ⊗ (ρk(h) w)

We then extend this by linearity to arbitrary elements of Vj⊗Vk. If j+k is an integer, the representation is called bosonic. If j+k is half an odd integer, it’s called fermionic.

For any given rotation R in SO(4) there will be two pairs of elements of SU(2), say (g, h) and (–g, –h), such that f4(g, h) = f4(–g, –h) = R. If j+k is an integer — the bosonic case — then our choice between the two won’t matter, because we’ll have:

ρj, k(–g, –h) = (–1)2(j+k) ρj, k(g, h) = ρj, k(g, h)

So in that case, the map R → ρj, k(g, h) gives us a true representation of SO(4) on Vj⊗Vk, of dimension (2j+1)(2k+1).

But when j+k is half an odd integer — the fermionic case — we will have:

ρj, k(–g, –h) = (–1)2(j+k) ρj, k(g, h) = –ρj, k(g, h)

and ρj, k can only give us a projective representation of SO(4).

We noted previously that the representation of SU(2) that we get on R3 from the 2-to-1 homomorphism f3:SU(2)→SO(3) is equivalent to the three-dimensional spin-1 representation of SU(2) on V1. So what about the 2-to-1 homomorphism f4:SU(2)×SU(2)→SO(4)? That turns out to be equivalent to the four-dimensional spin-(½, ½) representation of SU(2)×SU(2) on C2⊗C2.

Quaternions and Representations of the Double Cover of SO(4)

The three most important representations of the double cover of SO(4) can be described in terms of operations on quaternions, in a very simple and natural way.[5][6] My thanks to John Baez, who explained to me how spinors can be viewed as quaternions.

We have already seen how the 2-to-1 homomorphism f4:SU(2)×SU(2)→SO(4) can be defined by constructing a four-real-dimensional space H of certain 2×2 complex matrices. In fact the space H is just a matrix representation of the quaternions, a four-dimensional vector space on which addition and multiplication are defined, much as they are defined for real and complex numbers. The essential difference is that multiplication of quaternions is not commutative. Of course matrix multiplication is also noncommutative — and multiplying our matrices in H is precisely the same as multiplying quaternions.

One way to think of the quaternions is as an extension of the complex numbers in which rather than there being a single square root of –1, i, there are three distinct quaternions, i, j and k that satisfy i2 = j2 = k2 = –1, along with the equation ijk = –1. If we identify the quaternion i with our matrix Hx, the quaternion j with our matrix Hy, the quaternion k with our matrix Hz, and the quaternion 1 with our matrix Ht, then everything works out perfectly, with the matrices and the quaternions having identical multiplication tables.

Matrices in H:

        Hx      Hy      Hz      Ht
Hx     –Ht      Hz     –Hy      Hx
Hy     –Hz     –Ht      Hx      Hy
Hz      Hy     –Hx     –Ht      Hz
Ht      Hx      Hy      Hz      Ht

Quaternions:

        i       j       k       1
i      –1       k      –j       i
j      –k      –1       i       j
k       j      –i      –1       k
1       i       j       k       1

We will treat the vector space of four-tuples of real numbers, R4, the quaternions expressed as real numbers plus linear combinations of our three square roots of –1, and the vector space of 2 × 2 matrices H as interchangeable, with:

(a, b, c, d) in R4 ↔ the quaternion a + b i + c j + d k ↔ the matrix a Ht + b Hx + c Hy + d Hz in H

Given a quaternion q = a + b i + c j + d k, the conjugate q* is a – b i – c j – d k, and for any quaternions p and q we have (pq)* = q* p*. Conjugation corresponds to taking the Hermitian adjoint of the corresponding matrices.

The real part of q, Re(q), is just a, which is also equal to ½(q + q*). This corresponds to taking half the trace of the corresponding matrix.

We define the length or norm |q| of a quaternion q as just the usual Euclidean length of the corresponding vector, so:

|q|2 = a2 + b2 + c2 + d2 = q q* = q* q

We’ve already noted that when we map a vector x in R4 to our space of matrices H, the determinant of the matrix H(x) in H is the square of |x|. So the unit vectors in H are 2×2 complex matrices of determinant 1, and in fact they are exactly the elements of the matrix group SU(2). So the double cover of SO(4), the set of pairs of elements of SU(2), can equally well be thought of as the set of pairs of unit quaternions.

This allows us to describe the 2-to-1 homomorphism f4 as a representation of SU(2)×SU(2) on the vector space H of quaternions:

f4(g, h) q = g q h–1

where g and h in SU(2) are construed as unit quaternions. As we mentioned in the previous section, this 2-to-1 homomorphism is equivalent to the four-dimensional spin-(½, ½) representation of SU(2)×SU(2) on C2⊗C2, also known as the vector representation. Apart from the fact that this is a representation of the double cover of SO(4) rather than SO(4) itself, this is all just a way of talking about the ordinary rotation of vectors in R4 by elements of SO(4). Every rotation in SO(4) can be produced this way by some pair of unit quaternions (g, h); the only complication is that the pair (–g, –h) also produces exactly the same rotation, as we would expect given that SU(2)×SU(2) is a double cover of SO(4).

So far, we haven’t really done anything new; we are just identifying our space of matrices H with the quaternions, and performing all the same operations on it as before. But, somewhat less obviously, we can also show that ρ½, the spin-½ representation of SU(2) — and hence also the spin-(½, 0) and spin-(0, ½) representations of SU(2)×SU(2) — can be defined in terms of quaternions.

We define ρ'½ as a representation of SU(2) on H as:

ρ'½(g) q = g q

Here as usual we’re construing g in SU(2) as a unit quaternion. We can construct an isomorphism T between H and C2 by picking any unit vector v0 in C2 and defining:

T(q) = q v0

where we multiply q and v0 by identifying the quaternion q with a 2×2 complex matrix.

If we’re going to use T to identify the quaternions with the complex vector space C2, we need to be able to give H a complex structure: a linear operator J on H that corresponds to multiplication by i on C2. What we choose for this role will depend on our choice of v0, but one simple choice would be v0=(0, 1) and the complex structure:

J(q) = q Hz

i.e. right multiplication by Hz. The reason for choosing Hz is that our choice of v0 is an eigenvector of Hz with eigenvalue i:

Hz (0, 1) = i (0, 1)

Now J2(q) = q Hz2 = –q, and with v0=(0, 1) we have:

T(J(q)) = q Hz (0, 1) = i q (0, 1) = i T(q)

Using T, it’s easy to show that the two representations, ρ'½ and ρ½, are equivalent:

T(ρ'½(g) q) = g q v0 = ρ½(g) T(q)

Note too that ρ'½ commutes with the complex structure J, because the right multiplication performed by J commutes with the left multiplication performed by ρ'½.
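Here is a numpy sketch of these constructions (an illustrative addition, using the explicit H matrices from earlier and the choices v0 = (0, 1) and J(q) = q Hz made above): it checks that J squares to –1, that T intertwines J with multiplication by i, and that T relates ρ'½ to ρ½.

import numpy as np

Hx = np.array([[0, -1j], [-1j, 0]])
Hy = np.array([[0, -1],  [1,   0]], dtype=complex)
Hz = np.array([[-1j, 0], [0,  1j]])
Ht = np.eye(2, dtype=complex)

v0 = np.array([0.0, 1.0], dtype=complex)   # the chosen unit vector in C2

def T(q):
    # T(q) = q v0, identifying the quaternion q (as a 2x2 matrix) with a vector in C2
    return q @ v0

def J(q):
    # the complex structure: right multiplication by Hz
    return q @ Hz

q = 0.3 * Ht + 1.2 * Hx - 0.5 * Hy + 2.0 * Hz   # a sample quaternion in H

assert np.allclose(J(J(q)), -q)              # J^2 = -1
assert np.allclose(T(J(q)), 1j * T(q))       # J corresponds to multiplication by i

theta = 0.9
g = np.cos(theta) * Ht + np.sin(theta) * Hy  # a unit quaternion, i.e. an element of SU(2)
assert np.allclose(T(g @ q), g @ T(q))       # T(rho'_1/2(g) q) = rho_1/2(g) T(q)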

We can now define the spin-(½, 0) and spin-(0, ½) representations of SU(2)×SU(2) on H as:

ρ'(½,0)(g, h) q = g q
ρ'(0,½)(g, h) q = h q

These are known as the left-handed spinor representation and the right-handed spinor representation. We will adopt uniform notation and write the vector representation on H, equivalent to f4, as:

ρ'(½,½)(g, h) q = g q h–1

If v is a vector (that is, a quaternion that transforms according to ρ'(½,½)), and r is a right-handed spinor (that is, a quaternion that transforms according to ρ'(0,½)) then the product of the two, v r, will be a left-handed spinor, transforming according to ρ'(½,0):

(ρ'(½,½)(g, h) v) (ρ'(0,½)(g, h) r)
= (g v h–1) (h r)
= g (v r)
= ρ'(½,0)(g, h) (v r)

This means that the operation of quaternion multiplication acts as an intertwiner between the tensor product of the vector representation and the right-handed spinor representation, and the left-handed spinor representation — that is, a linear map from the first of these spaces to the second that commutes with all the representations.

We can do something similar starting with a vector v and a left-handed spinor l, but we have to combine them in the form v* l:

(ρ'(½,½)(g, h) v)* (ρ'(½,0)(g, h) l)
= (g v h–1)* (g l)
= (h v* g–1) (g l)
= h (v* l)
= ρ'(0,½)(g, h) (v* l)

So we see that v* l transforms as a right-handed spinor.

When we were mapping R4 to H, we saw that we could put a real inner product on H with:

x · y = ½ tr(H(x) H(y)*)

We can rewrite this as:

p · q = Re(p* q) = Re(p q*) = ½(p q* + q p*)

But now that we’re mapping C2 to H what we really need is a complex inner product on H. This can be defined as:

<p, q>S = Re(p* q) – Re(p* J(q)) i = Re(p* q) – Re(p* q Hz) i

Here the subscript “S” stands for spinor. It’s easy to see that left-multiplication with any unit quaternion g from SU(2) will preserve this inner product, since g* g = |g|2 = 1:

<gp, gq>S
= Re((gp)* gq) – Re((gp)* gq Hz) i
= Re(p* g* gq) – Re(p* g* gq Hz) i
= <p, q>S

So our representation of SU(2), ρ'½, is a unitary operator on H with this complex inner product. We also have:

<gp, q>S
= Re((gp)* q) – Re((gp)* q Hz) i
= Re(p* g* q) – Re(p* g* q Hz) i
= <p, g* q>S

This tells us that if we think of left-multiplication by g as a linear operator, the adjoint operator with respect to our inner product is just left-multiplication by the conjugate, g*.

Parity

The orthogonal group O(4) consists of two parts: orthogonal matrices with determinant 1, which comprise the group SO(4), and those with determinant –1, which include reflections. Topologically, these are two separate pieces; this must be the case, because the determinant is a polynomial in the matrix components, which makes it a continuous function. It can’t just jump from 1 to –1 in the middle of a connected set, so the parts of O(4) with determinants 1 and –1 must be disconnected.

[In our own universe things are quite a bit more complicated: the group of Lorentz transformations consists of four disconnected pieces, because as well as reflections in space we have to consider time reversal, and also combinations of reflection and time reversal. But in the Riemannian universe, time reversal is just another reflection.]

The two pieces of O(4) can be mapped into each other by multiplication with any element of O(4) with determinant –1. We will choose a particular linear operator called the parity operation, P, which is defined by:

P(x, y, z, t) = (–x, –y, –z, t)

If R is in SO(4) then det(R) = 1, and det(PR) = det(P) det(R) = –1. Conversely, any F in O(4) with det(F) = –1 can be multiplied by P to give us an element of SO(4). So multiplication by P gives us an invertible map between the two components of O(4). Clearly P2 = I4, so applying the map twice gets us back where we started.

Now, we’d like to enlarge SU(2)×SU(2), the double cover of SO(4), into a new group that forms a double cover of O(4). Mimicking the relationship between SO(4) and O(4), we’ll do this by adding an “extra copy” of SU(2)×SU(2), which we reach by multiplying elements of SU(2)×SU(2) with a new group element that we’ll call Q. The aim is to use the second part of the enlarged group as a double cover for the part of O(4) containing reflections.

We’ll write elements of our new group either as (g, h) for elements of SU(2)×SU(2) or Q·(g, h) for elements of the second part of the group. The latter is just a formal expression that can’t be simplified any further; Q here acts as a kind of marker.

As well as extending SU(2)×SU(2), we want to extend our 2-to-1 homomorphism f4:SU(2)×SU(2)→SO(4) into one from the new group to O(4). We defined f4 as:

f4(g, h) x = H–1(g H(x) h–1)

and we will extend it by declaring:

f4(Q) x = P x

We previously noted that the effect of taking the Hermitian adjoint of H(x) is:

H(x)* = H(P x)

so we have:

P x = H–1(H(x)*)

This, and the fact that we require the extended f4 to be a group homomorphism, tell us that:

f4(Q·(g, h)) x
= f4(Q) f4(g, h) x
= P f4(g, h) x
= P H–1(g H(x) h–1)
= H–1( (g H(x) h–1)* )
= H–1( h*–1 H(x)* g* )     [The adjoint of a product of matrices is the product in reverse order of the adjoints.]
= H–1( h H(P x) g–1 )     [Since g and h are in SU(2), their adjoints are their inverses.]
= f4(h, g) P x
= f4(h, g) f4(Q) x
= f4((h, g)·Q) x

Now, from this we can conclude that:

f4(Q·(g, h)·Q–1)
= f4(Q·(g, h)) f4(Q–1)
= f4((h, g)·Q) f4(Q–1)
= f4(h, g)

Because f4 is 2-to-1, and the two elements it maps to the same point are always opposites, it follows that Q·(g, h)·Q–1 = ±(h, g). But for the case (g, h) = (I2, I2), which is the identity in SU(2)×SU(2) and hence for the whole extended group, the only correct choice is the plus sign. Since SU(2)×SU(2) is connected and we expect conjugation with Q to be a continuous map, we can’t switch signs. So:

Q·(g, h)·Q–1 = (h, g)
Q·(g, h) = (h, g)·Q

We also have:

f4(Q2) = f4(Q)2 = P2 = I4
Q2 = ±(I2, I2)

Either choice would be valid mathematically; which one is true can only be decided empirically. In our own universe, it turns out that parity is not an exact symmetry at all, since experiments with beta decay have shown parity violation. However, it is still a useful concept, applicable to a wide range of phenomena even though it’s not a perfect symmetry for all of physics.

If we view O(3) as the subgroup of O(4) that leaves the time axis fixed, and we view SU(2) as the subgroup of SU(2)×SU(2) consisting of elements of the form (g, g), everything we’ve done above can be adapted to a double cover for O(3). Note that the element Q commutes with every element of SU(2):

Q·(g, g) = (g, g)·Q

Representations Including Parity

Suppose we take one of the spin-(j, k) representations of SU(2)×SU(2), and we want to extend it to the group we get by including Q. We previously noted that when Q moves from the left to the right side of an element of SU(2)×SU(2), it swaps the order of those elements:

Q·(g, h) = (h, g)·Q

This identity needs to be preserved by the representation:

ρj, k(Q) ρj, k(g, h) = ρj, k(h, g) ρj, k(Q)

Applying this to a tensor product of two vectors:

ρj, k(Q) ρj, k(g, h) v⊗w = ρj, k(h, g) ρj, k(Q) v⊗w
ρj, k(Q) (ρj(g) v) ⊗ (ρk(h) w) = ρj, k(h, g) ρj, k(Q) v⊗w

If we restrict ourselves to the special case j=k, and define:

ρj, j, p(Q) v⊗w = p w⊗v

for some number p that’s yet to be determined — for now we’ll just include it in the label for the representation — then both the left-hand side and right-hand side of our identity become equal to:

p (ρj(h) w) ⊗ (ρj(g) v)

But what can we do in the case jk? We need to extend our representation to what we will call (j, k)⊕(k, j), the direct sum of the original representation and another version with the spins swapped. We define ρ(j, k)⊕(k, j), p on (Vj⊗Vk)×(Vk⊗Vj), the vector space of pairs of elements, one from each tensor product, by:

ρ(j, k)⊕(k, j), p(g, h) (a⊗b, c⊗d) = ((ρj(g) a) ⊗ (ρk(h) b), (ρk(g) c) ⊗ (ρj(h) d))
ρ(j, k)⊕(k, j), p(Q) (a⊗b, c⊗d) = p (d⊗c, b⊗a)

This again allows us to satisfy the identity that swaps the order of the elements in SU(2)×SU(2). If not for the symmetry Q this representation would clearly be reducible, with each part of the direct sum an invariant subspace, but since Q swaps elements in the two subspaces the overall representation is irreducible.

What about the number p? The effect of both ρj, j, p(Q)2 and ρ(j, k)⊕(k, j), p(Q)2 is just multiplication by p2. If Q2 = (I2, I2), then for any of our representations:

ρ(Q)2 = ρ(Q2) = ρ(I2, I2) = I

so we must have p2 = 1, and p = ±1. We call p the intrinsic parity of the representation.

On the other hand, if Q2 = (–I2, –I2), we have:

ρ(j, k)⊕(k, j), p(Q)2 = ρ(j, k)⊕(k, j), p(Q2) = ρ(j, k)⊕(k, j), p(–I2, –I2) = (–1)2(j+k) I

In the bosonic case, with j+k an integer, the choices of intrinsic parity are the same as before. In the fermionic case, with j+k equal to half an odd integer, the intrinsic parity is an imaginary number, ±i.

The representations of SU(2) extended to include Q are much simpler. Since Q commutes with every element, it must be represented by a multiple of the identity, so we have:

ρj, p(Q) = p I

with the same choices for p depending on the same criteria.

Finally, we can use the extended 2-to-1 homomorphisms f3 and f4 to construct representations of O(3) and O(4), in much the same way as we constructed representations for SO(3) and SO(4). These will be true representations in the bosonic case and projective representations in the fermionic case.

Representations of the Euclidean Group

So far, all the representations we’ve discussed have been of subgroups of the Euclidean group that leave the origin of our coordinate system fixed. Now we want to move on to the full group, which includes translations, symmetries that displace every point in four-space by some constant vector.

Representations on Function Spaces

To find representations of the entire Euclidean group, we need to start by introducing a new kind of vector space: a space of complex-valued functions on R4. If we take the set of functions from R4 to C, we can treat it as a vector space by defining addition and multiplication by scalars in the obvious way. For any such functions A, B and any complex number c:

(A+B)(x) = A(x) + B(x)
(c A)(x) = c A(x)

We can make a subset of the complex-valued functions on R4 into a Hilbert space, which we’ll call L2(R4), by defining the inner product between functions as:

<A, B> = ∫R4A(x)* B(x) d4x

L2(R4) consists of those functions whose squared norm:

|A|2 = <A, A> = ∫R4A(x)* A(x) d4x = ∫R4|A(x)|2 d4x

is finite. Functions like this are known as square-integrable functions.

We can define a representation ρF of E(4) on L2(R4) by:

(ρF(g) A)(x) = A(g–1x)

We can imagine the function A as a kind of rigid object in R4 with complex numbers “painted” on it. When we slide this object around by applying the symmetry g, each point in R4 now becomes associated with whatever new number ends up at that location. We need to use g–1 here, rather than just g, in order to get a representation:

(ρF(gh) A)(x)
= A((gh)–1x)
= A(h–1g–1x)
= (ρF(h) A)(g–1x)
= (ρF(g) ρF(h) A)(x)

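Here is a small numerical illustration of this definition and the property just derived (an illustrative Python sketch, not from the original notes): a Euclidean group element is represented as a pair (R, a) acting by x → R x + a, and the homomorphism property is checked at a sample point for one sample function.

import numpy as np

def compose(g1, g2):
    R1, a1 = g1; R2, a2 = g2
    return (R1 @ R2, R1 @ a2 + a1)

def inverse(g):
    R, a = g
    return (R.T, -R.T @ a)

def act(g, x):
    R, a = g
    return R @ x + a

def rhoF(g, A):
    # (rhoF(g) A)(x) = A(g^(-1) x)
    g_inv = inverse(g)
    return lambda x: A(act(g_inv, x))

def rotation_xy(theta):
    R = np.eye(4)
    R[:2, :2] = [[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]]
    return R

A = lambda x: np.exp(-x @ x) * (1.0 + x[0] * x[3])   # a sample function on R4

g = (rotation_xy(0.7),  np.array([1.0, 0.0, -2.0, 0.5]))
h = (rotation_xy(-1.3), np.array([0.0, 3.0, 1.0, -1.0]))
x = np.array([0.2, -0.4, 1.1, 0.3])

assert np.isclose(rhoF(compose(g, h), A)(x),
                  rhoF(g, rhoF(h, A))(x))    # rhoF(gh) = rhoF(g) rhoF(h)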
Because the Euclidean group’s action on R4 preserves four-dimensional volumes, the integral over R4 that gives the inner product will be unchanged. So this is a unitary representation. It is not irreducible on all of L2(R4), but we can find an irreducible subspace: the solutions of the Riemannian Scalar Wave equation for any fixed value of ωm.

∂x2A + ∂y2A + ∂z2A + ∂t2A + ωm2 A = 0 (RSW)

To see that A satisfying the RSW equation implies that ρF(g) A also satisfies it, suppose that g–1 acts on x = xi ei as:

g–1 x = (Rij xj + ai) ei

where Rij are the components of an orthogonal matrix, ai are the components of a vector in R4, and we’re using the Einstein Summation Convention for repeated indices. Then:

∂x2(ρF(g) A)(x) + ∂y2(ρF(g) A)(x) + ∂z2(ρF(g) A)(x) + ∂t2(ρF(g) A)(x)
= ∂x2(A(g–1x)) + ∂y2(A(g–1x)) + ∂z2(A(g–1x)) + ∂t2(A(g–1x))
= Σp ∂xp2(A(g–1x))
= Σp, q, r Rqp Rrp ∂yq ∂yr A(y) | y = g–1x
= Σq, r (R RT)qr ∂yq ∂yr A(y) | y = g–1x
= Σr ∂yr2 A(y) | y = g–1x
= –ωm2 A(y) | y = g–1x
= –ωm2 (ρF(g) A)(x)

What we’re doing here is converting from derivatives of A with respect to coordinates xp that A only sees via g–1 x, to derivatives with respect to coordinates yr that are passed directly to A; this allows us to refer back to the equation that the original function satisfies. In effect, we’re showing that a function that satisfies the RSW equation in one rectangular coordinate system will also satisfy it when the calculations are performed in any other rectangular coordinate system.
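
As a quick sanity check of this argument, here is a minimal symbolic sketch (using Python with sympy; the particular rotation, translation and symbol names are arbitrary choices of mine, not anything from the notes). It applies a Euclidean motion to a plane-wave solution of the RSW equation and confirms that the transformed function still satisfies the equation.

```python
import sympy as sp

# Symbols: coordinates, the frequency omega_m, and a rotation angle.
x, y, z, t, w, theta = sp.symbols('x y z t omega_m theta', real=True)

k = sp.Matrix([w, 0, 0, 0])                 # a propagation vector with |k| = omega_m
X = sp.Matrix([x, y, z, t])
R = sp.Matrix([[sp.cos(theta), -sp.sin(theta), 0, 0],
               [sp.sin(theta),  sp.cos(theta), 0, 0],
               [0, 0, 1, 0],
               [0, 0, 0, 1]])               # a rotation in the xy-plane
a = sp.Matrix([1, 2, 3, 4])                 # an arbitrary translation

A = lambda p: sp.exp(sp.I * k.dot(p))       # a plane-wave solution of the RSW equation
A_g = A(R * X + a)                          # (rho_F(g) A)(x) = A(g^{-1} x)

rsw = sum(sp.diff(A_g, v, 2) for v in (x, y, z, t)) + w**2 * A_g
print(sp.simplify(rsw))                     # 0: the transformed function also satisfies RSW
```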

What about vector-valued functions? We can define an inner product on the space of Cn-valued functions by:

<A, B> = ∫R4<A(x), B(x)> d4x

Here the inner product inside the integral is just the standard inner product on Cn itself. Then the squared norm of a Cn-valued function is:

|A|2 = <A, A> = ∫R4<A(x), A(x)> d4x = ∫R4|A(x)|2 d4x

We can get a unitary representation of the Euclidean group on the space of square-integrable C4-valued functions C4⊗L2(R4) by applying the “non-translational part” of g to the four-tuple of complex numbers that such a function will assign to each point in R4. If A is in C4⊗L2(R4) and g–1 x = (Rij xj + ai) ei, we define:

(ρV(g) A)(x) = (R–1)ij Aj(g–1 x) ei = Rji Aj(g–1 x) ei

If we require that each component of A satisfies the RSW equation, the same will be true for the components of ρV(g) A. But as we hinted in the notes on vector waves, this is not enough to give us an irreducible representation, since we can restrict our functions to lie in an even smaller invariant subspace by imposing the transverse condition:

∂x Ax + ∂y Ay + ∂z Az + ∂t At = 0 (Transverse)

Suppose A meets the transverse condition. Then for ρV(g) A we have:

∂x(ρV(g) A)x(x) + ∂y(ρV(g) A)y(x) + ∂z(ρV(g) A)z(x) + ∂t(ρV(g) A)t(x)
= Σp, j ∂xp Rjp Aj(g–1 x)
= Σp, j, q Rqp ∂yq Rjp Aj(y) | y = g–1x
= Σj, q (R RT)qj ∂yq Aj(y) | y = g–1x
= Σj ∂yj Aj(y) | y = g–1x
= 0

So ρV(g) A meets the transverse condition too.

In this case we managed to find an irreducible representation within the larger one, but the process is still a bit mysterious. To systematically identify all the irreducible unitary representations of the Euclidean group, we need to invoke a subtler method.

A warning: the remainder of this section contains a lot of abstract group-theoretical constructions. If you enjoy this level of detail, great — but if you find it heavy going just skip it! It’s not an essential prerequisite for later material in the notes.

Group Actions, Orbits and Isotropy Subgroups

First, a few simple definitions we’re going to need[2].

An action α of a group G on a set M assigns to every element g of the group an invertible function α(g):M→M, with α(e) x = x, where e is the group’s identity. If αL is a left action, then for all x in M and g, h in G it must satisfy the condition:

αL(g) αL(h) x = αL(gh) x

If αR is a right action, the condition is:

αR(h) αR(g) x = αR(gh) x

The most obvious examples are when the group acts on itself, by multiplication on the left or right:

αL(g) a = g a
αR(g) a = a g

A group can also act on itself through conjugation, either as a left or right action:

αL(g) a = g a g–1
αR(g) a = g–1 a g

Representations of groups on vector spaces are left actions, with the additional requirement that the function associated with each group element must be a linear operator.

If we have an action α of G on M, then the orbit of any element x of M is the set {α(g) x | g in G}. It consists of all points in M to which x can be moved by the action of some element of the group. For example, if we take the action of SO(3) on R3 to be matrix multiplication, so that elements of SO(3) simply rotate vectors in R3, the orbit of any point in R3 except the origin will be a sphere, while the orbit of the origin will be just the origin itself.

An action α of G on M is called transitive if for any two elements m1 and m2, there is an element g of G such that:

α(g) m1 = m2

To put this another way, if the action on M is transitive, the orbit of any point is the entire set M.

We write the set of all orbits of G on M as M / G. In our example of SO(3) and R3, the set of orbits R3 / SO(3) is isomorphic to the set of non-negative numbers, equal to the radii of the orbits as spheres.

An isotropy subgroup Gm is the set of elements of G that leave some point m in M fixed under the action:

Gm = {h in G | α(h) m = m}

For example, if G is SO(3) and M is R3, the isotropy subgroup G(0, 0, 1) that leaves the z-axis fixed is a subgroup of SO(3) isomorphic to the group of two-dimensional rotations, SO(2).

If two points x and y in M lie on the same orbit of some action α, then their isotropy subgroups under that action are isomorphic to each other; that is, there’s an invertible group homomorphism φ:Gx→Gy. For a left action, if y = αL(g) x, the isomorphism is φ(h) = ghg–1. For a right action, if y = αR(g) x, the isomorphism is φ(h) = g–1hg.

For example, the two subgroups of SO(3) that leave the x-axis fixed and the y-axis fixed are isomorphic to each other, which we can prove by using φ(h) = ghg–1 with g equal to any rotation that takes the x-axis into the y-axis.
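
Here is a small numerical illustration of that isomorphism (a sketch in Python with numpy; the helper rot and the particular angles are arbitrary choices of mine). If h is a rotation about the x-axis, and g is the quarter-turn about the z-axis that carries the x-axis to the y-axis, then g h g–1 fixes the y-axis.

```python
import numpy as np

def rot(axis, angle):
    """3x3 rotation by angle about a coordinate axis (0 = x, 1 = y, 2 = z)."""
    c, s = np.cos(angle), np.sin(angle)
    i, j = [(1, 2), (2, 0), (0, 1)][axis]
    R = np.eye(3)
    R[i, i] = R[j, j] = c
    R[i, j], R[j, i] = -s, s
    return R

ex, ey = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
g = rot(2, np.pi / 2)          # carries the x-axis to the y-axis
h = rot(0, 0.7)                # an element of the isotropy subgroup of ex
phi_h = g @ h @ g.T            # phi(h) = g h g^{-1}

print(np.allclose(h @ ex, ex), np.allclose(phi_h @ ey, ey))   # True True
```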

Semidirect Products

Suppose we have a group G and a representation σ of G on the vector space R4. Then we can turn G×R4, the set of pairs (g, x), into a group S by defining the group operation as:

(g, x)(h, y) = (gh, σ(g) y + x)

The group identity is (I, 0) and the inverse is:

(g, x)–1 = (g–1, –σ(g)–1x)

When G is O(4) and σ is matrix multiplication, S is precisely the Euclidean group E(4) of all symmetries of Euclidean four-space. The group multiplication rule for S is just the rule for composing Euclidean symmetries that we’ve seen previously, when we expressed it via matrix multiplication of matrices extended to include the translation vectors:

[ R2  s2 ] [ R1  s1 ]   [ R2 R1   R2 s1 + s2 ]
[ 0   1  ] [ 0   1  ] = [ 0       1          ]

Elements of the form (I, x) make up the subgroup of translations, T(4), while elements of the form (g, 0) make up O(4) as a subgroup of E(4).

We say E(4) is the semidirect product of O(4) and T(4). This term has a more general definition in group theory, but for our purposes we’re most interested in the fact that every element of E(4) is written uniquely as a pair of elements of O(4) and T(4).
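
To make this concrete, here is a minimal numerical sketch (Python with numpy; helper names like pair_mul and extended are mine). It checks that the pair multiplication rule for the semidirect product agrees with multiplying the extended 5×5 matrices, and that the formula for the inverse really does produce the identity.

```python
import numpy as np

def random_orthogonal(rng, n=4):
    q, r = np.linalg.qr(rng.normal(size=(n, n)))
    return q * np.sign(np.diag(r))          # an arbitrary element of O(4)

def pair_mul(a, b):
    (R1, s1), (R2, s2) = a, b
    return (R1 @ R2, R1 @ s2 + s1)          # (g, x)(h, y) = (gh, sigma(g) y + x)

def pair_inv(a):
    R, s = a
    return (R.T, -R.T @ s)                  # sigma(g)^{-1} = R^T for orthogonal R

def extended(a):
    R, s = a
    M = np.eye(5)
    M[:4, :4], M[:4, 4] = R, s              # the block matrix [[R, s], [0, 1]]
    return M

rng = np.random.default_rng(0)
g = (random_orthogonal(rng), rng.normal(size=4))
h = (random_orthogonal(rng), rng.normal(size=4))

print(np.allclose(extended(pair_mul(g, h)), extended(g) @ extended(h)))   # True
print(np.allclose(extended(pair_mul(g, pair_inv(g))), np.eye(5)))         # True
```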

We’re also interested in what happens when we replace O(4) with another group — such as SO(4), or its double cover SU(2)×SU(2), or the extension of SU(2)×SU(2) to a double cover of O(4). If we use one of these double covers, its representation σ on R4 will be the two-to-one homomorphism f4, and the semidirect product S we get will itself be a double cover, either of E(4) or of its subgroup SE(4) that excludes reflections.

Vector Bundles

We now define a mathematical structure called a vector bundle[3]. This consists of a set E, the bundle, along with a set we’ll call M, known as the base space of the bundle, and a projection, π:E→M, which maps every element in E to an element of M.

Vector bundle over a circle
Moebius bundle over a circle

For every point m in M, the subset of E that projects to m is a vector space; we will write Em for π–1(m) as a vector space. We call E a vector bundle over the set M, and we can picture it as a collection of vector spaces, each one “sitting above” a different point in M. We call each individual vector space Em the fibre over m.

A simple example of a vector bundle is an infinite cylinder in R3, centred on the z-axis, along with a projection onto the circle where the cylinder intersects the xy-plane. “Over” each point in the circle lies a one-dimensional real vector space, consisting of the line that projects onto that point.

A trivial bundle is one that consists simply of pairs of points from the base space M and some vector space V, so that the bundle is E = M×V. Our cylinder is an example of that. But a bundle need not be trivial. For example, we can construct a bundle with a Möbius strip topology. This has the same base space as the cylinder, and also has each Em a one-real-dimensional vector space, but it is clearly not the same set. (We can’t embed an infinitely wide Möbius strip in three-dimensional space, so the illustration is just intended to give an intuitive picture of the topology.)

A vector bundle E is known as a homogeneous G-bundle if there are left actions αE and αM of G that act on E and M respectively, and that are compatible with the projection π:E→M in the sense that applying the action on E then projecting to M has the same effect as projecting to M then applying the action on M. In other words, for all g in G and w in E:

αM(g) π(w) = π(αE(g) w)

This implies that αE(g) maps the fibre Em into the fibre EαM(g) m. We require this map to be linear.

On any vector bundle E, we can define a vector space of sections of the bundle, Γ(E). Each section f in Γ(E) is a function from the base space M to E, such that f assigns to each point m an element of the vector space Em “sitting above” that point. Another way to state this is to say that:

π(f(m)) = m

for all m in M.

Γ(E) is made into a vector space by adding two sections point by point, and multiplying a section by a scalar by multiplying its value at every point. A section is a bit like a vector-valued function on M — except that each point in M has a value that lies in a distinct vector space, Em. If we drew a curve on our cylinder that intersected each vertical line on it exactly once, that would give us a section.

If E is a homogeneous G-bundle, we can use the actions αE and αM to get a representation of G on the vector space of sections, Γ(E). If we call this representation r, we have for every g in G, every f in Γ(E) and every m in M:

( r(g) f ) (m) = αE(g) f(αM(g–1) m)

We can check that the new function we get really is a section by applying the projection π to it:

π( ( r(g) f ) (m) )
= π( αE(g) f(αM(g–1) m) )
= αM(g) π( f(αM(g–1) m) )
= αM(g) αM(g–1) m
= m

Representations on Vector Bundles

What follows is based largely on Sternberg[4], adapted where necessary from Lorentzian to Riemannian physics.

Let’s start by considering the vector space R4 as an Abelian group: a group where the operation (vector addition, in this case) is commutative. We could call this group T(4), but instead we’ll write it as (R4, +) as a reminder that we can think of every translation in T(4) simply as a vector in R4.

Though we won’t prove it, every irreducible representation of an Abelian group on a complex vector space is one-dimensional. A linear operator on a one-dimensional complex vector space is just multiplication by a complex number, and if the representation is unitary that number’s inverse must equal its complex conjugate, so the number must have an absolute value of 1. An irreducible unitary representation χ of (R4, +) is essentially a function from (R4, +) to the set of complex numbers with absolute value 1. These complex numbers are also known as phases, and every phase can be written in the form exp(i θ) for some real number θ.

Any vector k in R4 gives us a one-dimensional unitary representation χk of the group (R4, +), defined by:

χk(x) = exp(i k · x)

We can see that this is a representation by checking that:

χk(x) χk(y) = exp(i k · x) exp(i k · y) = exp(i k · (x+y)) = χk(x+y)

In fact, all irreducible unitary representations of (R4, +) are of this form. So we can think of R4 itself in a second way: as being the set of all irreducible unitary representations of the additive group (R4, +).

Suppose S is the semidirect product of some group G and (R4, +), via a representation σ of G on R4 that preserves the dot product. If G is O(4) or SO(4) then σ would just be the fundamental matrix representation, whereas if G is a double cover then σ would be the two-to-one homomorphism f4. There is a right action α of S on its subgroup (R4, +) defined as follows, where s = (g, y) is any element of S, x is any element of (R4, +), and we write x as (I, x) when we want to refer to it explicitly as an element of the semidirect product S.

α(s) x
= s–1 x s
= (g, y)–1 (I, x) (g, y)
= (g–1, –σ(g)–1y) (I, x) (g, y)
= (g–1, –σ(g)–1y) (g, y+x)
= (I, σ(g)–1x)
= σ(g)–1x

From this right action α, we can get a left action β of S on R4 as the set of irreducible unitary representations of the additive group (R4, +):

(β(s) χk)(x)
= χk(α(s) x)
= χk(σ(g)–1x)
= exp(i k · (σ(g)–1x))
= exp(i (σ(g) k) · x)     [σ(g) preserves the dot product, and we apply it to both k and σ(g)–1x]
= χσ(g) k(x)

We’ve defined β through its action on the set of representations χk, but we will blur the distinction between χk and k and also use β as an action on R4 viewed simply as a vector space:

β(s) k = β(g, y) k = σ(g) k

The orbit of any non-zero vector k under this action, β, will be a 3-sphere in R4, because σ(g) preserves the length of k — and (for our purposes) the group G is always going to be at least SO(4). The orbit of the zero vector will just be itself, a single point. So we can classify all the orbits by their radius. [In Lorentzian physics the situation is more complicated: in that case, k will have a different kind of orbit depending on whether it’s timelike, spacelike or null. For more on this, see the sections cited in Sternberg.]

Now, suppose ρ is an irreducible unitary representation of our semidirect product group S on some vector space V. S is either going to be the Euclidean group E(4), the special Euclidean group SE(4), or the double cover of one of these groups.

If we restrict ρ to the translational subgroup (R4, +) of S, then ρ will give us a unitary representation of (R4, +), but unless V is one-dimensional this restricted representation won’t be irreducible — because (R4, +) is an Abelian group, so its irreducible representations are one-dimensional. In general, then, V will be spanned by many one-dimensional subspaces that are invariant under the action of ρ restricted to (R4, +).

In each such one-dimensional subspace of V, we will have, for all x in (R4, +) and every w in the subspace:

ρ(x) w = χk(x) w

where the vector k identifies which of the possible irreducible unitary representations of (R4, +) the restriction of ρ is equivalent to. Since there’s no guarantee that a given vector k will show up in just a single one-dimensional subspace, we will write E(k) for the entire subspace of V on which, for every w in E(k):

ρ(x) w = χk(x) w

So we’ve decomposed V into subspaces E(k) where the translational part of the representation ρ is the same as χk for some specific vector k.

Vector bundle from representation

What happens if we act with ρ(s) on some w in E(k), where s = (g, y) is any element of S whatsoever? For any x in (R4, +) we have:

ρ(x) ρ(s) w
= ρ(s) ρ(s–1) ρ(x) ρ(s) w
= ρ(s) ρ(s–1 x s) w
= ρ(s) χk(α(s) x) w
= ρ(s) (β(s) χk)(x) w
= ρ(s) χσ(g) k(x) w
= χσ(g) k(x) ρ(s) w

This tells us that ρ(s) w lies in E(σ(g) k), where the g here comes from s = (g, y). In other words, ρ(g, y) maps E(k) into E(σ(g) k).

So all the vectors k corresponding to the various E(k) must come from a single orbit in R4 under the action β of S on R4 — or equivalently, the action σ of the subgroup G on R4 — and so they will all have the same length. If that length is k = |k|, we’ll define k0 = (0,0,0,k) in R4. Then every k comes from the orbit of k0, and that orbit is a 3-sphere in R4 of radius k. We’ll refer to this three-sphere as Mk.

Consider all the E(k) we find in V, taken together. We will write:

E = ∪k E(k)

where by “∪” we mean a disjoint union, in which all the E(k) are treated as completely distinct sets and we’re collecting all their elements into one large set E. (Specifically, this means that although all the subspaces E(k) share their origin, which is the origin of V, in E we treat each E(k) as having its own distinct origin.)

A general vector in V won’t lie in any of these subspaces, but ρ(s) always maps each E(k) to another such subspace, E(σ(g) k) — so ρ gives us an action of S on E that is just like the action on a vector bundle, where fibres are mapped to fibres! We make our collection E of vector spaces E(k) into a vector bundle over the orbit Mk in R4, by defining the projection π to take every element of E(k) to k itself.

The representation ρ gives us a left action of S on E, and we also have a left action of S on Mk, namely β. These actions are compatible with our projection π; for all s in S and all w in E:

β(s) π(w) = π(ρ(s) w)

We can use those actions to get a representation of S on the vector space of sections, Γ(E). If we call this representation r, we have for every s in S, every f in Γ(E) and every k in Mk:

( r(s) f ) (k) = ρ(s) f(β(s–1) k)

In fact, the representation r of S on Γ(E) is equivalent to our original representation ρ of S on V! Given any vector v in V, we can write it uniquely as a sum of vectors that lie in each of the E(k). That assigns a vector in E(k) to each k, so we can think of it as a section in Γ(E). Equally, given any section in Γ(E) we can add up all the vectors in each E(k) as vectors in V. With this identification between V and Γ(E), the representations are equivalent.

Now, in general ρ moves vectors from one E(k) to another, but if we restrict it to a suitable subgroup we can look at how it acts on just one of these spaces. Let’s define Sk0 as the isotropy subgroup of S consisting of all (h, x) such that:

β(h, x) k0 = σ(h) k0 = k0

We’ll also define Gk0 as the isotropy subgroup of G consisting of all h such that:

σ(h) k0 = k0

Obviously Gk0 just consists of the non-translational parts h of the elements (h, x) of Sk0. In fact, Sk0 is the semidirect product of Gk0 and (R4, +). The isotropy subgroups for all points on the same orbit are isomorphic, so our choice of k0, as opposed to some other k on the orbit, makes no difference to the structure of these subgroups.

Restricted to Sk0, ρ will map E(k0) into itself, giving us a representation of Sk0. But we already understand the translational part of the representation: our definition of E(k) says that on E(k0) the restriction of ρ to translations (I, y) is described by χk0(y), a uniform multiplication by the phase exp(i k0 · y). So let’s focus on the non-translational part, and extract a representation η of Gk0 on E(k0). For any h in Gk0 and w in E(k0), define:

η(h) w = ρ(h, 0) w

The fact that ρ is a representation is enough to make η a representation, and the fact that ρ is irreducible means that η is irreducible too. The latter might not be obvious, but if there were a proper subspace U of E(k0) that Gk0 preserved, we could define:

U+ = the subspace of V spanned by {ρ(s) w | s in S, w in U}

U+ is not the whole of V, since by assumption no element of Gk0 can take w out of U, nor can any element of Sk0, which at most merely multiplies by a further phase, while any element of S outside Sk0 maps every w into some E(k) with k ≠ k0. But U+ is an invariant subspace, because acting with ρ on any linear combination of vectors of the form ρ(s) w will again give a linear combination of vectors of the same form. So our assumption about U would contradict the irreducibility of ρ.

What can we say about the isotropy subgroup Gk0, for which η is an irreducible representation on E(k0)? We will assume that k≠0, so k0 is a non-zero vector in Euclidean four-space. The subgroup of SO(4) that keeps k0 = (0,0,0,k) fixed is just SO(3). We can get variations on this either by dropping the “S” for special from both the full group and the subgroup, and/or making both into double covers. The simplest thing is to use double covers rather than SO(4) or O(4) themselves, in order to obtain true representations, so the two most useful cases are:

G = SU(2)×SU(2), the double cover of SO(4), for which Gk0 is the diagonal subgroup {(h, h)}, isomorphic to SU(2), the double cover of SO(3).
G = the double cover of O(4), obtained by including the element Q, for which Gk0 is the corresponding double cover of O(3).

We’ve already examined the irreducible unitary representations of these groups. In the case of SU(2), they are the spin-j representations. In the case of the double cover of O(3), there is also a choice of intrinsic parity.

Having “reverse-engineered” ρ, a putative irreducible representation of S, we can now see how to construct the same kind of representation ourselves. Our starting ingredients are a positive number k, which fixes k0 = (0,0,0,k) and the orbit Mk, and an irreducible unitary representation η of Gk0 on a vector space W, which we extend to a representation ξ of Sk0 on W by setting ξ(h, y) = χk0(y) η(h). The steps to do this are as follows:

  1. We want a homogeneous S-bundle E with a base space of Mk and each vector space E(k) isomorphic to W. But we can’t define E as the set Mk×W, the trivial bundle, as there is no natural action of the full group S on that bundle. But it’s easy to find actions of S on itself — and what’s more, there’s an action of Sk0 on S such that the orbits S / Sk0 can be put into one-to-one correspondence with Mk. We define a left action γ0 of Sk0 on S as:
    γ0(s0) s = s s0–1
    Inverting the element of Sk0 and putting it on the right of the element of S might seem quirky, but it’s essential. We then define the map:
    π0:S→Mk
    π0(g, x) = f4(g) k0
    With π0, all elements of S on the same orbit in S / Sk0 under the action γ0 will be identified with the same element of Mk. Having seen how to “shrink” S to Mk, we’ll copy this trick on the set S×W. We define a left action γ of Sk0 on S×W:
    γ(h, y) ((g, x), w) = ((g, x) (h, y)–1, ξ(h, y) w)
    We’ll write the orbit of ((g, x), w) under this action as [((g, x), w)]. We now define our bundle E as the set of orbits in S×W under the action γ of Sk0, which we write as E = (S×W) / Sk0. Then we define the projection π:E→Mk by:
    π( [((g, x), w)] ) = f4(g) k0
    which makes E a bundle over Mk.
  2. Is our projection π defined unambiguously, giving a result that’s independent of the particular point in the orbit we use? As we move through the orbit the element of S, (g, x), becomes (g, x) (h, y)–1, and the non-translational part of that — on which the projection depends — is gh–1. What values does gh–1 take on, as h ranges over the isotropy subgroup Gk0? Define G(k0, k) as the subset of G that maps k0 to k via the action f4:
    G(k0, k) = {a in G | f4(a) k0 = k}
    If g is in G(k0, k), then gh–1 will be too, since h–1 leaves k0 fixed. Suppose a is also in G(k0, k), so we have:
    f4(g) k0 = k
    f4(a) k0 = k
    Then:
    k0 = f4(a–1) k = f4(a–1) f4(g) k0 = f4(a–1 g) k0
    which tells us that a–1g is in Gk0. Setting h = a–1g means h–1 = g–1a, and gh–1 = a. So the values gh–1 takes on in each orbit are precisely the set G(k0, k), and whatever point we choose in the orbit, it will project to the same k.
  3. Define E(k) as π–1(k) = {[((a, x), w)] | a in G(k0, k), x in R4, w in W}. To turn the fibre E(k) into a vector space, we need to be able to perform the usual operations of adding vectors and multiplying by scalars. In order to do that, choose some function a:Mk→G such that a(k) is always in G(k0, k), then define the maps φa(k):W→E(k) and ψa(k):E(k)→W as:
    φa(k)(w) = [((a(k), 0), w)]
    ψa(k)([((g, x), w)]) = ξ( (a(k),0)–1(g, x) ) w
    The map ψa(k) is well-defined, because if we choose any other element in the same orbit, we get the same result. This is where the inverted right-side placement of (h, y) pays off:
    ψa(k)([((g, x)(h, y)–1, ξ(h, y) w)])
    = ξ( (a(k),0)–1(g, x)(h, y)–1 ) ξ(h, y) w
    = ξ( (a(k),0)–1(g, x) ) w
    These two maps φa(k) and ψa(k) are actually inverses:
    φa(k)a(k)([((g, x), w)]))
    = φa(k)(ξ( (a(k),0)–1(g, x) ) w)
    = [((a(k), 0), ξ( (a(k),0)–1(g, x) ) w)]
    = [γ((g, x)–1 (a(k),0)) ((a(k), 0), ξ( (a(k),0)–1(g, x) ) w)]
    = [((a(k), 0)(a(k),0)–1(g, x), ξ( (g, x)–1 (a(k),0) ) ξ( (a(k),0)–1(g, x) ) w)]
    = [((g, x), w)]
    and:
    ψa(k)a(k)(w))
    = ψa(k)([((a(k), 0), w)])
    = ξ( (a(k),0)–1(a(k), 0) ) w
    = w
    We use φa(k) and ψa(k) to identify E(k) with W. Then if c and d are in E(k), and z is a complex scalar, we turn E(k) into a vector space by defining:
    c + d = φa(k)( ψa(k)(c) + ψa(k)(d) )
    z c = φa(k)( z ψa(k)(c) )
  4. There are some simple relationships between φa(k), ψa(k) and ξ(h, 0)=η(h), the representation of Gk0 on W. If h is any element of Gk0 and a(k) is any element of G(k0, k):
    φa(k)h = φa(k) ∘ ξ(h, 0)
    ψa(k)h = ξ(h, 0)–1 ∘ ψa(k)
    We can see from this that our choice of a particular a(k) in G(k0, k) makes no difference to the definition of vector space operations on E(k). If we redefine the sum of two elements c and d in E(k) as:
    c + d = φa(k)h( ψa(k)h(c) + ψa(k)h(d) )
    then this becomes:
    φa(k)( ξ(h, 0) ( ψa(k)h(c) + ψa(k)h(d) ) )
    = φa(k)( ξ(h, 0) (ξ(h, 0)–1 ( ψa(k)(c) + ψa(k)(d) ) ) )
    = φa(k) ( ψa(k)(c) + ψa(k)(d) )
    in agreement with the original definition.
  5. Define the action ρ of S on E by:
    ρ(b, z) [((g, x), w)] = [((b, z)(g, x), w)]
    and the action β of S on Mk by:
    β(b, z) k = f4(b) k
    These actions are compatible with our projection π:
    π( ρ(b, z) [((g, x), w)] )
    = π( [((b, z)(g, x), w)] )
    = f4(bg) k0
    = f4(b) f4(g) k0
    = β(b, z) π([((g, x), w)] )
  6. Suppose we act with ρ( (a(k), 0)(h, y)(a(k), 0)–1 ) on the fibre E(k), where (h, y) is in Sk0. The element (a(k), 0)(h, y)(a(k), 0)–1 will be in the isotropy subgroup Sk that fixes k — a subgroup which is isomorphic to Sk0. It’s not hard to see that:
    ρ( (a(k), 0)(h, y)(a(k), 0)–1 ) [((g, x), w)] = φa(k)( ξ(h, y) ψa(k)([((g, x), w)]) )
    This shows us that the action of ρ when restricted to the fibre E(k) and the subgroup Sk is “equivalent to” ξ, the irreducible representation of Sk0 that we started with. Strictly speaking we should only talk about equivalent representations of precisely the same group, but having identified Sk and Sk0 with an explicit isomorphism, we’re demonstrating that the representation of Sk on each fibre is essentially just a version of the representation ξ of Sk0 on W.
  7. Define the representation r of S on the vector space of sections of our bundle, Γ(E), by the following formula, for every s in S, f in Γ(E) and k in Mk:
    ( r(s) f ) (k) = ρ(s) f(β(s–1) k)
    This will be the irreducible unitary representation of S that we sought. Whew! This representation of the double cover of E(4) or SE(4) is completely determined by our choices of the magnitude k of the vector k0 (equivalently, the radius of the orbit Mk) and the irreducible unitary representation η of Gk0 on W, which amounts to a choice of spin, together with an intrinsic parity if reflections are included.

The Spin-0 Bundle Representation

In the case of the spin-0 representation of SU(2), the vector space W is just the set of complex numbers, η(h) is the identity operator, and we have:

ξ(h, y) = χk0(y) η(h) = exp(i k0 · y)

The map ψa(k):E(k)→W that we use to identify each fibre with W becomes:

ψa(k)([((g, x), w)])
= ξ( (a(k),0)–1(g, x) ) w
= χk0(f4(a(k)–1) x) w
= exp(i k0 · (f4(a(k)–1) x)) w
= exp(i (f4(a(k)) k0) · x) w
= exp(i k · x) w

This is completely independent of our choice of a(k)! So we can define a map from the whole bundle E to the trivial bundle, ψ:E→Mk×C as:

ψ([((g, x), w)]) = (k, exp(i k · x) w)

where we’ve simply written k for clarity, but this can be obtained explicitly from the point in E as k = π([((g, x), w)]) = f4(g) k0. We can use ψ to turn a section of the bundle into a function from Mk to the complex numbers. Equally, we can turn such a function into a section of the bundle, via the map φ:Mk×C→E with:

φ(k, w) = [((a(k), 0), w)]

It might look as if this depends on a(k), but in fact it maps to exactly the same orbit regardless. This is because ξ(h, y) now depends only on the translational part, y, and so long as a1 and a2 are both in G(k0, k), the action of γ(a2–1a1, 0) takes ((a1, 0), w) to ((a2, 0), w), showing that they’re in the same orbit.

If we use φ to turn a function w:Mk→C into a section, act on that section with some element (b, z) of S via the representation r, then use ψ to turn the section back into a function on Mk, we get a representation of S on the space of functions on Mk that we will also call r. For any such function w:

(r(b, z) w)(k) = exp(i k · z) w(f4(b)–1 k)

We expect the representation on sections of the spin-0 bundle over Mk to be connected to solutions of the RSW equation with ωm = k. In the notes on scalar waves we looked at real-valued plane-wave solutions to that equation, but for our present purposes we need to look at the complex-valued plane waves:

Ak(x) = A0 exp(–i k · x)

We can forge a link between the two representations by constructing a function Aw on R4 equal to an integral over the set of plane wave solutions, weighted by a complex-valued function w(k) on Mk. In other words, we treat every vector k in Mk as the propagation vector for a plane wave, and take the value w(k) to be the contribution of each plane wave to a Fourier synthesis of the whole function on R4.

Aw(x) = ∫Mk w(k) exp(–i k · x) d3k

How does Aw transform under ρF, the representation on the space of functions on R4 that we discussed previously?

(ρF(b, z) Aw)(x) = Aw((b, z)–1 x) = Aw(f4(b–1) (x – z))
= ∫Mk w(k) exp(–i k · (f4(b–1) (x – z))) d3k
= ∫Mk w(f4(b)–1 k') exp(–i (f4(b)–1 k') · (f4(b–1) (x – z))) d3k'     [Change integration variable to k'=f4(b) k]
= ∫Mk w(f4(b)–1 k') exp(–i k' · (x – z)) d3k'
= ∫Mk exp(i k' · z) w(f4(b)–1 k') exp(–i k' · x) d3k'
= Ar(b, z) w(x)

For each positive value of ωm the square-integrable solutions of the RSW equation form an irreducible subspace of the functions on R4, and ρF, restricted to that subspace, is equivalent to the spin-0 bundle representation for k = ωm. The spin-0 bundle representation is the Fourier transform of the functions-on-space representation, showing how a solution of the RSW equation transforms when it is written as an integral of plane-wave solutions.
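
We can check the transformation law numerically with a finite sum of plane waves standing in for the integral (a sketch in Python with numpy; the propagation vectors, weights, sample point and group element are random, and the variable names are mine). Both sides of the final equation above are evaluated at the same point and compared.

```python
import numpy as np

rng = np.random.default_rng(1)
omega_m = 2.0
ks = rng.normal(size=(5, 4))
ks *= omega_m / np.linalg.norm(ks, axis=1, keepdims=True)   # five points on the 3-sphere M_k
w = rng.normal(size=5) + 1j * rng.normal(size=5)            # weights w(k_j)

q, r = np.linalg.qr(rng.normal(size=(4, 4)))
R = q * np.sign(np.diag(r))                                 # f4(b): an orthogonal matrix
z = rng.normal(size=4)                                      # the translation part of (b, z)

def A(weights, kvecs, x):
    """Finite Fourier synthesis: sum_j weights_j exp(-i k_j . x)."""
    return np.sum(weights * np.exp(-1j * kvecs @ x))

x = rng.normal(size=4)
lhs = A(w, ks, R.T @ (x - z))                # (rho_F(b, z) A_w)(x) = A_w(f4(b^{-1})(x - z))
ks_new = ks @ R.T                            # rotated propagation vectors f4(b) k_j
rhs = A(np.exp(1j * ks_new @ z) * w, ks_new, x)             # A_{r(b, z) w}(x)
print(np.allclose(lhs, rhs))                 # True
```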

If we want to include the parity operation, P, we’re free to choose a representation with an intrinsic parity of either 1 or –1.

The Spin-1 Bundle Representation

The spin-1 representation of SU(2) is three-dimensional, and gives a true representation of SO(3) that’s equivalent to the fundamental representation, where each 3×3 real matrix in SO(3) acts on vectors in C3 by matrix multiplication.

The puzzle here is how this three-dimensionality fits into a representation of the four-dimensional Euclidean group. A vector in four-space transforms in a natural way under SO(4), so how does SO(3) come into the picture?

We saw some clues to this in the notes on vector waves, where we found we could limit the solutions of the Riemannian Vector Wave equation to those that obeyed a further “transverse condition” that required the vector value of a plane wave to be orthogonal to the wave’s propagation vector. And as we pointed out when discussing representations on function spaces, the transverse condition selects an invariant subspace from the larger space of solutions. So, having related the spin-0 bundle to scalar plane waves, we can try to relate the spin-1 bundle to vector plane waves, with the three-dimensionality of the bundle fibres coming from the fact that the vectors describing the wave’s value should be orthogonal to the wave’s propagation vector.

Tangent bundle for a 2-sphere

In fact — quite apart from our construction of bundles for representations — there is already a very natural choice of a three-dimensional vector bundle associated with any 3-sphere such as Mk, known as the tangent bundle of the 3-sphere. If we step back and think about 2-spheres in R3, the analogous bundle can be visualised very easily. On every point on the surface of a sphere, there is a two-dimensional vector space consisting of all vectors that are tangent to the surface there. This tangent space is a different vector space at different points, so the whole collection of such vectors is naturally conceived of as a vector bundle. Some readers will be aware that differential geometry offers a precise definition of a tangent bundle that is intrinsic to the 2-sphere or 3-sphere itself, and doesn’t rely on seeing these spheres as sitting inside R3 or R4. But for our purposes it’s actually very useful to think of the 3-sphere Mk as sitting in R4, with its tangent vectors being four-dimensional vectors that are literally tangent to the 3-sphere, and hence orthogonal to the vector k that points from the centre of the 3-sphere to the surface. In that picture, we can treat k as the propagation vector for a plane wave, and the tangent vector A0 gives the wave’s four-vector value:

Ak(x) = A0 exp(–i k · x)
k · A0 = 0

There’s just one small extra step required: we need to allow for “complex tangent vectors”. That means instead of having A0 confined to R4, we allow it to take values in C4. So long as we still have k · A0 = 0, the mathematics works out just as nicely. We refer to the vector bundle where we allow these vectors to be complex as the complexified tangent bundle of Mk.

Surprisingly, the tangent bundle of a 3-sphere is actually trivial! This is not the case for a 2-sphere, and it’s related to the fact that we can identify any 3-sphere with the group SU(2). We first encountered SU(2) when we were constructing the 2-to-1 homomorphism from SU(2)×SU(2) to SO(4), and we saw that SU(2) was the 3-sphere of unit vectors in the four dimensional vector space of matrices we called H. Now, with our map H from R4 to that space of matrices we have:

H(k0) = H(0,0,0,k) = k I2

If G is SU(2)×SU(2) we can choose a(k) = (H(k/k), I2):

f4(a(k)) k0
= H–1(H(k/k) H(k0) I2)
= H–1(H(k)/k (k I2) I2)
= k

As well as taking k0 to k, the operator f4(a(k)) can also be used to carry the tangent space at k0 into the tangent space at k — and because everything here involves smooth functions of the coordinates in R4, doing this for all k in Mk lets us take a basis of the tangent space at k0 and turn it into a basis of smooth vector fields that cover the entire 3-sphere. If the tangent bundle were not trivial, any attempt to do this would have to involve a discontinuity somewhere (for example, imagine trying to do the same thing on a Möbius strip).

We need to show that the spin-1 bundle is actually the same as the complexified tangent bundle of Mk. Using a(k) = (H(k/k), I2) and η(h) = f3(h) = f4(h, h), the map ψa(k) that we use to identify each fibre with the representation space, which in this case is C3, becomes:

ψa(k)([((g, x), w)])
= ξ( (a(k),0)–1(g, x) ) w
= χk0(f4(a(k)–1) x) f4(a(k)–1g) w
= exp(i k · x) f4(a(k)–1g) w

Since w is in C3 and f4(a(k)–1g) is a rotation that preserves the t-axis, this vector will also lie in C3. If we then operate on this vector with f4(a(k)), it will give us a vector in a subspace of C4 that is orthogonal to k. We will define a map ψ':E→Mk×C4 as:

ψ'([((g, x), w)])
= (k, f4(a(k)) ψa(k)([((g, x), w)]))
= (k, f4(a(k)) exp(i k · x) f4(a(k)–1g) w)
= (k, exp(i k · x) f4(g) w)

where we know that f4(g) w is orthogonal to k because w is in C3, i.e. is orthogonal to k0 = (0,0,0,k), and f4(g) preserves the dot product while taking k0 to k. The inverse map φ':Mk×C4→E is given by:

φ'(k, u) = [((a(k), 0), f4(a(k)–1) u)]

Here f4(a(k)–1) u is in C3 if and only if k · u = 0, so φ' is only defined on the part of Mk×C4 that we’re identifying with the complexified tangent bundle of Mk, not on the whole of Mk×C4.

If we use φ' to turn a section of the complexified tangent bundle, f(k)=(k, u(k)), into a section of our spin-1 bundle, act on that section with some element (b, z) of S via the representation r, then use ψ' to turn the result back into a section of the tangent bundle, the representation on sections of the tangent bundle is:

(r(b, z) f)(k) = (k, exp(i k · z) f4(b) u(f4(b)–1 k))

We can also re-interpret this as a representation on the space of functions u:Mk→C4 such that u(k) · k = 0, and write:

(r(b, z) u)(k) = exp(i k · z) f4(b) u(f4(b)–1 k)

The representation on sections is the same as we’d get from the natural choice of actions of S on Mk×C4:

ρ'(b, z)(k, u) = (f4(b) k, exp(i (f4(b) k) · z) f4(b) u)

Finally, we can check that this is equivalent to ρV, our representation on the space of square-integrable vector-valued functions on R4, restricted to a subspace of functions that satisfy the Riemannian Vector Wave equation and the transverse condition. For each function u:Mk→C4 such that u(k) · k = 0 we define a function on R4 by Fourier synthesis:

Au(x) = ∫Mk u(k) exp(–i k · x) d3k

How does Au transform under ρV?

(ρV(b, z) Au)(x) = f4(b) Au((b, z)–1 x) = f4(b) Au(f4(b–1) (x – z))
= f4(b) ∫Mk u(k) exp(–i k · (f4(b–1) (x – z))) d3k
= f4(b) ∫Mk u(f4(b)–1 k') exp(–i (f4(b)–1 k') · (f4(b–1) (x – z))) d3k'     [Change integration variable to k'=f4(b) k]
= f4(b) ∫Mk u(f4(b)–1 k') exp(–i k' · (x – z)) d3k'
= f4(b) ∫Mk exp(i k' · z) u(f4(b)–1 k') exp(–i k' · x) d3k'
= Ar(b, z) u(x)

So the spin-1 bundle representation can be seen as a Fourier transform of the representation on a subspace of vector-valued functions on R4. The Fourier transform of the transverse condition:

∂x Ax + ∂y Ay + ∂z Az + ∂t At = 0 (Transverse)

corresponds to the condition u(k) · k = 0, since the Fourier transform of a partial derivative in any direction corresponds to multiplication of the transform by –i times the frequency component in that direction, i.e. ∂x Ax → –i kx ux, etc.

As with the spin-0 case, if we want a representation of E(4) rather than just SE(4) we’re free to choose a representation with an intrinsic parity of either 1 or –1. If we chose the latter, then the action of P would multiply everything by a further factor of –1.

The Spin-½ Bundle Representation

The spin-½ representation of SU(2) is two-dimensional, so the bundle E on whose sections we obtain a representation of the double cover of the Euclidean group will have fibres E(k) isomorphic to C2. As always, the base space Mk will be a three-sphere.

Visualising this bundle is tricky, but as with the complexified tangent bundle to Mk it will help if we see it as a “sub-bundle” of a trivial bundle of the form Mk×Cn for some n. Following the pattern we saw with the spin-1 bundle, we would expect to relate the spin-½ bundle representation to a representation on a subspace of the Cn-valued functions on R4. In the spin-1 case, if we restrict the representation to the non-translational group, G, and to functions that are non-zero only at the origin, we end up with the four-dimensional vector representation of the double cover of SO(4) or O(4). What could the analogous result be if we do the same in the spin-½ case? The smallest fermionic representation of the double cover of O(4) is spin-(½,0) ⊕ spin-(0,½), which is also four-dimensional. So we will try to identify the spin-½ bundle with a sub-bundle of the trivial bundle Mk×C4, in such a way that we end up with C4-valued functions that transform at the origin, under rotations and reflections, according to spin-(½,0) ⊕ spin-(0,½).

The spin-(½,0) ⊕ spin-(0,½) representation of SU(2)×SU(2) on C4 = C2⊕C2 is given by:

τ(g1, g2) (v1, v2) = (g1 v1, g2 v2)

where g1, g2 are elements of SU(2) and v1, v2 are elements of C2.

We define the map ψ'':E→Mk×C4 as:

ψ''([((g, x), w)]) = (k, exp(i k · x) τ(g)(w, w))

where as usual k = π([((g, x), w)]) = f4(g) k0. Here we are writing just g for an element of SU(2)×SU(2) rather than explicitly writing it as a pair of separate elements of SU(2). This definition is unambiguous, because if we choose another representative of the same equivalence class we get the same result:

ψ''([((g, x)(h, y)–1, ξ(h, y) w)])
= ψ''([((gh–1, –f4(gh–1)y+x), exp(i k0 · y) η(h) w)])
= (f4(gh–1) k0, exp(i (f4(gh–1) k0) · (–f4(gh–1)y+x)) exp(i k0 · y) τ(gh–1)(η(h) w, η(h) w))
= (f4(g) k0, exp(–i k0 · y) exp(i (f4(g) k0) · x) exp(i k0 · y) τ(g) τ(h–1) (h w, h w))
= (f4(g) k0, exp(i (f4(g) k0) · x) τ(g) (w, w))
= (k, exp(i k · x) τ(g) (w, w))

Next we define a left action ρ'' of S on Mk×C4:

ρ''(b, z)(k, (v,w)) = (f4(b) k, exp(i (f4(b) k) · z) τ(b)(v,w))

The usual action ρ on our spin-½ bundle E is:

ρ(b, z) [((g, x), w)] = [((b, z)(g, x), w)]

ψ'' commutes with these bundle actions:

ρ''(b, z) ψ''([((g, x), w)]) = ψ''(ρ(b, z) [((g, x), w)])

The representation of S on sections f(k)=(k, u(k)) of Mk×C4 that follows from the action ρ'' is given by:

(r(b, z) f)(k) = (k, exp(i k · z) τ(b) u(f4(b)–1 k))

More simply, the representation on functions u:Mk→C4 is:

(r(b, z) u)(k) = exp(i k · z) τ(b) u(f4(b)–1 k)

Of course the image of ψ'' is not the whole of Mk×C4, so we need to restrict the function u. In the spin-1 case the sub-bundle was specified by the condition u(k) · k = 0, and we need to identify the analogous condition satisfied by u(k) and k in the spin-½ case. To start with, we note that the image of the fibre E(k0) is:

ψ''(E(k0)) = {ψ''([((I, 0), w)]) | w ∈ C2} = {(k0, (w, w)) | w ∈ C2}

Acting on elements of this form with ρ''(b, z) gives us:

ψ''(E) = {(f4(b) k0, exp(i (f4(b) k0) · z) τ(b)(w, w)) | w ∈ C2, b ∈ SU(2)×SU(2), z ∈ R4}

If b = (b1, b2) for b1, b2 in SU(2), we have:

f4(b1, b2) k0
= H–1(b1 H(k0) b2–1)
= k H–1(b1 b2–1)
So H(f4(b1, b2) k0) = k b1 b2–1

and:

τ(b1, b2) (w, w) = (b1 w, b2 w)

Combining the last two results, we have for every (k, (u1,u2)) in ψ''(E):

H(k) u2 = k u1

Multiplying on the left by the adjoint H(k)* we obtain:

H(k)* H(k) u2 = k H(k)* u1
k2 H(k/k)* H(k/k) u2 = k H(k)* u1
k2 u2 = k H(k)* u1
H(k)* u1 = k u2

These equations can be packaged as a single matrix equation:

[ –k I2    H(k) ] [ u1 ]   [ 0 ]
[ H(k)*   –k I2 ] [ u2 ] = [ 0 ]

We can also write this as:

(kμ γμ – k) u(k) = 0

where we’re summing over μ and we define the matrices γμ as:

γμ =
[ 0     Hμ ]
[ Hμ*   0  ]
We won’t write out the matrices we obtain from this definition, because they depend on the choice of a basis we made when we identified the spin-½ bundle with a sub-bundle of Mk×C4, and when we discuss the Dirac equation later we’ll focus on two particular choices of basis that are useful for specific applications. The really important thing about the gamma matrices is a property that doesn’t depend on the choice of basis:

γμ γν + γν γμ = 2 δμν I4

It’s not hard to check this from the defining properties of the H matrices.
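
For instance, here is a direct numerical check (a sketch in Python with numpy). It uses H(x) = t I2 – i (x σx + y σy + z σz), which is one choice of the H matrices consistent with the properties used here; any equivalent choice would give the same result.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
H = [-1j * s1, -1j * s2, -1j * s3, np.eye(2, dtype=complex)]   # H_x, H_y, H_z, H_t

def gamma(mu):
    """The block matrix [[0, H_mu], [H_mu*, 0]]."""
    g = np.zeros((4, 4), dtype=complex)
    g[:2, 2:] = H[mu]
    g[2:, :2] = H[mu].conj().T
    return g

ok = all(np.allclose(gamma(m) @ gamma(n) + gamma(n) @ gamma(m),
                     2 * (m == n) * np.eye(4))
         for m in range(4) for n in range(4))
print(ok)   # True: the gamma matrices anticommute as claimed
```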

Imposing the condition:

(kμ γμ – k) u(k) = 0

on each plane wave in a Fourier synthesis:

Au(x) = ∫Mk u(k) exp(–i k · x) d3k

is the same as requiring that Au(x) satisfies the differential equation:

i γμ ∂μ Au(x) – k Au(x) = 0

If we identify k, the magnitude of the propagation vector, with the mass m of a particle, this is precisely the Dirac equation. So the solutions of the Dirac equation form a subspace of C4-valued functions on R4 that transform according to the spin-½ bundle representation.
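
As a small check on the counting (a self-contained sketch in Python with numpy, again using one consistent choice of the H matrices and an arbitrary propagation vector), the matrix kμ γμ – k I4 has a two-dimensional null space, matching the C2 fibre of the spin-½ bundle.

```python
import numpy as np

s = [np.array(m, dtype=complex) for m in
     ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])]
H = [-1j * s[0], -1j * s[1], -1j * s[2], np.eye(2)]        # one consistent choice of H matrices
gammas = [np.block([[np.zeros((2, 2)), h],
                    [h.conj().T, np.zeros((2, 2))]]) for h in H]

k = np.array([0.3, -1.2, 0.8, 0.5])                        # an arbitrary propagation vector
M = sum(ki * g for ki, g in zip(k, gammas)) - np.linalg.norm(k) * np.eye(4)
print(np.linalg.matrix_rank(M))                            # 2, so the kernel is two-dimensional
```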

The Projective Orthogonal Groups

There is one further variation on the idea of symmetry in Euclidean four-space that we need to examine. Suppose we have a neutral classical particle that travels freely through four-space, so its world line is an infinite straight line. We can ascribe an energy-momentum vector to this particle, pointing in a particular direction along its world line, but only if we choose a reference frame and decide, as a matter of convenience or convention, that the energy-momentum vector should have a positive time coordinate in our reference frame. There really is no such vector associated with the particle itself. The world line does have a direction in four-space — up to a sign — and the particle’s mass determines the magnitude of its energy-momentum, but any energy-momentum vector we choose could just as well be replaced by its opposite.

This suggests that for a free neutral particle, SO(4) isn’t really the correct symmetry group. What we really want to do is to treat the operation that reverses all vectors as having no effect whatsoever on this system. Of course that wouldn’t be appropriate for an electrically charged particle, whose charge would change sign under this operation.

Essentially what we want is to treat I4, the usual identity operator on R4, and –I4, which reverses the direction of all vectors, as both corresponding to the identity of a new symmetry group that takes the place of SO(4). We can do this, not by restricting to some subgroup of SO(4), but by forming a quotient group. We take the subgroup SZO(4) = {I4, –I4} of SO(4), and then form a new group, PSO(4) = SO(4) / SZO(4), consisting of equivalence classes of elements that lie in the same coset of SZO(4). The cosets of a subgroup are the sets you obtain by multiplying that subgroup either on the left or on the right by elements of the group; in this case the elements of SZO(4) commute with everything, so there is no difference between the left and right cosets. So each coset takes the form g SZO(4) = {g, –g} for some g in SO(4). We will write this as ±g; for any g in SO(4), ±g is an element of PSO(4). We multiply these elements using multiplication in SO(4), so:

g)(±h) = ±(gh)

Clearly ±I4 serves as the identity, and inverses are given by:

g)–1 = ±(g–1)

The new group we have formed, PSO(4), is known as the projective special orthogonal group in four dimensions. We can perform a similar construction with O(4) to obtain the projective general orthogonal group, PGO(4) = O(4) / {I4, –I4}, which includes reflections. It also makes sense to talk about PGO(3), the quotient group O(3) / {I3, –I3}. Because –I3 has a determinant of –1 it is not an element of SO(3), so there is no such thing as PSO(3) — though by some definitions, PSO(3) is simply defined as SO(3) / {I3} = SO(3).

The map π: SO(4)→PSO(4) defined by:

π(g) = ±g

is a 2-to-1 homomorphism from SO(4) to PSO(4), so SO(4) is a double cover of PSO(4) in the same way that SU(2)×SU(2) is a double cover of SO(4). What’s more, we can combine the 2-to-1 homomorphism π with the 2-to-1 homomorphism f4 to get a 4-to-1 homomorphism α: SU(2)×SU(2)→PSO(4):

α(g, h) = π(f4(g, h)) = ±f4(g, h)

The four elements (g, h), (–g, h), (g,–h) and (–g, –h) of SU(2)×SU(2) all have the same image under α in PSO(4). Via α, we can use the spin-(j, k) unitary representations of SU(2)×SU(2) to obtain what will generally be projective representations of PSO(4), in the same way that we used them to obtain representations of SO(4). Recall that the spin-(j, k) representation is defined by:

ρj, k(g, h) x⊗y = (ρj(g) x) ⊗ (ρk(h) y)

and that:

ρj(–g) = (–1)2j ρj(g)

This shows that when j and k are both integers, the four operators ρj, k(±g, ±h) are all identical, so ρj, k will yield a true representation of PSO(4), not just a projective one.

Lie Algebras

The group of rotations in n dimensions, SO(n), is an example of a mathematical structure known as a Lie group: the set of elements of the group can be given coordinates such that the group operations of multiplying two elements or taking the inverse of an element are smooth functions of these coordinates. It might be necessary to use more than one coordinate “chart”, just as it’s necessary to use at least two charts on the surface of the Earth if you want to avoid the kind of problems that occur at the poles and at 180° longitude. For example, SO(3), the group of rotations in three dimensions can be thought of as a solid three-dimensional ball of radius π with opposite points on its surface identified with each other; the vector from the centre of the ball to some point within it identifies the axis of rotation, while the length of that vector gives the amount of rotation. This set could be covered by one coordinate chart centred on the identity element, and another centred on a particular rotation by π.

Now, suppose a rigid object in n-dimensional space is rotating over time, with its centre of mass remaining fixed at the origin of the coordinate system. At any time t, there will be some matrix R(t) that takes the location of every point on the object at time 0 into its new location at time t. If the object is rotating smoothly, the components of R(t) will be smooth functions of t, and we can define R'(t) as the matrix whose individual components are the derivatives with respect to t of the components of R itself.

R'(t) has obviously got something to do with the angular velocity of the rotating object. In three dimensions we’re used to identifying rotations with a particular axis, and describing an object’s angular velocity as a vector pointing along that axis whose length is proportional to the rate of rotation. In four dimensions, though, rotations won’t leave a unique axis fixed — they will either leave no vectors at all fixed, or leave an entire plane unchanged. So in general, angular velocity is best described by a matrix Ω which, when applied to the displacement vector from the centre of rotation to a point on the object, tells you the instantaneous linear velocity of that part of the object:

Ω x = v

The original location (at t=0) of a point on the object whose position is x1 at time t1 is x0 = R(t1)–1 x1, so we can write the general location of the same point as:

x(t) = R(t) R(t1)–1 x1

The linear velocity of this point in general is:

v(t) = R'(t) R(t1)–1 x1

and in particular at t = t1:

v(t1) = R'(t1) R(t1)–1 x1

So we see that:

Ω(t1) = R'(t1) R(t1)–1

R'(t) is related to the angular velocity, but to get Ω we have to undo the net rotation that has occurred up to the present time. Alternatively, if we simply choose the origin of our time coordinate so that t1=0, then R(t1)–1 will be the identity matrix, and we’ll have Ω(0) = R'(0).

In any case, since we know that the matrices R(t) will all be orthogonal matrices — which is to say, their transpose will be equal to their inverse — we have:

R(t) R(t)T = In
∂t ( R(t) R(t)T ) = 0
R'(t) R(t)T + R(t) R'(t)T = 0
R'(t) R(t)T = –R(t) R'(t)T
Ω(t) = –Ω(t)T

This tells us that the angular velocity Ω will be an antisymmetric matrix: its transpose is the opposite of the original matrix.

For example, suppose:

R(t) =
[ cos(ω t)   –sin(ω t)   0 ]
[ sin(ω t)    cos(ω t)   0 ]
[ 0           0          1 ]

These are the rotations over time for a body rotating around the z-axis with an angular velocity of ω. We find the angular velocity matrix by taking the derivative of this with respect to time, then setting the time to zero:

R'(t) =
[ –ω sin(ω t)   –ω cos(ω t)   0 ]
[ ω cos(ω t)    –ω sin(ω t)   0 ]
[ 0              0            0 ]

Ω(0) = R'(0) =
[ 0   –ω   0 ]
[ ω    0   0 ]
[ 0    0   0 ]

We call the set of n × n antisymmetric matrices the Lie algebra corresponding to SO(n), and by convention we form the name for it by changing the name of the group to lower-case: so(n). A Lie algebra is a vector space: we can add elements to each other, or multiply them by scalars (in this case, real numbers), and end up with other elements of the same Lie algebra. Adding antisymmetric matrices together or multiplying them by real numbers will always yield matrices that are themselves antisymmetric.

Given two elements v and w of a Lie algebra, we can also perform an operation known as the Lie bracket or commutator:

[v, w] = v ww v

where the products v w and w v are formed by normal matrix multiplication. It’s easy to see that the Lie bracket of two antisymmetric matrices is also antisymmetric:

[v, w]T
= (v ww v)T
= (v w)T – (w v)T
= wT vTvT wT
= w vv w
= –[v, w]

In general, we can think of the Lie algebra associated with any Lie group as the vector space of all possible tangents to smooth paths that pass through the identity element of the group. With rotations, we can make this a bit more concrete by thinking of the path through the group as the rotations of some solid body over time, and the tangent to the path at the identity element as describing the angular velocity of the body at a particular moment, but the same idea works for more abstract groups such as SU(2).

Given an element v of a Lie algebra g, there is a function from g to the Lie group G known as the exponential map that takes multiples of v into a path through the identity whose tangent is v. This can be done very generally, but we will concentrate on the case where we have matrices as the elements of G and g. Then the exponential map is just the usual exponential function, extended to work with matrices. This can be done either with a power series:

exp(A) = In + A + A2 / 2! + A3 / 3! + A4 / 4! + ...

or, in many cases, by choosing a basis in which the matrix A takes a diagonal form, diag(λ1, ... , λn), and then just exponentiating all the diagonal entries to get diag(exp(λ1), ... , exp(λn)), and converting back to the original basis.

For example, if we exponentiate a multiple of an antisymmetric angular velocity matrix, Ω, we get an orthogonal rotation matrix:

R(t) = exp(t Ω)

giving us a path through the group of rotations. We can find the tangent to this path at the identity element by differentiating with respect to t then setting t to zero:

R'(t) = Ω exp(t Ω)
R'(0) = Ω
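
A quick numerical illustration (a sketch in Python with numpy and scipy; the values of ω and t are arbitrary): exponentiating a multiple of the antisymmetric matrix Ω gives an orthogonal matrix, and a finite-difference tangent at t = 0 recovers Ω.

```python
import numpy as np
from scipy.linalg import expm

omega = 0.7
Omega = np.array([[0.0, -omega, 0.0],
                  [omega, 0.0,  0.0],
                  [0.0,   0.0,  0.0]])      # the angular velocity matrix from the example above

R = expm(1.3 * Omega)                       # a rotation on the path t -> exp(t Omega)
print(np.allclose(R @ R.T, np.eye(3)))      # True: R is orthogonal

eps = 1e-6
tangent = (expm(eps * Omega) - np.eye(3)) / eps
print(np.allclose(tangent, Omega, atol=1e-5))   # True: the tangent at the identity is Omega
```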

We’ve noted that the Lie algebra so(n) consists of the n × n antisymmetric matrices. What about the Lie algebra su(n) associated with the group SU(n)? The matrices in SU(n) are unitary, which means that the complex conjugate of their transpose, also known as the Hermitian adjoint, is their inverse. So if we have a path M(t) through SU(n) with M(0) = In, and we write the Hermitian adjoint as M(t)*, we have:

M(t) M(t)* = In
∂t ( M(t) M(t)* ) = 0
M'(t) M(t)* + M(t) M'(t)* = 0
M'(t) M(t)* = –M(t) M'(t)*
M'(0) = –M'(0)*

So the Hermitian adjoint of the Lie algebra element m = M'(0) is the opposite of the original element. A matrix with that property is known as skew-Hermitian.

But su(n) doesn’t contain all n × n skew-Hermitian matrices. It turns out there is a relationship between the trace of a matrix and the determinant of the exponential of that matrix:

det( exp(A) ) = exp(tr A)

It’s not hard to see why this is true for diagonal matrices. If A = diag(λ1, λ2, ... , λn), then:

tr A = λ1 + λ2 + ... + λn
exp(A) = diag(exp(λ1), exp(λ2), ... , exp(λn))
det( exp(A) ) = exp(λ1) × exp(λ2) × ... × exp(λn)
= exp (tr A)

Since the elements of SU(n) have a determinant of 1, the elements of su(n) must have a trace whose exponential is 1, i.e. zero. So su(n) consists of the traceless, skew-Hermitian n × n matrices.
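
Both facts are easy to check numerically (a sketch in Python with numpy and scipy; the random matrix and the particular su(2) element are arbitrary choices of mine).

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)

# det(exp A) = exp(tr A) for an arbitrary complex matrix.
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
print(np.allclose(np.linalg.det(expm(A)), np.exp(np.trace(A))))   # True

# Exponentiating a traceless, skew-Hermitian matrix gives an element of SU(2).
v = rng.normal(size=3)
m = 1j * (v[0] * np.array([[0, 1], [1, 0]]) +
          v[1] * np.array([[0, -1j], [1j, 0]]) +
          v[2] * np.array([[1, 0], [0, -1]]))
g = expm(m)
print(np.allclose(g @ g.conj().T, np.eye(2)),      # True: g is unitary
      np.allclose(np.linalg.det(g), 1.0))          # True: g has determinant 1
```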

so(3) and su(2)

We know that the group SU(2) forms a double cover for the group SO(3). When one Lie group is a double cover for another, their Lie algebras will always be “the same”, in the sense that we can find a Lie algebra isomorphism between them: an invertible linear map that respects the Lie bracket.

We previously defined f3:SU(2)→SO(3) as follows:

f3(g) x = H–1(g H(x) g–1)

where for any point in R3, x = (x, y, z), we define:

H(x) =
[ –z i       –y – x i ]
[ y – x i     z i     ]

H(x), in this three-dimensional case, is traceless and skew-Hermitian, making it an element of su(2). So we have H:R3→su(2).

Suppose the group element g in SU(2) is produced by exponentiating multiples of an element v of su(2):

g(t) = exp(t v)

We can then find an element of so(3) corresponding to v in su(2) by taking the derivative with respect to t of f3(g) x then setting t to zero.

∂t (f3(g(t)) x)
= ∂t (H–1(g(t) H(x) g(t)–1))
= H–1(∂t (g(t) H(x) g(t)–1))
= H–1(∂t (exp(t v) H(x) exp(–t v)))
= H–1(v exp(t v) H(x) exp(–t v) – exp(t v) H(x) v exp(–t v))

Setting t = 0, we have:

∂t (f3(g(t)) x) | t = 0
= H–1(v H(x) – H(x) v)
= H–1([v, H(x)])

We can now define a map df3:su(2)→so(3), which is the derivative of the double cover.

df3(v) x = H–1([v, H(x)])

Let’s list the matrices we get by applying H to the basis vectors of R3, and use them as a basis of su(2):

Hx = H(ex) =
[ 0    –i ]
[ –i    0 ]

Hy = H(ey) =
[ 0   –1 ]
[ 1    0 ]

Hz = H(ez) =
[ –i   0 ]
[ 0    i ]

The commutators, or Lie brackets, between these matrices are:

[Hx, Hy] = 2 Hz
[Hy, Hz] = 2 Hx
[Hz, Hx] = 2 Hy

with the opposite results if we reverse the order of the elements within the Lie brackets, and of course zero for the Lie bracket of any element with itself. It follows that:

df3(Hx/2) ex = H–1([Hx/2, Hx]) = 0
df3(Hx/2) ey = H–1([Hx/2, Hy]) = ez
df3(Hx/2) ez = H–1([Hx/2, Hz]) = –ey

So df3 takes Hx/2 in su(2) to an element Jx of so(3) that generates rotations around the x-axis (i.e. if we exponentiate any multiple of Jx we’ll get a rotation around the x-axis.) Writing this out explicitly for all the basis elements we have:

df3(Hx/2) = Jx =
[ 0   0   0 ]
[ 0   0  –1 ]
[ 0   1   0 ]

df3(Hy/2) = Jy =
[ 0   0   1 ]
[ 0   0   0 ]
[ –1  0   0 ]

df3(Hz/2) = Jz =
[ 0  –1   0 ]
[ 1   0   0 ]
[ 0   0   0 ]

The commutators between the J matrices, the generators of rotations around each coordinate axis, are simply:

[Jx, Jy] = Jz
[Jy, Jz] = Jx
[Jz, Jx] = Jy

The annoying factors of 2 and 1/2 all cancel out in the end, and the map df3:su(2)→so(3) respects the Lie bracket:

[df3(v), df3(w)] = df3([v, w])
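
Here is a direct numerical check of that identity (a sketch in Python with numpy, using H(x) = –i (x σx + y σy + z σz) as one consistent choice of the map H; the identity itself does not depend on that choice).

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def Hmap(x):
    return -1j * (x[0] * s1 + x[1] * s2 + x[2] * s3)

def Hinv(m):
    """Inverse of Hmap on traceless, skew-Hermitian 2x2 matrices."""
    return np.array([-m[1, 0].imag, m[1, 0].real, (1j * m[0, 0]).real])

def df3(v):
    """df3(v) x = H^{-1}([v, H(x)]), assembled column by column as a 3x3 matrix."""
    return np.array([Hinv(v @ Hmap(e) - Hmap(e) @ v) for e in np.eye(3)]).T

rng = np.random.default_rng(4)
v, w = Hmap(rng.normal(size=3)), Hmap(rng.normal(size=3))
lhs = df3(v) @ df3(w) - df3(w) @ df3(v)     # [df3(v), df3(w)]
rhs = df3(v @ w - w @ v)                    # df3([v, w])
print(np.allclose(lhs, rhs))                # True
```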

so(4)

The Lie algebra so(4), which is associated with the Lie group SO(4) of rotations in four dimensional Euclidean space, is a six-dimensional vector space. We can see this by counting the degrees of freedom in 4 × 4 antisymmetric matrices (e.g. there are 6 elements that lie above the diagonal of the matrix). One basis for these elements consists of the generators for rotations in each of the six planes defined by a choice of two coordinate axes. Note that we order the coordinates t, x, y, z, so the first row and column in these matrices involves t.

Jxy =
[ 0   0   0   0 ]
[ 0   0  –1   0 ]
[ 0   1   0   0 ]
[ 0   0   0   0 ]

Jyz =
[ 0   0   0   0 ]
[ 0   0   0   0 ]
[ 0   0   0  –1 ]
[ 0   0   1   0 ]

Jzx =
[ 0   0   0   0 ]
[ 0   0   0   1 ]
[ 0   0   0   0 ]
[ 0  –1   0   0 ]

Jxt =
[ 0   1   0   0 ]
[ –1  0   0   0 ]
[ 0   0   0   0 ]
[ 0   0   0   0 ]

Jyt =
[ 0   0   1   0 ]
[ 0   0   0   0 ]
[ –1  0   0   0 ]
[ 0   0   0   0 ]

Jzt =
[ 0   0   0   1 ]
[ 0   0   0   0 ]
[ 0   0   0   0 ]
[ –1  0   0   0 ]

Here Jij maps ei to ej, and ej to –ei. The pairings of the coordinate indices in reverse order to those listed above give the opposite matrices, e.g. Jtz = –Jzt. The Lie algebra can be summed up by saying that when i, j and k are distinct:

[Jij, Jkj] = Jik

Apart from rearrangements of this basic identity with appropriate changes of sign, all other commutators are zero.
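
The following short sketch (again assuming Python with NumPy, with the helper J(i, j) introduced purely for illustration) builds the six generators directly from the rule that Jij maps ei to ej and ej to –ei, and confirms the commutation relations just described.

import numpy as np

idx = {'t': 0, 'x': 1, 'y': 2, 'z': 3}    # coordinate order used in the text

def J(i, j):
    """Generator of rotations in the i-j plane: ei -> ej, ej -> -ei."""
    m = np.zeros((4, 4))
    m[idx[j], idx[i]] = 1     # column i becomes ej
    m[idx[i], idx[j]] = -1    # column j becomes -ei
    return m

def bracket(a, b):
    return a @ b - b @ a

# The basic identity [Jij, Jkj] = Jik, for example [Jxy, Jzy] = Jxz:
print(np.allclose(bracket(J('x', 'y'), J('z', 'y')), J('x', 'z')))        # True
# Generators for planes that share no axis commute:
print(np.allclose(bracket(J('x', 'y'), J('z', 't')), np.zeros((4, 4))))   # True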

That’s simple enough, but it turns out that there’s a different choice of basis that reveals a crucial property of so(4).

Ax = ½(Jyz + Jxt) =
[  0    ½    0    0 ]
[ –½    0    0    0 ]
[  0    0    0   –½ ]
[  0    0    ½    0 ]

Ay = ½(Jzx + Jyt) =
[  0    0    ½    0 ]
[  0    0    0    ½ ]
[ –½    0    0    0 ]
[  0   –½    0    0 ]

Az = ½(Jxy + Jzt) =
[  0    0    0    ½ ]
[  0    0   –½    0 ]
[  0    ½    0    0 ]
[ –½    0    0    0 ]

Bx = ½(Jyz – Jxt) =
[  0   –½    0    0 ]
[  ½    0    0    0 ]
[  0    0    0   –½ ]
[  0    0    ½    0 ]

By = ½(Jzx – Jyt) =
[  0    0   –½    0 ]
[  0    0    0    ½ ]
[  ½    0    0    0 ]
[  0   –½    0    0 ]

Bz = ½(Jxy – Jzt) =
[  0    0    0   –½ ]
[  0    0   –½    0 ]
[  0    ½    0    0 ]
[  ½    0    0    0 ]

The three A matrices are formed by taking half the sum of a J matrix and its dual: the generator of the rotation in the plane that is untouched by the first matrix. (Note that this is a different meaning of the term “dual” than the one we’ve used before, where it concerned the relationship between vectors and linear functions of vectors). The B matrices follow the same recipe but use the difference rather than the sum. The result is that if we think of “taking the dual” as a linear operator on so(4) (and there is a whole beautiful context in which that makes sense, though it would be too much of a detour to explain it here), the A matrices are “self-dual”, equal to their own duals, while the B matrices are “anti-self-dual”, equal to the opposite of their own duals.

In this basis, the commutators are:

[Ax, Ay] = Az, and cyclic permutations thereof.
[Bx, By] = Bz, and cyclic permutations thereof.
[Ai, Bj] = 0, for all i and j.

So the individual three-dimensional Lie sub-algebras formed by the A and B matrices have exactly the same structure as so(3) or su(2), and all the commutators between the two sub-algebras are zero. We write this result as:

so(4) = so(3) ⊕ so(3)
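
The decomposition is easy to verify numerically. The sketch below (assuming Python with NumPy; the dictionaries A and B are just a convenient bookkeeping device) builds the A and B matrices from the J generators and checks the three sets of commutators listed above.

import numpy as np

idx = {'t': 0, 'x': 1, 'y': 2, 'z': 3}

def J(i, j):
    """Generator of rotations in the i-j plane: ei -> ej, ej -> -ei."""
    m = np.zeros((4, 4))
    m[idx[j], idx[i]] = 1
    m[idx[i], idx[j]] = -1
    return m

def bracket(a, b):
    return a @ b - b @ a

# Self-dual and anti-self-dual bases, as defined above.
A = {'x': (J('y', 'z') + J('x', 't')) / 2,
     'y': (J('z', 'x') + J('y', 't')) / 2,
     'z': (J('x', 'y') + J('z', 't')) / 2}
B = {'x': (J('y', 'z') - J('x', 't')) / 2,
     'y': (J('z', 'x') - J('y', 't')) / 2,
     'z': (J('x', 'y') - J('z', 't')) / 2}

print(np.allclose(bracket(A['x'], A['y']), A['z']))      # [Ax, Ay] = Az
print(np.allclose(bracket(B['x'], B['y']), B['z']))      # [Bx, By] = Bz
print(all(np.allclose(bracket(A[i], B[j]), 0)            # the two copies commute
          for i in 'xyz' for j in 'xyz'))                # True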

The fact that so(4) splits up this way into two copies of so(3) or su(2) is directly connected to the fact that SU(2)×SU(2) is the double cover of SO(4). We previously defined the function f4:SU(2)×SU(2)→SO(4) as:

f4(g, h) x = H–1(g H(x) h–1)

where for x = (x, y, z, t) in R4, we define H(x) as:

H(x) =
[  t + z i    –y + x i ]
[  y + x i     t – z i ]

If we set g(t) = exp(t v) and h(t) = exp(t w) for v and w elements of su(2), we can then differentiate with respect to t and set t to 0, giving us a map df4: su(2) ⊕ su(2) → so(4), defined as:

df4(v, w) x = H–1(v H(x) – H(x) w)

Making use of the multiplication table below, in which the entry in the row labelled P and the column labelled Q is the product P Q:

       Hx     Hy     Hz     Ht
Hx    –Ht     Hz    –Hy     Hx
Hy    –Hz    –Ht     Hx     Hy
Hz     Hy    –Hx    –Ht     Hz
Ht     Hx     Hy     Hz     Ht

we find that:

df4(0, Hx/2) ex = H–1(–Hx²/2) = ½ et
df4(0, Hx/2) ey = H–1(–Hy Hx/2) = ½ ez
df4(0, Hx/2) ez = H–1(–Hz Hx/2) = –½ ey
df4(0, Hx/2) et = H–1(–Ht Hx/2) = –½ ex

In other words, df4(0, Hx/2) = Ax. Continuing in this way, we get:

df4(0, Hx/2) = Ax
df4(0, Hy/2) = Ay
df4(0, Hz/2) = Az

df4(Hx/2, 0) = Bx
df4(Hy/2, 0) = By
df4(Hz/2, 0) = Bz

So df4 maps one copy of su(2) into the self-dual part of so(4), and the other copy into the anti-self-dual part.
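
A numerical sketch of df4 itself (assuming Python with NumPy; H4, H4_inv and df4 are names chosen just for this sketch, and the coordinates are ordered (t, x, y, z) to match the so(4) matrices above) makes these identifications easy to check:

import numpy as np

Hx = np.array([[0, 1j], [1j, 0]])
Hy = np.array([[0, -1], [1, 0]], dtype=complex)
Hz = np.array([[1j, 0], [0, -1j]])
Ht = np.eye(2, dtype=complex)

def H4(x):
    """Map (t, x, y, z) in R^4 to t Ht + x Hx + y Hy + z Hz."""
    return x[0] * Ht + x[1] * Hx + x[2] * Hy + x[3] * Hz

def H4_inv(m):
    """Recover (t, x, y, z) from t Ht + x Hx + y Hy + z Hz."""
    return np.array([m[0, 0].real, m[0, 1].imag, m[1, 0].real, m[0, 0].imag])

def df4(v, w):
    """The 4x4 matrix of the map x -> H^(-1)(v H(x) - H(x) w)."""
    return np.column_stack([H4_inv(v @ H4(e) - H4(e) @ w) for e in np.eye(4)])

# df4(0, Hx/2) is the self-dual generator Ax ...
Ax = np.array([[0.0,  0.5, 0.0,  0.0],
               [-0.5, 0.0, 0.0,  0.0],
               [0.0,  0.0, 0.0, -0.5],
               [0.0,  0.0, 0.5,  0.0]])
print(np.allclose(df4(np.zeros((2, 2)), Hx / 2), Ax))    # True

# ... and df4(Hx/2, Hx/2) = Ax + Bx should be Jyz (ey -> ez, ez -> -ey):
Jyz = np.zeros((4, 4))
Jyz[3, 2], Jyz[2, 3] = 1, -1
print(np.allclose(df4(Hx / 2, Hx / 2), Jyz))             # True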

Lie Algebra Representations

Whenever we have a representation ρ of a Lie group on a vector space V, we can take its derivative to obtain a representation dρ of the associated Lie algebra on V. Just as a Lie group representation preserves multiplication between the group elements, so that:

ρ(gh) = ρ(g) ρ(h)

a Lie algebra representation preserves the Lie bracket:

dρ([v, w]) = [dρ(v), dρ(w)]

We find the derivative in the usual way: by setting g = exp(t v) for some v in the Lie algebra, taking the derivative with respect to t, and then setting t = 0. By this method, we can easily get Lie algebra representations from our most important Lie group representations.
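
To make the recipe concrete, here is a numerical sketch (assuming Python with NumPy and SciPy, with helper names chosen just for illustration) that differentiates the double cover f3 itself: a finite-difference derivative of t → f3(exp(t v)) at t = 0 is compared with the formula df3(v) x = H–1([v, H(x)]) from earlier.

import numpy as np
from scipy.linalg import expm

Hx = np.array([[0, 1j], [1j, 0]])
Hy = np.array([[0, -1], [1, 0]], dtype=complex)
Hz = np.array([[1j, 0], [0, -1j]])

def H(x):
    return x[0] * Hx + x[1] * Hy + x[2] * Hz

def H_inv(m):
    return np.array([m[0, 1].imag, m[1, 0].real, m[0, 0].imag])

def f3(g):
    """The 3x3 rotation matrix whose columns are H^(-1)(g H(ek) g^(-1))."""
    return np.column_stack([H_inv(g @ H(e) @ np.linalg.inv(g)) for e in np.eye(3)])

def df3(v):
    """The formula for the derivative: x -> H^(-1)([v, H(x)])."""
    return np.column_stack([H_inv(v @ H(e) - H(e) @ v) for e in np.eye(3)])

v = 0.3 * Hx - 0.7 * Hz            # an element of su(2)
eps = 1e-6
finite_difference = (f3(expm(eps * v)) - f3(expm(-eps * v))) / (2 * eps)
print(np.allclose(finite_difference, df3(v), atol=1e-8))   # True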

We’ll start with the first three spin-j representations of SU(2), and turn them into spin-j representations of su(2).

Spin-0

The spin-0 representation of su(2) is as trivial as you’d expect: since the group representation always gives the identity operator, the derivative is zero.

dρ0(v) = 0

Here the 0 on the right-hand side is the linear operator on complex numbers that consists of multiplying them by zero.

Spin-½

The spin-½ representation of su(2) acts on C2, the two-dimensional complex vector space.

dρ½(v) = v

The representation involves simply using the original element v of su(2) to act on C2 by matrix multiplication in the usual way.

Spin-1

The spin-1 representation of su(2) on C3 finally involves a bit of work, but it’s work we’ve mostly already done.

dρ1(v) x = df3(v) x = H–1([v, H(x)])

Strictly speaking, the way H(x) is defined this gives us a linear operator on R3 rather than C3, but as with the spin-1 group representation we can simply take the real 3 × 3 matrix for the operator and apply it to C3. This representation can be summed up by stating that:

dρ1(Hk/2) = Jk, for k = x, y, z.

The Hk here are elements of the basis for su(2) we listed earlier, and the Jk are the generators for rotations around the coordinate axes in R3.

In all of these cases, we can obtain representations of so(3) rather than su(2) by using the inverse of the isomorphism df3:su(2)→so(3) to map any element of so(3) back to su(2), via the rule:

Jk → Hk/2

and then apply the representation of su(2) to the result. The spin-0 representation of so(3) is just zero, of course, and the spin-1 representation of so(3) uses the so(3) matrix element directly on C3.

Spin-(j,k) Representations of so(4)

We previously defined the spin-(j,k) representation of the group SU(2)×SU(2), the double cover of SO(4). For some choice of two half-integers j and k, we use ρj and ρk, the spin-j and spin-k representations of SU(2) on spaces Vj and Vk of dimension 2j+1 and 2k+1 respectively, to define an irreducible unitary representation ρj, k of SU(2)×SU(2) on the tensor product Vj⊗Vk:

ρj, k(g, h) x ⊗ y = (ρj(g) x) ⊗ (ρk(h) y)

We then extend this by linearity to arbitrary elements of Vj⊗Vk.

The derivative of this Lie group representation is a Lie algebra representation of su(2) ⊕ su(2) on Vj⊗Vk:

dρj, k(v, w) x ⊗ y = (dρj(v) x) ⊗ y + x ⊗ (dρk(w) y)

We’ll give some specific examples, using the spin-j representations of su(2) that we’ve described so far. For j=½ and k=0, Vj = C2 and Vk = C, so Vj⊗Vk is just C2 and we can drop the second factor in all the tensor products:

dρ½, 0(v, w) x ⊗ y = (dρ½(v) x) ⊗ y + x ⊗ (dρ0(w) y)
dρ½, 0(v, w) x = v x

where the product in the right-hand side of the last line is matrix multiplication. Similarly, for j=0 and k=½:

dρ0, ½(v, w) x = w x
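
As an operator on the whole tensor product, the general formula above is just dρj, k(v, w) = dρj(v) ⊗ I + I ⊗ dρk(w). The sketch below (assuming Python with NumPy) illustrates this with both factors taken to be the spin-½ representation — a case not worked through in the text — so the operators become ordinary Kronecker products acting on C2 ⊗ C2.

import numpy as np

Hx = np.array([[0, 1j], [1j, 0]])
Hz = np.array([[1j, 0], [0, -1j]])
I2 = np.eye(2)

def drho_half_half(v, w):
    """Spin-(1/2, 1/2) Lie algebra representation on C^2 ⊗ C^2."""
    return np.kron(v, I2) + np.kron(I2, w)

# On a product vector x ⊗ y this gives (v x) ⊗ y + x ⊗ (w y):
x = np.array([1.0, 2j])
y = np.array([3.0, -1.0])
lhs = drho_half_half(Hx, Hz) @ np.kron(x, y)
rhs = np.kron(Hx @ x, y) + np.kron(x, Hz @ y)
print(np.allclose(lhs, rhs))   # True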

What we have described so far are representations of su(2) ⊕ su(2). The Lie algebra so(4) is isomorphic to su(2) ⊕ su(2), and we can use the function df4: su(2) ⊕ su(2) → so(4)

df4(v, w) x = H–1(v H(x) – H(x) w)

to identify the two algebras. This amounts to identifying the basis elements:

(0, Hk/2) ↔ Ak, for k = x, y, z.
(Hk/2, 0) ↔ Bk, for k = x, y, z.

So for example, if we wanted to use the spin-(0,½) representation with the generator Jxy, we would go through the following steps:

Jxy = Az + Bz
df4–1(Jxy) = (Hz/2, Hz/2)
dρ0, ½(Hz/2, Hz/2) x = ½ Hz x

The spin-(0,½) representation of so(4) projects from so(4) down into the copy of so(3) spanned by the self-dual A matrices, and then maps that to su(2) in the usual way. Similarly, the spin-(½,0) representation of so(4) does the same thing but projects into the subspace spanned by the anti-self-dual B matrices. We will tabulate all the possibilities, because they’ll come in handy when we’re looking at the Riemannian Dirac equation.

                    spin-(½,0)    spin-(0,½)
Jxy = Az + Bz         ½ Hz          ½ Hz
Jyz = Ax + Bx         ½ Hx          ½ Hx
Jzx = Ay + By         ½ Hy          ½ Hy
Jxt = Ax – Bx        –½ Hx          ½ Hx
Jyt = Ay – By        –½ Hy          ½ Hy
Jzt = Az – Bz        –½ Hz          ½ Hz
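
One row of this table can be checked directly against a numerical version of df4 (assuming Python with NumPy, coordinates ordered (t, x, y, z) as before): the pair (–Hx/2, Hx/2) recorded in the Jxt row really does map to Jxt.

import numpy as np

Hx = np.array([[0, 1j], [1j, 0]])
Hy = np.array([[0, -1], [1, 0]], dtype=complex)
Hz = np.array([[1j, 0], [0, -1j]])
Ht = np.eye(2, dtype=complex)

def H4(x):
    return x[0] * Ht + x[1] * Hx + x[2] * Hy + x[3] * Hz

def H4_inv(m):
    return np.array([m[0, 0].real, m[0, 1].imag, m[1, 0].real, m[0, 0].imag])

def df4(v, w):
    return np.column_stack([H4_inv(v @ H4(e) - H4(e) @ w) for e in np.eye(4)])

# Jxt maps ex -> et and et -> -ex:
Jxt = np.zeros((4, 4))
Jxt[0, 1], Jxt[1, 0] = 1, -1

print(np.allclose(df4(-Hx / 2, Hx / 2), Jxt))   # True: Jxt = Ax - Bx <-> (-Hx/2, Hx/2)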




Copyright © Greg Egan, 2011. All rights reserved.