What can the elliptical orbits of planets tell us about the energy levels of an atom? An atom isn’t really much like a miniature solar system, but the electrostatic force does obey the same kind of inverse-square law as Newtonian gravity. In the simplest case (a hydrogen atom, with just one proton and one electron) a special symmetry shared by all inverse-square laws leaves a clear mark on the quantum-mechanical version of the system. This has been known since the pioneering days of quantum mechanics, when Wolfgang Pauli used it to compute the energy levels of hydrogen.^{[1]}^{[2]} And as we’ll see, this symmetry involves rotations in four dimensions.
In what follows, we’ll take two well-known facts for granted: that the orbit of a single planet around a star under Newtonian gravity is an ellipse (with a circle as a special case), and that the sum of the distances from the two foci of an ellipse to any point on the ellipse itself is a constant. You can find proofs of both these claims here.
We’ll start by considering a planet in a circular orbit around the sun. Of course in reality the planet and the sun would both orbit their common centre of mass, but that doesn’t really change anything important. If we want to take account of it, we can replace the planet’s mass with its “reduced mass”, and then the calculations proceed in exactly the same fashion.
The planet’s velocity vector will change direction as it moves around the orbit, but the planet’s speed — the length of the velocity vector — will remain the same. So if we draw all the velocity vectors from around the orbit with their bases at a common origin, they will form a circle.
Say the planet has a mass of m and an angular momentum of L, and that its orbit has a radius of R. We can find its speed v by solving the equation L = m v R. So the radius of the circle of velocity vectors is v = L / (m R).
Now, suppose the planet is moving in an elliptical orbit, with a perihelion distance (the distance closest to the sun) of R_{1}, and an aphelion distance (the distance farthest from the sun) of R_{2}. Unlike the previous case, the planet’s speed will vary as it moves around the orbit.
As well as knowing R_{1} and R_{2}, it will be useful to know the major and minor semi-axes of the ellipse, which are traditionally referred to as a and b respectively. These are the largest and smallest distances from the centre of the ellipse to the curve, as opposed to the smallest and largest distances from one focus, R_{1} and R_{2}.
It’s clear that the semi-major axis a will be given by the arithmetic mean of R_{1} and R_{2}:
a = ½(R_{1} + R_{2})
Less obviously, the semi-minor axis b turns out to be the geometric mean of R_{1} and R_{2}. The geometric mean of two numbers is defined as the square root of their product:
b = √(R_{1} R_{2})
To prove this, we start by noting that by symmetry, the second focus has to be a distance of R_{1} from the leftmost point of the ellipse, which makes the distance between the foci equal to R_{2} – R_{1}. Traditionally, half of this, the distance from the centre of the ellipse to either focus, is known as c, and we have:
c = ½(R_{2} – R_{1})
Next, we use the property of an ellipse that the sum of the distances from the two foci to any point on the curve is constant. For our ellipse, this constant sum is 2 a = R_{1} + R_{2}. A point equidistant from both foci will lie at a distance of a = ½(R_{1} + R_{2}) from each focus, while its distance from the centre of the ellipse is b. The centre, that point and either focus form a right-angled triangle with sides b and c and hypotenuse a, so Pythagoras’s Theorem gives us:
b^{2} = a^{2} – c^{2}
= (a – c) (a + c)
= R_{1} R_{2}
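As a quick numerical sanity check of the geometric-mean result, the following sketch computes a, b and c from sample perihelion and aphelion distances. The function name `ellipse_axes` and the sample values are illustrative choices, not part of the derivation itself.

```python
import math

def ellipse_axes(r1, r2):
    """Return (a, b, c) for an ellipse with perihelion r1 and aphelion r2."""
    a = 0.5 * (r1 + r2)           # semi-major axis: arithmetic mean
    c = 0.5 * (r2 - r1)           # distance from the centre to either focus
    b = math.sqrt(a * a - c * c)  # semi-minor axis, from Pythagoras
    return a, b, c

a, b, c = ellipse_axes(1.0, 4.0)
print(a, b, c)                    # 2.5 2.0 1.5
print(math.sqrt(1.0 * 4.0))       # 2.0, the geometric mean of r1 and r2
```

For R_{1} = 1 and R_{2} = 4 the semi-minor axis comes out as 2, which is indeed √(1 × 4).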
What do the velocity vectors for this elliptical orbit look like, when we gather them all together? The angular momentum of the planet around the sun is equal to the product of its mass, its speed, and the distance between the planet and the sun measured perpendicular to the velocity vector, so we can immediately solve for the velocity at the four locations where the vector is either parallel or perpendicular to the axis of the ellipse. At perihelion and aphelion the velocity is perpendicular to the axis, with speeds L / (m R_{1}) and L / (m R_{2}) in opposite directions. At the two ends of the minor axis the velocity is parallel to the axis, and the perpendicular distance from the sun to the line of motion is b = √(R_{1} R_{2}), so the speed is L / (m √(R_{1} R_{2})).
Surprisingly, these four points lie on a perfect circle! The main difference from the case of a circular orbit is that the centre of the velocity circle is offset from the origin of the vectors.
Here, we’ve simply put the centre midway between the top and bottom points, and the test for circularity is to use Pythagoras’s Theorem to check that the distance to the side points is the same as the distance to the top and bottom points. This is easy if we note that:
Distance to the top and bottom points = L (R_{1} + R_{2}) / (2 m R_{1} R_{2}) = a L / (m R_{1} R_{2})
Horizontal distance to side points = L / (m √(R_{1} R_{2})) = b L / (m R_{1} R_{2})
Vertical distance to side points = L (R_{2} – R_{1}) / (2 m R_{1} R_{2}) = c L / (m R_{1} R_{2})
Since a^{2} = b^{2} + c^{2}, the same relationship holds for the versions of these quantities multiplied by L / (m R_{1} R_{2}) that appear in the diagram.
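Another way to see this is to make use of the fact that the right-angled triangles OAB and OBC have the same ratio of side lengths, where O is the common origin of the vectors, A and C are the tips of the perihelion and aphelion velocities, and B is the tip of one of the velocities parallel to the axis: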
OA/OB = √(R_{1} R_{2}) / R_{1} = √(R_{2} / R_{1})
OB/OC = R_{2} / √(R_{1} R_{2}) = √(R_{2} / R_{1})
This makes them similar triangles, with the same angle (say α) as the angle OAB and the angle OBC. It follows that the angle ABC is a right angle. The circle that passes through the three vertices of any right-angled triangle has the hypotenuse as a diameter, and so by symmetry the fourth vector (whose tip is point D here) lies on the same circle.
To prove that all the velocity vectors, not just these four, lie on the same circle, we need to do a little more work. Using coordinates with their origin at the rightmost focus of the orbit (that is, the sun), we’ll write the position vector of an arbitrary point on the orbit in terms of its distance r(θ) from the sun and its direction (cos θ, sin θ):
r(θ) = r(θ) (cos θ, sin θ)
The other focus will be at (–(R_{2} – R_{1}), 0). So in order for the sum of the distances from the two foci to be equal to R_{1} + R_{2}, the distance r(θ) must satisfy the equation:
[R_{1} + R_{2} – r(θ)]^{2} = [r(θ) cos θ + R_{2} – R_{1}]^{2} + r(θ)^{2} sin^{2} θ
A lot of terms cancel in this equation, and it’s easily solved to give:
r(θ) = 2 R_{1} R_{2} / (R_{1} + R_{2} + (R_{2} – R_{1}) cos θ)
To find the velocity of the planet, we take the derivative with respect to time:
v(θ) = dr(θ)/dt = dr(θ)/dθ × dθ/dt
The angular velocity dθ/dt can be found in terms of the angular momentum and the distance from the sun:
dθ/dt = L / (m r(θ)^{2})
The derivative of the position vector with respect to θ is, by the product rule:
dr(θ)/dθ = r'(θ) (cos θ, sin θ) + r(θ) (–sin θ, cos θ)
where r'(θ) is the derivative of the scalar distance from the sun:
r'(θ) = 2 R_{1} R_{2} (R_{2} – R_{1}) sin θ / (R_{1} + R_{2} + (R_{2} – R_{1}) cos θ)^{2}
Putting the pieces together, we end up with:
v(θ) = [L / (2 m R_{1} R_{2})] [(0, R_{2} – R_{1}) + (R_{1} + R_{2}) (–sin θ, cos θ)]
So the velocities trace out a perfect circle with centre (0, L (R_{2} – R_{1}) / (2 m R_{1} R_{2})) and radius L (R_{1} + R_{2}) / (2 m R_{1} R_{2}), which is precisely the circle we’ve already identified as passing through our four original points.
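The result can be checked numerically without doing the algebra by hand. The sketch below assembles the velocity from the pieces given above (the distance r(θ), its derivative, and the angular velocity L / (m r^{2})) and measures how far each velocity lies from the claimed circle. The function names and the sample values of R_{1}, R_{2}, L and m are assumptions made for illustration.

```python
import math

def orbit_velocity(theta, r1, r2, ang_mom=1.0, mass=1.0):
    """Velocity at polar angle theta, built as (dr/dtheta) * (dtheta/dt)."""
    denom = r1 + r2 + (r2 - r1) * math.cos(theta)
    r = 2 * r1 * r2 / denom
    dr = 2 * r1 * r2 * (r2 - r1) * math.sin(theta) / denom ** 2
    # Derivative of the position vector with respect to theta (product rule).
    dx = dr * math.cos(theta) - r * math.sin(theta)
    dy = dr * math.sin(theta) + r * math.cos(theta)
    omega = ang_mom / (mass * r * r)   # dtheta/dt from angular momentum
    return dx * omega, dy * omega

def velocity_circle(r1, r2, ang_mom=1.0, mass=1.0):
    """Centre and radius of the circle the velocities are claimed to trace."""
    scale = ang_mom / (2 * mass * r1 * r2)
    return (0.0, scale * (r2 - r1)), scale * (r1 + r2)

R1, R2 = 1.0, 3.0
centre, radius = velocity_circle(R1, R2)
max_error = 0.0
for k in range(360):
    vx, vy = orbit_velocity(math.radians(k), R1, R2)
    distance = math.hypot(vx - centre[0], vy - centre[1])
    max_error = max(max_error, abs(distance - radius))
print(max_error < 1e-9)
```

Every sampled velocity sits on the circle to within floating-point rounding.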
Given a planet orbiting the sun with a certain total energy E — the sum of its kinetic energy, due to its velocity, and its gravitational potential energy, due to its distance from the sun — what other orbits will have the same energy?
We can write the total energy in terms of L, m, R_{1} and R_{2}. First we find the energy of the planet in terms that include the gravitational constant G and the mass of the sun M_{S}:
E = Kinetic energy at perihelion + Potential energy at perihelion
= ½ m v^{2}_{Perihelion} – G M_{S} m / R_{1}
= ½ L^{2} / (m R_{1}^{2}) – G M_{S} m / R_{1}
Of course we can equally well compute this at aphelion, which gives us:
E = ½ L^{2} / (m R_{2}^{2}) – G M_{S} m / R_{2}
If we equate these two expressions for energy and solve for G M_{S}, we find:
G M_{S} = L^{2} (R_{1} + R_{2}) / (2 m^{2} R_{1} R_{2})
If we then substitute that back into either expression for the energy, we get:
E = –L^{2} / (2 m R_{1} R_{2})
If we compute the speed of a planet moving parallel to the axis of its orbit, we see that it depends only on the total energy, E, and the mass of the planet, m:
v_{Parallel} = L/(m √(R_{1} R_{2})) = √(–2 E/m)
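The algebra in this section is easy to confirm numerically. The following sketch evaluates the energy at perihelion and at aphelion using the expression for G M_{S} above, then compares the result with the closed form and with the speed parallel to the axis. The function name `orbit_energy` and the sample values are illustrative assumptions.

```python
import math

def orbit_energy(ang_mom, mass, r1, r2, at):
    """Kinetic plus potential energy at distance `at` (r1 or r2)."""
    # G * M_S obtained by equating the energies at perihelion and aphelion.
    gm = ang_mom ** 2 * (r1 + r2) / (2 * mass ** 2 * r1 * r2)
    kinetic = 0.5 * ang_mom ** 2 / (mass * at ** 2)
    potential = -gm * mass / at
    return kinetic + potential

L, m, R1, R2 = 2.0, 1.5, 1.0, 4.0
E_perihelion = orbit_energy(L, m, R1, R2, at=R1)
E_aphelion = orbit_energy(L, m, R1, R2, at=R2)
E_closed_form = -L ** 2 / (2 * m * R1 * R2)
v_parallel = L / (m * math.sqrt(R1 * R2))

print(E_perihelion, E_aphelion, E_closed_form)
print(v_parallel, math.sqrt(-2 * E_closed_form / m))
```

The three energies agree, and the two expressions for the speed parallel to the axis give the same number.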
Now, one way to find orbits with the same energy is by applying a rotation that leaves the sun fixed but repositions the planet. Any ordinary three-dimensional rotation can be used in this way, yielding another orbit with exactly the same shape, but oriented differently.
But there is another transformation we can use to give us a new orbit without changing the total energy. If we grab hold of the planet at either of the points where it’s travelling parallel to the axis of the ellipse, and then swing it along a circular arc centred on the sun, we can reposition it without altering its distance from the sun. But rather than rotating its velocity in the same fashion (as we would do if we wanted to rotate the orbit as a whole) we leave its velocity vector unchanged: its direction, as well as its length, stays the same.
Since we haven’t changed the planet’s distance from the sun, its potential energy is unaltered, and since we haven’t changed its velocity, its kinetic energy is the same. What’s more, since the speed of a planet of a given mass when it’s moving parallel to the axis of its orbit depends only on its total energy, the planet will still be in that state with respect to its new orbit, and so the new orbit’s axis must be parallel to the axis of the original orbit.
The orbits we create by this process will (a) all have the same energy, (b) all share a common axis, and (c) all have the same two opposing velocities at the two points when the planet is moving parallel to the orbit’s axis. In terms of the circles traced out by the velocities, this means those circles will all intersect at the two points corresponding to velocities parallel to the axis.
Now, another way to generate exactly the same family of circles is by stereographic projection from a sphere of the appropriate size, sitting tangent to the plane in which the velocity circles lie. We draw lines from the “north pole” of this sphere, through points that lie on a great circle that passes through both ends of a horizontal diameter of the sphere, and continue the lines until they hit the plane. This projects each such great circle onto one of the velocity circles.
The radius s of this sphere must be half the speed of the planet when it’s moving parallel to the axis of the orbit:
s = ½ √(–2 E/m)
This ensures that the projected distance from the origin when the line from the north pole hits the plane is twice this, giving the point of intersection of all the relevant velocity circles.
We will adopt coordinates where the plane of the velocity circles is z=–s, the centre of the sphere is (0, 0, 0), and the north pole is (0, 0, s). The x-axis will correspond to the axis of our elliptical orbits. An arbitrary point (x, y, z) in the three-dimensional space above the velocity plane is projected onto the plane at:
S(x, y, z) = [2 s / (s – z)] (x, y)
where we omit the constant z-coordinate of –s from the projection and just treat S as mapping into a two-dimensional plane.
The equator of the sphere will simply map into the velocity circle corresponding to a circular orbit, centred on the origin and with a radius of √(–2 E/m). If we tilt the equator around the x-axis by some angle α, a general point on this tilted circle will take the form (s cos θ, s cos α sin θ, s sin α sin θ). The projection of this point is:
S(s cos θ, s cos α sin θ, s sin α sin θ) = [2 s / (1 – sin α sin θ)] (cos θ, cos α sin θ)
The projected points all lie on a circle with centre (0, 2 s tan α) and radius 2 s sec α.
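To see the projection at work, the sketch below samples points on a tilted great circle, applies the map S, and measures their distance from the claimed centre. The function names and the sample values of s and α are assumptions chosen for illustration; any tilt with α < π/2 should behave the same way.

```python
import math

def project(point, s):
    """Stereographic projection from the north pole (0, 0, s) onto z = -s."""
    x, y, z = point
    factor = 2 * s / (s - z)
    return factor * x, factor * y

def tilted_equator(theta, alpha, s):
    """Point on the equator after tilting it by alpha around the x-axis."""
    return (s * math.cos(theta),
            s * math.cos(alpha) * math.sin(theta),
            s * math.sin(alpha) * math.sin(theta))

s, alpha = 0.5, 0.7
centre = (0.0, 2 * s * math.tan(alpha))
radius = 2 * s / math.cos(alpha)        # 2 s sec(alpha)

max_error = 0.0
for k in range(200):
    theta = 2 * math.pi * k / 200
    px, py = project(tilted_equator(theta, alpha, s), s)
    distance = math.hypot(px - centre[0], py - centre[1])
    max_error = max(max_error, abs(distance - radius))
print(max_error < 1e-9)

# Both common intersection points (+/- 2s, 0) come from theta = 0 and pi.
print(project(tilted_equator(0.0, alpha, s), s))
```

The point at θ = 0 projects to (2 s, 0) whatever the tilt, which is the shared intersection point of all the velocity circles.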
It will be convenient at this point to work with the “force constant” k for the orbit, which includes the gravitational constant, G, the mass of the sun, M_{S}, and the mass of the planet, m:
k = G M_{S} m = L^{2} (R_{1} + R_{2}) / (2 m R_{1} R_{2})
The radius of the velocity circle, which we found previously to be L (R_{1} + R_{2}) / (2 m R_{1} R_{2}), can be expressed in terms of the force constant and the angular momentum:
Radius of circle = k / L
So we have:
2 s sec α = k / L
L √(–2 E/m) = k cos α
The singular case α=π/2, where the velocity circle becomes infinitely large, corresponds to an orbit with zero angular momentum: the planet crashes into the sun, and in this idealised model where the sun is a point mass, the falling planet is accelerated up to infinite velocity.
So far we’ve mostly been treating the planet’s orbit as lying in a fixed plane, but of course the orbit can lie in any plane. So rather than projecting from a sphere in three-dimensional space down to a plane, the more general situation will involve projecting from a 3-sphere (the three-dimensional surface of a four-dimensional ball) down to three dimensions. Any great circle on that 3-sphere will still project onto a velocity circle, but the velocity circles can now be oriented arbitrarily in three-dimensional space. And any four-dimensional rotation applied to a 3-sphere of radius s will take a great circle that corresponds to one orbit of energy E into another great circle corresponding to another orbit with the same energy.
Suppose we rotate a three-dimensional object around the z-axis by some angle θ. The change in the coordinates of any point on the object can be described with the aid of a matrix, which we’ll call R_{xy}(θ):
R_{xy}(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
We find the new coordinates of a point in the rotated object by matrix multiplication:
r' = R_{xy}(θ) r
where r=(x, y, z) are the old coordinates and r'=(x', y', z') are the new coordinates. For example, the effect of this rotation on each of the coordinate axes is:
R_{xy}(θ) (1, 0, 0) = (cos(θ), sin(θ), 0)
R_{xy}(θ) (0, 1, 0) = (–sin(θ), cos(θ), 0)
R_{xy}(θ) (0, 0, 1) = (0, 0, 1)
Why do we call the matrix R_{xy}(θ), rather than R_{z}(θ), given that it rotates around the z-axis? The z coordinate is unchanged by a rotation around the z-axis, so we’re choosing to describe this rotation in terms of the coordinates that are changed, x and y. This scheme generalises easily to more than three dimensions, where the simplest rotations are those that only affect a given pair of coordinates, while all the remaining coordinates (however many that might be) are left unaltered.
We can construct similar matrices for rotations around the x and y axes, or indeed rotations around any axis at all. And we can find the net effect of two or more successive rotations by multiplying together the relevant matrices. For example, if we rotate an object first by an angle θ around the z-axis and then by an angle φ around the x-axis, the net effect will be to multiply its original coordinates by the matrix:
T = R_{yz}(φ) R_{xy}(θ)
The matrix describing the first rotation appears on the right, because we place the original coordinates on the right when calculating the effect of the rotation. Matrix multiplication is not commutative: the order of multiplication matters, so R_{yz}(φ) R_{xy}(θ) will not be equal to R_{xy}(θ) R_{yz}(φ). And of course this reflects the geometrical reality of rotations: rotating an object around two different axes has a different net effect depending on which rotation you perform first.
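The dependence on order is easy to see with two quarter-turns. The sketch below builds R_{xy} from the action on the coordinate axes listed earlier, and assumes R_{yz} has the analogous form acting on y and z; the helper names are illustrative.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(matrix, vector):
    """Multiply a 3x3 matrix by a column vector."""
    return tuple(sum(matrix[i][k] * vector[k] for k in range(3))
                 for i in range(3))

def rot_xy(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_yz(t):
    # Assumed by analogy with rot_xy: rotates y towards z, leaving x fixed.
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

quarter = math.pi / 2
z_then_x = matmul(rot_yz(quarter), rot_xy(quarter))  # first rotation on the right
x_then_z = matmul(rot_xy(quarter), rot_yz(quarter))

x_axis = (1.0, 0.0, 0.0)
print([round(v, 12) for v in apply(z_then_x, x_axis)])
print([round(v, 12) for v in apply(x_then_z, x_axis)])
```

Rotating around z first carries the x-axis to the y-axis and then on to the z-axis, while rotating around x first leaves the x-axis alone, so it ends up along y.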
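It turns out that a very powerful way to get a handle on this non-commutativity is to work, not with the rotation matrices themselves, but with the matrices we get if we write down the matrix for a rotation by an angle θ, take the derivative of that matrix with respect to θ, and then set θ equal to 0. For example:
J_{xy} = \left. \frac{dR_{xy}(\theta)}{d\theta} \right|_{\theta=0} = \left. \begin{pmatrix} -\sin\theta & -\cos\theta & 0 \\ \cos\theta & -\sin\theta & 0 \\ 0 & 0 & 0 \end{pmatrix} \right|_{\theta=0} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}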
Physically, what this corresponds to is an angular velocity matrix: the matrix we would use to multiply the coordinates of points on an object in order to obtain their velocity if the object was rotating around a given axis. In that case we would have θ=ωt and we would take the derivative with respect to t, but the only difference that would produce is to multiply the matrix by an overall factor of ω. Similarly, we have:
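J_{yz} = \left. \frac{dR_{yz}(\theta)}{d\theta} \right|_{\theta=0} = \left. \begin{pmatrix} 0 & 0 & 0 \\ 0 & -\sin\theta & -\cos\theta \\ 0 & \cos\theta & -\sin\theta \end{pmatrix} \right|_{\theta=0} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}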
J_{zx} = \left. \frac{dR_{zx}(\theta)}{d\theta} \right|_{\theta=0} = \left. \begin{pmatrix} -\sin\theta & 0 & \cos\theta \\ 0 & 0 & 0 \\ -\cos\theta & 0 & -\sin\theta \end{pmatrix} \right|_{\theta=0} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}
All three of these matrices are antisymmetric: flipping them around their diagonal so that rows become columns and vice versa negates the whole matrix. The set of all 3 × 3 antisymmetric matrices can be viewed as a three-dimensional vector space, and it turns out that if we rotate around any axis we always get an antisymmetric matrix which can be written as a linear combination of these three. If the axis of rotation is a=(a_{x}, a_{y}, a_{z}), the corresponding angular velocity matrix is:
J_{a} = a_{x} J_{yz} + a_{y} J_{zx} + a_{z} J_{xy}
These J matrices don’t commute, but the way in which they fail to do so can be captured more simply than for the rotation matrices. First, we define the commutator for any two matrices as:
[A, B] = A B – B A
Clearly the commutator will be zero if the two matrices commute and A B = B A. For the J matrices for the three coordinate pairs, the commutators turn out to be remarkably simple:
[J_{yz}, J_{zx}] = J_{xy}
[J_{zx}, J_{xy}] = J_{yz}
[J_{xy}, J_{yz}] = J_{zx}
These three equations essentially cover every possibility, because reversing the order of the matrices in the commutator brackets will always just negate the result ([B, A] = –[A, B]), and if the same matrix appears twice then it commutes with itself ([A, A] = 0).
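These commutators can be verified directly with a few lines of integer matrix arithmetic. The sketch below writes out the three J matrices explicitly, on the assumption that each J_{ab} has +1 in row b, column a and –1 in row a, column b (which matches the derivative of R_{xy} at θ=0); the helper names are illustrative.

```python
def matmul(a, b):
    """Multiply two 3x3 integer matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """Return [A, B] = A B - B A."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

J_xy = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
J_yz = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J_zx = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]

print(commutator(J_yz, J_zx) == J_xy)
print(commutator(J_zx, J_xy) == J_yz)
print(commutator(J_xy, J_yz) == J_zx)

# Reversing the order negates the result.
negated = [[-entry for entry in row] for row in J_xy]
print(commutator(J_zx, J_yz) == negated)
```

All four comparisons hold exactly, since only integers are involved.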
Now, to make a connection with quantum mechanics, consider the operators on functions of three coordinates, say f(x, y, z), defined by:
L_{xy} f = –iℏ (x ∂_{y} f – y ∂_{x} f)
L_{yz} f = –iℏ (y ∂_{z} f – z ∂_{y} f)
L_{zx} f = –iℏ (z ∂_{x} f – x ∂_{z} f)
where i is the square root of minus one and ℏ is the “reduced Planck’s constant”, ℏ=h/(2π). We can calculate the commutators of these L operators, where instead of matrix multiplication we just apply the operators in succession to a function f(x, y, z), i.e.
[L_{yz}, L_{zx}] f = L_{yz}(L_{zx} f ) – L_{zx}(L_{yz} f )
Making use of the definitions we get:
[L_{yz}, L_{zx}] = iℏ L_{xy}
[L_{zx}, L_{xy}] = iℏ L_{yz}
[L_{xy}, L_{yz}] = iℏ L_{zx}
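One way to check these operator identities without any symbolic algebra package is to let the operators act on polynomials. The sketch below is a minimal illustration under two assumptions: polynomials are stored as dictionaries from exponent triples to coefficients, and units are chosen so that ℏ = 1. All helper names are hypothetical.

```python
# Polynomials in (x, y, z): dicts mapping exponent tuples to coefficients.
# Assumption: units with hbar = 1.
X, Y, Z = 0, 1, 2

def diff(p, ax):
    """Partial derivative with respect to variable number ax."""
    out = {}
    for e, c in p.items():
        if e[ax] > 0:
            lowered = e[:ax] + (e[ax] - 1,) + e[ax + 1:]
            out[lowered] = c * e[ax]
    return out

def times(p, ax):
    """Multiply by variable number ax."""
    return {e[:ax] + (e[ax] + 1,) + e[ax + 1:]: c for e, c in p.items()}

def combine(p, q, scale):
    """Return p + scale * q, dropping terms whose coefficient is zero."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + scale * c
    return {e: c for e, c in out.items() if c != 0}

def L(a, b):
    """Operator L_ab f = -i (a d_b f - b d_a f)."""
    def op(f):
        inner = combine(times(diff(f, b), a), times(diff(f, a), b), -1)
        return combine({}, inner, -1j)
    return op

L_xy, L_yz, L_zx = L(X, Y), L(Y, Z), L(Z, X)

def commutator(op1, op2, f):
    """Apply [op1, op2] to the polynomial f."""
    return combine(op1(op2(f)), op2(op1(f)), -1)

f = {(2, 1, 1): 7, (1, 1, 2): 3}          # 7 x^2 y z + 3 x y z^2
print(commutator(L_yz, L_zx, f) == combine({}, L_xy(f), 1j))
print(commutator(L_zx, L_xy, f) == combine({}, L_yz(f), 1j))
print(commutator(L_xy, L_yz, f) == combine({}, L_zx(f), 1j))
```

Because the test polynomial has small integer coefficients, the complex arithmetic is exact and the dictionaries can be compared directly. A single test function does not prove the identities, but it catches sign errors quickly.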
Clearly the pattern here mimics the commutators of the J matrices very closely. We can talk about this mimicry more formally by saying that we have a representation of the algebra of three-dimensional rotations (the vector space of J matrices, which is known as so(3)) on the vector space of functions of three-dimensional space, given by associating each J matrix with the corresponding L operator. If we define:
dρ(J_{αβ})=L_{αβ}/(iℏ)
(where αβ stands for any pair of coordinates), we can extend dρ to any linear combination of the coordinate J matrices by requiring that dρ be a linear function. We then have:
[dρ(J_{a}), dρ(J_{b})] = dρ([J_{a}, J_{b}])
So what we mean by a representation is a function like dρ that “preserves” the commutator in this sense: we can take the commutator of the original matrices and then apply the function, or we can apply the function first and then take the commutator, and it makes no difference to the result.
What are these L operators? In quantum mechanics, we obtain the component of a particle’s momentum in a given direction by taking the derivative of the particle’s wave function ψ(x, y, z) in that direction, and multiplying by –iℏ. For example, the momentum in the x direction is given by:
p_{x} ψ(x, y, z) = –iℏ ∂_{x} ψ(x, y, z)
When the wave function describes a particle with a definite value for some quantity like momentum, the corresponding operator simply multiplies the wave function by that value. So, for example, if the wave function were given by ψ(x, y, z)=exp(i P x / ℏ), we would have:
p_{x} ψ(x, y, z) = –iℏ ∂_{x} exp(i P x / ℏ) = P exp(i P x / ℏ) = P ψ(x, y, z)
So that particular wave function has a definite value of P for the x component of its momentum, and we say that it is an eigenfunction of the momentum operator, with eigenvalue P.
The angular momentum vector, L, of a classical particle is given by the vector cross product of the particle’s position and momentum vectors:
L = r × p
If we expand out the cross product for the individual coordinates, and then convert the momentum coordinates to their quantum-mechanical equivalents, we get:
L_{x} = y p_{z} – z p_{y} → –iℏ (y ∂_{z} – z ∂_{y}) = L_{yz}
L_{y} = z p_{x} – x p_{z} → –iℏ (z ∂_{x} – x ∂_{z}) = L_{zx}
L_{z} = x p_{y} – y p_{x} → –iℏ (x ∂_{y} – y ∂_{x}) = L_{xy}
So these L operators are the quantum-mechanical equivalents of the coordinates of the particle’s angular momentum vector. The similarity between their commutators and those of the J matrices isn’t too surprising, given that the J matrices are essentially tools for computing angular velocities, and for a single particle moving in a circle its angular momentum is proportional to its angular velocity.
We can construct an operator that corresponds to the squared magnitude of the angular momentum vector:
L^{2} f = L_{yz}(L_{yz} f ) + L_{zx}(L_{zx} f ) + L_{xy}(L_{xy} f )
This operator, L^{2}, commutes with all the individual components of the angular momentum.
Now, we’d like to identify various vector spaces of wave functions that are invariant under rotation: that is, if ψ(r) is a function in the vector space, and R is a rotation, then ψ_{R}(r) = ψ(R r) is also in the same vector space. [Note that it’s the vector space as a whole that is invariant. We’re not saying that the individual functions in the space are unchanged by rotations.] Physically, if we rotate a self-contained quantum system (or describe the same system in a new, rotated set of coordinates), its wave function will change, and we would like to know what kinds of vector spaces of functions contain all the wave functions we can get by performing such a rotation on any given function in the space.
One example of a space of functions that will be invariant under rotations is the set of all homogeneous polynomials of a given degree in x, y and z. A homogeneous polynomial is one where the sum of the powers of all its variables is the same in every term. That sum is called the degree of the polynomial. For example:
f(x, y, z) = 7 x^{2} y z + 3 x y z^{2}
is a homogeneous polynomial of degree four. It’s not hard to see that if we perform any linear transformation of our coordinates — that is, we replace each coordinate by some linear combination of the three original coordinates — we will still have a homogeneous polynomial of the same degree. Since a rotation is a linear transformation, the space of homogeneous polynomials of any particular degree will be invariant under rotations.
To make things even simpler, we would like to find a space of functions that, as well as being invariant under rotations, can’t be broken down into any smaller spaces that are themselves invariant. We’ll call such a space irreducible. It turns out that the space of homogeneous polynomials of a given degree does contain a smaller invariant space: those polynomials that are also solutions of Laplace’s equation:
∇^{2} f = 0
where in three dimensions we define the Laplacian operator ∇^{2} as the sum of the second derivatives with respect to the three coordinates:
∇^{2} f = ∂_{x, x} f + ∂_{y, y} f + ∂_{z, z} f
To see why the property of satisfying Laplace’s equation is unchanged under a rotation, just change to a new Cartesian coordinate system related to the first by a rotation. If we write the three original coordinates as x^{j} for j=1,2,3, we can write the Laplacian operator as:
∇^{2} f = Σ_{j} ∂_{x^{j}, x^{j}} f
Our new, rotated coordinates ξ^{i} will be related to the old coordinates by:
ξ^{i} = Σ_{j} R_{ij} x^{j}
where R_{ij} are the components of the rotation matrix. We then have, making use of the chain rule for derivatives:
∇^{2} f = Σ_{j} ∂_{x^{j}, x^{j}} f
= Σ_{j} Σ_{α} Σ_{β} (∂_{x^{j}} ξ^{α}) (∂_{x^{j}} ξ^{β}) ∂_{ξ^{α}, ξ^{β}} f
= Σ_{j} Σ_{α} Σ_{β} R_{α j} R_{β j} ∂_{ξ^{α}, ξ^{β}} f
= Σ_{α} Σ_{β} (Σ_{j} R_{α j} R_{β j}) ∂_{ξ^{α}, ξ^{β}} f
= Σ_{α} Σ_{β} δ_{αβ} ∂_{ξ^{α}, ξ^{β}} f
= Σ_{α} ∂_{ξ^{α}, ξ^{α}} f
which is just the Laplacian in the new coordinates. In the step:
Σ_{j} R_{α j} R_{β j} = δ_{αβ}
the symbol δ_{αβ} is known as the Kronecker delta, which is equal to 1 when α and β are equal and 0 otherwise. This step amounts to claiming that the product of the rotation matrix and its transpose is the identity matrix, or, equivalently, that the dot product of any two rows of the matrix will be 0 if they are different rows and 1 if they are the same. It’s easy to see that this must be true for the columns of the rotation matrix, since the columns are the vectors produced by rotating each of the coordinate axes. The statement about the columns corresponds to the matrix equation R^{T}R=I, while the statement about the rows corresponds to RR^{T}=I. But every invertible matrix only has a single inverse, and it makes no difference whether we multiply by the inverse on the left or the right. So R^{T} is the inverse of R, and the statement about the rows is also true.
A solid spherical harmonic is a homogeneous polynomial in the Cartesian coordinates that satisfies Laplace’s equation. We will refer to a solid spherical harmonic of degree λ as Z_{λ, σ}(x, y, z), where λ is an integer greater than or equal to zero, and the second subscript σ stands for some as-yet-unspecified way of labelling individual polynomials. For any fixed value of λ, the space spanned by these functions is invariant under rotations, and also irreducible: it contains no smaller invariant space.
We will define the ordinary (as opposed to the solid) spherical harmonic for unit vectors u as:
Y_{λ, σ}(u) = Z_{λ, σ}(u)
So Y_{λ, σ} is defined only on the unit sphere S^{2}, where it agrees with Z_{λ, σ}. Because the solid spherical harmonics are homogeneous polynomials of degree λ, they scale as the λth power of the distance |r| from the origin, giving us:
Z_{λ, σ}(r) = |r|^{λ} Z_{λ, σ}(r/|r|) = |r|^{λ} Y_{λ, σ}(r/|r|)
If we apply the operator L^{2} to Z_{λ, σ}, and make use of the fact that Z_{λ, σ} is a solution of Laplace’s equation, we obtain the result:
L^{2} Z_{λ, σ}(r) = ℏ^{2} λ(λ+1) Z_{λ, σ}(r) – ℏ^{2} r^{2} ∇^{2} Z_{λ, σ}(r)
= ℏ^{2} λ(λ+1) Z_{λ, σ}(r)
So all the solid spherical harmonics of degree λ are eigenfunctions of the squared angular momentum operator, with eigenvalue ℏ^{2} λ(λ+1). The same will be true for any linear combination of the Z_{λ, σ} with the same λ. So for each invariant space of functions like this we have a distinct value for the squared angular momentum.
We will refer to the space of functions spanned by the Z_{λ, σ} as the spin-λ representation of the rotation algebra. To find its dimension, first we note that with two variables, it’s easy to see that the number of possible monomials of degree λ is just λ+1, since we can give the first variable any power from 0 to λ and then we have no choice for the second variable. With three variables, we can take any two-variable monomial of degree ranging from 0 up to λ and multiply it by the third variable raised to an appropriate power, so the total number of monomials is:
1 + 2 + ... + (λ+1) = ½(λ+2)(λ+1)
The Laplacian operator reduces the degree of a homogeneous polynomial by two, so requiring it to be zero imposes one constraint for each monomial of degree λ–2, or ½λ(λ–1) constraints in all. That leaves the dimension of the space equal to the difference of these two expressions, which is 2λ+1. So the spin-λ representation has dimension 2λ+1.
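The counting argument can be replayed by brute force. The sketch below enumerates monomials directly and subtracts the number of constraints, assuming (as the argument above does) that the constraints imposed by the Laplacian are all independent. The function names are illustrative.

```python
from itertools import product

def monomial_count(degree, variables=3):
    """Count monomials of the given total degree by direct enumeration."""
    if degree < 0:
        return 0
    return sum(1 for powers in product(range(degree + 1), repeat=variables)
               if sum(powers) == degree)

def harmonic_dimension(lam):
    """Monomials of degree lam minus one constraint per monomial of degree lam - 2."""
    return monomial_count(lam) - monomial_count(lam - 2)

for lam in range(6):
    print(lam, monomial_count(lam), harmonic_dimension(lam), 2 * lam + 1)
```

For each λ the last two columns agree, and the second column matches ½(λ+2)(λ+1).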
To interpret this more physically, we can choose a basis of the space in which each element of the basis has a different eigenvalue for one of the L operators: for example, L_{xy}, which measures the z component of the angular momentum vector. If we define:
Z_{λ, λ}(x, y, z) = (x + i y)^{λ}
then this is clearly a homogeneous polynomial of degree λ, and it’s a straightforward calculation to show that:
∇^{2} Z_{λ, λ} = 0
L_{xy} Z_{λ, λ} = λℏ Z_{λ, λ}
So we’ve identified one function in the spin-λ representation which has a z component of its angular momentum of exactly λ (measured in the quantum units of spin, ℏ). Similarly, we can define a function whose z component of angular momentum is the opposite of this:
Z_{λ, –λ}(x, y, z) = (x – i y)^{λ}
∇^{2} Z_{λ, –λ} = 0
L_{xy} Z_{λ, –λ} = –λℏ Z_{λ, –λ}
Now, suppose we’re given a function Z_{λ, m} which is an eigenfunction of L_{xy} with eigenvalue mℏ. In what follows, we will make use of the commutators between the L operators:
[L_{xy}, L_{yz}] = iℏ L_{zx}
L_{xy} L_{yz} = iℏ L_{zx} + L_{yz} L_{xy}
[L_{zx}, L_{xy}] = iℏ L_{yz}
–L_{xy} L_{zx} = iℏ L_{yz} – L_{zx} L_{xy}
We now define a “spin-lowering operator”:
L_{–} = L_{yz} – i L_{zx}
If we apply this operator to Z_{λ, m}, an existing eigenfunction of L_{xy} with eigenvalue mℏ, we get:
L_{xy} (L_{–} Z_{λ, m})
= L_{xy} (L_{yz} – i L_{zx}) Z_{λ, m}
= (L_{xy} L_{yz} – i L_{xy} L_{zx}) Z_{λ, m}
= (iℏ L_{zx} + L_{yz} L_{xy} + i (iℏ L_{yz} – L_{zx} L_{xy})) Z_{λ, m}
= (iℏ L_{zx} + mℏ L_{yz} + i (iℏ L_{yz} – mℏ L_{zx})) Z_{λ, m}
= (m–1) ℏ (L_{yz} – i L_{zx}) Z_{λ, m}
= (m–1) ℏ (L_{–} Z_{λ, m})
So L_{–} gives us a new eigenfunction of L_{xy} with an eigenvalue that is one unit less than that of the function we started with. However, this process can’t go on forever. If we apply the spin-lowering operator to Z_{λ, –λ}, we get:
L_{–} Z_{λ, –λ}
= (L_{yz} – i L_{zx}) (x – i y)^{λ}
= –iℏ (y ∂_{z} – z ∂_{y} – i z ∂_{x} + i x ∂_{z}) (x – i y)^{λ}
= iℏ z (∂_{y} + i ∂_{x}) (x – i y)^{λ}
= 0
So by starting from Z_{λ, λ} and applying L_{–} repeatedly, we get a basis {Z_{λ, –λ}, Z_{λ, –λ+1}, ..., Z_{λ, λ–1}, Z_{λ, λ}} of the spin-λ representation, with exactly 2λ+1 elements, each of which is an eigenfunction of L_{xy} with an eigenvalue that varies in integer units from –λ up to λ.
Though we won’t prove it, it turns out that all irreducible representations of the rotation algebra have an identical structure to the function spaces that we’ve called the spin-λ representations: they can all be characterised by a number λ related to their dimension 2λ+1, and they can all be given a basis of eigenvectors of one of the angular momentum components, with eigenvalues that range (in suitable units) from –λ to λ in integer steps. The only aspect that can’t be seen in our examples of spaces of wave functions is that it’s possible to have a representation where λ takes, not an integer value, but a half-integer value. For example, the vector space for the intrinsic spin of an electron is the spin-½ representation, a two-dimensional vector space, and the z component of an electron’s spin can have eigenvalues of either –½ℏ or ½ℏ.
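The whole ladder can be walked explicitly for a small value of λ. The sketch below is a minimal illustration under stated assumptions: polynomials are stored as dictionaries from exponent triples to coefficients, units are chosen so that ℏ = 1, and all helper names are hypothetical. It starts from (x + i y)^{λ}, applies L_{–} until the result vanishes, and records the L_{xy} eigenvalue at each step.

```python
from math import comb

X, Y, Z = 0, 1, 2
I_POWERS = [1, 1j, -1, -1j]            # exact powers of i

def diff(p, ax):
    """Partial derivative with respect to variable number ax."""
    out = {}
    for e, c in p.items():
        if e[ax] > 0:
            out[e[:ax] + (e[ax] - 1,) + e[ax + 1:]] = c * e[ax]
    return out

def times(p, ax):
    """Multiply by variable number ax."""
    return {e[:ax] + (e[ax] + 1,) + e[ax + 1:]: c for e, c in p.items()}

def combine(p, q, scale):
    """Return p + scale * q, dropping terms whose coefficient is zero."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + scale * c
    return {e: c for e, c in out.items() if c != 0}

def L(a, b):
    """Operator L_ab f = -i (a d_b f - b d_a f), with hbar = 1."""
    def op(f):
        inner = combine(times(diff(f, b), a), times(diff(f, a), b), -1)
        return combine({}, inner, -1j)
    return op

L_xy, L_yz, L_zx = L(X, Y), L(Y, Z), L(Z, X)

def lower(f):
    """Spin-lowering operator L_- = L_yz - i L_zx."""
    return combine(L_yz(f), L_zx(f), -1j)

def laplacian(f):
    out = {}
    for ax in (X, Y, Z):
        out = combine(out, diff(diff(f, ax), ax), 1)
    return out

def top_state(lam, sign=1):
    """Binomial expansion of (x + sign * i y)^lam."""
    return {(lam - k, k, 0): comb(lam, k) * I_POWERS[(sign * k) % 4]
            for k in range(lam + 1)}

lam = 3
state, m = top_state(lam), lam
eigenvalues = []
while state:                           # stop once L_- gives the zero polynomial
    if L_xy(state) == combine({}, state, m) and laplacian(state) == {}:
        eigenvalues.append(m)
    state, m = lower(state), m - 1
print(eigenvalues)
print(lower(top_state(lam, sign=-1)))  # L_- applied to (x - i y)^lam
```

For λ = 3 the loop visits seven states, one for each eigenvalue from 3 down to –3, and each of them still satisfies Laplace’s equation.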
If we extend the ideas from the previous section to a space with four dimensions (adding a coordinate w alongside the original x, y and z), then we can define rotation matrices in a similar way to encompass the three new pairs of coordinates, and take their derivatives to obtain three new J matrices: J_{xw}, J_{yw}, J_{zw}. The four-dimensional rotation algebra can be summed up by saying that when i, j and k are distinct:
[J_{ij}, J_{kj}] = J_{ik}
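Apart from rearrangements of this basic identity with appropriate changes of sign, all other commutators are zero. That’s simple enough, but it turns out that there’s a different choice of matrices that makes things even simpler: half the sums and differences of the J matrices for complementary pairs of coordinates.
A_{x} = ½(J_{yz} + J_{xw})
A_{y} = ½(J_{zx} + J_{yw})
A_{z} = ½(J_{xy} + J_{zw})
B_{x} = ½(J_{yz} – J_{xw})
B_{y} = ½(J_{zx} – J_{yw})
B_{z} = ½(J_{xy} – J_{zw})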
For these matrices, the commutators are:
[A_{x}, A_{y}] = A_{z}, and cyclic permutations
[B_{x}, B_{y}] = B_{z}, and cyclic permutations
[A_{i}, B_{j}] = 0, for all i and j.
So there are individual “subalgebras” formed by the A and B matrices that have exactly the same structure as the threedimensional rotation algebra so(3), and all the commutators between the two subalgebras are zero.
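The split into two commuting copies of so(3) can be confirmed with exact arithmetic. The sketch below assumes that the A and B matrices are half the sums and differences of the J matrices for complementary coordinate pairs, mirroring the definitions A = ½(L + M) and B = ½(L – M) used for the operators, and that each J_{ab} has +1 in row b, column a and –1 in row a, column b. The helper names are illustrative.

```python
from fractions import Fraction

X, Y, Z, W = range(4)

def J(a, b):
    """Generator of rotations in the ab-plane (assumed sign convention)."""
    m = [[0] * 4 for _ in range(4)]
    m[b][a], m[a][b] = 1, -1
    return m

def half_sum(m1, m2, sign):
    """Return (m1 + sign * m2) / 2 with exact fractions."""
    return [[Fraction(m1[i][j] + sign * m2[i][j], 2) for j in range(4)]
            for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(4)] for i in range(4)]

pairs = [(J(Y, Z), J(X, W)), (J(Z, X), J(Y, W)), (J(X, Y), J(Z, W))]
A = [half_sum(rot, boost, +1) for rot, boost in pairs]
B = [half_sum(rot, boost, -1) for rot, boost in pairs]
ZERO = [[0] * 4 for _ in range(4)]

# Each family closes on itself with the so(3) pattern ...
print(all(commutator(A[i], A[(i + 1) % 3]) == A[(i + 2) % 3] for i in range(3)))
print(all(commutator(B[i], B[(i + 1) % 3]) == B[(i + 2) % 3] for i in range(3)))
# ... and every A matrix commutes with every B matrix.
print(all(commutator(a, b) == ZERO for a in A for b in B))
```

Using `Fraction` keeps the factors of ½ exact, so the comparisons are true equalities rather than approximate ones.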
In general, then, we’d expect to describe an irreducible representation of the four-dimensional rotation algebra, so(4), by choosing two numbers, λ_{A} and λ_{B}, and combining the spin-λ_{A} and the spin-λ_{B} representations of the three-dimensional rotation algebra, so(3). Since we can define spin-lowering operators that act independently on the A_{z} or B_{z} eigenvalues, those eigenvalues will vary independently across their respective ranges, and we will have a total of (2λ_{A}+1)(2λ_{B}+1) different vectors as a basis for the whole space.
However, if we work with functions on four-dimensional space and extend the L operators to all six pairs of coordinates, for these representations we will always have λ_{A}=λ_{B}. The reason for this is that these functions describe the quantum mechanics of a single particle, and while a completely general angular velocity in four dimensions can involve an arbitrary linear combination of the six J matrices, the angular velocity of a point particle always entails motion in a single plane: the plane spanned by the vector giving the particle’s position relative to the origin, and the vector for its velocity.
Suppose we rewrite the angular velocity as a linear combination of the A and B matrices:
J = a_{x} A_{x} + a_{y} A_{y} + a_{z} A_{z} + b_{x} B_{x} + b_{y} B_{y} + b_{z} B_{z}
Since the motion lies in a single plane, we could always choose coordinates such that:
J = ω J_{xy} = ω A_{z} + ω B_{z}
a_{x} = a_{y} = b_{x} = b_{y} = 0
a_{z} = b_{z} = ω
a_{x}^{2} + a_{y}^{2} + a_{z}^{2} = b_{x}^{2} + b_{y}^{2} + b_{z}^{2}
The exact values of a_{x}, b_{x} etc. depend on the choice of coordinates, but the last line, which says that the sums of the squares of the two triples of coefficients are equal, can be shown to be independent of the coordinate system. So for any simple rotation (a rotation in a single plane), the triples of coefficients for the A and B matrices have the same squared magnitude.
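As a numerical sanity check of this claim, the following Python sketch builds the coefficients of a simple rotation from two arbitrary vectors u and v spanning its plane, and compares the squared magnitudes of the two triples. It assumes that the coefficient of J_{ij} is proportional to u_{i} v_{j} – u_{j} v_{i}, and that J_{yz} = A_{x} + B_{x} and J_{xw} = A_{x} – B_{x} (and cyclically), which is consistent with J_{xy} = A_{z} + B_{z} above. The function names are purely illustrative.

```python
import random

AXES = "xyzw"

def plane_coefficient(u, v, pair):
    """Coefficient of J_ij for a rotation in the plane of u and v (up to scale)."""
    i, j = AXES.index(pair[0]), AXES.index(pair[1])
    return u[i] * v[j] - u[j] * v[i]

def ab_coefficients(u, v):
    # Assumed convention: J_yz = A_x + B_x, J_xw = A_x - B_x, and cyclically.
    pairs = [("yz", "xw"), ("zx", "yw"), ("xy", "zw")]
    a = [plane_coefficient(u, v, s) + plane_coefficient(u, v, t) for s, t in pairs]
    b = [plane_coefficient(u, v, s) - plane_coefficient(u, v, t) for s, t in pairs]
    return a, b

def squared_norm(vec):
    return sum(c * c for c in vec)

random.seed(1)
u = [random.uniform(-1, 1) for _ in range(4)]
v = [random.uniform(-1, 1) for _ in range(4)]
a, b = ab_coefficients(u, v)
# The two squared magnitudes should agree up to rounding error.
print(squared_norm(a), squared_norm(b))
```

For a general (non-simple) combination of the six J matrices the two numbers would differ, so the equality is a genuine restriction.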
To prove the quantum-mechanical version of this result, the notation will be simpler if we define some three-dimensional vectors of operators:
L = (L_{yz}, L_{zx}, L_{xy})
M = (L_{xw}, L_{yw}, L_{zw})
A = ½(L + M)
B = ½(L – M)
Here we’re using the same letters, A and B, for the sums and differences we form from the L operators as we used for the matrices formed from the sums and differences of the J matrices. We can define squared angular momentum operators for the two spins:
A^{2} = A · A = ¼(L · L + L · M + M · L + M · M)
B^{2} = B · B = ¼(L · L – L · M – M · L + M · M)
A^{2} – B^{2} = ½(L · M + M · L)
A^{2} + B^{2} = ½(L^{2} + M^{2})
Now, if we make use of the definitions of the L operators, we find that:
(L · M) f
= –ℏ^{2} [ (y ∂_{z} – z ∂_{y}) (x ∂_{w} – w ∂_{x}) + (z ∂_{x} – x ∂_{z}) (y ∂_{w} – w ∂_{y}) + (x ∂_{y} – y ∂_{x}) (z ∂_{w} – w ∂_{z}) ] f
= –ℏ^{2} [ (x y ∂_{z, w} + z w ∂_{x, y} – x z ∂_{y, w} – y w ∂_{x, z})
+ (y z ∂_{x, w} + x w ∂_{y, z} – x y ∂_{z, w} – z w ∂_{x, y})
+ (x z ∂_{y, w} + y w ∂_{x, z} – y z ∂_{x, w} – x w ∂_{y, z}) ] f
= 0
Similarly, (M · L) f = 0. So A^{2} = B^{2}, and λ_{A}=λ_{B}.
We define solid spherical harmonics in four dimensions as homogeneous polynomials in the four Cartesian coordinates x, y, z and w that satisfy the four-dimensional version of Laplace’s equation. We will refer to a four-dimensional solid spherical harmonic of degree λ as Z_{λ, σ, μ}, and the ordinary spherical harmonic that agrees with it on the unit sphere as Y_{λ, σ, μ}. There are many different ways we could choose individual functions, but one obvious scheme would be to have σ and μ equal to the eigenvalues of A_{z} and B_{z} respectively (in units of ℏ), with both ranging from –λ_{A} to λ_{A}.
In three dimensions, the squared magnitude of the angular momentum vector, L^{2}, is unchanged by any rotation, and the set of eigenfunctions of L^{2} with a given eigenvalue form an invariant space. In four dimensions, the angular velocity matrices have six independent components, and any quantity that is left unchanged by rotations needs to include all six, since in general a rotation might shift any one of those components into any other. In the notation we’re using, A^{2} + B^{2} and L^{2} + M^{2} (which are the same thing except for a factor of ½) are invariant under all rotations, and the four-dimensional solid spherical harmonics are eigenfunctions of both:
(A^{2} + B^{2}) Z_{λ, σ, μ}(r)
= ½ (L^{2} + M^{2}) Z_{λ, σ, μ}(r)
= ½ [ℏ^{2} λ(λ+2) Z_{λ, σ, μ}(r) – ℏ^{2} r^{2} ∇^{2} Z_{λ, σ, μ}(r)]
= ½ℏ^{2} λ(λ+2) Z_{λ, σ, μ}(r)
The third line above depends only on the function Z_{λ, σ, μ}(r) being homogeneous with degree λ, with the fourth line following because Z_{λ, σ, μ}(r) is also harmonic. Since A^{2} = B^{2}, this becomes:
A^{2} Z_{λ, σ, μ}(r)
= ¼ℏ^{2} λ(λ+2) Z_{λ, σ, μ}(r)
= ℏ^{2} (½λ)(½λ+1) Z_{λ, σ, μ}(r)
In the spin-λ_{A} representation of the three-dimensional rotation algebra, A^{2} = ℏ^{2} λ_{A}(λ_{A}+1), so we must have λ_{A} = λ_{B} = ½λ. The dimension of the representation we get from the solid spherical harmonics of degree λ is:
(2λ_{A}+1)(2λ_{B}+1) = (λ+1)^{2}
As a check, we can get the same dimension by directly counting the number of independent polynomials. By an argument similar to the one we gave for the three-dimensional spherical harmonics, we can show that there are (λ+3)(λ+2)(λ+1)/6 linearly independent homogeneous polynomials in four variables with degree λ. Then, since the Laplacian operator reduces the degree of a homogeneous polynomial by two, requiring it to be zero imposes (λ+1) λ (λ–1)/6 constraints. That leaves the dimension of the space equal to the difference of these two expressions, which is just (λ+1)^{2}.
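The counting argument is easy to confirm by machine. This short Python sketch (the function name is just for illustration) uses binomial coefficients for the two counts: the number of homogeneous polynomials of degree λ in four variables is the binomial coefficient C(λ+3, 3), and the number of constraints is the number of such polynomials of degree λ–2, which is C(λ+1, 3).

```python
from math import comb

def harmonic_dimension(lam):
    """Dimension of the space of degree-lam harmonic polynomials in four variables."""
    all_polynomials = comb(lam + 3, 3)   # (lam+3)(lam+2)(lam+1)/6
    constraints = comb(lam + 1, 3)       # (lam+1) lam (lam-1)/6, zero for lam < 2
    return all_polynomials - constraints

for lam in range(6):
    print(lam, harmonic_dimension(lam), (lam + 1) ** 2)
```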
To give concrete examples of four-dimensional spherical harmonics, we can start by defining:
Z_{λ, ½λ, ½λ}(x, y, z, w) = (x + i y)^{λ}
This has eigenvalues for both A_{z} and B_{z} of ½ℏλ. We can then obtain other eigenfunctions with lower spins using the lowering operators:
A_{–} = A_{x} – i A_{y}
B_{–} = B_{x} – i B_{y}
Pauli^{[1]}^{[2]} found the energy levels of hydrogen by realising that the quantum-mechanical equivalent of a suitably scaled version of a classical vector known as the Laplace-Runge-Lenz vector (which points along the axis of a planet’s elliptical orbit) acts exactly like our vector M, so that the algebra formed by the Laplace-Runge-Lenz vector and the angular momentum vector L is none other than the four-dimensional rotation algebra in disguise! Pauli wasn’t working with functions on four-dimensional space, and he obtained the equation L · M = 0 from the classical relationship between the angular momentum (which is perpendicular to the plane of the orbit) and the Laplace-Runge-Lenz vector, which lies in the plane.
We won’t go into the details of Pauli’s calculations, but there is one aspect of the connection that’s very easy to see. For every ellipse, we have the relationship between the semimajor axis, a, the semiminor axis, b, and the distance from the centre to each focus, c:
b^{2} + c^{2} = a^{2}
For an orbit, fixing the total energy E is enough to fix the semimajor axis, a. The semiminor axis, b, is then proportional to the angular momentum, L. The length of the Laplace-Runge-Lenz vector, M, is proportional to the eccentricity of the orbit, which in turn is proportional to c. So, given the right choice of an overall factor for M, the invariance of:
L^{2} + M^{2}
for all orbits with a fixed total energy just reflects the Pythagorean relationship between a, b and c.
To take this a little further, suppose we choose two vectors A and B in three-dimensional space, with the only restriction being:
A^{2} = B^{2} = –k^{2} m/(8 E)
for some fixed total energy E. Clearly we’re free to rotate either vector, independently of the other, without changing this condition. Then if we set:
L = A + B
M = A – B
we have:
L · M = (A + B) · (A – B) = A · A – B · B = 0
L^{2} + M^{2} = (A + B) · (A + B) + (A – B) · (A – B) = 2(A^{2} + B^{2}) = –k^{2} m/(2 E)
Since L and M are guaranteed to be orthogonal, we can always choose an elliptical orbit whose plane is perpendicular to L and has M pointing in the direction from the centre of attraction to the second focus of the ellipse.
The exact relationships between energy, angular momentum and the semimajor and semiminor axes of the orbit can be found from two of our earlier results:
E = –L^{2} / (2 m R_{1} R_{2})
k = G M_{S} m = L^{2} (R_{1} + R_{2}) / (2 m R_{1} R_{2})
a = ½(R_{1} + R_{2}) = –k/(2 E)
b^{2} = R_{1} R_{2} = –L^{2}/(2 E m)
If we scale M so it’s related to c by the same factor as L and b, we have:
c^{2} = –M^{2}/(2 E m)
b^{2} + c^{2} = –(L^{2} + M^{2})/(2 E m) = k^{2}/(4 E^{2}) = a^{2}
So, once we fix E we see that the set of orbits that share the same total energy is invariant under two completely independent rotations of the two vectors A and B, giving the same kind of doubling of the rotation algebra that we see in four-dimensional rotations.
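The relationships between E, L, M and the axes of the ellipse can be checked numerically. The Python sketch below picks arbitrary values for R_{1}, R_{2}, m and k (they are illustrative numbers, not physical data), takes c = ½(R_{2} – R_{1}) as the distance from the centre to a focus, and scales M so that c^{2} = –M^{2}/(2 E m), as in the text.

```python
import math

def orbit_invariants(R1, R2, m, k):
    """Return E, a^2, b^2 + c^2, L^2 + M^2 and -k^2 m/(2E) for one orbit."""
    L2 = 2 * m * k * R1 * R2 / (R1 + R2)   # from k = L^2 (R1+R2) / (2 m R1 R2)
    E = -L2 / (2 * m * R1 * R2)            # total energy
    a = 0.5 * (R1 + R2)                    # semimajor axis
    b2 = R1 * R2                           # squared semiminor axis
    c = 0.5 * (R2 - R1)                    # centre-to-focus distance
    M2 = -2 * E * m * c * c                # scaled so that c^2 = -M^2/(2 E m)
    return E, a * a, b2 + c * c, L2 + M2, -k * k * m / (2 * E)

E, a2, b2_plus_c2, L2_plus_M2, invariant = orbit_invariants(1.0, 3.5, 2.0, 0.7)
print(a2, b2_plus_c2)          # Pythagorean relation b^2 + c^2 = a^2
print(L2_plus_M2, invariant)   # L^2 + M^2 = -k^2 m / (2E)
```

Changing R_{1} and R_{2} while keeping their sum fixed leaves E unchanged, and the last two printed values stay the same, as expected for orbits of equal energy.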
If we set aside the physical vectors and just concentrate on the geometry, there’s an even simpler construction that demonstrates the same symmetries. Choose two vectors A and B whose lengths are equal. Place one focus of an ellipse at the origin, place the second focus at A + B, and position the two points of the ellipse where the tangent to the curve is parallel to its axis at A and B. This construction will always yield an ellipse with a semimajor axis of length |A| (with the degenerate case of a line when A = B), and the entire family of ellipses with one focus at the origin and the same length for the semimajor axis can be generated by rotating A and B independently.
Of course our B here is more like the opposite of the B used in the previous construction, and the vector A – B whose length is now proportional to the angular momentum lies in the plane of the orbit, not perpendicular to it. But everything can be mapped between the different schemes in a way that preserves the essential feature: the two independent three-dimensional rotations.
In 1935, Vladimir Fock^{[3]} used the stereographic projection from a 3-sphere to momentum space (which is the same as the velocity space we’ve been working with, apart from a factor of m) to analyse the quantum-mechanical solutions for a particle moving under an inverse-square force. Of course in the quantum-mechanical case, rather than a star and a planet we’re studying a proton and an electron. I haven’t read Fock’s original paper, so for parts of what follows I’m indebted to the accounts of Jonas Karlsson^{[4]}, Radosław Szmytkowski^{[5]} and Bander and Itzykson^{[6]}.
The ordinary Schrödinger equation for a particle moving under an inverse-square force with a force constant of k is:
–ℏ^{2}/(2m) ∇^{2} φ(r) – (k/r) φ(r) = E φ(r)
This is the time-independent equation, for an eigenfunction φ(r) with energy E. Here we’ve expressed the particle’s state φ(r) as a function of its position, r. But if we wish, we can express the particle’s state as a function ψ(p) of its momentum, p, instead. The two representations are related by what are essentially Fourier transforms:
φ(r) = 1/√(8 π^{3} ℏ^{3}) ∫_{R3} ψ(p) exp(i p·r/ℏ) d^{3}p
ψ(p) = 1/√(8 π^{3} ℏ^{3}) ∫_{R3} φ(r) exp(–i p·r/ℏ) d^{3}r
We obtain the momentum-space Schrödinger equation by performing a Fourier transform of the original. The hardest part here is Fourier-transforming the potential energy, –k/r. We start by establishing a handy integral representation of 1/r:
∫_{0}^{∞} [exp(–αr^{2})/√α] dα
Putting α=β/r^{2}
= 1/r ∫_{0}^{∞} [exp(–β)/√β] dβ
= Γ(½)/r [where Γ is the Gamma function]
= √(π)/r
So we can write:
1/r = [1/√π] ∫_{0}^{∞} [exp(–αr^{2})/√α] dα
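This integral representation can be tested numerically. The Python sketch below is an illustration only: it substitutes α = t^{2}, which turns the integral into 2 ∫_{0}^{∞} exp(–t^{2} r^{2}) dt and removes the singularity at α = 0, then applies a simple midpoint rule on a truncated range. The function name, the number of steps and the cut-off are arbitrary choices.

```python
import math

def inverse_r_from_integral(r, steps=20_000, cutoff=10.0):
    """Approximate (1/sqrt(pi)) * integral_0^inf exp(-alpha r^2)/sqrt(alpha) d(alpha)."""
    # With alpha = t^2 the integrand becomes 2 exp(-t^2 r^2); beyond
    # t = cutoff / r it is smaller than exp(-100) and is neglected.
    t_max = cutoff / r
    h = t_max / steps
    total = sum(math.exp(-(((i + 0.5) * h) * r) ** 2) for i in range(steps))
    return 2.0 * h * total / math.sqrt(math.pi)

for r in (0.5, 1.0, 3.0):
    print(r, inverse_r_from_integral(r), 1.0 / r)
```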
In what follows we’ll write the magnitude of any vector using the same letter as we use for the vector itself, but dropping the boldface, e.g. p for the magnitude of the momentum vector and r for the magnitude of the position vector. Using the integral representation of 1/r, the Fourier transform of the potential is:
V(p) = 1/√(8 π^{3} ℏ^{3}) ∫_{R3} [–k/r] exp(–i p·r/ℏ) d^{3}r
= –k/√(8 π^{4} ℏ^{3}) ∫_{0}^{∞} [1/ √α] ∫_{R3} exp(–i p·r/ℏ) exp(–αr^{2}) d^{3}r dα
We can transform the integral over R^{3} into spherical polar coordinates, choosing to measure the angle θ from the direction of the momentum vector p, so that p·r = p r cos θ.
∫_{R3} exp(–i p·r/ℏ) exp(–αr^{2}) d^{3}r
= ∫_{0}^{2π} ∫_{0}^{∞} ∫_{0}^{π} exp(–i p r cos θ/ℏ) sin θ dθ exp(–αr^{2}) r^{2} dr dφ
= 2π ∫_{0}^{∞} ∫_{0}^{π} exp(–i p r cos θ/ℏ) sin θ dθ exp(–αr^{2}) r^{2} dr
Putting z=cos θ
= 2π ∫_{0}^{∞} ∫_{–1}^{1} exp(–i p r z/ℏ) dz exp(–αr^{2}) r^{2} dr
= [2π ℏ i / p] ∫_{0}^{∞} [ exp(–i p r/ℏ) – exp(i p r/ℏ) ] exp(–αr^{2}) r dr
= [π ℏ i / p] ∫_{–∞}^{∞} [ exp(–i p r/ℏ) – exp(i p r/ℏ) ] exp(–αr^{2}) r dr
Putting r=u/√α – i p / (2 ℏ α) in the first term, r=u/√α + i p / (2 ℏ α) in the second term
= [π / (2 p α^{3/2})] exp(–p^{2} / (4 ℏ^{2} α))
× [∫_{–∞}^{∞} exp(–u^{2}) (p + 2 i ℏ u √α) du + ∫_{–∞}^{∞} exp(–u^{2}) (p – 2 i ℏ u √α) du]
= [π / α^{3/2}] exp(–p^{2} / (4 ℏ^{2} α)) ∫_{–∞}^{∞} exp(–u^{2}) du
= (π/α)^{3/2} exp(–p^{2} / (4 ℏ^{2} α))
We then have:
V(p) = –k/√(8 π ℏ^{3}) ∫_{0}^{∞} [1/ α^{2}] exp(–p^{2} / (4 ℏ^{2} α)) dα
Putting α=p^{2}/(4 ℏ^{2} γ)
= –k √(2ℏ/π) [1/p^{2}] ∫_{0}^{∞} exp(–γ) dγ
= –k √(2ℏ/π) [1/p^{2}]
So the Fourier transform of a potential proportional to the inverse of the distance is proportional to the inverse square of the magnitude of the momentum. The Fourier transform of the product of the potential and the original wave function is the convolution of the individual Fourier transforms, so we end up with the momentum-space Schrödinger equation:
[p^{2}/(2m) – E] ψ(p) – [k/(2 π^{2} ℏ)] ∫_{R3} [ψ(p')/|p–p'|^{2}] d^{3}p' = 0
[Note that we’ve had to insert a further factor of 1/√(8 π^{3} ℏ^{3}) into the convolution between the Fourier transforms of the potential and that of the wave function, because of our choice of normalisation in the original Fourier transforms. Different choices there give different results along the way, but the final equation in momentum space will always end up the same.]
Fock’s extraordinary insight was that this integral equation is equivalent to the four-dimensional Laplace equation restricted to the surface of a 3-sphere that we stereographically project onto the three-dimensional momentum space!
The stereographic projection from a 3-sphere of radius s is given by:
S_{4}(x, y, z, w) = [2 s / (s – w)] (x, y, z)
We’re projecting onto a three-dimensional hyperplane tangent to the 3-sphere’s “south pole” at (0,0,0,–s). Note that our s is not the same as when we were projecting onto the velocity circles, because we’re dealing with momentum space with an extra factor of m. For the moment, though, we’ll just leave s as a free parameter.
Starting from the integral over R^{3} in the momentum-space version of the Schrödinger equation, we want to perform a change of variables to coordinates on the 3-sphere. We will put coordinates on the 3-sphere of η, θ and φ, with:
0 ≤ η ≤ π
0 ≤ θ ≤ π
0 ≤ φ ≤ 2π
Cartesian coordinates are s (sin η sin θ cos φ, sin η sin θ sin φ, sin η cos θ, cos η)
3-volume element is dΩ = s^{3} sin^{2} η sin θ dη dθ dφ
Total 3-volume of 3-sphere is 2 π^{2} s^{3}
If we consider the projection from the generic point s = s (sin η sin θ cos φ, sin η sin θ sin φ, sin η cos θ, cos η), where the s on the left is the position vector on the 3-sphere and the s on the right is the radius, we have:
S_{4}(s) = [2s sin η / (1 – cos η)] (cos φ sin θ, sin φ sin θ, cos θ)
The two angular spherical coordinates in R^{3}, θ and φ, are exactly the same as the corresponding coordinates on the 3-sphere, while the radial coordinate in R^{3} (which we’ll call p, since we’re in momentum space) and the angular coordinate η are related by:
p = 2s sin η / (1 – cos η)
We’ll shortly need the derivative of this, which turns out to be:
dp/dη = –2s / (1 – cos η)
If we change from p to η, the volume element on momentum space becomes:
Momentum-space 3-volume element
d^{3}p = p^{2} sin θ dp dθ dφ
= 4s^{2} sin^{2} η / (1 – cos η)^{2} |dp/dη| sin θ dη dθ dφ
= 8s^{3} sin^{2} η / (1 – cos η)^{3} sin θ dη dθ dφ
= 8 / (1 – cos η)^{3} dΩ
We can reexpress the function of η here by noting that:
p^{2} + 4 s^{2} = 4 s^{2} (1 + sin^{2} η / (1 – cos η)^{2})
= 4 s^{2} (1 – 2 cos η + cos^{2} η + sin^{2} η) / (1 – cos η)^{2}
= 8 s^{2} (1 – cos η) / (1 – cos η)^{2}
= 8 s^{2} / (1 – cos η)
So we have:
d^{3}p = (p^{2}/(4 s^{2}) + 1)^{3} dΩ
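The two ingredients of this change of variables can be spot-checked numerically. In the Python sketch below (names and sample values are arbitrary), a central finite difference is compared with the closed form for dp/dη, and the factor 8 / (1 – cos η)^{3} is compared with (p^{2}/(4 s^{2}) + 1)^{3}.

```python
import math

def p_of_eta(eta, s):
    """Radial momentum coordinate reached by projecting the point at angle eta."""
    return 2 * s * math.sin(eta) / (1 - math.cos(eta))

def check_volume_factor(eta, s, h=1e-6):
    p = p_of_eta(eta, s)
    # Central difference approximation to dp/d(eta).
    dp_numeric = (p_of_eta(eta + h, s) - p_of_eta(eta - h, s)) / (2 * h)
    dp_formula = -2 * s / (1 - math.cos(eta))
    # The two expressions for the ratio d^3p / dOmega.
    factor_eta = 8 / (1 - math.cos(eta)) ** 3
    factor_p = (p * p / (4 * s * s) + 1) ** 3
    return dp_numeric, dp_formula, factor_eta, factor_p

print(check_volume_factor(1.2, 0.7))
```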
The inverse projection from a point (X, Y, Z) in the three-dimensional hyperplane to a point on the 3-sphere is:
R_{4}(X, Y, Z) = (4 s^{2} X, 4 s^{2} Y, 4 s^{2} Z, s (X^{2} + Y^{2} + Z^{2} – 4 s^{2})) / (X^{2} + Y^{2} + Z^{2} + 4 s^{2})
From this, we find:
|p–p'|^{2} = [p^{2}/(4 s^{2}) + 1][p'^{2}/(4 s^{2}) + 1] |R_{4}(p) – R_{4}(p')|^{2}
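This relation between distances in momentum space and chord lengths on the 3-sphere is straightforward to test for sample points. The Python sketch below implements the inverse projection R_{4} as given above and compares the two sides; the helper names and the sample vectors are arbitrary.

```python
import math

def inverse_projection(P, s):
    """Map a point P of the hyperplane to the 3-sphere of radius s (R_4 in the text)."""
    n2 = sum(c * c for c in P)
    d = n2 + 4 * s * s
    return [4 * s * s * c / d for c in P] + [s * (n2 - 4 * s * s) / d]

def dist2(u, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def scale_factor(P, s):
    """The factor p^2/(4 s^2) + 1 for a momentum-space point P."""
    return sum(c * c for c in P) / (4 * s * s) + 1

s = 0.8
p = [0.3, -1.1, 0.4]
q = [2.0, 0.5, -0.7]
lhs = dist2(p, q)
rhs = (scale_factor(p, s) * scale_factor(q, s)
       * dist2(inverse_projection(p, s), inverse_projection(q, s)))
print(lhs, rhs)
```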
If we write s=R_{4}(p) and s'=R_{4}(p'), and define:
Ψ(s) = [p^{2}/(4 s^{2}) + 1]^{2} ψ(p)
then our original integral over R^{3} becomes:
∫_{R3} [ψ(p')/|p–p'|^{2}] d^{3}p' = 1/[p^{2}/(4 s^{2}) + 1] ∫_{s S3} [Ψ(s')/|s–s'|^{2}] dΩ'
Here we’re writing the surface of the ball of radius s as s S^{3}, where S^{3} is taken to be the unit-radius 3-sphere in R^{4}. We set the value of s to √(–½ E m), which is just m times the value we used for velocity space. This means that:
–E = 2 s^{2} / m
p^{2}/(2 m) – E = (2 s^{2} / m) × (p^{2}/(4 s^{2}) + 1)
and the momentum-space Schrödinger equation becomes:
Ψ(s) – [m/(2 s^{2})][k/(2 π^{2} ℏ)] ∫_{s S3} [Ψ(s')/|s–s'|^{2}] dΩ' = 0
This equation is invariant under any four-dimensional rotation: if Ψ(s) is a solution, and R is a rotation of R^{4}, then Ψ_{R}(s)=Ψ(R s) will solve the equation too. To prove this, all we have to do is change the variable of integration from s' to s''=R s', which has no effect on the volume element or the set over which we’re integrating, and which gives us |s–s'| = |R s–R s'| = |R s–s''|.
Now consider the function:
P(r, s) = 1 / |r – s|^{2} = 1 / [(r – s) · (r – s)]
where r and s are vectors in R^{4}. We claim that P(r, s) is a solution of the four-dimensional Laplace equation, if we hold s fixed and treat P(r, s) as a function of the four-dimensional vector r alone. To see this, we first look at the derivatives with respect to an individual coordinate, say x, where r=(x, y, z, w) and s=(s_{x}, s_{y}, s_{z}, s_{w}).
∂_{x}P(r, s) = –2 (x – s_{x}) / [(r – s) · (r – s)]^{2}
∂_{x, x}P(r, s) = –2 / [(r – s) · (r – s)]^{2} + 8 (x – s_{x})^{2} / [(r – s) · (r – s)]^{3}
Adding up the equivalent terms for all four coordinates, we get:
∇^{2}P(r, s) = –8 / [(r – s) · (r – s)]^{2} + 8 [(r – s) · (r – s)] / [(r – s) · (r – s)]^{3} = 0
This calculation goes through nicely when r ≠ s, but of course P(r, s) is singular when r = s. In fact, rather than ∇^{2}P(r, s)=0, we have:
∇^{2}P(r, s) = –4 π^{2} δ(r – s)
where δ is the Dirac delta function, a generalised function or distribution with the property that integrating the product of any “test function” f(r) and the Dirac delta δ(r – s) gives the value of f at r=s. We won’t give a rigorous proof of this, but to make this claim plausible we note first that the gradient of P(r, s) is given by:
∇P(r, s) = –2 (r – s) / [(r – s) · (r – s)]^{2}
We can find the integral of the Laplacian throughout a unit solid ball in R^{4} centred on s (which we’ll describe as s+B^{4}, with B^{4} the unit solid ball centred on the origin) by noting that the Laplacian is just the divergence of the gradient, and applying the divergence theorem to compute this as an integral over the ball’s three-dimensional boundary. That boundary is a unit 3-sphere, s+S^{3}, and the unit outwards normal at any point r on it is just r – s. So we have:
∫_{s+B4} ∇^{2}P(r, s) d^{4}r
= ∫_{s+B4} ∇ · ∇P(r, s) d^{4}r
= ∫_{s+S3} (r – s) · ∇P(r, s) dΩ
= –2 ∫_{s+S3} dΩ
= –4 π^{2}
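Both halves of this argument can be illustrated numerically. The Python sketch below uses finite differences (with arbitrarily chosen sample points and step sizes) to check that the four-dimensional Laplacian of P vanishes at a point away from s, and that the outward normal derivative of P on the unit 3-sphere around s equals –2, so that multiplying by the 3-sphere’s total volume 2 π^{2} gives –4 π^{2}. It is a plausibility check, not a proof of the delta-function identity.

```python
import math

def P(r, s):
    """P(r, s) = 1 / |r - s|^2 for vectors in R^4."""
    return 1.0 / sum((a - b) ** 2 for a, b in zip(r, s))

def laplacian_4d(f, r, h=1e-3):
    """Second-order central-difference Laplacian of f at the point r."""
    total = 0.0
    for axis in range(4):
        plus, minus = list(r), list(r)
        plus[axis] += h
        minus[axis] -= h
        total += (f(plus) - 2 * f(r) + f(minus)) / (h * h)
    return total

def radial_derivative(f, centre, direction, rho, h=1e-6):
    """Derivative of f along a ray from centre, at distance rho."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    def point(t):
        return [c + t * u for c, u in zip(centre, unit)]
    return (f(point(rho + h)) - f(point(rho - h))) / (2 * h)

s_vec = [0.2, -0.1, 0.4, 0.0]
r_vec = [1.0, 0.5, -0.3, 0.8]
lap = laplacian_4d(lambda r: P(r, s_vec), r_vec)
normal = radial_derivative(lambda r: P(r, s_vec), s_vec, [1.0, 2.0, -1.0, 0.5], 1.0)
flux = 2 * math.pi ** 2 * normal   # constant integrand times total 3-volume
print(lap, flux, -4 * math.pi ** 2)
```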
Given that P(r, s) is a solution of Laplace’s equation, at least for r ≠ s, and that Ψ(s) in our equation on the 3-sphere is an integral over s' of P(s, s'), it seems plausible that Ψ(s) itself should satisfy Laplace’s equation. To make progress in seeing exactly which solutions will work, we will substitute a four-dimensional spherical harmonic:
Ψ(s) = Z_{λ, σ, μ}(s)
into our transformed Schrödinger equation on the 3-sphere s S^{3}. In order to evaluate the integral ∫_{s S3} [Z_{λ, σ, μ}(s')/|s–s'|^{2}] dΩ' we will make use of a form of Green’s second identity, which is a result that follows easily from the divergence theorem:
∫_{V} (f ∇^{2} g – g ∇^{2} f) d^{4}r = ∫_{Boundary of V} n · (f ∇g – g ∇f) dΩ
Here n is a unit outwards normal to the four-dimensional set V over which we’re integrating on the left-hand side. To apply this identity we will set f = Z_{λ, σ, μ} and g = P(s', s) = 1/|s–s'|^{2}. Rather than integrating over the entire four-dimensional ball s B^{4}, we will exclude a piece of radius ε that contains the point s, and then let ε go to zero. That is, we will define the set over which we integrate as:
V = {s' in R^{4} such that |s'| ≤ s and |s–s'| ≥ ε}
This has the advantage that both f and g are solutions of Laplace’s equation on V, so the left-hand side of Green’s identity, the integral over V of an expression that contains the Laplacians, is simply zero. So we end up with:
∫_{Boundary of V} Z_{λ, σ, μ}(s') (n · ∇P(s', s)) dΩ' = ∫_{Boundary of V} P(s', s) (n · ∇Z_{λ, σ, μ}(s')) dΩ'
The boundary of V will consist of two pieces. One piece is the 3-sphere s S^{3} excluding those points closer to s than ε. The other piece will approximate a half-3-sphere centred on s, of radius ε.
On the 3-sphere s S^{3}, the dot product of the outwards normal, n, with the gradient of the solid spherical harmonic is just the rate of change of the solid spherical harmonic with distance from the origin. From the homogeneity of the polynomial, this evaluates to a multiple of the original function:
n · ∇Z_{λ, σ, μ}(s')
= ∂_{r} Z_{λ, σ, μ}(r (s' / s)) |_{r=s}
= ∂_{r} [r^{λ} Y_{λ, σ, μ}(s' / s)] |_{r=s}
= λ r^{λ–1} Y_{λ, σ, μ}(s' / s) |_{r=s}
= [λ / s] Z_{λ, σ, μ}(s')
And again on the 3-sphere s S^{3}, the dot product of the outwards normal, n, with the gradient of P(s', s) turns out to be a multiple of the original function:
∇P(r, s) = –2 (r – s) / [(r – s) · (r – s)]^{2}
(r / |r|) · ∇P(r, s) |_{|r|=s}
= –[2/s] (s^{2} – r · s) / (2 s^{2} – 2 r · s)^{2}
= –1 / [2 s (s^{2} – r · s)]
P(r, s) |_{|r|=s}
= 1 / [(r – s) · (r – s)] |_{|r|=s}
= 1 / [2 (s^{2} – r · s)]
Putting r=s', we have on the 3-sphere s S^{3}:
n · ∇P(s', s) = [–1/s] P(s', s)
The other piece of the boundary of V is the (approximate) half-3-sphere around the point s, with radius ε. The integral on any complete 3-sphere around s of the dot product of the outwards normal with the gradient ∇P(s', s) will just be –4 π^{2}, as we found previously when showing that the Laplacian of P is a multiple of the Dirac delta. Because P(s', s) is radially symmetrical about s, as ε tends to zero one of the integrals over this part of the boundary of V will approach:
∫_{Half-3-sphere around s} Z_{λ, σ, μ}(s') (n · ∇P(s', s)) dΩ' → 2 π^{2} Z_{λ, σ, μ}(s)
Here we’re taking half the opposite of the integral of the normal gradient of P, because we only have half a 3-sphere, and the outwards normal of V is the opposite of the outwards normal of the 3-sphere around s that we’re excluding from V. And as ε tends to zero, we can treat Z_{λ, σ, μ}(s') as constant over the whole region of integration, taking on the single value Z_{λ, σ, μ}(s).
The other integral we need to consider is:
∫_{Half-3-sphere around s} P(s', s) (n · ∇ Z_{λ, σ, μ}(s')) dΩ' → π^{2} ε^{3} [1/ε^{2}] (n · ∇ Z_{λ, σ, μ}(s')) |_{s'=s} → 0
Because the volume of the half-3-sphere we’re integrating over grows smaller with ε faster than P(s', s) grows larger, the integral tends to zero.
As ε goes to zero, the integrals on the 3-sphere s S^{3} excluding the region close to s just approach the same integrals over the full 3-sphere. Putting all of these pieces together, we have:
∫_{Boundary of V} Z_{λ, σ, μ}(s') (n · ∇P(s', s)) dΩ' = ∫_{Boundary of V} P(s', s) (n · ∇Z_{λ, σ, μ}(s')) dΩ'
2 π^{2} Z_{λ, σ, μ}(s) – [1/s] ∫_{s S3} [Z_{λ, σ, μ}(s')/|s–s'|^{2}] dΩ' = [λ/s] ∫_{s S3} [Z_{λ, σ, μ}(s')/|s–s'|^{2}] dΩ'
Z_{λ, σ, μ}(s) – [(λ+1)/(2 π^{2} s)] ∫_{s S3} [Z_{λ, σ, μ}(s')/|s–s'|^{2}] dΩ' = 0
This will match the equation we want to solve, so long as:
[m/(2 s^{2})][k/(2 π^{2} ℏ)] = (λ+1)/(2 π^{2} s)
s = m k / [2 ℏ (λ+1)]
Since E = –2 s^{2} / m, this implies that the quantum system has energy levels:
E = –m k^{2} / [2 ℏ^{2} (λ+1)^{2}]
where λ+1 is a positive integer. And, as we would hope, this exactly matches the formula obtained by solving the Schrödinger equation in the usual way. The quantum number here that we’ve called λ+1 is traditionally referred to as n when describing the energy levels of a hydrogen atom.
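To see the formula produce familiar numbers, the Python sketch below evaluates it for hydrogen, taking the force constant to be k = e^{2}/(4 π ε_{0}) in SI units. The constants are CODATA values typed in by hand, and the electron mass is used directly rather than the reduced mass, so the results are approximate; the function name is illustrative.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

def energy_level_eV(n, m=M_E):
    """E = -m k^2 / (2 hbar^2 n^2) with n = lambda + 1, converted to electron volts."""
    k = E_CHARGE ** 2 / (4 * math.pi * EPS0)   # Coulomb force constant (assumption)
    energy_joules = -m * k ** 2 / (2 * HBAR ** 2 * n ** 2)
    return energy_joules / E_CHARGE

for n in (1, 2, 3):
    print(n, energy_level_eV(n))
```

The ground state comes out close to the familiar –13.6 eV, and the higher levels scale as 1/n^{2}.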
The amount of “degeneracy” of each energy level — the number of independent eigenfunctions for a given value of n — can be found in Fock’s approach as the dimension of the space of homogeneous polynomials of degree λ in four variables that satisfy Laplace’s equation. We have previously shown that this is (λ+1)^{2} = n^{2}, the same degeneracy found by the usual methods.
The usual way of classifying the individual states within each energy level is by the quantum number l, where ℏ^{2} l(l+1) is the eigenvalue of L^{2}, and the quantum number m (not to be confused with the particle mass), where mℏ is the eigenvalue of L_{z}. From what we know about the representations of the three-dimensional rotation algebra, m will range in integer steps from –l to l, giving a total of 2l+1 states. By examining the way two independent spins add, it can be shown that because L = A + B, the quantum number l can range anywhere between |λ_{A} – λ_{B}| and λ_{A} + λ_{B}. But λ_{A}=λ_{B}=½λ, so l ranges from 0 to λ. The total number of states is thus:
1 + 3 + 5 + ... + (2λ+1) = (λ+1)^{2} = n^{2}
in agreement with our previous result.
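The same count can be made by listing the states explicitly. This small Python sketch (the function name is illustrative) enumerates the pairs (l, m) allowed for a given λ and compares their number with (λ+1)^{2}.

```python
def states_in_level(lam):
    """All (l, m) pairs with 0 <= l <= lam and -l <= m <= l."""
    return [(l, m) for l in range(lam + 1) for m in range(-l, l + 1)]

for lam in range(4):
    print(lam + 1, len(states_in_level(lam)), (lam + 1) ** 2)
```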
Note that the functions Z_{λ, σ, μ} are eigenfunctions of A^{2} = B^{2}:
A^{2} Z_{λ, σ, μ}(r) = B^{2} Z_{λ, σ, μ}(r) = ℏ^{2} (½λ)(½λ+1) Z_{λ, σ, μ}(r)
and of A_{z} and B_{z}, and hence of L_{z} = A_{z} + B_{z}:
A_{z} Z_{λ, σ, μ}(r) = σ ℏ Z_{λ, σ, μ}(r)
B_{z} Z_{λ, σ, μ}(r) = μ ℏ Z_{λ, σ, μ}(r)
L_{z} Z_{λ, σ, μ}(r) = (σ + μ) ℏ Z_{λ, σ, μ}(r)
but they are not eigenfunctions of L^{2}. So although we can associate λ+1 with the standard quantum number n, and σ + μ with the standard quantum number m, the functions Z_{λ, σ, μ} generally do not have specific values of the standard quantum number l. To obtain functions with definite values of both m and l, we would need to take suitable linear combinations of all the Z_{λ, σ, μ} such that σ + μ = m.
[1] W. Pauli, “Über das Wasserstoffspektrum vom Standpunkt der neuen Quantenmechanik”. Zeitschrift für Physik 36: 336–363 (1926).
[2] Quantum Mechanics by Leonard I. Schiff, McGraw-Hill, 1968. Section 30.
[3] V. Fock, “Zur Theorie des Wasserstoffatoms”. Zeitschrift für Physik 98: 145–154 (1935).
[4] Jonas Karlsson, “The SO(4) symmetry of the hydrogen atom”, University of Minnesota, 14 December 2010. Online as PDF.
[5] Radosław Szmytkowski, “Solution of the momentum-space Schrödinger equation for bound states of the N-dimensional Coulomb problem (revisited)”. Online at arXiv preprint server.
[6] M. Bander and C. Itzykson, “Group theory and the hydrogen atom (I)”. Rev. Mod. Phys. 38: 330–345 (1966).