HILARY. D. BREWSTER
Relativity
Oxford Book Company
Jaipur, India
ISBN: 978-93-80179-03-2
First Edition 2009
Oxford Book Company
267, 10-B-Scheme, Opp. Narayan Niwas,
Gopalpura By Pass Road. Jaipur-302018
Phone: 0141-2594705, Fax: 0141-2597527
e-mail: oxfordbook@sify.com
website: www.oxfordbookcompany.com
Typeset by :
Shivangi Computers
267, 10-B-Scheme, Opp. Narayan Niwas,
Gopalpura By Pass Road, Jaipur-302018
Printed at :
Rajdhani Printers, Delhi
All Rights are Reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, scanning or otherwise, without the prior written permission of the copyright owner.
Responsibility for the facts stated, opinions expressed, conclusions reached and plagiarism, if any,
in this volume is entirely that of the Author, according to whom the matter encompassed in this
book has been originally created/edited and resemblance with any such publication may be
incidental. The Publisher bears no responsibility for them, whatsoever.
Preface
The theory of relativity has become a cornerstone of modern physics.
Over the course of time it has been scrutinized in a multitude of
experiments and has always been verified with high accuracy. The
correctness of this theory can no longer be called into question. After
its discovery by Albert Einstein in 1905, special relativity was accepted only
gradually, because it made numerous predictions that contradicted
common sense (an attachment Einstein fervently castigated) and that eluded
experimental test for a long time. It was only with the advent of particle or
high energy physics that matter could be accelerated to very high
velocities, close to the speed of light, which not only verified special
relativity but also made it a requirement for machine construction.
The book opens with a description of the smooth transition from
Newtonian to Einsteinian behaviour of electrons as their energy is
progressively increased, and this leads directly to the relativistic
expressions for mass, momentum and energy of a particle. The expansion
of the physical research frontier toward astronomy and cosmology during
the past ten to twenty years considerably increased the importance of
special relativity and, above all, general relativity based thereupon.
Since astrophysics has at the same time become very popular among
readers with a scientific background, the two theories of relativity have
attained unprecedented publicity. The fascination that astronomy holds for
children and young people is mentioned here only in passing; it is, however,
one of the most impressive features of schools today. This book offers
a radically reoriented presentation of Einstein's
Theory of Relativity that derives Relativity "from" Newtonian ideas, rather
than "in opposition to" them.
Hilary. D. Brewster
Contents
Preface
1. Introduction to Relativity
2. Relativity Made Simple
3. Space, Time, and Newtonian Physics
4. Minkowskian Geometry
5. Accelerating Reference Frames
6. Energy and Momentum in Relativity
7. Relativity and Gravitational Field
8. Relativity and Curved Spacetime
9. Black Holes
10. The Universe
Index
Chapter 1
Introduction to Relativity
Relativity is a word that is used in a lot of different contexts to mean
a lot of different things; however, in common usage, if you say "relativity,"
people think "Einstein."
Relativity is the idea that the laws of the universe are the same no
matter what direction you are facing, no matter where you are standing,
no matter how fast you are moving. Now, to say the laws of the universe
are the same does not mean everything looks the same. A person standing
in the Mojave desert sees very different things than an astronaut floating
in space, or a diver 300 feet under water. But, they all see the same laws of
physics and mathematics. Gravity always pulls you towards heavy objects.
Objects in motion tend to stay in motion, unless something pushes on them.
Electricity can give you a big shock.
Relativity is nearly always presented as a theory about the speed of
light and black holes. It is also the culmination of a 2400 year intellectual struggle
to identify and understand mankind's place in the universe. In my opinion,
by far the most important consequences of relativity are not the ability of
physicists to calculate an extra decimal place or two, but rather the changes
in philosophy, our understanding of religion and our relationship to
Creation.
For most of western history, people have had their beliefs shaped by
our experience on the Earth: the Earth seems to us to be enormously large
and completely immovable. We jump up and down, throw large rocks,
watch the tides come and go, but the Earth never seems to move.
Historically, the stars were thought to be small objects, perhaps painted
on a backdrop, but in any case visibly orbiting around the centre of the
universe, the Earth. Similarly the Sun and the Moon also orbited around
the Earth. This universe was thought to be fixed and unchanging, as is
only appropriate for a perfect creation of a perfect God. It was only natural
for mankind's original idea to be that the Earth was immobile and at the
centre of this perfect universe.
Meanwhile, Euclid's geometry convinced everyone that mathematics
was the most rigorous of all sciences, and therefore must be central to any
description of the universe. Euclid, as it turned out, assumed that space
was flat.
From about 1500 to 1916, these ideas were slowly examined and found
wanting. Eventually, we found what we now consider to be a far more
fundamental truth: the universe is a really quite large place, and the Earth
is a very tiny little object in the universe. Everything in our universe is
constantly in motion - nothing is fixed and unchanging. The Earth moves
through the universe on a path determined by the laws of gravity, the
very same laws that seem to determine the path of everything else. The
Earth is in no way a special object, nor does it occupy a special position.
This new understanding, that there seems to be no special place in the
universe, no special direction, and no special speed which could be called
"at rest," this understanding is called Relativity.
Just to put this into context, our current understanding is that the
universe contains about 200 billion galaxies, and each galaxy contains
about 200 billion stars. Our particular star, the Sun, seems to be a very
average type of star, in no particular way distinguished from the other 40
thousand billion billion stars in the visible universe. We now know of many
planets orbiting many stars, and seemingly find new ones on almost a
weekly basis. Most of the planets we have found are enormous, imposing
giants like Jupiter or even larger, with hundreds or even thousands of times
the mass of the Earth - by comparison, the Earth is an all but invisible
little rock.
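The star count quoted above can be checked with a line of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative, and the inputs are simply the rough figures given in the text, not precise measurements.

```python
# Rough figures quoted in the text (order-of-magnitude estimates only).
GALAXIES = 200 * 10**9          # about 200 billion galaxies
STARS_PER_GALAXY = 200 * 10**9  # about 200 billion stars per galaxy

# Total number of stars implied by those two figures.
total_stars = GALAXIES * STARS_PER_GALAXY

# "40 thousand billion billion" written out as a product of its words.
forty_thousand_billion_billion = 40 * 10**3 * 10**9 * 10**9

# Print the total in scientific notation for easier reading.
print(f"{total_stars:.1e}")
```

The product of the two estimates agrees with the "40 thousand billion billion" quoted in the text, that is, 4 followed by 22 zeros.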
Relativity is the story of mankind learning that we are not the centre
of creation, at least as far as physics is concerned. This has been a rather
upsetting and ego-deflating lesson.
Although this is the story of relativity, it must be emphasized that
this story is not complete. At the time of this writing, there is no acceptable
quantum theory of gravity. It is widely believed by many, including myself,
that before we can build a quantum theory of gravity, first we need a
quantum theory of relativity.
THE HISTORY OF RELATIVITY
Relativity is the culmination of 2400 years of human thought.
Originally, we thought that the Earth was at rest at the centre of a flat
universe that was constructed with a mathematical plan. 500 years ago,
we figured out that the Earth was actually moving. 100 years ago we figured
out that there is no centre to the universe, there is no place in the universe
that is at rest, and the universe is not flat. We do continue to delude
ourselves into thinking the universe was constructed with a purpose and
a mathematical plan.
Today, we live in a world that is constantly changing - 100 years ago
there were no airplanes and almost no cars. 50 years ago there were no
lasers, no television, nothing we would recognize today as a computer. 25
years ago the Internet was a strange little network occupied by professors
and military people. The computers from 5 years ago are doorstops today.
We are very accustomed to change; in fact, today people almost don't
believe in the idea that things might stay the same.
This is all new. Isaac Newton lived in a culture that believed the height
of man's knowledge and understanding was reached between the time of
Aristotle and Jesus. Newton himself believed that to really understand
something, you had to go back to the original Greek writings of Aristotle,
Euclid, and the Bible. He believed that the modern knowledge of his day
was a corrupted and watered-down version of the original deeper
understandings.
Western culture busied itself with the search for truth. This
philosophical orientation led to the assumption that truth existed, and
that it was perfect and unchanging.
Thus it was an easy step to believe that the Universe and indeed God
himself were perfect and unchanging. Eastern philosophy was oriented
on the search for what is real. Their presumption was that nothing was
fixed and unchanging, that everything has a beginning and an ending.
While this search led to many profound understandings, the search for
what is real does not seem to lead to the development of science and
mathematical logic. This is most curious, as we are now coming to
understand that the universe is indeed a place of constant change, that
nothing is fixed and unchanging, and that the universe itself has both a
beginning and, presumably, an ending.
Our story begins with Aristotle, who compiled a set of ideas about
how the world worked. Aristotle lived in Greece from 384 BC to 322 BC.
Aristotle was primarily a teacher, and believed the highest calling for a
man was to teach. He thought everything was made up of earth, water,
air, and fire.
He thought things had a natural state - earth-like objects wanted to
be at rest, air-like and fire-like objects wanted to rise up. This all apparently
seemed intuitively obvious to him. Aristotle taught that heavier objects
fall faster than light objects, as a rock falls faster than a feather. At the
time, no one actually thought to do experiments and see if any of this was
true - in fact, like his teacher Plato, Aristotle thought that the universe
was guided by the rules of logic and mathematics, and therefore the laws
of the universe could be deduced by logical thought and mathematical
reasoning. Aristotle thought that the Earth was immobile at the centre of
the universe.
Aristotle taught classes and basically invented the subjects of logic,
physics, astronomy, meteorology, zoology, metaphysics, theology,
psychology, politics, economics, and ethics. 250 years after Aristotle's
death, his lecture notes were published by Andronicus of Rhodes. For the
next 1500 years, the great thinkers of Europe and the Islamic world all
traced their roots and primary influences back to Aristotle's teachings.
Euclid of Alexandria lived from 325 BC until about 265 BC. Euclid
compiled many theorems and demonstrations that were already known,
and carefully ordered them into what we now call an axiomatic system.
An axiom is something you take as a given, take on faith as it were. An
axiomatic system is a compilation of things you can conclude, demonstrate,
or prove based on those axioms.
Euclid organized the theorems very carefully, and was able to reduce
his assumptions to five axioms. Euclid's book, The Elements, was and is
considered one of the greatest achievements of mankind. The beauty of
his work is that once you accept the five axioms, you are led inexorably
to his results. This was the first great work of logical reasoning. It has
captivated many young thinkers over the ages, and motivated them to try
to produce such a work of their own. It also convinced many people that
this beauty was an element of the thoughts of God; that it must be the
case that the entire universe is governed by a similar set of laws and their
logical consequences. Physics is nothing more or less than the search for
these laws.
Euclid's five axioms are:
• A straight line segment can be drawn joining any two points.
• Any straight line segment can be extended indefinitely in a
straight line.
• Given any straight line segment, a circle can be drawn having
the segment as radius and one endpoint as centre.
• All right angles are congruent.
• If two lines are drawn which intersect a third in such a way that
the sum of the inner angles on one side is less than two right
angles, then the two lines inevitably must intersect each other
on that side if extended far enough.
The first three of Euclid's axioms are now called axioms of
construction, that is, they tell you that you can build certain things. The
fourth axiom is now recognized as being of a different nature - the axiom
that all right angles are equivalent is the same as assuming that space is
the same everywhere, and if you move a right angle around, it stays a
right angle.
For 2,000 years it was recognized and agonized over that the first
four axioms seem completely obvious and very simple, and the fifth axiom
seems by contrast complicated and artificial. Many mathematicians spent
substantially their entire careers trying to prove the fifth axiom was a
logical consequence of the first four. All such attempts failed, for a simple
reason which is now well understood. We now know that the fifth axiom
is equivalent to the axiom that space is flat.
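One way to see what "flat" means here is to compare triangles. In flat space the angles of a triangle add up to exactly 180 degrees; on a curved surface such as a sphere they add up to more. The sketch below is an illustration only, not something taken from Euclid: it assumes a sphere of radius 1 and an equilateral triangle, and it uses the standard spherical law of cosines. The function name is made up for this example.

```python
import math


def equilateral_angle_sum_deg(side):
    """Angle sum, in degrees, of an equilateral triangle on a unit sphere.

    `side` is the length of each side measured along the sphere, in
    radians, and is assumed to lie between 0 and 2*pi/3.
    """
    cos_a = math.cos(side)
    # Spherical law of cosines with all three sides equal:
    #   cos(a) = cos(a)^2 + sin(a)^2 * cos(A)
    # which simplifies to cos(A) = cos(a) / (1 + cos(a)).
    corner = math.acos(cos_a / (1.0 + cos_a))
    return 3.0 * math.degrees(corner)


# A tiny triangle is almost flat; a huge one is visibly curved.
small = equilateral_angle_sum_deg(0.001)
octant = equilateral_angle_sum_deg(math.pi / 2)
print(small, octant)
```

A very small triangle has an angle sum barely above 180 degrees, which is why a surveyor's triangle looks Euclidean. A triangle covering one eighth of the sphere has three right angles, for a sum of 270 degrees.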
When Europe became interested in trying to climb out of the dark
ages, the idea of learning and knowledge also became interesting. At this
time, education meant learning Greek and Latin, and studying the Bible,
Aristotle, and Euclid. It was considered that this was all of knowledge. It
was also against this backdrop that the basic ideas of relativity had to
evolve - according to Aristotle, the earth was in a special place, and material
objects wanted to be at rest in this same special place. According to the
interpretations of the Bible at the time, the Earth was a special part of
God's creation, the very centre and purpose of existence.
So, the ideas that things were the same everywhere, that the stars were
just like the Sun and the Sun was just like the stars, and that the Earth did
not define and occupy the perfect centre of creation, but rather was just
another part of creation, moving around and subject to the same forces
as everything else, these ideas were very controversial and not very
welcome. The people who promoted these ideas were similarly
controversial and not very welcome.
The Greeks noticed that there were a few "stars" with rather peculiar
habits - these stars moved forwards and backwards and then forwards
again against the backdrop of the fixed stars. The Greeks called these
moving stars "wanderers," or "planets" in Greek. Over the centuries,
people continued to observe the planets and chart their courses - it became
very fashionable to try to find a way to predict the positions of the planets.
For most of this time, the prevailing religious view was that God's creation,
the universe, was perfect, and the only perfect shape was the circle,
therefore planets must move in circles. It proved impossible to describe
the positions of the planets by assuming they moved in circles around the
Earth.
As observations got better and better, people invented more and more
complicated systems of planets moving in circles, which moved on greater
circles, which moved on even greater and greater circles. However,
although these systems of circles revolving in circles in circles in circles
could be made to predict the planets' positions reasonably well, the
complexity of these artificial systems made a joke of the "perfection" and
"simple beauty" of the circle.
Nicolaus Copernicus, a Polish economist and scientist who lived from
1473 to 1543, was the first person to openly challenge the Aristotelian
view. In 1513, while in Italy, he published a short paper saying it was the
Sun that was at rest at the centre of the universe, and everything else
including the Earth moved around the Sun. Copernicus said the planets
moved in circles around the Sun, as did the Earth. Using this simple system,
he was able to predict the orbits of the planets with great precision, but at
a great cost: the Earth had to be moved out of the centre of the universe.
This was a great leap forward, philosophically: the Earth was now
considered a moving object, and not a special object at a special place in
the universe. This idea was a threat to the established church, since it also
effectively removed Earth from being the central object of Creation, and
called into question the idea that mankind was God's greatest creation.
Copernicus unintentionally started a conflict between religion and science,
a conflict which unfortunately persists to this day.
Tycho Brahe lived and worked in Denmark from his birth in 1546,
and died in 1601 in the Czech Republic. On 11 November 1572, in the
early evening, he saw a new star in the constellation of Cassiopeia, almost
directly overhead. This was remarkable, because the fixed, unchanging
stars had changed. This star is now called "Tycho's Supernova." With
funding from the King of Denmark, Brahe set up an observatory and made
the most accurate observations of the planets up to his time. Brahe was
aware of Copernicus' theory, but did not like it. He invented a competing
theory, where the Earth was at the centre of the universe, and the moon,
Sun, and stars revolved around the Earth. In his system, the other
planets revolved around the Sun. Thus, in his mind, he melded the most
important ideas of the day: the planets all moved in circles, and the Earth
was the centre of creation again. But, the idea of an unchanging perfect
universe was gone forever. Brahe also set the stage for more and more
precise measurements of our universe, and for the requirement that theories should
agree with measurements.
Johannes Kepler was born in Germany in 1571 and lived until 1630.
When Kepler was a young man, he was hired by Brahe as a mathematician.
Thus Kepler had immediate access to the most precise astronomical
observations of the day.
Kepler, like most scientists of his day, was convinced that God had
made the Universe according to a mathematical plan, and that
mathematics was therefore a strategy for understanding the universe. He
spent most of his career calculating the orbits of the planets - for example,
it took him nearly 1,000 sheets of paper to calculate the orbit of Mars. It
was Kepler who first realized that the planets travel in ellipses, not in circles.
Kepler also pointed out that Venus and Mercury are always seen near the
Sun, which makes perfect sense if they orbit the Sun, but no sense if they
orbit the Earth. Kepler also observed a supernova of his own in 1604,
providing more evidence that the universe was a place of change.
Galileo Galilei lived in Italy from 1564 to 1642. Galileo studied the
works of Copernicus, and was a great believer in his views. In 1609, Galileo
heard of a Dutch invention, the spyglass, and quickly made his own
version, the first telescope. With his telescope, he discovered the moons
of Jupiter, and he saw that Venus had phases like the moon. This proved
that Venus orbited the sun, not the Earth.
Galileo also noticed that the moons of Jupiter orbited with a fixed
period, just as our moon makes a complete revolution around the Earth
every 28 days.
Decades later, in 1676, the Danish astronomer Ole Rømer noticed that over a few months
predictions of the orbits of Jupiter's moons drifted off by ten to fifteen minutes. Rømer
correctly realized that this was because when the Earth is on the far side
of the Sun from Jupiter, the light from the moons takes extra time to reach
us over the extra distance. From this effect, it became possible to make a
quite good estimate of the speed of light. The idea that light traveled at a
finite speed, not instantaneously, was very new.
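The light-travel-time reasoning above can be turned into a rough number. This is a sketch under stated assumptions, not a reconstruction of the historical calculation: it assumes the "extra distance" is about the diameter of the Earth's orbit (2 astronomical units), and it uses the modern value of the astronomical unit, which is not given in the text. The function name is hypothetical.

```python
# Modern value of the astronomical unit in metres (an assumption; the
# text gives no distances).
AU_METERS = 1.495978707e11


def speed_of_light_estimate(delay_minutes, extra_distance_au=2.0):
    """Speed estimate in m/s: extra distance divided by the extra time.

    `extra_distance_au` defaults to 2 AU, roughly the diameter of the
    Earth's orbit, which is assumed to be the extra path length.
    """
    extra_distance_m = extra_distance_au * AU_METERS
    delay_seconds = delay_minutes * 60.0
    return extra_distance_m / delay_seconds


# The range of delays quoted in the text.
for minutes in (10.0, 15.0):
    print(minutes, speed_of_light_estimate(minutes))
```

With the delays quoted in the text, the estimate comes out at a few hundred million metres per second, the right order of magnitude for the speed of light.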
At the time, the views of Copernicus were quite controversial. Galileo
had the habit of not only supporting these views, but trying to make his
opponents look like fools, a habit which did not serve him particularly
well. In 1616, Galileo was subjected to the Inquisition, and given a secret
warning to recant his Copernican views.
In 1632, Galileo published his famous Dialogue Concerning the Two
Chief World Systems, which only got him into further trouble, first for
continuing to support Copernicus' views, and second for continuing to
try to make his opponents look like fools. Galileo was summoned to Rome,
where he was found to be "vehemently suspected of heresy", and forced
to recant his Copernican views and sentenced to house arrest for life. He
is reputed to have muttered under his breath as he left the Inquisition,
"Nevertheless, it still moves."
Galileo set the stage for modern physics by noticing that all things
fall at the same rate without regard for their composition. Aristotle had
taught that heavier things fall faster. Galileo realized that by rolling things
down a ramp, he could slow the effects of gravity and make more careful
measurements. In this fashion Galileo collected the data that Newton
would use to create his great theory of gravity.
Sir Isaac Newton lived in England from 1643 to 1727. Newton took
Galileo's work and put it into a mathematical form. Newton, as one of
his axioms, said objects at rest tend to stay at rest, and objects in motion
tend to stay in motion. This is in complete disagreement with Aristotle's
view that moving objects tend towards their natural state of being at rest
on the Earth. Newton codified Galileo's views on systems in mathematics
- the idea that the laws of physics are the same no matter what direction
you face, no matter where you stand, no matter how fast you move. In
codifying these ideas, Newton made a distinction between moving at a
constant velocity, and accelerating.
Newton claimed that the laws of physics were valid so long as you
were moving at a constant velocity, but not if you were accelerating. These
ideas are today called "Galilean Relativity." Although Newton
undoubtedly knew of the speed of light, he did not think it was in any
way special, and he thought that you could go as fast as you like, if you
had the means to propel yourself.
Newton once said, "If I have seen far, it is because I have stood on
the shoulders of giants." Here we have seen something of the giants whose
shoulders he used. The mathematical abilities and physical intuition of
Newton truly stand out in history and mark him as perhaps the most
prominent physicist who ever lived. However, just as he indicated, much
of the philosophical "heavy lifting" had already been done. The notion
that to stand still on the Earth was to be perfectly at rest in the precise
centre of God's perfect, unchanging creation was painful to give up. It
took at least four noteworthy geniuses 150 years to set an appropriate
stage for Newton's great work.
Johann Carl Friedrich Gauss lived in Germany from 1777 until 1855.
He was one of the greatest mathematicians who ever lived, and made many
important contributions to physics, also. Gauss considered non-Euclidean
geometry, also called curved spaces, and worked out much of the math,
but never published anything on the topic. He was afraid of the backlash
from his church and the community. However, he thought a lot about these
issues, and at one point actually measured the angles in a triangle about 5
miles on a side in an attempt to determine if space was flat or curved.
Today, we think Gauss and his student Riemann came very close to
developing General Relativity.
Michael Faraday lived in England from 1791 until 1867. By today's
standards, Faraday had little in the way of formal education or
mathematical training or abilities, which makes his scientific
accomplishment all the more impressive.
When Faraday was young, he was apprenticed to a book binder, and
he used all his spare time educating himself by reading the scientific books
lying around. Later, Faraday got himself a job as an assistant in a lab
where he worked ceaselessly. Faraday almost single-handedly worked out
the laws of the relationship between electricity and magnetism. Today,
it's Faraday's work that tells us how to make electric motors and
generators, the basic foundation of our entire technological civilization.
Georg Riemann lived in Germany from 1826 until 1866. He was
Gauss' last and most famous student. To finish his doctoral degree, in
1854 Riemann was required to give a lecture. Gauss asked him to lecture
on geometry. In this single lecture, Riemann laid almost all of the
mathematical foundation for Einstein's General Relativity. Riemann went
on to become an important mathematician, but did little work on curved
spaces after this one lecture. Riemann was not a Catholic, and therefore
was completely unconcerned about any backlash his lecture might
generate.
James Clerk Maxwell lived in Scotland and England from 1831 until 1879. Maxwell
took the works of several other physicists including Coulomb, Ampere,
Gauss, and especially Faraday, and brought them together into one set of
equations that described all electric and magnetic phenomena. Maxwell's
equations are the first, and by far the most successful example of what we
now call a Unified Field Theory. Prior to Maxwell it was thought that
electricity, magnetism, and light were all completely unrelated. Maxwell
showed that these phenomena are all different aspects of the same thing.
Maxwell's equations made a very peculiar prediction: with his
equations, he was able to calculate the speed of light. This is strange
because up until that time it was thought that velocities simply added: if
you were on a train moving at 100 miles per hour, and you turned on a
flashlight pointing forwards, the light from that flashlight would go 100
miles per hour faster than light from a flashlight that was not moving. So,
according to the physics of Newton, a theory should not predict the speed
of light, since that speed should depend on how fast the observer and the
source were moving. No one really knew quite what to make of this, but
the longer people thought about it the more it bothered them.
At this time, people thought that since light is a wave, there must be
something that is waving. Waves in the ocean are moving water. So light
waves must be moving something. This something was given the name
the Luminiferous Ether. From about 1865 until about 1920 people
searched diligently for this Ether, but no one could ever find any. Today,
we simply accept that light propagates, and we don't worry about what is
moving.
We also know today that light travels in our universe at an absolute
velocity - no matter how you produce it, no matter how you are moving
when you see it, you always see light moving at the same speed. This is
the single most important point of Einstein's relativity. This is also the
reason why Maxwell's equations were able to predict a particular speed
for light - Maxwell's equations turned out to be fully compatible with
Einstein's relativity, even though Maxwell wrote his equations down 40
years before Einstein developed Special Relativity.
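The train-and-flashlight example can be put into numbers. The sketch below compares the simple addition of velocities that Newtonian physics expects with the velocity-addition formula of Special Relativity, w = (u + v) / (1 + uv/c^2). That formula is a standard result, but it has not been derived in this chapter, so it is assumed here; the function names and the unit conversion constant are also choices made for this illustration.

```python
C = 299_792_458.0      # speed of light in metres per second
MPH_TO_MPS = 0.44704   # one mile per hour expressed in metres per second


def galilean_add(u, v):
    """Newtonian expectation: velocities simply add."""
    return u + v


def relativistic_add(u, v):
    """Special-relativity velocity addition (assumed, not derived here)."""
    return (u + v) / (1.0 + u * v / C**2)


# A train moving at 100 miles per hour, as in the text.
train = 100.0 * MPH_TO_MPS

# Light from a flashlight on the train, as seen from the ground.
print(galilean_add(train, C) - C)      # Newton: faster than c by the train's speed
print(relativistic_add(train, C) - C)  # Einstein: no measurable difference from c
```

Under simple addition the flashlight beam would outrun ordinary light by the speed of the train. With the relativistic formula the result is the speed of light again, whatever speed the train has, which is the behaviour described in the paragraph above.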
Maxwell knew of the rings of Saturn, and wondered what they were.
He was able to show rather quickly that the rings could not be a solid
disk, because the gravitational forces on a large solid disk would either
tear the ring apart or make it crash into Saturn. Next he considered the
idea that the rings were liquid, but again he was able to show that liquid
rings could not orbit Saturn in a stable fashion. Lastly, he considered the
possibility that the rings were made of dust, that is uncounted trillions of
tiny individual particles.
This system turned out to be stable, and today we know that Saturn's
rings are indeed made up of tiny little ice chips, grains of sand, and small
rocks. This was one of the first times that it was shown that continuous
systems can have stability problems, but quantized systems can work in
the same circumstances. Today, all of our theories suffer from stability
problems brought on by the continuum, and a need for a theory of
quantized space and time is becoming more and more apparent.
Ernst Mach was an Austrian physicist and philosopher who lived from
1838 until 1916. Mach did important work on sound, so the speed of sound
is called Mach 1. He also criticized the existing physical theories of his
day on several grounds.
Mach said that since all we ever know of the universe comes to us
through our senses, our theories should speak only of things which can be
observed and measured. Mach also asserted that all phenomena in our
universe must have causes from within our universe. He proposed Mach's
principle, which is that inertia, the tendency of a body in motion to stay
in motion and to resist accelerations, must be a result of interactions with
the other matter in the universe.
For example, if you whirl a bucket full of water about your head, the
surface of the water will assume a curved shape. But, how does the water
know to do that? How does the bucket know that it is accelerating in a
circle, and you are standing still? From the bucket's point of view, it seems
equally reasonable to assert that the bucket is just hanging out, and you
are whirling around the bucket.
Mach's answer was that the key difference between you and the bucket
is that you see the distant stars in the sky standing still, and the bucket
sees the stars whirling about. Therefore, the surface of the water curves
due to some interaction between the water and the distant stars. Mach
therefore claimed that if there was nothing in the universe but you and
the bucket, and you whirled the bucket about your head, the water would
not curve. Mach further claimed that if you could spin the entire universe
about the bucket, the water surface would curve. Unfortunately, we do
not currently know how to test either of these very interesting ideas.
There is a logical problem with Newton's physical explanation of the
water curving. We say the water surface curves because the bucket is
accelerating due to an external force - you're pulling on the bucket handle.
But, you may ask, how does the water know it's being pulled? According
to Newton's theories, the answer is that the water knows it's accelerating
because its surface is curving, which is a sure sign of acceleration and forces.
So, in the end, all we can really say is that the water surface is curving
because it is curving. Not very satisfying. While physics can tell us with
enormous accuracy precisely how much the water surface will curve,
physics is an almost total failure at telling us why it will curve. Mach did
not like this in the slightest.
The idea that things which happen in the universe must have causes
from within the universe seems so obvious as to be tautological. However,
in spite of this, many of our natural assumptions and theories fail this
criterion. For example, we consider inertia and electric charge to be
properties of material objects with names but no causes.
Perhaps more critically, we all know that time marches forwards
inexorably, but we consider this most pervasive of effects to be without
cause - it just is. Of course, the idea that everything has a cause is just an
idea, and may be wrong. Or it may truly be that some things were simply
chosen by God, so that their cause is not within our universe. However, it
seems to me and most scientists that we should continue to try to explain
what happens in our universe strictly in terms of things which are already
in the universe until we're quite clear that this approach simply isn't
working.
Einstein cited Mach as one of his primary inspirations. Hendrik
Lorentz was a Dutch physicist who lived from 1853 until 1928. Lorentz
worked out the basic mathematics of Special Relativity, which is now called
the Lorentz transform. Einstein used the formulas invented by Lorentz to
develop his theories.
Jules Henri Poincaré was a French physicist who lived from 1854 to
1912. His theories of coordinate transformations were also instrumental
in the development of Special Relativity. Today, we consider that Special
Relativity was simultaneously discovered by Lorentz, Poincaré, and
Einstein.
Albert Einstein was born on March 14th, 1879 in Ulm, Germany and
died on April 18th, 1955 in Princeton, USA. Einstein worked only in the
field of theoretical physics because of "my disposition for abstract and
mathematical thought, and my lack of imagination and practical ability."
Einstein created two theories of relativity, which he called the Special
Theory of Relativity, and the General Theory of Relativity. Einstein said
he started working on his theory of relativity when he was 16. He had just
learned about Maxwell's equations and their predictions of the speed of
light. Einstein liked to perform what he called "thought experiments", in
his native German "Gedanken experiments." Here are two he thought up
when he was 16.
First, he imagined himself holding a mirror in his hand at arm's length,
and looking at his own reflection. Then, he imagined starting to run faster
and faster, until he was running at the speed of light. Would he be able to
see his own reflection? Second, he imagined a light ray zooming past him,
and he ran to catch up with it. What would the light ray look like when
he was running right alongside of it?
Ten years later, in 1905, he figured out the answers to these two
questions. You cannot run at the speed of light, and at any lesser speed,
you simply see your own reflection. If you could run at the speed of light,
the light ray would look like an electric field which changes in space but
not in time. This is impossible according to Maxwell's equations, but you
cannot run at the speed of light so you can never see such a thing.
We have seen that originally, people thought that the Earth was
precisely at rest precisely in the centre of the universe. This belief, of course,
would mean that we would have a very hard time figuring out what the
laws of physics would look like if we were somewhere far from the centre
of the universe traveling at some high rate of speed. From about 1500
until about 1680, several very smart people figured out what we now call
Galilean Relativity, which is that the laws of physics are the same for
anyone anywhere, traveling at any speed, so long as they were traveling
at a constant speed and not accelerating.
Einstein's first contribution to relativity was to add to this: the speed
of light is a constant, and does not depend on where you are or how fast
you are moving. Once he understood this very counter-intuitive fact, he
was able to quickly work out the laws of Special Relativity, which he
published in 1905. Maxwell's equations were already compatible with
Special Relativity, however Newton's equations were not. So, Einstein had
to reformulate Newtonian Mechanics so as to be consistent with Special
Relativity.
Special Relativity is the idea that the laws of physics are the same
everywhere in the universe, no matter where you are, no matter how fast
you are moving, so long as you are not accelerating, and one of the laws
of physics is that light always travels at the same speed.
Very shortly after publishing Special Relativity, Einstein realized that
the theory did not actually apply anywhere in our universe. The reason is
very simple to understand: we live in a universe filled with thousands of
billions of billions of planets and stars.
No matter where you go in the universe, you are being pulled in some
direction by gravity. So, Einstein realized, there is actually no such thing
as a place or a person who is not accelerating, because there is no such
thing as a place or a person who is not being pulled on by gravity. Einstein
mulled over this idea for another 11 years - one might almost think he
was a bit slow. In 1907, two years after he had published Special Relativity,
Einstein walked home from his job as a patent examiner in the Swiss Patent
office, and said to his wife, "I had today the happiest thought of my entire
life. I realized that a man freely falling in an elevator does not feel his own
weight." Einstein had discovered a new axiom, that gravitational mass
was equivalent to inertial mass.
This was already known as a measured fact, but was not understood.
Einstein decided to elevate this fact from a curious coincidence to a
principle, which he called the principle of equivalence. What Einstein had
decided was that you could not tell the difference between the force of
gravity and some other kind of force.
For example, if you were in an elevator with the doors closed and it
was freely falling, you could not tell if you were close to the Earth or
floating in space very far away from any other mass. Alternatively, if the
elevator were sitting on the ground, you would feel your weight, but you
could not tell if you were sitting on the Earth, or if the elevator were being
pushed by a rocket motor with an acceleration of precisely one g, the
acceleration of gravity on the Earth.
What Einstein had decided was that the closest thing there was to an
inertial frame, that is moving at a constant velocity without acceleration,
was to be freely falling. However, there are limits to this. Imagine you
had an elevator which was 8,000 miles wide, and it was falling near the
Earth. The Earth itself is only about 8,000 miles across, so clearly gravity
will pull you straight down in the middle of the elevator, but at each end
of the elevator you'll be pulled towards the middle. So, there's a limit to
how big the elevator can be and still represent a freely falling frame.
Also, after a little while, the elevator will hit the Earth - this will be a
big clue to the people inside that something has changed. So, there's a
limit to how long this situation can last and still represent a freely falling
frame.
Fig. A Very Large Elevator Freely Falling Towards the Earth
Einstein realized that this was a lot like a curved space. If you're on a
ship on the ocean, it looks like the Earth is flat. But, if there's another
nearby ship and it sails away, after a time the ship seems to sink below the
horizon. So, there's a limit to how big an area you can look at on the
Earth and convince yourself that the Earth is flat.
Einstein spent much of the next 8 years learning the math invented
by Riemann (invented in one week of work for a single afternoon lecture,
which gives you an idea of their relative abilities at mathematics), and
finally was able in 1916 to publish his General Theory of Relativity, which
says that there is no centre to the universe and nothing is at rest. The only
places in the universe that seem to be moving at a constant velocity are
small areas that only last for a short time.
Is this the end of the story of Relativity? Well, yes and no. This brings
us up to our best current understanding. However, many people are
dissatisfied with the current situation, as was Einstein himself. There are
a lot of pretty strong hints that we're still missing several important ideas.
Einstein noted that gravitational waves travel at the same speed as light waves.
He could not believe this was a coincidence, and felt strongly that this
indicated there was a link between gravity and electro-magnetism.
It is widely believed that we should somehow be able to make a
quantum theory of gravity, but non-stop efforts from 1930 until 2004 have
failed to produce a working theory. We have managed to prove that the
techniques we currently use to build quantum field theories will not work
for gravity, and we have no clue what techniques will work. All we're really
certain of is that we have a lot left to learn.
THE SPEED OF LIGHT
In 1862, Maxwell calculated the speed of light. This seemed a very
strange result at the time. In Galilean relativity, the speed of light should
depend on the speed of the observer and the speed of the source.
Physicists thought about this for some time, and came up with an
explanation: they decided that all of space was filled with a substance,
which they called the Luminiferous Ether. Light was then thought to be
a disturbance of this ether, just as water waves are a disturbance of the
surface of the water. This idea also explained the prediction of the speed
of light - the speed of light was thought to be relative to this ether.
In 1887, Albert Michelson decided to try to prove the existence of
this ether substance. He noticed that as the Earth revolves around the
sun, the Earth travels through space at about 20 miles per second, about
70,000 mph. The speed of light was known to be about 186,000 miles per
second, so the orbital speed of the Earth is about 0.01% of the speed of light.
Michelson decided he should be able to detect this - he would compare
the travel time of two beams of light, one which traveled along the direction
of the Earth's orbit, and a second beam which traveled sideways compared
to Earth's orbit.
Today, we would also add in that the Sun is revolving around the
centre of the Galaxy, with an orbital velocity of about 200 miles per second.
This is about 0.1% of the speed of light. So, actually, Michelson's experiment
was about ten times more sensitive than he knew.
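The two percentages above are simple ratios, and a short Python sketch can check them. It uses only the round figures quoted in the text (186,000 miles per second for light, 20 and 200 miles per second for the two orbital speeds); the function name is made up for this illustration.

```python
LIGHT_SPEED_MILES_PER_S = 186_000  # round figure quoted in the text

def percent_of_light_speed(speed_miles_per_s):
    """Express a speed in miles per second as a percentage of light speed."""
    return 100 * speed_miles_per_s / LIGHT_SPEED_MILES_PER_S

for label, speed in [("Earth around the Sun", 20), ("Sun around the Galaxy", 200)]:
    # 3600 seconds per hour converts miles per second to miles per hour.
    print(f"{label}: {percent_of_light_speed(speed):.3f}% of c, {speed * 3600:,} mph")
```

With these inputs the two ratios come out near 0.011% and 0.108%, which match the rounded 0.01% and 0.1% used above.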
Michelson's daughter says he explained his experiment to her like this:
Suppose we have a river 100 feet wide flowing at 3 feet per second,
and two swimmers who both swim at 5 feet per second. The swimmers
have a race. One swims upstream 100 feet, then swims back to the start.
The other swims directly across the river, then turns around and swims
back. Who wins?
The swimmer going upstream is easiest to analyse. Going against the
current, the swimmer makes only 5-3=2 feet per second, so the 100 feet
takes 50 seconds. Coming back he's going 5+3=8 feet per second, so it
takes him 12.5 seconds. His total time is 62.5 seconds for the 200 foot
swim.
The swimmer going across the river has a different job. As he swims
across the flow at 5 feet per second, the river is carrying him downstream
at 3 feet per second. So, he has to swim at an angle in order to make it
straight across the river. His swimming speed of 5 is the hypotenuse of a 3,4,5
triangle and the current is the side of length 3, so his net speed straight
across is the remaining side, 4 feet per second. He swims the 100 foot width in 25
seconds, then takes another 25 seconds to swim back, for a total time of
50 seconds for the 200 foot swim. So, this swimmer wins.
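The race can be checked with a few lines of Python. This is only a sketch of the arithmetic in the story; the function and variable names are invented for the illustration.

```python
import math

def round_trip_times(width_ft, swim_fps, flow_fps):
    """Return (along-stream, cross-stream) round-trip times in seconds."""
    # Upstream the current subtracts from the swimmer's speed,
    # downstream it adds to it.
    along = width_ft / (swim_fps - flow_fps) + width_ft / (swim_fps + flow_fps)
    # Crossing: part of the swimming speed is spent cancelling the current,
    # leaving sqrt(swim^2 - flow^2) of net speed straight across.
    net_across = math.sqrt(swim_fps**2 - flow_fps**2)
    across = 2 * width_ft / net_across
    return along, across

along_time, across_time = round_trip_times(100.0, 5.0, 3.0)
print(along_time, across_time)  # 62.5 50.0
```

Changing the flow to zero makes the two times equal, which is the situation Michelson expected to rule out.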
Fig. A Swimmer in a River. (Labels: river flow, 3 feet per second; 5 fps swimming; 3 fps against flow.)
Michelson realized that this exact same argument should apply to light
moving along and against the flow of the Ether, and across the flow of
the Ether. The math is pretty much the same as swimming across the
stream. Let's suppose one light beam travels with and against the Earth's
motion over a distance L each way. The other light beam travels normal
(sideways) to the Earth's motion a distance L each way. The Earth is
moving at a velocity v around the Sun.
The time required for the first beam is L, the length the light travels,
divided by the speed of the light beam. Michelson figured the speed would
be c + v going one way, and c - v going the other.
$$T_1 = \frac{L}{c + v} + \frac{L}{c - v} = \frac{2cL}{c^2 - v^2} = \frac{2L}{c\,(1 - v^2/c^2)}$$
The light beam going normal to the Earth's motion is following a
path which is like the vertical leg of the triangle above. The three legs of
the triangle now have length c (replaces 5), v (replaces 3), and $\sqrt{c^2 - v^2}$
(replacing 4), so its travel time is
$$T_2 = \frac{2L}{c\sqrt{1 - v^2/c^2}}$$
Fig. The Light that Travels Normal to the Earth's Velocity. (Labels: Earth's motion around the sun, velocity v; v against Earth's motion; net speed = sqrt(c*c - v*v).)
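To get a feel for how small the expected effect is, here is a rough Python sketch of the two travel times under the ether picture. The speeds are the round figures from the text; the arm length of one mile is a made-up value chosen only to have a number to plug in, and the function name is invented for the illustration.

```python
import math

C_MILES_PER_S = 186_000.0  # speed of light, round figure from the text
V_MILES_PER_S = 20.0       # Earth's orbital speed, round figure from the text

def ether_travel_times(arm_length, c, v):
    """Round-trip times predicted by the ether picture: (along, across)."""
    # Along the motion: c + v one way, c - v the other.
    t_along = arm_length / (c + v) + arm_length / (c - v)
    # Across the motion: net speed is sqrt(c^2 - v^2) both ways.
    t_across = 2 * arm_length / math.sqrt(c**2 - v**2)
    return t_along, t_across

# Hypothetical arm length of 1 mile, chosen only for illustration.
t1, t2 = ether_travel_times(1.0, C_MILES_PER_S, V_MILES_PER_S)
print("fractional difference:", (t1 - t2) / t2)
```

The fractional difference is roughly half of v squared over c squared, a few parts in a billion, which is why the measurement needed such a delicate instrument.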
Michelson set up an experiment and tried it. He found no effect - the
two beams of light took exactly the same amount of time. Michelson's
experiment was mounted on a big bearing. What he actually did was rotate
the entire table as he was measuring, looking for a direction where the
light took longer to travel in one direction than in the other. He never
found such a direction - the light always took exactly the same amount of
time to travel down either leg.
Of course, Michelson immediately realized that perhaps the Ether was
moving compared to the Sun, and on the day of his experiment the Earth
happened to be precisely standing still in the Ether. So, he repeated his
experiments every couple of months for a year, and continued to find no
effect. Michelson also wondered if perhaps the Earth was dragging the
Ether around with it, so he repeated his experiment on top of a mountain.
Again, no effect. Michelson also reasoned that if the Ether was being
dragged along with the Earth, we should see the apparent position of stars
move, depending on the angle their light entered the Earth's ether. No
such effect has ever been observed.
Everyone found this result very confusing - as we have seen, the time
taken by the light should be longer when the beam is aligned with the
Earth's motion than when the beam is at 90° to the Earth's motion. The
understanding of the day was that the velocity of the Earth's motion should
add and subtract from the speed of light, depending on the direction of
the light. But, what was found was that the speed of light seemed to never
change.
Hendrik Lorentz and George FitzGerald analyzed the Michelson-
Morley experiment. They decided to postulate that when something is
moving, it shrinks in the direction it is moving. This effect is called the
Lorentz-FitzGerald contraction. This contraction can be calculated to
exactly compensate for the velocity of the Earth. In other words, Lorentz
and FitzGerald decided that the reason the beam aligned with the Earth's
motion took the same time as the other beam was that it had a slightly
shorter distance to traverse. So, they decided that Michelson's table shrunk
in the direction of the Earth's motion, by exactly the right amount so that
the two beams of light tied in their race.
We see immediately that if we were to multiply $T_1$ by $\sqrt{1 - v^2/c^2}$,
then these equations give the same result. So, Lorentz and FitzGerald
decided to assume that the table had contracted by this factor, $\sqrt{1 - v^2/c^2}$,
in the direction of Earth's motion. Then the math gave the right answer,
which is that the travel time is the same in both directions.
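The claim that the contraction exactly cancels the difference can be checked numerically. The sketch below shortens only the arm that points along the motion and compares the two round-trip times; the function names and the sample values are assumptions made for this illustration.

```python
import math

def along_time_with_contraction(arm_length, c, v):
    """Round trip along the motion, with the arm shrunk by sqrt(1 - v^2/c^2)."""
    shrink = math.sqrt(1 - (v / c) ** 2)
    contracted = arm_length * shrink
    return contracted / (c + v) + contracted / (c - v)

def across_time(arm_length, c, v):
    """Round trip across the motion; this arm is not contracted."""
    return 2 * arm_length / math.sqrt(c**2 - v**2)

# Sample values: the text's round figures, and an extreme case at half light speed.
for c, v in [(186_000.0, 20.0), (1.0, 0.5)]:
    print(along_time_with_contraction(1.0, c, v), across_time(1.0, c, v))
```

Each pair of printed times should agree to within floating-point rounding, at any speed below that of light.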
What does this mean? To see this, we're going to learn how to make
what are called Space-Time diagrams. This is an ordinary graph, but with
time as the vertical axis. It's very inconvenient to label a graph with seconds
running up the vertical axis, and units of 186,000 miles on the horizontal
axis, so we're going to work in different units. We'll measure distance in
feet, and time in nanoseconds. One nanosecond is one billionth of a second.
One billion nanoseconds is one second. The speed of light is almost
precisely one foot per nanosecond. That is, 186,000 miles times 5280 feet
per mile is almost exactly one billion feet. On a space-time diagram, light
is always drawn at a 45° angle, because light always moves at 1 foot per
nanosecond.
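The "one foot per nanosecond" rule of thumb is easy to check from the round figure of 186,000 miles per second. The constant names below are just labels for this sketch.

```python
MILES_PER_SECOND = 186_000          # round figure for light speed from the text
FEET_PER_MILE = 5_280
NANOSECONDS_PER_SECOND = 1_000_000_000

feet_per_second = MILES_PER_SECOND * FEET_PER_MILE
feet_per_nanosecond = feet_per_second / NANOSECONDS_PER_SECOND

print(feet_per_second)      # 982080000
print(feet_per_nanosecond)  # 0.98208
```

So the rule is good to about two percent, which is plenty for drawing diagrams.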
By the way, you could ask "Why does light always move at one foot
per nanosecond?" The answer is, we have not even the slightest clue. It
just does. We've checked this a zillion times in as many ways as we can
think of for over 125 years now, and it's always been true. That's why
Einstein took this as an axiom. He could not prove it, he could not justify
it, he could not even motivate it. He could only say that this seems to be
true in our universe, so let's assume it's always true and see what the
ramifications are. Well, while this is a very important and interesting result,
it's a trick from our current perspective. This light is traveling through a
very peculiar medium in a very peculiar fashion. When we speak of the
speed of light being a constant, we mean in a vacuum. If there are atoms
nearby, the light can interact with the electrons and protons and start doing
strange quantum things. It's these strange quantum things that cause
rainbows and make lenses work and make the sky blue and make your
eyes work. But, we're not studying quantum things in this book, so we're
just going to think about the vacuum.
The figure below is a space-time diagram, with a bunch of things drawn in it.
Remember, the horizontal (X) axis is position, and the vertical (Y) axis is
time. The units are feet and nanoseconds. We see three rays of light, one
starting at x = 0, t = -2, and moving to the right. You can tell it's a ray of
light because it's drawn at a 45° angle. Now, this is a space-time diagram,
so there's something important to notice here: this particular ray of light
comes into existence at -2 nanoseconds, and evaporates at +2 nanoseconds.
There's another ray of light which starts at x = 6, t = -3, and ends at
x = 3, t = 0. Another ray of light starts at x = 0, t = 2 and goes to at least
x = -3, t = 5. There's a particle moving very quickly, at half the speed of
light, from x = 1, t = 2 to x = 2, t = 4. This particle only exists for a short
time. There's another particle at x = -5.
This particle is not moving at all, but it exists for at least as long as
this graph exists. Finally, at about x = -2.5, t = 1.5, there's one of our new
favourite characters, a rocket ship. This is not a book on art, so comments
on the aesthetics of this particular rocket ship are not welcome.
Fig. A Space-Time Diagram (horizontal axis: position in feet, from -7 to 7; vertical axis: time in nanoseconds)
So, we see that in a space-time diagram we know both where and
when things are. This is a very different perspective than you are used to.
For example, on a space-time diagram, you would look like a long pink
tube, with one end attached to your mother and the other end just stopping
somewhere about 75 years later. In between, the tube that is you twists
and turns and wiggles to reflect where you went while you were alive. If
you're female and have children, your children would start out as small
pink tubes which branch off from you.
Space-time diagrams show "now" as the x-axis, and they show the
past and the future below and above the x-axis. Things which are not
moving are vertical lines. According to the rules, all lines which represent
the motion of something with mass must be tipped from vertical less than
45°. Light is always at 45°. If a line is drawn which is tipped at more than
45°, it would represent something moving faster than light. We have a
name for such objects: we call them tachyons. We also have names for
things like "nice lawyers" and "honest politicians," but we've never actually
seen any of these things, so don't get too excited.
Now, using our space-time diagram, we're going to try to understand
what it means to say where and when something is. We're on unfamiliar
ground here, so we're going to try to see how to do this without making
any assumptions.
For example, you might look up in the sky and see an airplane fly by,
maybe 3 miles up, going maybe 500 mph. You could look at your watch,
and say "That airplane was right over my head at noon." However, this is
an assumption that we're not going to make. What you can say is, "We
saw the airplane at noon." We will not assume that we know how to tell
time at places far away from us. In fact, what we would really like would
be to have a clock hanging in the air 3 miles right above our head, and
when the airplane flies past the clock, we can see the airplane and the
clock next to each other and read the time off of that clock.
Now, we have the problem of trying to synchronize this clock 3 miles
away with our wrist watch. How can we do this? Well, first we'll design a
new clock. Our new clock has hands. The second hand ticks off
nanoseconds, so to us mere humans it looks like a blur, but that's no big
deal.
The clock also has a light on it. The light will flash, say, every microsecond, that is every 1,000
nanoseconds. Now, anyone anywhere can synchronize their clock with
the flash from our clock. When you see the flash, you know the second
hand is pointing straight up. Of course, we're trying to synchronize clocks
here, so we have to account for the speed of light. If you're 100 feet away
from my clock, and you see the flash, you know that my clock emitted
the flash 100 nanoseconds ago. But, no problem, you just make sure your
clock's second hand hits 100 nanoseconds exactly as you see the flash from
my clock. The clock that's 3 miles up in the air is about 15,000 feet away,
so that clock will be set 15,000 nanoseconds ahead of the flash.
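The synchronization rule boils down to one division: a clock at a given distance should read the light travel time at the instant it sees the flash. Here is a minimal sketch, using the text's rounding of one foot per nanosecond; the function name is made up for the illustration.

```python
FEET_PER_MILE = 5_280
LIGHT_SPEED_FT_PER_NS = 1.0  # the text's rounding: one foot per nanosecond

def clock_setting_on_flash(distance_ft):
    """Reading (in ns) a distant clock should show at the instant it sees the flash."""
    return distance_ft / LIGHT_SPEED_FT_PER_NS

print(clock_setting_on_flash(100))                # 100.0
print(clock_setting_on_flash(3 * FEET_PER_MILE))  # 15840.0
```

The text rounds 3 miles to about 15,000 feet; the exact conversion gives 15,840 feet, so that clock would be set 15,840 nanoseconds ahead.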
Now, our clocks are synchronized. Remember, we're going to have a
lot of clocks, strung out all over the place, all synchronized. We will know
where each clock is. When we see something happen, we'll know where it
happened and when it happened - the where is from knowing the position
of the nearest clock, and the when is from reading the time off that clock,
and no other. We have a special word for something happening, we call
this an event. An event is something that happens at a particular time and
place. Here's a space-time diagram of us synchronizing a couple of clocks.
At x = 0 (that's where we're standing) and t = 0, we send out a flash.
The flash travels at the speed of light, which is a 45° line. Two feet
away from us is another clock. Our trusty graduate student is standing
there with another clock. He sees the flash at t = 2 nanoseconds, sets his
clock, and sends a flash back. At t = 4 nanoseconds, we see our grad
student's flash come back. Graduate students, by the way, are a very
important part of physics: they're smart, educated, do what they're told,
and work nearly for free. Without graduate students, all of science would
come to a screeching halt.
Fig. Synchronizing a Pair of Clocks with Light Flashes
Here's the situation we're going to imagine. Suppose there's some guy,
George, who has a small lab. He has two clocks and he wants to
synchronize them. So, we know how he's going to do this. He's going to
have flashers on his clocks to help set the clocks. When the clocks are
synchronized, he's going to just sit back and watch the clocks flash at
each other - it's going to look like they're bouncing a little light ball back
and forth between them, like they're playing ping-pong with light. Now,
here's the trick.
George and his small lab are in a rocket ship, flying by us at half the
speed of light. What do we see? Conveniently, his rocket ship is transparent,
just like Wonder Woman's jet airplane, so we can see inside. We can turn
on our clocks however we wish, so we'll agree that George will set the
clock closest to him to read t = 0 exactly as he passes us. We'll also set our
clock to read t = 0 just as George passes us. So, at the instant t = 0 our
clock and one of George's clocks are right next to each other and read the
same thing. Our job is to figure out what happens next. By the way,
George's clocks happen to be bolted down to a large solid table which is
11.55 feet long.
Below is a space-time diagram of this situation, with the scale set to 5 feet
and 5 nanoseconds per tick. George has two clocks. One of them flies
right past us, so that's the line that goes through the origin. George is
flying at half the speed of light, so in 10 nanoseconds he moves 5 feet -
half as far as light would move. George has a second clock which is 11.55
feet away from him. But, we have to remember the Lorentz-FitzGerald
contraction. Although George has carefully measured out 11.55 feet, we
see his two clocks as being closer to each other. The distance is contracted
by the factor $\sqrt{1 - v^2/c^2}$.
Fig. George and His Clocks Fly Past us at Half the Speed of Light
He's moving at c/2, so $v^2/c^2 = 1/4$ and $\sqrt{3/4} = 0.866$. Then 0.866 × 11.55 feet = 10
feet. So we see George's two clocks as being 10 feet apart. In our space-time
diagram, George's second clock is on a parallel line 10 feet away,
just as I've drawn it. At our t = 0, we see one of George's clocks right on
top of us, and one which is 10 feet away from us. I've also drawn a couple
of light flashes emitted by George's clocks, as they look to us. Light always
goes at 45°. When George's clock reads t = 0, it flashes. To us, it looks
like it takes 20 nanoseconds for that flash to reach the clock at the other
end of George's table. That clock then flashes, and it looks to us like it
takes about 6.67 nanoseconds for that flash to reach George's first clock.
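The 20 and 6.67 nanosecond figures come from a chase problem: the flash moves at 1 foot per nanosecond while George's clocks move at half that. A short sketch follows, in units of feet and nanoseconds; the function name is invented for the illustration.

```python
def flash_times(separation_ft, v):
    """Our-frame travel times (ns) for a flash going out and coming back.

    Units: feet and nanoseconds, so light speed is 1; v is a fraction of it.
    """
    # Outbound: the flash chases the far clock, closing the gap at (1 - v).
    out = separation_ft / (1 - v)
    # Return: the flash and the near clock approach each other at (1 + v).
    back = separation_ft / (1 + v)
    return out, back

out_ns, back_ns = flash_times(10.0, 0.5)
print(out_ns, round(back_ns, 2))  # 20.0 6.67
```

The round trip as we see it therefore takes 26.67 nanoseconds, even though the two legs are very unequal.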
Now, remember, George has carefully set his clocks. He doesn't think
there's anything strange about his lab, he's just setting his clocks to read
the right thing. So, when his second clock sees the flash from the first
clock, it's reading 11.55 nanoseconds.
After all, George has carefully synchronized his clocks. When the flash
from that clock reaches George's first clock, the first clock is reading 23
nanoseconds. How can this be? To see how this works, we'll draw another
space-time diagram where we show the light flashes that happened
immediately before these flashes.
Fig. George's Clocks Flash as they Fly by us at half the Speed of Light
The time shown on George's clocks when they see a flash of light.
We quickly notice some things. George's clocks are running slow. The
clock nearest George reads 23 nanoseconds when our clock reads 26.67
nanoseconds, and his reads -23 nanoseconds when ours reads -26.67
nanoseconds. The other thing we notice is that George's clock which is
ten feet away from us at closest approach is not only running slow, but is
also off.
Halfway between the points where George's second clock reads -11.55
and 11.55 nanoseconds happens when our clocks are reading 6.67
nanoseconds. So, George's second clock reads 0 when our clocks read 6.67
nanoseconds.
When our clocks read 0 and we see George's first clock as reading 0,
we see George's second clock is reading -5.75 nanoseconds. So, this is big:
George's clocks and our clocks are not synchronized. We see our clocks
as synchronized, and George sees his clocks as synchronized, but we don't
see George's clocks as synchronized, and he does not see our clocks as
synchronized.
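The offset between George's clocks can also be computed directly. The sketch below uses the standard Lorentz transformation formulas, which this chapter names but does not write out, so treat them as an assumption brought in from outside the text. Units are feet and nanoseconds, so the speed of light is 1; the function names are made up for the illustration.

```python
import math

def gamma(v):
    """Stretch factor 1 / sqrt(1 - v^2), with v as a fraction of light speed."""
    return 1 / math.sqrt(1 - v * v)

def george_time(t, x, v):
    """Reading on George's synchronized clocks for an event at our (t, x)."""
    return gamma(v) * (t - v * x)

def our_time(t_george, x_george, v):
    """Our clock reading for an event at George's (t', x')."""
    return gamma(v) * (t_george + v * x_george)

V = 0.5           # George's speed as a fraction of light speed
TABLE_FT = 11.55  # length of George's table, as he measures it

# George's second clock, 10 feet from us at our t = 0:
print(round(george_time(0.0, 10.0, V), 2))   # -5.77
# Our time when George's second clock reads 0:
print(round(our_time(0.0, TABLE_FT, V), 2))  # 6.67
```

The first number is the offset the text quotes as 5.75 nanoseconds, before rounding.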
This special relativity idea has now cost us one of our most
fundamental intuitions. It is not possible to synchronize clocks
unambiguously. Or, we can say, there's no such thing as simultaneous.
We see George's distant clock as being 5.75 nanoseconds behind his first
clock, so events that George sees as simultaneous, like his two clocks both
reading 0, we see as happening at very different times. We see our two
clocks as reading zero at the same time - we worked very hard to arrange
that - but George sees our two clocks as off by 5.75 nanoseconds.
The next thing we notice is that George's clocks are running slow by
exactly the same factor as we think his table has contracted. That is,
26.67/23.1 = 11.55/10. This effect is called Lorentz-FitzGerald time dilation. We
see George's clocks as running $\sqrt{1 - v^2/c^2}$ as fast as our clocks.
This time dilation effect is the source of the "twin's paradox." If you
stay on Earth, and your twin gets in a rocket ship and flies away at a very
high speed for a year, then turns around and flies back to Earth, he has
aged two years, and you've aged more than two years.
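Here is a small sketch of the slow-down factor at work. The first line re-checks George's clock; the second applies the same factor to a travelling twin. The twin's speed of 0.6 times light speed is a made-up value for illustration, and the calculation ignores the turn-around, so it is only a rough picture. The function names are invented for this sketch.

```python
import math

def slowdown_factor(v):
    """How fast a moving clock runs compared to ours; v is a fraction of light speed."""
    return math.sqrt(1 - v * v)

def home_years(traveler_years, v):
    """Years that pass at home while the traveller ages traveler_years."""
    return traveler_years / slowdown_factor(v)

# George's first clock after 26.67 of our nanoseconds, at half light speed:
print(round(26.67 * slowdown_factor(0.5), 2))  # 23.1
# A twin who ages 2 years at a hypothetical 0.6 of light speed:
print(round(home_years(2.0, 0.6), 2))          # 2.5
```

At everyday speeds the factor is so close to 1 that nobody notices; it only bites as v approaches the speed of light.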
THE INVARIANT INTERVAL
We saw that the speed of light, c, is the same number for everyone,
everywhere. The speed of light does not depend on how fast you are going,
nor on how fast the source of the light is going, you always measure the
same number.
We learned that two different observers moving at different speeds
cannot synchronize their clocks with each other. We learned that moving
clocks appear to be slow, and moving objects appear to contract in the
direction of their motion.
The factor by which clocks slow down and objects contract is
$\sqrt{1 - v^2/c^2}$. We also learned that the space concept of position had to be
replaced by the space-time concept of event, which is a particular position
at a particular time.
We're used to using the Pythagorean theorem to calculate distances.
So, in the figure below, $A^2 = B^2 + C^2$. We need to be able to calculate distances
in special relativity, so we need to know how this formula should look in
space-time.
Fig. Calculation of $A^2 = B^2 + C^2$
Here's what we know: the speed of light is always c. Speed is distance
divided by time, so this means $\sqrt{X^2 + Y^2 + Z^2}/T = c$. We can multiply
both sides by T then square both sides to get $X^2 + Y^2 + Z^2 = c^2T^2$. Or,
$c^2T^2 - X^2 - Y^2 - Z^2 = 0$. We call this quantity $c^2T^2 - X^2 - Y^2 - Z^2$ the
interval. We're going to see that in special relativity the interval takes the
place of distance. But, there's a big difference - the interval is the time it
took to get there minus how far you went. For a ray of light, the interval is
always zero, because at light speed, 1 meter of distance takes 1 meter of
time, and 1-1=0. This fact, that all people see the same speed of light,
will be elevated from a curiosity to a fundamental axiom. The expression
$c^2T^2 - X^2 - Y^2 - Z^2$ will similarly be elevated from a special equation
about light to a fundamental equation about the distance between any
two events.
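The section title suggests that this quantity comes out the same for every observer. Here is a hedged numerical check on two events from George's lab, using the standard Lorentz transformation (an assumption, since the chapter has not written it out). Units are feet and nanoseconds, so c = 1, and only one space dimension is used; the function names are made up for the illustration.

```python
import math

def lorentz(t, x, v):
    """Transform our event (t, x) into a frame moving at speed v (c = 1)."""
    g = 1 / math.sqrt(1 - v * v)
    return g * (t - v * x), g * (x - v * t)

def interval(t, x):
    """c^2 T^2 - X^2 with c = 1, measured from the origin event (0, 0)."""
    return t * t - x * x

# Events in our frame, taken from the description of George's flashes.
events = {
    "flash reaches second clock": (20.0, 20.0),
    "return flash reaches first clock": (80 / 3, 40 / 3),
}
for name, (t, x) in events.items():
    t_george, x_george = lorentz(t, x, 0.5)
    print(name, round(interval(t, x), 2), round(interval(t_george, x_george), 2))
```

For each event the two printed numbers should match: zero for the ray of light, and the same positive number in both frames for George's first clock.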
We're used to $(\text{distance})^2 = X^2 + Y^2 + Z^2$. This formula is called a
metric, and this particular type of metric is called positive-definite. Positive
because the sum of three squares is always positive. Definite because for
any X,Y,Z we can always calculate a unique number. So, in 3-space, we
use the 3-distance $\sqrt{X^2 + Y^2 + Z^2}$. But, we're working in 4-space now,
so we need to figure out what the 4-distance is. In Special Relativity, the
idea of distance will be replaced by the interval $c^2T^2 - X^2 - Y^2 - Z^2$, which
is not positive definite.
We can see that if something moves a short distance in a long time,
the interval is positive. If something moves at precisely the speed of light,
the interval is zero. And if something moves faster than light, or more
reasonably if we consider two points which are far apart in space but not
in time, the interval is negative.
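The three cases can be spelled out in a few lines of Python. This is a sketch only: it works in units where c = 1 (feet and nanoseconds), and the function names and sample numbers are invented for the illustration.

```python
def interval_squared(t, x, y=0.0, z=0.0, c=1.0):
    """The quantity c^2 T^2 - X^2 - Y^2 - Z^2 from the text."""
    return (c * t) ** 2 - x**2 - y**2 - z**2

def classify(t, x, y=0.0, z=0.0, c=1.0):
    """Report the sign of the interval between the origin and the event."""
    s2 = interval_squared(t, x, y, z, c)
    if s2 > 0:
        return "positive"
    if s2 < 0:
        return "negative"
    return "zero"

print(classify(t=10, x=1))  # positive: a short distance in a long time
print(classify(t=5, x=5))   # zero: moving at exactly the speed of light
print(classify(t=0, x=5))   # negative: far apart in space but not in time
```

Nothing like the third case exists for ordinary distance, where a sum of squares can never be negative.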
This is very different from flat Euclidean space. This is why we use a
new word, interval: to help remind us that this is a very strange kind of
distance that can be positive, zero, or negative. For example, the interval
between the Earth and the Sun is about -8 minutes if we consider where
the Earth and Sun are at the same time; the interval is zero if we consider
the path that a ray of light would take from the Sun to the Earth; and the
interval is about 6 months if we consider the path that a typical NASA
satellite would take.
We're not used to a type of distance where how long you take to go
somewhere counts as part of the distance. Similarly the interval from Los
Angeles to New York is not the 3-distance of 3,000 miles. The interval
from Los Angeles to New York is zero for a ray of light, it's about six
hours for a traveller on a 747, and it's about five days for someone driving
a car. If we consider where Los Angeles and New York are at exactly the
same instant, so that $T^2 = 0$, then the interval is -3,000 miles, but now it's
a negative number.
Right away we can see that these factors of c are going to be popping
up all over the place. Also, we see that there's some confusion on whether
the interval is measured in meters or seconds or hours or feet or light years
or whatever. Actually, we're used to this for distance. If we asked someone,
"How far is it from Los Angeles to New York," we would not be surprised
to hear 3,000 miles, or 5,000 kilometers, or maybe even 16 million feet, or
500 million centimeters. But the idea that it could be two seconds or six
hours or -3,000 miles from Los Angeles to New York seems very strange.
How can a distance be the same as a time? Why is the speed of light
this strange number, 186,000 miles per second? Is there something special
about this number, 186,000, that God particularly liked?
Let's think about horses for a minute. A horse's height is measured
in hands, where a hand is four inches (don't ask me why, I've never owned
a horse). So, the horse below stands about 15 hands high at the shoulders,
and is about 7 feet long. When the horse rears up on its hind legs, if we
were to measure the dumb way we might find that the horse is now 22
hands high but only 5 feet long.
The confusion here is because we're measuring height in hands, and
length in feet. It would make a lot more sense if we were using the same
units in both directions, like hands for both length and height. As it is, if
we want to know the distance from the horse's rear hoof to his nose, we
can't use Pythagoras' theorem directly. We can't say
$(15\ \text{hands})^2 + (7\ \text{feet})^2 = \text{distance}^2$, because hands are not the same
units as feet. We could say something like
$(15\ \text{hands} \times 4\ \text{inches per hand} \,/\, 12\ \text{inches per foot})^2 + (7\ \text{feet})^2 = \text{distance}^2$.
You can see this is a real pain: it's really not very
convenient to use different units for different dimensions.
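The unit conversion in the horse example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `hoof_to_nose_feet` and the constant names are made up here, while the figures (15 hands, 7 feet, 4 inches per hand) are the ones quoted in the text.

```python
import math

INCHES_PER_HAND = 4    # a hand is four inches, as stated in the text
INCHES_PER_FOOT = 12

def hoof_to_nose_feet(height_hands, length_feet):
    """Diagonal distance in feet, after converting the height from hands to feet."""
    height_feet = height_hands * INCHES_PER_HAND / INCHES_PER_FOOT
    return math.hypot(height_feet, length_feet)

# 15 hands is 5 feet, so the diagonal is sqrt(5**2 + 7**2) = sqrt(74) feet.
print(round(hoof_to_nose_feet(15, 7), 2))  # 8.6
```

Once both legs are in the same unit, Pythagoras' theorem applies without any conversion factors, which is the point the text goes on to make about space and time.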
Similarly, we have a built-in confusion about space and time: we
measure time in seconds and distance in feet or meters. However, knowing
that the speed of light is the same for everyone, we can use the speed of
light to convert seconds into feet or meters. From now on, we'll agree
that we're going to use the same units for time and space. For example, as
we've already seen, one foot of distance equals one nanosecond at the
speed of light, so if we say a foot of time we mean the same thing as if we
say a nanosecond.
A meter of time is about 3 nanoseconds. Velocity, distance per time,
is now meters per meter, so velocity has no dimensions. The speed of light
is now just 1 with no units. The speed limit on most freeways is 65 miles
per hour, which equals about c/10,000,000. If physicists were running the
highways, apparently highway signs would say "Speed limit $10^{-7}$." A traffic
ticket for going 85 would read "excessive speed: $1.3 \times 10^{-7}$ in a $10^{-7}$ zone."
That's it, no units.
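The highway-sign arithmetic can be reproduced as a rough sketch in Python. The helper name `mph_to_natural` and the constant names are hypothetical; the mile-to-metre factor is the international mile, and c is the exact defined value.

```python
C_M_PER_S = 299_792_458      # speed of light in m/s (exact by definition)
M_PER_MILE = 1609.344        # metres in one international mile
S_PER_HOUR = 3600

def mph_to_natural(mph):
    """Convert miles per hour to a dimensionless fraction of c."""
    return mph * M_PER_MILE / S_PER_HOUR / C_M_PER_S

for mph in (65, 85):
    # Prints each speed as a pure number, roughly 1e-7 and 1.3e-7.
    print(f"{mph} mph = {mph_to_natural(mph):.2e}")
```

In these units a speed is just a ratio of two lengths (or two times), so no unit label is needed on the result.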
APPLICATIONS OF DERIVATIVES
At the time of this writing, physics is rather annoyingly split into two
completely different types of theories. Relativity is a theory about how
space and time work, and very basic ideas about how particles must act in
space-time. The other half of physics is quantum field theories, which are
theories about what types of particles exist and how they interact with
each other. One may say that relativity builds a stage, and quantum field
theories fill the stage with players and a script.
In 3-space, we're used to putting X, Y, Z into a vector. Then the length
of a vector squared is the dot product of the vector with itself, that is,
$L^2 = V \cdot V$. But now we're working in 4-space, that is space-time, so at the
very least we're going to need to have vectors which hold 4 things. Here
are the rules we'll use for these 4-vectors:
• A vector will have a superscript index, as $V^i$. We always use
superscripts for a vector. Subscripts will mean a different type
of object.
• If the superscript is a Latin letter like i, j, k, then the vector lives
in 3-space and the index runs from 1 to 3. $V^1$, $V^2$, and $V^3$ are
respectively X, Y, and Z.
• If the superscript is a Greek letter like α, β, γ, then the vector lives
in 4-space and the index runs from 0 to 3. $V^0$ is time, and $V^1$,
$V^2$, and $V^3$ are respectively X, Y, and Z. Notice that $V^2$ means
the second entry in V, not V squared.
• $V^0$, the time coordinate, will be measured with the same units as
the space coordinates, so the speed of light is 1.
• We can multiply a vector, that is something with a superscript,
by something with a subscript. We never multiply two things
that both have superscripts. We never multiply two things that
both have subscripts.
In 3-space we have the dot product to help us calculate distance. The
dot product makes no sense in space-time. A 4-vector dotted with itself
gives us $X^2 + Y^2 + Z^2 + T^2$, which has no meaning. Why? Because we are
looking for the interval, so we need $T^2 - X^2 - Y^2 - Z^2$. We have seen that
$T^2 - X^2 - Y^2 - Z^2$ is zero for a ray of light for all observers. If we calculate
$X^2 + Y^2 + Z^2 + T^2$ for a ray of light, we get a number which depends on
the observer's velocity, as we'll see in a bit. We need a replacement for the
dot product which gets us a minus sign in front of the space terms.
We could do this by remembering that the space terms always get
subtracted instead of added, but this is just a recipe for trouble: we'll
forget sometimes. We'll handle this with a matrix, a special matrix called
the Metric Tensor. The Metric Tensor, usually just called the Metric, is
written η, which is read out loud as "eta." The Metric Tensor will be the
matrix:
(1,0,0,0)
(0,-1,0,0)
(0,0,-1,0)
(0,0,0,-1)
If we multiply the vector $V = (V^0, V^1, V^2, V^3)$ by the matrix η, we get
$(V^0, -V^1, -V^2, -V^3)$. Now, if we multiply our original V by this, we get
$(V^0)^2 - (V^1)^2 - (V^2)^2 - (V^3)^2 = T^2 - X^2 - Y^2 - Z^2$, which is just what we're
looking for. So, $V \cdot V$ gives us the wrong answer, but $V \cdot \eta \cdot V$ gives us the
right answer. The purpose of η is to keep track of the minus signs in the
space terms for us.
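As a small sketch of the bookkeeping the metric does, the following Python computes the interval by first multiplying the vector by η and then taking the ordinary sum of products. The names `ETA`, `lower`, and `interval` are chosen here for illustration, and units with c = 1 are assumed, as in the text.

```python
# The flat space-time metric, with the sign convention (+, -, -, -) used in the text.
ETA = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, -1],
]

def lower(v):
    """Multiply the 4-vector v by the metric: (T, X, Y, Z) -> (T, -X, -Y, -Z)."""
    return [sum(ETA[a][b] * v[b] for b in range(4)) for a in range(4)]

def interval(v):
    """V . eta . V = T^2 - X^2 - Y^2 - Z^2."""
    return sum(x * y for x, y in zip(v, lower(v)))

print(interval([1, 1, 0, 0]))  # 0   (a ray of light)
print(interval([5, 3, 0, 0]))  # 16  (a short distance in a long time)
print(interval([0, 3, 0, 0]))  # -9  (two points at the same time)
```

The three sample vectors correspond to the zero, positive, and negative cases of the interval described earlier.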
In Euclidean 3-space, the metric tensor is
(1, 0, 0)
(0, 1, 0)
(0, 0, 1)
which just turns a vector into itself, so the metric tensor is always
ignored in 3-space. But it's still there, formally, and when we make the
move to space-time we can't ignore it any longer. When we move to curved
4-space, we'll call the metric tensor g instead of η. This is because we'll
find that the metric tensor g is not a constant, but depends on where we
are. We'll find that in most of our universe, so long as we're not moving
at close to the speed of light and we're not near any black holes, the metric
tensor g is very nearly equal to η, so we can say that g = η + h, where h is
a matrix containing only small numbers. We'll find out that h is the
gravitational potential. So, this metric tensor stuff is very important, both
formally and physically.
We've been cheating here for just a little bit: we've been ignoring our
rules above, at least as far as notation goes.
The V we have been talking about is a vector in space-time, so by our
rules it must have a superscript, like $V^\alpha$.
The metric tensor η is an object with two subscripts, for example $\eta_{\alpha\beta}$.
So $V \cdot \eta \cdot V$ really means:
$$\sum_{\alpha=0}^{3} \sum_{\beta=0}^{3} V^\alpha \, \eta_{\alpha\beta} \, V^\beta$$
Now we can see explicitly that we followed our rules. There are two
superscripts, and two subscripts, and we are always multiplying something
with a superscript by something with a subscript. Things with superscripts
are called vectors, or contravariants. Things with subscripts are called
forms, or covariants. The metric tensor is called a 2-form, because it has
two subscripts.
Einstein published a lot of papers, and one day the guy who did his
typesetting said to Einstein, "Every time you have an index repeated, you
have one of these sum symbols. Why bother?" So, Einstein invented the
rule that whenever an index is repeated, it means multiply the terms
containing that index, and sum from 0 to 3. Even though it was the
typesetter who thought this up, Einstein gets the credit. We add to this
the rule that if an index is repeated, one of them must be "up" (superscript)
and one must be "down" (subscript). This saves us from writing a bunch
of "·" and "Σ" characters. This is called the Einstein Convention.
$$V^\alpha \, \eta_{\alpha\beta} \, V^\beta = \sum_{\alpha=0}^{3} \sum_{\beta=0}^{3} V^\alpha \, \eta_{\alpha\beta} \, V^\beta$$
Next, we know that the speed of light is the same for all observers.
That means that $\eta_{\alpha\beta}$ must be the same for all observers. So, if the
transformation matrix which gets us from one person's frame of reference
to another is Λ, then this must be true:
$$\Lambda^{T} \, \eta \, \Lambda = \eta$$
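The condition that the metric is the same for all observers is commonly written in matrix form as $\Lambda^{T} \eta \Lambda = \eta$. The sketch below checks that condition numerically for a boost along x. The boost matrix used here is the standard one and is an assumption at this point in the text, which has not yet derived it; all function names are illustrative.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def boost_x(beta):
    """Standard boost along x with speed beta (in units of c); assumed, not derived here."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, g * beta, 0, 0], [g * beta, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def preserves_metric(lam, tol=1e-12):
    """True if Lambda^T . eta . Lambda equals eta to within tol."""
    result = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(result[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

print(preserves_metric(boost_x(0.6)))  # True
```

A matrix that simply doubles every coordinate fails the same test, which shows the condition is a real restriction on the allowed transformations.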
There are a couple different forms of notation for what we're learning
here, so I'm going to take this chance to talk briefly about them. Some
people use 4-vectors where the index runs from 1 to 4, and the 4th
component is i times T, where i is the square root of -1. This notation was
invented by Minkowski, one of Einstein's professors when Einstein was a
college student.
Einstein always said he hated this notation, but in spite of that he
sometimes used it. If you use Minkowski notation, you don't need the
metric tensor, which somehow makes us feel like space-time is more like
space, but also makes the transition to General Relativity much harder.
Some people use g for the metric tensor even in flat space-time; I reserve
that letter for the metric tensor in curved space-time. The g will remind us
that there are gravity fields around. The notation I'm using is the modern
notation, but not everyone has gotten with the programme yet.
Chapter 2
Relativity Made Simple
LORENTZ COORDINATE TRANSFORMATION
It is a common misconception that Einstein based his theory of special
relativity on the Michelson-Morley experiment. In the days of Michelson
and Morley it was thought that electromagnetic waves propagated through a
medium called the luminiferous aether. The Earth rotates on its axis and
orbits the Sun, which circles the galaxy, which wanders about the local
cluster group, and so on. It was therefore expected that a device built to
detect the Earth's motion with respect to the aether would yield a significantly
nonzero result. Michelson and Morley constructed just such a device, and to
their astonishment it yielded a null result.
This is enough to dispel the aether theory in most people's minds, but at
the time some people tried to explain why one would not be able to measure
a speed based on the motion of light in the device even though they insisted
the Earth was in motion with respect to the medium. Lorentz was one such
person. He empirically derived the Lorentz transformation equations by
introducing length contractions and time dilations into a transformation which
would leave the speed c the same in every frame, so that one could not use it to
determine a speed with respect to the medium.
Einstein is given so much credit because what he did differently was to
come up with two powerful postulates from a simple idea. From those
postulates he was able to derive the Lorentz transformations from first principles
and develop special relativistic physics from there. The idea that led Einstein
to his postulates is depicted in the figure.
Fig. Einstein Postulates of Lorentz Transformations
Start out by placing a magnet stationary on a table, perpendicular to the plane
of a loop of wire and pointed at the centre. Wire the loop in series with a
resistor and a current meter, so that the voltage induced in the loop can be
calculated from the current and resistance. Move the loop at constant velocity
and note the voltage function. Next fix the loop with respect to the table and
move the magnet at the same speed but in the opposite direction. Note the voltage
function. Either way you do the experiment you get the same voltage function,
the same physics.
Therefore it won't matter whether we say the magnet source is moving
and the loop receiver is stationary or vice versa, as you get the same physics
either way. This led Einstein to his first postulate. The electrons form a current
in response to the changing magnetic flux, or the way the magnetic field
changes across the loop over time. Either perspective yields the same physics,
so the information about the magnet as received by the electrons must travel
at a speed independent of whether we say the source is stationary and the
receiver is moving or vice versa. This led Einstein to realise that there must
be a speed that is invariant to frame at which the electromagnetic information
transfers.
• The first postulate can be worded as: The laws of physics are
invariant to inertial frame transformation.
• The second postulate can be worded as: The invariant speed c, is finite
and is the vacuum speed of light.
These postulates are phrased a little differently in virtually every text, but
the core idea behind them as depicted above is the same.
The second postulate is worded in this manner because time dilation and such effects
really have nothing directly to do with light itself. These are instead due to
space time being structured such that the invariant speed c is finite. If we
surprisingly discovered that light had some small amount of mass and thus
really traveled at speeds just short of c, it would have no ramification on the
physics of special relativity whatsoever.
It would merely mean that we are using bad terminology, for instance in
calling c speed particles "light-like". What actually distinguishes Lorentz
transformations and special relativistic physics from Galilean transformations
and Newtonian physics is that in the Lorentz transformations the invariant
speed c is finite.
In the mathematical limit as c goes to infinity the Lorentz transformations
become Galilean and physics reduces to Newtonian physics. The statement
that this invariant speed c is the vacuum speed of light merely tells us where
to look experimentally for what that speed is. And should we find that light
travels at speeds just less than c, then one may merely remove the "and is the
vacuum speed of light" part, and the fact that physics is relativistic according
to the remainder of the two postulates would be unaffected.
The first postulate tells us that it does not matter what inertial frames we
take to be in motion, or what inertial frame we take to be stationary as the
laws of physics do not depend on inertial frame. In a sense it is odd that
relativity is called relativity at all because according to the first postulate
physics is invariant to frame.
Thus relativity is really a theory of invariance. What is really relative in
relativity are time and space coordinates and things defined in such a way that
they depend on the coordinate frames. The equations used to model the "laws"
of physics must be invariant equations if we are to be consistent with the first
postulate. As we shall see tensor equations are invariant and so in relativity
we write the laws of physics as tensor equations.
These two postulates imply that the coordinate transformation that
correctly describes boosts between different inertial frames is the Lorentz
coordinate transformation.
Let's say that one observer uses an inertial coordinate frame S with
coordinates (ct,x,y,z). Another observer uses another inertial coordinate frame
S' given by (ct',x',y',z'). They will be in motion with respect to each other so
that the S frame observer observes the other to be moving at speed v in the +x
direction and the S' frame observer will observe the other to be moving at the
same speed in the -x' direction. Let's say we know the location of an event
according to one observer's coordinate frame and wish to determine the location
according to the other coordinate frame. We transform the coordinates of the
event from the one to the other by doing a Lorentz coordinate transformation.
In this case the Lorentz coordinate transformation equations are
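$$ct = \gamma\,(ct' + \beta x')$$
$$x = \gamma\,(x' + \beta\,ct')$$
$$y = y'$$
$$z = z'$$
where we make the definitions
$$\beta = v/c$$
and
$$\gamma = (1 - \beta^2)^{-1/2}$$
A more compact form of the transformation that allows the boost to be in
any direction, rather than restricting it to a coordinate axis, is
$$ct = \gamma\,ct' + \gamma\,\boldsymbol{\beta}\cdot\mathbf{r}'$$
$$\mathbf{r} = \gamma\,\boldsymbol{\beta}\,ct' + \mathbf{r}' + (\gamma - 1)\,\frac{\boldsymbol{\beta}\cdot\mathbf{r}'}{\beta^2}\,\boldsymbol{\beta}$$
The inverse of the axis-aligned transformation is
$$ct' = \gamma\,(ct - \beta x)$$
$$x' = \gamma\,(x - \beta\,ct)$$
$$y' = y$$
$$z' = z$$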
and in differential form the transformation becomes
$$dct = \gamma\,(dct' + \beta\,dx')$$
$$dx = \gamma\,(dx' + \beta\,dct')$$
$$dy = dy'$$
$$dz = dz'$$
and the inverse differential form is
$$dct' = \gamma\,(dct - \beta\,dx)$$
$$dx' = \gamma\,(dx - \beta\,dct)$$
$$dy' = dy$$
$$dz' = dz$$
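The x-direction Lorentz transformation and its inverse can be sketched in a few lines of Python. The function names `to_unprimed` and `to_primed` are made up for this illustration; the formulas are the ones given above, with the y and z coordinates left out because they do not change.

```python
import math

def gamma(beta):
    """Lorentz factor (1 - beta^2)^(-1/2) for beta = v / c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def to_unprimed(ct_p, x_p, beta):
    """S' -> S: ct = gamma (ct' + beta x'), x = gamma (x' + beta ct')."""
    g = gamma(beta)
    return g * (ct_p + beta * x_p), g * (x_p + beta * ct_p)

def to_primed(ct, x, beta):
    """S -> S': ct' = gamma (ct - beta x), x' = gamma (x - beta ct)."""
    g = gamma(beta)
    return g * (ct - beta * x), g * (x - beta * ct)

# An event at (ct', x') = (2, 1) seen from S when beta = 0.6 (gamma = 1.25).
ct, x = to_unprimed(2.0, 1.0, 0.6)
print(ct, x)                    # approximately 3.25 and 2.75
print(to_primed(ct, x, 0.6))    # back to approximately (2.0, 1.0)
```

Applying the transformation and then its inverse returns the original coordinates, up to floating-point rounding.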
c is called the Lorentz invariant speed and according to Einstein's second
postulate it is the finite vacuum speed of light
$$c = 299\,792\,458\ \text{m/s}$$
(exact by definition).
If c were to be infinite then the Lorentz transformation equations would
be the Galilean transformations
$$t = t'$$
$$x = x' + vt'$$
$$y = y'$$
$$z = z'$$
When speeds are large enough that c can no longer be taken to be
infinite, the Galilean transformations can no longer be used and special
relativistic phenomena such as time dilation and length contraction are observed.
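To get a feel for when the Galilean form stops being adequate, the sketch below compares the two expressions for x at a few speeds. The function names and the sample values x' = 1 m, t' = 1 s are arbitrary choices for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_x(x_p, t_p, v):
    """x = gamma (x' + beta c t'), written with v = beta c."""
    beta = v / C
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (x_p + v * t_p)

def galilean_x(x_p, t_p, v):
    """x = x' + v t'."""
    return x_p + v * t_p

def fractional_gap(v, x_p=1.0, t_p=1.0):
    """Relative difference between the Lorentz and Galilean values of x."""
    g = galilean_x(x_p, t_p, v)
    return abs(lorentz_x(x_p, t_p, v) - g) / abs(g)

for fraction in (0.001, 0.1, 0.9):
    print(fraction, fractional_gap(fraction * C))
```

The gap equals gamma minus one, so it is below one part in a million at a thousandth of c, about half a percent at a tenth of c, and larger than the Galilean value itself at 0.9 c.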
RELATIVE SPACE AND TIME
The differential form of one of the Lorentz coordinate transformation
equations, Eqn 1.1.5, is
$$dct = \gamma\,(dct' + \beta\,dx')$$
Extended to a finite interval this becomes
$$\Delta ct = \gamma\,(\Delta ct' + \beta\,\Delta x')$$
Given the interval in time and space between two events according to the
S' frame, this equation gives the interval in time between the events according
to the S frame. Now consider the case that the events happen at the same
location according to the S' frame, for instance the ticks on the S' frame
observer's watch. In this case $\Delta x' = 0$ and we have
$$\Delta t = \gamma\,\Delta t'$$
To distinguish that the time interval is for events at constant location
according to the proper frame we often write the time interval as
$\Delta t' = \Delta\tau$ and call it proper time.
$$\Delta t = \gamma\,\Delta\tau$$
From this equation we see that the time interval between the events
according to the S coordinate frame is longer than the time interval between
the events according to the S' coordinate frame. This phenomenon is called
time dilation. Time intervals between events are not absolute, but depend on
whose coordinate frame you use.
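As a quick numerical illustration of time dilation, the following sketch evaluates the relation between the S-frame time and the proper time for a chosen speed. The function names are illustrative, and the speed 0.8 c is an arbitrary example.

```python
import math

def gamma(beta):
    """Lorentz factor for beta = v / c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def dilated_time(proper_time, beta):
    """Delta t = gamma * Delta tau: S-frame time between ticks of a moving watch."""
    return gamma(beta) * proper_time

# One second on a watch moving at 0.8 c takes about 1.67 s according to S.
print(dilated_time(1.0, 0.8))
```

At beta = 0 the two times agree, and the factor grows without bound as beta approaches 1.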
We can extend this phenomenon to the case in which one of the observers
is accelerated. Though special relativity is really only directly concerned with
inertial frames, we can consider an accelerated state to be a state of transitions
or boosts between different inertial frames. We will next let the S' frame
observer enter a state of acceleration. For small time intervals we can say the
S' frame observer is an inertial frame observer. Thus the equation
$$dt = \gamma\,d\tau$$
holds valid for describing how much time goes by according to the S
frame observer given how much the S' observer has aged even if the S' frame
observer is accelerating.
Now consider the case that the two events happen at the same time
according to the S' frame. Then $\Delta ct' = 0$ and Eqn 1.2.1 becomes
$$\Delta ct = \gamma\beta\,\Delta x'$$
From this we see that if the events have a displacement in space along
the x' direction, then they happen at different times according to the S frame.
Thus the very notion of simultaneity is relative. Events simultaneous according
to one coordinate frame are not all simultaneous according to another.
Next, consider the following differential Lorentz coordinate
transformation equation from Eqn. 1.1.6
$$dx' = \gamma\,(dx - \beta\,dct)$$
Extend over a finite interval to arrive at the corresponding equation for
two displaced events
$$\Delta x' = \gamma\,(\Delta x - \beta\,\Delta ct)$$
Let's say that the S' observer puts a firecracker on each end of a measuring
stick of length $L_0$ oriented along the x' direction and sets them off timed so
that the events occur at the same time according to the S frame ($\Delta ct = 0$). Then
the length of the stick according to the S frame, L, will be given by the spatial
displacement between the events. Then we have $\Delta x' = L_0$ and $\Delta x = L$. Inserting
these three inputs into the above equation results in
$$L_0 = \gamma L$$
or
$$L = (1/\gamma)\,L_0$$
This is called length contraction.
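A short sketch of the length contraction formula follows. The function name `contracted_length` and the sample numbers are illustrative only; the formula is L = L_0 / gamma, with L_0 the length in the stick's own frame.

```python
import math

def contracted_length(proper_length, beta):
    """L = L_0 / gamma = L_0 * sqrt(1 - beta^2)."""
    return proper_length * math.sqrt(1.0 - beta * beta)

# A 10 m stick moving at 0.6 c measures about 8 m in the S frame.
print(contracted_length(10.0, 0.6))
```

Only the dimension along the direction of motion is affected; lengths along y and z are unchanged by the transformation.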
PARADOXES
Consider a ship that travels to another star at a constant velocity, then
immediately whips around the star for the return trip at the same speed. We
on Earth observe that the clocks on board the ship run slow due to time dilation
throughout the entire trip both outgoing and incoming. On the way out,
according to the ship frame it is the Earth that is moving away and so it is the
Earth clocks that run slow due to time dilation. Also on the way back, according
to the ship frame it is the Earth that is approaching the ship and so again the
Earth clocks run slow due to time dilation. This presents a problem called the
twin paradox.
The problem is to determine what the clocks read when the ship and
Earth come together again, and what causes the difference.
The total time dilation can be calculated by Eqn 1.2.2 or 1.1.3, but only if the
accelerated frame is taken as the primed frame. The question would be why
this round trip case is not symmetrical. If the two remained in inertial states
then each would observe the other as aging slower in a symmetric fashion.
But in this round trip the end result couldn't work symmetrically because that
would lead to a true paradox. One can explain this from various perspectives
just as there are various frames that can be used to describe the situation.
The simplest explanation is that the accelerated frame of the ship is
actually a piecewise construction of two different inertial frames for which
there is a lack of simultaneity.
Because of the piecewise construction of the accelerated frame, the ship
observer reckons that clocks in the direction of the acceleration undergo an
advance during the acceleration by an amount that depends on how far away
they are. Primes (') will indicate the ship frame. T' will be the time it takes the
ship to reach the star according to the ship frame, and β as a function of proper
time is
$$\beta = \begin{cases} \beta_0 & \text{for } ct' < cT' \\ -\beta_0 & \text{for } ct' > cT' \end{cases}$$
The coordinate transformation from the accelerated ship frame coordinates
to the inertial Earth frame coordinates is
$$ct = \gamma\,(ct' + \beta x')$$
$$x = \gamma\,[x' + \beta\,ct' + (\beta_0 - \beta)\,cT']$$
$$y = y'$$
$$z = z'$$
This set of transformation equations results in Lorentz transformation on
the way out as well as on the way in and gives the solution that the ship clocks
read less time upon their arrival back at Earth. Symmetric time dilation only
occurs during the portions of the trip where both observers maintain inertial
states. During acceleration the symmetry is broken and both observers will
always agree on how much they should age differently in a round trip.
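The round trip can be worked through numerically with a piecewise transformation of this kind. The sketch below assumes the outbound speed is beta_0 and the inbound speed is -beta_0, with the turnaround at ship time T', and follows the ship itself (x' = 0). The function name `earth_coords` and the sample numbers (beta_0 = 0.8, cT' = 3 light-years) are chosen here for illustration.

```python
import math

def earth_coords(ct_p, x_p, beta0, cT_p):
    """Map ship-frame (ct', x') to Earth-frame (ct, x) with the piecewise boost.

    beta is +beta0 before the turnaround at ct' = cT' and -beta0 afterwards.
    """
    beta = beta0 if ct_p < cT_p else -beta0
    g = 1.0 / math.sqrt(1.0 - beta0 * beta0)
    ct = g * (ct_p + beta * x_p)
    x = g * (x_p + beta * ct_p + (beta0 - beta) * cT_p)
    return ct, x

beta0, cT_p = 0.8, 3.0          # 3 years of ship time each way, in light-years
print(earth_coords(cT_p, 0.0, beta0, cT_p))        # turnaround: about (5, 4)
print(earth_coords(2 * cT_p, 0.0, beta0, cT_p))    # return: about (10, 0)
```

With these numbers the ship's clock reads 6 years on return while about 10 years have passed on Earth, and the ship is back at x = 0, so the two legs join continuously at the turnaround.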
Consider a train whose proper length is greater than the proper length of
a tunnel. The train moves at near c speeds so that it is extremely length
contracted according to the frame of the tunnel. Let's say it is so length
contracted that it fits inside the tunnel. Gates at the ends of the tunnel are set
so that they each close once it's entirely inside.
According to the frame of the train observer the train is still while the
tunnel moves in the other direction. Therefore according to this frame it is the
tunnel whose length is contracted. The proper length of the tunnel is already
shorter than the length of the train, and the tunnel is length contracted beyond
this, so the train never fits inside to be enclosed by the gates.
This leads to a superficially apparent contradiction called the length
contraction paradox. As with all the so-called paradoxes of relativity this only
superficially seems to be a contradiction and is easily shown not to be a true
contradiction. The solution is the realization that we did not account for relative
simultaneity in this thought experiment. According to the tunnel frame the events
that the two gates close are simultaneous. According to the train observer's
frame the two events still occur, but they are not simultaneous. According to
the train observer's frame the event that the gate closes behind the train happens
after the front end of the train has smashed through the front gate which had
already been closed.
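The ordering of the two gate closings in the train frame can be checked with the time transformation ct' = gamma (ct - beta x). In the sketch below several details are assumptions made for illustration: the train moves in the +x direction, the rear gate (the one behind the train) is at x = 0, the front gate is at x equal to the tunnel length, and both gates close at ct = 0 in the tunnel frame. The function name is hypothetical.

```python
import math

def gate_times_in_train_frame(tunnel_length, beta):
    """Return (ct'_rear, ct'_front) for gates that close at ct = 0 in the tunnel frame."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    ct_rear = g * (0.0 - beta * 0.0)              # rear gate at x = 0
    ct_front = g * (0.0 - beta * tunnel_length)   # front gate at x = tunnel_length
    return ct_rear, ct_front

rear, front = gate_times_in_train_frame(100.0, 0.8)
print(rear, front)   # the front gate closes earlier (ct' is negative) in the train frame
```

The front gate's closing has the earlier time in the train frame, which matches the description above: the gate behind the train closes only after the front gate has already closed.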
REPRESENTING SIMPLE RELATIVITY
MINKOWSKI SPACETIME DIAGRAMS
To make a 2d Minkowski space-time diagram one typically first picks an
inertial frame. The vertical axis of the graph is chosen to be the time ct axis.
The horizontal axis is chosen to be a distance coordinate x. This means that
velocity of a particle is obtained from the reciprocal of the slope of its path on
the diagram instead of directly from the slope. Yes, this seems weird to do at
first, but it's also the usual way to do it, so you get used to it. So the path of an
accelerating particle on a spacetime diagram can look like:
Now let's say the path represents the path of an accelerated observer. The
events comprised of the ticks on the watch of this observer all lie along the
path, and so the observer's world line, or the path the observer takes, can be
labeled as the time axis for this observer. So the graph looks like:
We might consider a case where the primed frame observer is in an inertial
frame and the clocks are synchronized at the origin. In this case it becomes:
The time ct' axis consists of the events that are all at x' = 0. The Lorentz
transformation equations describe this line parametrically.
$$ct = \gamma\,ct'$$
$$x = \gamma\beta\,ct'$$
The x' axis consists of events at ct' = 0. The Lorentz transformation
equations also describe this parametrically
$$ct = \gamma\beta\,x'$$
$$x = \gamma\,x'$$
From these we verify that the reciprocal of the ct' axis slope results in
the velocity β, but we also see that for the x' axis, the slope itself results in
the velocity. So we can also now put the x' axis on the graph.
Lines representing events of constant position in x' are then drawn parallel
to the ct' axis and lines representing events simultaneous in ct' are drawn
parallel to the x' axis.
If we consider the paths of light moving along the x axis that cross the
origin, they are drawn at 45°.
If a y axis is included coming out of the screen, the light paths sweep out
a cone shape. This is known as a light cone.
TENSORS IN SR
The Pythagorean Theorem in 3 dimensions of space can be written
$$d\sigma^2 = dx^2 + dy^2 + dz^2$$
For the displacement between events in 4 dimensional space-time we
should include the temporal displacement between the events in the interval.
This can be done in such a way that the displacement calculated is the same
according to any inertial frame. We define the invariant interval as
$$ds^2 = dct^2 - d\sigma^2$$
which can be written
$$ds^2 = dct^2 - dx^2 - dy^2 - dz^2$$
To verify that this interval has been constructed in an invariant way insert
the expressions for the differentials from the differential form of the Lorentz
coordinate transformation equations Eqn 1.1.5.
After simplification, the interval reduces to the same form
$$ds^2 = dct'^2 - dx'^2 - dy'^2 - dz'^2$$
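The simplification can be written out step by step. The following is a sketch of the algebra, using only the differential transformation for a boost along x, $dct = \gamma(dct' + \beta\,dx')$ and $dx = \gamma(dx' + \beta\,dct')$ with $dy = dy'$ and $dz = dz'$, together with the definition $\gamma^2(1 - \beta^2) = 1$:

```latex
\begin{aligned}
ds^2 &= dct^2 - dx^2 - dy^2 - dz^2 \\
     &= \gamma^2 (dct' + \beta\,dx')^2 - \gamma^2 (dx' + \beta\,dct')^2 - dy'^2 - dz'^2 \\
     &= \gamma^2 (1 - \beta^2)\,\bigl(dct'^2 - dx'^2\bigr) - dy'^2 - dz'^2 \\
     &= dct'^2 - dx'^2 - dy'^2 - dz'^2
\end{aligned}
```

The cross terms $2\gamma^2\beta\,dct'\,dx'$ cancel between the two squared brackets, which is why the minus sign in front of the space terms is essential.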
When we observe an object in motion and describe the length of its path
through space-time by the invariant interval we should realise that the object
does not move according to its own frame
$$dx' = dy' = dz' = 0$$
therefore the invariant interval reduces to $ds = c\,d\tau$. Since the interval is an
invariant and it is equal to the proper time of the object, we can look at proper
time $d\tau$ as an invariant.
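We can choose to define a displacement four-vector $dx^\mu$ with the following
relations
$$dx^0 = dct$$
$$dx^1 = dx$$
$$dx^2 = dy$$
$$dx^3 = dz$$
Here we introduce the metric tensor for special relativity $\eta_{\mu\nu}$
$$\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$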
Using Einstein summation as discussed the invariant interval is written
in more compact notation as
$$ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu$$
Light paths are described by ds = 0, therefore any path given by ds = 0 is
called a light-like path. A path where the overall sign of $ds^2$ is negative is
called a space-like path. A path where the overall sign of $ds^2$ is positive is
called a time-like path.
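The three cases can be expressed as a small classifier. This is a sketch only: the function name `classify` is made up, and the exact comparison with zero is appropriate for the integer examples shown but would need a tolerance for measured floating-point data.

```python
def classify(dct, dx, dy, dz):
    """Classify a displacement by the sign of ds^2 = dct^2 - dx^2 - dy^2 - dz^2."""
    ds2 = dct**2 - dx**2 - dy**2 - dz**2
    if ds2 > 0:
        return "time-like"
    if ds2 < 0:
        return "space-like"
    return "light-like"

print(classify(5, 3, 0, 0))   # time-like
print(classify(1, 1, 0, 0))   # light-like
print(classify(0, 3, 0, 0))   # space-like
```

Because $ds^2$ is invariant, the label assigned to a pair of events is the same in every inertial frame.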
According to the first postulate of special relativity the laws of physics
are the same for every inertial frame. Therefore when modeling the general
laws of physics with equations we must use equations that do not change their
basic form when transformed from one frame to another. For instance, if we
use one coordinate system to write an equation like
$$F(ct,x,y,z) - G(ct,x,y,z) = 0,$$
then in any other coordinate system it should also be
$$F'(ct',x',y',z') - G'(ct',x',y',z') = 0$$
And for example, it should not become
$$F'(ct',x',y',z') - G'(ct',x',y',z') = H'(ct',x',y',z')$$
If such an equation does transform like this then it is not one of the
fundamental equations of physics.
We will define a tensor in terms of its transformation properties.
First recall the following differential form of the Lorentz transformation
equations Eqn. 1.1.6
$$dct' = \gamma\,(dct - \beta\,dx)$$
$$dx' = \gamma\,(dx - \beta\,dct)$$
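These can be written in matrix form as
$$\begin{pmatrix} dct' \\ dx' \\ dy' \\ dz' \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} dct \\ dx \\ dy \\ dz \end{pmatrix}$$
From this we define the Lorentz transformation matrix
$$\Lambda = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
And its inverse as
$$\Lambda^{-1} = \begin{pmatrix} \gamma & \gamma\beta & 0 & 0 \\ \gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$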
Now the Lorentz transformation equations can be written as matrix
equations as
$$\vec{dx}' = \Lambda\,\vec{dx}$$
and
$$\vec{dx} = \Lambda^{-1}\,\vec{dx}'$$
In element notation, using the Einstein summation convention the
transformation can be written
$$dx'^\mu = \Lambda^\mu{}_\nu\,dx^\nu$$
Note:
$$\Lambda^\mu{}_\nu = \partial x'^\mu / \partial x^\nu$$
and this transformation is just the ordinary chain rule of calculus.
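For special relativity a tensor will be anything that Lorentz transforms. A
contravariant tensor will be any quantity that transforms between frames
according to
$$\vec{T}' = \Lambda\,\vec{T}$$
$$T'^\mu = \Lambda^\mu{}_\nu\,T^\nu$$
A covariant tensor will be any quantity that transforms between frames
according to
$$T'_\mu = \Lambda_\mu{}^\nu\,T_\nu$$
where $\Lambda_\mu{}^\nu = \partial x^\nu / \partial x'^\mu$ is the inverse transformation.
There are also mixed tensors. For example
$$T'^\mu{}_\nu = \Lambda^\mu{}_\alpha\,\Lambda_\nu{}^\beta\,T^\alpha{}_\beta$$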
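One consequence of these two transformation rules is that the contraction of a contravariant vector with a covariant vector comes out the same in every frame. The sketch below checks this numerically for a boost along x, assuming the covariant components transform with the inverse matrix as described above. All names and the sample component values are illustrative.

```python
import math

def boost(beta):
    """Lorentz matrix for S -> S' along x; boost(-beta) is its inverse."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transform_contravariant(lam, t):
    """T'^mu = Lambda^mu_nu T^nu."""
    return [sum(lam[m][n] * t[n] for n in range(4)) for m in range(4)]

def transform_covariant(lam_inv, t):
    """T'_mu = (Lambda^-1)^nu_mu T_nu, i.e. multiply by the inverse, transposed."""
    return [sum(lam_inv[n][m] * t[n] for n in range(4)) for m in range(4)]

def contract(upper, lower):
    """Sum over the repeated index: A^mu B_mu."""
    return sum(u * l for u, l in zip(upper, lower))

A_up = [2.0, 1.0, 0.0, 3.0]       # sample contravariant components
B_down = [1.0, 4.0, -2.0, 0.5]    # sample covariant components

A_up_p = transform_contravariant(boost(0.6), A_up)
B_down_p = transform_covariant(boost(-0.6), B_down)

print(contract(A_up, B_down))       # 7.5
print(contract(A_up_p, B_down_p))   # approximately 7.5 again
```

The two factors of the transformation cancel in the contraction, which is the reason repeated indices are always paired one up and one down.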
From these transformation properties we can deduce that for an individual
particle,
• A sum or difference of tensors is still a tensor.
• A product of tensors is still a tensor.
• A tensor multiplied or divided by an invariant is still a tensor.
Note: These rules apply only when the tensors involved describe that
which is observed, not the state of the observer himself. So for example let
$F_{\mu\nu}$ be a tensor describing something observed, say the electromagnetic
field, and let $U^\nu$ be the four-vector velocity of the observer (c,0,0,0). It turns out
that the electric field given by
$$E_\mu = F_{\mu 0} = F_{\mu\nu}U^\nu / c$$
is NOT a tensor. As $U^\nu$ is the four-vector velocity of whoever is the observer,
everyone uses (c,0,0,0) as a result and the expression does not transform as a
four-vector: $E'_\mu = F'_{\mu 0} \neq (\partial x^\nu / \partial x'^\mu)\,F_{\nu 0}$. If $U^\nu$ were the four-vector velocity of
one "particular" observer then the expression would transform as a tensor,
but then it wouldn't represent the electric field to anyone except that observer,
and then only when F is the electromagnetic field already expressed
according to his own frame. Likewise the magnetic field
$$B_\mu = -\tfrac{1}{2}\,\epsilon_{\mu 0 \alpha\beta}F^{\alpha\beta} / c = -\tfrac{1}{2}\,\epsilon_{\mu\nu\alpha\beta}F^{\alpha\beta}U^\nu / c^2$$
where $U^\nu$ is the four-velocity of the observer (c,0,0,0) is also not a tensor.
In relativity we write the fundamental equations of physics as tensor
equations such as
$$T^{\mu\nu\ldots} = 0$$
because this doesn't change its form in a frame transformation. For
instance, using the above transformation properties, it is easy to show that in
any other frame this equation remains in the same form
$$T'^{\mu\nu\ldots} = 0$$
This is what satisfies the first postulate of special relativity, that the laws
of physics are the same for all inertial frames.
Notice that since we model the laws of physics with tensor equations
whose expressions are tensors defined by the transformation properties of the
coordinates, and since the Lorentz coordinate transformations are one-to-one
and invertible, there can be no true paradoxes in special relativity.
SIMPLE RELATIVITY DYNAMIC IMPLICATIONS
In many texts mass has been defined in a circular manner. Some such
texts have asserted a four-vector momentum in the form of $p^\mu = mU^\mu$ as a
premise, which doesn't work for massless particles, and then defined mass as
the contraction of that vector or vice versa.
In order to avoid circularity and to include massless particles and also in
order to facilitate a smoother transition to relativistic quantum mechanics this
text will take a newer though not unique approach. For example, here there
will be two different momentum four-vectors distinguished by capitalization
p^ 1 , and P^. The lower case will be the momentum four-vector of the first kind
and the upper case will be the momentum four-vector of the second kind. This
is done in part because a particle's mass will be defined as the contraction of
the momentum four-vector of the first kind which is the momentum four-vector
referred to in classical relativistic (non-quantum relativistic) texts. The
momentum four vector of the second kind is here defined mainly because its
elements are what will correspond to quantum operators in relativistic quantum
mechanics. (Some authors choose the capitals the other way around)
From experiments or due to quantum mechanics we know that the
magnitude of the three component momentum of a particle can be related to a
wavelength (whether or not the particle has mass),
$p = h/\lambda$
and the relation between the three element momentum and the three element
wave vector k (whether or not the particle has mass) is
$\mathbf{p} = \hbar\mathbf{k}$
Also, a fourth element corresponding to a time coordinate can be related
to a frequency (whether or not the particle has mass), and that element times c
we will term relativistic energy $E_R$:
$E_R = p^0c = \hbar\omega$
where $\omega$ can be related to the wavelength through
$k = 2\pi/\lambda$
and from the group velocity
$\beta c = d\omega/dk$
(with $\beta = pc/E_R = ck/\omega$, so that $\omega\,d\omega = c^2k\,dk$):
$\omega^2 = c^2k^2 + \text{constant}$
The integration constant will turn out to be proportional to the square of
the mass.
We will start with the premise that this four component definition of
momentum constitutes a four-vector that will be called the momentum
four-vector of the first kind, $p^\mu$.
Next consider the introduction of a four-vector potential $\phi^\mu$ to which the
test particle responds with a charge q. It does not matter at this point if this
charge is electric, only that the vector potential to which it responds is a
four-vector. We will define the momentum four-vector of the second kind by
$P^\mu = p^\mu + (q/c)\phi^\mu$
and we will call its time element times c the total energy E:
$E = P^0c$
As an artifact from physics texts that do not make this distinction, when
the potential is not zero one might think of the total energy E as $p^0c$ even
though it is really $P^0c$, which is $E_R + q\phi$ in special relativity. At the same
time, one would think of the relativistic momentum as $p^i$. As a result one may
think of energy E, because it contains a potential, as something that can have an
arbitrary constant added to it, but would think of momentum as something
that cannot. This superficially seems to draw a distinction between time and
space, as energy corresponds to a time element and momentum corresponds
to spatial elements.
However, here where the distinction between momentum four-vectors is
made, one finds that there is no such distinction between time and space. This
is because it is in $P^\mu$ that such an arbitrary constant can be added to the potential,
and it is also in that four-vector that such arbitrary constants can be added to
the spatial components of the vector potential. One can do this so long as one
demands that they transform as the coordinates do, as a four-vector.
The special relativistic definition for the mass of a particle given those
relations is
$m^2c^2 = |\eta_{\mu\nu}[P^\mu - (q/c)\phi^\mu][P^\nu - (q/c)\phi^\nu]|$
or
$m = [(E_R/c^2)^2 - (p/c)^2]^{1/2} = E_0/c^2$
This could just as well be expressed as
$m^2c^2 = |\eta_{\mu\nu}p^\mu p^\nu|$
In the second form we define $E_0$ as the relativistic energy evaluated at zero
velocity, $E_0 = E_R|_{v=0}$. Magnitude bars are included above merely so that the
choice of sign convention for the metric's signature is arbitrary.
Though all of these equations are equivalent given the above relations, the definition
in terms of the momentum four-vector of the second kind is preferable
in quantum mechanics discussions because that is what yields relativistic
quantum mechanics. For example, when one replaces the elements of the
momentum four-vector of the second kind with the energy and momentum
operators of quantum mechanics, and operates with that on the wave function, it
yields the Klein-Gordon equation with the inclusion of a nonzero vector
potential
$\eta^{\mu\nu}[P_{op\,\mu} - (q/c)\phi_\mu][P_{op\,\nu} - (q/c)\phi_\nu]\Psi = m^2c^2\Psi$
which can also be written
$[(H - q\phi)^2 - (\mathbf{P}_{op}c - q\boldsymbol{\phi})^2]\Psi = m^2c^4\Psi$
where
$H = i\hbar\,\partial/\partial t$
and
$\mathbf{P}_{op} = -i\hbar\nabla$
The above definition of mass, that mass is rest energy, $m = E_0/c^2$,
$E_0 = E_R|_{v=0}$, or, in the case of a system of particles, that it is centre of
momentum frame relativistic energy, $m = E_{cm}/c^2$,
$E_{cm} = p^0_{cm}c|_{v_{cm}=0}$, is the definition that
we will use throughout the rest of this treatment of special relativity wherever the
letter m or the word mass is used unqualified. This is the m that goes into the
relativistic version of Newton's second law in the form
$F^\lambda = mA^\lambda$
(four-force = mass times four-acceleration)
This mass is an invariant. It does not change with speed! The equations defining it are
called the mass-shell condition, because they are of isomorphic form to the
equation of a spherical shell. Under the above definition of mass, a photon
does not have mass. Due to quantum mechanical issues, virtual particles do
not tend to have the expected value of energy for a given momentum, so
sometimes it is said that such a particle lies off shell.
The coordinate velocity of a particle is simply given by
$u^\mu = dx^\mu/dt$
We write the four-vector velocity, or proper velocity, as
$U^\mu = dx^\mu/d\tau$
where $\tau$ is called proper time, which is just time according to the frame
of the particle at its location.
Through time dilation, $dt/d\tau = \gamma$, we can relate the two:
$U^\mu = \gamma u^\mu$
Consider the following expression
$\eta_{\mu\nu}mU^\mu mU^\nu$
$\eta_{\mu\nu}mU^\mu mU^\nu = [(mU^0)^2 - (mU^1)^2 - (mU^2)^2 - (mU^3)^2]$
$\eta_{\mu\nu}mU^\mu mU^\nu = m^2[(d\,ct/d\tau)^2 - (dx/d\tau)^2 - (dy/d\tau)^2 - (dz/d\tau)^2]$
$\eta_{\mu\nu}mU^\mu mU^\nu = m^2(dt/d\tau)^2[c^2 - (dx/dt)^2 - (dy/dt)^2 - (dz/dt)^2]$
$\eta_{\mu\nu}mU^\mu mU^\nu = m^2\gamma^2c^2(1 - v^2/c^2)$
$\eta_{\mu\nu}mU^\mu mU^\nu = m^2c^2\gamma^2\gamma^{-2}$
$\eta_{\mu\nu}mU^\mu mU^\nu = m^2c^2$
Next we refer to the definition of mass to arrive at
$\eta_{\mu\nu}mU^\mu mU^\nu = \eta_{\mu\nu}p^\mu p^\nu$
Final examination of this reveals the relation between four-vector velocity
and the four-vector momentum of the first kind.
$p^\mu = mU^\mu$
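The contraction worked out above can be checked numerically. The following is a minimal sketch, assuming the (+, -, -, -) signature used in this section; the function names and the sample mass and velocity are illustrative choices, not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_gamma(v):
    """Lorentz factor for a three-velocity v = (vx, vy, vz) in m/s."""
    speed_sq = sum(component ** 2 for component in v)
    return 1.0 / math.sqrt(1.0 - speed_sq / C ** 2)


def four_velocity(v):
    """U^mu = gamma * (c, vx, vy, vz)."""
    g = lorentz_gamma(v)
    return (g * C, g * v[0], g * v[1], g * v[2])


def minkowski_square(a):
    """eta_{mu nu} a^mu a^nu with signature (+, -, -, -)."""
    return a[0] ** 2 - a[1] ** 2 - a[2] ** 2 - a[3] ** 2


def mass_from_momentum(p):
    """Mass recovered from m^2 c^2 = |eta_{mu nu} p^mu p^nu|."""
    return math.sqrt(abs(minkowski_square(p))) / C


# Illustrative values: a 2 kg particle moving at about 0.71c.
mass = 2.0
velocity = (0.3 * C, 0.4 * C, 0.5 * C)
momentum = tuple(mass * u for u in four_velocity(velocity))
recovered_mass = mass_from_momentum(momentum)
```

Whatever velocity is chosen (below c), `recovered_mass` should agree with `mass` up to rounding, which is the invariance the derivation expresses.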
From this relation we can then discover the relation between four-momentum
and coordinate velocity for massive particles
$p^\mu = \gamma mu^\mu$
or write the relation for particles that may or may not have mass
$p^\mu = (E_R/c)(u^\mu/c)$
The $\gamma$ term is physically associated with the velocity term through time
dilation. In the past a few physicists, starting with Planck, Lewis, and Tolman,
not Einstein, have mis-associated the $\gamma$ term with the mass, defining a new
kind of mass
$M = \gamma m$ <- Bad
This M is then inappropriately called "relativistic mass". In the absence
of a potential, the zeroth element of the momentum four-vector is defined as
the energy divided by c, resulting in
$p^0 = Mu^0$
$E/c = Mc$
$M = E/c^2$ <- Bad
Though much more complicated in the long run, the math is consistent
and leads to consistent predictions concerning observation, and so one might
argue that the physics is therefore correct. But, in keeping with Occam's razor,
this definition and method must be done away with. The m in this method is then
inappropriately qualified and called the "rest mass".
It is wrong to do this for the following reason. Calling m the "rest mass"
implies to the listener that m is not the mass according to other frames for which
it is not at rest. We have already noted that m is an invariant: it has the same
value as calculated according to any frame, not just the rest
frame. The relativistic mass method also leads to many erroneous conclusions.
For one of many examples, by that method light has zero "rest mass", and it has
been argued that since light is not at rest in any frame, the question of
whether it has mass at rest or "rest mass" is unanswerable. No. m = 0 is
observed as the contraction of a photon's four-momentum according to any
frame, not just the "rest frame".
In short, the terms "relativistic mass" and "rest mass" need to be done
away with, and the real mass m which is actually observed is an invariant. It does
not change with speed. Also, by this physically correct definition, a photon,
or anything that travels at the Lorentz invariant speed c, has zero mass.
We have
$p^0 = E_R/c$.
We have also demonstrated the relation between four-momentum and
four-velocity, resulting in
$p^0 = mU^0$.
Putting these together we have
$E_R/c = mU^0$
$E_R/c = m(d\,ct/d\tau)$
$E_R = (dt/d\tau)mc^2$
$E_R = \gamma mc^2$ <- Good
This is the mass - relativistic energy relationship for a massive particle.
Now this energy does not go to zero as v goes to zero, so we see that a massive
particle still has energy even when it is at rest. This tells us that mass is
equivalent to rest energy, meaning relativistic energy at zero velocity
$E_0 = E_R|_{v=0} = mc^2$ <- Good
The kinetic energy of a particle is the amount of energy that is associated
with its motion only. Therefore
$E_K = E_R - E_0$
This results in
$E_K = (\gamma - 1)mc^2$
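As a quick numerical illustration of these energy relations, here is a minimal sketch. The function names and sample values are assumptions made for the example; it simply evaluates the relativistic energy, the rest energy, and their difference.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_gamma(speed):
    """Lorentz factor for a speed in m/s (must be below C)."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)


def relativistic_energy(mass, speed):
    """E_R = gamma m c^2."""
    return lorentz_gamma(speed) * mass * C ** 2


def rest_energy(mass):
    """E_0 = m c^2, the relativistic energy at zero velocity."""
    return mass * C ** 2


def kinetic_energy(mass, speed):
    """E_K = E_R - E_0 = (gamma - 1) m c^2."""
    return (lorentz_gamma(speed) - 1.0) * mass * C ** 2


# At 0.6c the Lorentz factor is 1.25, so E_K is a quarter of the rest energy.
ratio_at_0_6c = kinetic_energy(1.0, 0.6 * C) / rest_energy(1.0)

# At low speed E_K approaches the Newtonian (1/2) m v^2.
slow = 0.01 * C
newtonian = 0.5 * 1.0 * slow ** 2
relativistic = kinetic_energy(1.0, slow)
```

The low-speed comparison shows why the Newtonian expression works so well in everyday situations: the two values differ only at the level of v^2/c^2.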
The stress energy tensor is a tensor that contains information about the
density of energy, momentum, stresses, etc. contained in the space. The stress energy
tensor for mass alone is
$T^{\mu\nu} = \rho_0U^\mu U^\nu$
The $T^{00}$ component of this is
$T^{00} = (dt/d\tau)^2\rho_0c^2$
$\rho_0$ is the mass density according to a frame moving with that bit of mass,
but because of special relativistic Lorentz length contraction of the local mass
the coordinate frame mass density is then
$\rho = (dt/d\tau)\rho_0$
So this becomes
$T^{00} = (dt/d\tau)\rho c^2$
But this is just the coordinate frame energy density.
The simplest consistent general relativistic definition of coordinate frame
energy density is then just
coordinate frame energy density = $T^{00}$
For more general stress-energy tensors it is common to define $\rho_0$ as
$\rho_0 = T^{\mu\nu}U_\mu U_\nu/c^4$
If $\rho_0$ is to be positive then $T^{\mu\nu}U_\mu U_\nu$ must be greater than 0. For this not
to be the case is called a violation of the weak energy condition. More generally
speaking, the weak energy condition is
$T^{\mu\nu}V_\mu V_\nu \geq 0$
for any timelike vector V. Matter may only violate this condition within limits
set by the Pfenning inequality.
Other elements have other interpretations. For instance $T^{ii}$ is a flow of
momentum per area in the $x^i$ direction, or the pressure on a plane whose normal
is in the $x^i$ direction. $T^{ij}$ is the $x^i$ component of momentum per area in the $x^j$
direction, or describes a shearing from stresses. $T^{0i}$ is the volume density of
the i th component of momentum flow.
Next we will discuss the concept of system mass. We have seen that for
a single particle mass is equivalent to rest energy,
$E_0 = mc^2$.
For a system of particles the best concept for system mass m is defined
through the centre of momentum frame energy $E_{cm}$:
$E_{cm} = mc^2$.
The system mass does not turn out to be equal to the total or sum of
masses $m_{tot}$ of the constituent parts. Instead it is the total energy summed for
all of the constituent parts according to the centre of momentum frame.
Consider for a moment a Lorentz invariant for the system consistent with
the mass shell condition.
First define
$p_{sys}' = [E_{cm}/c, 0, 0, 0]$
as a four-element vector for the inertial centre of momentum frame. Then
define $p_{sys}$ as the Lorentz transform of this for any frame of interest:
$p_{sys} = \Lambda p_{sys}'$
Due to relative simultaneity, $p_{sys}$ as defined here is not always equal to
the "simultaneous" sum of the four-momenta of the constituent parts when
there are external forces acting at various locations on the system. The system
mass is defined as the following invariant:
$m^2c^2 = \eta_{\mu\nu}p_{sys}^\mu p_{sys}^\nu$
or for the scenario described above,
$m = [(E_{Rsys}/c^2)^2 - (p_{sys}/c)^2]^{1/2} = E_{cm}/c^2$
Considering the time element of the transformation restores the relation
$E_{Rsys} = \gamma_{cm}mc^2$
Proof that this definition of system four-momentum is the same as the
sum of the four-momenta of the system's components goes as follows. Start
with the sum of four-momenta for an arbitrary frame.
$p_{sys} = \sum_i p_i$
Lorentz boost to another frame
$\Lambda p_{sys} = \Lambda\sum_i p_i$
Interchange sum and transformation symbols
$\Lambda p_{sys} = \sum_i \Lambda p_i$
The Lorentz transform of each four-momentum is the four-momentum
according to the new frame
$\Lambda p_{sys} = \sum_i p_i'$
But the right side is the net four-momentum according to the new frame
$\Lambda p_{sys} = p_{sys}'$
This proves that the system net four-momentum is indeed a four-vector
itself and yields
$E_{Rsys} = \gamma_{cm}mc^2$
where m is the system's mass, defined by its centre of momentum frame energy, as
well as
$\mathbf{p}_{sys} = \gamma_{cm}m\mathbf{u}_{cm}$
and
$m^2c^2 = \eta_{\mu\nu}p_{sys}^\mu p_{sys}^\nu = E_{Rsys}^2/c^2 - p_{sys}^2 = E_{cm}^2/c^2$
The reason that the system mass m defined this way is not the same as the "total" of
constituent masses, $m_{tot}$, is that the sum of masses of the constituent parts
does not always equal the centre of momentum frame energy. For example, a
system of massless particles satisfies the mass shell condition with zero mass when they all
move in the same direction, while the system has a nonzero system mass
when they move in different directions.
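The example of massless particles can be made concrete with a short calculation. The sketch below is illustrative only: the helper names and the one-joule photon energies are assumptions chosen for the example. It sums photon four-momenta and evaluates the invariant of the total.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def photon_four_momentum(energy, direction):
    """p^mu = (E/c, (E/c) n) for a massless particle moving along unit vector n."""
    norm = math.sqrt(sum(d * d for d in direction))
    return (energy / C,) + tuple(energy / C * d / norm for d in direction)


def total_four_momentum(momenta):
    """Component-wise sum of a list of four-momenta."""
    return tuple(sum(components) for components in zip(*momenta))


def system_mass(p):
    """m = sqrt(eta_{mu nu} p^mu p^nu) / c with signature (+, -, -, -)."""
    s = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return math.sqrt(max(s, 0.0)) / C


# Two photons of 1 J each, moving in the same direction: system mass is zero.
same_direction = total_four_momentum([
    photon_four_momentum(1.0, (1.0, 0.0, 0.0)),
    photon_four_momentum(1.0, (1.0, 0.0, 0.0)),
])

# The same photons moving in opposite directions: system mass is 2 J / c^2.
opposite_directions = total_four_momentum([
    photon_four_momentum(1.0, (1.0, 0.0, 0.0)),
    photon_four_momentum(1.0, (-1.0, 0.0, 0.0)),
])

mass_same = system_mass(same_direction)
mass_opposite = system_mass(opposite_directions)
```

In the second case the frame of the calculation is already the centre of momentum frame, so the system mass times c^2 is just the total energy of the two photons.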
One advantage the definition of centre of momentum frame energy for
mass has over "total mass" is that by this definition, not only is mass an
invariant, but this mass of a system is also conserved. Note that the concept of
captive mass is equivalent to centre of momentum frame energy m and not the
total of masses m tot . In order to increase the system mass m one must increase
the total centre of momentum frame energy E cm equivalently. This demonstrates
that mass defined by m is conserved in the same way that energy is.
Transferring energy from some external matter to change the centre of
momentum frame energy of an object will increase its individual system mass,
but when you extend the system to include the matter from which the energy
was transferred, it will always be found that centre of momentum frame energy
or system mass m is ultimately conserved.
A sum of invariants is also an invariant, and so one could just as well
write the total of masses $m_{tot}$ as a sum over the constituent parts. For a system of
n particles this could be written
$m_{tot} = [(\eta_{\mu\nu}p_1^\mu p_1^\nu)^{1/2} + (\eta_{\mu\nu}p_2^\mu p_2^\nu)^{1/2} + \ldots + (\eta_{\mu\nu}p_n^\mu p_n^\nu)^{1/2}]/c$
or
$m_{tot} = m_1 + m_2 + \ldots + m_n$
The subscript indicates the particle number. Again, the major problem
with thinking of a system mass as this is that this total of masses is not
conserved. However, part of the reason this is brought up is that people do
tend to think that mass is that kind of sum. This leads to another
misunderstanding as to what is meant by "mass to kinetic energy conversion".
Consider for example a massive particle that decays into two massless photons.
Because system energy is conserved and the system energy for the centre
of momentum frame is the system mass, the system mass did not change in
the decay. What did change was that the energy initially was associated with
rest, the rest energy of the particle, but finally was associated with motion,
the kinetic energies of the photons. In that light one should really not say there
is mass-energy conversion.
The energy and system mass for the system is conserved. One should
instead say that energy associated with the resting particles of the initial state
becomes associated with the motion of the particles in the final state. Since
this is cumbersome the term mass-energy conversion is used, but be wary that
what it refers to is that a change in the sum of masses can be the cause of the
change in kinetic energies of the remaining masses. Just remember that the
system mass of a closed system doesn't change or "convert" into anything.
Sometimes it is more useful to define an invariant mass density instead.
Just as there are two ways to describe the masses of a system above, m and
$m_{tot}$, there are two important ways of describing an invariant mass density.
The first is the $\rho_0$ in the relation
$T^{\mu\nu} = \rho_0U^\mu U^\nu$
This definition most closely corresponds to the mass m description of
system mass above. It relates the stress energy tensor for matter composed of
non-interacting constituents to the four-velocity of the unpressurized "fluid"
at any given location.
The total energy for the system is conserved and could instead be defined
by the following volume integral
$E_{sys} = \iiint T^{00}\,dx\,dy\,dz$
For the example stress energy tensor this would become
$E_{sys} = \iiint \gamma_{fluid}^2\rho_0c^2\,dx\,dy\,dz$
One can also instead define the system momentum from the next integral
$\mathbf{p}_{sys} = \iiint (T^{0i}\mathbf{e}_i/c)\,dx\,dy\,dz$
For the example stress energy tensor this would become
$\mathbf{p}_{sys} = \iiint (\gamma_{fluid}^2\rho_0u^i\mathbf{e}_i)\,dx\,dy\,dz$
$\mathbf{e}_i$ is a unit vector in the direction of the i th momentum component.
One then still can define the system's mass as the centre of momentum
frame energy. It is the energy for the frame according to which
$\mathbf{p}_{sys} = 0$. So we still have
$E_{cm} = mc^2$.
The other invariant mass density concept corresponding to the total of
masses $m_{tot}$ would be
$\rho_{tot} = \eta_{\mu\nu}T^{\mu\nu}/c^2$
This kind of mass density is an invariant, but its volume integral is not
conserved. This kind of mass is what is meant when one refers to a field such
as the electromagnetic field as a massless field, or when one refers to any
system as massless. This is zero for any system of massless particles.
Of the two descriptions of system mass, the mass m concept is far more
useful.
Recall the relativistic energy of a massive particle,
$E_R = \gamma mc^2$
where $\gamma$ was given by
$\gamma = (1 - v^2/c^2)^{-1/2}$
Notice that the energy becomes divergent at v = c for nonzero mass. Thus
no matter how hard or how long you push on a mass, you can never impart
enough energy to it so that it reaches the speed c. The only way such a thing
can travel at the speed c and still have finite energy is if it has zero mass. In
that case, instead of a mass energy relation, there is an energy momentum relation
that follows from the mass shell condition,
$E = E_R = pc$,
where E and p are related to frequency and wavelength.
One might consider the case of a particle that, instead of being pushed
beyond the speed c, moves faster than c upon its creation. Such a hypothetical
particle is called a tachyon. Notice that if v is greater than c then $\gamma$ is imaginary.
Since imaginary energy makes no physical sense, we would expect that the
mass would also have to be imaginary so that the energy (and momentum) would
be real.
The primary problem with the existence of such a particle is that it could
be used to violate the principle of causality. The principle of causality is simply
the statement that effect never precedes cause. Imagine setting up a tachyon
emitter and a tachyon receiver at different points along an S frame x axis.
Let's say that the signal travels arbitrarily fast so that the event of transmission
and the event of reception are virtually simultaneous.
Next recall that events simultaneous in one frame are not all simultaneous
in other frames. We could then easily pick a frame from which to look at the situation in
which the event of reception precedes the event of transmission.
This is a violation of causality.
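The frame dependence of the time ordering can be illustrated with the Lorentz transformation of a time interval, $\Delta t' = \gamma(\Delta t - v\Delta x/c^2)$. The following is a minimal sketch under that standard formula; the function names, the default one-metre separation, and the sample speeds are assumptions made for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def time_separation_in_moving_frame(dt, dx, v):
    """Delta t' = gamma (Delta t - v Delta x / c^2) for a frame moving at speed v along x."""
    g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return g * (dt - v * dx / C ** 2)


def reception_precedes_transmission(signal_speed, v, dx=1.0):
    """True if, in the frame moving at speed v, the signal is received before it is sent.

    The signal covers the distance dx at signal_speed in the S frame.
    """
    dt = dx / signal_speed
    return time_separation_in_moving_frame(dt, dx, v) < 0.0


# A hypothetical signal at 10c, viewed from a frame moving at 0.5c: order reversed.
reversed_for_tachyon = reception_precedes_transmission(10.0 * C, 0.5 * C)

# A light-speed signal viewed from the same frame: order preserved.
reversed_for_light = reception_precedes_transmission(C, 0.5 * C)
```

The order reverses only when the frame speed v exceeds c^2 divided by the signal speed, which is impossible for any signal travelling at c or slower.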
Worse yet, it then leads to grandfather paradoxes. The grandfather paradox
is the idea that a time traveller goes back in time and kills his grandfather
before his father was conceived. To set up a grandfather paradox with tachyons
we simply set the receiver in motion away from the transmitter and give it a
relay transmitter. We call the frame in which it is stationary the S' frame. We
also connect a receiver to the S frame transmitter. We then programme the S
frame transmitter so that in, say, 1 hr it will send a signal unless its receiver
receives a signal.
To begin, let's say that it receives no signal and so it sends one. The signal
arrives at the relay, which is moving away, and sets off the relay transmitter.
Now the relay transmitter sends a return signal, but the return signal travels
back to the S frame receiver/transmitter setup virtually instantaneously
according to the relay's S' frame. Due to the lack of simultaneity between
the frames the return signal will be received back at the S frame transmitter at
a time prior to the original transmission. But because we programmed it not
to send a signal if it receives one, it will now not send a signal. But then there
is no signal to receive and so it sends one.
It is sometimes said that special relativity says that nothing can travel
faster than the speed of light. As discussed, what it really implies is that as long
as we restrict our physics to special relativity, and we wish to preserve causality,
information cannot travel faster than c. Likewise, as long as we restrict our
physics to special relativity, nothing with mass can travel at the speed c.
There have been experiments done in which the physicists involved say
that they have indeed been able to get electromagnetic waves to propagate
information faster than c through a dispersive medium.
In particular the controversy is over gain assisted faster than c group
velocity transmission demonstrated in anomalous dispersive media.
If their claims that it was the "information" that has indeed been transferred
at faster than c speeds are correct, then SR implies that we can find a frame
according to which the reception of a signal at one end of the apparatus precedes
the transmission of the signal at the other end. This would indeed be a causality
violation; it brings into question the validity of the principle of causality and
causes us to reevaluate the (im)possibility of a physical grandfather paradox.
However, it is conceivable that the universe may be structured in such a way
that such a causality violation is attainable, but that grandfather paradoxes
will still not be allowed. For example, if one of their dispersive media faster
than c experiments were devised in an attempt to simulate the relay-transmitter
paradox discussed above, one could hypothetically set such a receiver in motion
relative to such a medium, but that medium is what determines the speed of the
electromagnetic waves according to its rest frame. A relay transmitter sending
the signal through the same medium would not end with the signal arriving at
a time prior to transmission.
The following is an explanation of why the experiments are not completely
convincing of faster than c "information" transfer.
As an example, in a 6.0 cm medium a laser pulse has been transmitted
that traversed the distance at a speed of 310c. This is a group wave speed,
not a phase wave speed. The figures below are a recreation representative of
a receiver's data for two pulses. The blue dotted curve represents the intensity
versus time curve for the reception of the 310c speed pulse and the red curve
represents the intensity versus time curve for the arrival of a c speed pulse sent at
the same time.
[Figure: received intensity versus time in microseconds for the two pulses; the second graph is a close-up.]
One can see on the close-up second graph from the horizontal shift that
the 310c curve arrived 62 ns earlier than the c speed pulse. The question then
arises whether this experiment is an example of a causality violation. In order
for causality to be violated one must have faster than c "information" transfer.
Under typical "long" transmissions one can consider the information transfer
speed to be the group wave speed or the speed of the energy carried by the
pulse. It has long been known that phase wave speeds often exceed c, which is
why it is often pointed out that the energy transmission in ordinary
waveguides occurs at the group speed, which is less than c.
As such, the information transfer speed is less than c in ordinary
waveguides. The reason that the group speed exceeding c in this experiment is not
convincing of faster than c information transfer, and the reason why the
information transfer speed for this experiment cannot be taken to be the group speed,
is as follows.
Notice that the 62 ns time shift between the two pulses is much less than
the time it takes to receive the entirety of a pulse itself. Thus the time it took
from the time the pulse began to enter the medium until the time the receiver
read the entirety of the information was the sum of group transfer and read
times. The full width at half-maximum (FWHM) of a pulse here is approximately
4.0 μs. Take that to be the read time. The group transfer time was 6.0 cm/310c =
0.65 ps. Taking the information transfer time to be the sum of these,
approximately just 4.0 μs, one finds that the information speed was 6.0 cm/
4.0 μs = $5.0\times10^{-5}$c, a mere measly fraction of the vacuum speed of light. There
are two ways one might modify this experiment so that if successful it would
clearly demonstrate faster than c information transfer.
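The arithmetic in this estimate is easy to reproduce. The sketch below only re-evaluates the numbers quoted in the text (6.0 cm, 310c, and a 4.0 μs read time); the constant and function names are chosen for the example.

```python
C = 299_792_458.0  # speed of light in m/s

MEDIUM_LENGTH = 0.060        # 6.0 cm, in metres
GROUP_SPEED_FACTOR = 310.0   # group speed as a multiple of c
READ_TIME = 4.0e-6           # pulse FWHM taken as the read time, in seconds


def group_transfer_time(length, speed_factor):
    """Time for the pulse peak to cross the medium at speed_factor * c."""
    return length / (speed_factor * C)


def information_speed_over_c(length, speed_factor, read_time):
    """Length divided by (group transfer time + read time), as a fraction of c."""
    total_time = group_transfer_time(length, speed_factor) + read_time
    return length / total_time / C


transfer_time = group_transfer_time(MEDIUM_LENGTH, GROUP_SPEED_FACTOR)
info_speed = information_speed_over_c(MEDIUM_LENGTH, GROUP_SPEED_FACTOR, READ_TIME)
```

Here `transfer_time` comes out near 0.65 ps and `info_speed` near 5.0e-5, showing that the read time dominates the total by about seven orders of magnitude.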
First, one might make the medium of transfer much longer so that in the
information transfer time it is the read time that is insignificant instead. The
reason that this may be an impossible task is that due to the dispersive nature
of the media itself, even with the gain assistance, there will be a trade-off
between the signal degradation and length. There may be a limiting trade-off
such that, in a transmission line long enough for the information transfer time
to yield a faster than c speed, the signal would have been lost. Second, one might
try to significantly narrow the pulse so that the information transfer time is
approximately just the group transfer time.
The reason that this may be an impossible task is twofold. There is a narrow
frequency range at which the light must be sent through the medium in order
for it to transfer with a faster than c group speed. This in itself puts a limit on
how narrow the pulse may be. Also, the narrower the pulse is made the more
rapidly it will tend to widen itself as it travels across the medium. By the time
it gets to the other end the read time will always be longer than the send time,
so no matter how short the send time is made one will have to contend with a
longer read time.
In conclusion, though such experiments do successfully demonstrate faster
than c group transfer, they do not conclusively demonstrate the faster than c
information transfer, which they would have to in order to show a causality
violation.
THE SR DYNAMICAL EQUATIONS
In special relativity we define a four-vector force as
$F^\lambda = dp^\lambda/d\tau$
For a particle with mass we have
$p^\lambda = mU^\lambda$
The acceleration four-vector for special relativity is given by
$A^\lambda = dU^\lambda/d\tau$
and so we can write the relativistic version of Newton's second law as
$F^\lambda = mA^\lambda$
Considering an inertial frame according to which the test mass is
instantaneously at rest it is easy to show that
$\eta_{\mu\nu}A^\mu U^\nu = 0$
which yields
$\eta_{\mu\nu}F^\mu U^\nu = 0$
This serves as the work energy theorem for modern special relativity.
Consider where $\eta_{\mu\nu}F^\mu U^\nu = 0$ leads:
$\eta_{\mu\nu}(dp^\mu/d\tau)U^\nu = 0$
$\eta_{\mu\nu}(dp^\mu/dt)U^\nu = 0$
$\eta_{\mu\nu}(dp^\mu/dt)u^\nu = 0$
$(dp^0/dt)u^0 - (d\mathbf{p}/dt)\cdot\mathbf{u} = 0$
$u^0 = c$ and $p^0c = E_R$
so
$(dE_R/dt) - (d\mathbf{p}/dt)\cdot\mathbf{u} = 0$
$dE_R = (d\mathbf{p}/dt)\cdot d\mathbf{x}$
$dE_R = dE_K$, so
$\int dE_K = \int (d\mathbf{p}/dt)\cdot d\mathbf{x}$
$W = \int (d\mathbf{p}/dt)\cdot d\mathbf{x}$
If we define another kind of force that is not a four-vector, which we will
call ordinary force, as
$\mathbf{f} = d\mathbf{p}/dt$
then we arrive at
$W = \int \mathbf{f}\cdot d\mathbf{x}$
and there we see the work energy theorem of pre-modern relativistic physics.
So Newton's second law for special relativity in terms of ordinary force
is
$f^i = dp^i/dt$
Using this force definition has its purposes, but in a lot of ways thinking
of relativistic physics in terms of nontensor quantities very much complicates
things. For example let us work out the relation between ordinary force and
coordinate acceleration. Since $p^i = m\,dx^i/d\tau$,
we can write the force as
$f^i = m(d/dt)(dx^i/d\tau)$
(We then define $\alpha^i$ by
$\alpha^i = (d/dt)(dx^i/d\tau)$
which for acceleration in the direction of motion results in
$\alpha = \gamma^3a$
and for that case of motion $\alpha$ will be equal to the proper acceleration A', which
is the acceleration as observed from a frame according to which the particle is
instantaneously at rest. The magnitude of this can be calculated from any frame
as it is an invariant, $|A'| = (-\eta_{\mu\nu}A^\mu A^\nu)^{1/2}$. This is the amount of acceleration
"felt" by the accelerated observer, and according to the inertial frame in which
the accelerated or "proper frame" observer is instantaneously at rest, a = A'.
We define $\alpha$ this way as well because it is useful: $\alpha$ as calculated
from any inertial frame according to which the force is in the direction of the
motion turns out to be equal to the proper acceleration.)
This also restores a Newtonian form
$f^i = m\alpha^i$
(Note also: when the force is in the same direction as the motion, then
the force felt by the object being pushed is equal to the ordinary force. In that
case we have $F'^i = f_{felt}^i = f^i = m\alpha^i$.)
We can eliminate $d\tau$ in terms of dt from time dilation:
$f^i = m(d/dt)(\gamma\,dx^i/dt)$
Use of the chain rule and simplification results in
$\mathbf{f} = \gamma m[\mathbf{a} + \gamma^2\mathbf{u}(\mathbf{u}\cdot\mathbf{a})/c^2]$
where $\mathbf{a}$ is the coordinate acceleration.
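The closed form for the ordinary force can be compared with a direct numerical derivative of the momentum $\gamma m\mathbf{u}$. The sketch below is illustrative: it works in units where c = 1 by default, and the function names, step size, and sample vectors are assumptions made for the example.

```python
import math


def lorentz_gamma(u, c=1.0):
    """Lorentz factor for a three-velocity u given as a tuple of components."""
    return 1.0 / math.sqrt(1.0 - sum(ui * ui for ui in u) / c ** 2)


def ordinary_force(m, u, a, c=1.0):
    """Closed form f = gamma m [a + gamma^2 u (u . a) / c^2]."""
    g = lorentz_gamma(u, c)
    u_dot_a = sum(ui * ai for ui, ai in zip(u, a))
    return tuple(g * m * (ai + g ** 2 * ui * u_dot_a / c ** 2)
                 for ui, ai in zip(u, a))


def momentum(m, u, c=1.0):
    """Three-momentum p = gamma m u."""
    g = lorentz_gamma(u, c)
    return tuple(g * m * ui for ui in u)


def force_by_finite_difference(m, u, a, dt=1e-6, c=1.0):
    """Central-difference estimate of dp/dt, stepping the velocity by +/- a dt."""
    u_plus = tuple(ui + ai * dt for ui, ai in zip(u, a))
    u_minus = tuple(ui - ai * dt for ui, ai in zip(u, a))
    p_plus = momentum(m, u_plus, c)
    p_minus = momentum(m, u_minus, c)
    return tuple((pp - pm) / (2.0 * dt) for pp, pm in zip(p_plus, p_minus))


# Sample velocity and acceleration that are not parallel (units with c = 1).
u_sample = (0.6, 0.2, 0.0)
a_sample = (0.1, -0.3, 0.05)
closed_form = ordinary_force(1.0, u_sample, a_sample)
numerical = force_by_finite_difference(1.0, u_sample, a_sample)
```

For acceleration parallel to the velocity the closed form reduces to $\gamma^3ma$, so the force and the coordinate acceleration are parallel only in special cases such as that one.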
The four-vector force for special relativity is sometimes called the
Minkowski force and is related to the electromagnetic field tensor $F_{\mu\nu}$
$[F_{\mu\nu}] = \begin{bmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & cB_z & -cB_y \\ E_y & -cB_z & 0 & cB_x \\ E_z & cB_y & -cB_x & 0 \end{bmatrix}$
by
$F^\lambda = q\eta^{\lambda\nu}(U^\mu/c)F_{\mu\nu}$
From this we can work out the relation between the components of the
electromagnetic field, the coordinate velocity and the ordinary force, which
yields
$\mathbf{f} = q(\mathbf{E} + \mathbf{u}\times\mathbf{B})$
In the case that the force is in the direction of the motion this yields
$f^i = \gamma^3ma^i$
Note that no matter what finite value the ordinary force is, as u approaches
c, $\gamma$ diverges and so the acceleration must vanish.
We expect this as nothing with mass can be pushed all the way up to the
speed of light.
We have seen how c is a speed limit for the universe. Because of this, we
must answer a question concerning velocity addition. Let's say an S' frame
observer observes an object at speed u'. An S frame observer observes the S'
frame observer to be moving at speed v. u' can be any value less than c and v
can be any value less than c. People tend to come to the wrong conclusion
that the S frame observer observes that the object moves at a speed u = u' + v
and that this speed could therefore be any speed less than 2c. They are using
the wrong velocity addition formula. Consider the following Lorentz coordinate
transformation equations in differential form.
$dx = \gamma(dx' + \beta\,dct')$
and
$dct = \gamma(dct' + \beta\,dx')$
To obtain the correct velocity addition equation divide these equations and
simplify.
$dx/dct = [\gamma(dx' + \beta\,dct')]/[\gamma(dct' + \beta\,dx')]$
simplified
$dx/dt = [dx'/dt' + v]/[1 + (dx'/dt')v/c^2]$
Now making the replacements u = dx/dt and u' = dx'/dt' we arrive at
$u = (u' + v)/(1 + u'v/c^2)$
This is the correct equation to use for that velocity addition. u' and v can
be any values less than c, but the result will always be that the speed of the
object according to the S frame, u, is less than c. One can also
use the same method to find the velocity addition equations for the case that
the object moves in the y or z directions.
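A few sample values make the behaviour of this formula clear. The following is a minimal sketch for motion along one direction; the function name and the sample speeds are assumptions made for the example.

```python
C = 299_792_458.0  # speed of light in m/s


def add_velocities(u_prime, v, c=C):
    """Relativistic velocity addition u = (u' + v) / (1 + u' v / c^2) along one axis."""
    return (u_prime + v) / (1.0 + u_prime * v / c ** 2)


# Two speeds of 0.9c combine to about 0.9945c rather than 1.8c.
fast = add_velocities(0.9 * C, 0.9 * C) / C

# Something moving at c in S' also moves at c in S.
light = add_velocities(C, 0.5 * C) / C

# At everyday speeds the result is indistinguishable from simple addition.
slow = add_velocities(10.0, 20.0)
```

The denominator is what keeps the combined speed below c, and it differs from 1 only by u'v/c^2, which is why the correction goes unnoticed at ordinary speeds.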
ROTATIONS, ROCKETS, AND FREQUENCY SHIFTS
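We have shown that velocities do not add linearly in special relativity.
For motion along one direction velocities add nonlinearly according to
$u = (u' + v)/(1 + u'v/c^2)$
Rapidity $\theta$ as a function of v is given by
$\tanh\theta = v/c$
This definition is useful as it simplifies many of the equations of dynamics. It
does this because, unlike velocity, rapidity does add linearly.
$\theta_u = \theta_{u'} + \theta_v$
The Lorentz transformation matrix is
$\Lambda = \begin{bmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Rapidity also has the following relations to $\gamma$ and $\gamma\beta$
$\gamma = \cosh\theta$
$\gamma\beta = \sinh\theta$
From these, the Lorentz transformation matrix becomes
$\Lambda = \begin{bmatrix} \cosh\theta & -\sinh\theta & 0 & 0 \\ -\sinh\theta & \cosh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$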
Comparing this to an ordinary rotation matrix makes it clear why Lorentz
transformations can be thought of as a rotation in space-time. At this point the
relation between Lorentz transformation and rotation may still seem to be a
superficial one, but once one becomes familiar with spinor calculus a much
more intimate relation is revealed.
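The additivity of rapidity, and its connection to composing boosts, can be checked with a few lines of code. This is a sketch restricted to the (ct, x) block of the transformation; the function names and sample speeds are assumptions made for the example.

```python
import math


def rapidity(beta):
    """Rapidity theta with tanh(theta) = v/c = beta."""
    return math.atanh(beta)


def boost(theta):
    """2x2 (ct, x) block of the Lorentz transformation for rapidity theta."""
    return [[math.cosh(theta), -math.sinh(theta)],
            [-math.sinh(theta), math.cosh(theta)]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def compose_speeds(beta_1, beta_2):
    """Combine two collinear speeds (as fractions of c) by adding rapidities."""
    return math.tanh(rapidity(beta_1) + rapidity(beta_2))


# Adding rapidities reproduces the velocity addition formula.
combined = compose_speeds(0.5, 0.8)            # (0.5 + 0.8) / (1 + 0.5 * 0.8)

# Two successive boosts equal one boost with the summed rapidity.
theta_1, theta_2 = rapidity(0.5), rapidity(0.8)
two_boosts = matmul(boost(theta_1), boost(theta_2))
one_boost = boost(theta_1 + theta_2)
```

The matrix identity is the hyperbolic analogue of the fact that two successive rotations about one axis give a rotation by the summed angle.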
Here we will derive and discuss the implications of the single-stage relativistic
rocket equation. The non-relativistic rocket equation is
$$\Delta v = v_{ex}\ln(m_i/m)$$
This gives the change in velocity $\Delta v$ that a rocket undergoes when accelerating in
one direction, given a constant exhaust speed $v_{ex}$, the initial mass of the
rocket $m_i$, and the final mass of the rocket $m$ after some of the ship's mass
in fuel has been burnt off.
The relativistic version of this equation in terms of rapidity $\theta$ is similar:
$$\Delta\theta = (v_{ex}/c)\ln(m_i/m)$$
The speed of the rocket is then calculated from the rapidity equation,
$$v = c\tanh\theta$$
Notice that since $\tanh\theta < 1$ for any $\theta$, $v$ is always less than $c$ no matter
how much of the ship's mass is burnt off as fuel and no matter how fast the
exhaust speed is. We can even consider tachyon exhaust where $v_{ex} > c$, and yet
the rocket still never reaches the speed of light.
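To make the contrast between the two rocket equations concrete, here is a small sketch that evaluates both for a rocket starting from rest. The exhaust speeds and mass ratios are arbitrary illustrative values, and the function names are assumptions for this example.

```python
import math

C = 299_792_458.0  # vacuum speed of light in m/s

def classical_delta_v(v_ex, mass_ratio):
    """Non-relativistic rocket equation: delta_v = v_ex * ln(m_i / m)."""
    return v_ex * math.log(mass_ratio)

def relativistic_speed(v_ex, mass_ratio, c=C):
    """Final speed of a rocket starting from rest: v = c * tanh((v_ex/c) * ln(m_i / m))."""
    delta_theta = (v_ex / c) * math.log(mass_ratio)
    return c * math.tanh(delta_theta)

v_ex = 0.5 * C  # illustrative (assumed) exhaust speed
for ratio in (2.0, 10.0, 1000.0):
    print(ratio,
          classical_delta_v(v_ex, ratio) / C,    # can exceed 1
          relativistic_speed(v_ex, ratio) / C)   # always below 1
```

For slow exhaust and modest mass ratios the two results are nearly identical; for large mass ratios the classical formula exceeds c while the relativistic one saturates below it.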
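Start with conservation of momentum and energy relating the initial and
final states of the rocket and exhaust for a small element $m_{fex}$ burned off.
$$\gamma m v = (m + dm)[\gamma v + d(\gamma v)] + m_{fex}\gamma_{fex}v_{fex}$$
$$\gamma m c^2 = (m + dm)(\gamma + d\gamma)c^2 + m_{fex}\gamma_{fex}c^2$$
Simplified
$$0 = \gamma v\,dm + m\,d(\gamma v) + m_{fex}\gamma_{fex}v_{fex}$$
$$0 = \gamma\,dm + m\,d\gamma + m_{fex}\gamma_{fex}$$
Eliminate $m_{fex}\gamma_{fex}$
$$0 = \gamma v\,dm + m\,d(\gamma v) - (\gamma\,dm + m\,d\gamma)\,v_{fex}$$
Insert relativistic velocity addition
$$0 = \gamma v\,dm + m\,d(\gamma v) - (\gamma\,dm + m\,d\gamma)\left[\frac{v - v_{ex}}{1 - vv_{ex}/c^2}\right]$$
Simplify
$$0 = [\gamma v\,dm + m\,d(\gamma v)](1 - vv_{ex}/c^2) - (\gamma\,dm + m\,d\gamma)(v - v_{ex})$$
Switch variables to rapidity
$$0 = [\sinh\theta\,dm + m\,d(\sinh\theta)][1 - \tanh\theta\,(v_{ex}/c)]\,c - [\cosh\theta\,dm + m\,d(\cosh\theta)](\tanh\theta - v_{ex}/c)\,c$$
Simplify
$$0 = [\sinh\theta\,dm + m\cosh\theta\,d\theta][1 - \tanh\theta\,(v_{ex}/c)] - [\cosh\theta\,dm + m\sinh\theta\,d\theta](\tanh\theta - v_{ex}/c)$$
$$0 = \sinh\theta\,dm + m\cosh\theta\,d\theta - (\sinh\theta\,dm + m\cosh\theta\,d\theta)\tanh\theta\,(v_{ex}/c) - (\cosh\theta\,dm + m\sinh\theta\,d\theta)\tanh\theta + (\cosh\theta\,dm + m\sinh\theta\,d\theta)(v_{ex}/c)$$
$$0 = \sinh\theta\,dm + m\cosh\theta\,d\theta - (v_{ex}/c)\sinh\theta\tanh\theta\,dm - (v_{ex}/c)\,m\sinh\theta\,d\theta - \sinh\theta\,dm - m\sinh\theta\tanh\theta\,d\theta + (v_{ex}/c)\cosh\theta\,dm + (v_{ex}/c)\,m\sinh\theta\,d\theta$$
$$0 = m\cosh\theta\,d\theta - (v_{ex}/c)\sinh\theta\tanh\theta\,dm - m\sinh\theta\tanh\theta\,d\theta + (v_{ex}/c)\cosh\theta\,dm$$
$$0 = (m\cosh\theta - m\sinh\theta\tanh\theta)\,d\theta + (v_{ex}/c)(\cosh\theta - \sinh\theta\tanh\theta)\,dm$$
$$0 = m(\cosh^2\theta - \sinh^2\theta)\,d\theta + (v_{ex}/c)(\cosh^2\theta - \sinh^2\theta)\,dm$$
$$0 = m\,d\theta + (v_{ex}/c)\,dm$$
$$d\theta = -(v_{ex}/c)\,dm/m$$
After integration the rocket equation is obtained:
$$\Delta\theta = (v_{ex}/c)\ln(m_i/m)$$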
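Now consider the ship's proper acceleration $\alpha$ for motion in one direction:
$$\alpha = \gamma^3 a = \cosh^3\theta\,\frac{dv}{dt} = \cosh^3\theta\,\frac{dv}{d\theta}\,\frac{d\theta}{dm}\,\frac{dm}{dt'}\,\frac{dt'}{dt}$$
$$\alpha = \cosh^3\theta\,(c\,\mathrm{sech}^2\theta)\left(-\frac{v_{ex}}{mc}\right)\frac{dm}{dt'}\,\mathrm{sech}\,\theta$$
$$\alpha = -\frac{v_{ex}}{m}\,\frac{dm}{dt'}$$
The sign is negative because $dm/dt' < 0$ while fuel is being burned, so $\alpha$ is positive.
If the proper acceleration is kept constant then integration results in
$$\alpha\Delta t'/c = (v_{ex}/c)\ln(m_i/m) = \Delta\theta$$
Consider initial conditions of $v = 0$ at $t = t' = 0$.
$$\alpha t'/c = \theta$$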
If the rocket starts at rest and is run at a constant proper acceleration $\alpha$,
then the equation can be written
$$v = c\tanh(\alpha t'/c)$$
These initial conditions also result in
$$v = c\tanh[(v_{ex}/c)\ln(m_i/m)]$$
equivalently
$$\beta = \frac{(m_i/m)^{2v_{ex}/c} - 1}{(m_i/m)^{2v_{ex}/c} + 1}$$
Inverting these results in
$$m_i/m = \exp[(c/v_{ex})\tanh^{-1}(v/c)]$$
equivalently
$$m_i/m = [\gamma(1 + \beta)]^{c/v_{ex}}$$
Running it at a constant proper acceleration also results in
$$\beta = \tanh(\alpha t'/c)$$
$$\gamma = \cosh(\alpha ct'/c^2)$$
$$\gamma\beta = \sinh(\alpha ct'/c^2)$$
Using the Lorentz-like transformation equations
$$ct = \int_0^{ct'}\gamma\,d(ct') + \gamma\beta x'$$
$$x = \gamma x' + \int_0^{ct'}\gamma\beta\,d(ct')$$
results in a good global coordinate transformation from the accelerated
ship frame to an inertial frame. Take the inertial frame to be the one in which
the ship starts instantaneously at rest at $t' = 0$, and these become
$$ct = (c^2/\alpha + x')\sinh(\alpha ct'/c^2)$$
$$x = (c^2/\alpha + x')\cosh(\alpha ct'/c^2) - c^2/\alpha$$
$$y = y'$$
$$z = z'$$
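The transformation from the accelerated ship frame to the inertial frame can be evaluated directly. The sketch below is an illustration under assumed values (a proper acceleration of 9.81 m/s^2 sustained for one ship year); the function name and parameters are not from the original text.

```python
import math

C = 299_792_458.0  # vacuum speed of light in m/s

def ship_to_inertial(ct_prime, x_prime, alpha, c=C):
    """Map ship-frame coordinates (ct', x') to inertial coordinates (ct, x)
    for constant proper acceleration alpha, starting at rest at t' = 0."""
    scale = c**2 / alpha + x_prime
    phase = alpha * ct_prime / c**2
    ct = scale * math.sinh(phase)
    x = scale * math.cosh(phase) - c**2 / alpha
    return ct, x

# Assumed example: one year of ship time at roughly one Earth gravity.
alpha = 9.81
year = 365.25 * 24 * 3600.0
ct, x = ship_to_inertial(C * year, 0.0, alpha)

print(ct / (C * year))  # elapsed inertial-frame time, in years
print(x / (C * year))   # distance covered in the inertial frame, in light-years
```

A useful consistency check is that every point at fixed x' traces a hyperbola in the inertial frame, (x + c^2/alpha)^2 - (ct)^2 = (c^2/alpha + x')^2, which follows from cosh^2 - sinh^2 = 1.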
There is a difference between what frequency one observes as being
emitted from a source and what frequency an observer actually sees as coming
from the source. This is true even in nonrelativistic physics. For instance, as a
car drives past you will hear a shift in the tone of the engine as it goes from
coming toward you to going away from you. This is the frequency you hear.
You may use the ordinary Doppler shift formula with the speed it was traveling
to then extrapolate what frequency it really emits according to your coordinate
frame. This is the frequency you observe.
The relativistic Doppler shift formula is really the same thing as the
ordinary Doppler shift formula except that it is usually written in terms of the
source frame's emitted frequency instead of the observed emitted frequency.
It's just that in the nonrelativistic case these are the same. In the relativistic Doppler
shift, one accounts for the fact that, due to time dilation, the frequency you
observe to be emitted is different from the frequency according to the frame of
the object.
If you are at rest with respect to the medium of propagation for a wave,
then the ordinary Doppler shift formula is
$$f = f_0/[1 + (v/c)\cos\theta]$$
In terms of sound we make the following relations.
$f$ is the frequency you hear.
$f_0$ is the frequency you observe to have been emitted according to your frame
at the time of the emission. It is the transverse, or $\theta = \pi/2$, frequency for $f$.
$v$ is the speed the emitter travels with respect to the medium of the waves at
the time the wave was actually emitted.
$c$ is the speed of the waves with respect to the medium.
$\theta$ is the angle at which the emitter was traveling, measured from straight away
from you, at the time the heard frequency was actually emitted.
Relativistic Doppler shift is ordinary Doppler shift. This formula happens
to remain correct for the relativistic Doppler shift of light with the following
adjustments to the relations.
$f$ is the frequency you see.
$f_0$ is the frequency you observe to have been emitted according to your
frame at the time of the emission. It is the transverse, or $\theta = \pi/2$, frequency for $f$.
$v$ is the speed the emitter travels with respect to your frame at the time the light
was actually emitted.
$c$ is the Lorentz invariant vacuum speed of light.
$\theta$ is the angle at which the emitter was traveling, measured from straight away
from you, at the time the seen frequency was actually emitted according to your frame.
The angle is different according to the other frame, and so use of the other frame's
angle changes the form of the equation.
Now, to write it in terms of the frequency emitted according to the frame
of the object, we start by writing the periods according to the two frames in
terms of time dilation.
$$T_0 = T_0'(1 - v^2/c^2)^{-1/2}$$
We then relate frequency to period.
$$f_0 = 1/T_0$$
$$f_0' = 1/T_0'$$
Putting these together results in
$$f_0 = f_0'(1 - v^2/c^2)^{1/2}$$
Inserting this into the Doppler shift formula results in
$$f = f_0'(1 - v^2/c^2)^{1/2}/[1 + (v/c)\cos\theta]$$
The wavelength of the light will be $\lambda = c/f$, or
$$\lambda = \lambda_0'[1 + (v/c)\cos\theta]/(1 - v^2/c^2)^{1/2}$$
Next consider the case that the object travels straight toward the observer,
$\theta = \pi$. Then after algebraic simplification these become
$$f = f_0'[(c + v)/(c - v)]^{1/2}$$
$$\lambda = \lambda_0'[(c - v)/(c + v)]^{1/2}$$
If the object traveled straight away from the observer, $\theta = 0$, these would have
become
$$f = f_0'[(c - v)/(c + v)]^{1/2}$$
$$\lambda = \lambda_0'[(c + v)/(c - v)]^{1/2}$$
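The general formula and its two head-on special cases can be compared numerically. This is a minimal sketch; the function name, the source frequency, and the speed 0.5c are assumptions chosen for illustration.

```python
import math

def doppler_frequency(f_source, beta, theta):
    """Frequency seen by the observer for a source with rest-frame frequency f_source.

    beta  = v/c, the source speed in the observer's frame
    theta = angle of the source's motion measured from straight away from the
            observer, in the observer's frame (0 = receding, pi = approaching)
    """
    return f_source * math.sqrt(1.0 - beta**2) / (1.0 + beta * math.cos(theta))

f0 = 5.0e14   # assumed source frequency in Hz (visible light)
beta = 0.5    # assumed speed as a fraction of c

approaching = doppler_frequency(f0, beta, math.pi)
receding = doppler_frequency(f0, beta, 0.0)
transverse = doppler_frequency(f0, beta, math.pi / 2)

print(approaching / f0)  # blueshift factor sqrt((1 + beta) / (1 - beta))
print(receding / f0)     # redshift factor sqrt((1 - beta) / (1 + beta))
print(transverse / f0)   # pure time-dilation factor sqrt(1 - beta^2)
```

The transverse case shows the part of the shift that has no nonrelativistic counterpart: even with no radial motion, time dilation lowers the frequency.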
STARTING GENERAL RELATIVITY
THE CONCEPTUAL PREMISES FOR GENERAL RELATIVITY
Let's say that there is a space-lab out in the depths of space, sealed up so
that there is no way for its crew to see anything outside of the lab. There are
two experimentalists, Terrance and Stella, inside the space-lab. In this
environment they are weightless and Terrance is still with respect to the ship
walls. Stella is also initially still with respect to the lab walls, but she can
maneuver around without touching the walls because she wears a rocket pack.
They both also carry with them cesium watches that keep time accurate to
within a millionth of a second and a computer that can read off such small
time differences in their displays.
They then do the following experiment. They synchronize their watches
to start and they start at the same location within the space- lab. Terrance stays
there and Stella travels away and back to him along any number of paths so
long as she arrives back when his watch says an hour has gone by to within its
millionth of a second accuracy. The watches' times are then compared and a
path is sought for which as much time as possible goes by on Stella's watch.
Finally they experimentally discover what we knew from special relativity,
which is that the path that maximized her watch's time was simply the one where
she stayed put, weightless next to Terrance, and didn't go anywhere else. On every
other path she took, she underwent special relativistic time dilation while in
motion with respect to Terrance.
Next we shift perspectives to a third party, Lois, who is for the moment
moving in a state of constant velocity through the ship. According to Lois the
path that Stella followed that maximized Stella's time between the events of
the experiment's start and stop next to Terrance was a path of constant velocity.
So we see that in special relativity the paths things tend to take which are
paths of constant velocity are also the paths that maximize proper time intervals
between events along the path.
Next they do another experiment. Lois releases two balls of different mass.
They are both unacted on by forces in the ship so they just keep their same
motion of constant velocity right along with Lois without deviating away from
each-other.
Next we go to a fourth observer, Clark Kent, who is far out in the depths
of space, but can see through the walls of the space-lab into the experiments.
He also sees that their space-lab is falling toward a planet which they didn't
realise because they were in free fall and couldn't see outside their lab.
According to Clark the path of maximal proper time that Stella took between
the events of the beginning and the ending of her experiment was not a path
of a constant velocity state at all, but was the path of a body accelerating in
the presence of a gravitational field. So we note that the paths that things tend
to follow in gravitational fields are still paths of maximal proper time even
though they are not paths of a constant velocity state.
He also notices that the balls of the experiment, though they have different
masses, accelerate at the same rate.
Through this thought experiment we have discovered the core essence of
general relativity.
The equivalence principle comes in different strengths.
The weak version of the equivalence principle boils down to the
equivalence of gravitational and inertial mass. "Gravitational mass" and
"inertial mass" are Newtonian concepts referring to variables that enter into
the equations of Newtonian physics. In Newtonian physics the gravitational force $f_r$ from
a point active gravitational mass $M_1$, acting on a point passive gravitational
mass $M_2$ at a distance $r$, comes from
$$f_r = -GM_1M_2/r^2$$
In Newtonian physics we also write the relation between the $f_r$ acting on
an inertial mass $M_{2i}$ and $a_r$ as
$$f_r = M_{2i}a_r$$
Putting these together we have
$$a_r = (-GM_1/r^2)(M_2/M_{2i})$$
We noted that the balls of different masses fell at the same rate of acceleration
according to Clark. In order for this acceleration to be independent of the ball's
mass, as Clark saw that it was, the ratio $M_2/M_{2i}$ must be the same for every body;
with the correct choice for the value of $G$, the gravitational mass $M_2$ is then
equal to the inertial mass $M_{2i}$. Then we have
$$a_r = -GM_1/r^2$$
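As a quick numerical illustration of the Newtonian argument, the sketch below computes the radial acceleration of test bodies of different mass when gravitational and inertial mass are taken to be equal. The Earth values and function name are assumptions used only to give concrete numbers.

```python
G = 6.674e-11  # Newtonian gravitational constant in m^3 kg^-1 s^-2

def radial_acceleration(m_source, r, m_grav, m_inertial):
    """a_r = f_r / M_2i with f_r = -G * M_1 * M_2 / r^2."""
    force = -G * m_source * m_grav / r**2
    return force / m_inertial

# Assumed illustrative values: the Earth as the source, bodies at its surface.
M_EARTH = 5.972e24  # kg
R_EARTH = 6.371e6   # m

for mass in (0.1, 1.0, 100.0):
    # Gravitational and inertial mass are set equal, as the weak
    # equivalence principle requires.
    print(mass, radial_acceleration(M_EARTH, R_EARTH, mass, mass))
```

Every body reports the same acceleration because the ratio of gravitational to inertial mass cancels; passing unequal values for the last two arguments shows how the result would otherwise depend on the body.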
In general relativity we will have an invariant definition of mass. There
will also be a four-vector force equation for general relativity in the form
$$F^\lambda = mA^\lambda$$
where $m$ is the invariant mass of general relativity.
Gravitation acting alone corresponds to $F^\lambda = 0$. This yields:
$$mA^\lambda = 0$$
The acceleration four-vector for general relativity is a combination of
two parts, resulting in
$$m\,dU^\lambda/d\tau + m\,\Gamma^\lambda_{\mu\nu}U^\mu U^\nu = 0$$
The $m$ in the term on the left corresponds to the "inertial mass" in
Newtonian physics. The $m$ in the term on the right corresponds to the passive
"gravitational mass" in Newtonian physics. As these are really the same quantity,
simply multiplied through the equation, it is clear that the inertial and
gravitational masses are identically equivalent.
The semi-strong level of the equivalence principle comes from the
realization that the crew never knew that they were actually falling in a
gravitational field. The experiments of a local free fall frame have results
indistinguishable from the same experiments done in inertial frames. This is
an equivalence of inertial and local free fall frames.
We could also extend this to the realization that if the lab had rocket
engines burning, keeping them at a constant proper acceleration, they wouldn't
have known the difference between being accelerated by the rocket engines
or sitting on the surface of a planet in the presence of a gravitational field.
The strong level of the equivalence principle comes from the realization
that any local free fall frames are equivalent for doing the physics. The laws
of physics were the same for Lois as they were for Terrance. When the
equivalence principle is mentioned unqualified it is usually this level of
equivalence that is being referred to.
Above this strength we find the level of equivalence that is really required
to result in the form of general relativity that we have today. This is sometimes
called the general principle of relativity and sometimes the general principle
of covariance. That is simply the statement that the general laws of physics
are frame covariant. In other words, the form that the equations for the laws of
physics take is the same according to every frame, whether accelerated or
not, whether in the presence of a gravitational field or not, whether rotating or
not. To ensure this we must model the general laws of physics with tensor
equations. The equations for the general laws of physics are then unchanged
in form by transformations.
TENSORS IN GENERAL RELATIVITY
What defines a vector in any physics is its vector transformation properties.
Not everything that merely has a magnitude and a direction is a vector, even
in non-relativistic physics. For instance angular displacement is not really a
vector because it doesn't always obey the vector property
A + B = B + A.
The vectors of relativity obey tensor transformation properties. In general,
a four-vector is a rank one tensor. In element notation it has only one index,
so it is a tensor with only four elements.
Some of the things we like to think of as individual properties of nature
are incomplete as physical properties being only a component of a tensor. For
instance, the electric field by itself does not obey tensor transformation
properties. The magnetic field by itself also does not obey tensor transformation
properties. In the context of this text a pseudovector will be anything that has
multiple elements like a vector, but lacks any of the tensor transformation
properties. These two pseudo-vectors can be combined into a unified field called
the electromagnetic field tensor. Thus we see that the electric and magnetic
fields are actually incomplete parts of the actual unified field called the
electromagnetic field. This is a rank two tensor.
In the same sense, momentum by itself is not a complete physical quantity
as it does not obey tensor transformation properties and so it is not really a
vector in the relativistic sense. But, when we combine it with a fourth element,
energy, we get a tensor called the momentum four-vector.
Likewise there are displacement four-vectors, velocity four-vectors,
acceleration four-vectors, force four-vectors, etc...
According to a general principle of relativity the laws of physics are frame
covariant. Therefore when modeling the general laws of physics with equations
we must use expressions that are also frame covariant. For instance, if we use
one coordinate system to write an equation like
$$F(ct, x, y, z) - G(ct, x, y, z) = 0,$$
then in any other coordinate system it should also be
$$F'(ct', x', y', z') - G'(ct', x', y', z') = 0$$
It should not change its basic form. For example, it should not become
$$F'(ct', x', y', z') - G'(ct', x', y', z') = H'(ct', x', y', z')$$
If such an equation does transform like this then it is not one of the
fundamental equations of physics.
Here we will define a tensor in terms of its transformation properties. A
contravariant tensor will be any quantity that transforms between frames
according to
$$T'^\mu = (\partial x'^\mu/\partial x^\nu)\,T^\nu$$
A covariant tensor will be any quantity that transforms between frames
according to
$$T'_\mu = (\partial x^\nu/\partial x'^\mu)\,T_\nu$$
There are also mixed tensors. For example
$$T'^\mu{}_\nu = (\partial x'^\mu/\partial x^\sigma)(\partial x^\rho/\partial x'^\nu)\,T^\sigma{}_\rho$$
From these transformation properties we can deduce that for an individual
particle,
A sum or difference of tensors is still a tensor.
A product of tensors is still a tensor.
A tensor multiplied or divided by an invariant is still a tensor.
Note: These rules apply only when the tensors involved describe that
which is observed, not the state of the observer himself. So for example let
$F^{\mu\nu}$ be a tensor describing something observed, say the electromagnetic
field, and let $U_\nu$ be the four-vector velocity of the observer, $(c,0,0,0)$. It turns out
that the electric field given by
$$E^\mu = F^{\mu 0} = F^{\mu\nu}U_\nu/c$$
is NOT a tensor. As $U_\nu$ is the four-vector velocity of whoever is the observer,
everyone uses $(c,0,0,0)$, and as a result the expression does not transform as a
four-vector: $E'^\mu = F'^{\mu 0} \neq (\partial x'^\mu/\partial x^\lambda)F^{\lambda 0}$. If $U_\nu$ were the four-vector velocity of
one "particular" observer then the expression would transform as a tensor,
but then it wouldn't represent the electric field to anyone except that observer,
and even then only when $F^{\mu\nu}$ is the electromagnetic field already expressed
according to his own frame. Likewise the magnetic field
$$B^\mu = -(1/2)\varepsilon^{\mu 0\lambda\rho}F_{\lambda\rho}/c = -(1/2)\varepsilon^{\mu\nu\lambda\rho}F_{\lambda\rho}U_\nu/c^2$$
where $U_\nu$ is the four-velocity of the observer $(c,0,0,0)$, is also not a tensor.
In relativity we must write the fundamental equations of physics as tensor
equations such as
$$T^{\mu\ldots}{}_{\lambda\nu\rho\ldots} = 0$$
because this remains frame covariant. For instance, using the above
transformation properties, it is easy to show that in any other frame this equation
remains in the same form
$$T'^{\mu\ldots}{}_{\lambda\nu\rho\ldots} = 0$$
THE METRIC AND INVARIANTS OF GENERAL RELATIVITY
The invariant interval can be expressed in the form
$$ds^2 = d(ct)^2 - dx^2 - dy^2 - dz^2$$
Or in a more compact notation it can be written
$$ds^2 = \eta_{\mu\nu}\,dx^\mu dx^\nu$$
If we were to express this in a curvilinear coordinate system it would take
on a form different from the first equation above. For example, do the following
transformation to cylindrical coordinates:
$$x = r\cos\theta$$
$$y = r\sin\theta$$
The invariant interval will then take the form
$$ds^2 = d(ct)^2 - dr^2 - r^2d\theta^2 - dz^2$$
Notice that in curvilinear coordinate systems functions of the coordinates
may appear as coefficients of the differential quantities within the interval,
such as the $-r^2$ that appears in front of the $d\theta^2$ term above. Another possibility is the
appearance of cross terms such as a $d(ct)\,dz$ term. To write this in a more compact
and general form it is expressed
$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$$
When there is matter or fields of any type in the space, it affects the form
that $g_{\mu\nu}$ can take globally. So the popular interpretation for gravitation is simply
that matter gives space-time an intrinsic curvature. In a situation where matter
curves the space-time one cannot globally transform $g_{\mu\nu}$ to $\eta_{\mu\nu}$. However, one
can always do the transformation locally.
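We again express the invariant interval in the form
$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$$
Given that the interval is invariant we know that
$$g_{\mu\nu}\,dx^\mu dx^\nu = g'_{\alpha\beta}\,dx'^\alpha dx'^\beta$$
We also know that $dx'^\beta$ transforms according to the calculus chain rule
$$dx'^\beta = (\partial x'^\beta/\partial x^\nu)\,dx^\nu$$
This results in
$$g_{\mu\nu}\,dx^\mu dx^\nu = (\partial x'^\alpha/\partial x^\mu)(\partial x'^\beta/\partial x^\nu)\,g'_{\alpha\beta}\,dx^\mu dx^\nu$$
And therefore
$$g_{\mu\nu} = (\partial x'^\alpha/\partial x^\mu)(\partial x'^\beta/\partial x^\nu)\,g'_{\alpha\beta}$$
Now this is how a rank 2 covariant tensor transforms. Therefore if $ds^2$ is
to be invariant then $g_{\mu\nu}$ is a rank 2 covariant tensor. This has been given the
name "the metric tensor".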
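The covariant transformation rule for the metric can be applied directly to recover the cylindrical form of the interval from the Cartesian one. In the sketch below the Cartesian frame plays the role of the primed frame with metric diag(1, -1, -1, -1), and the cylindrical coordinates (ct, r, theta, z) are the unprimed ones; the function names and the sample point are assumptions for this example.

```python
import math

# Metric of the Cartesian (primed) frame: diag(1, -1, -1, -1).
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def jacobian(r, theta):
    """J[a][m] = d(Cartesian coordinate a) / d(cylindrical coordinate m),
    with x = r cos(theta), y = r sin(theta) and ct, z unchanged."""
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, math.cos(theta), -r * math.sin(theta), 0.0],
        [0.0, math.sin(theta), r * math.cos(theta), 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def cylindrical_metric(r, theta):
    """g_mn = (dx'^a/dx^m)(dx'^b/dx^n) eta_ab, summed over a and b."""
    J = jacobian(r, theta)
    return [[sum(J[a][m] * J[b][n] * ETA[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)]
            for m in range(4)]

g = cylindrical_metric(2.0, 0.7)  # assumed sample point r = 2, theta = 0.7
for row in g:
    print(row)
```

Up to rounding, the result is diagonal with entries 1, -1, -r^2, -1, matching the coefficients of the cylindrical interval, with no cross terms.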
As we shall cover in the sections on gravitational pseudo forces the metric
tensor is analogous to the gravitational potential for non-relativistic physics.
In non-relativistic physics the gravitational force or other fields are often
describable as the gradient of a potential. The gravitational pseudo forces will
be related to affine connections which contain the metric tensor and its first
order derivatives.
For special relativity we have
Vv = °
We can always transform to a local frame according to which the metric
is $\eta_{\mu\nu}$, so we know so far that for a local frame also
y v= °
Now consider the transformation to be to a local free fall frame so that
the affine connections vanish. In that case we also have
Vv= °
Now transform this result to an arbitrary frame and we also find
(Summation still implied on all four above)
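Next consider the quantity
$$g_{\mu\rho}g^{\rho\nu}$$
as arrived at for any point in spacetime by a transformation to an arbitrary
set of coordinates from a local Cartesian coordinate frame:
$$g_{\mu\rho}g^{\rho\nu} = (\partial x^\alpha/\partial x'^\mu)(\partial x^\beta/\partial x'^\rho)\,\eta_{\alpha\beta}\,(\partial x'^\rho/\partial x^\lambda)(\partial x'^\nu/\partial x^\sigma)\,\eta^{\lambda\sigma}$$
Rearrange terms
$$g_{\mu\rho}g^{\rho\nu} = (\partial x^\alpha/\partial x'^\mu)(\partial x^\beta/\partial x'^\rho)(\partial x'^\rho/\partial x^\lambda)(\partial x'^\nu/\partial x^\sigma)\,\eta_{\alpha\beta}\,\eta^{\lambda\sigma}$$
Yielding
$$g_{\mu\rho}g^{\rho\nu} = (\partial x^\alpha/\partial x'^\mu)\,\delta^\beta{}_\lambda\,(\partial x'^\nu/\partial x^\sigma)\,\eta_{\alpha\beta}\,\eta^{\lambda\sigma}$$
Simplify
$$g_{\mu\rho}g^{\rho\nu} = (\partial x^\alpha/\partial x'^\mu)(\partial x'^\nu/\partial x^\sigma)\,\eta_{\alpha\beta}\,\eta^{\beta\sigma}$$
From the matrix equation for $\eta$ it is easy to verify the next step
$$g_{\mu\rho}g^{\rho\nu} = (\partial x^\alpha/\partial x'^\mu)(\partial x'^\nu/\partial x^\sigma)\,\delta_\alpha{}^\sigma$$
Simplify
$$g_{\mu\rho}g^{\rho\nu} = (\partial x^\alpha/\partial x'^\mu)(\partial x'^\nu/\partial x^\alpha)$$
This yields
$$g_{\mu\rho}g^{\rho\nu} = \delta_\mu{}^\nu$$
Contract this and we have
$$g_{\mu\rho}g^{\rho\mu} = \delta_\mu{}^\mu$$
Which results in
$$g_{\mu\nu}g^{\mu\nu} = 4$$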
The covariant metric tensor also acts as a lowering index operator and
the contravariant metric tensor acts as a raising index operator. For example,
$$T_\mu = g_{\mu\nu}T^\nu$$
and
$$T^\mu = g^{\mu\nu}T_\nu$$
It is easy to verify this property based on how contravariant and covariant
tensors are defined by how they transform. For example consider the following
expression.
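$$(\partial x^\lambda/\partial x'^\mu)(g_{\lambda\nu}T^\nu)$$
Based on how tensors transform this becomes
$$(\partial x^\lambda/\partial x'^\mu)(\partial x'^\alpha/\partial x^\lambda)(\partial x'^\beta/\partial x^\nu)\,g'_{\alpha\beta}\,(\partial x^\nu/\partial x'^\rho)\,T'^\rho = (\partial x^\lambda/\partial x'^\mu)(g_{\lambda\nu}T^\nu)$$
Rearranging:
$$(\partial x^\lambda/\partial x'^\mu)(\partial x'^\alpha/\partial x^\lambda)(\partial x'^\beta/\partial x^\nu)(\partial x^\nu/\partial x'^\rho)\,g'_{\alpha\beta}\,T'^\rho = (\partial x^\lambda/\partial x'^\mu)(g_{\lambda\nu}T^\nu)$$
Recognizing that these result in Kronecker deltas and collecting the priming
it becomes,
$$\delta^\alpha{}_\mu\,\delta^\beta{}_\rho\,(g_{\alpha\beta}T^\rho)' = (\partial x^\lambda/\partial x'^\mu)(g_{\lambda\nu}T^\nu)$$
This simplifies to
$$(g_{\mu\rho}T^\rho)' = (\partial x^\lambda/\partial x'^\mu)(g_{\lambda\nu}T^\nu)$$
But then we recognize that this is how a covariant tensor transforms and
so we name it by calling it,
$$T_\mu = g_{\mu\nu}T^\nu$$
Thus we've verified the lowering index property of the covariant metric
tensor. Verifying the raising index property of the contravariant metric tensor
is easier at this point. Start with the expression,
$$g^{\mu\nu}T_\nu$$
We've named our previous expression $T_\nu$ and so we insert it.
$$g^{\mu\nu}g_{\nu\rho}T^\rho = g^{\mu\nu}T_\nu$$
But we've already verified that $g^{\mu\nu}g_{\nu\rho} = \delta^\mu{}_\rho$ so we have
$$\delta^\mu{}_\rho T^\rho = g^{\mu\nu}T_\nu$$
Which results in
$$T^\mu = g^{\mu\nu}T_\nu$$
This verifies the raising of index property of the contravariant metric
tensor.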
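For the flat metric of special relativity, index lowering and raising reduce to sign flips on the spatial components, which makes the properties above easy to check by hand or in code. The following sketch assumes the signature (+, -, -, -) used in this text; the function names and the sample vector are illustrative.

```python
# Covariant and contravariant Minkowski metric; numerically identical for
# the signature (+, -, -, -).
ETA_LOWER = [[1, 0, 0, 0],
             [0, -1, 0, 0],
             [0, 0, -1, 0],
             [0, 0, 0, -1]]
ETA_UPPER = [row[:] for row in ETA_LOWER]

def lower(vec):
    """T_mu = g_{mu nu} T^nu."""
    return [sum(ETA_LOWER[m][n] * vec[n] for n in range(4)) for m in range(4)]

def raise_index(covec):
    """T^mu = g^{mu nu} T_nu."""
    return [sum(ETA_UPPER[m][n] * covec[n] for n in range(4)) for m in range(4)]

vector = [5.0, 3.0, -2.0, 1.0]   # assumed sample contravariant components
covector = lower(vector)
round_trip = raise_index(covector)

# Full contraction of the metric with its inverse, g_{mu nu} g^{mu nu}.
metric_trace = sum(ETA_LOWER[m][n] * ETA_UPPER[m][n]
                   for m in range(4) for n in range(4))

print(covector)      # spatial components change sign
print(round_trip)    # raising after lowering restores the original vector
print(metric_trace)  # equals the number of spacetime dimensions
```

In a general curved spacetime the two metric arrays would differ and depend on position, but the same contraction pattern applies.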
With the exception of the locations of physical singularities, the spacetime
for the universe in which we live is an everywhere locally Lorentzian
spacetime.
A locally Lorentzian spacetime is a spacetime for which we can locally
transform $g_{\mu\nu}$ to $\eta_{\mu\nu}$ where $\eta_{\mu\nu}$ is given by
$$[\eta_{\mu\nu}] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$
A locally Euclidean spacetime is a spacetime for which we can locally
transform $g_{\mu\nu}$ to a metric whose matrix of components is
$$\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$
In other words all the dimensions of a Euclidean "spacetime" are
spacelike.
Either type of spacetime can have Riemannian Curvature as these are
only locally Euclidean, or Lorentzian.
Note: Sometimes it is said that our Universe is everywhere locally
Euclidean. This basically means that we can do local transformations to arrive
at
$$[\eta_{\mu\nu}] = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$
This is correct, but to prevent confusion it is really more appropriate to
say that our universe is everywhere locally Lorentzian.
Our universe is also described as being a globally Riemannian spacetime.
This means that it globally takes the quadratic form
$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$$
and is the same thing as saying it is everywhere locally Lorentzian.
An invariant as defined for this text is a quantity whose value does not
depend on speed, location with respect to gravitational sources etc... nor upon
whose frame it was calculated from. Invariants are said to be invariant to frame
transformations, or frame invariant.
This does not imply that the value of an invariant must be the same
everywhere (for example invariant "densities") nor that it must be conserved.
In this context an invariant can be thought of as short for invariant scalar, though
there are tensor expressions such as the Kronecker delta tensor whose elements
are all frame invariant.
Some people also think of tensors in general as invariants, as they represent
physical entities and physical entities will not depend in any intrinsic way on
our choice of frame. From this perspective the "elements of a tensor" are
thought of as "projections of the tensor" onto a coordinate dependent template.
The paradigm for this text will instead be that the tensor is the template onto
which the projections have been made.
It is not invariant, but transforms according to the transformation
properties of an infinitesimal displacement vector. Some relativity authors use
the word scalar to be short for invariant scalars or what are just called invariants
in this text. This is popular, but extremely inappropriate. The reason that it is
inappropriate is that if people continue to redefine things without good reason
so that they have a different meaning for whatever theory comes along then
when they are used in general, eventually a student will practically have to
learn a different dialect of the spoken language for every theory encountered.
This is complication beyond reason. Here are a few examples of invariants:
• $c$ The local vacuum speed of light
• $m$ Mass
• $p$ The pressure scalar [$p = (1/3)(T^{\mu\nu}U_\mu U_\nu c^{-2} - g_{\mu\nu}T^{\mu\nu})$, for example
the pressure of a gas]
• $\tau$ The proper time between events along a world line
• $q$ Charge
An example of how one of these invariants might not be conserved would
be to consider the pressure of the gas after a balloon is popped in space. As it
expands the pressure decreases and so it is not conserved.
An example from special relativity of a quantity that is conserved, but
not invariant would be the total energy of a particle E.
An example of a quantity that is both invariant and conserved would be
total charge q.
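Consider the transformation of the full contraction of a tensor $T^\mu$:
$$g'_{\mu\nu}T'^\mu T'^\nu = [(\partial x^\lambda/\partial x'^\mu)(\partial x^\rho/\partial x'^\nu)\,g_{\lambda\rho}]\,[(\partial x'^\mu/\partial x^\alpha)\,T^\alpha]\,[(\partial x'^\nu/\partial x^\beta)\,T^\beta]$$
$$g'_{\mu\nu}T'^\mu T'^\nu = (\partial x^\lambda/\partial x'^\mu)(\partial x'^\mu/\partial x^\alpha)(\partial x^\rho/\partial x'^\nu)(\partial x'^\nu/\partial x^\beta)\,g_{\lambda\rho}\,T^\alpha T^\beta$$
$$g'_{\mu\nu}T'^\mu T'^\nu = \delta^\lambda{}_\alpha\,\delta^\rho{}_\beta\,g_{\lambda\rho}\,T^\alpha T^\beta$$
$$g'_{\mu\nu}T'^\mu T'^\nu = g_{\lambda\rho}\,T^\lambda T^\rho$$
So we note that the full contraction of a tensor is an invariant.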
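The invariance of a full contraction can be illustrated in the simplest setting, a Lorentz boost between two inertial frames, where the metric takes the same diag(1, -1, -1, -1) form in both. The sketch below is an assumption-laden example: the four-vector components and boost speed are arbitrary, and the helper names are made up for illustration.

```python
import math

ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def boost_x(beta):
    """Lorentz transformation matrix for a boost of speed beta*c along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(matrix, vec):
    """T'^mu = (dx'^mu/dx^nu) T^nu for a constant transformation matrix."""
    return [sum(matrix[m][n] * vec[n] for n in range(4)) for m in range(4)]

def contract(vec):
    """Full contraction g_{mu nu} T^mu T^nu with the Minkowski metric."""
    return sum(ETA[m][n] * vec[m] * vec[n] for m in range(4) for n in range(4))

p = [5.0, 3.0, 0.0, 0.0]             # assumed sample four-vector components
p_prime = transform(boost_x(0.6), p)

print(p, contract(p))
print(p_prime, contract(p_prime))
```

The components change from frame to frame while the contraction does not, which is the sense in which quantities such as the squared invariant mass are frame invariant.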
THE AFFINE CONNECTIONS AND THE COVARIANT DERIVATIVE
We want to make equations for the general laws of physics out of tensor
equations. So in developing a differentiation operator for general relativity
we must assure that when it is operated on a tensor it results in something that
is still a tensor.
We find that many of the special relativistic laws of physics are described
by equations involving ordinary differentiation and so this operator must also
reduce to the ordinary differentiation operator in local free fall frames. Consider
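the chain rule for the ordinary differentiation of a tensor.
$$dT^\lambda = (\partial T^\lambda/\partial x^\rho)\,dx^\rho$$
Using the transformation property of a contravariant tensor we find
$$dT^\lambda = \{(\partial/\partial x^\rho)[(\partial x^\lambda/\partial x'^\sigma)\,T'^\sigma]\}\,dx^\rho$$
Using the product rule we come to
$$dT^\lambda = (\partial^2 x^\lambda/\partial x^\rho\,\partial x'^\sigma)\,T'^\sigma dx^\rho + (\partial x^\lambda/\partial x'^\sigma)(\partial T'^\sigma/\partial x^\rho)\,dx^\rho$$
And again from the chain rule we finally have
$$dT^\lambda = (\partial^2 x^\lambda/\partial x^\rho\,\partial x'^\sigma)\,T'^\sigma dx^\rho + (\partial x^\lambda/\partial x'^\sigma)\,dT'^\sigma$$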
Now if on the right hand side we only had the second term then the
differentiation of a tensor would still transform as a tensor, but we have the
extra first term so we know it does not. Thus to find a differentiation operator
which maps tensors to tensors we introduce a second term in the operation.
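The new differential operator is called the covariant derivative operator.
$$DT^\lambda = dT^\lambda + \delta T^\lambda$$
For a contravariant vector the second term necessary to keep $DT^\lambda$ a tensor
is
$$\delta T^\lambda = \Gamma^\lambda_{\mu\nu}\,T^\mu dx^\nu$$
where the affine connection (sometimes called the Christoffel symbol of the
second kind) $\Gamma^\lambda_{\mu\nu}$ is given by
$$\Gamma^\lambda_{\mu\nu} = (1/2)\,g^{\lambda\rho}(g_{\rho\mu,\nu} + g_{\nu\rho,\mu} - g_{\mu\nu,\rho})$$
For covariant four-vectors we can write it in the same form
$$DT_\lambda = dT_\lambda + \delta T_\lambda$$
But here we have
$$\delta T_\lambda = -\Gamma^\rho_{\lambda\nu}\,T_\rho\,dx^\nu$$
In the case of the differentiation of a multiple mixed rank tensor we find
$$DT^{\lambda\ldots}{}_{\kappa\ldots} = dT^{\lambda\ldots}{}_{\kappa\ldots} + \Gamma^\lambda_{\mu\nu}\,T^{\mu\ldots}{}_{\kappa\ldots}\,dx^\nu + \ldots - \Gamma^\rho_{\kappa\nu}\,T^{\lambda\ldots}{}_{\rho\ldots}\,dx^\nu - \ldots$$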
Also it is important to make note that though the affine connection is a
part of a covariant derivative operator, it is not a tensor itself.
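So, for example, the covariant derivative of a tensor $T^\lambda$ with respect to
some invariant parameter such as $\tau$ is
$$DT^\lambda/d\tau = dT^\lambda/d\tau + \Gamma^\lambda_{\mu\nu}\,T^\mu\,(dx^\nu/d\tau)$$
As mentioned, a comma will represent a partial derivative and a semicolon
will represent a partial covariant derivative. So for example
$$T^\lambda{}_{;\rho} = T^\lambda{}_{,\rho} + \Gamma^\lambda_{\mu\nu}\,T^\mu\,(\partial x^\nu/\partial x^\rho)$$
$$T^\lambda{}_{;\rho} = T^\lambda{}_{,\rho} + \Gamma^\lambda_{\mu\rho}\,T^\mu$$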
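The affine connection can be computed numerically from the metric using the standard Christoffel formula, one half of the inverse metric contracted with the combination g_{rho mu, nu} + g_{nu rho, mu} - g_{mu nu, rho}. The sketch below does this for flat spacetime in cylindrical coordinates (ct, r, theta, z), where the metric is diag(1, -1, -r^2, -1). It is an illustration only: derivatives are estimated by central differences, the inverse is taken assuming a diagonal metric, and all function names are assumptions.

```python
def metric(x):
    """Cylindrical-coordinate metric diag(1, -1, -r^2, -1) at x = (ct, r, theta, z)."""
    r = x[1]
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, -1.0, 0.0, 0.0],
            [0.0, 0.0, -r * r, 0.0],
            [0.0, 0.0, 0.0, -1.0]]

def inverse_diagonal(g):
    """Inverse of a diagonal metric (assumption: no off-diagonal terms)."""
    return [[(1.0 / g[i][i] if i == j else 0.0) for j in range(4)]
            for i in range(4)]

def d_metric(x, rho, h=1e-6):
    """Central-difference estimate of the partial derivative g_{mu nu, rho}."""
    x_plus, x_minus = list(x), list(x)
    x_plus[rho] += h
    x_minus[rho] -= h
    g_plus, g_minus = metric(x_plus), metric(x_minus)
    return [[(g_plus[m][n] - g_minus[m][n]) / (2.0 * h) for n in range(4)]
            for m in range(4)]

def christoffel(x):
    """Gamma[lam][mu][nu] = (1/2) g^{lam rho} (g_{rho mu,nu} + g_{nu rho,mu} - g_{mu nu,rho})."""
    g_inv = inverse_diagonal(metric(x))
    dg = [d_metric(x, k) for k in range(4)]  # dg[k][m][n] = g_{mn,k}
    return [[[0.5 * sum(g_inv[lam][rho] *
                        (dg[nu][rho][mu] + dg[mu][nu][rho] - dg[rho][mu][nu])
                        for rho in range(4))
              for nu in range(4)]
             for mu in range(4)]
            for lam in range(4)]

gamma = christoffel([0.0, 2.0, 0.3, 0.0])  # assumed sample point with r = 2
print(gamma[1][2][2])  # Gamma^r_{theta theta}, analytically -r
print(gamma[2][1][2])  # Gamma^theta_{r theta}, analytically 1/r
```

The connection is nonzero here even though the spacetime is flat, which reflects the point made above that the affine connection is not a tensor: it can vanish in one coordinate system and not in another.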
Chapter 3
Space, Time, and Newtonian Physics
The fundamental principle of relativity is the constancy of a quantity called
c, which is the speed of light in a vacuum:
$c = 2.998 \times 10^8$ m/s, or roughly $3 \times 10^8$ m/s.
This is fast enough to go around the earth along the equator 7 times
each second. This speed is the same as measured by "everybody." We'll
talk much more about just who "everybody" is. But, yes, this principle
does mean that, if your friend is flying by at 99% of the speed of light,
then when you turn on a flashlight both of the following are true:
The beam advances away from you at $3 \times 10^8$ m/s.
Your friend finds that the light beam catches up to her, at
$3 \times 10^8$ m/s.
'Newtonian' physics is the stuff embodied in the work of Isaac Newton.
Now, there were a lot of developments in the 200 years between Newton and
Einstein, but an important conceptual framework remained unchanged. It is
this framework that we will refer to as Newtonian Physics and, in this sense,
the term can be applied to all physics up until the development of Relativity
by Einstein. Reviewing this framework will also give us an opportunity to
discuss how people came to believe in such a strange thing as the constancy
of the speed of light and why you should believe it too.
Many people feel that Newtonian Physics is just a precise formulation of
their intuitive understanding of physics based on their life experiences. But
still, the basic rules of Newtonian physics ought to 'make sense' in the sense
of meshing.
COORDINATE SYSTEMS
We're going to be concerned with things like speed (e.g., speed of light),
distance, and time. As a result, coordinate systems will be very important.
How many of you have worked with coordinate systems?
Let me remind you that a coordinate system is a way of labeling points;
say, on a line. You need:
A zero
A positive direction
A scale of distance
We're going to stick with one-dimensional motion most of the time. Of
course, space is 3-dimensional, but 1 dimension is easier to draw and captures
some of the most important properties.
In this course, we're interested in space and time:
Put these together to get a 'spacetime diagram'
Note: For this class, t increases upward and x increases to the right.
This is the standard convention in relativity and we adopt it so that this
course is compatible with all books.
Also note:
The x-axis is the line t = 0.
The t-axis is the line x = 0.
REFERENCE FRAMES
A particular case of interest is when we choose the line x = 0 to be the
position of some object: e.g., let x = 0 be the position of a piece of chalk.
In this case, the coordinate system is called a 'Reference Frame'; i.e., the
reference frame of the chalk is the (collection of) coordinate systems where
the chalk lies at x = 0. (All measurements are 'relative to' the chalk.)
We can talk about the chalk's reference frame whether it is "at rest,"
moving at constant velocity, or wiggling back and forth in a chaotic way. In
all of these cases we draw the x = 0 line as a straight line in the object's own frame
of reference.
Also: The reference frame of a clock has t = 0 whenever the clock reads
zero. (If we talk about the reference frame of an object like a piece of chalk,
which is not a clock, we will be sloppy about when t = 0.)
Note: A physicist's clock is really a sort of stopwatch. It reads t = 0 at
some time and afterwards the reading increases all the time so that it moves
toward +∞.
Before t = 0 it reads some negative time, and the distant past is −∞. A
physicist's clock does not cycle from 1 to 12.
Unfortunately, we're going to need a bit more terminology. Here are a
couple of key definitions:
• Your Worldline: The line representing you on the spacetime diagram.
In your reference frame, this is the line x = 0.
• Event: A point of spacetime; i.e., something with a definite position
and time. Something drawn as a dot on a spacetime diagram.
Examples: a firecracker going off, a door slamming, you leaving a
house.
That definition was a simple thing. Now let's think deeply about it. Given
an event (say, the opening of a door), how do we know where to draw it on a
spacetime diagram?
Suppose it happens in our 1-D world.
How can we find out what time it really happens? One way is to
give someone a clock and somehow arrange for them to be present at
the event. They can tell you at what time it happened.
How can we find out where (at what position) it really happens? We
could hold out a meter stick (or imagine holding one out). Our friend at
the event in question can then read off how far away she is.
Fig. The door (an event).
Note that what we have done here is to really define what we mean by
the position and time of an event.
This type of definition, where we define something by telling how to
measure it (or by stating what a thing does) is called an operational
definition. They are very common in physics.
Now, the speed of light thing is really weird. So, we want to be very
careful in our thinking. You see, something is going to go terribly wrong,
and we want to be able to see exactly where it is.
Let's take a moment to think deeply about this and to act like
mathematicians. When mathematicians define a quantity they always stop
and ask two questions:
• Does this quantity actually exist? (Can we perform the above
operations and find the position and time of an event?)
• Is this quantity unique or, equivalently, is the quantity "well-defined?"
(Might there be some ambiguity in our definition? Is there a possibility
that two people applying the above definitions could come up with two
different positions or two different times?)
Well, it seems pretty clear that we can in fact perform these
measurements, so the quantities exist. This is one reason why physicists
like operational definitions so much.
Now, how well-defined are our definitions for position and time?
One thing you might worry about is that clocks and measuring rods
are not completely accurate.
Maybe there was some error that caused one of them to give the wrong reading. We
will not concern ourselves with this problem. We will assume that there is
some real notion of the time experienced by a clock and some real notion of
the length of a rod. Furthermore, we will assume that we have at hand 'ideal'
clocks and measuring rods which measure these accurately without mistakes.
Our real clocks and rods are to be viewed as approximations to ideal clocks
and rods.
Let's take the question of measuring the time. Can we give our friend
just any old ideal clock? No, it is very important that her clock be
synchronized with our clock so that the two clocks agree.
And what about the measurement of position? Well, let's take an
example. Suppose that our friend waits five minutes after the event and
then reads the position off of the meter stick. What if, for example, she is
moving relative to us so that the distance between us is changing?
So, perhaps a better definition would be:
Time: If our friend has a clock synchronized with ours and is present
at an event, then the time of that event in our reference frame is the reading
of her clock at that event.
Position: Suppose that we have a measuring rod and that, at the time
that some event occurs, we are located at zero. Then if our friend is present
at that event, the value she reads from the measuring rod at the time the
event occurs is the location of the event in our reference frame.
But, how can we be sure that they are well-defined? There are no
certain statements without rigorous mathematical proof. So, since we have
agreed to think deeply about simple things (and to check all of the
subtleties), let us try to prove these statements.
NEWTONIAN ASSUMPTIONS ABOUT SPACE AND TIME
Of course, there is also no such thing as a proof from nothing. This is the
usual vicious cycle. Certainty requires a rigorous proof, but proofs proceed
only from axioms (a.k.a. postulates or assumptions). So, where do we begin?
We could simply assume that the above definitions are well-defined, taking
these as our axioms. However, it is useful to take even more basic statements
as the fundamental assumptions and then prove that position and time in the
above sense are well-defined. We take the fundamental Newtonian
Assumptions about space and time to be:
T: All (ideal) clocks measure the same time interval between any
two events through which they pass.
S: Given any two events at the same time, all (ideal) measuring
rods measure the same distance between those events.
What do we mean by the phrase 'at the same time' used in (S)? This,
after all, requires another definition, and we must also check that this
concept is well-defined. The point is that the same clock will not be present at
two different events which occur at the same time. So, we must allow ourselves
to define two events as occurring at the same time if any two synchronized
clocks pass through these events and, when they do so, the two clocks read
the same value. To show that this is well-defined, we must prove that the
definition of whether event A occurs 'at the same time' as event B does not
depend on exactly which clocks (or which of our friends) pass through events.
Corollary to T: The time of an event (in some reference frame) is
well-defined. Proof: A reference frame is defined by some one clock $\alpha$. The
time of event A in that reference frame is defined as the reading at A on
any clock $\beta$ which passes through A and which has been synchronized
with $\alpha$. Let us assume that these clocks were synchronized by bringing $\beta$
together with $\alpha$ at event B and setting $\beta$ to agree with $\alpha$ there. We now
want to suppose that we have some other clock ($\gamma$) which was synchronized
with $\alpha$ at some other event C. We also want to suppose that $\gamma$ is present at
A. The question is, do $\beta$ and $\gamma$ read the same time at event A?
Yes, they will. The point is that clock $\alpha$ might actually pass through A as
well, as shown below.
Now, by assumption T we know that $\alpha$ and $\beta$ will agree at event A.
Similarly, $\alpha$ and $\gamma$ will agree at event A. Thus, $\beta$ and $\gamma$ must also agree at
event A. Finally, a proof! We are beginning to make progress! Since the time
of any event is well-defined, the difference between the times of any two events
is well-defined. Thus, the statement that two events are 'at the same time in a
given reference frame' is well-defined.
But, might two events be at the same time in one reference frame but not
in other frames?
Second Corollary to T: Any two reference frames measure the same
time interval between a given pair of events.
Proof: A reference frame is defined by a set of synchronized clocks.
From the first corollary, the time of an event defined with respect to a
synchronized set of clocks is well-defined no matter how many clocks are
in that synchronized set.
Thus, we are free to add more clocks to a synchronized set as we like.
This will not change the times measured by that synchronized set in any way,
but will help us to construct our proof.
So, consider any two events $E_1$ and $E_2$, and let X and Y be the
synchronized sets of clocks that define the two reference frames. Let us pick
two clocks $\beta_X$ and $\gamma_X$ from set X that pass through these two events. Let us
now pick two clocks $\beta_Y$ and $\gamma_Y$ from set Y that follow the same worldlines
as $\beta_X$ and $\gamma_X$.
If such clocks are not already in set Y then we can add them in. Now, $\beta_X$
and $\gamma_X$ were synchronized with some original clock $\alpha_X$ from set X at some
events B and C.
Let us also consider some clock $\alpha_Y$ from set Y having the same
worldline as $\alpha_X$. We have the following spacetime diagram:
Note that, by assumption T, clocks $\alpha_X$ and $\alpha_Y$ measure the same time
interval between B and C. Thus, sets X and Y measure the same time interval
between B and C.
Similarly, sets X and Y measure the same time intervals between B and
$E_1$ and between C and $E_2$. Let $T_X(A, B)$ be the time difference between any
two events A and B as determined by set X, and similarly for $T_Y(A, B)$. Now
since we have both
$$T_X(E_1, E_2) = T_X(E_1, B) + T_X(B, C) + T_X(C, E_2)$$
and
$$T_Y(E_1, E_2) = T_Y(E_1, B) + T_Y(B, C) + T_Y(C, E_2),$$
and since we have just said that
all of the entries on the right hand side are the same for both X and Y, it
follows that $T_X(E_1, E_2) = T_Y(E_1, E_2)$. In contrast, note that (S) basically states
directly that position is well-defined.
NEWTONIAN ADDITION OF VELOCITIES?
Let's go back and look at this speed of light business. Remember the
99%c example? Why was it confusing?
Let $V_{BA}$ be the velocity of B as measured by A (i.e., "in A's frame of
reference").
Similarly, $V_{CB}$ is the velocity of C as measured by B and $V_{CA}$ is the
velocity of C as measured by A. What relationship would you guess
between $V_{BA}$, $V_{CB}$ and $V_{CA}$?
Most likely, your guess was:
$$V_{CA} = V_{CB} + V_{BA},$$
and this was the reason that the 99%c example didn't make sense to you.
But do you know that this is the correct relationship?
The answer (still leaving the speed of light example clear as coal tar)
is that it follows from assumptions S and T. Proof: Let A, B, C be clocks. For
simplicity, suppose that all velocities are constant and that all three clocks
pass through some one event and that they are synchronized there. The
more general case where this does not occur will be one of your homework
problems, so watch carefully!! Without Loss of Generality (WLOG) we
can take this event to occur at t = 0.
The diagram below is drawn in the reference frame of A:
At time t, the separation between A and C is $V_{CA} t$, but we see from
the diagram that it is also $V_{CB} t + V_{BA} t$. Canceling the t's, we have
$$V_{CA} = V_{CB} + V_{BA}.$$
Now, our instructions about how to draw the diagram (from the facts
that our ideas about time and position are well-defined) came from
assumptions T and S, so the Newtonian formula for the addition of velocities
is a logical consequence of T and S. If this formula does not hold, then at least
one of T and S must be false. It is a good idea to start thinking now, based on
the observations we have just made, about how completely any such evidence
will make us restructure our notions of reality.
Q: Where have we used T?
A: In considering events at the same time (i.e., at time t on the
diagram above).
Q: Where have we used S?
A: In implicitly assuming that $d_{BC}$ is the same as measured by anyone
(A, B, or C).
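As a quick numerical illustration of the Newtonian formula, here is a minimal Python sketch. The function name and the sample speeds are assumptions chosen for illustration; they are not taken from the text. It adds two velocities the Newtonian way and checks that the separations on the diagram add up at a sample time t.

```python
# Minimal sketch of Newtonian velocity addition, V_CA = V_CB + V_BA.
# The function name and the sample numbers are illustrative assumptions.

C_LIGHT = 299_792_458.0  # speed of light in m/s, used only to express speeds as fractions of c


def newtonian_velocity_addition(v_cb: float, v_ba: float) -> float:
    """Velocity of C as measured by A, assuming S and T hold."""
    return v_cb + v_ba


# Hypothetical example: B moves at 0.99c relative to A, and C at 0.99c relative to B.
v_ba = 0.99 * C_LIGHT
v_cb = 0.99 * C_LIGHT
v_ca = newtonian_velocity_addition(v_cb, v_ba)

# At time t the A-C separation equals the B-C separation plus the A-B separation.
t = 2.0  # seconds
separation_ac = v_ca * t
separation_via_b = v_cb * t + v_ba * t

print(f"V_CA = {v_ca / C_LIGHT:.2f} c")
print(f"separation check: {separation_ac:.6e} m vs {separation_via_b:.6e} m")
```

Under assumptions S and T the combined speed comes out larger than c, which is exactly why an example involving speeds near c is so puzzling.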
NEWTON'S LAWS: ARE ALL REFERENCE FRAMES EQUAL?
The above analysis was true for all reference frames. It made no
difference how the clock that defines the reference frame was moving.
However, one of the discoveries of Newtonian Physics was that not
all reference frames are in fact equivalent. There is a special set of reference
frames that are called Inertial Frames. This concept will be extremely
important for us throughout the course.
Here's the idea: Before Einstein, physicists believed that the behaviour
of almost everything (baseballs, ice skaters, rockets, planets, gyroscopes,
bridges, arms, legs, cells,...) was governed by three rules called 'Newton's
Laws of Motion.' The basic point was to relate the motion of objects to
the 'forces' that act on that object. These laws picked out certain reference
frames as special.
The first law has to do with what happens when there are no forces.
Consider someone in the middle of a perfectly smooth, slippery ice rink.
An isolated object in the middle of a slippery ice rink experiences zero
force in the horizontal direction. Now, what will happen to such a person?
What if they are moving?
Newton's first law of motion: There exists a class of reference frames
(called inertial frames) in which an object moves in a straight line at
constant speed (at time t) if and only if zero (net) force acts on that object
at time t.
Note: When physicists speak about velocity this includes both the
speed and the direction of motion. So, we can restate this as: There exists
a class of reference frames (called inertial frames) in which the velocity of
any object is constant (at time t) if and only if zero net force acts on that
object at time t.
This is really an operational definition for an inertial frame. Any frame
in which the above is true is called inertial.
The qualifier 'net' (in 'net force' above) means that there might be two
or more forces acting on the object, but that they all counteract each other and
cancel out. An object experiencing zero net force behaves identically to one
experiencing no forces at all.
We can restate Newton's first law as: Object A moves at constant
velocity in an inertial frame $\Leftrightarrow$ Object A experiences zero net force.
Here the symbol ($\Leftrightarrow$) means 'is equivalent to the statement that.' Trust
me, it is good to encapsulate this awkward statement in a single symbol.
AN OBJECT IS IN AN INERTIAL FRAME
Newton's first Law: There exists a class of reference frames (called
inertial frames). If object A's frame is inertial, then object A will measure
object B to have constant velocity (at time t) if and only if zero force acts
on object B at time t.
To tell if you are in an inertial frame, think about watching a distant
(very distant) rock floating in empty space. It seems like a safe bet that
such a rock has zero force acting on it.
Examples: Which of these reference frames are inertial? An
accelerating car? The earth? The moon? The sun? Note that some of these
are 'more inertial' than others. Probably the most inertial object we can
think of is a rock drifting somewhere far away in empty space.
It will be useful to have a few more results about inertial frames. To
begin, note that an object never moves in its own frame of reference.
Therefore, it moves in a straight line at constant (zero) speed in an inertial
frame of reference (its own). Thus it follows from Newton's first law that,
if an object's own frame of reference is inertial, zero net force acts on that
object. Is the converse true? To find out, consider some inertial reference
frame. Any object A experiencing zero net force has constant velocity $v_A$
in that frame. Let us ask if the reference frame of A is also inertial.
To answer this question, consider another object C experiencing zero
net force (say, our favourite pet rock). In our inertial frame, the velocity
$v_C$ of C is constant. Note that the velocity of C in the reference frame of
A is just $v_C - v_A$, which is constant. Thus, C moves with constant velocity
in the reference frame of A! Since this is true for any object C experiencing
zero net force, A's reference frame is in fact inertial.
We now have: Object A is in an inertial frame $\Leftrightarrow$ Object A experiences
zero net force $\Leftrightarrow$ Object A moves at constant velocity in any other inertial
frame. Note that therefore any two inertial frames differ by a constant
velocity.
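The argument above can be spot-checked numerically. The following is a small Python sketch with made-up velocities and a made-up starting position (all assumptions for illustration). It tracks object C from the frame of object A and confirms that C's velocity in that frame is the same at every sampled time.

```python
# Sketch: an object C with constant velocity in one inertial frame also has
# constant velocity in the frame of A, where A itself moves at constant velocity.
# All names and numbers are illustrative assumptions.

V_A = 3.0    # velocity of A in the original inertial frame (m/s)
V_C = 5.0    # velocity of C in the original inertial frame (m/s)
X_C0 = 10.0  # position of C at t = 0 (m); A starts at x = 0


def position_of_c_in_frame_of_a(t: float) -> float:
    """Position of C relative to A at time t (Newtonian: same t in both frames)."""
    x_a = V_A * t
    x_c = X_C0 + V_C * t
    return x_c - x_a


def velocity_of_c_in_frame_of_a(t: float, dt: float = 1.0) -> float:
    """Average velocity of C relative to A over the interval [t, t + dt]."""
    return (position_of_c_in_frame_of_a(t + dt) - position_of_c_in_frame_of_a(t)) / dt


velocities = [velocity_of_c_in_frame_of_a(t) for t in (0.0, 1.0, 5.0, 20.0)]
print(velocities)
```

Every entry equals the difference of the two velocities, so C's motion looks uniform from A's frame as well.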
NEWTON'S OTHER LAWS
We will now complete our review of Newtonian physics by briefly
discussing Newton's other laws. We'll start with the second and third laws.
The second law deals with what happens to an object that does experience a
net force. Definition of acceleration, a, (of some object in some reference
frame): a = dv/dt, the rate of change of velocity with respect to time. Note
that this includes any change in velocity, such as a change in direction.
In particular, an object that moves in a circle at a constant speed is in fact
accelerating in the language of physics.
Newton's Second Law: In any inertial frame, (net force on an object)
= (mass of object)(acceleration of object); in symbols, F = ma.
The phrase "in any inertial frame" above means that the acceleration
must be measured relative to an inertial frame of reference. By the way,
part of your homework will be to show that calculating the acceleration
of one object in any two inertial frames always yields identical results.
Thus, we may speak about acceleration 'relative to the class of inertial
frames.'
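Here is a numerical spot check of the claim that two inertial frames agree on acceleration. It is only one example with assumed numbers, not the general proof the homework asks for. The sketch estimates the acceleration of one object from its positions in two frames that differ by a constant velocity U.

```python
# Sketch: estimate an object's acceleration from positions in two inertial
# frames that differ by a constant velocity U. Names and numbers are assumptions.

A_TRUE = 4.0  # acceleration of the object in the first frame (m/s^2)
U = 7.0       # velocity of the second frame relative to the first (m/s)


def x_frame1(t: float) -> float:
    """Position in the first inertial frame: starts at rest at x = 0."""
    return 0.5 * A_TRUE * t**2


def x_frame2(t: float) -> float:
    """Position of the same object in a frame moving at constant velocity U."""
    return x_frame1(t) - U * t


def second_difference(x, t: float, dt: float = 1.0) -> float:
    """Finite-difference estimate of d^2x/dt^2 at time t."""
    return (x(t + dt) - 2.0 * x(t) + x(t - dt)) / dt**2


a1 = second_difference(x_frame1, 3.0)
a2 = second_difference(x_frame2, 3.0)
print(a1, a2)
```

The term U t is linear in t, so it drops out of the second difference; that is the mechanism behind the agreement.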
Note: We assume that force and mass are independent of the reference
frame. On the other hand, Newton's third law addresses the relationship
between two forces.
Newton's Third Law: Given two objects (A and B), we have (force
from A on B at some time t) = - (force from B on A at some time t)
This means that the forces have the same size but act in opposite directions.
Now, this is not yet the end of the story. There are also laws that tell us what
the forces actually are. For example, Newton's Law of Universal Gravitation
says: Given any two objects A and B, there is a gravitational force between
them (pulling each toward the other) of magnitude
$$F_{AB} = G \frac{m_A m_B}{d_{AB}^2}$$
with $G = 6.673 \times 10^{-11}\ \mathrm{N\,m^2/kg^2}$.
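To get a feel for the sizes involved, the formula can be evaluated for the earth and the sun. This is a sketch only: the value of G is the one quoted in the text, while the masses and the distance are approximate textbook values supplied here as assumptions.

```python
# Sketch: magnitude of the gravitational force between two objects,
# F_AB = G * m_A * m_B / d_AB**2.
# G is the value quoted in the text; the masses and distance are approximate
# values assumed for illustration.

G = 6.673e-11  # N m^2 / kg^2


def gravitational_force(m_a: float, m_b: float, d_ab: float) -> float:
    """Magnitude (in newtons) of the gravitational pull between A and B."""
    return G * m_a * m_b / d_ab**2


M_EARTH = 5.972e24      # kg (approximate, assumed)
M_SUN = 1.989e30        # kg (approximate, assumed)
D_EARTH_SUN = 1.496e11  # m  (approximate mean distance, assumed)

force = gravitational_force(M_EARTH, M_SUN, D_EARTH_SUN)
print(f"Earth-sun gravitational force: {force:.2e} N")
```

Swapping the two masses gives the same magnitude, in line with Newton's third law.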
Important Observation: These laws hold in any inertial frame. There
is no special inertial frame that is any different from the others. It makes
no sense to talk about one inertial frame being more 'at rest' than any other.
You could never find such a frame, so you could never construct an operational
definition of 'most at rest.' Why then, would anyone bother to assume that a
special 'most at rest frame' exists? As you will see in the reading, Newton
discussed something called 'Absolute space.' However, he didn't need
to and no one really believed in it. We will therefore skip this concept
completely and deal with all inertial frames on an equal footing. The
above observation leads to the following idea, which turns out to be
much more fundamental than Newton's laws.
Principle of Relativity: The Laws of Physics are the same in all
inertial frames. This understanding was an important development. It
ended questions like 'why don't we fall off the earth as it moves around
the sun at 67,000 mph?'
Since the acceleration of the earth around the sun is only 0.006 $\mathrm{m/s^2}$,
the motion is close to inertial. This fact was realized by Galileo,
quite a while before Newton did his work. (Actually, Newton consciously
built on Galileo's observations. As a result, applications of this idea to
Newtonian physics are called 'Galilean Relativity'.)
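The figure of 0.006 m/s^2 can be checked with a short calculation. The sketch below assumes a circular orbit, for which the acceleration is v^2/r, and uses an approximate earth-sun distance; both the circular-orbit formula and the radius are assumptions brought in for illustration, while the speed of 67,000 mph is the one quoted in the text.

```python
# Sketch: centripetal acceleration of the earth, assuming a circular orbit.
# The orbital speed is the 67,000 mph quoted in the text; the orbit radius is
# an approximate value assumed for illustration.

MPH_TO_M_PER_S = 0.44704      # exact conversion factor from miles per hour
ORBITAL_SPEED_MPH = 67_000.0  # from the text
ORBIT_RADIUS_M = 1.496e11     # approximate earth-sun distance (assumed)


def centripetal_acceleration(speed: float, radius: float) -> float:
    """Acceleration (m/s^2) of uniform circular motion: a = v^2 / r."""
    return speed**2 / radius


speed = ORBITAL_SPEED_MPH * MPH_TO_M_PER_S
accel = centripetal_acceleration(speed, ORBIT_RADIUS_M)
print(f"orbital speed: {speed:.0f} m/s")
print(f"acceleration:  {accel:.4f} m/s^2")
```

The result is tiny compared with the 9.8 m/s^2 of everyday falling objects, which is why the earth's frame is so nearly inertial.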
MAXWELL, ELECTROMAGNETISM, AND ETHER
Newtonian physics (essentially, the physics of the 1700's) worked
just fine. And so, all was well and good until a scientist named Maxwell
came along. The hot topics in physics in the 1800's were electricity and
magnetism. Everyone wanted to understand batteries, magnets, lightning,
circuits, sparks, motors, and so forth (eventually to make power plants).
THE BASICS OF E & M
Let me boil all of this down to some simple basics. People had
discovered that there were two particular kinds of forces (Electric and
Magnetic) that acted only on special objects. (This was as opposed to
say, gravity, which acted on all objects.) The special objects were said
to be charged and each kind of charge (Electric or Magnetic) came in
two 'flavors':
Electric: + and -
Magnetic: N and S (north and south)
Like charges repel and opposite charges attract.
There were many interesting discoveries during this period, such as
the fact that 'magnetic charge' is really just electric charge in motion.
As they grew to understand more and more, physicists found it useful
to describe these phenomena not in terms of the forces themselves, but
in terms of things called "fields". Here's the basic idea:
Instead of just saying that X and Y 'repel' or that there is a force
between them, we break this down into steps:
• We say that X 'fills the space around it with an electric field
E.'
• Then, it is this electric field E that produces a force on Y.
(Electric force on Y) = (charge of Y)(Electric field at location of Y)
$$F_{\text{on } Y} = q_Y E$$
Note that changing the sign (±) of the charge changes the sign of
the force. The result is that a positive charge experiences a force in the
direction of the field, while the force on a negative charge is opposite to
the direction of the field.
The arrows indicate the field. Red (positive charge) moves left with
the field. Blue (negative charge) moves right against the field.
Fig. Blue = negative charge, red = positive charge.
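The sign rule for the electric force can be made concrete with a one-dimensional sketch. The field strength and the charges below are made-up values (assumptions for illustration); a negative number means 'pointing left', matching the figure description in which the field points left.

```python
# Sketch: force on a charge in a one-dimensional electric field, F = q * E.
# Positive values point right, negative values point left.
# The field strength and the charges are illustrative assumptions.


def electric_force(charge: float, field: float) -> float:
    """Force (N) on a charge (C) sitting in an electric field (N/C)."""
    return charge * field


FIELD = -3.0   # N/C, pointing left
Q_RED = +2.0   # C, positive charge
Q_BLUE = -2.0  # C, negative charge

force_on_red = electric_force(Q_RED, FIELD)    # same direction as the field
force_on_blue = electric_force(Q_BLUE, FIELD)  # opposite to the field

print(f"red  (positive): {force_on_red:+.1f} N")
print(f"blue (negative): {force_on_blue:+.1f} N")
```

Flipping the sign of the charge flips the sign of the force while leaving its size unchanged.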
Similarly, a magnetic charge fills the space around it with a magnetic
field B that then exerts a force on other magnetic charges.
Now, you may think that fields have only made things more
complicated, but in fact they are a very important concept as they allowed
people to describe phenomena which are not directly related to charges
and forces.
For example, the major discovery behind the creation of electric
generators was Faraday's Law. This Law says that a magnetic field that
changes in time produces an electric field. In a generator, rotating a magnet
causes the magnetic field to be continually changing, generating an electric
field. The electric field then pulls electrons and makes an electric current.
By the way: Consider a magnet in your (inertial) frame of reference.
You, of course, find zero electric field. But, if a friend (also in an inertial
frame) moves by at a constant speed, they see a magnetic field which
'moves' and therefore changes with time.
Thus, Maxwell says that they must see an electric field as well! We see
that a field which is purely magnetic in one inertial frame can have an electric
part in another. But recall: all inertial frames are supposed to yield equally
valid descriptions of the physics.
Conversely, Maxwell discovered that an electric field which changes in
time produces a magnetic field. Maxwell codified both this observation and
Faraday's law in a set of equations known as, well, Maxwell's equations. Thus,
a field that is purely electric in one reference frame will have a magnetic part
in another frame of reference.
It is best not to think of electricity and magnetism as separate
phenomena. Instead, we should think of them as forming a single
"electromagnetic" field which is independent of the reference frame. It is
the process of breaking this field into electric and magnetic parts which depends
on the reference frame.
There is a strong analogy with the following example: The spatial
relationship between the physics building and the Hall of Languages is fixed
and independent of any coordinate system. The relationship is fixed, but the
description differs. For the moment this is just a taste of an idea, but we will
be talking much more about this in the weeks to come. In the case of
electromagnetism, note that this is consistent with the discovery that magnetic
charge is really moving electric charge.
Not only do we find a conceptual unity between electricity and magnetism,
but we also find a dynamical loop. If we make the electric field change with
time in the right way, it produces a magnetic field which changes with time.
This magnetic field then produces an electric field which changes with time,
which produces a magnetic field which changes with time and so on.
Moreover, it turns out that a changing field (electric or magnetic) produces a
field (magnetic or electric) not just where it started, but also in the neighboring
regions of space.
This means that the disturbance spreads out as time passes! This
phenomenon is called an electromagnetic wave. For the moment, we merely
state an important property of electromagnetic waves: they travel with a
precise (finite) speed.
Maxwell's Equations and Electromagnetic Waves
Maxwell's equations lead to electromagnetic waves. The important
point here is just to get the general picture of how Maxwell's equations
determine that electromagnetic waves travel at a constant speed.
One of Maxwell's equations (Faraday's Law) says that a magnetic field (B)
that changes in time produces an electric field (E). I'd like to discuss some
of the mathematical form of this equation. To do so, we have to turn the
ideas of the electric and magnetic fields into some kind of mathematical
objects. Let's suppose that we are interested in a wave that travels in, say,
the x direction.
Then we will be interested in the values of the electric and magnetic
fields at different locations (different values of x) and at different times t.
We will want to describe the electric field as a function of two variables
E(x, t) and similarly for the magnetic field B(x, t).
Now, Faraday's law refers to magnetic fields that change with time.
How fast a magnetic field changes with time is described by the derivative
of the magnetic field with respect to time. For those of you who have
not worked with 'multivariable calculus,' taking a derivative of a function
of two variables like B(x, t) is no harder than taking a derivative of a
function of one variable like y(t). To take a derivative of B(x, t) with
respect to t, all you have to do is to momentarily forget that x is a variable and
treat it like a constant. For example, suppose $B(x, t) = x^2 t + x t^2$. Then the
derivative with respect to t would be just $x^2 + 2xt$. When B is a function of
two variables, the derivative of B with respect to t is written $\partial B / \partial t$.
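The 'treat x like a constant' recipe is easy to test numerically. The sketch below uses the example function from the text; the helper names, the step size, and the sample point are assumptions for illustration. It holds x fixed, nudges t a little in each direction, and compares the resulting slope with the formula $x^2 + 2xt$.

```python
# Sketch: numerical check of a partial derivative with respect to t.
# B(x, t) = x^2 t + x t^2 is the example from the text; the helper names,
# the step size h, and the sample point are illustrative assumptions.


def B(x: float, t: float) -> float:
    """Example field from the text."""
    return x**2 * t + x * t**2


def dB_dt_formula(x: float, t: float) -> float:
    """Derivative with respect to t, treating x as a constant."""
    return x**2 + 2.0 * x * t


def partial_t(f, x: float, t: float, h: float = 1e-6) -> float:
    """Central-difference estimate of the t-derivative with x held fixed."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)


x0, t0 = 3.0, 2.0
numeric = partial_t(B, x0, t0)
exact = dB_dt_formula(x0, t0)
print(f"numeric: {numeric:.6f}, formula: {exact:.6f}")
```

Only t is varied inside `partial_t`; x is passed through unchanged, which is all that 'partial' means here.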
It turns out that Faraday's law does not relate $\partial B / \partial t$ directly to the electric
field. Instead, it relates this quantity to the derivative of the electric field with
respect to x. That is, it relates the time rate of change of the magnetic field to
the way in which the electric field varies from one position to another. In
symbols,
$$\frac{\partial B}{\partial t} = \frac{\partial E}{\partial x}.$$
It turns out that another of Maxwell's equations has a similar form,
which relates the time rate of change of the electric field to the way that
the magnetic field changes across space. Figuring this out was Maxwell's
main contribution to science. This other equation has pretty much the
same form as the one above, but it contains two 'constants of nature':
numbers that had been measured in various experiments. They are called
$\epsilon_0$ and $\mu_0$ ('epsilon zero and mu zero'). The first one, $\epsilon_0$, is related to the
amount of electric field produced by a charge of a given strength when
that charge is in a vacuum.
Similarly, $\mu_0$ is related to the amount of magnetic field produced by a
certain amount of electric current (moving charges) when that current is
in a vacuum. The key point here is that both of the numbers are things
that had been measured in the laboratory long before Maxwell or anybody
else had ever thought of 'electromagnetic waves.'
Their values were $\epsilon_0 = 8.854 \times 10^{-12}\ \mathrm{C^2/(N\,m^2)}$ and $\mu_0 = 4\pi \times 10^{-7}\ \mathrm{N\,s^2/C^2}$.
Anyway, this other Equation of Maxwell's looks like:
$$\epsilon_0 \mu_0 \frac{\partial E}{\partial t} = \frac{\partial B}{\partial x}.$$
Now, to understand how the waves come out of all this, it is useful to
take the derivative (on both sides) of Faraday's law with respect to time. This
yields some second derivatives:
$$\frac{\partial^2 B}{\partial t^2} = \frac{\partial^2 E}{\partial x\,\partial t}.$$
Note that on the right hand side we have taken one derivative with
respect to t and one derivative with respect to x.
Similarly, we can take a derivative of Maxwell's other equation on both sides
with respect to x and get:
$$\frac{\partial^2 B}{\partial x^2} = \epsilon_0 \mu_0 \frac{\partial^2 E}{\partial x\,\partial t}.$$
Here we have used the interesting fact that it does not matter whether we first differentiate
with respect to x or with respect to t: $\frac{\partial}{\partial t}\frac{\partial}{\partial x} E = \frac{\partial}{\partial x}\frac{\partial}{\partial t} E$.
Note that the right hand sides of these two second-derivative equations differ only by a factor
of $\epsilon_0 \mu_0$. So, divide the second one by this factor and then subtract it from the first to get
$$\frac{\partial^2 B}{\partial t^2} - \frac{1}{\epsilon_0 \mu_0}\frac{\partial^2 B}{\partial x^2} = 0.$$
This is the standard form for a so-called 'wave equation.' To
understand why, let's see what happens if we assume that the magnetic
field takes the form
$$B = B_0 \sin(x - vt)$$
for some speed v. Note that this expression has the shape of a sine wave at
any time t. However, this sine wave moves as time passes. For example,
at t = 0 the wave vanishes at x = 0. On the other hand, at time $t = \pi/2v$, at
x = 0 we have $B = -B_0$. A 'trough' that used to be at $x = -\pi/2$ has moved
to x = 0. We can see that this wave travels to the right at constant speed v.
Note: In water and other materials, charges and currents are associated with
somewhat different values of electric and magnetic fields, described by values
of $\epsilon$ and $\mu$ that depend on the materials. This is
due to what are called 'polarization effects' within the material, where the
presence of the charge (say, in water) distorts the equilibrium between the
positive and negative charges that are already present in the water
molecules. This is a fascinating topic (leading to levitating frogs and such)
but is too much of a digression to discuss in detail here.
The subscript 0 on $\epsilon_0$ and $\mu_0$ indicates that they are the vacuum values
or, as physicists of the time put it, the values for 'free space.'
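For the sine-wave trial form of the magnetic field, the individual derivatives can be written out one at a time. This is a worked sketch; $B_0$ is used here to denote the constant amplitude of the wave, and each derivative with respect to one variable treats the other variable as a constant.

```latex
\begin{align*}
B &= B_0 \sin(x - vt), \\
\frac{\partial B}{\partial t} &= -v\,B_0 \cos(x - vt), &
\frac{\partial^2 B}{\partial t^2} &= -v^2 B_0 \sin(x - vt), \\
\frac{\partial B}{\partial x} &= B_0 \cos(x - vt), &
\frac{\partial^2 B}{\partial x^2} &= -B_0 \sin(x - vt).
\end{align*}
```

Each time derivative brings down a factor of $-v$ from the chain rule, while each x derivative brings down a factor of 1, so the two second derivatives differ exactly by the factor $v^2$.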
Taking a few derivatives shows that for B of this form we have
$$\frac{\partial^2 B}{\partial t^2} - \frac{1}{\epsilon_0 \mu_0}\frac{\partial^2 B}{\partial x^2} = \left(\frac{1}{\epsilon_0 \mu_0} - v^2\right) B_0 \sin(x - vt).$$
This will vanish (and therefore solve the wave equation) if (and only if)
$v = \pm 1/\sqrt{\epsilon_0 \mu_0}$. Thus, we see that Maxwell's equations do lead to waves,
and that those waves travel at a certain speed given by $1/\sqrt{\epsilon_0 \mu_0}$. Maxwell
realized this, and was curious how fast this speed actually is. Plugging in
the numbers that had been found by measuring electric and magnetic fields
in the laboratory, he found (as you can check yourself using the numbers
above!) $1/\sqrt{\epsilon_0 \mu_0} = 2.99... \times 10^8$ m/s. Now, the kicker is that, not too long
before Maxwell, people had measured the speed at which light travels, and
found that (in a vacuum) this speed was also $2.99... \times 10^8$ m/s.
Was this a coincidence? Maxwell didn't think so. Instead, he jumped to the quite reasonable
conclusion that light actually was a kind of electromagnetic wave, and
that it consists of a magnetic field of the kind we have just been describing
(together with the accompanying electric field). We can therefore replace
the speed v above with the famous symbol c that we reserve for the speed
of light in a vacuum.
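The check that the text invites ('as you can check yourself') takes only a few lines. The sketch below plugs the quoted laboratory values of the two constants into the wave-speed formula; the variable names are assumptions for illustration.

```python
# Sketch: wave speed predicted by Maxwell's equations, v = 1 / sqrt(eps0 * mu0),
# using the laboratory values quoted in the text. Variable names are assumptions.

import math

EPSILON_0 = 8.854e-12    # C^2 / (N m^2), from the text
MU_0 = 4.0 * math.pi * 1e-7  # N s^2 / C^2, from the text

wave_speed = 1.0 / math.sqrt(EPSILON_0 * MU_0)
print(f"predicted wave speed: {wave_speed:.3e} m/s")
```

The units work out as well: the product of the two constants has units of s^2/m^2, so the inverse square root is a speed in m/s.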
THE ELUSIVE ETHER
The Laws of Physics are the same in all inertial frames. So, the laws of
electromagnetism (Maxwell's equations) ought to hold in any inertial reference
frame, right?
But then light would move at speed c in all reference frames, violating
the law of addition of velocities... And this would imply that T and S are wrong!
How did physicists react to this observation?
They said "Obviously, Maxwell's equations can only hold in a certain
frame of reference."
Consider, for example, Maxwell's equations in water. There, they also
predict a certain speed for the waves as determined by $\epsilon$ and $\mu$ in water (which
are different from the $\epsilon_0$ and $\mu_0$ of the vacuum).
However, here there is an obvious candidate for a particular reference
frame with respect to which this speed should be measured: the reference
frame of the water itself. Moreover, experiments with moving water did
in fact show that $1/\sqrt{\epsilon \mu}$ gave the speed of light through water only when
the water was at rest.
The same thing, by the way, happens with regular surface waves on water
(e.g., ocean waves, ripples on a pond, etc.). There is a wave equation not unlike
the one above which controls the speed of the waves with respect to the water.
So, clearly, c should be just the speed of light 'as measured in the reference
frame of the vacuum.' Note that there is some tension here with the idea we
discussed before that all inertial frames are fundamentally equivalent. If this
is so, one would not expect empty space itself to pick out one as special. To
reconcile this in their minds, physicists decided that 'empty space' should not
really be completely empty.
After all, if it were completely empty, how could it support
electromagnetic waves? So, they imagined that all of space was filled with
a fluid-like substance called the "Luminiferous Ether." Furthermore, they
supposed that electromagnetic waves were nothing other than wiggles of
this fluid itself.
So, the thing to do was to next was to go out and look for the ether.
In particular, they wanted to determine what was the ether's frame of reference.
Was the earth moving through the ether? Was there an 'ether wind' blowing
by the earth or by the sun? Did the earth or sun drag some of the ether with it
as it moved through space?
The experiment that really got people's attention was done by Albert
Michelson and Edward Morley in 1887. They were motivated by issues about
the nature of light and the velocity of light, but especially by a particular
phenomenon called the "aberration" of light. This was an important discovery
in itself, so let us take a moment to understand it.
The Aberration of Light
Here is the idea: Consider a star very far from the earth. Suppose we look
at this star through a telescope. Suppose that the star is "straight ahead" but
the earth is moving sideways. Then, we will not in fact see the star as straight
ahead.
Note that, because of the finite speed of light, if we point a long thin
telescope straight at the star, the light will not make it all the way down the
telescope but will instead hit the side because of the motion of the earth. A bit
of light entering the telescope and moving straight down, will be smacked
into by the rapidly approaching right wall of the telescope, even if it entered
on the far left side of the opening.
The effect is the same as if the telescope was at rest and the light had
been coming in at a slight angle so that the light moved a bit to the right.
The only light that actually makes it to the bottom is light that is moving
at an angle so that it runs away from the oncoming right wall as it moves
down the telescope tube.
If we want light from the star straight in front of us to make it all the
way down, we have to tilt the telescope. In other words, what we do see
through the telescope is not the region of space straight in front of the
telescope opening, but a bit of space slightly to the right.
Fig. Telescope Moves Through Ether: Light Ray Hits Side Instead of Reaching Bottom; Must Tilt Telescope to See Star
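The size of the required tilt can be estimated with a short Python sketch. It assumes the simple picture described above: light moves straight down the tube at speed c while the tube moves sideways at speed v, so the tilt satisfies tan(theta) = v/c. The function name `tilt_angle` and the value used for the earth's orbital speed are illustrative assumptions, not taken from the text.

```python
import math

C = 299_792_458.0      # speed of light, m/s
V_EARTH = 29_780.0     # assumed orbital speed of the earth, m/s (approximate)

def tilt_angle(v, c=C):
    """Tilt in radians that keeps light centered in a tube moving sideways
    at speed v while the light travels down the tube at speed c."""
    return math.atan2(v, c)

theta = tilt_angle(V_EARTH)
arcsec = math.degrees(theta) * 3600.0
print(f"tilt = {arcsec:.1f} arcseconds")
```

With these inputs the tilt is a few tens of arcseconds, which is small but well within reach of careful telescope work. Six months later the sideways motion reverses, so the tilt reverses too.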
This phenomenon had been measured, using the fact that the earth first
moves in one direction around the sun and then, six months later, it moves in
the opposite direction. In fact, someone else (Fizeau) had measured the effect
again using telescopes filled with water.
The light moves more slowly through water than through the air, so this
should change the angle of aberration in a predictable way. While the details
of the results were actually quite confusing, the fact that the effect occurred at
all seemed to verify that the earth did move through the ether and, moreover,
that the earth did not drag very much of the ether along with it.
You might wonder how Fizeau could reach such a conclusion. After all,
as you can see from the diagram below, there is also an effect if the ether is
dragged along by the earth.
In the region far from the earth where the ether is not being dragged, it
still provides a 'current' that affects the path of the light. The point, however,
is that the telescope on the Earth must now point at the place where the light
ray enters the region of ether being dragged by the earth.
Note that this point does not depend on whether the telescope is filled
with air or with water!
So, Fizeau's observation that filling the telescope with water increases
the stellar aberration tells us that the ether is not strongly dragged along by
the earth.
Michelson, Morley, and their Experiment
Because of the confusion surrounding the details of Fizeau's results, it
seemed that the matter deserved further investigation.
Michelson and Morley thought that they might get a handle on things by
measuring the velocity of the ether with respect to the earth in a different
way. Have a look at their original paper to see what they did in their own
words.
Michelson and Morley used a device called an interferometer, which
looks like the picture below.
The idea is that they would shine light (an electromagnetic wave) down
each arm of the interferometer where it would bounce off a mirror at the end
and return.
The two beams are then recombined and viewed by the experimenters.
Both arms are the same length, say L.
Fig. Light Rays Bounce off Both Mirrors
What do the experimenters see? Well, if the earth was at rest in the ether,
the light would take the same amount of time to travel down each arm and
return. Now, when the two beams left they were synchronized ("in phase"),
meaning that wave crests and wave troughs start down each arm at the same
time. Since each beam takes the same time to travel, this means that wave
crests emerge at the same time from each arm and similarly with wave troughs.
Waves add together, with two crests combining to make a
big crest, and two troughs combining to make a big trough. The result is
therefore a bright beam of light emerging from the device. This is what the
experimenters should see.
On the other hand, if the earth is moving through the ether (say, to the
right), then the right mirror runs away from the light beam and it takes the
light longer to go down the right arm than down the top arm. On the way back
though, it takes less time to travel the right arm because of the opposite effect. A
detailed calculation is required to see which effect is greater (and to properly
take into account that the top beam actually moves at an angle as shown below).
Fig. This one takes Less Time.
After doing this calculation one finds that the light beam in the right
arm comes back later than the light beam in the top arm. The two signals would
no longer be in phase, and the light would not be so bright. In fact, if the
difference were great enough that a crest came back in one arm when a trough
came back in the other, then the waves would cancel out completely and they
would see nothing at all! Michelson and Morley planned to use this effect to
measure the speed of the earth with respect to the ether.
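The "detailed calculation" can be sketched in Python under the ether hypothesis. Along the direction of motion the light is taken to move at c - v outbound and c + v on the return; across the motion it follows a diagonal path at speed c. The function name `round_trip_times`, the arm length, the earth speed, and the wavelength are all illustrative assumptions rather than values from the text.

```python
import math

def round_trip_times(L, v, c):
    """Round-trip times predicted by the ether picture for arms of length L.

    t_along:  arm parallel to the motion (relative speeds c - v and c + v).
    t_across: arm perpendicular to the motion (diagonal path at speed c).
    """
    t_along = L / (c - v) + L / (c + v)
    t_across = 2.0 * L / math.sqrt(c**2 - v**2)
    return t_along, t_across

c = 299_792_458.0     # speed of light, m/s
L = 11.0              # assumed arm length, m
v = 30_000.0          # assumed speed of the earth through the ether, m/s

t_along, t_across = round_trip_times(L, v, c)
delay = t_along - t_across           # extra time for the arm along the motion
shift = delay * c / 500e-9           # delay as a fraction of a period of 500 nm light

print(f"delay = {delay:.3e} s, about {shift:.2f} of a wave period")
```

With these formulas the arm along the motion always takes slightly longer, by roughly (L/c)(v/c)^2. The delay is tiny in seconds but a noticeable fraction of a wave period, which is why an interference measurement was expected to reveal it.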
Fig. These waves Cancel out
However, they saw no effect whatsoever! No matter which direction they
pointed their device, the light seemed to take the same time to travel down
each arm. Clearly, they thought, the earth just happens to be moving with the
ether right now (i.e., bad timing).
So, they waited six months until the earth was moving around the sun in
the opposite direction, expecting a relative velocity between earth and ether
equal to twice the speed of the earth around the sun. However, they still found
that the light took the same amount of time to travel down both arms of the
interferometer!
So, what did they conclude? They thought that maybe the ether is dragged
along by the earth... But then, how would we explain the stellar aberration
effects?
Deeply confused, Michelson and Morley decided to gather more data.
Despite stellar aberration, they thought the earth must drag some ether
along with it. After all, as we mentioned, the details of the aberration
experiments were a little weird, so maybe the conclusion that the earth
did not drag the ether was not really justified.
If the earth did drag the ether along, they thought there might be less of
this effect up high, like on a mountain top. So, they repeated their experiment
at the top of a mountain.
Still, they found no effect. There then followed a long search trying to
find the ether, but no luck. Some people were still trying to find an ether
'dragged along very efficiently by everything' in the 1920's and 1930's. They
never had any luck.
EINSTEIN AND INERTIAL FRAMES
THE POSTULATES OF RELATIVITY
In 1905, Albert Einstein tried a different approach. He asked "What if
there is no ether?" What if the speed of light in a vacuum really is the same in
every inertial reference frame? He soon realized, as we have done, that this
means that we must abandon T and S, our Newtonian assumptions about space
and time.
Hopefully, you are sufficiently confused by the Michelson and Morley
and stellar aberration results that you will agree to play along with Einstein
for a while. This is what we want to do in the next few sections. We will explore
the consequences of Einstein's idea.
Surprisingly, one can use this idea to build a consistent picture of
what is going on that explains both the Michelson-Morley and stellar
aberration results.
It turns out that this idea makes a number of other weird and ridiculous-
sounding predictions as well. Perhaps even more surprisingly, these predictions
have actually been confirmed by countless experiments over the last 100 years.
We are about to embark on a very strange path, one that runs counter to
the intuition that we accumulate in our daily lives. We will have to tread
carefully, taking the greatest care with our logical reasoning. Careful logical
reasoning can only proceed from clearly stated assumptions (a.k.a. 'axioms' or
'postulates'). We're throwing out almost everything that we thought we
understood about space and time. So then, what should we keep?
We'll keep the bare minimum consistent with Einstein's idea. We will
take our postulates to be:
I. The laws of physics are the same in every inertial frame.
II. The speed of light in an inertial frame is always $c = 2.99\ldots \times 10^8$ m/s.
We also keep Newton's first law, which is just the definition of an
inertial frame: There exists a class of reference frames (called inertial
frames) in which an object moves in a straight line at constant speed if
and only if zero net force acts on that object.
Finally, we will need a few properties of inertial frames. We therefore
postulate the following familiar statement.
Object A is in an inertial frame <=> Object A experiences zero force <=>
Object A moves at constant velocity in any other inertial frame.
Since we no longer have S and T, we can no longer derive this last
statement. It turns out that this statement does in fact follow from even
more elementary (albeit technical) assumptions that we could introduce
and use to derive it. This is essentially what Einstein did. However, in
practice it is easiest just to assume that the result is true and go from there.
Finally, it will be convenient to introduce a new term:
Definition: An "observer" is a person or apparatus that makes
measurements.
Using this term, assumption II becomes: The speed of light is always
$c = 2.9979 \times 10^8$ m/s as measured by any inertial observer.
By the way, it will be convenient to be a little sloppy in our language and
to say that two observers with zero relative velocity are in the same reference
frame, even if they are separated in space.
TIME AND POSITION, TAKE II
We used the old assumptions T and S to show that our previous notions
of time and position were well-defined. Thus, we can no longer rely even on
the definitions of 'time and position of some event in some reference frame'.
We will need new definitions based on our new postulates.
For the moment, let us stick to inertial reference frames. What tools can
we use? We don't have much to work with.
The only assumption that deals with time or space at all is postulate II,
which sets the speed of light. Thus, we're going to somehow base our definitions
on the speed of light.
We will use the following: To define position in a given inertial frame:
Build a framework of measuring rods and make sure that the zero mark always
stays with the object that defines the (inertial) reference frame. Note that, once
we set it up, this framework will move with the inertial observer without us
having to apply any forces.
The measuring rods will move with the reference frame. An observer
(say, a friend of ours who rides with the framework) at an event can read off the
position (in this reference frame) of the event from the mark on the rod that
passes through that event.
To define time in a given inertial frame: Put an ideal clock at each mark
on the framework of measuring rods above. Keep the clocks there, moving
with the reference frame. The clocks can be synchronized with a pulse of light
emitted, for example, from x = 0 at t = 0. A clock at x knows that, when it receives the
pulse, it should read |x|/c.
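The synchronization rule can be written as a tiny Python sketch. It assumes the pulse is emitted from x = 0 at t = 0 and works in units where c = 1; the function name `synchronized_reading` and the list of clock positions are hypothetical.

```python
C = 1.0  # speed of light in units where c = 1 (light-seconds per second)

def synchronized_reading(x, c=C):
    """Reading that the clock at position x should show at the moment the
    pulse (emitted from x = 0 at t = 0) arrives."""
    return abs(x) / c

# Hypothetical clock positions along the framework, in light-seconds.
positions = [-2.0, -1.0, 0.0, 1.0, 2.0]
initial_readings = {x: synchronized_reading(x) for x in positions}

for x, t in initial_readings.items():
    print(f"clock at x = {x:+.1f} Ls is set to t = {t:.1f} s on receiving the pulse")
```

Clocks at equal distances on either side of the origin are set to the same reading, which is exactly what postulate II licenses: the pulse travels at the same speed in both directions.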
These notions are manifestly well-defined. We do not need to make the
same kind of checks as before as to whether replacing one clock with another
would lead to the same time measurements. This is because the rules just given
do not in fact allow us to use any other clocks, but only the particular set of
clocks which are bolted to our framework of measuring rods.
Whether other clocks yield the same values is still an interesting question,
but not one that affects whether the above notions of time and position of some
event in a given reference frame are well defined.
Significantly, we have used a different method here to synchronize clocks.
The new method based on a pulse of light is available now that we have
assumption II, which guarantees that it is an accurate way to synchronize clocks
in an inertial frame. This synchronization process is shown in the spacetime
diagram below.
Note that the diagram is really hard to read if we use meters and seconds.
Therefore, it is convenient to use units of seconds and light-seconds: 1 Ls
= (1 sec)c = $3 \times 10^8$ m = $3 \times 10^5$ km. This is the distance that light can travel
in one second, roughly 7 times around the equator. Working in such units is
often called "choosing c = 1," since light travels at 1 Ls/sec. We will make this
choice for the rest of the course, so that light rays will always appear on our
diagrams as lines at a 45° angle with respect to the vertical; i.e., slope = 1.
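A short Python sketch makes the unit change concrete. The helper name `to_light_seconds` is hypothetical, and the equatorial circumference used for the comparison is an assumed round value.

```python
C = 299_792_458.0            # speed of light, m/s
LIGHT_SECOND_M = C * 1.0     # distance light covers in one second, in meters
EQUATOR_KM = 40_075.0        # assumed equatorial circumference of the earth, km

def to_light_seconds(meters):
    """Convert a distance in meters to light-seconds."""
    return meters / LIGHT_SECOND_M

# How many trips around the equator fit into one light-second?
laps = (LIGHT_SECOND_M / 1000.0) / EQUATOR_KM

# In the new units, light covers exactly 1 Ls per second, i.e. c = 1.
speed_of_light_in_new_units = to_light_seconds(C)

print(f"1 Ls is about {laps:.1f} laps around the equator; c = {speed_of_light_in_new_units} Ls/s")
```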
SIMULTANEITY: OUR FIRST DEPARTURE
FROM GALILEO AND NEWTON
The above rules allow us to construct spacetime diagrams in various
reference frames. An interesting question then becomes just how these
diagrams are related.
Let us start with an important example. We went to some trouble to
show that the notion that two events happen 'at the same time' does not
depend on which reference frame (i.e., on which synchronized set of clocks)
we used to measure these times. Now that we have thrown out T and S,
will this statement still be true?
Let us try to find an operational definition of whether two events
occur 'simultaneously' (i.e., at the same time) in some reference frame.
We can of course read the clocks of our friends who are at those events
and who are in our reference frame.
However, it turns out to be useful to find a way of determining which
events are simultaneous with each other directly from postulate II, the
one about the speed of light. Note that there is no problem in determining
whether or not two things happen (like a door closing and a firecracker
going off) at the same event. The question is merely whether two things
that occur at different events take place simultaneously.
Suppose that we have a friend in an inertial frame and that she emits
a flash of light from her worldline. The light will travel outward both to
the left and the right, always moving at speed c. Suppose that some of
this light is reflected back to her from event A on the left and from
event B on the right. The diagram below makes
it clear that the two reflected pulses of light reach her at the same time if
and only if A and B are simultaneous. So, if event C (where the reflected
pulses cross) lies on her worldline, she knows that A and B are in fact
simultaneous in our frame of reference.
Note: Although the light does not reach our friend until event C (at t
= 2 sec, where she 'sees' the light), she knows that the light has taken
some time to travel and her measurements place the reflections at t = 1 sec.
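The reasoning in the note can be sketched in Python. Because light travels at the same speed c on the way out and on the way back, the friend places a reflection halfway between the time she emitted the flash and the time the echo returned. The function names `radar_coordinates` and `simultaneous` are hypothetical, and units with c = 1 are assumed.

```python
def radar_coordinates(t_emit, t_receive, c=1.0):
    """Time and distance an inertial observer assigns to a reflection event,
    using only her own clock readings at emission and at return."""
    t_event = 0.5 * (t_emit + t_receive)         # halfway between emission and return
    distance = 0.5 * c * (t_receive - t_emit)    # half of the round-trip light travel
    return t_event, distance

def simultaneous(echo_a, echo_b, tol=1e-12):
    """True if she assigns the same time to both reflection events.
    Each echo is a pair (t_emit, t_receive)."""
    t_a, _ = radar_coordinates(*echo_a)
    t_b, _ = radar_coordinates(*echo_b)
    return abs(t_a - t_b) < tol

# The situation in the note: flash at t = 0, both echoes return at t = 2 s.
t_reflect, dist = radar_coordinates(0.0, 2.0)
same_time = simultaneous((0.0, 2.0), (0.0, 2.0))
print(t_reflect, dist, same_time)
```

Here both reflections are placed at t = 1 sec, one light-second away, matching the numbers in the note.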
In fact, even if we are in a different reference frame, we can tell that
A and B are simultaneous in our friend's frame if event C lies on her worldline.
Suppose that we are also inertial observers who meet our friend at the origin
event and then move on. What does the above experiment look like in our
frame?
Let's start by drawing our friend's worldline and marking event C.
We don't really know where event C should appear, but it doesn't make
much difference since we have drawn no scale. All that matters is that
event C is on our friend's worldline ($x_f = 0$).
Now let's add the light rays from the origin and from event C.
The events where these lines cross must be A and B, as shown below.
Note that, on either diagram, the worldline $x_f = 0$ makes the same
angle with the light cone as the line of simultaneity $t_f$ = const. That is, the
angles $\alpha$ and $\beta$ below are equal. You will in fact derive this in one of your
homework problems.
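The construction can be checked numerically with a Python sketch. It works in our frame with c = 1, takes the friend's worldline to be x = v t, and finds events A and B as the crossings of the light rays leaving the origin with the light rays arriving at C. The function name `light_box_events` and the sample values of v and the time of C are assumptions made for illustration.

```python
def light_box_events(v, t_c, c=1.0):
    """Events A (left) and B (right), as (t, x) pairs in OUR frame, where light
    rays from the origin meet light rays arriving at event C on the friend's
    worldline x = v * t."""
    x_c = v * t_c
    # Right-moving ray from the origin (x = c t) meets the left-moving ray into C.
    t_b = 0.5 * (t_c + x_c / c)
    x_b = c * t_b
    # Left-moving ray from the origin (x = -c t) meets the right-moving ray into C.
    t_a = 0.5 * (t_c - x_c / c)
    x_a = -c * t_a
    return (t_a, x_a), (t_b, x_b)

v = 0.5                       # assumed speed of the friend, in units of c
(t_a, x_a), (t_b, x_b) = light_box_events(v, t_c=2.0)

# Slope (time per unit distance) of the line through A and B on our diagram.
slope = (t_b - t_a) / (x_b - x_a)
print((t_a, x_a), (t_b, x_b), slope)
```

In this sketch the line through A and B rises by v units of time per unit of distance, while the worldline advances v units of distance per unit of time. The two lines are therefore mirror images of each other about the light ray, which is the equal-angles statement. For v = 0 the events A and B are at equal times in our frame as well.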
By the way, we also can find other pairs of events on our diagram
that are simultaneous in our friend's reference frame. We do this by sending
out light signals from another observer in the moving frame. For example,
the diagram below shows another event (D) that is also simultaneous with
A and B in our friend's frame of reference.
In this way we can map out our friend's entire line of simultaneity - the
set of all events that are simultaneous with each other in her reference frame.
The result is that the line of simultaneity for the moving frame does indeed
appear as a straight line on our spacetime diagram. This property will be very
important in what is to come.
Before moving on, let us get just a bit more practice and ask what set
of events our friend (the moving observer) finds to be simultaneous with
the origin (the event where her worldline crosses ours)? We can use
light signals to find this line as well. Let's label that line $t_f = 0$ under the
assumption that our friend chooses to set her watch to zero at the event
where the worldlines cross. Drawing in a carefully chosen box of light
rays, we arrive at the diagram below.
Note that we could also have used the rule noticed above: that the
worldline and any line of simultaneity make equal angles with the light
cone.
Note that, although the line of simultaneity drawn above ($t_f$ = const) represents some constant
time in the moving frame, we do not yet know which time that is! In particular,
we do not yet know whether it represents a time greater than one second or a
time less than one second. We were able to label the $t_f = 0$ line with an actual
value only because we explicitly assumed that our friend would measure time
from the event (on that line) where our worldlines crossed. We will explore
the question of how to assign actual time values to other lines of simultaneity
shortly.
We have learned that events which are simultaneous in one inertial
reference frame are not in fact simultaneous in a different inertial frame.
We used light signals and postulate II to determine which events were
simultaneous in which frame of reference.
RELATIONS BETWEEN EVENTS IN SPACETIME
It will take some time to absorb the implications, but let us begin
with an interesting observation. A pair of events which is separated by
"pure space" in one inertial frame (i.e., is simultaneous in that frame) is
separated by both space and time in another. Similarly, a pair of events
that is separated by "pure time" in one frame (occurring at the same
location in that frame) is separated by both space and time in any other
frame. This may remind you a bit of our discussion of electric and magnetic
fields, where a field that was purely magnetic in one frame involved both
electric and magnetic parts in another frame. In that case we decided that
it was best to combine the two and to speak simply of a single
"electromagnetic" field. Similarly here, it is best not to speak of space and
time separately, but instead only of "spacetime" as a whole. The spacetime
separation is fixed, but the decomposition into space and time depends
on the frame of reference.
Note the analogy to what happens when you turn around in space.
The notions of Forward/Backward vs. Right/Left get mixed up when you
turn (rotate) your body. If you face one way, you may say that the Hall
of Languages is "straight ahead." If you turn a bit, you might say that the
Hall of Languages is "somewhat ahead and somewhat to the left."
However, the separation between you and the Hall of Languages is the
same no matter which way you are facing. As a result, Forward/Backward
and Right/Left are not strictly speaking separate, but rather fit together
to form two-dimensional space.
This is exactly what is meant by the phrase "space and time are not
separate, but fit together to form four-dimensional spacetime." As a result,
"time is the fourth dimension of spacetime." So then, how do we
understand the way that events are related in this spacetime?
In particular, we have seen that simultaneity is not an absolute concept
in spacetime itself. There is no meaning to whether two events occur at the
same time unless we state which reference frame is being used.
If there is no absolute meaning to the word 'simultaneous,' what about
'before' and 'after' or 'past' and 'future?' Let's start off slowly. We have seen
that if A and B are simultaneous in your (inertial) frame of reference (but
are not located at the same place), then there is another inertial frame in
which A occurs before B. A similar argument (considering a new inertial
observer moving in the other direction) shows that there is another inertial
frame in which B occurs before A. Looking back at our diagrams, the same
is true if A occurs just slightly before B in your frame of reference.
However, this does not happen if B is on the light cone emitted from
A, or if B is inside the light cone of A. To see this, remember that since the
speed of light is c = 1 in any inertial frame, the light cone looks the same
on everyone's spacetime diagram. A line more horizontal than the light
cone therefore represents a 'speed' greater than c, while a line more vertical
than the light cone represents a speed less than c. Because light rays look
the same on everyone's spacetime diagram, the distinction between these
three classes of lines must also be the same in all reference frames.
Thus, it is worthwhile to distinguish three classes of relationships that
pairs of events can have. These classes and some of their properties are
described below. Note that in describing these properties we limit ourselves
to inertial reference frames that have a relative speed less than that of
light.
Case 1: A and B are outside each other's light cones.
In this case, we say that they are spacelike related. Note that the following
things are true in this case:
• There is an inertial frame in which A and B are simultaneous.
• There are also inertial frames in which event A happens first as
well as frames in which event B happens first (even more tilted
than the simultaneous frame shown above). However, A and B
remain outside of each other's light cones in all inertial frames.
Case 2: A and B are inside each other's light cones in all inertial frames.
In this case we say that they are timelike related. Note that the
following things are true in this case:
• There is an inertial observer who moves through both events
and whose speed in the original frame is less than that of light.
• All inertial observers agree on which event (A or B) happened
first.
As a result, we can meaningfully speak of, say, event A being to
the past of event B.
Case 3: A and B are on each other's light cones. In this case we say
that they are lightlike related. Again, all inertial observers agree on which
event happened first and we can meaningfully speak of one of them being
to the past of the other.
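The three cases can be summarized in a small Python sketch. It compares the spatial separation of two events with the distance light could cover in the time between them, using units where c = 1. The function name `classify` and the tolerance used for the lightlike case are assumptions for illustration.

```python
def classify(dt, dx, c=1.0, tol=1e-12):
    """Classify the relation between two events separated by dt in time and
    dx in space, as measured in any one inertial frame."""
    reach = c * abs(dt)                  # how far light travels in the time dt
    if abs(abs(dx) - reach) <= tol:
        return "lightlike"               # each event is on the other's light cone
    if abs(dx) > reach:
        return "spacelike"               # outside each other's light cones
    return "timelike"                    # inside each other's light cones

examples = {
    "Case 1": classify(dt=1.0, dx=2.0),
    "Case 2": classify(dt=2.0, dx=1.0),
    "Case 3": classify(dt=1.0, dx=1.0),
}
print(examples)
```

Since the light cone looks the same on every inertial observer's diagram, the label returned here does not depend on which inertial frame supplied dt and dx.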
Now, why did we consider only inertial frames with relative speeds
less than c? Suppose for the moment that our busy friend (the inertial
observer) could in fact travel at v > c (i.e., faster than light).
Fig. Worldline Moves Faster than Light
We have marked two events, A and B that occur on her worldline. In our
frame event A occurs first.
However, the two events are spacelike related. Thus, there is another
inertial frame ($t_{other}$, $x_{other}$) in which B occurs before A. This means that
there is some inertial observer (the one whose frame is drawn at right)
who would see her traveling backwards in time.
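The reversal of order can be illustrated with a Python sketch. It relies on an assumption drawn from the equal-angles rule for lines of simultaneity: in units where c = 1, an observer moving at velocity v (with |v| < 1) has lines of simultaneity of the form t - v x = const on our diagram, with later lines having larger values of that constant. The function name `order_in_other_frame` and the sample numbers are hypothetical.

```python
def order_in_other_frame(dt, dx, v):
    """Which of two events comes first for an observer moving at velocity v
    (|v| < 1, units with c = 1).

    dt, dx: time and position of event B minus those of event A, in our frame.
    Assumption: the moving observer's lines of simultaneity are t - v*x = const.
    """
    s = dt - v * dx
    if s > 0:
        return "A first"
    if s < 0:
        return "B first"
    return "simultaneous"

# Hypothetical faster-than-light worldline with speed 2: B is 1 s later, 2 Ls away.
ftl = order_in_other_frame(dt=1.0, dx=2.0, v=0.6)
# Ordinary worldline with speed 0.5, judged by the same moving observer.
slow = order_in_other_frame(dt=1.0, dx=0.5, v=0.6)
print(ftl, "|", slow)
```

For the faster-than-light worldline the moving observer places B before A, while for the slower-than-light worldline the order is the same as in our frame.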
This was too weird even for Einstein. After all, if she could turn
around, our faster-than-light friend could even carry a message from some
observer's future into that observer's past. This raises all of the famous
'what if you killed your grandparents' scenarios from science fiction fame.
Fig. Worldline Moves Faster than Light
The point is that, in relativity, travel faster than light is travel
backwards in time. For this reason, let us simply ignore the possibility of
such observers for a while. In fact, we will assume that no information of
any kind can be transmitted faster than c.
We are beginning to come to terms with simultaneity but, as pointed
out earlier, we are still missing important information about how different
inertial frames match up. In particular, we still do not know just what value of
constant $t_f$ the line marked "friend's line of simultaneity" below actually
represents. In other words, we do not yet understand the rate at which some observer's
clock ticks in another observer's reference frame. To find out, we should use the
one tool our postulates provide: the speed of light. That is, we should somehow
make a clock out of light. For example, we can bounce a beam of light back
and forth between two mirrors separated by a known distance.
Perhaps we imagine the mirrors being attached to a rod of fixed length L.
Since we know how far apart the mirrors are, we know how long it takes a
pulse of light to travel up and down and we can use this to mark the passage
of time. We have a clock.
Rods in the Perpendicular Direction
A useful trick is to think about what happens when this 'light clock' is
held perpendicular to the direction of relative motion. This direction is simpler
than the direction of relative motion itself.
For example, two inertial observers actually do agree on which events
are simultaneous in that direction. Suppose that you have two firecrackers,
one placed one light second to your left and one placed one light second
to your right. Suppose that both explode at the same time in your frame of
reference. Does one of them explode earlier in mine? No, and the easiest way
to see this is to argue by symmetry: the only difference between the two
firecrackers is that one is on the right and the other is on the left.
Since the motion is forward or backward, left and right act exactly the
same in this problem.
Thus, the answer to the question 'which is the earliest' must not
distinguish between left and right. But, there are only three possible
answers to this question: left, right, and neither. Thus, the answer must
be 'neither', and both firecrackers explode at the same time in our reference
frame as well.
Now, suppose that you and I each hold a meter stick perpendicular to the
direction of our relative motion, and ask about the lengths of the meter sticks. Let's ask
whose meter stick you measure to be longer. For simplicity, let us suppose
that you conduct the experiment at the moment that the two meter sticks
are in contact (when they "pass through each other"). Note that we both agree
on which events constitute the meter sticks passing through each other, since
this involves only simultaneity in the
direction along the meter sticks and, in the present case, this direction is
perpendicular to our relative velocity.
On the one hand, since we both agree that we are discussing the same
set of events, we must also agree on which meter stick is longer. This is
merely a question of whether the event at the end of your meter stick is
inside or outside of the line of events representing my meter stick.
Said more physically, suppose that we put a piece of blue chalk on
the end of my meter stick, and a piece of red chalk on the end of yours.
Then, after the meter sticks touch, we must agree on whether there is now
a blue mark on your stick (in which case yours is longer), there is a red
mark on my stick (in which case mine is longer), or whether each
piece of chalk marked the very end of the other stick (in which case they
are the same length). On the other hand, the laws of physics are the same in
all inertial frames.
In particular, suppose that the laws of physics say that, if you (as an inertial
observer) take a meter stick 1 m long in its own rest frame and move it toward
you, then that meter stick appears to be longer than a meter stick that is at
rest in your frame of reference.
Here we assume that it does not matter in which direction (forward
or backward) the meter stick is moving, as all directions in space are the
same.
In that case, the laws of physics must also say that, if I (as an inertial
observer) take a meter stick 1 m long in its own rest frame and move it
toward me, then that meter stick again appears to be longer than a meter
stick that is at rest in my frame of reference.
Thus, if you find my stick to be longer, I must find your stick to be
longer. If you find my stick to be shorter, then I must find your stick to be
shorter. Consistency therefore requires that both of us find the two meter sticks to be of the
same length.
We conclude that the length of a meter stick is the same in two inertial
frames for the case where the stick points in the direction perpendicular
to the relative motion.
LIGHT CLOCKS AND REFERENCE FRAMES
The property just derived makes it convenient to use such meter sticks to
build clocks. We have given up most of our beliefs about physics for the
moment, so that in particular we need to think about how to build a reliable
clock.
The one thing that we have chosen to build our new framework upon
is the constancy of the speed of light. Therefore, it makes sense to use
light to build our clocks. We will do this by sending light signals out to
the end of our meter stick and back.
For convenience, let us assume that the meter stick is one light-second
long. This means that it will take the light one second to travel out to the
end of the stick and then one second to come back. A simple model of such a
light clock would be a device in which we put mirrors on each end of the
meter stick and let a short pulse of light bounce back and forth. Each time the
light returns to the first mirror, the clock goes 'tick' and two seconds have
passed.
Now, suppose we look at our light clock from the side. Let's say that the
rod in the clock is oriented in the vertical direction. The path taken by the
light looks like this:
However, what if we look at a light clock carried by our inertial friend
who is moving by at speed v?
Suppose that the rod in her clock is also oriented vertically, with the
relative motion in the horizontal direction. Since the light goes straight up
and down in her reference frame, the light pulse moves up and forward (and
then down and forward) in our reference frame.
This should be clear from thinking about the path you see a basketball
follow if someone lifts the basketball above their head while they are walking
past you. The length of each side of the triangle is marked on the diagram
above.
Here, L is the length of her rod and $t_{us}$ is the time that it takes the light to
move from one end of the stick to the other. To compute two of the lengths,
we have used the fact that, in our reference frame, the light moves at speed c
while our friend moves at speed v.
The interesting question, of course, is just how long is this time $t_{us}$. We
know that the light takes 1 second to travel between the tips of the rod as
measured in our friend's reference frame, but what about in ours? It turns out
that we can calculate the answer by considering the length of the path traced
out by the light pulse.
Using the Pythagorean theorem, the distance that we measure the light to
travel is $\sqrt{(v t_{us})^2 + L^2}$.
However, we know that it covers this distance in a time $t_{us}$ at speed c.
Therefore, we have
$$c^2 t_{us}^2 = v^2 t_{us}^2 + L^2,$$
or,
$$L^2/c^2 = t_{us}^2 - (v/c)^2 t_{us}^2 = (1 - [v/c]^2)\, t_{us}^2.$$
Thus, we measure a time $t_{us} = \dfrac{L}{c\sqrt{1-(v/c)^2}}$ between when the light
leaves one mirror and when it hits the next! This is in contrast to the time
$t_{friend} = L/c = 1$ second measured by our friend between these same two events.
Since this will be true for each tick of our friend's clock, we can conclude
that:
Between any two events where our friend's clock ticks, the time $t_{us}$ that
we measure is related to the time $t_{friend}$ measured by our friend through
$$t_{us} = \frac{t_{friend}}{\sqrt{1-(v/c)^2}}.$$
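The light-clock argument is easy to check numerically. The sketch below is only an illustration: the function names `dilation_factor` and `light_clock_time` and the choice v = 0.6c are our own assumptions, not part of the original text. It solves the Pythagorean relation for $t_{us}$ directly and compares the answer with the time dilation formula.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def dilation_factor(v: float, c: float = C) -> float:
    """Return 1/sqrt(1 - (v/c)^2), the factor relating t_us to t_friend."""
    if not abs(v) < c:
        raise ValueError("speed must satisfy |v| < c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def light_clock_time(rod_length: float, v: float, c: float = C) -> float:
    """Time we measure for light to cross a vertical rod moving sideways at v.

    Solves c^2 t^2 = v^2 t^2 + L^2 for t, as in the derivation above.
    """
    return rod_length / math.sqrt(c**2 - v**2)


# A rod one light-second long, carried past us at 60% of the speed of light
# (the speed is an assumed example value).
L = C * 1.0
t_friend = L / C                     # one second in the friend's frame
t_us = light_clock_time(L, 0.6 * C)  # what we measure for the same tick

print(f"t_friend = {t_friend:.3f} s, t_us = {t_us:.3f} s")
print(f"dilation factor = {dilation_factor(0.6 * C):.3f}")
```

Both routes give the same number because the dilation factor is just the Pythagorean relation solved once and for all.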
Finally, we have learned how to label another line on our diagram above:
The dot labeled A is the event where the moving (friend's) clock ticks t =
1 second. It is an event on the friend's worldline. The dot labeled B is the
event where our clock ticks t = 1 second. It is an event on our worldline.
PROPER TIME
We have seen that different observers in different inertial frames measure
different amounts of time to pass between two given events. We might ask if
any one of these is a "better" answer than another? Well, in some sense the
answer must be 'no,' since the principle of relativity tells us that all inertial
frames are equally valid. However, there can be a distinguished answer. Note
that, if one inertial observer actually experiences both events, then inertial
observers in other frames have different worldlines and so cannot pass through
both of these events. It is useful to use the term proper time between two
events to refer to the time measured by an inertial observer who actually moves
between the two events. Note that this concept exists only for timelike separated
events.
Let's work through a few cases to make sure that we understand
what is going on. Consider two observers, red and blue. The worldlines of
the two observers intersect at an event, where both set their clocks to read
t = 0.
• Suppose that red sets a firecracker to go off on red's worldline
at $t_{red} = 1$. At what time does blue find it to go off? Our result
tells us that $t_{blue} = 1/\sqrt{1-(v/c)^2}$.
• Suppose now that blue sets a firecracker to go off on blue's
worldline at $t_{blue} = 1$. At what time does red find it to go off?
From the same result we now have $t_{red} = 1/\sqrt{1-(v/c)^2}$.
• Suppose that (when they meet) blue plants a time bomb in red's
luggage and sets it to go off after 1 sec. At what time does blue find
it to go off? The time bomb will go off after it experiences 1 sec
of time. In other words, it will go off at the point along its
worldline which is 1 sec of proper time later. Since red is traveling
along the same worldline, this is 1 sec later according to red and
on red's worldline. As a result, time dilation tells us that this
happens at $t_{blue} = 1/\sqrt{1-(v/c)^2}$.
• Suppose that (when they meet) red plants a time bomb in blue's
luggage and he wants it to go off at $t_{red} = 1$. How much time
delay should the bomb be given? This requires figuring out how
much proper time will pass on blue's worldline between red's
lines of simultaneity $t_{red} = 0$ and $t_{red} = 1$. Since the events are
on blue's worldline, blue plays the role of the moving friend. As
a result, the time until the explosion as measured by blue should
be $t_{blue} = \sqrt{1-(v/c)^2}$, and this is the delay to set.
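To make the four cases concrete, here is a small sketch with numbers plugged in. The relative speed v = 0.6c and all variable names are assumptions chosen for illustration. The point to notice is that the first three cases multiply by the dilation factor, while the last one divides by it, because there the proper time belongs to blue.

```python
import math


def gamma(beta: float) -> float:
    """Dilation factor 1/sqrt(1 - beta^2), where beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


beta = 0.6       # assumed relative speed of red and blue, as a fraction of c
g = gamma(beta)  # close to 1.25 for this speed

# Case 1: firecracker on red's worldline at t_red = 1; time assigned by blue.
case1_t_blue = 1.0 * g
# Case 2: firecracker on blue's worldline at t_blue = 1; time assigned by red.
case2_t_red = 1.0 * g
# Case 3: bomb riding with red, fused for 1 sec of proper time; blue's time.
case3_t_blue = 1.0 * g
# Case 4: bomb riding with blue must explode on red's line t_red = 1;
# the fuse (blue's proper time) is shorter than 1 sec.
case4_delay = 1.0 / g

for name, value in [("case 1", case1_t_blue), ("case 2", case2_t_red),
                    ("case 3", case3_t_blue), ("case 4", case4_delay)]:
    print(f"{name}: {value:.3f} sec")
```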
Why should you believe all of this? So far, we have just been working
out consequences of Einstein's idea. We have said little about whether you
should actually believe that this represents reality.
In particular, the idea that clocks in different reference frames measure
different amounts of time to pass blatantly contradicts your experience,
doesn't it? Just because you go and fly around in an airplane does not
mean that your watch becomes unsynchronized with the Cartoon
Network's broadcast schedule, does it?
Well, let's start thinking about this by figuring out how big the time
dilation effect would be in everyday life. Commercial airplanes move at
about 300m/s.
So, $v/c \approx 10^{-6}$ for an airplane. Now, $\sqrt{1-(v/c)^2} = 1 - \frac{1}{2}(v/c)^2 + \ldots \approx
1 - 5 \times 10^{-13}$ for the airplane. This differs from 1 by less than 1 part in a trillion.
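Ordinary floating-point numbers carry only about 16 digits, so an effect of size $10^{-13}$ is better checked with extended precision. The sketch below uses Python's `decimal` module for that purpose. The rounded value c = 3 x 10^8 m/s is the one used in the text, and the variable names are our own.

```python
from decimal import Decimal, getcontext

getcontext().prec = 40  # work with 40 significant digits

c = Decimal(3) * Decimal(10) ** 8  # speed of light in m/s (rounded, as in the text)
v = Decimal(300)                   # typical airliner speed in m/s

beta = v / c
# Exact fractional slowdown of the moving clock: 1 - sqrt(1 - beta^2).
exact = 1 - (1 - beta**2).sqrt()
# Leading term of the expansion used in the text: (1/2) beta^2.
leading = beta**2 / 2

print("beta            =", beta)
print("exact slowdown  =", exact)
print("leading term    =", leading)
```

The two values agree to about twelve significant figures, which is why dropping the higher-order terms is harmless here.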
Tiny, eh? You'd never notice this by checking your watch against the
Cartoon Network. However, physics is a very precise science. It turns out
that it is in fact possible to measure time to better than one part in a trillion.
A nice form of this experiment was first done in the early 1970s. Some physicists
got two identical atomic clocks, brought them together, and checked that
they agreed to much better than 1 part in a trillion. Then, they left one in
the lab and put the other on an airplane (such clocks were big, so they bought
a seat for the clock on a commercial airplane flight) and flew around for
a while. When they brought the clocks back together at the end of the
experiment, the moving clock had in fact 'ticked' fewer times, measuring
less time to pass in precise accord with our calculations above and
Einstein's prediction. We were merrily exploring Einstein's crazy idea. While
Einstein's suggestion clearly fits with the Michelson-Morley experiment, we
still have not figured out how it fits with the stellar aberration experiments.
So, we were just exploring the suggestion to see where it leads.
It led to a (ridiculous) prediction that clocks in different reference frames
measure different amounts of time to pass. This prediction has in fact been
experimentally tested, and Einstein's idea passed with flying colors. Now,
you should begin to believe that all of this crazy stuff really is true. Oh, and
there will be plenty more weird predictions and experimental verifications to
come.
Another lovely example of this kind of thing comes from small
subatomic particles called muons (pronounced moo-ons). Muons are
"unstable," meaning that they exist only for a short time and then turn
into something else involving a burst of radiation. You can think of them
like little time bombs. They live (on average) about $10^{-6}$ seconds. Now,
muons are created in the upper atmosphere when a cosmic ray collides
with the nucleus of some atom in the air (say, oxygen or nitrogen). In the
1930's, people noticed that these particles were traveling down through
the atmosphere and appearing in their physics labs. Now, the atmosphere
is about 30,000m tall, and these muons are created near the top.
The muons then travel downward at something close to the speed of
light. Note that, if they traveled at the speed of light, $3 \times 10^8$ m/s, it would
take them a time $t = 3 \times 10^4\,\text{m}/(3 \times 10^8\,\text{m/s}) = 10^{-4}$ sec to reach the earth.
But, they are only supposed to live for $10^{-6}$ seconds! So, they should only
make it 1/100 of the way down before they explode.
The point is that the birth and death of a muon are like the ticks of
its clock and should be separated by $10^{-6}$ seconds as measured in the rest
frame of the muon. In other words, the relevant concept here is $10^{-6}$
seconds of proper time. In our rest frame, we will measure a time
$10^{-6}\,\text{sec}/\sqrt{1-(v/c)^2}$ to pass. For v close enough to c, this can be as large as (or larger
than) $10^{-4}$ seconds.
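How close to c is "close enough"? The sketch below estimates the speed at which a muon with the quoted proper lifetime just reaches the ground. It uses the round numbers from the text (lifetime $10^{-6}$ s, height 30,000 m, c = 3 x 10^8 m/s); the function name and the idea of treating the average lifetime as a fixed fuse are simplifying assumptions of ours.

```python
import math

C = 3.0e8         # speed of light in m/s (rounded, as in the text)
LIFETIME = 1.0e-6  # proper lifetime of the muon in seconds (value from the text)
HEIGHT = 3.0e4     # height of the atmosphere in metres (value from the text)


def speed_to_reach_ground(height: float, lifetime: float, c: float = C):
    """Return (beta, gamma) such that v * gamma * lifetime equals height.

    In our frame the muon lives gamma * lifetime and moves at v = beta * c,
    so it covers beta * gamma * c * lifetime. Solving for gamma uses the
    identity gamma^2 = 1 + (beta * gamma)^2.
    """
    beta_gamma = height / (c * lifetime)
    gamma = math.sqrt(1.0 + beta_gamma**2)
    return beta_gamma / gamma, gamma


naive_range = C * LIFETIME  # distance covered with no time dilation at all
beta, gamma = speed_to_reach_ground(HEIGHT, LIFETIME)

print(f"range without time dilation: {naive_range:.0f} m")
print(f"required gamma: {gamma:.3f}, required v/c: {beta:.6f}")
```

With these inputs the required dilation factor is about 100, matching the factor between $10^{-6}$ and $10^{-4}$ seconds quoted above.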
This concludes our first look at time dilation. We turn our attention
to measurements of position and distance. However, there remain several
subtleties involving time dilation that we have not yet explored.
LENGTH CONTRACTION
We learned how to relate times measured in different inertial frames.
Clearly, the next thing to understand is distance. While we had to work
fairly hard to compute the amount of time dilation that occurs, we will
see that the effect on distances follows quickly from our results for time.
Let's suppose that two inertial observers both have measuring rods that
are at rest in their respective inertial frames. Each rod has length L in the
frame in which it is at rest (its "rest frame").
We saw that distances in the direction perpendicular to the relative
motion are not affected. So, to finish things off, this time we must consider
the case where the measuring rods are aligned with the direction of the
relative velocity.
For definiteness, let us suppose that the two observers each hold their
meter stick at the leftmost end. The relevant spacetime diagram is shown
below. As usual, we assume that the two observers' clocks both read t = 0
at the event where their worldlines cross. We will call our observers 'student'
and 'professor.' We begin by drawing the diagram in the student's rest
frame and with the professor moving by at relative velocity v.
[Figure: spacetime diagram in the student's rest frame; the unknown to be found is $x_p(\text{end})$.]
Now, the student must find that the professor takes a time L/v to
traverse the length of the student's measuring rod. Let us refer to the event
(marked in magenta) where the moving professor arrives at the right end
of the student's measuring rod as "event A."
Since this event has $t_s = L/v$, we can use our knowledge of time dilation
to conclude that the professor assigns a time $t_p = (L/v)\sqrt{1-(v/c)^2}$ to this
event. Our goal is to determine the length of the student's measuring rod
in the professor's frame of reference. That is, we wish to know what
position $x_p(\text{end})$ the professor assigns to the rightmost end of the student's
rod when this end crosses the professor's line of simultaneity $t_p = 0$.
To find this out, note that from the professor's perspective it is the
student's rod that moves past him at speed v. It takes the rod a time
$t_p = (L/v)\sqrt{1-(v/c)^2}$ to pass by. Thus, the student's rod must have a length
$L_p = L\sqrt{1-(v/c)^2}$ in the professor's frame of reference. The professor's rod,
of course, will similarly be shortened in the student's frame of reference. So,
we see that distance measurements also depend on the observer's frame of
reference. Note however, that given any inertial object, there is a special inertial
frame in which the object is at rest. The length of an object in its own rest
frame is known as its proper length. The length of the object in any other
inertial frame will be shorter than the object's proper length. We can summarize
what we have learned by stating:
An object of proper length L moving through an inertial frame at
speed v has length $L\sqrt{1-v^2/c^2}$ as measured in that inertial frame.
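The student-and-professor argument can be replayed step by step in a few lines. In this sketch the function names and the example speed v = 0.6c are assumptions for illustration; units are chosen so that c = 1. The second function follows the reasoning in the text (crossing time, then time dilation, then speed times time) and should agree with the closed formula.

```python
import math


def contracted_length(proper_length: float, beta: float) -> float:
    """Length of a moving object: L * sqrt(1 - v^2/c^2), with beta = v/c."""
    return proper_length * math.sqrt(1.0 - beta**2)


def length_via_time_dilation(proper_length: float, beta: float, c: float = 1.0) -> float:
    """Re-derive the contracted length the way the text does."""
    v = beta * c
    # Student's frame: the professor needs L/v to traverse the student's rod.
    t_student = proper_length / v
    # The professor's own clock reads less, by the time dilation factor.
    t_professor = t_student * math.sqrt(1.0 - beta**2)
    # Professor's frame: the rod slides past at speed v for a time t_professor.
    return v * t_professor


rod = 1.0   # proper length of the rod (one meter stick)
beta = 0.6  # assumed relative speed as a fraction of c

print(f"closed formula:     {contracted_length(rod, beta):.3f}")
print(f"step-by-step route: {length_via_time_dilation(rod, beta):.3f}")
```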
There is an important subtlety that we should explore. Note that the
above statement refers to an object. However, we can also talk about
proper distance between two events. When two events are spacelike related,
there is a special frame of reference in which the events are simultaneous
and the separation is "pure space" (with no separation in time). The
distance between them in this frame is called the proper distance between
the events. It turns out that this distance is in fact longer in any other
frame of reference.
Why longer? To understand this, look back at the above diagram and
compare the two events at either end of the student's rod that are
simultaneous in the professor's frame of reference. Note that the proper
distance is the distance measured in the professor's reference frame, which
we just concluded is shorter than the distance measured by the student.
The difference here is that we are now talking about events (points on the
diagram) whereas before we were talking about objects (whose ends appear
as worldlines on the spacetime diagram). The point is that, when we talk
about measuring the length of an object, different observers are actually
measuring the distance between different pairs of events.
THE TRAIN PARADOX
Let us now test our new skills and work through some subtleties by
considering an age-old parable known as the train paradox. It goes like
this: Once upon a time there was a really fast Japanese bullet train that
ran at 80% of the speed of light. The train was 100m long in its own rest
frame. The train carried as cargo the profits of SONY corporation from
Tokyo out to their headquarters in the countryside. The profits were, of
course, carried in pure gold.
Now, some less than reputable characters found out about this and
devised an elaborate scheme to rob the train. They knew that the train
would pass through a 100m long tunnel on its route. Watching the train
go by, they measured the train to be quite a bit less than 100m long and
so figured that they could easily trap it in the tunnel.
Of course, the people on the train found that, when the train was in motion,
it was the train that was 100m long while the tunnel was significantly shorter.
As a result, they had no fear of being trapped in the tunnel by train robbers.
Now, do you think the robbers managed to catch the train?
Let's draw a spacetime diagram using the tunnel's frame of reference. We
can let E represent the tunnel entrance and X represent the tunnel exit. Similarly,
we let B represent the back of the train and F represent the front of the train.
Let event 1 be the event where the back of the train finally reaches the tunnel
and let event 2 be the event where the front of the train reaches the exit.
[Figure: spacetime diagram of the train and the tunnel, drawn in the tunnel's (ground) frame of reference.]
Suppose that one robber sits at the entrance to the tunnel and that one sits
at the exit. When the train nears, they can blow up the entrance just after
event 1 and they can blow up the exit just before event 2. Note that, in between
these two events, the robbers find the train to be completely inside the tunnel.
Now, what does the train think about all this? How are these events
described in its frame of reference? Note that the train finds event 2 to
occur long before event 1. So, can the train escape?
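The disagreement about the order of events 1 and 2 can be checked using nothing more than length contraction. In the sketch below, times are expressed as ct in metres (the distance light travels in that time) and all variable names are our own. The speed 0.8c and the 100m proper lengths come from the story above.

```python
import math

BETA = 0.8             # train speed as a fraction of c
PROPER_LENGTH = 100.0  # metres; train and tunnel, each in its own rest frame

contraction = math.sqrt(1.0 - BETA**2)  # close to 0.6 at this speed

# Tunnel (ground) frame: the train is contracted.
train_in_ground = PROPER_LENGTH * contraction
# At event 1 the back of the train is at the entrance, so the front still
# has this far to go before it reaches the exit (event 2):
gap_ground = PROPER_LENGTH - train_in_ground
# ct(event 2) - ct(event 1) in the ground frame; positive means 2 is later.
ct_gap_ground = gap_ground / BETA

# Train frame: the tunnel is contracted and rushes backward at 0.8c.
tunnel_in_train = PROPER_LENGTH * contraction
# At event 2 the exit meets the front of the train, and the entrance is
# still this far short of reaching the back of the train (event 1):
gap_train = PROPER_LENGTH - tunnel_in_train
# ct(event 2) - ct(event 1) in the train frame; negative means 2 is earlier.
ct_gap_train = -gap_train / BETA

print(f"ground frame: train is {train_in_ground:.0f} m long, "
      f"event 2 follows event 1 by ct = {ct_gap_ground:.0f} m")
print(f"train frame:  tunnel is {tunnel_in_train:.0f} m long, "
      f"event 2 precedes event 1 by ct = {-ct_gap_train:.0f} m")
```

The two frames agree on what happens at each event but assign opposite time orderings, which is possible only because the two events are spacelike separated.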
Let's think about what the train would need to do to escape. At event
2, the exit to the tunnel is blocked, and (from the train's perspective) the
debris blocking the exit is rushing toward the train at 80% the speed of
light. The only way the train could escape would be to turn around and
back out of the tunnel. The train finds that the entrance is still open at the time
of event 2.
Of course, both the front and back of the train must turn around.
How does the back of the train know that it should do this? It could find
out via a phone call from an engineer at the front to an engineer at the back of
the train, or it could be via a shock wave that travels through the metal of the
train as the front of the train throws on its brakes and reverses its engines. The
point is though that some signal must pass from event 2 to the back of the
train, possibly relayed along the way by something at the front of the train.
Sticking to our assumption that signals can only be sent at speed c or slower,
the earliest possible time that the back of the train could discover the exit
explosion is at the event marked D on the diagram. Note that, at event D, the
back of the train does find itself inside the tunnel and also finds that event 1
has already occurred. The entrance is closed and the train cannot escape.
There are two things that deserve more explanation. The first is the above
comment about the shock wave. Normally we think of objects like trains as
being perfectly stiff. In reality, however, it takes a (small but finite) amount of time for each
atom to respond to the push it has been given on one side and to move over
and begin to push the atom on the other side. The result is known as a "shock
wave" that travels at finite speed down the object. Note that an important part
of the shock wave is the electric forces that two atoms use to push each
other around. Thus, the shock wave can certainly not propagate faster than an
electromagnetic disturbance can. As a result, it must move at less than the
speed of light.
For the other point, let's suppose that the people at the front of the
train step on the brakes and stop immediately. Stopping the atoms at the
front of the train will make them push on the atoms behind them, stopping
them, etc. The shock wave results from the fact that atoms just behind
the front slam into atoms right at the front; the whole system compresses
a bit and then may try to reexpand, pushing some of the atoms farther
back.
What we saw above is that the shock wave cannot reach the back of
the train until event D. Suppose that it does indeed stop the back of the
train there. The train has now come to rest in the tunnel's frame of
reference. Thus, after event D, the proper length of the train is less than
100m!
In fact, suppose that we use the lines of simultaneity in the train's
original frame of reference (before it tries to stop) to measure the proper
length of the train. Then, immediately after event 2 the front of the train
changes its motion, but the back of the train keeps going. As a result, in
this sense the proper length of the train starts to shrink immediately after
event 2. This is how it manages to fit itself into a tunnel that, in this frame, is
less than 100m long.
What has happened? The answer is in the compression that generates
the shock wave. The train really has been physically compressed by the
wall of debris at the exit slamming into it at 80% of the speed of light (or by
the equivalent damage inflicted through the use of the train's brakes)! This
compression is of course accompanied by tearing of metal, shattering of glass,
death screams of passengers, and the like, just as you would expect in a crash.
The train is completely and utterly destroyed. The robbers will be lucky if the
gold they wish to steal has not been completely vaporized in the carnage.
Now, you might want to get one more perspective on this by analyzing
the problem again in a frame of reference that moves with the train at
all times, even slowing down and stopping as the train slows down and
stops. However, we do not know enough to do this yet since such a frame
is not inertial.
Chapter 4
Minkowskian Geometry
We were faced with the baffling results of the Michelson-Morley
experiment and the stellar aberration experiments. In the end, we decided
to follow Einstein and to allow the possibility that space and time simply
do not work in the way that our intuition predicts. In particular, we took
our cue from the Michelson-Morley experiment which seems to say that
the speed of light in a vacuum is the same in all inertial frames and,
therefore, that velocities do not add together in the Newtonian way. We
wondered "How can this be possible?"
We then spent the last chapter working out "how this can be possible."
That is, we have worked out what the rules governing time and space must
actually be in order for the speed of light in a vacuum to be the same in all
inertial reference frames. In this way, we discovered that different observers
have different notions of simultaneity, and we also discovered time dilation
and length contraction. Finally, we learned that some of these strange
predictions are actually correct and have been well verified experimentally.
It takes a while to really absorb what is going on here. The process
does take time, though at this stage of the course the students who regularly
come to my office hours are typically moving along well. There are lots of
levels at which one might try to "understand" the various effects. Some
examples are:
Logical Necessity: Do you see that the chain of reasoning leading to these
conclusions is correct? If so, and if you believe the results of Michelson
and Morley that the speed of light is constant in all inertial frames, then
you must believe the conclusions.
External Consistency: Understanding at this level involves determining
how big the effects actually would be in your everyday life. You will
quickly find that they are seldom more than one part in a billion or a
trillion. At this level, it is no wonder that you never noticed.
Internal Consistency: How can these various effects possibly be
self-consistent? How can a train 100m long get stuck inside a tunnel that, in
its initial frame of reference, is less than 100m long?
Step Outside the old Structure: When people ask why space and time behave this way, what
they mean is "Can you explain why these strange things occur in terms of
things that are familiar to my experience, or which are reasonable to my
intuition?" It is important to realise that, in relativity, this is most definitely
not possible in a direct way.
This is because all of your experience has built up an intuition that
believes in the Newtonian assumptions about space and time and, as we
have seen, these cannot possibly be true! Therefore, you must remove your
old intuition, remodel it completely, and then put a new kind of intuition
back in your head.
Finding the new logic: If we have thrown out all of our intuition and
experience, what does it mean to "understand" relativity? We will see that
relativity has a certain logic of its own. What we need to do is to uncover
the lovely structure that space and time really do have, and not the one
that we want them to have. In physics as in life, this is often necessary.
Typically, when one understands a subject deeply enough, one finds
that the subject really does have an intrinsic logic and an intrinsic sense
that are all its own. This is the level at which we finally see "what is actually
going on." This is also the level at which people finally begin to "like" the
new rules for space and time.
MINKOWSKIAN GEOMETRY
Minkowski was a mathematician, and he is usually credited with
emphasizing the fact that time and space are part of the same "spacetime"
whole in relativity. He also emphasized the fact that this spacetime
has a special kind of geometry. It is this geometry which is the underlying
structure and the new logic of relativity.
Understanding this geometry will provide both insight and useful
technical tools. For this reason, we now pursue what at first sight will
seem like a technical aside in which we first recall how the familiar
Euclidean geometry relates quantities in different coordinate systems. We
can then build an analogous technology in which Minkowskian geometry
relates different inertial frames.
Invariants: Distance vs. the Interval
A fundamental part of familiar Euclidean geometry is the Pythagorean
theorem. One way to express this result is to say that
$$(\text{distance})^2 = \Delta x^2 + \Delta y^2,$$
where distance is the distance between two points and $\Delta x$, $\Delta y$ are
respectively the differences between the x coordinates and between the y
coordinates of these points. Here the notation $\Delta x^2$ means $(\Delta x)^2$ and not
the change in $x^2$. Note that this relation holds in either of the two
coordinate systems drawn below.
Comparing the two coordinate systems (with one rotated relative to the
other), we find
$$\Delta x_1^2 + \Delta y_1^2 = \Delta x_2^2 + \Delta y_2^2.$$
Let's think about an analogous issue involving changing inertial
frames. Consider, for example, two inertial observers. Suppose that our
friend flies by at speed v. For simplicity, let us both choose the event where
our worldlines intersect to be t = 0. Let us now consider the event (on her
worldline) where her clock 'ticks' $t_f = T$. Note that our friend assigns this
event the position $x_f = 0$ since she passes through it.
What coordinates do we assign? Our knowledge of time dilation tells
us that we assign a longer time: $t_{us} = T/\sqrt{1-v^2/c^2}$. For position, recall
that at $t_{us} = 0$ our friend was at the same place that we are ($x_{us} = 0$).
Therefore, after moving at a speed v for a time $t_{us} = T/\sqrt{1-v^2/c^2}$, our
friend is at $x_{us} = v t_{us} = Tv/\sqrt{1-v^2/c^2}$.
Now, we'd like to examine a Pythagorean-like relation. Of course we can't
just mix x and t in an algebraic expression since they have different units.
But, we have seen that x and ct do mix well! Thinking of the marked event
where our friend's clock ticks, is it true that $x^2 + (ct)^2$ is the same in both
reference frames? Clearly no, since both of these terms are larger in our
reference frame than in our friend's ($x_{us} > 0$ and $ct_{us} > ct_f$).
But what about the difference $x^2 - (ct)^2$? Our friend finds $0 - c^2T^2 = -c^2T^2$,
while we find $x_{us}^2 - c^2 t_{us}^2 = (v^2 - c^2)T^2/(1 - v^2/c^2) = -c^2T^2$, the same value.
What we have just observed is that whenever an inertial observer
passes through two events and measures a proper time T between them,
any inertial observer finds $\Delta x^2 - c^2 \Delta t^2 = -c^2 T^2$. But, given any two timelike
separated events, an inertial observer could in fact pass through them.
So, we conclude that the quantity $\Delta x^2 - c^2 \Delta t^2$ computed for a pair of
timelike separated events is the same in all inertial frames of reference.
Any quantity with this property is called an 'invariant' because it does not
vary when we change reference frames.
A quick check shows that the same is true for spacelike separated
events. For lightlike separated events, the quantity $\Delta x^2 - c^2 \Delta t^2$ is actually
zero in all reference frames. We see that for any pair of events the quantity
$\Delta x^2 - c^2 \Delta t^2$ is completely independent of the inertial frame that you use
to compute it. This quantity is known as the $(\text{interval})^2$.
$$(\text{interval})^2 = \Delta x^2 - c^2 \Delta t^2$$
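The invariance claim can be tested on the friend's tick event for several speeds. In this sketch, units are chosen so that c = 1, and the function names are hypothetical. Our coordinates for the event are built from time dilation exactly as in the text, and the squared interval is then compared with the friend's value.

```python
import math


def interval_squared(dx: float, dt: float, c: float = 1.0) -> float:
    """Squared interval dx^2 - c^2 dt^2 between two events."""
    return dx**2 - (c * dt) ** 2


def tick_event_in_our_frame(T: float, beta: float, c: float = 1.0):
    """Coordinates (x_us, t_us) of the event where the friend's clock reads T.

    Uses t_us = T / sqrt(1 - beta^2) and x_us = v * t_us, as in the text.
    """
    t_us = T / math.sqrt(1.0 - beta**2)
    x_us = beta * c * t_us
    return x_us, t_us


T = 2.0  # the friend's clock reading at the tick (arbitrary example value)
friend_value = interval_squared(0.0, T)  # friend's frame: x_f = 0, t_f = T

for beta in (0.1, 0.5, 0.9):
    x_us, t_us = tick_event_in_our_frame(T, beta)
    print(f"beta = {beta}: x^2 + t^2 = {x_us**2 + t_us**2:.4f}, "
          f"x^2 - t^2 = {interval_squared(x_us, t_us):.4f}")
```

The sum of squares changes with the speed, while the difference stays at the friend's value up to rounding.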
The language here is a bit difficult since this can be negative. The
way that physicists solve this in modern times is that we always discuss
the $(\text{interval})^2$ and never (except in the abstract) just "the interval" (so
that we don't have to deal with the square root). The interval functions
like 'distance,' but in spacetime, not in space.
Let us now explore a few properties of the interval. As usual, there
are three cases to discuss depending on the nature of the separation between
the two events.
Timelike separation: In this case the squared interval is negative. For
two timelike separated events there is (or could be) some inertial observer
who actually passes through both events, experiencing them both. One
might think that her notion of the amount of time between the two events
is the most interesting and indeed we have given it a special name, the
"proper time" ($\Delta\tau$; "delta tau") between the events. Note that, for this
observer the events occur at the same place. Since the squared interval is
the same in all inertial frames of reference, we therefore have:
$$0^2 - c^2 (\Delta\tau)^2 = \Delta x^2 - c^2 \Delta t^2.$$
Solving this equation, we find that we can calculate the proper time
$\Delta\tau$ in terms of the distance $\Delta x$ and time $\Delta t$ in any inertial frame using:
$$\Delta\tau = \sqrt{\Delta t^2 - \Delta x^2/c^2} = \Delta t \sqrt{1 - \frac{\Delta x^2}{c^2 \Delta t^2}} = \Delta t \sqrt{1 - v^2/c^2},$$
where $v = \Delta x/\Delta t$.
We see that $\Delta\tau \le \Delta t$.
Spacelike separation: Similarly, if the events are spacelike separated,
there is an inertial frame in which the two are simultaneous - that is, in
which $\Delta t = 0$. The distance between two events measured in such a reference
frame is called the proper distance d. Much as above,
$$d = \sqrt{\Delta x^2 - c^2 \Delta t^2} \le \Delta x.$$
Note that this seems to "go the opposite way" from the length
contraction effect. That is because here we consider the proper distance
between two particular events. In contrast, in measuring the length of an
object, different observers do NOT use the same pair of events to determine
length.
Lightlike separation: Two events that are along the same light ray
satisfy $\Delta x = \pm c \Delta t$. It follows that they are separated by zero interval in
all reference frames. One can say that they are separated by both zero
proper time and zero proper distance.
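The three cases can be gathered into one small helper. This is a sketch under stated assumptions: the function name `classify_separation` is ours, units default to c = 1, and the exact comparison with zero is only reliable for inputs (such as the integers below) whose squares are represented exactly.

```python
import math


def classify_separation(dx: float, dt: float, c: float = 1.0):
    """Classify a pair of events from their coordinate differences.

    Returns (kind, value): the proper time for timelike pairs, the proper
    distance for spacelike pairs, and 0.0 for lightlike pairs.
    """
    s2 = dx**2 - (c * dt) ** 2  # the squared interval
    if s2 < 0:
        return "timelike", math.sqrt(-s2) / c
    if s2 > 0:
        return "spacelike", math.sqrt(s2)
    return "lightlike", 0.0


examples = [(3.0, 5.0), (5.0, 3.0), (2.0, 2.0)]  # (dx, dt) pairs, with c = 1
for dx, dt in examples:
    kind, value = classify_separation(dx, dt)
    print(f"dx = {dx}, dt = {dt}: {kind}, invariant value = {value}")
```

Note that the proper time in the first example (4) is smaller than the coordinate time (5), and the proper distance in the second example (4) is smaller than the coordinate distance (5), in line with the two inequalities above.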
Curved Lines and Accelerated Objects
Thinking of things in terms of proper time and proper distance makes
it easier to deal with, say, accelerated objects. Suppose we want to compute,
for example, the amount of time experienced by a clock that is not in an
inertial frame. Perhaps it quickly changes from one inertial frame to
another, shown in the blue worldline (marked B) below. This worldline
(B) is similar in nature to the worldline of the muon in part (b).
Note that the time experienced by the blue clock between events (a)
and (b) is equal to the proper time between these events since, on that
segment, the clock could be in an inertial frame. Surely the time measured
by an ideal clock between (a) and (b) cannot depend on what it was doing
before (a) or on what it does after (b). Similarly, the time experienced by
the blue clock between events (b) and (c) should be the same as that
experienced by a truly inertial clock moving between these events; i.e. the
proper time between these events. Thus, we can find the total proper time
experienced by the clock by adding the proper time between (a) and (b) to
the proper time between (b) and (c) and between (c) and (d).
We also refer to this as the total proper time along the clock's worldline
between (a) and (d). A red observer (R) is also shown above moving
between events (0, -4) and (0, +4). Let $\Delta\tau^R_{ad}$ and $\Delta\tau^B_{ad}$ be the proper time
experienced by the red and the blue observer respectively between times
$t = -4$ and $t = +4$; that is, between events (a) and (d). Note that
$\Delta\tau^R_{ad} > \Delta\tau^B_{ad}$, and similarly for the other time intervals. Thus we see
that the proper time along the broken
line is less than the proper time along the straight line.
Since proper time (i.e., the interval) is analogous to distance in
Euclidean geometry, we also talk about the total proper time along a
curved worldline in much the same way that we talk about the length of a
curved line in space. We obtain this total proper time much as we did for
the blue worldline above by adding up the proper times associated with
each short piece of the curve. This is just the usual calculus trick in which
we approximate a curved line by a sequence of lines made entirely from
straight line segments. One simply replaces any Ax (or At) denoting a
difference between two points with dx or dt which denotes the difference
between two infinitesimally close points.
The rationale here, of course, is that if you look at a small enough
(infinitesimal) piece of a curve, then that piece actually looks like a straight
line segment. Thus we have
$$\Delta\tau = \int d\tau = \int \sqrt{dt^2 - dx^2/c^2} = \int \sqrt{1 - v^2/c^2}\, dt.$$
Again we see that a straight (inertial) line in spacetime has the longest
proper time between two events. In other words, in Minkowskian geometry
the longest line between two events is a straight line.
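The "add up short straight pieces" recipe translates directly into a numerical sum. The sketch below is an illustration under our own assumptions: units with c = 1, a hypothetical helper `proper_time_along`, and a made-up wiggly worldline $x(t) = 0.5 \sin t$ whose speed never exceeds half the speed of light. Both worldlines run between the same two events, at t = 0 and t = 2π with x = 0.

```python
import math


def proper_time_along(worldline, t_start: float, t_end: float,
                      steps: int = 10_000, c: float = 1.0) -> float:
    """Approximate the proper time along x = worldline(t).

    The curve is replaced by many short straight segments, and the proper
    time sqrt(dt^2 - dx^2/c^2) of each segment is added up.
    """
    dt = (t_end - t_start) / steps
    total = 0.0
    x_prev = worldline(t_start)
    for i in range(1, steps + 1):
        x_next = worldline(t_start + i * dt)
        dx = x_next - x_prev
        total += math.sqrt(dt**2 - (dx / c) ** 2)
        x_prev = x_next
    return total


def straight(t: float) -> float:
    """Inertial observer sitting at x = 0."""
    return 0.0


def wiggly(t: float) -> float:
    """Accelerated observer; |dx/dt| <= 0.5, so always slower than light."""
    return 0.5 * math.sin(t)


straight_tau = proper_time_along(straight, 0.0, 2.0 * math.pi)
wiggly_tau = proper_time_along(wiggly, 0.0, 2.0 * math.pi)

print(f"proper time along the straight worldline: {straight_tau:.4f}")
print(f"proper time along the wiggly worldline:   {wiggly_tau:.4f}")
```

The straight worldline accumulates the larger proper time, as the statement above requires.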
THE TWIN PARADOX
That's enough technical stuff for the moment. We now turn to a relativity
classic: "the twin paradox." Using the notions of proper time and proper
distance turns out to simplify the discussion significantly.
Let's think about two identical twins who, for obscure historical
reasons are named Alphonse and Gaston. Alphonse is in an inertial
reference frame floating in space somewhere near our solar system. Gaston,
on the other hand, will travel to the nearest star (Alpha Centauri) and
back at 0.8c.
Alpha Centauri is (more or less) at rest relative to our solar system
and is four light years away. During the trip, Alphonse finds Gaston to
be aging slowly because he is traveling at 0.8c. On the other hand, Gaston
finds Alphonse to be aging slowly because, relative to Gaston, Alphonse
is traveling at 0.8c.
During the trip out there is no blatant contradiction, since we have
seen that the twins will not agree on which event (birthday) on Gaston's
worldline they should compare with which event (birthday) on Alphonse's
worldline in order to decide who is older. But, who is older when
Gaston returns and they meet again?
The above diagram shows the trip in a spacetime diagram in
Alphonse's frame of reference. Let's work out the proper time experienced
by each observer. For Alphonse, $\Delta x = 0$. How about $\Delta t$? Well, the amount
of time that passes is long enough for Gaston to travel 8 light-years (there
and back) at 0.8c. That is, $\Delta t = 8\,\text{lyr}/(0.8c) = 10$ yr. So, the proper time $\Delta\tau_A$
experienced by Alphonse is ten years.
On the other hand, we see that on the first half of his trip Gaston
travels 4 light-years in 5 years (as measured in Alphonse's frame), for a
proper time of $\sqrt{5^2 - 4^2} = 3$ years. The same occurs on the trip
back. So, the total proper time experienced by Gaston is $\Delta\tau_G = 6$ years. Is
Gaston really younger than Alphonse when they get back together? Couldn't we draw
the same picture in Gaston's frame of reference and reach the opposite
conclusion? NO, we cannot.
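The ten-year and six-year figures follow from the proper time formula applied leg by leg in Alphonse's frame. Here is a minimal sketch, using years and light-years so that c = 1; the function and variable names are our own.

```python
import math


def proper_time(dx_lightyears: float, dt_years: float) -> float:
    """Proper time in years along one inertial leg (units with c = 1)."""
    return math.sqrt(dt_years**2 - dx_lightyears**2)


BETA = 0.8      # Gaston's speed as a fraction of c
DISTANCE = 4.0  # light-years to Alpha Centauri

leg_time = DISTANCE / BETA  # years per leg, measured in Alphonse's frame

# Alphonse stays put (dx = 0) for the whole round trip.
alphonse_years = proper_time(0.0, 2.0 * leg_time)
# Gaston covers 4 light-years on each of two inertial legs.
gaston_years = 2.0 * proper_time(DISTANCE, leg_time)

print(f"Alphonse ages {alphonse_years:.1f} years")
print(f"Gaston ages   {gaston_years:.1f} years")
```

Only the two inertial legs are used here; the turnaround is treated as instantaneous, which matches the idealized diagram in the text.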
The reason is that Gaston's frame of reference is not an inertial frame!
Gaston does not always move in a straight line at constant speed with
respect to Alphonse. In order to turn around and come back, Gaston must
experience some force which makes him non-inertial. Most importantly,
Gaston knows this! When, say, his rocket engine fires, he will feel the force
acting on him and he will know that he is no longer in an inertial reference
frame.
The point here is not that the process is impossible to describe in
Gaston's frame of reference. Gaston experiences what he experiences, so
there must be such a description.
The point is, however, that so far we have not worked out the rules
to understand frames of reference that are not inertial. Therefore, we
cannot simply blindly apply the time dilation/length contraction rules for
inertial frames to Gaston's frame of reference.
Thus, we should not expect our results so far to directly explain what
is happening from Gaston's point of view.
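The proper times quoted for the two twins can be checked with a few lines of arithmetic. The following is only a sketch in units where c = 1 Lyr/yr; the helper name `proper_time` is ours and is not part of the text.

```python
import math

def proper_time(dt_years, dx_lightyears):
    """Proper time (years) along a straight worldline segment, with c = 1 Lyr/yr."""
    return math.sqrt(dt_years**2 - dx_lightyears**2)

speed = 0.8        # Gaston's speed as a fraction of c
distance = 4.0     # distance to Alpha Centauri in light-years (Alphonse's frame)

leg_time = distance / speed                        # time per leg in Alphonse's frame
tau_alphonse = proper_time(2 * leg_time, 0.0)      # Alphonse stays at x = 0
tau_gaston = 2 * proper_time(leg_time, distance)   # two identical inertial legs

print(tau_alphonse, tau_gaston)
```

Both legs of Gaston's trip are treated as straight (inertial) segments, which is exactly the idealization used in Alphonse's diagram.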
But, you might say, Gaston is almost always in an inertial reference
frame. He is in one inertial frame on the trip out, and he is in another
inertial frame on the trip back. What happens if we just put these two
frames of reference together?
Let's do this, but we must do it carefully since we are now treading
new ground. First, we should draw in Gaston's lines of simultaneity on
Alphonse's spacetime diagram above. His lines of simultaneity will match
those of one inertial frame during the trip out, but they will match
those of a different frame during the trip back. Then, we will use those
lines of simultaneity to draw a diagram in Gaston's not-quite-inertial
frame of reference, much as we have done in the past in going from one
inertial frame to another.
Since Gaston is in a different inertial reference frame on the way out
than on the way back, we must draw two sets of lines of simultaneity, and
each set will have a different slope. Now, two lines with different slopes
must intersect.
We label the lines of simultaneity with Gaston's proper time at the events
where he crosses those lines. Note that there are two lines of simultaneity
marked $t_G = 3$ years! One of these is marked $3^-$ (which is "just
before" Gaston turns around) and one $3^+$ (which is "just after" Gaston
turns around).
Fig. Gaston's lines of simultaneity on Alphonse's diagram, marked $t_G = 3$ yrs. and $t_G = 6$ yrs.
If we simply knit together Gaston's lines of simultaneity and copy the
events from the diagram above, we get the following diagram in Gaston's
frame of reference. Note that it is safe to use the standard length
contraction result to find that, in the inertial frame of Gaston on his
trip out and in the inertial frame of Gaston on his way back, the distance
between Alphonse and Alpha Centauri is
$4\,\mathrm{Lyr}\,\sqrt{1 - (4/5)^2} = (12/5)\,\mathrm{Lyr}$.
Fig. Gaston's diagram, with marks at $x_G = -12/5$ Lyr., $x_G = 0$ and $t_G = 6$ yr.
There are a couple of weird things here. For example, what happened
to event E? In fact, what happened to all of the events between B and C?
By the way, how old is Alphonse at event B? In Gaston's frame of reference
(which is inertial before $t_G = 3$, so we can safely calculate things
that are confined to this region of time), Alphonse has traveled
(12/5) Lyr. in three years.
So, Alphonse must experience a proper time of
$\sqrt{3^2 - (12/5)^2} = \sqrt{9 - 144/25} = \sqrt{81/25} = 9/5$ years.
Similarly, Alphonse experiences (9/5) years between events C and D.
This means that there are 10 - 18/5 = (32/5) years of Alphonse's life
missing from the diagram.
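The bookkeeping behind the "missing" years can be reproduced numerically. This is a sketch in units with c = 1 Lyr/yr; the variable names are chosen here for illustration only.

```python
import math

beta = 4 / 5            # relative speed as a fraction of c
rest_distance = 4.0     # light-years, as measured in Alphonse's frame
gaston_leg = 3.0        # years of Gaston's proper time on each inertial leg

# Length-contracted distance in either of Gaston's inertial frames.
contracted = rest_distance * math.sqrt(1 - beta**2)

# Alphonse's proper time during one leg, computed in Gaston's inertial frame.
alphonse_per_leg = math.sqrt(gaston_leg**2 - contracted**2)

# Alphonse's total proper time in his own frame, and the part not accounted for.
alphonse_total = 2 * rest_distance / beta
missing = alphonse_total - 2 * alphonse_per_leg

print(contracted, alphonse_per_leg, missing)
```

The gap appears because the two inertial pieces of Gaston's frame, knit together at the corner, skip over the events between B and C on Alphonse's worldline.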
It turns out that part of our problem is the sharp corner in Gaston's
worldline. The corner means that Gaston's acceleration is infinite there,
since he changes velocity in zero time. Let's smooth it out a little and see
what happens.
Suppose that Gaston still turns around quickly, but not so quickly
that we cannot see this process on the diagram.
If the turn-around is short, this should not change any of our proper
times very much (proper time is a continuous function of the curve!!!), so
Gaston will still experience roughly 6 years over the whole trip, and roughly
3 years over half. Let's say that he begins to slow down (and therefore
ceases to be inertial) after 2.9 years so that after 3 years he is momentarily
at rest with respect to Alphonse.
Then, his acceleration begins to send him back home. A tenth of a
year later (3.1 years into the trip) he reaches 0.8c, his rockets shut
off, and he coasts home as an inertial observer.
We have already worked out what is going on during the periods where
Gaston is inertial. But, what about during the acceleration? Note that, at
each instant,
Gaston is in fact at rest in some inertial frame - it is just that he keeps
changing from one inertial frame to another. One way to draw a spacetime
diagram for Gaston is to try to use, at each time, the inertial frame with
respect to which he is at rest. This means that we would use the inertial
frames to draw in more of Gaston's lines of simultaneity on Alphonse's
diagram, at which point we can again copy things to Gaston's diagram.
A line that is particularly easy to draw is Gaston's $t_G = 3$ year line.
This is because, at $t_G = 3$ years, Gaston is momentarily at rest relative
to Alphonse. This means that Gaston and Alphonse share a line of
simultaneity.
For Alphonse, it is $t_A = 5$ years. For Gaston, it is $t_G = 3$ years. On
that line, Alphonse and Gaston have a common frame of reference and their
measurements agree.
Fig. Alphonse's diagram with Gaston's lines of simultaneity $t_G = 2.9$ yrs., $t_G = 3$ yrs. ($t_A = 5$) and $t_G = 3.1$ yrs., the gold light ray, and Alpha Centauri.
Note that we finally have a line of simultaneity for Gaston that passes
through event E. So, event E really does belong on Gaston's $t_G = 3$ year
line after all. By the way, just "for fun," the diagram also includes a
(gold) light ray moving to the left from the origin.
We are almost ready to copy the events onto Gaston's diagram. But,
to properly place event E, we must figure out just where it is in Gaston's
frame. In other words, how far away is it from Gaston along the line
$t_G = 3$ years? Along that line, Gaston and Alphonse measure things in the
same way. Therefore they agree that, along that line, event E is four
light-years away from Alphonse. Placing event E onto Gaston's diagram and
connecting the dots to get Alphonse's worldline, we find:
Fig. Gaston's diagram showing Alphonse's worldline through events A, B, E, C and D and the gold light ray, with marks at $x_G = -4$ Lyr., $x_G = -12/5$ Lyr., $x_G = 0$ and $t_G = 3$ yr.
There is something interesting about Alphonse's worldline between B and
E. It is almost horizontal, and has speed much greater than one light-year per
year! What is happening?
Notice the gold light ray, and that it too moves at more than one
light-year per year in this frame. We see that Alphonse is in fact moving
more slowly than the light ray, which is good. However, we also see that
the speed of a light ray is not in general equal to c in an accelerated
reference frame! In fact, it is not even constant, since the gold light ray
appears 'bent' on Gaston's diagram. Thus, it is only in inertial frames
that light moves at a constant speed of $3 \times 10^8$ meters per second.
This is one reason to avoid drawing diagrams in non-inertial frames
whenever you can.
Actually, though, things are even worse than they may seem at first
glance... Suppose, for example, that Alphonse has a friend Zelda who is
an inertial observer at rest with respect to Alphonse, but located four light
years on the other side of Alpha Centauri. We can then draw the following
diagram in Alphonse's frame of reference:
Once again, we can simply use Gaston's lines of simultaneity to mark
the events (T, U, V, W, X, Y, Z) in Zelda's life on Gaston's diagram. In
doing so, however, we find that some of Zelda's events appear on TWO of
Gaston's lines of simultaneity - a (magenta) one from before the
turnaround and a (green) one from after the turnaround! In fact, many of
them (like event W) appear on three lines of simultaneity, as they are
caught by a third 'during' the turnaround when Gaston's line of
simultaneity sweeps downward from the magenta $t_G = 2.9$ to the green
$t_G = 3.1$, as indicated by the big blue arrow! Marking all of these
events on Gaston's diagram (taking the time to first calculate the
corresponding positions) yields something like this:
Fig. Zelda's worldline on Gaston's diagram, with marks at $x_G = -12/5$ Lyr., $t_G = 3$ yr. and $t_G = 6$ yr.
The events T, U, V, W at the very bottom and W, X, Y, Z at the very top
are not drawn to scale, but they indicate that Zelda's worldline is
reproduced in that region of the diagram in a more or less normal fashion.
Let us quickly run through Gaston's description of Zelda's life: Zelda
merrily experiences events T, U, V, W, X, and Y. Then, Zelda is described
as "moving backwards in time" through events Y, X, W, V, and U. During
most of this period she is also described as moving faster than one
light-year per year. After Gaston's $t_G = 3.1$ year line, Zelda is again
described as moving forward in time (at a speed of 4 light-years per 5
years), experiencing events V, W, X, Y for the third time and finally
experiencing event Z.
The moral here is that non-inertial reference frames are "all screwed
up." Observers in such reference frames are likely to describe the world
in a very funny way. To figure out what happens to them, it is certainly
best to work in an inertial frame of reference and use it to carefully
construct the non-inertial spacetime diagram. By the way, there is also
the issue of what Gaston would see if he watched Alphonse and Zelda
through a telescope. This has to do with the sequence in which light rays
reach him, and with the rate at which they reach him.
MORE ON MINKOWSKIAN GEOMETRY
Now that we've ironed out the twin paradox, it's time to talk more
about Minkowskian Geometry (a.k.a. "why you should like relativity").
We will shortly see that understanding this geometry makes relativity much
simpler. Or, perhaps it is better to say that relativity is in fact simple
but that we have so far been viewing it through a confusing "filter" of
trying to separate space and time. Understanding Minkowskian geometry
removes this filter, as we realise that space and time are really part of
the same object.
Drawing Proper time and Proper Distance
Recall the notion of the spacetime interval. The interval was a quantity
built from both time and space, but which had the interesting property of
being the same in all reference frames. We write it as:
$$(\text{interval})^2 = \Delta x^2 - c^2 \Delta t^2.$$
This quantity has two different manifestations: proper time, and
proper distance. In essence these are much the same concept. However, it
is convenient to use one term (proper time) when the squared interval is
negative and another (proper distance) when the squared interval is
positive.
Let's draw some pictures to better understand these concepts. Consider
the set of all events that are one second of proper time
($\Delta\tau = 1$ sec) to the future of some event $(x_0, t_0)$. We have
$$-(1\,\mathrm{sec})^2 = -\Delta\tau^2 = \Delta x^2/c^2 - \Delta t^2.$$
Suppose that we take $x_0 = 0$, $t_0 = 0$ for simplicity. Then we have just
$$x^2/c^2 - t^2 = -(1\,\mathrm{sec})^2.$$
You may recognize this as the equation of a hyperbola centred at
the origin with asymptotes $x = \pm ct$. In other words, the hyperbola
asymptotes to the light cone. Since we want the events one second of proper
time to the future, we draw just the top branch of this hyperbola:
There are similar hyperbolae representing the events one second of proper
time in the past, and the events one light-second of proper distance to the
left and right.
We should also note in passing that the light rays form the (somewhat
degenerate) hyperbolae of zero proper time and zero proper distance.
Fig. Hyperbolae of 1 sec. proper time to the future of the origin and of 1 Ls proper distance to the left and to the right of the origin.
Changing Reference Frames
Consider also the worldline and a line of simultaneity for a second
inertial observer moving at half the speed of light relative to the first.
How would the curves of constant proper time and proper distance look if we
re-drew the diagram in this new inertial frame? Stop reading and think
about this for a minute.
Because the separation of two events in proper time and proper
distance is invariant (i.e., independent of reference frame), these curves
must look exactly the same in the new frame.
That is, any event which is one second of proper time to the future of
some event A (say, the origin in the diagram above) in one inertial frame
is also one second of proper time to the future of that event in any other
inertial frame and therefore must lie on the same hyperbola
$x^2/c^2 - t^2 = -(1\,\mathrm{sec})^2$. The same thing holds for the other
proper time and proper distance hyperbolae.
We see that changing the inertial reference frame simply slides events
along a given hyperbola of constant time or constant distance, but does
not move events from one hyperbola to another.
Remember our Euclidean geometry analogue from last time? The
above observation is exactly analogous to what happens when we rotate
an object. The points of the object move along circles of constant radius
from the axis, but do not hop from circle to circle.
By the way, the transformation that changes reference frames is called a
'boost.'
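The statement that a boost slides events along a hyperbola without moving them to a different one can be checked directly. The sketch below assumes the standard 1+1 Lorentz transformation in units with c = 1; the function names are ours.

```python
import math

def boost(t, x, beta):
    """Coordinates of the event (t, x) in a frame moving at velocity beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (t - beta * x), gamma * (x - beta * t)

def interval_squared(t, x):
    """Squared interval x^2 - t^2 between the origin and the event (c = 1)."""
    return x**2 - t**2

# An event one second of proper time to the future of the origin.
event = (1.25, 0.75)
boosted = boost(*event, 0.5)   # the same event described from a frame moving at c/2

print(interval_squared(*event), interval_squared(*boosted))
```

The coordinates of the event change under the boost, but its squared interval from the origin does not, just as a rotation changes x and y while preserving the radius.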
Hyperbolae, Again
In order to extract the most from our diagrams, let's hit the analogy with
circles one last time. If we draw an arbitrary straight line through the
centre of a circle, it always intersects the circle a given distance from
the centre.
What happens if we draw an arbitrary straight line through the origin of
our hyperbolae?
If it is a timelike line, it could represent the worldline of some inertial
observer. Suppose that the observer's clock reads zero at the origin. Then
the worldline intersects the future $\Delta\tau = 1$ sec hyperbola at the
event where that observer's clock reads one second.
Similarly, a spacelike line is the line of simultaneity of some inertial
observer. It intersects the $d = 1$ Ls curve at what that observer measures
to be a distance of 1 Ls from the origin.
What we have seen is that these hyperbolae encode the Minkowskian
geometry of spacetime.
The hyperbolae of proper time and proper distance (which are
different manifestations of the same concept: the interval) are the right
way to think about how events are related in spacetime and make things much
simpler than trying to think about time and space separately.
Boost Parameters and Hyperbolic Trigonometry
So, you might rightfully ask, what exactly can we do with this new way
of looking at things? Let's go back and look at how velocities combine in
relativity. This is the question of "why don't velocities just add?" For
example, if you are going at 1/2 c relative to Alice, and Charlie is going
at 1/2 c relative to you, how fast is Charlie going relative to Alice? The
formula looks like
$$v_{AC}/c = \frac{v_{AB}/c + v_{BC}/c}{1 + v_{AB} v_{BC}/c^2}.$$
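As a quick numerical illustration of the composition rule, here is a minimal sketch with velocities expressed as fractions of c. The function name `add_velocities` is ours.

```python
def add_velocities(v_ab, v_bc):
    """Relativistic composition of two collinear velocities, as fractions of c."""
    return (v_ab + v_bc) / (1.0 + v_ab * v_bc)

# You move at c/2 relative to Alice, and Charlie moves at c/2 relative to you.
v_ac = add_velocities(0.5, 0.5)

# Two velocities just below c still combine to something below c.
v_fast = add_velocities(0.9, 0.9)

print(v_ac, v_fast)
```

For the example in the text, Charlie moves at 0.8c relative to Alice rather than at c, and no combination of speeds below c ever reaches c.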
It is interesting to remark here that this odd effect was actually
observed experimentally by Fizeau in the 1850's. He managed to get an
effect big enough to see by looking at light moving through a moving
fluid. The point is that, when it is moving through water, light does not
in fact travel at speed c. Instead, it travels relative to the water at a
speed c/n, where n is around 1.33. The quantity n is known as the 'index of
refraction' of water. Thus, it is still moving at a good fraction of "the
speed of light." Anyway, if the water is also flowing at a fast rate, then
the speed of the light toward us is given by the above expression in which
the velocities do not just add together. This is just what Fizeau found,
though he had no idea why it should be true.
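To see the size of the effect in a Fizeau-type setup, one can apply the composition rule to light moving along the flow. The numbers below (an index of refraction of 1.33 and a flow speed of 7 m/s) are illustrative assumptions, not values taken from the text, and the function name is ours.

```python
C = 299_792_458.0   # speed of light in vacuum, m/s

def light_speed_in_moving_water(n, v_water):
    """Lab-frame speed of light travelling along the flow, from the composition rule."""
    u = C / n   # speed of the light relative to the water
    return (u + v_water) / (1.0 + u * v_water / C**2)

n_water = 1.33   # assumed index of refraction
v_flow = 7.0     # assumed flow speed in m/s (illustrative)

exact = light_speed_in_moving_water(n_water, v_flow)
naive = C / n_water + v_flow                                 # if velocities "just added"
dragged = C / n_water + v_flow * (1.0 - 1.0 / n_water**2)    # first-order expansion

print(naive - exact, exact - dragged)
```

Expanding the composition rule to first order in the flow speed gives the partial "drag" factor 1 - 1/n^2, so the light picks up only a fraction of the water's speed.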
Now, the above formula looks like a mess. Why in the world should
the composition of two velocities be such an awful thing? As with many
questions, the answer is that the awfulness is not in the composition rule
itself, but in the filter (the notion of velocity) through which we view
it. We will now see that, when this filter is removed and we view it in
terms native to Minkowskian geometry, the result is quite simple indeed.
Recall the analogy between boosts and rotations. How do we describe
rotations? We use an angle $\theta$. Recall that rotations mix x and y
through the sine and cosine functions:
$$x = r \sin\theta, \qquad y = r \cos\theta.$$
Note what happens when rotations combine. Well, they add of course.
Combining rotations by $\theta_1$ and $\theta_2$ yields a rotation by an
angle $\theta = \theta_1 + \theta_2$.
But we often measure things in terms of the slope $m = x/y$ (note the
similarity to $v = x/t$). Now, each rotation $\theta_1$, $\theta_2$ is
associated with a slope $m_1 = \tan\theta_1$, $m_2 = \tan\theta_2$. But the
full rotation by $\theta$ is associated with a slope:
$$m = \tan\theta = \tan(\theta_1 + \theta_2)
  = \frac{\tan\theta_1 + \tan\theta_2}{1 - \tan\theta_1 \tan\theta_2}
  = \frac{m_1 + m_2}{1 - m_1 m_2}.$$
So, by expressing things in terms of the slope we have turned a simple
addition rule into something much more complicated.
The point here is that the final result bears a strong resemblance to
our formula for the addition of velocities. In units where c = 1, they
differ only by the minus sign in the denominator above. This suggests that
the addition of velocities can be simplified by using something similar to,
but still different from, the trigonometry above. To get an idea of where
to start, recall that one of the basic facts associated with the relation
of sine and cosine to circles is the relation:
$$\sin^2\theta + \cos^2\theta = 1.$$
It turns out that there are other natural mathematical functions called
hyperbolic sine (sinh) and hyperbolic cosine (cosh) that satisfy a similar
(but different!) relation:
$$\cosh^2\theta - \sinh^2\theta = 1,$$
so that they are related to hyperbolae.
These functions can be defined in terms of the exponential function, $e^x$:
$$\sinh\theta = \frac{e^{\theta} - e^{-\theta}}{2}, \qquad
  \cosh\theta = \frac{e^{\theta} + e^{-\theta}}{2}.$$
You can do the algebra to check for yourself that these satisfy the
relation above. By the way, although you may not recognize this form, these
functions are actually very close to the usual sine and cosine functions.
Introducing $i = \sqrt{-1}$, one can write sine and cosine as
$$\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}, \qquad
  \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}.$$
Thus, the two sets of functions differ only by factors of i which, as
you can imagine, are related to the minus sign that appears in the formula
for the squared interval.
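The exponential definitions and their relation to ordinary sine and cosine can be spot-checked numerically. This is only a sketch at one arbitrary sample value of the argument; the function names are ours.

```python
import cmath
import math

def sinh_from_exp(theta):
    """Hyperbolic sine built directly from exponentials."""
    return (math.exp(theta) - math.exp(-theta)) / 2.0

def cosh_from_exp(theta):
    """Hyperbolic cosine built directly from exponentials."""
    return (math.exp(theta) + math.exp(-theta)) / 2.0

theta = 0.7   # arbitrary sample value

# The hyperbolic analogue of sin^2 + cos^2 = 1.
identity = cosh_from_exp(theta)**2 - sinh_from_exp(theta)**2

# Factors of i connect the two families:
# sin(i*theta) = i*sinh(theta) and cos(i*theta) = cosh(theta).
sin_of_imaginary = cmath.sin(1j * theta)
cos_of_imaginary = cmath.cos(1j * theta)

print(identity, sin_of_imaginary, cos_of_imaginary)
```

Evaluating sine and cosine at an imaginary argument reproduces the hyperbolic functions up to a factor of i, which is where the sign change in the identity comes from.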
Now, consider any event (A) on the hyperbola that is a proper time $\tau$
to the future of the origin. Due to the relation 4.8, we can write the
coordinates t, x of this event as:
$$t = \tau \cosh\theta, \qquad x = c\tau \sinh\theta.$$
Consider the worldline of the inertial observer that passes through both
the origin and event A. Note that the parameter $\theta$ gives some notion
of how different the two inertial frames (that of the moving observer and
that of the stationary observer) actually are. For $\theta = 0$, event A is
at x = 0 and the two frames are the same, while for large $\theta$ event A
is far up the hyperbola and the two frames are very different.
We can parameterize the points that are a proper distance d from the
origin in a similar way, though we need to 'flip x and t':
$$t = (d/c) \sinh\theta, \qquad x = d \cosh\theta.$$
If we choose the same value of $\theta$, then we do in fact just
interchange x and t, "flipping things about the light cone." Note that this
will take the worldline of the above inertial observer into the
corresponding line of simultaneity. In other words, a given worldline and
the corresponding line of simultaneity have the same 'hyperbolic angle,'
though we measure this angle from different reference lines (x = 0 vs.
t = 0) in each case.
Again, we see that $\theta$ is really a measure of the separation of the
two reference frames. In this context, we also refer to $\theta$ as the
boost parameter relating the two frames. The boost parameter is another way
to encode the information present in the relative velocity, and in
particular it is a very natural way to do so from the viewpoint of
Minkowskian geometry.
In what way is the relative velocity v of the reference frames related to
the boost parameter $\theta$? Let us again consider the inertial observer
passing from the origin through event A on the hyperbola of constant proper
time. This observer moves at speed:
$$v = \frac{x}{t} = \frac{c\tau \sinh\theta}{\tau \cosh\theta}
    = c\,\frac{\sinh\theta}{\cosh\theta} = c \tanh\theta,$$
and we have the desired relation. Here, we have introduced the hyperbolic
tangent function in direct analogy to the more familiar tangent function
of trigonometry. Note that we may also write this function as
$$\tanh\theta = \frac{e^{\theta} - e^{-\theta}}{e^{\theta} + e^{-\theta}}.$$
The hyperbolic tangent function may seem a little weird, but we can
get a better feel for it by drawing a graph like the one below. The
vertical axis is $\tanh\theta$ and the horizontal axis is $\theta$.
To go from velocity v to boost parameter $\theta$, we just invert the
relationship:
$$\theta = \tanh^{-1}(v/c).$$
Here, $\tanh^{-1}$ is the function such that
$\tanh^{-1}(\tanh\theta) = \tanh(\tanh^{-1}\theta) = \theta$. This one is
difficult to write in terms of more elementary functions (though it can be
done).
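As an illustration of moving between velocity and boost parameter, the sketch below uses Gaston's outbound leg (0.8c for 3 years of proper time) as sample data in units with c = 1. The function names are ours, and the logarithmic form of the inverse hyperbolic tangent is a standard identity rather than something derived in the text.

```python
import math

def boost_parameter(beta):
    """Boost parameter theta for a velocity beta = v/c (requires |beta| < 1)."""
    return math.atanh(beta)

def event_on_hyperbola(tau, theta):
    """Event (t, x) at proper time tau along the worldline with boost parameter theta."""
    return tau * math.cosh(theta), tau * math.sinh(theta)

theta = boost_parameter(0.8)
t, x = event_on_hyperbola(3.0, theta)   # where Gaston is after 3 years of proper time

# One way to write tanh^{-1} with elementary functions.
theta_from_log = 0.5 * math.log((1 + 0.8) / (1 - 0.8))

print(theta, t, x, theta_from_log)
```

The event comes out at t = 5 years and x = 4 light-years in Alphonse's frame, up to rounding, which matches the turnaround point found earlier.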
However, we can draw a nice graph simply by 'turning the above picture
on its side.' The horizontal axis on the graph below is x and the vertical
axis is $\tanh^{-1} x$. Note that two reference frames that differ by the
speed of light in fact differ by an infinite boost parameter.
Fig. Boost parameter $\tanh^{-1} x$ as a function of velocity.
Now for the magic: Let's consider three inertial reference frames, Alice,
Bob, and Charlie. Let Bob have boost parameter
$\theta_{BC} = \tanh^{-1}(v_{BC}/c)$ relative to Charlie, and let Alice
have boost parameter $\theta_{AB} = \tanh^{-1}(v_{AB}/c)$ relative to Bob.
Then the relative velocity of Alice and Charlie is
$$v_{AC}/c = \frac{v_{AB}/c + v_{BC}/c}{1 + v_{AB} v_{BC}/c^2}.$$
Let's write this in terms of the boost parameter:
$$v_{AC}/c = \frac{\tanh\theta_{AB} + \tanh\theta_{BC}}
                  {1 + \tanh\theta_{AB} \tanh\theta_{BC}}.$$
After a little algebra, one can show that this is in fact:
$$v_{AC}/c = \tanh(\theta_{AB} + \theta_{BC}).$$
In other words, the boost parameter $\theta_{AC}$ relating Alice to Charlie
is just the sum of the boost parameters $\theta_{AB}$ and $\theta_{BC}$.
Boost parameters add:
$$\theta_{AC} = \theta_{AB} + \theta_{BC}.$$
Because boost parameters are part of the native Minkowskian
geometry of spacetime, they allow us to see the rule for combining boosts
in a simple form. In particular, they allow us to avoid the confusion
created by first splitting things into space and time and introducing the
notion of "velocity."
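The claim that boost parameters simply add can be checked against the velocity composition formula. The following is a sketch with velocities as fractions of c; the sample pairs are arbitrary and the function names are ours.

```python
import math

def compose_via_boost_parameters(beta_ab, beta_bc):
    """Combine two collinear velocities by adding their boost parameters."""
    theta_ac = math.atanh(beta_ab) + math.atanh(beta_bc)
    return math.tanh(theta_ac)

def compose_via_velocity_formula(beta_ab, beta_bc):
    """Combine the same velocities with the velocity composition formula."""
    return (beta_ab + beta_bc) / (1.0 + beta_ab * beta_bc)

pairs = [(0.5, 0.5), (0.9, 0.9), (0.3, -0.6)]
differences = [
    abs(compose_via_boost_parameters(a, b) - compose_via_velocity_formula(a, b))
    for a, b in pairs
]

print(differences)
```

The two routes agree to rounding error, which is the numerical counterpart of the "little algebra" mentioned above.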
2+1 and Higher Dimensional Effects: A Return to Stellar Aberration
So, we are beginning to understand how this relativity stuff works, and
how it can be self-consistent. Although we now 'understand' the fact that
the speed of light is the same in all inertial reference frames (and thus
the Michelson-Morley experiment), recall that it was not just the
Michelson-Morley experiment that compelled us to abandon the ether and to
move to this new point of view. Another very important set of experiments
involved stellar aberration (the tilting telescopes) - a subject to which
we need to return.
One might think that assuming the speed of light to be constant in all
reference frames would remove all effects of relative motion on light, in
which case the stellar aberration experiments would contradict relativity.
However, we will now see that this is not so.
Stellar Aberration in Relativity
Recall the basic setup of the aberration experiments. Starlight hits the
earth from the side, but the earth is "moving forward" so this somehow
means that astronomers can't point their telescopes straight toward the
star if they actually want to see it. This is shown in the diagram below.
Fig. Telescope Moves Through Ether (Light Ray Hits Side Instead of Reaching
Bottom); Must Tilt Telescope to See Star
To reanalyze the situation using our new understanding of relativity
we will have to deal with the fact that the star light comes in from the
side while the earth travels forward (relative to the star). Thus, we will
need to use a spacetime diagram having three dimensions - two space, and
one time. One often calls such diagrams "2+1 dimensional."
These are harder to draw than the 1+1 dimensional diagrams that we
have been using so far, but are really not so much different. After all, we
have already talked a little bit about the fact that, under a boost, things
behave reasonably simply in the direction perpendicular to the action of
the boost: neither simultaneity nor lengths are affected in that direction.
We'll try to draw 2+1 dimensional spacetime diagrams using our
standard conventions: all light rays move at 45 degrees to the vertical.
Thus, a light cone looks like this:
We can also draw an observer and their plane of simultaneity.
In the direction of the boost, this plane of simultaneity acts just like
the lines of simultaneity that we have been drawing. However, in the
direction perpendicular to the boost direction, the boosted plane of
simultaneity is not tilted. This is the statement that simultaneity is not
affected in this direction.
We can also draw the moving observer's idea of "right and left," that is,
the plane of events that the moving observer finds to be straight to her
right or to her left. Here, the observer is moving across the paper, so
her "right and left" are more or less into and out of the paper.
Of course, we would like to know how this all looks when redrawn in
the moving observer's reference frame. One thing that we know is that
every ray of light must still be drawn along some line at 45 degrees from
the vertical. Thus, it will remain on the light cone. However, it may not
be located at the same place on this light cone. In particular, note that the
light rays directed straight into and out of the page as seen in the original
reference frame are 'left behind' by the motion of the moving observer.
That is to say that our friend is moving away from the plane
containing these light rays. Thus, in the moving reference frame these two
light rays do not travel straight into and out of the page, but instead move
somewhat in the "backwards" direction!
This is how the aberration effect is described in relativity. Suppose
that, in the reference frame of our sun, the star being viewed through the
telescope is "straight into the page." Then, in the reference frame of the
sun, the light from the star is a light ray coming straight out from the
page. However, in the "moving" reference frame of the earth, this light
ray appears to be moving a bit "backwards." Thus, astronomers must point
their telescopes a bit forward in order to catch this light ray.
Qualitatively, the aberration effect is actually quite similar in
Newtonian and post-Einstein physics. However, the actual amount of the
aberration effect observed in the 1800's made no sense to physicists of the
time.
This is because, at the quantitative level, the Newtonian and
post-Einstein aberration effects are quite different. As usual, the
post-Einstein version gets the numbers exactly correct, finally tying up
the loose ends of 19th century observations.
Einstein's idea that the speed of light is in fact the same in all inertial
reference frames wins again.
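To put a number on the tilt, one can transform the velocity of the incoming light ray from the sun's frame to the earth's frame. The sketch below assumes the standard relativistic velocity transformation in two space dimensions (units with c = 1) and an orbital speed of roughly 29.8 km/s for the earth; neither the formula nor that figure is taken from the text, and the function name is ours.

```python
import math

def transform_velocity(ux, uy, beta):
    """Velocity (ux, uy), in units of c, as seen from a frame moving at beta along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    denom = 1.0 - beta * ux
    return (ux - beta) / denom, uy / (gamma * denom)

beta_earth = 29.8e3 / 299_792_458.0   # assumed orbital speed of about 29.8 km/s

# In the sun's frame the starlight travels straight along the -y direction,
# perpendicular to the earth's motion along +x.
ux, uy = transform_velocity(0.0, -1.0, beta_earth)

speed = math.hypot(ux, uy)         # the ray is still a light ray in the new frame
tilt = math.atan2(-ux, -uy)        # angle between the ray and the y axis, in radians
tilt_arcsec = math.degrees(tilt) * 3600.0

print(ux, speed, tilt_arcsec)
```

The transformed ray picks up a small "backwards" component while its speed remains c, so the telescope must lean forward by about twenty arcseconds for these inputs.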
MORE ON BOOSTS AND THE 2+1 LIGHT CONE: THE HEADLIGHT EFFECT
It is interesting to explore the effect of boosts on 2+1 light cones in
more detail, as this turns out to uncover two more new effects. Instead of
investigating this by drawing lots of three-dimensional pictures, it is useful
to find a way to encode the information in terms of a two-dimensional
picture that is easily drawn on the blackboard or on paper. We can do
this by realizing that the light cone above can be thought of as being made
up of a collection of light rays arrayed in a circle.
Consider an observer in some inertial reference frame, perhaps the one in
which one of the above diagrams is drawn. That observer finds that light
from an "explosion" at the origin moves outward along various rays of
light. One light ray travels straight forward, one travels straight to the
observer's left, one travels straight to the observer's right, and one
travels straight backward.
There is one light ray traveling outward in each direction, and of course
the set of all directions (in two space dimensions) forms a circle. Thus, we
may talk about the circle of light rays. It is convenient to dispense with
all of the other parts of the diagram and just draw this circle of light
rays. The picture below depicts the circle of light rays in the same
reference frame used to draw the above diagram and uses the corresponding
colored dots to depict the front, back, left, and right light rays.
Fig. Light Circle in the Original Frame
Now let's draw the corresponding circle of light rays from the moving
observer's perspective. A given light ray from one reference frame is still
some light ray in the new reference frame. Therefore, the effect of the boost
on the light cone can be described by simply moving the various dots to
appropriate new locations on the circle. For example, the light rays that
originally traveled straight into and out of the page now fall a bit
'behind' the moving observer. So, they are now moved a bit toward the back.
Front, back, left, and right now refer to the new reference frame.
Fig. Light Circle in the Second Frame
Note that most of the dots have fallen toward our current observer's back
side - the side which represents the direction of motion of the first observer!
Suppose then that the first observer were actually, say, a star like the sun.
In its own rest frame, a star shines more or less equally brightly in all
directions - in other words, it emits the same number of rays of light in all
directions. So, if we drew those rays as dots on a corresponding light-
circle in the star's frame of reference, they would all be equally spread out
as in the first light circle we drew above.
What we see, therefore, is that in another reference frame (with respect
to which the star is moving) the light rays do not radiate symmetrically
from the star. Instead, most of the light rays come out in one particular
direction! In particular, they tend to come out in the direction that the
star is moving. Thus, in this reference frame, the light emitted by the star
is bright in the direction of motion and dim in the opposite direction and
the star shines like a beacon in the direction it is moving. For this reason,
this is known as the "headlight" effect.
By the way, this effect is seen all the time in high energy particle
accelerators and has important applications in materials science and
medicine. Charged particles whizzing around the accelerator emit radiation
in all directions as described in their own rest frame. However, in the frame
of reference of the laboratory, the radiation comes out in a tightly focussed
beam in the direction of the particles' motion. This means that the radiation
can be directed very precisely at materials to be studied or tumors to be
destroyed.
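The crowding of light rays toward the direction of motion can be seen by pushing a handful of evenly spaced "dots" through the velocity transformation. This is a sketch under stated assumptions: eight rays, a source moving at 0.8c along +x, and units with c = 1. The function name is ours.

```python
import math

def ray_direction_in_lab(phi_rest, beta):
    """Lab-frame direction (radians) of a light ray emitted at angle phi_rest
    in the rest frame of a source that moves at +beta along x (c = 1)."""
    ux, uy = math.cos(phi_rest), math.sin(phi_rest)
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    denom = 1.0 + beta * ux
    lab_x = (ux + beta) / denom
    lab_y = uy / (gamma * denom)
    return math.atan2(lab_y, lab_x)

beta = 0.8                                                # assumed speed of the source
rest_angles = [2.0 * math.pi * k / 8 for k in range(8)]   # eight evenly spaced rays
lab_angles = [ray_direction_in_lab(phi, beta) for phi in rest_angles]

# Count the rays that point into the forward half-plane in the lab frame.
forward = sum(1 for a in lab_angles if abs(a) < math.pi / 2)

print([round(math.degrees(a), 1) for a in lab_angles], forward)
```

In the source's own frame only three of the eight rays point strictly forward, while in the lab frame all but the straight-backward ray do.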
Multiple Boosts in 2+1 Dimensions
Look at the above two circles of light rays and notice that there is a
certain symmetry about the direction of motion. So, suppose you are given a
circle of light rays marked with dots which show, as above, the direction
of motion of light rays in your reference frame. Suppose also that these
light rays were emitted by a star, or by any other source that emits
equally in all directions in its rest frame. Then you can tell which
direction the star is moving relative to you by identifying the symmetry
axis in the circle! There must always be such a symmetry axis. The result
of the boost was to make the dots flow as shown below:
The green (front and back) dots are on the symmetry axis, and so do not
move at all.
So, just for fun, let's take the case above and consider another observer
who is moving not in the forward/backward direction, but instead is
moving in the direction that is "left/right" relative to the "moving" observer
above. To find out what the dots look like in the new frame of reference,
we just rotate the flow shown above by 90 degrees as shown below
and apply it to the dots in the second frame. The result looks something
like this:
The new symmetry axis is shown above. Thus, with respect to the
original observer, this new observer is not moving along a line straight to
the right. Instead, the new observer is moving somewhat in the forward
direction as well. But wait.... something else interesting is going on here....
the light rays don't line up right. Note that if we copied the above symmetry
axis onto the light circle in the original frame, it would sit exactly on top
of rays 4 and 8. However, in the figure above the symmetry axis sits half-way
between 1 and 8 and 4 and 5. This is the equivalent of having first rotated the
light circle in the original frame by 1/16 of a revolution before performing a
boost along the new symmetry axis! The new observer differs from the original
one not just by a boost, but by a rotation as well!
In fact, by considering two further boost transformations as above
(one acting only backward, and then one acting to the right), one can
obtain the following circle of light rays, which are again evenly distributed
around the circle. You should work through this for yourself, pushing
the dots around the circle with care.
Thus, by a series of boosts, one can arrive at a frame of reference
which, while it is not moving with respect to the original frame, is in fact
rotated with respect to the original frame. By applying only boost
transformations, we have managed to turn our observer by 45 degrees in
space. This just goes to show again that time and space are completely
mixed together in relativity, and that boost transformations are even more
closely related to rotations than you might have thought. A boost
transformation can often be thought of as a "rotation of time into space."
In this sense the above effect may be more familiar: Consider three
perpendicular axes, x, y, and z. By performing only rotations about the x
and y axes, one can achieve the same result as any rotation about the z axis.
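The claim that two boosts in different directions do not combine into a pure boost can be checked numerically. The following is a minimal sketch, assuming units with c = 1 and coordinates ordered (ct, x, y); the helper names are illustrative and not from the text. It relies on the standard fact that the matrix of a pure boost is symmetric, so a non-symmetric product signals that a rotation has been mixed in.

```python
import math

def boost_x(theta):
    """Boost along x with boost parameter theta, acting on (ct, x, y)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, -sh, 0.0],
            [-sh, ch, 0.0],
            [0.0, 0.0, 1.0]]

def boost_y(theta):
    """Boost along y with boost parameter theta, acting on (ct, x, y)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [[ch, 0.0, -sh],
            [0.0, 1.0, 0.0],
            [-sh, 0.0, ch]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def is_symmetric(m, tol=1e-12):
    """True if the matrix equals its transpose (as every pure boost does)."""
    return all(abs(m[i][j] - m[j][i]) < tol
               for i in range(3) for j in range(3))

# First boost along x, then along y (the "left/right" direction).
combined = matmul(boost_y(0.5), boost_x(0.5))

print("single boost symmetric:  ", is_symmetric(boost_x(0.5)))
print("combined boost symmetric:", is_symmetric(combined))
```

The second line of output reports a non-symmetric matrix, which is the matrix version of the statement that the new observer differs from the original one by a boost and a rotation.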
Other Effects
Boosts in 3+1 dimensions and higher work pretty much as they
do in 2+1 dimensions, which as we have seen have only a few new effects
beyond the 1+1 case on which we spent most of our time. These effects have
to do with how rapidly moving objects actually look; that is, they have to do
with how light rays actually reach your eyes to be processed by your brain.
Chapter 5
Accelerating Reference Frames
We have now reached an important point in our study of relativity.
Although many of you are still absorbing it, we have learned the basic
structure of the new ideas about spacetime, how they developed, and how
they fit with the various pieces of experimental data. We have also finished
all of the material in Einstein's Relativity (and in fact in most introductions)
associated with so-called 'special relativity.'
One important subject with which we have not yet dealt is that of
"dynamics," or, "what replaces Newton's Laws in post-Einstein physics?"
Newton's second Law (F=ma), the centerpiece of pre-relativistic physics,
involves acceleration. Although we have to some extent been able to deal
with accelerations in special relativity (as in the twin paradox), we have
seen that accelerations produce further unexpected effects. We need to
study these more carefully before continuing onward. So, we are going to
carefully investigate the simple but illustrative special case known as
'uniform' acceleration.
THE UNIFORMLY ACCELERATING WORLDLINE
One might at first think that this means that the acceleration a = dv/dt
of some object is constant, as measured in some inertial frame. However,
this would imply that the velocity (relative to that frame) as a function of
time is of the form v = v0 + at. One notes that this eventually exceeds the
speed of light. Given our experience to date, this would seem to be a bit
odd.
Also, on further reflection, one realizes that this notion of acceleration
depends strongly on the choice of inertial frame. The dv part of a involves
subtracting velocities, and we have seen that plain old subtraction does
not in fact give the relative velocity between two inertial frames. Also, the
dt part involves time measurements, which we know to vary greatly
between reference frames.
Thus, there is no guarantee that a constant acceleration a as measured
in some inertial frame will be constant in any other inertial frame, or that
it will in any way "feel" constant to the object that is being accelerated.
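Both objections can be made concrete with a few numbers. The sketch below assumes an acceleration of 9.8 m/s^2 and the sample velocities 0.9c and 0.5c, chosen only for illustration; the relative-velocity formula is the standard relativistic composition law for collinear motion.

```python
C = 299_792_458.0                    # speed of light in m/s
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def time_to_reach_c(v0, a):
    """Time at which v = v0 + a*t would reach c, for constant frame acceleration a > 0."""
    return (C - v0) / a

def relative_velocity(v1, v2):
    """Velocity of object 1 as measured in the rest frame of object 2 (collinear motion)."""
    return (v1 - v2) / (1.0 - v1 * v2 / C**2)

# With a = 9.8 m/s^2 held constant in one inertial frame, v = at reaches c
# after a finite time, which is why this cannot be the right notion.
years = time_to_reach_c(0.0, 9.8) / SECONDS_PER_YEAR

# Plain subtraction versus the relativistic relative velocity.
naive = 0.9 * C - 0.5 * C
actual = relative_velocity(0.9 * C, 0.5 * C)

print(f"v = at would reach c after about {years:.2f} years")
print(f"naive difference: {naive / C:.3f} c, actual relative velocity: {actual / C:.3f} c")
```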
Defining Uniform Acceleration
What we have in mind for uniform acceleration is something that
does in fact feel constant to the object being accelerated. In fact, we will
take this as a definition of "uniform acceleration." We can in fact feel
accelerations directly: when an airplane takes off, a car goes around a
corner, or an elevator begins to move upward, we feel the forces associated
with this acceleration (as in Newton's law F=ma). To get the idea of
uniform acceleration, picture a large rocket in deep space that burns fuel
at a constant rate. Here we have in mind that this rate should be constant
as measured by a clock in the rocket ship. Presumably the astronauts on
this rocket experience the same force at all times.
Newton's laws will need to be modified in relativity. However, we
know that Newton's laws hold for objects with small velocities (much less than
the speed of light) relative to us. These laws are precisely correct in the
limit of zero relative velocity.
So, how can we keep the rocket "moving slowly" relative to us as it
continues to accelerate? We can do so by continuously changing our own
reference frame. Perhaps a better way to say this is that we should arrange
for many of our friends to be inertial observers, but with a wide range of
velocities relative to us. During the short time that the rocket moves slowly
relative to us, we use our reference frame to describe the motion. Then, at
event E_1 (after the rocket has sped up a bit), we'll use the reference frame
of one of our inertial friends whose velocity relative to us matches that of
the rocket at event E_1. Then the rocket will be at rest relative to our friend.
Our friend's reference frame is known as the momentarily co-moving
inertial frame at event E_1. A bit later (at event E_2), we will switch to another
friend, and so on.
In fact, to do this properly, we should switch friends (and reference
frames) fast enough so that we are always using a reference frame in which
the rocket is moving only infinitesimally slowly. Then the relativistic effects
will be of zero size. In other words, we wish to borrow techniques from
calculus and take the limit in which we switch reference frames continually,
always using the momentarily co-moving inertial frame.
Anyway, the thing that we want to be constant in uniform
acceleration is called the "proper acceleration." Of course, it can change
along the rocket's worldline (depending on how fast the rocket decides
to burn fuel), so we should talk about the proper acceleration 'at some
event (E) on the rocket's worldline.' To find the proper acceleration (α)
at event E, first consider an inertial reference frame in which the rocket
is at rest at event E.
[Figure: the momentarily co-moving inertial frame at event E]
The proper acceleration α(E) at event E is just the acceleration of the
rocket at event E as computed in this momentarily co-moving reference
frame.
Thus we have
$$\alpha(E) = \frac{dv_E}{dt_E},$$
where the E-subscripts remind us that this is to be computed in the
momentarily co-moving inertial frame at event E. Notice the analogy with
the definition of proper time along a worldline, which says that the proper
time is the time as measured in a co-moving inertial frame (i.e., a frame in
which the worldline is at rest).
An important point is that, although our computation of α(E) involves
a discussion of certain reference frames, α(E) is a quantity that is intrinsic
to the motion of the rocket and does not depend on the choice of some
particular inertial frame from which to measure it. Thus, it is not necessary
to specify an inertial frame in which α(E) is measured, or to talk about
α(E) "relative" to some frame. As with proper time, we use a Greek letter
(α) to distinguish proper acceleration from the more familiar
frame-dependent acceleration a.
We should also point out that the notion of proper acceleration is
also just how the rocket would naturally measure its own acceleration
(relative to inertial frames). For example, a person in the rocket might
decide to drop a rock out the window at event E. If the rock is gently
released at event E, it will initially have no velocity relative to the rocket
- its frame of reference will be the momentarily co-moving inertial frame
at event E.
If the observer in the rocket measures the relative acceleration between
the rock and the rocket, this will be the same size (though in the opposite
direction) as the acceleration of the rocket as measured by the (inertial
and momentarily co-moving) rock. In other words, it will be the proper
acceleration of the rocket.
Uniform Acceleration and Boost Parameters
So, now we know what we mean by uniform acceleration. But, it
would be useful to know how to draw this kind of motion on a spacetime
diagram (in some inertial frame). In other words, we'd like to know what
sort of worldline this rocket actually follows through spacetime.
There are several ways to approach this question; here we will use some of the
tools that we've been developing. Uniform acceleration is a very natural
notion that is not tied to any particular reference frame. We also know
that, in some sense, it involves a change in velocity and a change in time.
One might expect the discussion to be simplest if we measure each of these
in the most natural way possible, without referring to any particular
reference frame.
The natural way to describe velocity (in Minkowskian geometry) is in
terms of the associated boost parameter θ. Boost parameters really do
add together in the simple, natural way. This means that when we consider
a difference of two boost parameters (like, say, in Δθ or dθ), this difference
is in fact independent of the reference frame in which it is computed. The
boost parameter of the reference frame itself just cancels out.
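The additivity of boost parameters can be checked against the usual composition law for velocities. This is a small sketch with velocities expressed in units of c; the sample values 0.6 and 0.7 are arbitrary, and the relation v/c = tanh θ is the one used in this chapter.

```python
import math

def boost_parameter(beta):
    """Boost parameter theta for a velocity v = beta * c, from v/c = tanh(theta)."""
    return math.atanh(beta)

def compose(beta1, beta2):
    """Relativistic composition of two collinear velocities, in units of c."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)

beta1, beta2 = 0.6, 0.7

# Route 1: compose the velocities directly.
via_velocities = compose(beta1, beta2)

# Route 2: add the boost parameters, then convert back to a velocity.
via_boosts = math.tanh(boost_parameter(beta1) + boost_parameter(beta2))

print(via_velocities, via_boosts)
```

Both routes give the same velocity, and it stays below c even though 0.6 + 0.7 exceeds 1.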
What about measuring time? The 'natural' measure of time along a
worldline is the proper time. The proper time is again independent of any
choice of reference frame. Let's again think about computing the proper
acceleration α(E) at some event E using the momentarily co-moving inertial
frame. We have
$$\alpha(E) = \frac{dv_E}{dt_E}.$$
What we want to do is to write dv_E and dt_E in terms of the boost
parameter (θ) and the proper time (τ). Let's start with the time part. The
proper time τ along the rocket's worldline is just the time that is measured
by a clock on the rocket.
Thus, the question is just "How would a small time interval dτ
measured by this clock (at event E) compare to the corresponding time
interval dt_E measured in the momentarily co-moving inertial frame?" But we
are interested only in the infinitesimal time around event E where there is
negligible relative velocity between these two clocks. Clocks with no relative
velocity measure time intervals in exactly the same way. So, we have
dt_E = dτ.
Now let's work in the boost parameter, using dθ to replace the dv_E
in the equation above. The boost parameter θ is just a function of the
velocity: v/c = tanh θ. So, let's try to compute dv_E/dt_E using the chain
rule. You can use the definition of tanh to check that
$$\frac{dv}{d\theta} = \frac{c}{\cosh^2\theta}.$$
Thus, we have
$$\frac{dv}{d\tau} = \frac{dv}{d\theta}\,\frac{d\theta}{d\tau} = \frac{c}{\cosh^2\theta}\,\frac{d\theta}{d\tau}.$$
Finally, note that at event E, the boost parameter of the rocket
relative to the momentarily co-moving inertial frame is zero. So, if we
substitute θ = 0 and α = dv_E/dτ into the above equation, we find
$$\alpha = \frac{dv_E}{d\tau} = \left.\frac{c}{\cosh^2\theta}\right|_{\theta=0}\frac{d\theta}{d\tau} = c\,\frac{d\theta}{d\tau}.$$
In other words,
$$\frac{d\theta}{d\tau} = \frac{\alpha}{c}.$$
Note that dθ and dτ do not in fact depend on a choice of inertial reference
frame. The relation holds whether or not we are in the momentarily co-moving
inertial frame.
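For readers who want the intermediate step, the derivative of v with respect to the boost parameter can be worked out from the definition of tanh. The block below is a short sketch of that calculation; it uses only the quotient rule and the identity cosh²θ − sinh²θ = 1.

```latex
\begin{align*}
v &= c\,\tanh\theta = c\,\frac{\sinh\theta}{\cosh\theta},\\[4pt]
\frac{dv}{d\theta}
  &= c\,\frac{\cosh\theta\cdot\cosh\theta - \sinh\theta\cdot\sinh\theta}{\cosh^2\theta}
   = \frac{c}{\cosh^2\theta}.
\end{align*}
```

At θ = 0 we have cosh θ = 1, so the derivative reduces to c, which is why the factor drops out in the momentarily co-moving frame.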
If we translate this equation into words, it will come as no surprise: "An
object that experiences uniform acceleration gains the same amount of
boost parameter for every second of proper time: that is, for every second
of time measured by a clock on the rocket."
It will be useful to solve this equation for the case of uniform α in which the
boost parameter (and thus the relative velocity) vanishes at τ = 0. For this
case, it yields the relation:
$$\theta = \alpha\tau/c.$$
This statement encodes a particularly deep bit of physics. In particular,
it turns out to answer the question "Why can't an object go faster than
the speed of light?" Here, we have considered the simple case of a rocket
that tries to continually accelerate by burning fuel at a constant rate.
What we see is that it gains equal boost parameter in every interval of
proper time. So, will it ever reach the speed of light ? No. After a very long
(but finite) proper time has elapsed the rocket will merely have a large (but
finite) boost parameter.
Since any finite boost parameter (no matter how large) corresponds
to some v less than c, the rocket never reaches the speed of light. Similarly,
it turns out that whether or not the acceleration is uniform, any rocket
must burn an infinite amount of fuel to reach the speed of light. Thus, the
speed of light (infinite boost parameter) plays the same role in relativity
that was played by infinite velocity in Newtonian physics.
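To get a feel for the numbers, the relation between boost parameter and proper time can be evaluated for a rocket whose proper acceleration equals one Earth gravity. This is a rough sketch: the value 9.8 m/s^2 and the sample durations are assumptions chosen for illustration, not figures from the text.

```python
import math

C = 299_792_458.0            # speed of light in m/s
G = 9.8                      # assumed proper acceleration in m/s^2
YEAR = 365.25 * 24 * 3600    # seconds in a year

def speed_after(tau, alpha=G):
    """Speed relative to the launch frame after proper time tau, using theta = alpha*tau/c."""
    theta = alpha * tau / C
    return C * math.tanh(theta)

for years in (0.5, 1, 2, 5):
    theta = G * years * YEAR / C
    beta = speed_after(years * YEAR) / C
    print(f"{years} yr on the rocket clock: theta = {theta:.3f}, v/c = {beta:.6f}")
```

The boost parameter grows in proportion to the elapsed proper time, while v/c creeps toward 1 without reaching it.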
Finding the Worldline
We worked out the relation between the proper acceleration α of an
object, the boost parameter θ that describes the object's motion, and the
proper time τ along the object's worldline. This relation was encoded in
the equation dθ/dτ = α/c.
This result told us quite a bit, and in particular led to insight into the
"why things don't go faster than light" issue. However, we still don't know
exactly what worldline a uniformly accelerating object actually follows in
some inertial frame. This means that we don't yet really know how to draw
the uniformly accelerating object on a spacetime diagram, so that we
cannot yet apply our powerful diagrammatic tools to understanding the
physics of uniform acceleration.
Let's start by drawing the rough qualitative shape of the worldline
on a spacetime diagram.
The worldline will have v = 0 at t = 0, but the velocity will grow with
t. The velocity will thus be nearly +c for large positive t and it will be
nearly −c for large negative t.
Uniform acceleration is in some sense invariant. When the uniformly
accelerated rocket enters our frame of reference (i.e., when v = 0), it has
the same acceleration α, no matter what inertial frame we are in! Thus, the
curve should in some sense 'look the same' in every inertial frame.
So, any guesses? Can you think of a curve that looks something like
the figure above that is 'the same' in all inertial frames?
How about the constant proper distance curve x = d cosh θ? Since θ
was a boost angle there, it is natural to guess that it is the same θ = ατ/c that
we used above.
Let us check our guess to see that it is in fact correct. What we will do is
to simply take the curve x = d cosh(ατ/c), t = (d/c) sinh(ατ/c) and show that,
for the proper choice of the distance d, its velocity is v = c tanh(ατ/c), where
τ is the proper time along the curve.
But we have seen that this relation between velocity and proper time is the
defining property of a uniformly accelerated worldline with proper acceleration α,
so this will indeed check our guess.
First, we simply calculate:
$$dx = \frac{\alpha d}{c}\,\sinh(\alpha\tau/c)\,d\tau, \qquad dt = \frac{\alpha d}{c^2}\,\cosh(\alpha\tau/c)\,d\tau.$$
Dividing these two equations we have
$$\frac{dx}{dt} = c\,\tanh(\alpha\tau/c);$$
i.e., θ is indeed ατ/c along this curve. Now, we must show that τ is the
proper time along the curve. But
$$d(\text{proper time})^2 = dt^2 - \frac{1}{c^2}\,dx^2 = \frac{\alpha^2 d^2}{c^4}\,d\tau^2.$$
So, we need only choose d such that αd/c² = 1 and we are done. Thus,
d = c²/α. In summary,
If we start a uniformly accelerated object in the right place (c²/α away
from the origin), it follows a worldline that remains a constant proper
distance (c²/α) from the origin.
For a general choice of starting location (say, x_0), it follows a worldline
that remains a constant proper distance c²/α from some other event. Since
it is sometimes useful to have this more general equation, let us write it down
here:
$$x(\tau) = x_0 + \frac{c^2}{\alpha}\left[\cosh\left(\frac{\alpha\tau}{c}\right) - 1\right].$$
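The two properties of the hyperbolic worldline, constant proper distance from the origin and velocity c tanh(ατ/c), can also be confirmed numerically. The following sketch assumes units with c = 1 and an arbitrary proper acceleration of 2; the function names are illustrative only.

```python
import math

C = 1.0      # units in which c = 1
ALPHA = 2.0  # proper acceleration, arbitrary illustrative value

def worldline(tau, alpha=ALPHA, c=C):
    """Event (x, t) on the curve x = d cosh(alpha*tau/c), t = (d/c) sinh(alpha*tau/c), d = c^2/alpha."""
    d = c**2 / alpha
    x = d * math.cosh(alpha * tau / c)
    t = (d / c) * math.sinh(alpha * tau / c)
    return x, t

def interval_from_origin(tau):
    """Squared proper distance x^2 - (ct)^2 between the origin and the event at tau."""
    x, t = worldline(tau)
    return x**2 - (C * t)**2

def numerical_velocity(tau, h=1e-6):
    """dx/dt estimated by a central difference along the worldline."""
    x1, t1 = worldline(tau - h)
    x2, t2 = worldline(tau + h)
    return (x2 - x1) / (t2 - t1)

for tau in (-1.0, 0.3, 1.5):
    print(tau, interval_from_origin(tau), numerical_velocity(tau),
          C * math.tanh(ALPHA * tau / C))
```

For every τ the interval equals (c²/α)², and the numerically estimated velocity agrees with c tanh(ατ/c).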
Exploring the Uniformly Accelerated Reference Frame
We have now found that a uniformly accelerating observer with
proper acceleration α follows a worldline that remains a constant proper
distance c²/α away from some event.
Just which event this is depends on where and when the observer began
to accelerate. For simplicity, let us consider the case where this special
event is the origin. Let us now look more closely at the geometry of the
situation.
Horizons and Simultaneity
The diagram below shows the uniformly accelerating worldline
together with a few important light rays.
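[Figure: the uniformly accelerating worldline with the future acceleration
horizon and the past acceleration horizon. Light rays from the region beyond
the future acceleration horizon can never catch up with the rocket.]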
Note the existence of the light ray marked "future acceleration
horizon." It marks the boundary of the region of spacetime from which
the uniformly accelerated observer can receive signals, since such signals
cannot travel faster than c.
This is an interesting phenomenon in and of itself: merely by
undergoing uniform acceleration, the rocket ship has cut itself off from
communication with a large part of the spacetime. In general, the term
'horizon' is used whenever an object is cut off in this way. On the diagram
above there is a light ray marked "past acceleration horizon" which is the
boundary of the region of spacetime to which the uniformly accelerated
observer can send signals.
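The future acceleration horizon can be explored with a short calculation. The sketch below assumes 1+1 dimensions, units with c = 1, the special event at the origin, and a rocket on x = sqrt(d² + c²t²); it also assumes the emitting event lies behind (to the left of) the rocket, so only a right-moving light ray is considered. The function name is hypothetical.

```python
import math

def catch_up_time(x_e, t_e, d=1.0, c=1.0):
    """Inertial-frame time at which a right-moving light ray emitted at (x_e, t_e)
    reaches the rocket on x = sqrt(d^2 + (c t)^2), or None if it never does."""
    u = x_e - c * t_e          # the ray travels along x - c t = u
    if u <= 0:
        return None            # emitted on or beyond the future horizon x = c t
    # Solve sqrt(d^2 + c^2 t^2) = u + c t for t.
    return (d**2 - u**2) / (2.0 * u * c)

# An event below the horizon: its signal eventually reaches the rocket.
t_hit = catch_up_time(0.5, 0.0)
print("signal arrives at t =", t_hit)

# An event above the horizon: the signal never catches up.
print("signal from (x=0, t=1):", catch_up_time(0.0, 1.0))
```

Signals emitted closer and closer to the line x = ct take longer and longer to arrive, and those emitted on or above it never arrive at all.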
When considering inertial observers, we found it very useful to know
how to draw their lines of simultaneity and their lines of constant position.
Presumably, we will learn equally interesting things from working this out
for the uniformly accelerating rocket.
But, what notion of simultaneity should the rocket use? Let us define
the rocket's lines of simultaneity to be those of the associated momentarily
co-moving inertial frames. It turns out that these are easy to draw. Let us
simply pick any event A on the uniformly accelerated worldline, and let Z be the
event from which the worldline maintains a constant proper distance.
A boost transformation simply slides the events along the hyperbola.
This means that we can find an inertial frame in which the above
picture looks like this:
In the new frame of reference, the rocket is at rest at event A.
Therefore, the rocket's line of simultaneity through A is a horizontal line.
Note that this line passes through event Z.
This makes the line of simultaneity easy to draw on the original
diagram. What we have just seen is that: Given a uniformly accelerating
observer, there is an event Z from which it maintains a constant proper distance.
The observer's line of simultaneity through any event A on her worldline
is the line that connects event A to event Z.
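This statement can be verified directly in the original inertial frame. The sketch below assumes units with c = 1, takes Z to be the origin, and uses the standard fact that a frame moving with velocity v has lines of simultaneity of slope v/c² on a (x, t) diagram; the function name is illustrative.

```python
import math

def simultaneity_time_at_origin(theta, d=1.0, c=1.0):
    """Time coordinate at x = 0 of the co-moving line of simultaneity through the
    event A with boost parameter theta on the worldline x = d cosh, ct = d sinh."""
    x_a = d * math.cosh(theta)
    t_a = (d / c) * math.sinh(theta)
    v = c * math.tanh(theta)              # rocket velocity at event A
    # Line of simultaneity through A: t - t_a = (v / c^2) * (x - x_a).
    return t_a + (v / c**2) * (0.0 - x_a)

for theta in (-2.0, -0.5, 0.0, 1.0, 3.0):
    print(theta, simultaneity_time_at_origin(theta))
```

The result is zero (up to rounding) for every θ, so each line of simultaneity crosses x = 0 at t = 0, which is event Z.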
Thus, the diagram below shows the rocket's lines of simultaneity.
Let me quickly make one comment here on the passage of time.
Suppose that events -2, -1 above are separated by the same sized boost
as events -1, 0, events 0, 1, and events 1, 2. From the relation θ = ατ/c it
follows that each such pair of events is also separated by the same interval
of proper time along the worldline.
But now on to the more interesting features of the diagram above!
Note that the acceleration horizons divide the spacetime into four regions.
In the right-most region, the lines of simultaneity look more or less normal.
However, in the top and bottom regions, there are no lines of simultaneity
at all! The rocket's lines of simultaneity simply do not penetrate into these
regions. Finally, in the left-most region things again look more or less
normal except that the labels on the lines of simultaneity seem to go the
wrong way, 'moving backward in time.'
And, of course, all of the lines of simultaneity pass through event Z
where the horizons cross. These strange-sounding features of the diagram
should remind you of the weird effects we found associated with Gaston's
acceleration in our discussion of the twin paradox.
As with Gaston, one is tempted to ask "How can the rocket see things
running backward in time in the left-most region?" In fact, the rocket does
not see, or even know about, anything in this region. No signal of any
kind from any event in this region can ever catch up to the rocket. This
phenomenon of finding things to run backwards in time is a pure
mathematical artifact and is not directly related to anything that observers
on the rocket actually notice.
Friends on a Rope
We uncovered some odd effects associated with the acceleration
horizons. In particular, we found that there was a region in which the
lines of simultaneity seemed to run backward. However, we also found
that the rocket could neither signal this region nor receive a signal from
it. The fact that the lines of simultaneity run backward here is purely a
mathematical artifact.
Despite our discussion above, you might wonder if that funny part
of the rocket's reference frame might somehow still be meaningful. It turns
out to be productive to get another perspective on this, so let's think a bit
about how we might actually construct a reference frame for the rocket.
We would build such a frame out of friends who stay at fixed distances from
us, and we would like to know what happens to the ones that lie below the
horizon. Let us begin by asking the question: what worldlines do these
fellow observers follow?
Consider a friend who remains a constant distance Δ below us as
measured by us; that is, as measured in the momentarily co-moving frame
of reference. This means that this distance is measured along our line of
simultaneity. But look at what this means on the diagram below:
A distance (measured in some inertial frame) between two events on
a given line of simultaneity (associated with that same inertial frame) is
in fact the proper distance between those events. Thus, on each line of
simultaneity the proper distance between us and our friend is Δ. But, along
each of these lines the proper distance between us and event Z is c²/α.
Thus, along each of these lines, the proper distance between our friend
and event Z is c²/α − Δ. In other words, the proper distance between our
friend and event Z is again a constant and our friend's worldline must
also be a hyperbola! Note, however, that the proper distance between our
friend and event Z is less than the proper distance c²/α between us and
event Z. This means that our friend is again a uniformly accelerated
observer, but with a different proper acceleration!
We can use the relations above to find the proper acceleration α_L of our
lower friend. The result is
$$\frac{c^2}{\alpha_L} = \text{proper distance between } Z \text{ and lower friend} = \frac{c^2}{\alpha} - \Delta,$$
so that our friend's proper acceleration is larger than our own.
In particular, let's look at what happens when our friend is sufficiently
far below us that they reach the acceleration horizons. This is Δ = c²/α. At
this value, we find α_L = ∞! Note that this fits with the fact that they would
have to travel right along a pair of light rays and switch between one ray
and the other in zero time...
So then, suppose that we hung someone below us on a rope and slowly
lowered them toward the horizon. The proper acceleration of the person
(and thus the force that the rope must exert on them) becomes infinite as
they get near the horizon. Similarly (by Newton's 3rd law) the force that
they exert on the rope will become infinite as they near the horizon. Thus,
no matter what it is made out of, the rope must break (or begin to stretch,
or somehow fail to remain rigid such that the person falls away, never to
be seen by us again) before the person is lowered across the horizon.
Again we see that, in the region beyond the horizon, the reference
frame of a uniformly accelerating object is "unphysical" and could never
in fact be constructed. There is no way to make one of our friends move
along a worldline below the horizon that remains at a constant proper
distance from us.
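The growth of the lower friend's proper acceleration can be tabulated with a few lines of code. This is a minimal sketch assuming units with c = 1 and our own proper acceleration equal to 1, so that the horizon sits a distance 1 below us; the function name is hypothetical.

```python
def lower_friend_acceleration(alpha, delta, c=1.0):
    """Proper acceleration of a friend held a proper distance delta below an
    observer with proper acceleration alpha, from c^2/alpha_L = c^2/alpha - delta."""
    d_friend = c**2 / alpha - delta       # friend's proper distance from event Z
    if d_friend <= 0:
        raise ValueError("friend would be at or beyond the acceleration horizon")
    return c**2 / d_friend

for delta in (0.0, 0.5, 0.9, 0.99, 0.999):
    print(delta, lower_friend_acceleration(1.0, delta))
```

As Δ approaches c²/α the required acceleration grows without bound, which is the numerical face of the statement that the rope must fail before the friend reaches the horizon.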
The Long Rocket
Suppose now that our rocket is long enough that we should draw
separate worldlines for its front and back. If the rocket is 'rigid,' it will
remain a constant proper length Δ as time passes. This is just like our
'friend on a rope' example. Thus, the back of the rocket also follows a
uniformly accelerated worldline with a proper acceleration α_B which is
related to the proper acceleration α_F of the front by:
$$\alpha_B = \frac{\alpha_F\,c^2}{c^2 - \Delta\,\alpha_F}.$$
Clearly, the back and front have different proper accelerations.
Note that the front and back of the rocket do in fact have the same lines
of simultaneity, so that they agree on which events happen "at the same time."
But do they agree on how much time passes between events that are not
simultaneous? Since they agree about lines of simultaneity it must be that,
along any such line, both ends of the rocket have the same speed v and the
same boost parameter θ.
However, because the proper acceleration of the back is greater than
that of the front, the relation θ = ατ/c then tells us that more proper time
τ passes at the front of the rocket than at the back. In other words, there
is more proper time between the events A_F, B_F below than between events
A_B, B_B. In fact, α_top τ_top = α_bottom τ_bottom.
[Figure: worldlines of the front and back of the rocket, with events A_F, B_F and A_B, B_B]
Here it is important to note that, since they use the same lines of
simultaneity, both ends of the rocket agree that the front (top) clock runs
faster! Thus, this effect is of a somewhat different nature than the time
dilation associated with inertial observers. This, of course, is because all
accelerated observers are not equivalent - some are more accelerated than
others.
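The size of this effect follows from the relation between the two proper accelerations. The sketch below is illustrative only: the 100 m rocket length and the 9.8 m/s^2 front acceleration are assumed values, and the function names are not from the text. It uses the fact that both ends share the same boost parameter between common lines of simultaneity, so the ratio of elapsed proper times is the inverse ratio of proper accelerations.

```python
C = 299_792_458.0  # speed of light in m/s

def back_acceleration(alpha_front, length, c=C):
    """Proper acceleration of the back of a rigid rocket of proper length `length`
    whose front has proper acceleration alpha_front."""
    return alpha_front * c**2 / (c**2 - length * alpha_front)

def front_to_back_clock_ratio(alpha_front, length, c=C):
    """tau_front / tau_back between a shared pair of lines of simultaneity,
    using alpha_front * tau_front = alpha_back * tau_back."""
    return back_acceleration(alpha_front, length, c) / alpha_front

# A 100 m rocket with 9.8 m/s^2 at the front: the front clock gains a tiny amount.
ratio = front_to_back_clock_ratio(9.8, 100.0)
print("tau_front / tau_back - 1 =", ratio - 1.0)

# In units with c = 1, a rocket reaching halfway to the horizon doubles the rate.
print(front_to_back_clock_ratio(1.0, 0.5, c=1.0))
```

For everyday lengths and accelerations the difference is of order one part in 10^14, so it only becomes dramatic when the back of the rocket is close to the horizon.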
By the way, we could have read off the fact that Δτ_Front is bigger than
Δτ_Back directly from our diagram without doing any calculations.
(This way of doing things is useful for certain similar homework problems.)
To see this, note that between the two lines (t = ±t_0) of simultaneity
(for the inertial frame!!) drawn below, the back of the rocket is moving
faster (relative to the inertial frame in which the diagram is drawn) than
is the front of the rocket.
You can see this from the fact that the front and back have the same
line of simultaneity (and therefore the same speed) at events B_F, B_B and
at events A_F, A_B. This means that the speed of the back at B_B is greater
than that of the front at D_F and that the speed of the back at A_B is greater
than that of the front at C_F.
Thus, relative to the inertial frame in which the diagram is drawn, the
back of the rocket experiences more time dilation in the interval (−t_0, t_0) and
its clock runs more slowly. Thus, the proper time along the back's worldline
between events A_B and B_B is less than the proper time along the front's worldline
between events C_F and D_F.
We now combine this with the fact that the proper time along the
front's worldline between A_F and B_F is even greater than that between C_F
and D_F. Thus, we see that the front clock records much more proper time
between A_F and B_F than does the back clock between A_B and B_B.
Chapter 6
Energy and Momentum in Relativity
Up until now, we have been concerned mostly with describing motion.
We have asked how various situations appear in different reference frames,
both inertial and accelerated.
However, we have largely ignored the question of what would make
an object follow a given worldline ('dynamics'). The one exception was
when we studied the uniformly accelerated rocket and realized that it must
burn equal amounts of fuel in equal amounts of proper time. This
realization came through using Newton's second law in the regime where
we expect it to hold true: in the limit in which v/c is vanishingly small.
DYNAMICS, OR, "WHATEVER HAPPENED TO FORCES?"
Recall that Newton's various laws used the old concepts of space and time.
Before we can apply them to situations with finite relative velocity, they
will have to be at least rewritten and perhaps greatly modified to
accommodate our new understanding of relativity. This was also true for
our uniformly accelerating rocket. A constant thrust does not provide a
constant acceleration as measured from a fixed inertial reference frame
but, instead, it produces a constant proper acceleration.
Now, a central feature of Newton's laws (and of much of pre-Einstein
physics) was the concept of force. It turns out that the concept of force is
not as useful in relativistic physics.
This has something to do with our discovery that acceleration is now
a frame-dependent concept (so that a statement like F = ma would be
more complicated), but the main point actually involves Newton's third
law: The third law of Newtonian Physics: When two objects (A and B)
exert forces F_{A on B} and F_{B on A} on each other, these forces have the
same size but act in opposite directions.
To understand why this is a problem, let's think about the gravitational
forces between the Sun and the Earth.
Newton said that the force between the earth and the sun is given by an
inverse square law:
$$F = \frac{G\,M_{\text{earth}}\,M_{\text{sun}}}{d^2},$$
where d is the distance between them.
In particular, the force between the earth and sun decreases if they move farther
apart. Let's draw a spacetime diagram showing the two objects moving apart.
At some time t_1, when they are close together, there is some strong force
F_1 acting on each object. Then, later, when they are farther apart, there is
some weaker force F_2 acting on each object.
However, what happens if we consider this diagram in a moving
reference frame? A line of simultaneity (the dashed line) for a different
reference frame is drawn above, and we can see that it passes through one point
marked F_1 and one point marked F_2! This shows that Newton's third law
as stated above cannot possibly hold in all reference frames.
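The underlying issue is the relativity of simultaneity, which a Lorentz transformation makes explicit. In the sketch below the positions x = −2 and x = +2 for the two objects, the common time t = 1, and the frame velocity 0.5c are hypothetical values chosen for illustration, with units in which c = 1.

```python
import math

def lorentz_time(t, x, beta, c=1.0):
    """Time coordinate of the event (t, x) in a frame moving with velocity beta*c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (t - beta * x / c)

# Two events at the same time t = 1 in the original frame, one on each object.
t_sun = lorentz_time(1.0, -2.0, beta=0.5)
t_earth = lorentz_time(1.0, 2.0, beta=0.5)

print("time of the event on the Sun in the moving frame:  ", t_sun)
print("time of the event on the Earth in the moving frame:", t_earth)
```

The two events are no longer simultaneous in the moving frame, so the forces that frame compares "at the same time" belong to different moments of the original frame and need not have equal size.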
So, Newton's third law has to go. But of course, Newton's third law
is not completely wrong - it worked very well for several hundred years!
So, as with the law of composition of velocities and Newton's second law,
we may expect that it is an approximation to some other law, with this
approximation being valid only for velocities that are very small compared
to c.
It turns out that this was not such a shock to Einstein, as there had been a
bit of trouble with Newton's third law even before relativity itself was
understood. Again, the culprit was electromagnetism.
FIELDS, ENERGY, AND MOMENTUM
To see the point, consider an electron in an electric field. We have said
that it is really the field that exerts a force on the electron.
Newton's third law would seem to say that the electron then exerts a force
on the electric field. But what would this mean? Does an electric field have
mass? Can it accelerate?
Luckily for Einstein, this problem had been solved. It was understood
that the way out of this mess was to replace the notion of force with two
somewhat more abstract notions: energy and momentum. Since not all of
you are intimately familiar with these notions, let me say just a few words
about them before we continue.
A word on Energy (E)
Most people have an intuitive concept of energy as "what comes out of a
power plant" and this is almost good enough for our purposes. Anything which
can do something has energy: batteries, light, gasoline, wood, coal (these three
can be burned), radioactive substances, food, and so on. Also, any moving
object has energy due to its motion. For example, a moving bowling ball has
energy that allows it to knock down bowling pins. By the way, in Newtonian
physics, there is an energy (1/2)mv² (called 'kinetic energy' from the Greek word
for motion) due to the motion of an object of mass m.
The most important thing about energy is that it cannot be created or
destroyed; it can only be transformed from one form to another. As an example,
in a power plant, coal is burned and electricity is generated. Burning coal is a
process in which the chemical energy stored in the coal is turned into heat
energy.
This heat energy boils water and creates a rising column of steam
(which has energy due to its motion). The column of steam then turns a
crank which turns a wire in a magnetic field. This motion converts the
mechanical energy of motion of the wire into electrical energy.
Physicists say that Energy is "conserved," which means that the total
energy E in the universe cannot change.
A Few Words on Momentum (P)
Momentum is a bit less familiar, but it is like energy in that it cannot
be created or destroyed: it can only be transferred from one object to
another. Thus, momentum is also "conserved." Momentum is a quantity
that describes in a certain sense "how much motion is taking place, and in
what direction." If the total momentum is zero, we might say that there is
"no net motion" of a system. Physicists say that the velocity of the "centre
of mass" vanishes in this case.
Let's look back at the bowling ball example above. The energy of the
bowling ball is a measure of how much mayhem the ball can cause when
it strikes the pins. However, when the ball hits the pins, the pins do not
fly about in an arbitrary way. In particular, the pins tend to fly away in
more or less the same direction as the ball was moving originally. This is
because, when the ball hits the pins, it gives up not only some of its energy
to the pins, but also some of its momentum. The momentum is the thing
that knows what direction the ball was traveling and makes the pins move
in the same direction that the ball was going.
As an example, consider what happens if the bowling ball explodes
when it reaches the pins.
This releases more energy (so that the pins fly around more) but will not
change the momentum. As a result, the pins and ball shards will have the same
net forward motion as would have happened without the explosion. Some pins
and shards will now move more forward, but some other bits will also move
more backward to cancel this effect.
In Newtonian physics, the momentum p of a moving object is given
by the formula p = mv. This says that an object that moves very fast has
more momentum than one that moves slowly, and an object that has a
large mass has more momentum than one with a small mass. This second
bit is why it is easier to knock over a bowling pin with a bowling ball than
with a ping-pong ball.
By the way, the fact that momentum is a type of object which points
in some direction makes it something called a vector. A vector is something
that you can visualize as an arrow. The length of the arrow tells you how
big the vector is (how much momentum) and the direction of the arrow
tells you the direction of the momentum.
Now, in Newtonian physics, momentum conservation is closely
associated with Newton's third law. One way to understand this is to realise
that both rules (Newton's third law and momentum conservation)
guarantee that an isolated system (say, a closed box in deep space) that
begins at rest cannot ever start to move.
In terms of Newton's third law, this is because, if we add up all of the
forces between things inside the box, they will cancel in pairs: $F_{A\ \text{on}\ B} + F_{B\ \text{on}\ A} = 0$.
In terms of momentum conservation, it is because the box at
rest has zero momentum, whereas a moving box has a nonzero momentum.
Momentum conservation says that the total momentum of the box cannot
change from zero to non-zero.
In fact, in Newtonian physics, Newton's third law is equivalent to
momentum conservation. To see this, consider two objects, A and B, with
momenta $p_A = m_A v_A$ and $p_B = m_B v_B$. Suppose for simplicity that the only
forces on these objects are caused by each other. Note what happens when
we take a time derivative:
$$\frac{dp_A}{dt} = m_A a_A = F_{B\ \text{on}\ A},$$
$$\frac{dp_B}{dt} = m_B a_B = F_{A\ \text{on}\ B}.$$
The total momentum is $p_{\text{total}} = p_A + p_B$. We have
$$\frac{dp_{\text{total}}}{dt} = F_{B\ \text{on}\ A} + F_{A\ \text{on}\ B} = 0.$$
Thus, Newton's 3rd law is equivalent to momentum conservation. One
holds if and only if the other does.
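To see this equivalence at work numerically, here is a minimal Python sketch. The spring-like force law, the masses, and the starting values are all assumptions chosen for illustration; the only thing taken from the argument above is that the force of A on B is minus the force of B on A.

```python
# Two bodies on a line that interact only with each other (Newtonian physics).
# The spring-like force law and every number here are illustrative assumptions.

def simulate(m_a, m_b, v_a, v_b, x_a, x_b, k=2.0, dt=1e-3, steps=5000):
    """Step both bodies forward in time and record the total momentum."""
    totals = []
    for _ in range(steps):
        f_b_on_a = k * (x_b - x_a)    # force on A due to B (assumed form)
        f_a_on_b = -f_b_on_a          # Newton's third law
        v_a += f_b_on_a / m_a * dt    # dp_A/dt = F_(B on A)
        v_b += f_a_on_b / m_b * dt    # dp_B/dt = F_(A on B)
        x_a += v_a * dt
        x_b += v_b * dt
        totals.append(m_a * v_a + m_b * v_b)
    return totals


p_initial = 1.0 * 0.5 + 3.0 * (-0.2)
totals = simulate(m_a=1.0, m_b=3.0, v_a=0.5, v_b=-0.2, x_a=0.0, x_b=1.0)
max_drift = max(abs(p - p_initial) for p in totals)
print(f"initial total momentum: {p_initial:.3f}")
print(f"largest change during the run: {max_drift:.1e}")
```

Whatever force law we choose, the total momentum changes only by rounding error, because each time step adds equal and opposite amounts to the two momenta.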
Anyway, physicists in the 1800's had understood that there was a problem
with Newton's third law when one considered electric fields. It did not really
seem to make sense to talk about an electron exerting a force on an electric
field. However, it turns out that one can meaningfully talk about momentum
carried by an electromagnetic field, and one can even compute the momentum
of such a field - say, for the field representing a light wave or a radio wave.
Furthermore, if one adds the momentum of the electromagnetic field to the
momentum of all other objects, Maxwell's equations tell us that the resulting
total momentum is in fact conserved.
In this way, physicists had discovered that momentum conservation
was a slightly more abstract principle that held true more generally than
did Newton's third law.
In relativity, too, it turns out to be a good idea to think in terms of
momentum and momentum conservation instead of thinking in terms of
Newton's third law. For example, in the Sun-Earth system, the field between the
Sun and the Earth can carry momentum. As a result, momentum
conservation does not have to fail if, on some slice of simultaneity, the
momentum being gained by the Earth does not equal the momentum being
lost by the Sun! Instead, the missing momentum is simply being stored or
lost by the field in between the two objects.
ON TO RELATIVITY
Now, while the concepts of momentum and energy can make sense
in relativistic physics, the detailed expressions for them in terms of mass
and velocity should be somewhat different than in the Newtonian versions.
However, as usual we expect that the Newtonian versions are correct in
the particular limit in which all velocities are small compared to the speed
of light.
There are a number of ways to figure out what the correct relativistic
expressions are. One is the direct derivation given later in this chapter;
however, that way of getting at the answer is a bit technical.
So, for the moment, we're going to approach the question from a different
standpoint.
Einstein noticed that, even within electromagnetism, there was still
something funny going on. Momentum was conserved, but this did not
necessarily seem to keep isolated boxes (initially at rest) from running
away! The example he had in mind was connected with the observation
that light can exert pressure.
This was well known in Einstein's time and could even be measured.
The measurements were made as early as 1900, while Einstein published
his theory of special relativity in 1905. It was known, for example, that
pressure caused by light from the sun was responsible for the long and
lovely tails on comets: light pressure (also called radiation pressure)
pushed droplets of water and bits of dust and ice backwards from the
comet making a long and highly reflective tail. Nowadays, we can use lasers
to lift grains of sand, or to smash things together to induce nuclear fusion.
Lasers in a Box
Anyway, suppose that we start with a box having a powerful laser at
one end.
When the laser fires a pulse of light, the light is near the left end and
pressure from this light pushes the box to the left. The box moves to the
left while the pulse is traveling to the right. Then, when the pulse hits the
far wall, its pressure stops the motion of the box. The light itself is absorbed
by the wall and disappears.
Now, momentum conservation says that the total momentum is
always zero. Nevertheless, the entire box seems to have moved a bit to the
left. With a large enough battery to power the laser, we could repeat this
experiment many times and make the box end up very far to the left of
where it started. Or, perhaps we do not even need a large battery: we can
imagine recycling the energy used by the laser.
If we could catch the energy at the right end and then put it back in
the battery, we would only need a battery tiny enough for a single pulse.
By simply recycling the energy many times, we could still move the box
very far to the left. This is what really worried our friend Mr. Einstein.
Centre of Mass
The moving laser box worried him because of something called the
centre of mass. Here's the idea: Imagine yourself in a canoe on a lake.
You stand at one end of the canoe and then walk forward. However, while
you walk forward, the canoe will slide backward a bit. A massive canoe
slides only a little bit, but a light canoe will slide a lot. It turns out that in
non-relativistic physics the average position of all the mass (including both
you and the canoe) does not move. This average location is technically
known as the 'centre of mass'.
This follows from Newton's third law and momentum conservation. To
understand the point suppose that in the above experiment we throw rocks
from left to right instead of firing the laser beam. While most of the box would
shift a bit to the left (due to the recoil) with each rock thrown, the rock in
flight would travel quite a bit to the right. In this case, a sort of average location
of all of the things in the box (including the rock) does not move.
Suppose now that we want to recycle the rock, taking it back to the left to
be thrown again. We might, for example, try to throw it back. But this would
make the rest of the box shift back to the right, just where it was before. It
turns out that any other method of moving the rock back to the left side has
the same effect.
To make a long story short, since the average position cannot change,
a box can never move itself more than one box-length in any direction,
and this can only be done by piling everything inside the box on one side.
In fact, when there are no forces from outside the box, the centre of mass
of the stuff in the box does not accelerate at all! In general, it is the centre
of mass that responds directly to outside forces.
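As a rough check of the canoe story, the following sketch computes how far the canoe slides when you walk its length, and confirms that the average position of all the mass stays put. It is a one-dimensional toy model; the masses, the canoe length, and the function names are made up for illustration.

```python
def canoe_slide(person_mass, canoe_mass, walk_distance):
    """How far the canoe slides backward while the person walks
    walk_distance forward relative to the canoe (no outside forces)."""
    return person_mass * walk_distance / (person_mass + canoe_mass)


def centre_of_mass(masses, positions):
    """Mass-weighted average position."""
    return sum(m * x for m, x in zip(masses, positions)) / sum(masses)


person, walk = 70.0, 4.0        # kg and metres (assumed values)
results = {}
for canoe in (30.0, 630.0):     # a light canoe and a massive one
    slide = canoe_slide(person, canoe, walk)
    # The person starts at x = 0; the canoe's midpoint starts at x = 2.
    before = centre_of_mass([person, canoe], [0.0, 2.0])
    after = centre_of_mass([person, canoe], [walk - slide, 2.0 - slide])
    results[canoe] = (slide, before, after)
    print(f"canoe of {canoe:.0f} kg slides {slide:.2f} m; "
          f"centre of mass {before:.3f} m -> {after:.3f} m")
```

The light canoe slides most of the way back, the massive one hardly at all, and in both cases the centre of mass ends where it began.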
Mass vs. Energy
So, what's going on with our box? Let's look at the experiment more
carefully.
After the experiment, it is clear that the box has moved, and in fact
that every single atom in the box has slid to the left. So, the centre of
mass seems to have moved! But, Einstein asked, might something else have
changed during the experiment which we need to take into account? Is
the box after the experiment really identical to the one before the
experiment began?
The answer is: "not quite." Before the experiment, the battery that
powers the laser is fully charged. After the experiment, the battery is not
fully charged. What happened to the associated energy? It traveled across
the box as a pulse of light. It was then absorbed by the right wall, causing
the wall to become hot. The net result is that energy has been transported
from one end of the box (where it was battery energy) to the other (where it
became heat).
[Figure: the box after the experiment. Label: battery not fully charged.]
So, Einstein said, "perhaps we should think about something like the centre
of energy as opposed to the centre of mass." But, of course, the mass must
also contribute to the centre of energy... so is mass a form of energy?
Anyway, the relevant question here is "Suppose we want to calculate the
centre of mass/energy. Just how much mass is a given amount of energy
worth?" Or, said another way, how much energy is a given amount of
mass worth?
Well, from Maxwell's equations, Einstein could figure out the energy
transported.
He could also figure out the pressure exerted on the box so that he knew
how far all of the atoms would slide. Assuming that the centre of mass-energy
did not move, this allowed him to figure out how much energy the mass of the
box was in fact worth. The computation is a bit complicated, so we won't do
it here. However, the result is that an object of mass m which is at rest is
worth the energy:
$$E = mc^2.$$
Note that, since $c^2 = 9 \times 10^{16}\ \mathrm{m^2/s^2}$ is a big number, a small mass is
worth a lot of energy. Or, a 'reasonable amount' of energy is in fact worth
very little mass.
This is why the contribution of the energy to the 'center of mass-energy'
had not been noticed in pre-Einstein experiments. Let's look at a few. We buy
electricity in 'kilowatt-hours' (kWh) - roughly the amount of energy it takes to
run a house for an hour. The mass equivalent of 1 kilowatt-hour is
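$$\frac{1\ \text{kWh}}{c^2} = \frac{1\ \text{kWh}}{c^2}\left(\frac{3600\ \text{sec}}{1\ \text{hr}}\right)\left(\frac{1000\ \text{W}}{1\ \text{kW}}\right) = \frac{3.6 \times 10^{6}}{9 \times 10^{16}}\ \text{kg} = 4 \times 10^{-11}\ \text{kg}.$$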
In other words, not much.
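The same conversion can be written as a few lines of Python. This is only a sketch of the arithmetic above; the rounded value of c and the names used are choices made here for illustration.

```python
C = 3.0e8                      # speed of light in m/s (rounded, as in the text)
JOULES_PER_KWH = 3600 * 1000   # 1 kWh = 1000 W for 3600 s


def mass_equivalent_kg(energy_joules):
    """Mass that a given amount of energy is 'worth', from E = m c^2."""
    return energy_joules / C ** 2


mass_of_one_kwh = mass_equivalent_kg(JOULES_PER_KWH)
print(f"1 kWh is worth about {mass_of_one_kwh:.1e} kg")
```

Running the conversion the other way, `C ** 2` joules is the energy that one kilogram is worth, which is why everyday amounts of energy weigh so little.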
By the way, one might ask whether the fact that both mass and energy
contribute to the 'center of mass-energy' really means that mass and energy
are convertible into one another. Let's think about what this really means.
We have a fair idea of what energy is, but what is mass? We have not
really talked about this yet in this course, but what Newtonian physicists meant
by mass might be better known as 'inertia.' In other words, mass is defined
through its presence in the formula F = ma which tells us that the mass is
what governs how difficult an object is to accelerate.
Mass, Energy, and Inertia
So, then, what we really want to know is whether adding energy to an
object increases its inertia. That is, is it harder to move a hot wall than a cold
wall?
To get some perspective on this, recall that one way to add energy to
an object is to speed it up. But we have already seen that rapidly moving
objects are indeed hard to accelerate (e.g., a uniformly accelerating object
never accelerates past the speed of light). Now, heating an object just means that you
make the various atoms speed up and move around very fast in random
ways. So, this example is really a lot like our uniformly accelerating rocket.
In fact, there is no question about the answer. We saw that heat enters
into the calculation of the centre of mass. So, let's think back to the
example of you walking in a canoe floating in water. If the canoe is hot,
we have seen that it counts more in figuring the centre of mass than when
it is cold. It acts like a heavier canoe and will not move as far. Why did it
not move as far when you walked in it in the same way? It must have been
harder to push; i.e., it had more inertia when it was hot. Thus we conclude
that adding energy to a system (say, charging a battery) does in fact give
it more inertia; i.e., more mass.
By the way, this explains something rather odd that became known
through experiments in the 1920's and 30's, a while after Einstein published
his theory of relativity. Atomic nuclei are made out of protons and
neutrons. An example is the Helium nucleus (also called an α particle)
which contains two neutrons and two protons. However, the masses of
these objects are:
Proton mass: $1.673 \times 10^{-27}$ kg.
Neutron mass: $1.675 \times 10^{-27}$ kg.
α-particle mass: $6.648 \times 10^{-27}$ kg.
So, if we check carefully, we see that an α particle has less mass than
the mass of two protons plus the mass of two neutrons. The difference is
$$m_\alpha - 2m_p - 2m_n = -0.0477 \times 10^{-27}\ \text{kg}.$$
Why should this be the case? First note that, since these
four particles stick together (i.e., the result is stable), they must have less
energy when they are close together than when they are far apart. That is, it
takes energy to rip them apart. But, if energy has inertia, this means exactly
that the object you get by sticking them together will have less inertia (mass)
than $2m_p + 2m_n$.
This, by the way, is how nuclear fusion works as a power source. For
example, inside the sun, it often happens that two neutrons and two
protons will be pressed close together. If they bind together to form an α
particle then this releases an extra $0.0477 \times 10^{-27}$ kg of energy that becomes
heat and light.
Again, it is useful to have a look at the numbers. This amount of
mass is worth an energy of $E = mc^2 = 5 \times 10^{-12}$ Watt-seconds $\approx 1.4 \times 10^{-18}$ kWh.
This may not seem like much, but we were talking about just 2
protons and 2 neutrons. What if we did this for one gram worth of stuff?
Since four particles, each of which is about 1 amu of mass, give the above
result, one gram would produce the above energy multiplied by $\frac{1}{4}$ of
Avogadro's number. In other words, we should multiply by $1.5 \times 10^{23}$.
This yields roughly $2 \times 10^{5}$ kWh, or a bit more than 20 kW-years. In other words, fusion
energy from 1 gram of material could power about 20 houses for one year! Nuclear
fission yields comparable results.
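It is easy to lose a power of ten in this kind of estimate, so here is a short sketch that carries the arithmetic through without rounding the intermediate steps. The mass difference is the one quoted above; the rounded constants and the variable names are assumptions made for this example, and a "house" is taken to draw 1 kW, in line with the earlier remark about kilowatt-hours.

```python
C = 3.0e8                   # speed of light, m/s (rounded)
AVOGADRO = 6.022e23         # particles per mole
JOULES_PER_KWH = 3.6e6
HOURS_PER_YEAR = 24 * 365

mass_defect_kg = 0.0477e-27                     # per alpha particle formed
energy_per_alpha_j = mass_defect_kg * C ** 2    # E = m c^2
alphas_per_gram = AVOGADRO / 4                  # four ~1 amu particles each

energy_per_gram_kwh = energy_per_alpha_j * alphas_per_gram / JOULES_PER_KWH
kw_years = energy_per_gram_kwh / HOURS_PER_YEAR

print(f"energy per alpha particle: {energy_per_alpha_j:.2e} J")
print(f"energy per gram: {energy_per_gram_kwh:.2e} kWh")
print(f"equivalent to about {kw_years:.0f} kW-years")
```

With these inputs the result is a little under $2 \times 10^{5}$ kWh per gram, roughly 20 kW-years.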
By the way, when we consider any other form of power generation
(like burning coal or gasoline), the mass of the end products (the burned
stuff) is again less than the mass of the stuff we started with by an amount
that is exactly $c^{-2}$ times the energy released. However, for chemical
processes this turns out to be an extremely tiny fraction of the total mass
and is thus nearly impossible to detect.
MORE ON MASS, ENERGY, AND MOMENTUM
We saw that what we used to call mass and energy can be converted
into each other - and in fact are converted into each other all of the time.
Does this mean that mass and energy really are the same thing? Well, that
depends on exactly how one defines mass and energy. The point is that,
as with most things in physics, the old (Newtonian) notions of mass and
energy will no longer be appropriate.
So, we must extend both the old concept of mass and the old concept
of energy before we can even start talking. There are various ways to extend
these concepts.
Energy and Rest Mass
My notion of mass will be independent of reference frame. This is
not the case for an older convention which has a closer tie to the old F =
ma. This older convention then defines a mass that changes with velocity.
However, for the moment, let me skirt around this issue by talking about the
"rest mass" ($m_0$, by definition an invariant) of an object, which is just the
mass (inertia) it has when it is at rest. In particular, an object at rest has inertia
$m_0$ and is worth an energy $m_0 c^2$. In Newtonian physics, a moving object
also has the energy $\frac{1}{2} m_0 v^2$ due to its
motion. Almost certainly, this expression will need to be modified in
relativity, but it should be approximately correct for velocities small
compared with the speed of light. Thus, for a slowly moving object we
have
$$E = m_0 c^2 + \frac{1}{2} m_0 v^2 + \text{small corrections}.$$
Note that we can factor out an $m_0 c^2$ to write this as:
$$E = m_0 c^2 \left(1 + \frac{1}{2}\frac{v^2}{c^2} + \text{small corrections}\right).$$
The precise form of these small corrections can be derived, and the derivation
appears later in this chapter. However, that derivation
is somewhat technical and relies on a more in-depth knowledge of energy
and momentum in Newtonian physics. For now, notice that the factor in
parentheses is familiar: it gives the
first few terms in the Taylor series expansion of the time-dilation factor,
$$\frac{1}{\sqrt{1 - v^2/c^2}} = 1 + \frac{1}{2}\frac{v^2}{c^2} + \text{small corrections},$$
a factor which has appeared in almost every equation we have due to its
connection to the interval and Minkowskian geometry.
It is therefore natural to guess that the correct relativistic formula for
the total energy of a moving object is
$$E = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}} = m_0 c^2 \cosh\theta.$$
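To see how good the slow-motion approximation is, the sketch below compares three numbers for several speeds: the Newtonian estimate $1 + \frac{1}{2}v^2/c^2$, the time-dilation factor, and $\cosh\theta$ with the boost parameter obtained from $v/c = \tanh\theta$. Energies are measured in units of $m_0 c^2$; the function names are chosen here for illustration.

```python
import math


def gamma_factor(beta):
    """Time-dilation factor 1 / sqrt(1 - v^2/c^2), with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def energy_newtonian(beta):
    """Rest energy plus Newtonian kinetic energy, in units of m0 c^2."""
    return 1.0 + 0.5 * beta ** 2


def energy_from_boost(beta):
    """cosh(theta), where the boost parameter satisfies tanh(theta) = beta."""
    return math.cosh(math.atanh(beta))


rows = []
for beta in (0.01, 0.1, 0.5, 0.9):
    row = (beta, energy_newtonian(beta), gamma_factor(beta), energy_from_boost(beta))
    rows.append(row)
    print(f"v/c = {beta:4.2f}: Newtonian {row[1]:.6f}, "
          f"exact {row[2]:.6f}, cosh(theta) {row[3]:.6f}")
```

At small speeds the three columns agree to many digits; at nine tenths of the speed of light the Newtonian estimate falls well short, while the last two columns always coincide.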
Momentum and Mass
Momentum is a little trickier, since we only have one term in the
expansion so far: p = mv + small corrections. Based on the analogy with
energy, we expect that this is the expansion of something native to
Minkowskian geometry - probably a hyperbolic trig function of the boost
parameter $\theta$. Unfortunately there are at least two natural candidates,
$m_0 c \sinh\theta$ and $m_0 c \tanh\theta$ (which is of course just $m_0 v$). The detailed
derivation is given later in this chapter, but it should come as no surprise that the answer
is the sinh one that is simpler from the point of view of Minkowskian geometry
and which is not the Newtonian answer. Thus the relativistic formula for
momentum is:
$$p = \frac{m_0 v}{\sqrt{1 - v^2/c^2}} = m_0 c \sinh\theta.$$
If you don't really know what momentum is, don't worry too much
about it. However, the relativistic formulas for energy and momentum
are very important for things you encounter every day - like high resolution
computer graphics.
The light from your computer monitor (a cathode-ray tube) is generated by electrons
traveling at a sizeable fraction of the speed of light and then hitting the screen. This is fast enough
that, if engineers did not take into account the relativistic formula for
momentum and tried to use just p = mv, the electrons would not land at
the right places on the screen and the image would be all screwed up. There
are some calculations about this in the homework problems.
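To get a feel for the size of the effect, here is a small sketch. The 25 kV accelerating voltage is an assumed, typical-looking value for a cathode-ray tube and is not taken from the text. The kinetic energy is taken to be the total energy minus the rest energy $m_0 c^2$, and 511 keV is the electron's rest energy.

```python
import math

ELECTRON_REST_ENERGY_KEV = 511.0   # m0 c^2 for an electron, in keV


def beam_electron(kinetic_energy_kev):
    """Return (gamma, v/c) for an electron with the given kinetic energy."""
    gamma = 1.0 + kinetic_energy_kev / ELECTRON_REST_ENERGY_KEV
    beta = math.sqrt(1.0 - 1.0 / gamma ** 2)
    return gamma, beta


# Assumed tube voltage: 25 kV gives each electron 25 keV of kinetic energy.
gamma, beta = beam_electron(25.0)

# At the same speed, (relativistic momentum) / (m0 v) is just gamma.
momentum_ratio = gamma

print(f"speed: {beta:.3f} c")
print(f"relativistic momentum exceeds m0*v by {100 * (momentum_ratio - 1):.1f}%")
```

Under these assumptions the electrons move at about three tenths of the speed of light, and the Newtonian formula misjudges their momentum by several percent, enough to matter when the beam has to be steered onto a particular spot.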
By the way, you may notice a certain similarity between the formulas
for p and E in terms of rest mass $m_0$ and, say, the position x and the
coordinate time t relative to the origin for a moving inertial object in terms
of its own proper time $\tau$ and boost parameter $\theta$. In particular, we have
$$\frac{pc}{E} = \frac{v}{c} = \tanh\theta.$$
We also have
$$E^2 - c^2 p^2 = m_0^2 c^4 (\cosh^2\theta - \sinh^2\theta) = m_0^2 c^4.$$
Since $m_0$ does not depend on the reference frame, this is an invariant
like, say, the interval. Hmm.... The above expression even looks kind of
like the interval.... Perhaps it is a similar object? Here is what is going on:
a displacement (like $\Delta x$, or the position relative to an origin) in general
defines a vector - an object that can be thought of like an arrow. Now, an
arrow that you draw on a spacetime diagram can point in a timelike
direction as much as in a spacelike direction. Furthermore, an arrow that
points in a 'purely spatial' direction as seen in one frame of reference points
in a direction that is not purely spatial as seen in another frame.
So, spacetime vectors have time parts (components) as well as space parts.
A displacement in spacetime involves $c\Delta t$ as much as a $\Delta x$. The interval is
actually something that computes the size of a given spacetime vector. For a
displacement, it is $\Delta x^2 - c^2 \Delta t^2$. Together, the momentum and the energy form a
single spacetime vector. The momentum is already a vector in space, so it
forms the space part of this vector.
It turns out that the energy forms the time part of this vector. So, the size
of the energy-momentum vector is given by a formula much like the one above
for displacements. This means that the rest mass $m_0$ is basically a measure of
the size of the energy-momentum vector.
Furthermore, we see that this 'size' does not depend on the frame of
reference and so does not depend on how fast the object is moving. However,
for a rapidly moving object, both the time part (the energy) and the space part
(the momentum) are large - it's just that the Minkowskian notion of the size of
a vector involves a minus sign, and these two parts largely cancel against each
other.
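The cancellation can be checked numerically. The sketch below works in one space dimension with units in which c = 1 by default. The `boost` function is an assumption added here: it changes frames by mixing the energy and momentum with cosh and sinh, exactly as the time and space parts of a displacement are mixed.

```python
import math


def four_momentum(m0, theta, c=1.0):
    """(E, p) for rest mass m0 and boost parameter theta."""
    return m0 * c ** 2 * math.cosh(theta), m0 * c * math.sinh(theta)


def boost(E, p, phi, c=1.0):
    """(E, p) as seen from a frame with boost parameter phi (assumed rule)."""
    E_new = E * math.cosh(phi) - c * p * math.sinh(phi)
    p_new = p * math.cosh(phi) - (E / c) * math.sinh(phi)
    return E_new, p_new


def rest_mass(E, p, c=1.0):
    """'Size' of the energy-momentum vector: sqrt(E^2 - c^2 p^2) / c^2."""
    return math.sqrt(E ** 2 - (c * p) ** 2) / c ** 2


E, p = four_momentum(m0=2.0, theta=1.5)
E_other, p_other = boost(E, p, phi=0.7)

print(f"first frame:  E = {E:.4f}, p = {p:.4f}, rest mass = {rest_mass(E, p):.4f}")
print(f"second frame: E = {E_other:.4f}, p = {p_other:.4f}, "
      f"rest mass = {rest_mass(E_other, p_other):.4f}")
```

The energy and the momentum each change from frame to frame, but the combination that gives the rest mass does not.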
How About an Example?
As with many topics, a concrete example is useful to understand certain
details of what is going on. The point is that, while energy and momentum are
both conserved, mass is not conserved.
Let's suppose we take two electrons and place them in a box. Suppose
that both electrons are moving at $\frac{4}{5}c$, but in opposite directions. If $m_e$ is
the rest mass of an electron, each particle has a momentum of size
$$|p| = \frac{m_e v}{\sqrt{1 - v^2/c^2}} = \frac{4}{3} m_e c$$
and an energy
$$E = \frac{m_e c^2}{\sqrt{1 - v^2/c^2}} = \frac{5}{3} m_e c^2.$$
We also need to consider the box. For simplicity, let us suppose that
the box also has mass $m_e$. But the box is not moving, so it has $p_{Box} = 0$
and $E_{Box} = m_e c^2$.
Now, what is the energy and momentum of the system as a whole?
Well, the two electron momenta are of the same size, but they are in
opposite directions. So, they cancel out. Since $p_{Box} = 0$, the total
momentum is also zero. However, the energies are all positive (energy
doesn't care about the direction of motion), so they add together. We find:
$$p_{system} = 0,$$
$$E_{system} = \frac{13}{3} m_e c^2.$$
So, what is the rest mass of the system as a whole? We have
$$m_0^2 c^4 = E^2 - c^2 p^2 = \frac{169}{9} m_e^2 c^4.$$
So, the rest mass of the system is given by dividing the
right hand side by $c^4$ and taking the square root. The result is $\frac{13}{3} m_e$, which is significantly greater than
the rest mass of the box plus twice the rest mass of the electron!
Similarly, two massless particles can in fact combine to make an object
with a finite non-zero mass. For example, placing photons in a box adds
to the mass of the box.
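The bookkeeping in this example is simple enough to hand to a computer. The sketch below uses units in which c = 1 and the electron rest mass is 1; the helper name and the list layout are choices made here for illustration.

```python
import math


def energy_momentum(m0, beta):
    """(E, p) in units with c = 1; beta is the signed velocity v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * m0, gamma * m0 * beta


parts = [
    energy_momentum(1.0, +0.8),   # electron moving one way at 4/5 c
    energy_momentum(1.0, -0.8),   # electron moving the other way
    energy_momentum(1.0, 0.0),    # the box, at rest, with the same rest mass
]

E_total = sum(E for E, _ in parts)       # energies simply add
p_total = sum(p for _, p in parts)       # momenta add with their signs
m_system = math.sqrt(E_total ** 2 - p_total ** 2)
sum_of_rest_masses = 3.0

print(f"total energy   = {E_total:.4f}")
print(f"total momentum = {p_total:.4f}")
print(f"rest mass of the system = {m_system:.4f} (the parts add up to {sum_of_rest_masses})")
```

The system's rest mass comes out to 13/3 electron masses, larger than the three electron masses of its parts: the energy of motion of the electrons counts toward the inertia of the box as a whole.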
ENERGY AND MOMENTUM FOR LIGHT
At this point we have developed a good understanding of energy and
momentum for objects. However, there has always been one other very
important player in our discussions, which is of course light itself. We'll
take a moment to explore the energy and momentum of light waves and
to see what it has to teach us.
Light Speed and Massless Objects
"What happens if we try to get an object moving at a speed greater
than c?" Let's look at the formulas for both energy and momentum. Notice
that $E = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}}$ becomes infinitely large as v approaches the speed of
light. Similarly, an object (with finite rest mass $m_0$) requires an infinite
momentum to move at the speed of light. Again, what this tells us is that, much
as with our uniformly accelerating rocket, no finite effort
will ever be able to make any object (with $m_0 > 0$) move at speed c. By the
way, what happens if we try to talk about energy and momentum for light
itself? Many of our formulas fail to make sense for v = c. However, some
of them do. Consider, for example,
$$\frac{pc}{E} = \frac{v}{c}.$$
Since light moves at speed c through a vacuum, this would lead us to
expect that for light we have E = pc. In fact, one can compute the energy
and momentum of a light wave using Maxwell's equations. One finds that
both the energy and the momentum of a light wave depend on several
factors, like the wavelength and the size of the wave. However, in all cases
the energy and momentum exactly satisfy the relation E = pc. We can
consider a bit of light (a.k.a., a photon) with any energy E so long as we
also assign it a corresponding momentum p = E/c. The energy and
momentum of photons add together in just the way we saw above for massive particles. So, what is the rest mass
of light? Well, if we compute $m_0^2 c^4 = E^2 - p^2 c^2 = 0$, we find $m_0 = 0$. Thus,
light has no mass. This to some extent shows how light can move at speed c
and have finite energy. The zero rest mass 'cancels' against the infinite factor
$1/\sqrt{1 - v^2/c^2}$ in our formulas above. By the way, note that this also
goes the other way: if $m_0 = 0$ then $E = \pm pc$ and so $\frac{v}{c} = \frac{pc}{E} = \pm 1$. Such an object
has no choice but to always move at the speed of light.
Another Look at the Doppler Effect
For a massive particle (i.e., with $m_0 > 0$), if we are in a frame that is
moving rapidly toward the object, the object has a large energy and
momentum as measured by us. One might ask if the same is true for light.
The easy way to discover that it is in fact true for light as well is to
use the fact (which we have not yet discussed, and which really belongs to
a separate subject called 'quantum mechanics,' but what the heck...) that
light actually comes in small chunks called 'photons.'
The momentum and energy of a single photon are both proportional to its
frequency f, which is the number of times that the corresponding wave shakes
up and down every second. Recall from our discussion of the Doppler effect that
the frequency with which the light was emitted
in, say, Alphonse's frame of reference was not the same as the frequency at
which the light was received in the other frame (Gaston's).
The result was that if Gaston was moving toward Alphonse, the frequency
was higher in Gaston's frame of reference. Using the relation between frequency
and energy (and momentum), we see that for this case the energy and
momentum of the light is indeed higher in Gaston's frame of reference than in
Alphonse's frame of reference. So, moving toward a ray of light has a similar
effect on how we measure its energy and momentum as does moving toward
a massive object.
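As a cross-check, we can treat a photon's energy and momentum as the time and space parts of a spacetime vector and change frames directly. The sketch below assumes the standard rule $E' = \gamma(E + v p)$ for an observer moving at speed v toward the oncoming light, and compares the result with the relativistic Doppler factor $\sqrt{(1 + v/c)/(1 - v/c)}$. Units with c = 1 are used, and the function names are chosen here for illustration.

```python
import math


def photon_energy_for_approaching_observer(E, beta):
    """Energy of a photon (emitted with energy E) as measured by an observer
    moving toward the light at speed beta = v/c; assumes E' = gamma (E + v p)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    p = E                       # for light, p c = E (and c = 1 here)
    return gamma * (E + beta * p)


def doppler_factor(beta):
    """Ratio of received to emitted frequency when approaching the source."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


E_emitted = 1.0
beta = 0.6
E_received = photon_energy_for_approaching_observer(E_emitted, beta)

print(f"energy ratio from the frame change: {E_received / E_emitted:.4f}")
print(f"frequency ratio from Doppler:       {doppler_factor(beta):.4f}")
```

The two ratios agree, as they must if a photon's energy is proportional to its frequency.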
DERIVING THE RELATIVISTIC EXPRESSIONS
FOR ENERGY AND MOMENTUM
Due to its more technical nature, and the fact that it requires
a more solid understanding of energy and momentum in Newtonian physics,
this discussion has been kept separate from the material above.
Still, read on if you're inclined to see just how far logical reasoning can take you in
this subject.
It turns out that the easiest way to do the derivation is by focusing
on momentum. The energy part will then emerge as a pleasant surprise.
The argument has four basic inputs:
• We know that Newtonian physics is not exactly right, but it is a
good approximation at small velocities. So, for an object that
moves slowly, its momentum is well approximated by p = mv.
• We will assume that, whatever the formula for momentum is,
momentum in relativity is still conserved. That is, the total
momentum does not change with time.
• We will use the principle of relativity; i.e., the idea that the laws
of physics are the same in any inertial frame of reference.
• We choose a clever special case to study. We will look at a collision
of two objects and we will assume that this collision is 'reversible.'
That is, we will assume that it is possible for two objects to
collide in such a way that, if we filmed the collision and played
the resulting movie backwards, what we see on the screen could
also be a real collision. In Newtonian physics, such collisions
are called elastic because energy is conserved.
Let us begin with the observation that momentum is a vector. In
Newtonian physics, the momentum points in the same direction as the
velocity vector. This follows just from symmetry considerations. It must
also be true in relativistic physics. The only special direction is the one
along the velocity vector.
It turns out that to make our argument we will have to work with at
least two dimensions of space. This is sort of like how we needed to think
about sticks held perpendicular to the direction of motion when we worked
out the time dilation effect. There is just not enough information if we
stay with only one dimension of space.
So, let us suppose that we are in a long, rectangular room. The north
and south walls are fairly close together, while the east and west walls are
far apart:
Now, suppose that we have two particles that have the same rest mass
$m_0$, and which in fact are exactly the same when they are at rest. We will
set things up so that the two particles are moving at the same speed relative
to the room, but in opposite directions.
We will also set things up so that they collide exactly in the middle of the
room, but are not moving exactly along either the north-south axis or the east-
west axis. Also, the particles will not quite collide head-on, so that one scatters
to each side after the collision. In the reference frame of the room, the collision
will look like this:
[Figure: the collision in the reference frame of the room. Labels: North, South, A before, A after, B before, B after.]
However, we will assume that the particles are nearly aligned with the
east-west axis and that the collision is nearly head-on, so that their velocities
in the north-south direction are small.
To proceed, we will analyse the collision in a different reference frame.
Suppose that one of our friends (say, Alice) is moving rapidly to the east
through the room. If she travels at the right speed she will find that, before
the collision and relative to her, particle A does not move east or west but
only moves north and south.
We wish to set things up so that the motion of particle A in Alice's
frame of reference is slow enough that we can use the Newtonian formula
p = mv for this particle in this frame of reference.
For symmetry purposes, we will have another friend Bob who travels
to the west fast enough that, relative to him, particle B only moves in the
north-south direction.
Now, suppose we set things up so that the collision is not only reversible,
but in fact looks exactly the same if we run it in reverse. That is, we suppose
that in Alice's frame of reference, the collision looks like:
[Figure: the collision as seen in Alice's frame of reference.]
where particle A has the same speed before as it does after, as does particle B.
Also, the angle is the same both before and after. Such a symmetric situation
must be possible unless there is an inherent breaking of symmetry in spacetime.
Now, the velocity of particle A in this frame is to be slow enough that its
momentum is given by the Newtonian formula $p_A = m_0 v_A$. For convenience,
we take coordinate directions x and y on the diagram in Alice's reference frame.
Its velocity in the x direction is zero, so its momentum in this direction must
also be zero.
Thus, particle A only has momentum in the y direction. As a result, the
change in the momentum of particle A is $2 m_0\, dy/dt$, where dy/dt denotes the
velocity in the y direction after the collision.
If momentum is to be conserved, the total vector momentum must be the
same before as after. That is to say, if (in Alice's frame of reference) we add
the arrows corresponding to the momentum before, and the momentum after,
we must get the same result:
For the next part of the argument notice that if, after the collision, we
observe particle B for a while, it will eventually hit the south wall. Let us call
this event Y, where B hits the south wall after the collision. The collision takes
place in the middle of the box. Event Y and the collision will be separated by
some period of time $\Delta t_B$ (as measured by Alice) and some displacement
vector $\Delta\vec{x}_B = (\Delta x_B, \Delta y_B)$ in space as measured by Alice. If the box has some
length 2L in the north-south direction, then since the collision took place in
the middle, $\Delta y_B = L$.
Also, if we trace particle B back in time before the collision, then
there was some event before the collision when it was also at the south
wall. Let us call this event X, when B was at the south wall before the
collision. By symmetry, this event will be separated from the collision by
the same $\Delta t_B$ and by $-\Delta\vec{x}_B$. The displacement $\Delta\vec{x}_B$ points in the same
direction as the momentum of particle B, since that is the direction in which
B moves. Thus, we can draw another nice right triangle:
Note that this triangle has the same angle $\theta$ as the one drawn above.
As a result, we have
$$\frac{L}{|\Delta\vec{x}_B|} = \sin\theta = \frac{p_A}{|\vec{p}_B|}.$$
Note that, since $p_A$ has no x component, I have not written it as a vector.
If this notation bothers you, just replace all my $p_A$'s with $p_{Ay}$. Here,
$|\vec{p}_B|$ is the usual length of this vector, and similarly for $\Delta\vec{x}_B$.
Technically speaking, what we will do is to rearrange this formula as
$$\vec{p}_B = \frac{(p_A)(\Delta\vec{x}_B)}{L},$$
where now we have put the direction information back in. We will then
compute $\vec{p}_B$ in the limit as the vertical velocity of particle A (and thus $p_A$, in
Alice's frame) goes to zero. In other words, we will use the idea that a slowly
moving particle (in Alice's frame) could have collided with particle B to
determine particle B's momentum.
Let us now take a moment to calculate $p_A$. In the limit where the velocity
of particle A is small, we should be able to use $p_A = m_0\, dy/dt$ after the collision.
Now, we can calculate dy/dt by using the time $\Delta t_A$ that it takes particle A to go
from the collision site (in the centre of the box) to the north wall.
In this time, it travels a distance L, so $p_A = m_0 L/\Delta t_A$.
Again, the distance is L in Alice's frame, Bob's frame, or the Box's frame
of reference since it refers to a direction perpendicular to the relative motion
of the frames.
Thus we have
$$\vec{p}_B = \frac{m_0\,\Delta\vec{x}_B}{\Delta t_A}.$$
This is a somewhat funny formula as two bits ($\vec{p}_B$ and $\Delta\vec{x}_B$) are measured
in the lab frame while another bit ($\Delta t_A$) is measured in Alice's frame. Nevertheless,
the relation is true and we will rewrite it in a more convenient form below.
Now, what we want to do is in fact to derive a formula for the momentum
of particle B. This formula should be the same whether or not the collision
actually took place.
Thus, we should be able to forget entirely about particle A and rewrite
the above expression purely in terms of things having to do with particle B.
We can do this by a clever observation.
We originally set things up in a way that was symmetric with respect to
particles A and B.
Thus, if we watched the collision from particle A's perspective, it would
look just the same as if we watched it from particle B's perspective. In
particular, we can see that the proper time $\Delta\tau$ between the collision and the
event where particle A hits the north wall must be exactly the same as the
proper time between the collision and the event where particle B hits the south
wall.
Further, recall that we are interested in the formula above only in the
limit of small $v_A$. However, in this limit Alice's reference frame coincides
with that of particle A.
As a result, the proper time $\Delta\tau$ is just the time $\Delta t_A$ measured by Alice.
Thus, we may replace $\Delta t_A$ above with $\Delta\tau$:
$$\vec{p}_B = \frac{m_0\,\Delta\vec{x}_B}{\Delta\tau}.$$
The point is that $\Delta\tau$ (proper time) is a concept we understand in any frame
of reference. In particular, we understand it in the lab frame where the two
particles (A and B) behave in a symmetric manner. Thus, $\Delta\tau$ is identical for
both particles.
Note that since $\Delta\tau$ is independent of reference frame, this statement holds
in any frame - in particular, it holds in Alice's frame. Thus, the important point
about this equation is that all of the quantities on the right hand side can be taken
to refer only to particle B!
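In particular, the expression no longer depends on particle A, so the limit
is trivial. We have:
$$\vec{p}_B = \frac{m_0\,\Delta\vec{x}_B}{\Delta\tau}.$$
Since the motion of B is uniform after the collision, we can replace this
ratio with a derivative:
$$\vec{p}_B = m_0 \frac{d\vec{x}_B}{d\tau}.$$
Thus, we have derived
$$|\vec{p}| = m_0 \left|\frac{d\vec{x}}{d\tau}\right| = \frac{m_0 v}{\sqrt{1 - v^2/c^2}},$$
the relativistic formula for momentum.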
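To get a feel for this formula, here is a minimal Python sketch that compares it with the Newtonian value $p = m_0 v$. The function names and the sample speeds are our own choices for illustration; they are not part of the derivation above.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def relativistic_momentum(m0, v):
    """Magnitude of p = m0 * v / sqrt(1 - v^2/c^2) for rest mass m0 (kg), speed v (m/s)."""
    if abs(v) >= C:
        raise ValueError("speed must be below the speed of light")
    return m0 * v / math.sqrt(1.0 - (v / C) ** 2)


def newtonian_momentum(m0, v):
    """The low-speed approximation p = m0 * v."""
    return m0 * v


if __name__ == "__main__":
    m0 = 1.0  # kg, an arbitrary illustrative mass
    for beta in (0.001, 0.1, 0.5, 0.9, 0.99):
        v = beta * C
        ratio = relativistic_momentum(m0, v) / newtonian_momentum(m0, v)
        print(f"v/c = {beta:5.3f}   p_relativistic / p_newtonian = {ratio:.6f}")
```

The ratio stays very close to 1 at everyday speeds and grows without bound as v approaches c, which is why the Newtonian formula was safe to use for the slowly moving particle A.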
Now, the form of this equation is rather suggestive. It shows that the
momentum forms the spatial components of a spacetime vector:
$$p = m_0 \frac{dx}{d\tau},$$
where x represents all of the spacetime coordinates (t, x, y, z). One is
tempted to ask, "What about the time component $m_0\, dt/d\tau$ of this vector?"
We have assumed that the momentum is conserved, and that this must
therefore hold in every inertial frame. If 3 components of a spacetime
vector are conserved in every inertial frame, then it follows that the fourth
one does as well. So, this time component does represent some conserved
quantity.
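We can get an idea of what it is by expanding the associated formula
in a Taylor series for small velocity:
$$m_0 \frac{dt}{d\tau} = m_0 \cosh\theta = \frac{m_0}{\sqrt{1 - v^2/c^2}} = m_0\left(1 + \frac{1}{2}\frac{v^2}{c^2} + \dots\right) = c^{-2}\left(m_0 c^2 + \frac{1}{2} m_0 v^2 + \dots\right).$$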
In Newtonian physics, the first term is just the mass, which is
conserved separately. The second term is the kinetic energy. So, we identify
this time component of the spacetime momentum as ($c^{-2}$ times) the energy:
$$E = c^2 m_0 \frac{dt}{d\tau} = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}}.$$
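The Taylor expansion above can be checked numerically. The following is a small sketch, with function names of our own choosing, that compares the full energy formula with the sum of rest energy and Newtonian kinetic energy.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def total_energy(m0, v):
    """E = m0 c^2 / sqrt(1 - v^2/c^2)."""
    return m0 * C**2 / math.sqrt(1.0 - (v / C) ** 2)


def low_speed_energy(m0, v):
    """First two terms of the expansion: rest energy plus (1/2) m0 v^2."""
    return m0 * C**2 + 0.5 * m0 * v**2


if __name__ == "__main__":
    m0 = 1.0  # kg, illustrative
    for beta in (0.01, 0.1, 0.5, 0.9):
        v = beta * C
        exact = total_energy(m0, v)
        approx = low_speed_energy(m0, v)
        print(f"v/c = {beta:4.2f}   relative difference = {(exact - approx) / exact:.3e}")
```

The relative difference shrinks roughly like $(v/c)^4$, as expected from the first neglected term of the series.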
In relativity, mass and energy are not conserved separately. Mass and
energy in some sense merge into a single concept 'mass-energy.'
Also, we have seen that energy and momentum fit together into a single
spacetime vector just as space and time displacements fit together into a
'spacetime displacement' vector.
Thus, the concepts of momentum and energy also merge into a single
'energy-momentum vector.'
Chapter 7
Relativity and Gravitational Field
We have finished the part of the course that is referred to as 'Special
Relativity'. Now, special relativity by itself was a real achievement. In
addition to revolutionizing our conceptions of time and space, uncovering
new phenomena, and dramatically changing our understanding of mass,
energy, and momentum, Minkowskian geometry finally gave a good picture
of how it can be that the speed of light (in a vacuum!) is the same in
all frames of reference. However, in some sense there is still a large hole
to be filled. We've talked about what happens when objects accelerate,
but we have only begun to discuss why they accelerate, in terms of why
and how various forces act on these objects.
We have the relation $E = m_0 c^2/\sqrt{1 - v^2/c^2}$, so we know that, when we feed
an object a certain amount of energy it will speed up, and when we take
energy away it will slow down. We can even use this formula to calculate
exactly how much the object will speed up or slow down. But what we
haven't talked about are the basic mechanisms that add and subtract energy
- the 'forces' themselves. Of course, physicists already had some
understanding of these forces when Einstein broke onto the scene. The
important question, of course, is whether this understanding fit well with
relativity or whether relativity would force some major change in the
understanding of the forces themselves.
Physicists in Einstein's time knew about many kinds of forces:
• Electricity.
• Magnetism.
• Gravity.
• Friction.
• One object pushing another.
• Pressure.
and so on. Now, the first two of these forces are described by Maxwell's
equations. As we have discussed, Maxwell's equations fit well with (and
even led to!) relativity. Unlike Newton's laws, Maxwell's equations are
fully compatible with relativity and require no modifications at all. Thus, we
may set these forces aside as 'complete' and move on to the others.
Let's skip ahead to the last three forces. These all have to do in the
end with atoms pushing and pulling on each other. In Einstein's time, such
things were believed to be governed by the electric forces between atoms.
So, it was thought that this was also properly described by Maxwell's
equations and would fit well with relativity.
You may have noticed that this leaves one force (gravity) as the odd
one out. Einstein wondered: how hard can it be to make gravity consistent
with relativity?
THE GRAVITATIONAL FIELD
Let's begin by revisiting the pre-relativistic understanding of gravity.
Perhaps we will get lucky and find that it too requires no modification.
Newtonian Gravity vs. relativity
Newton's understanding of gravity was as follows:
Newton's Universal Law of Gravity: Any two objects of masses $m_1$
and $m_2$ exert 'gravitational' forces on each other of magnitude
$$F = \frac{G m_1 m_2}{d^2},$$
directed toward each other, where d is the separation between the objects and
$G = 6.67 \times 10^{-11}\ \mathrm{N\,m^2/kg^2}$ is called
"Newton's Gravitational Constant." G is a kind of intrinsic measure of
how strong the gravitational force is.
It turns out that this rule is not compatible with special relativity. In
particular, having learned relativity we now believe that it should not be
possible to send messages faster than the speed of light. However, Newton's
rule above would allow us to do so using gravity. The point is that Newton
said that the force depends on the separation between the objects at this
instant.
Example: The earth is about eight light-minutes from the sun. This
means that, at the speed of light, a message would take eight minutes to
travel from the sun to the earth. However, suppose that, unbeknownst to
us, some aliens are about to move the sun.
Then, based on our understanding of relativity, we would expect it to
take eight minutes for us to find out! But Newton would have expected us
to find out instantly, because the force on the earth would shift (changing the
tides and other things).
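The "eight light-minutes" in this example is easy to check. Here is a short sketch; the sun-earth distance used is the standard mean value (one astronomical unit), which is our own input and is not quoted in the text.

```python
C = 299_792_458.0              # speed of light in m/s
SUN_EARTH_DISTANCE = 1.496e11  # metres; standard mean value, not from the text


def light_travel_time(distance_m):
    """Time in seconds for a light-speed signal to cover distance_m metres."""
    return distance_m / C


if __name__ == "__main__":
    delay = light_travel_time(SUN_EARTH_DISTANCE)
    print(f"delay = {delay:.0f} s = {delay / 60:.1f} minutes")
```

In relativity this delay is the earliest that any news of the sun's motion could reach us, whereas Newton's rule would give a delay of zero.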
The Importance of the Field
Now, it is important to understand how Maxwell's equations get
around this sort of problem. That is to say, what if the Sun were a positive
electric charge, the earth were a big negative electric charge, and they were
held together by an electromagnetic field? We said that Maxwell's
equations are consistent with relativity - so what would they tell us
happens when the aliens move the sun?
The point is that the positive charge does not act directly on the
negative charge. Instead, the positive charge sets up an electric field which
tells the negative charge how to move.
When the positive charge is moved, the electric field around it must
change, but it turns out that the field does not change everywhere at the
same time.
Instead, the movement of the charge modifies the field only where
the charge actually is. This makes a 'ripple' in the field which then moves
outward at the speed of light. In the figure below, the black circle is
centered on the original position of the charge and is of a size ct, where t
is the time since the movement began.
Thus, the basic way that Maxwell's equations get around the problem of
instant reaction is by having a field that will carry the message to the other
charge (or, say, to the planet) at a finite speed.
Oh, and remember that having a field that could carry momentum
was also what allowed Maxwell's equations to fit with momentum
conservation in relativity. What we see is that the field concept is the
essential link that allows us to understand electric and magnetic forces in
relativity.
Something like this must happen for gravity as well. Let's try to
introduce a gravitational field by breaking Newton's law of gravity up
into two parts.
The idea will again be that an object should produce a gravitational
field (g) in the spacetime around it, and that this gravitational field should
then tell the other objects how to move through spacetime. Any
information about the object causing the gravity should not reach the other
objects directly, but should only be communicated through the field.
Old: $F = \frac{G m_1 m_2}{d^2}$
New: $F_{\text{on } m_1} = m_1 g$, with $g = \frac{m_2 G}{d^2}$.
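The two-step "field" form gives exactly the same force as the old one-step law, as this minimal sketch shows. The function names are ours, and the earth's mass and radius are standard values used only for illustration.

```python
G = 6.67e-11         # Newton's gravitational constant, N m^2 / kg^2
EARTH_MASS = 5.97e24   # kg, standard value (not from the text)
EARTH_RADIUS = 6.37e6  # m, standard value (not from the text)


def newton_force(m1, m2, d):
    """Old form: F = G m1 m2 / d^2."""
    return G * m1 * m2 / d**2


def gravitational_field(source_mass, d):
    """Step 1 of the new form: the source sets up a field g = G m2 / d^2."""
    return G * source_mass / d**2


def force_from_field(test_mass, g):
    """Step 2 of the new form: the field acts on the other object, F = m1 g."""
    return test_mass * g


if __name__ == "__main__":
    g_surface = gravitational_field(EARTH_MASS, EARTH_RADIUS)
    print(f"g at the earth's surface = {g_surface:.2f} m/s^2")
    print(force_from_field(70.0, g_surface), newton_force(70.0, EARTH_MASS, EARTH_RADIUS))
```

Splitting the law this way changes no Newtonian prediction; its value is that the field g becomes something that can later be allowed to change at a finite speed.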
SOME OBSERVATIONS
We will approach General Relativity from a somewhat different point of view than your
readings do. The readings are simply stressing different aspects of the
various thoughts that were rattling around inside Albert Einstein's head
in the early 1900s. By the way, figuring out General Relativity was much harder
than figuring out special relativity.
Einstein worked out special relativity in about a year (and he did many
other things in that year). In contrast, the development of general relativity
required more or less continuous work from 1905 to 1916.
For future reference, the main ingredients of Einstein's thinking are:
• Free fall and the gravitational field.
• The question of whether light is affected by gravity.
• Further reflection on inertial frames.
Free Fall
Before going on to the other important ingredients, let's take a moment
to make a few observations about gravitational fields and to introduce
some terminology.
Notice an important property of the gravitational field. The
gravitational force on an object of mass m is given by F = mg. But, in
Newtonian physics, we also have F = ma. Thus, we have
$$a = \frac{mg}{m} = g.$$
The result is that all objects in a given gravitational field accelerate at
the same rate (if no other forces act on them). The condition where gravity
is the only influence on an object is known as "free fall." So, the
gravitational field g has a direct meaning: it gives the acceleration of "freely
falling" objects.
A particularly impressive example of this is called the 'quarter and
feather experiment.' Imagine taking all of the air out of a cylinder (to
remove air resistance which would be an extra force), and then releasing a
quarter and a feather at the same time. The feather then "drops
like a rock." In particular, the quarter and the feather fall together in
exactly the same way.
Now, people over the years have wondered if it was really true that
all objects fall at exactly the same rate in a gravitational field, or if this
was only approximately true. If it is exactly correct, they wondered why it
should be so. It is certainly a striking fact.
For example, we have seen that energy is related to mass through
$E = mc^2$. So, sometimes in order to figure out the exact mass of an object
(like a hot wall that a laser has been shining on) you have to include
some things (like heat) that we used to count separately as 'energy'. Does
this $E/c^2$ have the same effect on gravity as the more familiar notion of
mass?
In order to be able to talk about all of this without getting too
confused, people invented two distinct terms for the following two distinct
concepts:
• Gravitational mass $m_G$. This is the kind of mass that interacts
with the gravitational field. Thus, we have $F = m_G g$.
• Inertial mass $m_I$. This is the kind of mass that goes into Newton's
second law. So, we have $F = m_I a$.
Now, we can ask the question we have been thinking of in a clean
form: is it always true that gravitational mass and inertial mass are the
same? That is, do we always have $m_G = m_I$?
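Combining the two definitions gives $a = (m_G/m_I)\,g$, so the question is whether the ratio $m_G/m_I$ is the same for every object. Here is a small sketch of what that means for the quarter and the feather; the masses and the drop height are illustrative values of our own.

```python
import math


def free_fall_acceleration(m_grav, m_inertial, g=9.8):
    """Solve m_grav * g = m_inertial * a for the acceleration a."""
    return m_grav * g / m_inertial


def fall_time(height, m_grav, m_inertial, g=9.8):
    """Time to fall `height` metres from rest with no air resistance."""
    a = free_fall_acceleration(m_grav, m_inertial, g)
    return math.sqrt(2.0 * height / a)


if __name__ == "__main__":
    # If m_G = m_I for both objects, the fall times agree exactly.
    print("quarter:", fall_time(1.0, 5.67e-3, 5.67e-3))
    print("feather:", fall_time(1.0, 1.0e-5, 1.0e-5))
    # A hypothetical object with m_G one percent larger than m_I would land first.
    print("hypothetical:", fall_time(1.0, 1.01e-5, 1.0e-5))
```

Any object for which $m_G$ and $m_I$ differed would fall at a different rate, which is exactly what careful free-fall experiments look for.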
The 2nd Ingredient: The Effects of Gravity on Light
Let's leave aside for the moment further thought about fields as such
and turn to another favourite question: to what extent is light affected by
gravity?
Now, first, why do we care? Well, we built up our entire discussion of
special relativity using light rays and we assumed in the process that
light always traveled at a constant speed in straight lines! So, what if it
happens that gravity can pull on light? If so, we may have to modify
our thinking.
Clearly, there are two possible arguments:
• No. Light has no mass ($m_{\text{light}} = 0$). So, gravity cannot exert a
force on light and should not affect it.
• Yes. After all, all things fall at the same rate in a gravitational
field, even things with a very small mass. So, light should fall.
Well, we could go back and forth between these two points of view
for quite a while, but let's proceed by introducing a third argument in
order to break the tie. We'll do it by recalling that there is a certain
equivalence between energy and mass.
In fact, in certain situations, "pure mass" can be converted into "pure
energy" and vice versa. A nice example of this happens all the time in
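particle accelerators when an electron meets a positron (its 'anti-particle').
[Figure: an electron and a positron annihilate into light.]
Let us suppose that gravity does not affect light and consider the
following process:
[Figure: an electron and a positron at rest, with total energy $E_0 = 2mc^2$, fall and then make light with energy $E_1$.]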
• First, we start with an electron (mass m) and a positron (also
mass m) at rest. Thus, we have a total energy of $E_0 = 2mc^2$.
• Now, these particles fall a bit in a gravitational field. They speed
up and gain energy. We have a new, larger energy $E_1 > E_0$.
• Suppose that these two particles now interact and turn into some
light. By conservation of energy, this light has the same energy $E_1$.
• Let us take this light and shine it upwards, back to where the
particles started. (This is not hard to do - one simply puts enough
mirrors around the region where the light is created.) Since we
have assumed that gravity does not affect the light, it must still
have an energy $E_1 > E_0$.
• Finally, let us suppose that this light interacts with itself to make
an electron and a positron again. By energy conservation, these
particles must have an energy of $E_1 > E_0$.
Now, at the end of the process, nothing has changed except that we
have more energy than when we started. And, we can keep repeating this
to make more and more energy out of nothing. Just think about what this
would do, for example, to ideas about energy conservation!
We have seen some hard to believe things turn out to be true, but
such an infinite free source of energy seems especially hard to believe. This
strongly suggests that light is in fact affected by gravity in such a way
that, when the light travels upwards through a gravitational field, it loses
energy in much the same way as would a massive object.
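The bookkeeping of this thought experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: we use the Newtonian gain $2mgh$ for the falling pair (fine for small heights), and in the "light loses energy" case we assume the light gives back exactly what the pair gained. The names and numbers are our own.

```python
C = 299_792_458.0          # speed of light in m/s
ELECTRON_MASS = 9.109e-31  # kg, standard value (not from the text)


def surplus_energy(mass, height, n_cycles, light_loses_energy, g=9.8):
    """Energy left over after n_cycles of: fall, turn into light, climb, turn into a pair.

    mass   -- mass of each of the two particles (kg)
    height -- distance fallen in each cycle (m)
    """
    gain_falling = 2.0 * mass * g * height  # E_1 - E_0 for the pair
    # Assumption: if gravity acts on light, the climb costs what the fall gave.
    loss_climbing = gain_falling if light_loses_energy else 0.0
    surplus = 0.0
    for _ in range(n_cycles):
        surplus += gain_falling - loss_climbing
    return surplus


if __name__ == "__main__":
    rest_energy = 2.0 * ELECTRON_MASS * C**2  # E_0 = 2 m c^2
    print("E_0 =", rest_energy, "J")
    print("light unaffected :", surplus_energy(ELECTRON_MASS, 10.0, 1000, False), "J")
    print("light loses energy:", surplus_energy(ELECTRON_MASS, 10.0, 1000, True), "J")
```

With light unaffected by gravity the surplus grows with every cycle; once the light is allowed to lose energy on the way up, the books balance.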
Gravity, Light, Time, and all That
In the previous subsection we argued that light is in fact affected
by gravity. In particular, when light travels upwards through a
gravitational field, it loses just as much energy as would a massive
object. Now, what happens to light when it loses energy? Well, it
happens that light comes in little packages called 'photons.' This was
only beginning to be understood when Einstein started thinking about
gravity, but it is now well established and it will be a convenient crutch
for us to use in assembling our own understanding of gravity. The amount
of energy in a beam of light depends on how many photons are in the
beam, and on how much energy each photon has separately. You can
see that there are two ways for a beam of light to lose energy. It can
either actually lose photons, or each photon separately can lose energy.
As the light travels up through the gravitational field, it should
lose energy continuously. Losing photons would not be a continuous,
gradual process - it would happen in little discrete steps, one step each
time a photon was lost. So, it is more likely that light loses energy in a
gravitational field by each photon separately losing energy.
How does this work? The energy of a single photon depends on
something called the frequency of the light. The frequency is just a measure
of how fast the wave oscillates. The energy E is in fact proportional to the
frequency f, through something called "Planck's constant" (h). In other
words, E = hf for a photon.
So, as it travels upwards in our gravitational field, this means that
our light wave must lose energy by changing frequency and oscillating
more slowly. It may please you to know that, long after this effect was suggested
by Einstein, it was measured experimentally. The experiment was done by
Pound and Rebka at Harvard in 1959.
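To see how small this effect is, here is a rough sketch. It assumes, to first order, that a photon of energy E loses energy like an object of mass $E/c^2$, that is $(E/c^2)\,g h$ when climbing a height h; since E = hf, the fractional frequency change is then $g h/c^2$. The tower height of 22.5 m and the function names are illustrative choices of ours.

```python
C = 299_792_458.0     # speed of light in m/s
PLANCK_H = 6.626e-34  # Planck's constant in J s


def photon_energy(frequency):
    """E = h f for a single photon."""
    return PLANCK_H * frequency


def fractional_frequency_shift(height, g=9.8):
    """First-order fractional decrease g*h/c^2 for light climbing `height` metres."""
    return g * height / C**2


def frequency_at_top(f_bottom, height, g=9.8):
    """Frequency of the light after it has climbed `height` metres."""
    return f_bottom * (1.0 - fractional_frequency_shift(height, g))


if __name__ == "__main__":
    tower = 22.5  # metres, an illustrative tower height
    print(f"fractional shift over {tower} m: {fractional_frequency_shift(tower):.3e}")
    print(f"energy of a 1e15 Hz photon: {photon_energy(1.0e15):.3e} J")
```

A shift of a few parts in $10^{15}$ over a laboratory-sized tower explains why the measurement had to wait for very precise frequency standards.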
Now, a light wave is a bunch of wave crests and wave troughs that
chase each other around through spacetime. Let's draw a spacetime
diagram showing the motion of, say, a bunch of the wave crests. Note
that, if the wave oscillates more slowly at the top, then the wave crests
must be farther apart at the top than at the bottom.
But, isn't each wave crest supposed to move at the same speed c in a
vacuum?
It looks like the speed of light gets faster and faster as time passes!
Perhaps we have done something wrong? By the way, do you remember
any time before when we saw light doing weird stuff?
Nothing is really changing with time, so each crest should act the same
as the one before and move at the same speed, at least when the wave is at the
same place.
Let's choose to draw this speed as a 45° line as usual. In that case, our
diagram must look like the one below.
However, we know both from our argument above and from Pound
and Rebka's experiment that the time between the wave crests is larger at
the top. So, what looks like the same separation must actually represent a
greater proper time at the top.
This may seem very odd. Should we believe that time passes at a faster
rate higher up? Note that we are really comparing time as measured by
two different clocks, one far above the other. Also note that these clocks
have no relative motion.
In fact, this does really occur! The Pound and Rebka experiment is an
observation of this kind, but a more direct experimental verification was made by
precise atomic clocks maintained by the National Bureau of Standards in the
1960s. They kept one clock in Washington D.C. (essentially at sea level) and
one clock in Denver (much higher up). The one in Denver measured more
time to pass (albeit only by a very small amount, one part in $10^{15}$!).
So, we have clocks with no relative motion that run at different rates. Is
this absurd?
Well, no, and actually it should sound somewhat familiar. Do you
recall seeing something like this before? (Hint: remember the accelerating
rocket?)
Ah, yes. This sounds very much like the phenomenon in which clocks
at the front and back of a uniformly accelerating rocket ship experienced no relative
motion but ran at different rates. If one works out the
math based on our discussion of energy and frequency above one finds
that, at least over small distances, a gravitational field is not just
qualitatively, but also quantitatively like an accelerating rocket ship with
a = g.
Gravity and Locality
But, what if we were not allowed to look at the Sun? What if we were
only allowed to make measurements here in this room? [Such
measurements are called local measurements.] What objects in this room
are in inertial frames?
How do we know? Should we drop a sequence of rocks, as we would
in a rocket ship?
If only local measurements are made, then it is the state of free-fall
that is much like being in an inertial frame. In particular, a person in free-
fall in a gravitational field feels just like an inertial observer!
Note how this fits with our observation about clocks higher up running
faster than clocks lower down. We said that this exactly matches the results
for an accelerating rocket with a = g. As a result, things that accelerate relative
to the lab will behave like things that accelerate relative to the rocket.
In particular, it is the freely falling frame that accelerates downward at g
relative to the lab, while it is the inertial frame that accelerates downward at g
relative to the rocket! Thus, clocks in a freely falling frame act like those in
an inertial frame, and it is in the freely falling frame that clocks with no relative
motion in fact run at the same rate!!
Similarly, a lab on the earth and a lab in a rocket (with its engine on,
and, say, accelerating at $10\ \mathrm{m/s^2}$) are very similar. They have the following
features in common:
[Figure: two labs, Lab 1 and Lab 2, one on the earth and one in a rocket.]
• Clocks farther up run faster in both cases, and by the same
amount!
• If the non-gravitational force on an object is zero, the object "falls"
relative to the lab at a certain acceleration that does not depend
on what the object is!
• If you are standing in such a lab, you feel exactly the same in
both cases.
Einstein's guess (insight?) was that, in fact:
Under local measurements, a gravitational field is completely
equivalent to an acceleration.
This statement is known as The Equivalence Principle.
In particular, gravity has NO local effects in a freely falling reference
frame. This idea turns out to be useful even in answering non-relativistic
problems. For example, what happens when we drop a hammer held
horizontally? Does the heavy end hit first, or does the light end?
So then, what would be the best way to draw a spacetime diagram
for a tower sitting on the earth? The answer of course is the frame that
acts like an inertial frame. In this case, this is the freely falling reference
frame. We have learned that, in such a reference frame, we can ignore
gravity completely.
Now, how much sense does the above picture really make? Let's make
this easy, and suppose that the earth were really big.... it turns out that, in
this case, the earth's gravitational field would be nearly constant, and would
weaken only very slowly as we go upward. Does this mesh with the diagram
above?
Not really. We said that the diagram above is effectively in an inertial
frame. However, in this case we know that, if the distance between the bottom
and top of the tower does not change, then the bottom must accelerate at a
faster rate than the top does! But we just said that we want to consider a constant
gravitational field.
Side note: No, it does not help to point out that the real earth's
gravitational field is not constant.
How Local?
Well, we do have a way out of this: We realized before that the idea
of freely falling frames being like inertial frames was not universally true.
After all, freely falling objects on opposite sides of the earth do accelerate
towards each other. In contrast, any two inertial objects experience zero
relative acceleration.
However, we did say that inertial and freely falling frames are the
same 'locally.'
Let's take a minute to refine that statement.
How local is local? Well, this is much like the question of "when is a
velocity small compared to the speed of light?" What we found before
was that Newtonian physics held true in the limit of small velocities. In
the same way, our statement that inertial frames and freely-falling frames
are similar is supposed to be true in the sense of a limit. This comparison
becomes more and more valid the smaller a region of spacetime we use to
compare the two.
Nevertheless, it is still meaningful to ask how accurate this comparison
is. In other words, we will need to know exactly which things agree in the
above limit.
To understand Einstein's answer, let's consider a tiny box of spacetime
from our diagram above.
[Figure: a tiny box of spacetime containing the event at which the acceleration was matched to g.]
For simplicity, consider a 'square' box of height $\epsilon$ and width $c\epsilon$. This
square should contain the event at which we matched the "gravitational field
g" to the acceleration of the rocket.
In this context, Einstein's proposal was that
Errors in dimensionless quantities like angles, v/c, and boost
parameters should be proportional to $\epsilon^2$.
Let us motivate this proposal through the idea that the equivalence
principle should work "as well as it possibly can." Suppose for example
that the gravitational field is really constant, meaning that static observers
at any position measure the same gravitational field g.
We then have the following issue: when we match this gravitational
field to an accelerating rocket in flat spacetime, do we choose a rocket
with $a_{\text{top}} = g$ or one with $a_{\text{bottom}} = g$? Any rigid rocket will have a different
acceleration at the top than it does at the bottom. So, what we mean by
saying that the equivalence principle should work 'as well as it possibly
can' is that it should predict any quantity that does not depend on whether
we match a = g at the top or at the bottom, but it will not directly predict
any quantity that would depend on this choice.
To see how this translates to the $\epsilon^2$ criterion above, let us consider a
slightly simpler setting where we have only two freely falling observers.
Again, we will study such observers inside a small box of spacetime of
dimensions $\delta t = \epsilon$. Let's assume that they are located on opposite sides of
the box, separated by a distance $\delta x$.
In general, we have seen that two freely falling observers will accelerate
relative to each other. Let us write a Taylor series expansion for this
relative acceleration a as a function of the separation $\delta x$. In general, we
have
$$a(\delta x) = a_0 + a_1\,\delta x + O(\delta x^2).$$
But, we know that this acceleration vanishes in the limit $\delta x \to 0$ where
the two observers have zero separation. As a result, $a_0 = 0$ and for small
$\delta x$ we have the approximation $a \approx a_1\,\delta x$.
Now, if this were empty space with no gravitational field, everything
would be in a single inertial frame. As a result there would be no relative
acceleration and, if we start the observers at rest relative to each other,
their relative velocity would always remain zero. This is an example of an
error we would make if we tried to use the equivalence principle in too
strong of a fashion. What is the correct answer? Well, since the box has
width $\delta x = c\epsilon$, the relative acceleration is $a_1\,\delta x = a_1 c\epsilon$, and they
accelerate away from each other for a time (within our box) $\delta t = \epsilon$. As a
result, they attain a relative velocity of $v = a_1 c\epsilon^2$. But since our inertial
frame model would have predicted v = 0, the error in v/c is $a_1\epsilon^2$. Other
examples turn out to work in much the
same way, and this is why Einstein made the proposal above.
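The scaling with $\epsilon^2$ can be made concrete with a short sketch. The coefficient $a_1$ below is a hypothetical value of our own: for two observers separated vertically near the earth's surface, Newtonian gravity gives a relative acceleration per unit separation of about $2g/R_{\text{earth}}$, roughly $3 \times 10^{-6}\ \mathrm{s^{-2}}$.

```python
C = 299_792_458.0  # speed of light in m/s


def relative_velocity(a1, eps):
    """v = a1 * c * eps^2: relative speed built up inside a box of duration eps (s).

    a1 is the relative acceleration per unit separation, in 1/s^2.
    """
    return a1 * C * eps**2


def error_in_v_over_c(a1, eps):
    """Error made by treating the freely falling frame as exactly inertial."""
    return relative_velocity(a1, eps) / C


if __name__ == "__main__":
    a1 = 3.0e-6  # 1/s^2, hypothetical value of order 2 g / R_earth
    for eps in (1.0, 0.5, 0.25):
        print(f"eps = {eps:4.2f} s   error in v/c = {error_in_v_over_c(a1, eps):.3e}")
```

Halving the size of the box cuts the error by a factor of four, which is the sense in which the comparison with an inertial frame improves as the region shrinks.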
To summarize: what we have found is that locally a freely falling
reference frame is almost the same as an inertial frame. If we think
about a freely falling reference frame as being exactly like an inertial
frame, then we make a small error in computing things. The
fractional error is proportional to $\epsilon^2$, where $\epsilon$ is the size of the
spacetime region needed to make the measurement.
The factor of proportionality is called R after the mathematician
Riemann. Note that R is not a radius. Since the error in an angle $\theta$ is $R\epsilon^2$,
R has dimensions of (length)$^{-2}$.
GOING BEYOND LOCALITY
We have seen that, locally, a freely falling frame in a gravitational field acts
like an inertial frame does in the absence of gravity. However, we also saw
that freely falling frames and inertial frames are not exactly the same if
they are compared over any bit of spacetime of finite size. No matter how
small of a region of spacetime we consider, we always make some error if
we interpret a freely falling frame as an inertial frame. So, since any real
experiment requires a finite piece of spacetime, how can our local principle
be useful in practice?
The answer lies in the fact that we were able to quantify the error
that we make by pretending that a freely falling frame is an inertial frame.
We found that if we consider a bit of spacetime of size $\epsilon$, then the error in
dimensionless quantities like angle or velocity measurements is proportional
to $\epsilon^2$. Note
that ratios of lengths ($L_1/L_2$) or times ($T_1/T_2$) are also dimensionless. In
fact, an angle is nothing but a ratio of an arc length to a radius ($\theta = s/r$)!
So, this principle should also apply to ratios of lengths and/or times:
$$\delta\left(\frac{T_1}{T_2}\right) \propto \epsilon^2,$$
where $\delta$ denotes the error.
Let me pause here to say that the conceptual setup with which we
have surrounded this equation is much like what we find in calculus. In calculus,
we learned that locally any curve was essentially the same as a straight
line. Over a region of finite size, curves are generally not straight lines.
However, the error we make by pretending a curve is straight over a small
finite region is small. Calculus is the art of carefully controlling this error
to build up curves out of lots of tiny pieces of straight lines. Similarly, the
main idea of general relativity is to build up a gravitational field out of
lots of tiny pieces of inertial frames. Suppose, for example, that we wish to
compare clocks at the top and bottom of a tall tower. We begin by breaking
up this tower into a large number of short towers, each of size $\Delta l$.
If the tower is tall enough, the gravitational field may not be the same at
the top and bottom - the top might be enough higher up that the gravitational
field is measurably weaker.
So, in general each little tower (0, 1, 2, ...) will have a different value of the
gravitational field $g$ ($g_0$, $g_1$, $g_2$, ...). If $l$ is the distance of any given tower from
the bottom, we might describe this by a function $g(l)$.
A Tiny Tower
Let's compare the rates at which clocks run at the top and bottom of one
of these tiny towers. We will try to do this by using the fact that a freely
falling frame is much like an inertial frame. Of course, we will have to keep
track of the error we make by doing this.
In any accelerating rocket, the front and back actually do agree about
simultaneity. As a result, all of our clocks in the towers will also agree
about simultaneity. Thus, we can summarize all of the interesting
information in a 'rate function' $\rho(l)$ which tells us how fast the clock at
position $l$ runs compared to the clock at position zero:
$$\rho(l) = \frac{\Delta\tau_l}{\Delta\tau_0}.$$
We wish to consider a gravitational field that does not change with
time, so that $\rho$ is indeed a function only of $l$ and not of $t$. So, let us model
our tiny tower as a rigid rocket accelerating through an inertial frame. A
spacetime diagram drawn in the inertial frame is shown below.
Now, the tiny tower had some acceleration $g$ relative to freely falling
frames. Let us suppose that we match this to the proper acceleration $\alpha$ of the
back of the rocket.
In this case, the back of the rocket will follow a worldline that remains a
constant proper distance $d = c^2/\alpha$ from some fixed event.
Note that the top of the rocket remains a constant distance $d + \Delta l$ from
this event. As a result, the top of the rocket has a proper acceleration
$\alpha_{top} = \frac{c^2}{d + \Delta l}$. As we have learned, this means that the clocks at the
top and bottom run at different rates:
$$\frac{\Delta\tau_{top}}{\Delta\tau_{bottom}} = \frac{\alpha_{bottom}}{\alpha_{top}} = \frac{d + \Delta l}{d} = 1 + \frac{\Delta l}{d}.$$
In terms of our rate function, this is just
$$\frac{\rho(l + \Delta l)}{\rho(l)} = \frac{\rho(l) + \Delta\rho}{\rho(l)} = 1 + \frac{\Delta\rho}{\rho(l)}.$$
Thus, we have
$$\frac{\Delta\rho}{\rho(l)} = \frac{\Delta l}{d} = \frac{\alpha\,\Delta l}{c^2}.$$
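To get a feeling for the sizes involved, here is a minimal numerical sketch of the last relation. The values of the proper acceleration and of the tower height are hypothetical, chosen only for illustration, and the function names are not from the text.

```python
# Rounded speed of light, the same value used later for the GPS estimate.
C = 3.0e8  # m/s

def pivot_distance(alpha):
    """Proper distance d = c^2 / alpha from the back of the rocket to the fixed event."""
    return C**2 / alpha

def fractional_rate_difference(alpha, delta_l):
    """Delta rho / rho = delta_l / d for a rigid rocket of length delta_l."""
    return delta_l / pivot_distance(alpha)

alpha = 9.8      # m/s^2, hypothetical proper acceleration of the back of the rocket
delta_l = 1.0    # m, hypothetical height of the tiny tower

d = pivot_distance(alpha)
frac = fractional_rate_difference(alpha, delta_l)
print(f"d = {d:.3e} m")
print(f"delta_rho / rho = {frac:.3e}")
```

For an everyday acceleration the distance $d$ is enormous compared with any tower, which is why the fractional difference in clock rates over one tiny tower is so small.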
Now, how much of an error would we make if we use this expression
for our tiny tower in the gravitational field? Well, the above is in fact a
fractional change in a time measurement. So, the error must be of size
$\Delta l^2$. So, for our tower case, we have
$$\frac{\Delta\rho}{\rho(l)} = \frac{g(l)\,\Delta l}{c^2} + k\,(\Delta l)^2$$
for some number $k$. Here, we have replaced $\alpha$ with $g$, since we matched
the acceleration $g(l)$ of our tower (relative to freely falling frames) to the
proper acceleration $\alpha$.
Actually, we might have figured out the error directly from the expression
above. The error can be seen in the fact that the acceleration does not
change in the same way from the bottom of the tower to the top of the
tower as it did from the bottom of the rocket to the top of the rocket. So,
when we match the rocket to the tower, we must choose whether to match the
accelerations at the bottom or at the top. The equivalence principle directly
predicts only quantities that are independent of such matching details. So
what is the difference between these two options? Well, the difference in
the accelerations is just $\Delta\alpha = \frac{d\alpha}{dl}\,\Delta l$.
Note that $\alpha$ is already multiplied by $\Delta l$ in the expression above. This means
that, if we were to change the value of $\alpha$ by $\frac{d\alpha}{dl}\,\Delta l$, we would indeed create a
term of the form $k(\Delta l)^2$! So, we see again that this term really does capture
well all of the errors we might possibly make in matching a freely falling
frame to an inertial frame.
Let us write our relation above as
$$\frac{\Delta\rho}{\Delta l} = \frac{g(l)\,\rho(l)}{c^2} + k\,\rho(l)\,\Delta l.$$
As in calculus, we wish to consider the limit as $\Delta l \to 0$. In this case,
the left hand side becomes $\frac{d\rho}{dl}$. On the right hand side, the first term does
not depend on $\Delta l$ at all, while the second term vanishes in this limit. Thus,
we obtain
$$\frac{d\rho}{dl} = \frac{g(l)\,\rho(l)}{c^2}.$$
Note that the term containing $k$ (which encodes our error) has
disappeared entirely. We have managed to use our local matching of freely
falling and inertial frames to make an exact statement, not directly about
$\rho(l)$, but about its derivative $\frac{d\rho}{dl}$.
The Tall Tower
Of course, we have still not answered the question about how the
clocks actually run at different heights in the tower. To do so, we need to
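solve the equation for $\rho(l)$. We can do this by dividing both sides by $\rho$,
multiplying by $dl$, and integrating:
$$\int_{\rho(0)}^{\rho(l)} \frac{d\rho}{\rho} = \int_0^l \frac{g(l')}{c^2}\,dl'.$$
Now, looking at our definition above we find that $\rho(0) = 1$. Thus, we
have
$$\int_{\rho(0)}^{\rho(l)} \frac{d\rho}{\rho} = \ln\rho(l) - \ln 1 = \ln\rho(l),$$
so that
$$\ln\rho(l) = \int_0^l \frac{g(l')}{c^2}\,dl', \qquad\text{or}\qquad \rho(l) = \exp\left(\int_0^l \frac{g(l')}{c^2}\,dl'\right).$$
This expression is the exact relation between clocks at different heights $l$
in a gravitational field. One important property of this formula is that
the factor inside the exponential is always positive. As a result, we find that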
clocks higher up in a gravitational field always run faster, regardless of whether
the gravitational field is weaker or stronger higher up!
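The way the tall tower is built up from tiny towers can be sketched numerically. The code below multiplies the per-tower factors $1 + g(l)\,\Delta l/c^2$ and compares the product with the exponential of the integral. The field $g(l)$, the tower height, and the units (in which $c = 1$, so that the effect is large enough to see) are all hypothetical choices made for illustration.

```python
import math

def g(l):
    """Hypothetical field that weakens with height (units with c = 1)."""
    return 0.5 / (1.0 + l) ** 2

def g_integral(l):
    """Integral of g from 0 to l, worked out by hand for the field above."""
    return 0.5 * (1.0 - 1.0 / (1.0 + l))

def rate_from_tiny_towers(height, n, c=1.0):
    """Multiply the factors (1 + g(l) * dl / c^2) for n tiny towers of size dl."""
    dl = height / n
    rho = 1.0                      # rho(0) = 1 by definition
    for i in range(n):
        l = i * dl                 # height of the bottom of tower number i
        rho *= 1.0 + g(l) * dl / c**2
    return rho

def rate_exact(height, c=1.0):
    """rho = exp( integral of g / c^2 from 0 to height )."""
    return math.exp(g_integral(height) / c**2)

exact = rate_exact(2.0)
for n in (10, 100, 1000):
    approx = rate_from_tiny_towers(2.0, n)
    print(n, approx, abs(approx - exact))
```

Each tiny tower contributes an error of order $(\Delta l)^2$, and there are of order $1/\Delta l$ towers, so the total error shrinks roughly in proportion to $\Delta l$ as the towers are made smaller.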
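Note that, due to the properties of exponential functions, we can also
write this as a relation between clocks at any two heights $a$ and $b$:
$$\frac{\Delta\tau_b}{\Delta\tau_a} = \frac{\rho(b)}{\rho(a)} = \exp\left(\int_a^b \frac{g(l)}{c^2}\,dl\right).$$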
Gravitational Time Dilation Near the Earth
The effect described by the relation above is known as gravitational time
dilation. There are a couple of interesting special cases of this effect that
are worth investigating in detail. The first is a uniform gravitational field
in which g(/) is constant.
This is not in fact the same as a rigid rocket accelerating through an
inertial frame, as the acceleration is actually different at the top of the
rigid rocket than at the bottom.
Still, in a uniform gravitational field with $g(l) = g$ the integral is easy
to do and we get just:
$$\frac{\Delta\tau_l}{\Delta\tau_0} = e^{gl/c^2}.$$
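As a rough check of how the uniform field differs from the rigid rocket, the sketch below compares the two fractional rate excesses, $e^{gl/c^2} - 1$ and $gl/c^2$. The value of $g$ and the heights are hypothetical and chosen only for illustration.

```python
import math

C = 3.0e8          # m/s, rounded speed of light
G_UNIFORM = 9.8    # m/s^2, hypothetical uniform field strength

def uniform_field_excess(height, g=G_UNIFORM):
    """exp(g*l/c^2) - 1: fractional rate excess at height l in a uniform field."""
    return math.expm1(g * height / C**2)

def rigid_rocket_excess(height, alpha=G_UNIFORM):
    """(d + l)/d - 1 = alpha*l/c^2 for a rigid rocket with d = c^2/alpha."""
    return alpha * height / C**2

for height in (1.0, 1.0e3, 1.0e16):
    print(height, uniform_field_excess(height), rigid_rocket_excess(height))
```

The two agree to high accuracy for any tower we could build, and differ noticeably only over distances comparable to $c^2/g$.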
In the uniform case, the difference in clock rates grows exponentially with
distance. The other interesting case to consider is something that describes
(to a good approximation) the gravitational field near the earth. We have
seen that Newton's law of gravity is a pretty good description of gravity
near the earth, so we should be able to use the Newtonian form of the
gravitational field:
$$g(r) = \frac{m_E G}{r^2},$$
where $r$ is the distance from the centre of the earth. Let us refer to the
radius of the earth as $r_0$. For this case, it is convenient to compare the
rate at which some clock runs at radius $r$ to the rate at which a clock runs
on the earth's surface (i.e., at $r = r_0$).
Since $\int_{r_0}^{r} r'^{-2}\,dr' = r_0^{-1} - r^{-1}$, we have
$$\frac{\Delta\tau(r)}{\Delta\tau(r_0)} = \exp\left[\frac{m_E G}{c^2}\left(\frac{1}{r_0} - \frac{1}{r}\right)\right].$$
Here, it is interesting to note that the $r$ dependence drops out as $r \to \infty$,
so that the gravitational time dilation factor between the earth's surface (at $r_0$)
and infinity is actually finite. The result is
$$\frac{\Delta\tau(\infty)}{\Delta\tau(r_0)} = e^{m_E G/(r_0 c^2)}.$$
So, time is passing more slowly for us here on earth than it would be
if we were living far out in space. By how much? Well, we just need to
put in some numbers for the earth. We have
$$m_E = 6\times10^{24}\ \mathrm{kg},$$
$$G = 6\times10^{-11}\ \mathrm{N\,m^2/kg^2},$$
$$r_0 = 6\times10^{6}\ \mathrm{m}.$$
Putting all of this into the above formula gives a factor of about
$e^{\frac{2}{3}\times10^{-9}}$. Now, how big is this? Well, here it is useful to use the Taylor
series expansion $e^x = 1 + x\ +$ small corrections for small $x$. We then have
$$\frac{\Delta\tau(\infty)}{\Delta\tau(r_0)} \approx 1 + \frac{2}{3}\times10^{-9}.$$
This means that time passes more slowly for us than it does far away
by several parts in $10^{10}$, that is, by a few parts in ten billion! This is an
incredibly small amount - one that can easily go unnoticed. However, as mentioned
earlier, the National Bureau of Standards was in fact able to measure this
back in the 1960s, by comparing very accurate clocks in Washington, D.C.
with very accurate clocks in Denver! Their results were of just the right
size to verify the prediction above.
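The size of the exponent for the earth can be checked in a few lines, using the rounded values of the earth's mass, Newton's constant and the earth's radius quoted above, together with the rounded speed of light $c = 3\times10^8$ m/s. The variable names are illustrative only.

```python
import math

M_E = 6e24    # kg, rounded mass of the earth
G = 6e-11     # N m^2 / kg^2, rounded Newton constant
R_0 = 6e6     # m, rounded radius of the earth
C = 3e8       # m/s, rounded speed of light

# Exponent in the time dilation factor between the surface and infinity.
x = M_E * G / (R_0 * C**2)

# expm1(x) = e^x - 1, evaluated without losing precision for tiny x.
exact_excess = math.expm1(x)

print(f"exponent x        = {x:.4e}")
print(f"e^x - 1           = {exact_excess:.4e}")
print(f"Taylor correction = {exact_excess - x:.1e}")
```

The difference between $e^x - 1$ and $x$ is of order $x^2$, which is why keeping only the first term of the Taylor series is harmless here.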
In fact, there is an even more precise version of this experiment that
is going on right now - constantly verifying Einstein's prediction every
day! It is called the "Global Positioning System" (GPS). Perhaps you have
heard of it?
The Global Positioning System
The Global Positioning System is a setup that allows anyone, with
the aid of a small device, to tell exactly where they are on the earth's surface.
It is made up of a number of satellites in precise, well-known orbits around
the earth. Each of these satellites contains a very precise clock and a
microwave transmitter.
Each time the clock 'ticks' (millions of times every second!) it sends
out a microwave pulse which is 'stamped' with the time and the ID of that
particular satellite.
A hand-held GPS locator then receives these pulses. Because it is closer
to some satellites than to others, the pulses it receives take less time to
reach it from some satellites than from others. The result is that the pulses it
receives at a given instant are not all stamped with the same time.
The locator then uses the differences in these time-stamps to figure
out which satellites it is closest to, and by how much. Since it knows the orbits
of the satellites very precisely, this tells the device exactly where it itself is
located. This technology allows the device to pinpoint its location on the earth's
surface to within a one meter circle.
To achieve this accuracy, the clocks in the satellites must be very
precise, and the time stamps must be very accurate. In particular, they must be
much more accurate than one part in ten billion.
If they were off by that much, then every second the time stamps would
become off by another $10^{-10}$ seconds. But, in this time, microwaves (or light)
travel a distance $(3\times10^{8}\ \mathrm{m/s})(10^{-10}\ \mathrm{s}) = 3\times10^{-2}\ \mathrm{m} = 3$ cm and the GPS
locator would think it was 'drifting away' at 3 cm/sec. While this is not very
fast, it would add up over time.
This drift rate is about 108 m/hr, which would already spoil the accuracy of the
GPS system.
Over long times, the distance becomes even greater. The drift rate can
also be expressed as about 2.6 km/day or roughly 950 km/year. So, after one year,
a GPS device in Syracuse, NY might think that it is far beyond Philadelphia!
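The drift arithmetic is easy to reproduce. The sketch below assumes a fractional timing error of exactly $10^{-10}$, 3600 seconds per hour and a 365-day year; the variable names are illustrative.

```python
C = 3e8                      # m/s, rounded speed of light
FRACTIONAL_ERROR = 1e-10     # one part in ten billion

# Apparent drift in position accumulated per unit of elapsed time.
drift_per_second = C * FRACTIONAL_ERROR      # metres per second
drift_per_hour = drift_per_second * 3600     # metres per hour
drift_per_day = drift_per_hour * 24          # metres per day
drift_per_year = drift_per_day * 365         # metres per year

print(f"{drift_per_second * 100:.1f} cm/s")
print(f"{drift_per_hour:.0f} m/hr")
print(f"{drift_per_day / 1000:.2f} km/day")
print(f"{drift_per_year / 1000:.0f} km/year")
```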
By the way, since the GPS requires this incredible precision, you might
ask if it can measure the effects of regular speed-dependent special relativity
time dilation as well (since the satellites are in orbit and are therefore
'moving.') The answer is that it can. In fact, for the particular satellites
used in the GPS system, these speed-dependent effects turn out to be of a
comparable size to the gravitational time dilation effect.
Note that these effects actually go in opposite directions: the gravity
effect makes the higher (satellite) clock run fast while the special relativity
effect makes the faster (satellite) clock run slow.
Which effect is larger turns out to depend on the particular orbit.
Low orbits (like that of the space shuttle) are higher speed, so in this case
the special relativity effect dominates and the orbiting clocks run more
slowly than on the earth's surface.
High orbits (like that of the GPS satellites) are lower speed, so the
gravity effect wins and their clocks run faster than clocks on the earth's
surface. For the case of GPS clocks, the special relativity effect means
that the actual time dilation is somewhat less than the purely
gravitational effect: the speed-dependent slowing offsets roughly one sixth of
the gravitational speed-up.
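The competition between the two effects can be sketched with Newtonian circular orbits. The code below is only an estimate under stated assumptions: it keeps the first term of the Taylor expansion of the gravitational factor, uses $v^2 = m_E G/r$ for the orbital speed, ignores the rotation of the earth, and uses the rounded earth values quoted earlier. The two orbital radii are assumed round numbers, not values from the text.

```python
M_E = 6e24    # kg, rounded mass of the earth
G = 6e-11     # N m^2 / kg^2, rounded Newton constant
R_0 = 6e6     # m, rounded radius of the earth
C = 3e8       # m/s, rounded speed of light

def gravitational_shift(r):
    """Fractional speed-up of a clock at radius r relative to the surface (first order)."""
    return (M_E * G / C**2) * (1.0 / R_0 - 1.0 / r)

def velocity_shift(r):
    """Fractional slow-down -v^2/(2 c^2) for a circular orbit with v^2 = G m_E / r."""
    return -M_E * G / (2.0 * r * C**2)

def net_shift(r):
    """Sum of the two effects; positive means the orbiting clock runs fast."""
    return gravitational_shift(r) + velocity_shift(r)

# Assumed orbital radii, for illustration only.
orbits = {"low orbit (6.3e6 m)": 6.3e6, "high GPS-like orbit (2.66e7 m)": 2.66e7}
for name, r in orbits.items():
    print(name, gravitational_shift(r), velocity_shift(r), net_shift(r))
```

In this simple model the two effects cancel exactly for an orbit of radius $\frac{3}{2} r_0$; lower orbits have slow clocks and higher orbits have fast ones.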
THE MORAL OF THE STORY
We were able to figure out how clocks run at different heights in a
gravitational field. We have also seen how important this is for the running
of things like GPS. But, what does all of this mean? And, why is this often
considered a new subject (called 'General Relativity'), different from our old
friend Special Relativity?
Local Frames vs. Global Frames
Let us briefly retrace our logic. While thinking about various frames of
reference in a gravitational field, we discovered that freely falling reference
frames are useful. In fact, they are really the most useful frames of reference,
as they are similar to inertial frames. This fact is summarized by the equivalence
principle which says "freely falling frames are locally equivalent to inertial
frames."
The concept of these things being locally equivalent is a subtle one, so
let me remind you what it means. The idea is that freely falling reference frames
are indistinguishable from inertial reference frames so long as we are only
allowed to perform experiments in a tiny region of spacetime.
More technically, suppose that we make the mistake of pretending that
a freely falling frame actually is an inertial frame of special relativity, but that
we limit ourselves to measurements within a region of spacetime of size $\epsilon$.
When we then go and predict the results of experiments, we will make
small errors in, say, the position of objects. However, these errors will be
very small when $\epsilon$ is small; in fact, the per cent error will go to zero like $\epsilon^2$.
The same sort of thing happens in calculus. There, the corresponding
statement is that a curved line is locally equivalent to a straight line.
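The calculus version of this statement can be checked numerically. The sketch below replaces a curve by its tangent line and measures the error a distance $\epsilon$ away; the choice of the sine curve and of the point $x_0 = 1$ is arbitrary and serves only as an illustration.

```python
import math

def tangent_line_error(f, dfdx, x0, eps):
    """Error made at x0 + eps by replacing the curve f with its tangent line at x0."""
    straight = f(x0) + dfdx(x0) * eps
    return abs(f(x0 + eps) - straight)

# Halving eps should reduce the error by about a factor of four.
for eps in (0.1, 0.05, 0.025):
    err = tangent_line_error(math.sin, math.cos, 1.0, eps)
    print(eps, err, err / eps**2)
```

The last column stays nearly constant, which is the sense in which the error goes to zero like $\epsilon^2$.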
Anyway, the important point is that we would make an error by pretending
that freely falling frames are exactly the same as inertial frames. Physicists
say that the two are locally equivalent, but are not "globally" equivalent. The
term 'global' (from globe, whole, etc.) is the opposite of local and refers to the
frame everywhere (as opposed to just in a small region).
So, if freely falling frames are not globally inertial frames, then where
are the inertial frames? They cannot be the frames of reference that are attached
to the earth's surface. After all, if a frame is globally like an inertial frame
then it must also be like an inertial frame locally. However, frames tied to the
surface of the earth are locally like uniformly accelerated frames, not inertial
frames.
But, there are really not any other frames left to consider. To match an
inertial frame locally requires free fall, but that will not let us match globally.
We are left with the conclusion that:
In a generic gravitational field, there is no such thing as a global inertial
frame.
One can take various perspectives on this, but the bottom line is that we
(following Einstein) merely assumed that the speed of light was constant in
all (globally) inertial frames of reference. However, no such reference frame
will exist in a generic gravitational field.
And what if we retreat to Newton's first law, asking about the behaviour
of objects on which no forces act? The trouble is that, as we have discussed,
to identify an inertial frame in this way we would need to first identify an
object on which no forces act.
But, which object is this? Any freely falling object seems to pass the 'no
forces' tests as well as (or better than!) an object sitting on the earth! However, if
freely falling objects are indeed free of force, then Newton's first law tells us
that they do not accelerate relative to each other, in gross contradiction with
experiment.
This strongly suggests that global inertial frames do not exist and that
we should therefore abandon the concept and move on. In its place, we
will now make use of local inertial frames, a.k.a. freely falling frames. It
is just this change that marks the transition from 'special' to 'general'
relativity. Special relativity is just the special case in which global inertial
frames exist. Actually, there is another reason why the study of gravity is known
as "General Relativity."
The point is that in special relativity (actually, even before) we noticed
that the concept of velocity is intrinsically a relative one. That is to say, it
does not make sense to talk about whether an object is moving or at rest, but
only whether it is moving or at rest relative to some other object. However,
we did have an absolute notion of acceleration: an object could be said to be
accelerating without stating explicitly what frame was being used to make
this statement. The result would be the same no matter what inertial frame
was used.
However, now even the concept of acceleration becomes relative in a
certain sense. Suppose that you are in a rocket in deep space and that you
cannot look outside to see if the rockets are turned on. You drop an object
and it falls.
Are you accelerating, or are you in some monster gravitational field? There
is no right answer to this question as the two are identical. In this sense, the
concept of acceleration is now relative as well - it is equivalent to being in a
gravitational field.
While this point is related to why the study of gravity historically
acquired the name "General Relativity," it is not clear that this is an
especially useful way to think about things. One can still measure one's
proper acceleration as the acceleration relative to a nearby (i.e., local!)
freely falling frame.
Thus, there is an absolute distinction between freely falling and not
freely falling. Whether you wish to identify these terms with non-
accelerating and accelerating is just a question of semantics - though most
modern relativists find it convenient to do so. This gives a language in which
acceleration is not a relative concept but in which it implicitly means
"acceleration measured locally with respect to freely falling frames."
And What About the Speed of Light?
There is a question that you probably wanted to ask a few paragraphs
back. In a general gravitational field there are no frames of reference
in which light rays always travel in straight lines at constant speed. So,
after all of our struggles, have we finally thrown out the constancy of the
speed of light? No, not completely.
There is one very important statement left. Suppose that we measure
the speed of light at some event (E) in a frame of reference that falls freely
at event E. Then, near event E things in this frame work just like they do
in inertial frames - so, light moves at speed c and in a straight line. Said in
our new language:
As measured locally in a freely falling frame, light always moves in
straight lines at speed c.
Chapter 8
Relativity and Curved Spacetime
We have used the equivalence principle to calculate the effects of a gravitational field
over a finite distance by carefully patching together local inertial frames. If
we are very, very careful, we can calculate the effects of any gravitational
field in this way.
However, this approach turns out to be a real mess. Consider for
example the case where the gravitational field changes with time. Then, it
is not enough just to patch together local inertial frames at different
positions. One must make a quilt of them at different places as well as at
different times!
As you might guess, this process becomes even more complicated if
we consider all 3+1 dimensions. One then finds that clocks at different
locations in the gravitational field may not agree about simultaneity even
if the gravitational field does not change with time... but that is a story
that we need not go into here. What Einstein needed was a new way of
looking at things - a new language in which to discuss gravity that would
organize all of this into something relatively simple.
Another way to say this is that he needed a better conception of what a
gravitational field actually is. This next step was very hard for Albert. It took
him several years to learn the appropriate mathematics and to make that
mathematics into useful physics. Instead of going through all of the twists and
turns in the development of the subject, we will go straight to the main idea.
A RETURN TO GEOMETRY
You see, Einstein kept coming back to the idea that freely falling observers
are like inertial observers - or at least as close as we can get. In the presence
of a general gravitational field, there really are no global inertial frames.
When we talked about our 'error' in thinking of a freely falling frame as
inertial, it is not the case that there is a better frame which is more inertial
than is a freely falling frame.
Instead, when gravity is present there are simply no frames of reference
that act precisely in the way that global inertial frames act. Anyway, Einstein
focussed on the fact that freely falling frames are locally the same as inertial
frames.
However, he knew that things were tricky for measurements across a finite
distance. Consider, for example, the reference frame of a freely falling person.
Suppose that this person holds out a rock and releases it. The rock is then also
a freely falling object, and the rock is initially at rest with respect to the person.
However, the rock need not remain exactly at rest with respect to the
person. Suppose, for example, that the rock is released from slightly higher
up in the gravitational field.
Then, Newton would have said that the gravitational field was weaker
higher up, so that the person should accelerate toward the earth faster than
does the rock.
This means that there is a relative acceleration between the person and
the rock, and that the person finds the rock to accelerate away! A spacetime
diagram in the person's reference frame looks like this:
Suppose, on the other hand, that the rock is released to the person's side.
Then, Newton would say that both person and rock accelerate toward the centre
of the earth.
However, this is not in quite the same direction for the person as for the
rock:
So, again there is a relative acceleration. This time, however, the person
finds the rock to accelerate toward her. So, she would draw a spacetime diagram
for this experiment as follows:
The issue is that we would like to think of the freely falling worldlines as
inertial worldlines.
That is, we would like to think of them as being 'straight lines in
spacetime.' However, we see that we are forced to draw them on a
spacetime diagram as curved.
Now, we can straighten out any one of them by using the reference
frame of an observer moving along that worldline. However, this makes
the other freely falling worldlines appear curved. How are we to understand
this?
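The two relative accelerations described above can be estimated from Newton's law of gravity. The sketch below uses the rounded earth values quoted in the previous chapter and a hypothetical separation of one metre between the person and the rock; the function names are illustrative.

```python
M_E = 6e24    # kg, rounded mass of the earth
G = 6e-11     # N m^2 / kg^2, rounded Newton constant
R_0 = 6e6     # m, rounded radius of the earth

def g_newton(r):
    """Newtonian gravitational acceleration at distance r from the earth's centre."""
    return M_E * G / r**2

def vertical_relative_acceleration(r, h):
    """Rock released a height h above the person; positive means it moves away."""
    return g_newton(r) - g_newton(r + h)

def sideways_relative_acceleration(r, s):
    """Rock released a straight-line distance s to the side, at the same radius r.

    Both fall toward the centre with the same magnitude g(r); the components
    along the line joining them add up to g(r) * s / r, directed toward each
    other, hence the minus sign.
    """
    return -g_newton(r) * s / r

h = 1.0   # m, hypothetical separation
print(vertical_relative_acceleration(R_0, h))    # rock above: accelerates away
print(sideways_relative_acceleration(R_0, h))    # rock beside: accelerates toward
```

Both numbers are about a millionth of $g$ or less, which is why a freely falling frame looks so nearly inertial over small regions.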
Straight Lines in Curved Space
Eventually Einstein found a useful analogy with something that at
first sight appears quite different - a curved surface. The idea is captured
by the question "What is a straight line on a curved surface?"
To avoid language games, mathematicians made up a new word for
this idea: "geodesic." A geodesic can be thought of as the "straightest
possible line on a curved surface."
More precisely, we can define a geodesic as a line of minimal distance
- the shortest line between two points. The idea is that we can define a
straight line to be the shortest line between two points. Actually, there is
another definition of geodesic that is even better, but requires more
mathematical machinery to state precisely. Intuitively, it just captures the idea
that the geodesic is 'straight.' It tells us that a geodesic is the path on a curved
surface that would be traveled, for example, by an ant (or a person) walking
on the surface who always walks straight ahead and does not turn to the right
or left.
As an example, suppose you stand on the equator of the earth, face
north, and then walk forward. Where do you go? If you walk far enough
(over the ocean, etc.) you will eventually arrive at the north pole. The
path that you have followed is a geodesic on the sphere.
Note that this is true no matter where you start on the equator. So, suppose
there are in fact two people walking from the equator to the north pole, Alice
and Bob. As you can see, Alice and Bob end up moving toward each other.
So, if we drew a diagram of this process using Alice's frame of reference (so
that her own path is straight), it would look like this:
By the way, the above picture is not supposed to be a spacetime diagram.
It is simply supposed to be a map of part of the (two dimensional) earth's
surface, on which both paths have been drawn.
This particular map is drawn in such a way that Alice's path appears
as a straight line. As you probably know from looking at maps of the
earth's surface, no flat map will be an accurate description globally, over
the whole earth. There will always be some distortion somewhere.
However, a flat map is perfectly fine locally, say in a region the size of the
city of Syracuse (if we ignore the hills). Now, does this look or sound at all
familiar? What if we think about a similar experiment involving Alice and
Bob walking on a funnel-shaped surface:
In this case they begin to drift apart as they walk so that Alice's map
would look like this:
So, we see that straight lines (geodesics) on a curved surface act much
like freely falling worldlines in a gravitational field. It is useful to think
through this analogy at one more level: Consider two people standing on
the surface of the earth. We know that these two people remain the same
distance apart as time passes.
Why do they do so? Because the earth itself holds them apart and
prevents gravity from bringing them together. The earth exerts a force on
each person, keeping them from falling freely.
Now, what is the analogy in terms of Alice and Bob's walk across the
sphere or the funnel? Suppose that Alice and Bob do not simply walk
independently, but that they are actually connected by a stiff bar. This bar
will force them to always remain the same distance apart as they walk
toward the north pole. The point is that, in doing so, Alice and Bob will
be unable to follow their natural (geodesic) paths. As a result, Alice and
Bob will each feel some push or pull from the bar that keeps them a
constant distance apart. This is much like our two people standing on the
earth who each feel the earth pushing on their feet to hold them in place.
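The way Alice and Bob drift together on the sphere can be made quantitative. The sketch below computes the great-circle distance between two walkers who start on the equator and each walk the same distance due north. The sphere radius and the starting separation are hypothetical values chosen for illustration.

```python
import math

def separation_on_sphere(a, dphi, s):
    """Great-circle distance between two walkers on a sphere of radius a who
    started on the equator a longitude dphi (radians) apart and have each
    walked a distance s due north along their own geodesic."""
    lat = s / a
    # Spherical law of cosines for two points at the same latitude.
    cos_angle = math.sin(lat) ** 2 + math.cos(lat) ** 2 * math.cos(dphi)
    cos_angle = max(-1.0, min(1.0, cos_angle))   # guard against rounding
    return a * math.acos(cos_angle)

a = 6e6         # m, rounded radius of the earth
dphi = 0.01     # radians, hypothetical starting separation in longitude
quarter = math.pi * a / 2.0   # distance from the equator to the north pole

for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(fraction, separation_on_sphere(a, dphi, fraction * quarter))
```

The separation shrinks steadily and vanishes at the pole, even though neither walker ever turns. A stiff bar of fixed length between them would have to push them off these paths.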
Curved Surfaces are Locally Flat
Note that straight lines (geodesics) on a curved surface act much like
freely falling worldlines in a gravitational field. In particular, exactly the same
problems arise in trying to draw a flat map of a curved surface as in trying to
represent a freely falling frame as an inertial frame.
A quick overview of the errors made in trying to draw a flat map of
a curved surface is shown below:
[Figure: locally, the geodesics run in parallel; Bob's geodesic eventually curves away.]
We see that something like the equivalence principle holds for curved
surfaces: flat maps are very accurate in small regions, but not over large
ones.
In fact, we know that we can build up a curved surface from a
bunch of flat ones. One example of this happens in an atlas. An atlas of
the earth contains many flat maps of small areas of the earth's surface
(the size of states, say). Each map is quite accurate and together they
describe the round earth, even though a single flat map could not possibly
describe the earth accurately.
Computer graphics people do much the same thing all of the time.
They draw little flat surfaces and stick them together to make a curved
surface.
This is much like the usual calculus trick of building up a curved line
from little pieces of straight lines. In the present context with more than
one dimension, this process has the technical name of "differential
geometry."
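The one-dimensional version of this trick is easy to try out. The sketch below builds a circle out of short straight chords and compares the total length with the true circumference; the numbers of pieces are arbitrary choices for illustration.

```python
import math

def inscribed_polygon_perimeter(radius, n):
    """Total length of n equal straight chords inscribed in a circle."""
    return n * 2.0 * radius * math.sin(math.pi / n)

# Doubling the number of straight pieces cuts the error by about four.
for n in (6, 12, 24, 48):
    perimeter = inscribed_polygon_perimeter(1.0, n)
    print(n, perimeter, 2.0 * math.pi - perimeter)
```

Each straight piece makes an error that shrinks faster than its own length, so the total error disappears as the pieces are made smaller.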
From Curved Space to Curved Spacetime
The point is that this process of building a curved surface from flat
ones is just exactly what we want to do with gravity! We want to build up
the gravitational field out of little pieces of "flat" inertial frames. Thus,
we might say that gravity is the curvature of spacetime. This gives us the
new language that Einstein was looking for:
• (Global) Inertial Frames ⇔ Minkowskian Geometry ⇔ Flat
Spacetime: We can draw it on our flat paper or chalk board
and geodesics behave like straight lines.
• Worldlines of Freely Falling Observers ⇔ Straight lines in
Spacetime
• Gravity ⇔ The Curvature of Spacetime
Similarly, we might refer to the relation between a world line and a
line of simultaneity as the two lines being at a "right angle in spacetime."
It is often nice to use the more technical term "orthogonal" for this
relationship.
By the way, the examples (spheres, funnels, etc.) that we have discussed
so far are all curved spaces. A curved spacetime is much the same concept.
However, we can't really put a curved spacetime in our 3-D Euclidean
space. This is because the geometry of spacetime is fundamentally
Minkowskian, and not Euclidean.
Remember the minus sign in the interval? Anyway, what we can do
is to once again think about a spacetime diagram for 2+1 Minkowski space
- time will run straight up, and the two space directions (x and y) will run
to the sides.
Light rays will move at 45 degree angles to the (vertical) t-axis as usual.
With this understanding, we can draw a (1 + 1) curved spacetime inside
this 2+1 spacetime diagram. An example is shown below:
Note that one can move along the surface in either a timelike manner
(going up the surface) or a spacelike manner (going across the surface), so
that this surface does indeed represent a (1 + 1) spacetime. The picture above
turns out to represent a particular kind of gravitational field that we will be
discussing more in a few weeks.
To see the similarity to the gravitational field around the earth, think
about two freely falling worldlines (a.k.a. "geodesics," the straightest
possible lines) that begin near the middle of the diagram and start out
moving straight upward. Suppose for simplicity that one geodesic is on
one side of the fold while the second is on the other side. You will see that
the two worldlines separate, just as two freely falling objects do at different
heights in the earth's gravitational field.
Thus, if we drew a two-dimensional map of this curved spacetime
using the reference frame of one of these observers, the results would be
just like the spacetime diagram we drew for freely falling stones at different
heights! This is a concrete picture of what it means to say that gravity is
the curvature of spacetime.
Well, there is one more subtlety that we should mention: it is
important to realise that the extra dimension we used to draw the picture
above was just a crutch that we needed because we think best in flat spaces.
One can in fact talk about curved spacetimes without thinking about a
"bigger space" that contains points "outside the spacetime." This minimalist
view is generally a good idea.
MORE ON CURVED SPACE
Let us remember that the spacetime in which we live is fundamentally
four (=3+1) dimensional and ask if this will cause any new wrinkles in
our story. It turns out to create only a few. The point is that curvature is
fundamentally associated with two-dimensional surfaces.
Roughly speaking, the curvature of a four-dimensional spacetime
(labelled by x, y, z, t) can be described in terms of xt curvature, yt curvature,
etc. associated with two-dimensional bits of the spacetime.
However, this is relativity, in which space and time act pretty much
the same. So, if there is xt, yt, and zt curvature, there should also be xy,
yz, and xz curvature!
This means that the curvature can show up even if we consider only
straight lines in space (determined, for example, by stretching out a string)
in addition to the effects on the motion of objects that we have already
discussed.
For example, if we draw a picture showing spacelike straight lines
(spacelike geodesics), it might look like this:
So, curved space is as much a part of gravity as is curved spacetime.
This is nice, as curved spaces are easier to visualize.
Let us now take a moment to explore these in more depth and build
some intuition about curvature in general. Curved spaces have a number
of fun properties. Some of my favorites are:
$C \neq 2\pi R$: The circumference of a circle is typically not $2\pi$ times its
radius. Let us take an example: the equator is a circle on a
sphere. What is its centre? We are only supposed to consider
the two-dimensional surface of the sphere itself, as the third
dimension was just a crutch to let us visualize the curved
two-dimensional surface. So this question is really 'what
point on the sphere is equidistant from all points on the
equator?' In fact, there are two answers: the north pole and
the south pole. Either may be called the centre of the circle.
Now, how does the distance around the equator compare
to the distance (measured along the sphere) from the north
pole to the equator? The arc running from the north pole
to the equator goes 1/4 of the way around the sphere. This
is the radius of the equator in the relevant sense. Of course,
the equator goes once around the sphere. Thus, its
circumference is exactly four times its radius.
$A \neq \pi R^2$: The area of a circle is typically not $\pi$ times the square of its
radius. Again, the equator on the sphere makes a good
example. With the radius defined as above, the area of this
circle (one hemisphere) is less than $\pi R^2$, by about 19%.
$\sum(\text{angles}) \neq 180^\circ$: The angles in a triangle do not in general add up to
180°. An example on a sphere is shown below.
Squares do not close: A polygon with four sides of equal length and
four right angles (a.k.a., a square) in general does not close.
Vectors (arrows) "parallel transported" around closed curves are
rotated: This one is a bit more complicated to explain. Unfortunately, to
describe this property as precisely as the ones above would require the
introduction of more complicated mathematics. Nevertheless, the
discussion below should provide you with both the flavour of the idea
and an operational way to go about checking this property.
In a flat space (like the 3-D space that most people think we live in
until they learn about relativity), we know what it means to draw an
arrow, and then to pick up this arrow and carry it around without turning
it. The arrow can be carried around so that it always remains parallel to
its original direction.
Now, on a curved surface, this is not possible. Suppose, for example,
that we want to try to carry an arrow around a triangular path on the
sphere much like the one that we discussed a few examples back. For
concreteness, let's suppose that we start on the equator, with the arrow
also pointing along the equator as shown below:
We now wish to carry this vector to the north pole, keeping it always
pointing in the same direction as much as we can. Well, if we walk along
the path shown, we are going in a straight line and never turning. So,
since we start with the arrow pointing to our left, we should keep the arrow
pointing to our left at all times. This is certainly what we would do in a
flat space. When we get to the north pole, the arrow looks like this:
Now we want to turn and walk toward the equator along a different
side of the triangle. We turn (say, to the right), but we are trying to keep
the arrow always pointing in the same direction. So the arrow should not
turn with us. As a result, it points straight behind us. We carry it down to
the equator so that it points straight behind us at every step:
Finally, we wish to bring the arrow back to where it started. We see
that the arrow has rotated 90° relative to the original direction:
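The walk described above can be imitated numerically. The following is only a sketch under stated assumptions: the sphere has unit radius, the route is the triangle with corners at a point A on the equator, the north pole N, and a second point B on the equator a quarter of the way around, and "carrying the arrow without turning it" is approximated by taking many small steps and, at each step, removing whatever part of the arrow sticks out of the surface. The function names are illustrative only.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [vi / n for vi in v]

def quarter_arc(start, end, steps):
    """Points on the quarter great circle from `start` to `end`
    (two perpendicular unit vectors), both endpoints included."""
    points = []
    for k in range(steps + 1):
        t = (math.pi / 2.0) * k / steps
        points.append([math.cos(t) * s + math.sin(t) * e
                       for s, e in zip(start, end)])
    return points

def transport(v, path):
    """Carry the arrow v along the path. At each point p the component of v
    along p (the direction pointing out of the sphere) is removed, and the
    arrow is rescaled to its original unit length."""
    for p in path:
        d = dot(v, p)
        v = normalize([vi - d * pj for vi, pj in zip(v, p)])
    return v

A = [1.0, 0.0, 0.0]       # starting point on the equator
N = [0.0, 0.0, 1.0]       # north pole
B = [0.0, 1.0, 0.0]       # second corner on the equator
v_start = [0.0, -1.0, 0.0]  # arrow along the equator, to the walker's left

steps = 2000
route = quarter_arc(A, N, steps) + quarter_arc(N, B, steps) + quarter_arc(B, A, steps)
v_end = transport(v_start, route)

cos_angle = max(-1.0, min(1.0, dot(v_start, v_end)))
rotation_degrees = math.degrees(math.acos(cos_angle))
print(f"rotation after the round trip: {rotation_degrees:.3f} degrees")
```

On a flat plane the same procedure would return the arrow unchanged; on the sphere the arrow comes back turned by a quarter turn, in agreement with the description above.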
All of these features will be present in any space (say, a surface of
simultaneity) in a curved spacetime. Now, since we identify the
gravitational field with the curvature of spacetime, the above features
must also be encoded in the gravitational field. But there is a lot of
information in these features. In particular, there are independent
curvatures in the xy, yz, and xz planes that control, say, the ratio of
circumference to radius of circles in these various planes.
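The circumference and area properties listed earlier can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the standard formulas for a circle drawn on a sphere of radius a: a circle whose radius, measured along the surface, is R has circumference 2πa sin(R/a) and encloses area 2πa²(1 − cos(R/a)). These formulas are not derived in the text, and the function name is illustrative.

```python
import math

def circle_on_sphere(radius_along_surface, sphere_radius=1.0):
    """Circumference and enclosed area of a circle drawn on a sphere.

    radius_along_surface is the distance R from the circle's centre to the
    circle itself, measured along the surface of the sphere.
    """
    a = sphere_radius
    angle = radius_along_surface / a   # angle subtended at the sphere's centre
    circumference = 2.0 * math.pi * a * math.sin(angle)
    area = 2.0 * math.pi * a**2 * (1.0 - math.cos(angle))
    return circumference, area

# The equator of a unit sphere: its radius is a quarter of a great circle.
R_equator = math.pi / 2.0
C_equator, A_equator = circle_on_sphere(R_equator)
ratio_C = C_equator / R_equator                    # flat space would give 2*pi
ratio_A = A_equator / (math.pi * R_equator**2)     # flat space would give 1

# A very small circle barely notices the curvature.
R_small = 1.0e-3
C_small, A_small = circle_on_sphere(R_small)
ratio_small = C_small / (2.0 * math.pi * R_small)

print(f"equator: C/R = {ratio_C:.4f}, A/(pi R^2) = {ratio_A:.4f}")
print(f"small circle: C/(2 pi R) = {ratio_small:.8f}")
```

The small circle illustrates the idea that a curved space looks flat when examined closely enough: its ratio of circumference to radius is very nearly the flat value.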
But wait! Doesn't this seem to mean that the full spacetime curvature
(gravitational field) contains a lot more information than just specifying
an acceleration g at each point?
After all, acceleration is related to how things behave in time, but we
have just realized that at least parts of the spacetime curvature are
associated only with space. How are we to deal with this?
GRAVITY AND THE METRIC
Let's recall where we are. A while back we discovered the equivalence
principle: that locally a gravitational field is equivalent to an acceleration
in special relativity. Another way of stating this is to say that, locally, a
freely falling frame is equivalent to an inertial frame in special relativity.
We noticed the parallel between this principle and the underlying idea
behind calculus: that locally every curve is a straight line.
A global inertial frame describes a flat spacetime - one in which, for
example, geodesics follow straight lines and do not accelerate relative to
one another.
A general spacetime with a gravitational field can be thought of as
being curved. Just as a general curved line can be thought of as being
made up of tiny bits of straight lines, a general curved spacetime can be
thought of as being made up of tiny bits of flat spacetime - the local inertial
frames of the equivalence principle.
This gives a powerful geometric picture of a gravitational field. It is
nothing else than a curvature of spacetime itself. Now, there are several
ways to discuss curvature. We are used to looking at curved spaces inside
of some larger (flat) space. Einstein's idea was that the only relevant things
are those that can be measured in terms of the curved surface itself and
which have nothing to do with it (perhaps) being part of some larger flat
space. As a result, one would gain nothing by assuming that there is such
a larger flat space. In Einstein's theory, there is no reason to suppose that
one exists.
For example, we noticed above that this new understanding of gravity
means that the gravitational field contains more information than just
giving an acceleration at various points in spacetime. The acceleration is
related to curvature in spacetime associated with a time direction (say, in
the xt plane), but there are also parts of the gravitational field associated
with the (purely spatial) xy, xz, and yz planes.
Let's begin by thinking back to the flat spacetime case (special
relativity). What was the object which encoded the flat Minkowskian
geometry? It was the interval: $(\text{interval})^2 = -c^2 \Delta t^2 + \Delta x^2$.
Building Intuition in Flat Space
To understand fully what information is contained in the interval, it
is perhaps even better to think first about flat space, for which the
analogous quantity is the distance $\Delta s$ between two points: $\Delta s^2 = \Delta x^2 + \Delta y^2$.
Much of the important information in geometry is not the distance
between two points per se, but the closely related concept of length. For
example, one of the properties of flat space is that the length of the
circumference of a circle is equal to $2\pi$ times the length of its radius.
Now, in flat space, distance is most directly related to length for
straight lines: the distance between two points is the length of the straight
line connecting them. To link this to the length of a curve, we need only
recall that locally every curve is a straight line.
In particular, what we need to do is to approximate any curve by a
set of tiny (infinitesimal) straight lines. Because we wish to consider the
limit in which these straight lines are of zero size, let us denote the length
of one such line by ds.
The relation of Pythagoras then tells us that $ds^2 = dx^2 + dy^2$ for that
straight line, where dx and dy are the infinitesimal changes in the x and y
coordinates between the two ends of the infinitesimal line segment. To
find the length of a curve, we need only add up these lengths over all of
the straight line segments. In the language of calculus, we need only perform
the integral:
$\text{Length} = \int_{\text{curve}} ds = \int_{\text{curve}} \sqrt{dx^2 + dy^2}$
You may not be used to seeing integrals written in a form like the
one above. Let me just pause for a moment to note that this can be written
in a more familiar form by, say, taking out a factor of dx from the square
root. We have
$\text{Length} = \int_{\text{curve}} dx \, \sqrt{1 + \left(\frac{dy}{dx}\right)^2}$
So, if the curve is given as a function y = y(x), the above formula
does indeed allow you to calculate the length of the curve.
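As an illustration, the length of a curve can be estimated numerically by doing exactly what the text describes: chopping the curve into short straight pieces and adding up $\sqrt{dx^2 + dy^2}$ for each piece. The sketch below is one simple way to do this; the function names and the two test curves are choices made here for illustration, not taken from the text.

```python
import math

def curve_length(y, x_start, x_end, segments=200_000):
    """Approximate the length of the curve y = y(x) by adding up
    ds = sqrt(dx^2 + dy^2) over many short straight segments."""
    total = 0.0
    x_prev, y_prev = x_start, y(x_start)
    for k in range(1, segments + 1):
        x_now = x_start + (x_end - x_start) * k / segments
        y_now = y(x_now)
        total += math.hypot(x_now - x_prev, y_now - y_prev)
        x_prev, y_prev = x_now, y_now
    return total

def upper_semicircle(x):
    """Upper half of the circle of radius 1 centred on the origin."""
    return math.sqrt(max(0.0, 1.0 - x * x))

# Half of the circumference of a unit circle; flat geometry says this is pi.
half_circumference = curve_length(upper_semicircle, -1.0, 1.0)

# The curve y = cosh(x) between x = 0 and x = 1; its exact length is sinh(1).
cosh_length = curve_length(math.cosh, 0.0, 1.0)

print(f"half circumference = {half_circumference:.6f}  (pi = {math.pi:.6f})")
print(f"cosh curve length  = {cosh_length:.6f}  (sinh 1 = {math.sinh(1.0):.6f})")
```

Doubling the first result gives the circumference of a circle of radius one, which is how the statement "circumference equals $2\pi$ times radius" is contained in the formula for ds.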
Now, what does this all really mean? What is the 'take home' lesson
from this discussion? The point is that the length of every curve is governed
by the formula
$ds^2 = dx^2 + dy^2$.
Thus, this formula encodes lots of geometric information, such as
the fact that the circumference of a circle is $2\pi$ times its radius. As a result,
the equation above will be false on a curved surface like a sphere.
A formula of the form $ds^2 = \text{stuff}$ is known as a metric, as it tells us
how to measure things (in particular, it tells us how to measure lengths).
What we are saying is that this formula will take a different form on a
curved surface and will not match the flat-space formula $ds^2 = dx^2 + dy^2$.
On to Angles
What other geometric information is there aside from lengths? Here,
you might consider the examples we talked about earlier:
that flat spaces are characterized by having 180° in every triangle, and by
squares behaving nicely.
So, one would also like to know about angles. Now, the important
question is: "Is information about angles also contained in the metric?"
It turns out that it is. You might suspect that this is true on the basis
of trigonometry, which relates angles to (ratios of) distances. Of course,
trigonometry is based on flat space, but recall that any space is locally
flat, and notice that an angle is something that happens at a point (and so
is intrinsically a local notion).
To see just how angular information is encoded in the metric, let's
look at an example.
The standard (Cartesian) metric on flat space, $ds^2 = dx^2 + dy^2$, is based
on an 'orthogonal' coordinate system - one in which the constant x lines
intersect the constant y lines at right angles. What if we wish to express the
metric in terms of x and, say, some other coordinate z which is not orthogonal
to x?
In this case, the distances $\Delta x$, $\Delta z$, and $\Delta s$ are related in a slightly more
complicated way. If you have studied much vector mathematics, you will
have seen the relation:
$\Delta s^2 = \Delta x^2 + \Delta z^2 + 2 \Delta x \Delta z \cos\theta$,
where $\theta$ is the angle between the x and z directions.
In vector notation, this is just $|\vec{x} + \vec{z}|^2 = |\vec{x}|^2 + |\vec{z}|^2 + 2\, \vec{x} \cdot \vec{z}$.
Even if you have not seen this relation before, it should make some
sense to you.
Note, for example, that if we take $\Delta x = \Delta z$, then for $\theta = 0$ we get $\Delta s = 2\Delta x$ (since x and z are
parallel and our 'triangle' is just a long straight line), while for $\theta = 180^\circ$ we
get $\Delta s = 0$ (since x and z now point in opposite directions and, in walking
along the two sides of our triangle, we cover the same path twice in opposite
directions, returning to our starting point).
For an infinitesimal triangle, we would write this as:
$ds^2 = dx^2 + dz^2 + 2\, dx\, dz \cos\theta$.
So, the angular information lies in the "cross term" with a dxdz. The
coefficient of this term tells us the angle between the x and z directions.
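A small numerical check may help here. The sketch below assumes the tilted metric has the form $ds^2 = dx^2 + dz^2 + 2\,dx\,dz\cos\theta$, in agreement with the finite relation for $\Delta s^2$ quoted above, and compares it with the ordinary Pythagorean distance computed after converting to perpendicular axes. It also shows how the angle can be read back from the coefficients. All function names are illustrative.

```python
import math

def tilted_metric(theta):
    """Components (g_xx, g_xz, g_zz) of the flat metric in coordinates x, z
    whose axes meet at the angle theta."""
    return 1.0, math.cos(theta), 1.0

def distance_from_metric(dx, dz, theta):
    g_xx, g_xz, g_zz = tilted_metric(theta)
    return math.sqrt(g_xx * dx**2 + 2.0 * g_xz * dx * dz + g_zz * dz**2)

def distance_from_cartesian(dx, dz, theta):
    """Same displacement, first converted to ordinary perpendicular axes:
    dx steps along (1, 0) and dz steps along (cos theta, sin theta)."""
    big_x = dx + dz * math.cos(theta)
    big_y = dz * math.sin(theta)
    return math.hypot(big_x, big_y)

def angle_between_axes(g_xx, g_xz, g_zz):
    """Recover the angle between the coordinate directions from the metric."""
    return math.acos(g_xz / math.sqrt(g_xx * g_zz))

theta = math.radians(60.0)
d_metric = distance_from_metric(0.3, 0.5, theta)
d_cartesian = distance_from_cartesian(0.3, 0.5, theta)
recovered = math.degrees(angle_between_axes(*tilted_metric(theta)))

print(f"distance from the metric     : {d_metric:.6f}")
print(f"distance from Cartesian axes : {d_cartesian:.6f}")
print(f"angle recovered from metric  : {recovered:.3f} degrees")
```

Setting theta to a right angle makes the cross term vanish and returns the usual Cartesian metric.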
Metrics on Curved Space
This gives us an idea of what a metric on a general curved space should
look like. After all, locally (i.e., infinitesimally) it should look like one of
the flat cases above! Thus, a general metric should have a part proportional
to dx 2 , a part proportional to dy 2 , and a part proportional to dxdy. In
general, we write this as:
$ds^2 = g_{xx}\, dx^2 + 2 g_{xy}\, dx\, dy + g_{yy}\, dy^2$.
What makes this metric different from the ones above (and therefore
not necessarily flat) is that $g_{xx}$, $g_{xy}$, and $g_{yy}$ are in general functions of the
coordinates x, y. In contrast, the functions were constants for the flat
metrics above.
Note that this fits with our idea that curved spaces are locally flat
since, close to any particular point (x, y) the functions $g_{xx}$, $g_{xy}$, $g_{yy}$ will
not deviate too much from the values at that point. In other words, any
smooth function is locally constant.
Now, why is there a 2 with the dxdy term? Note that since
$dx\,dy = dy\,dx$, there is no need to have a separate $g_{yx}$ term. The metric is
always symmetric, with $g_{xy} = g_{yx}$. So, $g_{xy}\,dx\,dy + g_{yx}\,dy\,dx = 2 g_{xy}\,dx\,dy$.
If you are familiar with vectors, then we can say a bit more about how lengths
and angles are encoded.
Consider the 'unit' vectors $\hat{x}$ and $\hat{y}$. By 'unit' vectors, we mean the vectors that
go from x = 0 to x = 1 and from y = 0 to y = 1. As a result, their length is
one in terms of the coordinates.
This may or may not be the physical length of the vectors. For
example, we might use coordinates with a tiny spacing (so that $\hat{x}$ is very short) or
coordinates with a huge spacing (so that $\hat{x}$ is large). What the metric tells
us directly are the dot products of these vectors:
$\hat{x} \cdot \hat{x} = g_{xx}$,
$\hat{x} \cdot \hat{y} = g_{xy}$,
$\hat{y} \cdot \hat{y} = g_{yy}$.
Anyway, this object ($g_{\alpha\beta}$) is called the metric (or, the metric tensor)
for the space. It tells us how to measure all lengths and angles. The
corresponding object for a spacetime will tell us how to measure all proper
lengths, proper times, angles, etc. It will be much the same except that it
will have a time part with $g_{tt}$ negative instead of positive, as in flat
Minkowski space.
Rather than write out the entire expression all of the time (especially
when working in, say, four dimensions rather than just two) physicists
use a condensed notation called the 'Einstein summation convention'.
To see how this works, let us first relabel our coordinates. Instead of
using x and y, let's use $x_1$, $x_2$ with $x_1 = x$ and $x_2 = y$. Then we have:
$ds^2 = \sum_{\alpha=1}^{2} \sum_{\beta=1}^{2} g_{\alpha\beta}\, dx_\alpha\, dx_\beta = g_{\alpha\beta}\, dx_\alpha\, dx_\beta$.
It is in the last equality that we have used the Einstein summation
convention: instead of writing out the summation signs, the convention is
that we implicitly sum over any repeated index.
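For readers who find code easier to follow than index notation, the double sum hidden by the summation convention can be spelled out as two nested loops. This is a minimal sketch: the function name and the sample numbers for the metric and the coordinate changes are made up for illustration.

```python
def line_element_squared(g, dx):
    """ds^2 = g_ab dx_a dx_b, with the sums over both repeated
    indices a and b written out explicitly."""
    n = len(dx)
    return sum(g[a][b] * dx[a] * dx[b] for a in range(n) for b in range(n))

# Hypothetical metric components at one point. The matrix is symmetric,
# so the off-diagonal entry appears twice in the sum.
g_xx, g_xy, g_yy = 2.0, 0.5, 3.0
g = [[g_xx, g_xy],
     [g_xy, g_yy]]
dx = [0.1, 0.2]   # small changes in x_1 and x_2

from_sum = line_element_squared(g, dx)
written_out = g_xx * dx[0]**2 + 2.0 * g_xy * dx[0] * dx[1] + g_yy * dx[1]**2

print(f"summation convention: {from_sum:.6f}")
print(f"written out by hand : {written_out:.6f}")
```

The same function works unchanged in four dimensions, which is the practical reason for adopting the condensed notation.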
A First Example
To get a better feel for how the metric works, let's look at the metric
for a flat plane in polar coordinates $(r, \theta)$. It is useful to think about this
in terms of the unit vectors $\hat{r}$, $\hat{\theta}$.
From the picture above, we see that these two vectors are
perpendicular: $\hat{r} \cdot \hat{\theta} = 0$. Normally, we measure the radius in terms of
length, so that $\hat{r}$ has length one and $\hat{r} \cdot \hat{r} = 1$.
The same is not true for $\theta$: one radian of angle at large r corresponds
to a much longer arc than does one radian of angle at small r. In fact, one
radian of angle corresponds to an arc of length r. The result is that $\hat{\theta}$ has
length r and $\hat{\theta} \cdot \hat{\theta} = r^2$. So, for $\theta$ measured in radians and running from
0 to $2\pi$, the metric turns out to be:
$ds^2 = dr^2 + r^2 d\theta^2$.
Now, let's look at a circle located at some constant value of r.
To find the circumference of the circle, we need to compute the length
of a curve along the circle. Now, along the circle, r does not change, so we
have dr = 0. So, we have $ds = r\,d\theta$. Thus, the length is:
$C = \int_0^{2\pi} r\,d\theta = 2\pi r$.
Let's check something that may seem trivial: What is the radius of
this circle? The radius (R) is the length of the curve that runs from the
origin out to the circle along a line of constant $\theta$. Along this line, we have
$d\theta = 0$. So, along this curve, we have ds = dr. The line runs from r = 0 to
r = r, so we have
$R = \int_0^r dr = r$.
So, we do indeed have $C = 2\pi R$. Note that while the result R = r may
seem obvious it is true only because we used an r coordinate which was
marked off in terms of radial distance.
In general, this may not be the case. There are times when it is
convenient to use a radial coordinate which directly measures something
other than distance from the origin and, in such cases, it is very important
to remember to calculate the actual 'Radius' (the distance from the origin
to the circle) using the metric.
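To make this warning concrete, here is a sketch comparing two descriptions of the same flat plane. The first uses the ordinary polar metric. The second uses a hypothetical radial coordinate ρ, introduced here purely for illustration and defined by r = ρ², so that the metric becomes $ds^2 = 4\rho^2 d\rho^2 + \rho^4 d\theta^2$. The circumference and the radius are both computed from the metric by simple numerical integration.

```python
import math

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def circle_measurements(g_rr, g_thth, coord_value):
    """Circumference and radius of the circle at a given value of the radial
    coordinate, for a metric ds^2 = g_rr dr^2 + g_thth dtheta^2."""
    # Along the circle the radial coordinate is fixed: ds = sqrt(g_thth) dtheta.
    circumference = integrate(lambda th: math.sqrt(g_thth(coord_value)),
                              0.0, 2.0 * math.pi)
    # Along a line of constant theta: ds = sqrt(g_rr) dr.
    radius = integrate(lambda r: math.sqrt(g_rr(r)), 0.0, coord_value)
    return circumference, radius

# Ordinary polar coordinates: ds^2 = dr^2 + r^2 dtheta^2.
C_polar, R_polar = circle_measurements(lambda r: 1.0, lambda r: r**2, 3.0)

# Hypothetical coordinate rho with r = rho^2.
C_rho, R_rho = circle_measurements(lambda rho: 4.0 * rho**2,
                                   lambda rho: rho**4, 3.0)

print(f"polar: R = {R_polar:.4f}, C / R = {C_polar / R_polar:.6f}")
print(f"rho  : R = {R_rho:.4f}, C / R = {C_rho / R_rho:.6f}")
```

In the second description the circle sits at coordinate value 3 but its actual radius is 9. The ratio of circumference to radius is $2\pi$ in both cases, as it must be for a flat plane.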
A Second Example
Now let's look at a less trivial example. Suppose the metric of some
surface is given by:
$ds^2 = \frac{dr^2 + r^2 d\theta^2}{1 + r^2}$.
Is this space flat? Well, let's compare the circumference (C) of a circle
at constant r to the radius (R) of that circle.
Again, the circumference is a line of constant r, so we have dr = 0 for
this line and $ds = \frac{r\,d\theta}{\sqrt{1 + r^2}}$. The circle as usual runs from $\theta = 0$ to $\theta = 2\pi$.
So, we have
$C = \int_0^{2\pi} \frac{r\,d\theta}{\sqrt{1 + r^2}} = \frac{2\pi r}{\sqrt{1 + r^2}}$.
Now, how about the radius? Well, the radius R is the length of a line
with $d\theta = 0$ that connects r = 0 with r = r. So, we have
$R = \int_0^r \frac{dr}{\sqrt{1 + r^2}} = \sinh^{-1} r$.
(This is yet another neat use of hyperbolic trigonometry: it allows
us to explicitly evaluate certain integrals that would otherwise be a real
mess.) Clearly, $C \neq 2\pi R$. In fact, studying the large r limit of the
circumference shows that the circumference becomes constant ($C \to 2\pi$) at large r.
This is certainly not true of the radius: $R \to \infty$ as $r \to \infty$.
Thus, C is much less than $2\pi R$ for large R.
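The comparison can also be made numerically. The sketch below assumes that the metric of this example is $ds^2 = (dr^2 + r^2 d\theta^2)/(1 + r^2)$, the form consistent with the inverse hyperbolic sine result and the constant large-r circumference described in the text. Both lengths are obtained by direct numerical integration and checked against the closed forms.

```python
import math

def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation to the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def circumference(r):
    # Along the circle dr = 0, so ds = r dtheta / sqrt(1 + r^2).
    return integrate(lambda theta: r / math.sqrt(1.0 + r * r),
                     0.0, 2.0 * math.pi, n=1000)

def radius(r):
    # Along a line of constant theta, ds = dr / sqrt(1 + r^2).
    return integrate(lambda s: 1.0 / math.sqrt(1.0 + s * s), 0.0, r)

for r in (0.1, 1.0, 10.0, 100.0):
    C, R = circumference(r), radius(r)
    print(f"r = {r:6.1f}   C = {C:.4f}   R = {R:.4f}   "
          f"C / (2 pi R) = {C / (2.0 * math.pi * R):.4f}")
```

For small r the ratio C / (2πR) is close to one, so the surface looks flat near the origin. As r grows the ratio falls steadily toward zero, because C levels off while R keeps growing.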
Some Parting Comments on Metrics
This is perhaps the right place to make a point: We often think about
curved spaces as being curved inside some larger space. For example, the
two-dimensional surface of a globe can be thought of as a curved surface
that sits inside some larger (flat) three-dimensional space.
However, there is a notion of curvature (associated with the geometry of
the surface - the measurements of circles, triangles, and rectangles drawn in
that surface - and encoded by the metric) that does not refer in any way to
anything outside the surface itself. So, in order for a four dimensional spacetime
to be curved, there does not need to be any 'fifth dimension' for the universe
to be 'curved into.'
The point is that what physicists mean by saying that spacetime is
curved is not that it is 'bent' in some new dimension, but rather they mean
that the geometry on the spacetime is more complicated than that of
Minkowski space. For example, they mean that not every circle has
circumference $2\pi R$.
Another comment that should be made involves the relationship
between the metric and the geometry. We have seen that the metric
determines the geometry: it allows us to compute, for example, the ratio
of the circumference of a circle to its radius.
One might ask if the converse is true: Does the geometry determine
the metric? The answer is a resounding "no." We have, in fact, already
seen three metrics for flat space: We had one metric in (orthogonal)
Cartesian coordinates, one in 'tilted' Cartesian coordinates where the axes
were at some arbitrary angle $\theta$, and one in polar coordinates. Actually,
we have seen infinitely many different metrics since the metric was different
for each value of the tilt angle $\theta$ for the tilted Cartesian coordinates. So,
the metric carries information not only about the geometry itself, but also
about the coordinates you happen to be using to describe it.
The idea in general relativity is that the real physical effects depend
only on the geometry and not upon the choice of coordinates. After all,
the circumference of a circle does not depend on whether you calculate it
in polar or in Cartesian coordinates.
As a result, one must be careful in using the metric to make physical
predictions - some of the information in the metric is directly physical,
but some is an artifact of the coordinate system and disentangling the
two can sometimes be subtle.
The choice of coordinates is much like the choice of a reference frame.
We saw this to some extent in special relativity. For a given observer (say,
Alice) in a given reference frame, we would introduce a notion of position
($x_{\text{Alice}}$) as measured by Alice, and we would introduce a notion of time
($t_{\text{Alice}}$) as measured by Alice.
In a different reference frame (say, Bob's) we would use different
coordinates ($x_{\text{Bob}}$ and $t_{\text{Bob}}$). Coordinates describing inertial reference
frames were related in a relatively simple way, while coordinates describing
an accelerated reference frame were related to inertial coordinates in a
more complicated way.
However, whatever reference frame we used and whatever coordinate
system we chose, the physical events are always the same. Either a given
clock ticks 2 at the event where two light rays cross or it does not. Either
a blue paintbrush leaves a mark on a meter stick or it does not. Either an
observer writes "I saw the light!" on a piece of paper or she does not. The
true physical predictions do not depend on the choice of reference frame
or coordinate system at all.
So long as we understand how to deal with physics in funny (say,
accelerating) coordinate systems, such coordinate systems will still lead
to the correct physical results.
The idea that physics should not depend on the choice of coordinates
is called General Coordinate Invariance. Invariance is a term that captures
the idea that the physics itself does not change when we change
coordinates. This turns out to be an important principle for the
mathematical formulation of General Relativity.
THE METRIC OF SPACETIME
We have now come to understand that the gravitational field is encoded
in the metric. Once a metric has been given to us, we have also learned how to
use it to compute various objects of interest. In particular, we have learned
how to test a space to see if it is flat by computing the ratio of circumference
to radius for a circle.
However, all of this still leaves open what is perhaps the most
important question: just which metric is it that describes the spacetime in
which we live?
First, let's again recall that there really is no one 'right' metric, since
the metric will depend on the choice of coordinates and there is no one
'right' choice of coordinates.
But there is a certain part of the metric that is in fact independent of
the choice of coordinates.
That part is called the 'geometry' of the spacetime. It is mathematically
very complicated to write this part down by itself. So, in practice, physicists
work with the metric and then make sure that the things they calculate do
not depend on the choice of coordinates.
What determines the right geometry? The geometry is nothing other
than the gravitational field. So, we expect that the geometry should in
some way be tied to the matter in the universe: the mass, energy, and so
on should control the geometry.
Figuring out the exact form of this relationship is a difficult task,
and Einstein worked on it for a long time. We will not reproduce his
thoughts in any detail here.
However, in the end he realized that there were actually not many
possible choices for how the geometry and the mass, energy, etc. should be
related.
The Einstein Equations
It turns out that, if we make five assumptions, then there is really just one
family of possible relationships. These assumptions are:
• Gravity is spacetime curvature, and so can be encoded in a
metric.
• General Coordinate Invariance: Real physics is independent of
the choice of coordinates used to describe it.
• The basic equations of general relativity should give the dynamics
of the metric, telling how the metric changes in time.
• Energy (including the energy in the gravitational field) is
conserved.
• The (local) equivalence principle.
Making these five assumptions, one is led to a relation between an
object $G_{\alpha\beta}$ (called the Einstein Tensor) which encodes part of the spacetime
curvature and an object $T_{\alpha\beta}$ (called the Stress-Energy or Energy-Momentum
Tensor) which encodes all of the energy, momentum, and
stresses in everything else ("matter," electric and magnetic fields, etc.). Here,
$\alpha$, $\beta$ run over the various coordinates (t, x, y, z). What is a stress? One
example of a stress is pressure. It turns out that, in general relativity,
pressure contributes to the gravitational field directly, as do mass, energy,
and momentum. The relationship can be written:
$G_{\alpha\beta} = \kappa T_{\alpha\beta} + \Lambda g_{\alpha\beta}$.
The $g_{\alpha\beta}$ in the equation above is just the metric itself. This relation is
known as the Einstein equations.
We see that $\kappa$ controls just the overall strength of gravity. Making $\kappa$
larger is the same thing as making $T_{\alpha\beta}$ bigger, which is the same as adding
more mass and energy. On the other hand, $\Lambda$ is something different. Note
that it controls a term which relates just to the geometry and not to the
energy and mass of the matter. However, this term is added to the side of
the equation that contains the energy-momentum tensor. As a result, $\Lambda$
can be said to control the amount of energy that is present in spacetime
that has no matter in it at all. Partially for this reason, $\Lambda$ is known as the
cosmological constant.
The Newtonian Approximation
As we said (but did not explicitly derive), the Einstein equations can be deduced
from the five assumptions above on purely mathematical grounds. It is
not necessary to use Isaac Newton's theory of gravity here as even partial
input. So, what is the connection to Newton's ideas about gravity?
Newton's law of gravity can only be correct when the objects are slowly
moving - otherwise special relativity would be relevant and all sorts of
things would go wrong. There is in fact another restriction on when
Newtonian gravity is valid. The point is that, in Newtonian gravity, mass
creates a gravitational field. But, we know now that energy and mass are
very closely related. So, all energy should create some kind of gravity.
However, we have also seen that a field (like the gravitational field itself)
can carry energy.
As a result, the gravitational field must also act as a source of further
gravity. That is, once relativity is taken into account, gravitational fields
should be stronger than Newton would have expected. For a very weak
field (where the field itself would store little energy), this effect should be
small. But, for a strong gravitational field, this effect should be large. So,
Newton's law of gravity should only be correct for slowly moving objects
in fairly weak gravitational fields.
If one does study the Einstein equations for the case of slowly moving
objects and weak gravitational fields, one indeed obtains the Newtonian
law of gravity for the case $\Lambda = 0$, $\kappa = 8\pi G$, where $G = 6.67 \times 10^{-11}\ \mathrm{N\,m^2/kg^2}$
is Newton's gravitational constant. So, to the extent that these numbers
are determined by experimental data, they must be the correct values.
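To get a feel for what "weak gravitational field" means in practice, one rough and commonly used measure is the dimensionless number GM/(rc²) evaluated at the surface of a body: the field is weak when this number is far below one. The sketch below evaluates it for the earth and the sun. The masses and radii are standard reference values assumed here rather than taken from the text, and the choice of this particular measure is likewise an assumption made for illustration.

```python
import math

G = 6.674e-11              # Newton's constant, N m^2 / kg^2
C_LIGHT = 299_792_458.0    # speed of light, m / s

# Assumed reference values (not from the text).
M_EARTH, R_EARTH = 5.972e24, 6.371e6   # kg, m
M_SUN, R_SUN = 1.989e30, 6.957e8       # kg, m

def field_strength(mass, radius):
    """Dimensionless measure G M / (r c^2) of how strong gravity is at
    distance `radius` from a body of the given mass."""
    return G * mass / (radius * C_LIGHT**2)

earth_surface = field_strength(M_EARTH, R_EARTH)
sun_surface = field_strength(M_SUN, R_SUN)

# The coupling constant kappa = 8 pi G quoted in the text (units with c = 1).
kappa = 8.0 * math.pi * G

print(f"earth surface: {earth_surface:.3e}")
print(f"sun surface  : {sun_surface:.3e}")
print(f"kappa = 8 pi G = {kappa:.4e}")
```

Both numbers are tiny compared with one, which is consistent with Newton's law working so well throughout the solar system.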
In summary, given a lot of thought, Einstein came up with the above
five assumptions about the nature of gravity.
Then, by mathematics alone he was able to show that these
assumptions lead to the Einstein equations. For weak gravitational fields and slowly
moving objects something like Newton's law of gravity also follows, but
with two arbitrary parameters $\kappa$ and $\Lambda$.
One of these ($\kappa$) is just $8\pi$ times Newton's own arbitrary parameter
G. As a result, except for one constant ($\Lambda$) Newton's law of gravity has
also followed from the five assumptions using only mathematics. Finally,
by making use of experimental data (the same data that Newton used
originally!) Einstein was able to determine the values of $\kappa$ and $\Lambda$. The
Einstein equations then take on the pleasing form:
$G_{\alpha\beta} = 8\pi G\, T_{\alpha\beta}$.
The Schwarzschild Metric
Of course, the natural (and interesting!) question to ask is "What
happens when the gravitational field is strong and Newton's law of gravity
does NOT hold?" We're not actually going to solve the Einstein equations
ourselves; they're pretty complicated even for the simplest of cases.
When an object is perfectly round (spherical), the high symmetry of
the situation simplifies the mathematics.
The point is that if the object is round, and if the object completely
determines the gravitational field, then the gravitational field must be round
as well. So, the first simplification we will perform is to assume that our
gravitational field (i.e., our spacetime) is spherically symmetric.
The second simplification we will impose is to assume that there is no
matter (just empty spacetime), at least in the region of spacetime that we
are studying.
In particular, the energy, momentum, etc., of matter are equal to zero
in this region. As a result, we will be describing the gravitational field of
an object (the earth, a star, etc.) only in the region outside of the object.
This would describe the gravitational field well above the earth's surface,
but not down in the interior.
For this case, the Einstein equations were solved by the German
astronomer Karl Schwarzschild. There is an interesting story here,
as Schwarzschild solved these equations during his spare time while he
was serving in the German army in World War I. The
story is that Schwarzschild got his calculations published but died
shortly afterwards, in 1916, of an illness contracted at the front.
Because of the spherical symmetry, it was simplest for Schwarzschild
to use what are called spherical coordinates $(r, \theta, \phi)$ as opposed to Cartesian
coordinates (x, y, z). Here, r tells us how far out we are, and $\theta$, $\phi$ are
latitude and longitude coordinates on the sphere at any value of r.
Schwarzschild found that, for any spherically symmetric spacetime
and outside of the matter, the metric takes the form:
$ds^2 = -\left(1 - \frac{R_s}{r}\right) dt^2 + \frac{dr^2}{1 - \frac{R_s}{r}} + r^2 \left(d\theta^2 + \sin^2\theta\, d\phi^2\right)$.
Here, the parameter $R_s$ depends on the total mass of the matter inside.
In particular, it turns out that $R_s = 2MG/c^2$.
The last part of the metric, $r^2(d\theta^2 + \sin^2\theta\, d\phi^2)$, is just the metric on a
standard sphere of radius r.
This part follows just from the spherical symmetry itself. $\theta$ is a latitude
coordinate and $\phi$ is a longitude coordinate. The factor of $\sin^2\theta$ encodes the
fact that circles at constant $\theta$ (i.e., with $d\theta = 0$) are smaller near the poles ($\theta =
0, \pi$) than at the equator ($\theta = \pi/2$).
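To get a sense of scale, the parameter $R_s = 2MG/c^2$ can be evaluated for familiar objects. The sketch below does this for the sun and the earth and also evaluates the time and radial coefficients of the metric at a chosen radius, taking them to be $-(1 - R_s/r)$ and $1/(1 - R_s/r)$. The masses and radii are standard reference values assumed here, not taken from the text.

```python
G = 6.674e-11              # Newton's constant, N m^2 / kg^2
C_LIGHT = 299_792_458.0    # speed of light, m / s

# Assumed reference values (not from the text).
M_SUN, R_SUN = 1.989e30, 6.957e8       # kg, m
M_EARTH, R_EARTH = 5.972e24, 6.371e6   # kg, m

def schwarzschild_radius(mass):
    """R_s = 2 M G / c^2, in metres."""
    return 2.0 * G * mass / C_LIGHT**2

def metric_coefficients(r, r_s):
    """Coefficients of dt^2 and dr^2 in the Schwarzschild metric at radius r
    (valid outside the matter, r > r_s)."""
    factor = 1.0 - r_s / r
    return -factor, 1.0 / factor

rs_sun = schwarzschild_radius(M_SUN)
rs_earth = schwarzschild_radius(M_EARTH)
g_tt_sun, g_rr_sun = metric_coefficients(R_SUN, rs_sun)

print(f"R_s for the sun  : {rs_sun / 1000.0:.3f} km")
print(f"R_s for the earth: {rs_earth * 1000.0:.3f} mm")
print(f"at the sun's surface: g_tt = {g_tt_sun:.8f}, g_rr = {g_rr_sun:.8f}")
```

Because $R_s$ is so much smaller than the actual size of either body, both coefficients differ from their flat-space values of minus one and plus one only by a few parts in a million or less, even at the surface of the sun.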
THE EXPERIMENTAL VERIFICATION OF GENERAL RELATIVITY
Now that the Schwarzschild metric is in hand, we know what is the
spacetime geometry around any round object. Now, what can we do with
it? Well, in principle, one can do just about anything. The metric encodes
all of the information about the geometry, and thus all of the information
about geodesics. Any freely falling worldline (like, say, that of an orbiting
planet) is a geodesic. So, one thing that can be done is to compute the orbits
of the planets. Another would be to compute various gravitational time dilation
effects.
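As one example of a time dilation effect, consider a clock at rest at radius r. Setting dr, dθ and dφ to zero in the Schwarzschild metric leaves a proper time per unit coordinate time of $\sqrt{1 - R_s/r}$. The sketch below uses this to compare a clock on the earth's surface with one held at the radius of a GPS orbit. The orbital radius, the earth's mass and the earth's radius are assumed reference values, and the calculation deliberately ignores the satellite's motion and the earth's rotation, so it captures only the gravitational part of the effect.

```python
import math

G = 6.674e-11              # Newton's constant, N m^2 / kg^2
C_LIGHT = 299_792_458.0    # speed of light, m / s

# Assumed reference values (not from the text).
M_EARTH = 5.972e24         # kg
R_EARTH = 6.371e6          # m, radius of the earth
R_GPS_ORBIT = 2.656e7      # m, distance of a GPS satellite from the earth's centre

def schwarzschild_radius(mass):
    return 2.0 * G * mass / C_LIGHT**2

def static_clock_rate(r, r_s):
    """Proper time elapsed per unit coordinate time for a clock at rest
    at radius r in the Schwarzschild geometry."""
    return math.sqrt(1.0 - r_s / r)

r_s = schwarzschild_radius(M_EARTH)
rate_ground = static_clock_rate(R_EARTH, r_s)
rate_orbit = static_clock_rate(R_GPS_ORBIT, r_s)

# How far the higher clock gets ahead of the ground clock in one day.
seconds_per_day = 86_400.0
gain_microseconds = (rate_orbit / rate_ground - 1.0) * seconds_per_day * 1.0e6

print(f"higher clock gains about {gain_microseconds:.1f} microseconds per day")
```

The higher clock runs faster, by a few tens of microseconds per day. That is small, but far larger than the timing accuracy a navigation system needs.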
Having arrived at the Schwarzschild solution, we are finally at the point
where Einstein's ideas have a lot of power. They now predict the curvature
around any massive object (the sun, the earth, the moon, etc.). So, Einstein
started looking for predictions that could be directly tested by experiment to
check that he was actually right.
This makes an interesting contrast with special relativity, in which
quite a bit of experimental data was already available before Einstein
constructed the theory.
In the case of GR, Einstein was guided for a long time by a lot of
intuition (i.e., guesswork) and, for the most part, the experiments came
only after he had constructed the theory.
We have already discussed a few pieces of experimental evidence (such as the Pound-Rebka
and GPS experiments), but these occurred only in 1959 and in the 1990's! Einstein
finished developing General Relativity in 1916 and certainly wanted to find
an experiment that could be done soon after.
The Planet Mercury
We have seen that Einstein's theory of gravity agrees with Newton's when
the gravitational fields are weak (i.e., far away from any massive object). But,
the discrepancy increases as the field gets stronger. So, the best place (around
here) to look for new effects is close to the sun. One might therefore start by
considering the orbit of Mercury. Actually, there is an interesting story about
Mercury and its orbit. Astronomers had been tracking the motion of the planets
for hundreds of years. Ever since Newton, they had been comparing these
motions to what Newton's law of gravity predicted.
The agreement was incredible. In the early 1800's, they had found small
discrepancies (30 seconds of arc in 10 years) in the motion of Uranus. For
a while people thought that Newton's law of gravity might not be exactly right.
However, someone then had the idea that maybe there were other objects out
there whose gravity affected Uranus. They used Newton's law of gravity to
predict the existence of new planets: Neptune, and later Pluto. They could
even tell astronomers where to look for Neptune within about a degree of
angle on the sky.
However, there was one discrepancy with Newton's laws that the
astronomers could not explain. This was the 'precession' of Mercury's orbit.
The point is that, if there were nothing else around, Newton's law of gravity
would say that Mercury would move in a perfect ellipse around the sun,
retracing its path over, and over, and over...
[Figure: Mercury retracing a closed elliptical orbit around the sun]
Of course, there are small tugs on Mercury by the other planets that
modify this behaviour. However, the astronomers knew how to account
for these effects. Their results seemed to say that, even if the other planets
and such were not around, Mercury would do a sort of spiral dance around
the sun, following a path that looks more like this:
[Figure: Mercury's elliptical orbit precessing around the sun]
Here, the ellipse itself is rotating (a.k.a. 'precessing') about the sun.
After all known effects had been taken into account, astronomers found
that Mercury's orbit precessed by an extra 43 seconds of arc per century.
This is certainly not very much, but the astronomers already understood
all of the other planets to a much higher accuracy. So, what was going
wrong with Mercury? Most astronomers thought that it must be due to
some sort of gas or dust surrounding the Sun (a big 'solar atmosphere') that
was somehow affecting Mercury's orbit.
However, Einstein knew that his new theory of gravity would predict a
precession of Mercury's orbit for two reasons. First, he predicted a slightly
stronger gravitational field (since the energy in the gravitational field itself
acts as a source of gravity). Second, in Einstein's theory, space itself is curved
and this effect will also make the ellipse precess (though, since the velocity of
Mercury is small, this effect turns out to be much smaller than the one due to
the stronger gravitational field).
The number that Einstein calculated from his theory was 43 seconds
of arc per century. That is, his prediction agreed with the experimental
data to better than 1%! Clearly, Einstein was thrilled.
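The size of this number can be checked with a short calculation. The sketch below is not taken from the text: it assumes the standard general relativistic result that a nearly Newtonian orbit precesses by 6*pi*G*M / (c^2 * a * (1 - e^2)) radians per orbit, and it uses approximate published values for Mercury's semi-major axis a, eccentricity e and orbital period. The function and constant names are illustrative only. With these inputs the result should land close to the 43 seconds of arc per century quoted above.

```python
import math

# Assumed input values (approximate, not given in the text).
GM_SUN = 1.32712440018e20       # G times the Sun's mass, m^3/s^2
C = 299_792_458.0               # speed of light, m/s
SEMI_MAJOR_AXIS = 5.7909e10     # Mercury's semi-major axis, m
ECCENTRICITY = 0.2056           # Mercury's orbital eccentricity
ORBITAL_PERIOD_DAYS = 87.969    # Mercury's orbital period, days
DAYS_PER_CENTURY = 36525.0
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def precession_per_orbit(gm, semi_major_axis, eccentricity):
    """Extra perihelion advance per orbit, in radians.

    Uses the standard weak-field formula 6*pi*GM / (c^2 * a * (1 - e^2)),
    which is an assumption brought in from outside the text.
    """
    return 6.0 * math.pi * gm / (C**2 * semi_major_axis * (1.0 - eccentricity**2))


def mercury_precession_arcsec_per_century():
    """Accumulate the per-orbit advance over one century of Mercury orbits."""
    orbits_per_century = DAYS_PER_CENTURY / ORBITAL_PERIOD_DAYS
    per_orbit = precession_per_orbit(GM_SUN, SEMI_MAJOR_AXIS, ECCENTRICITY)
    return per_orbit * orbits_per_century * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    print(f"{mercury_precession_arcsec_per_century():.2f} arcsec per century")
```

The same function can be pointed at other planets by changing the orbital inputs; the effect falls off quickly with distance from the sun, which is why Mercury is the natural place to look.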
This was big news. However, it would have been even bigger news if
Einstein had predicted this result before it had been measured. Physicists
are always skeptical of just explaining known effects. After all, maybe the
scientist (intentionally or not) fudged the numbers or the theory to get
the desired result? So, physicists tend not to really believe a theory until it
predicts something new that is then verified by experiments.
This is the same sort of idea as in double blind medical trials, where
even the researchers don't know what effect they want a given pill to have
on a patient!
The Bending of Starlight
Luckily, Einstein had an idea for such an effect and now had enough
confidence in his theory to push it through. The point is that, as we have
discussed, light will fall in a gravitational field. For example, a laser beam
fired horizontally across the classroom will be closer to the ground on the
side where it hits the far wall than it was when it left the laser.
Similarly, a ray of light that goes skimming past a massive object (like
the sun) will fall a bit toward the sun. The net effect is that this light ray is
bent. Suppose that the ray of light comes from a star. What this means in
the end is that, when the Sun is close to the line connecting us with the
star, the star appears to be in a slightly different place than when the Sun
is not close to that light ray. For a light ray that just skims the surface of
the Sun, the effect is about 0.875 seconds of arc.
[Figure: the actual path of the light ray 'falls' toward the Sun, so the star appears to be displaced from its true position]
However, this is not the end of the story. It turns out that there is
also another effect which causes the ray to bend. This is due to the effect
of the curvature of space on the light ray. This effect turns out to be exactly
the same size as the first effect, and with the same sign. As a result, Einstein
predicted a total bending angle of 1.75 seconds of arc - twice what would
come just from the observation that light falls in a gravitational field.
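Both numbers are easy to reproduce. The sketch below assumes the standard formulas, which the text does not write out: a ray passing a mass M at closest distance b is deflected by 2GM/(c^2 b) from 'falling' alone, and by 4GM/(c^2 b) once the curvature of space is included. The solar values and function names are illustrative assumptions.

```python
import math

# Assumed input values (approximate, not given in the text).
GM_SUN = 1.32712440018e20   # G times the Sun's mass, m^3/s^2
C = 299_792_458.0           # speed of light, m/s
R_SUN = 6.957e8             # radius of the Sun, m
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def falling_light_deflection(gm, closest_approach):
    """Deflection angle (radians) from light 'falling' alone: 2GM/(c^2 b)."""
    return 2.0 * gm / (C**2 * closest_approach)


def einstein_deflection(gm, closest_approach):
    """Full deflection angle (radians), including space curvature: 4GM/(c^2 b)."""
    return 4.0 * gm / (C**2 * closest_approach)


def to_arcsec(angle_radians):
    """Convert an angle from radians to seconds of arc."""
    return angle_radians * ARCSEC_PER_RADIAN


if __name__ == "__main__":
    print("falling only:", to_arcsec(falling_light_deflection(GM_SUN, R_SUN)))
    print("full result: ", to_arcsec(einstein_deflection(GM_SUN, R_SUN)))
```

Because the angle scales as 1/b, a ray passing at two solar radii is bent by only half as much, which is one reason the measurement needs stars very close to the edge of the Sun.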
This is a tricky experiment to perform, because the Sun is bright
enough that any star that close to the sun is very hard to see. One solution
is to wait for a solar eclipse (when the moon pretty much blocks out the
light from the sun itself) and then one can look at the stars nearby.
Just such an observation was performed by the British physicist Sir
Arthur Eddington in 1919. The result indicated a bending angle of right
around 2 seconds of arc.
More recently, much more accurate versions of this experiment have
been performed which verify Einstein's theory to high precision. See Theory
and experiment in gravitational physics by Clifford M. Will, (Cambridge
University Press, New York, 1993) QC178.W47 1993 for a modern
discussion of these issues.
Other Experiments: Radar Time Delay
The bending of starlight was the really big victory for Einstein's theory.
However, there are two other classic experimental tests of general relativity
that should be mentioned. One of these is just the effect of gravity on the
frequency of light that we have already discussed. As we said before, this
had to wait quite a long time (until 1959) before technology progressed to
the stage where it could be performed.
The last major class of experiments is called 'Radar Time delay.' These
turn out to be the most accurate tests of Einstein's theory, but they had to
wait until even more modern times. The point is that the gravitational
field affects not only the path through space taken by a light ray, but
also the time that the light ray takes to trace out that path. As
we have discussed once or twice before, time measurements can be made
extremely accurately. So, these experiments can be done to very high
precision.
The idea behind these experiments is that you send a microwave
(a.k.a. radar) signal (which is basically a long wavelength light wave) over
to the other side of the sun and back.
You can either bounce it off a planet (say, Venus) or a space probe
that you have sent over there for just this purpose. If you measure the
time it takes for the signal to go over and then return, this time is always
longer than it would have been in flat spacetime. In this way, you can
carefully test Einstein's theory.
Chapter 9
Black Holes
INVESTIGATING THE SCHWARZSCHILD METRIC
We computed the gravitational effect on time dilation earlier. However, in
this computation we needed to know the gravitational acceleration g(l). We
could of course use Newton's prediction for g(l), which experiments tell us is
approximately correct near the earth. However, in general we expect this to
be the correct answer only for weak gravitational fields.
On the other hand, we know that the Schwarzschild metric describes
the gravitational field around a spherical object even when the field is
strong. So, what we will do is to first use the Schwarzschild metric to
compute the gravitational time dilation effect directly. We will then be
able to use the relation between this time dilation and the gravitational
acceleration to compute the corrections to Newton's law of gravity.
Gravitational Time Dilation from the Metric
Suppose we want to calculate how clocks run in this gravitational
field. This has to do with proper time $d\tau$, so we should remember that
$d\tau^2 = -ds^2$. For the Schwarzschild metric we have:
$$d\tau^2 = -ds^2 = \left(1 - \frac{R_s}{r}\right)dt^2 - \frac{dr^2}{1 - R_s/r} - r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right).$$
The Schwarzschild metric describes any spherically symmetric
gravitational field in the region outside of all the matter. So, for example,
it gives the gravitational field outside of the earth. In using the
Schwarzschild metric, remember that $R_s = 2MG/c^2$.
Let's think about a clock that just sits in one place above the earth. It
does not move toward or away from the earth, and it does not go around
the earth. It just 'hovers.'
Perhaps it sits in a tower, or is in some rocket ship whose engine is
tuned in just the right way to keep it from going either up or down. Such
a clock is called a static clock since, from its point of view, the gravitational
field does not change with time.
Since r, $\theta$, and $\phi$ do not change, we have $dr = d\theta = d\phi = 0$. So, on our
clock's worldline we have just $d\tau^2 = \left(1 - \frac{R_s}{r}\right)dt^2$. That is,
$$d\tau = \sqrt{1 - \frac{R_s}{r}}\,dt.$$
Note that if the clock is at $r = \infty$ then the square root factor is equal
to 1. So, we might write $d\tau_\infty = dt$. In other words, $d\tau = \sqrt{1 - R_s/r}\,d\tau_\infty$ or,
$$\frac{\Delta\tau}{\Delta\tau_\infty} = \sqrt{1 - \frac{R_s}{r}}.$$
As we saw before, clocks higher up run faster. Now, however, the answer
seems to take a somewhat simpler form than when we were using only the
Newtonian approximation.
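To get a feeling for the size of this effect, the square root factor can be evaluated for static clocks near the earth. The following is only a sketch: the values of G times the earth's mass, the earth's radius and the sample altitude are approximate assumptions, and the function names are made up for illustration.

```python
import math

# Assumed input values (approximate, not given in the text).
C = 299_792_458.0             # speed of light, m/s
GM_EARTH = 3.986004418e14     # G times the Earth's mass, m^3/s^2
R_EARTH = 6.371e6             # mean radius of the Earth, m


def schwarzschild_radius(gm):
    """R_s = 2MG/c^2, with the product G*M passed in directly."""
    return 2.0 * gm / C**2


def static_clock_rate(r, r_s):
    """Ratio of a static clock's proper time at radius r to that of a clock at infinity."""
    if r < r_s:
        raise ValueError("static clocks exist only for r >= R_s")
    return math.sqrt(1.0 - r_s / r)


if __name__ == "__main__":
    r_s = schwarzschild_radius(GM_EARTH)
    surface = static_clock_rate(R_EARTH, r_s)
    high_up = static_clock_rate(R_EARTH + 2.0e7, r_s)  # 20,000 km up (illustrative)
    print("Schwarzschild radius of the earth (m):", r_s)
    print("fractional slowing at the surface:", 1.0 - surface)
    print("higher clock runs faster:", high_up > surface)
```

For the earth the factor differs from 1 by less than one part in a billion, which is why such accurate clocks are needed to see the effect.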
Corrections to Newton's Law
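Note that the Schwarzschild geometry is a time independent
gravitational field. Earlier, we related the rate at which various clocks run
to the acceleration of freely falling observers. In other words, we can use
this to compute the corrections to Newton's law of gravity.
The relation is
$$\frac{\Delta\tau_{s_2}}{\Delta\tau_{s_1}} = \exp\left(\int_{s_1}^{s_2}\frac{a(s)}{c^2}\,ds\right).$$
Here, $a(s)$ is the acceleration of a static clock relative to a freely falling
clock at s, s measures distance, and $s_1$ and $s_2$ are the positions of the two
clocks. To compare this with our formula above, we want to take $s_1 = s$ and
$s_2 = \infty$. Taking the ln of both sides gives
$$\ln\frac{\Delta\tau_\infty}{\Delta\tau_s} = \int_s^\infty \frac{a(s')}{c^2}\,ds'.$$
Now, taking a derivative with respect to s we find:
$$\frac{a(s)}{c^2} = -\frac{d}{ds}\ln\frac{\Delta\tau_\infty}{\Delta\tau_s}.$$
Now, it is important to know what exactly s measures in this formula.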
When we derived this result we were interested in the actual physical height
of a tower. As a result, this s describes proper distance, say, above the surface
of the earth.
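On the other hand, our time dilation formula is given in terms of r which, it
turns out, does not describe proper distance. To see this, let's think about the
proper distance ds along a radial line with $dt = d\theta = d\phi = 0$. In this case, we have
$$ds^2 = \frac{dr^2}{1 - R_s/r}, \quad\text{or}\quad ds = \frac{dr}{\sqrt{1 - R_s/r}}, \quad\text{and}\quad \frac{dr}{ds} = \sqrt{1 - R_s/r}.$$
However we can deal with this by using the chain rule:
$$a = -c^2\frac{d}{ds}\ln\frac{\Delta\tau_\infty}{\Delta\tau_s} = c^2\,\frac{dr}{ds}\,\frac{d}{dr}\ln\frac{\Delta\tau(r)}{\Delta\tau_\infty}.$$
Going through the calculation yields:
$$a = c^2\sqrt{1 - R_s/r}\;\frac{d}{dr}\ln\sqrt{1 - R_s/r}
= \frac{c^2}{2}\sqrt{1 - R_s/r}\;\frac{d}{dr}\ln\left(1 - R_s/r\right)
= \frac{c^2}{2}\,\frac{\sqrt{1 - R_s/r}}{1 - R_s/r}\,\frac{R_s}{r^2}
= \frac{c^2}{2\sqrt{1 - R_s/r}}\,\frac{R_s}{r^2}.$$
Note that for $r \gg R_s$, we have $a \approx \frac{c^2}{2}\frac{R_s}{r^2} = \frac{MG}{r^2}$. This is exactly
Newton's result.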
However, for small r, a is much bigger. In particular, look at what
happens when $r = R_s$. There we have $a(R_s) = \infty$! So, at $r = R_s$, it takes an
infinite proper acceleration for a clock to remain static. A static person at
$r = R_s$ would therefore feel infinitely heavy. This is clearly a very special
value of the radius coordinate, r. This value is known as the Schwarzschild
radius. Now, let's remember that the Schwarzschild metric only gives the
right answer outside of all of the matter.
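The behaviour of the proper acceleration of a static clock can be explored numerically. The sketch below evaluates $a = c^2 R_s / (2 r^2 \sqrt{1 - R_s/r})$ and compares it with Newton's $MG/r^2$. The numerical values for the earth and the function names are assumptions made for illustration.

```python
import math

# Assumed input values (approximate, not given in the text).
C = 299_792_458.0      # speed of light, m/s
G = 6.674e-11          # Newton's constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24     # mass of the Earth, kg
R_EARTH = 6.371e6      # mean radius of the Earth, m


def schwarzschild_radius(mass):
    """R_s = 2MG/c^2."""
    return 2.0 * mass * G / C**2


def newtonian_acceleration(r, r_s):
    """Newton's result MG/r^2, written as c^2 R_s / (2 r^2)."""
    return C**2 * r_s / (2.0 * r**2)


def static_acceleration(r, r_s):
    """Proper acceleration needed to stay static at radius r > R_s."""
    if r <= r_s:
        raise ValueError("no static observers at or inside r = R_s")
    return newtonian_acceleration(r, r_s) / math.sqrt(1.0 - r_s / r)


if __name__ == "__main__":
    r_s = schwarzschild_radius(M_EARTH)
    print("at the earth's surface (m/s^2):", static_acceleration(R_EARTH, r_s))
    # Ratio to Newton's value as r approaches R_s (only meaningful if all
    # of the matter lies inside the radius being considered).
    for factor in (100.0, 2.0, 1.01, 1.0001):
        r = factor * r_s
        ratio = static_acceleration(r, r_s) / newtonian_acceleration(r, r_s)
        print(f"r = {factor} R_s: a / a_Newton = {ratio:.3f}")
```

At the surface of the earth the correction factor is indistinguishable from 1, while close to $r = R_s$ the ratio grows without bound.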
Suppose then that the actual physical radius of the matter is bigger
than the associated Schwarzschild radius (as is the case for the earth and
the Sun). In this case, you will not see the effect described above since the
place where it would have occurred ($r = R_s$) is inside the earth where the
matter is non-zero and the Schwarzschild metric does not apply.
But what if the matter source is very small so that its physical radius
is less than Rs? Then the Schwarzschild radius R s will lie outside the matter
at a place you could actually visit. In this case, we call the object a "black
hole." You will see why in a moment.
ON BLACK HOLES
Objects that are smaller than their Schwarzschild radius (i.e., black
holes) are one of the most intriguing features of general relativity. We
now proceed to explore them in some detail, discussing both the formation
of such objects and a number of their interesting properties. Although
black holes may seem very strange at first, we will soon find that many of
their properties are in fact quite similar to features that we encountered in our
development of special relativity some time ago.
Forming a Black Hole
A question that often arises when discussing black holes is whether
such objects actually exist or even whether they could be formed in
principle. After all, to get $R_s = 2MG/c^2$ to be bigger than the actual radius
of the matter, you've got to put a lot of matter in a very small space, right?
So, maybe matter just can't be compactified that much. In fact, it turns
out that making black holes (at least big ones) is actually very easy.
In order to stress the importance of understanding black holes and
the Schwarzschild radius in detail, we'll first talk about just why making a
black hole is so easy before going on to investigate the properties of black
holes. Suppose we want to make a black hole out of, say, normal rock.
What would be the associated Schwarzschild radius? We know that
$R_s = 2MG/c^2$. Suppose we have a big ball of rock of radius r.
How much mass is in that ball? Well, our experience is that rock does
not curve spacetime so much, so let's use the flat space formula for the
volume of a sphere: $V = \frac{4}{3}\pi r^3$. The mass of the ball of rock is determined
by its density, $\rho$, which is just some number. The mass of the ball of
rock is therefore $M = \rho V = \frac{4\pi}{3}\rho r^3$. The associated Schwarzschild
radius is then
$$R_s = \frac{8\pi G}{3c^2}\,\rho\, r^3.$$
Now, for large enough r, any cubic function is bigger than r. In
particular, we get $r = R_s$ at
$$r = \left(\frac{3c^2}{8\pi G\rho}\right)^{1/2},$$
and there is a solution no matter
what the value of $\rho$! So, a black hole can be made out of rock, without
even working hard to compress it more than normal, so long as we just have
enough rock. Similarly, a black hole could be made out of people, so long as
we had enough of them: just insert the average density of a person in the formula
above.
A black hole could even be made out of very diffuse air or gas, so
long as we had enough of it. For air at normal density, we would need
a ball of air $10^{13}$ meters across. For comparison, the Sun is $10^9$ meters
across, so we would need a ball of gas 10,000 times larger than the Sun (in
terms of radius).
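The estimate above can be turned into a few lines of code. The sketch below evaluates the radius at which a uniform ball of density $\rho$ just fits inside its own Schwarzschild radius, using the flat space volume formula as in the text. The densities chosen for rock, water and air are rough assumed values, and the function names are illustrative.

```python
import math

C = 299_792_458.0   # speed of light, m/s
G = 6.674e-11       # Newton's constant, m^3 kg^-1 s^-2

# Rough densities in kg/m^3 (assumed for illustration).
DENSITIES = {"rock": 3000.0, "water": 1000.0, "air": 1.2}


def schwarzschild_radius_of_ball(density, radius):
    """R_s = (8 pi G / 3 c^2) * rho * r^3 for a uniform ball."""
    return 8.0 * math.pi * G * density * radius**3 / (3.0 * C**2)


def critical_radius(density):
    """Radius at which the ball's size equals its Schwarzschild radius."""
    return math.sqrt(3.0 * C**2 / (8.0 * math.pi * G * density))


if __name__ == "__main__":
    for name, rho in DENSITIES.items():
        r = critical_radius(rho)
        print(f"{name}: critical radius = {r:.3e} m")
```

Since the critical radius scales as $1/\sqrt{\rho}$, lowering the density by a factor of 100 only increases the required size by a factor of 10.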
Black holes in nature seem to come in two basic kinds. The
first kind consists of small black holes whose mass is a few times the mass
of the Sun. These form at the end of a star's life cycle when nuclear fusion
no longer produces enough heat (and thus pressure) to hold up the star.
The star then collapses and compresses to enormous densities. Such
collapses are accompanied by extremely violent processes called
supernovae. The second kind consists of huge black holes, whose mass is
$10^6$ (a million) to $10^{10}$ (ten billion) times the mass of the sun. Some black
holes may be even larger.
Astronomers tell us that there seems to be a large black hole at the
centre of every galaxy, or almost every galaxy. These large black holes are
much easier to form than are small ones and do not require especially
high densities. To pack the mass of $10^6$ suns within the corresponding
Schwarzschild radius requires an average density of roughly $10^7$ kg/m$^3$.
This is about ten thousand times the density of the Sun itself (which is
comparable to the density of water or rock), but far below the densities
needed to make a black hole out of a single star.
One can imagine such a black hole forming in the centre of a galaxy,
where the stars are densely packed, just by having a few million stars
wander in very close together.
The larger black holes are even easier to make: to pack a mass of ten
billion suns within the corresponding Schwarzschild radius requires an
average density of only about one tenth the density of air! It could form
from just a very large cloud of very thin gas.
Matter within the Schwarzschild Radius
Since black holes exist (or at least could easily be made) we're going
to have to think more about what is going on at the Schwarzschild radius.
At first, the Schwarzschild radius seems like a very strange place. There, a
rocket would require an infinite proper acceleration to keep from falling
in. So what about the matter that first formed the black hole itself? Where
is that matter and what is it doing?
Let's go through this step by step. Let us first ask if there can be matter
sitting at the Schwarzschild radius (as part of a static star or ball of gas).
Clearly not, since the star or ball of gas cannot produce the infinite force
that would be needed to keep its atoms from falling inward. The star or ball of
gas must contract. Even more than this, the star will already be contracting
when it reaches the Schwarzschild radius and, since gravitation produces
accelerations, it must cause this rate of contraction to increase.
Now, what happens when the star becomes smaller than its
Schwarzschild radius? The infinite acceleration of static observers at the
Schwarzschild radius suggests that the Schwarzschild metric may not be
valid inside Rs. As a result we cannot yet say for sure what happens to
objects that have contracted within Rs. However, we would certainly find
it odd if the effects of gravity became weaker when the object was
compressed. Thus, since the object has no choice but to contract (faster
and faster) when it is of size Rs, one would expect smaller objects also to
have no choice but accelerated contraction!
It now seems that in a finite amount of time the star must shrink to
an object of zero size, a mathematical point. This most 'singular' occurrence
(to quote Sherlock Holmes) is called a 'singularity.' But, once it reaches
zero size, what happens then? This is an excellent question, but we are
getting ahead of ourselves. For the moment, let's go back out to the
Schwarzschild radius and find out what is really going on there.
The Schwarzschild Radius and the Horizon
Not only does a clock require an infinite acceleration to remain static
at the Schwarzschild radius, but something else interesting happens there
as well. Let's look back at the formula we had for the time measured by a
static clock:
$$\Delta\tau(r) = \sqrt{1 - \frac{R_s}{r}}\;\Delta\tau_\infty.$$
Notice what happens at the Schwarzschild radius. Since $r = R_s$, we
have $\Delta\tau = 0$. Our clock stops, and no time passes at all.
Now, this is certainly very weird, but perhaps it rings a few bells? It
should sound vaguely familiar.... clocks running infinitely slow at a place
where the acceleration required to keep from falling becomes infinite....
You may recall that the same thing occurred for the acceleration horizons
back in special relativity.
This gives us a natural guess for what is going on near the
Schwarzschild radius. In fact, let us recall that any curved spacetime is
locally flat. So, if our framework holds together at the Schwarzschild radius
we should be able to match the region near r = R s to some part of
Minkowski space. Perhaps we should match it to the part of Minkowski
space near an acceleration horizon? Let us guess that this is correct and
then proceed to check our answer.
We will check our answer using the equivalence principle. The point is
that an accelerating coordinate system in flat spacetime contains an apparent
gravitational field.
There is some nontrivial proper acceleration a that is required to
remain static at each position. Furthermore, this proper acceleration is
not the same at all locations, but instead becomes infinitely large as one
approaches the horizon.
What we want to do is to compare this apparent gravitational field
(the proper acceleration a(s), where s is the proper distance from the
horizon) near the acceleration horizon with the corresponding proper
acceleration a(s) required to remain static a small proper distance s away
from the Schwarzschild radius.
If the two turn out to be the same then this will mean that static
observers have identical experiences in both cases. But, the experiences of
static observers are related to the experiences of freely falling observers.
Thus, if we then consider freely falling observers in both cases, they will
also describe both situations in the same way. It will then follow that
physics near the event horizon is identical to physics near an acceleration
horizon - something that we understand well from special relativity.
In flat spacetime the proper acceleration required to maintain a
constant proper distance s from the acceleration horizon (e.g., from event
Z) is given by
$$a = c^2/s.$$
Now, so far this does not look much like our result for the black hole.
However, we should again recall that r and s represent different quantities.
That is, r does not measure proper distance. Instead, we have
$$ds = \frac{dr}{\sqrt{1 - R_s/r}} = \sqrt{\frac{r}{r - R_s}}\;dr.$$
We are interested in the behaviour when $r - R_s$ is small. To examine this, it is
useful to introduce the quantity $\Delta = r - R_s$. We can then write the above
formula as $ds = \sqrt{r/\Delta}\;d\Delta$. Integrating, we get
$$s = \int_0^{\Delta}\sqrt{\frac{r}{\Delta'}}\;d\Delta'.$$
This integral is hard to perform exactly since $r = R_s + \Delta$ is a function
of $\Delta$. However, since we are only interested in small $\Delta$ (for our local
comparison), r doesn't differ much from $R_s$. So, we can simplify our work
and still maintain sufficient accuracy by replacing r in the above integral
by $R_s$. The result is:
$$s \approx \sqrt{R_s}\int_0^{\Delta}\frac{d\Delta'}{\sqrt{\Delta'}} = 2\sqrt{R_s\Delta}.$$
Let us use this to write a for the black hole (let's call this $a_{BH}$) in
terms of the proper distance s. From above, we have
$$a_{BH} = \frac{c^2}{2\sqrt{1 - R_s/r}}\,\frac{R_s}{r^2}
= \frac{c^2}{2}\sqrt{\frac{r}{r - R_s}}\,\frac{R_s}{r^2}
= \frac{c^2}{2}\sqrt{\frac{r}{\Delta}}\,\frac{R_s}{r^2}
\approx \frac{c^2}{2\sqrt{R_s\Delta}}
= \frac{c^2}{s}.$$
Note that this is identical to the expression for a near an acceleration
horizon. It worked! Thus we can conclude:
Near the Schwarzschild radius, the black hole spacetime is just the
same as flat spacetime near an acceleration horizon.
The part of the black hole spacetime at the Schwarzschild radius is
known as the horizon of the black hole.
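The quality of this near-horizon approximation can be checked numerically. The sketch below works in units where c = 1 and $R_s = 1$ (an assumption made purely for convenience), integrates the proper distance from the horizon without replacing r by $R_s$, and compares the product $a_{BH}\, s / c^2$ with the flat spacetime value of 1. The substitution $\Delta' = u^2$ used inside the integrator, like the function names, is a choice made here and is not part of the text.

```python
import math


def proper_distance_from_horizon(delta, r_s=1.0, steps=1000):
    """Proper distance s from r = R_s out to r = R_s + delta.

    Integrates ds = sqrt(r / Delta') dDelta' with r = R_s + Delta'.
    Substituting Delta' = u**2 gives ds = 2 * sqrt(R_s + u**2) du, which
    is smooth at the horizon, so a simple midpoint rule works well.
    """
    u_max = math.sqrt(delta)
    h = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += 2.0 * math.sqrt(r_s + u * u) * h
    return total


def static_acceleration(r, r_s=1.0, c=1.0):
    """Proper acceleration c^2 R_s / (2 r^2 sqrt(1 - R_s/r)) of a static clock."""
    return c**2 * r_s / (2.0 * r**2 * math.sqrt(1.0 - r_s / r))


def acceleration_times_distance(delta, r_s=1.0, c=1.0):
    """a_BH * s / c^2, which equals 1 exactly near an acceleration horizon."""
    r = r_s + delta
    return static_acceleration(r, r_s, c) * proper_distance_from_horizon(delta, r_s) / c**2


if __name__ == "__main__":
    for delta in (1e-1, 1e-2, 1e-4, 1e-6):
        print(f"Delta = {delta:g} R_s: a*s/c^2 = {acceleration_times_distance(delta):.6f}")
```

The product approaches 1 as $\Delta$ shrinks, which is the numerical counterpart of the statement that the region near the Schwarzschild radius looks like flat spacetime near an acceleration horizon.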
Going Beyond the Horizon
We are of course interested in what happens when we go below the
horizon of a black hole. However, the connection with acceleration
horizons tells us that we will need to be careful in investigating this
question. In particular, so far we have made extensive use of static
observers - measuring the acceleration of freely falling frames relative to them.
Static observers were also of interest when discussing acceleration horizons -
so long as they were outside of the acceleration horizons.
The past and future acceleration horizons divided Minkowski space
into four regions: static worldlines did not enter two of these at all, and in
another region static worldlines would necessarily move 'backwards in
time.' The fourth region was the normal 'outside' region, and we concluded
that true static observers could only exist there.
We have seen that the spacetime near the black hole horizon is just
like that near an acceleration horizon. As a result, there will again be no
static observers below the horizon.
We suspected this earlier based on the idea that it takes infinite
acceleration to remain static at the horizon and we expected the
gravitational effects to be even stronger deeper inside.
Based on our experience with acceleration horizons, we now begin to
see how this may in fact be possible. It has become clear that we will need
to abandon static observers in order to describe the region below the black
hole horizon.
[Figure: spacetime diagram of flat spacetime near an acceleration horizon, with the past acceleration horizon labelled]
Suppose then that we think about freely falling observers instead. As
we know, freely falling observers typically have the simplest description
of spacetime.
Using the connection with acceleration horizons, we see immediately
how to draw a (freely falling) spacetime diagram describing physics near
the Schwarzschild radius. It must look just like our diagram above for
flat spacetime viewed from an inertial frame near an acceleration horizon!
Note that $r = R_s$ for the black hole is like $s = 0$ for the acceleration horizon
since $a \to \infty$ in both cases.
The important part of this is that $s = 0$ is not only the event Z, but is
in fact the entire horizon! This is because events separated by a light ray
are separated by zero proper distance.
It also follows from continuity since, arbitrarily close to the light rays
shown below we clearly have a curve of constant r for r arbitrarily close
to $R_s$. So, $r = R_s$ is also the path of a light ray, and forms a horizon in the
black hole spacetime. In the black hole context, the horizon is often
referred to as the 'event horizon of the black hole.'
[Figure: spacetime diagram near the black hole horizon, with the past horizon labelled]
Let us review our discussion so far. We realized that, so long as we
were outside the matter that is causing the gravitational field, any
spherically symmetric (a.k.a. 'round') gravitational field is described by
the Schwarzschild metric. This metric has a special place, at r = R s , the
'Schwarzschild Radius.' Any object which is smaller than its Schwarzschild
Radius will be surrounded by an event horizon, and we call such an object
a black hole.
If we look far away from the black hole, at $r \gg R_s$, then the
gravitational field is much like what Newton would have predicted for an
object of that mass. There is of course a little gravitational time dilation,
and a little curvature, but not much.
Indeed, the Schwarzschild metric describes the gravitational field not
only of a black hole, but of the earth, the Sun, the moon, and any other
round object. However, for those more familiar objects, the surface of the
object is at $r \gg R_s$. For example, on the surface of the Sun $r/R_s \approx 2 \times 10^5$.
So, far from a black hole, objects can orbit just like planets orbit the
Sun. By the way, remember that orbiting objects are freely falling - they do
not require rocket engines or other forces to keep them in orbit. However,
suppose that we look closer in to the horizon. What happens then?
In one of the homework problems, you will see that something
interesting happens to orbiting objects when they orbit at $r = 3R_s/2$. There,
an orbiting object experiences no proper time: $\Delta\tau = 0$. This means that
the orbit at this radius is a lightlike path. In other words, a ray of light
will orbit the black hole in a circle at $r = 3R_s/2$. For this reason, this region
is known as the 'photon sphere.' This makes for some very interesting visual
effects if you would imagine traveling to the photon sphere.
This is not to say that light cannot escape from the photon sphere. The
point is that, if the light is moving straight sideways (around the black hole)
then the black hole's gravity is strong enough to keep the light from moving
farther away.
However, if the light were directed straight outward at the photon
sphere, it would indeed move outward, and would eventually escape.
And what about closer in, at $r < 3R_s/2$? Any circular orbit closer in is
spacelike, and represents an object moving faster than the speed of light.
So, given our usual assumptions about physics, nothing can orbit the black
hole closer than $r = 3R_s/2$. Any freely falling object that moves inward
past the photon sphere will continue to move to smaller and smaller values
of r. However, if it ceases to be freely falling (by colliding with something
or turning on a rocket engine) then it can still return to larger values of r.
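The statement about proper time on circular orbits can be illustrated with a short sketch. It relies on a standard result that is not derived in the text and should be read as an assumption here: a freely falling circular orbit at coordinate radius r has angular velocity $\Omega$ with $\Omega^2 = MG/r^3$, so that $d\tau^2 = (1 - R_s/r)\,dt^2 - r^2 d\phi^2/c^2$ gives $d\tau/dt = \sqrt{1 - 3R_s/(2r)}$. The function name is illustrative.

```python
import math


def circular_orbit_clock_rate(r, r_s):
    """d(tau)/dt for a freely falling circular orbit at radius r.

    Assumes Omega^2 = MG/r^3 = c^2 R_s / (2 r^3), which turns
    1 - R_s/r - r^2 Omega^2 / c^2 into 1 - 3 R_s / (2 r).
    """
    x = 1.0 - 1.5 * r_s / r
    if x < 0.0:
        raise ValueError("circular orbits inside r = 3 R_s / 2 are spacelike")
    return math.sqrt(x)


if __name__ == "__main__":
    r_s = 1.0  # work in units of the Schwarzschild radius
    for r in (1000.0, 10.0, 3.0, 1.6, 1.5):
        print(f"r = {r} R_s: dtau/dt = {circular_orbit_clock_rate(r, r_s):.4f}")
```

The rate drops to zero at $r = 3R_s/2$, where the orbit becomes lightlike, and the function refuses smaller radii, where the would-be orbit is spacelike.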
Now, suppose that we examine even smaller r, and still have not run
into the surface of an object that is generating the gravitational field. If
we make it all the way to $r = R_s$ without hitting the surface of the object,
we find a horizon and we call the object a black hole.
Although it is at a constant value of r, the horizon contains the worldlines of
outward directed light rays. To see what this means, imagine an expanding
sphere of light (like one of the ones produced by a firecracker) at the
horizon. Although it is moving outward at the speed of light (which
corresponds to infinite boost parameter), the sphere does not get any bigger.
The curvature of spacetime is such that the area of the spheres of light does
not increase. A spacetime diagram looks like this:
Not only do light rays directed along the horizon remain at $r = R_s$,
any light ray at the horizon which is directed a little bit sideways (and not
perfectly straight outward) cannot even stay at $r = R_s$, but must move to
smaller r. The diagram below illustrates this by showing the horizon as a
surface made up of light rays. If we look at a light cone emitted from a
point on this surface, only the light ray that is moving in the same direction
as the rays on the horizon can stay in the surface. The other light rays all fall
behind the surface and end up inside the black hole (at $r < R_s$).
Similarly, any object of nonzero mass requires an infinite acceleration
(directed straight outward) to remain at the horizon. With any finite
acceleration, the object falls to smaller values of r. At any value of r less
than $R_s$ no object can ever escape from the black hole.
This is clear from the above spacetime diagram, since to move from
the future interior to, say, the right exterior the object would have to cross
the light ray at $r = R_s$, which is not possible.
Note that we could have started with this geometric insight at the
horizon and used it to argue for the existence of the photon sphere: Light
aimed sideways around the black hole escapes when started far away but
falls in at the horizon.
Somewhere in the middle must be a transition point where the light
neither escapes nor falls in. Instead, it simply circles the black hole forever
at the same value of r.
BEYOND THE HORIZON
Of course, the question that everyone would like to answer is "What
the heck is going on inside the black hole?" To understand this, we will
turn again to the Schwarzschild metric.
The Interior Diagram
To make things simple, let's suppose that all motion takes place in
the r, t plane. This means that $d\theta = d\phi = 0$, and we can ignore those parts
of the metric. The relevant pieces are just
$$ds^2 = -\left(1 - \frac{R_s}{r}\right)dt^2 + \frac{dr^2}{1 - R_s/r}.$$
Let's think for a moment about a line of constant r (with dr = 0). For such
a line, $ds^2 = -(1 - R_s/r)\,dt^2$. The interesting thing is that, for $r < R_s$, this is
positive. Thus, for $r < R_s$, a line of constant r is spacelike.
You will therefore not be surprised to find that, near the horizon, the
lines of constant r are just like the hyperbolae that are a constant proper
time from where the two horizons meet.
The coordinate t increases along these lines, in the direction indicated
by the arrows. This means that the t-direction is actually spacelike inside
the black hole. The point here is not that something screwy is going on
with time inside a black hole.
Instead, it is merely that using the Schwarzschild metric in the way
that we have written it we have done something 'silly' and labelled a space
direction t. The problem is in our notation, not the spacetime geometry.
Let us fix this by changing notation when we are in this upper region.
We introduce $t' = r$ and $r' = t$. The metric then takes the form
$$ds^2 = -\left(1 - \frac{R_s}{t'}\right)dr'^2 + \frac{dt'^2}{1 - R_s/t'}.$$
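The swap of roles between the two coordinates can be made concrete with a small sketch. It evaluates the r, t part of the metric, $ds^2 = -(1 - R_s/r)\,dt^2 + dr^2/(1 - R_s/r)$, for a small displacement and reports whether that displacement is timelike, spacelike or lightlike. The function names and the choice of units with $R_s = 1$ are assumptions made for illustration.

```python
def interval_squared(dt, dr, r, r_s=1.0):
    """ds^2 for a displacement (dt, dr) in the r, t plane at radius r."""
    if r == r_s:
        raise ValueError("these coordinates break down at r = R_s")
    f = 1.0 - r_s / r
    return -f * dt**2 + dr**2 / f


def classify(ds2):
    """Name the character of a displacement from the sign of ds^2."""
    if ds2 < 0.0:
        return "timelike"
    if ds2 > 0.0:
        return "spacelike"
    return "lightlike"


if __name__ == "__main__":
    for r in (2.0, 0.5):  # one point outside the horizon, one inside
        along_t = classify(interval_squared(dt=1.0, dr=0.0, r=r))
        along_r = classify(interval_squared(dt=0.0, dr=1.0, r=r))
        print(f"r = {r} R_s: t direction is {along_t}, r direction is {along_r}")
```

Outside the horizon the t direction is timelike and the r direction is spacelike; inside, the two labels trade places, which is exactly what the change of notation to $t' = r$ and $r' = t$ is meant to repair.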
You might wonder if the Schwarzschild metric is still valid in a region
where the t direction is spacelike.
It turns out that it is. Unfortunately, we were not able to discuss the
Einstein equations in detail. If we had done so, however, then we could
check this by directly plugging the Schwarzschild metric into the Einstein equations just
as we would to check that the Schwarzschild metric is a solution outside
the horizon.
Finally, notice that the lines above look just like the lines we drew to
describe the boost symmetry of Minkowski space associated with the
change of reference frames. In the same way, these lines represent a
symmetry of the black hole spacetime. After all, the lines represent the direction
of increasing t = r'.
But, the Schwarzschild metric is completely independent of $t = r'$; it
depends only on $r = t'$! So, sliding events along these lines and increasing
their value of $t = r'$ does not change the spacetime in any way. Outside of
the horizon, this operation moves events in time.
As a result, the fact that it is a symmetry says that the black hole's
gravitational field is not changing in time.
However, inside the horizon, the operation moves events in a spacelike
direction. Roughly speaking, we can interpret the fact that this is a
symmetry as saying that the black hole spacetime is the same at every
place inside.
However, the metric does depend on $r = t'$, so the interior is dynamical.
We have discovered a very important point: although the black hole
spacetime is independent of time on the outside, it does in fact change
with time on the inside. On the inside the only symmetry is one that relates
different points in space; it says nothing about the relationship between
events at different times.
Now, you might ask just how the spacetime changes in time. On one
of the hyperbolae drawn above there is a symmetry that relates all of the
points in space. Recall that the full spacetime is 3+1 dimensional and that for every
point on the diagram above the real spacetime contains an entire sphere
of points.
Even inside the horizon, the spacetime is spherically symmetric. Now,
the fact that the points on our hyperbola are related by a symmetry means
that the spheres are the same size (r) at each of these points! What changes
as we move from one hyperbola to another ('as time passes') is that the
size of the spheres (r) decreases.
This is 'why' everything must move to smaller r inside the black hole
- the whole spacetime is shrinking!
To visualize what this means, it is useful to draw a picture of the curved
space of a black hole at some time.
You began this process on a recent homework assignment when you
considered a surface of constant t (dt = 0) and looked at circumference
(C) vs. radius (R) for circles in this space. You found that the space was
not flat, but that the size of the circles changed more slowly with radius
than in flat space.
One can work out the details for any constant t slice in the exterior
(since the symmetry means that they are all the same!). Two such slices
are shown below. Note that they extend into both the 'right exterior' with
which we are familiar and the 'left exterior', a region about which we have
so far said little.
[Figure: two slices of constant t drawn on the spacetime diagram.]
Ignoring, say, the θ direction and drawing a picture of r and φ (at the
equator, θ = π/2), any constant t slice (through the two "outside" regions)
looks like this:
This is the origin of the famous idea that black holes can connect our
universe (right exterior) to other universes (left exterior), or perhaps to
some distant region of our own universe.
If this idea bothers you, don't worry too much: the other end of the
tunnel is not really present for the black holes commonly found in nature.
Note that the left exterior looks just like the right exterior and represents
another region 'outside' the black hole, connected to the first by a tunnel.
This tunnel is called a 'wormhole,' or 'Einstein-Rosen bridge.'
So, what are these spheres inside the black hole? They are the 'throat'
of the wormhole.
Gravity makes the throat shrink, and begin to pinch off. That is, if we
draw the shape of space on each of the slices numbered 0,1,2 below, they
would look much like the Einstein-Rosen bridge above, but with narrower
and narrower necks as we move up the diagram.
Does the throat ever pinch off completely? That is, does it collapse to
r = 0 in a finite proper time? We can find out from the metric. Let's see
what happens to a freely falling observer who falls from where the horizons
cross (at $r = R_s$) to r = 0 (where the spheres are of zero size and the throat
has collapsed).
Our question is whether the proper time measured along such a
worldline is finite. Consider an observer that starts moving straight up
the diagram, as indicated by the dashed line in the figure below. We first
need to figure out what the full worldline of the freely falling observer
will be.
[Figure: the supposed worldline of the observer, drawn as a dashed line.]
Will the freely falling worldline curve to the left or to the right? Since
t is the space direction inside the black hole, this is just the question of
whether it will move to larger t or smaller t. What do you think will happen?
Well, our diagram is exactly the same on the right as on the left, so
there seems to be a symmetry.
In fact, you can check that the Schwarzschild metric is unchanged if
we replace t by -t. So, both directions must behave identically. If any
calculation found that the worldline bends to the left, then there would
be an equally valid calculation showing that the worldline bends to the
right. As a result, the freely falling worldline will not bend in either
direction and will remain at a constant value of t.
Now, how long does it take to reach r = 0? We can compute the proper
time by using the freely falling worldline with dt = 0. For such a worldline
the metric yields:
$$d\tau^2 = -ds^2 = \frac{dr^2}{R_s/r - 1}.$$
Integrating, we have:
$$\tau = \int_0^{R_s} \frac{dr}{\sqrt{R_s/r - 1}}.$$
It is not important to compute this answer exactly. What is important
is to notice that the answer is finite.
We can see this from the fact that, near $r = R_s$, the integral is much
like $\int dx/\sqrt{x}$ near x = 0.
This latter integral integrates to $2\sqrt{x}$ and is finite at x = 0. Also, near
r = 0 the integral is much like $\int \sqrt{x/R_s}\,dx$, which clearly gives a finite result.
Thus, our observer measures a finite proper time between $r = R_s$ and r = 0,
and the throat does collapse to zero size in finite time.
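Although the exact value is not needed for the argument, the finiteness claim is easy to check numerically. The sketch below evaluates the proper-time integral from r = 0 to r = R_s in units with c = 1. The substitution r = R_s sin²θ is a choice made here to remove the integrable endpoint singularity; it is not taken from the text, and the function name is hypothetical.

```python
import math


def proper_time_to_singularity(r_s=1.0, steps=1000):
    """Integrate d(tau) = dr / sqrt(r_s/r - 1) from r = 0 to r = r_s (units with c = 1).

    With r = r_s * sin(theta)**2 the integrand becomes 2 * r_s * sin(theta)**2
    on 0 <= theta <= pi/2, which is smooth at both endpoints.
    """
    h = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h  # midpoint rule
        total += 2.0 * r_s * math.sin(theta) ** 2 * h
    return total


tau = proper_time_to_singularity()
print(tau, math.pi / 2)  # the two numbers should agree closely
```

Under these assumptions the integral works out to π R_s / 2, so the proper time is not only finite but of the order of the light-crossing time of the horizon.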
The Singularity
This means that we should draw the line r = 0 as one of the hyperbolae
on our diagram. It is clearly going to be a 'rather singular line' (to paraphrase
Sherlock Holmes again), and we will mark it as special by using a jagged
line. As you can see, this line is spacelike and so represents a certain time.
We call this line the singularity.
Note that this means that the singularity of a black hole is not a place
at all!
The singularity is most properly thought of as being a very special
time, at which the entire interior of the black hole squashes itself (and
everything in it) to zero size.
Note that, since it cuts all of the way across the future light cone of
any events in the interior (such as event A below), there is no way for any
object in the interior to avoid the singularity.
By the way, this is a good place to comment on what would happen to
you if you tried to go from the right exterior to the left exterior through the
wormhole. Note that, once you leave the right exterior, you are in the future
interior region. From here, there is no way to get to the left exterior without
moving faster than light. Instead, you will encounter the singularity. What
this means is that the wormhole pinches off so quickly that even a light ray
cannot pass through it from one side to the other. It turns out that this behaviour
is typical of wormholes.
Let's get a little bit more information about the singularity by studying
the motion of two freely falling objects. Some particularly simple geodesics
inside the black hole are given by lines of constant t.
One question that we can answer quickly is how far apart these lines
are at each r (say, measured along the line r = const). That is, "What is the
proper length of the curve at constant r from $t = t_1$ to $t = t_2$?" Along such
a curve, dr = 0 and we have $ds^2 = (R_s/r - 1)\,dt^2$. So, $s = |t_1 - t_2|\sqrt{R_s/r - 1}$. As
r → 0, the separation becomes infinite. Since a freely falling object reaches
r = 0 in finite proper time, this means that any two such geodesics move
infinitely far apart in a finite proper time.
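The growth of this separation can be tabulated directly from the formula for s. This is a minimal sketch in units where R_s = 1; the function name and the sample values are illustrative assumptions.

```python
import math


def proper_separation(delta_t, r, r_s=1.0):
    """Proper length |delta_t| * sqrt(r_s/r - 1) between two constant-t lines at radius r."""
    if not 0.0 < r <= r_s:
        raise ValueError("this formula applies only inside the horizon, 0 < r <= r_s")
    return abs(delta_t) * math.sqrt(r_s / r - 1.0)


for r in (1.0, 0.5, 0.1, 0.001):
    print(r, proper_separation(0.01, r))
```

The separation starts at zero where the lines of constant t meet on the horizon and grows without bound as r approaches 0.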
It follows that the relative acceleration (a.k.a. the gravitational tidal
force) diverges at the singularity. (This means that the spacetime curvature
also becomes infinite.) Said differently, it would take an infinite proper
acceleration acting on the objects to make them follow (nongeodesic) paths
that remain a finite distance apart. Physically, this means that it requires
an infinite force to keep any object from being ripped to shreds near the
black hole singularity.
Beyond the Singularity?
Another favourite question is "what happens beyond (after!) the
singularity?" The answer is not at all clear. The point is that just as
Newtonian physics is not valid at large velocities and as special relativity
is valid only for very weak spacetime curvatures, we similarly expect
General Relativity to be an incomplete description of physics in the realm
where curvatures become truly enormous. This means that all we can really
say is that a region of spacetime forms where the theory we are using
(General Relativity) can no longer be counted on to correctly predict what
happens.
The main reason to expect that General Relativity is incomplete comes
from another part of physics called quantum mechanics. Quantum
mechanical effects should become important when the spacetime becomes
very highly curved.
Roughly speaking, you can see this from the fact that when the
curvature is strong, local inertial frames are valid only over very tiny regions,
and from the fact that quantum mechanics is always important in
understanding how very small things work. Unfortunately, no one yet
understands just how quantum mechanics and gravity work together. We
say that we are searching for a theory of "quantum gravity." It is a very
active area of research that has led to a number of ideas, but as yet has no
definitive answers. This is in fact the area of my own research.
Just to give an idea of the range of possible answers to what happens
at a black hole singularity, it may be that the idea of spacetime simply
ceases to be meaningful there. As a result, the concept of time itself may
also cease to be meaningful, and there may simply be no way to properly
ask a question like "What happens after the black hole singularity?" Many
apparently paradoxical questions in physics are in fact disposed of in just
this way (as in the question 'which is really longer, the train or the tunnel?').
In any case, one expects that the region near a black hole singularity
will be a very strange place where the laws of physics act in entirely
unfamiliar ways.
There still remains one region of the diagram (the 'past interior') about
which we have said little. The Schwarzschild metric is time symmetric
(under $t \to -t$). As a result, the diagram should have a top/bottom
symmetry, and the past interior should be much like the future interior.
This part of the spacetime is often called a 'white hole' as there is no way
that any object can remain inside: everything must pass outward into one
of the exterior regions through one of the horizons!
[Figure: the past interior ('white hole') region, with the horizons labelled $r = R_s$.]
As we mentioned briefly with regard to the second exterior, the past
interior does not really exist for the common black holes found in nature.
Let's talk about how this works. So far, we have been studying the pure
Schwarzschild solution. As we have discussed, it is only a valid solution
in the region in which no matter is present. Of course, a little bit of matter
will not change the picture much. However, if the matter is an important
part of the story (for example, if it is matter that causes the black hole to
form in the first place), then the modifications will be more important.
Let us notice that in fact the 'hole' (whether white or black) in the
above spacetime diagram has existed since infinitely far in the past. If the
Schwarzschild solution is to be used exactly, the hole (including the
wormhole) must have been created at the beginning of the universe. We
expect that most black holes were not created with the beginning of the
universe, but instead formed later when too much matter came too close
together. A black hole must form when, for example, too much thin gas
gets clumped together.
Once the gas gets into a small enough region (smaller than its
Schwarzschild radius), we have seen that a horizon forms and the gas must
shrink to a smaller size. No finite force (and, in some sense, not even infinite
force) can prevent the gas from shrinking.
Now, outside of the gas, the Schwarzschild solution should be valid.
So, let me draw a worldline on our Schwarzschild spacetime diagram that
represents the outside edge of the ball of gas. This breaks the diagram
into two pieces: an outside that correctly describes physics outside the gas,
and an inside that has no direct physical relevance and must be replaced
by something that depends on the details of the matter:
We see that the 'second exterior' and the 'past interior' are in the part
of the diagram with no direct relevance to black holes that
form from collapsing matter.
A careful study of the Einstein equations shows that, inside the matter,
the spacetime looks pretty normal. A complete spacetime diagram
including both the region inside the matter and the region outside would
look like this:
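[Figure: spacetime diagram for a black hole formed from collapsing matter. Labels: "Schwarzschild here" (outside the matter); "Not Schwarzschild" (inside the matter); "$r = R_s$ along horizon from here on"; "Outside edge of the matter".]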
VISUALIZING BLACK HOLE SPACETIMES
We have now had a fairly thorough discussion about Schwarzschild
black holes including the outside, the horizon, the inside, and the "extra
regions" (second exterior and past interior). One of the things that we
emphasized was that the spacetime at the horizon of a black hole is locally
flat, just like everywhere else in the spacetime.
Also, the curvature at the horizon depends on the mass of the black
hole. The result is that, if the black hole is large enough, the spacetime at
the horizon is less curved than it is here on the surface of the earth, and a
person could happily fall through the horizon without any discomfort. It
is useful to provide another perspective on the various issues that we have
discussed. The idea is to draw a few pictures.
The point is that the black hole horizon is an effect caused by the
curvature of spacetime, and the way that our brains are most used
to thinking about curved spaces is to visualize them inside of a larger flat
space.
For example, we typically draw a curved (two-dimensional) sphere
as sitting inside a flat three-dimensional space.
Now, the r, t plane of the black hole that we have been discussing
and drawing on our spacetime diagrams forms a curved two-dimensional
spacetime. It turns out that this two-dimensional spacetime can also be
drawn as a curved surface inside of a flat three-dimensional spacetime.
To get an idea of how this works, let me first do something very simple:
A flat two-dimensional spacetime inside of a flat three-dimensional
spacetime.
As usual, time runs up the diagram, and we use units such that light
rays move along lines at 45° angles to the vertical. Note that any worldline
of a light ray in the 3-D spacetime that happens to lie entirely in the 2-D
spacetime will also be the worldline of a light ray in the 2-D spacetime,
since it is clearly a curve of zero proper time. A pair of such crossed light
rays are shown below where the light cone of the 3-D spacetime intersects
the 2-D spacetime.
Now that we've got the idea, here is a picture that represents the (2-D) r, t
plane of our black hole, drawn as a curved surface inside a 3-D flat
spacetime. It looks like this:
The curves of constant r are drawn in so that you can visualize them more easily.
Note that larger r is farther from the centre of the diagram, and in
particular farther out along the 'flanges.' One flange represents the left
exterior, and one represents the right exterior.
The most important thing to notice is that we can once again spot
two lines that 1) are the worldlines of light rays in the 3D flat space and 2)
lie entirely within the curved 2D surface.
As a result, they again represent worldlines of light rays in the black
hole spacetime. They are marked with lines on the first picture I showed
you (above) of the black hole spacetime and also on the diagrams below. Note
that they do not move at all outward toward larger values of r.
These are the horizons of the black hole.
Another thing we can see from these diagrams is the symmetry we
discussed. The symmetry of the 2-D black hole spacetime is the same as
the boost symmetry of the larger 3D Minkowski space. Inside the black
hole, this symmetry moves events in a spacelike direction. We can also see
from this picture that, inside the black hole, the spacetime does change
with time.
STRETCHING AND SQUISHING: TIDAL EFFECTS IN GENERAL
RELATIVITY
We have now seen several manifestations of what are called 'tidal
effects' in general relativity, where gravity by itself causes the stretching
or squashing of an object. These came up a lot in homework problems 1 and 2, but
even earlier our most basic observation in general relativity was that gravity
causes freely falling observers to accelerate relative to each other. That is
to say that, on a spacetime diagram, freely falling world lines may bend
toward each other or bend away.
A homework problem asked how this affects the ocean around the earth as the earth falls freely around
the moon. The answer was that it stretches the ocean in the direction
pointing toward (and away from) the moon, while it squishes (or compresses)
the ocean in the perpendicular directions. This is because different parts of
the ocean would like to separate from each other along the direction toward
the moon, while they would like to come closer together in the other directions:
As stated in the homework solutions, this effect is responsible for the
tides in the earth's oceans. (You know: if you stand at the beach for 24
hours, the ocean level rises, falls, then rises and falls again.) Whenever gravity
causes freely falling observers (who start with no relative velocity) to come
together or to separate, we call this a tidal effect. As we have seen, tidal effects
are the fundamental signature of spacetime curvature, and in fact tidal effects
are a direct measure of spacetime curvature.
Of course any other object (a person, rocket ship, star, etc.) would feel a
similar stretching or squishing in a gravitational field. Depending on how you
are lined up, your head might be trying to follow a geodesic which would
cause it to separate from your feet, or perhaps to move closer to your feet. If
this effect were large, it would be quite uncomfortable, and could even rip
you into shreds (or squash you flat).
On the other hand, we argued that this tidal effect will become infinitely
large at the singularity of a black hole. There the effect certainly will be strong
enough to rip apart even tiny objects like humans, or cells, or atoms, or even
subatomic particles.
It is therefore of interest to learn how to compute how strong this
effect actually is. We know that it is small far away from a black hole and
that it is large at the singularity, but how big is it at the horizon?
This last question is the key to understanding what you would feel as you
fell through the horizon of a Schwarzschild black hole.
The Setup
So, let's suppose that somebody tells us what the spacetime metric is (for
example, it might be the Schwarzschild metric). For convenience, let's suppose
that it is independent of time and spherically symmetric.
In this case, we discussed in class how to find the acceleration of static
observers relative to freely falling observers who are at the same event in
spacetime.
What we are going to do now is to use this result to compute the relative
acceleration of two neighboring freely falling observers.
To start with, let's draw a spacetime diagram in the reference frame
of one of the freely falling observers. What this means is that lines drawn
straight up (like the dotted one below) represent curves that remain a
constant distance away from our first free faller. If you followed Einstein's
discussion, this is what he would call a 'Gaussian' coordinate system. We
want our two free fallers to start off with the same velocity; this is analogous
to using 'initially parallel geodesics'.
For the sake of argument, let's suppose that the geodesics separate as
time passes, though the discussion is exactly the same if they come together.
The freely falling observers are the solid lines, and the static observers are
the dashed lines. To be concrete, we take the static observers to be accelerating
toward the right, but again it doesn't really matter.
[Figure labels: "Wants to come together in this direction"; "Wants to separate this way."]
The coordinate x measures the distance from the first freely falling
observer.
What we would like to know is how fast the second geodesic is
accelerating away from the first. Let us call this acceleration $a_{FF2}$, the
acceleration of the second free faller. Since we are working in the reference
frame of the first free faller, the corresponding acceleration $a_{FF1}$ is
identically zero.
Now, what we already know is the acceleration of the two static
observers relative to the corresponding free faller. In other words, we know
the acceleration $a_{s1}$ of the first static observer relative to the first free faller,
and we know the acceleration $a_{s2}$ of the second static observer relative to
the second free faller. Note that the total acceleration of the second static
observer in our coordinate system is $a_{FF2} + a_{s2}$: her acceleration relative
to the second free faller plus the acceleration of the second free faller in
our coordinate system.
This is represented pictorially on the diagram above. Actually, there
is something else that we know: since the two static observers are, well,
static, the proper distance between them (as measured by them) can never
change. We will use this result to figure out what $a_{FF2}$ is. The way we will
proceed is to use the standard Physics/Calculus trick of looking at small changes
over small regions.
Note that there are two parameters (T and L, as shown below) that tell us
how big our region is. L is the distance between the two free fallers, and T is
how long we need to watch the system. We will assume that both L and T are
very small, so that the accelerations $a_{s1}$ and $a_{s2}$ are not too different, and so
that the speeds involved are all much slower than the speed of light.
[Figure: the small region of the diagram, with the first free faller at x = 0.]
Now, pick a point ($p_1$) on the worldline of the first static observer. Call
the coordinates of that point $x_1, t_1$. (We assume $t_1 < T$.) Since the velocity is
still small at that point, we can ignore the difference between acceleration and
proper acceleration, and the Newtonian formula
$$x_1 = \frac{1}{2} a_{s1} t_1^2 + O(T^4)$$
is a good approximation. The notation $O(T^4)$ is read "terms of order
$T^4$." This represents the error we make by using only the Newtonian formula.
It means that the errors are proportional to $T^4$ (or possibly even smaller), and
so become much smaller than the term that we keep ($t_1^2$) as $T \to 0$. Note that
since this is just a rough description of the errors, we can use T instead of $t_1$.
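The claim that the Newtonian formula is off only by terms of order $T^4$ can be illustrated with a flat-spacetime stand-in. The sketch below assumes the standard special-relativistic result for motion with constant proper acceleration a starting from rest, x(t) = (c²/a)(√(1 + (at/c)²) − 1). That formula is brought in here for illustration and is not derived in the text, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def newtonian_position(a, t):
    return 0.5 * a * t ** 2


def newtonian_error(a, t, c=C):
    """Newtonian position minus the exact flat-spacetime position for constant proper acceleration.

    Algebraically equal to 0.5*a*t^2 - (c^2/a)*(sqrt(1 + (a*t/c)^2) - 1), but written
    in a form that avoids subtracting two nearly equal numbers.
    """
    u = (a * t / c) ** 2
    return a * t ** 2 * u / (2.0 * (math.sqrt(1.0 + u) + 1.0) ** 2)


a = 10.0  # m/s^2
for t in (1.0e5, 2.0e5, 4.0e5):  # seconds
    print(t, newtonian_position(a, t), newtonian_error(a, t))
```

Doubling t multiplies the error by very nearly 16 = 2⁴, while the term that is kept grows only by a factor of 4, which is the behaviour that the $O(T^4)$ notation records.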
The two static observers will remain a constant distance apart as
determined by their own measurements.
To write this down mathematically, we need to understand how these
observers measure distance. Any observer will measure distance along a line
of simultaneity. Let us draw the line of simultaneity through $p_1$ and call the point $p_2$ (where it intersects the worldline of
the second static observer) $x_2, t_2$.
Now, since spacetime is curved, this line of simultaneity need not be
perfectly straight on our diagram.
However, we also know that, in a very small region near the first Free
Faller (around whom we drew our diagram), space is approximately flat.
This means that the curvature of the line of simultaneity has to vanish near
the line x = 0.
Technically, the curvature of this line (the second derivative of t with
respect to x) must itself be 'of order' $(x_2 - x_1)$. This means that $p_1$ and $p_2$
are related by an equation that looks like:
$$\begin{aligned} \frac{t_2 - t_1}{x_2 - x_1} &= \text{slope at } p_1 + [\text{curvature at } p_1](x_2 - x_1) + O([x_2 - x_1]^2) \\ &= \text{slope at } p_1 + O([x_2 - x_1]^2) + O(T^2 [x_2 - x_1]). \end{aligned}$$
Again, we need only a rough accounting of the errors. As a result, we
can just call the errors $O(L^2)$ instead of $O([x_2 - x_1]^2)$.
Remember that, in flat space, the slope of this line of simultaneity
would be $v_{s1}/c^2$, where $v_{s1}$ is the velocity of the first static observer. Very
close to x = 0, the spacetime can be considered to be flat. Also, as long as
$t_1$ is small, the point $p_1$ is very close to x = 0. So, the slope at $p_1$ is essentially
$v_{s1}/c^2$. Also, for small $t_1$ we have $v_{s1} = a_{s1} t_1$. Substituting this into the above
equation and including the error terms yields
$$\begin{aligned} t_2 &= t_1 + \frac{a_{s1} t_1}{c^2}(x_2 - x_1) + O(L^3) + O(T^2 L) \\ &= t_1\left(1 + \frac{a_{s1}}{c^2}(x_2 - x_1)\right) + O(L^3) + O(T^2 L). \end{aligned}$$
We've already got two useful equations, and we know that a third
will be the condition that the proper distance between $p_1$ and $p_2$ will be
the same as the initial separation L between the two free fallers:
$$L^2 = (x_2 - x_1)^2 - c^2 (t_2 - t_1)^2.$$
In addition, there is clearly an analogue of the Newtonian formula for the second
static observer (remembering that the second one does not start at x = 0,
but instead starts at x = L):
$$x_2 = L + \frac{1}{2}(a_{s2} + a_{FF2})\, t_2^2 + O(T^4).$$
The Solution
So, let's try using these equations to solve the problem. The way we will proceed is to
substitute the expression for $t_2$ into the equation for $x_2$. That way we express both positions in
terms of just $t_1$. The result is
$$x_2 = L + \frac{1}{2}(a_{s2} + a_{FF2})\left[1 + \frac{a_{s1}}{c^2}(x_2 - x_1)\right]^2 t_1^2 + O(T^4) + O(L^2 T^3).$$
The condition that the proper distance between the static observers
does not change involves the difference $x_2 - x_1$. Subtracting
the equation for $x_1$, we get:
$$\begin{aligned} x_2 - x_1 = {} & L + \frac{1}{2}(a_{s2} + a_{FF2} - a_{s1})\, t_1^2 + (a_{s2} + a_{FF2})\frac{a_{s1}}{c^2}(x_2 - x_1)\, t_1^2 \\ & + \frac{1}{2}(a_{s2} + a_{FF2})\frac{a_{s1}^2}{c^4}(x_2 - x_1)^2\, t_1^2 + O(T^4) + O(L^3 T^2). \end{aligned}$$
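And, actually, we won't need to keep the $(x_2 - x_1)^2$ term, so, up to error terms, we can
write this as:
$$x_2 - x_1 = L + \frac{1}{2}(a_{s2} + a_{FF2} - a_{s1})\, t_1^2 + (a_{s2} + a_{FF2})\frac{a_{s1}}{c^2}(x_2 - x_1)\, t_1^2.$$
Now, this equation involves $x_2 - x_1$ on both the left and right sides,
so let's solve it for $x_2 - x_1$. As you can check, the result is:
$$x_2 - x_1 = \left[L + \frac{1}{2}(a_{s2} + a_{FF2} - a_{s1})\, t_1^2\right]\left[1 - (a_{s2} + a_{FF2})\frac{a_{s1}}{c^2}\, t_1^2\right]^{-1} + O(T^4) + O(T^3 L^2).$$
But there is a standard 'expansion' $(1 - x)^{-1} = 1 + x + O(x^2)$ that we
can use to simplify this. We find:
$$x_2 - x_1 = L + \frac{1}{2}(a_{s2} + a_{FF2} - a_{s1})\, t_1^2 + L(a_{s2} + a_{FF2})\frac{a_{s1}}{c^2}\, t_1^2 + O(T^4) + O(T^3 L^2).$$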
Believe it or not, we are almost done! All we have to do now is to
substitute this (and also the equation for the times) into the requirement that $\Delta x^2 - c^2 \Delta t^2 = L^2$. Below, we will only keep terms up through $T^2$ and $L^2$. Note
that:
$$(x_2 - x_1)^2 = L^2 + L(a_{s2} + a_{FF2} - a_{s1})\, t_1^2 + 2L^2 (a_{s2} + a_{FF2})\frac{a_{s1}}{c^2}\, t_1^2 + O(T^4) + O(T^2 L^3),$$
while
$$(t_2 - t_1)^2 = t_1^2\, \frac{a_{s1}^2 L^2}{c^4} + O(T^3 L^2).$$
So, since the proper distance between $p_1$ and $p_2$ must be L,
$$\begin{aligned} L^2 &= \Delta x^2 - c^2 \Delta t^2 \\ &= L^2 + L(a_{s2} + a_{FF2} - a_{s1})\, t_1^2 + L^2 (2a_{s2} + 2a_{FF2} - a_{s1})\frac{a_{s1}}{c^2}\, t_1^2 + O(T^4) + O(T^2 L^3). \end{aligned}$$
Canceling the $L^2$ terms on both sides leaves only terms proportional
to $t_1^2 L$. So, after subtracting the $L^2$, let's also divide by $t_1^2 L$. This will
leave:
$$0 = (a_{s2} + a_{FF2} - a_{s1}) + L(2a_{s2} + 2a_{FF2} - a_{s1})\frac{a_{s1}}{c^2} + O(T^2/L) + O(L^2).$$
Reminder: What we want to do is to solve for $a_{FF2}$, the acceleration
of the second free faller. In preparation for this, let's regroup the terms
above to collect things with $a_{FF2}$ in them:
$$0 = a_{FF2}\left(1 + \frac{2L a_{s1}}{c^2}\right) + (a_{s2} - a_{s1}) + L(2a_{s2} - a_{s1})\frac{a_{s1}}{c^2} + O(T^2/L) + O(L^2).$$
Now, before we solve for $a_{FF2}$, remember that we started off by
assuming that the region was very small. If it is small enough, then in fact
$a_{s1}$ and $a_{s2}$ are not very different. In fact, we will have $a_{s1} - a_{s2} = O(L)$.
This simplifies the last term a lot since $L(2a_{s2} - a_{s1}) = L a_{s1} + O(L^2)$. Using
this fact, and solving the above equation for $a_{FF2}$ we get:
$$\begin{aligned} a_{FF2} &= -\frac{a_{s2} - a_{s1} + L a_{s1}^2/c^2}{1 + 2L a_{s1}/c^2} + O(T^2/L) + O(L^2) \\ &= -(a_{s2} - a_{s1}) - \frac{L a_{s1}^2}{c^2} + O(T^2/L) + O(L^2). \end{aligned}$$
The Differential Equation
We want to convert this result into a more useful form which will apply without
worrying about whether our region is small. What we're going to do is to take
the limit as T and L go to zero and turn this into a differential equation.
Technically, we will take T to zero faster than L so that $T^2/L^2 \to 0$. Note
that we are really interested in how things change with position at t = 0, so
that it is natural to take T to zero before taking L to zero.
Imagine not just two free fallers, but a whole set of them at every value
of x. Each of these starts out with zero velocity, and each of them has an
accompanying static observer.
The free faller at x will have some acceleration $a_{FF}(x)$, and the static
observer at x will have some acceleration $a_s(x)$ relative to the corresponding
free faller. If L is very small above, notice that $a_{s2} - a_{s1} = L\,\frac{da_s}{dx} + O(L^2)$ and
that (since $a_{FF1} = 0$), $a_{FF2} = L\,\frac{da_{FF}}{dx} + O(L^2)$. So, we can rewrite the result for $a_{FF2}$ as:
$$L\,\frac{da_{FF}}{dx} = -L\,\frac{da_s}{dx} - \frac{L a_s^2}{c^2} + O(T^2/L) + O(L^2).$$
We can now divide by L and take the limit as T/L and L go to zero.
The result is a lovely differential equation:
$$\frac{da_{FF}}{dx} = -\frac{da_s}{dx} - \frac{a_s^2}{c^2}.$$
By the way, the important point to remember about the above
expression is that the coordinate x represents proper distance.
What Does it all Mean?
One of the best ways to use this equation is to undo part of the last step.
Say that you have two free falling observers close together that have no initial
velocity.
Then, if their separation L is small enough, their relative acceleration
is $L\,\frac{da_{FF}}{dx}$, or
$$\text{Relative acceleration} = -L\left(\frac{da_s}{dx} + \frac{a_s^2}{c^2}\right).$$
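One way to get a feel for this formula is to evaluate it for a known profile $a_s(x)$. The sketch below uses a central finite difference for the derivative. As a consistency check it assumes the standard result that a family of uniformly accelerating observers in flat spacetime has $a_s = c^2/x$, with x the proper distance from their common horizon; for that profile the two terms should cancel, as they must in flat spacetime. The function name and the second, made-up profile are illustrative assumptions only.

```python
def relative_acceleration(a_s, x, L, c=1.0, h=1e-4):
    """Evaluate -L * (da_s/dx + a_s^2 / c^2), with da_s/dx from a central finite difference."""
    da_dx = (a_s(x + h) - a_s(x - h)) / (2.0 * h)
    return -L * (da_dx + a_s(x) ** 2 / c ** 2)


# Uniformly accelerating observers in flat spacetime (units with c = 1): a_s = 1 / x.
flat = relative_acceleration(lambda x: 1.0 / x, x=2.0, L=0.01)

# A made-up profile, used only to show a nonzero answer: a_s = 1 / x^2.
curved = relative_acceleration(lambda x: 1.0 / x ** 2, x=2.0, L=0.01)

print(flat, curved)
```

The first result vanishes up to discretization error, while the second does not, which fits the statement that tidal effects measure spacetime curvature rather than acceleration as such.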
Let's take a simple example of this. Suppose that you are near a black
hole and that your head and your feet are both freely falling objects. Then,
this formula tells you at what acceleration your head would separate from
(or, perhaps, accelerate toward) your feet.
Of course, your head and feet are not, in reality, separate freely falling
objects.
The rest of your body will pull and push on them to keep your head
and feet roughly the same distance apart at all times. However, your head and
feet will want to separate or come together, so depending on how big the
relative acceleration is, keeping your head and feet in the proper places will
cause a lot of stress on your body.
For example, suppose that the relative acceleration is 10 m/s$^2$ (1 g) away
from each other.
In that case, the experience would feel much like what you feel if you tie
your legs to the ceiling and hang upside down. In that case also, your head
wants to separate from the ceiling (where your feet are) at 10 m/s$^2$.
However, if the relative acceleration were a lot bigger, it would be
extremely uncomfortable. In fact, a good analogy with the experience would
be being on a Medieval rack - an old torture device where they pulled your
arms one way and your feet in the opposite direction.
Black Holes and the Schwarzschild Metric
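The acceleration of static observers (relative to freely falling observers)
in the Schwarzschild metric is given by:
$$a_s = \frac{c^2}{2}\frac{R_s}{r^2}\left(1 - \frac{R_s}{r}\right)^{-1/2}.$$
We would like to take the derivative of this with respect to the proper
distance S in the radial direction. That is, we will work along a line of
constant t, φ, and θ. In this case, as we have seen before,
$$\frac{dr}{dS} = \sqrt{1 - R_s/r}.$$
So,
$$\frac{da_s}{dS} = \sqrt{1 - R_s/r}\;\frac{da_s}{dr}.$$
A bit of computation yields
$$\frac{da_s}{dS} = -c^2\frac{R_s}{r^3} - \frac{c^2}{4}\frac{R_s^2}{r^4}\left(1 - \frac{R_s}{r}\right)^{-1}.$$
On the other hand, we have:
$$\frac{a_s^2}{c^2} = \frac{c^2}{4}\frac{R_s^2}{r^4}\left(1 - \frac{R_s}{r}\right)^{-1}.$$
To evaluate the relative acceleration, the formula above tells us to add these
two results together. Clearly, there is a major cancellation and all that we
have left is:
$$\text{Relative acceleration} = c^2\,\frac{R_s}{r^3}\,L.$$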
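The cancellation can be checked numerically without redoing the algebra. The sketch below differentiates the static acceleration with a central finite difference and compares the result with the closed form. It works in units with c = 1 and R_s = 1; the function names, step size, and sample radii are assumptions chosen for illustration.

```python
import math


def static_acceleration(r, r_s=1.0, c=1.0):
    """a_s = (c^2 / 2) (R_s / r^2) (1 - R_s/r)^(-1/2), valid for r > R_s."""
    return 0.5 * c ** 2 * r_s / r ** 2 / math.sqrt(1.0 - r_s / r)


def relative_acceleration_numeric(r, L, r_s=1.0, c=1.0, h=1e-5):
    # da_s/dS = sqrt(1 - R_s/r) * da_s/dr, with da_s/dr from a central difference
    da_dr = (static_acceleration(r + h, r_s, c) - static_acceleration(r - h, r_s, c)) / (2.0 * h)
    da_dS = math.sqrt(1.0 - r_s / r) * da_dr
    return -L * (da_dS + static_acceleration(r, r_s, c) ** 2 / c ** 2)


def relative_acceleration_formula(r, L, r_s=1.0, c=1.0):
    return c ** 2 * r_s / r ** 3 * L


for r in (1.5, 3.0, 10.0):
    print(r, relative_acceleration_numeric(r, 0.01), relative_acceleration_formula(r, 0.01))
```

The two columns agree closely at every radius outside the horizon, even close to r = R_s where the static acceleration itself is growing large.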
This gives the relative acceleration of two freely falling observers who,
at that moment, are at rest with respect to the static observers. (The free fallers
are also located at radius r and are separated by a radial distance L, which is
much smaller than r.) The formula holds anywhere that the Schwarzschild
metric applies, in particular anywhere outside a black hole. At the horizon ($r = R_s$) it gives
$$\text{Relative acceleration} = c^2\,\frac{R_s}{R_s^3}\,L = \frac{c^2 L}{R_s^2}.$$
The most important thing to notice about this formula is that the
answer is finite. Despite the fact that a static observer at the horizon would
need an infinite acceleration relative to the free fallers, any two free fallers
have only a finite acceleration relative to each other.
The second thing to notice is that, for a big black hole (large R s ), this
relative acceleration is actually small. (However, for a small black hole, it
can be rather large.)
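To put rough numbers on "small" and "rather large", one can evaluate the relative acceleration at the horizon, $c^2 L / R_s^2$, for two masses. In the sketch below the constants are rounded textbook values, the 2 m separation stands in for a person's height, and the function names are hypothetical; none of these choices come from the text.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light, m/s
M_SUN = 1.989e30     # solar mass, kg


def schwarzschild_radius(mass_kg):
    return 2.0 * G * mass_kg / C ** 2


def horizon_tidal_acceleration(mass_kg, separation_m):
    """Relative acceleration c^2 * L / R_s^2 of two free fallers a distance L apart at r = R_s."""
    return C ** 2 * separation_m / schwarzschild_radius(mass_kg) ** 2


for label, mass in (("1 solar mass", M_SUN), ("2.61e6 solar masses", 2.61e6 * M_SUN)):
    print(label, horizon_tidal_acceleration(mass, 2.0), "m/s^2")
```

With these inputs the solar-mass hole gives an enormous value (of order 10^10 m/s^2), while the multi-million solar mass hole gives a few thousandths of a m/s^2, far below 1 g.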
BLACK HOLE ASTROPHYSICS AND OBSERVATIONS
We have now come to understand basic round (Schwarzschild) black
holes fairly well. We have obtained several perspectives on black hole
exteriors and interiors and we have also learned about black hole
singularities. However, there are several issues associated with black holes
that we have yet to discuss. Not least of these is the observational evidence
that indicates that black holes actually exist.
The Observational Evidence for Black Holes
Since big black holes should not be too hard to make, the question arises:
are there really such things out there in the universe? If so, how do we
find them? Black holes are dark, after all; they themselves do not shine
brightly like stars do.
Well, admittedly most of the evidence is indirect. Nevertheless, it is
quite strong.
Let's begin by reviewing the evidence for a black hole at the centre of
our own galaxy.
What is quite clear is that there is something massive, small, and dark
at the centre of our galaxy. Modern techniques allow us to make high
resolution photographs of stars orbiting near the galactic centre. One can
also measure the velocities using the Doppler shift. The result is that we
know a lot about the orbits of these objects, so that we can tell a lot about
the mass of whatever object lies at the very centre that they are orbiting.
(See "The status of black hole observations" by Andrew Fabian.) What the
data show quite clearly is that there is a mass of $2.61 \times 10^{6}\,M_\odot$
($M_\odot$ is the mass of the sun) contained in a region of size 0.02 parsecs (pc).
Now, a parsec is around $3 \times 10^{16}$ m.
So, this object has a radius of less than $6 \times 10^{14}$ m. In contrast, the
Schwarzschild radius for a $2.61 \times 10^{6}$ solar mass object is around $10^{10}$ m.
So, what we get from direct observations of the orbits of stars is that this
object is smaller than $10{,}000\,R_s$.
That may not sound like a small bound (since 10,000 is a pretty big
number), but an important point is that an object of that mass at
$r = 10{,}000\,R_s$ could not be very dense. If we simply divide mass by volume, we
would find an average density of $10^{-9}$ that of water! We know an awful lot
about how matter behaves at that density and the long and short of it is that
the gravitational field of this object should make such a diffuse gas of stuff
contract.
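The "divide mass by volume" estimate is easy to check. The following Python sketch assumes a uniform sphere, rounded physical constants, and illustrative function names; it computes the Schwarzschild radius for the quoted mass and the mean density at 10,000 Schwarzschild radii, relative to water.

```python
import math

# Order-of-magnitude check of the mean-density estimate.
# Assumptions: uniform sphere, rounded constants, illustrative names.

G = 6.674e-11           # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8             # speed of light, m/s
M_SUN = 1.989e30        # mass of the sun, kg
WATER_DENSITY = 1000.0  # kg/m^3


def schwarzschild_radius(mass_kg):
    """R_s = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def density_relative_to_water(mass_kg, radius_m):
    """Mean density of a uniform sphere, in units of the density of water."""
    volume = 4.0 / 3.0 * math.pi * radius_m**3
    return mass_kg / volume / WATER_DENSITY


if __name__ == "__main__":
    mass = 2.61e6 * M_SUN
    r_s = schwarzschild_radius(mass)
    print(f"Schwarzschild radius: {r_s:.2e} m")
    ratio = density_relative_to_water(mass, 1.0e4 * r_s)
    print(f"density at 10,000 R_s: {ratio:.1e} times that of water")
```

The Schwarzschild radius comes out close to $10^{10}$ m and the density ratio is of order $10^{-9}$, in line with the rough figures used in the argument.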
You might then ask what happens when it becomes dense enough to
form a solid. This brings us to another interesting observation...
It turns out that, at the very position at the centre of our galaxy where
the massive object (black hole?) should be located, a strong radio signal is
being emitted. The source of this signal has been named "Sagittarius A* "
(Sgr A*). It is therefore natural to assume that this signal is coming from
the massive object that we have been discussing.
It is natural for radio signals to be emitted not from black holes
themselves, but from things falling into black holes. Precision radio
measurements using what is called "very-long baseline interferometry"
(VLBI) tell us that the radio signal is coming from a small region. In terms
of $R_s$ for the mass we have been discussing, the region's size is about $30 R_s$.
It therefore appears that the object itself is within $30 R_s$. If the mass
were spread uniformly over a region of size $30 R_s$, it would have a density
about three times greater than that of air.
However, the proper acceleration (of static observers relative to freely
falling ones) would be about 100 g's. Again, we know a lot about how
matter behaves under such conditions. In particular, we know that matter
at that density behaves like a gas. However, the 100 g acceleration means
that the pressure in the gas must be quite high in order to keep the gas
from collapsing.
In particular, the pressure would reach one atmosphere about 1 km
inside the object. One hundred thousand km inside, the pressure would
reach one hundred thousand atmospheres! Since we are thinking of an
object of size $30 R_s = 3 \times 10^{11}$ m (which is 300 million km), one hundred
thousand km is less than 0.1% of the way to the centre.
So, the vast majority of the object is under much more than one
hundred thousand ($10^{5}$) atmospheres of pressure. At $10^{5}$ atmospheres of
pressure, all forms of matter will have roughly the density of a solid. The
matter supports this pressure by the electron shells of the atoms bumping up
against one another.
So, using what we know about matter, the object must surely be even
smaller: small enough to have at least the density of water. Such an object
(for this mass) would have a size of less than $3 R_s$. So, we are getting very
close. At the surface of such an object, the relative accelerations of freely
falling and static observers would be around 10,000 g's. At a depth of
10,000 km (again 0.1% of the way to the centre), the pressure would be
$10^{14}$ N/m$^2$, or roughly one billion atmospheres. At this pressure, any kind of
matter will compress to more than 30 times the density of water. So, again,
we should redo the calculation, but now at 30 times the density of water...
At this density, the object would be within its Schwarzschild radius.
It would be a black hole. We conclude that, given the experimental bounds
and what we know about physics, the object at the centre of our galaxy
either is a black hole already or is rapidly collapsing to become one. Oh,
and the time it would take such an object to collapse from $30 R_s$ is about 15
minutes. Astronomers have been monitoring this thing for a while.
FINDING OTHER BLACK HOLES
So, while the astronomical measurements do not directly tell us that
Sagittarius A* is a black hole, when combined with what we know about
(more or less) ordinary matter, the conclusion that the object is a black
hole is hard to escape. Much the same story is true for other "black hole
candidates" as the astronomers call them. The word candidate is added to
be intentionally conservative.
Black hole candidates at the centre of other galaxies are identified in
much the same way that Sagittarius A* was found. Astronomers study
how stars orbit around those galactic centres to conclude that there is a
"massive compact object" near the centre. Typically, such objects are also
associated with strong emissions of radio waves.
Similar techniques are used for finding smaller black holes as well.
The small black holes that we think we have located are in so-called 'binary
systems.' The way that these black holes were found was that astronomers
found certain stars which seemed to be emitting a lot of high energy
x-rays. This is an unusual thing for a star to do, but it is not so odd for a
black hole. On closer inspection of the star, it was found that the star
appeared to "wobble" back and forth.
This is just what the star would seem to do if it was in fact orbiting
close to a small massive dark object that could not be seen directly. This
is why they are called binary systems, since there seem to be two objects
in the system. These massive dark objects have masses between 5 and 10
solar masses. Actually, there are also cases where the dark companion
has a mass of less than 2 solar masses, but those are known to be neutron
stars.
Our knowledge of normal matter led to the conclusion that Sagittarius
A* is a black hole. Well, we also have a pretty good idea of how star-like
objects work in the solar mass range. In actual stars, what happens is that
the objects become dense enough that nuclear fusion occurs.
This generates large amounts of heat that increases the pressure in the
matter far above what it would be otherwise. It is this pressure that keeps the
object from collapsing to higher density. Thus, the reason that a star has a
relatively low density (the average density of the sun is a few times that of
water) is that it is very hot! This of course is also the reason that stars shine.
Now, the dark companions in the binary systems do not shine. It follows
that they are not hot. As a result, they must be much smaller and much more
dense. Our understanding of physics tells us that massive cold objects will
collapse under their own weight. In particular, a cold object greater than 1.4
times the mass of the sun will not be a star at all. It will be so dense that the
electrons will be crushed into the atomic nuclei, with the result that they will
be absorbed into the protons and electron + proton will turn into a neutron.
Thus the object ceases to be normal matter (with electrons, protons, and
neutrons) at all, but becomes just a big bunch of neutrons. This number of 1.4
solar masses is called the Chandrasekhar limit after the physicist who
discovered it. In practice, when we look at the vast numbers of stars in the
universe, we have never found a cold star of more than 1.4 solar masses, though
we have found some that are close.
So, any cold object of more than 1.4 solar masses must be at least as
strange as a big bunch of neutrons. Well, neutrons can be packed very
tightly without resistance, so that in fact such 'neutron stars' naturally have
the density of an atomic nucleus. What this means is that one can think
of a neutron star as being essentially one incredibly massive atomic nucleus
(but with all neutrons and no protons).
The density of an atomic nucleus is a huge $10^{18}$ kg/m$^3$. (This is $10^{15}$
times that of normal matter.) Let us ask: suppose we had a round ball of
nuclear matter at this density. How massive would this ball need to be for
the associated Schwarzschild radius to be larger than the ball itself? The
answer is about 4 times the mass of the sun.
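This estimate can be reproduced in a few lines. For a uniform ball of density $\rho$ and radius $R$ the mass is $M = \frac{4}{3}\pi\rho R^3$, and setting $R$ equal to the Schwarzschild radius $2GM/c^2$ gives $R = \sqrt{3c^2/(8\pi G\rho)}$. The Python sketch below evaluates this; the constant-density model is the one described in the text, while the function names and rounded constants are illustrative assumptions.

```python
import math

# Mass at which a uniform ball of given density just fits inside its own
# Schwarzschild radius. Rounded constants; names are illustrative.

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # mass of the sun, kg


def schwarzschild_radius(mass_kg):
    """R_s = 2 G M / c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def critical_radius(density):
    """Radius R at which a uniform ball of this density has R_s = R."""
    return math.sqrt(3.0 * C**2 / (8.0 * math.pi * G * density))


def critical_mass(density):
    """Mass of the ball whose radius equals its Schwarzschild radius."""
    return C**2 * critical_radius(density) / (2.0 * G)


if __name__ == "__main__":
    nuclear_density = 1.0e18  # kg/m^3, the value quoted in the text
    m = critical_mass(nuclear_density)
    print(f"critical radius: {critical_radius(nuclear_density) / 1000:.1f} km")
    print(f"critical mass:   {m / M_SUN:.2f} solar masses")
```

The critical mass comes out slightly above four solar masses, for a ball only about a dozen kilometres in radius.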
So, working with a very simple model in which the density is constant
(and always equal to the density of normal nuclei, which are under
significantly less pressure) inside the object, we find that any cold object
with a mass greater than four solar masses will be a black hole! It turns
out that any model where the density increases with depth and pressure
yields an even stronger bound. As a result, modern calculations predict that
any cold object with a mass of greater than 2.1 solar masses will be a black
hole.
The dark companions in the binary systems all have masses significantly
greater than 2 solar masses. By the way, it is reassuring to note that every
neutron star that has been found has been in the range between 1.4 and 2.1
solar masses.
A few words on Accretion and Energy
Even with the above arguments, one might ask what direct measurements
could be made of the size of the dark companions. Can we show directly that
their size is comparable to the Schwarzschild radius? To do so one needs to
use the energy being released from matter falling into a black hole. This leads
us to a brief discussion of what are called accretion disks.
In general, matter tends to flow into black holes. This addition of matter
to an object is called "accretion." Black holes (and neutron stars) are very
small, so that a piece of matter from far away that becomes caught in the
gravitational field is not likely to be directed straight at the black hole or neutron
star, but instead is likely to go into some kind of orbit around it. The matter
piles up in such orbits and then, due to various interactions between the bits
of matter, some bits slowly lose angular momentum and move closer and
closer to the centre.
Eventually, they either fall through the horizon of the black hole or
hit the surface of the neutron star.
In cases where the compact object is in a binary system, the matter
flowing in comes mostly from the shining star. This process makes the
accreting matter into a disk, as shown in the picture below. This is why
astronomers often talk about 'accretion disks' around black holes and
neutron stars.
Now, an important point is that a lot of energy is released when matter
falls toward a black hole. Why does this happen? Well, as an object falls,
its speed relative to static observers becomes very large. When many such
bits of matter bump into each other at these high speeds, the result is a lot of
very hot matter. This is where those x-rays come from. The matter is hot
enough that x-rays are emitted as thermal radiation.
By the way, it is worth talking a little bit about just how we can
calculate the extra 'kinetic energy' produced when objects fall toward black
holes (or neutron stars). To do so, we will run in reverse a discussion we had
long ago about light falling in a gravitational field.
Do you recall how we first argued that there must be something like
a gravitational time dilation effect? It was from the observation that a
photon going upward through a gravitational field must lose energy and
therefore decrease in frequency.
Well, let's now think about a photon that falls down into a gravitational
field from far away to a radius r. Clocks at r run slower than clocks far away
by a factor of $\sqrt{1 - R_s/r}$. Since the lower clocks run more slowly, from the
viewpoint of these clocks the electric field of the photon seems to be oscillating
very quickly.
So, this must mean that the frequency of the photon (measured by a
static clock at r) is higher by a factor of $1/\sqrt{1 - R_s/r}$ than when the
frequency is measured by a clock far away. Since the energy of a photon
is proportional to its frequency, the energy of the photon has increased
by a factor of $1/\sqrt{1 - R_s/r}$.
Now, in our earlier discussion of the effects of gravity on light we
noted that the energy in light could be turned into any other kind of energy
and could then be turned back into light.
We used this to argue that the effects of gravity on light must be the
same as on any other kind of energy. So, consider an object of mass m
which begins at rest far away from the black hole. It contains an energy
$E = mc^2$. So, by the time the object falls to a radius r, its energy (measured
locally) must have increased by the same factor as would the energy of a
photon: to $E = mc^2 / \sqrt{1 - R_s/r}$.
What this means is that if the object gets anywhere even close to the
Schwarzschild radius, its energy will have increased by an amount
comparable to its rest mass energy. Roughly speaking, this means that
objects which fall toward a black hole or neutron star and collide with
each other release energy on the same scale as a star or a thermonuclear
bomb. This is the source of those x-rays and the other hard radiation
that we detect from the accretion disk.
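As a small numerical illustration of how quickly the locally measured energy grows, the Python sketch below evaluates the factor $1/\sqrt{1 - R_s/r}$ and the corresponding gain in units of the rest energy $mc^2$. The function names and the sample radii are illustrative choices, not values from the text.

```python
import math

# Locally measured energy of an object that fell from rest far away,
# in units of its rest energy m c^2. Names are illustrative.


def local_energy_factor(r_over_rs):
    """E / (m c^2) as measured by a static observer at radius r = r_over_rs * R_s."""
    if r_over_rs <= 1.0:
        raise ValueError("static observers exist only outside the horizon (r > R_s)")
    return 1.0 / math.sqrt(1.0 - 1.0 / r_over_rs)


def kinetic_energy_fraction(r_over_rs):
    """Energy gained in the fall, as a fraction of the rest energy m c^2."""
    return local_energy_factor(r_over_rs) - 1.0


if __name__ == "__main__":
    for r_over_rs in (100.0, 10.0, 2.0, 4.0 / 3.0, 1.01):
        gain = kinetic_energy_fraction(r_over_rs)
        print(f"r = {r_over_rs:6.2f} R_s: gained {gain:.3f} m c^2")
```

For instance, at $r = \frac{4}{3} R_s$ the factor is exactly 2, so the energy gained in the fall equals the rest energy itself.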
Actually, there is one step left in our accounting of the energy. After
all, we don't sit in close to the black hole and measure the energy of the
x-rays. Instead, we are far away.
So, we also need to think about the energy that the x-rays lose as
they climb back out of the black hole's gravitational field. To this end,
suppose our object begins far away from the black hole and falls to r. As we
said above, its energy is now $E = mc^2 / \sqrt{1 - R_s/r}$. Suppose that the object
now comes to rest at r. The object will then have an energy $E = mc^2$ as measured
at r. So, stopping this object will have released an energy of
$\Delta E = mc^2 \left( \frac{1}{\sqrt{1 - R_s/r}} - 1 \right)$
as measured at r. This is how much energy can be put into x-ray photons
and sent back out. But, on their way back out, such photons will decrease
in energy by a factor of $\sqrt{1 - R_s/r}$. So, the final energy that gets out of
the gravitational field is:
$\Delta E_\infty = mc^2 \sqrt{1 - R_s/r} \left( \frac{1}{\sqrt{1 - R_s/r}} - 1 \right) = mc^2 \left( 1 - \sqrt{1 - R_s/r} \right)$
In other words, the total energy released to infinity is a certain fraction
of the energy in the rest mass that fell toward the black hole. This fraction
goes to 1 if the mass fell all the way down to the black hole horizon. Again,
so long as r was within a factor of 100 or so of the Schwarzschild radius,
this gives an efficiency comparable to thermonuclear reactions.
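The fraction of rest energy that reaches infinity, $1 - \sqrt{1 - R_s/r}$, is simple to tabulate. The Python sketch below does so for a few radii; the function name is illustrative, and the comparison figure of roughly 0.7% for hydrogen fusion is a commonly quoted value brought in as an assumption, not something stated in the text.

```python
import math

# Fraction of the rest energy m c^2 that escapes to infinity when matter
# falls from far away and is brought to rest at radius r.
# The fusion figure is an assumed, commonly quoted comparison value.

FUSION_EFFICIENCY = 0.007  # roughly 0.7% of rest mass for hydrogen fusion


def efficiency_at_infinity(r_over_rs):
    """1 - sqrt(1 - R_s/r), with r given in units of R_s (r_over_rs >= 1)."""
    if r_over_rs < 1.0:
        raise ValueError("the formula applies only for r >= R_s")
    return 1.0 - math.sqrt(1.0 - 1.0 / r_over_rs)


if __name__ == "__main__":
    for r_over_rs in (100.0, 25.0, 3.0, 1.0):
        eff = efficiency_at_infinity(r_over_rs)
        print(f"r = {r_over_rs:5.1f} R_s: {100 * eff:6.2f}% of m c^2 "
              f"({eff / FUSION_EFFICIENCY:.1f} x fusion)")
```

At $r = 100 R_s$ the fraction is about half a percent, which is indeed comparable to fusion, and it rises to 1 as r approaches the horizon.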
Using direct observations, how strongly can we bound the size of a
black hole candidate? It turns out that one can study the detailed properties
of the spectrum of radiation produced by an accretion disk, and that one
can match this to what one expects from an accretion disk living in the
Schwarzschild geometry. Current measurements focus on a particular
(x-ray) spectral line associated with iron. In the best case, the results show
that the region emitting radiation is within $25 R_s$.
So, where Does all of this Energy go, Anyway?
This turns out to be a very interesting question. There is a lot of energy
being produced by matter falling into a black hole or a neutron star.
People are working very hard with computer models to figure out just
how much matter falls into black holes, and therefore just how much
energy is produced. Unfortunately, things are sufficiently complicated
that one cannot yet state results with certainty.
Nonetheless, some very nice work has been done in the last few years
by Ramesh Narayan and his collaborators showing that in certain cases
there appears to be much less energy coming out than there is going in.
Where is this energy going?
It is not going into heating up the object or the accretion disk, as
such effects would increase the energy that we see coming out (causing
the object to shine more brightly). If their models are correct, one is forced to
conclude that the energy is truly disappearing from the part of the spacetime
that can communicate with us. In other words, the energy is falling behind the
horizon of a black hole. As the models and calculations are refined over the
next five years or so, it is likely that this missing energy will be the first 'direct
detection' of the horizon of a black hole.
A Very few words about Hawking Radiation
Strictly speaking, Hawking Radiation is not a part of this course because
it does not fall within the framework of general relativity.
Here's the story. When we discussed the black hole singularity, we said
that what really happens there will not be described by general relativity. We
mentioned that physicists expect a new and even more fundamental
understanding of physics to be important there, and that the subject is
called "quantum gravity." We also mentioned that very little is understood
about quantum gravity at the present time.
Well, there is one thing that we think we do understand about
quantum effects in gravity. This is something that happens outside the
black hole and therefore far from the singularity. In this setting, the effects
of quantum mechanics in the gravitational field itself are extremely small.
So small that we believe that we can do calculations by simply splicing
together our understanding of quantum mechanics (which governs the
behaviour of photons, electrons, and such things) and our understanding
of gravity. In effect, we use quantum mechanics together with the equivalence
principle to do calculations.
Stephen Hawking did such a calculation back in the early 1970's. What
he found came as a real surprise. Consider a black hole by itself, without
an accretion disk or any other sort of obvious matter nearby. It turns out
that the region around the black hole is not completely dark! Instead, it
glows like a hot object, albeit at a very low temperature. The resulting
thermal radiation is called Hawking radiation.
This is an incredibly tiny effect. For a solar mass black hole the
associated temperature is only about $6 \times 10^{-8}$ Kelvin, that is, less than
a ten-millionth of a degree above absolute zero. Large black holes are even
colder, as the temperature is proportional to $M^{-1}$, where M is the black hole
mass. So black holes are very, very cold. In particular, empty space has a
temperature of about 3 K due to what is called the 'cosmic microwave
background', so a black hole is much colder than empty space.
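For readers who want to see the size of the effect, the sketch below evaluates the standard Hawking temperature formula, $T = \hbar c^3 / (8 \pi G M k_B)$. This formula is not derived in the text and is brought in here as an assumption, as are the rounded constants and the function names. Note that it gives a temperature inversely proportional to the mass.

```python
import math

# Hawking temperature of a Schwarzschild black hole, using the standard
# formula T = hbar c^3 / (8 pi G M k_B). The formula is assumed here, not
# derived in the text; constants are rounded and names are illustrative.

G = 6.674e-11      # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
HBAR = 1.0546e-34  # reduced Planck constant, J s
K_B = 1.381e-23    # Boltzmann constant, J/K
M_SUN = 1.989e30   # mass of the sun, kg


def hawking_temperature(mass_kg):
    """Temperature in Kelvin of a black hole of the given mass."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)


def mass_for_temperature(temperature_k):
    """Mass in kg of the black hole that has the given Hawking temperature."""
    return HBAR * C**3 / (8.0 * math.pi * G * K_B * temperature_k)


if __name__ == "__main__":
    print(f"solar mass black hole: {hawking_temperature(M_SUN):.2e} K")
    # Mass below which a black hole would be hotter than the ~2.7 K background.
    print(f"mass with T = 2.7 K:   {mass_for_temperature(2.7):.2e} kg")
```

A solar mass black hole comes out far below the roughly 3 K of the cosmic microwave background; only a black hole of well under a millionth of a solar mass would be hotter than empty space.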
However, if one could make or find a very tiny black hole, that black
hole would be very hot.
Let me also add that the radiation does not come directly from
the black hole itself, but from the space around the black hole. The idea
that it comes from inside is a common misconception about Hawking radiation:
the radiation does not by itself contradict our statement that nothing can
escape from within the horizon.
But, you may ask, how can radiation be emitted from the space around
the black hole? How can there be energy created from nothing? The answer is
that, in 'quantum field theory,' one can have negative energies as well as
positive energies. However, these negative energies should always be very
small and should survive only for a short time.
What happens is that the space around the black hole produces a net
zero energy, but it sends a positive energy flux of Hawking radiation
outward away from the black hole while sending a negative energy flux
inward across the horizon of the black hole. The negative energy is visible
only for a short time between when it is created and when it disappears behind
the horizon of the black hole.
The net effect is that the black hole loses mass and shrinks, while
positive energy is radiated to infinity. A diagram illustrating the fluxes of
energy is shown below.
Positive energy escapes as Hawking radiation
Penrose Diagrams, or "How to put Infinity in a Box"
There are a few comments left to make about black holes, and this
will require one further technical tool. The tool is yet another kind of
spacetime diagram (called a 'Penrose diagram') and it will be useful
for discussing more complicated kinds of black holes.
The point is that, as we have seen, it is often useful to compare what
an observer very far from the black hole sees to what one sees close to the
black hole. We say that an observer very far from the black hole is "at
infinity." Comparing infinity with finite positions is even more important
for more complicated sorts of black holes that we have not yet discussed.
However, it is difficult to draw infinity on our diagrams since infinity is
after all infinitely far away.
How can we draw a diagram of an infinite spacetime on a finite piece
of paper? Think back to the Escher picture of the Lobachevskian space.
By 'squishing' the space, Escher managed to draw the infinitely large
Lobachevskian space inside a finite circle. If you go back and try to count the
number of fish that appear along a geodesic crossing the entire space, it
turns out to be infinite. It's just that most of the fish are drawn incredibly
small. Escher achieved this trick by letting the scale vary across his map of
the space.
In particular, at the edge an infinite amount of Lobachevskian space is
crammed into a very tiny amount of Escher's map. In some sense this means
that his picture becomes infinitely bad at the edge, but nevertheless we were
able to obtain useful information from it.
We want to do much the same thing for our spacetimes. However,
for our case there is one catch: As usual, we will want all light rays to
travel along lines at 45 degrees to the vertical.
This idea was first put forward by (Sir) Roger Penrose, so that the
resulting pictures are often called "Penrose Diagrams." They are also called
"conformal diagrams" - conformal is a technical word related to the rescaling
of size.
Let's think about how we could draw a Penrose diagram of Minkowski
space. For simplicity, let's consider our favourite case of 1+1 dimensional
Minkowski space. Would you like to guess what the diagram should look
like? As a first guess, we might try a square or rectangle.
However, this guess has a problem associated with the picture below.
To see the point, consider any light ray moving to the right in 1 + 1
Minkowski space, and also consider any light ray moving to the left. Any
two such light rays are guaranteed to meet at some event. The same is in
fact true of any pair of leftward and rightward moving objects since, in 1
space dimension, there is no room for two objects to pass each other!
Left- and right- moving objects
always collide when space has only
one dimension
However, if the Penrose diagram for a spacetime is a square, then
there are in fact leftward and rightward moving light rays that never meet!
Some examples are shown on the diagram below.
These light rays do not meet
So, the rectangular Penrose diagram does not represent Minkowski
space. What other choices do we have? A circle turns out to have the same
problem. After a little thought, one finds that the only thing which behaves
differently is a diamond:
That is to say that infinity (or at least most of it) is best associated not
with a place or a time, but with a set of light rays! In 3+1 dimensions, we
can as usual decide to draw just the r, t coordinates. In this case, the Penrose
diagram for 3+1 Minkowski space is drawn as a half-diamond:
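One way to see where the diamond comes from is to write down an explicit squishing of 1+1 Minkowski space. The sketch below uses a common textbook choice, which is an assumption here since the text does not specify a map: take the light-ray coordinates $u = t - x$ and $v = t + x$ (in units with c = 1), squash each with an arctangent, and recombine. Because u and v are rescaled separately, light rays stay on 45 degree lines, and every event lands inside the diamond $|T| + |X| < \pi/2$.

```python
import math

# A common compactification of 1+1 Minkowski space (units with c = 1).
# The arctangent map is one conventional choice, assumed for illustration.


def compactify(t, x):
    """Map a Minkowski event (t, x) to Penrose-diagram coordinates (T, X)."""
    u = t - x          # constant along right-moving light rays
    v = t + x          # constant along left-moving light rays
    big_u = math.atan(u)   # each lies strictly between -pi/2 and pi/2
    big_v = math.atan(v)
    return (big_v + big_u) / 2.0, (big_v - big_u) / 2.0


def in_diamond(big_t, big_x):
    """True if the point lies strictly inside the diamond |T| + |X| < pi/2."""
    return abs(big_t) + abs(big_x) < math.pi / 2.0


if __name__ == "__main__":
    # Two events on the same right-moving light ray (x = t + 1).
    for t, x in ((0.0, 1.0), (5.0, 6.0), (1000.0, 1001.0)):
        big_t, big_x = compactify(t, x)
        print(f"(t, x) = ({t}, {x}) -> T - X = {big_t - big_x:.6f}, "
              f"inside diamond: {in_diamond(big_t, big_x)}")
```

All three events share the same value of $T - X$, which shows that the light ray is still a straight line at 45 degrees on the finite diagram.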
Penrose Diagrams for Black Holes
Using the same scheme, we can draw a diagram that shows the entire
spacetime for the eternal Schwarzschild black hole. Remember that the
distances are no longer represented accurately.
As a result, some lines that used to be straight get bent. For example,
the constant r curves that we drew as hyperbolae before appear somewhat
different on the Penrose diagram. However, all light rays still travel along
straight 45 degree lines. The result is:
A new diagram for the Schwarzschild black hole
It turns out, though, that Schwarzschild black holes are not the only kind of
black holes that can exist. The Schwarzschild metric was correct only outside
of all of the 'matter' (which means anything other than gravitational fields)
and only if the matter was spherically symmetric ('round').
Another interesting case to study occurs when we add a little bit of electric
charge to a black hole. In this case, the charge creates an electric field which
will fill all of space!
This electric field carries energy, and so is a form of 'matter.' Since we
can never get out beyond all of this electric field, the Schwarzschild metric
by itself is never quite valid in this spacetime. Instead, the spacetime is
described by a related metric called the Reissner-Nordström (RN) metric.
The Penrose diagram for this metric is shown below:
Actually, this is not the entire spacetime: the dots in the diagram above
indicate that this pattern repeats infinitely both to the future and to the past!
This diagram has many interesting differences when compared to the
Schwarzschild diagram. One is that the singularity in the RN metric is timelike
instead of being spacelike.
Another is that instead of there being only two exterior regions, there are
now infinitely many!
The most interesting thing about this diagram is that there does exist a
timelike worldline (describing an observer that travels more slowly than light)
that starts in one external region, falls into the black hole, and then comes
back out through a 'past horizon' into another external region. Actually, it is
possible to consider the successive external regions as just multiple copies of
the same external region.
In this case, the worldline we are discussing takes the observer back
into the same universe but in such a way that they emerge to the past of
when they entered the black hole!
However, it turns out that there is an important difference between the
Schwarzschild metric and the RN metric. The Schwarzschild metric is stable.
This means that, while the Schwarzschild metric describes only an eternal
black hole in a spacetime by itself (without, for example, any rocket ships
nearby carrying observers who study the black hole), the actual metric which
would include rocket ships, falling scientists and students, and so on can be
shown to be very close to the Schwarzschild metric. This is why we can use
the Schwarzschild metric itself to discuss what happens to objects that fall
into the black hole.
It turns out though that the RN metric does not have this property.
The exterior is stable, but the interior is not. This happens because of an
effect illustrated on the diagram below. Suppose that some energy (say, a
light wave) falls into the black hole. From the external viewpoint this is a
wave with a long wavelength and therefore represents a small amount of energy.
The two light rays drawn below are in fact infinitely far apart from the outside
perspective, illustrating that the wave has a long wavelength when it is far
away.
Inner horizon
However, inside the black hole, we can see that the description is different.
Now the two light rays have a finite separation. This means that near the
light ray marked "inner horizon," what was a long wavelength light ray outside
is now of very short wavelength, and so very high energy! In fact, the energy
created by any small disturbance will become infinite at the "inner horizon."
It will come as no surprise that this infinite energy causes a large change in
the spacetime.
The result is that dropping even a small pebble into an RN black
hole creates a big enough effect at the inner horizon to radically change
the Penrose diagram. The Penrose diagram for the actual spacetime containing
an RN black hole together with even a small disturbance looks like this:
Some of the researchers who originally worked this out have put together
a nice readable website that you might enjoy.
Real black holes in nature will not have a significant electric charge. The
point is that a black hole with a significant (say, positive) charge will
attract other (negative) charges, which fall in so that the final object has
zero total charge. However, real black holes do have one property that
turns out to make them quite different from Schwarzschild black holes:
they are typically spinning. Spinning black holes are not round, but become
somewhat disk shaped.
As a result, they are not described by the Schwarzschild metric. The
spacetime that describes a rotating black hole is called the Kerr metric.
There is also of course a generalization that allows both spin and charge
and which is called the Kerr-Newman metric.
It turns out that the Penrose diagram for a rotating black hole is much
the same as that of an RN black hole, but with the technical complication
that rotating black holes are not round. One finds the same story about
an unstable inner horizon in that context as well, with much the same
resolution. We will not go into the details of the Kerr metric because of the
technical complications involved, but it is good to know that things basically
work just the same as for the RN metric above.
Chapter 1
The Universe
THE COPERNICAN PRINCIPLE AND RELATIVITY
Of course, in the early 1900's people did not know all that much about
the universe, but they did have a few ideas on the subject. In particular, a
certain philosophical tradition ran strong in astronomy, dating back to
Copernicus. (Copernicus was the person who promoted the idea that the
stars and planets did not go around the earth, but that instead the planets
go around the sun.) This tradition held in high esteem the principle that
"The earth is not at a particularly special place in the Universe".
It was this idea which had freed Copernicus from having to place the
earth at the centre of the Universe. The idea was then generalized to say
that, for example "The Sun is not a particularly special star," and then
further to "There is no special place in the Universe." Or, said differently,
the Copernican principle is that "Every place in the universe is basically
the same."
So, on philosophical grounds, people believed that the stars were
sprinkled more or less evenly throughout the universe. Now, one might
ask, is this really true?
Well, the stars are not in fact evenly sprinkled. We now know that
they are clumped together in galaxies. And even the galaxies are clumped
together a bit. However, if one takes a sufficiently rough average then it
is basically true that the clusters of galaxies are evenly distributed. We say
that the universe is homogeneous. Homogeneous is just a technical word
which means that every place in the universe is the same.
Homogeneity and Isotropy
In fact, there is another idea that goes along with every place being
essentially the same. This is the idea that the universe is the same in every
direction. The technical word is that the universe is isotropic. To give you
an idea of what this means, here is a picture of a universe that is homogeneous
but is not isotropic: the galaxies are farther apart in the vertical direction
than in the horizontal direction:
In contrast, a universe that is both homogeneous and isotropic must
look roughly like this:
That Technical point about Newtonian Gravity in Homogeneous Space
The point is that, to compute the gravitational field at some point in
space we need to add up the contributions from all of the infinitely many
galaxies. This is an infinite sum. When you discussed such things in your
calculus class, you learned that some infinite sums converge and some do
not. Actually, this sum is one of those interesting in-between cases where
the sum converges, but it does not converge absolutely. What happens in
this case is that you can get different answers depending on the order in
which you add up the contributions from the various objects.
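The order dependence is easy to see in a toy example. The sketch below does not model the gravitational sum over galaxies; it uses the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$ as a stand-in, which is the classic example of a sum that converges but not absolutely. Adding the very same terms in a different order (one positive term followed by two negative ones) gives half the usual answer. The function names are illustrative.

```python
import math

# Toy illustration of a conditionally convergent sum: the same terms,
# added in two different orders, approach two different limits.


def alternating_harmonic(n_terms):
    """1 - 1/2 + 1/3 - 1/4 + ... in the usual order; tends to ln 2."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged(n_blocks):
    """Same terms, taken as one positive then two negative; tends to (ln 2) / 2."""
    total = 0.0
    for j in range(1, n_blocks + 1):
        # Block j: +1/(2j-1), then -1/(4j-2) and -1/(4j).
        total += 1.0 / (2 * j - 1) - 1.0 / (4 * j - 2) - 1.0 / (4 * j)
    return total


if __name__ == "__main__":
    n = 100_000
    print(f"usual order:      {alternating_harmonic(n):.5f}  (ln 2     = {math.log(2):.5f})")
    print(f"rearranged order: {rearranged(n):.5f}  (ln 2 / 2 = {math.log(2) / 2:.5f})")
```

Every term of the original series appears exactly once in the rearranged version, yet the two partial sums settle on different values.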
To see how this works, recall that all directions in this universe are
essentially the same. Thus, there is a rotational symmetry and the
gravitational field must be pointing either toward or away from the centre.
Now, it turns out that Newtonian gravity has a property that is much
like Gauss' law in electromagnetism. In the case of spherical symmetry,
the gravitational field on a given sphere depends only on the total mass
(the gravitational analogue of charge) inside the sphere. This makes it
clear that on any given sphere there must be some gravitational field,
since there is certainly matter inside.
But what if the sphere is very small? Then, there is essentially no matter
inside, so the gravitational field will vanish. So, at the 'center' the
gravitational field must vanish, but at other places it does not.
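The Gauss' law argument can be sketched numerically. This is a minimal illustration that assumes a perfectly uniform mass density and a sphere centred on an arbitrarily chosen 'center'; the function names and the sample density are hypothetical and not taken from the text.

```python
import math

G = 6.674e-11  # Newton's constant in SI units (m^3 kg^-1 s^-2)

def enclosed_mass(radius, density):
    """Mass inside a sphere of the given radius for uniform density."""
    return density * (4.0 / 3.0) * math.pi * radius ** 3

def field_strength(radius, density):
    """Magnitude of the inward Newtonian field on the sphere: g = G M / r^2."""
    if radius == 0.0:
        return 0.0  # no enclosed matter, so no field at the chosen centre
    return G * enclosed_mass(radius, density) / radius ** 2

sample_density = 1.0e-26  # kg/m^3, a made-up value for illustration only
for r in (0.0, 1.0e22, 2.0e22, 4.0e22):
    print(r, field_strength(r, sample_density))
```

The field comes out proportional to the radius, so it vanishes only at the point we happened to call the centre.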
But now we recall that there is no centre! This universe is
homogeneous, meaning that every place is the same. So, if the gravitational
field vanishes at one point, it must also vanish at every other point. This
is what physicists call a problem.
However, Einstein's theory turns out not to have this problem. In large
part, this is because Einstein's conception of a gravitational field is very
different from Newton's. In particular, Einstein's conception of the gravitational
field is local while Newton's is not.
Homogeneous Spaces
Now, in general relativity, we have to worry about the curvature (or shape)
of space. So, we might ask: "what shapes are compatible with the idea that
space must be homogeneous and isotropic?" It turns out that there are exactly
three answers:
• A three-dimensional sphere (what the mathematicians call $S^3$).
This can be thought of as the set of points that satisfy
$x_1^2 + x_2^2 + x_3^2 + x_4^2 = R^2$ in four-dimensional Euclidean space.
• Flat three-dimensional space.
• The three-dimensional version of the Lobachevskian space.
By the way, it is worth pointing out that option #1 gives us a finite sized
universe. The second and third options give us infinite spaces. However,
if we were willing to weaken the assumption of isotropy just a little bit,
we could get finite sized spaces that are very much the same. To get an
idea of how this works, think of taking a piece of paper (which is a good
model of an infinite flat plane) and rolling it up into a cylinder. This
cylinder is still flat, but it is finite in one direction. This space is
homogeneous, though it is not isotropic (since one direction is finite while
the other is not).
Rolling up flat three dimensional space in all three directions gives
what is called a 3-torus, and is finite in all three directions. The
Lobachevskian space can also be 'rolled up' to get a finite universe. This
particular detail is not mentioned in many popular discussions of
cosmology.
Actually, these are not just three spaces. Instead, each possibility
(sphere, flat, Lobachevskian) represents a whole family of possibilities. To see the
point, let's consider option #1, the sphere. There are small spheres, and
there are big spheres. The big spheres are very flat while the tiny spheres
are tightly curved. So, the sphere that would be our universe could, in
principle, have had any size.
The same is true of the Lobachevskian space. Think of it this way: in
Escher's picture, no one told us how big each fish actually is. Suppose
that each fish is one light-year across.
Such a space can also be considered 'big,' although of course any
Lobachevskian space has infinite volume (an infinite number of fish). In
particular, if we consider a region much smaller than a single fish, we
cannot see the funny curvature effects and the space appears to be flat.
You may recall that we have to look at circles of radius 2 fish or so to see
that C/R is not always 2π. So, if each fish was a light-year across, we would
have to look really far away to see the effects of the curvature. On the
other hand, if each fish represented only a millimeter (a 'small' space), the
curvature would be readily apparent just within our classroom. The point
is again that there is really a family of spaces here labelled by a length -
roughly speaking, this length is the size of each fish.
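Here is a small Python sketch of how the size of the space controls when curvature becomes visible. It assumes the standard formulas for circles drawn on two-dimensional surfaces of constant curvature (C = 2π s sin(r/s) on a sphere and C = 2π s sinh(r/s) on a Lobachevskian surface, where s is the length that labels the space); these formulas and the names below are assumptions added for illustration.

```python
import math

def circumference_over_radius(r, scale, geometry):
    """Return C/r for a circle of proper radius r.

    scale is the length labelling the space (roughly the 'fish size');
    geometry is 'sphere', 'flat', or 'lobachevskian'.
    """
    if geometry == "flat":
        return 2.0 * math.pi
    x = r / scale
    if geometry == "sphere":
        return 2.0 * math.pi * math.sin(x) / x
    if geometry == "lobachevskian":
        return 2.0 * math.pi * math.sinh(x) / x
    raise ValueError("unknown geometry: " + geometry)

for r in (0.001, 0.1, 1.0, 2.0):  # radius measured in units of the scale
    print(r,
          circumference_over_radius(r, 1.0, "sphere"),
          circumference_over_radius(r, 1.0, "lobachevskian"))
```

For circles much smaller than the scale, both ratios are indistinguishable from 2π. The difference shows up only once the radius is comparable to the scale.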
What about for the flat space? After all, flat is flat! Here, making
the universe bigger does not change the geometry of space at all - it simply
remains flat. However, it will spread out the galaxies, stars, and such. (The
same is, of course, also true in the spherical and Lobachevskian contexts.)
So, for the flat space case, one easy effect to visualize is the change in the
density of matter. However, there is more to it than this: the spacetime is
curved, and the curvature depends on the rate of expansion.
We can see this because observers at different places in 'space' who
begin with no relative velocity nevertheless accelerate apart when the
universe 'expands'!
Dynamics (a.k.a. Time Evolution)
So, homogeneity and isotropy restrict the shape of space to be in one of
a few simple classes. That is to say, at any time (to the extent that this means
anything) the shape of space takes one of these forms. But what happens as
time passes? Does it maintain the same shape, or does it change? The answer
must somehow lie inside Einstein's equations (the complicated ones that we
have said rather little about), since they are what control the behaviour of the
spacetime metric.
Luckily, the assumptions of homogeneity and isotropy simplify these
equations a lot. Let's think about what the metric will look like. It will
certainly have a $dt^2$ part. If we decide to use a time coordinate which measures
proper time directly then the coefficient of $dt^2$ will just be -1. We can always
decide to make such a choice.
The rest of the metric controls the metric for space, which must be
the metric for one of the three spaces described above. Now, the universe
cannot suddenly change from, say, a sphere to a Lobachevskian space.
So, as time passes the metric for space can only change by changing the
overall size (a.k.a. 'scale') of the space. In other words, the space can
only get bigger or smaller.
What this means mathematically is that the metric must take the
general form:
$ds^2 = -dt^2 + a^2(t)\,(\text{metric for unit-sized space}).$
The factor a(t) is called the 'scale factor' or 'size of the universe.' When
a is big, all of the spatial distances are very big. When a is small, all of the
spatial distances are very small. So, a universe with small a will have a highly
curved space and very dense matter. Technically, the curvature of space is
proportional to $1/a^2$, while the density of matter is proportional to $1/a^3$.
Note that the only freedom we have left in the metric is the single
function a(t). Einstein's equations must therefore simplify to just a single
equation that tells us how a(t) evolves in time.
Expanding and Contracting Universes
Before diving into Einstein's equations themselves, let's first take a
moment to understand better what it means if a changes with time. To do
so, let's consider a case where a starts off 'large' but then quickly decreases
to zero.
This represents any reasonable solution of Einstein's equations.
Nevertheless, let's think about what happens to a freely falling object in
this universe that begins 'at rest', meaning that it has zero initial velocity
in the coordinates used in the metric above. If it has no initial velocity, then
we can draw a spacetime diagram showing the first part of its worldline
as a straight vertical line.
Now, when a shrinks to zero, what happens to the worldline? Will it bend
to the right or to the left? Well, we assumed that the Universe is isotropic,
right? So, the universe is the same in all directions. This means that there is a
symmetry between right and left, and there is nothing to make it prefer one
over the other. So, it does not bend at all but just runs straight up the diagram.
In other words, an object that begins at x = 0 with zero initial velocity will
always remain at x = 0.
Of course, since the space is homogeneous, all places in the space are
the same and any object that begins at any $x = x_0$ with zero initial velocity
will always remain at $x = x_0$. From this perspective it does not look like
much is happening.
However, consider two such objects: one at $x_1$ and one at $x_2$. The
metric $ds^2$ contains a factor of $a^2$, so the actual proper distance
between these two points is proportional to a. Suppose that the distance
between $x_1$ and $x_2$ is L when a = 1 (at t = 0). Then, later, when the scale
has shrunk to a < 1, the new distance between these points is only aL. In
other words, the two objects have come closer together.
Clearly, what each object sees is another object that moves toward it.
The reason that things at first appeared not to move is that we chose a
funny sort of coordinate system (if you like, you can think of this as a
funny reference frame, though it is nothing like an inertial reference frame
in special relativity). The funny coordinate system simply moves along
with the freely falling objects; cosmologists call it the 'co-moving' coordinate
system. It is also worth pointing out what happens if we have lots of such
freely falling objects, each remaining at a different value of x. In this case,
each object sees all of the other objects rushing toward it as a decreases.
Furthermore, an object which is initially a distance L away (when a = 1)
becomes only a distance aL away.
So, the object has 'moved' a distance (1 - a) L. Similarly, an object
which is initially a distance 2L away becomes 2aL away and 'moves' a
distance 2(1 - a) L - twice as far.
This reasoning leads to what is known as the 'Hubble Law.' This law
states that in a homogeneous universe the relative velocity between any
two objects is proportional to their distance:
$v = H(t)\,d,$
where v is the relative 'velocity', d is the distance, and H (t) is the 'Hubble
constant' - a number that does depend on time but does not depend on
the distance to the object being considered.
It is important to stress again that the Hubble constant is constant
only in the sense of being independent of d. There is no particular reason
that this 'constant' should be independent of time and, indeed, we will see
that it is natural for H to change with time. The above relation is written
using H(t) to emphasize this point. The Hubble constant is determined by the rate
of change of a:
$H(t) = \frac{1}{a}\frac{da}{dt}.$
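The reasoning that leads to the Hubble law can be checked with a short Python sketch. It assumes a made-up collapsing scale factor, a(t) = 1 - 0.1 t, purely for illustration; the function names are hypothetical. Objects sit at fixed co-moving positions x, so their proper separation is a(t) times the coordinate separation.

```python
def scale_factor(t):
    """Hypothetical shrinking scale factor with a = 1 at t = 0."""
    return 1.0 - 0.1 * t

def scale_factor_rate(t):
    """da/dt for the hypothetical scale factor above."""
    return -0.1

def hubble(t):
    """H(t) = (1/a) da/dt."""
    return scale_factor_rate(t) / scale_factor(t)

def proper_distance(x1, x2, t):
    """Distance between two co-moving objects at time t."""
    return scale_factor(t) * abs(x2 - x1)

def relative_velocity(x1, x2, t):
    """Rate of change of the proper distance (negative means approaching)."""
    return scale_factor_rate(t) * abs(x2 - x1)

t = 2.0
for origin in (0.0, 5.0):          # any object may regard itself as the centre
    for x in (origin + 1.0, origin + 2.0, origin + 3.0):
        d = proper_distance(origin, x, t)
        v = relative_velocity(origin, x, t)
        print(origin, x, d, v, v / d, hubble(t))
```

Every pair gives the same ratio v/d, whichever object is taken as the origin, and that ratio is H(t).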
There is no special object that is the 'center' of our collapsing universe.
Instead, every object sees itself as the centre of the process. As usual, none
of these objects is any more 'right' about being the centre than any other.
The difference is just a change of reference frames.
A Flat Spacetime Model
In case this is hard to grasp, it is worth mentioning that you have seen
something similar happen even in flat spacetime.
Consider an infinite collection of inertial observers, all of whom pass
through some special event. Let me suppose that observer #1 differs from
observer #0 by the same boost parameter as any other observer n + 1 differs
from observer n. We could draw a spacetime diagram showing these observers,
with all of their worldlines fanning out from the special event.
Note that this is not the k = 0 Universe which has flat space. Instead,
the entire spacetime is flat here when viewed as a whole, but the slice
representing space on the above diagram is a hyperboloid, which is most
definitely not flat.
Instead, this hyperboloid is a constant negative curvature space
(k = -1). Since the spacetime here is flat, we have drawn the limit of the
k = -1 case as we take the matter density to zero. It is not physically realistic
as a cosmology, but it does illustrate
the co-moving coordinate system used in cosmology. In addition,
for k = -1 the matter density does become vanishingly small in the distant
future (if the cosmological constant vanishes; see below). Thus, for such a
case this diagram does become accurate in the limit $t \to \infty$.
Shown in the reference frame of observer 0, that observer appears
to be the centre of the expansion. However, we know that we are free to
change reference frames.
In the new reference frame, another observer appears to be the
'center.' These discussions in flat spacetime illustrate three important
points: The first is that although the universe is isotropic (spherically
symmetric), there is no special 'center.' Note that the above diagrams even
have a sort of 'big bang' where everything comes together, but that it does
not occur any more where one observer is than where any other observer
is. The second important point that the above diagram illustrates is that
the surface that is constant t in our co-moving cosmological coordinates
does not represent the natural notion of simultaneity for any of the
co-moving observers. The 'homogeneity' of the universe is a result of using a
special frame of reference in which the t = const surfaces are hyperbolae.
As a result, the universe is not in fact homogeneous in any inertial reference
frame (or any similar reference frame in a curved spacetime).
This is related to the third point: When discussing the Hubble law, a
natural question is, "What happens when d is large enough that H(t) d is
greater than the speed of light?" In general relativity measurements that
are not local are a subtle thing. For example, in the flat spacetime example
above, in the coordinates that we have chosen for our homogeneous metric,
the t = const surfaces are hyperbolae. They are not in fact the surfaces of
simultaneity for any of the co-moving observers.
Now, the distance between co-moving observers that we have been
discussing is the distance measured along the hyperbola (i.e., along the
homogeneous slice), which is a very different notion of distance than we
are used to using in Minkowski space.
This means that the 'velocity' in the Hubble law is not what we had
previously called the relative velocity of two objects in Minkowski space.
Instead, in our flat spacetime example, the velocity in the Hubble law turns
out to correspond directly to the boost parameter θ.
However, for the nearby galaxies (for which the relative velocity is
much less than the speed of light), this subtlety can be safely ignored
(since v and θ are proportional there).
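The flat spacetime example can be made quantitative with a short Python sketch. It assumes units with c = 1 and places the co-moving observer with boost parameter θ on the hyperbola of constant proper time τ at (t, x) = (τ cosh θ, τ sinh θ); the function names are made up for illustration. The code measures distance along the hyperbola by adding up many short segments and then asks how fast that distance grows with τ.

```python
import math

C = 1.0  # speed of light, in units where c = 1

def hyperbola_point(tau, theta):
    """Event on the tau = const slice reached by the observer with boost theta."""
    return tau * math.cosh(theta), C * tau * math.sinh(theta)

def distance_along_slice(tau, theta, steps=10000):
    """Proper length of the slice between observer 0 and observer theta."""
    total = 0.0
    t_prev, x_prev = hyperbola_point(tau, 0.0)
    for i in range(1, steps + 1):
        t, x = hyperbola_point(tau, theta * i / steps)
        dt, dx = t - t_prev, x - x_prev
        total += math.sqrt(dx * dx - (C * dt) ** 2)  # spacelike segment
        t_prev, x_prev = t, x
    return total

def hubble_velocity(tau, theta, h=1.0e-3):
    """Rate of change of the slice distance with proper time tau."""
    return (distance_along_slice(tau + h, theta)
            - distance_along_slice(tau - h, theta)) / (2.0 * h)

def ordinary_velocity(theta):
    """Relative velocity in the usual Minkowski sense."""
    return C * math.tanh(theta)

for theta in (0.01, 0.5, 2.0):
    print(theta, hubble_velocity(3.0, theta), ordinary_velocity(theta))
```

In this sketch the distance along the slice comes out as cτθ, so the Hubble 'velocity' is cθ, which exceeds c for θ > 1 even though the ordinary relative velocity c tanh θ never does. For small θ the two agree.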
On to the Einstein Equations
So, the all important question is going to be: What is the function
a(t)? What do the Einstein equations tell us about how the Universe will
actually evolve? Surely what Newton called the attractive 'force' of gravity
must cause something to happen!
As you might expect, the answer turns out to depend on what sort of
stuff you put in the universe. For example, a universe filled only with light
behaves somewhat differently from a universe filled only with dirt.
In particular, it turns out to depend on the density of energy (ρ) and
on the pressure (P). [You may recall that we briefly mentioned earlier that,
in general relativity, pressure is directly a source of gravity.]
For our homogeneous isotropic metrics, it turns out that the Einstein
equations can be reduced to the following two equations:
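$\frac{3}{a^2}\left(\frac{da}{dt}\right)^2 = \frac{8\pi G}{c^2}\,\rho - \frac{3kc^2}{a^2},$
$\frac{3}{a}\frac{d^2a}{dt^2} = -\frac{4\pi G}{c^2}\,(\rho + 3P).$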
In the first equation, the constant k is equal to +1 for the spherical
(positively curved) universe, k = 0 for the flat universe, and k = -1 for the
Lobachevskian (negatively curved) universe.
We're not going to derive these equations, but let's talk about them a
bit. The second one is of a more familiar form. It looks kind of like
Newton's second law combined with Newton's law of Universal
Gravitation - on the left we have the acceleration $d^2a/dt^2$ while the right
provides a force that depends on the amount of matter present (ρ).
Interestingly though, the pressure P also contributes. The reason that Newton
never noticed the pressure term is that ρ is the density of energy and, for an
object like a planet, the energy is $mc^2$ which is huge due to the factor of $c^2$. In
comparison, the pressure inside the earth is quite small. Nevertheless, this
pressure contribution can be important in cosmology.
Recall that as a changes, it tells us whether the (co-moving) bits of matter are coming
closer together or spreading farther apart. This means that, in the present
context, the Einstein equations tell us what the matter is doing as well as
what the spacetime is doing. Thought of this way, the second equation
should make a lot of sense.
The left hand side is an acceleration term, while the right hand side is
related to the sources of gravity. Under familiar conditions where the
particles are slowly moving, the energy density is roughly $c^2$ times the mass
density. This factor of $c^2$ nicely cancels the $c^2$ in the denominator, leaving
the first term on the right hand side as G times the density of mass.
The pressure has no hidden factors of $c^2$ and so $P/c^2$ is typically small.
Under such conditions, this equation says that gravity causes the bits of
matter to accelerate toward one another (this is the meaning of the minus
sign) at a rate proportional to the amount of mass around. That sounds
just like Newton's law of gravity, doesn't it?
In fact, we see that gravity is attractive in this sense whenever the energy
density ρ and the pressure P are positive. In particular, for positive energy
and pressure, a must change with time in such a way that things accelerate
toward each other. Under such conditions it is impossible for the universe to
remain static.
Now, back in the early 1900's people in fact believed (based on no
particular evidence) that the universe had been around forever and had been
essentially the same for all time. So, the idea that the universe had to be
changing really bothered Einstein. In fact, it bothered him so much that he
found a way out.
Negative Pressure, Vacuum Energy, and the Cosmological Constant
Physicists do expect that (barring small exceptions in quantum field
theory) the energy density ρ will always be positive. However, there is no
reason in principle why the pressure P must be positive. Let's think about
what a negative pressure would mean. A positive pressure is an effect that
resists something being squeezed.
So, a negative pressure is an effect that resists something being
stretched. This is also known as a 'tension.' Imagine, for example, a rubber
band that has been stretched. We say that it is under tension, meaning
that it tries to pull itself back together. A sophisticated relativistic physicist
calls such an effect a 'negative pressure.'
We see that the universe can in fact 'sit still' and remain static if ρ + 3P
= 0. If ρ + 3P is negative, then gravity will in fact be repulsive (as opposed to
attractive): the various bits of matter will accelerate apart.
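The role of the combination ρ + 3P can be sketched in a few lines of Python. This assumes the second Einstein equation in the form (3/a) d²a/dt² = -(4πG/c²)(ρ + 3P), with ρ an energy density and P a pressure in SI units; the function names are made up for illustration.

```python
import math

G = 6.674e-11  # Newton's constant (m^3 kg^-1 s^-2)
C = 2.998e8    # speed of light (m/s)

def scale_acceleration(a, rho, pressure):
    """d2a/dt2 from (3/a) d2a/dt2 = -(4 pi G / c^2)(rho + 3P)."""
    return -(4.0 * math.pi * G / (3.0 * C ** 2)) * a * (rho + 3.0 * pressure)

def gravity_character(rho, pressure):
    """Classify the sign of rho + 3P."""
    source = rho + 3.0 * pressure
    if source > 0.0:
        return "attractive"
    if source < 0.0:
        return "repulsive"
    return "no acceleration"

print(gravity_character(1.0, 0.0))    # ordinary pressureless matter
print(gravity_character(3.0, -1.0))   # tension exactly balances the energy density
print(gravity_character(1.0, -1.0))   # large negative pressure
```

Note that a static universe needs the pressure to be negative and as large in size as one third of the energy density.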
Now, because ρ is typically very large (since it is the density of energy
and $E = mc^2$) this requires a truly huge negative pressure. The kinds of
matter that we are most familiar with will never have such a large negative
pressure. However, physicists can imagine that there might possibly be
such a kind of matter.
The favourite idea along these lines is called "vacuum energy." The
idea is that empty space itself might somehow have energy. At first, this is
a rather shocking notion. If it is empty, how can it have energy? But, some
reflection will tell us that this may simply be a matter of semantics: given
the space that we think is empty (because we have cleared it of everything
that we know how to remove), how empty is it really? In the end, like
everything else in physics, this question must be answered experimentally.
We need to find a way to go out and to measure the energy of empty
space.
Now, what is clear is that the energy of empty space must be rather
small. Otherwise, its gravitational effects would screw up our predictions
of, for example, the orbits of the planets. However, there is an awful lot
of 'empty' space out there. So, taken together it might still have some
nontrivial effect on the universe as a whole.
Why should vacuum energy (the energy density of empty space) have
negative pressure? Well, an important fact here is that energy density and
pressure are not completely independent. Pressure, after all, is related to
the force required to change the size of a system: to smash it or to stretch
it out. On the other hand, force is related to energy: for example, we must
add energy to a rubber band in order to fight the tension forces and stretch
it out. The fact that we must add energy to a spring in order to stretch it
is what causes the spring to want to contract; i.e., to have a negative
pressure when stretched.
Now, if the vacuum itself has some energy density ρ and we stretch
the space (which is just what we will do when the universe expands) then
the new (stretched) space has more vacuum and therefore more energy.
So, we again have to add energy to stretch the space, so there is a negative
pressure. It turns out that pressure is (minus) the derivative of energy
with respect to volume: P = -dE/dV. Here, E = ρV, so P = -ρ.
Clearly then for pure vacuum energy we have ρ + 3P < 0 and gravity
is repulsive. On the other hand, combining this with the appropriate
amount of normal matter could make the two effects cancel out and could
result in a static universe.
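Written out step by step, the argument runs as follows. The labels $\rho_v$ (vacuum energy density) and $\rho_m$ (energy density of normal matter, taken to have zero pressure) are introduced here only for illustration, and the balance condition is a sketch under that zero-pressure assumption.

```latex
\begin{align*}
P &= -\frac{dE}{dV} = -\frac{d(\rho_v V)}{dV} = -\rho_v
  && \text{(the vacuum density does not change as } V \text{ grows)} \\
\rho_v + 3P &= \rho_v - 3\rho_v = -2\rho_v < 0
  && \text{(pure vacuum energy: repulsive)} \\
\rho_{\text{tot}} + 3P_{\text{tot}} &= (\rho_m + \rho_v) + 3(0 - \rho_v) = \rho_m - 2\rho_v
  && \text{(normal matter plus vacuum)} \\
\rho_m - 2\rho_v &= 0 \quad\Longrightarrow\quad \rho_m = 2\rho_v
  && \text{(the two effects cancel)}
\end{align*}
```

So, under these assumptions, the 'appropriate amount' of normal matter is an energy density twice that of the vacuum.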
Since P = -ρ for vacuum energy, we see that vacuum energy is in fact
characterized by a single number. It is traditional to call this number Λ, and to
define Λ so that we have
$\rho = \frac{\Lambda}{8\pi G}, \qquad P = -\frac{\Lambda}{8\pi G}.$
Such a Λ is called the 'cosmological constant.' We have, in fact, seen
it before. You may recall that, during our very brief discussion of the
Einstein equations, we mentioned that Einstein's assumptions and the
mathematics in fact allowed two free parameters. One of these we identified
as Newton's Universal Gravitational Constant G. The other was the
cosmological constant Λ. This is the same cosmological constant: as we
discussed back then, the cosmological constant term in the Einstein
equations could be called a funny sort of 'matter.' In this form, it is none
other than the vacuum energy that we have been discussing.
We mentioned that Λ must be small to be consistent with the
observations of the motion of planets. However, clearly matter is somewhat
more clumped together in our solar system than outside. Einstein hoped
that this local clumping of normal matter (but not of the cosmological
constant) would allow the gravity of normal matter to completely dominate
the situation inside the solar system while still allowing the two effects to
balance out for the universe overall.
Anyway, Einstein thought that this cosmological constant had to be
there; otherwise the universe could not remain static.
However, in the late 1920's, something shocking happened: Edwin
Hubble made detailed measurements of the galaxies and found that the
universe is in fact not static. He used the Doppler effect to measure the
motion of the other galaxies and he found that they are almost all moving
away from us. Moreover, they are moving away from us at a rate
proportional to their distance! This is why the rule v = H(t) d is known
as the 'Hubble Law.'
The universe appeared to be expanding! The result was that Einstein
immediately dropped the idea of a cosmological constant and declared it
to be the biggest mistake of his life.
OUR UNIVERSE: PAST, PRESENT, AND FUTURE
The other galaxies are running away from ours at a rate proportional
to their distance from us. The implication is that the universe is expanding,
and that it has been expanding for some time. In fact, since gravity is
generally attractive, we would expect that the universe was expanding even
faster in the past.
To find out more of the details we will have to look again to the Einstein
equations. We will also need to decide how to encode the current matter in
the universe in terms of an energy density ρ and a pressure P. Let's first think about the
pressure.
Most matter today is clumped into galaxies, and the galaxies are quite
well separated from each other. How much pressure does one galaxy apply
to another? Essentially none. So, we can model the normal matter by
setting P = 0.
When the pressure vanishes, one can use the Einstein equations to
show that the quantity $E = 8\pi G \rho a^3/3$ is independent of time. Roughly
speaking, this is just conservation of energy (since ρ is the density of energy
and $a^3$ is proportional to the volume). As a result, assuming that Λ = 0,
the Einstein equations can be written:
$\left(\frac{da}{dt}\right)^2 = \frac{E}{c^2 a} - kc^2.$
Here k is a constant that depends on the overall shape of space: k = +1 for
the spherical space, k = 0 for the flat space, and k = -1 for the
Lobachevskian space.
In the above form, this equation can be readily solved to determine
the behaviour of the universe for the three cases k = -1, 0, +1. We don't
need to go into the details here, but let me draw a graph that gives the
idea of how a changes with t in each case:
[Figure: the scale factor a plotted against proper time for k = -1, 0, +1.]
Note that for k = +1 the universe expands and then recontracts,
whereas for k = 0, -1 it expands forever. In the case k = 0 the expansion
rate da/dt goes to zero at very late times, but for k = -1 the rate da/dt
asymptotes to a constant positive value at late times. (The Hubble
constant H = (1/a) da/dt itself goes to zero at late times in both cases.)
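The three behaviours can be reproduced with a rough numerical sketch in Python. It assumes pressureless matter, Λ = 0, and the equation in the form (da/dt)² = E/(c² a) - k c², in made-up units where c = 1 and E = 1. Differentiating that equation gives d²a/dt² = -E/(2 c² a²), which is what the code steps forward in time. The function name, the starting value of a, and the step size are choices made for illustration.

```python
import math

def evolve(k, E=1.0, c=1.0, a0=0.05, dt=1.0e-4, t_max=10.0):
    """Follow a(t) for curvature constant k, starting on the expanding branch.

    Returns a list of (t, a, da/dt) samples. The run stops at t_max, or
    earlier if the universe has recollapsed back to its starting size.
    """
    a = a0
    adot = math.sqrt(E / (c ** 2 * a0) - k * c ** 2)
    t = 0.0
    samples = [(t, a, adot)]
    while t < t_max and a >= a0:
        # One leapfrog (velocity Verlet) step of d2a/dt2 = -E / (2 c^2 a^2).
        acc = -E / (2.0 * c ** 2 * a ** 2)
        adot_half = adot + 0.5 * dt * acc
        a = a + dt * adot_half
        acc = -E / (2.0 * c ** 2 * a ** 2)
        adot = adot_half + 0.5 * dt * acc
        t += dt
        samples.append((t, a, adot))
    return samples

for k in (+1, 0, -1):
    samples = evolve(k)
    t_end, a_end, adot_end = samples[-1]
    a_max = max(a for _, a, _ in samples)
    print(f"k={k:+d}: stopped at t={t_end:.2f}, largest a={a_max:.3f}, "
          f"final da/dt={adot_end:.3f}, final H={adot_end / a_end:.3f}")
```

In this sketch the k = +1 run turns around near a = E/c⁴ and falls back, while the other two keep growing, with da/dt dropping toward zero for k = 0 and levelling off near c for k = -1.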
Note that at early times the three curves all look much the same.
Roughly speaking, our universe is just now at the stage where the three
curves are beginning to separate. This means that the past history of the
universe is more or less independent of the value of k.
OBSERVATIONS AND MEASUREMENTS
So, which is the case for our universe? How can we tell? Well, one
way to figure this out is to try to measure how fast the universe was
expanding at various times in the distant past. This is actually not as hard
as you might think: you see, it is very easy to look far backward in time.
All we have to do is to look at things that are very far away. Since the
light from such objects takes such a very long time to reach us, this is
effectively looking far back in time.
Runaway Universe?
The natural thing to do is to try to enlarge on what Hubble did. If we
could figure out how fast the really distant galaxies are moving away from
us, this will tell us what the Hubble constant was like long ago, when the
light now reaching us from those galaxies was emitted. The redshift of a
distant galaxy is a sort of average of the Hubble constant over the time
during which the signal was in transit, but with enough care this can be
decoded to tell us about the Hubble constant long ago.
By measuring the rate of decrease of the Hubble constant, we can
learn what kind of universe we live in.
However, it turns out that accurately measuring the distance to the
distant galaxies is quite difficult. (In contrast, measuring the redshift is
easy.) Until recently, no one had seriously tried to measure such distances
with the accuracy that we need. However, a few years ago it was realized
that there may be a good way to do it using supernovae.
The particular sort of supernova of interest here is called 'Type Ia.'
Astrophysicists believe that Type Ia supernovae occur when we have a
binary star system containing one normal star and one white dwarf. We
can have matter flowing from the normal star to the white dwarf in an
accretion disk, much as matter would flow to a neutron star or black hole
in such a binary star system. But remember that a white dwarf can only exist
if the mass is less than 1.4 solar masses.
When extra matter is added, bringing the mass up to this threshold,
the core of the star gets squeezed so tightly, and becomes so hot,
that runaway nuclear fusion ignites and sweeps through the star. This
releases a vast amount of energy in a very short time, which results in
a massive explosion: a (Type Ia) supernova.
Anyway, it appears that this particular kind of supernova is pretty
much always the same. It is the result of a relatively slow process where
matter is gradually added to the white dwarf, and it always explodes when
the total mass hits 1.4 solar masses. In particular, all of these supernovae
are roughly the same brightness (up to one parameter that astrophysicists
think they know how to correct for). As a result, supernovae are a useful tool
for measuring the distance to far away galaxies. All we have to do is to watch
a galaxy until one of these supernovae happens, and then see how bright the
supernova appears to be. Since its actual brightness is known, we can then
figure out how far away it is. Supernovae farther away appear to be much
dimmer while those closer in appear brighter.
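The step from apparent brightness to distance can be sketched in Python. This assumes the plain inverse-square law, F = L / (4π d²), and ignores the corrections that the expansion of the universe introduces for very distant objects; the function name and the luminosity value are made up for illustration.

```python
import math

def distance_from_flux(luminosity, flux):
    """Distance at which a source of known luminosity shows the measured flux.

    Uses the inverse-square law F = L / (4 pi d^2), so d = sqrt(L / (4 pi F)).
    """
    return math.sqrt(luminosity / (4.0 * math.pi * flux))

ASSUMED_LUMINOSITY = 1.0e36  # watts; a hypothetical value, not a measured one

near = distance_from_flux(ASSUMED_LUMINOSITY, 4.0e-12)  # brighter supernova
far = distance_from_flux(ASSUMED_LUMINOSITY, 1.0e-12)   # four times dimmer
print(near, far, far / near)
```

A supernova that appears four times dimmer is twice as far away, which is why a 'standard' explosion makes such a convenient yardstick.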
About two years ago, the teams working on this project released their
data. The result came as quite a surprise.
Their data shows that the universe is not slowing down at all. Instead,
it appears to be accelerating!
As you might guess, this announcement ushered in the return of the
cosmological constant. By the way, the cosmological constant has very
little effect when the universe is small (since vacuum energy is the same
density whether the universe is large or small while the density of normal
matter was huge when the universe was small).
However, with a cosmological constant, the effects of the negative
pressure get larger and larger as time passes (because there is more and
more space, and thus more and more vacuum energy). As a result, a
cosmological constant makes the universe expand forever at an ever
increasing rate. Adding this case to our graph, we get:
The line for Λ > 0 is more or less independent of the constant k.
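To see why the expansion runs away, here is a short worked sketch. It assumes the first Einstein equation in the form $\frac{3}{a^2}\left(\frac{da}{dt}\right)^2 = \frac{8\pi G}{c^2}\rho - \frac{3kc^2}{a^2}$ and a late-time universe in which only a constant vacuum energy density, labelled $\rho_v$ here for illustration, still matters.

```latex
\begin{align*}
\frac{3}{a^2}\left(\frac{da}{dt}\right)^2 &\approx \frac{8\pi G}{c^2}\,\rho_v
  && \text{(large } a \text{: the } k \text{ term and normal matter are negligible)} \\
H = \frac{1}{a}\frac{da}{dt} &= \sqrt{\frac{8\pi G \rho_v}{3c^2}} = \text{constant} \\
a(t) &= a(t_0)\, e^{H (t - t_0)} \\
\frac{d^2 a}{dt^2} &= H^2 a > 0
\end{align*}
```

Under these assumptions the scale factor grows exponentially whatever the value of k, which is why the curve is nearly independent of k. The last line agrees with the second Einstein equation when P = -ρ.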
So, should we believe this? The data in support of an accelerating
universe has held up well for three years now. However, there is a long
history of problems with observations of this sort.
There are often subtleties in understanding the data that are not
apparent at first sight, as the various effects can be much more complicated
than one might naively expect. Physicists say that there could be significant
'systematic errors' in the technique.
All this is to say that, when you measure something new, it is always
best to have at least two independent ways to find the answer. Then, if they
agree, this is a good confirmation that both methods are accurate.
Once Upon a time in a Universe Long Long Ago
It turns out that one way to get an independent measurement of the
cosmological constant is tied up with the story of the very early history of
the universe. This is of course an interesting story in and of itself.
Let's read the story backwards. Here we are in the present day with
the galaxies spread wide apart and speeding away from each other. Clearly,
the galaxies used to be closer together. As indicated by the curves in our
graphs, the early history of the universe is basically independent of the
value of Λ or k.
So, imagine the universe as a movie that we now play backwards.
The galaxies now appear to move toward each other. They collide and
get tangled up with each other. At some point, there is no space left between
the galaxies, and they all get scrambled up together - the universe is just a
mess of stars.
Then the universe shrinks some more, so that the stars all begin to
collide. There is no space left between the stars and the universe is filled
with hot matter, squeezing tighter and tighter. The story here is much like
it is near the singularity of a black hole: even though squeezing the matter
increases the pressure, this does not stop the spacetime from collapsing.
In fact, as we have seen, pressure only adds to the gravitational attraction
and accelerates the collapse.
As the universe squeezes tighter, the matter becomes very hot. At a
certain point, the matter becomes so hot that all of the atoms ionize: the
electrons come off and separate from the nuclei. Something interesting
happens here. Because ionized matter interacts strongly with light, light
can no longer travel freely through the universe. Instead, photons bounce
around between nuclei like ping pong balls! It is the cosmic equivalent of
trying to look through a very dense fog, and it becomes impossible to see
anything in the universe.
This event is particularly important because, as we discussed earlier,
the fact that it takes light a long time to travel across the universe means
that when we look out into the universe now, looking very far away is
effectively looking back in time.
So, this ionization sets a limit on how far away and how far back in
time we can possibly see. On the other hand, ever since the electrons and
nuclei got together into atoms (deionization) the universe has been more
or less transparent. For this reason, this time is also called 'decoupling.'
[Meaning that light 'decouples' or 'disconnects' from matter.] As a result,
we might expect to be able to see all the way back to this time.
What would we see if we could see that far back? Well, the universe was
hot, right? And it was all sort of mushed together. So, we might expect to see
a uniform glow that is kind of like looking into a hot fire. In fact, it was quite
hot: several thousand degrees.
Another way to discuss this glow is to remember that the universe is
homogeneous. This means that, not only was stuff "way over there" glowing way back
when, but so was the stuff where we are. What we are saying is that the
whole universe (or, if you like, the whole electromagnetic field) was very
hot back then.
A hot electromagnetic field contains a lot of light.... Anyway, the point
about light barely interacting with matter since decoupling means that,
since that time, the electromagnetic field (i.e., light) should just have gone
on and done its thing independent of the matter. In other words, it cannot
receive energy (heat) from matter or lose energy (heat) by dumping it
into matter. It should have pretty much the same heat energy that it had
way back then.
So, why then is the entire universe today not just one big cosmic oven
filled with radiation at a temperature of several thousand degrees? The
answer is that the expansion of the universe induces a redshift not only in
the light from the distant galaxies, but in the thermal radiation as well.
The effect is similar to the fact that a gas cools when it expands. Here,
however, the gas is a gas of photons and the expansion is due to the
expansion of the universe. The redshift since decoupling is about a factor
of 1000, with the result that the radiation today has a temperature of
about 3 kelvin (i.e., about 3 degrees above absolute zero).
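The cooling argument can be put into a few lines of code. This is only a minimal sketch, assuming the standard result that thermal radiation stays thermal when every wavelength is stretched by the same factor, with its temperature dropping by that factor. The function name and the input numbers are illustrative assumptions, not values taken from a measurement.

```python
def cooled_temperature(t_emitted_kelvin: float, stretch_factor: float) -> float:
    """Temperature of thermal radiation after all of its wavelengths have
    been stretched by `stretch_factor` (the redshift factor)."""
    if stretch_factor <= 0:
        raise ValueError("stretch_factor must be positive")
    # Stretching every wavelength by a common factor lowers the
    # temperature of a thermal spectrum by that same factor.
    return t_emitted_kelvin / stretch_factor


if __name__ == "__main__":
    # Illustrative assumption: a glow of a few thousand kelvin,
    # stretched by a factor of about a thousand.
    t_then = 3000.0
    stretch = 1000.0
    t_now = cooled_temperature(t_then, stretch)
    print(f"{t_then:.0f} K stretched by {stretch:.0f} gives {t_now:.1f} K")
```

The point of the sketch is just the division: a glow at thousands of degrees ends up at only a few degrees once the universe has grown by a factor of order a thousand.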
At 3 degrees Kelvin, electromagnetic radiation is in the form of
microwaves (in this case, think of them as short wavelength radio waves).
This radiation can be detected with what are basically big radio telescopes
or radar dishes. Back in the 60's some folks at Bell Labs built a high quality
radio dish to track satellites. Two of them (Penzias and Wilson) were
working on making it really sensitive, when they discovered that they kept
getting a lot of noise coming in, and coming in more or less uniformly
from all directions. It appeared that radio noise was being produced
uniformly in deep space!
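To see why radiation at about 3 kelvin shows up in radio dishes, one can estimate where a thermal spectrum peaks. The sketch below uses Wien's displacement law, which is a standard result but is not derived in this text; the function name is an illustrative assumption.

```python
# Wien displacement constant in metre-kelvin (standard physical constant).
WIEN_B_M_K = 2.897771955e-3


def peak_wavelength_m(temperature_kelvin: float) -> float:
    """Wavelength (in metres) at which a thermal spectrum is brightest."""
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive")
    return WIEN_B_M_K / temperature_kelvin


if __name__ == "__main__":
    for temperature in (3000.0, 3.0):
        wavelength_mm = peak_wavelength_m(temperature) * 1000.0
        print(f"{temperature:7.1f} K peaks near {wavelength_mm:.4g} mm")
```

At a few thousand kelvin the peak sits near a thousandth of a millimetre (close to visible light), while at 3 kelvin it sits near one millimetre, which is in the microwave range.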
This radio 'noise' turned out to be thermal radiation at a temperature
of 2.7 Kelvin. Physicists call it the 'Cosmic Microwave Background
(CMB).' Its discovery is one of the greatest triumphs of the 'big bang'
idea. After all, that is what we have been discussing. Long ago, before
decoupling, the universe was very hot, dense, and energetic.
It was also in the process of expanding, so that the whole process
bears a certain resemblance (except for the homogeneity of space) to a
huge cosmic explosion: a big bang. The discovery of the CMB verifies this
back to an early stage in the explosion, when the universe was so hot and
dense that it was like one big star.
By the way, do you remember our assumption that the universe is
homogeneous? We said that it is of course not exactly the same everywhere
(since, for example, the earth is not like the inside of the sun) but that,
when you measure things on a sufficiently large scale, the universe does
appear to be homogeneous.
Well, the cosmic microwave background is our best chance to test
the homogeneity on the largest possible scales since, as we argued above,
it will not be possible to directly 'see' anything coming from farther away.
The microwaves in the CMB have essentially traveled in a straight line
since decoupling. We will never see anything from farther away since, for
the light to be reaching us now, it would have had to have been emitted
from a distant object before decoupling - back when the universe was
filled with thick 'fog.'
When we measure the cosmic microwave background, it turns out to
be incredibly homogeneous. The departures from homogeneity in the
CMB are only about 1 part in one hundred thousand.
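To get a feeling for how small this is, here is a quick bit of arithmetic, using the 2.7 Kelvin temperature quoted above and the one part in one hundred thousand figure. The function name is just an illustrative choice.

```python
def fluctuation_size(mean_value: float, fractional_departure: float) -> float:
    """Absolute size of a departure that is a given fraction of the mean."""
    return mean_value * fractional_departure


if __name__ == "__main__":
    t_cmb_kelvin = 2.7   # temperature of the CMB, as quoted in the text
    fraction = 1e-5      # one part in one hundred thousand
    delta_t = fluctuation_size(t_cmb_kelvin, fraction)
    print(f"Typical departure: about {delta_t * 1e6:.0f} millionths of a kelvin")
```

So the hot and cold spots in the CMB differ from the average by only a few tens of millionths of a degree.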
An important point about the early universe: it was not like what we
would get if we simply took the universe now and made all of the galaxies
come together instead of rushing apart. If we pushed all of the galaxies
together we would, for example, end up with a lot of big clumps (some
related to galactic black holes, for example). While there would be a lot
of general mushing about, we would not expect the result to be anywhere
near as homogeneous as one part in one hundred thousand.
It appears then that the universe started in a very special, very uniform
state with only very tiny fluctuations in its density. So then, why are there
such large clumps of stuff today? Today, the universe is far from
homogeneous on the small scale. The reason for this is that gravity tends
to cause matter to clump over time.
Places with a little higher density pull together gravitationally and
become even more dense, pulling in material from neighboring under-dense
regions so that they become less dense. It turns out that tiny variations of
one part in one hundred thousand back at decoupling are just the right
size to grow into roughly galaxy-style clumps today.
This is an interesting fact by itself: Galaxies do not require special
'seeds' to start up. They are the natural consequence of gravity amplifying
teeny tiny variations in density in an expanding universe.
Well, that's the rough story anyway. Making all of this work in detail
is a little more complicated, and the details do depend on the values of Λ,
k, and so on. As a result, if one can measure the CMB with precision, this
becomes an independent measurement of the various cosmological parameters.
The data from COBE confirmed the whole general picture and put some
constraints on Λ. The results were consistent with the supernova observations,
but by itself COBE was not enough to measure Λ accurately. A number of
recent balloon-based CMB experiments have improved the situation
somewhat, and in the next few years two more satellite experiments (MAP
and PLANCK) will measure the CMB in great detail. Astrophysicists are
eagerly awaiting the results.
A Cosmological 'Problem'
Actually, the extreme homogeneity of the CMB raises another issue:
how could the universe have ever been so homogeneous? For example,
when we point our radio dish at one direction in the sky, we measure a
microwave signal at 2.7 Kelvin coming to us from ten billion light-years
away. Now, when we point our radio dish in the opposite direction, we
measure a microwave signal at the same temperature (to within one part
in one hundred thousand) coming at us from ten billion light-years away
in the opposite direction! Now, how did those two points so far apart
know that they should be at exactly the same temperature?
Ah! You might say, "Didn't the universe used to be a lot smaller, so
that those two points were a lot closer together?" This is true, but it turns
out not to help. The point is that all of the models we have been discussing
have a singularity where the universe shrinks to zero size at very early
times. An important fact is that this singularity is spacelike (as in the black
hole). The associated Penrose diagram looks something like this:
[Figure: Penrose diagram, labeled 'Infinite Future' and 'Big Bang Singularity'.]
This is the Penrose diagram including a cosmological constant, but the part
describing the big bang singularity is the same in any case (since, as we
have discussed, Λ is not important when the universe is small).
The fact that the singularity is spacelike means that no two points on
the singularity can send light signals to each other (even though they are
zero distance apart).
Thus, it takes a finite time for any two 'places' to be able to signal
each other and tell each other at what temperature they should be. In
fact, we can see that if the two points begin far enough apart then they
will never be able to communicate with each other, though they might
both send a light (or microwave) signal to a third observer in the middle.
The light rays that tell us what part of the singularity a given event
has access to form what is called the 'particle horizon' of that event and the
issue we have been discussing (of which places could possibly have been in
thermal equilibrium with which other places) is called the 'horizon problem.'
There are two basic ways out of this, but it would be disingenuous to
claim that either is understood at more than the most vague of levels. One
is to simply suppose that there is something about the big bang itself that
makes things incredibly homogeneous, even outside of the particle
horizons. The other is to suppose that for some reason the earliest evolution
of the universe happened in a different way than we drew on our graph
above, one which somehow removes the particle horizons.
The favourite idea of this second sort is called "inflation." Basically,
the idea is that for some reason there was in fact a truly huge cosmological
constant in the very earliest universe - sufficiently large to affect the
dynamics. Let us again think of running a movie of the universe in reverse.
In the forward direction, the cosmological constant makes the universe
accelerate. So, running it backward it acts as a cosmic brake and slows
things down.
The result is that the universe would then be older than we would
otherwise have thought, giving the particle horizons a chance to grow
sufficiently large to solve the horizon problem. The resulting Penrose
diagram looks something like this:
[Figure: Penrose diagram with inflation, labeled 'Infinite Future' and 'Big Bang Singularity'.]
The regions we see at decoupling now have past light cones that
overlap quite a bit. So, they have access to much of the same information
from the singularity. In this picture, it is easier to understand how the
entire universe could be at close to the same temperature at decoupling.
Oh, to be consistent with what we know, this huge cosmological
'constant' has to shut itself off long before decoupling. This is the hard part
about making inflation work.
Making the cosmological constant turn off requires an amount of fine
tuning that many people feel is comparable to the one part in one-hundred
thousand level of inhomogeneities that inflation was designed to explain.
Luckily, inflation makes certain predictions about the detailed form of the
cosmic microwave background. The modern balloon experiments are beginning
to probe the interesting regime of accuracy, and it is hoped that MAP and
PLANCK will have some definitive commentary on whether inflation is or is
not the correct explanation.
Looking for Mass in all the Wrong Places
It turns out that the supernova results and the
CMB do not really measure Λ directly, but instead link the cosmological
constant to the overall density of matter in the universe.
So, to get a real handle on things, one has to know the density of
more or less regular matter in the universe as well. Before we get into how
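much matter there actually is (and how we find out), recall the equation for
the Hubble expansion:
$$H^2 = \left(\frac{1}{a}\frac{da}{d\tau}\right)^2 = \frac{8\pi G\rho}{3} + \frac{\Lambda}{3} - k a^{-2}.$$
The cosmologists like to reorganize this equation by dividing by H^2.
This gives
$$1 - \frac{8\pi G\rho}{3H^2} - \frac{\Lambda}{3H^2} + k a^{-2} H^{-2} = 0.$$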
Now, the three interesting cases are k = -1, 0, +1. In the middle case,
k = 0, the overall density of stuff (matter or cosmological constant) in 'Hubble
units' must be one! So, this is a convenient reference point. If we want to
measure k, it is this quantity that we should compute. So, cosmologists
give it a special name:
$$\Omega = \frac{8\pi G\rho}{3H^2} + \frac{\Lambda}{3H^2}.$$
This quantity is often called the 'density parameter,' but we see that it
is slightly more complicated than that name would suggest. In particular,
(like the Hubble 'constant') Ω will in general change with time. If, however,
Ω happens to be exactly equal to one at some time, it will remain equal to
one. So, to tell if the universe is positively curved (k = +1), negatively
curved (k = -1), or [spatially] flat (k = 0), what we need to do is to measure
Ω and to see whether it is bigger than, smaller than, or equal to one.
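To make the 'Hubble units' idea concrete, here is a minimal sketch that evaluates the density parameter and reads off the sign of k. It assumes SI units with the cosmological constant measured in inverse seconds squared (so that it can be added to H squared directly), and it uses a Hubble constant of 70 km/s per megaparsec purely as an illustrative assumption. The function names are hypothetical.

```python
import math

G = 6.674e-11              # Newton's constant, m^3 kg^-1 s^-2
METRES_PER_MPC = 3.0857e22  # one megaparsec in metres


def hubble_si(h_km_s_mpc: float) -> float:
    """Convert a Hubble constant from km/s/Mpc to 1/s."""
    return h_km_s_mpc * 1000.0 / METRES_PER_MPC


def density_parameter(rho: float, lam: float, hubble: float) -> float:
    """Density parameter: 8*pi*G*rho/(3 H^2) + Lambda/(3 H^2)."""
    return (8.0 * math.pi * G * rho + lam) / (3.0 * hubble**2)


def critical_density(hubble: float) -> float:
    """Matter density (kg/m^3) that gives a density parameter of one
    when the cosmological constant is zero."""
    return 3.0 * hubble**2 / (8.0 * math.pi * G)


def curvature_sign(omega: float, tol: float = 1e-9) -> int:
    """Return k = +1, 0 or -1 according to whether omega is above,
    equal to (within tol), or below one."""
    if omega > 1.0 + tol:
        return 1
    if omega < 1.0 - tol:
        return -1
    return 0


if __name__ == "__main__":
    h = hubble_si(70.0)  # illustrative assumption, not a value from the text
    print(f"critical density: {critical_density(h):.2e} kg/m^3")
    print("k for a density parameter of 0.3:", curvature_sign(0.3))
```

With these assumed numbers the reference density comes out at roughly 10^-26 kilograms per cubic metre, which is only a few hydrogen atoms per cubic metre.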
By the way, cosmologists in fact break this Ω up into two parts
corresponding to the matter and the cosmological constant:
$$\Omega_{\text{matter}} = \frac{8\pi G\rho}{3H^2}, \qquad \Omega_\Lambda = \frac{\Lambda}{3H^2}.$$
Not only do these two parts change with time, but their ratio changes as
well. The natural tendency is for Ω_Λ to grow with time at the expense of Ω_matter
as the universe gets larger and the vacuum energy becomes more important.
Anyway, when cosmologists discuss the density of matter and the size of the
cosmological constant, they typically discuss these things in terms of Ω_matter
and Ω_Λ.
So, just how does one start looking for matter in the universe? Well,
the place to start is by counting up all of the matter that we can see - say,
counting up the number of stars and galaxies. Using the things we can see
gives about Ω = 0.05.
But, there are more direct ways to measure the amount of mass around
- for example, we can see how much gravity it generates! Remember our
discussion of how astronomers find black holes at the centers of galaxies?
They use the stars orbiting the black hole to tell them about the mass of
the black hole. Similarly, we can use stars orbiting at the edge of a galaxy
to tell us about the total amount of mass in a galaxy.
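Here is a rough sketch of how an orbital speed turns into a mass. It assumes a circular orbit and plain Newtonian gravity, for which the mass inside the orbit is the speed squared times the radius divided by G. The speed and radius used below are made-up round numbers for illustration, not measurements from the text.

```python
G = 6.674e-11               # Newton's constant, m^3 kg^-1 s^-2
METRES_PER_KPC = 3.0857e19  # one kiloparsec in metres
SOLAR_MASS_KG = 1.989e30


def enclosed_mass_kg(orbital_speed_m_s: float, radius_m: float) -> float:
    """Mass inside a circular orbit, from v^2 = G M / r."""
    return orbital_speed_m_s**2 * radius_m / G


if __name__ == "__main__":
    # Hypothetical star: 220 km/s at 50 kiloparsecs from the galactic centre.
    speed = 220.0e3
    radius = 50.0 * METRES_PER_KPC
    mass = enclosed_mass_kg(speed, radius)
    print(f"enclosed mass: about {mass / SOLAR_MASS_KG:.1e} solar masses")
```

The useful feature is that only the speed and the size of the orbit enter: the star acts as a scale for weighing everything inside its orbit, whether or not that material shines.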
It turns out to be much more than what we can see in the 'visible'
matter. Also, recall that the galaxies are a little bit clumped together. If
we look at how fast the galaxies in a given clump orbit each other, we
again find a bit more mass than we expected.
It turns out that something like 90% of the matter out there is stuff
that we can't see. For this reason, it is called 'Dark Matter.' Interestingly,
although it is attached to the galaxies, it is spread a bit more thinly than is
the visible matter. This means that a galaxy is surrounded by a cloud of
dark matter that is a good bit larger than the part of the galaxy that we
can see. All of these measurements of gravitational effects bring the matter
count up to about Ω_matter = 0.4.
Now, there is of course a natural question: Just what is this Dark
Matter stuff anyway? Well, there are lots of things that it is not. For example,
it is not a bunch of small black holes or a bunch of little planet-like objects
running around. At least, the vast majority is not of that sort.
That possibility has been ruled out by studies of gravitational lensing.
Briefly, recall that general relativity predicts that light 'falls' in a
gravitational field and, as a result, light rays are bent toward massive
objects.
This means that massive objects actually act like lenses, and focus
the light from objects shining behind them. When such a 'gravitational
lens' passes in front of a star, the star appears to get brighter. When the
lens moves away, the star returns to its original brightness. By looking at
a large number of stars and seeing how often they happen to brighten in
this way, astronomers can 'count' the number of gravitational lenses out
there. To make a long story short, there are too few such events for all of
the dark matter to be clumped together in black holes or small planets. Instead,
most of it must be spread out more evenly. Even more interestingly, it cannot
be just thin gas. That is, there are strong arguments why the dark matter,
whatever it is, cannot be made up of protons and neutrons like normal matter!
To understand this, we need to continue the story of the early universe as a
movie that we run backward in time. We discussed earlier how there was a
very early time (just before decoupling) when the Universe was so hot and
dense that the electrons were detached from the protons. Well, continuing to
watch the movie backwards the universe becomes even more hot and dense.
Eventually, it becomes so hot and dense that the nuclei fall apart.
Now there are just a bunch of free neutrons and protons running
around, very evenly spread throughout the universe. It turns out that we
can calculate what should happen in such a system as the universe expands
and cools. As a result, one can calculate how many of these neutrons and
protons should stick together and form Helium vs. how many extra
protons should remain as Hydrogen.
This process is called 'nucleosynthesis.' One can also work out the
proportions of other light elements like Lithium... (The heavy elements
were not made in the big bang itself, but were manufactured in stars and
supernovae.) To cut short another long story, the more dense the stuff
was, the more things stick together and the more Helium and Lithium
should be around. Astronomers are pretty good at measuring the relative
abundance of Hydrogen and Helium, and the answers favour roughly
Ω_normal matter ≈ 0.1: the stuff we can see plus a little bit more.
As a result, this means that the dark matter is not made up of normal
things like protons and neutrons. By the way, physicists call such matter
'baryonic matter' so that this fact is often quoted as Ω_baryon = 0.1. A lot of
this may be in the form of small not-quite stars and such, but the important
point is that at least 75% of the matter in the universe really has to be
stuff that is not made up of protons and neutrons.
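The 75% figure is just arithmetic on the two numbers quoted above (a total matter count of about 0.4 and a normal-matter share of about 0.1). A minimal sketch, with a hypothetical function name:

```python
def non_baryonic_fraction(omega_matter: float, omega_baryon: float) -> float:
    """Fraction of all matter that is not made of protons and neutrons."""
    if omega_matter <= 0:
        raise ValueError("omega_matter must be positive")
    return (omega_matter - omega_baryon) / omega_matter


if __name__ == "__main__":
    fraction = non_baryonic_fraction(0.4, 0.1)  # values quoted in the text
    print(f"non-baryonic share of matter: {fraction:.0%}")
```

The phrase 'at least' reflects that the normal-matter number is an upper-end estimate: a smaller normal-matter share would only push the fraction higher.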
So, what is the dark matter then? That is an excellent question and a
subject of much debate. It may well be the case that all of this unknown
dark matter is some strange new kind of tiny particle which simply happens
not to interact with regular matter except by way of gravity.
A number of ideas have been proposed, but it is way too early to say
how likely they are to be right.
Putting it all Together
The last part of our discussion is to put all of this data together to see
what the implications are for Ω_Λ and Ω_matter. Many of these graphs (and
some other stuff) come from a talk given by Sean Carroll.
These graphs show that each of the three measurements puts some
kind of constraint on the relationship between Ω_matter and Ω_Λ, corresponding
to a (wide) line in the (Ω_matter, Ω_Λ) plane. You can see that, taken together, the
data strongly favors a value near Ω_matter = 0.4, Ω_Λ = 0.6. That is, 60% of the energy
in the universe appears to be vacuum energy!
Now, what is really impressive here is that any two of the
measurements would predict this same value. The third measurement can
then be thought of as a double-check. As the physicists say, any two lines
in a plane intersect somewhere, but to get three lines to intersect at the
same point you have to do something right. This means that the evidence for
a cosmological constant is fairly strong: we have not just one experiment that
finds it, but in fact we have another independent measurement that confirms
this result. However, the individual measurements are not all that accurate
and may have unforeseen systematic errors. So, we look forward to getting
more and better data in the future to see whether these results continue to hold
up.
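The 'three lines' remark can be illustrated with a little geometry. In the sketch below the three lines are entirely made up for illustration (they are not the real supernova, CMB, or matter-count constraints); they are simply chosen to pass through the same point, so that every pair gives the same intersection.

```python
from itertools import combinations


def intersect(line_a, line_b):
    """Intersection of two lines, each given as (a, b, c) for a*x + b*y = c."""
    a1, b1, c1 = line_a
    a2, b2, c2 = line_b
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel")
    # Cramer's rule for the 2x2 linear system.
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


# Hypothetical constraint lines in the (x, y) plane, for illustration only.
LINES = [
    (1.0, 1.0, 1.0),    # x + y = 1
    (1.0, 0.0, 0.4),    # x = 0.4
    (-1.0, 1.0, 0.2),   # y - x = 0.2
]

if __name__ == "__main__":
    for first, second in combinations(LINES, 2):
        x, y = intersect(first, second)
        print(f"{first} and {second} meet at ({x:.3f}, {y:.3f})")
```

Any two of these lines fix a point; the third line passing through that same point is the double-check. If the third line were shifted even slightly, the three pairwise intersections would no longer agree.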
We are in fact expecting to get a lot more data over the next few years.
Two major satellite experiments (called 'MAP' and 'PLANCK') are going
to make very detailed measurements of the Cosmic Microwave
Background which should really tighten up the CMB constraints on
Ω_matter and Ω_Λ. It is also hoped that these experiments will either
confirm or deny the predictions of inflation.
By the way, it is a rather strange picture of the universe with which
we are left. There are several confusing issues. One of them is "where does
this vacuum energy come from?" It turns out that there are some reasonable
ideas on this subject coming from quantum field theory... However, while they
are all reasonable ideas for creating a vacuum energy, they all predict a value
that is 10^120 times too large.
A moment to state the obvious: 10^120 is an incredibly huge number. A
billion is ten to the ninth power, so 10^120 is more than one billion raised to the
thirteenth power. Physicists are always asking, "Why is the cosmological constant so
small?"
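The comparison with a billion is easy to check with exact integer arithmetic. Dividing the exponents gives 120/9, or about 13.3, so the thirteenth power of a billion is a slight underestimate. The function name below is just an illustrative choice.

```python
def billions_in(exponent_of_ten: int):
    """Split 10**exponent_of_ten into (whole powers of a billion,
    leftover power of ten)."""
    return divmod(exponent_of_ten, 9)


if __name__ == "__main__":
    powers, leftover = billions_in(120)
    # Python integers are exact, so this identity can be checked directly.
    assert (10**9) ** powers * 10**leftover == 10**120
    print(f"10^120 = (one billion)^{powers} times 10^{leftover}")
```

In words: multiply a billion by itself thirteen times, and you are still a factor of a thousand short.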
Another issue is that, as we mentioned, Ω_Λ and Ω_matter do not stay
constant in time. They change, and in fact they change in different ways.
There is a nice diagram (also from Sean Carroll) showing how they change
with time. What you can see is that, more or less independently of where
you start, the universe naturally evolves toward Ω_Λ = 1. On the other
hand, back at the big bang Ω_Λ was almost certainly near zero.
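This drift can be sketched numerically. The code below is a rough illustration under stated assumptions: the matter density thins out as the inverse cube of the scale factor a, the cosmological constant stays fixed, radiation is ignored, and a = 1 is taken to be today with the present-day values 0.4 and 0.6 quoted above. The function name is hypothetical.

```python
def omega_lambda(a: float, omega_m0: float = 0.4, omega_l0: float = 0.6) -> float:
    """Vacuum-energy share of the density parameter at scale factor a
    (a = 1 today), assuming matter dilutes as a**-3 and Lambda is constant."""
    if a <= 0:
        raise ValueError("scale factor must be positive")
    omega_k0 = 1.0 - omega_m0 - omega_l0  # curvature term today
    # H^2 in units of today's H^2, from the Hubble expansion equation.
    h2_ratio = omega_m0 * a**-3 + omega_k0 * a**-2 + omega_l0
    return omega_l0 / h2_ratio


if __name__ == "__main__":
    for a in (0.001, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0):
        print(f"a = {a:7.3f}   vacuum share = {omega_lambda(a):.6f}")
```

Under these assumptions the vacuum share is utterly negligible when the universe is a thousand times smaller than today and is already above 0.99 once it is ten times larger, so the 'middle ground' occupies only a narrow window in the history of the expansion.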
So, an interesting question is: "why is Ω_Λ only now in the middle
ground (Ω_Λ = 0.6), making its move between zero and one?"
For example, does this argue that the cosmological constant is not
really constant, and that there is some new physical principle that keeps it
in this middle ground? Otherwise, why should the value of the cosmological
constant be such that Ω_Λ is just now making its debut? It is not clear why Λ
should not have a value such that it would have taken over long ago, or such
that it would still be way too tiny to notice.
THE BEGINNING AND THE END
Well, we are nearly finished with our story but we are not yet at the end.
We traced the universe back to a time when it was so hot and dense that the
nuclei of atoms were just forming. We have seen that there is experimental
evidence (in the abundances of Hydrogen and Helium) that the universe actually
was this hot and dense in its distant past. Well, if our understanding of physics
is right, it must have been even hotter and more dense before. So, what was
this like? How hot and dense was it? From the perspective of general relativity,
the most natural idea is that the farther back we go, the hotter and denser it
was.
Looking back in time, we expect that there was a time when it was so
hot that protons and neutrons themselves fell apart, and that the universe
was full of things called quarks. Farther back still, the universe was so hot
that our current knowledge of physics is not sufficient to describe it. All
kinds of weird things might have happened, like maybe the universe had
more than four dimensions back then. Maybe the universe was filled with
truly exotic particles. Maybe the universe underwent various periods of
inflation followed by relative quiet.
Anyway, looking very far back we expect that one would find
conditions very similar to those near the singularity of a black hole. This
is called the 'big bang singularity.' Just as at a black hole, general relativity
would break down there and would not accurately describe what was
happening.
Roughly speaking, we would be in a domain of quantum gravity
where, as with a Schwarzschild black hole, our now familiar notions of
space and time may completely fall apart. It may or may not make sense
to even ask what came 'before.' Isn't that a good place to end our story?