
Introduction
to Cosmology

Third Edition

Matts Roos

John Wiley & Sons, Ltd






Copyright © 2003 John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, 

West Sussex PO19 8SQ, England

Telephone (+44) 1243 779777 
Email (for orders and customer service enquiries): cs-books@wiley.co.uk 
Visit our Home Page on www.wileyeurope.com or www.wiley.com 

All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval 
system or transmitted in any form or by any means, electronic, mechanical, photocopy- 
ing, recording, scanning or otherwise, except under the terms of the Copyright, Designs 
and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing 
Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in
writing of the Publisher. Requests to the Publisher should be addressed to the Permissions 
Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex 
PO19 8SQ, England, or emailed to permreq@wiley.co.uk, or faxed to (+44) 1243 770571.
This publication is designed to provide accurate and authoritative information in regard 
to the subject matter covered. It is sold on the understanding that the Publisher is not 
engaged in rendering professional services. If professional advice or other expert assis- 
tance is required, the services of a competent professional should be sought. 

Other Wiley Editorial Offices 

John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA

Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA 

Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany

John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia 

John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 
129809 

John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1 

Wiley also publishes its books in a variety of electronic formats. Some content that appears 
in print may not be available in electronic books. 

Library of Congress Cataloging-in-Publication Data 



Includes bibliographical references and index. 

ISBN 0-470-84909-6 (acid-free paper) - ISBN 0-470-84910-X (pbk. : acid-free paper) 

1. Cosmology. I. Title. 



2003020688 

British Library Cataloguing in Publication Data 

A catalogue record for this book is available from the British Library 
ISBN 0 470 84909 6 (hardback)
ISBN 0 470 84910 X (paperback)

Typeset in 9.5/12.5pt Lucida Bright by T&T Productions Ltd, London. 

Printed and bound in Great Britain by Antony Rowe Ltd., Chippenham, Wilts. 

This book is printed on acid-free paper responsibly manufactured from sustainable 

forestry in which at least two trees are planted for each one used for paper production.



To my dear grandchildren 

Francis Alexandre Wei Ming (1986) 
Christian Philippe Wei Sing (1990) 

Cornelia (1989) 
Erik (1991) 
Adrian (1994) 

Emile Johannes (2000) 

Alaia Ingrid Markuntytär (2002)



Contents 



Preface to First Edition 
Preface to Second Edition 
Preface to Third Edition 

1 From Newton to Hubble 

1.1 Historical Cosmology 

1.2 Inertial Frames and the Cosmological Principle 

1.3 Olbers' Paradox 

1.4 Hubble's Law 

1.5 The Age of the Universe 

1.6 Expansion in a Newtonian World 

2 Relativity 

2.1 Lorentz Transformations and Special Relativity 

2.2 Metrics of Curved Space-time 

2.3 Relativistic Distance Measures

2.4 General Relativity and the Principle of Covariance 

2.5 The Principle of Equivalence 

2.6 Einstein's Theory of Gravitation 

3 Gravitational Phenomena 

3.1 Classical Tests of General Relativity 

3.2 The Binary Pulsar 

3.3 Gravitational Lensing 

3.4 Black Holes 

3.5 Gravitational Waves 

4 Cosmological Models 

4.1 Friedmann-Lemaitre Cosmologies 

4.2 de Sitter Cosmology 

4.3 Dark Energy 

4.4 Model Testing and Parameter Estimation I

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 0 470 84909 6 (cased) ISBN 0 470 84910 X (pbk)



5 Thermal History of the Universe 

5.1 Photons 

5.2 Adiabatic Expansion 

5.3 Electroweak Interactions 

5.4 The Early Radiation Era 

5.5 Photon and Lepton Decoupling 

5.6 Big Bang Nucleosynthesis 

6 Particles and Symmetries 

6.1 Spin Space 

6.2 SU(2) Symmetries 

6.3 Hadrons and Quarks 

6.4 The Discrete Symmetries C, P, T 

6.5 Spontaneous Symmetry Breaking 

6.6 Primeval Phase Transitions and Symmetries 

6.7 Baryosynthesis and Antimatter Generation 

7 Cosmic Inflation 

7.1 Paradoxes of the Expansion 

7.2 'Old' and 'New' Inflation 

7.3 Chaotic Inflation 

7.4 The Inflaton as Quintessence 

7.5 Cyclic Models 

8 Cosmic Microwave Background 

8.1 The CMB Temperature 

8.2 Temperature Anisotropies 

8.3 Polarization Anisotropies 

8.4 Model Testing and Parameter Estimation II

9 Cosmic Structures and Dark Matter 

9.1 Density Fluctuations 

9.2 Structure Formation 

9.3 The Evidence for Dark Matter 

9.4 Dark Matter Candidates 

9.5 The Cold Dark Matter Paradigm



10 Epilogue 



Preface to First Edition 



A few decades ago, astronomy and particle physics started to merge in the common
field of cosmology. The general public had always been more interested in the
visible objects of astronomy than in invisible atoms, and probably met cosmology
first in Steven Weinberg's famous book The First Three Minutes. More recently
ogy first in Steven Weinberg's famous book The First Three Minutes. More recently 
Stephen Hawking's A Brief History of Time has caused an avalanche of interest in 
this subject. 

Although there are now many popular monographs on cosmology, there are 
so far no introductory textbooks at university undergraduate level. Chapters on 
cosmology can be found in introductory books on relativity or astronomy, but 
they cover only part of the subject. One reason may be that cosmology is explicitly 
cross-disciplinary, and therefore it does not occupy a prominent position in either 
physics or astronomy curricula. 

At the University of Helsinki I decided to try to take advantage of the great 
interest in cosmology among the younger students, offering them a one-semester 
course about one year before their specialization started. Hence I could not count 
on much familiarity with quantum mechanics, general relativity, particle physics, 
astrophysics or statistical mechanics. At this level, there are courses with the 
generic name of Structure of Matter dealing with Lorentz transformations and 
the basic concepts of quantum mechanics. My course aimed at the same level. Its 
main constraint was that it had to be taught as a one-semester course, so that it 
would be accepted in physics and astronomy curricula. The present book is based 
on that course, given three times to physics and astronomy students in Helsinki. 

Of course there already exist good books on cosmology. The reader will in fact 
find many references to such books, which have been an invaluable source of 
information to me. The problem is only that they address a postgraduate audience 
that intends to specialize in cosmology research. My readers will have to turn to 
these books later when they have mastered all the professional skills of physics 
and mathematics. 

In this book I am not attempting to teach basic physics to astronomers. They 
will need much more. I am trying to teach just enough physics to be able to explain 
the main ideas in cosmology without too much hand-waving. I have tried to avoid 
the other extreme, practised by some of my particle physics colleagues, of writing 
books on cosmology with the obvious intent of making particle physicists out of 
every theoretical astronomer. 





I also do not attempt to teach basic astronomy to physicists. In contrast to 
astronomy scholars, I think the main ideas in cosmology do not require very 
detailed knowledge of astrophysics or observational techniques. Whole books 
have been written on distance measurements and the value of the Hubble param- 
eter, which still remains imprecise to a factor of two. Physicists only need to know 
that quantities entering formulae are measurable— albeit incorporating factors h 
to some power— so that the laws can be discussed meaningfully. At undergraduate 
level, it is not even usual to give the errors on measured values. 

In most chapters there are subjects demanding such a mastery of theoretical 
physics or astrophysics that the explanations have to be qualitative and the deriva- 
tions meagre, for instance in general relativity, spontaneous symmetry breaking, 
inflation and galaxy formation. This is unavoidable because it just reflects the 
level of undergraduates. My intention is to go just a few steps further in these 
matters than do the popular monographs. 

I am indebted in particular to two colleagues and friends who offered construc- 
tive criticism and made useful suggestions. The particle physicist Professor Kari 
Enqvist of NORDITA, Copenhagen, my former student, has gone to the trouble 
of reading the whole manuscript. The space astronomer Professor Stuart Bowyer 
of the University of California, Berkeley, has passed several early mornings of jet 
lag in Lapland going through the astronomy-related sections. Anyway, he could 
not go out skiing then because it was either a snow storm or -30 °C! Finally, the 
publisher provided me with a very knowledgeable and thorough referee, an astro- 
physicist no doubt, whose criticism of the chapter on galaxy formation was very 
valuable to me. For all remaining mistakes I take full responsibility. They may well 
have been introduced by me afterwards. 

Thanks are also due to friends among the local experts: particle physicist Pro- 
fessor Masud Chaichian and astronomer Professor Kalevi Mattila have helped me 
with details and have answered my questions on several occasions. I am also 
indebted to several people who helped me to assemble the pictorial material: 
Drs Subir Sarkar in Oxford, Rocky Kolb in the Fermilab, Carlos Frenk in Durham, 
Werner Kienzle at CERN and members of the COBE team. 

Finally, I must thank my wife Jacqueline for putting up with almost two years 
of near absence and full absent-mindedness while writing this book. 



Preface to Second Edition 



In the three years since the first edition of this book was finalized, the field of 
cosmology has seen many important developments, mainly due to new obser- 
vations with superior instruments such as the Hubble Space Telescope and the 
ground-based Keck telescope and many others. Thus a second edition has become 
necessary in order to provide students and other readers with a useful and up-to- 
date textbook and reference book. 

At the same time I could balance the presentation with material which was 
not adequately covered before— there I am in debt to many readers. Also, the 
inevitable number of misprints, errors and unclear formulations, typical of a first 
edition, could be corrected. I am especially indebted to Kimmo Kainulainen who 
served as my course assistant one semester, and who worked through the book 
and the problems thoroughly, resulting in a very long list of corrigenda. A similar,
shorter list was also drawn up by George Smoot and a student of his. It still worries
me that the errors found by George had been found neither by Kimmo nor by
myself; statistics thus tells me that some errors will still remain undetected.

For new pictorial material I am indebted to Wes Colley at Princeton, Carlos Frenk 
in Durham, Charles Lineweaver in Strasbourg, Jukka Nevalainen in Helsinki, Subir 
Sarkar in Oxford, and George Smoot in Berkeley. I am thankful to the Academie 
des Sciences for an invitation to Paris where I could visit the Observatory of Paris- 
Meudon and profit from discussions with S. Bonazzola and Brandon Carter. 

Several of my students have contributed in various ways: by misunderstandings, 
indicating the need for better explanations, by their enthusiasm for the subject, 
and by technical help, in particular S. M. Harun-or-Rashid. My youngest grandchild
Adrian (not yet 3) has shown a vivid interest in supernova bangs, as demonstrated
by an X-ray image of the Cassiopeia A remnant. Thus the future of the
subject is bright. 






Preface to Third Edition 



This preface can start just like the previous one: in the seven years since the 
second edition was finalized, the field of cosmology has seen many important 
developments, mainly due to new observations with superior instruments. In the 
past, cosmology often relied on philosophical or aesthetic arguments; now it is 
maturing to become an exact science. For example, the Einstein-de Sitter universe, 
which has zero cosmological constant (Ωλ = 0), used to be favoured for aesthetic
reasons, but today the cosmological constant is known to be very different from zero (Ωλ = 0.73 ± 0.04).

In the first edition I quoted Ω0 = 0.8 ± 0.3 (daring to believe in errors that many
others did not), which gave room for all possible spatial geometries: spherical, flat
and hyperbolic. Since then the value has converged to Ω0 = 1.02 ± 0.02, and everybody
is now willing to concede that the geometry of the Universe is flat, Ω0 = 1.
This result is one of the cornerstones of what we now can call the 'Standard Model 
of Cosmology'. Still, deep problems remain, so deep that even Einstein's general 
relativity is occasionally put in doubt. 

A consequence of the successful march towards a 'standard model' is that many 
alternative models can be discarded. An introductory text of limited length like 
the current one cannot be a historical record of failed models. Thus I no longer 
discuss, or discuss only briefly, k ≠ 0 geometries, the Einstein-de Sitter universe,
hot and warm dark matter, cold dark matter models with Λ = 0, isocurvature
fluctuations, topological defects (except monopoles), Bianchi universes, and formulae
which only work in discarded or idealized models, like Mattig's relation and the 
Saha equation. 

Instead, this edition contains many new or considerably expanded subjects: Sec- 
tion 2.3 on Relativistic Distance Measures, Section 3.3 on Gravitational Lensing, 
Section 3.5 on Gravitational Waves, Section 4.3 on Dark Energy and Quintessence, 
Section 5.1 on Photon Polarization, Section 7.4 on The Inflaton as Quintessence, 
Section 7.5 on Cyclic Models, Section 8.3 on CMB Polarization Anisotropies, Sec- 
tion 8.4 on model testing and parameter estimation using mainly the first-year 
CMB results of the Wilkinson Microwave Anisotropy Probe, and Section 9.5 on 
large-scale structure results from the 2 degree Field (2dF) Galaxy Redshift Survey. 
The synopsis in this edition is also different and hopefully more logical, much has 
been entirely rewritten, and all parameter values have been updated. 

I have not wanted to go into pure astrophysics, but the line between cosmology 
and cosmologically important astrophysics is not easy to draw. Supernova explo- 
sion mechanisms and black holes are included as in the earlier editions, but not 





for instance active galactic nuclei (AGNs) or jets or ultra-high-energy cosmic rays. 
Observational techniques are mentioned only briefly— they are beyond the scope 
of this book. 

There are many new figures for which I am in debt to colleagues and friends, 
all acknowledged in the figure legends. I have profited from discussions with Pro- 
fessor Carlos Frenk at the University of Durham and Professor Kari Enqvist at 
the University of Helsinki. I am also indebted to Professor Juhani Keinonen at the 
University of Helsinki for having generously provided me with working space and 
access to all the facilities at the Department of Physical Sciences, despite the fact 
that I am retired. 

Many critics, referees and other readers have made useful comments that I have 
tried to take into account. One careful reader, Urbano Lopes França Jr, sent me
a long list of misprints and errors. A critic of the second edition stated that the 
errors in the first edition had been corrected, but that new errors had emerged 
in the new text. This will unfortunately always be true in any comparison of edi- 
tion n + 1 with edition n. In an attempt to make continuous corrections I have 
assigned a web site for a list of errors and misprints. The address is 

http://www.physics.helsinki.fi/-fl_cosmo/ 

My most valuable collaborator has been Thomas S. Coleman, a nonphysicist who 
contacted me after having spotted some errors in the second edition, and who 
proposed some improvements in case I were writing a third edition. This came 
at the appropriate time and led to a collaboration in which Thomas S. Coleman 
read the whole manuscript, corrected misprints, improved my English, checked 
my calculations, designed new figures and proposed clarifications where he found 
the text difficult. 

My wife Jacqueline has many interesting subjects of conversation at the break- 
fast table. Regretfully, her breakfast companion is absent-minded, thinking only 
of cosmology. I thank her heartily for her kind patience, promising improvement. 



Matts Roos 

Helsinki, March 2003 



1 From Newton to Hubble



The history of ideas on the structure and origin of the Universe shows that 
humankind has always put itself at the centre of creation. As astronomical evi- 
dence has accumulated, these anthropocentric convictions have had to be aban- 
doned one by one. From the natural idea that the solid Earth is at rest and the 
celestial objects all rotate around us, we have come to understand that we inhabit 
an average-sized planet orbiting an average-sized sun, that the Solar System is in 
the periphery of a rotating galaxy of average size, flying at hundreds of kilometres 
per second towards an unknown goal in an immense Universe, containing billions 
of similar galaxies. 

Cosmology aims to explain the origin and evolution of the entire contents of 
the Universe, the underlying physical processes, and thereby to obtain a deeper 
understanding of the laws of physics assumed to hold throughout the Universe. 
Unfortunately, we have only one universe to study, the one we live in, and we 
cannot make experiments with it, only observations. This puts serious limits on 
what we can learn about the origin. If there are other universes we will never know. 

Although the history of cosmology is long and fascinating, we shall not trace it 
in detail, nor any further back than Newton, accounting (in Section 1.1) only for 
those ideas which have fertilized modern cosmology directly, or which happened 
to be right although they failed to earn timely recognition. In the early days of 
cosmology, when little was known about the Universe, the field was really just a 
branch of philosophy. 

Having a rigid Earth to stand on is a very valuable asset. How can we describe 
motion except in relation to a fixed point? Important understanding has come 
from the study of inertial systems, in uniform motion with respect to one another. 
From the work of Einstein on inertial systems, the theory of special relativity 





was born. In Section 1.2 we discuss inertial frames, and see how expansion and 
contraction are natural consequences of the homogeneity and isotropy of the 
Universe. 

A classic problem is why the night sky is dark and not blazing like the disc of 
the Sun, as simple theory in the past would have it. In Section 1.3 we shall discuss 
this so-called Olbers' paradox, and the modern understanding of it. 
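The heart of the paradox can be seen with a shell-by-shell sum: in a static, infinite universe filled uniformly with stars, every spherical shell around the observer contributes the same flux, so the total diverges. A minimal numerical sketch of that argument (the density and luminosity values below are illustrative placeholders, not figures from this book):

```python
import math

# Shell-by-shell flux sum for a static, infinite universe filled
# uniformly with stars -- the naive picture behind Olbers' paradox.
# n and L are illustrative placeholder values, not measured data.
n = 1.0e-51   # star number density, stars per m^3 (hypothetical)
L = 3.8e26    # luminosity of one Sun-like star, W

def flux_from_shells(r_max, dr):
    """Flux at the observer from all shells of thickness dr out to r_max."""
    flux = 0.0
    r = dr
    while r < r_max:
        stars_in_shell = n * 4.0 * math.pi * r**2 * dr       # stars in the shell
        flux += stars_in_shell * L / (4.0 * math.pi * r**2)  # each dimmed by 1/r^2
        r += dr
    return flux

# The r^2 factors cancel: every shell adds the same flux n*L*dr,
# so the sum grows linearly with r_max instead of converging.
```

Doubling the cut-off radius roughly doubles the answer, which is the divergence the paradox rests on; Section 1.3 explains why the real sky nevertheless stays dark.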

The beginning of modern cosmology may be fixed at the publication in 1929 
of Hubble's law, which was based on observations of the redshift of spectral 
lines from remote galaxies. This was subsequently interpreted as evidence for 
the expansion of the Universe, thus ruling out a static Universe and thereby set- 
ting the primary requirement on theory. This will be explained in Section 1.4. In 
Section 1.5 we turn to determinations of cosmic timescales and the implications 
of Hubble's law for our knowledge of the age of the Universe. 
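The simplest such timescale follows directly from Hubble's law v = H0 d: if the expansion rate had always been the same, any two galaxies were together a time t = d/v = 1/H0 ago, independently of d. A quick sketch, assuming a round modern value H0 = 70 km/s/Mpc (my assumption, not a value quoted in the text):

```python
# Hubble time: if v = H0 * d has always held, any two galaxies were
# together a time t = d / v = 1 / H0 ago, independent of d.
# H0 = 70 km/s/Mpc is an assumed round modern value.
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec
SEC_PER_GYR = 3.156e16   # seconds in one gigayear

H0 = 70.0                          # km/s per Mpc (assumed)
H0_per_second = H0 / KM_PER_MPC    # expansion rate in units of 1/s
hubble_time_s = 1.0 / H0_per_second
hubble_time_gyr = hubble_time_s / SEC_PER_GYR

print(hubble_time_gyr)   # roughly 14 Gyr
```

The result, about 14 billion years, is only an order-of-magnitude estimate of the age, since the expansion rate has not in fact been constant.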

In Section 1.6 we describe Newton's theory of gravitation, which is the earliest 
explanation of a gravitational force. We shall 'modernize' it by introducing Hub- 
ble's law into it. In fact, we shall see that this leads to a cosmology which already 
contains many features of current Big Bang cosmologies. 



1.1 Historical Cosmology 



At the time of Isaac Newton (1642-1727) the heliocentric Universe of Nicolaus 
Copernicus (1473-1543), Galileo Galilei (1564-1642) and Johannes Kepler (1571- 
1630) had been accepted, because no sensible description of the motion of the 
planets could be found if the Earth was at rest at the centre of the Solar System. 
Humankind was thus dethroned to live on an average-sized planet orbiting around 
an average-sized sun. 

The stars were understood to be suns like ours with fixed positions in a static 
Universe. The Milky Way had been resolved into an accumulation of faint stars 
with the telescope of Galileo. The anthropocentric view still persisted, however, 
in locating the Solar System at the centre of the Universe. 

Newton's Cosmology. The first theory of gravitation appeared when Newton 
published his Philosophiae Naturalis Principia Mathematica in 1687. With this 
theory he could explain the empirical laws of Kepler: that the planets moved in 
elliptical orbits with the Sun at one of the focal points. An early success of this 
theory came when Edmund Halley (1656-1742) successfully predicted that the 
comet sighted in 1456, 1531, 1607 and 1682 would return in 1758. Actually, the 
first observation confirming the heliocentric theory came in 1727 when James 
Bradley (1693-1762) discovered the aberration of starlight, and explained it as 
due to the changes in the velocity of the Earth in its annual orbit. In our time, 
Newton's theory of gravitation still suffices to describe most of planetary and 
satellite mechanics, and it constitutes the nonrelativistic limit of Einstein's rela- 
tivistic theory of gravitation. 




Newton considered the stars to be suns evenly distributed throughout infinite 
space in spite of the obvious concentration of stars in the Milky Way. A dis- 
tribution is called homogeneous if it is uniformly distributed, and it is called 
isotropic if it has the same properties in all spatial directions. Thus in a homo- 
geneous and isotropic space the distribution of matter would look the same to 
observers located anywhere— no point would be preferential. Each local region of 
an isotropic universe contains information which remains true also on a global 
scale. Clearly, matter introduces lumpiness which grossly violates homogeneity 
on the scale of stars, but on some larger scale isotropy and homogeneity may 
still be a good approximation. Going one step further, one may postulate what is 
called the cosmological principle, or sometimes the Copernican principle. 

The Universe is homogeneous and isotropic in three-dimensional space, 
has always been so, and will always remain so. 

It has always been debated whether this principle is true, and on what scale. 
On the galactic scale visible matter is lumpy, and on larger scales galaxies form 
gravitationally bound clusters and narrow strings separated by voids. But galaxies 
also appear to form loose groups of three to five or more galaxies. Several surveys 
have now reached agreement that the distribution of these galaxy groups appears 
to be homogeneous and isotropic within a sphere of 170 Mpc radius [1]. This is 
an order of magnitude larger than the supercluster to which our Galaxy and our 
local galaxy group belong, and which is centred in the constellation of Virgo. 

Based on his theory of gravitation, Newton formulated a cosmology in 1691. 
Since all massive bodies attract each other, a finite system of stars distributed 
over a finite region of space should collapse under their mutual attraction. But 
this was not observed, in fact the stars were known to have had fixed positions 
since antiquity, and Newton sought a reason for this stability. He concluded, erro- 
neously, that the self-gravitation within a finite system of stars would be com- 
pensated for by the attraction of a sufficient number of stars outside the system, 
distributed evenly throughout infinite space. However, the total number of stars 
could not be infinite because then their attraction would also be infinite, making 
the static Universe unstable. It was understood only much later that the addition 
of external layers of stars would have no influence on the dynamics of the interior. 
The right conclusion is that the Universe cannot be static, an idea which would 
have been too revolutionary at the time. 

Newton's contemporary and competitor Gottfried Wilhelm von Leibnitz (1646- 
1716) also regarded the Universe to be spanned by an abstract infinite space, but 
in contrast to Newton he maintained that the stars must be infinite in number 
and distributed all over space, otherwise the Universe would be bounded and 
have a centre, contrary to contemporary philosophy. Finiteness was considered 
equivalent to boundedness, and infinity to unboundedness. 



Rotating Galaxies. The first description of the Milky Way as a rotating galaxy 
can be traced to Thomas Wright (1711-1786), who wrote An Original Theory or 
New Hypothesis of the Universe in 1750, suggesting that the stars are 




all moving the same way and not much deviating from the same plane, 
as the planets in their heliocentric motion do round the solar body. 

Wright's galactic picture had a direct impact on Immanuel Kant (1724-1804). In 
1755 Kant went a step further, suggesting that the diffuse nebulae which Galileo 
had already observed could be distant galaxies rather than nearby clouds of incan- 
descent gas. This implied that the Universe could be homogeneous on the scale 
of galactic distances in support of the cosmological principle. 

Kant also pondered over the reason for transversal velocities such as the move- 
ment of the Moon. If the Milky Way was the outcome of a gaseous nebula con- 
tracting under Newton's law of gravitation, why was all movement not directed 
towards a common centre? Perhaps there also existed repulsive forces of gravi- 
tation which would scatter bodies onto trajectories other than radial ones, and 
perhaps such forces at large distances would compensate for the infinite attrac- 
tion of an infinite number of stars? Note that the idea of a contracting gaseous 
nebula constituted the first example of a nonstatic system of stars, but at galactic 
scale with the Universe still static. 

Kant thought that he had settled the argument between Newton and Leibnitz 
about the finiteness or infiniteness of the system of stars. He claimed that either 
type of system embedded in an infinite space could not be stable and homoge- 
neous, and thus the question of infinity was irrelevant. Similar thoughts can be 
traced to the scholar Yang Shen in China at about the same time, then unknown 
to Western civilization [2]. 

The infinity argument was, however, not properly understood until Bernhard 
Riemann (1826-1866) pointed out that the world could be finite yet unbounded, 
provided the geometry of the space had a positive curvature, however small. On 
the basis of Riemann's geometry, Albert Einstein (1879-1955) subsequently estab- 
lished the connection between the geometry of space and the distribution of mat- 
ter. 

Kant's repulsive force would have produced trajectories in random directions, 
but all the planets and satellites in the Solar System exhibit transversal motion in 
one and the same direction. This was noticed by Pierre Simon de Laplace (1749- 
1827), who refuted Kant's hypothesis by a simple probabilistic argument in 1825: 
the observed movements were just too improbable if they were due to random 
scattering by a repulsive force. Laplace also showed that the large transversal 
velocities and their direction had their origin in the rotation of the primordial 
gaseous nebula and the law of conservation of angular momentum. Thus no repul- 
sive force is needed to explain the transversal motion of the planets and their 
moons, no nebula could contract to a point, and the Moon would not be expected 
to fall down upon us. 

This leads to the question of the origin of time: what was the first cause of the 
rotation of the nebula and when did it all start? This is the question modern cos- 
mology attempts to answer by tracing the evolution of the Universe backwards in 
time and by reintroducing the idea of a repulsive force in the form of a cosmo- 
logical constant needed for other purposes. 




Black Holes. The implications of Newton's gravity were quite well understood 
by John Michell (1724-1793), who pointed out in 1783 that a sufficiently massive 
and compact star would have such a strong gravitational field that nothing could 
escape from its surface. Combining the corpuscular theory of light with Newton's 
theory, he found that a star with the solar density and escape velocity c would 
have a radius of 486 R☉ and a mass of 120 million solar masses. This was the first
mention of a type of star much later to be called a black hole (to be discussed in 
Section 3.4). In 1796 Laplace independently presented the same idea. 
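Michell's figures can be checked with Newtonian mechanics alone: for a sphere of uniform solar density ρ, setting the escape velocity sqrt(2GM/r), with M = (4/3)πr³ρ, equal to c gives r = c sqrt(3/(8πGρ)). A sketch using standard modern SI constants (the precise constant values are my assumption, not the book's):

```python
import math

# Michell's 1783 estimate in Newtonian terms: for a star of uniform
# solar density rho, the escape velocity sqrt(2GM/r) with
# M = (4/3)*pi*r^3*rho reaches c when r = c * sqrt(3 / (8*pi*G*rho)).
# Constants are standard modern SI values (assumed, not from the book).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8        # speed of light, m/s
M_sun = 1.989e30   # solar mass, kg
R_sun = 6.957e8    # solar radius, m

rho = M_sun / ((4.0 / 3.0) * math.pi * R_sun**3)    # mean solar density
r = c * math.sqrt(3.0 / (8.0 * math.pi * G * rho))  # radius where v_esc = c
M = (4.0 / 3.0) * math.pi * r**3 * rho              # mass of such a star

print(r / R_sun)   # close to Michell's 486 solar radii
print(M / M_sun)   # close to 120 million solar masses
```

The numbers come out at about 486 solar radii and 1.2 × 10⁸ solar masses, matching the values Michell obtained.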



Galactic and Extragalactic Astronomy. Newton should also be credited with 
the invention of the reflecting telescope— he even built one— but the first one of 
importance was built one century later by William Herschel (1738-1822). With 
this instrument, observational astronomy took a big leap forward: Herschel and 
his son John could map the nearby stars well enough in 1785 to conclude cor- 
rectly that the Milky Way was a disc-shaped star system. They also concluded 
erroneously that the Solar System was at its centre, but many more observations 
were needed before it was corrected. Herschel made many important discoveries, 
among them the planet Uranus, and some 700 binary stars whose movements 
confirmed the validity of Newton's theory of gravitation outside the Solar System. 
He also observed some 250 diffuse nebulae, which he first believed were distant 
galaxies, but which he and many other astronomers later considered to be nearby 
incandescent gaseous clouds belonging to our Galaxy. The main problem was then 
to explain why they avoided the directions of the galactic disc, since they were 
evenly distributed in all other directions. 

The view of Kant that the nebulae were distant galaxies was also defended 
by Johann Heinrich Lambert (1728-1777). He came to the conclusion that the 
Solar System, along with the other stars in our Galaxy, orbited around the galactic
centre, thus departing from the heliocentric view. The correct reason for the
absence of nebulae in the galactic plane was only given by Richard Anthony Proc- 
tor (1837-1888), who proposed the presence of interstellar dust. The arguments 
for or against the interpretation of nebulae as distant galaxies nevertheless raged 
throughout the 19th century because it was not understood how stars in galax- 
ies more luminous than the whole galaxy could exist— these were observations 
of supernovae. Only in 1925 did Edwin P. Hubble (1889-1953) resolve the conflict 
indisputably by discovering Cepheids and ordinary stars in nebulae, and by deter- 
mining the distance to several galaxies, among them the celebrated M31 galaxy in 
the Andromeda. Although this distance was off by a factor of two, the conclusion 
was qualitatively correct. 

In spite of the work of Kant and Lambert, the heliocentric picture of the Galaxy— 
or almost heliocentric since the Sun was located quite close to Herschel's galactic 
centre— remained long into our century. A decisive change came with the observa- 
tions in 1915-1919 by Harlow Shapley (1895-1972) of the distribution of globular 
clusters hosting 10⁵-10⁷ stars. He found that perpendicular to the galactic plane they were uniformly distributed, but along the plane these clusters had a distribution which peaked in the direction of Sagittarius. This defined the centre



6 From Newton to Hubble 

of the Galaxy to be quite far from the Solar System: we are at a distance of about 
two-thirds of the galactic radius. Thus the anthropocentric world picture received 
its second blow— and not the last one— if we count Copernicus's heliocentric pic- 
ture as the first one. Note that Shapley still believed our Galaxy to be at the centre 
of the astronomical Universe. 



The End of Newtonian Cosmology. In 1883 Ernst Mach (1838-1916) published a 
historical and critical analysis of mechanics in which he rejected Newton's concept 
of an absolute space, precisely because it was unobservable. Mach demanded that 
the laws of physics should be based only on concepts which could be related 
to observations. Since motion still had to be referred to some frame at rest, he 
proposed replacing absolute space by an idealized rigid frame of fixed stars. Thus 
'uniform motion' was to be understood as motion relative to the whole Universe. 
Although Mach clearly realized that all motion is relative, it was left to Einstein to 
take the full step of studying the laws of physics as seen by observers in inertial 
frames in relative motion with respect to each other. 

Einstein completed his General Theory of Relativity in 1916. When he applied it to the Universe as a whole in 1917, the only solution he found to the highly nonlinear differential equations was that of a static
Universe. This was not so unsatisfactory though, because the then known Uni- 
verse comprised only the stars in our Galaxy, which indeed was seen as static, 
and some nebulae of ill-known distance and controversial nature. Einstein firmly 
believed in a static Universe until he met Hubble in 1929 and was overwhelmed 
by the evidence for what was to be called Hubble's law. 

Immediately after general relativity became known, Willem de Sitter (1872- 
1934) published (in 1917) another solution, for the case of empty space-time in an 
exponential state of expansion. We shall describe this solution in Section 4.2. In 
1922 the Russian meteorologist Alexandr Friedmann (1888-1925) found a range 
of intermediate solutions to Einstein's equations which describe the standard cos- 
mology today. Curiously, this work was ignored for a decade although it was pub- 
lished in widely read journals. This is the subject of Section 4.1. 

In 1924 Hubble had measured the distances to nine spiral galaxies, and he found 
that they were extremely far away. The nearest one, M31 in the Andromeda, is now 
known to be at a distance of 20 galactic diameters (Hubble's value was about 8) and 
the farther ones at hundreds of galactic diameters. These observations established 
that the spiral nebulae are, as Kant had conjectured, stellar systems comparable 
in mass and size with the Milky Way, and their spatial distribution confirmed the 
expectations of the cosmological principle on the scale of galactic distances. 

In 1926-1927 Bertil Lindblad (1895-1965) and Jan Hendrik Oort (1900-1992) 
verified Laplace's hypothesis that the Galaxy indeed rotated, and they determined 
the period to be 10⁸ yr and the mass to be about 10¹¹ M_⊙. The conclusive demonstration that the Milky Way is an average-sized galaxy, in no way exceptional or
central, was given only in 1952 by Walter Baade. This we may count as the third 
breakdown of the anthropocentric world picture. 

The later history of cosmology up until 1990 has been excellently summarized 
by Peebles [3]. 







Figure 1.1 Two observers at A and B making observations in the directions r, r' . 

To give the reader an idea of where in the Universe we are, what is nearby and 
what is far away, some cosmic distances are listed in Table A.l in the appendix. On 
a cosmological scale we are not really interested in objects smaller than a galaxy! 
We generally measure cosmic distances in parsec (pc) units (kpc for 10³ pc and Mpc for 10⁶ pc). A parsec is the distance at which one second of arc is subtended
by a length equalling the mean distance between the Sun and the Earth. The par- 
sec unit is given in Table A.2 in the appendix, where the values of some useful 
cosmological and astrophysical constants are listed. 

1.2 Inertial Frames and the Cosmological Principle 

Newton's first law— the law of inertia— states that a system on which no forces 
act is either at rest or in uniform motion. Such systems are called inertial frames. 
Accelerated or rotating frames are not inertial frames. Newton considered that 'at 
rest' and 'in motion' implicitly referred to an absolute space which was unobserv- 
able but which had a real existence independent of humankind. Mach rejected the 
notion of an empty, unobservable space, and only Einstein was able to clarify the 
physics of motion of observers in inertial frames. 

It may be interesting to follow a nonrelativistic argument about the static or 
nonstatic nature of the Universe which is a direct consequence of the cosmological 
principle. 

Consider an observer 'A' in an inertial frame who measures the density of galax- 
ies and their velocities in the space around him. Because the distribution of galax- 
ies is observed to be homogeneous and isotropic on very large scales (strictly 
speaking, this is actually true for galaxy groups [1]), he would see the same mean 
density of galaxies (at one time t) in two different directions r and r′:

p_A(r, t) = p_A(r′, t).

Another observer 'B' in another inertial frame (see Figure 1.1) looking in the direction r′ from her location would also see the same mean density of galaxies:

p_B(r′, t) = p_A(r, t).

The velocity distributions of galaxies would also look the same to both observers, in fact in all directions, for instance in the r′ direction:

v_B(r′, t) = v_A(r′, t).




Suppose that the B frame has the relative velocity v_A(r″, t) as seen from the A frame along the radius vector r″ = r − r′. If all velocities are nonrelativistic, i.e. small compared with the speed of light, we can write

v_A(r′, t) = v_A(r − r″, t) = v_A(r, t) − v_A(r″, t).

This equation is true only if v_A(r, t) has a specific form: it must be proportional to r,

v_A(r, t) = f(t)r, (1.1)

where f(t) is an arbitrary function. Why is this so?

Let this universe start to expand. From the vantage point of A (or B equally well, 
since all points of observation are equal), nearby galaxies will appear to recede 
slowly. But in order to preserve uniformity, distant ones must recede faster, in 
fact their recession velocities must increase linearly with distance. That is the 
content of Equation (1.1). 
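The claim that only a linear flow is compatible with the condition v_A(r′, t) = v_A(r, t) − v_A(r″, t) can be checked numerically. The sketch below contrasts a linear flow with a hypothetical quadratic one; the function names and sample positions are illustrative, not from the text.

```python
# A numeric check that a linear flow v(r) = f(t) r satisfies
# v_A(r') = v_A(r) - v_A(r'') with r'' = r - r', while a nonlinear flow does not.
import math

f = 0.07  # the expansion rate f(t) at some fixed time, arbitrary units

def sub(u, w):       # componentwise vector difference
    return tuple(a - b for a, b in zip(u, w))

def norm(u):
    return math.sqrt(sum(a * a for a in u))

def v_linear(r):     # Equation (1.1): velocity proportional to position
    return tuple(f * a for a in r)

def v_quadratic(r):  # a hypothetical nonlinear flow, for contrast
    return tuple(f * norm(r) * a for a in r)

r  = (1.0, 2.0, -0.5)   # position of a galaxy P as seen by A
r2 = (0.3, -1.1, 0.8)   # position of observer B as seen by A (r'')
r1 = sub(r, r2)         # position of P as seen by B (r')

lin_ok = all(math.isclose(a, b) for a, b in
             zip(v_linear(r1), sub(v_linear(r), v_linear(r2))))
quad_ok = all(math.isclose(a, b, abs_tol=1e-12) for a, b in
              zip(v_quadratic(r1), sub(v_quadratic(r), v_quadratic(r2))))
print(lin_ok, quad_ok)  # True False
```

The linear flow passes for any choice of r and r″, since f(t)(r − r″) = f(t)r − f(t)r″ identically; the quadratic flow fails, which is the content of the uniqueness claim above.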

If f(t) > 0, the Universe would be seen by both observers to expand, each galaxy having a radial velocity proportional to its radial distance r. If f(t) < 0, the Universe would be seen to contract with velocities in the reversed direction. Thus we have seen that expansion and contraction are natural consequences of the cosmological principle. If f(t) is a positive constant, Equation (1.1) is Hubble's law, which we shall meet in Section 1.4.

Actually, it is somewhat misleading to say that the galaxies recede when, rather, 
it is space itself which expands or contracts. This distinction is important when 
we come to general relativity. 

A useful lesson may be learned from studying the limited gravitational system 
consisting of the Earth and rockets launched into space. This system is not quite 
like the previous example because it is not homogeneous, and because the motion 
of a rocket or a satellite in Earth's gravitational field is different from the motion 
of galaxies in the gravitational field of the Universe. Thus to simplify the case 
we only consider radial velocities, and we ignore Earth's rotation. Suppose the 
rockets have initial velocities low enough to make them fall back onto Earth. The 
rocket-Earth gravitational system is then closed and contracting, corresponding to f(t) < 0.

When the kinetic energy is large enough to balance gravity, our idealized rocket becomes a satellite, staying above Earth at a fixed height (real satellites circulate in stable Keplerian orbits at various altitudes if their launch velocities are in the range 8-11 km s⁻¹). This corresponds to the static solution f(t) = 0 for the rocket-Earth gravitational system.

If the launch velocities are increased beyond about 11 km s⁻¹, the potential energy of Earth's gravitational field no longer suffices to keep the rockets bound to Earth. Beyond this speed, called the second cosmic velocity by rocket engineers, the rockets escape for good. This is an expanding or open gravitational system, corresponding to f(t) > 0.

The static case is different if we consider the Universe as a whole. According 
to the cosmological principle, no point is preferred, and therefore there exists no 
centre around which bodies can gravitate in steady-state orbits. Thus the Universe 




is either expanding or contracting, the static solution being unstable and therefore 
unlikely. 

1.3 Olbers' Paradox 

Let us turn to an early problem still discussed today, which is associated with 
the name of Wilhelm Olbers (1758-1840), although it seems to have been known 
already to Kepler in the 17th century, and a treatise on it was published by Jean-Philippe Loys de Cheseaux in 1744, as related in the book by E. Harrison [5]. Why
is the night sky dark if the Universe is infinite, static and uniformly filled with 
stars? They should fill up the total field of visibility so that the night sky would 
be as bright as the Sun, and we would find ourselves in the middle of a heat bath 
of the temperature of the surface of the Sun. Obviously, at least one of the above 
assumptions about the Universe must be wrong. 

The question of the total number of shining stars was already pondered by 
Newton and Leibnitz. Let us follow in some detail the argument published by 
Olbers in 1823. The absolute luminosity of a star is defined as the amount of 
luminous energy radiated per unit time, and the surface brightness B as luminosity 
per unit surface. Suppose that the number of stars with average luminosity L is N and their average density in a volume V is n = N/V. If the surface area of an average star is A, then its brightness is B = L/A. The Sun may be taken to be such
an average star, mainly because we know it so well. 

The number of stars in a spherical shell of radius r and thickness dr is then 4πr²n dr. Their total radiation as observed at the origin of a static universe of infinite extent is then found by integrating the spherical shells from 0 to ∞:

∫₀^∞ 4πr²n BA/(4πr²) dr = ∫₀^∞ nL dr = ∞. (1.2)

On the other hand, a finite number of visible stars, each taking up an angle A/r², could cover an infinite number of more distant stars, so it is not correct to integrate r to ∞. Let us integrate only up to such a distance R that the whole sky of angle 4π would be evenly tiled by the star discs. The condition for this is

∫₀^R 4πr²n (A/r²) dr = 4π.

It then follows that the distance is R = 1/(An). The integrated brightness from these visible stars alone is then

∫₀^R nL dr = L/A, (1.3)

or equal to the brightness of the Sun. But the night sky is indeed dark, so we are faced with a paradox.
faced with a paradox. 

Olbers' own explanation was that invisible interstellar dust absorbed the light. 
That would make the intensity of starlight decrease exponentially with distance. 
But one can show that the amount of dust needed would be so great that the Sun 




would also be obscured. Moreover, the radiation would heat the dust so that it 
would start to glow soon enough, thereby becoming visible in the infrared. 

A large number of different solutions to this paradox have been proposed in the 
past, some of the wrong ones lingering on into the present day. Let us here follow 
a valid line of reasoning due to Lord Kelvin (1824-1907), as retold and improved 
in a popular book by E. Harrison [5]. 

A star at distance r covers the fraction A/(4πr²) of the sky. Multiplying this by the number of stars in the shell, 4πr²n dr, we obtain the fraction of the whole sky covered by stars viewed by an observer at the centre, An dr. Since n is the star count per volume element, An has the dimensions of number of stars per linear distance. The inverse of this,

ℓ = 1/(An), (1.4)

is the mean radial distance between stars, or the mean free path of photons emit- 
ted from one star and being absorbed in collisions with another. We can also 
define a mean collision time:

τ = ℓ/c. (1.5)

The value of τ can be roughly estimated from the properties of the Sun, with radius R_⊙ and density ρ_⊙. Let the present mean density of luminous matter in the Universe be ρ₀ and the distance to the farthest visible star r*. Then the collision time inside this volume of size (4π/3)r*³ is

τ ≈ ℓ/c = 1/(Anc) = 4πr*³/(3NAc) = 4ρ_⊙R_⊙/(3ρ₀c). (1.6)

Taking the solar parameters from Table A.2 in the appendix we obtain approximately 10²³ yr.
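As a sanity check on that order of magnitude, Equation (1.6) can be evaluated with round numbers. The solar values below are standard; the luminous-matter density ρ₀ ≈ 10⁻²⁷ kg m⁻³ is an assumed round figure for this sketch (the book's own inputs are in Table A.2).

```python
# Order-of-magnitude evaluation of Equation (1.6): tau = 4 rho_sun R_sun / (3 rho_0 c).
R_sun   = 6.96e8      # solar radius [m]
rho_sun = 1.4e3       # mean solar density [kg m^-3]
rho_0   = 1.0e-27     # assumed mean luminous-matter density [kg m^-3]
c       = 3.0e8       # speed of light [m s^-1]
yr      = 3.156e7     # seconds per year

tau_yr = 4 * rho_sun * R_sun / (3 * rho_0 * c) / yr
print(f"tau ~ {tau_yr:.1e} yr")   # of order 1e23 yr
```

The result is insensitive to the exact ρ₀: even a factor-of-ten change leaves τ enormously larger than any estimate of the age of the Universe.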

The probability that a photon does not collide but arrives safely to be observed by us after a flight distance r can be derived from the assumption that the photon encounters obstacles randomly, and that the collisions occur independently and at a constant rate ℓ⁻¹ per unit distance. The probability P(r) that the distance to the first collision is r is then given by the exponential distribution

P(r) = ℓ⁻¹ e^(−r/ℓ). (1.7)

Thus flight distances much longer than ℓ are improbable.

Applying this to photons emitted in a spherical shell of thickness dr, and integrating the spherical shells from zero radius to r*, the fraction of all photons emitted in the direction of the centre of the sphere and arriving there to be detected is

f(r*) = ∫₀^r* ℓ⁻¹ e^(−r/ℓ) dr = 1 − e^(−r*/ℓ). (1.8)



Obviously, this fraction approaches 1 only in the limit of an infinite universe. 
In that case every point on the sky would be seen to be emitting photons, and the 
sky would indeed be as bright as the Sun at night. But since this is not the case, we 
must conclude that r*/ℓ is small. Thus the reason why the whole field of vision




is not filled with stars is that the volume of the presently observable Universe is 
not infinite, it is in fact too small to contain sufficiently many visible stars. 
Lord Kelvin's original result follows in the limit of small r*/ℓ, in which case

f(r*) ≈ r*/ℓ.

The exponential effect in Equation (1.8) was neglected by Lord Kelvin.
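Both the exact fraction (1.8) and the small-r*/ℓ limit are easy to verify numerically; the Monte Carlo below samples exponentially distributed free paths and checks that the fraction of sight lines meeting a star within r* matches 1 − e^(−r*/ℓ). Units of length and the chosen numbers are illustrative.

```python
# Check Equation (1.8) against Kelvin's small-r*/l limit and against a
# Monte Carlo with exponentially distributed photon free paths.
import math, random

ell = 1.0              # mean free path
r_star = 0.01 * ell    # distance to the farthest star, r* << mean free path

f_exact = 1 - math.exp(-r_star / ell)   # Equation (1.8)

random.seed(2)
n = 200_000
# Fraction of sight lines whose first collision lies within r*:
# P(free path < r*) = 1 - exp(-r*/l).
hits = sum(random.expovariate(1 / ell) < r_star for _ in range(n)) / n

print(f_exact, hits, r_star / ell)   # all close to 0.01
```

For r*/ℓ = 0.01 the exact value, the Monte Carlo estimate and the linear approximation agree to well within the sampling error, confirming that Kelvin's neglect of the exponential is harmless in this regime.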

We can also replace the mean free path in Equation (1.8) with the collision time (1.5), and the distance r* with the age of the Universe t₀, to obtain the fraction

f(r*) = g(t₀) = 1 − e^(−t₀/τ). (1.9)

If u_⊙ is the average radiation density at the surface of the stars, then the radiation density u₀ measured by us is correspondingly reduced by the fraction g(t₀):

u₀ = u_⊙(1 − e^(−t₀/τ)). (1.10)

In order to be able to observe a luminous night sky we must have u₀ ≈ u_⊙, or the Universe must have an age of the order of the collision time, t₀ ≈ 10²³ yr. However, this exceeds all estimates of the age of the Universe (some estimates will be given in Section 1.5) by 13 orders of magnitude! Thus the existing stars have not had time to radiate long enough.

What Olbers and many after him did not take into account is that even if the 
age of the Universe was infinite, the stars do have a finite age and they burn their 
fuel at well-understood rates. 

If we replace 'stars' by 'galaxies' in the above argument, the problem changes 
quantitatively but not qualitatively. The intergalactic space is filled with radiation 
from the galaxies, but there is less of it than one would expect for an infinite 
Universe, at all wavelengths. There is still a problem to be solved, but it is not 
quite as paradoxical as in Olbers' case. 

One explanation is the one we have already met: each star radiates only for a 
finite time, and each galaxy has existed only for a finite time, whether the age of the 
Universe is infinite or not. Thus when the time perspective grows, an increasing 
number of stars become visible because their light has had time to reach us, but 
at the same time stars which have burned their fuel disappear. 

Another possible explanation evokes expansion and special relativity. If the 
Universe expands, starlight redshifts, so that each arriving photon carries less 
energy than when it was emitted. At the same time, the volume of the Universe 
grows, and thus the energy density decreases. The observation of the low level 
of radiation in the intergalactic space has in fact been evoked as a proof of the 
expansion. 

Since both explanations certainly contribute, it is necessary to carry out detailed 
quantitative calculations to establish which of them is more important. Most of 
the existing literature on the subject supports the relativistic effect, but Harrison 
has shown (and P. S. Wesson [6] has further emphasized) that this is false: the 
finite lifetime of the stars and galaxies is the dominating effect. The relativistic 
effect is quantitatively so unimportant that one cannot use it to prove that the 
Universe is either expanding or contracting. 




1.4 Hubble's Law 

In the 1920s Hubble measured the spectra of 18 spiral galaxies with a reason- 
ably well-known distance. For each galaxy he could identify a known pattern of 
atomic spectral lines (from their relative intensities and spacings) which all exhib- 
ited a common redward frequency shift by a factor 1 + z. Using the relation (1.1) 
following from the assumption of homogeneity alone, 

v = cz, (1.11)

he could then obtain their velocities with reasonable precision. 



The Expanding Universe. The expectation for a stationary universe was that 
galaxies would be found to be moving about randomly. However, some obser- 
vations had already shown that most galaxies were redshifted, thus receding, 
although some of the nearby ones exhibited blueshift. For instance, the nearby 
Andromeda nebula M31 is approaching us, as its blueshift testifies. Hubble's fun- 
damental discovery was that the velocities of the distant galaxies he had studied 
increased linearly with distance: 

v = H₀r. (1.12)

This is called Hubble's law and H₀ is called the Hubble parameter. For the relatively
nearby spiral galaxies he studied, he could only determine the linear, first-order 
approximation to this function. Although the linearity of this law has been verified 
since then by the observations of hundreds of galaxies, it is not excluded that the 
true function has terms of higher order in r. In Section 2.3 we shall introduce a 
second-order correction. 

The message of Hubble's law is that the Universe is expanding, and this general 
expansion is called the Hubble flow. At a scale of tens or hundreds of Mpc the dis- 
tances to all astronomical objects are increasing regardless of the position of our 
observation point. It is true that we observe that the galaxies are receding from 
us as if we were at the centre of the Universe. However, we learned from studying 
a homogeneous and isotropic Universe in Figure 1.1 that if observer A sees the 
Universe expanding with the factor f(t) in Equation (1.1), any other observer B 
will also see it expanding with the same factor, and the triangle ABP in Figure 1.1 
will preserve its form. Thus, taking the cosmological principle to be valid, every 
observer will have the impression that all astronomical objects are receding from 
him/her. A homogeneous and isotropic Universe does not have a centre. Con- 
sequently, we shall usually talk about expansion velocities rather than recession 
velocities. 

It is surprising that neither Newton nor later scientists, pondering about why 
the Universe avoided a gravitational collapse, came to realize the correct solu- 
tion. An expanding universe would be slowed down by gravity, so the inevitable 
collapse would be postponed until later. It was probably the notion of an infinite 
scale of time, inherent in a stationary model, which blocked the way to the right 
conclusion. 




Hubble Time and Radius. From Equations (1.11) and (1.12) one sees that the Hubble parameter has the dimension of inverse time. Thus a characteristic timescale for the expansion of the Universe is the Hubble time:

t_H = H₀⁻¹ = 9.78 h⁻¹ × 10⁹ yr. (1.13)

Here h is the commonly used dimensionless quantity

h = H₀/(100 km s⁻¹ Mpc⁻¹).

The Hubble parameter also determines the size scale of the observable Universe. In time t_H, radiation travelling with the speed of light c has reached the Hubble radius:

r_H = t_H c = 3000 h⁻¹ Mpc. (1.14)

Or, to put it a different way, according to Hubble's nonrelativistic law, objects at 
this distance would be expected to attain the speed of light, which is an absolute 
limit in the theory of special relativity. 

Combining Equation (1.12) with Equation (1.11), one obtains

z = H₀r/c. (1.15)

In Section 2.1 on Special Relativity we will see limitations to this formula when v 
approaches c. The redshift z is in fact infinite for objects at distance r H receding 
with the speed of light and thus physically meaningless. Therefore no information 
can reach us from farther away, all radiation is redshifted to infinite wavelengths, 
and no particle emitted within the Universe can exceed this distance. 
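Equations (1.13)-(1.15) are simple enough to evaluate directly. The sketch below assumes h = 0.72 (the HST Key Project value quoted later in this section) and round conversion constants.

```python
# Hubble time (1.13), Hubble radius (1.14) and the low-redshift relation (1.15),
# evaluated for an assumed h = 0.72.
h = 0.72
H0 = 100.0 * h              # km s^-1 Mpc^-1
Mpc_km = 3.0857e19          # kilometres per megaparsec
Gyr_s = 3.156e16            # seconds per gigayear
c = 2.998e5                 # speed of light [km s^-1]

t_H = Mpc_km / H0 / Gyr_s   # Hubble time [Gyr], equals 9.78/h
r_H = c / H0                # Hubble radius [Mpc], approximately 3000/h
z = H0 * 100.0 / c          # redshift (1.15) of an object at r = 100 Mpc

print(f"t_H = {t_H:.2f} Gyr, r_H = {r_H:.0f} Mpc, z(100 Mpc) = {z:.3f}")
```

Note that the relation (1.15) is used here only in its low-redshift regime; as the text warns, it fails as v approaches c.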



The Cosmic Scale. The size of the Universe is unknown and unmeasurable, but if it undergoes expansion or contraction it is convenient to express distances at different epochs in terms of a cosmic scale R(t), and denote its present value R₀ = R(t₀). The value of R(t) can be chosen arbitrarily, so it is often more convenient to normalize it to its present value, and thereby define a dimensionless quantity, the cosmic scale factor:

a(t) = R(t)/R₀. (1.16)

The cosmic scale factor affects all distances: for instance the wavelength λ of light emitted at one time t and observed as λ₀ at another time t₀:

λ₀/R₀ = λ/R(t).
Let us find an approximation for a(t) at times t < t₀ by expanding it to first-order time differences,

a(t) ≈ 1 − ȧ₀(t₀ − t), (1.17)

using the notation ȧ₀ for ȧ(t₀), and r = c(t₀ − t) for the distance to the source. The cosmological redshift can be approximated by

z = λ₀/λ − 1 = a⁻¹ − 1 ≈ ȧ₀ r/c. (1.18)




Thus 1/(1 + z) is a measure of the scale factor a(t) at the time when a source emitted the now-redshifted radiation. Identifying the expressions for z in Equations (1.18) and (1.15) we find the important relation

ȧ₀ = H₀. (1.19)

The Hubble Constant. The value of this constant initially found by Hubble was H₀ = 550 km s⁻¹ Mpc⁻¹: an order of magnitude too large because his distance measurements were badly wrong. To establish the linear law and to determine the global value of H₀ one needs to be able to measure distances and expansion velocities well and far out. Distances are precisely measured only to nearby
stars which participate in the general rotation of the Galaxy, and which there- 
fore do not tell us anything about cosmological expansion. Even at distances of 
several Mpc the expansion-independent, transversal peculiar velocities of galaxies are of the same magnitude as the Hubble flow. The measured expansion at the Virgo supercluster, 17 Mpc away, is about 1100 km s⁻¹, whereas the peculiar velocities attain 600 km s⁻¹. At much larger distances where the peculiar velocities do not contribute appreciably to the total velocity, for instance at the Coma cluster 100 Mpc away, the expansion velocity is 6900 km s⁻¹ and the Hubble flow can be measured quite reliably, but the imprecision in distance measurements
becomes the problem. Every procedure is sensitive to small, subtle corrections 
and to systematic biases unless great care is taken in the reduction and analysis 
of data. 

A notable contribution to our knowledge of H₀ comes from the Hubble Space Telescope (HST) Key Project [7]. The goal of this project was to determine H₀ by a Cepheid calibration of a number of independent, secondary distance indicators, including Type Ia supernovae, the Tully-Fisher relation, the fundamental plane for elliptical galaxies, surface-brightness fluctuations, and Type II supernovae. Here I shall restrict the discussion to the best absolute determinations of H₀, which are
those from far away supernovae. (Cepheid distance measurements are discussed 
in Section 2.3 under the heading 'Distance Ladder Continued'.) 

Occasionally, a very bright supernova explosion can be seen in some galaxy. 
These events are very brief (one month) and very rare: historical records show 
that in our Galaxy they have occurred only every 300 yr. The most recent nearby 
supernova occurred in 1987 (code name SN1987A), not exactly in our Galaxy but 
in our small satellite, the Large Magellanic Cloud (LMC). Since it has now become 
possible to observe supernovae in very distant galaxies, one does not have to wait 
300 yr for the next one. 

The physical reason for this type of explosion (a Type SNII supernova) is the accumulation of Fe group elements at the core of a massive red giant star of size 8-200 M_⊙, which has already burned its hydrogen, helium and other light elements. Another type of explosion (a Type SNIa supernova) occurs in binary star systems, composed of a heavy white dwarf and a red giant star. White dwarfs have masses of the order of the Sun, but sizes of the order of Earth, whereas red




giants are very large but contain very little mass. The dwarf then accretes mass 
from the red giant due to its much stronger gravitational field. 

As long as the fusion process in the dwarf continues to burn lighter ele- 
ments to Fe group elements, first the gas pressure and subsequently the elec- 
tron degeneracy pressure balance the gravitational attraction (degeneracy pres- 
sure is explained in Section 5.3). But when a rapidly burning dwarf star reaches 
a mass of 1.44 M_⊙, the so-called Chandrasekhar mass, or in the case of a red giant when the iron core reaches that mass, no force is sufficient to oppose
the gravitational collapse. The electrons and protons in the core transform 
into neutrinos and neutrons, respectively, most of the gravitational energy 
escapes in the form of neutrinos, and the remainder is a neutron star which 
is stabilized against further gravitational collapse by the degeneracy pressure 
of the neutrons. As further matter falls in, it bounces against the extremely 
dense neutron star and travels outwards as energetic shock waves. In the col- 
lision between the shock waves and the outer mantle, violent nuclear reac- 
tions take place and extremely bright light is generated. This is the super- 
nova explosion visible from very far away. The nuclear reactions in the man- 
tle create all the elements; in particular, the elements heavier than Fe, Ni 
and Cr on Earth have all been created in supernova explosions in the distant 
past. 

The released energy is always the same since the collapse always occurs at the Chandrasekhar mass, thus in particular the peak brightness of Type Ia supernovae can serve as remarkably precise standard candles visible from very far away. (The
term standard candle is used for any class of astronomical objects whose intrin- 
sic luminosity can be inferred independently of the observed flux.) Additional 
information is provided by the colour, the spectrum, and an empirical correla- 
tion observed between the timescale of the supernova light curve and the peak 
luminosity. The usefulness of supernovae of Type Ia as standard candles is that they can be seen out to great distances, 500 Mpc or z ≈ 0.1, and that the internal precision of the method is very high. At greater distances one can still find supernovae, but Hubble's linear law (1.15) is no longer valid—the expansion starts to accelerate.

The SNe Ia are the brightest and most homogeneous class of supernovae. (The plural of SN is abbreviated SNe.) Type II are fainter, and show a wider variation in luminosity. Thus they are not standard candles, but the time evolution of their expanding atmospheres provides an indirect distance indicator, useful out to some 200 Mpc.

Two further methods to determine H₀ make use of correlations between different galaxy properties. Spiral galaxies rotate, and there the Tully-Fisher relation
correlates total luminosity with maximum rotation velocity. This is currently the 
most commonly applied distance indicator, useful for measuring extragalactic 
distances out to about 150 Mpc. Elliptical galaxies do not rotate, they are found 
to occupy a fundamental plane in which an effective radius is tightly correlated 
with the surface brightness inside that radius and with the central velocity dis- 
persion of the stars. In principle, this method could be applied out to z ~ 1, but 



From Newton to Hubble 



Figure 1.2 Recession velocities of different objects (I-band Tully-Fisher, fundamental plane, surface-brightness fluctuations, Type Ia and Type II supernovae) as a function of distance [7]. The slope determines the value of the Hubble constant.



in practice stellar evolution effects and the nonlinearity of Hubble's law limit the 
method to z < 0.1, or about 400 Mpc. 

The resolution of individual stars within galaxies clearly depends on the dis- 
tance to the galaxy. This method, called surface-brightness fluctuations (SBFs), 
is an indicator of relative distances to elliptical galaxies and some spirals. The 
internal precision of the method is very high, but it can be applied only out to 
about 70 Mpc. 

The observations of the HST have been confirmed by independent SNIa observations from observatories on the ground [8]. The HST team quotes

h ≡ H₀/(100 km s⁻¹ Mpc⁻¹) = 0.72 ± 0.03 ± 0.07. (1.20)



At the time of writing, even more precise determinations of H₀, albeit not significantly different, come from combined multiparameter analyses of the cosmic microwave background spectrum [9] and large-scale structures, to which we shall return in Chapters 8 and 9. The present best value, h = 0.71, is given in Equation (8.43) and in Table A.2 in the appendix. In Figure 1.2 we plot the combined HST observations of H₀.

Note that the second error in Equation (1.20), which is systematic, is much bigger 
than the statistical error. This illustrates that there are many unknown effects which complicate the determination of H₀, and which in the past have made all
determinations controversial. To give just one example, if there is dust on the 
sight line to a supernova, its light would be reddened and one would conclude 
that the recession velocity is higher than it is in reality. There are other methods, 
such as weak lensing (to be discussed in Section 3.3), which do not suffer from 
this systematic error, but they have not yet reached a precision superior to that 
reported in Equation (1.20). 



1.5 The Age of the Universe 

One of the conclusions of Olbers' paradox was that the Universe could not be eternal; it must have an age much less than 10²³ yr, or else the night sky would be bright. More recent proofs that the Universe indeed grows older and consequently has a finite lifetime come from astronomical observations of many types of extragalactic objects at high redshifts and at different wavelengths: radio sources, X-ray
sources, quasars, faint blue galaxies. High redshifts correspond to earlier times, 
and what are observed are clear changes in the populations and the characteris- 
tics as one looks toward earlier epochs. Let us therefore turn to determinations 
of the age of the Universe. 

In Equation (1.13) we defined the Hubble time t_H, and gave a value for it of the order of 10 billion years. However, t_H is not the same as the age t₀ of the Universe. The latter depends on the dynamics of the Universe, whether it is expanding forever or whether the expansion will turn into a collapse, and these scenarios depend on how much matter there is and what the geometry of the Universe is, all questions we shall come back to later. Taking h to be in the range 0.68-0.75, Equation (1.13) gives

t₀ ≈ t_H = 13.0-14.4 Gyr.    (1.21)
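The range in Equation (1.21) is just the unit conversion t_H = 1/H₀; a minimal sketch (the constants are rounded, and the function name is ours):

```python
# Hubble time t_H = 1/H0, an order-of-magnitude bound on the age of the Universe.
# Unit conversion: 1 Mpc = 3.0857e19 km, 1 yr = 3.156e7 s.
MPC_KM = 3.0857e19
YR_S = 3.156e7

def hubble_time_gyr(h):
    """Hubble time in Gyr for H0 = 100*h km/s/Mpc."""
    H0_per_s = 100.0 * h / MPC_KM        # H0 in 1/s
    return 1.0 / H0_per_s / YR_S / 1e9   # seconds -> Gyr

print(f"{hubble_time_gyr(0.75):.1f}")  # 13.0 Gyr
print(f"{hubble_time_gyr(0.68):.1f}")  # 14.4 Gyr
```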



Cosmochronology by Radioactive Nuclei. There are several independent tech- 
niques, cosmochronometers, for determining the age of the Universe. At this point 
we shall only describe determinations via the cosmochronology of long-lived 
radioactive nuclei, and via stellar modelling of the oldest stellar populations in 
our Galaxy and in some other galaxies. Note that the very existence of radioactive 
nuclides indicates that the Universe cannot be infinitely old and static. 

Various nuclear processes have been used to date the age of the Galaxy, t_G, for instance the 'Uranium clock'. Long-lived radioactive isotopes such as ²³²Th, ²³⁵U, ²³⁸U and ²⁴⁴Pu have been formed by fast neutrons from supernova explosions,
captured in the envelopes of an early generation of stars. With each generation 
of star formation, burn-out and supernova explosion, the proportion of metals increases. Therefore the metal-poorest stars found in globular clusters are the
oldest. 

The proportions of heavy isotopes following a supernova explosion are calcu- 
lable with some degree of confidence. Since then, they have decayed with their 
different natural half-lives so that their abundances in the Galaxy today have 
changed. For instance, calculations of the original ratio K = ²³⁵U/²³⁸U give values of about 1.3 with a precision of about 10%, whereas this ratio on Earth at the present time is K = 0.00723.

To compute the age of the Galaxy by this method, we also need the decay constants λ of ²³⁸U and ²³⁵U, which are related to their half-lives:

λ₂₃₈ = ln 2/(4.46 Gyr),    λ₂₃₅ = ln 2/(0.7038 Gyr).

The relation between the isotope proportions, the decay constants, and the time t_G is

K = K₀ exp[(λ₂₃₈ − λ₂₃₅)t_G].    (1.22)

Inserting numerical values one finds t_G ≈ 6.2 Gyr. However, the Solar System is only 4.57 Gyr old, so the abundance of ²³²Th, ²³⁵U and ²³⁸U on Earth cannot be expected to furnish a very interesting limit to t_G. Rather, one has to turn to the abundances in the oldest stars in the Galaxy.
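The uranium-clock estimate can be reproduced directly by inverting Equation (1.22) with the numbers quoted above; a sketch (variable names are ours):

```python
import math

# Uranium-clock age of the Galaxy, Equation (1.22), under the stated assumptions:
# initial ratio K0 = 1.3 and present terrestrial ratio K = 0.00723.
HALF_238, HALF_235 = 4.46, 0.7038   # half-lives in Gyr
lam238 = math.log(2) / HALF_238     # decay constants in 1/Gyr
lam235 = math.log(2) / HALF_235

K0, K = 1.3, 0.00723
# Solve K = K0 * exp[(lam238 - lam235) * t_G] for t_G:
t_G = math.log(K / K0) / (lam238 - lam235)
print(f"{t_G:.2f}")  # ~6.26, i.e. t_G ≈ 6.2 Gyr as in the text
```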

The globular clusters (GCs) are roughly spherically distributed stellar systems 
in the spheroid of the Galaxy. During the majority of the life of a star, it converts 
hydrogen into helium in its core. Thus the most interesting stars for the determination of t_G are those which have exhausted their supply of hydrogen, and
which are located in old, metal-poor GCs, and to which the distance can be reli- 
ably determined. Over the last 10 yr, the GC ages have been reduced dramatically 
because of refined estimates of the parameters governing stellar evolution, and 
because of improved distance measurements. One can now quote [10] a best-fit 
age of 13.2 Gyr and a conservative lower limit of 

t_GC > 11.2 Gyr.

This includes an estimated age of greater than 0.8 Gyr for the Universe when the 
clusters formed. 

Of particular interest is the detection of a spectral line of ²³⁸U in the extremely metal-poor star CS 31082-001, which is overabundant in heavy elements [11].
Theoretical nucleosynthesis models for the initial abundances predict that the 
ratios of neighbouring stable and unstable elements should be similar in early 
stars as well as on Earth. Thus one compares the abundances of the radioactive ²³²Th and ²³⁸U with the neighbouring stable elements Os and Ir (²³⁵U is now useless, because it has already decayed away in the oldest stars). One result [11] is that any age between 11.1 and 13.9 Gyr is compatible with the observations, whereas
another group [12] using a different method quotes 

t* = 14.1 ± 2.5 Gyr. (1.23) 




Bright Cluster Galaxies (BCGs). Another cosmochronometer is offered by the 
study of elliptical galaxies in BCGs at very large distances. It has been found that 
BCG colours only depend on their star-forming histories, and if one can trust stel- 
lar population synthesis models, one has a cosmochronometer. From an analysis 
of 17 bright clusters in the range 0.3 < z < 0.7 observed by the HST, the result is 
[13] 

t_BCG ≈ 13.4 Gyr.    (1.24)

Allowing 0.5-1 Gyr from the Big Bang until galaxies form stars and clusters, all 
the above three estimates fall in the range obtained from the Hubble constant in 
Equation (1.21). There are many more cosmochronometers making use of well- 
understood stellar populations at various distances which we shall not refer to 
here, all yielding ages near those quoted. It is of interest to note that in the past, when the dynamics of the Universe was less well known, the calculated age t_H was smaller than the value in Equation (1.21), and at the same time the age t* of the oldest stars was much higher than the value in Equation (1.23). Thus this
historical conflict between cosmological and observational age estimates has now 
disappeared. 

In Section 4.1 we will derive a general relativistic formula for t₀ which depends on a few measurable dynamical parameters. These parameters will only be defined later. They are determined in supernova analyses (in Section 4.4) and cosmic microwave background analyses (in Section 8.4). The best present estimate of t₀ is based on parameter values quoted by the Wilkinson Microwave Anisotropy Probe (WMAP) team [9]:

t₀ = 13.7 ± 0.2 Gyr.    (1.25)



1.6 Expansion in a Newtonian World 

In this section we shall use Newtonian mechanics to derive a cosmology without 
recourse to Einstein's theory. Inversely, this formulation can also be derived from 
Einstein's theory in the limit of weak gravitational fields. 

A system of massive bodies in an attractive Newtonian potential contracts 
rather than expands. The Solar System has contracted to a stable, gravitation- 
ally bound configuration from some form of hot gaseous cloud, and the same 
mechanism is likely to be true for larger systems such as the Milky Way, and per- 
haps also for clusters of galaxies. On yet larger scales the Universe expands, but 
this does not contradict Newton's law of gravitation. 

The key question in cosmology is whether the Universe as a whole is a gravi- 
tationally bound system in which the expansion will be halted one day. We shall 
next derive a condition for this from Newtonian mechanics. 



Newtonian Mechanics. Consider a galaxy of gravitating mass m_G located at a radius r from the centre of a sphere of mean density ρ and mass M = 4πr³ρ/3







Figure 1.3 A galaxy of mass m at radial distance r receding with velocity v from the centre of a homogeneous mass distribution of density ρ.

(see Figure 1.3). The gravitational potential of the galaxy is

U = −GMm_G/r = −(4π/3)Gm_G ρr²,    (1.26)

where G is the Newtonian constant expressing the strength of the gravitational interaction. Thus the galaxy falls towards the centre of gravitation, acquiring a radial acceleration

r̈ = −GM/r² = −(4π/3)Gρr.    (1.27)

This is Newton's law of gravitation, usually written in the form

F = −GMm_G/r²,    (1.28)

where F (in old-fashioned parlance) is the force exerted by the mass M on the mass m_G. The negative signs in Equations (1.26)-(1.28) express the attractive nature of gravitation: bodies are forced to move in the direction of decreasing r.

In a universe expanding linearly according to Hubble's law (Equation (1.12)), the kinetic energy T of the galaxy receding with velocity v is

T = ½mv² = ½mH₀²r²,    (1.29)

where m is the inertial mass of the galaxy. Although there is no theoretical reason for the inertial mass to equal the gravitational mass (we shall come back to this question later), careful tests have verified the equality to a precision better than a few parts in 10¹³. Let us therefore set m_G = m. Thus the total energy is given by

E = T + U = ½mH₀²r² − (4π/3)Gmρr² = mr²(½H₀² − (4π/3)Gρ).    (1.30)

If the mass density ρ of the Universe is large enough, the expansion will halt. The condition for this to occur is E = 0, or from Equation (1.30) this critical density is

ρ_c = 3H₀²/(8πG) = 1.0539 × 10¹⁰ h² eV m⁻³.    (1.31)

A universe with density ρ > ρ_c is called closed; with density ρ < ρ_c it is called open.
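The numerical value in Equation (1.31) follows from the SI constants; a quick check (constants rounded, function name ours), expressing ρ_c c² as an energy density in eV m⁻³:

```python
import math

# Critical density rho_c = 3 H0^2 / (8 pi G), converted to an energy density
# rho_c * c^2 in eV/m^3. Constants in SI units.
G = 6.674e-11         # m^3 kg^-1 s^-2
C = 2.998e8           # m/s
E_CHARGE = 1.602e-19  # J per eV
MPC_M = 3.0857e22     # m per Mpc

def critical_density_ev_m3(h):
    H0 = 100.0 * h * 1.0e3 / MPC_M             # H0 in 1/s
    rho_c = 3.0 * H0**2 / (8.0 * math.pi * G)  # kg/m^3
    return rho_c * C**2 / E_CHARGE             # eV/m^3

print(f"{critical_density_ev_m3(1.0):.3e}")  # ~1.054e+10 eV/m^3, per Eq. (1.31)
```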




Expansion. Note that r and ρ are time dependent: they scale with the expansion. Denoting their present values r₀ and ρ₀, one has

r(t) = r₀a(t),    ρ(t) = ρ₀a⁻³(t).    (1.32)

The acceleration r̈ in Equation (1.27) can then be replaced by the acceleration of the scale factor:

ä = r̈/r₀ = −(4π/3)Gρ₀a⁻².    (1.33)

Let us use the identity

ä = ½ d(ȧ²)/da

in Equation (1.33) to obtain

d(ȧ²) = −(8π/3)Gρ₀ da/a².

This can be integrated from the present time t₀ to an earlier time t with the result

ȧ²(t) − ȧ²(t₀) = (8π/3)Gρ₀(a⁻¹ − 1).    (1.34)

Let us now introduce the dimensionless density parameter

Ω₀ = ρ₀/ρ_c = 8πGρ₀/(3H₀²).    (1.35)

Substituting Ω₀ into Equation (1.34) and making use of the relation (1.19), ȧ(t₀) = H₀, we find

ȧ² = H₀²(Ω₀a⁻¹ − Ω₀ + 1).    (1.36)

Thus it is clear that the presence of matter influences the dynamics of the Universe. Without matter, Ω₀ = 0, Equation (1.36) just states that the expansion is constant, ȧ = H₀, and H₀ could well be zero as Einstein thought. During expansion ȧ is positive; during contraction it is negative. In both cases the value of ȧ² is nonnegative, so it must always be true that

1 − Ω₀ + Ω₀/a ≥ 0.    (1.37)



Cosmological Models. Depending on the value of Ω₀ the evolution of the Universe can take three courses.

(i) Ω₀ < 1, the mass density is undercritical. As the cosmic scale factor a(t) increases for times t > t₀ the term Ω₀/a decreases, but the expression (1.37) stays positive always. Thus this case corresponds to an open, ever-expanding universe, as a consequence of the fact that it is expanding now. In Figure 1.4 the expression (1.37) is plotted against a as the long-dashed curve for the choice Ω₀ = 0.5.







Figure 1.4 Dependence of the expression (1.37) on the cosmic scale a for an undercritical (Ω₀ = 0.5), critical (Ω₀ = 1) and overcritical (Ω₀ = 1.5) universe. Time starts today at scale a = 1 in this picture and increases with a, except for the overcritical case where the Universe arrives at its maximum size, here a = 3, whereupon it reverses its direction and starts to shrink.



(ii) Ω₀ = 1, the mass density is critical. As the scale factor a(t) increases for times t > t₀ the expression in Equation (1.37) gradually approaches zero, and the expansion halts. However, this only occurs infinitely late, so it also corresponds to an ever-expanding universe. This case is plotted against a as the short-dashed curve in Figure 1.4. Note that cases (i) and (ii) differ by having different asymptotes. Case (ii) is quite realistic because the observational value of Ω₀ is very close to 1, as we shall see later.

(iii) Ω₀ > 1, the mass density is overcritical and the Universe is closed. As the scale factor a(t) increases, it reaches a maximum value a_mid when the expression in Equation (1.37) vanishes, and where the rate of increase, ȧ_mid, also vanishes. But the condition (1.37) must stay true, and therefore the expansion must turn into contraction at a_mid. The solid line in Figure 1.4 describes this case for the choice Ω₀ = 1.5, whence a_mid = 3. For later times the Universe retraces the solid curve, ultimately reaching scale a = 1 again.
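The three cases can be explored numerically from Equation (1.36); a sketch (helper names are ours; the closed-form a_mid = Ω₀/(Ω₀ − 1) follows from setting the expression (1.37) to zero):

```python
# Right-hand side of Equation (1.36) in units of H0^2,
# i.e. the expression (1.37): 1 - Omega0 + Omega0/a.
def adot_squared_over_H0sq(a, omega0):
    return 1.0 - omega0 + omega0 / a

# Overcritical universe: the expansion halts where the expression vanishes,
# at a_mid = Omega0 / (Omega0 - 1).
def a_mid(omega0):
    if omega0 <= 1.0:
        raise ValueError("expansion never halts for Omega0 <= 1")
    return omega0 / (omega0 - 1.0)

print(a_mid(1.5))                        # 3.0, as in Figure 1.4
print(adot_squared_over_H0sq(3.0, 1.5))  # 0.0 at the turning point
```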

This is as far as we can go combining Newtonian mechanics with Hubble's law. 
We have seen that problems appear when the recession velocities exceed the speed 
of light, conflicting with special relativity. Another problem is that Newton's law 
of gravitation knows no delays: the gravitational potential is felt instantaneously 
over all distances. A third problem with Newtonian mechanics is that the Copernican world, which is assumed to be homogeneous and isotropic, extends up to a finite distance r₀, but outside that boundary there is nothing. Then the boundary
region is characterized by violent inhomogeneity and anisotropy, which are not 
taken into account. To cope with these problems we must begin to construct a 
fully relativistic cosmology. 




Problems 

1. How many revolutions has the Galaxy made since the formation of the Solar System if we take the solar velocity around the galactic centre to be 365 km s⁻¹?

2. Use Equation (1.4) to estimate the mean free path ℓ of photons. What fraction of all photons emitted by stars up to the maximum observed redshift z = 7 arrive at Earth?

3. If Hubble had been right that the expansion is given by

H₀ = 550 km s⁻¹ Mpc⁻¹,

how old would the Universe be then (see Equation (1.13))?

4. What is the present ratio K = ²³⁵U/²³⁸U on a star 10 Gyr old?

5. Prove Newton's theorem that the gravitational force at a radial distance R 
from the centre of a spherical distribution of matter acts as if all the mass 
inside R were concentrated at a single point at the centre. Show also that if 
the spherical distribution of matter extends beyond R, the force due to the 
mass outside R vanishes [14]. 

6. Estimate the escape velocity from the Galaxy. 

Chapter Bibliography 

[1] Ramella, M., Geller, M. J., Pisani, A. and da Costa, L. N. 2002 Astron. J. 123, 2976. 
[2] Fang Li Zhi and Li Shu Xian 1989 Creation of the Universe. World Scientific, Singapore. 
[3] Peebles, P. J. E. 1993 Principles of physical cosmology. Princeton University Press, 

Princeton, NJ. 
[4] Hagiwara, K. et al. 2002 Phys. Rev. D66, 010001-1. 

[5] Harrison, E. 1987 Darkness at night. Harvard University Press, Cambridge, MA. 
[6] Wesson, P. S. 1991 Astrophys. J. 367, 399. 
[7] Freedman, W. L. et al. 2001 Astrophys. J. 553, 47. 

[8] Gibson, B. K. and Brook, C. B. 2001 New cosmological data and the values of the fundamental parameters (ed. A. Lasenby & A. Wilkinson), ASP Conference Proceedings Series, vol. 666.
[9] Bennett, C. L. et al. 2003 Preprint arXiv, astro-ph/0302207 and 2003 Astrophys. J. (In 
press.) and companion papers cited therein. 
[10] Krauss, L. M. and Chaboyer, B. 2003 Science 299, 65-69. 
[11] Cayrel, R. et al. 2001 Nature 409, 691. 
[12] Wanajo, S. et al. 2002 Astrophys. J. 577, 853. 
[13] Ferreras, I. et al. 2001 Mon. Not. R. Astron. Soc. 327, L47. 
[14] Shu, F. H. 1982 The physical Universe. University Science Books, Mill Valley, CA. 



2 
Relativity 



The foundations of modern cosmology were laid during the second and third decades of the 20th century: on the theoretical side by Einstein's theory of general relativity, which represented a deep revision of current concepts; and on the
observational side by Hubble's discovery of the cosmic expansion, which ruled 
out a static Universe and set the primary requirement on theory. Space and time 
are not invariants under Lorentz transformations, their values being different to 
observers in different inertial frames. Non-relativistic physics uses these quanti- 
ties as completely adequate approximations, but in relativistic frame-independent 
physics we must find invariants to replace them. This chapter begins, in Sec- 
tion 2.1, with Einstein's theory of special relativity, which gives us such invariants. 

In Section 2.2 we generalize the metrics in linear spaces to metrics in curved 
spaces, in particular the Robertson-Walker metric in a four-dimensional manifold. 
This gives us tools to define invariant distance measures in Section 2.3, and to 
conclude with a brief review of astronomical distance measurements which are 
the key to Hubble's parameter. 

A central task of this chapter is to derive Einstein's law of gravitation using as 
few mathematical tools as possible (for far more detail, see, for example, [1] and 
[2]). The basic principle of covariance introduced in Section 2.4 requires a brief 
review of tensor analysis. Tensor notation has the advantage of permitting one to 
write laws of nature in the same form in all invariant systems. 

The 'principle of equivalence' is introduced in Section 2.5 and it is illustrated by 
examples of travels in lifts. In Section 2.6 we assemble all these tools and arrive 
at Einstein's law of gravitation. 

2.1 Lorentz Transformations and Special Relativity 

In Einstein's theory of special relativity one studies how signals are exchanged 
between inertial frames in motion with respect to each other with constant veloc- 
ity. Einstein made two postulates about such frames: 

Introduction to Cosmology Third Edition by Matts Roos
© 2003 John Wiley & Sons, Ltd ISBN 0 470 84909 6 (cased) ISBN 0 470 84910 X (pbk)




(i) the results of measurements in different frames must be identical; and 
(ii) light travels at a constant speed, c, in vacuo, in all frames.
The first postulate requires that physics be expressed in frame-independent 
invariants. The latter is actually a statement about the measurement of time in 
different frames, as we shall see shortly. 

Lorentz Transformations. Consider two linear axes x and x' in one-dimensional space, x' being at rest and x moving with constant velocity v in the positive x' direction. Time increments are measured in the two coordinate systems as Δt and Δt' using two identical clocks. Neither the spatial increments Δx and Δx' nor the time increments Δt and Δt' are invariants: they do not obey postulate (i). Let us replace Δt and Δt' with the temporal distances cΔt and cΔt' and look for a linear transformation between the primed and unprimed coordinate systems, under which the two-dimensional space-time distance Δs between two events,

Δs² = c²Δτ² = c²Δt² − Δx² = c²Δt'² − Δx'²,    (2.1)

is invariant. Invoking the constancy of the speed of light it is easy to show that the transformation must be of the form

Δx' = γ(Δx − vΔt),    cΔt' = γ(cΔt − vΔx/c),    (2.2)

where

γ = 1/√(1 − (v/c)²).    (2.3)

Equation (2.2) defines the Lorentz transformation, after Hendrik Antoon Lorentz (1853-1928). Scalar products in this two-dimensional (ct, x)-space are invariants under Lorentz transformations.
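The invariance of the interval (2.1) under the transformation (2.2)-(2.3) is easy to verify numerically; a sketch in units with c = 1 (function and variable names are ours):

```python
import math

# Lorentz transformation (2.2)-(2.3), with a check that the interval (2.1)
# is invariant. Units with c = 1 for simplicity.
def lorentz(dt, dx, v):
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    dx_p = gamma * (dx - v * dt)
    dt_p = gamma * (dt - v * dx)
    return dt_p, dx_p

dt, dx, v = 3.0, 1.0, 0.8
dt_p, dx_p = lorentz(dt, dx, v)
s2 = dt**2 - dx**2        # interval in the unprimed frame
s2_p = dt_p**2 - dx_p**2  # interval in the primed frame
print(abs(s2 - s2_p) < 1e-12)  # True: the interval is Lorentz invariant
```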



Time Dilation. The quantity Δτ in Equation (2.1) is called the proper time and Δs the line element. Note that scalar multiplication in this manifold is here defined in such a way that the products of the spatial components obtain negative signs (sometimes the opposite convention is chosen). (The mathematical term for a many-dimensional space is a manifold.)

Since dτ² is an invariant, it has the same value in both frames:

dτ'² = dτ².

While the observer at rest records consecutive ticks on his clock separated by a space-time interval dτ = dt', she receives clock ticks from the x direction separated by the time interval dt and also by the space interval dx = v dt:

dτ = dt' = √(dt² − dx²/c²) = √(1 − (v/c)²) dt.    (2.4)

In other words, the two inertial coordinate systems are related by a Lorentz transformation.

Obviously, the time interval dt is always longer than the interval dt', but only 
noticeably so when v approaches c. This is called the time dilation effect. 

The time dilation effect has been well confirmed in particle experiments. Muons 
are heavy, unstable, electron-like particles with well-known lifetimes in the lab- 
oratory. However, when they strike Earth with relativistic velocities after having 
been produced in cosmic ray collisions in the upper atmosphere, they appear to 
have a longer lifetime by the factor y. 
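The size of the effect is easy to estimate with rough numbers (the muon lifetime and speed below are typical illustrative values, not taken from the text):

```python
import math

# Time dilation for cosmic-ray muons. Assumed values: proper lifetime
# tau ~ 2.2 microseconds, speed v = 0.999c.
def gamma(v_over_c):
    return 1.0 / math.sqrt(1.0 - v_over_c**2)

tau = 2.2e-6   # s, muon proper lifetime (assumed)
v = 0.999      # v/c (assumed)
c = 2.998e8    # m/s

lab_lifetime = gamma(v) * tau
print(f"gamma = {gamma(v):.1f}")                      # ~22.4
print(f"range ~ {lab_lifetime * v * c / 1e3:.0f} km")  # ~15 km, vs ~0.7 km undilated
```

The dilated range comfortably exceeds the thickness of the atmosphere, which is why muons produced at high altitude are observed at sea level.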

Another example is furnished by particles of mass m and charge Q circulating 
with velocity v in a synchrotron of radius r. In order to balance the centrifugal 
force the particles have to be subject to an inward-bending magnetic field den- 
sity B. The classical condition for this is 

r = mv/QB.

The velocity in the circular synchrotron as measured by a physicist at rest in 
the laboratory frame is inversely proportional to t, say the time of one revolution. 
But in the particle rest frame the time of one revolution is shortened to t/y. When 
the particle attains relativistic velocities (by traversing accelerating potentials at 
regular positions in the ring), the magnetic field density B felt by the particle has 
to be adjusted to match the velocity in the particle frame, thus 

r = mvγ/QB.

This equation has often been misunderstood to imply that the mass m increases 
by the factor y, whereas only time measurements are affected by y. 



Relativity and Gold. Another example of relativistic effects on the orbits of 
circulating massive particles is furnished by electrons in Bohr orbits around a 
heavy nucleus. The effective Bohr radius of an electron is inversely proportional to 
its mass. Near the nucleus the electrons attain relativistic speeds, the time dilation 
will cause an apparent increase in the electron mass, more so for inner electrons 
with larger average speeds. For a 1s shell at the nonrelativistic limit, this average speed is proportional to Z atomic units. For instance, v/c for the 1s electron in Hg is 80/137 = 0.58, implying a relativistic radial shrinkage of 23%. Because the higher s shells have to be orthogonal against the lower ones, they will suffer a similar contraction. Due to interacting relativistic and shell-structure effects, their contraction can be even larger; for gold, the 6s shell has larger percentage relativistic effects than the 1s shell. The nonrelativistic 5d and 6s orbital energies
of gold are similar to the 4d and 5s orbital energies of silver, but the relativistic 
energies happen to be very different. This is the cause of the chemical difference 
between silver and gold and also the cause for the distinctive colour of gold [3]. 



Light Cone. The Lorentz transformations (2.1), (2.2) can immediately be generalized to three spatial dimensions, where the square of the Pythagorean distance element

dl² = dx² + dy² + dz²    (2.5)



is invariant under rotations and translations in three-space. This is replaced by 
the four-dimensional space-time of Hermann Minkowski (1864-1909), defined by 
the temporal distance ct and the spatial coordinates x, y, z. An invariant under 
Lorentz transformations between frames which are rotated or translated at a con- 
stant velocity with respect to each other is then the line element of the Minkowski metric,

ds² = c²dτ² = c²dt² − dx² − dy² − dz² = c²dt² − dl².    (2.6)

The trajectory of a body moving in space-time is called its world line. A body at a 
fixed location in space follows a world line parallel to the time axis and, of course, 
in the direction of increasing time. A body moving in space follows a world line 
making a slope with respect to the time axis. Since the speed of a body or a signal 
travelling from one event to another cannot exceed the speed of light, there is a 
maximum slope to such world lines. All world lines arriving where we are, here 
and now, obey this condition. Thus they form a cone in our past, and the envelope 
of the cone corresponds to signals travelling with the speed of light. This is called 
the light cone. 

Two separate events in space-time can be causally connected provided their 
spatial separation dl and their temporal separation dt (in any frame) obey

|dl/dt| ≤ c.

Their world line is then inside the light cone. In Figure 2.1 we draw this four-dimensional cone in t, x, y-space, but another choice would have been to use the coordinates t, σ, θ. Thus if we locate the present event at the apex of the light
cone at t = to = 0, it can be influenced by world lines from all events inside the 
past light cone for which ct < 0, and it can influence all events inside the future 
light cone for which ct > 0. Events inside the light cone are said to have timelike 
separation from the present event. Events outside the light cone are said to have 
spacelike separation from the present event: they cannot be causally connected 
to it. Thus the light cone encloses the present observable universe, which consists 
of all world lines that can in principle be observed. From now on we usually mean 
the present observable universe when we say simply 'the Universe'. 

For light signals the equality sign above applies so that the proper time interval in Equation (2.6) vanishes:

dτ = 0.

Events on the light cone are said to have null or lightlike separation. 



Redshift and Scale Factor. The light emitted by stars is caused by atomic tran- 
sitions with emission spectra containing sharp spectral lines. Similarly, hot radi- 
ation traversing cooler matter in stellar atmospheres excites atoms at sharply 
defined wavelengths, producing characteristic dark absorption lines in the con- 
tinuous regions of the emission spectrum. The radiation that was emitted by stars and distant galaxies with a wavelength λ_rest = c/ν_rest at time t in their rest frame will have its wavelength stretched by the cosmological expansion to λ_obs when










Figure 2.1 Light cone in x, y, t-space. An event which is at the origin x = y = 0 at the present time t₀ will follow some world line into the future, always remaining inside the future light cone. All points on the world line are at timelike locations with respect to the spatial origin at t₀. World lines for light signals emitted from (received at) the origin at t₀ will propagate on the envelope of the future (past) light cone. No signals can be sent to or received from spacelike locations. The space in the past from which signals can be received at the present origin is restricted by the particle horizon at t_min, the earliest time under consideration. The event horizon restricts the space which can at present be in causal relation to the present spatial origin at some future time t_max.

observed on Earth. Since the Universe expands, this shift is in the red direction, λ_obs > λ_rest, and it is therefore called a redshift, denoted

z = (λ_obs − λ_rest)/λ_rest.    (2.7)

The letter z for the redshift is of course a different quantity than the coordinate 
z in Equations (2.5) and (2.6). 
The ratio of wavelengths actually measured by the terrestrial observer is then

1 + z = λ_obs/λ_rest = R(t₀)/R(t) = 1/a(t).    (2.8)

It should be stressed that the cosmological redshift is not caused by the velocities 
of receding objects, only by the increase in scale a(t) since time t. A kinematic 
effect can be observed in the spectra of nearby stars and galaxies, for which their 
peculiar motion is more important than the effect of the cosmological expansion. 
This may give rise to a Doppler redshift for a receding source, and to a correspond- 
ing blueshift for an approaching source. 

Actually, the light cones in Figure 2.1 need to be modified for an expanding 
universe. A scale factor a(t) that increases with time implies that light will travel 
a distance greater than ct during time t. Consequently, the straight lines will be 
curved. 
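Equation (2.8) is trivial to apply; a sketch (normalizing the present scale to a(t₀) = 1, function name ours):

```python
# Cosmological redshift, Equation (2.8): 1 + z = a(t0)/a(t).
def redshift(a_emit, a_now=1.0):
    return a_now / a_emit - 1.0

# Light emitted when the Universe was half its present size:
print(redshift(0.5))    # z = 1.0
# and at one-eighth the present size:
print(redshift(0.125))  # z = 7.0, the maximum observed redshift quoted in Chapter 1
```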



2.2 Metrics of Curved Space-time 

In Newton's time the laws of physics were considered to operate in a flat Euclidean 
space, in which spatial distance could be measured on an infinite and immovable 
three-dimensional grid, and time was a parameter marked out on a linear scale 
running from infinite past to infinite future. But Newton could not answer the 
question of how to identify which inertial frame was at rest relative to this absolute 
space. In his days the solar frame could have been chosen, but today we know that 
the Solar System orbits the Galactic centre, the Galaxy is in motion relative to the 
local galaxy group, which in turn is in motion relative to the Hydra-Centaurus cluster, and the whole Universe is expanding.

The geometry of curved spaces was studied in the 19th century by Gauss, 
Riemann and others. Riemann realized that Euclidean geometry was just a par- 
ticular choice suited to flat space, but not necessarily correct in the space we 
inhabit. And Mach realized that one had to abandon the concept of absolute 
space altogether. Einstein learned about tensors from his friend Marcel Grossmann, and used these key quantities to go from flat Euclidean three-dimensional
space to curved Minkowskian four-dimensional space in which physical quantities 
are described by invariants. Tensors are quantities which provide generally valid 
relations between different four-vectors. 



Euclidean Space. Let us consider how to describe distance in three-space. The 
path followed by a free body obeying Newton's first law of motion can suit- 
ably be described by expressing its spatial coordinates as functions of time: 
x(t), y(t), z(t). Time is then treated as an absolute parameter and not as a coor- 
dinate. This path represents the shortest distance between any two points along 
it, and it is called a geodesic of the space. As is well known, in Euclidean space 




the geodesics are straight lines. Note that the definition of a geodesic does not involve any particular coordinate system.

If one replaces the components x, y, z of the distance vector l by x¹, x², x³, this permits a more compact notation of the Pythagorean squared distance l² in the metric equation (2.5):

l² = (x¹)² + (x²)² + (x³)² = Σᵢⱼ gᵢⱼxⁱxʲ = gᵢⱼxⁱxʲ.    (2.9)

The quantities gᵢⱼ are the nine components of the metric tensor g, which contains all the information about the intrinsic geometry of this three-space. In the last step we have used the convention to leave out the summation sign; it is then implied that summation is carried out over repeated indices. One commonly uses Roman letters in the indices when only the spatial components xⁱ, i = 1, 2, 3, are implied, and Greek letters when all the four space-time coordinates x^μ, μ = 0, 1, 2, 3, are implied. Orthogonal coordinate systems have diagonal metric tensors and this is all that we will encounter. The components of g in flat Euclidean three-space are

gᵢⱼ = δᵢⱼ,

where δᵢⱼ is the usual Kronecker delta.

The same flat space could equally well be mapped by, for example, spherical or cylindrical coordinates. The components gᵢⱼ of the metric tensor would be different, but Equation (2.9) would hold unchanged. For instance, choosing spherical coordinates R, θ, φ as in Figure 2.2,

x = R sinθ sinφ,    y = R sinθ cosφ,    z = R cosθ,    (2.10)

dl² takes the explicit form

dl² = dR² + R²dθ² + R²sin²θ dφ².    (2.11)

Geodesics in this space obey Newton's first law of motion, which may be written as

R̈ = 0.    (2.12)
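That the spherical form (2.11) measures the same distances as the Cartesian form (2.5) can be checked numerically with a small coordinate step, using the mapping (2.10); a sketch (the point and step sizes are arbitrary):

```python
import math

# Numerical check of the spherical-coordinate metric (2.11): a small step
# (dR, dtheta, dphi) gives the same distance as the Cartesian form (2.5).
def cart(R, th, ph):
    # Mapping (2.10) from spherical to Cartesian coordinates.
    return (R * math.sin(th) * math.sin(ph),
            R * math.sin(th) * math.cos(ph),
            R * math.cos(th))

R, th, ph = 2.0, 0.7, 1.2
dR, dth, dph = 1e-6, 1e-6, 1e-6

p1 = cart(R, th, ph)
p2 = cart(R + dR, th + dth, ph + dph)
dl_cart = math.dist(p1, p2)  # Euclidean distance, Equation (2.5)
dl_metric = math.sqrt(dR**2 + (R * dth)**2 + (R * math.sin(th) * dph)**2)
print(abs(dl_cart - dl_metric) / dl_cart < 1e-5)  # True, to first order in the step
```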



Minkowski Space-Time. In special relativity, symmetry between spatial coordinates and time is achieved, as is evident from the Minkowski metric (2.6) describing a flat space-time in four Cartesian coordinates. In tensor notation the Minkowski metric includes the coordinate dx⁰ = c dt so that the invariant line element in Equation (2.6) can be written

ds² = c²dτ² = g_μν dx^μ dx^ν.    (2.13)

The components of g in flat Minkowski space-time are given by the diagonal matrix η_μν, a generalization of the Kronecker delta function to four-space-time,

η₀₀ = 1,    η_jj = −1,    j = 1, 2, 3,    (2.14)




Figure 2.2 A two-sphere on which points are specified by coordinates (θ, φ).

all nondiagonal components vanishing. The choice of signs in the definition of η_μν is not standardized in the literature, but we shall use Equation (2.14).

The path of a body, or its world line, is then described by the four coordinate functions x(τ), y(τ), z(τ), t(τ), where the proper time τ is a new absolute parameter, an invariant under Lorentz transformations. A geodesic in the Minkowski space-time is also a straight line, given by the equations

$$ \frac{\mathrm{d}^2\boldsymbol{l}}{\mathrm{d}\tau^2} = 0, \qquad \frac{\mathrm{d}^2 t}{\mathrm{d}\tau^2} = 0. \qquad (2.15) $$

In the spherical coordinates (2.10) the Minkowski metric (2.6) takes the form

$$ \mathrm{d}s^2 = c^2\,\mathrm{d}t^2 - \mathrm{d}l^2 = c^2\,\mathrm{d}t^2 - \mathrm{d}R^2 - R^2\,\mathrm{d}\theta^2 - R^2\sin^2\theta\,\mathrm{d}\phi^2. \qquad (2.16) $$

An example of a curved space is the two-dimensional surface of a sphere with 
radius R obeying the equation 

x 2 +y 2 + z 2 =R 2 . (2.17) 

This surface is called a two-sphere. 

Combining Equations (2.5) and (2.17) we see that one coordinate is really superfluous, for instance z, so that the spatial metric (2.5) can be written

$$ \mathrm{d}l^2 = \mathrm{d}x^2 + \mathrm{d}y^2 + \frac{(x\,\mathrm{d}x + y\,\mathrm{d}y)^2}{R^2 - x^2 - y^2}. \qquad (2.18) $$

This metric describes spatial distances on a two-dimensional surface embedded 
in three-space, but the third dimension is not needed to measure a distance on 




the surface. Note that R is not a third coordinate, but a constant everywhere on 
the surface. 

Thus measurements of distances depend on the geometric properties of space, 
as has been known to navigators ever since Earth was understood to be spherical. 
The geodesics on a sphere are great circles, and the metric is

$$ \mathrm{d}l^2 = R^2\,\mathrm{d}\theta^2 + R^2\sin^2\theta\,\mathrm{d}\phi^2. \qquad (2.19) $$

Near the poles where θ = 0° or θ = 180° the local distance would depend very little on changes in longitude φ. No point on this surface is preferred, so it can correspond to a Copernican homogeneous and isotropic two-dimensional universe which is unbounded, yet finite.
Let us write Equation (2.19) in the matrix form

$$ \mathrm{d}l^2 = (\mathrm{d}\theta \ \ \mathrm{d}\phi)\; g \begin{pmatrix} \mathrm{d}\theta \\ \mathrm{d}\phi \end{pmatrix}, \qquad (2.20) $$

where the metric matrix is

$$ g = \begin{pmatrix} R^2 & 0 \\ 0 & R^2\sin^2\theta \end{pmatrix}. \qquad (2.21) $$

The 'two-volume' or area A of the two-sphere in Figure 2.2 can then be written

$$ A = \int_0^{2\pi}\mathrm{d}\phi \int_0^{\pi}\mathrm{d}\theta\,\sqrt{\det g} = \int_0^{2\pi}\mathrm{d}\phi \int_0^{\pi}\mathrm{d}\theta\,R^2\sin\theta = 4\pi R^2, \qquad (2.22) $$

as expected. 
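The area integral (2.22) can be sketched numerically with a simple midpoint rule; R here is an arbitrary illustrative radius:

```python
import numpy as np

R, n = 1.5, 20000

# Midpoint grid in theta; the phi integral contributes the factor 2*pi.
theta = (np.arange(n) + 0.5) * np.pi / n

# Integrate sqrt(det g) = R^2 sin(theta) over the sphere, Equation (2.22).
A = 2.0 * np.pi * np.sum(R**2 * np.sin(theta)) * (np.pi / n)

print(A, 4.0 * np.pi * R**2)  # the two results agree
```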

In Euclidean three-space parallel lines of infinite length never cross, but this 
could not be proved in Euclidean geometry, so it had to be asserted without proof, 
the parallel axiom. The two-sphere belongs to the class of Riemannian curved 
spaces which are locally flat: a small portion of the surface can be approximated 
by its tangential plane. Lines in this plane which are parallel locally do cross when extended far enough, as required for geodesics on the surface of a sphere.



Gaussian Curvature. The deviation of a curved surface from flatness can also 
be determined from the length of the circumference of a circle. Choose a point P on the surface and draw the locus corresponding to a fixed distance s from that point. If the surface is flat, a plane, the locus is a circle and s is its radius. On a two-sphere of radius R the locus is also a circle, see Figure 2.2, but the distance s is measured along a geodesic. The angle subtended by s at the centre of the sphere is s/R, so the radius of the circle is r = R sin(s/R). Its circumference is then

$$ C = 2\pi R\sin(s/R) = 2\pi s\left(1 - \frac{s^2}{6R^2} + \cdots\right). \qquad (2.23) $$

Carl Friedrich Gauss (1777-1855) discovered an invariant characterizing the curvature of two-surfaces, the Gaussian curvature K. Although K can be given by a completely general formula independent of the coordinate system (see, for example, [1]), it is most simply described in an orthogonal system x, y. Let the radius of curvature along the x-axis be R_x(x) and along the y-axis be R_y(y). Then the Gaussian curvature at the point (x_0, y_0) is

$$ K = \frac{1}{R_x(x_0)\,R_y(y_0)}. \qquad (2.24) $$

On a two-sphere R_x = R_y = R, so K = R^{-2} everywhere. Inserting this into Equation (2.23) we obtain, in the limit of small s,

$$ K = \frac{3}{\pi}\lim_{s\to 0}\frac{2\pi s - C}{s^3}. \qquad (2.25) $$

This expression is true for any two-surface, and it is in fact the only invariant that 
can be defined. 
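The limit (2.25) can be illustrated numerically; here R is an arbitrary sphere radius, so K should converge to 1/R²:

```python
import numpy as np

R = 2.0  # radius of the two-sphere, so K should approach 1/R^2 = 0.25

def circumference(s):
    # Circumference (2.23) of a geodesic circle of radius s on the sphere.
    return 2.0 * np.pi * R * np.sin(s / R)

# K = (3/pi) * lim_{s->0} (2*pi*s - C) / s^3, Equation (2.25).
for s in (0.1, 0.01, 0.001):
    K = (3.0/np.pi) * (2.0*np.pi*s - circumference(s)) / s**3
    print(s, K)  # approaches 1/R^2 = 0.25 as s -> 0
```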

Whether we live in three or more dimensions, and whether our space is flat 
or curved, is really a physically testable property of space. Gauss actually pro- 
ceeded to investigate this by measuring the angles in a triangle formed by three 
distant mountain peaks. If space were Euclidean the value would be 180°, but 
if the surface had positive curvature like a two-sphere the angles would add up 
to more than 180°. Correspondingly, the angles on a saddle surface with nega- 
tive curvature would add up to less than 180°. This is illustrated in Figures 2.3 
and 2.4. The precision in Gauss's time was, however, not good enough to exhibit 
any disagreement with the Euclidean value. 



Comoving Coordinates. If the two-sphere with surface (2.17) and radius R were 
a balloon expanding with time, R = R(t), points on the surface of the balloon 
would find their mutual distances scaled by R(t) relative to a time to when the 
radius was R_0 = 1. An observer located at any one point would see all the other
points recede radially. This is exactly how we see distant galaxies except that we 
are not on a two-sphere but, as we shall see, on a spatially curved three-surface 
with cosmic scale factor R(t). 

Suppose this observer wants to make a map of all the points on the expanding 
surface. It is then no longer convenient to use coordinates dependent on R(t) as in 
Equations (2.11) and (2.19), because the map would quickly be outdated. Instead 
it is convenient to factor out the cosmic expansion and replace R by R(t)σ, where σ is a dimensionless comoving coordinate, thus

$$ \mathrm{d}l^2 = R^2(t)\,(\mathrm{d}\sigma^2 + \sigma^2\,\mathrm{d}\theta^2 + \sigma^2\sin^2\theta\,\mathrm{d}\phi^2). \qquad (2.26) $$

Returning to the space we inhabit, we manifestly observe that there are three 
spatial coordinates, so our space must have at least one dimension more than 
a two-sphere. It is easy to generalize from the curved two-dimensional manifold 
(surface) (2.17) embedded in three-space to the curved three-dimensional mani- 
fold (hypersurface) 

$$ x^2 + y^2 + z^2 + w^2 = R^2 \qquad (2.27) $$

of a three-sphere (hypersphere) embedded in Euclidean four-space with coordin- 
ates x, y, z and a fourth fictitious space coordinate w. 







Figure 2.3 The angles in a triangle on a surface with positive curvature add up to more 
than 180°. 




Figure 2.4 The angles in a triangle on a surface with negative curvature add up to less 
than 180°. 



Just as the metric (2.18) could be written without explicit use of z, the metric on the three-sphere (2.27) can be written without use of w,

$$ \mathrm{d}l^2 = \mathrm{d}x^2 + \mathrm{d}y^2 + \mathrm{d}z^2 + \frac{(x\,\mathrm{d}x + y\,\mathrm{d}y + z\,\mathrm{d}z)^2}{R^2 - x^2 - y^2 - z^2}, \qquad (2.28) $$

or, in the more convenient spherical coordinates used in (2.26),

$$ \mathrm{d}l^2 = R^2(t)\left(\frac{\mathrm{d}\sigma^2}{1 - \sigma^2} + \sigma^2\,\mathrm{d}\theta^2 + \sigma^2\sin^2\theta\,\mathrm{d}\phi^2\right). \qquad (2.29) $$

Note that the introduction of the comoving coordinate σ in Equation (2.26) does not affect the parameter R defining the hypersurface in Equation (2.27). No point is preferred on the manifold (2.27), and hence it can describe a spatially homogeneous and isotropic three-dimensional universe in accord with the cosmological principle.

Another example of a curved Riemannian two-space is the surface of a hyperboloid obtained by changing the sign of R^2 in Equation (2.17). The geodesics are hyperbolas, the surface is also unbounded, but in contrast to the spherical surface it is infinite in extent. It can also be generalized to a three-dimensional curved hypersurface, a three-hyperboloid, defined by Equation (2.27) with R^2 replaced by -R^2.

The Gaussian curvature of all geodesic three-surfaces in Euclidean four-space is

$$ K = k/R^2, \qquad (2.30) $$

where the curvature parameter k can take the values +1, 0, -1, corresponding to the three-sphere, flat three-space, and the three-hyperboloid, respectively. Actually, k can take any positive or negative value, because we can always rescale σ to take account of different values of k.



The Robertson-Walker Metric. Let us now include the time coordinate t and the curvature parameter k in Equation (2.29). We then obtain the complete metric derived by Howard Robertson and Arthur Walker in 1934:

$$ \mathrm{d}s^2 = c^2\,\mathrm{d}t^2 - \mathrm{d}l^2 = c^2\,\mathrm{d}t^2 - R(t)^2\left(\frac{\mathrm{d}\sigma^2}{1 - k\sigma^2} + \sigma^2\,\mathrm{d}\theta^2 + \sigma^2\sin^2\theta\,\mathrm{d}\phi^2\right). \qquad (2.31) $$

In the tensor notation of Equation (2.13) the components of the Robertson-Walker metric g are obviously

$$ g_{00} = 1, \qquad g_{11} = -\frac{R^2}{1 - k\sigma^2}, \qquad g_{22} = -R^2\sigma^2, \qquad g_{33} = -R^2\sigma^2\sin^2\theta. \qquad (2.32) $$

Thus if the Universe is homogeneous and isotropic at a given time and has the Robertson-Walker metric (2.32), then it will always remain homogeneous and isotropic, because a galaxy at the point (σ, θ, φ) will always remain at that point, only the scale of spatial distances R(t) changing with time. The displacements will be dσ = dθ = dφ = 0 and the metric equation will reduce to

$$ \mathrm{d}s^2 = c^2\,\mathrm{d}t^2. \qquad (2.33) $$

For this reason one calls such an expanding frame a comoving frame. An observer 
at rest in the comoving frame is called a fundamental observer. If the Universe 
appears to be homogeneous to him/her, it must also be isotropic. But another 




observer located at the same point and in relative motion with respect to the fundamental observer does not see the Universe as isotropic. Thus the comoving frame is really a preferred frame, and a very convenient one, as we shall see later in conjunction with the cosmic background radiation. Let us note here that a fundamental observer may find that not all astronomical bodies recede radially; a body in motion relative to the comoving coordinates (σ, θ, φ) will exhibit peculiar motion in other directions.
Another convenient comoving coordinate is χ, defined by integrating over the radial distance element,

$$ \chi = \int_0^{\sigma}\frac{\mathrm{d}\sigma'}{\sqrt{1 - k\sigma'^2}}. \qquad (2.34) $$

Inserting this into Equation (2.31), the metric can be written

$$ \mathrm{d}s^2 = c^2\,\mathrm{d}t^2 - R^2(t)\,[\mathrm{d}\chi^2 + S_k^2(\chi)(\mathrm{d}\theta^2 + \sin^2\theta\,\mathrm{d}\phi^2)], \qquad (2.35) $$

where

$$ S_k(\chi) = \sigma $$

and

$$ S_1(\chi) = \sin\chi, \qquad S_0(\chi) = \chi, \qquad S_{-1}(\chi) = \sinh\chi. \qquad (2.36) $$

We shall use the metrics (2.31) and (2.35) interchangeably since both offer advantages. During the course of gravitational research many other metrics with different mathematical properties have been studied, but none appears to describe the Universe well. Since they are not needed, we shall not discuss them here.
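The three cases of Equation (2.36) are compactly coded as a single function (a sketch, with k restricted to its canonical values):

```python
import numpy as np

def S(k, chi):
    # The function S_k(chi) of Equation (2.36) relating the comoving
    # coordinates sigma and chi for curvature parameter k = +1, 0, -1.
    if k == +1:
        return np.sin(chi)
    if k == 0:
        return chi
    if k == -1:
        return np.sinh(chi)
    raise ValueError("k must be +1, 0 or -1")

chi = 0.3
for k in (+1, 0, -1):
    print(k, S(k, chi))  # sin(0.3) < 0.3 < sinh(0.3)
```

Integrating dσ/√(1 − kσ²) from 0 to S_k(χ) returns χ for each k, which is exactly the inversion of Equation (2.34).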

Let us briefly digress to define what is sometimes called cosmic time. In an expanding universe the galaxies are all moving away from each other (let us ignore peculiar velocities) with clocks running at different local time, but from our vantage point we would like to have a time value applicable to all of them. One postulates, with Hermann Weyl (1885-1955), that the expansion is so regular that the world lines of the galaxies form a nonintersecting and diverging three-bundle of geodesics, and that one can define spacelike hypersurfaces which are orthogonal to all of them. Each such hypersurface can then be labelled by a constant value of the time coordinate x^0, and using this value one can meaningfully talk about cosmic epochs for the totality of the Universe. This construction in space-time does not imply the choice of a preferred time in conflict with special relativity.

2.3 Relativistic Distance Measures 

Let us consider how to measure distances in our comoving frame in which we are at the origin. The comoving distance from us to a galaxy at comoving coordinates (σ, 0, 0) is not an observable because a distant galaxy can only be observed by the light it emitted at an earlier time t < t_0. In a space-time described by the Robertson-Walker metric the light signal propagates along the geodesic ds^2 = 0. Choosing dθ^2 = dφ^2 = 0, it follows from Equation (2.35) that this geodesic is defined by

$$ c^2\,\mathrm{d}t^2 - R(t)^2\,\mathrm{d}\chi^2 = 0. $$




It follows that χ can be written

$$ \chi = c\int_t^{t_0}\frac{\mathrm{d}t'}{R(t')}. \qquad (2.37) $$

The time integral in Equation (2.37) is called the conformal time.



Proper Distance. Let us now define the proper distance d_P at time t_0 (when the scale is R_0) to the galaxy at (σ, 0, 0). This is a function of σ and of the intrinsic geometry of space-time and the value of k. Integrating the spatial distance dl = |dl| in Equation (2.31) from 0 to d_P we find

$$ d_P = R_0\int_0^{\sigma}\frac{\mathrm{d}\sigma'}{\sqrt{1 - k\sigma'^2}} = \frac{R_0}{\sqrt{k}}\sin^{-1}(\sqrt{k}\,\sigma). \qquad (2.38) $$

For flat space k = 0 we find the expected result d_P = R_0 σ. In a universe with curvature k = +1 and scale R the equation (2.38) becomes

$$ d_P = R\chi = R\sin^{-1}\sigma \qquad\text{or}\qquad \sigma = \sin(d_P/R). $$

As the distance d_P increases from 0 to ½πR, σ also increases from 0 to its maximum value 1. However, when d_P increases from ½πR to πR, σ decreases back to 0. Thus, travelling a distance d_P = πR through the curved three-space brings us to the other end of the Universe. Travelling on from d_P = πR to d_P = 2πR brings us back to the point of departure. In this sense a universe with positive curvature is closed.

Similarly, the area of a three-sphere centred at the origin and going through the galaxy at σ is

$$ A = 4\pi R^2\sigma^2 = 4\pi R^2\sin^2(d_P/R). \qquad (2.39) $$

Clearly, A goes through a maximum when d_P = ½πR, and decreases back to 0 when d_P reaches πR. Note that A/4 equals the area enclosed by the circle formed by intersecting a two-sphere of radius R with a horizontal plane, as shown in Figure 2.2. The intersection with an equatorial plane results in the circle enclosing maximal area, A/4 = πR^2, all other intersections making smaller circles. A plane tangential at either pole has no intersection, thus the corresponding 'circle' has zero area.

The volume of the three-sphere (2.27) can then be written in analogy with Equation (2.22),

$$ V = 2\int_0^{2\pi}\mathrm{d}\phi\int_0^{\pi}\mathrm{d}\theta\int_0^{1}\mathrm{d}\sigma\,\sqrt{\det g_{\mathrm{RW}}}, \qquad (2.40) $$

where the determinant of the spatial part of the Robertson-Walker metric matrix g_RW is now

$$ \det g_{\mathrm{RW}} = \frac{R^6\sigma^4\sin^2\theta}{1 - \sigma^2}. \qquad (2.41) $$

The factor 2 in Equation (2.40) comes from the sign ambiguity of w in Equation (2.27). Both signs represent a complete solution. Inserting Equation (2.41) into Equation (2.40) one finds the volume of the three-sphere:

$$ V = 2\pi^2 R^3. \qquad (2.42) $$
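A midpoint-rule evaluation of the integral (2.40) with the determinant (2.41) reproduces the volume (2.42); this is an illustrative numerical check, not from the book:

```python
import numpy as np

R, n = 1.0, 500_000

# Midpoint grids; the phi integral contributes a factor 2*pi.
sigma = (np.arange(n) + 0.5) / n
theta = (np.arange(n) + 0.5) * np.pi / n

# sqrt(det g_RW) = R^3 sigma^2 sin(theta) / sqrt(1 - sigma^2), Equation (2.41).
I_sigma = np.sum(sigma**2 / np.sqrt(1.0 - sigma**2)) / n    # -> pi/4
I_theta = np.sum(np.sin(theta)) * (np.pi / n)               # -> 2

V = 2.0 * (2.0*np.pi) * R**3 * I_sigma * I_theta            # factor 2: both signs of w

print(V, 2.0*np.pi**2*R**3)  # both close to 19.74
```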




The hyperbolic case is different. Setting k = -1, the function in Equation (2.38) is i^{-1} sin^{-1}(iσ) = sinh^{-1}σ, thus

$$ d_P = R\chi = R\sinh^{-1}\sigma \qquad\text{or}\qquad \sigma = \sinh(d_P/R). \qquad (2.43) $$

Clearly this space is open because σ grows indefinitely with d_P. The area of the three-hyperboloid through the galaxy at σ is

$$ A = 4\pi R^2\sigma^2 = 4\pi R^2\sinh^2(d_P/R). \qquad (2.44) $$

Let us differentiate d_P in Equation (2.38) with respect to time, noting that σ is a constant since it is a comoving coordinate. We then obtain the Hubble flow v experienced by a galaxy at distance d_P:

$$ v = \dot{d}_P = \dot{R}(t)\int_0^{\sigma}\frac{\mathrm{d}\sigma'}{\sqrt{1 - k\sigma'^2}} = \frac{\dot{R}(t)}{R(t)}\,d_P. \qquad (2.45) $$

Thus the Hubble flow is proportional to distance, and Hubble's law emerges in a form more general than Equation (1.19):

$$ H(t) = \frac{\dot{R}(t)}{R(t)} = \frac{v}{d_P}. \qquad (2.46) $$

Recall that v is the velocity of expansion of the space-time geometry. A galaxy 
with zero comoving velocity would appear to have a radial recession velocity v 
because of the expansion. 
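Equation (2.45) states that the time derivative of the proper distance of a comoving galaxy is H(t) d_P. The toy scale factor below (a matter-dominated power law, chosen only for illustration) verifies this with a numerical derivative:

```python
import numpy as np

# Toy scale factor, an assumption for illustration only.
def R(t): return t**(2.0/3.0)

t0, chi = 1.0, 0.8          # chi: the fixed comoving integral in Equation (2.45)

def d_P(t): return R(t) * chi   # proper distance at time t

# Numerical time derivative of d_P versus the Hubble law v = (Rdot/R) d_P.
h = 1e-6
v_numeric = (d_P(t0 + h) - d_P(t0 - h)) / (2.0*h)
H0 = (R(t0 + h) - R(t0 - h)) / (2.0*h) / R(t0)

print(v_numeric, H0 * d_P(t0))  # equal: the Hubble flow is proportional to distance
```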



Particle and Event Horizons. In Equation (1.14) we defined the Hubble radius r_H as the distance reached in one Hubble time, t_H, by a light signal propagating along a straight line in flat, static space. Let us define the particle horizon σ_ph or χ_ph (also object horizon) as the largest comoving spatial distance from which a light signal could have reached us if it was emitted at time t = t_min < t_0. Thus it delimits the size of that part of the Universe that has come into causal contact since time t_min. If the past time t is set equal to the last scattering time (the time when the Universe became transparent to light, and thus the earliest time anything was visible, as we will discuss in a later chapter) the particle horizon delimits the visible Universe. From Equation (2.37),

$$ \chi_{\mathrm{ph}} = c\int_{t_{\mathrm{min}}}^{t_0}\frac{\mathrm{d}t}{R(t)}, \qquad (2.47) $$

and from the notation in Equation (2.36),

$$ \sigma_{\mathrm{ph}} = S_k(\chi_{\mathrm{ph}}). $$

A particle horizon exists if t_min is in the finite past. Clearly the value of σ_ph depends sensitively on the behaviour of the scale of the Universe at that time, R(t_ph).

If k ≥ 0, the proper distance (subscript 'P') to the particle horizon (subscript 'ph') at time t is

$$ d_{\mathrm{P,ph}} = R(t)\,\chi_{\mathrm{ph}}. \qquad (2.48) $$

Note that d_P,ph equals the Hubble radius r_H = c/H_0 when k = 0 and the scale is a constant, R(t) = R. When k = -1 the Universe is open, and d_P,ph cannot be interpreted as a measure of its size.

In an analogous way, the comoving distance σ_eh to the event horizon is defined as the spatially most distant present event from which a world line can ever reach our world line. By 'ever' we mean a finite future time, t_max:

$$ \chi_{\mathrm{eh}} = c\int_{t_0}^{t_{\mathrm{max}}}\frac{\mathrm{d}t}{R(t)}. \qquad (2.49) $$



The particle horizon σ_ph at time t_0 lies on our past light cone, but with time our particle horizon will broaden so that the light cone at t_0 will move inside the light cone at t > t_0 (see Figure 2.1). The event horizon at this moment can only be specified given the time distance to the ultimate future, t_max. Only at t_max will our past light cone encompass the present event horizon. Thus the event horizon is our ultimate particle horizon. Comoving bodies at the particle horizon recede with velocity c = H d_P,ph, but the particle horizon itself recedes even faster. From

$$ \mathrm{d}(H d_{\mathrm{P,ph}})/\mathrm{d}t = \dot{H}\,d_{\mathrm{P,ph}} + H\,\dot{d}_{\mathrm{P,ph}} = 0, $$

and making use of the deceleration parameter q, defined by

$$ q = -\frac{\ddot{a}\,a}{\dot{a}^2} = -\frac{\ddot{a}}{aH^2}, \qquad (2.50) $$

one finds

$$ \dot{d}_{\mathrm{P,ph}} = c(q + 1). \qquad (2.51) $$

Thus when the particle horizon grows with time, bodies which were at spacelike distances at earlier times enter into the light cone.

The integrands in Equations (2.47) and (2.49) are obviously the same, only the integration limits are different, showing that the two horizons correspond to different conformal times. If t_min = 0, the integral in Equation (2.47) may well diverge, in which case there is no particle horizon. Depending on the future behaviour of R(t), an event horizon may or may not exist. If the integral diverges as t → ∞, every event will sooner or later enter the event horizon. The event horizon is then a function of waiting time only, but there exists no event horizon at t = ∞. But if R(t) accelerates, so that distant parts of the Universe recede faster than light, then there will be an ultimate event horizon. We shall see later that R(t) indeed appears to accelerate.
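For an illustrative matter-dominated scale factor R(t) ∝ t^(2/3) (an assumption for the sketch, not a model asserted here), the integral (2.47) with t_min → 0 converges, so a particle horizon exists, while the event-horizon integral (2.49) diverges as t_max → ∞, so every event is eventually seen:

```python
import numpy as np

c, t0 = 1.0, 1.0                      # units with c = 1
def Rscale(t): return t**(2.0/3.0)    # matter-dominated toy model (assumption)

# chi_ph = c * integral of dt/R(t) from t_min -> 0 up to t0, Equation (2.47).
n = 1_000_000
t = (np.arange(n) + 0.5) * t0 / n     # midpoint grid on (0, t0)
chi_ph = c * np.sum(1.0 / Rscale(t)) * (t0 / n)

# Analytically the integral is 3 c t0^(1/3); the proper distance (2.48) is 3 c t0.
print(chi_ph)                  # close to 3
print(Rscale(t0) * chi_ph)     # d_P,ph, close to 3 c t0
```

The event-horizon integrand 1/t^(2/3) integrated from t_0 to t_max grows like t_max^(1/3) without bound, which is the divergence discussed in the text.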



Redshift and Proper Distance. In Equation (1.19) in the previous chapter we parametrized the rate of expansion Ṙ by the Hubble constant H_0. It actually appeared as a dynamical parameter in the lowest-order Taylor expansion of R(t), Equation (1.17). If we allow H(t) to have some mild time dependence, that would correspond to introducing another dynamical parameter along with the next term in the Taylor expansion. Thus adding the second-order term to Equation (1.17), we have for R(t),

$$ R(t) \approx R_0 - \dot{R}_0(t_0 - t) + \tfrac{1}{2}\ddot{R}_0(t_0 - t)^2. \qquad (2.52) $$




Making use of the definition in Equation (2.46), the second-order expansion for the dimensionless scale factor is

$$ a(t) \approx 1 - H_0(t_0 - t) - \tfrac{1}{2}q_0 H_0^2(t_0 - t)^2. \qquad (2.53) $$

As long as the observational information is limited to R_0 and its first time derivative Ṙ_0, no further terms can be added to these expansions. To account for R̈_0, we shall now make use of the present value of the deceleration parameter (2.50), q_0. Then the lowest-order expression for the cosmological redshift, Equation (1.18), can be replaced by

$$ z(t) = a(t)^{-1} - 1 = [1 - H_0(t_0 - t) - \tfrac{1}{2}q_0 H_0^2(t_0 - t)^2]^{-1} - 1 \approx H_0(t_0 - t) + (1 + \tfrac{1}{2}q_0)H_0^2(t_0 - t)^2. $$

This expression can further be inverted to express H_0(t_0 - t) as a function of the redshift to second order:

$$ H_0(t_0 - t) \approx z - (1 + \tfrac{1}{2}q_0)z^2. \qquad (2.54) $$
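The inversion (2.54) can be checked by composing it with the second-order redshift expansion; the residual should be of third order in the look-back time (the parameter values below are arbitrary illustrations):

```python
H0, q0 = 1.0, -0.5
x = 0.02                         # x = H0*(t0 - t), assumed small

# Second-order redshift: z ~ H0(t0 - t) + (1 + q0/2) H0^2 (t0 - t)^2.
z = x + (1.0 + 0.5*q0) * x**2

# Inversion (2.54): H0(t0 - t) ~ z - (1 + q0/2) z^2, accurate to O(z^3).
x_back = z - (1.0 + 0.5*q0) * z**2

print(x, x_back)  # agree to within a term of order x^3
```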

Let us now find the proper distance d_P to an object at redshift z in this approximation. Eliminating χ in Equations (2.37) and (2.38) we have

$$ d_P = c\int_t^{t_0}\frac{\mathrm{d}t'}{a(t')}. $$

We then insert a(t) from Equation (2.53) to lowest order in t_0 - t, obtaining

$$ d_P \approx c\int_t^{t_0}[1 + H_0(t_0 - t')]\,\mathrm{d}t' = c(t_0 - t)[1 + \tfrac{1}{2}H_0(t_0 - t)]. \qquad (2.55) $$

Substituting the expression (2.54) into this yields the sought result:

$$ d_P(z) \approx \frac{c}{H_0}\left(z - \tfrac{1}{2}(1 + q_0)z^2\right). \qquad (2.56) $$

The first term on the right gives Hubble's linear law (1.15), and thus the second term measures deviations from linearity to lowest order. The parameter value q_0 = -1 obviously corresponds to no deviation. The linear law has been used to determine H_0 from galaxies within the Local Supercluster (LSC). On the other hand, one also observes deceleration of the expansion in the local universe due to the lumpiness of matter. For instance, the Local Group clearly feels the overdensity of the Virgo cluster at a distance of about 17 Mpc, falling towards it with a peculiar velocity of about 630 km s^-1 [4]. It has been argued that the peculiar velocities in the LSC cannot be understood without the pull of the neighbouring Hydra-Centaurus supercluster and perhaps a still larger overdensity in the supergalactic plane, a rich cluster (A3627) nicknamed 'the Great Attractor'.

It should be clear from this that one needs to go to even greater distances, beyond the influences of local overdensities, to determine a value for q_0. Within the LSC it is safe to conclude that only the linear term in Hubble's law is necessary.



Equation (2.56) is the conventional formula, which is a good approximation for small z. The approximation obviously deteriorates as z increases, so that d_P(z) attains its maximum at z = 1/(1 + q_0). In Figure 2.5 we plot the function d_P(z) for small values of z.



Redshift and Luminosity Distance. Consider an astronomical object emitting photons isotropically with power or absolute luminosity L. At the luminosity distance d_L from the object we observe only the fraction B_s, its surface brightness, given by the inverse-square distance law

$$ B_s = \frac{L}{4\pi d_L^2}. \qquad (2.57) $$

Let us now find d_L as a function of z in such a way that the Euclidean inverse-square law (2.57) is preserved. If the Universe does not expand and the object is stationary at proper distance d_P, a telescope with area A will receive a fraction A/4πd_P^2 of the photons. But in a universe characterized by an expansion a(t), the object is not stationary, so the energy of photons emitted at time t_e is redshifted by the factor (1 + z) = a^{-1}(t_e). Moreover, the arrival rate of the photons suffers time dilation by another factor (1 + z), often called the energy effect. The apparent brightness B_a is then given by

$$ B_a = \frac{L}{4\pi d_P^2(1 + z)^2}. \qquad (2.58) $$

Equating B_a = B_s one sees that d_L = d_P(1 + z), and making use of the expression (2.56) one obtains

$$ d_L(z) \approx \frac{c}{H_0}\left(z + \tfrac{1}{2}(1 - q_0)z^2\right). \qquad (2.59) $$

In Figure 2.5 we plot the function d_L(z) for small values of z.

Astronomers usually replace L and B by two empirically defined quantities, the absolute magnitude M of a luminous object and the apparent magnitude m. The replacement rule is

$$ m - M = -5 + 5\log d_L, \qquad (2.60) $$

where d_L is expressed in parsecs (pc) and the logarithm is to base 10. For example, if one knows the distance d_L to a galaxy hosting a supernova, its absolute magnitude M can be obtained from observations of its apparent magnitude m.
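Equation (2.60) is simple to apply in code; a minimal sketch (the distances used are illustrative):

```python
import math

def distance_modulus(d_L_pc):
    # m - M = -5 + 5 log10(d_L) with d_L in parsecs, Equation (2.60).
    return -5.0 + 5.0 * math.log10(d_L_pc)

# A standard candle of known absolute magnitude M observed at apparent
# magnitude m yields its luminosity distance; at 10 pc the modulus vanishes
# by construction of the magnitude system.
print(distance_modulus(10.0))    # 0.0
print(distance_modulus(1.0e6))   # 25.0, i.e. an object at 1 Mpc
```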



Parallax Distance. Some of the measurements of H_0 in Figure 1.2 depend directly on the calibration of local distance indicators which form the first rung of a ladder of distance measurements. The distances to relatively nearby stars can be measured by the trigonometrical parallax up to about 30 pc away (see Table A.1 in the appendix for cosmic distances). This is the difference in angular position of a star as seen from Earth when at opposite points in its circumsolar orbit. The parallax distance d_p is defined as

$$ d_p = d_P/\sqrt{1 - k\sigma^2}. \qquad (2.61) $$




It has been found that most stars in the Galaxy for which we know the luminosity from a kinematic distance determination exhibit a relationship between surface temperature or colour and absolute luminosity, the Hertzsprung-Russell relation. These stars are called main-sequence stars and they sit on a fairly well-defined curve in the temperature-luminosity plot. Temperature can be determined from colour: note that astronomers define colour as the logarithm of the ratio of the apparent brightnesses in the red and the blue wavelength bands. Cool stars with surface temperature around 3000 K are infrared, thus the part of their spectrum which is in the visible is dominantly red. Hot stars with surface temperature around 12 000 K are ultraviolet, thus the part of their spectrum which is in the visible is dominantly blue. The Sun, with a surface temperature of 5700 K, radiates mainly in the visible, thus its colour is a blended white, slightly yellow. Most main-sequence stars like our Sun are in a prolonged state of steady burning of hydrogen into helium.

Once this empirical temperature-luminosity relation is established, it can be used the other way around to derive distances to farther main-sequence stars: from their colour one obtains the luminosity which subsequently determines d_L.
By this method one gets a second rung in a ladder of estimates which covers 
distances within our Galaxy. 



Angular Size Distance. Yet another measure of distance is the angular size distance d_A. In Euclidean space an object of size D that is at distance d_A will subtend an angle θ such that

$$ \theta = \tan^{-1}(D/d_A) \approx D/d_A, $$

where the approximation is good for small θ. This can serve as the definition of d_A in Euclidean space. In general relativity we can still use this equation to define a distance measure d_A. From the metric equation (2.31) the diameter of a source of light at comoving distance σ is D = Rσθ, so

$$ d_A = D/\theta = R\sigma = R\,S_k(d_P/R_0). \qquad (2.62) $$

This definition preserves the relation between angular size and distance, a property of Euclidean space. But expansion of the Universe and the changing scale factor R means that as proper distance d_P or redshift z increases, the angular diameter distance initially increases but ultimately decreases. Light rays from the object detected by the observer have been emitted when the proper distance to the object, measured at fixed world time, was small. Because the proper distance between observer and source is increasing faster than the speed of light, emitted light in the direction of the observer is initially moving away from the observer. The redshift dependence of d_A can be found from Equations (2.56) and (2.36) once k is known. In Figure 2.5 we plot d_A for the choice k = 0 when

$$ d_A = \frac{R}{R_0}\,d_P = a\,d_P = \frac{d_P}{1 + z}. \qquad (2.63) $$







Figure 2.5 Approximate distance measures d to second order in z. The solid curve shows the linear function, Equation (1.15); the short-dashed curve the proper distance d_P, Equation (2.56); the medium-dashed curve the luminosity distance d_L, Equation (2.59); and the long-dashed curve the angular size distance d_A, Equation (2.63). The value of q_0 is -0.5.

The k dependence makes d_A a useful quantity to determine cosmological parameters. In particular, d_A is sensitive to certain combinations of well-measured parameters occurring in supernova observations.
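The three second-order distance measures plotted in Figure 2.5 are one-liners in code; the sketch below uses q₀ = −0.5 as in the figure and expresses distances in units of the Hubble radius c/H₀:

```python
c_H0 = 1.0      # distances in units of the Hubble radius c/H0
q0 = -0.5       # the value used in Figure 2.5

def d_P(z):  # proper distance, Equation (2.56)
    return c_H0 * (z - 0.5*(1.0 + q0)*z**2)

def d_L(z):  # luminosity distance, Equation (2.59); equals d_P (1 + z) to this order
    return c_H0 * (z + 0.5*(1.0 - q0)*z**2)

def d_A(z):  # angular size distance for k = 0, Equation (2.63)
    return d_P(z) / (1.0 + z)

for z in (0.1, 0.3, 0.5):
    print(z, d_P(z), d_L(z), d_A(z))   # d_A < d_P < d_L, as in Figure 2.5
```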



Distance Ladder Continued. As the next step on the distance ladder one chooses calibrators which are stars or astronomical systems with specific uniform properties, so-called standard candles. The RR Lyrae stars all have similar absolute luminosities, and they are bright enough to be seen out to about 300 kpc. A very important class of standard candles are the Cepheid stars, whose absolute luminosity oscillates with a constant period P in such a way that log P ∝ 1.3 log L. The period P can be observed with good precision, thus one obtains a value for L.
Cepheids have been found within our Galaxy where the period-luminosity relation 
can be calibrated by distances from trigonometric parallax measurements. This 
permits use of the period-luminosity relation for distances to Cepheids within 
the Large Magellanic Cloud (LMC), our satellite galaxy. At a distance of 55 kpc the 
LMC is the first important extragalactic landmark. 

Globular clusters are gravitationally bound systems of 10^5-10^6 stars forming
a spherical population orbiting the centre of our Galaxy. From their composition 
one concludes that they were created very early in the evolution of the Galaxy. We 
already made use of their ages to estimate the age of the Universe in Section 1.5. 
Globular clusters can also be seen in many other galaxies, and they are visible out 
to 100 Mpc. Within the Galaxy their distance can be measured as described above, 
and one then turns to study the statistical properties of the clusters: the frequency 




of stars of a given luminosity, the mean luminosity, the maximum luminosity, and 
so on. A well-measured cluster then becomes a standard candle with properties 
presumably shared by similar clusters at all distances. Similar statistical indica- 
tors can be used to calibrate clusters of galaxies; in particular the brightest galaxy 
in a cluster is a standard candle useful out to 1 Gpc. 

The next two important landmarks are the distances to the Virgo cluster, which is the closest moderately rich concentration of galaxies, and to the Coma cluster, which is one of the closest clusters of high richness. The Virgo distance has been determined to be 17 Mpc by the observations of galaxies containing several Cepheids, by the Hubble Space Telescope [5]. The Coma is far enough, about 100 Mpc, that its redshift is almost entirely due to the cosmological expansion.

The existence of different methods of calibration covering similar distances 
is a great help in achieving higher precision. The expansion can be verified by 
measuring the surface brightness of standard candles at varying redshifts, the 
Tolman test. If the Universe does indeed expand, the intensity of the photon signal 
at the detector is further reduced by a factor (1 + z)^2 due to an optical aberration
which makes the surface area of the source appear increased. Such tests have 
been done and they do confirm the expansion. 

The Tully-Fisher relation is a very important tool at distances which overlap those calibrated by Cepheids, globular clusters, galaxy clusters and several other methods. This empirical relation expresses correlations between intrinsic properties of whole spiral galaxies. It is observed that their absolute luminosity and their circular rotation velocity v_c are related by

$$ L \propto v_c^4. \qquad (2.64) $$

The Tully-Fisher relation for spiral galaxies is calibrated by nearby spiral galaxies having Cepheid calibrations, and it can then be applied to spiral galaxies out to 100 Mpc.

For more details on distance measurements the reader is referred to the excel- 
lent treatment in the book by Peacock [6]. 

2.4 General Relativity and the Principle of 
Covariance 

Tensors. In four-dimensional space-time all spatial three-vectors have to acquire a zeroth component just like the line element four-vector ds in Equations (2.6) and (2.13). A vector A with components A^μ in a coordinate system x^μ can be expressed in a transformed coordinate system x′^ν as the vector A′ with components

$$ A'^{\nu} = \frac{\partial x'^{\nu}}{\partial x^{\mu}}\,A^{\mu}, \qquad (2.65) $$

where summation over the repeated index μ is implied, just as in Equation (2.13). A vector which transforms in this way is said to be contravariant, which is indicated by the upper index for the components A^μ.




A vector B with components B_μ in a coordinate system x^μ, which transforms in
such a way that

B'_ν = (∂x^μ/∂x'^ν) B_μ,    (2.66)

is called covariant. This is indicated by writing its components with a lower index.

Examples of covariant vectors are the tangent vector to a curve, the normal to a
surface, and the four-gradient of a four-scalar φ, ∂φ/∂x^μ.

In general, tensors can have several contravariant and covariant indices running
over the dimensions of a manifold. In a d-dimensional manifold a tensor with r
indices is of rank r and has d^r components. In particular, an r = 1 tensor is a
vector, and r = 0 corresponds to a scalar. An example of a tensor is the assembly
of the n² components X^μ Y^ν formed as the products (not the scalar product!) of
the n components of the vector X^μ with the n components of the vector Y^ν. We
have already met the rank 2 tensor η_{μν} with components given by Equation (2.14),
and the metric tensor g_{μν}.

Any contravariant vector A with components A^μ can be converted into a covariant
vector by the operation

A_ν = g_{μν} A^μ.

The contravariant metric tensor g^{μν} is the matrix inverse of the covariant g_{μν}:

g^{σμ} g_{μν} = δ^σ_ν.    (2.67)

The upper and lower indices of any tensor can be lowered and raised, respectively,
by operating with g_{μν} or g^{μν} and summing over repeated indices. Thus a covariant
vector A with components A_μ can be converted into a contravariant vector by the
operation

A^ν = g^{μν} A_μ.
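The lowering and raising operations can be sketched numerically with the flat-space metric η_{μν} = diag(1, −1, −1, −1); this is our own illustration, not from the text, and the variable names are our own.

```python
import numpy as np

# Minkowski metric in Cartesian coordinates; it equals its own inverse,
# so it serves as both g_{mu nu} and g^{mu nu}
eta = np.diag([1.0, -1.0, -1.0, -1.0])

A_up = np.array([2.0, 1.0, 0.0, 3.0])       # contravariant components A^mu

# Lowering: A_nu = g_{mu nu} A^mu flips the sign of the spatial components
A_down = np.einsum('mn,m->n', eta, A_up)

# Raising with the inverse metric recovers the original components
A_back = np.einsum('mn,m->n', np.linalg.inv(eta), A_down)

assert np.allclose(A_back, A_up)
```

The `einsum` index string makes the implied summation over the repeated index explicit.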

For a point particle with mass m and total energy

E = γmc²,    (2.68)

according to Einstein's famous relation, one assigns a momentum four-vector P
with components P⁰ = E/c, P¹ = p_x, P² = p_y, P³ = p_z, so that E and the linear
momentum p = γmv become two aspects of the same entity, P = (E/c, p).
The scalar product P² is an invariant related to the mass,

P² = E²/c² − p² = m²c²,    (2.69)

where p² = |p|². For a massless particle like the photon, it follows that the
energy equals the three-momentum times c.
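The invariant (2.69) is easy to verify numerically. The sketch below is our own illustration, not from the text; the particle parameters are chosen arbitrarily for the example.

```python
import math

c = 299_792_458.0          # speed of light, m/s

def four_momentum(m, v):
    """Return (E, p) for a particle of mass m moving along one axis with speed v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    E = gamma * m * c ** 2
    p = gamma * m * v
    return E, p

m, v = 9.109e-31, 0.8 * c          # an electron at 0.8c (illustrative numbers)
E, p = four_momentum(m, v)

# P^2 = E^2/c^2 - p^2 equals m^2 c^2 regardless of the speed v
assert math.isclose(E**2 / c**2 - p**2, (m * c) ** 2, rel_tol=1e-9)

# For a photon (m = 0) the same invariant forces E = pc
E_photon = 3.0e-19                 # an optical photon, in joules (roughly)
p_photon = E_photon / c
assert math.isclose(E_photon**2 / c**2 - p_photon**2, 0.0, abs_tol=1e-60)
```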
Newton's second law in its nonrelativistic form,

F = ma = m dv/dt = dp/dt,    (2.70)

is replaced by the relativistic expression

F = dP/dτ = γ dP/dt = γ((1/c) dE/dt, dp/dt).    (2.71)







Figure 2.6 Parallel transport of a vector around a closed contour on a curved surface.



General Covariance. Although Newton's second law in its relativistic form (2.71)
is invariant under special relativity in any inertial frame, it is not invariant in accelerated frames.
Since this law explicitly involves acceleration, special relativity has to be gener- 
alized somehow, so that observers in accelerated frames can agree on the value 
of acceleration. Thus the next necessary step is to search for quantities which 
remain invariant under an arbitrary acceleration and to formulate the laws of 
physics in terms of these. Such a formulation is called generally covariant. In 
a curved space-time described by the Robertson-Walker metric the approach to 
general covariance is to find appropriate invariants in terms of tensors which have 
the desired properties. 

Since vectors are rank 1 tensors, vector equations may already be covariant. 
However, dynamical laws contain many other quantities that are not tensors, 
in particular space-time derivatives such as d/dr in Equation (2.71). Space-time 
derivatives are not invariants because they imply transporting ds along some 
curve and that makes them coordinate dependent. Therefore we have to start by 
redefining derivatives and replacing them with new covariant derivatives, which 
are tensor quantities. 

To make the space-time derivative of a vector generally covariant one has to 
take into account that the direction of a parallel-transported vector changes in 
terms of the local coordinates along the curve as shown in Figure 2.6. The change 
is certainly some function of the space-time derivatives of the curved space that 
is described by the metric tensor. 



The covariant derivative operator with respect to the proper time τ is denoted
D/Dτ (for a detailed derivation, see, for example, references [1] and [2]). Operating
with it on the momentum four-vector P^μ results in another four-vector:

F^μ = DP^μ/Dτ ≡ dP^μ/dτ + Γ^μ_{σν} P^σ dx^ν/dτ.    (2.72)

The second term contains the changes this vector undergoes when it is parallel
transported an infinitesimal distance c dτ. The quantities Γ^μ_{σν}, called affine
connections, are readily derivable functions of the derivatives of the metric g_{μν} in
curved space-time, but they are not tensors. Their form is

Γ^μ_{σν} = ½ g^{μρ} (∂g_{ρσ}/∂x^ν + ∂g_{ρν}/∂x^σ − ∂g_{σν}/∂x^ρ).    (2.73)
With this definition Newton's second law has been made generally covariant. 

The path of a test body in free fall follows from Equation (2.72) by requiring
that no forces act on the body, F^μ = 0. Making the replacement

P^μ = m dx^μ/dτ,

the relativistic equation of motion of the test body, its geodesic, can be written

d²x^μ/dτ² + Γ^μ_{σν} (dx^σ/dτ)(dx^ν/dτ) = 0.    (2.74)

In an inertial frame the metric is flat, the metric tensor is a constant everywhere,
g_{μν}(x) = η_{μν}, and thus the space-time derivatives of the metric tensor vanish:

∂g_{μν}(x)/∂x^ρ = 0.    (2.75)

It then follows from Equation (2.73) that the affine connections also vanish, and
the covariant derivatives equal the simple space-time derivatives.
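Equation (2.73) is mechanical enough to automate. The sketch below is our own illustration using sympy, not part of the text: it evaluates the affine connections for the metric of a two-sphere of radius a, where they do not vanish, whereas for any constant metric such as η_{μν} every derivative in (2.73), and hence every connection, is zero.

```python
import sympy as sp

def christoffel(g, coords):
    """Affine connections from Eq. (2.73): Gamma^m_{s v}."""
    n = len(coords)
    ginv = g.inv()
    return [[[sp.simplify(sum(
        ginv[m, r] * (sp.diff(g[r, s], coords[v])
                      + sp.diff(g[r, v], coords[s])
                      - sp.diff(g[s, v], coords[r]))
        for r in range(n)) / 2)
        for v in range(n)] for s in range(n)] for m in range(n)]

theta, phi = sp.symbols('theta phi')
a = sp.Symbol('a', positive=True)

# Metric of a two-sphere of radius a: ds^2 = a^2 dtheta^2 + a^2 sin^2(theta) dphi^2
g = sp.diag(a**2, a**2 * sp.sin(theta)**2)
Gamma = christoffel(g, [theta, phi])

# Gamma^theta_{phi phi} equals -sin(theta)*cos(theta),
# Gamma^phi_{theta phi} equals cos(theta)/sin(theta); all others vanish or repeat.
print(Gamma[0][1][1], Gamma[1][0][1])
```

Feeding the same function a constant diagonal metric returns all zeros, confirming the statement above about inertial frames.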

Going from an inertial frame at x to an accelerated frame at x + Δx, the expressions
for g_{μν}(x) and its derivatives at x can be obtained as the Taylor expansions

g_{μν}(x + Δx) = η_{μν} + ½ (∂²g_{μν}(x)/∂x^ρ ∂x^σ) Δx^ρ Δx^σ + ⋯

and

∂g_{μν}(x + Δx)/∂x^ρ = (∂²g_{μν}(x)/∂x^ρ ∂x^σ) Δx^σ + ⋯ .

The description of a curved space-time thus involves second derivatives of g_{μν},
at least. (Only in a very strongly curved space-time would higher derivatives be
needed.)

Recall the definition of the noncovariant Gaussian curvature K in Equation (2.30),
defined on a curved two-dimensional surface. In a higher-dimensional space-time,
curvature has to be defined in terms of more than just one parameter K. It turns
out that curvature is most conveniently defined in terms of the fourth-rank Riemann
tensor

R^α_{βγδ} = ∂Γ^α_{βδ}/∂x^γ − ∂Γ^α_{βγ}/∂x^δ + Γ^α_{γσ} Γ^σ_{βδ} − Γ^α_{δσ} Γ^σ_{βγ}.    (2.76)




In four-space this tensor has 256 components, but most of them vanish or
are not independent because of several symmetries and antisymmetries in the
indices. Moreover, an observer at rest in the comoving Robertson-Walker frame
will only need to refer to spatial curvature. In a spatial n-manifold, R^α_{βγδ} has only
n²(n² − 1)/12 nonvanishing components, thus six in the spatial three-space of
the Robertson-Walker metric. On the two-sphere there is only one component,
which is essentially the Gaussian curvature K.

Another important tool related to curvature is the second-rank Ricci tensor R_{βγ},
obtained from the Riemann tensor by a summing operation over repeated indices,
called contraction:

R_{βγ} = R^α_{βγα} = g^{ασ} R_{σβγα}.    (2.77)

This n²-component tensor is symmetric in the two indices, so it has only n(n + 1)/2
independent components. In four-space the 10 components of the Ricci tensor
lead to Einstein's system of 10 gravitational equations, as we shall see later. Finally,
we may sum over the two indices of the Ricci tensor to obtain the Ricci scalar R:

R = g^{βγ} R_{βγ},    (2.78)
which we will need later. 
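The whole chain Γ → Riemann → Ricci → scalar can be carried out symbolically. The sketch below is our own illustration in sympy, not from the text; sign and contraction conventions differ between books, and here we contract the Riemann tensor on its first and third indices so that the two-sphere comes out with the familiar positive curvature R = 2/a², in line with its single Gaussian-curvature component K = 1/a².

```python
import sympy as sp

def curvature_scalar(g, coords):
    """Ricci scalar of a metric g: connections -> Riemann -> Ricci -> R."""
    n = len(coords)
    ginv = g.inv()
    d = sp.diff
    # Affine connections, Eq. (2.73)
    Gam = [[[sum(ginv[m, r] * (d(g[r, s], coords[v]) + d(g[r, v], coords[s])
                               - d(g[s, v], coords[r])) for r in range(n)) / 2
             for v in range(n)] for s in range(n)] for m in range(n)]
    # Riemann tensor R^a_{b c e}, as in Eq. (2.76)
    def riemann(a, b, c, e):
        return (d(Gam[a][b][e], coords[c]) - d(Gam[a][b][c], coords[e])
                + sum(Gam[a][c][s] * Gam[s][b][e] - Gam[a][e][s] * Gam[s][b][c]
                      for s in range(n)))
    # Ricci tensor by contraction (first and third indices), then R = g^{bc} R_{bc}
    ricci = [[sum(riemann(a, b, a, c) for a in range(n)) for c in range(n)]
             for b in range(n)]
    return sp.simplify(sum(ginv[b, c] * ricci[b][c]
                           for b in range(n) for c in range(n)))

theta, phi = sp.symbols('theta phi')
a = sp.Symbol('a', positive=True)
g_sphere = sp.diag(a**2, a**2 * sp.sin(theta)**2)
print(curvature_scalar(g_sphere, [theta, phi]))   # 2/a**2
```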

2.5 The Principle of Equivalence 

Newton's law of gravitation, Equation (1.28), runs into serious conflict with special
relativity in three different ways. Firstly, there is no obvious way of rewriting it in
terms of invariants, since it only contains scalars. Secondly, it has no explicit time
dependence, so gravitational effects propagate instantaneously to every location
in the Universe (in fact, also infinitely far outside the horizon of the Universe!).

Thirdly, the gravitating mass m_G appearing in Equation (1.28) is totally independent
of the inert mass m appearing in Newton's second law (2.70), as we already
noted, yet for unknown reasons both masses appear to be equal to a precision
of 10⁻¹³ or better (10⁻¹⁸ is expected soon). Clearly a theory is needed to establish
a formal link between them. Mach thought that the inert mass of a body was
somehow linked to the gravitational mass of the whole Universe. Einstein, who
was strongly influenced by the ideas of Mach, called this Mach's principle. In his
early work on general relativity he considered it to be one of the basic, underlying
principles, together with the principles of equivalence and covariance, but in his
later publications he no longer referred to it.

Facing the above shortcomings of Newtonian mechanics and the limitations of 
special relativity Einstein set out on a long and tedious search for a better law of 
gravitation, yet one that would reduce to Newton's law in some limit, of the order 
of the precision of planetary mechanics. 



Weak Principle of Equivalence. Consider the lift in Figure 2.7 moving vertically
in a tall tower (it is easy to imagine a lift to be at rest with respect to an outside




observer fixed to the tower, whereas the more 'modern' example of a spacecraft is 
not at rest when we observe it to be geostationary). A passenger in the lift testing 
the law of gravitation would find that objects dropped to the floor acquire the 
usual gravitational acceleration g when the lift stands still, or moves with con- 
stant speed. However, when the outside observer notes that the lift is accelerating 
upwards, tests inside the lift reveal that the objects acquire an acceleration larger 
than g, and vice versa when the lift is accelerating downwards. In the limit of free 
fall (unpleasant to the passenger) the objects appear weightless, corresponding 
to zero acceleration. 

Let us now replace the lift with a spacecraft with the engines turned off, located 
at some neutral point in space where all gravitational pulls cancel or are negligible: 
a good place is the Lagrange point, where the terrestrial and solar gravitational 
fields cancel. All objects, including the pilot, would appear weightless there. 

Now turning on the engines by remote radio control, the spacecraft could be 
accelerated upwards so that objects on board would acquire an acceleration g 
towards the floor. The pilot would then rightly conclude that 

gravitational pull and local acceleration are equivalent 

and indistinguishable if no outside information is available and if m = m_G. This
conclusion forms the weak principle of equivalence, which states that an observer 
in a gravitational field will not experience free fall as a gravitational effect, but as 
being at rest in a locally accelerated frame. 

A passenger in the lift measuring g could well decide from his local observa- 
tions that Earth's gravitation actually does not exist, but that the lift is accelerat- 
ing radially outwards from Earth. This interpretation does not come into conflict 
with that of another observer on the opposite side of Earth whose frame would 
accelerate in the opposite direction, because that frame is only local to him/her. 

The weak principle of equivalence is already embodied in the Galilean equiva- 
lence principle in mechanics between motion in a uniform gravitational field and 
a uniformly accelerated frame of reference. What Einstein did was to generalize 
this to all of physics, in particular phenomena involving light. 



Strong Principle of Equivalence. The more general formulation is the important 
strong principle of equivalence (SPE), which states that 

to an observer in free fall in a gravitational field the results of all local 
experiments are completely independent of the magnitude of the field. 

In a suitably small lift or spacecraft, curved space-time can always be approxi- 
mated by flat Minkowski space-time. In the gravitational field of Earth the gravi- 
tational acceleration is directed toward its centre. Thus the two test bodies in Fig- 
ure 2.8 with a space-like separation do not actually fall along parallels, but along 
different radii, so that their separation decreases with time. This phenomenon is 
called the tidal effect, or sometimes the tidal force, since the test bodies move as 
if an attractive exchange force acted upon them. The classic example is the tide 
caused by the Moon on the oceans. The force experienced by a body of mass m 







Figure 2.7 The Einstein lift mounted in a non-Euclidean tower. An observer is seen in 
the foreground. 

and diameter d in gravitational interaction with a body of mass M at a distance r 
is proportional to the differential of the force of attraction (1.28) with respect to r. 
Neglecting the geometrical shapes of the bodies, the tidal force is

F_tidal ≈ GMmd/r³.







Figure 2.8 Tidal force F acting between two test bodies falling freely towards the surface 
of a gravitating body. On the right a spherical cluster of small bodies is seen to become 
ellipsoidal on approaching the body. 



Thus parts of m located at smaller distances r feel a stronger force. 
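To get a feel for the magnitude, the sketch below (our own illustration, with rounded constants) evaluates the tidal acceleration F_tidal/m across the Earth due to the Moon and due to the Sun.

```python
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
d_earth = 1.274e7    # diameter of the Earth, m

def tidal_acceleration(M, r, d=d_earth):
    """Differential (tidal) acceleration ~ G M d / r^3 across a body of size d."""
    return G * M * d / r**3

a_moon = tidal_acceleration(M=7.35e22, r=3.84e8)   # Moon at its mean distance
a_sun = tidal_acceleration(M=1.99e30, r=1.50e11)   # Sun at 1 au

# The Moon's tidal pull on the oceans is roughly twice the Sun's, because of
# the r^-3 dependence, even though the Sun's direct pull is far stronger.
print(f"Moon: {a_moon:.1e} m/s^2, Sun: {a_sun:.1e} m/s^2")
```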

An interesting example is offered by a sphere of freely falling particles. Since the 
strength of the gravitational field increases in the direction of fall, the particles in 
the front of the sphere will fall faster than those in the rear. At the same time the 
lateral cross-section of the sphere will shrink due to the tidal effect. As a result, 
the sphere will be focused into an ellipsoid with the same volume. This effect is 
responsible for the gravitational breakup of very nearby massive stars. 

If the tidal effect is too small to be observable, the laboratory can be considered 
to be local. On a larger scale the gravitational field is clearly quite nonuniform, so 
if we make use of the principle of equivalence to replace this field everywhere by 
locally flat frames, we get a patchwork of frames which describe a curved space. 
Since the inhomogeneity of the field is caused by the inhomogeneous distribution 
of gravitating matter, Einstein realized that the space we live in had to be curved, 
and the curvature had to be related to the distribution of matter. 

But Einstein had already seen the necessity of introducing a four-dimensional 
space-time, thus it was not enough to describe space-time in a nonuniform grav- 
itational field by a curved space, time also had to be curved. When moving over 
a patchwork of local and spatially distinct frames, the local time would also have 
to be adjusted from frame to frame. In each frame the strong principle of equiva- 
lence requires that measurements of time would be independent of the strength 
of the gravitational field. 



Falling Photons. Let us return once more to the passenger in the Einstein lift for 
a demonstration of the relation between gravitation and the curvature of space- 
time. Let the lift be in free fall; the passenger would consider that no gravitational 
field is present. Standing by one wall and shining a pocket lamp horizontally 
across the lift, she sees that light travels in a straight path, a geodesic in flat space- 







Figure 2.9 A pocket lamp at 'A' in the Einstein lift is shining horizontally on a point 'B'.
However, an outside observer who sees that the lift is falling freely with acceleration g
concludes that the light ray follows the dashed curve to point 'C'.



time. This is illustrated in Figure 2.9. Thus she concludes that in the absence of a 
gravitational field space-time is flat. 

However, the observer outside the tower sees that the lift has accelerated while 
the light front travels across the lift, and so with respect to the fixed frame of the 
tower he notices that the light front follows a curved path, as shown in Figure 2.9. 
He also sees that the lift is falling in the gravitational field of Earth, and so he 
would conclude that light feels gravitation as if it had mass. He could also phrase 
it differently: light follows a geodesic, and since this light path is curved it must 
imply that space-time is curved in the presence of a gravitational field. 

When the passenger shines monochromatic light of frequency ν vertically up,
it reaches the roof height d in time d/c. In the same time the outside observer
records that the lift has accelerated from, say, v = 0 to gd/c, where g is the
gravitational acceleration on Earth, so that the colour of the light has experienced
a gravitational redshift by the fraction

Δν/ν = gd/c² = GMd/(r²c²).    (2.79)



Thus the photons have lost energy ΔE by climbing the distance d against Earth's
gravitational field,

ΔE = hΔν = −gdhν/c²,    (2.80)



where h is the Planck constant. (Recall that Max Planck (1858-1947) was the inven- 
tor of the quantization of energy; this led to the discovery and development of 
quantum mechanics.) 

If the pocket lamp had been shining electrons of mass m, they would have lost
kinetic energy

ΔE = −gmd    (2.81)

climbing up the distance d. Combining Equations (2.80) and (2.81) we see that the
photons appear to possess mass:

m = hν/c².    (2.82)

Equation (2.79) clearly shows that light emerging from a star with mass M is
redshifted in proportion to M. Thus part of the redshift observed is due to this
gravitational effect. From this we can anticipate the existence of stars with so large
a mass that their gravitational field effectively prohibits radiation from leaving. These
are the black holes to which we shall return in Section 3.4.
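The fractional shift (2.79) is tiny for terrestrial experiments but measurable. The sketch below is our own illustration with rounded constants; the 22.5 m height is roughly that of the tower used in the classic Pound-Rebka experiment.

```python
G = 6.674e-11          # m^3 kg^-1 s^-2
c = 2.998e8            # m/s

# Redshift over a tower of height d in Earth's surface gravity: dnu/nu = g d / c^2
g, d = 9.81, 22.5      # m/s^2 and m
shift_tower = g * d / c**2
print(f"tower: {shift_tower:.2e}")       # ~2.5e-15

# First-order redshift of light escaping from the Sun's surface: GM/(r c^2)
M_sun, r_sun = 1.989e30, 6.957e8
shift_sun = G * M_sun / (r_sun * c**2)
print(f"sun:   {shift_sun:.2e}")         # ~2e-6
```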



Superluminal Photons. A cornerstone in special relativity is that no material 
particle can be accelerated beyond c, no physical effect can be propagated faster 
than c, and no signal can be transmitted faster than c. It is an experimental fact 
that no particle has been found travelling at superluminal speed, but a name 
for such particles has been invented, tachyons. Special relativity does not forbid 
tachyons, but if they exist they cannot be retarded to speeds below c. In this sense 
the speed of light constitutes a two-way barrier: an upper limit for ordinary matter 
and a lower limit for tachyons. 

On quantum scales this may be violated, since the photon may appear to possess
a mass caused by its interaction with virtual electron-positron pairs. In sufficiently 
strong curvature fields, the trajectory of a photon may then be distorted through 
the interaction of gravity on this mass and on the photon's polarization vector, so 
that the photon no longer follows its usual geodesic path through curved space- 
time. The consequence is that SPE may be violated at quantum scales, the photon's 
lightcone is changed, and it may propagate with superluminal velocity. This effect, 
called gravitational birefringence, can occur because general relativity is not con- 
structed to obey quantum theory. It may still modify our understanding of the 
origin of the Universe, when the curvature must have been extreme, and perhaps 
other similar situations like the interior of black holes. For a more detailed dis- 
cussion of this effect, see Shore [7] and references therein. 

2.6 Einstein's Theory of Gravitation 

Realizing that the space we live in was not flat, except locally and approximately, 
Einstein proceeded to combine the principle of equivalence with the requirement 
of general covariance. The inhomogeneous gravitational field near a massive body 
being equivalent to (a patchwork of flat frames describing) a curved space-time, 
the laws of nature (such as the law of gravitation) have to be described by gener- 
ally covariant tensor equations. Thus the law of gravitation has to be a covariant 
relation between mass density and curvature. Because the relativistic field equa- 
tions cannot be derived, Einstein searched for the simplest form such an equation 
may take. 

The starting point is Newton's law of gravitation, because this has to be true 
anyway in the limit of very weak fields. From Equation (1.27), the gravitational 




force experienced by a unit mass at distance r from a body of mass M and density
ρ is a vector in three-space,

F = −GM r̂/r²,

in component form (i = 1, 2, 3)

d²x^i/dt² = F^i = −GMx^i/r³.    (2.83)

Let us define a scalar gravitational potential φ by

φ = −GM/r.

This can be written more compactly as

∇φ = −F.    (2.84)

Integrating the flux of the force F through a spherical surface surrounding M
and using Stokes's theorem, one can show that the potential φ obeys Poisson's
equation

∇²φ = 4πGρ.    (2.85)
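Away from the source the potential φ = −GM/r must satisfy the ρ = 0 case of (2.85), Laplace's equation. A quick symbolic check, our own illustration in sympy:

```python
import sympy as sp

x, y, z, G, M = sp.symbols('x y z G M', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)
phi = -G * M / r

# Laplacian of phi in Cartesian coordinates; it vanishes for r != 0, so the
# whole source term 4*pi*G*rho is concentrated at the origin of the point mass.
laplacian = sum(sp.diff(phi, q, 2) for q in (x, y, z))
print(sp.simplify(laplacian))   # 0
```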



Weak Field Limit. Let us next turn to the relativistic equation of motion (2.74).
In the limit of weak and slowly varying fields, for which all time derivatives of
g_{μν} vanish and the (spatial) velocity components dx^i/dτ are negligible compared
with dx⁰/dτ = c dt/dτ, Equation (2.74) reduces to

d²x^μ/dτ² = −c² (dt/dτ)² Γ^μ_{00}.    (2.86)

From Equation (2.73) these components of the affine connection are

Γ^μ_{00} = −½ g^{μρ} ∂g_{00}/∂x^ρ,

where g_{00} is the time-time component of g_{μν} and the sum over ρ is implied.

In a weak static field the metric is almost that of flat space-time, so we can
approximate g_{μν} by

g_{μν} = η_{μν} + h_{μν},

where h_{μν} is a small increment to η_{μν}. To lowest order in h_{μν} we can then write

Γ^μ_{00} = −½ η^{μρ} ∂h_{00}/∂x^ρ.    (2.87)

Inserting this expression into Equation (2.86), the equations of motion become

d²x/dτ² = −½ c² (dt/dτ)² ∇h_{00},    (2.88)

d²t/dτ² = 0.    (2.89)



56 Relativity 

Dividing Equation (2.88) by (dt/dτ)² we obtain

d²x/dt² = −½ c² ∇h_{00}.    (2.90)

Comparing this with the Newtonian equation of motion (2.83) in the x^i direction
we obtain the value of the time-time component of h_{μν},

h_{00} = 2φ/c²,

from which it follows that

g_{00} = 1 + 2φ/c² = 1 − 2GM/(c²r).    (2.91)
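Equation (2.91) shows how slightly g_{00} departs from its flat-space value of 1 in ordinary gravitational fields. A numerical sketch, our own illustration with rounded constants:

```python
G = 6.674e-11      # m^3 kg^-1 s^-2
c = 2.998e8        # m/s

def h00(M, r):
    """Weak-field perturbation 2*phi/c^2 = -2GM/(c^2 r) at distance r from mass M."""
    return -2 * G * M / (c**2 * r)

# |h00| at the surface of the Earth and of the Sun: both are tiny,
# which is why Newtonian gravity works so well in the Solar System.
print(f"Earth surface: {abs(h00(5.97e24, 6.37e6)):.1e}")   # ~1.4e-9
print(f"Sun surface:   {abs(h00(1.99e30, 6.96e8)):.1e}")   # ~4.2e-6
```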

Stress-Energy Tensor. Let us now turn to the distribution of matter in the Universe.
Suppose that matter on some scale can be considered to be continuously
distributed as in an ideal fluid. The energy density, pressure and shear of a fluid
of nonrelativistic matter are compactly described by the stress-energy tensor T_{μν}
with the following components.

(i) The time-time component T_{00} is the energy density ρc², which includes the
mass as well as internal and kinetic energies.

(ii) The diagonal space-space components T_{ii} are the pressure components in
the i direction p^i, or the momentum components per unit area.

(iii) The time-space components cT_{0i} are the energy flow components per unit
area in the i direction.

(iv) The space-time components cT_{i0} are the momentum densities in the i direction.

(v) The nondiagonal space-space components T_{ij} are the shear components of
the pressure p^i in the j direction.

It is important to note that the stress-energy tensor is of rank 2 and is symmetric,
thus it has 10 independent components in four-space. However, a comoving
observer in the Robertson-Walker space-time following the motion of the fluid
sees no time-space or space-time components. Moreover, we can invoke the cosmological
principle to neglect the anisotropic nondiagonal space-space components.
Thus the stress-energy tensor can be cast into purely diagonal form:

T_{μν} = (p + ρc²) u_μ u_ν/c² − p g_{μν}.    (2.92)

In particular, the time-time component T_{00} is ρc². The conservation of energy
and three-momentum, or equivalently the conservation of four-momentum, can
be written

DT^{μν}/Dx^ν = 0.    (2.93)

Thus the stress-energy tensor is divergence-free.
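The diagonal form (2.92) is easy to verify for a comoving observer, whose four-velocity is u_μ = (c, 0, 0, 0). The sketch below is our own illustration, not from the text; the density and pressure values are arbitrary.

```python
import numpy as np

c = 2.998e8
eta = np.diag([1.0, -1.0, -1.0, -1.0])      # flat metric, signature (+,-,-,-)

def stress_energy(rho, p):
    """Perfect-fluid T_uv of Eq. (2.92) for a comoving observer, u_u = (c,0,0,0)."""
    u = np.array([c, 0.0, 0.0, 0.0])
    return (p + rho * c**2) * np.outer(u, u) / c**2 - p * eta

rho, p = 1.0e-26, 3.0e-11        # illustrative density (kg/m^3) and pressure (Pa)
T = stress_energy(rho, p)

assert np.isclose(T[0, 0], rho * c**2)           # time-time: energy density
assert np.allclose(np.diag(T)[1:], [p, p, p])    # space-space diagonal: pressure
assert np.allclose(T - T.T, 0.0)                 # symmetric, as required
```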




Taking T_{μν} to describe relativistic matter, one has to pay attention to its Lorentz
transformation properties, which differ from the classical case. Under Lorentz
transformations the different components of a tensor do not remain unchanged,
but become dependent on each other. Thus the physics embodied by T_{μν} also
differs: the gravitational field does not depend on mass densities alone, but also
on pressure. All the components of T_{μν} are therefore responsible for warping the
space-time.



Einstein's Equations. We can now put several things together: replacing ρ in the
field equation (2.85) by T_{00}/c² and substituting φ from Equation (2.91) we obtain
a field equation for weak static fields generated by nonrelativistic matter:

∇²g_{00} = (8πG/c⁴) T_{00}.    (2.94)

Let us now assume with Einstein that the right-hand side could describe the
source term of a relativistic field equation of gravitation if we made it generally
covariant. This suggests replacing T_{00} with T_{μν}. In a matter-dominated universe
where the gravitational field is produced by massive stars, and where the pressure
between stars is negligible, the only component of importance is then T_{00}.

The left-hand side of Equation (2.94) is not covariant, but it does contain second
derivatives of the metric, albeit of only one component. Thus it is already related
to curvature. The next step would be to replace ∇²g_{00} by a tensor matching the
properties of T_{μν} on the right-hand side.

(i) It should be of rank two.

(ii) It should be related to the Riemann curvature tensor R^α_{βγσ}. We have already
found a candidate in the Ricci tensor R_{μν} in Equation (2.77).

(iii) It should be symmetric in the two indices. This is true for the Ricci tensor.

(iv) It should be divergence-free in the sense of covariant differentiation. This
is not true for the Ricci tensor, but a divergence-free combination can be
formed with the Ricci scalar R in Equation (2.78):

G_{μν} = R_{μν} − ½ g_{μν} R.    (2.95)

G_{μν} is called the Einstein tensor. It contains only terms which are either quadratic
in the first derivatives of the metric tensor or linear in the second derivatives.
Thus we arrive at Einstein's covariant formula for the law of gravitation:

G_{μν} = (8πG/c⁴) T_{μν}.    (2.96)

The stress-energy tensor T_{μν} is the sum of the stress-energy tensors for the
various components of energy: baryons, radiation, neutrinos, dark matter and
possible other forms. Einstein's formula (2.96) expresses that the energy densities,
pressures and shears embodied by the stress-energy tensor determine the
geometry of space-time, which, in turn, determines the motion of matter.
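The coupling constant in (2.96) is extraordinarily small, which expresses how stiff space-time is: enormous energy densities are needed to curve it appreciably. A one-line evaluation, our own illustration:

```python
import math

G = 6.674e-11       # m^3 kg^-1 s^-2
c = 2.998e8         # m/s

# Einstein's coupling constant 8*pi*G/c^4, in s^2 kg^-1 m^-1
kappa = 8 * math.pi * G / c**4
print(f"{kappa:.2e}")    # ~2.1e-43
```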




For weak stationary fields produced by nonrelativistic matter, G_{00} indeed
reduces to ∇²g_{00}. The Einstein tensor vanishes for flat space-time and in the
absence of matter and pressure, as it should. Thus the problems encountered by 
Newtonian mechanics and discussed at the end of Section 1.6 have been resolved 
in Einstein's theory. The recession velocities of distant galaxies do not exceed 
the speed of light, and effects of gravitational potentials are not felt instantly, 
because the theory is relativistic. The discontinuity of homogeneity and isotropy 
at the boundary of the Newtonian universe has also disappeared because four- 
space is unbounded, and because space-time in general relativity is generated by 
matter and pressure. Thus space-time itself ceases to exist where matter does not 
exist, so there cannot be any boundary between a homogeneous universe and a 
void outside space-time. 



Problems 

1. Starting from the postulates (i) and (ii) in Section 2.1 and the requirement 
that ds 2 in Equation (2.1) should have the same form in the primed and 
unprimed coordinates, derive the linear Lorentz transformation (2.2) and 
the expression (2.3). 

2. The radius of the Galaxy is 3 × 10²⁰ m. How fast would a spaceship have
to travel to cross it in 300 yr as measured on board? Express your result in
terms of γ = 1/√(1 − v²/c²) [8].

3. An observer sees a spaceship coming from the west at a speed of 0.6c and 
a spaceship coming from the east at a speed 0.8c. The western spaceship 
sends a signal with a frequency of 10⁴ Hz in its rest frame. What is the
frequency of the signal as perceived by the observer? If the observer sends 
on the signal immediately upon reception, what is the frequency with which 
the eastern spaceship receives the signal [8]? 

4. If the eastern spaceship in the previous problem were to interpret the signal 
as one that is Doppler shifted because of the relative velocity between the 
western and eastern spaceships, what would the eastern spaceship conclude 
about the relative velocity? Show that the relative velocity must be (v₁ +
v₂)/(1 + v₁v₂/c²), where v₁ and v₂ are the velocities as seen by an outside
observer [8].

5. A source flashes with a frequency of 10¹⁵ Hz. The signal is reflected by a
mirror moving away from the source with speed 10 km s⁻¹. What is the
frequency of the reflected radiation as observed at the source [8]?

6. Suppose that the evolution of the Universe is described by a constant deceleration
parameter q = ½. We observe two galaxies located in opposite directions,
both at proper distance d_P. What is the maximum separation between
the galaxies at which they are still causally connected? Express your result
as a fraction of the distance d_P. What is the observer's particle horizon?




7. Show that the Hubble distance r_H = c/H recedes with radial velocity

ṙ_H = c(1 + q).    (2.97)

8. Is the sphere defined by the Hubble radius r H inside or outside the particle 
horizon? 

9. Calculate whether the following space-time intervals from the origin are 
spacelike, timelike or lightlike: (1, 3, 0, 0); (3, 3, 0, 0); (3, -3, 0, 0); (0, 3, 0, 
0); (3, 1, 0, 0) [1]. 

10. The supernova 1987A explosion in the Large Magellanic Cloud 170 000 light 
years from Earth produced a burst of anti-neutrinos ν̄_e which were observed
in terrestrial detectors. If the anti-neutrinos are massive, their velocity would 
depend on their mass as well as their energy. What is the proper time inter- 
val between the emission, assumed to be instantaneous, and the arrival on 
Earth? Show that in the limit of vanishing mass the proper time interval is 
zero. What information can be derived about the anti-neutrino mass from 
the observation that the energies of the anti-neutrinos ranged from 7 to 
11 MeV, and the arrival times showed a dispersion of 7 s? 

11. The theoretical resolving power of a telescope is given by α = 1.22λ/D,
where λ is the wavelength of the incoming light and D is the diameter of
the mirror. Assuming D = 5 m and λ = 8 × 10⁻⁷ m, determine the largest
distance to a star that can be measured by the parallax method. (In reality,
atmospheric disturbances set tighter limits.)



Chapter Bibliography 

[1] Kenyon, I. R. 1990 General relativity. Oxford University Press, Oxford. 

[2] Peebles, P. J. E. 1993 Principles of physical cosmology. Princeton University Press,
Princeton, NJ.
[3] Pyykko, P. 1988 Chem. Rev. 88, 563. 
[4] Lynden-Bell, D. et al. 1988 Astrophys. J. 326, 19. 
[5] Freedman, W. L. et al. 1994 Nature 371, 757. 

[6] Peacock, J. A. 1999 Cosmological physics. Cambridge University Press, Cambridge. 
[7] Shore, G. M. 2002 Nuclear Phys. B 633, 271. 
[8] Gasiorowicz, S. 1979 The structure of matter. Addison-Wesley, Reading, MA. 



Gravitational 
Phenomena 



In the previous chapters we have gradually constructed a theory for a possible 
description of the Universe. However, the theory rests on many assumptions 
which should be tested before we proceed. For instance, in special relativity space- 
time was assumed to be four dimensional and the velocity of light c constant. 
Moreover, the strong principle of equivalence was assumed to be true in order 
to arrive at general relativity, and Einstein's law of gravitation, Equation (2.96), is 
really based on intuition rather than on facts. Consequently, before proceeding to 
our main task to describe the Universe, we take a look in this chapter at several 
phenomena predicted by general relativity. 

In Section 3.1 we describe the classical tests of general relativity which provided 
convincing evidence early on that the theory was valid. In Section 3.2 we describe 
the precision measurements of properties of a binary pulsar which showed con- 
vincingly that Einstein was right, and which ruled out most competing theories. 

An important gravitational phenomenon is gravitational lensing, encountered 
already in the early observations of starlight deflected by the passage near the 
Sun's limb. The lensing of distant galaxies and quasars by interposed galaxy clus- 
ters, which is discussed in Section 3.3, has become a tool for studying the internal 
structure of clusters. Weak lensing is a tool for studying the large-scale distribu- 
tion of matter in the Universe, and thus for determining some of the cosmological 
parameters. 

If the Einstein equations (2.96) were difficult to derive, it was even more difficult 
to find solutions to this system of 10 coupled nonlinear differential equations. A 
particularly simple case, however, is a single star far away from the gravitational 
influence of other bodies. This is described by the Schwarzschild solution to the 
Einstein equations in Section 3.4. A particularly fascinating case is a black hole, 

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 0 470 84909 6 (cased) ISBN 0 470 84910 X (pbk)




a star of extremely high density. Black holes are certainly the most spectacular 
prediction of general relativity, and they appear to be ubiquitous in the nuclei of 
bright galaxies. 

The existence of gravitational radiation, already demonstrated in the case of 
the binary pulsar, is an important prediction of general relativity. However, it 
remains a great challenge to observe this radiation directly. How to do this will 
be described in Section 3.5. 



3.1 Classical Tests of General Relativity 

The classical testing ground of theories of gravitation, Einstein's among them, is 
celestial mechanics within the Solar System. Ideally one should consider the full 
many-body problem of the Solar System, a task which one can readily characterize 
as impossible. Already the relativistic two-body problem presents extreme math- 
ematical difficulties. Therefore, all the classical tests treated only the one-body 
problem of the massive Sun influencing its surroundings. 

The earliest phenomenon requiring general relativity for its explanation was 
noted in 1859, 20 years before Einstein's birth. The French astronomer Urbain Le
Verrier (1811-1877) found that something was wrong with the planet Mercury's
elongated elliptical orbit. As the innermost planet it feels the solar gravitation 
very strongly, but the orbit is also perturbed by the other planets. The total effect 
is that the elliptical orbit is nonstationary: it precesses slowly around the Sun. The 
locus of Mercury's orbit nearest the Sun, the perihelion, advances 574" (seconds 
of arc) per century. This is calculable using Newtonian mechanics and Newtonian 
gravity, but the result is only 531", 43" too little. Le Verrier, who had already 
successfully predicted the existence of Neptune from perturbations in the orbit of 
Uranus, suspected that the discrepancy was caused by a small undetected planet 
inside Mercury's orbit, which he named Vulcan. That prediction was, however, 
never confirmed. 

With the advent of general relativity the calculations could be remade. This time 
the discrepant 43" were successfully explained by the new theory, which thereby 
gained credibility. This counts as the first one of three 'classical' tests of general 
relativity. For details on this test as well as on most of the subsequent tests, see, 
for example, [1] and [2]. 
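The 43" can be checked with the standard general-relativistic expression for the perihelion advance per orbit, Δφ = 6πGM/[a(1 − e²)c²] (not quoted in the text); the sketch below uses standard orbital elements for Mercury:

```python
import math

G = 6.674e-11        # gravitational constant (SI)
M_SUN = 1.989e30     # solar mass (kg)
c = 2.998e8          # speed of light (m/s)

# Mercury's orbital elements
a = 5.791e10         # semi-major axis (m)
e = 0.2056           # eccentricity
T = 87.97            # orbital period (days)

# GR perihelion advance per orbit (radians)
dphi = 6 * math.pi * G * M_SUN / (a * (1 - e**2) * c**2)

orbits_per_century = 100 * 365.25 / T
advance = dphi * orbits_per_century * 180 / math.pi * 3600  # arcsec/century
print(f"{advance:.1f} arcseconds per century")  # close to the observed 43"
```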

Also, the precessions of Venus and Earth have been put to similar use, and 
within the Solar System many more consistency tests have been done, based on 
measurements of distances and other orbital parameters. 

The second classical test was the predicted deflection of a ray of light passing 
near the Sun. We shall come back to that test in Section 3.3 on gravitational lensing. 

The third classical test was the gravitational shift of atomic spectra, first 
observed by John Evershed in 1927. The frequency of emitted radiation makes 
atoms into clocks. In a strong gravitational field these clocks run slower, so the 
atomic spectra shift towards lower frequencies. This is an effect which we already 
met in Section 2.5: light emerging from a star with mass M is gravitationally red- 
shifted in proportion to M. Evershed observed the line shifts in a cloud of plasma 




ejected by the Sun to an elevation of about 72 000 km above the photosphere 
and found an effect only slightly larger than that predicted by general relativ- 
ity. Modern observations of atoms radiating above the photosphere of the Sun 
have improved on this result, finding agreement with theory at the level of about
2.1 × 10⁻⁶. Similar measurements have been made in the vicinity of more massive
stars such as Sirius.
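The fractional shift quoted above can be checked against the weak-field estimate Δν/ν ≈ GM/(Rc²) for light emitted at the solar photosphere (a sketch of the order of magnitude, not the full calculation for plasma at 72 000 km elevation):

```python
G = 6.674e-11      # gravitational constant (SI)
M_SUN = 1.989e30   # solar mass (kg)
R_SUN = 6.96e8     # solar radius (m)
c = 2.998e8        # speed of light (m/s)

# Weak-field gravitational redshift for light climbing out from radius R_SUN
z = G * M_SUN / (R_SUN * c**2)
print(f"Delta nu / nu = {z:.2e}")  # about 2.1e-6, the level quoted in the text
```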

Since then, many experiments have studied the effects of changes in a gravita- 
tional potential on the rate of a clock or on the frequency of an electromagnetic 
signal. Clocks have been put in towers or have travelled in airplanes, rockets and 
satellites. The so-called 'fourth' test of general relativity, which was conceived by 
I. I. Shapiro in 1964 and carried out successfully in 1971 and later, deserves a special
mention. This is based on the prediction that an electromagnetic wave suffers
a time delay when traversing an increased gravitational potential. 

The fourth test was carried out with the radio telescopes at the Haystack 
and Arecibo observatories by emitting radar signals towards Mercury, Mars and, 
notably, Venus, through the gravitational potential of the Sun. The round-trip time 
delay of the reflected signal was compared with theoretical calculations. Further 
refinement was achieved later by posing the Viking Lander on the Martian surface 
and having it participate in the experiment by receiving and retransmitting the 
radio signal from Earth. This experiment found the ratio of the delay observed to 
the delay predicted by general relativity to be 1.000 ± 0.002. 

Note that the expansion of the Universe and Hubble's linear law (1.15) are not 
tests of general relativity. Objects observed at wavelengths ranging from radio 
to gamma rays are close to isotropically distributed over the sky. Either we are 
close to a centre of spherical symmetry— an anthropocentric view— or the Uni- 
verse is close to homogeneous. In the latter case, and if the distribution of objects 
is expanding so as to preserve homogeneity and isotropy (this is local Lorentz 
invariance), the recession velocities satisfy Hubble's law. 

3.2 The Binary Pulsar 

The most important tests have been carried out on the radio observations of 
pulsars that are members of binary pairs, notably the PSR 1913 + 16 discovered
in 1974 by R. A. Hulse and J. H. Taylor, for which they received the Nobel Prize
in 1993. Pulsars are rapidly rotating, strongly magnetized neutron stars. If the 
magnetic dipole axis does not coincide with the axis of rotation (just as is the case 
with Earth), the star would radiate copious amounts of energy along the magnetic 
dipole axis. These beams at radio frequencies precess around the axis of rotation 
like the searchlights of a beacon. As the beam sweeps past our line of sight, it 
is observable as a pulse with the period of the rotation of the star. Hulse, Taylor 
and collaborators at Arecibo have demonstrated that pulsars are the most stable
clocks we know of in the Universe: the variation is about 10⁻¹⁴ on timescales of
6-12 months. The reason for this stability is the intense self-gravity of a neutron
star, which makes it almost undeformable until, in a binary pair, the very last few 
orbits when the pair coalesce into one star. 




The pulsar PSR 1913 + 16 is a member of a binary system of two neutron stars 
which, in addition to their individual spins, rotate around their common centre 
of mass in a quite eccentric orbit. One of the binary stars is a pulsar, sweeping 
in our direction with a period of 59 ms, and the binary period of the system is 
determined to be 7.751 939 337 h. The radial velocity curve as a function of time
is known, and from this one can deduce the masses m₁ and m₂ of the binary
stars to a precision of 0.0005, as well as the parameters of a Keplerian orbit: the
eccentricity and the semi-major axis. 

But the system does not behave exactly as expected in Newtonian astronomy, 
hence the deviations provide several independent confirmations of general rel- 
ativity. The largest relativistic effect is the apsidal motion of the orbit, which is 
analogous to the advance of the perihelion of Mercury. A second effect is the coun- 
terpart of the relativistic clock correction for an Earth clock. The light travel time 
of signals from the pulsar through the gravitational potential of its companion 
provides a further effect. 

During 17 years of observations the team has observed a steadily accumulating 
change of orbital phase of the binary system, which must be due to the loss of 
orbital rotational energy by the emission of gravitational radiation. This rate of 
change can be calculated since one knows the orbital parameters and the star 
masses so well. The calculations based on Einstein's general relativity agree to 
within 1% with the measurements. This is the first observation of gravitational 
radiation, although it is indirect, since we as yet have no detector with which to 
receive such waves. 

The result is also an important check on competing gravitational theories, sev- 
eral of which have been ruled out. In the near future one can expect even more 
stringent tests, since there are now several further binary pulsar systems known. 
Non-pulsating binary systems, which are much more common, are also of interest 
as sources of high-frequency gravitational radiation to be detected in planned or 
already existing detectors (Section 3.5). 

3.3 Gravitational Lensing 

A consequence of the relativistic phenomenon of light rays bending around grav- 
itating masses is that masses can serve as gravitational lenses if the distances are 
right and the gravitational potential is sufficient. Newton discussed the possibil- 
ity that celestial bodies could deflect light (in 1704), and the astronomer Soldner 
published a paper (in 1804) in which he obtained the correct Newtonian deflec- 
tion angle by the Sun, assuming that light was corpuscular in nature. Einstein 
published the general relativistic calculation of this deflection only in 1936, and 
it was not until 1979 that the effect was first seen by astronomers. 

Recall from Equation (2.82) and Section 2.5 that the Strong Principle of Equiv- 
alence (SPE) causes a photon in a gravitational field to move as if it possessed 
mass. A particle moving with velocity v past a gravitational point potential or a 
spherically symmetric potential U will experience an acceleration in the transver- 
sal direction resulting in a deflection, also predicted by Newtonian dynamics. The 




deflection angle a can be calculated from the (negative) potential U by taking the 
line integral of the transversal gravitational acceleration along the photon's path. 



Weak Lensing. In the thin-lens approximation the light ray propagates in a 
straight line, and the deflection occurs discontinuously at the closest distance. 
The transversal acceleration in the direction y is then

\frac{\mathrm{d}v_y}{\mathrm{d}t} = -\left(1 + \frac{v^2}{c^2}\right)\frac{\partial U}{\partial y}. \qquad (3.1)
In Newtonian dynamics the factor in the brackets is just 1, as for velocities v ≪ c.
This is also true if one invokes SPE alone, which accounts only for the distortion
of time. However, the full theory of general relativity requires the particle to move
along a geodesic in a geometry where space is also distorted by the gravitational
field. For photons with velocity c the factor in brackets is then 2, so that the total
deflection due to both types of distortion is doubled.
The gravitational distortion can be described as an effective refraction index, 

n = 1 - \frac{2}{c^2}U, \qquad (3.2)

so that the speed of light through the gravitational field is reduced to v = c/n.
Different paths suffer different time delays Δt compared with undistorted paths:

\Delta t = -\frac{2}{c^3}\int_{\mathrm{source}}^{\mathrm{observer}} U\,\mathrm{d}l. \qquad (3.3)

In Figure 3.1 we show the geometry of a weak lensing event where the potential 
is weak and the light ray clearly avoids the lensing object. The field equations of 
GR can then be linearized. 

For the special case of a spherically or circularly symmetric gravitating body
such as the Sun with mass M inside a radius b, photons passing at distance b of
closest approach would be deflected by the angle (Problem 2)

\alpha = \frac{4GM}{bc^2}. \qquad (3.4)

For light just grazing the Sun's limb (b = 6.96 × 10⁸ m), the relativistic deflection
is α = 1.750", whereas the nonrelativistic deflection would be precisely half of
this.
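The grazing deflection quoted above can be verified numerically from Equation (3.4); the constants below are standard values:

```python
import math

G = 6.674e-11    # gravitational constant (SI)
M = 1.989e30     # solar mass (kg)
c = 2.998e8      # speed of light (m/s)
b = 6.96e8       # impact parameter: the solar radius (m)

# Relativistic deflection angle, Equation (3.4)
alpha = 4 * G * M / (b * c**2)                  # radians
arcsec = alpha * 180 / math.pi * 3600
print(f"alpha = {arcsec:.3f} arcsec")   # ~1.75"; the Newtonian value is half of this
```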

To observe the deflection one needs stars visible near the Sun, so two conditions 
must be fulfilled. The Sun must be fully eclipsed by the Moon to shut out its 
intense direct light, and the stars must be very bright to be visible through the 
solar corona. Soon after the publication of Einstein's theory in 1917 it was realized
that such a fortuitous occasion to test the theory would occur on 29 May 1919. 
The Royal Astronomical Society then sent out two expeditions to the path of the 
eclipse, one to Sobral in North Brazil and the other one, which included Arthur S. 
Eddington (1882-1944), to the Isle of Principe in the Gulf of Guinea, West Africa. 







Figure 3.1 The geometry of a gravitational lensing event. For a thin lens, deflection
through the small bend angle α may be taken to be instantaneous. The angles θ_I and
θ_S specify the observed and intrinsic positions of the source on the sky, respectively.
Reprinted from [4] with permission of J. A. Peacock.

Both expeditions successfully observed several stars at various distances from 
the eclipsed Sun, and the angles of deflection (reduced to the edge of the Sun) 
were 1.98" ± 0.12" at Sobral and 1.61" ± 0.30" at Principe. This confirmed the 
predicted value of 1.750" with reasonable confidence and excluded the Newtonian 
value of 0.875". The measurements have been repeated many times since then 
during later solar eclipses, with superior results confirming the general relativistic 
prediction. 

The case of starlight passing near the Sun is a special case of a general lensing
event, shown in Figure 3.1. For the Sun the distance from lens to observer is
small so that the angular size distances are D_LS ≈ D_S, which implies that the
actual deflection equals the observed deflection and α = θ_I − θ_S. In the general
case, simple geometry gives the relation between the deflection and the observed
displacement as

\alpha = \frac{D_S}{D_{LS}}(\theta_I - \theta_S). \qquad (3.5)



For a lens composed of an ensemble of point masses, the deflection angle is, in this 
approximation, the vectorial sum of the deflections of the individual point lenses. 
When the light bending can be taken to be occurring instantaneously (over a short 
distance relative to D L $ and D L ), we have a geometrically thin lens, as assumed in 
Figure 3.1. Thick lenses are considerably more complicated to analyse. 



Strong Lensing. The terms weak lensing and strong lensing are not defined very 
precisely. In weak lensing the deflection angles are small and it is relatively easy 




to determine the true positions of the lensed objects in the source plane from 
their displaced positions in the observer plane. Strong lensing implies deflection 
through larger angles by stronger potentials. The images in the observer plane 
can then become quite complicated because there may be more than one null 
geodesic connecting source and observer, so that it is not even always possible to 
find a unique mapping onto the source plane. Strong lensing is a tool for testing 
the distribution of mass in the lens rather than purely a tool for testing general 
relativity. 

If a strongly lensing object can be treated as a point mass and is positioned
exactly on the straight line joining the observer and a spherical or pointlike lensed
object, the lens focuses perfectly and the lensed image is a ring seen around the
lens, called an Einstein ring. The angular size can be calculated by setting the two
expressions for α, Equations (3.4) and (3.5), equal, noting that θ_S = 0 and solving
for θ_I:

\theta_I = \sqrt{\frac{4GM}{c^2}\,\frac{D_{LS}}{D_L D_S}}. \qquad (3.6)

For small M the image is just pointlike. In general, the lenses are galaxy clusters or 
(more rarely) single galaxies that are not spherical and the geometry is not simple, 
so that the Einstein ring breaks up into an odd number of sections of arc. Each arc 
is a complete but distorted picture of the lensed object. Many spectacular lensing 
pictures can be seen on the HST Internet web page [3]. 
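Equation (3.6) is easy to evaluate for a galaxy-scale lens; the sketch below uses lens parameters of the same order as those quoted for Figure 3.2 (these particular values are illustrative, not a fit):

```python
import math

G = 6.674e-11       # gravitational constant (SI)
c = 2.998e8         # speed of light (m/s)
M_SUN = 1.989e30    # solar mass (kg)
GPC = 3.086e25      # gigaparsec (m)

# Illustrative lens parameters, of the order of those in Figure 3.2
M = 7.2e11 * M_SUN                                   # lens mass
D_L, D_S, D_LS = 1.67 * GPC, 1.64 * GPC, 0.96 * GPC  # angular diameter distances

# Einstein ring radius, Equation (3.6)
theta_E = math.sqrt(4 * G * M / c**2 * D_LS / (D_L * D_S))
print(f"theta_E = {theta_E * 180 / math.pi * 3600:.2f} arcsec")
```

The result is of order an arcsecond, matching the image separations seen in galaxy-lensed quasars such as the Einstein Cross.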

In general, the solution of the lensing equation and the formation of multiple
images can be found by jointly solving Equations (3.4) and (3.5). Equation (3.4)
gives the bend angle α_g(M_b, θ_I) as a function of the gravitational potential for a
(symmetric) mass M_b within a sphere of radius b, or the mass seen in projection
within a circle of radius b. From Figure 3.1 we can see that b = θ_I D_L, so inserting
this into Equation (3.4) we have

\alpha_{\mathrm{g}}(M_b, \theta_I) = \frac{4GM_b}{c^2\,\theta_I D_L}. \qquad (3.7)

Equation (3.5) is the fundamental lensing equation giving the geometrical relation
between the bend angle α_I and the source and image positions:

\alpha_I(\theta_S, \theta_I) = \frac{D_S}{D_{LS}}(\theta_I - \theta_S). \qquad (3.8)

There will be an image at an angle θ_I* that simultaneously solves both equations:

\alpha_{\mathrm{g}}(M_b, \theta_I^*) = \alpha_I(\theta_S, \theta_I^*). \qquad (3.9)

For the case of a symmetric (or point-mass) lens, θ_I* will be the two solutions to
the quadratic

\theta_I^2 - \theta_S\theta_I - \frac{4GM_b}{c^2}\,\frac{D_{LS}}{D_L D_S} = 0. \qquad (3.10)

This reduces to the radius of the Einstein ring when θ_S = 0. The angle corresponding
to the radius of the Einstein ring we denote θ_E.








Figure 3.2 Bend angle due to gravitational potential α_g (thick line) and lens geometry α_I
(thin line), zero (dashed line). These figures are drawn for a lens of mass M ≈ 7.2 × 10¹¹ M_⊙,
with D_S ≈ 1.64 Gpc (z = 3.4 in a Friedmann-Robertson-Walker (FRW) cosmology with
Ω_m = 0.3 and Ω_Λ = 0.7), D_L ≈ 1.67 Gpc (z = 0.803), D_LS ≈ 0.96 Gpc and θ_S ≈ 0.13". (Note
that since distances are angular diameter distances, D_S ≠ D_LS + D_L.) In (a) the lens is a point
mass and in (b) it is a spherical mass distribution with density given by Equation (3.11) for
an ideal galaxy. This roughly corresponds to the parameters associated with the 'Einstein
Cross' lensed quasar found by the Hubble telescope, HST 14176 + 5226 [3]. Courtesy of
T. S. Coleman.



Graphically, the solutions to the quadratic are given by the intersection of the
two curves, α_g(θ_I) and α_I(θ_I). The lens equation α_I(θ_I) (3.8) is a straight line, while
the gravitational potential equation α_g(θ_I) (3.7) depends on the mass distribution.
For a point mass, Equation (3.10) describes a pair of hyperbolas and the two curves
are as shown in Figure 3.2(a). Clearly there will always be two images for a point-mass
lens. When the source displacement is zero (θ_S = 0) the images will be at
the positive and negative roots of (3.6), the Einstein ring. When θ_S is large the
positive root will be approximately equal to θ_S, while the negative root will be
close to zero (on the line of sight of the lens). This implies that every point-mass
lens should have images of every source, no matter what the separation in the sky.
Clearly this is not the case. The reason is that the assumption of a point mass and
hyperbolic α_g cannot be maintained for small θ_I.
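For the point-mass case the intersection of the two curves can also be written in closed form: in units of θ_E the quadratic reads θ_I(θ_I − θ_S) = θ_E². A sketch, with hypothetical source offsets:

```python
import math

def point_lens_images(theta_S, theta_E=1.0):
    """Image positions of a point-mass lens, in the same angular units
    as theta_E: the two roots of theta_I**2 - theta_S*theta_I - theta_E**2 = 0."""
    disc = math.sqrt(theta_S**2 + 4 * theta_E**2)
    return ((theta_S + disc) / 2, (theta_S - disc) / 2)

# Source exactly behind the lens: the two roots are +/- theta_E (Einstein ring)
print(point_lens_images(0.0))            # (1.0, -1.0)

# Large offset: one image near the source, one hugging the line of sight
plus, minus = point_lens_images(10.0)
print(round(plus, 3), round(minus, 3))   # 10.099 -0.099
```

The second case shows the formal "two images for any separation" behaviour that the text explains away by the breakdown of the point-mass assumption at small θ_I.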

A more realistic assumption for the mass distribution of a galaxy would be that 
the density is spherically symmetric, with density as a function of distance from 




the galactic core, R, given by

\rho(R) = \rho_{\mathrm{core}}\left(1 + \frac{R^2}{R_{\mathrm{core}}^2}\right)^{-1}. \qquad (3.11)

The density is approximately constant (equal to ρ_core) for small radii (R ≪ R_core)
and falls off as R⁻² for large radii. This roughly matches observed mass-density
distributions (including dark matter) as inferred from galaxy rotational-velocity
observations (see Section 9.3). The mass will grow like R³ for R ≪ R_core and like
R for R ≫ R_core.
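The two growth regimes can be checked with the enclosed-mass integral of (3.11); the closed form below is my own integration of 4πr²ρ(r), in units where the core density and radius are 1:

```python
import math

def mass_enclosed(R, rho_core=1.0, R_core=1.0):
    """Mass inside radius R for rho(R) = rho_core / (1 + (R/R_core)**2);
    integrating 4*pi*r**2*rho(r) gives the closed form below."""
    x = R / R_core
    return 4 * math.pi * rho_core * R_core**3 * (x - math.atan(x))

# Small radii: M grows like R**3 (halving R divides M by ~8)
print(mass_enclosed(0.02) / mass_enclosed(0.01))

# Large radii: M grows like R (doubling R doubles M)
print(mass_enclosed(2000.0) / mass_enclosed(1000.0))
```

The linear growth at large R is what flat galactic rotation curves require, as discussed in Section 9.3.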

For a spherically symmetric mass with density (3.11), the bend angle due to
gravitation, α_g, will be as shown in Figure 3.2(b). In this case the straight line
given by α_I may cross at multiple points, giving multiple images. Figure 3.2(b)
shows three solutions, implying three images. In addition, when the mass is not
circularly symmetric in the sky, all angles must be treated as two-dimensional
vectors on the sky.

For a nonsymmetric mass distribution, the function a g can become quite com- 
plicated (see, for example, [5]). Clearly, the problem quickly becomes complex. An 
example is shown in Figure 3.3, where each light ray from a lensed object propa- 
gates as a spherical wavefront. Bending around the lens then brings these wave- 
fronts into positions of interference and self-interaction, causing the observer to 
see multiple images. The size and shape of the images are therefore changed. 
From Figure 3.3 one understands how the time delay of pairs of images arises:
this is just the time elapsed between different sheets of the same wavefront. In
principle, the time delays in Equation (3.3) provide a potential tool for measuring
H₀, but not with an interesting precision so far.



Surface Brightness and Microlensing. Since photons are neither emitted nor 
absorbed in the process of gravitational light deflection, the surface brightness 
of lensed sources remains unchanged. Changing the size of the cross-section of a 
light bundle therefore only changes the flux observed from a source and magni- 
fies it at fixed surface-brightness level. For a large fraction of distant quasars the 
magnification is estimated to be a factor of 10 or more. This enables objects of 
fainter intrinsic magnitudes to be seen. However, lensing effects are very model 
dependent, so to learn the true magnification effect one needs very detailed infor- 
mation on the structure of the lens. 

If the mass of the lensing object is very small, one will merely observe a magni- 
fication of the brightness of the lensed object. This is called microlensing, and it 
has been used to search for nonluminous objects in the halo of our Galaxy. One 
keeps watch over several million stars in the Large Magellanic Cloud (LMC) and 
records variations in brightness. Some stars are Cepheids, which have an intrinsic 
variability, so they have to be discarded. A star which is small enough not to emit 
visible light and which is moving in the halo is expected to cross the diameter of a 
star in the LMC in a time span ranging from a few days to a couple of months. The 
total light amplification for all images from a point-mass lens and point source 







Figure 3.3 Wavefronts and light rays in the presence of a cluster perturbation. Reprinted 
from [5] with permission of N. Straumann. 



As the relative positions of the source, lens and observer change, 0s will change. 
Simple geometrical arguments give 0$ as a function of the relative velocities, and 
thus the amplification as a function of time (see [4, pp. 106, 118-120]). During the 
occultation, the image of the star behind increases in intensity according to this 
function and subsequently decreases along the time-symmetric curve. A further 
requirement is that observations in different colours should give the same time 
curve. Several such microlensing events have been found in the direction of the 
LMC, and several hundred in the direction of the bulge of our Galaxy. The impli- 
cations of these discoveries for the search of lumps of nonluminous dark matter 
will be discussed later. 
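The standard point-source, point-lens magnification, A(u) = (u² + 2)/(u√(u² + 4)) with u the source offset in units of θ_E, produces exactly the time-symmetric, achromatic curve described above. A sketch with hypothetical event parameters (u_min and the crossing time t_E are illustrative):

```python
import math

def magnification(u):
    """Total magnification of both images of a point-source,
    point-mass lens; u is the source offset in units of theta_E."""
    return (u**2 + 2) / (u * math.sqrt(u**2 + 4))

def u_of_t(t, t0=0.0, u_min=0.1, t_E=30.0):
    """Source-lens offset for uniform relative motion (t and t_E in days)."""
    return math.sqrt(u_min**2 + ((t - t0) / t_E)**2)

# Light curve: symmetric around t0, and achromatic by construction,
# since the geometry knows nothing about the wavelength
for t in (-30, -15, 0, 15, 30):
    print(f"t = {t:+3d} d  A = {magnification(u_of_t(t)):.2f}")
```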



Cosmic Shear. The large-scale distribution of matter in the Universe is inho- 
mogeneous in every direction, so one can expect that everything we observe is 
displaced and distorted by weak lensing. Since the tidal gravitational field, and 
thus the deflection angles, depend neither on the nature of the matter nor on its 
physical state, light deflection probes the total projected mass distribution. Lens- 
ing in infrared light offers the additional advantage of being able to sense distant 
background galaxies, since their number density is higher than in the optical. The 
idea of mapping the matter distribution using the cosmic shear field was already 



proposed (in 1937) by Fritz Zwicky (1898-1974), who also proposed looking for 
lensing by galaxies rather than by stars. 

The ray-tracing process mapping a single source into its image can be expressed
by the Jacobian matrix between the source-plane coordinates and the observer-plane
coordinates:

J(\boldsymbol{\alpha}) = \begin{pmatrix} 1-\kappa-\gamma_1 & -\gamma_2 \\ -\gamma_2 & 1-\kappa+\gamma_1 \end{pmatrix}, \qquad (3.13)

where κ is the convergence of the lens and γ = γ₁ + iγ₂ is the shear. The matrix
J(α) transforms a circular source into an ellipse with semi-axes stretched by the
factor (1 − κ ± |γ|)⁻¹. The convergence affects the isotropic magnification or the
projected mass density divided by the critical density, whereas the shear affects
the shape of the image. The magnification is given by

\mu = (\det J)^{-1} = [(1-\kappa)^2 - \gamma^2]^{-1}. \qquad (3.14)

Clearly, there are locations where μ can become infinite. These points in the source
plane are called caustics and they lie on the intersections of critical curves.
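The effect of (3.13) and (3.14) on a small circular source can be checked numerically; the values of κ, γ₁ and γ₂ below are arbitrary illustrative choices:

```python
import math

def lens_jacobian(kappa, g1, g2):
    """Jacobian of the source-plane -> observer-plane mapping, Eq. (3.13)."""
    return [[1 - kappa - g1, -g2],
            [-g2, 1 - kappa + g1]]

def magnification(kappa, g1, g2):
    """mu = 1/det J = 1/((1-kappa)**2 - |gamma|**2), Eq. (3.14)."""
    J = lens_jacobian(kappa, g1, g2)
    return 1.0 / (J[0][0] * J[1][1] - J[0][1] * J[1][0])

kappa, g1, g2 = 0.1, 0.05, 0.0
gamma = math.hypot(g1, g2)

# Semi-axis stretch factors of the image ellipse, (1 - kappa -+ |gamma|)**-1
a, b = 1 / (1 - kappa - gamma), 1 / (1 - kappa + gamma)
print(f"stretch factors: {a:.4f}, {b:.4f}")

# Their product equals the magnification (the area ratio)
print(f"mu = {magnification(kappa, g1, g2):.4f}  (a*b = {a * b:.4f})")
```

The identity μ = ab makes concrete the statement that κ controls the isotropic magnification while γ controls the shape distortion.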

Background galaxies would be ideal tracers of distortions if they were intrin- 
sically circular. Any measured ellipticity would then directly reflect the action of 
the gravitational tidal field of the interposed lensing matter, and the statistical 
properties of the cosmic-shear field would reflect the statistical properties of the 
matter distribution. But many galaxies are actually intrinsically elliptical, and the 
ellipses are randomly oriented. These intrinsic ellipticities introduce noise into 
the inference of the tidal field from observed ellipticities. 

The sky is covered with a 'wallpaper' of faint and distant blue galaxies, about
20 000-40 000 on an area of the size of the full moon. This fine-grained pattern
of the sky makes statistical weak-lensing studies possible, because it allows the 
detection of the coherent distortions imprinted by gravitational lensing on the 
images of the faint-blue-galaxy population. Several large collaborations carrying 
out such surveys have reported statistically significant observations of cosmic 
shear. In the future one can expect large enough surveys to have the sensitivity to 
identify even invisible (implying X-ray underluminous) galaxy clusters or clumps 
of matter in the field. 

To test general relativity versus alternative theories of gravitation, the best way 
is to probe the gravitational potential far away from visible matter, and weak 
galaxy-galaxy lensing is currently the best approach to this end, because it is 
accurate on scales where all other methods fail, and it is simple if galaxies are 
treated as point masses. Alternative theories predict an isotropic signal where 
general relativity predicts an azimuthal variation. The current knowledge is still
preliminary, but favours anisotropy and thus general relativity.



3.4 Black Holes 

The Schwarzschild Metric. Suppose that we want to measure time t and radial 
elevation r in the vicinity of a spherical star of mass M in isolation from all other 




gravitational influences. Since the gravitational field varies with elevation, these
measurements will surely depend on r. The spherical symmetry guarantees that
the measurements will be the same on all sides of the star, and thus they are
independent of θ and φ. The metric does not then contain dθ and dφ terms. Let
us also consider that we have stable conditions: that the field is static during our
observations, so that the measurements do not depend on t.

The metric is then not flat, but the 00 time-time component and the 11 space-space
component must be modified by some functions of r. Thus it is of the form

\mathrm{d}s^2 = B(r)c^2\mathrm{d}t^2 - A(r)\mathrm{d}r^2, \qquad (3.15)

where B(r) and A(r) have to be found by solving the Einstein equations. 

Far away from the star the space-time is flat. This gives us the asymptotic con- 
ditions 

\lim_{r\to\infty} A(r) = \lim_{r\to\infty} B(r) = 1. \qquad (3.16)

From Equation (2.91) the Newtonian limit of g₀₀ is known. Here B(r) plays the
role of g₀₀; thus we have

B(r) = 1 - \frac{2GM}{rc^2}. \qquad (3.17)

To obtain A(r) from the Einstein equations is more difficult, and we shall not 
go to the trouble of deriving it. The exact solution found by Karl Schwarzschild 
(1873-1916) in 1916 preceded any solution found by Einstein himself. The result 
is simply

A(r) = B(r)^{-1}. \qquad (3.18)

These functions clearly satisfy the asymptotic conditions (3.16). 

Let us introduce the concept of the Schwarzschild radius r_c for a star of mass M,
defined by B(r_c) = 0. It follows that

r_c = \frac{2GM}{c^2}. \qquad (3.19)

The physical meaning of r_c is the following. Consider a body of mass m and radial
velocity v attempting to escape from the gravitational field of the star. To succeed,
the kinetic energy must overcome the gravitational potential. In the nonrelativistic
case the condition for this is

\tfrac{1}{2}mv^2 \geq \frac{GMm}{r}. \qquad (3.20)

The larger the ratio M/r of the star, the higher the required escape velocity.
Ultimately, in the ultra-relativistic case when v = c, only light can escape. At
that point a nonrelativistic treatment is no longer justified. Nevertheless, it just
so happens that the equality in (3.20) fixes the radius of the star correctly to be
precisely r_c, as defined above. Because nothing can escape the interior of r_c, not
even light, John A. Wheeler coined the term black hole for it in 1967. Note that the
escape velocity of objects on Earth is 11 km s⁻¹, on the Sun it is 2.2 × 10⁶ km h⁻¹,
but on a black hole it is c.
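Equations (3.19) and (3.20) are simple enough to evaluate directly; the sketch below reproduces the escape velocities quoted above and gives the Schwarzschild radii of the Sun and the Earth:

```python
import math

G = 6.674e-11   # gravitational constant (SI)
c = 2.998e8     # speed of light (m/s)

def schwarzschild_radius(M):
    """r_c = 2GM/c^2, Equation (3.19)."""
    return 2 * G * M / c**2

def escape_velocity(M, R):
    """Nonrelativistic escape velocity from radius R, from Equation (3.20)."""
    return math.sqrt(2 * G * M / R)

M_SUN, R_SUN = 1.989e30, 6.96e8       # kg, m
M_EARTH, R_EARTH = 5.972e24, 6.371e6  # kg, m

print(f"r_c(Sun)   = {schwarzschild_radius(M_SUN) / 1e3:.2f} km")
print(f"r_c(Earth) = {schwarzschild_radius(M_EARTH) * 1e3:.1f} mm")
print(f"v_esc(Earth) = {escape_velocity(M_EARTH, R_EARTH) / 1e3:.1f} km/s")
print(f"v_esc(Sun)   = {escape_velocity(M_SUN, R_SUN) * 3.6:.2e} km/h")
```

The Sun would have to be compressed to a radius of about 3 km to become a black hole.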



This is the simplest kind of a Schwarzschild black hole, and r_c defines its event
horizon. Inserting r_c into the functions A and B, the Schwarzschild metric becomes

\mathrm{d}\tau^2 = \left(1 - \frac{r_c}{r}\right)\mathrm{d}t^2 - \frac{1}{c^2}\left(1 - \frac{r_c}{r}\right)^{-1}\mathrm{d}r^2. \qquad (3.21)

Falling Into a Black Hole. The Schwarzschild metric has very fascinating consequences.
Consider a spacecraft approaching a black hole with apparent velocity
v = dr/dt in the fixed frame of an outside observer. Light signals from the spacecraft
travel on the light cone, dτ = 0, so that

\frac{\mathrm{d}r}{\mathrm{d}t} = -c\left(1 - \frac{r_c}{r}\right). \qquad (3.22)

Thus the spacecraft appears to slow down with decreasing r, finally coming to
a full stop as it reaches r = r_c.

No information can ever reach the outside observer beyond the event horizon.
The reason for this is the mathematical singularity of dt in the expression

c\,\mathrm{d}t = -\frac{\mathrm{d}r}{1 - r_c/r}. \qquad (3.23)

The time intervals dt between successive crests in the wave of the emitted light 
become longer, reaching infinite wavelength at the singularity. Thus the frequency 
v of the emitted photons goes to zero, and the energy E = hv of the signal van- 
ishes. One cannot receive signals from beyond the event horizon because photons 
cannot have negative energy. Thus the outside observer sees the spacecraft slow- 
ing down and the signals redshifting until they cease completely. 

The pilot in the spacecraft uses local coordinates, so he sees the passage into
the black hole entirely differently. If he started out at distance r₀ with velocity
dr/dt = 0 at time t₀, he will have reached position r at proper time τ, which we
can find by integrating dτ in Equation (3.21) from t₀ to t:

\int_{t_0}^{t}\mathrm{d}\tau = \int_{t_0}^{t}\left[\left(1 - \frac{r_c}{r}\right) - \frac{(\mathrm{d}r/\mathrm{d}t)^2}{c^2(1 - r_c/r)}\right]^{1/2}\mathrm{d}t. \qquad (3.24)

The result depends on dr(t)/dt, which can only be obtained from the equation
of motion. The pilot considers that he can use Newtonian mechanics, so he may
take

\frac{\mathrm{d}r}{\mathrm{d}t} = -c\sqrt{\frac{r_c}{r}}\left(1 - \frac{r_c}{r}\right).

The result is then (Problem 7)

\tau \propto (r_0^{3/2} - r^{3/2}). \qquad (3.25)

However, many other expressions for dr(t)/dt also make the integral in Equation (3.24) converge.

Thus the singularity at r_c does not exist to the pilot; his comoving clock shows
finite time when he reaches the event horizon. Once across r_c the spacecraft



74 Gravitational Phenomena 

reaches the centre of the black hole rapidly. For a hole of mass 10 M⊙ this final pas-
sage lasts about 10⁻⁴ s. The fact that the singularity at r_c does not exist in the local
frame of the spaceship indicates that the horizon at r_c is a mathematical singu-
larity and not a physical singularity. The singularity at the horizon arises because
we are using, in a region of extreme curvature, coordinates most appropriate for 
flat or mildly curved space-time. Alternate coordinates, more appropriate for the 
region of a black hole and in which the horizon does not appear as a singularity, 
were invented by Eddington (1924) and rediscovered by Finkelstein (1958) (cf. [1]). 
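Using Equation (3.25), the pilot's proper time to the horizon is finite, and the final passage from r_c to the centre takes roughly r_c/c. A sketch for an assumed 10 M⊙ hole, starting from an assumed r₀ = 10 r_c:

```python
import math

c = 299_792_458.0
rc = 2.95e4                      # ~Schwarzschild radius of a 10 solar-mass hole (m)

def proper_fall_time(r0, r, rc):
    """Pilot's proper time from r0 down to r, Equation (3.25)."""
    return 2.0 / (3.0 * c * math.sqrt(rc)) * (r0**1.5 - r**1.5)

tau_to_horizon = proper_fall_time(10 * rc, rc, rc)   # finite, unlike coordinate time
final_passage  = rc / c                              # rough time from rc to the centre

print(tau_to_horizon)   # a few milliseconds
print(final_passage)    # ~1e-4 s, as quoted in the text
```

The contrast with the coordinate-time divergence makes the point: the horizon is singular only in the outside observer's coordinates.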

Although this spacecraft voyage is pure science fiction, we may be able to 
observe the collapse of a supernova into a black hole. Just as for the spacecraft, the 
collapse towards the Schwarzschild radius will appear to take a very long time. 
Towards the end of it, the ever-redshifting light will fade and finally disappear 
completely. 

Note from the metric equation (3.21) that inside r_c the time term becomes nega-
tive and the space term positive, thus space becomes timelike and time spacelike.
The implications of this are best understood if one considers the shape of the 
light cone of the spacecraft during its voyage (see Figure 3.4). Outside the event 
horizon the future light cone contains the outside observer, who receives signals 
from the spacecraft. Nearer r_c the light cone narrows and the slope dr/dt steep-
ens because of the approaching singularity in the expression on the right-hand
side of Equation (3.22). The portion of the future space-time which can receive
signals therefore diminishes. 

Since the time and space axes have exchanged positions inside the horizon, 
the future light cone is turned inwards and no part of the outside space-time 
is included in the future light cone. The slope of the light cone is vertical at the 
horizon. Thus it defines, at the same time, a cone of zero opening angle around the 
original time axis, and a cone of 180° around the final time axis, encompassing 
the full space-time of the black hole. As the spacecraft approaches the centre, 
dt/dr decreases, defining a narrowing opening angle which always contains the 
centre. When the centre is reached, the spacecraft no longer has a future. 



Black Hole Properties. At the centre of the hole, the metric (3.21) is singular. 
This represents a physical singularity. One cannot define field equations there, 
so general relativity breaks down, unable to predict what will happen. Some peo- 
ple have speculated that matter or radiation falling in might 'tunnel' through a 
'wormhole' out into another universe. Needless to say, all such ideas are purely 
theoretical speculations with no hope of experimental verification. 

A black hole is a region from which nothing can escape, not even light. Black
holes are very simple objects as seen from outside their event horizon; they have
only three properties: mass, electric charge and angular momentum. Their
size depends only on their mass so that all holes with the same mass are identical 
and exactly spherical, unless they rotate. All other properties possessed by stars, 
such as shape, solid surface, electric dipole moment, magnetic moments, as well 
as any detailed outward structure, are absent. This has led to John Wheeler's 
famous statement 'black holes have no hair'. 




Figure 3.4 The world line of a spacecraft falling into a Schwarzschild black hole. A, the
journey starts at time t₀ when the spacecraft is at a radius r_s, far outside the Schwarzschild
radius r_c, and the observer is at r. A light signal from the spacecraft reaches the observer
at time t_A > t₀ (read time on the right-hand vertical scale!). B, nearer the black hole the
future light cone of the spacecraft tilts inward. A light signal along the arrow will still reach
the observer at a time t_B ≫ t_A. C, near the Schwarzschild radius the light cone narrows
considerably, and a light signal along the arrow reaches the observer only in a very distant
future. D, inside r_c the time and space directions are interchanged, time running from up
to down on the left-hand vertical scale. All light signals will reach the centre of the hole
at r = 0, and none will reach the observer. The arrow points in the backward direction, so
a light signal will reach the centre after the spacecraft. E, the arrow points in the forward
direction of the hole, so that a light signal will reach the centre at time t_E, which is earlier
than t_max, when the spacecraft ends its journey.



Black holes possessing either charge or angular momentum are called Reissner-
Nordström black holes and Kerr black holes, respectively, and they are described
by different metrics. It is natural to consider that matter attracted by a hole has 
angular momentum. Matter can orbit a hole in stable orbits with radii exceed-
ing 3r_c, but if it comes any closer it starts to spiral in towards the horizon, and is
soon lost into the hole with no possibility to escape. Since angular momentum is 
conserved, the infalling matter must speed up the rotation of the hole. However, 
centrifugal forces set a limit on the angular momentum J that a rotating black 
hole can possess: 

J ≤ GM²/c.                                      (3.26)




This does not imply that the hole is ripped into pieces with one increment of rotat- 
ing matter, rather, that it could never have formed in the first place. Remember 
that angular momentum is energy, and energy is curvature, so incremental energy 
is modifying the space-time geometry of the black hole, leading to a smaller event 
horizon. Thus the angular momentum can never overcompensate the gravitational 
binding energy. If it could, there would be no event horizon and we would have 
the case of a visible singularity, also called a naked singularity. Since nobody has 
conceived of what such an object would look like, Stephen Hawking and Roger 
Penrose have conjectured that space-time singularities should always be shielded 
from inspection by an event horizon. This is called the principle of cosmic censor-
ship: in Penrose's words, 'Nature abhors a naked singularity'. The reader might
find further enjoyment reading the book by Hawking and Penrose on this sub- 
ject [6]. 

J. Bekenstein noted in 1973 [7] that there are certain similarities between the size 
of the event horizon of a black hole and entropy. When a star has collapsed to the 
size of its Schwarzschild radius, its event horizon will never change (to an outside
observer) although the collapse continues (see Figure 3.4). Thus entropy s could
be defined as the surface area A of the event horizon times some proportionality
factor,

s = (kc³/4Għ) A,                                (3.27)
the Bekenstein-Hawking formula. For a spherically symmetric black hole of mass 
M the surface area is given by 

A = 8πM²G²/c⁴.                                  (3.28)

A can increase only if the black hole devours more mass from the outside, but 
A can never decrease because no mass will leave the horizon. Inserting this into 
Equation (3.27), entropy comes out proportional to M 2 : 

s = (2πkG/cħ) M².                               (3.29)

Thus two black holes coalesced into one possess more entropy than the two
had individually. This is illustrated in Figure 3.5.
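Because s ∝ M² by Equation (3.29), the merged hole's entropy exceeds the sum of the individual entropies for any masses. A minimal sketch; the proportionality constant cancels out of the comparison, so it is set to 1:

```python
def entropy(M, const=1.0):
    # s = const * M**2; the constant (2*pi*k*G/(c*hbar) in Eq. 3.29) drops
    # out of the inequality, so use 1 for illustration.
    return const * M**2

m1, m2 = 3.0, 5.0     # arbitrary masses in arbitrary units
assert entropy(m1 + m2) > entropy(m1) + entropy(m2)
print(entropy(m1 + m2), entropy(m1) + entropy(m2))   # 64.0 vs 34.0
```

This is just the algebraic fact (M₁ + M₂)² > M₁² + M₂² for positive masses.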



Hawking Radiation. Stephen Hawking has shown [8, 9] that although no light 
can escape from black holes, they can nevertheless radiate if one takes quantum 
mechanics into account. It is a property of the vacuum that particle-antiparticle 
pairs such as e⁻e⁺ are continuously created out of nothing, to disappear in the
next moment by annihilation, which is the inverse process. Since energy cannot 
be created or destroyed, one of the particles must have positive energy and the 
other one an equal amount of negative energy. They form a virtual pair, neither 
one is real in the sense that it could escape to infinity or be observed by us. 

In a strong electromagnetic field the electron e⁻ and the positron e⁺ may
become separated by a Compton wavelength λ of the order of the Schwarzschild
radius r_c. Hawking has shown that there is a small but finite probability for one of



Black Holes 77 




Figure 3.5 When we throw matter into a black hole, or allow two black holes to merge, 
the total area of the event horizons will never decrease: (a) A₂ ≥ A₁, (b) A₃ ≥ A₁ + A₂. From
S. Hawking and R. Penrose [6], copyright 1996 by Princeton University Press. Reprinted by 
permission of Princeton University Press. 



them to 'tunnel' through the barrier of the quantum vacuum and escape the black 
hole horizon as a real particle with positive energy, leaving the negative-energy 
particle inside the horizon of the hole. Since energy must be conserved, the hole 
loses mass in this process, a phenomenon called Hawking radiation. 
The timescale of complete evaporation is 



t ≈ 10 Gyr (M/10¹² kg)³.                        (3.30)



Thus small black holes evaporate fast, whereas heavy ones may have lifetimes 
exceeding the age of the Universe. 
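The cubic mass dependence in Equation (3.30) means a factor-of-ten change in mass changes the lifetime by a factor of a thousand. The normalization used below (10 Gyr at a reference mass of 10¹² kg) follows the reading of Equation (3.30) here and should be treated as illustrative:

```python
def evaporation_time_gyr(M_kg):
    """Hawking evaporation timescale, t ~ 10 Gyr * (M / 1e12 kg)**3 (illustrative)."""
    return 10.0 * (M_kg / 1.0e12) ** 3

print(evaporation_time_gyr(1.0e12))   # 10 Gyr: comparable to the age of the Universe
print(evaporation_time_gyr(1.0e11))   # 0.01 Gyr: small holes evaporate fast
print(evaporation_time_gyr(2.0e30))   # ~8e55 Gyr for a solar-mass hole
```

Stellar-mass holes are thus effectively eternal, while primordial holes much below ~10¹² kg would already have evaporated.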

The analogy with entropy can be used even further. A system in thermal equi- 
librium is characterized by a unique temperature T throughout. When Hawking 
applied quantum theory to black holes, he found that the radiation emitted from 
particle-antiparticle creation at the event horizon is exactly thermal. The rate of 
particle emission is as if the hole were a black body with a unique temperature 
proportional to the gravitational field on the horizon, the Hawking temperature: 



T_H = ħc³/(8πGkM) = 6.15 × 10⁻⁸ (M⊙/M) K.       (3.31)
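Equation (3.31) is easy to verify numerically; with standard SI constants the prefactor works out to about 6.15 × 10⁻⁸ K per solar mass (a sketch):

```python
import math

hbar  = 1.0546e-34     # J s
c     = 299_792_458.0  # m/s
G     = 6.674e-11      # m^3 kg^-1 s^-2
k_B   = 1.3807e-23     # J/K
M_sun = 1.989e30       # kg

def hawking_temperature(M):
    """T_H = hbar c^3 / (8 pi G k M), Equation (3.31)."""
    return hbar * c**3 / (8 * math.pi * G * k_B * M)

print(hawking_temperature(M_sun))          # ~6.2e-8 K, matching Eq. (3.31)
print(hawking_temperature(0.1 * M_sun))    # ten times hotter: T_H scales as 1/M
```

Note the inverse mass dependence: lighter holes are hotter, which is why they evaporate faster.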



Black Hole Creation. Black holes may have been created in the Big Bang, and 
they are probably created naturally in the ageing of stars. As explained in Sec- 
tion 1.4, the gravitational collapse of stars burning light elements to heavier ele- 
ments by nuclear fusion is balanced by the gas pressure. At the end of this cycle 
they become red giants with iron cores. If the mass of the core does not exceed 




the Chandrasekhar limit of 1.4 M⊙, it is stabilized against gravitational collapse
by the electron degeneracy pressure. 

But there is a limit to how densely electrons can be packed; if the mass exceeds 
the Chandrasekhar limit, even the electron degeneracy pressure cannot withstand 
the huge force of gravity, and the star may collapse and explode as a supernova. 
At this stage the core consists of neutrons, protons, some electrons and a few 
neutrinos, forming a proto-neutron star that is stabilized against gravity by the 
degeneracy pressure of the nucleons. Although the theoretical understanding of 
the collapse and explosion is still poor, it is believed that the proto-neutron star 
boils violently during its first second, is then flattened by rotation, and finally 
becomes a spherical neutron star with nuclear density and a solid surface, with a
mass in the range 1.4-1.8 M⊙ and a radius of r ≈ 3r_c.

However, if the mass exceeds the Landau-Oppenheimer-Volkov limit of 2.3 M⊙
(sometimes the limit 1.8 M⊙ is quoted), there is no stability and no bounce, and
the neutron star collapses further to become a black hole. 

The fate of a collapsing spherical star can be illustrated schematically by a 
light cone diagram (see Figure 3.6). Since we cannot draw four-dimensional pic- 
tures, we consider only the evolution in time of an event horizon corresponding 
to the circular equatorial intersection of the star. With increasing time— vertically 
upwards in the figure— the equatorial intersection shrinks and light rays toward 
the future steepen. When the star has shrunk to the size of the Schwarzschild
radius, the equatorial section has become a trapped surface; the future of all light
rays from it is directed towards the world line of the centre of the star. For an
outside observer the event horizon then remains eternally of the same size, but 
an insider would find that the trapped surface of the star is doomed to continue 
its collapse toward the singularity at zero radius. The singularity has no future 
light cone and cannot thus ever be observed. 

The collapse of an isolated heavy star is, however, not the only route to the 
formation of a black hole, and probably not even the most likely one. Binary stars 
are quite common, and a binary system consisting of a neutron star and a red 
giant is a very likely route to a black hole. The neutron star spirals in, accretes
matter from the red giant at a very high rate, about 1 M⊙ per year; photons are
trapped in the flow until the temperature rises above 1 MeV, when neutrinos can
carry off the thermal energy, and the neutron star collapses into a black hole.

Black Hole Candidates. Binary systems consisting of a black hole and either 
a main-sequence star, a neutron star or another hole are likely to give rise to a 
heavier hole. For example, Cyg X-1 is a black hole plus main-sequence star binary
with a hole mass of more than about 10 M⊙, probably the most massive black
hole in a binary observed in the Galaxy. The enormous gravitational pull of the 
hole tears material from its companion star. This material then orbits the hole 
in a Saturn-like accretion disc before disappearing into the hole. Gravity and
friction heat the material in the accretion disc until it emits X-rays. Finally, the 
main-sequence star explodes and becomes a neutron star, which will ultimately 
merge into the hole to form a heavier black hole. 







Figure 3.6 A space-time picture of the collapse of a star to form a black hole, showing 
the event horizon and a closed trapped surface. From S. Hawking and R. Penrose [6], copy- 
right 1996 by Princeton University Press. Reprinted by permission of Princeton University 
Press. 



Although black holes have long been suspected to be the enormously powerful 
engines in quasars residing in active galactic nuclei (AGN), observational evidence 
suggests a ubiquity of holes in the nuclei of all bright galaxies, regardless of their 
activity. Among some 50 hole candidates, the best case is under study in the centre 
of our Galaxy near the radio source Sgr A*. The proof is assembled by measur- 
ing the velocity vectors of many stars within 1 pc of Sgr A* and tracing their 
curved stellar orbits, thereby inferring their acceleration. All the acceleration vec- 
tors intersect at Sgr A*, and the velocity vectors do not decrease with decreasing 




distance to Sgr A*, indicating that the stars move in a very strong gravitational 
field of a practically pointlike source of mass (3.7 ± 1.5) × 10⁶ M⊙. In particular,
10 yr of astrometric imaging has permitted the tracing of two-thirds of a bound 
highly elliptical Keplerian orbit of the star currently closest to Sgr A* [10]. These 
data no longer allow for a central dense cluster of dark stellar objects or a ball of 
massive degenerate fermions. 

From measurements of the velocities of 64 individual stars in the central region
of the dense globular cluster M15 the Hubble Space Telescope inferred in 2002
[11] that M15 must have a central concentration of nonluminous material. If this
is due to a single black hole, then its mass is M = (3.9 ± 2.2) × 10³ M⊙, and is thus
of intermediate size. 

Many other candidates have been spotted by the X-ray signal from the accretion 
of surrounding matter. A plausible black hole has been found in the spiral galaxy 
NGC4258 in the constellation Canes Venatici at a distance of 6.4 Mpc [12, 13]. A 
disk of water vapour (observed by its maser emission) and other molecular mate- 
rial, of mass up to 4 × 10⁶ M⊙, is rotating in a Keplerian orbit near the galaxy's
nucleus at velocities near 1000 km s⁻¹. Such high velocities for a rotating molec-
ular disk would require the gravitational pull of a black hole of mass 3.9 × 10⁷ M⊙.
This galaxy also shows other features expected from black holes acting as central 
engines in active galaxies, such as jets of gas that are twisted into the shape of a 
helix emerging from the nucleus at speeds of 600 km s⁻¹. It can safely be ruled
out that the gravitational field is generated by a cluster of small dark objects such
as stellar black holes, neutron stars, etc., because such clusters are expected to
have a lifetime of less than 10⁸ yr, too short with respect to the Hubble time.



3.5 Gravitational Waves 

Einstein noted in 1916 that his general relativity predicted the existence of gravita- 
tional radiation, but its possible observation is still in the future. As we explained 
in Section 3.2, the slowdown of the binary pulsar PSR 1913+16 is indirect evi-
dence that this system loses its energy by radiating gravitational waves. 

When gravitational waves travel through space-time they produce ripples of 
curvature, an oscillatory stretching and squeezing of space-time analogous to the 
tidal effect of the Moon on Earth. Any matter they pass through will feel this effect. 
Thus a detector for gravitational waves is similar to a detector for the Moon's tidal 
effect, but the waves act at an exceedingly weaker level.

Gravitational radiation travels with the speed of light and traverses matter 
unhindered and unaltered. It may be that the carriers are particles, gravitons, 
with spin J = 2, but it is hard to understand how that could be verified. Perhaps, 
if a theory were found combining gravitation and quantum mechanics, the particle 
nature of gravitational radiation would be more meaningful. 



Tensor Field. In contrast to the electromagnetic field, which is a vector field, 
the gravitational field is a tensor field.

Figure 3.7 The lines of force associated with the two polarizations of a gravitational
wave. Reprinted with permission of A. Abramovici et al. [14]. Copyright 1992 American
Association for the Advancement of Science.

The gravitational analogue of electromagnetic
dipole radiation cannot produce any effect because of the conservation of
momentum: any dipole radiation caused by the acceleration of an astronomical 
object is automatically cancelled by an equal and opposite change in momentum 
in nearby objects. Therefore, gravitational radiation is caused only by nonspher- 
ically symmetric accelerations of mass, which can be related to the quadrupole 
moment, and the oscillatory stretch and squeeze produced is then described by 
two dimensionless wave fields h+ and h x , which are associated with the gravita- 
tional wave's two linear polarizations. If h+ describes the amplitude of polariza- 
tion with respect to the x- and y-axes in the horizontal plane, h x describes the 
independent amplitude of polarization with respect to the rotated axes x + y and 
x - y (cf. Figure 3.7). The relative tidal effect a detector of length L may observe 
is then a linear combination of the two wave fields 



ΔL/L = a₊h₊(t) + aₓhₓ(t) = h(t).                (3.32)



The proper derivation of the quadrupole formula for the energy loss rate
through gravitational radiation of an oscillating body, and of the spatial strain h(t)
caused on bodies elsewhere, cannot be carried out here; it requires general rela-
tivity carried out to high orders of covariant differentiation. This complication
is a benefit, however, because it renders the detection of gravitational radiation 
an extremely sensitive test of general relativity. 

In a Newtonian approximation the strength of the waves from a nonspherical
body of mass M, oscillating size L(t), and quadrupole moment Q(t) ≈ ML² at a
distance r from Earth is

h(t) ≈ (2G/c⁴r) d²Q(t)/dt² ≈ (2G/c⁴r) 2Mv² = (4G/c⁴r) E(t),   (3.33)

where G is the Newtonian constant, v is the internal velocity, and E = ½Mv² is
the nonspherical part of the internal kinetic energy. The factor c⁴ is introduced
only to make h(t) dimensionless.



Sources of Gravitational Waves. From this formula one can work out that a 
nonspherically symmetric supernova collapse at the centre of our Galaxy will give 



82 Gravitational Phenomena 

rise to waves of amplitude h ≈ 10⁻¹⁹ causing a subnuclear stretch and squeeze of
an object 1 km in length by 10⁻¹⁶ m. A spherically symmetric supernova collapse
causes no waves. In a catastrophic event such as the collision of two neutron stars
or two stellar-mass black holes in which E/c² is of the order of one solar mass,
Equation (3.33) gives h ≈ 10⁻²⁰ at the 16 Mpc distance of the Virgo cluster of
galaxies, and h ≈ 10⁻²¹ at a distance of approximately 200 Mpc.
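The h ≈ 10⁻²⁰ figure at the Virgo distance can be reproduced from the last form of Equation (3.33) with E/c² of one solar mass; this is an order-of-magnitude sketch using standard constants and the usual Mpc conversion:

```python
G = 6.674e-11            # m^3 kg^-1 s^-2
c = 299_792_458.0        # m/s
M_sun = 1.989e30         # kg
Mpc = 3.086e22           # m

def strain(E_joule, r_m):
    """h ~ 4 G E / (c^4 r), the last form of Equation (3.33)."""
    return 4 * G * E_joule / (c**4 * r_m)

E = M_sun * c**2                 # nonspherical kinetic energy of ~ one solar mass
h_virgo = strain(E, 16 * Mpc)
h_200   = strain(E, 200 * Mpc)
print(h_virgo)    # ~1e-20
print(h_200)      # ~1e-21
```

The 1/r falloff means a tenfold increase in source distance costs one decade of strain amplitude.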

The signals one can expect to observe in the amplitude range h ≈ 10⁻²¹-10⁻²⁰
with the next generation of detectors are bursts due to the coalescence of neutron-
star binaries during their final minutes and seconds (in the high-frequency band
1-10⁴ Hz), and periodic waves from slowly merging galactic binaries and extra-
galactic massive black hole binaries (low-frequency band 10⁻⁴-10⁻² Hz), which
are stable over hundreds to millions of years. The timing of millisecond binary
pulsars such as PSR 1913+16 belongs to the very low-frequency band of 10⁻⁹-
10⁻⁷ Hz. In this band, processes in the very early Universe may also act as sources.

Merger waves from superheavy black holes with 10⁶ M⊙ mass may be so strong
that both their direction and their amplitude can be determined by monitoring 
the waves while the detector rotates around the Sun. This may permit researchers 
to identify the source with the parallax method and to determine the distance to 
it with high precision. Combined with redshift measurements of the source, one 
could determine not only H₀ but even the deceleration parameter q₀ of the Uni-
verse. Thus the detection of gravitational waves from black holes would go beyond 
testing general relativity to determining fundamental cosmological parameters of 
the Universe. 

The dynamics of a hole-hole binary can be divided into three epochs: inspiral, 
merger and ringdown. The inspiral epoch ends when the holes reach their last 
stable orbit and begin plunging toward each other. Then the merger epoch com- 
mences, during which the binary is a single nonspherical black hole undergoing 
highly nonlinear space-time oscillations and vibrations of large amplitude. In the 
ringdown epoch, the oscillations decay due to gravitational wave emission, leaving 
finally a spherical, spinning black hole. 



Gravitational Wave Detection. Detection with huge metal bars as resonant 
antennas was started by Joseph Weber in 1969. These detectors couple to one 
axis of the eigenmodes of the incoming wave, and one then expects to observe 
a change in the state of oscillation. Today several aligned cryo-
genic bar detectors are in coordinated operation with sensitivities of approxi-
mately 10⁻²¹ Hz⁻¹. The detectors are tuned to see approximately 1 ms bursts
occurring within a bandwidth of the order of 1 Hz. In order to eliminate random 
noise, the data from several detectors are analysed for coincidences. 

To improve the signal-to-noise ratio in the high-frequency range one turns to 
Michelson interferometers with very long arms. The principle is illustrated in Fig- 
ure 3.8. A laser beam is split, travels in two orthogonal directions to mirrors, and 
returns to be recombined and detected. A gravitational wave with either the h+ 
or h x component coinciding with the interferometer axes would lengthen the 
round-trip distance to one mirror and shorten it to the other. This would be 







Figure 3.8 A schematic view of a LIGO-type interferometer. Reprinted with permission 
of A. Abramovici et al. [14]. Copyright 1992 American Association for the Advancement 
of Science. 




Figure 3.9 LISA orbit and position in the Solar System [16]. 



observable as a mismatch of waves upon recombination, and hence as a decrease 
in the observed combined intensity of the laser. For isolation against mechani- 
cal disturbances the optical components are carefully suspended in vacuum. The 
arm lengths in present detectors range from 300 m (TAMA in Japan) and 600 m 
(GEO600 in Germany) to 3 km (VIRGO in Italy) and 4 km (LIGO at two locations in 
the US). Sensitivities of 10⁻²¹-10⁻²² Hz⁻¹ can be reached in the high-frequency
range. The range is limited to less than approximately 10⁴ Hz by photo-electron
shot noise in the components of the interferometer. 
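The experimental challenge is evident from the absolute length change ΔL = hL that a passing wave induces in an interferometer arm (from Equation (3.32), ΔL/L = h). For an assumed 4 km arm and h = 10⁻²¹, both values illustrative:

```python
h = 1.0e-21        # assumed strain amplitude
L = 4.0e3          # assumed arm length in metres (LIGO-like)

dL = h * L         # tidal length change: dL/L = h, Equation (3.32)
print(dL)          # 4e-18 m, far below a proton diameter (~1.7e-15 m)
```

Detecting displacements a thousand times smaller than a nucleus is why seismic isolation, vacuum suspension, and shot-noise limits dominate the instrument design.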

To study sources in the low-frequency range one has to put the interferometer 
into space orbiting Earth. This is necessary in order to avoid low-frequency seis- 
mic noise on the ground and thermally induced medium-frequency motion in the 




atmosphere. The spectacular solution is the detector LISA (Laser Interferometer 
Space Antenna) consisting of three identical spacecraft, forming an equilateral tri- 
angle in space, with sidelength 5 million km, trailing Earth by 20° in a heliocentric 
orbit (cf. Figure 3.9). From each spacecraft a 1 W beam is sent to the two other
remote spacecraft via a telescope, is reflected by a platinum-gold cubic test mass,
and the same telescopes are then used to focus the very weak returning beams.
The interference signals from each arm are combined by on-board computers to 
perform the multiple-arm interferometry required to cancel the phase-noise com- 
mon to all three arms. Fluctuations in the optical paths between the test masses 
can be measured to sub-angstrom precision, which, when combined with the large 
separation of the spacecraft, allows LISA to detect gravitational-wave strain down 
to a level of order 10⁻²³ in one year of observation, with a signal-to-noise ratio
of 5. 

For a review of both experimental and theoretical aspects of all detectors cur- 
rently operating or planned, see [15]. Since LISA is only scheduled to be launched 
in 2011, the interested reader is recommended to follow the ESA and NASA home-
pages [16]. 



Problems 

1. Calculate the gravitational redshift in wavelength for the 769.9 nm potas- 
sium line emitted from the Sun's surface [1]. 

2. Derive the deflection angle (3.4) using Equations (1.26) and (3.5). 

3. Derive Equation (3.12). What are the amplifications of the individual images [4]?

4. A galaxy at z = 0.9 contains a quasar showing redshift z = 1.0. Supposing 
that this additional redshift of the quasar is caused by its proximity to a 
black hole, how many Schwarzschild radii away from the black hole does 
the light emitted by the quasar originate? 

5. Estimate the gravitational redshift z of light escaping from a galaxy of mass 
10⁹ M⊙ after being emitted from a nearby star at a radial distance of 1 kpc
from the centre of the galaxy. (Assume that all matter in the galaxy is con- 
tained within that distance [17].) 

6. Light is emitted horizontally in vacuo near the Earth's surface, and falls 
freely under the action of gravity. Through what vertical distance has it
fallen after travelling 1 km? Calculate the radial coordinate (expressed in 
Schwarzschild radii) at which light travels in a circular path around a body 
of mass M [17]. 

7. Derive Equation (3.25). 




Chapter Bibliography 

[1] Kenyon, I. R. 1990 General relativity. Oxford University Press, Oxford. 

[2] Will, C. M. 1993 Theory and experiment in gravitational physics, revised edn. Cam- 
bridge University Press, Cambridge. 

[3] See http://oposite.stsci.edu/pubinfo/pictures.html. 

[4] Peacock, J. A. 1999 Cosmological physics. Cambridge University Press, Cambridge. 

[5] Straumann, N. 2002 Matter in the Universe, Space Science Series of ISSI, vol. 14. Kluwer. 
(Reprinted from Space Sci. Rev. 100, 29.) 

[6] Hawking, S. and Penrose, R. 1996 The nature of space and time. Princeton University 
Press, Princeton, NJ. 

[7] Bekenstein, J. 1973 Phys. Rev. D 7, 2333. 

[8] Hawking, S. W. 1974 Nature 248, 30. 

[9] Hawking, S. W. 1975 Commun. Math. Phys. 43, 199. 

[10] Schödel, R. et al. 2002 Nature 419, 694.

[11] Gerssen, J. et al. 2002 Astron. J. 124, 3270. 

[12] Miyoshi, M. et al. 1995 Nature 373, 127.

[13] Wilkes, B. J. et al. 1995 Astrophys. J. 455, L13. 

[14] Abramovici, A. et al. 1992 Science 256, 325. 

[15] Maggiore, M. 2000 Phys. Rep. 331, 283. 

[16] See http://sci.esa.int/home/lisa/ and http://lisa.jpl.nasa.gov/. 

[17] Berry, M. V. 1989 Principles of cosmology and gravitation. Adam Hilger, Bristol. 



Cosmological 
Models 



In Section 4.1 we turn to the 'concordance' or Friedmann-Lemaître-Robertson-
Walker (FLRW) model of cosmology, really only a paradigm based on Friedmann's
and Lemaître's equations and the Robertson-Walker metric, which takes both
energy density and pressure to be functions of time in a Copernican universe.
Among the solutions are the Einstein universe and the Einstein-de Sitter universe,
now known to be wrong, as we shall see in Section 4.4, and the currently accepted
Friedmann-Lemaître universe, which includes a positive cosmological constant.

In Section 4.2 we describe the de Sitter model, which does not apply to the 
Universe at large as we see it now, but which may have dominated the very early 
universe, and which may be the correct description for the future. 

In Section 4.3 we study dark energy in the form of quintessence and other alter- 
natives to the cosmological constant, which try to remove some of the problems 
of the Friedmann-Lemaître model.

In Section 4.4 we examine some of the classical tests of cosmological models, 
arguing that what is called a test is more often a case of parameter estimation. 
We also give the values of some of the parameters entering the FLRW model. 



4.1 Friedmann-Lemaître Cosmologies

Let us now turn to our main subject, a model describing our homogeneous and 
isotropic Universe for which the Robertson-Walker metric (2.31) was derived. 
Recall that it could be written as a 4 x 4 tensor with nonvanishing components 
(2.32) on the diagonal only, and that it contained the curvature parameter k. 

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 470 84909 6 (cased) ISBN 470 84910 X (pbk) 




Friedmann's Equations. The stress-energy tensor T_μν entering on the right-hand
side of Einstein's equations (2.96) was given by Equation (2.92) in its diagonal
form. For a comoving observer with velocity four-vector v = (c, 0, 0, 0), the time-
time component T₀₀ and the space-space component T₁₁ are then

T₀₀ = ρc²,   T₁₁ = pR²/(1 − kσ²),               (4.1)

taking g₀₀ and g₁₁ from Equation (2.32). We will not need T₂₂ or T₃₃ because they
just duplicate the results without adding new dynamical information. In what fol-
lows we shall denote mass density by ρ and energy density by ρc². Occasionally,
we shall use ρ_m c² to denote specifically the energy density in all kinds of mat-
ter: baryonic, leptonic and unspecified dark matter. Similarly we use ρ_r c² or ε to
specify the energy density in radiation.

On the left-hand side of Einstein's equations (2.96) we need G₀₀ and G₁₁ to
equate with T₀₀ and T₁₁, respectively. We have all the tools to do it: the metric
components g_μν are inserted into the expression (2.73) for the affine connection,
and subsequently we calculate the components of the Riemann tensor from the
expression (2.76) using the metric components and the affine connections. This
lets us find the Ricci tensor components that we need, R₀₀ and R₁₁, and the Ricci
scalar from Equations (2.77) and (2.78), respectively. All this would require several
pages to work out (see, for example, [1, 2]), so I only give the result:

G₀₀ = 3(cR)⁻²(Ṙ² + kc²),                        (4.2)

G₁₁ = −c⁻²(2R̈R + Ṙ² + kc²)(1 − kσ²)⁻¹.          (4.3)

Here R is the cosmic scale factor R(t), not to be confused with the Ricci scalar R. 
Substituting Equations (4.1)-(4.3) into Einstein's equations (2.96) we obtain two 
distinct dynamical relations for R(t): 

(Ṙ² + kc²)/R² = (8πG/3)ρ,  (4.4)

2R̈/R + (Ṙ² + kc²)/R² = −(8πG/c²)p.  (4.5)

These equations were derived in 1922 by Friedmann, seven years before Hub-
ble's discovery, at a time when even Einstein did not believe in his own equations
because they did not allow the Universe to be static. Friedmann's equations did
not gain general recognition until after his death, when they were confirmed by
an independent derivation (in 1927) by Georges Lemaître (1894-1966). For now
they will constitute the tools for our further investigations.

The expansion (or contraction) of the Universe is inherent to Friedmann's equa-
tions. Equation (4.4) shows that the rate of expansion, Ṙ, increases with the mass
density ρ in the Universe, and Equation (4.5) shows that it may accelerate. Sub-
tracting Equation (4.4) from Equation (4.5) we obtain

2R̈/R = −(8πG/3c²)(ρc² + 3p),  (4.6)



Friedmann-Lemaitre Cosmologies 89 

which shows that the acceleration decreases with increasing pressure and energy 
density, whether mass or radiation energy. Thus it is more appropriate to talk 
about the deceleration of the expansion. 

At our present time t₀, when the mass density is ρ₀, the cosmic scale is R₀, the
Hubble parameter is H₀ and the density parameter Ω₀ is given by Equation (1.35),
Friedmann's equation (4.4) takes the form

Ṙ₀² = (8πG/3)R₀²ρ₀ − kc² = H₀²R₀²Ω₀ − kc²,  (4.7)

which can be rearranged as

kc² = H₀²R₀²(Ω₀ − 1).  (4.8)

It is interesting to note that this reduces to the Newtonian relation (1.35). Thus
the relation between the Robertson-Walker curvature parameter k and the present
density parameter Ω₀ emerges: to the k values +1, 0 and −1 correspond an over-
critical density Ω₀ > 1, a critical density Ω₀ = 1 and an undercritical density
0 < Ω₀ < 1, respectively. The spatially flat case with k = 0 is called the Einstein-
de Sitter universe.

General Solution. When we generalized from the present H₀ to the time-
dependent Hubble parameter H(t) = ȧ/a in Equation (2.46), this also implied
that the critical density (1.31) and the density parameter (1.35) became functions
of time:

ρ_c(t) = 3H²(t)/(8πG),  (4.9)

Ω(t) = ρ(t)/ρ_c(t).  (4.10)

Correspondingly, Equation (4.8) can be generalized to

kc² = H²R²(Ω − 1).  (4.11)

If k ≠ 0, we can eliminate kc² between Equations (4.8) and (4.11) to obtain

H²a²(Ω − 1) = H₀²(Ω₀ − 1),  (4.12)

which we shall make use of later.

It is straightforward to derive a general expression for the solution of Fried-
mann's equation (4.4). Replacing Ṙ/R by ȧ/a, inserting kc² from Equation (4.8),
and replacing (8πG/3)ρ by Ω(a)H₀², Equation (4.4) furnishes a solution for H(a):

H(a) = ȧ/a = H₀√((1 − Ω₀)a⁻² + Ω(a)).  (4.13)

Here we have left the a dependence of Ω(a) unspecified. As we shall see later,
various types of energy densities with different a dependences contribute.

Equation (4.13) can be used to solve for the lookback time t(z)/t₀ or t(a)/t₀
(normalized to the age t₀) since a photon with redshift z was emitted, by writing
it as an integral equation:

t(z)/t₀ = (1/t₀) ∫_{1/(1+z)}^{1} da/(aH(a)).  (4.14)




The age of the Universe at a given redshift is then 1 − t(z)/t₀, in units of t₀. We
shall specify this in more detail later.
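The lookback integral (4.14) is easy to evaluate numerically. The following is a minimal sketch, not from the text, assuming a flat universe with example values Ω_m = 0.27, Ω_λ = 0.73 and Ω(a) built from a matter term scaling as a⁻³ plus a constant λ term:

```python
# A numerical sketch of Equations (4.13) and (4.14); the parameter values and
# the composition of Omega(a) are assumptions for illustration only.
import math

H0 = 70.0  # km/s/Mpc; an assumed example value, it cancels in the ratio below
Omega_m, Omega_l = 0.27, 0.73
Omega0 = Omega_m + Omega_l

def H(a):
    """Hubble parameter H(a), Equation (4.13)."""
    return H0 * math.sqrt((1.0 - Omega0) / a**2 + Omega_m / a**3 + Omega_l)

def lookback_fraction(z, n=100000):
    """t(z)/t0 from Equation (4.14): integrate da/(a H(a)) from 1/(1+z) to 1
    and normalize by the same integral from ~0 to 1, which gives the age t0."""
    def integral(a_lo, a_hi):
        da = (a_hi - a_lo) / n
        return sum(da / (a * H(a))
                   for a in (a_lo + (i + 0.5) * da for i in range(n)))
    return integral(1.0 / (1.0 + z), 1.0) / integral(1e-6, 1.0)
```

For these parameter values, lookback_fraction(1.0) comes out a little above one half: more than half of the age of this model universe has passed since z = 1.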



Einstein Universe. Consider now the static universe cherished by Einstein. This
is defined by R(t) being a constant R₀, so that Ṙ = 0 and R̈ = 0 and the age of the
Universe is infinite. Equations (4.4) and (4.5) then reduce to

kc²/R₀² = (8πG/3)ρ₀ = −(8πG/c²)p₀.  (4.15)

In order that the mass density ρ₀ be positive today, k must be +1. Note that this
leads to the surprising result that the pressure of matter p₀ becomes negative!

Einstein corrected for this in 1917 by introducing a constant Lorentz-invariant
term λg_μν into Equation (2.95), where the cosmological constant λ corresponds to
a tiny correction to the geometry of the Universe. Equation (2.95) then becomes

G_μν = R_μν − ½g_μν R − λg_μν.  (4.16)

In contrast to the first two terms on the right-hand side, the λg_μν term does not
vanish in the limit of flat space-time. With this addition, Friedmann's equations
take the form



(Ṙ² + kc²)/R² − λ/3 = (8πG/3)ρ,  (4.17)

2R̈/R + (Ṙ² + kc²)/R² − λ = −(8πG/c²)p.  (4.18)

A positive value of λ curves space-time so as to counteract the attractive gravi-
tation of matter. Einstein adjusted λ to give a static solution, which is called the
Einstein universe.

The pressure of matter is certainly very small, otherwise one would observe
the galaxies having random motion similar to that of molecules in a gas under
pressure. Thus one can set p = 0 to a good approximation. In the static case,
when R = R₀, Ṙ₀ = 0 and R̈₀ = 0, Equation (4.17) becomes



kc²/R₀² − λ/3 = (8πG/3)ρ₀.

It follows from this that in a spatially flat Universe

ρ_λ ≡ λ/(8πG) = −ρ₀.  (4.19)

But Einstein did not notice that the static solution is unstable: the smallest
imbalance between λ and ρ would make R̈ nonzero, causing the Universe to accel-
erate into expansion or decelerate into contraction. This flaw was only noticed by
Eddington in 1930, soon after Hubble's discovery, in 1929, of the expansion that
caused Einstein to abandon his belief in a static universe and to withdraw the
cosmological constant. This he called 'the greatest blunder of my lifetime'.




The Friedmann-Lemaitre Universe. If the physics of the vacuum looks the same
to any inertial observer, its contribution to the stress-energy tensor is the same as
Einstein's cosmological constant λ, as was noted by Lemaitre. The λ term in Equa-
tion (4.16) is a correction to the geometrical terms in G_μν, but the mathematical
content of Equations (4.17) and (4.18) is not changed if the λ terms are moved to
the right-hand side, where they appear as corrections to the stress-energy tensor
T_μν. Then the physical interpretation is that of an ideal fluid with energy density
ρ_λ = λ/(8πG) and negative pressure p_λ = −ρ_λc². When the cosmological constant
is positive, the gravitational effect of this fluid is a cosmic repulsion counter-
acting the attractive gravitation of matter, whereas a negative λ corresponds to
additional attractive gravitation.

The cosmology described by Equations (4.17) and (4.18) with a positive cos-
mological constant is called the Friedmann-Lemaitre universe. Such a universe is
now strongly supported by observations of a nonvanishing λ (as we shall see in
Section 4.4), so the Einstein-de Sitter universe, which has λ = 0, is a dead end.

In a Friedmann-Lemaitre universe the total density parameter is conveniently
split into a matter term, a radiation term and a cosmological constant term,

Ω₀ = Ω_m + Ω_r + Ω_λ,  (4.20)

where Ω_r and Ω_λ are defined analogously to Equations (1.35) and (4.10) as

Ω_r = ρ_r/ρ_c,  Ω_λ = λ/(8πGρ_c) = λ/(3H₀²).  (4.21)

Ω_m, Ω_r and Ω_λ are important dynamical parameters characterizing the Universe.
If there is a remainder Ω_k = Ω₀ − 1 ≠ 0, this corresponds through Equation (4.8)
to spatial curvature.

Using Equation (4.19) we can find the value of λ corresponding to the attractive
gravitation of the present mass density:

−λ = 8πGρ₀ = 3Ω₀H₀² ≈ 1.3 × 10⁻⁵² c² m⁻².  (4.22)

No quantity in physics this small has ever been known before. It is extremely
uncomfortable that λ has to be fine-tuned to a value which differs from zero only
in the 52nd decimal place (in units of c = 1). It would be much more natural if λ
were exactly zero. This situation is one of the enigmas which will remain with us
to the end of this book. As we shall see, a repulsive gravitation of this kind may
have been of great importance during the first brief moments of the existence of
the Universe, and it appears that the present Universe is again dominated by a
global repulsion.
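The size of this number is easy to check. A small sketch, not from the text, assuming example values H₀ = 70 km s⁻¹ Mpc⁻¹ and Ω₀ = 1 (the precise coefficient in (4.22) depends on the values adopted for H₀ and Ω₀):

```python
# Order-of-magnitude check of Equation (4.22): |lambda| = 3 Omega_0 H0^2,
# expressed as a coefficient of c^2 m^-2. H0 and Omega_0 are assumed examples.
C = 2.998e8              # speed of light, m/s
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec
H0 = 70.0 / KM_PER_MPC   # Hubble parameter in s^-1
Omega_0 = 1.0

lam = 3.0 * Omega_0 * H0**2 / C**2   # in m^-2, i.e. the coefficient of c^2 m^-2
```

The result is of order 10⁻⁵² c² m⁻², confirming the '52nd decimal place' statement above.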



Energy-Momentum Conservation. Let us study the solutions of Friedmann's
equations in the general case of nonvanishing pressure p. Differentiating Equa-
tion (4.4) with respect to time,

d/dt (Ṙ² + kc²) = (8πG/3) d/dt (ρR²),

we obtain an equation of second order in the time derivative:

2ṘR̈ = (8πG/3)(ρ̇R² + 2ρRṘ).  (4.23)

Using Equation (4.6) to cancel the second-order time derivative and multiply-
ing through by c²/R², we obtain a new equation containing only first-order time
derivatives:

ρ̇c² + 3H(ρc² + p) = 0.  (4.24)

This equation does not contain k and λ, but that is not a consequence of hav-
ing started from Equations (4.4) and (4.5). If, instead, we had started from Equa-
tions (4.17) and (4.18), we would have obtained the same equation.

Note that all terms here have the dimension of energy density per time. In other
words, Equation (4.24) states that the change of energy density per time is zero, so
we can interpret it as the local energy conservation law. In a volume element dV,
ρc² dV represents the local decrease of gravitating energy due to the expansion,
whereas p dV is the work done by the expansion. Energy does not have a global
meaning in general relativity, whereas work does. If different forms of energy do
not transform into one another, each form obeys Equation (4.24) separately.

As we have seen, Equation (4.24) follows directly from Friedmann's equations
without any further assumptions. But it can also be derived in another way, per-
haps more transparently. Let the total energy content in a comoving volume R³
be

E = (ρc² + p)R³.

The expansion is adiabatic if there is no net inflow or outflow of energy, so that

dE/dt = d/dt [(ρc² + p)R³] = 0.  (4.25)

If p does not vary with time, changes in ρ and R compensate and Equation (4.24)
immediately follows.
Equation (4.24) can easily be integrated,

∫_{ρ₀}^{ρ(t)} c² dρ/(ρc² + p) = −3 ∫_{R₀}^{R(t)} dR/R,  (4.26)

if we know the relation between energy density and pressure: the equation of
state of the Universe.



Entropy Conservation and the Equation of State. In contrast, the law of conser-
vation of entropy S is not implied by Friedmann's equations; it has to be assumed
specifically, as we shall demonstrate in Section 5.2,

Ṡ = 0.  (4.27)

Then we can make an ansatz for the equation of state: let p be proportional to
ρc² with some proportionality factor w which is a constant in time,

p = wρc².  (4.28)




In fact, one can show that this is the most general equation of state in a space-
time with Robertson-Walker metric. Inserting this ansatz into the integral in Equa-
tion (4.26) we find that the relation between energy density and scale is

ρ(a) ∝ a^{−3(1+w)} = (1 + z)^{3(1+w)}.  (4.29)

Here we use z as well as a because astronomers prefer z, since it is an observable.
In cosmology, however, it is better to use a or R, for two reasons. Firstly, redshift
is a property of light, but freely propagating light did not exist at times when
z > 1000, so z is then no longer a true observable. Secondly, it is possible to
describe the future in terms of a > 1, but redshift is then not meaningful.

The value of the proportionality factor w in Equations (4.28) and (4.29) follows
from the adiabaticity condition. Leaving the derivation of w for a more detailed
discussion in Section 5.2, we shall anticipate here its value in three special cases
of great importance.

(i) A matter-dominated universe filled with nonrelativistic cold matter in the
form of pressureless nonradiating dust, for which p = 0. From Equa-
tion (4.28), this corresponds to w = 0, and the density evolves accord-
ing to

ρ_m(a) ∝ a⁻³ = (1 + z)³.  (4.30)

It follows that the evolution of the density parameter Ω_m is

Ω_m(a) = Ω_m (H₀²/H²) a⁻³.

Solving for H²a²Ω and inserting it into Equation (4.13), one finds the evolu-
tion of the Hubble parameter:

H(a) = H₀a⁻¹√(1 − Ω_m + Ω_m a⁻¹) = H₀(1 + z)√(1 + Ω_m z).  (4.31)

(ii) A radiation-dominated universe filled with an ultra-relativistic hot gas com-
posed of elastically scattering particles of energy density ε. Statistical
mechanics then tells us that the equation of state is

p_r = ε/3 = ρ_r c²/3.  (4.32)

This evidently corresponds to w = ⅓, so that the radiation density evolves
according to

ρ_r(a) ∝ a⁻⁴ = (1 + z)⁴.  (4.33)

(iii) The vacuum-energy state corresponds to a flat, static universe (Ṙ = 0,
R̈ = 0) without dust or radiation, but with a cosmological term. From Equa-
tions (4.17) and (4.18) we then obtain

p_λ = −ρ_λc²,  w = −1.  (4.34)

Thus the pressure of the vacuum energy is negative, in agreement with
the definition in Equation (4.19) of the vacuum-energy density as a neg-
ative quantity. In the equation of state (4.28), ρ_λ and p_λ are then scale-
independent constants.
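The three cases above can be summarized by the single scaling law (4.29). A minimal sketch:

```python
# Sketch of Equation (4.29), rho(a) ∝ a^{-3(1+w)}, for the three special cases
# discussed above: dust (w = 0), radiation (w = 1/3) and vacuum energy (w = -1).
def density_ratio(a, w):
    """rho(a)/rho(1) for the equation of state p = w rho c^2, Equation (4.28)."""
    return a ** (-3.0 * (1.0 + w))
```

Halving the scale (a = 0.5, i.e. z = 1) raises the dust density by (1 + z)³ = 8 and the radiation density by (1 + z)⁴ = 16, while the vacuum-energy density does not change at all.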




Early Time Dependence. It follows from the above scale dependences that the
curvature term in Equation (4.17) obeys the following inequality in the limit of
small R:

kc²/R² ≪ (8πG/3)ρ + λ/3.  (4.35)

In fact, this inequality is always true when

k = +1,  p > −⅓ρc²,  w > −⅓,  λ > 0.  (4.36)

Then we can neglect the curvature term and the λ term in Equation (4.17), which
simplifies to

Ṙ²/R² = ȧ²/a² = (8πG/3)ρ ∝ a^{−3(1+w)}.  (4.37)

Let us now find the time dependence of a by integrating this differential equa-
tion:

∫ da a^{−1+3(1+w)/2} ∝ ∫ dt,

to obtain the solutions

a^{3(1+w)/2} ∝ t for w ≠ −1,  ln a ∝ t for w = −1.

Solving for a,

a(t) ∝ t^{2/3(1+w)} for w ≠ −1,  a(t) ∝ e^{Ht} for w = −1.  (4.38)

In the two epochs of matter domination and radiation domination we know
the value of w. Inserting this we obtain the time dependence of a for a matter-
dominated universe,

a(t) ∝ t^{2/3},  (4.39)

and for a radiation-dominated universe,

a(t) ∝ t^{1/2}.  (4.40)
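These power laws can be verified directly against the early-time Friedmann equation. A quick consistency check, not from the text, written with the proportionality constant normalized to one:

```python
# The power law a(t) = ((3/2)(1+w) t)^{2/3(1+w)} from Equation (4.38) should
# satisfy the normalized early-time Friedmann equation da/dt = a^{-(1+3w)/2}.
def a_of_t(t, w):
    n = 2.0 / (3.0 * (1.0 + w))
    return (1.5 * (1.0 + w) * t) ** n

def ode_residual(t, w, eps=1e-6):
    """Central-difference da/dt minus the right-hand side a^{-(1+3w)/2}."""
    deriv = (a_of_t(t + eps, w) - a_of_t(t - eps, w)) / (2.0 * eps)
    return deriv - a_of_t(t, w) ** (-(1.0 + 3.0 * w) / 2.0)
```

Both the matter case w = 0 (a ∝ t^{2/3}) and the radiation case w = ⅓ (a ∝ t^{1/2}) give residuals at the level of numerical roundoff.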



Big Bang. We find the starting value of the scale of the Universe independently
of the value of k in the curvature term neglected above:

lim_{t→0} a(t) = 0.  (4.41)

In the same limit the rate of change ȧ is obtained from Equation (4.37) with any
w obeying w > −⅓:

lim_{t→0} ȧ(t) ∝ lim_{t→0} a^{−(1+3w)/2}(t) = ∞.  (4.42)

It follows from Equations (4.32) and (4.33) that an early radiation-dominated
Universe was characterized by extreme density and pressure:

lim_{t→0} ρ_r(t) ∝ lim_{t→0} a⁻⁴(t) = ∞,

lim_{t→0} p_r(t) ∝ lim_{t→0} a⁻⁴(t) = ∞.

In fact, these limits also hold for any w obeying w > −1.




Actually, we do not even need an equation of state to arrive at these limits.
Provided ρc² + 3p was always positive and λ negligible, we can see from Equa-
tions (4.6) and (4.18) that the Universe has always decelerated. It then follows that
a must have been zero at some time in the past. Whether Friedmann's equations
can in fact be trusted to that limit is another story, which we shall come back to
later. The time t = 0 was sarcastically called the Big Bang by Fred Hoyle, who
did not like the idea of an expanding Universe starting from a singularity, but the
name has stuck.



Late Einstein-de Sitter Evolution. The conclusions we derived from Equa-
tion (4.35) were true for past times in the limit of small a. However, the recent
evolution and the future depend on the value of k and on the value of λ. For
k = 0 and k = −1 the expansion always continues, following Equation (4.38), and
a positive value of λ boosts the expansion further.

In a matter-dominated Einstein-de Sitter universe, which is flat and has Ω_λ = 0,
Friedmann's equation (4.4) can be integrated to give

t(z) = (2/3H₀)(1 + z)^{−3/2},  (4.43)

and the present age of the Universe at z = 0 would be

t₀ = 2/(3H₀).  (4.44)

In that case the size of the Universe would be ct₀ = 2h⁻¹ Gpc. Inserting the value
of H₀ used in Equation (1.21), H₀ ≈ 68-75 km s⁻¹ Mpc⁻¹, one finds

t₀ ≈ 8.6-9.6 Gyr.  (4.45)

This is in obvious conflict with t₀ as determined from the ages of the oldest known
stars in the Galaxy in Equation (1.23), 14.1 ± 2.5 Gyr. Thus the flat-universe model
with Ω_λ = 0 is in trouble.
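The range (4.45) is a one-line computation. A sketch with rounded conversion constants:

```python
# Sketch of Equations (4.44)-(4.45): the age t0 = 2/(3 H0) of a flat,
# matter-only Einstein-de Sitter universe, in Gyr, for H0 in km/s/Mpc.
KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec
SEC_PER_GYR = 3.1557e16  # seconds in one gigayear

def eds_age_gyr(H0_km_s_Mpc):
    H0_per_sec = H0_km_s_Mpc / KM_PER_MPC
    return 2.0 / (3.0 * H0_per_sec) / SEC_PER_GYR
```

H₀ = 68 gives about 9.6 Gyr and H₀ = 75 about 8.7 Gyr, reproducing the range quoted above and hence the conflict with the stellar ages.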



Evolution of a Closed Universe. In a closed matter-dominated universe with
k = +1 and λ = 0, the curvature term kc²/R² drops with the second power of R,
while, according to Equation (4.30), the density drops with the third power, so the
inequality (4.35) is finally violated. This happens at a scale R_max such that

R²_max = 3c²/(8πGρ_m(R_max)),  (4.46)

and the expansion halts because Ṙ = 0 in Equation (4.4). Let us call this the
turnover time t_max. At later times the expansion turns into contraction, and the
Universe returns to zero size at time 2t_max. That time is usually called the Big
Crunch. For k = +1 Friedmann's equation (4.4) then takes the form

dR/dt = √((8πG/3)ρ_m(R)R² − c²).


Then t_max is obtained by integrating t from 0 to t_max and R from 0 to R_max,

t_max = ∫₀^{R_max} dR ((8πG/3)ρ_m(R)R² − c²)^{−1/2}.  (4.47)

To solve the R integral we need to know the energy density ρ_m(R) in terms of
the scale factor, and we need to know R_max. Let us take the mass of the Universe
to be M. We have already found in Equation (2.42) that the volume of a closed
universe with Robertson-Walker metric is

V = 2π²R³.

Since the energy density in a matter-dominated universe is mostly pressureless
dust,

ρ_m = M/V = M/(2π²R³).  (4.48)

This agrees perfectly with the result (4.30) that the density is inversely propor-
tional to R³. Obviously, the missing proportionality factor in Equation (4.30) is
then M/(2π²). Inserting the density (4.48) with R = R_max into Equation (4.46) we
obtain

R_max = 4GM/(3πc²).  (4.49)

We can now complete the integral in Equation (4.47):

t_max = πR_max/(2c) = 2GM/(3c³).  (4.50)

Although we might not know whether we live in a closed universe, we certainly
know from the ongoing expansion that t_max > t₀. Using the lower limit for t₀ from
Equation (1.21) we find a lower limit to the mass of the Universe:

M > 3c³t₀/(2G) = 1.25 × 10²³ M_⊙.  (4.51)

Actually, the total mass inside the present horizon is estimated to be about
10²² M_⊙.

The dependence of t_max on Ω_m can also be obtained:

t_max = (π/2H₀) Ω_m (Ω_m − 1)^{−3/2}.  (4.52)
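The bound (4.51) is straightforward to reproduce. A sketch with rounded SI constants:

```python
# Sketch of the bound (4.51): M > 3 c^3 t0 / (2 G) for a closed universe,
# obtained by inverting Equation (4.50) with t_max > t0.
G = 6.674e-11            # m^3 kg^-1 s^-2
C = 2.998e8              # m/s
M_SUN = 1.989e30         # kg
SEC_PER_GYR = 3.1557e16  # seconds in one gigayear

def mass_lower_bound_solar(t0_gyr):
    """Lower bound on the mass of a closed universe, in solar masses."""
    t0 = t0_gyr * SEC_PER_GYR
    return 3.0 * C**3 * t0 / (2.0 * G) / M_SUN
```

With a lower limit of t₀ ≈ 12 Gyr this gives roughly 1.2 × 10²³ solar masses, of the order quoted in Equation (4.51).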



The Radius of the Universe. The spatial curvature is given by the Ricci scalar R
introduced in Equation (2.78), and it can be expressed in terms of Ω:

R = 6H²c⁻²(Ω − 1).  (4.53)

Obviously, R vanishes in a flat universe, and it is only meaningful when it is non-
negative, as in a closed universe. It is conventional to define a 'radius of curvature'
that is also valid for open universes:

r_U = √(6/|R|) = cH⁻¹|Ω − 1|^{−1/2}.  (4.54)

For a closed universe, r_U has the physical meaning of the radius of a sphere.




Another interesting quantity is the Schwarzschild radius of the Universe, r_{c,U}.
Combining Equations (4.50) and (3.19) we find

r_{c,U} = 3ct_max > 12 Gpc.  (4.55)

Comparing this number with the much smaller Hubble radius 3h⁻¹ Gpc in Equa-
tion (1.14) we might conclude that we live inside a black hole! However, the
Schwarzschild metric is static whereas the Hubble radius recedes in expanding
Friedmann models with superluminal velocity, as was seen in Equation (2.51), so
it will catch up with r_{c,U} at some time. Actually, it makes more sense to describe
the Big Bang singularity as a white hole, which is a time-reversed black hole. A
white hole only emits and does not absorb. It has a horizon over which nothing
gets in, but signals from inside do get out.



Evolution of Open, Closed and Flat Universes. The three cases k = −1, 0, +1
with λ = 0 are illustrated qualitatively in Figure 4.1. All models have to be consis-
tent with the scale and rate of expansion today, R₀ and Ṙ₀, at time t₀. Following
the curves back in time, one notices that they intersect the time axis at different
times. Thus what may be called time t = 0 is more recent in a flat universe than
in an open universe, and in a closed universe it is even more recent.



Late Friedmann-Lemaitre Evolution. When λ > 0, the recent past and the future
take an entirely different course (we do not consider the case λ < 0, which is of
mathematical interest only). Since ρ_λ and Ω_λ are scale-independent con-
stants, they will start to dominate over the matter term and the radiation term
when the expansion has reached a given scale. Friedmann's equation (4.18) can
then be written

R̈/R = λ/3 = H₀²Ω_λ.

From this one sees that the expansion will accelerate regardless of the value of
k. In particular, a closed universe with k = +1 will ultimately not contract, but
will expand at an accelerating pace.

Let us now return to the general expression (4.14) for the normalized age t(z)/t₀
or t(a)/t₀ of a universe characterized by k and energy density components Ω_m,
Ω_r and Ω_λ. Inserting the Ω components into Equations (4.13) and (4.14) we have

ȧ²/a² = H²(t) = H₀²[(1 − Ω₀)a⁻² + Ω_m(a) + Ω_r(a) + Ω_λ(a)],

t₀ = (1/H₀) ∫₀¹ da ((1 − Ω₀) + Ω_m a⁻¹ + Ω_r a⁻² + Ω_λ a²)^{−1/2}.  (4.56)



The integral can easily be carried out analytically when Ω_λ = 0. But this is now
of only academic interest, since we know today that Ω_λ ≈ 0.7 (as we shall see in
Section 4.4). Thus the integral is best solved numerically (or analytically in terms
of hypergeometric functions or the Weierstrass modular functions [3, 4]).

Figure 4.1 Time dependence of the cosmic scale R(t) in various scenarios, all of which
correspond to the same constant slope H = H₀ at the present time t₀. k = +1: a closed
universe with a total lifetime 2t_max. It started more recently than a flat universe would
have. k = 0: a flat universe which started ⅔t_H ago. k = −1: an open universe which started
at a time ⅔t_H < t < t_H before the present time. de Sitter: an exponential (inflationary)
scenario corresponding to a large cosmological constant. This is also called the Lemaitre
cosmology.
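A numerical sketch of the age integral (4.56), not from the text, using the example values Ω_m = 0.27, Ω_λ = 0.73 and neglecting radiation:

```python
# Midpoint-rule evaluation of the age integral (4.56), returning H0*t0.
import math

def age_in_hubble_units(Omega_m, Omega_l, Omega_r=0.0, n=200000):
    """H0*t0: integral of [(1 - Omega_0) + Omega_m/a + Omega_r/a^2
    + Omega_l*a^2]^(-1/2) da over 0 < a < 1, Equation (4.56)."""
    Omega_0 = Omega_m + Omega_r + Omega_l
    da = 1.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * da
        total += da / math.sqrt((1.0 - Omega_0) + Omega_m / a
                                + Omega_r / a**2 + Omega_l * a**2)
    return total
```

For these parameter values the result is H₀t₀ ≈ 0.99, i.e. about 13.9 Gyr for H₀ = 70 km s⁻¹ Mpc⁻¹, comfortably compatible with the stellar ages that ruled out the Einstein-de Sitter value (4.45).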

The lookback time is given by the same integral with the lower integration limit
at 1/(1 + z) and the upper limit at 1. The proper distance (2.38) is then

d_P(z) = R₀χ(z) = ct(z).  (4.57)



In Figure 4.2 we plot the lookback time t(z)/t₀ and the age of the Universe 1 −
t(z)/t₀, in units of t₀, as functions of redshift for the parameter values Ω_m = 0.27,
Ω_λ = 1 − Ω_m. At infinite redshift the lookback time is unity and the age of the
Universe is zero.

Another important piece of information is that Ω₀ ≈ 1.0 (Table A.6). The cur-
vature term (4.8) then (almost) vanishes, in which case we can conclude that the
geometry of our Universe is (almost) flat. With Ω₀ = 1.0 and Ω_r well known (as
we shall see in Section 5.2), the integral (4.56) really depends on only one unknown
parameter, Ω_m = 1 − Ω_λ.

From the values Ω_λ ≈ 0.73 and Ω_m ≈ 1 − 0.73 = 0.27, one can conclude that
the cosmological constant has already been dominating the expansion for some
time. The Universe began accelerating when Ω_λ and Ω_m were equal, that is, when

Ω_m(z)/Ω_λ = 0.27(1 + z)³/0.73 = 1,

or at z = 0.393.

Figure 4.2 The lookback time and the age of the Universe normalized to t₀ as functions
of redshift for the parameter values Ω_m = 0.27, Ω_λ = 1 − Ω_m. For z > 10 see Figure 5.9.
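The quoted redshift follows directly from 0.27(1 + z)³ = 0.73. A one-line sketch, using the example values from the text:

```python
# Redshift at which Omega_lambda and Omega_m were equal, from 0.27(1+z)^3 = 0.73.
Omega_m, Omega_l = 0.27, 0.73
z_eq = (Omega_l / Omega_m) ** (1.0 / 3.0) - 1.0   # ≈ 0.393
```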



4.2 de Sitter Cosmology 

Let us now turn to another special case for which Einstein's equations can be
solved. Consider a homogeneous flat universe with the Robertson-Walker metric
in which the density of pressureless dust is constant, ρ(t) = ρ₀. Friedmann's
equation (4.17) for the rate of expansion including the cosmological constant then
takes the form

ȧ(t)/a(t) = H,  (4.58)

where H is now a constant:

H = √((8πG/3)ρ₀ + λ/3).  (4.59)

This is clearly true when k = 0, but is even true for k ≠ 0: since the density is
constant and R increases without limit, the curvature term kc²/R² will eventu-
ally be negligible. The solution to Equation (4.58) is obviously an exponentially
expanding universe:

a(t) ∝ e^{Ht}.  (4.60)

This is drawn as the de Sitter curve in Figure 4.1. Substituting this function into
the Robertson-Walker metric (2.31) we obtain the de Sitter metric

ds² = c²dt² − e^{2Ht}(dr² + r²dθ² + r² sin²θ dφ²)  (4.61)

with r replacing σ. In 1917 de Sitter published such a solution, setting ρ = p = 0,
thus relating H directly to the cosmological constant λ. The same solution of




course follows even with λ = 0 if the density of dust ρ is constant. Eddington
characterized the de Sitter universe as 'motion without matter', in contrast to the
static Einstein universe that was 'matter without motion'.

If one introduces two test particles into this empty de Sitter universe, they
will appear to recede from each other exponentially. The force driving the test
particles apart is very strange. Let us suppose that they are at spatial distance rR
from each other, and that λ is positive. Then the equation of relative motion of
the test particles is given by Equation (4.5) including the λ term:

d²(rR)/dt² = (λ/3)rR − (4πG/3)(ρ + 3pc⁻²)rR.  (4.62)

The second term on the right-hand side is the decelerating force due to the
ordinary gravitational interaction. The first term, however, is a force due to the
vacuum-energy density, proportional to the distance r between the particles!

If λ is positive, as in the Einstein universe, the force is repulsive, accelerating
the expansion. If λ is negative, the force is attractive, decelerating the expansion
just like ordinary gravitation. This is called an anti-de Sitter universe. Since λ is
so small (cf. Equation (4.22)), this force will only be of importance to systems with
mass densities of the order of the vacuum energy. The only known systems with
such low densities are the large-scale structures, or the full horizon volume of
cosmic size. This is the reason for the name cosmological constant. In Chapter 7
we shall meet inflationary universes with exponential expansion.

Although the world is not devoid of matter and the cosmological constant is
small, the de Sitter universe may still be of more than academic interest in situ-
ations when ρ changes much more slowly than the scale R. The de Sitter metric
then takes the form (in units where c = 1)

ds² = (1 − r²H²) dt² − (1 − r²H²)⁻¹ dr² − r²(dθ² + sin²θ dφ²),  (4.63)

which resembles the Schwarzschild metric, Equation (3.21). There is an inside
region in the de Sitter space at r < H⁻¹, for which the metric tensor component
g₀₀ is positive and g₁₁ is negative. This resembles the region outside a black hole of
Schwarzschild radius r_c = H⁻¹, at r > r_c, where g₀₀ is positive and g₁₁ is negative.
Outside the radius r = H⁻¹ in de Sitter space and inside the Schwarzschild black
hole these components of the metric tensor change sign. We shall come back to
this metric in Chapter 10.

The interpretation of this geometry is that the de Sitter metric describes an
expanding space-time surrounded by a black hole. Inside the region r < H⁻¹
no signal can be received from distances outside H⁻¹, because there the metric
corresponds to the inside of a black hole! In an anti-de Sitter universe the con-
stant attraction ultimately dominates, so that the expansion turns into contrac-
tion. Thus de Sitter universes are open and anti-de Sitter universes are closed.

Let us study the particle horizon r_H in a de Sitter universe. Recall that this is
defined as the location of the most distant visible object, and that the light from
it started on its journey towards us at time t_H. From Equation (2.47) the particle
horizon is at

r_H(t) = R(t)χ_ph = R(t) ∫_{t_H}^{t} (c dt′)/R(t′).  (4.64)

Let us choose t_H as the origin of time, t_H = 0. The distance r_H(t) as a function
of the time of observation t then becomes

r_H(t) = H⁻¹e^{Ht}(1 − e^{−Ht}).  (4.65)

The comoving distance to the particle horizon, χ_ph, quickly approaches the con-
stant value H⁻¹. Thus for a comoving observer in this world the particle horizon
would always be located at H⁻¹. Points which were inside this horizon at some
time will be able to exchange signals, but events outside the horizon cannot influ-
ence anything inside this world.
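The approach of χ_ph to the constant H⁻¹ can be made explicit. A sketch of Equations (4.64)-(4.65) in units where c = 1:

```python
# Comoving distance to the particle horizon in a de Sitter universe,
# chi_ph(t) = H^-1 (1 - e^{-Ht}), which tends to H^-1 but never reaches it.
import math

def chi_ph(t, H):
    return (1.0 - math.exp(-H * t)) / H
```

After a few e-folding times 1/H the horizon is exponentially close to H⁻¹, so a comoving observer can never receive signals from beyond that fixed comoving distance.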

The situation in a Friedmann universe is quite different. There the time depen-
dence of a is not an exponential but a power of t, Equation (4.38), so that the
comoving distance χ_ph is an increasing function of the time of observation, not
a constant. Thus points which were once at space-like distances, prohibited from
exchanging signals with each other, will be causally connected later, as one sees in
Figure 2.1.



4.3 Dark Energy 

The introduction of the cosmological constant into our description of the Universe
is problematic for at least three reasons. Firstly, as we noted in Equation (4.22), its
present value is extremely small, in fact some 122 orders of magnitude smaller
than theoretical expectations. The density is about

ρ_λ ≈ 2.9 × 10⁻⁴⁷ GeV⁴.

If ρ_λ were even slightly larger, the repulsive force would cause the Universe to
expand too fast, so that there would not be enough time for the formation of
galaxies or other gravitationally bound systems. This is called the cosmological
constant problem.
Secondly, it raises the question of why the sum

Ω₀ = Ω_m + Ω_λ

is precisely 1.0 today, when we are here to observe it, after an expansion of
some 12 billion years. The density of matter decreases like a⁻³, while Ω_λ remains
constant, so why has the cosmological con-
stant been fine-tuned to come to dominate the sum only now? This is referred to
as the cosmic coincidence problem.

Thirdly, we do not have the slightest idea what the λ energy consists of, only
that it distorts the geometry of the Universe as if it were matter with strongly
negative pressure, and acts as an anti-gravitational force which is unclustered at
all scales. Since we know so little about it, we also cannot be sure that λ is constant
in time, and that its equation of state is always w_λ = −1. When it is not constant
it is often called dark energy.




Dark energy comes as a complete surprise. Nothing in big bang or inflation- 
ary cosmology predicted its existence. Therefore we also have no prediction for 
whether it is permanent, as a cosmological constant, or whether it will decay away 
in time. 



Decaying Cosmological Constant. A dynamical approach to remove or alleviate
the extreme need for fine-tuning λ is to choose it to be a slowly varying function
of time, λ(t). The initial conditions require λ(t_Planck) ≈ 10¹²²λ₀, from which it
decays to its present value at time t₀.

The Universe is then treated as a fluid composed of dust and dark energy in
which the dark energy density, ρ_λ(t) = λ(t)/(8πG), continuously transfers energy
to the material component. Its equation of state is then of the form

p_λ = −ρ_λc²(1 + ⅓ d ln ρ_λ/d ln a).  (4.66)

In the classical limit, when ρ_λ(a) is a very slow function of a so that the deriva-
tive term can be ignored, one obtains the equation of state of the cosmological
constant, w_λ = −1.

The advantage in removing the need for fine-tuning is, however, only replaced by
another arbitrariness: an ansatz for λ(t) is required, and new parameters charac-
terizing the timescale of the deflationary period and the transfer of energy from
dark energy to dust must be introduced. Such phenomenological models have
been presented in the literature [6, 7], and they can lead to testable predictions.



Scalar Fields. Instead of arguing about whether Λ should be interpreted as a
correction to the geometry or to the stress-energy tensor, we could go the whole
way and postulate the existence of a new kind of energy, described by a slowly
evolving scalar field φ(t) that contributes to the total energy density together
with the background (matter and radiation) energy density. This scalar field is
assumed to interact only with gravity and with itself.

Since a scalar field is mathematically equivalent to a fluid with a time-dependent
speed of sound, one can find potentials V(φ) for which the dynamical vacuum
acts like a fluid with negative pressure, and with an energy density behaving like
a decaying cosmological constant. In comparison with plain Λ(t) models, scalar-field
cosmologies have one extra degree of freedom, since both a potential V(φ)
and a kinetic term ½φ̇² need to be determined.

The simplest equation of motion for a spatially homogeneous classical scalar
field is the Klein-Gordon equation, which can be written

$$\ddot{\varphi} + 3H\dot{\varphi} + V'(\varphi) = 0, \qquad (4.67)$$

where the prime indicates derivation with respect to φ. The energy density and
pressure for a general scalar field enter in the diagonal elements of $T_{\mu\nu}$, and they
are

$$\rho_\varphi c^2 = \tfrac{1}{2}\dot{\varphi}^2 + V(\varphi) \quad\text{and}\quad p_\varphi = \tfrac{1}{2}\dot{\varphi}^2 - V(\varphi), \qquad (4.68)$$

respectively. Clearly the pressure is always negative if the evolution is so slow
that the kinetic energy density ½φ̇² is less than the potential energy density.
Note that in Equations (4.67) and (4.68) we have ignored terms describing spatial
inhomogeneity which could also have been present.

The conservation of energy-momentum for the background component (matter
and radiation, denoted by 'b') is Equation (4.24), which leads to the equation of
state w_b, and analogously one has for the scalar field

$$\dot{\rho}_\varphi c^2 + 3H(\rho_\varphi c^2 + p_\varphi) = 0$$

or

$$\dot{\rho}_\varphi + 3H\rho_\varphi(1 + w_\varphi) = 0. \qquad (4.69)$$

As in Equation (4.29), the energy density of the scalar field decreases as
a^{-3(1+w_φ)}. Inserting Equations (4.68) into Equation (4.69), one indeed obtains
Equation (4.67). The equation of state of the φ field is then a function of the
cosmological scale a (or time t or redshift z),

$$w_\varphi = \frac{\dot{\varphi}^2 - 2V(\varphi)}{\dot{\varphi}^2 + 2V(\varphi)}, \qquad (4.70)$$

or, in some epochs, it can be a constant between 0 and -1.
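The approach of the equation of state to -1 for a slowly evolving field can be seen by integrating Equation (4.67) directly. A minimal sketch (illustrative units with c = 1, a quadratic potential, and a constant Hubble rate standing in for the full Friedmann dynamics):

```python
# Integrate the Klein-Gordon equation (4.67) for V(phi) = (1/2) m^2 phi^2
# at constant Hubble rate H, and watch w_phi relax towards -1.
m, H, dt = 1.0, 5.0, 1e-3          # m << 3H: strongly Hubble-damped field
phi, phidot = 1.0, 0.0

def w_phi(phi, phidot):
    V = 0.5 * m**2 * phi**2
    return (phidot**2 - 2*V) / (phidot**2 + 2*V)    # w = p_phi/(rho_phi c^2)

for _ in range(20000):                              # 20000 steps of size dt
    phidot += (-3*H*phidot - m**2 * phi) * dt       # Equation (4.67)
    phi += phidot * dt

w = w_phi(phi, phidot)
print(round(w, 3))          # close to -1 for this overdamped ("slow") field
```

Starting the field at rest gives w_φ = -1 exactly; the Hubble friction term 3Hφ̇ then keeps the kinetic energy far below the potential energy, so w_φ stays pinned near -1.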

However, dark energy defined this way, called quintessence, turns out to be
another deus ex machina which not only depends on the parametrization of an
arbitrary function V(φ), but also has to be fine-tuned initially in a way similar to
the cosmological constant.



Tracking Quintessence. In a somewhat less arbitrary model [8, 9], one constructs
quintessence in such a way that its energy density is smaller than the
background component for most of the history of the Universe, somehow tracking
it with the same time dependence. As long as the field stays on the tracker
solution, and regardless of the value of w_φ in the radiation-domination epoch, w_φ
automatically decreases to a negative value at time t_eq when the Universe transforms
from radiation domination to matter domination. We saw in Equation (4.33)
that radiation energy density evolves as a⁻⁴, faster than matter energy density,
a⁻³. Consequently, ρ_r is now much smaller than ρ_m.

But once w_φ is negative, ρ_φ decreases at a slower rate than ρ_m so that it eventually
overtakes it. At that moment, φ(t) slows to a near stop, causing w_φ to
decrease toward -1, and tracking stops. Judging from the observed large value
of the cosmological constant density parameter today, Ω_λ = 0.73, this happened
in the recent past, when the redshift was z ~ 2-4. Quintessence is already dominating
the total energy density, driving the Universe into a period of de Sitter-like
accelerated expansion.

The tracker field should be an attractor in the sense that a very wide range of
initial conditions for φ and φ̇ rapidly approach a common evolutionary track,
so that the cosmology is insensitive to the initial conditions. Thus the need for
fine-tuning is entirely removed; the only arbitrariness remains in the choice of a
function V(φ). With a judicious choice of parameters, the coincidence problem
can also be considered solved, albeit by tuning the parameters ad hoc.

In Chapter 7 we shall come back to the inflationary de Sitter expansion following 
the Big Bang, which may also be caused by a scalar inflaton field. Here we just 
note that the initial conditions for the quintessence field can be chosen, if one so 
desires, to match the inflaton field. 

Tracking behaviour with w_φ < w_b occurs [8, 9] for any potential obeying

$$\Gamma \equiv \frac{V''V}{(V')^2} > 1, \qquad (4.71)$$

and which is nearly constant over the range of plausible initial φ,

$$\left|\frac{\mathrm{d}(\Gamma - 1)}{H\,\mathrm{d}t}\right| \ll |\Gamma - 1|, \qquad (4.72)$$

or if -V′/V is a slowly decreasing function of φ. Many potentials satisfy these
criteria, for instance power law, exponential times power law, hyperbolic, and
Jacobian elliptic functions. For a potential of the generic form

$$V(\varphi) = V_0(\varphi/\varphi_0)^{-\beta} \qquad (4.73)$$

with β constant, one has a good example of a tracker field for which the kinetic
and potential terms remain in a constant proportion.
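For the inverse power potential (4.73) the tracking condition (4.71) is easy to check: with V ∝ φ^(-β) one finds Γ = (β + 1)/β > 1, independent of φ. A numerical sketch (V₀ = 1 and the sample point φ = 2 are arbitrary choices):

```python
def Gamma(V, phi, h=1e-4):
    """Tracker parameter of Equation (4.71) by central differences."""
    V1 = (V(phi + h) - V(phi - h)) / (2 * h)               # V'
    V2 = (V(phi + h) - 2 * V(phi) + V(phi - h)) / h**2     # V''
    return V2 * V(phi) / V1**2

# Inverse power potential (4.73) with V0 = 1; analytically Gamma = (beta+1)/beta.
for beta in (0.1, 0.5, 1.0):             # the values plotted in Figure 4.3
    V = lambda phi, b=beta: phi ** (-b)
    print(beta, round(Gamma(V, 2.0), 3))  # -> 11.0, 3.0, 2.0
```

The smaller β is, the larger Γ, i.e. the more strongly the potential satisfies the tracking condition.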

The values of w_φ and Ω_φ depend both on V(φ) and on the background. The
effect of the background is through the 3Hφ̇ term in the scalar field equation of
motion (4.67): when w_b changes, H also changes, which, in turn, changes the rate
at which the tracker field evolves down the potential.

The tracking potential is characterized as slow rolling when

$$\eta(\varphi) \equiv \frac{1}{8\pi G}\frac{V''}{V} \ll 1, \qquad \epsilon(\varphi) \equiv \frac{1}{16\pi G}\left(\frac{V'}{V}\right)^2 \ll 1, \qquad (4.74)$$

meaning that φ̈ in Equation (4.67) and φ̇² in Equation (4.68) are both negligible.
At very early times, however, -V′/V is slowly changing, but is itself not
small. This establishes the important distinction between static and quasi-static
quintessence with w_φ ≈ -1 and dynamical quintessence with w_φ > -1. This
means that the slow-roll approximation is not necessarily applicable to dynamical
quintessence, and that the latter generally requires exact solution of the equation
of motion (4.67) [10].

Given a potential like Equation (4.73) and fixing the current values of the parameters
Ω_m, Ω_r, Ω_φ, w_φ, one can solve the equation of motion (4.67) by numerical
integration. Finding the functions φ(a), w_φ(φ), w_φ(a) or Ω_φ(a) is a rather complicated
exercise [8, 9, 10, 11]. In Figures 4.3 and 4.4 we show a few of these
functions for inverse power potentials of the form (4.73). One can see that relatively
fast-rolling dynamical quintessence also becomes static sooner or later,
approaching w_φ = -1.

In this model the lookback time is given by

$$t(z) = \frac{1}{H_0}\int_{1/(1+z)}^{1} \mathrm{d}a\,\left[(1-\Omega_0) + \Omega_m a^{-1} + \Omega_r a^{-2} + \Omega_\varphi a^{-(1+3w_\varphi)}\right]^{-1/2}. \qquad (4.75)$$

For 1/(1 + z) = 0 this gives us the age (4.56) of the Universe, t₀.
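Equation (4.75) is straightforward to evaluate numerically. A sketch for w_φ = -1, so that the quintessence term mimics a cosmological constant, with illustrative parameter values Ω_m = 0.27, Ω_φ = 0.73; letting 1/(1 + z) → 0 recovers the age H₀t₀:

```python
import numpy as np

# Trapezoidal evaluation of Equation (4.75); time in units of 1/H0.
Om, Or, Ophi, w = 0.27, 0.0, 0.73, -1.0   # illustrative, spatially flat
O0 = Om + Or + Ophi                        # curvature term 1 - O0 vanishes here

def lookback(z, n=200001):
    a = np.linspace(1.0 / (1.0 + z), 1.0, n)
    f = ((1 - O0) + Om / a + Or / a**2 + Ophi * a ** (-(1 + 3 * w))) ** -0.5
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(a)))

print(round(lookback(1.0), 3))     # lookback time to z = 1, about 0.56/H0
print(round(lookback(1e8), 3))     # 1/(1+z) -> 0: the age H0*t0, about 0.99
```

For these parameters H₀t₀ ≈ 0.99, i.e. t₀ is close to one Hubble time, consistent with Equation (4.77) below.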
















Figure 4.3 The quintessence equation of state w_φ(a) for the inverse power potential
(4.73) as a function of log a for β = 0.1 (solid line), β = 0.5 (dotted line), and β = 1.0
(dashed line). Here Ω_φ = ¾, Ω_m = ¼.












Figure 4.4 The quintessence density parameter Ω_φ(a) (solid line) and the background
density parameter Ω_b(a) (dotted line) for the inverse power potential (4.73) with β = 0.5
as a function of log a. Note that log a = -1 corresponds to z = 9.



Other Models. As already mentioned, the weakness of the tracking quintessence
model is that the energy density for which the pressure becomes negative is set
by an adjustable parameter which has to be fine-tuned to explain the cosmic coincidence.
Surely one can do better by adding degrees of freedom, for instance by
letting φ(t) interplay with the decaying cosmological constant Λ(t), or with the
matter field, or with a second real scalar field ψ(t) which dominates at a different
time, or by taking φ(t) to be complex, or by choosing a double-exponential potential,
or by adding a new type of matter or a dissipative pressure to the background
energy density. Surely the present acceleration could have been preceded by various
periods of deceleration and acceleration. All of these alternatives have been
proposed, but one generally then comes into the situation described by Wigner:
'with three parameters one can fit an elephant, with four one can make it wag its
tail'.

One interesting alternative, called k-essence [12, 13], comes at the cost of introducing
a nonlinear kinetic energy density functional of the scalar field. The k-field
tracks the radiation energy density until t_eq, when a sharp transition from positive
to negative pressure occurs, with w_k = -1 as a consequence. The k-essence density
ρ_k then drops below ρ_m and, thereafter, in the matter-dominated epoch the
k-field does not track the background at all; it just stays constant. Thus the time
of k-essence domination and accelerated expansion simply depends on t_eq. However,
this is another case of fine-tuning: ρ_k must drop precisely to the magnitude
of the present-day ρ_λ.

Could the scalar field obey an equation of state with w_φ < -1? Such a situation
would require rather drastic revisions of general relativity and would lead
to infinite acceleration within a finite time [14]. Speculations have also appeared
in the literature that the Universe might have undergone shorter periods of this
type. It is well to remember that nothing is known about whether the cosmological
constant is indeed constant or whether it will remain so, nor about the future
behaviour of a quintessence field and its equation of state.

4.4 Model Testing and Parameter Estimation. I 

In this chapter we have concentrated on the 'concordance' FLRW cosmological 
model with a nonvanishing cosmological constant in a spatially flat universe. But 
we should also give motivations for this choice and explain why other possibilities 
have lost their importance. The astronomical literature presently lists 13 tests of 
the standard model [5], not counting tests which may become important in the 
future. Some of them will be discussed in the present section, while others must be 
postponed until we have reached the necessary understanding of the underlying 
physics. We shall see that some of these classical tests do not really qualify as tests 
at all, and one of them, light deflection by lensing, is not a test of the cosmological 
model, but of general relativity. We already accounted for this and other tests of 
general relativity in Chapter 3. 



Statistics. Let us take the meaning of the term 'test' from the statistical literature, 
where it is accurately defined [15]. When the hypothesis under test concerns the 
value of a parameter, the problems of parameter estimation and hypothesis testing 
are related; for instance, good techniques for estimation often lead to analogous 




testing procedures. The two situations lead, however, to different conclusions, and 
should not be confused. If nothing is known a priori about the parameter involved, 
it is natural to use the data to estimate it. On the other hand, if a theoretical 
prediction has been made that the parameter should have a certain value, it may 
be more appropriate to formulate the problem as a test of whether the data are 
consistent with this value. In either case, the nature of the problem, estimation 
or test, must be clear from the beginning and consistent to the end. When two or 
more independent methods of parameter estimation are compared, one can talk 
about a consistency test. 

A good example of this reasoning is offered by the discussion of Hubble's law 
in Section 1.4. Hubble's empirical discovery tested the null hypothesis that the 
Universe (out to the probed redshifts) expands. The test is a valid proof of the
hypothesis for any value of H₀ that differs from zero at a chosen confidence
level, CL%. Thus the value of H₀ is unimportant for the test; only its precision
matters.

A determination of the value of H₀ is, however, not a test of a prediction, but a
case of parameter estimation. The value of H₀ is then chosen to be at the maximum
of the likelihood function or in the middle of a confidence range. For a Gaussian
probability density function, a ±1σ (one standard deviation) range represents a
probability of 68.3%, a ±2σ range represents a probability of 95.4%, and so on.

A combination of estimates such as those referred to in Section 1.4 furnishes
a consistency test. The consistency is then quantified by the total sum of log-likelihood
functions, which in the case of Gaussian probability density functions
reduces to the well-known χ²-test.
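The combination of independent Gaussian estimates and the resulting χ² can be sketched in a few lines. The H₀ values below (value, 1σ) are invented for illustration, not data from the text:

```python
# Consistency-test sketch: inverse-variance combination of independent
# Gaussian estimates of a parameter and the resulting chi-squared.
# The H0 values below (km/s/Mpc, with 1-sigma errors) are illustrative.
estimates = [(72.0, 8.0), (68.0, 4.0), (71.0, 3.5)]

wsum = sum(1.0 / s**2 for _, s in estimates)
mean = sum(v / s**2 for v, s in estimates) / wsum        # ML combined value
chi2 = sum(((v - mean) / s) ** 2 for v, s in estimates)  # 2 degrees of freedom

print(round(mean, 1), round(chi2, 2))   # a consistent set gives chi2 ~ d.o.f.
```

A χ² far above the number of degrees of freedom would signal that the estimates are mutually inconsistent at the corresponding confidence level.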



Expansion Time. The so-called timescale test compares the lookback time in
Figure 4.2 at redshifts at which galaxies can be observed with t₀ obtained from
other cosmochronometers inside our Galaxy, as discussed in Section 1.4. Thus we
have to make do with a consistency test. At moderately high redshifts, where the
Ω_m term dominates and Ω_λ can be neglected, Equation (4.56) can be written

$$H_0 t(z) \approx \frac{2}{3\sqrt{\Omega_m}}(1+z)^{-3/2}. \qquad (4.76)$$

Let us multiply the H₀ and t₀ values in Table A.2 to obtain a value for the
dimensionless quantity

$$H_0 t_0 = 0.97 \pm 0.05. \qquad (4.77)$$

As we already saw in Equation (4.44), this rules out the spatially flat matter-dominated
Einstein-de Sitter universe, in which H₀t₀ = ⅔.

Equations (4.76) and (4.77) can also be combined to give an estimate for Ω_m.
Taking t(3) = 2.4 Gyr at z = 3, one finds

Ω_m = 0.23,

which is not very different from the figure that we quote in Table A.6.
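The inversion of Equation (4.76) for Ω_m can be checked directly; h = 0.71 is assumed here for illustration (so that 1/H₀ = 9.78/h Gyr), together with the t(3) = 2.4 Gyr quoted above:

```python
# Inverting Equation (4.76) for Omega_m with t(3) = 2.4 Gyr at z = 3;
# h = 0.71 is an assumption for illustration (1/H0 = 9.78/h Gyr).
h, z, t_z = 0.71, 3.0, 2.4            # t_z in Gyr
H0 = h / 9.78                         # in Gyr^-1
Om = (2.0 * (1.0 + z) ** -1.5 / (3.0 * H0 * t_z)) ** 2
print(round(Om, 2))                   # -> 0.23, the value quoted in the text
```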




The Magnitude-Redshift Relation. Equation (2.60) relates the apparent magnitude
m of a bright source of absolute magnitude M at redshift z to the luminosity
distance d_L. We noted in Section 1.4 that the peak brightness of SNe Ia
can serve as remarkably precise standard candles visible from very far away; this
determines M. Although the magnitude-redshift relation can be used in various
contexts, we are only interested in testing cosmology.

The luminosity distance d_L is a function of z and the model-dependent dynamical
parameters, primarily Ω_m, Ω_λ and H₀. In Section 1.4 and Figure 1.2 we already
referred to measurements of H₀ based on HST supernova observations (as well
as other types of measurement). Supernova observations furnish the best information
on Ω_λ. The redshift can be measured in the usual way by observing the
shift of spectral lines, but the supernova light-curve shape gives supplementary
information: in the rest frame of the supernova the time dependence of light emission
follows a standard curve, but a supernova at relativistic distances exhibits a
broadened light curve due to time dilation.

Two groups [16, 17, 18, 19] have reported their analyses of 16 and 42 supernovae
of type Ia, respectively, using somewhat different methods of light-curve
fitting to determine the distance moduli, determining the parameters by maximum
likelihood fits, and reaching the same conclusions. Following the analysis of
Sullivan et al. [17, 18, 19], who define a 'Hubble-constant free' luminosity distance
D_L = H₀d_L, the effective B-band (a standard blue filter) magnitude m_B becomes

$$m_B = M_B - 5\log H_0 + 25 + 5\log D_L(z, \Omega_m, \Omega_\lambda). \qquad (4.78)$$

Figure 4.5 shows a Hubble plot [19] of m_B versus z for the supernovae studied.
The solid curve represents an accelerating, spatially flat FLRW universe with

$$\Omega_\lambda = 1 - \Omega_m = 0.72. \qquad (4.79)$$

Note that this value is not a test of the FLRW model, since the model does not
make a specific prediction for Ω_λ. The dotted curve in Figure 4.5 that is denoted
'EdS' represents the Einstein-de Sitter model, which predicts Ω_λ = 0. Comparison
of the curves then constitutes a test of the Einstein-de Sitter model (here the null
hypothesis), with an overwhelming statistical significance [17, 18, 19].
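The comparison can be sketched numerically: compute the 'Hubble-constant free' luminosity distance D_L for a flat accelerating model and for Einstein-de Sitter, and translate the ratio into a magnitude difference via Equation (4.78). The value Ω_m = 0.28 is illustrative:

```python
import numpy as np

# 'Hubble-constant free' luminosity distance D_L = H0*d_L (c = 1) for a flat
# FLRW model versus Einstein-de Sitter, and the magnitude difference that
# Equation (4.78) assigns to the ratio. Omega_m = 0.28 is illustrative.
def D_L_flat(z, Om=0.28, n=10001):
    zz = np.linspace(0.0, z, n)
    E = np.sqrt(Om * (1 + zz) ** 3 + (1 - Om))        # H(z)/H0, flat model
    chi = float(np.sum(0.5 * (1 / E[1:] + 1 / E[:-1]) * np.diff(zz)))
    return (1 + z) * chi

def D_L_EdS(z):                        # Omega_m = 1, Omega_lambda = 0
    return 2 * (1 + z) * (1 - 1 / np.sqrt(1 + z))

dm = float(5 * np.log10(D_L_flat(0.5) / D_L_EdS(0.5)))
print(round(dm, 2))   # ~0.4 mag: supernovae look fainter in the Lambda model
```

At z = 0.5 the accelerating model makes supernovae a few tenths of a magnitude fainter than Einstein-de Sitter, which is the effect visible between the curves in Figure 4.5.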

Figure 4.6 shows the confidence regions of the fit in the (Ω_m, Ω_λ)-plane. One
notices that the most precisely determined parameter is not Ω_λ or Ω_m, but the
difference Ω_λ - Ω_m, which is measured along the flat-space line. The position
of the best-fit confidence region also shows that the supernova data allow a test
of a decelerating versus an accelerating universe. A decelerating universe is disfavoured
at a confidence of about 99.7% (3σ).

In general, the magnitude-redshift relation is a case of parameter estimation,
and will remain so in the future, when the study of more supernovae will allow
more precise determination of Ω_λ and dependent quantities such as t₀, Λ and w_φ.



The Angular Diameter-Redshift Relation. In Equation (2.63) we related the
angular size distance d_A to the proper distance d_P(k, H₀, Ω_m, Ω_λ). In conventional




Figure 4.5 Hubble diagram of effective B-band magnitude versus redshift for the supernovae
studied by Sullivan et al. [17, 18, 19]. The different round and boxed points correspond
to different classes of host galaxy. Reproduced from M. Sullivan et al. [19], 'The
Hubble diagram of type Ia supernovae as a function of host galaxy morphology', by permission
of Blackwell Publishing Ltd.



local physics with a single metric theory, the relations (2.60) and (2.63) are physically
equivalent. Thus our comments on to what extent Equation (2.63) and supernova
data furnish tests or imply parameter estimation are the same as above.



Galaxy and Quasar Counts. The observable here is the number of galaxies or 
quasars within a comoving volume element. The difficulty with galaxies is that 
there are many more galaxies with low luminosities than with high luminosities, 
so this possible test depends on the understanding of the evolution of galaxy lumi- 
nosities. But luminosity distances again depend on H(z), which in turn depends 
on the unknown parameters in Equation (4.56). Thus galaxy counts do not appear 
to constitute a test, rather a method of parameter estimation in the same sense 
as the previous cases. 

Quasars derive their luminosity from rotating accretion discs spiralling into 
massive black holes at the centres of galaxies. If one compares the number of 
lensed quasars seen with the number predicted by observations of intervening 







Figure 4.6 Confidence regions in (Ω_m, Ω_λ) for the fitting procedures in Sullivan et al.
[17, 18, 19]. The ellipses correspond to 68% and 90% confidence regions. The best-fitting
general FLRW cosmology is denoted by a filled diamond, and the best-fitting flat cosmology
by a filled circle. The solid line is the flat-space boundary between closed and open
cosmologies, the dotted line is the boundary between finite and infinite t₀ ('no Big Bang'),
the dashed line is the infinite expansion boundary, and the dot-dashed line separates
accelerating and decelerating universes. (By courtesy of the Supernova Cosmology Project
team.)

galaxies capable of lensing, one finds that the number of lensing events is about
double the prediction, unless dark energy is present. This gives an independent
estimate of the fraction of dark energy, about ⅔, in agreement with Equation (4.79).



Problems 



1. On the solar surface the acceleration caused by the repulsion of a nonvanishing
cosmological constant Λ must be much inferior to the Newtonian
attraction. Derive a limiting value of Λ from this condition.

2. In Newtonian mechanics, the cosmological constant λ can be incorporated
by adding to gravity an outward radial force on a body of mass m, a distance
r from the origin, of F = +mλr/6. Assuming that λ = 10⁻²⁰ yr⁻², and that
F is the only force acting, estimate the maximum speed a body will attain if
its orbit is comparable in size with the Solar System (0.5 light day) [20].

3. The steady-state universe has a ∝ e^{Ht}, zero curvature of its comoving
coordinates (k = 0), and a proper density of all objects that is constant in
time. Show that the comoving volume out to redshift z is V(z) = (4π/3)(cz/H)³,
and hence that the number-count slope for objects at typical redshift z
becomes [(3 + α) ln z]⁻¹ for z ≫ 1, where α is the spectral index for the
objects [21].

4. Starting from Equation (4.56) with the parameters Ω₀ = 1, Ω_r = 0, show that
the age of the Universe can be written

$$t_0 = \frac{2}{3H_0\sqrt{\Omega_\lambda}}\,\operatorname{artanh}\sqrt{\Omega_\lambda}.$$

5. Suppose that dark energy is described by an equation of state w = -0.9 
which is constant in time. At what redshift did this dark energy density 
start to dominate over matter density? What was the radiation density at 
that time? 
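The closed form asked for in Problem 4, t₀ = (2/(3H₀√Ω_λ)) artanh √Ω_λ, can be checked against a direct integration of Equation (4.56); a minimal sketch, with Ω_λ = 0.73 chosen only for illustration:

```python
import math

# Check: H0*t0 = (2/(3*sqrt(OL))) * artanh(sqrt(OL)) for a spatially flat
# universe with Omega_0 = 1, Omega_r = 0. OL = 0.73 is illustrative.
OL = 0.73
Om = 1.0 - OL

closed = 2.0 / (3.0 * math.sqrt(OL)) * math.atanh(math.sqrt(OL))

# Direct integration: H0*t0 = integral_0^1 da / sqrt(Om/a + OL*a^2)
n = 100000
s = 0.0
for i in range(n):
    a = (i + 0.5) / n                          # midpoint rule
    s += 1.0 / (n * math.sqrt(Om / a + OL * a * a))

print(round(closed, 4), round(s, 4))           # the two values agree
```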



Chapter Bibliography 



[1] Rich, J. 2001 Fundamentals of cosmology. Springer.
[2] Sola, J. 2001 Nucl. Phys. B 95, 29.
[3] Cappi, A. 2001 Astrophys. Lett. Commun. 40, 161.
[4] Kraniotis, G. V. and Whitehouse, S. B. 2002 Classical Quantum Gravity 19, 5073.
[5] Peebles, P. J. E. and Ratra, B. 2003 Rev. Mod. Phys. 75, 559.
[6] Lima, J. A. S. and Trodden, M. 1996 Phys. Rev. D 53, 4280.
[7] Cunha, J. V., Lima, J. A. S. and Pires, N. 2002 Astron. Astrophys. 390, 809.
[8] Steinhardt, P. J., Wang, L. and Zlatev, I. 1999 Phys. Rev. D 59, 123504.
[9] Zlatev, I., Wang, L. and Steinhardt, P. J. 1999 Phys. Rev. Lett. 82, 896.
[10] Bludman, S. A. and Roos, M. 2002 Phys. Rev. D 65, 043503.
[11] Ng, S. C. C., Nunes, N. J. and Rosati, F. 2001 Phys. Rev. D 64, 083510.
[12] Armendariz-Picon, C., Mukhanov, V. and Steinhardt, P. J. 2000 Phys. Rev. Lett. 85, 4438.
[13] Armendariz-Picon, C., Mukhanov, V. and Steinhardt, P. J. 2001 Phys. Rev. D 63, 103510.
[14] Caldwell, R. R., Kamionkowski, M. and Weinberg, N. N. 2003 Phys. Rev. Lett. (In press).
[15] Eadie, W. T., Drijard, D., James, F. E., Roos, M. and Sadoulet, B. 1971 Statistical methods in experimental physics. North-Holland, Amsterdam.
[16] Riess, A. G. et al. 1998 Astronom. J. 116, 1009.
[17] Perlmutter, S. et al. 1998 Nature 391, 51.
[18] Perlmutter, S. et al. 1999 Astrophys. J. 517, 565.
[19] Sullivan, M. et al. 2003 Mon. Not. R. Astron. Soc. 340, 1057.
[20] Berry, M. V. 1989 Principles of cosmology and gravitation. Adam Hilger, Bristol.
[21] Peacock, J. A. 1999 Cosmological physics. Cambridge University Press, Cambridge.



5 Thermal History of the Universe



The Big Bang models describe the evolution of our Universe from a state of 
extreme pressure and energy density, when it was very much smaller than it is 
now. Matter as we know it does not stand up to extreme temperatures. The Sun
is a plasma of ionized hydrogen, helium and other elements, but we also know
that nuclei cannot withstand temperatures corresponding to a few
MeV of energy. They decompose into elementary particles, which at yet higher
temperatures decompose into even more elementary constituents under condi- 
tions resembling those met in high-energy particle colliders. An understanding 
of cosmology therefore requires that we study the laws and phenomena of very 
high-temperature plasmas during the early radiation era. 

Motion of particles under the electromagnetic interaction is described by the Maxwell-Lorentz
equations. The motion of a particle in a central field of force F, as for
instance an electron of charge e moving at a distance r around an almost static
proton, is approximated well by the Coulomb force

$$F = \frac{e^2}{r^2}. \qquad (5.1)$$

Note that this has the same form as Newton's law of gravitation, Equation (1.28).
In the electromagnetic case the strength of the interaction is e², whereas the
strength of the gravitational interaction is Gm_p m_e. These two coupling constants
are expressed in completely different units because they apply to systems of completely
different sizes. For the physics of radiation, the gravitational interaction
can be completely neglected but, for the dynamics of the expansion of the Universe,
only the gravitational interaction is important, because celestial objects are
electrically neutral.





In Section 5.1 we begin with the physics of photons and Planck's radiation law, 
which describes how the energy is distributed in an ensemble of photons in ther- 
mal equilibrium, the blackbody spectrum. We also introduce the properties of 
polarization and spin. 

In Section 5.2 we introduce the important concept of entropy and we note 
that a universe filled with particles and radiation in thermal equilibrium must 
indeed have been radiation dominated at an early epoch. Comparing a radiation- 
dominated universe with one dominated by nonrelativistic matter in adiabatic 
expansion, we find that the relation between temperature and scale is different 
in the two cases. This leads to the conclusion that the Universe will not end in 
thermal death, as feared in the 19th century. 

In Section 5.3 we meet new particles and antiparticles, fermions and bosons, 
some of their properties such as conserved quantum numbers, spin, degrees of 
freedom and energy spectrum, and a fair number of particle reactions describing 
their electroweak interactions in the primordial plasma. 

In Section 5.4 we trace the thermal history of the Universe, starting at a time
when the temperature was 10¹³ K. The Friedmann equations offer us the means
of time-keeping as a function of temperature.

In Section 5.5 we continue the thermal history of photons and leptons from 
neutrino decoupling to electron decoupling to the cold microwave radiation of 
today. 

In Section 5.6 we follow the thermal history of the nucleons for the momen- 
tous process of Big Bang nucleosynthesis (BBN) which has left us very important 
clues in the form of relic abundances of helium and other light nuclei. The nucle- 
osynthesis is really a very narrow bottleneck for all cosmological models, and one 
which has amply confirmed the standard Big Bang model. We find that the bary- 
onic matter present since nucleosynthesis is completely insufficient to close the 
Universe. 

5.1 Photons 

Electromagnetic radiation in the form of radio waves, microwaves, light, X-rays
or γ-rays has a dual description: either as waves characterized by the wavelength
λ and frequency ν = c/λ, or as energy quanta, photons, γ. In the early days of
quantum theory the wave-particle duality was seen as a logical paradox. It is
now understood that the two descriptions are complementary, the wave picture
being more useful to describe, for instance, interference phenomena, whereas the
particle picture is needed to describe the kinematics of particle reactions or, for
instance, the functioning of a photocell (this is what Einstein received the Nobel
prize for!). Energy is not a continuous variable; it comes in discrete packages:
it is quantized. The quantum carried by an individual photon is

$$E = h\nu, \qquad (5.2)$$

where h is Planck's constant. The wavelength and energy ranges corresponding
to the different types of radiation are given in Table A.3.



Blackbody Spectrum. Let us study the thermal history of the Universe in the Big 
Bang model. At the very beginning the Universe was in a state of extreme heat and 
pressure, occupying an exceedingly small volume. Before the onset of the present 
epoch, in which most of the energy exists in the form of fairly cold matter, there 
was an era when the pressure of radiation was an important component of the 
energy density of the Universe, the era of radiation domination. As the Universe 
cooled, matter condensed from a hot plasma of particles and electromagnetic 
radiation, later to form material structures in the forms of clusters, galaxies and 
stars. 

During that era no atoms or atomic nuclei had yet been formed, because the tem- 
perature was too high. Only the particles which later combined into atoms existed. 
These were the free electrons, protons, neutrons and various unstable particles, as 
well as their antiparticles. Their speeds were relativistic, they were incessantly col- 
liding and exchanging energy and momentum with each other and with the radi- 
ation photons. A few collisions were sufficient to distribute the available energy 
evenly among them. On average they would then have the same energy, but some 
particles would have less than average and some more than average. When the col- 
lisions resulted in a stable energy spectrum, thermal equilibrium was established 
and the photons had the blackbody spectrum derived in 1900 by Max Planck. 

Let the number of photons of energy hν per unit volume and frequency interval
be n_γ(ν). Then the photon number density in the frequency interval (ν, ν + dν)
is

$$n_\gamma(\nu)\,\mathrm{d}\nu = \frac{8\pi}{c^3}\,\frac{\nu^2\,\mathrm{d}\nu}{\mathrm{e}^{h\nu/kT} - 1}. \qquad (5.3)$$

At the end of the 19th century some 40 years were spent trying to find this formula
by trial and error. With the benefit of hindsight, the derivation is straightforward,
based on classical thermodynamics as well as on quantum mechanics,
unknown at Planck's time.

Note that Planck's formula depends on only one parameter, the temperature
T. Thus the energy spectrum of photons in thermal equilibrium is completely
characterized by its temperature T. The distribution (5.3) peaks at the frequency

$$\nu_{\max} \approx 10^{10}\,T \qquad (5.4)$$

in units of hertz or cycles per second, where T is given in kelvin.

The total number of photons per unit volume, or the number density N_γ, is
found by integrating this spectrum over all frequencies:

$$N_\gamma = \int_0^\infty n_\gamma(\nu)\,\mathrm{d}\nu = 1.202 \times \frac{2}{\pi^2}\left(\frac{kT}{\hbar c}\right)^3. \qquad (5.5)$$

Here ħ represents the reduced Planck constant, ħ = h/2π. The solution of the
integral in this equation can be given in terms of Riemann's zeta function; ζ(3) ≈
1.2020.
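Evaluating Equation (5.5) at the present temperature of the cosmic microwave background, T = 2.725 K, gives the familiar result of roughly 411 photons per cubic centimetre:

```python
import math

# Photon number density from Equation (5.5) at the present CMB temperature.
k_B   = 1.380649e-23       # J/K
hbar  = 1.054571817e-34    # J s
c     = 2.99792458e8       # m/s
zeta3 = 1.2020569          # Riemann zeta(3)

T = 2.725
N_gamma = (2 * zeta3 / math.pi**2) * (k_B * T / (hbar * c)) ** 3   # per m^3
print(round(N_gamma * 1e-6, 1))        # photons per cm^3, about 410-411
```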

Since each photon of frequency ν is a quantum of energy hν (this is the interpretation
Planck was led to, much to his own dismay, because it was in obvious
conflict with classical ideas of energy as a continuously distributed quantity), the
total energy density of radiation is given by the Stefan-Boltzmann law, after Josef
Stefan (1835-1893) and Ludwig Boltzmann (1844-1906),

$$\varepsilon_r = \int_0^\infty h\nu\,n_\gamma(\nu)\,\mathrm{d}\nu = \frac{\pi^2 k^4}{15\hbar^3 c^3}\,T^4 = a_{\mathrm{s}}T^4, \qquad (5.6)$$

where all the constants are lumped into Stefan's constant

$$a_{\mathrm{s}} = 4723\ \mathrm{eV\,m^{-3}\,K^{-4}}.$$

A blackbody spectrum is shown in Figure 8.1.
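The numerical value of Stefan's constant quoted above can be reproduced from Equation (5.6) with CODATA constants (small differences in the last digit come from rounding):

```python
import math

# Stefan's constant a_s = pi^2 k^4 / (15 hbar^3 c^3), in eV m^-3 K^-4.
k_B  = 1.380649e-23        # J/K
hbar = 1.054571817e-34     # J s
c    = 2.99792458e8        # m/s
eV   = 1.602176634e-19     # J

a_s = math.pi**2 * k_B**4 / (15 * hbar**3 * c**3) / eV
print(round(a_s))          # about 4722 eV m^-3 K^-4, cf. 4723 in the text
```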



Polarization. Consider a plane wave of monochromatic light with frequency ν
moving along the momentum vector in the z direction. The components of the
wave's electric field vector E in the (x, y)-plane oscillate with time t in such a way
that they can be written

$$E_x(t) = a_x(t)\cos[\nu t - \theta_x(t)], \qquad E_y(t) = a_y(t)\cos[\nu t - \theta_y(t)], \qquad (5.7)$$

where a_x(t) and a_y(t) are the amplitudes, and θ_x(t) and θ_y(t) are the phase
angles.

A well-known property of light is its two states of polarization. Unpolarized 
light passing through a pair of polarizing sunglasses becomes vertically polarized. 
Unpolarized light reflected from a wet street becomes horizontally polarized. The 
advantage of polarizing sunglasses is that they block horizontally polarized light 
completely, letting all the vertically polarized light through. Their effect on a beam 
of unpolarized sunlight is to let, on average, every second photon through verti- 
cally polarized, and to block every other photon as if it were horizontally polar- 
ized: it is absorbed in the glass. Thus the intensity of light is also reduced to 
one-half. 

Polarized and unpolarized light (or other electromagnetic radiation) can be 
described by the Stokes parameters, which are the time averages (over times much 
longer than 1/v) 

I={a 2 x ) + {a 2 y ), a = {a 2 x )-{a 2 y ), 1 

y * y \ (58) 

U = {2a x a y cos(0 x -9 y )), V = {2a x a y sin(0 x - y )).\ 

The parameter J gives the intensity of light, which is always positive definite. 
The electromagnetic field is unpolarized if the two components in Equation (5.7) 
are uncorrected, which translates into the condition Q = U = V = 0. If two 
components in Equation (5.7) are correlated, they either describe light that is 
linearly polarized along one direction in the (x,y)-plane, or circularly polarized 
in the plane. In the linear case U = or V = 0, or both. Under a rotation of angle 
4> in the (x,y)-plane, the quantity Q 2 + U 2 is an invariant (Problem 2) and the 
orientation of the polarization 

a= |arctan(U/Q) (5.9) 



Adiabatic Expansion 117 

transforms to α − φ. Thus the orientation does not define a direction, it only refers
the polarization to the (x, y)-plane.
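The rotation invariance of Q² + U² (Problem 2) can be checked numerically from the definitions (5.8). The sketch below is not from the book; the amplitudes, phases and rotation angle are arbitrary illustrative values, and the wave components are represented as complex amplitudes a·e^{iθ}:

```python
import cmath, math

def stokes(Ex, Ey):
    """Stokes parameters (5.8) of a fully polarized wave whose components
    are given as complex amplitudes a * exp(i * theta)."""
    I = abs(Ex) ** 2 + abs(Ey) ** 2
    Q = abs(Ex) ** 2 - abs(Ey) ** 2
    U = 2.0 * (Ex * Ey.conjugate()).real   # 2 a_x a_y cos(theta_x - theta_y)
    V = 2.0 * (Ex * Ey.conjugate()).imag   # 2 a_x a_y sin(theta_x - theta_y)
    return I, Q, U, V

def rotate(Ex, Ey, phi):
    """Rotate the transverse field components by an angle phi in the (x, y)-plane."""
    return (Ex * math.cos(phi) + Ey * math.sin(phi),
            -Ex * math.sin(phi) + Ey * math.cos(phi))

# Illustrative wave: a_x = 1.0, a_y = 0.5, phase difference 0.3 rad
Ex, Ey = 1.0 * cmath.exp(0.3j), 0.5 * cmath.exp(0.0j)
I, Q, U, V = stokes(Ex, Ey)
I2, Q2, U2, V2 = stokes(*rotate(Ex, Ey, 0.7))

assert abs(I2 - I) < 1e-12 and abs(V2 - V) < 1e-12    # I and V are invariant
assert abs((Q2**2 + U2**2) - (Q**2 + U**2)) < 1e-12   # Q^2 + U^2 is invariant
```

Under the rotation, Q and U mix into one another (through the angle 2φ) while only Q² + U² stays fixed; this is the numerical face of the statement that the polarization is a second-rank tensor rather than a vector.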

The photon is peculiar in lacking a longitudinal polarization state, and the 
polarization is therefore not a vector in the (x,y)-plane; in fact it is a second- 
rank tensor. This is connected to the fact that the photon is massless. Recall that 
the theory of special relativity requires the photons to move with the speed of 
light in any frame. Therefore they must be massless, otherwise one would be able 
to accelerate them to higher speeds, or decelerate them to rest. 

In a way, it appears as if there existed two kinds of photons. Physics has taken 
this into account by introducing an internal property, spin. Thus, one can talk 
about the two polarization states or about the two spin states of the photon. We 
shall come back to photon polarization later. 



5.2 Adiabatic Expansion 

For much of the thermal history of the Universe, the reaction rates of photons 
and other particles have been much greater than the Hubble expansion rate, so 
thermal equilibrium should have been maintained in any local comoving volume 
element dV. There is then no net inflow or outflow of energy, which defines the 
expansion as adiabatic, as was done in Equation (4.25). The law of conservation of 
energy (4.24), also called the first law of thermodynamics, followed, by assuming 
that matter behaved as an expanding nonviscous fluid at constant pressure p. 



Adiabaticity and Isentropy. The entropy per unit comoving volume and physical
volume V = R³ at temperature T is defined by

S = (1/kT)(ρc² + p)V.    (5.10)

Let us rewrite the first law of thermodynamics more generally in the form

d[(ρc² + p)V] = V dp,    (5.11)

where the energy is E = ρc²V.

The second law of thermodynamics can be written

dS = (1/kT){d[(ρc² + p)V] − V dp}.    (5.12)

If the expansion is adiabatic and the pressure p is constant so that d(pV) = p dV,
we recover Equation (4.25). Then it also follows that the expansion is isentropic:

dS = 0.    (5.13)



In the literature, the terms 'adiabaticity' and 'isentropy' are often confused.
Moreover, it follows from Equation (5.11) and the constancy of p that

dE = −p dV.    (5.14)



118 Thermal History of the Universe 

Thus a change in volume dV is compensated for by a change in energy dE at
constant pressure and entropy.

The second law of thermodynamics states in particular that entropy cannot 
decrease in a closed system. The particles in a plasma possess maximum entropy 
when thermal equilibrium has been established. The assumption that the Uni- 
verse expands adiabatically and isentropically is certainly very good during the 
radiation-dominated era when the fluid was composed of photons and elementary 
particles in thermal equilibrium. 

This is also true during the matter-dominated era before matter clouds start to 
contract into galaxies under the influence of gravity. Even on a very large scale we 
may consider the galaxies forming a homogeneous 'fluid', an idealization as good 
as the cosmological principle that forms the basis of all our discussions. In fact, 
we have already relied on this assumption in the derivation of Einstein's equation 
and in the discussion of equations of state. However, the pressure in the 'fluid' 
of galaxies of density N is negligibly small, because it is caused by their random 
motion, just as the pressure in a gas is due to the random motion of the molecules. 
Since the average peculiar velocities ⟨v⟩ of the galaxies are of the order of 10⁻³c,
the ratio of pressure p = m⟨v⟩²N to matter density ρ gives an equation of state
(Problem 5) of the order of

p/ρc² = m⟨v⟩²N/(mNc²) = ⟨v⟩²/c² ≈ 10⁻⁶.


We have already relied on this value in the case of a matter-dominated universe 
when deriving Equation (4.30). 
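The estimate above can be restated in two lines; this is a trivial numerical check, not from the book:

```python
# Equation of state of the 'fluid' of galaxies: peculiar velocities of
# order <v> ~ 1e-3 c give p/(rho c^2) ~ 1e-6.
v_over_c = 1e-3                  # typical peculiar velocity in units of c
w = v_over_c ** 2                # m<v>^2 N / (m N c^2) = (<v>/c)^2
assert abs(w - 1e-6) < 1e-12
```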



Radiation/Matter Domination. Let us compare the energy densities of radiation
and matter. The energy density of electromagnetic radiation corresponding to one
photon in a volume V is

ρ_r c² ≡ ε_r = hν/V = hc/(λV).    (5.15)

In an expanding universe with cosmic scale factor a, all distances scale as a
and so does the wavelength λ. The volume V then scales as a³; thus ε_r scales as
a⁻⁴. Here and in the following the subscript 'r' stands for radiation and relativistic
particles, while 'm' stands for nonrelativistic (cold) matter.

Statistical mechanics tells us that the pressure in a nonviscous fluid is related 
to the energy density by the equation of state (4.32) 

p = ε/3,    (5.16)

where the factor 1/3 comes from averaging over the three spatial directions. Thus
pressure also scales as a⁻⁴, so that it will become even more negligible in the
future than it is now. The energy density of matter,

ρ_m c² ≡ ε_m = mc²/V,    (5.17)







Figure 5.1 Scale dependence of the energy density in radiation ε_r, which dominates at
small a, and in matter ρ_m, which dominates at large a, in units of log(GeV m⁻³). The scale
value a_eq (redshift z_eq) is evaluated in Equation (8.49) and indicated also in Figure 5.9.

also decreases with time, but only with the power a⁻³. Thus the ratio of radiation
energy to matter scales as a⁻¹:

ε_r/ε_m ∝ a⁻⁴/a⁻³ = a⁻¹.    (5.18)

The present radiation energy density is predominantly in the form of microwaves
and infrared light. Going backwards in time we reach an era when radiation
and matter both contributed significantly to the total energy density. The change
from radiation domination to matter domination is gradual: at t = 1000 yr the
radiation fraction was about 90%, at t = 2 Myr only about 10% (see Figure 5.1).
We shall calculate the time of equality t_eq in Chapter 8 when we know the present
energy densities.
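A minimal sketch of this a⁻¹ scaling, using an illustrative present-day radiation-to-matter ratio (the actual value follows from the measured densities in Chapter 8, Equation (8.49)):

```python
# Radiation-to-matter energy density ratio, Eq. (5.18): eps_r/eps_m ∝ 1/a.
# r0 is an assumed, illustrative present-day value of that ratio.
r0 = 1.0 / 3400.0

def ratio(a):
    """eps_r/eps_m at scale factor a (a = 1 today): a^-4/a^-3 = r0/a."""
    return r0 / a

a_eq = r0                     # equality of the two densities: ratio(a_eq) = 1
z_eq = 1.0 / a_eq - 1.0       # the corresponding redshift
assert abs(ratio(a_eq) - 1.0) < 1e-12
assert 3000 < z_eq < 3500     # of the right order for this assumed r0
```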



Temperature Dependence. A temperature T may be converted into units of 
energy by the dimensional relation 

E = kT, (5.19) 

where k is the Boltzmann constant (Table A.2). Since E scales as a⁻¹ it follows
that the temperature of radiation T_r also scales as a⁻¹ (Problem 1):

T_r ∝ a⁻¹ ∝ (1 + z).    (5.20)

This dependence is roughly verified by measurements of the relic cosmic microwave
background (CMB) radiation temperature at various times, corresponding
to redshifts z < 4.4. Only two measurements give an absolute value: T = 10 ± 4 K
at z = 2.338 and T = 12.1 (+1.7, −3.2) K at z = 3.025, in agreement with Equation (5.20).
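Equation (5.20) can be compared with the quoted measurements by normalizing to the measured present CMB temperature; the check below is an illustrative sketch, not from the book:

```python
# Radiation temperature scaling, Eq. (5.20): T_r ∝ (1 + z), normalized
# to the present CMB temperature T0.
T0 = 2.725                           # K, present CMB temperature

def T_cmb(z):
    return T0 * (1.0 + z)

assert abs(T_cmb(2.338) - 10.0) < 4.0   # inside the 10 ± 4 K measurement
assert 9.0 < T_cmb(3.025) < 13.0        # consistent with the ~12 K measurement
```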



Relativistic Particles. It is important to distinguish between relativistic and nonrelativistic
particles because their energy spectra in thermal equilibrium are different.
A coarse rule is that a particle is nonrelativistic when its kinetic energy is
small in comparison with its mass, and relativistic when E > 10mc². The masses
of some cosmologically important particles are given in Table A.4. For compar- 
ison, the equivalent temperatures are also given. This gives a rough idea of the 
temperature of the heat bath when the respective particle is nonrelativistic. 

The isentropy condition (5.13) can be applied to both relativistic and nonrela- 
tivistic particles. Let us first consider the relativistic particles which dominate the 
radiation era. Recall from Equation (2.69) that the energy of a particle depends on 
two terms, mass and kinetic energy, 



E = √(m²c⁴ + P²c²),    (5.21)

where P is momentum. For massless particles such as the photons, the mass term 
is of course absent; for relativistic particles it can be neglected. 

Replacing E in Equation (5.14) by the energy density ε_r times the volume a³ = V,
Equation (5.14) becomes

d(a³ε_r) = −p d(a³).    (5.22)

Substituting ε_r/3 for the pressure p from the equation of state (5.16) we obtain

a³ dε_r + ε_r da³ = −⅓ε_r da³,

or

dε_r/ε_r = −(4/3) da³/a³.    (5.23)

The solution to this equation is

ε_r ∝ a⁻⁴,    (5.24)

in agreement with our previous finding. We have in fact already used this result
in Equation (4.33).



Non-relativistic Particles. For nonrelativistic particles the situation is different.
Their kinetic energy ε_kin is small, so that the mass term in Equation (5.21) can no
longer be neglected. The motion of n particles per unit volume is then characterized
by a temperature T_m, causing a pressure

p = nkT_m.    (5.25)

Note that T_m is not the temperature of matter in thermal equilibrium, but rather a
bookkeeping device needed for dimensional reasons. The equation of state differs
from that of radiation and relativistic matter, Equation (5.16), by a factor of 2:

p = (2/3)ε_kin.

Including the mass term of the n particles, the energy density of nonrelativistic
matter becomes

ρ_m c² ≡ ε_m = nmc² + (3/2)nkT_m.    (5.26)




Substituting Equations (5.25) and (5.26) into Equation (5.22) we obtain 

d(a³nmc²) + (3/2) d(a³nkT_m) = −nkT_m da³.    (5.27)

Let us assume that the total number of particles always remains the same: in a 
scattering reaction there are then always two particles coming in, and two going 
out, whatever their types. This is not strictly true because there also exist other 
types of reactions producing more than two particles in the final state. However, 
let us assume that the total number of particles in the volume V under consider- 
ation is N = Vn, and that N is constant during the adiabatic expansion, 

dN = d(Vn) = d(a³n) = 0.    (5.28)

The first term in Equation (5.27) then vanishes and we are left with

(3/2)a³ dT_m = −T_m d(a³),

or

(3/2) dT_m/T_m = −d(a³)/a³.

The solution to this differential equation is of the form

T_m ∝ a⁻².    (5.29)

Thus we see that the temperature of nonrelativistic matter has a different depen- 
dence on the scale of expansion than does the temperature of radiation. This has 
profound implications for one of the most serious problems in thermodynamics 
in the 19th century. 
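The step from the differential equation above to the power law (5.29) can be verified numerically; the Euler integration below is an illustrative sketch with an arbitrary initial temperature, not from the book:

```python
# Integrate (3/2) dT_m/T_m = -d(a^3)/a^3, i.e. dT_m/da = -2 T_m/a,
# from a = 1 to a = 2 with an arbitrary initial T_m = 1.
a, T = 1.0, 1.0
steps = 20000
da = 1.0 / steps
for _ in range(steps):
    T += -2.0 * T / a * da      # dT = -(2/3) T d(a^3)/a^3 = -2 T da/a
    a += da

assert abs(T - 2.0 ** -2) < 1e-3          # matter: T_m ∝ a^-2, Eq. (5.29)
assert abs(T / 0.5 - 0.5) < 2e-3          # T_m/T_r falls as 1/a (T_r ∝ 1/a)
```

After a doubling of the scale factor, the matter temperature has fallen by a factor 4, the radiation temperature only by 2; this is the faster cooling of matter invoked below.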



Thermal Death. Suppose that the Universe starts out at some time with γ-rays at
high energy and electrons at rest. This would be a highly ordered nonequilibrium 
system. The photons would obviously quickly distribute some of their energy to 
the electrons via various scattering interactions. Thus the original order would 
decrease, and the randomness or disorder would increase. The second law of 
thermodynamics states that any isolated system left by itself can only change 
towards greater disorder. The measure of disorder is entropy; thus the law says 
that entropy cannot decrease. 

The counterexample which living organisms seem to furnish, since they build 
up ordered systems, is not valid. This is because no living organism exists in 
isolation; it consumes nutrients and produces waste. Thus, establishing that a 
living organism indeed increases entropy would require measurement of a much 
larger system, certainly not smaller than the Solar System. 

It now seems to follow from the second law of thermodynamics that all energy
would ultimately distribute itself evenly throughout the Universe, so that no further
temperature differences would exist. The discoverer of the law of conservation
of energy, Hermann von Helmholtz (1821-1894), came to the distressing
conclusion in 1854 that 'from this point on, the Universe will be falling into a state
of eternal rest'. This state was named thermal death, and it preoccupied greatly 
both philosophers and scientists during the 19th century. 

Now we see that this pessimistic conclusion was premature. Because, from the
time when the temperatures of matter and radiation were equal,

T_m = T_r,

we see from Equations (5.20) and (5.29) that the adiabatic expansion of the Universe
causes matter to cool faster than radiation.
verse causes matter to cool faster than radiation. Thus cold matter and hot radi- 
ation in an expanding Universe are not and will never be in thermal equilibrium 
on a cosmic timescale. This result permits us to solve the adiabatic equations of 
cold matter and hot radiation separately, as we in fact have. 



5.3 Electroweak Interactions 

Virtual Particles. In quantum electrodynamics (QED) the electromagnetic field
is mediated by photons which are emitted by a charged particle and absorbed 
very shortly afterwards by another. Photons with such a brief existence during an 
interaction are called virtual, in contrast to real photons. 

Virtual particles do not travel freely to or from the interaction region. Energy 
is not conserved in the production of virtual particles. This is possible because 
the energy imbalance arising at the creation of the virtual particle is compensated 
for when it is annihilated, so that the real particles emerging from the interaction 
region possess the same amount of energy as those entering the region. We have 
already met this argument in the discussion of Hawking radiation from black 
holes. 

However, nature impedes the creation of very large energy imbalances. For
example, the masses of the vector bosons W± and Z⁰ mediating the electroweak
interactions are almost 100 GeV. Reactions at much lower energies involving vir- 
tual vector bosons are therefore severely impeded, and much less frequent than 
electromagnetic interactions. For this reason such interactions are called weak 
interactions. 

Real photons interact only with charged particles such as protons p and electrons
e⁻, and with their oppositely charged antiparticles, the anti-proton p̄ and the positron
e⁺. An example is the elastic Compton scattering of electrons by photons:

γ + e± → γ + e±.    (5.30)

As a result of virtual intermediate states, neutral particles may exhibit electromagnetic
properties such as a magnetic moment.

Antimatter does not exist on Earth, and there is very little evidence for its pres- 
ence elsewhere in the Galaxy. That does not mean that antiparticles are pure fic- 
tion: they are readily produced in particle accelerators and in violent astrophysical 
events. However, in an environment of matter, antiparticles rapidly meet their cor- 
responding particles and annihilate each other. The asymmetry in the abundance 






Figure 5.2 Feynman diagram for elastic scattering of an electron e⁻ on a proton p. This
is an electromagnetic interaction mediated by a virtual photon γ. The direction of time is
from left to right.

of matter and antimatter is surprising and needs an explanation. We shall deal 
with that in Section 6.7. 

Charged particles interact via the electromagnetic field. Examples are the elastic
scattering of electrons and positrons,

e± + e± → e± + e±,    (5.31)

and the Coulomb interaction between an electron and a proton, depicted in Figure 5.2.
The free e⁻ and p enter the interaction region from the left, time running
from left to right. They then exchange a virtual photon, and finally they leave the
interaction region as free particles. This Feynman diagram does not show that the
energies and momenta of the e⁻ and p change in the interaction. If one particle
is fast and the other slow, the result of the interaction is that the slow particle 
picks up energy from the fast one, just as in the case of classical billiard balls. 
The Coulomb interaction between particles of like charges is repulsive and that 
between unlike charges is attractive. In both cases the energy and momentum get 
redistributed in the same way. 

When an electron is captured by a free proton, they form a bound state, a hydrogen
atom, which is a very stable system. An electron and a positron may also form
a bound atom-like state called positronium. This is a very unstable system: the
electron and positron are antiparticles, so they rapidly end up annihilating each
other according to the reaction

e⁻ + e⁺ → γ + γ.    (5.32)

Since the total energy is conserved, the annihilation results in two (or three) pho- 
tons possessing all the energy and flying away with it at the speed of light. 

The reverse reaction is also possible. A photon may convert briefly into a virtual
e⁻e⁺ pair, and another photon may collide with either one of these charged particles,
knocking them out of the virtual state, thus creating a free electron-positron
pair:

γ + γ → e⁻ + e⁺.    (5.33)

This requires the energy of each photon to equal at least the electron (positron)
mass, 0.51 MeV. If the photon energy is in excess of 0.51 MeV the e⁻e⁺ pair will
not be created at rest, but both particles will acquire kinetic energy.
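The threshold arithmetic for reactions (5.33) and (5.35) can be spelled out in a few lines; the rest energies are from Table A.4, and the helper function is ours, not the book's:

```python
# Pair-production thresholds: each photon in gamma + gamma -> e- + e+ must
# carry at least the electron rest energy; e- + e+ -> p + pbar needs at
# least the proton rest energy per incoming particle.
m_e, m_p = 0.511, 938.3              # rest energies in MeV (Table A.4)

def kinetic_energy_after(E_per_particle, m):
    """Energy above the rest mass, available as kinetic energy."""
    return E_per_particle - m

assert kinetic_energy_after(0.511, m_e) == 0.0   # pair created exactly at rest
assert kinetic_energy_after(1.0, m_e) > 0.0      # excess energy -> motion
assert m_p / m_e > 1800   # why reaction (5.35) needs ~2000 times more energy
```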







Figure 5.3 Feynman diagram for p̄p annihilation into e⁻e⁺ via an intermediate virtual
photon γ. The direction of time is from left to right.

Protons and anti-protons have electromagnetic interactions similar to positrons
and electrons. They can also annihilate into photons, or for instance into an
electron-positron pair via the mediation of a virtual photon,

p + p̄ → γ_virtual → e⁻ + e⁺,    (5.34)

as depicted in Figure 5.3. The reverse reaction

e⁻ + e⁺ → γ_virtual → p + p̄    (5.35)

is also possible, provided the electron and positron possess enough kinetic energy
to create a proton, that is, 938.3 MeV.

Note that the total electric charge is conserved throughout the reactions (5.30)-
(5.35) and in Figures 5.2 and 5.3. Its value after the interaction (to the right of the
arrow) is the same as it was before (to the left of the arrow). This is an important
conservation law: electric charge can never disappear nor arise out of the neutral
vacuum. In the annihilation of an e⁻e⁺ pair into photons, all charges do indeed
vanish, but only because the sum of the charges was zero to start with.

Baryons and Leptons. All the charged particles mentioned above have neutral
partners as well. The partners of the p, p̄, e⁻, e⁺ are the neutron n, the anti-neutron
n̄, the electron neutrino ν_e and the electron anti-neutrino ν̄_e. The p and n are called
nucleons: they belong together with a host of excited nucleon-like states to the
more general family of baryons. The p̄ and n̄ are correspondingly anti-nucleons
or anti-baryons. The e⁻, e⁺, ν_e and ν̄_e are called leptons and anti-leptons of the
electron family (e).

We also have to introduce two more families or flavours of leptons: the μ family,
comprising the charged muons μ± and their associated neutrinos ν_μ, ν̄_μ, and the τ
family comprising τ± and ν_τ, ν̄_τ. The μ± and τ± are much more massive than the
electrons, but otherwise their physics is very similar. They participate in reactions
such as Equations (5.30)-(5.33) with e replaced by μ or τ, respectively. We shall
discuss baryons and leptons in more detail in Chapter 6.

The charge can easily move from a charged particle to a neutral one as long 
as that does not violate the conservation of total charge in the reaction. Further, 
we need to know the following two conservation laws governing the behaviour of 
baryons and leptons. 

(i) B or baryon number is conserved. This forbids the total number of baryons
minus anti-baryons from changing in particle reactions. To help the bookkeeping
in particle reactions we assign the value B = 1 to baryons and B = −1
to anti-baryons in a way analogous to the assignment of electric charges.
Photons and leptons have B = 0.

(ii) L_l or l-lepton number is conserved for each of the flavours l = e, μ, τ. This
forbids the total number of l-leptons minus l-anti-leptons from changing in
particle reactions. We assign L_e = 1 to e⁻ and ν_e, L_e = −1 to e⁺ and ν̄_e,
and correspondingly to the members of the μ and τ families. Photons and
baryons have no lepton numbers.

However, there is an amendment to this rule, caused by the complications in the
physics of neutrinos. Although the flavour state l is conserved in neutrino reactions,
it is not conserved in free flight. To observe the flavour state l of neutrinos
is not the same as observing the neutrino mass states. There are three neutrino
mass states called ν₁, ν₂, ν₃, which are not identical to the flavour states; rather,
they are quantum-mechanical superpositions of them. The states can mix in such
a way that a pure mass state is a mixture of flavour states, and vice versa. Roughly,
the ν_e is a specific mixture of ν₁, ν₂ and ν₃.

All leptons participate in the weak interactions mediated by the heavy virtual
vector bosons W± and Z⁰. The Z⁰ is just like a photon except that it is very massive,
about 91 GeV, and the W± are its 10 GeV lighter charged partners. Weak leptonic
reactions are

e± + ν_e → e± + ν_e,    (5.36)

ν_e + ν_e → ν_e + ν_e,    (5.37)

where ν_e stands for ν_e or ν̄_e. The Feynman diagrams of some of these reactions
are shown in Figure 5.4. There is also the annihilation reaction

e⁻ + e⁺ → ν_e + ν̄_e,    (5.38)

and the pair production reaction

ν_e + ν̄_e → e⁻ + e⁺.    (5.39)

Similar reactions apply to the two other lepton families, replacing e above by
μ or τ, respectively. Figure 5.4 illustrates that the total baryon number B and
the total lepton number L_e are both conserved throughout the above reactions.
Note that the ν_e can scatter against electrons by the two Feynman diagrams corresponding
to W± exchange and Z⁰ exchange, respectively. In contrast, ν_μ and ν_τ
can only scatter by the Z⁰ exchange diagram, because of the separate conservation
of lepton-family numbers.



Fermions and Bosons. The leptons and nucleons all have two spin states each. In 
the following we shall refer to them as fermions, after Enrico Fermi (1901-1954), 
whereas the photon and the W and Z are bosons, after Satyendranath Bose 






Figure 5.4 Feynman diagram for elastic scattering of an electron neutrino v e against 
an electron e~. This weak interaction is mediated by a virtual W~ vector boson in the 
charged current reaction (upper figure), and by a virtual Z° vector boson in the neutral 
current reaction (lower figure). The direction of time is from left to right. 

(1894-1974). In Table A.4 we have already met one more boson, the π meson,
or pion. The difference between bosons and fermions is deep and fundamental. 
The number of spin states is even for fermions, odd for bosons (except the pho- 
ton). They behave differently in a statistical ensemble. Fermions have antiparticles 
which most bosons do not. The fermion number is conserved, indeed separately 
for leptons and baryons, as we have seen. The number of bosons is not conserved; 
for instance, in pp collisions one can produce any number of pions and photons. 

Two identical fermions refuse to get close to one another. This is the Pauli 
exclusion force responsible for the electron degeneracy pressure in white dwarfs 
and the neutron degeneracy pressure in neutron stars. A gas of free electrons will 
exhibit pressure even at a temperature of absolute zero. According to quantum 
mechanics, particles never have exactly zero velocity: they always carry out ran- 
dom motions, causing pressure. For electrons in a high-density medium such as 
a white dwarf with density 10⁶ρ_⊙, the degeneracy pressure is much larger than
the thermal pressure, and it is enough to balance the pressure of gravity. 

Bosons do not feel such a force, nothing inhibits them getting close to each 
other. However, it is beyond the scope and needs of this book to explain these 
properties further. They belong to the domains of quantum mechanics and quan- 
tum statistics. 

The massive vector bosons W± and Z⁰ have three spin or polarization states:
the transversal (vertical and horizontal) states which the photons also have, and 
the longitudinal state along the direction of motion, which the photon is lacking. 

In Table A.5 the number of spin states, n_spin, of some of the cosmologically
important particles is given. The fourth column tabulates n_anti, which equals 2
for particles which possess a distinct antiparticle, otherwise it is equal to 1.






Figure 5.5 A beam of particles of flux F hitting a target T which scatters some of them
into a detector D in the direction θ. The detector, which has a sensitive surface dD, then
records dN_scat scattered particles.

As already explained, the number of distinct states or degrees of freedom, g, of
photons in a statistical ensemble (in a plasma, say) is two. In general, due to the
intricacies of quantum statistics, the degrees of freedom are the product of n_spin,
n_anti, and a factor n_Pauli = 7/8, which only enters for fermions obeying Fermi-Dirac
statistics. For bosons this factor is unity. This product,

g = n_spin n_anti n_Pauli,    (5.40)

is tabulated in the fifth column of Table A.5.
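Equation (5.40) can be made concrete with a few entries of Table A.5; the sketch below is ours (the function name is not the book's), using exact fractions for the 7/8 factor:

```python
from fractions import Fraction

def g(n_spin, n_anti, fermion):
    """Degrees of freedom g = n_spin * n_anti * n_Pauli, Eq. (5.40);
    n_Pauli = 7/8 for fermions, 1 for bosons."""
    return n_spin * n_anti * (Fraction(7, 8) if fermion else Fraction(1))

assert g(2, 1, False) == 2                 # photon: two polarization states
assert g(2, 2, True) == Fraction(7, 2)     # e- counted together with e+
assert g(1, 2, True) == Fraction(7, 4)     # a neutrino with its antineutrino
```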



Reaction Cross-Sections. The rate at which a reaction occurs, or the number of
events per unit time, depends on the strength of the interaction as expressed by
the coupling constant. It may also depend on many other details of the reaction,
such as the spins and masses of the participating particles, and the energy E.
All this information is contained in the reaction cross-section, σ. Let us follow an
elementary argument to derive this quantity.

Suppose a beam contains k monoenergetic particles per m³, all flying with velocity
v m s⁻¹ in the same direction (see Figure 5.5). This defines the flux F = kv of particles
per m² s in the beam. Let the beam hit a surface containing N target particles on
which each beam particle may scatter. The number of particle reactions (actual
scatterings) per second is then proportional to F and N. Consider the number of
particles dN_scat scattered into a detector of angular opening dΩ in a direction θ
from the beam direction (we assume azimuthal symmetry around the beam direction).
Obviously, dN_scat is proportional to the number of particle reactions and to
the detector opening,

dN_scat = FNσ(θ) dΩ.

The proportionality factor σ(θ) contains all the detailed information about the
interaction. Integrating over all directions we can write this as

N_scat = FNσ,    (5.41)

where the proportionality constant

σ = ∫ σ(θ) dΩ




has the dimension of a surface, here m². For this reason it has been named the
cross-section.

One can also understand the reason for the surface units from a classical argument.
Suppose a particle reaction can be treated as a game of billiard balls. Then
the probability of a hit is clearly proportional to the size of the target ball (of radius
R), as seen by the hitting ball, or πR². The difference between billiard balls and
particles is that σ should not be understood as the actual size, because that is
not a useful quantity in quantum mechanics. Rather, it depends on the interaction
in a complicated manner.

The number density of relativistic particles other than photons is given by distributions
very similar to the Planck distribution. Let us replace the photon energy
hν in Equation (5.3) by E, which is given by the relativistic expression (5.21). Noting
that the kinematic variable is now the three-momentum p = |p| (since for
relativistic particles we can ignore the mass), we can replace Planck's distribution
by the number density of particle species i with momentum between p and p + dp,

n_i(p) dp = (8π/h³)(n_spin,i/2) p² dp / (e^{E(p)/kT_i} ± 1).    (5.42)

The ± sign is '−' for bosons and '+' for fermions, and the names for these distributions
are the Bose distribution and the Fermi distribution, respectively. The Fermi
distribution in the above form is actually a special case: it holds when the number
of charged fermions equals the number of corresponding neutral fermions (the
'chemical potentials' vanish). In the following we shall need only that case.

The number density N of nonrelativistic particles of mass m is given by the
Maxwell-Boltzmann distribution for an ideal, nondegenerate gas. Starting from
Equation (5.42) we note that for nonrelativistic particles the energy kT is smaller
than the mass, so that the term ±1 can be neglected in comparison with the
exponential. Rewriting the Fermi distribution as a function of temperature rather
than of momentum we obtain the Maxwell-Boltzmann distribution

N = n_spin (2πmkT)^{3/2} h⁻³ e^{−E/kT}.    (5.43)

Note that because of the exponential term the number density falls exponentially
as temperature falls. James Clerk Maxwell (1831-1879) was a contemporary of
Stefan and Boltzmann.
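The Bose and Fermi distributions (5.42) differ only by the sign in the denominator, but that sign changes the integrated number density: integrating x²/(eˣ ± 1) numerically shows that the relativistic fermion density is 3/4 of the boson one per equivalent degree of freedom, the factor that reappears in Equation (5.45) below. A midpoint-rule sketch (not from the book):

```python
import math

def density_integral(sign, xmax=60.0, n=200000):
    """Midpoint-rule estimate of the integral of x^2/(e^x + sign) over x > 0;
    sign = -1 gives the Bose distribution, sign = +1 the Fermi one (Eq. 5.42)."""
    h = xmax / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += x * x / (math.exp(x) + sign) * h
    return total

bose = density_integral(-1.0)     # analytic value: 2 * zeta(3)  ≈ 2.404
fermi = density_integral(+1.0)    # analytic value: (3/2) * zeta(3) ≈ 1.803
assert abs(fermi / bose - 0.75) < 1e-3   # the 3/4 factor of Eq. (5.45)
```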



5.4 The Early Radiation Era 

Primordial Hot Plasma. In Section 5.1 we established the dependence of the
number density of photons on temperature, N_γ in Equation (5.5), and the corresponding
energy density, ε_r in Equation (5.6). For each species of relativistic
fermions participating in the thermal equilibrium there is a specific number density.
To find the total number density of particles sharing the available energy we
have to count each particle species i weighted by the corresponding degrees of
freedom g_i. Remembering that g_γ = 2 for photons, we thus rewrite Equation (5.6)
with a factor g_i explicitly visible:

ε_i = ½g_i a_S T⁴.    (5.44)

Remember that a_S is Stefan's constant. It turns out that this expression gives the
correct energy density for every particle species if we insert its respective value
of g_i from Table A.5.

Equation (5.5) can be correspondingly generalized to relativistic fermions. Their
number density is

N_i = (3/4)N_γ.    (5.45)

In general, the primordial plasma was a mixture of particles, of which some
are relativistic and some nonrelativistic at a given temperature. Since the number
density of a nonrelativistic particle (given by the Maxwell-Boltzmann distribution,
Equation (5.43)) is exponentially smaller than that of a relativistic particle, it is a
good approximation to ignore nonrelativistic particles. Different species i with
mass m_i have a number density which depends on m_i/T, and they may have a
thermal distribution with a temperature T_i different from that of the photons. Let
us define the effective degrees of freedom of the mixture as

g_* = Σ_{bosons i} g_i (T_i/T)⁴ + Σ_{fermions j} g_j (T_j/T)⁴.    (5.46)

As explained in the context of Equation (5.40) the sum over fermions includes
a factor 7/8, accounting for the difference between Fermi and Bose statistics. The
factor (T_j/T)⁴ applies only to neutrinos, which obtain a different temperature
from the photons when they freeze out from the plasma (as we shall see later).
Thus the energy density of the radiation in the plasma is

ε_r = ½g_* a_S T⁴.    (5.47)

Let us now derive a relation between the temperature scale and the timescale.
We have already found the relation (4.40) between the size scale a and the
timescale t during the radiation era,

a(t) ∝ √t,    (5.48)

where we choose to omit the proportionality factor. The Hubble parameter can
then be written

H = ȧ/a = 1/(2t).    (5.49)

Note that the proportionality factor omitted in Equation (5.48) has dropped out.
In Equation (4.35) we noted that the curvature term kc²/R² in Friedmann's equations
is negligibly small at early times during the radiation era. We then obtained
the dynamical relation

H² = (ȧ/a)² = 8πGρ/3.    (5.50)




Inserting Equation (5.49) on the left and replacing the energy density ρ on the
right by ε_r/c², we find the relation sought between photon temperature and time:

1/t = (16πG a_S g_*/3c²)^{1/2} T² = 3.07×10⁻²¹ √g_* T² s⁻¹ (T in K).    (5.51)

The sum of degrees of freedom of a system of particles is of course the number 
of particles multiplied by the degrees of freedom per particle. Independently of 
the law of conservation of energy, the conservation of entropy implies that the 
energy is distributed equally between all degrees of freedom present in such a 
way that a change in degrees of freedom is accompanied by a change in random 
motion, or equivalently in temperature. 

Thus entropy is related to order: the more degrees of freedom there are present, 
the more randomness or disorder the system possesses. When an assembly of par- 
ticles (such as the molecules in a gas) does not possess energy other than kinetic 
energy (heat), its entropy is maximal when thermal equilibrium is reached. For 
a system of gravitating bodies, entropy increases by clumping, maximal entropy 
corresponding to a black hole. 

Cooling Plasma. Let us now study the thermal history of the Universe during
the radiation era. We may start at a time when the temperature was 10¹¹ K, which
corresponds to a mean energy of about 300 MeV. All of the electrons and photons
then have an energy below the threshold for proton-antiproton production
(see Equation (5.35)). Thus the number of protons (and also neutrons and
antinucleons) will no longer increase as a result of thermal collisions. They can only
decrease, for a number of reasons which I shall explain.

Most of the other particles introduced in Section 5.3, the γ, e⁻, μ⁻, π⁻, π⁰, ν_e, ν_μ
and ν_τ, as well as their antiparticles, are then present. The sum in Equation (5.46)
is then, using the degrees of freedom in Table A.5,

    g_* = 2 + 3 + 2 × 7/2 + 3 × 7/4 = 69/4,    (5.52)

where the first term corresponds to the photon, the second to the three pions, the
third to the two charged leptons and the fourth to the three kinds of neutrinos.
Most of the unstable τ leptons disappeared shortly after 1.78 GeV. Inserting the
value of g_* into Equation (5.51), we find that t = 6.87 ms at 10¹¹ K.
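Equation (5.51) is easy to evaluate numerically. The following is a minimal Python sketch using standard SI values for c, G and a_s; it reproduces the millisecond timescale quoted above (small differences from the printed numbers reflect the rounding of the constants used).

```python
import math

c   = 2.998e8      # speed of light [m/s]
G   = 6.674e-11    # Newton's constant [m^3 kg^-1 s^-2]
a_s = 7.566e-16    # radiation (Stefan) constant [J m^-3 K^-4]

# Equation (5.51): sqrt(g*) T^2 t = sqrt(3 c^2 / (16 pi G a_s))
coeff = math.sqrt(3 * c**2 / (16 * math.pi * G * a_s))  # [K^2 s]

def age_at(T_kelvin, g_star):
    """Age of the radiation-dominated Universe at temperature T."""
    return coeff / (math.sqrt(g_star) * T_kelvin**2)    # [s]

t_ms = 1e3 * age_at(1e11, 69 / 4)  # g* = 69/4 from Equation (5.52)
print(f"coefficient = {coeff:.3g} K^2 s, t(1e11 K) = {t_ms:.2f} ms")
```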

We can follow the evolution of the function g_*(T) in Figure 5.6. A comparison
of the graph with Equation (5.52) shows that the latter is an underestimate.
This is because all the particles in thermal equilibrium contribute, not only those
accounted for in Equation (5.52), but also heavier particles which are thermalized
by energetic photons and pions in the tail of their Boltzmann distributions. The
steep drop at 200 MeV is caused by a phase transition: below 200 MeV we have
hadronic matter (and leptons), whereas, above 200 MeV, the hadrons dissolve into
their subconstituents, which contribute much more to g_*. We shall return to this
in Section 6.6.

Figure 5.6 The evolution of the effective degrees of freedom contributing to the energy
density, g_*(T), and to the entropy density, g_*s(T), as functions of log T, where the
temperature is in units of MeV [4].

At this time the number density of nucleons decreases quickly because they
have become nonrelativistic. Consequently, they have a larger probability of
annihilating into lepton pairs, pion pairs or photons. Their number density is then
no longer given by the Fermi distribution (5.42), but by the Maxwell-Boltzmann 
distribution, Equation (5.43). As can be seen from the latter, when T drops below 
the mass, the number density decreases rapidly because of the exponential fac- 
tor. If there had been exactly the same number of nucleons and anti-nucleons, 
we would not expect many nucleons to remain to form matter. But, since we live 
in a matter-dominated Universe, there must have been some excess of nucleons 
early on. Note that neutrons and protons exist in equal numbers at the time under 
consideration. 

Although the nucleons are very few, they still participate in electromagnetic
reactions such as the elastic scattering of electrons,

    e^± + p → e^± + p,    (5.53)

and in weak charged current reactions in which charged leptons and nucleons
change into their neutral partners, and vice versa, as in

    e⁻ + p → ν_e + n,    (5.54)

    ν̄_e + p → e⁺ + n.    (5.55)

Other such reactions are obtained by reversing the arrows, and by replacing e^±
by μ^± or ν_e by ν_μ or ν_τ. The nucleons still participate in thermal equilibrium, but
they are too few to play any role in the thermal history any more. This is why we
could neglect them in Equation (5.52).

Below the pion mass (actually at about 70 MeV) the temperature in the Universe
cools below the threshold for pion production:

    (e⁻ + e⁺) or (μ⁻ + μ⁺) → γ_virtual → π⁺ + π⁻.    (5.56)




The reversed reactions, pion annihilation, still operate, reducing the number of
pions. However, they disappear even faster by decay. This is always the fate when
such lighter states are available, energy and momentum, as well as quantum numbers
such as electric charge, baryon number and lepton numbers, being conserved.
The pion, the muon and the tau lepton are examples of this. The pion decays
mainly by the reactions

    π⁻ → μ⁻ + ν̄_μ,    π⁺ → μ⁺ + ν_μ.    (5.57)

Thus g_* decreases by 3 to 57/4. The difference in mass between the initial pion and
the final-state particles is

    m_π − m_μ − m_ν = (139.6 − 105.7 − 0.0) MeV = 33.9 MeV,    (5.58)

so 33.9 MeV is available as kinetic energy to the muon and the neutrino. This
makes it very easy for the π^± to decay, and in consequence its mean life is short,
only 0.026 μs (the π⁰ decays even faster). This is much less than the age of the
Universe at 140 MeV, which is 23 μs from Equation (5.51). Note that the appearance
of a charged lepton in the final state forces the simultaneous appearance of
its anti-neutrino in order to conserve lepton number.

Also, the muons decay fast compared with the age of the Universe, with a lifetime
of 2.2 μs, by the processes

    μ⁻ → e⁻ + ν̄_e + ν_μ,    μ⁺ → e⁺ + ν_e + ν̄_μ.    (5.59)

Almost the entire mass of the muon, or 105.7 MeV, is available as kinetic energy
to the final-state particles. This is the reason for its short mean life. Here again
the conservation of lepton numbers, separately for the e-family and the μ-family,
is observed.

Below the muon mass (actually at about 50 MeV), the temperature in the Universe
cools below the threshold for muon-pair production:

    e⁻ + e⁺ → γ_virtual → μ⁺ + μ⁻.    (5.60)

The time elapsed is less than a millisecond. When the muons have disappeared,
we can reduce g_* by 7/2 to 43/4.
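The bookkeeping behind these successive reductions of g_* is mechanical: each boson state contributes 1 and each fermion state 7/8. A short sketch, counting the states exactly as the text does:

```python
from fractions import Fraction

F = Fraction(7, 8)  # Fermi statistics factor per fermion state

# spin / particle-antiparticle states per species, as counted in the text
photon    = 2            # two polarizations, boson
pions     = 3            # pi+, pi-, pi0: spin 0, bosons
e_pair    = F * 4        # e- and e+, two spin states each
mu_pair   = F * 4        # mu- and mu+, two spin states each
neutrinos = F * 2 * 3    # nu and nubar, one helicity each, three flavours

g_above_pion = photon + pions + e_pair + mu_pair + neutrinos  # Eq. (5.52)
g_after_pion = g_above_pion - pions        # pions decay away below ~70 MeV
g_after_muon = g_after_pion - mu_pair      # muon pairs gone below ~50 MeV

print(g_above_pion, g_after_pion, g_after_muon)  # 69/4, 57/4, 43/4
```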

From the reactions (5.57) and (5.59) we see that the end products of pion and
muon decay are stable electrons and neutrinos. The lightest neutrino ν₁ is certainly
stable, and the same is probably also true for ν₂ and ν₃. When this has taken
place we are left with those neutrinos and electrons that only participate in weak
reactions, with photons and with a very small number of nucleons. The number
density of each lepton species is about the same as that of photons.

5.5 Photon and Lepton Decoupling 

The considerations about which particles participate in thermal equilibrium at a 
given time depend on two timescales: the reaction rate of the particle, taking into 
account the reactions which are possible at that energy, and the expansion rate 
of the Universe. If the reaction rate is slow compared with the expansion rate, the 
distance between particles grows so fast that they cannot find each other. 




Reaction Rates. The expansion rate is given by H = ȧ/a, and its temperature
dependence by Equations (5.50) and (5.51). The average reaction rate can be written

    Γ = ⟨N v σ(E)⟩,    (5.61)

where σ(E) is the reaction cross-section (in units of m², say) as defined in Equation
(5.41). The product of σ(E) and the velocity v of the particle varies over the
thermal distribution, so one has to average over it, as is indicated by the angle
brackets. Multiplying this product by the number density N of particles per m³,
one obtains the mean rate Γ of reacting particles per second; its inverse Γ⁻¹ is
the mean time between collisions.
The weak interaction cross-section turns out to be proportional to T²,

    σ ≈ G_F² (kT)² / π(ħc)⁴,    (5.62)

where G_F is the Fermi coupling measuring the strength of the weak interaction. The
number density of the neutrinos is proportional to T³ according to Equations (5.5)
and (5.45). The reaction rate of neutrinos of all flavours then falls with decreasing
temperature as T⁵.

The condition for a given species of particle to remain in thermal equilibrium is
then that the reaction rate Γ is larger than the expansion rate H, or equivalently
that the mean time between collisions Γ⁻¹ does not exceed the Hubble time H⁻¹,

    Γ/H ≥ 1.    (5.63)

Inserting the T⁵ dependence of the weak interaction rate Γ_wi and the T² dependence
of the expansion rate H from Equation (5.51), we obtain

    Γ_wi/H ∝ T³.    (5.64)

Thus there may be a temperature small enough that the condition (5.63) is no
longer fulfilled.
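An order-of-magnitude estimate of where this happens for the weak interaction can be sketched by equating Γ_wi ≈ G_F²T⁵ with H ≈ 1.66 √g_* T²/M_Pl in natural units (ħ = c = k = 1). The order-unity prefactors here are assumptions of the sketch, not values from the text, so only the MeV scale of the answer should be trusted.

```python
# Order-of-magnitude estimate of the weak freeze-out temperature,
# working in natural units (energies in GeV).
G_F    = 1.166e-5   # Fermi coupling [GeV^-2]
M_Pl   = 1.221e19   # Planck mass [GeV]
g_star = 10.75      # photons, e+ e-, three neutrino species (T ~ MeV)

# Gamma_wi ~ G_F^2 T^5  and  H ~ 1.66 sqrt(g*) T^2 / M_Pl.
# Setting Gamma_wi = H gives T^3 = 1.66 sqrt(g*) / (M_Pl G_F^2):
T_freeze = (1.66 * g_star**0.5 / (M_Pl * G_F**2)) ** (1 / 3)  # [GeV]
print(f"T_freeze ~ {1e3 * T_freeze:.1f} MeV")
```

The result comes out at the MeV scale, consistent with the neutrino decoupling temperatures of a few MeV quoted later in this section.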



Photon Reheating. Photons with energies below the electron mass can no longer
produce e⁺e⁻ pairs, but the energy exchange between photons and electrons still
continues by Compton scattering, reaction (5.30), or Thomson scattering, as it is
called at very low energies. Electromagnetic cross-sections (subscript 'em') are
proportional to T⁻², and the reaction rate is then proportional to T, so

    Γ_em/H ∝ T/T² = T⁻¹.

Contrary to the weak interaction case in Equation (5.64), the condition (5.63) is
then satisfied for all temperatures, so electromagnetic interactions never freeze
out. Electrons only decouple when they form neutral atoms during the Recombination
Era and cease to scatter photons. The term recombination is slightly misleading,
because the electrons have never been combined into atoms before. The
term comes from laboratory physics, where free electrons and ionized atoms are
created by heating matter (and upon subsequent cooling the electrons and ions
recombine into atoms), or from so-called HII regions, where interstellar plasma
is ionized by ultraviolet radiation and characteristic recombination radiation is
emitted when electrons and ions re-form.

The exothermic electron-positron annihilation, reaction (5.32), is now of mount- 
ing importance, creating new photons with energy 0.51 MeV. This is higher than 
the ambient photon temperature at that time, so the photon population gets 
reheated. To see just how important this reheating is, let us turn to the law of 
conservation of entropy. 

Making use of the equation of state for relativistic particles (5.16), the entropy
(5.10) can be written

    S = (4V/3kT) ε_plasma.

Substituting the expression for ε_plasma from Equation (5.47) one obtains

    S = 2 g_* V a_s T⁴ / (3kT),    (5.65)

which is valid where we can ignore nonrelativistic particles. Now a_s T⁴ is the
energy density, so V a_s T⁴ is energy, just like kT, and thus V a_s T⁴/kT is a constant.
g_* is also a constant, except at the thresholds where particle species decouple.
The physical meaning of entropy of a system is really its degrees of freedom
multiplied by some constant, as one sees here. In Equation (5.5) we saw that the
entropy density can also be written

    s = S/V = (2π⁴/45ζ(3)) (g_*/2) N_γ ≈ 3.60 (g_*/2) N_γ,    (5.66)

where N_γ is the number density of photons. Between two decoupling thresholds
we then have

    g_* V T³ = constant.    (5.67)
The second law of thermodynamics requires that entropy should be conserved 
in reversible processes, also at thresholds where g* changes. This is only possible 
if T also changes in such a way that g* T 3 remains a constant. When a relativistic 
particle becomes nonrelativistic and disappears, its entropy is shared between the 
particles remaining in thermal contact, causing some slight slowdown in the cool- 
ing rate. Photons never become nonrelativistic; neither do the practically massless 
neutrinos, and therefore they continue to share the entropy of the Universe, each 
species conserving its entropy separately. 

Let us now apply this argument to the situation when the positrons and most of
the electrons disappear by annihilation below 0.2 MeV. We denote temperatures
and entropies just above this energy by a subscript '+', and below it by '−'. Above
this energy, the particles in thermal equilibrium are γ, e⁻, e⁺, so that g_* = 2 + 7/2
= 11/2 and the entropy is

    S_+ = (11/2) × 2V a_s T_+³/(3k).    (5.68)

Figure 5.7 A system of communicating vessels illustrating particles in thermal equilibrium
(from K. Kainulainen, unpublished research). At 3.7 MeV, valve A closes so that ν_μ
and ν_τ decouple. At 2.3 MeV, valve B closes so that ν_e also decouples, leaving only e⁻ and
γ in thermal contact.

Below that energy, only photons contribute the factor g_* = 2. Consequently, the
ratio of entropies S_+ and S_− is

    S_+/S_− = (11/4)(T_+/T_−)³.    (5.69)

But entropy must be conserved, so this ratio must be unity. It then follows that

    T_− = (11/4)^(1/3) T_+ = 1.40 T_+.    (5.70)

Thus the temperature T_γ of the photons increases by a factor 1.40 as the Universe
cools below the threshold for electron-positron pair production. Actually, the
temperature increase is so small and so gradual that it only slows down the cooling
rate temporarily.



Neutrino Decoupling. When the neutrinos no longer obey the condition (5.63)
they decouple or freeze out from all interactions, and begin a free expansion. The
decoupling of ν_μ and ν_τ occurs at 3.5 MeV, whereas the ν_e decouple at 2.3 MeV.
This can be depicted as a set of connecting baths containing different particles,
and having valves which close at given temperatures (see Figure 5.7).

At decoupling, the neutrinos are still relativistic, since they are so light
(Table A.3). Thus their energy distribution is given by the Fermi distribution, Equation
(5.42), and their temperature equals that of the photons, T_ν = T_γ, decreasing
with the increasing scale of the Universe as a⁻¹. But the neutrinos do not participate
in the reheating process, and they do not share the entropy of the photons,
so from now on they remain colder than the photons:

    T_ν = T_γ/1.40.    (5.71)

The number density N_ν of neutrinos can be calculated as in Equation (5.5) using
Equation (5.3), except that the −1 term in Equation (5.3) has to be replaced by
+1, which is required for fermions (see Equation (5.42)). In the number density
distributions (5.3) and (5.42), we have ignored possible chemical potentials for all 
fermions, which one can do for a thermal radiation background; for neutrinos it is 
an unproven assumption that nonetheless appears in reasonable agreement with 
their oscillation parameters. 

The result is that N_ν is a factor of 3/4 times N_γ at the same temperature. Taking
the difference between temperatures T_ν and T_γ into account and noting from
Equation (5.5) that N_γ is proportional to T³, one finds

    N_ν = (3/11) N_γ.    (5.72)

After decoupling, the neutrino contribution to g_* decreases because the ratio
T_i/T in Equation (5.46) is now less than one. Thus the present value is

    g_*(T₀) = 2 + 3 × (7/4) × (4/11)^(4/3) = 3.36.    (5.73)

The entropy density also depends on g_*, but now the temperature dependence
for the neutrino contribution in Equation (5.46) is (T_i/T)³ rather than a power of
four. The effective degrees of freedom are in that case given by Equation (5.73) if
the power 4/3 is replaced by 1, giving g_*s(T₀) = 3.91. This curve is denoted g_*s
in Figure 5.6.

The density parameter of the radiation today is then

    Ω_r = ε_r(T₀)/(ρ_c c²) = (1/2) g_*(T₀) a_s T₀⁴/(ρ_c c²) ≈ 4.2 × 10⁻⁵ h⁻².    (5.74)
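All the numbers in Equations (5.70)-(5.74) follow from the factor 11/4 together with the present CMB temperature; a quick numerical check (T₀ = 2.725 K and the critical density are assumed inputs):

```python
import math

ratio  = 11 / 4                      # (g_gamma + g_e) / g_gamma = (2 + 7/2) / 2
reheat = ratio ** (1 / 3)            # Eq. (5.70): photon temperature gain, 1.40
N_nu_over_N_gamma = (3 / 4) / ratio  # Eq. (5.72): 3/11 per neutrino species

g_star_now  = 2 + 3 * (7 / 4) * (4 / 11) ** (4 / 3)  # Eq. (5.73), energy density
g_stars_now = 2 + 3 * (7 / 4) * (4 / 11)             # entropy version, power 1

# Eq. (5.74): present radiation density parameter (T0 = 2.725 K assumed)
a_s, c, T0 = 7.566e-16, 2.998e8, 2.725   # SI units
rho_c_h2   = 1.878e-26                   # critical density / h^2 [kg/m^3]
Omega_r_h2 = 0.5 * g_star_now * a_s * T0**4 / (rho_c_h2 * c**2)
print(reheat, g_star_now, g_stars_now, Omega_r_h2)
```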



Recombination Era. As long as there are free electrons, the primordial pho- 
tons are thermalized by Thomson scattering against them, and this prohibits the 
electrons from decoupling, in contrast to neutrinos. Each scattering polarizes the 
photons, but on average this is washed out. The electromagnetic reaction rate is 
much higher than the weak reaction rate of the neutrinos; in fact, it is higher than 
the expansion rate of the Universe, so the condition (5.63) is fulfilled. 

Eventually the Universe expands and cools to such an extent, to about 3000 K,
that electrons are captured into atomic orbits, primarily by protons but also by the
trace amounts of ionized helium and other light nuclei. This process is referred
to as recombination. Unlike the unstable particles n, π, μ that decay spontaneously,
liberating kinetic energy in exothermic reactions, the hydrogen atom H
is a bound state of a proton and an electron. Its mass is less than the p and e⁻
masses together,

    m_H − m_p − m_e = −13.59 eV,    (5.75)

so it cannot disintegrate spontaneously into a free proton and a free electron. The
mass difference (5.75) is the binding energy of the hydrogen atom.

The physics of recombination is somewhat subtle. Initially one might think that
recombination occurs when the photon temperature drops below 13.59 eV, making
formation of neutral hydrogen energetically favourable. Two characteristics of
the physics push the recombination temperature lower. The first, and the easiest
to elaborate, is that there are vastly more photons than electrons, and so in thermal
equilibrium even a small proportion of high-energy photons is sufficient to
maintain effectively complete ionization. Photons in thermal equilibrium have the
blackbody spectrum given by Equation (5.3). Even for photon temperatures somewhat
below 13.59 eV there will be enough highly energetic photons in the Wien
tail (as the high-energy section is termed) to ionize any neutral hydrogen. The
large amount of entropy in the Universe also favours free protons and electrons.

With respect to the thermal history of the Universe, the fact that photons do 
not scatter against neutral atoms is critical. As recombination proceeds and the 
number of electrons falls, matter and radiation decouple. This has two results. 
First, with matter and radiation no longer in thermal equilibrium, the thermal
histories of the two proceed independently. Perturbations in matter are no longer
damped by interaction with radiation and any such perturbations can grow into 
structures through gravitational instability. Decoupling thus initiates the period 
of structure formation that has led to our present Universe being populated with 
stars, galaxies, galaxy clusters, etc. 

The second result is that, with photons no longer scattering against a sea of elec- 
trons, the photons can stream freely through the Universe; upon recombination, 
the Universe becomes transparent to light. Prior to recombination, the Universe 
was opaque to electromagnetic radiation (although not to neutrinos) and it would 
have been impossible to do astronomy if this situation had persisted until today. 
The freely streaming photons from this era form the CMB radiation and their point 
of last contact with matter forms a spherical shell called the last scattering sur- 
face (LSS). The era of recombination provides a crucial observational limit beyond 
which we cannot hope to see using electromagnetic radiation. 

The LSS is not a sharp boundary and does not exist at a unique redshift: it is 
actually a thin shell. The photons from the LSS preserve the polarization they 
incurred in the last Thomson scattering. This remaining primordial polarization 
is an interesting detectable signal, albeit much weaker than the intensity of the 
thermalized radiation (we shall return to discuss this further in Section 8.3). 

The LSS of the Universe has an exact analogue in the surface of the Sun. Photons 
inside the Sun are continuously scattered, so it takes millions of years for some 
photons to reach the surface. But once they do not scatter any more they continue
in straight lines (really on geodesics) towards us. Therefore, we can see the
surface of the Sun, which is the LSS of the solar photons, but we cannot see the 
solar interior. We can also observe that sunlight is linearly polarized. In contrast, 
neutrinos hardly scatter at all in the Sun, thus neutrino radiation brings us a clear 
(albeit faint with present neutrino detectors) picture of the interior of the Sun. 



Equilibrium Theory. An analysis based on thermal equilibrium, the Saha equation,
implies that the temperature must fall to about 0.3 eV before the proportion
of high-energy photons falls sufficiently to allow recombination to occur. The Saha
analysis also implies that the time (or energy or redshift) of decoupling and last
scattering depends on cosmological parameters such as the total cosmic density
parameter Ω₀, the baryon density Ω_B, and the Hubble parameter. However, a second
feature of the physics of recombination implies that the equilibrium analysis
itself is not sufficient.

The Saha analysis describes the initial phase of departure from complete ioniza- 
tion but, as recombination proceeds, the assumption of equilibrium ceases to be 
appropriate (see, for example, [1]). Paradoxically, the problem is that electromag- 
netic interactions are too fast (in contrast with the weak interaction that freezes 
out from equilibrium because of a small cross-section). A single recombination 
directly to the ground state would produce a photon with energy greater than 
the 13.59 eV binding energy and this photon would travel until it encountered a 
neutral atom and ionized it. This implies that recombination in an infinite static 
universe would have to proceed by smaller intermediate steps (thus not directly 
to the ground state). 

In fact the situation is even worse, because reaching the ground state by single-photon
emission requires a transition from the 2P to the 1S level and thus production
of photons with energy at least 10.2 eV (Lyman α with λ = 1216 Å). As these
photons become abundant they will re-ionize any neutral hydrogen through multiple
absorption, and so it would seem that recombination will be, at a minimum,
severely impeded. (Recombination in a finite HII region is different because the
Lyα photons can escape (see [1, p. 286]).)

There is an alternative path, however. Two-photon emission generated by the
2S → 1S transition produces lower-energy photons. The process is slow (with a
lifetime of approximately 0.1 s), so recombination proceeds at a rate quite different
from the Saha prediction. Consequently, all the times predicted by this
nonequilibrium analysis differ notably from the Saha prediction, but, interestingly,
in such a way that the times of decoupling and last scattering have practically
no dependence on cosmological parameters.

Summary. Recombination, decoupling and last scattering do not occur at exactly
the same time. It should also be noted that these terms are often used interchangeably
in the literature, so what we refer to as the LSS may also be called
the time of recombination or decoupling. Approximate results for these times,
following Peacock [1], are summarized below.

Recombination is defined as the time when 90% of the electrons have combined
into neutral atoms. This occurred at redshift

    a_rec⁻¹ = 1 + z_rec ≈ 910-1340,    (5.76)

where the spread reflects the dependence on the ratio Ω_B/Ω₀.

Last scattering is defined as the time when photons start to stream freely. This
occurred at redshift

    a_LSS⁻¹ = 1 + z_LSS = 1065 ± 80,    (5.77)

when the Universe was 180 000 (Ω₀h²)^(−1/2) years old and at a temperature of
0.26 eV, thus right after the recombination time. This redshift has been determined
much more precisely by WMAP [5], as we shall show in Section 8.4.



Decoupling is defined as the time when the reaction rate (scattering) falls below
the expansion rate of the Universe and matter falls out of thermal equilibrium
with photons. This occurred at redshift

    a_dec⁻¹ = 1 + z_dec ≈ 890,    (5.78)

when the Universe was some 380 000 years old. All three events depend on the
number of free electrons (the ionization fraction) but in slightly different ways.
As a result these events do not occur at exactly the same time.
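The redshift (5.77) translates directly into the temperature quoted there, using the present CMB temperature T₀ = 2.725 K as an assumed input:

```python
k_B_eV = 8.617e-5   # Boltzmann constant [eV/K]
T0     = 2.725      # present CMB temperature [K] (assumed input)
z_LSS  = 1065       # central value of Eq. (5.77)

T_LSS  = T0 * (1 + z_LSS)   # photon temperature at last scattering [K]
kT_LSS = k_B_eV * T_LSS     # the same in eV
print(f"T_LSS = {T_LSS:.0f} K, kT_LSS = {kT_LSS:.2f} eV")
```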



5.6 Big Bang Nucleosynthesis 

Let us now turn to the fate of the remaining nucleons. Note that the charged 
current reactions (5.54) and (5.55) changing a proton to a neutron are endothermic: 
they require some input energy to provide for the mass difference. In reaction 
(5.54) this difference is 0.8 MeV and in reaction (5.55) it is 1.8 MeV (use the masses 
in Table A.4!). The reversed reactions are exothermic. They liberate energy and 
they can then always proceed without any threshold limitation. 

The neutrons and protons are then nonrelativistic, so their number densities are
each given by Maxwell-Boltzmann distributions (5.43). Their ratio in equilibrium
is given by

    N_n/N_p = (m_n/m_p)^(3/2) exp(−(m_n − m_p)c²/kT).    (5.79)

At energies of the order of m_n − m_p = 1.293 MeV or less, this ratio is dominated
by the exponential. Thus, at kT = 0.8 MeV, the ratio has dropped from 1 to 1/5. As
the Universe cools and the energy approaches 0.8 MeV, the endothermic neutron-producing
reactions stop, one by one. Then no more neutrons are produced but
some of those that already exist get converted into protons in the exothermic
reactions.
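Since m_n − m_p = 1.293 MeV, the exponential in Equation (5.79) alone (the mass-ratio prefactor is very close to one) already gives the drop just quoted:

```python
import math

dm = 1.293  # m_n - m_p [MeV]

def np_ratio_equilibrium(kT_MeV):
    """Dominant exponential factor of Eq. (5.79); (m_n/m_p)^(3/2) ~ 1 is dropped."""
    return math.exp(-dm / kT_MeV)

print(np_ratio_equilibrium(0.8))  # close to 0.2, the '1 to 1/5' drop in the text
```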



Nuclear Fusion. Already, at a few MeV, nuclear fusion reactions start to build up
light elements. These reactions are exothermic: when a neutron and a proton fuse
into a bound state some of the nucleonic matter is converted into pure energy
according to Einstein's formula (2.68). This binding energy of the deuteron d,

    m_p + m_n − m_d = 2.22 MeV,    (5.80)

is liberated in the form of radiation:

    n + p → d + γ.    (5.81)

The deuteron is also written ²H⁺ in general nuclear physics notation, where the
superscript A = 2 indicates the number of nucleons and the electric charge is
given by the superscript '+'. The bound state formed by a deuteron and an electron
is the deuterium atom ²H, which of course is electrically neutral. Although the
deuterons are formed in very small quantities, they are of crucial importance to
the final composition of matter.

As long as photons of 2.22 MeV or more are available, the reaction (5.81) can
go the other way: the deuterons photodisintegrate into free protons and neutrons.
Even when the mean temperature of radiation drops considerably below 2.22 MeV,
there is still a high-energy tail of the Planck distribution containing hard γ-rays
which destroy the deuterons as fast as they are produced.

All evidence suggests that the number density of baryons, or equivalently nucleons,
is today very small. In particular, we are able to calculate it to within the
factor Ω_B h²:

    N_B = ρ_B/m_B = Ω_B ρ_c/m_B = 11.3 Ω_B h² m⁻³.    (5.82)

At the end of this section we shall discuss the value of the baryon density parameter
Ω_B, which is a few per cent.

The photon number density today is N_γ = 4.11 × 10⁸ m⁻³ from Equation (5.5).
It is clear then that N_B/N_γ is such a small figure that only an extremely tiny fraction
of the high-energy tail of the photon distribution may contain sufficiently
many hard γ-rays to photodisintegrate the deuterons. However, the 2.22 MeV photons
created in photodisintegration do not thermalize, so they will continue to
photodisintegrate deuterium until they have been redshifted below this threshold.
Another obstacle to permanent deuteron production is the high entropy per
nucleon in the Universe. Each time a deuteron is produced, the degrees of freedom
decrease, and so the entropy must be shared among the remaining nucleons. This
raises their temperature, counteracting the formation of deuterons. Detailed calculations
show that deuteron production becomes thermodynamically favoured
only at 0.07 MeV. Thus, although deuterons are favoured on energetic grounds
already at 2 MeV, free nucleons continue to be favoured by the high entropy down
to 0.07 MeV.

Other nuclear fusion reactions also commence at a few MeV. The npp bound
state ³He⁺⁺ is produced in the fusion of two deuterons,

    d + d → ³He⁺⁺ + n,    (5.83)

    p + d → ³He⁺⁺ + γ,    (5.84)

where the final-state particles share the binding energy

    2m_p + m_n − m(³He⁺⁺) = 7.72 MeV.    (5.85)

This reaction is also hampered by the large entropy per nucleon, so it becomes
thermodynamically favoured only at 0.11 MeV.

The nnp bound state ³H⁺, or triton t, is the ionized tritium atom ³H. It is produced
in the fusion reactions

    n + d → t + γ,    (5.86)

    d + d → t + p,    (5.87)

    n + ³He → t + p,    (5.88)

with the binding energy

    m_p + 2m_n − m_t = 8.48 MeV.    (5.89)

A very stable nucleus is the nnpp bound state ⁴He⁺⁺ with a very large binding
energy,

    2m_p + 2m_n − m(⁴He⁺⁺) = 28.3 MeV.    (5.90)

Once its production is favoured by the entropy law, at about 0.28 MeV, there
are no more γ-rays left that are hard enough to photodisintegrate it. From the
examples set by the deuteron fusion reactions above, it may seem that ⁴He⁺⁺
would be most naturally produced in the reaction

    d + d → ⁴He⁺⁺ + γ.    (5.91)

However, ³He⁺⁺ and ³H⁺ production is preferred over deuteron fusion, so ⁴He⁺⁺ is
only produced in a second step when these nuclei become abundant. The reactions
are then

    n + ³He⁺⁺ → ⁴He⁺⁺ + γ,    (5.92)

    d + ³He⁺⁺ → ⁴He⁺⁺ + p,    (5.93)

    p + t → ⁴He⁺⁺ + γ,    (5.94)

    d + t → ⁴He⁺⁺ + n.    (5.95)

The delay before these reactions start is often referred to as the deuterium bottleneck.

Below 0.8 MeV occasional weak interactions in the high-energy tails of the lepton
and nucleon Fermi distributions reduce the n/p ratio further, but no longer by
the exponential factor in Equation (5.79). The neutrons also decay into protons
by beta decay,

    n → e⁻ + ν̄_e + p,    (5.96)

liberating

    m_n − m_p − m_e − m_ν = 0.8 MeV    (5.97)

of kinetic energy in the process. This amount is very small compared with the
neutron mass of 939.6 MeV. In consequence the decay is inhibited and very slow:
the neutron mean life is 887 s. In comparison with the age of the Universe, which
at this time is a few tens of seconds, the neutrons are essentially stable. The
protons are stable even on scales of billions of years, so their number is not going
to decrease by decay.

At 0.1 MeV, when the temperature is 1.2 × 10⁹ K and the time elapsed since
the Big Bang is a little over two minutes, the beta decays have reduced the neutron/proton
ratio to its final value:

    N_n/N_p = 1/7.    (5.98)

The temperature dependence of this ratio, as well as the equilibrium (Maxwell-Boltzmann)
ratio, is shown in Figure 5.8.







Figure 5.8 The equilibrium and actual values of the n/p ratio. Courtesy of E. W. Kolb and 
M. S. Turner. 



These remaining neutrons have no time to decay before they fuse into deuterons
and subsequently into ⁴He⁺⁺. There they stayed until today, because bound neutrons
do not decay. The same number of protons as neutrons go into ⁴He, and
the remaining free protons are the nuclei of future hydrogen atoms. Thus the end
result of the nucleosynthesis taking place between 100 and 700 s after the Big
Bang is a Universe composed almost entirely of hydrogen and helium ions. But
why not heavier nuclei?

It is an odd circumstance of nature that, although there exist stable nuclei composed
of A = 1, 2, 3 and 4 nucleons, no nucleus of A = 5, or of A = 8, exists.
In between these gaps, there exist the stable nuclei ⁶Li and ⁷Li, and the unstable
⁷Be. Because of these gaps and because ⁴He is so strongly bound, nucleosynthesis
essentially stops after ⁴He production. Only minute quantities of the stable nuclei
²H, ³He and ⁷Li can be produced.

The fusion rate at energy E of two nuclei of charges Z₁, Z₂ is proportional to
the Gamow penetration factor

    exp(−2πZ₁Z₂α/β),    (5.99)

where β = v/c is the relative velocity of the two nuclei in units of c and α is the
fine-structure constant. Thus as the energy decreases, the fusion of nuclei other
than the very lightest ones becomes rapidly improbable.
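A small sketch makes the suppression quantitative. The form exp(−2πZ₁Z₂α/β) of the Gamow factor, with β = v/c the relative velocity, and the choices E = 0.07 MeV and nonrelativistic kinematics are assumptions of this sketch:

```python
import math

alpha = 1 / 137.036   # fine-structure constant
m_u   = 931.5         # atomic mass unit [MeV/c^2]

def gamow(Z1, Z2, A1, A2, E_MeV):
    """Gamow penetration factor exp(-2 pi Z1 Z2 alpha / beta)."""
    mu   = m_u * A1 * A2 / (A1 + A2)   # reduced mass [MeV/c^2]
    beta = math.sqrt(2 * E_MeV / mu)   # nonrelativistic relative velocity / c
    return math.exp(-2 * math.pi * Z1 * Z2 * alpha / beta)

E = 0.07  # MeV, where deuteron production becomes favoured
p_d   = gamow(1, 1, 1, 2, E)   # p + d
p_he3 = gamow(1, 2, 1, 3, E)   # p + 3He, one more unit of charge
print(p_d, p_he3, p_d / p_he3)
```

A single extra unit of charge already suppresses the penetration probability by more than an order of magnitude at this energy, which is why fusion beyond the lightest nuclei shuts down.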



Relic 4 He Abundance. The relic abundances of the light elements bear an impor- 
tant testimony of the n/p ratio at the time of the nucleosynthesis when the Uni- 
verse was only a few minutes old. In fact, this is the earliest testimony of the Big 
Bang we have. Recombination occurred some 300 000 years later, when the stable 
ions captured all the electrons to become neutral atoms. The CMB testimony is 



from that time. There is also more recent information available in galaxy cluster 
observations from z < 0.25. 
From the ratio (5.98) we obtain immediately the ratio of ⁴He to ¹H:

    X₄ ≡ N(⁴He)/N(¹H) = (N_n/2)/(N_p − N_n) = 1/12.    (5.100)

The number of ⁴He nuclei is clearly half the number of neutrons when the minute
amounts of ²H, ³He and ⁷Li are neglected. The same number of protons as neutrons
go into ⁴He, thus the excess number of protons becoming hydrogen is
N_p − N_n. The ratio of mass in ⁴He to total mass in ¹H and ⁴He is

    Y₄ ≡ 4X₄/(1 + 4X₄) = 0.25.    (5.101)

This is a function of the ratio of baryons to photons,

    η ≡ N_b/N_γ ≈ 2.75 × 10⁻⁸ Ω_b h²,    (5.102)

using N_γ from Table A.6.
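The bookkeeping in Equations (5.100)-(5.102) is easy to check numerically. The sketch below assumes the freeze-out ratio N_n/N_p = 1/7 from (5.98) and an illustrative Ω_b h² = 0.02; both numbers are inputs to the code, not outputs.

```python
# Sketch of the 4He bookkeeping, assuming n/p = 1/7 at nucleosynthesis.
r = 1.0 / 7.0                      # neutron-to-proton ratio
X4 = (r / 2.0) / (1.0 - r)         # N(4He)/N(1H) = (N_n/2)/(N_p - N_n)
Y4 = 4.0 * X4 / (1.0 + 4.0 * X4)   # 4He mass fraction
print(X4, Y4)                      # 1/12 and 0.25

# Baryon-to-photon ratio for an illustrative Omega_b h^2 = 0.02
eta = 2.75e-8 * 0.02
print(eta)                         # about 5.5e-10
```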

The helium mass abundance Y₄ depends sensitively on several parameters. If the number of baryons increases, Ω_b and η also increase, and the entropy per baryon decreases. Since the large entropy per baryon was the main obstacle to early deuteron and helium production, the consequence is that helium production can start earlier. But then the neutrons would have had less time to β-decay, so the neutron/proton ratio would be larger than 1/7. It follows that more helium will be produced: Y₄ increases.

The abundances of the light elements also depend on the neutron mean life τ_n and on the number of neutrino families F_ν, both of which were poorly known until 1990. Although τ_n is now known to 1‰ [2], and F_ν is known to be 3 to within 4‰ [2], it may be instructive to follow the arguments about how they affect the value of Y₄.

Let us rewrite the decoupling condition (5.64) for neutrons,

    Γ_wi/H = A T³,    (5.103)

where A is the proportionality constant left out of Equation (5.64) and T_d is the decoupling temperature. An increase in the neutron mean life implies a decrease in the reaction rate Γ_wi and therefore a decrease in A. At temperature T_d the ratio of the reaction rate to the expansion rate is unity; thus

    T_d = A^(−1/3).    (5.104)

Hence a longer neutron mean life implies a higher decoupling temperature and an earlier decoupling time. As we have already seen, an earlier start of helium production leads to an increase in Y₄.
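The chain of signs in this argument (longer τ_n gives a smaller A, which gives a higher T_d) follows directly from T_d = A^(−1/3), and can be made concrete with a few arbitrary illustrative values of A:

```python
# T_d = A**(-1/3): a smaller proportionality constant A (corresponding
# to a longer neutron mean life) gives a higher decoupling temperature.
# The values of A here are arbitrary illustrative numbers.
def t_decoupling(A):
    return A ** (-1.0 / 3.0)

A0 = 1.0
for shrink in (1.0, 0.9, 0.8):    # A decreases as tau_n increases
    print(shrink, t_decoupling(A0 * shrink))
```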



144 Thermal History of the Universe 

The expansion rate H of the Universe is, according to Equations (5.49) and (5.51), proportional to √g*, which in turn depends on the number of neutrino families F_ν. In Equations (5.52) we had set F_ν = 3. Thus, if there were more than three neutrino families, H would increase and A would decrease, with the same consequences as in the previous example. Similarly, if the number of neutrinos were very different from the number of anti-neutrinos, contrary to the assumptions in standard Big Bang cosmology, H would also increase.
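A short sketch of this g* bookkeeping, assuming the standard counting before e⁺e⁻ annihilation (photons with g = 2, e± with g = 4, and 2 fermionic degrees of freedom per neutrino family, fermions weighted by 7/8); the function name is ours and the numbers are illustrative:

```python
import math

# Effective degrees of freedom before e+e- annihilation, assuming
# photons (g=2), e+e- (g=4, fermionic) and F_nu neutrino families
# (g=2 per family, fermionic), with the fermionic weight 7/8.
def g_star(F_nu):
    return 2.0 + (7.0 / 8.0) * (4.0 + 2.0 * F_nu)

# H is proportional to sqrt(g_star), so an extra neutrino family
# speeds up the expansion by a few per cent.
h_ratio = math.sqrt(g_star(4) / g_star(3))
print(g_star(3), g_star(4), h_ratio)   # 10.75, 12.5, ~1.08
```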



Light Element Abundance Observations. The value of Ω_b h² (or η) is obtained in direct measurements of the relic abundances of ⁴He, ³He, ²H or D, and ⁷Li from the time when the Universe was only a few minutes old. Although the ⁴He mass ratio Y₄ is 0.25, while the ³He and ²H mass ratios are less than 10⁻⁴ and the ⁷Li mass ratio is as small as a few times 10⁻¹⁰, they all agree remarkably well on a common value for η.

If the observed abundances are indeed of cosmological origin, they must not be affected significantly by later stellar processes. The helium isotopes ³He and ⁴He cannot be destroyed easily but they are continuously produced in stellar interiors. Some recent helium is blown off from supernova progenitors, but that fraction can be corrected for by observing the total abundance in hydrogen clouds of different ages and extrapolating to time zero. The remainder is then primordial helium emanating from BBN. On the other hand, the deuterium abundance can only decrease; it is easily burned to ³He in later stellar events. The case of ⁷Li is complicated because some fraction is due to later galactic cosmic-ray spallation products.

The ⁴He abundance is easiest to observe, but it is also least sensitive to Ω_b h²: its dependence is logarithmic, so only very precise measurements are relevant. The best 'laboratories' for measuring the ⁴He abundance are a class of low-luminosity dwarf galaxies called blue compact dwarf (BCD) galaxies, which undergo an
intense burst of star formation in a very compact region. The BCDs are among 
the most metal-deficient gas-rich galaxies known (astronomers call all elements 
heavier than helium metals). Since their gas has not been processed through many 
generations of stars, it should approximate well the pristine primordial gas. 

The ³He isotope can be seen in galactic star-forming regions containing ionized
hydrogen (HII), in the local interstellar medium and in planetary nebulae. Because 
HII regions are objects of zero age when compared with the age of the Galaxy, 
their elemental abundances can be considered typical of primordial conditions. 

The ⁷Li isotope is observed at the surface of the oldest stars. Since the age of
stars can be judged by the presence of metals, the constancy of this isotope has 
been interpreted as being representative of the primordial abundance. 

The strongest constraint on the baryonic density comes from the primordial 
deuterium abundance. Ultraviolet light with a continuous flat spectrum emitted 
from objects at distances of z « 2-3.5 will be seen redshifted into the red range 
of the visible spectrum. Photoelectric absorption in intervening hydrogen along 
the line of sight then causes a sharp cut-off at A = 91.2 nm, the Lyman limit. 
This can be used to select objects of a given type, which indeed are star-forming 



Big Bang Nucleosynthesis 145 

galaxies. Deuterium is observed as a Lyman-α feature in the absorption spectra of high-redshift quasars. A recent analysis [3] gives

    Ω_b(²H)h² = 0.020 ± 0.001,    (5.105)

which is more precise than any other determination. The information from the other light nuclides is in good agreement. The values of η and Ω_b in Table A.6 come from a combined fit to ²H data, CMB and large-scale structure. We defer that discussion to Section 8.4.
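The (1 + z) scaling of the observed Lyman-limit cut-off is simple to evaluate; the sketch below just applies λ_obs = (1 + z)λ_rest to λ_rest = 91.2 nm for the quoted redshift range (the helper name is ours, not the book's):

```python
# The Lyman limit at a rest wavelength of 91.2 nm is observed
# redshifted by a factor (1 + z) for a source at redshift z.
LYMAN_LIMIT_NM = 91.2

def observed_wavelength(z, rest_nm=LYMAN_LIMIT_NM):
    return rest_nm * (1.0 + z)

for z in (2.0, 3.5):
    print(z, observed_wavelength(z))   # 273.6 nm and 410.4 nm
```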

In Figure 5.9 the history of the Universe is summarized in nomograms relating the scales of temperature, energy, size, density and time [3]. Note that so far we have only covered the events which occurred between 10¹¹ K and 10³ K.

Nuclear synthesis also goes on inside stars where the gravitational contraction increases the pressure and temperature so that the fusion process does not stop with helium. Our Sun is burning hydrogen to helium, which lasts about 10¹⁰ yr, a time span which is very dependent on the mass of the star. After that, helium burns to carbon in typically 10⁶ yr, carbon to oxygen and neon in 10⁴ yr, those to silicon in 10 yr, and silicon to iron in 10 h, whereafter the fusion chain stops. The heavier elements have to be synthesized much later in supernova explosions, and all elements heavier than lithium have to be distributed into the intergalactic medium within the first billion years.

To sum up, Big Bang cosmology makes some very important predictions. The 
Universe today should still be filled with freely streaming primordial photon (and 
neutrino) radiation with a blackbody spectrum (5.3) of temperature related to 
the age of the Universe and a polarization correlated to the temperature. This 
relic CMB radiation (as well as the relic neutrino radiation) should be essentially 
isotropic since it originated in the now spherical shell of the LSS. In particular, 
it should be uncorrelated to the radiation from foreground sources of later date, 
such as our Galaxy. In Chapter 8 we shall see that these predictions have been 
verified for the photons (but not yet for the neutrinos). 

A very important conclusion from BBN is that the Universe contains surprisingly 
little baryonic matter! Either the Universe is then indeed open, or there must exist 
other types of nonbaryonic, gravitating matter. 



Problems 

1. Show that an expansion by a factor a leaves the blackbody spectrum (5.3) unchanged, except that T decreases to T/a.

2. Show that the quantity Q² + U² is an invariant under the rotation of an angle φ in the (x, y)-plane, where Q and U are the Stokes parameters defined in Equation (5.8).

3. Use the definition of entropy in Equation (5.10) and the law of conservation 
of energy, Equation (4.24), to show what functional forms of equations of 
state lead to conservation of entropy. 






[Figure 5.9: nomograms relating the scales of temperature, energy, size, density and time in the history of the Universe.]

4. The flow of total energy received on Earth from the Sun is expressed by the solar constant, 1.36 × 10³ J m⁻² s⁻¹. Use Equation (5.47) to determine the surface temperature of the Sun, using this temperature and the knowledge that the dominant colour of the Sun is yellow, with a wavelength of λ = 0.503 μm. What energy density does that flow correspond to?

5. The random velocity of galaxies is roughly 100 km s⁻¹, and their number density is 0.0029 per cubic megaparsec. If the average mass of a galaxy is 3 × 10⁴⁴ g, what is the pressure of a gas of galaxies? What is the temperature [6]?

6. A line in the spectrum of hydrogen has frequency ν = 2.5 × 10¹⁵ Hz. If this radiation is emitted by hydrogen on the surface of a star where the temperature is 6000 K, what is the Doppler broadening [6]?

7. A spherical satellite of radius r, painted black, travels around the Sun at a distance d from the centre. The Sun radiates as a blackbody at a temperature of 6000 K. If the Sun subtends an angle of θ radians as seen from the satellite (with θ ≪ 1), find an expression for the equilibrium temperature of the satellite in terms of θ. To proceed, calculate the energy absorbed by the satellite, and the energy radiated per unit time [6].

8. Use the laws of conservation of energy and momentum and the equation of 
relativistic kinematics (2.69) to show that positronium cannot decay into a 
single photon. 

9. Use the equation of relativistic kinematics (2.69) to calculate the energy and 
velocity of the muon from the decay (5.57) of a pion at rest. The neutrino 
can be considered massless. 

10. What are the possible decay modes of the τ⁻ lepton?

11. Calculate the energy density represented by the mass of all the electrons 
in the Universe at the time of photon reheating when the kinetic energy of 
electrons is 0.2 MeV. 

12. When the pions disappear below 140 MeV because of annihilation and decay, 
some reheating of the remaining particles occurs due to entropy conserva- 
tion. Calculate the temperature-increase factor. 

13. Use the equation of relativistic kinematics (2.69) and the conservation of four-momentum to calculate the energy of the photon liberated in Equation (5.86), assuming that the ⁴He nucleus is produced at rest. (That is, v_p = v_t = v_He = 0.)

14. Free nucleons are favoured over deuterons down to a radiation energy of 
0.07 MeV. What is the ratio of photons with energies exceeding the deuteron 
binding energy 2.22 MeV to the number of protons at 0.07 MeV? 




15. Propose a two-stage fusion process leading to the production of ¹²C.

16. Gamow's penetration factor (5.99) gives a rough idea about the ignition temperatures in stellar interiors for each fusion reaction. Estimate these under the simplifying assumption that the burning rates during the different periods are inversely proportional to the time spans (given at the end of this chapter). Take the hydrogen burning temperature to be 10⁴ K.

Chapter Bibliography 

[1] Peacock, J. A. 1999 Cosmological physics. Cambridge University Press, Cambridge. 

[2] Hagiwara, K. et al. 2002 Phys. Rev. D66, 010001-1. 

[3] Burles, S., Nollett, K. M. and Turner, M. S. 2001 Astrophys. J. 552, L1.

[4] Coleman, T. S. and Roos, M. 2003 Phys. Rev. D68, 027702. 

[5] Bennett, C. L. et al. 2003 Preprint arXiv astro-ph/0302207 and 2003 Astrophys. J. (In press.), and companion papers cited therein.
[6] Gasiorowicz, S. 1979 The structure of matter. Addison-Wesley, Reading, MA. 



6 

Particles and 
Symmetries 



The laws of physics distinguish between several kinds of forces or interactions: 
gravitational, electroweak and strong. Although gravitation is the weakest, man- 
ifested by the fact that it takes bodies of astronomical size to make the gravita- 
tional interaction noticeable, gravitation is the most important force for under- 
standing the Universe on a large scale. The electromagnetic force has an infinite 
range, like gravitation, but all astronomical objects are electrically neutral and we 
detect no measurable magnetic field. The weak interaction has a range of only 
10⁻¹⁹ m and the strong interaction about 10⁻¹⁵ m, so they are important for particles on atomic scales, but not for astronomical bodies.

The electromagnetic and weak interactions were formerly thought to be dis- 
tinct but are now understood to be united as the electroweak interaction, just as 
happened earlier with the electric and magnetic interactions. At energies much 
less than 100 GeV it is convenient to distinguish between the electromagnetic 
and weak interactions although they are simply different manifestations of the 
electroweak force. 

Prior to the epoch when electroweak reactions began to dominate, particles and 
interactions different from those we have met so far dominated the Universe. Dur- 
ing the different epochs or phases the interactions of the particles were character- 
ized by various symmetries governing their interactions. At each phase transition 
the symmetries and the physics changed radically. Thus we can only understand 
the electroweak epoch if we know where it came from and why. We must therefore 
start a journey backwards in time, where the uncertainties increase at each phase 
transition. 

A very important symmetry is SU(2), which one usually meets for the first time in the context of electron spin. Even if electron spin is not an end in itself in this chapter,

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 0 470 84909 6 (cased) ISBN 0 470 84910 X (pbk)



150 Particles and Symmetries

it is a good and perhaps familiar introduction to SU(2). Thus we shall devote 
Section 6.1 to an elementary introduction to spin space, without the intention of 
actually carrying out spinor calculations. 

Armed with SU(2) algebra, we study three cases of SU(2) symmetry in particle 
physics: the isospin symmetry of the nucleons and the weak-isospin symmetry of 
the leptons in Section 6.2, and the weak-isospin symmetry of the quarks in Sec- 
tion 6.3. We are then ready for the colour degree of freedom of quarks and gluons and the corresponding colour symmetry SU(3)_c. This leads up to the 'standard model' of particle physics, which exhibits SU(3)_c ⊗ SU(2)_w ⊗ U(1)_{B−L} symmetry.

In Section 6.4 we study the discrete symmetries of parity P, charge conjugation C 
and time-reversal invariance T. 

In our present matter-dominated world, all the above symmetries are more or less broken, with exception only for the colour symmetry SU(3)_c and the combined discrete symmetry CPT. This does not mean that symmetries are unimportant,
rather that the mechanisms of spontaneous symmetry breaking deserve attention. 
We take care of this in Section 6.5. 

In Section 6.6 we assemble all our knowledge of particle symmetries and their 
spontaneous breaking in an attempt to arrive at a grand unification theory (GUT) 
which unites the electroweak and the strong forces. It is a dream of physics to 
unite these forces as well as gravitation into a theory of everything (TOE), but their 
properties are so different that this has not yet succeeded. 

Baryons and anti-baryons were produced from quarks in a phase transition near 
200 MeV. But the reason for the baryon-anti-baryon asymmetry must be traced 
back to a much earlier, and not very well understood, irreversible process, in which 
GUT leptoquarks decayed violating baryon- and lepton-number conservation and 
CP. We discuss this in Section 6.7. 



6.1 Spin Space 

In Chapter 5 we introduced the quantal concept of spin. We found that the number n_spin of spin states of a particle was one factor contributing to its degrees of freedom g in thermal equilibrium (see Equation (5.40)). Thus horizontally and vertically polarized photons were counted as two effectively different particle species,
although their physical properties were otherwise identical. 



Electron Spin. Charged particles with spin exhibit magnetic moment. In a mag- 
netic field free electrons orient themselves as if they were little magnets. The 
classical picture that comes to mind is an electric charge spinning around an 
axis parallel to the external magnetic field. Thus the resulting magnetic moment 
would point parallel or antiparallel to the external field depending on whether 
the charge is spinning right- or left-handedly with respect to the field. Although it 
must be emphasized that spin is an entirely quantal effect and not at all due to a 
spinning charge, the classical picture is helpful because the magnetic moment of 



Spin Space 151 

the electron— whatever the mechanism for its generation— couples to the external 
field just like a classical magnet. 

Let us take the spin axis of the electron to be given by the spin vector σ with components σ_x, σ_y and σ_z, and let the external magnetic field B be oriented in the z direction,

    B = (0, 0, B_z).    (6.1)

The potential energy due to the magnetic coupling of the electron to the external field is then the Hamiltonian operator

    H = Aσ · B = Aσ_z B_z,    (6.2)

where A is a constant. 

Measurements of H show that σ_z is quantized, and that it can have only two values. With a suitable choice of units, these values are ±1. This fits the classical picture insofar as the opposite signs would correspond to right-handedly and left-handedly spinning charges, respectively. But the classical picture does not lead to quantized values: it permits a continuum of values. One consequence of the quantum dichotomy is that free electrons in thermal equilibrium with radiation each contribute n_spin = 2 degrees of freedom to g.

The above conclusions were drawn with the external magnetic field B turned on. What if B = 0: where do the electron spin vectors σ point then? The answer of course is 'anywhere', but we cannot confirm this experimentally, because we need a precise nonvanishing external field to measure H or σ_z. Thus quantum mechanics leads to a paradoxical situation: even though electrons which are not subject to observation or magnetic influence may have their spins arbitrarily oriented, by observing them we force them into one of the two states σ_z = ±1.

Quantum mechanics resolves this paradox by introducing statistical laws which 
govern averages of ensembles of particles, but which say nothing about the indi- 
vidual events. Consider the free electron before the field B_z has been switched on. Then it can be described as having the probability P to yield the σ_z value +1 after the field has been switched on, and the probability 1 − P to yield the value −1. The value of P is anywhere between 0 and 1, as the definition of probability requires.

After a measurement has yielded the value σ_z = +1, a subsequent measurement has a probability of 1 to yield +1 and 0 to yield −1. Thus a measurement changes the spin state from indefinite to definite. The definite spin states are characterized by σ_z being ±1; the indefinite state is a superposition of these two states. This can be formulated either geometrically or algebraically. Let us turn to the geometrical formulation first.



Spinor Algebra. Consider an abstract two-dimensional space with orthogonal axes labelled χ₊ and χ₋, see Figure 6.1. Let us draw an arc through the points χ₊ = 1 and χ₋ = 1. The length of the radius vector χ is then unity for arbitrary polar angles φ, and its coordinates are (cos φ, sin φ). Let us identify the square of the χ₊ coordinate, cos²φ, with the probability P. It then follows that 1 − P = sin²φ.
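The identification P = cos²φ can be illustrated by simulating a large ensemble of measurements, each of which collapses the indefinite state to σ_z = +1 with probability cos²φ. A minimal Monte Carlo sketch (the function name, sample size and seed are our choices):

```python
import math
import random

# Simulate spin measurements on an ensemble prepared at polar angle
# phi, where P(sigma_z = +1) = cos(phi)**2 as in the text.
def simulate(phi, n, seed=1):
    rng = random.Random(seed)          # fixed seed: reproducible run
    p_up = math.cos(phi) ** 2
    ups = sum(1 for _ in range(n) if rng.random() < p_up)
    return ups / n                     # observed fraction of +1 outcomes

phi = math.pi / 3                      # P = cos^2(60 deg) = 0.25
print(simulate(phi, 100000))           # close to 0.25
```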







Figure 6.1 Two-component spin space. 

We now identify every possible spin state with a vector from the origin to a point on the arc in Figure 6.1. These vectors in spin space are called spinors to distinguish them from the spin vector σ in ordinary space. Let us write the χ₊ and χ₋ coordinates in column form. Then the points

    χ₊ = (1, 0)ᵀ,   χ₋ = (0, 1)ᵀ    (6.3)

are spinors corresponding to σ_z = +1 and σ_z = −1, respectively. An arbitrary point on the arc with coordinates cos φ, sin φ corresponds to a linear superposition of the σ_z = +1 and σ_z = −1 states. Using the spinors (6.3) as base vectors, this can clearly be written

    χ = cos φ χ₊ + sin φ χ₋.    (6.4)

To summarize, points on the arc correspond to states of the electron before any spin measurement has been done, and the points (6.3) correspond to prepared states after a measurement. Points elsewhere in the (χ₊, χ₋)-plane have no physical interpretation. The coordinates of points on the arc have no direct physical meaning, but their squares correspond to the probabilities of the outcome of a subsequent spin measurement.

The spin space has as many dimensions as there are possible outcomes: two in the electron case. It is an abstract space spanned by spinors χ, not by the spin vector σ of ordinary three-dimensional space. The spinor formalism allows the two spin states of the electron to be treated symmetrically: the electron is just a single two-component object, and both components have identical properties in all other respects.

Let us next turn to the algebra of spin states. The rotation of a spinor χ₁ into another spinor χ₂ corresponds algebraically to an operator U operating on χ₁,

    χ₂ = Uχ₁.    (6.5)

Both components of χ₁ are then transformed by U, thus U in component notation must be described by a 2 × 2 matrix. This is most easily exemplified by the matrix σ₊, which rotates χ₋ into χ₊,

    σ₊χ₋ = ( 0 1 ; 0 0 )( 0 ; 1 ) = ( 1 ; 0 ) = χ₊,    (6.6)

and the matrix σ₋, which does the opposite. The nonunitary operators σ₊ and σ₋ are called raising and lowering operators, respectively.
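The action of the raising and lowering matrices on the base spinors (6.3) can be checked directly; here is a minimal pure-Python sketch with spinors as two-component lists (the helper `matvec` is ours):

```python
# 2x2 matrix acting on a 2-component spinor, written as plain lists.
def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

chi_plus, chi_minus = [1, 0], [0, 1]
sigma_plus = [[0, 1], [0, 0]]     # rotates chi_minus into chi_plus
sigma_minus = [[0, 0], [1, 0]]    # does the opposite

print(matvec(sigma_plus, chi_minus))   # [1, 0] = chi_plus
print(matvec(sigma_minus, chi_plus))   # [0, 1] = chi_minus
print(matvec(sigma_plus, chi_plus))    # [0, 0]: cannot raise further
```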

We noted above that the weights cos φ and sin φ in Equation (6.4) have no direct physical interpretation, but that their squares are real numbers with values between zero and unity. We may as well abandon the geometrical idea that the weights are real numbers, since no physical argument requires this. Let us therefore replace them by the complex numbers a₊ = a e^{iα} and a₋ = b e^{iβ}. The magnitudes a, b are real numbers which can be required to obey the same relations as cos φ and sin φ,

    |a₊|² + |a₋|² = a² + b² = 1,   0 ≤ |a₊| ≤ 1,   0 ≤ |a₋| ≤ 1,    (6.7)

but the phase angles α and β need not have any physical interpretation. Thus we can make the identifications

    P = |a₊|²,   1 − P = |a₋|².    (6.8)

The change from real weights in Equation (6.4) to complex weights does not at first sight change the physics, but it adds some useful freedom to the theory. Quantum mechanics embodies this in the important principle of superposition, which states that if the spinors χ₊ and χ₋ describe physical states, then every linear superposition of them with complex coefficients a±,

    χ = a₊χ₊ + a₋χ₋,    (6.9)

also describes a physical state.



Unitary Transformations. It follows from the complexity of the a± that the matrix U in Equation (6.5) is also complex. Moreover, U is restricted to transform a point on the unit circle in Figure 6.1 into another point on the circle. One then proves easily that the operator U must be unitary, obeying

    UU⁺ = U⁺U = 1,    (6.10)

where the superscript '⁺' implies transposition and complex conjugation. Actually, the unitarity condition (6.10) follows from Equations (6.5) and (6.7).

If the spin space were one-dimensional, the unitarity condition (6.10) would be satisfied by any pure phase transformation

    U = e^{iθ·1}.    (6.11)

The number '1' is of course superfluous in the exponent, but we keep it for later reference. All operators of this form are elements of a mathematical group called




U(1). Here U stands for unitary and (1) for the order of the group. The phase angle θ is a real parameter of the group having a global value all over space-time, that is, it does not depend on the space-time coordinates. The transformation (6.11) is therefore called a global gauge transformation.

A similar situation occurs in another familiar context: Maxwell's equations for the electromagnetic field are invariant under a phase transformation

    U = e^{iQθ(x)},    (6.12)

where the electric charge Q is a conserved quantity. Now, however, the parameter of the group is the product Qθ(x), which depends on the local space-time coordinate x through the function θ(x). The U(1) symmetry implies that Maxwell's equations are independent of the local choice of θ(x). The transformation (6.12) is then called a local gauge transformation (see, for example, [1]). Because this gauge symmetry is exact, the gauge boson of the theory, the photon, has exactly zero mass.

This situation is quite similar to the principle of covariance, which demanded 
that the laws of physics should be independent of the choice of local space-time 
coordinates. The principle of covariance is now replaced by the gauge principle, 
which applies to gauge field theories. 

In the spin space of the electron, the operators U are represented by unitary 2 × 2 matrices which, in addition, obey the special condition

    det U = 1.    (6.13)

This defines them to be elements of the global group SU(2), of order two. The letter S in SU(2) stands for the 'special' condition. In group theory parlance, the two-component spinors in Equation (6.3) are doublet representations of SU(2).

It is possible to express the U operators in terms of exponentiations,

    U = e^{iH},    (6.14)

analogously to the one-dimensional case (6.11). This requires the quantities H to be complex 2 × 2 matrices as well. Substituting the expression (6.14) into the unitarity condition (6.10), we have

    UU⁺ = e^{i(H−H⁺)} = 1.    (6.15)

It follows that the operators H must be Hermitian,

    H = H⁺.    (6.16)

Moreover, the special condition (6.13) requires the H matrices to be traceless.

Pauli Matrices. The most general way to write a 2 × 2 Hermitian traceless matrix is

    H = θ₁σ_x + θ₂σ_y + θ₃σ_z,    (6.17)

where the σᵢ are the Pauli matrices invented by Wolfgang Pauli (1900-1958),

    σ_x = ( 0 1 ; 1 0 ),   σ_y = ( 0 −i ; i 0 ),   σ_z = ( 1 0 ; 0 −1 ).    (6.18)

Note that all the σᵢ are traceless, and only σ_z is diagonal.




Comparing the exponent in Equation (6.11) with the expression (6.17), we see that the single parameter θ in the one-dimensional case corresponds to three real parameters θ₁, θ₂, θ₃ in SU(2). The superfluous number 1 in U(1) is the vestige of the Pauli matrices appearing in SU(2). It shares with them the property of having the square 1.
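Because each Pauli matrix squares to 1, the exponential (6.14) can be summed in closed form: for a unit vector n one has e^{iθ n·σ} = cos θ 1 + i sin θ n·σ. The sketch below builds such a matrix and verifies that it is unitary with det U = 1, i.e. an element of SU(2) (pure Python; the angle and axis are illustrative choices of ours):

```python
import math

# U = exp(i theta n.sigma) written in closed form using (n.sigma)^2 = 1:
# U = cos(theta) 1 + i sin(theta) n.sigma, for a unit vector n.
def su2_element(theta, n):
    nx, ny, nz = n
    c, s = math.cos(theta), math.sin(theta)
    return [[c + 1j * s * nz, 1j * s * (nx - 1j * ny)],
            [1j * s * (nx + 1j * ny), c - 1j * s * nz]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    """Transpose plus complex conjugation, the '+' of the text."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

U = su2_element(0.7, (0.0, 0.0, 1.0))
UUdag = matmul(U, dagger(U))
det = U[0][0] * U[1][1] - U[0][1] * U[1][0]
print(UUdag)   # the unit matrix, as (6.10) requires
print(det)     # 1, as (6.13) requires
```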

The number 1 generates the ordinary algebra (a trivial statement, indeed!) whereas the Pauli matrices generate a new, noncommutative algebra. In commutative or Abelian algebras the product of two elements θ₁ and θ₂ can be formed in either order:

    θ₁θ₂ − θ₂θ₁ = [θ₁, θ₂] = 0.

Here the square-bracketed expression is called a commutator. In the non-Abelian algebra SU(2) the commutator of two elements does not in general vanish. For instance, the commutator of two Pauli matrices σᵢ is

    [σᵢ, σⱼ] = 2iσ_k,    (6.19)

where i, j, k represent any cyclic permutation of x, y, z.
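The relation (6.19) is easy to verify explicitly for one cyclic pair, writing the Pauli matrices as nested lists of complex numbers:

```python
# Pauli matrices as 2x2 nested lists of (complex) numbers.
sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

# [sigma_x, sigma_y] should equal 2i sigma_z:
print(commutator(sx, sy))   # [[2j, 0], [0, -2j]]
```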

In quantum theory all possible observations are represented algebraically by 
linear operators, operating on physical states. When the states are described by 
spinors as in Equation (6.3), the operators are diagonal matrices. The possible 
outcomes of the observation are the values on the diagonal. 

For the case of an electron in the spin state χ±, the relation between the operator σ_z, describing the observation of spin in the z direction, the outcome ±1, and the state χ± is formulated

    σ_z χ₊ = ( 1 0 ; 0 −1 )( 1 ; 0 ) = ( 1 ; 0 ) = χ₊,    (6.20)
    σ_z χ₋ = ( 1 0 ; 0 −1 )( 0 ; 1 ) = −( 0 ; 1 ) = −χ₋.    (6.21)

These two equations are called eigenvalue equations. The χ± are said to be eigenstates of the operator σ_z, with the eigenvalues ±1, respectively. One can always choose a base where one (but only one) of the operators σ_x, σ_y or σ_z is diagonal. Note that the general spinor (6.9) is not an eigenstate of σ_z because

    σ_z χ = σ_z(a₊χ₊ + a₋χ₋) = (a₊χ₊ − a₋χ₋) ≠ ±χ.
The important lesson of this is that possible observations are operators which 
can be represented by diagonal matrices, and the numbers appearing on the diag- 
onal are the possible outcomes of the observation. Moreover, the operators must 
be linear, since they operate in a linear space, transforming spinors into spinors. 
In ordinary space the spin vector σ has a length which of course is a real positive number. Since its projection on the z-axis is either σ_z = +½ or σ_z = −½, the length of σ must be σ = |σ| = ½. The sign of σ_z indicates that σ is parallel or antiparallel to the z direction. Consider a system formed by two electrons a and b with spin vectors σ_a and σ_b and spinor states χ₊ᵃ, χ₋ᵃ, χ₊ᵇ and χ₋ᵇ. The sum vector

    σ = σ_a + σ_b




can clearly take any values in the continuum between zero and unity, depending on the relative orientations. However, quantum mechanics requires σ to be quantized to integral values, in this case to 0 or 1. This has important consequences for atomic spectroscopy and particle physics.



6.2 SU(2) Symmetries 

The proton and the neutron are very similar, except in their electromagnetic prop- 
erties: they have different charges and magnetic moments. Suppose one could 
switch off the electromagnetic interaction, leaving only the strong interaction at 
play. Then the p and n fields would be identical, except for their very small mass 
difference. Even that mass difference one could explain as an electromagnetic 
effect, because it is of the expected order of magnitude. 



Nucleon Isospin. Making use of SU(2) algebra one would then treat the nucleon N as a two-component state in an abstract charge space,

    N = ( p ; n ).    (6.22)

The p and n fields are the base vectors spanning this space,

    p = ( 1 ; 0 ),   n = ( 0 ; 1 ).    (6.23)

In analogy with the spin case, these states are the eigenstates of an operator I₃, completely unrelated to spin, but having the algebraic form of ½σ_z and eigenvalues

    I₃ = ±½.    (6.24)

Thus the proton with charge Q = +1 has I₃ = +½, and the neutron with Q = 0 has I₃ = −½. It is then convenient to give I₃ physical meaning by relating it to charge,

    Q = ½ + I₃.    (6.25)

It follows that the charge operator in matrix form is

    Q = ½( 1 0 ; 0 1 ) + ½( 1 0 ; 0 −1 ) = ( 1 0 ; 0 0 ),    (6.26)

where the charges of p and n, respectively, appear on the diagonal.
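A quick numerical check of (6.26): building Q = ½·1 + I₃ and applying it to the base vectors (6.23) returns charge +1 for the proton and 0 for the neutron (pure Python; the helper names are ours):

```python
# 2x2 matrix acting on a 2-component state, written as plain lists.
def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

I3 = [[0.5, 0], [0, -0.5]]                # isospin operator, (1/2) sigma_z
half_identity = [[0.5, 0], [0, 0.5]]      # the (1/2) term of (6.25)
Q = [[half_identity[i][j] + I3[i][j] for j in range(2)] for i in range(2)]

p, n = [1, 0], [0, 1]                     # base vectors (6.23)
print(Q)             # diag(1, 0), as in (6.26)
print(matvec(Q, p))  # [1, 0]: the proton carries charge +1
print(matvec(Q, n))  # [0, 0]: the neutron is neutral
```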

One can also define two operators I₁ = ½σ_x and I₂ = ½σ_y in order to recover the complete SU(2) algebra in the space spanned by p and n. These operators interchange the charge states, for instance

    (I₁ + iI₂) n = p.    (6.27)

SU(2) Symmetries 157 

The I₁, I₂, I₃ are the components of the isospin vector I in an abstract three-dimensional space. This contrasts with the spin case where σ is a vector in ordinary three-dimensional space.

The advantage of this notation is that the strong interactions of protons and 
neutrons as well as any linear superposition of them are treated symmetrically. 
One says that strong interactions possess isospin symmetry. Just as in the case of 
electron spin, this is a global symmetry. 

In nature, the isospin symmetry is not exact because one cannot switch off 
the electromagnetic interactions, as we supposed to begin with. Since the main 
asymmetry between the proton and the neutron is expressed precisely by their 
different electric charges, electromagnetic interactions are not isospin symmetric. 
However, the strong interactions are so much stronger, as is witnessed by the fact 
that atomic nuclei containing large numbers of protons do not blow apart in spite 
of their mutual Coulomb repulsion. Thus isospin symmetry is approximate, and 
it turns out to be a more useful tool in particle physics than in nuclear physics. 



Lepton Weak Isospin. The lessons learned from spin space and isospin sym- 
metry have turned out to be useful in still other contexts. As we have seen in 
Section 5.3, the electron and its neutrino form a family characterized by the con-
served e-lepton number L_e. They participate in similar electroweak reactions, but
they differ in mass and in their electromagnetic properties, as do the proton and
neutron. The neutrinos are very light, so the mass difference is too large to
be blamed on their different electric charges. Quite distinctly from the isospin
symmetry, which is at best useful, there is another SU(2) symmetry which is of 
fundamental importance: the leptons are considered to be components of three 
SU(2) doublets, 

\ell_e = \begin{pmatrix} \nu_e \\ e^- \end{pmatrix}, \quad \ell_\mu = \begin{pmatrix} \nu_\mu \\ \mu^- \end{pmatrix}, \quad \ell_\tau = \begin{pmatrix} \nu_\tau \\ \tau^- \end{pmatrix}.  (6.28)

Their antiparticles form three similar doublets where the up/down order is 
reversed. Thus we meet yet another abstract space, weak-isospin space, which 
is not identical to either spin space or isospin space; only the algebraic structure 
is the same. This space is spanned by spinor-like base vectors 

\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}.  (6.29)

Making use of the notation (6.28) the reactions (5.31), (5.36), and (5.37) can be 
written as one reaction, 

\nu_i + \ell_j^- \rightarrow \nu_j + \ell_i^-,  (6.30)

where the subscripts i and j refer to e, n or t. In analogy with the spin and isospin 
cases, the states (6.29) are the eigenstates of an operator 



T_3 = \tfrac{1}{2}\sigma_z = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.  (6.31)




The eigenvalues T_3 = ±1/2 appearing on the diagonal are called weak charge. One
can also define a weak-isospin vector T with components T_1, T_2, T_3 spanning an
abstract three-dimensional space. 

In the isospin case we had to 'switch off' electromagnetism in order to achieve 
a global symmetry between the differently charged p and n states of the nucleon. 
In the case of the electroweak theory, the particle states are represented by gauge 
fields which are locally gauged (see, for example, [1]). In addition, a trick has 
been invented which incorporates both electric charge and weak charge in the 
symmetry group. This trick is to enlarge the gauge symmetry to the direct product 
of two local gauge symmetry groups, 

SU(2)_w ⊗ U(1)_Y.

Here w stands for weak isospin, and Y is a new quantum number called weak 
hypercharge, the parameter of a U(1) group. If one defines the latter by

\tfrac{1}{2}Y = Q - T_3,  (6.32)

all the leptons have Y = -1 regardless of charge, and all the anti-leptons have
Y = +1.
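The statement that all leptons share Y = -1 follows directly from Equation (6.32); a small sketch, assuming the standard charge and T_3 assignments of one lepton family (the other families are identical):

```python
# Weak hypercharge from (1/2)Y = Q - T3, checked for a lepton doublet.
from fractions import Fraction as F

# (name, electric charge Q, weak isospin T3) for the first family
leptons = [("nu_e", F(0),  F(1, 2)),
           ("e-",   F(-1), F(-1, 2))]

# Y = 2*(Q - T3) for each member of the doublet
hypercharges = {name: 2 * (Q - T3) for name, Q, T3 in leptons}
print(hypercharges)
```

Both members of the doublet come out with Y = -1, which is why Y is a property of the doublet as a whole rather than of its individual components.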

The assumption that nature observes SU(2)_w ⊗ U(1)_Y symmetry implies that
the electroweak interaction does not see any difference between the neutrino and 
the electron fields and linear superpositions of them, 

\ell_e = a_+ \nu_e + a_- e^-;

they all have the same weak hypercharge and the same L_e. That the symmetry
is a local gauge symmetry implies that the laws of electroweak interactions are
independent of the 'local' choice of gauge functions, analogous to θ(x) in Equa-
tion (6.12). However, nature does not realize this symmetry exactly; it is a broken
symmetry. This is seen from the fact that of the four gauge bosons γ, Z⁰, W⁺, W⁻
mediating the electroweak interaction, three are massive. In an exact local gauge-
symmetric theory all gauge bosons are massless, as the photon is in the case of
the exact U(1).

Actually, the electroweak force is the outcome of a long and difficult search to 
unify the forces of nature. The first milestone on this road was set up by Maxwell 
when he managed to unify electricity and magnetism into one electromagnetic 
force. Since then science has been concerned with four forces of nature: the strong 
force responsible for the stability of nuclei, the electromagnetic force responsible 
for atomic structure and chemistry, the weak force which played such an impor- 
tant cosmological role during the late radiation era and the gravitational force 
acting during the matter-dominated era. The latter three forces are described by 
local gauge-field theories. 

Einstein attempted in vain to unify gravitation and electromagnetism during his 
last 20 years. A breakthrough came in 1967 when Sheldon Glashow, Steven Wein- 
berg and Abdus Salam succeeded in unifying the weak and electromagnetic forces 
into the electroweak interaction. Since then the goal has been to achieve a grand 
unified theory (GUT), which would unify strong and electroweak interactions, and 
ultimately gravitation as well in a theory of everything (TOE). 




6.3 Hadrons and Quarks 

The spectrum of hadrons participating in strong interactions is extremely rich. 
The hadrons comprise two large classes of particles already encountered, the 
baryons and the mesons. Charged hadrons also have electromagnetic interactions, 
and all hadrons have weak interactions, but strong interactions dominate when- 
ever possible. 

The reaction rate of strong interactions exceeds by far the rate of other inter- 
actions. Weak decays like reactions (5.57) and (5.59) typically require mean lives 
ranging from microseconds to picoseconds. An electromagnetic decay like 

\pi^0 \rightarrow \gamma + \gamma

takes place in less than 10⁻¹⁶ s, and heavier particles may decay 1000 times faster.
Strongly decaying hadrons, however, have mean lives of the order of 10⁻²³ s.

Some simplification of the hadron spectrum may be achieved by introducing
isospin symmetry. Then the nucleon N stands for n and p, the pion π stands for
π⁺, π⁰, π⁻, the kaon K for the strange mesons K⁺, K⁰ with mass 495 MeV, etc.



Quarks. In 1964 Murray Gell-Mann and George Zweig realized that the hadron
spectrum possessed more symmetry than isospin symmetry. They proposed that
all known hadrons could be built out of three hypothetical states called quarks,
q = u, d, s, which spanned a three-dimensional space with SU(3) gauge sym-
metry. This SU(3) group is an extension of the isospin SU(2) group to include
strangeness, an additive quantum number possessed by the kaon and many other 
hadrons. The up and down quark fields u, d form an isospin doublet like Equa- 
tion (6.22), whereas the strange quark s is an isospin-neutral singlet. Together 
they form the basic building block of SU(3), a triplet of quarks of three flavours. 
The quarks are fermion fields just like the leptons, and in spin space they are 
SU(2) doublets. 

In the quark model the mesons are qq̄ bound states, and the baryons are qqq
bound states. The differences in hadron properties can be accounted for in two 
ways: the quarks can be excited to higher angular momenta, and in addition the 
three flavours can enter in various combinations. For instance, the nucleon states 
are the ground state configurations 

p = uud, n = udd. (6.33) 

The mesonic ground states are the pions and kaons with the configurations

\pi^+ = u\bar{d}, \quad \pi^0 = \tfrac{1}{\sqrt{2}}(u\bar{u} - d\bar{d}), \quad \pi^- = d\bar{u},
K^+ = u\bar{s}, \quad K^0 = d\bar{s},  (6.34)

as well as the η and η′, which are linear combinations of uū, dd̄ and ss̄.

After the discovery in the 1960s of the electroweak gauge symmetry SU(2)_w ⊗
U(1)_Y for the then known e- and μ-lepton families and the u, d family of quarks,




it was realized that this could be the fundamental theory of electroweak interac- 
tions. But the theory clearly needed a fourth flavour quark to complete a second 
SU(2)_w doublet together with the s-quark. To keep the s-quark as a singlet would
not do: it would be T_3-neutral and would not feel the weak interactions. In 1974 the
long-predicted charmed quark c was discovered simultaneously by two teams, an MIT
team led by Sam Ting and a SLAC team led by Burt Richter.

The following year the issue was confused once more when another SLAC team,
led by Martin Perl, discovered the τ lepton, showing that the lepton families
were three. This triggered a search for the τ neutrino and the corresponding
two quarks, if they existed. In 1977 a team at Fermilab led by Leon Lederman
found the fifth quark with the same charge as the d and s, but with its own new
flavour. It was therefore a candidate for the bottom position in the third quark
doublet. Some physicists, lacking the imagination of those who invented
'strangeness' and 'charm', baptized it the bottom quark, b, although the
name beauty has also been used. The missing companion to the bottom quark
in the third SU(2)_w doublet was prosaically called the top quark, t. The fields of the
three quark families can then be ordered as 

\begin{pmatrix} u \\ d \end{pmatrix}, \quad \begin{pmatrix} c \\ s \end{pmatrix}, \quad \begin{pmatrix} t \\ b \end{pmatrix}.  (6.35)

The top quark was discovered by the CDF and D0 teams at Fermilab in
1994-1995. The known ground-state charm and bottom mesons are, respectively,



D + 


= cd, 


D° 


= cu, 


D s 


= cs, 


B + 


= ub, 


B" 


= db, 


/;'■ 


= sb. 



In addition cc and bb states are known. 

The strong interaction symmetry, which had started successfully with SU(3) 
for three quarks, would logically be enlarged to SU(n) for quarks of n flavours. 
However, the quark masses, although not directly measurable, are so vastly dif- 
ferent that even SU(4) is a badly broken symmetry and not at all useful. Only the 
isospin SU(2) subgroup and the flavour SU(3) subgroup continue to be useful, in 
particular for the classification of hadrons. 

It follows from the quark structure (6.33) of the nucleons that each quark pos-
sesses baryon number B = 1/3. The SU(2)_w symmetry requires all the T_3 = +1/2 (upper)
states in the doublets to have the same charge Q_u, and all the T_3 = -1/2
(lower) states to have the charge Q_d. To match the nucleon charges, the charges
of the quarks have to satisfy the relations

2Q_u + Q_d = 1, \quad Q_u + 2Q_d = 0.  (6.36)

The solution chosen by nature is 

Q_u = \tfrac{2}{3}, \quad Q_d = -\tfrac{1}{3}.  (6.37)
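The pair of relations (6.36) can be solved by hand with exact fractions, and the result cross-checked against the nucleon charges; a minimal sketch:

```python
# Solve 2*Qu + Qd = 1 (proton = uud) and Qu + 2*Qd = 0 (neutron = udd).
from fractions import Fraction as F

# Eliminate Qu: subtract twice the second equation from the first,
# (2*Qu + Qd) - 2*(Qu + 2*Qd) = 1 - 0  =>  -3*Qd = 1
Qd = F(-1, 3)
Qu = -2 * Qd          # from Qu + 2*Qd = 0

# Cross-check against the nucleon charges
proton_charge  = 2 * Qu + Qd   # uud
neutron_charge = Qu + 2 * Qd   # udd
print(Qu, Qd, proton_charge, neutron_charge)
```

Exact rational arithmetic makes it obvious that Q_u = 2/3 and Q_d = -1/3 is the unique solution reproducing the proton charge +1 and the neutron charge 0.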

It now follows from the definition of weak hypercharge Y in Equation (6.32) that
all the quarks have the same weak hypercharge, for instance

\tfrac{1}{2}Y_d = Q_d - T_3 = -\tfrac{1}{3} + \tfrac{1}{2} = \tfrac{1}{6}, \quad Y_d = \tfrac{1}{3}.  (6.38)







Figure 6.2 Feynman diagram for neutron β decay with quark lines.

Thus the quarks differ from the leptons in weak hypercharge. Actually, we can
get rid of the somewhat artificial notion of weak hypercharge by noting that

Y = B - L,  (6.39)

where B is the baryon number, not possessed by leptons, and L is the lepton
number, not possessed by baryons. Thus for leptons B - L = -1, whereas for
quarks it is 1/3.
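Equation (6.39) can be verified directly from the quantum numbers already introduced; a small check, assuming the standard Q, T_3, B and L assignments for the first family:

```python
# Check Y = 2*(Q - T3) against B - L for one family of leptons and quarks.
from fractions import Fraction as F

# (name, Q, T3, baryon number B, lepton number L)
fermions = [("nu_e", F(0),     F(1, 2),  F(0),    F(1)),
            ("e-",   F(-1),    F(-1, 2), F(0),    F(1)),
            ("u",    F(2, 3),  F(1, 2),  F(1, 3), F(0)),
            ("d",    F(-1, 3), F(-1, 2), F(1, 3), F(0))]

for name, Q, T3, B, L in fermions:
    Y = 2 * (Q - T3)          # hypercharge from Equation (6.32)
    assert Y == B - L, name   # Equation (6.39)
print("Y = B - L holds for all four states")
```

The identity holds for every state: Y = -1 for the leptons and Y = 1/3 for the quarks, as stated in the text.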

The electroweak interactions of the leptons and quarks are mediated by the
gauge bosons γ, Z⁰, W⁺, W⁻. Two examples of this interaction were illustrated
by the Feynman diagrams in Figures 5.2 and 5.3, where each line corresponds to
a particle. Figure 6.2 shows the Feynman diagram for neutron β decay, reaction
(5.96), where the decomposition into quarks is explicit, a nucleon corresponding
to three quark lines. As is seen, the decay of a neutron involves the transforma-
tion of a d quark of charge -1/3 into a u quark of charge 2/3 and a virtual W⁻ boson.
The final quark system is therefore that of a proton. Two of the quarks do not
participate in the reaction at all; they remain spectators of what is going on. Sub-
sequently, the virtual W⁻ produces a lepton-anti-lepton pair, conserving electric
charge and keeping the total lepton number at L = 0.

The strong interactions of hadrons, which were never very well understood, 
obviously had to be replaced by interactions at the quark level. For this a new 
mediator is needed, the gluon, which is a vector boson like the previously men-
tioned mediators of interactions, but massless like the photon. The gluon is then
also responsible for binding quarks into hadrons, but it does not couple to the leptons.
The field theory describing the interactions of quarks and gluons is called quan- 
tum chromodynamics (QCD)— the reason for the word chromo will be explained 
next. 



Colour. The quarks have another intrinsic property which they do not share with 
the leptons. It appears from problems in hadron spectroscopy, and from the rates 
of certain hadronic reactions which occur three times faster than expected, that 
each quark actually must come in three versions. These versions do not differ from 
each other in any respect encountered so far, so they require a new property called 
colour. To distinguish quarks of different colour, one may choose to call them red, 




blue and yellow (R, B, Y), for instance. They span an abstract three-dimensional 
space with SU(3)_c symmetry ('c' for colour), and base vectors

q_R = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad q_B = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad q_Y = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},  (6.40)

where q stands for the flavours u, d, c, s, t, b.

Colour is an absolutely conserved quantum number, in contrast to flavour, 
which is conserved in strong and electromagnetic interactions, but broken in weak 
interactions. Since the gluon interacts with quarks mediating the colour force, it 
must itself carry colour, so that it can change the colour of a quark. The same 
situation occurs in electroweak interactions where the conservation of charge, as 
for instance in Figure 6.2, requires the W to carry charge so that it can change a 
d quark into a u quark. Gluons interact with gluons because they possess colour 
charge, in contrast to photons, which do not interact with photons because of 
their lack of electric charge. 

Since there are quarks of three colours, there must exist nine gluons, one for
each distinct colour-anti-colour pair. Thus the gluon colours are BB̄, BR̄, BȲ, RB̄, RR̄, RȲ, YB̄,
YR̄, YȲ. The linear combination of BB̄, RR̄ and YȲ which is colour-neutral and
symmetric under the exchange of any two colours is a singlet state,
and the interaction is completely blind to it. Thus there are eight distinct gluons.
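The counting argument above is simple enough to spell out explicitly; a sketch that enumerates the colour-anti-colour pairs and removes the single colour-neutral singlet combination:

```python
# Count colour-anticolour gluon states: 3 x 3 = 9 pairs,
# minus the one colour-neutral singlet combination, leaves 8 gluons.
colours      = ["R", "B", "Y"]
anti_colours = ["Rbar", "Bbar", "Ybar"]

pairs = [(c, a) for c in colours for a in anti_colours]
n_pairs = len(pairs)      # 9 combinations in total
n_gluons = n_pairs - 1    # the singlet (RRbar + BBbar + YYbar) decouples
print(n_pairs, n_gluons)
```

Group-theoretically this is the decomposition 3 ⊗ 3̄ = 8 ⊕ 1 of SU(3): the octet carries the colour force, the singlet does not.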

Since the colour property is not observed in hadrons, they must be colour- 
neutral. How can one construct colour-neutral compounds of coloured con- 
stituents? For the mesons which are quark-anti-quark systems, the answer is 
quite simple: for a given quark colour the anti-quark must have the corresponding 
anti-colour. Thus, for instance, the π⁺ meson is a ud̄ system when we account for
flavours only, but if we account also for colour, three π⁺ mesons are possible, cor-
responding to u_B d̄_B̄, u_R d̄_R̄ and u_Y d̄_Ȳ, respectively. Each of these is colour-neutral,
so hadronic physics does not distinguish between them. Also the baryons which 
are qqq states must be colour-neutral. This is possible for a totally antisymmetric 
linear combination of three quarks q, q' , q" , having three colours each. 



Asymptotic Freedom. It is a curious fact that free quarks never appear in the lab- 
oratory, in spite of ingenious searches. When quarks were first invented to explain 
the spectroscopy of hadrons, they were thought to be mere abstractions without 
real existence. Their reality was doubted by many people since nobody had suc- 
ceeded in observing them. But it was gradually understood that their nonobserv- 
ability was quite natural for deep reasons related to the properties of vacuum. 

The quark and the anti-quark in a meson are like the north and south poles of 
a magnet, which cannot be separated, because there exists no such thing as a free
magnetic north pole. If one breaks the bond between the poles, each of the pieces will
become a new magnet with a north and a south pole. The new opposite poles are
generated at the break, out of the vacuum so to speak. Similarly, if one tries to
break a qq̄ pair, a new qq̄ pair will be generated out of the vacuum at the break,
and one thus obtains two new qq̄ mesons which are free to fly away.




To account for this property one must assign curious features to the poten- 
tial responsible for the binding of the qq system: the larger the interquark dis- 
tance, the stronger the potential. Inversely, at very small interquark distances the 
potential can be so weak that the quarks are essentially free! This feature is called 
asymptotic freedom. 
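This behaviour of the interquark potential is commonly modelled by a Cornell-type form V(r) = -aħc/r + σr; the following sketch uses that parametrization with illustrative parameter values of our own choosing (the book does not give a specific form):

```python
# Sketch of a confining interquark potential: Coulomb-like at short
# distance, linearly rising at large distance.  Parameter values are
# illustrative only, roughly at the scale suggested by lattice QCD.
a = 0.3        # short-distance (Coulomb-like) strength, dimensionless
sigma = 0.9    # "string tension" in GeV/fm

def V(r_fm):
    """Interquark potential in GeV at separation r_fm (fm); hbar*c ~ 0.197 GeV fm."""
    return -a * 0.197 / r_fm + sigma * r_fm

# The potential keeps growing with separation: pulling quarks apart
# costs ever more energy, which is what makes free quarks unobservable.
samples = [V(r / 10) for r in range(1, 31)]   # r = 0.1 ... 3.0 fm
rising = all(samples[i] < samples[i + 1] for i in range(len(samples) - 1))
print(rising)
```

Because the derivative aħc/r² + σ is positive everywhere, the potential rises without bound: separating a qq̄ pair eventually stores enough energy to create a new pair, as described in the text.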



Higher Symmetries. The SU(3) algebra is a straightforward generalization of 
SU(2) to order three. Since the base vectors (6.40) have three components, the
operators in that space must also be 3 × 3 matrices which generalize the Pauli
matrices. Of these, two are diagonal, corresponding to two observable properties.
We are now ready to combine the SU(3) C symmetry with the electroweak symme- 
try in a global symmetry for the gluonic interactions of quarks and the electroweak 
interactions of leptons and quarks. The most elegant way would be if nature were 
symmetric under a larger group, say SU(5) or SO(10), which has SU(2) and SU(3) 
as subgroups. However, no satisfactory larger group has been found, so the global 
symmetry group 

G_s = SU(3)_c ⊗ SU(2)_w ⊗ U(1)_{B-L}  (6.41)

seems to be the less elegant direct product of the three symmetry groups. This 
symmetry is referred to as the standard model in particle physics (not to be con- 
fused with the standard model in Big Bang cosmology). This will play an important 
role in the discussion of the primeval Universe. 



6.4 The Discrete Symmetries C, P, T 

According to the cosmological principle, the laws of physics should be indepen- 
dent of the choice of space-time coordinates in a homogeneous and isotropic 
universe. Indeed, all laws governing physical systems in isolation, independent 
of external forces, possess translational symmetry under the displacement of the 
origin in three-space as well as in four-space. Such systems also possess rotational 
symmetry in three-space. Translations and rotations are continuous transforma- 
tions in the sense that a finite transformation (a translation by a finite distance 
or a rotation through a finite angle) can be achieved by an infinite sequence of 
infinitesimal transformations. 



Space Parity. A different situation is met in the transformation from a right- 
handed coordinate system in three-space to a left-handed one. This is achieved 
by reflecting the coordinate system in a point, or by replacing the x, y, z coordin- 
ates simultaneously by -x, -y, -z, respectively. This transformation cannot be 
achieved by an infinite sequence of infinitesimal transformations, and it therefore 
represents a discrete transformation. 

The mirror reflection in three-space is called parity transformation, and the 
corresponding parity operator is denoted P. Obviously, every vector v in a right- 




handed coordinate system is transformed into its negative in a left-handed coor- 
dinate system, 

P\mathbf{v} = -\mathbf{v}.

This has the structure of an eigenvalue equation: v is an eigenvector of P with the 
eigenvalue P = -1. A function f(r) of the position vector r is transformed by P 
into 

Pf(\mathbf{r}) = f(-\mathbf{r}).  (6.42)

Let us take f(r) to be a scalar function which is either symmetric under the
parity transformation, f(-r) = f(r), or antisymmetric, f(-r) = -f(r). In both
cases Equation (6.42) is an eigenvalue equation with f(r) the eigenfunction of P
having the eigenvalue P = +1 or P = -1, respectively. Thus, scalars transform
under P in two ways: those corresponding to even parity P = +1 are called (true)
scalars; those corresponding to odd parity P = -1 are called pseudoscalars.

It may seem intuitively natural that the laws of physics should possess this left- 
right symmetry. The laws of classical mechanics in fact do, and so do Maxwell's 
laws of electrodynamics and Newton's and Einstein's laws of gravitation. All par- 
ticles transform under P in some particular way which may be that of a scalar, a 
pseudoscalar, a vector or yet another. One can then consider that this is an intrin-
sic property, parity P = ±1, if the particles are eigenstates of P. The bosons are,
but the fermions are not, eigenstates of P, because of their spinor nature (recall
that the W and Z are vector bosons). However, fermion-anti-fermion pairs are
eigenstates with odd parity, P = -1. The strong interactions conserve parity, and 
so do the electromagnetic interactions, from the evidence of Maxwell's equations. 
In a parity-conserving universe there is no way to tell in an absolute sense which 
direction is left and which right. 

It came as a surprise then when, in 1957, it was discovered that the weak inter- 
actions turned out to violate left-right symmetry, in fact maximally so. In a weak 
interaction the intrinsic parity of a particle could change. Thus if we communi-
cated with a being in another galaxy, we could tell her in an absolute sense which
direction is left by instructing her to do a β decay experiment.



Helicity States. A consequence of this maximal violation is the helicity property
of the leptons and quarks. For a particle with spin vector σ moving in some frame
of reference with momentum p, the helicity is defined as

H = \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|}.  (6.43)

Particles with H < 0 are called left-handed and particles with H > 0 are called
right-handed. The maximal left-right symmetry violation implies that left-handed
leptons and quarks have couplings to the weak-interaction gauge bosons W and Z,
whereas the couplings of the right-handed ones are strongly suppressed. This is
true in particular for the neutrinos, which have such small masses as to make the
right-handed ones practically inert. The rate for W- and Z-mediated scattering of
right-handed neutrinos is suppressed, in comparison with the left-handed rate,




by a factor of the order of m_ν²/E², where E is the energy in the centre-of-mass
system, and the neutrino masses satisfy m_ν ≲ 0.23 eV. This is the reason why the
neutrinos contributed only one spin state each to g_* in Equation (5.52), while the
charged leptons contributed two.
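The size of this helicity suppression is easy to estimate; a quick numerical sketch (the choice of energy scale, 1 MeV, is ours and purely illustrative):

```python
# Helicity suppression factor m_nu^2 / E^2 for right-handed neutrino
# scattering, with m_nu ~ 0.23 eV and an illustrative E ~ 1 MeV.
m_nu = 0.23          # eV
E = 1.0e6            # eV (1 MeV centre-of-mass energy)

suppression = (m_nu / E) ** 2   # roughly 5e-14
print(suppression)
```

Even at MeV energies the suppression is some thirteen orders of magnitude, which is why right-handed neutrinos are, for all practical purposes, inert.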

All the above is true also for anti-leptons and anti-quarks, except that their 
helicity has the reversed sign. Thus their weak interactions are dominantly right- 
handed, and left-handed anti-neutrinos are practically inert. 



Charge Conjugation. Let us now introduce another discrete operator C called 
charge conjugation. The effect of C on a particle state is to turn it into its own 
antiparticle. For flavourless bosons like the pion this is straightforward because 
there is no fundamental difference between a boson and its anti-boson, only the 
electric charge changes, e.g. 

C\pi^+ = \pi^-.  (6.44)

Thus the charged pion is not an eigenstate of this operator, but the π⁰ is. The
C operator reverses the signs of all flavours, lepton numbers, and the baryon
number.

Weak interactions are in fact symmetric under CP to a very good approximation. 
The combined operator CP is useful because it transforms left-handed leptons into 
right-handed anti-leptons, both of which are observed states: 

CP\,\nu_L = C\,\nu_R = \bar{\nu}_R.  (6.45)

Some reactions involving kaons exhibit a tiny CP violation, of the order of 0.22% 
relative to CP-conserving reactions. The decays of B mesons, which are ub̄- and
db̄-quark systems, also exhibit a very small CP violation. It turns out that this tiny
effect is of fundamental importance for cosmology, as we shall see in Section 6.7. 
The reason for CP violation is not yet understood. 

The strong interactions violate CP with an almost equal amount but with oppo- 
site sign, such that the total violation cancels, or in any case it is less than 10⁻⁹.
Why this is so small is also not known. 



Time Reversal. A third discrete symmetry of importance is time reversal T, or 
symmetry under inversion of the arrow of time. This is a mirror symmetry with 
respect to the time axis, just as parity was a mirror symmetry with respect to 
the space axes. All physical laws of reversible processes are formulated in such 
a way that the replacement of time t by -t has no observable effect. The particle 
reactions in Section 5.3 occur at the same rate in both directions of the arrow 
(to show this, one still has to compensate for differences in phase space, i.e. the 
bookkeeping of energy in endothermic and exothermic reactions). 



CPT Symmetry. Although time reversal is not very important in itself, for 
instance particles do not carry a conserved quantum number related to T, it is 




one factor in the very important combined symmetry CPT. According to our most 
basic notions in theoretical physics, CPT symmetry must be absolute. It then fol- 
lows from the fact that CP is not an absolute symmetry, but slightly violated, that 
T must be violated by an equal and opposite amount. 

In a particle reaction, CPT symmetry implies that a left-handed particle entering 
the interaction region from the x-direction is equivalent to a right-handed antipar- 
ticle leaving the region in the x-direction. One consequence of this is that particles 
and antiparticles must have exactly the same mass and, if they are unstable, they 
must also have exactly the same mean life. 

Needless to say, many ingenious experiments have been and still are carried out 
to test CP violation and T violation to ever higher precision. CPT symmetry will 
probably be tested when sufficient numbers of anti-hydrogen atoms have been 
produced, and their properties will be compared with those of hydrogen. 

6.5 Spontaneous Symmetry Breaking 

As we have seen, nature observes very few symmetries exactly. In fact, the way a
symmetry is broken may be an important ingredient in the theory. As an intro-
duction to spontaneous breaking of particle symmetries, let us briefly study some
simple examples from other fields. For further reading see, for example, refer-
ences [1]-[4].



Classical Mechanics. Consider a cylindrical steel bar standing on one end on a 
solid horizontal support. A vertical downward force is applied at the other end. 
This system is obviously symmetrical with respect to rotations around the vertical 
axis of the bar. If the force is increased beyond the strength of the steel bar, it 
buckles in one direction or another. At that moment, the cylindrical symmetry is 
broken. 



Bar Magnet. An iron bar magnet heated above the Curie temperature 770 °C 
loses its magnetization. The minimum potential energy at that temperature cor- 
responds to a completely random orientation of the magnetic moment vectors 
of all the electrons, so that there is no net magnetic effect. This is shown in Fig- 
ure 6.3, where the potential energy follows a parabola with its apex at zero magne- 
tization. Since no magnetization direction is selected, this bar magnet possesses 
full rotational symmetry. The corresponding symmetry group is denoted O (3) for 
orthogonal rotations in three-space. 

As the bar magnet cools below the Curie temperature of 770 °C, however, this sym-
metry is spontaneously broken. When an external magnetic field is applied, the
electron magnetic-moment vectors align themselves, producing a net collective
macroscopic magnetization. The corresponding curve of potential energy has two
deeper minima symmetrically on each side of zero magnetization (see Figure 6.3).
They distinguish themselves by having the north and south poles reversed. Thus 







Figure 6.3 Magnetization curves of a bar magnet. (a) The temperature is above the Curie
point 770 °C and the net magnetization is zero at the potential-energy minimum. (b) The
temperature is below the Curie point and the net magnetization is nonvanishing
at the symmetric potential-energy minima.




Figure 6.4 Potential energy of the form (6.46) of a real scalar field φ.



the ground state of the bar magnet is in either one of these minima, not in the 
state of zero magnetization. 

The rotational symmetry has then been replaced by the lesser symmetry of 
parity, or inversion of the magnetization axis. To be exact, this argument actually 
requires that the bar magnet is infinitely long so that its moment of inertia is 
infinite. Then no unitary operator can rotate the north pole into the south pole. 
Note that the potential energy curve in Figure 6.4 has the shape of a polynomial 
of at least fourth degree. 






Figure 6.5 Potential energy of the form (6.49) of a real scalar field φ.

Free Scalar Bosons. As a third example of a spontaneously broken symmetry,
consider the vacuum filled with a real scalar field φ(x), where x stands for the
space-time coordinates. Recall that the electric field is a vector: it has a direction.
A scalar field is like temperature: it may vary as a function of x, but it has no
direction.

If the potential energy in the vacuum is described by the parabolic curve in
Figure 6.4, the total energy may be written

\tfrac{1}{2}(\nabla\varphi)^2 + \tfrac{1}{2}m^2\varphi^2.  (6.46)

Here the first term is the kinetic energy, which does not interest us. The second
term is the potential energy V(φ), which has a minimum at φ = 0 if φ is a
classical field and m² is a positive number. The origin of the ½ factors is of no
importance to us.

If φ is a quantum field, it oscillates around the classical ground state φ = 0 as
one moves along some trajectory in space-time. The quantum mechanical ground
state is called the vacuum expectation value of the field. In this case it is

\langle\varphi\rangle = 0.  (6.47)

One can show that the potential (6.46) corresponds to a freely moving scalar boson
of mass m (there may indeed exist such particles).

Another parametrization of a potential with a single minimum at the origin is
the fourth-order polynomial

V(\varphi) = \tfrac{1}{2}m^2\varphi^2 + \tfrac{1}{4}\lambda\varphi^4,  (6.48)

where λ is some positive constant.

Physical Scalar Bosons. Let us now study the potential in Figure 6.5, which
resembles the curve in Figure 6.3 at temperatures below 770 °C. This clearly




requires a polynomial of at least fourth degree. Let us use a form similar to Equa-
tion (6.48),

V(\varphi) = -\tfrac{1}{2}\mu^2\varphi^2 + \tfrac{1}{4}\lambda\varphi^4.  (6.49)

The two minima of this potential are at the field values

\varphi_0 = \pm\mu/\sqrt{\lambda}.  (6.50)

Suppose that we are moving along a space-time trajectory from a region where
the potential is given by Equation (6.48) to a region where it is given by Equa-
tion (6.49). As the potential changes, the original vacuum (6.47) is replaced by the
vacuum expectation value ⟨φ₀⟩. Regardless of the value of φ at the beginning of
the trajectory, it will end up oscillating around φ₀ after a time of the order of
μ⁻¹. We say that the original symmetry around the unstable false vacuum point
at φ = 0 has been broken spontaneously.
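The location and depth of the minima (6.50) are easy to verify numerically; a minimal sketch with arbitrary illustrative values of μ and λ:

```python
# Minima of V(phi) = -(1/2) mu^2 phi^2 + (1/4) lam phi^4
# are at phi0 = +/- mu / sqrt(lam).
import math

mu, lam = 1.5, 0.5          # arbitrary illustrative values

def V(phi):
    return -0.5 * mu**2 * phi**2 + 0.25 * lam * phi**4

phi0 = mu / math.sqrt(lam)

# V'(phi0) = 0, and the minimum lies below the false vacuum V(0) = 0:
dV = (V(phi0 + 1e-6) - V(phi0 - 1e-6)) / 2e-6   # numerical derivative
print(abs(dV) < 1e-4, V(phi0) < V(0))
```

The derivative vanishes at φ₀ and V(φ₀) < V(0), confirming that φ = 0 is only a false vacuum once the sign of the mass term is flipped.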

Comparing the potentials (6.49) and (6.46), we see that a physical interpretation
of (6.49) would correspond to a free scalar boson with negative squared mass, -μ²!
How can this be physical?

Let us replace φ by φ + φ₀ in the expression (6.49). Then

V(\varphi) = -\tfrac{1}{2}\mu^2(\varphi + \varphi_0)^2 + \tfrac{1}{4}\lambda(\varphi + \varphi_0)^4 = \tfrac{1}{2}(3\lambda\varphi_0^2 - \mu^2)\varphi^2 + \cdots.  (6.51)

The dots indicate that terms of third and fourth order in φ have been left out.
Comparing the coefficients of φ² in Equations (6.46) and (6.51) and substituting
(6.50) for φ₀, we obtain

m^2 = 3\lambda\varphi_0^2 - \mu^2 = 3\lambda(\mu/\sqrt{\lambda})^2 - \mu^2 = 2\mu^2.  (6.52)

Thus we see that the effective mass of the field is indeed positive, so it can be
interpreted as a physical scalar boson.
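The result m² = 2μ² of Equation (6.52) can also be checked by computing the curvature of the potential at the true minimum, since the effective squared mass is just V''(φ₀); a minimal numerical sketch with parameter values of our choosing:

```python
# Effective squared mass = V''(phi0) for
# V(phi) = -(1/2) mu^2 phi^2 + (1/4) lam phi^4,  with phi0 = mu/sqrt(lam).
import math

mu, lam = 1.2, 0.8                     # arbitrary illustrative values
phi0 = mu / math.sqrt(lam)

def V(phi):
    return -0.5 * mu**2 * phi**2 + 0.25 * lam * phi**4

h = 1e-4
second_derivative = (V(phi0 + h) - 2 * V(phi0) + V(phi0 - h)) / h**2

# Compare with the analytic result 3*lam*phi0**2 - mu**2 = 2*mu**2
print(second_derivative, 2 * mu**2)
```

The finite-difference curvature agrees with 2μ² to numerical precision, confirming the expansion in Equation (6.51).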

The bosons in this model can be considered to move in a vacuum filled with the
classical background field φ₀. In a way, the vacuum has been redefined, although it
is just as empty as before. The only thing that has happened is that in the process 
of spontaneous symmetry breaking, the mass of the scalar boson has changed. 
The symmetry breaking has this effect on all particles with which the scalar field 
interacts, fermions and vector bosons alike. 



Electroweak Symmetry Breaking. We shall now apply this model to the case of
SU(2)_w ⊗ U(1) symmetry breaking. There are additional complications because of
the group structure, but the principle is the same. The model is really the rela-
tivistic generalization of the theory of superconductivity of Ginzburg and Landau.
Let us start on a trajectory in a region of space-time where SU(2)_w ⊗ U(1) is an
exact symmetry. The theory requires four vector bosons, as we have seen before,
but under the exact symmetry they are massless and called B⁰, W⁺, W⁰, W⁻. We
now invent a scalar field φ such that, as we go along the trajectory into a region
where the symmetry is spontaneously broken, the vector bosons obtain their phys-
ical masses.







Figure 6.6 'Mexican hat' potential of a complex scalar field φ. All the true vacuum states
are located on the minimum (6.54), forming a circle at the bottom of the hat. The
false vacuum is on the top of the hat at the centre.

To do this we use a trick invented by Peter Higgs. We choose the Higgs field φ
to be a complex scalar SU(2)_w-doublet,

φ = (φ₁ + iφ₂, φ₃ + iφ₄). (6.53)

The vector bosons interact with the four real components φᵢ of the SU(2)_w-
symmetric field φ. The false vacuum corresponds to the state φ = 0, or

φ₁ = φ₂ = φ₃ = φ₄ = 0.

The true vacuum, which has a lower potential energy than the false vacuum, cor-
responds to the state

φ₁ = φ₂ = 0, φ₃² + φ₄² = constant > 0. (6.54)

This potential is like the one in Figure 6.5, but rotated around the V-axis. Thus 
it has rotational symmetry like a Mexican hat (see Figure 6.6). All values of the 
potential on the circle at the bottom are equally good. 




If on our trajectory through space-time we come to a point where the field has a
special value such as φ₄ = 0, φ₃ > 0, then the rotational symmetry of the Mexican
hat is spontaneously broken. Just as in the case of the potential (6.51), the scalar
field becomes massive, corresponding to one freely moving Higgs boson, H⁰, in a
redefined vacuum. As a consequence, the vector bosons W⁺, W⁻ interacting with
the scalar field also become massive (80 GeV). The two neutral fields B⁰, W⁰ form
the linear combinations

γ = B⁰ cos θ_w + W⁰ sin θ_w, (6.55)

Z⁰ = −B⁰ sin θ_w + W⁰ cos θ_w, (6.56)

of which Z⁰ becomes massive (91 GeV), whereas our ordinary photon γ remains
massless. The reason γ remains massless is that it is electroweak-neutral, so
it does not feel the electroweak Higgs field.

Thus the Higgs boson explains the spontaneous breaking of the SU(2)_w ⊗ U(1)
symmetry. The H⁰ mass is

m_H = √(2λ) × 246 GeV. (6.57)

Unfortunately, the value of λ is unknown, so this very precise relation is useless!
At the time of writing, the Higgs boson has not yet been found; only a lower limit of
114 GeV can be quoted. But since the standard model works very well, physicists
are confident of finding it in the next generation of particle accelerators, if not
before.

Since the electroweak symmetry SU(2)_w ⊗ U(1)_{B−L} is the direct product of two
subgroups, U(1)_{B−L} and SU(2)_w, it depends on two coupling constants g₁, g₂
associated with the two factor groups. Their values are not determined by the
symmetry, so they could in principle be quite different. This is a limitation of
the electroweak symmetry. A more symmetric theory would depend on just one
coupling constant g. This is one motivation for the search for a GUT which would
encompass the electroweak symmetry and the colour symmetry, and which would
be more general than their product (6.41).

At the point of spontaneous symmetry breaking several parameters of the the-
ory obtain specific values. It is not quite clear where the different masses of all
the quarks and leptons come from, but symmetry breaking certainly plays a role.
Another parameter is the so-called Weinberg angle θ_w, which is related to the cou-
pling constant of the SU(2)_w subgroup. Its value is not fixed by the electroweak
theory, but it is expected to be determined in whatever GUT may be valid.

6.6 Primeval Phase Transitions and Symmetries 

The primeval Universe may have developed through phases when some symmetry 
was exact, followed by other phases when that symmetry was broken. The early 
cosmology would then be described by a sequence of phase transitions. Symme- 
try breaking may occur through a first-order phase transition, in which the field 
tunnels through a potential barrier, or through a second-order phase transition, in 




which the field evolves smoothly from one state to another, following the curve 
of the potential. 



Temperature. An important bookkeeping parameter at all times is the temper-
ature, T. When we follow the history of the Universe as a function of T, we are
following a trajectory in space-time which may be passing through regions of dif-
ferent vacua. In the simple model of symmetry breaking by a real scalar field, φ,
having the potential (6.49), the T-dependence may be put in explicitly, as well as
other dependencies (denoted by 'etc.'),

V(φ, T, etc.) = −½μ²φ² + ¼λφ⁴ + ⅛λT²φ² + ··· . (6.58)

As time decreases, T increases and the vacuum expectation value φ₀ decreases, so
that finally, in the early Universe, the true minimum of the potential is the trivial
one at φ = 0. This occurs above a critical temperature of

T_c = 2μ/√λ. (6.59)

An example of this behaviour is illustrated by the potentials in Figure 6.7. The
different curves correspond to different temperatures. At the highest tempera-
ture, the only minimum is at φ = 0 but, as the temperature decreases, a new
minimum develops spontaneously. If there is more than one minimum, only one
of these is stable. A classical example of an unstable minimum is a steam bubble
in boiling water.
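The critical temperature follows directly from the sign of the φ² coefficient in the finite-temperature potential. A short sketch, again with arbitrary test values of μ and λ:

```python
import math

# In V(phi, T) = -mu^2 phi^2/2 + lam phi^4/4 + lam T^2 phi^2/8 the coefficient
# of phi^2 is c(T) = -mu^2/2 + lam T^2/8; it vanishes at T_c = 2 mu/sqrt(lam),
# Eq. (6.59).  mu and lam are arbitrary test inputs.
mu, lam = 0.8, 0.5

def c(T):
    return -0.5 * mu**2 + 0.125 * lam * T**2

T_c = 2 * mu / math.sqrt(lam)
print(c(T_c))                            # ~0: the coefficient changes sign here
print(c(0.9 * T_c) < 0 < c(1.1 * T_c))   # broken below T_c, symmetric above
```

Below T_c the coefficient is negative and φ = 0 is a local maximum (symmetry broken); above T_c it is positive and φ = 0 becomes the true minimum.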



Early History. Let us now construct a possible scenario for the early history of
the Universe. We start at the same point as we did in Section 5.3 (about 1 μs after
the Big Bang, when the energy of the particles in thermal equilibrium was about
200 MeV), but now we let time run backwards. For the different scales we can refer
to Figure 5.9.

E ≈ 200 MeV. At this temperature, the phase transition between low-energy
hadronic physics and QCD occurs: the individual nucleons start to overlap and 
'melt' into asymptotically free quarks, forming quark matter. In this dense 
medium the separation between quarks in the nucleons decreases, and the inter- 
action between any two quarks in a nucleon is screened by their interaction with 
quarks in neighbouring nucleons. Quark matter may still exist today in the core 
of cold but dense stellar objects such as neutron stars. 

The colour symmetry SU(3) C is valid at this temperature, but it is in no way 
apparent, because the hadrons are colour-neutral singlets. The colour force medi- 
ated by gluons is also not apparent: a vestige of QCD from earlier epochs remains 
in the form of strong interactions between hadrons. It appears that this force is 
mediated by mesons, themselves quark bound states. 

Above 200 MeV, the particles contributing to the effective degrees of freedom 
g*, introduced in Section 5.4, are the photon, three charged leptons, three neutri- 
nos (not counting their three inert right-handed components), the six quarks with 







Figure 6.7 Effective scalar potentials. The different curves correspond to different tem-
peratures. At the highest temperature (a) the only minimum is at φ = 0 but, as the tem-
perature decreases (b), a new minimum develops spontaneously. Finally, in (c), a stable
minimum is reached at φ₀.

three colours each, the gluon of eight colours and two spin states, the scalar Higgs
boson H⁰, and the vector gauge bosons W±, Z⁰. Thus Equation (5.52) is replaced
by

g* = 2 + 3 × 7/2 + 3 × 7/4 + 6 × 3 × 7/2 + 8 × 2 + 1 + 3 × 3 = 106.75. (6.60)



This large value explains the (arbitrarily steep) drop at the QCD-hadron phase 
transition at 200 MeV in Figure 5.6 [5]. 
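The sum in Equation (6.60) can be re-counted in a few lines, with each entry taken from the particle inventory listed in the text and fermions weighted by 7/8:

```python
# Recount Eq. (6.60): effective degrees of freedom g* of the full standard model.
# Entries are (internal states, is_fermion):
species = [
    (2, False),            # photon: 2 polarisations
    (3 * 4, True),         # e, mu, tau: particle + antiparticle x 2 spins
    (3 * 2, True),         # three left-handed neutrinos + antineutrinos
    (6 * 3 * 4, True),     # six quark flavours x 3 colours x 4 states
    (8 * 2, False),        # eight gluons x 2 polarisations
    (1, False),            # scalar Higgs H0
    (3 * 3, False),        # massive W+, W-, Z0: 3 spin states each
]
g_star = sum(n * (7 / 8 if fermion else 1) for n, fermion in species)
print(g_star)   # 106.75
```

The fermionic entries carry the 7/8 statistical weight from Fermi-Dirac statistics, exactly as in the definition of g* in Section 5.4.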

There is no trace of the weak-isospin symmetry SU(2)_w, so the weak and elec-
tromagnetic interactions look quite different. Their strengths are very different,
and the masses of the leptons are very different. Only the electromagnetic gauge
symmetry U(1) is exactly valid, as is testified to by the conservation of electric
charge.



1 GeV < E < 100 GeV. As the temperature increases through this range,
the unity of the weak and electromagnetic forces as the electroweak interaction
becomes progressively clearer. The particles contributing to the effective degrees
of freedom in the thermal soup are quarks, gluons, leptons and photons. All the
fermions are massive.

The electroweak symmetry SU(2)_w ⊗ U(1)_{B−L} is broken, as is testified to by the
very different quark masses and lepton masses. The electroweak force is mediated
by massless photons and virtual W±, Z⁰ vector bosons. The latter do not occur
as free particles, because the energy is still lower than their rest masses near the
upper limit of this range. The SU(3)_c ⊗ U(1) symmetry is of course exact, and the
interactions of quarks are ruled by the colour force.

E ≈ 100 GeV. This is about the rest mass of the W and Z, so they freeze out
of thermal equilibrium. The Higgs boson also freezes out about now, if it has not
already done so at a higher temperature. Our ignorance here is due to the lack of
experimental information about its mass.

There is no difference between weak and electromagnetic interactions: there
are charged-current electroweak interactions mediated by the W±, and neutral-
current interactions mediated by the Z⁰ and γ. However, the electroweak symme-
try is imperfect because of the very different masses.

E ≈ 1 TeV. Up to this energy, our model of the Universe is fairly reli-
able, because this is the limit of present-day laboratory experimentation. Here
we encounter the phase transition between exact and spontaneously broken
SU(2)_w ⊗ U(1)_{B−L} symmetry. The end of electroweak unification is marked by the
massification of the vector boson fields, the scalar Higgs fields and the fermion
fields.

One much discussed extension to the standard model is supersymmetry (SUSY). 
This brings in a large number of new particles, some of which should be seen in 
this temperature range. In this theory there is a conserved multiplicative quantum 
number, R parity, defined by 

R = (−1)^(3B + L + 2s), (6.61)

where B, L and s are baryon number, lepton number and spin, respectively. All
known particles have R = +1, but the theory allows an equal number of super-
symmetric partners, sparticles, having R = −1. Conservation of R ensures that
SUSY sparticles can only be produced pairwise, as sparticle-antisparticle pairs.
The lightest sparticles must therefore be stable, just as the lightest particles are.
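Equation (6.61) is easy to tabulate. A small sketch evaluating R for a few known particles and for two illustrative spartners (the names 'squark' and 'photino' are the conventional SUSY labels, not quantities defined in this text):

```python
# R parity of Eq. (6.61): R = (-1)^(3B + L + 2s).
def R(B, L, s):
    return (-1) ** round(3 * B + L + 2 * s)

print(R(1 / 3, 0, 1 / 2))   # quark:    +1
print(R(0, 1, 1 / 2))       # electron: +1
print(R(0, 0, 1))           # photon:   +1
print(R(1 / 3, 0, 0))       # squark (spin-0 partner of a quark):    -1
print(R(0, 0, 1 / 2))       # photino (spin-1/2 partner of the photon): -1
```

All ordinary particles come out with R = +1 and the spin-shifted partners with R = −1, which is why conservation of R forces sparticles to be produced in pairs.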
One motivation for introducing this intermediate scale is the hierarchy prob-
lem: why is m_P so enormously much larger than m_W? And why is V_Coulomb so
much larger than V_Newton? SUSY has so many free parameters that it can 'natu-
rally' explain these problems.

1 TeV < E < 10¹¹⁻¹² TeV. The 'standard' symmetry group G_s in Equa-
tion (6.41) is an exact symmetry in this range. As we have seen, laboratory physics
has led us to construct the 'standard' theory, which is fairly well understood,
although experimental information above 1 TeV is lacking. The big question is







Figure 6.8 The solid lines show the evolution of the inverse of the coupling constants
for the symmetry groups U(1)_{B−L}, SU(2)_electroweak and SU(3)_colour, respectively. The dotted
lines illustrate a case when the evolution is broken at an intermediate energy E_INT = 10⁹ GeV
so that unification occurs at E_GUT = 10¹⁵ GeV.

what new physics will appear in this enormous energy range. The possibility that 
nothing new appears is called 'the desert'. 

The new physics could be a higher symmetry which would be broken at the
lower end of this energy range. Somewhere there would then be a phase transi-
tion between the exactly symmetric phase and the spontaneously broken phase.
Even in the case of a 'desert', one expects a phase transition to GUT at 10¹⁴ or
10¹⁵ GeV. One effect which finds its explanation in processes at about 10¹⁴ GeV
is the remarkable absence of antimatter in the Universe.

If the GUT symmetry G_GUT breaks down to G_s through intermediate steps, the
phenomenology could be very rich. For instance, there are subconstituent models
building leptons and quarks out of elementary particles of one level deeper ele-
mentarity. These subconstituents would freeze out at some intermediate energy,
condensing into lepton and quark bound states. The forces binding them are in
some models called technicolour forces.

10¹² GeV < E < 10¹⁶ TeV. Let us devote some time to GUTs which might
be exact in this range. The unification of forces is not achieved very satisfactorily
within the G_s symmetry. It still is the direct product of three groups, thus there are
in principle three independent coupling constants g₁, g₂ and g₃ associated with
it. Full unification of the electroweak and colour forces would imply a symmetry
group relating those coupling constants to only one g. The specific values of the
coupling constants are determined (by accident?) at the moment of spontaneous
G_GUT breaking.

Below the energy of electroweak unification we have seen that the electromag- 
netic and weak coupling strengths are quite different. As the energy increases, 
their relative strengths change. Thus the coupling constants are functions of 
energy; one says that they are running. If one extrapolates the running coupling 









Figure 6.9 Proton decay Feynman diagram. 

constants from their known low-energy regime, they almost intersect in one point
(see Figure 6.8). The energy of that point is between 10¹³ and 10¹⁵ GeV, so, if there
is a GUT, that must be its unification scale and the scale of the masses of its vector
bosons and Higgs bosons.
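The near-intersection can be sketched with one-loop running. The slopes b_i and the inverse couplings at M_Z below are standard one-loop standard-model inputs assumed here for illustration, not values taken from this text:

```python
import math

# One-loop running of the three inverse couplings (cf. Figure 6.8):
# 1/alpha_i(E) = 1/alpha_i(M_Z) - b_i/(2 pi) * ln(E/M_Z).
M_Z = 91.0                                       # GeV
inv_alpha = {1: 59.0, 2: 29.6, 3: 8.5}           # at M_Z (GUT-normalised U(1))
b = {1: 41 / 10, 2: -19 / 6, 3: -7.0}            # one-loop SM coefficients

def crossing(i, j):
    """Energy (GeV) where couplings i and j become equal."""
    lnE = 2 * math.pi * (inv_alpha[i] - inv_alpha[j]) / (b[i] - b[j])
    return M_Z * math.exp(lnE)

for i, j in [(1, 2), (1, 3), (2, 3)]:
    print(f"alpha_{i} meets alpha_{j} at ~1e{math.log10(crossing(i, j)):.0f} GeV")
```

The three pairwise crossings land within a few orders of magnitude of each other around 10¹³-10¹⁷ GeV but not at a single point, which is exactly the observation discussed next in the text.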

The fact that the coupling strengths do not run together to exactly one point is
actually quite an interesting piece of information. It could imply that there exist
intermediate symmetries in the 'desert' between G_GUT and G_s. Their effect on the
lines in Figure 6.8 would be to change their slopes at the intermediate energy. The
present extrapolation to GUT energy would then be wrong, and the lines could
after all meet exactly at one point. In Figure 6.8 we have illustrated this, choosing
E_GUT = 10¹⁵ GeV, and the intermediate energy at E_INT = 10⁹ GeV.

As I have pointed out several times, it is not understood why the leptons and 
the quarks come in three families. The symmetry group requires only one family, 
but if nature provides us with more, why are there precisely three? This is one 
incentive to search for larger unifying symmetries. However, family unification 
theories (FUTs) need not be the same as GUT. 

The standard model leaves a number of other questions open which one would
very much like to have answered within a GUT. There are too many free parameters
in G_s. Why are the electric charges such as they are? How many Higgs scalars are
there? What is the reason for CP violation? The only hint is that CP violation seems
to require (but not explain) at least three families of quarks.

Why is parity maximally violated? Could it be that the left-right asymmetry is
only a low-energy artefact that disappears at higher energies? There are left-right
symmetric models containing G_s, in which the right-handed particles interact with
their own right-handed vector bosons, which are one or two orders of magnitude
heavier than their left-handed partners. Such models then have intermediate uni-
fication scales between the G_s and G_GUT scales.

In a GUT, all the leptons and quarks should be components of the same field. In 
consequence there must exist leptoquark vector bosons, X, which can transform 
a quark into a lepton in the same way as the colour (i)-anticolour (j) gluons can 
transform the colour of quarks. This has important consequences for the stabil- 
ity of matter: the quarks in a proton could decay into leptons, for instance, as 
depicted in Figure 6.9, thereby making protons unstable. 

Experimentally we know that the mean life of protons must exceed the age 
of the Universe by many orders of magnitude. Sensitive underground detectors 
have been waiting for years to see a single proton decay in large volumes of water. 




There are about 10³³ protons in a detector of 2000 t of water, so if it could see
one decay in a year, the proton mean life would be

τ_p > 10³³ yr. (6.62)

This is about the present experimental limit. It sets stringent limits on the possible 
GUT candidates, and it has already excluded a GUT based on the symmetry group 
SU(5), which offered the simplest scheme of quark-lepton cohabitation in the 
same multiplets. 
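The count of 10³³ protons is a two-line estimate (10 protons per H₂O molecule, molar mass 18 g/mol):

```python
# Order-of-magnitude check behind Eq. (6.62): protons in 2000 t of water.
N_A = 6.022e23               # Avogadro's number, per mol
mass_g = 2000 * 1e6          # 2000 tonnes in grams
protons = mass_g / 18 * N_A * 10   # 10 protons per 18 g/mol water molecule
print(f"{protons:.1e}")      # ~6.7e32, i.e. about 10^33
```

One decay per year among ~10³³ protons thus translates directly into the mean-life limit quoted in Equation (6.62).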

Although all GUTs are designed to answer some of the questions and relate 
some of the parameters in the standard model, they still have the drawback of 
introducing large numbers of new particles, vector bosons and Higgs scalars, all 
of which are yet to be discovered. 



E ≈ 10¹⁹ GeV. We have now reached the energy scale where gravitational and
quantum effects are of equal importance, so we can no longer do particle physics
neglecting gravitation. In quantum mechanics it is always possible to associate
the mass of a particle M with a wave having the Compton wavelength

λ = ℏ/Mc. (6.63)

In other words, for a particle of mass M, quantum effects become important at
distances of the order of λ. On the other hand, gravitational effects are important
at distances of the order of the Schwarzschild radius. Equating the two distances,
we find the scale at which quantum effects and gravitational effects are of equal
importance. This defines the Planck mass

M_P = √(ℏc/G) = 1.221 × 10¹⁹ GeV c⁻². (6.64)

From this we can derive the Planck energy M_P c² and the Planck time

t_P = λ_P/c = 5.39 × 10⁻⁴⁴ s. (6.65)

Later on we shall make frequent use of quantities at the Planck scale. The reason 
for associating these scales with Planck's name is that he was the first to notice 
that the combination of fundamental constants 

λ_P = √(ℏG/c³) = 1.62 × 10⁻³⁵ m (6.66)

yielded a natural length scale. 
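All three Planck quantities follow from ℏ, c and G alone; a quick numeric check using the SI values of the fundamental constants:

```python
import math

# Planck mass, length and time from hbar, c and G (SI values).
hbar = 1.0546e-34      # J s
c = 2.9979e8           # m/s
G = 6.674e-11          # m^3 kg^-1 s^-2
GeV = 1.6022e-10       # J per GeV

M_P = math.sqrt(hbar * c / G)        # Planck mass in kg
lam_P = math.sqrt(hbar * G / c**3)   # Planck length in m
t_P = lam_P / c                      # Planck time in s

print(M_P * c**2 / GeV)   # ~1.22e19 GeV
print(lam_P)              # ~1.62e-35 m
print(t_P)                # ~5.4e-44 s
```

The three printed values reproduce the numbers quoted in Equations (6.64)-(6.66).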

Unfortunately, there is as yet no theory including quantum mechanics and grav-
itation. Thus we are forced to stop here at the Planck time, a 'mere' 10⁻⁴³ s after
the Big Bang, because of a lack of theoretical tools. But we shall come back to these
earliest times in connection with models of cosmic inflation in the next chapter
and in the final chapter.




6.7 Baryosynthesis and Antimatter Generation 

In Section 5.6 we noted that the ratio η of the baryon number density N_B to the
photon number density N_γ is very small. We can anticipate a value deduced in
Equation (8.45),

η = (6.1 ± 0.7) × 10⁻¹⁰, (6.67)

from deuterium synthesis, CMB and large-scale structures.



The Ratio of Baryons to Photons. Before the baryons were formed (at about
200 MeV), the conserved baryon number B was carried by quarks. Thus the total
value of B carried by all protons and neutrons today should equal the total value
carried by all quarks. It is very surprising then that η should be so small today,
because when the quarks, leptons and photons were in thermal equilibrium there
should have been equal numbers of quarks and anti-quarks, leptons and anti-
leptons, and the value of B was equal to the total leptonic number L,

N_B = N_B̄ = N_L ≈ N_γ, (6.68)

because of the U(1)_{B−L} symmetry.

When the baryons and anti-baryons became nonrelativistic, the numbers of
baryons and anti-baryons were reduced by annihilation, so N_B decreased rapidly
by the exponential factor in the Maxwell-Boltzmann distribution (5.43). The num-
ber density of photons N_γ is given by Equation (5.5) at all temperatures. Thus the
temperature dependence of η should be

η = N_B/N_γ = (√(2π)/4.808)(m_N c²/kT)^{3/2} e^{−m_N c²/kT}. (6.69)

When the annihilation rate became slower than the expansion rate, the value of
η was frozen, and thus comparable to its value today (6.67). The freeze-out occurs
at about 20 MeV, when η has reached the value

η ≈ 6.8 × 10⁻¹⁹. (6.70)

But this is a factor 9 × 10⁸ too small! Thus something must be seriously wrong
with our initial condition (6.68).
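Evaluating the Maxwell-Boltzmann ratio above at the freeze-out temperature reproduces the number quoted in Equation (6.70). The nucleon mass of 939 MeV is inserted here for the baryon mass:

```python
import math

# Freeze-out value of eta = sqrt(2 pi)/4.808 * x^(3/2) * exp(-x)
# with x = m_N c^2 / kT, evaluated at kT ~ 20 MeV.
m_N = 939.0   # nucleon rest energy in MeV
kT = 20.0     # freeze-out temperature in MeV
x = m_N / kT

eta = math.sqrt(2 * math.pi) / 4.808 * x**1.5 * math.exp(-x)
print(f"{eta:.1e}")                 # ~6.8e-19, cf. Eq. (6.70)
print(f"{6.1e-10 / eta:.0e}")       # ~9e8, the mismatch factor quoted in the text
```

The exponential suppression is so violent at x ≈ 47 that essentially all baryon pairs would have annihilated, which is the puzzle the following paragraphs address.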



Baryon-Anti-baryon Asymmetry. The other surprising thing is that no anti-
baryons seem to have survived. At temperatures above 200 MeV, quarks and anti-
quarks were in thermal equilibrium with the photons because of reactions such
as (5.32)-(5.35), as well as

γ + γ ↔ q + q̄. (6.71)

These reactions conserve baryon number, so every quark produced or annihilated
is accompanied by one anti-quark produced or annihilated. Since all quarks and
anti-quarks did not have time to annihilate one another, it would be reasonable
to expect equal number densities of baryons and anti-baryons today.




But we know that the Earth is only matter and not antimatter. The solar wind 
does not produce annihilations with the Earth or with the other planets, so we 
know that the Solar System is matter. Since no gamma rays are produced in the 
interaction of the solar wind with the local interstellar medium, we know that the 
interstellar medium, and hence our Galaxy, is matter. The main evidence that other 
galaxies are not composed of antimatter comes from cosmic rays. Our Galaxy 
contains free protons (cosmic rays) with a known velocity spectrum very near the 
speed of light. A fraction of these particles have sufficiently high velocities to 
escape from the Galaxy. These protons would annihilate with antimatter cosmic 
rays in the intergalactic medium or in collisions with antimatter galaxies if they 
existed, and would produce characteristic gamma-rays many orders of magnitude 
more frequently than have been seen. The observed ratio is

N_B̄/N_B = 10⁻⁵-10⁻⁴,
depending on the kinetic energy. This small number is fully consistent with all 
the observed anti-protons having been produced by energetic cosmic rays in the 
Earth's atmosphere, and it essentially rules out the possibility that other galaxies 
emit cosmic rays composed of antimatter. There are many other pieces of evidence 
against antimatter, but the above arguments are the strongest. 

We are then faced with two big questions. What caused the large value of η?
And why does the Universe not contain antimatter, anti-quarks and positrons?
The only reasonable conclusion is that N_B and N_B̄ must have started out slightly
different while they were in thermal equilibrium, by the amount

N_B − N_B̄ ≈ ηN_γ. (6.72)

Subsequently most anti-baryons were annihilated, and the small excess ηN_γ of
baryons is what remained. This idea is fine, but the basic problem has not been
removed; we have only pushed it further back to earlier times, to some early
BB̄-asymmetric phase transition.



Primeval Asymmetry Generation. Let us consider theories in which a BB̄-
asymmetry could arise. For this three conditions must be met.

First, the theory must contain reactions violating baryon number conserva-
tion. Grand unified theories are obvious candidates for a reason we have already
met in Section 6.6. We noted there that GUTs are symmetric with respect to lep-
tons and quarks, because they are components of the same field and GUT forces
do not see any difference. Consequently, GUTs contain leptoquarks X, Y which
transform quarks into leptons. Reactions involving X, Y do explicitly violate both
baryon number conservation and lepton number conservation, since the quarks
have B = ⅓, L_i = 0, whereas leptons have B = 0, L_i = 1, where i = e, μ, τ. The
baryon and lepton numbers then change, as for instance in the decay reactions

X → e⁻ + d, ΔB = +⅓, ΔL_e = 1, (6.73)

X → ū + ū, ΔB = −⅔. (6.74)



180 Particles and Symmetries 

Secondly, there must be C and CP violation in the theory, as these operators
change baryons into anti-baryons and leptons into anti-leptons. If the theory were
C and CP symmetric, even the baryon-violating reactions (6.73) and (6.74) would
be matched by equally frequently occurring reactions with opposite ΔB, so no
net BB̄-asymmetry would result. In fact, we want baryon production to be slightly
more frequent than anti-baryon production.

Thirdly, we must require these processes to occur out of thermal equilibrium.
In thermal equilibrium there is no net production of baryon number, because the
reactions (6.73) and (6.74) go as frequently in the opposite direction. Hence the
propitious moment is the phase transition when the X bosons are freezing out
of thermal equilibrium and decay. If we consult the timetable in Section 6.6, this
would happen at about 10¹⁴ GeV: the moment of the phase transition from the
GUT symmetry to its spontaneously broken remainder.

The GUT symmetry offers a good example, which we shall make use of in this
section, but it is by no means obvious that GUT is the symmetry we need and that
the phase transition takes place at the GUT temperature. It is more likely that we
have the breaking of a symmetry at a lower energy, such as supersymmetry.



Leptoquark Thermodynamics. Assuming the GUT symmetry, the scenario is
therefore the following. At some energy E_X = kT_X which is of the order of the rest
masses of the leptoquark bosons X,

E_X ≈ M_X c², (6.75)

all the X, Y vector bosons, the Higgs bosons, the W, B vector bosons of Equa-
tions (6.55) and (6.56), and the gluons are in thermal equilibrium with the leptons
and quarks. The number density of each particle species is about the same as the
photon number density, and the relations (6.68) hold.

When the age of the Universe, measured in Hubble time τ_H, is still short
compared with the mean life τ_X = Γ_X⁻¹ of the X bosons, there are no X decays and
therefore no net baryon production. This is the case as long as

Γ_X < τ_H⁻¹ = H. (6.76)

This is just like the condition (5.63) for the decoupling of neutrinos. The decay
rate Γ_X is proportional to the mass M_X,

Γ_X = αM_X, (6.77)

where α is essentially the coupling strength of the GUT interaction. It depends on
the details of the GUT and the properties of the X boson.

We next take the temperature dependence of the expansion rate H from Equa-
tions (5.49) and (5.51). Replacing the Newtonian constant G by its expression in
terms of the Planck mass M_P, as given in Equation (6.64), we find

H ≈ 1.66 g*^{1/2} (kT)²/(ℏ M_P c²). (6.78)


Substituting this H and the expression (6.77) into the condition (6.76), the Uni-
verse is out of equilibrium when

αM_X ≲ A g*^{1/2} T²/M_P at T = M_X, (6.79)

where all the constants have been lumped into A. Solving for the temperature
squared, we find

T² ≳ αM_X M_P/(A g*^{1/2}). (6.80)

At temperature T_X, the effective degrees of freedom g* are approximately 100.
The condition (6.80) then gives a lower limit to the X boson mass,

M_X ≳ (α/A g*^{1/2}) × 1.2 × 10¹⁹ GeV ≈ A′ × 10¹⁸ GeV, (6.81)

where A′ includes all constants not cited explicitly.

Thus, if the mass M_X is heavier than A′ × 10¹⁸ GeV, the X bosons are stable
at energies above M_X. Let us assume that this is the case. As the energy drops
below M_X, the X and X̄ bosons start to decay, producing the net baryon number
required. The interactions must be such that the decays really take place out of
equilibrium, that is, the temperature of decoupling should be above M_X. Typically,
bosons decouple from annihilation at about M_X/20, so it is not trivial to satisfy
this requirement.
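The order of magnitude of the mass bound can be evaluated directly. The coupling strength α = 1/40 below is an assumed, illustrative GUT value, not one given in the text:

```python
import math

# Rough evaluation of the lower bound M_X > alpha M_P / (1.66 sqrt(g*))
# that follows from requiring Gamma_X < H at T = M_X.
M_P = 1.22e19        # Planck mass in GeV
g_star = 100.0       # effective degrees of freedom at T_X
alpha = 1 / 40       # assumed GUT coupling strength (illustrative)

M_X_min = alpha * M_P / (1.66 * math.sqrt(g_star))
print(f"{M_X_min:.0e}")    # ~2e16 GeV for this choice of alpha
```

Since M_P/(1.66√g*) ≈ 10¹⁸ GeV, the bound is of the form A′ × 10¹⁸ GeV with A′ of order α, consistent with the estimate in the text.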

Let us now see how C and CP violation can be invoked to produce a net BB̄-
asymmetry in X and X̄ decays. We can limit ourselves to the case when the only
decay channels are (6.73) and (6.74), and correspondingly for the X̄ channels. For
these channels we tabulate in Table A.7 the net baryon number change ΔB and
the branching fractions Γ(X → channel i)/Γ(X → all channels) in terms of
two unknown parameters r and r̄.

The baryon number produced in the decay of one pair of X, X̄ vector bosons
weighted by the different branching fractions is then

ΔB = rΔB₁ + (1 − r)ΔB₂ + r̄ΔB₃ + (1 − r̄)ΔB₄ = r − r̄. (6.82)

If C and CP symmetry are violated, r and r̄ are different, and we obtain the
desired result ΔB ≠ 0. Similar arguments can be made for the production of a net
lepton-anti-lepton asymmetry, but nothing is yet known about leptonic CP viola-
tion.
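The weighted sum in Equation (6.82) can be checked with exact rational arithmetic, taking the four ΔB values of the decay channels of X and X̄ (the e⁻d, ūū channels and their conjugates); the sample values of r and r̄ are arbitrary:

```python
from fractions import Fraction as F

# Weighted baryon number of one X, Xbar pair, Eq. (6.82).
dB = [F(1, 3), F(-2, 3), F(-1, 3), F(2, 3)]   # e- d, ubar ubar, e+ dbar, u u

def delta_B(r, rbar):
    weights = [r, 1 - r, rbar, 1 - rbar]      # branching fractions
    return sum(w * b for w, b in zip(weights, dB))

r, rbar = F(1, 5), F(1, 7)
print(delta_B(r, rbar) == r - rbar)   # True: the sum collapses to r - rbar
print(delta_B(r, r))                  # 0: no asymmetry if C and CP were exact
```

Whatever r and r̄ are, the channel-by-channel bookkeeping collapses to ΔB = r − r̄, so a net baryon excess requires r ≠ r̄, i.e. C and CP violation.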

Suppose that the number density of X and X̄ bosons is N_X. We now want to
generate a net baryon number density

N_B = ΔB N_X ≈ ΔB N_γ

by the time the Universe has cooled through the phase transition at T_GUT. After
that the baryon number is absolutely conserved and further decrease in N_B only
follows the expansion. However, the photons are also bosons, so their absolute
number is not conserved and the value of η may be changing somewhat. Thus, if




we want to confront the baryon production ΔB required at T_GUT with the present-
day value of η, a more useful quantity is the baryon number per unit entropy
N_B/S. Recall that the entropy density of photons is

s = 1.80 g*(T) N_γ (6.83)

from Equation (5.66). At temperature T_GUT the effective degrees of freedom were
shown in Equation (6.60) to be 106.75 (in the standard model, not counting lepto-
quark degrees of freedom), so the baryon number per unit entropy is

N_B/S = ΔB/(1.80 g*(T_GUT)) ≈ ΔB/180. (6.84)

Clearly this ratio scales with g*⁻¹(T). Thus, to observe a present-day value of η at
about the value in (6.67), the GUT should be chosen such that it yields

ΔB = η g*(T_GUT)/g*(T₀) = η × (106.75/3.36) ≈ 2 × 10⁻⁸, (6.85)

making use of the g*(T) values in Equations (5.73) and (6.60). This is within the
possibilities of various GUTs.
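The entropy bookkeeping above is a one-line rescaling, using the values quoted in the text:

```python
# Required asymmetry per X, Xbar pair: the conserved ratio N_B/S links the
# asymmetry generated at T_GUT to the present-day eta via the g* values.
eta = 6.1e-10                 # present baryon-to-photon ratio, Eq. (6.67)
g_gut, g_now = 106.75, 3.36   # g* at T_GUT and today

delta_B = eta * g_gut / g_now
print(f"{delta_B:.0e}")       # ~2e-08
```

A per-decay asymmetry of order 10⁻⁸ is small, which is why it is plausible that some GUT could accommodate it.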

One may of course object that this solution of the baryosynthesis problem is
only speculative, since it rests on the assumption that nature exhibits a suitable
symmetry. At the beginning of this section, we warned that the GUT symmetry did
not necessarily offer the best phase-transition mechanism for baryosynthesis. The
three conditions referred to could perhaps be met at some later phase transition.
The reason why the GUT fails is to be found in the scenarios of cosmic inflation
(Chapter 7). The baryon asymmetry produced at T_GUT is subsequently washed out
when the Universe reheats to T_GUT at the end of inflation.

The search for another mechanism has turned to the electroweak phase transi-
tion at about 100 GeV. The 'minimal standard model' of electroweak interactions
cannot generate an asymmetry, but the correct electroweak theory could be
more general. New possibilities arise if all three neutrino species oscillate and
violate CP, or if one turns to the 'minimal supersymmetric standard model'. At
the expanding interface of the broken-symmetry phase, the baryon-anti-baryon
asymmetry could be generated via complex CP-violating reflections, transmissions
and interference phenomena between fermionic excitations. Thus the existence
of baryons is an indication that physics indeed has to go beyond the 'minimal
standard model'.



Problems 

1. Are the raising and lowering operators S₊ and S₋ unitary or Hermitian?

2. Verify the commutation relations (6.19) for the Pauli spin matrices (6.18). 

3. Write down the weak hypercharge operator Y (6.32) in matrix form. 




4. The s quark is assigned the value of strangeness S = −1. The relation (6.25)
for all nonstrange hadrons reads Q = ½B + I₃. Generalize the relation (6.25)
to include strangeness so that it holds true for the K mesons defined in
Equations (6.34). Note that, of all the quark-anti-quark systems possible with
three quarks, only five are listed in Equations (6.34). Write down the quark
content of the remaining systems.

5. Show by referring to the quark structure that the K mesons are not eigen- 
states of the C operator. 

6. All known baryons are qqq systems. Use the u, d, s quarks to compose the
27 ground-state baryons, and derive their charge and strangeness proper-
ties. Plot these states in (I₃, B + S)-space.

7. Is the parity operator P defined in Equation (6.42) Hermitian? 

8. One can define the states K_L = K⁰ − K̄⁰ and K_S = K⁰ + K̄⁰. Prove that these
are eigenstates of the CP operator. These states decay dominantly to 2π
and 3π. Which state decays to which, and why? What does this imply for the
relative lifetimes of the K_S and K_L?

9. Derive a value of weak hypercharge Y = B − L for the X boson from reactions
(6.72) and (6.73). 



Chapter Bibliography 

[1] Chaichian, M. and Nelipa, N. F. 1984 Introduction to gauge field theories. Springer. 

[2] Kolb, E. W. and Turner, M. S. 1990 The early Universe. Addison-Wesley, Reading, MA. 

[3] Collins, P. D. B., Martin, A. D. and Squires, E. J. 1989 Particle physics and cosmology. 
John Wiley & Sons, New York. 

[4] Linde, A. 1990 Particle physics and inflationary cosmology. Harwood Academic Pub- 
lishers, London. 

[5] Coleman, T. S. and Roos, M. 2003 Phys. Rev. D68, 027702. 



7 
Cosmic Inflation 



The standard FLRW Big Bang model describes an adiabatically expanding Universe, 
having a beginning of space and time with nearly infinite temperature and density. 
This model has, as so far presented, been essentially a success story. But the Big 
Bang assumes very problematic initial conditions: for instance, where did the 10⁹⁰
particles which make up the visible Universe come from? We are now going to
correct that optimistic picture and present a remedy: cosmic inflation. 

In Section 7.1 we shall discuss problems caused by the expansion of space-time: 
the horizon problem related to its size at different epochs, the monopole problem 
associated with possible topological defects, and the flatness problem associated 
with its metric. 

In Section 7.2 we shall study a now classical scenario to solve these problems, 
called 'old' inflation. In this scenario the Universe traversed an epoch when a scalar 
field with negative pressure caused a de Sitter-like expansion, and terminated it 
with a huge entropy increase, in violation of the law of entropy conservation (Equa- 
tion (5.13)). Although this scenario was qualitatively possible, it had quantitative 
flaws which were in part alleviated in 'new' inflation. 

In Section 7.3 we discuss the scenario of chaotic inflation, which introduces a 
bubble universe in which we inhabit one bubble, totally unaware of other bubbles. 
The inflationary mechanism is the same in each bubble, but different parameter
values may produce totally different universes. Since our bubble must be just right 
for us to exist in, this model is a version of the Anthropic principle. We close this 
section with a discussion of the predictions of inflation, to be tested in Chapters 8 
and 9. 

In Section 7.4 we reconnect to the quintessence models of Section 4.3, learning 
how the primordial inflaton field could be connected to the dark energy field of 
today. 

In Section 7.5 we turn our attention to a speculative alternative to inflation, a 
cyclic universe containing dark energy as a driving force. 

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 470 84909 6 (cased) ISBN 470 84910 X (pbk) 




7.1 Paradoxes of the Expansion 

Particle Horizons. Recall the definition of the particle horizon, Equation (2.47), 
which in a spatially flat metric is 

    χ_ph ≡ σ_ph = c ∫_{t_min}^{t₀} dt/R(t) = c ∫_{R_min}^{R₀} dR/(R Ṙ).    (7.1)

This was illustrated in Figure 2.1. In expanding Friedmann models, the particle 
horizon is finite. Let us go back to the derivation of the time dependence of the 
scale factor R(t) in Equations (4.39)-(4.41). At very early times, the mass density
term in the Friedmann equation (4.4) dominates over the curvature term (we have
also called it the vacuum-energy term),

    (8πG/3)ρ ≫ kc²/R².    (7.2)

This permits us to drop the curvature term and solve for the Hubble parameter,

    H = Ṙ/R = (8πGρ/3)^{1/2}.    (7.3)

Substituting this relation into Equation (7.1) we obtain

    σ_ph = c ∫_{R_min}^{R₀} dR/(R Ṙ(R)) = (3c²/8πG)^{1/2} ∫_{R_min}^{R₀} dR/(R² √ρ).    (7.4)

In a radiation-dominated Universe, ρ scales like R⁻⁴, so the integral on the right
converges in the lower limit R_min = 0, and the result is that the particle horizon
is finite:

    σ_ph ∝ ∫₀^{R₀} dR = R₀.    (7.5)

Similarly, in a matter-dominated Universe, ρ scales like R⁻³, so the integral also
converges, now yielding √R₀. Note that an observer living at a time t₁ < t₀ would
see a smaller particle horizon, R₁ < R₀, in a radiation-dominated Universe or
√R₁ < √R₀ in a matter-dominated Universe.

Suppose, however, that the curvature term or a cosmological constant dominates
the Friedmann equation at some epoch. Then the conditions (4.35) and (4.36) are
not fulfilled; on the contrary, we have a negative net pressure

    p < −ρc²/3.    (7.6)

Substituting this into the law of energy conservation (4.24) we find

    ρ̇ > −2ρ Ṙ/R.    (7.7)

This can easily be integrated to give the R dependence of ρ,

    ρ ≲ R⁻².    (7.8)




Inserting this dependence into the integral on the right-hand side of Equation (7.4)
we find

    σ_ph ∝ ∫_{R_min}^{R₀} dR/(R² √ρ) ≳ ∫_{R_min}^{R₀} dR/R,    (7.9)

an integral which does not converge at the limit R_min = 0. Thus the particle hori-
zon is not finite in this case. But it is still true that an observer living at a time
t₁ < t₀ would see a particle horizon that is smaller by ln R₀ − ln R₁.
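This convergence behaviour is easy to verify numerically. The sketch below (my own check, not from the book; arbitrary units with R₀ = 1) evaluates the integral ∫ dR/(R²√ρ) for the three equations of state discussed above, integrating in u = ln R so that the lower limit is well resolved:

```python
import math

def horizon_integral(power, r_min, steps=200_000):
    """Midpoint estimate of integral_{r_min}^{1} dR / (R^2 sqrt(rho)) with
    rho = R^(-power). In the variable u = ln R the integrand becomes
    R^(power/2 - 1) du, which resolves the lower limit properly."""
    u_min = math.log(r_min)
    du = -u_min / steps
    total = 0.0
    for i in range(steps):
        u = u_min + (i + 0.5) * du
        total += math.exp((power / 2 - 1) * u) * du
    return total

for r_min in (1e-3, 1e-6, 1e-9):
    print(r_min,
          round(horizon_integral(4, r_min), 4),   # radiation: converges to R_0 = 1
          round(horizon_integral(3, r_min), 4),   # matter: converges to 2*sqrt(R_0) = 2
          round(horizon_integral(2, r_min), 1))   # negative pressure: grows like -ln r_min
```

As the cutoff r_min shrinks, the radiation and matter cases settle to 1 and 2 while the ρ ∝ R⁻² case keeps growing logarithmically, exactly as argued above.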



Horizon Problem. A consequence of the finite age t₀ of the Universe is that the
particle horizon today is finite and larger than at any earlier time t₁. Also, the
spatial width of the past light cone has grown in proportion to the longer time
perspective. Thus the spatial extent of the Universe is larger than what our past
light cone encloses today; with time we will become causally connected with new
regions as they move in across our horizon. This renders the question of the 
full size of the whole Universe meaningless— the only meaningful size being the 
diameter of its horizon at a given time. 

In Chapter 5 we argued that thermal equilibrium could be established through- 
out the Universe during the radiation era because photons could traverse the 
whole Universe and interactions could take place in a time much shorter than 
a Hubble time. However, there is a snag to this argument: the conditions at any 
space-time point can only be influenced by events within its past light cone, and 
the size of the past light cone at the time of last scattering (t_LSS) would appear to
be far too small to allow the currently observable Universe to come into thermal 
equilibrium. 

Since the time of last scattering, the particle horizon has grown with the expan-
sion in proportion to the 2/3-power of time (actually this power law has been valid
since the beginning of matter domination at t_eq, but t_LSS and t_eq are nearly simul-
taneous). The net effect is that the particle horizon we see today covers regions
which were causally disconnected at earlier times. 

At the time of last scattering, the Universe was about 1065 times smaller than
it is now (z_LSS ≈ 1065), and the time perspective back to the Big Bang was only the
fraction t_LSS/t₀ ≈ 2.3 × 10⁻⁵ of our perspective. If we assume that the Universe
was radiation dominated for all the time prior to t_LSS, then, from Equation (4.40),
R(t) ∝ √t. The particle horizon at the LSS, σ_ph, is obtained by substituting R(t) ∝
(t/t_LSS)^{1/2} into Equation (2.37) and integrating from zero time to t_LSS:

    σ_ph ∝ ∫₀^{t_LSS} dt (t_LSS/t)^{1/2} = 2 t_LSS.    (7.10)

It is not very critical what we call 'zero' time: the lower limit of the integrand has
essentially no effect even if it is chosen as late as 10⁻⁴ t_LSS.
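The insensitivity to the lower limit can be checked directly (my own sketch, in units where c = 1 and times are measured in t_LSS): the cutoff integral has the closed form 2t_LSS(1 − √(ε/t_LSS)), so a 'zero' as late as 10⁻⁴ t_LSS shifts the result by only 1%:

```python
import math

t_lss = 1.0   # measure times in units of t_LSS

def sigma_ph(eps):
    """Closed form of the integral in Equation (7.10) with lower cutoff eps:
    integral from eps to t_lss of (t_lss/t)^(1/2) dt = 2 t_lss (1 - sqrt(eps/t_lss))."""
    return 2 * t_lss * (1 - math.sqrt(eps / t_lss))

print(sigma_ph(0.0))            # 2.0: the value 2 t_LSS quoted in (7.10)
print(sigma_ph(1e-4 * t_lss))   # 1.98: only a 1% change
```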

The event horizon at the time of last scattering, σ_eh, represents the extent of
the Universe we can observe today as light from the LSS (cf. Figures 2.1 and 7.1),
since we can observe no light from before the LSS. On the other hand, the particle
horizon σ_ph represents the extent of the LSS that could have come into causal







Figure 7.1 A co-moving space/conformal time diagram of the Big Bang. The observer
(here and now) is at the centre. The Big Bang singularity has receded to the outermost
dashed circle, and the horizon scale (L = 3ct_d ≈ 200h⁻¹ Mpc) is schematically indicated
at last scattering. It corresponds to an arc of angle ≈ 1° today. Reproduced from
reference [1] by permission of J. Silk and Macmillan Magazines Ltd.



contact from t = 0 to t_LSS. If the event horizon is larger than the particle horizon,
then all the Universe we now see (in particular the relic CMB) could not have been
in causal contact by t_LSS.

The event horizon σ_eh is obtained by substituting R(t) ∝ (t/t_LSS)^{2/3} from
Equation (4.39) into Equation (2.49) and integrating over the full epoch of matter
domination from t_LSS to t_max = t₀. Assuming flat space, k = 0, we have



    σ_eh ∝ ∫_{t_LSS}^{t₀} dt (t_LSS/t)^{2/3} = 3 t_LSS [(t₀/t_LSS)^{1/3} − 1].    (7.11)

Let us take t_LSS = 0.35 Myr and t₀ = 15 Gyr. Then the LSS particle horizon σ_ph
is seen today as an arc on the periphery of our particle horizon, subtending an
angle

    θ = (180°/π) [σ_ph/σ_eh]_LSS ≈ 1.12°.    (7.12)
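The 1.12° figure follows directly from Equations (7.10)-(7.12); a minimal numerical check (using the t_LSS and t₀ values quoted in the text):

```python
import math

t_lss = 0.35e6    # yr, time of last scattering (value used in the text)
t_0   = 15e9      # yr, present age of the Universe (value used in the text)

sigma_ph = 2 * t_lss                                   # Eq. (7.10)
sigma_eh = 3 * t_lss * ((t_0 / t_lss)**(1/3) - 1)      # Eq. (7.11)
theta_deg = (180 / math.pi) * sigma_ph / sigma_eh      # Eq. (7.12)
print(round(theta_deg, 2))                             # 1.12
```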




This is illustrated in Figure 7.1, which, needless to say, is not drawn to scale. It 
follows that the temperature of the CMB radiation coming from any 1° arc could 
not have been causally connected to the temperature on a neighbouring arc, so 
there is no reason why they should be equal. Yet the Universe is homogeneous 
and isotropic over the full 360°. 

This problem can be avoided, as one sees from Equations (7.6)-(7.9), when the
net pressure is negative, for example, when a cosmological constant dominates. In
such a case, R(t) ∝ e^{const·t} (the case w = −1 in Equation (4.38)). If a cosmological
constant dominates for a finite period, say between t₁ and t₂ < t_LSS, then a term
e^{const·(t₂−t₁)} enters into (7.10). This term can be large, allowing a reordering of
horizons to give σ_ph > σ_eh.

The age of the Universe at temperature 20 MeV was t = 2 ms and the distance
scale 2ct. The amount of matter inside that horizon was only about 10⁻⁵ M_⊙,
which is very far from what we see today: matter is separated into galaxies of
mass 10¹² M_⊙. The size of present superclusters is so large that their mass must
have been assembled from vast regions of the Universe which were outside the 
particle horizon at t = 2 ms. But then they must have been formed quite recently, 
in contradiction to the age of the quasars and galaxies they contain. This paradox 
is the horizon problem. 

The lesson of Equations (7.4)-(7.9) is that we can get rid of the horizon prob- 
lem by choosing physical conditions where the net pressure is negative, either by 
having a large curvature term or a dominating cosmological term or some large 
scalar field which acts as an effective cosmological term. We turn to the latter case 
in Section 7.2. 



GUT Phase Transition. Even more serious problems emerge as we approach very
early times. At GUT time, the temperature of the cosmic background radiation was
T_GUT ≈ 1.2 × 10²⁸ K, or a factor

    T_GUT/T₀ ≈ 4.4 × 10²⁷    (7.13)

greater than today. This is the factor by which the linear scale R(t) has increased
since the time t_GUT. If we take the present Universe to be of size 2000h⁻¹ Mpc ≈
6 × 10²⁵ m, its linear size was only 2 cm at GUT time.
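These two numbers follow from the temperatures and the present size quoted above; a quick check (my own sketch, not from the book):

```python
T_gut = 1.2e28        # K, temperature at GUT time (from the text)
T_0   = 2.73          # K, CMB temperature today
growth = T_gut / T_0  # factor by which R(t) has grown since t_GUT, Eq. (7.13)

size_now = 6e25       # m, ~2000 h^-1 Mpc (from the text)
size_gut = size_now / growth
print(f"growth {growth:.1e}, size at GUT time {size_gut * 100:.1f} cm")
```

This reproduces the factor ≈ 4.4 × 10²⁷ and a centimetre-scale size (about 1.4 cm, which the text rounds to 2 cm).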

Note, however, that linear size and horizon are two different things. The horizon
size depends on the time perspective back to some earlier time. Thus the particle
horizon today has increased since t_GUT by almost the square of the linear scale
factor. At GUT time the particle horizon was only 2 × 10⁻²⁹ m. It follows that to
arrive at the present homogeneous Universe, the homogeneity at GUT time must
have extended out to a distance 5 × 10²⁶ times greater than the distance of causal




contact! Why did the GUT phase transition happen simultaneously in a vast num- 
ber of causally disconnected regions? Concerning even earlier times, one may ask 
the same question about the Big Bang. Obviously, this paradox poses a serious 
problem to the standard Big Bang model. 

In all regions where the GUT phase transition was completed, several important 
parameters— such as the coupling constants, the charge of the electron, and the 
masses of the vector bosons and Higgs bosons— obtained values which would 
characterize the present Universe. Recall that the coupling constants are functions 
of energy, as illustrated in Figure 6.8, and the same is true for particle masses. 
One may wonder why they obtained the same value in all causally disconnected 
regions. 

The Higgs field had to take the same value everywhere, because this is uniquely
dictated by its ground state. But one might expect that there would be
domains where the phase transition was not completed, so that certain remnant 
symmetries froze in. The Higgs field could then settle to different values, caus- 
ing some parameter values to be different. The physics in these domains would 
then be different, and so the domains would have to be separated by domain 
walls, which are topological defects of space-time. Such domain walls would con- 
tain enormous amounts of energy and, in isolation, they would be indestructible. 
Intersecting domain walls would produce other types of topological defects such 
as loops or cosmic strings wiggling their way through the Universe. No evidence 
for topological defects has been found, perhaps fortunately for us, but they may 
still lurk outside our horizon. 



Magnetic Monopoles. A particular kind of topological defect is a magnetic 
monopole. Ordinarily we do not expect to be able to separate the north and south 
poles of a bar magnet into two independent particles. As is well known, cutting 
a bar magnet into two produces two dipole bar magnets. Maxwell's equations 
account for this by treating electricity and magnetism differently: there is an elec- 
tric source term containing the charge e, but there is no magnetic source term. 
Thus free electric charges exist, but free magnetic charges do not. Stellar bodies 
may have large magnetic fields, but no electric fields. 

Paul A. M. Dirac (1902-1984) suggested in 1931 that the quantization of the
electron charge might be the consequence of the existence of at least one free
magnetic monopole with magnetic charge

    g_M = (ħc/2e) n ≈ 68.5 e n,    (7.14)

where e is the charge of the electron and n is an unspecified integer. This would 
then modify Maxwell's equations, rendering them symmetric with respect to elec- 
tric and magnetic source terms. Free magnetic monopoles would have drastic 
consequences, for instance destroying stellar magnetic fields. 
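The numerical factor in Equation (7.14) is just the inverse of twice the fine-structure constant, since ħc/(2e) = e/(2α); a one-line check:

```python
alpha = 1 / 137.036          # fine-structure constant, e^2/(hbar c) in Gaussian units
g_over_e = 1 / (2 * alpha)   # g_M/e for n = 1 in Eq. (7.14)
print(round(g_over_e, 1))    # 68.5
```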

Without going into detail about how frequently monopoles might arise during
the GUT phase transition, we assume that there could arise one monopole per
10 horizon volumes,

    N_M(t_GUT) = 0.1 × (2 × 10⁻²⁹ m)⁻³,

and the linear scale has grown by a factor 4.4 × 10²⁷. Nothing could have destroyed
them except monopole-anti-monopole annihilation, so the monopole density
today should be

    N_M(t₀) ≈ 0.1 × (4.4 × 10²⁷ × 2 × 10⁻²⁹ m)⁻³ ≈ 150 m⁻³.    (7.15)

This is quite a substantial number compared with the proton density, which is at
most 0.17 m⁻³. Monopoles circulating in the Galaxy would take their energy from
the galactic magnetic field. Since the field survives, this sets a very low limit to
the monopole flux called the Parker bound. Experimental searches for monopoles
have not yet become sensitive enough to test the Parker bound, but they are cer-
tainly in gross conflict with the above value of N_M: the present experimental upper
limit is 25 orders of magnitude smaller than the value in Equation (7.15).
Monopoles are expected to be superheavy,

    m_M ≳ m_X/α_GUT ≈ 10¹⁶ GeV ≈ 2 × 10⁻¹¹ kg.    (7.16)

Combining this mass with the number densities (5.82) and (7.15), the density
parameter of monopoles becomes

    Ω_M = N_M m_M/ρ_c ≈ 8 × 10¹⁷.    (7.17)

This is in flagrant conflict with the value of the density parameter in Equa-
tion (4.79): yet another paradox. Such a universe would be closed and its maximal
lifetime would be only a fraction of the age of the present Universe, of the order
of

    t_max ≈ (π/H₀) Ω_M^{−1/2} ≈ 40 yr.    (7.18)
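The chain of estimates in Equations (7.15)-(7.17) can be retraced numerically (my own sketch; the Hubble parameter h = 0.5 used for ρ_c is an assumption, since the book's exact choice is not shown here):

```python
r_hor = 2e-29                  # m, particle horizon at GUT time (from the text)
growth = 4.4e27                # linear growth factor since t_GUT, Eq. (7.13)

n_gut = 0.1 / r_hor**3         # one monopole per 10 horizon volumes
n_now = n_gut / growth**3      # diluted by the volume expansion, Eq. (7.15)

m_mono = 1e16 * 1.783e-27      # kg, 10^16 GeV/c^2 from Eq. (7.16)
h = 0.5                        # assumed Hubble parameter (illustrative)
rho_c = 1.878e-26 * h**2       # kg m^-3, critical density
omega_m = n_now * m_mono / rho_c    # Eq. (7.17)
print(f"N_M(t0) ~ {n_now:.0f} m^-3, Omega_M ~ {omega_m:.1e}")
```

This gives N_M(t₀) ≈ 150 m⁻³ and Ω_M within an order of magnitude of the 8 × 10¹⁷ quoted in Equation (7.17), depending on the value assumed for h.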

Monopoles have other curious properties as well. Unlike the leptons and quarks,
which appear to be pointlike down to the smallest distances measured (10⁻¹⁹ m),
the monopoles have an internal structure. All their mass is concentrated within a
core of about 10⁻³⁰ m, with the consequence that the temperature in the core is
of GUT scale or more. Outside that core there is a layer populated by the X lepto-
quark vector bosons, and outside that at about 10⁻¹⁷ m there is a shell of W and
Z bosons.

The monopoles are so heavy that they should accumulate in the centre of stars,
where they may collide with protons. Some protons may then occasionally pene-
trate into the GUT shell and collide with a virtual leptoquark, which transforms
a d quark into a lepton according to the reaction

    d + X_virtual → e⁺.    (7.19)

This corresponds to the lower vertex in the drawing of the proton decay dia-
gram (Figure 6.9). Thus monopoles would destroy hadronic matter at a rate much
higher than their natural decay rate. This would catalyse a faster disappearance
of baryonic matter and yield a different timescale for the Universe.




Flatness Problem. Recall that in a spatially flat Einstein-de Sitter universe the
curvature parameter k vanishes and the density parameter is Ω = 1. This is obvi-
ous from Equation (4.11), where k and Ω are related by

    kc² = (Ω − 1) R²H².

The current value of the total density parameter Ω₀ is of order unity. This does
not seem remarkable until one considers the extraordinary fine-tuning required: a
value of Ω₀ close to, but not exactly, unity today implies that Ω(t) at earlier times
must have been close to unity with incredible precision. During the radiation era
the energy density ε_r is proportional to R⁻⁴. It then follows from Equation (4.4)
that

    Ṙ² ∝ R⁻².    (7.20)

At GUT time, the linear scale was some 10²⁷ times smaller than today, and since
most of this change occurred during the radiation era,

    Ω − 1 ∝ R² ≈ 10⁻⁵⁴.    (7.21)

Thus the Universe at that time must have been flat to within 54 decimal places,
a totally incredible situation. If this were not so, the Universe would either have
reached its maximum size within one Planck time (10⁻⁴³ s), and thereafter col-
lapsed into a singularity, or it would have dispersed into a vanishingly small
energy density. The only natural values for Ω are therefore 0, 1 or infinity, whereas
to generate a universe surviving for several Gyr without an Ω value of exactly unity
requires incredible fine-tuning. It is the task of the next sections to try to
explain this.
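The scaling in Equation (7.21) can be illustrated with two lines of arithmetic (my own sketch; the present deviation |Ω₀ − 1| ~ 0.1 is an illustrative assumption, not a value from the book):

```python
scale_ratio = 1e-27        # R_GUT / R_0, from Eq. (7.13)
omega_dev_today = 0.1      # illustrative |Omega_0 - 1| of order 0.1 today (assumed)
omega_gut = omega_dev_today * scale_ratio**2   # Omega - 1 scales as R^2, Eqs. (7.20)-(7.21)
print(f"{omega_gut:.0e}")  # ~1e-55: flat to some 54-55 decimal places at GUT time
```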



7.2 'Old' and 'New' Inflation 

The earliest time after the Big Bang we can meaningfully consider is Planck time
t_P (Equation (6.65)), because earlier than that the theory of gravitation must be
married to quantum field theory, a task which has not yet been mastered. Let us
assume that the t_P-sized universe then was pervaded by a homogeneous scalar
classical field φ, the inflaton field, and that all points in this universe were causally
connected. The idea of inflation is to provide a mechanism which blows up the
Universe so rapidly, and to such an enormous scale, that the causal connection
between its different parts is lost, yet they are similar due to their common ori-
gin. This should solve the horizon problem and dilute the monopole density to
acceptable values, as well as flatten the local fluctuations to near homogeneity.

Guth's Scenario. Let us try to make this idea more quantitative. Suppose that
the mass m_φ of the inflaton carrying the field φ was much lighter than the Planck
mass m_P,

    0 ≤ m_φ ≪ m_P,    (7.22)




so that the inflaton can be considered to be massless. In fact, the particle sym-
metry at Planck time is characterized by all fields except the inflaton field being
exactly massless. Only when this symmetry is spontaneously broken in the transi-
tion to a lower temperature phase do some particles become massive. We met this
situation in Sections 6.5 and 6.6, where scalar Higgs fields played an important
role.

Let us introduce the potential V(φ, T) of the scalar field at temperature T. Its
φ dependence is arbitrary, but we could take it to be a power function of φ just
as we chose to do in Equations (4.73), (6.49) and (6.58). Suppose that the potential
at time t_P has a minimum at a particular value φ_P. The Universe would then settle
in that minimum, given enough time, and the value φ_P would gradually pervade
all of space-time. It would be difficult to observe such a constant field because it
would have the same value to all observers, regardless of their frame of motion.
Thus the value of the potential V(φ_P, T_P) may be considered as a property of the
vacuum.

Suppose that the minimum of the potential is at φ_P = 0 in some region of
space-time, and that it is nonvanishing,

    |V(0, T_P)| > 0.    (7.23)

An observer moving along a trajectory in space-time would notice that the field
fluctuates around its vacuum expectation value

    ⟨φ_P⟩ = 0,

and the potential energy consequently fluctuates around the mean vacuum-energy
value

    ⟨V(0, T_P)⟩ > 0.

This vacuum energy contributes a repulsive energy density to the total energy
density in Friedmann's equation (4.17), acting just as dark energy or as a cosmo-
logical constant if we make the identification

    (8πG/3)⟨V₀⟩ = λ/3,    (7.24)

where V₀ = V(0, 0) is a temperature-independent constant.

Inflation occurs when the Universe is dominated by the inflaton field φ and
obeys the slow-roll conditions (4.74). We shall restrict our considerations to theo-
ries with a single inflaton field. Inflationary models assume that there is a moment
when this domination starts and subsequently drives the Universe into a de Sitter-
like exponential expansion in which T ≈ 0. Alan Guth in 1981 [2] named this an
inflationary universe.

The timescale for inflation is

    H⁻¹ = (3c²/8πG⟨V₀⟩)^{1/2} ≈ 10⁻³⁴ s.    (7.25)
Clearly the cosmic inflation cannot go on forever if we want to arrive at our 
present slowly expanding Friedmann-Lemaitre universe. Thus there must be a 




mechanism to halt the exponential expansion, a graceful exit. The freedom we 
have to arrange this is in the choice of the potential function V(qp, T) at different 
temperatures T. 

GUT Potentials. Suppose that there is a symmetry-breaking phase transition
from a hot G_GUT-symmetric phase dominated by the scalar field φ to a cooler
G_s-symmetric phase. As the Universe cools through the critical temperature T_GUT,
bubbles of the cool phase start to appear and begin to grow. If the rate of bubble
nucleation is initially small, the Universe supercools in the hot phase, very much
like a supercooled liquid which has a state of lowest potential energy as a solid.

We assumed above that the potential V(φ, T_hot) in the hot G_GUT-symmetric
phase was symmetric around the point φ = 0, as in the top curve of Figure 6.7.
Suppose now that V(φ, T) develops a new asymmetric minimum as the temper-
ature decreases. At T_GUT, this second minimum may be located at φ_GUT, and the
potential may have become equally deep in both minima, as in the middle curve
in Figure 6.7. Now the Universe could in principle tunnel out to the second mini-
mum, but the potential barrier between the minima makes this process slow. Tun-
nelling through potential barriers is classically forbidden, but possible in quan-
tum physics because quantum laws are probabilistic. If the potential barrier is
high or wide, tunnelling is less probable. This is the reason why the initial bubble
nucleation can be considered to be slow.

The lowest curve in Figure 6.7 illustrates the final situation when the true min-
imum has stabilized at φ₀ and the potential energy of this true vacuum is lower
than in the original false vacuum:

    V(φ₀, T_cool) < V(0, T_hot).

When the phase transition from the supercooled hot phase to the cool phase
finally occurs at T_cool, the latent heat stored as vacuum energy is liberated in the
form of radiation and kinetic energy of ultrarelativistic massive scalar particles
with positive pressure. At the same time other GUT fields present massify in the
process of spontaneous symmetry breaking, suddenly filling the Universe with
particles of temperature T_R. The liberated energy is of the order of

    ⟨V₀⟩ ≈ (kT_R)⁴.    (7.26)

This heats the Universe enormously, from an ambient temperature T_cool ≪ T_GUT
to T_R, which is at the T_GUT scale. The remaining energy in the φ field is dumped into
the entropy, which is proportional to T³. Thus the entropy per particle is suddenly
increased by the very large factor

    (T_R/T_cool)³,    (7.27)

where the ratio T_R/T_cool is of the order of magnitude of 10²⁹. This is a nonadiabatic
process, grossly violating the condition (5.13).




At the end of inflation the Universe is a hot bubble of particles and radiation 
in thermal equilibrium. The energy density term in Friedmann's equations has 
become dominant, and the Universe henceforth follows a Friedmann-Lemaitre 
type evolution. 

The flatness problem is now solved if the part of the Universe which became
our Universe was originally homogeneous and has expanded by the de Sitter scale
factor (4.60)

    a = e^{Hτ} ≈ 10²⁹,    (7.28)

or Hτ ≈ 65. Superimposed on the homogeneity of the pre-inflationary universe
there were small perturbations in the field φ or in the vacuum energy. At the end
of inflation these give rise to density perturbations which are the seeds of later
mass structures and which can easily explain 10⁹⁰ particles in the Universe.

It follows from Equations (7.25) and (7.28) that the duration of the inflation was

    τ ≈ 65 × 10⁻³⁴ s.    (7.29)

Then also the horizon problem is solved, since the initial particle horizon has
been blown up by a factor of 10²⁹ to a size vastly larger than our present Uni-
verse. (Note that the realistic particle horizon is not infinite as one would obtain
from Equation (7.9), because the lower limit of the integral is small but nonzero.)
Consequently, all the large-scale structures seen today have their common origin
in a microscopic part of the Universe long before the last scattering of radiation.
The development of Guth's scenario through the pre-inflationary, inflationary and
post-inflationary eras is similar to Linde's scenario shown in Figure 7.2, except that
the vertical scale here grows 'only' to 10²⁹.
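The numbers in Equations (7.28)-(7.29) are linked by a single logarithm; a quick check (my own sketch):

```python
import math

growth = 1e29                 # de Sitter scale factor during inflation, Eq. (7.28)
n_efolds = math.log(growth)   # = H*tau; ln(1e29) ~ 66.8, which the text rounds to 65
H_inv = 1e-34                 # s, the inflation timescale H^-1
tau = n_efolds * H_inv        # duration of inflation, cf. Eq. (7.29)
print(f"H*tau = {n_efolds:.1f}, tau = {tau:.1e} s")
```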

When our bubble of space-time nucleated, it was separated from the surround-
ing supercooled hot phase by domain walls. When the phase transition finally
occurred, the enormous amount of latent heat was released to these walls. The
mechanism whereby this heat was transferred to particles in the bubbles was the
collision of domain walls and the coalescence of bubbles. In some models
knots or topological defects then remained in the form of monopoles, of the order
of one per bubble. Thus the inflationary model also solves the monopole problem
by blowing up the size of the region required by one monopole. There remains no
inconsistency then with the present observed lack of monopoles.

Although Guth's classical model of cosmic inflation may seem to solve all the 
problems of the hot Big Bang scenario in principle, it still fails because of difficul- 
ties with the nucleation rate. If the probability of bubble formation is large, the 
bubbles collide and make the Universe inhomogeneous to a much higher degree 
than observed. If the probability of bubble formation is small, then they never 
collide and there is no reheating in the Universe, so each bubble remains empty. 
Thus there is no graceful exit from the inflationary scenario. 

The first amendments to Guth's model tried to modify the φ⁴-potential of Equa-
tions (6.49) and (6.58) in such a way that the roll from the false minimum to the
true minimum would start very slowly, and that the barrier would be very small 
or absent. This model has been called new inflation by its inventors, A. D. Linde 
[3, 4, 5] and A. Albrecht and P. J. Steinhardt [6]. However, this required unlikely 




fine-tuning, and the amplitude of primordial quantum fluctuations needs to be
smaller than 10⁻⁶ M_P in order not to produce too large late-time density fluctua-
tions.

7.3 Chaotic Inflation 

Initial Conditions. Guth's model made the rather specific assumption that the
Universe started out with the vacuum energy in the false minimum φ = 0 at time
t_P. However, Linde pointed out that this value, as well as any other fixed starting
value, is as improbable as complete homogeneity and isotropy because of the
quantum fluctuations at Planck time (see, for example, references [7, 8]). Instead,
the scalar field may have had some random starting value φ_a, which could be
assumed to be fairly uniform across the horizon of size M_P⁻¹, changing only by
an amount

    Δφ_a ∼ M_P ≪ φ_a.    (7.30)

Regions of higher potential would expand faster, and come to dominate. With
time the value of the field would change slowly until it finally reached φ₀ at the
true minimum V(φ₀) of the potential.

But causally connected spaces are only of size M_P⁻¹, so even the metric of space-
time may be fluctuating from open to closed in adjacent spaces of this size. Thus
the Universe can be thought of as a chaotic foam of causally disconnected bubbles
in which the initial conditions are different, and which would subsequently evolve
into different kinds of universes. Only one bubble would become our Universe,
and we could never get any information about the other ones. Linde called this
essentially anthropic idea chaotic inflation.

According to Heisenberg's uncertainty relation, at a timescale Δt = ħ/M_P c² the
energy is uncertain by an amount

    ΔE ≥ ħ/Δt = M_P c².    (7.31)

Let us for convenience work in units common to particle physics, where ħ = c =
1. Then the energy density is uncertain by the amount

    Δρ ≈ ΔE/(Δt)³ = M_P⁴.    (7.32)

Thus there is no reason to assume that the potential V(φ_a) would be much smaller
than M_P⁴. We may choose a general parametrization for the potential,

    V(φ) = (κ/n) φⁿ M_P^{4−n},    (7.33)

where n > 0 and 0 < κ ≪ 1. This assumption ensures that V(φ_a) does not rise
too steeply with φ. For n = 4 it then follows that

    φ_a ≈ κ^{−1/4} M_P ≫ M_P    (7.34)

when the free parameter κ is chosen to be very small.




A large number of different models of inflation have been studied in the liter- 
ature. Essentially they differ in their choice of potential function. We shall only 
study the simplest example of Equation (7.33) with n = 2. 



Scalar-Field Dynamics. In the simplest field theory coupling a scalar field φ to
gravitation, the total inflaton energy is of the form

    ½φ̇² + ½(∇φ)² + V(φ),    (7.35)

and the dynamics can be described by two equations: Friedmann's equation

    H² + k/a² = (8π/3M_P²)[½φ̇² + ½(∇φ)² + V(φ)],    (7.36)

and the Klein-Gordon equation (4.67) obeyed by scalar fields,

    φ̈ + 3Hφ̇ + V′(φ) = 0.    (7.37)

If the inflaton field starts out as φ_a, being large and sufficiently homogeneous as
we assumed in Equation (7.30), we have

    (∇φ_a)² ≪ V(φ_a).    (7.38)

The speed of the expansion, H = a/ a, is then dominated by the potential V(qp a ) 
in Equation (7.36) and therefore large. (Note that we have also neglected all other 
types of energy, like p m and p r .) 

Since the potential (7.33) has a minimum at qp = 0, one may expect that qp 
should oscillate near this minimum. However, in a rapidly expanding universe, 
the inflaton field approaches the minimum very slowly, like a ball in a viscous 
medium, the viscosity V'(qp) being proportional to the speed of expansion. In 
this situation we have 

qj a « 3Hcp a , (p 2 a « V(apa), — « H 2 . (7.39) 

a 1 

The first inequality states that φ̇ changes so slowly that its acceleration can be
neglected. The second inequality sets the condition for expansion: the kinetic
energy is much less than the potential energy, so the pressure p_φ of the scalar
field is negative and the expansion accelerates. In the expansion the scale factor
a grows so large that the third inequality follows. The equations (7.36) and (7.37)
then simplify to

    H² = (8π/(3M_P²)) V(φ)    (7.40)

and

    3Hφ̇ = −V′(φ).    (7.41)
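The slow-roll system (7.40)-(7.41) is simple enough to integrate numerically. Below is a minimal sketch (my illustration, not part of the text) for the quadratic potential of Equation (7.43), in units where M_P = 1 and m_φ = 1, using the convention H² = (8π/3)V/M_P² adopted above; the initial value φ_a = 15 is an arbitrary illustrative assumption. The accumulated e-foldings can be checked against the standard slow-roll estimate N = 2π(φ_a² − φ²)/M_P².

```python
import math

# Minimal slow-roll integration for V(phi) = m^2 phi^2 / 2, in units
# where M_P = 1 and m_phi = 1 (time measured in 1/m_phi).
# Convention assumed here: H^2 = (8*pi/3) * V / M_P^2, as in Eq. (7.40).

def V(phi):                 # quadratic chaotic potential, Eq. (7.43)
    return 0.5 * phi**2

def dV(phi):                # V'(phi)
    return phi

phi = 15.0                  # illustrative phi_a >> M_P
ln_a = 0.0                  # ln of the scale factor, i.e. e-foldings
dt = 1e-3

while phi > 1.0:            # integrate until the field nears the minimum
    H = math.sqrt(8 * math.pi / 3 * V(phi))   # Eq. (7.40)
    phidot = -dV(phi) / (3 * H)               # slow roll, Eq. (7.41)
    phi += phidot * dt
    ln_a += H * dt                            # d(ln a)/dt = H

N_analytic = 2 * math.pi * (15.0**2 - phi**2)  # standard slow-roll estimate
print(f"e-foldings: integrated {ln_a:.1f}, analytic {N_analytic:.1f}")
```

Note that the slow-roll speed of the field is constant here, reproducing the linear decrease of φ(t) in Equation (7.44) below.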

Equation (7.40) then describes an exponentially expanding de Sitter universe.
Initially all space-time regions of size H⁻¹ = M_P⁻¹ would contain inhomogeneities
inside their respective event horizons. At every instant during the inflationary




de Sitter stage an observer would see himself surrounded by a black hole with
event horizon H⁻¹ (but remember that 'black hole' really refers to a static metric).
There is an analogy between the Hawking radiation of black holes and the
temperature in an expanding de Sitter space. Black holes radiate at the Hawking
temperature T_H (Equation (3.31)), while an observer in de Sitter space will feel as
if he is in a thermal bath of temperature T_dS = H/2π.

Within a time of the order of H~ l all inhomogeneities would have traversed 
the Hubble radius. Thus they would not affect the physics inside the de Sitter 
universe which would be getting increasingly homogeneous and flat. On the other 
hand, the Hubble radius is also receding exponentially, so if we want to achieve 
homogeneity it must not run away faster than the inhomogeneities. 

Combining Equations (7.33), (7.40), and (7.41), we obtain an equation for the
time dependence of the scalar field,

    φ̇² = (M_P²/(24π)) [V′(φ)]²/V(φ).    (7.42)

Let us study the solution of this equation in the case when the potential is given
by Equation (6.46) and Figure 6.4:

    V(φ) = ½m_φ²φ².    (7.43)

The time dependence of the field is then

    φ(t) = φ_a − (m_φM_P / (2√(3π))) t ≡ φ_a (1 − t/τ),    (7.44)

where τ(φ_a) = 2√(3π)φ_a/(m_φM_P) is the characteristic timescale of the expansion.
At early times, when t ≪ τ, the scalar field remains almost constant, changing only
slowly from φ_a to its ultimate value φ_0. The scale factor then grows
quasi-exponentially as

    R(t) = R(t_a) exp(Ht − ⅙m_φ²t²),    (7.45)

with H given by

    H = 2√(π/3) m_φφ_a/M_P.    (7.46)

As the field approaches φ_0 = 0, the slow roll of the Universe ends in the true
vacuum at V(φ_0), and the inflation ends in a graceful exit.



Size of the Universe. At time τ, the Universe has expanded from a linear size
R(t_a) to

    R(τ) ≃ R(t_a) exp(Hτ) ≈ R(t_a) exp(2πφ_a²/M_P²).    (7.47)

For instance, a universe of linear size equal to the Planck length, R(t_a) = 10⁻³⁵ m,
has grown to

    R(τ) ≃ R(t_a) exp(4πM_P²/m_φ²).    (7.48)




For a numerical estimate we need a value for the mass m_φ of the inflaton. This
is not known, but we can make use of the condition that the chaotic model must
be able to form galaxies of the observed sizes. Then (as we shall see in the next
chapters) the scalar mass must be of the order of magnitude

    m_φ ≈ 10⁻⁶ M_P.    (7.49)

Inserting this estimate into Equation (7.48) we obtain the completely unfathomable
scale

    R(τ) ≈ 10⁻³⁵ exp(4π × 10¹²) m ≈ 10^(5.5×10¹²) m.    (7.50)

It is clear that all the problems of the standard Big Bang model discussed in Section 7.1 then disappear. The homogeneity, flatness and isotropy of the Universe
turn out to be consequences of the inflaton field having been large enough in a
region of size M_P⁻¹ at time t_P. The inflation started in different causally connected
regions of space-time 'simultaneously' to within 10⁻⁴³ s, and it ended at about
10⁻³⁵ s. Our part of that region was extremely small. Since the curvature term in
Friedmann's equations decreased exponentially, the end result is exactly as if k
had been zero to start with. A picture of this scenario is shown in Figure 7.2.
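The arithmetic behind Equations (7.48)-(7.50) is easy to verify; the short check below (not part of the text) simply evaluates the exponent for m_φ = 10⁻⁶ M_P.

```python
import math

# Check of Eqs. (7.48)-(7.50): e-foldings N = 4*pi*(M_P/m_phi)^2 for a
# Planck-size patch, and the resulting size in metres.
m_over_MP = 1e-6                       # Eq. (7.49)
N = 4 * math.pi / m_over_MP**2         # exponent in Eq. (7.48)
log10_R = -35 + N / math.log(10)       # R ~ 1e-35 m * exp(N)
print(f"N = 4*pi*10^12 = {N:.4e}")
print(f"R ~ 10^({log10_R:.2e}) m")     # exponent ~ 5.5e12, as in Eq. (7.50)
```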



Quantum Fluctuations. If inflation was driven by a pure de Sitter expansion, the
enormous scale (7.50) guarantees that it would henceforth be absolutely flat. But
we noted that at Planck time the field φ was indefinite by M_P, at least, so that
there were deviations from a pure de Sitter universe. Even if this Universe was
empty, quantum field theory tells us that empty space is filled with zero-point
quantum fluctuations of all kinds of physical fields, here fluctuations from the
classical de Sitter inflaton field.

The vacuum fluctuation spectrum of the slowly rolling scalar field during the
inflationary expansion turns out to be quite unlike the usual spectrum of thermal
fluctuations. This can be seen if one transforms the de Sitter metric (4.61) into the
metric of a Euclidean 4-sphere (2.27). Bose fields (like the inflaton) obeying a massless Klein-Gordon equation turn out to oscillate harmonically on this sphere with
period 2π/H, which is equivalent to considering quantum statistics at a temperature T_dS = H/2π. However, the temperature in de Sitter space is highly unusual
in that the fluctuations on the 4-sphere are periodic in all four dimensions [7, 8].

The fate of a bubble of space-time clearly depends on the value of φ_a. Only
when it is large enough will inflationary expansion commence. If φ_a is very much
larger than M_P, Equation (7.46) shows that the rate of expansion is faster than the
timescale τ,

    H ≫ τ⁻¹ = m_φM_P/(2√(3π)φ_a).    (7.51)



Although the wavelengths of all quantum fields then grow exponentially, the
change Δφ in the value of the inflaton field itself may be small. In fact, when the
physical wavelengths have reached the size of the Hubble radius H⁻¹, all changes
in φ are impeded by the friction 3Hφ̇ in Equation (7.41), and fluctuations of size







Figure 7.2 Evolution of the scale R of the Universe since Planck time in (A) Friedmann
models and (B) inflationary expansion. During the epoch B₁ the Universe expands exponentially, and during B₂ the inflation ends by reheating the Universe. After the graceful
exit from inflation the Universe is radiation dominated along B₃, just as in A₁, following
a Friedmann expansion. The sections B₄ and A₂ are matter-dominated epochs.



δφ freeze to an average nonvanishing amplitude of

    |δφ(x)| ≈ H/2π.    (7.52)



Consequently, the vacuum no longer appears empty and devoid of properties. 

Fluctuations of a length scale exceeding H⁻¹ are outside the present causal
horizon, so they no longer communicate: crests and troughs in each oscillation
mode remain frozen. But at the end of inflation, the expansion during radiation
and matter domination returns these frozen fluctuations inside the horizon. With
time they become the seeds of the perturbations we should now observe in the CMB
and in the density distribution of matter.

The quantum fluctuations remaining in the inflaton field will cause the energy 
to be dumped into entropy at slightly fluctuating times. Thus, the Universe will 
also contain entropy fluctuations as seeds of later density perturbations. 



Linde's Bubble Universe. Since our part of the pre-inflationary universe was
so small, it may be considered as just one bubble in a foam of bubbles having
different fates. In Linde's chaotic model each original bubble has grown in one
e-folding time τ = H⁻¹ to a size comprising e³ mini-universes, each of diameter
H⁻¹. In half of these mini-universes, on average, the value of φ may be large
enough for inflation to continue, and in the other half it may be too small. In the next
e-folding time the same pattern is repeated. Linde has shown that in those parts




of space-time where φ grows continuously the volume of space grows by a factor

    e^((3 − ln 2)Ht),    (7.53)

whereas in the parts of space-time where φ does not decrease the volume grows
by the factor

    ½e^(3Ht).    (7.54)

Since the Hubble parameter is proportional to φ, most of the physical volume
must come from bubbles in which φ is maximal:

    φ ≈ M_P²/m_φ.    (7.55)

But there must also be an exponential number of bubbles in which φ is smaller.
Those bubbles are the possible progenitors of universes of our kind. In them,
φ finally attains the value corresponding to the true minimum V(φ_0), and a
Friedmann-Lemaitre-type evolution takes over. Elsewhere the inflationary growth
continues forever. Thus we happen to live in a Universe which is a minuscule part
of a steady-state, eternally inflating meta-Universe which has no end, and therefore also no beginning. There is simply no need to turn inflation on in the
first place, and the singularity at time zero has dropped out from the theory.
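The bookkeeping in Equations (7.53)-(7.54) can be made concrete with a toy tally (my illustration, not Linde's calculation): per e-folding, each Hubble volume spawns e³ mini-universes, of which on average one half keep inflating.

```python
import math

# Toy tally of eternally inflating volume: each e-folding multiplies a
# Hubble volume into e^3 mini-universes, half of which keep inflating,
# i.e. a net factor e^3 / 2 = e^(3 - ln 2) per e-folding, cf. Eq. (7.53).
growth_per_efold = math.e**3 / 2
inflating = 1.0
for n in range(1, 6):
    inflating *= growth_per_efold
    print(f"after {n} e-foldings: {inflating:.3e} inflating volumes")

# growth_per_efold ~ 10 > 1, so the inflating volume never stops growing:
# inflation is eternal even though half of each region keeps exiting it.
```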

During inflation, each bubble is generating new space-time to expand into, as 
required by general relativity, rather than expanding into pre-existing space-time. 
In these de Sitter space-times the bubble wall appears to an observer as a sur- 
rounding black hole. Two such expanding bubbles are causally disconnected, so 
they can neither collide nor coalesce. Thus the mechanism of vacuum-energy 
release and transfer of heat to the particles created in the phase transition is 
not by bubble collisions as in the classical model. Instead, the rapid oscillations
of the inflaton field φ decay away by particle production as the Universe settles
in the true minimum. The potential energy then thermalizes and the Universe
reheats to some temperature of the order of T_GUT.

In this reheating, any baryon-anti-baryon asymmetry produced during the GUT
phase transition is washed out; that is why some other phase transition must be
sought to explain the baryon-anti-baryon asymmetry. Thus the existence of
baryons is an indication that particle physics indeed has to go beyond
the 'minimal standard model'.



Predictions. One consequence of the repulsive scalar field is that any two par- 
ticles appear to repel each other. This is the Hubble expansion, which is a con- 
sequence of inflation. In noninflationary theories the Hubble expansion is merely 
taken for granted. 

Inflationary models predict that the density of the Universe should today be
critical,

    Ω₀ = 1.    (7.56)

Consequently, we should not only observe that there is too little luminous mat- 
ter to explain the dynamical behaviour of the Universe, we also have a precise 




theoretical specification for how much matter there should be. This links dark 
matter to inflation. 

We have already noted that the scalar inflaton field produced a spectrum of
frozen density and radiation perturbations beyond the horizon, which moved
into sight when the expansion of the Universe decelerated. In the post-inflationary
epoch when the Friedmann expansion takes over we can distinguish between two
types of perturbations, adiabatic perturbations and isocurvature perturbations. In the first case,
the perturbations in the local number density, δ_m = δρ_m/ρ_m, of each species of
matter (baryons, leptons, neutrinos, dark matter) are the same. In particular, these
perturbations are coupled to those of radiation, δ_r = δρ_r/ρ_r, so that 4δ_m = 3δ_r
(from Equation (5.45)). By the principle of covariance, perturbations in the energy-momentum tensor imply simultaneous perturbations in energy density and pressure, and by the equivalence principle, variations in the energy-momentum tensor
are equivalent to variations in the curvature. Curvature perturbations can have
been produced early as irregularities in the metric, and they can then have been
blown up by inflation far beyond the Hubble radius. Thus adiabatic perturbations
are a natural consequence of cosmic inflation. In contrast, inflation does not predict any isocurvature perturbations.

Let us write the power spectrum of density perturbations in the form

    P(k) ∝ k^(n_s),    (7.57)

where n_s is the scalar spectral index. Inflationary models predict that the primordial fluctuations have an equal amplitude on all scales, an almost scale-invariant
power spectrum as the matter fluctuations cross the Hubble radius, and are Gaussian. This is the Harrison-Zel'dovich spectrum, for which n_s = 1 (n_s = 0 would
correspond to white noise).
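For P(k) ∝ k^(n_s), the fluctuation amplitude at Hubble-radius crossing scales as δ_H ∝ k^((n_s−1)/2), so n_s = 1 indeed gives equal amplitude on all scales. A tiny sketch of this standard relation (the wavenumbers and the tilted index are illustrative values, not from the text):

```python
# Amplitude ratio at horizon crossing between two scales, assuming the
# standard relation delta_H(k) ~ k^((n_s - 1)/2) for P(k) ~ k^(n_s).
def horizon_amplitude_ratio(k1, k2, ns):
    return (k1 / k2) ** ((ns - 1) / 2)

hz = horizon_amplitude_ratio(1e3, 1.0, ns=1.0)       # Harrison-Zel'dovich
tilted = horizon_amplitude_ratio(1e3, 1.0, ns=0.96)  # slightly 'red' tilt
print(hz)      # exactly 1.0: scale invariant
print(tilted)  # < 1: small scales slightly suppressed
```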

A further prediction of inflationary models is that tensor fluctuations in the
space-time metric, satisfying a massless Klein-Gordon equation, have a nearly
scale-invariant spectrum of the form (7.57) with tensor spectral index n_t ≈ 1, just
like the scalar density perturbations, but independently of them.

The above predictions are generic for the majority of inflation models, which differ
in details. Inflation as such cannot be either proved or disproved, but specific
theories can be and have been ruled out. In Chapters 8 and 9 we shall test the
validity of the above predictions.

7.4 The Inflaton as Quintessence 

Now we have met two cases of scalar fields causing expansion: the inflaton field
acting before t_GUT and the quintessence field describing present-day dark energy.
It would seem economical if one and the same scalar field could do both jobs.
Then the inflaton field and quintessence would have to be matched at some time
later than t_GUT. This seems quite feasible since, on the one hand, the initially dominating inflaton potential V(φ) must give way to the background energy density
ρ_r + ρ_m as the Universe cools, and on the other hand, the dark energy density
must have been much smaller than the background energy density until recently.




Recall that quintessence models are constructed to be quite insensitive to the 
initial conditions. 

On the other hand, nothing forces the identification of the inflaton and 
quintessence fields. The inflationary paradigm in no way needs nor predicts 
quintessence. 

In the previously described models of inflation, the inflaton field φ settled to
oscillate around the minimum V(φ = 0) at the end of inflation. Now we want the
inflaton energy density to continue a monotonic roll-down toward zero, turning
ultimately into a minute but nonvanishing quintessence tail. The global minimum
of the potential is only reached in the distant future, V(φ → ∞) → 0. In this process
the inflaton does not decay into a thermal bath of ordinary matter and radiation
because it does not interact with particles at all; it is said to be sterile. A sterile
inflaton field avoids violation of the equivalence principle: otherwise the interaction of the ultralight quintessence field would correspond to a new long-range
force. Entropy in the matter fields comes from gravitational generation at the end
of inflation rather than from decay of the inflaton field.

The task is then to find a potential V(φ) such that it has two phases of accelerated expansion: from t_P to t_end at the end of inflation, and from a time t_F ≫ t_GUT,
when the inflaton field freezes to a constant value, until now, t_0. Moreover, the
inflaton energy density must decrease faster than the background energy density,
equalling it at some time t_* when the field is φ_*, and thereafter remaining subdominant to the energy density of the particles produced at t_end. Finally it must
catch up with a tracking potential at some time during matter domination, t > t_eq.

The mathematical form of candidate potentials is of course very complicated, 
and it would not be very useful to give many examples here. However, it is instruc- 
tive to follow through the physics requirements on op and V(qp) from inflation to 
present. 



Kination. Inflation is caused by an essentially constant potential V(φ) according
to Equation (7.40). The condition V(φ → ∞) → 0 requires an end to inflation at
some finite time t_end, when the field is φ_end and the potential is V_end = V(φ_end).
The change in the potential at t_end from a constant to a decreasing roll then
implies, by Equation (7.41), that φ̇_end ≠ 0, and furthermore, by Equation (7.37),
that also φ̈_end ≠ 0. Then the slow-roll conditions (4.74) for ε and η are also
violated.

During inflation the kinetic energy density of the inflaton is

    ρ_kin = ½φ̇².    (7.58)

Thus when φ̇ starts to grow, so does ρ_kin, and the total energy density of the
Universe becomes dominated by the inflaton kinetic energy density. This epoch
has been called kination or deflation. Equation (4.70) then dictates that the equation of state is

    w = (φ̇² − 2V(φ)) / (φ̇² + 2V(φ)) ≈ 1,    (7.59)



so that the kinetic energy density decreases as

    ρ(a) ∝ a^(−3(1+w)) = a⁻⁶    (7.60)

from Equation (4.29). This is much faster than the a⁻⁴ decrease of the radiation
energy density ρ_r, and the a⁻³ decrease of the initially much smaller matter energy
density ρ_m. Consequently, kination ends at the moment when ρ_r overtakes ρ_kin, at
time t_*. When constructing phenomenological models for this scenario, one constraint is of course that ρ_r(t_end) ≪ V_end, or equivalently, t_end < t_*. This behaviour
is well illustrated in Figure 7.3, taken from the work of Dimopoulos and Valle [9].
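The inevitability of radiation overtaking the kinetic energy density follows directly from these scaling laws: ρ_kin ∝ a⁻⁶ versus ρ_r ∝ a⁻⁴, so radiation wins once a has grown by √(ρ_kin/ρ_r) from its value at t_end. A small sketch (the initial density ratio is an arbitrary illustrative value, not from the text):

```python
import math

# Kination-era scalings: the inflaton kinetic energy redshifts as a^-6
# (Eq. (7.60)) while radiation redshifts as a^-4, so radiation overtakes
# at a_*/a_end = sqrt(rho_kin/rho_r), whatever the (small) initial ratio.
rho_kin_end = 1.0       # kinetic energy density at t_end (arbitrary units)
rho_r_end = 1e-8        # illustrative: rho_r(t_end) << V_end

a = 1.0
while rho_kin_end * a**-6 > rho_r_end * a**-4:
    a *= 1.01           # grow the scale factor by 1% per step

print(f"radiation takes over near a/a_end ~ {a:.0f}")
print(f"analytic crossover: {math.sqrt(rho_kin_end / rho_r_end):.0f}")
```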
Since matter and radiation are gravitationally generated at t_end, the reheating
temperature of radiation is given by

    T_reh = αT_H,    (7.61)

where T_H is the Hawking temperature (Equation (3.31)), and α is some reheating
efficiency factor less than unity. In Figure 7.3 the radiation energy density ρ_γ = ρ_r
starts at T_reh⁴ ≪ V_end, and then catches up with V(φ) at φ_*. Now the Universe
becomes radiation dominated and the hot Big Bang commences. Note that the
term 'hot Big Bang' has a different meaning here: it does not refer to a time zero
with infinite temperature, but to a moment of explosive entropy generation. This
mimics the Big Bang so that all of its associated successful predictions ensue.



Quintessence. The inflaton field plays no role any more during the Big Bang.
The kinetic energy density reduces rapidly to negligible values by its a⁻⁶ dependence and the field ultimately freezes to a nonzero value φ_F. The residual inflaton
potential V(φ) again starts to dominate over the kinetic energy density, however,
staying far below the radiation energy density and, after t_eq, also below the matter
energy density.

As we approach t_0, the task of phenomenology is to devise a quintessence potential having a suitable tracker. The nature of the tracker potential is decided by
the form of the quintessence potential. To arrive at the present-day dark energy
which causes the evolution to accelerate, the field φ must be unfrozen again,
so φ_F should not be very different from φ_0. Many studies have concluded that
only exponential trackers are admissible, and that quintessence potentials can be
constructed from functions which behave as exponentials in φ early on, but which
behave more like inverse power potentials in the quintessential tail. A simple
example of such a potential is the one given in Equation (7.62), where k ≥ 1 is an
integer, λ > 0 is a parameter characterizing the exponential scale, and m < M_P is
a mass scale characteristic of the adopted inverse power-law tail. For more details,
see Dimopoulos and Valle [9].




Figure 7.3 Schematic view of the scalar potential from inflation to quintessence.
The potential V (solid line) features two flat regions, the inflationary plateau and the
quintessential tail. The inflation is terminated at φ_end by a drastic reduction of V, leading to a rapid roll-down of the scalar field from the inflationary plateau towards the
quintessential tail. At the end of inflation the kinetic energy density of the scalar field, ρ_kin
(dash-dotted line), dominates for a brief period the energy density of the Universe. During
this time the radiation energy density ρ_γ (dashed line) reduces less rapidly and catches
up with ρ_kin at time t_*, when the field is φ_*, and the explosive generation of entropy commences. After that the kinetic energy of the scalar field reduces rapidly to zero and the
field freezes asymptotically to a value φ_F, while the overall energy density of the Universe
(dash-dot-dotted line) continues to decrease due to the Hubble expansion. Assuming a
quasi-exponential tail given by Equation (7.62), the potential beyond φ_F is seen departing
logarithmically from a pure exponential case (dotted line). Reprinted from K. Dimopoulos
and J. W. F. Valle, Modeling quintessential inflation [9], copyright 2002, with permission
from Elsevier.

7.5 Cyclic Models 

As we have seen, 'consensus' inflation by a single inflaton field solves the problems
described in Section 7.1. But in the minds of some people it does so at a very high
price. It does not explain the beginning of space and time, it does not predict
the future of the Universe, and it sweeps these fundamental questions under the
carpet of the Anthropic Principle. It invokes several unproven ingredients, such
as a scalar field and a scalar potential, suitably chosen for the field to slow-roll
down the potential while its kinetic energy is negligible, and such that it comes to
a graceful exit where ordinary matter and radiation are created by oscillations in
the potential well, or by entropy generation during a second slow-roll phase of an




equally arbitrary dark energy field. Clearly, any viable alternative to single-field
inflation must also be able to solve the problems in Section 7.1, and it should not
contain more arbitrary elements than does single-field inflation: multiple scalar
fields have more freedom but also more arbitrariness.



The Entropy Problem. In the radiation-dominated Universe, the source of energy 
of photons and other particles is a phase transition or a particle decay or an anni- 
hilation reaction, many of these sources producing monoenergetic particles. Thus 
the energy spectrum produced at the source is very nonuniform and nonrandom, 
containing sharp spectral lines 'ordered' by the type of reaction. Such a spectrum 
corresponds to low entropy. Subsequent scattering collisions will redistribute the 
energy more randomly and ultimately degrade it to high-entropy heat. Thermal 
equilibrium is the state of maximum uniformity and highest entropy. The very 
fact that thermal equilibrium is achieved at some time tells us that the Universe 
must have originated in a state of low entropy. 

In the transition from radiation domination to matter domination no entropy 
is lost. We have seen the crucial effect of photon reheating due to entropy conser- 
vation in the decoupling of the electrons. As the Universe expands and the wave- 
lengths of the CMB photons grow, the available energy is continuously converted 
into lower-grade heat, thus increasing entropy. This thermodynamic argument 
defines a preferred direction of time. 

When the cooling matter starts to cluster and contract under gravity, a new
phase starts. We have seen that the Friedmann-Lemaitre equations dictate instability: the lumpiness of matter increases, with the Universe developing from
an ordered, homogeneous state towards chaos. It may seem that contracting
gas clouds represent higher uniformity than matter clustered into stars and 
galaxies. If the only form of energy were thermal, this would indeed be so. It 
would then be highly improbable to find a gas cloud squeezed into a small 
volume if nothing hinders it from filling a large volume. However, the attrac- 
tive nature of gravity seemingly reverses this logic: the cloud gains enormously 
in entropy by contracting. Thus the preferred direction of time as defined by 
the direction of increasing entropy is unchanged during the matter-dominated 
era. 

The same trend continues in the evolution of stars. Young suns burn their fuel 
through a chain of fusion reactions in which energetic photons are liberated and 
heavier nuclei are produced. As the photons diffuse in the stellar matter, they ulti- 
mately convert their energy into a large number of low-energy photons and heat, 
thereby increasing entropy. Old suns may be extended low-density, high-entropy
red giants, or white dwarfs without enveloping matter, which lose mass by various
processes. In the process of a supernova explosion entropy grows enormously.

Consider the contracting phase of an oscillating universe. After the time t max 
given by Equation (4.52) the expansion turns into contraction, and the density of 
matter grows. If the age of the Universe is short enough that it contains black 
holes which have not evaporated, they will start to coalesce at an increasing rate. 
Thus entropy continues to increase, so that the preferred direction of time is 










Figure 7.4 Schematic view of the potential V(φ) as a function of the field φ for (a) inflationary cosmology and (b) cyclic models. The numbered sequence of stages is described
in the text. From ref. [10], courtesy of P. J. Steinhardt.



unchanged. Shortly before the Big Crunch, when the horizon has shrunk to linear
size ℓ_P, all matter has probably been swallowed by one enormously massive black
hole.



A Cyclic Universe. Early attempts to build models with cyclically reoccurring 
expansion and contraction were plagued by this problem, that the entropy density 
would rise from cycle to cycle. The length of cycles must then increase steadily. 
But, in retrospect, there must then have been a first cycle a finite time ago, thus a 
beginning of time: precisely what the cyclic model was conceived to avoid. 

A cyclic model which solves the entropy problem and which appears as successful as the 'consensus' inflationary model (leaving aside whether this comes
at a higher or lower price) has been proposed by Steinhardt, Turok and collaborators [10]. The model is described qualitatively in Figure 7.4, which depicts a
potential V(φ) as a function of a scalar field φ. Unlike the inflaton field, φ does not
cause an inflationary expansion of space-time.

Each cycle ends and begins with a crunch turning into a bang at the field value
φ = −∞. The bang is a transition, or bounce, from a pre-existing contracting phase
with a decreasing field into an expanding phase with an increasing field. The
contraction occurs in the extra dimension, rather than in our three dimensions.
In the acceleration of the field at turn-around, matter and radiation are created




at large but finite temperature from the kinetic energy of the field (stage 6 in
Figure 7.4).

The bang is then followed by an immediate entry into a period of radiation and 
matter domination where the field is rushing towards positive values (stage 7). 
This stage is quite similar to the corresponding post-inflationary epoch in the con- 
ventional inflationary scenario, and therefore the predictions are the same. But, 
unlike the conventional model, here a subdominant dark energy field is required. 
During radiation and matter domination the scalar dark energy field is effectively 
frozen in place by the Hubble redshift of its kinetic energy. The potential energy 
of this field starts to dominate only when radiation and matter energy densities 
have been sufficiently diluted by the expansion; then a slow cosmic acceleration 
commences (stage 1), and a slow roll down the weakly sloping potential (stage 2). 

At this stage the potential energy of the scalar field dominates also over the 
kinetic energy of the scalar field so that dark energy drives the expansion, much 
like the post-inflationary quintessence in the previous section. Next the field
crosses zero potential (stage 3) and the kinetic energy starts to dominate over
the now negative potential, causing the expansion to stop and turn into contraction with equation of state w ≫ 1.

The potential has a minimum, where it is strongly negative. As the scalar field 
rolls down into this minimum (stage 4), it picks up speed on the way and causes 
a Hubble blueshift of the scalar field kinetic energy. With this speed the field just 
continues beyond the minimum, climbs out of it on the other side and continues 
towards negative infinity in a finite time. 

In the approach towards φ = −∞, the scale factor a(t) goes to zero (stage 5),
but the coupling of the scalar field to radiation and matter conspires in such a
way as to keep the temperature and the energy density finite. In order for space
not to disappear completely, this scenario has to take place in a five-dimensional
space-time, and a(t) has to be interpreted as the effective scale factor on a four-dimensional brane of the full five-dimensional space-time.

The homogeneity and flatness of the Universe and the density perturbations
are established during long periods of ultra-slow accelerated expansion, and the
conditions are set up during the negative time prior to a bang. In contrast, inflationary theories have very little time to set up large-scale conditions, only about
10⁻³¹ s until the inflationary fluctuations have been amplified and frozen in, at a
length scale of 10⁻²⁵ cm. In the cyclic Universe, the fluctuations can be generated
a fraction of a second before the bang, when their length scale is thousands of
kilometres.

Although the acceleration due to dark energy is very slow, causing the Universe
to double in size every 15 billion years or so, compared with the enormous expansion in Equation (7.50) during 10⁻³⁵ s, this is enough to empty the Universe of its
matter and radiation. The dark energy dilutes the entropy density to negligible
levels at the end of each cycle (at stage 1), preparing the way for a new cycle of
identical duration. Although the total entropy of the Universe and the number of
black holes increase from cycle to cycle, and increase per comoving volume as
well, the physical entropy density and the number of black holes per proper volume are expanded away in each cycle. And since it is the physical entropy density
that determines the expansion rate, the expansion and contraction history is the
same from cycle to cycle.

All of this is very speculative, but so is consensus inflation and quintessence 
dark energy. Fortunately, the different models make different testable predictions, 
notably for gravitational radiation. 



Problems 

1. Derive Equations (7.41) and (7.42).

2. Derive φ(t) for a potential V(φ) = ¼λφ⁴.

3. Suppose that the scalar field averaged over the Hubble radius H⁻¹ fluctuates
by an amount φ. The field gradient in this fluctuation is ∇φ = Hφ and
the gradient energy density is H²φ². What is the energy of this fluctuation
integrated over the Hubble volume? Use the timescale H⁻¹ for the fluctuation
to change across the volume and the uncertainty principle to derive the
minimum value of the energy. This is the amount by which the fluctuation
has stretched in one expansion time [11].

4. Material observed now at redshift z = 1 is at present distance H₀⁻¹. The
recession velocity of an object at coordinate distance x is Ṙx. Show that the
recession velocity at the end of inflation is

    Ṙx ≈ c z_r/√(z_eq),    (7.63)

where z_r is the redshift at the end of inflation. Compute this velocity. The
density contrast has grown by the factor z_r²/z_eq. What value did it have at
the end of inflation, given that it is now δ ≈ 10⁻⁴ at the Hubble radius [11]?

5. Show that in an exponentially expanding universe (q = -1) the Hubble 
sphere is stationary. Show that it constitutes an event horizon in the sense 
that events beyond it will never be observable. Show that in this universe 
there is no particle horizon [12]. 

6. Show that the number of e-foldings of inflation in the V(φ) = ¼λφ⁴ model
is of order

    N ≈ πφ_i²/M_P²

from the time at which the field has the value φ_i to the end of inflation
(φ ≪ φ_i). Hence show that the density perturbations in this model are of
order

    δρ/ρ ≈ √λ N^(3/2).    (7.64)

Deduce that λ < 10⁻¹⁴ is required if the fluctuations are to be compatible
with the CMB. This of course amounts to the fine-tuning that inflation is
supposed to avoid [12].




Chapter Bibliography 

[1] Kaiser, N. and Silk, J. 1986 Nature 324, 529. 

[2] Guth, A. H. 1981 Phys. Rev. D 23, 347. 

[3] Linde, A. D. 1982 Phys. Lett. B 108, 389. 

[4] Linde, A. D. 1982 Phys. Lett. B 114, 431. 

[5] Linde, A. D. 1982 Phys. Lett. B 116, 335, 340. 

[6] Albrecht, A. and Steinhardt, P. J. 1982 Phys. Rev. Lett. 48, 1220. 

[7] Linde, A. D. 1990 Particle physics and inflationary cosmology. Harwood Academic 
Publishers, London. 

[8] Linde, A. D. 2002 Inflationary cosmology and creation of matter in the Universe. 
In Modern cosmology (ed. S. Bonometto, V. Gorini and U. Moschella). Institute of 
Physics Publishing, Bristol. 

[9] Dimopoulos, K. and Valle, J. W. F. 2002 Astroparticle Phys. 18, 287. 
[10] Steinhardt, P. J. and Turok, N. 2002 Science 296, 1496, and further references therein. 
[11] Peebles, P. J. E. 1993 Principles of physical cosmology. Princeton University Press, Princeton, NJ.

[12] Raine, D. J. and Thomas, E. G. 2001 An introduction to the science of cosmology. Institute of Physics Publishing, Bristol.



8

Cosmic Microwave Background



In this chapter we shall meet several important observational discoveries. The 
cosmic microwave background (CMB), which is a consequence of the hot Big Bang 
and the following radiation-dominated epoch, was discovered in 1964. We discuss 
this discovery in Section 8.1. 

The hot Big Bang also predicts that the CMB radiation should have a blackbody 
spectrum. Inflation predicts that the mean temperature of the CMB should exhibit 
minute perturbations across the sky. These predictions were verified by a 1990 
satellite experiment, the Cosmic Background Explorer (COBE). Many experiments 
have since then extended the range of observations with improved precision at dif- 
ferent angular scales, most recently the satellite Wilkinson Microwave Anisotropy 
Probe (WMAP) (see cover picture), whose measurements will be discussed in the 
rest of this chapter. In Section 8.2 we shall discuss the method of analysing the 
temperature perturbations. 

The temperature perturbations are expected to be associated with even smaller 
polarization variations, due to Thomson scattering at the LSS. These were first 
observed by the ground-based Degree Angular Scale Interferometer (DASI) in late 
2002 and by WMAP in early 2003. We discuss this in Section 8.3. 

The CMB contains a wealth of information about the dynamical parameters of 
the Universe and on specific features of the theoretical models: general relativity, 
the standard FLRW cosmology versus other cosmologies, all versions of inflation 
and its alternatives, dark energy, etc. In Section 8.4 we establish the parameters, 

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd. ISBN 0 470 84909 6 (cased); ISBN 0 470 84910 X (pbk)




how they are related to each other, what observational values they have and what 
information they give about possible cosmological models. 

8.1 The CMB Temperature 

Predictions. In 1948, George Gamow (1904-1968), Ralph Alpher and Robert Herman calculated the temperature at that time of the primordial blackbody radiation
which started free streaming at the LSS. They found that the CMB should still exist 
today, but that it would have cooled in the process of expansion to the very low 
temperature of T₀ ≈ 5 K. This corresponds to a photon wavelength of

\lambda = \frac{hc}{kT_0} = 2.9 \times 10^{-3}\ \mathrm{m}. \qquad (8.1)

This is in the microwave range of radio waves (see Table A.3). (The term 
'microwave' is actually a misnomer, since it does not refer to micrometre wave- 
lengths, but rather to centimetres.) 
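The arithmetic behind Equation (8.1) is easy to reproduce. A minimal sketch (constants rounded to four figures; the function name is ours, not the book's):

```python
# Characteristic blackbody photon wavelength, lambda = h c / (k T).
h = 6.626e-34   # Planck constant, J s
c = 2.998e8     # speed of light, m/s
k = 1.381e-23   # Boltzmann constant, J/K

def blackbody_wavelength(T):
    """Return hc/(kT) in metres for a blackbody at temperature T (kelvin)."""
    return h * c / (k * T)

lam = blackbody_wavelength(5.0)        # the predicted T0 of Gamow et al.
print(f"lambda = {lam * 1e3:.2f} mm")  # about 2.9 mm, i.e. microwaves
```

With the measured T₀ = 2.725 K the same formula gives a few millimetres, still firmly in the microwave band.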

We can now redo their calculation, using some hindsight. Let us first recall from
Equations (4.39) and (4.40) that the expansion rate changed at the moment when
radiation and matter contributed equally to the energy density. For our calculation
we need to know this equality time, t_eq, and the temperature T_eq. The radiation
energy density is given by Equation (5.47):

\varepsilon_r = (g_\gamma + 3g_\nu)\,\tfrac{1}{2} a T^4. \qquad (8.2)

The energy density of matter at time t_eq is given by Equation (5.26), except that
the electron (e⁻ and e⁺) energy needs to be averaged over the spectrum (5.42). We
could in principle solve for T_eq by equating the radiation and matter densities,

\varepsilon_r(T_{\rm eq}) = \rho_m(T_{\rm eq}). \qquad (8.3)

We shall defer solving this to Section 8.4. The temperature T_eq corresponds to the
crossing of the two lines in the log-log plot in Figure 5.1. 

The transition epoch happens to be close to the recombination time (z_rec in redshift, see Equation (5.76)) and the LSS (z_LSS in redshift, see Equation (5.77)). With
their values for t_eq, T_eq and t₀ and Equation (4.39), Gamow, Alpher and Herman
obtained a prediction for the present temperature of the CMB:

T_0 = T_{\rm eq}\left(\frac{t_{\rm eq}}{t_0}\right)^{2/3} = 2.45\ \mathrm{K}. \qquad (8.4)

This is very close to the present-day observed value, as we shall see. 



Discovery. Nobody paid much attention to the prediction of Gamow et al.,
because the Big Bang theory was generally considered wildly speculative, and 
detection of the predicted radiation was far beyond the technical capabilities exist- 
ing at that time. In particular, their prediction was not known to Arno Penzias and 




Robert Wilson who, in 1964, were testing a sensitive antenna intended for satellite 
communication. They wanted to calibrate it in an environment free of all radia- 
tion, so they chose a wavelength of λ = 0.0735 m in the relatively quiet window
between the known emission from the Galaxy at longer wavelengths and the emis- 
sion at shorter wavelengths from the Earth's atmosphere. They also directed the 
antenna high above the galactic plane, where scattered radiation from the Galaxy 
would be minimal. 

To their consternation and annoyance they found a constant low level of back- 
ground noise in every direction. This radiation did not seem to originate from 
distant galaxies, because in that case they would have seen an intensity peak in 
the direction of the nearby M31 galaxy in Andromeda. It could also not have orig- 
inated in Earth's atmosphere, because such an effect would have varied with the 
altitude above the horizon as a function of the thickness of the atmosphere. 

Thus Penzias and Wilson suspected technical problems with the antenna (in 
which a couple of pigeons turned out to be roosting) or with the electronics. 
All searches failing, they finally concluded, correctly, that the Universe was uni-
formly filled with an 'excess' radiation corresponding to a blackbody temperature 
of 3.5 K, and that this radiation was isotropic and unpolarized within their mea- 
surement precision. 

At Princeton University, a group of physicists led by Robert Dicke (1916-1997) 
had at that time independently arrived at the conclusion of Gamow and collabo- 
rators, and they were preparing to measure the CMB radiation when they heard of 
the remarkable 3.5 K 'excess' radiation. The results of Penzias and Wilson's mea- 
surements were immediately understood and they were subsequently published 
(in 1965) jointly with an article by Dicke and collaborators which explained the 
cosmological implications. The full story is told by Peebles [1], who was a stu- 
dent of Dicke at that time. Penzias and Wilson (but not Gamow or Dicke) were 
subsequently awarded the Nobel prize in 1978 for this discovery. 

This evidence for the 15-Gyr-old echo of the Big Bang counts as the most impor-
tant discovery in cosmology since Hubble's law. In contrast to all radiation from 
astronomical bodies, which is generally hotter, and which has been emitted much 
later, the CMB has existed since the era of radiation domination. It is hard to 
understand how the CMB could have arisen without the cosmic material having 
once been highly compressed and exceedingly hot. There is no known mechanism 
at any time after decoupling that could have produced a blackbody spectrum in 
the microwave range, because the Universe is transparent to radio waves. 



Spectrum. In principle, one intensity measurement at an arbitrary wavelength of 
the blackbody spectrum (5.3) is sufficient to determine its temperature, T, because 
this is the only free parameter. On the other hand, one needs measurements at 
different wavelengths to establish that the spectrum is indeed blackbody. 

It is easy to see that a spectrum which was blackbody at time t with temperature 
T will still be blackbody at time t' when the temperature has scaled to 

T' = T\,\frac{a(t)}{a(t')}. \qquad (8.5)







Figure 8.1 Spectrum of the CMB from data taken with the FIRAS instrument aboard 
NASA's COBE. Reproduced from reference [3] by permission of the COBE Science Working 
Group. 

This is so because, in the absence of creation or annihilation processes, the number of photons, n_γ a³(t), is conserved. Thus the number density dn_γ(ν) in the
frequency interval (ν, ν + dν) at time t transforms into the number density at
time t′,

\mathrm{d}n'_\gamma(\nu') = \left(\frac{a(t)}{a(t')}\right)^3 \mathrm{d}n_\gamma(\nu). \qquad (8.6)

Making use of Equations (5.3) and (8.5), the distribution at time t′ becomes

\mathrm{d}n'_\gamma(\nu') = \frac{8\pi \nu'^2\, \mathrm{d}\nu'}{c^3\,(\mathrm{e}^{h\nu'/kT'} - 1)}, \qquad (8.7)

which is precisely the blackbody spectrum at temperature T′.
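The argument of Equations (8.6) and (8.7) can be checked numerically: redshift every frequency by a(t)/a(t′), dilute by the volume factor, and the result lies exactly on a Planck curve at the scaled temperature. A sketch with an arbitrary factor-of-2 expansion and an illustrative temperature:

```python
import numpy as np

h = 6.626e-34; c = 2.998e8; k = 1.381e-23

def dn_planck(nu, T):
    """Blackbody photon number density per unit frequency,
    8 pi nu^2 / (c^3 (exp(h nu / kT) - 1)), in m^-3 Hz^-1."""
    return 8 * np.pi * nu**2 / (c**3 * np.expm1(h * nu / (k * T)))

T = 3000.0                      # temperature at time t (illustrative value)
ratio = 0.5                     # a(t)/a(t') for a factor-of-2 expansion
nu = np.logspace(11, 15, 200)   # frequencies at time t, Hz

nu_prime = ratio * nu           # every frequency is redshifted by a(t)/a(t')
# Density per unit nu': dn'/dnu' = ratio^3 (dn/dnu) (dnu/dnu') = ratio^2 dn/dnu
dn_prime = ratio**2 * dn_planck(nu, T)

# The transformed spectrum is a Planck spectrum at T' = ratio * T
assert np.allclose(dn_prime, dn_planck(nu_prime, ratio * T))
```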

Although several accurate experiments since Penzias and Wilson have con- 
firmed the temperature to be near 3 K by measurements at different wavelengths, 
the conclusive demonstration that the spectrum is indeed also blackbody in the 
region masked by radiation from the Earth's atmosphere was first made by a ded- 
icated instrument, the Far Infrared Absolute Spectrophotometer (FIRAS) aboard 
the COBE satellite launched in 1989 [2]. The present temperature, T₀, is taken to
be

T_0 = 2.725 \pm 0.001\ \mathrm{K}. \qquad (8.8)

The spectrum reported by the COBE team in 1993 [3], shown in Figure 8.1, 
matches the predictions of the hot Big Bang theory to an extraordinary degree. 
The measurement errors on each of the 34 wavelength positions are so small that 




they cannot be distinguished from the theoretical blackbody curve. It is worth 
noting that such a pure blackbody spectrum had never been observed in labo- 
ratory experiments. All theories that attempt to explain the origin of large-scale 
structure seen in the Universe today must conform to the constraints imposed by 
these COBE measurements. 

The vertical scale in Figure 8.1 gives the intensity I(1/λ) of the radiation, that
is, the power per unit inverse wavelength interval arriving per unit area at the
observer from one steradian of sky. In SI units this is 10⁻⁹ J m⁻¹ sr⁻¹ s⁻¹. This
quantity is equivalent to the intensity per unit frequency interval, I(ν). One can
transform from dλ to dν by noting that I(ν) dν = I(λ) dλ, from which

I(\lambda) = \frac{\nu^2}{c} I(\nu). \qquad (8.9)

The relation between energy density ε_r and total intensity, integrated over the
spectrum, is

\varepsilon_r = \frac{4\pi}{c} \int I(\nu)\, \mathrm{d}\nu. \qquad (8.10)
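Equation (8.10) can be verified numerically: inserting the Planck intensity and integrating over frequency should return the familiar radiation energy density a_S T⁴. A numpy sketch with rounded constants and the radiation constant computed from its standard expression:

```python
import numpy as np

h = 6.626e-34; c = 2.998e8; k = 1.381e-23
a_rad = 8 * np.pi**5 * k**4 / (15 * h**3 * c**3)   # radiation constant, J m^-3 K^-4

def planck_intensity(nu, T):
    """Planck intensity I(nu) in W m^-2 Hz^-1 sr^-1."""
    return 2 * h * nu**3 / (c**2 * np.expm1(h * nu / (k * T)))

T0 = 2.725
nu = np.linspace(1e9, 5e12, 100_000)   # frequency grid spanning the CMB spectrum, Hz
I_nu = planck_intensity(nu, T0)

# eps_r = (4 pi / c) * integral of I(nu) dnu, by trapezoidal quadrature
eps_r = 4 * np.pi / c * np.sum(0.5 * (I_nu[1:] + I_nu[:-1]) * np.diff(nu))

assert abs(eps_r / (a_rad * T0**4) - 1) < 1e-3     # recovers a_S T0^4
```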



Energy and Entropy Density. Given COBE's precise value of T₀, one can determine several important quantities. From Equation (5.47) one can calculate the
present energy density of radiation

\varepsilon_{r,0} = \tfrac{1}{2} g_\gamma a_{\rm S} T_0^4 = 2.604 \times 10^5\ \mathrm{eV\,m^{-3}}. \qquad (8.11)

The corresponding density parameter then has the value

\Omega_r = \frac{\varepsilon_{r,0}}{\rho_{\rm c}} = 2.471 \times 10^{-5}\, h^{-2}, \qquad (8.12)

using the value of ρ_c from Equation (1.31). Obviously, the radiation energy is very
small today and far from the value Ω₀ = 1 required to close the Universe.

The present value of the entropy density is

s = \frac{4}{3}\, \frac{\tfrac{1}{2} g_{*S}\, a_{\rm S} T_0^3}{k} = 2.890 \times 10^9\ \mathrm{m^{-3}}. \qquad (8.13)

Recall (from the text immediately after Equation (5.73)) that the (T_ν/T) dependence of g_{*S} is a power of three rather than a power of four, so the factor (4/11)^{4/3}
becomes just 4/11, and g_{*S} becomes 3.91.

The present number density of CMB photons is given directly by Equation (5.5):

N_\gamma = 4.11 \times 10^8\ \mathrm{photons\ m^{-3}}. \qquad (8.14)
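The value quoted in Equation (8.14) follows from the standard blackbody photon number density n_γ = (2ζ(3)/π²)(kT/ħc)³, which is what the book's Equation (5.5) evaluates (an assumption on our part, since that equation is not reproduced here):

```python
import numpy as np

hbar = 1.055e-34; c = 2.998e8; k = 1.381e-23
zeta3 = 1.2020569   # Riemann zeta(3)

def photon_number_density(T):
    """Blackbody photon number density (2 zeta(3)/pi^2) (kT / hbar c)^3, in m^-3."""
    return 2 * zeta3 / np.pi**2 * (k * T / (hbar * c)) ** 3

n_gamma = photon_number_density(2.725)
print(f"N_gamma = {n_gamma:.3g} m^-3")   # close to the quoted 4.11e8
```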



Neutrino Number Density. Now that we know T₀ and N_γ we can obtain the
neutrino temperature T_ν = 1.949 K from Equation (5.71) and the neutrino number
density per neutrino species from Equation (5.72),

N_\nu = \frac{3}{11} N_\gamma = 1.12 \times 10^8\ \mathrm{neutrinos\ m^{-3}}. \qquad (8.15)
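The two neutrino numbers follow from T_ν = (4/11)^{1/3} T₀ and N_ν = (3/11) N_γ; a quick check with rounded inputs (the slight offset from the quoted 1.949 K comes from the rounding):

```python
T0 = 2.725
N_gamma = 4.11e8                     # photon number density from Equation (8.14), m^-3

T_nu = (4 / 11) ** (1 / 3) * T0      # neutrino temperature, cf. Equation (5.71)
N_nu = 3 / 11 * N_gamma              # number density per neutrino species, cf. (5.72)

print(f"T_nu = {T_nu:.3f} K, N_nu = {N_nu:.3g} m^-3")
```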




For three species of relic neutrinos with average mass ⟨m_ν⟩, Equation (5.74) can
be used to cast the density parameter in the form

\Omega_\nu = \frac{3\langle m_\nu\rangle}{94\, h^2\ \mathrm{eV}}. \qquad (8.16)
8.2 Temperature Anisotropies

The Dipole Anisotropy. The temperature measurement of Penzias and Wilson's 
antenna was not very precise by today's standards. Their conclusion about the 
isotropy of the CMB was based on an accuracy of only 1.0 K. When the mea- 
surements improved over the years it was found that the CMB exhibited a dipole 
anisotropy. The temperature varies minutely over the sky in such a way that it is 
maximally blueshifted in one direction (call it α) and maximally redshifted in the
opposite direction (α + 180°). In a direction α + θ it is

T(\theta) = T(\alpha)(1 + v \cos\theta), \qquad (8.17)

where v is the amplitude of the dipole anisotropy. Although this shift is small,
only vT(α) ≈ 3.35 mK, it was measured with an accuracy better than 1% by the
Differential Microwave Radiometer (DMR) instrument on board the COBE satellite [4].
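Reading Equation (8.17) with the amplitude v in units of c, the quoted 3.35 mK dipole corresponds to a velocity of roughly 370 km s⁻¹, consistent with the motion relative to the CMB discussed below:

```python
c = 2.998e8      # speed of light, m/s
T0 = 2.725       # mean CMB temperature, K
dT = 3.35e-3     # dipole amplitude v T(alpha) quoted in the text, K

v = dT / T0 * c  # our velocity relative to the CMB frame, from Equation (8.17)
print(f"v = {v / 1e3:.0f} km/s")   # roughly 370 km/s
```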

At the end of Chapter 5 we concluded that the hot Big Bang cosmology predicted 
that the CMB should be essentially isotropic, since it originated in the LSS, which 
has now receded to a redshift of z ≈ 1100 in all directions. Note that the most
distant astronomical objects have redshifts up to z = 7. In time they are actually
much closer to the LSS than to us.

In the standard model the expansion is spherically symmetric, so it is quite 
clear that the dipole anisotropy cannot be of cosmological origin. Rather, it is 
well explained by our motion 'against' the radiation in the direction of maximal 
blueshift with relative velocity v. 

Thus there is a frame in which the CMB is isotropic— not a rest frame, since 
radiation cannot be at rest. This frame is then comoving with the expansion of the 
Universe. We referred to it in Section 2.2, where we noted that, to a fundamental 
observer at rest in the comoving frame, the Universe must appear isotropic if it is 
homogeneous. Although general relativity was constructed to be explicitly frame 
independent, the comoving frame in which the CMB is isotropic is observationally 
convenient. The fundamental observer is at position B in Figure 8.2. 

The interpretation today is that not only does the Earth move around the Sun, 
and the Solar System participates in the rotation of the Galaxy, but also the Galaxy 
moves relative to our Local Galaxy Group, which in turn is falling towards a cen- 
tre behind the Hydra-Centaurus supercluster in the constellation Virgo. From the 
observation that our motion relative to the CMB is about 365 km s _1 , these veloc- 
ity vectors add up to a peculiar motion of the Galaxy of about 550 km s" 1 , and 
a peculiar motion of the Local Group of about 630 km s" 1 [5]. Thus the dipole 
anisotropy seen by the Earth-based observer A in Figure 8.2 tells us that we and 
the Local Group are part of a larger, gravitationally bound system. 







Figure 8.2 The observer A in the solar rest frame sees the CMB to have a dipole
anisotropy (the lengths of the radial lines illustrate the CMB intensity) because he is moving in the direction of the arrow. The fundamental observer at position B has removed the
anisotropy.

Multipole Analysis. Temperature fluctuations around a mean temperature T₀
in a direction α on the sky can be analysed in terms of the autocorrelation function C(θ), which measures the product of temperatures in two directions m, n
separated by an angle θ and averaged over all directions α,

C(\theta) = \left\langle \frac{\delta T(\mathbf{m})}{T}\, \frac{\delta T(\mathbf{n})}{T} \right\rangle_{\mathbf{m}\cdot\mathbf{n} = \cos\theta}. \qquad (8.18)

For small angles θ the temperature autocorrelation function can be expressed as
a sum of Legendre polynomials P_ℓ(cos θ) of order ℓ, the wavenumber, with coefficients or powers a_ℓ²,

C(\theta) = \frac{1}{4\pi} \sum_\ell (2\ell + 1)\, a_\ell^2\, P_\ell(\cos\theta). \qquad (8.19)
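The Legendre sum in Equation (8.19) can be evaluated directly with numpy's Legendre tools; the coefficients a_ℓ² below are toy values chosen for illustration, not fitted powers:

```python
import numpy as np

def autocorrelation(theta, a2):
    """Evaluate C(theta) = (1/4pi) sum_l (2l+1) a_l^2 P_l(cos theta).

    a2[l] holds the power a_l^2; legval sums c_l * P_l(x) over l."""
    coeffs = (2 * np.arange(len(a2)) + 1) * np.asarray(a2) / (4 * np.pi)
    return np.polynomial.legendre.legval(np.cos(theta), coeffs)

a2 = [0.0, 0.0, 1.0e-10, 0.5e-10]         # toy powers for l = 0..3
theta = np.linspace(0.0, np.pi / 6, 100)  # small angles, radians
C = autocorrelation(theta, a2)

# At theta = 0, P_l(1) = 1, so C(0) = sum_l (2l+1) a_l^2 / (4 pi)
assert np.isclose(C[0], sum((2 * l + 1) * a for l, a in enumerate(a2)) / (4 * np.pi))
```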







Figure 8.3 The geometry used in the text for describing the polarization of an incoming
unpolarized plane wave photon, γ′ in the (x′, y′)-plane, which is Thomson scattering
against an electron, and subsequently propagating as a polarized plane wave photon, γ,
in the z-direction.

All analyses start with the quadrupole mode ℓ = 2 because the ℓ = 0 monopole
mode is just the mean temperature over the observed part of the sky, and the ℓ = 1
mode is the dipole anisotropy. Higher multipoles correspond to fluctuations on
angular scales θ ≈ 180°/ℓ.

In the analysis, the powers a_ℓ² are adjusted to give a best fit of C(θ) to the
observed temperature. The resulting distribution of a_ℓ² values versus ℓ is called
the power spectrum of the fluctuations. The higher the angular resolution, the
more terms of high ℓ must be included. Anisotropies on the largest angular scales
corresponding to quadrupoles are manifestations of truly primordial gravitational
structures.

For the analysis of temperature perturbations over large angles, the Legendre 
polynomial expansion (8.19) will not do; one has to use tensor spherical harmon- 
ics. Consider a plane wave of unpolarized photons with electric field vector E 
propagating in the z direction (cf. Figure 8.3). The components of E can be taken 
from Equation (5.7), and the intensity J of the radiation field from Equation (5.8). 
Generalizing from a plane wave to a radiation field E(n) in the direction n, the 
temperature T(n) can be expanded in spherical harmonics

T(\mathbf{n}) = \sum_{\ell=1}^{\infty} \sum_{m=-\ell}^{\ell} a^T_{\ell m}\, Y_{\ell m}(\mathbf{n}), \qquad (8.20)



where a^T_{ℓm} are the powers or temperature multipole components. These can be
determined from the observed temperature T(n) using the orthonormality prop- 




erties of the spherical harmonics,

a^T_{\ell m} = \frac{1}{T_0} \int \mathrm{d}\mathbf{n}\, T(\mathbf{n})\, Y^*_{\ell m}(\mathbf{n}). \qquad (8.21)

Expressing the autocorrelation function C as a power spectrum in terms of the
multipole components, the average of all statistical realizations of the distribution
becomes

\langle a^{T*}_{\ell m}\, a^T_{\ell' m'} \rangle = C_\ell\, \delta_{\ell\ell'}\, \delta_{mm'} = C_\ell. \qquad (8.22)

The last step follows from statistical isotropy, which requires
δ_{ℓℓ′} = 1 and δ_{mm′} = 1.


Sources of Anisotropies. Let us now follow the fate of the scalar density per- 
turbations generated during inflation, which subsequently froze and disappeared 
outside the (relatively slowly expanding) horizon. For wavelengths exceeding the 
horizon, the distinction between curvature (adiabatic) and isocurvature (isother- 
mal) perturbations is important. Curvature perturbations are true energy density 
fluctuations or fluctuations in the local value of the spatial curvature. These can 
be produced, for example, by the quantum fluctuations that are blown up by 
inflation. By the equivalence principle all components of the energy density (mat- 
ter, radiation) are affected. Isocurvature fluctuations, on the other hand, are not 
true fluctuations in the energy density but are characterized by fluctuations in 
the form of the local equation of state, for example, spatial fluctuations in the 
number of some particle species. These can be produced, for example, by cosmic 
strings that perturb the local equation of state. As long as an isocurvature mode 
is super-horizon, physical processes cannot re-distribute the energy density. 

When the Universe arrived at the radiation- and matter-dominated epochs, the
Hubble expansion of the horizon revealed these perturbations. Once inside the
horizon, the crests and troughs can again communicate, setting up a pattern of 
standing acoustic waves in the baryon-photon fluid. The tight coupling between 
radiation and matter density causes the adiabatic perturbations to oscillate in 
phase. After decoupling, the perturbations in the radiation field no longer oscil-
late, and the remaining standing acoustic waves are visible today as perturbations 
to the mean CMB temperature at degree angular scales. 

Curvature and isocurvature fluctuations behave differently when they are super- 
horizon: isocurvature perturbations cannot grow, while curvature perturbations 
can. Once an isocurvature mode passes within the horizon, however, local pres- 
sure can move energy density and can convert an isocurvature fluctuation into a 
true energy-density perturbation. For sub-horizon modes the distinction becomes 
unimportant and the Newtonian analysis applies to both. However, isocurvature 
fluctuations do not lead to the observed acoustic oscillations seen in Figure 8.4 
(they do not peak in the right place), whereas the adiabatic picture is well con- 
firmed. 

At the LSS, crests in the matter density waves imply higher gravitational poten- 
tial. As we learned in Section 2.5, photons 'climbing out' of overdense regions will 











Figure 8.4 The best-fit power spectra of CMB fluctuations as a function of angular scale
(top) and wavenumber (bottom). The upper figure shows the temperature (T) power spectrum, and the lower figure the temperature-polarization (TE) cross-power spectrum. Note
that the latter is not multiplied by the additional factor ℓ. The grey shading represents
the 1σ cosmic variance. For further details, see [6]. Reproduced from reference [6] by
permission of the WMAP Team.



be redshifted by an amount given by Equation (2.79), but this is partially offset 
by the higher radiation temperature in them. This source of anisotropy is called 
the Sachs-Wolfe effect. Inversely, photons emitted from regions of low density 'roll 
down' from the gravitational potential, and are blueshifted. In the long passage to 
us they may traverse further regions of gravitational fluctuations, but then their 
frequency shift upon entering the potential is compensated for by an opposite 




frequency shift when leaving it (unless the Hubble expansion causes the potential 
to change during the traverse). They also suffer a time dilation, so one effectively 
sees them at a different time than unshifted photons. Thus the CMB photons 
preserve a 'memory' of the density fluctuations at emission, manifested today as 
temperature variations at large angular scales. An anisotropy of the CMB of the 
order of δT/T ≈ 10⁻⁵ is, by the Sachs-Wolfe effect, related to a mass perturbation
of the order of δ ≈ 10⁻⁴ when averaged within one Hubble radius.

The gravitational redshift and the time dilation both contribute to δT/T₀ by
amounts which are linearly dependent on the density fluctuations δρ/ρ, so the
net effect is given by

\frac{\delta T}{T} = \frac{1}{3} \left(\frac{l_{\rm dec}}{c\, t_{\rm dec}}\right)^2 \frac{\delta\rho}{\rho}, \qquad (8.23)

where l_dec is the size of the structure at decoupling time t_dec (corresponding to
z_dec in Equation (5.78)). (Note that Equation (8.23) is strictly true only for a critical
universe with zero cosmological constant.)

The space-time today may also be influenced by primordial fluctuations in 
the metric tensor. These would have propagated as gravitational waves, caus- 
ing anisotropies in the microwave background and affecting the large-scale struc- 
tures in the Universe. High-resolution measurements of the large-angle microwave 
anisotropy are expected to be able to resolve the tensor component from the scalar 
component and thereby shed light on our inflationary past. 

Further sources of anisotropies may be due to variations in the values of 
cosmological parameters, such as the cosmological constant, the form of the 
quintessence potential, and local variations in the time of occurrence of the LSS. 



Discovery. For many years microwave experiments tried to detect temperature 
variations on angular scales ranging from a few arc minutes to tens of degrees. 
Ever increasing sensitivities had brought down the limits on δT/T to near 10⁻⁵
without finding any evidence for anisotropy until 1992. At that time, the first 
COBE observations of large-scale CMB anisotropies bore witness of the spatial 
distribution of inhomogeneities in the Universe on comoving scales ranging from 
a few hundred Mpc up to the present horizon size, without the complications of 
cosmologically recent evolution. This is inaccessible to any other astronomical 
observations. 

On board the COBE satellite there were several instruments, of which one, the 
DMR, received at three frequencies and had two antennas with 7° opening angles 
directed 60° apart. This instrument compared the signals from the two antennas, 
and it was sensitive to anisotropies on large angular scales, corresponding to
multipoles ℓ < 30. Later radio telescopes were sensitive to higher multipoles, so
one now has a detailed knowledge of the multipole spectrum up to ℓ = 2800.

The most precise recent results are shown in the upper part of Figure 8.4. At low
ℓ, the temperature power spectrum is smooth, caused by the Sachs-Wolfe effect.
Near ℓ = 200 it rises towards the first and dominant peak of a series of Sakharov
oscillations, also confusingly called the Doppler peak. They are basically caused by




density perturbations which oscillate as acoustic standing waves inside the LSS 
horizon. The exact form of the power spectrum is very dependent on assumptions 
about the matter content of the Universe; thus careful measurement of its shape 
yields precise information about many dynamical parameters. For details on the 
results included in the figure, see reference [6]. 

The definitive DMR results [4] cover four years of measurements of eight com- 
plete mappings of the full sky followed by the above spherical harmonic analysis. 
The CMB anisotropies found correspond to temperature variations of

\delta T = 29 \pm 1\ \mu\mathrm{K}, \quad \text{or} \quad \delta T/T = 1.06 \times 10^{-5}. \qquad (8.24)

Half of the above temperature variations, or δT = 15.3 μK, could be ascribed
to quadrupole anisotropy at the 90° angular scale. Although some quadrupole 
anisotropy is kinetic, related to the dipole anisotropy and the motion of Earth, 
this term could be subtracted. The remainder is then quadrupole anisotropy of 
purely cosmological origin. 
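The fractional anisotropy quoted in Equation (8.24) is simply the measured δT divided by the mean temperature of Equation (8.8):

```python
T0 = 2.725    # K, from Equation (8.8)
dT = 29e-6    # K, the DMR variation of Equation (8.24)

print(f"dT/T = {dT / T0:.2e}")   # 1.06e-05, as quoted
```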

Since the precision of the COBE measurements surpassed all previous experi- 
ments one can well understand that such small temperature variations had not 
been seen before. The importance of this discovery was succinctly emphasized 
by the COBE team who wrote that 'a new branch of astronomy has commenced'. 
The story of the COBE discoveries has been fascinatingly narrated by George
Smoot [7].

8.3 Polarization Anisotropies 

Thomson Scattering. Elastic scattering of photons from free electrons, Equation (5.30), is called Thomson scattering or Compton scattering, the latter being
used for higher frequencies. In Section 5.5 we ignored the fate of the primordial
photons, noting that they were thermalized by this process before decoupling. We 
also noted that unpolarized photons are polarized by the anisotropic Thomson 
scattering process, but as long as the photons continue to meet free electrons their 
polarization is washed out, and no net polarization is produced. At a photon's last 
scattering, however, the induced polarization remains and the subsequently free-streaming photon possesses a quadrupole moment (ℓ = 2).

The perturbations to the baryon density and the radiation temperature in the
tightly coupled baryon-photon fluid are scalar, thus corresponding to a monopole
moment (ℓ = 0). As we saw in the previous section, the radiation field also exhibits
dipole perturbations (ℓ = 1) which are coupled to the baryon bulk velocity, but
there are no vector or tensor perturbations. Tensor perturbations would be due 
to gravitational waves, which have not been observed with present-day detectors. 

The quadrupole moment possessed by free-streaming photons couples more 
strongly to the bulk velocity (the peculiar velocities) of the baryon-photon fluid 
than to the density. Therefore, the photon density fluctuations generate tempera- 
ture fluctuations, while the velocity gradient generates polarization fluctuations. 

Let us now make use of the description of photons in Section 5.1 and the Stokes
parameters (5.8). The parameter I, which describes the intensity of radiation, is,



like V, a physical observable independent of the coordinate system. In contrast,
the parameters Q and U depend on the orientation of the coordinate system. In
the geometry of Figure 8.3, the coordinates x′, y′ define a plane wave of incoming
radiation propagating in the z′ direction (primes are used for unscattered quantities). The incoming photon γ′ then Thomson scatters against an electron and the
outgoing photon γ continues as a plane wave in a new direction, z.

It follows from the definition of the Stokes parameters Q and U that a rotation
of the x′- and y′-axes in the incoming plane by the angle φ transforms them into

Q(\phi) = Q\cos(2\phi) + U\sin(2\phi), \qquad U(\phi) = -Q\sin(2\phi) + U\cos(2\phi). \qquad (8.25)

We left it as an exercise (Chapter 5, Problem 2) to demonstrate that Q² + U² is
invariant under the rotation (8.25). It follows from this invariance that the polarization P is a second-rank tensor of the form

P = \frac{1}{2} \begin{pmatrix} Q & -U \\ -U & -Q \end{pmatrix}. \qquad (8.26)

Thus the polarization is not a vector quantity with a direction, unlike the electric
field vector E.
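The invariance of Q² + U² under the rotation (8.25) is quickly confirmed numerically for arbitrary example values:

```python
import numpy as np

def rotate_stokes(Q, U, phi):
    """Rotate the linear-polarization Stokes parameters by phi, Equation (8.25)."""
    Qr = Q * np.cos(2 * phi) + U * np.sin(2 * phi)
    Ur = -Q * np.sin(2 * phi) + U * np.cos(2 * phi)
    return Qr, Ur

Q, U = 0.3, -0.7                  # arbitrary example values
for phi in np.linspace(0.0, np.pi, 13):
    Qr, Ur = rotate_stokes(Q, U, phi)
    assert np.isclose(Qr**2 + Ur**2, Q**2 + U**2)   # Q^2 + U^2 is invariant
```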

Let us now see how Thomson scattering of the incoming, unpolarized radiation 
generates linear polarization in the (x, y)-plane of the scattered radiation (we fol- 
low closely the pedagogical review of A. Kosowsky [8]). The differential scattering
cross-section, defined as the radiated intensity I divided by the incoming intensity
I′ per unit solid angle Ω and cross-sectional area, is given by

\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = K\, |\hat{\boldsymbol{\epsilon}}' \cdot \hat{\boldsymbol{\epsilon}}|^2. \qquad (8.27)

Here σ_T is the total Thomson cross-section, the vectors ε̂′, ε̂ are the unit vectors in
the incoming and scattered radiation planes, respectively (cf. Figure 8.3), and we
have lumped all the constants into one constant, K. The Stokes parameters of the
outgoing radiation then depend solely on the nonvanishing incoming parameter I′:

I = K I'(1 + \cos^2\theta), \qquad Q = K I' \sin^2\theta, \qquad U = 0, \qquad (8.28)

where θ is the scattering angle. By symmetry, Thomson scattering can generate
no circular polarization, so V = 0 always.
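Equations (8.28) imply that the degree of linear polarization of the scattered light is Π = Q/I = sin²θ/(1 + cos²θ): it vanishes in the forward direction and reaches 100% at a scattering angle of 90°. A two-line check:

```python
import numpy as np

def polarization_degree(theta):
    """Pi = Q/I = sin^2(theta) / (1 + cos^2(theta)) from Equations (8.28)."""
    return np.sin(theta) ** 2 / (1 + np.cos(theta) ** 2)

assert np.isclose(polarization_degree(np.pi / 2), 1.0)   # fully polarized at 90 deg
assert np.isclose(polarization_degree(0.0), 0.0)         # none in the forward direction
```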

The net polarization produced in the direction z from an incoming field of
intensity I′(θ, φ) is determined by integrating Equations (8.28) over all incoming
directions. Note that the coordinates for each incoming direction must be rotated
some angle φ about the z-axis as in Equations (8.25), so that the outgoing Stokes
parameters all refer to a common coordinate system. The result is then [8]

I(z) = \tfrac{1}{2} K \int \mathrm{d}\Omega\, (1 + \cos^2\theta)\, I'(\theta, \phi), \qquad (8.29)

Q(z) - \mathrm{i}U(z) = \tfrac{1}{2} K \int \mathrm{d}\Omega\, \sin^2\theta\, \mathrm{e}^{2\mathrm{i}\phi}\, I'(\theta, \phi). \qquad (8.30)




Expanding the incident radiation intensity in spherical harmonics,

I'(\theta, \phi) = \sum_{\ell m} a_{\ell m}\, Y_{\ell m}(\theta, \phi), \qquad (8.31)

leads to the following expressions for the outgoing Stokes parameters:

I(z) = \tfrac{1}{2} K \left( \tfrac{8}{3}\sqrt{\pi}\, a_{00} + \tfrac{4}{3}\sqrt{\tfrac{\pi}{5}}\, a_{20} \right), \qquad (8.32)

Q(z) - \mathrm{i}U(z) = 2K \sqrt{\tfrac{2\pi}{15}}\, a_{22}. \qquad (8.33)

Thus, if there is a nonzero quadrupole moment a₂₂ in the incoming, unpolarized radiation field, it will generate linear polarization in the scattering plane. To
determine the outgoing polarization in some other scattering direction, n, making
an angle β with z, one expands the incoming field in a coordinate system rotated
through β. This derivation requires too much technical detail to be carried out
here, so we only state the result [8]:

Q(\mathbf{n}) - \mathrm{i}U(\mathbf{n}) = K \sqrt{\tfrac{\pi}{5}}\, a_{20} \sin^2\beta. \qquad (8.34)
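The numerical coefficients multiplying a₀₀ and a₂₀ as reconstructed in Equation (8.32) follow from integrating (1 + cos²θ) against Y₀₀ and Y₂₀ in Equation (8.29). Since the printed equations are partly garbled in this scan, here is a grid-quadrature check of those two integrals, using the explicit m = 0 spherical harmonics:

```python
import numpy as np

theta = np.linspace(0.0, np.pi, 2001)
dphi = 2 * np.pi                 # the m = 0 integrands carry no phi dependence

Y00 = np.full_like(theta, 1 / np.sqrt(4 * np.pi))
Y20 = np.sqrt(5 / (16 * np.pi)) * (3 * np.cos(theta) ** 2 - 1)

def sphere_integral(f):
    """Trapezoidal quadrature of f(theta) sin(theta) dtheta dphi over the sphere."""
    g = f * np.sin(theta) * dphi
    return np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(theta))

I00 = sphere_integral((1 + np.cos(theta) ** 2) * Y00)
I20 = sphere_integral((1 + np.cos(theta) ** 2) * Y20)

assert np.isclose(I00, 8 / 3 * np.sqrt(np.pi), rtol=1e-4)
assert np.isclose(I20, 4 / 3 * np.sqrt(np.pi / 5), rtol=1e-4)
```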



Multipole Analysis. The tensor harmonic expansion (8.20) for the radiation temperature T and the temperature multipole components a^T_{ℓm} in (8.21) can now
be completed with the corresponding expressions for the polarization tensor P.
From the expression (8.26) its components are

P_{ab}(\mathbf{n}) = \frac{1}{2} \begin{pmatrix} Q(\mathbf{n}) & -U(\mathbf{n})\sin\theta \\ -U(\mathbf{n})\sin\theta & -Q(\mathbf{n})\sin^2\theta \end{pmatrix}
= T_0 \sum_{\ell=2}^{\infty} \sum_{m=-\ell}^{\ell} \left[ a^E_{(\ell m)} Y^E_{(\ell m)ab}(\mathbf{n}) + a^B_{(\ell m)} Y^B_{(\ell m)ab}(\mathbf{n}) \right]. \qquad (8.35)
The existence of the two modes (superscripted) E and B is due to the fact that 
the symmetric traceless tensor (8.35) describing linear polarization is specified by 
two independent Stokes parameters, Q and U. This situation bears analogy with 
the electromagnetic vector field, which can be decomposed into the gradient of 
a scalar field (E for electric) and the curl of a vector field (B for magnetic). The 
source of the E-modes is Thomson scattering. The sources of the B-modes are 
gravitational waves entailing tensor perturbations, and E-modes which have been 
deformed by gravitational lensing of large-scale structures in the Universe. 
In analogy with Equation (8.21), the polarization multipole components are 

a^E_{\ell m} = \frac{1}{T_0} \int dn\, P_{ab}(n)\, Y^{E\,ab\,*}_{(\ell m)}(n),   (8.36)

a^B_{\ell m} = \frac{1}{T_0} \int dn\, P_{ab}(n)\, Y^{B\,ab\,*}_{(\ell m)}(n).   (8.37)

The three sets of multipole moments a^T_{\ell m}, a^E_{\ell m} and a^B_{\ell m} fully describe the temperature
and polarization map of the sky; thus, they are physical observables.



Model Testing and Parameter Estimation. II 225 

There are then six power spectra in terms of the multipole components: the temperature
spectrum in Equation (8.22) and five spectra involving linear polarization.
The full set of physical observables is then

C_\ell^T = \langle a_{\ell m}^{T*} a_{\ell m}^T \rangle, \quad C_\ell^E = \langle a_{\ell m}^{E*} a_{\ell m}^E \rangle, \quad C_\ell^B = \langle a_{\ell m}^{B*} a_{\ell m}^B \rangle,
C_\ell^{TE} = \langle a_{\ell m}^{T*} a_{\ell m}^E \rangle, \quad C_\ell^{TB} = \langle a_{\ell m}^{T*} a_{\ell m}^B \rangle, \quad C_\ell^{EB} = \langle a_{\ell m}^{E*} a_{\ell m}^B \rangle.   (8.38)
For further details on polarization, see Kosowsky [8]. 
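The ensemble averages in Equation (8.38) can be made concrete with a toy Monte Carlo (synthetic Gaussian multipoles with invented input spectra, and real-valued a_lm for simplicity; an illustration, not a WMAP analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw correlated Gaussian (a^T_lm, a^E_lm) pairs for many simulated
# skies and estimate the products appearing in Eq. (8.38).
l = 10
CTT, CEE, CTE = 1.0, 0.1, 0.05        # invented input spectra
cov = np.array([[CTT, CTE], [CTE, CEE]])
L = np.linalg.cholesky(cov)

n = 20000 * (2 * l + 1)               # realizations times m-modes
aT, aE = L @ rng.standard_normal((2, n))
aB = 0.1 * rng.standard_normal(n)     # B-mode uncorrelated with T and E

est_TT = np.mean(aT * aT)             # recovers C^T_l
est_TE = np.mean(aT * aE)             # recovers the cross spectrum C^TE_l
est_TB = np.mean(aT * aB)             # consistent with zero
```

Averaged over enough realizations the estimators converge to the input spectra, while the TB product averages to zero because the B-mode was drawn independently.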



Observations. Thus polarization delivers six times more information than temperature
alone. In practice, at most four of the power spectra will become available,
since the intensity of polarization is so much weaker than that of temperature,
and the components C_\ell^{TB} and C_\ell^{EB} are then very difficult to observe.

The first observations of the polarization spectra C_\ell^E and C_\ell^{TE} were made by the
South Pole-based Degree Angular Scale Interferometer DASI [9] and the WMAP 
satellite [6]. In the lower part of Figure 8.4 we plot the temperature-polarization 
(TE) cross-power spectrum from the first year of WMAP observations. 



8.4 Model Testing and Parameter Estimation. II 

The parameters required to model the CMB are some subset of the following: h,
\Omega_m h^2, \Omega_b h^2, \Omega_0, the optical depth \tau to the LSS, the amplitude A of the power spectrum,
the scalar and tensor power indices n_s, n_t in Equation (7.57), the energy
variation of the scalar index dn_s/dk, and the linear-theory amplitude of fluctuations
\sigma_8 within spheres of 8 h^{-1} Mpc at z = 0.

The primordial fluctuations are assumed to be Gaussian random phase, since 
no evidence to the contrary has been found. Note the qualitative feature in Fig- 
ure 8.4 that the TE component is zero at the temperature-power spectrum max- 
ima, as is expected (the polarization is maximal at velocity maxima and minimal 
at temperature maxima, where the velocities are minimal), and that it exhibits a 
significant large-angle anti-correlation dip at \ell \approx 150, a distinctive signature of
super-horizon adiabatic fluctuations. 



Re-ionization. Since polarization originated in the LSS when the horizon was
about 1.12° of our present horizon, the polarization fluctuations should only be
visible in multipoles \ell \gtrsim 180°/1.12° \approx 160.

But WMAP also observes a strong signal on large angular scales, £ < 10. This 
can only be due to re-ionization later than LSS, when the radiation on its way 
to us traversed ionized hydrogen clouds, heated in the process of gravitational 
contraction. The effect of CMB re-ionization is called the Sunyaev-Zel'dovich Effect 




(SZE) (Yakov B. Zel'dovich, 1914-1987). As a consequence of the SZE, the CMB 
spectrum is distorted, shifting towards higher energy. 

From the size of this spectral shift WMAP estimates a value for the optical depth
to the effective re-ionization clouds, \tau, which is essentially independent of
cosmological modelling,

\tau = 0.17 \pm 0.04,   (8.39)

but strongly degenerate with n_s. This corresponds to re-ionization by an early
generation of stars at z_r = 20 \pm 5. It could still be that this picture is simplistic:
the re-ionization may have been a complicated process in a clumpy medium, involving
several steps at different redshifts. We shall continue the discussion of this effect
in Section 9.2. 
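The connection between \tau and the re-ionization redshift can be sketched with a back-of-the-envelope model (our own simplification, not the WMAP likelihood analysis): if all baryons are suddenly and completely re-ionized at z_r in a flat, matter-dominated universe, the optical depth integral \tau = \sigma_T \int n_e c\, dt has a closed form.

```python
import math

# Sudden, complete re-ionization in a flat matter-dominated universe:
#   tau = (2 sigma_T n_e0 c) / (3 H0 sqrt(Omega_m)) [(1+z_r)^(3/2) - 1]
sigma_T = 6.652e-29        # Thomson cross-section [m^2]
m_p = 1.673e-27            # proton mass [kg]
c = 2.998e8                # [m/s]
h = 0.71
H0 = h * 3.241e-18         # Hubble constant [1/s]
Omega_m, Omega_b = 0.27, 0.044
rho_c = 1.878e-26 * h**2   # critical density [kg/m^3]

# electrons per m^3 today if H and He (25% by mass) are fully ionized
n_e0 = (1 - 0.25 / 2) * Omega_b * rho_c / m_p

prefac = 2 * sigma_T * n_e0 * c / (3 * H0 * math.sqrt(Omega_m))
tau = 0.17
z_r = (tau / prefac + 1) ** (2 / 3) - 1
# z_r comes out near 16, within the range z_r = 20 +/- 5 quoted above
```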



Power-Spectrum Parameters. The positions and amplitudes of the peaks and 
troughs in the temperature (T) and the temperature-polarization (TE) cross-power 
spectrum in Figure 8.4 contain a wealth of information on cosmological parame- 
ters. 

The first acoustic T peak determines the scale \ell of the time when matter compressed
for the first time after t_{dec}. The position in \ell-space is related to the parameters
n_s, \Omega_m h^2 and \Omega_b h^2. (Note that the physical matter density \Omega_m includes the
baryon density \Omega_b.) The amplitude of the first peak is positively correlated to
\Omega_m h^2 and the amplitude of the second peak is negatively correlated to \Omega_b h^2 but,
to evaluate \Omega_m and \Omega_b, one needs to know a value for h, which one can take from
Section 1.4. Increasing n_s increases the ratio of the second peak to the first peak.
At fixed n_s, however, this ratio determines \Omega_b/\Omega_m. The amplitudes also put limits
on \Omega_\nu h^2.

In (\Omega_m, \Omega_\Lambda)-space, the CMB data determine \Omega_0 = \Omega_m + \Omega_\Lambda most precisely,
whereas supernova data (discussed in Section 4.4) determine \Omega_\Lambda - \Omega_m most precisely.
Combining both sets of data with data on large-scale structures from
2dFGRS [10] (discussed in Chapter 9), which depend on n_s, \Omega_m h, \Omega_b h and which
put limits on \Omega_\nu h, one breaks the CMB power spectrum parameter degeneracies
and improves the precision. 
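The complementarity can be illustrated by solving the two degeneracy directions as a linear system (central values only, an illustration rather than a real joint fit):

```python
import numpy as np

# The CMB pins down the sum Omega_m + Omega_Lambda, the supernovae the
# difference Omega_Lambda - Omega_m; solving the two linear constraints
# breaks the degeneracy.
A = np.array([[1.0, 1.0],      # Omega_m + Omega_Lambda
              [-1.0, 1.0]])    # -Omega_m + Omega_Lambda
b = np.array([1.02, 0.46])     # CMB sum; SNe difference (0.73 - 0.27)
Omega_m, Omega_L = np.linalg.solve(A, b)
# Omega_m ~ 0.28, Omega_Lambda ~ 0.74
```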

It is too complicated to describe the simultaneous fit to all data and the eval- 
uation of the parameter values here, so we just quote the results published by 
the WMAP Team [6] with some comments. WMAP includes information also from
the CMB detectors Cosmic Background Imager [11] and the Arcminute Cosmology
Bolometer Array Receiver [12], and the matter-power spectrum at z \approx 3 as
measured by the Lyman-\alpha absorption in intergalactic hydrogen clouds (the Lyman-\alpha
forest). Note that all errors quoted refer to single-parameter fits at 68% confidence,
marginalized over all the other parameters. This results in errors that are too small
if one is interested in two- or many-dimensional confidence regions.

WMAP finds that the scalar index n_s is a slowly varying function of the wavenumber
k. This slow variation came as a surprise, because inflationary models
require n_s to be a constant. WMAP chooses to quote two parameters: n_s at




0.05 Mpc^{-1} and its first derivative, which obtain the values

n_s = 0.93 \pm 0.03, \qquad \frac{dn_s}{dk} = -0.031^{+0.016}_{-0.018}.   (8.40)

The derivative improves the fit somewhat, but it is not very significantly nonzero
[6], so inflation is not yet in danger. WMAP's Hubble constant agrees perfectly
with the HST value in Equation (1.20). The combined value of all present
information on the Hubble constant is

h = 0.71^{+0.04}_{-0.03},   (8.41)

and that on the density parameters is 

\Omega_m h^2 = 0.135^{+0.008}_{-0.009}, \quad \Omega_b h^2 = 0.0224 \pm 0.0009, \quad \Omega_\nu h^2 < 0.0076.   (8.42)

It is a remarkable success of the FLRW concordance model that the baryonic den- 
sity at time 380 kyr as evidenced by the CMB is in excellent agreement with the 
BBN evidence (5.105) from about 20 minutes after the Big Bang. As explained in 
Section 5.6, the BBN value depends only on the expansion rate and the nuclear 
reaction cross-sections, and not at all on the details of the FLRW model. 

Density Parameters. From the above parameter values one derives 

\Omega_0 = 1.02 \pm 0.02, \quad \Omega_m = 0.27 \pm 0.04, \quad \Omega_b = 0.044 \pm 0.004, \quad \Omega_\nu < 0.015.   (8.43)

Thus the geometry of the Universe is consistent with being spatially flat and we
can henceforth set \Omega_0 = 1.
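The conversion from the fitted combinations (8.41)-(8.42) to Equation (8.43) is just a division by h²; a quick check:

```python
# Dividing the fitted physical densities of Eq. (8.42) by h^2, with
# h = 0.71 from Eq. (8.41), reproduces the values of Eq. (8.43).
h = 0.71
Omega_m = 0.135 / h**2         # ~0.27
Omega_b = 0.0224 / h**2        # ~0.044
Omega_nu_max = 0.0076 / h**2   # ~0.015
```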

Combining the limit for \Omega_\nu h^2 with the expression (8.16), the average mass of
the three \nu mass eigenstates is

\langle m_\nu \rangle < 0.23\ \mathrm{eV}\ \text{at 95% CL},   (8.44)

where 'CL' denotes the confidence level. 
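The step to Equation (8.44) can be sketched as follows, assuming that the book's expression (8.16) is the standard relation \Omega_\nu h^2 = \sum m_\nu / 93.1 eV (the precise constant there may differ slightly):

```python
# Upper limit on the average neutrino mass from Omega_nu h^2 < 0.0076,
# assuming Sum m_nu = 93.1 eV * Omega_nu h^2, shared equally by the
# three mass eigenstates.
m_avg = 93.1 * 0.0076 / 3     # ~0.24 eV, matching Eq. (8.44) to rounding
```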

Inserting the values of \Omega_b and h in Equation (5.102) we obtain the ratio of
baryons to photons

\eta = (6.1 \pm 0.7) \times 10^{-10}.   (8.45)

Inserting the value of h in Equation (8.12), we obtain the present value of the
radiation density parameter,

\Omega_r = 4.902 \times 10^{-5}.   (8.46)
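A quick check of the last two numbers, using the commonly quoted constants \eta = 2.74 \times 10^{-8}\, \Omega_b h^2 and \Omega_\gamma h^2 = 2.471 \times 10^{-5} (for T_0 = 2.725 K) as stand-ins for Equations (5.102) and (8.12):

```python
# Baryon-to-photon ratio and photon density parameter from the fits.
h, Obh2 = 0.71, 0.0224
eta = 2.74e-8 * Obh2          # Eq. (8.45): ~6.1e-10
Omega_r = 2.471e-5 / h**2     # Eq. (8.46): ~4.90e-5
```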



Dark Energy. The equation of state of dark energy, w_\varphi, introduces a new degeneracy
with \Omega_m and h which cannot be resolved by CMB data alone. Using the full
set of data, one can state a limit to w_\varphi (at 95% CL). The properties of dark energy
are then

\Omega_\Lambda = 0.73 \pm 0.04, \quad w_\varphi < -0.78\ \text{at 95% CL}.   (8.47)




This value of \Omega_\Lambda agrees perfectly with the evidence for acceleration from supernovae
in Equation (4.79). This is another remarkable success of the FLRW concordance
model, since the supernova observations do not depend on Einstein's
equations at all.

Similar limits on w_\varphi have also been obtained in independent observations.
R. Jimenez et al. [13] combined the absolute ages of Galactic stars with the position
of the first peak in the WMAP angular power spectrum, finding w_\varphi < -0.8
at 95% confidence. The High-z Supernova Search Team [14] have discovered and
observed eight supernovae in the redshift interval z = 0.3-1.2, which gave them,
assuming flatness, w_\varphi < -0.73 at 95% confidence.



Timescales. From this value for \Omega_r one can determine the time of equality of
radiation and matter density, t_{eq}. From Equations (4.13) and (4.29), the equation
determining the evolution of the scale is

H(a)^2 = H_0^2[(1 - \Omega_0)a^{-2} + \Omega(a)] = H_0^2[(1 - \Omega_0)a^{-2} + \Omega_m a^{-3} + \Omega_r a^{-4} + \Omega_\Lambda].

At the present time (a = 1) \Omega_m \gg \Omega_r but, as we move back in time and a gets
smaller, the term \Omega_r a^{-4} will come to dominate. The epoch of matter-radiation
equality would have occurred when \Omega_m a^{-3} = \Omega_r a^{-4}. Using the value of \Omega_m = 0.27
from (8.43) and \Omega_r from (8.46) would give a^{-1} = 1 + z \approx 5500. This is not correct,
however, because the a dependence of \Omega_r should actually be given by

\Omega_r(a) = \frac{\rho_r(a)c^2}{\rho_c c^2} = \Omega_r \frac{g_*(a)}{2}\left(\frac{1}{a}\right)^4,   (8.48)

using the function g_* discussed in Chapter 5. In the region of 1 + z > 1000
neutrinos will be relativistic and g_* = 3.36 instead of 2 (the contribution to the
integral (4.56) from large z > 10^8, where g_*(a) > 3.36, is negligible). This gives

a_{eq}^{-1} = 1 + z_{eq} = \frac{0.27}{4.902 \times 10^{-5}} \times \frac{2}{3.36} \approx 3300.   (8.49)

Using Equation (4.56) and \Omega_\Lambda = 0.73 from (8.47) gives

t_{eq} \approx 54\,500\ \mathrm{yr}.   (8.50)

In Figure 5.1 we have plotted the energy density for matter and radiation as a
function of the scale a of the Universe from log a = -6. In Figure 5.9 we have
plotted it from log a = -4 until now.

Temperature scales inversely with a; thus at a_{eq} we have

T_{eq} = T_0 a_{eq}^{-1} \approx 9000\ \mathrm{K}.   (8.51)
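The equality estimates above can be reproduced in a few lines; the closed-form age integral at the end is a rough cross-check of Equation (8.50) under the simplifying assumption of a flat matter-plus-radiation model (not the full calculation of Equation (4.56)):

```python
import math

# Matter-radiation equality, Eqs. (8.49)-(8.51).  Omega_r is the photon
# density of Eq. (8.46); the factor 2/3.36 supplies the relativistic
# neutrinos through g_* as in Eq. (8.48).
Omega_m, Omega_r, T0, h = 0.27, 4.902e-5, 2.725, 0.71

z_naive = Omega_m / Omega_r - 1             # photons only: 1 + z ~ 5500
z_eq = (Omega_m / Omega_r) * 2 / 3.36 - 1   # with neutrinos: 1 + z ~ 3300
T_eq = T0 * (1 + z_eq)                      # ~9000 K, Eq. (8.51)

# Rough cross-check of t_eq ~ 54 500 yr, Eq. (8.50): closed-form age of
# a flat matter + radiation model at a_eq (Lambda is negligible then):
#   t_eq = (2/3)(2 - sqrt 2) Omega_r,eff^(3/2) / (Omega_m^2 H0)
H0 = h / 9.778e9                            # Hubble constant in 1/yr
Or_eff = Omega_m / (1 + z_eq)               # radiation density matching z_eq
t_eq = (2 / 3) * (2 - math.sqrt(2)) * Or_eff**1.5 / (Omega_m**2 * H0)
# t_eq comes out near 5.5e4 yr
```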

Inserting the values of the density parameters (8.43) and (8.47) into Equation (4.56),
one finds the age of the Universe to a remarkable precision,

t_0 = 13.7 \pm 0.2\ \mathrm{Gyr},   (8.52)

which is in excellent agreement with the independent determinations of lesser 
precision in Section 1.5. The WMAP team have also derived the redshift and age
of the Universe at last scattering and the thickness of the last scattering shell
(noting that the WMAP team use the term decoupling where we have used last
scattering surface; see our definitions in Section 5.5):

t_{LSS} = 0.379^{+0.008}_{-0.007}\ \mathrm{Myr}, \quad \Delta t_{LSS} = 0.118^{+0.003}_{-0.002}\ \mathrm{Myr},
1 + z_{LSS} = 1089 \pm 1, \quad \Delta z_{LSS} = 195 \pm 2.   (8.53)

Deceleration Parameter. Recalling the definitions

\Omega_m = \frac{8\pi G\rho_m}{3H_0^2}, \qquad \Omega_\Lambda = \frac{\Lambda}{3H_0^2}, \qquad q_0 = -\frac{\ddot{R}_0}{R_0 H_0^2},

and ignoring \Omega_r, since it is so small, we can find relations between the dynamical
parameters \Omega_\Lambda, \Omega_m, H_0 and the deceleration parameter q_0. Substitution of these
parameters into Equations (4.17) and (4.18) at the present time t_0 gives

H_0^2 + \frac{kc^2}{R_0^2} - \frac{\Lambda}{3} = \Omega_m H_0^2,   (8.54)

q_0 H_0^2 = \tfrac{1}{2}\Omega_m(1 + 3w)H_0^2 - \frac{\Lambda}{3},   (8.55)

where w denotes the equation of state p_m/\rho_m c^2 of matter. We can then obtain
two useful relations by eliminating either k or \Lambda. In the first case we find

\Omega_m(1 + 3w) = 2q_0 + 2\Omega_\Lambda,   (8.56)

and, in the second case,

\tfrac{3}{2}\Omega_m(1 + w) - q_0 - 1 = \frac{kc^2}{R_0^2 H_0^2}.   (8.57)

In the present matter-dominated Universe, the pressure p_m is completely negligible
and we can set w = 0. From Equation (8.56) and the values for \Omega_m and \Omega_\Lambda
above, we find

q_0 = -0.60 \pm 0.02.   (8.58)

The reason for the very small error is that the errors of \Omega_m and \Omega_\Lambda are completely
anti-correlated. The parameter values given in this section have been used
to construct the scales in Figure 5.9. The values are collected in Table A.6 in the
Appendix.
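Equation (8.58) follows in one line from (8.56) with w = 0:

```python
# Eq. (8.56) with w = 0 gives q0 = Omega_m / 2 - Omega_Lambda.
Omega_m, Omega_L = 0.27, 0.73
q0 = 0.5 * Omega_m - Omega_L   # ~ -0.60, as in Eq. (8.58)
```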



Problems 

1. Derive an equation for r eq from condition (8.3). 

2. Use Wien's constant, Equation (5.106) and the CMB temperature to determine 
the wavelength of the CMB. 




3. Use the present radiation-energy density to calculate the pressure due to 
radiation in the Universe. Compare this with the pressure due to a gas of 
galaxies calculated in Problem 3 of Chapter 3. 

4. Show that an observer moving with velocity \beta in a direction \theta relative to
the CMB sees the rest-frame blackbody spectrum with temperature T as a
blackbody spectrum with temperature

T' = \frac{T}{\gamma(1 - \beta\cos\theta)}.   (8.59)

To first order in \beta this gives the dipole anisotropy Equation (8.17) [1].

5. The dipole anisotropy is measured to be 1.2 \times 10^{-3} T_0. Derive the velocity of
the Earth relative to the comoving coordinate system.

Chapter Bibliography 

[1] Peebles, P. J. E. 1993 Principles of physical cosmology. Princeton University Press, 
Princeton, NJ. 

[2] Mather, J. C., Cheng, E. S., Eplee, R. E. et al. 1990 Astrophys. J. Lett. 354, L37.

[3] Fixsen, D. J. et al. 1996 Astrophys. J. 473, 576. 

[4] Bennett, C. L. et al. 1996 Astrophys. J. Lett. 464, L1.

[5] Lynden-Bell, D. et al. 1988 Astrophys. J. 326, 19. 

[6] Bennett, C. L. et al. 2003 Preprint arXiv, astro-ph/0302207 and 2003 Astrophys. J. (In 
press.) and companion papers cited therein. 

[7] Smoot, G. and Davidson, K. 1993 Wrinkles in time. Avon Books, New York. 

[8] Kosowsky, A. 1999 New Astronom. Rev. 43, 157. 

[9] Kovac, J. et al. 2002 Nature 420, 772. 

[10] Colless, M. et al. 2001 Mon. Not. R. Astron. Soc. 328, 1039. 

[11] Pearson, T. J. et al. 2003 Astrophys. J. 591, 556. 

[12] Kuo, C. L. et al. 2002 Preprint arXiv, astro-ph/0212289.

[13] Jimenez, R., Verde, L., Treu, T. and Stern, D. 2003 Preprint arXiv, astro-ph/0302560. 

[14] Tonry, J. L. et al. 2002 Preprint arXiv, astro-ph/0305008 and Astrophys. J. (2003). 



9 Cosmic Structures and Dark Matter



After the decoupling of matter and radiation described in Chapter 5, we followed 
the fate of the free-streaming CMB in Chapter 8. Here we shall turn to the fate of 
matter and cold nonradiating dust. After recombination, when atoms formed, den- 
sity perturbations in baryonic matter could start to grow and form structures, but 
growth in weakly interacting nonbaryonic (dark) matter could have started earlier 
at the time of radiation and matter equality. The time and size scales are impor- 
tant constraints to galaxy-formation models, as are the observations of curious 
patterns of filaments, sheets and voids on very large scales. 

In Section 9.1 we describe the theory of density fluctuations in a viscous fluid, 
which approximately describes the hot gravitating plasma. This very much paral- 
lels the treatment of the fluctuations in radiation that cause anisotropies in the 
CMB. 

In Section 9.2 we learn how pressure and gravitation conspire so that the hot 
matter can begin to cluster, ultimately to form the perhaps 10 9 galaxies, clusters 
and other large-scale structures. 

In Section 9.3 we turn to the dynamical evidence at various scales that a large 
fraction of the gravitating mass in the Universe is nonluminous and composed of 
some unknown kind of nonbaryonic matter. 

Section 9.4 lists the possible candidates of this dark matter. As we shall see, 
there are no good candidates, only some hypothetical particles which belong to 
speculative theories. 

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 470 84909 6 (cased) ISBN 470 84910 X (pbk) 



232 Cosmic Structures and Dark Matter 

In Section 9.5 we turn to observations of galaxy distributions, to comparisons
with simulations based on the cold dark matter (CDM) paradigm, and to predictions
and verifications based on that paradigm.

9.1 Density Fluctuations 

Until now we have described the dynamics of the Universe by assuming homo- 
geneity and adiabaticity. The homogeneity cannot have grown out of primeval 
chaos, because a chaotic universe can grow homogeneous only if the initial condi- 
tions are incredibly well fine-tuned. Vice versa, a homogeneous universe will grow 
more chaotic, because the standard model is gravitationally unstable. 

But the Universe appears homogeneous only on the largest scales (a debatable
issue!), since on smaller scales we observe matter to be distributed in galaxies,
groups of galaxies, supergalaxies and strings of supergalaxies with great voids
in between. At the time of matter and radiation equality, some lumpiness in the
energy density must have been the 'seeds' or progenitors of these cosmic structures,
and one would expect to see traces of that lumpiness also in the CMB temperature
anisotropies originating in the last scattering. The angular scale subtended
by progenitors corresponding to the largest cosmic structures known, of
size perhaps 200 h^{-1} Mpc, is of the order of 3°, corresponding to CMB multipoles
around \ell = 20.



Viscous Fluid Approximation. The common approach to the physics of mat- 
ter in the Universe is by the hydrodynamics of a viscous, nonstatic fluid. With 
this nonrelativistic (Newtonian) treatment and linear perturbation theory we can 
extract much of the essential physics while avoiding the necessity of solving the 
full equations of general relativity. In such a fluid there naturally appear random 
fluctuations around the mean density \bar{\rho}(t), manifested by compressions in some
regions and rarefactions in other regions. An ordinary fluid is dominated by the 
material pressure but, in the fluid of our Universe, three effects are competing: 
radiation pressure, gravitational attraction and density dilution due to the Hubble 
flow. This makes the physics different from ordinary hydrodynamics: regions of 
overdensity are gravitationally amplified and may, if time permits, grow into large 
inhomogeneities, depleting adjacent regions of underdensity. 

The nonrelativistic dynamics of a compressible fluid under gravity is described 
by three differential equations, the Eulerian equations. Let us denote the den- 
sity of the fluid by p, the pressure p, and the velocity field v , and use comoving 
coordinates, thus following the time evolution of a given volume of space. The 
first equation describes the conservation of mass: what flows out in unit time 
corresponds to the same decrease of matter in unit space. This is written 

\frac{d\rho}{dt} = -\rho\, \nabla \cdot v.   (9.1)

Next we have the equation of motion of the volume element under consideration,
the Euler equation,

\frac{dv}{dt} = -\frac{1}{\rho}\nabla p - \nabla\phi,   (9.2)

where \phi is the gravitational potential obeying Poisson's equation, which we met
in Equation (2.85),

\nabla^2\phi = 4\pi G\rho.   (9.3)

Equation (9.2) shows that the velocity field changes when it encounters pressure 
gradients or gravity gradients. 

The description in terms of the Eulerian equations is entirely classical and the 
gravitational potential is Newtonian. The Hubble flow is entered as a perturbation
to the zeroth-order solutions with infinitesimal increments \delta v, \delta\rho, \delta p and \delta\phi.
Let us denote the local density \rho(r, t) at comoving spatial coordinate r and world
time t. Then the fractional departure at r from the spatial mean density \bar{\rho}(t) is
the dimensionless mass density contrast

\delta_m(r, t) = \frac{\rho_m(r, t) - \bar{\rho}(t)}{\bar{\rho}(t)}.   (9.4)

The solution to Equations (9.1)-(9.3) can then be sought in the form of waves,

\delta_m(r, t) \propto e^{i(k \cdot r - \omega t)},   (9.5)

where k is the wave vector in comoving coordinates. An arbitrary pattern of fluctuations
can be described mathematically by an infinite sum of independent waves,
each with its characteristic wavelength \lambda or comoving wavenumber k and its
amplitude \delta_k. The sum can be formally expressed as a Fourier expansion for the
density contrast

\delta_m(r, t) \propto \sum_k \delta_k(t) e^{ik \cdot r}.   (9.6)

A density fluctuation can also be expressed in terms of the mass M moved within
one wavelength, or rather within a sphere of radius \lambda, thus M \propto \lambda^3. It follows that
the wavenumber or spatial frequency k depends on mass as

k = \frac{2\pi}{\lambda} \propto M^{-1/3}.   (9.7)



Power Spectrum. The density fluctuations can be specified by the amplitudes
\delta_k of the dimensionless mass autocorrelation function

\xi(r) = \langle \delta(r_1)\delta(r + r_1) \rangle \propto \sum \langle |\delta_k(t)|^2 \rangle e^{ik \cdot r},   (9.8)

which measures the correlation between the density contrasts at two points r and
r_1. The powers |\delta_k|^2 define the power spectrum of the root-mean-squared (RMS)
mass fluctuations

P(k) = \langle |\delta_k(t)|^2 \rangle.   (9.9)

Thus the autocorrelation function \xi(r) is the Fourier transform of the power spectrum.
We have already met a similar situation in the context of CMB anisotropies,
where the waves represented temperature fluctuations on the surface of the surrounding
sky. There Equation (8.18) defined the autocorrelation function C(\theta)




and the powers a_\ell^2 were coefficients in the Legendre polynomial expansion Equation (8.19).
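The statement that \xi(r) is the Fourier transform of the power spectrum (the Wiener-Khinchin relation) is easy to verify numerically on a toy periodic one-dimensional field (the spectrum below is arbitrary and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a periodic 1-d Gaussian random field with a prescribed toy
# power spectrum, then compare its directly measured two-point function
# with the Fourier transform of |delta_k|^2, as in Eqs. (9.8)-(9.9).
N = 4096
k = np.fft.rfftfreq(N)
P = np.zeros_like(k)
P[1:] = k[1:] ** -1.0                     # arbitrary toy spectrum

delta_k = np.sqrt(P / 2) * (rng.standard_normal(k.size)
                            + 1j * rng.standard_normal(k.size))
delta = np.fft.irfft(delta_k, n=N)        # real-space field

# circular autocorrelation via FFT vs. direct averaging of products
xi_fft = np.fft.irfft(np.abs(np.fft.rfft(delta)) ** 2, n=N) / N
xi_direct = np.array([np.mean(delta * np.roll(delta, r)) for r in range(4)])
assert np.allclose(xi_fft[:4], xi_direct)
```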
Taking the power spectrum to be of the phenomenological form (7.57), P(k) \propto k^n,
and combining with Equation (9.7), one sees that each mode \delta_k is proportional to
some power of the characteristic mass enclosed, M^\alpha.

Inflationary models predict that the mass density contrast obeys

\delta_m^2 \propto k^3 \langle |\delta_k(t)|^2 \rangle   (9.10)

and that the primordial fluctuations have approximately a Harrison-Zel'dovich
spectrum with n_s = 1. Support for these predictions comes from the CMB temperature
and polarization asymmetry spectra, which give the value quoted in Equation (8.40),
n_s = 0.93 \pm 0.03 [1].

Independent, although less accurate, information about the spectral index can
be derived from constraints set by CMB isotropy, galaxies and black holes using
\delta_k \propto M^\alpha. The CMB scale (within the size of the present horizon of M \approx 10^{22} M_\odot)
is isotropic to less than about 10^{-4}. Galaxy formation (scale roughly 10^{12} M_\odot)
requires perturbations of order 10^{-4\pm1}. Taking the ratio of perturbations versus
mass implies a constraint on \alpha and implies that n is close to 1.0 at long wavelengths.

Turning to short wavelengths (scale about 10^{12} kg or 10^{-18} M_\odot), black holes provide
another constraint. Primordial perturbations on this scale must have been
roughly smaller than 1.0. Larger perturbations would have led to production of
many black holes, since large-amplitude perturbations inevitably cause overdense
regions to collapse before pressure forces can respond. From Equation (3.30) one
sees that black holes less massive than 10^{12} kg will have already evaporated within
10 Gyr, but those more massive will remain and those of mass 10^{12} kg will be evaporating
and emitting \gamma-rays today. Large-amplitude perturbations at and above
10^{12} kg would imply more black holes than is consistent with the mass density
of the Universe and the \gamma-ray background. Combining the black hole limit on perturbations
(up to around 1 for M \sim 10^{12} kg) with those from the CMB and galaxy
formation also implies the spectrum must be close to the Harrison-Zel'dovich
form.

The power spectra of theoretical models for density fluctuations can be compared
with the real distribution of galaxies and galaxy clusters. Suppose that the
galaxy number density in a volume element dV is n_G; then one can define the
probability of finding a galaxy in a random element as

dP = n_G\, dV.   (9.11)

If the galaxies are distributed independently, for instance with a spatially homogeneous
Poisson distribution, the joint probability of having one galaxy in each
of two random volume elements dV_1, dV_2 is

dP_{12} = n_G^2\, dV_1\, dV_2.   (9.12)



There is then no correlation between the probabilities in the two elements. How- 
ever, if the galaxies are clustered on a characteristic length r c , the probabilities in 
different elements are no longer independent but correlated. The joint probability 
of having two galaxies with a relative separation r can then be written

dP_{12} = n_G^2 [1 + \xi(r/r_c)]\, dV_1\, dV_2,   (9.13)

where \xi(r/r_c) is the two-point correlation function for the galaxy distribution.
This can be compared with the autocorrelation function (9.8) of the theoretical
model. If we choose our own Galaxy at the origin of a spherically symmetric galaxy
distribution, we can simplify Equation (9.13) by setting n_G\, dV_1 = 1. The right-hand
side then gives the average number of galaxies in a volume element dV_2 at
distance r.
Analyses of galaxy clustering show [2] that, for distances

10\ \mathrm{kpc} \lesssim hr \lesssim 10\ \mathrm{Mpc},   (9.14)

a good empirical form for the two-point correlation function is

\xi(r/r_c) = (r/r_c)^{-\gamma},   (9.15)

with the parameter values r_c \approx 5.0 h^{-1} Mpc, \gamma \approx 1.7.
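A minimal numerical reading of Equation (9.15) with the quoted parameter values (distances in h^{-1} Mpc): the correlation length r_c is where \xi = 1, with strong clustering well inside it and weak correlation beyond.

```python
# Empirical galaxy two-point correlation function, Eq. (9.15),
# with r_c = 5.0 h^-1 Mpc and gamma = 1.7.
def xi(r, r_c=5.0, gamma=1.7):
    """r in units of h^-1 Mpc."""
    return (r / r_c) ** -gamma

assert xi(5.0) == 1.0                 # xi = 1 at the correlation length
assert xi(1.0) > 10 and xi(10.0) < 0.5
```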

Irregularities in the metric can be expressed by the curvature radius r_U defined
in Equation (4.54). If r_U is less than the linear dimensions d of the fluctuating
region, it will collapse as a black hole. Establishing the relation between the curvature
of the metric and the size of the associated mass fluctuation requires the
full machinery of general relativity, which is beyond our ambitions.

Linear Approximation. Much of the interesting physics of density fluctuations
can be captured by a Newtonian linear perturbation analysis of a viscous fluid.
Small perturbations grow slowly over time and follow the background expansion
until they become heavy enough to separate from it and to collapse into
gravitationally bound systems. As long as these perturbations are small they can
be decomposed into Fourier components that develop independently and can be
treated separately. For fluctuations in the linear regime, |\delta_k| < 1, where

\rho_m = \bar\rho_m + \delta\rho_m, \quad p = \bar p + \delta p, \quad v^i = \bar v^i + \delta v^i, \quad \phi = \bar\phi + \delta\phi,   (9.16)

the size of the fluctuations and the wavelengths grow linearly with the scale a,
whereas in the nonlinear regime, |\delta_k| > 1, the density fluctuations grow faster,
with the power a^3 at least (but not exponentially). The density contrast can also
be expressed in terms of the linear size d of the region of overdensity normalized
to the curvature radius,

\delta \sim \left(\frac{d}{r_U}\right)^2.   (9.17)

In the linear regime r_U is large, so the Universe is flat. At the epoch when d is
of the order of the Hubble radius, the density contrast is







Figure 9.1 The evolution of the physical size of the comoving scale or wavelength \lambda,
and of the Hubble radius H^{-1}, as functions of the scale R of the Universe. In the standard
noninflationary cosmology, a given scale crosses the horizon but once, while in the
inflationary cosmology all scales begin sub-horizon sized, cross outside the Hubble radius
('good bye') during inflation, and re-enter ('hello again') during the post-inflationary epoch.
Note that the largest scales cross outside the Hubble radius first and re-enter last. The
growth in the scale factor, N = \ln(R_{RH}/R), between the time a scale crosses outside the
Hubble radius during inflation and the end of inflation is also indicated. For a galaxy,
N_{GAL} = \ln(R_{RH}/R_2) \approx 45, and for the present horizon scale, N_{HOR} = \ln(R_{RH}/R_1) \approx 53.
Causal microphysics operates only on scales less than H^{-1}, indicated by arrows. During
inflation H^{-1} is a constant, and in the post-inflation era it is proportional to R^n, where
n = 2 during radiation domination and n = 3/2 during matter domination. Courtesy of
E. W. Kolb and M. S. Turner.



free streaming can leave the region and produce the CMB anisotropies. Structures
formed when d \ll r_H, thus when \delta \ll 1. Although \delta may be very small, the
fluctuations may have grown by a very large factor because they started early on
(see Problem 3 in Chapter 7).

When the wavelength is below the horizon, causal physical processes can act 
and the (Newtonian) viscous fluid approximation is appropriate. When the wave- 
length is of the order of or larger than the horizon, however, the Newtonian analy- 
sis is not sufficient. We must then use general relativity and the choice of gauge 
is important. 



Structure Formation 237 

Gauge Problem. The mass density contrast introduced in Equation (9.4) and the
power spectrum of mass fluctuations in Equation (9.9) represented perturbations
to an idealized world: homogeneous, isotropic, adiabatic, and described by the
FLRW model. For sub-horizon modes this is adequate. For super-horizon modes
one must apply a full general-relativistic analysis. Let us call the space-time of
the world just described G. In the real world, matter is distributed as a smooth
background with mass perturbations imposed on it. The space-time of that world
is not identical to G, so let us call it G'.

To go from G', where measurements are made, to G, where the theories are
defined, requires a gauge transformation. This is something more than a mere
coordinate transformation: it also changes the event in G that is associated with
an event in G'. A perturbation in a particular observable is, by definition, the
difference between its value at some space-time event in G and its value at the
corresponding event in the background (also in G). An example is the mass autocorrelation
function \xi(r) in Equation (9.8).

But this difference need not be the same in G'. For instance, even if an observable
behaves as a scalar under coordinate transformations in G, its perturbation
will not be invariant under gauge transformations if it is time dependent in the
background. Non-Newtonian density perturbations in G on super-horizon scales
may have an entirely different time dependence in G', and the choice of gauge
transformation G \to G' is quite arbitrary.

But arbitrariness does not imply that one gauge is correct and all others wrong. 
Rather, it imposes on physicists the requirement to agree on a convention, oth- 
erwise there will be problems in the interpretation of results. The formalism we 
chose in Chapter 2, which led to Einstein's equation (2.96) and Friedmann's equa- 
tions (4.4) and (4.5), implicitly used a conventional gauge. Alternatively one could 
have used gauge-invariant variables, but at the cost of a very heavy mathematical 
apparatus. Another example which we met briefly in Section 6.2 concerned the 
electroweak theory, in which particle states are represented by gauge fields that 
are locally gauged. 



9.2 Structure Formation 

As we have seen in the FLRW model, the force of gravity makes a homogeneous 
matter distribution unstable: it either expands or contracts. This is true for matter 
on all scales, whether we are considering the whole Universe or a tiny localized 
region. But the FLRW expansion of the Universe as a whole is not exponential and 
therefore it is too slow to produce our Universe in the available time. This requires 
a different mechanism to give the necessary exponential growth: cosmic inflation. 
Only after the graceful exit from inflation does the Universe enter the regime of 
Friedmann expansion, during which the Hubble radius gradually overtakes the 
inflated regions. Thus, inflationary fluctuations will cross the post-inflationary 
Hubble radius and come back into vision with a wavelength corresponding to the 
size of the Hubble radius at that moment. This is illustrated in Figure 9.1. 




Jeans Mass. Primordial density fluctuations expand linearly at a rate slower than
the Universe is expanding in the mean, until eventually they reach a maximum size
and collapse nonlinearly. If the density fluctuates locally, the cosmic scale factor
will also be a fluctuating function a(r, t) of position and time. In overdense
regions, where the gravitational forces dominate over pressure forces, matter
contracts locally and attracts surrounding matter, which can be seen as inflow.
In other regions, where the pressure forces dominate, the fluctuations
move as sound waves in the fluid, transporting energy from one region of space
to another.

The dividing line between these two possibilities can be found by a classical 
argument. Let the time of free fall to the centre of an inhomogeneity in a gravita- 
tional field of strength G be 

t_G = 1/√(Gρ). (9.19)

Sound waves in a medium of density ρ and pressure p propagate with velocity

c_s = √(∂p/∂ρ),

so they move one wavelength λ in the time

t_s = λ/c_s. (9.20)

Note that only baryonic matter experiences pressure forces. In the next sections 
we shall meet noninteracting forms of matter which feel only gravitational forces. 
When t_G is shorter than t_s, the fluctuations are unstable and their amplitude
will grow by attracting surrounding matter, becoming increasingly unstable until
the matter eventually collapses into a gravitationally bound object. The opposite
case is stable: the fluctuations will move with constant amplitude as sound waves.
Setting t_G = t_s, we find the limiting Jeans wavelength λ = λ_J at the Jeans instability,
discovered by Sir James Jeans (1877-1946) in 1902,

λ_J = √(π/Gρ) c_s. (9.21)

Actually, the factor √π was not present in the above Newtonian derivation; it
comes from an exact treatment of Equations (9.1)-(9.3) (see for instance reference
[3]). The mass contained in a sphere of radius λ_J is called the Jeans mass,

M_J = (4π/3) λ_J³ ρ. (9.22)

In order for tiny density fluctuations to be able to grow to galactic size, there 
must be enough time, or the expansion must be exponential in a brief time. The 
gravitational collapse of a cloud exceeding the Jeans mass develops exponentially, 
so the cloud halves its radius in equal successive time intervals. But galaxies and 
large-scale structures do not condense out of the primordial medium by exponen- 
tial collapse. The structures grow only linearly with the scale a or as some low 
power of a. 
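To make Equations (9.19)-(9.22) concrete, the following sketch evaluates the Jeans wavelength and mass; the density and sound speed used below are illustrative placeholders, not values taken from the text:

```python
import math

G = 6.674e-11       # Newton's constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30    # solar mass [kg]

def jeans_wavelength(c_s, rho):
    """Jeans wavelength, Equation (9.21): lambda_J = sqrt(pi/(G*rho)) * c_s."""
    return math.sqrt(math.pi / (G * rho)) * c_s

def jeans_mass(c_s, rho):
    """Mass inside a sphere of radius lambda_J, Equation (9.22)."""
    return 4.0 / 3.0 * math.pi * jeans_wavelength(c_s, rho) ** 3 * rho

# Placeholder (illustrative) values for a cool gas cloud:
rho = 1.0e-18   # mass density [kg m^-3]
c_s = 5.0e3     # sound speed [m s^-1]
print(f"lambda_J = {jeans_wavelength(c_s, rho):.2e} m")
print(f"M_J      = {jeans_mass(c_s, rho) / M_SUN:.2e} M_sun")
```

Note the scalings implied by (9.21) and (9.22): M_J grows as c_s³ and falls as ρ^(−1/2), which is why the baryonic Jeans mass drops so dramatically when the sound speed collapses at recombination.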




For sub-horizon modes, the distinction between the radiation-dominated and 
matter-dominated eras is critical. During the radiation era, growth of perturba- 
tions is suppressed. During the matter era, perturbations can grow. But during 
the matter era the Jeans wavelength provides an important boundary. Large wave- 
length fluctuations will grow with the expansion as long as they are in the linear 
regime. In an accelerated expansion driven by dark energy, the condition for grav- 
itational collapse becomes extremely complicated. This happens rather late, only 
when matter domination ends and dark energy becomes dynamically important 
(z~l). 

For wavelengths less than the Jeans wavelength the pressure in the baryonic 
matter can oppose the gravitational collapse and perturbations will oscillate in the 
linear regime as sound waves, never reaching gravitational collapse. An alternative 
way of stating this is to note that the radiation pressure and the tight coupling 
of photons, protons and electrons causes the fluid to be viscous. On small scales, 
photon diffusion and thermal conductivity inhibit the growth of perturbations as 
soon as they arise, and on large scales there is no coherent energy transport. 

Mass fluctuations at still shorter wavelength, with λ ≪ r_H, can break
away from the general expansion and collapse to bound systems of the size of 
galaxies or clusters of galaxies. Fluctuations which enter in the nonlinear regime, 
where the ratio in Equation (9.17) is large, collapse rapidly into black holes before 
pressure forces have time to respond. 

For baryonic matter before the recombination era, the baryonic Jeans mass is
some 30 times larger than the mass M_H of baryons within the Hubble radius r_H, so
if there exist nonlinear modes they are outside it (the Jeans wavelength is greater
than the horizon). A mass scale M is said to enter the Hubble radius when M = M_H.
Well inside the Hubble radius, the fluctuations may start to grow as soon as the
Universe becomes matter dominated, which occurs at time t_eq = 54 500 yr (from
Equation (8.50)).

Upon recombination, the baryonic Jeans mass falls dramatically. If the fluid 
is composed of some nonbaryonic particle species (cold dark matter), the Jeans 
wavelength is small after radiation-matter equality, allowing sub-horizon pertur- 
bations to grow from this time. After matter-radiation equality, nonbaryonic mat- 
ter can form potential wells into which baryons can fall after recombination. 

Matter can have two other effects on perturbations. Adiabatic fluctuations lead 
to gravitational collapse if the mass scale is so large that the radiation does not 
have time to diffuse out of one Jeans wavelength within the time t_eq. As the Uni-
verse approaches decoupling, the photon mean free path increases and radia- 
tion can diffuse from overdense regions to underdense ones, thereby smoothing 
out any inhomogeneities in the plasma. For wavelengths below the Jeans wave- 
length, collisional dissipation or Silk damping (after J. Silk) erases perturbations in 
the matter (baryon) radiation field through photon diffusion. This becomes most
important around the time of recombination. Random waves moving through the 
medium with the speed of sound c s erase all perturbations with wavelengths less 
than c_s t_eq. This mechanism sets a lower limit to the size of the structures that
can form by the time of recombination: they are not smaller than rich clusters 




or superclusters. But, in the presence of nonbaryonic matter, Silk damping is of 
limited importance because nonbaryonic matter does not couple with the radia- 
tion field. 

The second effect is free streaming of weakly interacting relativistic particles 
such as neutrinos. This erases perturbations up to the scale of the horizon, but 
this also ceases to be important at the time of matter-radiation equality. 

The situation changes dramatically at recombination, when all the free electrons 
suddenly disappear, captured into atomic Bohr orbits, and the radiation pressure 
almost vanishes. This occurs about 400 000 yr after the Big Bang (see Figure 5.9).
Now the density perturbations which have entered the Hubble radius can grow 
with full vigour. 



Sunyaev-Zel'dovich Effect (SZE). At some stage the hydrogen gas in gravitation- 
ally contracting clouds heats up enough to become ionized and to re-ionize the 
CMB: the Sunyaev-Zel'dovich effect. We refer to the WMAP result in Equation (8.39) 
that such re-ionization clouds occur at an average redshift z_r ≈ 20.

The free electrons and photons in the ionized clouds build up a radiation pres- 
sure, halting further collapse. The state of such clouds today depends on how 
much mass and time there was available for their formation. Small clouds may 
shrink rapidly, radiating their gravitational binding energy and fragmenting. Large 
clouds shrink slowly and cool by the mechanism of electron Thomson scattering. 
As the recombination temperature is approached the photon mean free paths 
become larger, so that radiation can diffuse out of overdense regions. This damps 
the growth of inhomogeneities. 

The distortion of the CMB spectrum due to the SZE can be used to detect inter- 
galactic clouds and to provide another estimate of H_0 by combining radio and
X-ray observations to obtain the distance to the cloud. The importance of the SZE 
surveys is that they are able to detect all clusters above a certain mass limit inde- 
pendent of the redshifts of the clusters. The ratio of the magnitude of the SZE to 
the CMB does not change with redshift. The effects of re-ionization on the CMB 
temperature-polarization power were discussed in Section 8.4. 



Structure Sizes and Formation Times. Only clouds exceeding the Jeans mass 
stabilize and finally attain virial equilibrium. It is intriguing (but perhaps an acci-
dent) that the Jeans mass just after recombination is about 10⁵ M_⊙, the size of
globular clusters! Galaxies have masses of the order of 10¹² M_⊙, corresponding to
fluctuations of order δ ≈ 10⁻⁴ as they cross the horizon. We have already made
use of this fact to fix the mass m_φ of the scalar field in Equation (7.49).

The timetable for galaxy and cluster formation is restricted by two important 
constraints. At the very earliest, the Universe has to be large enough to have space 
for the first formed structures. If these were galaxies of the present size, their 
number density can be used to estimate how early they could have been formed. 
We leave this for a problem. 




The present density of the structures also sets a limit on the formation time. 
The density contrast at formation must have exceeded the mean density at that 
time, and since then the contrast has increased with a³. Thus, rich clusters, for
instance, cannot have been formed much earlier than at

1 + z ≈ 2.5 Ω^(−1/3). (9.23)

It seems that all the present structure was already in place at z = 5. This does
not exclude that the largest clusters are still collapsing today. In a critical
universe structure formation occurs continuously, rich galaxy clusters form only 
at a redshift of 0.2-0.3, and continue to accrete material even at the present epoch. 
In that case many clusters are expected to show evidence for recent merger events 
and to have irregular morphologies. 

As a result of mass overdensities, the galaxies influenced by the ensuing fluc-
tuations in the gravitational field will acquire peculiar velocities. One can derive a
relation between the mass autocorrelation function and the RMS peculiar velocity
(see reference [3]). If one takes the density contrast to be δ_m = 0.3 for RMS fluc-
tuations of galaxy number density within a spherical volume of radius 30 h⁻¹ Mpc,
and if one further assumes that all mass fluctuations are uncorrelated at larger
separations, then the acceleration caused by the gravitational force of the mass fluc-
tuations would predict deviations from a homogeneous peculiar velocity field in
rough agreement with observations in our neighbourhood. A much larger density
contrast would be in marked disagreement with the standard model and with the
velocity-field observations.



9.3 The Evidence for Dark Matter 

Below we shall study the matter content in dynamical systems on a variety of 
scales: galaxies, small galaxy groups, the local supercluster, rich galaxy clusters, 
and the full Universe inside our horizon. 



Inventory. In Equation (8.43) we quoted a value for the mass density Ω_m from
a combination of CMB, large-scale-structure and supernova data. At that point
we did not specify the types of gravitating mass that Ω_m represented: baryons
certainly, and neutrinos too. Let us at this point make a full inventory of the
known contents of energy densities in the Universe on a large scale. This implies
rewriting Equation (4.20) in detail:

Ω₀ = Ω_b + Ω_ν + Ω_r + Ω_Λ. (9.24)

We know that

(i) Ω₀ = 1.02 ± 0.02 from Equation (8.43); this permits us to consider the Uni-
verse as being spatially flat (Ω₀ = 1). Moreover,

(ii) Ω_b = 0.044 ± 0.004 from Equation (8.43),




(iii) Ω_ν < 0.015 from Equations (8.16) and (8.43),

(iv) Ω_r = 4.902 × 10⁻⁵ from Equation (8.46),

(v) Ω_Λ = 0.73 ± 0.04 from Equation (8.47).

Obviously, Ω_b does not, by far, make up all the matter content in Ω_m = 0.27,
and Ω_ν and Ω_r can be neglected here. Therefore, a large fraction is missing from
the right-hand side of (9.24) to make the equation balance. All forms of radiating
baryonic mass are already accounted for in Ω_b: starlight amounts to Ω_* = 0.001-
0.002, and gas and star remnants in galaxies amount to Ω_lum < 0.01. The intergalactic
gas contains hydrogen clouds and filaments seen by their Lyα absorption, warm gas
in groups of galaxies radiates soft X-rays, and hot gas in clusters is seen in keV X-rays
and in the SZE. The missing fraction is not radiating and is therefore called dark
matter (DM),

Ω_dm = Ω_m − Ω_b = 0.23 ± 0.05. (9.25)

The remarkable fact is that the missing energy density is much larger than the 
known fraction. 
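The bookkeeping of Equations (9.24) and (9.25) can be sketched with the central values quoted in the inventory above:

```python
# Central values quoted in the inventory (i)-(v) and Equation (8.43):
omega_b      = 0.044      # baryons
omega_nu_max = 0.015      # neutrinos (upper limit)
omega_r      = 4.902e-5   # radiation
omega_lambda = 0.73       # dark energy
omega_m      = 0.27       # total matter

# Equation (9.25): the nonbaryonic dark-matter density
omega_dm = omega_m - omega_b
print(f"Omega_dm = {omega_dm:.3f}")                  # close to the quoted 0.23 +- 0.05
print(f"dark fraction of matter = {omega_dm / omega_m:.0%}")
```

The arithmetic makes the point of the text directly: over four-fifths of the matter density is in a form that radiates nothing at all.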



Galaxy Formation. Galaxies form by gas cooling and condensing into DM haloes,
where they turn into stars. The star-formation rate is 10 M_⊙ yr⁻¹ in galaxies at
2.8 < z < 3.5, for which the Lyman break at 91.2 nm shifts significantly (at z = 3 it
has shifted to 364.8 nm). Galaxy mergers and feedback processes also play major
roles.

If the galaxies have arisen from primordial density fluctuations in a purely bary- 
onic medium, the amplitude of the fluctuations must have been very large, since 
the amount of baryonic matter is so small. But the amplitude of fluctuations in 
the CMB must then also be very large, because of adiabaticity. This leads to intol- 
erably large CMB anisotropies today. Thus galaxy formation in purely baryonic 
matter is ruled out by this argument alone. 



Spiral Galaxies. The spiral galaxies are stable gravitationally bound systems in 
which matter is composed of stars and interstellar gas. Most of the observable 
matter is in a relatively thin disc, where stars and gas travel around the galactic 
centre on nearly circular orbits. By observing the Doppler shift of the integrated 
starlight and the radiation at λ = 0.21 m from the interstellar hydrogen gas, one
finds that galaxies rotate. If the circular velocity at radius r is v in a galaxy of
mass M, the condition for stability is that the centrifugal acceleration should
equal the gravitational pull:

v²/r = GM/r². (9.26)

In other words, the radial dependence of the velocity of matter rotating in a disc
is expected to follow Kepler's law

v = √(GM/r) ∝ 1/√r. (9.27)





Figure 9.2 Typical galaxy rotation curves: (a) derived from the observed Doppler shift 
of the 0.21 m line of atomic hydrogen relative to the mean; (b) prediction from the radial 
light distribution. 

Visible starlight traces velocity out to radial distances typically of the order of
10 kpc, and interstellar gas out to 20-50 kpc. The surprising result from mea-
surements of galaxy-rotation curves is that the velocity does not follow the 1/√r
law (9.27), but stays constant after a maximum at about 5 kpc (see Figure 9.2).
Assuming that the disc-surface brightness is proportional to the surface density
of luminous matter, one derives a circular speed which is typically more than
a factor of three lower than the speed of the outermost measured points (see,
for example, reference [4]). This implies that the calculated gravitational field is
too small by a factor of 10 to account for the observed rotation. This effect is
even more pronounced in dwarf spheroidal galaxies, which are only a few parsecs
across and are the smallest systems in which dynamical DM has been detected.
These galaxies have the highest known DM densities, approximately 1 M_⊙ pc⁻³,
and the motion of their stars is completely dominated by DM at all radii.
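The discrepancy can be illustrated numerically by inverting Equation (9.26): a flat rotation curve requires the enclosed mass to grow linearly with radius. A minimal sketch, with a hypothetical flat rotation speed of 200 km s⁻¹ (an assumed value, chosen only for illustration):

```python
G = 6.674e-11       # Newton's constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30    # solar mass [kg]
KPC = 3.086e19      # kiloparsec in metres

def enclosed_mass(v, r):
    """Invert Equation (9.26): mass needed inside radius r for circular speed v."""
    return v ** 2 * r / G

v_flat = 2.0e5      # hypothetical flat rotation speed, 200 km/s
for r_kpc in (5, 10, 20, 40):
    m = enclosed_mass(v_flat, r_kpc * KPC) / M_SUN
    print(f"r = {r_kpc:2d} kpc  ->  M(<r) = {m:.1e} M_sun")
# M(<r) doubles when r doubles: a mass distribution proportional to r,
# far in excess of the luminous matter, which is concentrated at small radii.
```

A Keplerian system, by contrast, would keep M(<r) constant beyond the luminous disc, so v would fall as 1/√r; the flat curve is what forces M(r) ∝ r.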

There are only a few possible solutions to this problem. One is that the theory 
of gravitation is wrong. It is possible to modify ad hoc Kepler's inverse square 
law or Newton's assumption that G is a constant, but the corresponding modifi- 
cations cannot be carried out in the relativistic theory, and a general correlation 
between mass and light remains. The modifications would have to be strong at 
large scales, and this would greatly enhance cosmic shear, which is inconsistent 
with measurements. 

Another possibility is that spiral galaxies have magnetic fields extending out to 
regions of tens of kiloparsecs where the interstellar gas density is low and the gas 
dynamics may easily be modified by such fields [5]. But this argument works only 
on the gas halo, and does not affect the velocity distribution of stars. Also, the 
existence of magnetic fields of sufficient strength remains to be demonstrated; in 
our Galaxy it is only a few microgauss, which is insufficient. 




The solution attracting most attention is that there exist vast amounts of non- 
luminous DM beyond that accounted for by stars and hydrogen clouds. One nat- 
ural place to look for DM is in the neighbourhood of the Solar System. In 1922, 
Jacobus C. Kapteyn deduced that the total density in the local neighbourhood is 
about twice as large as the luminous density in visible stars. Although the result 
is somewhat dependent on how large one chooses this 'neighbourhood' to be, 
modern dynamical estimates are similar. 

The luminous parts of galaxies, as evidenced by radiation of baryonic matter in
the visible, infrared and X-ray spectra, account only for Ω_lum < 0.01. The internal
dynamics implies that the galaxies are embedded in extensive haloes of DM, of
the order of

Ω_halo ≈ 0.03-0.10. (9.28)

In fact, to explain the observations, the radial mass distribution M(r) must be
proportional to r,

v = √(GM(r)/r) ≈ constant. (9.29)

The radial density profile is then

ρ(r) ∝ r⁻². (9.30)

This is precisely the distribution one would obtain if the galaxies were surrounded 
by a halo formed by an isothermal gas sphere, where the gas pressure and gravity 
were in virial equilibrium. 

Actually, the observed rotation curves show no obvious density plateau or core
near the centre. The profile of DM in haloes is shallower than isothermal near the
centre and steeper in the outer parts. The inner profiles (r < r_s) of DM haloes
are remarkably similar and well approximated by a power-law central cusp of the
form

ρ(r)/ρ_c(z) = δ_c / [(r/r_s)(1 + r/r_s)²], (9.31)

where δ_c and r_s are two free parameters to be fitted and ρ_c(z) is the critical
density at the redshift of the galaxy [6]. The outer shape varies from halo to halo,
mostly because of the presence of minihalo remnants and other substructures.

Thus the unavoidable conclusion is that galactic haloes contain DM: in fact an 
order of magnitude more DM than baryonic matter. For instance, there is five 
times more DM mass than visible baryonic mass in the M33 galaxy, and the ratio 
is 50:1 in its halo [7]. 
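A short sketch of the cusped profile of Equation (9.31); the parameter values r_s and δ_c below are hypothetical, chosen only to exhibit the 1/r inner and 1/r³ outer behaviour:

```python
def nfw_density_ratio(r, r_s, delta_c):
    """rho(r)/rho_c(z) from Equation (9.31): delta_c / [(r/r_s)(1 + r/r_s)^2]."""
    x = r / r_s
    return delta_c / (x * (1.0 + x) ** 2)

# Hypothetical parameters, for illustration only:
r_s, delta_c = 20.0, 1.0e4      # scale radius [kpc] and characteristic contrast

for r in (1.0, 5.0, 20.0, 100.0, 400.0):
    print(f"r = {r:6.1f} kpc  ->  rho/rho_c = {nfw_density_ratio(r, r_s, delta_c):.3g}")
# Well inside r_s the profile falls roughly as 1/r (a cusp, no core);
# well outside r_s it steepens towards 1/r^3.
```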



Small Galaxy Groups. Let us now turn to gravitational systems formed by a 
small number of galaxies. There are examples of such groups in which the galax- 
ies are enveloped in a large cloud of hot gas, visible by its X-ray emission. The 
amount of gas can be deduced from the intensity of this radiation. Adding this to 
the luminous matter, the total amount of baryonic matter can be estimated. The 
temperature of the gas depends on the strength of the gravitational field, from 
which the total amount of gravitating matter in the system can be deduced. 




In the galaxy group HCG62 in the Coma cluster, the ROSAT satellite has found
a temperature of about 10⁷ K [8], which is much higher than that which the grav-
itational field of visible baryonic matter (galaxies and gas) would produce. One
then deduces a baryonic mass fraction of Ω_b ≳ 0.13. But this cannot be typical of
the Universe as a whole, since it conflicts with the value in Equation (8.43).



The Local Supercluster (LSC). The autocorrelation function ξ(r) in Equa-
tion (9.8) was defined for distances r in real space. In practice, distances to
galaxies are measured in redshifts, and then two important distortions enter. To
describe the separation of galaxy pairs on the surface of the sky, let us introduce
the coordinate σ, transversal to the line of sight, and π, radial. In redshift space
the correlation function is then described by ξ(σ, π) or its spherical average ξ(s),
where s = √(π² + σ²).

The transversal distance σ is always accurate, but the radial redshift distance
π is affected by velocities other than the isotropic Hubble flow. For relatively
nearby galaxies, r ≲ 2 Mpc, the random peculiar velocities make an unknown
contribution to π so that ξ(s) is radially distorted. The undistorted correlation
function ξ(r) is seen isotropic in (σ, π)-space in the top left panel of Figure 9.3.
The lower left panel of Figure 9.3 shows the distortion to ξ(s) as an elongation in
the π direction.

Over large distances where the peculiar velocities are unimportant relative to
the Hubble flow (tens of Mpc), the galaxies in the LSC feel its attraction, as is mani-
fested by their infall toward its centre with velocities in the range 150-450 km s⁻¹.
From this one can derive the local gravitational field and the mass excess δM
concentrated in the LSC. The infall velocities cause another distortion to ξ(s): a
flattening, as is shown in the top right panel of Figure 9.3. When both distortions
are included, the correlation function in (σ, π)-space looks like the bottom right
panel of Figure 9.3. The narrow peaks in the π direction have been seen for a long
time, and are called Fingers of God.

If galaxy formation is a local process, then on large scales galaxies must trace
mass (on small scales galaxies are less clustered than mass), so that ξ_gal(r) and
ξ_mass(r) are proportional:

ξ_gal(r) = b² ξ_mass(r).

Here b is the linear bias: bias is when galaxies are more clustered than mass,
and anti-bias is the opposite case; b = 1 corresponds to the unbiased case. The
presence of bias is an inevitable consequence of the nonlinear nature of galaxy
formation. The distortions in ξ(s) clearly depend on the mass density Ω_m within
the observed volume. Introducing the phenomenological flattening parameter

β = Ω_m^0.6 / b, (9.32)

one can write a linear approximation to the distortion as

ξ(s)/ξ(r) = 1 + 2β/3 + β²/5. (9.33)










Figure 9.3 Plot of theoretically calculated correlation functions ξ(σ, π) as described in
reference [2]. The lines represent contours of constant ξ(σ, π) = 4.0, 2.0, 0.5, 0.2, 0.1.
The models are: top left, undistorted; bottom left, no infall velocities but β = 0.4; top
right, infall velocity dispersion a = 500 km s⁻¹, β = 0; bottom right, a = 500 km s⁻¹,
β = 0.4. All models use Equation (9.15) with r_0 = 5.0 h⁻¹ Mpc and γ = 1.7. Reproduced
from reference [2] by permission of the 2dFGRS Team.



Estimates of β and b, which we shall discuss later, lead to a value of Ω_m of the
order of 0.3 [2]. Thus a large amount of matter in the LSC is dark.
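Equations (9.32) and (9.33) can be sketched as follows; the inputs Ω_m = 0.3 and b = 1 are illustrative choices, not fitted values:

```python
def beta(omega_m, b):
    """Flattening parameter, Equation (9.32): beta = Omega_m**0.6 / b."""
    return omega_m ** 0.6 / b

def kaiser_boost(beta_val):
    """Linear redshift-space enhancement, Equation (9.33): xi(s)/xi(r)."""
    return 1.0 + 2.0 * beta_val / 3.0 + beta_val ** 2 / 5.0

b_val = beta(0.3, 1.0)   # unbiased tracers (b = 1) in an Omega_m = 0.3 universe
print(f"beta = {b_val:.2f}, xi(s)/xi(r) = {kaiser_boost(b_val):.2f}")
```

For these inputs β is close to 0.5, so the linear infall distortion enhances the spherically averaged clustering by several tens of per cent, which is why ξ(s) is measurably flattened.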



Rich Clusters. The classical method for obtaining the mean mass density, ρ_m,
is to measure the mean ratio of mass to luminosity, Υ = M/L, for a representative
sample of galaxies. The light radiated by galaxies is a universal constant L_U, and
the mass density is then obtained as ρ_m = L_U Υ (Problem 6). In solar units, M_⊙/L_⊙,
the value for the Sun is Υ = 1, and values in the solar neighbourhood are about 2.5-7.
Similar values apply to small galaxy groups. However, rich galaxy clusters exhibit




much larger Υ values, from 300 for the Coma cluster to 650. These large values
show that rich clusters have their own halo of DM which is much larger than the
sum of the haloes of the individual galaxies.

Zwicky noted in 1933 that the galaxies in the Coma cluster and other rich clus- 
ters move so fast that the clusters require about 10-100 times more mass to keep 
the galaxies bound than could be accounted for by the luminous mass in the galax- 
ies themselves. This was the earliest indication of DM in objects at cosmological 
distances. 

The virial theorem for a statistically steady, spherical, self-gravitating cluster of
objects, stars or galaxies, states that the total kinetic energy of N objects with aver-
age random peculiar velocities v equals −½ times the total gravitational potential
energy. If r is the average separation between any two objects of average mass
m, the potential energy of each of the possible N(N − 1)/2 pairings is −Gm²/r.
The virial theorem then states that

½ N m v² = ½ · [N(N − 1)/2] · Gm²/r. (9.34)

For a large cluster of galaxies of total mass M and radius r, this reduces to

M = 2rv²/G. (9.35)

Thus, one can apply the virial theorem to estimate the total dynamic mass of a 
rich galaxy cluster from measurements of the velocities of the member galaxies 
and the cluster radius from the volume they occupy. When such analyses have 
been carried out, taking into account that rich clusters have about as much mass 
in hot gas as in stars, one finds that gravitating matter accounts for 

Ω_grav = 0.2-0.3, (9.36)

or much more than the fraction of baryonic matter. 
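As a sketch of this procedure, Equation (9.35) applied to assumed Coma-like numbers (radius and velocity dispersion below are illustrative, not quoted in the text) yields a dynamical mass near 10¹⁵ M_⊙:

```python
G = 6.674e-11       # Newton's constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30    # solar mass [kg]
MPC = 3.086e22      # megaparsec in metres

def virial_mass(r, v_rms):
    """Dynamical cluster mass from Equation (9.35): M = 2 r v^2 / G."""
    return 2.0 * r * v_rms ** 2 / G

# Assumed Coma-like numbers, for illustration:
r = 1.5 * MPC       # cluster radius
v_rms = 1.0e6       # average random peculiar velocity, ~1000 km/s
print(f"M ~ {virial_mass(r, v_rms) / M_SUN:.1e} M_sun")
```

Comparing such a dynamical mass with the summed luminous mass of the member galaxies reproduces Zwicky's original factor 10-100 discrepancy.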

This is well demonstrated in a study of groups and clusters of galaxies with 
the Position Sensitive Proportional Counter (PSPC) instrument on board the X-ray 
satellite ROSAT [9, 10]. An example is given in Figure 9.4, which shows the radial 
distribution of various gravitating components in the hot cluster A85. The intra- 
cluster gas visible by its X-ray emission is the most extended mass component 
(GAS), and galaxies constitute the most centrally concentrated component (GAL). 
Thus the total baryonic matter at large radii is well approximated by GAS, and it 
is reasonable to assume that the constant level it approaches corresponds to the 
primordial composition out of which galaxies formed. One clearly sees the need 
for a dominating DM component (DARK) which is clustered intermediate between 
GAL and GAS. 

One can then establish the relation

f_gas = υ Ω_b / Ω_m = 0.113 ± 0.005, (9.37)

where the subscript m in our notation corresponds to GRAV in Figure 9.4. Here υ







Figure 9.4 The mass inside given radii of different mass components of the A85 galaxy 
cluster. GAL, mass in galaxies; GAS, mass in intracluster gas; GRAV, total gravitating mass; 
DARK, missing mass component. From reference [9, 10] courtesy of J. Nevalainen. 



denotes the local enhancement of baryon density in a cluster compared with the
universal baryon density, obtained from simulations, and the f_gas value is an
average for six clusters studied by the Chandra observatory [11]. Using the Ω_b
value from our inventory above, one finds



Ω_m ≈ 0.34. (9.38)



Although not a precision determination of Ω_m, it is a demonstration that the
missing DM in galaxy clusters is considerable.
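Assuming the gas-fraction relation takes the form f_gas = υΩ_b/Ω_m, it can be inverted for Ω_m. The enhancement factor υ below is an assumed illustrative value (not quoted in the text), chosen to be close to simulation results and to reproduce the estimate of Equation (9.38):

```python
# Invert the assumed gas-fraction relation: Omega_m = upsilon * Omega_b / f_gas
f_gas   = 0.113     # Chandra six-cluster average quoted in the text
omega_b = 0.044     # baryon density from the inventory
upsilon = 0.87      # assumed local baryon-enhancement factor (illustrative)

omega_m = upsilon * omega_b / f_gas
print(f"Omega_m ~ {omega_m:.2f}")      # consistent with Equation (9.38)
```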

The amount of DM in rich clusters can also be estimated from the cosmic shear 
in weak lensing. The advantage of this method is that it does not depend on the 
radiation emitted by matter. It gives a value of Ω_m ≈ 0.3, but not yet with an
interesting precision.

Strong evidence for DM also comes from simulations of structure formation and
their comparison with the observed structures in the sky. We shall come to this
subject in Section 9.5.



9.4 Dark Matter Candidates 



If only a few per cent of the total mass of the Universe is accounted for by stars and
hydrogen clouds, could baryonic matter in other forms make up DM? The answer
given by nucleosynthesis is a qualified no: all baryonic DM is already included
in Ω_b.




Dark Baryonic Matter. Before the value of Ω_dm was pinned down as accurately
as it is now, several forms of dark baryonic matter were considered. Gas or dust
clouds were the first thing that came to mind. We have already accounted for hot 
gas because it is radiating and therefore visible. Clouds of cold gas would be dark 
but they would not stay cold forever. Unless there exists vastly more cold gas than 
hot gas, which seems unreasonable, this DM candidate is insufficient. 

It is known that starlight is sometimes obscured by dust, which in itself is invis- 
ible if it is cold and does not radiate. However, dust grains re-radiate starlight in 
the infrared, so they do leave a trace of their existence. But the amount of dust and 
rocks needed as DM would be so vast that it would have affected the composition 
of the stars. For instance, it would have prevented the formation of low-metallicity 
(population-II) stars. Thus dust is not an acceptable candidate. 

Snowballs of frozen hydrogenic matter, typical of comets, have also been con- 
sidered, but they would sublimate with time and become gas clouds. A similar fate 
excludes collapsed stars: they eject gas which would be detectable if their number 
density were sufficient for DM. 

A more serious candidate for baryonic matter has been jupiters or brown dwarfs:
stars of mass less than 0.08 M_⊙. They also go under the acronym MACHO, for Mas-
sive Compact Halo Object. They lack sufficient pressure to start hydrogen burning,
so their only source of luminous energy is the gravitational energy lost during 
slow contraction. Such stars would clearly be very difficult to see since they do 
not radiate. However, if a MACHO passes exactly in front of a distant star, the 
MACHO would act as a gravitational microlens, because light from the star bends 
around the massive object. The intensity of starlight would then be momentarily 
amplified (on a timescale of weeks or a few months) by microlensing, as described 
in Section 2.6. The problem is that even if MACHOs were relatively common, one 
has to monitor millions of stars for one positive piece of evidence. At the time 
of writing only a few microlensing MACHOs have been discovered in the space 
between Earth and the Large Magellanic Cloud [12, 13], but their contribution to
Ω_b cannot be precisely evaluated.

The shocking conclusion is that the predominant form of matter in the Universe 
is nonbaryonic, and we do not even know what it is composed of! Thus we are 
ourselves made of some minor pollutant, a discovery which may well be called 
the fourth breakdown of the anthropocentric view. The first three were already 
accounted for in Chapter 1. 



Black Holes. Primordial black holes could be good candidates because they 
evade the nucleosynthesis bound, they are not luminous, they (almost) do not 
radiate, and if they are big enough they have a long lifetime, as we saw in Equa-
tion (3.30). They are believed to sit at the centre of every galaxy and have masses
exceeding 100 M_⊙. The mass range 0.3-30 M_⊙ is excluded by the nonobservation
of MACHOs in the galactic halo (cf. Section 3.4). Various astrophysical consider-
ations limit their mass to around 10⁴ M_⊙. But black holes do not appear to be a
solution to the galactic rotation curves which require radially distributed DM in 
the haloes. 




CDM. Particles which were very slow at the time t_eq when galaxy formation started
are candidates for CDM. If these particles are massive and have weak interactions,
so-called WIMPs (Weakly Interacting Massive Particles), they became nonrelativistic
much earlier than the leptons and decoupled from the hot plasma. For
instance, the supersymmetric models briefly discussed in Section 6.6 contain a 
very large number of particles, of which the lightest ones would be stable. At least 
three such neutral SUSY 'sparticles'— the photino, the Zino and the Higgsino— or 
a linear combination of them (the neutralino) could serve. Laboratory searches 
have not found them up to a mass of about 37 GeV, but they could be as heavy as
1 TeV.

Very heavy neutrinos, m_ν > 45 GeV, could also be CDM candidates, as could other
cold thermal relics of mass up to 300 TeV, and superheavy nonthermal wimpzillas
at the inflaton scale with masses of 10⁹-10¹⁹ GeV. The latter are produced by the
gravitational expansion and could also be useful for explaining extremely hard 
cosmic y-rays. 

Alternatively, the CDM particles may be very light if they have some superweak 
interactions, in which case they froze out early when their interaction rate became 
smaller than the expansion rate, or they never even attained thermal equilibrium. 
Candidates in this category are the axion and its SUSY partner, the axino. The axion
is a light pseudoscalar boson with a 2γ coupling like the π⁰, so it could con-
vert to a real photon by exchanging a virtual photon with a proton. Its mass is
expected to be of the order of 1 μeV to 10 meV. It was invented to prevent CP
violation in QCD, and it is related to a slightly broken baryon number symmetry
in a five-dimensional space-time. Another CDM candidate could be axion clusters
with masses of the order of 10⁻⁸ M_⊙.

Among further exotica are solitons, which are nontopological scalar-field quanta
with conserved global charge Q (Q-balls) or baryonic charge B (B-balls).

The WIMPs would traverse terrestrial particle detectors with a typical virial
velocity of the order of 200 km s⁻¹, and perhaps leave measurable recoil energies
in their elastic scattering with protons. The proof that a detected recoil
was due to a particle in the galactic halo would be the annual modulation of the
signal: because of the motion of the Earth around the Sun, the signal should have
a maximum in June and a minimum in December. Several experiments to detect
such signals are currently running or being planned, but so far the absence of
signals only permits us to set upper limits on the WIMP flux.
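The June/December modulation follows from simple kinematics: the Earth's orbital velocity alternately adds to and subtracts from the Sun's motion through the halo. A minimal numerical sketch (all velocity values and the phase below are typical assumed numbers, not from the text):

```python
import math

# Illustrative sketch: the WIMP "wind" speed seen by a terrestrial detector is
# roughly the solar motion through the halo plus the projection of the Earth's
# orbital velocity onto it. All numbers are assumed typical values.
V_SUN = 230.0    # km/s, solar motion through the Galactic halo (assumed)
V_ORB = 30.0     # km/s, Earth's orbital speed around the Sun
COS_INCL = 0.5   # cosine of the ~60 deg tilt of the ecliptic to V_SUN (assumed)

def detector_speed(day_of_year):
    """Speed of the detector relative to the WIMP halo, in km/s.
    Peaks around day ~152 (early June), minimum six months later."""
    phase = 2 * math.pi * (day_of_year - 152) / 365.25
    return V_SUN + V_ORB * COS_INCL * math.cos(phase)

june = detector_speed(152)      # maximum
december = detector_speed(335)  # minimum
amplitude = 100 * (june - december) / (june + december)
print(f"June: {june:.1f} km/s, December: {december:.1f} km/s")
print(f"modulation amplitude: {amplitude:.1f} %")
```

With these assumed numbers the relative velocity swings between roughly 215 and 245 km s⁻¹, a few per cent modulation of the event rate, which is why very low backgrounds and long exposures are needed.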

All WIMPs have in common that they are hitherto unobserved particles which
exist only in some theories. A signal worth looking for would be monoenergetic
photons from their annihilation

χ_dm + χ_dm → 2γ. (9.39)

Several experiments are planned or under way to observe these photons if they 
exist. 



WIMP Distribution. The ideal fluid approximation, which holds for the collisionless
WIMPs on large scales, breaks down when they decouple from the







Figure 9.5 The evolution of density fluctuations δρ/ρ in the baryon and WIMP components.
The perturbations in the WIMPs begin to grow at the epoch of matter-radiation
equality. However, the perturbations in the baryons cannot begin to grow until just after 
decoupling, when baryons fall into the WIMP potential wells. Within a few expansion times 
the baryon perturbations 'catch up' with the WIMP perturbations. The dashed line shows 
where the baryonic density fluctuations would have to start if DM were purely baryonic. 
Courtesy of E. W. Kolb and M. S. Turner. 

plasma and start to stream freely out of overdense regions and into underdense
regions, thereby erasing all small inhomogeneities (Landau damping). This defines
the characteristic length and mass scales for freely streaming particles of mass m_dm:

λ_fs ≈ 40 (30 eV/m_dm) Mpc, (9.40)

M_fs ≈ 3 × 10¹⁵ (30 eV/m_dm)² M☉. (9.41)
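A quick numerical check of these free-streaming scalings (using the coefficients 40 Mpc and 3 × 10¹⁵ M☉ at m = 30 eV as reconstructed here) shows why light particles erase galaxy-scale structure while heavy WIMPs do not:

```python
# Free-streaming scalings, coefficients as reconstructed in (9.40)-(9.41):
# lambda_fs = 40 Mpc and M_fs = 3e15 M_sun for a 30 eV particle.
def lambda_fs_mpc(m_ev):
    """Free-streaming length in Mpc for particle mass m_ev (in eV)."""
    return 40.0 * (30.0 / m_ev)

def m_fs_msun(m_ev):
    """Free-streaming mass in solar masses for particle mass m_ev (in eV)."""
    return 3e15 * (30.0 / m_ev) ** 2

# A 30 eV neutrino (HDM), a keV-scale WDM particle, and a GeV-scale WIMP (CDM)
for m in (30.0, 1e3, 1e9):
    print(f"m = {m:.0e} eV: lambda_fs ~ {lambda_fs_mpc(m):.2e} Mpc, "
          f"M_fs ~ {m_fs_msun(m):.2e} M_sun")
```

For a 30 eV neutrino the damping mass is a supercluster, 3 × 10¹⁵ M☉, whereas for a GeV WIMP it is utterly negligible; this is the quantitative basis for the HDM/CDM distinction drawn below.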



Perturbations in CDM start growing from the time of matter-radiation equality,
while baryonic fluctuations are inhibited until recombination because of the tight
coupling with photons (or, alternatively, because of the large baryonic
Jeans mass prior to recombination). After recombination, the baryons fall into the
CDM potential wells. A few expansion times later, the baryon perturbations catch
up with the WIMPs, and both then grow together until δ ≥ 1, when perturbations
become Jeans unstable, collapse and virialize. The amplitude of radiation, however,
is unaffected by this growth, so the CMB anisotropies remain at the level
determined by the baryonic fluctuations just before recombination. This is illustrated
in Figure 9.5.

The lightest WIMPs are slow enough at time t_eq to be bound in perturbations
on galactic scales. They should then be found today in galaxy haloes together 




with possible baryonic MACHOs. If the nonbaryonic DM in our galactic halo were 
to be constituted by WIMPs at a sufficient density, they should also be found 
inside the Sun, where they lose energy in elastic collisions with the protons, and 
ultimately get captured. They could then contribute to the energy transport in the 
Sun, modifying solar dynamics noticeably. So far this possible effect has not been 
observed. 

On the other hand, if the WIMP overdensities only constituted early potential
wells for the baryons, but did not cluster so strongly, most WIMPs would by now
have leaked out into intergalactic space. In that case the WIMP distribution
in clusters would be more uniform than the galaxy (or light) distribution, so that
galaxies would not trace mass.

Hot and Warm Dark Matter. Although the neutrinos decoupled from the ther- 
mal plasma long before matter domination, they remained relativistic for a long 
time because of their small mass. For this reason they would possibly constitute
hot dark matter (HDM), freely streaming at t_eq. The accretion of neutrinos
to form haloes around the baryon clumps would be a much later process. The 
CMB is then very little perturbed by the clumps, because most of the energy is 
in neutrinos and in radiation. However, we already know that the neutrino frac- 
tion is much too small to make it a DM candidate, so HDM is no longer a viable 
alternative. 

An intermediate category is constituted by possible sterile neutrinos and by 
the gravitino, which is a SUSY partner of the graviton. These have been called 
warm dark matter (WDM). Both HDM and WDM are now ruled out by computer 
simulations of the galaxy distribution in the sky. WDM is also ruled out by the 
WMAP detection of early re-ionization at z > 10 [1]. We shall therefore not discuss 
these alternatives further. 



9.5 The Cold Dark Matter Paradigm 

The ΛCDM paradigm (which could simply be called CDM, since CDM without an
Ω_Λ component is ruled out) is based on all the knowledge we have assembled so
far: the FLRW model with a spatially flat geometry; BBN and thermodynamics with
a known matter inventory, including dark energy of unknown origin but known
density; inflation-caused linear, adiabatic, Gaussian mass fluctuations accompanying
the CMB anisotropies, with a nearly scale-invariant Harrison–Zel'dovich power
spectrum; growth by gravitational instability from t_eq until recombination, and
from hot gas to star formation and hierarchical clustering.

The new element in this scenario is collisionless DM, which caused matter dom- 
ination to start much earlier than if there had been only baryons. The behaviour 
of DM is governed exclusively by gravity (unless we discover any DM interactions 
with matter or with itself), whereas the formation of the visible parts of galaxies 
involves gas dynamics and radiative processes. 




While the CMB temperature and polarization anisotropies measure fluctuations 
at recombination, the galaxy distribution measures fluctuations up to present 
times. Cosmic shear in weak lensing is sensitive to the distribution of DM directly, 
but it leaves a much weaker signal than do clusters. 



Hierarchical Scenarios. Early CDM models (without an Ω_Λ component) produced
galaxies naturally, but underproduced galaxy clusters and supergalaxies of mass
scale 10¹⁵ M☉. This was an example of a bottom-top scenario, where small-scale
structures were produced first and large-scale structures had to be assembled
from them later. Although there was not enough time in this scenario to produce
large-scale structures within the known age of the Universe, the scenario could
be improved by the introduction of a new degree of freedom, the cosmological
constant.

The opposite 'top-bottom' scenario was predicted by HDM models where the 
first structures, supergalaxies, formed by neutrino clouds contracting into pan- 
cakes which subsequently collapsed and disintegrated. Smaller structures and 
galaxies formed later from the crumbs. But computer simulations of pancake for- 
mation and collapse show that the matter at the end of the collapse is so shocked 
and so heated that the clouds do not condense but remain ionized, unable to form 
galaxies and attract neutrino haloes. Moreover, large clusters (up to 10¹⁴ M☉) have
higher escape velocities, so they should trap five times more neutrinos than large
galaxies of size 10¹² M☉. This scenario is not supported by observations, which
show that the ratio of dynamic mass to luminous mass is about the same in objects 
of all sizes. 

The 'bottom-top' scenario is supported by the observations that supergalaxies 
are typically at distances z < 0.5, whereas the oldest objects known are quasars at 
redshifts up to z = 5-7. There are also several examples of galaxies which are older 
than the groups in which they are now found. Moreover, in our neighbourhood the 
galaxies are generally falling in towards the Virgo cluster rather than streaming 
away from it. 

Several pieces of evidence indicate that luminous galaxies could have been 
assembled from the merging of smaller star-forming systems before z « 1. The 
Hubble Space Telescope as well as ground-based telescopes have discovered vast 
numbers of faint blue galaxies at 1 ≲ z ≲ 3.5, which obviously are very young.
There is also evidence that the galaxy merger rate was higher in the past, increasing
roughly as a^−m or (1 + z)^m with m ≈ 2–3. All this speaks for a bottom-top
scenario. 



Galaxy Surveys. Two large ongoing surveys of the nearby Universe are the
2-degree Field Galaxy Redshift Survey (2dFGRS), which has reported studies of
nearly 250 000 galaxies within z < 0.25 (blue magnitude limit 19.45) [2, 14, 15],
and the Sloan Digital Sky Survey (SDSS), which will provide data on a million galaxies
out to a magnitude of 23 [16]. At the time of writing, only the results of the
2dFGRS studies have been reported.







Figure 9.6 The distribution of galaxies in part of the 2dFGRS, drawn from a total of 
213 703 galaxies. Reproduced from reference [14] by permission of the 2dFGRS Team. 



The purpose of 2dFGRS is, to cite J. A. Peacock [14], 

(i) 'To measure the galaxy power spectrum P(k) on scales up to a few hun- 
dred Mpc, bridging the gap between the scales of nonlinear structures and 
measurements from the CMB'; 

(ii) 'To measure the redshift-space distortion of the large-scale clustering that
results from the peculiar velocity field produced by the mass distributions';
and

(iii) 'To measure higher-order clustering statistics in order to understand biased
galaxy formation, and to test whether the galaxy distribution on large scales
is a Gaussian random field'.

Distributions of galaxies in two-dimensional pictures of the sky show that they
form long filaments separating large underdense voids with diameters up to
60 h⁻¹ Mpc. Figure 9.6 shows such a picture of 82 821 galaxies from a total of
213 703 galaxies selected by the 2dFGRS. The image reveals a wealth of detail, including
linear supercluster features, often nearly perpendicular to the line of sight.

In three-dimensional pictures of the Universe seen by the earlier Infrared Astronomical
Satellite IRAS [17, 18] in an all-sky redshift survey, the filaments turn out
to form dense sheets of galaxies, of which the largest one is the 'Great Wall', which
extends across 170 h⁻¹ Mpc in length and 60 h⁻¹ Mpc in width.



Large Scale Structure Simulation. The formation and evolution of cosmic struc- 
tures is so complex and nonlinear and the number of galaxies considered so enor- 
mous that the theoretical approach must make use of either numerical simula- 
tions or semi-analytic modelling. The strategy in both cases is to calculate how 




density perturbations emerging from the Big Bang turn into visible galaxies. As
summarized by C. S. Frenk, J. A. Peacock and collaborators in the 2dFGRS Team
[2], this requires treating a number of processes in a phenomenological manner:

(i) the growth of DM haloes by accretion and mergers; 

(ii) the dynamics of cooling gas; 

(iii) the transformation of cold gas into stars; 

(iv) the spectrophotometric evolution of the resulting stellar populations; 

(v) the feedback from star formation and evolution on the properties of prestel- 
lar gas; and 

(vi) the build-up of large galaxies by mergers. 

The primary observational information consists of a count of galaxy pairs in the
redshift space (σ, π). From this, the correlation function ξ(s) in redshift space,
and subsequently the correlation function ξ(r) in real space, can be evaluated.
Recall that ξ(s) and ξ(r) are related by Equation (9.33) via the parameter β in
Equations (9.32). From ξ(r), the power spectrum P(k) can in principle be constructed
using its definition in Equations (9.8) and (9.9).
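The construction of P(k) from ξ(r) can be illustrated with a small numerical transform. This is an illustrative sketch only: the isotropic Fourier-pair convention used below is the standard one, P(k) = 4π ∫ ξ(r) [sin(kr)/(kr)] r² dr, and the Gaussian ξ(r) is a made-up toy chosen because its transform is known in closed form:

```python
import numpy as np

def power_spectrum(xi, r, k):
    """Numerically transform a tabulated xi(r) (on a uniform r grid) to P(k)
    via P(k) = 4*pi * Int xi(r) sin(kr)/(kr) r^2 dr (standard convention)."""
    kr = np.outer(k, r)
    # np.sinc(x) = sin(pi x)/(pi x), so np.sinc(kr/pi) = sin(kr)/(kr)
    integrand = xi * np.sinc(kr / np.pi) * r**2
    dr = r[1] - r[0]
    return 4 * np.pi * integrand.sum(axis=1) * dr   # simple Riemann sum

r = np.linspace(1e-3, 50.0, 4000)        # h^-1 Mpc, uniform grid
sigma = 5.0                              # toy correlation length, h^-1 Mpc
xi = np.exp(-r**2 / (2 * sigma**2))      # made-up Gaussian xi(r)
k = np.array([0.01, 0.1, 1.0])           # h Mpc^-1
P = power_spectrum(xi, r, k)

# Analytic transform of the Gaussian, for comparison
P_exact = (2 * np.pi) ** 1.5 * sigma**3 * np.exp(-(k * sigma) ** 2 / 2)
print(P, P_exact)
```

The numerical and analytic values agree at the per-cent level on large scales, which is all this sketch is meant to show; real survey analyses work from pair counts and a selection function, as described in the next paragraph.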

The observed count of galaxy pairs is compared with the count estimated from
a randomly generated mass distribution following the same selection function
both on the sky and in redshift. Different theoretical models generate different
simulations, depending on the values of a large number of adjustable parameters:
h, Ω_m h = (Ω_dm h² + Ω_b h²)/h, Ω_b/Ω_m, n_s, the normalization σ₈ and the bias
b between galaxies and mass.

The CDM paradigm sets well-defined criteria on the real fluctuation spectrum.
A good fit then results in parameter values. Since the parameter combinations
here are not the same as in the CMB analysis, the degeneracy in the 2dFGRS data
between Ω_m h and Ω_b/Ω_m can be removed by combining the CMB and 2dFGRS
analyses. Let us now summarize a few of the results.

If the simulated mass-correlation function ξ_dm(r) and the observed galaxy-number
two-point correlation function ξ_gal(r) are identical, this implies that light
(from galaxies) traces mass exactly. If not, they are biased to a degree described
by b. The result is that there is no bias at large scales, as indeed predicted by 
theory, but on small scales some anti-bias is observed. This result is a genuine 
success of the theory because it does not depend on any parameter adjustments. 
Independently, weak lensing observations also show that visible light in clusters 
does trace mass (all the visible light is emitted by the stars in galaxies, not by 
diffuse emission), but it is not clear whether this is true on galaxy scales. 

Theoretical models predict that the brightest galaxies at z = 3 should be 
strongly clustered, which is indeed the case. This comparison is also independent 
of any parameter adjustments. In contrast, DM is much more weakly clustered at 
z = 3 than at z = 0, indicating that galaxies were strongly biased at birth. 

In Figure 9.7 we show the theoretical linear power spectrum P(k). The real 
2dFGRS galaxy power spectrum data lie so accurately on the theoretical curve in 







Figure 9.7 The power function P(k) as a function of the density fluctuation wavenumber
k in units of Ω h² Mpc⁻¹. This can also be expressed by the angular scale in degrees or by the
linear size L of present cosmic structures in units of (Ω h²)⁻¹ Mpc. Courtesy of C. Frenk
and J. Peacock.



Figure 9.7 that we have refrained from plotting them. To achieve this success, all 
the free parameters have been adjusted. 

One important signature of gravitational instability is that material collapsing
around overdense regions should exhibit peculiar velocities and infall, leading
to redshift-space distortions of the correlation as shown in Figure 9.3. We have
previously referred to large-scale bulk flows of matter observed within the LSC,
attributed to the 'Great Attractor', an overdensity of mass about 5.4 × 10¹⁶ M☉
in the direction of the Hydra–Centaurus cluster, but far behind it, at a distance
of some 44 h⁻¹ Mpc. The 2dFGRS has verified that both types of redshift-space
distortions occur: the 'Fingers of God' due to nearby peculiar velocities and the
flattening due to infall at larger distances. These results are quantified by the
parameter value

β = Ω_m^0.6/b = 0.43 ± 0.07. (9.42)

With a large-scale bias of b = 1 to a precision of about 10%, the β error dominates,
so that one obtains

Ω_m = 0.25 ± 0.07, (9.43)

in good agreement with other determinations. 
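The step from (9.42) to (9.43) is a one-line inversion, Ω_m = (βb)^{1/0.6}, plus error propagation. A hedged numerical check, assuming the quoted uncertainties are independent:

```python
# With beta = Omega_m^0.6 / b and b ~= 1, invert to Omega_m = (beta*b)**(1/0.6).
# Simple linear error propagation, assuming independent uncertainties.
beta, d_beta = 0.43, 0.07
b, d_b = 1.0, 0.10

omega_m = (beta * b) ** (1 / 0.6)
# d(Omega)/Omega = (1/0.6) * sqrt((d_beta/beta)^2 + (d_b/b)^2)
rel_err = (1 / 0.6) * ((d_beta / beta) ** 2 + (d_b / b) ** 2) ** 0.5
d_omega = omega_m * rel_err
print(f"Omega_m = {omega_m:.2f} +/- {d_omega:.2f}")
```

This reproduces Ω_m ≈ 0.25 with an uncertainty of about 0.08, dominated by the β error, consistent with the quoted 0.25 ± 0.07.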




The WMAP collaboration [1] obtained a value for the characteristic amplitude
of velocity fluctuations within 8 h⁻¹ Mpc spheres at z = 0,

σ₈ = 0.9 ± 0.1,

by setting β = σ₈ Ω_m^0.6.

This result from the 2dFGRS, as well as other parametric results, is very useful 
and was combined with CMB data in Section 8.4. 

To summarize one can state that on scales larger than a few Mpc the distribution 
of DM in CDM models is essentially understood. Understanding the inner structure 
of DM haloes and the mechanisms of galaxy formation has proved to be much 
more difficult. 



Problems 

1. The mean free path ℓ of photons in homogeneous interstellar dust can be
found from Equation (1.4), assuming that the radius of dust grains is 10⁻⁷ m.
Extinction observations indicate that ℓ ≈ 1 kpc at the position of the Solar
System in the Galaxy. What is the number density of dust grains [19]?

2. Derive Equation (9.35). On average the square of the mean random velocity
v² of galaxies in spherical clusters is three times larger than V², where V is
the mean line-of-sight velocity displacement of a galaxy with respect to the
cluster centre. Calculate M for V = 1000 km s⁻¹ and R = 1 Mpc in units of
M☉ [19].
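A numerical sketch of Problem 2, assuming the virial estimate takes the form M ≈ ⟨v²⟩R/G = 3V²R/G (the exact numerical factor depends on the form of Equation (9.35), which is not reproduced here):

```python
G = 6.674e-11        # m^3 kg^-1 s^-2
M_SUN = 1.989e30     # kg
MPC = 3.086e22       # m

V = 1000e3           # m/s, line-of-sight velocity dispersion (given)
R = 1.0 * MPC        # cluster radius (given)

# <v^2> = 3 V^2 for isotropic orbits; virial estimate M ~ <v^2> R / G
# (assumed form; the prefactor in Equation (9.35) may differ by O(1))
M = 3 * V**2 * R / G
print(f"M ~ {M / M_SUN:.1e} M_sun")
```

This gives M of order 7 × 10¹⁴ M☉, a rich-cluster mass scale, which is the point of the exercise.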

3. Suppose that galaxies have flat rotation curves out to R_max. The total
mass inside R_max is given by Equation (9.26), where v may be taken to be
220 km s⁻¹. If the galaxy number density is n = 0.01 h³ Mpc⁻³, show that
Ω = 1 when R_max is extended out to an average intergalactic distance of
2.5 h⁻¹ Mpc [3].

4. To derive the Jeans wavelength λ_J and Jeans mass M_J (see Equation (9.22)), let us
argue as follows. A mass M_J = ρλ_J³ composed of a classical perfect gas will
collapse gravitationally if its internal pressure P = ρkT/m cannot withstand
the weight of a column of material of unit area and height λ_J. Here m is the
mean mass of the particles forming the gas. If we set the weight of the latter
greater than or equal to P,

GM_J ρ/λ_J ≳ ρkT/m,

we will obtain a constraint on the sizes of fragments which will separate gravitationally
out of the general medium. Show that this leads to Equation (9.22)
[19].
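The column-weight argument of Problem 4 gives λ_J ~ √(kT/(Gρm)) up to factors of order unity. A numerical sketch with assumed interstellar-cloud values (T = 10 K, n = 10¹⁰ m⁻³ of hydrogen, illustrative numbers only):

```python
import math

K_B = 1.381e-23      # J/K
G = 6.674e-11        # m^3 kg^-1 s^-2
M_H = 1.673e-27      # kg, hydrogen mass
M_SUN = 1.989e30     # kg

def jeans_scales(T, n, m=M_H):
    """Jeans length (m) and mass (kg) from the column-weight argument:
    lambda_J ~ sqrt(kT / (G rho m)), M_J = rho * lambda_J^3 (O(1) factors dropped)."""
    rho = n * m
    lam = math.sqrt(K_B * T / (G * rho * m))
    return lam, rho * lam**3

# A cold, dense interstellar clump (assumed illustrative numbers)
lam, mj = jeans_scales(T=10.0, n=1e10)
print(f"lambda_J ~ {lam:.2e} m, M_J ~ {mj / M_SUN:.1f} M_sun")
```

For these assumed values the Jeans length is a fraction of a parsec and the Jeans mass a few solar masses, i.e. a star-forming clump, which illustrates why this criterion selects stellar-scale fragments in cold gas.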

5. Suppose that neutralinos have a mass of 100 GeV and that they move with
a virial velocity of 200 km s⁻¹. How much recoil energy would they impart
to a germanium nucleus?
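A sketch of Problem 5: for nonrelativistic elastic scattering the maximum recoil energy (head-on collision) is E_max = 2μ²v²/m_N, with μ the WIMP-nucleus reduced mass. Taking the Ge mass number as 73 is an assumption made here:

```python
# Masses in GeV/c^2, velocity in units of c; then E_max = 2 mu^2 v^2 / m_N is in GeV.
C = 2.998e5                 # km/s, speed of light
m_chi = 100.0               # GeV, neutralino mass (given)
m_ge = 73 * 0.9315          # GeV, Ge nucleus, ~A * (atomic mass unit) (assumed A=73)

v = 200.0 / C               # virial velocity in units of c (given: 200 km/s)
mu = m_chi * m_ge / (m_chi + m_ge)          # reduced mass
e_max_kev = 2 * mu**2 * v**2 / m_ge * 1e6   # GeV -> keV

print(f"E_max ~ {e_max_kev:.0f} keV")
```

The answer is of order 20 keV, which explains why direct-detection experiments must resolve recoil energies of a few keV to tens of keV.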




6. The universal luminosity density radiated in the blue waveband by galaxies
is

L_U = (2 ± 0.2) × 10⁸ h L☉ Mpc⁻³.

Show that the Coma value Υ = M/L = 300 in solar units then gives Ω_m =
0.30.
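A sketch of Problem 6: Ω_m = Υ L_U/ρ_c. One power of h survives, so a value of h must be assumed (h = 0.7 here); the critical density ρ_c = 2.775 × 10¹¹ h² M☉ Mpc⁻³ is the standard value:

```python
# Omega_m = Upsilon * L_U / rho_crit. The leftover factor of 1/h requires an
# assumed h; h = 0.7 is used here for illustration.
h = 0.7
upsilon = 300.0                  # Coma M/L in solar units (given)
L_U = 2e8 * h                    # L_sun / Mpc^3 (given)
rho_crit = 2.775e11 * h**2       # M_sun / Mpc^3 (standard critical density)

omega_m = upsilon * L_U / rho_crit
print(f"Omega_m ~ {omega_m:.2f}")
```

With h = 0.7 this gives Ω_m ≈ 0.31, i.e. the quoted 0.30 within the uncertainty of L_U.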

7. Assuming that galaxies form as soon as there is space for them, and that
their mean radius is 30 h⁻¹ kpc and their present mean number density is
0.03 h³ Mpc⁻³, estimate the redshift at the time of their formation [2].
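A sketch of Problem 7, reading 'space for them' as the condition that galaxy spheres of fixed physical radius fill the comoving volume, n₀(1 + z)³ · (4π/3)R³ = 1; other packing criteria would shift the answer somewhat, and note that h cancels:

```python
import math

# Fill condition: n0 * (1+z)^3 * (4 pi / 3) * R^3 = 1 (assumed reading of
# "space for them"; touching-spheres criteria give a somewhat smaller z).
R = 0.030      # h^-1 Mpc, mean galaxy radius (given: 30 h^-1 kpc)
n0 = 0.03      # h^3 Mpc^-3, present number density (given); h cancels below

one_plus_z = (3 / (4 * math.pi * R**3 * n0)) ** (1 / 3)
z_form = one_plus_z - 1
print(f"z ~ {z_form:.0f}")
```

This crude estimate gives z of order 60–70, well before the oldest observed objects, as the problem intends.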

Chapter Bibliography 

[1] Bennett, C. L. et al. 2003 Preprint arXiv, astro-ph/0302207 and 2003 Astrophys. J. (In 
press.) and companion papers cited therein. 

[2] Hawkins, E. et al. 2002 arXiv astro-ph/0212375 and Mon. Not. R. Astron. Soc. (2003). 

[3] Peebles, P. J. E. 1993 Principles of Physical Cosmology. Princeton University Press, 
Princeton, NJ. 

[4] Tremaine, S. and Gunn J. E. 1979 Phys. Rev. Lett. 42, 407. 

[5] Battaner, E. et al. 1992 Nature 360, 65. 

[6] Navarro, J. F., Frenk, C. S. and White, S. D. M. 1996 Astrophys. J. 462, 563. 

[7] Corbelli, E. 2003 arXiv astro-ph/0302318 and Mon. Not. R. Astron. Soc. 

[8] Ponman, T. J. and Bertram, D. 1993 Nature 363, 51. 

[9] David, L. P., Jones, C. and Forman, W. 1995 Astrophys. J. 445, 578. 
[10] Nevalainen, J. et al. 1997 Proc. 2nd Integral Workshop (ed. C. Winkler et al.), European
Space Agency Special Publication no. 382.
[11] Allen, S. W., Schmidt, R. W. and Fabian, A. C. 2002 Mon. Not. R. Astron. Soc. 334, L11.
[12] Alcock, C., Akerlof, C. W., Allsman, R. A. et al. 1993 Nature 365, 621.
[13] Auborg, E., Bareyre, P., Brehin, S. et al. 1993 Nature 365, 623.
[14] Peacock, J. A. 2002 In A new era in cosmology (ed. T. Shanks and N. Metcalfe). ASP
Conference Proceedings Series.
[15] Frenk, C. S. 2002 Phil. Trans. R. Soc. Lond. A360, 1277. 
[16] York, D. G. et al. 2000 Astr. J. 120, 1579. 

[17] Moore, R. L., Frenk, C. S., Weinberg, D. et al. 1992 Mon. Not. R. Astron. Soc. 256, 477. 
[18] Saunders, W., Frenk, C. S., Rowan-Robinson, M. et al. 1991 Nature, 349, 32. 
[19] Shu, F. H. 1982 The physical universe. University Science Books, Mill Valley, CA. 



10 
Epilogue 



We have now covered most of cosmology briefly and arrived happily at a Standard 
Model. However, it has many flaws and leaves many questions unanswered. Of 
course the biggest question is what caused the Universe? What caused the Big 
Bang in the first place, what caused it to be followed by inflation, what caused the 
inflation to stop, and what caused the properties of the elementary particles and 
their interactions? 

In Section 10.1 we discuss the properties of the initial singularity and related 
singularities in black holes and in a Big Crunch. 

In Section 10.2 we learn that there is a difference between The Beginning and 
The End. This defines a thermodynamically preferred direction of time. We also 
briefly discuss extra dimensions and some other open questions. We close with a 
brief outlook into the future of the Universe. 

10.1 Singularities 

The classical theory of gravity cannot predict how the Universe began. Gravity 
itself curls up the topology of the manifold on which it acts, and in consequence 
singularities can appear. The problem at time zero arises because time itself is 
created at that point. In studies of the global structure of space-time, Stephen 
Hawking and Roger Penrose have found [1, 2, 3] that gravity predicts singularities 
in two situations. One is the Big Bang 'white hole' singularity in our past at the 
beginning of time. In analogy with a black hole, we have defined a white hole as 
something which only emits particles and radiation, but does not absorb anything. 
The other case is a future with gravitationally collapsing stars and other massive 
bodies, which end up as black-hole singularities. In a closed universe, this is the 
ultimate Big Crunch at the end of time. In a singularity, the field equations of 
general relativity break down, so one can only say that 'the theory predicts that 
it cannot predict the Universe' [3]. Since the Universe is, and has been for some
time, undergoing an accelerated expansion, we shall not study a Big Crunch.

Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 470 84909 6 (cased) ISBN 470 84910 X (pbk) 



260 Epilogue 

Black Hole Analogy. Perhaps something can be learned about our beginning and
our end by studying the properties of black-hole singularities: the only ones we
may have a chance to observe indirectly. As we saw in Section 3.4, the Friedmann–
Lemaitre equations are singular at t = 0. On the other hand, the exponential de Sitter
solution (4.60), a(t) = exp(Ht), is regular at t = 0. In the Schwarzschild metric
(3.21) the coefficient of dt² is singular at r = 0, whereas the coefficient of dr²
is singular at r = r_c. However, if we make the transformation from the radial
coordinate r to a new coordinate u defined by

u² = r − r_c,

the Schwarzschild metric becomes

ds² = u²/(u² + r_c) dt² − 4(u² + r_c) du² − (u² + r_c)² (dθ² + sin²θ dφ²).

The coefficient of dt² is still singular at u² = −r_c, which corresponds to r = 0,
but the coefficient of du² is now regular at u² = 0.
A similar game can be played with the de Sitter metric (4.63), in which

g₀₀ = 1 − r²H², g₁₁ = −(1 − r²H²)⁻¹.

At r = H⁻¹, g₀₀ vanishes and g₁₁ is singular. In the subspace defined by 0 ≤ r ≤
H⁻¹ we can transform the singularity away by the substitution u² = H⁻¹ − r. The
new radial coordinate, u, is then in the range 0 ≤ u ≤ √(H⁻¹), and the nonsingular
metric becomes

ds² = (1 − r²H²) dt² − 4H⁻¹(1 + rH)⁻¹ du² − r²(dθ² + sin²θ dφ²). (10.1)

From these examples we see that some singularities may be just the consequence
of a badly chosen metric and not a genuine property of the theory. Moreover,
singularities often exist only in exact mathematical descriptions of physical phenomena,
and not in reality, when one takes into account limitations imposed by
observations. Or they do not exist at all, like the North Pole, which is an inconspicuous
location on the locally flat surface of a two-sphere. Yet this surface can
represent the complex plane, with the infinities ±∞ and ±i∞ located at the North
Pole.
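The claim that the substitution u² = H⁻¹ − r turns the singular de Sitter g₁₁ into the regular du² coefficient of Equation (10.1) can be verified symbolically. This is a verification sketch, not part of the text:

```python
import sympy as sp

# Check that u^2 = 1/H - r transforms g_11 = -(1 - r^2 H^2)^-1 into the
# regular coefficient -4 H^-1 (1 + rH)^-1 of Equation (10.1).
r, u, H = sp.symbols('r u H', positive=True)

g11 = -1 / (1 - r**2 * H**2)          # singular at r = 1/H
r_of_u = 1 / H - u**2                  # the substitution u^2 = 1/H - r
dr_du = sp.diff(r_of_u, u)             # dr = -2u du

# g11 dr^2 = g11 (dr/du)^2 du^2 : the new du^2 coefficient, in terms of u
new_g = sp.simplify(g11.subs(r, r_of_u) * dr_du**2)

# Expected coefficient from (10.1), written in terms of u via r = r_of_u
expected = sp.simplify(-4 / (H * (1 + r_of_u * H)))
assert sp.simplify(new_g - expected) == 0
print(sp.simplify(new_g.subs(u, sp.sqrt(1 / H - r))))   # -4/(H*(H*r + 1))
```

The symbolic result, −4/(H(1 + rH)), is finite everywhere in 0 ≤ r ≤ H⁻¹, confirming that the horizon singularity of g₁₁ was a coordinate artifact.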



Quantum Arguments. Possible escapes from the singularities in gravity could
come from considerations outside the classical theory in four-dimensional space-
time, for instance going to quantum theory or increasing the dimensionality of
space-time. In quantum gravity, the metric g_μν is a quantum variable which does
not have a precise value: it has a range of possible values and a probability density
function over that range. As a consequence, the proper distance from x^i to x^i + dx^i
is also a quantum variable which does not have a precise value; it can take on any
value. It has a probability distribution which peaks at the classical expectation
value ⟨ds²⟩. Or, phrasing this a little more carefully, the quantum variable of
proper distance is represented by a state which is a linear combination of all




possible outcomes of an observation of that state. Under the extreme conditions
near t = 0, nobody is there to observe, so 'observation' of ⟨ds²⟩ has to be defined
as some kind of interaction occurring at this proper distance.
Similarly, the cosmic scale factor a(t) in classical gravity has the exact limit

lim_{t→0} a(t) = 0 (10.2)

at the singularity. In contrast, the quantum scale factor does not have a well-
defined limit; it fluctuates with a statistical distribution having variance ⟨a²(t)⟩.
This expectation value approaches a nonvanishing constant [2]

lim_{t→0} ⟨a²(t)⟩ = C > 0. (10.3)

This resembles the situation of a proton–electron pair forming a hydrogen atom.
Classically the electron would spiral inwards under the influence of the attractive
Coulomb potential

V = −e²/r, (10.4)

which is singular at r = 0. In the quantum description of this system, the electron 

has a lowest stable orbit corresponding to a minimum finite energy, so it never 

arrives at the singularity. Its radial position is random, with different possible radii 

having some probability of occurring. Thus only the mean radius is well defined, 

and it is given by the expectation value of the probability density function. 
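The hydrogen analogy can be made concrete: for the ground-state wave function ψ ∝ exp(−r/a₀), the radius has no definite value, but its mean is finite, ⟨r⟩ = 3a₀/2, so the electron never reaches the r = 0 singularity. A short symbolic check:

```python
import sympy as sp

# Ground-state hydrogen: |psi_100|^2 = exp(-2r/a0) / (pi a0^3). Verify the
# normalization and compute the mean radius <r> = 3 a0 / 2.
r, a0 = sp.symbols('r a0', positive=True)

psi_sq = sp.exp(-2 * r / a0) / (sp.pi * a0**3)     # normalized |psi|^2
norm = sp.integrate(psi_sq * 4 * sp.pi * r**2, (r, 0, sp.oo))
mean_r = sp.integrate(r * psi_sq * 4 * sp.pi * r**2, (r, 0, sp.oo))

assert norm == 1
print(mean_r)   # 3*a0/2
```

The finite expectation value here plays the same role as the nonvanishing limit (10.3) does for the quantum scale factor.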

In fact, it follows from the limit (10.3) that there is a lower bound to the proper
distance between two points,

⟨ds²⟩ ≳ L_P², (10.5)

where L_P is the Planck length, 10⁻³⁵ m [2]. The light cone cannot then have a
sharp tip at t = 0 as in the classical picture. Somehow the tip of the cone must be
smeared out so that it avoids the singular point.

James Hartle and Stephen Hawking [4] have studied a particular model of this
kind in an empty space-time with a de Sitter metric (10.1) and a cosmological
constant. Although this model is unrealistic, because our Universe is not empty, it
is mathematically manageable and may perhaps lead to new insights. The exponential
de Sitter expansion takes place at times t > t_P, when the event horizon is
r < H⁻¹. As we have just noted above, the coefficient g₀₀ in Equation (10.1) vanishes
at r = H⁻¹. Hartle and Hawking propose that the metric changes smoothly
at this point to a purely spatial Euclidean metric, like the one in Equation (2.29), as
shown in Figure 10.1. Thus the de Sitter part of space-time contains no 'beginning
of time' with value t = 0, whereas in the Euclidean hemisphere, where r > H⁻¹,
there is no time coordinate at all: time has become a spatial coordinate τ = it.
Thus one could say (if our space-time would permit such a metric) that time
emerges gradually from this space without any abrupt coming into being. Time
is limited in the past, to within ⟨t²⟩ ≲ t_P², but it has no sharp boundary. In that
sense the classical singularity has disappeared, and there is no origin of time. As




Figure 10.1 An observer can see only part of any surface Σ. From S. Hawking and R. Penrose
[3], copyright 1996 by Princeton University Press. Reprinted by permission of Princeton
University Press.

Hawking expresses it, 'the boundary condition of the Universe is that it has no 
boundary' [1, 3]. 

The Universe then exists because of one unlikely fluctuation of the vacuum, in
which the energy ΔE created could be so huge because the time Δt was small
enough not to violate Heisenberg's uncertainty relation

Δt ΔE ≲ ℏ. (10.6)



However, the step from the Hartle-Hawking universe to a functioning theory of 
quantum gravity is very large and certainly not mastered. 

Some people take the attitude that quantum gravity is unimportant, because
inflation has erased all information about what went on before it. Inflation is
described to lowest order by classical fields, scalar and perhaps tensor, whereas
quantum mechanics enters only as fluctuations.



10.2 Open Questions 



The Direction of Time. Since entropy has been increasing all the time and cannot
decrease, this defines a preferred direction of time. Therefore the Big Crunch
singularity at t = 2t_max in a closed Universe is of the same kind as the singularity
in a black hole, and quite unlike the Big Bang white-hole singularity. In classical




gravity, the Universe cannot bounce and turn around to restart a new expansion
cycle: although classical gravity and quantum mechanics are symmetric in the
direction of time, the second law of thermodynamics is not. In
Section 7.5 we met speculations on a cyclic universe which apparently circumvents
this problem.

In his celebrated book A brief history of time [1], Stephen Hawking discusses 
the direction of time, which can be specified in three apparently different ways. 
Biological time is defined by the ageing of living organisms like ourselves, and by 
our subjective sense of time (we remember the past but not the future). But this 
can be shown to be a consequence of the thermodynamical direction of time. 

Apparently independent of this is the direction defined by the cosmic expan- 
sion which happens to coincide with the thermodynamical arrow of time. Hawk- 
ing, Laflamme and Lyons [5] have pointed out that the density perturbations also 
define a direction of time independent of the cosmic time because they always 
grow, whether the Universe is in an expanding or a contracting phase. 

The possible reason why the different arrows of time all point in the same
direction is still obscure. Perhaps one has to resort to the Anthropic Principle
[6] to 'understand' it, implying that the Universe could not be different if we
are to be able to exist and observe it. One would then conclude that the agreement
between all the different directions of time speaks for the necessity of the
Anthropic Principle.

Extra Dimensions. Another enigma is why space-time was created four dimensional.
The classical answer was given by Paul Ehrenfest (1880–1933), who demonstrated
that planetary systems with stable orbits in a central force field are possible
only if the number of spatial coordinates is two or three. However, extra
space-time dimensions are possible if the higher dimensions are unnoticeably
small.

Increasing the dimensionality of space-time to five (or more) may solve many 
problems naturally. We would then be living on a four-dimensional projection, 
a brane, curved in the fifth dimension at some extremely small compactification 
scale, perhaps the GUT scale, unable to move to any other brane. 

The singularity at time zero would only be apparent on our brane, whereas in 
the full five-dimensional space-time it would correspond to a finite point. Also 
the hierarchy problem could then be solved: the gravitational interaction appears 
so weak in comparison with the strong and electroweak interactions because the 
latter are confined to our three-dimensional spatial brane, while gravitation acts 
in the full five-dimensional space. Gravitation could also be related to supersym- 
metry in a higher-dimensional space-time. 

Different bubbles in Linde's chaotic foam could perhaps be connected through 
the fifth dimension if different bubbles exist on different branes. Such ideas 
appear fruitful in many contexts, but they are quite speculative and in addition 
mathematically complicated, so we shall not treat them further here. 



264 Epilogue 

Multiply Connected Universes. We have tacitly assumed that our Universe is
simply connected, so that no radial path takes us back to our present
space-time location. But this assumption could be abandoned without any dramatic
consequences. One can construct many versions of multiply connected universes
of finite size, in which one could, in principle, observe our Galaxy at some
distant location. Thus the idea is testable.

The problem is, however, that if we search for our Galaxy or some conspicuous
pattern of galaxies in another location of the sky, they may look quite different
from the way they do now because of the different time perspective and their
evolution history. We don't know how the Milky Way looked 10 billion years ago.
Searches for conspicuous patterns of quasars or similar galaxies in different
directions of the sky have been done, but no convincing evidence has been found.

Fine-Tuning Problems. The origin of the cosmological constant Λ is unknown
and its value has to be fine-tuned to within the 52nd decimal of zero (in units
of c = 1). To try to remedy this accident we introduced a time-dependent Λ(t)
initially coupled to inflation which, even so, had to be fine-tuned to within
17 decimal places (Chapter 7, Problem 6). The function Λ(t) is of course ad hoc,
depending on new parameters that characterize the timescale of the deflationary
period and the transfer of energy from dark energy to dust.

Tracking quintessence appeared at first sight to remove the need for fine-tuning
entirely, but at the cost of the arbitrariness in the choice of a parametric
potential V(φ). We may perhaps content ourselves with a choice of parameter
values fitting the initial conditions and the necessary properties of dark
energy, but this does not answer the even deeper coincidence problem of why dark
energy has come to dominate over visible energy just now, when we are here to
observe it.

Dark energy dominates now, its pressure is negative and its equation of state 
is w = -1 or very close to that. Could it cease to dominate later, so that the 
accelerated expansion ends, perhaps to be replaced by contraction? Is the present 
the first time in the history of the Universe when dark energy has dominated, or 
is dark energy a cosmological constant? One can generalize this question and 
ask whether other constants of nature are indeed constants, or whether it is a 
coincidence that they have their present values and appear constant. 

Missing Physics. The fluctuations during the inflation are the source of the CMB 
fluctuations at the time of decoupling, but to connect the two eras is difficult. 
The Universe then traversed the graceful exit with its production of particles and 
radiation, but even the particle spectrum at that time is unknown to us. Was there 
an era of supersymmetry, later to be broken? 

Equally enigmatic is the nature and origin of DM, of which we only know that 
it is 'cold', it does not interact with visible matter and it is not identical to dark 
energy [7]. Does it consist of primordial supersymmetry particles? It is eagerly 
expected that laboratory particle physics will give the answer. 

Open Questions 265

If we turn to astrophysics at large, there are many unknowns. We haven't even
mentioned the enormously energetic gamma-ray bursts from sources at
cosmological distances, the active galactic nuclei (AGN) or the ultra-high-energy
gamma rays of nearly 10^21 eV, coming from unknown sources and accelerated
by unknown mechanisms. We have discussed many aspects of galaxies, but the
astrophysics of galaxy formation is not understood, the thermal history of the
intergalactic medium and its baryon and metal content is not known, etc.

The Future. Since the Universe at present undergoes accelerated expansion, one
would not expect it to halt and turn around towards a Big Crunch. The expansion
dilutes the energy density ρ_m and the kinetic term ½φ̇² of the dark energy. If
they become much smaller than the potential V(φ) in Equation (4.68) or Equation
(7.36), and if it happens that V(φ) becomes negative, then the role of the dark
energy is inverted: it becomes attractive. The Universe then starts to contract
towards a Big Crunch with a rather more easily predictable future.

As contraction proceeds, galaxies start to merge and, when the ambient
temperature becomes equal to that of stellar interiors, stars disintegrate by
explosion. Stars are also disrupted by colossal tidal forces in the vicinity of
black holes, which grow by accreting stellar matter. As the temperature rises, we
run the expansion history backwards to recover free electrons, protons and
neutrons, subsequently free quarks, and finally free massless GUT fields at time
t_GUT before the final singularity. The role of black holes is model dependent,
but it is reasonable to imagine that all matter ends up in one black hole. What
then happens at the singularity escapes all prediction.

In an eternally expanding universe or an asymptotically coasting universe,
proton decay is a source of heat for dead stars until about time t_p. Long
before t_p, all stars have already exhausted their fuel to become either black
holes, neutron stars, black dwarfs or dead planets and their decay products:
electrons, positrons, neutrinos and a considerable amount of radiation, all of
which are responsible for most of the heat and entropy. The relic CMB has been
redshifted away to completely negligible energies. Almost all the energy density
of the Universe is dark energy.

From t_p on, the future is boring [6]. The radiation from decayed protons may
cause a brief period of increased radiation, of the order of 1000 t_p, followed
by the redshift of these decay photons to ever lower energies. Then the only
thing happening is the very slow formation of positronium by free electrons and
positrons. However, these atoms would have little resemblance to present-day
positronium. Their size would be of the order of 10^15 Mpc, and they would
rotate around their centre of gravity with an orbital velocity of about 1 μm per
century. In the end, each of these atoms would decay into some 10^22 ultrasoft
photons on a timescale comparable to the evaporation time of supergalaxy-sized
black holes.

Our last speculation concerns the fate of black holes. Since black holes devour
matter of all kinds, the outside world will lose all knowledge of what went into
the hole. Thus there is a net loss of information or entropy from the part of
our Universe which is (in principle) observable. Classically, one may argue that
the information is not lost, it is just invisible to us inside the event horizon.
But quantum theory permits the black hole to radiate and lose mass without
disclosing any other information than the temperature of the radiation. Once a
black hole has evaporated completely there will remain no entropy and no
information about its contents. Or, there remains a naked singularity, whatever
that implies. Then where did the huge amount of information or entropy in the
black hole go? Physics does not accept that it just vanished, so this problem
has stimulated a flood of speculations.

A singular point cannot simply 'appear' in the middle of space-time in such a
way that it becomes 'visible' at some finite future point. We must not be able
to observe a particle actually falling into a singularity, where the rules of
physics would cease to hold or reach infinity. This is Hawking's and Penrose's
hypothesis of cosmic censorship, already referred to in connection with black
holes: singularities should be protected from inspection, either because they
exist entirely in the future (Big Crunch), or entirely in the past (Big Bang),
or else they are hidden by an event horizon inside black holes [1, 2, 3].
Otherwise, since space-time has its origin in a singularity, perhaps all of
space-time would disappear at the appearance of a naked singularity.



Problems 

1. Assume that the protons decay with a mean life of t_p = 10^35 yr, converting
all their mass into heat. What would the ambient temperature on the surface
of the Earth be at time t = t_p, assuming that no other sources of heat
contribute?
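A rough numerical estimate along these lines can be sketched as follows. The
assumptions here are mine, not the book's solution: exponential decay, so that
at t = t_p the Earth's heating rate is (Mc²/t_p)e⁻¹, balanced against blackbody
emission from the surface.

```python
from math import exp, pi

# Hedged sketch for Problem 1: all protons and bound neutrons in the Earth
# decay with mean life t_p, their rest mass going into heat.  At t = t_p the
# surviving fraction is e^-1, so the decay luminosity is L = (M c^2 / t_p) e^-1.
# The equilibrium surface temperature follows from L = 4 pi R^2 sigma T^4.

t_p = 1e35 * 3.156e7       # mean proton life [s]
M_earth = 5.97e24          # Earth mass [kg]
R_earth = 6.371e6          # Earth radius [m]
c = 2.998e8                # speed of light [m/s]
sigma = 5.670e-8           # Stefan-Boltzmann constant [W m^-2 K^-4]

L = M_earth * c**2 * exp(-1) / t_p             # heating rate at t = t_p [W]
T = (L / (4 * pi * R_earth**2 * sigma))**0.25  # equilibrium temperature [K]
print(f"L = {L:.2e} W, T = {T:.1e} K")         # a few millikelvin
```

The answer comes out in the millikelvin range, far below even the redshifted
CMB temperature assumed to be negligible in the problem statement.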

Chapter Bibliography 

[1] Hawking, S. W. 1988 A brief history of time. Bantam Books, New York. 

[2] Penrose, R. 1989 The emperor's new mind. Oxford University Press, Oxford. 

[3] Hawking, S. W. and Penrose, R. 1996 The nature of space and time. Princeton Univer- 
sity Press, Princeton, NJ. 

[4] Hartle, J. B. and Hawking, S. W. 1983 Phys. Rev. D 28, 2960. 

[5] Hawking, S. W., Laflamme, R. and Lyons, G. W. 1993 Phys. Rev. D 47, 5342.

[6] Barrow, J. D. and Tipler, F. J. 1988 The Anthropic Cosmological Principle. Oxford Uni- 
versity Press, Oxford. 

[7] Sandvik, H. B. et al. 2002 arXiv astro-ph/0212114. 



Tables 



Table A.1 Cosmic distances and dimensions

distance to the Sun                                        8' 15" (light minutes)
distance to the nearest star (α Centauri)                  1.3 pc
diameters of globular clusters                             5-30 pc
thickness of our Galaxy, the 'Milky Way'                   0.3 kpc
distance to our galactic centre                            8 kpc
radius of our Galaxy, the 'Milky Way'                      12.5 kpc
distance to the nearest galaxy (Large Magellanic Cloud)    55 kpc
distance to the Andromeda nebula (M31)                     770 kpc
size of galaxy groups                                      1-5 Mpc
thickness of filament clusters                             5 h^-1 Mpc
distance to the Local Supercluster centre (in Virgo)       17 Mpc
distance to the 'Great Attractor'                          44 h^-1 Mpc
size of superclusters                                      > 50 h^-1 Mpc
size of large voids                                        60 h^-1 Mpc
distance to the Coma cluster                               100 h^-1 Mpc
length of filament clusters                                100 h^-1 Mpc
size of the 'Great Wall'                                   > 60 x 170 h^-2 Mpc^2
Hubble radius                                              3000 h^-1 Mpc



Introduction to Cosmology Third Edition by Matts Roos 

© 2003 John Wiley & Sons, Ltd ISBN 470 84909 6 (cased) ISBN 470 84910 X (pbk) 



Table A.2 Cosmological and astrophysical constants (from the Particle Data Group
compilation, K. Hagiwara et al. (2002) Phys. Rev. D 66, 010001.)

speed of light                     c                            299 792 458 m s^-1
light year                         ly                           0.3066 pc = 0.9461 x 10^16 m
parsec                             pc                           3.262 ly = 3.085 678 x 10^16 m
solar luminosity                   L_sun                        (3.846 ± 0.008) x 10^26 J s^-1
solar mass                         M_sun                        1.989 x 10^30 kg
solar equatorial radius            R_sun                        6.961 x 10^8 m
Hubble parameter                   H_0                          100 h km s^-1 Mpc^-1 = h/(9.778 13 Gyr)
Newtonian constant                 G                            6.673 x 10^-11 m^3 kg^-1 s^-2
Planck constant                    ħ                            6.582 119 x 10^-22 MeV s
Planck mass                        M_P = sqrt(ħc/G)             1.221 x 10^19 GeV/c^2
Planck time                        t_P = sqrt(ħG/c^5)           5.31 x 10^-44 s
Boltzmann constant                 k                            8.617 34 x 10^-5 eV K^-1
Stefan-Boltzmann constant          a = π^2 k^4/(15 ħ^3 c^3)     4.7222 x 10^-3 MeV m^-3 K^-4
critical density of the Universe   ρ_c = 3 H_0^2/(8πG)          2.775 x 10^11 h^2 M_sun Mpc^-3
                                                                = 10.538 h^2 GeV m^-3
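The last entry can be checked numerically from the constants listed above; a
minimal sketch (my own cross-check, using only the tabulated values of G, the
parsec and the solar mass):

```python
from math import pi

# Cross-check of Table A.2: the critical density rho_c = 3 H0^2 / (8 pi G),
# expressed in the two unit systems quoted in the table (here for h = 1).

G = 6.673e-11        # Newtonian constant [m^3 kg^-1 s^-2]
Mpc = 3.085678e22    # megaparsec [m]
M_sun = 1.989e30     # solar mass [kg]
GeV = 1.602e-10      # [J]
c = 2.99792458e8     # speed of light [m/s]

H0 = 100e3 / Mpc                   # 100 h km/s/Mpc in SI, with h = 1 [1/s]
rho_c = 3 * H0**2 / (8 * pi * G)   # critical density [kg/m^3]

rho_solar = rho_c * Mpc**3 / M_sun # ~2.78e11  h^2 M_sun Mpc^-3
rho_gev = rho_c * c**2 / GeV       # ~10.5     h^2 GeV m^-3
print(f"{rho_solar:.3e} h^2 M_sun/Mpc^3, {rho_gev:.3f} h^2 GeV/m^3")
```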



Table A.3 Electromagnetic radiation

type          wavelength [m]          energy [eV]          energy density^1 [eV m^-3]

radio         > 1                     < 10^-6              ≈ 0.05
microwave     1-5 x 10^-3             10^-6-2 x 10^-3      3 x 10^5
infrared      2 x 10^-3-7 x 10^-7     10^-3-1.8            ?
optical       (7-4) x 10^-7           1.8-3.1              ~ 2 x 10^3
ultraviolet   4 x 10^-7-10^-8         3.1-100              ?
X-rays        10^-8-10^-12            100-10^6             75
γ-rays        < 10^-12                > 10^6               25

^1 From M. S. Longair 1995 In The Deep Universe (ed. A. R. Sandage, R. G. Kron
and M. S. Longair), p. 317. Springer.



Table A.4 Particle masses^1

particle      MeV                K

ν             < 0.23 x 10^-6     < 2.7 x 10^3
e^±           0.511              5.93 x 10^9
μ^±           105.658            1.226 x 10^12
π^0           134.977            1.566 x 10^12
π^±           139.570            1.620 x 10^12
p             938.272            1.089 x 10^13
n             939.565            1.090 x 10^13
τ             1777               2.062 x 10^13
W^±           80 423             9.333 x 10^14
Z^0           91 188             1.058 x 10^15
H^0           > 114 300          > 1.326 x 10^15

^1 From the Particle Data Group compilation, K. Hagiwara et al. 2002, Phys. Rev.
D 66, 010001.



Table A.5 Particle degrees of freedom in the ultrarelativistic limit

particle           particle type

γ                  vector boson
ν_e, ν_μ, ν_τ      fermion (lepton)
e^-, μ^-, τ^-      fermion (lepton)
π^±, π^0           boson (meson)
p, n               fermion (baryon)
W^±, Z             vector boson

^a = 2, but the right-handed neutrinos are inert below the electroweak symmetry
breaking.

^b = 1 if the neutrinos are their own antiparticles.



Table A.6 Present properties of the Universe

                                             value

age                                          (13.7 ± 0.2) Gyr
mass                                         ~ 10^22 M_sun
CMB radiation temperature                    2.725 ± 0.001 K
cosmic neutrino temperature                  1.949 ± 0.001 K
radiation energy density                     2.604 x 10^5 eV m^-3
radiation density parameter                  2.471 h^-2 x 10^-5
entropy density                              2.890 x 10^9 m^-3
CMB photon number density                    4.11 x 10^8 photons m^-3
cosmological constant                        1.3 x 10^-52 c^2 m^-2
Schwarzschild radius                         > 11 Gpc
baryon-to-photon ratio                       (6.1 ± 0.7) x 10^-10
total density parameter                      1.02 ± 0.02
baryon density parameter (for Ω_0 = 1)       0.044 ± 0.004
matter density parameter (for Ω_0 = 1)       0.27 ± 0.04
deceleration parameter                       -0.60 ± 0.02
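The CMB radiation entries follow from the blackbody formulae at T_0 = 2.725 K,
n_γ = (2ζ(3)/π²)(kT/ħc)³ and ε_γ = (π²/15)(kT)⁴/(ħc)³; a minimal numerical
cross-check (my own sketch, using standard constants):

```python
from math import pi

# Blackbody photon number and energy densities of the CMB at T0 = 2.725 K.

zeta3 = 1.2020569      # Riemann zeta(3)
k = 1.380649e-23       # Boltzmann constant [J/K]
hbar_c = 3.16153e-26   # hbar * c [J m]
eV = 1.602177e-19      # [J]
T0 = 2.725             # CMB temperature [K]

x = k * T0 / hbar_c                       # inverse thermal wavelength [1/m]
n_gamma = 2 * zeta3 / pi**2 * x**3        # photon number density [1/m^3]
eps_gamma = pi**2 / 15 * (k * T0) * x**3  # energy density [J/m^3]

print(f"{n_gamma:.3e} photons/m^3")       # ~4.11e8
print(f"{eps_gamma / eV:.3e} eV/m^3")     # ~2.60e5
```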



Table A.7 Net baryon number change ΔB and branching fraction BR for leptoquark X
decays

channel i              ΔB_i      BR

1  X → u + u           2/3       r
2  X → e^+ + d̄         -1/3      1 - r
3  X̄ → ū + ū           -2/3      r̄
4  X̄ → e^- + d         1/3       1 - r̄

Index 



2dFGRS, 253 

Abelian algebra, 155 
absolute luminosity, 9, 42, 45 
absolute space, 6, 7, 30 
absorption lines, 28, 138, 144 
active galactic nuclei (AGN), 79, 265 
adiabatic 

expansion, 92, 117, 239, 242 

fluctuations, 202, 219 
affine connections, 48 
age of the Universe, 17, 97, 104 
Alpher, Ralph, 212 
Andromeda nebula, 5, 12
angular diameter-redshift relation, 108 
angular size distance, 43 
anisotropy 

quadrupole, 81, 218, 222, 224 

sources of, 219 
annihilation, 76 

baryon-anti-baryon, 178 

electron-positron, 76, 123, 134 

leptoquark boson, 181 

monopole-anti-monopole, 191 

pion, 132 

WIMP, 250 
Anthropic Principle, 185, 263 
anthropocentric view, 2 
anti-baryons, 124 
anti-bias, 245 

anti-gravitational force, 101 
anti-leptons, 124 
anti-neutrons, 124 
anti-nucleons, 124 
anti-protons, 122 
asymptotic freedom, 162, 163 
autocorrelation function 

mass, 233 

temperature, 217 



axino, 250 
axion, 250 

B-balls, 250 

bar magnet, 166 

baryon, 124, 140, 143, 159, 179, 201, 227, 

239,251 

number, 124 

density, 140, 178 
baryon-anti-baryon asymmetry, 178 
baryon-to-photon ratio, 178 
baryosynthesis, 178, 182 
beauty quark, 160 
Bekenstein, J., 76 
Bekenstein-Hawking formula, 76 
beta decay, 141 
bias, 245 
Big Bang, 77, 94, 95, 97, 115, 188, 192, 

204, 213, 259 

nucleosynthesis, 139 
Big Crunch, 95, 266 
binary pulsar, 63 
binding energy, 136 
black hole, 5, 71, 73, 249 

analogy, 260 

candidates, 78 

creation, 77 

Kerr, 75 

properties, 74 

Reissner-Nordstrom, 75 

Schwarzschild, 100 

singularity, 259 
blackbody spectrum, 115 
blue compact dwarf (BCD), 144 
blueshift, 30 
Boltzmann, Ludwig, 116 
Boltzmann constant, 119 
Bose, Satyendranath, 125 
Bose distribution, 128 






bosons, 125, 128 

gauge, 154, 161 

Higgs, 171 

leptoquark, 176 

scalar, 102, 168, 197 

vector, 122, 125, 169, 176, 180, 191 
bottom quark, 160 
bottom-top scenario, 253 
Bradley, James, 2 
brane, 208, 263 

Bright Cluster Galaxies (BCGs), 19 
brightness 

apparent, 42 

surface (SBF), 9, 16, 69 
brown dwarfs, 249 

CDM paradigm, 252 
Cepheids, 44 

Chandrasekhar mass, 15, 78 
chaotic inflation, 185, 196 
charge 

conjugation, 165 

operator, 156 

space, 156 
charged current, 131 
charmed quark, 160 
chemical potential, 128, 136 
Cheseaux, Jean-Philippe Loys de, 9 
classical mechanics, 19, 166, 232 
closed gravitational system, 8, 22 
CMB, 119, 211 

polarization, 222 

temperature, 212 
COBE, 214, 216, 221 
collapsed stars, 249 
collisional dissipation, 239 
colour, 161 

force, 162 
commutative algebra, 155 
commutator, 155 
comoving 

coordinates, 34 

distance, 37 

frame, 36 
compactification scale, 263 
Compton 

scattering, 122, 133, 222 

wavelength, 177 
conformal time, 38 
contraction operation, 49 
contravariant vector, 45 



Copernican principle, 3 
Copernicus, Nicolaus, 2 
cosmic 

censorship, 76, 266 

scale factor, 13 

strings, 190, 219 

structures, 231 

time, 37 
cosmochronometers, 17 
cosmological constant, 90, 91, 100, 227 

decaying, 102 
cosmological principle, 3, 7 
Coulomb force, 113 
coupling constants, 113 
covariance principle, 45 
covariant 

derivative, 47 

vector, 46 
CP violation, 165, 180 
CPT symmetry, 165 
critical density, 20 
cross-section, 128 
curvature 

parameter, 36 

perturbations, 202 
curved space-time, 30 
cyclic models, 205 

dark energy, 101, 193, 201, 204, 208, 227, 

239,252, 264 
dark matter, 231, 241, 242 

baryonic, 249 

candidates, 248 

cold, 250 

hot, 252 

warm, 252 
de Sitter, Willem, 6 
de Sitter 

cosmology, 99, 103, 195 

metric, 99, 260, 261 
deceleration, 89 

parameter, 40, 82, 229 
decoupling 

electron, 139, 229 

neutrino, 135 
degeneracy pressure, 15, 78, 126 
degrees of freedom, 127 

effective, 129 
density 

fluctuations, 232 

parameters, 21, 91, 227 



deuterium, 139 

bottleneck, 141 

photodisintegration, 140 
deuteron, 139 
diagonal matrices, 155 
Dicke, Robert, 213 
dipole anisotropy, 216 
Dirac, Paul A.M., 190 
discrete 

symmetries, 163 

transformation, 163 
distance ladder, 44 
DMR, 216 
domain walls, 190 
Doppler 

peak, 221 

redshift, 30 
doublet representations, 154 
down quark, 159 
dust, 5, 9, 93, 249 
dwarf spheroidal galaxies, 243 

Eddington, Arthur S., 65 
Ehrenfest, Paul, 263 
eigenfunction, 164 
eigenstates, 155 
eigenvalue, 155 

equations, 155 
eigenvector, 164 
Einstein, Albert, 4 
Einstein 

Einstein's equations, 57 

Einstein's law of energy, 46 

Einstein's theory of gravitation, 54 

ring, 67 

tensor, 57 

universe, 90, 100 
Einstein-de Sitter universe, 89 
electromagnetic interactions, 113, 124, 

133, 138, 164 
electron 

anti-neutrino, 124 

family, 124 

neutrino, 124 

spin, 150 
electroweak 

interactions, 122 

symmetry breaking, 158, 169 
endothermic, 139 
energy conservation law, 92 
energy effect, 42 



energy-momentum 

conservation, 91 

tensor, 56, 202 
entropy 

conservation, 92 

density, 215 
equation of state, 92, 229 
equilibrium theory, 137 
equivalence principle, 49 
Euclidean space, 30 
Eulerian equations, 232 
event horizon, 40 
Evershed, John, 62 
exothermic, 139 
expansion 

rate, 132 

time, 107 

velocities, 12 
extra dimensions, 263 
falling photons, 52 
false vacuum, 169, 194 
family unification theories (FUTs), 176
Fermi, Enrico, 125 
Fermi 

coupling, 133 

distribution, 128 
fermion, 125 

number, 126 
Feynman diagram, 123 
fine-tuning problems, 264 
Fingers of God, 245 
FIRAS, 214

first law of thermodynamics, 117 
first-order phase transition, 171 
flatness problem, 185, 192 
flavours, 124, 159 
fluid dynamics, 232 
flux, 69, 127 
Friedmann, Alexandr, 6 
Friedmann 

Friedmann's equations, 88 

Friedmann-Lemaitre cosmologies, 87 
fundamental observer, 36 
fundamental plane, 15 
fusion reactions, 139 
galaxy 

clusters, 234, 246, 253 

counts, 109 

formation, 242 

groups, 3, 244 

surveys, 253 



Galilean equivalence principle, 50 

Galilei, Galileo, 2 

gamma-rays, 179 

Gamow, Georg, 212 

Gamow penetration factor, 142 

gauge 

bosons, 154, 161 

principle, 154 

problem, 237 

transformation, 237 
Gauss, Carl Friedrich, 33 
Gaussian curvature, 33 
Gell-Mann, Murray, 159 
general covariance, 47, 54 
General Relativity, 45, 62 
geodesic, 30 
Glashow, Sheldon, 158 
globular clusters, 5, 18, 44, 80, 240 
gluon, 161, 180 
gold, 27 

graceful exit, 194 
grand unified theory (GUT), 158 

phase transition, 189 

potentials, 194 
gravitating mass, 19, 49 
gravitational 

birefringence, 54 

lenses, 64 

lensing, 64 

potential, 55 

radiation, 64 

repulsion, 91 

wave detection, 82 

wave sources, 81 

waves, 80 
gravitons, 80 
Great Attractor, 41, 256 
group, 153 

order, 154 
Guth, Alan, 193 

hadrons, 159 
Halley, Edmund, 2 
Hamiltonian operator, 151 
Hawking, Stephen, 76 
Hawking 

radiation, 77 

temperature, 77 
HDM, 252 

Heisenberg's uncertainty relation, 196, 262 
helicity, 164 

states, 164 



helium, 18, 142 
Helmholtz, Hermann von, 121 
Herman, Robert, 212 
Hermitian operator, 154 
Herschel, William, 5 
Hertzsprung-Russell relation, 43 
hierarchical scenarios, 253 
hierarchy problem, 174, 263 
Higgs, Peter, 170 
Higgs 

boson, 171 

field, 170 
Higgsino, 250 
higher symmetries, 163 
homogeneity assumption, 3 
horizon problem, 185, 187 
HST, 14 

Hubble, Edwin P., 5 
Hubble 

constant, 14 

flow, 12 

Hubble's law, 12 

parameter, 12 

radius, 13 

Space Telescope, 14, 45 

time, 13 
Hulse, R. A., 63 

Hydra-Centaurus, 30, 41, 216, 256 
hydrogen 

atom, 123 

burning, 17, 43, 145 

clouds, 144, 225, 240, 242 
hypothesis testing, 106 

ideal fluid, 56 
inertial frames, 7 
inertial mass, 20, 49 
inflation, 192 

chaotic, 185, 196 

Guth's scenario, 192 

new, 195 

old, 185 
inflaton field, 104, 192, 202 
infrared light, 43, 70, 254 
interaction (see also gravitational) 

strong, 156 

weak, 122 
intergalactic medium, 145, 179 
interstellar medium, 144, 179 
IRAS, 254 
isentropy, 117 



isocurvature fluctuations, 202, 219 
isospin, 157 

symmetry, 157 
isothermal, 202 
isotropy assumption, 3 

Jeans 

instability, 238 

mass, 238 

wavelength, 238 
jupiters, 249 

k-essence, 106 
Kant, Immanuel, 4 
kaon, 159 

Kapteyn, Jacobus C., 244
Kepler, Johannes, 2 
Kerr black holes, 75 
kination, 203 
Klein-Gordon equation, 102 

Lagrange point, 50 
Lambert, Johann Heinrich, 5 
Landau damping, 251 
Landau-Oppenheimer-Volkov limit, 78 
Laplace, Pierre Simon de, 4 
Large Magellanic Cloud, 44 
last scattering surface, 137, 229 
Lederman, Leon, 160 
left handed, 164 
Legendre polynomials, 217 
Leibnitz, Gottfried Wilhelm von, 3 
lens 

caustics, 71 

convergence, 71 

shear, 71 
lensing 

strong, 66 

weak, 65, 66 
lepton, 124 

number, 125 

weak isospin, 157 
leptoquark, 176 

thermodynamics, 180 
Le Verrier, Urbain, 62
light cone, 27, 28 
light, speed of, 13, 26, 54 
lightlike separation, 28 
Lindblad, Bertil, 6 
Linde's Bubble Universe, 200 
line element, 26 
linear operators, 155 



linear transformation, 26 

lithium, 144 

local galaxy group, 3, 30, 41, 216 

local gauge transformation, 154 

Local Supercluster (LSC), 3, 41, 216, 245 

lookback time, 89 

loops, 190 

Lorentz, Hendrik Antoon, 26 

Lorentz transformations, 25, 26 

lowering operators, 153 

luminosity, 9, 15 

distance, 42 
Lyman limit, 144 
Lyman-a forest, 226 

Mach, Ernst, 6 

Mach's principle, 49 

MACHO, 249 

magnetic monopoles, 190 

magnitude 

absolute, 42, 108 

apparent, 42 
magnitude-redshift relation, 108 
main-sequence stars, 43 
manifold, 26 

curved, 34, 259 

higher-dimensional, 46 

multiply connected, 263 
mass density contrast, 233 
mass-to-luminosity ratio, 246 
Massive Compact Halo Object, 249 
matter domination, 93, 118 
Maxwell, James Clerk, 128 
Maxwell-Boltzmann distribution, 128 
mean free path, 10 
metals, 144 
metric equation, 31 
metric tensor, 31 
metrics, 30 
Michell, John, 5 
microlensing, 69 
Milky Way, 2-6, 14, 17, 19, 43, 44, 69, 79, 

145, 179, 243, 264 
Minkowski, Hermann, 28 
Minkowski 

metric, 28 

space-time, 31 
multiply connected universes, 263 
multipole analysis, 217, 224 
muons, 124 



naked singularity, 76, 266 
neutralino, 250 
neutrino 

clouds, 253 

families, 124, 130, 143, 157, 164 

number density, 135, 215 

oscillation, 125, 182 

sterile, 252 

temperature, 129, 135 
neutron, 124 
neutron star, 15

neutron-to-proton ratio, 139, 142 
Newton, Isaac, 2 
Newton's law of gravitation, 20 
Newtonian 

constant, 20 

cosmology, 6 

mechanics, 19 
non-Abelian algebra, 155 
nuclear fusion, 139 
nucleon, 124 

isospin, 156 
null separation, 28 

object horizon, 39 
observations, possible, 155 
Olbers, Wilhelm, 9 
Olbers' Paradox, 9 
Oort, Jan Hendrik, 6 
open gravitational system, 8, 21 
operator, linear, 155 
optical depth, 226 

our Galaxy (Milky Way), 2-6, 14, 17, 19, 43, 
44, 69, 79, 145, 179, 243, 264 

parallax distance, 42 
parallel axiom, 33 
parameter estimation, 106, 225 
parity, 164 

operator, 163 

transformation, 163 
Parker bound, 191 
parsec, 7 

particle horizon, 39, 186 
Pauli, Wolfgang, 154 
Pauli 

exclusion force, 126 

matrices, 154 
peculiar velocity, 14 
Penrose, Roger, 76 
Penzias, Arno, 212 



perihelion, 62 

Perl, Martin, 160 

phase transformation, 153 

phase transitions, 171 

photino, 250 

photon, 114 

blackbody spectrum, 115, 213 

diffusion, 239 

number density, 115, 215 

reheating, 133 
pion, 126, 130, 159, 165 
Planck, Max, 53 
Planck 

constant, 53 

mass, 177 

time, 177 
Poisson's equation, 55, 233 
polarization, 116 

anisotropies, 222

linear, 116 
positron, 122 
positronium, 123 
power spectrum, 218, 226, 233 
powers, 217 
prepared states, 152 
pressure 

of matter, 93 

of radiation, 93, 232, 239 

of vacuum, 93, 102 
primeval asymmetry generation, 179
primeval phase transitions, 171 
primordial hot plasma, 128 
Proctor, Richard Anthony, 5 
proper distance, 38, 40 
proper time, 26 
proto-neutron star, 78 
proton, 122 
PSPC, 247 

Q-balls, 250 

QED, 122 

quadrupole anisotropy, 81, 218, 222, 224 

quantum 

chromodynamics, 161 

electrodynamics, 122 

fluctuations, 199 

mechanics, 53, 114, 151, 260 
quark, 159 

matter, 172 
quasar counts, 109 
quintessence, 103, 202, 204 



R parity, 174
radiation 

domination, 93, 115, 118 

energy density, 215 

intensity, 215, 224 

photon, 114 

pressure, 93, 232, 239 
radioactive nuclei, 17
radius of the Universe, 96, 198 
raising operators, 153 
rank, 46 

re-ionization, 225 
reaction 

cross-section, 127 

rate, 127, 132, 133 
recession velocities, 12 
recombination 

era, 133, 136 

radiation, 134 

redshift, 138, 212 

time, 138, 212 
red giant, 14, 77 
redshift, 28, 29, 40 

cosmological, 13 

distance, 42 
Reissner-Nordstrom black holes, 75 
relativistic particles, 119 
relativity 

general, 27 

special, 25 
relic 4 He abundance, 142 
Ricci 

scalar, 49 

tensor, 49 
rich clusters, 246 
Richter, Burt, 160 
Riemann, Bernhard, 4 
Riemann tensor, 48 
Robertson, Howard, 36 
Robertson-Walker metric, 36 
ROSAT, 245, 247 
rotational symmetry, 163 
RR Lyrae, 44 

Sachs-Wolfe effect, 220 
Saha equation, 137 
Sakharov oscillations, 221 
Salam, Abdus, 158 
scalar fields, 102, 164, 168, 197 
scalar spectral index, 202 
scale factor, 28 



Schwarzschild, Karl, 72 
Schwarzschild 

black hole, 73 

metric, 71, 73 

radius, 72 
second cosmic velocity, 8 
second law of thermodynamics, 117 
second-order phase transition, 171 
Shapiro, I. I., 63
Shapley, Harlow, 5 
Shen, Yang, 4 
Silk, J., 239 
Silk damping, 239 
singlet representation, 162 
slow-rolling conditions, 104 
snowballs, 249 
solar constant, 147 
Solar System, 1-5, 18, 19, 30, 62, 179, 216, 

244 
solitons, 250 
space parity, 163 
space-time distance, 26 
spacelike, 28 
sparticles, 174 
special relativity, 25
speed of light, 13, 26, 54 
spin, 117 

longitudinal state, 126 

space, 150 

state, 151, 155 

transversal state, 126 

vector, 151 
spinor algebra, 151 
spiral galaxies, 242 
spontaneous symmetry breaking, 166 
standard candle, 15, 44 
standard model, 163 
star formation, 17, 144, 242, 255 
statistics, 106 
Stefan, Josef, 116 
Stefan-Boltzmann law, 116 
Stokes parameters, 116, 223 
strange mesons, 159 
strange quark, 159 
strangeness, 159 
stress-energy tensor, 56, 102 
structure 

formation, 237 
time, 240 

simulation, 254 

size, 240 



SU(2) symmetry, 156 
SU(3) symmetry, 159 
subconstituent models, 175 
Sunyaev-Zel'dovich Effect (SZE), 225, 240 
superclusters, 14, 41, 216, 245, 254 
superluminal photons, 54 
supernovae, 5, 14, 17, 78, 81, 108, 144, 228 
superposition principle, 153 
supersymmetry (SUSY), 174 
surface-brightness fluctuations (SBFs), 16 
symmetry breaking 

electroweak, 158 

GUT, 175 

spontaneous, 166 

tachyons, 54 
Taylor, J. H., 63 
technicolour forces, 175 
temperature, 172 

anisotropies, 216

critical, 172, 194 

fluctuations, 217 

multipoles, 218 
tensor, 30, 45 

field, 80 

spectral index, 202 
theory of everything (TOE), 158 
thermal 

conductivity, 239 

death, 121 

equilibrium, 115 

history of the Universe, 113, 146 
thermodynamics 

first law of, 117 

second law of, 117 
Thomson scattering, 133, 222 
tidal effect, 50 
time

dilation, 26 

direction of, 262 

reversal, 165 
timelike, 28 
timescales, 228 
Ting, Sam, 160 
Tolman test, 45 
top quark, 160 
top-bottom scenario, 253 
topological defects, 190 
tracking quintessence, 103 
translational symmetry, 163 
trigonometrical parallax, 42 



tritium, 140 

triton, 140 

Tully-Fisher relation, 15, 45

tunnelling, 194 

turnover time, 95 

two-degree Field Galaxy Redshift Survey 

(2dFGRS), 253 
two-point correlation function, 235 

unitary operator, 153 
unitary transformations, 153 
universe 

anti-de Sitter, 100 

closed, 20, 95 

contracting, 8, 13, 21, 95 

cyclic, 207 

de Sitter, 99, 100, 103, 195 

Einstein, 90, 100 

Einstein-de Sitter, 89, 108, 192 

expanding, 8, 12, 21, 95 

finite, 3 

Friedmann-Lemaitre, 87, 91 

Hartle-Hawking, 262 

inflationary, 193 

Newtonian, 19 

open, 20 
up quark, 159 

vacuum 

energy, 91, 93, 100, 186, 193, 201 

energy pressure, 93, 102 

expectation value, 168, 193 
vector bosons, 122, 125, 169, 176, 180, 191 
virial equilibrium, 240 
virtual particles, 76, 122 
virtual photons, 122 
viscous fluid approximation, 232 
von Helmholtz, Hermann, 121 

Walker, Arthur, 36 

wavenumber, 217 

WDM, 252 

weak charge, 158 

weak field limit, 55

weak hypercharge, 158 

weak-isospin space, 157 

weakly interacting massive particles 

(WIMPs), 250 
Weinberg, Steven, 158 
Weinberg angle, 171 
Weyl, Hermann, 37 
Wheeler, John A., 72 






white dwarfs, 14, 126

white hole, 97

Wilson, Robert, 213

WIMP, 250

wimpzillas, 250

WMAP, 19, 225

world line, 28

Wright, Thomas, 3

X-rays, 80, 240, 244, 247

Zel'dovich, Yakov B., 226

zino, 250

Zweig, George, 159

Zwicky, Fritz, 71