THOMAS BANKS
Modern Quantum Field Theory
A Concise Introduction
Quantum field theory is a key subject in physics, with applications in particle and
condensed matter physics. Treating a variety of topics that are only briefly touched on
in other texts, this book provides a thorough introduction to the techniques of field
theory.
The book covers Feynman diagrams and path integrals, and emphasizes the path
integral approach, the Wilsonian approach to renormalization, and the physics of non-
abelian gauge theory. It provides a thorough treatment of quark confinement and chiral
symmetry breaking, topics not usually covered in other texts at this level. The Standard
Model of particle physics is discussed in detail. Connections with condensed matter
physics are explored, and there is a brief, but detailed, treatment of non-perturbative
semi-classical methods (instantons and solitons).
Ideal for graduate students in high energy physics and condensed matter physics, the
book contains many problems, providing students with hands-on experience with the
methods of quantum field theory.
Thomas Banks is Professor of Physics at the Santa Cruz Institute for Particle Physics
(SCIPP), University of California, and NHETC, Rutgers University. He is considered
one of the leaders in high energy particle theory and string theory, and was co-discoverer
of the Matrix Theory approach to non-perturbative string theory.
Tom Banks
University of California, Santa Cruz
and Rutgers University
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521850827
© T. Banks 2008
This publication is in copyright. Subject to statutory exception and to the provision of
relevant collective licensing agreements, no reproduction of any part may take place
without the written permission of Cambridge University Press.
First published in print format 2008
ISBN-13 978-0-511-42899-9 eBook (EBL)
ISBN-13 978-0-521-85082-7 hardback
Cambridge University Press has no responsibility for the persistence or accuracy of urls
for external or third-party internet websites referred to in this publication, and does not
guarantee that any content on such websites is, or will remain, accurate or appropriate.
1 Introduction
1.1 Preface and conventions
1.2 Why quantum field theory?
2 Quantum theory of free scalar fields
2.1 Local fields
2.2 Problems for Chapter 2
3 Interacting field theory
3.1 Schwinger-Dyson equations and functional integrals
3.2 Functional integral solution of the SD equations
3.3 Perturbation theory
3.4 Connected and 1-P(article) I(rreducible) Green functions
3.5 Legendre's trees
3.6 The Källén-Lehmann spectral representation
3.7 The scattering matrix and the LSZ formula
3.8 Problems for Chapter 3
4 Particles of spin 1, and gauge invariance
4.1 Massive spinning particles
4.2 Massless particles with helicity
4.3 Field theory for massive spin-1 particles
4.4 Problems for Chapter 4
5 Spin-1/2 particles and Fermi statistics
5.1 Dirac, Majorana, and Weyl fields: discrete symmetries
5.2 The functional formalism for fermion fields
5.3 Feynman rules for Dirac fermions
5.4 Problems for Chapter 5
6 Massive quantum electrodynamics
6.1 Free the longitudinal gauge bosons!
6.2 Heavy-fermion production in electron-positron annihilation
6.3 Interaction with heavy fermions: particle paths and external fields
6.4 The magnetic moment of a weakly coupled charged particle
6.5 Problems for Chapter 6
7 Symmetries, Ward identities, and Nambu-Goldstone bosons
7.1 Space-time symmetries
7.2 Spontaneously broken symmetries
7.3 Nambu-Goldstone bosons in the semi-classical expansion
7.4 Low-energy effective field theory of Nambu-Goldstone bosons
7.5 Problems for Chapter 7
8 Non-abelian gauge theory
8.1 The non-abelian Higgs phenomenon
8.2 BRST symmetry
8.3 A brief history of the physics of non-abelian gauge theory
8.4 The Higgs model, duality, and the phases of gauge theory
8.5 Confinement of monopoles in the Higgs phase
8.6 The electro-weak sector of the standard model
8.7 Symmetries and symmetry breaking in the strong interactions
8.8 Anomalies
8.9 Quantization of gauge theories in the Higgs phase
8.10 Problems for Chapter 8
9 Renormalization and effective field theory
9.1 Divergences in Feynman graphs
9.2 Cut-offs
9.3 Renormalization and critical phenomena
9.4 The renormalization (semi-)group in field theory
9.5 Mathematical (Lorentz-invariant, unitary) quantum field theory
9.6 Renormalization of $\phi^4$ field theory
9.7 Renormalization-group equations in dimensional regularization
9.8 Renormalization of QED at one loop
9.9 Renormalization-group equations in QED
9.10 Why is QED IR-free?
9.11 Coupling renormalization in non-abelian gauge theory
9.12 Renormalization-group equations for masses and the hierarchy problem
9.13 Renormalization-group equations for the S-matrix
9.14 Renormalization and symmetry
9.15 The standard model through the lens of renormalization
9.16 Problems for Chapter 9
10 Instantons and solitons
10.1 The most probable escape path
10.2 Instantons in quantum mechanics
10.3 Instantons and solitons in field theory
10.4 Instantons in the two-dimensional Higgs model
10.5 Monopole instantons in three-dimensional Higgs models
10.6 Yang-Mills instantons
10.7 Solitons
10.8 't Hooft-Polyakov monopoles
10.9 Problems for Chapter 10
11 Concluding remarks
Appendix A Books
Appendix B Cross sections
Appendix C Diracology
Appendix D Feynman rules
Appendix E Group theory and Lie algebras
Appendix F Everything else
References
Author index
Subject index
1 Introduction
1.1 Preface and conventions
This book is meant as a quick and dirty introduction to the techniques of quantum field
theory. It was inspired by a little book (long out of print) by F. Mandl, which my advisor
gave me to read in my first year of graduate school in 1969. Mandl's book enabled the
smart student to master the elements of field theory, as it was known in the early
1960s, in about two intense weeks of self-study. The body of field-theory knowledge
has grown way beyond what was known then, and a book with similar intent has to be
larger and will take longer to absorb. I hope that what I have written here will fill that
Mandl niche: enough coverage to at least touch on most important topics, but short
enough to be mastered in a semester or less. The most important omissions will be
supersymmetry (which deserves a book of its own) and finite-temperature field theory.
Pedagogically, this book can be used in three ways. Chapters 1-6 can be used as a text
for a one-semester introductory course, the whole book for a one-year course. In either
case, the instructor will want to turn some of the starred exercises into lecture material.
Finally, the book was designed for self-study, and can be assigned as a supplementary
text. My own opinion is that a complete course in modern quantum field theory needs
3-4 semesters, and should cover supersymmetric and finite-temperature field theory.
This statement of intent has governed the style of the book. I have tried to be terse
rather than discursive (my natural default) and, most importantly, I have left many
important points of the development for the exercises. The student should not imagine
that he/she can master the material in this book without doing at least those exercises
marked with a *. In addition, at various points in the text I will invite the reader to prove
something, or state results without proof. The diligent reader will take these as extra
exercises. This book may appear to the student to require more work than do texts that
try to spoon-feed the reader. I believe strongly that a lot of the material in quantum
field theory can be learned well only by working with your hands. Reading or listening
to someone's explanation, no matter how simple, will not make you an adept. My hope
is that the hints in the text will be enough to let the student master the exercises and
come out of this experience with a thorough mastery of the basics.
The book also has an emphasis on theoretical ideas rather than application to experi-
ment. This is partly due to the fact that there already exist excellent texts that concentrate
on experimental applications, partly due to the desire for brevity, and partly to increase
the shelf life of the volume. The experiments of today are unlikely to be of intense inter-
est even to experimentalists of a decade hence. The structure of quantum field theory
will exist forever.
Throughout the book I use natural units, where $\hbar = c = 1$. Everything has units of
some power of mass/energy. High-energy experiments and theory usually concentrate
on the energy range between $10^{-3}$ and $10^{3}$ GeV and I will often use these units. Another
convenient unit of energy is the natural one defined by gravitation: the Planck mass,
$M_P \approx 10^{19}$ GeV, or reduced Planck mass, $m_P \approx 2 \times 10^{18}$ GeV. The GeV is the natural
unit for hadron masses. Around 0.15 GeV is the scale at which strong interactions
become strong. Around 250 GeV is the natural scale of electro-weak interactions, and
$\sim 2 \times 10^{16}$ GeV appears to be the scale at which electro-weak and strong interactions
are unified.
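
As a quick numerical illustration of these scales, here is a minimal Python sketch relating the Planck mass to the reduced Planck mass. It assumes the conventional relation $m_P = M_P/\sqrt{8\pi}$, which is not stated above, and uses a rounded input value of $1.22 \times 10^{19}$ GeV supplied only for illustration; the function names are hypothetical.

```python
import math

# Assumed input (not from the text): Planck mass in GeV, rounded.
PLANCK_MASS_GEV = 1.22e19


def reduced_planck_mass(planck_mass_gev):
    """Reduced Planck mass, assuming the convention m_P = M_P / sqrt(8*pi)."""
    return planck_mass_gev / math.sqrt(8.0 * math.pi)


def decades_between(low_gev, high_gev):
    """Number of orders of magnitude separating two energy scales."""
    return math.log10(high_gev / low_gev)


if __name__ == "__main__":
    m_p = reduced_planck_mass(PLANCK_MASS_GEV)
    print(f"reduced Planck mass ~ {m_p:.2e} GeV")
    # Gap between the electro-weak scale (250 GeV) and the reduced Planck mass.
    print(f"decades above 250 GeV: {decades_between(250.0, m_p):.1f}")
```

With these inputs the reduced Planck mass comes out a little above $2 \times 10^{18}$ GeV, in line with the figure quoted in the text.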
I will use non-relativistic normalization, $\langle p|q\rangle = \delta^3(p - q)$, for single-particle states.
Four-vectors will have names which are single Latin letters, while 3-vectors will be
written in bold face. I will use Greek mid-alphabet letters for tensor indices, and Latin
early-alphabet letters for spinors. Mid-alphabet Latin letters will be 3-vector compo-
nents. I will stick to the van der Waerden dot convention (Chapter 5) for distinguishing
left- and right-handed Weyl spinors. As for the metric on Minkowski space, I will use
the West Coast, mostly minus, convention of most working particle theorists (and of
my toilet training), rather than the East Coast (mostly plus) convention of relativists
and string theorists.
Finally, a note about prerequisites. The reader must begin this book with a thorough
knowledge of calculus, particularly complex analysis, and a thorough grounding in non-
relativistic quantum mechanics, which of course includes expert-level linear algebra.
Thorough knowledge of special relativity is also assumed. Detailed knowledge of the
mathematical niceties of operator theory is unnecessary. The reader should be familiar
with the Einstein summation convention and the totally anti-symmetric Levi-Civita
symbol $\epsilon^{\mu\nu\lambda\kappa}$. We use the convention $\epsilon^{0123} = 1$ in Minkowski space. It would be useful
to have a prior knowledge of the theory of Lie groups and algebras, at a physics level of
rigor, although we will treat some of this material in the text and Appendix E. I have
supplied some excellent references [1-4] because this math is crucial to much that we
will do. As usual in physics, what is required of your mathematical background is a
knowledge of terminology and how to manipulate and calculate, rather than intimate
familiarity with rigor and formal proofs.
1.1.1 Acknowledgements
I mostly learned field theory by myself, but I want to thank Nick Wheeler of Reed
College for teaching me about path integrals and the beauties of mathematical physics in
general. Roman Jackiw deserves credit for handing me Mandl's book, and Carl Bender
helped me figure out what an instanton was before the word was invented. Perhaps the
most important influence in my grad school years was Steven Weinberg, who taught
me his approach to fields and particles, and everything there was to know about broken
symmetry. Most of the credit for teaching me things about field theory goes to Lenny
Susskind, from whom I learned Wilson's approach to renormalization, lattice gauge
theory, and a host of other things throughout my career. Shimon Yankielowicz and
Eliezer Rabinovici were my most important collaborators during my years in Israel.
We learned a lot of great physics together. During the 1970s, along with everyone else
in the field, I learned from the seminal work of D. Gross, S. Coleman, G. 't Hooft,
G. Parisi, and E. Witten. Edward was a friend and a major influence throughout my
career. As one grows older, it's harder for people to do things that surprise you, but
my great friends and sometimes collaborators Michael Dine, Willy Fischler, and Nati
Seiberg have constantly done that. Most of the field theory they've taught me goes
beyond what is covered in this book. You can find some of it in Michael Dine's recent
book from Cambridge University Press.
Field theory can be an abstract subject, but it is physics and it has to be grounded in
reality. For me, the most fascinating application of field theory has been to elementary
particle physics. My friends Lisa Randall, Yossi Nir, Howie Haber, and, more recently,
Scott Thomas have kept me abreast of what's important in the experimental foundation
of our field.
In writing this book, I've been helped by M. Dine, H. Haber, J. Mason, L. Motl,
A. Shomer, and K. van den Broek, who've read and commented on all or part of the
manuscript. The book would look a lot worse than it does without their input. Chapter
10 was included at the behest of A. Strominger, and I thank him for the suggestion.
Chris France, Jared Rice, and Lily Yang helped with the figures. Finally, I'd like to
thank my wife Ada, who has been patient throughout all the trauma that writing a
book like this involves.
1.2 Why quantum field theory?
Students often come into a class in quantum field theory straight out of a course in
non-relativistic quantum mechanics. Their natural inclination is to look for a straight-
forward relativistic generalization of that formalism. A fine place to start would seem to
be a covariant classical theory of a single relativistic particle, with space-time position
variable $x^\mu(\tau)$, written in terms of an arbitrary parametrization $\tau$ of the particle's path
in space-time.
The first task of a course in field theory is to explain to students why this is not the
right way to do things.¹ The argument is straightforward.
¹ Then, when they get more sophisticated, you can show them how the particle path formalism can be used,
with appropriate care.
Figure 1.1. Boosts can reverse causal order for $(x - y)^2 < 0$.
Consider a classical machine (an emission source) that has probability amplitude
$J_E(x)$ of producing a particle at position $x$ in space-time, and an absorption source,
which has amplitude $J_A(x)$ to absorb the particle. Assume that the particle propagates
freely between emission and absorption, and has mass $m$. The standard rules of quantum
mechanics tell us that the amplitude (to leading order in perturbation theory in
the sources) for the entire process is (remember our natural units!)
$$ A_{AE} = \int d^4x\, d^4y\, \langle \mathbf{x}|e^{-iH(x^0 - y^0)}|\mathbf{y}\rangle J_A(x)J_E(y), \tag{1.1} $$
where $|\mathbf{x}\rangle$ is the state of the particle at spatial position $\mathbf{x}$. This doesn't look very
Lorentz-covariant. To see whether it is, write the relativistic expression for the energy
$H = \sqrt{\mathbf{p}^2 + m^2} = \omega_p$. Then
$$ A_{AE} = \int d^4x\, d^4y\, J_A(x)J_E(y) \int d^3p\, |\langle 0|p\rangle|^2 e^{-ip(x - y)}. \tag{1.2} $$
The space-time set-up is shown in Figure 1.1. In writing this equation I've used the fact
that momentum is the generator of space translations² to evaluate position/momentum
overlaps in terms of the momentum eigenstate overlap with the state of a particle at
the origin. I've also used the fact that $(\omega_p, \mathbf{p})$ is a 4-vector to write the exponent as a
Lorentz scalar product. So everything is determined by quantum mechanics, translation
invariance and the relativistic dispersion relation, up to a function of 3-momentum. We
can determine this function up to an overall constant, by insisting that the expression is
Lorentz-invariant, if the emission and absorption amplitudes are chosen to transform
as scalar functions of space-time. An invariant measure for 4-momentum integration,
ensuring that the mass is fixed, is $d^4p\, \delta(p^2 - m^2)$. Since the momentum is then forced
to be time-like, the sign of its time component is also Lorentz-invariant (Problem 2.1).
So we can write an invariant measure $d^4p\, \delta(p^2 - m^2)\theta(p^0)$ for positive-energy particles
of mass $m$. On doing the integral over $p^0$ we find $d^3p/(2\omega_p)$. Thus, if we choose the
normalization
$$ \langle 0|p\rangle = \frac{1}{\sqrt{(2\pi)^3\, 2\omega_p}}, \tag{1.3} $$
then the propagation amplitude will be Lorentz-invariant. The full absorption and
emission amplitude will of course depend on the Lorentz frame because of the coordinate
dependence of the sources $J_{E,A}$. It will be covariant if these are chosen to transform
like scalar fields.
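
The invariance of $d^3p/(2\omega_p)$ can also be checked numerically. The sketch below is only an illustration under stated assumptions: it boosts an on-shell momentum along the $z$ axis (the choice of axis, the rapidity parametrization, and all function names are mine, not the text's) and compares the Jacobian of the map with the ratio of energies. A return value of 1 means $d^3p'/\omega_{p'} = d^3p/\omega_p$ at that point.

```python
import math


def energy(px, py, pz, m):
    """Relativistic energy omega_p = sqrt(p^2 + m^2) in natural units."""
    return math.sqrt(px * px + py * py + pz * pz + m * m)


def boost_z(px, py, pz, m, rapidity):
    """Boost an on-shell momentum along z; transverse components are unchanged."""
    w = energy(px, py, pz, m)
    pz_new = pz * math.cosh(rapidity) + w * math.sinh(rapidity)
    return px, py, pz_new


def measure_ratio(px, py, pz, m, rapidity, h=1e-6):
    """Return (d^3p'/omega') / (d^3p/omega) at the given momentum.

    p_x and p_y are unchanged by the boost, so the Jacobian determinant of
    p -> p' reduces to d(p_z')/d(p_z), estimated by a central difference.
    """
    up = boost_z(px, py, pz + h, m, rapidity)[2]
    down = boost_z(px, py, pz - h, m, rapidity)[2]
    jacobian = (up - down) / (2.0 * h)
    w_old = energy(px, py, pz, m)
    w_new = energy(*boost_z(px, py, pz, m, rapidity), m)
    return jacobian * w_old / w_new


if __name__ == "__main__":
    # Arbitrary sample point and rapidity; the ratio should be close to 1.
    print(measure_ratio(0.3, -0.2, 1.1, 0.5, 0.7))
```

The same check with $d^3p$ alone (dropping the factor `w_old / w_new`) does not give 1, which is why the non-covariant delta-function normalization needs compensating factors.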
² Here I'm using the notion of the infinitesimal generator of a symmetry transformation. If you don't know
this concept, take a quick look at Appendix E, or consult one of the many excellent introductions to Lie
groups [1-4].
This equation for the momentum-space wave function of "a particle localized at
the origin" is not the same as the one we are used to from non-relativistic quantum
mechanics. However, if we are in the non-relativistic regime where $|\mathbf{p}| \ll m$ then the
wave function reduces to $1/m$ times the non-relativistic formula. When relativity is
taken into account, the localized particle appears to be spread out over a distance of
order its Compton wavelength, $1/m = \hbar/(mc)$.
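
To get a feeling for the size of this spread, one can restore ordinary units with $\hbar c \approx 0.1973$ GeV fm. The following sketch is a rough illustration only; the particle masses are rounded values I have supplied, and the function name is hypothetical.

```python
# hbar * c in GeV * fm (standard conversion constant, rounded).
HBAR_C_GEV_FM = 0.1973269804


def compton_wavelength_fm(mass_gev):
    """Compton wavelength 1/m = hbar/(m c), returned in femtometres."""
    return HBAR_C_GEV_FM / mass_gev


if __name__ == "__main__":
    # Illustrative, rounded masses in GeV (assumed inputs, not from the text).
    for name, mass in [("electron", 0.000511), ("proton", 0.938)]:
        print(f"{name}: 1/m ~ {compton_wavelength_fm(mass):.3g} fm")
```

The electron comes out at a few hundred femtometres and the proton at a fraction of a femtometre: the lighter the particle, the larger the region over which it cannot be localized.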
Our formula for the emission/absorption amplitude is thus covariant, but it poses the
following paradox: it is non-zero when the separation between the emission and absorption
points is space-like. The causal order of two space-like separated points is not Lorentz-invariant
(Problem 2.1), so this is a real problem.
The only known solution to this problem is to impose a new physical postulate:
every emission source could equally well be an absorption source (and vice versa). We
will see the mathematical formulation of this postulate in the next chapter. Given this
postulate, we define a total source by $J(x) = J_E(x) + J_A(x)$ and write an amplitude
$$ A_{AE} = \int d^4x\, d^4y\, J(x)J(y) \int \frac{d^3p}{2\omega_p(2\pi)^3}\left[\theta(x^0 - y^0)e^{-ip(x - y)} + \theta(y^0 - x^0)e^{ip(x - y)}\right] = \int d^4x\, d^4y\, J(x)J(y)D_F(x - y), \tag{1.4} $$
where $\theta(x^0)$ is the Heaviside step function which is 1 for positive $x^0$ and vanishes for
$x^0 < 0$. From now on we will omit the superscript in the argument of these functions.
This formula is manifestly Lorentz-covariant when $x - y$ is time-like or null. When
the separation is space-like, the momentum integrals multiplying the two different step
functions are equal, and we can add them, again getting a Lorentz-invariant amplitude.
It is also consistent with causality. In any Lorentz frame, the term with $\theta(x^0 - y^0)$ is
interpreted as the amplitude for a positive-energy particle to propagate forward in time,
being emitted at $y$ and absorbed at $x$. The other term has a similar interpretation as
emission at $x$ and absorption at $y$. Different Lorentz observers will disagree about the
causal order when $x - y$ is space-like, but they will all agree on the total amplitude for
any distribution of sources.
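
The claim that the two momentum integrals agree at space-like separation can be sketched as follows. When $x - y$ is space-like there is a frame with $x^0 = y^0$; in that frame the exponents reduce to $\pm i\mathbf{p}\cdot\mathbf{r}$, and the change of variables $\mathbf{p} \to -\mathbf{p}$ does the rest. The symbols $I_\pm$ and $\mathbf{r}$ are introduced here only for this illustration.

```latex
\begin{aligned}
I_\pm(\mathbf{r}) &\equiv \int \frac{d^3p}{2\omega_p(2\pi)^3}\, e^{\pm i\mathbf{p}\cdot\mathbf{r}},
\qquad \mathbf{r} = \mathbf{x} - \mathbf{y}, \quad x^0 = y^0, \\
I_-(\mathbf{r}) &= \int \frac{d^3p}{2\omega_p(2\pi)^3}\, e^{-i\mathbf{p}\cdot\mathbf{r}}
\;\overset{\mathbf{p}\to-\mathbf{p}}{=}\;
\int \frac{d^3p}{2\omega_{-p}(2\pi)^3}\, e^{+i\mathbf{p}\cdot\mathbf{r}}
= I_+(\mathbf{r}), \qquad \text{since } \omega_{-p} = \omega_p .
\end{aligned}
```

Each integral is built from the invariant measure $d^3p/(2\omega_p)$ and a scalar exponent, so once the two agree in the equal-time frame they agree in every frame reachable from it.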
Something interesting happens if we assume that the particle has a conserved
Lorentz-invariant charge, like electric charge. In that case, one would have expected
to be able to correlate the question of whether emission or absorption occurred to the
amount of charge transferred between x and y. Such an absolute definition of emission
versus absorption is not consistent with the postulate that saved us from a causality
paradox. In order to avoid it we have to make another, quite remarkable, postulate:
every charge-carrying particle has an anti-particle of exactly equal mass and opposite
charge. If this is true we will not be able to use charge transfer to distinguish between
emission of a particle and absorption of an anti-particle. One of the great triumphs of
quantum field theory is that this prediction is experimentally verified. The equality of
particle and anti-particle masses has been checked to one part in $10^{18}$ [5].
Now let's consider a slightly more complicated process in which the particle scatters
from some external potential before being absorbed. Suppose that the potential is
short-ranged, and is turned on for only a brief period, so that we can think of it
as being concentrated near a space-time point $z$. The scattering amplitude will be
approximately given by propagation from the emission point to the interaction point
$z$, some interaction amplitude, and then propagation from $z$ to the absorption point.
We can draw a space-time diagram like Figure 1.2. We have seen that the propagation
amplitudes will be non-zero, even when all three points are at space-like separation
from each other. Then, there will be some Lorentz frame in which the causal order is
that given in the second drawing in the figure. An observer in this frame sees particles
created from the vacuum by the external field! Scattering processes inevitably imply
particle-production processes.³
Figure 1.2. Scattering in one frame is production amplitude in another.
We conclude that a theory consistent with special relativity, quantum mechanics, and
causality must allow for particle creation when the energetics permits it (in the example
of the previous paragraph, the time dependence of the external field supplies the energy
necessary to create the particles). This, as we shall see, is equivalent to the statement
that a causal, relativistic quantum mechanics must be a theory of quantized local fields.
Particle production also gives us a deeper understanding of why the single-particle wave
function is spread over a Compton wavelength. To localize a particle more precisely we
would have to probe it with higher momenta. Using the relativistic energy-momentum
relation, this means that we would be inserting energy larger than the particle mass.
This will lead to uncontrollable pair production, rather than localization of a single
particle.
³ Indeed, there are quantitative relations, called crossing symmetries, between the two kinds of amplitude.
Before leaving this introductory section, we can squeeze one more drop of juice
from our simple considerations. This has to do with how to interpret the propagation
amplitude $D_F(x - y)$ when $x - y$ is space-like, and we are in a Lorentz frame where
$x^0 = y^0$. Our remarks about particle creation suggest that we should interpret this as
the probability amplitude for two particles to be found at time $x^0$, at relative separation
$\mathbf{x} - \mathbf{y}$. Note that this amplitude is completely symmetric under interchange of $x$ and $y$,
which suggests that the particles are bosons. We thus conclude that spin-zero particles
must be bosons. It turns out that this is true, and is a special case of a theorem that
says that integer-spin particles are bosons and half-integer spin particles are fermions.
What is more, this is not just a mathematical theorem, but an experimental fact about
the real world. I should warn you, though, that unlike the other remarks in this section,
and despite the fact that it leads to a correct conclusion, the reasoning here is not a
cartoon of a rigorous mathematical argument. The interpretation of the equal-time
propagator as a two-particle amplitude is of limited utility.
2 Quantum theory of free scalar fields
V. Fock invented an efficient method for dealing with multiparticle states. We will
work with delta-function-normalized single-particle states in describing Fock space.
This has the advantage that we never have to discuss states with more than one
particle in exactly the same state, and various factors involving the number of par-
ticles drop out of the formulae. In this section we will continue to work with spinless
particles.
Start by defining Fock space as the direct sum $\mathcal{F} = \bigoplus_{k=0}^{\infty}\mathcal{H}_k$, where $\mathcal{H}_k$ is the Hilbert
space of $k$-particle states. We will assume that our particles are either bosons or fermions,
so these states are either totally symmetric or totally anti-symmetric under particle
interchange. In particular, if we work in terms of single-particle momentum eigenstates,
$\mathcal{H}_k$ consists of states of the form $|p_1, \ldots, p_k\rangle$, either symmetric or anti-symmetric under
permutations. The inner product of two such states is
$$ \langle p_1, \ldots, p_k|q_1, \ldots, q_l\rangle = \delta_{kl}\,\frac{1}{k!}\sum_{\sigma}(-1)^{S\sigma}\,\delta^3(p_1 - q_{\sigma(1)})\ldots\delta^3(p_k - q_{\sigma(k)}), \tag{2.1} $$
where the sum is over all permutations $\sigma$ in the symmetric group, $S_k$, $(-1)^{\sigma}$ is the sign
of the permutation, and the statistics factor, $S$, is 0 for bosons and 1 for fermions.
The $k = 0$ term in this direct sum is a one-dimensional Hilbert space containing a
unique normalized state, called the vacuum state and denoted by $|0\rangle$.
In ordinary quantum mechanics one can contemplate particles that form different
representations of the permutation group than bosons or fermions. Although we will
not prove it in general, this is impossible in quantum field theory. In one or two spatial
dimensions, one can have particles with different statistical properties (braid statistics),
but these can always be thought of as bosons or fermions with a particular long-range
interaction. In three or more spatial dimensions, only Bose and Fermi statistics are
allowed for particles in Lorentz-invariant QFT.
Fock realized that one can organize all the multiparticle states together in a way
that simplifies all calculations. Starting with the normalized state |0), which has no
particles in it, we introduce a set of commuting, or anti-commuting, operators $a^\dagger(p)$ and
define
$$ |p_1, \ldots, p_k\rangle = a^\dagger(p_1)\ldots a^\dagger(p_k)|0\rangle. \tag{2.2} $$
These are called the creation operators. The scalar-product formula is reproduced
correctly if we postulate the following (anti-)commutation relation between creation
operators and their adjoints¹ (called the annihilation operators):
$$ [a(p), a^\dagger(q)]_\pm = \delta^3(p - q). \tag{2.3} $$
Fermions are made with anti-commutators, and bosons with commutators.
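
A finite-dimensional toy version of the bosonic case can make this algebra concrete. The sketch below is an assumption-laden illustration, not the continuum construction of the text: it keeps a single discrete mode (so the delta function becomes the number 1), truncates the occupation number at `dim - 1`, and uses plain nested lists for matrices. All names are hypothetical.

```python
import math


def annihilation_matrix(dim):
    """Matrix of a in the basis |0>, ..., |dim-1>, with a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a


def transpose(m):
    """Transpose; equals the adjoint here because the entries are real."""
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(x, y):
    """[x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]


if __name__ == "__main__":
    dim = 5
    a = annihilation_matrix(dim)
    a_dag = transpose(a)
    print([row[i] for i, row in enumerate(commutator(a, a_dag))])
    print([row[i] for i, row in enumerate(matmul(a_dag, a))])
```

Below the cut-off the commutator is the identity and $a^\dagger a$ counts particles; the last diagonal entry of the commutator is an artefact of the truncation, which disappears only in the infinite-dimensional limit.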
To get a little practice with Fock space, let's construct the representation of the
Poincaré symmetry² on the multiparticle Hilbert space. We begin with the energy and
momentum. These are diagonal on the single-particle states. The correct Fock-space
formula for them is
$$ P^\mu = \int d^3p\, p^\mu\, a^\dagger(p)a(p), \tag{2.4} $$
where $p^0 = \omega_p$. It's easy to verify that this operator does indeed give us the sum of the $k$
single-particle energies and momenta, when acting on a $k$-particle state. This is because
$n_p = a^\dagger(p)a(p)$ acts as the particle number density in momentum space. There is a
similar formula for all operators that act on a single particle at a time. For example,
the rotation generators are
$$ J_{ij} = \int d^3p\, a^\dagger(p)\, i(p_i\partial_j - p_j\partial_i)\, a(p). \tag{2.5} $$
Here $\partial_j = \partial/\partial p^j$.
It is easy to verify that the following formula defines a unitary representation of the
Lorentz group on single-particle states:
$$ U(\Lambda)|p\rangle = \sqrt{\frac{\omega_{\Lambda p}}{\omega_p}}\,|\Lambda p\rangle. $$
The reason for the funny factor in this formula is that the Dirac delta function in our
definition of the normalization is not covariant because it obeys
$$ \int d^3p\, \delta^3(p) = 1. $$
A Lorentz-invariant measure of integration on positive-energy time-like 4-vectors is
$$ \int d^4p\, \theta(p^0)\delta(p^2 - m^2) = \int \frac{d^3p}{2\omega_p}. $$
The factors in the definition of $U(\Lambda)$ make up for this non-covariant choice of
normalization.
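
One way to organize the verification is sketched below. It assumes only the normalization $\langle p|q\rangle = \delta^3(p - q)$ and the invariance of $d^3p/\omega_p$, which implies that $\omega_p\,\delta^3(p - q)$ is invariant.

```latex
\begin{aligned}
\langle p|U^\dagger(\Lambda)U(\Lambda)|q\rangle
&= \sqrt{\frac{\omega_{\Lambda p}\,\omega_{\Lambda q}}{\omega_p\,\omega_q}}\;
   \delta^3(\Lambda p - \Lambda q), \\
\omega_{\Lambda p}\,\delta^3(\Lambda p - \Lambda q) &= \omega_p\,\delta^3(p - q)
\quad\Longrightarrow\quad
\langle p|U^\dagger(\Lambda)U(\Lambda)|q\rangle
= \frac{\omega_{\Lambda p}}{\omega_p}\cdot\frac{\omega_p}{\omega_{\Lambda p}}\,\delta^3(p - q)
= \delta^3(p - q).
\end{aligned}
```

In the last step the square root has been evaluated at $p = q$, where the delta function has its support.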
¹ This is an extremely important claim. It's easy to prove and every reader should do it. The same remark
applies to all of the equations in this subsection. Commutators and anti-commutators of operators are
defined by $[A, B]_\pm = AB \pm BA$. Assume that $a(p)|0\rangle = 0$.
² Poincaré symmetry is the semi-direct product of Lorentz transformations and translations. Semi-direct
means that the Lorentz transformations act on the translations.
A general Lorentz transformation is the product of a rotation and a boost, so in
order to complete our discussion of Lorentz generators we have to write a formula for
the generator of a boost with infinitesimal velocity $v$. We write it as $v^i J_{0i}$. Under such
a boost, $p^i \to p^i + v^i\omega_p$ and $\omega_p \to \omega_p + v^i p_i$. Thus
$$ \delta|p\rangle = \left(\frac{v^i p_i}{2\omega_p} + \omega_p v^i\frac{\partial}{\partial p^i}\right)|p\rangle. $$
The Fock-space formula for the boost generator is then
2.1 Local fields
We now want to model the response of our infinite collection of scalar particles to
a localized source $J(x)$. We do this by adding a term to the Hamiltonian (in the
Schrödinger picture)
$$ H \to H + V(t), \tag{2.6} $$
$$ V(t) = \int d^3x\, \phi(\mathbf{x})J(\mathbf{x}, t). \tag{2.7} $$
$\phi(\mathbf{x})$ must be built from creation and annihilation operators. It must transform into
$\phi(\mathbf{x} + \mathbf{a})$ under spatial translations. This is guaranteed by writing $\phi(\mathbf{x}) = \int d^3p\, e^{i\mathbf{p}\cdot\mathbf{x}}\phi(\mathbf{p})$,
where $\phi(\mathbf{p})$ is an operator carrying momentum $\mathbf{p}$.
We want to model a source that creates and annihilates single particles. This statement
is meant in the sense of perturbation theory. That is, the amplitude J for the source to
create a single particle is small. It can create multiple particles by multiple action of
the source, which will be higher-order terms in a power series in $J$. Thus the field $\phi(\mathbf{x})$
should be linear in creation and annihilation operators:
$$ \phi(\mathbf{x}) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}}\left[a(p)\,\alpha\, e^{i\mathbf{p}\cdot\mathbf{x}} + a^\dagger(p)\,\alpha^* e^{-i\mathbf{p}\cdot\mathbf{x}}\right]. \tag{2.8} $$
We have also imposed Hermiticity of the Hamiltonian, assuming that the source function
is real. $\alpha$ could be a general complex constant, but we have already defined a
normalization for the field, and we can absorb the phase of $\alpha$ into the creation and
annihilation operators, so we set $\alpha = 1$.
We now want to study the time development of our system in the presence of the
source. We are not really interested in the free motion of the particles, but rather in the
question of how the source causes transitions between eigenstates of the free Hamil-
tonian. Dirac invented a formalism, called the Dirac picture, for studying problems of
this sort. Let $U_S(t, t_0)$ be the time evolution operator of the system in the Schrödinger
picture. It satisfies
$$ i\partial_t U_S = [H_0 + V(t)]U_S \tag{2.9} $$
and the boundary condition
$$ U_S(t_0, t_0) = 1. \tag{2.10} $$
The Dirac-, or interaction-, picture evolution operator is defined by
$$ U_D(t, t_0) = e^{iH_0 t}\,U_S(t, t_0)\,e^{-iH_0 t_0} \tag{2.11} $$
and the same boundary condition. If the interaction $V(t)$ is zero, $U_D = 1$. It describes
how $H_0$ eigenstates evolve under the influence of the perturbation.
Simple manipulations (Problem 2.2) show that the Dirac-picture evolution operator
satisfies
$$ i\partial_t U_D = W(t)U_D, \tag{2.12} $$
where $W(t) = e^{iH_0 t}V(t)e^{-iH_0 t}$. In words, the formula for $W(t)$ means that it is constructed
from the interaction potential in the Schrödinger picture by replacing every
operator by the corresponding Heisenberg-picture operator in the unperturbed theory. For
our problem,
$$ W(x^0) = \int d^3x\, J(\mathbf{x}, x^0)\phi(\mathbf{x}, x^0), \tag{2.13} $$
where the Heisenberg-picture field is
$$ \phi(\mathbf{x}, x^0) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}}\left[a(p)e^{-ipx} + a^\dagger(p)e^{ipx}\right]. \tag{2.14} $$
Note carefully that in the last formula we have Minkowski scalar products $px = p^0x^0 - \mathbf{p}\cdot\mathbf{x}$
($p^0 = \omega_p$) replacing the three-dimensional scalar products of the Schrödinger
picture field. For future reference note also that all of the space and time dependence
is contained in the exponentials, which are solutions of the Klein-Gordon equation.
Thus the Heisenberg-picture field satisfies the Klein-Gordon equation ($\Box = \partial_0^2 - \nabla^2$)
$$ (\Box + m^2)\phi = 0. \tag{2.15} $$
In fact, the exponentials are the most general solutions of the Klein-Gordon equa-
tion, so the formula (2.14) turns solutions of the Klein-Gordon field equations into
quantum operators. Willy-nilly we find ourselves studying the theory of quantized
fields!
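
The statement that the exponentials solve the Klein-Gordon equation only when $p^0 = \omega_p$ is easy to test numerically. The sketch below works in one space dimension and uses finite differences; the reduction to 1+1 dimensions, the step size, and the function names are assumptions made for the illustration.

```python
import cmath
import math


def plane_wave(t, x, p, omega):
    """exp(-i(omega t - p x)), i.e. e^{-ipx} with the mostly-minus metric."""
    return cmath.exp(-1j * (omega * t - p * x))


def klein_gordon_residual(t, x, p, m, omega, h=1e-3):
    """Finite-difference estimate of (d_t^2 - d_x^2 + m^2) acting on the wave."""
    centre = plane_wave(t, x, p, omega)
    d2t = (plane_wave(t + h, x, p, omega) - 2 * centre
           + plane_wave(t - h, x, p, omega)) / h ** 2
    d2x = (plane_wave(t, x + h, p, omega) - 2 * centre
           + plane_wave(t, x - h, p, omega)) / h ** 2
    return d2t - d2x + m * m * centre


if __name__ == "__main__":
    p, m = 1.3, 0.7
    on_shell = math.sqrt(p * p + m * m)
    print(abs(klein_gordon_residual(0.4, -1.2, p, m, on_shell)))        # tiny
    print(abs(klein_gordon_residual(0.4, -1.2, p, m, on_shell + 0.5)))  # not small
```

Off shell the residual is of order $|\omega_p^2 - (p^0)^2|$, so only the on-shell exponentials survive as solutions.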
It is easy to solve the evolution equation (2.12) by infinitely iterated integration. This
is a formal, perturbative solution, but in the present case it actually sums up to the
exact answer:
$$ U_D(t, t_0) = \sum_{n=0}^{\infty}(-i)^n\int_{t_0}^{t} dt_1\int_{t_0}^{t_1} dt_2\ldots\int_{t_0}^{t_{n-1}} dt_n\, W(t_1)\ldots W(t_n). \tag{2.16} $$
Note that the operator order in this formula mirrors the time order: the leftmost oper-
ator is at the latest time etc. This suggests a more elegant way of writing the formula.
Define the time-ordered product, $TW(t_1)\ldots W(t_n)$, to be the product of the operators
with the order defined by their time arguments. Thus, e.g.
$$ TW(t_1)W(t_2) = \theta(t_1 - t_2)W(t_1)W(t_2) + \theta(t_2 - t_1)W(t_2)W(t_1). \tag{2.17} $$
Alert readers will note the similarity of this construction to our discussion of causality
in the introduction. If, in the formula for $U_D$, we replace ordinary operator products by
time-ordered products, and allow all the ranges of integration to run from $t_0$ to $t$, then
the only mistake we are making is over-counting the result $n!$ times. A correct formula
is then
$$ U_D(t, t_0) = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int_{t_0}^{t} dt_1\ldots dt_n\, TW(t_1)\ldots W(t_n) \equiv Te^{-i\int_{t_0}^{t} dt\, W(t)}. \tag{2.18} $$
The last form of the equation is a notational shorthand for the infinite sum.
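The role of the time-ordering symbol can be seen in a small numerical sketch. The two-level "interaction" below is a hypothetical stand-in for $W(t)$ (it is not the field-theory operator of the text), and the time-ordered exponential is approximated by an ordered product of short-time steps rather than by summing the series term by term.

```python
import numpy as np

# Pauli matrices: a hypothetical two-level stand-in for the operator W(t).
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def step(h, dt):
    """Return exp(-i h dt) for a Hermitian matrix h via its eigen-decomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * w * dt)) @ v.conj().T

def time_ordered_exp(W, t0, t, n=2000):
    """Approximate T exp(-i int_{t0}^{t} W(s) ds) as an ordered product of short steps."""
    dt = (t - t0) / n
    U = np.eye(2, dtype=complex)
    for k in range(n):
        s = t0 + (k + 0.5) * dt      # midpoint of the k-th time slice
        U = step(W(s), dt) @ U       # later times multiply from the left
    return U

def naive_exp(W, t0, t, n=2000):
    """exp(-i int W ds) with the time ordering ignored."""
    dt = (t - t0) / n
    integral = sum(W(t0 + (k + 0.5) * dt) for k in range(n)) * dt
    return step(integral, 1.0)

def W_noncommuting(s):
    return np.cos(s) * SX + np.sin(s) * SZ   # [W(s), W(s')] != 0

def W_commuting(s):
    return np.cos(s) * SZ                    # W commutes with itself at all times

U_T = time_ordered_exp(W_noncommuting, 0.0, 2.0)
U_naive = naive_exp(W_noncommuting, 0.0, 2.0)
U_c = time_ordered_exp(W_commuting, 0.0, 2.0)
U_c_naive = naive_exp(W_commuting, 0.0, 2.0)
```

When $W$ commutes with itself at different times the two constructions agree. Otherwise only the ordered product is the evolution operator, and the ordinary exponential of the integrated interaction gives a different unitary matrix.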
In the case at hand, the formula looks even more elegant, because space and time inte-
grations combine into a Lorentz -invariant measure. We have, in an obvious shorthand,
$$U_D(t, t_0) = T e^{-i\int d^4x\, \phi(x)J(x)}. \qquad (2.19)$$
We have already argued that we should allow the source function J(x) to transform
like a scalar field under Lorentz transformations. You will verify, in Problem 2.5, that
the field $\phi(x)$ transforms as a scalar as well. The only things in our formula that are not
Lorentz-covariant are the (implicit) end points of the time integration and the time-
ordering symbol. They both appear because the Hamiltonian formulation of quantum
mechanics requires us to choose $H = P^0$ for some particular Lorentz frame.
We can solve the first problem by simply taking the limits of the time integration to
±oo. We can't expect the answer to what happens to the system as it evolves between
two fixed space-like surfaces to be independent of what those surfaces are. The infinite
time limit of the Dirac-picture evolution operator is called the scattering operator, or
S-matrix. It is reasonable to expect that the S-matrix is Lorentz-covariant. Its only
dependence on the Lorentz frame should come from that of the external source J(x).
The formula (2.19) also depends on the Lorentz frame because of the time-ordering
operation. When field arguments are at space-like separation, the causal order depends
on the Lorentz frame and should not appear in physical amplitudes. We can enforce
this by insisting that
$$[\phi(x), \phi(y)] = 0 \quad \text{if} \quad (x - y)^2 < 0, \qquad (2.20)$$
because then, for space-like separation, the time ordering is superfluous. 4 The require-
ment that fields commute at space-like separation is called the locality postulate.
4 Actually, one has to be a bit more careful in defining the time-ordered product at coinciding points in
order to make sure that the locality postulate is enough to guarantee Lorentz invariance of the S-matrix.
The time-ordered products are singular at coinciding points and this issue is properly treated as part of
renormalization theory.
Together with some assumptions about the asymptotic behavior of the energy spectrum,
this is the defining property of local field theory.
We note that one could have achieved the same end had we insisted that the sources
were anti-commuting, or Grassmann, variables and required the fields to anti-commute
at space-like distances. Although at first sight this seems bizarre, we will see that it is
the right way to treat the fields of particles obeying Fermi statistics. We can now check
whether our formula for the field satisfies these requirements. This is the content of
Problem 2.5. The result is that, for spinless particles, locality can be imposed only for
Bose statistics. This is the first step in the proof of the spin-statistics theorem: in local
field theory in four and higher dimensions, all integer-spin particles must be bosons and
all half-integer-spin particles fermions. We will not prove this theorem in full generality
[6-8], but will see several more examples below. Problem 2.6 generalizes the result to
charged particles as well.
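The structure of the locality check can be sketched in a few lines. The sketch assumes the mode expansion (2.14) and the normalization $[a(p), a^\dagger(q)] = \delta^3(\mathbf{p} - \mathbf{q})$, which is the one consistent with the measure in (2.14); it is an outline of the computation, not a replacement for Problem 2.5.

$$[\phi(x), \phi(y)] = \int \frac{d^3p}{(2\pi)^3\, 2\omega_p}\left(e^{-ip(x-y)} - e^{ip(x-y)}\right).$$

At equal times, $x^0 = y^0$, the substitution $\mathbf{p} \to -\mathbf{p}$ in the second term turns it into the first, so the difference vanishes. The measure $d^3p/2\omega_p$ is Lorentz-invariant, and any space-like separation can be brought to equal times by a proper orthochronous transformation (Problem 2.1), so the commutator vanishes whenever $(x-y)^2 < 0$. With anti-commutators in place of commutators the two terms would add instead of cancelling.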
A conserved internal symmetry charge is an operator that commutes with the
Poincaré generators, which can be written as an integral of a local density, $Q = \int d^3x\, J^0$.
We can choose particle states to be eigenstates of $Q$. A particle with annihilation
operator $a(p)$ may have an anti-particle with annihilation operator $b(p)$. We have
$[Q, a(p)] = a(p)$ and $[Q, b(p)] = -b(p)$, expressing the fact that the particle and
anti-particle have opposite charge. The most general scalar field which creates or annihilates
single particles, and annihilates one unit of charge, has the form
$$\phi(x) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\sqrt{\mathbf{p}^2 + m_+^2}}}\; a(p)\,e^{-ipx} + \alpha \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\sqrt{\mathbf{p}^2 + m_-^2}}}\; b^\dagger(p)\,e^{ipx},$$
where $\alpha$ is a complex number and $m_\pm$ are the masses of the particle and anti-particle.
Problem 2.6 shows that we must choose $\alpha = 1$, $m_+ = m_-$, and Bose statistics to satisfy
the locality postulate.
2.2 Problems for Chapter 2
*2.1. Show that a Lorentz transformation, satisfying $\Lambda^T g \Lambda = g$, has $\det\Lambda = \pm 1$
and $|\Lambda^0{}_0| \ge 1$. The Lorentz group thus breaks up into four disconnected
components depending on these two choices of sign. Only the $\Lambda^0{}_0 \ge 1$, $\det\Lambda = 1$
component is continuously connected to the identity. These are called proper
orthochronous Lorentz transformations. Show that we can get to the other
components of the group by multiplying proper orthochronous transformations by
time reversal, space reflection, or the product of these two. Show that proper
orthochronous Lorentz transformations preserve the sign of the time component
of time-like and null 4-vectors. Show that, for a space-like vector, an appropriate
proper orthochronous Lorentz transformation can change the sign of the time
component or set it equal to zero.
*2.2. The Dirac-picture evolution operator is defined by
$$U_D(t, t_0) = e^{iH_0 t}\, U_S(t, t_0)\, e^{-iH_0 t_0},$$
where
$$i\partial_t U_S = (H_0 + V)\,U_S$$
is the equation satisfied by the Schrödinger-picture evolution operator. Show that
$$i\partial_t U_D = V(t)\,U_D,$$
where
$$V(t) = e^{iH_0 t}\, V\, e^{-iH_0 t}.$$
*2.3. Show that the solution of the Dirac-picture equation is
$$U_D(t, t_0) = T e^{-i\int_{t_0}^{t} V(s)\,ds}.$$
*2.4. Compute the overlap of the ground-state wave functions of a harmonic oscillator
with two different frequencies. A free-bosonic field theory is just a collection of
oscillators. Use your calculation to show that the overlap of the ground states
for two different values of the mass is zero for any field theory, in any number
of dimensions and infinite volume. Show that the overlap is zero even in finite
volume if the number of space dimensions is two or greater. This is symptomatic
of a more general problem. The states of two field theories, containing the same
fields but with different parameters in the Lagrangian, do not live in the same
Hilbert space. The formulation of field theory in terms of Green functions and
functional integrals avoids this problem.
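*2.5. Show that the scalar field
$$\phi(x) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}}\left[a(p)\,e^{-ipx} + a^\dagger(p)\,e^{ipx}\right]$$
does indeed transform as a scalar, as a consequence of the transformation law
of the annihilation operators. Show that it is local if and only if the particles are
bosons. Compute the equal-time commutators
$$[\phi(t, \mathbf{x}), \phi(t, \mathbf{y})] = 0,$$
$$[\phi(t, \mathbf{x}), \dot\phi(t, \mathbf{y})] = i\delta^3(\mathbf{x} - \mathbf{y}).$$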
In the language of classical field theory, this means that the field and its time
derivative are canonically conjugate.
*2.6. Repeat the locality computation for the charged scalar field
$$\phi(x) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}}\left(a(p)\,e^{-ipx} + \eta\, b^\dagger(p)\,e^{ipx}\right).$$
In this case the non-trivial computation is that of the commutator $[\phi(x), \phi^\dagger(y)]$.
Show that the mass of the particle and that of the anti-particle (annihilated by $a$
and $b$, respectively) must be equal and that the complex number $\eta$ is a pure phase,
which can be absorbed into the definition of $b$.
*2.7. Use the Dirac-picture evolution equations, with $V = -\int d^3x\, \mathcal{L}_I$, to show that
the interaction-picture evolution operator over infinite times is Lorentz-invariant
if $\mathcal{L}_I$ is a local scalar operator.
*2.8. The states of a particle of mass $m$ and spin $j$ at rest transform according to the
spin-$j$ representation of the rotation group:
$$U(R)|0, k\rangle = D^j_{kl}(R)\,|0, l\rangle.$$
Define the states with momentum $p$ by
$$|p, k\rangle = \sqrt{\frac{m}{\omega_p}}\; U(L(p))\,|0, k\rangle.$$
$L(p)$ is a boost that takes the rest-frame momentum into $(\omega_p, \mathbf{p})$. Show that the
Lorentz transformation law of these states is
$$U(\Lambda)|p, k\rangle = \sqrt{\frac{\omega_{\Lambda p}}{\omega_p}}\; D^j_{kl}(R_W)\,|\Lambda p, l\rangle,$$
where $R_W = L^{-1}(\Lambda p)\,\Lambda\, L(p)$ is called the Wigner rotation. We can prove that
it is a rotation by showing that it leaves the rest frame invariant. Using these
transformation laws, show that one can construct fields transforming in the
$[2j + 1, 1]$ and $[1, 2j + 1]$ representations of the Lorentz group (see Chapter 5),⁵
which are linear in creation and annihilation operators of these particles. Show
that these fields are Hermitian and local only if we choose Fermi statistics for
half-integer-spin and Bose statistics for integer-spin particles. Compute the time-
ordered two-point functions for these fields and note the ultraviolet behavior of
the momentum-space propagator as a function of the spin.
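*2.9. Write an infinitesimal Lorentz transformation as $\Lambda \approx 1 + i\omega^{\mu\nu}J_{\mu\nu}$, where $J_{\mu\nu}$
is a basis in the space of all $\eta$-anti-symmetric $4 \times 4$ matrices ($J\eta + \eta J^T = 0$; $\eta$ is
the Minkowski metric, considered as a matrix). Show that
$$[J_{\mu\nu}, J_{\alpha\beta}] = i\left(\eta_{\mu\alpha}J_{\nu\beta} - \eta_{\nu\alpha}J_{\mu\beta} + \eta_{\nu\beta}J_{\mu\alpha} - \eta_{\mu\beta}J_{\nu\alpha}\right).$$
Write this out for the individual components $J_{0i}$ and $J_{ij} = \epsilon_{ijk}J_k$.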
2.10. Show that the stability subgroup of a null momentum⁶ is isomorphic to the
Euclidean group of translations and rotations in two dimensions. The only finite-
dimensional representations of this group have the translation generators set to
zero, while the rotation generator has a fixed eigenvalue, the helicity h. Follow-
ing the method of induced representations we used for massive particles in the
previous problem, work out the unitary representation of the Poincare group on
5 These representations are often denoted by their highest weight, rather than their dimension, and are
called $[j, 0]$ and $[0, j]$ representations.
6 Subgroup of the Poincaré group which leaves it invariant.
single massless particle states and on the corresponding creation and annihila-
tion operators. Show that, in order to construct Lorentz-covariant local fields,
we must include both signs of helicity, ±h, and the helicity must be quantized
in half-integer units (for fermions) and integer units (for bosons). Show that
for helicity ±1 the smallest-dimension field operator transforms like the
electromagnetic field strength $F_{\mu\nu}$. Work out the analogous statement for helicity
3/2 and 2.
2.11. Calculate the vacuum expectation value of the time-ordered exponential
$$\langle 0|\,T e^{i\int d^4x\, \phi(x)J(x)}\,|0\rangle,$$
for a free field. Do this by writing $\phi(x)$ as a sum of an operator involving only
creation operators and another with only annihilation operators. Compute a
few terms in the power-series expansion in $J$, in order to convince yourself that
the answer is
$$\exp\left[-\frac{1}{2}\int d^4x \int d^4y\; J(x)J(y) \int \frac{d^4p}{(2\pi)^4}\, \frac{i\,e^{-ip(x-y)}}{p^2 - m^2 + i\epsilon}\right],$$
which is the exponential of the second-order term. It should be easy for example
to prove that all odd terms vanish. We will see a simpler derivation of this
result below, in the language of functional integrals. Now consider the case
$J(x) = \theta(T - t)\,\theta(t + T)\,[\delta^3(\mathbf{x}) - \delta^3(\mathbf{x} - \mathbf{R})]$, with $T \gg R \gg 1/m$, where $m$ is
the mass of the field. Show that in this limit the answer has the form
$$e^{-2iT\,V(R)}$$
(up to $R$-independent self-energy terms),
where the potential $V(R)$ is the Yukawa potential. We describe this result in
the words: static sources for a field experience a force due to exchange of virtual
particles.
Interacting field theory
Our considerations so far give us a description of free relativistic spin-zero particles.
Problem 2.11 shows us that exchange of particles between sources leads to what Newton
called a force (gradient of a position-dependent potential energy) between the sources.
We can think of the mathematical sources of that problem as models for a pair of
infinitely heavy particles, separated by a distance R. So forces arise by the exchange of
virtual particles between other particles. This motivates the idea that the way to intro-
duce interactions between particles is to introduce a perturbation to the Lagrangian
density of the Klein-Gordon equation,
$$\mathcal{L} = \frac{1}{2}\left[(\partial\phi)^2 - m^2\phi^2\right] + \mathcal{L}_I, \qquad (3.1)$$
where e.g. $\mathcal{L}_I$ has terms with two creation operators and one annihilation operator
and vice versa. Then $\phi$ particles can interact via two operations of $\mathcal{L}_I$. Starting from
a two-particle state we can create an extra particle near particle 1, with one operation
of $\mathcal{L}_I$, and let that new particle propagate to the vicinity of particle 2, where it is
reabsorbed. The condition that these expressions be compatible with Lorentz invariance
and causality is, not surprisingly (Problem 2.7), that the interaction Lagrangian $\mathcal{L}_I(x)$
be local:
$$[\mathcal{L}_I(x), \mathcal{L}_I(y)] = 0 \quad \text{if} \quad (x - y)^2 < 0. \qquad (3.2)$$
Thus, causal Lorentz-invariant theories of interacting particles in Minkowski space-
time are identical to local quantum field theories with Lagrangians that are not purely
quadratic in the fields. The rest of this book is devoted to exploring such theories,
mostly in a perturbation expansion when the interactions are weak. We will touch on
non-perturbative issues in our discussions of chiral symmetry breaking and confine-
ment in quantum chromodynamics, instantons and solitons, and in our treatment of
renormalization.
3.1 Schwinger-Dyson equations and
functional integrals
Problem 2.4 shows us that many of the tools of conventional Hilbert-space quantum
mechanics become problematic in interacting quantum field theory (QFT). The correct
Hilbert-space formulation often depends on the interaction. Wightman and others,
following the seminal work of Schwinger and Dyson (SD) [9-11], showed how the
Hilbert space could be reconstructed from generalized Green¹ functions. We will follow
a variant of the SD approach to the derivation of a set of equations for Green functions,
which determine them and lead to a formal solution of the problem of QFT in terms of
Feynman path integrals, or functional integrals. This formulation is manifestly covariant
and easily amenable to a variety of approximation schemes.
The generalized $n$-point Green functions are the expressions
$$\langle 0|\,T\phi(x_1)\cdots\phi(x_n)\,|0\rangle,$$
where $\phi$ is the Heisenberg field operator. These functions would appear in the
Dirac-picture perturbation expansion of the response of a QFT to an external source
$H \to H - \int d^3x\, J(\mathbf{x}, t)\,\phi(\mathbf{x}, t)$. It is easy to see (Problem 3.2) that in conventional quantum
mechanics these would determine the ground-state wave function and the Hamiltonian
of the system. In field theory they lead directly to a computation of all particle masses
and all scattering amplitudes of those particles, as we will see below in the section on
the Lehmann-Symanzik-Zimmermann (LSZ) formula. This is done in a completely
covariant manner, without solving the Schrodinger equation or worrying about whether
the free and interacting theories live in the same Hilbert space.
The idea of SD was to use the Heisenberg equations of motion and canonical com-
mutation relations (Problem 2.5) to derive a closed set of equations for the Green
functions. The generating functional for the Green functions is the vacuum persistence
amplitude in the presence of the source,
$$Z[J] = \langle 0|\,T e^{i\int d^4x\, J(x)\phi(x)}\,|0\rangle. \qquad (3.3)$$
In free scalar field theory it is easy to work this out by operator methods. The action
of the source n times can create n particles, which must then be re-annihilated to get a
state that has overlap with the vacuum. Our aim will be to find an equation for Z[J],
allowing us to solve for it in a general field theory.
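If the Lagrangian density is $\mathcal{L} = \left[\frac{1}{2}(\partial_\mu\phi)^2 - V(\phi)\right]$, then a naive application of the
Heisenberg equations of motion gives
$$\partial^2 \frac{\delta Z}{i\,\delta J(x)} = -\langle 0|\,T\, \frac{\partial V}{\partial\phi}(x)\, e^{i\int d^4y\, J(y)\phi(y)}\,|0\rangle. \qquad (3.4)$$
This can be rewritten as
$$\partial^2 \frac{\delta Z}{i\,\delta J(x)} = -V'\!\left[\frac{\delta}{i\,\delta J(x)}\right] Z[J]. \qquad (3.5)$$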
In these equations, I have introduced some notation that we will be using throughout
the book. Square brackets in $Z[J]$ denote the fact that $Z$ is a functional of the function
1 Here I follow a revered teacher who believed that the locution "Green's functions" was
[…] Whittaker […]
the possessive. What's so special about Mr. Green?
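$J$. That is, it is a rule that gives us a number for each function. All the functionals we
will consider will have power-series expansions of the form
$$F[J] = \sum_n \frac{1}{n!} \int d^4x_1 \cdots d^4x_n\; F_n(x_1, \ldots, x_n)\, J(x_1)\cdots J(x_n),$$
where the integral is over all the indicated coordinates. The functional derivative of
such functionals is defined by the formal rule
$$\frac{\delta J(x)}{\delta J(y)} = \delta^4(x - y).$$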
The rigorous mathematical meaning of many of our equations will take mathemati-
cians decades to work out. Our attitude is that all of the equations of continuum
QFT have a formal sense only. They are really defined by the process of regularization
and renormalization, which we will study in Chapter 9. The real world is probably
not described by a mathematically well-defined continuum QFT. The combination of
quantum mechanics and gravity defines a fundamental length scale of order $10^{-33}$ cm,
and QFT almost surely fails at that scale. Our real challenge will be to understand
how to define a procedure for making predictions about length scales accessible in the
laboratory, which depends as little as possible on the quantum gravitational physics we
do not yet understand.
Returning to our formal equation (3.4), the reader should review this carefully to
make sure that she/he understands that the time ordering is being applied both to
the space-time point x and to the multiple integration variables in the expansion
of the exponential. This is what allows us to differentiate operator expressions as if
they were ordinary functions. The time ordering takes care of the operator-ordering
problems.
Despite this cunning trick, we have made a mistake. The differential operator
$\partial^2 = \partial_0^2 - \nabla^2$ contains time derivatives, which act on the Heaviside functions in the
time-ordering operation. Consider
$$\partial_0^2 \langle 0|T\phi(x)\phi(y)|0\rangle = \partial_0\left[\langle 0|T\,\partial_0\phi(x)\,\phi(y)|0\rangle + \delta(x^0 - y^0)\,\langle 0|[\phi(x), \phi(y)]|0\rangle\right]. \qquad (3.6)$$
The second term comes from differentiating the Heaviside functions
($\partial_0\theta(x^0 - y^0) = -\partial_0\theta(y^0 - x^0) = \delta(x^0 - y^0)$).² It gives rise to an equal-time commutator, which vanishes
by virtue of the canonical commutation relations you worked out in Problem 2.5.
However, when we perform the second time derivative, we get a similar term, which
involves the equal-time commutator of $\partial_0\phi$ and $\phi$. Since these variables are canonically
conjugate, this term is proportional to $\delta^3(\mathbf{x} - \mathbf{y})$. Thus
$$\partial_0^2 \langle 0|T\phi(x)\phi(y)|0\rangle = \langle 0|T\,\partial_0^2\phi(x)\,\phi(y)|0\rangle - i\delta^4(x - y). \qquad (3.7)$$
When we apply the second derivative operator to $\langle 0|T\phi(x)\phi(y_1)\cdots\phi(y_n)|0\rangle$ we
observe a similar phenomenon (Problem 3.1). Differentiation of the Heaviside functions
2 Note that all of the derivatives in this section are $x$ derivatives.
gives rise to equal-time commutators of $\phi(x)$ and $\partial_0\phi(x)$ with each of the fields
at $y_i$:
$$\partial_0^2 \langle 0|T\phi(x)\phi(y_1)\cdots\phi(y_n)|0\rangle = \langle 0|T\,\partial_0^2\phi(x)\,\phi(y_1)\cdots\phi(y_n)|0\rangle - i\sum_j \delta^4(x - y_j)\,\langle 0|T\phi(y_1)\cdots\phi(y_{j-1})\phi(y_{j+1})\cdots\phi(y_n)|0\rangle. \qquad (3.8)$$
If we multiply this equation by $J(y_1)\cdots J(y_n)$, integrate, and use the Heisenberg
equation for $\partial_0^2\phi$, we obtain the correct SD equation:
$$\partial^2 \frac{\delta Z}{i\,\delta J(x)} = \left[-V'\!\left(\frac{\delta}{i\,\delta J(x)}\right) + J(x)\right] Z[J]. \qquad (3.9)$$
This equation may be described succinctly in the following words: Think of the
functional differential operator $\delta/i\,\delta J(x)$ as a field and write down the left-hand side of
the field equation implied by the Lagrangian $\mathcal{L} + \phi J$, for this field. This gives us a linear
functional differential operator $SD[\delta/i\,\delta J(x)]$, where e.g. $SD[\phi] = \partial^2\phi + V'(\phi) - J$.
The SD equation is $SD\,Z[J] = 0$.
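A short worked case may help fix the signs. Assume the free theory, $V(\phi) = \frac{1}{2}m^2\phi^2$, so that $SD[\phi] = (\partial^2 + m^2)\phi - J$ and the SD equation reads

$$(\partial^2 + m^2)\,\frac{\delta Z}{i\,\delta J(x)} = J(x)\,Z[J].$$

Trying a Gaussian ansatz with an as yet unspecified symmetric kernel $D$ (the ansatz is an assumption of this sketch),

$$Z[J] = \exp\left[-\frac{1}{2}\int d^4x\, d^4y\; J(x)\,D(x - y)\,J(y)\right], \qquad \frac{\delta Z}{i\,\delta J(x)} = i\int d^4y\; D(x - y)\,J(y)\; Z[J],$$

the equation is satisfied provided $i(\partial^2 + m^2)\,D(x - y) = \delta^4(x - y)$. The free SD equation therefore reduces to finding a Green function of the Klein-Gordon operator, and the choice among such Green functions is a question of boundary conditions.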
3.2 Functional integral solution of the SD equations
In order to understand how to solve the SD equations, it is convenient to think about a
regularization of QFT in which space-time is replaced by a finite hypercubic lattice of
points. The continuous variable x is replaced by a discrete variable A, with finite range.
The field and source functions become finite-dimensional variables $\phi_A$ and $J_A$. The
differential operator $\partial^2$ becomes a symmetric matrix $K_{AB}$. In fact such a regularization
is used in numerical solutions of strongly coupled field theories, which are not amenable
to other approximation techniques. It converts QFT into a finite problem, which can
be solved on a computer.
The discretized SD equations take the form (summation convention for repeated
indices):
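$$K_{AB}\,\frac{\partial Z}{i\,\partial J_B} = \left[-V'\!\left(\frac{\partial}{i\,\partial J_A}\right) + J_A\right] Z. \qquad (3.10)$$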
Since we have only a finite number of variables, we have replaced functional derivatives
by ordinary partial derivatives.
If $V$ is a polynomial of highest order $k$ then this is a set of $(k-1)$th-order partial
differential equations (PDEs) in a finite number of variables, because $V'$ is of order $k - 1$.
The key to solving them is to note that the explicit dependence on $J_A$ is linear. If we
Fourier transform a set of linear PDEs, every derivative turns into $i$ times the Fourier
transform variable, while every $J_A$ turns into ($-i$ times) a derivative w.r.t. the Fourier
transform variable. The Fourier transform of the discretized SD equations will be a set of
first-order PDEs
for the Fourier transform. For reasons that will become apparent, we call the
Fourier-transform variable $\phi_A$, and write
$$Z[J] = \int [d\phi]\; e^{iS[\phi] + iJ_A\phi_A}, \qquad (3.11)$$
where square brackets around the integration measure indicate that it is a multiple
integral for which we are contemplating taking the limit of an infinite number of inte-
gration variables. Such integrals are called functional integrals. It requires a lot of care
to give them a rigorous mathematical definition. Most physicists deal with this problem
through the algorithm of renormalization, to which we will turn in Chapter 9.
The Fourier-transformed SD equations (as the reader will kindly verify) are
$$\frac{\partial S}{\partial\phi_A} = -K_{AB}\,\phi_B - \frac{\partial V}{\partial\phi_A},$$
whose solution is
$$S = -\frac{1}{2}\,\phi_A K_{AB}\,\phi_B - V(\phi) + C, \qquad (3.12)$$
where $C$ is a constant independent of $\phi$. Our equations for $Z$ are homogeneous and
do not determine its overall normalization.
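The logic can be tested on the smallest possible "lattice", a single site. The sketch below makes two assumptions that are not in the text: it uses a convergent Euclidean weight $e^{-S(\phi) + J\phi}$ in place of the oscillatory $e^{iS + iJ\phi}$, and it picks hypothetical values for $K$ and the quartic coupling. With those choices the SD equation says that the average of $S'(\phi) - J$ vanishes, because the integrand is a total derivative.

```python
import numpy as np

K, g = 1.3, 0.7                      # hypothetical quadratic coefficient and quartic coupling
phi = np.linspace(-12.0, 12.0, 48001)
dphi = phi[1] - phi[0]

def action(p):
    """Single-site Euclidean action S = K p^2/2 + g p^4/4!."""
    return 0.5 * K * p**2 + g * p**4 / 24.0

def d_action(p):
    """Derivative S'(p), the single-site analogue of K phi + V'(phi)."""
    return K * p + g * p**3 / 6.0

def moment(f, J):
    """Integral of f(phi) exp(-S(phi) + J phi) over the grid (plain Riemann sum)."""
    return np.sum(f(phi) * np.exp(-action(phi) + J * phi)) * dphi

def sd_residual(J):
    """Normalized average of S'(phi) - J; the SD equation says this vanishes."""
    Z = moment(lambda p: np.ones_like(p), J)
    return moment(lambda p: d_action(p) - J, J) / Z

residuals = [sd_residual(J) for J in (0.0, 0.5, 2.0)]
```

The residuals vanish to rounding error for every value of the source, which is the single-variable version of the statement that the integral representation solves the SD equations for any polynomial $V$.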
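Returning to the continuum, we write $S$ as
$$S = \int d^4x \left[-\frac{1}{2}\,\phi\,\partial^2\phi - V(\phi)\right]. \qquad (3.13)$$
Integrating by parts,
$$S = \int d^4x \left[\frac{1}{2}(\partial_\mu\phi)^2 - V(\phi)\right]. \qquad (3.14)$$
The integral defining the Fourier transform of the generating functional is now an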
integral over some space of functions, or functional integral. For the free-field case,
which we will deal with in a moment, such integrals were given a rigorous mathematical
definition by Wiener [12]. For interacting field theory they are defined by a process of
regularization (turning them into finite integrals) and renormalization to which we will
return in Chapter 9. We certainly want our functions to approach constants at infinity
in space-time, so that the integration by parts that we did in the action is allowed, but we
will not delve further into their properties. In Chapter 10, in discussing gauge theories
and low-dimensional scalar field theories we will encounter examples where integrals
of total derivatives have to be kept in the action, and we will take care not to drop them
where they are important.
If we consider free field theory, where $V(\phi) = \frac{1}{2}m^2\phi^2$, the integral is Gaussian and
we can do it explicitly. However, it is an oscillating Gaussian, so we must provide
a prescription for evaluating it. One obvious possibility is to make the replacement
$m^2 \to m^2 - i\epsilon$, where $\epsilon$ is a small positive number, which is eventually sent to zero.
Consider the Gaussian integral
$$I(J) = \int d^ny\; e^{-\frac{1}{2}K_{ij}\,y^i y^j + i\,y^i J_i}, \qquad (3.15)$$
where $K$ is a symmetric matrix with positive definite real part. This can be evaluated
by changing variables $y \to K^{-1/2}\,Y$, evaluating the Jacobian, and doing the resulting
single-variable Gaussian integrals. The result is
$$I(J) = e^{-\frac{1}{2}J_i (K^{-1})^{ij} J_j}\; [\det K]^{-1/2}\; (2\pi)^{n/2}, \qquad (3.16)$$
as the knowledgeable reader will know and the diligent reader can verify. In the
functional integral limit, we have to invert a differential operator and take its determinant.
We will temporarily evade the second of these tasks by noting that the definition of $Z[J]$
in terms of time-ordered products gives us the normalization $Z[0] = 1$. We achieve this
by writing $Z[J]$ as the ratio of two functional integrals $I(J)/I(0)$, and note that the
functional determinant as well as the pesky factors of $\sqrt{2\pi}$ cancel out in this ratio.
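For readers who prefer to check the finite-dimensional formula numerically, the following sketch compares a brute-force grid evaluation of $I(J) = \int d^ny\, e^{-\frac{1}{2}y\cdot K\cdot y + i\,y\cdot J}$ with the closed form $(2\pi)^{n/2}(\det K)^{-1/2}e^{-\frac{1}{2}J\cdot K^{-1}\cdot J}$ for $n = 2$. The matrix $K$ and vector $J$ are hypothetical values chosen for illustration, with $K$ taken real and positive definite.

```python
import numpy as np

K = np.array([[2.0, 0.5], [0.5, 1.0]])    # hypothetical real, positive-definite K
J = np.array([0.3, -0.7])                 # hypothetical source vector

# Brute-force evaluation of I(J) on a square grid.
y = np.linspace(-12.0, 12.0, 801)
dy = y[1] - y[0]
Y1, Y2 = np.meshgrid(y, y, indexing="ij")
quad = K[0, 0] * Y1**2 + 2 * K[0, 1] * Y1 * Y2 + K[1, 1] * Y2**2
numeric = np.sum(np.exp(-0.5 * quad + 1j * (Y1 * J[0] + Y2 * J[1]))) * dy**2
numeric_at_zero = np.sum(np.exp(-0.5 * quad)) * dy**2

def closed_form(K, J):
    """(2 pi)^(n/2) det(K)^(-1/2) exp(-J.K^{-1}.J / 2)."""
    n = len(J)
    prefactor = (2 * np.pi) ** (n / 2) / np.sqrt(np.linalg.det(K))
    return prefactor * np.exp(-0.5 * J @ np.linalg.solve(K, J))

# In the ratio I(J)/I(0) the determinant and the powers of 2 pi drop out.
ratio_numeric = numeric / numeric_at_zero
ratio_exact = np.exp(-0.5 * J @ np.linalg.solve(K, J))
```

Only the ratio is needed for $Z[J]$, and it depends on $K$ solely through $K^{-1}$, which is why the next step in the text is the inversion of the differential operator.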
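We are left to evaluate the inverse of the differential operator
$$K = i(\partial^2 + m^2 - i\epsilon). \qquad (3.17)$$
That is, we want to find the Green function satisfying
$$i(\partial^2 + m^2 - i\epsilon)\,D_F(x - y) = \delta^4(x - y). \qquad (3.18)$$
This is easily done in momentum space:
$$D_F(x - y) = \int \frac{d^4p}{(2\pi)^4}\; \frac{i\,e^{-ip(x-y)}}{p^2 - m^2 + i\epsilon}. \qquad (3.19)$$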
The integral defining this Feynman or causal Green function³ of the Klein-Gordon
(KG) operator has poles in the $p^0$ plane just below the positive real axis and just above
the negative real axis. When x° — y° > 0, we evaluate it by closing the contour in the
lower half plane. The resulting Green function is an integral over only positive energy
waves. Similarly, when the time order is reversed, we close in the upper half plane,
and again conclude that only positive energies propagate forward in time. This is the
rigorous version of Feynman's idea that "anti-particles are negative-energy particles
propagating backwards in time."
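The contour argument can be written out explicitly. The sketch assumes the momentum-space representation $D_F(x) = \int \frac{d^4p}{(2\pi)^4}\, \frac{i\,e^{-ipx}}{p^2 - m^2 + i\epsilon}$ that follows from (3.18), and writes $\omega_p = \sqrt{\mathbf{p}^2 + m^2}$. The denominator factorizes as $(p^0 - \omega_p + i\epsilon')(p^0 + \omega_p - i\epsilon')$, and for $x^0 > 0$ the contour closed in the lower half plane encircles the pole at $p^0 = \omega_p$ clockwise:

$$\int \frac{dp^0}{2\pi}\; \frac{i\,e^{-ip^0x^0}}{(p^0 - \omega_p + i\epsilon')(p^0 + \omega_p - i\epsilon')} = (-2\pi i)\,\frac{1}{2\pi}\,\frac{i\,e^{-i\omega_p x^0}}{2\omega_p} = \frac{e^{-i\omega_p x^0}}{2\omega_p}.$$

For $x^0 < 0$ the contour closed in the upper half plane picks up the pole at $p^0 = -\omega_p$ and gives $e^{+i\omega_p x^0}/2\omega_p$. Both cases combine into

$$D_F(x) = \int \frac{d^3p}{(2\pi)^3\, 2\omega_p}\; e^{-i\omega_p|x^0| + i\mathbf{p}\cdot\mathbf{x}},$$

which contains only positive-frequency waves in the variable $|x^0|$.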
The equality
$$D_F(x - y) = \langle 0|\,T\phi(x)\phi(y)\,|0\rangle, \qquad (3.20)$$
for free Heisenberg fields, may be verified in two illuminating ways. First of all, the free
SD equations show that the RHS is a Green function for the KG equation, and the
discussion of the previous paragraph shows that the boundary conditions implied by
the $i\epsilon$ prescription are precisely those satisfied by the time-ordered vacuum expectation
value (VEV). Alternatively, one can compute the RHS by brute force using creation and
3 Also called the Feynman propagator.
annihilation operators. Thus, the $i\epsilon$ trick for making the Gaussian integral converge
automatically chooses the right space-time boundary conditions to define the time-
ordered Green function of the KG equation.
We can obtain a deeper understanding of this connection by noting that the position
of the energy poles in the Feynman Green function allows us to rotate the contour of
integration⁴ to $p^0 \to ip_4$, if we simultaneously rotate time differences to be imaginary.
We realize that $D_F$ is the analytic continuation of the Euclidean Green function,
$$D_E(x) = \int \frac{d^4p}{(2\pi)^4}\; \frac{e^{ipx}}{p_E^2 + m^2}, \qquad (3.21)$$
where the Euclidean scalar product is $p_E^2 = p_4^2 + \mathbf{p}^2$. This is the unique Green function
of the four-dimensional Helmholtz operator $-\nabla^2 + m^2$, which falls off at Euclidean
infinity.
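A one-dimensional analogue shows both the lattice regularization of the previous pages and the decaying Euclidean Green function at work. The sketch below is an illustration under stated assumptions: a single Euclidean "time" direction with periodic boundary conditions, and hypothetical values for the mass, lattice spacing and number of sites. The operator $-d^2/d\tau^2 + m^2$ becomes a symmetric matrix, and its inverse is compared with the continuum answer $e^{-m|\tau|}/2m$ summed over periodic images.

```python
import numpy as np

m, a, N = 1.0, 0.05, 400          # hypothetical mass, lattice spacing, number of sites
L = N * a                         # length of the periodic Euclidean circle

# Discretized operator -d^2/dtau^2 + m^2 with periodic boundary conditions.
idx = np.arange(N)
M = np.zeros((N, N))
M[idx, idx] = 2.0 / a**2 + m**2
M[idx, (idx + 1) % N] = -1.0 / a**2
M[idx, (idx - 1) % N] = -1.0 / a**2

# Solve M G = delta/a, the lattice version of (-d^2/dtau^2 + m^2) G = delta(tau).
source = np.zeros(N)
source[0] = 1.0 / a
G_lattice = np.linalg.solve(M, source)

# Continuum Green function on a circle of length L: the sum over periodic
# images of exp(-m|tau|)/(2m), written in closed form.
tau = idx * a
G_exact = np.cosh(m * (L / 2 - tau)) / (2 * m * np.sinh(m * L / 2))
max_error = np.max(np.abs(G_lattice - G_exact))
```

The lattice result differs from the continuum one by terms of relative order $(ma)^2$, and it falls off exponentially rather than oscillating, in line with the statement that the Euclidean Green function is the solution that decays at infinity.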
It is easy to see why the correlation functions of QFT should have analytic contin-
uations to imaginary time. Consider (for notational simplicity) the time ordering in
which the labels of the points coincide with their time order. Then
$$\langle 0|\,T\phi(x_1)\cdots\phi(x_n)\,|0\rangle = \langle 0|\,\phi(\mathbf{x}_1, 0)\,e^{-iH(t_1 - t_2)}\cdots e^{-iH(t_{n-1} - t_n)}\,\phi(\mathbf{x}_n, 0)\,|0\rangle, \qquad (3.22)$$
where we have used $H|0\rangle = 0$. By inserting complete sets of energy eigenstates, we can
rewrite this in the form
$$\int dE_1 \cdots dE_{n-1}\; \rho(E_1, \ldots, E_{n-1})\; e^{-i\sum_k E_k (t_k - t_{k+1})}, \qquad (3.23)$$
where the spectral function (which also has implicit dependence on the spatial points at
which the fields are evaluated) $\rho$ is determined in terms of sums over matrix elements of
fields between states of fixed energy. In free-field theory, and every order of perturbation
theory around it, it is easy to see that the spectral function is an analytic function of the
energy variables. Furthermore, it falls off when the energy variables become large. This
is enough to guarantee 5 that these functions have a well-defined analytic continuation
to imaginary, or Euclidean, time. Note that, if we take $(t_k - t_{k+1}) \to -i\tau_k$ with
$\tau_k > 0$, the exponentials give a Boltzmann-like suppression of high-energy states.⁶
Thus, assuming that field matrix elements have at most power-law growth with energy,
these functions are well defined and analytic in imaginary time.
4 At this point I am expecting the readers to stop and remember their complex analysis course. If you don't
know complex analysis, stop here, learn it, and come back to this point.
5 In axiomatic approaches to field theory this behavior of the spectral functions is simply postulated. The
bound on the growth of the density of states can be derived from the definition of field theory we will
give in the chapter on renormalization. A field theory is a perturbation of a conformal field theory by an
operator that is negligible at high energy. The density of states of a conformal field theory in volume $V$ is,
by virtue of conformal invariance and extensivity, $\rho(E) \sim e^{cV^{1/d}E^{(d-1)/d}}$ in $d$ space-time dimensions.
6 This is not a coincidence. The Euclidean Green functions with compactified imaginary time compute
thermal expectation values of fields.
The Euclidean formulation of field theory 7 leads to mathematical expressions whose
properties are simpler to understand. Euclidean methods also turn out to be crucial for
understanding field theory at finite temperature and quantum tunneling amplitudes.
They are also central to numerical techniques (lattice field theory) for obtaining non-
perturbative solutions to field theory. So we will generally take the attitude that a field
theory is defined by analytic continuation of a Euclidean functional integral. We simply
write the formal expression for the Euclidean action and integrate $e^{-S_E}\,\phi(x_1)\cdots\phi(x_n)$
over fields in Euclidean space, and then analytically continue the result to get time-
ordered products of Lorentzian fields.
In this way, we obtain the definition of the generating functional of Euclidean Green
functions, also known as Schwinger functions:
$$Z_E[J] = \frac{\int [d\phi]\; e^{-S_E[\phi] + i\int J\phi}}{\int [d\phi]\; e^{-S_E[\phi]}}, \qquad (3.24)$$
which has the form of the characteristic function of a probability distribution. Note
that in writing this formula we have also analytically continued the source function $J$
to $iJ$. If we refrain from doing this we obtain the Boltzmann formula for the partition
function of a classical statistical system in an external field $J$. The "potential" of the
statistical system is the Euclidean action,
$$S_E = \int d^4x \left[\frac{1}{2}(\nabla\phi)^2 + V(\phi)\right], \qquad (3.25)$$
where we now integrate over four-dimensional Euclidean space, rather than Minkowski
space-time. This profound analogy between QFT and classical statistical mechanics has
led to an enormously successful cross fertilization between the two fields.
3.3 Perturbation theory
In problems amenable to perturbation theory, the non-quadratic part of the potential
is multiplied by a small dimensionless parameter $g$. The classical action has the form
$S = S_0 + g\int V(\phi)$, where $S_0$ is the action of a free massive field. Therefore, we can
write a power-series expansion for $Z[J] = I[J]/I[0]$:
$$I[J] = \sum_{n=0}^{\infty} \frac{(ig)^n}{n!} \int [d\phi]\; e^{iS_0 + i\int J\phi} \int d^4x_1 \cdots d^4x_n\; V(\phi(x_1))\cdots V(\phi(x_n)). \qquad (3.26)$$
If $V$ has a power-series expansion in $\phi$, then every integral that is required in order
to evaluate this formula has the form
$$\int [d\phi]\; e^{iS_0}\, \phi(y_1)\cdots\phi(y_m).$$
7 The idea of Euclidean field theory is due to Schwinger, and the re […]
is called the Wick rotation.
The $\phi$s in this formula which come from the same $V(x_i)$ will all have the same value
of $y = x_i$, and we will integrate over the $x_i$ variables. When evaluating $Z[J]$, we expand
both numerator and denominator. Then it is easy to see that every such integral in
the expansion will be accompanied by the denominator $I_0$, the value of the functional
integral at $J = g = 0$. The generating functional for these normalized functional
integrals is just $Z_0[J] = \langle 0|\,T e^{i\int \phi J}\,|0\rangle$, the free-field vacuum persistence amplitude.
In free-field theory, we can either solve the (linear) SD equation directly or do the
Gaussian functional integral to obtain
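$$Z_0[J] = e^{-\frac{1}{2}\int d^4x\, d^4y\; J(x)J(y)\,D_F(x - y)}, \qquad (3.27)$$
where $D_F(x)$ is the Feynman Green function or vacuum expectation value of the
time-ordered product of two free fields. It follows, by expanding out both the definition and
the solution for $Z_0[J]$, that
$$\langle 0|\,T\phi(x_1)\cdots\phi(x_{2n})\,|0\rangle = \sum_{\text{pairings}} D_F(x_{i_1} - x_{j_1})\cdots D_F(x_{i_n} - x_{j_n}),$$
where the sum runs over all ways of grouping the $2n$ points into $n$ pairs.
This formula is called Wick's theorem. It leads to the following set of rules for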
perturbation theory, called Feynman rules.
• Write the perturbation as a sum of monomials $V = g\sum_k (v_k/k!)\,\phi^k$.
• A term in $n$th-order perturbation theory in $g$ will have contributions proportional
to $v_{k_1}\cdots v_{k_n}$. We associate this with a diagram with $n$ vertices. The $i$th vertex has $k_i$
lines coming out of it. These are called Feynman diagrams or Feynman graphs.
• Each line must be connected either to a line from another vertex or to an external
source point. For a contribution to an $E$-point Green function, there will be $E$ such
external lines. Lines connecting two vertices are called internal lines. The number of
internal lines $I$ is given by $2I = \sum_i k_i - E$.
• Write a factor $D_F(y_i - y_j)$ for an internal line connecting two space-time points.
• Write a factor $D_F(y_i - x_a)\,J(x_a)$ for an external line.
• Integrate over all internal and external points. If one wants to directly compute
individual Green functions, rather than the generating functional, omit the factors
of J{x a ) and the integral over the external points.
• We can easily generalize to interaction vertices like </> 2 (3 M ^) 2 , just letting the deriva-
tives act on the appropriate argument of Dp. Similarly, it is easy to generalize these
rules to theories of multiple scalar fields.
• A given Feynman diagram in «th order comes with a factor (ig)"v/ q . . . v/ Cm Sg> where
the combinatorial factor Sg is a combination of the inverse factorials from the
definition of v/ ( and the number of times a given graph can be obtained in the sum over
all possible pairings (also called all possible contractions). Sg is called the symmetry
number of the graph, because it can be shown that it is equal to the inverse of the
order of the group of geometrical symmetries of the graph. I have always found that
the art of figuring out the geometrical symmetries is harder than reproducing 5g by
counting the number of contractions which give the graph. It's a matter of taste. An
algorithm for doing the counting can be found in [13].
• By Fourier transforming everything in sight, one can derive an equivalent set of rules
in momentum space. These are somewhat easier to write down, because the Fourier
transform $i/(p^2 - m^2 + i\epsilon)$ of $D_F$ is so simple. The rules involve integrating over
the momenta of closed loops in the diagram, with measure $d^4p/(2\pi)^4$ (equivalently,
integrating over all internal line momenta, with $(2\pi)^4\delta^4(\sum q_i)$ enforcing momentum
conservation at each vertex; one momentum $\delta$ function of the sum over external
momenta is left). Derivatives in the interaction turn into factors of momentum, but
one must take care to make sure that one has the right sum of momenta in these
factors.
• In Euclidean space, the rules are even easier. We replace the factor of $i$ in $(ig)^n$ by a
minus sign, and drop the $i$ from each propagator. The Euclidean propagator is just
$1/(p^2 + m^2)$.
It's extremely important for the reader to work out the derivation of these rules,
following the outline we have supplied above.
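The contraction counting described in the rules above can be illustrated with a short Python sketch. The example chosen here (one $\phi^4$ vertex attached to two external points) and all names in the code are illustrative assumptions; this is a brute-force enumeration, not the algorithm of [13].

```python
from fractions import Fraction
from math import factorial


def pairings(slots):
    """Yield every way of grouping the field slots into unordered pairs."""
    if not slots:
        yield []
        return
    first, rest = slots[0], slots[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


# Hypothetical example: two external points and the four legs of one phi^4 vertex.
slots = ["x1", "x2", "v", "v", "v", "v"]

all_pairings = list(pairings(slots))
# In this example a pairing is connected unless x1 is contracted directly with x2.
connected = [p for p in all_pairings if ("x1", "x2") not in p]

# Number of contractions times the 1/4! carried by the vertex.
weight = Fraction(len(connected), factorial(4))

print(len(all_pairings), len(connected), weight)
```

The 15 pairings are the $5!!$ terms of Wick's theorem for six fields. The 12 connected ones, multiplied by the $1/4!$ from the vertex, give $1/2$, the inverse of the order of the symmetry group of the graph (exchange of the two ends of the closed loop).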
The Feynman rules for a large class of field theories, along with standard dia-
grammatic conventions, are collected in Appendix D. There one can also find several
examples of working out the combinatoric factors for individual graphs. This is proba-
bly a good point in the exposition for students to stop and do a few exercises to convince
themselves that they know how to set up the computation of Feynman graphs. If you
go too far in this exercise you'll encounter the disturbing fact that most of the loop
graphs are infinite. We won't learn how to tackle this problem, and what it means, until
Chapter 9.
3.4 Connected and 1-P(article) I(rreducible) Green functions
The free generating functional has the form $e^{-\frac{1}{2}\int J(x)J(y)D_F(x-y)} = e^{iW_0[J]}$, i.e. it is an
exponential of something simple. $W_0$ has a nice graphical interpretation. If we write
out all of the graphs contributing to $Z_0[J]$, only the connected graphs contribute to
$W_0$. The Feynman rules show us that this is a general property of the interacting theory
as well. The sum over all graphs is the exponential of the sum over connected graphs.
This follows from the symmetry-number Feynman rule. If I have a disconnected graph
with $k$ identical disconnected pieces, there is an obvious $S_k$ geometrical symmetry. The
order of the symmetric group $S_k$ is $k!$. Thus, in general, we define $Z[J] = e^{iW[J]}$ and
$W$ is the generating functional of connected Green functions:
[Figure: Wick's theorem and connected diagrams. The sum of all graphs, $Z[J]$, equals $e^{iW[J]}$.]
The Feynman rules for W are the same as those for Z, but summing only over
connected diagrams. In particular, since all contributions from corrections to the
denominator I[0] are disconnected, we never have to worry about them. Aficionados
of statistical mechanics will recognize the relation between Z and W to be essentially
the same as that between the partition function and the Helmholtz free energy. 8
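The relation between $Z$ and $W$ can be made concrete in a zero-dimensional toy model, where the "functional integral" is an ordinary integral and the Green functions are just numbers. The Python sketch below assumes Euclidean conventions, $Z(J) = \sum_n m_n J^n/n!$ and $W = \ln Z$, so the factors of $i$ in $Z = e^{iW}$ are dropped; the function name and the test values are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def connected_from_full(moments):
    """Convert moments m_n (coefficients of J^n/n! in Z, with m_0 = 1)
    into cumulants kappa_n (coefficients of J^n/n! in W = log Z)."""
    kappa = [Fraction(0)] * len(moments)
    for n in range(1, len(moments)):
        # m_n = sum_{k=1}^{n} C(n-1, k-1) kappa_k m_{n-k}, solved for kappa_n.
        lower = sum(comb(n - 1, k - 1) * kappa[k] * moments[n - k]
                    for k in range(1, n))
        kappa[n] = Fraction(moments[n]) - lower
    return kappa


# Free ("Gaussian") toy model with propagator D: m_{2n} = (2n-1)!! D^n.
D = Fraction(1, 2)
free_moments = [1, 0, D, 0, 3 * D**2, 0, 15 * D**3]
free_cumulants = connected_from_full(free_moments)
print(free_cumulants)
```

For the free model the only non-vanishing connected function is the two-point function $D$: all the higher moments are products of propagators, which is the zero-dimensional version of the statement that only connected graphs contribute to $W_0$.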
Often, one can conveniently reorganize perturbation theory into a semi-classical
expansion of the functional integral. This happens whenever, by rescaling the fields, we
can write the action as $S = (1/g^2)s[\phi]$, where $g$ is a small dimensionless parameter, and
$s$ contains only positive powers of $g$ (typically only $g^0$). One then recognizes that one
can do the integral by the stationary-phase, or steepest-descents, method. That is, one
looks for stationary points of the logarithm of the integrand, and expands the integral
around them. The first step requires us to solve
$$\frac{\delta s}{\delta\phi(x)} + J(x) = 0, \tag{3.29}$$
which is just the classical field equation in the presence of the source. If $\phi(x, J]$$^{9}$ is the
classical solution then the connected generating functional is given by
$$W_c[J] = \frac{1}{g^2}\left( s[\phi[J]] + \int J\phi[J] \right). \tag{3.30}$$
In words, in the classical approximation, the connected generating functional is the
Legendre transform of the classical action.
If the functional $s[\phi]$ is $g$-independent, then the semi-classical expansion has a
topological interpretation in terms of Feynman graphs. If we expand around a stationary
point $\phi = \phi_0(x)$ of the $J = 0$ functional integral, the first correction to the classical
action is
$$S^{(2)}[\phi_0, \delta\phi] \equiv \frac{1}{2}\int d^4x\, d^4y\, \delta\phi(x)\,\delta\phi(y)\, \left.\frac{\delta^2 S}{\delta\phi(x)\,\delta\phi(y)}\right|_{\phi=\phi_0}. \tag{3.31}$$
If we define $\Delta = g^{-1}\delta\phi$, we can write $(1/g^2)s[\phi] = (1/g^2)s[\phi_0] + S_2[\Delta] + \sum_k g^{k-2}V_k[\Delta]$.
Viewed as a field theory for the generating functional of $\Delta$ Green functions, this looks
like our perturbation problem, except with a link between the power of the field and
the power of the perturbation parameter $g$.
Now consider a contribution to an $E$-point function at order $g^n$. This means that the
numbers of lines, $k_i$, emanating from the vertices $V_{k_i}$ in the diagram must satisfy
$$\sum_i (k_i - 2) = \sum_i k_i - 2V = n,$$
where $V$ is the number of vertices. The number of internal lines, $I$, is
$$I = \frac{1}{2}\Big(\sum_i k_i - E\Big),$$
and the number of loops for a connected diagram is
$$L = I - V + 1 = \frac{1}{2}\Big(\sum_i k_i - E\Big) - V + 1.$$
Thus
$$n = 2L - 2 + E.$$
8 Up to factors of $i$ and $-1$. In Euclidean QFT, the parallel is exact.
9 The peculiar bracket structure in $\phi(x, J]$ is supposed to convey the fact that it is both a function of $x$ and
a functional of $J$.
For a fixed number of external lines, the expansion in powers of $g^2$ is an expansion
in the number of loops. Note that if we look at correlation functions of the original
fluctuations $\delta\phi$, rather than $\Delta$, the powers of $g$ associated with external lines disappear
from these formulae. An order-one fluctuation of $\Delta$ is an $o(g)$ fluctuation of the original
fields. Solving the classical field equations with a source $J$ of order one sums up all tree
diagrams, with any number of external legs.
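The counting argument above is easy to check mechanically. The following Python sketch takes the list of vertex valences $k_i$ and the number of external lines $E$ of a connected diagram and returns the number of internal lines, the number of loops, and the power of $g$, assuming the rule that a $k$-point vertex carries $g^{k-2}$. The function name and the sample diagrams are illustrative assumptions.

```python
def count_diagram(vertex_valences, external_lines):
    """Return (internal lines I, loops L, power of g) for a connected diagram."""
    total_legs = sum(vertex_valences)
    if total_legs < external_lines or (total_legs - external_lines) % 2:
        raise ValueError("legs cannot be paired into internal lines")
    vertices = len(vertex_valences)
    internal = (total_legs - external_lines) // 2    # 2I = sum k_i - E
    loops = internal - vertices + 1                  # L = I - V + 1
    g_power = sum(k - 2 for k in vertex_valences)    # each k-point vertex carries g^(k-2)
    return internal, loops, g_power


# Hypothetical sample diagrams: (vertex valences, number of external lines).
samples = [([3, 3], 4), ([4], 2), ([4, 4], 4)]
for valences, E in samples:
    I, L, n = count_diagram(valences, E)
    print(valences, E, "I =", I, "L =", L, "power of g =", n, "2L-2+E =", 2 * L - 2 + E)
```

In each case the power of $g$ coincides with $2L - 2 + E$, so at fixed $E$ every additional loop costs a factor of $g^2$.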
These considerations motivate the introduction of a new generating functional $\Gamma[\phi]$,
which is related to the exact $W[J]$ by a Legendre transform
$$W[J] = \Gamma[\phi[J]] + \int J\phi[J], \tag{3.32}$$
where $\phi(x, J]$ is a solution of
$$\frac{\delta W}{\delta J(x)} = \phi(x, J]. \tag{3.33}$$
The expansion coefficients of $\Gamma$,
$$\Gamma[\phi] = \sum_n \frac{1}{n!}\int \Gamma_n(x_1, \ldots, x_n)\,\phi(x_1) \ldots \phi(x_n)\, d^4x_1 \ldots d^4x_n, \tag{3.34}$$
are called one-particle irreducible (1PI) Green functions. Connected Green functions
are constructed from tree diagrams with 1PI functions as vertices and propagators
given by the full connected two-point function.
Diagrammatically, 1PI functions are constructed as the sum of all diagrams that
cannot be cut into disconnected parts by cutting a single propagator line. At tree level,
these just give the vertices constructed from the classical action. $\Gamma$ is often called the
quantum effective action. We will drop the word "effective," in order to reserve it for
another use in the discussion of renormalization. In Chapter 9 we will learn that the
cut-off-dependent effective action defined by the renormalization group is equal to the
quantum action defined here only in the limit that the momentum cut-off scale is taken
to zero, and only if there are no massless states in the theory.
3.5 Legendre's trees
The graphical assertions of the previous two paragraphs were that the Legendre
transform relation
$$W[J] = \int \phi(x)J(x) + \Gamma[\phi] \tag{3.35}$$
(where $\phi(x)$ is identified with $\delta W/\delta J(x)$ and $J(x)$ with $-\delta\Gamma/\delta\phi(x)$) generates an
expansion of the (connected) Green functions obtained from the expansion of $W$ in
terms of tree diagrams whose vertices are the 1PI Green functions and whose branches
are the full propagator
$$\frac{\delta^2 W}{\delta J(x)\,\delta J(y)}.$$
In leading order in the semi-classical expansion we have $\Gamma = s/g^2$.
The proof of this relation is simple. We start from the obvious identity
$$\frac{\delta^2 W}{\delta J(x)\,\delta J(y)} = \frac{\delta\phi(y)}{\delta J(x)} = \left(\frac{\delta J(y)}{\delta\phi(x)}\right)^{-1} = -\left(\frac{\delta^2\Gamma}{\delta\phi(x)\,\delta\phi(y)}\right)^{-1}.$$
The inverse in the second equality is meant in the sense of integral operators,
$$\int d^4z\, K(x,z)\,K^{-1}(z,y) = \delta^4(x-y),$$
which is the continuous analog of matrix inversion. We write this as $W_2 = -\Gamma_2^{-1}$;
$\Gamma_2$ is the integral operator made from the 1PI two-point function.
Differentiate this identity with respect to $J(z)$,
$$\frac{\delta^3 W}{\delta J(x)\,\delta J(y)\,\delta J(z)} = -\frac{\delta\Gamma_2^{-1}(x,y)}{\delta J(z)},$$
and use the continuous analog of the matrix differentiation formula
$\delta K^{-1} = -K^{-1}\,\delta K\, K^{-1}$ to write this as
$$W_3(x,y,z) = \int d^4w_1\, d^4w_2\, W_2(x,w_1)\,W_2(y,w_2)\,\frac{\delta\Gamma_2(w_1,w_2)}{\delta J(z)}. \tag{3.37}$$
Using the chain rule for differentiation and the relation $\phi = \delta W/\delta J$, this becomes
$$W_3(x,y,z) = \int d^4w_1\, d^4w_2\, d^4w_3\, W_2(x,w_1)\,W_2(y,w_2)\,W_2(z,w_3)\,\Gamma_3(w_1,w_2,w_3). \tag{3.38}$$
This is the formula implied by the three-point position-space Feynman diagram of
Figure 3.1.
It is now all over except for the shouting, which mathematicians call induction. When
we differentiate $W_3$ w.r.t. $J(s)$, the derivatives act either on the propagators or on $\Gamma_3$.
The latter contribution gives the first term in the four-point diagram of Figure 3.1. For
the former, we simply rerun the previous derivation to get the second four-point term.
Similarly, for $W_n$, either differentiation acts on the $\Gamma_k$, with $3 \le k \le n$, and generates
$\Gamma_{k+1}$ and an extra external $W_2$ leg, or it acts on one of the propagators, adding an
extra three-point vertex and external propagator. In words, $W_n$ is generated from all
possible tree diagrams.
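The two relations $W_2 = -\Gamma_2^{-1}$ and $W_3 = W_2^3\,\Gamma_3$ can be checked numerically in a zero-dimensional toy model, where $\phi$ and $J$ are single numbers and the integrals over $w_i$ disappear. The Python sketch below assumes a cubic toy action $\Gamma(\phi) = -\tfrac{1}{2}A\phi^2 - \tfrac{1}{6}B\phi^3$ with made-up coefficients, and estimates the derivatives of $W$ by finite differences; none of these choices come from the text.

```python
import math

A, B = 1.0, 0.5   # hypothetical coefficients of the toy action


def Gamma(phi):
    return -0.5 * A * phi**2 - B * phi**3 / 6.0


def phi_of_J(J):
    # Solves J = -Gamma'(phi) = A*phi + (B/2)*phi**2 on the branch with phi -> 0 as J -> 0.
    return (-A + math.sqrt(A * A + 2.0 * B * J)) / B


def W(J):
    # Legendre transform W(J) = Gamma(phi) + J*phi evaluated at phi = phi(J).
    phi = phi_of_J(J)
    return Gamma(phi) + J * phi


def d2(f, x, h=1e-2):
    """Central finite-difference estimate of the second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


def d3(f, x, h=1e-2):
    """Central finite-difference estimate of the third derivative."""
    return (f(x + 2 * h) - 2.0 * f(x + h) + 2.0 * f(x - h) - f(x - 2 * h)) / (2.0 * h**3)


J0 = 0.3
phi0 = phi_of_J(J0)
W2, W3 = d2(W, J0), d3(W, J0)
Gamma2 = -(A + B * phi0)   # second derivative of Gamma at phi0
Gamma3 = -B                # third derivative of Gamma

print("W2 =", W2, " -1/Gamma2 =", -1.0 / Gamma2)
print("W3 =", W3, " W2^3 * Gamma3 =", W2**3 * Gamma3)
```

The two printed pairs agree up to the finite-difference error, which is the one-variable version of (3.38): the three-point connected function is the 1PI vertex with a full propagator attached to each leg.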
This result is very powerful. Below we will show that the exact momentum-space
two-point function, $W_2(p)$ (when $J = 0$ the two-point function is translation-invariant
and its Fourier transform depends only on one variable), has a pole at the mass of
any stable particle that can be created from the vacuum by the quantum field $\phi$. The
1PI expansion of connected Green functions then tells us that every $W_n$ has a pole
on each of its external legs. We will see that the residue of this multiple pole, after
proper normalization, contains the S-matrix for all scattering processes in which the
total number of ingoing and outgoing particles is $n$.
[Figure: The Legendre transform $W[J] = \Gamma[\phi] + \int \phi J$ makes trees.]
Before concluding this subsection, we want to remind the reader that our notation has
conflated the quantum field operator $\phi$, the functional integration variable $\phi$, and the
argument of the quantum action $\phi$. These are distinct concepts, but it is typographically
insane to use different versions of the same Greek letter to separate them. The reader
who has truly understood the discussions above should have no trouble distinguishing
the meaning of $\phi$ from its context.
3.6 The Källén-Lehmann spectral representation
For $x^0 > 0$, we can write the two-point function of the interacting Heisenberg field
$\phi(x)$ as
$$\langle 0|T\phi(x)\phi(0)|0\rangle = \int d^4p \sum_n \delta^4(p - p_n)\, e^{-ipx}\, |\langle 0|\phi(0)|n\rangle|^2. \tag{3.39}$$
We have inserted a complete set of states between the two fields, and used the action of
the translation generator on the Heisenberg fields and on the states. $p_n$ is the momentum
of the state \n). As we will discuss in more detail below, we assume that the theory has
a complete set of scattering states, i.e. states that are composed of multiple stable
particles traveling freely at large space-like separation from each other. The energies
and momenta of such states are determined by the usual free-particle relations. We will
assume further that the theory has a mass gap. That is, apart from the non-degenerate
vacuum state, the lowest value of P is a non-zero number equal to the mass of the
lightest stable particle in the theory.
For theories with massless particles, the general form of the Källén-Lehmann [14,15]
representation will remain unchanged, but the identification of particle masses with
poles is slightly more subtle, and depends in detail on the form of the massless particle
interactions. Let $m$ be the mass of the lightest stable particle that can be created from
the vacuum by a single action of $\phi$. Then it follows from Lorentz invariance that
$$\langle 0|\phi(0)|p\rangle = \sqrt{\frac{Z}{(2\pi)^3\, 2\omega_p}}. \tag{3.40}$$
The constant $Z$ is called the on-shell wave-function renormalization constant of $\phi$.
In any field theory, with or without a mass gap, we will have only states of time-like
or null momentum, with positive energy. So we can rewrite the two-point function as
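$$\langle 0|T\phi(x)\phi(0)|0\rangle = \int_0^\infty d\mu^2 \int d^4p\; \theta(p^0)\,\delta(p^2 - \mu^2) \sum_n \delta^4(p - p_n)\, e^{-ipx}\, |\langle 0|\phi(0)|n\rangle|^2. \tag{3.41}$$
We recognize that this is
$$\langle 0|T\phi(x)\phi(0)|0\rangle = \int_0^\infty d\mu^2\, \rho(\mu^2)\, D_F(x; \mu^2), \tag{3.42}$$
where $D_F$ is the free-field two-point function and
$$(2\pi)^3 \sum_n \delta^4(p - p_n)\, |\langle 0|\phi(0)|n\rangle|^2 = \rho(p^2). \tag{3.43}$$
Indeed, since $U^\dagger(\Lambda)\phi(0)U(\Lambda) = \phi(0)$, for every Lorentz transformation $\Lambda$, the
positive function $\rho(p)$ is Lorentz-invariant, and depends only on $p^2$.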
In a theory with a mass gap, the lower limit on the integral is m 2 , coming from the
one-particle state, and then there is a gap until the lightest multiparticle state that can
be created by the action of $\phi$. Thus there is a contribution
$$\rho(\mu^2) = Z\,\delta(\mu^2 - m^2) + C(\mu^2),$$
where the continuum contribution C is positive and has support starting at the invariant
mass squared of the lowest multiparticle state.
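To see where the pole comes from, one can insert the momentum-space form of the free propagator quoted in Section 3.3 into the spectral integral. The following is a sketch only: $\tilde G_2(p)$ denotes the Fourier transform of $\langle 0|T\phi(x)\phi(0)|0\rangle$ and $M^2$ the threshold of the continuum, both labels introduced here for illustration, and $\tilde G_2$ agrees with the exact two-point function of the text only up to convention-dependent factors of $i$.

```latex
\tilde G_2(p) = \int_0^\infty d\mu^2\, \rho(\mu^2)\, \frac{i}{p^2 - \mu^2 + i\epsilon}
= \frac{iZ}{p^2 - m^2 + i\epsilon}
+ \int_{M^2}^\infty d\mu^2\, C(\mu^2)\, \frac{i}{p^2 - \mu^2 + i\epsilon}.
```

The first term is an isolated pole at $p^2 = m^2$ with residue proportional to $Z$. The second term has a branch cut starting at $p^2 = M^2 > m^2$ and is regular at $p^2 = m^2$, so it cannot cancel the pole.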
It follows that the exact momentum-space two-point function $W_2(p)$ has a pole at
the stable particle mass, with residue $Z$. Note that nothing in this derivation requires
that the field $\phi$ be the fundamental variable that appears in the Lagrangian, or that
perturbation theory be applicable. Any field with a non-zero Z will do. In cases where
perturbation theory is applicable, the fundamental fields will have poles in their Green
functions. However, in quantum chromodynamics (QCD), the theory of the strong
interactions, perturbation theory is inapplicable in the vicinity of hadron mass scales.
The fundamental quark and gluon fields are not associated with stable particles, and
hadrons are instead created by multi-linear functions of the quark fields. In lattice
gauge theory, the functional integral of QCD is done numerically, and hadron masses
are computed by finding poles in the two-point functions of quark-anti-quark bilinears
and quark trilinears.
3.7 The scattering matrix and the LSZ formula
Most of the real experimental data to which QFT can be applied are the results of
scattering experiments. It is an experimental fact that there exist (approximately) stable
single-particle states in the world, as well as states of multiple particles at large relative
space-like distances, which behave, to a very good approximation, like free particles.
Interactions fall off at large spatial separation, and the cluster property of QFT provides
a neat mathematical explanation of this [16-17]. The cluster property states that, at
large space-like separation, the connected parts of Green functions fall to zero. In
perturbation theory, this follows from the falloff of a single Feynman propagator. If
all particles are massive, the falloff is exponential, whereas if there is no mass gap we
expect power-law falloff. In this case, a variety of behaviors is possible, and there is not
always a scattering theory.
The idea of scattering theory is to derive formulae for idealized amplitudes in which
some number of widely separated particles come in from past infinity, interact, and
go off to become a state of (possibly different) space-like separated particles in the
asymptotic future. The central quantity one computes is the scattering or S-matrix: the
amplitude
$$\langle \text{out}\; f_1 \ldots f_n \,|\, g_1 \ldots g_m\; \text{in}\rangle$$
for $m$ particles with asymptotic wave functions $g_i$ to turn into $n$ particles with asymptotic
wave functions $f_j$. The $g_i$ and $f_j$ run over complete sets of single free-particle states.
Non-relativistic quantum mechanics has a well-developed scattering theory [18-19].
In that theory one proves that the S-matrix for potential scattering is a partial isometry,
a unitary transformation on a subspace (the scattering states) of the Hilbert space
orthogonal to all bound states of the particles. The in and out states are two different
bases for this subspace and the S-matrix is the transformation relating them. In QFT
all bound states can be simply thought of as additional particles. So the entire Hilbert
space will be spanned by scattering states, as long as we treat bound states as separate
particles.
The basic assumption of scattering theory is thus that the Hilbert space has two
complete bases of states, each of which is isomorphic to the Fock space of some col-
lection of free particles. A list of the masses and spins of the particles completely
specifies the spectrum of the Hamiltonian. We have noted in Problem 2.3 how diffi-
cult it is to discuss the eigenstates of an interacting quantum field theory, and claimed
that the introduction of Green functions side-steps all these difficulties. The Lehmann-
Symanzik-Zimmermann (LSZ) formula [20] shows us how to find the masses and spins
of all single-particle states, and to compute the S-matrix, in terms of Green functions.
To derive it, we assume the existence of a single-particle state of mass $m$ and spin 0.$^{10}$
We also assume the existence of a local field $\Phi(x)$, such that
$$\langle 0|\Phi(0)|p\rangle = \sqrt{\frac{Z}{(2\pi)^3\, 2\omega_p}},$$
with $Z \neq 0$. The functional form of the matrix element follows from Lorentz invariance
(prove it if you don't believe me). $\Phi$ might be proportional to the Lagrangian field of
an interacting field theory, or it might be a complicated function of such fields. We will
let $Z[J]$ denote the generating functional of Green functions of $\Phi$.
Consider a source of the following form: $J = \sum_i J^i_{\rm in} + \sum_j J^j_{\rm out}$. Each component of
this composite source is defined as follows:
$$\int J^i_{\rm in}(x)\,\Phi(x) = \frac{1}{\sqrt{Z}}\,\epsilon^i_{\rm in} \lim_{t\to-\infty}\int d^3x\, \left(\Phi\,\partial_0\phi^i_{\rm in} - \phi^i_{\rm in}\,\partial_0\Phi\right).$$
There is an analogous formula for the outgoing source. Here $\epsilon^i_{\rm in}$ is infinitesimal and
all formulae are to be interpreted by expansion to lowest order in all the $\epsilon^i_{\rm in,out}$. $\phi^i_{\rm in}$
is a normalizable, positive-energy solution of the KG equation. If $\Phi$ were a free field
of mass $m$, this formula would just define the creation operator for a single-particle
state with wave function $\phi^i_{\rm in}$, the spatial integral would be time-independent and the
limit superfluous. For any local field $\Phi$, the source creates a state localized around
an infinitely distant spatial point as $t \to -\infty$. Furthermore, if we consider the matrix
element $\langle\eta|\int J\Phi|\psi\rangle$ between states $|\psi\rangle$ and $|\eta\rangle$ of fixed 4-momentum, then, unless
$(P_\psi - P_\eta)^2 = m^2$, the limit will vanish. If we choose all of the incoming and outgoing
wave functions$^{11}$ to be localized around different asymptotic directions as $t \to \pm\infty$,
then this operator acts just like an in creation operator even in the interacting theory.$^{12}$
Similarly, the out part of the source acts like a sum of annihilation operators. Since
$Z[J] = \langle 0|T\exp(i\int J\Phi)|0\rangle$, all the annihilation operators sit to the left of all of the
creation operators. Thus, it is plausible that the generating functional with this source is
nothing but the generating functional for the scattering matrix. That is, the coefficient of
$\epsilon^1_{\rm in} \cdots \epsilon^m_{\rm in}\,\epsilon^1_{\rm out} \cdots \epsilon^n_{\rm out}$ in the expansion of $Z[J]$ for this source$^{13}$ is the scattering-matrix
element
$$\langle \text{out}\; f_1 \ldots f_n\,|\, g_1 \ldots g_m\; \text{in}\rangle,$$
10 The generalization to other spins is straightforward and we will not go into the details, but simply use the
appropriate formulae when necessary.
11 The outgoing wave functions are negative-energy solutions of the KG equation, and we take the limit
$t \to +\infty$, instead of $-\infty$.
12 This is the assumption that particle interactions fall off at large distances. It can be motivated/proven
using certain plausible axioms about QFT [21-22].
13 We emphasize that other terms in the expansion of the composite source have no particular use. They
correspond to creating or annihilating multiple particles in exactly the same scattering state.
where the single-particle states $f_j$ and $g_i$ are defined by
and
l - px) fiip)-
^2co p (2k
Conventional time-dependent perturbation theory offers an alternative definition
of the scattering matrix as the infinite-$T$ limit of the Dirac-picture evolution operator
$U_D(T, -T)$, and it can be verified in all examples that this coincides with our definition
in terms of Z[J]. More importantly, the LSZ formula can be massaged into a form
that allows us to find particle masses and scattering amplitudes directly from Green
functions. We first write
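$$\int J^i_{\rm in}\Phi = -\frac{\epsilon^i_{\rm in}}{\sqrt{Z}}\int_{-\infty}^{\infty} dt\; \partial_0 \int d^3x\, \left(\Phi\,\partial_0\phi^i_{\rm in} - \phi^i_{\rm in}\,\partial_0\Phi\right).$$
This is not quite right. The boundary term at $t = -\infty$ gives what we want, but the
term at $t = \infty$ contains a creation operator for a state localized in the future. However,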
if all of the final states are orthogonal to all of the initial states, this term just acts to
the left on the vacuum, and vanishes. In the case where this is not true (the case of
partially forward scattering) the LSZ formula we are about to write must be slightly
modified. In practice, this modification is never of any consequence. Part of the forward-
scattering amplitude is due to disconnected processes happening independently at large
space-like separation. The rest can be obtained by analytic continuation from non-
forward scattering. We now act with the outer time derivative on the KG scalar product,
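and exploit its Wronskian form and the fact that $\phi^i_{\rm in}$ satisfies the KG equation to
write
$$\int J^i_{\rm in}\Phi = -\frac{\epsilon^i_{\rm in}}{\sqrt{Z}}\int d^4x\, \left[(\nabla^2 - m^2)\phi^i_{\rm in}\,\Phi - \partial_0^2\Phi\;\phi^i_{\rm in}\right] = \frac{\epsilon^i_{\rm in}}{\sqrt{Z}}\int d^4x\; \phi^i_{\rm in}\,[\Box + m^2]\,\Phi.$$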
We can do a similar manipulation for all of the in and out sources. For outgoing
particles, we want to pick up the creation operator rather than the annihilation operator,
so we complex conjugate the source equation, and end up with the complex conjugate
of the positive-energy outgoing wave function. The result is the LSZ formula for the
S-matrix
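$$\begin{aligned}\langle \text{out}\; f_1 \ldots f_n\,|\,g_1 \ldots g_m\; \text{in}\rangle = {} & \left(\frac{i}{\sqrt{Z}}\right)^{n+m} \int \prod_k d^4x_k\, g_k(x_k)\,[\Box_{x_k} + m^2] \prod_j d^4y_j\, f_j^*(y_j)\,[\Box_{y_j} + m^2] \\ & \times \langle 0|T\Phi(x_1) \ldots \Phi(x_m)\Phi(y_1) \ldots \Phi(y_n)|0\rangle.\end{aligned}$$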
The product of differential operators includes one KG operator for each integration
variable. It is conventional in scattering theory to replace the normalizable initial- and
final-state wave packets by plane waves. As we shall see, this causes some trivial infinities
to show up in cross sections, but they are easily dealt with. Recall again that the field
$\Phi$ in the LSZ formula might be an elementary field, $\phi$ in the Lagrangian, but need not
be. All that is required is that it have a finite amplitude to create single-particle states,
which is verified by finding a pole in its two-point function.
The LSZ formula has two remarkable properties. The first is that it is symmet-
ric between in and out states except for the fact that the out wave functions are
negative-energy solutions and the in wave functions positive-energy solutions of the KG
equation. This leads one to surmise that different scattering amplitudes (e.g.
electron-electron scattering and electron-positron annihilation) are just analytic continuations
of each other. These crossing symmetry relations were discussed in a heuristic manner
in the introduction. The hard part in proving them is the proof of analyticity [23].
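Secondly, in momentum space, the formula reads
$$\begin{aligned}\langle \text{out}\; q_1 \ldots q_n\,|\,p_1 \ldots p_m\; \text{in}\rangle = {} & \prod_{j=1}^{n} \frac{-i\,(q_j^2 - m^2)}{\sqrt{Z}\,\sqrt{2\omega_{q_j}(2\pi)^3}}\ \prod_{k=1}^{m} \frac{-i\,(p_k^2 - m^2)}{\sqrt{Z}\,\sqrt{2\omega_{p_k}(2\pi)^3}} \\ & \times G_{n+m}(-q_1 \ldots -q_n;\, p_1 \ldots p_m).\end{aligned}$$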
$G_{n+m}$ is the Fourier transform of the time-ordered product of $n + m$ $\Phi$ fields. Note
that for the outgoing states $q_j$ is the physical, positive-energy, 4-momentum, but the
Fourier transform is evaluated at negative-energy outgoing momenta. Because of
translation invariance, the Fourier transform contains a delta function of the sum of all its
arguments, and because of the previous remark, this is just energy-momentum
conservation $\delta^4(\sum p_i - \sum q_j)$. When we square the amplitude to get a cross section, one
of these momentum-space delta functions gives us $\delta^4(0)$, which is interpreted as the
space-time volume. In a translation-invariant system, we want to compute the prob-
ability per unit volume per unit time for a plane wave to interact. With normalizable
wave packets, this infinity does not occur.
The most remarkable part of this formula is that all the momenta are on mass shell
and the formula contains what appears to be a product of zeros. One concludes that
the S-matrix element will be zero unless the Green function has a single pole at the
mass shell on each external leg. The connection between full connected Green func-
tions and 1PI Green functions shows how easy it is for this to happen. The full Green
functions are evaluated as tree diagrams whose vertices are 1PI functions and whose
internal and external lines are full propagators. In momentum space, each external leg
has a factor $W_2(p)$, the full connected two-point function. The Lehmann representation
shows that each of these Green functions has a simple pole, giving precisely the
multiple-pole structure that we need, in order for the LSZ formula to predict a finite
S-matrix.
3.8 Problems for Chapter 3
3.1. Consider a quantum mechanics problem with a single variable $x$ (the generalization
is easy) and a normalizable ground state, denoted by $|0\rangle$, with energy 0.
Assume that you are given the Green functions
$$\langle 0|T x(t_1) \ldots x(t_n)|0\rangle. \tag{3.44}$$
Choose a time ordering. By inserting complete sets of intermediate states, show
that knowledge of these functions allows you to read off all of the eigenvalues of
the Hamiltonian. Show that you can also calculate
$$\langle 0|x^m p^n|0\rangle$$
for any $m$ and $n$ from your knowledge of these Green functions. Argue that this
allows you to calculate the wave function $\psi_0(x)$ of the ground state in the $x$
representation (up to an overall constant phase). Use the Schrödinger equation
to determine the potential in terms of $\psi_0$. Thus, knowledge of the Green functions
is equivalent to knowledge of the functional form of the Hamiltonian, as well as
its eigenspectrum.
*3.2. This problem is a repetition of Problem 2.11, by functional methods. Compute
the generating functional for a free field of mass $m$ in the presence of a source
$$J(t,\mathbf{x}) = q_1\,\delta^3(\mathbf{x}) + q_2\,\delta^3(\mathbf{x} - \mathbf{R}), \quad |t| < T,$$
$$J(t,\mathbf{x}) = 0, \quad |t| > T.$$
This source causes a static disturbance of the field, at two spatial points, over a
time interval $2T$. When $T \gg R \gg 1/m$, this amplitude should have the form
$e^{-2V(R)T}$, where $V(R)$ is the lowest energy of states with two localized
disturbances. Show that this is the case and that, apart from an additive constant, $V$ is
the Yukawa potential. Thus, particle exchange is responsible for forces between
static disturbances. This motivates the claim that all particle interactions can be
understood in terms of particle creation, annihilation, and exchange.
3.3. Carefully derive the source term in the SD equations by looking at the action
of the second derivative $\partial_0^2$ on the Heaviside functions in the definition of a
time-ordered product.
3.4. Compute all diagrams for four-point functions at the tree level in the theory with
Lagrangian
$$\mathcal{L} = \frac{1}{2}\left[(\partial\phi)^2 - m_0^2\phi^2\right] - \frac{\lambda_3}{3!}\phi^3 - \frac{\lambda_4}{4!}\phi^4.$$
Do the computations in Euclidean space and analytically continue to Minkowski
space in order to understand the relation between Euclidean and Lorentzian
Feynman rules. Of what order in the loop approximation parameter $g$ are $\lambda_{3,4}$?
Argue that, at tree level, the particle mass is $m_0$ and the wave-function
renormalization $Z = 1$. Use the LSZ formula to compute the two-to-two scattering
amplitude, for particles of momenta $p_{1,2}$ to scatter into particles of momenta $p_{3,4}$.
Introduce the Mandelstam variables $s = (p_1 + p_2)^2$, $t = (p_1 - p_3)^2$, $u = (p_1 - p_4)^2$,
and prove that $s + t + u = 4m_0^2$. Evaluate $s$, $t$, $u$ in terms of center-of-mass energy
and scattering angle. The center of mass is the Lorentz frame where $\mathbf{p}_1 + \mathbf{p}_2 = 0$.
Show that the amplitude is symmetric under interchange of $s$, $t$, $u$ and interpret
this as a crossing symmetry relation.
*3.5. Compute the combinatoric factors for all diagrams, through two loop orders and
eight external legs for the Lagrangian in Problem 3.4. Do this both by figuring
out the geometrical symmetries of the diagrams and by combining the inverse
factorials from perturbation theory with the number of contractions that give
a particular graph. A useful notation for counting contractions is to write out
powers of the field, as in $\phi^4(y_1) \to \phi_1\phi_1\phi_1\phi_1$, and draw under-brackets, $\underbracket{\phi_i\phi_j}$, or
over-brackets, $\overbracket{\phi_i\phi_j}$, between fields that are contracted.
*3.6. Prove that, in a general theory of one scalar field with action $(1/g^2)[\frac{1}{2}(\partial\phi)^2 - V(\phi)]$,
if we make the decomposition $\phi = \phi_{\rm cl} + g\chi$ discussed in the text, then
the general diagram for an $E$-point connected Green function has the power
$g^{2L-2+E}$, where the Feynman rules have a power $g^{k-2}$ for a $k$-point vertex, and
NO powers of $g$ for the external lines. $L$ is the number of loops in the diagram.
Particles of spin 1, and gauge invariance
4.1 Massive spinning particles
We now want to extend our discussion of particles and fields to particles with spin. The
reader has been asked to do this on her/his own in Problem 2.6. This section can be
viewed as an extended answer to parts of that exercise. The first step is to understand
the Lorentz transformation properties of single-particle states. We start with massive
particles in their rest frame. The stability subgroup of the rest-frame momentum is
SU(2), so single-particle states are classified as irreducible representations of this group,
which is the same as the usual non-relativistic definition of spin. A particle of spin $j$
has $(2j + 1)$ states $|0, m\rangle$, identified with eigenvalues $m$ in $[-j, j]$ of the operator
$J_3$ representing infinitesimal rotations around the 3-axis.
To generalize our discussion of spinless particles, we define the states of a massive
spinning particle, with momentum p and spin component m, as
$$|p, m\rangle = \sqrt{\frac{m}{\omega_p}}\; U(L(p))\,|0, m\rangle.$$
Here $L(p)$ is defined as a boost along the 3-direction, with velocity $|\mathbf{p}|/\omega_p$, followed by
a clockwise rotation of the 3-direction into the direction of $\mathbf{p}$. It transforms the rest
frame to the frame where the particle has momentum $p$. Then, using the fact that $U(\Lambda)$
is supposed to be a representation of the Lorentz group, we have
$$U(\Lambda)|p, m\rangle = \sqrt{\frac{m}{\omega_p}}\; U(L(\Lambda p))\, U(L^{-1}(\Lambda p)\,\Lambda\, L(p))\,|0, m\rangle.$$
The transformation $R_W(\Lambda, p) = L^{-1}(\Lambda p)\Lambda L(p)$, called the Wigner rotation, takes the
rest frame into itself, and hence belongs to SU(2). We therefore know how it acts on
the rest-frame states. Thus
$$U(\Lambda)|p, m\rangle = \sqrt{\frac{m}{\omega_p}}\; U(L(\Lambda p))\,\mathcal{D}_{mk}(R_W)\,|0, k\rangle = \sqrt{\frac{\omega_{\Lambda p}}{\omega_p}}\;\mathcal{D}_{mk}(R_W)\,|\Lambda p, k\rangle, \tag{4.1}$$
where $\mathcal{D}$ is the usual $(2j + 1)$-dimensional representation of the Wigner rotation. It's
easy to verify that this defines a unitary representation of the Poincaré group on the
space of states of a massive spinning particle. The creation and annihilation operators
for these particles will thus carry a label, $a_m(p)$, transforming in the spin-$j$ representation
of SU(2).
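The statement that $L^{-1}(\Lambda p)\Lambda L(p)$ is a pure rotation can be checked numerically on $4\times 4$ matrices. The Python sketch below makes one simplifying assumption that differs from the text: it takes $L(p)$ to be the pure boost to momentum $p$, rather than a boost along the 3-direction followed by a rotation (the two choices differ by a rotation, so the conclusion is unaffected). The mass, momentum and Lorentz transformation are made-up sample values.

```python
import math


def boost(p, m):
    """4x4 pure boost taking the rest momentum (m, 0, 0, 0) to (omega_p, p)."""
    omega = math.sqrt(m * m + sum(c * c for c in p))
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = omega / m
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = p[i] / m
        for j in range(3):
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + p[i] * p[j] / (m * (omega + m))
    return L


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(4)) for i in range(4)]


m = 1.0
p = (0.3, -0.4, 1.2)                  # sample spatial momentum
Lam = boost((0.75, 0.0, 0.0), 1.0)    # sample Lambda: boost along x with velocity 0.6

p4 = apply(boost(p, m), [m, 0.0, 0.0, 0.0])        # the 4-momentum (omega_p, p)
Lp = apply(Lam, p4)                                # the transformed momentum Lambda p
L_inv = boost(tuple(-c for c in Lp[1:]), m)        # inverse of a pure boost L(q) is L(-q)
W = matmul(L_inv, matmul(Lam, boost(p, m)))        # Wigner transformation

R = [row[1:] for row in W[1:]]                     # spatial 3x3 block
time_mixing = max([abs(W[0][0] - 1.0)] +
                  [abs(W[0][i]) for i in (1, 2, 3)] +
                  [abs(W[i][0]) for i in (1, 2, 3)])
orthogonality = max(abs(sum(R[i][k] * R[j][k] for k in range(3)) - (1.0 if i == j else 0.0))
                    for i in range(3) for j in range(3))
print(time_mixing, orthogonality)
```

Both printed numbers are zero up to rounding error: the combined transformation leaves the time direction of the rest frame untouched and acts on space as an orthogonal matrix, i.e. as a rotation.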
4.2 Massless particles with helicity
When we try to generalize these considerations to massless particles, we run into a
surprise. We can always transform a null momentum to a frame where $p = (E, 0, 0, E)$.
This is invariant under rotations around the 3-direction. The combinations of transverse
boost and rotation generators $T_i = K_i + \epsilon_{ij} J_j$ also leave the null momentum invariant.
Together with the rotation in the transverse plane these form the two-dimensional Euclidean group.
This group has infinite-dimensional "continuous spin" representations unless $T_i = 0$.
There are no such infinitely degenerate massless particles in nature, and one cannot
make a local field theory which includes them.¹ The remaining representations are
characterized by the eigenvalue of $J_3$, called the helicity, which must be quantized in
half-integer units in order to have a local field theory.² Furthermore, local field theory
requires that a helicity value $h$ be accompanied by $-h$. Indeed, if this were not the
case we could trace the causal order of emission and absorption processes between
space-like separated points by following the helicity flow. However, the causal order
can be changed by doing boosts along the momentum of the massless particle, which
preserve helicity. On a technical level we find that we need both helicities (and helicity
quantization) in order to construct a local field.
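The little-group statements above can be verified with explicit $4\times 4$ matrices. This is a sketch under assumptions: the generators are taken real and acting on contravariant components $(t, x, y, z)$, $\epsilon_{12} = +1$, and the overall sign of the boost generators is a convention chosen here so that the combination $T_i = K_i + \epsilon_{ij}J_j$ annihilates $k = (E, 0, 0, E)$; with the opposite boost sign the relative sign in $T_i$ flips. The helper names `unit` and `comm` are hypothetical.

```python
import numpy as np

def unit(i, j):
    """4x4 matrix with a single 1 in row i, column j."""
    m = np.zeros((4, 4))
    m[i, j] = 1.0
    return m

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

# Real rotation generators acting on (t, x, y, z).
J1 = unit(3, 2) - unit(2, 3)
J2 = unit(1, 3) - unit(3, 1)
J3 = unit(2, 1) - unit(1, 2)

# Boost generators; the overall sign is an assumed convention (see lead-in).
K1 = -(unit(0, 1) + unit(1, 0))
K2 = -(unit(0, 2) + unit(2, 0))

# T_i = K_i + eps_ij J_j with eps_12 = +1.
T1 = K1 + J2
T2 = K2 - J1

k = np.array([1.0, 0.0, 0.0, 1.0])  # canonical null momentum with E = 1

print("T1, T2, J3 annihilate k:",
      np.allclose(T1 @ k, 0), np.allclose(T2 @ k, 0), np.allclose(J3 @ k, 0))
print("[T1, T2] = 0:           ", np.allclose(comm(T1, T2), 0))
print("[J3, T1] = T2:          ", np.allclose(comm(J3, T1), T2))
print("[J3, T2] = -T1:         ", np.allclose(comm(J3, T2), -T1))
```

The three commutators are those of two translations and one rotation in a plane, which is the two-dimensional Euclidean algebra referred to in the text.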
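A general null momentum $(|\mathbf{p}|, \mathbf{p})$ can be obtained from the canonical one $k = (E, 0, 0, E)$
by performing a boost of rapidity $\ln(|\mathbf{p}|/E)$ along the 3-direction, followed by a rotation
of the 3-direction into the direction $\mathbf{p}/|\mathbf{p}|$. Call the product of these two transformations
$\mathcal{L}(p)$. By analogy with the massive case, we define $|p, h\rangle = \sqrt{E/\omega_p}\; U(\mathcal{L}(p))\,|k, h\rangle$, with
$\omega_p = |\mathbf{p}|$, and derive the Lorentz transformation rule for massless single-particle states. It is
$$U(\Lambda)|p, h\rangle = \sqrt{\frac{E}{\omega_p}}\; U(\mathcal{L}(\Lambda p))\, U\big(\mathcal{L}^{-1}(\Lambda p)\,\Lambda\,\mathcal{L}(p)\big)\,|k, h\rangle = \sqrt{\frac{\omega_{\Lambda p}}{\omega_p}}\; e^{ih\theta(\Lambda, p)}\,|\Lambda p, h\rangle. \qquad (4.2)$$
The second equality follows because $\mathcal{L}^{-1}(\Lambda p)\,\Lambda\,\mathcal{L}(p)$ is in the stability subgroup of
the canonical momentum $k$ and so acts on the states like a rotation around the direction of motion. $\theta(\Lambda, p)$
is the angle of that rotation.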
In Problem 4.3, you will show that the construction of local fields from the creation
and annihilation operators of massless particles requires that h be quantized in half-
integer multiples and that a particle of helicity —h exists for every particle of helicity h.
For particles that participate in parity-conserving interactions, it is conventional to call
the state with helicity —h another spin state of the same particle, whereas for neutrinos,
which have only parity-violating interactions, it is called the anti-neutrino.
Note that for $|h| > \frac{1}{2}$ there is a discontinuity in the number of degrees of freedom
of a massive particle of spin h and a massless particle with helicity h. The "missing"
degrees of freedom are called longitudinal. A field theory of massive particles of spin
h, which has a smooth massless limit, must acquire some sort of symmetry principle in
the massless limit, which decouples the unwanted longitudinal states. By dimensional
¹ T. Banks (unpublished).
² You will prove the assertions on this page in the problems at the end of the chapter.
Particles of spin 1, and gauge invariance
analysis, going to large momentum must be related to the zero-mass limit, so a theory
with good high-energy behavior must have a similar decoupling principle. We will see
that this requirement is the origin of gauge invariance, both for Maxwell's field and for
much of the rest of the standard model of particle physics. To that end, we proceed to
the field theory of massive spin-1 particles, after which we will take the massless limit.
4.3 Field theory for massive spin-1 particles
The analysis we did of the states of massive spinning particles implies that the creation
operators of massive spin-1 particles satisfy
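$$U^\dagger(\Lambda)\, a_i^\dagger(p)\, U(\Lambda) = \sqrt{\frac{\omega_{\Lambda p}}{\omega_p}}\; R_{ij}\big(R_W(\Lambda, p)\big)\, a_j^\dagger(\Lambda p),$$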
where R is the usual 3x3 matrix representation of rotations.
As in the scalar case we want to find a local field linear in creation and annihilation
operators, which is a model for a device that can create maximally localized states of a
single particle. The covariant field we construct from the creation operators must have
at least three components, each of which transforms as a vector under rotations. The
stability subgroup of a single space-time point, in the Poincare group, is the Lorentz
group, so the fields at a point should transform as a representation of this group. The
particle momentum is Fourier conjugate to its position, so the components of the field
are related to internal degrees of freedom of the particle. We have described particles
with a finite number of degrees of freedom per momentum state, so we should have
fields that transform in finite-dimensional representations of the Lorentz group. None
of these representations is unitary, but we are not talking about the action of the Lorentz
group on the Hilbert space. If that unitary action is denoted $U(\Lambda)$ then we want the
field $A_K$ to transform as
$$U^\dagger(\Lambda)\, A_K(x)\, U(\Lambda) = S_K{}^L(\Lambda)\, A_L(\Lambda^{-1} x),$$
where $S(\Lambda)$ is the finite-dimensional representation. $S$ is not a unitary matrix, but
$U(\Lambda)$ is an infinite-dimensional unitary operator, which we constructed in the previous
section.
The only three-dimensional representations of the Lorentz group are the (complex)
self-dual and anti-self-dual tensors
$$B_{\mu\nu} = \pm\frac{i}{2}\,\epsilon_{\mu\nu\lambda\kappa}\, B^{\lambda\kappa}.$$
These are mapped into each other by reflection. In order to construct a Hermitian
Hamiltonian we would need both of these fields, so we really have six real components.
The smallest representation we can use to describe spin-1 particles is the 4-vector $B_\mu$,
and this can be reduced to three components by the covariant constraint
$$\partial_\mu B^\mu = 0.$$
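Thus we may guess at a formula
$$B_\mu(x) = \int \frac{d^3p}{\sqrt{(2\pi)^3\, 2\omega_p}}\, \Big[a_i(p)\,\epsilon^i_\mu(p)\, e^{-ipx} + a_i^\dagger(p)\,\epsilon^{i*}_\mu(p)\, e^{ipx}\Big],$$
where $p^\mu \epsilon^i_\mu(p) = 0$.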
There are three solutions to the latter equation. We normalize the space component
of $\epsilon^i_\mu$ by $\epsilon^i_j(\mathbf{p} = 0) = \delta^i_j$, and allow it to transform as a 4-vector, which defines it for all
values of $p$. In the exercises, the reader will verify that this choice leads to a Lorentz
4-vector transformation law for the field $B_\mu(x)$. We can build an anti-symmetric tensor
$$B_{\mu\nu} = \partial_\nu B_\mu - \partial_\mu B_\nu,$$
so we see that the six-component anti-symmetric tensor is a derived field. Since $B_\mu$ was
constructed to satisfy the Klein-Gordon equation and the transversality condition
$\partial_\mu B^\mu = 0$, this field satisfies
$$\partial_\nu B^{\mu\nu} + \mu^2 B^\mu = 0,$$
which is called the Proca [24] equation. Note that, conversely, the Proca equation implies
the transversality condition.
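As a concrete illustration of the polarization vectors, one can boost the three rest-frame unit vectors and check transversality numerically. The sketch assumes a pure boost for the standard transformation, signature (+,-,-,-), and contravariant components; the completeness relation $\sum_i \epsilon^{i\mu}\epsilon^{i\nu} = -\eta^{\mu\nu} + p^\mu p^\nu/\mu^2$ tested at the end is the standard result for a massive vector and is quoted here rather than taken from the text. The names `standard_boost` and `polarizations` are hypothetical.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed metric signature (+,-,-,-)

def standard_boost(p3, mass):
    """Pure boost taking the rest momentum (mass,0,0,0) to (omega_p, p3)."""
    p3 = np.asarray(p3, dtype=float)
    omega = np.sqrt(mass**2 + p3 @ p3)
    L = np.eye(4)
    L[0, 0] = omega / mass
    L[0, 1:] = p3 / mass
    L[1:, 0] = p3 / mass
    L[1:, 1:] += np.outer(p3, p3) / (mass * (omega + mass))
    return L

def polarizations(p3, mass):
    """Rows are the boosted rest-frame unit vectors eps^i (upper index), i = 1, 2, 3."""
    return standard_boost(p3, mass)[:, 1:].T

mu = 0.5
p3 = np.array([0.2, 0.1, -0.7])
p_up = np.concatenate(([np.sqrt(mu**2 + p3 @ p3)], p3))

eps = polarizations(p3, mu)                  # shape (3, 4)
transversality = eps @ ETA @ p_up            # p_mu eps^{i mu} for each i
pol_sum = eps.T @ eps                        # sum_i eps^{i mu} eps^{i nu}
expected = -ETA + np.outer(p_up, p_up) / mu**2

print("p . eps^i = 0:       ", np.allclose(transversality, 0))
print("completeness relation:", np.allclose(pol_sum, expected))
```

The $p^\mu p^\nu/\mu^2$ term in the polarization sum is the piece that grows without bound as $\mu \to 0$ at fixed momentum.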
A Lagrangian that leads to the Proca equation is
$$\mathcal{L} = -\frac{1}{4} B_{\mu\nu}^2 + \frac{\mu^2}{2} B_\mu^2.$$
To canonically quantize this Lagrangian (see the problems) we must use the constraint
equation to eliminate $B_0$, which does not have a canonical conjugate. The procedure is
slightly painful, but ends up giving the obvious answer: the generating functional for
Green functions $Z[J_\mu]$ is given by a functional integral with this Lagrangian coupled to
a source via $\delta\mathcal{L} = B_\mu J^\mu$. The generating functional for free connected Green functions
is obtained by doing a Gaussian integral:
$$W_0[J_\mu] = -\frac{1}{2}\int d^4x\, d^4y\; J^\mu(x)\, J^\nu(y) \int \frac{d^4p}{(2\pi)^4}\; e^{-ip(x-y)}\; \frac{p_\mu p_\nu/\mu^2 - \eta_{\mu\nu}}{p^2 - \mu^2 + i\epsilon}.$$
The reader should be able to see from this expression's symmetry that it would be
inconsistent to quantize these fields as fermions. As in the case of spin-zero particles,
the connected two-point function of the free Proca field is just the Green function
(in the sense of partial differential equations) of the Proca equation, with Feynman
boundary conditions.
It should by now be obvious how to derive Feynman diagrams for any perturbation
around this free-field Lagrangian. It would be a good idea for the reader to work out
the diagrams for two-, three- and four-point functions up to one loop (without doing
the loop integrals) for the interaction $\mathcal{L}_I = i g B^\mu(\phi^*\,\partial_\mu\phi - \partial_\mu\phi^*\,\phi)$ between the Proca
field and a charged scalar.
The propagator of a massive vector boson³ will be denoted by
[Figure: massive photon propagator in Minkowski and Euclidean signature, with denominators $p^2 - \mu^2 + i\epsilon$ and $p^2 + \mu^2$, respectively.]
One look at the vector boson propagator tells us that something dramatic happens
for spin 1 and zero mass. Indeed, the longitudinal part of the propagator blows up in
this limit. The only way we can get a consistent limiting expression is to insist that the
source satisfy $\partial_\mu J^\mu = 0$. A massless vector meson must be coupled to a conserved current.
This remark becomes even more interesting when we realize that the $\mu \to 0$ limit of the
Proca equation coupled to a source is just Maxwell's equation for the electromagnetic
field.
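A short worked step makes the role of current conservation explicit. It assumes the standard tensor structure $\eta_{\mu\nu} - p_\mu p_\nu/\mu^2$ for the numerator of the massive vector propagator and writes $\tilde J^\mu(p)$ for the Fourier transform of the source (notation introduced here only for illustration); overall factors of $i$ and signs are convention dependent and are omitted.

```latex
\tilde J^{\mu}(-p)\,
\frac{\eta_{\mu\nu} - p_\mu p_\nu/\mu^{2}}{p^{2}-\mu^{2}+i\epsilon}\,
\tilde J^{\nu}(p)
=
\frac{\tilde J(-p)\cdot\tilde J(p)}{p^{2}-\mu^{2}+i\epsilon}
-\frac{1}{\mu^{2}}\,
\frac{\bigl(p\cdot\tilde J(-p)\bigr)\bigl(p\cdot\tilde J(p)\bigr)}{p^{2}-\mu^{2}+i\epsilon},
\qquad
\partial_\mu J^\mu = 0 \;\Longleftrightarrow\; p_\mu \tilde J^\mu(p) = 0 .
```

Only the second term is singular as $\mu \to 0$, and it vanishes identically when the current is conserved, leaving an expression with a smooth massless limit.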
4.3.1 The Stueckelberg formalism
We can obtain more insight by introducing a redundant parametrization of the space
of field variables, known as the Stueckelberg formalism. We introduce another vector
field $A_\mu$, as well as a scalar, $\theta$, via the formula
$$B_\mu = A_\mu - \partial_\mu\theta.$$
The Lagrangian is obtained by just substituting this combination into the Proca
Lagrangian. The well-known gauge invariance of Maxwell's field strength tensor with
respect to gauge transformations of the vector potential implies that
$$B_{\mu\nu} = \partial_\nu A_\mu - \partial_\mu A_\nu,$$
so that $\theta$ appears only in the mass term.
The full Lagrangian now has a gauge invariance under
$$A_\mu \to A_\mu + \partial_\mu\alpha,$$
$$\theta \to \theta + \alpha,$$
even for non-zero mass. If we couple a current to $A_\mu$, then it must be conserved, in order
to preserve gauge invariance (one must integrate by parts to show this, which means
that either the gauge transformation or the current must vanish sufficiently rapidly at
infinity). If it is conserved, then $\int J^\mu A_\mu = \int J^\mu B_\mu$. For such conserved sources, the
longitudinal part of the $B_\mu$ propagator cancels out, and there are no divergences when
$\mu \to 0$.
From the point of view of free-particle physics, Maxwell's gauge invariance is thus
the result of the discontinuity of the number of degrees of freedom of a spin-1 particle
3 Here and henceforth, we record the Feynman rules in both Minkowski and Euclidean space.
in the massless limit. The Stueckelberg formalism incorporates the massless split of the
degrees of freedom into a spin-zero particle and a massless helicity ±1 particle into the
massive theory by introducing a redundant degree of freedom. It is a hint of the Higgs
mechanism, which we will discuss in Chapters 6 and 8. The connection between gauge
invariance and a redundancy of degrees of freedom will reappear again and again in
our discussions.
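The gauge invariance of the Stueckelberg construction can be made explicit in two lines. The sketch assumes the combination $B_\mu = A_\mu - \partial_\mu\theta$ (the normalization of $\theta$ is an assumption made here) and a source that falls off fast enough for the surface term to vanish.

```latex
% Invariance of B_mu under the combined transformation:
B_\mu \;\to\; (A_\mu + \partial_\mu\alpha) - \partial_\mu(\theta + \alpha)
      \;=\; A_\mu - \partial_\mu\theta \;=\; B_\mu .

% Coupling to a conserved current, after integrating by parts:
\int d^4x\; J^\mu B_\mu
  \;=\; \int d^4x\; J^\mu A_\mu \;+\; \int d^4x\; \theta\,\partial_\mu J^\mu
  \;=\; \int d^4x\; J^\mu A_\mu
  \qquad\text{if } \partial_\mu J^\mu = 0 .
```

Since $B_\mu$ is invariant, any Lagrangian built from it, including the mass term, inherits the invariance; the scalar $\theta$ carries the degree of freedom that becomes the longitudinal polarization.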
4.4 Problems for Chapter 4
4.1. Find the form, in an arbitrary frame, of the three transverse polarization vectors
$\epsilon^i_\mu(p)$ which satisfy $\epsilon^i_0 = 0$ and $\epsilon^i_j = \delta^i_j$ in the rest frame of a massive spin-1 particle.
Using the transformation law of the creation operators $a_i^\dagger(p)$, show that the field
$B_\mu(x)$ transforms like a vector field.
4.2. Canonically quantize the Proca Lagrangian. Note that the canonical conjugate to
$B_0$ vanishes and that the Euler-Lagrange equation for $B_0$ allows us to write it at
any fixed time as a function of the variables $B_i$. Then quantize the spatial components
using standard procedures. Compute the Green function $\langle 0|T B_\mu(x) B_\nu(0)|0\rangle$
using canonical commutators, and show that it is covariant despite the asymmetric
treatment of different Lorentz components. Show that this function is the Green
function of the Proca equation with Feynman boundary conditions.
4.3. Write down a covariant local field transforming in the $(2j + 1, 0)$- or $(0, 2j + 1)$-dimensional
representation of the Lorentz group, built as a linear combination
of creation and annihilation operators of helicity ±j massless particles. The two
representations are complex conjugates of each other. Show that, if we have only
one sign of the helicity, no local field is possible. Note that the helicity j must be
quantized in order for this construction to work.
4.4. Prove that the formula (4.1) defines a unitary representation of the Poincare group.
5 Spin-$\frac{1}{2}$ particles and Fermi statistics
For spin-$\frac{1}{2}$ particles, we will introduce a change of pace and start directly from field
theory. We have to ask ourselves what kind of fields could create spin-$\frac{1}{2}$ particles. This
leads to the question of what kinds of fields there are, which is answered by saying that
fields at a fixed point form finite-dimensional representations of the Lorentz group.
This guarantees that the only infinity in the number of states of a single particle comes
from the different values of momentum it can carry.
We're lucky to live in four dimensions, where the analysis of finite-dimensional
representations is particularly easy. In particular, the kind reader will verify that the
combinations of Lorentz generators
$$\tfrac{1}{2}\epsilon_{ijk} J_{ij} \pm i J_{0k}$$
form two commuting copies of the algebra of SU(2).¹ We can use our complete knowledge
of the representations of SU(2) to list all of the finite-dimensional representations
of SO(1,3). The finite-dimensional representations of SU(2) are all equivalent to unitary
representations, so we can always use a basis in which the two SU(2) generators
are Hermitian. Note that, using this construction, the rotation generators will be Hermitian
and the boost generators anti-Hermitian. This was to be expected. We cannot
have a finite-dimensional unitary representation of the non-compact Lorentz group.²
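The verification left to the reader can be sketched numerically in the 4-vector representation. The assumptions are: real generators acting on contravariant components $(t, x, y, z)$ (the Hermitian generators used in physics differ by factors of $i$), and the normalization $A_i = (J_i + iK_i)/2$, $B_i = (J_i - iK_i)/2$, which is a choice made for this example. The helper names `unit`, `comm`, and `levi_civita` are hypothetical.

```python
import numpy as np
from itertools import product

def unit(i, j):
    """4x4 matrix with a single 1 in row i, column j."""
    m = np.zeros((4, 4), dtype=complex)
    m[i, j] = 1.0
    return m

def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

def levi_civita(i, j, k):
    """Totally anti-symmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

# Real rotation (J) and boost (K) generators in the vector representation.
J = [unit(3, 2) - unit(2, 3), unit(1, 3) - unit(3, 1), unit(2, 1) - unit(1, 2)]
K = [unit(0, i) + unit(i, 0) for i in (1, 2, 3)]

# Assumed normalization of the two commuting combinations.
A = [(J[i] + 1j * K[i]) / 2 for i in range(3)]
B = [(J[i] - 1j * K[i]) / 2 for i in range(3)]

def closes_like_su2(gens):
    """Check [G_i, G_j] = eps_ijk G_k (real-generator form of the SU(2) algebra)."""
    return all(
        np.allclose(comm(gens[i], gens[j]),
                    sum(levi_civita(i, j, k) * gens[k] for k in range(3)))
        for i, j in product(range(3), repeat=2)
    )

mutually_commute = all(
    np.allclose(comm(A[i], B[j]), 0) for i, j in product(range(3), repeat=2)
)

print("A closes into SU(2):", closes_like_su2(A))
print("B closes into SU(2):", closes_like_su2(B))
print("[A_i, B_j] = 0:     ", mutually_commute)
```

The rotation generators are recovered as $J_i = A_i + B_i$, which is why the spin content of a field labeled by two SU(2) representations follows from ordinary angular momentum addition.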
Thus, the general Lorentz-covariant field carries two integer labels $N = [n_L, n_R]$ corresponding
to the dimensions of the representations of the two SU(2) groups. Recall that
even dimensions correspond to half-integer and odd to integer spin. Note that the two
kinds of integer are interchanged by space reflection. Thus, fields of the form $[1, n_R]$
and $[n_L, 1]$ have a handedness, and are called right and left chiral fields.
The rotation generators are the sum of the left and right SU(2) generators, so we can
get spin $\frac{1}{2}$ with either [1,2] or [2,1]. These are called right- and left-handed Weyl spinors,
respectively. In fact the [1,2] representation is complex,³ and its complex conjugate is
¹ We are picking a particular rotation subgroup of SO(1,3) in order to exhibit the invariant isomorphism
between groups.
² It is important to remember that this is the representation of the Lorentz group on field labels. The
representation on the Hilbert space is infinite-dimensional and unitary.
³ SO(1,3) is locally isomorphic to SL(2,C) and the [1,2] is the fundamental representation of the latter.
the [2,1]. To see this note that the rotation and boost generators in the [1,2] are $J_i =
\sigma_i/2$, $J_{0i} = -i\sigma_i/2$. The representation of rotations is pseudo-real, $\sigma_i^* = -\sigma_2\sigma_i\sigma_2$, but
the boost operator has an extra sign change under conjugation, so that we get the [2,1]
representation. We can describe a spin-$\frac{1}{2}$ particle either in terms of a left-handed Weyl
fermion field $v$, in the [2,1] representation, and its complex conjugate, or equivalently,
in terms of the four-component field⁴
$$\psi = \begin{pmatrix} v \\ i\sigma_2 v^* \end{pmatrix},$$
which is a Dirac spinor field, satisfying the Majorana condition
$$\psi^* = \begin{pmatrix} 0 & -i\sigma_2 \\ i\sigma_2 & 0 \end{pmatrix}\psi.$$
The equivalence between Weyl and Majorana descriptions of a neutral spin-$\frac{1}{2}$ particle
was not always recognized, and there is a lot of confusion in the literature, as late as the
1970s. A general Dirac spinor has the form $\psi_1 + i\psi_2$ with $\psi_i$ satisfying the Majorana
condition.
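The pseudo-reality argument can be checked directly with the Pauli matrices. In this sketch the generators of the complex-conjugate representation are taken to be $-G^*$ for each generator $G$ (the convention appropriate to group elements written as $e^{iG}$, which is an assumption about conventions), and conjugation by $\sigma_2$ is used as the change of basis.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = [s1, s2, s3]

rot = [s / 2 for s in sigma]          # J_i = sigma_i / 2
boost = [-1j * s / 2 for s in sigma]  # J_0i = -i sigma_i / 2

# Pseudo-reality of the Pauli matrices: sigma_i^* = -sigma_2 sigma_i sigma_2.
pseudo_real = all(np.allclose(s.conj(), -s2 @ s @ s2) for s in sigma)

# Conjugate-representation generators, rotated back with sigma_2.
conj_rot = [s2 @ (-g.conj()) @ s2 for g in rot]
conj_boost = [s2 @ (-g.conj()) @ s2 for g in boost]

rotations_unchanged = all(np.allclose(c, g) for c, g in zip(conj_rot, rot))
boosts_flip_sign = all(np.allclose(c, -g) for c, g in zip(conj_boost, boost))

print("sigma_i^* = -s2 sigma_i s2:", pseudo_real)
print("rotations map to themselves:", rotations_unchanged)
print("boosts change sign:         ", boosts_flip_sign)
```

Rotations come back to themselves while boosts flip sign, so the conjugate of one Weyl representation is equivalent to the Weyl representation of the opposite chirality.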
Following van der Waerden, we describe left-handed Weyl fields as two-component
fields $v_a$, whereas right-handed fields are written as $\bar{v}_{\dot a}$. Each of these representations
has a Lorentz-invariant product
$$\epsilon^{ab}\, v_a w_b.$$
We use the Levi-Civita $\epsilon$ symbols⁵ to raise spinor indices, $v^a = \epsilon^{ab} v_b$ (note that we
contract on the second index). We also introduce $\epsilon_{ab}$ by
$$\epsilon_{ab}\,\epsilon^{bc} = \delta_a^c,$$
and similarly for dotted indices.
The product $[2,1] \otimes [1,2]$ of the two Weyl representations of opposite chirality is
the [2,2] or 4-vector representation. The matrices $\sigma^\mu_{a\dot b} = (1, \boldsymbol{\sigma})_{a\dot b}$ are the Clebsch-Gordan
coefficients relating the tensor product of the two opposite-chirality Weyl
spinor representations to the conventional basis in the 4-vector representation. These
matrices map the right-handed spinor representation into the left-handed spinor. The
corresponding map in the opposite direction is given by $\bar\sigma^\mu_{\dot b a} = (1, -\boldsymbol{\sigma})_{\dot b a}$. These symbols
allow us to write the Lagrangian
$$\mathcal{L}_{\rm Weyl} = \frac{i}{2}\,(v^*)^{\dot a}\, \bar\sigma^\mu_{\dot a b}\, \partial_\mu v^b + \text{h.c.} \qquad (5.1)$$
⁴ In this section we will use the symbol * to denote Hermitian conjugation of the operators in each component
of a spinor field. This is to distinguish it from †, which could be taken to imply turning row indices into
column indices as well as Hermitian conjugation. Later on, most of our equations will be written in terms
of complex Dirac spinors, and we will revert to the usual symbol for Hermitian conjugation.
⁵ $\epsilon^{21} = -\epsilon^{12} = 1$, and the same for dotted indices.
The two terms are equal, up to a total divergence, and it is conventional to simply drop
both the factor of one half and the Hermitian conjugate term in the Weyl Lagrangian.
If we vary it with respect to $v^*$ we get
$$\bar\sigma^\mu \partial_\mu v = 0,$$
or in momentum space
$$\bar\sigma^\mu p_\mu\, v(p) = (p^0 + \boldsymbol{\sigma}\cdot\mathbf{p})\, v(p) = 0.$$
This tells us that positive-energy solutions have negative helicity. The opposite
correlation is valid for right-handed spinors.
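The link between positive energy and negative helicity can be seen numerically. The sketch assumes the momentum-space Weyl equation in the form $(p^0 + \boldsymbol{\sigma}\cdot\mathbf{p})\,v = 0$, which follows from $\bar\sigma^\mu = (1, -\boldsymbol{\sigma})$ with a metric of signature (+,-,-,-); the variable names are hypothetical.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

p3 = np.array([0.3, -1.1, 0.6])   # arbitrary spatial momentum
energy = np.linalg.norm(p3)       # positive-energy massless particle: p^0 = |p|

sigma_dot_p = np.einsum("i,ijk->jk", p3, sigma)
weyl_operator = energy * np.eye(2) + sigma_dot_p   # p^0 + sigma . p

# Eigenvectors of sigma . p_hat; eigh returns eigenvalues in ascending order,
# so the first column belongs to the eigenvalue -1.
eigenvalues, eigenvectors = np.linalg.eigh(sigma_dot_p / energy)
v = eigenvectors[:, 0]

helicity = np.real(v.conj() @ (sigma_dot_p / (2 * energy)) @ v)

print("solves the Weyl equation:", np.allclose(weyl_operator @ v, 0))
print("helicity:", helicity)
```

The other eigenvector, with $\boldsymbol{\sigma}\cdot\hat{\mathbf{p}} = +1$, solves the equation only for $p^0 = -|\mathbf{p}|$, matching the statement that the negative-energy solution has positive helicity.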
The Weyl matrices satisfy
$\sigma$ ($\bar\sigma$) maps the right (left)-handed spinor into the left (right)-handed one, so the first
product maps the left-handed spinor to itself and the second maps the right-handed
spinor to itself. The product of the [1,2] with itself is $[1,1] \oplus [1,3]$, and the second of
these is the anti-self-dual second-rank anti-symmetric tensor, $\epsilon^{\mu\nu\alpha\beta} T_{\alpha\beta} = -2T^{\mu\nu}$. $\bar\eta$ is
the Lorentzian continuation of a standard basis for Euclidean anti-self-dual tensors,
introduced by 't Hooft [25], which we will study in the last chapter. $\eta$ is the analog for
self-dual tensors. In Lorentzian signature
$$\bar\eta^a_{\mu\nu} = (\delta_{\nu 0}\delta^a_\mu - \delta_{\mu 0}\delta^a_\nu) - i\epsilon^a{}_{\mu\nu},$$
$$\eta^a_{\mu\nu} = (\delta_{\mu 0}\delta^a_\nu - \delta_{\nu 0}\delta^a_\mu) - i\epsilon^a{}_{\mu\nu}.$$
The three-index Levi-Civita symbol appearing in these equations is the usual rotation-invariant
symbol in three dimensions, with indices raised by the Euclidean 3-metric.
The Weyl equation implies that $p^2 = 0$ and that the solution with positive energy
has negative helicity (is left-handed) while the negative-energy solution has positive
helicity. The positive- and negative-helicity solutions are constrained two-component
spinors $v_\pm(p)$, with only one free component.
The general solution of the Weyl equation is
$$v(x) = \int d^3p\, \big[a(p)\, e^{-ipx}\, v_-(p) + b^\dagger(p)\, e^{ipx}\, v_+(p)\big].$$
Note that it is inconsistent to insist that a and b be the same operator, because the field
v is complex. Quantization of a and b as either bosons or fermions is consistent with
the requirement that the Heisenberg equations derived from the Hamiltonian⁶ be the
⁶ We are here anticipating Noether's theorem from Chapter 6. The point is that time-translation symmetry
allows us to construct the Hamiltonian from the Lagrangian without using the canonical formalism.
Canonical commutators or anti-commutators are then fixed by requiring that the Heisenberg equations
agree with the Euler-Lagrange equations.
Weyl equation. The covariant Feynman Green function of the Weyl equation, which
can be constructed without thinking about how the fields are quantized, is
$$S_F(x) = \int \frac{d^4p}{(2\pi)^4}\; \frac{i\,\sigma^\mu p_\mu}{p^2 + i\epsilon}\; e^{-ipx}.$$
It is odd under interchange of spin indices combined with $x \to -x$ (one should write
things out in terms of the four real components of the Majorana spinor to see the
anti-symmetry). Thus, it is compatible only with Fermi statistics. Correspondingly (see
Problem 5.1) the Weyl field cannot have local commutation relations for either choice
of statistics, but has local anti-commutation relations for Fermi statistics. Thus, spin-$\frac{1}{2}$
particles must be fermions. The proof of the spin-statistics theorem follows the same
pattern for all spins [6-8], by analyzing the symmetry properties of the Feynman Green
function.
It is worth noting that we cannot allow fields with local anti-commutation relations
in the Lagrangian density, which must commute with itself at space-like separation,
but that even functions of them are allowed. Indeed, Lorentz invariance also requires
that we have only even functions of half-integer spin fields in the Lagrangian, so the
spin-statistics connection replaces these two constraints by one. It can be stated in
symmetry terms. Let $(-1)^F$ be the operator which is $-1$ on all states created from the
vacuum by products of fields containing an odd number of fermion operators and $+1$
on the other states. Then the spin-statistics connection is the equation
$$(-1)^F = e^{2\pi i\, \mathbf{a}\cdot\mathbf{J}}$$
($\mathbf{a}$ is any unit 3-vector and $\mathbf{J}$ is the generator of angular momentum in any Lorentz
frame). The operator $(-1)^F$ commutes with all observables (operators that can be
viewed as infinitesimal changes of the Lagrangian). It should be emphasized that the
proof of the spin-statistics theorem follows from assuming that the Hilbert space of
the theory consists only of positive-norm particle states. Later on we will encounter
fermionic scalar fields, called Faddeev-Popov ghosts, which live in a "Hilbert" space of
indefinite norm. A gauge equivalence principle will forbid them from being produced in
the scattering of physical particles, but they are extremely useful at intermediate stages
of calculation.
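The right-hand side of the spin-statistics relation is easy to evaluate in a single spin multiplet. The sketch below takes the unit vector along the 3-axis and uses the basis in which $J_3$ is diagonal; the function name `rotation_by_2pi` is hypothetical.

```python
import numpy as np

def rotation_by_2pi(spin):
    """exp(2 pi i J_3) in the spin-`spin` multiplet, with J_3 = diag(spin, ..., -spin)."""
    m_values = np.arange(spin, -spin - 1, -1)
    return np.diag(np.exp(2.0 * np.pi * 1j * m_values))

for spin in (0, 0.5, 1, 1.5):
    sign = np.real(rotation_by_2pi(spin)[0, 0])
    print(f"spin {spin}: rotation by 2 pi acts as {sign:+.0f}")
```

A rotation by $2\pi$ acts as $-1$ on half-integer spin and as $+1$ on integer spin, which is the sign that $(-1)^F$ assigns to states containing an odd or even number of fermions.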
5.0.1 Chiral symmetry
The Weyl Lagrangian has a symmetry under $v \to e^{i\alpha} v$, which gives rise via Noether's
theorem (see Chapter 7) to a conservation law. At the level of free particles this conserved
quantity is simply helicity. This symmetry is called a chiral symmetry because
it acts differently on particles of different helicities. It cannot remain conserved for a
massive particle. Indeed, if we try to add a non-derivative term to the Weyl Lagrangian,
in order to generate a mass, then it must have the form
$$\frac{m}{2}\,\epsilon^{ab}\, v_a v_b + \text{h.c.},$$
where $m$ is a complex number. Note that this term would be identically zero if the fields
were c-numbers. Instead, in the next section we will show that they are anti-commuting
(Grassmann) numbers. In the presence of this term, the Weyl equation becomes
$$i\bar\sigma^\mu \partial_\mu v + m^* v^* = 0.$$
By multiplying this equation by $\sigma^\mu\partial_\mu$ and using the complex-conjugate equation, we
obtain
$$(\partial^2 + m m^*)\, v = 0,$$
indicating that the field creates and annihilates massive particles, with mass $|m|$.
The appearance of both $v$ and $v^*$ in the equation suggests that things will look
more elegant if we introduce the four-component Majorana field $\psi$. In terms of $\psi$, the
massive field equation takes the form
$$[i\gamma^\mu \partial_\mu - m P_- - m^* P_+]\,\psi = 0.$$
Here we have introduced the Dirac matrices,
$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}.$$
$P_\pm$ are the projectors on spinors of fixed chirality: $P_-$ projects on the first two
components of a Dirac spinor. Note that
$$\gamma_5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$$
anti-commutes with all the $\gamma^\mu$ and that $P_\mp = \frac{1}{2}(1 \mp \gamma_5)$. We define $\slashed{A} = A_\mu \gamma^\mu$ for any
4-vector $A_\mu$. In the problems and Appendix C, you will find many properties of the
Dirac matrices. They mostly follow from the anti-commutation relation
$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu},$$
which the kind and diligent reader will easily verify.
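The verification requested of the reader can be sketched numerically. The block assumes the chiral-basis form of the Dirac matrices with $\sigma^\mu = (1, \boldsymbol{\sigma})$ in the upper-right block and $\bar\sigma^\mu = (1, -\boldsymbol{\sigma})$ in the lower-left block, signature (+,-,-,-), and the common definition $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; these are assumptions about conventions rather than statements taken from the text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

sigma_mu = [I2] + pauli                    # sigma^mu = (1, sigma)
sigma_bar_mu = [I2] + [-s for s in pauli]  # sigma-bar^mu = (1, -sigma)

# Assumed chiral-basis Dirac matrices.
gamma = [np.block([[Z2, sigma_mu[m]], [sigma_bar_mu[m], Z2]]) for m in range(4)]
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

clifford_ok = all(
    np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], 2 * ETA[m, n] * np.eye(4))
    for m in range(4) for n in range(4)
)

gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
gamma5_anticommutes = all(np.allclose(gamma5 @ g + g @ gamma5, 0) for g in gamma)

P_minus = (np.eye(4) - gamma5) / 2   # projects on the first two components
P_plus = (np.eye(4) + gamma5) / 2    # projects on the last two components

print("{gamma^mu, gamma^nu} = 2 eta^{mu nu}:", clifford_ok)
print("gamma_5 anti-commutes with gamma^mu: ", gamma5_anticommutes)
print("gamma_5 diagonal:", np.real(np.diag(gamma5)))
```

With these conventions $\gamma_5$ is diagonal, so the chirality projectors simply pick out the upper or lower two components of the four-component spinor.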
We can do a chiral transformation on $v$ or $\psi$ to make the mass of a single free fermion
real. The Lagrangian for the Majorana field with real mass is
$$\mathcal{L} = \bar\psi\,(i\slashed{\partial} - m)\,\psi, \qquad (5.2)$$
where⁷ $\bar\psi = \psi^\dagger\gamma^0$. This is called the Dirac Lagrangian. The general solution of its
equations of motion does not satisfy the Majorana condition, but can be decomposed
into a pair of fields, which do. This is analogous to the breakup of a complex scalar
field into real and imaginary parts. In fact (Problem 5.2) we can change basis in such a
way that the Dirac matrices are all imaginary. In this basis the Dirac equation has real
solutions and the Majorana condition is just $\psi^* = \psi$.
⁷ The reader should prove that the $\gamma^0$ is necessary for the Lorentz
5.1 Dirac, Majorana, and Weyl fields: discrete symmetries
The Lorentz group has four disconnected components, which can be obtained by
appending the time-reversal operation T and space-inversion operation P to the proper
orthochronous Lorentz group. Our study of field theory suggests the existence of
another operation called charge conjugation, C, which takes any particle into its anti-
particle. The time has come to see how these symmetries are or are not implemented
in quantum field theory. Reflection symmetry reverses spatial positions and momenta,
while preserving angular momenta. The standard approach to this subject is to work
out the most general constraints on particle states and then find further constraints
that follow from local field theory. We will work directly with the fields.
If these operations correspond to symmetries there must be operators $U(P)$, $U(T)$,
and $U(C)$ in quantum field theory, which implement these operations on the Hilbert
space and commute with the scattering matrix. In the case of P and C these are unitary
operators. $U(T)$ must be anti-unitary, because $U^\dagger(T)\, e^{-iP_0 t}\, U(T) = e^{iP_0 t}$. If $U(T)$
were a unitary linear operator, commuting with the Hamiltonian, then the Hamiltonian
could not be bounded from below. Note also that, by virtue of its definition, $U(T)$ must
map in states to out states and vice versa. Wigner pointed out long ago that the solution
to this problem was to choose $U(T)$ to be the product of a unitary transformation and
the non-linear but idempotent operation of complex conjugation of the coefficients in
the expansion of states in the Hilbert space in some orthonormal basis.⁸ Such operators
are called anti-unitary. They still satisfy $U^\dagger(T)U(T) = 1$, but $U^\dagger(T)\,c\,U(T) = c^*$
for a general c-number $c$. From now on, we will designate the transformations
$U(T)$, $U(C)$, $U(P)$ by the simpler notation $T$, $C$, $P$.
The most general space-reflection transformation (also called a parity transformation)
on a set of $n$ scalar fields must have the form
$$P^{-1}\phi_A(t, \mathbf{x})\,P = O_A{}^B\, \phi_B(t, -\mathbf{x}). \qquad (5.3)$$
$P$ is the unitary representative of the parity transformation on the Hilbert space of
our field theory, and $O$ is an $O(n)$ matrix, the most general internal symmetry (see
Chapter 7) of free massless scalar field theory. The square of this parity transformation
is a purely internal symmetry, which acts on the multiplet of scalars via the matrix $O^2$.
The matrix $O$ is the product of a reflection in $n$-dimensional space and rotations by
angles $\theta_i$ in some set of orthogonal 2-planes. If any of the rotation angles is an irrational
multiple of $\pi$, $O^2$ would generate an infinite discrete group. A theory with polynomial Lagrangian
could be invariant under this group only if it were invariant under the continuous one-parameter
subgroup of rotations with arbitrary $\theta$ angles. In that case we could find a
product of an internal symmetry and our parity transformation, which was just simple
reflection of all coordinates, with at most an internal reflection.
⁸ Of course any two orthonormal bases are related by a unitary transformation, so the choice of basis here
If the angles are rational, $O$ generates a $Z_k$ group for some $k$. If $k$ is odd, then $O$
is in the group generated by $O^2$, and we can again find a parity transformation with no
internal rotation. If $k$ is even, then the group generated by $O$ is $Z_2 \times Z_{k/2}$, and we can
eliminate the internal $Z_{k/2}$ transformation. If we diagonalize the $Z_2$ action, we find a
set of fields that transform as
$$P^{-1}\phi_B(t, \mathbf{x})\,P = \epsilon_B\, \phi_B(t, -\mathbf{x}), \qquad (5.4)$$
where each $\epsilon_B$ is $\pm 1$. Fields with $\epsilon_B = 1$ are called scalars, the others, pseudo-scalars.
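The odd-$k$ case can be illustrated with a single rotation plane. This is only a toy sketch: it takes $O$ to be a pure rotation by $2\pi/k$ with $k = 5$ (an assumed example with no reflection factor) and shows that $O$ is a power of $O^2$, so that composing parity with a suitable power of the internal symmetry $O^2$ removes the internal rotation. The function name `rotation_2d` is hypothetical.

```python
import numpy as np

def rotation_2d(angle):
    """Rotation matrix in a single 2-plane."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

k = 5                                  # odd order, so O generates Z_5
O = rotation_2d(2 * np.pi / k)
O_squared = O @ O                      # the internal symmetry obtained from P^2

# For odd k, O = (O^2)^((k+1)/2) because O^(k+1) = O.
recovered = np.linalg.matrix_power(O_squared, (k + 1) // 2)

# Composing with the inverse internal rotation leaves a trivial matrix.
redefined = O @ np.linalg.inv(recovered)

print("O^k = 1:                 ", np.allclose(np.linalg.matrix_power(O, k), np.eye(2)))
print("O is a power of O^2:     ", np.allclose(recovered, O))
print("redefined parity matrix: ", np.allclose(redefined, np.eye(2)))
```

For even $k$ the same trick removes only the $Z_{k/2}$ part, leaving the signs $\epsilon_B = \pm 1$ that distinguish scalars from pseudo-scalars.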
For vector fields, the story is even simpler. Geometrically, a 4-vector could transform
like an ordinary or an axial vector under parity, depending on whether its space
component changes sign under reflection. We may ask whether there exists a possibility
of appending internal symmetry transformations to the definition of parity, as we
did for scalars. However, with exceptions that we will describe below, all internal
symmetry transformations for vector fields are gauge invariances (see Chapter 8), and
the Lagrangian for such fields contains a term of the form
$$\partial_\mu A^a_\nu\, f^{abc} A^{b\mu} A^{c\nu},$$
where $f^{abc}$ are the structure constants of the group (see Appendix E for an explanation
of the terminology). When the structure constants are non-vanishing, this term allows
only the ordinary vector transformation law for the fields.
There are two exceptions to this rule. Abelian (U(1)) gauge fields can be pseudo-vectors,
because the structure constants vanish and Maxwell's Lagrangian is invariant
under reflection of the fields. If we want to construct a parity-invariant Lagrangian, we
must couple these fields only to pseudo-vector currents. The second exception occurs if
we have a discrete $Z_2$ automorphism of the gauge group, which can be appended to the
definition of the parity transformation. The simplest examples involve a group $G \times G$
where the $Z_2$ exchanges the groups. Hypothetical models of this type for an extension
of the standard model of electro-weak interactions based on the case $G = SU(2)$ have
been studied in great detail [ - ].
For spin-$\frac{1}{2}$, the story of parity is more complicated, because the simple [2,1] Weyl representation
is not mapped to itself by reflection. Before studying this case, we introduce
the notion of charge conjugation.
5.1.1 Charge conjugation
The basic idea of charge conjugation is a symmetry that interchanges particles and
anti-particles, without transforming space-time variables. Particles are distinguished
from anti-particles by their internal symmetry charges. So charge conjugation must
satisfy
$$C^{-1} U C = U^\dagger,$$
where $U$ is the unitary operator implementing any global symmetry. This will be true,
in particular, for the global residuum of any gauge invariance (Chapter 8). For scalar
fields,
$$C^{-1}\phi_A(x)\,C = C_{AB}\,\phi_B(x),$$
and $C$ must satisfy
$$CO = O^{-1}C, \qquad (5.5)$$
for the matrices $O$ of all internal symmetry rotations. We must also have $C^2 = 1$.
At this point in the exposition, the reader should study Appendix E on Lie groups and
Lie algebras. This is the mathematical tool for studying symmetries that depend on continuous
parameters. In that appendix, she/he will encounter the notion of infinitesimal
generators of transformations on fields
$$\chi_M \to \chi_M + i\omega_a\, (t^a)_M{}^N\, \chi_N.$$
The parameters $\omega_a$ are infinitesimal real numbers and the $t^a$ are linearly independent
Hermitian matrices, which satisfy
$$[t^a, t^b] = iF^{abc}\, t^c.$$
Any set of matrices with these commutation relations defines a finite-dimensional unitary
representation of the continuous group of transformations whose parameters are
$\omega_a$. We will denote by $G_{\rm global}$ the group of all global symmetries of our Lagrangian. It
acts on the space of scalar fields via unitary or orthogonal matrices. This set of matrices
is called the representation $R_S$ of the group $G_{\rm global}$ on the space of scalar fields. The
complex-conjugate matrices also represent the same group and we can ask whether
our representation is equivalent to its complex conjugate. That is, is there a unitary or
orthogonal transformation on the internal labels of the scalar fields that obeys
$$U^\dagger O(g)\, U = O^*(g),$$
for every group element $g \in G_{\rm global}$? If not, we say the representation is complex, and
there is an inequivalent representation $\bar{R}_S$. For any group represented by unitary or
orthogonal matrices in the representation $R_S$, the expression
$$\phi^{*A}\phi_A,$$
where $\phi$ transforms in $R_S$, is invariant. If $R_S = \bar{R}_S$ (in the sense of unitary equivalence),
then this implies that one can make a bilinear invariant out of the components of $\phi_A$
itself. If this invariant is symmetric then $R_S$ is said to be real, because one can choose
a basis in which the action of all the group matrices is real. If it is anti-symmetric, we
say $R_S$ is pseudo-real.
In Problem 5.16, you will show that the existence of a charge-conjugation operator
implies that the representation of G on scalar fields is real or pseudo-real. A famous
theorem in group theory, the Peter-Weyl theorem, says that any finite-dimensional
unitary representation of a group is a direct sum of a finite number of irreducible
unitary representations. In an irreducible representation, anything that commutes with
all the group generators is proportional to the identity matrix (Schur's lemma). The
irreducible components of $R_S$ are not necessarily themselves real or pseudo-real, but
complex representations must come in conjugate pairs $R \oplus \bar{R}$ that are interchanged by
the charge-conjugation matrix $C$. The form of $C$, up to conjugation by an element of
$G_{\rm global}$, is mostly determined by group theory. However, if there are several different
fields transforming in the same irreducible representation of $G_{\rm global}$ then the action of
$C$ is undetermined in that subspace. Given a set of mass terms and interactions, there
will be at most one choice of the matrix $C$, which leaves the Lagrangian invariant. Note
also that $C$ takes fields in pseudo-real representations into their complex conjugates.
The charge-conjugation properties of vector fields $A_\mu$ are more constrained, because,
as we shall see in Chapter 8, they are all gauge fields, associated with a gauge equivalence
group $G \subset G_{\rm global}$. Call the generators of $G$ $T^A$. The structure constants of the Lie
algebra of $G$ are defined by
$$[T^A, T^B] = i f^{ABC}\, T^C.$$
The constants $f^{ABC}$ are real and totally anti-symmetric. The Jacobi identity for double
commutators is equivalent to the statement that the matrices
$$(T^A_{\rm adj})_{BC} = i f^{BAC}$$
form a representation of the same Lie algebra, called the adjoint representation. Gauge
fields always transform in the adjoint representation of $G$, and are invariant under
continuous global symmetries that are not in $G$. It is a real representation of the group,
obtained by writing a general element as $D(g) = e^{i\omega_A T^A_{\rm adj}}$. The matrices $D(g)$ are all
rotation matrices. A Cartan subalgebra of the Lie algebra is a maximal independent
set of commuting generators $H_i$. Given the Cartan subalgebra, we can form linear
combinations $E_{\mathbf r}$ of the other independent generators, such that
$$[H_i, E_{\mathbf r}] = r_i E_{\mathbf r}.$$
The number, $r$, of elements in the Cartan subalgebra is called the rank of the group,
and the $r$-dimensional vectors $\mathbf r$ are called the roots of the Lie algebra. The generators
$H_i$ can be thought of as generators of rotations in orthogonal planes, in the vector
space on which the matrices $T^A_{\rm adj}$ act.$^9$ Thus, their eigenvalues, the roots, come in pairs
of equal magnitude and opposite sign. The charge-conjugation transformation on the
adjoint representation flips the sign of all the Cartan generators and exchanges positive
with negative roots. Thus, in the familiar example of SU(2), charge conjugation flips the
sign of the gauge field associated with the third component of isospin and exchanges
the two fields with $I_3 = \pm 1$.
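The statement that the matrices $i f^{BAC}$ themselves obey the Lie algebra can be checked by hand for a small group. The following is a minimal sketch for SU(2), assuming the structure constants are the Levi-Civita symbol, $f^{ABC} = \epsilon^{ABC}$; the helper names (`epsilon`, `adjoint_generators`, `satisfies_algebra`) are illustrative choices, not notation from the text.

```python
# Sketch: adjoint representation of su(2), assuming f^{ABC} = epsilon^{ABC}.

def epsilon(a, b, c):
    """Totally anti-symmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def adjoint_generators(f, dim):
    """Return the list of matrices (T^A)_{BC} = i f^{BAC}."""
    return [[[1j * f(b, a, c) for c in range(dim)] for b in range(dim)]
            for a in range(dim)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

def satisfies_algebra(T, f, dim):
    """Check [T^A, T^B] = i f^{ABC} T^C entry by entry."""
    for a in range(dim):
        for b in range(dim):
            lhs = commutator(T[a], T[b])
            rhs = [[sum(1j * f(a, b, c) * T[c][i][j] for c in range(dim))
                    for j in range(dim)] for i in range(dim)]
            if lhs != rhs:
                return False
    return True

T = adjoint_generators(epsilon, 3)
algebra_ok = satisfies_algebra(T, epsilon, 3)

# i*T^A is real and anti-symmetric, so exp(i w_A T^A) is a real rotation matrix.
real_antisymmetric = all(
    (1j * T[a][b][c]).imag == 0 and T[a][b][c] == -T[a][c][b]
    for a in range(3) for b in range(3) for c in range(3))

# The Cartan generator T^3 obeys (T^3)^3 = T^3, so its eigenvalues lie in {0, +1, -1}.
T3 = T[2]
cartan_ok = matmul(matmul(T3, T3), T3) == T3

print(algebra_ok, real_antisymmetric, cartan_ok)
```

The nonzero eigenvalues of the Cartan generator come in a $\pm 1$ pair, which is the SU(2) instance of roots appearing with equal magnitude and opposite sign.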
We are finally ready to talk about the charge-conjugation properties of Weyl
fermions. The discussion parallels that of scalar fields, except for the case of pseudo-real
representations. For scalars, charge conjugation turned a scalar in a pseudo-real
representation into its complex conjugate. However, the complex conjugate of a left-handed
Weyl field is right-handed. Complex conjugation of Weyl fields does not commute with
the action of the Lorentz group. Thus, the only way in which we can have a
charge-conjugation-invariant theory with Weyl fermions in a pseudo-real representation is
to have two copies of the pseudo-real representation. We can always choose one of
them to transform under $G$ as the complex conjugate of the other and define charge
conjugation to exchange them.

9 Both the matrices and the space itself are often called the adjoint representation by physicists.
5.1.2 CP transformations for Weyl fields
We have now learned enough to understand that the correct name for the space-
reflection transformation on Weyl fields is CP, the combined symmetry of parity and
charge conjugation. A space reflection must take Weyl fields into their complex conju-
gates. As such it must also change all internal symmetry transformations into their
complex conjugates, and therefore also performs what we have called charge con-
jugation in the previous section. The most general CP transformation for M Weyl
fields is
$$(CP)^{\dagger}\,\psi_i(\mathbf{x}, t)\,CP = U_{ij}\,\sigma_2\,\psi_j^{*}(-\mathbf{x}, t),$$
where $U$ is an element of the $U(M)$ internal symmetry of the free Weyl Lagrangian.
As for scalars, the transformation of $\psi_i$ by $U^2$ must be part of the internal symmetry
group of the full interacting Lagrangian, if CP is a symmetry. We can use this freedom
to redefine the CP transformation law.
The most general Lagrangian involving spin-$\frac12$ fermions coupled to vector and scalar
fields (the most general renormalizable couplings) has the form
$$\psi^{\dagger}\, i\sigma^{\mu}(\partial_\mu - iA_\mu)\,\psi + \psi^{T} M(x)\,\psi + \text{h.c.} \qquad (5.6)$$
$\psi$ is a vector of left-handed Weyl fermion fields. Here the matrix-valued vector potential
$A_\mu$ must be composed of matrices that transform in the adjoint representation of a
compact Lie group $G$, which must be an exact gauge symmetry of the Lagrangian
(Chapter 8). The matrix-valued scalar $M(x)$ is a general complex symmetric matrix in
the internal space (the fermion bilinear that couples to it is anti-symmetrized in spin
to make a Lorentz scalar). These couplings between a single scalar field and fermion
bilinears are called Yukawa couplings. The rest of the Lagrangian, which we have not
written, has to be invariant under the action of $G$ on $M$ ($M \to (U^{T})^{-1} M U^{-1}$) induced
by the action on $\psi$. There may also be a global symmetry group, larger than $G$. The
couplings of the fermions to the gauge fields automatically preserve CP (this restricts the matrix $U$
to commute with the action of $G$ on $\psi_i$). It can be shown [30] that the rest of the
Lagrangian is CP-invariant if and only if there is a basis of fields for which all of the
Yukawa couplings are real. However, we will see in Chapter 8 that there are other
terms depending only on the gauge fields, which have dimension 4 and violate CP. We
will also see that transformations on the fermions of the form $\psi_i \to e^{i\alpha_i}\psi_i$, which we
might want to do to make the Yukawa couplings real, can change the values of these
CP-violating pure gauge terms. Thus, the real criterion for CP invariance is that the
couplings are real in the same basis as that in which the CP-violating gauge terms
vanish.
Of course, there are operators of higher mass dimension, which can be added to
the Lagrangian, and it is hard to believe that one would ever find a basis in which
all coefficients in the Lagrangian were real. These higher-dimension terms are, by
dimensional analysis, always multiplied by inverse powers of a mass scale M. At energy
scales small compared with M they will be negligible (Chapter 9), and CP invariance
might be approximately valid. Experiment seems to show only small violations of CP
invariance in all physical processes except those involving the heaviest generation of
quarks. If there were no experimental evidence for CP violation, we might have imposed
CP invariance as a fundamental property of nature. This is not possible. It is therefore
important to understand why the violation of CP in nature is so small at low energy.
In the semi-classical approximation, we expand around a constant value for $M$,
which gives rise to masses for some of the fermion fields. The propagators for the
massive fields will involve correlators of $\psi_i$ with itself and with $\psi_j^{\dagger}$. This suggests the
utility of a four-component notation. The nature of the four-component fields depends
on the structure of the mass matrix. Let us assume that all the fields acquire mass, since
that appears to be the case in the real world.
In the mass terms, each component of $\psi_i$ can be paired either with itself or with some
other component.$^{10}$ In the latter case, we can make a four-component complex Dirac
field, and the kinetic terms preserve a $U(1)$ symmetry under which this field is multiplied
by a phase. The mass term is called a Dirac mass. In the standard model of particle
physics, the corresponding approximately conserved quantum number (Chapter 7) is
called a quark or lepton flavor. A particular linear combination of these flavors is
called electric charge, and it appears to be exactly conserved. The neutral leptons, or
neutrinos, do not carry this charge. The experiments that show they are massive are
not yet sensitive enough to tell us whether we have six Weyl fields, or only three, which
are paired with themselves.
A mass term that pairs a Weyl field with itself is called a Majorana mass, and the
corresponding particle is called a Majorana particle. The only internal quantum number
such a particle can carry is $Z_2$-valued, which is compatible with the particle being
its own anti-particle. Some authors insist that a Majorana field/particle has such a
$Z_2$ symmetry, but in general the quantum number might have to be carried by other
fermions as well. One can construct Lagrangians whose only conserved fermion number
is $(-1)^F = e^{2\pi i J_3}$.
The charged standard-model fermions are best described by complex Dirac fields
with a diagonal mass matrix. In the approximation that neutrino masses vanish, which
is valid for most purposes, it is convenient to append a right-handed neutrino to the
standard model. The neutrino is described by a four-component complex field, but its
right-handed components simply do not interact. This is achieved by putting left-hand
projection operators in the interaction Lagrangian.
10 … a symmetric complex matrix, which can be brought …
5.1.3 Time reversal and TCP
The anti-unitary time-reversal operator $T$ maps left-handed fields into themselves.
This is because both momentum and angular momentum change sign under time
reversal, so helicity is invariant. The most general time-reversal transformation for Weyl
fermions is
$$T^{\dagger}\,\psi_i(\mathbf{x}, t)\,T = \mathcal{T}_{ij}\,\sigma_2\,\psi_j(\mathbf{x}, -t),$$
where $\mathcal{T}$ is a $U(M)$ matrix. Again, $\mathcal{T}$ must be an ordinary internal symmetry of the
Lagrangian and we can use this to simplify the time-reversal transformation.
We can try to use the freedom in the definition of the various transformations to
make a given Lagrangian invariant. Note in this connection that the various matrices
that act on the spinors and scalars (the gauge field matrices are strongly constrained
by gauge invariance, and in simple cases there is no freedom at all) in the definitions
of T, C, and P all commute with each other. This is because the symmetries commute
geometrically, and these matrices express their action on the algebra of observables,
not on the states of the theory. The action on the states need only be a representation
up to a phase of the geometrical group.
We have stated that the condition for CP invariance is simply that there be a basis of
fields in which all couplings are real. The conditions for individual C and P invariances
are stronger. If there is no action of P on the gauge group itself (which can occur only
if there is a Z2 outer automorphism of the group) then this can be satisfied only if the
scalar representation of the gauge and global symmetry groups is real or pseudo-real,
while the Weyl fermions lie in a real representation.
It is remarkable, by contrast, that the combined transformation CPT is a symmetry
of every local Lorentz-invariant quantum field theory (see a rigorous proof in [11]).
We can get a rough understanding of this by referring to the Euclidean formulation of
field theory. The geometrical operation of reflecting all coordinates has determinant 1
in four dimensions, and is part of the connected component of the group SO(4). Thus,
any field theory that can be obtained by Wick rotation of a Euclidean field theory will
be invariant under some kind of time- and space-reflecting transformation. It remains
to understand why this transformation must involve charge conjugation. When we
rotate back to Minkowski space, the transformation changes positive into negative
frequencies, and thus exchanges creation operators with annihilation operators. A local
charged field contains the creation operator of an anti-particle and the annihilation
operator of the particle. Thus, on continuing to Euclidean space, doing a rotation
by $\pi$ in two orthogonal planes, and continuing back to Minkowski space, we have
reversed both space and time and changed particles into anti-particles. This is the CPT
transformation. We remark in passing that the transformation PT alone turns left-
handed into right-handed fields, and cannot be a symmetry unless the spectrum of left-
handed fields is in a real representation of the symmetry group. Like most arguments
in this book, this discussion is a bit "quick and dirty," but can be turned into a rigorous
proof of the CPT theorem.
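The geometric part of this argument is easy to verify numerically. The sketch below assumes Euclidean coordinates labelled 0 to 3 (with 0 playing the role of Euclidean time) and builds the total reflection as a rotation by $\theta$ in the (0,1) plane combined with a rotation by $\theta$ in the (2,3) plane; the function names are illustrative. At $\theta = \pi$ the product is minus the identity, and the determinant stays equal to 1 along the whole path, which is what it means for the reflection to lie in the connected component of SO(4).

```python
import math

def rotation(plane, theta, dim=4):
    """Rotation by theta in the coordinate plane (i, j) of Euclidean R^dim."""
    i, j = plane
    m = [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    m[i][i], m[i][j] = math.cos(theta), -math.sin(theta)
    m[j][i], m[j][j] = math.sin(theta), math.cos(theta)
    return m

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for c in range(len(m)):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

def double_rotation(theta):
    """Rotate by theta in the (0, 1) plane and by theta in the (2, 3) plane."""
    return matmul(rotation((0, 1), theta), rotation((2, 3), theta))

# At theta = pi every coordinate is reflected: the matrix is minus the identity.
full = double_rotation(math.pi)
is_total_reflection = all(
    abs(full[r][c] - (-1.0 if r == c else 0.0)) < 1e-12
    for r in range(4) for c in range(4))

# The determinant is 1 all along the path from the identity to the reflection.
dets_along_path = [det(double_rotation(k * math.pi / 10)) for k in range(11)]
stays_connected = all(abs(d - 1.0) < 1e-12 for d in dets_along_path)

print(is_total_reflection, stays_connected)
```

In three dimensions the same construction fails, since reflecting all coordinates has determinant $-1$; this is why P or T alone is not guaranteed by the rotation group.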
5.2 The functional formalism for fermion fields
Although we have worked explicitly only with a single scalar field, the functional formal-
ism immediately generalizes to any kind of local field theory. However, the formalism
we have considered so far integrates over c-number functions. It turns out that this is
appropriate for fields associated with bosonic particles. For fermion fields we start from
anti-commutation relations for creation and annihilation operators and it turns out that
fields cannot commute at space-like separation. The best we can do is to force them
to anti-commute at space-like separation. We will base our approach to the functional
quantization of fermion fields on this observation.
We have already seen that particles of spin zero and 1 cannot be fermions and particles
of spin $\frac12$ cannot be bosons. This generalizes to the spin-statistics theorem: integer-spin
particles are bosons and half-integer-spin particles are fermions. In this section we
will not make any commitment to the character of spin indices carried by the field.
However, we will work mostly with complex fermion fields, so we bias our notation to
deal with that case. We will touch briefly on the properties of real fermions when we
discuss Pfaffians.
In order to define covariant time-ordered products for fermion fields, we must change
the definition of time ordering to include a factor of $(-1)$ every time we interchange a
pair of fermions in the process of reordering the fields from their order on the page to
their proper time order. For example,
$$T\,\psi(x)\bar\psi(y) = \theta(x^0 - y^0)\,\psi(x)\bar\psi(y) - \theta(y^0 - x^0)\,\bar\psi(y)\psi(x).$$
Note also that, if $\psi$ were a Dirac field, then, as a consequence of this definition and
the canonical anti-commutation relations (Problem 5.1),
$$\gamma^\mu\partial_\mu\, T(\psi(x)\bar\psi(y)) = T(\gamma^\mu\partial_\mu\psi(x)\,\bar\psi(y)) + \delta^4(x - y).$$
If we consider a fermionic analog of the coupling to an external source,
$$T\, e^{i\int [\bar\eta(x)\psi(x) + \bar\psi(x)\eta(x)]},$$
then if the sources are c-numbers all but the first term in the series vanish. This is a
conflict between the symmetry of a product of ordinary numbers, $\eta(x)\eta(y) = \eta(y)\eta(x)$, and
the anti-symmetry of the fermionic time-ordered product. It can be cured by changing
the multiplication rule of source functions to read $\eta(x)\eta(y) = -\eta(y)\eta(x)$. We will first
consider the physical meaning of such a rule and then elaborate on its mathematical
implementation.
A source function is a mathematical idealization of a large classical machine, which
probes the dynamics of quantum field theory. If such a machine creates a single
fermion, without violating the conservation law of $(-1)^F$ in the entire Universe, then
another fermion must be created inside the machine. The requirement that the machine
be classical is, in this case, simply the requirement that there be so many different
fermionic excitations in the machine that the Pauli exclusion principle is irrelevant. As
a mathematical model, think of $\eta$ as being constructed from an average over a large
number of canonical fermion fields
$$\eta(x) = \frac{1}{N}\sum_{i=1}^{N}\psi_i(x).$$
In the large-$N$ limit, such average fields anti-commute with all other fermionic fields,
including their own Hermitian conjugates.
In the nineteenth century, Grassmann invented the algebra of anti-commuting numbers
generated by a finite set of generators, $[\eta_a, \eta_b]_+ = 0$. These had the properties
usually attributed to infinitesimals, $dx^a$ in calculus. In modern mathematics, Grassmann
numbers are used to describe differential forms. The Grassmann sources of field
theory are made by combining an infinite-dimensional Grassmann algebra with a basis
in function space: $\eta(x) = \sum_a \eta_a f_a(x)$. For Dirac spinors, the $f_a(x)$ are just a basis of
ordinary spinor-valued functions on space-time.
The square of any individual Grassmann generator is zero, so functions on a
finite-dimensional Grassmann algebra are all polynomials, and functions on an
infinite-dimensional algebra are all power series. Differentiation is easily defined by the usual
algebraic rules, supplemented by the proviso that $\partial/\partial\eta_a$ anti-commutes with all
Grassmann numbers. Integration is defined by insisting that it be a linear functional, invariant
under translation. For a single Grassmann variable, the most general function is $a + b\eta$,
where $a$, $b$ are complex numbers:
$$\int d\eta\,(a + b\eta) = a\int d\eta\, 1 + b\int d\eta\,\eta.$$
Invariance under $\eta \to \eta + \xi$ implies that
$$\int d\eta\, 1 = 0,$$
and we normalize $\int d\eta\,\eta = 1$. Multiple integration is defined by iteration.
It is easy to see that this leads to the rule
$$\int d^n\eta\, f(\eta) = f_{a_1\ldots a_n},$$
where the right-hand side is the coefficient of the term $f_{a_1\ldots a_n}\eta_{a_1}\cdots\eta_{a_n}$ in the function
$f(\eta)$. In particular, if $A$ is an $N \times N$ anti-symmetric matrix, then
$$\int d^N\eta\; e^{\frac12\eta_a A_{ab}\eta_b} = \frac{1}{2^{N/2}(N/2)!}\,\epsilon_{a_1\ldots a_N} A_{a_1a_2}\cdots A_{a_{N-1}a_N} = \mathrm{Pf}[A],$$
if $N$ is even, and vanishes if $N$ is odd. This combination of matrix elements of an
even-dimensional anti-symmetric matrix is called the Pfaffian. We can define the Pfaffian of
an odd-dimensional anti-symmetric matrix to be zero.
Similarly, if $\eta_a$ and $\bar\eta_a$ are two independent sets of Grassmann variables, then
$$\int d^N\bar\eta\; d^N\eta\; e^{\bar\eta_a M_{ab}\eta_b} = \det M,$$
for any complex matrix.
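These rules are concrete enough to implement directly. The sketch below is one possible illustration, not the book's construction: a Grassmann element is stored as a dictionary from sorted index tuples to coefficients, and the integral over all generators is taken to be the coefficient of $\eta_0\eta_1\cdots\eta_{N-1}$ (an assumed ordering convention for the measure, which fixes the overall sign). It compares the Gaussian integral with a recursively computed Pfaffian and with the determinant for a sample $4\times4$ matrix chosen arbitrarily.

```python
from fractions import Fraction
from itertools import permutations

def mono_mul(a, b):
    """Multiply monomials given as sorted index tuples; returns (sign, tuple)."""
    if set(a) & set(b):
        return 0, ()                      # a repeated generator squares to zero
    seq, sign = list(a + b), 1
    for i in range(len(seq)):             # bubble sort, one sign flip per swap
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def g_mul(f, g):
    """Product of two Grassmann-algebra elements."""
    out = {}
    for ma, ca in f.items():
        for mb, cb in g.items():
            sign, m = mono_mul(ma, mb)
            if sign:
                out[m] = out.get(m, 0) + sign * ca * cb
    return out

def quadratic_form(A):
    """The element (1/2) eta_a A_ab eta_b."""
    x = {}
    for a in range(len(A)):
        for b in range(len(A)):
            sign, m = mono_mul((a,), (b,))
            if sign:
                x[m] = x.get(m, 0) + Fraction(sign * A[a][b], 2)
    return x

def g_exp(x, n):
    """exp(x) for an even element x; the series stops after n // 2 terms."""
    result, term = {(): Fraction(1)}, {(): Fraction(1)}
    for k in range(1, n // 2 + 1):
        term = {m: c / k for m, c in g_mul(term, x).items()}
        for m, c in term.items():
            result[m] = result.get(m, 0) + c
    return result

def berezin(f, n):
    """Integral over all n generators: the coefficient of eta_0 ... eta_{n-1}."""
    return f.get(tuple(range(n)), 0)

def pfaffian(A):
    """Pfaffian by expansion along the first row."""
    n = len(A)
    if n == 0:
        return 1
    if n % 2:
        return 0
    total = 0
    for j in range(1, n):
        keep = [k for k in range(n) if k not in (0, j)]
        minor = [[A[r][c] for c in keep] for r in keep]
        total += (-1) ** (j + 1) * A[0][j] * pfaffian(minor)
    return total

def det(M):
    """Determinant from the Leibniz sum over permutations."""
    n, total = len(M), 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

A = [[0, 1, 2, 3], [-1, 0, 4, 5], [-2, -4, 0, 6], [-3, -5, -6, 0]]
integral = berezin(g_exp(quadratic_form(A), 4), 4)
print(integral, pfaffian(A), det(A))
```

For this matrix the integral and the Pfaffian agree, and the determinant is their square, which is the relation that Problem 5.10 asks for in general.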
Using these formulae, and the invariance of Grassmann integration under shift sym-
metry, it is easy to prove that Wick's theorem is valid for Gaussian fermionic integrals
(Problem 5.9). The only thing we have to be careful of is the minus signs, which arise
between terms with different contractions.
5.3 Feynman rules for Dirac fermions
In this book, we will give Feynman rules only for complex spin-$\frac12$ fields satisfying the
Dirac equation. Weyl fermions are to be treated as Dirac fermions whose interactions
satisfy a constraint such that their right-handed components are free fields. That is,
the correct couplings will never let the right-handed components propagate in internal
lines, or scatter. Furthermore, we will discuss only Lagrangians quadratic in fermion
fields, which have a $U(1)$ fermion-number symmetry. These will have the form
$$\mathcal{L}_D = \bar\psi D \psi,$$
where $D = i\not{\partial} + M$ and $M$ is a $4N \times 4N$ matrix, which depends on the bosonic fields
in the theory. $N$ is the number of independent complex Dirac fields, which we call the
number of species. The result of doing the path integral
$$\int [d\psi\, d\bar\psi]\; e^{i\int \mathcal{L}_D}\; \psi_{i_1}(x_1)\bar\psi_{k_1}(y_1)\cdots\psi_{i_n}(x_n)\bar\psi_{k_n}(y_n) \qquad (5.7)$$
is
$$\det[S_{i_m k_l}(x_m, y_l)]\;\mathrm{Det}\, D. \qquad (5.8)$$
In these equations $i_m$ and $k_l$ are composite indices combining the spin and species
indices. $S$ is the Green function for $(1/i)D$, with Feynman boundary conditions (or
the Euclidean Green function regular at infinity). The lower-case det just means the
determinant of the $4Nn \times 4Nn$ matrix of propagators, while the upper-case Det is the
functional determinant of $D$.
In the particular case of free-field theory, where $M$ is a constant proportional to
the unit matrix, the functional determinant is just a number$^{11}$ and factors out of the
answers for connected correlation functions, since they are ratios of two functional
integrals. The propagator $S(x, y)$ depends only on $x - y$, and its Fourier transform is
$$S(p) = \frac{i}{\not{p} - m + i\epsilon}. \qquad (5.9)$$
$m$ is the mass of the free field. The propagators are oriented, carrying an arrow which
shows the flow of the conserved $U(1)$ particle-anti-particle number (which, for electrons,
is minus the electron charge because of an unfortunate choice of conventions
in the nineteenth century). The momentum $p$ which appears in the propagator is the
momentum flowing in the direction of the particle-number arrow.
11 It is infinite, but that is properly treated under the heading of renormalization.

The Feynman diagram for a free Dirac propagator is
[Figure: Dirac propagator in Minkowski and Euclidean space]
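In practice the Minkowski propagator (5.9) is used in the rationalized form $i(\not{p} + m)/(p^2 - m^2 + i\epsilon)$, which relies on $(\not{p} - m)(\not{p} + m) = (p^2 - m^2)\,\mathbf{1}$. The following sketch checks this identity with explicit matrices. It assumes the Dirac representation with $\gamma^0 = \sigma_3 \otimes 1$, $\gamma^i = i\sigma_2 \otimes \sigma_i$ and metric signature $(+,-,-,-)$; the sample momentum and mass are arbitrary integers chosen so that all arithmetic is exact.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(x, y):
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def identity(n, scale=1):
    return [[scale if r == c else 0 for c in range(n)] for r in range(n)]

def add(x, y):
    return [[x[r][c] + y[r][c] for c in range(len(x))] for r in range(len(x))]

I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
I_SIGMA2 = [[0, 1], [-1, 0]]                       # the real matrix i*sigma_2
GAMMA = [kron(SIGMA[2], I2)] + [kron(I_SIGMA2, s) for s in SIGMA]
ETA = [1, -1, -1, -1]                              # diagonal of the metric

def clifford_ok():
    """Check the anti-commutator [gamma^mu, gamma^nu]_+ = 2 eta^{mu nu}."""
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]))
            if anti != identity(4, 2 * ETA[mu] if mu == nu else 0):
                return False
    return True

def slash(p):
    """p-slash = gamma^mu p_mu for a momentum with upper components p^mu."""
    out = identity(4, 0)
    for mu in range(4):
        coeff = ETA[mu] * p[mu]
        out = add(out, [[coeff * g for g in row] for row in GAMMA[mu]])
    return out

p, m = (5, 1, 2, 3), 2
p_squared = sum(ETA[mu] * p[mu] ** 2 for mu in range(4))
product = matmul(add(slash(p), identity(4, -m)), add(slash(p), identity(4, m)))
propagator_identity_ok = product == identity(4, p_squared - m * m)

print(clifford_ok(), p_squared, propagator_identity_ok)
```

Because the identity follows from the anti-commutation relations alone, the same check should pass in any other representation of the Dirac matrices, such as the Weyl basis.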
5.4 Problems for Chapter 5
*5.1. Write the general solution of the Dirac equation,
$$(i\not{\partial} - m)\psi = 0,$$
in terms of Fourier coefficients $a_i(p)u_i(p)$ for positive-frequency solutions and
$b_i^{\dagger}(p)v_i(p)$ for negative frequencies, where $(\not{p} - m)u_i = 0$ and $(\not{p} + m)v_i = 0$. There are
two linearly independent solutions of each equation (see Appendix C). Show that
$a_i^{\dagger}$ and $b_i^{\dagger}$ transform like the creation operators of massive spin-$\frac12$ particles, if the
field transforms in the $[2,1] \oplus [1,2]$ representation of the Lorentz group. Show
that the commutation or anti-commutation relation
$$[\psi_a(\mathbf{x}, t), \psi_b^{\dagger}(\mathbf{y}, t)]_{\pm} = \delta_{ab}\,\delta^3(\mathbf{x} - \mathbf{y})$$
implies bosonic or fermionic commutation relations for the creation and annihilation
operators. Show that only the fermionic choice leads to a local field, which
anti-commutes with itself and its conjugate, at space-like separation.
5.2. Starting from the Weyl representation of the Dirac matrices, find a change of basis
that makes all the matrices imaginary (Majorana representation) and another one
in which $\gamma^0$ is $\sigma_3 \otimes 1$ and $\gamma_\mu^{\dagger} = \gamma^0\gamma_\mu\gamma^0$.
*5.3. The most general action of the discrete symmetries T, C, and P on mass
eigenstates of spin-$\frac12$ particles is (the $\eta$s are complex phases)
$$T^{\dagger} a(p, s)\, T = \eta_T\, a(-p, -s),$$
$$P^{\dagger} a(p, s)\, P = \eta_P\, a(-p, s),$$
$$C^{\dagger} a(p, s)\, C = \eta_C\, b(p, s).$$
Recall that the T operator is anti-unitary, while the others are unitary. If the
anti-particle operator $b$ is not the same as that of the particle (i.e. the particle
carries a charge), then there are in principle independent phases $\bar\eta_{T,P}$ for the
anti-particle. Find the connections between these phases implied by a homogeneous
transformation law for local fields, and write down the transformation laws for
Dirac fields in the Weyl representation.
*5.4. The Dirac anti-commutation relations imply that independent elements of the
algebra generated by the Dirac matrices are the antisymmetrized products
$\gamma^{\mu_1\ldots\mu_k}$, where $k$ runs from 0 to 4. Rewrite the $k = 3, 4$ cases in terms of the
matrices $\gamma^\mu$ and $\gamma_5$. Use the results of the previous problem to characterize the
T, C, and P transformation laws of all Dirac bilinears $\bar\psi\gamma^{\mu_1\ldots\mu_k}\psi$.
*5.5. Given a set of Weyl fields $\psi^i_a$, the most general non-derivative, bilinear term one
can add to the Lagrangian has the form
$$M_{ij}\,\epsilon^{ab}\,\psi^i_a\psi^j_b + \text{h.c.}$$
Anti-commutativity of the fields and anti-symmetry of the Levi-Civita symbol
imply that $M$ is a symmetric complex matrix. Suppose that there is a $U(1)$
transformation acting on the $\psi^i_a$, which commutes with $M$. Then the bilinear
can be written as a sum over subsets of fields with charges $\pm q$ under $U(1)$,
$\sum_q \psi^i_a(q)\,\psi^j_b(-q)\,m_{ij}(q)$. The part of this sum with $q = 0$ is called a Majorana
mass term, while the rest of the bilinear is called a Dirac mass term. $m_{ij}(q)$
is a matrix acting only on the subspace of fields with charge $q$. Show that by
doing independent unitary transformations on the charge $q$ and $-q$ fields (for
$q \neq 0$) we can transform each $m_{ij}(q)$ into a diagonal matrix with real positive
entries. Thus, the fermion kinetic term for Dirac fermion masses actually has
a $U(1)^{\text{No. of charges}}$ symmetry and preserves T, C, and P. In the context of the
strong interactions these $U(1)$ charges are called quark flavors. They are not
exactly preserved by weak interactions, but this is a small effect. Flavor is an
example of an accidental approximate symmetry. What can you say about the
Majorana mass matrix with regard to putting it in a canonical form, and its
transformation under discrete Lorentz transformations?
5.6. Derive the Källén-Lehmann spectral representation for a Dirac field $\psi_a(x)$ and
for a conserved vector current operator $J_\mu(x)$ satisfying $\partial_\mu J^\mu = 0$.
5.7. Solve the Dirac Green function equation
$$(i\not{\partial} - e\not{A} - m)\,S(x, y) = \delta^4(x - y)$$
in a constant background electromagnetic field $A_\mu(x) = F_{\mu\nu}x^\nu$.
*5.8. Show that $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ and that $[\gamma^\mu, \gamma^\nu]_+ = 2\eta^{\mu\nu}$. Prove that $\gamma_5^2 = 1$ and
that $[\gamma_5, \gamma^\mu]_+ = 0$.
*5.9. Use the rules of Grassmann integration to prove the analog of Wick's theorem
for fermions.
*5.10. Using the transformation $\chi_\pm = (\eta \pm \bar\eta)/2$, in the Grassmann integral formula
for the determinant of $M$, for the case in which $M$ is anti-symmetric, relate the
determinant and Pfaffian formulae and prove that $\det A = (\mathrm{Pf}\, A)^2$.
*5.11. Derive the Feynman rules for the Yukawa interaction
$$\Delta\mathcal{L} = \phi\,(g_S\,\bar\psi\psi + g_P\,\bar\psi\gamma_5\psi)$$
between a spin-zero field and a Dirac fermion.
*5.12. Prove all of the trace and contraction identities in Appendix C, and generalize
them to at least one higher power of Dirac matrices. Evaluate all of the traces
when an additional $\gamma_5$ is inserted, along with the Dirac matrices $\gamma^\mu$.
*5.13. Consider an interaction $\bar\psi\Gamma^A\psi\, g_A\phi_A + \frac12(\phi_A)^2$, where $\Gamma^A$ runs over the 16
independent anti-symmetrized Dirac products, $\phi_A$ is a tensor field of the
appropriate kind to make a Lorentz-invariant product, and $g_A$ is the same for each
component of an irreducible Lorentz representation. The fields $\phi_A$ have no
kinetic terms. Solve for them algebraically, and show that you get the most
general four-fermion interaction for the $\psi$ field, which preserves the symmetry
$\psi \to e^{i\alpha}\psi$. This trick can be generalized to higher powers of $\psi$ and to
interactions that don't preserve the symmetry. Using this trick, one can formally do
the fermion functional integral in terms of the Green function for the Dirac
operator $i\not{\partial} - \phi_A\Gamma^A$ and the determinant of this operator. If $\psi$ were a
boson, we would get the inverse determinant instead of the determinant. Show
that this replacement implies the extra Feynman rule of $(-1)$ for each closed
fermion loop.
*5.14. Prove the Gordon identity
$$2m\,\bar u(k)\gamma^\mu u(p) = \bar u(k)\left[(p + k)^\mu + \gamma^{\mu\nu}(k - p)_\nu\right]u(p)$$
and the analogous identity for the anti-particle spinors $v(p)$.
*5.15. Solve the momentum-space Dirac equations $(\not{p} - m)u(p, s) = 0 = (\not{p} + m)v(p, s)$,
in both the Dirac and the Weyl bases for the Dirac matrices (see Appendix C).
*5.16. Show that charge-conjugation symmetry implies that the representation of the
internal symmetry group G is real or pseudo-real.
Massive quantum electrodynamics
In this chapter we will do some perturbative calculations in a theory called massive
QED. It is quantum electrodynamics with the photon replaced by a massive vector
field. Actually, it is the simplest example of a theory with a Higgs mechanism, since we
have seen that the massive vector can be written as a gauge theory by introducing an
additional scalar degree of freedom. Indeed, we can write a theory of electrodynamics
of spinor and scalar fields in terms of the Lagrangian
$$g^2\mathcal{L} = |D_\mu\phi|^2 - \frac14 F_{\mu\nu}F^{\mu\nu} + \bar\psi(i\gamma^\mu D_\mu - m)\psi + \lambda(|\phi|^2 - v^2)^2. \qquad (6.1)$$
If we write $\phi = \rho e^{i\theta}$, and treat $\theta$ as the Stueckelberg field in the gauge formalism for the
massive vector field, then we can eliminate $\theta$ in favor of a massive vector field $B_\mu$ with a
mass that depends on the field $\rho$. $\rho$ is called the Higgs field. In the formal limit $\lambda \to \infty$,
fluctuations around $\rho = v$ are infinitely costly, and we get the theory called massive
QED. It turns out to be a perfectly good quantum field theory (it is renormalizable in
perturbation theory), in the sense that, if $g$ is small and we introduce an ultraviolet
cut-off $\Lambda$ to make sure the theory is well defined (see the chapter on renormalization),
then as long as $\Lambda < e^{c/g^2}E$, where $c$ is a number of order 1 that we will learn to compute
later, the predictions of the theory at energy scales $E \ll \Lambda$ are insensitive to the value
of $\Lambda$ and to the precise way in which we implement the cut-off.
We introduce massive QED for several reasons. First, it allows us to do computa-
tions in QED without worrying about gauge invariance. This is because the massive
theory has no gauge invariance, and can be quantized in a straightforward manner. The
Heisenberg equations of motion and canonical commutation relations lead, in a famil-
iar fashion, to the path-integral formula for the generating functional of Euclidean
Green functions
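$$Z[J_\mu, \eta, \bar\eta] = \frac{\int [dB_\mu\, d\psi\, d\bar\psi]\; e^{-S + \int d^4x\,(J^\mu B_\mu + \bar\eta\psi + \bar\psi\eta)}}{\int [dB_\mu\, d\psi\, d\bar\psi]\; e^{-S}}.$$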
This leads to Euclidean Feynman rules with the propagator for $B_\mu$
$$\frac{\delta_{\mu\nu} + p_\mu p_\nu/\mu^2}{p^2 + \mu^2}.$$
The Euclidean fermion propagator is
(6.2)
As usual we have written the propagators of fields that are rescaled to make the
quadratic terms in the Lagrangian $e$-independent. The gauge boson mass is $\mu = ev$.
The interaction vertex corresponds to the following amputated$^1$ Feynman diagram:
[Figure: Fermion gauge vertex]
and has the value $-e\gamma_\mu$, which goes to $-ie\gamma^\mu$ in Lorentzian signature. It is a non-trivial,
but true, statement that, when we restrict attention to sources satisfying $\partial_\mu J^\mu = 0$, this
generating functional has a finite limit as $\mu \to 0$ and the vector boson becomes massless.
In this limit the particle states split into two different representations of the Poincaré
group, a massless scalar (the erstwhile longitudinal component of the massive spin-1
particle) and a massless particle with helicity $\pm 1$. The restriction to conserved sources
decouples the massless scalar. A particular class of conserved currents, of the form
$J^\mu = \partial_\nu M^{\mu\nu}$, with $M^{\mu\nu} = -M^{\nu\mu}$, generates Green functions of the electromagnetic
field strength $F_{\mu\nu}$ in the $\mu \to 0$ limit.
When we continue to Lorentzian signature and compute scattering amplitudes we see
several interesting effects in the massless limit. First of all, the amplitudes for producing
longitudinal gauge bosons from initial states that had no longitudinal gauge bosons in
them go to zero in this limit. Secondly, amplitudes involving scattering of fermions
diverge in perturbation theory. This divergence is primarily due to the fact that in the
$\mu \to 0$ limit one can emit an arbitrary number of vector bosons with a finite cost in
energy. Indeed, in this limit the probability of emitting any finite number of bosons goes
to zero.$^2$ This is the infrared catastrophe of Maxwell's QED. As a catastrophe, it rates
pretty low on the Richter scale. It simply means that any scattering of charged particles
is accompanied by a low-energy burst of classical bremsstrahlung radiation, which more
or less follows the trajectories of the outgoing charged particles. If we restrict the energy
e in this radiation, rather than the number of particles (which is always infinite in the
$\mu \to 0$ limit) we get a finite answer. The advantage of massive QED is that we can
derive this finite answer by summing up finite amplitudes in perturbation theory and
then taking $\mu$ to zero, rather than formally resumming a series of infinite terms.
We do not have space in this text to actually do a bremsstrahlung calculation, and
will leave it to the problem set. Instead we will do two tree-level computations of
fermion annihilation processes, and a calculation through one loop level of low-energy
scattering in an external magnetic field. It will be convenient to add a second fermion
to the theory, with the same charge as the original one, and a mass $M \gg m$. We will call
the light fermions electrons and positrons, and the heavy ones muons and anti-muons,
in honor of famous characters who have appeared in the Real World.$^3$
1 This adjective implies, as usual, that "the legs have been cut off."
2 To understand how a zero can look infinite in perturbation theory, consider the expression $e^{-e^{2}|\ln\mu|}$.
3 A well-known piece of international cinema, of little artistic value, but enormous popular appeal.
6.1 Free the longitudinal gauge bosons!
Our first calculation will be the tree-level amplitude for annihilation of electron and
positron into spin-1 particles. There are two Feynman diagrams (Figure 6.1), related
by the Bose statistics of the gauge bosons.
At tree level, we can short-circuit much of the LSZ formula. The poles in external
lines sit at the bare Lagrangian masses, which are thus equal to the physical masses at
this order. Wave-function renormalization constants are all equal to one. The general
rule for external-line wave functions is that for incoming lines they are the coefficients
of annihilation operators, while for outgoing lines they are the coefficients of creation
operators, in the free fields which create and annihilate the states. Thus we have
$u(p,s)$: incoming fermion,
$\bar u(p,s)$: outgoing fermion,
$v(p,s)$: outgoing anti-fermion,
$\bar v(p,s)$: incoming anti-fermion.
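The amplitude for the first diagram is thus
$$\mathcal{M} = (-ie)^2\,\bar v(p_2,s_2)\,\gamma^\mu\,\frac{i}{\not p' - m + i\epsilon}\,\gamma^\nu\,u(p_1,s_1)\,(\epsilon^*)_\mu(k_1,a_1)\,(\epsilon^*)_\nu(k_2,a_2). \tag{6.4}$$
The momentum in the internal fermion line is
$$p'^\mu = (k_1 - p_1)^\mu.$$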
The amplitude for the second diagram is the same expression with the two massive-photon
polarization vectors interchanged, and $p'^\mu = (k_2 - p_2)^\mu$. The two diagrams are to
be added, which shows that the Feynman rules implement Bose statistics.
We have written the invariant amplitude, M . The full S-matrix element for any process
is gotten by multiplying the invariant amplitude by
*=n 7 ^<^(p,-i>)M.
The product of energy factors comes from our non-relativistic state normalization, and
runs over all initial and final particles. The momentum-conservation delta function is
written here with the convention that all momenta have positive energy.
Figure 6.1. Gauge-boson production in $e^+e^-$ annihilation.
It is instructive to examine this amplitude for the case of longitudinally polarized
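massive photons. For a photon of momentum $k^\mu = (k^0, \mathbf{k})$, with $k^0 = \sqrt{\mathbf{k}^2 + \mu^2}$, the longitudinal polarization vector is
$$\epsilon^\mu(k,L) = \frac{1}{\mu}\left(|\mathbf{k}|,\; k^0\,\frac{\mathbf{k}}{|\mathbf{k}|}\right).$$
Since it blows up at $\mu = 0$, production of these particles would seem to rule out the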
possibility of taking the massless limit. Furthermore, at high energies the dimensionless
polarization vector must blow up like $|\mathbf{k}|$. At the very least we seem to be faced by a
breakdown of perturbation theory in both the massless and the high-energy limit.
The observation that saves the day is that, at $|\mathbf{k}| \gg \mu$, we have
$$\epsilon^\mu(k,L) = \frac{k^\mu}{\mu} + O(\mu/|\mathbf{k}|).$$
Now consider the production amplitude for $\epsilon_1$ longitudinal and center-of-mass energy
much bigger than $\mu$. It is, in leading order,
Now use the identities
fa-&2-m = J l -j/( 1 -m,
V\ =¥1 -rfi-m+Pi + m,
¥i = -Wi-¥i-m)+ft-m,
(tf - m)u = v(/ 2 + m) = 0.
The terms coming from the two possible substitutions for $\not k_1$ each consist of one term
that cancels out the fermion propagator and another that vanishes because the external
spinors satisfy the Dirac equation. The terms with canceled propagators from the two
diagrams have equal magnitude and opposite sign. Thus, the divergent contribution to
the longitudinal production amplitude, in the zero-mass or high-energy limit, cancels
out exactly. This cancelation is the heart of the proof of renormalizability of Higgs
models.
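The high-energy behaviour of the longitudinal polarization vector can be checked numerically. The sketch below assumes the standard form $\epsilon^\mu(k,L) = (|\mathbf{k}|, 0, 0, k^0)/\mu$ for a massive photon moving along the $z$ axis; the function names and the choice of axis are illustrative assumptions, not notation from the text. It confirms that $\epsilon\cdot k = 0$, that $\epsilon^2 = -1$, and that $\epsilon^\mu - k^\mu/\mu$ shrinks like $\mu/|\mathbf{k}|$.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def photon_momentum(k_abs, mu):
    """Four-momentum of a massive photon moving along z (assumed direction)."""
    return (math.sqrt(k_abs**2 + mu**2), 0.0, 0.0, k_abs)

def longitudinal_polarization(k_abs, mu):
    """Assumed standard form (|k|, 0, 0, k0) / mu for motion along z."""
    k0 = math.sqrt(k_abs**2 + mu**2)
    return (k_abs / mu, 0.0, 0.0, k0 / mu)

def remainder(k_abs, mu):
    """Largest component of eps_L - k/mu, expected to be of order mu/|k|."""
    k = photon_momentum(k_abs, mu)
    eps = longitudinal_polarization(k_abs, mu)
    return max(abs(e - c / mu) for e, c in zip(eps, k))

if __name__ == "__main__":
    mu = 1.0
    for k_abs in (10.0, 100.0, 1000.0):
        # The second and third columns should track each other.
        print(k_abs, remainder(k_abs, mu), mu / (2.0 * k_abs))
```

The remainder falls off as the momentum grows, which is why only the $k^\mu/\mu$ piece can produce a dangerous contribution, and that piece is the one removed by the cancelation between the two diagrams.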
6.2 Heavy-fermion production in electron-positron
annihilation
In this section we will calculate the leading-order contribution, in massive QED, to the
annihilation cross section for light charged particles (electrons and positrons) into
heavier ones. We will call the heavier particles muons, but the calculation is valid for
$\tau$ leptons as well. As a consequence of asymptotic freedom, it can also be used for
calculating the $e^+e^- \to$ hadrons cross section above the QCD confinement scale (see
Chapter 8 for a definition). In that case, the rigorous use of the calculation involves an
analytic continuation from Euclidean space [32], which we will not delve into. Since we
work with massive photons, our calculation can also be adapted to calculate the part
of the full cross section that comes from Z-boson, rather than photon, exchange. We
need only substitute the correct couplings to the Z boson, and set $\mu$ equal to $m_Z$ in the
calculation below. We will learn how to calculate the Z couplings in Chapter 8. Given
the format of this book, we cannot possibly do justice to the full extent of the physics
of this process. The reader is urged to consult the wonderful section in Chapter 5 of
Peskin and Schroeder [33] for a detailed description of it.
Figure 6.2. Heavy-fermion production in $e^+e^-$ annihilation.
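There is only one diagram (Figure 6.2) in leading order, and its value is
$$\mathcal{M} = (-ie)^2\,\bar u(k_-,r_-)\,\gamma^\kappa\,v(k_+,r_+)\;\frac{-i\left(\eta_{\kappa\lambda} - q_\kappa q_\lambda/\mu^2\right)}{q^2 - \mu^2 + i\epsilon}\;\bar v(p_+,s_+)\,\gamma^\lambda\,u(p_-,s_-). \tag{6.6}$$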
Here $q = p_+ + p_- = k_+ + k_-$, and the labels indicate incoming and outgoing fermion
momenta and spins in what I hope is an obvious manner. The labels are related to the
angle in the center-of-mass frame by Figure 6.3.
We note the identity
$$\bar v(p_+,s_+)\,(p_+ + p_-)_\lambda\gamma^\lambda\,u(p_-,s_-) = (-m_e + m_e)\,\bar v(p_+,s_+)\,u(p_-,s_-) = 0,$$
which eliminates the longitudinal part of the gauge-boson propagator from the
amplitude.
We will now calculate the spin-averaged total cross section for this process, where we
average over initial and sum over final spins. This corresponds to the simplest experimental
situation, in which the initial beams are unpolarized and no spin measurements
are made on the final particles. Polarized beams and polarization detectors are powerful
experimental tools in $e^+e^-$ annihilation. The reader is urged again to consult Peskin
and Schroeder [33] for a detailed description of polarization amplitudes. At this point
the reader should also turn to the appendix on Diracology. He/she is urged to take all
of the results stated there as exercises, even those whose proofs are given. Mastery of
this technology is an important part of the repertoire of any respectable field theorist.
Figure 6.3. $\mu^+\mu^-$ pair production by annihilation of massless $e^+e^-$ in the center-of-mass frame.
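Using the polarization sum identities of Diracology, the spin-averaged squared
amplitude is
$$\frac14\sum_{r_\pm,s_\pm}|\mathcal{M}|^2 = \frac{e^4}{4\,|q^2-\mu^2+i\epsilon|^2}\,\mathrm{tr}\left[(\not p_+ - m_e)\gamma^\kappa(\not p_- + m_e)\gamma^\lambda\right]\,\mathrm{tr}\left[(\not k_- + m_\mu)\gamma_\kappa(\not k_+ - m_\mu)\gamma_\lambda\right]. \tag{6.7}$$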
Now we use the trace formulae of the appendix to write this as
$$\frac14\sum_{r_\pm,s_\pm}|\mathcal{M}|^2 = \frac{8e^4}{|q^2-\mu^2+i\epsilon|^2}\left[(p_-k_-)(p_+k_+) + (p_-k_+)(p_+k_-) + m_\mu^2\,(p_-p_+)\right]. \tag{6.8}$$
We have dropped a term proportional to $m_e^2$ (which the reader should calculate), since
these formulae are always used at center-of-mass energy above twice the muon mass.
We evaluate the scattering cross section in the center-of-mass frame (which is not
quite the lab frame in an $e^+e^-$ collider, because of bremsstrahlung). There, according
to Figure 6.3,
$$(p_+ + p_-)^2 = 4E^2; \qquad (p_+p_-) = 2E^2;$$
$$(p_-k_-) = (p_+k_+) = E^2 - E\sqrt{E^2 - m_\mu^2}\,\cos\theta;$$
$$(p_-k_+) = (p_+k_-) = E^2 + E\sqrt{E^2 - m_\mu^2}\,\cos\theta.$$
We have again made the approximation that the electrons are massless and travel at the
speed of light. E is the energy and absolute value of the momentum of the electron or
positron. The square root in the above formulae is the absolute value of the muon or
anti-muon momentum.
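These invariants are easy to confirm from explicit four-vectors. The following sketch assumes a particular orientation (electron along $+z$, positron along $-z$, muons in the $x$-$z$ plane at angle $\theta$ to the electron); the orientation and all function names are illustrative assumptions rather than details taken from Figure 6.3. It also evaluates the combination of dot products that multiplies the squared amplitude and compares it with the closed form $2E^4[(1 + m_\mu^2/E^2) + (1 - m_\mu^2/E^2)\cos^2\theta]$.

```python
import math

def dot(a, b):
    """Minkowski product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def cm_momenta(E, m_mu, theta):
    """Assumed CM configuration: massless e- along +z, e+ along -z,
    mu- at angle theta to the e- direction, mu+ back to back with it."""
    k_abs = math.sqrt(E**2 - m_mu**2)
    p_minus = (E, 0.0, 0.0, E)
    p_plus = (E, 0.0, 0.0, -E)
    k_minus = (E, k_abs * math.sin(theta), 0.0, k_abs * math.cos(theta))
    k_plus = (E, -k_abs * math.sin(theta), 0.0, -k_abs * math.cos(theta))
    return p_minus, p_plus, k_minus, k_plus

def bracket_from_vectors(E, m_mu, theta):
    """(p-.k-)(p+.k+) + (p-.k+)(p+.k-) + m_mu^2 (p-.p+) from explicit vectors."""
    pm, pp, km, kp = cm_momenta(E, m_mu, theta)
    return (dot(pm, km) * dot(pp, kp)
            + dot(pm, kp) * dot(pp, km)
            + m_mu**2 * dot(pm, pp))

def bracket_closed_form(E, m_mu, theta):
    """Same quantity written in terms of E, m_mu and cos(theta)."""
    r = (m_mu / E) ** 2
    return 2.0 * E**4 * ((1.0 + r) + (1.0 - r) * math.cos(theta) ** 2)

if __name__ == "__main__":
    print(bracket_from_vectors(3.0, 1.0, 0.7), bracket_closed_form(3.0, 1.0, 0.7))
```

The two evaluations agree for any angle and any energy above threshold, which is a quick way to catch sign mistakes in the kinematics before inserting them into the cross-section formula.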
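We now consult the appendix on cross sections, using the two-to-two formula with
$E_A = E_B = E$, and $|v_A - v_B| = 2$ to write
$$\frac{d\sigma}{d\Omega} = \frac{\sqrt{E^2 - m_\mu^2}}{256\pi^2E^3}\cdot\frac14\sum_{r_\pm,s_\pm}|\mathcal{M}|^2 \tag{6.9}$$
$$= \frac{\alpha^2E^2}{(4E^2-\mu^2)^2}\sqrt{1-\frac{m_\mu^2}{E^2}}\left[\left(1+\frac{m_\mu^2}{E^2}\right) + \left(1-\frac{m_\mu^2}{E^2}\right)\cos^2\theta\right]. \tag{6.10}$$
We have introduced the fine-structure constant $\alpha = e^2/4\pi \approx 1/137$.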
There are three interesting things to note about this cross section. The first is the
square-root turn-on of the cross section near threshold and the second is the angular
distribution of the reaction products. Both the square root and the angular distribution
are characteristic of spin-$\frac12$ particles. Observations of thresholds like this in $e^+e^-$
annihilation signal new quarks and leptons and give us one of the pieces of evidence
that they all have spin $\frac12$. If we are ever lucky enough to see a supersymmetric partner
of a quark or lepton with spin zero, we will see something quite different. The electromagnetic
current for these particles carries a factor of momentum, so the turn-on of the
cross section above threshold is slower by an extra square root. The angular distribution
is also different. You should do Problem 6.4, to appreciate these differences.
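Finally, if we calculate the total cross section
$$\sigma_{\rm tot} = \int d\Omega\,\frac{d\sigma}{d\Omega} = \frac{16\pi\alpha^2E^2}{3\,(4E^2-\mu^2)^2}\sqrt{1-\frac{m_\mu^2}{E^2}}\left(1+\frac{m_\mu^2}{2E^2}\right) \tag{6.11}$$
and take the limit $E \gg$ all mass scales, then we get the scale-invariant result
$$\sigma_{\rm tot} \to \frac{\pi\alpha^2}{3E^2}. \tag{6.12}$$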
The fact that this is scale-invariant seems obvious, but it is a clue to the very essence
of what a quantum field theory is. You'll have to hold your breath till the chapter on
renormalization to find out.
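The approach to the scale-invariant limit can be illustrated numerically. The sketch below assumes the standard tree-level differential cross section for $e^+e^- \to \mu^+\mu^-$ with massless electrons, beam energy $E$, muon mass $m_\mu$ and vector-boson mass $\mu$, namely $d\sigma/d\Omega = \alpha^2E^2(4E^2-\mu^2)^{-2}\sqrt{1-m_\mu^2/E^2}\,[(1+m_\mu^2/E^2)+(1-m_\mu^2/E^2)\cos^2\theta]$; the function names, the value of $\alpha$, and the midpoint integration rule are illustrative assumptions.

```python
import math

ALPHA = 1.0 / 137.0  # approximate value quoted in the text

def dsigma_domega(E, theta, m_mu, mu=0.0, alpha=ALPHA):
    """Assumed tree-level differential cross section in the CM frame."""
    r = (m_mu / E) ** 2
    prefactor = alpha**2 * E**2 / (4.0 * E**2 - mu**2) ** 2
    return prefactor * math.sqrt(1.0 - r) * ((1.0 + r) + (1.0 - r) * math.cos(theta) ** 2)

def sigma_total(E, m_mu, mu=0.0, alpha=ALPHA, n=2000):
    """Midpoint rule for 2*pi * integral_0^pi sin(theta) dsigma/dOmega dtheta."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += math.sin(theta) * dsigma_domega(E, theta, m_mu, mu, alpha)
    return 2.0 * math.pi * h * total

def sigma_high_energy(E, alpha=ALPHA):
    """Scale-invariant limit pi * alpha^2 / (3 E^2)."""
    return math.pi * alpha**2 / (3.0 * E**2)

if __name__ == "__main__":
    m_mu = 1.0
    for E in (1.0, 1.1, 2.0, 10.0, 1000.0):
        # The ratio starts at zero at threshold and tends to one at high energy.
        print(E, sigma_total(E, m_mu) / sigma_high_energy(E))
```

The printed ratio vanishes at threshold, rises with the square-root behaviour discussed above, and approaches one once $E$ is far above every mass in the problem, at which point $E^2\sigma_{\rm tot}$ no longer depends on $E$.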
6.3 Interaction with heavy fermions: particle paths
and external fields
If the muon mass were extremely heavy, we would not be able to create muons in low-
energy processes involving electrons, photons, and muons. Furthermore, in this limit,
muons move without much disturbance from their interactions (momentum transfer
much less than the muon mass). Thus we can control the trajectories of any muons and
anti-muons in the initial state, and restrict our attention to trajectories where no anni-
hilation processes are possible. In this section we will discuss a set of approximations
that are useful for describing this limit.
In the path-integral formalism for QED, we can do the integral over the muon field
exactly, with the other fields held fixed, in terms of the Green function of the muon Dirac
operator in an external photon field. If we are trying to evaluate a Green function with
$2N$ muon fields, appropriate for situations in which there are $N$ muons or anti-muons
in the initial state of a scattering process, then the result is
$$\det[S(x_i,y_j)]\;\mathrm{Det}[i\not\partial - e\not A - M_\mu]. \tag{6.13}$$
$S(x,y)$ is the solution of
$$[i\not\partial - e\not A(x) - M_\mu]\,S(x,y) = i\delta^4(x-y),$$
with Feynman boundary conditions. The lower-case det refers to the ordinary determinant
of the $N \times N$ matrix of propagators, corresponding to all possible contractions of
initial and final points. The upper-case Det refers to the functional determinant of the
Dirac operator. The functional determinant is negligible as long as all external fields
and momenta are small compared with $M_\mu$.
To evaluate the propagator $S$ we write
$$S = (i\not D - M_\mu)^{-1} = (i\not D + M_\mu)\left(-D^2 - M_\mu^2 + \tfrac12\sigma^{\mu\nu}F_{\mu\nu}\right)^{-1}, \tag{6.14}$$
where $D_\mu$ is the covariant derivative. To evaluate the inverse of the second-order operator,
we think of it as the Hamiltonian for a quantum particle with four position coordinates$^4$
and a spin, and write the inverse in terms of a mixture of path-integral formalism
for the positional mechanics and a time-ordered formula for the spin dynamics:
5'=(i|> + M |X ) f d.ye- LsM qdx^)V° dT Lw + A "^lT^f dt """^W*)).
(6.15)
The path integral is over paths satisfying $x(0) = x$, $x(s) = y$. On introducing the
dimensionless variables $sM_\mu^2 = u$, $xM_\mu = v$, it is easy to see that, for large $M_\mu$, the
particle action is large, and is dominated by the free-particle kinetic term, which is of
order $M_\mu^2$. The term involving the line integral of the vector potential is of order one,
while the interaction with the spin is down by $1/M_\mu^2$.
Thus we can saturate the path integral by the straight-line path between $x$ and $y$.
The result, to leading order in $M_\mu$, is
S(x,y)= ["dse-^+^e^o^^Wj^. (6.16)
Jo
Here $A_l = (x - y)^\mu A_\mu$, and the last exponential is just the line integral of the vector
potential along the straight-line particle path and, in particular, it is independent of
the parameter $s$. Thus, the effect of integrating out the massive fermion is to add a
source term to the $A_\mu$ Lagrangian. If we shift the $A_\mu$ field to eliminate the source, it is
shifted exactly by the classical (massive) electromagnetic field produced by the current
of the heavy point particle. Obviously we can repeat this procedure for all the fermion
propagators. To leading order in the mass, the effect of a collection of massive particles
on the rest of the theory is simply to shift the background value of $A_\mu$ from zero to some
other classical field. This justifies the study of such classical external-field problems.
6.4 The magnetic moment of a weakly coupled
charged particle
One of the most interesting calculations we can do with the external field method is
of the correction to the magnetic moment of a particle whose strongest interaction is
electromagnetic. The classic result of Dirac is that the gyromagnetic ratio of a point
particle of spin $\frac12$ is 2. What we will show is that purely electromagnetic interactions
correct that result. In fact, higher-order calculations and modern experimental techniques
have combined to make this calculation/measurement for the electron and the
muon one of the most precise agreements we have between theoretical physics and the
real world. The current state of the art is sensitive to the effects of virtual strongly
interacting particles and of the weak interactions. It can even be used to put interesting
bounds on physics at quite high energy scales. The electron and muon dipole
moments are known to 13 and 10 significant digits experimentally. For the electron
there is complete agreement between theory and experiment at that level of accuracy.
Recent (November 2006) results indicate a possibly significant deviation for the muon,
which might be an indication of physics beyond the standard model. We will calculate
only the leading correction to the anomalous magnetic moment. A general reference
for precision QED results is [34].
4 This actually looks a little more conventional in Euclidean space, but we will not bother to go through
Figure 6.4. One-loop vertex correction.
We will employ the external field methods we have just learned, specialized to a
field of the form $F_{ij} = \epsilon_{ijk}B_k$, where $B_k$ is the constant value of the magnetic field,
in the rest frame of the particle. The electric field is assumed zero in this frame. We
will do the computation to leading order in the corrections to the result for the Dirac
equation in an external field. The reader should first do Problem 6.5, which shows that
the prediction of the Dirac theory is a gyromagnetic ratio of 2.
To first order in the external field, the invariant amplitude is given by
$$(-ie_0)\int d^4q\,A^\mu(q)\,\bar u(p+q)\,\Gamma_\mu(p,q)\,u(p).$$
We have suppressed the spin indices on the incoming and outgoing spinors. We will
evaluate the 1PI vertex $\Gamma_\mu$ to one-loop order (Figure 6.4), taking into account that
to this order the bare mass parameter $m_0$ is equal to the physical mass $m$. Thus, the
diagram is given by
$$\frac{(-i)(-ie)^2}{(2\pi)^4}\,\bar u(p+q)\int d^4k\;\frac{\eta_{\lambda\kappa} - (k-p)_\lambda(k-p)_\kappa/\mu^2}{(k-p)^2 - \mu^2 + i\epsilon}\;\frac{\gamma^\lambda(\not k + \not q + m)\gamma_\mu(\not k + m)\gamma^\kappa}{[(k+q)^2 - m^2 + i\epsilon][k^2 - m^2 + i\epsilon]}\;u(p).$$
Let us first deal with the term which apparently diverges as $\mu \to 0$. Using the
identities $\bar u(p+q)(\not p - \not k) = \bar u(p+q)(m - \not k - \not q)$ and $(\not p - \not k)u(p) = (m - \not k)u(p)$,
we see that the denominators in the two fermion propagators are canceled out by the
numerators, and we are left with a term proportional to
$$\frac{1}{\mu^2}\,\bar u(p+q)\gamma_\mu u(p)\int\frac{d^4k}{(k-p)^2 - \mu^2 + i\epsilon}.$$
This term can be absorbed into a rescaling of the fermion fields because it has the
same form as the tree-level vertex. The precise value of this renormalization depends
on the method we use for cutting off ultraviolet divergences in the theory, a subject to
which we will return in Chapter 9. There we will distinguish between cut-off procedures
that preserve gauge invariance and those which don't. With a gauge-invariant regulator,
like the analytic continuation of Feynman diagrams in the space-time dimension,$^3$
the apparently divergent integral in the previous expression is actually finite and proportional
to $\mu^2$. We have to be careful about the regulation procedure because, as we
have noted, massive electrodynamics is really a gauge theory in its Higgs phase. The
upshot of this discussion is that we can ignore the longitudinal term in the gauge-boson
propagator, and replace it by
$$\frac{-i\,\eta_{\mu\nu}}{p^2 - \mu^2 + i\epsilon}$$
in this diagram.
Before continuing the computation let us stop to ask what we expect to get. The
current matrix element is $\bar u(p+q)\Gamma_\mu u(p)$, and the vertex function $\Gamma_\mu$ can be expanded
in terms of the linearly independent Dirac matrices $1$, $\gamma_5$, $\gamma_\mu$, $\gamma_\mu\gamma_5$, $\gamma_{\mu\nu}$. The two expressions
with $\gamma_5$, as well as anything else involving the Levi-Civita symbol, cannot appear,
because massive electrodynamics conserves parity. We must combine the matrices with
the 4-vectors $p$, $q$ to make a 4-vector, and, when we sandwich $\Gamma_\mu$ between on-shell
spinors, we can use
$$(\not p - m)u(p) = (\not p + \not q - m)u(p+q) = 0$$
and the conjugates of these equations to prove that
$$\bar u(p+q)\,\not q\,u(p) = 0,$$
$$\bar u(p+q)(2p^\mu + q^\mu)u(p) = \bar u(p+q)\left[2m\gamma^\mu + \gamma^{\mu\nu}q_\nu\right]u(p).$$
The last identity, named after Gordon, was proved in the exercises to Chapter 5. Using
these identities, it is easy to see that anything in $\Gamma_\mu$ can be reduced to
$$F_1(q^2,m^2)\,\gamma_\mu - F_2(q^2,m^2)\,\frac{\gamma_{\mu\nu}q^\nu}{2m} + F_3(q^2,m^2)\,q_\mu,$$
3 Dimensional regularization, abbreviated DR.
when evaluated between on-shell spinors. The form factors $F_i$ depend on the Lorentz
invariants $q^2$, $p^2 = m^2$ and $2pq = -q^2$.
The current matrix element is conserved, which means $q^\mu\,\bar u\Gamma_\mu u = 0$. This is satisfied
for arbitrary $F_{1,2}$ and implies that $F_3 = 0$. In our calculation, we will seek to reduce
all expressions in $\Gamma_\mu$ (after loop integration) to linear combinations of $(2p+q)_\mu$, $\gamma_\mu$,
and $q_\mu$. The diligent reader will prove that, once this is done, the coefficient of $q_\mu$
vanishes. The Gordon identity will allow us to identify the individual $F_i$. An extremely
important point of the analysis is that, for $q^2 = 0$, which is the only thing our constant
external field probes, $F_1$ serves merely to renormalize the value of the electric coupling
$e_0$. Thus, it cannot change the gyromagnetic ratio of 2 implied by the Dirac equation
for any coupling. The change in the gyromagnetic ratio is due entirely to the second
form factor. $F_1$ is called the electric and $F_2$ the magnetic form factor.
At this point in our studies, this is fortunate. A direct calculation of $F_1(0)$ (though
not its derivatives w.r.t. $q^2$) would lead to infinite integrals, whose meaning I am not
yet prepared to explain. We will find, however, that $F_2$ is completely finite and unambiguous.
This can be seen by dimensional analysis. The integral defining $\Gamma_\mu$ contains
four powers of loop momentum in the denominator. It can, at most, give rise to a
logarithmic divergence. If we differentiate it w.r.t. $q$, we get a finite integral. Only the
first term in the Taylor expansion of $\Gamma_\mu$ diverges. The term containing $F_2$ vanishes at
$q = 0$ and will therefore be finite. This sort of dimensional analysis will be the key to
the understanding of renormalization, which we will achieve in Chapter 9.
For those readers who wish to practice their skills by calculating the entire vertex
function, I will introduce a method to render all integrals finite. It consists in changing
the denominator in the photon propagator to $(p^2 - \mu^2 + i\epsilon)K(p^2/\Lambda^2)$, where $K$ is a
smooth function, which is 1 for $|p^2| < \Lambda^2$ and blows up at least exponentially rapidly
above that. We will argue in Chapter 9 that, as long as $\Lambda$ is much larger than the electron
and photon masses and the external momenta, the answers for physical amplitudes will
be independent of the detailed shape of $K$.
I will also employ an identity, invented by Feynman (its derivation follows in the
problems), which states that
$$\frac{1}{A_1A_2A_3} = 2\int_0^1 dx\,dy\,dz\,\delta(1-x-y-z)\,\frac{1}{D^3}, \qquad D = xA_1 + yA_2 + zA_3.$$
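A quick numerical check of the three-denominator Feynman identity is sketched below. It assumes the standard form $1/(A_1A_2A_3) = 2\int dx\,dy\,dz\,\delta(1-x-y-z)\,D^{-3}$ with $D = xA_1 + yA_2 + zA_3$ and positive real $A_i$; the parametrization of the simplex, the midpoint rule, and the function names are illustrative choices, not taken from the text.

```python
def feynman_lhs(a1, a2, a3):
    """Left-hand side: the product of inverse denominators."""
    return 1.0 / (a1 * a2 * a3)

def feynman_rhs(a1, a2, a3, n=200):
    """Midpoint estimate of 2 * integral over the simplex of 1 / D^3.

    The simplex x + y + z = 1 is parametrized by u, v in (0, 1) as
    x = u, y = (1 - u) v, z = (1 - u)(1 - v), with Jacobian (1 - u)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            x = u
            y = (1.0 - u) * v
            z = (1.0 - u) * (1.0 - v)
            d = x * a1 + y * a2 + z * a3
            total += (1.0 - u) / d**3
    return 2.0 * total * h * h

if __name__ == "__main__":
    print(feynman_lhs(1.0, 2.0, 3.0), feynman_rhs(1.0, 2.0, 3.0))
```

For equal denominators the integrand is constant and the factor of 2 simply compensates the area $1/2$ of the simplex, which is a useful sanity check on the normalization.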
For our purposes, the three $A_i$ are the Feynman denominators, and
$$D = k^2 + 2k(yq - zp) + yq^2 + zp^2 - (x+y)m^2 + i\epsilon.$$
If we introduce $r = k + yq - zp$, then
$$D = r^2 - U + i\epsilon,$$
with
$$U = (1-z)^2m^2 - xy\,q^2.$$
This is extremely useful, because the numerator contains terms of zeroth, first, and
second order in $r_\mu$. The linear terms vanish upon integration, by virtue of Lorentz
invariance, as long as we are careful to regulate the divergences with a Lorentz-invariant
prescription like DR, or our modified photon propagator. The quadratic terms are also
simplified by Lorentz invariance:
$$\int d^4r\,\frac{r_\mu r_\nu}{D^3} = \frac{\eta_{\mu\nu}}{4}\int d^4r\left(\frac{1}{D^2} + \frac{U}{D^3}\right) = \frac{\eta_{\mu\nu}}{4}\int d^4r\,\frac{D+U}{D^3}.$$
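Using the anti-commutation relations, we can rewrite the numerator of the diagram
as
$$\mathrm{Num} = \gamma^\lambda\left[\not q\,\gamma_\mu(\not k + m) + (m^2 - k^2)\gamma_\mu + 2k_\mu(\not k + m)\right]\gamma_\lambda.$$
We use the contraction identities from Appendix C,
$$\gamma_\lambda\gamma_\mu\gamma^\lambda = -2\gamma_\mu,$$
$$\gamma^\lambda\gamma_\mu\gamma_\nu\gamma_\lambda = 4\eta_{\mu\nu},$$
$$\gamma^\lambda\gamma_\mu\gamma_\nu\gamma_\alpha\gamma_\lambda = -4\eta_{\nu\alpha}\gamma_\mu + 2\gamma_\nu\gamma_\alpha\gamma_\mu,$$
and the fact that $\bar u(p+q)\,\not q\,u(p) = 0$, to rewrite this as
$$\mathrm{Num} = 4mq_\mu + 4k_\mu(2m - \not k) + \gamma_\mu\left[2\not k\not q + 2(k^2 - m^2)\right].$$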
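The contraction identities used here can be verified with explicit matrices. The sketch below assumes the Dirac representation and the metric signature $(+,-,-,-)$, works with all free indices raised, and uses only plain nested lists; the helper names are illustrative assumptions. It checks $\gamma^\lambda\gamma^\mu\gamma_\lambda = -2\gamma^\mu$, $\gamma^\lambda\gamma^\mu\gamma^\nu\gamma_\lambda = 4\eta^{\mu\nu}$ and $\gamma^\lambda\gamma^\mu\gamma^\nu\gamma^\alpha\gamma_\lambda = -4\eta^{\nu\alpha}\gamma^\mu + 2\gamma^\nu\gamma^\alpha\gamma^\mu$ for every choice of indices.

```python
ETA = (1, -1, -1, -1)  # diagonal of the Minkowski metric, signature (+, -, -, -)

def zeros():
    return [[0] * 4 for _ in range(4)]

def identity():
    m = zeros()
    for i in range(4):
        m[i][i] = 1
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def combine(a, ca, b, cb):
    """Return ca * a + cb * b for 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def dirac_gammas():
    """Upper-index gamma matrices in the Dirac representation (an assumed choice)."""
    sigmas = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
    g0 = zeros()
    for i in range(4):
        g0[i][i] = 1 if i < 2 else -1
    gammas = [g0]
    for s in sigmas:
        g = zeros()
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = s[i][j]
                g[i + 2][j] = -s[i][j]
        gammas.append(g)
    return gammas

def contract(gammas, middle):
    """gamma^lambda M gamma_lambda, summed over lambda with the metric."""
    total = zeros()
    for lam in range(4):
        term = matmul(matmul(gammas[lam], middle), gammas[lam])
        total = combine(total, 1, term, ETA[lam])
    return total

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

def worst_violation():
    """Largest deviation over all index choices for the three identities."""
    g = dirac_gammas()
    one = identity()
    worst = 0.0
    for mu in range(4):
        rhs1 = combine(g[mu], -2, zeros(), 0)
        worst = max(worst, max_abs_diff(contract(g, g[mu]), rhs1))
        for nu in range(4):
            eta_mn = ETA[mu] if mu == nu else 0
            two = matmul(g[mu], g[nu])
            rhs2 = combine(one, 4 * eta_mn, zeros(), 0)
            worst = max(worst, max_abs_diff(contract(g, two), rhs2))
            for al in range(4):
                eta_na = ETA[nu] if nu == al else 0
                three = matmul(two, g[al])
                reordered = matmul(matmul(g[nu], g[al]), g[mu])
                rhs3 = combine(g[mu], -4 * eta_na, reordered, 2)
                worst = max(worst, max_abs_diff(contract(g, three), rhs3))
    return worst

if __name__ == "__main__":
    print(worst_violation())
```

Because the identities follow from the anti-commutation relations alone, any other representation of the gamma matrices would give the same result.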
We substitute $k = r + zp - yq$ and use the relations for integrals over $r$ to express
this as
Num = Y „ [(1 - *)(1 - y)q 2 + (l-2z + z 2 )- l -(D+ IT)]
+ (2p + q) IA mz(z - 1) + qil m(z - 2){x - y).
We have again used the fact that the numerator is sandwiched between on-shell spinors.
Note in particular that
$$\bar u(p+q)\,\gamma_\mu\not q\,u(p) = 2\bar u(p+q)\,(p_\mu + q_\mu - \gamma_\mu m)\,u(p).$$
We have also reintroduced $x = 1 - y - z$, in order to make it clear that the term
proportional to $q_\mu$ is odd under interchange of $x$ and $y$. It therefore vanishes when
integrated. We now use the Gordon identity to find that the magnetic form factor is
given by
^ {p + q) ^ui P) j^f^x*y*z &{ l-x-y-z ) n ^p±. (6.17)
Note that we have left off the effect of the regulator for the photon propagator, as
well as the photon mass. In fact, we will find that the answer for the magnetic form
factor is completely finite. In Chapter 9 we will learn the reason for the UV finiteness:
expressed as corrections to the quantum Lagrangian for the Dirac field, the terms in the
Taylor expansion of the magnetic form factor around $q = 0$ are all operators of dimension
higher than four. In a theory whose couplings have non-negative mass dimension
(which we will learn to call a renormalizable theory at the Gaussian fixed point), such
terms are independent of the cut-off.
The $r$ integral is done by analytic continuation to Euclidean space, following the $i\epsilon$
prescription
The factor $2\pi^2$ is the 3-volume of the unit sphere embedded in four dimensions. The
Lorentzian signature integral is $-i$ times the Euclidean one. The $i$ comes from $r^0 \to ir^0_E$,
and the minus sign from $r^2 - U \to -(r_E^2 + U)$. The result is
J^u(p + g)^p^u(p) f dxdydzS(l-x-y-z) m * ( \ Z) (6.18)
An 2 2m J (1 - z) 2 m 2 - xyq 2
At this point, we have set $\mu^2 = 0$, since we will obtain a finite result in this limit. The
electric form factor has an infrared divergence when $\mu = 0$, reflecting the zero probability
for scattering without any emission of massless photons, but, to this order in
perturbation theory, the magnetic form factor is IR finite. For $q^2 = 0$ it is easy to do the
remaining integrals. They give $F_2(0) = \alpha/(2\pi)$, where $\alpha = e^2/(4\pi)$ is the fine-structure
constant. In the exercises, you will verify that $F_2(0) = (g-2)/2$, also called the anomalous
magnetic moment of the electron ($g$ is the gyromagnetic ratio). This result was
first obtained by Schwinger in 1948. Since then, experimental and theoretical determinations
have competed with each other in precision, with no definite discrepancy yet
having been found for the electron magnetic moment. Possible deviations for the muon
moment, if they exist, are probably indications for physics beyond the standard model.
We will learn more about the standard model of particle physics in Chapter 8.
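The value $F_2(0) = \alpha/(2\pi)$ can be reproduced by a direct numerical integration over the Feynman parameters. The sketch below assumes the standard one-loop form $F_2(q^2) = \frac{\alpha}{2\pi}\int dx\,dy\,dz\,\delta(1-x-y-z)\,\frac{2m^2z(1-z)}{(1-z)^2m^2 - xyq^2}$, whose denominator matches the one quoted above; the overall normalization, the parametrization of the simplex, and the function names are assumptions made for illustration. It is restricted to $q^2 < 4m^2$, where the denominator stays positive.

```python
import math

ALPHA = 1.0 / 137.035999  # fine-structure constant (approximate)

def f2_in_units_of_alpha_over_2pi(q2, m=1.0, n=200):
    """Midpoint estimate of the Feynman-parameter integral for q2 < 4 m^2.

    The simplex is parametrized by z and v in (0, 1) with
    x = (1 - z) v, y = (1 - z)(1 - v), Jacobian (1 - z)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            x = (1.0 - z) * v
            y = (1.0 - z) * (1.0 - v)
            denom = (1.0 - z) ** 2 * m**2 - x * y * q2
            total += (1.0 - z) * 2.0 * m**2 * z * (1.0 - z) / denom
    return total * h * h

def anomalous_moment(q2=0.0):
    """One-loop F_2(q2) under the assumptions stated above."""
    return ALPHA / (2.0 * math.pi) * f2_in_units_of_alpha_over_2pi(q2)

if __name__ == "__main__":
    print(anomalous_moment(0.0), ALPHA / (2.0 * math.pi))
```

At $q^2 = 0$ the integrand reduces to $2z$ after the $(1-z)^2$ factors cancel, so the parameter integral equals one and the Schwinger value follows; for spacelike $q^2$ the form factor is smaller.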
6.5 Problems for Chapter 6
*6.1 Use the Feynman rules of QED to compute the amplitude for Compton scattering
(scattering of a photon by a charged particle) to leading order in perturbation
theory.
*6.2 Compute the scattering of an electron by a heavier spin-$\frac12$ charged particle at
leading order in perturbation theory. Show that it is related by crossing symmetry
to the annihilation amplitudes discussed in the text. Compute the spin-averaged
differential cross section. Take the limit where the heavy-particle mass goes to
infinity and show that you obtain Rutherford's formula for scattering of a charged
particle from a nucleus.
6.3 Set up an expansion for the functional determinant of $i\gamma^\mu\partial_\mu - M - V(x)$, in
the limit of large mass $M$. $V(x)$ is a smoothly varying external field, which may
be a matrix in spinor space as well as in the external index space of a collection
of Weyl fields. For simplicity, insist that $V = iA_\mu(x)\gamma^\mu + i\phi(x)$, where $A_\mu$ and $\phi$
are Hermitian matrices in internal space. Work in Euclidean space (although the
result is also valid for Minkowski signature).
*6.4 Compute the differential cross section for electron-positron annihilation into a
particle-anti-particle pair of spin-zero bosons with charge $Q$.
*6.5 Compute the magnetic moment for a Dirac particle, with no interaction corrections.
Show that the gyromagnetic ratio is 2. Find the general relation between
the gyromagnetic ratio $g$ and the magnetic form factor $F_2(0)$.
*6.6 Let $A_i$, $i = 1,\ldots,n$, be positive real numbers. Start from the obvious identity
(Schwinger)
$$\frac{1}{A_i} = \int_0^\infty ds_i\,e^{-s_iA_i}$$
and prove the Feynman identity used in the text by making the change of variables
$s_i = x_is$, with $\sum_i x_i = 1$.
*6.7 Show that the one-loop amplitude for scattering an electron in a background field
contains an infrared divergence, behaving like $\ln\mu$ as the photon mass is taken to
zero. Consider the inclusive cross section for scattering in the field, including the
possibility of emitting a finite-energy photon of energy $E$. Show that, if $E$ is kept
finite as $\mu \to 0$, the inclusive cross section for either no photon or one photon
emitted in the scattering process is finite. The generalization of this, to all orders
in perturbation theory, is that we must include the possibility of any number
of photons, of total energy less than E. Faddeev and Kulish [35] addressed the
problem of defining IR finite amplitudes rather than just inclusive cross sections.
Symmetries, Ward identities, and
Nambu-Goldstone bosons
Emmy Noether, one of the first great female mathematical physicists, proved a theorem
central to the study of symmetries in classical and quantum physics. Noether's theorem
was proved in the context of classical mechanics, but the path-integral formalism allows
us to immediately generalize it to the quantum theory.
Noether's theorem applies to groups of transformations that depend on a continuous
parameter, also known as Lie groups. 1 In general, the group will have a number of
independent continuous parameters (e.g. the Euler angles of the rotation group), but
Noether's theorem concentrates on one parameter at a time.
The classical Noether theorem states that any one-parameter group of global symmetries
of the action leads to a conservation law. In order to discuss Noether's theorem
in a general way we introduce a field vector $\Phi^i$ that contains all components of all
possible elementary fields in our theory. Using the condensed notation of the field vector,
we write the infinitesimal variation of the fields under the symmetry as $\epsilon\,\delta_G\Phi^i(x)$,
where $\epsilon$ is the ($x$-independent) infinitesimal group parameter and $\delta_G\Phi^i$ is the variation
of the field vector under the symmetry transformation. Generally it will be a function
of fields and their first derivatives, for an action that depends on at most first derivatives
of the field. The statement of invariance of the action is
$$\delta_G S = \int d^4x\,\frac{\delta S}{\delta\Phi^i(x)}\,\epsilon\,\delta_G\Phi^i(x) = 0.$$
Note that this holds whether or not the fields satisfy the equations of motion, but only
for constant $\epsilon$.
If $\epsilon$ is allowed to vary over space-time, the action is not usually invariant (unless we
have a gauge symmetry, but, as we shall see, that situation is completely different). For
an action depending only on first derivatives of the fields, the change in the action will
be a linear functional of $\partial_\mu\epsilon$:
$$\delta_G S = \int\partial_\mu\epsilon\,J^\mu_G = -\int\epsilon\,\partial_\mu J^\mu_G.$$
The first of these equalities defines the Noether current $J^\mu_G(x)$, while the second follows
from the first if $\epsilon$ is chosen to vanish rapidly enough at infinity to justify integration
by parts. The equation for variation of the action shows us that, if the fields satisfy the
1 Before reading this chapter, the reader should consult Appendix E, and perhaps some of the references in [1^].
classical equations of motion, then the variation vanishes even for variable $\epsilon$. Comparison
with the last form of the Noether formula tells us that, when the fields are on shell,
the current must be conserved. This is Emmy Noether's celebrated theorem.
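Noether's theorem can be illustrated with a toy model: a two-component "field" $\phi_a(t)$ in zero space dimensions with an $O(2)$-invariant potential. The sketch below assumes the potential $V = \frac12 m^2\phi^2 + \frac14\lambda(\phi^2)^2 + \frac12\delta\,\phi_1^2$, where the hypothetical parameter $\delta$ breaks the symmetry, and integrates the equations of motion with a leapfrog scheme; all names and parameter values are illustrative assumptions. The Noether charge of the rotation symmetry is $Q = \phi_1\pi_2 - \phi_2\pi_1$.

```python
def force(phi, m2, lam, delta):
    """Minus the gradient of V = m2/2 |phi|^2 + lam/4 |phi|^4 + delta/2 phi_1^2."""
    r2 = phi[0] ** 2 + phi[1] ** 2
    radial = -(m2 + lam * r2)
    return (radial * phi[0] - delta * phi[0], radial * phi[1])

def charge(phi, pi):
    """Noether charge of the O(2) rotation: phi_1 pi_2 - phi_2 pi_1."""
    return phi[0] * pi[1] - phi[1] * pi[0]

def max_charge_drift(delta, steps=2000, dt=0.005, m2=1.0, lam=0.5):
    """Largest |Q(t) - Q(0)| along a leapfrog trajectory."""
    phi = (1.0, 0.0)
    pi = (0.3, 0.8)
    q0 = charge(phi, pi)
    drift = 0.0
    f = force(phi, m2, lam, delta)
    for _ in range(steps):
        # Half kick, full drift, half kick (velocity Verlet).
        pi = (pi[0] + 0.5 * dt * f[0], pi[1] + 0.5 * dt * f[1])
        phi = (phi[0] + dt * pi[0], phi[1] + dt * pi[1])
        f = force(phi, m2, lam, delta)
        pi = (pi[0] + 0.5 * dt * f[0], pi[1] + 0.5 * dt * f[1])
        drift = max(drift, abs(charge(phi, pi) - q0))
    return drift

if __name__ == "__main__":
    print("symmetric potential:", max_charge_drift(0.0))
    print("broken symmetry:    ", max_charge_drift(0.5))
```

With $\delta = 0$ the charge stays constant up to rounding error, because each kick is parallel to $\phi$ and each drift is parallel to $\pi$; with $\delta \neq 0$ the charge changes at the rate $\delta\,\phi_1\phi_2$, which is the analog of a non-conserved current when the action is not invariant.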
To derive the quantum version of this theorem in the path-integral formalism, one
performs a change of variables in the numerator path integral for $Z[J]$, which has the
form of a space-time-dependent symmetry transformation: viz. $\Phi^i \to \Phi^i + \epsilon(x)\,\delta_G\Phi^i$.
One assumes that the measure of integration $[d\Phi]$ is invariant under this field redefinition.$^2$
Using the fact that a change of variables does not change the value of an integral,
we obtain, to first order in $\epsilon$,
0=f [dO]e lS+ /*' / ( f d 4 x e d tl j£ + 8 G <t
We will generally deal with theories in which the action of the symmetry group
is linear in $\Phi$, $\delta_G\Phi^i = T^i_j\Phi^j$. The most common case of non-linear action is that of
Nambu-Goldstone bosons, where the space of fields is a curved sub-manifold of a
linear space on which the group acts linearly. This case can be subsumed under linear
actions by coupling sources to the original linear fields and incorporating the sub-
manifold constraint into the measure of the path integral. For linear action, we can
I Vj&ji = 1 i (T T y i j j ,
which is the effect on the generating functional of an infinitesimal linear change of
variables, $\delta J = T^TJ$. If, when $\epsilon$ is constant, we can drop the term $\int\partial_\mu J^\mu_G$ (because it
is the integral of a total divergence), then we derive the quantum version of Noether's
theorem: $Z[J]$ is invariant under transformations of $J$ inverse to those of $\Phi$. The same is
obviously true for the connected generating functional $W[J]$, while the 1PI functional
$\Gamma[\Phi]$ is invariant under the original transformations on $\Phi$. We could have derived this
statement simply by making our change of variables with a constant parameter. We have carried
out the more general transformation in order to understand how the derivation could
fail. As we will see, the failure, when it occurs, has to do with the fact that the integral
of the divergence of the current doesn't vanish.
To see how this result is used, let's study the special case of scalar fields with an $O(n)$
internal rotation symmetry. The statement that $W[J]$ is invariant under rotations of $J$
is equivalent to the statement that the Green functions
$$\langle\phi^{a_1}(x_1)\ldots\phi^{a_k}(x_k)\rangle$$
are constructed from invariant tensors of the $O(n)$ group. These tensors are the Kronecker
$\delta_{ab}$, and the Levi-Civita symbol $\epsilon_{a_1\ldots a_n}$. In particular, the one-point function
vanishes, and the two-point function satisfies
$$\langle\phi^a(x)\phi^b(y)\rangle = \delta^{ab}G(x-y).$$
2 In cases where it is not, we have what is called a quantum anomaly in the symmetry, or an anomaly for
On applying the Lehmann spectral representation, we conclude that the particles created
from the vacuum by different components of the $\phi^a$ field all have the same mass: we
have an $n$-fold-degenerate particle multiplet, as a consequence of the $O(n)$ symmetry. In
the exercises, the diligent reader will employ the symmetry to constrain the properties
of particle scattering amplitudes.
7.1 Space-time symmetries
The treatment of space-time symmetries is roughly similar to that of internal sym-
metries, but the differences are illuminating. Let us start from space-time translations.
The first obvious difference is that the Lagrangian density is no longer invariant. Under
translations,
$$\mathcal{L}(x) \to \mathcal{L}(x+a),$$
or, infinitesimally,
$$\delta\mathcal{L} = a^\mu\partial_\mu\mathcal{L}.$$
As a consequence, when we (using the field vector notation) make an infinitesimal,
space-time-dependent translation $x^\mu \to x^\mu + f^\mu(x)$, we find
85
Thus.
./«,
ss =/ d4 ' v8 - /; {^ 8 -*' (v) -'"4
$$= \int d^4x\,\partial_\mu f_\nu\,T^{\mu\nu}(x).$$
Thus, in classical mechanics, when the equations of motion are satisfied, the stress-energy
tensor or energy-momentum tensor $T^{\mu\nu}(x)$ satisfies $\partial_\mu T^{\mu\nu} = 0$, so that
$$P^\nu = \int d^3x\,T^{0\nu}$$
is time-independent. $P^\nu$ is the energy momentum, also identified in the quantum theory
as the operator that generates infinitesimal translations in space-time on the Hilbert
space of states. The quantum analog of stress-energy conservation is the Ward identity
$$\partial^x_\mu\langle 0|T\left(T^{\mu\nu}(x)\Phi^{i_1}(x_1)\ldots\Phi^{i_n}(x_n)\right)|0\rangle = -i\sum_k\delta^4(x-x_k)\,\langle 0|T\left(\Phi^{i_1}(x_1)\ldots\partial^\nu\Phi^{i_k}(x_k)\ldots\Phi^{i_n}(x_n)\right)|0\rangle.$$
As before, this is derived by implementing a change of variables in the path integral,
replacing the fields by their values after a space-time-dependent translation, and then
noting that the change of variables does not change the value of the functional integral.
The fact that an infinitesimal space-time-dependent translation is actually a general
infinitesimal coordinate transformation suggests that the conserved currents associated
with Lorentz, or other, space-time symmetries will also be associated with the stress
tensor. In fact, this is true, but only after we take into account the fact that Noether's
definition of the stress tensor is somewhat ambiguous. In fact, it is easy to verify (see
the problems) that the change
$$T^{\mu\nu} \to T^{\mu\nu} + \partial_\lambda M^{\mu\nu\lambda},$$
where $M^{\mu\nu\lambda} = -M^{\lambda\nu\mu}$, changes neither the fact that the stress tensor is conserved nor
the value of the translation generator, $P^\nu$.
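Both claims follow in two lines. The following is a sketch rather than the full solution of the problem; it assumes the shift has the form $\partial_\lambda M^{\mu\nu\lambda}$ with $M$ antisymmetric under exchange of its first and last index, that $P^\nu = \int d^3x\,T^{0\nu}$, and that the fields fall off at spatial infinity.

```latex
\begin{align*}
\partial_\mu\bigl(T^{\mu\nu} + \partial_\lambda M^{\mu\nu\lambda}\bigr)
  &= \partial_\mu T^{\mu\nu} + \partial_\mu\partial_\lambda M^{\mu\nu\lambda} = 0,
  && \text{since } \partial_\mu\partial_\lambda \text{ is symmetric and } M^{\mu\nu\lambda} = -M^{\lambda\nu\mu}, \\
P^\nu \;\to\; P^\nu + \int d^3x\,\partial_\lambda M^{0\nu\lambda}
  &= P^\nu + \int d^3x\,\partial_i M^{0\nu i} = P^\nu,
  && \text{since } M^{0\nu 0} = 0 \text{ and the rest is a surface term.}
\end{align*}
```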
The easiest way to resolve this ambiguity is to consider the quantum field theory we
are studying on a general space-time geometry, rather than sticking to flat Minkowski
space. This is of course a good thing to do, because Einstein's theory of gravitation
tells us that gravitation is nothing but curved space-time geometry, with the geometry
determined by the matter in it. (To quote J. A. Wheeler: "Space-time tells matter how to
move, and matter tells space-time how to curve.") The basic idea of curved space-time
geometry is that the Minkowski metric $\eta_{\mu\nu}$, which determines infinitesimal space-time
intervals between points, is replaced by a general symmetric tensor $g_{\mu\nu}(x)$, which is
an invertible matrix with signature $(+,-,-,-)$. It is easy to generalize the action for
scalar and Maxwell fields simply by replacing $\eta_{\mu\nu}$ by $g_{\mu\nu}(x)$ and the volume element by
$$d^4x \to d^4x\,\sqrt{-g},$$
where g is the determinant of the metric. Spinor fields require a more involved discus-
sion, which we will take up after we have understood non-abelian gauge theory and
vector bundles.
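The scalar Lagrangian is
$$\sqrt{-g}\,\bigl(g^{\mu\nu}\,\partial_\mu\phi\,\partial_\nu\phi - V(\phi)\bigr),$$
while the Maxwell Lagrangian is
$$-\frac{1}{4}\sqrt{-g}\,\bigl(g^{\mu\nu}g^{\lambda\kappa}F_{\mu\lambda}F_{\nu\kappa}\bigr).$$
It is easy to see that, if we make a general coordinate transformation on the fields, as
well as the transformation
$$g_{\mu\nu}(x) \to \frac{\partial x^\lambda}{\partial x'^\mu}\,\frac{\partial x^\kappa}{\partial x'^\nu}\,g_{\lambda\kappa}(x),$$
then the action is invariant.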
As a consequence
$$\int d^4x\,\sqrt{-g}\;\delta_f g_{\mu\nu}(x)\,T^{\mu\nu}(x) = 0,$$
whenever the fields $\phi^i(x)$ satisfy the classical equations of motion.³ Here
$$T^{\mu\nu}(x) = \frac{1}{\sqrt{-g}}\,\frac{\delta S}{\delta g_{\mu\nu}(x)}$$
and $\delta_f g_{\mu\nu}(x)$ is the variation of the metric under the infinitesimal coordinate
transformation $x^\mu \to x^\mu + f^\mu(x)$. In the approximation where we neglect gravitational
back reaction of the quantum fields on the metric, we are of course interested only
in fixed background metrics. The W-T identities are only interesting when the coor-
dinate transformations correspond to isometries of the metric, i.e. transformations
that leave the form of the metric invariant. For Minkowski space this means Poincare
transformations. Another interesting possibility is conformal isometries, where
$$\delta_f g_{\mu\nu}(x) \propto g_{\mu\nu}(x).$$
These are of interest if the quantum field theory is invariant under Weyl transformations
of the metric, $g_{\mu\nu}(x) \to \Omega^2(x)\,g_{\mu\nu}(x)$. The conformal isometry group of Minkowski
space is isomorphic to SO(2,4). In addition to Poincare transformations, it contains
scale transformations $x^\mu \to \lambda x^\mu$, for positive $\lambda$, and special conformal transformations,
whose infinitesimal form is $\delta x^\mu = b^\mu x^2 - 2\,b_\nu x^\nu x^\mu$. Conformal invariance implies that
$T^\mu{}_\mu$, the trace of the stress tensor, vanishes. It's also obvious that conformal
transformations cannot be symmetries of a theory with a discrete mass spectrum, unless all the
particle masses are zero.
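The link between these transformations and the trace of the stress tensor can be made explicit. The sketch below assumes flat $d$-dimensional Minkowski space, a symmetric conserved $T^{\mu\nu}$, and the special conformal vector field $\xi^\mu = b^\mu x^2 - 2(b\cdot x)x^\mu$; the symbol $\xi$ is introduced here only for illustration.

```latex
\begin{align*}
\partial_\mu \xi_\nu &= 2\,b_\nu x_\mu - 2\,b_\mu x_\nu - 2\,(b\cdot x)\,\eta_{\mu\nu}, \\
\partial_\mu \xi_\nu + \partial_\nu \xi_\mu &= -4\,(b\cdot x)\,\eta_{\mu\nu}
   = \frac{2}{d}\,(\partial\cdot\xi)\,\eta_{\mu\nu},
   \qquad \partial\cdot\xi = -2d\,(b\cdot x), \\
\partial_\mu\bigl(\xi_\nu T^{\mu\nu}\bigr)
   &= \tfrac{1}{2}\,(\partial_\mu \xi_\nu + \partial_\nu \xi_\mu)\,T^{\mu\nu}
   = \frac{1}{d}\,(\partial\cdot\xi)\,T^\mu{}_\mu .
\end{align*}
```

The same steps with $\xi^\mu = \epsilon\,x^\mu$ cover scale transformations, so in both cases the would-be current $\xi_\nu T^{\mu\nu}$ is conserved exactly when the trace vanishes.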
The stress tensor derived by variation with respect to the metric is symmetric. It is
also conserved, as a consequence of translation invariance. This symmetric tensor (the
Belinfante tensor) is equal to the Noether current only for translations for scalar fields.
In general it differs from the Noether current by a divergence of an anti-symmetric
tensor, the Noether ambiguity. Symmetry of the Belinfante tensor also implies that
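$\partial_\mu J^{\mu\nu\lambda} = 0$, where
$$J^{\mu\nu\lambda} = x^\nu T^{\mu\lambda} - x^\lambda T^{\mu\nu}.$$
The corresponding conserved charges
$$J^{\nu\lambda} = \int d^3x\,J^{0\nu\lambda}$$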
are the boost and angular-momentum generators. Note that the boost generators
contain explicit time dependence, so that the conservation law does not imply that
they commute with the Hamiltonian. Indeed, they shouldn't, because the commutation
rules of the Poincare algebra say that the commutator of a boost generator with the
Hamiltonian is a spatial momentum component.
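A short worked version of this statement may help. It is a sketch that assumes the charges are $J^{\nu\lambda} = \int d^3x\,(x^\nu T^{0\lambda} - x^\lambda T^{0\nu})$, writes $K^i \equiv J^{0i}$ (a label introduced here), and uses the Heisenberg equation with $\hbar = 1$; overall signs depend on these conventions.

```latex
\begin{align*}
K^i &= \int d^3x\,\bigl(x^0\,T^{0i} - x^i\,T^{00}\bigr)
     = t\,P^i - \int d^3x\; x^i\,T^{00}, \\
0 = \frac{dK^i}{dt} &= \frac{\partial K^i}{\partial t} + i\,[H, K^i]
     = P^i + i\,[H, K^i]
\quad\Longrightarrow\quad [H, K^i] = i\,P^i .
\end{align*}
```

The explicit factor of $t$ is what allows a conserved charge to have a non-vanishing commutator with $H$.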
³ In the quantum theory, using by now familiar manipulations, this translates into a Ward-Takahashi (W-T)
identity in which the insertion of the indicated variation into a Green function gives us the variation of the
Green function under the transformation.
It is easy to verify that the free massless theories with spin $\frac{1}{2}$ and spin 1 have traceless
Belinfante tensor. This is not the case for spin zero, but we can write a new tensor
$$\tilde{T}^{\mu\nu} = T_B^{\mu\nu} - \frac{1}{6}\,(\partial^\mu\partial^\nu - \eta^{\mu\nu}\partial^2)\,\phi^2,$$
which is traceless. Note that this improved stress tensor is not invariant under translation
of the field $\phi \to \phi + c$, whereas the Belinfante tensor is.⁴ There is an interesting
connection between the lack of conformal invariance of the spin-zero Belinfante tensor
and the theory of Nambu-Goldstone bosons (NGBs) that we will investigate in the
next section. The field translation invariance of the massless free scalar is the simplest
example of a spontaneously broken symmetry. We will learn that every independent
one-parameter group of symmetries that does not leave the vacuum state invariant
gives rise to a massless spin-zero particle, the NGB. The theory of NGBs contains a
dimensionful constant, $f$, defined by
$$\langle 0|J_\mu(0)|p\rangle = \frac{f\,p_\mu}{\sqrt{(2\pi)^3\,2\omega_p}}.$$
In $d$ space-time dimensions currents have mass dimension $d - 1$ while states have mass
dimension $(1 - d)/2$. Thus $f$ has dimension $(d - 2)/2$. The theory of NGBs is not
conformally invariant, except perhaps for $d = 2$.⁵ Correspondingly, the traceless stress
tensor for the massless field explicitly breaks the field translation symmetry.
To summarize, the W-T identities for Poincare symmetry are encoded in the con-
servation and symmetry of the Belinfante stress tensor. Conformally invariant theories
satisfy the additional property that the stress tensor is traceless. When we study renor-
malization in a later section, we will learn that classical conformal invariance usually
fails in the quantum theory. Conformal quantum field theories (CFTs) are few and far
between, but they provide us with the proper definition of all quantum field theories
in existence.
7.2 Spontaneously broken symmetries
The predictions of symmetries are powerful and have broad applications. Surprisingly,
even more wonderful results appear when the symmetry is broken spontaneously. We
use this phrase to describe what happens when, despite an exact symmetry of the
Lagrangian, the Green functions of a theory do not obey the symmetry predictions we
have just discussed. This is a phenomenon special to quantum field theory, resulting
from the infinite volume of space. Of course, in the real world we do not know that the
⁴ The diligent reader should take all the unproven statements in this section as additional exercises.
⁵ In fact, the Coleman-Mermin-Wagner theorem tells us that spontaneous breakdown of compact symmetry
groups does not occur in two dimensions. In particular, the abelian NGB theory is conformally
invariant and does not have spontaneous symmetry breakdown.
volume is infinite. Nonetheless, the predictions of spontaneously broken symmetry are
valid with high precision even for finite volume as long as the dynamics is governed by
local field theory, perhaps with an ultraviolet cut-off, and the volume is large in cut-off
units.
To see how spontaneously broken symmetry can occur, we write the Noether formula
for variable $\epsilon$ and take the limit of constant $\epsilon$ more carefully. Thus, by differentiating
$k$ times w.r.t. $J$, we obtain
$$\partial_\mu \langle J^\mu(x)\,\Phi^{i_1}(x_1)\ldots\Phi^{i_k}(x_k)\rangle = i\sum_n \delta^4(x - x_n)\,\langle \Phi^{i_1}(x_1)\ldots T^{i_n}{}_j\,\Phi^j(x_n)\ldots\Phi^{i_k}(x_k)\rangle.$$
This is called the Ward-Takahashi identity.
On integrating this w.r.t. x and dropping the integral of a total derivative, we obtain
the invariance of the Green functions. The derivation can fail if it is illegitimate to drop
the integrated total derivative. If we take all of the $x_i$ to the same point $y$ (in a way
we will understand better when we discuss renormalization - this is called an operator
product expansion), invariance will fail if and only if
$$\int d^4x\,\partial_\mu \langle J^\mu(x)\,\Pi^A(y)\rangle \neq 0$$
for some (possibly composite) operator $\Pi^A$, which transforms in a non-trivial
representation of the symmetry group. In Fourier space this is the statement that
$$p_\mu \Gamma^{\mu A}(p) = p^2\,\Gamma^A(p^2) \neq 0,$$
when $p \to 0$. Here we have used Lorentz invariance to write the Fourier transform of
the Green function, $\Gamma^{\mu A}(p) = p^\mu\,\Gamma^A(p^2)$. We conclude that the two-point function has
a pole, so the theory contains a massless particle that can be created from the vacuum
both by the current and by the field $\Pi^A$. This is the celebrated Nambu-Goldstone
boson [36-40]. It carries spin zero, because the right-hand side of the equation can
be non-vanishing only if $\langle \Pi^A \rangle \neq 0$. If Lorentz invariance is not itself spontaneously
broken (which doesn't happen in most field theories), then $\Pi^A$ is a scalar field. Notice
that this also assumes that the symmetry is an internal symmetry, which doesn't change
the spin of fields. In space-time dimensions higher than 2, the only exception to this
rule is supersymmetry. In that case the Nambu-Goldstone (NG) particle is a fermion,
called the Goldstino.
We can obtain a more physical understanding of this result, and one that is useful in
systems that are not Lorentz-invariant, by thinking about the implications of the non-zero
vacuum expectation value (VEV) in the operator formalism. In operator language,
a one-parameter group of symmetries is implemented by a one-parameter group of
unitary operators $U(\omega) = e^{i\omega Q}$, where $Q$ is Hermitian. A symmetry-violating VEV is
possible only if the vacuum is not invariant under the symmetry transformation, which
means that $Q|0\rangle \neq 0$. On the other hand, since $Q$ commutes with the Hamiltonian,
the states $U(\omega)|0\rangle$ all have zero energy.⁶ Now consider instead the operator
$Q_V = \int d^3x\,\omega(x)\,J^0(x)$, where $\omega(x)$ is a smooth function which vanishes outside of a volume $V$
and is equal to a constant over most of the inside of this volume. We will consider
taking the volume to infinity and letting the function $\omega(x)$ become more and more
slowly varying. The commutator of the Hamiltonian with this operator is (we use
current conservation)
$$[H, Q_V] \propto \int d^3x\,\nabla\omega\cdot\mathbf{J},$$
which can be made arbitrarily small as we take $V \to \infty$ and $\nabla\omega \to 0$. Thus we find,
as a consequence of spontaneously broken continuous symmetry (a symmetry that
does not preserve the ground state), a spectrum of localized excitations of arbitrarily
low energy. These are the NG bosons. In a relativistic theory they must be spin-zero
massless particles. The phenomenon of spontaneous symmetry breaking occurs in
non-relativistic condensed-matter systems. In general these systems may not have any
space-time symmetries, so all we can conclude is that the NG bosons have a dispersion
relation satisfying $E(p \to 0) \to 0$. (Note that the energy gap in a non-relativistic system
has nothing to do with particle masses.)
The picture to keep in mind for understanding what is going on in a theory with
spontaneous symmetry breaking is shown in Figures 7.1 and 7.2. The first of these
shows a ground state with spontaneously broken symmetries. The rotational symmetry
of the arrows is broken because they all want to point in the same direction. Figure 7.2
shows a Nambu-Goldstone excitation of such a ground state: a long-wavelength wave
in the directions of the arrows.
A final note: it is not necessary for an elementary field, which appears in the
Lagrangian, to have a VEV. If there is any Green function of elementary fields that
Fig. 7.1. A state with spontaneously broken symmetry.
Fig. 7.2. A Nambu-Goldstone excitation.
⁶ We are ignoring mathematical rigor here; in fact the operators $U(\omega)$ a[…]
Hilbert space of our system. But the argument is valid nonetheless.
violates a symmetry, we can consider the limit where all points in this function are
taken close to the same point x. As we will discuss later, in this limit the product of
elementary fields has an expansion (operator product expansion or OPE) in terms of
functions of coordinate differences, multiplied by local composite operators evaluated
at x. For example, in free field theory
$$\phi(x)\,\phi(y) - \langle\phi(x)\,\phi(y)\rangle \;\sim\; :\!\phi^2(y)\!: +\; o(x - y).$$
If a symmetry-violating Green function is non-vanishing then some local composite
operator $\mathcal{O}^A$ will have a non-zero VEV, and we can replay Goldstone's theorem with
$\Pi^A \to \mathcal{O}^A$. So, like the LSZ formula, Goldstone's theorem applies to composites of
Lagrangian fields, and is more general than the Lagrangian formalism.
7.3 Nambu-Goldstone bosons in the semi-classical
expansion
In the semi-classical approximation, we can understand spontaneous symmetry break-
ing in a very direct and intuitive fashion. For simplicity, consider a simple model of a
complex scalar field with a U(1) symmetry. The Lagrangian is
$$\mathcal{L} = |\partial_\mu\phi|^2 - V(\phi^*\phi).$$
The classical condition for spontaneous symmetry breaking is that the minimum of the
potential (which we fix to zero by adding a constant) occurs at a non-zero value of the
field. A simple example is
$$V = -\mu^2|\phi|^2 + \frac{\lambda}{2}\,|\phi|^4.$$
The minimum is at $|\phi|^2 = \mu^2/\lambda = v^2$. There is a whole circle of minima in the $\phi$ plane,
$\phi = v\,e^{i\theta}$. When we expand the potential around any of these minima, the curvature of
the potential in the circle direction vanishes, and we have a massless field.
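The vanishing curvature along the circle can be checked numerically. The following is a minimal sketch, assuming the potential $V = -\mu^2|\phi|^2 + (\lambda/2)|\phi|^4$ with $\phi = a + ib$; the parameter values, the point chosen on the circle, and all function names are illustrative choices, not taken from the text.

```python
import math

MU2 = 1.0   # mu^2 (illustrative value)
LAM = 0.5   # lambda (illustrative value)

def potential(a, b):
    """V = -mu^2 |phi|^2 + (lambda/2) |phi|^4 with phi = a + i b."""
    r2 = a * a + b * b
    return -MU2 * r2 + 0.5 * LAM * r2 * r2

def hessian(f, a, b, h=1e-4):
    """Central finite-difference second derivatives of f at (a, b)."""
    faa = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    fbb = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    fab = (f(a + h, b + h) - f(a + h, b - h)
           - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return faa, fab, fbb

def eigenvalues_2x2(faa, fab, fbb):
    """Eigenvalues (smaller first) of the symmetric matrix [[faa, fab], [fab, fbb]]."""
    mean = 0.5 * (faa + fbb)
    disc = math.sqrt((0.5 * (faa - fbb))**2 + fab**2)
    return mean - disc, mean + disc

v = math.sqrt(MU2 / LAM)   # radius of the circle of minima
theta = 0.7                # arbitrary point on that circle
a0, b0 = v * math.cos(theta), v * math.sin(theta)
flat, radial = eigenvalues_2x2(*hessian(potential, a0, b0))
print(flat, radial)        # close to 0 and to 4 * MU2, up to finite-difference error
```

Changing `theta` moves the expansion point around the circle without changing the two curvatures, which is the U(1) symmetry at work.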
More generally, if we introduce the vector of all real scalar fields in the theory, $S^A$,
which transforms as $S^A \to S^A + \omega^a (T_a)^A{}_B\,S^B$, then the condition of invariance of the
potential is
$$\frac{\partial V}{\partial S^B}\,\omega^a (T_a)^B{}_C\,S^C = 0.$$
If we differentiate this equation with respect to $S^A$ and set $S^A = v^A$, a minimum of the
potential, then we get
$$\left.\frac{\partial^2 V}{\partial S^A\,\partial S^B}\right|_{S=v} (T_a)^B{}_C\,v^C = 0.$$
For every linearly independent combination of the generators $T_a$ that does not annihilate
the VEV, we find another zero eigenvalue of the Hessian of $V$. But the eigenvalues
of the Hessian are simply the principal curvatures of the potential, so if $k$ generators
fail to annihilate the VEV, then we have $k$ flat directions and thus $k$ massless particles
in the expansion around the minimum.
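The counting can be illustrated with a small exact computation. This is a sketch under stated assumptions: the model is an O(3) vector with the hypothetical potential $V = (\lambda/4)(S\cdot S - v^2)^2$, the generators are taken in the vector representation as $(T_a)_{bc} = -\epsilon_{abc}$, and the VEV points along the third axis. None of these choices come from the text.

```python
LAM = 1   # quartic coupling (illustrative)
V0 = 2    # length of the VEV (illustrative)

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# so(3) generators in the vector representation: (T_a)_{bc} = -eps_{abc}
GENERATORS = [[[-levi_civita(a, b, c) for c in range(3)] for b in range(3)]
              for a in range(3)]

def matvec(m, x):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * x[j] for j in range(3)) for i in range(3)]

def hessian(s):
    """Hessian of V = (LAM/4) (S.S - V0^2)^2 at the point s."""
    s2 = sum(x * x for x in s)
    return [[LAM * (s2 - V0 * V0) * (1 if a == b else 0) + 2 * LAM * s[a] * s[b]
             for b in range(3)] for a in range(3)]

vev = [0, 0, V0]
hess = hessian(vev)

# Directions T_a v obtained by acting with each generator on the VEV.
directions = [matvec(t, vev) for t in GENERATORS]
# A generator is broken if it does not annihilate the VEV.
broken = [a for a, d in enumerate(directions) if any(d)]
# Each broken direction should be a null vector of the Hessian.
flat = [matvec(hess, directions[a]) for a in broken]

print("broken generators:", broken)
print("Hessian acting on broken directions:", flat)
print("radial curvature:", hess[2][2])
```

Two of the three generators move the VEV, and the Hessian annihilates both resulting directions, while the remaining (radial) direction has curvature $2\lambda v^2$. The unbroken generator is the rotation about the VEV axis, which generates the subgroup $H$ = O(2) in this toy example.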
When we expand the theory around $S^A = v^A$, the results are invariant under the
original symmetry transformations, but $v^A$ appears in all of the formulae. Thus we
really retain invariance only under the subgroup $H$ of the symmetry group $G$, which
preserves $v^A$. Note that the rules for functional integration do not allow us to integrate
over the constant value of $S^A - v^A$: fluctuating fields are required to fall off at infinity.
The symmetry is thus spontaneously broken to H. The action of the coset G/H on the
Green functions "takes us out of the Hilbert space of fluctuations of the vacuum." The
variations of the fields which correspond to slowly varying transformations in G/H,
which fall off at infinity, are the NG excitations.
As a final example of spontaneous symmetry breaking, let me mention QCD, with
$N_f$ massless quarks. The quark part of the Lagrangian is
$$\mathcal{L} = \bar{q}\,i\gamma^\mu D_\mu\,q.$$
For our present purposes you don't need to know anything about the QCD covariant
derivative $D_\mu$ except that it is proportional to the unit matrix in Dirac spin space, and
in the space of the "flavor" indices which distinguish the different quarks. It follows
that the classical Lagrangian is invariant under the following U($N_f$) × U($N_f$) group:
$$q_L \to U\,q_L, \qquad q_R \to V\,q_R,$$
where $q_{L,R}$ are the left- and right-handed components of $q$, and $U$ and $V$ are arbitrary
U($N_f$) matrices. This group is called chiral symmetry.
The operator $\bar{q}q$ is invariant under the diagonal U($N_f$) subgroup, for which $U = V$, but
transforms under elements of the coset of this subgroup. It turns out that the VEV of
$\bar{q}q$ is non-zero and this part of the symmetry is spontaneously broken. It also turns out
that the subgroup with $U = V^{-1} \propto 1$ is not a symmetry at all. The functional measure
of the quark fields is not invariant under this U$_A$(1) symmetry (this is called the chiral
anomaly).
For $N_f = 2$ the Lie algebra SU(2) × SU(2) ~ SO(4). The fields $(\bar{q}q,\ \bar{q}\,\tau^a\gamma_5\,q)$, where
$\tau^a$ are the Pauli matrices, transform as a 4-vector under this symmetry. The VEV of $\bar{q}q$
leaves only the SU(2) subgroup (isospin) invariant. There are three NG bosons, which
transform as an isovector. These are the pions, and this formalism helps to explain why
they are so much lighter than other hadrons. The non-zero mass of the pions is a con-
sequence of explicit breaking of the symmetry by non-zero up and down quark masses.
7.4 Low-energy effective field theory of
Nambu-Goldstone bosons
We have argued (and will argue in more detail when we discuss renormalization) that
low-energy physics can always be described by an effective Lagrangian, and also that
the low-energy physics of the NG bosons of a spontaneously broken symmetry is
constrained by Ward identities. A shortcut to understanding what these constraints
are, is to write down a Lagrangian involving the NG boson as an elementary field.
If we find that, to leading order in the NG boson 4-momentum, the most general
Lagrangian consistent with the symmetries contains only a few parameters, then the
predictions of this Lagrangian, which follow from the symmetries, are universal.
To get an idea of how this works, let's first look at some examples. The simplest is the
spontaneous breakdown of a U(1) symmetry. We expect a single NG boson. Let's call
its effective field $G$. It takes values in the manifold of vacua. The manifold of vacua is
just the group U(1) itself, i.e. a circle. The angle parametrizing that circle is identified
with $G/f$. $G$ is thus a periodic variable, with period $2\pi f$. $f$ has dimensions of mass and
is called the decay constant of $G$. The U(1) symmetry is just a shift of $G$. The most
general Lorentz-invariant, U(1)-invariant Lagrangian for $G$ with the minimal number
of derivatives is
$$\mathcal{L} = \frac{1}{2}\,(\partial_\mu G)^2. \qquad (7.1)$$
The shift symmetry does not allow any non-derivative interactions and, in particular, it
does not allow a mass term. Goldstone's theorem is thus incorporated automatically in
this formalism. Note that the coefficient of $\frac{1}{2}$ in front of the Lagrangian is a convention.
Changing it would just change the definition of $f$. Adding higher derivative terms would
introduce new constants, but these terms are negligible at low momentum.
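As a worked check that this Lagrangian reproduces the defining property of $f$, one can compute the Noether current of the shift symmetry. The sketch assumes the symmetry acts as $G \to G + \alpha f$ (so that the angle $G/f$ shifts by $\alpha$) and that $\mathcal{L} = \frac{1}{2}(\partial_\mu G)^2$; the normalization of the current is a convention.

```latex
\begin{align*}
J^\mu &= \frac{\partial\mathcal{L}}{\partial(\partial_\mu G)}\,\frac{\delta G}{\delta\alpha}
       = f\,\partial^\mu G, \\
\partial_\mu J^\mu &= f\,\partial^2 G = 0
       \qquad \text{(by the equation of motion } \partial^2 G = 0\text{)}.
\end{align*}
```

Since $G$ creates a single massless particle from the vacuum, the matrix element $\langle 0|J_\mu(0)|p\rangle$ is proportional to $f\,p_\mu$, up to a convention-dependent phase and the state normalization.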
Now consider a fermion field $\Psi$ describing some stable massive particle in the original
high-energy theory, in which symmetry breaking led to a NG boson. It can be either one
of the fundamental fields of that theory, or a composite state (e.g. something cubic in the
fundamental fermions). Let's suppose that, in the original theory, the interpolating field
for this particle transformed chirally under the U(1) symmetry: $\Psi \to e^{i\alpha\gamma_5}\,\Psi$. A mass
term for $\Psi$ would not be allowed by the symmetry, but in the low-energy effective
theory the symmetry is broken. The symmetry transformations affect only $G$ and no
other field. Another way to say this is that we can define
$$\psi = e^{-i\gamma_5 G/f}\,\Psi, \qquad (7.2)$$
which is invariant under the symmetry. $\psi$ can be used as the field in the low-energy
effective theory, and the symmetry cannot prevent us from writing a mass term for $\psi$.
The most general low-energy Lagrangian coupling $G$ to $\psi$ is
$$\mathcal{L} = \frac{1}{2}\,(\partial_\mu G)^2 + \bar{\psi}\,(i\gamma^\mu\partial_\mu - m)\,\psi + \frac{1}{f}\,\partial_\mu G\,\bar{\psi}\gamma^\mu\gamma_5\psi. \qquad (7.3)$$
I have made the assumption here that the underlying theory conserves parity symmetry,
so that G is a pseudo-scalar. Note that G is always derivatively coupled: the amplitude
for emission of a G particle will always be proportional to its 4-momentum. This is
known as Adler's theorem [41], and is a general property of NG bosons.
A more complicated, and physically relevant, example is the breakdown of an
SU(2) × SU(2) symmetry to its diagonal SU$_V$(2) subgroup. Quantum chromodynamics
(QCD), the theory of the strong interactions, contains six quark fields
$q^I$, $I = 1, \ldots, 6$. The quarks all have mass (with the possible and very subtle exception
of the up quark), but the masses of the up and down quarks are much smaller than
all the other strong-interaction energy scales. It is reasonable to imagine an approx-
imate description in which the up and down quark masses vanish. In this limit, the
classical QCD Lagrangian has a large symmetry group U(2) x U(2). If q denotes a
two-dimensional vector containing the up and down quark fields, the symmetry is
$$q_L \to e^{i\alpha_L^a\tau^a}\,q_L, \qquad q_R \to e^{i\alpha_R^a\tau^a}\,q_R.$$
The $\tau^a$ ($a = 0, \ldots, 3$) are the four independent two-by-two Hermitian matrices
$(1, \sigma_1, \sigma_2, \sigma_3)$, and $\alpha^a_{L,R}$ are real. The transformation with $\alpha_L^0 = \alpha_R^0$ (all others zero)
is called baryon number. The independent axial baryon number with $\alpha_L^0 = -\alpha_R^0$ is
not conserved because of something called a quantum anomaly, which we will discuss
later. The three-parameter subgroup of transformations with $\alpha_L^i = \alpha_R^i$ ($i = 1, \ldots, 3$)
is Wigner's isospin symmetry, from nuclear physics. We denote it by SU$_V$(2).
There are strong theoretical arguments and a wealth of experimental data, which
suggest that the VEV of $\bar{q}q$ is non-zero, spontaneously breaking the chiral symmetry
group SU(2) × SU(2) to SU$_V$(2). Let us assume that this is so. The Goldstone boson
field takes values in the coset space [SU(2) × SU(2)]/SU$_V$(2). We can view this as
the space of 2 × 2 unitary matrices, $\Sigma$, of determinant 1. Let $V_{L,R} = e^{i\alpha^i_{L,R}\sigma^i}$. Then
$\Sigma \to V_L^\dagger\,\Sigma\,V_R$ defines an action of [SU(2) × SU(2)] on the space of all $\Sigma$ matrices.
Furthermore, any matrix can be reached from $\Sigma = 1$ by this group action: the action is
transitive. The little (stability) subgroup of a particular matrix $\Sigma$ is isomorphic to that
of 1, which is just the subgroup SU$_V$(2) with $V_L = V_R$. This shows that the space of
$\Sigma$ matrices is just the coset [SU$_L$(2) × SU$_R$(2)]/SU$_V$(2).
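These group-theoretic statements are easy to test numerically. The sketch below assumes the action has the form $\Sigma \to V_L^\dagger \Sigma V_R$ (the placement of the dagger is a convention) and uses hand-rolled 2 × 2 complex matrices; the helper names `su2`, `act`, and the sample angles and axes are all illustrative.

```python
import math

def su2(theta, axis):
    """exp(i theta n.sigma) = cos(theta) 1 + i sin(theta) n.sigma, n = axis / |axis|."""
    norm = math.sqrt(sum(x * x for x in axis))
    nx, ny, nz = (x / norm for x in axis)
    c, s = math.cos(theta), math.sin(theta)
    return [[c + 1j * s * nz, s * (1j * nx + ny)],
            [s * (1j * nx - ny), c - 1j * s * nz]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def act(v_left, v_right, sigma):
    """Group action Sigma -> V_L^dagger Sigma V_R (assumed convention)."""
    return matmul(matmul(dagger(v_left), sigma), v_right)

IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
v_l = su2(0.3, (1.0, 2.0, -1.0))
v_r = su2(1.1, (0.0, 1.0, 1.0))
target = su2(0.8, (2.0, 0.0, 1.0))

image = act(v_l, v_r, target)
# The image is again unitary with unit determinant.
stays_in_su2 = (close(matmul(dagger(image), image), IDENTITY)
                and abs(det(image) - 1) < 1e-12)
# Transitivity: V_L = 1, V_R = target carries 1 to target.
reaches_target = close(act(IDENTITY, target, IDENTITY), target)
# The diagonal subgroup V_L = V_R leaves Sigma = 1 fixed ...
diagonal_fixes_one = close(act(v_l, v_l, IDENTITY), IDENTITY)
# ... while a pair with V_L != V_R moves it.
generic_moves_one = not close(act(v_l, v_r, IDENTITY), IDENTITY)
print(stays_in_su2, reaches_target, diagonal_fixes_one, generic_moves_one)
```

All four flags should come out true for these sample matrices, which mirrors the argument that the orbit of 1 is all of SU(2) and its stabilizer is the diagonal subgroup.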
The most general chirally invariant Lagrangian for E, with just two derivatives, is
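$$\mathcal{L} = f_\pi^2\,\mathrm{Tr}(\partial_\mu\Sigma^\dagger\,\partial^\mu\Sigma). \qquad (7.4)$$
$f_\pi$ is called the pion decay constant. The pion fields are introduced by writing $\Sigma = e^{i\pi^i\sigma^i/(2f_\pi)}$,
where $i = 1, 2, 3$. The term quadratic in pion fields is $\frac{1}{2}(\partial_\mu\pi_i)^2$. The higher-order terms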
control low-energy pion-pion scattering, as well as more complicated multiparticle
processes. Furthermore, when we introduce the weak interactions, the weak W bosons
couple to the SU$_L$(2) currents, which at low energy, and in the absence of heavy stable
particles like the nucleon, are completely described by applying Noether's theorem to
this Lagrangian. Thus $f_\pi$ controls both the low-energy strong interactions of pions and
the weak decays $\pi^- \to \mu^- + \bar{\nu}_\mu$. In addition, we can find relations between strong and
weak interactions of other hadrons using only the hypothesis of spontaneously broken
chiral symmetry. We will investigate some of this in the problems. The reader is urged
to consult the excellent account of this subject in [42].
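The way the quadratic pion term emerges from the two-derivative Lagrangian can be sketched as follows. The normalization is an assumption: we write $\Sigma = \exp(i\pi^i\sigma^i/(c f_\pi))$ with a convention-dependent number $c$ and take the Lagrangian to be $N f_\pi^2\,\mathrm{Tr}(\partial_\mu\Sigma^\dagger\partial^\mu\Sigma)$ with a convention-dependent prefactor $N$; neither symbol is from the text.

```latex
\begin{align*}
\Sigma &= 1 + \frac{i\,\pi^i\sigma^i}{c\,f_\pi} + O(\pi^2), \\
\mathrm{Tr}\bigl(\partial_\mu\Sigma^\dagger\,\partial^\mu\Sigma\bigr)
  &= \frac{1}{c^2 f_\pi^2}\,\partial_\mu\pi^i\,\partial^\mu\pi^j\,\mathrm{Tr}(\sigma^i\sigma^j)
     + (\text{terms quartic in } \pi)
   = \frac{2}{c^2 f_\pi^2}\,(\partial_\mu\pi^i)^2 + \ldots, \\
N f_\pi^2\,\mathrm{Tr}\bigl(\partial_\mu\Sigma^\dagger\,\partial^\mu\Sigma\bigr)
  &= \frac{2N}{c^2}\,(\partial_\mu\pi^i)^2 + \ldots
\end{align*}
```

A canonically normalized kinetic term $\frac{1}{2}(\partial_\mu\pi^i)^2$ therefore requires $c^2 = 4N$, for example $c = 2$ when $N = 1$. The quartic and higher terms are fixed by the same single constant $f_\pi$, which is why the symmetry is predictive for low-energy pion scattering.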
I now want to give a general description of the effective Lagrangian for spontaneous
breakdown of a general Lie group G, with vacuum stability subgroup H. I warn the
reader that this material is very general and abstract. It will be worth his/her while
to work out all the details for the second example (massless QCD) described above.
The dimension of $G$ ($H$) is $d_G$ ($d_H$). We will let $g$ denote a general element of $G$ and $h$
a general element of the subgroup. The (right) coset $G/H$ may be described as the set
of elements of $G$, with the equivalence relation $g \sim gh$ for any $h \in H$. It has dimension
$d_G - d_H$, and this is the number of physical NGBs. We want to describe space-time fields
which take values in $G/H$.⁷ These are mappings $g(x)$ of space-time into $G$, with the
space-time-dependent equivalence relation $g(x) \sim g(x)h(x)$. This is our first example
of a local gauge symmetry, or gauge equivalence. We want to write Lagrangians for the
field $g(x)$ that are invariant not only under the symmetry $G$, but also under the local
transformation $g(x) \to g(x)h(x)$. The global symmetry acts by left multiplication on
the group, $g(x) \to g_1\,g(x)$, $g_1 \in G$.
Gauge symmetry is thus seen to be something of a misnomer. Rather than being a
statement of the invariance of physics under some physical operation, it is a statement of
redundancy. The coset has only $d_G - d_H$ degrees of freedom. We are going to describe
it by the $d_G$ degrees of freedom in $g(x)$ but with a gauge ambiguity, the local $h(x)$
invariance, that tells us that only $d_G - d_H$ of them are physical. The advantage of this
procedure is that it allows us to avoid choosing a particular way of parametrizing the
coset space. It turns out that this choice cannot be put off forever, for when we quantize
the theory we have to eliminate the redundancy. However, we will see in Chapter 8 that
there is an elegant method of doing this, called Becchi-Rouet-Stora-Tyutin (BRST)
quantization, which is based on the classical invariance we are introducing.
Now to the problem of writing down G-invariant and local H-invariant Lagrangians.
There can be no terms in £ without derivatives. Indeed, since the action of the group
G on the coset space G/H is transitive, the only G-invariant function on this space is
a constant. So our Lagrangian must involve derivatives of $g$. The derivative $\partial_\mu g$ transforms
into $g_1\,\partial_\mu g$ under the global transformation, but under the gauge transformation
it transforms like
$$\partial_\mu g \to (\partial_\mu g + g\,\partial_\mu h\,h^{-1})\,h(x). \qquad (7.5)$$
This inhomogeneous transformation law is ugly and makes it hard to construct invariant
Lagrangians. To fix the problem, we introduce a gauge potential or H-connection $A_\mu$.
$A_\mu$ takes values in the Lie algebra of $H$. It is defined to transform as
$$A_\mu(x) \to h^{-1}(x)\,(A_\mu(x) + \partial_\mu)\,h(x). \qquad (7.6)$$
We will give a geometrical explanation for this rule in Chapter 8. For the moment, it is
justified by noting that the covariant derivative
$$D_\mu g = (\partial_\mu g - g A_\mu) \qquad (7.7)$$
transforms as
$$D_\mu g \to D_\mu g\,h. \qquad (7.8)$$
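The covariance of $D_\mu g$ can be verified directly. The sketch below assumes the gauge transformation $g \to gh$ together with $A_\mu \to h^{-1}A_\mu h + h^{-1}\partial_\mu h$, which is the transformation law of the H-connection written out explicitly.

```latex
\begin{align*}
D_\mu g = \partial_\mu g - g A_\mu
  \;&\to\; \partial_\mu(g h) - g h\,\bigl(h^{-1}A_\mu h + h^{-1}\partial_\mu h\bigr) \\
  &= (\partial_\mu g)\,h + g\,\partial_\mu h - g A_\mu h - g\,\partial_\mu h \\
  &= (\partial_\mu g - g A_\mu)\,h = (D_\mu g)\,h, \\
D_\mu g\,g^{-1} \;&\to\; (D_\mu g)\,h\,h^{-1} g^{-1} = D_\mu g\,g^{-1} .
\end{align*}
```

The inhomogeneous piece $g\,\partial_\mu h$ cancels between the derivative and the connection, and the combination $D_\mu g\,g^{-1}$ is unchanged by the local $H$ transformation.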
⁷ Note that, in our two examples, we had an explicit representation of the distinct elements of the coset, and
were able to avoid some of the machinery we are about to introduce. However, the machinery is necessary
to describe the coupling of pions to nucleons in our second example.
The variables $J_\mu = D_\mu g\,g^{-1}$ are invariant under local $H$ transformations and transform
as $J_\mu \to g_1 J_\mu g_1^{-1}$ under the global $G$ transformation, $g \to g_1 g$. We may think of
them (for each space-time point and index) as elements of the Lie algebra of $G$, which
are thus finite-dimensional matrices. There is a unique G-invariant bilinear, which is
also Lorentz-invariant:
$$\mathcal{L} = f^2\,\mathrm{Tr}(J_\mu J^\mu). \qquad (7.9)$$
This defines the non-linear $\sigma$ model on the coset space $G/H$. Since the multiplication
law of group elements indicates that they are dimensionless, $J_\mu$ has mass dimension 1,
so $f$ must have dimensions of mass. It is called the NGB decay constant. The "$\sigma$" in non-linear
sigma model and the "decay" in decay constant reflect the origins of these ideas
in the history of pion physics. All relevant mass scales in NGB physics are related to $f$.
The Lagrangian of the non-linear model depends on two field variables, $A_\mu$ and $g$,
but contains no derivatives of the gauge potential. Therefore, the variational equation
for $A_\mu$ is purely algebraic, and we can eliminate it in terms of $g$. We obtain
$A^m_\mu = \mathrm{Tr}[t^m g^{-1}\partial_\mu g]$, where $t^m$ are the generators of $H$. The reader should verify that the
right-hand side transforms like an H-connection under local $H$ transformations. The
Lagrangian written in terms of $g$ is
$$\mathcal{L} = f^2 \sum_{k \in G/H} \mathrm{Tr}[t^k g^{-1}\partial_\mu g]\;\mathrm{Tr}[t^k g^{-1}\partial^\mu g].$$
It is still G-invariant and locally H-invariant. To quantize it one would have to
make a choice of gauge. We will put off our discussion of quantizing gauge-invariant
Lagrangians until Chapter 8.
The gauge-invariant formulation of G/H dynamics is also useful for studying the
coupling of NGBs to other fields. Recalling that we are working at a scale below the
scale of spontaneous breakdown of global G symmetry, we should not expect these
fields to transform as representations of $G$, but merely of its subgroup $H$. If $\Psi$ is
such a field, the prescription for coupling $\Psi$ to $g$ is to replace all derivatives in the
H-invariant Lagrangian for $\Psi$ by gauge-covariant derivatives, using the gauge potential
$A^m_\mu = \mathrm{Tr}[t^m g^{-1}\partial_\mu g]$. The resulting Lagrangian must be used with care. It is a good tool
for calculating the emission of soft NG bosons from $\Psi$ particles, but not, for example,
for calculating the scattering of $\Psi$ particles from each other for general kinematics.
Momentum transfers must be small in order for pion exchanges to dominate nucleon
scattering. In configurations for which the kinematic invariants are large compared
with $f_\pi$, the low-energy effective Lagrangian does not capture all of the physics.
7.5 Problems for Chapter 7
*7.1. Use Noether's theorem to write the conserved currents associated with the O(n) or
U(n) symmetry of a general invariant Lagrangian with up to two derivatives, for
scalar fields transforming in the fundamental (defining) representation of these
groups. Show that the Lagrangian for massless Dirac fermions transforming in $N_f$
copies of a representation $R$ of any Lie group $G$ has a U($N_f$) × U($N_f$) symmetry
of separate unitary transformations on left- and right-handed components of the
Dirac fermions. We will see that most of this symmetry remains valid when we
couple the fermions to vector fields via
$$A^a_\mu\,\bar{\psi}\gamma^\mu t_a\psi,$$
where $t_a$ are the representatives of infinitesimal $G$ transformations in the $R$ representation.
A certain U(1) subgroup of U($N_f$) × U($N_f$) is broken by a subtle
quantum effect called an anomaly.
7.2. Use Noether's theorem to evaluate the conserved current corresponding to translation
symmetry, for general scalar field theories coupled to the Maxwell field
(couple them via the minimal substitution principle: $\partial_\mu\phi^i \to (\partial_\mu - i e q_i A_\mu)\phi^i$; $e$
is the coupling constant and $q_i$ the charge on the $i$th complex field). You should
find the momentum is the integral of the time component of the stress-energy
tensor
$$P_\mu = \int d^3x\,T_{0\mu},$$
$$\partial^\nu T_{\nu\mu} = 0.$$
The $T_{\mu\nu}$ you find will not be symmetric. Another way to define $T_{\mu\nu}$ is to construct
the action in a varying space-time metric (for the Lagrangian we are describing
this is simple: simply replace the Minkowski metric $\eta_{\mu\nu}$ by $g_{\mu\nu}(x)$ everywhere,
and replace $d^4x \to d^4x\,\sqrt{-g}$, where $g$ is the determinant of the metric). The stress
tensor is defined by
$$T_{\mu\nu}(x) = \frac{1}{\sqrt{-g}}\,\frac{\delta S}{\delta g^{\mu\nu}(x)}$$
and is obviously symmetric (we take $g_{\mu\nu} \to \eta_{\mu\nu}$ after taking the derivative).
Show that the two definitions give the same value for $P_\mu$ as long as fields fall off
sufficiently rapidly at infinity. Using the gravitational definition define
$$M^{\mu\nu\lambda} = x^\mu T^{\nu\lambda} - x^\nu T^{\mu\lambda}.$$
Show that $\partial_\lambda M^{\mu\nu\lambda} = 0$, and that $J^{\mu\nu} = \int d^3x\,M^{\mu\nu 0}$ is the conserved angular-momentum
generator.
*7.3. Suppose that the stress tensor is traceless:
$$\eta_{\mu\nu}\,T^{\mu\nu} = 0.$$
Show that there is a five-parameter set of quadratic polynomials
$$f^\mu = C\,x^\mu + C^\mu{}_{\nu\kappa}\,x^\nu x^\kappa$$
(you must find the allowed form of $C^\mu{}_{\nu\kappa}$) such that $f_\mu T^{\mu\nu}$ is conserved. Together
with the Poincare generators, the corresponding Noether charges form the Lie
algebra of the group of conformal transformations SO(2,4). Field theories invariant
under these transformations are called conformal field theories (CFTs). In
our discussion of renormalization we will see that the extreme ultraviolet and
infrared behavior of any field theory is described by a pair of conformal field
theories (though the IR limit may be trivial, and have only delta-function corre-
lations). Any field theory may be constructed by perturbation theory around its
UV conformal limit.
*7.4. The pion field $\pi^a(x)$ transforms as a triplet (vector) of the SU(2) isospin symmetry
of strong interactions. Show that, in the limit in which we consider this
symmetry to be exact, the Ward identities for Green functions, and the LSZ
formula, imply that all three pions have the same mass. Now use the symmetry
and the obvious invariance of the four-point function $\langle \pi^{a_1}(x_1) \cdots \pi^{a_4}(x_4) \rangle$ under
permutation of the pion fields to show that the $3^4$ different components of this
Green function are related to a much smaller number of functions of $x_1, \ldots, x_4$.
Use the LSZ formula to conclude that there are only two independent pion-pion
scattering amplitudes, in the limit in which isospin is a good symmetry.
7.5. Represent the nucleon at low energy by an isospin doublet field N. This field
is invariant under chiral transformations, but since it transforms under the
unbroken isospin subgroup, we must write its Lagrangian with a covariant
derivative:
$$\mathcal{L} = \bar N [ i\gamma^\mu (\partial_\mu - A_\mu^a t_a) - m ] N.$$
The Goldstone boson fields belong to the coset space $SU(2) \times SU(2)/SU_V(2)$,
where the vector or diagonal subgroup is the simultaneous action of the same
SU(2) transformation in both factors. In other words we have $(g_L(x), g_R(x))$, with
the gauge identification $(g_L(x), g_R(x)) \sim (g_L(x)h(x), g_R(x)h(x))$. Here $g_L$, $g_R$,
and $h$ are SU(2) matrices. Show that the two fields $A_\mu^{L,R} = g_{L,R}^\dagger \partial_\mu g_{L,R}$ are
invariant under the global symmetry and transform as gauge potentials for the
gauge identification. Either one could act as the gauge potential in the covariant
derivative. Use this, and the fact that strong interactions are invariant under
parity (which tells you which combination of $A_\mu^{L,R}$ should be substituted for $A_\mu^a$
in the nucleon covariant derivative), to write down the most general pion-nucleon
Lagrangian invariant under the symmetries of QCD with two massless flavors,
and containing at most one derivative of the pion field in nucleon interactions
and two in pion interactions. In addition to the nucleon mass and pion decay
constant, you will have to introduce one dimensionless coupling $g_A$, determined
by the expectation value of the axial SU(2) current in the nucleon state. This
constant appears in a parity-invariant coupling of the other linear combination
of $A_\mu^{L,R}$ to an axial current built from nucleon fields. Use Noether's theorem to
derive the $SU(2)_L \times SU(2)_R$ currents, recalling that in this formalism only the
NGB fields $g_{L,R}$ transform under these symmetries. Now fix the gauge by carrying
out a transformation with $h = g_R^{-1}$. The NGB field in this gauge is $\Sigma = g_L g_R^\dagger$
and transforms as $\Sigma \to V_L^\dagger \Sigma V_R$. Write the Lagrangian in terms of $\Sigma$ (you have
to be careful to transform the nucleon field by $h = g_R^\dagger$ to get this right). Show
that the long-range single-pion exchange force between nuclei is determined by
$g_A$ and $f_\pi$, while pion-pion scattering depends only on $f_\pi$. Now assume that the
$SU(2)_L$ current couples to W bosons. Compute the amplitudes for neutron decay
and charged-pion decay in terms of the strong-interaction parameters $f_\pi$ and $g_A$,
and the Fermi constant $G_F$.
7.6. Write the Ward identities for the generating functional of non-abelian currents,
the functional average of
for some global symmetry group $G$ with generators $T^a$. Show that they are
equivalent to the requirement that $W(A)$ be invariant under non-abelian gauge
transformations of $A$. Define the non-standard Legendre transform
$$\Gamma(a_\mu^a) + M^2 (A_\mu^a - a_\mu^a)^2 = W(A_\mu^a).$$
Describe how the expansion coefficients of $\Gamma$ are related to connected Green functions.
Show that the Ward identities imply that $\Gamma$ is a gauge-invariant functional
of $a_\mu^a$. At low energies this means that $\Gamma$ is just the usual Yang-Mills action of
Chapter 8. Now argue that, if $G$ is broken to $H$, then we should couple the $a_\mu^a$
field to the NGBs living in the G/H coset, in a way that respects gauge invariance.
Show that in the Yang-Mills approximation we get a theory of massive vector
mesons interacting with massless NGBs. The vector mass matrix is not just given
by $M^2$. There is also a contribution from the interaction with the NGBs. Compute
it. Apply this formalism to the case of strong interaction chiral symmetry
$G = SU(2) \times SU(2)$. It gives an approximate theory of $\rho$, $\omega$, and $A_1$ mesons
interacting with pions. The theory cannot be justified in the way that we justify
the chiral Lagrangian. Even if we make the chiral symmetry exact, by setting
up and down quark masses to zero, the vector mesons remain massive, with a
mass not much smaller than $4\pi f_\pi$, so there is no limit in which we can exactly
replace $\Gamma$ by the Yang-Mills action. Nonetheless, this vector-dominance model
of the hadronic currents is a useful approach to problems in strong-interaction
physics.
Non-abelian gauge theory
In our discussion of the general effective field theory for NG bosons, we encountered
a field theory whose target space was the coset space G/H. We realized this space in
terms of fields that take values in the group manifold G, with the equivalence relation
$g(x) \sim h(x)g(x)$. This is an example of a gauge equivalence, and we were forced to
introduce a gauge potential $A_\mu(x)$, transforming as
$$A_\mu(x) \to h^{-1}(\partial_\mu + A_\mu)h,$$
and covariant derivatives
$$D_\mu g = (\partial_\mu - A_\mu)g,$$
in order to write down an invariant action for the NG fields.
We now want to generalize these considerations. It turns out that all non-gravitational
physics at energy scales that have been explored experimentally can be described in
terms of fields with a linear gauge invariance. That is, if we use the language of the field
vector, $\Phi(x)$, there is a group of linear gauge equivalences, $\Phi(x) \to U(x)\Phi(x)$, where
the $U$ matrices belong to a subgroup $G$ of the group of all unitary transformations on
the field vector.
Mathematicians describe this sort of situation by saying that $\Phi(x)$ is a section of a
vector bundle over space-time and $G$ is the structure group of the bundle. For a physicist,
a vector bundle is just a collection of vector spaces, one over every point in space-time,
satisfying a few mathematical rules. The mathematical language is useful, because
certain important non-perturbative phenomena in field theory depend in a crucial way
on the mathematical theory of the topology of vector bundles.
The most important concept in the theory of vector bundles is the notion of the
parallel transport matrix $U_\Gamma(x, y)$ connecting two points in space-time, along a path $\Gamma$.
This is also called the Wilson line (although it was introduced into modern physics by
Schwinger and Mandelstam [43-45]). One imagines that observers at different space-time
points have chosen different bases in the field vector spaces, which are related by
$G$ equivalence transformations. $U_\Gamma(x, y)$ tells us how to relate the frame at $x$ to that
at $y$. Under a gauge transformation,
$$U_\Gamma(x, y) \to U(x)\, U_\Gamma(x, y)\, U^\dagger(y).$$
The matrix $U_P(x, x)$ for a closed path is called the holonomy around $P$ at $x$. Since our
matrices are always finite-dimensional, we can define the Wilson loop around $P$ at $x$ to
be the trace of the holonomy. It is gauge-invariant.
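The gauge invariance of the Wilson loop can be checked numerically. The following is a minimal sketch under stated assumptions, not a construction taken from the text: the gauge group is taken to be SU(2), the closed path is replaced by four arbitrary link matrices, and the helper names (`random_su2`, `wilson_loop`, `gauge_transform`) are hypothetical. Each link is transformed by the rule for the parallel transport matrix given above, and the trace of the ordered product is compared before and after.

```python
import math
import random

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def random_su2(rng):
    """Random SU(2) matrix a0 + i(a1*s1 + a2*s2 + a3*s3) from a unit 4-vector."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in q))
    a0, a1, a2, a3 = (x / norm for x in q)
    return [[complex(a0, a3), complex(a2, a1)],
            [complex(-a2, a1), complex(a0, -a3)]]

def wilson_loop(links):
    """Trace of the ordered product of link matrices around a closed path."""
    holonomy = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for u in links:
        holonomy = matmul(holonomy, u)
    return holonomy[0][0] + holonomy[1][1]

def gauge_transform(links, gauge):
    """Apply U(x_k, x_{k+1}) -> g(x_k) U(x_k, x_{k+1}) g(x_{k+1})^dagger."""
    n = len(links)
    return [matmul(matmul(gauge[k], links[k]), dagger(gauge[(k + 1) % n]))
            for k in range(n)]

rng = random.Random(0)
links = [random_su2(rng) for _ in range(4)]  # four segments of a closed path
gauge = [random_su2(rng) for _ in range(4)]  # one group element per corner
before = wilson_loop(links)
after = wilson_loop(gauge_transform(links, gauge))
print(abs(before - after))  # difference is at the level of rounding error
```

The interior gauge factors cancel pairwise, so the holonomy is only conjugated by the group element at the base point, and the trace removes that last dependence.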
Another way in which to make gauge-invariant objects is to construct $\Phi^\dagger(x) U_\Gamma(x, y)\Phi(y)$.
More generally, if the action of $G$ on $\Phi$ is reducible, we can make a
construction like this for each irreducible piece of the field vector. Similarly, we can
construct Wilson loops in each irreducible representation of the gauge group $G$. This
exhausts the collection of gauge-invariant objects in the theory, but not all of these
constructions are independent. There are identities relating Wilson loops in different
representations, as well as along different paths.
In field theory, the action is local, so we need to construct limits of these objects for
very short paths $\Gamma$ and $P$. This leads to the definition of covariant derivative and vector
potential:
$$U_\Gamma(x, x + dx)\Phi(x + dx) = \Phi(x) + D_\mu\Phi(x)\,dx^\mu = \Phi(x) + (\partial_\mu - iA_\mu)\Phi\,dx^\mu.$$
The transformation law (7.6) of the vector potential follows from that of the Wilson line.
An infinitesimal closed loop $P$ is characterized by the area element $dx^\mu \wedge dx^\nu$ that it
spans. We define
$$U_P(x) = 1 + iF_{\mu\nu}(x)\,dx^\mu \wedge dx^\nu.$$
Note that, according to our definitions, the 1-form $A_\mu$ and the 2-form field strength
$F_{\mu\nu}$ are Hermitian matrices belonging to the Lie algebra of the group $G$. Under a gauge
transformation, $F_{\mu\nu}(x) \to U(x)F_{\mu\nu}(x)U^\dagger(x)$.
We can construct a closed path around the area element $dx^\mu \wedge dx^\nu$ from four segments
of open path along $dx^\mu$, $dx^\nu$, $-dx^\mu$, and $-dx^\nu$. This gives us the relation
$$[D_\mu, D_\nu]\Phi = iF_{\mu\nu}\Phi,$$
which can be written
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + i[A_\mu, A_\nu].$$
The field vector, its covariant derivatives, and the field strength tensor all transform
homogeneously under the gauge group. Thus it is easy to write down the most general
perturbatively renormalizable action involving these fields. We break the field vector
up into its scalar $\phi$ and spinor $\psi$ parts¹ and write
TvIfX'.f!"
$$+\; \psi^\dagger i\bar\sigma^\mu D_\mu \psi + |D_\mu \phi|^2 - V(\phi) + [\psi M \psi + \text{h.c.}].$$
We have labeled the simple factors of the gauge group by the integer $r$; $G = G_1 \otimes \cdots \otimes G_r \otimes \cdots \otimes G_k$.
The field strength matrices $(F_{\mu\nu})^a t_a$ are in a representation of the group
$G_r$ satisfying $\mathrm{Tr}(t_a t_b) = \tfrac{1}{2}\delta_{ab}$. For unitary and orthogonal groups, we take it to be the
fundamental representation. In other representations we have $\mathrm{Tr}_R(t_a t_b) = \tfrac{1}{2}D(R)\delta_{ab}$.
¹ The alert reader may wonder why there are no vector fields included in the field vector. The answer, which
we will not have space to prove, is that renormalizability implies that the only vector fields charged under
the gauge group are the Yang-Mills fields themselves.
The constant D(R) is called the Dynkin index of the representation. We have chosen to
write all of the fermions as left-handed Weyl fermions. When writing Feynman rules,
it is best to revert to Dirac notation.
The standard model of particle physics has gauge group G = SU(3) × SU(2) ×
U(1). There are three generations of left-handed Weyl fields. Each generation has the
following 15 members:
• Left-handed quarks, $q_L$, in the $[3, 2, \tfrac{1}{3}]$ representation (the last number is the U(1)
quantum number, called weak hypercharge)
• Anti-up quarks, $\bar u_R$, in the $[\bar 3, 1, -\tfrac{4}{3}]$ (the conjugates of the right-handed up quarks)
• Anti-down quarks, $\bar d_R$, in the $[\bar 3, 1, \tfrac{2}{3}]$
• A lepton doublet, $l_L$, in the $[1, 2, -1]$
• An anti-lepton $\bar e_R$, $\bar\mu_R$, $\bar\tau_R$, in the $[1, 1, 2]$ (the conjugate of the right-handed charged
lepton (electron, muon, or tau) field).
There is also a single complex scalar field, the Higgs field $H$, in the $[1, 2, 1]$. We see
that, in the standard model, all fields except neutrinos have natural right-handed partners.
For the purposes of writing Feynman diagrams we introduce a Dirac neutrino
field. In Dirac notation, chiral gauge couplings will have $\tfrac{1}{2}(1 \pm \gamma_5)$ projection factors.
The right-handed component of the Dirac neutrino field will be a decoupled free
field. $M(\phi)$ is the most general gauge-invariant Yukawa coupling, which might include
a constant gauge-invariant fermion mass matrix, and V is the most general gauge-
invariant quartic polynomial in the scalar fields. Note that we have written the scalar
field Lagrangian as if all the fields were in complex representations of the gauge group.
For real representations, we must multiply the scalar kinetic term by a factor of one
half.
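The field content of one generation can be tabulated and checked mechanically. The sketch below is illustrative bookkeeping only: it assumes the hypercharge normalization in which the lepton doublet has charge -1 and the Higgs has charge 1, with quark hypercharges 1/3, -4/3, and 2/3; the field labels and the dictionary layout are hypothetical. It counts the members of a generation and verifies that the three Yukawa combinations (two quark terms and one lepton term) carry zero total hypercharge, which is a necessary condition for the Yukawa coupling to be gauge-invariant.

```python
from fractions import Fraction as F

# (label, SU(3) dimension, SU(2) dimension, weak hypercharge).
# Hypercharges are assumed values, normalized so the lepton doublet has -1.
generation = [
    ("q_L",    3, 2, F(1, 3)),
    ("ubar_R", 3, 1, F(-4, 3)),
    ("dbar_R", 3, 1, F(2, 3)),
    ("l_L",    1, 2, F(-1)),
    ("ebar_R", 1, 1, F(2)),
]
Y = {label: y for label, _, _, y in generation}
Y_H = F(1)  # hypercharge of the Higgs doublet; its conjugate carries -Y_H

# Each multiplet contributes (color dimension) x (weak dimension) Weyl fields.
members = sum(d3 * d2 for _, d3, d2, _ in generation)

# Total hypercharge of each candidate Yukawa term; it must vanish.
yukawa_charges = {
    "q_L H ubar_R":  Y["q_L"] + Y_H + Y["ubar_R"],
    "q_L H* dbar_R": Y["q_L"] - Y_H + Y["dbar_R"],
    "l_L H* ebar_R": Y["l_L"] - Y_H + Y["ebar_R"],
}

print(members)
print(yukawa_charges)
```

Only the U(1) charge is checked here; the SU(3) and SU(2) index contractions that make each term a singlet are not modeled.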
The CP-violating terms in this Lagrangian, proportional to $\theta_r$, are all total
derivatives:
K? = 2e'
where
^^YKKp ~ ^fabcA a v A h a a c X
$$[t_a, t_b] = i f_{abc} t_c.$$
They have no effect on perturbation theory, but do have important non-perturbative
effects. Apart from these terms, CP violation could arise through the Yukawa couplings,
mass terms, and scalar self-couplings of the fermions and scalars. There is an intricate
interplay between gauge invariance and CP violation, because gauge invariance restricts
these couplings [30]. CP is conserved when there is a field basis in which all couplings
and masses are real.
8.1 The non-abelian Higgs phenomenon
In a theory with gauge group $G$, the scalar fields $\Phi$ will transform in a (generally
reducible) representation $R_S$ of $G$. In this section we will call the generators of $G$ in this
representation $T^a$. In later sections we will call them $T_S^a$. Given a point $v$ in the (linear)
space of scalars, let $H$ be the stability subgroup of $v$ ($hv = v$ if $h \in H$). Its generators
are $t^i$, which are a subset of the $T^a$. We can write the general scalar field as
$$\Phi(x) = \Omega(x)[v + \Delta(x)],$$
where $\Delta$ is in the subspace of field space orthogonal to all of the vectors $T^a v$. Under a
gauge transformation $\Omega(x) \to V(x)\Omega(x)$, while $\Delta(x)$ is invariant.
This parametrization of field space has a gauge ambiguity $\Omega(x) \to \Omega(x)h(x)$,
$\Delta(x) \to h^{-1}(x)\Delta(x)$, with $h(x) \in H$. The new gauge group is isomorphic to $H$. This
fact, and the close relationship of the mathematics to that of a theory with global
symmetry $G$ broken down to $H$ by the VEV $\langle\Phi\rangle = v$, accounts for the standard
terminology "spontaneously broken gauge symmetry." We say that a semi-classical
expansion around the point $\Delta(x) = 0$ spontaneously breaks the gauge group $G$ to
the gauge group $H$. In fact what has happened is that our parametrization of the $G$
gauge-invariant field space $\Delta$ has a new $H$ gauge ambiguity.
Now define
$$B_\mu = \Omega^\dagger D_\mu \Omega = a_\mu + W_\mu.$$
The Lie algebra of $G$ is the direct sum of the Lie algebra of $H$ and the subspace of
generators orthogonal to those in $H$. We call these the coset generators for the coset
$G/H$. The coset generators² form a representation of $H$, under the adjoint action of $H$
on the Lie algebra of $G$. The second equality refers to this decomposition: $a_\mu$ involves
only $H$ generators, while $W_\mu$ is composed of coset generators. All components of $B_\mu$ are
invariant under $G$ gauge transformations. Under the new $H$ gauge transformations $a_\mu$
transforms like a connection or gauge potential, while $W_\mu$ transforms homogeneously.
Now we can rewrite the scalar field kinetic term as
$$-(D_\mu(A)\Phi)^2 = -[\Omega(D_\mu(a)\Delta + W_\mu(v + \Delta))]^2.$$
Using the facts that $v$ is annihilated by all generators of $H$ and that $\Delta^T T^a v = 0$ for all
generators of $G$, we can rewrite this as
$$(D_\mu(A)\Phi)^2 = (D_\mu(a)\Delta)^2 + 2(W_\mu(v + \Delta))^T D_\mu(a)\Delta + (W_\mu v)^2 + (W_\mu\Delta)^2.$$
² The coset is not generally a group, but the coset generators are a linear subspace (not a subalgebra) of the Lie algebra of $G$.
On expanding around $\Delta = 0$, we find that the $W_\mu$ fields are massive vectors, with mass
matrix³
$$(\mu_V^2)^{ab} = v^T T^a T^b v.$$
The $a_\mu$ fields do not get a mass. The fields $\Delta$ are $G$-gauge-invariant, and the gauge-invariant
potential $V(\Phi)$, whose minima are at $\Phi = \Omega v$, will generally give mass to all
of them. The $\Delta$ fields are the physical Higgs bosons.
So, given a non-abelian gauge theory with scalar fields, the space of gauge-equivalent
classical vacuum states is the coset space $G/H$ of fields $\Phi = \Omega v$. The Higgs fields $\Delta$
parametrize gauge-invariant deformations of the scalar field away from this vacuum
manifold. The number of massive gauge bosons $W_\mu$ is just the dimension of $G/H$, and
there are perturbatively massless vector potentials $a_\mu$ for the stability subgroup $H$ of $v$.
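As a concrete illustration of the mass-matrix formula, consider a hypothetical example that is not worked in the text: G = SO(3) with scalars in the real vector representation and a vacuum value $v$ along the third axis, so that H = SO(2). The sketch below assumes Hermitian generators $(T^a)_{jk} = -i\epsilon_{ajk}$, a unit-length $v$, and gauge couplings set to one; the function names are placeholders. It evaluates $v^T T^a T^b v$ directly.

```python
def levi_civita(i, j, k):
    """Totally antisymmetric symbol on indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def generator(a):
    """Hermitian SO(3) generator in the vector representation: -i * eps_{ajk}."""
    return [[-1j * levi_civita(a, j, k) for k in range(3)] for j in range(3)]

def apply(m, vec):
    """Matrix times column vector."""
    return [sum(m[j][k] * vec[k] for k in range(3)) for j in range(3)]

v = [0.0, 0.0, 1.0]                    # vacuum value along the third axis
T = [generator(a) for a in range(3)]
Tv = [apply(T[b], v) for b in range(3)]

# Vector mass matrix (mu^2)^{ab} = v^T T^a T^b v.
mass2 = [[sum(v[j] * apply(T[a], Tv[b])[j] for j in range(3))
          for b in range(3)]
         for a in range(3)]

massive = sum(1 for a in range(3) if abs(mass2[a][a]) > 1e-12)
for row in mass2:
    print([round(entry.real, 12) for entry in row])
print(massive)
```

The matrix comes out diagonal with entries (1, 1, 0): the generator that leaves $v$ fixed gives a massless vector for H = SO(2), and the two massive vectors match the dimension of the coset G/H.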
8.2 BRST symmetry
It would be straightforward to derive the Feynman rules for the general gauge-invariant
Lagrangian, apart from one disturbing fact. The quadratic terms in the gauge fields
look like dim G copies of Maxwell's Lagrangian, and cannot be inverted to give a
propagator. We have already encountered this problem in our discussion of QED,
and finessed it by introducing a small photon mass. We promised a more extensive
discussion of the massless limit. The time has come to fulfill that promise. The nature
of this book prevents us from exploring the full extent of this subject. We will content
ourselves with introducing what I will call the Becchi-Rouet-Stora-Tyutin (BRST)
trick [46, 47]. The reader interested in the geometrical foundations of this trick can
consult the second volume of Weinberg's monograph [42]. More detailed references
may be found in [48-50]. It would be helpful at this point to do Problems 8.1-8.3.
These will show you how BRST symmetry works for free-field theories.
The basic idea of the BRST trick is simple: introduce the gauge parameters in an
enlarged field theory, where the gauge symmetry is an ordinary global symmetry. The
tricky part is that, to make it all work, the scalar gauge parameters must be quantized
as fermions. Explicitly, we introduce two scalar fermion fields $c$ and $\bar c$, and a boson field
$N$, all transforming in the adjoint representation of the gauge symmetry. $c$ is called the
ghost field, $\bar c$ the anti-ghost, and $N$ the Nakanishi-Lautrup Lagrange multiplier. The
BRST symmetry acts as an infinitesimal gauge transformation, with gauge parameter
³ Actually, we are writing these formulae for gauge potentials whose kinetic term involves a factor of the
gauge coupling, so the properly normalized mass matrix includes a factor of $g^a g^b$, where $g^a$ is the coupling
of the simple factor of $G$ to which $T^a$ belongs.
$\epsilon c$ on all of the ordinary fields in the theory ($\epsilon$ is a constant Grassmann parameter). In
addition,
$$\delta_{\rm BRST}\, c^a = \epsilon f^a_{bc} c^b c^c,$$
$$\delta_{\rm BRST}\, \bar c^a = \epsilon N^a,$$
$$\delta_{\rm BRST}\, N^a = 0.$$
Note that $\delta^2_{\rm BRST}$ applied to any of these fields vanishes, as a consequence of the Jacobi
identity for the structure constants and the Fermi statistics of the $c$ fields. It's easy to
verify that the same is true for the action of $\delta_{\rm BRST}$ on ordinary fields. The operator
$Q_{\rm BRST}$ which implements this symmetry on the quantum Hilbert space should be
nilpotent, $Q^2_{\rm BRST} = 0$.
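The role of the Jacobi identity can be made explicit. Applying the ghost transformation twice produces a term proportional to $f^a_{bc} f^b_{de}\, c^d c^e c^c$; because the ghosts anticommute, only the part of $f^a_{bc} f^b_{de}$ that is totally antisymmetric in $(c, d, e)$ survives, and that part is the cyclic sum appearing in the Jacobi identity. The sketch below checks that this cyclic sum vanishes, assuming for illustration the su(2) structure constants $f_{abc} = \epsilon_{abc}$; the function names are hypothetical.

```python
def eps(i, j, k):
    """su(2) structure constants: the Levi-Civita symbol on indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def jacobi_residual(f, n):
    """Largest |sum_b f(a,b,c)f(b,d,e) + cyclic permutations of (c,d,e)|."""
    worst = 0
    for a in range(n):
        for c in range(n):
            for d in range(n):
                for e in range(n):
                    total = sum(f(a, b, c) * f(b, d, e)
                                + f(a, b, d) * f(b, e, c)
                                + f(a, b, e) * f(b, c, d)
                                for b in range(n))
                    worst = max(worst, abs(total))
    return worst

residual = jacobi_residual(eps, 3)
print(residual)
```

A vanishing residual means the cubic ghost term in the second variation has no surviving component, which is the statement that the transformation of $c$ is nilpotent for this algebra.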
We also implement a U(1) symmetry of these equations under which $N^a$ and all
ordinary fields have charge zero, while $c^a$ and $\bar c^a$ have opposite charge (normalized to
$\pm 1$). The BRST charge $Q_{\rm BRST}$ has charge 1. This U(1) quantum number is called the
ghost number.
The nilpotency of the BRST operator makes it very easy to add invariant ghost field
terms to the Lagrangian. Simply choose a term that has the form $\delta\mathcal{L} = [Q_{\rm BRST}, \Psi]_+$,
where $\Psi$ is a fermionic operator of ghost number $-1$, called the gauge fermion. For most
purposes, the most convenient choice is $\Psi = \bar c^a (F^a + \tfrac{1}{2}\kappa N^a)$, where $F^a$ is a function of
ordinary fields. The resulting gauge-fixing Lagrangian has the form
$$\delta\mathcal{L} = \int d^4y\, \bar c^a(x) \frac{\delta F^a(x)}{\delta\omega^b(y)} c^b(y) + \frac{\kappa}{2}(N^a)^2 + N^a F^a.$$
We can integrate out the Lagrange multiplier, and convert this to
$$\delta\mathcal{L} = \int d^4y\, \bar c^a(x) \frac{\delta F^a(x)}{\delta\omega^b(y)} c^b(y) - \frac{1}{2\kappa}(F^a)^2.$$
$F^a$ should be chosen so that the ghost kinetic term is non-degenerate, if we wish to
carry out a perturbation expansion. The choice $F^a = \partial_\mu A^{\mu a}$ is the most common one,
because it is simple and covariant. In models in which the Higgs phenomenon occurs,
a slightly more general form, involving scalar fields, is preferable.
Like any other global symmetry, the BRST symmetry will give rise to Ward identities
in the quantum theory. They have the form
$$\sum_j \langle O_1(x_1) \cdots [Q_{\rm BRST}, O_j(x_j)]_\pm (-1)^{\sigma_j} \cdots O_n(x_n) \rangle = 0.$$
The $O_i$ are any operators in the theory. The peculiar features of this Ward identity
come from the fact that the symmetry generator is fermionic. This accounts for the
commutator/anti-commutator ambiguity, depending on the fermion parity of $O_j$, as
well as the factor $(-1)^{\sigma_j}$, which reflects the Leibniz rule for Grassmann numbers. The
most important consequence of this identity comes from the case in which $O_1$ is itself
a BRST commutator (or anti-commutator) and the rest of the $O_i$ are BRST-invariant.
The Ward identity then implies that
$$\langle O_1(x_1) \cdots O_n(x_n) \rangle = 0.$$
In particular, since the difference between any two choices of gauge-fixing Lagrangian
is a BRST anti-commutator, this identity shows that the expectation values of all
BRST-invariant operators are independent of the choice of gauge-fixing Lagrangian.
There are two obvious classes of BRST-invariant operators. The first consists of
gauge-invariant functionals of the ordinary variables. The second, called BRST trivial,
are operators of the form $[Q_{\rm BRST}, A]_\pm$. It is easy to see that the two classes
are disjoint. It is somewhat harder to prove [48-50] that there are no other ways of
being BRST-invariant. Thus, the non-trivial BRST-invariant operators are precisely
the gauge-invariant observables of the classical theory. The Ward identity shows us
that their expectation values will be identical for all choices of gauge-fixing Lagrangian.
This is the key result that enables us to resolve the irritating problem that, in the Higgs
phase of non-abelian gauge theories, there is no choice of gauge that is simultaneously
covariant, unitary, and renormalizable. BRST invariance guarantees the unitarity of
gauge-invariant Green functions computed in a covariant gauge, because it implies that
they are equal to Green functions of the same operators computed in a non-covariant,
but unitary, gauge (like the axial gauge $A_3^a = 0$).
The covariant gauge Feynman rules for non-abelian gauge theory may be found in
Appendix D. The problem sets will give the reader ample opportunity to practice the
use of these rules. In the text we will do some perturbative calculations when we get
to the discussion of the renormalization group for non-abelian gauge theory. Before
entering into the intricacies of perturbative calculations, however, I want to give the
reader a feeling for the physics context in which non-abelian gauge theory is important
and the qualitative nature of the physical phenomena which arise from this theory.
A historical note: the ghost fields first appeared in the work of Feynman, DeWitt, and
Mandelstam [51-53] and were explained in terms of a change of variables in the func-
tional integral by Faddeev and Popov [54]. They are often called Faddeev-Popov ghosts.
The BRST symmetry is the most elegant way to formulate the introduction of ghosts,
and can easily be generalized to include gauge equivalences that do not form a group.
8.3 A brief history of the physics of non-abelian
gauge theory
A resource for the history of the standard model is [55]. Non-abelian gauge theory
is an integral part of Einstein's theory of gravitation, as was realized by Cartan [56].
Weyl [57] introduced the term gauge symmetry in an attempt to explain electromag-
netism in terms of the scale ambiguity of the space-time metric (Weyl transformations).
The first inkling that it had something to do with particle physics came in the Fermi theory
of weak interactions.⁴ Fermi postulated the existence of a charge-carrying vector
boson $W^+$, which could mediate the process of neutron $\beta$ decay via the Feynman
diagram of Figure 8.1.
Figure 8.1. Fermi's theory of weak interactions.
⁴ Curiously, in 1938 Oskar Klein wrote down a Lagrangian similar to the standard electro-weak theory,
but did not use his knowledge of Kaluza-Klein compactification to make it gauge-invariant.
Fermi, the father of the idea of effective field theory, realized that, at distances long
compared with the Compton wavelength of the W, this diagram would be the same as
an effective four-fermion interaction:
and this is what came to be known as the Fermi theory of weak interactions. Nuclear
data were soon shown to be inconsistent with this simple form, and $\gamma^\mu$ was replaced
with an arbitrary sum of Dirac matrices. It wasn't until the mid-1950s, after the ground-breaking
work of Lee and Yang [60] had shown that weak interactions did not preserve
parity, that Marshak and Sudarshan, Gershtein and Zel'dovich, and Feynman and
Gell-Mann [61-63] realized that, if one allowed the W boson to couple to a linear com-
bination of vector and axial currents, then one could construct an elegant theory, which
also fit the data. The stage was set for a non-abelian gauge theory of weak interactions.
In 1954, Yang and Mills [64] had introduced non-abelian gauge theory into
particle physics in an attempt to account for the strongly interacting vector bosons.⁵
Bludman [65] and Schwinger [66] realized that one could adapt this technology to the
Fermi theory and that an SU(2) gauge theory could unify the weak and electromag-
netic interactions. There were two problems with this idea. Glashow [67] showed that
the weak leptonic currents did not close on an algebra with the electromagnetic current
(Problem 8.14). One either had to introduce new leptons or follow the route chosen by
Glashow (and the world) of introducing an SU(2) × U(1) gauge group, with the photon
associated with a linear combination of one of the SU(2) generators and the U(l).
The remaining problem, that of the mass of the non-abelian bosons, was not so easily
solved. Mass terms seemed to break the non-abelian gauge symmetry. Furthermore,
the Euclidean propagator of a massive vector boson behaves like
$$\frac{\eta_{\mu\nu} - p_\mu p_\nu/m^2}{p^2 + m^2},$$
and the effect of the mass does not appear to vanish at high momentum.
In the meantime, developments stemming from the theory of superconductivity were
leading to a solution of the problem. Following work of Schwinger on two-dimensional
5 The modern approach to this idea is outlined in Problem 8.15.
quantum electrodynamics [68], Anderson [69] discussed the screening of electromag-
netic interactions in condensed matter systems. Higgs showed how to incorporate the
Meissner effect into a fully relativistic treatment of electrodynamics [70-72]. We will
study his model in the next section. Developments followed rapidly [73-74]. Weinberg
[75] and Salam [76] combined the Higgs mechanism with Glashow's SU(2) x U(l)
theory to construct the modern theory of the electro-weak interaction. A few years
later, 't Hooft and Veltman [77-78] proved that spontaneously broken gauge theories, as
the new theories came to be called, were renormalizable.
I was a graduate student at the time of these developments, and it was obvious
that a generalization of these ideas to the strong interactions was called for. In fact,
such a generalization already existed, though it was relatively obscure, particularly in
Europe. It was not the original Yang-Mills theory of vector bosons, but rather Nambu's
SU(3) color gauge theory [79] invented to explain the statistics of quarks. In the quark
model, baryons are bound states of three quarks, and the correct spectrum of baryons
is obtained if one chooses a wave function symmetric under interchange of the three
quarks. This is paradoxical because the quarks carry spin $\tfrac{1}{2}$. Greenberg [80] realized
that the paradox could be removed if one introduced an additional three-valued label
for the quarks (color) and insisted that all states be singlets under an SU(3) group
transforming this label.⁶ Nambu [79] invented the color gauge theory as an attempt to
explain the color singlet condition dynamically. He showed that the single-gauge-boson
exchange forces between quarks would lower the energy of the singlet states (just as
neutral states have less electromagnetic energy than charged states). In Nambu's pic-
ture, the colored states, including quarks, would eventually be found at higher energies.
It was perhaps for this reason that he also figured out a way to give the quarks integer
electric charge [81], using the new degree of freedom. In the modern view of quan-
tum chromodynamics or QCD, Nambu's calculation is just giving the short-distance
part of the interquark forces. At long distances, the potential rises linearly and quarks
are permanently confined. Such a potential cannot arise from particle exchange. To
understand it, we will turn in the next section to the Higgs theory of relativistic super-
conductivity, which provides both the framework for electro-weak physics and (after a
duality transformation) the explanation of quark confinement.
8.4 The Higgs model, duality, and the phases of
gauge theory
Let us recall the Stueckelberg formalism for massive vector fields. We describe a massive
vector $B_\mu$ as a gauge-invariant theory of a Maxwell field $A_\mu$ coupled to a scalar, with
the Lagrangian
$$\mathcal{L} = -\frac{1}{4g^2}F_{\mu\nu}^2 + \frac{\mu^2}{2g^2}(A_\mu - \partial_\mu\theta)^2.$$
Note that, unlike a conventional scalar field, $\theta$ is dimensionless.
⁶ Greenberg actually used the language of parastatistics rather than color. The two ideas are equivalent for
classifying possible states but dynamically different. In modern language, parastatistics is the impossible
limit in which the QCD scale is taken to infinity with hadron masses kept fixed.
The Higgs model is obtained by the simple observation that the Stueckelberg
Lagrangian may be viewed as an approximation to a more conventional Lagrangian
of a charged scalar field coupled to electromagnetism. Simply write
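$$\mathcal{L}_{\rm Higgs} = -\frac{1}{4g^2}F_{\mu\nu}^2 + \frac{1}{g^2}\left[|D_\mu\phi|^2 - V(\phi^*\phi)\right].$$
Transforming to radial coordinates for the charged field, $\phi = \rho e^{i\theta}$, we obtain
$$\mathcal{L}_{\rm Higgs} = -\frac{1}{4g^2}F_{\mu\nu}^2 + \frac{1}{g^2}\left[\rho^2(A_\mu - \partial_\mu\theta)^2 + (\partial_\mu\rho)^2 - V(\rho)\right].$$
If $\rho$ is "frozen out" at a value $\rho^2 = \phi_0^2 = \mu^2/2$, we obtain the Stueckelberg Lagrangian.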
In the Higgs model, p fluctuates around its expectation value and quantization of the
small fluctuations gives another particle, called the Higgs boson.
The renormalizable version of this Lagrangian contains two parameters, in addition
to the gauge coupling, $g^2$. The Higgs potential is
$$V = \lambda(\phi^*\phi - \phi_0^2)^2.$$
It has a one-parameter set of minima,
$$\phi = \phi_0 e^{i\theta},$$
which are related by gauge transformations (and thus represent a redundant parametrization
of the gauge-invariant minimum $\rho = \phi_0$). Small fluctuations around a
minimum, in the direction of the gauge transformation, are the degrees of freedom
associated with the Stueckelberg field $\theta$ and are "eaten" by the gauge boson $A_\mu$ to form
the massive field $B_\mu$. Fluctuations perpendicular to the trough of minima are Higgs
particles. The gauge-boson mass is
$$\mu^2 = 2\phi_0^2,$$
and the Higgs boson mass
$$m_H^2 = 4\lambda\phi_0^2.$$
The Stueckelberg Lagrangian is a good low-energy approximation if
$$\lambda \gg 1.$$
The quartic coupling of canonically normalized fields is $g^2\lambda/4$, so this can be true in a
semi-classical regime if $g^2 \ll 1$.
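The two masses can be cross-checked numerically from the potential. The sketch below rests on assumptions that should be read as such: the radial kinetic term is taken to be $(\partial_\mu\rho)^2$ without a factor of one half, so that the Higgs mass squared is half the curvature of $V$ at the minimum, and the gauge-boson mass squared is taken to be $2\phi_0^2$ as follows from freezing $\rho^2$ at $\phi_0^2 = \mu^2/2$. The parameter values and function names are arbitrary illustrations.

```python
def potential(rho, lam, phi0):
    """Higgs potential V = lam * (rho^2 - phi0^2)^2 in the radial variable."""
    return lam * (rho**2 - phi0**2) ** 2

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

lam, phi0 = 3.0, 1.5  # illustrative values only

curvature = second_derivative(lambda r: potential(r, lam, phi0), phi0)

# With kinetic term (d rho)^2, the quadratic action is (dh)^2 - (V''/2) h^2,
# so the mass squared is V''/2 (assumption stated in the lead-in).
m_higgs_sq = curvature / 2.0
m_gauge_sq = 2.0 * phi0**2
ratio = m_higgs_sq / m_gauge_sq

print(m_higgs_sq, 4.0 * lam * phi0**2)
print(ratio, 2.0 * lam)
```

Under these assumptions the ratio of the Higgs mass squared to the gauge-boson mass squared is $2\lambda$, which is why the Higgs particle decouples from the low-energy Stueckelberg description when $\lambda$ is large.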
In order to quantize the theory, we need to choose a gauge-fixing function $F$ and
an associated gauge fermion. The standard covariant choice $F = \partial_\mu A^\mu$ recommends
itself, as does the alternative covariant choice
$$F = (\phi - \phi^*)\phi_0,$$
fixing the gauge freedom in $\phi$ completely.⁷ This, the so-called unitary gauge,⁸ is covariant
and contains no unphysical degrees of freedom. The gauge potential in this gauge
is actually equal to the gauge-invariant field $B_\mu$, because the phase of $\phi$ is frozen.
Unfortunately, in non-abelian theories, Green functions in the unitary gauge do not
have a renormalizable perturbation expansion (although the scattering matrix does;
see below). There is actually a convenient interpolating choice for the gauge fermion.
We take
$$\Psi = \frac{1}{g^2}\,\bar c\,\big(\partial_\mu A^\mu + \kappa\phi_0(\phi - \phi^*)\big).$$
This is to be used only with a weight function for the Lagrange multiplier field
8£ = y^N 2 .
2g 2 a
The propagators in the semi-classical expansion in this gauge contain no mixing
between the scalar and gauge field for any value of a.
The Feynman rules for a general, non-abelian, Higgs model in this R a gauge are
given in Appendix D. We will not do any explicit loop computations in Higgs models
in the text, but leave them for the exercises.
8.5 Confinement of monopoles in the Higgs phase
Maxwell's equations, with both electric and magnetic current sources, have the form
$$\partial^\mu F_{\mu\nu} = J_\nu,$$
$$\partial^\mu \tilde F_{\mu\nu} = \tilde J_\nu,$$
and are invariant under the duality transformation
if we exchange the electric current, $J_\mu$, with the magnetic one, $\tilde J_\mu$.
To quantize the system we must make a choice. It turns out that consistency requires
the coupling strengths of the two types of currents to be inversely proportional to
each other. The only way we can have a controlled semi-classical expansion is to insist
⁷ Strictly speaking $\rho(x) = (\phi + \phi^*)(x)$ is positive, but in perturbation theory it is expanded about the
positive value $\phi_0$ and this constraint is fulfilled automatically.
⁸ Also called the unitarity gauge.
that the strongly coupled type of particle (monopoles) becomes very massive as the
coupling gets strong, so that virtual loops of these particles become negligible despite
the strong coupling. Indeed, naive calculations of radiative corrections to the mass of
these particles suggest that it is $o(1/g^2)$, where $g$ is the weak coupling of the electrically
charged particles. In fact, as we shall see in Chapter 10, magnetic monopoles arise
as classical soliton configurations, with masses of order $1/g^2$, whenever the U(1) of
electromagnetism is obtained by spontaneous breakdown of a simple group. Virtual
effects of these solitons are exponentially small in g, because of form factors. We will
generally use the terminology "electric charges" to refer to those particles which are
weakly coupled in a semi-classical expansion and "magnetic monopoles" to refer to
the strongly coupled objects.
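The choice we make upon quantization is to solve the magnetic Maxwell equation by writing
$$F_{\mu\nu} = \epsilon_{\mu\nu\lambda\kappa}\int d^4y\, f_\lambda(x-y)\,\tilde J_\kappa(y) + \partial_\mu A_\nu - \partial_\nu A_\mu,$$
where
$$\partial_\lambda f_\lambda(x-y) = \delta^4(x-y).$$
We shall also insist that the support of $f_\lambda(x)$ is on a one-dimensional curve connecting
the origin to infinity. $f_\lambda$ is called the Dirac string. We will generally assume that the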
curve is in fact a straight line. Then we are constructing the magnetic monopole by
viewing it as one end of an infinitely long, infinitesimally thin solenoid. Dirac argued
that, classically, the solenoid is invisible, but in quantum mechanics (in anticipation of
Aharonov and Bohm), charged-particle paths that encircle the solenoid would pick up
a phase $e^{ieg}$ ($g$ is the magnetic charge), which would enable one to locate the direction
of the solenoid. If that were the case, a monopole would not be a point particle. This
conclusion can be avoided only if there is a universal quantization rule for the coupling
strengths of electric and magnetic poles,
$$eg = 2\pi N.$$
Actually, if we consider the possibility that particles carry both electric and mag-
netic charge (and follow Schwinger by calling such particles dyons), then the general
quantization rule is
$$q_1m_2 - q_2m_1 = 2\pi N,$$
for some integer $N$. This follows because, as one particle encircles the other's solenoid,
the second particle encircles the first's solenoid in the opposite direction. In this formula
we have used $q$ and $m$ to represent the electric and magnetic charges in units of $e$ and
$1/e$, respectively, where $e$ is the semi-classical expansion parameter for Maxwell's field.
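The dyon quantization rule can be checked mechanically for a candidate spectrum. The following is a minimal sketch, not part of the original text: it assumes each dyon is stored as a pair $(q, m/2\pi)$, so that the rule becomes the statement that the antisymmetric pairing is an integer. The function names and the sample spectra are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def pairing_over_2pi(dyon_a, dyon_b):
    """Return (q1*m2 - q2*m1) / (2*pi) for dyons stored as (q, m / (2*pi))."""
    q1, m1 = dyon_a
    q2, m2 = dyon_b
    return Fraction(q1) * Fraction(m2) - Fraction(q2) * Fraction(m1)


def is_consistent(spectrum):
    """True if every pair of dyons in the spectrum has an integer pairing."""
    return all(
        pairing_over_2pi(a, b).denominator == 1
        for a, b in combinations(spectrum, 2)
    )


# Hypothetical spectra: integer charges pass, a half-integer electric charge
# in the presence of a unit monopole fails.
allowed = [(1, 0), (0, 1), (1, 1), (-2, 3)]
forbidden = [(Fraction(1, 2), 0), (0, 1)]

print(is_consistent(allowed))
print(is_consistent(forbidden))
```

Exact rational arithmetic is used so that the integrality test is not spoiled by floating-point rounding.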
We note in passing that the modern treatment of monopole physics uses the theory of
fiber bundles. Instead of introducing the singular Dirac string directly, one introduces
two vector potentials, each of which describes the magnetic field of a monopole only
over part of space. Each has a Dirac string singularity, but only in the part of space
where it is not used. The two regions overlap, and, on the sphere at infinity, the overlap
consists of a small region around some equator of the sphere. The difference between
the two vector potentials is a pure gauge, but the gauge transformation has a non-trivial
winding around the equator,
$$\oint dx^\mu\,\partial_\mu\omega = 2\pi m,$$
where m is the magnetic charge.
We now quantize the theory by writing the Lagrangian
$$\mathcal{L} = -\frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} + A_\mu J^\mu + \frac{1}{2\kappa}(\partial_\mu A^\mu)^2.$$
We can do the path integral over $A_\mu$, and, if we take the electric and magnetic currents
to be conserved, it is independent of the choice of gauge. The result is (in Euclidean
space)
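$$Z[J,\tilde J] = e^{-\frac{e^2}{2}\int d^4x\,d^4y\,J_\mu(x)D(x-y)J_\mu(y)}\;e^{-\frac{2\pi^2}{e^2}\int d^4x\,d^4y\,\tilde J_\mu(x)D(x-y)\tilde J_\mu(y)}$$
$$\times\; e^{i\int d^4x\,d^4y\,d^4z\,\epsilon_{\mu\nu\lambda\kappa}J_\mu(x)\,\partial_\nu D(x-y)\,f_\lambda(y-z)\,\tilde J_\kappa(z)}.$$
Here $f_\lambda = n_\lambda(n_\alpha\partial_\alpha)^{-1}$, where $n_\alpha$ is the space-like direction of the Dirac vortex string
attached to each monopole. We have imposed the minimal Dirac quantization condition
and the currents are normalized to 1. When the monopole is stationary, at the origin,
and we choose $n = \hat{x}_3$, then
$$\int d^4z\,\epsilon_{\mu\nu\lambda\kappa}f_\lambda(y-z)\tilde J_\kappa(z) = \theta(y_3)\,\delta^2(y_\perp)\,m\,\epsilon_{\mu\nu34}\,\delta(y_4-\tau).$$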
We have chosen the parameter along the monopole world line to be its Euclidean time
coordinate. The phase is then non-vanishing for space-like charged-particle trajectories,
which encircle the vortex, and is equal to $e^{iqm}$. This is the derivation of Dirac's
quantization rule.
We conclude that a theory of light charged particles interacting with heavy magnetic
monopoles makes sense, at least in the approximation that the monopoles are treated as
singular classical sources. When we realize that monopoles are forced on us, as classical
solitons in non-abelian theories, we will understand that this had to be the case. Those
theories make it clear that a fully relativistic theory of electric charges interacting with
monopoles must be consistent as well.
Now we come to an apparent conundrum. Let's consider the Higgs model, but add
in a coupling to a heavy magnetic monopole. Gauss' law tells us that there must be a
magnetic Coulomb field at infinity, but we have seen that there are no massless fields in
the theory. How do we get a long-range field? You might have asked the same question
about electric charge, but here the answer is easy. The field $\phi$ has a vacuum expectation
value,⁹ but doesn't commute with the electric-charge operator. The vacuum is therefore
⁹ There is a subtlety here, because this field isn't gauge-invariant. Later on, when we discuss Wilson line
operators, we will see how to convert this argument into one about a gauge-invariant variable.
Figure 8.2 A flux tube between monopoles in a superconductor.
not an eigenstate of electric charge, but rather a superposition of different charge states.
The charge fluctuations in the vacuum screen any external source of charge, so that
it has no long-range field. There are no dynamical monopoles in the theory, so this
cannot be the explanation of the absence of a monopole Coulomb field.
Instead, we must recognize that Gauss' law only relates the charge to the integrated
magnetic flux at infinity. In ordinary empty space, the lowest-energy configuration sat-
isfying this constraint is the rotationally invariant Coulomb field. In the Higgs vacuum,
the quanta of the magnetic field are massive, and it turns out that it costs an energy per
unit volume to force magnetic flux through the system. The minimal energy configuration
with a single point monopole is a flux tube (Figure 8.2), which has energy per
unit length.
To see this explicitly, let us take the monopole off to infinity, and search for the
minimal energy configuration with an infinite straight flux tube and no source. The
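energy density for $x_3$-independent static configurations is
$$\mathcal{E} = \frac{1}{e^2}\left[\frac{1}{2}F_{12}^2 + |D_i\phi|^2 + \frac{\lambda}{2}\left(|\phi|^2 - \phi_0^2\right)^2\right].$$
If $\lambda = 1$, we can write this as
$$\mathcal{E} = \frac{1}{e^2}\left[\frac{1}{2}\left(F_{12} + |\phi|^2 - \phi_0^2\right)^2 + \frac{1}{2}\left|D_i\phi + i\epsilon_{ij}D_j\phi\right|^2 + \phi_0^2F_{12} - i\epsilon_{ij}\partial_i\left(\phi^*D_j\phi\right)\right].$$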
Note that the last two terms are total derivatives and contribute to the total energy
only through their values at infinity. We will find that $D_i\phi$ falls off exponentially, so
that the contribution of these terms to the energy is just $\phi_0^2/e^2$ times the total magnetic
flux through the plane. We can minimize the energy by setting the two perfect
squares in $\mathcal{E}$ to zero, so that this flux contribution is the total energy of the vortex. The
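minimum energy solutions are determined by the Bogomolnyi-Prasad-Sommerfield
(BPS) [82-83] equations¹⁰
$$(D_i + i\epsilon_{ij}D_j)\phi = 0, \qquad (8.3)$$
$$F_{12} = \phi_0^2 - |\phi|^2. \qquad (8.4)$$
The solution with flux $2\pi n$ has the form
$$\phi = e^{in\theta}f(r),$$
$$A_1 + iA_2 = -ie^{i\theta}\,\frac{a(r) - n}{r},$$
where
$$f' = \frac{a}{r}f,$$
$$a' = r\left(f^2 - \phi_0^2\right).$$
The boundary conditions are
$$a \to 0, \quad f \to \phi_0, \quad \text{for } r \to \infty;$$
$$a \to n + O(r), \quad f \to r^n\left(\alpha + O(r^2)\right), \quad \text{for } r \to 0.$$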
It is easy to see, by linearizing the equations at infinity, that $f$ approaches its asymptotic
value exponentially. According to the second BPS equation, this means that the
magnetic flux is confined to a small region around $r = 0$. $\phi_0$ sets the scale for this
region.
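The linearization at infinity can be sketched in a few lines. This is an illustrative derivation rather than part of the original text; it assumes the radial profile equations take the form $f' = af/r$ and $a' = r(f^2 - \phi_0^2)$, with $\eta(r)$ a hypothetical name for the small deviation of $f$ from $\phi_0$.

```latex
% Linearize about the asymptotic values: f = \phi_0 - \eta(r), with \eta and a small.
\begin{align}
  \eta' &= -\frac{a}{r}\,\phi_0 , &
  a' &= -2 r \phi_0\, \eta , \\
  \Rightarrow\quad a'' - \frac{a'}{r} &= 2\phi_0^2\, a , &
  a(r) &\sim \sqrt{r}\; e^{-\sqrt{2}\,\phi_0 r} \quad (r \to \infty) .
\end{align}
```

Under these assumptions both $a$ and $\eta = -a'/(2r\phi_0)$ decay on the length scale $1/(\sqrt{2}\phi_0)$, which is the sense in which $\phi_0$ sets the size of the flux tube.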
The qualitative nature of the solution is similar for other values of $\lambda$. The Nielsen-
Olesen vortex [84] that we have constructed is the global minimum-energy configuration
with fixed flux. The fact that magnetic flux penetrates the Higgs vacuum in thin tubes
was first discovered, both theoretically and experimentally, in the study of supercon-
ductivity. There the expulsion of magnetic field from the superconducting ground state
is called the Meissner effect, and the flux-tube solution was found by Abrikosov [85].
The significance of the Meissner effect for the physics of dynamical monopoles in the
Higgs vacuum is profound. Isolated monopoles will have infinite energy. Monopole-
anti-monopole pairs will experience a linear confining potential (energy per unit length
for a flux tube connecting them) at large separation. They will be permanently bound
into states with a characteristic size determined by the energy density of the flux tube.
At the time the Nielsen-Olesen vortex was discovered, experimental physics had
revealed a puzzle about the nature of quarks. Spectroscopy had given strong indications
of a quark substructure underlying hadrons. High-energy scattering experiments, both
deep inelastic scattering of leptons and nucleons and electron-positron annihilation
into hadrons, had given dramatic confirmation to the idea that hadrons are bound
states of quarks and that quarks behave almost like free particles when they are close
¹⁰ ... of a supersymmetric model, and the BPS equations ...
some of the supersymmetry.
together (asymptotic freedom). But quarks had never been seen as free particles, despite
the fact that experiments were being done at energies above the characteristic scale of
the strong interactions.
By 1973, the apparently free behavior of quarks in short-distance experiments had
been understood in terms of quantum chromodynamics (QCD), the SU(3) color gauge
theory of strong interactions. As we will learn in Chapter 9, the effective gauge coupling
is scale-dependent and goes to zero at short distance. Conversely, as the separation
between quarks gets larger, it grows, up until a length scale of order $(150\ \mathrm{MeV})^{-1}$,
whereupon the perturbative calculations on which this observation is based break
down. Many people speculated that the coupling became infinitely strong and led to
the confinement of quarks.
Actual calculations showing the confinement of quarks were first done (in four
dimensions) by Wilson, in a latticized form of the gauge theory [86]. Shortly thereafter,
't Hooft [87] and Mandelstam [88] noted that one could obtain an intuitive picture of
confinement by treating the Cartan subgroup of SU(3) as electromagnetism (actually
two different U(1) gauge fields since the group has rank two), and postulating that the
vacuum state of the theory was in a magnetic Higgs phase. Using electric-magnetic
duality and the calculations we have done for monopoles in the Higgs phase, we see
that such a state would lead to confinement of quarks by electric flux tubes.
Much work in lattice gauge theory has gone into trying to validate the 't Hooft-
Mandelstam picture of confinement. The results are a trifle ambiguous. Nonetheless,
this is a good qualitative picture of what is going on in QCD. The picture has been
verified both in lattice models and in partially soluble supersymmetric gauge theories.
We will now examine a simple lattice model, which will introduce the reader to the
techniques of lattice gauge theory and exhibit the 't Hooft-Mandelstam mechanism
for quark confinement in a striking manner [89-92].
In a lattice theory (we will work in Euclidean space), the functional integral is replaced
by an integral over variables defined on a hypercubic (for simplicity) lattice, and deriva-
tives are replaced by finite differences. In a lattice gauge theory, the fundamental variable
is the parallel transporter, or Wilson line between nearest-neighbor points on a lattice.
We will denote this by $U_\mu(x)$, where $x$ is a lattice point and $\mu$ one of the eight independent
directions on the lattice. It connects the points $x$ and $x + \mu$. Under a gauge
transformation,
$$U_\mu(x) \to V(x)\,U_\mu(x)\,V^\dagger(x+\mu).$$
We also insist that $[U_\mu(x)]^{-1} = U_{-\mu}(x+\mu)$. We take these variables to be matrices,
which are group elements in the fundamental representation (we'll deal only with U(N)
groups in this text).
Given gauge-variant fields $\phi(x)$, transforming in the representation $R$ of the gauge
group, we can write gauge-covariant finite differences
$$\Delta_\mu\phi(x) = U^R_\mu(x)\,\phi(x+\mu) - \phi(x),$$
where the matrix that appears is the $R$ representation of the group element $U_\mu(x)$. These
can be combined to form gauge-invariant actions. Gauge-invariant actions constructed
solely from the gauge fields are functions of traces of Wilson loops. The simplest such
loop is the plaquette in the $\mu\nu$ plane,
$$U_{\mu\nu}(x) = U_\mu(x)\,U_\nu(x+\mu)\,U_{-\mu}(x+\mu+\nu)\,U_{-\nu}(x+\nu).$$
Since it goes around a closed curve, its trace is gauge-invariant.
All of this simplifies for abelian gauge groups, for which we can write $U_\mu(x) = e^{i\theta_\mu(x)}$
and treat $\theta_\mu$ much as we would a gauge potential in the continuum. We will consider
the group $Z_N$.
To couple a lattice gauge theory to a heavy external particle, we let $x_\mu(s)$ be the
Euclidean path that the particle follows. Here $s$ is a discrete parameter that counts
the steps in the path. In order to define a gauge-invariant quantity, we insist that
the path be closed. Physically this corresponds to studying a heavy charged particle by
creating a particle-anti-particle pair, separating them, and bringing them back together
to annihilate. The observable which computes the gauge-theory contribution to the
action of this path is just the corresponding Wilson loop $W[x(s)]$. The gauge charge
of the heavy particle is encoded in the choice of representation for the $U_\mu$ matrices in
the Wilson loop.
To compute the potential energy of the pair a distance R apart we choose (Figure 8.3)
a curve containing long straight segments along which only $x_0(s)$ changes, and the
distance between the two straight segments is R.
If $T$ is the length of the long segments and we have a linear confining potential above
some distance $R_c < R$, then we expect
$$\langle W[x(s)]\rangle \to e^{-kRT},$$
as $T \to \infty$. This is a special case of a more general and more geometric formula
$$\langle W[x(s)]\rangle \to e^{-kA},$$
Figure 8.3 The Wilson loop for the static potential.
where A is the area of the minimal surface spanning the loop. Confinement is equivalent
to Wilson's area law for large-area Wilson loops.
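In practice a measured Wilson loop usually carries a perimeter-dependent factor on top of the area law, and one needs a combination of loops that isolates the coefficient $k$. The sketch below uses the Creutz ratio, a standard lattice technique that is not introduced in the text above; the model loop, the string tension `K_TRUE` and the perimeter coefficient `C_PERIM` are hypothetical values chosen only for illustration.

```python
import math


def wilson_loop(r, t, string_tension, perimeter_coeff):
    """Model <W> for an r x t rectangle: area law times a perimeter factor."""
    area = r * t
    perimeter = 2 * (r + t)
    return math.exp(-string_tension * area - perimeter_coeff * perimeter)


def creutz_ratio(w, r, t):
    """-ln[ W(r,t) W(r-1,t-1) / (W(r,t-1) W(r-1,t)) ].

    The perimeter contributions cancel between numerator and denominator,
    while the areas differ by exactly one plaquette.
    """
    return -math.log(w(r, t) * w(r - 1, t - 1) / (w(r, t - 1) * w(r - 1, t)))


K_TRUE = 0.3   # hypothetical string tension in lattice units
C_PERIM = 0.2  # hypothetical perimeter coefficient


def model(r, t):
    return wilson_loop(r, t, K_TRUE, C_PERIM)


estimate = creutz_ratio(model, 4, 6)
print(estimate)
```

For a pure perimeter law (`K_TRUE = 0`) the same ratio vanishes, which is how the two behaviours are told apart.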
It is worth noting that, in non-abelian gauge theories, we do not expect to see confine-
ment for heavy particles in generic representations of the gauge group. Indeed, such
theories contain dynamical gauge bosons, which might be able to combine with the
heavy particle to form singlet bound states. These singlets can propagate far away from
each other without feeling a confining force. However, if the group is SU($N$) it has a
non-trivial center $Z_N$. The adjoint representation does not transform under the center,
so it cannot screen particles in representations that do transform.¹¹
Since the essence of confinement in an SU($N$) theory is associated with the $Z_N$ center,
it seems worthwhile to study the pure $Z_N$ lattice gauge theory, and that is what we will
do. The variables of this theory, $\theta_\mu(x)$, are integers modulo $N$, one for each link on the
lattice. A group element is
$$U_\mu(x) = e^{2\pi i\theta_\mu(x)/N}.$$
We will choose a partition function of the form
$$Z = \int \prod [d\theta_\mu(x)]\,\prod f(\theta_{\mu\nu}(x)).$$
The "measure" $\int[d\theta]$ is just the instruction to sum each link variable over its $N$ possible
values. The gauge-invariant field strengths $\theta_{\mu\nu}$ are defined by
$$\theta_{\mu\nu}(x) = \theta_\mu(x+\nu) - \theta_\mu(x) - \theta_\nu(x+\mu) + \theta_\nu(x) = \Delta_\nu\theta_\mu - \Delta_\mu\theta_\nu.$$
Like the original link variables $\theta_\mu$, these add modulo $N$.
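The gauge invariance of the $Z_N$ field strengths is easy to confirm numerically. The following is a minimal sketch under stated assumptions: a small periodic hypercubic lattice, the gauge transformation $U_\mu(x) \to V(x)U_\mu(x)V^\dagger(x+\mu)$ written additively as $\theta_\mu(x) \to \theta_\mu(x) + \alpha(x) - \alpha(x+\mu)$ mod $N$, and illustrative values of `N`, `L` and the random seed.

```python
import itertools
import random

N = 5    # order of the group Z_N (illustrative choice)
L = 3    # sites per direction on a periodic lattice (illustrative choice)
DIM = 4  # number of Euclidean dimensions


def shift(x, mu):
    """Site reached from x by one step in the positive mu direction."""
    y = list(x)
    y[mu] = (y[mu] + 1) % L
    return tuple(y)


def all_sites():
    return list(itertools.product(range(L), repeat=DIM))


def plaquette(theta, x, mu, nu):
    """theta_mu(x+nu) - theta_mu(x) - theta_nu(x+mu) + theta_nu(x), mod N."""
    return (theta[(shift(x, nu), mu)] - theta[(x, mu)]
            - theta[(shift(x, mu), nu)] + theta[(x, nu)]) % N


def gauge_transform(theta, alpha):
    """theta_mu(x) -> theta_mu(x) + alpha(x) - alpha(x+mu), mod N."""
    return {(x, mu): (t + alpha[x] - alpha[shift(x, mu)]) % N
            for (x, mu), t in theta.items()}


rng = random.Random(0)
sites = all_sites()
theta = {(x, mu): rng.randrange(N) for x in sites for mu in range(DIM)}
alpha = {x: rng.randrange(N) for x in sites}
theta_gt = gauge_transform(theta, alpha)

# Every plaquette variable should be unchanged by the gauge transformation.
invariant = all(
    plaquette(theta, x, mu, nu) == plaquette(theta_gt, x, mu, nu)
    for x in sites for mu in range(DIM) for nu in range(DIM)
)
print(invariant)
```

The link variables themselves do change under the transformation; only the closed-loop combination is invariant, in line with the statement that the trace of a Wilson loop is gauge-invariant.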
We can write the most general function $f$ of these $Z_N$-valued variables as a finite
Fourier series
$$f(\theta_{\mu\nu}) = \sum_{l} e^{2\pi i\,l\,\theta_{\mu\nu}/N}\,g(l),$$
where the variable $l$ is also $Z_N$-valued. If we insert the Fourier transform for each $\theta_{\mu\nu}$,
and integrate by parts, we can do the $\theta_\mu$ "integrals" exactly, obtaining
$$Z = \int[dl_{\mu\nu}(x)]\,\prod g(l_{\mu\nu}(x))\;\delta[\Delta_\nu l_{\mu\nu}(x)],$$
where we have used the Einstein summation convention for the $\nu$ index inside the delta
function. The "functional delta function," $\delta[\ ]$, is the product over all links, $(x, x+\mu)$,
in the lattice, of the ($Z_N$-valued) Kronecker delta of the variables $\Delta_\nu l_{\mu\nu}(x)$. The Wilson-loop
expectation value is the ratio of two such integrals over $\theta_\mu(x)$. The argument of
the functional delta function in the numerator is $\Delta_\nu l_{\mu\nu} - qw_\mu(x)$. $q$ is the $Z_N$ charge
of the static particle, and $w_\mu(x)$ is 1 if the Wilson loop goes through the link $(x, x+\mu)$
in the positive direction, $-1$ if it goes through in the negative direction, and 0 if the
Wilson loop does not pass through that link.
¹¹ If the theory contains dynamical particles in the fundamental representation, then any heavy source can
be screened by creation of pairs of these particles. This is what happens for quarks in the real world.
To convert this to a form making its relation to electrodynamics clear, we introduce
redundant variables, to write the sums over $Z_N$ variables in terms of ordinary integer
sums. Thus, we write $l_{\mu\nu} \to l_{\mu\nu} + Nq_{\mu\nu}$, where $l$ and $q$ now take on arbitrary integer
values. Shifting $l$ by a multiple of $N$ can be compensated by a shift of $q$, so, if all functions
are just functions of this combination, the double sum over integers just produces an
infinite number of copies of the sum over $Z_N$ variables. This infinity cancels out in the
ratios that define expectation values. In order to preserve the $Z_N$ character of the delta
functional, we introduce an integer-valued link variable $e_\mu(x)$ and write the functional
as
$$\delta[\Delta_\nu l_{\mu\nu}(x) - Ne_\mu(x)].$$
The "current" $e_\mu$ must be conserved in order to satisfy this constraint.
For convenience, we will choose the weighting function $g(l_{\mu\nu}) = e^{-\frac{g^2}{2}l_{\mu\nu}^2}$. Other
choices give similar results. When $g^2$ is very large, non-zero values of $l_{\mu\nu}$ are suppressed.
If we introduce a Wilson loop with $Z_N$ charge $q$, this shifts the delta function in the
numerator functional integral to $\delta[\Delta_\nu l_{\mu\nu} - w_\mu]$. $w_\mu = \pm q$ or zero. The delta-function
constraint in the numerator now forces $l_{\mu\nu}$ to be $q$ over some area bounded by the Wilson
loop. For large $g^2$ we obviously want to pick the minimal area. In the denominator we
simply set $l_{\mu\nu} = 0$ for large $g^2$. We find that the Wilson loop falls off like $e^{-\frac{g^2}{2}q^2A}$, where
$A$ is the minimal area it bounds.
In order to understand what happens for smaller values of $g^2$, we introduce the
Poisson summation formula:
$$\sum_{l} f(l) = \sum_{m}\int da\, f(a)\,e^{2\pi i m a}.$$
This formula is true because the sum over $m$ gives rise to a sum of delta functions of $a$
concentrated on the integers $a = l$. If $f(a) = e^{-\frac{g^2}{2}a^2}$, we can do the Gaussian integral
and obtain
$$\sum_l e^{-\frac{g^2}{2}l^2} = \sqrt{\frac{2\pi}{g^2}}\,\sum_m e^{-\frac{2\pi^2}{g^2}m^2}.$$
We will use this formula after solving the constraint equation for $l_{\mu\nu}$ as
$$l_{\mu\nu} = \left(n_\nu(n\Delta)^{-1}j_\mu - n_\mu(n\Delta)^{-1}j_\nu\right) + \epsilon_{\mu\nu\lambda\kappa}\Delta_\lambda l_\kappa.$$
$l_\mu$ is an integer-valued link variable, with an obvious gauge ambiguity. $j_\mu = Ne_\mu$ in
the denominator functional integral and $j_\mu = Ne_\mu + qw_\mu$ in the numerator. We want
to apply the Poisson summation formula to the sums over $l_\mu(x)$. In principle we must
eliminate the gauge ambiguity first, and do so (for example by picking an axial gauge
like $l_1 = 0$) in a way that preserves the integer nature of the variable. However, if we
introduce the Poisson variables $m_\mu$ by a factor
$$e^{2\pi i\,m_\mu(x)a_\mu(x)}$$
and insist that $\Delta_\mu m_\mu = 0$, then the answers will be independent of the choice of gauge.
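The Poisson summation formula for a Gaussian weight can be verified numerically. The sketch below assumes the weight $e^{-g^2 l^2/2}$, for which the Gaussian integral gives $\sum_l e^{-g^2l^2/2} = \sqrt{2\pi/g^2}\,\sum_m e^{-2\pi^2m^2/g^2}$; the normalization of the exponent is an assumption of this example, and the function names and cutoff are hypothetical.

```python
import math


def direct_sum(g2, cutoff=200):
    """Sum over integers l of exp(-g2 * l^2 / 2), truncated at |l| <= cutoff."""
    return sum(math.exp(-0.5 * g2 * l * l) for l in range(-cutoff, cutoff + 1))


def poisson_dual_sum(g2, cutoff=200):
    """sqrt(2*pi/g2) times the sum over m of exp(-2*pi^2*m^2/g2)."""
    prefactor = math.sqrt(2.0 * math.pi / g2)
    return prefactor * sum(
        math.exp(-2.0 * math.pi ** 2 * m * m / g2)
        for m in range(-cutoff, cutoff + 1)
    )


for g2 in (0.5, 2.0, 8.0):
    print(g2, direct_sum(g2), poisson_dual_sum(g2))
```

The two sides agree for every coupling, but they converge at opposite ends: the direct sum is dominated by $l = 0$ at large $g^2$, the dual sum by $m = 0$ at small $g^2$. This is the reason the dual variables are the useful ones at weak coupling.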
The result is a lattice action depending on three types of variable, $e_\mu$, $m_\mu$, and $a_\mu$. The
former are integer-valued conserved currents on the lattice, while the latter is a lattice
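version of Maxwell's field. We have
$$S = \sum\left[\frac{g^2}{4}\left(\Delta_\mu a_\nu - \Delta_\nu a_\mu - \epsilon_{\mu\nu\lambda\kappa}n_\lambda(n\Delta)^{-1}j_\kappa\right)^2 + 2\pi i\,m_\mu a_\mu\right].$$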
This is a discrete analog of the action for electric and magnetic monopoles interacting
via the Maxwell field. If we are interested in the small-$g^2$ limit, we should carry out
a duality transformation and describe $j_\mu$ as an electric current and $m_\mu$ as a magnetic
current. The resulting action is
$$S = \sum\left[\frac{1}{4g^2}\left(\Delta_\mu A_\nu - \Delta_\nu A_\mu - \epsilon_{\mu\nu\lambda\kappa}n_\lambda(n\Delta)^{-1}2\pi m_\kappa\right)^2 + i\,j_\mu A_\mu\right].$$
If we do the path integral over $A$, we get a non-local action for the currents
$$S_{nl} = \frac{g^2}{2}\sum_{x,y}[qw_\mu + Ne_\mu](x)\,D(x-y)\,[qw_\mu + Ne_\mu](y)$$
$$\qquad + \frac{2\pi^2}{g^2}\sum_{x,y}m_\mu(x)\,D(x-y)\,m_\mu(y) + i\Phi_D,$$
where $\Phi_D$ is the Dirac phase factor we have discussed above in the continuum. $D(x-y)$
is the Green function of the four-dimensional lattice Laplacian.
We treat this expression in the following way: keep the term with $x = y$ in $D$ as is, and
rewrite the rest as a path integral over $A$ with a modified kinetic term, whose inverse is
$D(x-y) - \delta_{xy}D(0)$. The sum over $e_\mu$ and $m_\mu$ is now weighted by
$$e^{-\frac{g^2N^2}{2}D(0)\,e_\mu^2}\;e^{-\frac{2\pi^2}{g^2}D(0)\,m_\mu^2}.$$
For small $g^2 \ll 1/N^2$, the non-zero values of $m_\mu$ are highly suppressed, while the sum
over $e_\mu$ is free. We do it by introducing a Lagrange multiplier to enforce the constraint
$\Delta_\mu e_\mu = 0$, via a term
$$e^{i\sum_x\theta(x)\,\Delta_\mu e_\mu(x)}.$$
The sum over $e_\mu$ gives an action of the form
$$S = \sum\left[F(\Delta_\mu\theta - NA_\mu) + i\,q\,A_\mu w_\mu + \frac{1}{4g^2}\left(A_{\mu\nu} - F^D_{\mu\nu}(m)\right)^2 + \frac{2\pi^2}{g^2}D(0)\,m_\mu^2\right].$$
$A_{\mu\nu}$ is the lattice field strength and $F^D_{\mu\nu}(m)$ is the Dirac string field corresponding to the
magnetic current $m_\mu$. This looks like a Higgs model coupled to magnetic monopoles,
and for small $g^2$ we can perform the path integral semi-classically. The result is that
the magnetic-monopole loops are confined to small size and are rare. The Wilson loop
just has a perimeter law.
On the other hand, for large $g^2$ we instead apply this treatment to the $m_\mu$ loops,
getting a magnetic Higgs phase. Electric loops of $e_\mu$ are generally suppressed. However,
if the Wilson loop $w_\mu$ bounds a large area, and if it has charge $q$ (mod $N$) rather than
charge 1, then it pays to cancel it to the smaller of $q$ and $N - q$. We get an area law
with a coefficient that is largest for $q \sim N/2$. We have already analyzed the large-$g^2$
regime in terms of the $l_{\mu\nu}$ variables. Now we realize that we were just describing the
magnetic Higgs phenomenon. $l_{\mu\nu}$ is just the quantized electric flux which must pay an
energy price per unit length to penetrate the magnetic Higgs vacuum.
Finally, when $N$ is large, one can argue that there is a regime of $g^2$ where both the
electric and magnetic currents, $e_\mu$ and $m_\mu$, are suppressed, and the theory behaves
like free electrodynamics. This Coulomb phase exists roughly when $4\pi^2c_1 > g^2 > c_2/N^2$,
where the $c_i$ are constants of order 1. For large $N$, the $Z_N$ lattice gauge model
exhibits the main phases of gauge theories and the electric-magnetic duality between
Higgs and confinement phases. Other possible phases exist in more complicated non-
abelian theories, in which the particle whose field exhibits the Higgs mechanism at
strong coupling has both electric and magnetic charge at weak coupling. These are
called oblique confinement phases. Non-abelian theories have also been shown to exhibit
conformally invariant phases that are not free-field theories. These go under the name
of non-abelian Coulomb phases.
8.6 The electro-weak sector of the standard model
We now turn to the use of non-abelian gauge theory in its Higgs phase, in the theory of
weak and electromagnetic interactions. Glashow's argument leads us to a model of the
electromagnetic and weak interactions based on an SU(2) × U(1) gauge group, broken
to U(1). Salam and Weinberg introduced the simplest scalar sector which achieves
this purpose, a doublet, H(x) with hypercharge 1. We can write the most general field
configuration as
The four real components of $H$ transform as a 4-vector under SO(4) = SU(2) × SU(2).
The first SU(2) is the gauge group, while the second is a global symmetry, broken only
by the U(1) gauge interactions. The Higgs potential $V$ is a quartic polynomial
$$V = -\mu^2H^\dagger H + \lambda(H^\dagger H)^2.$$
Note that SU(2) × U(1) gauge invariance implies that it is SO(4)-invariant. The minimum
potential energy is at $v^2 = \mu^2/\lambda$, and the mass of the gauge-invariant excitation
of $h$, the Higgs particle, is
$$m_H = \sqrt{2\lambda}\,v.$$
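The Higgs mass can be checked numerically from the curvature of the potential. This is a rough sketch, assuming the parametrization $H = (0, (v + h)/\sqrt{2})$ so that $h$ is canonically normalized and $H^\dagger H = (v+h)^2/2$; that parametrization, the numerical values of $\mu^2$ and $\lambda$, and the function names are assumptions of this example.

```python
import math


def potential(h, mu2, lam):
    """V = -mu^2 H^dagger H + lam (H^dagger H)^2 with H = (0, (v + h)/sqrt(2))."""
    v = math.sqrt(mu2 / lam)
    hdag_h = 0.5 * (v + h) ** 2
    return -mu2 * hdag_h + lam * hdag_h ** 2


def first_derivative(f, x, eps=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def second_derivative(f, x, eps=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / eps ** 2


MU2, LAM = 1.0, 0.5  # hypothetical parameter values
v = math.sqrt(MU2 / LAM)


def v_of_h(h):
    return potential(h, MU2, LAM)


slope_at_minimum = first_derivative(v_of_h, 0.0)
m_h_numeric = math.sqrt(second_derivative(v_of_h, 0.0))
m_h_formula = math.sqrt(2.0 * LAM) * v

print(slope_at_minimum, m_h_numeric, m_h_formula)
```

The vanishing slope confirms that $h = 0$ sits at the minimum $v^2 = \mu^2/\lambda$, and the curvature there reproduces $m_H^2 = 2\lambda v^2 = 2\mu^2$.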
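If we call the gauge couplings $g_{1,2}$ then the gauge-boson mass matrix, for the neutral fields $(W^3_\mu, B_\mu)$, is
$$\frac{v^2}{4}\begin{pmatrix} g_2^2 & -g_1g_2 \\ -g_1g_2 & g_1^2 \end{pmatrix}.$$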
This gives
$$m_W = m_Z\cos\theta_W, \qquad \tan\theta_W = g_1/g_2.$$
The latest measurements of the Weinberg angle, $\theta_W$, from comparison of precision
measurements and detailed loop calculations in the standard model, give
$$\sin^2\theta_W = 0.23120(15).$$
As we will learn in Chapter 9, effective Lagrangian parameters vary with energy scale
and depend on the renormalization scheme chosen to define the theory. The quoted
value is the value at the Z mass of the parameter defined by the modified minimal
subtraction scheme. The measured values of the gauge-boson masses are
$$m_Z^{\rm exp} = 91.1876(21)\ \mathrm{GeV},$$
$$m_W^{\rm exp} = 80.403(29)\ \mathrm{GeV}.$$
The tree-level relation between them is pretty well satisfied.
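How well the relation holds can be quantified directly from the quoted numbers. The sketch below assumes the standard tree-level relation $m_W = m_Z\cos\theta_W$ and uses the central values given above; the variable and function names are hypothetical, and uncertainties and scheme dependence are ignored.

```python
import math

SIN2_THETA_W = 0.23120  # quoted value at the Z mass
M_Z_GEV = 91.1876       # quoted measured Z mass
M_W_GEV = 80.403        # quoted measured W mass


def tree_level_w_mass(m_z, sin2_theta_w):
    """m_W = m_Z * cos(theta_W), with cos^2 = 1 - sin^2."""
    return m_z * math.sqrt(1.0 - sin2_theta_w)


predicted = tree_level_w_mass(M_Z_GEV, SIN2_THETA_W)
relative_gap = (M_W_GEV - predicted) / M_W_GEV

print(predicted, relative_gap)
```

The prediction falls short of the measured $W$ mass by about half a percent, a gap of the size one expects from loop corrections with small couplings.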
The origin of this relation is the SO(4) symmetry we mentioned above. One SU(2)
subgroup is a gauge group. If $g_1 = 0$, then the other is a global SU(2) called custodial
symmetry and would have predicted $m_W = m_Z$. It is broken by the $g_1$ gauge interaction.
Note that the relation between $m_W$ and $m_Z$ depends only on the two gauge
couplings. Since the couplings are small we should expect this relation to survive in the
quantum theory with only small corrections. In particular, the formula for the ratio
must approach 1 as the coupling $g_1$ is taken to zero.
The fourth gauge boson is the massless photon, whose coupling to particles of
charge $q$ is $eq$ with $e = g_2\sin\theta_W$. We will discuss the charge assignments of standard-
model fermions in the section on anomalies below. Here we simply record the implied
couplings to the various gauge bosons:
$$J^\mu_{W^+} = \bar u^i_L\gamma^\mu(V_{\rm CKM})_{ij}d^j_L + \bar\nu^i_L\gamma^\mu l^i_L,$$
$$J^\mu_Z = \bar\psi_M\gamma^\mu\left(T_3 - \sin^2\theta_W\,Q\right)\psi_M.$$
We have used Dirac spinor notation, where the spinors satisfy the left-handedness
condition $(1 + \gamma_5)\psi = 0$. $T_3$ and $Q$ are the diagonal weak isospin and charge matrices,
and the index $M$ runs over all the quarks and leptons. The indices $i$ and $j$ run from 1 to
3 and denote the three copies of the same representation of SU(1, 2, 3), which we find
in nature. Note that the charged gauge-boson coupling is not diagonal in these indices.
This is due to the fact that we have to diagonalize the quark mass matrix.
Fermion masses are generated by Yukawa couplings to the Higgs field. For quarks
and charged leptons, the relevant terms are
$$(y_u)_{ij}u_iHq_j + (y_d)_{ij}d_iH^\dagger q_j + (y_l)_{ij}e_iH^\dagger l_j + \text{h.c.}$$
The Yukawa coupling matrices are 3x3 complex matrices, because of the peculiar
experimental fact that we have found three generations or families or flavors of quarks
and leptons, with precisely the same couplings to the gauge group. These matrices can
be brought to diagonal form, with positive elements on the diagonal, by multiplying
$q, u, d, e, l$ by independent unitary transformations. Note, however, that in order to diagonalize
both $y_u$ and $y_d$ we must carry out independent unitary transformations on the
two isospin components of $q$. Thus, the relative unitary transformation $V_uV_d^\dagger = V_{\rm CKM}$
appears, as we have shown it, in the coupling to W bosons. Note, however, the complete
absence of neutral flavor-changing couplings. This is the Glashow-Iliopoulos-Maiani
(GIM) mechanism [93], and it accounts for the near absence of such interactions in
experiment. The mixings of neutral K, D, and B mesons are the dominant flavor-
changing processes observed. At the time of writing, all such interactions in nature can
be accounted for by double W-boson exchange graphs at one loop.
The matrix $V_{\rm CKM}$ and the quark and lepton masses are the additions to the standard-
model parameter space caused by the fermionic fields. The Cabibbo-Kobayashi-
Maskawa [94-95] matrix itself contains three physical mixing angles and a CP-violating
phase. All experimental evidence for CP violation, with the exception of the baryon
asymmetry of the Universe, can be accounted for by this phase.
In the absence of neutrino masses, there is no similar mixing matrix for leptons.
Observations on neutrinos from the Sun and in the atmosphere give evidence both for
neutrino masses and for mixings. Neutrino mixing goes beyond the standard model. It
can be accounted for by a dimension-5 operator
$$\Delta\mathcal{L} = \frac{(y_\nu)_{ij}}{M_S}(Hl_i)(Hl_j),$$
where $y_\nu$ is a complex symmetric matrix. The evidence suggests that $M_S \sim 10^{14}$-$10^{15}$
GeV, and we have some constraints on the matrix elements of $y_\nu$.
The quark masses range between 174 GeV and 5 MeV, and charged-lepton masses
run between 1.78 GeV and 0.5 MeV. The large mass ratios apparent here are among
the mysteries of the standard model. Various theoretical explanations have been sug-
gested, but none has been definitively established. The most prominent one involves
the Froggatt-Nielsen [96] mechanism. Various mass matrix elements are determined as
powers of the expectation value of a new scalar field $S$ divided by a new mass scale $M_{FN}$.
One introduces a symmetry broken by the VEV of $S$ and assigns quantum numbers
to standard-model fields so that each Yukawa matrix element must be proportional
to a particular power of $S/M_{FN}$. These ideas can also explain the peculiar texture of
the CKM matrix: the mixing angles seem largest between generations closest in mass.
Other ideas for explaining masses and mixing angles use the Kaluza-Klein picture of
extra spatial dimensions.
There is a peculiar interaction between the strong CP-violating coupling $\theta_{\rm QCD}/(32\pi^2)$
and the quark mass matrix. The argument of the determinant of $y_uy_d$ was eliminated
by a U(1) rotation which has a QCD anomaly. This rotation effectively shifts the strong
CP parameter. Experimentally $\theta_{\rm QCD} < 10^{-9}$, and its small size is one of the strangest
puzzles in the standard model. If $\det(y_uy_d)$ were zero, there would be no issue, because
we could use the anomaly to rotate $\theta_{\rm QCD}$ into the Yukawa matrices and it would then
vanish. But the alert reader will easily see that the non-zero pion mass shows that there
is no symmetry of the standard model that could set $\det(y_uy_d) = 0$. Indeed, non-perturbative
QCD physics would renormalize it away from zero even if it were zero
above the scale where the QCD coupling becomes strong.¹² There are models in which
extra symmetries and degrees of freedom at the TeV scale can set $\arg\det(y_uy_d)$ to zero,
and in which the low-energy value of $\theta_{\rm QCD}$ is very small [97]. However, it is unclear
whether those models can explain the non-zero value of $m_u(100\ \mathrm{MeV})$ which seems
to be indicated by low-energy analysis [98].
The SU(2) × U(1) theory of electro-weak physics can account for all of the data on
those interactions in a very precise way. We will leave actual calculations in this model
to the exercises.
8.7 Symmetries and symmetry breaking in the
strong interactions
The modern theory of the strong interactions is called quantum chromodynamics
(QCD). It is an SU(3) gauge theory with $N_F$ Dirac quark fields transforming in the
triplet representation of SU(3). The quark mass matrix has the form
$$\bar q^i_LM_{ij}q^j_R + \text{h.c.} \qquad (8.5)$$
Here $q^i_{L,R} = (1 \pm \gamma_5)q^i/2$. The mass matrix could be a general complex matrix if
we do not insist on other symmetries. The kinetic terms are classically invariant
under U($N_F$) × U($N_F$) transformations on the quarks. Using these we can make the
mass matrix diagonal, with positive entries. However, as we will discuss, the $U_A(1)$
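transformation
$$q^i \to e^{i\alpha\gamma_5}q^i$$
does not leave the functional measure of the quarks invariant. It changes the Euclidean
action by (all traces in this section are taken in the fundamental representation of SU(3)
unless otherwise noted)
$$\mathcal{L} \to \mathcal{L} + \frac{2iN_F\alpha}{32\pi^2}\,\mathrm{tr}\left(F_{\mu\nu}\tilde F_{\mu\nu}\right),$$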
¹² Refer to Chapter 9 for the notion that couplings change with energy scale.
where $\tilde F_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\lambda\kappa}F_{\lambda\kappa}$. We will see in the final chapter of this book that finite
action configurations of the gauge field have $[1/(32\pi^2)]\int \mathrm{tr}\,(F_{\mu\nu}\tilde F_{\mu\nu})$, which is quantized
in integer units, so unless $2N_F\alpha = 2\pi n$ for the transformation which makes the
determinant of the quark mass matrix real, we should expect a term
$$\frac{i\theta_{\rm QCD}}{32\pi^2}\int \mathrm{tr}\left(F_{\mu\nu}\tilde F_{\mu\nu}\right)$$
in the QCD action. $\theta_{\rm QCD}$ could be any number between 0 and $2\pi$, but a dimensional-analysis
estimate of the neutron electric dipole moment suggests that $\theta_{\rm QCD} < 10^{-9}$. There
have been several proposed explanations of why $\theta_{\rm QCD}$ should be so small, none of
which is completely satisfactory. This is called the strong CP problem.
Apart from the mysterious parameter $\theta_{\rm QCD}$, QCD appears to have a dimensionless
coupling, which appears in
However, in the course of renormalization of the theory, this will be replaced by a
scale $\Lambda_{\rm QCD}$. 13 Experiments on high-energy scattering suggest that this scale is of order
150 MeV. We have already alluded to the fact that the properties of pions, the lightest
hadrons, can be best understood if we assume that two of the quark masses, $m_u$ and $m_d$,
are much smaller than $\Lambda_{\rm QCD}$. Data on $K$ and $\eta$ mesons are also compatible with the
idea that the strange-quark mass can be treated as small. In the limit in which we set
all of these masses to zero, QCD has an SU(3) × SU(3) chiral symmetry. Much of low-energy
hadron physics can be understood if we assume that this symmetry is broken
spontaneously to the diagonal vector subgroup SU(3), with the explicit breaking of
the latter symmetry by quark masses treated in first-order perturbation theory. 14 The
light pseudo-scalar mesons are interpreted as the pseudo-Nambu-Goldstone bosons
of this broken symmetry.
Our purpose in this section is to outline some of the theoretical arguments which
support this conclusion. We will assume that quark confinement occurs in QCD.
The arguments of the last section gave us a plausible picture of why this is so. More
compelling is the experimental fact that, although we can see evidence for the quark
structure of hadrons in experiments at very high energy and momentum transfer, and
in the structure of the hadron spectrum itself, no experiment has ever produced an
isolated quark.
The first part of our argument is that the vector SU(3) currents do not suffer spon-
taneous breakdown. To see this, note that if we gave the up, down, and strange quarks
equal mass, m, SU(3) would still be a good symmetry. We can then ask whether this
13 This mysterious-sounding process is called dimensional transmutation. It is a consequence of the fact
that the dimensions of operators in quantum field theory receive quantum corrections, as we will see in
Chapter 9.
14 One also has to take into account breaking of the isospin subgroup, SU(2), by electromagnetism.
symmetry is spontaneously broken, that is, whether there could be a NG boson pole
in the two-point function
$$\int d^4x\,e^{ipx}\,\langle J^A_\mu(x)J^B_\nu(0)\rangle.$$
In the functional integral formalism this is calculated as the average of
$$\mathrm{tr}\big(\gamma_\mu T^A S(x,0)\,\gamma_\nu T^B S(0,x)\big),$$
over gauge field configurations. $S(x, 0)$ is the quark propagator in an external gluon
field. The measure of integration over gluons is
$$e^{-\frac{1}{4g^2}\int F_{\mu\nu}^2}\,\det(i\gamma^\mu D_\mu + im),$$
which is positive (Problem 8.12). Note that we have assumed $\theta_{\rm QCD} = 0$. The result we
will obtain is true for at least a small range of $\theta_{\rm QCD}$ around 0.
The heart of the argument, which is due to Vafa and Witten [99], is that S(x, 0)
falls off exponentially at large x, with a bound on the exponent that is independent
of the gauge configuration. This is obvious in perturbation theory around vanish-
ing gauge field and for gauge fields that fall off rapidly at infinity. Vafa and Witten
proved that it is true in general. It follows from general properties of probability
measures that the averaged two-point function also falls exponentially, so the two-
point function cannot have a pole at zero momentum. Thus, the vector SU(3) is not
spontaneously broken for any finite m. But this means that it cannot be broken for
m = 0, because turning on m does not break the symmetry, so NG bosons present
at $m = 0$ would persist for finite $m$. To prove that the axial symmetries are spontaneously
broken for $m = 0$, we will have to make a short detour through the subject of
anomalies.
8.8 Anomalies
There are a variety of examples in quantum field theory where classical symmetries or
gauge equivalences do not survive quantization [100-106]. In four-dimensional space-
time, this happens only for symmetries acting on chiral fermions. 15 Therefore, consider
M Weyl fermions, coupled to a U(M) gauge potential
Here $A_\mu = \frac{1}{2}A^a_\mu\lambda^a$, where the $\lambda^a$ are the Gell-Mann basis for all Hermitian $M \times M$
matrices. At this point we will not specify which of the $A^a_\mu$ are dynamical (that is, are
15 And to conformal symmetry. The breaking of classical conformal invariance is part of the theory of
renormalization. In the language we will learn in Chapter 9, most marginal operators in field theory are
marginal only to leading order. Conformally invariant quantum field theories typically occur only for
isolated values of the couplings, called fixed points of the renormalization group.
to be integrated over in the functional integral) and which are just external sources,
which can be used to generate Green functions containing global symmetry currents.
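The Euclidean Weyl matrices $\sigma_\mu = (i, \boldsymbol{\sigma})$ map left-handed into right-handed
fermions. Note that in Euclidean space the rotation group is SU(2) × SU(2) (rather
than the Lorentzian SL(2, C)), so the (1, 2) and (2, 1) representations are not related
by complex conjugation. $\bar\psi$ is an independent right-handed Euclidean Weyl field. The
Weyl matrices satisfy
$$\sigma_\mu\sigma_\nu^\dagger = \delta_{\mu\nu} + i[\delta_{\mu 0}\delta_{\nu i} - \delta_{\nu 0}\delta_{\mu i} + \epsilon_{\mu\nu i}]\sigma^i = \delta_{\mu\nu} + i\eta^{R\,i}_{\mu\nu}\sigma^i,$$
$$\sigma_\mu^\dagger\sigma_\nu = \delta_{\mu\nu} + i[-\delta_{\mu 0}\delta_{\nu i} + \delta_{\nu 0}\delta_{\mu i} + \epsilon_{\mu\nu i}]\sigma^i = \delta_{\mu\nu} + i\eta^{L\,i}_{\mu\nu}\sigma^i.$$
The three-dimensional Levi-Civita symbol $\epsilon_{\mu\nu i}$ is defined to be zero if any of its indices
is 0. The 't Hooft symbols $\eta^{L,R}$ form a basis for the space of (anti) self-dual tensors.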
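As a quick sanity check of the Weyl-matrix identities, the following Python sketch multiplies the matrices out numerically. It assumes $\sigma_\mu = (i, \boldsymbol{\sigma})$ with the standard Pauli matrices and the orientation $\epsilon_{0123} = +1$; the helper names (`eta`, `check_products`, `is_dual`) are illustrative and not taken from the text, and the code does not fix which of the two tensors is labelled L or R.

```python
# Euclidean Weyl matrices sigma_mu = (i*1, sigma_1, sigma_2, sigma_3) as 2x2 lists.
I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
SIGMA = [[[1j * x for x in row] for row in I2]] + PAULI


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def eps3(i, j, k):
    """Three-dimensional Levi-Civita symbol; zero if any index is 0."""
    if 0 in (i, j, k):
        return 0
    return (i - j) * (j - k) * (k - i) // 2


def eps4(a, b, c, d):
    """Four-dimensional Levi-Civita symbol with eps4(0, 1, 2, 3) = +1 (assumed orientation)."""
    idx = (a, b, c, d)
    if len(set(idx)) < 4:
        return 0
    sign = 1
    for i in range(4):
        for j in range(i + 1, 4):
            if idx[i] > idx[j]:
                sign = -sign
    return sign


def eta(mu, nu, k, sign):
    """sign=+1: d_{mu0} d_{nu k} - d_{nu0} d_{mu k} + eps; sign=-1 flips the two delta terms."""
    d = (mu == 0 and nu == k) - (nu == 0 and mu == k)
    return sign * d + eps3(mu, nu, k)


def check_products():
    """Verify sigma_mu sigma_nu^dagger (sign=+1) and sigma_mu^dagger sigma_nu (sign=-1)."""
    for mu in range(4):
        for nu in range(4):
            cases = ((+1, matmul(SIGMA[mu], dagger(SIGMA[nu]))),
                     (-1, matmul(dagger(SIGMA[mu]), SIGMA[nu])))
            for sign, prod in cases:
                for r in range(2):
                    for c in range(2):
                        rhs = (mu == nu) * I2[r][c] + 1j * sum(
                            eta(mu, nu, k, sign) * PAULI[k - 1][r][c] for k in (1, 2, 3))
                        if abs(prod[r][c] - rhs) > 1e-12:
                            return False
    return True


def is_dual(sign):
    """True if (1/2) eps_{mu nu a b} eta_{ab} equals sign * eta_{mu nu} for every component."""
    for k in (1, 2, 3):
        for mu in range(4):
            for nu in range(4):
                dual = sum(eps4(mu, nu, a, b) * eta(a, b, k, sign)
                           for a in range(4) for b in range(4)) / 2
                if dual != sign * eta(mu, nu, k, sign):
                    return False
    return True
```

With these conventions the tensor appearing in $\sigma_\mu\sigma_\nu^\dagger$ is self-dual and the one in $\sigma_\mu^\dagger\sigma_\nu$ is anti-self-dual; reversing the orientation of $\epsilon_{0123}$ would swap the two labels.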
The operators $(\sigma D)(\sigma^\dagger D) = \Delta_R$ and $(\sigma^\dagger D)(\sigma D) = \Delta_L$ map the spaces of right- and
left-handed fermions into themselves. Their determinants are defined in a finite and
gauge-invariant manner by the formula
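$$\ln\mathrm{Det}\left(\frac{\Delta_{L,R}}{\Delta^0_{L,R}}\right) = -\int_{s_0}^{\infty}\frac{ds}{s}\,\mathrm{Tr}\left(e^{s\Delta_{L,R}} - e^{s\Delta^0_{L,R}}\right).$$
$\Delta^0_{L,R}$ are the values of the operators at vanishing $A_\mu$. The trace includes a sum over spin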
indices, internal symmetry indices, and an integral over Euclidean space-time. These
formulae are valid if neither of the operators has a normalizable zero mode. In that
case the two operators are negative definite and the integrals converge at their upper
ends. When zero modes exist, the formula is valid for the restrictions of $\Delta_{L,R}$ to their
non-zero eigenspaces. The determinants are given by the formal limit $s_0 \to 0$. There
are divergences in this limit coming from the small-$s$ singularities of
$$\langle x|e^{s\Delta_{L,R}}|x\rangle.$$
These are all concentrated in an infinitesimal neighborhood of the point x and can
be absorbed into divergent coefficients times local functions of $A_\mu$ and its derivatives
(this contributes to one-loop renormalization). By construction, everything is gauge-
invariant. This tells us that any violation of gauge invariance in the determinant is due
to the unitary operator which maps between left- and right-handed Hilbert spaces and
is therefore a pure phase. If we have a parity-invariant gauge theory, that is we gauge
only a subgroup of U(M) under which the M representation is real, then this phase
cancels out, and the determinant is positive and gauge-invariant.
The possible violations of local gauge invariance are restricted to an infinitesimal
neighborhood of a point x, since we can choose gauge transformations that differ from
the identity only in the neighborhood of x. Thus
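$$\delta\ln\mathrm{Det}[i\sigma^\mu D_\mu] = \int d^4x\,\delta A^a_\mu(x)\,\langle J^{\mu a}(x)\rangle,$$
which for a gauge variation with parameter $\omega^a(x)$ is proportional to $\int d^4x\,\omega^a(x)\,D^{ab}_\mu\langle J^{\mu b}(x)\rangle$.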
$J^{\mu a} = i\bar\psi\sigma^\mu(\lambda^a/2)\psi$. Gauge invariance is equivalent to covariant conservation of the
current induced in the vacuum by the background gauge potential. We must expect
that any answer we get is independent of any cut-off parameters like $s_0$, which we
introduce to make the calculation of the determinant finite, in the limit that $s_0 \to 0$.
In this conclusion we anticipate the result of renormalization theory: any divergence
in renormalizable local field theory can be eliminated by changing a finite number of
parameters in the Lagrangian and redefining operators by a finite linear redefinition.
Dimensional analysis, and the requirement that the answer be odd under parity, then
restricts us to
$$D^{ab}_\mu\langle J^{\mu b}(x)\rangle = \epsilon^{\mu\nu\alpha\beta}A(M)\big(d^{abc}\,\partial_\mu A^b_\nu\,\partial_\alpha A^c_\beta + g^{abcd}\,\partial_\mu A^b_\nu A^c_\alpha A^d_\beta + h^{abcde}A^b_\mu A^c_\nu A^d_\alpha A^e_\beta\big).$$
The coefficients must be group-covariant numerical tensors. In fact, $d^{abc}$ must
be proportional to the unique invariant in the symmetric product of two adjoint
representations of U($M$). This is defined by
$$\lambda^a\lambda^b = (if^{abc} + d^{abc})\lambda^c,$$
where $f$ is the totally anti-symmetric structure constant and $d$ is totally symmetric. The
equality follows from the fact that the $\lambda^a$ are a basis for all Hermitian matrices and the
symmetry properties from cyclicity of the trace. We have defined $A(M)$ so that $d^{abc}$ are
precisely these coefficients.
If we require the non-conservation equation to be covariant under gauge transfor-
mations, we get the unique answer
$$\mathcal{A}^a(x) = D^{ab}_\mu\langle J^{\mu b}(x)\rangle = \epsilon^{\mu\nu\alpha\beta}A(M)\,d^{abc}\,G^b_{\mu\nu}G^c_{\alpha\beta}.$$
However, in the non-abelian case, this covariance argument is inconsistent with an even
more basic requirement. The functional differential operators
satisfy the algebra
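$$[\mathcal{G}^a(x), \mathcal{G}^b(y)] = if^{abc}\,\mathcal{G}^c(x)\,\delta^4(x - y).$$
This leads to the Wess-Zumino (WZ) consistency condition [106]
$$\mathcal{G}^a(x)\mathcal{A}^b(y) - \mathcal{G}^b(y)\mathcal{A}^a(x) = if^{abc}\,\mathcal{A}^c(x)\,\delta^4(x - y).$$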
The covariant formula doesn't satisfy this condition. However, it can be shown that the
difference between the covariant formula and the consistent one we will derive below is
the variation of a local functional. Thus, the ambiguities of renormalization can turn
one into the other. By contrast, we will see that no such local redefinition of the action
can set $\mathcal{A}^a = 0$.
The simplest derivation of the anomaly equation starts from a formula for
$D^{ab}_\mu\langle J^{\mu b}\rangle$. We use fermion functional integration to write
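$$\int d^4x\,\omega^a(x)\,D^{ab}_\mu\langle J^{\mu b}(x)\rangle = -i\int d^4x\,\mathrm{tr}\left(\langle x|(\sigma^\mu D_\mu)^{-1}\sigma^\mu[D_\mu, \omega]|x\rangle\right).$$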
Here co is the space-time-dependent, Hermitian-matrix-valued, infinitesimal gauge
parameter. The commutator is taken in the sense of differential operators and func-
tions, as well as in the sense of matrices. The lower-case tr means trace only over spin
and gauge indices.
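We now use two formulae for $(\sigma^\mu D_\mu)^{-1}$:
$$(\sigma^\mu D_\mu)^{-1} = \Delta_L^{-1}\,\sigma^\dagger_\nu D_\nu = -\int_{s_0}^{\infty} ds\,e^{s\Delta_L}\,\sigma^\dagger_\nu D_\nu,$$
$$(\sigma^\mu D_\mu)^{-1} = \sigma^\dagger_\nu D_\nu\,\Delta_R^{-1} = -\sigma^\dagger_\nu D_\nu\int_{s_0}^{\infty} ds\,e^{s\Delta_R}.$$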
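We use these two forms in the two different terms of the commutator. With $s_0$ finite,
we are perfectly justified in using cyclicity of the trace to write
$$\int d^4x\,\omega^a(x)\,D^{ab}_\mu\langle J^{\mu b}(x)\rangle = -i\int d^4x\,\mathrm{tr}\left(\int_{s_0}^{\infty} ds\,\langle x|\left(e^{s\Delta_R}\Delta_R\,\omega - \Delta_L e^{s\Delta_L}\omega\right)|x\rangle\right).$$
We note that $\Delta_L e^{s\Delta_L} = (d/ds)e^{s\Delta_L}$ and $e^{s\Delta_R}\Delta_R = (d/ds)e^{s\Delta_R}$, which enables us to do
the $s$ integral:
$$\int d^4x\,\omega^a(x)\,D^{ab}_\mu\langle J^{\mu b}(x)\rangle = i\int d^4x\,\mathrm{tr}\left(\langle x|\left(e^{s_0\Delta_R} - e^{s_0\Delta_L}\right)\omega|x\rangle\right).$$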
Now we recall the Baker-Campbell-Hausdorff formula
which is valid for any two operators $A$ and $B$. The operators $C_n$ are sums of $n$-fold
multiple commutators of $A$ and $B$. In our present computation, we take $A = D^2$ and
$B_{L,R} = \eta^{L,R\,a}_{\mu\nu}(\sigma^a/2)F_{\mu\nu}$. Note that the spin traces of $B_{L,R}$ are both zero, as are the traces
of the commutators $[A, B_{L,R}]$. In our computation we are taking the small-$s_0$ limit of
the difference between the exponentials of $\Delta_L$ and $\Delta_R$. Taking the trace properties into
account, we find that the leading term is
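$$\int d^4x\,\omega^a(x)\,D^{ab}_\mu\langle J^{\mu b}(x)\rangle \approx \frac{i s_0^2}{2}\int d^4x\,\mathrm{tr}\left(\langle x|e^{s_0D^2}|x\rangle\,(B_R^2 - B_L^2)(x)\,\omega(x)\right).$$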
Any surviving contribution must come from the singularity of $\langle x|e^{s_0D^2}|x\rangle$, as $s_0 \to 0$.
By using the BCH formula to expand around the operator with zero gauge potential,
we can see that the leading singularity comes from the substitution $D^2 \to \partial^2$, whence
$$\langle x|e^{s_0\partial^2}|x\rangle = \int\frac{d^4p}{(2\pi)^4}\,e^{-s_0p^2} = \frac{1}{16\pi^2s_0^2}.$$
The limit $s_0 \to 0$ is thus finite and gives
$$\int d^4x\,\omega^a(x)\,D^{ab}_\mu\langle J^{\mu b}(x)\rangle = \frac{i}{32\pi^2}\int d^4x\,\mathrm{tr}\left((B_R^2 - B_L^2)(x)\,\omega(x)\right).$$
This has the gauge-covariant form we anticipated, and is odd under parity. On doing
all the traces we find
$$\int d^4x\,D^{ab}_\mu\langle J^{\mu b}(x)\rangle \propto d^{abc}\int d^4x\,\epsilon^{\mu\nu\alpha\beta}G^b_{\mu\nu}G^c_{\alpha\beta}.$$
Here we have normalized all of the $\lambda^a$ matrices by $\mathrm{tr}(\lambda^a\lambda^b) = 2\delta^{ab}$, and defined
$\lambda^a\lambda^b = d^{abc}\lambda^c + if^{abc}\lambda^c$. In particular, $d^{0ab} = \sqrt{2/M}\,\delta^{ab}$, and the generator of U(1) is $\lambda^0/2 = I/\sqrt{2M}$.
The individual fermions are often chosen to have U(1) charge 1, in which case the
U(1) SU($M$)$^2$ anomaly equation is multiplied by a factor of $\sqrt{2M}$.
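The coincident-point heat kernel used in this derivation, $\int d^4p\,e^{-s_0p^2}/(2\pi)^4 = 1/(16\pi^2s_0^2)$, can be checked numerically. The sketch below is only an illustration: it reduces the four-dimensional integral to its radial form (the unit three-sphere has area $2\pi^2$) and applies Simpson's rule; the function name, the cut-off `p_max` and the step count are arbitrary choices made for this example.

```python
import math


def heat_kernel_diagonal(s, n=20000):
    """Numerically evaluate int d^4p/(2 pi)^4 exp(-s p^2) from its radial form."""
    p_max = 12.0 / math.sqrt(s)          # exp(-s p^2) is ~exp(-144) beyond this point

    def integrand(p):
        return p ** 3 * math.exp(-s * p * p)

    h = p_max / n                        # n must be even for Simpson's rule
    total = integrand(0.0) + integrand(p_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    radial = total * h / 3.0
    # Angular integral over the unit 3-sphere gives 2 pi^2.
    return 2 * math.pi ** 2 / (2 * math.pi) ** 4 * radial


def heat_kernel_exact(s):
    return 1.0 / (16 * math.pi ** 2 * s ** 2)
```

The $1/s_0^2$ growth is what compensates the explicit factor of $s_0^2$ from the second-order term in $B_{L,R}$, leaving a finite limit.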
Another formula for $d^{abc}$ is $d^{abc} = 2\,\mathrm{tr}(T^a[T^b, T^c]_+)$, with $T^a = \lambda^a/2$. This formula is important
because we often gauge a subgroup G of U(M), under which the M-dimensional
representation breaks up into a direct sum of complicated representations of G. The
anomaly for G will always be proportional to this trace. However, the normalizations
of generators need not be the same as those we use here. A fairly standard convention in
physics is to always normalize the generators in the lowest-dimensional representation
of G by $\mathrm{tr}(T^aT^b) = \frac{1}{2}\delta^{ab}$. This defines the normalization of the structure constants.
In any other representation $\mathrm{tr}_R(T^aT^b) = D(R)\delta^{ab}$, where $D(R)$ is called the Dynkin
index of the representation. Similarly, the symmetrized cubic trace which appears in
any representation is proportional to the same invariant tensor (the analog of $d^{abc}$
for U($M$)) with a coefficient called $\mathcal{A}(R)$, the anomaly of the representation. Since
$d^{aaa} \neq 0$ for all groups with anomalies and some values of $a$, and the structure of $d^{abc}$
is representation-independent, anomaly coefficients can be computed for any representation
by finding a single generator such that $\mathrm{tr}\,T^3 \neq 0$ and being careful about
normalization conventions.
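The trace formula for $d^{abc}$ is easy to evaluate by machine. The following sketch assumes the standard Gell-Mann matrices for SU(3) (zero-based list index, so `7` is $\lambda^8$) with $T^a = \lambda^a/2$; the function names are invented for this illustration. It also evaluates the same trace with the conjugate generators $-T^{a*}$, which flips the sign.

```python
import math


def mat(rows):
    return [[complex(x) for x in row] for row in rows]


S3 = 1 / math.sqrt(3)
GELL_MANN = [
    mat([[0, 1, 0], [1, 0, 0], [0, 0, 0]]),
    mat([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]),
    mat([[1, 0, 0], [0, -1, 0], [0, 0, 0]]),
    mat([[0, 0, 1], [0, 0, 0], [1, 0, 0]]),
    mat([[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]),
    mat([[0, 0, 0], [0, 0, 1], [0, 1, 0]]),
    mat([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]),
    mat([[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]]),
]
# Conjugate representation: lambda -> -lambda^* (so T -> -T^*).
CONJUGATE = [[[-x.conjugate() for x in row] for row in g] for g in GELL_MANN]


def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def d_symbol(lams, a, b, c):
    """d^{abc} = 2 tr(T^a {T^b, T^c}) with T = lambda/2, i.e. tr(lam^a {lam^b, lam^c}) / 4."""
    bc, cb = mul(lams[b], lams[c]), mul(lams[c], lams[b])
    anti = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(bc, cb)]
    return (2 * trace(mul(lams[a], anti)) / 8).real
```

For example, `d_symbol(GELL_MANN, 7, 7, 7)` gives the coefficient obtained from the single diagonal generator $\lambda^8$, which is the kind of one-generator shortcut described above.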
The Hermitian generators $T^a$ of a Lie algebra in a representation $R$ are generally
complex matrices, and the matrices $(-T^{a*})$ give another unitary representation called
the conjugate representation $\bar R$. It may happen that these two representations are equivalent,
i.e. $-T^{a*} = UT^aU^\dagger$ for a unitary matrix $U$. The product of a representation
and its conjugate always contains the trivial representation. If $\bar R$ is unitarily equivalent
to $R$ this means that there is an invariant product on $R$ itself, $F^*_iF_i = F_i(U^\dagger)^{ij}F_j$. If
this product is symmetric, the representation is called real. There is a basis in which all
of the $T^a$ are imaginary and anti-symmetric. If it is anti-symmetric (symplectic) then
the representation is called pseudo-real. In either case it is easy to see that the anomaly
vanishes. The anomaly coefficients are real, because the generators are Hermitian. Thus
$\mathcal{A}(\bar R) = -\mathcal{A}(R)$. However, the coefficients are invariant under unitary equivalence, so
$\mathcal{A}(\bar R) = \mathcal{A}(R)$ for real or pseudo-real representations, and therefore $\mathcal{A}(R) = 0$.
A simple example of a pseudo-real representation is the doublet of SU(2). We
have $\sigma_a^* = -\sigma_2\sigma_a\sigma_2$. Since there is exactly one SU(2) representation of each integer
dimension, we see that all representations of SU(2) are either real or pseudo-real.
There are no anomalies in pure SU(2) gauge theories. 16 More generally, the anomalies
vanish in any representation of O($n$) with $n \neq 6$. The argument is simple. The anomaly
coefficient
$$\mathrm{tr}(T^{mn}[T^{kl}, T^{ij}]_+)$$
must be a numerical tensor that is anti-symmetric in each pair of indices and symmetric
under interchange of the pairs. All numerical tensors in SO($n$) can be built from
products of Kronecker and Levi-Civita symbols. Only when $n = 6$ is there a tensor,
$\epsilon^{mnklij}$, with the appropriate symmetries. This is one of the famous "accidents" of the
Cartan classification, so(6) = su(4), and the fundamental representation of SU(4) is
the chiral spinor of SO(6). All other SO($n$) gauge theories are anomaly-free. The same
is true for the unitary symplectic groups Sp($2n$) and all of the exceptional groups.
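The pseudo-reality of the SU(2) doublet and the resulting absence of a cubic anomaly can be confirmed with a few lines of Python. This is a minimal sketch using the standard Pauli matrices; the function names are illustrative only.

```python
from itertools import product

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def close(a, b):
    return all(abs(a[i][j] - b[i][j]) < 1e-12 for i in range(2) for j in range(2))


def is_pseudo_real():
    """Check sigma_a^* = -sigma_2 sigma_a sigma_2 for a = 1, 2, 3."""
    s2 = PAULI[1]
    for s in PAULI:
        rotated = mul(mul(s2, s), s2)
        if not close(conj(s), [[-x for x in row] for row in rotated]):
            return False
    return True


def cubic_trace(a, b, c):
    """tr(sigma_a {sigma_b, sigma_c}); proportional to the SU(2)^3 anomaly of the doublet."""
    bc, cb = mul(PAULI[b], PAULI[c]), mul(PAULI[c], PAULI[b])
    anti = [[bc[i][j] + cb[i][j] for j in range(2)] for i in range(2)]
    prod = mul(PAULI[a], anti)
    return prod[0][0] + prod[1][1]


def su2_anomaly_vanishes():
    return all(abs(cubic_trace(a, b, c)) < 1e-12 for a, b, c in product(range(3), repeat=3))
```

The vanishing is immediate analytically as well, since $\{\sigma_b, \sigma_c\} = 2\delta_{bc}$ and the Pauli matrices are traceless.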
8.8.1 The consistent anomaly equation
For a U(l) gauge group, the anomaly equation we have derived does satisfy the WZ
consistency condition. For the abelian case, our equation becomes
$$\int d^4x\,\omega(x)\,\partial_\mu\langle J^\mu(x)\rangle \propto \int d^4x\,\omega(x)\,F_{\mu\nu}\tilde F^{\mu\nu}.$$
We can find an action that gives rise to this formula by the following peculiar trick: introduce
a five-dimensional manifold $X_5$ whose boundary is ordinary four-dimensional
Euclidean space-time. Introduce any smooth five-dimensional gauge potential whose
boundary value is $A_\mu(x)$, with $A_5 = 0$ on the boundary. Define the Chern-Simons (CS)
action by
$$S_{\rm CS} \propto \int d^5x\,\epsilon^{MNKLR}A_MF_{NK}F_{LR}.$$
Note that this does not depend on the five-dimensional metric on $X_5$. Under a
five-dimensional gauge transformation, this goes into
$$\delta S_{\rm CS} \propto \int d^5x\,\epsilon^{MNKLR}\,\partial_M\omega\,F_{NK}F_{LR}.$$
On integrating by parts we see that the gauge variation comes only from the boundary,
and exactly reproduces our anomaly equation.
We can understand the reason for the existence of the CS action by looking at a
six-dimensional U(l) gauge theory. The gauge-invariant operator
$$\epsilon^{M_1M_2M_3M_4M_5M_6}F_{M_1M_2}F_{M_3M_4}F_{M_5M_6}$$
' 1 1] 1 1 Min lied 1 I 1 1 i I ] 111 in model 11 1 1 11 I 1
model with Weyl fermions in an odd number of doublet representations. The fermion determinant in
such models is an odd function on the space of gauge potentials modulo gauge transformations, and the
partition function vanishes. these models ai
$$\epsilon^{M_1M_2M_3M_4M_5M_6}F_{M_1M_2}F_{M_3M_4}F_{M_5M_6} = 2\,\epsilon^{M_1M_2M_3M_4M_5M_6}\,\partial_{M_1}(A_{M_2}F_{M_3M_4}F_{M_5M_6}).$$
The fact that the five-dimensional gauge variation of $\mathcal{L}_{\rm CS}$ is a total divergence follows
from the fact that the gauge-invariant six-dimensional operator is the total divergence
of the lift of the CS Lagrangian density to a current density in six dimensions. This
fact shows us how to generalize the CS trick to non-abelian theories. Simply define the
non-abelian CS invariant by
$$\mathrm{tr}[\epsilon^{M_1M_2M_3M_4M_5M_6}F_{M_1M_2}F_{M_3M_4}F_{M_5M_6}] = \epsilon^{M_1M_2M_3M_4M_5M_6}\,\partial_{M_1}C_{M_2M_3M_4M_5M_6},$$
and use the five-dimensional restriction of $C_{M_1\ldots M_5}$ to define the non-abelian CS action.
By construction, its gauge variation on a 5-manifold with boundary will depend only
on the boundary values of the fields. Its variation is parity odd, dimension 4 (contains
no dimensionful parameters), and satisfies the WZ condition. It will also reproduce
our direct calculation for abelian subgroups.
We will show below that there can be no local four-dimensional action whose gauge
variation reproduces the anomaly equation. It is a little harder to show that there
is no other WZ-consistent form of the anomaly equation that cannot be written as
the variation of a four-dimensional action. Furthermore, the difference between our
covariant form of the anomaly equation and the consistent form can be written as such
a variation. Thus, the anomaly cannot be removed by some other method of regulation,
and its form is completely determined by its value for abelian gauge fields (but one must
examine all U(l) subgroups). Some of the proofs of the statements of this paragraph
can be found in Chapter 22 of [42] and references therein.
8.8.2 The use of anomalies: consistency of gauge theories
The most striking consequence of the existence of anomalies is that some classical gauge
theories are inconsistent at the quantum level. The lack of gauge invariance translates
into either a violation of Lorentz invariance, in manifestly unitary gauges like the axial
gauge, or a violation of unitarity, in covariant gauges. The first thing we should do is
to check whether our favorite gauge theory of the real world, the standard SU(1, 2, 3)
model, is anomaly-free. We know that the SU(2)$^3$ and SU(3)$^3$ anomalies cancel out,
because the representations of these groups are real or pseudo-real. We are left with the
U(1)$^3$, U(1) SU(2)$^2$, and U(1) SU(3)$^2$ anomalies. These give the following constraints
on weak hypercharges:
$$6Y_q^3 + 3Y_{\bar u}^3 + 3Y_{\bar d}^3 + 2Y_l^3 + Y_{\bar e}^3 = 0,$$
$$3Y_q + Y_l = 0,$$
$$2Y_q + Y_{\bar u} + Y_{\bar d} = 0.$$
There is one more constraint, which has to do with the consistency of the coupling
of the theory to gravity. In order to couple spinors to gravity, we have to introduce an
SO(1, 3) gauge potential, $\omega_\mu^{ab}$, to relate the spinor basis at one point of the space-time
manifold to those at any other. The $ab$ indices refer to the adjoint representation of
SO(1, 3). In Einstein's theory of gravitation this is accomplished by introducing an
orthonormal frame or Vierbein, $e^a_\mu(x)$, in the tangent space to space-time at space-time
coordinate $x$. The coordinate-frame space-time metric is $g_{\mu\nu} = e^a_\mu e^b_\nu\eta_{ab}$, where $\eta$ is the
Minkowski metric in the tangent space. $e^\mu_a$ is the inverse matrix to $e^a_\mu$. This defines the
connection via the equation
$$\partial_\mu e^a_\nu - \partial_\nu e^a_\mu = \omega_\mu^{ab}e_{b\nu} - \omega_\nu^{ab}e_{b\mu}.$$
One can solve this equation algebraically for $\omega_\mu^{ab}$. If $e^a_\mu \to \Lambda^a_b(x)e^b_\mu$ under local Lorentz
transformations, it is easy to see that $\omega_\mu^{ab}$ transforms as a Lorentz connection. The
Lagrangian for Weyl spinors coupled to gauge fields and gravity is
$\sigma^{ab}$ are the matrices representing infinitesimal Lorentz transformations. A derivation
quite analogous to what we have done above shows that there will be U(l) gravity
anomalies unless the traces of all U(l) generators vanish. Thus, in the standard model
we must have
$$6Y_q + 3Y_{\bar u} + 3Y_{\bar d} + 2Y_l + Y_{\bar e} = 0.$$
We can use the three linear equations to eliminate $Y_l$, $Y_{\bar e}$, and $Y_{\bar d}$. Finally, we use the
cubic equation to eliminate $Y_{\bar u}$ in terms of $Y_q$. Remarkably, it turns into a quadratic
equation, whose solutions are
$$Y_{\bar u} = -Y_q \pm 3|Y_q|.$$
It is then obvious that all of the hypercharges must be quantized in units of $Y_q$. Recall
that the normalization of all hypercharges is determined only once we have fixed the
gauge coupling $g_1$.
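The elimination described above can be reproduced with exact rational arithmetic. In the sketch below the three linear conditions are solved for $Y_l$, $Y_{\bar d}$ and $Y_{\bar e}$, the quoted solution for $Y_{\bar u}$ is inserted, and all four anomaly sums are evaluated. The normalization $Y_q = 1/6$ is an assumption made for this example (the overall scale is fixed only together with $g_1$), and the function names are illustrative.

```python
from fractions import Fraction


def anomaly_sums(Yq, Yu, Yd, Yl, Ye):
    """Return the U(1)^3, U(1)SU(2)^2, U(1)SU(3)^2 and U(1)-gravity anomaly sums."""
    cubic = 6 * Yq**3 + 3 * Yu**3 + 3 * Yd**3 + 2 * Yl**3 + Ye**3
    su2 = 3 * Yq + Yl
    su3 = 2 * Yq + Yu + Yd
    gravity = 6 * Yq + 3 * Yu + 3 * Yd + 2 * Yl + Ye
    return cubic, su2, su3, gravity


def solve_from_Yq(Yq, branch):
    """Hypercharges (Yq, Yu, Yd, Yl, Ye) for branch = +1 or -1 of Yu = -Yq +/- 3|Yq|."""
    Yl = -3 * Yq                                     # U(1) SU(2)^2 condition
    Yu = -Yq + branch * 3 * abs(Yq)                  # root of the quadratic
    Yd = -2 * Yq - Yu                                # U(1) SU(3)^2 condition
    Ye = -(6 * Yq + 3 * Yu + 3 * Yd + 2 * Yl)        # U(1)-gravity condition
    return Yq, Yu, Yd, Yl, Ye


# Assumed normalization for the example: Yq = 1/6.
STANDARD_MODEL = solve_from_Yq(Fraction(1, 6), -1)
SWAPPED = solve_from_Yq(Fraction(1, 6), +1)
```

The two branches differ only by exchanging the roles of $\bar u$ and $\bar d$, so there is a single physically distinct solution once $Y_q$ is chosen.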
There are three amazing things about this result. First, we have derived charge quantization
from consistency conditions. Second, the consistency conditions require a
non-trivial cancelation between quark and lepton contributions. Finally, the real-world
hypercharges actually satisfy the conditions. All of this could be simply explained if the
standard model emerged from the Higgs mechanism in a simple group at a high energy
scale, $M_U \gg m_W$. This possibility was first investigated by Georgi and Glashow. The
idea is to search for the simplest possibility, a simple group of rank 4 (the rank of the
standard model) that contains SU(1, 2, 3). The list of simple groups of rank 4 consists
of SU(5), SO(8), SO(9), Sp(8), and $F_4$. The only one with an SU(1, 2, 3) subgroup is
SU(5). SU(5) has no irreducible 15-dimensional representation, but the $R = \bar 5 \oplus 10$
representation 17 and its conjugate are 15-dimensional reducible representations. The
17 The 10 is the anti-symmetric product of two 5 representations.
$\bar 5 + 10$ has the right quantum-number assignments to fit the standard model. The $\bar d$
and lepton multiplets fit in the $\bar 5$ and the rest of the particles are in the 10.
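The quantum-number assignment can be illustrated for the hypercharge alone. The sketch below assumes that hypercharge acts on the 5 of SU(5) as diag($-1/3, -1/3, -1/3, 1/2, 1/2$), a normalization chosen for this example to match $Y_q = 1/6$ and not stated in the text. Charges of the 10 then follow by adding the charges of pairs of distinct components of the 5, since the 10 is the anti-symmetric product of two 5s.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Assumed hypercharge on the 5: three colour components and two weak components.
Y5 = [Fraction(-1, 3)] * 3 + [Fraction(1, 2)] * 2
Y5BAR = [-y for y in Y5]                              # conjugate representation
Y10 = [a + b for a, b in combinations(Y5, 2)]         # anti-symmetric pairs

MULTIPLICITIES_5BAR = dict(Counter(Y5BAR))
MULTIPLICITIES_10 = dict(Counter(Y10))

LINEAR_SUM = sum(Y5BAR) + sum(Y10)                    # U(1)-gravity anomaly
CUBIC_SUM = sum(y**3 for y in Y5BAR) + sum(y**3 for y in Y10)   # U(1)^3 anomaly
```

With this input the $\bar 5$ carries three states of hypercharge $1/3$ and two of $-1/2$, while the 10 carries three states of $-2/3$, six of $1/6$ and one of $1$, and both anomaly sums vanish.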
8.8.3 Violation of global symmetries
If we have a simple, non-abelian gauge group G, then the only possible anomalies in
global symmetry currents have the form
The anomaly coefficients have the form
$$a_i = \sum_R t_i^R\,D(R),$$
where the sum is over irreducible representations of G and we are working in a basis
where the generator $T^i$ of the global symmetry is diagonal. From this formula it is
clear that if $T^i$ is the (traceless) generator of a simple non-abelian global symmetry
then the anomaly vanishes. Further, while a generic model may have several U(l)
global symmetries, their anomalies are all proportional to each other so only one
linear combination of the U(l)s is anomalous. If our non-abelian gauge group has k
simple factors, then, barring fortuitous cancelations, there will be k anomalous U(l)
symmetries.
There are two important applications of anomalous violation of global symmetries
in the standard model. The first is the anomaly in the U$_A$(1) symmetry of QCD with
massless quarks. In this case the gauge coupling $g_3$ is not really a free dimensionless
parameter. Rather, as we will see in our chapter on renormalization, $g_3$ varies with the
momentum scale. It is small at very high momenta, but eventually becomes strong.
The real parameter of massless QCD is a scale $\Lambda_{\rm QCD}$ at which perturbation theory
in $g_3$ breaks down. There is thus no sense, except in processes at energy $\gg \Lambda_{\rm QCD}$, in
which the violation of the U$_A$(1) symmetry is small. Rather, the RHS of the anomaly
equation is just estimated from dimensional analysis.
This is a good thing, because our hypothesis of non-zero $\langle\bar qq\rangle$, which is necessary to
the successful predictions of chiral symmetry for pion and kaon physics, implies that,
if U$_A$(1) is a symmetry, it is spontaneously broken. This would predict a ninth pseudo-Goldstone
boson in the hadron spectrum, which could be created by the singlet axial
current. The lightest particle with the right quantum numbers is the $\eta'$. It is no lighter
than the proton. Chiral perturbation theory predicts that the mass of such a pseudo-Goldstone
boson would be less than twice the pion mass. The anomaly saves us from this disaster,
by telling us that U$_A$(1) is not a symmetry of QCD.
The God of the standard model must be Jewish, for there is a price to pay for this
pleasant victory. When we rotated away the overall phase of the quark mass matrix,
we used a U$_A$(1) transformation. The anomaly equation tells us that this simply shifts
the CP-violating parameter $\theta_{\rm QCD}$ in the gauge-field Lagrangian. Since experimental
evidence shows that CP is violated in the real world, 18 we are left with no explanation
of why this parameter is small. We might have tried to argue, on the basis of perturbation
theory, that the operator $G^a_{\mu\nu}\tilde G^{\mu\nu a}$ was a total divergence, with no effect on physics.
The anomalous resolution of the problem of the ninth Goldstone boson requires us
to abandon such a conjecture. Furthermore, as we will see when we study instantons,
there is strong mathematical evidence that the total divergence does have an effect
on non-perturbative physics. The strong CP problem is one of the deepest puzzles of
particle physics. Both naive and more sophisticated estimates of the neutron electric
dipole moment indicate that $\theta_{\rm QCD} < 10^{-9}$. Theoretical explanations of this small number
have been proposed, but none has yet been experimentally verified.
The second anomalous violation of a global symmetry in the standard model is the
violation of baryon number by $B$ SU(2)$^2$ anomalies. The only particles in the standard
model which have both baryon number and non-singlet SU(2)$_L$ quantum numbers are the
left-handed quark doublets. The reader should, at this point, be able to see that there is a
baryon-number anomaly, as well as a lepton-number anomaly, but that $B - L$ is conserved.
Because the electro-weak couplings are small at the scale of the W-boson mass, electro-weak
perturbation theory is a good approximation. $B$ violation is not seen in any order
in the perturbation expansion. Instanton methods, which we will learn about in the last
chapter, show that the violation of $B$ in few-particle reactions is of order
$$\Delta B/(VT) \sim m_W^4\,e^{-8\pi^2/g_2^2} \sim m_W^4\,e^{-180}.$$
This is of course invisible in laboratory experiments. It can be shown that, if the temperature
of the early Universe reaches something of order 100 GeV, then electro-weak
$B$-violating processes become much more probable. They may be part of the story of
how the Universe developed an asymmetry between baryons and anti-baryons.
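The size of the instanton suppression can be estimated in a couple of lines. This sketch assumes the standard one-instanton action $8\pi^2/g^2$ and an illustrative SU(2) gauge coupling $g \approx 0.65$ at the weak scale; both the numerical input and the function name are assumptions made for this example rather than values taken from the text.

```python
import math


def instanton_exponent(g):
    """One-instanton action 8 pi^2 / g^2 for gauge coupling g."""
    return 8 * math.pi ** 2 / g ** 2


G_WEAK = 0.65                                   # assumed illustrative value of the SU(2) coupling
EXPONENT = instanton_exponent(G_WEAK)           # exponent in the suppression factor
SUPPRESSION = math.exp(-EXPONENT)               # overall factor exp(-8 pi^2 / g^2)
```

Because the exponent scales as $1/g^2$, even a modest change in the assumed coupling shifts the estimate by many orders of magnitude, so only the rough size of the exponent is meaningful.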
8.8.4 Anomaly matching and massless spectrum
Our final example of the use of anomaly equations uses non-anomalous global sym-
metries in a theory with a strongly interacting gauge invariance. We will work with
the particular example of massless QCD with SU($N$) gauge group and $N_F$ species of
massless quarks. In particular, we will concentrate on the case (approximately) relevant
to the real world: $N = N_F = 3$. Taking into account the U$_A$(1) anomaly, this theory
has a $G_F$ = U(1) × SU($N_F$) × SU($N_F$) global symmetry, called the flavor group of
the standard model (in the approximation of $N_F$ massless quarks). Imagine trying
to gauge this symmetry, but take all the gauge couplings very small. 19 Such a gauge
theory is inconsistent, because of the anomalies. Note that the anomaly equations for
$G_F$ are independent of the SU($N$) gauge fields, and so remain valid after one has
18 And the phase in the CKM matrix gives a nice explanation of all observations to date.
19 Once renormalization is taken into account, gauge couplings vary with energy scale. The statement that
all the couplings i n !1 u n lii in i i i, i I 1 ll houi ill i i range idevant to the
done the functional integral over these fields. 20 Let us add a collection of massless
left-handed fermions, called spectators, in representations of $G_F$, which cancel out all
of the anomalies.
There is strong evidence that for $N_F/N$ sufficiently small (the critical value is of order
11/2) a small SU($N$) gauge coupling in the extreme ultraviolet grows until an energy
scale $\Lambda_{\rm QCD}$. Below that scale a perturbative description in terms of quarks and gluons
is invalid. In fact, it is believed that the theory exhibits confinement: neither quarks
nor gluons appear in the spectrum of asymptotic particle states. Instead, the particle
states are created from the vacuum by color-singlet composite operators composed of
N quarks, N anti-quarks, or a quark-anti-quark pair. The latter are called mesons, the
former, baryons. Collectively they are known as hadrons.
Far below the scale $\Lambda_{\rm QCD}$ there must be a consistent effective $G_F$ gauge theory
containing the spectators and massless hadrons (if any). We immediately conclude
that there must be massless hadrons, because otherwise there could be nothing to
cancel out the anomaly of the spectator fields. One possibility is massless baryons in
anomalous representations of $G_F$. We will look into this possibility in a moment. The
other possibility is illustrated for U(1) subgroups of $G_F$ by the following Lagrangian:
$$\mathcal{L} = \frac{f^2}{2}(\partial_\mu\phi)^2 + K\phi\,F_{\mu\nu}\tilde F^{\mu\nu}.$$
$\phi$ is an angle variable and represents the NGB of spontaneous U(1) breakdown.
The U(1) symmetry is realized as a shift of $\phi$. For an appropriate choice of $K$, this
Lagrangian can reproduce an anomalous symmetry transformation, which cancels out
that of the spectators. Note that, if $G_F$ is explicitly broken by quark masses, then
the massive pseudo-NGB will decay into "photons" of the U(1) subgroup, with an
amplitude completely determined by the anomaly $K$ and the decay constant $f$. Such
a computation in fact provides a quantitatively correct description of the decay of the
$\pi^0$ into two Maxwell photons. The non-abelian generalization of this anomalous NGB
Lagrangian was discovered by Wess and Zumino in the same paper [106] as that in
which they derived their consistency condition for anomalies. We will investigate it in
the exercises.
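To make the role of the shift symmetry explicit, here is a short worked variation. It assumes a constant shift parameter $c$ (a symbol introduced only for this illustration) and a Lagrangian consisting of a kinetic term proportional to $(\partial_\mu\phi)^2$ plus the coupling $K\phi F_{\mu\nu}\tilde F^{\mu\nu}$:

```latex
\phi \to \phi + c:\qquad
\delta\big[(\partial_\mu\phi)^2\big] = 0,\qquad
\delta\big[K\phi\,F_{\mu\nu}\tilde F^{\mu\nu}\big] = c\,K\,F_{\mu\nu}\tilde F^{\mu\nu}
\quad\Longrightarrow\quad
\delta\mathcal{L} = c\,K\,F_{\mu\nu}\tilde F^{\mu\nu}.
```

The variation has the same $F\tilde F$ structure as the abelian anomaly, which is why a suitable value of $K$ can compensate the contribution of the spectators.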
So we appear to have a dichotomous choice of massless hadron spectrum: NGBs
or massless baryons constrained by anomaly cancelation. Let's consider the second
possibility. For N = 3, the baryon fields have the form
■fly '-
Here we have used the notation that lower or upper Greek indices represent left-handed
Weyl spinors. Lower early Latin indices are SU($N_F$)$_L$ fundamental representations,
20 There is a much more rigorous version of this quick and dirty argument, called the Adler-Bardeen
theorem [105].
whereas lower mid-alphabet Latin indices are the fundamental of SU(3). Upper Latin
indices are anti-fundamental SU($N_F$)$_R$ or SU($N$) representations. The spin indices
must be combined to spin 1/2. The other possibility, massless spin-3/2 fields with gauge
interaction, can be realized only in supergravity in anti-de Sitter space. The gauge group
representation in that case is non-chiral and does not have anomalies. Furthermore, the
spin-3/2 particles are supersymmetric partners of the graviton; so, if they were composites,
the graviton too would have to be composite. Weinberg and Witten [108] showed
that a composite graviton cannot arise in a Lorentz-invariant local field theory.
We can immediately see that in the case $N = N_F = 3$ the baryons cannot cancel out
the spectator anomaly. For $N = 3$ the baryons which transform under SU(3)$_L$ are of
the form $\epsilon_{ijk}q^i_aq^j_bq^k_c$. The only complex representation of SU(3)$_L$ which appears in the
product of three three-dimensional representations is the totally symmetric 10. If we
consider the generator
$$T = \mathrm{diagonal}(1, 1, -2)$$
in the fundamental representation, then
$$T_{10} = \mathrm{diagonal}(3, 3, -6, 3, 3, 0, 0, 0, -3, -3).$$
If we have $L$ baryons in the 10 representation, this gives an SU(3)$^3$ anomaly proportional
to $-6(27)L$. The quark anomaly is instead $-18$ (including the three from
color) in the same units. The anomalies match only if $L = \frac{1}{9}$, which is absurd. We conclude
that U(1) × SU$_L$(3) × SU$_R$(3) must break down spontaneously to its anomaly-free
U(1) × SU$_V$(3) subgroup. This is in fact the symmetry-breaking pattern observed for
QCD in the real world.
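The arithmetic behind this mismatch can be checked directly. The following is a minimal sketch, assuming only that the 10 is the totally symmetric product of three fundamentals, so that the eigenvalues of the generator add; the function names are illustrative and not part of the text.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Eigenvalues of T = diagonal(1, 1, -2) in the fundamental of SU(3)_L.
FUNDAMENTAL = (1, 1, -2)

def symmetric_cube_weights(weights):
    """Eigenvalues of T on the totally symmetric product of three fundamentals."""
    return [sum(triple) for triple in combinations_with_replacement(weights, 3)]

def cubic_trace(eigenvalues):
    """tr T^3 for a diagonal generator with the given eigenvalues."""
    return sum(t ** 3 for t in eigenvalues)

t10 = symmetric_cube_weights(FUNDAMENTAL)
baryon_anomaly = cubic_trace(t10)               # one baryon multiplet in the 10
quark_anomaly = 3 * cubic_trace(FUNDAMENTAL)    # factor 3 from color

# Number of baryon multiplets that would be needed to match the quarks.
required_L = Fraction(quark_anomaly, baryon_anomaly)

print(sorted(t10, reverse=True))
print(baryon_anomaly, quark_anomaly, required_L)
```

The ten eigenvalues agree with $T_{10}$ up to ordering, and the required number of multiplets comes out as a fraction rather than an integer, which is the contradiction used in the argument.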
Another class of cases in which baryons cannot do the job is even N, with arbitrary
Nf < iVp , where baryons are bosons. In fact, using one more general property of
parity-symmetric gauge theories coupled to massless fermions, one can prove that
anomaly matching is not possible for any N and Nf. Let us give a mass to a single
quark. I claim that all baryons containing that quark field are massive. The proof is
due to Vafa and Witten, and follows the same line of argument as that we gave above to
prove that vector-like symmetries were not spontaneously broken. Do the functional
integral in steps, first doing the integral over fermion fields. The two-point function of
baryons in an external field is given by
\[ \epsilon^{i_1 \ldots i_N}\, \epsilon_{j_1 \ldots j_N}\, G^{j_1}_{i_1}(x,y) \cdots G^{j_N}_{i_N}(x,y), \]
where we have suppressed the spin and flavor indices on the quark Green functions.
For the baryons in question, at least one of the Green functions is that of the massive
quark. To all orders in perturbation theory in the external field, this Green function falls
off exponentially at infinity. Vafa and Witten [99] proved that this was true rigorously
and that the exponential falloff was independent of the gauge field configuration. In a
parity-symmetric gauge theory the fermion determinant $\mathrm{Det}[i\gamma^\mu D_\mu] = \mathrm{Det}[-\Delta_{L,R}]$
for all configurations that do not have normalizable zero modes and is zero for those
which do. Thus the full baryon Green function is an integral of the external field
Green function with a positive weight function. The field-independent bound on the
exponential falloff shows that the baryon Green function falls off exponentially in
Euclidean space. This implies that the baryon operators containing the massive quark
cannot create massless particles from the vacuum.
As a consequence, given a solution to the anomaly-matching condition for some value
of $(N, N_F)$, which is postulated to give a massless baryon spectrum in that version of
QCD, we should get another solution for $(N, N_F - 1)$, with the same baryon multiplicities
(the analog of the integer $L$ for $N = N_F = 3$). It is straightforward but tedious to
show that these conditions lead to the conclusion that there are no physically consistent
solutions of the anomaly-matching conditions, even though there are mathematical
solutions for particular values of $N$ and $N_F$. Thus, all versions of QCD spontaneously
break their global symmetry to its anomaly-free vector $U(N_F)$ subgroup. If, as occurs
in the real world, part of the anomalous symmetry is weakly gauged, with the anomaly
canceled out by spectators (i.e. leptons in the real world), then we also get formulae
(from the Wess-Zumino Lagrangian alluded to above) for the decay amplitudes of
NGBs to weakly coupled gauge bosons.
Let us finally note that our current use of the anomaly equation shows that
the anomaly cannot be removed by adding a local term to the four-dimensional
Lagrangian. Indeed, we have seen that the anomaly equation seems in some cases
to imply a massless NGB pole in the Green functions of currents. In fact what one
can show [109-110] is that the anomaly equation implies that Green functions involving
three currents, one of which is anomalous, have a momentum-space discontinuity
$\mathrm{disc}\, A \propto \delta(q^2)$, where $q$ is the momentum carried by the anomalous current. The NGB
pole is one way to achieve such a discontinuity; the other is a simple triangle diagram of
massless fermions. In this diagram there is non-zero phase space for on-shell massless
fermions to travel exactly parallel to each other, mocking up the discontinuity caused
by a massless scalar pole. No other diagram has finite measure for such configurations,
and this constitutes another proof of the Adler-Bardeen theorem that the anomaly is
given exactly by the expressions we derived by naive manipulations, without any higher-
order corrections. The fact that the anomaly equation implies massless singularities in
Green functions shows us that one cannot find a local Lagrangian counterterm that
removes it.
8.9 Quantization of gauge theories in the Higgs phase
We now leave the subject of anomalies and turn to the quantization of gauge theories
in the Higgs phase. We imagine a gauge group $G$, which, abusing language, is broken
down to a subgroup $H$ by the VEV, $v_i$, of a multiplet of scalar fields $\phi_i$. Define
\[ f^a_i = (T^a)_i^{\;j} v_j . \]
We use canonically normalized fields, whose covariant derivative is
\[ D_\mu \phi_i = \partial_\mu \phi_i - g_a A^a_\mu (T^a)_i^{\;j} \phi_j . \]
The couplings $g_a$ are the same for all generators belonging to the same simple factor
of the gauge group.
If we expand the classical Lagrangian around $v_i$, there will be kinetic terms mixing
gauge bosons and scalars. It is convenient to choose a gauge-fixing term to eliminate
this mixing. This defines the $R_\kappa$ gauges, to which we have alluded in our discussion of
the abelian Higgs model. After integrating out the Lagrange multiplier fields $N^a$ we
have a gauge-fixing Lagrangian
\[ \mathcal{L}_{gf} = \frac{1}{2\kappa}\left(\partial^\mu A^a_\mu - \kappa g_a f^a_i \Delta_i\right)^2 + \bar{c}_a\left[\partial^2 \delta^{ab} - \kappa g_a g_b f^a_i (T^b)_i^{\;j}(v_j + \Delta_j)\right] c_b . \]
Here $\Delta_i$ denotes the fluctuation of the scalar field about $v_i$.
In the last term we modify the summation convention and consider the term
$g_a g_b f^a_i (T^b)_i^{\;j}(v_j + \Delta_j)$ as a matrix with two adjoint indices.
In the $R_\kappa$ gauges, all components of the vector fields propagate. The physical
components have mass matrix
\[ (m_A^2)^{ab} = g_a g_b f^a_i f^b_i , \]
while the time-like components, as well as the ghosts, have mass matrix $\kappa m_A^2$. The
propagator is
\[ \frac{-i}{p^2 - m_A^2}\left[\eta_{\mu\nu} - (1 - \kappa)\frac{p_\mu p_\nu}{p^2 - \kappa m_A^2}\right]. \]
For finite values of $\kappa$ this falls off at high momenta, but if we take the $\kappa \to \infty$ limit
it looks like a massive vector propagator. On inspecting the gauge-fixing term, we see
that in this limit we appear to get the unitary gauge. The unphysical components of the
$\Delta_i$ fields are projected out. Indeed, the would-be NGBs have mass matrix
\[ (m^2_{NGB})_{ij} = \kappa\, g_a^2 f^a_i f^a_j , \]
because the full mass matrix for the fields $\Delta_i$ is
\[ m^2_{NGB} + M^2 , \]
where $M^2$ is $\kappa$-independent and operates in the subspace of fields orthogonal to the
would-be NGBs. $M$ is the mass matrix for physical Higgs-boson excitations.
In Problem 8.3 you will show that the physical linearized excitations of a gauge
theory in $R_\kappa$ gauge are BRST-invariant states. As a consequence, the fields of the model
create these states from the vacuum to all orders in perturbation theory. Furthermore,
the linearized theory tells us precisely how to project the fields $A^a_\mu$, $c_a$, and $\Delta_i$ onto
physical states. We simply take the three physical polarization states of massive gauge
bosons, the transversely polarized states of massless gauge bosons, and components
of $\Delta_i$ perpendicular to the subspace of would-be Goldstone bosons (i.e. satisfying
$f^a_i \Delta_i = 0$). No ghosts, time-like gauge fields, or would-be NGBs are allowed.
In scattering theory, fields are evaluated along the trajectories of asymptotic particles
at infinity. Thus, if we make these projections on the asymptotic fields, we are defining
BRST-invariant operators. The LSZ formula converts such asymptotic expressions
into residues of poles in momentum-space Green functions. We conclude that, if we
evaluate the non-gauge-invariant Green functions of the fields $A^a_\mu$ and $\Delta_i$, but search for
poles in perturbation theory that are near the zeroth-order masses of physical particle
states, and project onto the proper subspaces of on-shell fields, then these residues are
gauge-invariant and independent of $\kappa$. In other words, the S-matrix for physical states
is a BRST-invariant, and therefore $\kappa$-independent, quantity. As $\kappa \to \infty$, unphysical
states go off to infinite mass and the theory becomes manifestly unitary. Thus, the
scattering matrix satisfies perturbative unitarity.
Another way to understand Problem 8.3, and the remarks of the preceding
paragraph, is to introduce gauge-invariant fields via
\[ \phi = \Omega (v + \Delta), \]
\[ B_\mu = \Omega^\dagger (\partial_\mu - g A_\mu)\, \Omega, \]
\[ \Psi = \Omega_F^\dagger\, \psi . \]
$\Delta$ has fewer components than $\phi$. It parametrizes the space of field configurations
orthogonal to the space of gauge transforms of the classical vacuum. $\psi$ is the multiplet
of fermion fields in the theory, and $\Omega_F$ is the action of the gauge group on that multiplet.
It is easy to see that, in $R_\kappa$ gauge, with $\kappa \to \infty$, the massive physical fields are equal
to these gauge-invariant fields to leading order in the semi-classical expansion. For
the massless vectors in the unbroken subgroup H, we must still make a transverse
projection. We may think of defining the S-matrix directly in terms of these gauge-
invariant operators.
The problem is that, as we have noted, the $\kappa \to \infty$ and large-momentum limits do not
commute for general Green functions. The perturbation expansion describing Green
functions of the fields is not renormalizable in the $\kappa \to \infty$ limit. We will discuss this
further in Chapter 9.
8.10 Problems for Chapter 8
*8.1. Show that the action of the BRST symmetry on ordinary fields is nilpotent.
*8.2. Quantize free Maxwell electrodynamics using the BRST method, using the
gauge-fixing Lagrangian $\mathcal{L}_{GF} = [Q_{BRST}, \bar{c}(\partial^\mu A_\mu + \kappa N/2)]$. Use Noether's theorem
to construct $Q_{BRST}$ in terms of the fields, and then from that expression
get the expression in terms of creation and annihilation operators for four photon
polarizations and the ghost fields (the ghosts are quantized like a complex
scalar with $\bar{c} = c^\dagger$). Show that the BRST-invariant states, which are not of the
BRST trivial form $|s\rangle = Q_{BRST}|t\rangle$, are precisely the physical, transverse, photon
polarizations.
8.3. Repeat the previous exercise for a general gauge theory in the Higgs phase, using
$R_\kappa$ gauge. Show that only physical polarization states of massive vectors, transverse
polarizations of massless vectors, and scalar excitations orthogonal to the
subspace of would-be Goldstone excitations are BRST-invariant.
8.4. Using BRST quantization for pure Yang-Mills theory, in a general covariant
gauge with parameter $\kappa$, compute the leading-order contribution to the expectation
value of a Euclidean Wilson loop in gauge group representation $R$, of
rectangular shape $T \times L$, with
\[ T \gg L. \]
You should find an answer independent of $\kappa$, of the form
Tre" 7 T'R'&e- 7 ' 00 .
The physical interpretation of this computation is that the logarithm is T times
the energy of a pair of static sources for the gauge field, separated by a distance
$L$, one in the representation $R$ and the other in the complex-conjugate representation
$\bar{R}$. The trace is taken over the tensor product of these two representations.
Expand $(t^a_R + t^a_{\bar{R}})^2$ to write $t^a_R t^a_{\bar{R}}$ in terms of Casimir operators for the two representations
and for all representations that appear in the tensor product. The
trace tells you that the Wilson loop represents a bunch of energy eigenstates
where the heavy-particle-anti-particle color indices are combined into different
irreducible representations of the gauge group. Each such state has a Coulomb
attraction or repulsion with a coefficient that depends on the irreducible rep-
resentation. There is also an infinite self-energy term, which is L-independent.
Show that the most attractive potential occurs when the indices are combined
to give a singlet. This was Nambu's argument that QCD explained why color-
singlet combinations of quarks and anti-quarks were the lowest-lying states. For
baryons one first combines two quarks into an anti-quark representation, via
the $SU(3)$ Clebsch-Gordan formula $3 \times 3 = \bar{3} + 6$, and then makes the singlet
combination with the third quark. This leads to a picture of the structure of a
baryon in which two quarks are closer together than they are to the third. The
picture is valid for baryons with large orbital angular momentum.
There are two things to be careful of in this problem. Take care with the path
ordering of the exponential defining the Wilson loop, but argue that in leading
order in $g$ it can be neglected, except for the change of representation from $R$
to $\bar{R}$ for the particle at the origin. Also the Wilson loop is originally defined in
terms of a single trace in the $R$ representation. Show that taking the limit $T \gg L$
effectively replaces this by a trace over the tensor-product. One way to see this
is to take T = oo from the start. Then the object we are computing is defined
as the trace over the tensor product Hilbert space of the product of two infinite
Wilson lines, one going forward in time and the other backward in time. Show
that the path ordering accounts for the change of representation.
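The Casimir bookkeeping described in this problem can be sketched numerically for $SU(3)$. This is only an illustration under stated assumptions: generators are normalized so that the fundamental has quadratic Casimir $(N^2 - 1)/(2N)$, the Casimir values in the table are the standard ones for that normalization, and the static potential is taken to be proportional to the eigenvalue of $t_1 \cdot t_2$, so that a negative value means attraction. The dictionary and function names are hypothetical.

```python
from fractions import Fraction

N = 3  # SU(3), assumed for illustration

# Quadratic Casimirs C_2 of the representations that occur in 3 x 3bar and 3 x 3,
# in the normalization where the fundamental has C_2 = (N^2 - 1)/(2N).
CASIMIR = {
    "1": Fraction(0),
    "3": Fraction(N * N - 1, 2 * N),
    "3bar": Fraction(N * N - 1, 2 * N),
    "6": Fraction((N - 1) * (N + 2), N),
    "8": Fraction(N),
}

def color_factor(rep1, rep2, combined):
    """Eigenvalue of t_1 . t_2 on the irreducible piece `combined` of rep1 x rep2.

    Follows from (t_1 + t_2)^2 = t_1^2 + t_2^2 + 2 t_1 . t_2.
    """
    return (CASIMIR[combined] - CASIMIR[rep1] - CASIMIR[rep2]) / 2

quark_antiquark = {r: color_factor("3", "3bar", r) for r in ("1", "8")}
quark_quark = {r: color_factor("3", "3", r) for r in ("3bar", "6")}

print(quark_antiquark)
print(quark_quark)
```

In this convention the singlet channel of a quark and an anti-quark has the most negative coefficient, and for two quarks the anti-triplet channel is attractive while the sextet is repulsive.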
8.5. Show that $\mathrm{tr}_R(T^a [T^b, T^c]_+)$ vanishes for any real or pseudo-real representation
of an arbitrary compact Lie group.
8.6. Given $2M$ chiral fermions, consider the $U(1)$ subgroup of $U(2M)$ whose generator
is $\mathrm{diag}(1, \ldots, 1, -1, \ldots, -1)$, where the diagonal blocks are $M \times M$. The
commutant of $U(1)$ in $U(2M)$ is $G_F = U(1) \times SU(M) \times SU(M)$ (commutant
means maximal subgroup commuting with $U(1)$). Show that the subgroup
$U(1) \times SU_D(M)$ of $G_F$ is anomaly-free. Now consider the five-dimensional Chern-Simons
action for the gauge group $G_F$. The variation of this action under a finite
gauge transformation is
\[ S_{CS} \to S_{CS} + \int d^4x\, \mathcal{L}_{WZ}(A, \Omega), \]
where $\mathcal{L}_{WZ}$ is a local function of the four-dimensional boundary values of $A$ and
$\Omega$ and their first derivatives. Compute $\mathcal{L}_{WZ}$. Now use the above remark about
anomaly freedom to conclude that $\mathcal{L}_{WZ}$ depends only on the coset element corresponding
to $\Omega(x)$ in $G_F/[U(1) \times SU_D(M)]$. That is, it is invariant under transformations
of the form $\Omega(x) \to \Omega(x)h(x)$. Up until now we have considered $\Omega(x)$ to
be the boundary value of the five-dimensional gauge transformation. Now switch
gears and consider $\mathcal{L}_{WZ}$ to be a Lagrangian for gauge fields in $G_F$, coupled to
NGBs corresponding to the spontaneous breakdown of $G_F$ to $U(1) \times SU_D(M)$.
If we now do $G_F$ gauge transformations $g(x)$ on the gauge fields, accompanied
by $\Omega(x) \to g(x)\Omega(x)$, show that $\mathcal{L}_{WZ}$ reproduces the anomaly of the fermions.
8.7. Show that the fermion content of the $\bar{5} + 10$ representation of $SU(5)$ reproduces a
single generation of the standard model, if we regard the latter as the SU(1, 2, 3)
subgroup of SU(5).
8.8. Consider an $SU(N)$ gauge theory with a Weyl fermion $\psi_{ij}$ in the second-rank
anti-symmetric tensor representation. Show that there is an integer $N_F$ such that
the anomaly of $SU(N)$ is canceled out if we add $N_F$ Weyl fermions in the fundamental
$N$ representation ($N_F$ negative means fermions in the $\bar{N}$ representation).
Find all global classical symmetries of the resulting gauge-theory action and
describe which ones are anomalous. Compute the anomaly in the three-point
function of three non-anomalous global symmetries.
8.9. If we embed the standard-model (with a single generation of quarks and leptons)
gauge theory into $SU(5)$, there are two global classical $U(1)$ symmetries.
Show that one linear combination of the $U(1)$ generators has no $U(1)\,SU(5)^2$
anomalies. Show that one can add an extra $SU(5)$ singlet Weyl fermion, transforming
under this $U(1)$, in such a way that the $U(1)^3$ anomaly also cancels out.
The group $SO(10)$ is of rank 5. Show that it has an $SU(5) \times U(1)$ subgroup,
by considering the spinor representation, whereby the $SO(10)$ generators are
constructed as commutators of ten Euclidean Dirac-Clifford matrices satisfying
\[ [\Gamma_\mu, \Gamma_\nu]_+ = 2\delta_{\mu\nu}. \]
Define
\[ 2 a_i = \Gamma_{2i-1} + i\,\Gamma_{2i}, \quad i = 1, \ldots, 5, \]
and show that
\[ [a_i, a_j]_+ = 0, \quad [a_i, a_j^\dagger]_+ = \delta_{ij}. \]
This is the algebra of five independent fermion-creation and -annihilation operators.
The 25 matrices $a_i^\dagger a_j$ form the Lie algebra $su(5) \oplus u(1)$. Use this construction
to show that the representation of the Dirac matrices is 32-dimensional. Show
that the product of all the Dirac matrices anti-commutes with each Dirac matrix
and thus commutes with all of the $SO(10)$ generators. The 32-dimensional $SO(10)$
spinor representation breaks into two irreducible 16-dimensional representations.
Show that these are the subspaces where either an odd or an even number
of creation operators acts on the state satisfying
\[ a_i |s\rangle = 0. \]
Imagine that a Higgs field breaks $SO(10)$ down to $U(1) \times SU(5)$. Show that
one of these 16-dimensional representations consists of the standard model plus
the singlet we introduced above to cancel out anomalies. The anomaly-free $U(1)$
generator we introduced above is just the extra $U(1)$ in $SO(10)$. If you wish, you
can also find a Higgs field and an effective potential for it, which performs the
breaking of $SO(10)$ to $SU(5) \times U(1)$.
8.10. Show that baryon and lepton number have an $SU(2)$ anomaly in the standard
model, but that $B - L$ is anomaly-free. Referring to the previous exercise, show
that $B - L$ is the extra $U(1)$ that commutes with $SU(5)$ in $SO(10)$.
8.11. Compute the amplitudes, at tree level in the $SU(2) \times U(1)$ theory of electro-weak
interactions, for an electron-positron pair to annihilate into a pair of charged W
bosons. You may make the approximation that the electron is massless, because
this interaction can occur only at energies much larger than the electron mass. It
is convenient to do the computation separately for left-handed and right-handed
electrons, because these helicity states couple differently to the weak-interaction
gauge bosons. Do the computations in a general $R_\kappa$ gauge and show explicitly
that the result is $\kappa$-independent. Comment on the energy dependence of the
production amplitude for longitudinally polarized W bosons.
8.12. Show that the determinant of the Euclidean Dirac operator $i\gamma^\mu D_\mu + im$ is formally
positive, by using the Weyl representation.
8.13. Carry out the leading-order computation of the potential between static sources
in a non-abelian gauge theory in a general covariant gauge, and verify that the
answer is independent of $\kappa$.
8.14. The charge-changing weak currents involved in muon decay are
\[ J^{+\mu} = \bar{e}\gamma^\mu (1 - \gamma_5)\nu_e + \bar{\mu}\gamma^\mu (1 - \gamma_5)\nu_\mu \]
and its Hermitian conjugate. Compute the commutator $[J^{+}_0, J^{-}_0]$, using canonical
commutation relations, and show that the result is not the electromagnetic
current. This was Glashow's argument that the electro-weak gauge group had to
be $SU(2) \times U(1)$.
8.15. Write the Ward identities for the generating functional of non-abelian currents,
the functional average of
for some global symmetry group $G$ with generators $T^a$. Show that they are
equivalent to the requirement that $W(A)$ be invariant under non-abelian gauge
transformations of $A$. Define the non-standard Legendre transform
\[ \Gamma(a_\mu) + M^2 (A_\mu - a_\mu)^2 = W(A_\mu). \]
Describe how the expansion coefficients of $\Gamma$ are related to connected Green functions.
Show that the Ward identities imply that $\Gamma$ is a gauge-invariant functional
of $a_\mu$. At low energies this means that $\Gamma$ is just the usual Yang-Mills action of
Chapter 7. Now argue that, if $G$ is broken to $H$, then we should couple the $a^a_\mu$ field
to the NGBs living in the $G/H$ coset, in a way which respects gauge invariance.
Show that in the Yang-Mills approximation we get a theory of massive vector
mesons interacting with massless NGBs. The vector mass matrix is not just given
by $M^2$. There is also a contribution from the interaction with the NGBs. Compute
it. Apply this formalism to the case of strong interaction chiral symmetry
$G = SU(2) \times SU(2)$. It gives an approximate theory of $\rho$, $\omega$, and $A_1$ mesons
interacting with pions. The theory cannot be justified in the way that we justify
the chiral Lagrangian. Even if we make the chiral symmetry exact, by setting up-
and down-quark masses to zero, the vector mesons remain massive, with a mass
not much smaller than $4\pi F_\pi$, so there is no limit in which we can exactly replace
$\Gamma$ by the Yang-Mills action. Nonetheless, this vector-dominance model of the
hadronic currents is a useful approach to problems in strong-interaction physics.
Renormalization and effective field theory
We are finally ready to face up to the fact that most of the expressions we write in
the Feynman-diagram solution of quantum field theory are nonsensical, which is to
say infinite. The purpose of this introductory section is to demonstrate the simple and
general counting rules which prove that this is so, and to give the reader a conceptual
orientation to the general theory of renormalization, which we will discuss in the rest
of this chapter.
We will begin with the conceptual. The formalism of quantum field theory makes
statements about arbitrarily short distances in space-time and arbitrarily high energies.
At a given moment in history experiment will never probe arbitrarily short distances or
high energies. We should expect that the formalism will have to change to fit the data
as we uncover more empirical facts about short distances. This has happened before.
Hydrodynamics is a field theory, but we know that at short enough distance scales it is
not a good description of water. A better one is given (in principle) by the Schrödinger
equation for hydrogen and oxygen nuclei interacting with electrons, and that in turn is
superseded by the standard model of particle physics as we go to even shorter distances
and higher energies. Yet only a fool would imagine that one should try to understand
the properties of waves in the ocean in terms of Feynman-diagram calculations in the
standard model, even if the latter understanding is possible "in principle."
This simple parable illustrates the idea of an effective field theory (EFT). EFT is a
description of phenomena at wavelengths longer than some effective cut-off scale $l_c$
and energies below some energy cut-off $E_c$. In a theory that is (even approximately)
relativistically invariant, the cut-offs are related by $l_c \sim E_c^{-1}$. In principle, any EFT is
a quantum theory, but it may be that, as in the case of hydrodynamics, the classical
approximation is valid throughout the range of validity of the EFT itself. The basic idea
of an effective field theory is that physics in a certain energy regime, and a certain reso-
lution for length scales, should be described most simply by a set of effective degrees of
freedom appropriate to that scale and depend on any underlying, more "fundamental"
description only through a small set of parameters governing the dynamics of these
degrees of freedom.
The basic idea of renormalization is to parametrize the effect of all possible mod-
ifications of the theory at higher energy and shorter distance in terms of all possible
interactions between the effective degrees of freedom. The non-trivial part of renor-
malization theory is the mathematical demonstration of universality of long-distance,
low-energy physics. That is, one shows that, starting from any set of interactions at
the cut-off scale, the low-energy, long-distance physics depends on only a few relevant
parameters. The mechanics of this demonstration involves a procedure for reducing the
cut-off scale and finding new Lagrangians that reproduce all of the same low-energy
physics. Continuous changes in the cut-off lead to a flow on the space of all possible
interactions
\[ \frac{dg^i}{dt} = \beta^i(\{g^j(t)\}), \]
where the $g^i$ are the coefficients of the possible interactions and $t$ parametrizes an
infinitesimal rescaling of the cut-off. This is called the renormalization
group (RG) flow, despite the fact that it is irreversible, and therefore only a
semigroup. For a large class of models, the RG can be shown to be a gradient flow, so
that the general asymptotic behavior is a fixed point. The relevant parameters describe
the manifold of unstable directions of the fixed point. They are the parameters which
have to be tuned in order to obtain low-energy physics that is independent of the details
of the physics at the cut-off scale, and all dependence on the cut-off is encoded in the
value one chooses for these parameters.
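The role of relevant and irrelevant parameters can be illustrated with a toy flow. The sketch below is not a flow derived in the text: it assumes a flow linearized about a fixed point at the origin, with one relevant and one irrelevant coupling, hypothetical scaling exponents $\pm 2$, and a flow parameter $t$ that increases as the cut-off is lowered.

```python
import math

def linearized_flow(couplings, exponents, t):
    """Solve dg_i/dt = y_i * g_i exactly: g_i(t) = g_i(0) * exp(y_i * t).

    The fixed point is at g = 0. A positive y_i is a relevant (unstable)
    direction, a negative y_i an irrelevant (stable) one.
    """
    return [g * math.exp(y * t) for g, y in zip(couplings, exponents)]

# Hypothetical exponents: one relevant and one irrelevant coupling.
EXPONENTS = (2.0, -2.0)

# Two different "microscopic" theories at the cut-off: the relevant coupling is
# tuned to the same small value, the irrelevant couplings are very different.
theory_a = [1e-6, 0.5]
theory_b = [1e-6, -0.3]

t = 5.0  # amount of flow toward low energies
low_energy_a = linearized_flow(theory_a, EXPONENTS, t)
low_energy_b = linearized_flow(theory_b, EXPONENTS, t)

print(low_energy_a)
print(low_energy_b)
```

After the flow, the two theories differ only by exponentially small amounts in the irrelevant direction, while the low-energy value of the relevant coupling is set entirely by how it was tuned at the cut-off.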
All possible cut-off-independent infrared behaviors of a certain set of degrees of
freedom thus fall into universality classes corresponding to the possible fixed points
of their RG flow. The fixed-point Lagrangians obviously have to be invariant under
space-time scaling transformations. For most interesting models they are actually
invariant under the larger group of space-time conformal transformations ($SO(2, d)$
for a $d$-dimensional Minkowski space-time). Such conformal field theories (CFTs) can
often be constructed without recourse to Lagrangians and Feynman diagrams. Indeed,
Lagrangians and Feynman diagrams are concepts appropriate only to the Gaussian
CFTs, a fancy name for free massless field theories. This leads to a possible mathemat-
ical definition of all possible QFTs as perturbations of CFTs by turning on non-zero
values of relevant parameters. But we are getting ahead of ourselves, and will return to
these issues when we understand a little more of the technicalities of the RG.
The basic idea behind the derivation of the RG flow equations is the concept of
integrating out degrees of freedom. We can illustrate this in a simple example from
classical physics by imagining two harmonic oscillators with frequencies $\Omega$ and $\omega \ll \Omega$,
coupled by an anharmonic interaction of the form $gXx^2$. We can try to solve the
equations of motion for this system by first solving for $X$ and writing equations of
motion for $x$ that take into account the evolution of $X$. We've cleverly chosen the
interaction so that the first step can be done in closed form. We find the equations
\[ \ddot{x}(t) + \omega^2 x(t) = 2g^2 x(t) \int ds\, [G(t - s)\, x^2(s)], \]
where the boundary conditions on $X(t)$ are encoded in the choice of Green function $G$:
\[ (\partial_t^2 + \Omega^2)\, G(t - s) = \delta(t - s). \]
In QFT we will always assume that the high-frequency degrees of freedom are in their
ground state, so the Green function is the one defined by Feynman, which is the analytic
continuation of the Euclidean propagator. Integrating out degrees of freedom is always
done in the Euclidean path integral.
The effective equations of motion for x(t) are non-local, but if we are interested only
in time scales much longer than $\Omega^{-1}$ we can make the expansion
\[ G(t - s) = \frac{1}{\Omega^2}\,\delta(t - s) - \frac{1}{\Omega^4}\,\partial_t^2 \delta(t - s) + \ldots \]
We get an infinite series of higher-derivative terms in the approximately local equations
for $x$. When the solution for $x$ contains only frequencies $\omega \ll \Omega$, the higher-order terms
in the series are negligible, and we refer to them as irrelevant.
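One way to see where such a derivative expansion comes from is to invert the operator that defines the Green function formally in powers of $\partial_t^2/\Omega^2$. The following is a sketch, assuming that the source $x^2(s)$ varies slowly on the scale $\Omega^{-1}$ and that the freely oscillating solutions of the heavy oscillator at frequency $\Omega$ are not excited.

```latex
\begin{align}
(\partial_t^2 + \Omega^2)\, G(t-s) &= \delta(t-s), \\
G(t-s) &= \frac{1}{\Omega^2}\Bigl(1 + \frac{\partial_t^2}{\Omega^2}\Bigr)^{-1}\delta(t-s)
        = \frac{1}{\Omega^2}\sum_{k=0}^{\infty}
          \Bigl(-\frac{\partial_t^2}{\Omega^2}\Bigr)^{k}\delta(t-s), \\
\int ds\, G(t-s)\, x^2(s) &= \frac{x^2(t)}{\Omega^2}
        - \frac{1}{\Omega^4}\frac{d^2}{dt^2}\, x^2(t) + \ldots
\end{align}
```

For a solution oscillating at frequency $\omega$, the quantity $x^2$ oscillates at $2\omega$, so the $k$-th term is suppressed relative to the first by $(2\omega/\Omega)^{2k}$. In frequency space this is the geometric series for $1/(\Omega^2 - \nu^2)$ at $\nu \ll \Omega$.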
9.1 Divergences in Feynman graphs
To begin our consideration of ultraviolet divergences, we study the $\phi^n$ scalar field theory.
A 1PI Feynman graph with $E$ external legs and $V$ vertices has $I = (nV - E)/2$ internal
lines. The number of loop integrals is
\[ L = I - V + 1 = 1 + ((n - 2)V - E)/2. \]
We evaluate the graphs in momentum space and look at the region of integration where
all loop momenta are large. Then we can drop all masses and external momenta in the
internal propagators. In this regime, the integral looks like
\[ \int \frac{d^{4L}p}{p^{2I}}, \]
and it diverges if $4L \ge 2I$. In terms of $V$ and $E$, this inequality is
\[ 4 + 2((n - 2)V - E) \ge nV - E, \]
\[ 4 \ge (4 - n)V + E. \]
We see a striking fact: for $n = 3$, only a finite number of orders of perturbation theory
will have such a divergence. For $n = 4$, the degree of divergence is independent of the
order of perturbation theory, whereas for $n > 4$ it increases with the order. We also
note that, for a given order of perturbation theory, the degree of divergence decreases
as we increase the number of external legs. Furthermore, if we differentiate a graph
with respect to external momenta, the degree of divergence decreases by one for each
derivative we take.
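The counting above is easy to tabulate. The following is a minimal sketch of the superficial degree of divergence $4L - 2I$ for a 1PI graph in four-dimensional $\phi^n$ theory; the function name is illustrative, and the check for an integer number of internal lines is an added assumption about which $(n, V, E)$ correspond to graphs at all.

```python
def superficial_degree(n, V, E):
    """Superficial degree of divergence 4L - 2I of a 1PI graph in phi^n theory (d = 4)."""
    twice_internal = n * V - E          # each vertex has n legs; E of them are external
    if twice_internal < 0 or twice_internal % 2:
        raise ValueError("no graph with these values of n, V, E")
    internal = twice_internal // 2      # I = (nV - E)/2
    loops = internal - V + 1            # L = I - V + 1
    return 4 * loops - 2 * internal     # powers of momentum in d^{4L}p / p^{2I}

# Two-point graphs (E = 2) at increasing order in perturbation theory.
for n in (3, 4, 6):
    print(n, [superficial_degree(n, V, 2) for V in (2, 4, 6)])
```

For $n = 3$ the degree decreases with the number of vertices, for $n = 4$ it stays fixed at $4 - E$, and for $n = 6$ it grows, in line with the inequality $4 \ge (4 - n)V + E$.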
Indeed, all of these facts are a simple consequence of dimensional analysis, and the
fact that masses on internal lines are negligible in the high-momentum regime. The
coupling $\lambda_n$ of a $\phi^n$ interaction has mass dimension $4 - n$. A connected $E$-point function
of scalar fields has mass dimension $E$, and its Fourier transform has dimension
$4 - 3E$ if we leave off the momentum-conservation delta function. An $E$-point 1PI
function thus has dimension $4 - E$. The overall number of powers of momentum in the
Feynman graph integrals is thus completely determined by dimensional analysis. An
interaction with positive (negative) mass dimension will have fewer and fewer (more
and more) divergent integrals as we go to higher and higher order. We call interac-
tions of positive mass dimension super-renormalizable, those of negative dimension,
non-renormalizable, and those of dimension 0, simply renormalizable. For reasons that
will become apparent later, these terms are often replaced by relevant, irrelevant, and
marginal.
In purely mechanical terms, the basic idea of renormalization is that, at any order in
perturbation theory, the divergences all reside in a finite number of $E$-point functions,
and are polynomial functions of momenta. The latter remark follows from the fact that,
if we take a finite number of derivatives of the Feynman graph, then the superficial
divergences go away, because we have introduced more internal propagators. As a
consequence, we can find local interactions of the form $c_n (\lambda_n)^V \partial^{a_1}\phi \cdots \partial^{a_E}\phi$, which,
when added to the theory, can subtract away all of the divergences. In order to do this, we
must choose the coefficients $c_n$ to be infinite. A more sensible mathematical procedure
would be to define a cut-off theory, which effectively gets rid of the divergences above
some high scale $\Lambda$, and allow the $c_n$ to depend on $\Lambda$ in such a way that the limit $\Lambda \to \infty$
exists.
It is not clear that this simple subtraction procedure actually works, for we have dealt
only with the region of integration where all momenta are large. A 1PI Feynman graph
contains many 1PI sub-graphs. Even if the overall degree of divergence of the graph is
negative, we might have a sub-graph whose integrals diverge when all other momenta
in the graph are held fixed. The second key idea of perturbative renormalization theory
is that "sub-divergences have already been taken care of by subtractions at lower orders
in perturbation theory." This is far from clear in purely graphical terms. To understand
why it is true (still in a purely mechanical sense) we will return to the coordinate-space
definition of Feynman graphs.
[Footnote: The statement that masses on internal lines are negligible in the high-momentum regime is not true for the masses of massive gauge bosons in unitary gauges. As a consequence of this fact, the unitary gauge Lagrangians are in fact non-renormalizable as quantum field theories of Green functions. However, one can show that the S-matrix in these theories is identical to that of the physical particles of the $R_\kappa$-gauge Lagrangian, which is renormalizable. From the point of view of the $R_\kappa$ gauge the non-renormalizable divergences of Green functions of unitary gauge fields arise because these are really gauge-invariant non-polynomial functions of the $R_\kappa$ gauge fields.]
Fig. 9.1. Examples of divergent diagrams.
Divergent loop integrals fall into two distinct classes. The first is associated with loops
that consist of a single propagator attached to the same vertex (ear diagrams), Figure
9.1(a). The second consists of loops that include several vertices like Figure 9.1(b). Both
can be understood in terms of the singularities in operator products in free-field theory.
Indeed, interactions like $\phi^n$ are not well-defined operators in the free-field theory. Wick's
theorem shows us that products of operators at different points diverge as the points
are taken to coincide. This leads to the idea of an operator product expansion (OPE):
\[ \phi(x)\phi(y) = \sum_N C_N(x - y)\, O_N(y). \]
In free, massless field theory, dimensional analysis shows us that, if the operator
$O_N$ has mass dimension $d_N$, then $C_N(x) \sim x^{-2 + d_N}$. These operators are called
$:\phi\, \partial_{\mu_1} \cdots \partial_{\mu_{N-2}} \phi:$, the normal ordered product of $\phi$ with its partial derivatives. The
function $C_N$ has the tensor transformation properties to make the OPE Lorentz
covariant. Notice that $O_0$, the identity operator, also appears in the list. It is the only
one with a divergent coefficient. However, we can now go on to consider higher-order
powers of $\phi$ and its derivatives. The result is an infinite collection of finite operators,
which we will again label $O_N$. There are now many ways to get an operator with a given
mass dimension. The generalized OPE has the form
\[ O_N(x)\, O_M(y) = \sum_K C^K_{NM}(x - y)\, O_K(y), \]
where we have again suppressed the tensor properties of operators and their coefficient
functions. The c-number functions in the OPE are called Wilson coefficients. $C^K_{NM}$
scales like $x^{-d_N - d_M + d_K}$. Wick's theorem guarantees that these are in fact operator relations.
That is, the behavior of an $n$-point Green function when $k$ of the points are
taken together is universal, and can be written as singular coefficients depending on
the coinciding points, multiplied by a function of the coincident point and the other
n — k points in the function.
In general, we see that, even after we have defined finite operators On, their products
when points are taken together will often diverge. Now consider the application of
all of this to the perturbation expansion. The series corresponds to integrals over
interaction vertices, defined as naive powers of the field at the same point. We can expect
divergences corresponding to "improper normal ordering" (the correct definition of
local composite operators) as well as to singular products of normal ordered operators
(since we must integrate over coinciding interaction points). The former correspond to
"ear" diagrams. They can be understood by writing out $\phi^n$ in terms of normal ordered
operators and subtracting the divergent terms.
The OPE allows us to understand and deal with the coinciding point divergences in
a similar way. We have to evaluate
$$\langle\, :\phi^n(x_1): \cdots :\phi^n(x_k): \,\rangle.$$
Near coinciding points, we can use the OPE to write this in terms of Wilson coefficients
and finite operators. This tells us that we can always subtract away the divergences by
adding local operators to the Lagrangian. Furthermore, the sub-divergences that occur
at order $k$, when $p < k$ points coincide, will have precisely the form of the Lagrangian
we subtracted at order p when this particular divergence first showed up.
So we have understood, at least in a hand-waving and mechanical way, how it is
that divergences in the field-theoretic perturbation expansion can be absorbed into
(formally infinite) redefinitions of the Lagrangian. We can also see that the structure of
these subtractions will be quite different for irrelevant perturbations, versus marginal or
relevant ones. For irrelevant perturbations, we will generate every interaction consistent
with the symmetries of the problem, if we go to high enough order of perturbation
theory. For relevant or marginal interactions, the subtraction process stops with a finite
set of operators. This means that irrelevant interactions have no (cut-off-independent)
meaning. When we add an infinite counterterm to subtract a divergence, we have no
rule to tell us how much of a finite coefficient one should add to the infinite one. Thus,
theories with irrelevant interactions seem to make no predictions at all. Later we will
see that this judgement is a bit too harsh, but that will have to wait until we understand
the meaning of renormalization, not just its mechanics.
I want to end this section by noting that the definition of normal ordered operators is
ambiguous in massive free-field theory. The Wilson coefficients are no longer required to
be pure power laws. Indeed, the massless Wilson coefficients are multiplied by functions
of $m|x - y|$, which are non-singular as $m|x - y|$ goes to zero. As a consequence, it is
impossible to identify a normal ordered operator as the coefficient of a certain power of
$|x - y|$ in the short-distance expansion. There is an ambiguity because we can replace
an operator of a given mass dimension by a polynomial in m, with higher powers
of m multiplying operators that would have lower mass dimension in the massless
theory. A standard definition of normal ordered products is to reorder the creation and
annihilation operators so that all creation operators stand to the left of all annihilation
operators. This is in fact the origin of the name normal ordered operator. We should
remember that this is just an arbitrary convention. It is an example of what we will call
a renormalization-scheme ambiguity in the general theory of renormalization.
9.2 Cut-offs
The cavalier manipulations of infinite quantities, which entered into our heuristic
mechanical description of renormalization, are clearly unacceptable.² A mathematical
way of dealing with the problem is to introduce some kind of cut-off.
2 They led to one of the more memorable quips in the mathematical physics literature, due to R. Jost,
"In the 1950s, under the influence of renormalization theory the mathematical sophistication required of
physicists was reduced to a rudimentary knowledge of the Latin and Greek alphabets."
The simplest way to do this is to replace space, or Euclidean space-time, by a lattice
with spacing $\Lambda^{-1}$. In fact, one way in which field theories arise in physics is in the
long-distance approximation to second-order phase transitions in condensed-matter
systems. In that context, a crystal lattice is a physical part of a more microscopic
description of the system. Another, more brutal, cut-off scheme is to simply refuse to
continue the momentum-space integration in Feynman diagrams beyond some scale $\Lambda$.
This procedure, while often convenient, does extreme violence to locality in space, just
as a sharp cut-off in space gives rise to violent effects at high momentum. A gentler
momentum cut-off is achieved by including all momenta but choosing an inverse
propagator $K(p^2)$,³ which, while retaining the form $p^2 + m^2$ at moderate momenta,
grows extremely rapidly (faster than any power) above $p^2 \sim \Lambda^2$. We can even choose
a propagator that is a smooth function, vanishing identically for $p^2 > \Lambda^2$. By looking
at Feynman graphs we can see that any such choice of propagator will render
all graphs with local interactions UV-finite, because the vertices are polynomials in
momenta. From the functional-integral point of view, we are choosing the Gaussian
part of the action to suppress the contribution of field configurations with large
momenta.
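As a rough numerical illustration of the last point, the sketch below evaluates the simplest "ear" integral, $\int d^4p/(2\pi)^4\, 1/K(p^2)$, for the assumed inverse propagator $K(p^2) = (p^2 + m^2)\, e^{p^2/\Lambda^2}$. The exponential form is chosen here only for illustration (the text does not commit to a particular $K$), and the function name and step counts are likewise our own. The integral is finite for every finite $\Lambda$ and grows like $\Lambda^2/16\pi^2$, which is the quadratic divergence that would reappear if the cut-off were removed.

```python
import math

def ear_integral(cutoff, mass, steps=20000):
    """Radial form of the 4d Euclidean integral of 1/K(p^2) with
    K = (p^2 + mass^2) * exp(p^2 / cutoff^2) (an illustrative choice):
    (1 / 8 pi^2) * int_0^inf dp p^3 exp(-p^2/cutoff^2) / (p^2 + mass^2)."""
    p_max = 12.0 * cutoff          # the integrand is ~exp(-144) beyond this point
    h = p_max / steps
    total = 0.0
    for i in range(1, steps):      # integrand vanishes at p = 0, negligible at p_max
        p = i * h
        total += p**3 * math.exp(-(p / cutoff) ** 2) / (p * p + mass * mass)
    return total * h / (8.0 * math.pi**2)

for cutoff in (10.0, 100.0):
    value = ear_integral(cutoff, 1.0)
    leading = cutoff**2 / (16.0 * math.pi**2)
    print(cutoff, value, value / leading)
```

The printed ratio approaches one as $\Lambda/m$ grows; the deviation is the expected $m^2 \ln(\Lambda^2/m^2)$ correction.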
The standard model of particle physics, and most proposed extensions of it, are
gauge theories. While it is not strictly necessary, it is extremely convenient to have
a cut-off procedure that preserves gauge invariance. Non-perturbatively, only lattice
regulators have this property, but for perturbative calculations there are two useful
Lorentz-invariant formulations. The first is Pauli-Villars regulation, with the related
proper-time cut-off of Schwinger, which we will discuss later. The second is dimensional
regulation. Dimensional regulation proceeds by noting that most expressions in
Feynman diagrams appear to be analytic in the dimension of space-time.⁴ Only the
integration measure $d^4p/(2\pi)^4$ identifies which dimension we are in.
Dimensional regularization (DR) can be defined formally as follows: any Feynman
graph in $d$ dimensions involves multiple integrals over $d$-dimensional momentum space
of expressions that can always be interpreted as functions of invariants made by dotting
external and/or internal momenta into each other. We define the dimensional
continuation of those integrals as a linear functional obeying
$$\int d^dq\, f(q + p) = \int d^dq\, f(q),$$
$$\int d^dq\, f(Rq) = \int d^dq\, f(q),$$
for any proper rotation matrix $R$ (interpreted as meaning that the answer involves only
invariant tensors) and
$$\int d^dq\, f(aq) = a^{-d} \int d^dq\, f(q).$$
Similarly, the Kronecker delta $\delta_{\mu\nu}$ which appears in these formulae is interpreted as a
symbol whose trace is $d$, and whose contraction with two vectors is their dot product.
3 Propagator cut-offs are usually introduced in Euclidean momentum space, though in principle we could
distort the propagator only in spatial momentum directions.
4 The exceptions are the Levi-Civita symbol and the Dirac $\gamma_5$ matrix.
This defines $d$-dimensional integration up to a multiplicative constant, which is fixed
by computing a simple Gaussian integral
$$\int d^dq\, e^{-a q^2} = \left(\frac{\pi}{a}\right)^{d/2}.$$
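One way to see that these rules are at least self-consistent for non-integer $d$ is to evaluate a rotationally invariant integrand with the radial formula $\int d^dq\, f(|q|) = \Omega_d \int_0^\infty dq\, q^{d-1} f(q)$, where $\Omega_d = 2\pi^{d/2}/\Gamma(d/2)$ is continued to real $d$. The radial formula is an assumption of this sketch (it is the usual realization of the axioms wherever the integral converges), and the function names are illustrative. The check below confirms the Gaussian normalization $(\pi/a)^{d/2}$, and with it the scaling rule, at a few real values of $d$.

```python
import math

def radial_integral(f, d, q_max=12.0, steps=40000):
    """Omega_d * int_0^q_max dq q^(d-1) f(q) by the midpoint rule, for real d > 0."""
    omega_d = 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)
    h = q_max / steps
    total = 0.0
    for i in range(steps):
        q = (i + 0.5) * h
        total += q ** (d - 1.0) * f(q)
    return omega_d * total * h

def gaussian(a):
    """Return the radial profile exp(-a q^2)."""
    return lambda q: math.exp(-a * q * q)

for d in (2.7, 3.5, 4.0):
    for a in (0.5, 1.0, 2.0):
        numeric = radial_integral(gaussian(a), d)
        exact = (math.pi / a) ** (d / 2.0)
        print(d, a, numeric, exact)
```

Nothing here probes the divergent integrals for which dimensional continuation is really needed; it only shows that the continuation agrees with ordinary integration where both make sense.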
These manipulations define analytic continuations of the scalar invariants of any
Feynman graph not involving $\epsilon_{\mu_1 \ldots \mu_d}$ to complex values of $d$. We will see that the
values of these graphs are finite, analytic functions, with multiple poles at integer values
of $d$. One other apparent non-analyticity in $d$ comes from closed spinor loops, since the
dimension of the spinor representation is $d_{\rm spin} = 2^{[d/2]}$. This can be dealt with by simply
defining $d_{\rm spin}$ to be any analytic function of $d$ that takes on the correct value in $d = 4$
(or whatever other dimension we happen to be interested in). A way to understand this
is to note that if we introduce $N_F$ copies of every fermion, in a $U(N_F)$-symmetric way,
then every closed fermion loop will introduce a power of $N_F\, d_{\rm spin}$. As we continue away
from $d = 4$ we can modify $N_F$ so that this product is anything we want, subject to the
condition that $N_F = 1$ and $d_{\rm spin} = 4$ when $d = 4$.
It should be emphasized that DR is a formal, perturbative regularization scheme.
One should not imagine that it represents e.g. the result of quantum field theory on
some space of fractal dimension $d$, analytically continued to complex $d$. The result
$\int d^dq/q^2 = 0$, which we will encounter below, shows us that DR does not correspond
to a positive measure. That said, DR is by far the most convenient method of computing
anything in quantum field theory, in the perturbative regime. In DR, UV divergences
show up as poles at discrete values of the space-time dimension. We will use DR
extensively below and provide many examples of such poles.
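The simplest example of such a pole is the massive "ear" integral, for which the standard dimensionally continued result is $\int d^dq/(2\pi)^d\, (q^2 + m^2)^{-1} = \Gamma(1 - d/2)\, m^{d-2}/(4\pi)^{d/2}$. This formula is quoted here rather than derived in the text above. The sketch below, with illustrative function names, compares it with direct radial integration in the range $0 < d < 2$ where the integral converges, and then evaluates it just below $d = 4$ to exhibit the pole.

```python
import math

def tadpole_dr(d, m):
    """Gamma(1 - d/2) m^(d-2) / (4 pi)^(d/2): continued value of
    int d^dq/(2 pi)^d 1/(q^2 + m^2)."""
    return math.gamma(1.0 - d / 2.0) * m ** (d - 2.0) / (4.0 * math.pi) ** (d / 2.0)

def tadpole_numeric(d, m, steps=200000):
    """Direct radial integration; meaningful only for 0 < d < 2."""
    omega_d = 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)
    # Substitute q = m*tan(theta), so that dq / (q^2 + m^2) = dtheta / m.
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        q = m * math.tan(theta)
        total += q ** (d - 1.0) / m
    return omega_d * total * h / (2.0 * math.pi) ** d

for d in (0.5, 1.0, 1.5):
    print(d, tadpole_numeric(d, 1.3), tadpole_dr(d, 1.3))

eps = 1e-6
print(eps * tadpole_dr(4.0 - eps, 1.0), -1.0 / (8.0 * math.pi**2))
```

The last line shows that the residue of the pole at $d = 4$ is $-2m^2/16\pi^2$ when written in terms of $4 - d$. For $2 < d < 4$ the formula also vanishes as $m \to 0$, consistent with the result $\int d^dq/q^2 = 0$ quoted above.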
The physically minded reader will be asking what the physical meaning of all of
these varied cut-off schemes is. The correct answer is that most of them have no physics
behind them at all. They are useful because the answers to correctly defined quantum
field theory are universal: they do not depend on the cut-off scheme. More generally,
we will argue that, if one restricts attention to an expansion in powers of momentum
over the cut-off, then all effects of one kind of cut-off can be encoded in a choice of
"irrelevant" operators in a different cut-off scheme. This is the notion of effective field
theory.
The fact that we get a length, the Planck length ($10^{-33}$ cm), by combining quantum
mechanics and gravity, suggests that the physical cut-off on field theories of particle
physics comes from the quantization of gravity. There are strong indications that theo-
ries of quantum gravity do not have precisely defined local observables and we do not
yet understand either the correspondence principle by which these theories admit an
approximate description in terms of local field theory or the circumstances in which
the approximation is valid. The wonderful thing about the RG approach to field theory
is that it allows us to discuss physics below the Planck energy scale without getting
involved in the conceptual intricacies of quantum gravity. The RG shows us that, once
we are in a situation in which it is valid to consider space-time geometry as a fixed
background, then the analysis of local experiments in space-time depends on quantum
gravitational effects only through a few parameters in a low-energy effective field theory.
9.3 Renormalization and critical phenomena
The deep understanding of renormalization and the renormalization group came from
the work of Fisher, Kadanoff, Wilson, and other condensed-matter physicists, working
on the problem of continuous phase transitions, or critical phenomena. We all know
that many forms of ordinary matter undergo phase transitions as we vary control
parameters like pressure, temperature, and external magnetic fields. Water turns into
steam, solids melt, ferromagnets magnetize, etc. etc. The thermodynamics of a system
is characterized by a free energy per unit volume, $F(T, P, H, \ldots)$, which depends on
the control parameters. Typically F depends analytically on these parameters, and,
for systems with a finite number of degrees of freedom, we can prove that this is so.
Mathematically, phase transitions occur only for infinite volume (unless we have an
infinite number of fields in our system at each point), and show up as discontinuities in
derivatives of F w.r.t. e.g. the temperature. Ehrenfest defined the order of a transition
as the order of the derivative which became singular or discontinuous at the phase
transition.
Landau, who invented the first general theory of phase transitions (generalizing
van der Waals' treatment of the liquid-gas system), realized that the main difference
was between first-order transitions and all the others, which are called continuous
transitions. Landau's theory is extremely simple. He postulated that the different phases
of the system were characterized by an order parameter $\Phi$ distinguishing the phases.
For example, in a ferromagnet $\Phi$ is the magnetization. In the liquid-gas transition, it
is related to the difference between the density of the phases and the critical density at
which the transition occurs.
The order parameter is defined so that its value is small near the transition. It is a
macroscopic property of the system, so it does not have thermal fluctuations. Its value
is therefore determined by minimizing a free-energy function $F(\Phi, T, \ldots)$ w.r.t. $\Phi$, at
fixed values of the control parameters. Near the transition, we can expand
$$F = a(T)\,\Phi^2 + b(T)\,\Phi^3 + c(T)\,\Phi^4 + \cdots.$$
The coefficients in the expansion depend analytically on the control parameters, of
which we have indicated only the temperature. Keeping only the indicated terms, and
assuming that c > 0, we have a quartic polynomial, with two local minima. As we vary
T, it might happen that the global minimum jumps from one to the other. This is a
first-order phase transition: F is continuous, but its first derivative is not, because we
have jumped from one branch of the function $\Phi(T)$ to another when the free energies
of the two branches cross each other.
Landau then realized that one could explain second-order transitions by assuming
that $b$ was zero (either by fine tuning some other control parameter, or because the system
had a built in $\Phi \to -\Phi$ symmetry) and that, at the critical temperature, $a(T_c) = 0$.
At this point two minima coalesce into one. The magnetization and free energies have
square-root branch points in $T - T_c$.
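The square-root behavior of the order parameter is easy to check explicitly. The sketch below assumes the simplest analytic choice $a(T) = a_0 (T - T_c)$ with constant $c > 0$ and $b = 0$; these parameter choices and the function names are ours, not the text's. Minimizing $F = a\Phi^2 + c\Phi^4$ gives $\Phi^2 = -a/2c$ below $T_c$, and a log-log fit of $\Phi$ against $T_c - T$ returns the mean field exponent $1/2$.

```python
import math

def order_parameter(T, Tc=1.0, a0=1.0, c=1.0):
    """Non-negative minimizer of F = a0*(T - Tc)*phi^2 + c*phi^4 (b = 0)."""
    a = a0 * (T - Tc)
    return math.sqrt(-a / (2.0 * c)) if a < 0 else 0.0

def brute_force_minimum(T, Tc=1.0, a0=1.0, c=1.0, n=200001, phi_max=2.0):
    """Grid search over 0 <= phi <= phi_max, as a check on the closed form."""
    a = a0 * (T - Tc)
    def free_energy(i):
        phi = phi_max * i / (n - 1)
        return a * phi**2 + c * phi**4
    best = min(range(n), key=free_energy)
    return phi_max * best / (n - 1)

def fitted_exponent(t1=1e-3, t2=1e-5):
    """Slope of log(phi) versus log(Tc - T) between two reduced temperatures."""
    p1, p2 = order_parameter(1.0 - t1), order_parameter(1.0 - t2)
    return (math.log(p1) - math.log(p2)) / (math.log(t1) - math.log(t2))

print(order_parameter(0.5), brute_force_minimum(0.5))
print(fitted_exponent())
```

The exponent does not depend on $a_0$ or $c$, which is the complete universality of the Landau theory in miniature.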
Landau's theory, called mean field theory, has two remarkable properties: it is completely
universal, depending only on the properties of polynomials, not on the detailed
nature of the order parameter or the system it characterizes; and it predicts fractional
power-law exponents for physical properties near continuous phase transitions. Exper-
imentalists rushed to test these properties. The experiments were difficult, because one
had to get very near the critical point to measure the exponents.
The result of this experimental work was exciting and puzzling. Continuous phase
transitions did exhibit remarkable universality. For example, the critical point in the
liquid-vapor phase diagram in the P, T plane has the same critical exponents as the
so-called Ising ferromagnets (ferromagnets with a single preferred axis of up-down
polarization). However, there was not complete universality. For example, XY fer-
romagnets with a whole preferred plane of polarization, or those in which there is
no preferred axis, had different critical exponents. Furthermore, while the exponents
were not wildly different from the Landau square roots, they were definitely different.
Clearly something was right about Landau's ideas, but something much more intricate
and interesting was going on.
This was confirmed by another set of phenomena,⁵ which were observed by scattering
light and neutrons off substances undergoing second-order transitions. These phenomena
could be explained by assuming that correlation functions of local quantities,
like the local magnetization, obeyed scaling laws of the form
$$\langle M(x)\, M(y) \rangle \sim |x - y|^{-2\Delta},$$
at very large distances, for temperatures at the critical point. By contrast, away from
critical points, three-dimensional condensed-matter systems generally exhibit what is
called a correlation length $\xi$:
$$\langle M(x)\, M(y) \rangle \sim \frac{e^{-|x - y|/\xi}}{|x - y|}.$$
The three new phenomena of universality classes, rather than a completely universal
mean field theory, non-mean field critical exponents, and infinite correlation length at the
critical point, received a common explanation in terms of the renormalization group.
Kadanoff's first crude form of the renormalization group is called the block spin
transformation. Kadanoff first reasoned that, if critical phenomena had universal aspects,
then interesting results for real systems could be extracted from simple models like the
Ising model. This model consists of a classical spin, which can take only the values
$\pm 1$, situated on the sites of a $d$-dimensional hypercubic lattice (where $d = 1$, 2 or 3 for
systems realizable in condensed-matter laboratories). The Hamiltonian is
$$H_0 = -J \sum_{x,\,n} \sigma(x)\, \sigma(x + n),$$
where the sum runs over all nearest-neighbor points in the lattice.
Kadanoff's great insight was to realize that, if critical phenomena really had something
to do with an infinite correlation length, then the microscopic dynamics was
irrelevant and there should be some sort of long-wavelength effective theory that captured
the phenomena. Landau's theory could be viewed as a sort of guess at what this
long-wavelength theory was (a so-called Gaussian model), but the experiments showed
that this guess was not correct. How should one systematically calculate the effective
long-wavelength Hamiltonian?
5 The most spectacular of which is critical opalescence.
The slogan that works, as it does for so many other things, is little steps for little feet.
If you're working on a lattice with spacing a, and you want to investigate phenomena
on a length scale much bigger than a, first go to 2a, then from there to 4a, and so on.
Kadanoff did this by defining blocks on the lattice, and defining $\Phi_i$ to be the average
of the spins on the $i$th block. He imagined doing the partition sum over all spins, with
the $\Phi_i$ held fixed. One then gets a new system, whose variables are the $\Phi_i$, defined on
a lattice of spacing $2a$. The Hamiltonian takes the form
$$H_1 = \sum K^1_n(x_1, \ldots, x_n)\, \Phi(x_1) \cdots \Phi(x_n),$$
where the sums now go over all lattice points and all n.
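The kinematic half of this construction, the definition of the block variables, can be sketched in a few lines. The code below only averages a given spin configuration over $2 \times 2$ blocks on a small two-dimensional lattice; it does not attempt the hard part, the partition sum at fixed block variables that determines the new couplings. The lattice size, the random configuration and the function name are illustrative choices.

```python
import random

def block_average(spins, b=2):
    """Average an L x L array over non-overlapping b x b blocks."""
    size = len(spins)
    assert size % b == 0, "lattice size must be a multiple of the block size"
    blocked = []
    for bi in range(0, size, b):
        row = []
        for bj in range(0, size, b):
            total = sum(spins[bi + di][bj + dj]
                        for di in range(b) for dj in range(b))
            row.append(total / (b * b))
        blocked.append(row)
    return blocked

random.seed(0)
L = 8
spins = [[random.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
phi1 = block_average(spins)   # spacing 2a: values in {-1, -1/2, 0, 1/2, 1}
phi2 = block_average(phi1)    # spacing 4a: values are multiples of 1/8
print(phi1)
print(phi2)
```

Each blocking step halves the linear size of the lattice and refines the set of values the block variable can take, while leaving the total magnetization unchanged.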
It isn't easy to calculate the functions $K^1_n$, but it is easy to argue that they all have
short-range correlations, because the original Hamiltonian did, and, by keeping the
$\Phi_i$ fixed, we have not allowed any long-wavelength fluctuations. That is, the sum over
$\Phi_i$, which we have not yet done, contains both configurations where the $\Phi_i$ randomly
fluctuate from point to point on the lattice with spacing $2a$ and $\Phi_i$ configurations of
longer wavelength. It is only by summing over the latter that we can produce long-range
correlations in the system. We can now iterate the procedure to get new effective
theories (with the same partition function) for lattices of larger and larger spacing.
After $k$ steps we have
$$H_k = \sum K^k_n(x_1, \ldots, x_n)\, \Phi(x_1) \cdots \Phi(x_n),$$
where the functions $K^k_n$ have short-range correlations on a lattice of spacing $2^k a$. Note
that we have used the same name for the variables at all steps but the first. Actually
these variables all have different discrete ranges, but as $k$ gets large it is clear that we
can let the variables be continuous and replace sums over $\Phi$ by integrals. Furthermore,
although all the $\Phi$ variables are bounded between $\pm 1$ we can abandon this constraint
and replace it by a potential that favors $\Phi$s in this compact range. Our system now
resembles a lattice field theory, with the statistical-mechanical Hamiltonian playing
the role of the Euclidean action.
As we take $k$ to infinity, one of two things happens. If we are at a value of the temperature
for which the correlation length is finite, then the functions $K^k_n$ become of shorter
and shorter range in lattice units, once we take the spacing larger than the correlation
length. Eventually, they are all Kronecker deltas, and there are no interactions between
the lattice points. We call this limit the trivial fixed point of the renormalization group.
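The flow to the trivial fixed point can be made explicit in the one case where the coarse-graining is exactly solvable, the $d = 1$ Ising chain. The sketch below uses decimation (summing over every second spin) rather than Kadanoff's block averaging; this is a swapped-in technique, chosen because it closes on a single nearest-neighbor coupling $K = J/T$, with the standard recursion $\tanh K' = \tanh^2 K$. The function names are illustrative.

```python
import math

def decimate(K):
    """One decimation step for the 1d Ising chain: tanh K' = tanh(K)^2."""
    return math.atanh(math.tanh(K) ** 2)

def transfer_check(K):
    """Derive K' directly by summing the middle spin s = +1, -1:
    sum_s exp(K s (s1 + s2)) = A exp(K' s1 s2)."""
    aligned = 2.0 * math.cosh(2.0 * K)   # s1 = s2
    opposed = 2.0                        # s1 = -s2
    return 0.5 * math.log(aligned / opposed)

def flow(K0, steps):
    """Iterate the recursion, returning the whole trajectory of couplings."""
    Ks = [K0]
    for _ in range(steps):
        Ks.append(decimate(Ks[-1]))
    return Ks

print(decimate(1.0), transfer_check(1.0))
print(flow(2.0, 12))
```

Starting from any finite $K$ the coupling decreases monotonically to $K = 0$, the decoupled limit described above. This is the statement that the one-dimensional chain has a finite correlation length at every non-zero temperature.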
But suppose that we are at the critical temperature, for which the correlation length
is infinite. In this case, Kadanoff realized that the only way to get a sensible limit was
to assume that the Hamiltonian approached a fixed point, in terms of appropriately
rescaled variables $\Phi$. That is, the fixed-point Hamiltonian had to be invariant under
scale transformations of space combined with a rescaling of $\Phi$. This then predicts scale-invariant
correlation functions, as observed experimentally. It further predicts that the
scaling exponents are characteristic of the fixed-point Hamiltonian and independent
of the microphysics. This explains universality.
However, unlike Landau's theory, Kadanoff's block spin idea can accommodate
different universality classes. For example, there is no reason for the RG transformation
for a system with Ising symmetry to be the same as that for planar or fully rotation-
invariant spin symmetry. Detailed calculation shows that it is not. Thus, block spin
or renormalization theory explains the qualitative nature of critical phenomena. It
remained for Fisher, Wilson, and others to appreciate the connection to Euclidean
field theory, and to use field-theoretic tools to establish detailed numerical agreement
of the theory with a wide variety of experiments. Rather than following them on this
fascinating journey, we will now return to the application of these ideas to relativistic
quantum field theory itself.
9.4 The renormalization (semi-)group in field theory
Consider a scalar field theory with the following action:
$$S = \frac{1}{2} \int \frac{d^4p}{(2\pi)^4}\, \phi(p)\, \phi(-p)\, K(p^2/\Lambda^2)
+ \sum_n \int d^4p_1 \cdots d^4p_n\, g_n(p_1, \ldots, p_n, \Lambda)\, \delta^4\Big(\sum_i p_i\Big)\, \phi(p_1) \cdots \phi(p_n)
+ \int d^4p\, J(p)\, \phi(p),$$
where $K(p^2/\Lambda^2)$ is smooth, behaves like $p^2$ for small $p$, and blows up exponentially
at $|p| > \Lambda$. The $g_n$ are smooth functions bounded by polynomials in the momenta.
The requirement of smoothness guarantees (by virtue of the Riemann-Lebesgue
lemma) that interactions are of short range in position space. That is, the functions
$g_n(x_1, \ldots, x_n)$, while not necessarily local, fall off exponentially with distance, on a
scale of order $\Lambda^{-1}$. We want to study physics below some scale $\Lambda_R \ll \Lambda$. We therefore
take the source function to vanish for $p^2 > \Lambda_R^2$. We call the first quadratic term $S_0$
and everything else but the last linear term $S_I$.
It is easy to see that this action has a finite perturbation series for any choice of the
functions in the action. The exponentially falling propagators make every loop integral
converge in the UV, because the measures of integration and the functions $g_n$ grow at
most like polynomials. Because we have chosen to perturb around a massless theory, we
might find IR divergences, but we know that these can be eliminated by shifting a mass
term from the function $g_2$ in $S_I$ to $S_0$. They have nothing to do with the UV problem.
The basic idea of the RG is to rescale the cut-off to a lower value, without changing
the physics below $\Lambda_R$. In this way we can eventually get an action that contains only
momentum scales of order $\Lambda_R$ or smaller, which has the same physics as the original
action. We approach this by writing a differential equation for the required change in
the action for an infinitesimal scale change. By integrating this equation we obtain the
required elimination of degrees of freedom.
It should be noted that in using the Euclidean formalism we are implicitly assuming
that the integrated degrees of freedom are in their ground state. That is, they are excited
only by their interactions with the low-energy degrees of freedom. We have not sent
in any incoming high-energy waves. This follows from the fact that the free Euclidean
Green function is the analytic continuation of the vacuum expectation value of fields.
Any other Green function for the Lorentzian Klein-Gordon operator would have wave
packets propagating at infinity.
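We now want to find an equation for $S_I$ that allows us to vary $\Lambda$ without changing
the physics. Let $\partial_t = \Lambda\, \partial/\partial\Lambda$. I claim that the equation
$$\partial_t e^{-S_I} = -\frac{1}{2} \int \frac{d^4p}{(2\pi)^4}\, \partial_t K^{-1}(p)\, \frac{\delta^2 e^{-S_I}}{\delta\phi(p)\, \delta\phi(-p)}$$
does the trick. Indeed, if we take the scaling derivative of the numerator of the generating
functional $Z$, and use this formula, we get
$$\partial_t I[J] = \int [d\phi]\, e^{-(S_0 + \int \phi(p) J(-p))}
\left[ -\frac{1}{2} \int \frac{d^4p}{(2\pi)^4}\, \partial_t K(p)\, \phi(p)\, \phi(-p)\, e^{-S_I}
- \frac{1}{2} \int \frac{d^4p}{(2\pi)^4}\, \partial_t K^{-1}(p)\, \frac{\delta^2 e^{-S_I}}{\delta\phi(p)\, \delta\phi(-p)} \right].$$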
We can use functional integration by parts to throw the functional derivatives acting
on $e^{-S_I}$ onto $e^{-(S_0 + \int \phi J)}$. The terms where the derivatives act on the source give us
integrals of the form $\int J(p)\, \partial_t K^{-1}(p)$, which vanish, because these two functions have
disjoint support. The terms where they act on $S_0$ give
$$-\frac{1}{2} \int \frac{d^4p}{(2\pi)^4} \left[ -K(p)\, \partial_t K^{-1}(p)\, \delta^4(0) + \phi(p)\, K(p)\, \partial_t K^{-1}(p)\, K(-p)\, \phi(-p) \right].$$
The infinite momentum-space delta function in the first term is a factor of the space-
time volume. It corresponds to the renormalization of the vacuum energy density from
the modes that are dropped when we lower the cut-off. It will cancel out against a
similar contribution from the denominator when we compute
$$\partial_t Z[J] = \frac{\partial_t I[J]}{I[0]} - Z[J]\, \frac{\partial_t I[0]}{I[0]}.$$
Using the identity $\partial_t K^{-1} = -K^{-2}\, \partial_t K$, we see that the second term cancels out against
the explicit scale derivative of $S_0$, so that $Z[J]$ is unchanged.
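The cancellation can be checked numerically in a zero-dimensional toy version of the argument, in which the field is a single real variable $\phi$, the propagator $K^{-1}$ is a positive number $P$, the source is set to zero, and $X = e^{-S_I}$ is an ordinary function. In this toy model (an assumption of the sketch, not the field theory itself) the flow equation reads $\partial X/\partial P = -\frac{1}{2} X''$. In the direction of decreasing $P$ it is solved by smoothing $X$ with a Gaussian whose variance is the amount by which $P$ is lowered, which is just the integral over the modes being removed. The choice $S_I = g\phi^4$, the values of $P$ and the function names are illustrative.

```python
import math

def gauss(x, var):
    """Normalized Gaussian weight with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def integrate(f, lo=-12.0, hi=12.0, n=801):
    """Trapezoid rule on a uniform grid."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n - 1):
        total += f(lo + i * h)
    return total * h

def x_high(phi, g=0.1):
    """exp(-S_I) at the higher cut-off, with the illustrative S_I = g phi^4."""
    return math.exp(-g * phi ** 4)

def x_low(phi, dP):
    """exp(-S_I) after integrating out a mode chi with propagator dP."""
    return integrate(lambda chi: gauss(chi, dP) * x_high(phi + chi))

def partition(x, P):
    """Normalized integral of exp(-phi^2 / 2P) * x(phi)."""
    return integrate(lambda phi: gauss(phi, P) * x(phi))

P_high, P_low = 1.0, 0.6
Z_high = partition(x_high, P_high)
Z_low = partition(lambda phi: x_low(phi, P_high - P_low), P_low)
print(Z_high, Z_low)
```

The two numbers agree to the accuracy of the quadrature: lowering the "cut-off" changes both the propagator and the interaction, but not the partition function.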
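We can rewrite the exact RG equation for $S_I$ as
$$\partial_t S_I = \frac{1}{2} \int \frac{d^4p}{(2\pi)^4}\, \partial_t K^{-1}(p) \left[ \frac{\delta S_I}{\delta\phi(p)}\, \frac{\delta S_I}{\delta\phi(-p)} - \frac{\delta^2 S_I}{\delta\phi(p)\, \delta\phi(-p)} \right].$$
In terms of the coefficient functions in the expansion of $S_I$, this has the form
$$\partial_t g_n(P_n) = \sum g_{k+1}(P_k, q)\, \partial_t K^{-1}(q)\, g_{n-k+1}(-q, P_{n-k}) - \int \frac{d^4p}{(2\pi)^4}\, \partial_t K^{-1}(p)\, g_{n+2}(P_n, p, -p).$$
The sum in the first term is over all ways of partitioning the momenta (which we
have labeled by a single letter $P_n$) into two groups $P_k$ and $P_{n-k}$, each containing the
indicated number of single-particle momenta, and $q$ is the momentum flowing between
the two vertices, fixed by momentum conservation.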
The two contributions to the variation of $S_I$ have an interpretation in terms of
Feynman diagrams (Figure 9.2), one a loop diagram and one a tree. Note that the
internal lines in these diagrams are proportional to $\partial_t K$, which is non-zero only for
momenta near the cut-off. We interpret this as integrating out degrees of freedom in
an infinitesimal shell near the cut-off. Because the phase space is so small, higher loop
diagrams are negligible.
The RG equation for $S_I$ is of the form
$$\partial_t X = -\frac{1}{2} \nabla^2 X,$$
where $X$ is the functional $e^{-S_I}$ and
$$\nabla^2 = \int \frac{d^4p}{(2\pi)^4}\, \partial_t K^{-1}(p)\, \frac{\delta^2}{\delta\phi(p)\, \delta\phi(-p)}.$$
This looks like a heat equation on the space of all functionals X, and, like all heat
equations, it is a gradient flow. Not all functionals can be written as the exponential of
an $S_I$ which has an expansion in terms of polynomially bounded smooth functions $g_n$.
This requirement defines a curved sub-manifold of the infinite-dimensional linear space
of functionals $X$ on which the heat flow takes place. The explicit form of the equations
for $g_n$ in terms of Feynman diagrams tells us that the sub-manifold of quasilocal cut-off
actions is preserved by the heat flow.
As noted above, the RG equations for the functions $g_n$ have the form
$$\partial_t g_n(p_1, \ldots, p_n) = \sum g_k(p_{S_1}, \ldots, p_{S_{k-1}}, P)\, \partial_t K^{-1}(P)\, g_{n-k+1}(-P, p_{S_k}, \ldots, p_{S_n})
- \int \frac{d^4p}{(2\pi)^4}\, \partial_t K^{-1}(p)\, g_{n+2}(p_1, \ldots, p_n, p, -p).$$
The sums in the first term are over all choices of $k < n$ and over all permutations $S_i$. We
employ the convention that the sum of all momenta appearing inside a vertex function
$g_n$ is zero. By expanding the functions $g_n(p_1, \ldots, p_n)$ in a power series around zero,
we can convert these equations into an infinite number of equations for the expansion
coefficients. Label the latter by $g^I$. The equations have the form
$$\partial_t g^I = \beta^I_J\, g^J + \beta^I_{JK}\, g^J g^K \equiv \beta^I(g).$$
The fact that the flow is also gradient means that there is a positive-definite metric
$G_{IJ}(g)$ such that $\beta_I = G_{IJ}\, \partial_t g^J$ satisfies
$$\partial_I \beta_J = \partial_J \beta_I,$$
so that $\beta_I = \partial_I V$. Gradient flow can have only two types of asymptotic behavior,
because the potential $V$ increases along the flow. It can have a runaway to infinite
values, or it can hit a fixed point. Renormalized quantum field theories correspond to
fixed-point behavior of the RG flow.
Since the flow is gradient, its asymptotic behavior is governed by a fixed point where
$\beta^I(g_*) = 0$. Let $D^I = g^I - g^I_*$. Then
$$\partial_t D^I = \lambda^I_J\, D^J + \beta^I_{JK}\, D^J D^K,$$
where
$$\lambda^I_J = \beta^I_J + 2\beta^I_{JK}\, g^K_*,$$
with $\beta^I_{JK}$ taken to be symmetric in its lower indices.
If we diagonalize $\lambda$, calling its eigenvalues $4 - d^I$, we see that the initial values of the
coefficients of irrelevant operators, with $d^I > 4$, disappear exponentially from the flow
as $t \to \infty$. The marginal and relevant operators, with $4 - d^I \ge 0$, asymptotically satisfy
a closed set of highly non-linear RG equations obtained by eliminating the irrelevant
operators in terms of the others. The coefficients of irrelevant operators in the effective
Lagrangian at $\Lambda_R \ll \Lambda$ are completely determined by the relevant and marginal ones.
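The way an irrelevant coefficient forgets its initial value and becomes a function of the remaining couplings can be seen in a two-coupling toy flow. The beta functions below are hypothetical, invented for this sketch: $g$ is a marginal coupling with a purely quadratic beta function, $h$ has linearized eigenvalue $-2$ (as for a dimension-six operator) and is sourced by $g^2$. The flow parameter $s = \ln(\Lambda/\Lambda_R)$ is taken to increase toward the infrared, which is a sign convention assumed here.

```python
def rhs(g, h, b=1.0):
    """Toy beta functions: dg/ds = -b g^2, dh/ds = -2 h + g^2."""
    return -b * g * g, -2.0 * h + g * g

def rk4_flow(g, h, s_max=10.0, n=10000):
    """Integrate the toy flow from s = 0 to s_max with fourth-order Runge-Kutta."""
    ds = s_max / n
    for _ in range(n):
        k1 = rhs(g, h)
        k2 = rhs(g + 0.5 * ds * k1[0], h + 0.5 * ds * k1[1])
        k3 = rhs(g + 0.5 * ds * k2[0], h + 0.5 * ds * k2[1])
        k4 = rhs(g + ds * k3[0], h + ds * k3[1])
        g += ds * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        h += ds * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return g, h

# Same marginal coupling, very different initial irrelevant couplings.
g_a, h_a = rk4_flow(0.5, +1.0)
g_b, h_b = rk4_flow(0.5, -1.0)
print(g_a, h_a, h_b, g_a**2 / 2.0)
```

After ten e-foldings the two values of $h$ differ by a factor $e^{-20}$ times their initial separation, and both sit close to $g^2/2$, the value fixed by the marginal coupling alone. The marginal coupling itself has decayed only as a power of $s$, $g = g_0/(1 + b g_0 s)$.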
Some of the marginal operators may actually be marginally irrelevant. That
is, when non-linear corrections to the flow are taken into account, they may also be
irrelevant. However, since their initial conditions are forgotten at a rate that is a power
law, rather than an exponential, in $t$ (and thus logarithmic in scale), the proper treatment
of marginally irrelevant operators depends on the context. In a mathematical
treatment, in which we try to take the cut-off to infinity, we must treat them like irrel-
evant operators. However, in the real world, the ultimate cut-off scale is unlikely to be
larger than $M_P \sim 10^{19}$ GeV, the Planck energy scale defined by combining Newton's
constant with Planck's. In theories with large or warped extra dimensions, it might
be much lower. Even without such exotic theoretical input, it is easy to imagine that
the elementary fields of our current effective field theories are composites of a deeper
theory hiding just around the corner of the next collider experiment. Thus, we should
be careful not to dismiss irrelevant, and especially marginally irrelevant, operators in
too cavalier a fashion. Discovering that the effective field theories we need to explain
current experiments contain irrelevant operators simply tells us that there is a scale of
new physics not too far away. Marginally irrelevant operators may be pointing to the
need for changes in our models only at energies that are exponentially far away in scale.
In fact, if we are considering the possibility of an energy scale just above our cur-
rent experimental resolution, we should always expect to find irrelevant operators with
coefficients proportional to inverse powers of that nearby scale in our effective descrip-
tion of nature. The non-renormalizability of these effective theories is telling us that
there are missing pieces of our current theory and that we can expect to encounter new
phenomena at that nearby scale. This is for example the case for the non-linear sigma
model describing low-energy pion physics. The extra physics is the entire rich structure
of QCD. Similarly, the non-renormalizability of Fermi's theory of weak interactions
told us to expect new physics by about 250 GeV. In fact, as a consequence of the small
dimensionless parameters of the electro-weak gauge theory, the new physics actually
appeared below 100 GeV, at the W-boson mass.
This section contains the key idea of renormalization theory, which is summarized
in the following picture (Figure 9.3). The RG flow stops at fixed points, whose criti-
cal surface, or basin of attraction, has low co-dimension in the space of all couplings.
Figure 9.3. Renormalization-group trajectories near a critical surface.
The infinite-cut-off limit of the theory at the fixed point is conformally invariant. Ini-
tial coupling values on the critical surface all flow to the fixed point. Near the fixed
point, infinitesimal directions on the critical surface are controlled by irrelevant or
non-renormalizable perturbations of the Lagrangian. Flows beginning off the critical
surface do not reach the fixed point, but if the initial couplings are close to the critical
surface the flow comes close to the fixed point before moving away. We can obtain a
non-scale-invariant infinite-cut-off limit by tuning the initial couplings to the critical
surface as we take the cut-off to infinity. Near the fixed point, perturbations off the
critical surface are called relevant or super-renormalizable operators. The general defi-
nition of a quantum field theory is to take the cut-off to infinity, tuning the coefficients
of relevant operators to zero in order to obtain finite Green functions.
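The picture of flows that approach the fixed point and then run away can be illustrated with a minimal sketch. It assumes a linearized flow with a single relevant and a single irrelevant coupling; the scaling exponents, the function names `linear_flow` and `closest_approach`, and the initial values are hypothetical choices made only for this illustration.

```python
import math

def linear_flow(g_rel0, g_irr0, y_rel=2.0, y_irr=2.0, t_max=10.0, steps=2001):
    """Return [(t, g_rel(t), g_irr(t))] for a linearized flow toward the IR.

    Assumption: near the fixed point (the origin) the relevant coupling grows
    like exp(+y_rel * t) and the irrelevant one decays like exp(-y_irr * t).
    """
    trajectory = []
    for k in range(steps):
        t = t_max * k / (steps - 1)
        g_rel = g_rel0 * math.exp(y_rel * t)    # relevant: grows in the IR
        g_irr = g_irr0 * math.exp(-y_irr * t)   # irrelevant: decays in the IR
        trajectory.append((t, g_rel, g_irr))
    return trajectory

def closest_approach(trajectory):
    """Smallest distance from the fixed point reached along the trajectory."""
    return min(math.hypot(g_rel, g_irr) for _, g_rel, g_irr in trajectory)

if __name__ == "__main__":
    # Tuning the relevant coupling toward zero brings the flow closer to the fixed point.
    for g0 in (1e-2, 1e-4, 1e-6):
        print(g0, closest_approach(linear_flow(g0, 1.0)))
```

In this toy model an initial condition exactly on the critical surface (`g_rel0 = 0`) flows into the fixed point, while a small non-zero relevant coupling produces a trajectory that lingers near the fixed point before leaving it.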
First-order analysis of the RG equations near the fixed point shows that there are
generally marginal operators, which don't flow at all in first order. Higher-order analysis
divides these into truly marginal, marginally relevant, and marginally irrelevant oper-
ators. Truly marginal operators indicate the existence of a line or higher-dimensional
surface of fixed points. If we really insist on taking the cut-off to infinity, marginally
relevant and irrelevant operators are just like relevant and irrelevant ones. On the other
hand, in the real world, we expect the cut-off above which all quantum field theories
fail to be no bigger than $10^{19}$ GeV. Thus, marginally irrelevant couplings, if they are
sufficiently small, would be expected to appear in our theories.
9.4.1 Changing the cut-off
We now want to argue that IR physics is in fact independent of the way we choose
to cut off the theory. We begin by investigating changes in the function $K(p^2/\Lambda^2)$. By
definition, these changes all take place at scales above the original cut-off. Therefore,
once we have flowed to a momentum much lower than the cut-off they don't affect
the RG equations. Thus, a change in K can be absorbed into a change in the initial
conditions for the flow, and affects the low-energy physics only through the values of
the relevant and marginal parameters.
A more drastic change in cut-off scheme is a Euclidean lattice. Here momentum space
is compactified on a torus, with cycles of size $2\pi/a$. However, we can work instead with
a non-compact momentum space, if we insist that the propagator vanish identically
above $|p| = 2\pi/a$. 6 We do this by multiplying the lattice propagator by a smooth
function. Now we can repeat the argument of the previous paragraph.
The upshot of these arguments and their generalization is that we can impose the
cut-off in any way we like, once we have identified the fixed point of the RG that is
relevant to our theory. In perturbation theory, we are always working with specific
6 For Dirac fermions, depending on the way we implement the lattice Lagrangian, we have to proceed with
care. Some versions of the lattice Dirac equation actually describe multiple copies of a continuum fermion.
The extra copies come from momenta of order $\pi/a$. We must take care to incorporate all of these copies
in the low-energy theory if we really want to describe the continuum limit of the lattice theory.
Renormalization and effective field theory
Gaussian fixed points, so any way of cutting off Feynman diagrams is acceptable, and
different cut-off schemes will simply lead to different parametrizations of the space of
relevant and marginal interactions. In words we will use later, they just lead to different
renormalization schemes. The exact RG equations we have discussed are conceptually
important, but perturbative calculations generally use the DR scheme, which leads to
simpler computations.
9.5 Mathematical (Lorentz-invariant, unitary)
quantum field theory
The RG enables us to understand what we mean by a mathematically well-defined,
Lorentz-invariant local quantum field theory. The classification of operators according
to their relevance fixes the dimension of the so-called critical sub-manifold associated
with a fixed point. A general RG flow will miss the fixed point because, nearby the fixed
point, some relevant couplings will be non-zero. The number of relevant (including
marginally relevant) couplings is the co-dimension of the critical manifold in the space
of all couplings.
One way to obtain a cut-off-independent quantum field theory is to choose initial
conditions on the critical manifold, and take the cut-off to infinity. By construction,
the IR limit is cut-off-independent. The theory we obtain in this way is scale-invariant,
because dimensional analysis tells us that, if we rescale the metric of space-time, we
can always compensate by rescaling the cut-off. If we consider field theory in a general
background metric (which we can always do for local Lagrangians), rescaling is the
operation
$$\int d^4x\, g^{\mu\nu}(x)\,\frac{\delta}{\delta g^{\mu\nu}(x)}.$$
The definition of the stress-energy tensor of a quantum field theory is that this rescaling
is just equal to
$$\int d^4x\, g^{\mu\nu}(x)\,\langle T_{\mu\nu}(x)\rangle .$$
That is, the stress tensor is just the source for the gravitational field. In most cases,
one can show that this variation can vanish only if it vanishes locally, that is, if the
theory is Weyl-invariant (invariant under local rescalings of the metric). The group
of local rescalings which can be compensated by coordinate transformations so that
they leave the Minkowski metric $\eta_{\mu\nu}$ invariant is called the conformal group. Theories
with $\eta^{\mu\nu}T_{\mu\nu} = 0$ are called conformal field theories (CFTs). For particle-physics
applications we deal primarily with unitary CFTs, though certain non-unitary CFTs
are important in statistical mechanics and string theory.
There is a more general way to get a cut-off-independent result from a fixed point
with relevant directions. Consider flows that start near, but not on, the critical surface.
Now, take the cut-off to infinity, but, as we do this, let the initial conditions approach
the critical surface in such a way that we get a finite limiting theory. To put it another
way: for fixed values of the cut-off and of the near critical initial condition $g^I(0)$ for the
relevant parameters, the flow passes near the fixed point at some energy scale $\mu \ll \Lambda$
and then flows away. $\mu$ is a function of $\Lambda$ and $g^I(0)$. By dimensional analysis it has
the form $\mu = f(g^I(0))\Lambda$, with $f$ vanishing on the critical surface. We tune the initial
condition as $\Lambda \to \infty$ so that $\mu$ remains finite in the limit. This is one condition
on all of the relevant parameters. With $N$ as the co-dimension of the critical surface,
we obtain a cut-off-independent theory that depends on the energy scale $\mu$ and $N-1$
dimensionless parameters. The finite parameters which control how we have taken the
infinite-cut-off limit are called the renormalized couplings of the theory. There is an infi-
nite number of ways to parametrize the renormalized theory, and we shall discuss a few
of them.
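As a hedged illustration of this tuning, consider the simplest case of a single relevant coupling $g$ with scaling dimension $y > 0$ and a purely linearized flow. Both the symbol $y$ and the normalization condition $g(\mu) = 1$ used to define $\mu$ are assumptions made for this sketch, not part of the general argument.

```latex
% Sketch: one relevant coupling g with scaling dimension y > 0 (assumed for illustration).
\begin{align}
  \Lambda' \frac{d g}{d \Lambda'} = -y\, g
  &\quad\Longrightarrow\quad
  g(\Lambda') = g(0)\left(\frac{\Lambda}{\Lambda'}\right)^{y}, \\
  g(\mu) = 1
  &\quad\Longrightarrow\quad
  \mu = g(0)^{1/y}\,\Lambda \equiv f(g(0))\,\Lambda , \\
  \mu \ \text{fixed},\ \Lambda \to \infty
  &\quad\Longrightarrow\quad
  g(0) = \left(\frac{\mu}{\Lambda}\right)^{y} \to 0 .
\end{align}
```

Here $f(g) = g^{1/y}$ vanishes on the critical surface $g = 0$, as required. For the mass term of a free scalar, $y = 2$ and $g(0) = m^2/\Lambda^2$, so the finite scale that survives the limit is simply $\mu = m$.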
This picture of a general quantum field theory as a RG trajectory that passes infinites-
imally close to the fixed point CFT suggests a new way of constructing the whole set
of $\Lambda \to \infty$ theories associated with a given RG fixed point from the fixed-point CFT.
We can read off the dimensions of operators in the CFT by looking at their response
to scale transformations. This enables us to construct the list of relevant operators
without knowledge of the RG flow for finite cut-off. We now do standard perturbation
theory
$$\langle 0|T\,O_1(x_1)\cdots O_n(x_n)|0\rangle_G = \langle 0|T\,O_1(x_1)\cdots O_n(x_n)\,e^{i\int d^4x\,G^I O_I(x)}|0\rangle_0 .$$
In this formula the $O_i$ in the Green function can be any operator in the theory, while
the $G^I$ are zero for all except relevant operators.
This is essentially what we are doing when we do renormalization in ordinary per-
turbation theory, except that in that case we first resum the results of perturbation
in quadratic relevant operators (mass terms) because it is easy to do. Special atten-
tion must be paid to marginally irrelevant operators. In principle, they should not
be included in the list of non-zero $G^I$. However, since the signature of their initial
conditions vanishes only logarithmically in the infrared, it is hard to see the difference
between marginally relevant and irrelevant operators with small coefficients until
energy scales exponentially larger than the renormalization scale $\mu$.
To see how renormalization works from this point of view, note that all UV diver-
gences in this perturbation series come from places where points coincide. Away from
such places the Green functions are all finite. Infrared divergences could be cut off
by putting the whole system in a large Euclidean box. There are two kinds of UV
divergence. The first have to do with singularities that occur when one or more of the
integrated points from the expansion of the interaction collide with one of the operator
insertions $O_i(x_i)$. Recall that in free massless CFT we had to define composite operators
by a careful limiting procedure (normal ordering). The divergences associated with
the coincidence of perturbative vertices and operator insertions can be removed by a
redefinition of the operators $O_i$, since they are localized near the points where these
operators sit. We will not deal explicitly with these divergences, though the methods
for dealing with them follow the same line of reasoning as that which we will see below.
The other sort of UV divergence comes from the collision of two integrated vertices.
We deal with it by doing an operator product expansion. If x and y are the space-time
positions of the two colliding vertices we write
$$\int d^4x\, d^4y \sum_K |x-y|^{\Delta_K-\Delta_I-\Delta_J}\, G^I G^J C^K_{IJ}\, O_K(x).$$
We have again used a shorthand in which a power of $|x - y|$ stands for any tensor
structure with the same scaling dimension. Divergences in the integral over $(x - y)$
will occur whenever $\Delta_K - \Delta_I - \Delta_J < -4$, as long as the angular integral does not
vanish. The latter condition tells us that only scalar operators $O_K$ can contribute to
the divergence. 7
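The power counting behind this statement can be made explicit with a short worked integral. This is only a sketch: the short-distance regulator $a$ and the long-distance cut-off $R$ are introduced here for illustration, with $z = x - y$ and $p = \Delta_K - \Delta_I - \Delta_J$.

```latex
% Sketch: radial and angular parts of the integral over z = x - y in four Euclidean dimensions.
\begin{align}
  \int_{a<|z|<R} d^4z\, |z|^{p}
  &= 2\pi^2 \int_a^{R} dr\, r^{3+p}
   = \begin{cases}
       \dfrac{2\pi^2}{p+4}\left(R^{\,p+4} - a^{\,p+4}\right), & p \neq -4, \\[2mm]
       2\pi^2 \ln (R/a), & p = -4,
     \end{cases} \\
  \int d\Omega_3\, \hat z_\mu &= 0 , \qquad
  \int d\Omega_3 \left(\hat z_\mu \hat z_\nu - \tfrac14\,\delta_{\mu\nu}\right) = 0 .
\end{align}
```

As $a \to 0$ the first line diverges like a power of $1/a$ for $p < -4$, with the borderline case $p = -4$ giving a logarithm. The second line shows why tensor structures that are not rotation scalars drop out after the angular integration.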
9.6 Renormalization of $\phi^4$ field theory
We're now ready to do an explicit perturbative example of renormalization. The CFT of
a single free massless scalar has four relevant/marginal perturbations: $:\phi^n:$, $n = 2, 3, 4$,
and $:(\partial_\mu\phi)^2:$. 8 The last of these is equivalent, after partial integration, to $:\phi\,\partial^2\phi:$,
which vanishes. It does not really change the theory we obtain in the limit, but only the
normalization of the field. However, it is necessary to retain this operator in the action
in order to get cut-off-independent Green functions.
The term $\phi^3$ is the only one of these operators which is not invariant under the
$\phi \to -\phi$ symmetry of the fixed-point theory. We will have a lot more to say about
renormalization and symmetry later. For now, we can notice that the exact RG equations
do not generate interactions odd in $\phi$ from even ones. Thus, we can think of a
restrictive class of theories that retain this symmetry, and drop the cubic operator from
our list of relevant terms.
There are two ways of speaking about perturbative renormalization. In the first,
which I prefer, and will use, one writes a Lagrangian 9
$$\mathcal{L} = :\left[\tfrac12(\partial_\mu\phi_0)^2 - \tfrac12 m_0^2\phi_0^2 - \tfrac{\lambda_0}{4!}\phi_0^4\right]: .$$
One then defines a rescaled field $\phi = Z^{-1/2}\phi_0$, and allows the coefficients $Z$, $m_0^2$, $\lambda_0$ to be
cut-off-dependent in order that the field $\phi$ has finite Green functions. The alternative
method is to split the Lagrangian into $:[\tfrac12(\partial_\mu\phi)^2 - \tfrac12 m^2\phi^2 - (\lambda/4!)\phi^4]: + \delta\mathcal{L}$, where $m$
is the physical mass of the particle and $\lambda$ a physical on-shell scattering amplitude at the
symmetric point in momentum space. The interaction then contains "counterterms,"
7 Strictly speaking we should insert a regulator. As long as this is done in a rotation-invariant manner, the
conclusion that only scalars contribute is valid.
8 There is also $\phi$ itself, but we can use the symmetry $\phi \to \phi + a$ of the free theory to eliminate this in favor
of the other three powers.
9 We could actually omit the normal ordering here. The normal ordering of the quadratic operator
shifts the vacuum energy. That of the quartic term also shifts the mass renormalization condition.
Figure 9.4. s-, t- and u-channel diagrams for the one-loop four-point function.
which are used to cancel out loop corrections to the physical mass and coupling and
the normalization of the single-particle matrix element of the field. One can define
an analogous counterterm method for any choice of definition of the renormalized
parameters.
Now let us consider loop corrections to the quantum action. Since we are using a
normal ordered interaction, there is no one-loop correction to the 1PI two-point function.
For $n > 5$ the one-loop correction to the $n$-point function is cut-off-independent
(the loop integrals are finite). The one-loop, 1PI Euclidean four-point function is given
by a sum of three diagrams (Figure 9.4).
They all have the form
$$\frac{1}{2}\lambda_0^2\int\frac{d^4p}{(2\pi)^4}\,\frac{1}{(p^2+m^2)\,[(p-q)^2+m^2]},$$
where $q = p_1 + p_2$, $p_1 - p_3$, $p_1 - p_4$, in the three different diagrams. The combinatoric
factor $\frac12$ comes from the symmetry of the two internal lines in the diagram.
To this order, the 1PI two-point function of $\phi$ is $Z^{-1}(p^2 + m_0^2)$. Since it is independent
of $\lambda_0$ it must reproduce the physics of the free theory. We identify $Z = 1$ and $m_0^2 = m^2$,
the physical mass of the free particle. Thus, any divergence in the four-point function is
related to the cut-off dependence of $\lambda_0$. In order to get explicit expressions, we have to
decide how we are going to cut off the integral. The general theory of the RG assures us
that we can do this any way we like. The functional form of the bare parameters in terms
of renormalized parameters and the cut-off will depend on the cut-off procedure, as well
as on the definition of the renormalized parameters (usually called the renormalization
scheme).
We will use the dimensional regularization scheme. Rewrite the propagators using
the Schwinger proper-time trick,
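$$\frac{1}{p^2+m^2} = \int_0^\infty ds\; e^{-s(p^2+m^2)},$$
and then do the Gaussian momentum integrals, using the formula
$$\int\frac{d^dp}{(2\pi)^d}\; e^{-s_1(p^2+m^2)-s_2[(p-q)^2+m^2]} = \frac{1}{(4\pi s)^{d/2}}\; e^{-s[m^2+x(1-x)q^2]}.$$
We have arrived at this formula by making the change of variables $s_1 = (1 - x)s$, $s_2 = xs$,
where $s_1$ is the Schwinger parameter for $(p^2 + m^2)^{-1}$ and $s_2$ the parameter for the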
propagator that depends on external momenta. Note that, at this point, the space-time
dimension d is just a parameter, and can take on arbitrary complex values. We will see
that the integrals actually define analytic functions of $d$, with poles at $d = 4$. Assuming
that this is so, we can simplify things by rescaling the $s$ variable to make the argument
of the exponent simply $s$. We obtain
$$\frac{\lambda_0^2}{2}\,\frac{1}{(4\pi)^{d/2}}\int_0^1 dx\,\left[m^2 + x(1-x)q^2\right]^{d/2-2}\int_0^\infty ds\; s^{1-d/2}\,e^{-s}.$$
The $s$ integral now gives $\Gamma(2 - d/2)$, which, as we claimed, has a pole at $d = 4$. Note
that, if we take derivatives w.r.t. $q^2$ or $m^2$, we bring down a power of $(d - 4)$, which
cancels out the pole. Thus, the divergence resides only in the constant term.
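To make the pole structure explicit, here is a short worked expansion near four dimensions. It is a sketch under stated assumptions: the shorthand $\epsilon \equiv 4 - d$ and $M^2(x) \equiv m^2 + x(1-x)q^2$ are introduced only for this block, $\gamma_E$ is the Euler constant, and the dimensionful argument of the logarithm is left as it stands (the units are restored once a renormalization scale is introduced).

```latex
% Sketch: expansion of the one-loop integral near d = 4, with eps = 4 - d (assumed notation).
\begin{align}
  \Gamma\!\left(\tfrac{\epsilon}{2}\right)
  &= \frac{2}{\epsilon} - \gamma_E + O(\epsilon), \\
  \frac{\Gamma(2 - d/2)}{(4\pi)^{d/2}}\,\bigl[M^2(x)\bigr]^{d/2-2}
  &= \frac{1}{16\pi^2}\left[\frac{2}{\epsilon} - \gamma_E + \ln 4\pi - \ln M^2(x) + O(\epsilon)\right], \\
  \frac{\lambda_0^2}{2}\int_0^1 dx\,\frac{\Gamma(2 - d/2)}{(4\pi)^{d/2}}\,\bigl[M^2(x)\bigr]^{d/2-2}
  &= \frac{\lambda_0^2}{32\pi^2}\left[\frac{2}{\epsilon} - \gamma_E + \ln 4\pi
     - \int_0^1 dx\,\ln M^2(x)\right] + O(\epsilon).
\end{align}
```

The pole term is independent of $q^2$ and $m^2$, so all of the momentum dependence sits in the finite logarithm. Summing the three channels gives a total pole of $3\lambda_0^2/(16\pi^2\epsilon)$, which matches the coefficient $3/(16\pi^2)$ quoted for the leading term of the $\beta$ function later in the chapter.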
The tree-level 1PI four-point function is just $\lambda_0$, so, if we write $\lambda_0 = \lambda + \lambda^2 f(d)$ where
$\lambda$ is finite, and $f$ has a pole which cancels out the pole in the one-loop computation,
then we will get a finite answer as a function of $\lambda$.
We continue to discuss the renormalization program by turning to the two-loop,
two-point function. 10 There is one graph, Figure 9.5, with a symmetry factor 1/3!.
It can be evaluated by introducing three Schwinger parameters,
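$$\frac{\lambda_0^2}{3!\,(2\pi)^{2d}}\int_0^\infty ds_1\,ds_2\,ds_3\int d^dp_1\,d^dp_2\left[e^{-s_1(p_1^2+m_0^2)}\,e^{-s_2(p_2^2+m_0^2)}\,e^{-s_3((q-p_1-p_2)^2+m_0^2)}\right].$$
We do the momentum integrals using the Gaussian integral formula
$$\int d^dp\; e^{-a p^2 + 2 b\cdot p} = \left(\frac{\pi}{a}\right)^{d/2} e^{b^2/a}.$$
To display the result, I will write the integrand as it appears after each Gaussian integral.
We will change variables to $s_i = s x_i$, $\sum x_i = 1$, with $ds_1\,ds_2\,ds_3 = s^2\,ds\,dx_1\,dx_2\,dx_3\,\delta(1 - \sum x_i)$.
I will also omit the prefactor $\lambda_0^2/[(2\pi)^{2d}\,3!]$. After the $p_1$ integration we obtain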
Figure 9.5. Two-loop mass and wave-function renormalization.
10 We will not calculate the two-loop 1PI four-point function, which should be done to complete the
renormalization program. The interested reader should try it, and will see that the calculation is quite [...]
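After the $p_2$ integral, this simplifies to
$$\frac{\pi^d}{s^d\,(x_1x_2+x_2x_3+x_1x_3)^{d/2}}\,\exp\left[-s\left(q^2\,\frac{x_1x_2x_3}{x_1x_2+x_2x_3+x_1x_3}+m_0^2\right)\right].$$
Next, we do the $s$ integral, which produces a factor of
$$\Gamma(3-d)\left(q^2\,\frac{x_1x_2x_3}{x_1x_2+x_2x_3+x_1x_3}+m_0^2\right)^{d-3}.$$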
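For readers who want to check the two Gaussian integrations, the steps can be sketched as follows. This is a worked reconstruction under stated assumptions rather than the text's own display: the shorthand $k \equiv q - p_2$, $A \equiv s_1 s_3/(s_1 + s_3)$, and $\Sigma \equiv s_1s_2 + s_2s_3 + s_1s_3$ is introduced here, and the overall prefactor $\lambda_0^2/[(2\pi)^{2d}\,3!]$ is omitted.

```latex
% Sketch: completing the square twice in the two-loop Schwinger-parameter integrand.
\begin{align}
  s_1 p_1^2 + s_3 (k - p_1)^2
  &= (s_1 + s_3)\left(p_1 - \frac{s_3\,k}{s_1 + s_3}\right)^{2} + A\,k^2 , \\
  \int d^dp_1\; e^{-s_1 p_1^2 - s_3 (k - p_1)^2}
  &= \left(\frac{\pi}{s_1 + s_3}\right)^{d/2} e^{-A\,(q - p_2)^2} , \\
  s_2 p_2^2 + A\,(q - p_2)^2
  &= (s_2 + A)\left(p_2 - \frac{A\,q}{s_2 + A}\right)^{2} + \frac{s_2 A}{s_2 + A}\,q^2 , \\
  (s_1 + s_3)(s_2 + A) = \Sigma , &\qquad \frac{s_2 A}{s_2 + A} = \frac{s_1 s_2 s_3}{\Sigma} , \\
  \int d^dp_1\, d^dp_2\; e^{-s_1 p_1^2 - s_2 p_2^2 - s_3 (q - p_1 - p_2)^2}
  &= \frac{\pi^{d}}{\Sigma^{d/2}}\; \exp\left[-\,q^2\,\frac{s_1 s_2 s_3}{\Sigma}\right] .
\end{align}
```

Restoring the mass factor $e^{-(s_1+s_2+s_3)m_0^2}$ and substituting $s_i = s\,x_i$ gives $\Sigma = s^2(x_1x_2 + x_2x_3 + x_1x_3)$, so the remaining $s$ integral has the measure $s^{2-d}\,ds$ and produces the factor $\Gamma(3-d)$.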
The only divergence we can get in the $x_i$ integrals comes at points where two of the variables
approach 0 while the other is forced to be 1 by the delta function. Near such limits,
the integrand behaves like $dx\,dy\,W/(x + y)^2$, where $x$, $y$ are the two variables which go
to zero. The coefficient $W$ is independent of $q^2$ because the coefficient which multiplies
$q^2$ vanishes at the dangerous points. Thus, the answer contains single and double poles
at $d = 4$, which are momentum-independent, and a single pole proportional to $q^2$.
The divergent part of the effective action coming from this diagram is given by
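$$\Gamma_{\rm div} = \frac{\lambda_0^2}{3!\,(4\pi)^d}\,\Gamma(3-d)\int d^4x\left[(\partial_\mu\phi)^2 I_1 + m_0^2\phi^2 I_2\right].$$
$I_{1,2}$ are given by the following parametric integrals:
$$I_1 = \int_0^1 dy\int_0^y dx\,\frac{x(y-x)(1-y)}{[x(y-x)+y(1-y)]^{1+d/2}},\qquad I_2 = \int_0^1 dy\int_0^y dx\,\frac{1}{[x(y-x)+y(1-y)]^{d/2}}.$$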
$I_1$ is finite, while $I_2$ has a single pole at $d = 4$. The minimal subtraction (MS) renormalization
scheme is defined by keeping only the pole terms in the relation between
renormalized and bare fields and parameters. In defining the divergent part of the above
integrals in the MS scheme, we keep both the pole and the finite correction in $I_2$, and
throw away terms of order $d - 4$. The residues of all poles in $\Gamma_{\rm div}$ are given by integrals
we can evaluate in terms of elementary functions, but the calculation is a bit tedious.
We have demonstrated that all the divergent terms in the effective action correspond
to one of the three operators $:(\partial_\mu\phi)^2:$, $:\phi^2:$, $:\phi^4:$, and therefore expect that cut-off
dependence of the coefficients of these terms in the bare Lagrangian will render the
Green functions independent of the cut-off in the limit that it goes to infinity. That is,
a Lagrangian of the form
$$\mathcal{L} = :\left[\tfrac12(\partial_\mu\phi_0)^2 - \tfrac12 m_0^2\phi_0^2 - \tfrac{\lambda_0}{4!}\phi_0^4\right]:,$$
with appropriate cut-off dependence in $m_0$ and $\lambda_0$, will give finite Green functions for the
field $\phi = Z^{-1/2}\phi_0$, with cut-off-dependent $Z$. The renormalization procedure consists
of tuning the parameters in such a way that the rescaled field has cut-off-independent
Green functions.
Our calculations have shown that
r di vW>] = ^oO^o) 2 + 2/^o4>o + ^o</
where
X 2 2m 2
■d)Iu
3\(4jr) d /x 2 ' l67t 2 (d-4)'
We note that, in $d$ dimensions, the coupling $\lambda_0$ has dimensions (mass)$^{4-d}$. To eliminate
the cut-off dependence in the quantum action, we define $\lambda_0 = \mu^{4-d}(\lambda + A\lambda^2)$,
$m_0^2 = m^2 + B\mu^2\lambda^2$, and $\phi_0 = \sqrt{Z}\phi$, with $Z = 1 + D\lambda^2$. The parameters $\lambda$ (which is taken
to be dimensionless) and $m^2$ are held fixed as we take away the cut-off $d \to 4$, and the
$\phi$ field should have a finite quantum action. In these formulae, we have introduced a
parameter $\mu$, called the renormalization scale in dimensional regularization. It is necessary
to rewrite functions of the dimensional constant $\lambda_0$ in terms of the dimensionless
parameter $\lambda$.
The relevant and marginal terms in the quantum action are
$$\frac{1}{2}(1 + a\lambda_0^2)(\partial_\mu\phi_0)^2 + \frac{1}{2}(m_0^2 + b\mu^2\lambda_0^2)\phi_0^2 + \frac{\lambda_0}{4!}(1 + c\lambda_0)\phi_0^4.$$
We write this in terms of renormalized fields and parameters, dropping terms of higher
order than $\lambda^2$. The coefficient of the kinetic term is $(1 + D\lambda^2)(1 + a\lambda^2) \approx 1 + (D + a)\lambda^2$.
That of the mass term is
$$(1 + D\lambda^2)(m^2 + B\mu^2\lambda^2 + b\mu^2\lambda^2) \approx m^2 + \lambda^2(Dm^2 + (B + b)\mu^2),$$
while the quartic term has coefficient
$$\frac{\lambda}{4!}\left(1 + (A + 2D + c)\lambda\right).$$
We first choose $D + a$ to have a cut-off-independent limit. Then $B$ can be chosen so
that $Dm^2 + (B + b)\mu^2$ has a cut-off-independent limit. Finally, $A$ is chosen so that the
renormalized quartic coupling is finite.
Note that this procedure does not introduce cut-off dependence into the $o(\lambda^2)$ finite
part of the 1PI action. $\Gamma_{\rm finite}$ is initially written in terms of the bare parameters and
bare fields, and is of order $\lambda_0^2 \sim \lambda^2$. When written in terms of renormalized fields and
parameters, it has divergent pieces, but they are of order $\lambda^3$ and higher. The basic claim
of renormalization theory is that these terms cancel out exactly against sub-divergences
in higher-order Feynman graphs. The combinatorics of this appears daunting, but a
proof that it works was carried out by a number of authors. Our Wilsonian approach
through the exact RGE assures us that it must work.
It is worthwhile pausing here to remark on the precise connection between our dis-
cussion of the exact Wilsonian RGE and dimensional regularization. In the context of
the RGE we have remarked that we can easily replace one cut-off by another, changing
the kinetic function K(p) or going over to a lattice cut-off, without changing the qual-
itative nature of the RG flow, its fixed points, or the dimensions of operators at those
points. Dimensional regularization is just another way of cutting off the divergences,
albeit one that is defined only perturbatively. We should expect to be able to perform
renormalized computations near the Gaussian fixed point using DR, with couplings
that are related to those of the Wilsonian RGE by a finite redefinition
$$\lambda_{\rm DR} = \lambda_{\rm W} + \sum_n c_n\,\lambda_{\rm W}^n .$$
Returning to our computation, we note that requiring the quantum action to be finite
does not determine the parameters A,B,D uniquely. Each choice of this finite ambi-
guity determines another way of parametrizing the renormalized field theory, called
a renormalization scheme. One conceptually simple scheme called on-shell renormalization
chooses the ambiguity so that $m$ is exactly the position of the zero in the 1PI
two-point function, and the vacuum-to-one-particle matrix element of the renormalized
field is 1; $-i\lambda$ is then chosen to be the value of the invariant $2 \to 2$ scattering amplitude
at the symmetric point in momentum space, $s = 4m^2$, $t = u = 0$.
This is not always the most convenient scheme. It requires us to be able to find
the masses of particles in perturbation theory. In QCD and other asymptotically free
gauge theories this is impossible. It also ties our renormalization scheme to quantities
in Minkowski space. For many purposes, a Euclidean renormalization scheme, which
is not directly tied to physical masses, is better. For perturbative calculations, the most
efficient scheme is the so-called $\overline{\rm MS}$ or modified minimal subtraction scheme. This
exploits the way in which dimensional regularization isolates cut-off dependence in
terms of poles in analytic functions. The minimal subtraction scheme is defined by
letting A,B,D, and analogous higher-order coefficients have precisely the poles in d — 4
that they need in order to give finite Green functions. No finite parts are admitted in the
definition of these coefficients. It turns out that this prescription leads to a proliferation
of transcendental numbers (related to the derivative of the Euler gamma function at
$t = 1$) in formulae for finite amplitudes in terms of the MS parameters. The $\overline{\rm MS}$ scheme
allows specific finite parts in $A$, $B$, $D$, etc., which get rid of these transcendentals. The
reader is urged to consult the book by Peskin and Schroeder [33] for many examples
of computations in the $\overline{\rm MS}$ scheme.
9.7 Renormalization-group equations in dimensional
regularization
We have seen that one of the peculiarities of dimensional regularization is that it introduces
an extra dimensionful parameter $\mu$ apart from the coefficient of the quadratic
term in the action. We know that this is a good thing, and that it is appropriate to
introduce such a parameter, the renormalization scale, in any regularization scheme. The
renormalization scale serves as a surrogate for the cut-off in the renormalized theory. If
we recall our discussion of the exact RGE, general QFTs are defined by unstable flows
near a fixed point, and a renormalization scale was introduced to mark the momentum
scale at which the behavior of a given QFT deviates from its scale-invariant progenitor.
We remarked that it was always possible to exchange the renormalization scale $\mu$ for
(a power of) the coefficient of one of the relevant operators in the theory. It is possible,
but not necessary. We can, instead, over-parametrize the theory in terms of $\mu$ and all
of the $G^I$. Then we expect an equation telling us how the $G^I$ must change with $\mu$ in
order to describe the same physical situation. This is the remnant of the RG flow near
the fixed point, and is also called the RGE of the renormalized theory.
To derive the RGE using dimensional regularization, we note that a bare 1PI
Green function (we suppress the momentum dependence of the Green function for
the moment) satisfies
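$$\mu\frac{\partial}{\partial\mu}\,\Gamma^0_n(m_0^2, \lambda_0, \mu) = 0,$$
$$\Gamma^0_n(m_0^2, \lambda_0, \mu) = Z^{-n/2}\,\Gamma_n(m^2, \lambda, \mu).$$
The first of these equations expresses the fact that $\mu$ does not appear in the bare Green
functions, written in terms of $\lambda_0$. We now apply the first equation to the second, and
obtain
$$\left[\mu\frac{\partial}{\partial\mu} + \beta\frac{\partial}{\partial\lambda} + \beta_{m^2}\frac{\partial}{\partial m^2} - \frac{n}{2}\gamma\right]\Gamma_n(m^2, \lambda, \mu) = 0.$$
In this equation, the $\mu$ derivative is taken at fixed $\lambda$ and $m^2$, and we have defined
$$\gamma = \mu\,\partial_\mu(\ln Z),\qquad \beta = (\mu\,\partial_\mu)\lambda,\qquad \beta_{m^2} = (\mu\,\partial_\mu)m^2,$$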
where the derivatives are taken at fixed $\lambda_0$, $m_0$. These coefficient functions are finite
functions of the renormalized parameters in any renormalization scheme. Finiteness
of $\Gamma_n$ implies that the $\mu$-dependent piece of $\ln Z$, $\lambda$, and $m^2$ is finite. Since $\mu$ always
appears raised to the power $k(d - 4)$ in $k$th order in the bare perturbation expansion,
the single-pole parts of the expressions for $\lambda$, $m^2$, and $Z$ in terms of bare parameters
are not $\mu$-dependent. Furthermore, the multiple poles must all cancel out in the scaling
derivatives, which define $\beta$, $\beta_{m^2}$, and $\gamma$, since these are all finite. In the MS and
$\overline{\rm MS}$ schemes, the renormalization constants have a $\mu$-independent finite part. Thus,
in these schemes, $\beta$, $\beta_{m^2}$, and $\gamma$ are $\mu$-independent and come only from the pole part
of their definition. By dimensional analysis, we must have $\beta = \beta(\lambda)$, $\gamma = \gamma(\lambda)$, and
$$\beta_{m^2} = m^2\,\gamma_{m^2}(\lambda).$$
We can solve these equations in the following way. In words, the equation says that
if we rescale $\mu$ we get a term proportional to $\Gamma_n$, as if $\Gamma_n$ were just rescaling, plus other
terms which can be compensated for by varying the couplings. This corresponds to the
idea that the renormalization scale is playing the role that the cut-off played in the exact
RGE. Near the fixed point, a change in energy scale is compensated for by a change of
scale of the field and a change in the marginal and relevant couplings. So we make an
Ansatz:
$$\Gamma_n(e^{-s}\mu, \lambda, m^2) = e^{F(s)}\,\Gamma_n(\mu, \lambda(s), m^2(s)).$$
On plugging this in, and comparing with the RGE, we find that any function satisfying
this relation will be a solution of the RGE if
$$\frac{dF}{ds} = -\frac{n}{2}\,\gamma(\lambda(s)),\qquad \frac{d\lambda}{ds} = \beta(\lambda(s)),$$
and
$$\frac{dm^2}{ds} = m^2\,\gamma_{m^2}(\lambda(s)).$$
The boundary conditions on these equations are
$$F(0) = 0,\qquad \lambda(0) = \lambda,\qquad m^2(0) = m^2.$$
So the solution is
$$\Gamma_n(e^{-s}\mu, \lambda, m^2) = e^{-\frac{n}{2}\int_0^s du\,\gamma(\lambda(u))}\;\Gamma_n\!\left(\mu, \lambda(s), m^2\,e^{\int_0^s du\,\gamma_{m^2}(\lambda(u))}\right).$$
Now consider the momentum dependence of $\Gamma_n$. In momentum space (and including
the delta function of momentum conservation) $\Gamma_n$ has engineering dimension $-n$ in
mass units. Thus it is of the form $|p|^{-n}$ times a function that depends on ratios of the
dimensionful quantities,
$$\Gamma_n(\mu, \lambda, m^2, e^t p_i) = e^{-nt}\,\Gamma_n(e^{-t}\mu, \lambda, e^{-2t}m^2, p_i).$$
We now use the solution of the RGE to rewrite this as
$$\Gamma_n(\mu, \lambda, m^2, e^t p_i) = e^{-nt}\,e^{-\frac{n}{2}\int_0^t du\,\gamma(\lambda(u))}\;\Gamma_n\!\left(\mu, \lambda(t), e^{-2t}\,e^{\int_0^t du\,\gamma_{m^2}(\lambda(u))}\,m^2, p_i\right).$$
We can interpret this equation as saying that, as we change the momentum scale,
the naively dimensionless coupling $\lambda$ "runs," while the renormalized field and the
mass acquire $\lambda(t)$-dependent corrections to their dimensions. You should understand
that engineering dimensions never change, and that engineering-dimensional analysis
is always valid. However, the behavior of the theory under rescaling of momentum
depends on dimensionless ratios of momenta to the renormalization scale, and can
therefore be changed by quantum corrections.
The fact that $\beta(\lambda) \neq 0$ means that our classical assessment of $\lambda$ as a marginal
parameter is wrong. It does flow along RG trajectories, and its behavior near the
free-field fixed point is determined by the leading term $\beta = b_0\lambda^2 + o(\lambda^3)$. In this
approximation, the RGE is a linear equation for $1/\lambda$,
$$\frac{d}{dt}\left(\frac{1}{\lambda}\right) = -b_0,$$
whose solution is
$$\lambda(t) = \frac{\lambda}{1 - b_0\lambda t}.$$
We see that, if $b_0$ is positive (recall that $b_0 = 3/(16\pi^2)$), then, as momentum gets larger,
$\lambda(t)$ increases, whereas as it gets smaller the coupling decreases. Such a theory is called
infrared-free because it is well approximated by a perturbation expansion in $\lambda(t)$ at low
momentum, even if $\lambda$ is not small.
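The running described above is easy to check numerically. The following is a minimal sketch, assuming only the one-loop form $\beta = b_0\lambda^2$ with $b_0 = 3/(16\pi^2)$; the function names `running_coupling` and `integrate_rk4` are hypothetical, and the Runge-Kutta integrator is just one convenient way to confirm the closed-form solution $\lambda(t) = \lambda/(1 - b_0\lambda t)$.

```python
import math

B0 = 3.0 / (16.0 * math.pi ** 2)  # one-loop coefficient b_0 quoted in the text

def beta(lam):
    """One-loop beta function, beta = b_0 * lambda**2 (higher orders dropped)."""
    return B0 * lam ** 2

def running_coupling(lam, t):
    """Closed-form one-loop solution lambda(t) = lambda / (1 - b_0 * lambda * t)."""
    denom = 1.0 - B0 * lam * t
    if denom <= 0.0:
        raise ValueError("t is at or beyond the one-loop singularity")
    return lam / denom

def integrate_rk4(lam, t, steps=1000):
    """Integrate d(lambda)/dt = beta(lambda) from 0 to t with classical RK4."""
    h = t / steps
    x = lam
    for _ in range(steps):
        k1 = beta(x)
        k2 = beta(x + 0.5 * h * k1)
        k3 = beta(x + 0.5 * h * k2)
        k4 = beta(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x

if __name__ == "__main__":
    for t in (-10.0, 0.0, 10.0):
        print(t, running_coupling(0.5, t), integrate_rk4(0.5, t))
```

Positive `t` corresponds to larger momenta, where the coupling grows, and negative `t` to the infrared, where it shrinks. The closed form stops being meaningful where its denominator vanishes, which is why the sketch raises an error there instead of returning a value.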
But there is something confusing and sick here. Remember, we defined a QFT by
tuning relevant parameters to zero as we took the cut-off to infinity. The RG trajectory
approaches very close to the scale-invariant fixed-point theory (which in our case is
massless free-field theory) and then runs away from it at the renormalization scale $\mu$,
which is tuned to remain finite as the cut-off goes to infinity. If we look back along
the RG trajectory, from the scale $\mu$ to the UV, we can see only the fixed point. So a
renormalized theory defined by perturbing the massless free fixed point must behave like
the free-field theory at very high momenta. We have found the renormalized coupling
getting strong in this limit, which is a contradiction.
The clue to this behavior is infrared freedom. $\lambda(t)$ is getting weak in the IR. In other
words, it is behaving like an irrelevant (marginally irrelevant) coupling. Its value in
the renormalized theory should be fixed by the other couplings. But the only other
coupling is the mass. We can solve the massive free-field theory exactly, and we find no
connected four-point function. Therefore the renormalized $\phi^4$ coupling is zero (this is
often referred to as triviality of $\phi^4$ theory).
Why then were we able to find a finite perturbation expansion to all orders in the
renormalized coupling $\lambda$? The clue lies in noting that the place where the coupling
gets strong in the UV (it blows up there but we can no longer trust the perturbation
expansion at the singularity) is at $t$ of order $1/\lambda$. If $\lambda$ is small this is a momentum
scale exponentially higher than the mass of the particles in the theory $o(m^2)$. So we
can put a cut-off in the theory at an exponentially large scale, and get perfectly predictive
physics by ignoring the cut-off, up to terms of order $e^{-c/\lambda}$. In other words,
theories with marginally irrelevant parameters are almost as good as mathematically
well-defined QFTs, as long as the renormalized values of the marginally irrelevant
couplings are small enough to be in the perturbative regime. The point where the pertur-
bative coupling blows up is called the Landau pole. As long as one uses a renormalized
perturbation expansion at energies way below the Landau pole, one is on safe ground.
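To get a feeling for how far away the Landau pole is, one can evaluate the one-loop estimate $t_{\rm pole} = 1/(b_0\lambda)$ for a few values of the coupling. This is a rough sketch that assumes the one-loop $\beta$ function with $b_0 = 3/(16\pi^2)$ can be trusted all the way up, which the text itself cautions against near the singularity; the function names are hypothetical.

```python
import math

B0 = 3.0 / (16.0 * math.pi ** 2)  # one-loop coefficient b_0 quoted in the text

def landau_pole_t(lam):
    """Logarithmic scale t = ln(E/mu) at which the one-loop coupling diverges."""
    if lam <= 0.0:
        raise ValueError("the estimate assumes a positive renormalized coupling")
    return 1.0 / (B0 * lam)

def decades_to_landau_pole(lam):
    """Number of powers of ten separating the Landau pole from the scale mu."""
    return landau_pole_t(lam) / math.log(10.0)

if __name__ == "__main__":
    for lam in (1.0, 0.5, 0.1):
        print(lam, landau_pole_t(lam), decades_to_landau_pole(lam))
```

Working with the logarithm of the scale, not the scale itself, keeps the numbers manageable: halving the coupling doubles the logarithmic distance to the pole, so for small couplings the pole sits exponentially far above the renormalization scale.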
9.8 Renormalization of QED at one loop
As a further example, we will treat the renormalization of quantum electrodynamics
at one loop. Power counting at the Gaussian fixed point for electrodynamics in any
covariant gauge suggests that the interaction $A_\mu J^\mu$ is marginal when $J^\mu$ is taken to be
the current of either spin-zero or spin-$\frac12$ particles. The additional $A_\mu^2\phi^*\phi$ interaction
necessary to the gauge-invariant Lagrangian for charged spin-zero particles is also
marginal. The scalar- and fermion-mass terms are relevant, and the only other gauge-invariant
marginal operator is a quartic coupling $(\phi^*\phi)^2$. If we have multiple scalars
and fermions the only additional marginal couplings for generic values of the charges
of different fields would be of the form $\lambda_{ij}(\phi_i^*\phi_i)(\phi_j^*\phi_j)$. For special values of the charges
we could also have Yukawa couplings, cubic couplings of the scalars, or both. However,
even when the charges allow these couplings, we can invoke a $\phi_i \to -\phi_i$ symmetry to
forbid them.
Wilson has emphasized that general renormalization theory does not require us
to use a cut-off that preserves symmetries or gauge invariances (redundancies) of the
classical Lagrangian, at least as long as the Gaussian fixed-point theory has only a finite
number of relevant and marginal perturbations. 11 We simply introduce any cut-off we
like and then tune the relevant and marginal parameters to make the quantum theory
satisfy the quantum Ward identities. This could fail only if there were violations of
the Ward identities, which cannot be removed by adding a local counterterm to the
action. Schwinger-Adler-Bell-Jackiw anomalies, which we studied in Chapter 8, are
the unique example of the latter phenomenon.
Nonetheless, retaining symmetries at every step greatly simplifies the renormalization
process. For simplicity then, we will insist on using a gauge-invariant regulator, so that
the most general marginal and relevant local Euclidean Lagrangian has the form
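$$\mathcal{L} = \frac{Z_3}{4}F_{\mu\nu}^2 + \frac{1}{2\kappa_0}(\partial_\mu A_\mu)^2 + Z_S\left(|D_\mu\phi|^2 + \mu_0^2|\phi|^2\right) + Z_2\,\bar\psi\,(i\gamma^\mu D_\mu - m_0)\,\psi + \frac{e_0^2 v}{2}(\phi^*\phi)^2 .$$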
For any charged field with charge $q$, the covariant derivative is $D_\mu = \partial_\mu - i e_0 q A_\mu$.
We have chosen the loop-counting parameter $e_0$ to be such that $e_0^2/(4\pi)$ is the bare
fine-structure constant, and are already working with fields that, at tree level, have
canonically normalized kinetic terms. In tree approximation, all the $Z$ factors are equal
to 1, particle masses are given by $m = m_0$, $\mu = \mu_0$, and the quartic coupling is $e_0^2 v/2$.
General renormalization theory leads us to believe that we can eliminate all cut-off
dependence in physical amplitudes by tuning the Z factors and the bare parameters
(those subscripted with a zero) as the cut-off is taken to infinity. We will show
below, to one-loop order and for a single spinor field, that the divergent part of the
quantum action, $\Gamma[A,\psi,\phi]$, has precisely the form given above, so that the
divergences can be eliminated as claimed. The generalization
to the case of multiple charged scalar and spinor fields should be a straightforward
exercise for the diligent reader.
For photons, there is a gauge-invariant generalization of the sort of regularization
procedure we introduced for scalars:
$$F_{\mu\nu}^2 \to F_{\mu\nu}\,K(\partial^2/\Lambda^2)\,F_{\mu\nu}.$$
However, if we attempt to generalize this to charged fields, retaining gauge invariance,
we introduce an infinite number of apparently irrelevant interactions. There are only
three known classes of gauge-invariant regularization prescription. The first is lattice
gauge theory, but we will insist on retaining Lorentz invariance. The second class uses
11 Which include all the interactions in the classical Lagrangian. It is the failure of this condition that makes
spontaneously broken non-abelian gauge theories non-renormalizable in the unitary gauge.
the fact that charged fields appear only quadratically in the QED Lagrangian. 12 We
can formally integrate them out, obtaining a power of a functional determinant of a
gauge-covariant operator $D^2$ or $\gamma^\mu D_\mu$. There are many ways to regulate the product
over the gauge-invariant eigenvalues of this operator. For example, Pauli and Villars
pointed out that
$$\prod_i \det\left(\mathcal{D} + \mu_i^{p_i}\right)^{\epsilon_i}$$
is finite for finite values of $\mu_i$, and appropriate choices of $\epsilon_i = \pm 1$. Here $\mathcal{D} = -D^2$
for spin zero (where $p_i = 2$) and $i\gamma^\mu D_\mu$ for spin 1/2 (where $p_i = 1$ and $\mu_i = iM_i$). We need only a
finite number of extra Pauli-Villars fields, but some of them must have wrong statistics
($\epsilon_i = 1$ for spin zero or $-1$ for spin 1/2). The masses of the extra fields act as regulators,
and are taken to infinity at the end of the calculation.
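The way a heavy wrong-sign regulator tames a divergence can be seen in a toy radial integral. The sketch below is only an illustration and not the determinant formula above: the integrand $k^3/(k^2+m^2)^2$, the single regulator mass, and all function names are assumptions chosen for the example.

```python
import math


def log_divergent_integral(mass, cutoff):
    """Integral of k^3 / (k^2 + mass^2)^2 over 0 <= k <= cutoff, in closed form.

    For large cutoff this grows like log(cutoff), a stand-in for a
    logarithmically divergent loop integral.
    """
    u = cutoff ** 2 + mass ** 2
    return 0.5 * (math.log(u / mass ** 2) + mass ** 2 / u - 1.0)


def regulated_integral(mass, regulator_mass, cutoff):
    """Same integral with a heavy wrong-sign regulator contribution subtracted."""
    return (log_divergent_integral(mass, cutoff)
            - log_divergent_integral(regulator_mass, cutoff))


if __name__ == "__main__":
    m, M = 1.0, 100.0  # illustrative masses (assumptions)
    for cutoff in (1e3, 1e4, 1e6):
        print(cutoff, log_divergent_integral(m, cutoff), regulated_integral(m, M, cutoff))
    print("cutoff-independent limit:", 0.5 * math.log(M ** 2 / m ** 2))
```

The unregulated value keeps growing with the cut-off, while the difference settles at $\tfrac12\ln(M^2/m^2)$, so the regulator mass $M$ takes over the role of the cut-off.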
Schwinger's gauge-invariant proper-time formula uses the observation that
$$\ln\det\left[\frac{-D^2 + m^2}{-\partial^2 + m^2}\right] = -\int_0^\infty \frac{dt}{t}\,\mathrm{Tr}\left[e^{-t(-D^2 + m^2)} - e^{-t(-\partial^2 + m^2)}\right].$$
If we cut the lower limit of the $t$ integral off at $t_0$ we suppress the contribution of
large eigenvalues of $-D^2$ and get a finite, gauge-invariant functional of $A_\mu$. The parameter $t_0$ has
dimensions of inverse mass squared and defines a cut-off. In order to use this formula
in spinor electrodynamics, we must recognize that the determinant of the Euclidean
Dirac operator is formally positive and independent of the sign of the fermion mass, and
so is equal to the positive square root of the determinant of $(-D^2 + m^2 + \sigma^{\mu\nu}F_{\mu\nu}/2)$.
We can then apply proper-time regularization to the latter quantity.
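For a single pair of eigenvalues the proper-time identity reduces to $\ln(a/b) = -\int_0^\infty (dt/t)\,(e^{-ta} - e^{-tb})$, which is easy to check numerically. The following is a minimal sketch under that one-eigenvalue assumption; the function name, the substitution $u = \ln t$, and the integration limits are choices made for the illustration.

```python
import math


def proper_time_log_ratio(a, b, t0=0.0, steps=100000):
    """Approximate -int_{t0}^inf dt/t [exp(-t a) - exp(-t b)] for a, b > 0.

    The trapezoid rule is applied in u = ln(t), where dt/t = du.
    With t0 = 0 the result should approach ln(a / b).
    """
    u_lo = math.log(t0) if t0 > 0.0 else -40.0   # exp(-40) is negligible
    u_hi = math.log(60.0 / min(a, b))            # integrand ~ exp(-60) beyond this
    h = (u_hi - u_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = math.exp(u_lo + i * h)
        f = math.exp(-t * a) - math.exp(-t * b)
        total += f if 0 < i < steps else 0.5 * f
    return -h * total


if __name__ == "__main__":
    print(proper_time_log_ratio(3.0, 2.0), math.log(3.0 / 2.0))
    # A cut-off t0 > 0 suppresses the contribution of a very large eigenvalue.
    print(proper_time_log_ratio(1e4, 2.0, t0=0.1), math.log(1e4 / 2.0))
```

With $t_0 > 0$ the large eigenvalue contributes far less than its unregulated logarithm, which is the sense in which $t_0$ acts as a cut-off.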
The combination of either of these two methods with the higher-derivative cut-off
of the photon propagator gives us a finite, gauge-invariant, Lorentz-invariant answer
for all Green functions in spinor and scalar QED. The answer does not obey the
quantum requirement of unitarity until we take the cut-off to infinity. Unfortunately,
neither of these methods generalizes to non-abelian gauge theory. The only known
Lorentz-invariant, gauge-invariant regulator for the non-abelian case is dimensional
regularization (DR), and this is the method we will use. We begin by studying the
wave-function renormalization of the photon.
Figure 9.6 shows the 1PI photon two-point-function graph at one loop in spinor
QED.
Fig. 9.6. Vacuum polarization in QED.
12 We can eliminate the quartic scalar interaction by introducing an auxiliary field $\sigma$: $\frac{e_0^2 v}{2}(\phi^*\phi)^2 \to \frac{\sigma^2}{2} + i e_0\sqrt{v}\,\sigma\,\phi^*\phi$.
The value of the diagram is
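$$\Pi_{\alpha\beta}(p) = e_0^2\int\frac{d^dq}{(2\pi)^d}\,\frac{\mathrm{tr}\left[\gamma_\alpha\left(\gamma^\mu q_\mu - i m_0\right)\gamma_\beta\left(\gamma^\nu (p+q)_\nu - i m_0\right)\right]}{(q^2 + m_0^2)\left[(p+q)^2 + m_0^2\right]}.$$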
Our first task is to show that this formula is transverse:
$$\Pi_{\alpha\beta}(p) = \Pi(p^2)\left(\delta_{\alpha\beta}\,p^2 - p_\alpha p_\beta\right).$$
The less-than-alert reader will have missed the fact that, in our discussion of the gauge-
invariant quantum action above, we did not allow for a one-loop correction to the
gauge-fixing parameter $\kappa_0$. The fact that the 1PI photon propagator is transverse (to
all orders) guarantees this. So we compute
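$$p^\alpha\Pi_{\alpha\beta} = e_0^2\int\frac{d^dq}{(2\pi)^d}\,\frac{\mathrm{tr}\left[p^\alpha\gamma_\alpha\left(\gamma^\mu q_\mu - i m_0\right)\gamma_\beta\left(\gamma^\nu (p+q)_\nu - i m_0\right)\right]}{(q^2 + m_0^2)\left[(p+q)^2 + m_0^2\right]},$$
and use
$$p^\alpha\gamma_\alpha = \left[(p+q)^\alpha\gamma_\alpha - i m_0\right] - \left(q^\alpha\gamma_\alpha - i m_0\right),$$
in order to obtain
$$p^\alpha\Pi_{\alpha\beta} \propto e_0^2\int\frac{d^dq}{(2\pi)^d}\,\mathrm{tr}\left[\gamma_\beta\left(\frac{q^\alpha\gamma_\alpha - i m_0}{q^2 + m_0^2} - \frac{(p+q)^\alpha\gamma_\alpha - i m_0}{(p+q)^2 + m_0^2}\right)\right].$$
DR is defined so that we can shift variables of integration, so $p^\alpha\Pi_{\alpha\beta} = 0$.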
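Now take the trace $\Pi_{\alpha\alpha}$, and use transversality to obtain
$$(d-1)\,p^2\,\Pi(p^2) = e_0^2\int\frac{d^dq}{(2\pi)^d}\,\frac{\mathrm{tr}\left[\gamma^\alpha\left(\gamma^\mu q_\mu - i m_0\right)\gamma_\alpha\left(\gamma^\nu (p+q)_\nu - i m_0\right)\right]}{(q^2 + m_0^2)\left[(p+q)^2 + m_0^2\right]}.$$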
We use the identities
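$$\gamma^\alpha\gamma^\mu q_\mu\gamma_\alpha = (2-d)\,\gamma^\mu q_\mu,$$
$$\gamma^\alpha m_0\gamma_\alpha = d\,m_0,$$
to rewrite this as
$$(d-1)\,p^2\,\Pi(p^2) = e_0^2\,d_{\rm spin}\int\frac{d^dq}{(2\pi)^d}\,\frac{q\cdot(p+q)\,(2-d) - m_0^2\,d}{(q^2 + m_0^2)\left[(p+q)^2 + m_0^2\right]},$$
where $d_{\rm spin}$ is the trace of the unit Dirac matrix.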
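The two contraction identities can be checked explicitly at $d = 4$. The sketch below uses one possible choice of Hermitian Euclidean gamma matrices, built from Pauli matrices by Kronecker products; this representation and all helper names are assumptions of the example, and it says nothing about non-integer $d$.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(c, a):
    return [[c * entry for entry in row] for row in a]


def added(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

# Hermitian matrices obeying {gamma_a, gamma_b} = 2 delta_ab (one possible choice).
GAMMAS = [kron(S2, S1), kron(S2, S2), kron(S2, S3), kron(S1, I2)]
IDENTITY = kron(I2, I2)
D = 4


def contract(m):
    """Return the sum over a of gamma_a m gamma_a."""
    total = [[0] * D for _ in range(D)]
    for g in GAMMAS:
        total = added(total, matmul(matmul(g, m), g))
    return total


if __name__ == "__main__":
    for mu, g in enumerate(GAMMAS):
        print(mu, is_close(contract(g), scaled(2 - D, g)))   # (2 - d) gamma_mu
    print(is_close(contract(IDENTITY), scaled(D, IDENTITY)))  # d times the identity
```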
Our next piece of trickery is
$$q\cdot p = \tfrac{1}{2}\left\{\left[(p+q)^2 + m_0^2\right] - \left[q^2 + m_0^2\right] - p^2\right\}.$$
The first two terms cancel out after integration and shifting of variables. So we obtain
$$(d-1)\,p^2\,\Pi(p^2) = e_0^2\,d_{\rm spin}\int\frac{d^dq}{(2\pi)^d}\,\frac{[(d-2)/2]\,p^2 - d\,m_0^2 + (2-d)\,q^2}{(q^2 + m_0^2)\left[(p+q)^2 + m_0^2\right]}.$$
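To evaluate the momentum integrals, we use Schwinger's parametric formula $1/A =
\int_0^\infty d\alpha\, e^{-\alpha A}$ to get, after shifting $q \to q - \beta p/(\alpha+\beta)$ and dropping terms odd in $q$,
$$(d-1)\,p^2\,\Pi(p^2) = e_0^2\,d_{\rm spin}\int_0^\infty d\alpha\,d\beta\; e^{-(\alpha+\beta)m_0^2}\,e^{-\frac{\alpha\beta}{\alpha+\beta}p^2}\int\frac{d^dq}{(2\pi)^d}\,e^{-(\alpha+\beta)q^2}\left[\frac{d-2}{2}p^2 - d\,m_0^2 + (2-d)\left(q^2 + \frac{\beta^2p^2}{(\alpha+\beta)^2}\right)\right]$$
$$= \frac{e_0^2\,d_{\rm spin}}{(2\sqrt{\pi})^d}\int_0^\infty\frac{d\alpha\,d\beta}{(\alpha+\beta)^{d/2}}\; e^{-(\alpha+\beta)m_0^2}\,e^{-\frac{\alpha\beta}{\alpha+\beta}p^2}\left[\frac{d-2}{2}p^2 - d\,m_0^2 + (2-d)\left(\frac{d}{2(\alpha+\beta)} + \frac{\beta^2p^2}{(\alpha+\beta)^2}\right)\right].$$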
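In order to understand the last equality, the knowledgeable reader will have used the
formula for Euler's gamma function,
$$\Gamma(z) = \int_0^\infty dt\,t^{z-1}e^{-t},$$
as well as the identities
$$\Gamma(t) = (t-1)!$$
if $t$ is an integer $\geq 1$, and
$$\Gamma(z)\,\Gamma(1-z) = \frac{\pi}{\sin(\pi z)},$$
from which it devolves that
$$\Gamma(1/2) = \sqrt{\pi}.$$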
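It is now obvious that we should introduce $s = \alpha + \beta$ and $x = \beta/(\alpha+\beta)$, so
that $d\alpha\,d\beta = s\,ds\,dx$. We obtain the parametric formula
$$(d-1)\,p^2\,\Pi(p^2) = \frac{e_0^2\,d_{\rm spin}}{(2\sqrt{\pi})^d}\int_0^1 dx\int_0^\infty ds\; s^{1-d/2}\,e^{-s\left[m_0^2 + x(1-x)p^2\right]}\left[\frac{d-2}{2}p^2 - d\,m_0^2 + (2-d)\left(\frac{d}{2s} + x^2p^2\right)\right].$$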
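The change of variables from Schwinger to Feynman parameters can be sanity-checked on the simplest case: $\int d\alpha\,d\beta\,e^{-\alpha A - \beta B} = 1/(AB)$ turns, after the $s$ integral, into $\int_0^1 dx\,[(1-x)A + xB]^{-2}$. The sketch below checks that last integral numerically; the function name and the midpoint rule are illustrative choices, and $A$, $B$ stand for two positive denominators.

```python
def feynman_parameter_integral(A, B, steps=20000):
    """Midpoint-rule estimate of int_0^1 dx / [(1 - x) A + x B]^2 for A, B > 0."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += h / ((1.0 - x) * A + x * B) ** 2
    return total


if __name__ == "__main__":
    A, B = 2.0, 5.0  # arbitrary positive test values
    print(feynman_parameter_integral(A, B), 1.0 / (A * B))
```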
Simple power counting gives a quadratic divergence in $\Pi_{\alpha\beta}$, which would be
momentum-independent and correspond to a photon mass term $\Lambda^2 A_\mu^2$. The transverse
formula for $\Pi_{\alpha\beta}$ shows that such a term cannot be there. The only way we could get
something that looks like a photon mass would be from a pole in $\Pi(p^2)$ at $p^2 = 0$.
This would not be UV-divergent. In fact, in more than two space-time dimensions such
a pole does not appear (and even in two dimensions it appears only if $m_0 = 0$). The only way to
get a massive photon is through the Higgs mechanism. In perturbation theory this
can happen only if there is a scalar field with a negative mass squared in the tree-level
Lagrangian. In that case we have to diagonalize the free Lagrangian properly and
we find only massive tree-level excitations. It must be, then, that the formula we have
written above vanishes when $p^2 = 0$. The diligent reader will verify that fact. We thus
obtain
Poles when $d \to 4$ will come from UV-divergent integrals at $d = 4$. Ultraviolet
divergences correspond to the small-$s$ region of integration. It is now obvious that
terms in our formula of order higher than $p^2$ in an expansion about $p = 0$ will not have
divergences in the small-$s$ region. Thus, the divergence resides in $\Pi(0)$, which means
that it has the form of a correction to the tree-level relation $Z_3 = 1$. In fact
$$Z_3 - 1 = -\frac{e_0^2\,d_{\rm spin}}{3\,(4\pi)^{d/2}}\,\Gamma\left(2 - \frac{d}{2}\right)m_0^{d-4}.$$
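As a sketch of how the $s$ integrals produce this structure, take as the assumed starting point the parametric formula with $\Delta(x) = m_0^2 + x(1-x)p^2$ and $(2\sqrt{\pi})^d = (4\pi)^{d/2}$. The overall sign and normalization below simply follow that starting formula, so they are tied to the conventions assumed there.

```latex
% Assumed input:
% (d-1) p^2 \Pi(p^2) = \frac{e_0^2 d_{\rm spin}}{(4\pi)^{d/2}} \int_0^1 dx \int_0^\infty ds\, s^{1-d/2} e^{-s\Delta}
%     \left[ \tfrac{d-2}{2} p^2 - d m_0^2 + (2-d)\left( \tfrac{d}{2s} + x^2 p^2 \right) \right]
\begin{align*}
\int_0^\infty ds\, s^{1-d/2} e^{-s\Delta} &= \Gamma\!\left(2-\tfrac d2\right)\Delta^{d/2-2}, \\
(2-d)\,\frac d2 \int_0^\infty ds\, s^{-d/2} e^{-s\Delta} &= d\,\Gamma\!\left(2-\tfrac d2\right)\Delta^{d/2-1}, \\
(d-1)\,p^2\,\Pi(p^2) &= \frac{e_0^2 d_{\rm spin}}{(4\pi)^{d/2}}\,\Gamma\!\left(2-\tfrac d2\right) p^2
   \int_0^1 dx\,\Delta^{d/2-2}\left[\tfrac{d-2}{2}\,(1-2x^2) + d\,x(1-x)\right], \\
\Pi(p^2) &= \frac{2\,e_0^2 d_{\rm spin}}{(4\pi)^{d/2}}\,\Gamma\!\left(2-\tfrac d2\right)
   \int_0^1 dx\, x(1-x)\,\left[m_0^2 + x(1-x)\,p^2\right]^{d/2-2}.
\end{align*}
```

The third line uses $-d\,m_0^2 + d\,\Delta = d\,x(1-x)\,p^2$, so the right-hand side is explicitly proportional to $p^2$ and vanishes at $p^2 = 0$. The last line follows on symmetrizing the bracket under $x \to 1-x$, which turns it into $2(d-1)\,x(1-x)$. With $d_{\rm spin} = 4$, $\int_0^1 dx\,x(1-x) = 1/6$ and $\Gamma(2-d/2) \simeq 2/(4-d)$, the pole of $\Pi(0)$ near $d = 4$ has magnitude $e_0^2 m_0^{d-4}/[6\pi^2(4-d)]$.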
The peculiar factor of $m_0^{d-4}$ in this formula points up something important. Away from
four dimensions, the electromagnetic coupling is not dimensionless. DR introduces no
explicit cut-off with the dimensions of mass, so the dimensions in formulae must be
made up by powers of $m_0$ or some other mass parameter.
The pole in the 1PI photon Green function can be attributed entirely to the $Z_3$ factor, but
there is an ambiguity regarding how much finite part to keep when we define $Z_3$. DR
allows us to keep "just the pole," but when we express the answer in terms of $e_0$ this
appears to introduce a dependence on $m_0$ because of dimensional analysis. However,
we can introduce an arbitrary mass scale $\mu$ in place of $m_0$ to define the split between
$Z_3$ and the renormalized field in the formula
We thus write
$$(Z_3 - 1) = -\frac{e_0^2\,\mu^{d-4}}{6\pi^2}\,\frac{1}{4-d},$$
where we have evaluated the residue of the pole at $d = 4$. This prescription for
parametrizing the renormalized theory is called minimal subtraction (MS). Note that
$$\Gamma\left(2 - \frac{d}{2}\right) \simeq \frac{1}{2 - d/2} - \gamma,$$
where $\gamma$ is the Euler-Mascheroni constant. There is a modified prescription, called
$\overline{\rm MS}$, which keeps certain finite parts in addition to the pole, and removes many of the
$\gamma$ factors which would otherwise appear in renormalized formulae (see Peskin and
Schroeder [33] for details).
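The expansion of the gamma function near its pole is easy to check numerically. This is a minimal sketch using only the standard library; the function names and the hard-coded value of the Euler-Mascheroni constant are introduced for the illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gamma_near_pole(d):
    """Gamma(2 - d/2), which has a simple pole at d = 4."""
    return math.gamma(2.0 - d / 2.0)


def pole_plus_constant(d):
    """The pole term 2/(4 - d) plus the finite constant -gamma."""
    return 2.0 / (4.0 - d) - EULER_GAMMA


if __name__ == "__main__":
    for d in (3.9, 3.99, 3.999):
        print(d, gamma_near_pole(d), pole_plus_constant(d),
              gamma_near_pole(d) - pole_plus_constant(d))
```

The difference between the two columns shrinks linearly in $4 - d$, as expected for the first neglected term of the expansion.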
9.8.1 Renormalization of the fermion propagator and the vertex
We now turn to the other two divergent one-loop diagrams in spinor QED, the fermion
self-energy and the photon-fermion vertex. The gauge-invariant form of the Lagrangian
suggests that these are renormalized by the same multiplicative factor. The 1PI vertex
function $\Gamma^\mu(p, p+q)$ is related to the Fourier transform of the Green function of the
electromagnetic current,
$$\langle J^\mu(x)\,\psi(y)\,\bar\psi(z)\rangle,$$
by multiplying it by inverse fermion propagators on the external legs and by $e_0$, the
bare charge. The current Green function satisfies the Ward identity
$$\partial^x_\mu\langle J^\mu(x)\,\psi(y)\,\bar\psi(z)\rangle = i\delta^4(x-y)\,\langle\psi(y)\,\bar\psi(z)\rangle - i\delta^4(x-z)\,\langle\psi(y)\,\bar\psi(z)\rangle.$$
Written in terms of the vertex function, this identity is
$$q_\mu\Gamma^\mu(p, p+q) = e_0\left[\gamma^\mu q_\mu + \Sigma(p+q) - \Sigma(p)\right],$$
where $\Sigma(p)$ is the sum of all 1PI self-energy graphs (the notation here differs a bit from
standard texts because I have included the tree-level term in the definition of $\Gamma^\mu$).
The vertex function itself is dimensionless, so the one-loop integral defining it is at
most logarithmically divergent. Consequently, all derivatives of $\Gamma^\mu$ w.r.t. $q$ are given by
finite expressions in four dimensions and will not have poles at $d = 4$. The divergent
part of $\Gamma^\mu$ is thus independent of $q$ at one loop. A similar argument shows that it is
independent of $p$ as well. Reflection invariance of QED shows that it cannot involve
$\gamma_5$, so we must have
$$\Gamma^\infty_\mu = e_0\,(Z_1 - 1)\,\gamma_\mu.$$
The Ward identity shows that this must be related to a divergent term in $\Sigma^\infty(p)
= (Z_2 - 1)\,\gamma_\mu p^\mu$, with $Z_1 = Z_2$.
We therefore turn to the self-energy diagram of Figure 9.7 to evaluate $Z_2$. The result
will depend on the gauge parameter $\kappa_0$ in the free-photon propagator
$$D_{\mu\nu}(q) = \frac{\delta_{\mu\nu} - q_\mu q_\nu/q^2}{q^2} + \kappa_0\,\frac{q_\mu q_\nu}{q^4}.$$
Fig. 9.7. One-loop self-energy in QED.
We will notice that Landau (also called Lorentz) gauge, $\kappa_0 = 0$, has certain nice features
(it avoids some infrared divergences in low orders). One way of understanding why the
Landau-gauge fermion propagator is so nice is to note that it is actually the value of a
fairly simple non-local gauge-invariant operator, evaluated in Landau gauge. Indeed,
if we multiply the bilinear $\psi(x)\,\bar\psi(y)$ by
$$\exp\left(i e_0\int d^4z\,A_\mu(z)\,j^\mu(z)\right),$$
where $\partial_\mu j^\mu(z) = \delta^4(z - x) - \delta^4(z - y)$, then we get a gauge-invariant operator. On
choosing $j_\mu = \partial_\mu\Lambda$, and noting that the resulting $\Lambda$ vanishes at infinity (it is the
four-dimensional analog of the field of a dipole), we see, upon integrating by parts, that the
exponent of the non-local factor vanishes in Landau gauge.
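We will do the calculation in Euclidean space. We write the one-loop contribution
to $\Sigma(p)$ as
$$\Sigma(p) = e_0^2\int\frac{d^dq}{(2\pi)^d}\,\frac{\gamma^\alpha\left(\gamma^\mu (p-q)_\mu - i m_0\right)\gamma^\beta}{q^2\left[(p-q)^2 + m_0^2\right]}\left(\delta_{\alpha\beta} - (1-\kappa)\,\frac{q_\alpha q_\beta}{q^2}\right).$$
We introduce Schwinger parameters to write the denominator as
$$\int_0^\infty d\alpha\,d\beta\; e^{-(\alpha+\beta)q^2}\,e^{-\beta(p^2 + m_0^2)}\,e^{2\beta\,q\cdot p}.$$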
Note that the extra inverse power of $q^2$ in the last term of the numerator turns into an
extra power of $\alpha$ in the numerator of the parametrized integral. To do the Gaussian
integral over loop momentum, we shift the integration variable to $r = q - xp$. Here we
have introduced the standard one-loop passage from Schwinger to Feynman parameters:
$s = \alpha + \beta$, $\beta = sx$, $d\alpha\,d\beta = s\,ds\,dx$. When we do the integrals, terms linear in $r$
will integrate to zero, and we need to use
$$\int\frac{d^dr}{(2\pi)^d}\,e^{-sr^2} = \frac{1}{(4\pi s)^{d/2}}$$
and
$$\int\frac{d^dr}{(2\pi)^d}\,r_\alpha r_\beta\,e^{-sr^2} = \frac{\delta_{\alpha\beta}}{2s}\,\frac{1}{(4\pi s)^{d/2}}.$$
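Keeping only terms even in $r$, the numerator is
$$\gamma^\alpha\left(\gamma^\mu p_\mu(1-x) - i m_0\right)\gamma^\beta\left[\delta_{\alpha\beta} - (1-\kappa)(1-x)\,s\left(r_\alpha r_\beta + x^2p_\alpha p_\beta\right)\right] + \gamma^\alpha\gamma^\mu r_\mu\gamma^\beta\,x(1-x)(1-\kappa)\,s\left(r_\alpha p_\beta + r_\beta p_\alpha\right).$$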
Note that we have inserted the renormalized gauge parameter rather than the bare
one, because the correction is of higher order in $e^2$. After doing the $r$ integration,
terms quadratic in $r$ acquire an extra power of $1/s$. Terms proportional to $s^{1-d/2}$ give
$\Gamma(2 - d/2) \sim 2/(4 - d)$, but terms with higher powers of $s$ give convergent integrals.
Thus, the divergent part of the self-energy is given by
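$$\Sigma^\infty(p) = \frac{e_0^2}{(4\pi)^2}\,\frac{2}{4-d}\int_0^1 dx\left\{\gamma^\alpha\left(\gamma^\mu p_\mu(1-x) - i m_0\right)\gamma_\alpha\left[1 - \frac{(1-\kappa)(1-x)}{2}\right] + d\,\gamma^\mu p_\mu\,x(1-x)(1-\kappa)\right\}.$$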
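Finally, we use contraction identities for Dirac matrices and do the $x$ integral to obtain
$$\Sigma^\infty = -\frac{e_0^2}{8\pi^2(4-d)}\left[\kappa\left(\gamma^\mu p_\mu + i m_0\right) + 3 i m_0\right].$$
This can obviously be removed by rescaling the fields and renormalizing the mass
according to
$$Z_2 - 1 \propto \frac{\kappa\,e_0^2}{4-d}$$
and
$$\delta m = -3\,\frac{e_0^2}{8\pi^2(4-d)}\,m_0.$$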
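The $x$ integrals behind this result can be done exactly with rational arithmetic. The sketch below assumes the integrand $\gamma^\alpha(\gamma\cdot p\,(1-x) - i m_0)\gamma_\alpha\,[1 - (1-\kappa)(1-x)/2] + d\,\gamma\cdot p\,x(1-x)(1-\kappa)$ at $d = 4$, together with the contractions $\gamma^\alpha\gamma^\mu\gamma_\alpha = (2-d)\gamma^\mu$ and $\gamma^\alpha\gamma_\alpha = d$; the polynomial helpers and the function names are introduced only for this check.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials in x given as coefficient lists (lowest order first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_scale(c, p):
    return [c * a for a in p]


def integrate_unit_interval(p):
    """Exact integral over 0 <= x <= 1 of sum_n p[n] x^n."""
    return sum(c / (n + 1) for n, c in enumerate(p))


def divergent_coefficients(kappa, d=4):
    """Return (coefficient of gamma.p, coefficient of i*m0) after the x integral."""
    kappa = Fraction(kappa)
    x = [Fraction(0), Fraction(1)]
    one_minus_x = [Fraction(1), Fraction(-1)]
    # 1 - (1 - kappa)(1 - x)/2 written as a polynomial in x
    bracket = [1 - (1 - kappa) / 2, (1 - kappa) / 2]
    slash_p = poly_add(
        poly_scale(Fraction(2 - d), poly_mul(one_minus_x, bracket)),
        poly_scale(d * (1 - kappa), poly_mul(x, one_minus_x)),
    )
    mass = poly_scale(Fraction(-d), bracket)
    return integrate_unit_interval(slash_p), integrate_unit_interval(mass)


if __name__ == "__main__":
    for kappa in (0, 1, 3):
        print(kappa, divergent_coefficients(kappa))
```

Under these assumptions the $\gamma\cdot p$ coefficient is $-\kappa$ and the $i m_0$ coefficient is $-(3+\kappa)$, so the bracket can be organized as $-[\kappa(\gamma\cdot p + i m_0) + 3 i m_0]$: the wave-function part vanishes at $\kappa = 0$ and the remaining mass shift does not depend on $\kappa$.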
There are several interesting points about these formulae. First, we find that $\delta m \propto m_0$,
so the mass renormalization is multiplicative and vanishes if $m_0$ vanishes. This is a
consequence of the extra chiral symmetry of the massless theory. Although DR doesn't
really preserve conservation of the associated Noether current (this is the chiral anomaly
we have discussed above), it does conserve the charge in perturbation theory. As a
consequence, although $m_0$ has dimensions of mass, it does not have the sensitivity to
the cut-off scale we expect for a relevant parameter. Note also that $\delta m$ is independent
of $\kappa$.
DR has the peculiar property that
$$\int\frac{d^dp}{p^2 + M^2} \propto M^{d-2}$$
and thus vanishes for $M = 0$. That is, quadratically divergent integrals with only massless
propagators vanish in DR, but they do depend quadratically on the masses of
very heavy particles. For scalar fields, this leads to renormalizations of any mass term
$\propto \Lambda^2$, coming from integrating out particles of mass near the cut-off. On dimensional
grounds we might have expected the same for fermion masses. However, the chiral sym-
metry of massless fermion systems guarantees that the corresponding massive systems
have only logarithmically divergent mass renormalization. We will return briefly to this
point when we discuss the concept of technical naturalness below.
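The scaling with $M$ quoted above can be made explicit with the same Schwinger-parameter and Gaussian-integral steps used for the vacuum polarization. The normalization $(2\pi)^{-d}$ of the measure is a conventional choice assumed here, and the result is understood by analytic continuation in $d$.

```latex
\begin{align*}
\int\frac{d^dp}{(2\pi)^d}\,\frac{1}{p^2+M^2}
 &= \int_0^\infty ds\, e^{-sM^2}\int\frac{d^dp}{(2\pi)^d}\,e^{-sp^2}
  = \int_0^\infty ds\,\frac{e^{-sM^2}}{(4\pi s)^{d/2}} \\
 &= \frac{\Gamma\!\left(1-\tfrac d2\right)}{(4\pi)^{d/2}}\,\left(M^2\right)^{d/2-1} \;\propto\; M^{d-2}.
\end{align*}
```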
The second interesting point is that the fermion wave-function renormalization is
$\kappa$-dependent and vanishes at $\kappa = 0$. In Chapter 6 we noted that QED has IR divergences
associated with massless-photon emission in the scattering of charged particles. One
can show using the RG equations of the next section and the IR freedom of QED that,
in leading-order RG-improved perturbation theory, the fermion propagator has a cut
rather than a pole at the physical mass, with a gauge-dependent power law. This is
problematic for the LSZ formula for the S-matrix. In Landau gauge, the cut becomes a
pole in this leading-order approximation. The gauge-invariant operator which is equal
to the fermion field in Landau gauge includes the Coulomb field of charged particles,
the IR effect that is of leading order in perturbation theory. In higher orders, we must
include appropriate coherent states of transverse photons in the definition of charged-particle
states, in order to account for bremsstrahlung radiation and cancel out the IR
divergences.
9.9 Renormalization-group equations in QED
We have argued that, in Landau gauge, we can, order by order in the perturbation
expansion, render the expressions for all Green functions finite by rescaling the fields
and tuning the bare parameters $m_0$ and $e_0$ so that the renormalized parameters $m =
Z_m m_0$ and $e$ are finite. In the process, we have introduced a new mass scale, $\mu$, the
renormalization scale, writing the dimensional bare coupling $e_0^2$ as
$$e_0^2 = \mu^{4-d}\,e^2/Z_3.$$
The Green functions of the bare fields, expressed as functions of $e_0$ and $m_0$, are
independent of $\mu$:
$$\mu\frac{d}{d\mu}\Gamma^{0}_{F,A}(e_0, m_0) = 0.$$
$\Gamma_{F,A}$ is the 1PI vertex with $F$ fermion and $A$ photon legs. It is a function of the momenta
of the external legs, defined without the momentum-conserving delta function. It has
mass dimension $4 - \tfrac{3}{2}F - A$. The renormalized Green function is
$\mu$-independence of the bare vertices implies a relation for the renormalized ones. It
should be expressed in terms of the renormalized parameters, and we can do this using
the chain rule:
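$$\left(\mu\partial_\mu + \beta\,\partial_\alpha + \gamma_A\,\kappa\,\partial_\kappa + m\gamma_m\,\partial_m\right)\Gamma_{F,A} = -\left(F\gamma_F + A\gamma_A\right)\Gamma_{F,A},$$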
where
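$$\beta = \mu\frac{d\alpha}{d\mu}, \qquad \gamma_m = \mu\frac{d\ln Z_m}{d\mu}, \qquad \gamma_{F,A} = \frac{\mu}{2}\,\frac{d\ln Z_{2,3}}{d\mu}.$$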
We have used the symbol $d/d\mu$ to represent derivatives with $\alpha_0$ and $m_0$ held fixed,
while $\partial_\mu$ refers to derivatives with the renormalized parameters fixed. $\alpha_0 = e_0^2/(4\pi)$ is
the bare fine-structure constant. The renormalized fine-structure constant is
$$\alpha = \mu^{d-4}\,\alpha_0\,Z_3.$$
We have also used the result that the longitudinal part of the photon propagator is
unchanged by loop corrections. Thus, the renormalization of the gauge parameter $\kappa$
comes only from the photon wave-function renormalization $\kappa_0 = Z_3\kappa$. This equation
for the renormalized 1PI vertices is called the renormalization-group equation. It holds
for all values of momenta, and expresses the fact that the renormalized theory has only
one independent dimensionless parameter and we have artificially introduced another
one through the mass scale $\mu$. As we have emphasized, the reason we must do
this is that the theory is not scale-invariant even in the limit $m_0 \to 0$. If we had
insisted on defining the renormalization scale in terms of the electron mass parameter,
we would have introduced spurious infrared divergences into the $m_0 = 0$ theory.
The fact that the RG equation is true for all momenta shows us that the RG functions
$\beta$ and $\gamma_i$ are all finite in the limit $d \to 4$. Read as a set of equations for these functions
in terms of the finite proper vertices, we have an over-determined set of linear equations
for $\beta$ and $\gamma_i$. The fact that they have a solution is the content of the differential equation
for the vertices. In perturbation theory, it is easy to see that, to $k$th order in the loop
expansion, we can expect the functions $Z_i$ to have poles up to order $k$ at $d = 4$.
Expressed in terms of $\alpha_0$, the $\mu$ dependence of the $k$th order is just $\mu^{k(d-4)}$. The scaling
derivative in the definition of the RG functions brings down a factor of $d - 4$, which
cancels out the first-order pole. All the others must cancel out automatically, when the
finite functions $\beta$ and $\gamma_i$ are expressed in terms of $\alpha$, $\kappa$, and $m$. Thus, to calculate the
$k$th-order term in any of these functions, we need only find the residue of the first-order
pole, in the $k$th order in the loop expansion.
Finally we note that, in the MS prescription, the coefficients of these poles are independent
of $m_0$. This follows from the fact that all of the $Z_i$ are dimensionless and
therefore $(\partial/\partial m_0)\ln Z_i$ is given by convergent Feynman integrals in $d = 4$, plus the
fact that we have introduced $\mu$ rather than $m_0$ or $m$ to provide the dimensions of $\alpha_0$.
It follows that all the RG functions depend only on the renormalized coupling $\alpha$ and
the gauge parameter $\kappa$. In fact, since the transverse part of the photon propagator is
$\kappa$-independent, $\gamma_A$ and $\beta$ are independent of $\kappa$. In a more general field theory, defined
in perturbation theory around a Gaussian fixed point, the RG functions in the MS
scheme depend on the marginal parameters, but not on the relevant ones.
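If we rescale all the momenta $p_i \to e^t p_i$, then the equation of dimensional analysis is
$$\left(\partial_t + \mu\partial_\mu + m\partial_m\right)\Gamma_{F,A} = \left(4 - \tfrac{3}{2}F - A\right)\Gamma_{F,A}.$$
Using the RG equation, we can rewrite this as
$$\left(\partial_t + \beta\,\partial_\alpha + m(1 + \gamma_m)\,\partial_m\right)\Gamma_{F,A} = \left[4 - F\left(\tfrac{3}{2} + \gamma_F\right) - A\left(1 + \gamma_A\right)\right]\Gamma_{F,A}.$$
In words, this equation says that we can compensate for a change in momentum scale
by changing the coupling and the mass, and rescaling the fields. Thus the solution of
this equation is
$$\Gamma_{F,A}(e^t p_i, \alpha, m) = \Gamma_{F,A}(p_i, \alpha(t), m(t))\,e^{L(t)}.$$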
On differentiating this expression w.r.t. $t$ and insisting that it give the previous equation
we find that
$$\dot\alpha(t) = \beta,$$
$$\dot m(t) = (1 + \gamma_m)\,m,$$
$$\dot L(t) = 4 - F\Delta_F - A\Delta_A,$$
with $\Delta_F = \tfrac{3}{2} + \gamma_F$ and $\Delta_A = 1 + \gamma_A$.
Thus, we learn that a rescaling of all momenta is equivalent to a flow of the coupling
constants satisfying the above equations, combined with a scale-dependent rescaling
of the Green functions, which can be interpreted as an anomalous dimension for the
fields. Similarly, the flow equation for the mass can be interpreted as an anomalous
dimension for the renormalized mass parameter. The anomalous dimensions $\Delta_{F,A,m} =
E_{F,A,m} + \gamma_{F,A,m}$, where $E_{F,A,m} = (3/2, 1, 1)$ is the engineering dimension, 13 are functions
of the scale-dependent fine-structure constant. If, in the asymptotic region $t \to \pm\infty$,
$\alpha(t) \to \alpha^*$, a fixed point, then we conclude that the theory is asymptotically scale-invariant,
with dimensions given by the anomalous dimensions evaluated at the fixed
point.
Are there fixed points in QED? We can investigate this only in the vicinity of the
free theory, because we are doing perturbation theory. The perturbative value of the $\beta$
function in spinor QED is
$$\beta = (d - 4)\,\alpha + \frac{2}{3\pi}\,\alpha^2.$$
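A sketch of how this $\beta$ function follows from the first-order pole: it assumes the relation $\alpha = \mu^{d-4}\alpha_0 Z_3$ and a one-loop minimal-subtraction factor $Z_3 = 1 - \frac{2\alpha}{3\pi}\frac{1}{4-d}$, with the $\mu$ derivative taken at fixed $\alpha_0$.

```latex
% Assumption: Z_3 = 1 - \frac{2\alpha}{3\pi}\,\frac{1}{4-d} + O(\alpha^2)
\begin{align*}
\alpha &= \mu^{d-4}\,\alpha_0\,Z_3, \\
\beta \equiv \mu\frac{d\alpha}{d\mu}\Big|_{\alpha_0}
  &= (d-4)\,\alpha + \alpha\,\mu\frac{d\ln Z_3}{d\mu}, \\
\mu\frac{d\ln Z_3}{d\mu}
  &= -\frac{2}{3\pi}\,\frac{1}{4-d}\,\mu\frac{d\alpha}{d\mu} + O(\alpha^2)
   = -\frac{2}{3\pi}\,\frac{(d-4)\,\alpha}{4-d} + O(\alpha^2)
   = \frac{2}{3\pi}\,\alpha + O(\alpha^2), \\
\beta &= (d-4)\,\alpha + \frac{2}{3\pi}\,\alpha^2 + O(\alpha^3).
\end{align*}
```

The factor $d - 4$ brought down by the scaling derivative cancels the pole, leaving a result that is finite as $d \to 4$.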
Adding scalars and more spinors changes the coefficient of the second term, without
changing its sign. We note that for real values of d less than but close to 4 we find
a zero of $\beta$ at a non-zero but small value of the coupling, where we can trust
perturbation theory. This will occur in any theory with couplings that are dimensionless
in four dimensions but have positive coefficient for the one-loop correction to the RG
function. This observation is the basis of one of the methods of calculating critical expo-
nents for second-order phase transitions in d = 2, 3. One computes the fixed points and
anomalous dimensions in a power series in 4 — d, and extrapolates to the values of the
dimension relevant for real condensed-matter systems. This turns out to give quite good
agreement with experiment, though other methods based on field-theory calculations
in integer dimensions do somewhat better. A more detailed account can be found in
the books of Peskin and Schroeder [33] and Zinn-Justin [111] and the references to the
original literature found therein.
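For the one-loop QED $\beta$ function quoted above, the zero just below four dimensions can be located in closed form. The sketch below is an illustration of that statement only; the function names are assumptions, and higher-loop terms are ignored.

```python
import math


def beta(alpha, d):
    """One-loop beta function (d - 4) alpha + (2 / 3 pi) alpha^2."""
    return (d - 4.0) * alpha + 2.0 / (3.0 * math.pi) * alpha ** 2


def fixed_point(d):
    """Non-zero root of beta for d < 4: alpha* = (3 pi / 2)(4 - d)."""
    return 1.5 * math.pi * (4.0 - d)


def beta_slope(alpha, d):
    """Derivative of beta with respect to alpha."""
    return (d - 4.0) + 4.0 / (3.0 * math.pi) * alpha


if __name__ == "__main__":
    for d in (3.99, 3.9, 3.5):
        a_star = fixed_point(d)
        print(d, a_star, beta(a_star, d), beta_slope(a_star, d))
```

The slope at the fixed point equals $4 - d$, which is the kind of quantity that is expanded in powers of $4 - d$ in this approach. The fixed-point coupling itself is small only when $4 - d$ is small, which is why the extrapolation to $d = 3$ or $d = 2$ is not controlled by perturbation theory alone.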
As high-energy physicists, we are interested in d = 4. There, the RG equation has the
solution
$$\alpha(t) = \frac{\alpha(0)}{1 - [2/(3\pi)]\,\alpha(0)\,t}.$$
Note that, for $t \to -\infty$, this solution remains in the perturbative regime if $\alpha(0)$ is
small. On the other hand, if $t \to \infty$ the coupling becomes strong, and in fact appears
Often the term anomalous dimension is reserved for the shift yp a m rather than the din
to reach infinity at the Landau pole, $t = 3\pi/[2\alpha(0)]$. Our basic definition of field
theory tells us that a mathematically consistent field theory will approach its fixed point
in the deep UV. Our perturbative philosophy was based on the assumption that this
fixed point was Gaussian. The Landau pole tells us that this assumption was wrong.
The QED coupling is marginally irrelevant, which means that the only mathematically
consistent value for the renormalized coupling is zero. If we insist on a finite value for
this coupling at some momentum scale $t = 0$, then the theory becomes singular at a
finite scale, that of the Landau pole.
Since $\alpha(0) \sim 1/137$ in the real world, our attitude to this as physicists is somewhat
different than that of the mathematical field theorist. If $t = 0$ corresponds to the atomic
scale $\sim 10$ eV at which the fine-structure constant is measured, then the scale of the
Landau pole is $\sim 10^{331}$ GeV, which is much larger than the Planck scale $10^{19}$ GeV
defined by quantum gravity. We certainly expect our notions of quantum field theory
to break down long before we reach the scale of the Landau pole. Thus, QED is a
perfectly good effective field theory at all scales at which we expect the whole notion
of quantum field theory to work. In contrast with a truly irrelevant operator, like
the four-fermion coupling of Fermi's theory of weak interactions, a small, marginally
irrelevant coupling does not hint at new physics around the corner. The renormalized
perturbation series of QED could in fact be the whole story up to the Planck scale.
We know that this is not true, because, above the QCD and weak-interaction scales,
physics becomes more complicated. QED is certainly incorporated into the full stan-
dard model. The point is that there is nothing in pure QED that could have led us to
such a conclusion.
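The one-loop running and the location of the Landau pole can be explored numerically. The sketch below assumes the $d = 4$ flow $d\alpha/dt = [2/(3\pi)]\,\alpha^2$ for a single spinor field, with $t$ the logarithm of the momentum rescaling; the function names, the Runge-Kutta integrator, and the choice of reference scale are illustrative. The exponent it returns depends on the input scale and on which charged fields are included, so it should be read as an order-of-magnitude estimate.

```python
import math

B_ONE_LOOP = 2.0 / (3.0 * math.pi)  # coefficient of alpha^2 in the d = 4 beta function


def alpha_closed_form(t, alpha0):
    """Closed-form one-loop solution alpha(0) / (1 - B alpha(0) t)."""
    return alpha0 / (1.0 - B_ONE_LOOP * alpha0 * t)


def alpha_numeric(t, alpha0, steps=2000):
    """Integrate d(alpha)/dt = B alpha^2 from 0 to t with classical Runge-Kutta."""
    def beta(a):
        return B_ONE_LOOP * a * a

    h = t / steps
    a = alpha0
    for _ in range(steps):
        k1 = beta(a)
        k2 = beta(a + 0.5 * h * k1)
        k3 = beta(a + 0.5 * h * k2)
        k4 = beta(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


def landau_pole_t(alpha0):
    """Value of t at which the closed-form solution diverges."""
    return 1.0 / (B_ONE_LOOP * alpha0)


def log10_landau_scale(alpha0, log10_reference_scale):
    """log10 of the pole scale, in the units of the reference scale at t = 0."""
    return log10_reference_scale + landau_pole_t(alpha0) / math.log(10.0)


if __name__ == "__main__":
    alpha0 = 1.0 / 137.0
    for t in (-100.0, 100.0, 400.0):
        print(t, alpha_numeric(t, alpha0), alpha_closed_form(t, alpha0))
    # Reference scale of 10 eV = 1e-8 GeV at t = 0 (the choice made in the text).
    print(landau_pole_t(alpha0), log10_landau_scale(alpha0, -8.0))
```

Whatever the precise exponent, the pole sits hundreds of orders of magnitude above the Planck scale, which is the point of the argument.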
9.9.1 The static potential and the definition of $\alpha$
The flip side of the marginal irrelevance of QED is that the coupling becomes arbitrarily
weak in the IR. Indeed, an alternative name for marginally irrelevant interactions is
IR-free. One must, however, be careful about the precise meaning of this. In the IR limit,
the mass parameter also becomes large. It turns out that this makes the MS definition
of the coupling very different from a physical definition in terms of real scattering
amplitudes. One useful physical definition is in terms of the long-distance potential felt
by a pair of oppositely charged heavy particles. If a particle is sufficiently heavy we can
describe its interaction with the electromagnetic field in terms of its classical trajectory
through space-time, $x^\mu(\tau)$. The current of the particle is
$$J^\mu(x) = q\int d\tau\,\frac{dx^\mu}{d\tau}\,\delta^4(x - x(\tau)),$$
where $q$ is its charge. The interaction with the electromagnetic field is obtained by
inserting the factor
$$e^{i\int d^4x\,A_\mu(x)J^\mu(x)} = e^{iq\int A_\mu\,dx^\mu},$$
where the second integral is a line integral along the particle path. This expression is
the Wilson line. A similar formula holds in non-abelian gauge theory, where $A_\mu$ is a
matrix and the exponential is replaced by a path-ordered exponential along the particle
path.
It is convenient to discuss a particle-anti-particle pair as a closed Wilson line, or
Wilson loop. This corresponds to pair production, propagation through space-time,
and annihilation at some later time. The pair-production and annihilation events are
described in an idealized manner and do not correspond to a realistic experiment.
However, if we take a rectangular loop in Euclidean space with one side T much
greater than the other, $T \gg R$, and Wick rotate to Lorentzian signature, then to
leading order in T/R the answer will be dominated by the long period of particle-
anti-particle propagation, and the unrealistic pair creation events will be a sub-leading
effect.
The exact answer in Euclidean QED is given in terms of the generating function of
connected Green functions, by $e^{-W[J]}$. However, in the large-$R$ limit, the contribution
of the two-point function dominates (cluster decomposition works even in this theory
with massless photons). Furthermore, in the same limit, the contribution of the two-point
function is dominated by its behavior at $p = 0$. If we assume that $e_0^2/(1 + \Pi(p^2))$
goes to a finite $p = 0$ limit, then the Wilson loop is given by
This is the Coulomb potential with physical renormalized charge
$$e_{\rm coul}^2 = \frac{e_0^2}{1 + \Pi(0)}.$$
We have already noted that we could have chosen to parametrize the renormalized
theory by this quantity. However, we also commented that this introduced a logarithmic
dependence on $m_0$ (and thus on $m$) and we preferred to introduce the arbitrary
momentum scale $\mu$ to eliminate this. For finite $m$, this choice of parametrization cannot
change the fact that $e_{\rm coul}$ is a finite quantity. Thus, the physical renormalized charge
defined in terms of the Coulomb potential is not equal to the IR limit of the MS
coupling $\alpha(-\infty) = 0$. Physically, this corresponds to the fact that the renormalization
of the electric charge due to loops of electrons goes to zero below the threshold
for electron-positron pair production. Mathematically it corresponds to the fact that
one must take $m(t)$ to infinity as $\alpha(t)$ goes to zero in computing $e_{\rm coul}$ in the MS
scheme. 14
In the $m = 0$ theory, on the other hand, the MS RG equation can be used to show
that the Fourier transform of the "Coulomb" potential is modified to $1/[q^2\ln(q^2)]$, so
that the potential falls off more rapidly than $1/R$, corresponding to vanishing charge
in the IR.
14 IR freedom is useful in the massive theory when dealing with the problem of soft-photon IR divergences.
It justifies the perturbative resummation procedures one uses to extract finite answers for inclusive cross
9.10 Why is QED IR-free?
I now want to give two different answers to this question. By comparing and contrasting
them, we will learn a lot of things, including things relevant to non-abelian gauge theo-
ries. Consider first the Fourier transform of the two-point function of unrenormalized
electromagnetic field strengths:
$$\int d^4x\,e^{-iq\cdot x}\,\langle F_{\mu\nu}(x)\,F_{\lambda\kappa}(0)\rangle.$$
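This is a gauge-invariant quantity, which can be written entirely in terms of the transverse
photon propagator $D_{\mu\nu}(q)$. By inserting a complete set of physical states, we can
derive a Lehmann representation
$$D_{\mu\nu}(q) = \left(\delta_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)\left[\frac{Z_3}{q^2} + \int dM^2\,\frac{\rho(M^2)}{q^2 + M^2}\right].$$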
Note that the $Z_3$ appearing in this formula corresponds to the charge $e_{\rm coul}$ rather than
the one defined by the MS scheme. In a covariant gauge, the Hilbert space of QED
is not positive definite, but the subspace generated by acting with gauge-invariant
operators on the vacuum is. Thus $\rho$ is positive. The electric and magnetic fields satisfy
the canonical commutation relation
$$[F_{0i}(\mathbf{x}, t), F_{jk}(\mathbf{y}, t)] = i\left(\delta_{ij}\partial_k - \delta_{ik}\partial_j\right)\delta^3(\mathbf{x} - \mathbf{y}).$$
This leads to the sum rule
$$1 = Z_3 + \int dM^2\,\rho(M^2),$$
which implies $0 \leq Z_3 \leq 1$. That is, the renormalized charge is always smaller than
the bare charge, $e_0$. If we take $e_0 \to 0$ as the cut-off is taken to infinity, as would
be appropriate for a marginally relevant perturbation of the Gaussian fixed point,
then the renormalized charge vanishes. So the electromagnetic coupling is marginally
irrelevant: QED. Note that this does not mean that there could not be interacting
fixed-point theories containing electrodynamics. However, they cannot be accessed in
perturbation theory around the Gaussian fixed point.
Furthermore, the relation $\beta = \alpha\gamma_A$ shows that at any fixed point the two-point
function of the electromagnetic field is proportional to its value in free-field theory.
The representation theory of the four-dimensional conformal group can then be used to
show that all higher connected Green functions of $F_{\mu\nu}$ vanish. That is, at any interacting
fixed point, the electromagnetic field decouples from the rest of the dynamics.
Let us now derive the marginal irrelevance of the electromagnetic coupling in another
way. This method works only at one loop, and is based on the idea of thinking of the
interacting vacuum state of QED as a medium filled with bare particles. The low-energy
effective action of the electromagnetic field in a medium has the form
![*■
are the electric and magnetic susceptibilities. H m is the Hamiltonian of the particles in
the medium, and the angle brackets refer to statistical averaging over the distribution of
these particles. The vacuum is a peculiar kind of medium. It is exactly Lorentz-invariant.
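This means that
$$\epsilon\mu = 1$$
(with $\epsilon$ the dielectric constant and $\mu$ the magnetic permeability of the medium), so
the electric polarizability is related to the magnetic susceptibility.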
We are interested in renormalizations of the electric charge due to effects going on in
the UV. This means that the problem is extremely relativistic. It turns out that, because
magnetism is a relativistic effect in condensed-matter physics, it is really only our intu-
ition for magnetic systems that is relevant here. Indeed, we are familiar with two kinds
of magnetic behavior of materials: diamagnetism with $\chi_m < 0$ and paramagnetism with
$\chi_m > 0$. Both have their origins in quantum mechanics, because Van Leeuwen's theorem
in classical statistical mechanics shows that neither can occur. Landau understood
that diamagnetism arises from the quantization of particle orbits (Landau levels) in
a constant magnetic field. Pauli was the first to interpret paramagnetism as a con-
sequence of the intrinsic spins of atoms and molecules. Indeed, given the concept of
such an intrinsic magnetic moment, paramagnetism can be understood classically. The
intrinsic magnetic moments of the particles line up with the external field, enhancing it.
In any given material, the overall magnetic properties are determined by a competition
between these two effects. In a Lorentz-invariant medium, paramagnetism is equiva-
lent to anti-screening of electric charge, an effect that is hard to understand in terms of
non-relativistic electrostatics.
In order to understand what is going on in terms of scalar and spinor QED we need
one more twist: for fermions the massless particles which give rise to the magnetic
properties of the vacuum must be thought of as having negative energy. This can be
understood in a variety of ways, the simplest of which is Dirac's picture of the free-
fermion vacuum as a filled Fermi sea of negative-energy electrons. The fact that the
vacuum energy has opposite sign for bosons and fermions arises from the famous
minus sign for closed fermion loops in Feynman's rules. Bosonic vacuum energy is just
the zero-point energy of the field oscillators, which is obviously positive. Fermionic
vacuum energies have the opposite sign because commutators are replaced by anti-
commutators in the normal ordering prescription. However one derives it, the effect
of this is to flip the sign of any individual contribution to the susceptibility. Thus, for
fermionic vacuum particles, a paramagnetic effect gives $\chi_m < 0$, while diamagnetism
corresponds to $\chi_m > 0$. Our explicit calculations and/or the general result $Z_3 < 1$ which
follows from unitarity then tell us that for massless spin-1/2 particles with gyromagnetic
ratio 2, paramagnetism is more important than diamagnetism.
We can now use this result to learn something about the electromagnetism of charged
spin-1 particles. The competition between paramagnetism and diamagnetism obviously
depends on the gyromagnetic ratio of particles. It turns out that the only renormalizable
theories of charged spin-1 particles are non-abelian gauge theories. The Lagrangian for
the simplest such theory, corresponding to the group SU(2), contains an "electromagnetic"
field $A_\mu$ and a charged vector field $W^\pm_\mu$ (the two fields are Hermitian conjugates
of each other). In addition to the minimal coupling of A to W the electromagnetic
Lagrangian is modified to
$$-\frac{1}{4}\left(F_{\mu\nu} - e\,(W^+_\mu W^-_\nu - W^-_\mu W^+_\nu)\right)^2.$$
The diligent reader will verify that this gives the charged vector fields an anomalous
magnetic moment,¹⁵ in such a way that their gyromagnetic ratio is 2. Thus, in this model
the winner of the competition between diamagnetism and paramagnetism is already
determined by the spin-1/2 case, since the magnetic moment of spin-1 particles with
g = 2 is larger than that of Dirac particles. Paramagnetism wins and $e^2$ is a marginally
relevant coupling. Indeed, this hand-waving argument can be made quantitative, and
reproduces the result of the rigorous calculation we will do later. The relevant formalism
is the background field gauge and can be found in Peskin and Schroeder [33].
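As a minimal sketch of how this competition can be tallied, the snippet below uses a counting rule that is not derived in this section and is taken here as an assumption (it is the heuristic usually attributed to Nielsen and Hughes): each pair of helicity states with spin s and gyromagnetic ratio g contributes $(-1)^{2s}[(gs)^2 - 1/3]$ to the one-loop coefficient, the first term being the paramagnetic (spin) piece and the $-1/3$ the diamagnetic (orbital) piece. The function name `one_loop_weight` is hypothetical.

```python
from fractions import Fraction

def one_loop_weight(spin, g_factor=Fraction(2)):
    """Assumed counting rule: (-1)^(2s) * [(g*s)^2 - 1/3] per pair of helicity states."""
    spin = Fraction(spin)
    # Fermion loops (half-integer spin) flip the overall sign.
    sign = -1 if (2 * spin) % 2 == 1 else 1
    paramagnetic = (g_factor * spin) ** 2   # spin (magnetic-moment) contribution
    diamagnetic = Fraction(-1, 3)           # orbital (Landau-level) contribution
    return sign * (paramagnetic + diamagnetic)

if __name__ == "__main__":
    for label, s in (("complex scalar", Fraction(0)),
                     ("Weyl fermion", Fraction(1, 2)),
                     ("vector boson", Fraction(1))):
        print(label, one_loop_weight(s))
```

With g = 2 the three weights are the familiar one-loop numbers 11/3 for gauge bosons, -2/3 for Weyl fermions and -1/3 for complex scalars, so a positive total corresponds to paramagnetism winning.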
We can also conclude that something must go wrong with the gauge invariance
of the electromagnetic field-strength tensor in this theory. Otherwise we would find
a contradiction with our unitarity argument for Z3 < 1. Indeed, in the non-abelian
gauge theory, electromagnetic gauge invariance is part of an SU (2) gauge group. The
traditional Maxwell field strength transforms under the other independent gauge transformations.
Indeed, any one component of the triplet $A^a_\mu$ composed of $A_\mu$ and the real
components of $W_\mu$ can be thought of as the photon, with the other two components
transforming as charged fields. The field strength
$$F^3_{\mu\nu} = F_{\mu\nu} - e\,(W^+_\mu W^-_\nu - W^-_\mu W^+_\nu)$$
is the third component of an SU (2) triplet. It is not gauge-invariant, and produces
negative norm states when acting on the vacuum in a covariant gauge.
The detailed calculation of coupling renormalization in a form that reveals the diamagnetic/paramagnetic
split we have discussed here can be found in many textbooks
under the heading of background field methods. We will instead perform the calculation
in terms of a simple physical quantity, the potential between static sources, or Wilson
loop.
15 Weinberg [112] proved long ago that, to lowest order in electromagnetic couplings, the only value of the
gyromagnetic ratio for particles of any spin which is consistent with good behavior of high-energy cross ...
9.11 Coupling renormalization in non-abelian
gauge theory
We will calculate the renormalization of the coupling by computing the one-loop cor-
rection to the potential between two static external sources. As explained in Chapter 8
and Problem 8.4, this potential is defined by
$$V(R) = -\lim_{T\to\infty}\frac{1}{T}\ln\left(\frac{\langle W_R(\Gamma)\rangle}{d_R}\right). \qquad (9.2)$$
$\Gamma$ is the rectangular Wilson loop shown in Figure 9.8 and $d_R$ is the dimension of the
representation R.
In perturbation theory we can compute this in terms of the Feynman rules of
Appendix D, with additional vertices for the interaction of gluons¹⁶ and the static
source. The contributions proportional to T come exclusively from the parts of the
Wilson loop which point in the Euclidean time direction. Thus we have
$$-g_0\mu^{(4-d)/2}\,\delta_{\mu 0}\,\delta^3(\mathbf{x}-\mathbf{R})\,T^a_R$$
for the upward-going line at R, and the corresponding vertex, localized at $\mathbf{x} = 0$ and with
the opposite orientation, for the downward-going line at the origin. We will work in Feynman gauge, because
the triple gluon vertex of Figure 9.9(f) will not contribute in this gauge. All the gluons
couple to static sources, so only their 0 components appear. The triple vertex is antisymmetric
in group indices (we have a compact group) and in space indices (Bose
statistics), and so vanishes in this configuration.
Figure 9.8. Wilson loop for the static potential.
Figure 9.9. One-loop contributions to the potential.
16 We will use the QCD terminology, gluon, to refer to the generic non-abelian gauge bosons of this section.
To complete the Feynman rules, we must remember that all of the $T^a_R$ matrices must
be path ordered around the loop. The second-order contribution comes just from the
graph of Figure 9.8. Its value is
$$\frac{g_0^2\,\mathrm{tr}(T^a_R T^a_R)}{4\pi^2}\int\frac{dt\,ds}{(t-s)^2+R^2},$$
where the integrals go from $-T/2$ to $T/2$. The integral over $(s+t)/2$ gives a factor
of T, while the integral over relative times is just
$$\int_{-\infty}^{\infty}\frac{d\tau}{\tau^2+R^2} = \frac{\pi}{R}.$$
We have evaluated the integral in four dimensions because there are no divergences in
leading order. We obtain
$$V(R) = -\frac{g_0^2}{4\pi R}\,\frac{\mathrm{tr}\,C_2(R)}{d_R},$$
where $C_2(R)$ is the Casimir operator in the R representation. Note that the Wilson loop
contains the trace of this Casimir operator, but the dimension of the representation is
divided out to normalize the static particle states.
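The relative-time integral behind the Coulomb form can be checked numerically. The sketch below assumes the four-dimensional Euclidean propagator $1/(4\pi^2x^2)$ and a finite time extent T; the function names are hypothetical, and only the coefficient multiplying $g_0^2C_2$ is computed, so no statement about the overall sign convention is implied.

```python
import math

def relative_time_integral(R, T, steps=200_000):
    """Midpoint-rule estimate of the integral of 1/(tau^2 + R^2) over [-T/2, T/2]."""
    h = T / steps
    total = 0.0
    for k in range(steps):
        tau = -T / 2 + (k + 0.5) * h
        total += h / (tau * tau + R * R)
    return total

def coulomb_coefficient(R, T):
    """Propagator 1/(4 pi^2 x^2) integrated over relative time; tends to 1/(4 pi R)."""
    return relative_time_integral(R, T) / (4 * math.pi ** 2)

if __name__ == "__main__":
    R = 1.0
    for T in (10.0, 100.0, 1000.0):
        # The finite-T value approaches 1/(4 pi R) as T grows.
        print(T, coulomb_coefficient(R, T), 1 / (4 * math.pi * R))
```

At finite T the integral equals $(2/R)\arctan(T/2R)$, so the corrections to $\pi/R$ fall off like $1/T$ and do not contribute to the term linear in T.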
Figure 9.10. One-loop contributions to gluon vacuum polarization.
At fourth order in the coupling we have the seven graphs of Figure 9.9, the sixth
of which vanishes in Feynman gauge. The gluon vacuum polarization, denoted by a
shaded oval, is computed from the graphs of Figure 9.10.
The first two graphs of Figure 9.9 sum up to
$$\begin{aligned}
&\left(\tfrac{1}{2}\right)^2 g_0^4\mu^{2(4-d)}\int dx_1\,dx_2\,dx_3\,dx_4\,D(x_1-x_3)\,D(x_2-x_4)\\
&\quad\times\,\mathrm{tr}\left[(T^a_RT^a_RT^b_RT^b_R)\,\theta(t_1-t_2)\,\theta(t_3-t_4) + (T^a_RT^b_RT^a_RT^b_R)\,\theta(t_1-t_2)\,\theta(t_4-t_3)\right].
\end{aligned}$$
All $x_i$ variables are integrated over the full Wilson loop, from $-T/2$ to $T/2$ at R and
then back again at the origin (we neglect the horizontal sections of the loop because
they don't give contributions that grow like powers of T). The factor of $(\frac{1}{2})^2$ eliminates
double counting. The $\theta$ functions refer to ordering around the Wilson loop. The first
contribution comes from the graph where the internal gluon lines do not cross each
other when all lines are drawn inside the loop.
If, in Figure 9.9(e), we write $\mathrm{tr}(T^a_RT^b_RT^a_RT^b_R) = \mathrm{tr}(T^a_RT^a_RT^b_RT^b_R) + \frac{1}{2}\mathrm{tr}[T^a_R, T^b_R]^2$
then we can combine the first term with the graph of Figure 9.9(d) to get precisely the
square of the second-order result. The novel term thus has the form
$$\left(\tfrac{1}{2}\right)^3 g_0^4\mu^{2(4-d)}\int dx_1\,dx_2\,dx_3\,dx_4\,D(x_1-x_2)\,D(x_3-x_4)\;\mathrm{tr}\left([T^a_R, T^b_R]^2\right)\theta(t_1-t_3)\,\theta(t_4-t_2).$$
The group-theory factor here gives $-f^{abc}f^{abc}D(R) = -C_2(G)\,d_G\,D(R)$, in terms of
the second Casimir operator of the group, its dimension, and the Dynkin index of the
representation R.
There are three types of configurations of the $x_i$ that contribute to the integral. If
all $x_i$ are on either the upward- or the downward-going line, then the integral has no
R dependence. It contributes to the self-energy of the static sources, but not to the
potential between them. If three $x_i$ are on either the upward or the downward line, as
in Figure 9.9(c), we get (note that there is a minus sign from the orientations of the
different integration variables)
$$4\int dt_1\,dt_2\,dt_3\,dt_4\,\theta(t_2-t_3)\,\theta(t_3-t_1)\,D(t_1-t_2,0)\,D(t_3-t_4,R).$$
All the integrals here go from $-T/2$ to $T/2$ and the $\theta$ functions are ordinary step
functions. In DR, the Green functions are
$$D(t,R) = \int\frac{d^dp}{(2\pi)^d}\,\frac{e^{i(p_0t+\mathbf{p}\cdot\mathbf{R})}}{p_0^2+\mathbf{p}^2}.$$
The contribution from Figure 9.9(b) is similar. The two kinds of independent con-
tributions (Figures 9.9(b) and (c) and the commutator term in Figure 9.9(e)) have equal
group-theoretic factors, since they are really part of the same topological configuration
of the Wilson-loop diagrams, distinguished only by the fact that we have chosen to put
part of the loop at infinity. The differences between them come from R dependence
of the propagators, and the relative minus sign from the orientation of the static line.
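Their sum is proportional to ($t_{ij} = t_i - t_j$)
$$\int dt_{23}\,dt_{21}\,dt_{34}\,\theta(t_{23})\,\theta(t_{21}-t_{23})\,e^{-\frac{t_{21}^2}{4\alpha}-\frac{t_{34}^2+R^2}{4\beta}} - \int dt_{31}\,dt_{21}\,dt_{34}\,\theta(t_{31})\,\theta(t_{12}+t_{34}-t_{31})\,e^{-\frac{t_{12}^2+R^2}{4\alpha}-\frac{t_{34}^2+R^2}{4\beta}}.$$
This must be integrated over $\alpha$ and $\beta$, with weight $(\alpha\beta)^{-d/2}$.
In the first integral, we do the integral over $t_{23}$ to get a factor of $t_{21}$. The latter
variable is constrained to be positive. The rest of the integral is a product of decoupled
Gaussians and gives us a factor $4\sqrt{\pi\alpha^2\beta}$. In the second integral we integrate over $t_{31}$ to
get a factor of $\theta(t_+)t_+$, where $t_\pm = t_{12} \pm t_{34}$. In terms of these new variables we find the
sum of contributions to the Schwinger-parameter integrands of the two diagrams to be
$$T\left[4\sqrt{\pi\alpha^2\beta}\,e^{-R^2/4\beta} - \frac{1}{2}\int dt_+\,dt_-\,\theta(t_+)\,t_+\,e^{-\frac{1}{16}(t_+^2+t_-^2+4R^2)\left(\frac{1}{\alpha}+\frac{1}{\beta}\right)}\,e^{-\frac{1}{8}t_+t_-\left(\frac{1}{\alpha}-\frac{1}{\beta}\right)}\right].$$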
We now do the Gaussian integral over $t_-$, followed by that over $t_+$. With $\alpha = s(1-x)$
and $\beta = sx$, the resulting sum of the two diagrams is proportional to
$$g_0^4\mu^{2(4-d)}\,T\,C_2(G)\,d_G\,D(R)\int ds\,dx\,s^{5/2-d}\,[x(1-x)]^{-d/2}\,e^{-R^2/4sx}\,(1-x)\,x^{1/2}\left[1-(1-x)^{-1/2}\,e^{-R^2/4s(1-x)}\right]. \qquad (9.4)$$
As usual, divergences come from the limits of the Schwinger-parameter integrals. Note,
however, that the s integral converges for small s. The UV divergence comes from the
region x ~ 1, from which we get a pole at d = 4. It is probing the short-distance
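singularity of only a single propagator. Indeed, we can do the s integral exactly and
obtain, up to the same overall numerical constant,
$$g_0^4\mu^{2(4-d)}\,T\,C_2(G)\,d_G\,D(R)\,\Gamma\!\left(d-\tfrac{7}{2}\right)\left(\frac{R^2}{4}\right)^{7/2-d}\int_0^1 dx\,x^{d/2-3}\left[(1-x)^{1-d/2} - (1-x)^{d/2-3}\right]. \qquad (9.5)$$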
The integrals can be evaluated in terms of Euler beta functions, and the first term has
a pole at d = 4, as a consequence of the singularity of its integrand at x = 1.
Next we compute the diagrams which correct the internal gauge-boson line. We will
do the computation in Feynman gauge, and in Minkowski space, but drop $i\epsilon$ factors
in propagators. In Problem 8.13 the reader will verify that the answer for the full
Wilson loop is the same in any covariant gauge. The theory is invariant both under
constant gauge transformations and under BRST symmetry. The first of these tells us
that the gauge-boson two-point function is proportional to $\delta^{ab}$. Given this constraint,
the BRST-symmetry relation for the two-point function reduces to the same constraint
as that which we found in abelian gauge theory: the two-point function is transverse.
So we have
$$\Pi^{ab}_{\mu\nu}(q) = \delta^{ab}\,(q^2\eta_{\mu\nu} - q_\mu q_\nu)\,\Pi(q^2).$$
Our QED calculations have shown us that dimensional regularization preserves BRST
symmetry, so it is sufficient to calculate the trace of the diagrams, which gives us
$\delta^{ab}(d-1)q^2\Pi(q^2)$. The reader is urged to carry out the full calculation of all components
in a general covariant gauge, in order to convince her/himself that the result
is indeed transverse. In this exercise, it will become apparent that individual graphs
do not satisfy the Ward identities. They are true only for the sum of graphs at a given
order.
The graphs in Figures 9.10(d)-(f) involving fermions or scalars are proportional to
their values in QED. The square of the integer charge of the fields is replaced by the
Dynkin index ($\mathrm{tr}(T^a_RT^b_R) = D(R)\delta^{ab}$) for the, generally reducible, representations of
the gauge group in which these fields live. Thus, the divergent part of the gauge-boson
two-point functions from these diagrams is
$$\Pi^{ab\,\mathrm{div}}_{\mu\nu} = i\,(q^2\eta_{\mu\nu} - q_\mu q_\nu)\,\delta^{ab}\left[-\frac{g^2}{4\pi}\,\frac{1}{2\pi(4-d)}\left(\frac{2}{3}D(R_F) + \frac{1}{3}D(R_S)\right)\right].$$
$g^2/4\pi$ is the analog of the fine structure constant for the non-abelian group G. We have
done the computation for Weyl fermion fields in the representation $R_F$. For parity-invariant
gauge couplings to Dirac fields, we should make the replacement 2/3 ->
4/3. Similarly, our computation was done for complex scalar fields. If some of the
irreducible components of $R_S$ are real, and we have only one real field in this irreducible
representation, the corresponding contribution is smaller by a factor of 1/2. The reader
should remember these factors of two when applying the above formula to specific
theories.
The pure gauge-theory contribution to the gauge-boson two-point function consists
of three graphs, Figures 9.10(a)-(c). The most complicated is the one involving triple
gauge-boson vertices. It has the form
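$$\Pi^{ab}_{1\,\mu\nu} = \frac{(-i)^2g_0^2\mu^{d-4}}{2}\,f^{acd}f^{bcd}\int\frac{d^dp}{(2\pi)^d}\,\frac{N_{\mu\nu}}{p^2(p+q)^2}, \qquad (9.8)$$
where the numerator factor is
$$N_{\mu\nu} = \left[\delta^\rho_\mu(q-p)^\sigma + \eta^{\rho\sigma}(2p+q)_\mu - \delta^\sigma_\mu(p+2q)^\rho\right]\times\left[\eta_{\nu\rho}(p-q)_\sigma - \eta_{\rho\sigma}(2p+q)_\nu + \eta_{\nu\sigma}(p+2q)_\rho\right].$$
The overall factor of 1/2 is a symmetry factor. The trace of the numerator is
$$N^\mu_\mu = -6(d-1)(p^2+q^2+p\cdot q).$$
The group-theory factor can be written in terms of the second Casimir operator,
$f^{acd}f^{bcd} = \delta^{ab}C_2(G)$. Thus
$$\Pi^{ab\,\mu}_{1\,\mu} = 3(d-1)\,g_0^2\mu^{d-4}\,\delta^{ab}C_2(G)\int\frac{d^dp}{(2\pi)^d}\,\frac{p^2+q^2+p\cdot q}{p^2(p+q)^2}. \qquad (9.9)$$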
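The trace of the triple-vertex numerator quoted above is easy to check by brute force. The sketch below assumes the standard form of the two vertex brackets, a diagonal metric (+, -, ..., -), and integer dimension d equal to the length of the momentum vectors; the helper names are hypothetical.

```python
import random

def metric(d):
    """Diagonal Minkowski metric (+, -, ..., -) in d dimensions."""
    return [1.0] + [-1.0] * (d - 1)

def dot(a, b, g):
    return sum(gi * ai * bi for gi, ai, bi in zip(g, a, b))

def numerator_trace(p, q):
    """Contract the two triple-vertex brackets and take the Lorentz trace."""
    d = len(p)
    g = metric(d)
    eta = lambda i, j: g[i] if i == j else 0.0       # eta^{ij}, diagonal
    a = [qi - pi for pi, qi in zip(p, q)]             # (q - p)
    b = [2 * pi + qi for pi, qi in zip(p, q)]         # (2p + q)
    c = [pi + 2 * qi for pi, qi in zip(p, q)]         # (p + 2q)
    total = 0.0
    for m in range(d):
        for r in range(d):
            for s in range(d):
                first = eta(m, r) * a[s] + eta(r, s) * b[m] - eta(s, m) * c[r]
                second = -eta(m, r) * a[s] - eta(r, s) * b[m] + eta(m, s) * c[r]
                # Lower the three indices of `second` with the diagonal metric.
                total += g[m] * g[r] * g[s] * first * second
    return total

def expected_trace(p, q):
    d = len(p)
    g = metric(d)
    return -6 * (d - 1) * (dot(p, p, g) + dot(q, q, g) + dot(p, q, g))

if __name__ == "__main__":
    random.seed(1)
    for d in range(2, 7):
        p = [random.uniform(-1, 1) for _ in range(d)]
        q = [random.uniform(-1, 1) for _ in range(d)]
        print(d, numerator_trace(p, q), expected_trace(p, q))
```

The agreement holds for every integer d tried, which is what allows the result to be continued to non-integer d in dimensional regularization.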
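The diagram of Figure 9.10(b) is given by
$$\begin{aligned}
\Pi^{ab}_{2\,\mu\nu} = \frac{(-i)^2g_0^2\mu^{d-4}}{2}\int\frac{d^dp}{(2\pi)^d}\,\frac{\delta^{cd}\eta^{\alpha\beta}}{p^2}\,\big[&f^{abe}f^{cde}(\eta_{\mu\alpha}\eta_{\nu\beta} - \eta_{\mu\beta}\eta_{\nu\alpha})\\
&+ f^{ace}f^{bde}(\eta_{\mu\nu}\eta_{\alpha\beta} - \eta_{\mu\beta}\eta_{\nu\alpha})\\
&+ f^{ade}f^{bce}(\eta_{\mu\nu}\eta_{\alpha\beta} - \eta_{\mu\alpha}\eta_{\nu\beta})\big]. \qquad (9.10)
\end{aligned}$$
We can reduce the group-theory factors and find that
$$\Pi^{ab}_{2\,\mu\nu} = -C_2(G)(d-1)\,g_0^2\mu^{d-4}\int\frac{d^dp}{(2\pi)^d}\,\frac{\delta^{ab}\eta_{\mu\nu}(p^2+q^2+2p\cdot q)}{p^2(p+q)^2}. \qquad (9.11)$$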
Finally, we have the diagram in Figure 9.10(c), with the ghost loop,
$$\Pi^{ab}_{3\,\mu\nu} = -(-i)^2g_0^2\mu^{d-4}\,f^{dac}f^{cbd}\int\frac{d^dp}{(2\pi)^d}\,\frac{(p+q)_\mu\,p_\nu}{p^2(p+q)^2}, \qquad (9.12)$$
where we note the absence of a symmetry factor and a minus sign, coming from the
fact that the ghosts are complex scalar fermions. In contrast to physical scalars, there
is no quartic coupling between the ghosts and the gauge bosons in the gauge we are
using. The group-theory factor in this diagram is $-C_2(G)\delta^{ab}$.
The sum of these three diagrams gives
$$\Pi^{ab\,\mu}_{\ \ \ \mu} = g_0^2\mu^{d-4}\,\delta^{ab}C_2(G)\int\frac{d^dp}{(2\pi)^d}\,\frac{N(d,x,p,q)}{p^2(p+q)^2},$$
where
$$N(d,x,p,q) = \left[3(d-1)(p^2+q^2+p\cdot q) - d(d-1)(p^2+q^2+2p\cdot q) - (p^2+p\cdot q)\right].$$
We now introduce a Feynman parameter x to simplify the denominator, and define
$r = p + xq$, $U = -x(1-x)q^2$. We can then discard terms in the numerator linear in r
and write it as
$$-(d-2)^2(r^2+x^2q^2) + (d-1)(3-d)\,q^2 + (2d^2-5d+4)\,xq^2.$$
Note that the term in the numerator quadratic in the integration variable is multiplied
by $(d-2)$. The integral over this term will give rise to a pole at $d = 2$, which is the
signal of a quadratic divergence in DR. The fact that it is multiplied by d — 2 indicates
that this pole is absent and there is no quadratic divergence, and thus no divergent
gauge-boson mass.
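The algebra of the shift can be verified with exact rational arithmetic. The sketch below assumes the shift $r = p + xq$, which completes the square in $x(p+q)^2 + (1-x)p^2$, and keeps the r-independent $q^2$ term $(d-1)(3-d)q^2$ that the substitution generates. Averaging over $r \to -r$ stands in for dropping the terms linear in r. The identity is purely algebraic, so a Euclidean dot product is used and d is treated as a free rational parameter; all function names are hypothetical.

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def numerator(d, p, q):
    """N(d, x, p, q) before the shift, written in terms of p^2, q^2 and p.q."""
    pp, qq, pq = dot(p, p), dot(q, q), dot(p, q)
    return 3 * (d - 1) * (pp + qq + pq) - d * (d - 1) * (pp + qq + 2 * pq) - (pp + pq)

def shifted_numerator(d, x, r, q):
    """Numerator after p = r - x q, with the terms linear in r dropped."""
    rr, qq = dot(r, r), dot(q, q)
    return (-(d - 2) ** 2 * (rr + x * x * qq)
            + (d - 1) * (3 - d) * qq
            + (2 * d * d - 5 * d + 4) * x * qq)

def symmetrized_numerator(d, x, r, q):
    """Average of N over r -> -r, which removes the terms linear in r."""
    p_plus = [ri - x * qi for ri, qi in zip(r, q)]
    p_minus = [-ri - x * qi for ri, qi in zip(r, q)]
    return Fraction(1, 2) * (numerator(d, p_plus, q) + numerator(d, p_minus, q))

if __name__ == "__main__":
    d, x = Fraction(7, 2), Fraction(1, 3)
    r = [Fraction(1), Fraction(-2), Fraction(3)]
    q = [Fraction(2), Fraction(5), Fraction(-1)]
    print(symmetrized_numerator(d, x, r, q) == shifted_numerator(d, x, r, q))
```

The coefficient of $r^2$ is $-(d-2)^2$ for any d, which is the statement in the text that the would-be quadratic divergence is multiplied by a factor vanishing at d = 2.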
As usual, we do the integrals by analytically continuing to Euclidean space. I take the
opportunity to record here the Minkowski-space values of two dimensionally regulated
integrals which are obtained by this method:
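$$\int\frac{d^dp}{(2\pi)^d}\,\frac{1}{(p^2-U)^n} = \frac{(-1)^n\,i}{2^d\pi^{d/2}}\,\frac{\Gamma(n-d/2)}{\Gamma(n)}\left(\frac{1}{U}\right)^{n-d/2},$$
$$\int\frac{d^dp}{(2\pi)^d}\,\frac{p^\mu p^\nu}{(p^2-U)^n} = \frac{(-1)^{n-1}\,i\,\eta^{\mu\nu}}{2^{d+1}\pi^{d/2}}\,\frac{\Gamma(n-d/2-1)}{\Gamma(n)}\left(\frac{1}{U}\right)^{n-d/2-1}.$$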
The reader should note how the argument of the numerator Euler function corresponds
to the power of U. Once one has understood the methods of Euclidean field theory, it
is often convenient to just remember such a table of effective Minkowski integrals.
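A quick numerical check of the first entry of such a table can be made in a convergent case. The sketch below works with the Euclidean integral $\int d^dp_E/(2\pi)^d\,(p_E^2+U)^{-n}$ and assumes the standard Wick-rotation statement that the Minkowski value is $(-1)^n i$ times the Euclidean one; the function names and the change of variables $p = \sqrt{U}\,t/(1-t)$ are choices made for this illustration only.

```python
import math

def euclidean_formula(n, d, U):
    """Gamma-function formula for the Euclidean integral of 1/(p^2 + U)^n."""
    return (math.gamma(n - d / 2) / ((4 * math.pi) ** (d / 2) * math.gamma(n))
            * U ** (d / 2 - n))

def euclidean_numeric(n, d, U, steps=200_000):
    """Radial midpoint-rule integration; requires 2n > d for convergence."""
    solid_angle = 2 * math.pi ** (d / 2) / math.gamma(d / 2)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        p = math.sqrt(U) * t / (1.0 - t)          # maps (0, 1) onto (0, infinity)
        jacobian = math.sqrt(U) / (1.0 - t) ** 2
        total += h * jacobian * p ** (d - 1) / (p * p + U) ** n
    return solid_angle / (2 * math.pi) ** d * total

def minkowski_value(n, d, U):
    """Assumed Wick-rotation relation: (-1)^n * i * (Euclidean value)."""
    return (-1) ** n * 1j * euclidean_formula(n, d, U)

if __name__ == "__main__":
    print(euclidean_numeric(3, 4, 2.0), euclidean_formula(3, 4, 2.0))
    print(minkowski_value(3, 4, 2.0))
```

The exponent of U is fixed by dimensional analysis, which is the correspondence between the Gamma-function argument and the power of U noted in the text.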
The reader is asked to complete the computation of the renormalized static potential
$V(R)$ in Problem 9.10. He/she will find that the $\beta$ function for the Yang-Mills coupling
is given by
$$\beta(g) = -\frac{g^3}{16\pi^2}\left(\frac{11}{3}C_2(G) - \frac{2}{3}D(R_F) - \frac{1}{3}D(R_S)\right).$$
$C_2(G)$ is the quadratic Casimir invariant of the adjoint representation, and D(R) is the
Dynkin index of the representation R. We have done the computation for Weyl fermions
(Dirac fermions will give an extra factor of two) and complex scalars (real scalars are
possible for real components of $R_S$ and give an extra factor of 1/2). As long as there are
not too many matter fields, the non-abelian gauge coupling is marginally relevant, or
asymptotically free [113-118]. A definition of the coupling in terms of the potential is
manifestly independent of the gauge parameter. On the other hand, Problem 9.9 shows
that the first two terms in the $\beta$ function are the same for all renormalization schemes
whose couplings are related by power-series expansions. Thus the first two terms of the $\beta$
function are universal, scheme-independent, and gauge-invariant. A general definition
of the gauge coupling will not have a gauge-parameter-independent $\beta$ function beyond two loops.
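To make the counting of matter fields concrete, here is a small sketch that evaluates the bracket in the one-loop $\beta$ function with exact fractions. It assumes the common normalization in which $C_2(G) = N$ and the fundamental representation has Dynkin index 1/2 for SU(N); the function names are hypothetical.

```python
from fractions import Fraction

def one_loop_coefficient(c2_adjoint, weyl_indices=(), complex_scalar_indices=()):
    """b0 in beta(g) = -b0 g^3 / (16 pi^2): 11/3 C_2(G) - 2/3 sum D(R_F) - 1/3 sum D(R_S)."""
    return (Fraction(11, 3) * c2_adjoint
            - Fraction(2, 3) * sum(weyl_indices, Fraction(0))
            - Fraction(1, 3) * sum(complex_scalar_indices, Fraction(0)))

def su_n_with_dirac_flavors(n_colors, n_flavors):
    """SU(N) with Dirac fermions in the fundamental: each flavor is two Weyl fields."""
    weyl = [Fraction(1, 2)] * (2 * n_flavors)
    return one_loop_coefficient(Fraction(n_colors), weyl_indices=weyl)

if __name__ == "__main__":
    for n_f in (0, 6, 16, 17):
        b0 = su_n_with_dirac_flavors(3, n_f)
        print(n_f, b0, "asymptotically free" if b0 > 0 else "not asymptotically free")
```

In this normalization the coefficient for SU(3) is $11 - 2n_f/3$, so asymptotic freedom survives up to 16 Dirac flavors, which is one way to quantify "not too many matter fields".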
The static potential in a massless gauge theory will satisfy the RG equation
$$(\mu\partial_\mu + \beta(g)\partial_g)\,V(R,g,\mu) = 0,$$
while dimensional analysis implies that
$$(\mu\partial_\mu - R\partial_R)\,V = V.$$
Together, these equations imply that
$$V = -\frac{g^2(1/R)\,C_2(R)}{4\pi R}.$$
The potential is essentially the Coulomb potential, multiplied by the running coupling
at scale 1/R. In perturbation theory, this relation arises because all non-scale-invariant
R dependence is a function of $R\mu$, but all $\mu$ dependence comes through the bare
coupling and survives in the limit $d \to 4$ only when it multiplies a pole. The RG equation
implies that the kth loop term in perturbation theory will be a kth-order polynomial
in $\ln(\mu R)$.
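The statement that the potential is the Coulomb form dressed by the running coupling can be illustrated at one loop. The sketch below assumes $\beta(g) = -b_0g^3/16\pi^2$, whose solution is $1/g^2(\mu') = 1/g^2(\mu) + (b_0/8\pi^2)\ln(\mu'/\mu)$, and takes the overall sign of V to be attractive; both the sign convention and the function names are assumptions of this example. It checks both differential equations by finite differences.

```python
import math

def beta(g, b0):
    return -b0 * g ** 3 / (16 * math.pi ** 2)

def running_g2(g, mu, scale, b0):
    """One-loop g^2 at `scale`, given the coupling g defined at mu."""
    return g * g / (1.0 + b0 * g * g / (8 * math.pi ** 2) * math.log(scale / mu))

def potential(R, g, mu, b0, c2):
    """Coulomb potential with the coupling evaluated at the scale 1/R."""
    return -c2 * running_g2(g, mu, 1.0 / R, b0) / (4 * math.pi * R)

def rg_residual(R, g, mu, b0, c2, eps=1e-5):
    """(mu d/dmu + beta d/dg) V, estimated with central differences."""
    dv_dlnmu = (potential(R, g, mu * math.exp(eps), b0, c2)
                - potential(R, g, mu * math.exp(-eps), b0, c2)) / (2 * eps)
    dv_dg = (potential(R, g + eps, mu, b0, c2)
             - potential(R, g - eps, mu, b0, c2)) / (2 * eps)
    return dv_dlnmu + beta(g, b0) * dv_dg

def scaling_residual(R, g, mu, b0, c2, eps=1e-5):
    """(mu d/dmu - R d/dR) V - V, estimated with central differences."""
    dv_dlnmu = (potential(R, g, mu * math.exp(eps), b0, c2)
                - potential(R, g, mu * math.exp(-eps), b0, c2)) / (2 * eps)
    dv_dlnr = (potential(R * math.exp(eps), g, mu, b0, c2)
               - potential(R * math.exp(-eps), g, mu, b0, c2)) / (2 * eps)
    return dv_dlnmu - dv_dlnr - potential(R, g, mu, b0, c2)

if __name__ == "__main__":
    print(rg_residual(0.3, 1.2, 2.0, 7.0, 4.0 / 3.0))
    print(scaling_residual(0.3, 1.2, 2.0, 7.0, 4.0 / 3.0))
```

Expanding `running_g2` in powers of g reproduces the pattern described in the text: the kth loop order brings k powers of $\ln(\mu R)$.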
9.12 Renormalization-group equations for masses
and the hierarchy problem
So far we have talked mostly about the use of RG equations for Euclidean Green
functions. In Lorentzian signature, various complications arise, which we will sketch
in this section. First consider the RG equation for a connected two-point function
$$\mathcal{D}W_2 = -2\gamma W_2,$$
where $\mathcal{D}$ is the infinitesimal RG operator ($\mu\partial_\mu + \beta\partial_\alpha + m\gamma_m\partial_m$ in spinor QED) and
$\gamma$ is the anomalous dimension of the field in question (this argument is applicable to
any operator that undergoes multiplicative renormalization). Since the RG equation is
independent of momentum, it applies near a pole, where
$$W_2 \to \frac{Z}{p^2-M^2}.$$
Since $\mathcal{D}$ is a first-order differential operator, it produces both a single-pole and a double-pole
term when acting on $W_2$. The two cannot cancel out, and hence must satisfy the
equation separately. Thus
$$\mathcal{D}Z = -2\gamma Z$$
and
$$\mathcal{D}M^2 = 0.$$
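Spelled out, the step from the RG equation to these two conditions is the following short computation, in which $\mathcal{D}$ acts on Z and $M^2$ but not on the momentum p. This is a worked restatement of the argument above, not an additional assumption.

```latex
\mathcal{D}\,\frac{Z}{p^2-M^2}
  = \frac{\mathcal{D}Z}{p^2-M^2} + \frac{Z\,\mathcal{D}M^2}{(p^2-M^2)^2}
  = -2\gamma\,\frac{Z}{p^2-M^2}
\quad\Longrightarrow\quad
\mathcal{D}Z = -2\gamma Z, \qquad \mathcal{D}M^2 = 0 .
```

The right-hand side has only a single pole, so the coefficient of the double pole must vanish on its own.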
The latter equation is particularly interesting when the theory has no relevant operators,
as is true for a chirally symmetric abelian or non-abelian gauge theory, like massless
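QED. In massless QED, we get the equation
$$(\mu\partial_\mu + \beta(\alpha)\partial_\alpha)\,M = 0.$$
On combining this with dimensional analysis we have the solution
$$M = \mu c\,e^{-\int^\alpha d\alpha'/\beta(\alpha')} \to \mu c\,e^{3\pi/(2\alpha)},$$
where the last form uses the one-loop $\beta$ function, $\beta(\alpha) = 2\alpha^2/3\pi$.
This equation makes no sense unless c = 0. If $c \neq 0$ the theory would not be continuous
at $\alpha = 0$, contradicting the assumption of perturbation theory on which we based the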
calculation.
This is a very general result: a marginally irrelevant perturbation cannot produce a
finite mass scale. On the other hand, if the perturbation were marginally relevant, the
sign of the exponent would change, and we would learn that exponentially small masses
could be generated. This result contains the seed of one of the most important ideas
in quantum field theory. It gives us an idea of how the vast hierarchy of mass scales in
the real world might be generated from a system with a single intrinsic scale as large
as the Planck mass. All that is necessary is that the theory have a low-energy effective
approximation that is a marginally relevant perturbation of a fixed-point quantum field
theory. In that case, if the underlying high-energy theory can give us an explanation
of why the initial values of parameters are moderately close to the critical surface,
we can explain the natural occurrence of scales exponentially smaller than the Planck
scale.
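A one-loop sketch makes the exponential hierarchy explicit. It assumes $\beta(g) = -b_0g^3/16\pi^2$ with $b_0 > 0$, for which the combination $\Lambda = \mu\,e^{-8\pi^2/(b_0g^2(\mu))}$ is independent of $\mu$. The inputs used in the demonstration ($b_0 = 3$, $g^2/4\pi = 1/25$ at a cut-off of $1.2\times10^{19}$ GeV) are illustrative assumptions, and the function names are hypothetical.

```python
import math

def running_g2(g2, mu, scale, b0):
    """One-loop running: 1/g^2(scale) = 1/g^2(mu) + b0/(8 pi^2) ln(scale/mu)."""
    return g2 / (1.0 + b0 * g2 / (8 * math.pi ** 2) * math.log(scale / mu))

def dynamical_scale(mu, g2, b0):
    """RG-invariant scale generated by a marginally relevant coupling."""
    return mu * math.exp(-8 * math.pi ** 2 / (b0 * g2))

def log_sensitivity(g2, b0):
    """d ln(Lambda) / d ln(g^2): the sensitivity is logarithmic, not power-law."""
    return 8 * math.pi ** 2 / (b0 * g2)

if __name__ == "__main__":
    m_planck = 1.2e19                 # GeV, illustrative cut-off
    g2_cutoff = 4 * math.pi / 25      # illustrative bare coupling
    b0 = 3.0                          # illustrative one-loop coefficient
    print(dynamical_scale(m_planck, g2_cutoff, b0))
    # The same scale is obtained from the coupling run down to 1 TeV.
    g2_tev = running_g2(g2_cutoff, m_planck, 1.0e3, b0)
    print(dynamical_scale(1.0e3, g2_tev, b0))
```

A modest coupling at the cut-off therefore produces a scale many orders of magnitude below it, with no fine tuning of the input beyond its order of magnitude.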
The nice behavior of marginally relevant parameters is to be contrasted with what
happens if we have truly relevant parameters, like scalar masses near the Gaussian
fixed point. In that case, the RG tells us that we need a fine tuning of initial conditions
with accuracy $(m/M_P)^2$ in order to explain a mass scale of order m. Most physicists
consider this unnatural. By contrast, the tuning of the bare coupling of QCD at a
Planck scale cut-off, which is required in order to explain the QCD scale in terms
of the Planck scale, is only about one part in 25.¹⁷ Fermion masses, although they
seem like relevant perturbations, actually behave like marginal perturbations because
of chiral symmetry. This means that, if the underlying theory has enough symmetry to
forbid the appearance of a fermion mass in the high-energy Lagrangian, then we could
explain a small value of this mass in the real world in terms of spontaneous breaking
of this symmetry by the IR effects of a marginally relevant coupling.
17 I am here assuming that the particle content of the minimal supersymmetric standard model is the only
thing that renormalizes the QCD coupling between laboratory scales and the Planck scale. Extra matter
makes the tuning of the bare coupling even less severe, as long as it preserves asymptotic freedom.
The concept of technical naturalness is based on this example. We have a parameter
that could be classed as relevant, but might naturally be small as a consequence of
a fundamental symmetry broken spontaneously by a marginally relevant coupling. It
is then said that a theory containing a small value of this parameter is technically
natural, because there will not be power-law renormalizations of its value. It becomes
truly natural when we supply the underlying explanation for the symmetry, and the
dynamical symmetry-breaking mechanism.
The standard model contains one parameter, the quadratic term in the Higgs poten-
tial, that is not technically natural. It receives quadratically divergent renormalizations,
and it is not easy to understand why it is not of order the Planck scale. This is called
the gauge hierarchy problem, since this parameter determines the scale of the massive
weak gauge bosons. Proposals to solve this problem divide roughly into two classes.
The first goes under the name of technicolor, and replaces the Higgs field by a new
QCD-like sector with a dynamical scale of order a few TeV. This gives an elegant
account of gauge symmetry breaking, but suffers from numerous phenomenologi-
cal problems when we try to couple it to quarks and leptons. The second solution
of the hierarchy problem is based on supersymmetry. Supersymmetry is a symmetry
that relates bosons to fermions, and hence allows bosons to benefit from the chi-
ral protection afforded to fermion masses. It is also incredibly interesting because
it forces us to think about gravitational effects, and seems to be an intrinsic com-
ponent of string theory, our only successful theory of quantum gravity. Generic
supersymmetric extensions of the standard model, which allow for parameters that
break supersymmetry in the phenomenologically necessary way, also have a variety
of phenomenological problems. However, it is possible to solve these problems in spe-
cific models. The key question seems to be to understand the mechanism by which
supersymmetry is broken. String theory contains hints suggesting that supersymme-
try breaking is much more constrained than low-energy field theory would have us
believe.
There is another resolution, or rather postponement, of the hierarchy problem in a
class of models that go under the rubric "little Higgs" [119]. These models successfully
postpone the hierarchy discussion to the 100-TeV scale, and declare that, since that
scale will be out of experimental reach for the foreseeable future, we can safely ignore
it. I will leave the reader to decide on the value of these models for her/himself.
9.12.1 Marginally relevant perturbations
A remarkable theorem of Coleman, Gross, and Zee [120-122] shows that theories
with marginally relevant parameters in four-dimensional space-time always involve
non-abelian gauge bosons. Non-abelian gauge couplings are always marginally rele-
vant unless the matter representation is too large. Some of the other couplings in the
theory may also turn out to be marginally relevant, as a consequence of their inter-
action with the non-abelian bosons. The result we have just proved suggests strongly
that non-abelian gauge interactions will be our only explanation of the large hierarchy
in energy scales between the Planck mass, $10^{19}$ GeV, and the typical scales of particle
physics. This does indeed seem to be the explanation of the scale of strong interac-
tions. Notice that the phenomenon of confinement, which we discussed in Chapter 8,
involves a dimensionful parameter, the string tension or energy per unit length of the
QCD flux tube. If the QCD coupling were not marginally relevant, we would expect
this parameter to be at the cut-off scale. The scale $\Lambda_{QCD}$ is therefore sometimes called
the confinement scale of QCD, though experiment shows that the actual string tension
involves a scale roughly lit larger than $\Lambda_{QCD}$.
Actually, if we follow this philosophy stringently, we are led to either omit funda-
mental scalar fields from our Lagrangian or insist on supersymmetry as a fundamental
principle. Supersymmetry relates scalar fields to Weyl fermions, in such a way that
chiral symmetries act on scalars. Thus, in supersymmetric theories, scalar mass param-
eters, like fermion masses, behave like marginally relevant operators, even though they
have dimension 2. They receive only logarithmic renormalizations. Thus, it is consis-
tent to have scalar mass parameters of order 100 GeV, even if the cut-off scale is much
higher than that. Supersymmetry is broken in the real world, and its virtues can be
preserved only if the superpartners of standard-model particles lie not too far above
the electro-weak scale.
This kind of solution of the hierarchy problem is technical in nature. It does not yet
explain the value of the electro-weak scale in the way that asymptotic freedom explains
the scale of strong interactions. In order to do that, we have to make the breaking of
the chiral symmetry that protects the Higgs mass into a dynamical mechanism, which
depends on some new marginally relevant parameter in the Lagrangian of the world.
This can be done either with or without supersymmetry. In the non-supersymmetric
solution, which is called technicolor [ ], the Higgs field is a fermion bilinear in a
new QCD-like sector. These models have severe phenomenological problems. There is
a variety of supersymmetric scenarios for a completely dynamical explanation of the
electro-weak scale. Many of them are roughly consistent with current experimental
data, but there are various causes for unease, and no uniquely beautiful model has yet
emerged.
9.13 Renormalization-group equations for the
S-matrix
The LSZ formula for S-matrix elements has the schematic form
$$S = \prod_i f_i\,(p_i^2 - M_i^2)\,Z_i^{-1/2}\;W_{n+m}.$$
The $f_i$ are normalized ingoing or outgoing single-particle wave functions, which depend
only on the masses, the $Z_i$ are the residues of poles in two-point functions, and the $W_n$
are renormalized connected Green functions. It is easy to see, as a consequence of
the equations in the last subsection and of the first-order nature of $\mathcal{D}$, that S-matrix
elements satisfy
$$\mathcal{D}S = 0.$$
This simply expresses the fact that the theory is over-parametrized and that the S-matrix
is a physical amplitude, which does not depend on how we have chosen to define the
couplings.
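The cancellation can be displayed in one line. The sketch below assumes that the n-point generalization of the two-point RG equation is $\mathcal{D}W_{n+m} = -(\sum_i\gamma_i)W_{n+m}$, with one anomalous dimension $\gamma_i$ per external leg; this generalization is an assumption of the sketch, since the section states only the two-point case explicitly.

```latex
\mathcal{D}\,Z_i^{-1/2} = -\tfrac{1}{2}\,Z_i^{-3/2}\,\mathcal{D}Z_i = +\gamma_i\,Z_i^{-1/2},
\qquad
\mathcal{D}\,(p_i^2 - M_i^2) = 0,
\qquad
\mathcal{D}f_i = 0,
\\[4pt]
\mathcal{D}S
  = \Big(\sum_i \gamma_i\Big) S \;-\; \Big(\sum_i \gamma_i\Big) S
  = 0 .
```

The first sum comes from the factors $Z_i^{-1/2}$ and the second from the Green function, while the wave functions and the inverse propagators are inert because they depend only on the RG-invariant masses.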
Unlike the case of Green functions, we cannot combine this equation with dimen-
sional analysis to extract a momentum dependence of the scattering matrix. The point
is that the S-matrix is defined on mass shell, and we cannot arbitrarily rescale the masses
of particle states. These typically form a discrete spectrum. Thus, the RG has nothing
to say directly about scattering-matrix elements.
However, we can combine the RG with perturbation theory in certain weak inter-
actions to obtain information about inclusive cross sections in a strongly interacting
sector of the theory. Consider for example the production of hadrons via electron-
positron annihilation. To lowest order in QED, the amplitude for producing a given
hadronic state X is given in terms of the matrix element $\langle 0|J_\mu(x)|X\rangle$ of the electromagnetic
current $\bar{q}Q\gamma_\mu q$ in QCD. This cannot be calculated without a real solution
of QCD. However, the inclusive cross section, summing over all hadronic final states,
is related to the Fourier transform of the two-point function of the hadronic current,
$J_{\mu\nu}(q)$. Even for large time-like q, which is relevant to high-energy $e^+e^-$ annihilation,
the RG equation does not say anything definitive about this two-point function. For
example, if there are heavy quarks, with mass well above the scale $\Lambda_{QCD}$ at which strong
interactions are strong, there may be bound states of these quarks, which would appear
as resonances in this two-point function. Their masses would satisfy the RG equation,
and any function of q, $M_i$ satisfying dimensional analysis would be an acceptable
solution.
However, it is a phenomenological observation that, away from resonance singular-
ities, such cross sections are smoothly varying functions of q. In the deep Euclidean
region, where q is space-like and $|q| \gg \Lambda_{QCD}$, we can invoke asymptotic freedom to calculate
the two-point function in terms of perturbative graphs involving quarks and
gluons. Analyticity, in the form of the Lehmann representation, allows us to write this
explicitly known behavior in terms of integrals over the time-like region. These formu-
lae show that, in smoothly varying regions, the time-like behavior is simply the analytic
continuation of the perturbative space-like formulae, while near resonances the pertur-
bative formulae reproduce only certain integrals over the resonance. Thus, combining
knowledge of the fixed-point behavior with general principles, we can use the RG to
predict things about the high-energy behavior of QCD [32].
A similar, but more complicated, analysis works for the inclusive cross sections for
lepton scattering off hadrons, in the regime of high energy and momentum transfer
(deep inelastic scattering). After making an angular momentum projection, one can
again relate the cross section to the short-distance OPE of two currents, this time taken
between single-particle hadron states. These calculations justify and extend Feynman's
heuristic parton model for high-energy scattering in the strong interactions. Workers
have applied the successful QCD parton model to a variety of other processes for which
a rigorous analysis is harder to come by. This elaboration of perturbative QCD has been
quite successful. You can read about some of the details in Peskin and Schroeder [33].
A more modern account can be found in [124].
9.14 Renormalization and symmetry
9.14.1 You don't need symmetry
Wilson's approach to the interplay of renormalization and symmetry is based on the
observation that the relevant and marginal perturbations of a Lagrangian with sym-
metry are generally finite in number. At tree level, one can set the coefficients of all
symmetry-breaking operators to zero. Even if one ignores the symmetry in construct-
ing a cut-off version of the theory, one can still imagine tuning the coefficients of the
symmetry-breaking relevant and marginal operators to restore the naive Ward identities
at the level of renormalized Green functions.
Wilson's tuning procedure ignores the issue of naturalness that we raised above. 18
More interestingly, we have encountered a class of examples in which it fails. These
are theories with "anomalies." Certain terms that appear in loop corrections to the
classical Ward identities cannot be canceled out by adding local counterterms to the
action. Anomalies are always associated with particles whose mass is zero, or can be
made to go to zero by tuning the expectation value of a low-energy field. The only
exception to this is classical scale invariance. In zeroth-order perturbation theory, the
only requirement is that all operators in the Lagrangian be marginal. However, we
have seen that most marginal operators are actually marginally relevant or irrelevant.
Interacting scale-invariant theories are scarce, and typically appear only at isolated
points in the space of field theories.
9.14.2 But symmetries are nice
On the other hand, we have seen that invoking symmetries makes the whole process of
renormalization less arduous. If we can invent a regulator that preserves a symmetry
we should do it, if only because it saves work in calculation. Quite frankly, it wasn't
until 't Hooft and Veltman introduced dimensional regularization that calculations in
non-abelian gauge theory became sufficiently transparent for one to decide whether
the theory was renormalizable. The Wilsonian approach can work, but only if one is a
master calculator, who doesn't make mistakes.
More importantly, symmetries can help us to understand the smallness or absence of
terms in the effective action that we might alternatively think of as finely tuned relevant
parameters. This is the philosophy of "technically natural field theory" which we have
adumbrated above. Indeed, it can be argued that much of the power of effective field
theory comes from its exploitation of symmetry. For example, we are able to calculate
a large number of low-energy scattering amplitudes involving pions in terms of a small
number of parameters just by using chiral symmetry and effective field theory. Effective
field theory alone would give us very little information.
18 Wilson was one of the very first advocates of naturalness as a criterion for a good theory.
9.14.3 And sometimes you get them for free
The deepest remark one can make about the relationship between symmetry and renor-
malization is that symmetries can be emergent. Like much else in this field, this remark
is due to Wilson. To make the point in a simple context, consider a lattice version of
a scalar field action, thought of as a perturbation of the Gaussian fixed point. There
are lots of allowed operators (e.g. $\sum_\mu(\partial_\mu\phi)^4$) in the theory, on a cubic lattice, which
break continuous rotation invariance. However, for most regular lattices, the lattice
symmetries are sufficient to remove all relevant and marginal operators that break the
continuous symmetry. Thus we can view rotation invariance (and, after Wick rotation,
Lorentz invariance) as an accidental low-energy consequence of the fact that IR physics
is dominated by a fixed point. In fact, most of the fixed points describing second-order
transitions in condensed-matter physics have emergent rotation invariance, despite
coming from underlying systems that have no such symmetry.
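A minimal numerical sketch of this statement, assuming the simplest nearest-neighbour kinetic term on a unit hypercubic lattice (the function names are illustrative and do not come from the text): the lattice kinetic operator differs from $k^2$ by terms that break rotation invariance, and their relative size falls like $k^2$ at small momentum, as one expects of an irrelevant perturbation.

```python
import math

def lattice_k2(k):
    """Momentum-space nearest-neighbour lattice Laplacian on a unit cubic lattice:
    sum_mu (2 - 2 cos k_mu), which tends to |k|^2 as k -> 0."""
    return sum(2.0 - 2.0 * math.cos(c) for c in k)

def anisotropy(kmag, dim=4):
    """Relative difference between the lattice operator along a lattice axis
    and along the body diagonal, at the same continuum length |k| = kmag."""
    axis = [kmag] + [0.0] * (dim - 1)
    diag = [kmag / math.sqrt(dim)] * dim
    return abs(lattice_k2(axis) - lattice_k2(diag)) / kmag**2

# The rotation-breaking piece shrinks quadratically as the momentum is lowered.
for kmag in (1.0, 0.1, 0.01):
    print(f"|k| = {kmag:5.2f}   relative anisotropy = {anisotropy(kmag):.3e}")
```

In this toy setting the leading anisotropy comes from the $k_\mu^4$ term of the expansion of the cosine, so in four dimensions the ratio printed above should approach $k^2/16$ in lattice units.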
It is likely that Lorentz invariance is not accidental, but rather a consequence of
the theory of quantum gravity. It is a large-distance manifestation of the underlying
gauge principle of general covariance. However, there is lots of room for emergent
continuous symmetries to play a role in our theory of the real world. In particular, the
standard model has a host of exact and approximate global continuous symmetries like
baryon number, lepton numbers, and various quark flavor numbers. There are good
arguments that none of these symmetries is exact, but they might be emergent infrared
symmetries of an underlying theory in which e.g. only discrete subgroups of them
are exact. A detailed exploration of theories with emergent continuous symmetries is
beyond the scope of our discussion here, but it is an important theme in the exploration
of physics beyond the standard model.
Important examples of this kind of emergent symmetry abound in the standard
model. For example, baryon and lepton numbers, and lepton flavor, are preserved auto-
matically by the most general renormalizable Lagrangian with the standard-model
gauge symmetries. 19 Furthermore, any theory of baryon-number violation can be
parametrized by the coefficients of a small number of dimension-6 operators.
19 Actually, $B$ and $L$ are broken, and only $B - L$ is preserved, by an anomaly in the standard model. However,
at low energies the breaking is exponentially small.
A more subtle emergent symmetry is the extra SU(2) in the Higgs sector of the standard
model, which we called custodial SU(2). It is broken by the U(1) gauge interaction,
but that is the only renormalizable interaction which can break it. As a consequence,
the tree-level broken-symmetry relation
$$M_W = M_Z\cos\theta_W$$
can get only finite corrections, when it is expressed in terms of the renormalized
couplings.
9.14.4 Renormalization and spontaneous symmetry breaking
In the previous section we discussed the relation between renormalization and explicit
breaking of symmetries. What then of theories with spontaneously broken symmetry?
Our discussions of spontaneously broken symmetry and of the renormalization group
show that the two subjects are really decoupled from each other. Renormalization is
the procedure for deriving the effective theory of long-wavelength degrees of freedom
of a system, while spontaneous symmetry breakdown is a property of the solution of
that effective field theory.
In particular, if we consider the system in a large but finite volume, then there is
no spontaneous breaking of symmetry. There is a unique, symmetric ground state.
The renormalization procedure involves integrating out short-wavelength degrees of
freedom, which, because of the approximate locality of interactions, are insensitive to
the volume of the system.
A procedure for seeing this decoupling explicitly in perturbative calculations is to
calculate the quantum action, rather than connected Green functions. One can then
test the system for spontaneous symmetry breaking by looking to see whether there
are translation-invariant solutions of the field equations
$$\frac{\delta\Gamma}{\delta\phi(x)} = 0$$
that violate symmetries of the classical action. To do this, we need only calculate the
effective potential, which is defined by 20
$$\Gamma[\phi_c] = \frac{1}{g^2}\int d^4x\, V_{\rm eff}(\phi_c).$$
$g^2$ is the loop-counting parameter. The derivatives of $V_{\rm eff}$ w.r.t. $\phi_c$ are 1PI Green
functions evaluated at zero momentum.
To develop an efficient graphical method to calculate $V_{\rm eff}$, we begin with its Legendre
transform, the connected generating functional in a constant classical source $J$, divided
by the volume of space-time. $w(J)$ is computed as the logarithm of a path integral, in
the usual way, except that we divide the answer by the volume, and multiply by $g^2$. At
tree level, $w(J)$ is just the Legendre transform of the potential in the classical action.
More generally, we can think of it as the energy density of the ground state in the
presence of the constant classical source. There are two interesting interpretations of
$V_{\rm eff}$, which follow from these definitions.
20 We work in Euclidean space. In Lorentzian signature this equation needs a minus sign.
First of all, it is the answer to the following variational question: "What is the minimal
value of the expectation value of the Hamiltonian of the $J = 0$ theory, in states
constrained to satisfy
$\langle\psi|\phi|\psi\rangle = \phi_c$?"
Indeed, the way to solve this constrained variational problem is to introduce $J$ as
a Lagrange multiplier, calculate the minimal energy as a function of $J$ (which is
$w(J)\,V_{\rm space}$) and then impose the constraint
$$w'(J) = \phi_c.$$
The minimal expectation value of $H$ is then obtained by subtracting the source term
from $w$ and then expressing the answer as a function of $\phi_c$. This is just the Legendre
transform.
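The variational statement can be illustrated at tree level with a short numerical sketch. It assumes a hypothetical convex quartic potential and takes $w(J)$ to be the minimum over constant fields of $V(\phi) + J\phi$; the parameter values and function names are chosen only for illustration. Imposing $w'(J) = \phi_c$ and subtracting the source term should then return the original potential evaluated at $\phi_c$.

```python
def V(phi, m2=1.0, lam=0.25):
    """Hypothetical convex tree-level potential V = (m2/2) phi^2 + lam phi^4."""
    return 0.5 * m2 * phi**2 + lam * phi**4

def dV(phi, m2=1.0, lam=0.25):
    return m2 * phi + 4.0 * lam * phi**3

def d2V(phi, m2=1.0, lam=0.25):
    return m2 + 12.0 * lam * phi**2

def phi_star(J):
    """Field value minimizing V(phi) + J phi (Newton iteration on V'(phi) + J = 0)."""
    phi = 0.0
    for _ in range(100):
        phi -= (dV(phi) + J) / d2V(phi)
    return phi

def w(J):
    """Minimal energy density in the presence of the constant source J."""
    phi = phi_star(J)
    return V(phi) + J * phi

def legendre_point(J, h=1e-5):
    """Impose phi_c = w'(J) numerically and return (phi_c, w(J) - J phi_c)."""
    phi_c = (w(J + h) - w(J - h)) / (2.0 * h)
    return phi_c, w(J) - J * phi_c

phi_c, v_eff = legendre_point(-2.0)
print(phi_c, v_eff, V(phi_c))  # the last two numbers should agree
```

Beyond tree level $w(J)$ would have to be computed from the full path integral; the sketch only checks the sign conventions of the Legendre transform used here.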
Consider instead the path integral
$$\int [d\phi]\, e^{iS[\phi]}\,\delta\!\left(\int d^4x\,\phi(x) - V_{\rm space-time}\,\phi_c\right).$$
Write the delta function as $\int dJ\, e^{-iJ\left(\int d^4x\,\phi(x) - V_{\rm space-time}\phi_c\right)}$ and perform the path integral to
obtain
$$\int dJ\, e^{-iV_{\rm space-time}(w(J) - J\phi_c)}.$$
Now do the integral over $J$ by stationary phase in the large-volume limit. The answer
is
$$e^{-iV_{\rm space-time}V_{\rm eff}(\phi_c)}.$$
In other words, in the large-volume limit, the effective Wilsonian action for the
zero-momentum mode of the field is just the effective potential.
In order to compute $V_{\rm eff}$ perturbatively we imagine that we have found the value of $J$
corresponding to the expectation value of $\phi$ equal to $\phi_c$. Then the effective potential is
just $w(J) - J\phi_c$. Now shift the field in the path integral defining $w(J)$ by $\phi_c$: $\phi = \phi_c + \Delta$.
By definition, the field $\Delta$ has zero expectation value. The perturbative way to determine
$J = \sum_n g^{2n}J_n$ is to insist that it is chosen so that, in each order, the linear term from
the explicit $J\Delta$ term in the action cancels out the "tadpole" graphs which give the
one-point function of $\Delta$. This then implies that all one-particle-reducible (1PR) vacuum graphs vanish.
We thus obtain the following prescription [125].
Define the action $S(\phi_c + \Delta) - V_{\rm space-time}\,\phi_c\,(\partial V/\partial\phi)(\phi_c)$. Compute the vacuum path
integral for this action, ignoring all 1PR diagrams. This path integral is equal to
$e^{-\Gamma[\phi_c]}$, with $\Gamma$ evaluated on the constant field $\phi_c$.
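The one-loop term in this expression needs some care, since it is unusual to look
at a Feynman diagram with no vertices. Let's consider an $O(n)$-invariant Euclidean
Lagrangian
$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi^i)^2 + V(\phi^i\phi^i).$$
According to our prescription, the one-loop correction to $V_{\rm eff}$ (times the space-time volume) is
$$-\ln\left(\int [d\Delta]\, e^{-\frac{1}{2}\int d^dx\,\left[\Delta^i(-\nabla^2)\Delta^i + (M^2)_{ij}\Delta^i\Delta^j\right]}\right) = \frac{1}{2}\ln\det(-\nabla^2 + M^2).$$
The matrix $M^2$ is defined by $M^2_{ij} = (\partial^2 V/\partial\phi^i\partial\phi^j)(\phi_c)$.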
To find the regularized form of this answer, let us divide the functional integral by
the same expression with $M^2_{ij} \to M_0^2\delta_{ij}$, where $M_0$ is a large mass that we think of as
of order the cut-off. This subtracts a $\phi$-independent constant from $V_{\rm eff}$ and leads to a
simpler formula at intermediate stages of the calculation. It has no relevance to any
physical quantity. For any Hermitian matrix, ${\rm tr}\ln A = \ln\det A$, so we can write
$$\frac{1}{2}\left[{\rm Tr}\ln(-\nabla^2 + M^2) - {\rm Tr}\ln(-\nabla^2 + M_0^2)\right] = -\frac{1}{2}\int_0^\infty\frac{ds}{s}\,{\rm Tr}\left[e^{-s(-\nabla^2 + M^2)} - e^{-s(-\nabla^2 + M_0^2)}\right].$$
The identity which leads to this parametric formula can be verified eigenvalue by
eigenvalue for the two operators. Note that it looks just like the Schwinger proper-time
version of a one-loop vacuum diagram with no vertices, except that the proper-time
integral has a factor of $1/s$ in it that we might not have guessed from ordinary Feynman
rules. The trace ${\rm Tr}\,O = \int d^dx\,\langle x|{\rm tr}\,O|x\rangle$ is over the tensor product of function space and
$O(n)$-index space; tr refers to the index trace alone. The operator $-\nabla^2 + M^2$ is diagonal
in Fourier space. Thus we obtain
$$V^{(1)}_{\rm eff} = -\frac{1}{2}\int_0^\infty\frac{ds}{s}\int\frac{d^dp}{(2\pi)^d}\,{\rm tr}\left[e^{-s(p^2 + M^2)} - e^{-s(p^2 + M_0^2)}\right] = -\frac{\Gamma(-d/2)}{2(4\pi)^{d/2}}\,{\rm tr}\left[(M^2)^{d/2} - (M_0^2)^{d/2}\right],$$
so the one-loop effective potential has a pole at $d = 4$. If the tree-level potential is
quartic in the fields, $V = (m_0^2/2)\phi^2 + \lambda_0(\phi^2)^2$, then the residue of this pole, proportional
to ${\rm tr}\,M^4$, is a mixture of quadratic and quartic terms. Furthermore, it is an $O(n)$-invariant
function of the classical fields $\phi$. This means that the pole can be removed by
redefining $m_0$ and $\lambda_0$. The required renormalization (in the MS scheme) is identical to
that which we would have to do in order to eliminate the divergent part of any one-loop
Green function.
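The proper-time identity used above can be checked eigenvalue by eigenvalue, as the text suggests. The following sketch does this numerically for a single pair of positive eigenvalues $a$ and $b$, for which the identity reads $\ln(a/b) = -\int_0^\infty (ds/s)\,(e^{-sa} - e^{-sb})$. The substitution $s = e^t$, the integration window and the step count are choices made only for this illustration.

```python
import math

def proper_time_log_ratio(a, b, t_min=-40.0, t_max=8.0, n=48000):
    """Evaluate -int_0^inf ds/s (e^{-s a} - e^{-s b}) with s = e^t (so ds/s = dt),
    using the trapezoidal rule on [t_min, t_max]."""
    h = (t_max - t_min) / n
    total = 0.0
    for i in range(n + 1):
        s = math.exp(t_min + i * h)
        f = math.exp(-a * s) - math.exp(-b * s)
        total += f if 0 < i < n else 0.5 * f  # half weight at the two end points
    return -h * total

a, b = 2.0, 5.0
print(proper_time_log_ratio(a, b), math.log(a / b))  # the two numbers should agree
```

The subtraction of the reference eigenvalue $b$ is what makes the integral converge at small $s$, which is the role played by $M_0$ in the regularized formula.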
This calculation also shows us that theories with spontaneously broken symmetries
are renormalized by the same manipulations as those which renormalize the corresponding
theories in the symmetric phase. Indeed, since we are calculating the
quantum action, we never have to make a choice of the vacuum we expand around.
The renormalized quantum action is an $O(n)$-invariant functional of $\phi$. The renormalized
parameters (which, as low-energy observers, we are free to choose at will) are such
that the minimum of the effective potential is asymmetric. 21 This decoupling between
renormalization and spontaneous symmetry breaking should have been expected. We
described spontaneous breaking in terms of inequivalent representations of the same
field algebra, which were inequivalent precisely because of their different large-volume
behaviors. Renormalization has to do with the local structure of the theory.
9.14.5 The Coleman-Weinberg potential
The computation of the effective potential in a general renormalizable field theory
follows the pattern set by the $O(n)$ model. One simply computes the vacuum energy as
a function of particle masses, where the masses are those induced by a scalar field VEV
$v$. The resulting renormalized potential reads
$$V(v) = \frac{1}{64\pi^2}\sum_i (2S_i + 1)(-1)^{2S_i} M_i^4(v)\ln\left(M_i(v)^2/\mu^2\right),$$
where $S_i$ is the spin of the $i$th particle and $\mu$ is the renormalization scale. This is evaluated
in terms of renormalized couplings and satisfies a renormalization-group equation. The couplings (and Lagrangian
mass parameters) come into the computation of the mass eigenvalues $M_i(v)$. One generally
has to diagonalize a complicated mass matrix in order actually to evaluate this
formula. Note that the minus sign for fermions comes about because Grassmann func-
tional integrals give determinants rather than inverse determinants. More intuitively it
is a consequence of the fact that "vacuum fermions have negative energy."
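A minimal sketch of how this formula can be evaluated, assuming the mass eigenvalues $M_i(v)$ are already known as functions of $v$. The particle content below is a hypothetical toy spectrum, not the standard model, and the function and variable names are illustrative.

```python
import math

def cw_potential(v, species, mu):
    """One-loop potential
    V(v) = (1/64 pi^2) sum_i (2 S_i + 1) (-1)^(2 S_i) M_i(v)^4 ln(M_i(v)^2 / mu^2).
    `species` is a list of (spin, mass_function) pairs."""
    total = 0.0
    for spin, mass_fn in species:
        m = abs(mass_fn(v))
        if m == 0.0:
            continue  # M^4 ln M^2 -> 0 as M -> 0
        sign = -1.0 if round(2 * spin) % 2 else 1.0  # minus sign for fermions
        total += (2 * spin + 1) * sign * m**4 * math.log(m**2 / mu**2)
    return total / (64.0 * math.pi**2)

def toy_mass(v):
    """Hypothetical mass proportional to the VEV, with an assumed coupling of 0.5."""
    return 0.5 * v

bosons_only = [(0, toy_mass)]
fermion_only = [(0.5, toy_mass)]
balanced = [(0, toy_mass), (0, toy_mass), (0.5, toy_mass)]

for name, spectrum in (("scalar", bosons_only), ("fermion", fermion_only), ("balanced", balanced)):
    print(name, cw_potential(3.0, spectrum, mu=1.0))
```

In the balanced toy spectrum two spin-0 states and one spin-1/2 state of equal mass are combined, so the bosonic and fermionic terms of the formula cancel identically; this is only a check of the signs and multiplicities, not a statement about any particular model.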
It is interesting to note that this expression is analytic in v, except at places where
one or more mass eigenvalues go to zero. We have obtained an effective action for the
zero-momentum modes of the fields, by integrating out the higher modes. However,
if a mass is zero, there is no clear separation. This shows up as IR divergences in
loop diagrams. The connection between non-analyticity of the effective potential and
massless particles was the key to Wilson's analysis of critical phenomena in condensed-
matter physics [ ]. It is also of crucial importance in particle physics. One typical
example is the Linde-Weinberg lower bound on the mass of the Higgs particle in the
standard model [127-128]. If the Higgs is much lighter than the W and Z bosons, they
can be integrated out and contribute only to the Higgs effective action. At tree level,
the ratio of masses is essentially the ratio between $\sqrt{\lambda}$ and $g_{1,2}$. However, no matter
how small the Higgs quartic coupling is, compared with the gauge couplings, the gauge
bosons become massless at $|H| = 0$. Their non-analytic contribution to the Coleman-Weinberg
potential dominates the tree-level potential in this region, and produces a
symmetric minimum if $\lambda$ is very small. The condition that the Universe actually lives
in the spontaneously broken vacuum state then puts a bound on $m_H/m_W$.
21 Strictly speaking, the exact effective potential is a convex function of the field. If spontaneous
symmetry breaking occurs, it is in fact constant over the region $\phi\cdot\phi \le \phi_0^2$, indicating that the true vacua
are rotations of the vector $\phi_0(1, 0, \ldots, 0)$. Any point in the indicated region can be the VEV of the field
in appropriate superpositions of these rotated vacua. Only the vacua with fixed rotation angles satisfy
the cluster-decomposition principle. One never sees the convexity of the effective potential in simple
approximations.
9.14.6 Renormalization of gauge theories in the Higgs phase
The interaction of renormalization and the Higgs phenomenon is more subtle than that
between renormalization and spontaneously broken global symmetries. In the unitary
gauge, the gauge-boson propagator does not fall off at high energies and non-abelian
Higgs models are not renormalizable in unitary gauge. However, the S-matrix in this
gauge is equal to that in the general $R_\kappa$ gauge, which is renormalizable.
This peculiar situation can be understood better by noting that the elementary fields
in unitary gauge are equal to gauge-invariant operators
$$B_\mu = \Omega^\dagger D_\mu(A)\Omega,$$
etc., where $\Omega$ is defined by
$$H(x) = \Omega(x)(v + \Delta(x)).$$
$v$ is the VEV of $H$, and $\Delta$ (the vector of physical Higgs fields) satisfies $v^T T_a\Delta = 0$.
These gauge-invariant operators can be defined in any gauge, but are equal to the
elementary fields of finite mass in the $\kappa \to \infty$ limit.
Thus, in $R_\kappa$ gauge for general $\kappa$, the elementary fields of the unitary gauge are
defined as non-polynomial functions of the elementary fields of the $R_\kappa$ gauge. In perturbation
theory, we are always working around the Gaussian fixed point, where the
operators of fixed dimension are defined as Wick monomials. Higher-order monomi-
als require more subtractions. Thus, it is not surprising that the Green functions of
unitary gauge fields require an infinite number of subtractions, which is characteristic
of a non-renormalizable theory.
On the other hand, we have argued that the scattering matrix can be computed
from projected Green functions of the elementary fields, by use of the LSZ formula.
These projected operators, smeared with normalizable wave packets at infinity, are
BRST-invariant. Thus, the S-matrix and the Green functions of those gauge-invariant
operators that can be defined as polynomials in the $R_\kappa$ gauge fields behave as we expect
a renormalizable theory to behave.
This remark also sheds light on the situation we encountered when computing in
massive QED. There we found an apparent quadratic UV divergence, inversely pro-
portional to the square of the gauge-boson mass, in the wave-function renormalization
of the fermion field. When we used the gauge-invariant DR procedure, this divergence
disappeared and was replaced by a logarithmic divergence with no singularity at $\mu = 0$.
We could have done the calculation using the Stueckelberg formalism in an $R_\kappa$ gauge,
where the gauge-invariant regulator would have been natural. We had to use the gauge-invariant
regulator in the unitary gauge in order to get a result consistent with the Higgs
mechanism. Note that the unitary gauge abelian Higgs model is renormalizable, while
this is not true for non-abelian theories. The difference has to do with the fact that the
effective theory of an abelian NGB is free, whereas that of a non-abelian NGB is a
non-renormalizable interacting theory.
9.14.7 Effective field theory and NGBs
Theories of quantum gravity do not have continuous global internal symmetries. This
is a theorem in perturbative string theory [ ], and follows more generally from the
violation of global conservation laws by black-hole formation and evaporation. Thus,
continuous global symmetries should be viewed as accidental symmetries of low-energy
effective field theories. As such, we expect them to be broken by higher-dimension
operators with scale no higher than the reduced Planck scale $m_P = 2\times10^{18}$ GeV. If $G$
is a spontaneously broken U(1) (subgroup of a) continuous accidental symmetry group,
and $D$ is the dimension of $O_D$, the lowest-dimension $G$-violating operator allowed by
exact symmetries of the underlying gravity theory, then we expect a potential for the
approximate NGB field $b$ associated with $G$, of the form
$$V = \frac{\langle O_D\rangle}{m_P^{D-4}}\,U(b/f).$$
$f$ is the PNGB decay constant. When $G$ is something like a global chiral symmetry of
a strongly coupled gauge theory, we expect $\langle O_D\rangle \sim f^D$. The approximate NGB mass
is then of order $(f/m_P)^{(D-4)/2} f$.
A context in which these effects are important is the axion solution of the strong CP
problem. The axion is a pseudo-NGB, whose potential is supposed to come dominantly
from a coupling $(a/f_a)G\tilde{G}$, to the gluons of QCD [130]. In most models where the axion
arises from spontaneous breaking of a new strongly coupled gauge theory, the mass
arising from "Planck slop" is larger than that coming from QCD, and the axion model
does not work.
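To get a feeling for the numbers, the following order-of-magnitude sketch compares the "Planck slop" mass estimate $(f/m_P)^{(D-4)/2} f$ with a rough QCD-induced axion mass. The QCD estimate $m_\pi f_\pi/f_a$ with order-one factors dropped, the pion parameters and the value of $f_a$ are assumptions made only for this illustration and are not taken from the text.

```python
M_PLANCK = 2.0e18   # reduced Planck scale in GeV, as quoted in the text
M_PION = 0.135      # GeV, assumed pion mass
F_PION = 0.093      # GeV, assumed pion decay constant

def planck_slop_mass(f, D):
    """Order-of-magnitude PNGB mass (f/m_P)^((D-4)/2) * f from a dimension-D operator."""
    return (f / M_PLANCK) ** ((D - 4) / 2.0) * f

def qcd_axion_mass(f_a):
    """Rough QCD-induced axion mass ~ m_pi f_pi / f_a (order-one factors dropped)."""
    return M_PION * F_PION / f_a

f_a = 1.0e12  # GeV, a hypothetical decay constant chosen only for illustration
for D in (5, 6, 10, 12, 14):
    print(f"D = {D:2d}  Planck slop: {planck_slop_mass(f_a, D):.2e} GeV"
          f"   QCD: {qcd_axion_mass(f_a):.2e} GeV")
```

With these assumed inputs the gravitational contribution stays larger than the QCD one until the lowest symmetry-violating operator has quite a high dimension, which is the sense in which the exact symmetries of the underlying theory must be very restrictive for the axion model to work.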
In general, even if we neglect symmetry-breaking terms, the low-energy effective
theory for NGBs of a non-abelian group is not renormalizable. The interactions are all
irrelevant operators. With our current understanding of renormalization, this should
not bother us. The constant which sets the scale for all these irrelevant interactions is
$f$, the NGB decay constant. The full effective theory contains all possible irrelevant
operators consistent with the symmetries. In addition one is instructed to calculate
with a cut-off $4\pi f$ (in four dimensions). To simplify matters one chooses the cut-off
procedure to preserve the symmetry which gave rise to the NGB in the first place. 22
22 In the first few paragraphs of this subsection we argued that all such symmetries were approximate. The
more precise meaning of the procedure we are now outlining is that the irrelevant operators which break
the symmetry are scaled by $m_P$, or some other scale $\gg f$. As a consequence they contribute much less
than do the irrelevant terms we are discussing here, at energy scales below $4\pi f$.
It is then easy to verify that, as long as NGB momenta are below $f$, the higher-order
terms in the effective Lagrangian make small corrections, in a systematic power series
in p/f, to amplitudes computed using the leading-order effective action.
We can use this effective field theory to do loop calculations, and we must do so
in order to capture non-analytic terms in the amplitudes which arise from loops of
massless NGBs. All analytic corrections from the loops are simply absorbed into a
redefinition of the coefficients of higher-dimension operators in the effective action. A
systematic exposition of the rules of effective field theory for NGBs may be found in
Chapter 19 of [131].
9.15 The standard model through the lens of
renormalization
I would now like to return to the standard model of particle physics, and view it through
the eyes of an expert on renormalization theory, since all of my readers should have
brought themselves to that expert level at this point. We will take the point of view that
there is a cut-off energy scale $M$, between $10^{16}$ and $10^{19}$ GeV, above which the concepts
of local field theory fail. I will not enter into questions involving extra dimensions much
larger than $10^{-16}$ GeV$^{-1}$, warped extra dimensions, or why the number of dimensions
is four.
The first question one must ask oneself is why there should be any physically interest-
ing scales below the cut-off. We have seen that, in general, one requires a fine tuning of
parameters in order to achieve this. Only if there are marginally relevant parameters, at
the fixed point that defines the universal IR behavior of the theory of the real world, can
we achieve a ratio of scales as large as $10^{20}$ without some sort of tuning. It is therefore
extremely interesting that, at the Gaussian fixed point, the only marginally relevant
perturbations involve non-abelian gauge interactions [120-122], and that non-abelian
gauge theories seem to describe the real world.
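The way a marginally relevant coupling generates a large ratio of scales without tuning can be made concrete with a one-loop sketch. It assumes a beta function of the form $\beta(g) = -b\,g^3/(16\pi^2)$ with $b > 0$, so that $1/g^2$ runs linearly in $\ln\mu$ and the one-loop coupling blows up at $\Lambda = M\exp[-8\pi^2/(b\,g^2(M))]$. The value of $b$ and the function names are hypothetical.

```python
import math

def scale_ratio(g2_cutoff, b):
    """M / Lambda for one-loop running with beta(g) = -b g^3 / (16 pi^2):
    1/g^2 reaches zero at Lambda = M exp(-8 pi^2 / (b g^2(M)))."""
    return math.exp(8.0 * math.pi**2 / (b * g2_cutoff))

def coupling_for_ratio(ratio, b):
    """Cut-off-scale g^2 needed to generate a given hierarchy M / Lambda."""
    return 8.0 * math.pi**2 / (b * math.log(ratio))

b = 7.0  # hypothetical one-loop coefficient (asymptotically free for b > 0)
g2 = coupling_for_ratio(1.0e20, b)
print(f"g^2(M) = {g2:.4f} gives M/Lambda = {scale_ratio(g2, b):.3e}")
```

Because the hierarchy depends exponentially on $1/g^2(M)$, an unremarkable value of the coupling at the cut-off is enough to produce twenty orders of magnitude in this one-loop estimate.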
Another interesting point is that the fermion fields of the standard model form
chiral representations of the gauge group. Thus, fermion mass terms are forbidden by
gauge invariance, and masses can appear only through the Higgs mechanism. This
argument would have led us to expect all quark and charged-lepton masses to be of
order 100-200 GeV, but only the top quark obeys this rule. We are led to suspect
the existence of a symmetry explanation for quark mass hierarchies, a special role for
the top in the electro-weak Higgs mechanism, or some explanation that involves extra
dimensions. Chirality of the representation leads to a potential problem with anomalies.
We have seen that in the standard model the anomalies between quarks and leptons
are canceled out, leading to the suspicion of unification in the framework of a larger
gauge group.
The force of the preceding paragraph is somewhat mitigated by the fact that
the $\mu^2$ parameter of the electro-weak Higgs mechanism is itself severely fine-tuned.
Renormalization-group analysis would lead us to imagine it at the cut-off scale. There
is by now a host of proposed explanations for this. They all involve new degrees of
freedom at a scale within an order of magnitude of the Higgs VEV. The Large Hadron
Collider (LHC) was built to explore this energy range. I have little doubt that it will
discover at least some of the new physics that determines the scale of $\mu$ and, through it,
much of the physics of the world as we know it. One of the most attractive candidates
for the new physics is the supersymmetric extension of the standard model. In super-
symmetric models, boson masses are linked to chiral fermion masses, and are effectively
marginal parameters. Supersymmetry is also interesting from the point of view of the
possible unification of the standard model into SU(5) or some other simple group, at
high energy. Georgi, Quinn, and Weinberg [132] pointed out that one could test this
hypothesis, assuming that there are no new particle states with standard-model quan-
tum numbers 23 between laboratory scales and the scale at which the couplings unify.
At the present level of experimental precision, unification fails in the standard model.
The couplings don't unify, and the scale at which they come closest is low enough that
the inevitable proton decay implied by a unified model should have been seen in the
laboratory. The minimal supersymmetric extension of the standard model solves both
of these problems.
To summarize, RG analysis and the remoteness of the energy cut-off lead us to expect
a field-theory description of physics below the cut-off dominated by marginal, and at
least some marginally relevant, operators. Irrelevant operators are too small in the IR
to lead to most of the physics we see. Relevant operators tend to freeze out the degrees
of freedom of the world near the cut-off scale. Marginally irrelevant perturbations are
perfectly consistent, as long as their low-energy couplings are small enough that there
are no Landau poles below the cut-off scale. However, we need at least one marginally
relevant coupling to generate new low-energy scales. The special properties of anomaly-
free chiral gauge theories lead one to expect them to dominate the structure of the world
at scales below the Planck scale, and this seems to be the case. The relevant parameter
in our current description of the Higgs mechanism leads to the belief that we will find
new physics at or below the TeV scale.
There is an interesting application of the use of marginally irrelevant operators in
the standard model, to find an upper bound on the Higgs mass. The Higgs particle
in the standard model has a marginally irrelevant quartic coupling. At tree level, the
ratio of the Higgs boson mass to the mass of the charged weak bosons is
$M_h/M_W = \sqrt{\lambda}/g_2$, where $g_2$ is the measured SU(2) coupling strength. One would like to know,
theoretically, how large the Higgs mass could be. To raise it, one has to raise the
renormalized value of $\lambda$, but eventually this will bring the Landau pole down to the
experimental regime. To do the necessary calculations in a reliable way, one must resort
to non-perturbative (lattice field theory) methods. One finds a mass of order 750 GeV as
the maximal theoretically allowed Higgs mass [133-134]. Nowadays, this is not as useful
as it once was, because indirect experimental evidence from precision electro-weak data
indicates a bound around 200 GeV. Furthermore, these considerations have neglected
the large top-quark Yukawa coupling. When this is taken into account the Higgs mass
can have an interesting quasi-fixed-point behavior [130]. It then seems possible (if one
ignores the fine tuning necessary to get the electro-weak scale) to push the standard
model to very high energies and predict a Higgs mass consistent with precision electro-weak
measurements and direct experimental bounds, by assuming that the coupling is
close to the quasi-fixed point.
23 Actually, through one-loop order it is sufficient that any new particles that might exist are in complete
multiplets of the unified group.
Since the standard model is an effective field theory we can also expect to find strictly
irrelevant corrections to the model. We can probe these corrections most sensitively
by looking for violations of emergent symmetries such as baryon and lepton numbers
and approximate flavor conservation. The results of such searches are quite interesting
since they have almost all been negative. Flavor violation beyond that intrinsic to the
standard model seems to require that the scale of new flavor-violating physics is above
the 10-1000-TeV range (the scale for a given flavor-violating operator depends on
precisely which process it mediates). Dimension-6 operators with a scale $10^{16}$ GeV,
which could lead to proton decay, are on the edge of being ruled out unless there is
extra suppression from weak couplings.
The only evidence for new physics from irrelevant corrections to the standard model
is the evidence for neutrino masses coming from solar and atmospheric experiments.
These can be explained by invoking a dimension-5 operator $(H\ell)^2/M_S$, where the see-saw
scale $M_S \sim 10^{14}$-$10^{15}$ GeV. This is interestingly different from the other piece
of evidence for new physics, which is the apparent unification of running couplings
of all the simple factors of the standard-model group. This is far from perfect in the
standard model, but the supersymmetric extension of the model has miraculously good
unification at a scale of order $10^{16}$ GeV.
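The size of the neutrino masses implied by the dimension-5 operator can be estimated in a couple of lines. The sketch assumes an order-one coefficient and the normalization $\langle H\rangle = 174$ GeV (some conventions use 246 GeV, which changes the result by a factor of two); both are assumptions of this illustration rather than statements from the text.

```python
HIGGS_VEV = 174.0  # GeV, assumed normalization of the Higgs VEV

def seesaw_mass_eV(M_s, vev=HIGGS_VEV):
    """Neutrino mass vev^2 / M_s from the dimension-5 operator, converted from GeV to eV."""
    return vev**2 / M_s * 1.0e9

for M_s in (1.0e14, 1.0e15):
    print(f"M_s = {M_s:.0e} GeV  ->  m_nu ~ {seesaw_mass_eV(M_s):.3f} eV")
```

With these inputs the quoted range of see-saw scales corresponds to masses from a few hundredths to a few tenths of an eV.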
A possible reading of the RG tea leaves, then, is that the supersymmetric extension
of the standard model (with the possible addition of other degrees of freedom whose
standard-model couplings are quite constrained) will appear at LHC energies, between
100 GeV and 1 TeV. The effective theory of SUSY violation and the Higgs mechanism
must have natural flavor conservation. Flavor physics and neutrino masses are determined
by unification-scale physics, at an energy scale between $10^{14}$ and $10^{16}$ GeV. Here
we may expect to see the effects of extra dimensions and/or quantum gravity. I caution
the reader that this is only one of many scenarios, perhaps too conservative a choice.
Whatever the future holds, it seems clear that the tools of quantum field theory will be
useful for many decades of energy yet to be explored.
9.16 Problems for Chapter 9
*9.1. Write the pure SU(2) Yang-Mills theory as a theory of charged vector bosons
coupled to photons. Show that there is an anomalous magnetic moment implied
by the Yang-Mills Lagrangian, and that the gyromagnetic ratio of the charged
bosons is 2.
*9.2. Carry out the renormalization program through one loop for a single Dirac
fermion interacting with a single scalar through the interaction
$$\mathcal{L}_I = \bar\psi\,(g_S + g_P\gamma_5)\,\phi\,\psi.$$
Use a regularization scheme in which the Schwinger parameter $s$ for each one-loop
diagram is cut off at the lower limit $s_0$. Evaluate the divergent parts of all one-loop
diagrams and determine what other interactions need to be added. Verify that one
has to add all relevant and marginal interactions consistent with the symmetries
of the Lagrangian and regularization scheme. Compute the divergent parts of all
one-loop diagrams including ALL of the required interactions and write the full
set of RG equations for this model. You will need to invent a prescription for the
finite parts of the renormalizations. When $g_P = 0$ you can also do the calculations
by dimensional regularization. Find the relation between the definitions of the
couplings in minimal subtraction and the prescription you invented.
*9.3. Calculate the operator product expansion (OPE) of $:\phi^4(x):\,:\phi^4(y):$ in massless
free-field theory. To do this calculation you can use the result that the time-ordered
product of $:\phi^4(x):$ with any other collection of local operators is given in terms
of Feynman diagrams with a four-point vertex where no self-contractions are
allowed. Use this first of all to argue that the OPE is an operator expression, that
is, that its structure is independent of how many extra $\phi$ fields there are in the
Green function
$$\langle T\, :\phi^4(x):\,:\phi^4(y):\,\phi(y_1)\cdots\phi(y_n)\rangle.$$
Then calculate all of the singular terms in the OPE, including their numerical
coefficients. Finally indicate the list of operators, $O_n$, and their Wilson coefficient
functions $C_n(x - y)$ that appear in the full OPE. In this last part you need not
calculate all the numerical factors in $C_n$, just its functional form.
*9.4. Compute the one-loop charge renormalization (photon wave-function renormal-
ization) for the electrodynamics of charged scalar fields, with Lagrangian
$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + |D_\mu\phi|^2 - m^2|\phi|^2.$$
Note that, unlike spinor electrodynamics, there are two graphs, and you have to
sum over both of them to get a transverse answer for the photon 1PI two-point
function. Use dimensional regularization for the computation.
*9.5. Show that there is a divergent graph at one loop in the above theory, with four
external scalar lines. Argue that the divergence can be removed by tuning the
coefficient $\lambda_0$ of a bare interaction of the form $(\lambda_0/4)|\phi|^4$. Note that this is a
marginal operator consistent with the symmetries of the original Lagrangian, so
general renormalization theory tells us that we should expect to have to tune it
in perturbation theory around the free-field fixed point.
*9.6. Let $\psi_i$ and $\phi_i$ be multiplets of fields transforming in real representations $R_F$
and $R_S$ of a gauge group $G$. Assume there is only a single invariant coupling
$g\,\Gamma^{r}_{ij}\,\psi_i\,\phi_r\,\psi_j$. Using dimensional regularization, repeat Problem 9.2 including the
gauge interactions.
*9.7. Consider two massless scalar fields with interaction Lagrangian
$$\mathcal{L}_I = \frac{g^2}{4!}\left(\phi_1^4 + \phi_2^4\right) + \frac{2\lambda}{4!}\,\phi_1^2\phi_2^2.$$
When $g^2 = \lambda$ there is an O(2) symmetry. Renormalize this model at one loop.
Argue that the inevitable mass renormalization does not affect the renormaliza-
tion of the couplings at one loop. Also argue that, for any value of the couplings,
the model has a discrete symmetry guaranteeing that the renormalized mass term
is O(2)-symmetric. Write the RG equations for the couplings and show that if the
initial values satisfy $\lambda/g^2 < 3$ then the theory becomes O(2)-symmetric in the IR
limit.
*9.8. Compute the one-loop effective potential of scalar electrodynamics (Problems 9.4
and 9.5) in the Landau gauge $\partial_\mu A^\mu = 0$. Use dimensional regularization and
minimal subtraction, and take the limit where the renormalized mass parameter in the
minimal subtraction scheme goes to zero. The theory then has only the RG scale
$\mu^2$. Show that, in this limit, the potential has a minimum at a non-zero value of the
scalar field $\phi$. Show that if the renormalized scalar quartic coupling $\lambda$ is of order
$e^4$ then the minimum occurs at a field value that is neither extremely large nor
extremely small, so perturbation theory should be valid. Construct the one-loop
RG equations (RGEs) for $e^2$ and $\lambda$ and show that every trajectory passes through
the region where $e^4(t) \sim \lambda(t)$. Thus, whenever perturbation theory is valid, this
massless limit of electrodynamics is actually in the Higgs phase. Calculate the
"RG-improved" approximation to the effective potential. That is, write the solution
to the RGE for $V_{\rm eff}$ with the boundary condition that for $|\phi| = M$, $V_{\rm eff}$ is
given by its one-loop formula. The improved $V_{\rm eff} = \phi^4 Z(t)\, V_{1\text{-loop}}(e^2(t), \lambda(t))$,
where $t = \ln(|\phi|/\mu)$ and $Z(t)$ is the rescaling factor that comes from solving the
RGE. Compute the masses of the vector boson and the Higgs field, for every
initial value of the couplings $e^2(0) = e^2$, $\lambda(0) = \lambda$.
*9.9. Consider two definitions of the renormalized coupling constant of some theory
with a single marginal coupling. The two schemes are related by
$$g = g_1 + a_1 g_1^3 + a_2 g_1^5 + \cdots.$$
Using the renormalization group equation for $g$,
$$\frac{dg}{dt} = \beta(g) = b_1 g^3 + b_2 g^5 + \cdots,$$
show that the first two coefficients of the $\beta$ function for $g_1$ are identical to $b_{1,2}$.
*9.10. Complete the computation of the static potential through one loop, and calculate
the $\beta$ function for Yang-Mills theory.
Instantons and solitons
Much of this book has been devoted to perturbation theory around Gaussian fixed
points of the renormalization group. Such a perturbation theory can always be reorga-
nized into a semi-classical approximation to the functional integral. The semi-classical
expansion can also give us evidence about non-perturbative effects in quantum field
theory, and this chapter is an introduction to the relevant technology. It is a very brief
introduction to a very large subject. Readers who wish to pursue the subject of instantons
and solitons in depth should consult [135-138].
10.1 The most probable escape path
We begin by recalling the WKB approximation to the problem of quantum tunneling
in a system with $N$ degrees of freedom, $q^i$.
The Schrodinger equation is
$$\left[-\frac{g^4}{2}\frac{\partial^2}{\partial (q^i)^2} + V(q)\right]\psi = E\psi.$$
For simplicity we have made all the variables, including the energy, dimensionless, set
$\hbar = 1$, and introduced a dimensionless parameter $g$. The WKB approximation is valid
when $g^2$ is small. The reader should verify that this Schrodinger equation would be
appropriate for a system whose classical action is
$$\frac{1}{g^2}\int dt\left[\frac{1}{2}\left(\dot q^i\right)^2 - V(q)\right]. \tag{10.2}$$
For small $g$, an approximate solution to the Schrodinger equation is
$$\psi = e^{iS/g^2},$$
$$\left(\frac{\partial S}{\partial q^i}\right)^2 = 2(E - V). \tag{10.4}$$
The last equation is the Hamilton-Jacobi equation for the classical system with action
(10.2). The integral curves of this first-order system are solutions of the classical
equations of motion, and
$$\frac{\partial S}{\partial q^i} = \dot q^i. \tag{10.5}$$
Here $q^i(t)$ is the classical trajectory that goes through the point $q^i$ at time $t$. The velocities
are real only if E > V, but we can use this approximation to solve the Schrodinger
equation in the tunneling region where V > E. Since the coordinates are still real, this
corresponds to continuing to imaginary time.
The imaginary-time equations of motion are perfectly good differential equations. In
fact, they correspond to Newtonian motion in the potential $-V$. The relation between
solutions to this equation and the WKB approximation to the Schrodinger equation is
the same as that between ordinary classical trajectories and the Hamilton-Jacobi equation.
In particular, we have the usual result that, at any point in the tunneling region,
$$S = \pm i\int_{q_0}^{q} p_i\, dq^i, \tag{10.6}$$
where the integral is taken over an imaginary-time classical trajectory that goes through
the indicated points. This is just the Euclidean action of the trajectory. We generally
take $q_0$ to be the turning point of the trajectory, where it hits the boundary at which real
classical motion can begin. The two signs correspond to linearly independent solutions
of the Schrodinger equation, one of which is exponentially falling and the other expo-
nentially increasing, as we penetrate into the tunneling region. Boundary conditions
on the other side of the tunneling barrier (or at infinity if it is not a finite barrier, but
an infinitely rising wall) fix the coefficient of this growing solution to be small, so that
the two are of the same order of magnitude in the middle of the tunneling region.
For small g, the wave function is exponentially small in the tunneling region, and
it will be maximized in the vicinity of the path of minimal Euclidean action which
traverses the barrier. In minimizing the action we have to search over all solutions
that traverse the barrier between two classically allowed regions in configuration space.
This includes a search over all end points of the solution after tunneling. However,
once we have found the minimum, there is another solution in which we first follow the
most probable escape path, and then retrace it to its origin. This solution, called the
bounce, minimizes the action subject to the constraint that it starts and ends at the initial
stationary point of the potential. There is no need to search over possible turning points
in the second classically allowed region. The action of the bounce solution directly gives
the logarithm of the probability for tunneling through the barrier.
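As a rough numerical illustration of the last statement, the sketch below evaluates the bounce action for a single degree of freedom. Everything specific in it is an assumption made for the example and is not taken from the text: the potential $V(q) = q^2/2 - q^3/3$, the choice $E = 0$ at the metastable minimum $q = 0$, and the helper name `tunneling_exponent`. For this potential the one-way integral $\int p\,dq$ across the barrier is $3/5$, so the bounce action (in units of $1/g^2$) should come out close to $6/5$.

```python
import numpy as np

# Hypothetical metastable potential (an assumption for illustration):
# local minimum at q = 0 with V = 0; motion is classically allowed again for q > 3/2.
def V(q):
    return 0.5 * q**2 - q**3 / 3.0

q_exit = 1.5  # turning point on the far side of the barrier, where V returns to E = 0

# Euclidean momentum p = sqrt(2 (V - E)) inside the barrier, with E = 0.
q = np.linspace(0.0, q_exit, 200001)
p = np.sqrt(np.clip(2.0 * V(q), 0.0, None))

# Trapezoid rule for the one-way integral of p dq across the barrier.
one_way = float(np.sum(0.5 * (p[1:] + p[:-1]) * np.diff(q)))

# The bounce follows the escape path out and back, so its action is doubled.
bounce_action = 2.0 * one_way

def tunneling_exponent(g):
    """Leading semi-classical estimate of log(probability), -S_bounce / g^2."""
    return -bounce_action / g**2

print(bounce_action)
```

With more than one degree of freedom the same integral would have to be minimized over paths, which this one-dimensional sketch does not attempt.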
10.2 Instantons in quantum mechanics
Consider one-dimensional quantum mechanics with a potential $V(x)$ that has two
or more minima. We will concentrate on three cases: two degenerate minima, a peri-
odic potential with an infinite number of degenerate minima, and two non-degenerate
minima. The Euclidean (imaginary-time) action has the form
$$S = \frac{1}{g^2}\int dt\left[\frac{1}{2}\dot x^2 + V(x)\right].$$
$g^2$ is the small dimensionless parameter which controls our semi-classical approxima-
tion. We will begin from a computation of the partition function $\mathrm{Tr}\, e^{-\beta H}$. According
to Feynman, this is given by the Euclidean path integral, with periodic boundary con-
ditions, $x(-\beta/2) = x(\beta/2)$. For large $\beta$, this behaves like $C e^{-\beta E}$, where $E$ is the
ground-state energy.$^1$ For small $g^2$ the path integral is dominated by the saddle point
with minimum action, which is a constant solution $x(t) = x_{\min}$. Expansion of the path
integral around this saddle point leads to a perturbative calculation of the ground-state
energy $E = \sum_n g^{2n} E_n$. If the degenerate minima are related by a symmetry, as in the
periodic case, or if the double-well potential has a reflection symmetry, then the terms
in this series are independent of the minimum we expand around. The degeneracy is
not lifted to any order in perturbation theory. We will choose the classical energy at
the minimum to be zero.
Time-dependent saddle points of the Euclidean action, which we will dub instantons,
lead to contributions to the ground-state energy that are of order $e^{-S_0/g^2}$. Normally,
we would neglect these in comparison with the terms in the expansion, but they will
give the leading contribution to the energy splittings between classically degenerate
levels. In order to have a finite action $S_0$, in the limit $\beta \to \infty$, the instanton solution
must have the asymptotics $x(t) \to x_{\min}$ as $t \to \pm\infty$. It must interpolate between two
different minima. Clearly this means that $|dx/dt| \to 0$ for large $|t|$.
Saddle points of the Euclidean action are given by solutions of
$$\ddot x = V'(x),$$
which are Newton's equations in a potential $-V$, for a particle of unit mass. The
Euclidean energy, $\frac{1}{2}\dot x^2 - V$, is conserved in this motion. On evaluating it at $t = \pm\infty$,
we find
$$\dot x^2 = 2V,$$
which can be solved by quadratures. Notice that, for any pair of minima, there are
two solutions, related by time reversal $t \to -t$. We (arbitrarily) call one of them the
instanton, and the other the anti-instanton. Note that, in any relativistic quantum field
theory, we always have a kind of time-reversal symmetry, namely TCP. Thus, in this
context, every instanton will always have an anti-instanton.
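A concrete check of the quadrature may be helpful. The sketch below assumes, purely for illustration, the double-well potential $V(x) = \frac{1}{2}(1 - x^2)^2$ with minima at $x = \pm 1$; this choice is not made in the text. For it, $\dot x = 1 - x^2$ integrates to $x(t) = \tanh t$, and the action in units of $1/g^2$ is $S_0 = \int dt\, \dot x^2 = 4/3$. The code verifies $\dot x^2 = 2V$ pointwise and evaluates $S_0$ numerically.

```python
import numpy as np

# Assumed double-well potential (illustration only): minima at x = -1 and x = +1.
def V(x):
    return 0.5 * (1.0 - x**2) ** 2

t = np.linspace(-12.0, 12.0, 48001)
x = np.tanh(t)                   # candidate instanton interpolating from -1 to +1
xdot = 1.0 / np.cosh(t) ** 2     # its exact time derivative

# Zero Euclidean energy condition: xdot^2 = 2 V(x) along the whole trajectory.
residual = float(np.max(np.abs(xdot**2 - 2.0 * V(x))))

# Euclidean action (in units of 1/g^2) by the trapezoid rule.
lagrangian = 0.5 * xdot**2 + V(x)
S0 = float(np.sum(0.5 * (lagrangian[1:] + lagrangian[:-1]) * np.diff(t)))

print(residual, S0)
```

The anti-instanton is obtained by replacing `t` with `-t`, which leaves both checks unchanged.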
Given the form of the solution, we can now understand the term instanton. Expand
the equation of motion around a minimum $x(t) = x_{\min} + \delta$. Then
$$\ddot\delta = V''(x_{\min})\,\delta.$$
Since $V''$ is positive the solutions are rising and falling exponentials. We have to choose
the falling exponential at both $t \to \pm\infty$, in order to obey the finite action boundary
conditions. Thus, the instanton $x(t)$ differs from the classical ground state only in
the local vicinity of some point $t_1$ (which may be chosen arbitrarily because of time-
translation invariance). It thus has a particle-like aspect, which explains the suffix "on",
but localized in time rather than space, which is indicated by the prefix "instant". In
higher-dimensional field theories instantons will be localized in both time and space.
We can also reinterpret $d$-dimensional instantons as localized $(d+1)$-dimensional static
solutions. The latter are known as solitons, and actually do define particle states in the
quantum theory.
1 We will not deal with the sub-leading terms in $\beta$. From them, one can extract information about excited
states. See the book by Zinn-Justin [111].
Time-translation invariance implies that, in the $\beta \to \infty$ limit, we have a continuous
set of solutions, labeled by $t_1$. We can write them as $x(t) = X(t + t_1)$, where $X(t)$
is the solution whose maximum deviation from $x_{\min}$ occurs at $t = 0$. Obviously, we
have to integrate over this whole set of saddle points, because they have the same
action. In other words, we have a saddle line rather than a saddle point in our multi-
dimensional integral over all functions. The measure of functional integration is defined
by expanding $x(t) = \sum_n c_n x_n(t)$ in a complete set of orthonormal functions on the real
line. We then integrate over $c_n$ with uniform measure. Our line of saddle points is a
curved line through this infinite-dimensional Cartesian space. In principle we have
to decompose the measure into integration along this curve and along all directions
perpendicular to it. The measure of integration is no longer Cartesian, and must be
computed. However, in the Gaussian approximation, as we will see in a moment, this
subtlety can be ignored.
The Gaussian approximation expands the action to quadratic order around the
saddle-point configuration and does the resulting Gaussian integral. The function $\dot X(t)$
is an obvious zero mode of the Gaussian fluctuation operator
$$\frac{\delta^2 S}{\delta x(t)\,\delta x(s)} = \left[-\frac{d^2}{dt^2} + V''(X(t))\right]\delta(t - s),$$
because the whole one-parameter family of functions $X(t + t_1)$ has the same action. Our
formula for the solution shows that $\dot X$ does not change sign. It is therefore the lowest
eigenmode of this Sturm-Liouville operator. All of the positive modes are orthogonal to
it. Thus, infinitesimally close to our line of saddle points, the measure on the orthogonal
directions is uniform (this is geometrically obvious for finite-dimensional integrals:
draw a picture). The non-uniformity of the measure becomes important in higher orders
in the expansion, which we will not have the opportunity to explore.
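The zero mode can be seen explicitly in a small numerical experiment. The sketch below assumes the illustrative double well $V(x) = \frac{1}{2}(1 - x^2)^2$, for which $X(t) = \tanh t$ and $V''(X(t)) = 4 - 6/\cosh^2 t$; neither the potential nor the finite-difference discretization on a finite interval is taken from the text. For this potential the fluctuation operator is a Poschl-Teller problem whose two bound states are known to lie at $0$ and $3$, with the continuum beginning at $V''(x_{\min}) = 4$.

```python
import numpy as np

# Assumed example: V(x) = (1 - x^2)^2 / 2 with instanton X(t) = tanh(t),
# so that V''(X(t)) = 6 tanh(t)^2 - 2 = 4 - 6 / cosh(t)^2.
n_points = 1201
t = np.linspace(-12.0, 12.0, n_points)
h = t[1] - t[0]
U = 4.0 - 6.0 / np.cosh(t) ** 2

# Second-order finite-difference matrix for -d^2/dt^2 + U(t) with Dirichlet walls.
off_diagonal = -np.ones(n_points - 1) / h**2
H = np.diag(2.0 / h**2 + U) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)

eigenvalues, eigenvectors = np.linalg.eigh(H)

# Compare the lowest eigenvector with the expected zero mode dX/dt = 1 / cosh(t)^2.
zero_mode = 1.0 / np.cosh(t) ** 2
zero_mode /= np.linalg.norm(zero_mode)
overlap = float(abs(np.dot(eigenvectors[:, 0], zero_mode)))

print(eigenvalues[:3], overlap)
```

The lowest eigenvalue vanishes up to discretization error, and its eigenvector has no nodes, in line with the Sturm-Liouville argument above.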
So far our calculation of the instanton correction to the partition function takes the
form
$$Z = Z_0\left[1 + \beta\, e^{-S_0/g^2}\, {\det}^{-1/2}\!\left(\frac{-\partial_t^2 + V''(X(t))}{-\partial_t^2 + V''(x_{\min})}\right)\right].$$
Here $Z_0$ is the Gaussian approximation to the partition function in the expansion
around $x_{\min}$. This is equal to $\det^{-1/2}(-\partial_t^2 + V''(x_{\min}))$, which is why we have this
operator in the denominator of the second term. In field theory, this one-loop correction
to the vacuum energy is infinite. However, because the instanton is a smooth function,
the very high eigenvalues of the two fluctuation operators are the same. The ratio of
the two differential operators is a Fredholm integral operator, and has finite determi-
nant. In higher-dimensional field theory, there are additional UV divergences in the
determinant. We can study them using the Schwinger proper-time formula for func-
tional determinants, which we introduced in Chapter 9. As usual, they come from
short proper-time intervals, and correspond to renormalizations of local function-
als of the classical background field. These divergences are precisely removed by the
renormalization of parameters which defines the theory.
When $g^2$ is small, this expression is very small, but also almost infinite, because
$\beta$ is taken to $\infty$. This sort of divergence is familiar from the perturbative expan-
sion of the partition function in terms of Feynman diagrams. From that context,
we know how to deal with it. In fact, vacuum diagrams with $n$ disconnected parts
behave like $\beta^n$ for large $\beta$. Feynman rules tell us that the sum of all these discon-
nected diagrams is just the exponential of the sum of connected diagrams. The latter
sum has a single overall factor of $\beta$, which is what we would expect for the logarithm
of the partition function that has the form $e^{-\beta E_0} d_g$ ($d_g$ is the number of degenerate
ground states of the system), for large $\beta$. In field theory, the factor of $\beta$ is multiplied
by a factor of spatial volume, and the quantity we are computing is the vacuum
energy density, the coefficient of space-time volume in the logarithm of the partition
function.
To see the analogous exponentiation of instanton contributions to the partition
function, we have to introduce some approximate solutions to the Euclidean equations
of motion. Consider
$$x_c(t) = \sum_k X_k(t + t_k), \qquad |t_i - t_j| \gg \left[V''(x_{\min})\right]^{-1/2}.$$
For each $k$, $X_k$ is one of the instanton solutions we have already discussed. For the
double-well potential we must enforce a rule, namely that each instanton $X_+$ is followed
by an anti-instanton, $X_-$, in order for $x(t)$ to be a smooth function. The instanton looks
like a step function. Furthermore, in this model, in order to approximate functions
that were periodic before taking the large-$\beta$ limit, we must restrict the total numbers of
instantons and anti-instantons to being the same. On the other hand, for the periodic
potential, an instanton taking minimum $a$ into minimum $a + 1$ can be followed either
by an instanton going to $a + 2$, or by an anti-instanton going back to $a$.
2 The question of whether we must have $N_+ = N_-$ for the periodic potential seems to depend on the period
of the $x$ variable. If it is the same as the period of the potential then there is no restriction on $N_\pm$. However,
even if the $x$ variable runs over the whole real line, we can compute $Z(\theta) = \sum_n e^{in\theta}\,\mathrm{Tr}[e^{-\beta H} T^n]$,
where $T$ is the discrete translation which leaves the potential invariant. This computation allows arbitrary
$N_\pm$ but with a phase $e^{i\theta(N_+ - N_-)}$. The same phase can be inserted into the periodic $x$ computation by
adding the term $i\theta\int dt\,\dot x$ into the Euclidean action. The two computations are mathematically identical.
For non-periodic $x$ the interpretation is the partition function summed over states with a fixed eigenvalue
of $T$ (Bloch waves), whereas for periodic $x$ it corresponds to a change in the action.
These functions satisfy the equations of motion, up to exponentially small terms, as
a consequence of the exponential falloff of $X_k(t)$. They have action $\approx nS_0$. The
parameters $t_k$ are approximate collective coordinates for this solution. They do not change
the action much. Thus, one should integrate over them. One might worry that these
integrals cover points where the $|t_i - t_j|$ are not large, but we will see that these regions
give only exponentially small corrections to the calculations we are about to do. There
is, however, one restriction on the integral over the $t_k$. Field configurations in which the
centers of two or more instantons (or anti-instantons) are permuted correspond to the
same field configuration. Thus, we can integrate freely over all collective coordinates if
we divide the result by $N_+!\,N_-!$ for a configuration containing $N_+$ instantons and $N_-$
anti-instantons.
If we take into account the contribution of all of these configurations to the path
integral, we obtain
$$Z = Z_0 \sum_{N_+, N_-} \frac{\left(\beta D^{-1/2}\right)^{N_+ + N_-}}{N_+!\,N_-!}\, e^{-(N_+ + N_-) S_0/g^2} = Z_0\, e^{2\beta D^{-1/2} e^{-S_0/g^2}},$$
which corresponds to a shift in the ground-state energy of
$$\delta E_0 = -2 D^{-1/2} e^{-S_0/g^2}.$$
If, following the prescription in the footnote above, we insert the phase $e^{i\theta(N_+ - N_-)}$ into
the sum, then $2 \to 2\cos\theta$ in these formulae. The evaluation of the partition function
looks like that for the configuration integral in the statistical mechanics of a classical
one-dimensional gas consisting of two species of particles with identical fugacities. This
approximation is therefore called the dilute-gas approximation. Note that the factorial
factors in the denominator insert what is called correct Boltzmann counting into the
classical statistical formulae. This factor, inserted to avoid the Gibbs paradox, is usually
explained by appealing to quantum mechanics. Here we see that it would also follow
from a model in which particles were localized classical field configurations.
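The exponentiation of the dilute-gas sum, including the replacement $2 \to 2\cos\theta$, is easy to confirm numerically. In the sketch below the single number `K` stands for the combination $\beta D^{-1/2} e^{-S_0/g^2}$; the function name `dilute_gas_sum`, the truncation `n_max`, and the sample values of `K` and `theta` are assumptions introduced for the example.

```python
import cmath
import math

def dilute_gas_sum(K, theta=0.0, n_max=60):
    """Truncated double sum over instanton and anti-instanton numbers.

    K plays the role of beta * D**(-1/2) * exp(-S0 / g**2); each configuration
    is weighted by 1 / (N+! N-!) and by the phase exp(i theta (N+ - N-)).
    """
    total = 0.0 + 0.0j
    for n_plus in range(n_max):
        for n_minus in range(n_max):
            weight = K ** (n_plus + n_minus) / (
                math.factorial(n_plus) * math.factorial(n_minus)
            )
            total += weight * cmath.exp(1j * theta * (n_plus - n_minus))
    return total

K, theta = 1.7, 0.9
numeric = dilute_gas_sum(K, theta)
closed_form = math.exp(2.0 * K * math.cos(theta))
print(numeric, closed_form)
```

The factorials are what make the two independent sums collapse into exponentials; without them the series would not reproduce a partition function whose logarithm is linear in $\beta$.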
10.2.1 Computation of determinants
Now let us explain the factor D in the above expression. Let's start from the ratio of
determinants
$$R = \frac{\det'\left[-\partial_t^2 + V''(x_c(t))\right]}{\det\left[-\partial_t^2 + V''(x_{\min})\right]}.$$
The prime on the determinant in the numerator means that we omit the approximate
normalizable zero modes corresponding to translating individual instantons, as well as
the exact overall translation zero mode. The determinant in the denominator appears
because we have pulled a factor of $Z_0$ out in front of our expression for the partition
function. We have chosen the classical action of the constant saddle points to vanish,
so $Z_0$ is just given by the inverse square root of the determinant of the fluctuation
operator around these configurations.
Recall how this works. Given a saddle point $x_S(t)$, we expand the functional inte-
gration variable as $x(t) = x_S(t) + g\sum_n c_n \delta_n(t)$, where $\delta_n$ constitute the complete set
of orthonormal eigenfunctions of the operator $-\partial_t^2 + V''(x_S(t))$, which appears in
the quadratic part of the action for the fluctuations. The inverse of this operator is
the propagator that appears in the Feynman rules for calculating higher orders in the
$g$ expansion. The Gaussian integral over the $c_n$ gives us a constant ($g$-independent)
pre-factor. We define the measure as
$$\prod_n \frac{dc_n}{\sqrt{2\pi}},$$
so that the Gaussian integral gives
$$\prod_n \lambda_n^{-1/2}.$$
This expression has two possible problems. It usually diverges in the very-large-$\lambda_n$
regime, and it is infinite if there is a normalizable zero mode. We have, for the moment,
solved the second problem by simply omitting this mode. The first is solved by a trick
invented by Fredholm in the nineteenth century. The operators $-\partial_t^2 + U(t)$, for any
smooth $U(t)$ approaching a constant at infinity, all have the same high-energy spectrum.
In modern RG language this is the statement that a smooth potential is a relevant per-
turbation of the free Hamiltonian $-\partial_t^2$. As a consequence, the high-energy divergence
in the determinant cancels out in the ratio.
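The statement about the high-energy spectrum can be illustrated with the same kind of lattice experiment as before. The sketch below is only indicative and rests on assumptions not made in the text: the smooth potential is taken to be $U(t) = 4 - 6/\cosh^2 t$ (the fluctuation potential of a $\tanh$ instanton), the reference operator has constant potential $4$, and both are discretized by finite differences on a finite interval. It compares the two spectra level by level and prints the relative difference for a low-lying and a high-lying eigenvalue.

```python
import numpy as np

def spectrum(U, h):
    """Eigenvalues of the finite-difference operator -d^2/dt^2 + U(t), Dirichlet walls."""
    n = len(U)
    off_diagonal = -np.ones(n - 1) / h**2
    H = np.diag(2.0 / h**2 + U) + np.diag(off_diagonal, 1) + np.diag(off_diagonal, -1)
    return np.linalg.eigvalsh(H)

t = np.linspace(-12.0, 12.0, 1201)
h = t[1] - t[0]

free = spectrum(np.full_like(t, 4.0), h)          # constant potential V''(x_min) = 4
kink = spectrum(4.0 - 6.0 / np.cosh(t) ** 2, h)   # assumed instanton background

# Level-by-level relative difference between the two spectra.
relative_shift = np.abs(kink / free - 1.0)
print(relative_shift[5], relative_shift[400])
```

Because the two potentials differ by a bounded function, each eigenvalue can move by at most a fixed amount, so the ratio of corresponding eigenvalues tends to one as the eigenvalues grow. That is the property which makes the ratio of determinants finite.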
The reason that we have omitted the (approximate and exact) zero modes is that we
have already extracted the integral over these directions in field space when we inte-
grated over collective coordinates. We just have to get our normalizations right. The
reason why there is a normalizable zero mode is that the action is invariant under time
translation. Thus for any saddle point $x_c(t)$ there is actually a line of saddle points
$x_c(t + t_1)$. For the multiple-instanton approximate saddle points there are correspondingly
multiple lines, since we can translate each instanton independently as long as
they are far apart. The line of saddle points is not a straight line w.r.t. any orthonormal
basis. However, right near the line of saddles there is obviously an orthonormal basis
of small fluctuations, with one direction going along the line.$^4$ Thus
$$\delta_0(t) = \frac{\dot x_c(t)}{\left[\int dt\, 2V(x_c(t))\right]^{1/2}}$$
is the normalized zero mode. The reader can verify that it indeed satisfies the zero-
eigenvalue equation. We have written the normalization factor in a way that exploits
the classical equation $\dot x_c^2 = 2V$. This function is a sum of functions concentrated near
the centers of all the instantons and anti-instantons. We can find the approximate zero
modes corresponding to relative translation of a single instanton by simply dropping
all terms in $\delta_0$ except the one near that instanton.
4 In higher orders in perturbation theory around the instanton we have to get the correct measure in the
vicinity of the curved line of saddles. This is done by an analog of the Faddeev-Popov trick.
To relate the measure $dt_1$ of the collective coordinate to that of the coefficient of the
zero mode, $dc_0$, we simply insist that these two variations of $x_c(t)$ are the same,
$$g\,\delta_0\, dc_0 = \dot x_c\, dt_1.$$
Thus the correct measure, $dc_0/\sqrt{2\pi}$, is
$$\frac{dc_0}{\sqrt{2\pi}} = \sqrt{\frac{S_0}{2\pi g^2}}\; dt_1.$$
A similar formula is valid for all the approximate collective coordinates. We arrange
the integral so that we integrate independently over each of the instanton positions, so
the appropriate $S_0$ is that of a single instanton.
The zero mode of the single instanton is obviously the lowest eigenvalue, because
the corresponding wave function $\dot x_c$ has no nodes (the instanton $x_c$ itself is monotonic).
Since, in the Schrodinger equation analogy, the potential, $U(t) = V''(X(t))$, goes to a constant at
infinity, all the rest of the eigenstates are scattering states. Thus for the single instanton
$$R = e^{\int d\epsilon\, \rho(\epsilon) \ln \epsilon},$$
where $\rho(\epsilon)$ is the density of scattering states.
As noted above, the Schrodinger operator $-d^2/dt^2 + V''(x_c(t))$ is a non-negative
operator. We know that it has $N_+ + N_-$ normalizable (approximate) zero modes. The
rest of its spectrum consists of positive-energy scattering states. Since the instantons
and anti-instantons are widely separated, we can calculate the phase shift by a multiple-
scattering expansion: it is the sum of phase shifts for scattering by individual instantons.
The Fredholm determinant is given by
$$\frac{\det'\left[-d^2/dt^2 + V''(x_c(t))\right]}{\det\left[-d^2/dt^2 + V''(x_{\min})\right]} = e^{\int dE\, \rho(E) \ln E}.$$
Using the relation $\rho(E) = d\delta(E)/dE$, between the density of states and the phase shift,
combined with the multiple-scattering expansion, we see that the dilute-instanton-gas
determinant is just the product of individual instanton and anti-instanton determinants.
Thus the full one-loop expression for the contribution of the dilute gas of
instantons and anti-instantons is just the exponential of the single instanton plus
anti-instanton contributions.
10.3 Instantons and solitons in field theory
An instanton is a stable, finite-action solution to the Euclidean field equations of a
$d$-dimensional quantum field theory. For renormalizable quantum field theories, these
equations can also be viewed as the equations for stable, finite-energy, static solutions to
the field equations of the same theory in $d + 1$ dimensions. With this interpretation, the
solution is called a soliton.$^5$ Solitons are particles with mass of order $1/g^2$, where $g^2$ is
the semi-classical expansion parameter. There are also more general periodic solutions
of the $(d + 1)$-dimensional equations that have such a particle interpretation.
Another kind of generalized soliton is an infinite extended object of finite energy per
unit volume, like a flux tube or domain wall. We have already encountered flux tubes
in our discussion of confinement. Indeed, the most important solitons in high-energy
quantum field theory are the monopoles and flux tubes associated with U(1) gauge
fields. Various other solitons are useful in applications to condensed-matter physics.
The fact that a soliton's mass per unit volume goes to infinity in the semi-classical
limit means that part of the semi-classical expansion is an expansion in the recoil of
the soliton. This makes the expansion particularly intricate, and we do not have space
for a proper discussion of it here. Consequently, we will begin our discussion with
the instanton interpretation of static classical solutions, for which we can give a fairly
complete sketch of the semi-classical expansion. We will then outline the principal facts
about solitons in gauge theories.
We have emphasized repeatedly that the Schrodinger picture is an awkward way to
think about quantum field theory. Fortunately, the connection between tunneling and
Euclidean field equations gives us a way to compute tunneling corrections to Green
functions quite directly. The discussion in the previous subsection provides us with a
conceptual framework for understanding what an instanton is, but to understand what
it does we turn to the Euclidean path integral for Green functions. We will treat two
cases, the instanton for decay of a metastable ground state in scalar field theory and
the eponymous instanton for pure Yang-Mills theory. In the first case, we are doing an
expansion of a formal functional integral of $e^{-S[\phi]/g^2}$ with the boundary conditions
that $\phi^i(x) \to v^i$ at Euclidean infinity, where $v^i$ is a local minimum of the field potential.
Finite-action stationary points$^6$ will lead to exponentially small corrections to Green
functions.
It is intuitively appealing, and rigorously proven [139], that the minimal action solution
has O(d) invariance. Thus, it is a function only of the Euclidean distance $r$ from
some point of origin. Translation invariance assures us of the existence of a $d$-parameter
family of solutions with different origins and equal action. We will come back to
the consequences of this degeneracy forthwith. The minimal-action solution is thus a
minimum of
$$\int dr\, r^{d-1}\left[\frac{1}{2}\left(\frac{d\phi^i}{dr}\right)^2 + V(\phi^i)\right].$$
The instanton is a particular path $\phi^i(r)$ in field space, so there is an equivalent single-
field problem (with a modified potential) that solves for the field-space path length as
a function of $r$. For simplicity then, we will restrict our attention to a one-dimensional
field space, with the knowledge that the generalization is straightforward.
5 This is an abuse of the mathematician's term soliton, which refers to special solutions of integrable field
theories.
6 Finite action actually refers to the action difference between the solution in question and the ci
The variational equations are
$$\phi_{rr} + \frac{d - 1}{r}\,\phi_r - V'(\phi) = 0.$$
These are the Newtonian equations for a particle moving under the influence of time-
dependent friction, in a potential $U = -V$. $U$ has two maxima, which we call the false
vacuum $v_F$ and the true vacuum $v_T$. The boundary conditions are that $\phi \to v_F$ as $r$
goes to infinity and that $\phi_r(0) = 0$, in order to have finite action. Notice that this
spherically symmetric solution is automatically what we have called the bounce above. The
Green functions we calculate with the functional integral are ground-state expectation
values, and contain two factors of the exponential suppression of the ground-state wave
function in the tunneling region.
The variational equations always have a finite-action solution. The free boundary
condition is the value of $\phi$ at $r = 0$. The "particle" obviously has trajectories starting
on the $v_T$ side of the minimum of $U$ that do not make it to $v_F$, even in infinite time.
Simply start with a Euclidean energy $\frac{1}{2}\phi_r^2 - V$, which is less than or equal to the value of
the potential $U$ at $v_F$. To see that there are solutions that overshoot $v_F$ in finite time, start
with $\phi$ near $v_T$. If the initial condition is close enough to the maximum, the trajectory
will remain near the maximum until very large r, where the friction term is negligible.
At this point, energy is approximately conserved, and overshoot is guaranteed.
As the initial position is varied continuously between undershoot and overshoot solutions,
we will find a unique solution that settles in to $v_F$ in infinite time. Near $r = \infty$
the corrections $\phi - v_F$ fall off like $e^{-mr}$, where $m$ is the mass of small oscillations near
the metastable minimum. Thus the action is finite. The leading semi-classical approx-
imation to connected Green functions consists of saturating the functional integral
with the degenerate classical solutions $I(|x - a|)$, where $I$ is the spherically symmetric
instanton solution we have just discussed. The answer is
$$e^{-S[I]/g^2}\int d^d a\; I(x_1 - a)\cdots I(x_n - a) = W_n(x_1 \ldots x_n).$$
Note that it is translation-invariant.
This simple result for connected Green functions follows from a slightly more
elaborate analysis, called the dilute-gas approximation (DGA) for full Green func-
tions. Consider instanton contributions to the partition function. The single instanton
corrects the partition function by an additive term of the form
$$V e^{-S[I]/g^2},$$
which is exponentially small but multiplied by an infinite factor of the Euclidean space-time
volume $V$. Clearly, this infinity is similar to the overall volume factor in vacuum
Feynman diagrams, and we expect it to exponentiate. To see this, consider a field configuration
$\phi_{\rm DGA} = \sum_i I(x - a_i)$. When $|a_i - a_j|$ is large for each pair, this is an approximate
solution of the Euclidean equations, because the instanton field falls rapidly at infinity.
Its action is $nS[I]$ and it has $n$ collective coordinates $a_i$ that should be integrated over
space-time. However, this multiple integration overcounts configurations that differ
only by a permutation of the $a_i$. There is only one field configuration but many disjoint
integration regions related by permutation. Thus, the total contribution of these
dilute-gas configurations to the partition function is
$$\sum_{n=0}^{\infty}\frac{\left(V e^{-S[I]/g^2}\right)^n}{n!} = e^{V e^{-S[I]/g^2}}.$$
Thus, the DGA gives an exponentially small correction to the vacuum energy den-
sity. When applied to the calculation of Green functions the DGA gives disconnected
contributions, in which some finite number of instantons will affect connected clusters
of fields, while most of the instantons in the gas cancel out against the denominator.
Connected Green functions are, in the DGA, a one-instanton effect.
Like all leading-order semi-classical results, the overall normalization of the DGA
answer is not determined until we integrate over small fluctuations around the instanton.
This must be done with care. We have integrated over a $d$-dimensional sub-manifold
of field space, parametrized by the collective coordinate $a$, the instanton center. We must
rewrite the integration measure in terms of coordinates transverse to this sub-manifold,
and consider only fluctuations in the transverse direction. In principle, this introduces a
Jacobian determinant reminiscent of the Faddeev-Popov determinant of gauge theory.
However, most of the effects of that determinant show up only at higher orders in the
semi-classical expansion. All that is left is the instruction to leave out the zero modes
of the fluctuation operator
$$-\nabla^2 + V''(I)$$
when calculating its determinant.
It is easy to see, from translation invariance or explicit substitution, that the $d$
functions $\partial_i I$ are zero modes of the fluctuation operator. Furthermore, they are normalizable,
because $\partial_i I = (x_i/r)I_r$ falls off exponentially at infinity. We have run quickly
through the discussion of the DGA and collective coordinates for field theory, because
it precisely parallels the quantum-mechanical arguments of the previous section.
10.4 Instantons in the two-dimensional Higgs model
Before beginning the discussion of instantons and vacuum structure in higher-
dimensional field theory, I want to point out one more way in which the solutions we
will discuss apply to physical systems. The classical partition function of a general system is the
product of a simple term coming from the kinetic energy and a configuration integral
$$\int dq\; e^{-\beta V(q)}.$$
For a $(d+1)$-dimensional relativistic field theory, the potential in this formula is just
the Euclidean action of the same field theory in dimension $d$. So the partition function
which gives the vacuum energy of the $d$-dimensional field theory is also the classical
partition function of the system in d + 1 dimensions. In this way of thinking about
things, the coupling g plays the role of inverse temperature, with dimensional factors
supplied by relevant parameters (in the RG sense) of the field theory. This should be
reminiscent of the way our instanton sums in quantum mechanics reduced to a problem
in classical statistical mechanics.
The utility of these formulae goes beyond relativistic field theory, because the
coarse-grained, long-wavelength behavior of condensed-matter systems, especially near
second-order phase transitions, is dominated by classical fluctuations of an order
parameter field, whose effective energy functional is rotation-invariant. Thus, the
study of the relativistic Higgs model that we present here is relevant also to the sta-
tistical mechanics of planar superconductors. We will also touch briefly on the
Kosterlitz-Thouless phase transition in planar XY magnets.
Our strategy for finding finite-action instanton solutions in higher-dimensional field
theory will focus on the notion of topological charge, a generalization of the $\int dt\,\partial_t x$ of
quantum instantons. That is, we break the space of Euclidean field configurations up
into classes with different behavior at infinity. Continuous changes of the fields bounded
in space cannot change the class. If we can find any finite-action configuration in the
class, and the class is different from that of the constant classical background, then we
should be able to find a non-trivial finite-action solution by minimizing over variations
within the class.
In $d$-dimensional Euclidean space, infinity has the topology of a $(d-1)$-sphere $S^{d-1}$.
This indicates the relevance of the space of maps $\pi_{d-1}(X)$, from the sphere to the space
$X$ of behaviors at infinity of finite-action field configurations. Since finite-action field
configurations are defined relative to a classical vacuum, $X$ is the space of classical
vacua. For $d = 2$, we have maps from the circle, and the space of classical vacua should
have a non-trivial circle in it. This implies a continuous degeneracy, which is usually an
indicator of a classical U(1) symmetry, which is spontaneously broken. This is indeed
the case for the model we will study.
Let $\chi$ be the periodic, angle-valued Goldstone field, with period $2\pi$. Configurations
with non-trivial winding number have
$$\oint d\theta\,\partial_\theta\chi = 2\pi n,$$
where the integral is taken around the circle at infinity. By Stokes' theorem, we can
deform this circle into the finite interior, until we hit singularities of $\chi$. If $\chi$ is the phase
of a complex field, $\phi = \rho e^{i\chi}$, then singularities are to be expected, and don't lead to
divergences, at the zeros of $\rho$. On the other hand, $\rho$ should go to a non-zero constant
$\rho_0$ at infinity. The contribution of the kinetic term of $\chi$ to the action at large radius is
thus bounded from below by
$$\rho_0^2 \int d^2x \left(\frac{n}{r}\right)^2 .$$
The action is thus logarithmically divergent.
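The winding number of a given configuration can be checked numerically by accumulating the phase of $\phi$ around a large circle. The sketch below is illustrative only: the function names and the profile $\rho(r) = \tanh r$ are assumptions, chosen so that $\rho$ vanishes at the origin and approaches a constant at infinity.

```python
import cmath
import math


def vortex(n):
    """Hypothetical test field phi = rho(r) exp(i n theta) with rho(r) = tanh(r)."""
    def phi(x, y):
        r = math.hypot(x, y)
        return math.tanh(r) * (complex(x, y) / r) ** n
    return phi


def winding_number(phi, radius=10.0, samples=2000):
    """Sum the phase increments of phi around a circle and divide by 2*pi."""
    total = 0.0
    prev = cmath.phase(phi(radius, 0.0))
    for j in range(1, samples + 1):
        theta = 2.0 * math.pi * j / samples
        cur = cmath.phase(phi(radius * math.cos(theta), radius * math.sin(theta)))
        step = cur - prev
        # Wrap each increment into [-pi, pi) so branch-cut jumps do not count.
        step = (step + math.pi) % (2.0 * math.pi) - math.pi
        total += step
        prev = cur
    return round(total / (2.0 * math.pi))
```

Because only the phase enters, the result is insensitive to the radial profile, which is the numerical counterpart of the statement that continuous deformations bounded in space cannot change the class.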
Normally, we will throw out configurations with infinite action, when the origin of the
infinity is infrared. However, in this case, we would miss something interesting if we did
so. The field equations for $\chi$ are linear, so consider a configuration consisting of a sum
of $n = \pm 1$ solutions for $\chi$. Let the separation between the two centers with $\rho(R_\pm) = 0$
be $R$. When $R$ is large we have an approximate solution of the field equations with action
$2S_0 + \ln(R/r_+)$. The contribution to the partition function from such configurations is
$$Z_{\rm inst} \propto V \int d^2R\; e^{-\left[2S_0 + \ln(|R|/r_+)\right]/g^2}.$$
The factor of V comes from the integration over the center of mass of the instanton-
anti-instanton configuration, while the integral over R describes the relative coordinate.
It's clear that the integral converges for small $g^2$, but that there is a critical value of
g at which the integral diverges. This is a signal of a phase transition in which a
gas of tightly bound instanton-anti-instanton pairs is replaced by a plasma of sepa-
rated instantons and anti-instantons. The behavior near the transition can be described
exactly by converting the sum over instantons into a sine-Gordon field theory (we will
see an example of the same technique in a three-dimensional example below), and using
the renormalization group. This phase transition was first understood by Kosterlitz and
Thouless.
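The convergence criterion can be made explicit with a short estimate. As a sketch, assume the pair action grows as $2S_0 + \kappa\ln(R/r_+)$, where $\kappa$ is a positive constant standing in for the coefficient of the logarithm (an assumption of this example; its value depends on the normalization of the action), and cut the integral off at $|R| = r_+$:

```latex
\int_{|R|>r_+} d^2R\; e^{-\frac{\kappa}{g^2}\ln(|R|/r_+)}
  = 2\pi \int_{r_+}^{\infty} dR\, R \left(\frac{R}{r_+}\right)^{-\kappa/g^2}
  = \frac{2\pi\, r_+^2}{\kappa/g^2 - 2},
  \qquad g^2 < \frac{\kappa}{2}.
```

Under this assumption the integral diverges at large $R$ once $g^2 \ge \kappa/2$, which locates the critical coupling at which bound pairs give way to a plasma.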
Another strategy, which makes the instanton action finite, is to gauge the global U(1)
symmetry. That is, instead of regarding it as a symmetry acting to transform physical
states into each other, we regard it as a redundancy. Two field configurations related
by a gauge transformation $\phi(x) \to e^{im\Lambda(x)}\phi(x)$ (the field $\phi$ has charge $m \in Z$) are
supposed to be the same physical state of the system. In order to do this, we have to
introduce a gauge potential $A_\mu(x)$, which transforms as $A_\mu \to A_\mu + \partial_\mu\Lambda$, and make
the replacement $\partial_\mu\phi \to D_\mu\phi = (\partial_\mu - imA_\mu)\phi$. The Lagrangian is
4/x 2 MV
Note that, in this formula, $A_\mu$ has mass dimension 1 and $\phi$ has dimension 0. $g^2$ is
dimensionless, as are the parameters in the potential $V$. The potential has a global
minimum at $|\phi| = 1$ and a maximum at $|\phi| = 0$. The perturbative spectrum in the
expansion around the minimum is exposed by introducing the gauge-invariant variables
$\rho = |\phi|$ and
$$B_\mu = A_\mu - \frac{1}{m}\,\partial_\mu\chi .$$
We find a massive vector field,⁷ with mass $\mu$, and a massive scalar with mass $(\mu/2)V''(1)$.
The natural gauge-invariant order parameter for this system is the Wilson loop
⁷ The particle described by the Proca equation has only one internal state
If we choose parameters in the potential such that $|\phi| = 0$ is the global minimum, this
order parameter has the following behavior for large loops $C$:
$$W[C] \sim \begin{cases} e^{-A[C]}, & k \neq 0 \bmod m,\\ e^{-P[C]}, & k = 0 \bmod m,\end{cases}$$
where P[C] is the perimeter of C and A[C] the area bounded by C. This is interpreted
by thinking of a Wilson loop with a long rectangular section with
$$T \gg R \gg 1/m_{\rm min}.$$
That is, both sides are much larger than the Compton wavelength of the lightest particle
in the spectrum. This corresponds to creating a pair of extremely heavy external sources
with charge ±k, moving them a distance R apart, and letting them sit for a time T before
re-annihilating them. Then
$$W[C] \sim e^{-E(R)T},$$
where E(R) is the minimum energy state in the presence of the static, separated, particle-
anti-particle pair. The area law corresponds to the unscreened one-dimensional
Coulomb potential $E(R) \sim |R|$, while the perimeter law gives us the self-energy of
charges screened by the dynamical charge-$m$ particles in the system.
By contrast, the perturbative calculation in the Higgs phase leads to screening of
all external sources, even $k \neq 0$ modulo $m$. In the interpretation of the Higgs model
as the statistical mechanics of a planar superconductor, this screening is interpreted
as screening of magnetic flux by the Meissner effect. The Wilson loop measures the
Aharonov-Bohm effect experienced by transporting a particle of charge k around a
closed loop.
Now let us return to the question of instanton solutions in the Higgs phase. First
consider the following configuration of the fields:
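$$mA_\mu = \partial_\mu\chi, \qquad \oint d\theta\,\partial_\theta\chi = 2\pi n, \qquad \rho = 1.$$
Formally, since $A_\mu$ is a gradient, $F_{\mu\nu} = 0$, and the action of this configuration is the
same as that of the classical vacuum with $\rho = 1$ and $\chi = A_\mu = 0$.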
This is misleading, since
$$\oint A_\mu\,dx^\mu = \frac{1}{2}\int F_{\mu\nu}\,dx^\mu dx^\nu = \frac{2\pi n}{m}.$$
The line integral is taken over any contour that one can reach by continuous deformation
of the circle at infinity, without hitting a singularity of $\chi$. If $\chi$ has a singularity
only at one point (call it the origin), then
$$F_{\mu\nu} = \frac{2\pi n}{m}\,\delta^2(x)\,\epsilon_{\mu\nu},$$
and the action is really infinite. As before, we can regularize the singularity by considering
a configuration in which $\rho(r)$ has a zero at the origin, and goes asymptotically to
1 at infinity. Thus, we make an ansatz
$$A_\mu(x) = \epsilon_{\mu\nu}x_\nu\, a(r),$$
and
$$\phi_a(x) = \frac{x_a}{r}\,\rho(r).$$
Here $\phi_a$ are the real and imaginary parts of the complex field $\phi$. Imposing $\rho(0) = 0$ and
the boundary condition that as $r \to \infty$ we approach the singular configuration $\rho = 1$,
$mA_\mu = \partial_\mu\chi$, $\chi = n\theta$ makes it easy to see that we can make the action finite. Therefore,
there is a minimum-action configuration with the same boundary conditions. It is
somewhat more difficult to prove, but nonetheless true, that the minimum is achieved
within the symmetric ansatz we have chosen. If we define the fluctuations $\delta a = a - a_\infty(r)$,
$\delta\rho(r) = \rho - 1$, then they satisfy the linearized equations for large $r$. The fact that
all gauge-invariant fields are massive in the Higgs vacuum shows that the fluctuations
fall off exponentially. Thus, the conditions for the dilute-instanton-gas approximation
are satisfied.
As in the Bloch-wave problem, it is convenient to define a partition function with an
extra phase
$$e^{\frac{i\theta}{2\pi}\int F}.$$
The instantons have two translational zero modes and a positive fluctuation determinant.
Thus, the dilute-gas partition function is
$$Z = \exp\left[2LT\,\frac{N}{2\pi}\,D^{-1/2}\,e^{-S_0/g^2}\cos\left(\frac{\theta}{m}\right)\right].$$
Here $\sqrt{N}$ is the normalization factor for the single-instanton zero modes $\partial_\mu\chi_I$, and
we have used the fact that a single instanton has flux $1/m$ for a charge-$m$ Higgs field.
$LT$ is the volume of Euclidean space-time.
It is easy to evaluate the Wilson-loop expectation value, by noting that
$$\oint_C A_\mu\,dx^\mu = \int F,$$
where the volume integral runs over the area interior to the loop. Thus the charge-$k$
Wilson loop simply shifts the $\theta$ parameter to $\theta + 2\pi k$, within this area. When $LT \gg
A \gg \mu^{-2}$, we can break the dilute-gas sum in the numerator up into the product of a
sum for which all instantons are inside the loop and one for which they are all outside
the loop. These sums can each be evaluated using the formula for the partition function
for the appropriate volume and $\theta$ angle. Thus
$$\langle W\rangle = \exp\left\{2A\,\frac{N}{2\pi}\,D^{-1/2}\,e^{-S_0/g^2}\left[\cos\left(\frac{\theta + 2\pi k}{m}\right) - \cos\left(\frac{\theta}{m}\right)\right]\right\}.$$
This falls like the area, unless k is a multiple of m. Thus, instantons restore the confining
Coulomb potential, which seemed to be screened in the Higgs phase. Note, however,
that the strength of the potential is exponentially small for small g. This is quite different
from the phase where there is no expectation value of the Higgs field. There the strength
of the Coulomb potential is of order $g^2$.
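The dependence of the area law on the external charge can be tabulated in a few lines. This is a sketch under a stated assumption: the dilute-gas exponent is taken to be proportional to $\cos((\theta + 2\pi k)/m) - \cos(\theta/m)$ times the loop area, with a positive, exponentially small instanton density as prefactor. The function name is hypothetical, and only the dimensionless bracket is computed.

```python
import math


def area_law_coefficient(theta, k, m):
    """Dimensionless factor cos(theta/m) - cos((theta + 2*pi*k)/m).

    Multiplied by the (positive) instanton density, this plays the role of
    the string tension for a charge-k loop in a theory with a charge-m Higgs
    field, under the assumption stated in the lead-in.
    """
    return math.cos(theta / m) - math.cos((theta + 2.0 * math.pi * k) / m)


def is_confined(theta, k, m, eps=1e-12):
    """True when the area term does not cancel for this external charge."""
    return abs(area_law_coefficient(theta, k, m)) > eps
```

For example, with `m = 3` and `theta = 0` the coefficient vanishes for `k = 3` and `k = 6` but not for `k = 1` or `k = 2`, matching the statement that the area law disappears only when $k$ is a multiple of $m$.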
In the interpretation of our partition function and Wilson loop in terms of the
Aharonov-Bohm phase in a superconductor, the failure of screening in the presence
of instantons is an indication that the Meissner effect disappears at finite temperature.
Two-dimensional superconductors have finite-energy excitations that are "Abrikosov
flux dots." At finite temperature the system is a dilute gas of dots and anti-dots. An
external magnetic field penetrates the system by preferentially aligning the dots in the
region where the field exists.
10.5 Monopole instantons in three-dimensional
Higgs models
In three Euclidean dimensions, topological charges are classified by $\pi_2(M_V)$, the mappings
of the 2-sphere into the space of vacuum field configurations. The space of gauge
field vacua is always the gauge group, and $\pi_2(G)$ vanishes for any Lie group. Thus, we
will have instantons only for models that involve scalar fields, and only if the space of
minima of the scalar potential has non-vanishing $\pi_2$. The angular-momentum part of
the kinetic term has the form
so angular dependence leads to a linearly divergent action. We conclude, as in two
dimensions, that finite-action instantons will occur only if the space of scalar vacua
has a gauge equivalence on it. That is, locally it takes the form $G/H \times X$, where $G$ is
the gauge group and $H$ the subgroup which preserves a point in the vacuum manifold.
In physics language, the gauge group $G$ is in the Higgs phase, with $H$ the unbroken
subgroup. In our discussion below of magnetic monopoles as solitons, we will show
that $\pi_2(G/H) = \pi_1(H)$. The most interesting case will be that in which $H$ is a product
of U(1) groups, and the simplest example is a single U(1) and $G$ = SU(2). In this case,
$G/H$ is just $S^2$ and the topological charge is just the winding of the sphere on itself.
The map of charge 1 is just the identity map, while that of charge $-1$ is the orientation-reversing
map of the sphere on itself, $n^a \to -n^a$, in the representation of $S^2$ as the space
of unit 3-vectors.
Thus, the simplest three-dimensional model with finite-action instantons is the
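Georgi-Glashow model of SU(2) gauge theory broken to U(1) by a Higgs field in
the three-dimensional adjoint representation. The action is
$$S = \frac{1}{g^2}\int d^3x \left[\frac{1}{4}\left(F^a_{\mu\nu}\right)^2 + \frac{1}{2}\left(D_\mu\phi^a\right)^2 + V(\phi^a\phi^a)\right].$$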
We count the mass dimension of both the gauge potential and the scalar field as 1, in
which case $g^2$ also has mass dimension 1. These are not the scaling dimensions of the
fields in a renormalization-group analysis. Restricting our attention to renormalizable
potentials, $V$ is a polynomial of order $\le 6$ whose minimum is at non-zero $\phi^a$. In order
to get explicit solutions, we will restrict ourselves to the case $V = 0$, a situation that
is natural if there is enough supersymmetry. None of the qualitative results we obtain
will depend on this restriction. The vacuum expectation value of $|\phi^a|$ is denoted $v$.
Our analysis begins from the remark that
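$$\int d^3x\,\frac{1}{4}\left[F^a_{\mu\nu} \pm \epsilon_{\mu\nu\lambda}(D_\lambda\phi)^a\right]^2 \ge 0.$$
Thus, if the potential is zero,
$$g^2 S \pm \frac{1}{2}\int d^3x\, \epsilon_{\mu\nu\lambda}F^a_{\mu\nu}(D_\lambda\phi)^a \ge 0.$$
It follows that
$$S \ge \frac{4\pi v}{g^2}\,|M|,$$
where the magnetic charge $M$ is defined by
$$Mv = \frac{1}{8\pi}\int d^3x\, F^a_{\mu\nu}\,\epsilon_{\mu\nu\lambda}(D_\lambda\phi)^a .$$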
$v$ is the vacuum expectation value of the dimension-1 Higgs field. Now we can use
the Leibniz rule for covariant derivatives, the Bianchi identity for the gauge field
strength, and the fact that covariant derivatives of singlets are ordinary derivatives, to
show that
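$$M = \frac{1}{4\pi v}\int d\left(\phi^a F^a\right) = \frac{1}{4\pi v}\oint_{S^2_\infty}\phi^a F^a,$$
where $F^a = \frac{1}{2}F^a_{\mu\nu}\,dx^\mu dx^\nu$ is the field-strength 2-form.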
The name magnetic charge comes from the fact that we have an unbroken U(1)
gauge theory. If we interpret our solutions as static solutions of the $(3+1)$-dimensional
version of the theory, then M measures the magnetic flux of the unbroken gauge
field through the sphere at infinity. These solutions are called 't Hooft-Polyakov
monopoles.
The inequality is saturated by solutions of the first-order equations
$$F^a_{\mu\nu} = \mp\,\epsilon_{\mu\nu\lambda}(D_\lambda\phi)^a .$$
In the supersymmetric context which justifies setting $V = 0$ in the quantum theory,
these are the Bogomol'nyi-Prasad-Sommerfield (BPS) equations which say that the
solution preserves half the supercharges.⁸ This accounts for the first-order nature of
the equations: SUSY variations are of first order in derivatives.
In the theory with no potential there are $n$-monopole solutions. These arise because
the massless scalar field which parametrizes the radius of $\phi^a$ gives rise to attractive
long-range forces that cancel out the magnetic repulsion of the monopoles. For our
purposes we will be more interested in the non-BPS solutions consisting of equal numbers
of monopoles and anti-monopoles. The properties of these can be completely
understood in terms of the BPS solution with unit magnetic charge. It is natural to
make a spherically symmetric ansatz,
$$\phi^a = \frac{x^a}{r}\,f(r), \qquad A^a_\mu = \epsilon_{a\mu\nu}\frac{x_\nu}{r}\,a(r).$$
Note that $f$ and $a$ have mass dimension 1.
We have
where $P^{ab}$ is the projector orthogonal to $x^a/r$.
The field strength is given by
The Bogomol'nyi equation
If we introduce
$$q = (1 - ra),$$
then the first equation reads $f = q'/q$, and the equations are solved by
$$q = \frac{rv}{\sinh(rv)},$$
where $v$ is the vacuum expectation value of $|\phi^a|$.
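It is worth checking the limiting behavior of this solution. The following worked expansion assumes the profile $q = rv/\sinh(rv)$ together with the relations $f = q'/q$ and $q = 1 - ra$ quoted above; nothing else is used.

```latex
\frac{q'}{q} = \frac{d}{dr}\Big[\ln(rv) - \ln\sinh(rv)\Big] = \frac{1}{r} - v\coth(rv),
\qquad
a = \frac{1-q}{r}.

% Large r: q vanishes exponentially, so
f \;\to\; -v + \frac{1}{r} + O\!\left(e^{-2rv}\right), \qquad a \;\to\; \frac{1}{r}.

% Small r: coth(x) = 1/x + x/3 + ..., and x/sinh(x) = 1 - x^2/6 + ..., so
f \;\to\; -\frac{v^2 r}{3}, \qquad a \;\to\; \frac{v^2 r}{6}.
```

Both profile functions are therefore regular at the origin, $|f|$ approaches the vacuum value $v$ at infinity, and the $1/r$ correction to $f$ is the power-law tail produced by the massless scalar.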
⁸ This is most easily understood in the soliton interpretation of our solutions in 3 + 1 dimensions. [...]
extended supersymmetry algebra
has degenerate representations when the matrix $\gamma^\mu P_\mu + \epsilon M$ has zero eigenvalues. This is a condition on
the mass of particles [...] Euclidean action in the
three-dimensional instanton interpretation of the solutions.
We will do the calculation of the effects of instantons in the low-energy effective
theory below the mass of the massive gauge bosons and scalars. In a generic theory
with monopole instantons, this low-energy theory consists of a three-dimensional U(1)
gauge theory, coupled to a dilute gas of point-like instantons. If we think of the vector
dual to $F_{\mu\nu}$ as an electric field in the statistical mechanics of three-dimensional electrostatics,
then our problem is equivalent to the statistical mechanics of a Coulomb
gas. In that language, the phenomenon we are about to expose is called Debye screening.
For a given configuration of monopoles and anti-monopoles, the field strength is
given by
$$\nabla^2\phi = 4\pi\sum_i\left[\delta^3(x - x_i) - \delta^3(x - y_i)\right].$$
$x_i$ and $y_i$ are the positions of the monopoles and anti-monopoles, respectively. We have
to sum over all monopole and anti-monopole numbers, with weight
$$\frac{1}{N_+!\,N_-!}\left[D^{-1/2}\left(\frac{\mathcal{N}}{2\pi (gR_c)^2}\right)^{3/2} e^{-S_0/(gR_c)^2}\right]^{N_+ + N_-}$$
and integrate over all the positions. $\mathcal{N}$ is the integral of the square of the translational
zero modes (summed over all field components). This quantity, as well as $D$
and $S_0$, are the contributions of the microscopic theory to the instanton amplitudes.
The $R_c$ in this formula is $1/v$. In principle, there are renormalization corrections to
this effective field theory formula, but if $gR_c$ is small, then they are small, and we can
neglect them.
Different underlying theories, with different scalar potentials, or other massive fields,
will change these parameters, but will not otherwise affect the infrared dynamics we
study. The parameters should be tuned to absorb infinite effects in the low-energy
theory, such as the Coulomb self-energies of the monopoles, and restore the finite
values of these self-energies in the underlying theory. Additional massless fields,
such as those guaranteed in a supersymmetric theory, can change the IR dynam-
ics. Typically the change has to do with the particular Green functions to which
the instantons contribute, rather than a drastic change in the instanton statistical
mechanics.
The fluctuating U(1) gauge dynamics can be represented by a vector potential in
a particular gauge. It is more convenient, however, to represent it by a fluctuating
field strength $\mathcal{F}_{\mu\nu}$. This is achieved by replacing the Euclidean Maxwell action by
$(g^2/4)\mathcal{F}_{\mu\nu}^2 + i\mathcal{F}_{\mu\nu}(\partial_\mu A_\nu)$ + (gauge fixing) (note that $\mathcal{F}$ has dimension 1). Integrating
over $\mathcal{F}_{\mu\nu}$ restores the Maxwell form, while integrating over $A_\mu$ gives us a functional
delta function setting
$$\partial_\nu\mathcal{F}_{\mu\nu} = J_\mu .$$
The current is zero in the denominator of the functional integral formula, and represents
the electrically charged particles (in the sense of $(2+1)$-dimensional electrodynamics)
coupled to the gauge field. We will take it to be the current of a single Wilson loop.
The constraint is solved by setting $\mathcal{F}_{\mu\nu} = F^0_{\mu\nu} + \epsilon_{\mu\nu\lambda}\partial_\lambda\phi$, where $F^0$ is any special
solution of the constraint equation. $\phi$ is a dimensionless scalar field. Different choices
of $F^0$ are absorbed into the $\phi$ functional integral. Later, we will make a semi-classical
approximation for $\phi$, and it is convenient to choose $F^0$ to simplify the semi-classical
analysis. We will choose it to be zero in the denominator and, in the numerator, a
surface delta function on the minimal-area surface bounded by the loop.
The utility of the dual formulation is that it is easy to include the monopole instantons
as sources of the fluctuating field $\phi$. If we add the term $i\sum_i[\phi(x_i) - \phi(y_i)]$ to the
denominator functional integral, integration over $\phi$ gives rise to the repulsive and
attractive long-range Coulomb forces among monopoles and anti-monopoles. It also
leads to infinite self-energies, which are absorbed into $S_0$. Note that the factor of $i$ in
the action is required in order to get the right sign for the Coulomb energies and make
like charges repel and opposite charges attract.
We can now sum over the monopoles for fixed $\phi$ and obtain a correction to the
effective action
$$\delta S = 2D^{-1/2}\left(\frac{\mathcal{N}}{2\pi(gR_c)^2}\right)^{3/2} e^{-S_0/(gR_c)^2}\cos\phi,$$
a result first obtained by Polyakov (though the equivalent physics was done long ago
by Debye). The most significant aspect of this result is that the $\phi$ field has obtained an
exponentially small mass.
Now consider trying to evaluate the functional integral over $\phi$ by finding a classical
solution that minimizes the action. In the denominator, the appropriate solution is just
$\phi = 0$. In the numerator, if we ignore the cosine term, we could try to set
$$\epsilon_{\mu\nu\lambda}\partial_\lambda\phi = -F^0_{\mu\nu}.$$
This cannot work everywhere, because of the current source on the Wilson loop.
However, it fails only in the vicinity of the loop and we get an action proportional
to the circumference of the loop. The large-distance behavior of the loop is in this
case determined by the quadratic fluctuations of $\phi$ around the classical solution,
which give rise to the logarithmically rising Coulomb potential of $(2+1)$-dimensional
electrodynamics.
For simplicity, think about a loop in the $x, y$ plane, with $z = 0$. The solution discussed
above has $\phi = 0$ for negative $z$. It jumps to $\phi = \int F^0_{\mu\nu}$, for positive $z$, whenever $(x, y)$ is
in the interior of the loop. In general, this will lead to an infinite contribution from the
cosine term, of the form $A[\infty]$, where $A$ is the area of the loop. We have to let $\phi$ go back
to zero to avoid the infinity, but this leaves over a finite contribution proportional to
$A$ for the minimal-action configuration. The proportionality constant is exponentially
small.
Note, however, that, when the required jump in $\phi$ is exactly an integral number
of periods of the cosine, we can eliminate the area term entirely. By arithmetic that
we will do carefully when we study monopoles as solitons, we find that the area law
disappears precisely for Wilson loops corresponding to the charges carried by massive
W bosons. Thus we get a linear confining potential only between charges corresponding
to half-integral spin representations of SU(2).
The answer to the question of what classical configurations these monopole instan-
tons are tunneling between is quite subtle. To find it we compactify the two space
dimensions on a rectangular torus of radius R. The gauge-invariant magnetic field
$\phi^a F^a_{12}$ has configurations with non-vanishing quantized flux wrapping the torus,
$$\int \phi^a F^a = N,$$
and these configurations have energy $\sim 1/R^2$. Thus, in the large-$R$ limit, the system
has degenerate states with different values of magnetic flux, and the monopoles are the
instantons which tunnel between these states.
10.6 Yang-Mills instantons
There are two possible Lorentz-invariant, gauge-invariant, dimension-4 terms in the
Euclidean Lagrangian of pure non-abelian gauge theory. These are
(10.7)
$$(*F)_{\mu\nu} = \frac{1}{2}\,\epsilon_{\mu\nu\lambda\kappa}F_{\lambda\kappa}. \qquad (10.9)$$
The second of these is actually a total derivative. This is easy to see for the abelian case.
For the non-abelian case, we use the language of matrix-valued differential forms. The
gauge potential $A = A_\mu dx^\mu$ is a Lie-algebra-valued 1-form. The field strength $F$ is
given by
$$F = dA - A^2.$$
Note that $F$ and $A$ are defined to be anti-Hermitian. Furthermore, because of the
anti-symmetry of the multiplication of forms, $A^2$ involves only the commutator of the
matrices $A_\mu$. Finally
$$dF = -dA\,A + A\,dA = [A, F].$$
is taken in the fundamental representation with the
Introduce the Chern-Simons form
$$C = \mathrm{tr}\left(A\,dA - \tfrac{2}{3}A^3\right),$$
$$dC = \mathrm{tr}\left[(dA)^2 - \tfrac{2}{3}\left(dA\,A^2 - A\,dA\,A + A^2\,dA\right)\right] = \mathrm{tr}\left[dA\,dA - 2\,dA\,A^2\right],$$
where we have used cyclicity of the trace and graded anti-commutativity of differential
forms. Noting that $\mathrm{tr}\,A^4 = -\mathrm{tr}\,A^4 = 0$, we conclude that
$$dC = \mathrm{tr}\,F^2 \qquad (10.10)$$
is proportional to the $\theta$ term in the Lagrangian. The second term in the action is
therefore an integral over the boundary of Euclidean space-time of the Chern-Simons
form. The condition of finite action is that $A \to U^\dagger dU$, a pure gauge, on the boundary.
The function $U$ defines a map from the 3-sphere at infinity in $R^4$ to the gauge group.
Consider the case of SU(2), where the group manifold is itself $S^3$. There are obviously
different topological classes of map of the sphere into itself, which are multiple wrappings.
They are characterized by the winding number $n$, which is an arbitrary integer.
Thus, there are different classes of finite-action gauge-field configurations on $R^4$, classified
by the winding number of the pure gauge transformation, which they approach
at infinity. The winding number $n$ is called the topological charge of the configuration.
For more general gauge groups it can be shown that every non-trivial map of $S^3$ into
the group can be deformed into a map into some SU(2) subgroup, so the topological
classification is the same.
Now note the obvious inequalities
$$\int d^4x\,\mathrm{tr}\left(F_{\mu\nu} \pm {*F}_{\mu\nu}\right)^2 \ge 0. \qquad (10.11)$$
On expanding out the square and collecting terms we see that this bounds the action
of any configuration from below,
$$S \ge \frac{8\pi^2}{g^2}\,|n|,$$
where n is the topological charge. Equality is achieved only for self-dual and anti-self-
dual configurations
F = ±*F. (10.12)
Solutions of these first-order equations are automatically solutions of the Euclidean
Yang-Mills equations, because they minimize the action for fixed topological charge.
A basis of (anti-)self-dual tensors in four dimensions is given by the 't Hooft symbols
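$$\eta^a_{\mu\nu} = \delta_{\mu 0}\delta_{\nu a} - \delta_{\nu 0}\delta_{\mu a} + \epsilon_{\mu\nu a}, \qquad (10.13)$$
$$\bar\eta^a_{\mu\nu} = \delta_{\mu 0}\delta_{\nu a} - \delta_{\nu 0}\delta_{\mu a} - \epsilon_{\mu\nu a}. \qquad (10.14)$$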
These appear in the product rule for Euclidean Weyl matrices
$$\sigma_\mu\bar\sigma_\nu = \delta_{\mu\nu} + \eta^a_{\mu\nu}\sigma^a,$$
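The (anti-)self-duality of the 't Hooft symbols can be verified by brute force. The sketch below assumes the index conventions $\mu, \nu = 0, \ldots, 3$, $a = 1, 2, 3$, the orientation $\epsilon_{0123} = +1$, and reads the three-index $\epsilon_{\mu\nu a}$ as $\epsilon_{0\mu\nu a}$; these choices and all function names are assumptions of the example.

```python
from itertools import product


def levi_civita(*idx):
    """Totally antisymmetric symbol, +1 for indices in increasing order."""
    if len(set(idx)) != len(idx):
        return 0
    inversions = sum(
        1
        for i in range(len(idx))
        for j in range(i + 1, len(idx))
        if idx[i] > idx[j]
    )
    return -1 if inversions % 2 else 1


def delta(i, j):
    return 1 if i == j else 0


def eta(a, mu, nu, sign=+1):
    """'t Hooft symbol; sign=+1 gives eta, sign=-1 gives eta-bar."""
    return (
        delta(mu, 0) * delta(nu, a)
        - delta(nu, 0) * delta(mu, a)
        + sign * levi_civita(0, mu, nu, a)
    )


def dual(tensor):
    """Hodge dual (*T)_{mu nu} = (1/2) eps_{mu nu lam kap} T_{lam kap}."""
    pairs = list(product(range(4), repeat=2))
    return {
        (mu, nu): sum(levi_civita(mu, nu, l, k) * tensor[(l, k)] for l, k in pairs) / 2
        for mu, nu in pairs
    }


def eta_tensor(a, sign=+1):
    """All sixteen components of the symbol for a fixed group index a."""
    return {(mu, nu): eta(a, mu, nu, sign) for mu, nu in product(range(4), repeat=2)}
```

With these conventions `dual(eta_tensor(a))` reproduces `eta_tensor(a)` for each `a`, while the barred symbols change sign under `dual`. Flipping the orientation of $\epsilon_{0123}$ would exchange the two roles.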
It is natural to search for a minimal-action solution that is maximally symmetric,
via the ansatz
$$A_\mu = g_1^{-1}(x)\,\partial_\mu g_1(x)\, f(x^2). \qquad (10.16)$$
$g_1$ is the winding number 1 mapping of the 3-sphere into SU(2),
$$g_1 = \hat{x}_\mu\sigma_\mu, \qquad (10.17)$$
where $\hat{x}$ is the unit vector in the $x$ direction. In Problem 10.5 the reader will show
that this gauge configuration is invariant under simultaneous space-time and gauge
rotations, and that its field strength is self-dual if
$$f(x^2) = \frac{x^2}{x^2 + \rho^2}.$$
$\rho$ is an arbitrary positive parameter, called the scale size of the instanton.
In order to find a tunneling interpretation for this Euclidean solution we consider
Yang-Mills theory on the Euclidean manifold $R \times S^3$. Note that this manifold is conformally
equivalent to $R^4$, so we can use the conformal invariance of the classical
Yang-Mills equations to construct the instanton on this manifold from the solution
we have just found. Using the fact that $\mathrm{tr}\,F \wedge F = dC$, we find that the topological charge
of the instanton for this manifold is equal to the difference of the winding numbers of
the pure gauge configurations, which it approaches as $t \to \pm\infty$.
10.6.1 Hamiltonian formulation of Yang-Mills theory in temporal
gauge
It is always possible to use the gauge freedom of a non-abelian gauge theory to set
$A_0 = 0$. When this is done, the non-abelian electric field is just
$$E^a_i = \frac{1}{g^2}\,\partial_0 A^a_i .$$
The Lagrangian has a standard canonical form and can be quantized in a straightforward
manner, with $E^a_i$ and $A^a_i$ as canonical conjugates. However, we must also impose
the condition that follows from varying the original action with respect to $A_0$. It is easy
to see that this is
$$G^a(x) = D_i E^a_i - \rho^a = 0.$$
This is the non-abelian form of Gauss' law, and $\rho^a$ is the time component of the
Noether current for matter fields, which follows from the global G symmetry. This
condition should be understood as follows. The system with $A^a_0 = 0$ is invariant under
time-independent gauge transformations, and
$$G(\omega) = \int d^3x\,\omega^a(x)\,G^a(x)$$
is the generator of these transformations when $\omega^a$ vanishes sufficiently rapidly at infinity
that integrations by parts are permitted.
It is easy to verify that each $G(\omega)$ commutes with the Hamiltonian, so that we have
an infinite-dimensional symmetry group of the mechanical system we have defined.
The condition $G(\omega) = 0$ is imposed on the allowed physical states of this theory.
It is somewhat analogous to the restriction to BRST-invariant states in covariant
quantization.
An interesting twist on this formalism occurs if there is a topological term in the
Yang-Mills action, and $G$ is broken down to $H$ = U(1) (or a product of U(1)
groups). The topological term does not affect the equations of motion, but it does
change the canonical momentum, and thus the generator of gauge transformations. In
electrodynamics, it is easy to see that the constraint equation is now
9
djEi - — djBj -p = 0,
where $\rho$ is the electric charge density of matter fields. We see that, in the presence of $\theta$,
magnetic monopoles will carry (generally irrational) electric charges [140].
In the classical approximation, ground states of the theory are found by looking for
minima of the energy. The classical states of minimum energy are static, pure-gauge
field configurations. When space has the topology $S^3$ (equivalently, when it has the
topology $\mathbb{R}^3$ but we impose falloff conditions at infinity), the space of such field
configurations is classified by the topological winding number $n$. The instanton represents a
tunneling process between two classical vacuum states with winding numbers differing
by one. This is closely analogous to the tunneling between different wells of an infinite
periodic potential in one dimension. The true semi-classical vacuum state in that case
is a Bloch wave, characterized by a Bloch momentum, which runs between 0 and $2\pi$. In
an analogous fashion, semi-classical Yang-Mills theory would appear to have a continuous
set of vacua characterized by a parameter $\theta$ that is periodic with period $2\pi$. In
fact this is true, even though the semi-classical approximation is valid only for Green
functions at short distances, in the confining phase of the theory. The parameter $\theta$ is
just the coefficient of the $F \wedge F$ term in the Lagrangian, which we introduced above. A
simple example of the correspondence between vacuum parameters and terms in the
Lagrangian is treated in Problem 10.3.
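The Bloch-wave analogy can be checked numerically. The sketch below assumes a tight-binding model that is not in the text: the classical vacua are labeled by an integer on a ring of wells, and tunneling between neighboring wells has a hypothetical amplitude `hopping` (of order $e^{-S_0}$ in a dilute-gas picture). Under these assumptions the states with amplitudes $e^{i\theta n}$ are energy eigenstates with $E(\theta) = E_0 - 2K\cos\theta$.

```python
import cmath
import math


def apply_hamiltonian(psi, hopping, e0=0.0):
    """Apply H = e0 - hopping * (shift + shift^-1) to amplitudes on a ring of wells."""
    n = len(psi)
    return [e0 * psi[j] - hopping * (psi[(j + 1) % n] + psi[(j - 1) % n])
            for j in range(n)]


def bloch_state(n_wells, k):
    """Return (theta, amplitudes) with psi_j = exp(i theta j), theta = 2 pi k / n_wells."""
    theta = 2 * math.pi * k / n_wells
    return theta, [cmath.exp(1j * theta * j) for j in range(n_wells)]


def bloch_energy(theta, hopping, e0=0.0):
    """Energy of the Bloch wave with Bloch momentum theta."""
    return e0 - 2 * hopping * math.cos(theta)


def is_bloch_eigenstate(psi, hopping, theta, e0=0.0, tol=1e-12):
    """Check H psi = E(theta) psi component by component."""
    energy = bloch_energy(theta, hopping, e0)
    h_psi = apply_hamiltonian(psi, hopping, e0)
    return all(abs(h - energy * p) < tol for h, p in zip(h_psi, psi))
```

On a finite ring the Bloch momentum takes the discrete values $2\pi k/N$; the continuous parameter $\theta$ of the text corresponds to the limit of infinitely many wells.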
10.6.2 Bosonic zero modes
There is also an anti-instanton solution of the anti-self-duality equations in which the
winding-number-1 gauge transformation is replaced by the transformation of opposite
winding number. Atiyah and Hitchin were able to find all solutions of the self-duality
equations [141]. These exact solutions are useful primarily in supersymmetric gauge
theories, where they allow us to calculate certain correlation functions exactly. Their
study goes beyond the scope of this book.
For our purposes, it is more interesting to look at approximate solutions of the
second-order equations consisting of a dilute gas of widely separated instantons and
anti-instantons. In this context, widely separated means separations large compared
with all of the parameters $\rho_i$ of the instantons and anti-instantons. The existence of
the parameter $\rho$ is a consequence of the classical conformal invariance of the Yang-Mills
equations. Given any solution with finite action, we can scale the coordinates
to get a new solution with the same action. Note that the same is not true for finite
special conformal transformations, because they map finite points to infinity. When
we consider small fluctuations of the instanton, these symmetries show up as zero
modes of the fluctuation operator. The zero modes corresponding to special conformal
transformations are not normalizable and the corresponding directions in field space
are not included in the functional integral.
The procedure for dealing with the scale zero mode is similar to that for translations.
One extracts a scale collective coordinate and does the integral over it exactly. The
classical measure for this integration is the scale-invariant measure
$$\frac{d\rho}{\rho},$$
but there are generally quantum corrections to this. In Yang-Mills theory, the dominant
effect is the replacement of $g^2$ in the instanton action by the running value of the
renormalized coupling $g^2(\rho)$ [142]. In asymptotically free theories $g$ goes to zero for
zero scale size, which tends to make the integral over $\rho$ converge near 0. On the other
hand, this leads to an enhanced divergence at large $\rho$. Instanton calculations of the
short-distance behavior of Green functions are under control, because the $\rho$ integrals
get cut off at scales of order the distance between operators. However, instanton
technology, just like perturbation theory, fails to capture the large-distance behavior of
asymptotically free theories.
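The convergence statement can be illustrated with a short calculation. The sketch below rests on assumptions that are not spelled out in the text: pure $SU(N)$ Yang-Mills theory with one-loop coefficient $b_0 = 11N/3$, so that $e^{-8\pi^2/g^2(\rho)} = (\rho\Lambda)^{b_0}$, a size measure $d\rho/\rho^5$ (the scale-invariant measure once the four translations are included), and neglect of the prefactor powers of $g(\rho)$, which only change logarithms. The function names are hypothetical.

```python
from fractions import Fraction


def one_loop_b0(n_colors):
    """One-loop coefficient b0 = 11 N / 3 for pure SU(N) Yang-Mills (assumption)."""
    return Fraction(11 * n_colors, 3)


def rho_exponent(n_colors):
    """Power p such that the size integrand behaves as rho**p."""
    return one_loop_b0(n_colors) - 5


def size_integral(n_colors, rho_max, lam=1.0):
    """Closed form of the integral of rho**-5 * (rho * lam)**b0 from 0 to rho_max."""
    b0 = float(one_loop_b0(n_colors))
    if b0 <= 4:
        raise ValueError("integral diverges at small rho unless b0 > 4")
    return lam ** b0 * rho_max ** (b0 - 4) / (b0 - 4)
```

With these assumptions the integrand vanishes as a positive power of $\rho$ at small sizes, while the integral grows like a power of the upper cutoff `rho_max`, which is why a physical cutoff such as the separation between operators is needed.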
In a purely bosonic gauge theory, the result of the dilute-gas computation of Green
functions is
$$\langle O_1(x_1)\cdots O_n(x_n)\rangle_c = D^{-1/2}\int d^4a \int \frac{d\rho}{\rho^5}\int d\Omega\;
g^{-(5 + d_G - 3)}(\rho)\, e^{-8\pi^2/g^2(\rho)}\, O_1^{(c)}(x_1 + a)\cdots O_n^{(c)}(x_n + a). \tag{10.18}$$
The integrals are over the position, scale size, and embedding of the instanton
configuration in the gauge group. $O_i^{(c)}$ is the classical value of the operator $O_i$ in the instanton
configuration of fixed position, size, and gauge orientation. The factor $D^{-1/2}$ is the
determinant computed from the non-zero modes of fluctuation around the instanton.
It has been computed by 't Hooft [142]. The inverse powers of $g$ come from the
normalization of the zero mode integrals, as explained above.
10.6.3 Fermion zero modes and anomalies
In gauge theories with fermions, an extremely interesting phenomenon occurs in
connection with zero modes of the massless Dirac operator in an instanton background.
Consider the eigenvalue equation for the Euclidean Dirac operator:
$$i\not\!\!D\,\psi = \lambda\psi. \tag{10.19}$$
$\gamma_5\psi$ is then an eigenfunction as well, with eigenvalue $-\lambda$. Now consider modes with
$\lambda = 0$, and the way in which they behave under continuous deformations of the Yang-Mills
background. In general, we would expect such a deformation to change the
eigenvalue, but we can only change pairs of eigenvalues away from zero.
For $\lambda = 0$, the eigenfunctions may be chosen to have a definite chirality, $\gamma_5\psi = \pm\psi$,
because $\gamma_5$ anti-commutes with the Dirac operator and preserves the zero eigenspace.
This is not true for $\lambda \neq 0$, since we have seen that multiplication by $\gamma_5$ reverses the sign
of the eigenvalue. As a consequence, the wave function of a non-zero eigenvalue is a
linear combination of the two different chiralities. Thus, left- and right-handed zero
modes must be lifted in pairs.
This means that the difference between the number of left- and right-handed zero
modes (the so-called index of the Dirac operator) is a topological invariant of the
gauge field, unchanged by continuous deformations. We might be tempted to think
that this is related to the topological invariant we have just been discussing. A famous
mathematical theorem, due to Atiyah and Singer [143-146], tells us that this is the case.
A physics proof of this equation follows from the anomaly equation for the axial $U(1)$
current of the massless fermions
$$\partial_\mu\left(\bar\psi\gamma^\mu\gamma_5\psi\right) = \frac{1}{16\pi^2}\,F^a_{\mu\nu}\,({}^*F)^{a\,\mu\nu}. \tag{10.20}$$
Consider the Euclidean fermion functional integral in a background gauge field with
finite action and topological charge. The functional integral of the left-hand side of the
anomaly equation is
$$\sum_n \partial_\mu\left(\bar\psi_n\gamma^\mu\gamma_5\psi_n\right), \tag{10.21}$$
where the sum is over all normalized eigenfunctions of the Dirac operator,
$$i\not\!\!D\,\psi_n = \lambda_n\psi_n. \tag{10.22}$$
Since the gauge potential falls off at infinity, the finite-$\lambda_n$ eigenfunctions fall off
exponentially at infinity. Thus, if we integrate (10.21) over all of Euclidean space, only the
zero modes contribute, and the integral just counts the difference between the numbers
of right- and left-handed zero modes. On the other hand, the right-hand side integrates
to the topological charge. So we find that the topological charge is in fact equal to the
difference between the numbers of right- and left-handed zero modes.
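The pairing argument behind the index can be illustrated with a finite-dimensional toy model. This is only a sketch under stated assumptions, not the field-theory computation: in a chiral basis an operator that anti-commutes with $\gamma_5$ has the block form $\begin{pmatrix} 0 & M^T \\ M & 0 \end{pmatrix}$, where the hypothetical real matrix `m` maps $q$ right-handed components to $p$ left-handed ones. The numbers of zero modes of each chirality depend on the rank of `m`, but their difference does not.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def zero_mode_counts(m):
    """Return (n_right, n_left): dimensions of the kernels of m and of its transpose."""
    p, q = len(m), len(m[0])
    r = rank(m)
    return q - r, p - r


def index(m):
    """Difference between right- and left-handed zero-mode counts."""
    n_right, n_left = zero_mode_counts(m)
    return n_right - n_left
```

Deforming `m` from a generic matrix to a degenerate one creates extra zero modes, but always one of each chirality at a time, so `index(m)` stays equal to $q - p$.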
Actually, this calculation is valid only when the fermions are in the fundamental
representation. For a general fermion representation, $R$, if we choose a representative
of the topological class for which the gauge field sits in an $SU(2)$ subgroup, then $R$
will break up into a direct sum of spin-$j$ representations of $SU(2)$. The anomaly of the
spin-$j$ representation is $2\sum_{m=-j}^{j} m^2$ times larger than that of the spin-$\frac{1}{2}$ representation,
so we can use this formula to count the number of zero modes for any representation.
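The counting rule is easy to evaluate explicitly. The following sketch (with hypothetical function names) computes the factor $2\sum_{m=-j}^{j} m^2$ for a spin-$j$ representation, labeled by the integer $2j$, and sums it over the $SU(2)$ content of a representation $R$.

```python
from fractions import Fraction


def anomaly_ratio(two_j):
    """Return 2 * sum_{m=-j}^{j} m^2 for spin j = two_j / 2."""
    j = Fraction(two_j, 2)
    weights = [-j + k for k in range(two_j + 1)]
    return 2 * sum(m * m for m in weights)


def zero_mode_multiplier(two_j_list):
    """Total anomaly, relative to one spin-1/2 doublet, of a sum of spin-j pieces."""
    return sum(anomaly_ratio(two_j) for two_j in two_j_list)
```

For example, the adjoint of $SU(3)$ decomposes under an $SU(2)$ subgroup into one triplet, two doublets, and a singlet, so `zero_mode_multiplier([2, 1, 1, 0])` gives 6, i.e. six times the fundamental value.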
Bosonic zero modes give apparently infinite contributions to the path integral. We
have to integrate over the corresponding collective coordinates exactly in order to
eliminate or interpret the infinity. By contrast, fermion zero modes tell us that the
single-instanton contribution to the partition function vanishes. Instead, the single
instanton contributes to the expectation value of operators that can "absorb" the zero
modes. The simplest operator, with minimal number of fields, which can do the job,
is called the 't Hooft operator associated with the instanton. The dilute-instanton-gas
approximation for theories with fermion zero modes corresponds to the lowest-order
perturbation theory of a modified theory in which the 't Hooft operator is added to
the Lagrangian with coefficient $D\,e^{-S_I}$, where $D$ includes the determinant of non-zero
modes, and inverse coupling factors for bosonic zero modes. These expressions must
be integrated over bosonic collective coordinates.
The anomaly equation tells us that the $U(1)$ axial symmetry is broken by instantons.
This shows up in the structure of the 't Hooft operator, which explicitly violates the
symmetry. In general, if there are k simple factors of the gauge group, we may expect to
have k independent symmetries of the classical Lagrangian, which are broken by these
non-perturbative effects.
10.7 Solitons
Every instanton solution that we have discussed can also be viewed as a static solution
of the equations of motion of the same field theory in one more dimension. That is,
the Euclidean action in d dimensions is the same as the energy for static solutions of
the $(d+1)$-dimensional Lorentzian field theory with the same field content. In those
cases in which our field theory contains vector fields, we are looking at static solutions
in the $A_0 = 0$ gauge. The spherically symmetric solutions look like point objects, and
we might imagine that they represent some new class of particles in the theory, called
solitons. We note, for future reference, that the static solutions can also be promoted
to a theory in more spatial dimensions. They are solutions that are independent of
some of the coordinates, and are thus non-trivial on $p$-dimensional hyperplanes. The
string-theory-inspired name for such extended solutions is $p$-branes, with particles
corresponding to $p = 0$, strings to $p = 1$, and membranes to $p = 2$. By abuse of
language, all of these solutions representing extended objects are called solitons (a term
which originally referred only to solutions in integrable systems).
In order to prove that 0-brane solitons are particles, we begin by showing that
they do indeed correspond to points in the spectrum of the quantum Hamiltonian.
The derivation applies to general periodic solutions of the classical equations, and is
basically the old Bohr-Sommerfeld quantization rule. We consider the path-integral
representation of the trace of the resolvent of the Hamiltonian:
$$\mathrm{Tr}\,\frac{1}{z - H} = -i\int [dq]\; e^{\,i\left(S[q] + z\,T[q]\right)},$$
where the integral is over all periodic paths
$$q(t + T[q(t)]) = q(t).$$
We have used notation appropriate for a single quantum degree of freedom, but it
should be obvious that the formulae generalize to any number of degrees of freedom,
including field theory. Notice that we are doing a Lorentzian functional integral here.
The derivation of this formula is easy and is left as an exercise.
When $g^2$ is small, we approximate the functional integral by a stationary phase,
saturating the integration by classical solutions. The action of a classical solution is
given by
$$S = \int_0^T dt\,(p\,\dot q - H) = \oint p\,dq - ET,$$
where we note that the classical energy $E \propto 1/g^2$. We consider periodic solutions satisfying
the Bohr-Sommerfeld condition
$$\oint p\,dq = 2\pi k,$$
where the integral is taken over a period. For any such solution, there will be an infi-
nite number of others, given by multiple traverses of the same trajectory. These have
period nT and action nS, where T and S are the period and action for the single-pass
trajectory. On summing up the contributions from this infinite class of trajectories,
we get
$$\mathrm{Tr}\left(\frac{1}{z - H}\right) \sim \frac{1}{1 - e^{\,i(z - E)T}},$$
which has a pole at $z = E$. Thus, in the semi-classical approximation, the trace of
the resolvent has a pole at the energy of each periodic classical solution. The usual
quantization of small oscillations around the minimum of the potential (which gives
rise to particles in perturbative field theory) is a special case of this rule.
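As a check of the special case just mentioned, the Bohr-Sommerfeld condition can be evaluated numerically for small oscillations. The sketch below assumes a single harmonic degree of freedom with $H = p^2/2 + \omega^2 q^2/2$ in units with $\hbar = 1$ (an illustrative choice, not taken from the text) and computes $\oint p\,dq$ over one period; the result is $2\pi E/\omega$, so the condition $\oint p\,dq = 2\pi k$ selects $E = k\omega$.

```python
import math


def action_integral(energy, omega, steps=20000):
    """Loop integral of p dq over one period of H = p^2/2 + omega^2 q^2/2."""
    q_max = math.sqrt(2 * energy) / omega
    d_phi = math.pi / steps
    total = 0.0
    for i in range(steps):
        # Substituting q = q_max * sin(phi) removes the turning-point singularity.
        phi = -math.pi / 2 + (i + 0.5) * d_phi
        q = q_max * math.sin(phi)
        p = math.sqrt(max(0.0, 2 * energy - (omega * q) ** 2))
        total += p * q_max * math.cos(phi) * d_phi
    return 2 * total  # outgoing and returning halves of the orbit


def quantum_number(energy, omega):
    """The number k defined by loop integral = 2 pi k (not necessarily an integer)."""
    return action_integral(energy, omega) / (2 * math.pi)
```

The semi-classical rule reproduces the level spacing $\omega$ of the oscillator; the zero-point shift of $\omega/2$ is a correction beyond the leading stationary-phase approximation used here.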
The fact that static spherically symmetric solutions are particles now follows from
Lorentz invariance. This is easiest to prove for 1 + 1 dimensions, for which the derivation
is not complicated by non-covariant choices of gauge for the static solutions. We simply
note that Lorentz covariance of the theory implies that $\phi^c_i(\Lambda x)$ is a solution for every
Lorentz matrix $\Lambda$. If the original solution was localized around the point $x$, the boosted
solution is localized around $x - vt$. Furthermore, its energy and momentum are related
by the usual relativistic dispersion relation $E = \sqrt{p^2 + m^2}$, where $m$ is the energy $E_c$ of
the static classical solution.
The fact that our solitonic particle has a mass of order $1/g^2$ resolves a puzzle about
how the soliton can be localized in its rest frame. The uncertainty in its velocity, for
some small uncertainty in its position, is of order $g^2/\Delta x$. Thus, for small $g$ it can be well
localized both in velocity and position. These remarks lead us to expect that the semi-
classical expansion is also a non-relativistic expansion for the motion of the soliton. In
particular, in this expansion, the number of solitons is conserved, rather than just the
difference between soliton and anti-soliton numbers.
The latter remark relieves a tension that someone schooled in the tenets of effec-
tive field theory might have been feeling about the whole notion of solitons. Recall
that all quantum field theories are just low-energy approximations to something else.
Why should we trust the predictions of quantum field theory for these states whose
energy goes to infinity in the semi-classical approximation? The answer goes back to
the Wilsonian procedure for integrating out degrees of freedom. This is usually done by
integrating out heavy fields in the Euclidean functional integral. Even without thinking
about solitons, we can imagine an absolutely stable particle of large mass. We consider
states containing an arbitrary number of these massive particles, interacting with them-
selves and with light degrees of freedom, under conditions in which no heavy particles
are created and all energies and momentum transfers are much smaller than the heavy
mass. The generalization of effective field theory to such situations involves light fields
interacting with the coordinates describing the motion of the heavy particles. In gen-
eral it will contain additional renormalization constants relating to the heavy-particle
properties.
We will now show that a similar procedure works for solitons. The major difference is
that the quantum theory of solitons requires no extra renormalization constants. Their
properties are completely determined by the parameters of the low-energy effective
theory in which they appear as classical solutions. The key observation, motivated
both by our remarks about Lorentz transformations and by the spatial translation
invariance of the field theory, is that the function
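$$\phi_i = \phi^c_i(x - X(t)),$$
where $X(t)$ is slowly varying, and $\phi^c_i$ is the static classical solution, has energy
$$E = E_c + \frac{\dot X^2}{2g^2}\int dx\,\left(\partial_x\phi^c_i\right)^2 .$$
This is the energy of a non-relativistic particle, with mass
$$M = \frac{1}{g^2}\int dx\,\left(\partial_x\phi^c_i\right)^2 .$$
If we recall the classical equations of motion
$$\frac{d^2\phi^c_i}{dx^2} = \frac{\partial V}{\partial\phi_i}$$
and the manipulations that lead to Euclidean energy conservation when we considered
the same solution as an instanton, we find that
$$\frac{1}{2}\left(\partial_x\phi^c_i\right)^2 - V(\phi^c)$$
is $x$-independent. We can evaluate the constant at infinity, where both terms in the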
expression vanish. The reader should verify that this shows that the mass appearing in
the non-relativistic kinetic energy of the soliton is just the static energy, as we expect
from Lorentz invariance.
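The equality of the kinetic mass and the static energy can be verified numerically in a concrete case. The sketch below assumes an example that is not taken from the text: the kink $\phi_c(x) = \tanh(x/\sqrt{2})$ of the potential $V(\phi) = \frac{1}{4}(\phi^2 - 1)^2$, in units with $g = 1$. It compares $\int dx\,(\partial_x\phi_c)^2$ with $\int dx\,[\frac{1}{2}(\partial_x\phi_c)^2 + V(\phi_c)]$; both equal $2\sqrt{2}/3$.

```python
import math

SQRT2 = math.sqrt(2.0)


def kink(x):
    """Static kink of the assumed potential V = (phi^2 - 1)^2 / 4."""
    return math.tanh(x / SQRT2)


def kink_prime(x):
    """Derivative of the kink profile."""
    return 1.0 / (SQRT2 * math.cosh(x / SQRT2) ** 2)


def potential(phi):
    return 0.25 * (phi * phi - 1.0) ** 2


def integrate(f, a=-20.0, b=20.0, steps=40000):
    """Midpoint rule; the integrands here decay exponentially outside [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def kinetic_mass():
    """Coefficient of X_dot^2 / 2 in the energy of the moving kink."""
    return integrate(lambda x: kink_prime(x) ** 2)


def static_energy():
    """Gradient plus potential energy of the static kink."""
    return integrate(lambda x: 0.5 * kink_prime(x) ** 2 + potential(kink(x)))
```

The agreement follows from the $x$-independence of $\frac{1}{2}(\partial_x\phi_c)^2 - V$, which vanishes at infinity and therefore everywhere.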
The strategy for getting a perturbative expansion of soliton interactions is simply
to expand the functional integral around the sub-manifold of field space parametrized
by $\phi^c_i(x - X(t))$ for slowly varying $X$ and then to do the $X$ functional integral. The
rationale is like that of the Born-Oppenheimer approximation in molecular physics.
The time scales for X motion (in the soliton rest frame) are much longer than the time
scales involved in perturbative particle motion and interaction. Therefore, in the spirit
of effective field theory, we can first solve for the particle scattering in a fixed soliton
background, then use this to compute the effective action for X(t), and finally do the
quantum mechanics of X(t).
The sub-manifold of slowly moving soliton configurations is a curved sub-manifold
of field space. Therefore, in order to isolate it we have to compute the measure in the
vicinity of this sub-manifold. The procedure for doing this order by order in pertur-
bation theory is analogous to the Faddeev-Popov procedure for integrating along a
curved gauge slice in non-abelian gauge theory. It has been worked out in great detail by
Gervais, Jevicki, and Sakita [147]. Fortunately, to leading order, every curved manifold
is straight. When we expand around the static soliton by writing
$$\phi_i(t, x) = \phi^c_i(x) + g\sum_n c_n\,\Delta^n_i(x, t),$$
we find a quadratic action for fluctuations, whose equations of motion are
$$\left(\left[\partial_t^2 - \partial_x^2\right]\delta_{ij} + U_{ij}(x)\right)\Delta_j(t, x) = 0 .$$
$U_{ij} = \partial^2 V/\partial\phi_i\,\partial\phi_j$, and is time-independent. On writing $\Delta^n = e^{i\omega_n t}\Delta^n(x)$, we find that $\omega_n^2$
are the eigenvalues of the Schrodinger operator we encountered when implementing
small fluctuations around quantum instantons. We know that the spectrum of this operator
consists of a normalizable zero mode, proportional to $\partial\phi^c_i/\partial x$, and a continuum
of scattering states. The zero mode is obviously the part of the field in the direction
parametrized by X(t). Thus, to leading order in the expansion about the soliton we sim-
ply find free soliton motion, accompanied by scattering of the perturbative excitations
from the static soliton. Soliton recoil appears only at next order in the perturbation
expansion.
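The zero mode of the fluctuation operator can be checked directly in an example. The sketch below again assumes the kink $\phi_c(x) = \tanh(x/\sqrt{2})$ of $V(\phi) = \frac{1}{4}(\phi^2 - 1)^2$, which is an illustrative choice rather than a model from the text. For a single field the operator is $-\partial_x^2 + U(x)$ with $U = V''(\phi_c) = 3\phi_c^2 - 1$, and the code evaluates its action on $\partial_x\phi_c$ by finite differences.

```python
import math

SQRT2 = math.sqrt(2.0)


def kink(x):
    """Static kink of the assumed potential V = (phi^2 - 1)^2 / 4."""
    return math.tanh(x / SQRT2)


def zero_mode(x):
    """Translation mode d(phi_c)/dx of the kink."""
    return 1.0 / (SQRT2 * math.cosh(x / SQRT2) ** 2)


def curvature(x):
    """U(x) = V''(phi_c(x)) = 3 phi_c^2 - 1."""
    return 3.0 * kink(x) ** 2 - 1.0


def residual(x, h=1e-3):
    """Value of (-d^2/dx^2 + U) acting on the zero mode, by central differences."""
    second = (zero_mode(x + h) - 2.0 * zero_mode(x) + zero_mode(x - h)) / h ** 2
    return -second + curvature(x) * zero_mode(x)
```

The residual vanishes up to discretization error of order $h^2$, confirming that translations cost no energy at quadratic order, so $\omega = 0$ for this mode.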
It's interesting to note that these amplitudes for particle scattering by solitons are
independent of $g$, whereas scattering amplitudes for ordinary particles are of order $g^2$. If
we look in the two-soliton sector, then we can compute the S matrix for soliton-soliton
scattering by solving the classical equations with initial conditions corresponding to
two incoming, widely separated solitons. The phase shift is computed in terms of the
classical action of this solution, and is of order $1/g^2$. So solitons are strongly coupled
to themselves, and have order-1 couplings to elementary particles.
We can understand these results intuitively by thinking about the soliton states in
terms of the elementary-particle Fock space. In terms of fields with canonically nor-
malized kinetic terms, which create elementary-particle states with amplitude 1, the
classical soliton field is of order $1/g$. A state with the field shifted to the classical
soliton value is
$$e^{\,i\int dx\,\pi(x)\,\phi^c(x)}\,|0\rangle,$$
and the average number of particles in such a state is of order $1/g^2$. So we can view the
various types of scattering amplitudes as coherent sums over scattering of elementary
particles off the elementary constituents of the soliton. This picture should not be
pushed too far, but it gives the right order of magnitude in powers of g for each type
of soliton scattering amplitude.
10.8 't Hooft-Polyakov monopoles
We now skip to the case of most interest, the static 't Hooft-Polyakov solutions of
$(3+1)$-dimensional gauge theories. Let us study their topological charge with a little
generality. We start with a general gauge group G and a Higgs field in a representation
R of G. Recall that the finite-energy (formerly finite-action) condition is that the Higgs
field approaches some minimum of the potential on the sphere at infinity. It must
satisfy
$$\phi(t_1, t_2) = g(t_1, t_2)\,\phi_0,$$
where $\phi_0$ is a fixed vector in the representation, whose stability subgroup is $H$. It is the
value of $\phi(t_1, t_2)$ at a point on the sphere, which we call the North Pole. This defines
a continuous mapping of $S^2$ into $G/H$, an element of $\Pi_2(G/H)$. $t_{1,2}$ are coordinates
on the sphere. Continuity of $\phi(t_1, t_2)$ does not imply that $g(t_1, t_2)$ is continuous on
the sphere, and indeed it is not, whenever $\phi$ is topologically non-trivial. Without loss
of generality we can map the sphere onto the square, with the entire boundary of the
square identified with the North Pole. $\phi$ thus takes on the value $\phi_0$ everywhere on the
boundary. Let $(t_1, t_2)$ be the Cartesian coordinates on the square, running from 0 to 1.
The gauge potential at infinity is related to $\phi$ by the equation
$$D_i\phi = 0 .$$
There are many gauge-equivalent solutions of this equation. One is chosen by setting
$$g(t_1, t_2) = P\,e^{\,i\int_C A},$$
where, for each point $(t_1, t_2)$, $C$ is the horizontal line starting at the left boundary of
the square and going to that point. With this definition, $g = 1$ on the left, top, and
bottom boundaries of the square, but may be non-trivial on the right boundary. The
condition that $\phi$ be continuous is that
$$g(1, t_2)\,\phi_0 = \phi_0,$$
for all $t_2$, i.e. $g(1, t_2) \in H$. This then defines a map of the circle into $H$, or a member
of $\Pi_1(H)$. It is easy to verify that the map we have just constructed, namely that of
$\Pi_2(G/H)$ into $\Pi_1(H)$, is a group homomorphism.
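For the simplest case $H = U(1)$, the element of $\Pi_1(H)$ defined by the right boundary of the square is just an integer winding number. The sketch below is an illustration with hypothetical helper names: it represents $g(1, t_2)$ as a unit complex number depending on $t_2 \in [0, 1]$, and counts how many times the phase wraps around the circle. It also shows the homomorphism property, since multiplying two loops adds their windings.

```python
import cmath
import math


def winding_number(loop, samples=2000):
    """Winding number of a closed loop t -> nonzero complex number, t in [0, 1]."""
    total = 0.0
    previous = loop(0.0)
    for k in range(1, samples + 1):
        current = loop(k / samples)
        # Phase increment between neighboring sample points, in (-pi, pi].
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2 * math.pi))


def u1_loop(n):
    """Representative loop exp(2 pi i n t) of winding n in U(1)."""
    return lambda t: cmath.exp(2j * math.pi * n * t)


def product_loop(loop_a, loop_b):
    """Pointwise product of two loops in U(1)."""
    return lambda t: loop_a(t) * loop_b(t)
```

The sampling must be fine enough that the phase changes by less than $\pi$ between neighboring points; the default of 2000 samples is ample for the small windings used here.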
Conversely, given any element of $\Pi_1(H)$ that can be continuously distorted into the
identity when $\Pi_1(H)$ is embedded in $\Pi_1(G)$, we can construct $g(t_1, t_2)$ in terms of
the homotopy which distorts that element into the identity. Finally, the image of the
identity in $\Pi_2(G/H)$, the mapping $g(t_1, t_2) = 1$, is clearly the identity in $\Pi_1(H)$. Our
map is a group homomorphism, so this means that it is a one-to-one and onto mapping
(an isomorphism) between $\Pi_2(G/H)$ and the subgroup of $\Pi_1(H)$ which maps into the
identity in $\Pi_1(G)$. In particular, if $G$ is simply connected we have an isomorphism
between $\Pi_1(H)$ and $\Pi_2(G/H)$. The most interesting case is when $\Pi_1(H)$ is an integral
lattice, i.e. when $H$ is a product of $U(1)$ factors. In this case, as in our $SO(3)/U(1)$
example, the topological charge of solitons is precisely the magnetic charge of the
multiple Maxwell fields in the low-energy effective theory below the scale of the Higgs
VEV. One way to get such a pattern of symmetry breaking for any group is to introduce
a Higgs field in the adjoint representation. A general VEV for an adjoint Higgs breaks
the group to its Cartan torus.
Every Lie group $G$ has a simply connected covering group $\tilde G$, such that
$G = \tilde G/\Pi_1(G)$. Familiar examples are $\widetilde{SO(3)} = SU(2)$ and more generally
$\widetilde{SO(n, m)} = \mathrm{Spin}(n, m)$. These examples illustrate a general pattern: the covering group has extra
representations, the spinors, with the property that the eigenvalues of Cartan generators
(the weights of the representation) are fractions of the weights in bona-fide representations
of $G$. A more general class of examples is obtained by taking any simply connected
Lie group $G$ modulo its center $Z_G$. Only representations invariant under the center are
representations of $G/Z_G$.
The pure Yang-Mills Lagrangian contains only adjoint fields. Let us add Higgs
fields only in representations of $G/Z_G$. Then monopole charges are restricted. The
gauge group of the theory could be either $G$ or $G/Z_G$, but the Lagrangian does not
distinguish between them, so monopole charges are restricted to be in the subgroup
of $\Pi_1(H)$ which is trivial inside of $\Pi_1(G/Z_G)$. So, in our $SO(3)/U(1)$ example, the
monopole charges must all be even.
The physical meaning of all of this has to do with the Dirac quantization condition.
We will describe it explicitly only for the case of SO(3)/U(l), but the reader should
be able to generalize it to all other cases. As we have discussed in Chapter 8, the
low-energy effective theory is Maxwell's electrodynamics coupled to both electric and
magnetic charges. This raises a well-known problem, which was first solved by Dirac.
An electrically charged particle couples to the vector potential
$$\int A_\mu\,dx^\mu,$$
but the vector potential cannot be globally defined in the presence of magnetic charge,
because
$$\epsilon^{\mu\nu\lambda\kappa}\,\partial_\nu F_{\lambda\kappa} = J^\mu_M .$$
Dirac solved this problem for point-like monopoles by taking a particular solution
of this equation with F non-zero only along a 2-surface running from the world line
of the monopole to infinity, and then introducing a vector potential to describe the
rest of the electromagnetic field. Essentially he described a monopole as a semi-infinite
solenoid. But we would like physics to be independent of the choice of where we place
this fictitious solenoid, if we want the description of the interaction of monopoles and
charges to be local. Classically there is no problem, since the charged particle will
never hit the infinitely thin solenoid, but quantum mechanically we must make sure
that there is no Aharonov-Bohm phase when the charged particle follows a trajectory
encircling the solenoid. As we proved in Chapter 8, this leads to the Dirac quantization
condition
$$e_i g_j - e_j g_i = 2\pi n_{ij}. \tag{10.23}$$
We have written the condition for arbitrary pairs of particles, assuming that each pair
has both electric and magnetic charge. Each $n_{ij}$ must be an integer.
Another illuminating way of deriving this condition is to compute the angular
momentum of the electromagnetic field of the particle pair. It's easy to verify that
this contains a term
$$\delta\mathbf{L} = \frac{e_i g_j - e_j g_i}{4\pi}\,\hat{\mathbf{r}},$$
where $\hat{\mathbf{r}}$ is the unit vector pointing between the pair. Quantization of angular momentum
in half-integer units now implies that $n_{ij}$ is an integer. We will see a striking consequence
of this fact a bit later on.
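The link between the Dirac condition and angular momentum can be made explicit with a few lines of arithmetic. The sketch below uses a convention chosen for illustration: electric charges are measured in units of a reference charge $e_0$ and magnetic charges in units of $2\pi/e_0$, so that $e_i g_j - e_j g_i = 2\pi\,(q_i m_j - q_j m_i)$. Each particle is a hypothetical pair `(q, m)`.

```python
from fractions import Fraction


def dirac_integer(dyon_i, dyon_j):
    """Return n_ij = q_i m_j - q_j m_i for two (electric, magnetic) charge pairs."""
    (q_i, m_i), (q_j, m_j) = dyon_i, dyon_j
    return Fraction(q_i) * Fraction(m_j) - Fraction(q_j) * Fraction(m_i)


def satisfies_dirac(dyon_i, dyon_j):
    """The quantization condition holds when n_ij is an integer."""
    return dirac_integer(dyon_i, dyon_j).denominator == 1


def field_angular_momentum(dyon_i, dyon_j):
    """Magnitude of the field angular momentum, n_ij / 2, in units of hbar."""
    return dirac_integer(dyon_i, dyon_j) / 2
```

A unit electric charge `(1, 0)` paired with a unit monopole `(0, 1)` gives $n_{ij} = 1$ and field angular momentum $1/2$, the smallest non-zero value allowed by half-integer quantization.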
For solitonic monopoles in theories in which U(l) arises from spontaneous break-
down of a simple group, both electric and magnetic charges are already quantized.
Monopole quantization follows from the topological conditions we have been dis-
cussing, while electric charge quantization follows from the fact that the weights of
representations of compact Lie groups are discrete. The restriction we discussed above,
namely that the magnetic charge be in the subgroup of $\Pi_1(H)$ which maps to the identity
in $\Pi_1(G/Z_G)$, is the statement that the monopole satisfies the Dirac quantization
condition for electric charges in all representations of $G$, including those which are not
representations of $G/Z_G$. For $SO(3)/U(1)$ the monopole charge is twice what it needs
to be to satisfy the Dirac quantization condition with particles in tensor representations
of SO(3). If this were not so we would run into an inconsistency when we tried to add
fields in the spinor representations of SU(2) to the Lagrangian.
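The factor of two can be checked with the same kind of arithmetic. In the sketch below, electric charges are measured in units of the charge $e$ of the adjoint (massive vector boson) fields and magnetic charges in units of $2\pi/e$; the value 2 for the minimal solitonic monopole encodes the statement in the text and is the standard result $g = 4\pi/e$. The names are hypothetical.

```python
from fractions import Fraction

# Minimal solitonic monopole charge in units of 2*pi/e (e = adjoint charge).
MONOPOLE_CHARGE = 2


def dirac_integer(electric_charge, magnetic_charge=MONOPOLE_CHARGE):
    """Return n = e g / (2 pi) for a purely electric particle and a monopole."""
    return Fraction(electric_charge) * Fraction(magnetic_charge)


def is_consistent(electric_charge, magnetic_charge=MONOPOLE_CHARGE):
    """Dirac quantization requires n to be an integer."""
    return dirac_integer(electric_charge, magnetic_charge).denominator == 1
```

Tensor representations carry integer charge and give an even $n$, while spinor representations of $SU(2)$ carry half-integer charge and give $n = 1$; a monopole of half the strength would be inconsistent with the spinors.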
The discussion of charge quantization in the context of non-trivial classical solutions
of the field equations leads us to ask about electric charge carried by classical field
configurations. Why can't such classical charges take on arbitrary continuous values?
To answer this, we first choose a gauge, $A_0 = 0$. In this gauge we canonically quantize
the spatial components of the gauge potential, and then impose the constraint equation
derived by varying $A_0$ as a constraint on physical states:
$$\left[D^{ab}_i P^b_i + \epsilon^{abc}\left(\pi^b\phi^c - \pi^c\phi^b\right)\right]|\Psi_{\rm phys}\rangle = 0 .$$
$P^b_i = (1/g^2)\,\partial_t A^b_i$ is the canonical conjugate to the vector potential and $\pi^a$ the canonical
conjugate to the Higgs field. The operator which must vanish on physical states is
the generator of time-independent local gauge transformations. The corresponding
generator of global $U(1)$ transformations is
= f rmK
In this gauge, all charge-carrying configurations have time-dependent vector potentials
at infinity. Julia and Zee [148] found the corresponding configurations with both electric
and magnetic charge in a different, static gauge.
In $A_0 = 0$ gauge, the solutions have the form of a time-dependent gauge transformation
of the monopole solution. For small deviations, the behavior near infinity is
8^4" = 8" —=- ,
while the leading behavior of the Higgs field is unchanged. This is a normalizable zero
mode of the monopole solution and the collective coordinate $\theta(t)$ must be quantized.
However, since it represents a $U(1)$-gauge rotation, $\theta$ is a periodic variable. Its canonical
momentum, the electric charge $Q$, is therefore quantized. The minimal unit is the charge
on the adjoint representation. Quantization of $\theta$ thus leads to a discrete spectrum of
dyonic (or dual-charged) excitations of the monopole. The spacing between these levels
is of order $g^2$ in units of the mass of the massive gauge bosons.
A surprise is in store [149-150] when we carry out the collective coordinate quantiza-
tion for monopoles in theories in which there are fields in representations that transform
under the center of SU(2). The unit of charge quantization is half what it was in the
$SU(2)/Z_2$ theory. As noted above, if we calculate the angular momentum of some of
the dyonic bound states of this system, we will find half-integer values. Thus, we have
constructed spin-$\frac{1}{2}$ particles from a purely bosonic system!
10.9 Problems for Chapter 10
*10.1. Discuss vacuum decay for a scalar field theory in $d$ dimensions. That is, we have a
single scalar field, with the same kind of 2-minimum potential we discussed in the
case of quantum mechanics. The Euclidean path integral is over all field configurations
$\phi(x)$ that approach $\phi_f$, the higher minimum of the potential, at infinity
(set the potential equal to zero there). Assume that the instanton configuration
is $SO(d)$-invariant. Find the field equation for it. If you call the Euclidean
radius $t$, you will find that the equation is identical to a mechanics problem
in the upside-down potential, with a time-dependent friction term $(d - 1)/t$.
Using simple energetic considerations, argue that there is a finite-action bounce
solution. Write down the equations for small fluctuations around the bounce.
Since the bounce is spherically symmetric, they can be separated by expanding
in $d$-dimensional spherical harmonics. How many exact zero modes of the solution
exist, following from symmetries? In what representation of $SO(d)$ do they
transform? Try to argue that there is exactly one negative-eigenvalue fluctua-
tion mode. (Don't try too hard, the general result for a large class of quantum
systems can be found in [151]. If you can do it easily for this special case, do it.)
The answer to most of this exercise can be found in the Coleman book [138].
Try to do it yourself, but you can go there for hints if you get stuck.
*10.2. Prove Derrick's theorem. Consider a Lagrangian of the form $(\nabla\phi)^2 - V(\phi)$ in
$d > 2$ dimensions. Consider any finite-energy non-singular field configuration
and show that you can lower its energy by considering $\phi(ax)$ with appropriate
$a$. This shows that, without further constraints, there are no stable solitons in
such a theory. Generalize this to an arbitrary number of scalars. Now consider a
complex scalar field with a charge quantum number. The potential is a function
of $\phi^*\phi$. Show that time-dependent fields of the form $e^{i\omega t}\phi(x)$, where $x$ are
the space variables, carry an electric charge $Q = i\int d^{d-1}x\,(\phi^*\,\partial_t\phi - \phi\,\partial_t\phi^*)$.
Evaluate the energy of such configurations. Show that Derrick's theorem no
longer applies. In the appendix to the chapter on lumps in Coleman [138], you
will find an example of a theory of this type that has stable charged solitons.
Such objects are called Q-balls.
*10.3. Consider the quantum Lagrangian
$$L = \frac{1}{2}\dot\phi^2 - V(\phi),$$
where $V$ is periodic with period $2\pi$. There are two different interpretations of
this system. In the first, $\phi$ lives on a circle. In the second, $\phi$ lives on the real line,
in the presence of a periodic potential. In the second interpretation the system is
invariant under translation of $\phi$ by $2\pi$ and contains a global symmetry operator
$T$ that commutes with the Hamiltonian. We can diagonalize the Hamiltonian
in the sector satisfying
$$T\psi(\phi) = \psi(\phi + 2\pi) = e^{i\theta}\psi(\phi),$$
where $\theta$ is called the Bloch momentum. In the first interpretation, translating
by $2\pi$ does nothing, and wave functions are required to be periodic. In either
case, we are allowed to add the term $\delta L = \theta\dot\phi$ to the Lagrangian, which is a
total derivative, violating time-reversal invariance, which is analogous to the $F\tilde F$
term in gauge theories. Show that, in either interpretation, there are finite-action
Euclidean instantons that have value $2\pi$ times an integer for this topological
term. Show that, in the case that $\phi$ lives on the real line, the path integral with
non-zero $\theta$ corresponds to computing the ground-state expectation values in
the lowest energy state with Bloch momentum $\theta$.
*10.4. The instanton solutions of a quantum-mechanics problem are solitonic particle
solutions of field theory with the same Lagrangian in 1 + 1 dimensions. Show
that, in any higher dimension, the same solutions can be interpreted as domain
walls separating two ground states of the theory. Similarly, show that the instantons
of the (1 + 1)-dimensional Higgs model can be thought of as particles in the
(2 + 1)-dimensional version of the model and as strings or (Abrikosov-Nielsen-Olesen)
vortices in the (3 + 1)-dimensional version. This problem requires no
additional calculation.
*10.5. Prove that the instanton solution of four-dimensional gauge theory is invariant
under rotations, in the sense that all gauge-invariant functions of the field are
rotation-invariant.
Concluding remarks
If you've successfully worked your way through this short but arduous journey to the
world of quantum field theory, you should be exhilarated! The landscape is full of
fabulous beasts (most of which seem to go by the last name -on) and elegant formulae.
Most surprisingly, all of this elegance seems to give us a remarkably precise and general
description of many facets of our world, from phase transitions in humdrum materials
to the interiors of stars and the intergalactic medium. In this concluding section I
want to emphasize again a few general lessons, and chart out for you the parts of the
quantum-field-theory landscape we have NOT explored.
The first of the important lessons that you should take away from this book is the
beautiful unification of the classical theories of fields and particles that is forced on
us by combining relativity and quantum mechanics in a fixed space-time background.
The second is the unification of the methods of quantum field theory and classical
statistical mechanics, which is provided by the Euclidean path-integral formulation of
field theory.
Next I would ask you to remember the difference between a symmetry and a gauge
equivalence and the different meanings of the idea of spontaneous symmetry breakdown
in the two cases. Spontaneous breakdown of a global symmetry is related to locality. A
quantum field theory is defined by its behavior at short distances, but there may be different
infrared realizations of the same short-distance operator algebra and Hamiltonian.
Sometimes these are related by a symmetry transformation. If it is a continuous group
of transformations, any one infrared sector of the theory contains massless excitations
called Nambu-Goldstone particles.
By contrast, no physical quantity transforms under a gauge equivalence, which is
merely a convenient redundancy of our description of a physical system. What we call a
spontaneously broken gauge symmetry is a particular phase, the Higgs phase, of gauge
theories. In this phase there are no massless particles associated with the generators
of the gauge equivalence. Gauge theories can also have another massive phase (which
is distinct from the Higgs phase if the Higgs fields are invariant under a non-trivial
subgroup of the center of the gauge group) called the confining phase, which is related
to the Higgs phase by electric-magnetic duality. In the real world the non-abelian
electro-weak gauge group is in the Higgs phase, and the color group of QCD is in its
confining phase. The U(1) group of electromagnetism is in yet a third phase, called the
Coulomb phase. This is a special case of a more general possibility in which the gauge
theory behaves like a conformal field theory in the infrared.
Perhaps the most important topic in quantum field theory is the theory of renor-
malization. As formulated by Wilson, Kadanoff, and Fisher, this is a very general way
of understanding how physical processes at different length scales are related to each
other. From the philosophical point of view, this is the reason why we are able to do
physics at all. If we had to uncover the correct theory of the smallest length scales and
highest energies in order to talk about what we see in our laboratories, there would
be no hope for a theory of physics at all. The renormalization group (RG) explains
to us why it is that we can do long-distance physics without knowing about short dis-
tances. The key concept is that of the RG fixed point, parametrized by a small number
of relevant and marginal perturbations. All UV theories containing a given IR set of
degrees of freedom flow to the vicinity of the fixed point whose basin of attraction they
lie in. They are differentiated only by the values they determine for the marginal and
relevant parameters. This gives rise to universality classes of behavior, a post-diction
spectacularly confirmed by observations of critical phenomena in condensed-matter
physics.
Given this broad-brush picture of what we have covered in this book, we now turn
to the subjects we have omitted. The most important of these are supersymmetry,
finite-temperature field theory, and field theory in curved space-time. Supersymmetry
is (in dimensions higher than 2) the only allowed extension of the space-time conformal
group. Remarkably, it joins together bosons and fermions as a single entity. It also seems
to be deeply connected to the quantum theory of gravity. At the technical level, super-
symmetry allows us to make many exact statements about quantum field theory that
are not possible with ordinary field theories. There are several good reviews of SUSY
field theory [152-156], but, if I may be allowed an opinion, as yet no comprehensive
treatment of all modern developments.
Finite-temperature relativistic field theory is primarily applicable in cosmology and
astrophysics, though the techniques are similar to those of the non-relativistic theory,
which has wide applications in condensed-matter physics. Equilibrium calculations are
quite similar to what we have discussed: to calculate averages in the canonical ensem-
ble, instead of using the vacuum density matrix, one simply performs the Euclidean
path integral on a space with one compactified dimension, whose length is the inverse
temperature. Good reviews of this subject can be found in [157]. Non-equilibrium field
theory is much more difficult, and has been the focus of a lot of recent work. As far as
I know, there is no definitive modern summary, but the reader can consult [158].
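The equilibrium prescription described above can be written out in one line. The following is a sketch in standard notation that is not taken from the text: $\beta = 1/T$ in units with $k_B = 1$, $\tau$ is Euclidean time, and $\mathcal{L}_E$ is the Euclidean Lagrangian density of a bosonic field $\phi$ in four space-time dimensions.

```latex
\begin{equation}
Z(\beta) \;=\; \mathrm{Tr}\, e^{-\beta H}
 \;=\; \int_{\phi(\tau+\beta,\mathbf{x}) \,=\, \phi(\tau,\mathbf{x})} \mathcal{D}\phi\;
   \exp\!\left( - \int_0^{\beta} d\tau \int d^3x \; \mathcal{L}_E[\phi] \right),
 \qquad \beta = \frac{1}{T} .
\end{equation}
```

Bosonic fields are periodic around the compact direction, as written; fermionic fields are taken to be anti-periodic.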
Quantum field theory in curved space-time is relatively easy to define for space-times
whose complexification has a real Euclidean section. Again, one simply performs the
path integral on the appropriate Euclidean manifold, and analytically continues the
answer. More general space-times with no time-like isometries can also be dealt with
at a certain level of generality [159]. However, the most interesting thing about this
subject is that it contains the seeds of its own demise, and shows us that we need to
replace quantum field theory with a quantum theory of gravity. I am referring to the
phenomenon of Hawking radiation. Within the context of field theory in curved space-
time, it seems to imply that the formation and evaporation of black holes violates the
unitary evolution postulate of quantum mechanics. It is only with the advent of string
theory, the first real theory of quantum gravity, that we have begun to understand how
this paradox is resolved.
Other subjects to which we have given short shrift are perturbative QCD [124]
and weak interactions [160-161]. These subjects are nicely treated in the textbook by
Peskin and Schroeder [33]. Lattice field theory has developed into a computer-intensive
subfield. Good reviews of the subject can be found in [162]. Appendix A contains ref-
erences to a number of books on the vast subject of quantum field theory in statistical
physics. Finally, I want to mention the use of two-dimensional conformal field theory
in statistical physics and string theory. This is a subject of great beauty and utility
[163-164].
I hope you've come to the end of this trip with an appreciation of the beauties of
quantum field theory and a hunger to know more about it. As you can see, there are
lots of directions to follow from this point. The methods of field theory and the concept
of renormalization have become so pervasive that it is probably beyond the capabilities
of any single person to be an expert in the entire field. But every single avenue you
can follow is interesting and exciting. Even those of you who prefer to explore the
mysteries of quantum gravity, where the paradigms of field theory appear to fail, will
continue to use many of the ideas and techniques of field theory. Indeed, we have found
that, in many cases, the dynamics of gravity in some class of space-times is completely
equivalent to that of a quantum field theory living on an auxiliary space [165-167].
Appendix A Books
This is a brief guide to other books on quantum field theory. The standard modern
textbook is An Introduction to Quantum Field Theory, by Peskin and Schroeder [33].
I recommend especially their wonderful Chapter 5, and all of the calculational sections
between 16.5 and 18.5, as well as Chapters 20 and 21. Every serious student of QFT
should work out the final project on the Coleman-Weinberg potential, which can be
found on page 469. Another standard is Weinberg's three-volume opus [131]. Here
I recommend the marvelous sections on symmetries and anomalies in Volume II. The
technical discussions of perturbative effective field theory are invaluable. The section on
the Batalin-Vilkovisky treatment of general gauge equivalences is also useful. Volume I
should probably be read after completing a first course on the subject. It presents an
interesting but idiosyncratic approach to the logical structure of the field. Volume III on
supersymmetry is full of gems. In my opinion, it is flawed by an idiosyncratic notation
and a tendency to obscure relatively simple ideas in an attempt to give absolutely general
discussions. Finally, let me mention a relatively new book by M. Srednicki [168]. I have
not gone through it thoroughly, and I do not agree with the author's ordering of
topics, but the pedagogical style of the sections I have read is wonderful. It is clear that
everyone in the field will turn to this book for all those nasty little details about minus
signs and spinor conventions. I think there is also a chance that it will replace Peskin
and Schroeder as a standard textbook.
I have also enjoyed using the books by Bailin and Love [169] and Ramond [170] in
my many years of teaching the subject. The books by Itzykson and Zuber [171] and
Zinn-Justin [111] are more monographs/encyclopediae than textbooks, but they con-
tain a wealth of detail on specific subjects in field theory that can be found nowhere
else. I mention especially Zinn-Justin's discussions of the large-order behavior of per-
turbation theory, of the use of field theory for calculating critical exponents, and of
instantons. Other treatments of the field theory/statistical physics interface can be found
in the book by Drouffe and Itzykson [172], the marvelous book by Parisi [173], and the
book by Ma [174]. Schwinger's source theory books [175] also belong in the category
of non-textbooks, which contain scads of invaluable information about field theory.
Among older field-theory books, the second volume of Bjorken and Drell [176]
contains lots of useful information, like explicit forms for the spectral representations
for higher spin. The books of Nishijima [177] (Chapters 7 and 8), Bogoliubov and
Shirkov [178], and even Schweber [179] will reward the really serious student of the
subject.
Finally, I want to mention various shorter documents that I think are essential
reading for students of quantum field theory. The most important is Kogut and
Wilson [126], still the best introduction I know of to Wilson's profound ideas about
renormalization. Next is the 1975 Les Houches lecture-note volume Methods in Field
Theory [180], every chapter of which is a gem. Coleman's book [138] is a collection of
lectures on a variety of topics in field theory. I've drawn on it heavily for the material
about instantons and solitons, but the other lectures are also worth reading. The contributions
of Adler, Weinberg, Zimmermann, and Zumino to the 1970 Brandeis Summer
School Lectures,¹ and of Weinberg to the 1964 Brandeis volume, are also worth reading
[181]. Much of Weinberg's material reappears in his textbook [131]. Finally, let me
mention the reprint volume Selected Papers on Quantum Electrodynamics [182] edited
by Schwinger. The contributions of Feynman and Schwinger in particular should be
read by every student of field theory.
There are lots of other books on field theory, and I apologize to those authors
I haven't mentioned. I've emphasized those texts I've found most useful in my own
career. Others will have different favorites.
¹ These lectures also contain a marvelous introduction to string theory by Mandelstam.
Appendix B Cross sections
Here we give the standard formulae for the differential cross section of a reaction in
which two incoming particles produce an arbitrary final state
$$d\sigma = \frac{1}{2\omega_1\,2\omega_2\,|v_1 - v_2|}\,\prod_f \frac{d^3p_f}{(2\pi)^3\,2\omega(p_f)}\;|\mathcal{M}|^2\,(2\pi)^4\,\delta^4\Big(p_1 + p_2 - \sum_f p_f\Big).$$
$|v_1 - v_2|$ is the relative velocity of the two particles in the laboratory frame. Similarly
the differential decay rate of an unstable particle is
$$d\Gamma = \frac{1}{2M}\,\prod_f \frac{d^3p_f}{(2\pi)^3\,2\omega(p_f)}\;|\mathcal{M}|^2\,(2\pi)^4\,\delta^4\Big(p - \sum_f p_f\Big).$$
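To illustrate how the decay-rate formula is used, here is a sketch of the standard two-body case. It assumes a particle of mass $M$ decaying at rest into two distinguishable particles of masses $m_1$ and $m_2$; these symbols, and the notation $\mathcal{M}$ for the invariant amplitude, are introduced for the example and are not taken from the text.

```latex
\begin{align}
\int \frac{d^3p_1}{(2\pi)^3\,2\omega_1}\,\frac{d^3p_2}{(2\pi)^3\,2\omega_2}\,
   (2\pi)^4\,\delta^4(p - p_1 - p_2)
   &= \frac{|\mathbf{p}|}{16\pi^2 M}\int d\Omega , \\
|\mathbf{p}| &= \frac{\sqrt{\big[M^2 - (m_1+m_2)^2\big]\big[M^2 - (m_1-m_2)^2\big]}}{2M} , \\
\frac{d\Gamma}{d\Omega} &= \frac{|\mathbf{p}|}{32\pi^2 M^2}\,|\mathcal{M}|^2 ,
\qquad
\Gamma = \frac{|\mathbf{p}|}{8\pi M^2}\,|\mathcal{M}|^2
\quad \text{(if $|\mathcal{M}|^2$ is angle-independent).}
\end{align}
```

For two identical particles in the final state the integrated rate carries an extra factor of 1/2.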
Appendix C Diracology
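Here we collect a variety of identities for Dirac matrices and spinors. The basic
anti-commutation relations are
$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}.$$
The Weyl representation of these relations is
$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}, \qquad \text{(C.1)}$$
where $\sigma^\mu = (1, \boldsymbol{\sigma})$, $\bar\sigma^\mu = (1, -\boldsymbol{\sigma})$, and $\boldsymbol{\sigma}$ is the usual 3-vector of Pauli matrices. In this
representation
$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \qquad \text{(C.2)}$$
These matrices obviously satisfy
$$(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0. \qquad \text{(C.3)}$$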
We will work only with representations related to the Weyl representation by unitary
(rather than special linear) transformations, so this relation will always be true.
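A convenient representation is the Dirac representation, in which
$$\gamma^0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
It is related to the Weyl representation by $\gamma^\mu_{\rm D} = S_{\rm D}^\dagger\,\gamma^\mu_{\rm W}\,S_{\rm D}$ with
$$S_{\rm D} = \frac{1}{\sqrt{2}}\,(1 - i\sigma_2)\otimes 1.$$
The Majorana representation is related to the Dirac representation by $\gamma^\mu_{\rm M} = S_{\rm M}^\dagger\,\gamma^\mu_{\rm D}\,S_{\rm M}$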
with
In the Majorana representation the Dirac equation is satisfied separately by the real
and imaginary parts of the field.
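As a sanity check on the change of basis between the Weyl and Dirac representations, the following sketch builds the Weyl matrices and conjugates them with $S_{\rm D}$. It assumes the common conventions $\gamma^0 = \sigma_1 \otimes 1$ and $\gamma^i = i\sigma_2 \otimes \sigma^i$ for the Weyl basis, and $S_{\rm D} = (1 - i\sigma_2)\otimes 1/\sqrt{2}$; these may differ from the book's conventions by signs or ordering. All helper names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product: a acts on the 2x2 block index, b inside each block."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def dagger(a):
    """Hermitian conjugate."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    """Entry-wise comparison within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
SIGMA1 = PAULI[0]
I_SIGMA2 = [[0, 1], [-1, 0]]          # the matrix i*sigma_2

# Assumed Weyl basis: gamma^0 = sigma_1 (x) 1, gamma^i = i sigma_2 (x) sigma^i
gamma_weyl = [kron(SIGMA1, I2)] + [kron(I_SIGMA2, s) for s in PAULI]

r = 1 / math.sqrt(2)
S_D = kron([[r, -r], [r, r]], I2)     # (1 - i sigma_2)/sqrt(2) (x) 1

# gamma_D = S_D^dagger gamma_W S_D
gamma_dirac = [matmul(dagger(S_D), matmul(g, S_D)) for g in gamma_weyl]

GAMMA0_DIRAC = kron([[1, 0], [0, -1]], I2)
gamma0_is_diagonal = close(gamma_dirac[0], GAMMA0_DIRAC)
spatial_unchanged = all(close(gamma_dirac[i], gamma_weyl[i]) for i in (1, 2, 3))
print(gamma0_is_diagonal, spatial_unchanged)
```

With these conventions the spatial matrices are unchanged because $i\sigma_2$ commutes with $S_{\rm D}$, while $\sigma_1$ is rotated into $\sigma_3$.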
The space of all $4\times 4$ matrices is spanned by the anti-symmetrized products of Dirac
matrices, $\gamma^{[\mu_1}\cdots\gamma^{\mu_k]}$, with $0 \le k \le 4$. But note the relations
Calculations of spin-averaged cross sections and closed fermion loops lead to traces
of products of Dirac matrices:
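$$\mathrm{tr}(\gamma^{\mu_1}\cdots\gamma^{\mu_k}).$$
These can be calculated using anti-commutation relations and cyclicity of the trace, or
by the tensor method. The latter comes from the observation that the result must be a
numerical Lorentz tensor (and thus built from $\eta^{\mu\nu}$ and $\epsilon^{\mu\nu\lambda\kappa}$), which is invariant under
cyclic permutation of the indices. Note also that the result is zero for $k$ odd, because
$\gamma_5^2 = 1$ and $\gamma_5$ anti-commutes with all the $\gamma^\mu$. For example
$$\mathrm{tr}(\gamma^\mu\gamma^\nu) = A\,\eta^{\mu\nu},$$
and we compute $A = 4$ by hitting both sides with $\eta_{\mu\nu}$. A similar manipulation shows
that
$$\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma_5) = 0,$$
$$\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma^\lambda\gamma^\kappa) = C\,(\eta^{\mu\nu}\eta^{\lambda\kappa} + \eta^{\mu\kappa}\eta^{\nu\lambda}) + D\,\eta^{\mu\lambda}\eta^{\nu\kappa} + B\,\epsilon^{\mu\nu\lambda\kappa}.$$
Taking all the indices different, the RHS is just $B$ and the LHS is $\propto \mathrm{tr}\,\gamma_5 = 0$, so $B = 0$.
On hitting both sides with $\eta_{\mu\nu}$, we get
$$5C + D = 16.$$
Doing the same with $\eta_{\mu\lambda}$ gives
$$\mathrm{tr}(\gamma^\mu\gamma^\nu\gamma_\mu\gamma^\kappa) = -8\,\eta^{\nu\kappa} = (2C + 4D)\,\eta^{\nu\kappa}.$$
Here we have used an identity
$$\gamma^\mu\gamma^\nu\gamma_\mu = -2\gamma^\nu,$$
which we will derive in a moment. We conclude that $C = -D = 4$. Similar manipulations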
allow us to calculate any trace with relative ease.
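The tensor-method result can be checked by brute force. The sketch below assumes a mostly-minus metric $\eta = \mathrm{diag}(1,-1,-1,-1)$ and the Weyl-basis matrices $\gamma^0 = \sigma_1\otimes 1$, $\gamma^i = i\sigma_2\otimes\sigma^i$ (conventions chosen for the example). It compares every component of the four-matrix trace with the ansatz for $C = -D = 4$, $B = 0$.

```python
from itertools import product

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Assumed Weyl basis; any basis obeying the Clifford algebra gives the same traces.
GAMMA = [kron([[0, 1], [1, 0]], I2)] + [kron([[0, 1], [-1, 0]], s) for s in PAULI]
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def four_trace(mu, nu, lam, kap):
    """tr(gamma^mu gamma^nu gamma^lam gamma^kap), computed directly."""
    return trace(matmul(matmul(GAMMA[mu], GAMMA[nu]),
                        matmul(GAMMA[lam], GAMMA[kap])))

def tensor_ansatz(mu, nu, lam, kap, C=4, D=-4):
    """C (eta^{mu nu} eta^{lam kap} + eta^{mu kap} eta^{nu lam}) + D eta^{mu lam} eta^{nu kap}."""
    return (C * (ETA[mu][nu] * ETA[lam][kap] + ETA[mu][kap] * ETA[nu][lam])
            + D * ETA[mu][lam] * ETA[nu][kap])

two_trace_ok = all(trace(matmul(GAMMA[m], GAMMA[n])) == 4 * ETA[m][n]
                   for m, n in product(range(4), repeat=2))
four_trace_ok = all(four_trace(*idx) == tensor_ansatz(*idx)
                    for idx in product(range(4), repeat=4))
print(two_trace_ok, four_trace_ok)
```

All matrix entries are integers or Gaussian integers, so the comparison is exact; components with four different indices confirm that no $\epsilon^{\mu\nu\lambda\kappa}$ term appears.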
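To evaluate $\gamma^\lambda\gamma_{\mu_1}\cdots\gamma_{\mu_k}\gamma_\lambda$, we proceed by induction, using the anti-commutation
relations
$$\gamma^\lambda\gamma_\mu\gamma_\lambda = -4\gamma_\mu + 2\gamma_\mu = -2\gamma_\mu,$$
$$\gamma^\lambda\gamma_\mu\gamma_\nu\gamma_\lambda = -\gamma_\mu(-2\gamma_\nu) + 2\gamma_\nu\gamma_\mu = 4\eta_{\mu\nu},$$
$$\gamma^\lambda\gamma_\mu\gamma_\nu\gamma_\kappa\gamma_\lambda = -4\gamma_\mu\eta_{\nu\kappa} + 2\gamma_\nu\gamma_\kappa\gamma_\mu,$$
$$\gamma^\lambda\gamma_\mu\gamma_\nu\gamma_\kappa\gamma_\sigma\gamma_\lambda = \gamma_\mu(4\gamma_\nu\eta_{\kappa\sigma} - 2\gamma_\kappa\gamma_\sigma\gamma_\nu) + 2\gamma_\nu\gamma_\kappa\gamma_\sigma\gamma_\mu.$$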
The reader is encouraged to continue computing these trace and contraction identities
to higher orders.
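The first three contraction identities can be confirmed numerically. This is a minimal sketch assuming a mostly-minus metric and an explicit Weyl-basis realization of the Dirac matrices; the helper names (`contract`, `GAMMA_LOW`) are illustrative.

```python
from itertools import product

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [kron([[0, 1], [1, 0]], I2)] + [kron([[0, 1], [-1, 0]], s) for s in PAULI]
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA_LOW = [scale(ETA[m][m], GAMMA[m]) for m in range(4)]   # gamma_mu (index lowered)
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

def contract(middle):
    """Return sum over lambda of gamma^lambda . middle . gamma_lambda."""
    total = ZERO4
    for lam in range(4):
        total = add(total, matmul(GAMMA[lam], matmul(middle, GAMMA_LOW[lam])))
    return total

one_ok = all(contract(GAMMA_LOW[m]) == scale(-2, GAMMA_LOW[m]) for m in range(4))

two_ok = all(contract(matmul(GAMMA_LOW[m], GAMMA_LOW[n])) == scale(4 * ETA[m][n], IDENTITY4)
             for m, n in product(range(4), repeat=2))

def three_rhs(m, n, k):
    """-4 gamma_m eta_{nk} + 2 gamma_n gamma_k gamma_m."""
    return add(scale(-4 * ETA[n][k], GAMMA_LOW[m]),
               scale(2, matmul(GAMMA_LOW[n], matmul(GAMMA_LOW[k], GAMMA_LOW[m]))))

three_ok = all(
    contract(matmul(GAMMA_LOW[m], matmul(GAMMA_LOW[n], GAMMA_LOW[k]))) == three_rhs(m, n, k)
    for m, n, k in product(range(4), repeat=3))

print(one_ok, two_ok, three_ok)
```

The same `contract` helper can be applied to longer products to test identities of higher order.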
The wave functions which convert fermion creation operators into local fields are
solutions of the momentum-space Dirac equations
$$(\not{p} - m)\,u(p,s) = (\not{p} + m)\,v(p,s) = 0.$$
They satisfy the completeness relations
$$\sum_s u(p,s)\,\bar u(p,s) = (\not{p} + m),$$
$$\sum_s v(p,s)\,\bar v(p,s) = (\not{p} - m),$$
where the right-hand sides are proportional to the projection operators on solutions
of the respective equations. Note that many books use a different normalization.
We have followed the convention that, for fields of any spin, the Fourier transform
is $(2\omega_p(2\pi)^3)^{-1/2}$ times a creation or annihilation operator, times a covariant wave
function.
In the Weyl basis, the solutions have the form
X(s))'
Here $\chi$ and $\eta$ are normalized two-component spinors corresponding to the choice of $s$.
In a convenient convention
$$\eta(s) = 2s\,i\sigma_2\,\chi^*(s),$$
where $s = \pm 1/2$ is either the component of spin in some fixed direction, $\hat n$, or the
helicity.
Appendix D Feynman rules
In this section we list the Feynman rules in both Euclidean and Lorentzian signature,
for all renormalizable interactions in four space-time dimensions.
D.1 Propagators
D.1.1 Gauge boson propagator in $R_\kappa$ gauge
$m_A^2$ is the matrix $v^T T^a T^b v$ in the adjoint representation of the gauge group, and $v$
the expectation value of the Higgs fields. For gluons in QCD, we have $m_A^2 = 0$ and we
denote gluon lines by
Scalars are denoted by dashed lines (with arrows on them if the scalars are complex).
i 1
D.1.2 Scalar propagator in $R_\kappa$ gauge
The scalar mass-squared matrix is
icT a v(T a vy
where the first term is a sum of dyadics and operates in the subspace of would-be NGBs,
while the second term operates in the orthogonal subspace of physical Higgs bosons.
Everything is written in terms of real scalar fields.
D.1.3 Ghost propagator in $R_\kappa$ gauge
The ghosts are complex scalars in the adjoint representation. Closed ghost loops add
an extra minus sign.
The propagator for Dirac fermions is
p-mp + ie j/-im F
D.1.4 Dirac fermion propagator
The fermion mass matrix can be an arbitrary combination of 1 and $\gamma_5$, with coefficients
that are complex matrices in internal index space. The momentum in the propagator
goes in the direction of the particle number, which for charged particles is minus the
electric charge. Unpaired Weyl fermions are treated by using a Dirac field with couplings
such that the right-handed components are free. Closed fermion loops get an extra
minus sign.
D.2 Vertices
igy»T§P -gy^T^P
D.2.1 Fermion gauge vertex
The projection operator P is either the unit matrix, or the projector on left-handed
Weyl spinors.
$$g f^{abc}\,[\eta^{\mu\nu}(k - p)^\rho + \eta^{\nu\rho}(p - q)^\mu + \eta^{\rho\mu}(q - k)^\nu]$$
$$-i g f^{abc}\,[\delta^{\mu\nu}(k - p)^\rho + \delta^{\nu\rho}(p - q)^\mu + \delta^{\rho\mu}(q - k)^\nu]$$
D.2.2 Three-gauge-boson vertex
All three momenta are ingoing. The second line is the Euclidean vertex.
D.2.3 Four-gauge-boson vertex
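The Euclidean vertex has $-i \to 1$ and $\eta^{\mu\nu} \to \delta^{\mu\nu}$.
$$V^{abcd}_{\mu\nu\rho\sigma} = -ig^2\,[\,f^{abe}f^{cde}(\eta^{\mu\rho}\eta^{\nu\sigma} - \eta^{\nu\rho}\eta^{\mu\sigma})
+ f^{ace}f^{bde}(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\nu\rho}\eta^{\mu\sigma})
+ f^{ade}f^{bce}(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma})\,]$$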
V gf"bc pl ± _- lg fabc pl i
D.2.4 Ghost gauge vertex
D.2.5 Yukawa vertex
For each scalar field, the matrix $M$ is a combination of 1 and $\gamma_5$, with coefficients that
are matrices in the space of fermion internal indices.
-igip+pYTIP gip+pYTg
D.2.6 Scalar gauge vertex
ig[T$,T>] + r,i> v -g[T§, T^+S^
D.2.7 Scalar-scalar gauge vertex
For purely scalar interactions one generally includes a $1/k!$ in the Lagrangian, for a term
that contains $k$ identical scalar fields. The vertices are then $ig$ in Lorentzian signature
and $-g$ in Euclidean signature, where $g$ is the coefficient of the inverse factorials. Graphs
are counted by their symmetry numbers. To compute symmetry numbers one either
figures out the order of the group of geometrical symmetries of the graph, or goes
through the combinatorics of Wick's theorem. See D.4.
In our computation of Wilson loops, we have described Feynman rules for the
coupling to static non-abelian sources.
D.3 External lines
Here we give the rules for invariant amplitudes. Refer to Appendix B for the additional
factors for S-matrix elements and cross sections. The external lines in a diagram look
exactly like the internal lines except that they have one free end. For scalars the rule that
goes with an external line is just a 1. For vectors, we have a factor of $\epsilon_\mu(k)$ for incoming
particles and $(\epsilon_\mu(k))^*$ for outgoing particles. For massive particles with momentum in
the 3-direction, the (linear) polarization vectors are
$$\epsilon^{1,2}_\mu(k) = (0, e_{1,2}, 0),$$
where $e_{1,2}$ are two-dimensional unit vectors in the 1- and 2-directions. Helicity (circular
polarization) states are the complex linear combinations with $2^{-1/2}(e_1 \pm i e_2)$. The other
polarization is called longitudinal and has the form
$$\epsilon_L = \left(\frac{k}{m}, 0, 0, \frac{\omega}{m}\right).$$
If we rotate the momentum to another direction, each of the three polarizations rotates
like a vector. We do not rotate the polarization indices, because we keep the spin
quantization axis along the 3-direction.
For external Dirac fermions we have the rules
$u(p, s)$ — incoming fermion,
$\bar u(p, s)$ — outgoing fermion,
$v(p, s)$ — outgoing anti-fermion,
$\bar v(p, s)$ — incoming anti-fermion.
The spinor solutions of the momentum-space Dirac equation were discussed in
Appendix C.
D.4 Combinatorics
Here are a few examples of Feynman graphs in a theory with $\mathcal{L}_I = (\lambda/4!)\phi^4$, along
with the symmetry numbers by which they are multiplied. The denominators in the
combinatoric factors are a 1/2 from the exponential in perturbation theory, and 1/4!
from each vertex, coming from the definition of the coupling. The numerators count
the number of ways a given diagram is obtained from Wick's theorem, written in a way
that makes the counting clear.
[Figure: Wick contractions of $\phi(x)\phi(y)\int d^4w\,\phi(w)\phi(w)\phi(w)\phi(w)\int d^4z\,\phi(z)\phi(z)\phi(z)\phi(z)$, drawn for several two-loop diagrams, each with its combinatoric factor. The one factor that survives legibly is $\frac{2\cdot 4\cdot 4\cdot 3!}{2\cdot(4!)^2} = \frac{1}{6}$.]
Combinatoric factors for some two-loop diagrams.
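The counting behind such factors can be reproduced by enumerating Wick contractions directly. The sketch below, with illustrative names throughout, lists every complete pairing of the ten fields $\phi(x)\phi(y)\phi(w)^4\phi(z)^4$, keeps those with the topology in which $x$ and $y$ attach to different vertices and three propagators join $w$ to $z$ (the "sunset" diagram, chosen here as an example), and divides by $2!\,(4!)^2$.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def pairings(fields):
    """Yield every complete pairing (Wick contraction) of a list of fields."""
    if not fields:
        yield []
        return
    first, rest = fields[0], fields[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail

def topology(pairing):
    """Count the propagators joining each pair of space-time points."""
    return Counter(tuple(sorted((a[0], b[0]))) for a, b in pairing)

# Two external fields plus four fields at each of the two vertices w and z.
FIELDS = ([("x", 0), ("y", 0)]
          + [("w", i) for i in range(4)]
          + [("z", i) for i in range(4)])

# Sunset topology: x and y on different vertices, three lines between w and z.
SUNSET = [
    Counter({("w", "x"): 1, ("y", "z"): 1, ("w", "z"): 3}),
    Counter({("x", "z"): 1, ("w", "y"): 1, ("w", "z"): 3}),
]

n_sunset = sum(1 for p in pairings(FIELDS) if topology(p) in SUNSET)

# 1/2! from the second-order term of the exponential, 1/4! from each vertex.
denominator = factorial(2) * factorial(4) ** 2
sunset_factor = Fraction(n_sunset, denominator)
print(n_sunset, sunset_factor)   # 192 1/6
```

The count 192 factorizes as $2\cdot 4\cdot 4\cdot 3!$: two choices of which vertex attaches to $x$, four legs at each vertex for the external lines, and $3!$ ways to join the remaining legs.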
Appendix E Group theory and Lie algebras
For physicists, a Lie group is a group whose elements depend smoothly on a finite
number of parameters $\omega_a$. Mathematically this means that the group is also a smooth
differentiable manifold and that the action of the group on itself (by left or right mul-
tiplication) is a diffeomorphism. Much of the structure of Lie groups can be extracted
by studying their behavior near the identity element, where we write¹
$$U(\omega) \approx 1 + i\,\omega_a T^a.$$
A representation of the group is a homomorphism of the group into the group of
invertible linear transformations on a vector space. The representation is faithful if
the only group element represented by the operator 1 is the identity of the group. In
physics we deal with two kinds of non-faithful representation. The first is the trivial,
one-dimensional representation, where every element is mapped into the identity. The
second is exemplified by the three-dimensional vector representation of SU(2), where
the $Z_2$ center of the group is represented by the identity. In this case, no transformation
near the identity is mapped to the identity, so the representation is locally faithful.
In a faithful representation, the requirement that the group multiplication close is
equivalent to the fact that the matrices T a form a Lie algebra:
$$[T^a, T^b] = i f^{ab}{}_c\, T^c.$$
The constants $f^{ab}{}_c$ are called the structure constants of the group. The Jacobi identity
for double commutators constrains these constants. The constraint is equivalent to the
statement that the matrices
$$(T^a_{\rm adj})^b{}_c = -i f^{ab}{}_c$$
form a representation of the same Lie algebra. This is called the adjoint representation.
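For a concrete instance, take SU(2), where the structure constants are the Levi-Civita symbol, $f^{abc} = \epsilon^{abc}$. The sketch below builds the matrices $(T^a)_{bc} = -i\epsilon^{abc}$ and checks that they satisfy $[T^a, T^b] = i\epsilon^{abc}T^c$; the function and variable names are illustrative.

```python
from itertools import product

def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

# Adjoint generators built from the structure constants: (T^a)_{bc} = -i eps^{abc}
T_ADJ = [[[-1j * eps(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]
ZERO3 = [[0] * 3 for _ in range(3)]

def algebra_rhs(a, b):
    """i eps^{abc} T^c, summed over c."""
    total = ZERO3
    for c in range(3):
        total = add(total, scale(1j * eps(a, b, c), T_ADJ[c]))
    return total

adjoint_closes = all(commutator(T_ADJ[a], T_ADJ[b]) == algebra_rhs(a, b)
                     for a, b in product(range(3), repeat=2))
print(adjoint_closes)
```

The check succeeds because of the Jacobi identity for $\epsilon^{abc}$, which is the constraint the text refers to.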
A Lie group is semi-simple if it contains no abelian invariant subgroup. An example
of a Lie group that is not semi-simple is the Poincare group. Translations are the abelian
invariant subgroup. We know how to construct induced representations of this group,
starting from representations of the abelian subgroup, and we have used this technique
to construct the states of free particles, in the text. Mathematicians classify groups
that are direct products of U(1)$^k$ and a semi-simple group as non-semi-simple, but in
physics it is convenient to lump these into the semi-simple category. Every semi-simple
group can be written as a direct product of simple factors $G_1 \otimes G_2 \otimes \cdots \otimes G_n$. A simple
group is one that cannot be further factored in this way.
suited for thinking about unitary representations of the group.
A final step in classification is to distinguish between compact and non-compact
groups. The reference is to the range of the parameters, or what mathematicians call
the group manifold. In physics we deal with the non-compact Poincare group and
sometimes its completion to the conformal group SO(2, d). Apart from that, all of
the Lie groups which appear in field theory are compact. It is fortunate that Cartan
was able to classify all compact simple groups. They fall into three infinite families
(the rotation, unitary, and unitary symplectic groups in $n$ real, $n$ complex, and $2n$ real
dimensions) plus five exceptional groups $G_2$, $F_4$, $E_{6,7,8}$. The subscripts on these names
refer to the rank of the group, the maximal number of commuting generators. The rank
of SU($N$) is $N-1$, the number of traceless diagonal Hermitian matrices. That of SO($n$)
is determined by the maximal number of orthogonal 2-planes in $n$-dimensional space,
which is $k$ for $n = 2k$ or $2k+1$. The rank of Sp($2k$) (often called Sp($k$)) is $k$.
Choose any maximal commuting set of generators. This is called a Cartan subalgebra,
and the generators are called Cartan generators $H_i$, $i = 1, \ldots, r$. The eigenvalues of
the Cartan generators in a representation R of the group form a set of r-dimensional
vectors called the weights of the representation. The lattice formed by adding together
weights of all possible representations is called the weight lattice. For the special case
of the adjoint representation, the weights are called roots. The root lattice is obtained
by adding together all possible roots.
As noted above, a Lie group is called compact if its parameter space is compact as
a manifold. For example, SU(2) is the set of matrices $n_0 + i n_a\sigma_a$, with $n_0^2 + n_a n_a = 1$ and
all components real. This is the same as the sphere S 3 in four-dimensional Euclidean
space, which is compact. The group of all N x N unitary matrices is compact for any
N. As a manifold it is equivalent to N complex, mutually orthogonal unit vectors (its
columns), with some identifications under permutations. Any Lie group that has a
faithful finite-dimensional unitary representation is thus compact.
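The identification of SU(2) with the sphere $S^3$ can be made concrete. The sketch below, whose function names are illustrative, maps a point $(n_0, n_1, n_2, n_3)$ to the matrix $n_0 + i n_a\sigma_a$, then checks that the result is unitary with unit determinant when the point lies on the unit sphere, and that the product of two such matrices again corresponds to a point on the sphere.

```python
def su2_element(n0, n1, n2, n3):
    """Return the 2x2 matrix n0 + i (n1 sigma_1 + n2 sigma_2 + n3 sigma_3)."""
    return [[n0 + 1j * n3, n2 + 1j * n1],
            [-n2 + 1j * n1, n0 - 1j * n3]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def is_unitary(a, tol=1e-12):
    prod = matmul(dagger(a), a)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))

def sphere_point(u):
    """Invert the map: read (n0, n1, n2, n3) back off the matrix entries."""
    return (u[0][0].real, u[0][1].imag, u[0][1].real, u[0][0].imag)

U = su2_element(0.5, 0.5, 0.5, 0.5)      # a point on the unit 3-sphere
V = su2_element(0.6, 0.0, 0.8, 0.0)      # another one
W = matmul(U, V)                         # group multiplication
norm_squared = sum(x * x for x in sphere_point(W))
print(is_unitary(U), abs(det(U) - 1) < 1e-12, abs(norm_squared - 1) < 1e-12)
```

The determinant of `su2_element(n0, n1, n2, n3)` equals $n_0^2 + n_1^2 + n_2^2 + n_3^2$, so the unit-sphere condition is exactly the condition $\det U = 1$.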
If $R$ is any such representation of a compact group $G$, and $T^a_R$ the generators in
that representation, then $\mathrm{tr}(T^a_R T^b_R) = D(R)\,\delta^{ab}$. The trace defines a scalar product in
the space of Hermitian generators, and we can always define orthogonal linear com-
binations. The coefficient D(R) is called the Dynkin index of the representation. If we
want the generators in any representation to have the same commutation relations, then
the Dynkin indices will differ. We usually normalize the Dynkin indices by
$$\mathrm{tr}(T^a_F T^b_F) = \frac{1}{2}\,\delta^{ab},$$
in the smallest representation of the group, often called the defining or fundamental
representation (though representations corresponding to these words are not always
identical). This also normalizes the structure constants, via
$$\mathrm{tr}([T^a_F, T^b_F]\,T^c_F) = \frac{i}{2}\,f^{abc}.$$
This shows that for compact groups the structure constants are totally anti-symmetric.
In any other representation
$$\mathrm{tr}([T^a_R, T^b_R]\,T^c_R) = i\,D(R)\,f^{abc}.$$
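These normalizations can be illustrated with SU(2). The sketch assumes the fundamental generators $T^a_F = \sigma^a/2$, normalized so that $\mathrm{tr}(T^aT^b) = \delta^{ab}/2$, and the adjoint generators $(T^a)_{bc} = -i\epsilon^{abc}$. It computes the Dynkin index of each and checks $\mathrm{tr}([T^a, T^b]T^c) = iD(R)\epsilon^{abc}$; all names are illustrative.

```python
from itertools import product

def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
T_FUND = [[[0.5 * x for x in row] for row in s] for s in PAULI]
T_ADJ = [[[-1j * eps(a, b, c) for c in range(3)] for b in range(3)] for a in range(3)]

def dynkin_index(gens, tol=1e-12):
    """Return D(R), checking that tr(T^a T^b) = D(R) delta^{ab} for all a, b."""
    d = complex(trace(matmul(gens[0], gens[0]))).real
    for a, b in product(range(3), repeat=2):
        expected = d if a == b else 0.0
        if abs(trace(matmul(gens[a], gens[b])) - expected) > tol:
            raise ValueError("generators are not orthogonal with a common norm")
    return d

def triple_trace_ok(gens, tol=1e-12):
    """Check tr([T^a, T^b] T^c) = i D(R) eps^{abc}."""
    d = dynkin_index(gens)
    return all(
        abs(trace(matmul(commutator(gens[a], gens[b]), gens[c])) - 1j * d * eps(a, b, c)) < tol
        for a, b, c in product(range(3), repeat=3))

D_fund = dynkin_index(T_FUND)
D_adj = dynkin_index(T_ADJ)
print(D_fund, D_adj, triple_trace_ok(T_FUND), triple_trace_ok(T_ADJ))
```

With the fundamental normalized to 1/2, the adjoint of SU(2) comes out with Dynkin index 2, and both representations share the same structure constants.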
If we replace the commutator by an anti-commutator in the last formula we get
$$\mathrm{tr}([T^a_R, T^b_R]_+\,T^c_R) = A(R)\,d^{abc}.$$
A(R) is called the anomaly coefficient of the representation and appears in the ABJS
anomaly equations, which are discussed extensively in the text. There is a unique totally
symmetric invariant in the product of three adjoint representations of a compact simple
group, so $d^{abc}$ is representation-independent. The same fact tells us that it is sufficient to
compute the trace of $T^3$ for any generator for which it is non-zero in order to compute
$A(R)$.
The proof of the uniqueness of $d^{abc}$ follows from the theory of tensor products
of representations. Given two representations with representation matrices $M_1(g)$ and
$M_2(g)$, we can make a new one by taking the tensor product $M_1(g) \otimes M_2(g)$. This representation
is called $R_1 \otimes R_2$. The Peter-Weyl theorem tells us that every finite-dimensional
representation is a direct sum of irreducible representations, so
$$R_1 \otimes R_2 = \bigoplus_k R_k.$$
The theory of which representations appear in this sum is the generalization of the
Clebsch-Gordan series for angular momentum. It is most easily worked out in terms
of properties of the weights of the representations. The product of two adjoint repre-
sentations always has another adjoint in it, because of the structure constants. That
is, if $V_a$ and $W_a$ both transform like adjoints, so does $f^{abc}V_bW_c$. The existence of $d^{abc}$
implies a second copy of the adjoint representation in the product of two adjoints. For
SU($N$) its existence is guaranteed by the fact that the representation of the generators
in the fundamental representation is a basis for all traceless Hermitian matrices. Thus
$$T^aT^b = \frac{1}{2N}\,\delta^{ab} + \frac{i}{2}\,f^{abc}T^c + d^{abc}T^c,$$
since the commutator is anti-Hermitian and the anti-commutator Hermitian. It turns
out that, for all other compact simple Lie algebras, there is only one adjoint in the
product of two adjoints, so the $d^{abc}$ symbol vanishes.
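For SU(3) the decomposition of a product of fundamental generators can be verified explicitly. The sketch below assumes the Gell-Mann matrices $\lambda^a$ with $T^a = \lambda^a/2$, and it assumes the normalization $d^{abc} = \mathrm{tr}(\{T^a, T^b\}T^c)$, which is half of the symbol more commonly tabulated. With these choices it checks $T^aT^b = \delta^{ab}/(2N) + (i/2)f^{abc}T^c + d^{abc}T^c$ for all $a, b$.

```python
import math
from itertools import product

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

S3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
N = 3
T = [scale(0.5, lam) for lam in GELL_MANN]      # tr(T^a T^b) = delta^{ab} / 2
IDENTITY = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
idx = range(8)

def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def anticomm(a, b):
    return add(matmul(a, b), matmul(b, a))

# f^{abc} = -2i tr([T^a, T^b] T^c);  d^{abc} = tr({T^a, T^b} T^c)  (assumed normalization)
f = [[[(-2j * trace(matmul(comm(T[a], T[b]), T[c]))).real for c in idx]
      for b in idx] for a in idx]
d = [[[complex(trace(matmul(anticomm(T[a], T[b]), T[c]))).real for c in idx]
      for b in idx] for a in idx]

max_dev = 0.0
for a, b in product(idx, repeat=2):
    rhs = scale((1 if a == b else 0) / (2 * N), IDENTITY)
    for c in idx:
        rhs = add(rhs, scale(0.5j * f[a][b][c] + d[a][b][c], T[c]))
    lhs = matmul(T[a], T[b])
    max_dev = max(max_dev, max(abs(lhs[i][j] - rhs[i][j])
                               for i in range(N) for j in range(N)))
print(max_dev < 1e-12)
```

Running the same construction with the Pauli matrices in place of the Gell-Mann matrices gives $d^{abc} = 0$, in line with the statement that the symmetric symbol is special to SU($N$) with $N \ge 3$.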
A direct product of two groups $G_{1,2}$ is the group formed by pairs $(g_1, g_2)$ and the
obvious multiplication law. In a matrix representation the direct product matrices are
tensor products acting on independent indices in the representation space. The Lie
algebra of a direct product is a direct sum of Lie algebras, and the generators have the
form
$$1 \otimes t^a\omega_a + T^A\Omega_A \otimes 1,$$
where the lower-case and capital letters refer to the generators and infinitesimal
parameters of the two different groups.
A semi-direct product group $G$ has two subgroups $G_{1,2}$ that do not commute with
each other. Conjugation of one group by an element of the other, $G_1 \to g_2^{-1} G_1 g_2$
(and vice versa) leads to a different, though isomorphic, subgroup. The canonical
geometrical example is the group of rotations and translations, or Lorentz transformations
and translations, in a Euclidean or Lorentzian signature space. The commutation
relations are
$$[J_{\mu\nu}, J_{\rho\sigma}] = i\,[\eta_{\mu\rho}J_{\nu\sigma} - \eta_{\nu\rho}J_{\mu\sigma} - \eta_{\mu\sigma}J_{\nu\rho} + \eta_{\nu\sigma}J_{\mu\rho}],$$
$$[J_{\mu\nu}, P_\rho] = i\,(\eta_{\mu\rho}P_\nu - \eta_{\nu\rho}P_\mu),$$
$$[P_\mu, P_\nu] = 0.$$
In words, these equations tell us that $P$ transforms as a vector, and $J$ as a tensor of
second rank under rotations (Lorentz transformations). The second equation also tells
us how angular momentum and boosts respond to translations. These equations are
valid in any dimension and for any signature of the metric $\eta_{\mu\nu}$.
Further material on Lie groups and algebras can be found in [1-5].
Appendix F Everything else
In addition to the big omissions I mentioned in the text, there is a host of subjects
in field theory that I have neglected. I will mention them here, without detailed refer-
ence. In the days of Spires and Google detailed reference hardly seems necessary: all
you need is a name. The large-$N$ approximation, particularly the connection between
large-$N$ matrix field theories and string theory, is a big lacuna. I've barely mentioned
the field of computational lattice gauge theory, which is slowly achieving a quantita-
tive understanding of the hadron spectrum and other low-energy properties of QCD.
The use of field theory techniques in condensed-matter physics produced the theory
of superconductivity, of critical phenomena, and of the quantum Hall effect as well as
a variety of other phenomena. The books by Parisi, Ma, Drouffe and Itzykson, and
Zinn-Justin provide an entrée into this vast body of knowledge. Another big lacuna
is the study of integrable two-dimensional theories and exact S-matrices. There has
been a variety of attempts to study the high-energy fixed-momentum-transfer region
of scattering amplitudes (the Regge region) by summing selected classes of diagrams
in field theory. This has led to an effective field theory approach called Reggeon cal-
culus as well as to a very interesting set of equations called the BFKL equations (this
is not quite the Regge region). Then there is the use of two-dimensional conformal
field theory to do perturbative string theory. Polchinski's volumes and the new book
by the Beckers and Schwarz have excellent discussions of that. The next thing that
comes to mind is heavy-quark effective field theory. The use of field theory as an exact
formulation of quantum gravity (matrix theory and AdS/CFT) has been referred to
only in passing, so I reiterate it here. The study of field theory in dimensions higher
than four has mostly been done in connection with the Kaluza-Klein program, super-
gravity and string theory. Another topic I've neglected is what is colloquially known
as "light-cone quantization" (it's actually light-front quantization). This is a fascinating
approach that should make field theory amenable to the tricks of condensed-matter
physicists, but it has not had as much impact as one would have thought. An old
topic, which has fallen out of fashion, but still has useful lessons, is dispersion theory,
which tries to extract general properties of scattering amplitudes from minimal prin-
ciples of unitarity, crossing symmetry, Lorentz invariance, and a somewhat vaguely
defined notion of analyticity. Axiomatic and constructive and algebraic quantum field
theory are more rigorous approaches, but have not produced results significant to the
wider community of field theorists for a long time. There are repeated attempts to use
the techniques of field theory, in particular the renormalization group, to study tur-
bulence. This project has not really made a breakthrough yet, but many people expect
one. Finally let me note that path integrals, instantons, and lattice gauge theory have
made their appearance in fields like neurobiology, quantum computing, and finance.
Field theory is a general tool for studying complicated systems with many degrees
of freedom, so it will almost certainly find future applications that we cannot yet
dream of.
References
[1] H. Georgi (1999), Lie Algebras in Particle Physics, 2nd edn. (New York, Perseus
Books).
[2] R. Gilmore (1974), Lie Groups, Lie Algebras, and Some of Their Applications
(New York, Wiley).
[3] H.J. Lipkin (1966), Lie Groups for Pedestrians, 2nd edn. (Amsterdam, North
Holland).
[4] R.N. Cahn (1984), Semi-simple Lie Algebras and Their Representations (New
York, Benjamin-Cummings).
[5] Particle Data Group (W.-M. Yao et al.) (2006), J. Phys. G: Nucl. Part. Phys. 33, 1.
[6] W. Pauli (1940), Phys. Rev. 91, 716.
[7] S. Weinberg (1996), The Quantum Theory of Fields, Vol. I (Cambridge, Cambridge
University Press), Section 5.7.
[8] G. Lüders and B. Zumino (1958), Phys. Rev. 110, 1450.
[9] J. Schwinger (1951), Proc. Nat. Acad. Sci. 37, 452.
[10] F.J. Dyson (1949), Phys. Rev. 75, 1736.
[11] R. F. Streater and A. S. Wightman (1964), PCT, Spin Statistics and All That (New
York, Benjamin).
[12] B. Simon (1974), The P(φ)_2 Euclidean Quantum Field Theory (Princeton, NJ,
Princeton University Press).
[13] G. 't Hooft and M. Veltman (1973), CERN preprint-73-09.
[14] G. Källén (1953), On the Magnitude of the Renormalization Constants in Quantum
Electrodynamics, Kongelige Danske Vidensk. Selskab., Vol. 27, no. 12.
[15] H. Lehmann (1954), Nuovo Cim. 11, 342.
[16] S. Weinberg (1996), The Quantum Theory of Fields, Vol. I (Cambridge, Cambridge
University Press), Chapter 4.
[17] R. F. Streater and A. S. Wightman (1964), PCT, Spin Statistics and All That (New
York, Benjamin).
[18] M. Goldberger and K. L. Watson, Collision Theory (New York, Dover Press).
[19] R. G. Newton, Scattering Theory of Waves and Particles (New York, Dover Press).
[20] H. Lehmann, K. Symanzik, and W. Zimmermann (1955), Nuovo Cim. 1, 205.
[21] R. Haag (1958), Phys. Rev. 112, 669.
[22] D. Ruelle (1962), Helv. Phys. Acta. 35, 34.
[23] R. Eden, P. Landshoff, D. Olive, and J. C. Polkinghorne (1966), The Analytic
S-Matrix (Cambridge, Cambridge University Press).
[24] A. Proca (1936), Comptes Rendus Acad. Sci. Paris, 202, 1490.
[25] G. 't Hooft (1976), Phys. Rev. D 14, 3432 [Erratum Phys. Rev. D 18, 2199 (1978)].
[26] J. C. Pati and A. Salam (1974), Phys. Rev. D 10, 275 [Erratum Phys. Rev. D 11,
703 (1975)].
[27] R. N. Mohapatra and J. C. Pati (1975), Phys. Rev. D 11, 566.
[28] R. N. Mohapatra and J. C. Pati (1975), Phys. Rev. D 11, 2558.
[29] G. Senjanovic and R. N. Mohapatra (1975), Phys. Rev. D 12, 1502.
[30] I. I. Bigi and A. I. Sanda (2000), CP Violation (Cambridge, Cambridge University
Press).
[31] T. Takagi (1927), Jap. J. Math. 1, 83.
[32] E. C. Poggio, H. R. Quinn, and S. Weinberg (1976), Phys. Rev. D 13, 1958.
[33] M. Peskin and D. Schroeder (1995), An Introduction to Quantum Field Theory
(New York, Addison Wesley).
[34] T. Kinoshita and D. R. Yennie (1990), Adv. Ser. Direct. High Energy Phys. 7, 1.
[35] P. P. Kulish and L. D. Faddeev (1970), Theor. Math. Phys., 4, 745.
[36] Y. Nambu (1960), Phys. Rev. Lett. 4, 380.
[37] Y. Nambu and G. Jona-Lasinio (1961), Phys. Rev. 122, 345.
[38] Y. Nambu and G. Jona-Lasinio (1961), Phys. Rev. 124, 246.
[39] J. Goldstone (1961), Nuovo Cim. 19, 154.
[40] J. Goldstone, A. Salam, and S. Weinberg (1962), Phys. Rev. 127, 965.
[41] S. L. Adler (1965), Phys. Rev. B 137, 1022.
[42] S. Weinberg (1996), Quantum Theory of Fields, Vol. II (Cambridge, Cambridge
University Press), Chapter 19.
[43] J. Schwinger (1951), Phys. Rev. 82, 664.
[44] S. Mandelstam (1962), Annals Phys. 19, 1.
[45] K. G. Wilson (1974), Phys. Rev. D 10, 2445.
[46] C. Becchi, A. Rouet, and R. Stora (1976), Annals Phys. 98, 287.
[47] S. Weinberg (1996), Quantum Theory of Fields, Vol. II (Cambridge, Cambridge
University Press), Chapter 15.
[48] M. Henneaux (1988), Classical Foundations of BRST Invariance (Naples,
Bibliopolis).
[49] M. Henneaux (1990), Nucl. Phys. Proc. Suppl. 18A, 47.
[50] A. Fuster, M. Henneaux, and A. Maas (2005), Int. J. Geom. Meth. Mod. Phys.
2, 939 (arXiv:hep-th/0506098).
[51] R.P. Feynman (1963), Acta Phys. Polon. 24, 697.
[52] B. S. DeWitt (1962), J. Math. Phys. 3, 1073.
[53] S. Mandelstam (1968), Phys. Rev. 175, 1604.
[54] L. D. Faddeev and V. N. Popov (1967), Phys. Lett. B 25, 29.
[55] J. L. Rosner (2003), Am. J. Phys. 71, 302 (arXiv:hep-ph/0206176).
[56] E. Cartan (1913), Bull. Soc. Math. France 41, 53.
[57] H. Weyl (1921), Raum-Zeit-Materie (Berlin, Springer-Verlag); English translation
Space-Time-Matter (New York, Dover, 1950).
[58] O. Klein (1939), "On the theory of charged fields," in New Theories in Physics
(Paris, International Institute of Intellectual Collaboration), pp. 73-93.
[59] D. Gross (1994), Oskar Klein and Gauge Theory, arXiv:hep-th/9411233.
[60] T. D. Lee and C. N. Yang (1955), Phys. Rev. 98, 101.
[61] S. Gerstein and Ya. B. Zel'dovitch (1955), JETP 29, 698.
[62] R. Marshak and E. C. G. Sudarshan (1958), Phys. Rev. 109, 1860.
[63] R. P. Feynman and M. Gell-Mann (1958), Phys. Rev. 109, 193.
[64] C. N. Yang and R. L. Mills (1954), Phys. Rev. 96, 191.
[65] S. Bludman (1958), Nuovo Cim. 9, 433.
[66] J. Schwinger (1957), Annals Phys. 2, 407.
[67] S. Glashow (1961), Nucl. Phys. 22, 579.
[68] J. Schwinger (1962), Phys. Rev. 128, 2425.
[69] P. W. Anderson (1963), Phys. Rev. 130, 439.
[70] P. W. Higgs (1964), Phys. Lett. 12, 132.
[71] P. W. Higgs (1964), Phys. Rev. Lett. 13, 508.
[72] P. W. Higgs (1966), Phys. Rev. 145, 1156.
[73] R. Brout and F. Englert (1964), Phys. Rev. Lett. 13, 321.
[74] G. Guralnik, C. R. Hagen, and T. W. B. Kibble (1967), Phys. Rev. 155, 1554.
[75] S. Weinberg (1967), Phys. Rev. Lett. 19, 1264.
[76] A. Salam (1968), in Elementary Particle Physics, ed. N. Svartholm (Stockholm,
Almquist and Wiksells).
[77] G. 't Hooft (1971), Nucl. Phys. B 35, 167.
[78] G. 't Hooft and M. J. G. Veltman (1972), Nucl. Phys. B 44, 189.
[79] Y. Nambu (1966), in Preludes in Theoretical Physics, ed. A. de Shalit,
H. Feshbach, and L. van Hove (Amsterdam, North Holland), p. 133.
[80] O. W. Greenberg (1964), Phys. Rev. Lett. 13, 598.
[81] M. Y. Han and Y. Nambu (1965), Phys. Rev. B 139, 1006.
[82] E. B. Bogomol'nyi (1976), Sov. J. Nucl. Phys. 24, 449.
[83] M. K. Prasad and C. M. Sommerfield (1975), Phys. Rev. Lett. 35, 760.
[84] H. B. Nielsen and P. Olesen (1973), Nucl. Phys. B 61, 45.
[85] A. A. Abrikosov (1957), Sov. Phys. JETP 5, 1174 [Zh. Eksp. Teor. Fiz. 32, 1442
(1957)].
[86] K. G. Wilson (1974), Phys. Rev. D 10, 2445.
[87] G. 't Hooft (1976), in High Energy Physics: Proceedings of the EPS International
Conference, Palermo, June, 1975, ed. A. Zichichi (Bologna, Editrice Compositori).
[88] S. Mandelstam (1976), in Extended Systems in Field Theory, ed. J. L. Gervais
and A. Neveu, Phys. Rep. C 23, No. 3.
[89] T. Banks, R. Myerson, and J. B. Kogut (1977), Nucl. Phys. B 129, 493.
[90] M. E. Peskin (1978), Annals Phys. 113, 122.
[91] S. Elitzur, R. B. Pearson, and J. Shigemitsu (1979), Phys. Rev. D 19, 3698.
[92] A. Ukawa, P. Windey, and A. H. Guth (1980), Phys. Rev. D 21, 1013.
[93] S. L. Glashow, J. Iliopoulos, and L. Maiani (1970), Phys. Rev. D 2, 1285.
[94] N. Cabibbo (1963), Phys. Rev. Lett. 10, 531.
[95] M. Kobayashi and T. Maskawa (1973), Prog. Theor. Phys. 49, 652.
[96] C. D. Froggatt and H. B. Nielsen (1979), Nucl. Phys. B 147, 277.
[97] T. Banks, Y. Nir, and N. Seiberg, arXiv:hep-ph/9403203.
[98] C. Bernard, C. DeTar, L. Levkova et al. (2006), in Proceedings of the 5th
International Workshop on Chiral Dynamics, Theory and Experiment (CD)
(Durham/Chapel Hill, NC) (arXiv:hep-lat/0611024).
[99] C. Vafa and E. Witten (1984), Nucl. Phys. B 234, 173.
[100] J. Steinberger (1949), Phys. Rev. 76, 1180.
[101] J. Schwinger (1951), Phys. Rev. 82, 664.
[102] J. S. Bell and R. Jackiw (1969), Nuovo Cim. A 60, 47.
[103] S. Adler (1969), Phys. Rev. 177, 2426.
[104] W. Bardeen (1969), Phys. Rev. 184, 1848.
[105] S. L. Adler and W. Bardeen (1969), Phys. Rev. 182, 1517.
[106] J. Wess and B. Zumino (1971), Phys. Lett. B 37, 95.
[107] E. Witten (1982), Phys. Lett. B 117, 324.
[108] S. Weinberg and E. Witten (1980), Phys. Lett. B 96, 59.
[109] Y. Frishman, A. Schwimmer, T. Banks, and S. Yankielowicz (1981), Nucl. Phys.
B 177, 157.
[110] S. R. Coleman and B. Grossman (1982), Nucl. Phys. B 203, 205.
[111] J. Zinn-Justin (1993), Quantum Field Theory and Critical Phenomena (Oxford,
Oxford University Press).
[112] S. Weinberg (1970), "Dynamic and algebraic symmetries," in Lectures on
Elementary Particles and Quantum Field Theory, ed. H. Pendelton and M. Grisaru
(Cambridge, MA, MIT Press), p. 283.
[113] I. B. Khriplovich (1969), Yad. Fiz. 10, 409.
[114] D. J. Gross and F. Wilczek (1973), Phys. Rev. Lett. 30, 1343.
[115] D. J. Gross and F. Wilczek (1973), Phys. Rev. D 8, 3633.
[116] D. J. Gross and F. Wilczek (1974), Phys. Rev. D 9, 980.
[117] H. D. Politzer (1974), Phys. Rep. 14, 129.
[118] H. D. Politzer (1973), Phys. Rev. Lett. 30, 1346.
[119] N. Arkani-Hamed, A. G. Cohen, E. Katz, A. E. Nelson, T. Gregoire, and
J. G. Wacker (2002), JHEP 0208, 021 (arXiv:hep-ph/0206020).
[120] A. Zee (1973), Phys. Rev. D 7, 3630.
[121] A. Zee (1973), Phys. Rev. D 8, 4038.
[122] S. R. Coleman and D. J. Gross (1973), Phys. Rev. Lett. 31, 851.
[123] E. Farhi and L. Susskind (1981), Phys. Rep. 74, 277.
[124] Yu. L. Dokshitzer, V. A. Khoze, A. H. Mueller, and S. I. Troyan (1991), Basics of
Perturbative QCD (Paris, Editions Frontieres).
[125] R. Jackiw (1974), Phys. Rev. D 9, 1686.
[126] J. Kogut and K. Wilson (1974), Phys. Rep. 12, 75-200.
[127] A. D. Linde (1976), JETP Lett. 23, 64 [Pis'ma Zh. Eksp. Teor. Fiz. 23, 73 (1976)].
[128] S. Weinberg (1976), Phys. Rev. Lett. 36, 294.
[129] T. Banks, L. J. Dixon, D. Friedan, and E. J. Martinec (1988), Nucl. Phys.
B 299, 613.
[130] M. Carena and H.E. Haber (2003), Prog. Part. Nucl. Phys. 50, 63 (arXiv:hep-
ph/0208209).
[131] S. Weinberg (1995), Quantum Theory of Fields, Vols. I, II, and III (Cambridge,
Cambridge University Press).
[132] H. Georgi, H. R. Quinn, and S. Weinberg (1974), Phys. Rev. Lett. 33, 451.
[133] R. F. Dashen and H. Neuberger (1983), Phys. Rev. Lett. 50, 1897.
[134] H. Neuberger, U. M. Heller, M. Klomfass, and P. M. Vranas, arXiv:hep-lat/
9208017.
[135] R. Rajaraman (1982), Solitons and Instantons: An Introduction to Solitons and
Instantons in Quantum Field Theory (Amsterdam, North-Holland).
[136] M. Shifman (1994), Instantons in Gauge Theories (Singapore, World Scientific).
[137] S. Novikov, S. V. Manakov, L. P. Pitaevskii, and V. E. Zakharov (1984), Theory of
Solitons: The Inverse Scattering Method (New York, Consultants Bureau).
[138] S. Coleman (1995), Aspects of Symmetry (Cambridge, Cambridge University
Press).
[139] S. R. Coleman, V. Glaser, and A. Martin (1978), Commun. Math. Phys. 58, 211.
[140] E. Witten (1979), Phys. Lett. B 86, 283.
[141] M. F. Atiyah, N. J. Hitchin, V. G. Drinfeld, and Yu. I. Manin (1978), Phys. Lett.
A 65, 185.
[142] G. 't Hooft (1976), Phys. Rev. D 14, 3432 [Erratum Phys. Rev. D 18, 2199 (1978)].
[143] M. F. Atiyah and I. M. Singer (1984), Proc. Nat. Acad. Sci. 81, 2597.
[144] M. F. Atiyah and I. M. Singer (1971), Annals Math. 93, 119.
[145] M. F. Atiyah and I. M. Singer (1968), Annals Math. 87, 546.
[146] M. F. Atiyah and I. M. Singer (1968), Annals Math. 87, 484.
[147] J. L. Gervais, A. Jevicki, and B. Sakita (1975), Phys. Rev. D 12, 1038.
[148] B. Julia and A. Zee (1975), Phys. Rev. D 11, 2227.
[149] P. Hasenfratz and G. 't Hooft (1976), Phys. Rev. Lett. 36, 1119.
[150] R. Jackiw and C. Rebbi (1976), Phys. Rev. Lett. 36, 1116.
[151] S. R. Coleman (1988), Nucl. Phys. B 298, 178.
[152] J. Bagger and J. Wess (1992), Supersymmetry and Supergravity (Princeton, NJ,
Princeton University Press).
[153] S. J. Gates, M. Grisaru, M. Rocek, and W. Siegel (1983), Superspace: or 1001
Lessons in Supersymmetry (New York, Benjamin-Cummings).
[154] S. Weinberg (1999), The Quantum Theory of Fields, Vol. III (Cambridge, Cambridge
University Press).
[155] S. P. Martin (1996), in Fields, Strings and Duality (Boulder, CO, University of
Colorado Press) (arXiv:hep-ph/9709356).
[156] J. D. Lykken (1996), Fields, Strings and Duality (Boulder, CO, University of
Colorado Press) (arXiv:hep-th/9612114).
[157] J. Kapusta (2006), Finite Temperature Field Theory (Cambridge, Cambridge
University Press).
[158] J. Berges (2005), AIP Conf. Proc. 739, 3 [arXiv:hep-ph/0409233].
[159] N. D. Birrell and P. C. W. Davies (1982), Quantum Field Theory in Curved
Space-time (Cambridge, Cambridge University Press).
[160] H. Georgi (1984), Weak Interactions and Modern Particle Theory (New York,
Benjamin-Cummings).
[161] W. Greiner and B. Muller (1996), Gauge Theories of Weak Interactions (Berlin,
Springer-Verlag).
[162] I. Montvay and G. Muenster (1994), Quantum Fields on a Lattice (Cambridge,
Cambridge University Press).
[163] C. Itzykson, H. Saleur, and J.B. Zuber (1988), Conformal Invariance and
Applications to Statistical Mechanics (Singapore, World Scientific).
[164] P. Di Francesco, P. Mathieu, and D. Senechal (1997), Conformal Field Theory
(Berlin, Springer-Verlag).
[165] T. Banks, W. Fischler, S.H. Shenker, and L. Susskind (1997), Phys. Rev. D 55,
5112 (arXiv:hep-th/9610043).
[166] O. Aharony, S. S. Gubser, J. M. Maldacena, H. Ooguri, and Y. Oz (2000), Phys.
Rep. 323, 183 (arXiv:hep-th/9905111).
[167] T. Banks (1999), arXiv:hep-th/991068.
[168] M. Srednicki (2007), Quantum Field Theory (Cambridge, Cambridge University
Press).
[169] D. Bailin and A. Love (1993), Introduction to Gauge Field Theory (Bristol, IOP
Press).
[170] P. Ramond (1980), Field Theory: A Modern Primer, 2nd edn. (New York, Addison
Wesley).
[171] C. Itzykson and J. B. Zuber (1980), Quantum Field Theory (New York, McGraw-
Hill).
[172] J.M. Drouffe and C. Itzykson (1980), Statistical Field Theory (Cambridge,
Cambridge University Press).
[173] G. Parisi (1988), Statistical Field Theory (New York, Benjamin/Cummings).
[174] S. K. Ma (1976), Modern Theory of Critical Phenomena (New York, Benjamin/
Cummings).
[175] J. Schwinger (1998), Particles, Sources and Fields, Vols I and II (New York, Perseus
Books).
[176] J. D. Bjorken and S. Drell (1965), Relativistic Quantum Field Theory (New York,
McGraw-Hill).
[177] K. Nishijima (1969), Fields and Particles: Field Theory and Dispersion Relations
(New York, Benjamin).
[178] N. N. Bogoliubov and D. Shirkov (1982), Quantum Fields (New York, Benjamin-
Cummings).
[179] S. S. Schweber (1961), Quantum Theory of Fields (Evanston, Row-Peterson).
[180] R. Balian and J. Zinn-Justin (1976), Methods in Field Theory: Proceedings, Les
Houches Summer School, Session 28 (Amsterdam, North-Holland).
[181] S. Deser, M. Grisaru, and H. Pendelton (eds.) (1970), Lectures on Elementary
Particles and Quantum Field Theory (Cambridge, MA, MIT Press).
[182] J. Schwinger (ed.) (1958), Selected Papers on Quantum Electrodynamics (New
York, Dover).
Author index
DeWitt, B. S. 99
Dirac, P. A. M. 10
Dyson, F. 18
Faddeev, L. 47, 75, 99
Fermi, E. 99
Feynman, R. 99
Froggatt, C. D. 115
Georgi, H. 125
Glashow, S. L. 100, 113, 115, 125
Greenberg, O. W. 101
Henneaux, M. 97
Higgs, P. 101
Källén, G. 30
Kobayashi, M. 115
Kulish, P. 75
Mandelstam, S. 93, 99, 1
Mandl, F. 1
Maskawa, T. 115
Mills, R. L. 100
Nambu, Y. 101
Nielsen, H. B. 107, 115
Noether, E. 76
Olesen, P. 107
Peskin, M. 66, 67
Popov, V. N. 47, 99
Proca, A. 41
Rouet, A. 97
Salam, A. 101, 113
Schroeder, D. 66, 67
Schwinger, J. 18, 24, 93, 100
Stora, R. 97
Symanzik, K. 18, 32
't Hooft, G. 46, 101, 108
Weinberg, S. 87, 97, 101, 113
Wess, J. 120
Weyl, H. 45, 99
Wick, G. C. 24, 25
Wightman, A. 17
Wilson, K. 93, 108
Witten, E. 118
Yang, C. N. 100
Zimmermann, W. 18, 32
Zinn-Justin, J. 175
Zumino, B. 120
Subject index
D(R), 95
R_ξ gauges, 131
SU(2) × SU(2), 87
U_A(1) symmetry, 85
MS, 169
annihilation operators, 9
anomalies, 118, 231
anomaly, 85, 87
anti-unitary operators.
axion, 200
Baker-Campbell-Hausdorff formula, 1
baryons, 128
Belinfante tensor, 80
block spin (Kadanoff), 146
boost, 9
bosonic zero modes, 209, 212, 232
bosons, 8, 64
bounce solution, 207
bremsstrahlung, 63
BRST symmetry, 97
canonical commutation relations, 14, 18
Cartan subalgebra.
center of the gauge group.
charge conjugation, 49, 50
Chern-Simons form, 123, 227
chiral symmetry, 47, 85, 87, 91, 172
CKM matrix, 115
Clebsch-Gordan coefficients, 45
cluster property, 32
color, 101
complex, real or pseudo-real representation, 51
Compton scattering, 74
Compton wavelength, 5
confinement, 108
conformal field theory (CFT), 23, 91
conformal transformations, 80, 91
connected component of a group, 13
connected Green functions, 26
,88
spin representations.
correlation length, 146
coset generators, 96
coset space, 85, 88
covariant derivative, 88
CP transformation.
CPT, 55
creation operators, 9
critical exponents, 146
critical phenomena,
critical surface, 152
crossing symmetry.
custodial symmetry,
diamagnetism, 179
dilute gas approximation.
dimensional regularization
dimensional transmutation
Dirac Lagrangian.
Dirac mass, 54
Dirac matrices, 48
Dirac picture, 10, 34
Dirac spinor, 45
Dynkin index, 95, 122
ear diagrams, 141
effective field theory, 137
effective potential, 195
electro-weak couplings, 114
electro-weak gauge theory, 113
extended objects, 214
Faddeev-Popov ghosts, 47, 97
fermion zero modes, 231
fermions, 8, 47
Feynman path integral, 18
Feynman slash notation, 48
finite-dimensional representations of the Lorentz
group, 40
Fock space.
Fredholm determinant.
functional derivative.
functional integrals, 17, 21
gauge coupling unification
gauge equivalence.
gauge fermion, 98
gauge field strength.
gauge invariance, 40, 42, 51, 93
gauge potential, 88, 94
gauge symmetry, 88
spontaneously broken, 96
general coordinate transformation, 79
generations, 95
Georgi-Glashow model, 221
ghost number, 98
GIM mechanism,
grand unified models, 125
Grassmann numbers, 48, 57
Grassmann variables.
Green functions, 18
gyromagnetic ratio, 70, 75, 180
hadron, 32, 128
Hamilton-Jacobi equation, 206
Heisenberg equations, 46
Heisenberg equations of motion, 18
Heisenberg picture, 11
helicity, 39, 47
Higgs field, 62, 95
Higgs mass lower bound, 198
Higgs mass upper bound, 202
Higgs mechanism, 62
Higgs particle, 113
Higgs phenomenon, 102
Higgs-Yukawa couplings, 115
holonomy, 93
homotopy, 237
infinitesimal generator, 4
infrared catastrophe, 63
infrared divergences, 172, 177
lymmctry,
invariant amplitude, 64
irreducible representation, 94
irrelevant or non-renormalizable couplings
isospin symmetry, 87
Klein-Gordon equation, 11, 33, 41
Landau gauge, 171
Landau pole, 164, 176, 202
Landau theory of phase transitions (mean field
theory), 145
lattice gauge theory, 108
Legendre transform, 27, 195
lepton, 68, 95
Lie groups and algebras, 51, 52, 76
longitudinal polarization, 65
longitudinal polarization states, 39
low-energy effective field theory, 85
low-energy effective Lagrangian, 87
LSZ, 18, 32, 34
magnetic monopole soliton, 221
Majorana mass, 54
Majorana spinor, 45, 47, 48
marginal couplings.
marginally irrelevant, 153
marginally relevant, 153
\ relevant coupling.
mass gap, 31
Maxwell field, 40
Maxwell's Lagrangian,
Meissner effect, 107
mesons, 128
minimal subtraction, 169
Nambu-Goldstone bosons (NGBs), 81, 82,
85, 87
natural units, 2
Nielsen-Olesen vortex, 107
Noether current, 76
Noether's theorem, 46, 76, 87
non-abelian Higgs phenomenon, 96
one-particle irreducible (1PI) Green functions
operator product expansion (OPE), 141
p-branes, 232
paramagnetism, 179
parastatistic, 101
parity transformation, 49
Pauli-Villars regulator, 143
perturbation theory.
Peter-Weyl theorem, 51
Pfaffian, 57
pion, 85, 91
Planck mass, 2
Planck scale, 152, 176
Poincare symmetry, 9
Proca equation, 41
proper orthochronous Lorentz group, 49
proper orthochronous Lorentz
transformations, 13
proper-time cut-off, 143
pseudo-vector, 50
QCD, 32, 85
QCD scale, 117
QED, 62
quark, 32, 68, 85, 8i
quark and lepton rr
86, 101, 116
115
rank of a group, 52
relevant or super-renormalizable
couplings, 153
renormalization-group equation, 150
renormalization scale, 173
renormalization scheme, 142
representations of the Lorentz group, 1
roots of a Lie algebra, 52
S-matrix, 12, 32
scattering states, 30
Schrödinger picture, 10
Schur's lemma, 52
Schwinger-Dyson equations, 17
self-dual tensors, 40
semi-classical approximation, 27
space reflection, 13, 49
spectral representation, 30
spin-statistics theorem, 13, 41, 47
stability subgroup, 15
standard model, 40, 70, 95
strong CP problem, 200
supersymmetric partner, 68
symmetry number, 25, 26
't Hooft symbols, 119
Lorentzian, 46
technical naturalness, 172, 190
temporal gauge, 228
time-ordered product, 12
time reversal, 13, 49, 55
topological charge, 217, 227
tree diagrams, 28, 29
unitary representation of the Poincare group, 38
universality classes, 146
van der Waerden, 2
van der Waerden dot convention, 45
vector bundle, 93
W boson, 99, 114
Ward identity, 170
weak hypercharge, 95, 124
Weinberg angle, 114
Wess-Zumino consistency condition, 120
Weyl equation, 46, 48
Weyl fermions, spinors, 45, 52
Weyl matrices, 46
Weyl transformations.
Wick rotation, 24
Wick's theorem, 25, 26
Wigner rotation, 38
Wilson line, 93
Witten effect, 229
WKB approximation, 206
Yang-Mills instantons.
Yukawa couplings,
Yukawa potential, 16
Z boson, 66, 114