The Oxford Master Series is designed for final year undergraduate and beginning graduate students in physics and related disciplines. It has been driven by a perceived gap in the literature today. While basic undergraduate physics texts often show little or no connection with the huge explosion of research over the last two decades, more advanced and specialized texts tend to be rather daunting for students. In this series, all topics and their consequences are treated at a simple level, while pointers to recent developments are provided at various stages. The emphasis is on clear physical principles like symmetry, quantum mechanics, and electromagnetism which underlie the whole of physics. At the same time, the subjects are related to real measurements and to the experimental techniques and devices currently used by physicists in academe and industry. Books in this series are written as course books, and include ample tutorial material, examples, illustrations, revision points, and problem sets. They can likewise be used as preparation for students starting a doctorate in physics and related fields, or for recent graduates starting research in one of these fields in industry.



# **Particle Physics in the LHC Era**

G. Barr, R. Devenish, R. Walczak, & T. Weidberg

*Department of Physics, University of Oxford*


Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

© Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg 2016

The moral rights of the authors have been asserted

First Edition published in 2016

Impression: 1

It is permitted to copy, distribute, transmit, and adapt this work, including for commercial purposes, provided that appropriate credit is given to the creator and copyright holder, a link is provided to the licence, and any changes made to the work are properly indicated. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

This is an open access publication. Except where otherwise noted, this work is distributed under and subject to the terms of a Creative Commons Attribution 4.0 International License (CC BY 4.0) a copy of which is available at https://creativecommons.org/licenses/by/4.0/.

Enquiries concerning reproduction outside the scope of this licence should be sent to the Rights Department, Oxford University Press, at the address above

This title has been made available on an Open Access basis through the sponsorship of SCOAP3.

Published in the United States of America by Oxford University Press 198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data: Data available

Library of Congress Control Number: 2015957415

ISBN 978–0–19–874855–7

Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.

## **Preface**

This book is designed to give a clear introduction to particle physics at a level that will be accessible to advanced undergraduate students. Most of the book concentrates on the 'Standard Model of Particle Physics': what it is and the experimental evidence that supports it. The book fills a gap between the more qualitative introductory texts and the more mathematically advanced graduate textbooks. Our aim is to teach the maximum amount of physics, with the minimum level of maths. This book is in the spirit of Perkins' classic textbook but updated to the LHC era. Particle physics is an experimental science; accelerators and detectors have been essential for progress in this field. The unique feature of this book is that it gives a serious explanation of the practical side of the subject at an accessible level for undergraduates. This will provide students with some real understanding of these subjects and equip them to appreciate the many excellent graduate-level textbooks in these fields and to follow published papers. The core of the book covers the theory and experiments that underpin the Standard Model. Neutrino oscillations and flavour oscillations of the neutral strange, charm, and beauty states are explained carefully, including the violation of CP symmetry. The book covers the discovery of the Higgs at the LHC, explaining the critical issues and how one can extract such a small signal from a large background. We discuss the problems with the Standard Model that give a very strong indication that there should be physics at the TeV scale, Beyond the Standard Model (BSM). We summarize some possible BSM theories that could solve these problems and give examples of LHC searches for BSM physics. We review the evidence for dark matter and consider how LHC and other experiments are searching for it, and we look at the evidence for dark energy and its theoretical consequences.

Each chapter has questions to help students deepen their understanding of the subject, some of which are based on those used in teaching this subject at Oxford as a 4th-year physics major option course.

The official OUP website for this book is http://ukcatalogue.oup.com/product/9780198748557.do. This contains a link to a website ppLHCEra.physics.ox.ac.uk maintained by the authors. We are maintaining a list of errata on this website and we would appreciate receiving corrections via the link on the OUP website. Many new results will be appearing over the next few years. We provide links to Particle Data Group reviews and to websites for some of the current experiments. Finally, suggested solutions are available to course instructors: a request form is available on the OUP book website.

## **Acknowledgements**

Many people have contributed significantly to this book. First, John Cobb and Malcolm John wrote preliminary drafts of Chapters 10 and 11 on flavour and neutrino oscillations and Chapter 2 on mathematical methods. Jim Libby (now at the Indian Institute of Technology, Madras) created many of the stylish plots for the quark model in Chapter 5. Several colleagues at Oxford read individual chapters or sections and provided very useful corrections and suggestions: Alan Barr, Chris Hays, Todd Huffman, Cigdem Issever, Ivan Konoplev, Andrei Starinets, Georg Viehhauser, and Alfons Weber. Amanda Cooper-Sarkar provided invaluable advice on parton distribution functions and the data to produce Fig. 9.13. Steve Biller and Armin Reichold gave useful suggestions on the subject of neutrino oscillations. Hugo Beauchemin (Tufts) provided very useful feedback on Chapter 4 on detectors. Many Oxford students provided vital corrections and useful suggestions. An incomplete list is given here, with apologies to those we have forgotten: Callum Brodie, Rehan Deen, Patrick Dunne, Scott Melville, Jack Miller, Ricky Nathvani, Jack Sennett, Jakub Sikorowski, Jacob Smith, Elliot Reynolds, Jack Weston, and Andrew Yeomans. Members of our families helped as well. Anna Walczak made Figs. 3.3 and 3.7, and Przemyslaw Walczak helped make figures for Chapters 3, 4, and 6. Sheila McKinnon helped with proof reading.

We are indebted to Sonke Adlung of OUP for his continuous support and encouragement over the many years this book has taken to complete and we thank Harriet Konishi of OUP for guiding us through the final stages of production of the book. We thank Mac Clarke for the very thorough and excellent copyediting of the text. We thank Kaarkuzhali Gunasekaran of Integra-PDY for very careful work in typesetting.

Needless to say, all residual mistakes are the responsibility of the current authors. We would be very grateful to receive any corrections and comments from readers. We will keep an up-to-date list of errata on the book's website.

## **Figure permissions**

We are grateful to the following publishers and organizations, who retain the original copyright, for permission to include the following figures (our figure numbers).

Addison-Wesley. Fig. 9.15 is reproduced from *Introduction to High Energy Physics* (3rd edn), D. H. Perkins, 1987.

The American Physical Society. Fig. 9.3 is reproduced from *Physical Review*. Figs. 2.4, 8.22, 8.23, and 11.8 are reproduced from *Physical Review D*. Figs. 5.9, 5.10, 8.26, 9.18, 9.33, 11.4, and 11.10 are reproduced from *Physical Review Letters*.

Cambridge University Press. Fig. 9.12 is reproduced from *Neutrino Physics* (2nd edn), K. Winter, 2000.

Elsevier. Figs. 4.8, 4.27, 8.5, and 13.29 are reproduced from *Nuclear Instruments and Methods*. Figs. 6.1 and 9.22 are reproduced from *Physics Letters*. Figs. 2.2, 2.6, 9.11, 10.7, 12.10, 12.12, 12.13, 12.15, 12.16, and 12.18 are reproduced from *Physics Letters B*. Figs. 8.10, 8.15, 8.18, 8.28, and 9.2 are reproduced from *Physics Reports*.

Institute of Physics Publishing. Figs. 9.28 and 9.29 are reproduced from *Reports on Progress in Physics*.

The KTeV Collaboration: Fig. 10.8.

World Scientific. Fig. 9.1 is reproduced from *Nuclear Structure I*, A. Bohr and B. Mottelson, 1998.

Other figures are covered by publishers' open access agreements or are the authors' own. For those not the authors' own, the reference to the individual publication is given in the caption to the figure.

# **1 Introduction**

The aim of this book is to provide a practical introduction to particle physics in the LHC era at the level of an advanced undergraduate or introductory graduate course. It fills a gap between qualitative introductory texts and advanced texts based on relativistic quantum field theory. We give a clear and concise explanation of key theoretical concepts and their grounding in experimental measurements, with as little use of advanced mathematical techniques as possible. The exceptions are a fairly detailed coverage of exact and broken symmetries and gauge invariance. The language and techniques of relativistic quantum field theory are not used, but relativistic quantum mechanics is covered, focusing on the Klein–Gordon and Dirac equations. The book focuses on the physics of colliders, particularly those delivering the highest energies: proton–proton (LHC), electron–positron (LEP and ILC), and electron–proton (HERA). Experiments and results from older and/or lower-energy electron–positron colliders are discussed when necessary, for example the so-called B-factories. Fixed-target experiments, particularly those using neutrino beams and those studying neutral meson oscillation phenomena (K<sup>0</sup>, B<sup>0</sup>), are outlined. Finally, non-accelerator-based particle physics topics, such as the observation and measurement of solar neutrinos and the experimental search for dark matter, are described briefly.

This chapter introduces the fundamental particles and the forces with which they interact. We will use 'natural units' and these are defined in the next section. Accelerators and colliders are introduced in a historical context that makes clear how much of our current understanding of fundamental particles and forces relies on the steady increase in interaction energy made possible by advances in accelerator technology. An important feature of this book is a description of how modern electronic detectors work, particularly the large detectors built to study proton–proton collisions at the multi-TeV scale of the LHC. Coupled with this is the enormous increase in computing power over the last half-century and the organization of this power on a global scale using the worldwide web,<sup>1</sup> which enables scattering events to be selected and reconstructed in almost real time.

The chapter ends with a brief résumé of the rest of the book.

<sup>1</sup>This is known as GRID computing. A local example used by the authors is GRIDPP: http://www.gridpp.ac.uk/.

*Particle Physics in the LHC Era*, Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg 2016. Published in 2016 by Oxford University Press.

## **1.1 Units**

Studying matter at subnuclear scales requires interactions at very high energies. SI units are very useful in many contexts, but for this subject they would require us to keep track of quantities, particularly energies, with large negative powers of ten. Particle physics uses the **natural units** of MeV or GeV (≡ 1000 MeV) for energy, the femtometre (also known as the fermi, 1 fm = 10<sup>−15</sup> m) for distance, and the barn (1 b = 10<sup>−28</sup> m<sup>2</sup>) for area (used for cross sections).<sup>2</sup>

By using natural units, we can set ħ = c = 1 in our calculations. So, for example, the familiar relation between energy and momentum in special relativity becomes simply E<sup>2</sup> = p<sup>2</sup> + m<sup>2</sup>. In natural units, mass, energy, and momentum have the same units, which simplifies dimensional analysis checking. At the end of a calculation, we might need to convert the answer to practical units, which we can do very simply by using the conversion factors ħc = 197.3 MeV fm and (ħc)<sup>2</sup> = 0.3894 GeV<sup>2</sup> mb.

We will always use natural units in this book, unless explicitly indicated otherwise.

<sup>2</sup>It is said that the name 'barn' originated when early measurements of nuclear scattering cross sections were larger than expected: 'as big as barn doors'.

## **1.2 Early days**

Particle physics had its roots in nuclear physics and cosmic-ray physics. Its remit is the study of the fundamental building blocks of matter and the interactions between them, particularly the strong, electromagnetic, and weak interactions. The gravitational interaction does not play a significant role in most of the topics covered in this book.<sup>3</sup>

<sup>3</sup>Compare gravitational and electrostatic forces between two protons at a separation of 10<sup>−12</sup> m.

In the 1930s, it appeared that atoms and nuclei could be understood in terms of a rather small number of constituents: the proton and neutron; the electron and positron; the neutrino and antineutrino. Electromagnetic interactions were assumed to remain described by the theories of Faraday and Maxwell at subatomic distance scales, but consistent relativistic calculations required the development of quantum field theory and, in particular, an understanding of renormalization (see Section 1.2.1). Weak interactions were more of a problem. The contact interaction developed by Fermi was very successful in bringing order to a wide range of phenomena, but it was not renormalizable and hence did not allow reliable higher-order calculations. The strong nuclear force has a range of a few fermi (10<sup>−15</sup> m), the scale of a nucleus. The strength of the electromagnetic interaction is given by the fine-structure constant, α = e<sup>2</sup>/4πħc, with a value of about 1/137. Defining a 'strong nuclear charge' α<sub>S</sub> similarly gives α<sub>S</sub> ∼ 1 at a distance of a few fermi. The scale of weak interactions is given by the Fermi constant G<sub>F</sub>/(ħc)<sup>3</sup> = 1.166 × 10<sup>−5</sup> GeV<sup>−2</sup>.
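As a rough numerical illustration of the conversion factors given in Section 1.1 and the coupling strengths quoted above, the following Python sketch converts natural-unit results to practical units and carries out the comparison of gravitational and electrostatic forces between two protons suggested in the footnote. The function names are purely illustrative. The value of Newton's constant in natural units and the proton mass are standard values that are not quoted in the text, so they should be read as assumptions.

```python
import math

HBARC_MEV_FM = 197.3       # hbar*c in MeV fm (Section 1.1)
HBARC2_GEV2_MB = 0.3894    # (hbar*c)^2 in GeV^2 mb (Section 1.1)
ALPHA = 1.0 / 137.0        # fine-structure constant (approximate)

# Assumed values, not quoted in the text:
G_NEWTON = 6.709e-39       # Newton's constant in units of hbar*c GeV^-2
M_PROTON_GEV = 0.938       # proton mass in GeV


def energy(p, m):
    """Relativistic energy E = sqrt(p^2 + m^2) in natural units."""
    return math.sqrt(p * p + m * m)


def length_fm(inverse_mev):
    """Convert a length expressed in MeV^-1 to femtometres."""
    return inverse_mev * HBARC_MEV_FM


def cross_section_mb(inverse_gev2):
    """Convert a cross section expressed in GeV^-2 to millibarns."""
    return inverse_gev2 * HBARC2_GEV2_MB


def gravity_to_coulomb_ratio(mass_gev=M_PROTON_GEV):
    """Ratio of gravitational to electrostatic force between two equal
    masses of unit charge. Both forces fall off as 1/r^2, so the
    separation cancels in the ratio."""
    return G_NEWTON * mass_gev ** 2 / ALPHA


if __name__ == "__main__":
    print(f"E for p = 3 GeV, m = 4 GeV: {energy(3.0, 4.0):.1f} GeV")
    print(f"1 GeV^-2 corresponds to {cross_section_mb(1.0):.4f} mb")
    print(f"1/(100 MeV) corresponds to {length_fm(1 / 100.0):.3f} fm")
    print(f"F_grav / F_em for two protons: {gravity_to_coulomb_ratio():.1e}")
```

Because both forces follow an inverse-square law, the ratio does not depend on the separation; it comes out at roughly 10<sup>−36</sup>, which is why gravity can be neglected in most of what follows.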

This simple picture could not accommodate the discovery of new particles by cosmic-ray physicists: the muon by Anderson and Neddermeyer in 1936 using a cloud chamber within a magnetic field; the pion in 1947 by Powell's group in Bristol using specially developed photographic emulsion; and 'strange particles' with V-shaped or kinked tracks by Rochester and Butler (1946–47) using a coincidence-counter-controlled cloud chamber. What was strange about the new particles was a much longer lifetime than would be expected for a 'normal' strongly interacting particle with a comparable mass. A new quantum number, strangeness, was introduced independently by Gell-Mann and by Nishijima in 1953 to explain this. Their proposal was that strangeness was conserved in strong and electromagnetic interactions but not in weak interactions. The lightest strange particles could only decay by a strangeness-changing weak interaction.

This was just the beginning. In the early 1950s, the development in the USA and Europe of synchrotrons capable of delivering particle beams with GeV energies and of the bubble chamber led to a plethora of new short-lived particles.

## **1.2.1 Particles and forces**

The forces that we are most familiar with at a macroscopic level are gravitation and electromagnetism, and at this level a particle may be defined roughly as a 'point-like' object that has a well-defined mass and charge. This also works reasonably well at the scale of atoms (10<sup>−10</sup> m), at which electrons, protons, and neutrons can be considered point-like. In quantum mechanics and quantum field theory, a force is described by the exchange of a field quantum; in the case of electromagnetism, this is the photon (γ). One of the consequences of Dirac's attempts to find a physical explanation for the troublesome negative-energy solutions of his otherwise very successful equation describing the electron was the prediction of *antiparticles* (1930–31), in particular the positron. An antiparticle has the same mass and spin as the corresponding particle, but opposite electric charge. A neutral particle can be its own antiparticle, for example the photon. Feynman<sup>4</sup> has given a simple but elegant argument that antiparticles are a necessary outcome of a relativistically invariant description of particle interactions. Shortly after Dirac's prediction, the positron was discovered by Anderson in 1932. By the late 1940s, the development of *renormalized* quantum field theories<sup>5</sup> gave a consistent way to handle the infinities that seemed inherent in any quantum description of particle creation and annihilation. The paradigm is quantum electrodynamics (QED), for which the mass and electric charge of the electron are two of the parameters that are renormalized, the so-called vacuum energy<sup>6</sup> being the third. After renormalization, QED provides a theory that has enabled amazingly precise calculations of quantities such as the magnetic moment of the electron.<sup>7</sup>

At a deep level, much of the thrust of particle physics in the twentieth century was to find out if the strong and weak nuclear forces could be described by renormalized quantum fields, and, if so, to discover the related field quanta. We now know that this is the case, with the W<sup>±</sup> and Z<sup>0</sup> providing the weak force and the eight massless gluons (g) the strong force. To describe these interactions, more complicated field theories are required, but they have been shown to be renormalizable.

<sup>4</sup>'The reason for antiparticles', in the 1986 Dirac Memorial lectures; see the further reading at the end of this chapter.

<sup>5</sup>A renormalizable theory is one in which infinities to all orders of perturbation theory can be absorbed by the redefinition of a finite number of the parameters of the theory, such as masses and coupling constants. These parameters are then fixed from experimental observation.

<sup>6</sup>A consequence of the quantum time–energy uncertainty relation, vacuum energy ΔE can exist for a time Δt, provided ΔEΔt < ħ.

<sup>7</sup>Quantum field theory is beyond the remit of this book, but some introductory texts are listed in the further reading at the end of Chapter 6.

## **1.2.2 Group theory in particle physics**

Group theory is the mathematical description of patterns, both of physical structures such as crystals and more abstractly of groups of related objects, for example particles with similar properties such as the pions (π<sup>±</sup>, π<sup>0</sup>). Group theory also plays an essential role in the mathematical structure describing the forces and interactions of fundamental particles.<sup>8</sup> The great benefit of group theory is that it provides much of the mathematical apparatus needed to exploit the underlying symmetries of the fundamental forces.

As we shall explain in detail in Chapter 5, the mesons and baryons, composed of the u, d, and s quarks, occur in patterns of a symmetry known as<sup>9</sup> SU(3)<sub>flavour</sub>. This symmetry is approximate because of the difference in masses of the three quarks. It contains an SU(2) subgroup known as isospin composed of hadrons with only u and d quarks. The pion states just mentioned form an isospin triplet and the neutron and proton an isospin doublet.

The strong interaction among the constituents<sup>10</sup> of hadrons is based on an exact SU(3) group structure with three 'colour' charges. This is the theory of quantum chromodynamics (QCD), which underpins the physics described in Chapter 9. The space–time structure of QCD is similar to that of QED, with an inverse-square-law force. Like the photon, the gluons are massless particles, but, unlike the photon, the gluons carry a colour charge, and there are eight gluons. Further explanation is given in Chapter 9.

<sup>8</sup>A more detailed account of the group theory that we need is given at the end of Chapter 2.

<sup>9</sup>The subscript 'flavour' is to distinguish this symmetry from the exact SU(3)<sub>colour</sub> of QCD. The subscripts may be omitted if the context is unambiguous.

<sup>10</sup>Quarks, antiquarks, and gluons.

## **Group theory in electroweak unification**

The most complicated use of group theory is in describing the 'unification' of the weak and electromagnetic interactions to form the electroweak theory of the Standard Model. It is complicated because QED conserves spatial parity whereas the weak force does not respect this symmetry. Electrodynamics has another important feature: it is 'gauge-invariant'. Maxwell's equations are unchanged by a change in the electromagnetic 4-vector potential A<sup>μ</sup> → A<sup>μ</sup> − ∂<sup>μ</sup>Λ, where Λ is a scalar function. Under a gauge transformation, a wavefunction changes by a phase ψ → ψe<sup>−ieΛ</sup>. In the language of group theory, this is a U(1) symmetry. This is much more than just a mathematical curiosity, since the replacement of the 4-momentum of a charged particle p<sup>μ</sup> → p<sup>μ</sup> − eA<sup>μ</sup> in the (classical) equations of motion generates the correct form for the electromagnetic interaction.<sup>11</sup>

The quanta of the weak force are the charged W<sup>±</sup> spin-1 bosons interacting via their 'left-handed'<sup>12</sup> states and the Z<sup>0</sup>, which couples to both left- and right-handed states but with different strengths. As both the photon and Z<sup>0</sup> are spin-1 states with zero electric charge, they can interfere, with a strength given by a mixing angle θ<sub>W</sub> (the 'weak mixing angle' or 'Weinberg angle'). The group structure of the left-handed states is that of<sup>13</sup> SU(2)<sub>L</sub>. Using group-theoretical language, this synthesis of electrodynamics with the weak interaction is based on a U(1)⊗SU(2)<sub>L</sub> group structure. In an analogous, but more complicated, procedure to that described above for QED, the full electroweak interaction can be generated by a suitable gauge transformation. Electroweak unification is covered in Section 7.4.

<sup>11</sup>This is covered in the discussion of gauge symmetry in Chapter 6.

<sup>12</sup>For an ultrarelativistic particle, 'left-handed' refers to the component of its spin projected along a direction opposite to that of its 3-momentum.

<sup>13</sup>The subscript L is for 'left-handed', but also indicates that this is not the approximate SU(2) of nuclear isospin mentioned above.
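The statement that Maxwell's equations are unchanged by the gauge transformation of the 4-vector potential can be checked in one line. The sketch below assumes the standard definition of the electromagnetic field-strength tensor, which is not introduced in this chapter, and uses the fact that the electric and magnetic fields are its components:

$$F^{\mu\nu} \equiv \partial^{\mu}A^{\nu} - \partial^{\nu}A^{\mu}, \qquad A^{\mu} \to A^{\mu} - \partial^{\mu}\Lambda$$

$$F^{\mu\nu} \to \partial^{\mu}(A^{\nu} - \partial^{\nu}\Lambda) - \partial^{\nu}(A^{\mu} - \partial^{\mu}\Lambda) = F^{\mu\nu} - (\partial^{\mu}\partial^{\nu} - \partial^{\nu}\partial^{\mu})\Lambda = F^{\mu\nu}.$$

The extra terms cancel because partial derivatives commute, so the fields, and hence Maxwell's equations written in terms of them, are the same for every choice of the scalar function Λ.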

## **1.2.3 Particles**

Particles, including the force quanta, are classified according to their spin and interactions. *Leptons* are spin-1/2 fermions that do not interact via the strong force: the electron (e) and associated neutrino (ν<sub>e</sub>) provide the paradigm. Two further sets or *generations* have been discovered: the muon (μ) and muon-neutrino (ν<sub>μ</sub>) and the tau (τ) and tau-neutrino (ν<sub>τ</sub>). To account for the non-observation of decay modes such as μ → eγ and τ → μγ, each generation of lepton pairs is given a lepton number, L. For each lepton, there is a corresponding antiparticle with opposite signs for charge Q and lepton number. The properties of the leptons are summarized in Table 1.1.

As will be explained in Chapter 8, the number of neutrino species, N<sub>ν</sub>, is determined from the width of the Z<sup>0</sup> vector boson and is consistent with a value of 3.

Strongly interacting particles (*hadrons*) are composed of quarks and antiquarks bound tightly in $q\bar{q}$ (*meson*) or $qqq$ (*baryon*) combinations by the colour field of QCD. Free quarks have never been observed directly, although there is evidence that they may become unbound in a quark–gluon plasma, which is being studied using heavy-ion collisions. Quarks carry fractional electric charge, +(2/3)e or −(1/3)e, where e is the charge of the positron. There are six quarks, grouped in charge (−1/3, +2/3) pairs: (d, u), (s, c), (b, t). The (d, u) pair form a strong isospin doublet. The details are given in Table 1.2. All quarks have J<sup>P</sup> = 1/2<sup>+</sup>. It is worth noting that the three pairs of quarks (d, u), (s, c), (b, t) of increasing mass are matched by the three lepton pairs (e, ν<sub>e</sub>), (μ, ν<sub>μ</sub>), (τ, ν<sub>τ</sub>).
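To make the fractional charges concrete, the short Python sketch below adds up the quark charges for a few familiar hadrons. The quark content assumed for the proton, neutron, and charged pions is the standard assignment rather than something stated in this section, and the `~` suffix used to mark an antiquark is just a notation chosen for this example.

```python
from fractions import Fraction

# Quark charges in units of the positron charge e.
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def hadron_charge(constituents):
    """Sum the charges of a list of quarks. An antiquark is written with a
    trailing '~' (e.g. 'd~') and carries the opposite charge."""
    total = Fraction(0)
    for q in constituents:
        if q.endswith("~"):
            total -= QUARK_CHARGE[q[:-1]]
        else:
            total += QUARK_CHARGE[q]
    return total


# Assumed (standard) quark content of some hadrons.
EXAMPLES = {
    "proton": ["u", "u", "d"],
    "neutron": ["u", "d", "d"],
    "pi+": ["u", "d~"],
    "pi-": ["d", "u~"],
    "Delta++": ["u", "u", "u"],
}

if __name__ == "__main__":
    for name, quarks in EXAMPLES.items():
        print(f"{name:8s} {' '.join(quarks):8s} Q = {hadron_charge(quarks)}")
```

Every $qqq$ and $q\bar{q}$ combination built this way has integer charge, even though the individual quark charges are fractional.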


**Table 1.1** Lepton properties.


**Table 1.2** Quark properties.

## **1.2.4 Forces**

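The force carriers of the Standard Model occur in two independent sectors:

- the electroweak sector, with the photon (γ), the W<sup>±</sup>, and the Z<sup>0</sup>, based on the U(1)⊗SU(2)<sub>L</sub> group structure;
- the strong sector (QCD), with the eight gluons (g), based on SU(3)<sub>colour</sub>.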


All force carriers in the Standard Model are spin-1 bosons and their properties are summarized in Table 1.3.


**Table 1.3** Force carriers.

For the symmetries to be exact, the particles are assumed to be initially massless, with the Higgs mechanism being invoked to generate particle masses (for all except the photon and gluons) while preserving the underlying symmetry structure. This mechanism requires the existence of a Higgs particle and there is now strong evidence from the Large Hadron Collider at CERN for the existence of at least one Higgs boson. The details of this key discovery for completing the Standard Model are covered in Chapter 12. After a long shutdown between 2013 and 2015, the LHC will operate at the higher centre-of-mass energy of 13 TeV and with higher luminosity. Apart from studying the Higgs in greater detail, much effort will be devoted to the search for evidence of physics beyond the Standard Model.

The 3-fold colour quantum number was introduced to allow baryon wavefunctions, for example that of the Δ<sup>++</sup> composed of three u quarks (spin 3/2, isospin 3/2), to have simultaneously the correct permutational symmetry and satisfy Fermi–Dirac statistics. QCD provides the theoretical basis for why only the 'colourless' $qqq$ and $q\bar{q}$ combinations form 'confined' hadronic bound states. A major difference between QCD and QED is that the force carriers are 'charged', in that the gluons carry a colour charge. Consider a $q\bar{q}$ meson and all the possible colour combinations that a gluon exchanged between the $q$ and $\bar{q}$ might carry. With r, b, g denoting the SU(3)<sub>colour</sub> charges, from $(r, b, g) \otimes (\bar{r}, \bar{b}, \bar{g})$, one might expect nine coloured gluons. However, the three colour-neutral combinations $(r\bar{r}, b\bar{b}, g\bar{g})$ have one totally symmetric combination $\frac{1}{\sqrt{3}}(r\bar{r} + b\bar{b} + g\bar{g})$. In group-theoretical language, this corresponds to combining the $3$ and $\bar{3}$ representations of SU(3)<sub>colour</sub>: $3 \otimes \bar{3} = 8 \oplus 1$. The totally symmetric combination corresponds to the '1' and would be colourless and unconfined, so it is discarded. The remaining octet of coloured gluon states are

$$r\bar{b}, \quad r\bar{g}, \quad b\bar{g}, \quad b\bar{r}, \quad g\bar{r}, \quad g\bar{b}, \quad \frac{1}{\sqrt{2}}(r\bar{r} - b\bar{b}), \quad \frac{1}{\sqrt{6}}(r\bar{r} + b\bar{b} - 2g\bar{g}).$$

Note that there are two apparently colourless states. However, these two colour states are analogous to the electrically neutral members of a strong isospin multiplet (e.g. a ρ<sup>0</sup>)—they are not colourless.
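The counting 3 ⊗ 3̄ = 8 ⊕ 1 can be checked numerically. In the sketch below, each colour–anticolour state is stored as a 3×3 array of coefficients (rows for colour, columns for anticolour, in the order r, b, g); this representation and the helper names are choices made for the illustration only. The check confirms that the eight listed states, together with the discarded symmetric combination, form nine mutually orthogonal, normalized states.

```python
import math

COLOURS = ("r", "b", "g")


def state(terms):
    """Build a 3x3 coefficient array from {(colour, anticolour): coefficient}."""
    m = [[0.0] * 3 for _ in range(3)]
    for (c, cbar), coeff in terms.items():
        m[COLOURS.index(c)][COLOURS.index(cbar)] = coeff
    return m


def inner(a, b):
    """Inner product of two states with real coefficients."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))


S2, S3, S6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)

# The totally symmetric, colourless combination (the '1').
SINGLET = state({("r", "r"): 1 / S3, ("b", "b"): 1 / S3, ("g", "g"): 1 / S3})

# The octet of coloured gluon states listed in the text.
OCTET = [
    state({("r", "b"): 1.0}),
    state({("r", "g"): 1.0}),
    state({("b", "g"): 1.0}),
    state({("b", "r"): 1.0}),
    state({("g", "r"): 1.0}),
    state({("g", "b"): 1.0}),
    state({("r", "r"): 1 / S2, ("b", "b"): -1 / S2}),
    state({("r", "r"): 1 / S6, ("b", "b"): 1 / S6, ("g", "g"): -2 / S6}),
]


def is_orthonormal(states, tol=1e-12):
    """True if every pair of states has inner product 1 (same) or 0 (different)."""
    for i, a in enumerate(states):
        for j, b in enumerate(states):
            expected = 1.0 if i == j else 0.0
            if abs(inner(a, b) - expected) > tol:
                return False
    return True


if __name__ == "__main__":
    print("octet + singlet orthonormal:", is_orthonormal(OCTET + [SINGLET]))
```

The two diagonal octet states are orthogonal to the singlet, which is a compact way of seeing that they are not colourless even though each term pairs a colour with its own anticolour.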

## **1.3 Diagrams**

Two sorts of diagrams are used in this book: Feynman diagrams and 'quark-flow' diagrams. Richard Feynman invented a very elegant graphical formalism that provides a considerable shortcut in calculations. Feynman rules make a direct connection between each element of a diagram and a component of the mathematical expression describing the process, derived from quantum field theory. An example of a Feynman diagram for the process $e^{+}d \to \bar{\nu}_{e}u$ via W<sup>+</sup> exchange is shown in Fig. 1.1. Feynman's graphical technique was invented in the 1940s during the heroic age of relativistic quantum field theory calculations of electromagnetic interactions. More details and an outline of the 'Feynman rules' on how to construct a diagram are given in Section 7.2.2.

**Fig. 1.1** Feynman diagram for $e^{+}d \to \bar{\nu}_{e}u$ via W<sup>+</sup> exchange.

**Fig. 1.2** Lepton pair production by the Drell–Yan process π<sup>−</sup>p → nℓ<sup>+</sup>ℓ<sup>−</sup>.

An example of a quark-flow diagram is shown in Fig. 1.2. It shows the so-called Drell–Yan process π<sup>−</sup>p → nℓ<sup>+</sup>ℓ<sup>−</sup>. Quark-flow diagrams are not an exact calculational tool. However, they are very useful for explaining and understanding what is happening in a particle interaction. They also enable one to keep track of charges and other quantum numbers such as strangeness that may be changing but have to satisfy overall conservation laws.
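The bookkeeping that a diagram encodes can be mimicked in a few lines of Python. The sketch below checks that electric charge, baryon number, and lepton number balance for the two processes shown in Figs. 1.1 and 1.2. The table of quantum numbers uses standard assignments (in particular baryon number 1/3 for each quark, which is not stated in this chapter), and the particle labels are invented for this example.

```python
from fractions import Fraction as F

# (electric charge, baryon number, lepton number); standard assignments,
# with labels chosen for this example ('~' marks an antiparticle).
PARTICLES = {
    "e+":    (F(1),     F(0),    F(-1)),
    "nu_e~": (F(0),     F(0),    F(-1)),
    "u":     (F(2, 3),  F(1, 3), F(0)),
    "d":     (F(-1, 3), F(1, 3), F(0)),
    "pi-":   (F(-1),    F(0),    F(0)),
    "p":     (F(1),     F(1),    F(0)),
    "n":     (F(0),     F(1),    F(0)),
    "l+":    (F(1),     F(0),    F(-1)),
    "l-":    (F(-1),    F(0),    F(1)),
}


def totals(names):
    """Total (charge, baryon number, lepton number) of a list of particles."""
    return tuple(
        sum((PARTICLES[name][i] for name in names), F(0)) for i in range(3)
    )


def conserved(initial, final):
    """True if all three quantum numbers balance between the two states."""
    return totals(initial) == totals(final)


if __name__ == "__main__":
    print("e+ d -> nu_e~ u :", conserved(["e+", "d"], ["nu_e~", "u"]))
    print("pi- p -> n l+ l-:", conserved(["pi-", "p"], ["n", "l+", "l-"]))
    print("e+ d -> nu_e~ d :", conserved(["e+", "d"], ["nu_e~", "d"]))
```

The third line is a deliberately forbidden variant: replacing the outgoing u quark by a d quark leaves the charge unbalanced, so the check fails.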

## **1.4 Accelerators, colliders, and detectors**

This section covers the essential 'tools of the trade' for high-energy particle physics.

Accelerators were first developed for high-energy nuclear physics in the 1930s: both 'linear' electrostatic devices and the first circular accelerators. After the Second World War, new technology enabled a huge increase in beam energies. A summary of accelerators and colliders in operation over the period 1950–2010 is shown in Fig. 1.3.

Detector technology has also advanced considerably owing to the development of microprocessors and advanced circuit board design, some of it driven by the demands of the video-gaming industry.

## **1.4.1 Accelerators**

The earliest particle accelerators were based on the use of a single very large potential difference to accelerate a charged particle: Van de Graaff (1929) used a dielectric belt to transfer charge from a voltage source to a large spherical isolated upper terminal; Cockcroft and Walton (1932) used a series of stages to 'multiply' the voltage. The maximum energy was limited by electrical breakdown, typical maximum accelerating voltages being around 25 MV. Both technologies are still in use: Van de Graaff machines for research in nuclear physics and the Cockcroft–Walton multiplier as an early accelerating stage after the ion source in high-energy facilities.

**Fig. 1.3** Accelerators and colliders in use between 1950 and 2020: hadron–hadron (diamonds), electron–positron (boxes), and HERA electron–proton (triangle).

It was soon realized that to get to ever higher energies, a circular device would allow the accelerating element to be used more than once. Ernest Lawrence pioneered the early development of circular accelerators in the 1930s. His machine—the cyclotron—had a circular beam with a radius that increased as it was accelerated, and the whole device was enclosed in a single large electromagnet. The largest cyclotron that Lawrence built had a diameter of 1.5 m and produced an 8 MeV proton beam.
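As a rough consistency check on the figures quoted for Lawrence's cyclotron, the Python sketch below estimates the magnetic field needed to hold an 8 MeV proton on an orbit of radius 0.75 m, using the standard relation p [GeV] = 0.2998 B [T] R [m] for a singly charged particle. Taking the final orbit radius to be exactly half the quoted diameter is an idealization, the proton mass is a standard value not given in the text, and the resulting field is an estimate rather than a figure from the source.

```python
import math

C_LIGHT = 299792458.0        # speed of light in m/s
K_BEND = 0.299792458         # p [GeV] = K_BEND * B [T] * R [m], unit charge
M_PROTON_GEV = 0.938272      # proton mass in GeV (assumed standard value)


def momentum_gev(kinetic_gev, mass_gev):
    """Momentum from kinetic energy T: p = sqrt(T (T + 2m)) in natural units."""
    return math.sqrt(kinetic_gev * (kinetic_gev + 2.0 * mass_gev))


def field_tesla(p_gev, radius_m):
    """Magnetic field that bends momentum p on a circle of the given radius."""
    return p_gev / (K_BEND * radius_m)


def cyclotron_frequency_mhz(field_t, mass_gev):
    """Non-relativistic orbital frequency f = qB / (2 pi m) for unit charge.
    It does not depend on the orbit radius, which is what allows a fixed
    accelerating frequency while the orbit spirals outwards."""
    omega = K_BEND * field_t * C_LIGHT / mass_gev   # rad/s
    return omega / (2.0 * math.pi) / 1.0e6


if __name__ == "__main__":
    p = momentum_gev(0.008, M_PROTON_GEV)     # 8 MeV kinetic energy
    b = field_tesla(p, 0.75)                  # radius = half of 1.5 m
    print(f"momentum  = {1000 * p:.1f} MeV")
    print(f"field     = {b:.2f} T")
    print(f"frequency = {cyclotron_frequency_mhz(b, M_PROTON_GEV):.1f} MHz")
```

The estimate gives a field of about half a tesla, a value well within reach of a conventional iron electromagnet.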

In a modern accelerator, dipole and quadrupole magnets are used for bending and focusing, and microwave cavities for accelerating the particles. A key technical advance was the discovery of 'phase stability', which synchronized the accelerating voltage frequency and magnetic field strength with the rotation of the particle beam. This enabled circular machines with a beam pipe of fixed radius first to accelerate the beam (electron or proton) and then maintain it at its required operating energy. Another key advance was 'strong focusing', which allowed the use of much smaller vacuum pipes and hence made much larger accelerators affordable.

The first large proton accelerator at CERN<sup>14</sup> —the Proton Synchrotron (PS)—has a diameter of 200 m and a maximum beam energy of 28 GeV. Remarkably, the PS, which started operating in 1959, is still a key component of the CERN complex of accelerators. For particle physics experiments, a high-energy proton beam is extracted from the PS and then directed at a target and detector. 'Secondary' beams of relatively long-lived particles such as pions and kaons can also be produced from the first target and selected by more magnets and particle identification devices. Producing neutral-particle beams is a bit more challenging, since they cannot be steered by electromagnetic methods. The production of neutrino beams is discussed in Chapter 11.

Synchrotrons also accelerate electrons, such as in the original 7.5 GeV machine—the Deutsches Elektronen-Synchrotron—that gave the DESY laboratory in Hamburg its name. An extracted electron beam can be used directly for experiments or to produce a secondary photon beam by bremsstrahlung.<sup>15</sup> Any remaining e<sup>±</sup> particles can be swept from the path of the photon beam by magnets before the photon beam reaches the target.

Energy loss by synchrotron radiation from an electron beam moving in a circular orbit provides additional problems for the accelerator physicist.<sup>16</sup> The rate of energy loss by synchrotron radiation is discussed in more detail in Chapter 3. Its effect is to limit the maximum energy at a given radius. This can be countered by increasing the radius—with the ultimate result being a linear accelerator. Physicists at Stanford University first developed MeV-scale linacs for nuclear structure physics in the 1950s. Somewhat later, the Stanford Linear Accelerator Center (SLAC) was set up to build and operate a two-mile-long electron linac—the longest to date. It started operating in 1966 with a maximum beam energy of 20 GeV, which increased to nearly 50 GeV before high-energy physics experiments ceased in 1998.

<sup>14</sup>The very first accelerator at CERN was the much smaller synchrocyclotron with a maximum beam energy of 600 MeV.

<sup>15</sup>Literally 'braking radiation', the process is e<sup>±</sup> → e<sup>±</sup>γ, in which a high-energy electron emits a photon as it traverses a thin layer of matter and is deflected by the positive nuclear charge. The closely related process of 'pair creation', γ → e<sup>+</sup>e<sup>−</sup>, will also occur; together, the two processes give rise to an electromagnetic 'shower' of e<sup>±</sup> particles.

<sup>16</sup>Synchrotron radiation is essentially the same basic physical process as bremsstrahlung, but for much lower-energy photons (roughly X-ray energies). It is enormously useful for investigating the structure of materials, and synchrotron light sources (e.g. Diamond in the UK and the European Synchrotron Radiation Facility in France) are examples of very practical spin-offs from high-energy particle physics.

## **1.4.2 Colliders**

What matters for the study of the physics is the centre-of-mass (CMS) energy. All the above accelerators produced beams for 'fixed-target' experiments—as the name implies, the target is stationary. A large fraction of the beam energy has been used simply to accelerate the CMS frame in the laboratory frame of the accelerator. To exploit the maximum energy available from circular accelerators, one needs to have two counter-rotating beams and collide them head-on or nearly so. If the two beams are proton and antiproton or electron and positron, the same beam pipe can be used. An e<sup>+</sup>e<sup>−</sup> collider with beams of 20 GeV gives 40 GeV in the CMS frame.
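The gain from colliding beams can be made concrete with a short numerical sketch. It assumes natural units (c = 1, energies and masses in GeV), exactly head-on collisions, and a rounded electron mass; the function names are illustrative and not taken from the text.

```python
import math

def sqrt_s_fixed_target(beam_energy, beam_mass, target_mass):
    """CMS energy for a beam of total energy `beam_energy` on a target at rest."""
    s = beam_mass**2 + target_mass**2 + 2.0 * beam_energy * target_mass
    return math.sqrt(s)

def sqrt_s_collider(energy_1, energy_2, mass_1=0.0, mass_2=0.0):
    """CMS energy for two beams colliding exactly head-on."""
    p1 = math.sqrt(energy_1**2 - mass_1**2)
    p2 = math.sqrt(energy_2**2 - mass_2**2)
    # s = (E1 + E2)^2 - |p1 + p2|^2, with the momenta antiparallel
    s = (energy_1 + energy_2)**2 - (p1 - p2)**2
    return math.sqrt(s)

M_E = 0.000511  # electron mass in GeV (rounded; an assumption of this sketch)

# 20 GeV e+ on 20 GeV e-: the full 40 GeV is available in the CMS frame
collider = sqrt_s_collider(20.0, 20.0, M_E, M_E)

# 20 GeV e+ on an electron at rest: most of the energy moves the CMS frame
fixed = sqrt_s_fixed_target(20.0, M_E, M_E)

print(f"collider: {collider:.3f} GeV, fixed target: {fixed:.3f} GeV")
```

Under these assumptions the fixed-target configuration yields only about 0.14 GeV in the CMS frame, compared with 40 GeV for the collider.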

The key challenge for colliders is to achieve sufficiently high luminosity to provide useful interaction rates. This difficulty has been solved as described in Chapter 3 and most of the major discoveries in particle physics in the past 40 years have been made at colliders of different types. The only exception is for physics requiring a particular type of incident particle rather than just a large interaction energy. For example, CP violation was discovered and studied in great detail using kaon beams. The most important such examples in recent years are the high-energy neutrino beams produced at CERN, Fermilab, and J-PARC in Japan.

Apart from the LHC, much of the experimental information covered in this book has come from colliders operating in the 20 years leading up to the start of data-taking at the LHC (see Fig. 1.3), in particular the Large Electron–Positron (LEP) collider at CERN (a 27 km circumference ring now containing the LHC) with CMS energies up to 209 GeV, the Tevatron proton–antiproton collider at Fermilab (4-mile circumference) with CMS energies up to 2 TeV, and the HERA electron–proton collider at DESY (11 km circumference) with 27.5 GeV electrons on 920 GeV protons providing a maximum CMS energy of 318 GeV. Older machines, particularly the e<sup>+</sup>e<sup>−</sup> colliders (PEP and PETRA), provided data on charm and beauty states after the 1974 'revolution'.<sup>17</sup> The latter were then studied in great detail at the dedicated 'B-factories': KEKB in Japan and PEP-II at SLAC.

## **1.5 Detectors**

In the era of fixed-target experiments, the bubble chamber was one of the most important types of detector. As the name implies, it exploited the fact that boiling could be initiated in a superheated liquid by the passage of a charged particle. The liquid was kept under pressure until just before the beam arrived, at which time the chamber was expanded and then illuminated and photographed. The chamber was surrounded by a magnetic field to bend charged-particle trajectories, thereby allowing their momentum to be determined. Bubble chamber pictures still provide a very good visual aid to understanding the kinematics of high-energy particle collisions. To obtain quantitative information, it was necessary to scan the pictures manually using specialized measuring tables to digitize the tracks. Bubble chambers could only work with pulsed beams, and many were filled with liquid hydrogen, requiring very sophisticated cryogenic engineering and safety systems. An enormous bubble chamber known as Gargamelle (a 4 m long, 2 m diameter cylinder weighing 1000 tonnes) was filled with 18 tonnes of Freon (a refrigerant) for neutrino interactions. This device was designed to find evidence for 'weak neutral currents', which led the way to the discovery of the Z<sup>0</sup>.

<sup>17</sup>The discovery of the J/ψ simultaneously by the Alternating Gradient Synchrotron (AGS) at Brookhaven and the SPEAR e<sup>+</sup>e<sup>−</sup> collider at Stanford in that year did indeed cause a revolution in our understanding of particle physics and in particular provided crucial experimental evidence in support of quantum chromodynamics as the theory of the strong interaction.

The alternatives to a visual device like a bubble chamber are electronic detectors. Devices such as spark and drift chambers give reasonably good spatial information on charged-particle tracks and most importantly can cope with a much higher interaction and read-out rate than a bubble chamber. Drift-chamber technology provided a sophistication that made the big devices used in collider experiments almost the equivalent of an 'electronic bubble chamber'. However, silicon detectors offer much better resolution than drift chambers and they have now become the detectors of choice for the inner detectors at LHC, although wire chambers are still required for the very large areas in muon chambers. The energies of both charged and neutral hadrons can be measured by a calorimeter—a device providing an electronic signal proportional to the energy deposited. A traditional design is the sampling calorimeter with plates of a heavy material (such as lead) separated by space for a charge-sensitive detector. The latter can be based on liquid argon or another inert element like krypton or it can be a plastic scintillator with a photomultiplier readout. Calorimeters are also crucial for providing information that allows the separation of 'electromagnetic particles' (electrons and photons), hadrons, and muons. These subjects are covered in more detail in Chapter 4.

## **1.6 Open questions**

From the 1960s on, the increases in the energy of accelerators and colliders and in the sophistication of detectors, together with powerful computers to analyse the data obtained, have provided a huge number of hadronic states with well-defined mass, width (or lifetime), spin, parity, and decay modes. The information is regularly updated and published in the 'Review of Particle Physics' by the much respected Particle Data Group Collaboration.<sup>18</sup> Initially, the information was summarized on a card small enough to fit into a wallet. Now even the summary (the 'Particle Physics Booklet') is the size of a pocket diary and the full Review is a hefty journal volume of well over 1000 pages. The Standard Model gives a very good description of this wide range of experimental information on elementary particles and their weak, electromagnetic, and strong interactions covering an energy scale up to of order 1 TeV. However, it is certainly incomplete.<sup>19</sup> Gravity is not included and there is no explanation of why there are three 'generations' of quarks and leptons. There are nearly 20 parameters (masses, coupling constants, and mixing angles) that are not given by the Standard Model but have to be determined from experimental measurements. Other big questions crowd in—one of the most glaring is the gross matter–antimatter asymmetry of the world we inhabit, in contrast to the matter–antimatter symmetry that occurs naturally in the Standard Model. According to astrophysical and cosmological evidence discussed in Chapter 13, ordinary baryonic matter constitutes only 5% of the universe. There is an expectation that answers, or at least some initial directions, will be uncovered by the experimental programme of the LHC and future neutrino experiments.

<sup>18</sup>Available online at http://pdg.lbl.gov/ or the PDG UK mirror site http://durpdg.dur.ac.uk/lbl/.

<sup>19</sup>See for example Steven Weinberg's 'Towards the final laws of physics', in the 1986 Dirac Memorial Lectures—the details are in the further reading.

## **1.7 Chapter outline**

Physics is an experimental science, and we would not have our current understanding of particle physics without the use of advanced particle accelerators and detectors. Chapter 3 introduces particle accelerators and explains some of the critical technology required for the successful operation of the LHC. Chapter 4 gives an introduction to the fundamental physics of particle detectors, with an emphasis on the modern techniques used at the LHC. This is obviously a vital subject for particle physics and the chapter attempts to describe this in greater depth than conventional undergraduate textbooks. The applications of these principles to particular experiments will be described in other chapters.

Some of the theoretical and mathematical concepts such as symmetries that will be required throughout the rest of the book are covered in Chapter 2. An introduction to relativistic quantum mechanics is given in Chapter 6. The static quark model for hadrons is described in Chapter 5. The use of scattering experiments to probe the dynamic nature of quarks is covered in Chapter 9, which gives an outline of the quark–parton model as well as the evidence for gluons and a brief introduction to quantum chromodynamics (QCD). The weak interaction is introduced in Chapter 7, starting with the weak interaction of leptons. This is then extended to include quarks and the chapter ends with an introduction to electroweak (EW) unification, including the prediction of the W and Z bosons. A wide range of experiments that support EW unification are covered in Chapter 8, particularly those made possible by the high energies of the LEP, Tevatron, and LHC. Flavour oscillations and CP violation in the quark sector are explained in Chapter 10. Similar oscillations are seen in the neutrino sector, and the formalism and key experimental results are described in Chapter 11. The intriguing possibility that CP violation in neutrino oscillations could explain the observed matter–antimatter asymmetry in the universe is briefly reviewed. The Higgs mechanism is a fundamental aspect of the Standard Model. The theory and experimental evidence for the existence of a Higgs boson are given in Chapter 12. Finally, Chapter 13 starts with a review of Standard Model physics at the LHC; it then explains why this is not the end of the story. Although the Standard Model is remarkably successful in explaining the current LHC data, there remain compelling reasons to believe that it is an incomplete theory. These are outlined in this chapter, together with a discussion of theoretical ideas beyond the Standard Model that might cure some of its problems. The evidence for dark matter and dark energy is also reviewed, as well as the different attempts to discover dark matter.

## **1.8 How to read this book**

As we mentioned in the Preface, we have the ambitious aim of covering all aspects of particle physics. This inevitably means that not all topics will be of equal interest to all readers. The next three chapters provide technical information: Chapter 2 on mathematical methods; Chapters 3 and 4 on accelerators and detectors, respectively. Depending on the reader's interest or experience, these can be skipped or returned to later. Similarly, Chapter 6, an introduction to relativistic quantum mechanics and the Dirac equation, is not essential for understanding many of the experimental results, but it is crucial for understanding the concept of antiparticles. Chapter 5 shows how the static quark model can explain the observed pattern of hadronic masses and quantum numbers. The second half of the book (Chapters 7–13) covers all aspects of experimental particle physics, informed by recent results from the LHC and other particle accelerators and experiments. If you need to use the most recent and accurate experimental results, you should consult the PDG tables [115].

## **Chapter summary**


## **Further reading**


# **2 Mathematical methods**

This chapter covers rotational and Lorentz invariance. Related space–time symmetries such as parity, time reversal, and charge conjugation are also defined. We give a brief introduction to group theory and its use in the mathematical construction of the Standard Model—both in the electroweak sector and in the strong interaction. The idea and usefulness of approximate internal symmetries are explored, using nuclear isospin as an example. This chapter also covers the essential steps in connecting calculations to measured quantities such as cross sections and decay rates. It is assumed that the reader is familiar with the quantization of angular momentum in non-relativistic quantum mechanics and has taken a first course in special relativity.

While investigating the invariance of general relativity in 1918, Emmy Noether determined the conserved quantities for all physical laws that are based on a continuous symmetry. Specifically, there are the following associations between symmetries and conserved quantities:


Three discrete symmetries are also important in nuclear and particle physics: spatial parity (P), charge conjugation (C), and time reversal (T). All three are good symmetries of both the electromagnetic and strong interactions. The weak interaction famously breaks both C and P symmetries maximally but is CP-invariant for many processes. Violation of CP invariance has been observed in the interactions of neutral meson systems, particularly kaons and beauty mesons.<sup>1</sup> The product of all three, CPT, is expected to be a universal symmetry of physics and is a cornerstone of quantum field theory.

<sup>1</sup>CP violation and its consequences for neutral mesons are covered in Chapter 10.

## **2.1 Discrete symmetries**

## **2.1.1 Spatial parity**

The parity operator performs a spatial inversion through the origin:

$$
\psi'(\mathbf{x}, t) = P\psi(\mathbf{x}, t) = \psi(-\mathbf{x}, t)
$$

Applying the parity operator twice must return the original state:

$$PP\psi(\mathbf{x},t) = \psi(\mathbf{x},t), \quad \text{so} \quad P^2 = 1$$

To preserve the normalization of the wavefunction,

$$
\begin{aligned}
\langle \psi | \psi \rangle &= \langle \psi' | \psi' \rangle \\
&= \langle \psi | P^\dagger P | \psi \rangle
\end{aligned}
$$

Therefore,

$$P^\dagger P = I \qquad \qquad \text{( $P$  is unitary)}$$
 
$$\text{and since } P^2 = 1, \qquad P^\dagger = P \qquad \qquad \text{( $P$  is Hermitian)}$$

which implies that parity can be an observable with eigenvalues ±1.
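These properties can be checked on a toy model. The sketch below assumes a wavefunction sampled on five grid points placed symmetrically about the origin, so that parity becomes a matrix that reverses the order of the samples; the grid, the sample functions, and all names are illustrative choices, not part of the text.

```python
N = 5                     # grid points, symmetric about the origin
xs = [-2, -1, 0, 1, 2]    # sample positions

# On this grid, x -> -x reverses the sample order: an anti-diagonal matrix
P = [[1 if i + j == N - 1 else 0 for j in range(N)] for i in range(N)]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    """Matrix acting on a column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

identity = [[1 if i == j else 0 for j in range(N)] for i in range(N)]
transpose = [list(row) for row in zip(*P)]

even = [x**2 for x in xs]   # samples of an even function
odd = [x**3 for x in xs]    # samples of an odd function

p_squared_is_identity = matmul(P, P) == identity          # P^2 = 1
p_is_hermitian = transpose == P                           # real symmetric
even_has_eigenvalue_plus = apply(P, even) == even         # eigenvalue +1
odd_has_eigenvalue_minus = apply(P, odd) == [-v for v in odd]  # eigenvalue -1

print(p_squared_is_identity, p_is_hermitian,
      even_has_eigenvalue_plus, odd_has_eigenvalue_minus)
```

Even and odd functions play the role of the two parity eigenstates, with eigenvalues +1 and −1 respectively.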

Parity changes the direction of vector quantities (**r**, **p**) but conserves quantities that are products of vectors, such as **j** = **r** × **p**. Furthermore, since P has no explicit time dependence,

$$\mathrm{i}\frac{\mathrm{d}\langle P\rangle}{\mathrm{d}t} = [P, H].$$

Therefore, parity is a constant of the motion (i.e. **conserved**) when the interaction Hamiltonian commutes with P.

We define the following **intrinsic parity** of fundamental particles:

• **Spin-1 bosons**: Gluons and the photon have intrinsic parity P = −1:

$$P|\gamma\rangle = -|\gamma\rangle \text{ and } P|g\rangle = -|g\rangle$$

• **Spin-1/2 fermions**: particles are of opposite parity to antiparticles—this follows from the Dirac equation (see Chapter 6). The conventional choice is

$$\begin{aligned} P|e^-\rangle &= P|\nu\rangle = P|q\rangle = +1\\ P|e^+\rangle &= P|\bar{\nu}\rangle = P|\bar{q}\rangle = -1 \end{aligned}$$

The weak gauge bosons (Z<sup>0</sup> and W<sup>±</sup>) are not eigenstates of spatial parity and thus do not have a definite parity quantum number.

## **2.1.2 Charge conjugation**

The charge-conjugation operator C changes a particle into its antiparticle. Generally, few particles are eigenstates of C; for example, a u quark has charge +2/3 and its antiparticle has charge −2/3. Nevertheless, C is a useful quantity when considering electromagnetic or strong decays of neutral mesons.

The photon is a neutral, fundamental particle. Its intrinsic charge quantum number can be inferred by considering its correspondence with classical wave theory. It is clear that upon reversing charge, the electric field and the electromagnetic scalar potential change sign:

$$\mathbf{E}(\mathbf{x},t) \to -\mathbf{E}(\mathbf{x},t), \quad \phi(\mathbf{x},t) \to -\phi(\mathbf{x},t).$$

The vector potential **A**(**x**, t), which is connected with the photon wavefunction, is related to **E** and φ by

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t}$$

Inserting the charge-reversed **E** and φ requires

$$C\mathbf{A} = -\mathbf{A}$$

The photon therefore has C = −1, and a two-photon state has C = (−1)<sup>2</sup> = +1. It is then simple to identify C|π<sup>0</sup>⟩ = +1|π<sup>0</sup>⟩ from the dominant electromagnetic decay π<sup>0</sup> → γγ (branching ratio = 98.8%).

## **2.1.3 Time reversal**

In line with the spatial parity transformation, we might expect that time reversal would be given by

$$\psi_T(\mathbf{x}, t) = T\psi(\mathbf{x}, t) = \eta_T \psi(\mathbf{x}, -t), \quad \text{where} \quad |\eta_T| = 1$$

Both classical mechanics and electromagnetism respect time reversal. In classical mechanics, for a time-independent potential V(x), Newton's equations of motion can be derived from the energy-conservation equation<sup>2</sup>

$$E = \frac{1}{2}m\dot{x}^2 + V(x)$$

by differentiation with respect to time, giving

$$m\ddot{x}\dot{x} + \frac{\mathrm{d}V}{\mathrm{d}x}\dot{x} = 0,\quad\text{or}\quad m\frac{\mathrm{d}^2x}{\mathrm{d}t^2} = -\frac{\mathrm{d}V}{\mathrm{d}x}$$

so the equation of motion is unchanged by the change t → −t. However, this will not be correct for quantum mechanics, because Schrödinger's equation involves a first-order time derivative. For a time-independent Hamiltonian Ĥ and ψ an eigenstate of energy,

$$\mathrm{i}\hbar\frac{\partial\psi}{\partial t} = \hat{H}\psi = E\psi \quad \Longrightarrow \quad \psi(x,t) = \psi(x,0)e^{-\mathrm{i}Et/\hbar}$$

<sup>2</sup>We work in one spatial dimension for simplicity.

If we apply the T operator as defined above to this equation, we find

$$T\psi(x,t) = \eta_T \psi(x,0) e^{+\mathrm{i}Et/\hbar}, \quad \text{where} \quad |\eta_T| = 1$$

The time-reversed state appears to have negative energy.<sup>3</sup> What matters in quantum mechanics is what it predicts for observable quantities, and this requires calculating normalized matrix elements, for which we need ψ<sup>∗</sup>(x, t) = ψ<sup>∗</sup>(x, 0)e<sup>+iEt/ħ</sup> as well as ψ(x, t). Looking at the time dependence of this state gives a hint as to how to proceed: we modify the T operator so that, in addition to requiring the change (**x**, t) → (**x**, −t), we demand that ψ → ψ<sup>∗</sup>, so now

$$T\psi(\mathbf{x},t) \equiv \psi_T(\mathbf{x},t) = \eta_T \psi^*(\mathbf{x},-t) = \eta_T \psi^*(\mathbf{x},0) e^{-\mathrm{i}Et/\hbar}$$

<sup>3</sup>We will consider another view of negative-energy states in Chapter 6.

Such an operator is known as anti-unitary. It is straightforward to show that the normalization of ψ(**x**, t) is unchanged by the T operation, provided that |η<sub>T</sub>|<sup>2</sup> = 1.
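As a quick check of the modified definition, here is a sketch for the energy eigenstate used above, with the same symbols. Differentiating ψ<sub>T</sub> with respect to time shows that it satisfies Schrödinger's equation with the same positive energy E, and the phase η<sub>T</sub> drops out of the norm:

$$
\mathrm{i}\hbar\frac{\partial\psi_T}{\partial t}
= \mathrm{i}\hbar\,\eta_T\,\psi^*(\mathbf{x},0)\left(-\frac{\mathrm{i}E}{\hbar}\right)e^{-\mathrm{i}Et/\hbar}
= E\,\psi_T,
\qquad
\int |\psi_T(\mathbf{x},t)|^2\,\mathrm{d}^3x
= |\eta_T|^2 \int |\psi(\mathbf{x},-t)|^2\,\mathrm{d}^3x
= \int |\psi(\mathbf{x},-t)|^2\,\mathrm{d}^3x
$$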

## **2.1.4 J<sup>PC</sup> of hadrons**

Parity and, where applicable, charge-conjugation quantum numbers are often quoted, for a particular state, with its total angular momentum J = L + S as a J<sup>PC</sup> number. The J<sup>P</sup> of a particle is closely related to the spatial transformation properties of the state. Particles with J<sup>P</sup> = 0<sup>−</sup> are called pseudoscalar particles and those with J<sup>P</sup> = 0<sup>+</sup> scalar. Particles with J<sup>P</sup> = 1<sup>−</sup> are called vector particles and those with J<sup>P</sup> = 1<sup>+</sup> axial vector.

## **Mesons**

Mesons are qq̄ bound states. As the quark and antiquark have opposite intrinsic parity, a ground-state meson will always have P = −1. Excited states bring additional parity factors according to (−1)<sup>L</sup>:

$$P_{\text{meson}} = (-1)^{L+1}$$

C is defined (for light neutral mesons only) by interchanging q ↔ q̄ and swapping their positions and spin:

$$C_{\text{meson}} = (-1)^{L+S}$$
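The two formulas above can be turned into a small lookup. The following is a minimal sketch with an illustrative function name; it assumes a neutral qq̄ state, so that C is defined, and integer L and S.

```python
def meson_jpc(L, S):
    """List the J^PC labels allowed for a q-qbar state with orbital L and spin S."""
    parity = (-1) ** (L + 1)      # P = (-1)^(L+1)
    c_parity = (-1) ** (L + S)    # C = (-1)^(L+S)
    sign = {1: "+", -1: "-"}
    # J runs from |L - S| to L + S in integer steps
    return [f"{J}{sign[parity]}{sign[c_parity]}"
            for J in range(abs(L - S), L + S + 1)]

ground_pseudoscalar = meson_jpc(0, 0)   # quantum numbers of the neutral pion
ground_vector = meson_jpc(0, 1)         # quantum numbers of the rho0
p_wave_triplet = meson_jpc(1, 1)        # L = 1, S = 1 excitations

print(ground_pseudoscalar, ground_vector, p_wave_triplet)
```

The L = 0 results give the pseudoscalar and vector quantum numbers described in the text.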

## **Baryons**

Baryons contain three quarks and as such cannot be their own antiparticle; C is undefined. The calculation of baryon parity is more complex than for mesons because one must consider the angular momentum of a three-body system. The intrinsic parity of a baryon is (+1)<sup>3</sup> = +1; similarly, it is −1 for an antibaryon. In full,

$$P_{\text{baryon}} = B(-1)^{L_{12}}(-1)^{L_3}$$

where B is the baryon number, L<sub>12</sub> is the angular momentum between quarks 1 and 2, and L<sub>3</sub> is the angular momentum of the third quark relative to the 1–2 pair.

## **2.1.5 Useful examples**


So producing them in an L = 1 state will simultaneously conserve J, P, and C.

• What about ρ<sup>0</sup> → π<sup>0</sup>π<sup>0</sup>?

This would be similar to the previous example, requiring the pion pair to be in an L = 1 state.

However, applying C to the final state has no effect:<sup>4</sup>

$$C|\pi^0\pi^0\rangle = |\pi^0\pi^0\rangle, \quad C = +1, \quad J^{PC}(\pi^0\pi^0) = 1^{-+}$$

So this process would violate C conservation and is therefore forbidden for the strong and electromagnetic interactions.

Indeed, it has never been observed.

<sup>4</sup>Applying charge conjugation to a single neutral pion gives C|π<sup>0</sup>⟩ = |π<sup>0</sup>⟩; therefore applying the charge-conjugation operator to two neutral pions gives a factor of one. Note that this decay mode is also forbidden by Bose–Einstein symmetry: the two pions must be in an L = 1 state to conserve angular momentum, but this would require the wavefunction to be antisymmetric with respect to exchange of identical bosons.

• ρ<sup>0</sup> → ℓ<sup>+</sup>ℓ<sup>−</sup>

This is similar to ρ<sup>0</sup> → π<sup>+</sup>π<sup>−</sup> except that the ℓ<sup>+</sup>ℓ<sup>−</sup> system has intrinsic parity −1 from (+1)<sub>ℓ−</sub>(−1)<sub>ℓ+</sub>. Producing a final state with the spins aligned, S = 1, simultaneously conserves J, P, and C. Note that this process only proceeds via the electromagnetic force and is therefore only ∼ 10<sup>−4</sup> as likely as the π<sup>+</sup>π<sup>−</sup> mode.

• ρ<sup>0</sup> → π<sup>0</sup>γ

The initial state is J<sup>PC</sup> = 1<sup>−−</sup>. The photon has J<sup>PC</sup> = 1<sup>−−</sup> and the π<sup>0</sup> has 0<sup>−+</sup>.

But, in addition to their intrinsic parity, photons carry parity (−1)<sup>L</sup> away from a system.

Therefore, this electromagnetic decay is allowed.

• What is the J<sup>PC</sup> of a K<sup>+</sup>? |us̄⟩ is not an eigenstate of charge conjugation, so the C quantum number is undefined. Therefore, the label becomes J<sup>P</sup> = 0<sup>−</sup> (ground-state kaon).

## **2.2 Addition of angular momentum**

## **2.2.1 Angular momentum in quantum mechanics**

In quantum mechanics, the angular momentum operators J<sub>x</sub>, J<sub>y</sub>, J<sub>z</sub> do not commute, but they do satisfy the commutation relations

$$[J_i, J_j] = \mathrm{i} \epsilon_{ijk} J_k \tag{2.1}$$

where ε<sub>ijk</sub> = +1 if ijk is an even permutation of xyz (e.g. yzx), −1 if it is an odd permutation (e.g. zyx), and 0 if any of the indices are identical (e.g. xyx). The only operator that commutes with J<sub>x</sub>, J<sub>y</sub>, J<sub>z</sub> is the total angular momentum J<sup>2</sup>:

$$J^2 = J_x^2 + J_y^2 + J_z^2,\quad \text{and}\quad [J^2, J_i] = 0 \quad (i = x, y, z)$$

Eigenfunctions of J<sup>2</sup> and J<sub>z</sub> are labelled with the eigenvalues of J<sup>2</sup> and J<sub>z</sub>:

$$\begin{aligned} J^2|j,m\rangle &= j(j+1)|j,m\rangle\\ J_z|j,m\rangle &= m|j,m\rangle \end{aligned}$$

where m ∈ {−j, −j+1, ..., j−1, j}. So, for a given j value there are 2j+1 values of J<sub>z</sub>, with the states related by raising and lowering operators

$$J_{\pm} = J_x \pm \mathrm{i}J_y \tag{2.2}$$

which, from eqn 2.1, satisfy

$$[J_z, J_\pm] = \pm J_\pm \tag{2.3}$$

Using this commutation relation, we have

$$\begin{aligned} J_z J_- | j, m \rangle &= (J_- J_z - J_-) | j, m \rangle \\ &= J_- (J_z - 1) | j, m \rangle \\ &= (m - 1) J_- | j, m \rangle \end{aligned}$$

Similarly, m + 1 is the eigenvalue of J<sub>z</sub> applied to the 'raised' state J<sub>+</sub>|j, m⟩. So we can generally write

$$J_{\pm}|j,m\rangle = \mathbf{C}_{\pm}(j,m)|j,m \pm 1\rangle \tag{2.4}$$

with the boundary condition that **C**<sub>±</sub> is required to be 0 for J<sub>−</sub>|j, −j⟩ and J<sub>+</sub>|j, +j⟩.

To derive these constants, we note that J<sub>+</sub> and J<sub>−</sub> are Hermitian adjoints (although they are not Hermitian operators):

$$\begin{aligned} \langle j,m|J_{+}^{\dagger}|j,m+1\rangle &= \langle j,m|J_{-}|j,m+1\rangle \\ &= \mathbf{C}_{-}(j,m+1)\langle j,m|j,m\rangle \\ &= \mathbf{C}_{-}(j,m+1) \end{aligned} \tag{2.5}$$

We take the complex conjugate of this equation to give

$$\langle j,m+1|J_+|j,m\rangle = \mathbf{C}_+^*(j,m)\langle j,m+1|j,m+1\rangle = \mathbf{C}_+^*(j,m)$$

so the lowering coefficient of the m + 1 state is the same as the raising coefficient of the m state:

$$\mathbf{C}_{-}(j, m+1) = \mathbf{C}_{+}^{*}(j, m) = \mathbf{C} \tag{2.6}$$

Therefore, applying both operators successively,

$$J_-J_+|j,m\rangle = \mathbf{C}^2|j,m\rangle\tag{2.7}$$

where the double operator can be broken down to

$$\begin{aligned} J_{-}J_{+} &= J_{x}^{2} + J_{y}^{2} + \mathrm{i}(J_{x}J_{y} - J_{y}J_{x}) \\ &= J_{x}^{2} + J_{y}^{2} + J_{z}^{2} - J_{z}^{2} + \mathrm{i}[J_{x}, J_{y}] \\ &= J^{2} - J_{z}^{2} - J_{z} \\ &= J^{2} - J_{z}(J_{z} + 1) \end{aligned} \tag{2.8}$$

from which we can easily identify the eigenvalues of **C**<sup>2</sup> and the coefficients **C**<sub>±</sub>

$$\begin{aligned} \mathbf{C}^2 &= j(j+1) - m(m+1) \\ \mathbf{C}_{-}(j,m) &= \sqrt{j(j+1) - m(m-1)} \\ \mathbf{C}_{+}(j,m) &= \sqrt{j(j+1) - m(m+1)} \end{aligned} \tag{2.9}$$
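The coefficients in eqn 2.9 are easy to evaluate numerically. The sketch below uses illustrative function names and works in units with ħ = 1, as in the text. It uses exact fractions for j and m so that half-integer values are handled without rounding before the square root is taken.

```python
from fractions import Fraction
import math

def c_plus(j, m):
    """Raising coefficient C_+(j, m) = sqrt(j(j+1) - m(m+1))."""
    j, m = Fraction(j), Fraction(m)
    return math.sqrt(j * (j + 1) - m * (m + 1))

def c_minus(j, m):
    """Lowering coefficient C_-(j, m) = sqrt(j(j+1) - m(m-1))."""
    j, m = Fraction(j), Fraction(m)
    return math.sqrt(j * (j + 1) - m * (m - 1))

half = Fraction(1, 2)

# Boundary conditions: the ladder stops at m = +j and m = -j
top_is_annihilated = c_plus(1, 1) == 0.0
bottom_is_annihilated = c_minus(1, -1) == 0.0

# Eqn 2.6: lowering from m + 1 has the same coefficient as raising from m
ladder_is_consistent = math.isclose(c_minus(1, 1), c_plus(1, 0))

print(c_minus(1, 1), c_minus(half, half))
```

For j = 1 the lowering coefficient from m = 1 is √2, and for j = 1/2 the lowering coefficient from m = 1/2 is 1.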

## **2.2.2 Addition and Clebsch–Gordan coefficients**

Let J<sub>1</sub> and J<sub>2</sub> be two angular momentum operators, for example the orbital angular momentum and spin of a particle:

$$\begin{aligned} J_1^2 \ |j_1, m_1\rangle &= j_1(j_1+1)|j_1, m_1\rangle \\ J_2^2 \ |j_2, m_2\rangle &= j_2(j_2+1)|j_2, m_2\rangle \\ J_{1z} \ |j_1, m_1\rangle &= m_1|j_1, m_1\rangle \\ J_{2z} \ |j_2, m_2\rangle &= m_2|j_2, m_2\rangle \end{aligned} \tag{2.10}$$

We define the combined operators

$$\begin{aligned} J^2 &= \left(J_{1x} + J_{2x}\right)^2 + \left(J_{1y} + J_{2y}\right)^2 + \left(J_{1z} + J_{2z}\right)^2 \\ J_z &= J_{1z} + J_{2z} \end{aligned}$$

acting on

$$\psi = |j_1, m_1\rangle |j_2, m_2\rangle$$

ψ is an eigenfunction of J<sub>z</sub> with eigenvalue m = m<sub>1</sub> + m<sub>2</sub> but it is generally not an eigenfunction of J<sup>2</sup>. However, linear combinations of ψ can produce eigenfunctions Ψ(j, m, j<sub>1</sub>, j<sub>2</sub>) of J<sup>2</sup> with eigenvalues j(j+1):

$$\Psi = \sum_{m_1=-j_1}^{j_1} \sum_{m_2=-j_2}^{j_2} \mathbf{C}_{j_1 j_2}(j, m, m_1, m_2) \psi \tag{2.11}$$

The coefficients **C**<sub>j<sub>1</sub>j<sub>2</sub></sub>(j, m, m<sub>1</sub>, m<sub>2</sub>) are the Clebsch–Gordan coefficients.<sup>5</sup> They are the probability amplitudes for finding, in a combined system |j<sub>1</sub>, m<sub>1</sub>⟩|j<sub>2</sub>, m<sub>2</sub>⟩, the combined angular momentum eigenvalue j(j + 1) when applying the operator J<sup>2</sup>. The details can be found in quantum mechanics texts; here we will consider a useful example.

<sup>5</sup>Clebsch and Gordan were nineteenth-century mathematicians who identified these coefficients during the development of the theory of Lie algebras.

## **2.2.3 Calculation of Clebsch–Gordan coefficients**

Consider two particles |j<sub>1</sub>, m<sub>1</sub>⟩ and |j<sub>2</sub>, m<sub>2</sub>⟩ forming a combined state |j, m⟩, where j<sub>1</sub> = 1 and j<sub>2</sub> = 1/2. Evidently, j can be 1/2 or 3/2. The following two points are key:


j = 3/2 states:

Evidently, the states with maximum (3/2) and minimum (−3/2) m are

$$\left|\frac{3}{2},\frac{3}{2}\right\rangle = \left|1,1\right\rangle \left|\frac{1}{2},\frac{1}{2}\right\rangle$$

$$\left|\frac{3}{2},-\frac{3}{2}\right\rangle = \left|1,-1\right\rangle \left|\frac{1}{2},-\frac{1}{2}\right\rangle$$

Using the definition **C**<sub>±</sub>(j, m) = √[j(j + 1) − m(m ± 1)], we have

$$\begin{aligned} J_{-}\left|\frac{1}{2},\frac{1}{2}\right\rangle &= 1\left|\frac{1}{2},-\frac{1}{2}\right\rangle \\ J_{-}|1,1\rangle &= \sqrt{2}|1,0\rangle \\ J_{-}|1,0\rangle &= \sqrt{2}|1,-1\rangle \\ J_{-}|1,-1\rangle &= 0 \end{aligned}$$

Now operating on the combined state, we have

$$\begin{aligned} J_- \left| \frac{3}{2}, \frac{3}{2} \right\rangle &= J_- \left| 1, 1 \right\rangle \left| \frac{1}{2}, \frac{1}{2} \right\rangle \\ \sqrt{3} \left| \frac{3}{2}, \frac{1}{2} \right\rangle &= \sqrt{2} \left| 1, 0 \right\rangle \left| \frac{1}{2}, \frac{1}{2} \right\rangle + \left| 1, 1 \right\rangle \left| \frac{1}{2}, -\frac{1}{2} \right\rangle \\ \left| \frac{3}{2}, \frac{1}{2} \right\rangle &= \sqrt{\frac{2}{3}} \left| 1, 0 \right\rangle \left| \frac{1}{2}, \frac{1}{2} \right\rangle + \sqrt{\frac{1}{3}} \left| 1, 1 \right\rangle \left| \frac{1}{2}, -\frac{1}{2} \right\rangle \end{aligned}$$

In an analogous way,

$$
\left| \frac{3}{2}, -\frac{1}{2} \right\rangle = \sqrt{\frac{2}{3}} \left| 1, 0 \right\rangle \left| \frac{1}{2}, -\frac{1}{2} \right\rangle + \sqrt{\frac{1}{3}} \left| 1, -1 \right\rangle \left| \frac{1}{2}, \frac{1}{2} \right\rangle.
$$

j = 1/2 states:

$$\left| \frac{1}{2}, \frac{1}{2} \right\rangle = a \left| 1, 1 \right\rangle \left| \frac{1}{2}, -\frac{1}{2} \right\rangle + b \left| 1, 0 \right\rangle \left| \frac{1}{2}, \frac{1}{2} \right\rangle$$

with a<sup>2</sup> + b<sup>2</sup> = 1 because of normalization. We now apply J<sub>+</sub>:

$$\begin{aligned} J\_+ \left| \frac{1}{2}, \frac{1}{2} \right\rangle &= 0\\ &= a \left| 1, 1 \right\rangle \left| \frac{1}{2}, \frac{1}{2} \right\rangle + \sqrt{2}b \left| 1, 1 \right\rangle \left| \frac{1}{2}, \frac{1}{2} \right\rangle \end{aligned}$$

Therefore, a + √2 b = 0, which with a<sup>2</sup> + b<sup>2</sup> = 1 gives a = √(2/3) and b = −√(1/3).

These results are summarized in Tables 2.1 and 2.2.
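The ladder-operator construction above can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `c_minus` and `lower_product` are hypothetical, and a product-basis state is represented (by assumption) as a dictionary mapping (m<sub>1</sub>, m<sub>2</sub>) to an amplitude. It lowers |3/2, 3/2⟩ once and checks that the j = 1/2 state found from a + √2 b = 0 is orthogonal to the result.

```python
from math import sqrt, isclose

def c_minus(j, m):
    """Lowering coefficient: J_-|j,m> = c_minus(j,m) |j,m-1>."""
    return sqrt(j * (j + 1) - m * (m - 1))

def lower_product(state, j1, j2):
    """Apply J_- = J1_- + J2_- to a product-basis state.

    `state` maps (m1, m2) to the amplitude of |j1,m1>|j2,m2>.
    """
    out = {}
    for (m1, m2), amp in state.items():
        if m1 > -j1:  # lower the first particle
            key = (m1 - 1, m2)
            out[key] = out.get(key, 0.0) + amp * c_minus(j1, m1)
        if m2 > -j2:  # lower the second particle
            key = (m1, m2 - 1)
            out[key] = out.get(key, 0.0) + amp * c_minus(j2, m2)
    return out

j1, j2 = 1.0, 0.5
top = {(1.0, 0.5): 1.0}                    # |3/2, 3/2> = |1,1>|1/2,1/2>
lowered = lower_product(top, j1, j2)       # equals sqrt(3) |3/2, 1/2>
norm = c_minus(1.5, 1.5)                   # sqrt(3)
state_32_12 = {k: v / norm for k, v in lowered.items()}

# |1/2, 1/2> = a |1,1>|1/2,-1/2> + b |1,0>|1/2,1/2>, with a + sqrt(2) b = 0
a, b = sqrt(2 / 3), -sqrt(1 / 3)
overlap = a * state_32_12[(1.0, -0.5)] + b * state_32_12[(0.0, 0.5)]
```

The dictionary `state_32_12` holds the two coefficients of |3/2, 1/2⟩, and `overlap` is the inner product of the j = 1/2 and j = 3/2 states with the same m, which should vanish.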


**Table 2.1** Clebsch–Gordan coefficients for j<sub>1</sub> = 1 and j<sub>2</sub> = 1/2.


**Table 2.2** Clebsch–Gordan coefficients for j<sub>1</sub> = 1/2 and j<sub>2</sub> = 1/2.

## **2.3 Spatial rotations**

Consider a small rotation about the y axis,

$$\begin{aligned} \mathbf{x}' &= \mathbf{R}\_y(\epsilon)\mathbf{x} \\ \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} &= \begin{pmatrix} \cos\epsilon & 0 & \sin\epsilon \\ 0 & 1 & 0 \\ -\sin\epsilon & 0 & \cos\epsilon \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \end{aligned}$$

and its inverse,

$$\mathbf{x} = \mathbf{R}\_y^{-1}(\epsilon)\mathbf{x}'$$

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} \cos\epsilon & 0 & -\sin\epsilon \\ 0 & 1 & 0 \\ \sin\epsilon & 0 & \cos\epsilon \end{pmatrix} \begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}$$

We **impose invariance**:

$$
\psi'(\mathbf{x}') = \psi(\mathbf{x}) = \psi(\mathbf{R}\_y^{-1}(\epsilon)\mathbf{x}')
$$

Without loss of generality, we take a specific point **x**′ = **a** and find a relation between ψ′(**a**) and ψ(**a**):

$$\begin{split} \psi'(\mathbf{a}) &= \psi(\mathbf{R}\_y^{-1}(\epsilon)\mathbf{a}) \\ &= \psi(a\_x \cos \epsilon - a\_z \sin \epsilon, a\_y, a\_z \cos \epsilon + a\_x \sin \epsilon) \\ &\to \psi(a\_x - \epsilon a\_z, a\_y, a\_z + \epsilon a\_x) \quad \text{as} \quad \epsilon \to 0 \\ &= \psi(\mathbf{a}) + \epsilon a\_x \frac{\partial \psi(\mathbf{a})}{\partial z} - \epsilon a\_z \frac{\partial \psi(\mathbf{a})}{\partial x} + \dots \quad \text{(Taylor expansion)} \\ &\approx \left[ 1 + \epsilon \left( a\_x \frac{\partial}{\partial z} - a\_z \frac{\partial}{\partial x} \right) \right] \psi(\mathbf{a}) \\ &= \underbrace{\left( 1 - \mathrm{i} \epsilon J\_y \right)}\_{U\_\mathrm{r}(\epsilon)} \psi(\mathbf{a}) \qquad \left[ \text{since } J\_y = -\mathrm{i} \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right) \right] \end{split}$$

## **Conservation** *↔* **invariance**

Consider the time variation of U<sub>r</sub>:<sup>6</sup>

$$\begin{split} \frac{\mathrm{d}}{\mathrm{d}t} \langle \phi(t) | U\_{\mathrm{r}} | \varphi(t) \rangle \\ &= \left[ \frac{\mathrm{d}}{\mathrm{d}t} \langle \phi(t) | \right] U\_{\mathrm{r}} | \varphi(t) \rangle + \langle \phi(t) | \frac{\mathrm{d}U\_{\mathrm{r}}}{\mathrm{d}t} | \varphi(t) \rangle + \langle \phi(t) | U\_{\mathrm{r}} \left[ \frac{\mathrm{d}}{\mathrm{d}t} | \varphi(t) \rangle \right] \\ &= \langle \phi(t) | \frac{\mathrm{d}U\_{\mathrm{r}}}{\mathrm{d}t} | \varphi(t) \rangle + \mathrm{i} \langle \phi(t) | H U\_{\mathrm{r}} - U\_{\mathrm{r}} H | \varphi(t) \rangle \end{split}$$

Hence, U<sub>r</sub> is invariant if J<sub>y</sub> (the non-trivial part of U<sub>r</sub>) is independent of time and commutes with the Hamiltonian; i.e. the eigenvalues of U<sub>r</sub>(ε) are constant. Angular momentum is conserved owing to the requirement that the wavefunction be invariant under rotation.

<sup>6</sup>Here we explicitly assume that the non-relativistic Schrödinger equation is valid. The resulting conservation law is therefore only valid in the non-relativistic limit. The relativistic case requires the use of the Dirac equation and this will be discussed in Chapter 6.

Rotation is a Lie group,<sup>7</sup> and so any rotation can be expressed as the successive application of the infinitesimal rotation:

$$\begin{split} U\_{\mathbf{r}}(\beta) &= \lim\_{n \to \infty} \left( 1 - \mathrm{i}\frac{\beta}{n} J\_y \right)^n \\ &= \mathrm{e}^{-\mathrm{i}\beta J\_y} \end{split} \tag{2.12}$$

or, for a general three-dimensional rotation,

$$U\_{\mathbf{r}}(\boldsymbol{\Theta}) = e^{-i\boldsymbol{\Theta} \cdot \mathbf{J}}$$

In the language of operators, the angular momentum operator J<sub>y</sub> is said to be the generator of rotations about the y axis.

<sup>7</sup>Section 2.7 covers the essentials of group theory.
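The limit in eqn 2.12 can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything from the text: it takes the spin-1/2 case, where J<sub>y</sub> = σ<sub>y</sub>/2, chooses n = 2<sup>20</sup> so that the power can be formed by repeated squaring, and uses the hypothetical helper names `matmul2` and `finite_rotation`.

```python
from math import cos, sin

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def finite_rotation(beta, doublings=20):
    """Approximate exp(-i beta J_y) for spin 1/2 as (1 - i (beta/n) J_y)^n, n = 2**doublings."""
    n = 2 ** doublings
    eps = beta / n
    # For spin 1/2, J_y = sigma_y / 2, so 1 - i*eps*J_y is the real matrix below.
    u = [[1.0, -eps / 2], [eps / 2, 1.0]]
    for _ in range(doublings):   # repeated squaring: u -> u^2, `doublings` times
        u = matmul2(u, u)
    return u

beta = 1.2
approx = finite_rotation(beta)
# Closed form of exp(-i beta sigma_y / 2)
exact = [[cos(beta / 2), -sin(beta / 2)], [sin(beta / 2), cos(beta / 2)]]
max_error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

For finite n the product differs from the exponential by terms of order β<sup>2</sup>/n, so `max_error` shrinks as `doublings` is increased.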

## **Euler angles**

Generic rotations can be parameterized by Euler angles, which are defined by three successive rotations:


This is inconvenient, because these definitions use two different bases. However, this transformation is actually the same as


The generic rotation of wavefunctions can be represented by D-matrices:

$$D\_{m',m}^{j}(\alpha\beta\gamma) = \langle j,m'|\mathbf{e}^{-\mathbf{i}\alpha J\_z}\mathbf{e}^{-\mathbf{i}\beta J\_y}\mathbf{e}^{-\mathbf{i}\gamma J\_z}|j,m\rangle$$

$$=\mathbf{e}^{-\mathbf{i}(\alpha m'+\gamma m)}d\_{m',m}^{j}(\beta)\tag{2.13}$$

### **Rotation matrix: d<sup>j</sup><sub>m′,m</sub>(β)**

Although the y projection is unchanged, the z direction has changed, so the quantum number m is not the same; it is now projected onto a new z axis. A state |j, m⟩ transforms under a rotation β about the y axis into a linear combination of the 2j + 1 states |j, m′⟩:

$$\mathbf{e}^{-\mathbf{i}\beta J\_y}|j,m\rangle = \sum\_{m'} d\_{m'm}^j(\beta)|j,m'\rangle \tag{2.14}$$

Multiplying by ⟨j, m′| gives

$$d\_{m'm}^{j}(\beta) = \langle j, m'|e^{-\mathrm{i}\beta J\_y}|j, m\rangle \tag{2.15}$$

Calculation of the matrix proceeds as follows:

(1) From inspection of the expansion of e<sup>−iβJ<sub>y</sub></sup>,

$$\mathrm{e}^{-\mathrm{i}\beta J\_y} = 1 - \mathrm{i}\beta J\_y - \frac{1}{2!} \beta^2 J\_y^2 + \frac{\mathrm{i}}{3!} \beta^3 J\_y^3 + \frac{1}{4!} \beta^4 J\_y^4 + \dots$$

(2) Look separately for solutions of J<sub>y</sub><sup>2n+1</sup>|j, m⟩ and J<sub>y</sub><sup>2n</sup>|j, m⟩.

(3) Recall the raising/lowering operators J<sub>±</sub> = J<sub>x</sub> ± iJ<sub>y</sub> and the coefficients C<sup>jm</sup><sub>±</sub> = √[j(j + 1) − m(m ± 1)]. Then

$$J\_{+} - J\_{-} = J\_{x} + \mathrm{i}J\_{y} - J\_{x} + \mathrm{i}J\_{y}$$

$$J\_{y} = -\frac{\mathrm{i}}{2}(J\_{+} - J\_{-})$$

so

$$\begin{aligned} J\_y|1,1\rangle &= -\frac{\mathrm{i}}{2} \left( 0 - C\_-^{11}|1,0\rangle \right) \quad \text{with} \quad C\_+^{11} = 0, \; C\_-^{11} = \sqrt{2} \\ &= \frac{\mathrm{i}}{\sqrt{2}}|1,0\rangle \end{aligned}$$

(4) Operate again with Jy:

$$\begin{aligned} J\_y^2 | 1, 1 \rangle &= \frac{\mathrm{i}}{\sqrt{2}} J\_y | 1, 0 \rangle \\ &= \frac{1}{2\sqrt{2}} (C\_+^{10} | 1, 1 \rangle - C\_-^{10} | 1, -1 \rangle) \\ &= \frac{1}{2} (| 1, 1 \rangle - | 1, -1 \rangle), \qquad \text{since} \quad C\_+^{10} = C\_-^{10} = \sqrt{2} \end{aligned}$$

(5) And again:

$$\begin{aligned} J\_y^3|1,1\rangle &= \frac{1}{2} J\_y(|1,1\rangle - |1,-1\rangle) \\ &= \frac{1}{2} \left[ \frac{-\mathbf{i}}{2} (J\_+ - J\_-)|1,1\rangle - \frac{-\mathbf{i}}{2} (J\_+ - J\_-)|1,-1\rangle \right] \\ &= \frac{\mathbf{i}}{4} \left( \sqrt{2}|1,0\rangle + \sqrt{2}|1,0\rangle \right) \\ &= \frac{\mathbf{i}}{\sqrt{2}}|1,0\rangle \\ &= J\_y|1,1\rangle \end{aligned}$$

(6) Note the cyclical pattern and conclude that

$$\begin{aligned} J\_y^{2n+1}|1,1\rangle &= \frac{\mathbf{i}}{\sqrt{2}}|1,0\rangle\\ J\_y^{2n}|1,1\rangle &= \frac{1}{2}(|1,1\rangle - |1,-1\rangle) \end{aligned}$$

(7) Which then leads, for each specific ⟨j, m′| state, to the following:

$$\begin{aligned} \langle 1, 1 | J\_y^{2n+1} | 1, 1 \rangle &= 0, & \langle 1, 1 | J\_y^{2n} | 1, 1 \rangle &= \frac{1}{2} \\ \langle 1, -1 | J\_y^{2n+1} | 1, 1 \rangle &= 0, & \langle 1, -1 | J\_y^{2n} | 1, 1 \rangle &= -\frac{1}{2} \\ \langle 1, 0 | J\_y^{2n+1} | 1, 1 \rangle &= \frac{\mathrm{i}}{\sqrt{2}}, & \langle 1, 0 | J\_y^{2n} | 1, 1 \rangle &= 0 \end{aligned}$$

(8) With these relations, the d<sup>j</sup><sub>m′m</sub> coefficients are calculated as

$$\begin{aligned} d\_{11}^1 &= \langle 1, 1 | \mathrm{e}^{-\mathrm{i}\beta J\_y} | 1, 1 \rangle \\ &= \langle 1, 1 | 1 + (-\mathrm{i}\beta J\_y) + \frac{1}{2!} (-\mathrm{i}\beta J\_y)^2 + \frac{1}{3!} (-\mathrm{i}\beta J\_y)^3 \\ &\quad + \frac{1}{4!} (-\mathrm{i}\beta J\_y)^4 + \dots \, | 1, 1 \rangle \\ &= 1 - \frac{1}{2!} \frac{\beta^2}{2} + \frac{1}{4!} \frac{\beta^4}{2} - \dots \\ &= \frac{1}{2} \left[ 1 + \left( 1 - \frac{1}{2!} \beta^2 + \frac{1}{4!} \beta^4 - \dots \right) \right] \\ &= \frac{1}{2} (1 + \cos \beta) \end{aligned}$$

$$\begin{aligned} d\_{-11}^1 &= \langle 1, -1 | \mathrm{e}^{-\mathrm{i}\beta J\_y} | 1, 1 \rangle \\ &= \frac{1}{2} \left[ 1 - \left( 1 - \frac{1}{2!} \beta^2 + \frac{1}{4!} \beta^4 - \dots \right) \right] \quad \text{(} \langle 1, -1 | 1 | 1, 1 \rangle = 0 \text{ of course)} \\ &= \frac{1}{2} (1 - \cos \beta) \end{aligned}$$

$$\begin{aligned} d\_{01}^1 &= \langle 1, 0 \vert \mathrm{e}^{-\mathrm{i}\beta J\_y} \vert 1, 1 \rangle \\ &= \langle 1, 0 \vert -\frac{\mathrm{i}}{\sqrt{2}} \mathrm{i}\beta + \frac{\mathrm{i}}{\sqrt{2}} \frac{\mathrm{i}}{3!} \beta^3 + \dots \vert 1, 0 \rangle \\ &= \frac{1}{\sqrt{2}} \left( \beta - \frac{1}{3!} \beta^3 + \dots \right) \\ &= \frac{1}{\sqrt{2}} \sin \beta \\\\ d\_{00}^1 &= \langle 1, 0 \vert \mathrm{e}^{-\mathrm{i}\beta J\_y} \vert 1, 0 \rangle \\ &= -\mathrm{i} \sqrt{2} \langle 1, 0 \vert \mathrm{e}^{-\mathrm{i}\beta J\_y} J\_y \vert 1, 1 \rangle \\ &= \cos \beta \end{aligned}$$
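The series calculation in steps (1) to (8) can be cross-checked by brute force. The following sketch is not from the text: it builds the J<sub>y</sub> matrix from J<sub>y</sub> = −(i/2)(J<sub>+</sub> − J<sub>−</sub>) and sums a truncated power series for the exponential. The function names `jy_matrix`, `matmul`, and `small_d`, the basis ordering m = j, …, −j, and the cut-off of 40 terms are all assumptions made for illustration.

```python
from math import sqrt, cos, sin, isclose

def jy_matrix(j):
    """Matrix of J_y in the basis m = j, j-1, ..., -j, using J_y = -(i/2)(J_+ - J_-)."""
    dim = int(round(2 * j)) + 1
    ms = [j - k for k in range(dim)]
    jy = [[0j] * dim for _ in range(dim)]
    for col, m in enumerate(ms):
        if col > 0:          # J_+|j,m> = C_+ |j,m+1>, one row up
            jy[col - 1][col] += -0.5j * sqrt(j * (j + 1) - m * (m + 1))
        if col < dim - 1:    # J_-|j,m> = C_- |j,m-1>, one row down
            jy[col + 1][col] -= -0.5j * sqrt(j * (j + 1) - m * (m - 1))
    return jy

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def small_d(j, beta, terms=40):
    """d^j(beta) = exp(-i beta J_y), summed as a truncated power series."""
    gen = [[-1j * beta * x for x in row] for row in jy_matrix(j)]
    dim = len(gen)
    result = [[complex(r == c) for c in range(dim)] for r in range(dim)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, gen)]
        result = [[result[r][c] + term[r][c] for c in range(dim)] for r in range(dim)]
    return result

beta = 0.7
d = small_d(1, beta)   # rows and columns ordered m', m = 1, 0, -1
d_11, d_01, d_m11, d_00 = d[0][0].real, d[1][0].real, d[2][0].real, d[1][1].real
```

The four extracted entries correspond to d<sup>1</sup><sub>11</sub>, d<sup>1</sup><sub>01</sub>, d<sup>1</sup><sub>−11</sub>, and d<sup>1</sup><sub>00</sub> and can be compared directly with the closed forms derived above.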

**Example: e<sup>−</sup>e<sup>+</sup> → μ<sup>−</sup>μ<sup>+</sup>**

With reference to Fig. 2.1, the incoming left-handed electron annihilates with the right-handed positron. In the electromagnetic interaction, a photon is exchanged with the outgoing muon pair. As will be discussed in detail in Chapter 6, at high energies, helicity<sup>8</sup> is conserved in this reaction, so the final-state particles must have opposite helicity. The amplitudes are given by

$$\begin{aligned} \mathcal{A}\_{11} &= \mathcal{A}(e\_{\rm L}^{-}e\_{\rm R}^{+} \rightarrow \mu\_{\rm L}^{-}\mu\_{\rm R}^{+}) \propto d\_{1,1}^{1} = \frac{1}{2}(1 + \cos\theta) \\\\ \mathcal{A}\_{1-1} &= \mathcal{A}(e\_{\rm L}^{-}e\_{\rm R}^{+} \rightarrow \mu\_{\rm R}^{-}\mu\_{\rm L}^{+}) \propto d\_{1,-1}^{1} = \frac{1}{2}(1 - \cos\theta) \\\\ \frac{\mathrm{d}\sigma}{\mathrm{d}\cos\theta} &= \left|\mathcal{A}\_{11}\right|^{2} + \left|\mathcal{A}\_{1-1}\right|^{2} \\ &\propto 1 + \cos^{2}\theta \end{aligned}$$

<sup>8</sup>Helicity, σ · **p**/|**p**|, is the projection of the spin along the momentum direction.

**Fig. 2.1** Helicity conservation in e<sup>+</sup>e<sup>−</sup> → μ<sup>+</sup>μ<sup>−</sup>.

**Fig. 2.2** Angular distribution for e<sup>+</sup>e<sup>−</sup> → μ<sup>+</sup>μ<sup>−</sup>, TASSO [55]. Vertical axis: s dσ/dΩ.

The two amplitudes should be of equal intensity because of parity conservation in the electromagnetic interaction. Figure 2.2 shows e<sup>+</sup>e<sup>−</sup> → μ<sup>+</sup>μ<sup>−</sup> data from the TASSO experiment at DESY [55]. It is clear that the angular distribution is not symmetric about cos θ = 0. This asymmetry is evidence for the off-shell influence of the parity-violating Z<sup>0</sup> interfering with the dominant γ exchange.
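The step from the two helicity amplitudes to the 1 + cos<sup>2</sup>θ distribution quoted above can be written out explicitly. As a short worked check, using only the two d-matrix elements already given:

$$\begin{aligned} \left|\mathcal{A}\_{11}\right|^{2} + \left|\mathcal{A}\_{1-1}\right|^{2} &\propto \frac{1}{4}(1 + \cos\theta)^{2} + \frac{1}{4}(1 - \cos\theta)^{2} \\ &= \frac{1}{4}\left(2 + 2\cos^{2}\theta\right) = \frac{1}{2}\left(1 + \cos^{2}\theta\right) \end{aligned}$$

The terms linear in cos θ cancel between the two amplitudes, which is why pure photon exchange gives a distribution symmetric about cos θ = 0.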

## **2.4 Lorentz invariance**

Most high-energy physics requires energy scales E ≫ m<sub>proton</sub>, so it is essential that the requirements of special relativity be respected. In practice, this means identifying suitable 4-vectors and Lorentz invariants. Although the position and direction of particles are important when actually performing experiments, the results are most often derived from knowledge of the energy and momentum of the interacting particles. Here we summarize the essentials and define the Lorentz metric:

• The components of the energy–momentum 4-vector (E, **p**c) of a particle of rest mass m are related by

$$E^2 - |\mathbf{p}|^2 = m^2$$

• This relation also defines the Lorentz metric tensor g<sub>μν</sub> or g<sup>μν</sup>:

g<sup>00</sup> = 1, g<sup>11</sup> = g<sup>22</sup> = g<sup>33</sup> = −1, with all other components = 0

• The scalar product of two 4-vectors X<sup>μ</sup> ≡ (X<sup>0</sup>, **X**) and Y<sup>μ</sup> ≡ (Y<sup>0</sup>, **Y**) is

$$X \cdot Y = X\_{\mu} Y^{\mu} = X^{\mu} Y\_{\mu} = g\_{\mu\nu} X^{\mu} Y^{\nu} = g^{\mu\nu} X\_{\mu} Y\_{\nu}$$

The scalar product of two 4-vectors and hence the length of a 4-vector are Lorentz invariants.

Consider next the relationship between the energy–momentum 4-vectors p = (E, **p**c) and p′ = (E′, **p**′c) of a particle of mass m in two Lorentz frames S and S′, where S′ is moving with a speed β ≡ v/c along the z axis in frame S. The p<sub>x</sub> and p<sub>y</sub> components of 3-momentum, perpendicular to the boost, are unchanged, but the p<sub>z</sub> component along the boost direction and the energy are modified:

$$E' = \gamma (E - \beta p\_z c) \tag{2.16}$$

$$p\_z'c = \gamma(p\_z c - \beta E) \tag{2.17}$$

$$p\_x' = p\_x \tag{2.18}$$

$$p\_y' = p\_y \tag{2.19}$$

where γ = 1/√(1 − β<sup>2</sup>) and β = v/c is the boost. It is convenient to have the transformation expressed in terms of angles with respect to the z axes:

$$E' = \gamma (E - \beta p c \cos \theta) \tag{2.20}$$

$$p'c\cos\theta'=\gamma(pc\cos\theta-\beta E)\tag{2.21}$$

$$p' \sin \theta' = p \sin \theta \tag{2.22}$$
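As a quick numerical illustration of eqns 2.16 to 2.19, the sketch below boosts a 4-momentum along z and checks that E<sup>2</sup> − |**p**|<sup>2</sup> is unchanged. It is a minimal sketch assuming natural units (c = 1); the function names `boost_z` and `mass_squared` and the input numbers are hypothetical choices made for illustration.

```python
from math import sqrt, isclose

def boost_z(E, px, py, pz, beta):
    """Transform (E, p) to a frame moving with velocity beta along z (c = 1)."""
    gamma = 1.0 / sqrt(1.0 - beta * beta)
    return gamma * (E - beta * pz), px, py, gamma * (pz - beta * E)

def mass_squared(E, px, py, pz):
    """Lorentz-invariant length of the 4-momentum, E^2 - |p|^2."""
    return E * E - px * px - py * py - pz * pz

m = 0.14                       # illustrative mass in GeV
px, py, pz = 0.3, -0.2, 5.0    # illustrative momentum components in GeV
E = sqrt(m * m + px * px + py * py + pz * pz)

boosted = boost_z(E, px, py, pz, beta=0.6)
m2_after = mass_squared(*boosted)   # should reproduce m*m up to rounding
```

The transverse components pass through unchanged, while the energy and p<sub>z</sub> mix; the invariant `m2_after` agrees with m<sup>2</sup> to floating-point accuracy.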

## **2.4.1 Invariant variables**

Many high-energy scattering processes can be measured and analysed in terms of two variables: the centre-of-mass energy E<sub>CM</sub> and the scattering or production angle of a 'leading' final-state particle θ<sub>CM</sub>. Consider a generic process a + b → c + X, where a and b are either beam and target or colliding beam particles, c is the leading final-state particle, and X is the, often unmeasured, remainder of the final state. Two invariant variables are useful:

$$s = (p\_a + p\_b)^2, \quad t = (p\_a - p\_c)^2 \tag{2.23}$$

where p<sub>a</sub>, p<sub>b</sub>, and p<sub>c</sub> are the 4-momenta of particles a, b, and c. s is the square of the centre-of-mass energy and t is the square of the 4-momentum transfer. For example, consider the process πp → πX using a pion beam of energy E<sub>π</sub> on a fixed liquid-hydrogen target, for which the energy E′<sub>π</sub> and angle θ of the leading final-state pion are measured. One finds (see Exercise 2.6) that

$$s \equiv E\_{\rm CM}^2 = m\_{\pi}^2 + m\_p^2 + 2m\_p E\_{\pi},\tag{2.24}$$

$$t = 2m\_{\pi}^2 - 2(E\_{\pi}E\_{\pi}' - kk'\cos\theta'),\tag{2.25}$$

where k and k′ are the magnitudes of the 3-momenta of the two pions.

For a high-energy collider with equal-mass particles (e.g. LEP with e<sup>+</sup>e<sup>−</sup> or LHC with pp), s = 4E<sup>2</sup><sub>beam</sub>, ignoring masses. The equivalent fixed-target beam energy required to give the same √s is E<sub>p</sub> ≈ s/(2m<sub>p</sub>); for example, to achieve E<sub>CM</sub> = 7 TeV would require a proton-beam energy of 7<sup>2</sup> × 10<sup>3</sup>/(2 × 0.94) ≈ 2.6 × 10<sup>4</sup> TeV. Clearly, colliders are the most energy-efficient way to reach the highest energies. As already mentioned in Chapter 1 and discussed in more detail in Chapter 3, the key challenge for a collider is to achieve high enough luminosity.
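The collider versus fixed-target comparison can be reproduced in a few lines. This is a hedged sketch, not the authors' code: the function names are hypothetical, energies are in GeV with c = 1, the proton mass is rounded, and the fixed-target formula is eqn 2.24 with the pion replaced by a general beam particle.

```python
M_PROTON = 0.938  # GeV; approximate proton mass, used for illustration

def s_collider(e_beam):
    """s for two equal-energy colliding beams, ignoring masses."""
    return 4.0 * e_beam ** 2

def s_fixed_target(e_beam, m_beam, m_target):
    """s for a beam of energy e_beam on a target at rest (cf. eqn 2.24)."""
    return m_beam ** 2 + m_target ** 2 + 2.0 * m_target * e_beam

def equivalent_fixed_target_energy(sqrt_s, m_beam, m_target):
    """Beam energy on a fixed target that gives the same sqrt(s)."""
    return (sqrt_s ** 2 - m_beam ** 2 - m_target ** 2) / (2.0 * m_target)

sqrt_s = 7000.0   # 7 TeV expressed in GeV
e_equiv = equivalent_fixed_target_energy(sqrt_s, M_PROTON, M_PROTON)
print(f"equivalent fixed-target beam energy: {e_equiv / 1e3:.3g} TeV")
```

Keeping the mass terms changes the answer only at the level of a few GeV, which is why the approximation s/(2m<sub>p</sub>) is adequate at these energies.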

## **2.4.2 Rapidity**

High-energy hadron–hadron interactions tend to produce final states with limited transverse momentum with respect to the initial beam direction. In such circumstances, rapidity y and transverse mass m<sub>T</sub> are convenient variables, where

$$y = \frac{1}{2} \ln \left( \frac{E + p\_z}{E - p\_z} \right), \quad m\_\mathrm{T} = \sqrt{m^2 + p\_\mathrm{T}^2}$$

The 4-momentum p of a particle of mass m, transverse momentum p<sub>T</sub>, and an azimuthal angle φ,

$$p = (E, p\_\mathrm{T} \cos \phi, p\_\mathrm{T} \sin \phi, p\_z)$$

may be described instead as

$$p = (m\_\mathrm{T} \cosh y, p\_\mathrm{T} \cos \phi, p\_\mathrm{T} \sin \phi, m\_\mathrm{T} \sinh y) \tag{2.26}$$

Rapidity has the approximate range (ln(m/2E), ln(2E/m)). A difference in rapidity is an invariant under a Lorentz transformation along the beam direction and rapidities are additive under Lorentz boosts along the beam direction. At high energies, when masses can be ignored, y may be approximated by the pseudorapidity η:

$$y \rightarrow \eta \equiv \frac{1}{2} \ln \left( \frac{1 + \cos \theta}{1 - \cos \theta} \right) = -\ln \tan(\theta/2)$$

where θ is the polar angle. The range of pseudorapidity is (−∞, ∞). Note that η = 0 (or y = 0) is perpendicular to the beam line and large |η| (or |y|) is close to the beam line. In high-energy hadron–hadron scattering, it is observed that particle production is roughly uniform in units of pseudorapidity.

A useful relation follows from the Jacobian of the transformation from Cartesian to rapidity momentum components:

$$\frac{\mathbf{d}^3 \mathbf{p}}{E} = p\_\mathbf{T} \,\mathrm{d}p\_\mathbf{T} \,\mathrm{d}\phi \,\mathrm{d}y \equiv \mathrm{d}^2 p\_\mathbf{T} \,\mathrm{d}y \to \pi \,\mathrm{d}p\_\mathbf{T}^2 \,\mathrm{d}y \approx \pi \,\mathrm{d}p\_\mathbf{T}^2 \,\mathrm{d}\eta\tag{2.27}$$

where for the last two expressions the azimuthal angle has been integrated out.
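The statements about rapidity made in this section can be checked numerically. The sketch below is illustrative only: the function names and input values are hypothetical, and c = 1 throughout. It builds a 4-momentum from eqn 2.26, boosts it along the beam direction, and compares rapidity with pseudorapidity for a massless particle.

```python
from math import log, tan, sqrt, cos, sin, cosh, sinh, atanh, isclose

def rapidity(E, pz):
    return 0.5 * log((E + pz) / (E - pz))

def pseudorapidity(theta):
    return -log(tan(theta / 2))

def four_momentum(m, pt, phi, y):
    """Eqn 2.26: 4-momentum from mass, transverse momentum, azimuth, and rapidity."""
    mt = sqrt(m * m + pt * pt)   # transverse mass
    return (mt * cosh(y), pt * cos(phi), pt * sin(phi), mt * sinh(y))

def boost_z(p, beta):
    """Lorentz boost along the beam (z) direction, as in eqns 2.16 to 2.19."""
    E, px, py, pz = p
    gamma = 1.0 / sqrt(1.0 - beta * beta)
    return (gamma * (E - beta * pz), px, py, gamma * (pz - beta * E))

p = four_momentum(m=0.14, pt=0.5, phi=0.3, y=1.1)
y_before = rapidity(p[0], p[3])
p_boosted = boost_z(p, 0.4)
shift = rapidity(p_boosted[0], p_boosted[3]) - y_before   # same for every particle

theta = 0.8
y_massless = rapidity(1.0, cos(theta))    # massless particle with E = |p| = 1
eta = pseudorapidity(theta)
```

Under the boost every rapidity shifts by the same constant, −artanh β, so rapidity differences are unchanged; for the massless case `y_massless` and `eta` coincide.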

## **2.5 Transitions and observables**

Particle physics experiments often involve measuring decay rates or scattering cross sections—processes that involve transitions from one state to another. For a particle of total width Γ, the lifetime is given by τ = 1/Γ and the number of particles decays exponentially:

$$N(t) = N(0)\mathbf{e}^{-t/\tau} = N(0)\mathbf{e}^{-t\Gamma} \tag{2.28}$$

where N(0) and N(t) are the numbers of particles at times 0 and t, respectively.<sup>9</sup> Often a particle will decay to many final states, so it is useful to define Γ<sub>i</sub>, the partial decay rate to final state i. The total rate is then given by Γ = Σ<sub>i</sub> Γ<sub>i</sub>, where the sum runs over all final states.

<sup>9</sup>With ħ = c = 1, both time and length have dimensions of energy<sup>−1</sup> and τΓ ∼ 1 is the time–energy uncertainty relation.

Similarly, the attenuation of a particle beam of flux I(x) is given by

$$I(x) = I(0)e^{-x/\ell} \tag{2.29}$$

where ℓ is the collision length<sup>10</sup> given by ℓ = 1/Nσ for a target of number density N and scattering cross section σ. The cross section has physical dimensions of area, so [energy]<sup>−2</sup> here. For a thin target of thickness δx, the beam attenuation is given by δI ∼ I(0)Nσδx.

<sup>10</sup>Familiar from the classical kinetic theory of gases as the mean free path.
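The relations between widths, lifetime, and attenuation can be put into a short numerical sketch. Everything here is illustrative: the partial widths are made-up numbers, the function names are hypothetical, and the value of ħ in GeV s is an approximate constant introduced only to convert a width into a lifetime in seconds.

```python
from math import exp, isclose

HBAR_GEV_S = 6.582e-25  # hbar in GeV s (approximate); converts 1/GeV to seconds

def total_width(partial_widths):
    """Gamma = sum over final states of the partial widths Gamma_i."""
    return sum(partial_widths)

def lifetime_seconds(partial_widths_gev):
    """tau = 1/Gamma, converted from natural units to seconds."""
    return HBAR_GEV_S / total_width(partial_widths_gev)

def surviving_fraction(t, tau):
    """N(t)/N(0) from eqn 2.28."""
    return exp(-t / tau)

def transmitted_fraction(x, number_density, sigma):
    """I(x)/I(0) from eqn 2.29 with collision length 1/(N sigma)."""
    return exp(-x * number_density * sigma)

partial_widths = [0.5, 1.5]    # hypothetical partial widths in GeV
tau = lifetime_seconds(partial_widths)
fractions = [g / total_width(partial_widths) for g in partial_widths]

# Thin-target limit: the absorbed fraction approaches N * sigma * dx
absorbed = 1.0 - transmitted_fraction(1e-3, 1.0, 1.0)
```

The list `fractions` gives the share of decays going to each final state, and `absorbed` differs from Nσδx = 10<sup>−3</sup> only at second order in the thickness.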

In general, the S matrix describing a scattering or decay A → B is written as S = 1 + iT and the reduced matrix element ℳ(B : A) is defined by

$$S\_{\rm fi} \equiv \langle B|\mathrm{i}T|A\rangle = \mathrm{i}(2\pi)^4 \delta^4(p\_A - p\_B) \mathcal{M}(B:A) \tag{2.30}$$

where p<sub>A</sub> and p<sub>B</sub> are the total 4-momenta of states A and B.

## **2.5.1 Phase space and decay rates**

Transitions to a final state |B⟩ from an initial state |A⟩ are calculated from Fermi's Golden Rule:

$$\Gamma\_{\rm fi} = W\_{\rm fi} = 2\pi \underbrace{\left| \mathcal{T}\_{\rm fi} \right|^2}\_{\text{dynamic}} \times \underbrace{\rho(E\_{\rm f})}\_{\text{kinematic}} \tag{2.31}$$

where

• 𝒯<sub>fi</sub> is the transition matrix element,

$$\mathcal{T}\_{\rm fi} = \langle B|V|A\rangle \tag{2.32}$$

with |A⟩ and |B⟩ the initial and final states interacting via a potential V;

• ρ(E<sub>f</sub>) is the phase-space factor.

## **The phase-space factor ρ(E<sub>f</sub>)**

This is the number of states available per unit of energy in the final state. It is important because it connects the physics contained in the matrix element to observable quantities. First, we shall review its calculation using non-relativistic quantum mechanics (NRQM) and explain why this is not appropriate for use in high-energy physics.

## **Non-relativistic quantum mechanics**

In NRQM, the calculation proceeds as follows. Imagine a cube of sides L containing one of the final-state particles with quantized momentum,<sup>11</sup> p = k and k = 2πn/L, where n is an integer. Then

$$\begin{aligned} p\_x &= \frac{2\pi n\_x}{L}, & p\_y &= \frac{2\pi n\_y}{L}, & p\_z &= \frac{2\pi n\_z}{L} \\ \text{or} \quad n\_x &= \frac{Lp\_x}{2\pi}, & n\_y &= \frac{Lp\_y}{2\pi}, & n\_z &= \frac{Lp\_z}{2\pi} \end{aligned}$$

<sup>11</sup>With ħ = 1 and c = 1.

Each (px, py, pz) momentum state resides in the elemental volume in momentum space, (2π/L)<sup>3</sup> = (2π)<sup>3</sup>/V , so that

$$\text{total phase space} = \frac{(2\pi)^3}{V} N\_1$$

where N<sup>1</sup> is the number of momentum states available to one particle. Normalizing to one particle per spatial elemental volume and rewriting 'total phase space' as the integral over all momenta, we have

$$N\_1 = \frac{1}{(2\pi)^3/V} \int \mathrm{d}p\_x \, \mathrm{d}p\_y \, \mathrm{d}p\_z \bigg/ V = \frac{1}{(2\pi)^3} \int \mathrm{d}^3 \mathbf{p}$$

Next, we scale up to n particles:

$$N\_{n-1} = \frac{1}{(2\pi)^{3(n-1)}} \int \prod\_{j=1}^{n-1} \mathrm{d}^3 \mathbf{p}\_j$$

from which<sup>12</sup> the density of states, the number of states per unit energy, is

$$\rho(E) = \frac{\mathrm{d}N\_{n-1}}{\mathrm{d}E} = \frac{1}{(2\pi)^{3(n-1)}} \frac{\mathrm{d}}{\mathrm{d}E} \int \prod\_{j=1}^{n-1} \mathrm{d}^3 \mathbf{p}\_j \tag{2.33}$$

<sup>12</sup>As total momentum conservation constrains the nth particle, the number of available states is that of n − 1 free particles.

However, this is not satisfactory for high-energy physics, since we need to take into account the Lorentz contraction of the volume element in the usual NRQM wavefunction normalization.

## **dLips**

The problem is solved by changing the quantum state normalization. Instead of normalizing to one particle per unit (spatial) volume, ∫|φ|<sup>2</sup> dV = 1, we use

$$\int |\phi|^2 \,\mathrm{d}V = 2E,\quad \text{for a particle of energy } E$$

For a particle of mass m with 4-momentum p = (E, **p**) and spin (or helicity) λ, this corresponds to a momentum-space normalization of

$$
\langle p', \lambda' | p, \lambda \rangle = 2E \delta\_{\lambda \lambda'} \delta^3(\mathbf{p} - \mathbf{p}')
$$

A useful identity, which shows the manifest Lorentz covariance of this choice, is

$$\frac{\mathbf{d}^3 \mathbf{p}}{2E} = \theta(E)\delta(p^2 - m^2) \,\mathrm{d}^4 p$$

Before giving the expressions for decay rates and cross sections, there is a somewhat tricky mathematical difficulty to be handled. In eqn 2.30, we have a δ<sup>4</sup>(·) from overall 4-momentum conservation, which will be squared in the calculation. Formally, this is dealt with by using the identity

$$(2\pi)^4 \delta^4(p\_\mathbf{f} - p\_\mathbf{i}) = \int \mathbf{e}^{\mathbf{i}x(p\_\mathbf{f} - p\_\mathbf{i})} \, \mathbf{d}^4x \tag{2.34}$$

to replace one of the δ<sup>4</sup>(·) functions and then using p<sub>f</sub> = p<sub>i</sub> plus the other δ<sup>4</sup>(·) to give ∫d<sup>4</sup>x = VT. These VT factors cancel with those that occur in the use of Fermi's Golden Rule, in which the normalized transition rate per unit volume and unit time appears: w<sub>fi</sub> = |S<sub>fi</sub>|<sup>2</sup>/VT.

## **2.5.2 Decay rate**

The partial decay rate of a state of mass M, at rest, into n particles is related to the reduced matrix element ℳ by

$$\mathrm{d}\Gamma(M:m\_1,\ldots,m\_n) = \frac{(2\pi)^4}{2M} |\mathcal{M}|^2 \,\mathrm{dLips}(P; p\_1,\ldots,p\_n) \tag{2.35}$$

where dLips is the n-body phase space given by

$$\mathrm{dLips}(P; p\_1, \ldots, p\_n) = \delta^4 \left( P - \sum\_{i=1}^n p\_i \right) \prod\_{i=1}^n \frac{\mathrm{d}^3 \mathbf{p}\_i}{(2\pi)^3 2E\_i} \tag{2.36}$$

## **Two-body decay rate**

The simplest case is a two-body decay X → a + b, where the masses of X, a, and b are M, m<sub>a</sub>, and m<sub>b</sub>, respectively. As the final-state integral is Lorentz-invariant, we are free to choose any frame. The rest frame of M is most convenient:<sup>13</sup> P<sub>X</sub> = (M, **0**), p<sub>a</sub> = (E<sub>a</sub>, **p**<sub>a</sub>), p<sub>b</sub> = (E<sub>b</sub>, **p**<sub>b</sub>). We have

$$\begin{split} \Gamma\_{\rm fi} &= \frac{(2\pi)^4}{2M} \int |\mathcal{M}\_{\rm fi}|^2 \frac{\mathrm{d}^3 \mathbf{p}\_a}{(2\pi)^3 (2E\_a)} \frac{\mathrm{d}^3 \mathbf{p}\_b}{(2\pi)^3 (2E\_b)} \\ &\times \delta^3(\mathbf{p}\_a + \mathbf{p}\_b) \delta(M - E\_a - E\_b) \end{split}$$

<sup>13</sup>It is the CMS frame of the decay products.

Gathering constants and using the δ<sup>3</sup>(·) to remove one of the 3-momentum integrals, we obtain

$$\Gamma\_{\rm fi} = \frac{1}{32\pi^2 M} \int |\mathcal{M}\_{\rm fi}|^2 \frac{\mathrm{d}^3 \mathbf{p}\_a}{E\_a E\_b} \delta(M - E\_a - E\_b)$$

In the CMS, we take **p**<sub>a</sub> = **k** and **p**<sub>b</sub> = −**k** and, using polar coordinates, we have

$$\operatorname{d}^3 \mathbf{p}\_a \mapsto k^2 \operatorname{d}k \sin \theta \,\operatorname{d}\theta \,\operatorname{d}\phi \mapsto k^2 \operatorname{d}k \,\operatorname{d}\Omega,\quad \text{where } k = |\mathbf{k}|$$

This gives

$$\Gamma\_{\rm fi} = \frac{1}{32\pi^2 M} \int |\mathcal{M}\_{\rm fi}|^2 \frac{k^2 \,\mathrm{d}k \,\mathrm{d}\Omega}{E\_a E\_b} \delta(M - E\_a - E\_b) \tag{2.37}$$

Since **p**<sub>b</sub> = −**p**<sub>a</sub> = −**k**, E<sub>a</sub><sup>2</sup> = m<sub>a</sub><sup>2</sup> + k<sup>2</sup>, E<sub>b</sub><sup>2</sup> = m<sub>b</sub><sup>2</sup> + k<sup>2</sup>, and eqn 2.37 becomes

$$\Gamma\_{\rm fi} = \frac{1}{32\pi^2 M} \int |\mathcal{M}\_{\rm fi}|^2 g(k) \delta(f(k)) \,\mathrm{d}k \,\mathrm{d}\Omega$$

where

$$\begin{aligned} g(k) &= \frac{k^2}{E\_a E\_b} \\ f(k) &= M - \sqrt{m\_a^2 + k^2} - \sqrt{m\_b^2 + k^2} \end{aligned}$$

Denoting by p<sup>∗</sup> the value of k = |**p**<sub>a</sub>| that satisfies energy conservation, f(p<sup>∗</sup>) = 0, the integral over k can be evaluated to give<sup>14</sup>

$$\begin{aligned} \int g(k)\delta(f(k)) \, \mathrm{d}k &= \left( \left| \frac{\mathrm{d}f}{\mathrm{d}k} \right|\_{p^\*} \right)^{-1} \underbrace{\int g(k)\delta(k-p^\*) \, \mathrm{d}k}\_{g(p^\*)} \\ &= \left( \left| \frac{\mathrm{d}f}{\mathrm{d}k} \right|\_{p^\*} \right)^{-1} \frac{(p^\*)^2}{E\_a E\_b} \end{aligned}$$

<sup>14</sup>Using the relationship

$$\delta(f(x)) = \left( \left| \frac{\mathrm{d}f}{\mathrm{d}x} \right|\_{a} \right)^{-1} \delta(x-a), \quad \text{where } f(a) = 0$$

The inverted modulus term is obtained as follows:

$$\frac{\mathrm{d}f}{\mathrm{d}k} = -\frac{k}{\sqrt{m\_a^2 + k^2}} - \frac{k}{\sqrt{m\_b^2 + k^2}} = -\frac{k}{E\_a} - \frac{k}{E\_b}$$

$$= -k \frac{E\_a + E\_b}{E\_a E\_b}$$

so

$$\left( \left| \frac{\mathrm{d}f}{\mathrm{d}k} \right|\_{p^\*} \right)^{-1} = \frac{1}{p^\*} \frac{E\_a E\_b}{E\_a + E\_b} $$

We then have

$$\Gamma\_{\rm fi} = \frac{1}{32\pi^2 M} \frac{E\_a E\_b}{p^\*(E\_a + E\_b)} \frac{(p^\*)^2}{E\_a E\_b} \int |\mathcal{M}\_{\rm fi}|^2 \,\mathrm{d}\Omega$$

By energy conservation, E<sup>a</sup> + E<sup>b</sup> = M, so, finally,

$$
\Gamma\_{\rm fi} = \frac{p^\*}{32\pi^2 M^2} \int |\mathcal{M}\_{\rm fi}|^2 \,\mathrm{d}\Omega,\tag{2.38}
$$

where the angular integral is over the solid angle of particle a.

## **Two-body decay kinematics**

First, we calculate p<sup>∗</sup> from the original condition on the δ-function, f(k) = 0:

$$M = \sqrt{m\_a^2 + p^{\*2}} + \sqrt{m\_b^2 + p^{\*2}}.$$

Some straightforward algebra yields<sup>15</sup>

$$p^\* = \frac{1}{2M}\sqrt{[M^2 - (m\_a + m\_b)^2][M^2 - (m\_a - m\_b)^2]}.\tag{2.39}$$

<sup>15</sup>Try Exercise 2.5 at the end of the chapter for an alternative derivation of these results.

Then, using E<sup>2</sup> = p<sup>2</sup> + m<sup>2</sup>, the energies E<sub>a</sub> and E<sub>b</sub> are given by

$$E\_a = \frac{M^2 + m\_a^2 - m\_b^2}{2M}, \quad E\_b = \frac{M^2 + m\_b^2 - m\_a^2}{2M} \tag{2.40}$$

Note that energy and momentum conservation fix both the momenta and energies of the particles in a two-body decay. This is most easily seen in the rest frame of the decaying particle (or equivalently in the CMS frame of the decay products)—the only freedom is an azimuthal rotation about the common axis of the momenta of the two decay products.
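Eqns 2.38 to 2.40 translate directly into code. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the masses are rounded illustrative values in GeV, and `isotropic_width` assumes a matrix element that does not depend on angle, so that the solid-angle integral in eqn 2.38 is simply 4π.

```python
from math import sqrt, pi, isclose

def two_body_momentum(M, ma, mb):
    """p* from eqn 2.39."""
    return sqrt((M * M - (ma + mb) ** 2) * (M * M - (ma - mb) ** 2)) / (2.0 * M)

def two_body_energies(M, ma, mb):
    """(E_a, E_b) from eqn 2.40."""
    return ((M * M + ma * ma - mb * mb) / (2.0 * M),
            (M * M + mb * mb - ma * ma) / (2.0 * M))

def isotropic_width(M, ma, mb, matrix_element_sq):
    """Eqn 2.38 for an angle-independent |M|^2: Gamma = p* |M|^2 / (8 pi M^2)."""
    return two_body_momentum(M, ma, mb) * matrix_element_sq / (8.0 * pi * M * M)

M, ma, mb = 0.498, 0.140, 0.135     # illustrative masses in GeV
p_star = two_body_momentum(M, ma, mb)
Ea, Eb = two_body_energies(M, ma, mb)
```

The outputs satisfy E<sub>a</sub> + E<sub>b</sub> = M and E<sup>2</sup> − p<sup>∗2</sup> = m<sup>2</sup> for each daughter, which is a convenient sanity check when the masses are unequal.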

## **Three-body decays**

For a three-body decay, there are no longer enough constraints from 4-momentum conservation to determine the energies of the decay products, even in the rest frame of the parent particle. However, there are limitations that can be understood most easily by considering one decay product at rest with the other two particles then sharing the available energy as in a two-body decay. The most elegant approach to three-body decays is the Dalitz plot,<sup>16</sup> a two-dimensional plot of either two decay particle energies or two decay particle invariant mass pairs. We shall use the latter, since the result is then manifestly Lorentz-invariant.

<sup>16</sup>The Dalitz plot is named after R. H. Dalitz, who invented it for the study of K<sup>0</sup> → 3π decays.

• Write the Lorentz sum of m<sub>1</sub> and m<sub>2</sub>, M<sub>12</sub>, explicitly in terms of E<sub>3</sub> and p<sub>3</sub>:

$$\begin{aligned} M\_{12}^2 &= (E\_1 + E\_2)^2 - (p\_1 + p\_2)^2 \\ &= (M - E\_3)^2 - p\_3^2 \end{aligned}$$

• Identify m<sub>3</sub><sup>2</sup>:

$$M\_{12}^2 = M^2 - 2ME\_3 + E\_3^2 - p\_3^2$$

$$= M^2 + m\_3^2 - 2ME\_3 \tag{2.41}$$

• Differentiate with respect to E<sub>3</sub>:

$$\text{d}(M\_{12}^2) \propto \text{d}E\_3 \tag{2.42}$$

• By similar reasoning,

$$\operatorname{d}(M\_{23}^2) \propto \operatorname{d}E\_1$$

Next, consider the infinitesimal three-body phase-space element dρ3, ignoring constant factors:

• As in the two-body case, one of the integrals over d<sup>3</sup>**p**<sub>i</sub> is removed using the 3-momentum δ<sup>3</sup>(·) function:

$$\mathrm{d}\rho\_3 \propto \frac{\mathrm{d}^3 \mathbf{p}\_1 \, \mathrm{d}^3 \mathbf{p}\_2}{E\_1 E\_2 E\_3} \, \delta \left( E\_1 + E\_2 + E\_3 - M \right),$$

• Change to spherical coordinates:

$$\mathrm{d}^3 \mathbf{p}\_1 \, \mathrm{d}^3 \mathbf{p}\_2 = \mathrm{d}p\_1 \, p\_1 \, \mathrm{d}\theta\_1 \, p\_1 \sin \theta\_1 \, \mathrm{d}\phi\_1 \, \mathrm{d}p\_2 \, p\_2 \, \mathrm{d}\theta\_2 \, p\_2 \sin \theta\_2 \, \mathrm{d}\phi\_2$$

• Redefine the solid angle:

$$\begin{aligned} \mathrm{d}\Omega\_1 \,\mathrm{d}\Omega\_2 &= \sin\theta\_1 \,\mathrm{d}\theta\_1 \,\sin\theta\_2 \,\mathrm{d}\theta\_2 \,\mathrm{d}\phi\_1 \,\mathrm{d}\phi\_2\\ &= \sin\theta\_1 \,\mathrm{d}\theta\_1 \,\sin\theta\_{12} \,\mathrm{d}\theta\_{12} \,\mathrm{d}\phi\_1 \,\mathrm{d}\phi\_2 \end{aligned}$$

Then

$$d^3p\_1d^3p\_2 = p\_1^2p\_2^2dp\_1dp\_2\sin\theta\_1d\theta\_1\sin\theta\_{12}d\theta\_{12}d\phi\_1d\phi\_2$$

Simplify, noting that one part of the integration is trivial:

$$\int \mathrm{d}^3 \mathbf{p}\_1 \, \mathrm{d}^3 \mathbf{p}\_2 = 8\pi^2 \int p\_1^2 p\_2^2 \, \mathrm{d}p\_1 \, \mathrm{d}p\_2 \sin \theta\_{12} \, \mathrm{d}\theta\_{12}$$

• Consider the momentum-squared of the back-to-back systems, **p**<sub>3</sub> versus **p**<sub>1</sub> + **p**<sub>2</sub>:

$$\mathbf{p}\_3^2 = (\mathbf{p}\_1 + \mathbf{p}\_2)^2 = p\_1^2 + p\_2^2 + 2p\_1p\_2\cos\theta\_{12}$$

• For a given |**p**<sub>1</sub>| and |**p**<sub>2</sub>|, p<sub>3</sub> depends only on θ<sub>12</sub>:

$$\begin{aligned} 2p\_3 \operatorname{d}p\_3 &= -2p\_1 p\_2 \sin \theta\_{12} \operatorname{d}\theta\_{12}, \\ \sin \theta\_{12} \operatorname{d}\theta\_{12} &\propto \frac{p\_3}{p\_1 p\_2} \operatorname{d}p\_3 \end{aligned}$$

Put this result into dρ3, giving

$$\mathrm{d}\rho\_3 \propto \frac{1}{E\_1 E\_2 E\_3} p\_1^2 p\_2^2 \,\mathrm{d}p\_1 \,\mathrm{d}p\_2 \frac{p\_3}{p\_1 p\_2} \,\mathrm{d}p\_3 \,\delta(E\_1 + E\_2 + E\_3 - M)$$

$$\propto \frac{p\_1 p\_2 p\_3}{E\_1 E\_2 E\_3} \,\mathrm{d}p\_1 \,\mathrm{d}p\_2 \,\mathrm{d}p\_3 \,\delta(E\_1 + E\_2 + E\_3 - M)$$

• Next, change variables, p<sub>i</sub> dp<sub>i</sub> = E<sub>i</sub> dE<sub>i</sub>, and remove one term using the δ-function:

$$\mathrm{d}\rho\_3 \propto \mathrm{d}E\_1 \, \mathrm{d}E\_2 \, \mathrm{d}E\_3 \, \delta(E\_1 + E\_2 + E\_3 - M) \;\longrightarrow\; \mathrm{d}E\_1 \, \mathrm{d}E\_3$$

• Finally, from eqn 2.42, d(M<sup>2</sup><sub>12</sub>) ∝ dE<sub>3</sub> and d(M<sup>2</sup><sub>23</sub>) ∝ dE<sub>1</sub>:

$$\boxed{\mathrm{d}\rho\_3 \propto \mathrm{d}(M\_{12}^2) \, \mathrm{d}(M\_{23}^2)} \tag{2.43}$$
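The linear relation between the invariant mass squared and the energy of the third particle, which underlies d(M<sup>2</sup><sub>12</sub>) ∝ dE<sub>3</sub>, can be checked numerically. The sketch below assumes the rest-frame relation M<sup>2</sup><sub>12</sub> = M<sup>2</sup> + m<sub>3</sub><sup>2</sup> − 2ME<sub>3</sub> (taken here to be the content of eqn 2.42); the function name and the masses are illustrative only.

```python
def m12_squared(M, m3, E3):
    """Invariant mass squared of particles 1 and 2 in the parent rest frame.

    Assumes (P - p3)^2 = M^2 + m3^2 - 2*M*E3 with P = (M, 0, 0, 0).
    """
    return M**2 + m3**2 - 2.0 * M * E3

# Illustrative values in GeV (not taken from the text).
M, m3 = 1.0, 0.1
step = 0.01
slopes = [
    (m12_squared(M, m3, E3 + step) - m12_squared(M, m3, E3)) / step
    for E3 in (0.15, 0.25, 0.35)
]
print(slopes)  # every slope equals -2*M up to rounding
```

Because the slope is the same constant everywhere, equal intervals in E<sub>3</sub> map onto equal intervals in M<sup>2</sup><sub>12</sub>, which is why a density that is flat in (E<sub>1</sub>, E<sub>3</sub>) is also flat in the Dalitz variables.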

Figure 2.3 shows a Dalitz plot. The importance of the Dalitz plot is contained in this result: the phase space available in a three-body decay is uniform across the plot, here for M<sup>2</sup><sub>12</sub> versus M<sup>2</sup><sub>23</sub>. This means that dynamical structure, for example a resonance in one invariant mass pair, shows up as a region of higher or lower density in the Dalitz plot. These regions can then be seen clearly as peaks and troughs in appropriate one-dimensional projections. An example from the BaBaR experiment [41] at the PEP-II e<sup>+</sup>e<sup>−</sup> storage ring is shown in Fig. 2.4. The Dalitz plot for D<sup>+</sup><sub>s</sub> → π<sup>+</sup>π<sup>−</sup>π<sup>+</sup> is presented in terms of two π–π invariant mass combinations. The event distribution is clearly non-uniform, with two narrow ππ resonant bands at invariant masses corresponding to the f<sub>0</sub>(980) state. What might cause the accumulation of events around 1.9 GeV<sup>2</sup> on the diagonal?

**Fig. 2.4** D<sup>+</sup><sub>s</sub> → π<sup>+</sup>π<sup>−</sup>π<sup>+</sup> Dalitz plot from the BaBaR experiment [41]. Resonant bands in the two π<sup>+</sup>π<sup>−</sup> invariant mass distributions can be seen clearly.

## **2.5.3 Cross section**

To calculate the total cross section for a collider process a + b → X, we start from Fermi's golden rule:

$$\text{flux} \times \sigma(ab \to X) = \int w\_{\text{fi}} \times \text{[final-state phase space]}$$

where w<sub>fi</sub> = |S<sub>fi</sub>|<sup>2</sup>/VT is the transition rate and VT is the total space–time volume. In a colliding-beam experiment, the initial-state particle flux is 2E<sub>a</sub>2E<sub>b</sub>|**v**<sub>a</sub> − **v**<sub>b</sub>|, where **v**<sub>i</sub> are the particle velocities.<sup>17</sup> Inserting the expression from eqn 2.30 for S<sub>fi</sub> in terms of the reduced matrix element into the above equation gives rise to the square of the overall 4-momentum conservation delta function. Formally, this is handled in the same way as for the decay rate calculation (eqn 2.34). The identity

$$(2\pi)^4 \delta^4(p\_\mathrm{f} - p\_\mathrm{i}) = \int \mathrm{e}^{\mathrm{i}x\cdot(p\_\mathrm{f} - p\_\mathrm{i})} \, \mathrm{d}^4x$$

is used to replace one of the δ<sup>4</sup>(·); then performing the integration with p<sub>f</sub> = p<sub>i</sub> on account of the other δ<sup>4</sup>(·) gives ∫d<sup>4</sup>x = VT. The VT factors then cancel to give

$$\sigma(ab \to X) = \frac{1}{2E\_a 2E\_b |\mathbf{v}\_a - \mathbf{v}\_b|} \int \mathrm{dLips}(s:X) \, |\mathcal{M}(X:ab)|^2$$

where dLips is the Lorentz-invariant phase space, which for a final state with n<sub>X</sub> particles is

$$\mathrm{dLips}(s:X) = (2\pi)^4 \delta^4(p\_X - p\_a - p\_b) \prod\_{i=1}^{n\_X} \frac{\mathrm{d}^3 \mathbf{k}\_i}{(2\pi)^3 2k\_i^0},$$

where s = (p<sub>a</sub> + p<sub>b</sub>)<sup>2</sup> and p<sub>X</sub> = Σ<sub>i</sub> k<sub>i</sub>. For a total cross section to all final states X, an additional sum Σ<sub>X</sub> is performed. For the unpolarized cross section for particles with spin, final spin states are summed and initial states averaged. This gives an additional term, 1/[(2S<sub>a</sub> + 1)(2S<sub>b</sub> + 1)], on the right-hand side, where S<sub>a</sub> and S<sub>b</sub> are the spins of particles a and b. If a differential cross section is required, then the relevant variables are excluded from the phase-space integral.

<sup>17</sup>The energy factors in the flux definition are a consequence of the Lorentz-invariant normalization.

## **Flux factor**

The flux factor 2E<sub>a</sub>2E<sub>b</sub>|**v**<sub>a</sub> − **v**<sub>b</sub>| is a Lorentz invariant and may be written in a number of ways. In the fixed-target frame, with b at rest, it is 2E<sub>a</sub>2m<sub>b</sub>|**v**<sub>a</sub>|. The following relations also hold:

$$2E\_a 2E\_b |\mathbf{v}\_a - \mathbf{v}\_b| = 4\sqrt{(p\_a \cdot p\_b)^2 - m\_a^2 m\_b^2} = 2\sqrt{\lambda(s, m\_a^2, m\_b^2)}$$

where

$$
\lambda(s, m\_a^2, m\_b^2) = (s - m\_a^2 - m\_b^2)^2 - 4m\_a^2 m\_b^2
$$
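As a quick consistency check, the fixed-target form of the flux factor can be compared with the invariant form 2√λ. The following is a minimal sketch in natural units (c = 1); the function names and the beam and target values are illustrative assumptions, not taken from the text.

```python
import math

def kallen(s, ma2, mb2):
    """lambda(s, ma^2, mb^2) as defined in the text."""
    return (s - ma2 - mb2) ** 2 - 4.0 * ma2 * mb2

def flux_fixed_target(E_a, m_a, m_b):
    """2 E_a 2 m_b |v_a| for particle b at rest, with |v_a| = p_a / E_a."""
    p_a = math.sqrt(E_a**2 - m_a**2)
    return 2.0 * E_a * 2.0 * m_b * (p_a / E_a)

def flux_invariant(E_a, m_a, m_b):
    """2 sqrt(lambda), using s = m_a^2 + m_b^2 + 2 E_a m_b for b at rest."""
    s = m_a**2 + m_b**2 + 2.0 * E_a * m_b
    return 2.0 * math.sqrt(kallen(s, m_a**2, m_b**2))

# Illustrative case: a 10 GeV beam particle of mass 0.140 GeV
# on a target of mass 0.938 GeV at rest.
E_a, m_a, m_b = 10.0, 0.140, 0.938
print(flux_fixed_target(E_a, m_a, m_b), flux_invariant(E_a, m_a, m_b))
```

Both expressions reduce to 4m<sub>b</sub>p<sub>a</sub> in this frame, so the two printed numbers agree to rounding.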

## **Two-body scattering**

Consider the special case of two-body scattering **a** + **b** → **a**′ + **b**′ with 4-momenta p<sub>a</sub> = p<sub>1</sub>, p<sub>b</sub> = p<sub>2</sub>, p<sub>a′</sub> = p<sub>3</sub>, p<sub>b′</sub> = p<sub>4</sub>. The cross section in the fixed-target frame (particle b at rest) is given by

$$\mathrm{d}\sigma = \frac{|\mathcal{M}|^2}{2E\_1 2m\_2 |\mathbf{v}\_1|} \,\mathrm{d}\mathrm{Lips}(s; p\_3, p\_4) \tag{2.44}$$

where in this case

$$\text{dLips}(s; p\_3, p\_4) = (2\pi)^4 \delta^4(p\_3 + p\_4 - p\_1 - p\_2) \frac{\text{d}^3 \mathbf{p}\_3}{2E\_3 (2\pi)^3} \frac{\text{d}^3 \mathbf{p}\_4}{2E\_4 (2\pi)^3}$$

with s = (p<sub>3</sub> + p<sub>4</sub>)<sup>2</sup> = (p<sub>1</sub> + p<sub>2</sub>)<sup>2</sup>. dLips may be evaluated by using the δ<sup>4</sup>(·) to integrate over four of the six variables d<sup>3</sup>**p**<sub>3</sub> d<sup>3</sup>**p**<sub>4</sub>, and doing this in the CMS frame gives<sup>18</sup>

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \frac{1}{(8\pi w)^2} |\mathcal{M}|^2 \frac{p'}{p} \tag{2.45}$$

where w = √s is the total CMS energy and p and p′ are the magnitudes of the initial and final CMS 3-momenta, respectively. The above cross section has dimensions of [energy]<sup>−2</sup>. To get back to physical units of area, one must multiply by (ħc)<sup>2</sup> = 0.389 mb GeV<sup>2</sup>.

<sup>18</sup>Exercise 2.9 covers this calculation.
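The unit conversion is easy to get wrong in practice, so a small sketch may help. It evaluates eqn 2.45 in natural units and converts the result to millibarns with the factor quoted above. The constant matrix element, the energies, and the function names are purely illustrative assumptions.

```python
import math

HBARC2_MB_GEV2 = 0.389  # (hbar c)^2 in mb GeV^2, as quoted in the text

def dsigma_domega(M2, w, p_in, p_out):
    """Eqn 2.45 in natural units (GeV^-2 per steradian).

    M2 is |M|^2, w = sqrt(s) in GeV, and p_in, p_out are the magnitudes
    of the initial and final CMS 3-momenta in GeV.
    """
    return M2 / (8.0 * math.pi * w) ** 2 * (p_out / p_in)

def to_millibarn(sigma_gev_minus2):
    """Convert a cross section from GeV^-2 to mb."""
    return sigma_gev_minus2 * HBARC2_MB_GEV2

# Illustrative: elastic scattering (p' = p) with a constant |M|^2 = 1
# at w = 10 GeV; an isotropic distribution integrates to 4*pi.
per_sr = dsigma_domega(1.0, 10.0, 5.0, 5.0)
total_mb = to_millibarn(4.0 * math.pi * per_sr)
print(per_sr, total_mb)
```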

## **2.5.4 Breit–Wigner**

In nuclear physics, τ might be anywhere from 10<sup>−3</sup> to 10<sup>+32</sup> s, but particle physics typically deals with short timescales: 10<sup>−23</sup>–10<sup>−8</sup> s. As we have seen above, the lifetime of an unstable state is inversely related to the total decay rate Γ (Γτ ∼ 1). This is a form of the time–energy uncertainty relation: the uncertainty in lifetime translates to an uncertainty in mass. The state is said to have a **natural width**

$$
\Gamma = 1/\tau \quad \text{and} \quad N(t) \propto \text{e}^{-\Gamma t} \tag{2.46}
$$

For long-lived particles (meaning ∼10<sup>−13</sup> s or more), the natural width is so small that it is better to quote mass and lifetime.

A simple model for such states gives rise to the Breit–Wigner line shape (Fig. 2.5). We proceed as follows:

• The wavefunction of a state of energy E<sup>R</sup> and lifetime τ = 1/Γ is

$$\begin{aligned} \psi(t) &= \psi(0) \mathbf{e}^{-\mathbf{i}E\_R t} \mathbf{e}^{-t/2\tau} \\ &= \psi(0) \mathbf{e}^{-t(\mathbf{i}E\_R + \Gamma/2)} \end{aligned}$$

• The intensity ψψ<sup>∗</sup> follows the exponential decay law I ∝ e<sup>−Γt</sup>. The amplitude as a function of E is derived from the Fourier transform:

$$\begin{aligned} \chi(E) &= \int\_0^\infty \psi(t) \mathrm{e}^{\mathrm{i}Et} \,\mathrm{d}t \\ &= \psi(0) \int\_0^\infty \mathrm{e}^{-t\left[\left(\Gamma/2\right) + \mathrm{i}\left(E\_R - E\right)\right]} \,\mathrm{d}t \\ &\propto \frac{1}{\left(E - E\_R\right) + \mathrm{i}\Gamma/2} \end{aligned}$$

• The cross section σ(E) ∝ χχ<sup>∗</sup>:

$$
\sigma(E) = \sigma\_{\text{max}} \frac{\Gamma^2 / 4}{(E - E\_R)^2 + \Gamma^2 / 4} \tag{2.47}
$$
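The steps above can be sketched numerically: the line shape of eqn 2.47 is the modulus squared of the amplitude, and it falls to half its peak value at E = E<sub>R</sub> ± Γ/2, so Γ is the full width at half maximum. The resonance parameters below are arbitrary illustrative numbers, and the function names are assumptions of this sketch.

```python
def breit_wigner(E, E_R, Gamma, sigma_max=1.0):
    """Non-relativistic Breit-Wigner line shape, eqn 2.47."""
    return sigma_max * (Gamma**2 / 4.0) / ((E - E_R) ** 2 + Gamma**2 / 4.0)

def amplitude(E, E_R, Gamma):
    """Amplitude proportional to 1 / ((E - E_R) + i*Gamma/2).

    The sign of the imaginary part does not affect |chi|^2.
    """
    return 1.0 / complex(E - E_R, Gamma / 2.0)

# Illustrative resonance in arbitrary energy units.
E_R, Gamma = 1.0, 0.1
peak = breit_wigner(E_R, E_R, Gamma)
half_low = breit_wigner(E_R - Gamma / 2.0, E_R, Gamma)
half_high = breit_wigner(E_R + Gamma / 2.0, E_R, Gamma)
print(peak, half_low, half_high)

# |chi|^2, normalized to 1 at the peak, reproduces the same shape.
E = 1.03
from_amplitude = abs(amplitude(E, E_R, Gamma)) ** 2 * Gamma**2 / 4.0
print(from_amplitude, breit_wigner(E, E_R, Gamma))
```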

The value of σ<sub>max</sub> can be found using a heuristic argument from wave optics following Perkins [117]. The angular momentum of a particle with momentum p about a scattering centre may be written as L = pb, where b is the impact parameter.<sup>19</sup> Particles in a beam of fixed momentum will have b in the range (0, L/p) and will impact on the target plane within a circular disc of area πb<sup>2</sup>. The wavefunction of the incident particles can be expanded as a sum of partial waves with quantized angular momentum L = l, where l is an integer. The fraction of the beam with L ∈ (l, l + 1) hits the target plane within an annulus of area π[(l + 1)<sup>2</sup> − l<sup>2</sup>]/p<sup>2</sup> = (2l + 1)π(1/p)<sup>2</sup>. To take account of elastic and totally absorptive scattering, the elastic scattering amplitude is doubled, leading to a factor of 4 in this expression.<sup>20</sup> For scattering through the lth partial wave, the result is σ<sub>max</sub> = 4π(1/p)<sup>2</sup>(2l + 1).

<sup>19</sup>The impact parameter is defined as the perpendicular distance from the scattering centre to the direction of travel of the object.

<sup>20</sup>For more details on partial wave analysis, see the Further Reading at the end of the chapter.

A few more details are needed:

• So far, the spin of the particles has been ignored. Since the Breit–Wigner formula is a total cross section, we sum over the final-state spins and average over the initial-state spins, giving a factor

$$\frac{2J+1}{(2S\_a+1)(2S\_b+1)}$$

For scattering through a resonant state with spin J, l → J.

• The expression in eqn 2.47 for the energy dependence is not in a Lorentz-invariant form. This is rectified by multiplying it by (E + E<sub>R</sub>)<sup>2</sup> and using the approximation E ≈ E<sub>R</sub> = M<sub>0</sub> where appropriate.

• Often, a resonance will occur in many different scattering processes (or channels), and to allow for this, the expression in eqn 2.47 is further multiplied by the branching ratios BR<sub>in</sub> and BR<sub>out</sub> for the entrance and exit channels.

**Fig. 2.5** Breit–Wigner line shape.

The result is

$$\sigma(s) = \sigma\_{\text{max}} \frac{M\_0^2 \Gamma^2}{(s - M\_0^2)^2 + M\_0^2 \Gamma^2} \tag{2.48}$$

where

$$\sigma\_{\text{max}} = \frac{4\pi}{p^2} \frac{(2J+1)\,\text{BR}\_{\text{in}}\,\text{BR}\_{\text{out}}}{(2S\_a+1)(2S\_b+1)}\tag{2.49}$$
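Eqns 2.48 and 2.49 translate directly into code. The sketch below is a hedged illustration: the resonance (mass, width, spins, branching ratios) is hypothetical, the beams are taken to be massless so that p = M<sub>0</sub>/2 at the peak, and σ<sub>max</sub> is treated as constant across the resonance, which is an approximation valid only for a narrow state.

```python
import math

HBARC2_MB_GEV2 = 0.389  # (hbar c)^2 in mb GeV^2, as quoted in Section 2.5.3

def sigma_max(p, J, S_a, S_b, br_in, br_out):
    """Eqn 2.49 in natural units (GeV^-2); p is the CMS momentum in GeV."""
    spin_factor = (2 * J + 1) / ((2 * S_a + 1) * (2 * S_b + 1))
    return 4.0 * math.pi / p**2 * spin_factor * br_in * br_out

def sigma_rel_bw(s, M0, Gamma, smax):
    """Eqn 2.48: relativistic Breit-Wigner as a function of s."""
    return smax * M0**2 * Gamma**2 / ((s - M0**2) ** 2 + M0**2 * Gamma**2)

# Hypothetical spin-1 resonance formed from two spin-1/2 particles.
M0, Gamma = 1.0, 0.05  # GeV
p = M0 / 2.0           # massless beams at the peak (assumption)
smax = sigma_max(p, J=1, S_a=0.5, S_b=0.5, br_in=0.1, br_out=0.1)
at_peak = sigma_rel_bw(M0**2, M0, Gamma, smax)
print(at_peak * HBARC2_MB_GEV2, "mb at the peak")
```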

Two examples of Breit–Wigner line shapes are shown in Figs. 2.6 and 2.7, for the decays φ → K<sup>+</sup>K<sup>−</sup> and K<sup>∗</sup> → K<sup>+</sup>π<sup>−</sup>, respectively. Both states have comparable masses (φ ∼ 1020 MeV, K<sup>∗</sup> ∼ 900 MeV), but the width<sup>21</sup> of the K<sup>∗</sup> is about 50 MeV, while that of the φ is only 4.3 MeV. Try Exercise 2.8 to help understand this.

<sup>21</sup>The full width at half the maximum height of the peak (FWHM).

**Fig. 2.6** Breit–Wigner line shapes for φ → K<sup>+</sup>K<sup>−</sup>, from the LHCb experiment at the LHC [100]. The Breit–Wigner line shape is superimposed on a gradually rising background shown by the dashed line.

**Fig. 2.7** Breit–Wigner line shapes for K<sup>∗</sup> → K<sup>+</sup>π<sup>−</sup>, from the LHCb experiment at the LHC [105]. The overall background under the Breit–Wigner line shape is shown by the dashed line.

## **2.6 Luminosity and event rates**

In Section 2.5.3, a relationship among event rate, flux, and cross section was introduced and elaborated, particularly for two-body scattering. The strategies for optimizing and measuring the luminosity in colliders are described in Section 3.4. The critical issue of 'triggering' (i.e. selecting the interesting events to keep) is introduced in Section 4.10 and discussed in slightly more detail for the LHC in Section 13.2.

## **2.7 Group theory**

This section provides a brief introduction to group theory, with emphasis on the groups of relevance to physics and particularly particle physics. The mathematical definition of a group G considers a set of objects (a, b, c, . . .) with a multiplication rule satisfying the following:


There are many different types of group of relevance to physics:


T<sub>2</sub> and R<sub>3</sub> depend on two and three parameters, respectively.<sup>22</sup> These numbers are known as the order of the group. For T<sub>2</sub>, the sequence in which two successive translations are applied does not affect the outcome: T<sub>2</sub>(**a**)T<sub>2</sub>(**a**′) = T<sub>2</sub>(**a**′)T<sub>2</sub>(**a**); such a group is called Abelian. The groups R<sub>3</sub> and SU(n) are non-Abelian, so the sequence of successive operations matters.

<sup>22</sup>For T<sub>2</sub>, a two-dimensional vector translation in a plane requires the distance that the centre of the vector is moved in the plane and the angle of rotation (also within the plane) with respect to the original direction of the line. For R<sub>3</sub>, a rotation in three dimensions requires the direction of the axis about which the rotation takes place (two direction cosines) and the angle of rotation about this axis.

## **2.7.1 Lie groups**

Consider a matrix group {A}. An element A must have an inverse, and hence det(A) ≠ 0; thus, there exists a matrix α such that

$$A = \exp(\alpha) = 1 + \alpha + \alpha^2/2 + \dots$$

and α is the logarithm of A. The set of all matrices α whose exponentials belong to a group G is known as the Lie algebra of G. Using the definition of the exponential series, we have

$$A = \lim\_{k \to \infty} \left( I + \frac{\alpha}{k} \right)^k \quad \text{and inverse} \quad \alpha = \lim\_{k \to \infty} k(A^{1/k} - I).$$

For large k, the matrix I + α/k is an operator of the group and gives A by iteration. It is an infinitesimal operator, since for large k it differs from the identity operator by an infinitesimal amount.

The product of two elements of a group is a member of the group, which for a finite group must then be expressible as a linear combination of the n generators of the Lie algebra.<sup>23</sup> This provides a set of relations among the commutators of the basis elements {g<sub>i</sub>}, i = 1, ..., n:

$$[g\_i, g\_j] = \mathrm{i}c\_{ij}^k g\_k \tag{2.50}$$

There will be n(n − 1)/2 such relations, and the c<sup>k</sup><sub>ij</sub> are known as the structure constants of the group.

<sup>23</sup>Also known as the basis of the Lie algebra; n is the order of the group.

## **2.7.2 U(***n***) and SU(***n***)**

The Lie groups U(n) and SU(n) occur in a number of different contexts in particle physics. U(1) and SU(2) are used in the construction of the electroweak sector of the standard model. SU(2) and SU(3) occur as approximate symmetries used to classify nuclear and particle states. The strong force of QCD is based on an exact SU(3) of eight gluons providing the strong force binding the quarks into hadrons.

U(n) The set of all n × n unitary<sup>24</sup> matrices forms the U(n) group. Because UU† = I, det(UU†) = 1, and so we have

$$\begin{aligned} \det(UU^\dagger) &= \det(U)\det(U^\dagger) \\ &= (\det U)(\det U)^\* \\ &= |\det U|^2 = 1 \end{aligned}$$

hence det U = e<sup>iφ</sup>.

<sup>24</sup>That is, UU† = U†U = I.

SU(n) This is the subset of U(n) for which φ = 0, i.e. det(U) = 1. Identifying U = e<sup>ig</sup>, where g is the generator of the operator U, and using the relation

$$\det(\mathbf{e}^{\mathbf{i}g}) = \mathbf{e}^{\mathbf{i}\operatorname{Tr}(g)}$$

it follows that e<sup>ig</sup> is an element of SU(n) if, and only if, g is traceless: Tr(g) = 0 ⇒ e<sup>i Tr(g)</sup> = 1.
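The link between a traceless Hermitian generator and a unitary matrix of unit determinant can be verified numerically. The sketch below uses only the exponential series from Section 2.7.1, written in plain Python for 2 × 2 matrices; the helper names and the entries of g are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential from the power series 1 + A + A^2/2 + ..."""
    n = len(A)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def det2(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

# A traceless Hermitian 2 x 2 generator g (illustrative entries).
g = [[0.3, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]]
U = expm([[1j * x for x in row] for row in g])
U_dagger = [[U[j][i].conjugate() for j in range(2)] for i in range(2)]
print(det2(U))              # close to 1, since Tr(g) = 0
print(matmul(U, U_dagger))  # close to the identity, since g is Hermitian
```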

Once the relevant group has been identified, properties of a model can be inferred from the properties of that group:


## **2.7.3 SU(2)**

The simplest SU(n) group is SU(2). It is the group of 2 × 2 unitary matrices U with det(U) = 1. Writing U = e<sup>ih</sup>, unitarity gives

$$UU^\dagger = \mathbf{e}^{ih}\mathbf{e}^{-ih^\dagger}.$$

From the unitarity condition, (UU†)† = U†U = I, so U and U† commute and it follows that h and h† also commute. Under these conditions,

$$\mathbf{e}^{ih}\mathbf{e}^{-ih^\dagger} = \mathbf{e}^{\mathbf{i}(h-h^\dagger)} = I = \mathbf{e}^0$$

so h − h† = 0. These are the conditions for h to be a Hermitian matrix.

SU(2) has three generators, and the three 2 × 2 traceless Hermitian matrices making up the Lie algebra are the Pauli matrices:

$$
\sigma\_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma\_2 = \begin{pmatrix} 0 & -\mathbf{i} \\ \mathbf{i} & 0 \end{pmatrix}, \quad \sigma\_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{2.51}
$$

Any 2 × 2 traceless Hermitian matrix h can be written as a linear combination of the three Pauli matrices with real parameters λi:

$$h = \lambda\_1 \sigma\_1 + \lambda\_2 \sigma\_2 + \lambda\_3 \sigma\_3 \tag{2.52}$$

The commutation relations of the Pauli matrices are

$$[\sigma\_1, \sigma\_2] = 2\mathbf{i}\sigma\_3, \quad [\sigma\_2, \sigma\_3] = 2\mathbf{i}\sigma\_1, \quad [\sigma\_3, \sigma\_1] = 2\mathbf{i}\sigma\_2 \tag{2.53}$$

So for this group the non-zero structure constants all have magnitude 2 (c<sup>k</sup><sub>ij</sub> = 2ε<sub>ijk</sub>). The angular momentum vector of a spin-1/2 particle is J<sub>i</sub> = σ<sub>i</sub>/2, with the usual angular momentum commutation relations [J<sub>i</sub>, J<sub>j</sub>] = iε<sub>ijk</sub>J<sub>k</sub>.

The Pauli matrices provide the fundamental or irreducible matrix representation of the group SU(2). From these, representations of higher angular momentum states can be constructed—see the Further Reading.
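The commutation relations of eqn 2.53, and the corresponding relations for J<sub>i</sub> = σ<sub>i</sub>/2, can be checked by direct matrix multiplication. This is a minimal sketch in plain Python; the helper functions are assumptions of the example and the matrices are those of eqn 2.51.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * x for x in row] for row in A]

sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Cyclic relations of eqn 2.53: [sigma_i, sigma_j] = 2i sigma_k.
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    assert commutator(sigma[i], sigma[j]) == scale(2j, sigma[k])

# With J_i = sigma_i / 2 the same relations read [J_i, J_j] = i J_k.
J = [scale(0.5, s) for s in sigma]
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    assert commutator(J[i], J[j]) == scale(1j, J[k])

print("commutation relations verified")
```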

## **2.7.4 Combining states**

Representation theory provides the rules for how to combine states.<sup>25</sup> Some examples of the arithmetic for combining SU(n) states are as follows:

• SU(2):

2 ⊗ 2 = 4 = 3 ⊕ 1

(3 symmetric states, 1 antisymmetric state);

2 ⊗ 2 ⊗ 2 = 8 = 4 ⊕ 2 ⊕ 2

(4 symmetric states, 2 mixed-symmetric and 2 mixed-antisymmetric).

• SU(3):

3 ⊗ 3 ⊗ 3 = 27 = 10 ⊕ 8 ⊕ 8 ⊕ 1

(10 symmetric states, 8 mixed-symmetric, 8 mixed-antisymmetric and one antisymmetric state).

<sup>25</sup>A familiar example is how to combine angular momentum states (representations of the three-dimensional rotation group R<sub>3</sub>).
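For SU(2), the arithmetic above follows from the familiar angular momentum rule that coupling j<sub>1</sub> and j<sub>2</sub> gives total spins |j<sub>1</sub> − j<sub>2</sub>|, ..., j<sub>1</sub> + j<sub>2</sub>, each with 2j + 1 states. The sketch below applies that rule repeatedly to count multiplet dimensions. It covers SU(2) only (the SU(3) case needs other tools, such as Young tableaux), and the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction

def combine(j1, j2):
    """Total spins from coupling j1 and j2: |j1 - j2|, ..., j1 + j2."""
    j, out = abs(j1 - j2), []
    while j <= j1 + j2:
        out.append(j)
        j += 1
    return out

def decompose(spins):
    """Multiset of total spins obtained by coupling the spins one by one."""
    totals = Counter({spins[0]: 1})
    for s in spins[1:]:
        new = Counter()
        for j, mult in totals.items():
            for jj in combine(j, s):
                new[jj] += mult
        totals = new
    return totals

half = Fraction(1, 2)
for n in (2, 3):
    totals = decompose([half] * n)
    dims = sorted((int(2 * j + 1) for j in totals.elements()), reverse=True)
    print(n, "doublets ->", dims, "total", sum(dims))
# prints:
# 2 doublets -> [3, 1] total 4
# 3 doublets -> [4, 2, 2] total 8
```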

Further details of SU(2) and SU(3) and their use in particle physics are given in Chapter 5. See also the exercises at the end of this chapter.

## **Chapter summary**


## **Further reading**


## **Exercises**



<sup>1</sup>An essential step in electronic chip fabrication.

# **3 Accelerators**

Accelerators are devices that accelerate charged particles to a broad range of energies from keV to TeV. This chapter is a very brief introduction to accelerators in particle physics and high-energy nuclear physics. There are only a few accelerators used for particle or nuclear physics, but about 30 000 accelerators are currently used worldwide for other very important applications. There are about 9000 accelerators used in cancer therapy, 9500 in ion implantation,<sup>1</sup> 4500 for cutting and welding, 2000 for electron-beam and X-ray sources, 1000 for neutron generators, and more in other fields. Each type of accelerator is built using the technology most suitable for its particular application. In this very brief introduction, we will focus only on synchrotrons and linear accelerators, since these are the typical choices in particle physics. Some excellent introductory textbooks on accelerator physics are given in the Further Reading at the end of this chapter.

## **3.1 Radiofrequency acceleration**

## **3.1.1 Electric and magnetic fields**

Particles are accelerated by the electric field only,

$$\mathbf{E} = -\nabla\varphi - \frac{\partial\mathbf{A}}{\partial t} \tag{3.1}$$

where ϕ is the scalar potential and **A** the vector potential. In particle physics accelerators, the time-dependent vector potential **A** is the source of the accelerating field **E**. The simplest realization is a cylindrical structure, sketched in Fig. 3.1, and is called a pill-box cavity. Microwave radiation (with frequency in the MHz–GHz range) produced in a device called a klystron (see Section 3.1.6) is guided to the pill-box cavity, where it forms a standing wave; i.e. the pill-box cavity acts as a resonator. In free space, electromagnetic waves can only have transverse electric and magnetic fields with respect to the direction of propagation; however, inside a cavity, we can have transverse electric (TE) or transverse magnetic (TM) modes, indicating fields that have either longitudinal magnetic or electric fields, respectively. Since we need a longitudinal electric field to accelerate charged particles, only the TM modes will be useful. The modes can be found by solving Maxwell's equations in free space without free charges or currents, subject to the usual boundary conditions at the conducting surfaces of the cavity. These conditions ensure that the tangential component of the **E** field and the normal component of the **B** field vanish at the surface of a conductor.

**Fig. 3.1** A pill-box cavity.

The useful (i.e. accelerating) modes of this cavity are TM<sub>nlm</sub>, where the indices n, l, and m refer to the field variations along the usual polar coordinates φ (azimuthal), r (radial), and s (longitudinal, i.e. along the beam direction). Since the radial variations are given by Bessel functions and we require a non-zero component of the electric field in the longitudinal direction s on-axis (i.e. E<sub>s</sub>(r = 0) ≠ 0), and we must satisfy the boundary conditions at the surface of the conductor, we need the l = 1 modes. We want to select the modes that minimize the energy stored (and hence the electricity costs) for a given accelerating gradient. This means that we use the TM<sub>010</sub> mode. This mode has only two components, the electric field E<sub>s</sub> in the direction of the acceleration (s) and the azimuthal component of the magnetic field B<sub>φ</sub> in the cavity (as indicated in Fig. 3.1), which oscillate with the radiofrequency (RF) angular frequency ω:

$$E\_s \sim J\_0(kr) \mathbf{e}^{\rm i\omega t}, \quad B\_\phi \sim J\_1(kr) \mathbf{e}^{\rm i\omega t} \tag{3.2}$$

where J<sub>0</sub> and J<sub>1</sub> are the lowest-order Bessel functions and k = 2π/λ is the wavenumber. The requirement that E<sub>s</sub>(R) = 0, where R is the radius of the pill box, determines the allowed value of k from the zeros of the Bessel function. The first zero of the J<sub>0</sub> Bessel function is J<sub>0</sub>(2.40) = 0; therefore kR = 2.40 and hence λ = 2πR/2.40. So for the l = 1 mode, λ ≈ 2.62R. The amplitudes J<sub>0</sub> and J<sub>1</sub> depend on the radial coordinate r as sketched in Fig. 3.2.
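The first zero of J<sub>0</sub>, and hence the ratio λ/R, can be reproduced without any special-function library. The sketch below evaluates J<sub>0</sub> from its power series and locates the zero by bisection; the function names and the bracketing interval [2, 3] are choices made for this example.

```python
import math

def bessel_j0(x, terms=30):
    """J0 from its power series: sum_k (-1)^k (x/2)^(2k) / (k!)^2."""
    return sum((-1) ** k * (x / 2.0) ** (2 * k) / math.factorial(k) ** 2
               for k in range(terms))

def first_zero(f, lo, hi, tol=1e-12):
    """Bisection; assumes f changes sign exactly once on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x01 = first_zero(bessel_j0, 2.0, 3.0)  # J0(2) > 0 > J0(3)
print(x01)                  # about 2.405
print(2.0 * math.pi / x01)  # lambda / R, about 2.61
```

With the unrounded zero the ratio comes out slightly below the value obtained from kR = 2.40; the difference is only in the third significant figure.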

## **3.1.2 Circular accelerators and synchronicity**

Suppose now that we want to accelerate protons in this pill-box cavity along the direction s, which goes from left to right. We need to inject these protons into the cavity when the electric field component E<sub>s</sub> can accelerate them, i.e. when it is positive. After a time t = π/ω, the field changes sign and injected protons would be decelerated. If we further assume that the pill-box cavity is an accelerating structure of a synchrotron where protons go around on a closed trajectory, approximately a circle,<sup>2</sup> then we will end up with the synchronicity condition

$$
\omega = N \omega\_{\text{rev}}.\tag{3.3}
$$

The RF frequency needs to be an integer (N) multiple of the revolution frequency ω<sub>rev</sub> with which protons go around in the synchrotron. This is demonstrated in the cartoon in Fig. 3.3. The value of N is chosen for practical reasons, for example to make the RF frequency lie in the range where components such as amplifiers are available. This means that N is typically very large (e.g. at LEP, N = 31 320). This defines the number of 'buckets' in which we can potentially store stable beams. However, we usually only want to inject particles into a much smaller number of buckets.<sup>3</sup> In a synchrotron, protons go through the accelerating cavities many times, gaining energy at every passage. The magnetic field guiding them through the accelerator (see Section 3.2) changes along with the acceleration, keeping them on the same orbit.

<sup>2</sup>There will be straight sections, for example in the experimental halls where particle physics detectors are located.

<sup>3</sup>The groups of particles in filled buckets are called bunches. At the LHC, the spacing between buckets is 2.5 ns, but only ∼10% of buckets are filled with protons.

**Fig. 3.2** Amplitudes of the TM<sub>010</sub> fields in a pill-box cavity.

**Fig. 3.3** The child accelerates the roundabout by only pushing at the correct phase.

One can combine many pill-box cavities, stacking them one after another as sketched in Fig. 3.4. Protons are then accelerated in gaps between the drift tubes (see Fig. 3.4) when the accelerating electric field points in the right direction and then 'hide' inside the drift tubes, isolated from the electric field when it points in the wrong direction, emerging in the next pill box when the electric field is again pointing in the right direction.

**Fig. 3.4** An accelerating cavity consisting of two pill-box cavities. A proton bunch in the left-hand gap will be accelerated. A proton bunch in the middle drift tube will be shielded from the decelerating field. When it emerges into the right-hand gap, the field has changed sign.

The fact that beam particles need to enter an accelerating cavity at the right time leads to the bunched structure of the beam. This is demonstrated in Fig. 3.5 in more detail. A proton, for example, with nominal momentum p and travelling along the nominal path is called a synchronous particle and its trajectory is the reference trajectory. It arrives at the entrance to the cavity at point M<sub>1</sub>. After going through the cavity, its energy is increased by ε (here we are assuming that ωL/c ≪ 1). Another proton arrives a little earlier at point P and its energy is increased by a smaller amount than ε, so when it enters the cavity again, after the next revolution, it will not be that early in comparison with the synchronous particle.<sup>4</sup> Yet another proton arrives a little late at point P′. This proton will gain more energy than ε, so, after the next revolution, it will not be that late. In a synchrotron in which we combine the accelerating cavities with magnetic fields, these different energy gains and losses will lead to oscillations ('synchrotron oscillations').

<sup>4</sup>For simplicity, we assume that the paths around the accelerator are of the same length for all particles considered. In reality, different momenta/energies lead to different paths, which also need to be taken into account.

**Fig. 3.5** Energy gain as a function of the arrival time or relative phase of a beam particle with respect to the oscillating electric field in the cavity. E<sub>0</sub> is the accelerating electric field amplitude, L the cavity length, and e the charge of the accelerated particle. Other symbols are explained in the text.

In contrast, considering points Q and Q′ in relation to point N<sub>1</sub>, one can see that a proton arriving early with respect to N<sub>1</sub> will gain more energy and one arriving late will gain less energy than the proton arriving at N<sub>1</sub>; thus, every revolution, these protons will be further and further apart from each other, eventually escaping from the accelerator.<sup>5</sup> So M<sub>1</sub> and M<sub>2</sub> are stable points where bunches of beam particles can be located and N<sub>1</sub> and N<sub>2</sub> are unstable points.

<sup>5</sup>For electrons, we would need to take account of energy losses due to synchrotron radiation and differences among trajectories of particles with different momenta. The conclusion would be that N<sub>1</sub> and not M<sub>1</sub> would correspond to a stable reference trajectory for a bunch of electrons.

This simplified discussion of phase stability needs to be expanded to take into account competing changes: the speed of the particles and the radius of the orbit. The frequency is simply related to the speed v and the radius R by

$$f = \frac{v}{2\pi R}.\tag{3.4}$$

Both v and R depend on the momentum of the particle. From relativistic kinematics, we know that

$$p = \frac{mv}{\sqrt{1 - v^2}}\tag{3.5}$$

When the RF acceleration increases the momentum of the particle, it will also cause it to follow a slightly different orbit. This change ΔR in the radius is defined by the dispersion D, which for a certain change Δp in the momentum gives<sup>6</sup>

$$
\Delta R = D \frac{\Delta p}{p} \tag{3.6}
$$

<sup>6</sup>The value of the dispersion depends on the type and strength of the magnetic focusing. See Wilson in Further Reading for details.

It is conventional to define the 'slip factor', which is given by the fractional change in frequency divided by the fractional change in momentum:

$$
\eta\_{\rm RF} = \frac{\Delta f/f}{\Delta p/p} \tag{3.7}
$$

Substituting from eqns 3.5 and 3.6, we can evaluate (see Exercise 3.2) the two terms in eqn 3.7:

$$
\eta\_{\rm RF} = \frac{1}{\gamma^2} - \frac{D}{R\_0} \tag{3.8}
$$

where R<sub>0</sub> is the radius of the reference trajectory. Therefore, for injection at low momentum, for which γ ∼ 1, for a typical proton synchrotron, η<sub>RF</sub> > 0. However, as the momentum increases, we will reach a transition in which η<sub>RF</sub> = 0 and, for higher momentum, η<sub>RF</sub> < 0. This implies that the region of phase stability flips when we cross the transition defined by

$$\frac{1}{(\gamma\_{\text{transition}})^2} = \frac{D}{R\_0} \tag{3.9}$$

Therefore, we need to change the RF phase as we cross such transitions.<sup>7</sup> The concept of phase stability discussed here is one of the key ideas that enabled the successful operation of high-energy accelerators.

<sup>7</sup>This can be done sufficiently quickly that the particle losses are negligible.

## **3.1.3 Accelerating-cavity design and** *Q* **factor**

For efficient operation of an accelerating cavity,<sup>8</sup> i.e. a standing-wave resonator, one requires that the energy be transferred from the resonator to the accelerated beam and not dissipated into the environment by losses to the walls of the cavity and by radiation to the environment. This means that the Q value, i.e. the quality factor defined as the ratio of the average energy stored in the cavity, U, to the average energy dissipated in one oscillation period, U<sub>d</sub>, is very large. The energy stored alternates between the electric and magnetic fields, but we can calculate this from the peak value of the magnetic field, **H**<sub>0</sub>:

$$U = \frac{\mu\_0}{2} \int |\mathbf{H}\_0|^2 \,\mathrm{d}^3 \mathbf{r} \tag{3.10}$$

<sup>8</sup>In this subsection we will work in SI units.

<sup>10</sup>For superconducting cavities, we have to consider the effective surface resistance, so there are still losses; however, these are orders of magnitude smaller for superconducting niobium compared with copper, which suffers from Ohmic losses.

**Fig. 3.6** Schematic cross-section of an RF cavity (not to scale). Note the small accelerating gap and the relatively large volume. The RF power enters through an insulating ceramic window and couples into the cavity.

We can calculate U<sub>d</sub> from the Ohmic losses at the surface of the cavity.<sup>9</sup> For a good conductor (σ/ωε ≫ 1), we can neglect the displacement current. Using Ampère's law, we can show that |**H**| = j<sub>surf</sub>, where j<sub>surf</sub> is the surface current per unit length. The power dissipated, I<sup>2</sup>R, can be evaluated as a surface integral

$$P = \frac{1}{2} \int |\mathbf{H}\_0|^2 R\_{\text{surf}} \, \text{d}S \tag{3.11}$$

<sup>9</sup>This is a simplified calculation that neglects radiation losses.

where the surface resistance R<sub>surf</sub> = 1/(σδ) and the skin depth δ = √(2/(μσω)). The energy dissipated over one period T = 2π/ω is then given by

$$
\Delta W = \pi \sqrt{\frac{\mu}{2\sigma\omega}} \int |\mathbf{H}\_0|^2 \,\mathrm{d}S \tag{3.12}
$$

Comparing eqns 3.10 and 3.12, we can see that to maximize the Q value, we need a high frequency, a high conductivity, and a large ratio of volume to surface area. The RF frequency used is usually in the range 100 MHz to ∼10 GHz.<sup>10</sup>
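To give a feel for the numbers, the skin depth and surface resistance can be evaluated across the quoted frequency range. The sketch below works in SI units and assumes a typical handbook conductivity for room-temperature copper; that value and the function names are assumptions of this example, not taken from the text.

```python
import math

MU0 = 4.0e-7 * math.pi  # vacuum permeability in H/m

def skin_depth(sigma, omega, mu=MU0):
    """delta = sqrt(2 / (mu * sigma * omega)) in metres."""
    return math.sqrt(2.0 / (mu * sigma * omega))

def surface_resistance(sigma, omega, mu=MU0):
    """R_surf = 1 / (sigma * delta) in ohms."""
    return 1.0 / (sigma * skin_depth(sigma, omega, mu))

# Assumed room-temperature copper conductivity (typical handbook value).
SIGMA_CU = 5.8e7  # S/m
for f in (100e6, 1e9, 10e9):
    omega = 2.0 * math.pi * f
    print(f, skin_depth(SIGMA_CU, omega), surface_resistance(SIGMA_CU, omega))
```

At 1 GHz the skin depth comes out at about 2 μm, so the wall currents flow in a very thin surface layer. For a fixed field pattern, the ratio of eqn 3.10 to eqn 3.12 scales as (V/S)√(σω), which is the origin of the three requirements listed above.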

Next we need to consider the fact that the electric field is varying while the particle crosses the cavity. The electric field for the n = 0 mode on axis (r = 0) depends on time as

$$E\_s = E\_0 \cos(\omega t) \tag{3.13}$$

For an ultrarelativistic particle crossing the cavity, the position along the axis of the cavity is simply s = ct and its speed does not change, but it gains energy over the length of the gap, G:

$$
\Delta W = \int\_{-G/2}^{G/2} E\_0 \cos \left(\omega s/c\right) \mathrm{d}s = \frac{\sin \left(\omega G/2c\right)}{\omega G/2c} E\_0 G \tag{3.14}
$$

It is clear from eqn 3.14 that for efficient acceleration, we need the gap length to be significantly less than the wavelength. This helps keep the bunches in the accelerating phase and prevents slippage into the decelerating phase. On the other hand, we have seen that to obtain a high Q value (and hence minimize Ohmic losses), we need a large ratio of volume to surface area. The RF cavities that are used in high-energy accelerators have shapes optimized to meet these two requirements, which include a short gap length and a large volume. A (very schematic) sketch of a cross section of a cavity is shown in Fig. 3.6. Typically, many cavities are combined in one structure.
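The factor sin(ωG/2c)/(ωG/2c) in eqn 3.14 can be tabulated directly. The sketch below does this for a few gap lengths expressed as fractions of the RF wavelength; the 500 MHz frequency is an illustrative assumption, and the function name is ours.

```python
import math

C = 299_792_458.0  # speed of light [m/s]


def transit_time_factor(gap, freq):
    """Return sin(x)/x with x = omega*G/(2c), the reduction factor in eqn 3.14."""
    x = math.pi * freq * gap / C  # omega*G/(2c) with omega = 2*pi*freq
    return 1.0 if x == 0.0 else math.sin(x) / x


FREQ = 500e6            # assumed RF frequency [Hz]
WAVELENGTH = C / FREQ   # RF wavelength [m]

for fraction in (0.1, 0.25, 0.5, 1.0):
    factor = transit_time_factor(fraction * WAVELENGTH, FREQ)
    print(f"G = {fraction:4.2f} wavelength -> energy gain = {factor:6.3f} * E0 * G")
```

A gap of one tenth of a wavelength loses less than 2% of the ideal energy gain E<sub>0</sub>G, whereas a gap of a full wavelength gives no net gain at all.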

Good-quality accelerating cavities that can be produced in large quantities have accelerating fields in the region of 20–30 MV m<sup>−1</sup>. Single cavities might achieve up to 100 MV m<sup>−1</sup>, which is a breakdown limit (or beyond) for most materials.<sup>11</sup> To achieve larger accelerating fields, one needs a different approach. Fields up to 100 GV m<sup>−1</sup> can be obtained in plasma (no walls to break down), where electrons can be displaced from quasistationary ions. This is a very active research area that eventually might lead to a new generation of accelerators.

<sup>11</sup>The breakdown mechanisms are different for conducting and superconducting cavities. For superconducting cavities, the hard limit is set by the fact that if the magnetic field at the surface becomes too large, the superconductor will return to the normal resistive state ('quench'). In practice, no useful superconducting cavities have been made with gradients above 50 MV m<sup>−1</sup>.

## **3.1.4 Synchrotron radiation energy loss**

When charged particles are accelerated in a circular machine, they lose energy by synchrotron radiation. For an ultrarelativistic particle of mass m and Lorentz factor γ, in an orbit with radius of curvature ρ, the power emitted in the form of synchrotron radiation is

$$P = \frac{2}{3} \frac{r\_c m \gamma^4}{\rho^2} \tag{3.15}$$

where r<sub>c</sub> = e<sup>2</sup>/(4πε<sub>0</sub>mc<sup>2</sup>) is the classical radius of the particle. As the synchrotron radiation scales as γ<sup>4</sup>, it will generally be negligible for protons, but the losses for electron machines will be very significant. The energy loss grows as the fourth power of the energy, and therefore there is a limit to the energy reach of circular electron machines. Although synchrotron radiation can be reduced by increasing the radius of the machine, this becomes prohibitively expensive at high energies. LEP is generally considered to be the highest-energy circular electron accelerator that will be built: higher-energy electron–positron machines will be linear colliders, for which the synchrotron radiation is negligible.<sup>12</sup>

While synchrotron radiation is a problem for particle physics applications, it turns out to have many uses in other areas of science. The synchrotron radiation in the laboratory frame is forward-peaked around the electron direction and provides a very high-brightness X-ray source. Dedicated electron rings are built with 'wiggler' magnets to increase the synchrotron radiation. The X-rays are used in condensed matter physics, biology, medical applications, and other fields.

<sup>12</sup>There is currently some discussion about ideas for very large circular colliders, including the e<sup>+</sup>e<sup>−</sup> option. If such a machine is ever built, it will certainly be very expensive!

## **3.1.5 Linear accelerators**

In a linear accelerator, one can use standing-wave cavities as described above or a travelling electromagnetic wave to accelerate electrons. Of course, the travelling wave needs to be propagating in a waveguide-like structure in order for the electric field to have a component along the direction of travel. Then one can inject electrons to sit on the crest of that travelling wave and gain energy as indicated in the cartoon in Fig. 3.7. A schematic sketch of an accelerating structure is shown in Fig. 3.8, where 2a is the diameter containing the beam and 2b is the outer diameter of the waveguide.

**Fig. 3.7** The principle behind travelling-wave acceleration.

**Fig. 3.8** Disk-loaded accelerating structure.

**Fig. 3.9** Travelling-wave accelerating structure.

There are two approaches to particle acceleration. One is based on the use of cavities with short accelerating gaps (see eqn 3.14). An alternative approach uses a waveguide structure in which we have a travelling wave. However, in a smooth waveguide, the phase velocity is always larger than c and therefore cannot be used for particle acceleration. One approach to this problem is to insert discs inside the structure that are used to adjust the phase velocity of the travelling wave. The radii a and b and the distances between discs are chosen such that the phase velocity of the wave equals the electron velocity. They need to be changing along the structure as electrons are being accelerated. But once the electron speed becomes very close to the speed of light, there is no need to change the geometry of the structure. A more realistic sketch of an accelerating structure is shown in Fig. 3.9. An RF wave produced by a klystron enters and leaves each cavity to be absorbed outside the cavity. If instead of being absorbed the wave is reflected at the end of the cavity, a standing wave will be created that could also be used to accelerate electrons.
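The statement that the phase velocity in a smooth waveguide always exceeds c can be illustrated numerically. The sketch below uses the standard dispersion relation for a single waveguide mode with cutoff frequency f<sub>c</sub>, v<sub>ph</sub> = c/√(1 − (f<sub>c</sub>/f)<sup>2</sup>), which is not derived in the text; the cutoff and operating frequencies are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light [m/s]


def phase_velocity(freq, cutoff):
    """Phase velocity of a mode in a smooth waveguide (standard dispersion relation)."""
    if freq <= cutoff:
        raise ValueError("the mode does not propagate below its cutoff frequency")
    return C / math.sqrt(1.0 - (cutoff / freq) ** 2)


def group_velocity(freq, cutoff):
    """Group velocity of the same mode; v_ph * v_g = c^2."""
    if freq <= cutoff:
        raise ValueError("the mode does not propagate below its cutoff frequency")
    return C * math.sqrt(1.0 - (cutoff / freq) ** 2)


CUTOFF = 2.0e9  # hypothetical cutoff frequency [Hz]
FREQ = 3.0e9    # hypothetical operating frequency [Hz]

v_ph = phase_velocity(FREQ, CUTOFF)
v_g = group_velocity(FREQ, CUTOFF)
print(f"v_ph / c = {v_ph / C:.3f}, v_g / c = {v_g / C:.3f}")
```

Since v<sub>ph</sub> > c at every frequency above cutoff, a particle can never stay in phase with the wave, which is why the discs are needed to slow the wave down.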

## **3.1.6 Klystrons**

This section gives a very brief and simplified idea of how klystrons work. A DC high voltage is first used to accelerate a continuous electron beam. The electron beam enters an RF cavity to which RF power is delivered at a resonant frequency. This causes the velocity of the electron beam to become modulated. The electrons enter a drift region in which the velocity modulation is translated into spatial modulation (bunching). Finally, the electron bunches enter another RF cavity called the 'catcher region'. They enter out of phase with the RF, so they are decelerated and their kinetic energy is converted into RF energy. The RF wave is then guided by a waveguide to the accelerating RF structure.

## **3.2 Beam optics**

## **3.2.1 Magnetic lenses**

To guide and focus beam particles along the reference trajectory (which may not be a straight line), one needs magnets. In a synchrotron, for example, which has a circular geometry, one needs dipole magnets providing a vertical magnetic field (the accelerators are constructed in the horizontal plane) to bend the trajectories of beam particles so they stay inside the beam pipe (with a good vacuum inside), close to the reference trajectory. But the vertical magnetic field is not enough. For example, it does not constrain particle movement along the field direction, so any vertical component of the momentum, however small, would result in beam particles eventually escaping from the accelerator. One needs to have an arrangement of magnetic fields such that beam particles are effectively confined as if they were in a potential well that prevents them from going too far away from the reference trajectory. Quadrupole magnets are needed for this, together with other magnets for fine tuning. Only dipoles and quadrupoles will be considered here.

A schematic view of a dipole magnet is shown in Fig. 3.10. The beam pipe, a continuous vacuum chamber, runs through the yoke gap, where a magnetic field B is created by electric currents in the two coils. A 'warm' iron yoke can be used for fields up to about 2 T. To avoid iron saturation effects and achieve higher fields, one needs superconducting dipoles (see Section 3.3). The dipole bending strength for a particle with momentum p and charge q is then given by the inverse of the radius of curvature ρ:

$$\frac{1}{\rho} = \frac{qB}{p} \simeq 0.3 \frac{B \text{ [T]}}{p \text{ [GeV/}c]} \quad \text{for } q = e = \text{the electron charge} \tag{3.16}$$

In terms of the gap height h, the number of windings n/2 and the current I in each coil,

$$B = \frac{\mu\_0 n I}{h} \tag{3.17}$$

where μ<sub>0</sub> is the permeability of free space.
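As a quick numerical illustration of eqn 3.16, the sketch below evaluates the bending radius for a given momentum and field, and the field needed at a lower momentum in the same ring. The 8.3 T field is the LHC dipole value quoted in Section 3.3 and 450 GeV is the LHC injection energy; the 7000 GeV/c momentum is an assumed design value, and the function names are ours.

```python
K_BEND = 0.2998  # conversion factor in eqn 3.16 for q = e [GeV/c per T m]


def bending_radius(p_gev, b_tesla):
    """Radius of curvature [m] from eqn 3.16: 1/rho = 0.3 * B / p."""
    return p_gev / (K_BEND * b_tesla)


def required_field(p_gev, rho_m):
    """Dipole field [T] needed to hold momentum p on a radius rho."""
    return p_gev / (K_BEND * rho_m)


P_TOP = 7000.0  # assumed LHC design momentum [GeV/c]
B_TOP = 8.3     # dipole field quoted for the LHC [T]
P_INJ = 450.0   # LHC injection momentum [GeV/c]

rho = bending_radius(P_TOP, B_TOP)
b_inj = required_field(P_INJ, rho)
print(f"bending radius     = {rho:.0f} m")
print(f"field at injection = {b_inj:.3f} T")
```

Because ρ is fixed by the tunnel, the dipole field has to be ramped in proportion to the momentum during acceleration.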

A schematic cross section of a quadrupole magnet is shown in Fig. 3.11. Four pairs of coils, with n windings and current I in each coil,<sup>13</sup> create a magnetic field with components B<sub>x</sub> = −gz and B<sub>z</sub> = −gx in the horizontal and vertical directions, respectively, where

$$g = \frac{2\mu\_0 n I}{R^2} \tag{3.18}$$

and R is the distance shown in Fig. 3.11. The corresponding components of the Lorentz force acting on a particle with speed v are

$$F\_x = qvB\_z = -qvgx \quad \text{and} \quad F\_z = -qvB\_x = qvgz \tag{3.19}$$

The important point to note is that in the vertical plane (containing the origin, i.e. the reference trajectory point) the force is acting away from the origin, while in the corresponding horizontal plane it acts towards the origin. One says that the quadrupole is focusing in one plane and defocusing in another plane, perpendicular to the first. There is a complete analogy with geometrical optics, and for a quadrupole of length l, with quadrupole strength k, one can define its focal length f:

$$\frac{1}{f} = kl, \quad \text{where} \quad k \, \text{[m}^{-2}\text{]} = \frac{qg}{p} \simeq 0.3 \frac{g \, \text{[T/m]}}{p \, \text{[GeV/}c]} \quad \text{for } q = e \tag{3.20}$$

**Fig. 3.10** Schematic view of a dipole magnet.

<sup>13</sup>Eqn 3.18 assumes that we are not using a warm iron core magnet, so it is only valid for superconducting quadrupoles.

**Fig. 3.11** Schematic view of a quadrupole magnet.

For the HERA proton ring, k ≈ 0.033 m<sup>−2</sup>, l ≈ 1.9 m, and f ≈ 16 m. If f ≫ l, the quadrupole can be treated as a thin lens irrespective of the absolute value of l.
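The HERA numbers can be checked in a few lines. The sketch below takes the focal length as f = 1/(kl), which reproduces the quoted 16 m, and also inverts the relation k ≈ 0.3 g/p to estimate the field gradient. Using the HERA I proton energy of 820 GeV as the momentum is our assumption, and the function names are illustrative.

```python
K_QUAD = 0.2998  # conversion factor for q = e [GeV/c per T m]


def quad_strength(gradient, p_gev):
    """Quadrupole strength k [m^-2] from the gradient g [T/m] and momentum p [GeV/c]."""
    return K_QUAD * gradient / p_gev


def focal_length(k, length):
    """Thin-lens focal length [m] of a quadrupole of strength k and length l."""
    return 1.0 / (k * length)


K_HERA = 0.033  # quoted quadrupole strength [m^-2]
L_HERA = 1.9    # quoted quadrupole length [m]
P_HERA = 820.0  # assumed proton momentum [GeV/c]

f_hera = focal_length(K_HERA, L_HERA)
g_hera = K_HERA * P_HERA / K_QUAD  # gradient implied by k and p [T/m]

print(f"focal length     = {f_hera:.1f} m")
print(f"implied gradient = {g_hera:.0f} T/m")
print(f"f / l            = {f_hera / L_HERA:.1f}")
```

With f/l of order 8, the thin-lens approximation is reasonable for these quadrupoles.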

A thin lens of focal length f<sub>1</sub> and another thin lens of focal length f<sub>2</sub> arranged as a doublet of lenses separated by a drift tube of distance l form a focusing doublet with an effective focal length f (see Exercise 3.4b) given by

$$\frac{1}{f} = \frac{1}{f\_1} + \frac{1}{f\_2} - \frac{l}{f\_1 f\_2} \tag{3.21}$$

If one lens is focusing in the horizontal plane (f positive) and one defocusing (f negative), then we can arrange for the effective focal length of the system to be positive. Such a quadrupole doublet, with a drift space between the two quadrupoles, can be arranged to give focusing in both the horizontal and vertical directions. For example, let f<sub>1</sub> = f<sub>Q</sub> and f<sub>2</sub> = −f<sub>Q</sub>; then the effective focal length is given by f = f<sub>Q</sub><sup>2</sup>/l in both the horizontal and vertical dimensions. This is the idea behind so-called 'strong focusing', in which focusing and defocusing quadrupoles are arranged in doublets with a dipole inside each doublet.<sup>14</sup>

A structure like this is called a FODO cell, as sketched in Fig. 3.12. A FODO cell focuses beam particles in both planes. FODO cells are put together one after another as a periodic structure along the whole ring of a synchrotron. Calculations of particle trajectories inside such a structure can be performed using the same techniques as used in geometrical optics; hence this aspect of accelerator physics is called 'beam optics'. The concept of using repeating structures of FODO cells is called strong focusing and it keeps the transverse dimensions of the beam relatively small all the way around the ring, allowing the use of relatively small beam pipes and magnets. Before strong focusing was discovered, synchrotrons used 'weak focusing', which resulted in much larger beam pipes. Strong focusing was another key development that allowed the construction of very high-energy synchrotrons at an affordable price.

<sup>14</sup>From the focusing perspective, the dipole magnet acts as a drift tube. Note that if l is too short, then the effective focal length will become too long.
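The claim that a focusing–defocusing doublet focuses in both planes can be checked with the 2 × 2 ray-transfer matrices of geometrical optics. The sketch below multiplies a thin lens, a drift, and a second thin lens, and reads the effective focal length off the lower-left matrix element; the values f<sub>Q</sub> = 16 m and l = 8 m are hypothetical, and the helper names are ours.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def thin_lens(f):
    """Thin lens of focal length f acting on the vector (z, z')."""
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def drift(length):
    """Field-free drift of the given length."""
    return ((1.0, length), (0.0, 1.0))


def doublet_focal_length(f1, f2, length):
    """Effective focal length of lens f1, then a drift, then lens f2."""
    total = matmul(thin_lens(f2), matmul(drift(length), thin_lens(f1)))
    return -1.0 / total[1][0]  # the lower-left element equals -1/f


F_Q = 16.0   # hypothetical quadrupole focal length [m]
DRIFT = 8.0  # hypothetical separation of the two quadrupoles [m]

f_horizontal = doublet_focal_length(+F_Q, -F_Q, DRIFT)  # focusing, then defocusing
f_vertical = doublet_focal_length(-F_Q, +F_Q, DRIFT)    # defocusing, then focusing

print(f"horizontal: f = {f_horizontal:.1f} m")
print(f"vertical:   f = {f_vertical:.1f} m")
print(f"f_Q^2 / l    = {F_Q ** 2 / DRIFT:.1f} m")
```

Both planes give the same positive focal length f<sub>Q</sub><sup>2</sup>/l, even though each quadrupole on its own defocuses in one of them.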

## **3.2.2 Beam trajectories and phase space**

The motion of beam particles is described in a curvilinear coordinate system, as sketched in Fig. 3.13. In a first, linear, approximation, particle motion in each of the three space directions (longitudinal s, vertical z, and horizontal x) can be considered separately. The transverse phase space is split into two 2-dimensional phase spaces. Considering, for example, the vertical direction, we have the phase space of the coordinate z and the momentum component p<sub>z</sub>. As indicated in Fig. 3.14, the velocity z component can be described as a product of the angle with respect to the s direction and the speed along s, which is approximately the speed of light. So, effectively, for a given and constant Lorentz γ factor, what matters is the angle, and the (z, p<sub>z</sub>) phase space can be replaced by the (z, z′ = dz/ds) phase space. A similar argument applies in the other transverse direction.

The strength of a quadrupole is defined by

$$k = \frac{1}{B\rho} \frac{\mathrm{d}B\_z}{\mathrm{d}x} \tag{3.22}$$

where ρ is the radius of curvature of the reference trajectory.<sup>15</sup> The equations of motion are then Hill's equations

$$z'' + k(s)z = 0\tag{3.23}$$

$$x'' - \left[k(s) - \frac{1}{\rho^2}\right]x = \frac{1}{\rho}\frac{\Delta p}{p} \tag{3.24}$$

We will only consider solutions for z and z′ (or for the horizontal phase space for Δp = 0, i.e. for the nominal momentum and at the limit of ρ → ∞). This differential equation is reminiscent of simple harmonic motion, but k(s) is not a constant; it is a periodic function that defines the focusing strength at any point along the ring (eqn 3.22). It is obvious that if there were no focusing and k(s) = 0, then beam particles could escape unimpeded. If k(s) were constant around the ring, then the solution would be simple harmonic motion. This suggests the use of oscillatory trial functions similar to those for simple harmonic motion:<sup>16</sup>

$$\begin{pmatrix} z \\ z' \end{pmatrix} = \begin{pmatrix} \sqrt{\epsilon} \sqrt{\beta(s)} \cos[\varphi(s) - \varphi\_0] \\ -\frac{\sqrt{\epsilon}}{\sqrt{\beta(s)}} \{\sin[\varphi(s) - \varphi\_0] + \alpha(s) \cos[\varphi(s) - \varphi\_0] \} \end{pmatrix} \tag{3.25}$$

<sup>16</sup>See Exercise 3.3 for a justification.

The initial conditions determine the values of ε and φ<sub>0</sub>. The function β(s) defines the amplitude modulation, which varies because of the changing focusing strength around the ring. From our trial solution, we can also derive the relationship between the function β(s) and the magnetic focusing k(s). It is convenient to define β = ω<sup>2</sup>, and we then find<sup>17</sup>

$$
\omega'' - \frac{1}{\omega^3} + \omega k(s) = 0\tag{3.26}
$$

<sup>17</sup>See Exercise 3.3.

**Fig. 3.13** Curvilinear coordinate system along the reference trajectory.

**Fig. 3.14** A particle going from z<sub>1</sub> to z<sub>2</sub> in the vertical direction.

<sup>15</sup>In general, as we go around a ring (s), the magnetic focusing will vary, so we write k(s) to remind ourselves that k is not a constant.

**Fig. 3.15** The phase-space ellipse in the (z, z′) plane.

<sup>20</sup>The horizontal and vertical Q values are not related to the Q values of the RF cavities discussed in Section 3.1.3.

In principle, this allows us to determine ω(s) and hence β(s) if we know k(s). However, this is not practical and matrix methods are used to determine β(s).<sup>18</sup> The phase advance function φ(s) is also determined by the focusing. The resulting oscillations about the reference trajectory are called betatron oscillations. The amplitudes of z and z′ are written using the amplitude function β(s) and the emittance ε of the trajectory (at the moment, we are talking only about one particle). The optical function is α(s) ≡ −β′(s)/2, and the phase function φ(s) is related to β(s) by<sup>19</sup> φ′(s) = 1/β(s).

<sup>18</sup>See Wille in Further Reading.

<sup>19</sup>See Exercise 3.3.

At a given point s along the reference trajectory, a particle has coordinates (z, z′) in the vertical phase space. After one revolution, it will come back to the same s but with different coordinates (z, z′). After many revolutions the particle's coordinates (z, z′) will trace an ellipse in the (z, z′) phase space, as shown in Fig. 3.15. Integrating the phase function around the circumference C of the accelerator gives

$$\int\_{s}^{s+C} \mathrm{d}\varphi = 2\pi Q\_z \tag{3.27}$$

where Q<sub>z</sub> is known as the betatron tune, the number of betatron oscillations for a particle going around the accelerator once (in this case the vertical tune Q<sub>z</sub>; similarly, there is a horizontal tune Q<sub>x</sub>).<sup>20</sup> If there are any small imperfections in the ring, we need to avoid particles crossing these imperfections at the same betatron phase in each revolution, otherwise the beam would rapidly 'blow up'. Therefore, integer values of the betatron tune should be avoided. More generally, tunes satisfying

$$
\nu Q\_x + \mu Q\_z = \xi \tag{3.28}
$$

with ν, μ, and ξ integers must be avoided.

The area of the (z, z′) ellipse is πε, so, up to the factor of π, the emittance is the volume of the two-dimensional phase space (to be more precise, here ε<sub>z</sub> and similarly ε<sub>x</sub> for the horizontal phase space). Liouville's theorem states that under the action of conservative forces, the volume of beam phase space is conserved.<sup>21</sup> Therefore, on moving from one point s to another along the reference trajectory, one would get another ellipse but with the same area. So the transverse motion of a particle can be visualized as an ellipse of fixed area changing its shape depending on the location in the accelerator. If there is another particle in the accelerator that is described by the same equations of motion but has a different initial phase φ<sub>0</sub>, then the motion of that particle will be given by the same ellipse, since a different value of φ<sub>0</sub> simply corresponds to another point on the same ellipse. So, in fact, one ellipse describes a family of trajectories, not just a single trajectory. From the algebraic point of view, this family of trajectories is described by the amplitude function β(s) and the emittance ε (see eqn 3.25). But the emittance is constant for 'coasting' (no acceleration) beam particles, which means that we have reduced the problem from two dimensions (z, z′) to one dimension (β). On inspecting Fig. 3.15, we can see that the amplitude function β(s) is the ratio of the beam width to the on-axis angular spread.

<sup>21</sup>We will see how to evade Liouville's theorem when we consider stochastic cooling in Section 3.6.
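The matrix method mentioned above can be sketched for the simplest periodic structure, a thin-lens FODO cell. The code below builds the cell matrix starting and ending in the middle of a focusing quadrupole, and extracts the phase advance per cell μ and the value of β at that point from the standard parametrization of a periodic transfer matrix, cos μ = ½ Tr M and β = M<sub>12</sub>/sin μ. This parametrization, the thin-lens approximation, and all numerical values (focal length, quadrupole spacing, number of cells) are assumptions made for illustration and are not taken from the text.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def thin_lens(f):
    return ((1.0, 0.0), (-1.0 / f, 1.0))


def drift(length):
    return ((1.0, length), (0.0, 1.0))


def fodo_cell(f, spacing):
    """Half focusing quad, drift, defocusing quad, drift, half focusing quad."""
    half_qf = thin_lens(2.0 * f)  # half a quadrupole has twice the focal length
    cell = half_qf
    for element in (drift(spacing), thin_lens(-f), drift(spacing), half_qf):
        cell = matmul(element, cell)
    return cell


def periodic_optics(cell):
    """Phase advance mu, beta and alpha at the start of a periodic cell."""
    cos_mu = 0.5 * (cell[0][0] + cell[1][1])
    if abs(cos_mu) >= 1.0:
        raise ValueError("unstable cell: |Tr M| >= 2")
    mu = math.acos(cos_mu)  # sin(mu) > 0, which gives beta > 0 when M12 > 0
    beta = cell[0][1] / math.sin(mu)
    alpha = (cell[0][0] - cell[1][1]) / (2.0 * math.sin(mu))
    return mu, beta, alpha


F_QUAD = 16.0   # hypothetical quadrupole focal length [m]
SPACING = 10.0  # hypothetical distance between neighbouring quadrupoles [m]
N_CELLS = 100   # hypothetical number of identical cells in the ring

cell_matrix = fodo_cell(F_QUAD, SPACING)
mu, beta, alpha = periodic_optics(cell_matrix)
tune = N_CELLS * mu / (2.0 * math.pi)

print(f"phase advance per cell = {math.degrees(mu):.1f} deg")
print(f"beta at the focusing quadrupole = {beta:.1f} m")
print(f"tune for {N_CELLS} cells = {tune:.2f}")
```

For this cell the phase advance satisfies sin(μ/2) = L/2f, with L the spacing between quadrupoles, so the motion is stable only if f > L/2. The tune of a ring made of identical cells is the number of cells times μ/2π, which is eqn 3.27 evaluated cell by cell.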

Each ellipse represents a family of particles, and the whole ensemble of beam particles consists of many of these families and ellipses. How do we represent the whole ensemble? Considering the vertical phase space (the same argument applies for the horizontal one), a particle beam injected into an accelerator is characterized by initial conditions equivalent to a cluster of points in the (z, z′) phase space, centred about the reference trajectory (0, 0). We choose an ellipse that closely surrounds this cluster, and this represents the 'edge' of the beam. By convention, the ellipse should contain 95% of the particles. Then we follow this ellipse through the accelerator; the ellipse, the corresponding amplitude function β(s), and the beam emittance ε represent the properties of the whole beam.
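That all particles with the same ε but different initial phases lie on one ellipse can be verified numerically from the trial solution of eqn 3.25. The sketch below evaluates the quadratic form γ<sub>CS</sub>z<sup>2</sup> + 2αzz′ + βz′<sup>2</sup> with γ<sub>CS</sub> = (1 + α<sup>2</sup>)/β for a set of phases and checks that it always equals ε. This form of the ellipse equation is the standard one but is not written out in the text, and the values of ε, β, and α are hypothetical.

```python
import math


def trial_solution(eps, beta, alpha, phase):
    """(z, z') from eqn 3.25, with phase = phi(s) - phi_0."""
    z = math.sqrt(eps * beta) * math.cos(phase)
    z_prime = -math.sqrt(eps / beta) * (math.sin(phase) + alpha * math.cos(phase))
    return z, z_prime


def ellipse_invariant(z, z_prime, beta, alpha):
    """gamma*z^2 + 2*alpha*z*z' + beta*z'^2 with gamma = (1 + alpha^2)/beta."""
    gamma_cs = (1.0 + alpha * alpha) / beta
    return gamma_cs * z * z + 2.0 * alpha * z * z_prime + beta * z_prime * z_prime


EPS = 5.0e-9   # hypothetical emittance [m rad]
BETA = 40.0    # hypothetical amplitude function at this s [m]
ALPHA = -1.5   # hypothetical alpha at this s

phases = [2.0 * math.pi * i / 12 for i in range(12)]
points = [trial_solution(EPS, BETA, ALPHA, phase) for phase in phases]
invariants = [ellipse_invariant(z, zp, BETA, ALPHA) for z, zp in points]

z_max = max(abs(z) for z, _ in points)
print(f"invariant / eps ranges from {min(invariants) / EPS:.6f} to {max(invariants) / EPS:.6f}")
print(f"largest |z| = {z_max * 1e3:.3f} mm, sqrt(eps*beta) = {math.sqrt(EPS * BETA) * 1e3:.3f} mm")
```

The maximum excursion is √(εβ), which is why β(s) is called the amplitude function.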

Using Liouville's theorem, we see that as long as the beam is not accelerated, the (z, z′) and (z, p<sub>z</sub>) phase spaces are equivalent. But once the beam is accelerated, Liouville's theorem applies only to the proper phase space (z, p<sub>z</sub>), and only the normalized emittance ε<sub>N</sub> = βγε (here β is the speed in units of c and γ is the Lorentz factor) is conserved. The volume of the (z, z′) phase space shrinks with the momentum p as 1/p and consequently the beam width and the beam angular divergence shrink during acceleration, each as 1/√p; so a higher-energy beam fits into a smaller-diameter beam pipe. This explains why high-energy accelerators require chains of lower-energy accelerators; each accelerator in the chain reduces the emittance enough for the beam to fit into the next accelerator in the chain. In principle, this accelerator chain could be eliminated if the beam pipe of the high-energy accelerator were sufficiently large; however, this would increase the size and hence the cost of the magnets.

An example of such a chain is at the LHC [73], where the source of protons is a bottle of hydrogen gas. A high voltage is used to strip electrons to provide the protons. A linear accelerator (Linac 2) accelerates the protons to an energy of 50 MeV. The beam is then injected into the Proton Synchrotron Booster (PSB), which accelerates the protons to 1.4 GeV, followed by the Proton Synchrotron (PS), which accelerates the beam to 25 GeV. Protons are then injected into the Super Proton Synchrotron (SPS), where they are accelerated to 450 GeV before injection into the LHC.
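The shrinking of the beam along this chain can be estimated from the conservation of the normalized emittance. The sketch below converts each quoted energy into a momentum, computes ε = ε<sub>N</sub>/βγ with βγ = p/m, and gives the beam width relative to that at the exit of the linac for a fixed value of the amplitude function. Treating the quoted energies as kinetic energies, the value of ε<sub>N</sub>, and the use of the same β(s) in every machine are assumptions made for illustration.

```python
import math

M_P = 0.938272  # proton mass [GeV]


def momentum_from_kinetic(t_gev, mass=M_P):
    """Momentum [GeV/c] of a particle with kinetic energy T [GeV]."""
    return math.sqrt(t_gev * (t_gev + 2.0 * mass))


def geometric_emittance(eps_n, p_gev, mass=M_P):
    """Emittance eps = eps_N / (beta*gamma), using beta*gamma = p/m."""
    return eps_n * mass / p_gev


STAGES = [("Linac 2", 0.050), ("PSB", 1.4), ("PS", 25.0), ("SPS", 450.0)]
EPS_N = 3.75e-6  # hypothetical normalized emittance [m rad]

momenta = [momentum_from_kinetic(t) for _, t in STAGES]
emittances = [geometric_emittance(EPS_N, p) for p in momenta]
widths = [math.sqrt(eps / emittances[0]) for eps in emittances]  # relative to Linac 2

for (name, _), p, eps, width in zip(STAGES, momenta, emittances, widths):
    print(f"{name:8s} p = {p:8.3f} GeV/c  eps = {eps:.2e} m rad  relative width = {width:.3f}")
```

Each stage reduces the emittance by the ratio of the momenta, so the beam delivered to the next machine is narrower by the square root of that ratio.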

## **3.3 LHC dipole magnets**

The LHC superconducting dipoles use conductors made from a niobium–titanium (NbTi) alloy.<sup>22</sup> For these dipoles [73], which are capable of generating a magnetic field B = 8.3 T, the current required is I = 11.85 kA. This requires the NbTi superconductor to be cooled to a temperature of 1.9 K using superfluid helium.<sup>23</sup> In a type I superconductor, the current flows only on the surface, not in the bulk, which limits the useful magnetic field. Therefore, high-field superconducting magnets rely on type II superconductors, in which magnetic fluxoids can penetrate the volume.

When there is a changing magnetic field in a superconductor, this will cause screening currents to flow. These are similar to eddy currents, but as there is no resistance they do not decay with time. This magnetization appears as an unwanted error in the field produced by the magnet. The magnetization is proportional to the diameter of the wire carrying the current. When the magnetic field is changing with time, as happens when the beam energy is being ramped up from injection energy, an additional magnetization is created from the flow of current between neighbouring filaments. Therefore, a useful superconducting cable has to be made from a very large number of very small filaments wound as 'twisted pairs' to minimize the magnetization. The cable for the LHC dipole magnets [73] is based on 6 μm-diameter filaments. The filaments are embedded in a copper matrix for mechanical support.<sup>24</sup> Each strand is made from 6300 filaments and is 0.825 mm in diameter, and 36 strands are then used to make a cable, as shown in Fig. 3.16.

For the same reasons as discussed above, it is important to minimize the flux linkage between wires. Twisting wires around each other as in a conventional cable is not sufficient, because the inner (outer) wires remain inside (outside). The wires need to be fully transposed; i.e. every wire must change places with every other wire along the length of the cable so that, averaged over the length, no flux is enclosed.<sup>25</sup>

<sup>22</sup>NbTi is the only low-temperature superconductor that is ductile, and hence most existing superconducting magnets are based on this alloy, although there is interest in the niobium–tin alloy Nb<sub>3</sub>Sn, which might be able to produce larger magnetic fields. This would allow the option of an upgrade to the LHC to reach a CMS energy of 33 TeV. Nb<sub>3</sub>Sn superconductors are used in some high-field MRI magnets.

<sup>23</sup>This has the advantage of benefiting from the remarkable thermal properties of superfluid helium, which has an effective thermal conductivity orders of magnitude better than copper.

<sup>24</sup>Copper is effectively an insulator when the NbTi is in the superconducting state.

<sup>25</sup>This type of cable is called Rutherford cable because it was developed at the UK Rutherford Laboratory, and it is used in all high-field superconducting magnets. The cable is used in MRI scanners, so this is probably one of the most important but least known spin-offs from particle physics research.
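The cable numbers quoted above can be combined into a few derived quantities. The sketch below counts the filaments in a cable and estimates the current per strand, the fraction of the strand cross-section occupied by NbTi, and the current density in the superconductor. It assumes that the full 11.85 kA flows through a single cable and is shared equally between strands and filaments, which is a simplification introduced here rather than a statement from the text.

```python
import math

FILAMENT_DIAMETER = 6.0e-6   # filament diameter [m]
FILAMENTS_PER_STRAND = 6300
STRAND_DIAMETER = 0.825e-3   # strand diameter [m]
STRANDS_PER_CABLE = 36
CABLE_CURRENT = 11.85e3      # dipole current [A], assumed to be carried by one cable

filaments_per_cable = FILAMENTS_PER_STRAND * STRANDS_PER_CABLE
current_per_strand = CABLE_CURRENT / STRANDS_PER_CABLE

strand_area = math.pi * (STRAND_DIAMETER / 2.0) ** 2
nbti_area = FILAMENTS_PER_STRAND * math.pi * (FILAMENT_DIAMETER / 2.0) ** 2
nbti_fraction = nbti_area / strand_area
current_density = current_per_strand / nbti_area  # [A/m^2], equal sharing assumed

print(f"filaments per cable     = {filaments_per_cable}")
print(f"current per strand      = {current_per_strand:.0f} A")
print(f"NbTi fraction of strand = {nbti_fraction:.2f}")
print(f"current density in NbTi = {current_density / 1e6:.0f} A/mm^2")
```

On these assumptions about one third of each strand is superconductor, with the remaining two thirds being the copper matrix.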

To create a perfect dipole field, a distribution of current density varying as cos φ around the beam pipe would be required. However, a very nearly uniform dipole field near the centre of the beam pipe is created by blocks of superconductor arranged in the geometry shown in Fig. 3.16(d). The currents in the blocks are optimized to produce a uniform magnetic field.

**Fig. 3.16** Filaments (a), strands (b), and cable (c) of the type used for the LHC superconducting magnets, and a cross-section of one-quarter of the coils used in a main dipole (d). Note that the superconducting cable to create the dipole field is placed in the small outlined boxes in (d). From [73].

## **3.3.1 Engineering design details**

As there is insufficient space in the LHC tunnel for separate magnets for each beam, a 'two-in-one' magnet was designed in which the two magnetic volumes are inside a common cryostat [73], as illustrated in Fig. 3.17.<sup>26</sup> This magnet design is an amazing engineering tour de force, as the figure shows. In comparison, the dipole magnets for a pp̄ collider are straightforward.

<sup>26</sup>This also allowed for significant cost savings compared with having two separate magnets and cryostats.

The two beam pipes containing the counter-circulating proton beams are in the centre, surrounded by their respective dipole bending magnets with fields orientated to bend the two separate positively charged particle beams in opposite directions. The superconducting cable is held in place by non-magnetic 'collars' of austenitic steel, which can withstand a magnetic force of about 400 tonnes per metre of dipole. At nominal operation, the energy stored in each of the 1232 LHC dipoles is 6.93 MJ. If one small region of the superconductor becomes non-superconducting (called a quench) for any reason, there will be Joule heating, which will increase the resistance, and hence there is the possibility of a catastrophic runaway, which would destroy the magnet. Therefore, sophisticated quench detection and protection systems are essential. Quenches can be detected by the extra 'IR' voltage drop. Once a quench is detected, a 'quench heater' is operated to force the entire magnet to become non-superconducting and the energy is transferred to a large 'dump' resistor.<sup>27</sup>

**Fig. 3.17** Cross-section of an LHC dipole in its cryostat. From [73].

<sup>27</sup>An additional source of danger is the electrical connection between the dipoles. This has a tiny but non-zero resistance. If this resistance is too large, this can also lead to catastrophic thermal runaway, as occurred on 19 September 2008 and led to extensive damage. Many improvements have been made since then to prevent this type of problem.

The energy stored in the two beams at nominal operation is 362 MJ, so they have sufficient energy to destroy large parts of the LHC machine and the detectors. Therefore, many beam loss monitors are installed in the machine, and if the rates rise above a threshold, kicker magnets are operated to deflect the beams out of the ring towards a beam dump [73]. The beam dump must be able to dilute the peak energy density of the beam before it is absorbed. At the LHC, this is done by 'spraying' the beam in a spiral pattern as it enters the dump.<sup>28</sup>

<sup>28</sup>Somewhat analogous to what one does with a garden hose to avoid damaging a sensitive plant.

## **3.4 Colliders and fixed-target accelerators**

The centre-of-mass energy in a symmetric collider, with each beam having energy E, is simply given by √s = 2E. For a fixed-target collision, with a beam energy E and a target mass m (assuming E ≫ m), we can show<sup>29</sup> that

$$
\sqrt{s} = \sqrt{2mE} \tag{3.29}
$$

<sup>29</sup>See Exercise 3.1.

Therefore, colliders have an obvious advantage in maximizing the centre-of-mass energies over fixed-target experiments. Although this was realized a long time ago, the challenge of achieving a useful interaction rate was formidable. The interaction rate for a given physics process depends on the luminosity (see Section 3.5). In a fixed-target geometry, it is only necessary for one high-intensity beam to collide with a block of matter to achieve very high luminosities. For a collider, this is far more challenging because we need two intense beams, which both have to be focused to very small transverse dimensions at the interaction points to achieve a useful luminosity. In a fixed-target accelerator, the beam must be kept for a few seconds before it is extracted. However, in a collider, it takes time to 'fill' the machine with sufficient numbers of particles before they are accelerated to the peak energy, and then the beams have to be kept for several hours while data are taken. This obviously puts much more significant demands on the quality of the accelerator. We must avoid the dangerous resonances and ensure that imperfections, which can increase the emittances of the beams, are kept to a minimum. We also need an extremely high vacuum to minimize beam losses and backgrounds in the detector.<sup>30</sup> The defocusing effects of one beam on the other must also be kept under control.

<sup>30</sup>At the LHC, the pressure has to be kept below about 10<sup>−7</sup> Pa.
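The advantage of a collider in centre-of-mass energy is easily quantified. The sketch below compares √s = 2E for a symmetric collider with the fixed-target value of eqn 3.29 for the same beam energy, and also evaluates the exact invariant s = m<sub>b</sub><sup>2</sup> + m<sup>2</sup> + 2Em for a beam particle of mass m<sub>b</sub>, which is the standard expression from which the approximation follows. The choice of a 450 GeV proton beam on a proton target is an illustrative assumption.

```python
import math

M_P = 0.938272  # proton mass [GeV]


def sqrt_s_collider(e_beam):
    """Centre-of-mass energy [GeV] of a symmetric collider."""
    return 2.0 * e_beam


def sqrt_s_fixed_target(e_beam, m_target):
    """Eqn 3.29, valid for E >> m."""
    return math.sqrt(2.0 * m_target * e_beam)


def sqrt_s_fixed_target_exact(e_beam, m_beam, m_target):
    """Exact invariant: s = m_b^2 + m^2 + 2*E*m, with E the total beam energy."""
    return math.sqrt(m_beam ** 2 + m_target ** 2 + 2.0 * e_beam * m_target)


E_BEAM = 450.0  # assumed proton beam energy [GeV]

collider = sqrt_s_collider(E_BEAM)
fixed = sqrt_s_fixed_target(E_BEAM, M_P)
fixed_exact = sqrt_s_fixed_target_exact(E_BEAM, M_P, M_P)

print(f"collider:             sqrt(s) = {collider:.0f} GeV")
print(f"fixed target (3.29):  sqrt(s) = {fixed:.2f} GeV")
print(f"fixed target (exact): sqrt(s) = {fixed_exact:.2f} GeV")
```

For a fixed target √s grows only as the square root of the beam energy, so the gap between the two configurations widens as the energy increases.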

Some of these issues are common to all types of colliders. We always want higher luminosity, and how to achieve this is discussed in Section 3.5. The special issue for circular e<sup>+</sup>e<sup>−</sup> colliders is synchrotron radiation (see eqn 3.15), which puts a practical limit on the beam energy. Therefore, large e<sup>+</sup>e<sup>−</sup> colliders like LEP have simple (and cheap) magnets but require very efficient RF cavities—which means superconducting cavities (see Section 3.1). Hadron colliders do not suffer from significant synchrotron radiation losses, so we can have much higher beam energies. To optimize the beam energy for a given cost, we need to use the highest magnetic field possible for the dipole bending magnets. The critical technological challenge is the industrial-scale production of very high-quality superconducting magnets (see Section 3.3).
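The contrast between electron and hadron rings can be made quantitative with eqn 3.15. Restoring the factors of c and multiplying the radiated power by the time 2πρ/c needed for one turn gives an energy loss per turn of (4π/3) r<sub>c</sub> mc<sup>2</sup> γ<sup>4</sup>/ρ for a ring with a uniform bending radius. The sketch below evaluates this for an electron and a proton beam; the beam energies and bending radii are round, LEP-like and LHC-like values assumed for illustration and are not quoted in the text.

```python
import math

R_E = 2.8179403e-15  # classical electron radius [m]
M_E = 0.000510999    # electron mass [GeV]
M_P = 0.938272       # proton mass [GeV]


def energy_loss_per_turn(energy, mass, rho):
    """Synchrotron energy loss per turn [GeV] for a uniform bending radius rho [m]."""
    gamma = energy / mass
    r_c = R_E * M_E / mass  # the classical radius scales as 1/m
    return (4.0 * math.pi / 3.0) * r_c * mass * gamma ** 4 / rho


E_ELECTRON, RHO_ELECTRON = 100.0, 3100.0  # assumed LEP-like beam [GeV] and radius [m]
E_PROTON, RHO_PROTON = 7000.0, 2800.0     # assumed LHC-like beam [GeV] and radius [m]

loss_electron = energy_loss_per_turn(E_ELECTRON, M_E, RHO_ELECTRON)
loss_proton = energy_loss_per_turn(E_PROTON, M_P, RHO_PROTON)

print(f"electron: {loss_electron:.2f} GeV per turn "
      f"({100.0 * loss_electron / E_ELECTRON:.1f}% of the beam energy)")
print(f"proton:   {loss_proton * 1e6:.1f} keV per turn")
```

An electron at this energy loses a few per cent of its energy on every turn, which the RF system has to restore, whereas the proton loss is only a few keV per turn despite the much higher beam energy.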

In an e<sup>+</sup>e<sup>−</sup> collider, the energy defines which processes can be studied<sup>31</sup> and which particles can be created or discovered. In a hadron collider such as the LHC, there is a more subtle interplay between energy and luminosity. We are really interested in the rates for processes at the parton level, and the partons only carry a fraction of the momentum of the protons, so the energy reach of a hadron collider depends crucially on both energy and luminosity. Therefore, there is some complementarity between the very clean physics that can be performed at e<sup>+</sup>e<sup>−</sup> colliders, compared with the higher energy reach of hadron colliders, in which the event reconstruction is more complicated because, in addition to the interesting parton–parton collisions, there are always interactions of the remaining 'spectator quarks'.

<sup>31</sup>Energy conservation implies that for a beam energy E, in a symmetric collider, the most massive particle that can be pair-produced will have mass m = E. In general, the sum of the masses of the final-state particles must be less than the centre-of-mass energy (2E).

The HERA collider was a special case because it used e<sup>±</sup>p collisions. This required a high-energy proton ring (HERA I 820 GeV, HERA II 920 GeV) and a much lower-energy electron ring (27.5 GeV) to minimize synchrotron radiation.<sup>32</sup> As will be discussed in Chapter 9, this gave the most precise determination of the quark and gluon distribution functions. These are required for all calculations of cross sections at hadron colliders like the LHC; the details will be covered in the same chapter.

<sup>32</sup>The ep CMS energies were 300 and 318 GeV, respectively.

The LHC uses pp collisions, but earlier hadron colliders used p̄p collisions. The important advantage of p̄p colliders is that the two beams can be contained in the same beam pipe. The critical issue for these colliders was how to produce intense and low-emittance p̄ beams. This will be discussed in Section 3.6. Colliders have also been operated using heavy ions, but these will not be discussed in this book.

## **3.5 Luminosity**

The two most important numbers in experimental (accelerator) particle physics are the energy and the luminosity of an accelerator. The luminosity ℒ translates a cross section σ into the rate R of produced events:<sup>33</sup>

$$\frac{\mathrm{d}R}{\mathrm{d}\Omega} = \mathcal{L}\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} \tag{3.30}$$

where Ω is the solid angle for the angular differential cross section. In fixed-target experiments,

$$\mathcal{L} = n\rho l \tag{3.31}$$

where n is the number of particles per second in the beam (typically 10<sup>12</sup> s<sup>−1</sup>), ρ is the density of target particles, and l is the target length (typically, ρl ∼ 10<sup>23</sup> cm<sup>−2</sup>), giving a luminosity ℒ ∼ 10<sup>35</sup> cm<sup>−2</sup> s<sup>−1</sup>, which is large in comparison with that at a high-energy collider.<sup>34</sup>

<sup>33</sup>Or observed events, if detector effects are included.

<sup>34</sup>This simple formula assumes that the target is 'thin', i.e. that the probability that an individual beam particle interacts in the target is much less than one.

**Fig. 3.18** Simple geometry for derivation of the luminosity formula. Symbols are explained in the text.

ally the horizontal beam size is bigger than the vertical one.

> We begin by deriving useful formulae for the luminosity and then discuss how to optimize it.

> We start with the simple scenario sketched in Fig. 3.18, with two rectangular bunches of particles colliding head on. They contain respectively n<sup>1</sup> and n<sup>2</sup> randomly distributed particles. A is the transverse area of the wider bunch and there are b bunches in each beam, with frequency of revolution f. Then

$$\mathcal{L} = \frac{n\_1 n\_2}{A} bf \tag{3.32}$$

In a collider, n2/A corresponds to the fixed-target ρl; it is the number of target particles per unit area. If particles are distributed in colliding bunches not randomly but according to normalized density distributions ρ<sup>1</sup> and ρ2, then

$$\mathcal{L} = bfn\_1n\_2 \int\_S \rho\_1 \rho\_2 \,\mathrm{d}S \tag{3.33}$$

where S is the transverse space. Introducing beam currents I<sup>1</sup> = n1ef b and I<sup>2</sup> = n2ef b, where e is the electron charge,

$$\mathcal{L} = \frac{I\_1 I\_2}{e^2 b f} \int\_S \rho\_1 \rho\_2 \,\mathrm{d}S \tag{3.34}$$

Assuming Gaussian distributions with σ<sup>x</sup> = σ<sup>z</sup> = σ for simplicity, <sup>35</sup> <sup>35</sup> This is not very realistic, since typic-

$$\rho\_{1,2} = \frac{1}{2\pi\sigma\_{1,2}^2} \exp\left(-\frac{r^2}{2\sigma\_{1,2}^2}\right) \tag{3.35}$$

we get

$$\mathcal{L} = \frac{I\_1 I\_2}{e^2 b f} \frac{1}{2\pi (\sigma\_1^2 + \sigma\_2^2)}\tag{3.36}$$

where 2π(σ<sup>2</sup> <sup>1</sup> + σ<sup>2</sup> <sup>2</sup>) represents an effective area.
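The step from eqn 3.34 to eqn 3.36 rests on the overlap integral of two round Gaussian beams. The following is a minimal numerical sketch of that step, assuming illustrative beam sizes of 16 μm and 32 μm (values chosen here, not taken from the text); the function names are hypothetical.

```python
import math

def overlap_closed_form(sigma1, sigma2):
    """Overlap integral of two round Gaussian beams, as used in eqn 3.36."""
    return 1.0 / (2.0 * math.pi * (sigma1**2 + sigma2**2))

def overlap_numeric(sigma1, sigma2, steps=20000):
    """Midpoint-rule estimate of the integral of rho1 * rho2 over the transverse plane."""
    r_max = 10.0 * max(sigma1, sigma2)   # integrand is negligible beyond this radius
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        # Normalized round Gaussians of eqn 3.35
        rho1 = math.exp(-r**2 / (2.0 * sigma1**2)) / (2.0 * math.pi * sigma1**2)
        rho2 = math.exp(-r**2 / (2.0 * sigma2**2)) / (2.0 * math.pi * sigma2**2)
        total += rho1 * rho2 * 2.0 * math.pi * r * dr   # dS = 2 pi r dr
    return total

SIGMA_1 = 16e-6   # m, assumed beam size
SIGMA_2 = 32e-6   # m, assumed beam size
numeric = overlap_numeric(SIGMA_1, SIGMA_2)
exact = overlap_closed_form(SIGMA_1, SIGMA_2)
print(f"numeric {numeric:.6e} m^-2, closed form {exact:.6e} m^-2")
```

The inverse of the returned value is the effective area; for equal beam sizes it reduces to 4πσ<sup>2</sup>.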

In order to illustrate how this relates to the accelerator parameters, we will assume that the vertical and the horizontal emittances are equal, ε<sub>x</sub> = ε<sub>z</sub> = ε/π, and that the horizontal and the vertical beta functions are equal, β<sub>x</sub> = β<sub>z</sub> = β<sup>∗</sup>. We also assume that the two beams have equal emittance. The asterisk in β<sup>∗</sup> is the common symbol to indicate that the function is calculated at the interaction point (IP). As

$$
\sigma^2 = \epsilon \beta^\* \tag{3.37}
$$

$$\mathcal{L} = \frac{I\_1 I\_2}{4\pi e^2 b f \beta^\* \epsilon} \tag{3.38}$$
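To get a feel for the magnitudes in eqn 3.38, the sketch below evaluates it for one set of LHC-like inputs. All numerical inputs are assumptions made for this illustration (they are not quoted in the text), the emittance is taken in the convention σ<sup>2</sup> = εβ<sup>∗</sup> of eqn 3.37, and effects such as a beam crossing angle are ignored.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs

def beam_current(n_per_bunch, n_bunches, f_rev):
    """Beam current I = n e f b in amperes."""
    return n_per_bunch * E_CHARGE * f_rev * n_bunches

def luminosity(i1, i2, n_bunches, f_rev, beta_star, emittance):
    """Eqn 3.38: L = I1 I2 / (4 pi e^2 b f beta* epsilon), in m^-2 s^-1."""
    return i1 * i2 / (4.0 * math.pi * E_CHARGE**2 * n_bunches * f_rev
                      * beta_star * emittance)

# Illustrative LHC-like inputs (assumed for this sketch).
N_PER_BUNCH = 1.15e11   # protons per bunch
N_BUNCHES = 2808        # bunches per beam
F_REV = 11245.0         # revolution frequency in Hz
BETA_STAR = 0.55        # beta function at the IP in m
EMITTANCE = 5.0e-10     # emittance in m, such that sigma^2 = epsilon * beta*

current = beam_current(N_PER_BUNCH, N_BUNCHES, F_REV)
lumi_cm2 = luminosity(current, current, N_BUNCHES, F_REV, BETA_STAR, EMITTANCE) * 1e-4

print(f"beam current: {current:.2f} A")
print(f"luminosity  : {lumi_cm2:.2e} cm^-2 s^-1")
```

With these inputs the result is of order 10<sup>34</sup> cm<sup>−2</sup> s<sup>−1</sup>. Halving `BETA_STAR` or `EMITTANCE` doubles the luminosity, which is the scaling the formula is meant to convey.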

Equation 3.38 gives the best guide for understanding how to maximize the luminosity for a collider. First, it is clear that increasing the beam currents I<sub>1</sub> and I<sub>2</sub> is desirable. If only a limited number of protons can fit in one bunch, then it is advantageous to increase the number of bunches. However, the beam currents cannot be increased indefinitely, because each beam exerts electromagnetic forces on the other beam at each IP. The net effect of one bunch on a counter-rotating bunch is similar to that of an additional quadrupole magnet, and therefore changes the horizontal and vertical Q values.<sup>36</sup> This is very dangerous because, even if the operating point of the machine is away from integer resonances (see eqn 3.28), such a tune shift can push the beams too close to a resonance, resulting in very rapid beam loss. This beam–beam tune shift thus limits the ultimate luminosity that can be achieved in a hadron collider.<sup>37</sup>

Note the rather counter-intuitive result of eqn 3.38 that if the beam currents are at the beam–beam limit, then the luminosity can be increased by decreasing the number of bunches. However, the optimization of the number of bunches must also respect practical constraints imposed by the detectors. If the number of bunches is reduced, the number of collisions per bunch crossing will increase. Therefore, collisions with one interesting event will also contain a background of many other 'minimum-bias' collisions, which are effectively a noise source.<sup>38</sup> There is therefore a trade-off between maximizing luminosity and having clean enough events to be useful, and there is no perfect solution. At the LHC design luminosity of 10<sup>34</sup> cm<sup>−2</sup> s<sup>−1</sup>, there will be approximately 25 collisions per bunch crossing (every 25 ns).

<sup>36</sup>This effect is called the 'beam–beam' tune shift. See Wille in Further Reading for a full explanation. The horizontal and vertical Q values are not related to the Q-values of the RF cavities discussed in Section 3.1.3.

<sup>37</sup>The situation is different in e<sup>+</sup>e<sup>−</sup> colliders because of the beam 'cooling' from synchrotron radiation.

<sup>38</sup>How to cope with this is covered in more detail in Section 13.2.

The next parameters to optimize are the emittances. These depend on the quality of the proton source. Although Liouville's theorem predicts the conservation of beam phase space, any imperfections can increase the emittances. Finally, one can increase the luminosity by decreasing the values of β<sup>∗</sup>. This is achieved by using very strong focusing quadrupole magnets, which will usually be superconducting to achieve the highest field gradients. The consequence of reducing the transverse beam size at the IP is that the beam divergences will increase. This limits how far β<sup>∗</sup> can be reduced before the beam losses from particles hitting the beam pipe become unacceptable. This, in turn, implies that there is an advantage in bringing the quadrupole magnets closer to the IP, but as this will reduce the space for detectors in the forward region, the trade-off will depend on the particular physics being studied.

## **3.6** *p̄p* **colliders**

Hadron colliders can use two separate p beams as in the LHC. However, this requires two separate magnetic fields for the counter-rotating p beams. With counter-rotating beams of particles and antiparticles, the electric and magnetic fields are automatically correct for both beams, so only one ring is required. This enabled the relatively cheap conversion of proton synchrotrons at CERN and Fermilab into p̄p colliders. The use of p̄p colliders was very important as it led to the discovery of the W and Z bosons and the top quark, as well as providing the most precise measurement of the W mass.<sup>39</sup>

<sup>39</sup>They also demonstrated to sceptical physicists that very clean results could be obtained in this difficult environment.

**Fig. 3.19** Schematic view of a synchrotron with transverse pickup (TP), fast amplifier (A), and transverse kicker (TK) electrodes for stochastic cooling.

The big challenge for p̄p colliders was to produce very intense, low-emittance beams of antiprotons, which was achieved using the technique of stochastic cooling [132]. Liouville's theorem prevents a reduction in beam phase space, but it is based on the assumption that the beam is continuous, whereas a real beam is composed of a finite number of discrete particles. Consider first the extreme case of a single particle in a beam; we can detect its transverse position using a beam pickup electrode at one place in the ring, which generates a signal proportional to the displacement about the central orbit. The signal is fed across the ring via an amplifier to two deflecting plates as shown in Fig. 3.19.

The betatron function at the deflecting plates is an odd multiple of π/2 out of phase with that of the pickup, so that a deviation in position from the reference orbit is converted to a difference in angle. The shorter path for the cable compared with the particle trajectory compensates for the difference in speeds of the particles and the electrical signals as well as the delay in the amplifier. Therefore, a suitable voltage pulse can be used to make a correction to bring the particle back to the central orbit.
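The role of the π/2 phase advance can be illustrated with a single particle in normalized betatron coordinates (u, u′), in which transport between two points of the ring is a simple rotation by the phase advance. This is a minimal sketch under that assumption; the sign of the kick and the unit gain are conventions chosen here, and the function names are hypothetical.

```python
import math

def advance(u, up, dphi):
    """Rotate normalized betatron coordinates (u, u') through a phase advance dphi."""
    c, s = math.cos(dphi), math.sin(dphi)
    return c * u + s * up, -s * u + c * up

def one_pass(u, up, dphi, gain=1.0):
    """Pickup measures u; the particle travels dphi to the kicker, which adds gain * u to u'."""
    measured = u                       # signal proportional to displacement at the pickup
    u_k, up_k = advance(u, up, dphi)   # coordinates on arrival at the kicker
    return u_k, up_k + gain * measured

def amplitude(u, up):
    """Betatron oscillation amplitude in normalized coordinates."""
    return math.hypot(u, up)

# A particle at maximum displacement at the pickup, for two choices of phase advance.
for label, dphi in (("pi/2", math.pi / 2), ("pi", math.pi)):
    u, up = one_pass(1.0, 0.0, dphi)
    print(f"phase advance {label}: amplitude after kick = {amplitude(u, up):.3f}")
```

For a phase advance of π/2 the displacement seen at the pickup has become a pure angle at the kicker, so the kick removes the oscillation; for a phase advance of π the same kick makes the amplitude larger. Other odd multiples of π/2 work equally well with the sign of the gain reversed.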

In a real beam with a large number of particles, this cooling technique works on a statistical basis.<sup>40</sup> If the amplifier were sufficiently fast, then each individual particle in the beam would have the correction applied to bring it back to the central orbit. Therefore, the cooling works better the shorter the sample time of the amplifier, since this determines how many particles are affected. Hence, the cooling rate depends on the bandwidth of the amplifier, W. As stochastic cooling requires the detection of fluctuations, it works faster for smaller numbers of beam particles, N, and the cooling time τ ≈ N/2W. For useful bunches, we need N ≈ 10<sup>12</sup>, and with achievable bandwidth amplifiers this leads to a cooling time of the order of 1 day.

<sup>40</sup>Hence the name 'stochastic cooling'.
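The scaling τ ≈ N/2W is easy to explore numerically. The sketch below is only an evaluation of that estimate: the bandwidth used is the effective value implied by the quoted N ≈ 10<sup>12</sup> and a one-day cooling time, which is an assumption of this illustration rather than a hardware specification.

```python
SECONDS_PER_DAY = 86400.0

def cooling_time(n_particles, bandwidth_hz):
    """Stochastic cooling time estimate, tau ~ N / (2 W), in seconds."""
    return n_particles / (2.0 * bandwidth_hz)

def bandwidth_for(n_particles, tau_s):
    """Bandwidth W (Hz) that gives a cooling time tau_s for N particles in the same estimate."""
    return n_particles / (2.0 * tau_s)

# Effective bandwidth implied by N = 1e12 and a cooling time of one day.
w_eff = bandwidth_for(1e12, SECONDS_PER_DAY)
print(f"effective bandwidth: {w_eff:.2e} Hz")

# At fixed bandwidth the cooling time grows linearly with the number of particles.
for n in (1e9, 1e10, 1e11, 1e12):
    hours = cooling_time(n, w_eff) / 3600.0
    print(f"N = {n:.0e}: tau = {hours:8.3f} h")
```

The linear dependence on N is the reason the antiprotons are cooled in stages as the stack is built up, rather than all at once.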

It is also necessary to provide momentum cooling to reduce the spread in momenta. A pickup electrode can be used to measure the revolution frequencies of particles, and the signal is fed into an amplifier via a filter. The filter eliminates any signal for particles with the correct frequency (i.e. momentum), and higher frequencies give a phase shift of π. This filtered signal can then be fed into an accelerating cavity. Again this system would work perfectly for individual particles and works on a statistical basis for beams with a finite number of particles.

## **3.6.1 CERN** *p̄p* **collider**

The antiprotons were produced by collisions of 26 GeV/c protons from the CERN PS with a copper target. Some of the produced antiprotons with momenta ∼3.5 GeV/c were collected by a large-aperture low-energy ring called the antiproton accumulator. The pulse of antiprotons was first cooled and then moved to the side of the aperture, where an intense stack of antiprotons was built up, allowing a new injection of antiprotons every 2.2 s. The antiprotons were accumulated and cooled using stochastic cooling for about one day and then injected back into the PS and accelerated to 26 GeV/c; they were then injected into the SPS together with counter-rotating bunches of protons. The protons and antiprotons were then accelerated to an energy of 315 GeV (initially 270 GeV), and a run would last until sufficient antiprotons had been accumulated or the beams were lost. The peak luminosity achieved was ∼2×10<sup>30</sup> cm<sup>−2</sup> s<sup>−1</sup>.

## **3.6.2 Tevatron** *p̄p* **collider**

Similar principles were applied to the Tevatron. For the second phase of the Tevatron (Run 2), a 150 GeV synchrotron called the main injector was built to provide higher yields of antiprotons. The acceptance of antiprotons and the cooling were split between two separate machines. After cooling, the protons and antiprotons were re-accelerated in the main injector and then injected into the superconducting Tevatron and accelerated to an energy of 0.98 TeV. The peak luminosity achieved was ∼10<sup>32</sup> cm<sup>−2</sup> s<sup>−1</sup>.

## **3.7 Future accelerators**

The RF technology for accelerating particles has been developed over many years, providing accelerators with steadily growing energies and luminosities. Proposed new accelerators, such as the International Linear Collider and the Future Circular Collider, are still based on RF technology. However, the size of these accelerators and their high cost demand a new, cheaper technology to allow the field of particle physics to move beyond current plans and also make applications affordable, for example as a source of ultrashort X-ray pulses to study the dynamics of chemical reactions. One such technology where progress in recent years has been particularly impressive is plasma acceleration. A plasma is an ionized gas such as hydrogen. By moving electrons away from quasistationary ions, it is possible to create electric fields up to three orders of magnitude larger than those obtainable using RF technology, thus allowing accelerators to be much smaller in size. A high-intensity laser pulse or a train of laser pulses or bunches of charged particles like electrons or protons can be used to separate electrons from ions, creating very high electric fields in plasmas. Electrons have been accelerated to energies above 4 GeV in a 9 cm-long accelerator in the BELLA laboratory at Berkeley,<sup>41</sup> demonstrating the clear potential of this technology. The maximum beam energies reached or expected to be reached at future facilities are presented in Fig. 3.20, together with the longer-term prospects for plasma accelerators.

<sup>41</sup>Berkeley Lab Laser Accelerator Center.

**Fig. 3.20** The Livingston plot of the maximum beam energy for conventional RF and plasma accelerators. Image credit: R. Assmann.
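The BELLA figures quoted above translate directly into an average accelerating gradient. The short sketch below does the arithmetic; the RF gradient of 30 MV/m used for comparison is an assumed, typical value for this illustration and is not taken from the text.

```python
def gradient_gev_per_m(energy_gain_gev, length_m):
    """Average accelerating gradient in GeV per metre."""
    return energy_gain_gev / length_m

PLASMA_GRADIENT = gradient_gev_per_m(4.0, 0.09)   # 4 GeV gained in 9 cm
RF_GRADIENT = 0.03                                # GeV/m, assumed typical RF value

ratio = PLASMA_GRADIENT / RF_GRADIENT
rf_length = 4.0 / RF_GRADIENT                     # RF length needed for the same 4 GeV

print(f"plasma gradient: {PLASMA_GRADIENT:.1f} GeV/m")
print(f"ratio to assumed RF gradient: {ratio:.0f}")
print(f"RF length for 4 GeV: {rf_length:.0f} m")
```

Under this assumption the ratio comes out at about three orders of magnitude, in line with the statement in the text.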

## **Chapter summary**


## **Further reading**


## **Exercises**


$$\eta\_{\rm RF} = \left(\frac{\partial f}{\partial v}\frac{\partial v}{\partial p} + \frac{\partial f}{\partial R}\frac{\partial R}{\partial p}\right)\frac{p}{f}$$

(b) Show that

$$\frac{\partial v}{\partial p} = \frac{1}{m\gamma^3}$$

	- (b) Using the requirement that the coefficients of sin and cos must be identical, together with the results from (a), show that our trial solution to eqn 3.24 is valid provided the condition φ′ = 1/β is satisfied.
	- (c) Verify eqn 3.26 by equating coefficients of the cos terms in Hill's equation (3.24).
	- (b) Use matrix multiplication to evaluate the matrix for the combination FOD and hence derive eqn 3.21.

Particles are uniformly distributed in cylindrical bunches that move parallel to the cylinder axis. The radius of each bunch in the wider beam is 4 × 10<sup>−</sup><sup>4</sup> m. Calculate the luminosity.


$$L = n\_b f\_{\rm rev} n\_1 n\_2 \int \rho\_1(x, y) \rho\_2(x, y) \,\mathrm{d}x \,\mathrm{d}y$$

Explain the origin of this equation. Assume that the particle densities are uncorrelated in the x and y directions. The luminosity can then be written as

$$L = n\_{\rm b} f\_{\rm rev} n\_1 n\_2 \Omega\_x \Omega\_y$$

where the beam overlap is defined in x (with an equivalent definition in y) as

$$
\Omega\_x = \int \rho\_1(x)\rho\_2(x) \,\mathrm{d}x
$$

Assume that the beams have Gaussian distributions with a common centre and root mean squares σ<sub>x1</sub> and σ<sub>x2</sub>. Show that

$$
\Omega\_x = \frac{1}{\sqrt{2\pi}\sqrt{\sigma\_{x1}^2 + \sigma\_{x2}^2}}
$$

Let R(x) be the interaction rate as a function of the separation of the beams in the x direction:

$$R(x) \propto \exp\left[-\frac{x^2}{2(\sigma\_{x1}^2 + \sigma\_{x2}^2)}\right]$$

Therefore, the value of σ<sub>x1</sub><sup>2</sup> + σ<sub>x2</sub><sup>2</sup> can be determined from a VDM scan if the interaction rate for some particular process, R(x), is measured as the separation between the two beams, x, is varied. Assuming that the fraction of bunch crossings that register a hit in a counter is p, show that the mean number of collisions that produce such hits is given by μ = − ln(1 − p). Why is it advantageous to use a small counter? What is the problem with this technique at very high luminosity? Suggest a possible detector technique that could be used for such a counter.

(3.9) Synchrotrons have a periodic ring-shaped lattice of focusing and bending magnets. Specifying the position of a beam particle on the circumference of the accelerator by the distance s, the equations of motion in both horizontal and vertical directions about the ideal orbit have the form

$$\frac{\mathrm{d}^2x}{\mathrm{d}s^2} + K(s)x = 0$$

where K(s + L) = K(s) is periodic. The solution is a quasiperiodic function of the form

$$x(s) = \sqrt{\varepsilon\beta(s)}\cos[\phi(s) - \phi\_0]$$

where ε and φ<sup>0</sup> are constants. Both ε and β(s) have dimensions of length. The phase φ(s) is related to the beta function β(s) by dφ/ds = 1/β. Explain the significance of the beta function.

Show that x and x′ ≡ dx/ds satisfy

$$\frac{x^2 + \left[\beta(s)x' + \alpha(s)x\right]^2}{\beta(s)} = \varepsilon$$

where

$$\alpha(s) = -\frac{1}{2} \frac{\mathrm{d}\beta}{\mathrm{d}s}$$

Reduce the expression to the standard form for an ellipse in (x, x′) space and show that its area is πε. This is the emittance of the beam (in one transverse direction). Explain its importance for accelerator design and control.

# **Particle detectors 4**

## **4.1 Introduction**

All the experimental discoveries that underpin our understanding of particle physics rely on particle detectors, so a good knowledge of how these sophisticated devices work is essential. The complexity of particle detectors has grown enormously from very simple beginnings to the very powerful detector systems used at the LHC. As in the rest of this book, we will not take a historical approach but try to find the easiest and most direct way to explain the fundamental physics. We will start in Section 4.2 with an overview of a collider detector, focusing on what the requirements are and giving a simple description of how the different subsystems are used to identify some types of particles and measure the energy of individual particles or 'jets'. This will give us a good idea of what a collider detector looks like, but it will tell us nothing about how any particular detector actually works. To gain any useful understanding, we need to consider the basic detector physics that will explain quantitatively the performance of real detectors.

We start this systematic approach in Section 4.3 by considering how high-energy particles interact with matter and lose energy. The processes result in a relatively small number of electron–ion pairs, so the next issue to consider is how we can use this effect to create a measurable signal.<sup>1</sup> The fundamental detector physics of how signals are generated will be described in Section 4.4, since this step is obviously essential for any real understanding of how a particle detector works.

<sup>1</sup>This is not strictly correct in silicon detectors, where we deal with electron–hole pairs.

Armed with this knowledge, we can start to consider how basic particle detectors actually work. In Section 4.6, we will look at two techniques used for tracking the trajectory of charged particles: wire chambers and silicon detectors. Next, in Section 4.7, we will consider how to make energy measurements for charged and neutral particles in devices called calorimeters.

To select interesting events for permanent storage, while rejecting very high rates of background processes, very powerful trigger systems are required. We will review these briefly in Section 4.10, with a particular emphasis on LHC collider detectors, since these present the biggest challenges from the triggering perspective. Even with very powerful trigger systems, many petabytes of data are written to permanent storage every year at the LHC. Therefore, extremely powerful computer systems are required to process these data and to run the Monte Carlo simulation programmes used to understand the detector performance and correct for the inevitable imperfections. This computing requires ∼10<sup>5</sup> CPUs, which would be difficult to deal with in one facility.<sup>2</sup> The problem has been solved by the use of GRID computing, in which the CPUs are distributed over many computer centres across the world. GRID computing is now a major research area in its own right, but will not be covered further in this book.

<sup>2</sup>The biggest practical problem with very large computer farms is how to provide sufficient cooling to remove the heat.

Having understood the basic building blocks, we will then look in Section 4.11 at how large particle detectors are designed and how they work. Here and in other chapters, we will use case studies of real detectors to see how the fundamental principles are applied in practice. Interestingly, we will see that there is no perfect solution to the many design challenges, and there are always difficult trade-offs in the design of any large detector. The discussion will focus on the design of the general-purpose LHC detectors, since these are the largest and most sophisticated detector systems ever built. We will also briefly consider neutrino detectors, since the constraints on these are not the same as those on collider detectors and the resulting systems are very different.

## **4.2 Overview of collider detectors**

As an example of a collider detector, we will look at the general-purpose detectors at the LHC. As will be discussed in Chapter 13, the principal aims of the LHC are the study of the Higgs sector and the search for new physics beyond the Standard Model (SM), such as supersymmetry. Higgs bosons or any exotic particles will be heavy and will in general decay rapidly to SM particles, so we need to optimize the detector for energetic SM particles. We must be able to measure the momenta of photons, electrons, muons, taus, and hadron jets. As well as measuring the momenta, ideally we need to identify the different particles, which is non-trivial since the rates for hadron jets are O(10<sup>6</sup>) times higher than for leptons. We also need to distinguish between jets from b and c quarks and jets from light quarks. For neutrinos or exotic weakly interacting particles, such as SUSY WIMPs (see Section 13.4.1), direct detection is clearly impractical. However, we can infer the transverse momentum of these 'invisible' particles by using conservation of momentum. For this technique to be effective, we require detectors with calorimeters that cover most of the 4π solid angle (this technique is discussed in Chapter 13).
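The inference of 'invisible' transverse momentum from momentum conservation can be sketched in a few lines: the missing transverse momentum is minus the vector sum of the transverse momenta of everything that was detected. The event below is hypothetical, and the function name and the (p<sub>T</sub>, φ) input format are choices made for this illustration.

```python
import math

def missing_pt(visible):
    """Return (magnitude, azimuth) of the missing transverse momentum.

    `visible` is a list of (pt, phi) pairs, one per reconstructed object,
    with pt in GeV and phi in radians.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in visible)
    py = -sum(pt * math.sin(phi) for pt, phi in visible)
    return math.hypot(px, py), math.atan2(py, px)

# Hypothetical event: a 40 GeV lepton recoiling against two soft jets.
event = [(40.0, 0.0), (10.0, 2.5), (8.0, -2.0)]
met, met_phi = missing_pt(event)
print(f"missing pT = {met:.1f} GeV at phi = {met_phi:.2f} rad")
```

Any visible particle that escapes detection, for example through a gap in calorimeter coverage, is indistinguishable from a genuinely invisible particle in this sum, which is why near-complete solid-angle coverage matters.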

A very schematic view of the principal components of a general-purpose collider detector is shown in Fig. 4.1. Working our way out from the centre of the detector, we can see how the different elements contribute to satisfying these requirements:

• **Tracker:** This consists of very high-precision silicon detectors, immersed in a large magnetic field from a superconducting magnet. The trajectories of charged particles can be reconstructed, and hence the momenta can be computed. These detectors are used in conjunction with the calorimeters and muon detector to identify and measure the momenta of electrons, muons, and taus. They can also measure the momenta of charged hadrons. The very high precision of these detectors allows good momentum resolution for high-momentum particles. It also allows b and c quarks to be identified, using the fact that because of their relatively long lifetime, the trajectories of their decay products do not point back to the primary vertex. The magnetic fields required, in the range 2–4 T, are created by superconducting magnets, which are based on similar technology to that used for the accelerator (see Chapter 3).

**Fig. 4.1** Longitudinal and transverse views of a generic collider detector.

• **Calorimeter:** The first aim of the calorimeters is to provide high-precision measurements of photons and electrons. The second aim is to measure the energy of hadrons and so reconstruct hadronic jets. All particles apart from muons and weakly interacting particles like neutrinos will deposit nearly all their energy in the calorimeters. In general, the energies are reconstructed from active detector elements interleaved with passive absorber material. For practical reasons that we will consider in Section 4.7, the calorimeters are divided into electromagnetic (EM) and hadronic sections. Each type of calorimeter is further divided into small cells, which enables reconstruction of the transverse and longitudinal profiles of the energy deposition. This provides very powerful separation between electrons, which deposit nearly all their energy in a small region of the EM calorimeter, and hadronic jets, which produce deeper and wider showers. To reconstruct the missing transverse momentum, it is essential that the calorimeter cover a solid angle as near to 4π as possible.<sup>3</sup>

<sup>3</sup>Two holes around the beam pipes are unavoidable, so there can be significant energy 'lost' down the pipes, but as the angles are very small, the transverse momenta are low; hence we can measure missing transverse momentum but not missing longitudinal momentum.

• **Muon spectrometer:** If the calorimeter is sufficiently thick, the majority of particles emerging from the calorimeters will be muons (ignoring neutrinos), because they do not tend to produce electromagnetic showers like electrons or have hadronic interactions. The trajectories of the muons are measured in large wire chambers and can be matched to high-transverse-momentum charged particles measured in the tracker, reducing the effects of hadrons 'leaking' out of the back of the calorimeter. If there is a magnetic field in the region of the muon chambers, the muon trajectory can be used to determine the muon momenta. Possible magnetic field configurations are considered in Section 4.9. The momenta of the muons can be measured independently in the tracker and combined with the measurement in the muon spectrometer to get the best precision.
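Two quantitative ideas sit behind the list above: the momentum of a charged track follows from its curvature in the magnetic field, and two independent momentum measurements can be combined into a more precise one. The sketch below uses the standard relation p<sub>T</sub> [GeV/c] ≈ 0.3 B [T] R [m] for a unit-charge particle and an inverse-variance weighted mean. The latter assumes uncorrelated Gaussian uncertainties, which is a simplification, and all function names are hypothetical.

```python
import math

def pt_from_radius(b_tesla, radius_m):
    """Transverse momentum in GeV/c of a unit-charge track with bending radius R."""
    return 0.2998 * b_tesla * radius_m

def sagitta(b_tesla, pt_gev, chord_m):
    """Sagitta in metres of the track arc over a chord of given length (valid for s << R)."""
    radius = pt_gev / (0.2998 * b_tesla)
    return chord_m**2 / (8.0 * radius)

def combine(x1, sigma1, x2, sigma2):
    """Inverse-variance weighted mean of two independent measurements."""
    w1, w2 = 1.0 / sigma1**2, 1.0 / sigma2**2
    mean = (w1 * x1 + w2 * x2) / (w1 + w2)
    return mean, math.sqrt(1.0 / (w1 + w2))

# A 100 GeV/c track in a 2 T field measured over a 1 m lever arm.
s = sagitta(2.0, 100.0, 1.0)
print(f"sagitta: {s * 1e3:.2f} mm")

# Hypothetical tracker and muon-spectrometer measurements of the same muon (GeV/c).
p, dp = combine(100.0, 10.0, 110.0, 10.0)
print(f"combined momentum: {p:.1f} +/- {dp:.1f} GeV/c")
```

The sub-millimetre sagitta for a 100 GeV/c track shows why the position measurements must be so precise if the momentum resolution at high momentum is to be useful.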

## **4.3 Particle interactions with matter**

This section covers the most important interactions of high-energy particles with matter that are needed to understand detector physics. For tracking detectors, the most important process is ionization, since this generates the electron–ion pairs that we can detect. Multiple scattering is also important in tracking detectors because it limits the resolution. Electromagnetic processes such as pair production are fundamental for understanding electromagnetic and hadronic calorimeters. Finally, hadronic interactions are obviously of particular importance for understanding hadronic calorimeters.<sup>4</sup>

<sup>4</sup>This is of course a very simplified picture; in reality, all these processes affect all detector types to some extent.

## **4.3.1 Ionization**

All charged particles interact with electrons in the atoms in any material in the detector. For high-energy particles,<sup>5</sup> the energy transferred to the electrons can be larger than the ionization energy, thus creating free electrons and positive ions. How to detect such secondary charged particles is discussed in this chapter. These collisions result in the incident particle losing energy in the lab frame (they are approximately elastic collisions in the CMS).<sup>6</sup>

<sup>5</sup>Here we are typically interested in particles with energies E ≳ 1 MeV.

<sup>6</sup>Charged particles can also lose energy by interacting with the atomic nuclei, but the energy transferred by the elastic scattering considered in this section is negligible compared with interactions with electrons because of the larger mass of the nuclei compared with electrons. Electrons also lose energy by bremsstrahlung, and this will be discussed in Section 4.3.3.

We can understand the main features of ionization energy loss by starting from the formula for Rutherford scattering. The differential cross section (see Exercise 4.1) as a function of the 4-momentum transfer Q<sup>2</sup> and speed of the incoming particle β is given by

$$\frac{\mathrm{d}\sigma}{\mathrm{d}Q^2} = \frac{4\pi z^2 \alpha^2}{Q^4\beta^2} \tag{4.1}$$

where z is the charge in units of electron charge of the target particle by which the charged particle is scattered and α is the fine-structure constant. We can evaluate Q<sup>2</sup> in the rest frame of the electron before the collision to be (see Exercise 4.2) Q<sup>2</sup> = 2m<sub>e</sub>T, where T is the kinetic energy of the scattered electron. Then, changing variables in eqn 4.1,

$$\frac{\mathrm{d}\sigma}{\mathrm{d}T} = \frac{2\pi z^2 \alpha^2}{m\_e \beta^2 T^2} \tag{4.2}$$

We then convert this expression for the energy loss in one collision to the average energy loss as a charged particle interacts with many atoms in some medium. The rate of energy loss per unit length in a medium with N atoms per unit volume and atomic number Z is

$$\frac{\mathrm{d}E}{\mathrm{d}x} = NZ \int\_{T\_{\mathrm{min}}}^{T\_{\mathrm{max}}} T \frac{\mathrm{d}\sigma}{\mathrm{d}T} \,\mathrm{d}T \tag{4.3}$$

The minimum energy T<sub>min</sub> is related to the ionization energy I. We can calculate the maximum kinetic energy of the electron in the lab frame by considering a collision in the rest frame in which the direction of motion of the electron is reversed (see Exercise 4.2), which gives T<sub>max</sub> = 2β<sup>2</sup>γ<sup>2</sup>m<sub>e</sub>. Substitution into eqn 4.3 gives an approximate formula for the rate of energy loss of charged particles:

$$\frac{\mathrm{d}E}{\mathrm{d}x} = \frac{2\pi N Z z^2 \alpha^2}{m\_e \beta^2} \ln\left(\frac{2\gamma^2 \beta^2 m\_e}{I}\right) \tag{4.4}$$
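The logarithm in eqn 4.4 comes from integrating T dσ/dT, which is proportional to 1/T, between T<sub>min</sub> = I and T<sub>max</sub>. The sketch below checks that step numerically for one illustrative case. The ionization energy of 173 eV (roughly that of silicon) and βγ = 3 are assumptions of this example, and the constant prefactor of eqn 4.2 is omitted because it cancels in the comparison.

```python
import math

M_E = 0.511e6   # electron mass in eV, natural units with c = 1

def t_max(beta_gamma):
    """Maximum kinetic energy given to an electron, T_max = 2 beta^2 gamma^2 m_e, in eV."""
    return 2.0 * beta_gamma**2 * M_E

def energy_integral(t_min, t_upper, steps=20000):
    """Trapezoid estimate of the integral of T * (1 / T^2) dT on a log-spaced grid."""
    ratio = (t_upper / t_min) ** (1.0 / steps)
    total = 0.0
    t_lo = t_min
    for _ in range(steps):
        t_hi = t_lo * ratio
        # Integrand T * (1 / T^2) = 1 / T evaluated at both ends of the interval
        total += 0.5 * (1.0 / t_lo + 1.0 / t_hi) * (t_hi - t_lo)
        t_lo = t_hi
    return total

IONIZATION_EV = 173.0   # assumed value of I
upper = t_max(3.0)
numeric = energy_integral(IONIZATION_EV, upper)
analytic = math.log(upper / IONIZATION_EV)
print(f"numeric {numeric:.5f}, ln(T_max / I) {analytic:.5f}")
```

Because the integrand falls as 1/T, each decade of energy transfer contributes equally, so the result depends only weakly on the exact choice of T<sub>min</sub>.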

This shows that the energy loss initially decreases with increasing energy and then rises logarithmically with energy. Finally, this formula modified by relativistic effects is known as the Bethe formula<sup>7</sup>

$$\frac{\mathrm{d}E}{\mathrm{d}x} = K \frac{Z}{A\beta^2} \left[ \ln \left( \frac{2m\_e \beta^2 \gamma^2}{I} \right) - \beta^2 - \frac{\delta(\beta \gamma)}{2} \right] \tag{4.5}$$

where K = 4πN<sub>A</sub>r<sub>e</sub><sup>2</sup>m<sub>e</sub> (N<sub>A</sub> is Avogadro's number and r<sub>e</sub> is the classical radius of the electron) and Z and A are the atomic number and atomic mass number of the nucleus. It is conventional to express the stopping power in units of MeV g<sup>−1</sup> cm<sup>2</sup>. To transform this to the stopping power per unit length, we simply multiply by the density ρ. At relativistic energies, the electric field from the primary charged particle flattens and so allows collisions with more distant atoms. However, at very high energy, this effect is reduced by the polarization of the medium, which leads to the 'density effect' correction term δ(βγ).

<sup>7</sup>This formula used to be called the Bethe–Bloch formula.

**Fig. 4.2** The mean energy loss by ionization in different materials as a function of βγ [115] (where β and γ are the usual relativistic factors). Note the units of MeV g<sup>−1</sup> cm<sup>2</sup>. To convert to linear stopping power, multiply by the density.

<sup>8</sup>Here the effect of scattering off the atomic nucleus dominates over that from atomic electrons because of the larger charge on the nucleus.

**Fig. 4.3** A charged particle undergoing multiple scattering in a material is deflected by an angle θ.

The mean energy loss for charged particles in different media as a function of βγ is shown in Fig. 4.2. The important features of the stopping power are very similar for all targets: at low momentum, the stopping power decreases rapidly as the momentum of the incident particle increases, and then rises logarithmically at higher momentum. There is a broad minimum around βγ ∼ 3 and the value of the minimum is typically in the range 1–3 MeV g<sup>−1</sup> cm<sup>2</sup>. Note that the energy loss by ionization scales with the Z of the material, which is very different to the Z<sup>2</sup> scaling that we find for pair-production and bremsstrahlung processes. We have discussed the mean energy loss, but there can be very large fluctuations because of the large range of energies that can be lost in a single collision. The spread in the actual energy lost is given by the very broad 'Landau' distribution. Very large 'tails' in the distribution are caused by the emission of single relatively energetic electrons (called 'δ-rays').
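As a rough numerical illustration of eqn 4.5, the sketch below evaluates the stopping power for a unit-charge particle and scans for the minimum. It is only a sketch under stated assumptions: the density-effect term δ is set to zero, the value K ≈ 0.307 MeV mol<sup>−1</sup> cm<sup>2</sup> and the silicon inputs (Z, A, mean excitation energy I, density) are standard reference values supplied here rather than taken from the text, and the function names are ours.

```python
import math

K = 0.307075    # MeV mol^-1 cm^2, numerical value of 4*pi*N_A*r_e^2*m_e (assumed reference value)
M_E = 0.510999  # electron mass in MeV (natural units, c = 1)


def bethe_dedx(beta_gamma, Z, A, I_eV, delta=0.0):
    """Mean stopping power (MeV g^-1 cm^2) from eqn 4.5 for a unit-charge particle.

    I_eV is the mean excitation energy in eV. The density-effect term delta
    is left at zero by default, which is an assumption of this sketch.
    """
    bg2 = beta_gamma ** 2
    beta2 = bg2 / (1.0 + bg2)
    I = I_eV * 1e-6  # eV -> MeV
    log_term = math.log(2.0 * M_E * bg2 / I)
    return K * Z / (A * beta2) * (log_term - beta2 - delta / 2.0)


def minimum_ionization(Z, A, I_eV):
    """Scan beta*gamma from 0.1 to 1000 on a log grid; return (beta*gamma, dE/dx) at the minimum."""
    grid = [10 ** (-1 + 0.001 * i) for i in range(4001)]
    return min(((bg, bethe_dedx(bg, Z, A, I_eV)) for bg in grid), key=lambda t: t[1])


if __name__ == "__main__":
    # Illustrative inputs for silicon (assumed values, not from the text).
    bg_min, dedx_min = minimum_ionization(Z=14, A=28.0855, I_eV=173.0)
    rho_si = 2.329  # g cm^-3
    print(f"minimum near beta*gamma = {bg_min:.2f}")
    print(f"dE/dx at minimum = {dedx_min:.2f} MeV g^-1 cm^2")
    print(f"linear stopping power = {dedx_min * rho_si:.2f} MeV/cm")
```

With these inputs the minimum falls close to βγ ∼ 3 and inside the 1–3 MeV g<sup>−1</sup> cm<sup>2</sup> band quoted above. Without δ the relativistic rise is overestimated at large βγ, so the sketch should not be trusted there.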

## **4.3.2 Multiple scattering**

When a charged particle traverses a slab of a material, as sketched in Fig. 4.3, it undergoes a very large number of very small-angle Coulomb scatters from the nuclei of the material.<sup>8</sup> The net result is that the particle emerges from the slab at an angle θ with respect to the initial direction. Considering many identical particles, one gets a distribution of their angles θ (in a plane like the plane of Fig. 4.3, or for any plane containing the initial direction vector) that follows a Gaussian distribution with standard deviation

$$\theta\_0 = \frac{13.6 \,\mathrm{MeV}}{\beta p} \, q \sqrt{\frac{x}{X\_0}} \left( 1 + 0.038 \ln \frac{x}{X\_0} \right) \tag{4.6}$$

where X<sub>0</sub> is the radiation length (see Section 4.3.3), x is the thickness of the material slab, and p, β, and q are respectively the momentum, velocity, and charge of the particle. The root mean square displacement of the particle trajectory y is then y<sup>rms</sup><sub>plane</sub> = (x/√3)θ<sub>0</sub>. The effects of multiple scattering degrade the resolution of track reconstruction and therefore can have profound effects on the detector performance, as will be discussed later in this chapter. Note that the amount of multiple scattering scales with the amount of material expressed in radiation lengths. This provides one motivation for the design of tracking detectors that are 'thin' in units of radiation length. As the radiation length scales with Z<sup>−2</sup>, this shows that we should minimize the use of high-Z material.
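To get a feel for the size of the effect, the following sketch evaluates the multiple-scattering angle and the lateral displacement. It assumes the standard Highland form, with a √(x/X<sub>0</sub>) prefactor and a logarithmic correction in x/X<sub>0</sub>. The function names and the silicon example (300 μm thick, X<sub>0</sub> ≈ 9.37 cm) are illustrative assumptions, not values from the text.

```python
import math


def theta0(p_mev, beta, x_over_X0, q=1.0):
    """RMS projected scattering angle in radians (Highland form, assumed here).

    p_mev: momentum in MeV (c = 1); beta: velocity; x_over_X0: thickness in
    radiation lengths; q: charge in units of e.
    """
    return (13.6 / (beta * p_mev)) * abs(q) * math.sqrt(x_over_X0) * (
        1.0 + 0.038 * math.log(x_over_X0)
    )


def y_rms_plane(x, p_mev, beta, x_over_X0, q=1.0):
    """RMS lateral displacement in one plane, (x / sqrt(3)) * theta0, in the units of x."""
    return x / math.sqrt(3.0) * theta0(p_mev, beta, x_over_X0, q)


if __name__ == "__main__":
    # Illustrative case: a 1 GeV pion crossing 300 um of silicon (assumed X0 = 9.37 cm).
    m_pi, p = 139.57, 1000.0
    beta = p / math.hypot(p, m_pi)
    thickness_cm = 0.03
    t = thickness_cm / 9.37
    print(f"theta0 = {1e3 * theta0(p, beta, t):.3f} mrad")
    print(f"y_rms  = {1e4 * y_rms_plane(thickness_cm, p, beta, t):.4f} um")
```

Because θ<sub>0</sub> falls as 1/(βp), doubling the momentum of a relativistic particle halves the scattering angle, which is why multiple scattering limits the tracking resolution mainly at low momentum.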

## **4.3.3 Electromagnetic interactions**

Electrons and positrons lose energy by ionization in a similar way to that discussed in Section 4.3.1 (although there are some differences, associated with issues like the spin and identical particles for the case of electrons). However, at high energy, the dominant process for energy loss is bremsstrahlung (see Fig. 4.4). For an electron of energy E, the rate of change of energy due to bremsstrahlung as a function of distance x is given by

$$\frac{\mathrm{d}E}{\mathrm{d}x} = -\frac{E}{X\_0} \tag{4.7}$$

where X<sub>0</sub> is the radiation length for the material. We can easily integrate eqn 4.7 to show that in travelling a distance X<sub>0</sub>, the electron energy falls to 1/e of its initial value. An approximate formula for the radiation length is given by (see [115] for the full expression)

$$\frac{1}{X\_0} \sim \frac{4\alpha^3}{m\_e^2} \frac{N\_A}{A} Z^2 L\_{\rm rad} \tag{4.8}$$

where N<sub>A</sub> is Avogadro's number, Z is the atomic number, A is the atomic mass number (number of protons and neutrons), and, for Z > 4, L<sub>rad</sub> = ln(184.15 Z<sup>−1/3</sup>). We can see that the radiation length scales as 1/α<sup>3</sup>, as expected because the Feynman diagram contains three vertices. The electron 'sees' the charge of the entire nucleus at one vertex, so the cross section scales with the atomic number as Z<sup>2</sup>.<sup>9</sup> The differential cross section for bremsstrahlung as a function of the variable y = k/E, where k and E are respectively the photon and electron energies, is given to a good approximation by

$$\frac{\mathrm{d}\sigma}{\mathrm{d}k} = \frac{A}{X\_0 N\_A k} \left(\frac{4}{3} - \frac{4}{3}y + y^2\right) \tag{4.9}$$

The characteristic feature of eqn 4.9 is that the photon energy spectrum is peaked at low values. In one radiation length, it is very unlikely that the electron will lose energy to only one high-energy photon; it is far more common for it to lose energy to many lower-energy photons.

<sup>9</sup>This scaling with Z is more rapid than the linear scaling with Z that we found for energy loss by ionization.

**Fig. 4.4** Lowest-order Feynman diagram for bremsstrahlung (eZ → eZγ) in a material with nuclear charge Ze.

**Fig. 4.5** Lowest-order Feynman diagram for the pair production process for a photon interacting with a nucleus of charge Ze.

<sup>10</sup>Note that the length for a primary electron to decrease in energy by a factor f is 7/9 of the length for which the probability of a photon not to pair-produce is equal to the same factor f.

<sup>11</sup>We can make an average correction to allow for shower leakage out of the back of the calorimeter, but there will always be shower-to-shower statistical fluctuations in the amount of leakage, which we cannot correct for. So, if we want a high-resolution electromagnetic calorimeter, we must ensure that it is deep enough for almost complete shower containment.

<sup>12</sup>This sets the natural size for the transverse granularity for electromagnetic calorimeters. We wish to separate electrons (positrons) from hadrons using, among other measures, the transverse shower size. This improves with finer granularity, but we clearly do not gain by having cells with lateral dimensions much less than R<sub>M</sub>.

High-energy photons can undergo pair conversion (see Fig. 4.5), which is clearly a process closely related to bremsstrahlung. At high energies, the differential cross section for pair production as a function of the fraction of the photon energy given to the electron, x, is

$$\frac{\mathrm{d}\sigma}{\mathrm{d}x} = \frac{A}{X\_0 N\_A} \left[ 1 - \frac{4}{3} x (1 - x) \right] \tag{4.10}$$

We can integrate eqn 4.10 to obtain the total pair production cross section<sup>10</sup>

$$
\sigma = \frac{7}{9} \frac{A}{X\_0 N\_A} \tag{4.11}
$$
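The factor 7/9 in eqn 4.11 follows directly from integrating the bracket in eqn 4.10 over the electron energy fraction x from 0 to 1. Written out as a short check (assuming the high-energy form of eqn 4.10 holds over the whole range of x):

```latex
\begin{align*}
\sigma &= \int_0^1 \frac{\mathrm{d}\sigma}{\mathrm{d}x}\,\mathrm{d}x
        = \frac{A}{X_0 N_A} \int_0^1 \left[ 1 - \frac{4}{3}\left(x - x^2\right) \right] \mathrm{d}x \\
       &= \frac{A}{X_0 N_A} \left[ 1 - \frac{4}{3}\left(\frac{1}{2} - \frac{1}{3}\right) \right]
        = \frac{A}{X_0 N_A} \left( 1 - \frac{2}{9} \right)
        = \frac{7}{9}\,\frac{A}{X_0 N_A}
\end{align*}
```

With X<sub>0</sub> expressed in g cm<sup>−2</sup>, this corresponds to a mean free path for pair production of (9/7)X<sub>0</sub>, so a photon travels slightly further on average before converting than the distance over which an electron's energy falls by 1/e.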

At lower energies, the dominant process for energy loss by photons is Compton scattering γe → γe. Here the target electron is approximately at rest in an atom and it is ejected from the atom in the process (i.e., in the lab frame, energy is transferred from the incident photon to the outgoing electron).

Now that we have considered the fundamental electromagnetic interactions in matter, we can understand the nature of the resulting electromagnetic showers. Incident high-energy electrons will lose energy by bremsstrahlung and the resulting photons will create e<sup>+</sup>e<sup>−</sup> pairs, which in turn will create more photons by bremsstrahlung. We need to consider the competition between the rate of energy loss from bremsstrahlung/pair production and ionization. The former increases approximately linearly with energy, whereas loss due to ionization increases only logarithmically. When the energy of the electrons decreases to the 'critical energy' E<sub>c</sub>, the energy loss by bremsstrahlung will be equal to that by ionization. An approximate fit to the critical energy as a function of atomic number Z is given by

$$E\_c = \frac{610}{Z + 1.24} \text{ MeV} \tag{4.12}$$

As the electron (positron) energies become lower than E<sub>c</sub>, they lose energy rapidly, become non-relativistic and lose energy by ionization even more rapidly, hence ending the shower development. This results in the shower depth varying logarithmically with energy (see Exercise 4.4). The longitudinal shower profile can be calculated rather accurately using Monte Carlo simulations and an example is shown in Fig. 4.6. We require nearly complete shower containment to obtain good energy resolution, so for 30 GeV electrons we need a depth of at least ∼20X<sub>0</sub>.<sup>11</sup>

Electromagnetic showers broaden as they penetrate deeper into matter owing to multiple Coulomb scattering of the electrons (positrons) and the scattering angles involved in bremsstrahlung and pair production. The first effect dominates and we can parameterize the width of the shower by the 'Molière radius' R<sub>M</sub> = X<sub>0</sub>E<sub>s</sub>/E<sub>c</sub>, with E<sub>s</sub> ≈ 21 MeV. Approximately 90% of the energy is contained within a radius of R<sub>M</sub>.<sup>12</sup>
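The quantities in this subsection are easy to evaluate numerically. The sketch below implements eqn 4.12 and the Molière radius. It also adds a toy estimate of the depth of shower maximum, which is an assumption of ours and not a result from the text: the number of particles is taken to double every radiation length until the energy per particle reaches E<sub>c</sub>. The iron inputs are illustrative reference values.

```python
import math

E_S_MEV = 21.0  # scale energy E_s quoted in the text


def critical_energy_mev(Z):
    """Approximate critical energy in MeV from eqn 4.12."""
    return 610.0 / (Z + 1.24)


def moliere_radius(X0, Z):
    """Moliere radius R_M = X0 * E_s / E_c, in the same units as X0."""
    return X0 * E_S_MEV / critical_energy_mev(Z)


def shower_max_depth(E_mev, Z):
    """Depth of shower maximum in units of X0 for a toy doubling model.

    Assumption (not from the text): the particle number doubles every
    radiation length until the energy per particle falls to E_c, giving
    t_max = ln(E / E_c) / ln 2.
    """
    return math.log(E_mev / critical_energy_mev(Z)) / math.log(2.0)


if __name__ == "__main__":
    Z_fe, X0_fe_cm = 26, 1.76  # iron; X0 is an assumed reference value
    print(f"E_c   = {critical_energy_mev(Z_fe):.1f} MeV")
    print(f"R_M   = {moliere_radius(X0_fe_cm, Z_fe):.2f} cm")
    print(f"t_max = {shower_max_depth(30e3, Z_fe):.1f} X0 for 30 GeV")
```

In the toy model, doubling the incident energy moves the shower maximum deeper by exactly one radiation length, which is the logarithmic growth of shower depth with energy referred to above. Full containment needs a depth well beyond the maximum, as the ∼20X<sub>0</sub> figure for 30 GeV electrons indicates.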

## **4.3.4 Čerenkov radiation**

When a charged particle moves at a speed v greater than the local phase velocity of light, 1/n, where n is the refractive index of the medium, it will emit Čerenkov photons. The angle of the Čerenkov photons relative to the charged particle can be calculated using a simple geometrical argument (see Fig. 4.7). In time t, the relativistic particle travels from A to B, a distance of vt. The electromagnetic wave emitted by the particle from A is travelling at the (lower) speed of 1/n. The wavefront, defined by the plane with a constant phase, is given by the line from C to B. Hence the Čerenkov angle is given by cos θ<sub>C</sub> = 1/(nv).<sup>13</sup> The photons are typically in the optical range and can be detected in a similar way to that used for scintillation light (see Section 4.4.2).
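The threshold condition and the angle formula can be put into a few lines of code. This is a minimal sketch in natural units (c = 1) for a non-dispersive medium. The function names, and the refractive index n = 1.33 for water used in the example, are our own illustrative choices.

```python
import math


def cherenkov_angle(beta, n):
    """Cherenkov angle in radians from cos(theta_C) = 1/(n*beta).

    Returns None below threshold, i.e. when the particle is not faster
    than the phase velocity 1/n of light in the medium.
    """
    if n * beta <= 1.0:
        return None
    return math.acos(1.0 / (n * beta))


def threshold_momentum(mass, n):
    """Momentum at which beta = 1/n, p = m / sqrt(n^2 - 1), in the units of mass."""
    return mass / math.sqrt(n * n - 1.0)


if __name__ == "__main__":
    n_water = 1.33  # assumed refractive index
    angle = cherenkov_angle(1.0, n_water)
    print(f"angle for beta = 1: {math.degrees(angle):.1f} degrees")
    print(f"pion threshold:   {threshold_momentum(139.57, n_water):.0f} MeV")
    print(f"kaon threshold:   {threshold_momentum(493.68, n_water):.0f} MeV")
```

The threshold momentum is proportional to the particle mass, so at a given momentum a lighter particle can radiate while a heavier one does not. This is the basis of threshold Čerenkov counters for particle identification.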

## **4.3.5 Transition radiation**

If a high-energy charged particle crosses a boundary between two media with different dielectric constants, it can emit transition-radiation photons. The yield depends on the Lorentz factor γ and therefore allows the separation of electrons from charged hadrons. The yield per interface is O(α) and is therefore very low, implying that a practical transition-radiation detector requires hundreds of interfaces, which can be achieved for example with Mylar foils.

## **4.3.6 Hadronic interactions**

High-energy hadrons undergo nuclear interactions in matter. The physics involved cannot be calculated from first principles, so phenomenological models are needed. It is useful to define the interaction length λ<sub>I</sub> as the length in a material in which the probability of a hadron not interacting is 1/e. The cross section at high energy for scattering of a hadron on a nucleus scales like σ = R<sub>0</sub>A<sup>2/3</sup>, which is quite different to the Z<sup>2</sup> scaling for bremsstrahlung and pair production.<sup>14</sup> The interaction length is compared with the radiation length for a few common absorbers used in calorimeters in Table 4.1. The longitudinal shower profile for high-energy pions in iron [88] is shown in Fig. 4.8. We see that for good shower containment, we need a depth of about 10λ<sub>I</sub>, which results in very large calorimeters. This obviously increases the cost of the hadronic calorimeter itself, but also increases the radius for the start of the muon detectors, and thus increases the area and cost of the muon spectrometer. Therefore, for cost reasons, there will usually be some significant energy leakage out of the back of a hadronic calorimeter. As for electromagnetic calorimeters, we can make an average correction for this effect, but the statistical fluctuations will degrade the resolution.

<sup>14</sup>This provides another motivation for using high-Z absorbers in an electromagnetic calorimeter (apart from wishing to limit the depth required for good shower containment): electromagnetic showers are contained in a shorter depth than hadronic showers and the separation is better for higher-Z absorbers.

**Fig. 4.6** Simulation of the longitudinal shower profile for incident 30 GeV electrons and photons on iron [115]. The histogram shows the energy deposition and the circles (squares) indicate the number of electrons (photons). The photons penetrate more deeply than electrons, reflecting the factor of 7/9 in eqn 4.11.

**Fig. 4.7** Geometrical construction for calculation of the Čerenkov angle.

<sup>13</sup>We are using natural units with c = 1 and we have assumed that the medium is non-dispersive.

**Table 4.1** Radiation length X<sub>0</sub>, interaction length λ<sub>I</sub>, and density ρ for some elements.

**Fig. 4.8** Longitudinal shower profile for high-energy pions in the CDHS detector [88].

A high-energy hadron interacting with a nucleus will create a mixture of charged and neutral hadrons. The π<sup>0</sup>s will decay rapidly to photons and thus induce electromagnetic showers. The charged hadrons produced will penetrate further into the calorimeter and create secondary hadronic interactions, leading to the development of hadronic showers deep into the calorimeter. The big difficulty with hadronic calorimetry is that a significant fraction of the energy goes into nuclear breakup and evaporating neutrons and protons from the nuclei. The resulting low-energy nuclei and protons will be very heavily ionizing and lose energy rapidly. Typically, this will occur in the passive absorber,<sup>15</sup> producing no detectable signal in the 'active' layers. The low-energy neutrons will scatter and thermalize on a timescale of microseconds, and so any photons produced from neutron capture will be outside the time 'window' for signal collection. The fraction of energy that is effectively 'lost' in a hadronic interaction due to these processes is very large (typically in the range 20–40%). The real problem is that there is a very large variation in this lost fraction from shower to shower, which greatly degrades the resolution of hadronic compared with electromagnetic calorimeters. The magnitude of the effect can be parameterized by the ratio of the response to electrons to that to hadrons, e/h. If e/h is significantly different from unity, the calorimeter resolution will be limited and there will be large non-Gaussian fluctuations. Several ideas have been pursued to try to achieve 'compensating' calorimeters in which e/h ≈ 1 and these will be discussed in Section 4.7.5.

<sup>15</sup>For cost reasons, hadronic calorimeters are divided into alternating layers of 'passive' absorber and 'active' layers that detect the signal.

## **4.4 Signal generation**

In Section 4.3, we have considered how particles lose energy in matter and create showers of secondary charged and neutral particles. We now need to examine how we can actually detect these secondary particles as well as the particles from the primary collision. In Section 4.4.1, we will see how charged particles moving between electrodes induce currents, which we can amplify and read out with suitable electronics. Another approach, considered in Section 4.4.2, is to use scintillation light. The scintillation and Čerenkov processes result in photons at visible or ultraviolet wavelengths, so in Section 4.5 we review techniques to detect these photons.

## **4.4.1 Moving charges**

In this section, we explain how to calculate the induced currents created by moving charges, which generate the electrical signals we can measure in detectors like wire chambers or silicon detectors. We first calculate the induced current for a simple case and then discuss the general solution.

Consider a charged particle held between the two (infinite) plates of a parallel-plate capacitor, with both plates grounded. The potential is given by the solution of Laplace's equation [76], subject to appropriate boundary conditions (the potential is 0 on the plates and approximates that from a point charge in the vicinity of the charge):

$$V(\rho, z) = \frac{q}{\epsilon\_0 \pi L} \sum\_{n=1}^{\infty} \sin\left(\frac{n\pi z}{L}\right) \sin\left(\frac{n\pi z\_0}{L}\right) K\_0\left(\frac{n\pi \rho}{L}\right) \tag{4.13}$$

where z<sub>0</sub> is the distance from the lower plate to the point charge, L is the separation between the plates, ρ = √(x<sup>2</sup> + y<sup>2</sup>) (where x and y are Cartesian coordinates in the plane of the lower plate, measured from the position of the point charge; see Fig. 4.9), and K<sub>0</sub> is a modified Bessel function. The solutions for three locations of the charge are illustrated by the equipotentials shown in Fig. 4.9.
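Equation 4.13 can be evaluated numerically to check its basic properties. The sketch below is one possible way to do this using only the standard library. It computes K<sub>0</sub> from the integral representation K<sub>0</sub>(x) = ∫<sub>0</sub><sup>∞</sup> exp(−x cosh t) dt with a simple trapezoidal rule (our choice of method, not the text's), and truncates the sum over n at a fixed number of terms, which is adequate only when ρ is not too small compared with L.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def bessel_k0(x, steps=2000):
    """Modified Bessel function K0(x) for x > 0, from the integral of exp(-x cosh t).

    Trapezoidal rule on [0, t_max]; t_max is chosen so that the integrand
    is negligible (x * cosh(t_max) > 40) at the upper limit.
    """
    t_max = math.acosh(max(40.0 / x, 1.0)) + 1.0
    h = t_max / steps
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h


def potential(rho, z, z0, L, q=1.602176634e-19, n_terms=200):
    """Potential V(rho, z) of eqn 4.13 in volts (SI units, rho > 0).

    The charge q sits at height z0 between grounded plates at z = 0 and z = L.
    """
    s = 0.0
    for n in range(1, n_terms + 1):
        k = n * math.pi / L
        s += math.sin(k * z) * math.sin(k * z0) * bessel_k0(k * rho)
    return q / (EPS0 * math.pi * L) * s


if __name__ == "__main__":
    L = 1e-3  # 1 mm gap (illustrative)
    for z0 in (0.8e-3, 0.5e-3, 0.2e-3):  # charge near upper plate, centre, lower plate
        v = potential(rho=0.2e-3, z=0.5e-3, z0=z0, L=L)
        print(f"z0 = {z0 * 1e3:.1f} mm: V(rho = 0.2 mm, z = 0.5 mm) = {v:.3e} V")
```

Each term of the sum vanishes on the plates because sin(nπz/L) is zero there, and the terms fall off roughly as exp(−nπρ/L), so only a few terms matter once ρ is comparable to L.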

The induced electric surface charge density on the conducting plate at z = 0 is given by σ = ε<sub>0</sub>|E<sub>z</sub>(z = 0)|, where **E** = −∇V is the electric field evaluated at the edge of the conductor (z is the direction perpendicular to the conductor). When the charge is near the upper (lower) plate, we see that the equipotentials are more tightly packed near the upper (lower) plate. Therefore, when the charge is near the upper (lower) plate, the E field will be larger nearer the upper (lower) plate and hence there will be a larger induced charge on the upper (lower) plate. The fields and induced charges are obviously symmetric when the charge is equidistant from the two plates. Now let us imagine moving the charge from near the upper plate to near the lower plate. Initially, most of the induced charge will be on the upper plate, but this will gradually change and at the end most of the induced charge will be on the lower plate. This then looks like a current flowing between the two conductor plates. This is a qualitative example of the fundamental result in detector physics: moving charges between conducting electrodes induce currents.<sup>16</sup> This current can be amplified and digitized by appropriate readout electronics.

**Fig. 4.9** Equipotentials (arbitrary units) for a point charge at three different locations in a parallel-plate capacitor: (a) near the upper plate; (b) near the centre; (c) near the lower plate. The equipotentials near the point charge are omitted for clarity.

<sup>16</sup>Note that the induced signal occurs as long as the charge is moving and stops when the charges are collected on the electrodes. A popular misconception is that the signal only arises when the charge is 'collected' at an electrode.

Now that we have seen a qualitative description of the physics of induced charges, we can look at the quantitative solution. Taking the derivative in the z direction of the potential (eqn 4.13), we can determine the induced surface charge density on the upper and lower plates using Gauss's law:

$$\begin{aligned} \sigma(\rho, z = 0) &= -\frac{q}{(L\pi)} \sum\_{n=1}^{\infty} \frac{n\pi}{L} \sin\left(\frac{n\pi z\_0}{L}\right) K\_0\left(\frac{n\pi \rho}{L}\right) \\\\ \sigma(\rho, z = L) &= \frac{q}{(L\pi)} \sum\_{n=1}^{\infty} \frac{n\pi}{L} (-1)^n \sin\left(\frac{n\pi z\_0}{L}\right) K\_0\left(\frac{n\pi \rho}{L}\right) \end{aligned} \tag{4.14}$$

We can integrate the surface charge density to find the total charge induced on the upper plate as

$$\begin{split} Q\_{\rm U} &= 2\pi \int\_{0}^{\infty} \sigma(\rho, z = L) \rho \, \mathrm{d}\rho \\ &= \frac{2q}{L} \sum\_{n=1}^{\infty} \frac{L}{n\pi} (-1)^{n} \sin\left(\frac{n\pi z\_{0}}{L}\right) \int\_{0}^{\infty} x K\_{0}(x) \, \mathrm{d}x \end{split} \tag{4.15}$$

The integral is equal to unity, so

$$Q\_{\rm U} = \frac{2q}{L} \sum\_{n=1}^{\infty} \frac{L}{n\pi} (-1)^n \sin\left(\frac{n\pi z\_0}{L}\right) \tag{4.16}$$

The infinite sum is related to a Fourier series (see Exercise 4.3), so

$$Q\_{\rm U} = -\frac{qz\_0}{L} \tag{4.17}$$

We can calculate the surface charge on the lower plate at z = 0 by the same method, to obtain<sup>17</sup>

$$Q\_{\rm L} = -\frac{q(L - z\_0)}{L} \tag{4.18}$$

<sup>17</sup>The total induced charge on the two plates is −q, as expected.

Now let the charge between the capacitor plates move with a speed v in the negative z direction. The induced charge flows from the upper to the lower plate and the current (while the charge is moving) is given by the rate of change of charge as

$$I = -\frac{qv}{L} \tag{4.19}$$

We have determined the induced current for the simplest possible geometry.

A more general solution to the calculation of the induced current, which is indispensable for understanding realistic detector geometries, is provided by Ramo's theorem.<sup>18</sup> This will provide us with a simple method for calculating the induced currents from any movements of charges and is therefore of fundamental importance in detector physics. First, we set the potential on the electrode being considered to 1 V and apply 0 V to all other electrodes and calculate the potential Φ by solving Laplace's equation subject to these boundary conditions. The 'weighting field'<sup>19</sup> is defined as

$$\mathbf{E}\_{\rm W} = -\nabla \Phi \tag{4.20}$$

<sup>18</sup>See Spieler in the Further Reading at the end of this chapter for a derivation of Ramo's theorem.

<sup>19</sup>Note that this field is not the same as the electric field and does not even have the same dimensions.

The current induced on this electrode, caused by the motion of n charges q<sub>j</sub>, moving with velocities **v**<sub>j</sub> (j = 1,...,n), is given by

$$i = -\sum\_{j=1}^{n} q\_j \mathbf{v}\_j \cdot \mathbf{E}\_{\rm W} \tag{4.21}$$

The velocity depends on the real electric field, not the weighting field.<sup>20</sup> As a simple 'sanity check', we can now use Ramo's theorem to calculate the induced current for the case of a point charge between the plates of an infinite parallel-plate capacitor and compare it with the result obtained above. For this geometry, if we apply 1 V on one electrode and 0 V on the other electrode, then the weighting field is uniform and has a magnitude of 1/L. For a point charge q moving with velocity v parallel to this weighting field, we obtain the induced current from eqn 4.21 as

$$I = -\frac{qv}{L} \tag{4.22}$$

which is in agreement with eqn 4.19.<sup>21</sup>

<sup>20</sup>Only in the case of two-electrode systems does the weighting field have the same form as the physical electric field.

<sup>21</sup>This result is independent of the forces causing the charge to move with a velocity v. In a particle detector, the motion is due to the applied electric and magnetic fields and the interactions of the moving charge with atoms or molecules in the detector.

## **4.4.2 Scintillators**

Scintillators are materials in which ionizing particles can cause scintillation light, which can be detected by photodetectors. There are two broad classes of scintillator: organic and inorganic. A common example of an organic scintillator is polystyrene. In an organic scintillator, molecules are lifted into an excited state by an ionizing particle and de-excite by emitting scintillation photons (typically in the UV). The problem with this is that the reverse reaction has a large cross section, so these UV photons will be rapidly re-absorbed.<sup>22</sup> This problem is solved by introducing a dopant so that these photons are absorbed by a fluorescent molecule (a 'fluor'). The fluor then decays rapidly to a lower-energy state via a radiative decay, emitting longer-wavelength photons. This increases the attenuation length, but it is usually still too short for practical applications. Therefore, a secondary fluor is used to shift the photons into the visible wavelength range, and these photons can have a suitably long attenuation length. The typical scintillation and fluorescence processes [115] are illustrated in Fig. 4.10. This type of organic scintillator is often used in sampling calorimeters (see Section 4.7).

<sup>22</sup>This is clearly a problem for an application requiring large-area scintillators.

A classic example of an inorganic scintillator is thallium-doped sodium iodide, NaI(Tl). A high-energy particle can excite an electron from the valence to the conduction band. The electron can drop from the conduction to the valence band with the emission of a photon. However, the reverse process will result in too short an attenuation length for a useful detector. Therefore, a different process is used in which high-energy particles create excitons (loosely bound states of an electron and a hole). An exciton can move through the crystal until it is captured by an impurity state (created by the doping with Tl), which can then decay via emission of a photon, thus creating scintillation light.<sup>23</sup> This has the advantage of high density, which allows the construction of a more compact calorimeter, thus reducing the cost, and it also has a very good yield for scintillation light. This scintillator is still used in many applications and it was used in older particle physics detectors. The problem is that it is too slow for use at modern colliders, because the scintillation decay time is ≈250 ns, which is much longer than the time between collisions at the LHC of 25 ns. To use an inorganic scintillator at the LHC, we need a very fast decay time. Also, the scintillator must be very tolerant to radiation; most scintillators would become opaque after exposure to LHC radiation levels. Such a scintillator, PbWO<sub>4</sub>, has been developed for the CMS electromagnetic calorimeter; its use there will be described in Section 4.7.2.

<sup>23</sup>As the doping concentration is relatively low, the probability of the scintillation light being reabsorbed in the crystal is very low; i.e. the crystal is transparent at this wavelength.

**Fig. 4.10** Fluorescence steps in an organic scintillator [115]. Typical values are given for the wavelength and absorption length of the photons. The first step in the process, 'Foerster energy transfer', does not involve photon emission but is a dipole–dipole interaction between the base and the primary fluor.

## **4.5 Photon detection**

We have seen that scintillation and Čerenkov radiation result in photons in the range from the optical to the UV, which we have to convert into an electrical signal that can be digitized and read out. The traditional method is based on photomultipliers (PMTs), but another technique that is becoming increasingly common uses avalanche photodiodes. A schematic illustration of a photomultiplier coupled to a scintillator is shown in Fig. 4.11. A photomultiplier has a photocathode, usually containing two alkali elements to obtain the best quantum efficiency. When a photon with an energy greater than the work function hits the photocathode, it can eject an electron by the photoelectric effect.<sup>24</sup> The resulting electron is then accelerated by an applied electric field until it hits the first dynode.<sup>25</sup> This causes the emission of several secondary electrons (the electron has been accelerated so it has sufficient energy to do this). The secondary electrons are similarly accelerated and strike the second dynode. This clearly multiplies the number of electrons (hence the name photomultiplier). Several stages of dynodes are used and it is easy to obtain a very large gain (∼10<sup>6</sup> or more). A single photon thus creates a large pulse of electrons that is easy to detect and digitize. One disadvantage of PMTs is that they do not work in large magnetic fields.<sup>26</sup>

**Fig. 4.11** Schematic view of a photomultiplier and the main processes involved. The primary photoelectron is emitted from the photocathode and is accelerated and focused until it hits the first dynode. It then liberates many secondary electrons, which are accelerated to the next dynode. The resulting induced current is detected on the anode. From https://commons.wikimedia.org/wiki/File:PhotoMultiplierTubeAndScintillator.jpg

<sup>24</sup>This is called a photoelectron.

<sup>25</sup>Additional electrodes act as electrostatic focusing elements to increase the fraction of electrons collected at the first dynode.

<sup>26</sup>For operation in moderate magnetic fields, PMTs can be shielded by shields made of 'mu-metal', an alloy with an exceptionally large relative permeability. However, saturation effects prevent this technique from working in high magnetic fields.

The simplest solid state photodetector is a photodiode. In a photodiode, photons create electron–hole pairs in a detector working on the same principles as that of a silicon detector (see Section 4.6.2).<sup>27</sup> The problem is that the small signal results in a low signal-to-noise ratio. In an avalanche photodiode (APD), the electric field is large enough that electrons acquire sufficient energy to create further electron–hole pairs, leading to an 'avalanche' effect. This avalanche process creates an intrinsic gain in the device that results in APDs having better resolution for small calorimeter signals than simple photodiodes. This requires that a larger reverse bias be applied, typically ∼100 V, which results in an avalanche gain in the range 10–100. The gain of an APD is more sensitive to the applied bias voltage and the temperature than that of a simple photodiode. In addition, the design needs to ensure that the avalanche does not lead to electrical breakdown. One key advantage of APDs for particle physics applications is that they are insensitive to applied magnetic fields.

<sup>27</sup>Silicon is one possible semiconductor that can be used for photodiodes, but there are photodiodes made from other semiconductors such as GaAs or InGaAs. The optimal choice for any application depends on several factors, including wavelength, speed of response, and cost.

## **4.6 Detectors for charged-particle tracks**

The traditional technology for tracking used wire chambers. These have been largely replaced by silicon detectors for the inner 'trackers' in LHC general purpose detectors. However, the cost of silicon detectors would be prohibitive for the very large-area outer detectors needed for the muon spectrometers, so wire chambers are the only practical technology for these systems.

## **4.6.1 Wire chambers**

A primary high-energy charged particle passing through a gas will create a few electron–ion pairs by ionization. To create a sufficiently large signal (i.e. greater than the electronic noise of an amplifier), we need to use an avalanche process. We start by considering the 'gas gain' caused by an avalanche, then consider the simple proportional wire chamber, and finally look at a 'drift' chamber.

## **Gas gain**

At sufficiently high electric fields (∼100 kV cm<sup>−1</sup>), electrons drifting in an electric field acquire sufficient energy to cause further ionization in the gas and thus enable an avalanche process that can result in a very large increase in the number of electron–ion pairs. We define the gas gain G = N/N<sub>0</sub>, where N<sub>0</sub> and N are the initial and final numbers of electron–ion pairs. The change in N with a distance travelled ds is

$$\mathrm{d}N = N\alpha \,\mathrm{d}s \,\tag{4.23}$$

where α is called the first Townsend coefficient and has to be measured experimentally. We can integrate eqn 4.23 for the gas gain:

$$\begin{split} G &= N/N\_0 = \exp\left(\int \alpha \,\mathrm{d}s\right) \\ &= \exp\left(\int\_{E\_{\mathrm{min}}}^{E\_{\mathrm{max}}} \frac{\alpha}{\mathrm{d}E/\mathrm{d}s} \,\mathrm{d}E\right) \end{split} \tag{4.24}$$

where E is the electric field, E<sub>min</sub> is the value of E at the start of the avalanche, and E<sub>max</sub> is the value at the end of the avalanche (e.g. at the wire in a wire chamber). The value of E<sub>min</sub> is simply related to the mean free path for electrons λ and the average ionization energy I by conservation of energy: eE<sub>min</sub>λ = I.

We can now summarize the general features of the gas gain as a function of the applied voltage across a chamber as illustrated in Fig. 4.12. At very low voltages, the electrons recombine with ions before they are collected. At higher voltages, we can distinguish different regions:


**Fig. 4.12** Variation in gas gain as a function of applied voltage [125].


The actual calculation of the gas gain depends on dE/ds, which clearly depends on the geometry used to create the field. For example, we can calculate the gas gain for the case of the proportional wire chamber to be discussed in detail below. Substituting for the electric field from eqn 4.27 into the gas gain equation (eqn 4.24), we can show that the gas gain is

$$G = \exp\left\{ V \int\_{E\_{\rm min}}^{E\_{\rm max}} \frac{\alpha(E)}{\ln(b/a)E^2} \,\mathrm{d}E \right\} \tag{4.25}$$

If we use the linear approximation that α(E) ≈ βE, where β is an empirical constant, then we can integrate eqn 4.25. Taking E<sub>min</sub> = I/(λe) and E<sub>max</sub> = V/[a ln(b/a)], we can show that

$$G = \exp\left\{\frac{\beta V}{\ln(b/a)} \ln\left(\frac{V\lambda e}{aI\ln(b/a)}\right)\right\} \tag{4.26}$$

This allows us to understand the rapid and approximately exponential rise of the gas gain with applied voltage that we saw in Fig. 4.12 for the proportional regime.
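To see how steeply the gain rises, the sketch below evaluates eqn 4.26 and cross-checks it against a direct numerical integration of eqn 4.25 with α(E) = βE. All numerical inputs (wire and cathode radii, β, the ionization energy I, and the mean free path λ) are illustrative guesses chosen to give a gain of order 10<sup>4</sup>; they are not taken from the text or from measurements.

```python
import math


def field_limits(V, a, b, I_eV, lam):
    """Return (E_min, E_max) in V/m: E_min = I/(lam*e), E_max = V/(a ln(b/a)).

    With I given in eV, I/(lam*e) is numerically I_eV / lam.
    """
    return I_eV / lam, V / (a * math.log(b / a))


def gas_gain_closed_form(V, a, b, beta, I_eV, lam):
    """Gas gain from eqn 4.26 (alpha = beta * E); returns 1 if no avalanche starts."""
    e_min, e_max = field_limits(V, a, b, I_eV, lam)
    if e_max <= e_min:
        return 1.0
    return math.exp(beta * V / math.log(b / a) * math.log(e_max / e_min))


def gas_gain_numeric(V, a, b, beta, I_eV, lam, steps=20000):
    """Same gain from a midpoint-rule integration of eqn 4.25."""
    e_min, e_max = field_limits(V, a, b, I_eV, lam)
    if e_max <= e_min:
        return 1.0
    ln_ba = math.log(b / a)
    h = (e_max - e_min) / steps
    integral = 0.0
    for i in range(steps):
        E = e_min + (i + 0.5) * h
        integral += (beta * E) / (ln_ba * E * E) * h
    return math.exp(V * integral)


if __name__ == "__main__":
    # Illustrative parameters (assumptions): 15 um wire, 5 mm cathode radius.
    params = dict(a=15e-6, b=5e-3, beta=0.018, I_eV=26.0, lam=5e-6)
    for V in (1600.0, 1800.0, 2000.0, 2200.0):
        print(f"V = {V:.0f} V: G = {gas_gain_closed_form(V, **params):.3g}")
```

The exponent grows slightly faster than linearly in V because of the extra logarithm, which gives the near-exponential dependence of the gain on voltage seen in the proportional regime.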

## **Proportional wire chambers**

Figure 4.13 shows the geometry of a cylindrical proportional chamber. In a typical arrangement, there is a thin anode wire at a high-voltage (HV) potential of a few kilovolts on the axis and the cylindrical cathode is at ground potential. The wire has a radius of 10–20 μm. Assuming that the length of the wire is much greater than its diameter (a very good approximation), we can easily calculate the magnitude of the electric field from Gauss's law. Taking a cylindrical surface around the wire, we can show that the magnitude of the electric field is given by

$$\left| \mathbf{E} \right| = \frac{V}{\ln(b/a) \, r} \tag{4.27}$$

(see Fig. 4.14), where V is the potential difference between the anode and cathode, a and b are the radii of the anode and cathode, respectively, and r is the radial distance from the centre of the anode wire. The cell is filled with a gas. A common, cheap, and safe choice for the gas is a 9 : 1 argon and CO<sub>2</sub> mixture: the noble gas has the advantage of chemical inertness, so the electrons liberated by ionization will be able to travel without being absorbed (the role of the CO<sub>2</sub> is explained below).

A charged particle crossing the cell ionizes the gas, creating about 40–60 electron–ion pairs per centimetre. This number of electron–ion pairs is then increased by a factor of 2–3 because some electrons have enough energy to ionize the gas further. Electrons drift towards the anode and the much slower (massive) ions drift towards the cathode in a diffusion-like process. Very close to the anode, within a few times the anode wire diameter, the electric field is high enough (the anode wire is thin) to accelerate the drifting electrons to energies allowing further gas ionization, and this leads to an avalanche process (see the discussion of gas gain earlier in this section).<sup>28</sup> There is also recombination of electrons and ions with emission of UV photons. These photons, if not absorbed, could eject electrons from the cathode, leading to a continuous electric discharge. The role of the CO<sub>2</sub> (or another gas with molecules with many degrees of freedom) is to absorb the UV photons and transform their energies to molecular vibration or rotation, which then decay via emission of longer-wavelength photons. These longer-wavelength photons have too low an energy to eject electrons from the cathode. In this typical arrangement, the cell operates in the proportional regime.

Many different types of 'wire' chambers have been developed. They do not necessarily even have to contain wires, but they all rely on a large electric field to create an avalanche and they detect the induced currents caused by the drifting electrons and positive ions. These are described in the references in Further Reading.

**Fig. 4.13** A fundamental cell of a wire chamber. Not to scale.

**Fig. 4.14** Electric field inside the fundamental cell of a wire chamber.

<sup>28</sup>Electrons and ions are accelerated by the electric field, but also undergo many collisions with gas atoms and thereby acquire a uniform 'drift velocity' superimposed on their random motion as in a conductor. For our purposes, we can ignore the random motion and just consider the drift velocity. However, the random motion contributes to diffusion and is one of the factors limiting the resolution of wire chambers.


**Fig. 4.15** A multi-wire proportional chamber (MWPC).

**Fig. 4.16** A drift chamber and its fundamental cells.

<sup>30</sup>It is often a little more complicated, because, in the presence of a magnetic field, electrons do not drift along lines of the electric field in the drift chamber but at an angle to them, known as the Lorentz angle.

We will consider two common types of wire chambers:


A typical example of an MWPC, as sketched in Fig. 4.15, consists of a plane of anode wires between two planes of cathodes (sometimes cathode wires). Such chambers are often used in fixed-target experiments where charged particles cross the chambers close to perpendicular to their anode planes. If the spacing between anode wires is d, and a simple binary readout is used (i.e. a wire records either a hit or a no-hit), then the resolution of reconstructed points on the charged-particle trajectory is d/√12 (see Exercise 4.12). The separation between anode wires cannot be too small, because of the large electrostatic forces on the wires. The wires are held under tension to prevent neighbouring wires touching, but this imposes a minimum separation of a few millimetres.
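The d/√12 resolution quoted above is the r.m.s. of a uniform distribution of width d. A small Monte Carlo sketch in Python makes this concrete; the function name, the number of simulated tracks, and the assumption that each track fires only the nearest wire are all illustrative choices.

```python
import math
import random

def binary_readout_resolution(pitch, n_tracks=200_000, seed=1):
    """R.m.s. residual when each track is assigned to its nearest wire.

    Wires sit at integer multiples of `pitch`; the true crossing point is
    drawn uniformly, and the reconstructed position is the wire position.
    """
    rng = random.Random(seed)
    sum_sq = 0.0
    for _ in range(n_tracks):
        x_true = rng.uniform(0.0, 10.0 * pitch)
        wire_index = round(x_true / pitch)       # nearest wire records the hit
        residual = x_true - wire_index * pitch   # uniform in [-pitch/2, +pitch/2]
        sum_sq += residual * residual
    return math.sqrt(sum_sq / n_tracks)

pitch_mm = 2.0  # assumed wire spacing
print(binary_readout_resolution(pitch_mm), pitch_mm / math.sqrt(12.0))
```

The two printed numbers should agree to within the statistical precision of the simulation.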

Drift chambers have better spatial resolution, down to about 50 μm. A drift chamber in the barrel of a collider (head-on collisions) detector has a cylindrical structure. The anode wires are parallel to the chamber axis (parallel to the beam direction) and each wire is surrounded by cathode wires, creating a fundamental cell as sketched in Fig. 4.16. Such a cell might be several centimetres across, so the anode (or sense) wires are far apart from each other in comparison with an MWPC arrangement. This arrangement provides position measurements in the plane perpendicular to the beam axis. The trick is to measure the electrons' drift time. Using a signal from a fast independent detector like a scintillator, measuring precisely when particles in colliding beams interacted producing charged particles crossing the drift chamber, one can measure the time between the primary ionization and the leading edge of a signal from an anode wire.

Measurement of position along the beam direction can be done by different techniques. One method is to use anode wires at a small angle to the beam direction, which allows 'stereo' reconstruction of the distance along the beam axis. The geometry of the fundamental cell as well as the gas composition need to be chosen carefully, so that the drift velocity<sup>29</sup> of electrons is as uniform as possible across the cell. This allows precise measurement of the location where the primary ionization took place, which is calculated from the drift time and the drift velocity.<sup>30</sup> In older experiments, the time of the signal was measured relative to an independent signal from a fast detector like a scintillator. At the LHC, an external timing detector is not necessary, because the LHC machine clock running at about 40.08 MHz can be used.

<sup>29</sup>Typically a few cm per μs.
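The conversion from drift time to position can be sketched in a few lines of Python. The sketch assumes a perfectly uniform drift velocity of 5 cm per μs (a value chosen from the 'few cm per μs' range; real chambers need a measured time-to-distance relation), and the function names are hypothetical.

```python
UM_PER_NS_PER_CM_PER_US = 10.0  # 1 cm/us = 1e4 um / 1e3 ns = 10 um/ns

def drift_distance_um(t_drift_ns, v_drift_cm_per_us=5.0):
    """Distance of the primary ionization from the sense wire, in micrometres."""
    return v_drift_cm_per_us * UM_PER_NS_PER_CM_PER_US * t_drift_ns

def timing_precision_needed_ns(sigma_x_um, v_drift_cm_per_us=5.0):
    """Timing precision required for a given spatial resolution."""
    return sigma_x_um / (v_drift_cm_per_us * UM_PER_NS_PER_CM_PER_US)

print(drift_distance_um(100.0))          # drift distance for a 100 ns drift time
print(timing_precision_needed_ns(50.0))  # timing needed for 50 um resolution
```

Under this assumption, a 50 μm resolution corresponds to timing the leading edge of the signal to about a nanosecond.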

## **Signals and readout for wire chambers**

In this section, we will calculate the induced current in a cylindrical wire chamber. The electrons drifting towards the anode will create an avalanche very close to the anode wire. To a first approximation, we can neglect the induced signal from the flow of electrons because they travel such a short distance. We can then calculate the induced current as the positive ions drift away from the wire to the cathode.

We can easily calculate the current induced by the motion of a single ion in the simple cylindrical wire chamber using eqn 4.21. As this is a two-electrode geometry, we can read off the weighting field from the actual electric field by setting the voltage across the chamber to be 1 V and therefore, from eqn 4.27,

$$\left| \mathbf{E}\_{\rm w} \right| = \frac{1}{\ln(b/a) \, r} \tag{4.28}$$

The drift velocity of the ion, **v**<sub>d</sub>, is related to the electric field **E** by **v**<sub>d</sub> = μ**E**, where μ is the ion mobility. We will assume that the mobility is constant. If the number of electron–ion pairs created by the avalanche from a single primary electron is N<sub>tot</sub>, then the induced current (eqn 4.21) is<sup>31</sup>

$$I = -N\_{\rm tot} e \frac{v\_{\rm d}}{\ln(b/a) \, r} \tag{4.29}$$

Substituting for the electric field for this geometry, we get the ion speed as

$$v\_{\rm d} = \frac{\rm dr}{\rm dt} = \frac{\mu V\_0}{\ln(b/a) \, r} \tag{4.30}$$

Multiplying both sides by r, we can integrate eqn 4.30 (with r = a at t = 0) and solve for r:

$$r = a \left( 1 + \frac{t}{t\_0} \right)^{1/2} \tag{4.31}$$

where t<sub>0</sub> = a<sup>2</sup> ln(b/a)/(2μV<sub>0</sub>). Substituting from eqn 4.30 into eqn 4.29, we get

$$I(t) = -N\_{\text{tot}} e \frac{1}{\ln(b/a) \, r} \frac{\mu V\_0}{\ln(b/a) \, r} \tag{4.32}$$

Substituting for r from eqn 4.31 into eqn 4.32, we can calculate the induced current as a function of time:

$$I(t) = \frac{-N\_{\text{tot}}e}{2\ln(b/a)}\frac{1}{t+t\_0} \tag{4.33}$$

This current flows up to the time t<sub>max</sub> when the positive ions reach the cathode. We calculate t<sub>max</sub> from eqn 4.31 by setting r(t<sub>max</sub>) = b:

$$t\_{\text{max}} = (b^2 - a^2) \frac{\ln(b/a)}{2\mu V\_0} \tag{4.34}$$
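Equations 4.31 to 4.34 are easy to evaluate numerically. The Python sketch below does so for one assumed set of chamber parameters (wire radius, cathode radius, ion mobility, and voltage are illustrative guesses, not the values of any exercise in the text); the function names are hypothetical. It also integrates eqn 4.33 analytically, which shows that the total induced charge is −N<sub>tot</sub>e once the ions reach r = b.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs

def ion_pulse_times(a, b, mu, V0):
    """Return (t0, tmax) in seconds from the definitions below eqns 4.31 and 4.34."""
    log_ba = math.log(b / a)
    t0 = a * a * log_ba / (2.0 * mu * V0)
    tmax = (b * b - a * a) * log_ba / (2.0 * mu * V0)
    return t0, tmax

def ion_radius(t, a, t0):
    """Radial position of the ions, eqn 4.31."""
    return a * math.sqrt(1.0 + t / t0)

def induced_current(t, n_tot, a, b, t0):
    """Induced current in amperes, eqn 4.33."""
    return -n_tot * E_CHARGE / (2.0 * math.log(b / a)) / (t + t0)

def induced_charge(t, n_tot, a, b, t0):
    """Time integral of eqn 4.33 from 0 to t, in coulombs."""
    return -n_tot * E_CHARGE * math.log(1.0 + t / t0) / (2.0 * math.log(b / a))

# Assumed values in SI units: 20 um wire radius, 2 mm cathode radius,
# ion mobility 1.5e-4 m^2/(V s), 1 kV across the cell, 1e5 pairs per avalanche.
a, b, mu, V0, n_tot = 20e-6, 2e-3, 1.5e-4, 1000.0, 1e5

t0, tmax = ion_pulse_times(a, b, mu, V0)
print("t0 =", t0, "s, tmax =", tmax, "s")
print("fraction of charge induced in the first 100 ns:",
      induced_charge(100e-9, n_tot, a, b, t0) / induced_charge(tmax, n_tot, a, b, t0))
```

The logarithmic growth of the induced charge is why only a modest fraction of the signal arrives in the first 100 ns, even though the current peaks at t = 0.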

Calculating t<sub>0</sub> and t<sub>max</sub> for typical conditions (see Exercise 4.11), we find t<sub>0</sub> ∼ 10 ns and t<sub>max</sub> ∼ 100 μs. This pulse shape is illustrated in Fig. 4.17.

**Fig. 4.17** Typical pulse shape from a cylindrical wire chamber.

Equation 4.33 shows that the current pulse has a fast peak and then a very slow 'tail'. For a high-rate application such as a collider detector, we need fast pulses. We can produce a fast pulse by suitable 'pulse shaping'; this is done by filtering in frequency space to remove the low-frequency signals. A typical electronic readout circuit is sketched in Fig. 4.18. R<sub>1</sub> is very large (∼MΩ), thus protecting the anode wires from large currents.<sup>32</sup> R<sub>2</sub>C<sub>2</sub> and R<sub>2</sub>C<sub>1</sub> are time constants, small enough to allow a fast current flow through R<sub>2</sub>, the input resistance of a preamplifier connected to the anode wire, isolated from the high voltage by the C<sub>2</sub> capacitor.

**Fig. 4.18** Fundamental cell readout circuit.

<sup>31</sup>The signal from a single electron is too small to measure. However, with the large gas amplification, the signal can be measured by a suitable low-noise amplifier.

<sup>32</sup>These currents could otherwise flow from the high-voltage power supply to the ground, owing to occasional electric discharges (sparks), and might melt the wires (which are thin).

## **4.6.2 Silicon detectors**

Silicon strip detectors as well as silicon pixel detectors are playing an increasingly important role in tracking. The operation of silicon detectors is based on the fact that silicon is a semiconductor with a bandgap of 1.1 eV. A high-energy charged particle traversing silicon will interact with the silicon to create electron–hole pairs. However, most of the energy goes into phonons, so the average energy lost per electron–hole pair created is significantly larger, about 3.6 eV. This results in about 80 electron–hole pairs per micrometre for a minimum-ionizing particle. If no external field were applied, the electron–hole pairs would move apart slowly owing to diffusion; however, this process is too slow for most applications in particle detectors. Therefore, an electric field is applied to separate the electrons and holes. This motion of electrons and holes causes an induced current to flow in the external circuit as discussed in Section 4.4.

Even in high-purity, high-resistivity silicon, the presence of a strong electric field would result in an unacceptably large leakage current, i.e. current flowing even without the presence of the charged particle.<sup>33</sup> This problem is solved by making a pn junction, which forms a diode junction. When a reverse bias is applied to the diode, the free electrons are removed from the n-doped region, creating a 'depletion' region, in which there is a very low density of free carriers, thus allowing a large electric field to be applied, without paying the price of the unwanted large leakage current.<sup>34</sup>

<sup>33</sup>The leakage current represents a noise source that, if too large, will swamp the signal. In addition, the leakage current will lead to local heating of the silicon, and it is difficult to remove this heat without adding excess material.

<sup>34</sup>Thermal generation of electron–hole pairs will always occur, but the resulting leakage current is usually acceptable; if it is not, it can be reduced by cooling the silicon.

How thick does the silicon have to be to create a big enough signal? There is actually no correct answer to this question, because it depends on the amplifier, but a typical choice is 300 μm, which results in a signal of about 25 000 electron–hole pairs for a minimum-ionizing particle. The next question to consider is how large an electric field is needed to fully deplete the silicon. We can answer this question starting from Poisson's equation for the potential V in terms of the charge density ρ and the permittivity ε:

$$
\nabla^2 V = -\rho/\epsilon \tag{4.35}
$$

If we are assuming an effectively one-dimensional diode, and N is the net volume number density of charges, we can use eqn 4.35 to calculate the potential as a function of the distance x:

$$\frac{\mathrm{d}^2 V}{\mathrm{d}x^2} + \frac{Ne}{\epsilon} = 0\tag{4.36}$$

where e is the electron charge and ε is the permittivity of silicon. We will consider a detector with p strips in n bulk silicon. In this case, the p region is much more heavily doped than the n region, so we only need to consider the n-doped region. On applying the reverse bias, we remove all the free electrons from the n-doped region, which leaves behind a fixed space-charge density. Integrating eqn 4.36 gives<sup>35</sup>

$$\frac{\mathrm{d}V}{\mathrm{d}x} = -\frac{N\_{\mathrm{d}}e}{\epsilon}(x - x\_n) \tag{4.37}$$

where N<sub>d</sub> is the donor (electron) density and x<sub>n</sub> is the limit of the depletion region. Integrating eqn 4.37 gives

$$V = -\frac{N\_{\rm d}e}{\epsilon} \left( \frac{x^2}{2} - x x\_n \right) \tag{4.38}$$

Finally, the total voltage applied across the depletion region is found by setting x = x<sub>n</sub>:

$$V\_{\rm bias} = \frac{N\_{\rm d}e}{\epsilon} \frac{x\_n^2}{2} \tag{4.39}$$
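To get a feeling for the numbers in eqn 4.39, the following Python sketch evaluates the full-depletion voltage for a given donor density and thickness, and inverts the relation to find the donor density corresponding to a given bias voltage. The relative permittivity of silicon (about 11.9) and the example donor density of 10<sup>12</sup> cm<sup>−3</sup> are assumptions introduced here for illustration, and the function names are hypothetical.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS_0 = 8.8541878128e-12     # vacuum permittivity, F/m
EPS_R_SILICON = 11.9         # assumed relative permittivity of silicon

def depletion_voltage(n_d_per_m3, thickness_m):
    """Bias voltage for full depletion, eqn 4.39 with x_n = thickness."""
    eps = EPS_R_SILICON * EPS_0
    return n_d_per_m3 * E_CHARGE * thickness_m ** 2 / (2.0 * eps)

def donor_density_for(v_bias, thickness_m):
    """Donor density (per m^3) that is fully depleted at v_bias; inverse of eqn 4.39."""
    eps = EPS_R_SILICON * EPS_0
    return 2.0 * eps * v_bias / (E_CHARGE * thickness_m ** 2)

thickness = 300e-6   # 300 um sensor
n_d = 1e18           # assumed donor density: 1e12 per cm^3 = 1e18 per m^3

print("V_bias for the assumed N_d:", depletion_voltage(n_d, thickness), "V")
print("N_d for 50 V:", donor_density_for(50.0, thickness) * 1e-6, "per cm^3")
```

The quadratic dependence on thickness and the linear dependence on N<sub>d</sub> are the two features of eqn 4.39 that matter most in practice.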

Equation 4.39 shows why we need high-purity silicon to make good detectors: impurities contribute to N<sub>d</sub> and hence cause an increase in the bias voltage required for full depletion.<sup>36</sup> With typical detector-grade silicon, a 300 μm-thick silicon detector requires a bias voltage of about 50 V (see Exercise 4.13). We can calculate the drift velocities for electrons and holes:

$$v\_{\text{drift}} = \mu E \tag{4.40}$$

where μ is the mobility and E is the electric field. We can use the measured mobilities to calculate the maximum drift times for electrons and holes (see Exercise 4.13). Detectors are typically operated at higher bias voltages to speed up the signal collection. Great care is needed in the design of silicon detectors, because too large electric fields can lead to electrons gaining enough energy to cause secondary ionization, which leads to an avalanche effect and hence results in electrical breakdown. This will start in the region of highest electric field; any very small-scale non-uniformities in the electrode structure can cause enhanced electric fields and hence lead to electrical breakdown, even at relatively low bias voltages.

To determine whether this small signal<sup>37</sup> can be detected, it is essential to consider all sources of electronic noise. This is discussed in detail in Spieler's book in Further Reading, from which we see that we need to have low-capacitance detectors. For high-rate applications, such as the silicon trackers at the LHC, we need to minimize 'pile-up' backgrounds from hits in previous bunch crossings<sup>38</sup> generating spurious hits in the triggered bunch crossing. This implies that the 'shaping time' of the electronics should be not more than O(25 ns). The challenge is to design low-noise amplifiers that are sufficiently fast and consume low power.<sup>39</sup>

<sup>35</sup>dV/dx is just equal to minus the electric field.

<sup>36</sup>Too high a bias voltage will result in electrical breakdown in the cables or the silicon detector itself.

<sup>37</sup>In normal operation, there is no equivalent of gas gain in silicon detectors, although devices like avalanche photodiodes can be operated at sufficient voltage for amplification to occur.

<sup>38</sup>Occurring every 25 ns under nominal LHC operating conditions.

<sup>39</sup>Electrical power consumption must be minimized: if more heat is generated, the cooling system must be larger, degrading the tracker resolution and creating unwanted secondary interactions in the tracker.

## **Radiation damage**

One of the difficulties with the application of silicon detectors in particle physics, particularly at the LHC, is radiation damage. At a radius of 30 cm from the beam line, the expected ionizing dose over the detector lifetime is 100 kGy(Si).<sup>40</sup> High-energy particles can displace silicon atoms from their lattice sites, creating complex defects that result in states between the valence and conduction bands (called mid-bandgap states). This makes it much easier for thermal generation to promote an electron from the valence to the conduction band, which greatly increases the leakage current. The leakage current is strongly dependent on the temperature T:

$$I\_{\rm leak}(T) = AT^2 \exp\left(-\frac{E\_{\rm g}}{2k\_{\rm B}T}\right) \tag{4.41}$$

where k<sub>B</sub> is Boltzmann's constant and E<sub>g</sub> is the bandgap, which for silicon is 1.1 eV.<sup>41</sup> Therefore, the leakage current can be very efficiently suppressed by cooling the silicon. These mid-bandgap states act like extra acceptors and thus change the effective dopant concentration.<sup>42</sup> From eqn 4.39, we can see that an increase in the effective dopant concentration will result in detectors requiring higher bias voltages to be fully depleted. The electrical breakdown of detectors at very high voltages therefore sets the scale for the maximum radiation doses that can be tolerated. In addition, some of the extra states can cause 'charge trapping', which looks like a signal loss.

<sup>40</sup>Gy(Si) is the SI unit of dose, corresponding to 1 J of energy deposited per kg of Si: a dose of 100 kGy(Si) corresponds to ∼10<sup>9</sup> lung X-rays.

<sup>41</sup>A useful rule of thumb is that the leakage current doubles for every 7 K increase in temperature.

<sup>42</sup>With sufficient damage, this can cause n-type silicon to change to p-type silicon, a process called type inversion. However, detectors can operate satisfactorily after type inversion.
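The temperature dependence in eqn 4.41 can be explored with a short Python sketch. Only ratios of leakage currents are computed, so the unknown constant A cancels. The bandgap of 1.1 eV is the value quoted in the text; the operating temperatures used in the example are assumptions, and the function name is hypothetical.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K
E_GAP_SI_EV = 1.1              # silicon bandgap as quoted in the text

def leakage_ratio(t_from_k, t_to_k, e_gap_ev=E_GAP_SI_EV):
    """I_leak(t_to) / I_leak(t_from) from eqn 4.41 (the constant A cancels)."""
    power_law = (t_to_k / t_from_k) ** 2
    exponent = -e_gap_ev / (2.0 * K_B_EV_PER_K) * (1.0 / t_to_k - 1.0 / t_from_k)
    return power_law * math.exp(exponent)

# Effect of a 7 K temperature rise around an assumed cold operating point.
print("263 K -> 270 K:", leakage_ratio(263.0, 270.0))

# Suppression obtained by cooling from room temperature to the same cold point.
print("293 K -> 263 K:", leakage_ratio(293.0, 263.0))
```

For these assumed temperatures, a 7 K rise changes the current by roughly a factor of two, in line with the rule of thumb, and cooling by 30 K suppresses it by more than an order of magnitude.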

## **Silicon systems**

The spatial resolution of a silicon detector is largely determined by the segmentation of the silicon into individual detector channels. If the width of a detector segment is x, and if a particle only causes a hit in a single channel, then the spatial resolution in this direction is x/√12. Improved resolution can be achieved by using signals in neighbouring channels; the amount of charge sharing with neighbouring channels gives extra information on the location of the 'hit'.

There are generally two classes of silicon detector systems: strips and pixels.

## **Strip systems**

A very simplified schematic cross-section of part of a generic silicon strip detector is shown in Fig. 4.19. A positive high voltage is applied to the 'backside' via the Al contact, which depletes the n-bulk silicon. Electron–hole pairs created by ionizing particles drift in the electric field and the current induces signals on the readout electrodes. The signal electrodes are AC-coupled to the Al strips (using the SiO<sub>2</sub> as an insulator), which are then connected to the preamplifiers in the readout ASIC (application-specific integrated circuit). The noise increases with the detector capacitance (see Spieler in Further Reading for an explanation); therefore, for high-rate applications such as the LHC, we must minimize any stray capacitance between the detector and the amplifier. The connection is made with 'wire bonds', typically aluminium wires a few millimetres long and 25 μm thick. These thin bond wires can be ultrasonically bonded to pads on the detector and on the readout ASIC. This allows a very short connection between the detector and the amplifier, which introduces much less capacitance than a longer wire cable. We also need to create a DC return path for the current and this requires a large-value resistor, so that the fast signal flows across the capacitor. This is achieved with polysilicon resistors inside the silicon detector itself.

In a strip detector, the silicon wafer is divided into long narrow strips, with typical strip widths in the range of 50–100 μm (the largest wafers used are 6 inches in diameter). This is done to obtain good precision in the bending plane of the magnetic field. Modest resolution (∼1 mm) in the orthogonal direction is achieved by having half the sensors with a small stereo angle. This has the disadvantage that it creates ambiguities if more than one particle hits a sensor.<sup>43</sup>

The amplifiers are in custom-designed ASICs. As the time taken for the first-level trigger (L1) (see Section 4.10) is of the order of microseconds, which is much longer than the 25 ns between bunch crossings, the data must be kept on-detector until the trigger decision is made. This is achieved with 'pipeline' memory in which the data from each strip for each bunch crossing are stored in different memory elements (see Fig. 4.20). If the L1 rejects the event, the corresponding data can be overwritten. If the event is triggered at L1, the corresponding data are read out via optical links.
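The pipeline memory described above behaves like a ring buffer with a write pointer and a read pointer that trails it by a fixed latency. The following Python sketch is a toy model of one channel under that assumption; the class name, the depth of 12 cells, and the latency of 8 clock cycles are illustrative and do not describe any real ASIC.

```python
class PipelineMemory:
    """Toy model of an on-detector pipeline for a single strip."""

    def __init__(self, depth=12, latency=8):
        if not 0 < latency < depth:
            raise ValueError("latency must be between 1 and depth - 1")
        self.depth = depth
        self.latency = latency          # trigger decision time in clock cycles
        self.cells = [None] * depth
        self.write_ptr = 0

    def clock(self, data, l1_accept=False):
        """Advance one bunch crossing.

        `data` is stored for the current crossing. `l1_accept` is the trigger
        decision for the crossing that happened `latency` cycles ago; if it is
        positive, that crossing's data are returned, otherwise None.
        """
        self.cells[self.write_ptr] = data
        read_ptr = (self.write_ptr - self.latency) % self.depth
        output = self.cells[read_ptr] if l1_accept else None
        # Both pointers wrap around after the last cell.
        self.write_ptr = (self.write_ptr + 1) % self.depth
        return output

pipeline = PipelineMemory(depth=12, latency=8)
for crossing in range(30):
    accepted = pipeline.clock(data=crossing, l1_accept=(crossing == 20))
    if accepted is not None:
        print("trigger at crossing", crossing, "reads out crossing", accepted)
```

Cells belonging to rejected crossings are never read; they are simply overwritten when the write pointer comes round again.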

A schematic view of an ATLAS Semi-Conductor Tracker (SCT) module is shown in Fig. 4.21. The module consists of two pairs of silicon wafers glued together to make a double-sided module. The ASICs are mounted on flexible copper–Kapton circuits. The beryllia (BeO) 'ear' at the side allows the module to make good thermal contact with the cooling tube. The coolant used is perfluoropropane (C<sub>3</sub>F<sub>8</sub>), since this provides very efficient two-phase cooling; i.e. the heat from the ASICs and the silicon detectors is used to evaporate liquid C<sub>3</sub>F<sub>8</sub>. These are very large systems, with 60 m<sup>2</sup> of silicon detectors for ATLAS and 200 m<sup>2</sup> for CMS. The modules have to be held rigidly in place to benefit from the high intrinsic spatial resolution, but the material must be minimized because any material causes multiple scattering of all charged particles and results in electrons and photons starting electromagnetic showers before the calorimeter. Therefore, each module is mounted on carbon-fibre support structures since these provide the best ratio of stiffness to weight.

**Fig. 4.19** Schematic cross-section through a silicon micro-strip detector with p implants in an n-bulk silicon.

**Fig. 4.20** Principle of a pipelined memory. At each clock cycle, data are written into the cell defined by the write pointer. This pointer is advanced by one cell every clock cycle; after it gets to the last cell, it cycles back to the first. The read pointer follows a fixed number of clock cycles behind the write pointer. The time delay between the write and read pointers defines the time available for making a trigger decision. If the decision is positive, the data are read out from the corresponding cell; if not, then new data can be written into this cell. When the pointers advance beyond the last cell (12 in this unrealistic example), they cycle back to cell 1.

**Fig. 4.21** Schematic view of an SCT module [6].

<sup>43</sup>In the ATLAS case, a discriminator is used to determine if hits are above threshold, so the output data are digital. In the CMS tracker, the signal amplitude is transmitted off-detector via analog optical links.

## **Pixel systems**

In silicon pixel detectors, the silicon is divided into much smaller areas; for example, in the ATLAS pixel detector, the dimensions of individual pixels are 50 μm × 400 μm. The smaller dimension is in the bending plane of the magnetic field to optimize the momentum resolution. The first advantage of pixel over strip detectors is that they provide unambiguous high-precision space points. In addition, the 'occupancies' (i.e. the fractions of detector elements that are hit in given events) are much lower for pixel detectors than for strips. This is vital for pattern recognition at the LHC, which has to reconstruct tracks in the presence of 'pile-up' background from about 25 collisions in the same bunch crossing. The small area of the pixels means that the detector capacitance is very low, which allows very low noise to be achieved (see Spieler in Further Reading). However, this requires minimization of stray capacitance between the silicon pixel and the amplifier in the ASIC. One of the main difficulties with pixel systems is how to make the electrical connection from each silicon pixel to a unique channel of the readout ASIC without introducing any significant capacitance. This is achieved by 'bump bonding'.<sup>44</sup> The much larger number of channels in pixel systems than in strips requires more sophisticated data processing in the ASICs.<sup>45</sup> Other system aspects for pixels are similar to those for strips.

Pixel systems offer many performance advantages over strips, but as the electronics covers essentially the full sensitive area, a layer of pixel detector will have more material than an equivalent layer of strips. In addition, pixel detectors are significantly more expensive than strip detectors of the same dimensions. Therefore, LHC detector systems are a compromise, with pixels being used close to the beam pipe and strips being used further away.

## **4.6.3 Tracker performance**

Consider the track of a charged particle with momentum p (measured in GeV) perpendicular to a magnetic field **B**. The radius of curvature R is related to the momentum by p = 0.3BqR. We assume that the track is measured over a length l (see Fig. 4.22). From the geometry, we can relate R to the 'sagitta' s and l by Pythagoras' theorem: R<sup>2</sup> = (R − s)<sup>2</sup> + (l/2)<sup>2</sup>. For high-momentum tracks, we can neglect the s<sup>2</sup> term and find 1/R = 8s/l<sup>2</sup>. Therefore, the error in 1/R is given by σ(1/R)=8δs/l<sup>2</sup>. To make approximate estimates of the momentum resolution, we will assume that the track is measured very precisely at the start and end of the trajectory but with an error given by δs at the midpoint. In this approximation,

$$
\sigma(1/p) = \frac{8\delta s}{0.3Bql^2} \tag{4.42}
$$
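The sagitta argument and eqn 4.42 can be checked numerically with the Python sketch below. It assumes the usual units for p = 0.3BqR (p in GeV, B in tesla, R and l in metres, q in units of the elementary charge); the field, lever arm, point resolution, and momentum in the example are illustrative values, and the function names are hypothetical.

```python
import math

def radius_of_curvature(p_gev, b_tesla, q=1.0):
    """Radius R in metres from p = 0.3 B q R."""
    return p_gev / (0.3 * b_tesla * q)

def sagitta_exact(radius_m, length_m):
    """Sagitta from R^2 = (R - s)^2 + (l/2)^2, without approximation."""
    return radius_m - math.sqrt(radius_m ** 2 - (length_m / 2.0) ** 2)

def sagitta_approx(radius_m, length_m):
    """High-momentum approximation s = l^2 / (8 R)."""
    return length_m ** 2 / (8.0 * radius_m)

def sigma_inverse_p(delta_s_m, b_tesla, length_m, q=1.0):
    """Error on 1/p in 1/GeV, eqn 4.42."""
    return 8.0 * delta_s_m / (0.3 * b_tesla * q * length_m ** 2)

def relative_momentum_resolution(p_gev, delta_s_m, b_tesla, length_m, q=1.0):
    """sigma(p)/p = p * sigma(1/p), since sigma(1/p) = sigma(p)/p^2."""
    return p_gev * sigma_inverse_p(delta_s_m, b_tesla, length_m, q)

# Assumed example: 2 T field, 1 m track length, 50 um sagitta error, 100 GeV track.
B, l, ds, p = 2.0, 1.0, 50e-6, 100.0
R = radius_of_curvature(p, B)
print("R =", R, "m")
print("sagitta (exact, approx) =", sagitta_exact(R, l), sagitta_approx(R, l), "m")
print("sigma(p)/p =", relative_momentum_resolution(p, ds, B, l))
```

Because σ(1/p) in eqn 4.42 does not depend on p, the relative resolution σ(p)/p grows linearly with momentum, and the l<sup>2</sup> in the denominator shows why a long lever arm is so valuable.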

Although eqn 4.42 is a rough approximation, some general features are valid:


If **B** and l are fixed and we wish to measure momenta up to some value p<sub>max</sub>, we can use eqn 4.42 to estimate the required spatial resolution (see Exercise 4.10). So far, we have only considered the contribution of the precision of the measurement points. However, in a real detector, we have material, so the charged particles undergo multiple scattering (eqn 4.6). As the scattering angle is inversely proportional to the momentum, this causes a contribution to the fractional error in p that is constant, i.e. a contribution to σ(1/p) that is proportional to 1/p. Therefore, the momentum resolution of a real tracker can be parameterized by adding the effects of measurement precision and multiple scattering in quadrature:

$$
\sigma(1/p) = C \oplus D/p \tag{4.43}
$$

where D is the term due to multiple scattering and C is the term due to the measurement resolution.<sup>46</sup>

**Fig. 4.22** Definition of the track sagitta s.

**Fig. 4.23** Schematic view of tracks in the transverse plane showing tracks from the primary vertex and the definition of the impact parameter d<sub>0</sub> from the one track resulting from a decay.

<sup>44</sup>In this process, indium solder is deposited on metallized pads on the pixel and heated in a reflow process to form hemispherical solder balls; the detector is then flipped and positioned very precisely over a flexible circuit with the readout ASICs already mounted. A further reflow of the solder results in an electrical connection between the pixels and the amplifiers in the readout ASICs. It is difficult to achieve a high yield and this process is very expensive.

<sup>45</sup>The area required in an ASIC to implement a pipeline for each pixel would be unacceptably large. Therefore, another approach is used for the pipeline that benefits from the very low occupancies in the pixels. A data-driven pipeline is used so that when a pixel is above threshold for a given bunch crossing, a 'time stamp' for that pixel address is written in memory. When a first-level trigger is received, the data for all the pixels with the correct time stamp are read out.

<sup>46</sup>The optimization of the overall resolution is an interesting trade-off, because adding more measurements will decrease C but will add more material and therefore increase D.

<sup>47</sup>We cannot afford to instrument the entire calorimeter with the fine segmentation required to measure the electromagnetic showers.

Another important measure of the performance of a tracking detector at a collider is how precisely the tracks can be extrapolated back to the primary vertex. Particles originating from decays of b or c quarks or τ leptons will travel for the order of 1 ps before decaying, and hence if one extrapolates the tracks back, they will miss the primary vertex. In the plane transverse to the beam direction, this distance is called the impact parameter (see Fig. 4.23). The resolution in impact parameter depends on the intrinsic resolution of the tracker and multiple scattering. Therefore, one requires a very high-precision measurement as close to the beam line as possible, and this is performed with silicon detectors (either strip or pixels). To minimize multiple scattering, one needs to have a very thin (in radiation lengths) beam pipe, and the best choice is beryllium (although beryllium is very difficult to machine and hence expensive).

## **4.7 Detectors for particle jets**

The energies of particles and 'jets of particles' are measured in detector systems called 'calorimeters'. Ideally, all particles with the exception of muons and neutrinos (or still to be discovered neutrino-like weakly interacting particles) should deposit all their energies in the calorimeter. As electromagnetic showers occupy much smaller volumes than hadronic showers (see Section 4.3), we require much finer segmentation for the front of the calorimeter than the back.<sup>47</sup> Therefore, the design of calorimeter systems is usually split into 'electromagnetic calorimeters' and 'hadronic calorimeters'.

## **4.7.1 Electromagnetic calorimeter**

The depth of the electromagnetic calorimeter is chosen such that nearly all the energy of electromagnetic showers from electrons and photons of the interesting energy range is contained in this part of the calorimeter. This can be determined from Monte Carlo simulations such as those illustrated in Fig. 4.6. At LHC energies, we need to measure electrons and photons with energies of several hundred GeV; therefore, the electromagnetic calorimeter needs to be about 25X<sub>0</sub> deep. Finer longitudinal sampling will also help separate showers induced by electrons from those induced by hadrons. The lateral shower size is set by the Molière radius (see Section 4.3), which for lead is R<sub>M</sub> = 1.8 cm. The scale for the lateral size of hadronic showers is set by the hadronic interaction length λ<sub>I</sub> and is typically an order of magnitude larger. We can therefore achieve further separation between showers induced by electrons and hadrons with fine lateral and longitudinal segmentation. There are two different types of electromagnetic calorimeters:


There are many trade-offs between these approaches. In a sandwich calorimeter, most of the energy is deposited in the passive layers and there are significant fluctuations in the fraction of the energy deposited in the active layers. This usually limits the resolution of sandwich calorimeters and the best resolution can be achieved with homogeneous calorimeters, for which this effect does not arise. However, the average density of crystals used in homogeneous calorimeters tends to be lower than that in sandwich calorimeters, which therefore increases the depth of the electromagnetic calorimeter. This results in larger volumes for the hadronic calorimeter and the muon system, and will thus increase the cost.

## **4.7.2 Homogeneous calorimeters**

Homogeneous calorimeters are usually based on scintillating crystals (Section 4.4.2). The CMS electromagnetic calorimeter (ECAL) is an example of this technique. At the LHC, the scintillation must be fast because of the short time between bunch crossings (25 ns). The crystals must have very good radiation tolerance in order to survive many years of LHC operation. Finally, the crystals must have a very high density in order to keep the dimensions small enough. The CMS electromagnetic calorimeter is based on lead tungstate (PbWO<sub>4</sub>) crystals with a density of 8.28 g cm<sup>−3</sup> and a radiation length of 0.89 cm. About 80% of the scintillation light is emitted in less than 25 ns [62]. One challenge with this system is that the transparency of the crystals decreases with radiation, and therefore sophisticated monitoring techniques are required to compensate for these effects. In addition, the light output is very sensitive to temperature, so the temperature needs to be maintained at a constant value. Because photomultipliers cannot be used in very strong magnetic fields, the scintillation light is read out by avalanche photodiodes (APDs).<sup>48</sup> A photograph of one such crystal with the APD readout is shown in Fig. 4.24.

<sup>48</sup>In the end-cap calorimeter, the radiation levels are too large for the use of APDs, and vacuum phototriodes are used instead.

**Fig. 4.24** Photograph of a PbWO<sub>4</sub> crystal and readout for the CMS electromagnetic calorimeter [62].

**Fig. 4.25** Schematic view of one cell of a sandwich scintillator calorimeter with wavelength-shifting plates to guide the light to the photomultiplier at the back.

## **4.7.3 Sandwich calorimeters**

In a sandwich calorimeter, there are alternating layers of active and passive material. The passive material should have high Z (to enable a relatively compact design)—lead is a common choice. The total energy detected in the active layers is only a fraction of the total energy deposited. This fraction can be measured in prototypes or small parts of the calorimeter in dedicated test beams in which the energy of the incident electrons is fixed, although at the LHC the rate of Z production is so high that 'in situ' calibration can be performed. Before the LHC, the most common design of electromagnetic sandwich calorimeter used plastic scintillators for the active layers. The scintillation light (see Section 4.4.2) needs to be guided to the photomultipliers at the back of the calorimeter. This is done using 'wavelength-shifting' plates (see Fig. 4.25). These contain fluors to shift the wavelength to longer wavelengths (typically in the green), for which the plastic is more transparent. There are several limitations to this technique:


A newer approach to scintillator sandwich calorimeters uses wavelength-shifting fibres embedded in the scintillator to transport the light to the photodetectors. This avoids the need for bulky waveguides, which add to the 'cracks' between calorimeter cells.

To overcome these limitations, a novel type of electromagnetic calorimeter has been developed for ATLAS, based on a new geometry for lead absorbers and liquid-argon ionization chambers. The signals are generated by electrons created by ionization, drifting in a large electric field and generating an induced current at the electrodes (see Section 4.4).<sup>49</sup> The fundamental problem with this technique for use at the LHC is that typical drift times for the electrons are ∼400 ns, which is much longer than the time between bunch crossings of 25 ns. The solution is based on very fast 'bipolar' pulse-shaping electronics, in which most of the signal is not detected but a suitably fast pulse is generated. This is illustrated in Fig. 4.26. As most of the signal is not utilized, it is essential to lower the noise in order to maintain the signal-to-noise ratio. This is achieved by lowering the capacitance and inductance of the electrodes using a novel 'accordion' geometry as shown in Fig. 4.27. An important advantage of this technique is that liquid argon is inherently radiation-hard.

<sup>49</sup>This is very similar to a wire chamber operating in the 'ionization chamber' region, in which there is no gas gain. However, as liquids are much denser than gases, a high-energy charged particle can create sufficient ionization for the signal to be detectable.

**Fig. 4.26** Signal pulse shape in the ATLAS liquid-argon calorimeter [20]. The triangular shape is the current pulse created by the electron drift. The curve shows the pulse shape after shaping with a bipolar pulse shaper.

**Fig. 4.27** Sketch of a small section of a prototype for the ATLAS electromagnetic calorimeter, illustrating the 'accordion' structure [38] (all dimensions are in millimetres). In this geometry, the signals are transported to the electronics on flat copper/Kapton tapes, which have lower capacitance and inductance per unit length than the cables that would be required if the electrodes were orthogonal to the direction of incidence of particles. The absorber plates are made from lead lined with stainless steel. The liquid argon is contained between the absorber plates, and the copper/Kapton electrodes are attached to these plates.


**Table 4.2** Electromagnetic calorimeter resolution for prototype calorimeters measured in test beams.

It is relatively easy to divide the readout cells to the desired lateral and longitudinal granularity. Another critical advantage is that the structure is self-supporting, so there is no need for passive material between cells, thus avoiding the cracks inherent in calorimeters based on plastic scintillators for the active layers.

## **4.7.4 Resolution**

The energy resolution of a typical electromagnetic calorimeter can be parameterized as

$$\frac{\Delta E}{E} = \frac{a}{\sqrt{E}} \oplus \frac{b}{E} \oplus c \tag{4.44}$$

where a, b, and c are constants and the different terms are added in quadrature. The constant a represents the 'stochastic term', b represents the contribution from electronic noise, and c is a constant term. In a calorimeter using a scintillator, if at a given energy the mean number of detected photons is N, there will be Poisson fluctuations giving a contribution to the stochastic term

$$\frac{\Delta E}{E} \sim \frac{\Delta N}{N} \sim \frac{\sqrt{N}}{N} = \frac{1}{\sqrt{N}} \Rightarrow \frac{\Delta E}{E} = \frac{a}{\sqrt{E}}$$

However, in a sandwich calorimeter, this effect is usually negligible compared with the 'sampling' fluctuations, i.e. fluctuations in the fraction of energy deposited in the active layers.<sup>50</sup> The constant b in eqn 4.44 represents the contributions from electronic noise and should be negligible at high energies. The constant term c represents the effects of residual non-uniformities in response across the cell and over all cells, as well as variations in time. With the aid of good calibration procedures, the constant term can be reduced to less than 1%.<sup>51</sup> Measured parameters from test-beam studies of the ATLAS [20] and CMS [62] electromagnetic calorimeters are given in Table 4.2. However, during LHC operation, there are other factors that will degrade the resolution, such as radiation damage, uncertainties in the calibration constants, and 'pile-up' backgrounds (particles from extra collisions in the same bunch crossing). For the very important Higgs decay, H → γγ, the precision of the angular measurement also contributes to the mass resolution. These factors favour the finer segmentation, the intrinsic stability, and the radiation hardness of a liquid-argon calorimeter compared with a scintillator calorimeter. The result is that the mass resolution for the Higgs decay H → γγ is comparable for ATLAS and CMS.

<sup>50</sup>There are many interesting trade-offs here. If the scintillator/passive ratio is increased, sampling fluctuations are reduced, but the size and cost of the calorimeter are increased. As discussed in Sections 4.7.1 and 4.7.2, there is no perfect design.

<sup>51</sup>At the LHC, the very large sample of Z → e<sup>+</sup>e<sup>−</sup> events provides ample data for in situ calibration of the electromagnetic calorimeters.
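To see how the three terms of eqn 4.44 compete, the quadrature sum can be evaluated numerically. The following is a minimal sketch: the function name and the parameter values are illustrative assumptions (they are not the measured values of Table 4.2), chosen only to show that the noise term matters at low energy and the constant term dominates at high energy.

```python
import math


def fractional_resolution(energy_gev, a, b, c):
    """Return dE/E from eqn 4.44: a/sqrt(E), b/E and c added in quadrature."""
    if energy_gev <= 0:
        raise ValueError("energy must be positive")
    stochastic = a / math.sqrt(energy_gev)  # stochastic (sampling/photostatistics) term
    noise = b / energy_gev                  # electronic-noise term
    return math.sqrt(stochastic**2 + noise**2 + c**2)


# Illustrative parameters only (assumptions, not taken from Table 4.2).
A_STOCHASTIC = 0.10  # units of GeV^(1/2)
B_NOISE = 0.30       # units of GeV
C_CONSTANT = 0.007   # dimensionless

for energy in (1.0, 10.0, 100.0, 1000.0):
    res = fractional_resolution(energy, A_STOCHASTIC, B_NOISE, C_CONSTANT)
    print(f"E = {energy:7.1f} GeV  ->  dE/E = {res:.4f}")
```

With these assumed values, the resolution falls steadily with energy and flattens out near the constant term c, which is why the calibration quality that controls c is so important at LHC energies.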

## **4.7.5 Hadronic calorimeter**

The hadronic calorimeter surrounds the electromagnetic calorimeter. Ideally, the combined electromagnetic and hadronic calorimeters should contain nearly all the energy from showers from hadrons entering the calorimeters (mostly π<sup>±</sup>). An indication of the required depth of the calorimeter can be deduced from the curves in Fig. 4.8. The practical depths for hadronic calorimeters are constrained by cost and available space, but a rule of thumb is that at LHC energies a depth of at least about 10 nuclear interaction lengths is required. A homogeneous hadronic calorimeter would be too large and so is not a practical option; the hadronic calorimeter is therefore of the 'sandwich' type. The resolution of a hadronic calorimeter is greatly degraded if the calorimeter is not 'compensating', that is, if the ratio of the response to electrons to that to hadrons, e/h, is significantly different from unity (see Section 4.3.6). There are several possible approaches to achieving compensation in hadronic calorimeters:


of uranium in the ZEUS calorimeter, which achieved compensation. However, the first two items were more important than fission.

and quartz fibres. The signal from the scintillator (S) and the Čerenkov (Č) radiation in the quartz fibres are measured separately. The values of e/h are very different for the S and Č signals, which enables determination of the electromagnetic fraction f<sub>em</sub> for individual showers. The effect of e/h being different from unity can therefore be corrected, effectively achieving the good hadronic resolution of compensating calorimeters.

Although compensating calorimeters have been built, there are disadvantages in cost and/or resolution for electrons and photons, and the calorimeters for the LHC experiments are not compensating.<sup>53</sup> In a highly segmented calorimeter such as that used by ATLAS, the hadronic resolution can be improved by 'software compensation': the secondary electromagnetic showers are smaller than hadronic showers, so they lead to higher energy density in the calorimeter cells. Therefore, the electron response can be decreased by de-weighting cells with large energy, thus making the response closer to being compensating and thereby improving the resolution. If the calorimeter cells are calibrated using electrons, the naive estimate of the energy in a hadronic shower would be given by E = Σ<sub>i</sub> E<sub>i</sub>, where E<sub>i</sub> is the energy in the ith calorimeter cell. As electromagnetic showers are more compact, the cells with higher local energy density will probably have arisen from electromagnetic showers. A correction factor is applied for hadronic showers. The correction factor decreases for showers with higher local density of energy deposition.<sup>54</sup> The calibration procedure used to determine the calibration factors aims to reconstruct the true energy on average and to optimize the resolution.

<sup>53</sup>The ATLAS barrel calorimeter has e/h ≈ 1.4.

<sup>54</sup>The correction factors depend on the energy as well as on the local energy density.
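The idea of software compensation can be illustrated with a toy calculation. This is only a sketch under stated assumptions: the weight function `cell_weight`, its parameters, and the toy cell list are hypothetical and are not the ATLAS calibration. The only features taken from the text are that the naive estimate is the plain sum of cell energies and that the correction factor decreases as the local energy density increases.

```python
import math


def naive_energy(cells):
    """Naive estimate E = sum_i E_i with cells calibrated using electrons."""
    return sum(energy for energy, _volume in cells)


def cell_weight(density, w_low_density=1.3, density_scale=1.0):
    """Hypothetical correction factor.

    Equals w_low_density for very sparse (hadron-like) cells and tends to 1
    for dense (electromagnetic-like) cells, so it decreases with density.
    """
    return 1.0 + (w_low_density - 1.0) * math.exp(-density / density_scale)


def weighted_energy(cells):
    """Sum of cell energies, each scaled by its density-dependent weight."""
    total = 0.0
    for energy, volume in cells:
        density = energy / volume
        total += cell_weight(density) * energy
    return total


# Toy shower: (cell energy in GeV, cell volume in arbitrary units).
toy_cells = [(20.0, 1.0), (5.0, 1.0), (1.0, 4.0), (0.5, 4.0), (0.2, 4.0)]

print(f"naive sum    : {naive_energy(toy_cells):.2f} GeV")
print(f"weighted sum : {weighted_energy(toy_cells):.2f} GeV")
```

In this toy, the dense cells are left almost unchanged while the sparse cells are scaled up, which is one way of making the relative response to the electromagnetic and hadronic components more equal. In a real detector the weights would be fitted to test-beam or simulated data rather than chosen by hand.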

The resolution for hadronic calorimeters can be parameterized by the same form as for electromagnetic calorimeters (eqn 4.44). The stochastic term will be larger because of the relatively coarse sampling, and if the calorimeter is non-compensating, then there will be a large constant term, which will dominate the resolution at high energies. If the calorimeter is not sufficiently deep, the energy lost at the back of the calorimeter will also contribute to the constant term. Any crack regions between cells or non-uniformity of the response over a cell will also add to the constant term. Typical examples of hadronic resolution for compensating and non-compensating calorimeters are given in Table 4.3. The superior resolution of the compensating ZEUS calorimeter [51] compared with the non-compensating ATLAS scintillating tile calorimeter [20] is clear, but even so the resolution is far inferior to that achieved by electromagnetic calorimeters. However, the compensation achieved in the ZEUS calorimeter came at the price of degrading the electromagnetic resolution. So, as is usual in detector physics, there is no perfect answer and designs must be optimized to the requirements of a particular experiment.

**Table 4.3** Energy resolution for prototype hadronic calorimeters measured in test beams.

## **4.8 Detectors for particle identification**

In this section, we review some detector techniques for particle identification. Some particle identification is performed by combining signals in different types of detectors,<sup>55</sup> but here we restrict ourselves to types of detectors that give standalone particle identification.

<sup>55</sup>For example, a high-momentum track that is matched to an electromagnetic shower in a calorimeter can be identified as an electron.

## **4.8.1 Particle identification with Čerenkov detection**

There are two practical applications of Čerenkov radiation for particle identification:


## **4.8.2 Particle identification with transition radiation**

We have seen that charged particles crossing a boundary between two dielectric layers can emit X-rays. As the transition radiation increases with the Lorentz γ factor, for practical purposes the yield is only significant for high-energy electrons, and this therefore provides a method to separate high-energy electrons from charged hadrons. As the photon yield per dielectric boundary is so low, we need many such boundaries. This sets a lower limit on the required length for a useful transition-radiation detector.<sup>56</sup> The transition-radiation photons are in the X-ray region. These X-ray photons can be detected in wire chambers with a large fraction of a heavy noble gas like xenon. Xenon has Z = 54, which results in a large absorption cross section for X-rays, thus increasing the probability of X-ray absorption in a thin layer of gas. The energy deposited by X-rays is larger than the typical energy deposited by ionization in the gas, so a suitable discriminator level can be set that is sensitive to the X-rays from transition radiation but is rather insensitive to ionization.<sup>57</sup>

<sup>56</sup>This is not a problem for fixed-target experiments; however, it is very problematic for collider detectors, for which the radial space for the tracker is limited by the inner radius of the calorimeter.

<sup>57</sup>The problem is that there are large statistical fluctuations in the magnitude of the energy loss deposited by ionization in a short path length in a gas.

## **4.8.3 Particle identification with ionization**

We saw that the rate of energy loss by ionization depends on the speed β of the particle (see eqn 4.5). Therefore, if we can make a suitably precise measurement of the energy loss by ionization and of the momentum of a particle, we can achieve some separation between particles with different masses (e.g. pions and kaons). The momentum can be measured by a tracking detector in a magnetic spectrometer, and the amplitudes of the signals in the elements of the tracking detector provide a measurement of the energy loss by ionization. The first difficulty with this technique is the presence of very large fluctuations in energy loss by ionization in thin layers, so if a wire chamber is used, a very large number of samples is required to achieve useful particle identification. The second problem is that the rate of energy loss as a function of momentum 'plateaus' at high momentum, so this technique is only useful at lower energies.
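The loss of separating power at high momentum can be made concrete by comparing the speeds of a pion and a kaon of the same momentum. The sketch below assumes, as a simplification, that the energy loss at low velocity scales like 1/β² and ignores the logarithmic term of the full expression; the function names are ours, and the masses are the standard charged pion and kaon masses.

```python
import math

PION_MASS_GEV = 0.13957  # charged pion mass
KAON_MASS_GEV = 0.49368  # charged kaon mass


def beta(momentum_gev, mass_gev):
    """Speed beta = p/E with E = sqrt(p^2 + m^2), in natural units."""
    return momentum_gev / math.hypot(momentum_gev, mass_gev)


def kaon_to_pion_loss_ratio(momentum_gev):
    """Ratio of (1/beta^2) for a kaon to that for a pion at equal momentum.

    Assumption: only the leading 1/beta^2 dependence of the energy loss is kept.
    """
    return (beta(momentum_gev, PION_MASS_GEV) / beta(momentum_gev, KAON_MASS_GEV)) ** 2


for p in (0.3, 0.5, 1.0, 5.0):
    print(f"p = {p:3.1f} GeV  ->  (dE/dx)_K / (dE/dx)_pi ~ {kaon_to_pion_loss_ratio(p):.3f}")
```

Within this approximation the kaon loses almost twice as much energy per unit length as the pion at 0.5 GeV, but the two differ by only about 1% at 5 GeV, which is well below the fluctuations of a realistic measurement.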

## **4.9 Magnetic fields**

We need magnetic fields for trackers and muon spectrometers in order to use the measured trajectory to reconstruct the momenta. The magnets are usually based on the same NbTi superconducting technology discussed for accelerators in Section 3.3. The volumes of the magnets are very much larger and, although the magnetic fields are smaller, the energy stored in these fields is very much greater, which leads to new engineering challenges.

## **4.9.1 Magnetic fields for trackers**

The usual choice of field configuration for trackers at colliders is a solenoid (with the axis along the beam line). To minimize the volume and cost, one option is to place the solenoid between the tracker and the calorimeter. Clearly, too much 'passive' material upstream of the calorimeter will degrade the resolution of the electromagnetic calorimeter. Therefore, the fields are generated using superconducting magnets, with field strengths up to 2 T being typical. The CMS magnet has a field strength of 4 T and has a larger radius, so the entire calorimeter system is housed inside the solenoid.

## **4.9.2 Magnetic fields for muon spectrometers**

One option for the magnetic field for the muon spectrometer is to use magnetized iron. If there is a superconducting solenoid for the tracker, the magnetic flux will return from the solenoid through the iron surrounding the solenoid. In this case, the iron serves multiple purposes: it can be the passive absorber for the hadron calorimeter and act as shielding to remove particles other than muons before they reach the muon chambers, as well as acting as the return 'yoke' for the solenoid. The iron is instrumented with tracking chambers (a variety of wire chambers) and the reconstructed muon tracks in these chambers can therefore be used to determine the momenta. The momentum resolution for these tracks is limited by multiple scattering to about 10%. In the CMS approach, the muon spectrometer tracks are linked to the much more precisely measured tracks in the tracker and hence a good muon momentum resolution can be achieved.<sup>58</sup>

<sup>58</sup>See Exercise 4.9 for a discussion of triggers for this configuration.

In the approach used for ATLAS, the magnetic field for the muon spectrometer is generated by eight large superconducting toroids in the central ('barrel') region and eight smaller superconducting coils in each end cap (see Fig. 4.28). The average magnetic field in the tracking volume is in the range ∼0.5–1 T, but very good resolution is achieved by tracking over a long length l ∼ 5 m.<sup>59</sup> Since most of the volume is air, the momentum resolution is not so limited by multiple scattering as for magnetized iron. Another advantage of this field configuration is that it allows reconstruction of precise muon momenta independently of the tracker.<sup>60</sup>

**Fig. 4.28** Schematic view of the ATLAS toroid coils [20]. The eight barrel toroid coils with the interleaved end-cap coils are shown. The cylinder shows the return flux for the solenoid. The length is 25.1 m and the outer diameter is 20.1 m.

<sup>59</sup>The total energy stored in the ATLAS magnetic fields is about 1.6 GJ, which is the same magnitude as the kinetic energy in a TGV train with a mass of 385 t travelling at 330 km h<sup>−1</sup>.

<sup>60</sup>The most precise muon measurement is then obtained by combining the estimates from the tracker and the muon spectrometer.
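The comparison between the stored magnetic energy and the kinetic energy of a TGV is a one-line calculation that is easy to check. The sketch below uses only the mass and speed quoted in the footnote and the non-relativistic kinetic energy; the variable names are ours.

```python
def kinetic_energy_joules(mass_kg, speed_m_per_s):
    """Non-relativistic kinetic energy, (1/2) m v^2."""
    return 0.5 * mass_kg * speed_m_per_s**2


TGV_MASS_KG = 385e3                # 385 t
TGV_SPEED_M_PER_S = 330.0 / 3.6    # 330 km/h converted to m/s

tgv_energy_j = kinetic_energy_joules(TGV_MASS_KG, TGV_SPEED_M_PER_S)
print(f"TGV kinetic energy: {tgv_energy_j / 1e9:.2f} GJ")  # prints 1.62 GJ
```

The result agrees with the quoted stored energy of about 1.6 GJ.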

## **4.10 Trigger**

The trigger is an electronic and software system operating in 'real' time to reduce the raw data rate to a level that can be permanently stored. It should keep as much of the interesting physics as possible while rejecting as many background events as possible: the aim is to bring the rate down from the raw interaction rate to the maximum at which data can be kept in permanent storage, while retaining as large a fraction of the signal events as possible. Traditionally, this rate was typically of the order of a Hz, but advances in computer technology now allow far higher rates. The event rates are very different for different colliders. At e<sup>+</sup>e<sup>−</sup> colliders, the rates are relatively low, of the order of a Hz, but the rates at hadron colliders have been increasing. At the LHC, there are multiple interactions per bunch crossing (the bunch spacing was 50 ns in 2012 running and is 25 ns for nominal LHC operation) and the trigger reduces the event rate to a level of the order of 500 Hz.
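The size of the task can be estimated from the rates quoted in this section and in Section 4.10.1. The sketch below is simple arithmetic under two simplifying assumptions of ours: every 25 ns bunch slot is taken to contain colliding bunches, and the selection after the first-level trigger is lumped into a single stage called `hlt` here.

```python
BUNCH_SPACING_S = 25e-9        # nominal LHC bunch spacing
PP_COLLISION_RATE_HZ = 1e9     # pp collision rate at design luminosity
L1_OUTPUT_RATE_HZ = 100e3      # typical first-level trigger output rate
STORAGE_RATE_HZ = 500.0        # rate written to permanent storage

# Assumption: every bunch slot is filled, so crossings occur every 25 ns.
crossing_rate_hz = 1.0 / BUNCH_SPACING_S
collisions_per_crossing = PP_COLLISION_RATE_HZ / crossing_rate_hz

l1_rejection = crossing_rate_hz / L1_OUTPUT_RATE_HZ    # first-level trigger
hlt_rejection = L1_OUTPUT_RATE_HZ / STORAGE_RATE_HZ    # later, software stage(s)
total_rejection = crossing_rate_hz / STORAGE_RATE_HZ

print(f"bunch-crossing rate      : {crossing_rate_hz / 1e6:.0f} MHz")
print(f"pp collisions / crossing : {collisions_per_crossing:.0f}")
print(f"L1 rejection factor      : {l1_rejection:.0f}")
print(f"later-stage rejection    : {hlt_rejection:.0f}")
print(f"overall rejection        : {total_rejection:.0f}")
```

With these inputs, only about one bunch crossing in 80 000 is kept, and each crossing contains roughly 25 overlapping pp collisions.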

Typically, there are three trigger stages or levels:


## **4.10.1 LHC triggers**

The issue of efficiently triggering on interesting physics events, while maintaining a manageable readout rate, is one of the main challenges for LHC detectors. At design luminosity, the rate of pp collisions is about 1 GHz, and this rate has to be reduced to the order of 500 Hz for data to be stored for subsequent offline analysis. The first-level trigger (L1) uses signals from the full detector, which, given the finite speed of light, makes it impossible to generate a trigger decision from one bunch crossing before the following bunch crossing occurs (25 ns at nominal LHC operation). This apparently insoluble problem is solved with the aid of a 'pipelined' system.<sup>61</sup> The data are stored on the detector in 'pipeline' memory (see Fig. 4.20), while the L1 decision is being made by a custom hardware processor. In such a pipelined processor, one step of the trigger process operates on the data for a particular event in one clock cycle and then the next step is performed in the following clock cycle. The number of allowed steps for such a processor depends on the depth of the pipeline memory in which the data are stored.<sup>62</sup> As all bunch crossings have genuine pp collisions, it is no longer sufficient to simply reject non-beam backgrounds: the L1 trigger must decide which real events to keep. The L1 trigger identifies interesting signatures such as high-transverse-momentum electrons by performing hardware sums of the energy deposited in neighbouring cells in the electromagnetic calorimeter. A global L1 trigger decision is made on the basis of several signatures (high-transverse-momentum muon candidates, large missing transverse energy, etc.). This L1 trigger typically reduces the rate to the order of 100 kHz. At this rate, it is now feasible to read out all the data corresponding to triggered bunch crossings<sup>63</sup> and for the data to be processed by very large computer farms, which use the full detector granularity to reduce the rate to the required order of 500 Hz for storage.

<sup>61</sup>This approach was pioneered by the H1 and ZEUS experiments at DESY for the HERA collider.

<sup>62</sup>A typical pipeline depth of 132 corresponds to a time of 3.2 μs, which is sufficient to allow the signals to reach the trigger processor, for a trigger decision to be made, and for that decision to be fed back to the electronics on the detector.

<sup>63</sup>The readout is performed using optical fibre links.

## **4.11 Examples of detector systems**

Now that we have seen the principles behind the design of detector subsystems, we can start to understand how these principles are applied in the designs of real detectors. We first look at collider detectors and then briefly consider neutrino detectors. Dark matter detectors are described in Section 13.7.2.

## **4.11.1 Collider detectors**

We will take the ATLAS and CMS detectors as examples of collider detectors.<sup>64</sup> The ATLAS detector is illustrated schematically in Fig. 4.29. The tracker is immersed in a 2 T magnetic field and consists of silicon detectors closest to the beam line and a Transition Radiation Tracker (TRT) at larger radius. The silicon detector contains three layers of pixels closest to the beam pipe to provide the best resolution for the impact parameter and layers of silicon strips at larger radius. The TRT is made from cylindrical 'straw' tubes, with each tube working as an independent cylindrical drift chamber. The tubes are interleaved with Mylar foils to generate transition radiation to enhance electron identification. The electromagnetic calorimeter is based on the liquid-argon accordion calorimeter (see Section 4.7.3). In the central region, the hadronic calorimeter uses an iron–scintillator sandwich design. The light from the scintillators is coupled to the photomultipliers using wavelength-shifting fibres. The novel feature of this design is that the steel absorber plates are rotated by 90° compared with the conventional design in which the plates are perpendicular to the direction of incidence of primary particles. This has the advantage that the calorimeter cells are self-supporting, thus avoiding 'dead' material between cells. Although the calorimeter system is not compensating, the fine granularity allows the use of software compensation to improve the resolution. Calorimeters extend up to pseudorapidity η ≈ 5 in order to reconstruct missing transverse momentum (see Chapter 8). The muon spectrometer uses the toroidal coils discussed in Section 4.9.2. In the central barrel region, the muon tracks are measured using detectors based on drift tubes. However, the signals are too slow to participate in the first-level trigger (see Section 4.10) and therefore faster but lower-resolution detectors are also used.

<sup>64</sup>We cover some unique aspects of the LHCb detector in Section 10.7.3.

**Fig. 4.29** Schematic view of the ATLAS detector [20].

A schematic view of the CMS detector is shown in Fig. 4.30. There is a very large all-silicon tracker consisting of three layers of pixel detectors and 10 layers of strip detectors immersed in the 4 T solenoidal magnetic field, which provides very good momentum resolution for charged particles. The electromagnetic calorimeter uses PbWO<sub>4</sub> crystals (see Section 4.7.2). The hadronic calorimeter uses a brass–scintillator sandwich calorimeter. As with ATLAS, forward calorimeters extend the coverage to close to the beam pipe. The muon chambers are interleaved with the return yoke of the solenoid. They are used for the first-level muon trigger, but high-precision measurements of muon momentum are made in the tracker.

**Fig. 4.30** Schematic view of the CMS detector [62].

## **4.11.2 Neutrino detectors**

Optimization of neutrino detectors is very different to that of collider detectors because the very small cross sections mean that very massive detectors are needed to allow useful event rates to be obtained.<sup>65</sup> Given the sizes involved, we are obliged to use cheaper detector technologies than at hadron colliders. The requirements depend on the neutrino energies. For an accelerator neutrino experiment, a typical requirement is to have a very large target mass and be able to measure the following:


In general, we can use calorimeters to measure electrons and hadrons. If the passive absorber plates are made from magnetized iron and we instrument the gaps between absorbers with some tracking detector, we can determine the tracks caused by muons. We can then identify muons as particles that penetrate deeper into the detector than hadrons and at the same time we can estimate the momentum by measuring the curvature of the tracks. We will see how these principles are applied in practice in the MINOS far detector in Chapter 11.

<sup>65</sup>This is particularly true for neutrino detectors in laboratory oscillation experiments, in which we need a detector far from the neutrino source to study oscillations (see Chapter 11).

## **Chapter summary**


## **Further reading**


a very comprehensive description of many detector technologies.


## **Exercises**

	- (a) Let the kinetic energy of the scattered electron be T in the frame in which the electron was initially at rest. Show that the 4-momentum transfer evaluated in this frame is Q<sup>2</sup> = 2m<sub>e</sub>T.

Hint: Consider the problem in the CMS and then use a Lorentz transformation from the CMS to the lab.

	- (i) How many photons and charged particles will there be after N radiation lengths?
	- (ii) What is the energy of each particle in the shower after N radiation lengths?
	- (iii) What is the depth (in units of L) at which the number of particles in the shower is a maximum, and what is the number of particles at maximum?
	- (b) Compute the depth and the number of particles when multiplication ceases for a 4 GeV electron entering lead glass (L = 2.5 cm, E<sub>c</sub> = 10 MeV).
	- (c) Explaining any assumptions you make, how would the resolution of an electromagnetic calorimeter scale with the energy of the incident electron, E?
	- (a) the solar neutrino flux?
	- (b) the flavour ratio of atmospheric neutrinos?

(4.7) A very simple model of a high-precision silicon 'micro-vertex detector' (MVD) consists of two concentric cylindrical layers surrounding the beam line. The first layer is at radius R<sub>0</sub> = 5 cm, and the separation between the first and second layers is L = 2 cm. The intrinsic measurement resolution of a hit is σ = 10 μm in the Rφ direction (roughly orthogonal to the trajectory of a particle with large transverse momentum).

Show that (neglecting multiple scattering) the uncertainty in the impact parameter (distance of closest approach to the beam line in the plane perpendicular to the beam), σ<sub>a</sub>, is given by

$$
\sigma\_{\rm a} = \frac{\sigma}{L} \sqrt{(R\_0 + L)^2 + R\_0^2}
$$

and calculate it for the parameter values given above. How does σ<sub>a</sub> change if (i) L is doubled; (ii) R<sub>0</sub> is increased to 8 cm? What factors limit the ability to decrease R<sub>0</sub> or increase L? Assume that each layer has a thickness of 2% of a radiation length. How does multiple scattering affect the impact parameter resolution? For what momentum would the uncertainty in the impact parameter from measurement error be equal to that from multiple scattering?


(a) Show that <sup>R</sup><sup>2</sup> <sup>0</sup> B(r) dr = 0.

(b) What is the force on a charged particle moving with a velocity v in the (x, y) plane? Hence find the torque on the charged particle.


If an electron–hole pair is created at a distance x from the p-type electrode, calculate the drift time of the hole in terms of the mobility of the holes, μ<sub>h</sub>. For silicon, with μ<sub>h</sub> = 480 cm<sup>2</sup> V<sup>−1</sup> s<sup>−1</sup>, determine the charge collection times for holes created at depths x = 0.5w and x = 0.9w. If the detector were operated at a bias voltage V = 2V<sub>depletion</sub>, how would the charge collection times change? Hence discuss the advantages of operating the detector at a voltage greater than the depletion voltage. What limits the detector voltage that can be applied in practice?


# **Static quark model 5**

The static quark model of hadrons is central to the understanding of the pattern of hadronic masses and quantum numbers. Originally devised with three flavours of quarks (u, d, s), it was extended to include the heavy quarks: first charm after the '1974 revolution', when the J/ψ was discovered and shown to be a qq̄ state, and then beauty some years later. The top quark was long anticipated on the basis of quark–lepton 'generation' symmetry after the tau lepton was discovered in 1975,<sup>1</sup> but proved to be enormously heavy when it was finally teased out of the data by the CDF and D0 experiments at the Tevatron. The pattern of quark masses is one of the big unsolved problems of particle physics. In this chapter, we are concerned primarily with how the quark model helps our understanding of the phenomenology of mesons and baryons.

The chapter starts with a reminder of the 2-component spin- <sup>1</sup> <sup>2</sup> algebra and its connection with the SU(2) group. Then, after a brief account of hadronic isospin (based on SU(2)), we explain how this approximate symmetry is extended, by including strangeness, to the flavour SU(3) of the static quark model.<sup>2</sup> <sup>2</sup>The definition and properties of the

## **5.1 Spin <sup>1</sup> 2**

For a half-integer spin fermion, the eigenfunctions for spin up (m = +<sup>1</sup> 2 ) and down (m = −<sup>1</sup> <sup>2</sup> ) are

$$
\begin{pmatrix} 1 \\ 0 \end{pmatrix} \text{ and } \begin{pmatrix} 0 \\ 1 \end{pmatrix}
$$

The raising and lowering operators, defined as (see Chapter 2)

$$s\_{\pm} = s\_x \pm \text{i}s\_y$$

are, by inspection,

$$s\_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad s\_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.$$

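This is easily demonstrated. For example, s<sub>+</sub> acting on the spin-down state gives the spin-up state, and acting on the spin-up state gives zero:

$$s\_+ \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad s\_+ \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$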




<sup>1</sup>The discovery of the associated neutrino ν<sub>τ</sub> is discussed in Chapter 8.

<sup>2</sup>The definition and properties of the SU(n) groups are given in Chapter 2.

Working backwards, we get expressions for s<sub>x</sub> and s<sub>y</sub>:

$$s\_x = \frac{1}{2}(s\_+ + s\_-), \quad s\_y = \frac{1}{2}\mathbf{i}(s\_- - s\_+)$$

and s<sub>z</sub> is derived from the requirement that

$$s\_z \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad s\_z \begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\frac{1}{2} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

We can now write the spin-1/2 defining expressions in terms of the Pauli matrices (note the factor 1/2):

$$s\_x = \begin{pmatrix} 0 & \frac{1}{2} \\ \frac{1}{2} & 0 \end{pmatrix} \quad = \frac{1}{2}\sigma\_x$$

$$s\_y = \begin{pmatrix} 0 & -\frac{1}{2}\mathbf{i} \\ \frac{1}{2}\mathbf{i} & 0 \end{pmatrix} = \frac{1}{2}\sigma\_y$$

$$s\_z = \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & -\frac{1}{2} \end{pmatrix} \quad = \frac{1}{2}\sigma\_z$$

The spin operator algebra can be summarized as 

$$\left[\frac{1}{2}\sigma\_i, \frac{1}{2}\sigma\_j\right] = \epsilon\_{ijk}\frac{1}{2}\mathbf{i}\sigma\_k$$

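The link with SU(2) is clear: the three matrices σ<sub>i</sub>/2 satisfy the SU(2) commutation relations and therefore act as the generators of SU(2) in its two-dimensional (irreducible)<sup>3</sup> representation.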
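The commutation relations can be checked numerically. The following is a minimal sketch in plain Python and is not taken from the original text; the helper names (`combine`, `matmul`, `commutator`, `levi_civita`, `algebra_holds`) are illustrative choices. It builds s<sub>x</sub> and s<sub>y</sub> from s<sub>±</sub> as above and tests [s<sub>i</sub>, s<sub>j</sub>] = i ε<sub>ijk</sub> s<sub>k</sub> for every pair of indices.

```python
# Spin-1/2 matrices as nested tuples of numbers; all helper names are illustrative.
S_PLUS = ((0, 1), (0, 0))
S_MINUS = ((0, 0), (1, 0))


def combine(a, b, ca, cb):
    """Return the 2x2 matrix ca*a + cb*b."""
    return tuple(
        tuple(ca * x + cb * y for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(a, b)
    )


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), 1, -1)


def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


# s_x = (s_+ + s_-)/2 and s_y = i(s_- - s_+)/2, as in the text.
S_X = combine(S_PLUS, S_MINUS, 0.5, 0.5)
S_Y = combine(S_MINUS, S_PLUS, 0.5j, -0.5j)
S_Z = ((0.5, 0), (0, -0.5))
SPIN = (S_X, S_Y, S_Z)


def algebra_holds(tol=1e-12):
    """Check [s_i, s_j] = i * eps_ijk * s_k for all i and j."""
    for i in range(3):
        for j in range(3):
            rhs = ((0, 0), (0, 0))
            for k in range(3):
                rhs = combine(rhs, SPIN[k], 1, 1j * levi_civita(i, j, k))
            lhs = commutator(SPIN[i], SPIN[j])
            if any(abs(lhs[r][c] - rhs[r][c]) > tol
                   for r in range(2) for c in range(2)):
                return False
    return True


print(algebra_holds())
```

Nested tuples are used instead of a linear-algebra library only to keep the sketch dependency-free; the same check is a few lines with any matrix package.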


### **5.1.1 Combining two spin-1/2 particles**

Two spin-1/2 particles can combine to form four possible spin states. The total spin can be either s = 1, with s<sub>z</sub> ∈ {−1, 0, 1}, or s = 0, with s<sub>z</sub> = 0. We have

$$\begin{aligned} \left|1,1\right\rangle &= \left|\frac{1}{2},\frac{1}{2}\right\rangle \left|\frac{1}{2},\frac{1}{2}\right\rangle &= \left|\uparrow,\uparrow\right\rangle \\ \left|1,0\right\rangle &= \sqrt{\frac{1}{2}} \left|\frac{1}{2},\frac{1}{2}\right\rangle \left|\frac{1}{2},-\frac{1}{2}\right\rangle + \sqrt{\frac{1}{2}} \left|\frac{1}{2},-\frac{1}{2}\right\rangle \left|\frac{1}{2},\frac{1}{2}\right\rangle = \sqrt{\frac{1}{2}} \left(\left|\uparrow,\downarrow\right\rangle + \left|\downarrow,\uparrow\right\rangle\right) \\ \left|1,-1\right\rangle &= \left|\frac{1}{2},-\frac{1}{2}\right\rangle \left|\frac{1}{2},-\frac{1}{2}\right\rangle &= \left|\downarrow,\downarrow\right\rangle \\ \left|0,0\right\rangle &= \sqrt{\frac{1}{2}} \left|\frac{1}{2},\frac{1}{2}\right\rangle \left|\frac{1}{2},-\frac{1}{2}\right\rangle - \sqrt{\frac{1}{2}} \left|\frac{1}{2},-\frac{1}{2}\right\rangle \left|\frac{1}{2},\frac{1}{2}\right\rangle = \sqrt{\frac{1}{2}} \left(\left|\uparrow,\downarrow\right\rangle - \left|\downarrow,\uparrow\right\rangle\right) \end{aligned}$$

<sup>3</sup>See Chapter 2 for more on groups and SU(2).

The s = 1 triplet states are deduced from the spin raising and lowering operators and the Clebsch–Gordan coefficients.<sup>4</sup>

<sup>4</sup>See Chapter 2.

The s = 0, s<sub>z</sub> = 0 singlet state is found by requiring it to be orthogonal to the s = 1, s<sub>z</sub> = 0 state. Note that the s = 1 states are symmetric under the interchange of particles 1 and 2, whereas the s = 0 state is antisymmetric. This illustrates what is meant by the representation notation

$$2 \otimes 2 = \underbrace{3}\_{\text{sym}} \oplus \underbrace{1}\_{\text{asym}}$$

where 'sym' and 'asym' stand respectively for symmetric and antisymmetric combinations of the spin states of the two particles.

### **5.1.2 Combining three spin-1/2 particles**

We will now show that the multiplicity of states is given by 2 ⊗ 2 ⊗ 2 = 4 ⊕ 2 ⊕ 2.<sup>5</sup> We start by adding a spin-up particle to the |1, 1⟩ state to obtain the highest possible (maximally stretched) state |3/2, +3/2⟩, then apply angular-momentum-lowering operators to 'step down' from there:

$$\begin{aligned} \left| \frac{3}{2}, \frac{3}{2} \right\rangle &= \left| \uparrow\uparrow\uparrow \right\rangle \\ \left| \frac{3}{2}, \frac{1}{2} \right\rangle &= \sqrt{\frac{1}{3}} \left( \left| \downarrow\uparrow\uparrow \right\rangle + \left| \uparrow\downarrow\uparrow \right\rangle + \left| \uparrow\uparrow\downarrow \right\rangle \right) \\ \left| \frac{3}{2}, -\frac{1}{2} \right\rangle &= \sqrt{\frac{1}{3}} \left( \left| \downarrow\downarrow\uparrow \right\rangle + \left| \uparrow\downarrow\downarrow \right\rangle + \left| \downarrow\uparrow\downarrow \right\rangle \right) \\ \left| \frac{3}{2}, -\frac{3}{2} \right\rangle &= \left| \downarrow\downarrow\downarrow \right\rangle \end{aligned} $$

Two more states come from adding the third particle to the triplet. Requiring orthogonality to the S = 3/2, S<sub>z</sub> = ±1/2 states, two S = 1/2 states occur:

$$\left|\frac{1}{2},\frac{1}{2}\right\rangle = \sqrt{\frac{1}{6}}\left(2\left|\uparrow\uparrow\downarrow\right\rangle - \left|\downarrow\uparrow\uparrow\right\rangle - \left|\uparrow\downarrow\uparrow\right\rangle\right)$$

$$\left|\frac{1}{2},-\frac{1}{2}\right\rangle = -\sqrt{\frac{1}{6}}\left(2\left|\downarrow\downarrow\uparrow\right\rangle - \left|\uparrow\downarrow\downarrow\right\rangle - \left|\downarrow\uparrow\downarrow\right\rangle\right)$$

These states are of mixed symmetry.

Finally, two states are derived from adding the third particle to the singlet:

$$\left|\frac{1}{2},\frac{1}{2}\right\rangle = \sqrt{\frac{1}{2}}\left(|\uparrow\downarrow\uparrow\rangle - |\downarrow\uparrow\uparrow\rangle\right)$$

$$\left|\frac{1}{2},-\frac{1}{2}\right\rangle = -\sqrt{\frac{1}{2}}\left(|\uparrow\downarrow\downarrow\rangle - |\downarrow\uparrow\downarrow\rangle\right)$$

which are antisymmetric combinations under the exchange of particles 1 ↔ 2. The pattern of combinations is indeed

$$2 \otimes (2 \otimes 2) = \underbrace{4}\_{\text{sym}} \oplus \underbrace{2}\_{\text{mixS}} \oplus \underbrace{2}\_{\text{mixA}}$$

where 'mixS' ('mixA') denotes a state that is a mixture of symmetric and antisymmetric parts but is purely symmetric (antisymmetric) under exchange of the first two particles.

<sup>5</sup>The multiplicities must be even for half-integer angular momentum.
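The spin assignments of the three-particle states can be verified by acting with S<sup>2</sup> = S<sub>−</sub>S<sub>+</sub> + S<sub>z</sub><sup>2</sup> + S<sub>z</sub>. The sketch below is an illustration only, not the authors' method: states are stored as dictionaries mapping a configuration string ("+" for spin up, "-" for spin down) to an amplitude, and the function names are hypothetical.

```python
from math import isclose, sqrt


def flip(state, old, new):
    """Sum over particles of the single-spin operator that turns `old` into `new`."""
    out = {}
    for config, amp in state.items():
        for i, s in enumerate(config):
            if s == old:
                flipped = config[:i] + new + config[i + 1:]
                out[flipped] = out.get(flipped, 0.0) + amp
    return out


def s_squared(state):
    """Apply S^2 = S_- S_+ + S_z^2 + S_z to a state."""
    out = flip(flip(state, "-", "+"), "+", "-")  # S_+ first, then S_-
    for config, amp in state.items():
        m = 0.5 * (config.count("+") - config.count("-"))
        out[config] = out.get(config, 0.0) + (m * m + m) * amp
    return out


def overlap(a, b):
    return sum(amp * b.get(config, 0.0) for config, amp in a.items())


def total_spin(state):
    """Return s from the expectation value <S^2> = s(s+1) of a normalised state."""
    return (-1 + sqrt(1 + 4 * overlap(state, s_squared(state)))) / 2


def swap12(state):
    """Exchange particles 1 and 2."""
    return {c[1] + c[0] + c[2:]: amp for c, amp in state.items()}


# The S_z = +1/2 states quoted in the text.
quartet = {"-++": sqrt(1 / 3), "+-+": sqrt(1 / 3), "++-": sqrt(1 / 3)}
mixed_s = {"++-": 2 * sqrt(1 / 6), "-++": -sqrt(1 / 6), "+-+": -sqrt(1 / 6)}
mixed_a = {"+-+": sqrt(1 / 2), "-++": -sqrt(1 / 2)}

for name, state in [("quartet", quartet), ("mixS", mixed_s), ("mixA", mixed_a)]:
    print(name, round(total_spin(state), 6))
```

The loop should report total spin 3/2 for the quartet state and 1/2 for both doublet states; `swap12` can be used to confirm that the mixS state is even and the mixA state odd under exchange of the first two particles.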

## **5.2 The quark model of hadrons**

Quarks are fundamental, point-like constituents of matter that carry fractional electric charge of magnitude 1/3 or 2/3 (in units of the elementary charge e). They exist in qq̄ or qqq bound states, the hadrons, held together by the strong force. The strong force is mediated by gluons, which couple with equal strength to all quarks. The residual strong force holds together the nucleus in analogy to the van der Waals force between neutral atoms.

Strong-force bound states can be organized by a classification that predates the invention of the quark model; it arose from similarities among the hadrons observed in the 1960s, particularly from bubble chamber experiments. The observed patterns of the quantum numbers (mass, spin, parity, isospin, and strangeness) were crucial in the development of the quark model.

## **Baryons**


## **Mesons**


Although the quark model is now taken for granted, it is useful to think about how physicists used the data to devise it and why the concepts matched so beautifully the experimental observations.

## **5.2.1 Isospin**

Originally, physicists were intrigued by the similarity in mass and properties of the proton and neutron. Inspired by spin-1/2 doublets, it was postulated they were the same particle<sup>6</sup> but with different values of the third component of a new angular-momentum-like quantum number: isospin. In fact, this is not the case, but it is still a useful approximate symmetry for low-energy hadronic physics, where perturbative QCD is not valid.

The key points are as follows:

• |u⟩ and |d⟩ form an SU(2) doublet like spin 1/2:

$$I\_3|u\rangle = \frac{1}{2}|u\rangle, \quad I\_3|d\rangle = -\frac{1}{2}|d\rangle$$

• As with normal (angular momentum) spin, the raising and lowering operators are

$$I\_+|d\rangle = |u\rangle, \quad I\_-|u\rangle = |d\rangle, \quad I\_-|d\rangle = 0$$

• Isospin is a good approximate symmetry because of
	- (1) the approximate degeneracy in mass of the u and d quarks;
	- (2) the identical strong coupling of the u and d quarks to the gluon.

When dealing with antiquarks, one must be careful in applying charge conjugation to the quark wavefunctions. We choose a convention that allows Clebsch–Gordan coefficients to be applied in the same manner as with quarks, although this introduces a somewhat confusing minus sign:<sup>7</sup>

$$C|u\rangle = -|\bar{u}\rangle, \quad C|d\rangle = |\bar{d}\rangle$$

where C is the charge-conjugation operator. The raising and lowering operators act on the antiquarks as follows:

$$I\_-|\bar{d}\rangle = -|\bar{u}\rangle, \quad I\_+|\bar{u}\rangle = -|\bar{d}\rangle, \quad I\_+|\bar{d}\rangle = 0, \quad I\_-|\bar{u}\rangle = 0$$

<sup>6</sup>The mass difference m<sub>n</sub> − m<sub>p</sub> is of the order of an electromagnetic correction.

<sup>7</sup>We follow the convention defined originally for atomic physics by Condon and Shortley in their famous book The Theory of Atomic Spectra [68].

Consider next the SU(2) isospin combinations of 2 ⊗ 2̄ = 3 ⊕ 1, where the Clebsch–Gordan coefficients are taken from the 1/2 ⊗ 1/2 table: a symmetric singlet state

$$\left| I = 0, I\_3 = 0 \right\rangle = \sqrt{\frac{1}{2}} \left( |d\bar{d}\rangle + |u\bar{u}\rangle \right)$$

and an antisymmetric triplet state

$$\begin{aligned} \left| I = 1, I\_3 = 1 \right\rangle &= \left| u \bar{d} \right\rangle &= \left| \pi^+ \right\rangle \\ \left| I = 1, I\_3 = 0 \right\rangle &= \sqrt{\frac{1}{2}} \left( \left| d \bar{d} \right\rangle - \left| u \bar{u} \right\rangle \right) &= \left| \pi^0 \right\rangle \\ \left| I = 1, I\_3 = -1 \right\rangle &= -\left| d \bar{u} \right\rangle &= \left| \pi^- \right\rangle \end{aligned}$$

The triplet of pions is almost degenerate in mass: m(π<sup>±</sup>) = 140 MeV, m(π<sup>0</sup>) = 135 MeV.<sup>8</sup>

<sup>8</sup>A mass difference again of a magnitude compatible with an electromagnetic effect.
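The signs in the pion triplet follow mechanically from the lowering operator once the antiquark convention is fixed. The sketch below is illustrative only, with hypothetical labels such as `"dbar"` and `"ubar"`. It applies I<sub>−</sub> to |ud̄⟩ twice and also acts on the singlet, which should be annihilated.

```python
from math import isclose, sqrt

# Action of I_- on single (anti)quarks, with the sign convention quoted in the text:
# I_-|u> = |d> and I_-|dbar> = -|ubar>; I_- gives zero on |d> and |ubar>.
I_MINUS = {"u": ("d", 1.0), "dbar": ("ubar", -1.0)}


def lower(state):
    """Apply I_- to a state given as {(quark, antiquark): amplitude}."""
    out = {}
    for pair, amp in state.items():
        for slot in (0, 1):
            if pair[slot] in I_MINUS:
                new_label, sign = I_MINUS[pair[slot]]
                new_pair = (new_label, pair[1]) if slot == 0 else (pair[0], new_label)
                out[new_pair] = out.get(new_pair, 0.0) + sign * amp
    return out


def normalise(state):
    norm = sqrt(sum(amp * amp for amp in state.values()))
    return {pair: amp / norm for pair, amp in state.items()}


pi_plus = {("u", "dbar"): 1.0}
pi_zero = normalise(lower(pi_plus))    # expected: (d dbar - u ubar)/sqrt(2)
pi_minus = normalise(lower(pi_zero))   # expected: -(d ubar)
singlet = {("d", "dbar"): sqrt(0.5), ("u", "ubar"): sqrt(0.5)}

print(pi_zero)
print(pi_minus)
print(lower(singlet))                  # all amplitudes should vanish
```

Normalising after each step keeps the overall sign, so the minus sign in front of the π<sup>−</sup> state is a direct consequence of the convention rather than an extra choice.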

Isospin is also useful for baryons. As with spin, we expect a symmetric quadruplet and two mixed-symmetry isospin doublets: 2 ⊗ (2 ⊗ 2) = 4 ⊕ 2 ⊕ 2, which we can write out explicitly:

$$\left|I=\frac{3}{2},I\_3=\frac{3}{2}\right\rangle = \left|uuu\right\rangle$$

$$\left|I=\frac{3}{2},I\_3=\frac{1}{2}\right\rangle = \sqrt{\frac{1}{3}}\left(\left|duu\right\rangle + \left|udu\right\rangle + \left|uud\right\rangle\right)$$

$$\left|I=\frac{3}{2},I\_3=-\frac{1}{2}\right\rangle = \sqrt{\frac{1}{3}}\left(\left|ddu\right\rangle + \left|udd\right\rangle + \left|dud\right\rangle\right)$$

$$\left|I=\frac{3}{2},I\_3=-\frac{3}{2}\right\rangle = \left|ddd\right\rangle$$

$$\left|I=\frac{1}{2},I\_3=\frac{1}{2}\right\rangle = \sqrt{\frac{1}{6}}\left(2\left|uud\right\rangle - \left|duu\right\rangle - \left|udu\right\rangle\right)$$

$$\left|I=\frac{1}{2},I\_3=-\frac{1}{2}\right\rangle = -\sqrt{\frac{1}{6}}\left(2\left|ddu\right\rangle - \left|udd\right\rangle - \left|dud\right\rangle\right)$$

$$\left|I=\frac{1}{2},I\_3=\frac{1}{2}\right\rangle = \sqrt{\frac{1}{2}}\left(|udu\rangle - \left|duu\right\rangle\right)$$

$$\left|I=\frac{1}{2},I\_3=-\frac{1}{2}\right\rangle = -\sqrt{\frac{1}{2}}\left(|udd\rangle - \left|dud\right\rangle\right)$$

## **An example of isospin analysis: Δ decays**

• As the strong interaction dominates, we can use isospin to understand relative rates using

$$|\Delta^{+}\rangle = \left|I = \frac{3}{2}, I\_3 = \frac{1}{2}\right\rangle, \quad |p\rangle = \left|\frac{1}{2}, \frac{1}{2}\right\rangle, \quad |n\rangle = \left|\frac{1}{2}, -\frac{1}{2}\right\rangle$$

$$|\pi^{+}\rangle = |1, 1\rangle, \quad |\pi^{0}\rangle = |1, 0\rangle, \quad |\pi^{-}\rangle = |1, -1\rangle$$

• Using Clebsch–Gordan coefficients, we expand the I = 3/2 state in products of I = 1/2 and I = 1 states:

$$\begin{aligned} \left| \Delta^{+} \right\rangle = \left| \frac{3}{2}, \frac{1}{2} \right\rangle &= \sqrt{\frac{1}{3}} \left| \frac{1}{2}, -\frac{1}{2} \right\rangle \left| 1, 1 \right\rangle + \sqrt{\frac{2}{3}} \left| \frac{1}{2}, \frac{1}{2} \right\rangle \left| 1, 0 \right\rangle \\ &= \sqrt{\frac{1}{3}} \left| n \right\rangle \left| \pi^{+} \right\rangle + \sqrt{\frac{2}{3}} \left| p \right\rangle \left| \pi^{0} \right\rangle \end{aligned}$$

• Deducing branching ratios, we have

$$\frac{\text{BR}(\Delta^{+} \to \pi^{0}p)}{\text{BR}(\Delta^{+} \to \pi^{+}n)} = \frac{|\langle \pi^{0}p \mid \Delta^{+} \rangle|^{2}}{|\langle \pi^{+}n \mid \Delta^{+} \rangle|^{2}} = \frac{\left|\sqrt{\frac{2}{3}}\right|^{2}}{\left|\sqrt{\frac{1}{3}}\right|^{2}} = 2$$

• Similarly, we can estimate the relative Δ cross sections for formation in πp scattering at √s ∼ m(Δ):

$$\frac{\sigma(\pi^-p)}{\sigma(\pi^+p)} = \frac{|\langle \Delta^0 \mid \pi^- p \rangle|^2}{|\langle \Delta^{++} \mid \pi^+ p \rangle|^2} = \frac{1}{3}$$
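The Clebsch–Gordan coefficients used in this example can be reproduced without a table. The sketch below is an illustration, not the book's procedure: it evaluates the standard closed-form (Racah) expression in exact rational arithmetic and returns the sign and the square of each coefficient. The function name `cg_squared` and its return convention are assumptions made for this example, and the inputs are assumed to be valid angular-momentum labels.

```python
from fractions import Fraction
from math import factorial


def _fact(x):
    """Factorial of a Fraction that must hold a non-negative integer."""
    x = Fraction(x)
    if x.denominator != 1 or x < 0:
        raise ValueError(f"factorial argument {x} is not a non-negative integer")
    return factorial(x.numerator)


def cg_squared(j1, m1, j2, m2, J, M):
    """Return (sign, value) with <j1 m1; j2 m2 | J M> = sign * sqrt(value)."""
    j1, m1, j2, m2, J, M = (Fraction(x) for x in (j1, m1, j2, m2, J, M))
    if m1 + m2 != M or not abs(j1 - j2) <= J <= j1 + j2 or abs(M) > J:
        return 0, Fraction(0)
    pref = (2 * J + 1) * Fraction(
        _fact(J + j1 - j2) * _fact(J - j1 + j2) * _fact(j1 + j2 - J),
        _fact(j1 + j2 + J + 1),
    )
    pref *= (_fact(J + M) * _fact(J - M) * _fact(j1 - m1) * _fact(j1 + m1)
             * _fact(j2 - m2) * _fact(j2 + m2))
    k_min = max(Fraction(0), j2 - J - m1, j1 - J + m2)
    k_max = min(j1 + j2 - J, j1 - m1, j2 + m2)
    total = Fraction(0)
    k = k_min
    while k <= k_max:
        denom = (_fact(k) * _fact(j1 + j2 - J - k) * _fact(j1 - m1 - k)
                 * _fact(j2 + m2 - k) * _fact(J - j2 + m1 + k)
                 * _fact(J - j1 - m2 + k))
        total += Fraction((-1) ** int(k), denom)
        k += 1
    sign = (total > 0) - (total < 0)
    return sign, pref * total ** 2


half = Fraction(1, 2)
# Delta+ -> n pi+ and Delta+ -> p pi0 (nucleon first, pion second).
n_pi_plus = cg_squared(half, -half, 1, 1, 3 * half, half)[1]
p_pi_zero = cg_squared(half, half, 1, 0, 3 * half, half)[1]
print(p_pi_zero / n_pi_plus)    # branching-ratio estimate

# Delta formation in pi- p and pi+ p scattering.
pi_minus_p = cg_squared(half, half, 1, -1, 3 * half, -half)[1]
pi_plus_p = cg_squared(half, half, 1, 1, 3 * half, 3 * half)[1]
print(pi_minus_p / pi_plus_p)   # cross-section ratio
```

Working with the squared coefficient keeps everything rational, so the two ratios come out as exact fractions that can be compared directly with the values derived above.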


Although useful, isospin could not account for long-lived particles, originally labelled 'V particles', that were first observed in 1947 (in Manchester) with a mass of about 500 MeV, roughly 1000 times that of the electron (see Fig. 5.2). These new particles, known as 'strange' particles, were assigned a new quantum number and SU(2) had to be expanded.

**Fig. 5.1** Total cross sections for π<sup>+</sup>p (dotted line) and π<sup>−</sup>p (solid line) as functions of the pion beam momentum in GeV/c. From the PDG [114].

**Fig. 5.2** One of the early 'V particles' observed in 1947 in a cloud chamber exposed to cosmic rays by Rochester and Butler [123]—then working in Blackett's group at Manchester University.

<sup>10</sup>Generically known now as a flavour quantum number.

## **5.2.2 Strangeness and expansion to SU(3)**

With isospin in hand to describe up-ness and down-ness, strangeness is postulated to be a third quantum number that a hadron may possess.<sup>10</sup> For historical reasons, convention dictates that S = −1 for |s⟩ and S = +1 for |s̄⟩. Assuming that SU(3) is valid, we expect 3<sup>2</sup> − 1 = 8 fundamental operators.

The fundamental representation of SU(3) comprises eight 3 × 3 matrices:

$$\begin{aligned} \lambda\_1 &= \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, & \lambda\_2 &= \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \\ \lambda\_3 &= \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, & \lambda\_4 &= \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix} \\ \lambda\_5 &= \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, & \lambda\_6 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \\ \lambda\_7 &= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, & \lambda\_8 &= \sqrt{\frac{1}{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix} \end{aligned}$$
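A quick consistency check on a set of SU(3) generators is that each matrix is Hermitian and traceless and that they satisfy Tr(λ<sub>a</sub>λ<sub>b</sub>) = 2δ<sub>ab</sub>. The following sketch (plain Python, with illustrative function names that do not come from the text) encodes the eight Gell-Mann matrices in their standard form and tests these three properties.

```python
from math import sqrt

I = 1j
R3 = 1 / sqrt(3)

# The eight Gell-Mann matrices in their standard form.
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[R3, 0, 0], [0, R3, 0], [0, 0, -2 * R3]],
]


def is_hermitian(m):
    return all(
        complex(m[r][c]) == complex(m[c][r]).conjugate()
        for r in range(3) for c in range(3)
    )


def trace(m):
    return sum(m[r][r] for r in range(3))


def trace_of_product(a, b):
    return sum(a[r][k] * b[k][r] for r in range(3) for k in range(3))


def is_orthonormal(matrices, tol=1e-12):
    """Check Tr(lambda_a lambda_b) = 2 delta_ab for every pair."""
    for a, ma in enumerate(matrices):
        for b, mb in enumerate(matrices):
            expected = 2.0 if a == b else 0.0
            if abs(trace_of_product(ma, mb) - expected) > tol:
                return False
    return True


print(all(is_hermitian(m) for m in GELL_MANN))
print(all(abs(trace(m)) < 1e-12 for m in GELL_MANN))
print(is_orthonormal(GELL_MANN))
```

A single wrong entry, for example a real number where ±i is required, shows up immediately as a failure of the Hermiticity test, which makes this a convenient way to proofread a transcribed set of matrices.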

SU(3) has three SU(2) groups embedded within it. In addition to isospin (I-spin), there are U-spin and V-spin, which act on the doublets (d, s) and (u, s), respectively. The fundamental representation for the quark and antiquark triplets is shown in Fig. 5.3.

**Fig. 5.3** Fundamental SU(3) flavour representations for the (u, d, s) and (ū, d̄, s̄) triplets, as functions of the third component of isospin I<sub>3</sub> (horizontal) and strangeness S (vertical).

The raising and lowering operators are

$$\begin{aligned} I\_+|d\rangle &= |u\rangle, \qquad I\_-|u\rangle = |d\rangle, \qquad I\_+|\bar{u}\rangle = -|\bar{d}\rangle, \qquad I\_-|\bar{d}\rangle = -|\bar{u}\rangle\\ U\_+|s\rangle &= |d\rangle, \qquad U\_-|d\rangle = |s\rangle, \qquad U\_+|\bar{d}\rangle = -|\bar{s}\rangle, \qquad U\_-|\bar{s}\rangle = -|\bar{d}\rangle\\ V\_+|s\rangle &= |u\rangle, \qquad V\_-|u\rangle = |s\rangle, \qquad V\_+|\bar{u}\rangle = -|\bar{s}\rangle, \qquad V\_-|\bar{s}\rangle = -|\bar{u}\rangle \end{aligned}$$

Any other combination is zero. The Condon–Shortley convention is used for the antiquarks, which gives rise to the 'extra' minus signs. Only two of the three are needed to describe and navigate through a multiplet, I-spin and U-spin being most commonly used.

## **5.2.3 Mesons**

We now have the tools to extend beyond the π<sup>+</sup>, π<sup>0</sup>, π<sup>−</sup> states of SU(2) qq̄ combinations and identify all the SU(3) qq̄ states. We start with the simplest case of J = 0 pseudoscalar mesons. From SU(3), we expect

$$3 \otimes \bar{3} = 9 = \underbrace{1}\_{\text{sym}} \oplus \underbrace{8}\_{\text{mix}}$$

where 'sym' and 'mix' stand for symmetric and mixed symmetry, respectively. The six states with non-zero I<sub>3</sub> or non-zero S have unambiguous quark content. The final three, with I<sub>3</sub> = S = 0, need a little more thought:

• The symmetric singlet is 'obvious' by inspecting the perfect symmetry of the flavour wavefunction. It is now known as the η′:

$$|\eta'\rangle = \sqrt{\frac{1}{3}}\left(|u\bar{u}\rangle + |d\bar{d}\rangle + |s\bar{s}\rangle\right)$$

The remaining two must be part of the octet of mixed symmetry. The first step is to start with the 'outer' states of the octet and apply the flavour-lowering operators to reach the centre:

$$\begin{aligned} I\_-|u\bar{d}\rangle &= |d\bar{d}\rangle - |u\bar{u}\rangle \\ U\_-|d\bar{s}\rangle &= |s\bar{s}\rangle - |d\bar{d}\rangle \\ V\_-|u\bar{s}\rangle &= |s\bar{s}\rangle - |u\bar{u}\rangle \end{aligned}$$

However, only two of these three equations are independent. So we proceed as follows:

• We choose one to be the well-established π<sup>0</sup>, the isospin-triplet partner of the π<sup>+</sup>, π<sup>−</sup>:

$$\left| \pi^0 \right\rangle = \sqrt{\frac{1}{2}} \left( \left| d\bar{d} \right\rangle - \left| u\bar{u} \right\rangle \right)$$

**Fig. 5.4** SU(3) pseudoscalar meson nonet states as functions of I<sub>3</sub> and S.

<sup>11</sup>In fact, SU(3) is also slightly broken in the pseudoscalar case.

• This leaves the last member of the octet to be deduced by requiring that its flavour wavefunction be orthogonal to |π<sup>0</sup>⟩:

$$\begin{aligned} \left| \eta \right\rangle &= \alpha (|s\bar{s}\rangle \, -|u\bar{u}\rangle) + \beta (|s\bar{s}\rangle \, -|d\bar{d}\rangle) \\ &= \sqrt{\frac{1}{6}} \left( |u\bar{u}\rangle + |d\bar{d}\rangle - 2|s\bar{s}\rangle \right) \end{aligned}$$

where the constants are derived using ⟨π<sup>0</sup> | η⟩ = 0 and ⟨η | η⟩ = 1.
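The orthogonality argument for the η amounts to a single Gram–Schmidt step in the three-dimensional space spanned by |uū⟩, |dd̄⟩, and |ss̄⟩. The sketch below is a hedged illustration with hypothetical variable names. It starts from the combination |ss̄⟩ − |uū⟩ reached with V<sub>−</sub>, removes its π<sup>0</sup> component, and normalises. The overall sign is not fixed by the two conditions, so it is chosen here to match the expression above.

```python
from math import isclose, sqrt


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalise(v):
    norm = sqrt(dot(v, v))
    return tuple(x / norm for x in v)


# Flavour vectors in the basis (u ubar, d dbar, s sbar).
pi_zero = normalise((-1.0, 1.0, 0.0))   # (d dbar - u ubar)/sqrt(2)
v_state = (-1.0, 0.0, 1.0)              # s sbar - u ubar, reached with V_-

# Remove the pi0 component (one Gram-Schmidt step), then normalise.
projection = dot(pi_zero, v_state)
eta = normalise(tuple(v - projection * p for v, p in zip(v_state, pi_zero)))

# The overall sign is a convention; make the u ubar coefficient positive.
if eta[0] < 0:
    eta = tuple(-x for x in eta)

singlet = normalise((1.0, 1.0, 1.0))    # flavour-symmetric combination

print(eta)
print(dot(eta, pi_zero), dot(eta, singlet))
```

The resulting vector should be proportional to (1, 1, −2), and it comes out orthogonal to the flavour-symmetric singlet as well, as expected for a member of the octet.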

The states of the pseudoscalar meson nonet are plotted as functions of I<sub>3</sub> and S in Fig. 5.4, and their properties are listed in Table 5.1.

The J = 1 vector mesons are also well-established states. As we might expect, they exhibit the same pattern of states as the J = 0 mesons: 3 ⊗ 3̄ = 8 ⊕ 1.

The states of the vector meson nonet are plotted as functions of I<sub>3</sub> and S in Fig. 5.5, and their properties are listed in Table 5.2.

**Table 5.1** Properties of the SU(3) pseudoscalar meson nonet states.

**Table 5.2** Properties of the SU(3) vector meson nonet states.

The notable difference between the pseudoscalar and vector mesons is that with the latter, for the I<sub>3</sub> = 0, S = 0 states, SU(3) is not exact and 'octet–singlet' mixing occurs.<sup>11</sup> Experimentally, one state (φ) is observed to decay largely to kaons, and the other (ω) nearly always to pions. We therefore assume that the states with I<sub>3</sub> = 0, S = 0 are maximally mixed, such that the quark composition is given by

$$\begin{aligned} \vert \rho^0 \rangle &= \sqrt{\frac{1}{2}} (\vert d\bar{d} \rangle - \vert u\bar{u} \rangle), & \text{BR}(\rho^0 \to \pi^+ \pi^-) &= 100\% \\ \vert \omega \rangle &\approx \sqrt{\frac{1}{2}} (\vert d\bar{d} \rangle + \vert u\bar{u} \rangle), & \text{BR}(\omega \to \pi^+ \pi^- \pi^0) &= 90\% \\ \vert \phi \rangle &\approx \vert s\bar{s} \rangle, & \text{BR}(\phi \to K\bar{K}) &= 84\% \\ & & \text{BR}(\phi \to \pi^+ \pi^- \pi^0) &= 15\% \end{aligned}$$

## **5.2.4 Baryons**

The baryon wavefunction is made of four parts,

$$
\Psi = \psi\_{\text{space}} \psi\_{\text{spin}} \psi\_{\text{flavour}} \psi\_{\text{colour}}
$$

and we consider only ground-state baryons (p, n, Δ, etc.) that have no orbital angular momentum (L = 0), so that ψ<sub>space</sub> is symmetric. The colour wavefunction ψ<sub>colour</sub> is always antisymmetric:

$$\psi\_{\text{colour}} = \sqrt{\frac{1}{6}} \left( |RGB\rangle + |GBR\rangle + |BRG\rangle - |GRB\rangle - |BGR\rangle - |RBG\rangle \right)$$

With ψ<sub>space</sub>ψ<sub>colour</sub> being antisymmetric and the overall fermionic wavefunction required to be antisymmetric, ψ<sub>spin</sub>ψ<sub>flavour</sub> must be symmetric.

**Fig. 5.5** SU(3) vector meson nonet states as functions of I<sub>3</sub> and S.

ψ<sub>spin</sub> and ψ<sub>flavour</sub> may be a mixture of symmetric and antisymmetric states, but the total spin–flavour wavefunction must be symmetric.

• We deal with ψ<sub>spin</sub> first. The possible combinations are

$$2 \otimes (2 \otimes 2) = 8 = \underbrace{4}\_{\text{sym}} \oplus \underbrace{2}\_{\text{mixS}} \oplus \underbrace{2}\_{\text{mixA}}$$

where the four symmetric states ('sym') have spin 3/2 and the other two doublets ('mixS' and 'mixA'), with mixed symmetry, have spin 1/2.

• The SU(3) decomposition of three flavours is found by first combining two quark states, then adding the third:

$$3 \otimes 3 = \underbrace{6}\_{\text{sym}} \oplus \underbrace{\bar{3}}\_{\text{asym}}$$

$$3 \otimes 3 \otimes 3 = \underbrace{10}\_{\text{sym}} \oplus \underbrace{8}\_{\text{mixS}} \oplus \underbrace{8}\_{\text{mixA}} \oplus \underbrace{1}\_{\text{asym}}$$

where 'sym', 'mixS', and 'mixA' are as defined above and 'asym' is the totally antisymmetric three-quark combination.

• Finally, the possible symmetric (SU(3), SU(2)) combinations are the symmetric (10, 4) and the mixed symmetric–antisymmetric √(1/2) [(8, 2)<sub>mixS</sub> + (8, 2)<sub>mixA</sub>].
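The multiplicities quoted in these steps can be cross-checked by direct counting. In the sketch below (illustrative only; the function names are assumptions), each quark is treated as having 3 flavour labels, 2 spin labels, or 3 × 2 = 6 combined spin–flavour labels. The number of totally symmetric three-quark combinations is then counted and compared with 10 × 4 + 8 × 2.

```python
from itertools import combinations, combinations_with_replacement


def count_symmetric(n_states, n_particles=3):
    """Number of totally symmetric combinations of n_particles objects."""
    return sum(1 for _ in combinations_with_replacement(range(n_states), n_particles))


def count_antisymmetric(n_states, n_particles=3):
    """Number of totally antisymmetric combinations of n_particles objects."""
    return sum(1 for _ in combinations(range(n_states), n_particles))


# Flavour: 27 = 10 (sym) + 8 + 8 (mixed) + 1 (asym)
flavour_sym = count_symmetric(3)
flavour_asym = count_antisymmetric(3)
flavour_mixed = 3 ** 3 - flavour_sym - flavour_asym

# Spin: 8 = 4 (sym) + 2 + 2 (mixed); no antisymmetric state exists with two labels
spin_sym = count_symmetric(2)
spin_asym = count_antisymmetric(2)
spin_mixed = 2 ** 3 - spin_sym - spin_asym

# Combined spin-flavour labels: totally symmetric three-quark states
spin_flavour_sym = count_symmetric(6)

print(flavour_sym, flavour_mixed, flavour_asym)
print(spin_sym, spin_mixed, spin_asym)
print(spin_flavour_sym, 10 * 4 + 8 * 2)
```

The last line compares the direct count of symmetric spin–flavour states with the sum of the decuplet (10 flavour states × 4 spin states) and octet (8 × 2) contributions.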

The ground-state octet (spin 1/2) and decuplet (spin 3/2) are shown in Fig. 5.6.

**Fig. 5.6** SU(3) ground-state (spin-1/2) baryon octet states and the (spin-3/2) decuplet first excited states.

**Fig. 5.7** Photograph and line diagram of an Ω<sup>−</sup> event from [44].

There are two final points:


The Ω<sup>−</sup> was discovered using K<sup>−</sup>p interactions. A bubble chamber photograph [44] of the first observation of an Ω<sup>−</sup> is shown in Fig. 5.7. The new particle was observed to decay through a chain of three weak decays. The decays are clearly weak because the intermediate particles travel an appreciable distance before decaying in turn. This implies a longer lifetime than for decays via the electromagnetic or strong interactions. The production and decay chain is

$$\begin{aligned} K^-p &\to \Omega^-K^+K^0 \\ \Omega^- &\to \Xi^0\pi^- \\ \Xi^0 &\to \Lambda^0\pi^0 \\ \Lambda^0 &\to p\pi^- \end{aligned}$$

where each decay reduces the magnitude of the strangeness by one unit.<sup>13</sup>

<sup>13</sup>Ω<sup>−</sup> has strangeness −3 and spin 3/2.
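The bookkeeping behind this chain is easy to automate. The sketch below is an illustration; the dictionary layout and function names are assumptions. It tabulates the standard charge, baryon number, and strangeness of each particle and prints the change in each quantity at every step.

```python
# (charge Q, baryon number B, strangeness S) for the particles in the chain.
QUANTUM_NUMBERS = {
    "K-": (-1, 0, -1), "K+": (1, 0, 1), "K0": (0, 0, 1),
    "p": (1, 1, 0), "pi-": (-1, 0, 0), "pi0": (0, 0, 0),
    "Lambda0": (0, 1, -1), "Xi0": (0, 1, -2), "Omega-": (-1, 1, -3),
}

# Each step is (initial state, final state).
CHAIN = [
    (["K-", "p"], ["Omega-", "K+", "K0"]),   # strong production
    (["Omega-"], ["Xi0", "pi-"]),            # weak decay
    (["Xi0"], ["Lambda0", "pi0"]),           # weak decay
    (["Lambda0"], ["p", "pi-"]),             # weak decay
]


def totals(particles):
    """Summed (Q, B, S) of a list of particles."""
    return tuple(sum(QUANTUM_NUMBERS[p][i] for p in particles) for i in range(3))


def changes(initial, final):
    """Change (dQ, dB, dS) between the initial and final states."""
    return tuple(f - i for i, f in zip(totals(initial), totals(final)))


for initial, final in CHAIN:
    print(" ".join(initial), "->", " ".join(final), changes(initial, final))
```

Charge and baryon number are unchanged at every step. Strangeness is conserved in the strong production reaction and changes by one unit in each of the three weak decays, taking the Ω<sup>−</sup> from S = −3 to the non-strange final state.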

… SU(3) flavour symmetry was used. This included quark-like objects, but did not require that they were real particles. Remember, this was 10 years before the key papers on QCD and confinement were published.

## **5.2.5 Deriving the complete spin–flavour wavefunction**

A final step is to obtain the explicit quark model wavefunction for the spin-up proton. We concentrate on the non-trivial ψ<sub>spin</sub>ψ<sub>flavour</sub> parts, whose product must be symmetric. From an inspection of the three spin-1/2 combinations, for angular-momentum spin (Section 5.1.2) and isospin (end of Section 5.2.1), we note that the |S = 1/2, S<sub>z</sub> = 1/2⟩ and |I = 1/2, I<sub>3</sub> = 1/2⟩ parts are of mixed symmetry. So we must take care to combine these wavefunctions appropriately. Using M<sub>S</sub> and M<sub>A</sub> as generic labels for the mixed-symmetry combinations that are symmetric and antisymmetric, respectively, under exchange of the first two quarks, we get

$$|p^\uparrow\rangle = \sqrt{\frac{1}{2}} \left[ \psi\_{\text{spin}}(\mathcal{M}\_\text{S})\psi\_{\text{flavour}}(\mathcal{M}\_\text{S}) + \psi\_{\text{spin}}(\mathcal{M}\_\text{A})\psi\_{\text{flavour}}(\mathcal{M}\_\text{A}) \right]$$

where

$$\begin{aligned} \psi\_{\text{spin}}(\mathcal{M}\_{\text{S}}) &= \left| \frac{1}{2}, \frac{1}{2} \right\rangle = \sqrt{\frac{1}{6}} \left( 2 \left| \uparrow \uparrow \downarrow \right\rangle - \left| \downarrow \uparrow \uparrow \right\rangle - \left| \uparrow \downarrow \uparrow \right\rangle \right) \\ \psi\_{\text{spin}}(\mathcal{M}\_{\text{A}}) &= \left| \frac{1}{2}, \frac{1}{2} \right\rangle = \sqrt{\frac{1}{2}} \left( \left| \uparrow \downarrow \uparrow \right\rangle - \left| \downarrow \uparrow \uparrow \right\rangle \right) \end{aligned}$$

and, with different notation, but identical in content:

$$\begin{split} \psi\_{\text{flavour}}(\mathcal{M}\_{\text{S}}) &= \sqrt{\frac{1}{6}} \left( 2|uud\rangle - |duu\rangle - |udu\rangle \right), \\ \psi\_{\text{flavour}}(\mathcal{M}\_{\text{A}}) &= \sqrt{\frac{1}{2}} \left( |udu\rangle - |duu\rangle \right). \end{split}$$

Putting this all together, we can write out the complete proton wavefunction:

$$\begin{aligned} \vert p^{\uparrow} \rangle = \sqrt{\frac{1}{18}} (&2 \vert u^{\uparrow} u^{\uparrow} d^{\downarrow} \rangle - \vert u^{\uparrow} u^{\downarrow} d^{\uparrow} \rangle - \vert u^{\downarrow} u^{\uparrow} d^{\uparrow} \rangle \\ &+ 2 \vert u^{\uparrow} d^{\downarrow} u^{\uparrow} \rangle - \vert u^{\uparrow} d^{\uparrow} u^{\downarrow} \rangle - \vert u^{\downarrow} d^{\uparrow} u^{\uparrow} \rangle \\ &+ 2 \vert d^{\downarrow} u^{\uparrow} u^{\uparrow} \rangle - \vert d^{\uparrow} u^{\downarrow} u^{\uparrow} \rangle - \vert d^{\uparrow} u^{\uparrow} u^{\downarrow} \rangle) \\ &\times \vert \psi\_{\text{colour}} \rangle \end{aligned}$$
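The expansion of the proton wavefunction can be checked term by term with exact arithmetic. The sketch below is an illustration under stated assumptions: states are dictionaries keyed by configuration strings, "+" and "-" stand for spin up and down, and the normalisation factors 1/√6 and 1/√2 are carried separately as rational weights. It multiplies out ψ<sub>spin</sub>(M<sub>S</sub>)ψ<sub>flavour</sub>(M<sub>S</sub>) + ψ<sub>spin</sub>(M<sub>A</sub>)ψ<sub>flavour</sub>(M<sub>A</sub>) and rescales to integer coefficients.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations

# Unnormalised mixed-symmetry states (integer coefficients only).
SPIN_S = {"++-": 2, "-++": -1, "+-+": -1}      # norm squared 6
SPIN_A = {"+-+": 1, "-++": -1}                  # norm squared 2
FLAVOUR_S = {"uud": 2, "duu": -1, "udu": -1}    # norm squared 6
FLAVOUR_A = {"udu": 1, "duu": -1}               # norm squared 2


def product_state(spin, flavour, weight):
    """Combine a spin state and a flavour state particle by particle."""
    out = defaultdict(Fraction)
    for s_cfg, s_amp in spin.items():
        for f_cfg, f_amp in flavour.items():
            key = tuple(f + s for f, s in zip(f_cfg, s_cfg))
            out[key] += weight * s_amp * f_amp
    return out


# The S-S product carries (1/sqrt 6)^2 = 1/6, the A-A product (1/sqrt 2)^2 = 1/2.
proton = defaultdict(Fraction)
for part in (product_state(SPIN_S, FLAVOUR_S, Fraction(1, 6)),
             product_state(SPIN_A, FLAVOUR_A, Fraction(1, 2))):
    for key, amp in part.items():
        proton[key] += amp

# Rescale by 3 to obtain integer coefficients; the overall normalisation
# is then 1/sqrt(sum of squares).
coefficients = {key: int(3 * amp) for key, amp in proton.items() if amp != 0}
norm_squared = sum(c * c for c in coefficients.values())


def is_symmetric(state):
    """True if the state is unchanged by every permutation of the three quarks."""
    return all(
        state.get(tuple(key[i] for i in perm)) == amp
        for key, amp in state.items()
        for perm in permutations(range(3))
    )


for key, coeff in sorted(coefficients.items()):
    print(coeff, " ".join(key))
print(norm_squared, is_symmetric(coefficients))
```

Every term with a spin-down d quark should appear with coefficient 2 and every other term with coefficient −1, the squares should sum to 18, and the state should be symmetric under exchange of any pair of quarks, as required once the antisymmetric colour part is factored out.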

## **5.3 Heavy quarks**

By the mid-1960s, broken SU(3) flavour symmetry was becoming established as a plausible explanation for the patterns appearing among the zoo of hadronic states being discovered.<sup>14</sup> The underlying quark model was much more contentious: it was not clear if a quark, which had never been observed in isolation, was a real particle or just a convenient mathematical construct. The experimental and theoretical evidence for quarks being real particles is discussed in Chapter 9. Flavour SU(3), as discussed in this chapter, was used to make successful predictions for hadrons composed of the light u, d, and s quarks. With the discovery of charm and the c quark and some years later of the b quark, a whole new world of charm and beauty hadrons opened up. Although most of the experimental work was carried out at e<sup>+</sup>e<sup>−</sup> colliders, the first evidence for b quarks came from a fixed-target experiment with a proton beam incident on a nuclear target. At the time, it was confidently expected, on the basis that there should be six types of quark to match the six leptons,<sup>15</sup> that the top (t) quark would be found at the higher-energy e<sup>+</sup>e<sup>−</sup> colliders then being constructed in Europe and Japan, but it was not to be. The first evidence for the t quark came from the Tevatron using pp̄ collisions at 1.8 TeV centre-of-mass energy. This section covers the discovery and properties of the b and c quarks; the discovery of the t quark and its properties are covered in detail in Section 8.6.

<sup>14</sup>The zoo continues to grow as more precise experiments coupled with more sophisticated analysis techniques uncover the states with non-zero orbital angular momentum among the quark constituents.

<sup>15</sup>The tau lepton, with a presumed tau neutrino, was discovered in 1975 by the SLAC/Berkeley collaboration using the e<sup>+</sup>e<sup>−</sup> collider (SPEAR) through a larger rate for e<sup>+</sup>e<sup>−</sup> → e<sup>±</sup>μ<sup>∓</sup> + X<sup>0</sup> (where X<sup>0</sup> is missing energy) than could be explained by higher-order purely electromagnetic processes.

## **5.3.1 The charm quark**

The need for a fourth quark was already being discussed (see Chapter 7) before the J/ψ and the D mesons were discovered.<sup>16</sup> Indirect evidence for charm arrived in 1974 from two experiments:

<sup>16</sup>The J/ψ is a cc̄ state and thus has …


Ting et al. named the new state J; Richter et al. chose ψ (a rare example of a particle looking like its name: Fig. 5.8(c) is an event display from the Stanford group [9] showing ψ′ → ψμ<sup>+</sup>μ<sup>−</sup> followed immediately by ψ → μ<sup>+</sup>μ<sup>−</sup>). What was striking in both cases was how exceptionally narrow a Breit–Wigner resonance was needed to fit the data in the vicinity of 3 GeV. The J/ψ mass and width are 3096 MeV and 93 keV; compare this with a typical hadronic resonance, the ρ(1700), with mass ∼1720 MeV and width ∼250 MeV. The e<sup>+</sup>e<sup>−</sup> and μ<sup>+</sup>μ<sup>−</sup> decay modes of the J/ψ were equal and together amounted to ∼12%, with hadronic modes accounting for the other 88% of decays. There was no doubt that the J/ψ was a hadronic state, but its decays were highly suppressed, by a factor of roughly 1/2500! Many models were suggested, from new quarks to supersymmetry, but the simplest turned out to be the former, and the new quark was named the charm quark, with charge +2/3 and carrying the new charm quantum number in analogy to strangeness.<sup>17</sup> Data taken at e<sup>+</sup>e<sup>−</sup> colliders at energies above the J/ψ (and ψ′(3686)) showed evidence for a threshold being passed, with new meson states being pair-produced, consistent with cq̄ + c̄q pairs.

**Fig. 5.8** Plots and pictures from the J/ψ discovery papers: a striking object indeed! (a) from [40], (b) from [39], and (c) from [9].

<sup>17</sup>The correct PDG names for the J/ψ and ψ′ are J/ψ(1S) and ψ(2S), respectively. The open-charm mesons are the D and D̄ states.
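To put the narrowness of the J/ψ in numbers, the widths quoted above can be compared directly and converted to lifetimes with τ = ħ/Γ. This is a small arithmetic sketch only; the constant name and function name are illustrative, and the value of ħ is the standard one in MeV s.

```python
HBAR_MEV_S = 6.582119569e-22   # reduced Planck constant in MeV s


def lifetime_seconds(width_mev):
    """Mean lifetime tau = hbar / Gamma for a total width given in MeV."""
    return HBAR_MEV_S / width_mev


jpsi_width = 0.093       # MeV (93 keV, as quoted in the text)
rho1700_width = 250.0    # MeV (as quoted in the text)

width_ratio = rho1700_width / jpsi_width
print(f"width ratio rho(1700) / J/psi: {width_ratio:.0f}")
print(f"J/psi lifetime:    {lifetime_seconds(jpsi_width):.2e} s")
print(f"rho(1700) lifetime: {lifetime_seconds(rho1700_width):.2e} s")
```

The ratio of the two widths is a few thousand, the same order of magnitude as the suppression factor quoted in the text.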

## **5.3.2 The beauty quark**

A Fermilab experiment [86] using the 400 GeV proton beam on a nuclear target (A = Cu or Pt) measured the dimuon invariant mass spectrum in p + A → μ<sup>+</sup>μ<sup>−</sup> + X. It was expected, and observed, that the dimuon mass spectrum would fall exponentially. The double-differential cross section in rapidity and invariant mass was fitted at y = 0 by an expression of the form

$$\left. \frac{\mathrm{d}^2 \sigma}{\mathrm{d}m \, \mathrm{d}y} \right|\_{y=0} = A \mathrm{e}^{-bm}$$

where m is the invariant mass. A fit to the mass range 6 GeV < m < 12 GeV, excluding the range 8.8–10.6 GeV, gave b = 0.98 ± 0.02 GeV<sup>−1</sup>. A statistically significant enhancement was observed at ∼9.5 GeV. The experiment did not have sufficient mass resolution to resolve the structure of the excess above the steeply falling dimuon mass spectrum, but it could be fitted with one or two resonances.
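The logic of fitting the continuum in the sidebands and then looking for an excess in the excluded window can be illustrated with a toy spectrum. Everything in the sketch below is synthetic and hypothetical except the slope b = 0.98 GeV<sup>−1</sup> and the mass ranges, which are taken from the text: the peak height, its width, and the absence of statistical fluctuations are assumptions made purely for illustration, and the straight-line fit to ln(y) is a simplification of a real cross-section fit.

```python
from math import exp, log


def fit_exponential(points):
    """Least-squares line through (m, ln y); returns (A, b) for y = A exp(-b m)."""
    n = len(points)
    xs = [m for m, _ in points]
    ys = [log(y) for _, y in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return exp(intercept), -slope


# Slope from the text; peak position from the text; peak shape is hypothetical.
B_TRUE, PEAK, SIGMA = 0.98, 9.5, 0.2


def toy_spectrum(m):
    """Synthetic spectrum: exponential continuum plus a Gaussian bump (arbitrary units)."""
    continuum = exp(-B_TRUE * m)
    bump = 0.5 * exp(-B_TRUE * PEAK) * exp(-0.5 * ((m - PEAK) / SIGMA) ** 2)
    return continuum + bump


masses = [6.0 + 0.2 * i for i in range(31)]   # 6 to 12 GeV in 0.2 GeV steps
sidebands = [(m, toy_spectrum(m)) for m in masses if not 8.8 <= m <= 10.6]

amplitude, slope = fit_exponential(sidebands)
excess = toy_spectrum(PEAK) / (amplitude * exp(-slope * PEAK)) - 1.0

print(f"fitted slope b = {slope:.3f} per GeV")
print(f"fractional excess at the peak = {excess:.2f}")
```

Excluding the signal window keeps the peak from biasing the continuum parameters, so the extrapolated exponential provides the baseline against which the enhancement is measured.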

The beauty quark is also known as the bottom quark, since it forms a doublet with the top quark.

## **5.3.3 The top quark**

The top quark was discovered at the Tevatron in pp̄ → tt̄ + X at √s = 1.8 TeV. Its discovery and properties are covered in detail in Section 8.6.

## **5.3.4 Charm and beauty states**

Both the c and b quarks can form meson and baryon states with the light quarks. The t quark decays too rapidly to form such bound states.<sup>18</sup> This section gives a brief overview of heavy flavour states. The charm and beauty states, particularly the B mesons, have been studied in great detail at dedicated e<sup>+</sup>e<sup>−</sup> colliders, providing a wealth of experimental results, only a fraction of which can be covered here. Oscillation phenomena and mixing of D<sup>0</sup>–D̄<sup>0</sup> and B<sup>0</sup>–B̄<sup>0</sup> states are discussed in some detail in Chapter 10.

<sup>18</sup>See Exercise 5.5 at the end of the chapter.

## **Meson states**

The heavy meson states are Qq̄ systems; the lightest with zero orbital angular momentum have J<sup>P</sup> = 0<sup>−</sup>, 1<sup>−</sup>. Table 5.3 shows those for cq̄ and the antiparticles c̄q.

For the D mesons,

$$m\_{D^{\pm}} - m\_{D^{0}} = 4.77 \pm 0.10 \,\mathrm{MeV}$$

and

$$|m\_{D\_1^0} - m\_{D\_2^0}| = \left(2.39^{+0.59}\_{-0.63}\right) \times 10^{10} \,\hbar \,\mathrm{s}^{-1}$$

Similar details are given for the bq̄ and b̄q mesons in Table 5.4. For the B mesons,

$$m\_{B^{0}} - m\_{B^{\pm}} = 0.33 \pm 0.06 \,\mathrm{MeV}$$

and

$$\begin{aligned} \left| m\_{B\_H^0} - m\_{B\_L^0} \right| &= \left( 0.507 \pm 0.005 \right) \times 10^{12} \,\hbar \,\text{s}^{-1}, \quad \text{or, equivalently,} \\ &= \left( 3.337 \pm 0.033 \right) \times 10^{-10} \,\text{MeV} \end{aligned}$$
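The two forms of the B<sup>0</sup> mass difference are related by a single factor of ħ. The sketch below carries out the conversion, taking the value in ħ s<sup>−1</sup> to be 0.507 × 10<sup>12</sup>, which is the value consistent with the quoted 3.337 × 10<sup>−10</sup> MeV. The constant is a rounded CODATA value and the function name is an illustrative assumption.

```python
# Reduced Planck constant in MeV s (rounded CODATA value)
HBAR_MEV_S = 6.582119569e-22

def hbar_per_second_to_mev(delta_m):
    """Convert a mass difference quoted in units of hbar s^-1 into MeV."""
    return delta_m * HBAR_MEV_S

# Central value of |m(B0_H) - m(B0_L)| in hbar s^-1
delta_m_b = hbar_per_second_to_mev(0.507e12)

print(f"B0_H - B0_L mass difference: {delta_m_b:.3e} MeV")
```

Compared with the B meson mass of about 5.28 GeV, this splitting is a fractional effect of order 10<sup>−13</sup>, which is why it is measured through oscillations rather than directly.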


**Table 5.3** Lowest-lying charmed meson states; the c quark has charm C = +1.


**Table 5.4** Lowest-lying beauty meson states; the b quark has beauty B = +1.

As for the neutral K mesons, the question arises of CP violation and flavour oscillations. These in turn depend on the small mass difference between the states equivalent to the K<sup>0</sup><sub>S</sub>, K<sup>0</sup><sub>L</sub> states. CP violation and charm or beauty oscillations for neutral meson states depend very sensitively on the mass differences; these questions are covered in Chapter 10.

## **5.3.5 Heavy** *QQ̄* **systems**

Positronium, the electromagnetic bound system of an electron and a positron, has provided a very 'clean' laboratory for understanding quantum electrodynamics (QED) without any nuclear complications. Similarly, the QQ̄ systems provide a laboratory for understanding QCD. The mass scale provided by the heavy quarks means that perturbative methods can be used for the QCD calculations. The success of these calculations in describing the details of the QQ̄ systems was important for the development of QCD itself. Of the three heavy quarks, only cc̄ and bb̄ systems exist. The tt̄ bound system does not have time to form before the top quarks have decayed.

## **5.3.6 Charmonium**

The first plots with evidence for the J/ψ are shown in Fig. 5.8. Once the excitement of the discovery in quick succession of the J/ψ and ψ′ died down, the focus turned to establishing the properties of these states.

## **Isospin**

The first question to be asked is how much u-ness and d-ness there are in this new state. The J/ψ isospin assignment comes from observing

$$\text{BR}(J/\psi \to \rho^+ \pi^-) = \text{BR}(J/\psi \to \rho^0 \pi^0) = \text{BR}(J/\psi \to \rho^- \pi^+)$$

Consulting the Clebsch–Gordan tables gives the J/ψ isospin: |I = 0, I<sub>3</sub> = 0⟩ (i.e. zero isospin!).


## *J<sup>PC</sup>*

Using SPEAR, the SLAC/LBL team were examining e<sup>+</sup>e<sup>−</sup> → μ<sup>+</sup>μ<sup>−</sup>. There are two possible processes: non-resonant production via a virtual photon, and resonant production via the J/ψ.


The lower two plots in Fig. 5.8(a) show σ(e<sup>+</sup>e<sup>−</sup> → μ<sup>+</sup>μ<sup>−</sup>) and σ(e<sup>+</sup>e<sup>−</sup> → e<sup>+</sup>e<sup>−</sup>) at CMS energies around 3.095 GeV. The middle plot shows a clear dip on the low-mass side of the resonance. This is evidence of the interference between the resonant (via the J/ψ) and non-resonant (via a photon) channels, which can only happen if the J/ψ has the same quantum numbers as the photon: J<sup>PC</sup> = 1<sup>−−</sup>.

## **Width**

Consider the Breit–Wigner formula for e<sup>+</sup>e<sup>−</sup> → J/ψ → e<sup>+</sup>e<sup>−</sup>:

$$\sigma(E) = 4\pi\lambda^2 \frac{2J+1}{(2s\_1+1)(2s\_2+1)} \frac{\Gamma\_{ee}^2/4}{(E-E\_\mathrm{R})^2 + \Gamma^2/4} \tag{5.1}$$

For this reaction, J = 1, s<sub>1</sub> = s<sub>2</sub> = 1/2, so

$$
\sigma(E)\_{J/\psi \to e^{+}e^{-}} = 3\pi\lambda^2 \frac{\Gamma\_{e^{+}e^{-}}^2 / 4}{(E - E\_{\mathrm{R}})^2 + \Gamma^2 / 4} \tag{5.2}
$$

This can be integrated to find the total cross section, giving

$$\sigma = \int\_0^\infty \sigma(E)\_{J/\psi \to e^{+}e^{-}} \,\mathrm{d}E = \frac{3\pi^2 \lambda^2}{2} \left(\frac{\Gamma\_{e^{+}e^{-}}}{\Gamma}\right)^2 \Gamma \tag{5.3}$$
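The integral in eqn 5.3 can be checked numerically. The sketch below integrates the energy-dependent factor of eqn 5.2 (everything except the prefactor 3πλ<sup>2</sup>) with a plain midpoint rule and compares it with the closed form (π/2)(Γ<sub>ee</sub>/Γ)<sup>2</sup>Γ. The total width is the 93 keV quoted earlier; the e<sup>+</sup>e<sup>−</sup> partial width is assumed to be 6% of it (half of the ∼12% leptonic fraction). The integration window and step count are illustrative choices, and because the resonance energy is far larger than Γ, the lower limit E = 0 is treated as −∞.

```python
import math

def lorentzian(delta_e, gamma_ee, gamma_tot):
    """Energy-dependent factor of eqn 5.2 as a function of delta_e = E - E_R."""
    return (gamma_ee**2 / 4.0) / (delta_e**2 + gamma_tot**2 / 4.0)

def integrate_midpoint(f, lo, hi, n):
    """Plain midpoint rule with n equal steps."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

GAMMA_TOT = 93.0              # keV, total width quoted in the text
GAMMA_EE = 0.06 * GAMMA_TOT   # keV, ASSUMED: ~6% branching fraction to e+e-

# Integrate over +-2000 widths around the peak; the neglected tails are ~1e-4 of the total.
half_window = 2000.0 * GAMMA_TOT
numeric = integrate_midpoint(
    lambda x: lorentzian(x, GAMMA_EE, GAMMA_TOT), -half_window, half_window, 400_000
)

# Closed form: multiplying by 3*pi*lambda^2 reproduces the right-hand side of eqn 5.3.
analytic = (math.pi / 2.0) * (GAMMA_EE / GAMMA_TOT) ** 2 * GAMMA_TOT

print(f"numeric  = {numeric:.5f} keV")
print(f"analytic = {analytic:.5f} keV")
```

The small residual difference comes from the finite integration window, since a Lorentzian has long tails.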

From the measurements made by Richter et al. (shown in Fig. 5.8(a)), we can deduce the following:


This is much narrower than the full widths of the established J<sup>PC</sup> = 1<sup>−−</sup> mesons: K<sup>∗</sup>(1410) with width 232 MeV, ρ(770) with width 149 MeV, and even φ(1020) with the relatively narrow width of 4.3 MeV. This shows that the J/ψ cannot be composed of combinations of the light (u, d, s) quarks. The very narrow width can be understood if the state is made of heavier quarks (charm).

## **Charmonium states**

Soon after the discovery of the J/ψ, more J<sup>PC</sup> = 1<sup>−−</sup> cc̄ states were found:


Further cc̄ states with other values of J<sup>PC</sup> were later discovered: the charmonium spectrum is shown in Fig. 5.9.

Charmonium decays are governed by kinematics:


The rapid decay of ψ(3770) → DD̄ is via a single gluon exchange, ∝ α<sub>s</sub><sup>2</sup>, since it is above threshold for strong decays to a pair of charmed mesons (Fig. 5.10(b)).

To see that we need three-gluon decay modes, first consider the decay of the J/ψ. This is below threshold to decay into charm mesons. There is another hadronic decay, into pions. Because there are no charm quarks in the final state, the charm/anticharm pair in the initial state must annihilate for this decay, and there must be gluon propagators connecting the initial and the final state. As this is again a strong decay, the conservation laws for the strong interaction must be obeyed at all stages of the process. The relevant conserved property is the charge conjugation eigenvalue. For the initial state with L = 0 and S = 1, C = (−1)<sup>L+S</sup> = (−1)<sup>0+1</sup> = −1. For the final state, a decay into a single pion would not satisfy momentum conservation. A decay into ππ (C = +1) would violate charge conjugation, so the final state must have at least three pions. More important is the structure of the intermediate gluon state. It cannot be a single gluon, because the initial state is a colour singlet (it's a particle), and a single gluon can never be a colour singlet. The gluon is not an eigenstate of charge conjugation, because of its colour content (e.g. a gluon with rb becomes −br; the minus sign occurs for similar reasons as the minus sign in the charge conjugation eigenvalue for the photon), but a multi-gluon state can be a charge conjugation eigenstate. A two-gluon colour singlet state will contain contributions like (rb)(br). The charge conjugate state will then have (−br)(−rb), so the charge conjugation eigenvalue of the two-gluon state will be C = +1, and there can be no strong decay of a J/ψ into two gluons. A three-gluon state would contain elements like (rb)(bg)(gr) ± (br)(gb)(rg). One of these combinations will have C = −1, and so a decay into three gluons is possible.

**Fig. 5.9** The cc̄ charmonium energy levels. From the PDG diagrams [114, p. 1040].

**Fig. 5.10** (a) J/ψ decay to a three-pion hadronic final state via three gluons. (b) ψ(3S) decay to DD̄.

## **5.3.7 Comparison with positronium**

## **Positronium**


## **Differences between charmonium and positronium**


## **5.3.8 Bottomonium**

A comparison of the ratio

$$R = \frac{\sigma(e^{+}e^{-} \to \text{hadrons})}{\sigma(e^{+}e^{-} \to \mu^{+}\mu^{-})}$$

in the regions near the ρ<sup>0</sup>, J/ψ, and Υ resonances is shown in Fig. 5.11. Following the discovery of the narrow excess in the μ-pair invariant mass distribution measured in p + A → μ<sup>+</sup>μ<sup>−</sup> + X already described, the region of 9–10 GeV in centre-of-mass energy was studied by the e<sup>+</sup>e<sup>−</sup> colliders at SLAC, Cornell, and DESY, where the better resolution of these colliders<sup>19</sup> enabled the system to be resolved into three narrow resonances Υ(1S), Υ(2S), and Υ(3S), with masses of 9.46, 10.02, and 10.36 GeV, respectively. A broader resonance, the Υ(4S) with mass 10.58 GeV, decayed to mesons via BB̄ pairs, each B meson having a mass ∼5.28 GeV. The bottomonium spectrum is shown in Fig. 5.12.

<sup>19</sup>In an e<sup>+</sup>e<sup>−</sup> collider, the energy of a resonance can be determined by the beam energy, which can be measured very precisely. In a hadron production experiment, the resonance energy is determined by the final-state particles (electrons or muons) and the resolution is lower.

**Fig. 5.11** The ratio R in the e<sup>+</sup>e<sup>−</sup> CMS energy regions: (a) near the ρ; (b) near the J/ψ; (c) near the Υ. From the PDG data plots [115, Fig. 50.6].
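The difference between the three narrow Υ states and the broader Υ(4S) follows from simple threshold arithmetic, in the same way as for the ψ(3770) in charmonium. The sketch below uses only the masses quoted above; the dictionary keys and the function name are illustrative.

```python
# Masses in GeV as quoted in the text
UPSILON_MASSES = {"Y(1S)": 9.46, "Y(2S)": 10.02, "Y(3S)": 10.36, "Y(4S)": 10.58}
B_MESON_MASS = 5.28

# Open-beauty threshold: the minimum mass needed to decay to a B Bbar pair
threshold = 2 * B_MESON_MASS

def above_open_beauty_threshold(mass):
    """True if a state of this mass can decay strongly to a B Bbar pair."""
    return mass > threshold

for name, mass in UPSILON_MASSES.items():
    status = "above" if above_open_beauty_threshold(mass) else "below"
    print(f"{name}: {mass:.2f} GeV is {status} the B Bbar threshold ({threshold:.2f} GeV)")
```

Only the Υ(4S) lies above the threshold, and then only by about 20 MeV, so it is the first state for which the unsuppressed decay to BB̄ is open.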

## **5.4 Exotic hadrons**

The Standard Model allows more possibilities for hadrons than considered so far. We need to add gluons to quarks as building blocks, considering different ways in which gluon fields can be configured 'connecting' or 'gluing' quarks while making sure that the outcome is colourless.

The simplest object is called a glueball, a quarkless set of gluons that is colourless as a whole. No glueball has been unambiguously identified so far; expected lifetimes are short and glueballs would couple easily to conventional mesons, making unambiguous identification difficult.

**Fig. 5.12** The bb̄ bottomonium states. From the PDG diagrams [114, p. 1109].

Having glue fields inside a hadron leads to hybrids. It is possible to excite gluonic degrees of freedom, making gluonic fields vibrate for example. No hybrids have been found so far.

Then there are tetraquarks, four-quark systems such as qqq̄q̄. They could be of two types: either a 'molecule' of two mesons, with one qq̄ orbiting another qq̄, or a diquark system, with a qq diquark binding to a q̄q̄ antidiquark (known also as baryonium). There are some candidates for tetraquark states. The most promising one, not matching the mass, lifetime, and other quantum numbers of any conventional meson (known or predicted), is the narrow X(3872) state discovered by the Belle experiment. The X(3872) can decay to π<sup>+</sup>π<sup>−</sup>J/ψ and to γJ/ψ, suggesting that it contains charm and anticharm quarks. Its quantum numbers, J<sup>PC</sup> = 1<sup>++</sup>, are now well established by the LHCb experiment. Whether it is a 'molecule' or a diquark system or a mixture of states is not yet known.

What tetraquarks are in relation to mesons, pentaquarks are in relation to baryons: qqqqq̄. Two states discovered by LHCb, P<sub>c</sub>(4450)<sup>+</sup> and P<sub>c</sub>(4380)<sup>+</sup> [106], are good candidates for pentaquarks consisting of uudcc̄ quarks. How the quarks are bound remains to be established.

There might be even more complicated systems. For example, an equal mixture of u, d, and s quarks could exist as a state as simple as one in which two Λ<sup>0</sup> baryons are bound together (if the mass of such a state were below the Λ<sup>0</sup>–nucleon threshold then its lifetime would be of the order of days or months), or even as macroscopic systems with a larger number of quarks. There is also a possibility of an analogue to a neutron star: a quark star.

## **Chapter summary**


## **Further reading**


## **Exercises**


of around 1 GeV in the fixed-target laboratory frame.



<sup>2</sup>Note that in that chapter, transformations are in an 'active' sense (i.e. the coordinate system does not change but vectors do) rather than the 'passive' sense (i.e. the coordinate system changes but vectors do not) considered in this book. It should also be noted that the metric in [110] is − + + +, in contrast to the one used in this book, which is + − − − (see Note 5).

<sup>3</sup>Greek indices μ, ν = 0, 1, 2, 3 and Latin indices i, j = 1, 2, 3.

## **6 Relativistic quantum mechanics**

The aim of this chapter is to introduce a relativistic formalism that can be used to describe particles and their interactions. The emphasis is on those elements of the formalism that can be carried on to relativistic quantum field theory (RQF), which underpins the theoretical framework of high-energy particle physics.

We begin with a brief summary of special relativity, concentrating on 4-vectors and spinors. One-particle states and their Lorentz transformations follow, leading to the Klein–Gordon and Dirac equations for probability amplitudes, i.e. relativistic quantum mechanics (RQM). Readers who want to get to RQM quickly, without studying its foundation in special relativity, can skip the first sections and start reading from Section 6.3.

Intrinsic problems of RQM are discussed and a region of applicability of RQM is defined. Free-particle wavefunctions are constructed and particle interactions are described using their probability currents. Gauge symmetry is introduced, which allows the interaction between a particle and a classical gauge field to be described within the formalism.

## **6.1 Special relativity**

Einstein's<sup>1</sup> special relativity is a necessary and fundamental part of any formalism of particle physics. We begin with a brief summary. For a full account, refer to specialized books, for example [128] or [127]. Theory-oriented students with a good mathematical background might want to consult books on groups and their representations, for example [46], followed by introductory books on RQM/RQF, for example [107]. Here we are only going to present conclusions without derivations, avoiding group-theoretical language and aiming at a presentation of key concepts at a qualitative level. Chapter 41 in [110] on spinors is recommended.<sup>2</sup>

<sup>1</sup>Albert Einstein, 1879–1955.

The basic elements of special relativity are 4-vectors (or, strictly speaking, contravariant 4-vectors) such as a 4-displacement<sup>3</sup> x<sup>μ</sup> = (t, **x**) = (x<sup>0</sup>, x<sup>1</sup>, x<sup>2</sup>, x<sup>3</sup>) = (x<sup>0</sup>, x<sup>i</sup>) or a 4-momentum p<sup>μ</sup> = (E, **p**) = (p<sup>0</sup>, p<sup>1</sup>, p<sup>2</sup>, p<sup>3</sup>) = (p<sup>0</sup>, p<sup>i</sup>). 4-vectors have real components and form a vector space. There is a metric tensor g<sub>μν</sub> = g<sup>μν</sup> that is used to form a dual space to the space of 4-vectors. This dual space is a vector space of linear functionals, known as 1-forms (or covariant 4-vectors), which act on 4-vectors. For every 4-vector x<sup>μ</sup>, there is an associated 1-form x<sub>μ</sub> = g<sub>μν</sub>x<sup>ν</sup>. Such a 1-form is a linear functional that, acting on a 4-vector y<sup>μ</sup>, gives a real number x<sub>μ</sub>y<sup>μ</sup> = g<sub>μν</sub>x<sup>ν</sup>y<sup>μ</sup>. This number is called the scalar product<sup>4</sup> x · y of x<sup>ν</sup> and y<sup>μ</sup>. The Lorentz transformation between two coordinate systems, Λ<sup>μ</sup><sub>ν</sub>, with x′<sup>μ</sup> = Λ<sup>μ</sup><sub>ν</sub>x<sup>ν</sup>, leaves the scalar product unchanged, which is equivalent to g<sub>ρσ</sub> = g<sub>μν</sub>Λ<sup>μ</sup><sub>ρ</sub>Λ<sup>ν</sup><sub>σ</sub>.

Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg 2016. Published in 2016 by Oxford University Press.

In the standard configuration, the Lorentz transformation becomes the Lorentz boost along the first space coordinate direction and is given by<sup>5</sup>

$$
\Lambda = \begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
$$

with

$$
\beta = \frac{v}{c}, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}
$$

where v is the velocity of the boost.
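The statement that a Lorentz transformation leaves the scalar product unchanged can be checked directly for the standard-configuration boost, whose rows are (γ, −γβ, 0, 0), (−γβ, γ, 0, 0), (0, 0, 1, 0), and (0, 0, 0, 1). The sketch below works in units with c = 1; the two 4-vectors and the value β = 0.6 are arbitrary illustrative choices.

```python
import math

# Diagonal of the metric g = diag(+1, -1, -1, -1)
METRIC_DIAGONAL = [1.0, -1.0, -1.0, -1.0]

def boost_x(beta):
    """Lorentz boost along the first space axis (standard configuration), c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply(matrix, vector):
    """x'^mu = Lambda^mu_nu x^nu."""
    return [sum(matrix[i][j] * vector[j] for j in range(4)) for i in range(4)]

def dot(x, y):
    """Scalar product g_{mu nu} x^mu y^nu."""
    return sum(g * a * b for g, a, b in zip(METRIC_DIAGONAL, x, y))

x = [5.0, 1.0, 2.0, 3.0]     # arbitrary illustrative 4-vectors
y = [4.0, -1.0, 0.5, 2.0]

lam = boost_x(0.6)
x_prime, y_prime = apply(lam, x), apply(lam, y)

print("x.y before the boost:", dot(x, y))
print("x.y after the boost: ", dot(x_prime, y_prime))
```

The individual components of x and y change under the boost, but the two printed scalar products agree up to rounding.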

Two Lorentz boosts along different directions are equivalent to a single boost and a space rotation. This means that Lorentz transformations, which can be seen as space–time rotations, include Lorentz boosts (rotations by a purely imaginary angle) as well as space rotations (by a purely real angle). Representing Lorentz transformations by 4-dimensional real matrices acting on 4-vectors is not well suited for combining Lorentz boosts and space rotations in a transparent way. Even a simple question like 'What is the single space rotation that is equivalent to a combination of two arbitrary space rotations?' is hard to answer.

A better way is to represent Lorentz transformations by 2-dimensional complex matrices. First we consider a 3-dimensional real space and rotations. With every rotation in that 3-dimensional real space we can associate a 2 × 2 complex matrix, called a spin matrix,<sup>6</sup>

$$R = \cos\left(\frac{1}{2}\theta\right) + \mathrm{i}\sin\left(\frac{1}{2}\theta\right) \left(\sigma\_x \cos\alpha + \sigma\_y \cos\beta + \sigma\_z \cos\gamma\right)$$

or

$$\begin{split} R &= \cos\left(\frac{1}{2}\theta\right) + \mathrm{i}\sin\left(\frac{1}{2}\theta\right)(\mathbf{n}\cdot\boldsymbol{\sigma}) \\ &= \exp\left[\mathrm{i}\left(\frac{1}{2}\theta\right)(\mathbf{n}\cdot\boldsymbol{\sigma})\right] \end{split} \tag{6.1}$$

where θ is the angle of rotation, α, β, γ are the angles<sup>7</sup> between the axis of rotation **n** and the coordinate axes, and **σ** = (σ<sub>x</sub>, σ<sub>y</sub>, σ<sub>z</sub>) are the Pauli matrices. Note that R is unitary: R<sup>†</sup> = R<sup>−1</sup>. The vector space of spin matrices (a subspace of all 2 × 2 complex matrices) is thus defined using four basis vectors, such as the unit matrix and three basis vectors formed using the Pauli matrices: iσ<sub>x</sub>, iσ<sub>y</sub>, iσ<sub>z</sub>. In this basis, the spin matrix R has the coordinates cos(θ/2), sin(θ/2) cos α, sin(θ/2) cos β, and sin(θ/2) cos γ. Combining two rotations, one multiplies the corresponding spin matrices and describes the outcome using the above basis, thus getting all the parameters of the equivalent single rotation.<sup>8</sup>

<sup>4</sup>A similar situation occurs in the infinite-dimensional vector space of states in quantum mechanics (with complex numbers there). For every state, represented by a vector known as a ket, for example |x⟩, there is a 1-form known as a bra, ⟨x|, that, acting on a ket |y⟩, gives a number ⟨x|y⟩, which is called the scalar product of the two kets |x⟩ and |y⟩.

<sup>5</sup>The metric is represented by the matrix

$$g = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$

<sup>6</sup>Also known as Hamilton's quaternion or a spinor transformation or a rotation operator.

<sup>7</sup>Only two of the angles α, β, γ are independent.

<sup>8</sup>This representation of rotations is used in video game programming because of this ease of combining rotations quickly.

The next step is to associate each 3-dimensional space (real numbers) vector **x** = (x<sup>1</sup>, x<sup>2</sup>, x<sup>3</sup>) with a corresponding spin matrix (there is no unit matrix in the basis here, only the three Pauli matrices):

$$X = x^1 \sigma\_x + x^2 \sigma\_y + x^3 \sigma\_z \tag{6.2}$$

Then, under the space rotation, **x** is transformed to **x**′ and X is transformed to X′ = RXR<sup>†</sup> = x′<sup>1</sup>σ<sub>x</sub> + x′<sup>2</sup>σ<sub>y</sub> + x′<sup>3</sup>σ<sub>z</sub>, from which we can read the coordinates of **x**′.

The beauty of this approach is that it extends seamlessly to space–time rotations, i.e. to the Lorentz transformations. The spin matrix R of eqn 6.1 becomes the Lorentz transformation

$$L = \exp\left[\left(-\boldsymbol{\rho} + \mathrm{i}\theta\mathbf{n}\right) \cdot \frac{1}{2}\boldsymbol{\sigma}\right] \tag{6.3}$$

where **ρ** = ρ**n**<sub>ρ</sub> is the rapidity.<sup>9</sup> The rapidity is related to the Lorentz β and γ parameters by

$$
\tanh \rho = \beta, \quad \cosh \rho = \gamma, \quad \sinh \rho = \beta \gamma
$$

<sup>9</sup>**n**<sub>ρ</sub> is the unit vector along the direction of the Lorentz boost.

Now, a combination of two Lorentz transformations is very transparent: just addition of real and imaginary parts in the exponent. Association of a 4-vector x<sup>μ</sup> with a Hermitian spin matrix<sup>10</sup> X,

$$X = x^0 + x^1 \sigma\_x + x^2 \sigma\_y + x^3 \sigma\_z \tag{6.4}$$

allows us to get its Lorentz-transformed coordinates from X′ = LXL<sup>†</sup> = x′<sup>0</sup> + x′<sup>1</sup>σ<sub>x</sub> + x′<sup>2</sup>σ<sub>y</sub> + x′<sup>3</sup>σ<sub>z</sub>. Finally, the Lorentz boost alone (θ = 0) along **n**<sub>ρ</sub> is

$$L = \exp\left(-\boldsymbol{\rho} \cdot \frac{1}{2}\boldsymbol{\sigma}\right) = \cosh\left(\frac{1}{2}\rho\right) - \mathbf{n}\_{\rho} \cdot \boldsymbol{\sigma} \sinh\left(\frac{1}{2}\rho\right) \tag{6.5}$$

<sup>10</sup>Now with four matrices as the basis and X satisfying X = X<sup>†</sup> by definition.

## **6.1.1 Spinors**

Spin matrices can act on 2-component complex vectors called spinors.<sup>11</sup> Spinors, like vectors and tensors, are used in a number of different areas of physics, including classical mechanics. They play a particularly important role in RQM and in this section we will describe them in some detail. Under a space rotation R, a spinor ξ (ξ<sup>α</sup> to be more precise) transforms in the following way:

$$\xi'=R\xi$$

<sup>11</sup>Spinors are vectors in the mathematical sense since they form a complex vector space, but they are not vectors like a displacement **x**, because they transform (e.g. under rotation) differently.

For comparison, the coordinates of a vector **x** transform under a space rotation as

$$X' = RXR^\dagger = x'^1 \sigma\_x + x'^2 \sigma\_y + x'^3 \sigma\_z$$

Thus, in a rotation of the coordinate system by θ = 2π, R = −1 because of the θ/2 in R, which gives ξ′ = −ξ and **x**′ = **x**. Continuing the rotation by a further 2π, so all together by 4π, results in ξ′ = ξ. Does that counterintuitive minus sign resulting from the 2π rotation have any physical significance? Yes it does, as was demonstrated in a beautiful experiment [122] using neutrons.
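The different behaviour of spinors and vectors under a 2π rotation can be reproduced numerically from eqns 6.1 and 6.2. The sketch below is written in plain Python with 2 × 2 matrices stored as nested lists; all helper names and the sample spinor and vector are illustrative assumptions. It also checks the rule for combining rotations by matrix multiplication, using the case where a rotation by π about x followed by one by π about y is equivalent to a single rotation by π about z.

```python
import math

# Pauli matrices as nested lists (rows of a 2 x 2 matrix)
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2 x 2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def n_dot_sigma(n):
    """n . sigma for a 3-component n."""
    return [[n[0] * SIGMA_X[i][j] + n[1] * SIGMA_Y[i][j] + n[2] * SIGMA_Z[i][j]
             for j in range(2)] for i in range(2)]

def spin_matrix(theta, n):
    """R = cos(theta/2) + i sin(theta/2) (n . sigma) for a unit vector n (eqn 6.1)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    ns = n_dot_sigma(n)
    return [[c * (i == j) + 1j * s * ns[i][j] for j in range(2)] for i in range(2)]

def rotate_vector(r, x):
    """Components of x after the rotation, read off from X' = R X R^dagger (eqn 6.2)."""
    m = matmul(matmul(r, n_dot_sigma(x)), dagger(r))
    # X = [[x3, x1 - i x2], [x1 + i x2, -x3]]
    return (m[1][0].real, m[1][0].imag, m[0][0].real)

def rotate_spinor(r, xi):
    """xi' = R xi for a two-component spinor."""
    return [r[i][0] * xi[0] + r[i][1] * xi[1] for i in range(2)]

x_axis, y_axis, z_axis = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
xi = [0.6, 0.8j]        # arbitrary illustrative spinor
x = (1.0, 2.0, 3.0)     # arbitrary illustrative vector

xi_2pi = rotate_spinor(spin_matrix(2 * math.pi, z_axis), xi)   # -xi up to rounding
xi_4pi = rotate_spinor(spin_matrix(4 * math.pi, z_axis), xi)   # +xi up to rounding
x_2pi = rotate_vector(spin_matrix(2 * math.pi, z_axis), x)     # x up to rounding

# Quarter turn about z: (1, 0, 0) -> (0, -1, 0), i.e. the passive sense used in the text
quarter = rotate_vector(spin_matrix(math.pi / 2, z_axis), x_axis)

# Combining rotations: R_x(pi) applied first, then R_y(pi), compared with R_z(pi)
combined = matmul(spin_matrix(math.pi, y_axis), spin_matrix(math.pi, x_axis))
direct = spin_matrix(math.pi, z_axis)
composition_error = max(abs(combined[i][j] - direct[i][j]) for i in range(2) for j in range(2))

print("spinor after 2 pi:", xi_2pi)
print("spinor after 4 pi:", xi_4pi)
print("vector after 2 pi:", x_2pi)
print("quarter turn of (1, 0, 0):", quarter)
print("composition error:", composition_error)
```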

One of two coherent neutron beams passes through a magnetic field of variable strength. In the magnetic field, the neutrons' magnetic moments precess with the Larmor frequency and the angle of the precession is easily calculated as a function of the strength of the magnetic field. After passing through the magnetic field, the beam interferes with the second beam, which followed a path outside the magnetic field. As demonstrated in Fig. 6.1, an angle of 4π is needed for the neutron wavefunction to reproduce itself. A 2π rotation gives a factor −1 in front of the original neutron wavefunction, as predicted for a spin-½ spinor.

So far, one could think about spinors as being identical with the Pauli spinors<sup>12</sup> of non-relativistic quantum mechanics (NRQM). This is not quite right. The reason is that the Pauli spinors of NRQM live in space and not in space–time and we do not know how to Lorentz-transform them. What we are constructing now are Weyl<sup>13</sup> spinors (there is more than one type) living in space–time, and we do know how to Lorentz-transform these. Weyl spinors are needed to construct Dirac spinors, or bispinors, since two Weyl spinors of different type are needed for one Dirac spinor.

<sup>12</sup>Eigenstates of spin operators, like the spin projection on the z axis, ½ħσ<sub>z</sub>, for a spin-½ particle in NRQM [94].

<sup>13</sup>Hermann Weyl, 1885–1955.

**Fig. 6.1** A phase change of 4π is needed to get the same intensity from the interference of two neutron beams. Taken from [122].

We will look now at the spin matrices X of eqn 6.4 from a different viewpoint, seeing them as tensors created by a tensor product<sup>14</sup> of 2-dimensional spinors. Weyl spinors are rooted in space–time, not only in space like Pauli spinors. Consider the Lorentz transformation L of a spin matrix X built from spinors

$$\xi = \begin{pmatrix} a \\ b \end{pmatrix} \quad \text{and} \quad \eta = \begin{pmatrix} c \\ d \end{pmatrix}:$$

$$X' = \begin{pmatrix} a' \\ b' \end{pmatrix} \begin{pmatrix} c' & d' \end{pmatrix} = L \begin{pmatrix} a \\ b \end{pmatrix} \begin{pmatrix} c & d \end{pmatrix} L^\dagger = LXL^\dagger$$

<sup>14</sup>A tensor product or outer product or dyadic product, something like this:

$$
\begin{pmatrix} a \\ b \end{pmatrix} \begin{pmatrix} c & d \end{pmatrix} = \begin{pmatrix} ac & ad \\ bc & bd \end{pmatrix}.
$$

We can see that ξ′ = Lξ but η′ = L<sup>∗</sup>η (after taking the transpose, (L<sup>†</sup>)<sup>T</sup> = L<sup>∗</sup>). There are two different types of spinors, transforming differently. Those that transform with the complex conjugate L<sup>∗</sup> are called dotted Weyl spinors, distinguished from the undotted ξ<sup>α</sup> by a dot written above the index: η<sup>α̇</sup>; for example, (ξ<sup>α</sup>)<sup>∗</sup> is a dotted spinor. The spin matrix X is then written as X<sup>αβ̇</sup> (α = 1, 2 and β̇ = 1, 2). There is a metric tensor

$$
\epsilon\_{\alpha\beta} = \epsilon^{\alpha\beta} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
$$

(and an identical one for the dotted spinors) to create a dual space of 1-forms:

$$
\xi\_{\alpha} = \epsilon\_{\alpha\beta}\xi^{\beta} \quad \text{(and} \quad \eta\_{\dot{\alpha}} = \epsilon\_{\dot{\alpha}\dot{\beta}}\eta^{\dot{\beta}}\text{)}.
$$

(thus ξ<sub>1</sub> = ξ<sup>2</sup> and ξ<sub>2</sub> = −ξ<sup>1</sup>). The scalar product<sup>15</sup>

$$
\xi^{\alpha}\zeta\_{\alpha} = \epsilon\_{\alpha\beta}\xi^{\alpha}\zeta^{\beta}
$$

(and similarly for the dotted spinors) is invariant with respect to the Lorentz transformation. Because undotted Weyl spinors and dotted Weyl spinors are different objects, the scalar product, or in general any contraction, can only be performed on the same type of spinors: an undotted index is contracted with another undotted index and a dotted index is contracted with another dotted one; one cannot contract a dotted index with an undotted one.

<sup>15</sup>Note that this gives ξ<sub>α</sub>ξ<sup>α</sup> = 0 and ξ<sub>α</sub>ζ<sup>α</sup> = −ξ<sup>α</sup>ζ<sub>α</sub>.
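Two of the statements in this section lend themselves to a quick numerical check: that X′ = LXL<sup>†</sup> factorises into ξ′ = Lξ and η′ = L<sup>∗</sup>η, and that the contraction ε<sub>αβ</sub>ξ<sup>α</sup>ζ<sup>β</sup> = ξ<sup>1</sup>ζ<sup>2</sup> − ξ<sup>2</sup>ζ<sup>1</sup> is unchanged when both spinors are transformed with the same L. The sketch below builds L as the product of a boost along z (eqn 6.5) and a rotation about x (eqn 6.1); the rapidity, the angle, the spinor components, and all helper names are arbitrary illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two 2 x 2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2 x 2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def conjugate(a):
    """Complex conjugate (no transpose) of a 2 x 2 matrix."""
    return [[a[i][j].conjugate() for j in range(2)] for i in range(2)]

def apply(m, spinor):
    """Matrix acting on a two-component spinor."""
    return [m[i][0] * spinor[0] + m[i][1] * spinor[1] for i in range(2)]

def boost_z(rho):
    """Eqn 6.5 for n_rho along z: L = cosh(rho/2) - sigma_z sinh(rho/2)."""
    c, s = math.cosh(rho / 2), math.sinh(rho / 2)
    return [[complex(c - s), 0j], [0j, complex(c + s)]]

def rotation_x(theta):
    """Eqn 6.1 for n along x: R = cos(theta/2) + i sigma_x sin(theta/2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[complex(c), 1j * s], [1j * s, complex(c)]]

def outer(xi, eta):
    """Column spinor xi times row spinor eta (no complex conjugation)."""
    return [[xi[i] * eta[j] for j in range(2)] for i in range(2)]

def epsilon_contract(xi, zeta):
    """epsilon_{alpha beta} xi^alpha zeta^beta = xi^1 zeta^2 - xi^2 zeta^1."""
    return xi[0] * zeta[1] - xi[1] * zeta[0]

# A boost combined with a rotation; both factors have unit determinant
lorentz = matmul(boost_z(0.7), rotation_x(1.1))

xi = [1.0 + 2.0j, 0.5 - 1.0j]      # arbitrary illustrative spinors
eta = [-0.3 + 0.4j, 2.0 + 0.0j]
zeta = [0.2 - 0.7j, 1.5 + 0.1j]

# (1) L X L^dagger equals the outer product of L xi with L* eta
lhs = matmul(matmul(lorentz, outer(xi, eta)), dagger(lorentz))
rhs = outer(apply(lorentz, xi), apply(conjugate(lorentz), eta))
factorisation_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# (2) the epsilon contraction of two undotted spinors is invariant
before = epsilon_contract(xi, zeta)
after = epsilon_contract(apply(lorentz, xi), apply(lorentz, zeta))

print("factorisation error:", factorisation_error)
print("contraction before: ", before)
print("contraction after:  ", after)
```

The invariance in step (2) holds because the transformed contraction equals det(L) times the original one, and det(L) = 1 for this L. The contraction of a spinor with itself vanishes identically, as stated in note 15.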

In order to gain more insight, we go beyond Lorentz transformations and consider space inversion, P: P(x<sup>0</sup>, **x**) = (x<sup>0</sup>, −**x**). The space inversion P commutes with space rotations, but not with Lorentz transformations, because Lorentz transformations affect the time component and P does not. To illustrate this, consider a boost Λ followed by a space inversion P in 4-dimensional space–time, PΛ. It is evident that this is equivalent to a space inversion P followed by a boost, PΛ = Λ′P, but Λ ≠ Λ′: if Λ is a boost with velocity **v**, then Λ′ is a boost with velocity −**v**. Thus [P, Λ] ≠ 0, and therefore P is not proportional to the identity operator, which commutes with every operator.
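The relation PΛ = Λ′P, with Λ′ the boost of opposite velocity, is easy to verify with explicit 4 × 4 matrices. The sketch below uses the standard-configuration boost along the first space axis in units with c = 1; the value β = 0.6 and the helper names are illustrative assumptions.

```python
import math

# Space inversion: P(x0, x) = (x0, -x)
PARITY = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]

def boost_x(beta):
    """Lorentz boost along the first space axis, c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul4(a, b):
    """Product of two 4 x 4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def max_difference(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

beta = 0.6
boost_then_parity = matmul4(PARITY, boost_x(beta))        # P Lambda
parity_then_reversed = matmul4(boost_x(-beta), PARITY)    # Lambda' P, with velocity -v
parity_then_same = matmul4(boost_x(beta), PARITY)         # Lambda P

print("|P Lambda - Lambda' P| =", max_difference(boost_then_parity, parity_then_reversed))
print("|P Lambda - Lambda P|  =", max_difference(boost_then_parity, parity_then_same))
```

The first difference vanishes, while the second is of order γβ, which is the non-zero commutator [P, Λ] referred to above.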

We now return to Weyl spinors. Because space inversion is not proportional to the identity operator, space inversion does not transform ξ<sup>α</sup> into ξ<sup>α</sup> times a number. It transforms ξ<sup>α</sup> into a spinor of a different type, which transforms under the Lorentz transformation differently to ξ<sup>α</sup>. Just as Pauli spinors represent spin in NRQM, Weyl spinors are going to represent spin in RQM. We know that space inversion leaves spin unaffected, and therefore, under P, ξ<sup>α</sup> needs to be transformed to a spinor that transforms under space rotations in the same way as ξ<sup>α</sup> and represents the same spin state.<sup>16</sup> Out of all three possibilities, only the 1-form η<sub>α̇</sub> transforms in the same way. So, under space inversion, ξ<sup>α</sup> → η<sub>α̇</sub> and η<sub>α̇</sub> → ξ<sup>α</sup>. In the discussion on space inversion, P<sup>2</sup> = 1 is assumed. This is fine for all particles except Majorana particles. For a Majorana particle, P<sup>2</sup> = −1 and the transformation of spinors under P is different to that given here. We will define a Majorana particle later.

<sup>16</sup>For a more complete treatment of the material presented here on spinors, Chapter III of [50] is recommended.

As the spinors ξ<sup>α</sup> and η<sup>α̇</sup> play a very important role in RQM, the following is a summary of how they behave under various transformations (note here that (L<sup>†</sup>)<sup>−1</sup> = L<sup>∗−1</sup>):


Suppose there is a spin-½ particle with 4-momentum p<sup>μ</sup> described in a particular reference frame by p<sup>αβ̇</sup> via eqn 6.4.<sup>17</sup> Note that p<sup>αβ̇</sup> is identical to p<sup>β̇α</sup>; it is not a transpose operation. Following our non-relativistic intuition gained from using Pauli spinors, we want to represent the spin of that particle by ξ<sup>α</sup>. In an attempt to write a covariant<sup>18</sup> equation, we could try to contract the undotted index α, but that would lead to something like

$$p\_{\alpha\dot{\beta}}\xi^{\alpha} = m\eta\_{\dot{\beta}}$$

where η<sub>β̇</sub> is a dotted spinor different from ξ<sup>α</sup>, related to the uncontracted dotted index, and m is a dimensionful scalar<sup>19</sup> parameter appearing because of the energy dimensionality of p<sup>μ</sup>. So the equation is covariant only when m = 0, because we do not have any dotted spinor in hand to put on the right-hand side. A similar outcome is obtained if we have only η<sub>β̇</sub> instead of ξ<sup>α</sup>. Having a column vector with two complex numbers is not enough; we also need to indicate how the two numbers transform under Lorentz transformations. Similarly, for vectors in 3-dimensional space, three real numbers are not enough; we need to know whether they represent a polar or an axial vector, since these transform differently under space inversion. So insisting on only one type of spinor excludes the other type, because they transform differently.

<sup>17</sup>We have

$$\begin{aligned} p^{1\dot{1}} &= p\_{2\dot{2}} = p^0 + p^3 \\ p^{2\dot{2}} &= p\_{1\dot{1}} = p^0 - p^3 \\ p^{1\dot{2}} &= -p\_{2\dot{1}} = p^1 - \mathrm{i}p^2 \\ p^{2\dot{1}} &= -p\_{1\dot{2}} = p^1 + \mathrm{i}p^2 \end{aligned}$$

<sup>18</sup>Here and in the rest of this chapter, covariant means covariant with respect to Lorentz transformations.

<sup>19</sup>Henceforth, scalar means scalar with respect to Lorentz transformations.

Consequently, we end up with two independent Lorentz-invariant Weyl equations:

$$p\_{\alpha\dot{\beta}}\xi^{\alpha} = 0, \qquad (p^0 - \mathbf{p} \cdot \boldsymbol{\sigma})\xi = 0 \tag{6.10}$$

$$p^{\alpha\dot{\beta}}\eta\_{\dot{\beta}} = 0, \qquad (p^0 + \mathbf{p} \cdot \boldsymbol{\sigma})\eta = 0 \tag{6.11}$$

In the context of RQM, eqn 6.10 represents an equation of motion for a free massless spin-½ particle with positive helicity<sup>20</sup> and eqn 6.11 an equation of motion for a different free massless spin-½ particle with negative helicity. Neither equation is covariant under space inversion, and each violates parity, because the space inversion, eqn 6.9, sends each spinor beyond the formalism: only one type of spinor is present in each formalism. At present, there are no known particles that could be described by either of the Weyl equations. If the electron neutrino were exactly massless, it would be described by eqn 6.11 and the hypothetically massless and different electron antineutrino would be described by eqn 6.10.

<sup>20</sup>In quantum mechanics, a helicity operator representing the projection of a particle's spin on the direction of its momentum is defined as

$$\frac{\mathbf{p}\cdot\boldsymbol{\sigma}}{|\mathbf{p}|}$$

For a massless particle, this is equivalent to

$$\frac{\mathbf{p} \cdot \boldsymbol{\sigma}}{p^0}$$

Suppose now that we have p<sup>αβ̇</sup> and two different spinors ξ<sup>α</sup> and η<sub>β̇</sub> to describe a spin-½ particle. First, we contract the undotted index, giving p<sub>αβ̇</sub>ξ<sup>α</sup> = mη<sub>β̇</sub>; then acting with<sup>21</sup> p<sup>αβ̇</sup> on η<sub>β̇</sub> from that equation gives mξ<sup>α</sup> under the condition that m<sup>2</sup> = p<sup>μ</sup>p<sub>μ</sub>. The result is a covariant set of equations

$$\begin{aligned} p^{\alpha \dot{\beta}} \eta\_{\dot{\beta}} &= m \xi^{\alpha}, \qquad (p^{0} + \mathbf{p} \cdot \boldsymbol{\sigma}) \eta = m \xi\\ p\_{\alpha \dot{\beta}} \xi^{\alpha} &= m \eta\_{\dot{\beta}}, \qquad (p^{0} - \mathbf{p} \cdot \boldsymbol{\sigma}) \xi = m \eta \end{aligned} \tag{6.12}$$

<sup>21</sup>We have p<sup>αβ̇</sup>p<sub>γβ̇</sub> = p<sup>μ</sup>p<sub>μ</sub>δ<sup>α</sup><sub>γ</sub>.

Requiring that, under space inversion, ξ<sup>α</sup> and η<sub>β̇</sub> be transformed into each other as in eqn 6.9 makes the set of equations 6.12 invariant under space inversion, because, simultaneously, p<sub>αβ̇</sub> and p<sup>αβ̇</sup> are also transformed into each other. The spinors ξ<sup>α</sup> and η<sub>β̇</sub> are combined into a single four-component bispinor called the Dirac<sup>22</sup> spinor and the two equations become one equation called the Dirac equation. In the context of RQM, the Dirac equation describes a spin-½ particle like the electron.

<sup>22</sup>P. A. M. Dirac, 1902–1984.

In order to gain more insight into the origin of the Dirac equation, consider the Lorentz boost, eqns 6.5 and 6.8, from the rest frame, momentum **p** = 0, to the frame in which the particle has energy E and momentum **p**. The relevant spinors transform as<sup>23</sup>

$$\xi(\mathbf{p}) = [\cosh(\rho/2) - \mathbf{n}\_{\rho} \cdot \boldsymbol{\sigma} \sinh(\rho/2)]\xi(0) \tag{6.13}$$

$$\eta(\mathbf{p}) = [\cosh(\rho/2) + \mathbf{n}\_{\rho} \cdot \boldsymbol{\sigma} \sinh(\rho/2)] \eta(0) \tag{6.14}$$

which can be written as

$$
\xi(\mathbf{p}) = \frac{E + m + \mathbf{p} \cdot \boldsymbol{\sigma}}{\sqrt{2m(E + m)}} \xi(0) \tag{6.15}
$$

$$\eta(\mathbf{p}) = \frac{E + m - \mathbf{p} \cdot \boldsymbol{\sigma}}{\sqrt{2m(E + m)}} \eta(0) \tag{6.16}$$

<sup>23</sup>We have

$$\cosh(\rho/2) = \frac{E+m}{\sqrt{2m(E+m)}}$$

$$\sinh(\rho/2) = \frac{|\mathbf{p}|}{\sqrt{2m(E+m)}}$$

In the particle's rest frame and in all frames moving with respect to it slowly enough that the Lorentz boost can be approximated by a Galilean transformation (not affecting time) when transforming between those frames, any differences in how spinors with dotted or undotted indices transform disappear. In that case, spinors effectively live in 3 real dimensions. Inspecting eqn 6.5, we can see that in the limit β → 0, L tends to the unit matrix and therefore, under the Galilean transformation, spinors do not change. Thus, at rest, both Weyl spinors, ξ<sup>α</sup> and η<sub>β˙</sub>, become effectively identical to the same Pauli spinor and we can write ξ<sup>α</sup>(0) = η<sub>α˙</sub>(0). This allows us, after some algebra, to remove the **p** = 0 spinors from eqns 6.15 and 6.16 and to obtain the Dirac equation, eqn 6.12.
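The algebra can be sketched as follows. This is only an illustration: it assumes p<sup>0</sup> = E with E<sup>2</sup> = |**p**|<sup>2</sup> + m<sup>2</sup>, the identity (**p**·**σ**)<sup>2</sup> = |**p**|<sup>2</sup>, and the rest-frame identification ξ(0) = η(0) made above. Multiplying the numerators of eqns 6.16 and 6.15 by the operators that appear in eqn 6.12 gives

$$\begin{aligned} (E + \mathbf{p} \cdot \boldsymbol{\sigma})(E + m - \mathbf{p} \cdot \boldsymbol{\sigma}) &= E^2 - |\mathbf{p}|^2 + mE + m\,\mathbf{p} \cdot \boldsymbol{\sigma} = m\,(E + m + \mathbf{p} \cdot \boldsymbol{\sigma}) \\ (E - \mathbf{p} \cdot \boldsymbol{\sigma})(E + m + \mathbf{p} \cdot \boldsymbol{\sigma}) &= E^2 - |\mathbf{p}|^2 + mE - m\,\mathbf{p} \cdot \boldsymbol{\sigma} = m\,(E + m - \mathbf{p} \cdot \boldsymbol{\sigma}) \end{aligned}$$

Dividing by the common factor √(2m(E + m)) and letting both sides act on the shared rest-frame spinor then yields

$$(p^{0} + \mathbf{p} \cdot \boldsymbol{\sigma})\,\eta(\mathbf{p}) = m\,\xi(\mathbf{p}), \qquad (p^{0} - \mathbf{p} \cdot \boldsymbol{\sigma})\,\xi(\mathbf{p}) = m\,\eta(\mathbf{p})$$

which is the two-component form of eqn 6.12.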

Thus the Dirac equation is equivalent to the Lorentz boost. This should be expected—once an object, like a bispinor, is found to represent a particle in its rest frame, the only thing left to do is to boost it to another frame as needed.

## **6.2 One-particle states**

The fact that quantum states of free relativistic particles are fully defined by the Lorentz transformation supplemented by the space-time translation was discovered by Wigner.<sup>24</sup> Here we will follow his idea in a qualitative way just to get the main concept across.

<sup>24</sup>Eugene Wigner, 1902–1995.

First, we note that Lorentz transformations are not able to transform a given arbitrary 4-momentum p<sup>μ</sup> into every possible p<sup>μ</sup>. Instead, the vector space of 4-momenta is divided into subspaces of 4-momenta that can be Lorentz-transformed into each other. Three of those subspaces represent experimentally known states. The simplest, at this stage, is the vacuum state given by the conditions p<sup>μ</sup> = 0 and p<sub>μ</sub>p<sup>μ</sup> = 0. There is no Lorentz transformation that would transform a 4-momentum not satisfying these conditions into one that does, and vice versa. We will not study vacuum states in this book, and therefore we move directly to consider two other possibilities.

A 4-momentum subspace related to massive particles, like the electron, is given by the condition p<sub>μ</sub>p<sup>μ</sup> > 0. In addition to a 4-momentum p<sup>μ</sup>, what other degrees of freedom are present and which geometrical objects represent them? To answer this question, we can consider Lorentz transformations that leave p<sup>μ</sup> invariant.<sup>25</sup> To see what these are, we can transform p<sup>μ</sup> to the particle rest frame, where p<sup>μ</sup> = (mass, 0, 0, 0), find the largest subset of Lorentz transformations leaving p<sup>μ</sup> invariant, and then transform back to the same p<sup>μ</sup>. It turns out, as intuitively expected, that the desired transformations are space rotations acting on 2s + 1 spinors representing 2s + 1 spin projections of a spin-s particle. Thus, the electron, s = 1/2, is represented by two Dirac spinors; in fact, by two Dirac spinors multiplied by a dimensionless scalar. To get the scalar, we add space–time translations. Looking for a theory that is space–time translation-invariant,<sup>26</sup> we are looking for the free-particle energy and momentum eigenstates that, in the position representation, lead to the scalar exp(−ip<sup>μ</sup>x<sub>μ</sub>).

<sup>25</sup>The group of such transformations is known as the little group.

<sup>26</sup>Implying energy and momentum conservation.

<sup>27</sup>For a discussion of some subtle issues, see [107].

The third 4-momentum subspace is defined by the conditions p<sup>μ</sup> ≠ 0 and p<sub>μ</sub>p<sup>μ</sup> = 0. Photons belong to this class. The question is again to find the largest<sup>27</sup> subset of Lorentz transformations leaving p<sup>μ</sup> invariant. There is no rest frame in this case, and therefore, instead, we transform an arbitrary p<sup>μ</sup> to the frame where p<sup>μ</sup> = (ω, 0, 0, ω). We can see that the subset of the Lorentz transformations leaving p<sup>μ</sup> invariant consists of the rotations in the (x<sup>1</sup>, x<sup>2</sup>) plane. As a result, a spin-s massless particle is represented by only one state, a helicity eigenstate, and not by 2s + 1 states as in the massive case. This is an important difference. In order to get parity-conserving electromagnetism with photons having either helicity + or helicity − states, we put those two, in principle different, helicity states into one theory.
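As a brief illustration of the last point, a rotation by an angle θ in the (x<sup>1</sup>, x<sup>2</sup>) plane can be written explicitly. The symbol R(θ) and the ordering (x<sup>0</sup>, x<sup>1</sup>, x<sup>2</sup>, x<sup>3</sup>) of the components are notation assumed here, not taken from the text:

$$R(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad R(\theta) \begin{pmatrix} \omega \\ 0 \\ 0 \\ \omega \end{pmatrix} = \begin{pmatrix} \omega \\ 0 \\ 0 \\ \omega \end{pmatrix}$$

The rotation mixes only the two components that vanish, so the reference momentum is unchanged for every θ.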

## **6.2.1 Fields and probability amplitudes**

We have now everything needed to develop RQM and to describe fundamental particles and their interactions. But before we move on, we pause to look at a larger picture of which RQM is only a part. The Dirac equation, for example, can be studied in the context of classical field theory or RQF or RQM. The algebra will often be identical, but the basic objects and the interpretation are different.

The most natural way to proceed from here would be to study a classical field theory. The paradigm for this is classical electromagnetism described in terms of the tensor field F<sup>μν</sup> or the 4-vector potential field A<sup>μ</sup>:

$$F^{\mu\nu} = \partial^{\mu}A^{\nu} - \partial^{\nu}A^{\mu}$$

Then such a field, for example a classical electron field, i.e. a classical Dirac spinor field Ψ, would be quantized, promoting the Ψ of classical field theory to an operator Ψ of RQF. Probability amplitudes would be obtained by taking matrix elements of Ψ of RQF sandwiched between particle states living in a suitably constructed space. In RQM, Ψ represents a particle state and is not an operator. Taking the vacuum to one-particle matrix element of the field operator Ψ of RQF, we get Ψ of RQM, which in the position representation is called the wavefunction. In RQM, one can also have states describing many particles—but only a fixed number of them. At high energies, much larger than the masses of the particles involved, particles can be created and the number of particles cannot be fixed; RQM is not adequate for this and RQF has to be used instead.

It is not appropriate in this text to go into sufficient detail to enable a proper understanding of classical field theory and RQF. Fortunately, considering the most important aspects of physics that we require, RQF gives the same results as those we will obtain in RQM. Differences will be in details beyond leading effects. The one important exception is that we will be missing the idea of a vacuum state. In RQF, a vacuum is not a 'nothingness', although particles are absent. For example, the QCD vacuum is a very complicated state. We will, however, consider the nature of the electroweak vacuum in Chapter 12, since it is fundamental to the origin of mass in the Standard Model.

## **6.3 The Klein–Gordon equation**

RQM of spin-0 particles was considered by Schrödinger first, before he published his famous equation for the non-relativistic case. He abandoned RQM because of formal difficulties that were only understood many years later. Here, we will see what they are and then define an area of applicability of RQM.

As argued earlier, a spin-0 particle with

$$p^{\mu}p\_{\mu} = m^2 > 0\tag{6.17}$$

in the position representation is expected to be described by a scalar wavefunction ∼ exp(−ip<sup>μ</sup>x<sub>μ</sub>). Replacing the energy by i∂/∂t and the momentum by −i∇ in eqn 6.17,<sup>28</sup> we get the Klein–Gordon (KG) equation<sup>29</sup> of RQM in the position representation:

$$(\Box + m^2)\Psi(t, \mathbf{x}) = 0\tag{6.18}$$

where

$$
\Box = \partial^{\mu}\partial\_{\mu} = \frac{\partial^2}{\partial t^2} - \nabla^2
$$

For a particle at rest, −i∇Ψ(t, **x**) = 0, only the time (proper time τ) derivative would be present in eqn 6.18 and there would be two independent solutions: Ψ<sup>±</sup>(τ, **x**) = exp(∓imτ)Ψ<sup>±</sup>(0, 0). Therefore, in a frame in which the particle has momentum **p** and energy E<sub>p</sub> = +√(**p**<sup>2</sup> + m<sup>2</sup>) > 0 (the subscript 'p' is for the plus sign, indicating that E<sub>p</sub> is positive), we get, as expected,<sup>30</sup>

$$\Psi^+(t, \mathbf{x}) = N \exp(-\mathrm{i}p \cdot x) = N \exp(-\mathrm{i}E\_{\mathrm{p}}t + \mathrm{i}\mathbf{p} \cdot \mathbf{x})\tag{6.19}$$

where N is a normalization constant that will be defined shortly.

<sup>30</sup>We have mτ = p<sup>μ</sup>x<sub>μ</sub>.

Instead of boosting the other solution, i.e. taking the (−E<sub>p</sub>, **p**) eigenstate, we take the complex conjugate of Ψ<sup>+</sup>(t, **x**), corresponding to the (−E<sub>p</sub>, −**p**) eigenstate, to get<sup>31</sup>

$$\Psi^{-}(t, \mathbf{x}) = N \exp(+\mathrm{i}p \cdot x) = N \exp(+\mathrm{i}E\_{\mathrm{p}}t - \mathrm{i}\mathbf{p} \cdot \mathbf{x})\tag{6.20}$$

By a direct substitution, one can check that a general solution of eqn 6.18 is indeed a linear combination of Ψ<sup>+</sup>(t, **x**) and Ψ<sup>−</sup>(t, **x**).
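The substitution can be written out in one line. This sketch uses only eqns 6.18 to 6.20 and E<sub>p</sub><sup>2</sup> = **p**<sup>2</sup> + m<sup>2</sup>. Each time derivative of Ψ<sup>±</sup> brings down a factor ∓iE<sub>p</sub> and each gradient a factor ±i**p**, so

$$(\Box + m^2)\Psi^{\pm} = \left(\frac{\partial^2}{\partial t^2} - \nabla^2 + m^2\right)\Psi^{\pm} = \left(-E\_{\mathrm{p}}^2 + \mathbf{p}^2 + m^2\right)\Psi^{\pm} = 0$$

Because eqn 6.18 is linear, any combination aΨ<sup>+</sup> + bΨ<sup>−</sup> with constant coefficients a and b (labels introduced here for illustration) is then also a solution.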

We have obtained, as expected, Ψ<sup>+</sup>(t, **x**), but, in addition, we also have Ψ<sup>−</sup>(t, **x**). This is the first puzzle of RQM, the nature of which will become clearer when we progress a little further. Both solutions of the KG equation are eigenfunctions of the energy operator i∂/∂t: Ψ<sup>+</sup>(t, **x**) with an eigenvalue E<sub>p</sub> and Ψ<sup>−</sup>(t, **x**) with −E<sub>p</sub>, a negative energy for a free particle!

<sup>28</sup>We have

$$\begin{aligned} p^{\mu} &= \mathrm{i}\partial^{\mu} \\ p^{0} &= \mathrm{i}\frac{\partial}{\partial x^{0}} = \mathrm{i}\frac{\partial}{\partial t} \\ p^{i} &= \mathrm{i}\partial^{i} = -\mathrm{i}\partial\_{i} = -\mathrm{i}\frac{\partial}{\partial x^{i}} \end{aligned}$$

<sup>29</sup>Sometimes known as the Klein–Gordon–Fock equation.

<sup>31</sup>Why we are doing this should become clear after reading Section 6.3.1.

In exactly the same way as for the non-relativistic Schrödinger equation, we can derive the continuity equation for a probability density ρ and a probability current **j**:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{j} = 0 \tag{6.21}$$

where

$$\rho = \mathrm{i}\left(\Psi^\* \frac{\partial \Psi}{\partial t} - \Psi \frac{\partial \Psi^\*}{\partial t}\right) \tag{6.22}$$

$$\mathbf{j} = -\mathrm{i} (\Psi^\* \nabla \Psi - \Psi \nabla \Psi^\*) \tag{6.23}$$

The probability current turns out to be given by the same expression as in the non-relativistic case, but the probability density is different, although it shows a nice symmetry with the current, and we can define a 4-vector current

$$j^{\mu} \equiv (\rho, \mathbf{j}) = \mathrm{i} (\Psi^\* \partial^{\mu} \Psi - \Psi \partial^{\mu} \Psi^\*)$$

The continuity equation, eqn 6.21, can then be written as ∂<sub>μ</sub>j<sup>μ</sup> = 0. The corresponding conserved quantity is the total probability, which we obtain by integrating j<sup>0</sup> = ρ over the 3-dimensional space. The underlying symmetry is invariance with respect to multiplication by a global phase factor: physics described by Ψ is identical to physics described by e<sup>iθ</sup>Ψ for any fixed real parameter θ.<sup>32</sup>

<sup>32</sup>This can be proved from the Klein–Gordon Lagrangian and Noether's theorem.

Substituting Ψ<sup>+</sup>(t, **x**) from eqn 6.19 into eqns 6.22 and 6.23, we obtain

$$\rho^{+} = 2|N|^2 E\_{\mathrm{p}}, \qquad \mathbf{j}^{+} = 2|N|^2 \mathbf{p} \tag{6.24}$$

Now we can fix the normalization N. In NRQM, the volume integral of the probability density is a constant with value 1 for one particle in the whole space. This does not work in RQM, because of the Lorentz contraction, which modifies the volume, contracting one side of a cube, parallel to the Lorentz boost, by the Lorentz factor γ. To keep the integral independent of the Lorentz transformation, the probability density should grow by the same factor γ.<sup>33</sup> So putting N = 1 would do the job, as would any other constant. The choice of N = 1 is called the covariant normalization and corresponds to 2E<sub>p</sub> particles in a unit volume. Another popular choice is N = 1/√(2m), which in the non-relativistic limit E<sub>p</sub> → m makes ρ → Ψ<sup>∗</sup>Ψ and **j** → velocity, approaching the expressions from NRQM.

<sup>33</sup>There is no problem here, since ρ is the time-like component of a 4-vector.

For Ψ<sup>−</sup>(t, **x**), eqn 6.24 becomes

$$\rho^- = -2|N|^2 E\_{\mathrm{p}}, \qquad \mathbf{j}^- = -2|N|^2 \mathbf{p} \tag{6.25}$$

In summary, Ψ<sup>+</sup>(t, **x**) and related observables (the energy, the probability density, and the probability current) come out as expected and behave nicely in the non-relativistic limit. In contrast to Ψ<sup>+</sup>(t, **x**), an unexpected additional wavefunction Ψ<sup>−</sup>(t, **x**) describes a free particle with negative energy and negative probability density and with the probability current flowing in the opposite direction to the particle's momentum: all properties that are unexpected and difficult to accept.
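The values quoted in eqns 6.24 and 6.25 follow from a short substitution, sketched here using only eqns 6.19, 6.20, 6.22, and 6.23. For the plane waves, ∂Ψ<sup>±</sup>/∂t = ∓iE<sub>p</sub>Ψ<sup>±</sup> and ∇Ψ<sup>±</sup> = ±i**p**Ψ<sup>±</sup>, with Ψ<sup>∗</sup>Ψ = |N|<sup>2</sup>, so that

$$\begin{aligned} \rho^{\pm} &= \mathrm{i}\left[(\mp \mathrm{i}E\_{\mathrm{p}}) - (\pm \mathrm{i}E\_{\mathrm{p}})\right]|N|^2 = \pm 2|N|^2 E\_{\mathrm{p}} \\ \mathbf{j}^{\pm} &= -\mathrm{i}\left[(\pm \mathrm{i}\mathbf{p}) - (\mp \mathrm{i}\mathbf{p})\right]|N|^2 = \pm 2|N|^2 \mathbf{p} \end{aligned}$$

The overall sign of both the density and the current is therefore fixed by the sign of the energy eigenvalue.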

## **6.3.1 The Feynman–Stueckelberg interpretation of negative-energy states**

In this section, we outline the Feynman–Stueckelberg interpretation of negative-energy states following the approach described by Feynman in his Dirac Lecture [74], which is recommended as further reading.

Suppose there is a particle in a state φ<sub>0</sub> as indicated in Fig. 6.2(a). At time t<sub>1</sub>, a potential U<sub>1</sub> is turned on for a moment, acting on the particle and changing its state to an intermediate state. At time t<sub>2</sub>, a second perturbation U<sub>2</sub> changes that intermediate state to the final one, which could be the same as the original state φ<sub>0</sub>. The amplitude for the particle to go from the initial state φ<sub>0</sub> to the same state φ<sub>0</sub> after time t<sub>2</sub> has a contribution from the amplitude with an intermediate state, existing for the period of time from t<sub>1</sub> to t<sub>2</sub>, of energy E<sub>p</sub> > 0. All possible intermediate states of different energies E<sub>p</sub> > 0 contribute. Among them, there are amplitudes for particles travelling faster than the speed of light. This is the result of insisting that all energies E<sub>p</sub> be positive. If one starts a series of waves from a point, keeping all energies positive, these waves cannot be confined to the inside of the light cone. The sketch in Fig. 6.2(a) corresponds to an amplitude where the particle in the intermediate state travels faster than the speed of light. An observer in the reference frame (a) with coordinates (t, x) observes one particle in quantum state φ<sub>0</sub> that moves from x<sub>1</sub> to x<sub>2</sub>, ending in the same quantum state φ<sub>0</sub>.

As indicated in Fig. 6.2(b), there is another reference frame (b) with coordinates (t′, x′) in which the sequence of events is different; t′<sub>2</sub> happens first, before t′<sub>1</sub>. An observer in this reference frame has a different story to tell. A particle at x′<sub>1</sub> is in a quantum state φ<sub>0</sub>. Nothing happens until time t′<sub>2</sub>, when suddenly two particles emerge from the point x′<sub>2</sub>. One of these particles travels to x′<sub>1</sub> and at time t′<sub>1</sub> collides with the original particle. The particles annihilate with each other, disappearing from the scene, leaving the third particle at x′<sub>2</sub> in state φ<sub>0</sub>. In frame (b), three particles were present between t′<sub>2</sub> and t′<sub>1</sub>. The second observer can argue that the particle that travelled from x′<sub>2</sub> to x′<sub>1</sub> is the antiparticle of the original particle and therefore they were able to annihilate with each other. So antiparticles must exist and their properties are defined from particles such that the annihilation works. The first observer might argue that the antiparticle of the second observer is her/his particle travelling backwards in time and that is the interpretation of the negative-energy states. The negative-energy states correspond to particles travelling backwards in time and therefore the phase of the wavefunction in eqn 6.20 has a −iE<sub>p</sub>(−t) ≡ +iE<sub>p</sub>t contribution instead of −iE<sub>p</sub>t as in eqn 6.19. To make the picture complete, we must also take into account that a particle travelling backwards in time has its momentum reversed. Mathematically, all this is equivalent to taking the complex conjugate of the positive-energy solution Ψ<sup>+</sup> (eqn 6.19) to obtain the negative-energy solution Ψ<sup>−</sup> (eqn 6.20).

**Fig. 6.2** A contribution to the transition amplitude viewed in two different reference frames. Adapted from [74].

In summary, negative-energy solutions of the KG equation represent antiparticles. The probability density represents the charge density and can be either negative or positive. The same applies for the probability current, representing the charged current, the number of charges passing through the unit area per unit time.

## **Inclusion of interactions via a potential**

We introduce interactions using a potential and see what happens. Following the way in which a potential V is introduced into the non-relativistic Schrödinger equation, we modify the energy operator<sup>34</sup>

$$\mathrm{i}\frac{\partial}{\partial t} \rightarrow \mathrm{i}\frac{\partial}{\partial t} - V$$

transforming eqn 6.18 into

$$\left(\mathrm{i}\frac{\partial}{\partial t} - V\right)^2 \Psi = (-\nabla^2 + m^2)\Psi$$

which, for a time-independent, time-like potential V in one dimension and for energy eigenstates with energy E<sub>p</sub>, becomes<sup>35</sup>

$$\left[E\_{\mathrm{p}} - V(s)\right]^2 \psi(s) = \left(-\frac{\partial^2}{\partial s^2} + m^2\right) \psi(s) \tag{6.26}$$

Consider a time-like potential barrier of fixed height V > 0 for s ≥ 0, as shown in Fig. 6.3. The wavefunction ψ(s) consists of incident, Ie<sup>ips</sup>, reflected, Re<sup>−ips</sup>, and transmitted, Te<sup>iks</sup>, waves:

$$\begin{aligned} \psi\_{\mathrm{L}}(s) &= I e^{\mathrm{i}ps} + R e^{-\mathrm{i}ps} \\ \psi\_{\mathrm{R}}(s) &= T e^{\mathrm{i}ks} \end{aligned}$$

where

$$\psi(s) = \begin{cases} \psi\_{\mathrm{L}}(s) & \text{for } s < 0 \\ \psi\_{\mathrm{R}}(s) & \text{for } s \ge 0 \end{cases}$$

<sup>34</sup>A proper discussion of interactions will be given in Section 6.5.

<sup>35</sup>We have

$$\begin{aligned} \mathrm{i}\frac{\partial\Psi}{\partial t} &= E\_{\mathrm{p}}\Psi\\ \Psi(t,s) &= \psi(s)\exp(-\mathrm{i}E\_{\mathrm{p}}t) \end{aligned}$$

On substituting ψ<sup>L</sup> and ψ<sup>R</sup> into eqn 6.26, we obtain E<sup>2</sup> = p<sup>2</sup> + m<sup>2</sup> and (E − V )<sup>2</sup> = k<sup>2</sup> + m<sup>2</sup>, leading to

$$p = \pm \sqrt{E^2 - m^2}, \qquad k = \pm \sqrt{(E - V)^2 - m^2}$$

In both cases, we choose a + sign in front of the square root to match the expected propagation directions as in Fig. 6.3.

From the continuity condition at s = 0 for ψ(s) and dψ(s)/ds, we get

$$I + R = T, \qquad pI - pR = kT$$

and so

$$T = \frac{2p}{p+k}I, \qquad R = \frac{p-k}{p+k}I \tag{6.27}$$

The probability currents along s for s < 0 and s ≥ 0 are

$$j\_{\rm L} = \frac{p}{m}(|I|^2 - |R|^2), \qquad j\_{\rm R} = \frac{k}{m}|T|^2 \tag{6.28}$$

Keeping the energy E fixed, we consider three different cases of potential strength.

The first case is that of a weak potential, E>V + m, where k is real and k<p. The probability densities in the two regions are

$$
\rho\_{\rm L} = \frac{E}{m} |\psi\_{\rm L}|^2 > 0, \qquad \rho\_{\rm R} = \frac{E-V}{m} |\psi\_{\rm R}|^2 > 0 \tag{6.29}
$$

This case looks like the non-relativistic one, with nothing special happening: a small fraction of the incoming wave is reflected and the rest is transmitted.

In the second case, the potential is of moderate strength, V − m < E < V + m, and k = i√(m<sup>2</sup> − (E − V)<sup>2</sup>) = iκ is purely imaginary (with κ real) and

$$R = \frac{p - \mathbf{i}\kappa}{p + \mathbf{i}\kappa} I$$

so

$$|R| = |I|, \qquad j\_{\mathcal{L}} = 0$$

**Fig. 6.3** A time-like potential barrier of height V. Incoming, reflected, and transmitted waves are also indicated.

The incoming wave is totally reflected and the probability density in the barrier shows the expected exponential decay

$$\rho\_{\mathrm{R}} = \frac{E - V}{m} |\psi\_{\mathrm{R}}|^2 = \frac{E - V}{m} |T|^2 \mathrm{e}^{-2\kappa s}$$

as in the non-relativistic case. However, the situation is not identical, because the increasing potential V changes the sign of the probability density from positive (ρ<sub>R</sub> > 0 in the first case of the weak potential) to negative:

$$E > V \quad \Rightarrow \quad \rho\_{\mathrm{R}} > 0$$

but

$$E < V \quad \Rightarrow \quad \rho\_{\mathrm{R}} < 0 \tag{6.30}$$

We will come back to this after discussing the case of the strong potential, E < V − m, when k becomes purely real and k<sup>2</sup> > p<sup>2</sup>. The probability density, which became negative when the potential became strong enough in the previous case, stays negative and the probability current becomes real inside the barrier:

$$\rho\_{\mathcal{R}} = \frac{E - V}{m} |T|^2 < 0, \qquad j\_{\mathcal{R}} = \frac{k}{m} |T|^2.$$

Consider next the previously unphysical case of k < 0, because, in a counter-intuitive way, this case now corresponds to a particle moving to the right. To see that, we will calculate the group velocity

$$\mathcal{V}\_{\rm g} = \frac{\partial E}{\partial k} = \frac{k}{E - V} > 0$$
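The three regimes of the step can be explored numerically. The sketch below is an illustration under stated assumptions rather than part of the original treatment: it takes units with ħ = c = 1, a unit incident amplitude I = 1, and the normalization implied by eqn 6.28; the function name `klein_gordon_step` and the sample values of E, V, and m are hypothetical. For an imaginary k the transmitted current is evaluated as Re(k)|T|<sup>2</sup>/m, which reduces to eqn 6.28 when k is real.

```python
import cmath
import math


def klein_gordon_step(E, V, m):
    """Return (k, R, T, j_L, j_R, rho_R) for a unit incident wave, hbar = c = 1."""
    p = math.sqrt(E**2 - m**2)             # incident momentum, + root
    k = cmath.sqrt((E - V) ** 2 - m**2)    # real and >= 0, or i*kappa with kappa > 0
    if E < V - m:
        k = -k                             # strong potential: k < 0 so that k / (E - V) > 0
    T = 2 * p / (p + k)                    # eqn 6.27 with I = 1
    R = (p - k) / (p + k)
    j_L = p / m * (1 - abs(R) ** 2)        # eqn 6.28, left of the step
    j_R = k.real / m * abs(T) ** 2         # eqn 6.28 at s = 0; zero for imaginary k
    rho_R = (E - V) / m * abs(T) ** 2      # density just inside the barrier (s = 0)
    return k, R, T, j_L, j_R, rho_R


if __name__ == "__main__":
    m, E = 1.0, 2.0
    for label, V in [("weak", 0.5), ("moderate", 2.5), ("strong", 5.0)]:
        k, R, T, j_L, j_R, rho_R = klein_gordon_step(E, V, m)
        print(f"{label:9s} V={V}: |R|={abs(R):.3f} "
              f"j_L={j_L:+.3f} j_R={j_R:+.3f} rho_R={rho_R:+.3f}")
```

With these sample values the current is continuous across s = 0 in all three regimes, |R| = 1 for the moderate potential, and |R| > 1 with a negative density and current inside the barrier for the strong potential.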

A consequence is that T and R given by eqn 6.27 can be arbitrarily large, possibly making the reflected current bigger than the incoming one.<sup>36</sup> The reason for this is that the very strong potential provides enough energy to produce particle–antiparticle pairs. The probability current and the probability density within the barrier are negative because created antiparticles (see eqn 6.25) are attracted to the barrier, moving to the right (𝒱<sub>g</sub> > 0). The created particles are repelled by the barrier, moving to the left, increasing the reflected probability current. Such a situation can be created by focusing light from a high-power laser, making a very strong electric field, which in turn produces electron–positron pairs from the vacuum. Another example is the Hawking radiation in the neighbourhood of a black hole. The fact that the probability density already became negative in the case of a moderate-strength potential corresponds to vacuum polarization by the creation of virtual particle–antiparticle pairs. They do not affect the probability currents, because there is not enough energy in the system to promote them to become real particles. An analogy is the Lamb shift in atomic physics, where the vacuum polarization affects energy levels.

<sup>36</sup>This is known as the Klein paradox, another puzzle of RQM, after O. Klein, 1894–1977.

The problem of RQM is now clear: the formalism describes one particle (or a fixed number of particles), but physics needs many particles, the number of which cannot be fixed; particles can be created and particles can be annihilated. One needs RQF to describe such physics. RQM can only be used as long as the number of particles is fixed. Using the uncertainty relation ΔpΔs ∼ ħ, we see that pair creation that starts at Δp ∼ mc sets the limit on Δs ∼ ħ/mc. So, as long as we are studying physics at a scale bigger than ħ/mc, known as the Compton wavelength, RQM can be applied. Atomic physics is an example where this condition is usually fulfilled: the Compton wavelength of the electron is about 3.86×10<sup>−13</sup> m. But RQM can also be applied to many processes in high-energy particle physics. A representative example is electron–positron annihilation producing hadrons, many of which are pions. The fundamental process in this case is e<sup>+</sup> + e<sup>−</sup> → q + q̄. The number of particles is 2 and is fixed, and the change from 2 leptons to 2 quarks can be handled by RQM. Fragmentation of quarks to hadrons takes place on a different, much slower, time scale and therefore can be separated from the fundamental process of e<sup>+</sup> + e<sup>−</sup> annihilation.
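The length scale ħ/mc quoted above for the electron can be reproduced with a few lines of arithmetic. This is a minimal sketch; the constants are CODATA 2018 values in SI units, and the function name `reduced_compton_wavelength` is a label chosen here for illustration.

```python
# CODATA 2018 values in SI units
HBAR = 1.054571817e-34          # reduced Planck constant, J s
C = 299792458.0                 # speed of light, m / s (exact by definition)
M_ELECTRON = 9.1093837015e-31   # electron mass, kg


def reduced_compton_wavelength(mass_kg):
    """Return hbar / (m c) in metres, the scale below which pair creation matters."""
    return HBAR / (mass_kg * C)


if __name__ == "__main__":
    print(f"electron: {reduced_compton_wavelength(M_ELECTRON):.3e} m")
```

Passing a heavier mass gives a proportionally shorter length, so the scale at which a one-particle description stops being adequate shrinks for heavier particles.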

Is there any limit on Δp? How well can one measure momentum? In NRQM, momentum can be measured with any precision, but, because of the upper limit on speed, < c, we must have<sup>37</sup>

$$
\Delta p \Delta t \sim \frac{\hbar}{c}
$$

and consequently infinite precision Δp → 0 requires infinite measurement time Δt → ∞.

<sup>37</sup>See the Introduction in [50].

## **6.4 The Dirac equation**

This is the most important section of this chapter. The Dirac equation provides a relativistically consistent equation describing a massive point-like spin-1/2 particle such as the electron, and it led to the prediction of the positron, the first antiparticle.

First, we consider different representations of the Dirac equation, the probability current and bilinear covariants. Then we find the free-particle states, examine their properties, and introduce chirality and helicity operators. The formalism is then applied to describe simple Standard Model processes such as e<sup>+</sup> + e<sup>−</sup> → μ<sup>+</sup> + μ<sup>−</sup> at energies well above the muon rest mass but well below the Z<sup>0</sup> mass.<sup>38</sup> The properties of Dirac particles under the discrete symmetries P, C, and T are then discussed. Electromagnetic interactions are introduced via so-called minimal coupling and the non-relativistic limit is obtained, leading to the prediction of g = 2 for the magnetic dipole moment of the electron. Finally, there is a brief discussion of the Aharonov–Bohm effect and the pre-eminence of the electromagnetic 4-vector potential in RQM.

<sup>38</sup>The domain of the e<sup>+</sup>e<sup>−</sup> colliders PETRA (DESY) and PEP (SLAC).

Before we move on, we shall give a few words of introduction for readers who skipped Section 6.1. Unlike the non-relativistic case, where electron spin is described by a column of two complex numbers, called the Pauli spinor,<sup>39</sup> in RQM one needs a pair of two-component spinors ξ<sup>α</sup> and η<sub>β˙</sub> (with α = 1, 2 and β˙ = 1, 2) called the undotted and dotted spinors (the latter being distinguished by a dot above the index). These are just names, and they could be called 'apples' and 'pears' since they are as different as apples and pears. The reason for the different names is that the corresponding spinors transform differently under Lorentz transformations (see eqns 6.7 and 6.8). Combining ξ<sup>α</sup> and η<sub>β˙</sub> into one 4-component column gives the Dirac spinor Ψ:

$$\Psi = \begin{pmatrix} \xi \\ \eta \end{pmatrix}$$

Now this Dirac spinor can be transformed using a unitary operator (changing what is called a representation), mixing ξ and η. And we must follow these different ξ and η spinors all the way through to know how to Lorentz-transform the resulting Dirac spinor; having just the four complex numbers constituting the Dirac spinor is not enough to know how to apply the Lorentz transformation to them.

<sup>39</sup>Following experimental observations suggesting that the electron has a property called spin, Pauli extended the Schrödinger equation describing the electron's interaction with an electromagnetic field by inserting into it a two-component spinor and corresponding magnetic dipole moment.

The Dirac equation, as derived in Section 6.1 as eqn 6.12 or, for readers who skipped that section, as derived in Exercises 6.2 and 6.3 following the historical path, can be written as

$$
\begin{pmatrix} 0 & p\_0 + \mathbf{p} \cdot \boldsymbol{\sigma} \\ p\_0 - \mathbf{p} \cdot \boldsymbol{\sigma} & 0 \end{pmatrix} \Psi = m\Psi \tag{6.31}
$$

Instead of Ψ, we could use Ψ′ = UΨ, where U is a unitary operator. In the new basis (the new representation), eqn 6.31 would look different. In general, the Dirac equation can be written as

$$(\gamma p - m)\Psi = 0\tag{6.32}$$

where

$$\gamma p \equiv \gamma^{\mu} p\_{\mu} = p\_0 \gamma^0 - \mathbf{p} \cdot \boldsymbol{\gamma} = \mathrm{i} \gamma^0 \frac{\partial}{\partial t} + \mathrm{i} \boldsymbol{\gamma} \cdot \nabla \tag{6.33}$$

<sup>40</sup>In some books, <sup>ξ</sup><sup>α</sup> and <sup>η</sup>β˙ swap places, leading to different space-like *γ* matrices, multiplied by −1.

<sup>41</sup>We have

$$\begin{aligned} (\gamma^0)^2 &= 1\\ (\gamma^1)^2 &= (\gamma^2)^2 = (\gamma^3)^2 = -1\\ \gamma' &= U\gamma U^\dagger = U\gamma U^{-1} \end{aligned}$$

<sup>42</sup>Equation 6.32 is

$$\mathbf{i}\gamma^0 \frac{\partial \Psi}{\partial t} + \mathbf{i}\gamma^k \frac{\partial \Psi}{\partial x^k} - m\Psi = 0$$

Applying †, we get

$$\mathrm{i}\frac{\partial\Psi^{\dagger}}{\partial t}\gamma^{0} + \mathrm{i}\frac{\partial\Psi^{\dagger}}{\partial x^{k}}(-\gamma^{k}) + m\Psi^{\dagger} = 0$$

and multiplying by γ<sup>0</sup> from the right and using eqn 6.35 gives

$$\mathrm{i}\,\frac{\partial\bar{\Psi}}{\partial t}\gamma^0 + \mathrm{i}\,\frac{\partial\bar{\Psi}}{\partial x^k}\gamma^k + m\bar{\Psi} = 0$$

Comparing equations 6.32 and 6.31, we can see that in the representation that was used to get eqn 6.31,

$$
\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \boldsymbol{\gamma} = \begin{pmatrix} 0 & -\boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \tag{6.34}
$$

This is known as the Weyl or symmetric or chiral representation.<sup>40</sup> Multiplying eqn 6.32 by γp from the left and requiring that every solution also satisfy p<sup>μ</sup>p<sub>μ</sub> = m<sup>2</sup>, we get representation-independent constraints on the γ matrices:

$$
\gamma^{\mu}\gamma^{\nu} + \gamma^{\nu}\gamma^{\mu} = 2g^{\mu\nu} \tag{6.35}
$$

The matrix γ<sup>0</sup> is Hermitian and the matrices γ<sup>i</sup> are anti-Hermitian (in any representation):<sup>41</sup>

$$
\gamma^{0\dagger} = \gamma^0, \qquad (\gamma^i)^\dagger = -\gamma^i \tag{6.36}
$$
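Eqns 6.35 and 6.36 can be checked directly for the Weyl representation of eqn 6.34. The sketch below uses only the Python standard library and exact small-integer arithmetic; the helper names (`block`, `matmul`, `dagger`, `clifford_ok`) are chosen here for illustration, and the metric is taken as g<sup>μν</sup> = diag(1, −1, −1, −1), consistent with eqn 6.33.

```python
from itertools import product

# Pauli matrices sigma_1, sigma_2, sigma_3
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu nu}


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def neg(a):
    return [[-x for x in row] for row in a]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Hermitian conjugate: transpose and complex-conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


# Weyl representation, eqn 6.34: gamma^0 first, then gamma^1..gamma^3
GAMMA = [block(ZERO2, ONE2, ONE2, ZERO2)] + [
    block(ZERO2, neg(s), s, ZERO2) for s in SIGMA
]


def clifford_ok():
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} (eqn 6.35)."""
    for mu, nu in product(range(4), repeat=2):
        ab = matmul(GAMMA[mu], GAMMA[nu])
        ba = matmul(GAMMA[nu], GAMMA[mu])
        diag = 2 * METRIC[mu] if mu == nu else 0
        for i, j in product(range(4), repeat=2):
            if ab[i][j] + ba[i][j] != (diag if i == j else 0):
                return False
    return True


if __name__ == "__main__":
    print("eqn 6.35 holds:", clifford_ok())
    print("gamma^0 Hermitian:", dagger(GAMMA[0]) == GAMMA[0])
    print("gamma^i anti-Hermitian:",
          all(dagger(GAMMA[i]) == neg(GAMMA[i]) for i in (1, 2, 3)))
```

The same functions can be reused for any other candidate set of γ matrices by replacing the `GAMMA` list.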

Applying Hermitian conjugation to eqn 6.32, using properties of the γ matrices given by eqn 6.36 and after some algebra,<sup>42</sup> we get the adjoint Dirac equation

$$
\bar{\Psi}(\gamma p + m) = 0\tag{6.37}
$$

where the adjoint spinor is

$$
\bar{\Psi} \equiv \Psi^\dagger \gamma^0 \tag{6.38}
$$

and p acts on the left.

Multiplying eqn 6.32 by Ψ̄ from the left and eqn 6.37 by Ψ from the right and adding the resulting equations, we get

$$
\bar{\Psi}\gamma^{\mu}\partial\_{\mu}\Psi + (\partial\_{\mu}\bar{\Psi})\gamma^{\mu}\Psi = \partial\_{\mu}(\bar{\Psi}\gamma^{\mu}\Psi) = 0
$$

which is the continuity equation, ∂μj<sup>μ</sup> = 0, for the probability current 4-vector

$$j^{\mu} = \bar{\Psi}\gamma^{\mu}\Psi \tag{6.39}$$
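The intermediate step can be sketched explicitly. With p<sub>μ</sub> = i∂<sub>μ</sub>, and with the derivative in the adjoint equation acting on Ψ̄ to its left, the two equations after the multiplications read

$$\begin{aligned} \mathrm{i}\,\bar{\Psi}\gamma^{\mu}\partial\_{\mu}\Psi - m\bar{\Psi}\Psi &= 0 \\ \mathrm{i}\,(\partial\_{\mu}\bar{\Psi})\gamma^{\mu}\Psi + m\bar{\Psi}\Psi &= 0 \end{aligned}$$

The mass terms cancel in the sum, and what remains is i times the divergence of Ψ̄γ<sup>μ</sup>Ψ by the product rule, since the γ matrices are constant.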

The probability density

$$\rho \equiv j^0 = \bar{\Psi}\gamma^0\Psi = \sum\_{i=1}^{4} |\Psi\_i|^2 \tag{6.40}$$

is the time-like component of the probability current; it is positive-definite and has a similar form to the non-relativistic expression. Ψ and Ψ̄ may be used to form quantities with well-defined space–time transformation properties, known as bilinear covariants. The simplest is the Lorentz scalar Ψ̄Ψ.<sup>43</sup> The next simplest is Ψ̄γ<sup>μ</sup>Ψ, which transforms as a Lorentz 4-vector.

<sup>43</sup>The matrix γ<sup>0</sup> that is inside Ψ̄ swaps spinors in the bispinor such that the dotted index meets the dotted one (ξ<sup>∗</sup>η) and the undotted index meets the undotted one (η<sup>∗</sup>ξ). It should be noted that complex conjugation adds or removes a dot, so if ξ has an undotted index, ξ<sup>∗</sup> has a dotted index.

The Hamiltonian H of the Dirac equation is obtained by multiplying eqn 6.32 by γ<sup>0</sup> from the left and separating the time derivative:

$$H\Psi = \mathrm{i}\frac{\partial\Psi}{\partial t} \tag{6.41}$$

where

$$H = \boldsymbol{\alpha} \cdot \mathbf{p} + \beta m \tag{6.42}$$

and

$$\boldsymbol{\alpha} = \gamma^0 \boldsymbol{\gamma}, \qquad \beta = \gamma^0 \tag{6.43}$$

The matrices **α** and β are Hermitian and in the Weyl representation are given by<sup>44</sup>

$$\boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{6.44}$$

<sup>44</sup>We have

$$\alpha\_i\alpha\_j + \alpha\_j\alpha\_i = 2\delta\_{ij}, \qquad \beta\boldsymbol{\alpha} + \boldsymbol{\alpha}\beta = 0, \qquad \beta^2 = 1$$

The Weyl representation is very well suited to the ultrarelativistic limit, where the mass can be neglected, because the Dirac bispinor is then effectively reduced to a single Weyl spinor. In the non-relativistic limit, however, both Weyl spinor components of the Dirac bispinor contribute equally, so another representation, called the standard or Dirac representation, is more suitable. This representation is often used in introductory textbooks. The transformation from the Weyl representation to the Dirac representation is effected by the unitary transformation

$$U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

which gives

$$\Psi(\text{Dirac}) = \begin{pmatrix} \varphi \\ \chi \end{pmatrix} = U\Psi(\text{Weyl}) = U\begin{pmatrix} \xi \\ \eta \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} \xi + \eta \\ \xi - \eta \end{pmatrix} \tag{6.45}$$

The transformation of the γ matrices, γ(Dirac) = Uγ(Weyl)U<sup>−1</sup>, gives

$$\gamma^0 = \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}, \quad \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}.$$

The Dirac equation in the (standard) Dirac representation is then given by

$$\begin{aligned} E\varphi - \mathbf{p} \cdot \boldsymbol{\sigma}\chi &= m\varphi\\ -E\chi + \mathbf{p} \cdot \boldsymbol{\sigma}\varphi &= m\chi \end{aligned} \tag{6.46}$$

In the non-relativistic limit, χ → 0 and the Dirac spinor becomes effectively a two-component Pauli spinor.
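The change of representation can be verified by direct matrix multiplication. The following sketch assumes NumPy and the Weyl-representation matrices implied by eqn 6.44; it applies the unitary matrix U quoted above and compares the result with the Dirac-representation block forms. All variable names are illustrative.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Weyl-representation matrices (sign convention inferred from eqn 6.44)
gamma0_weyl = np.block([[Z2, I2], [I2, Z2]])
gammas_weyl = [np.block([[Z2, -s], [s, Z2]]) for s in sigma]

# The unitary matrix U of the text, written in 2x2 blocks
U = np.block([[I2, I2], [I2, -I2]]) / np.sqrt(2)
U_inv = U.conj().T  # inverse of a unitary matrix

# gamma(Dirac) = U gamma(Weyl) U^{-1}, and alpha = gamma^0 gamma
gamma0_dirac = U @ gamma0_weyl @ U_inv
gammas_dirac = [U @ g @ U_inv for g in gammas_weyl]
alphas_dirac = [gamma0_dirac @ g for g in gammas_dirac]

# Block forms expected in the Dirac representation
beta_expected = np.block([[I2, Z2], [Z2, -I2]])
gammas_expected = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
alphas_expected = [np.block([[Z2, s], [s, Z2]]) for s in sigma]

unitary_ok = np.allclose(U @ U_inv, np.eye(4))
gamma0_ok = np.allclose(gamma0_dirac, beta_expected)
gammas_ok = all(np.allclose(g, e) for g, e in zip(gammas_dirac, gammas_expected))
alphas_ok = all(np.allclose(a, e) for a, e in zip(alphas_dirac, alphas_expected))

print(unitary_ok, gamma0_ok, gammas_ok, alphas_ok)
```

Here U is real and symmetric, so U<sup>−1</sup> = U; the code uses the Hermitian conjugate so that the same lines would work for any unitary change of representation.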

The fact that in the relativistic theory one needs two Weyl spinors to describe the electron while in the non-relativistic world one Pauli spinor is enough is always difficult to accept. To give some insight into why this is the case, consider yet another representation, that of Foldy and Wouthuysen (FW). We start with the Dirac representation and apply a momentum-dependent unitary transformation U<sub>FW</sub> given by

$$U\_{\rm FW} = \exp\left(\frac{1}{2} \frac{\beta \boldsymbol{\alpha} \cdot \mathbf{p}}{|\mathbf{p}|} \arctan \frac{|\mathbf{p}|}{m}\right).$$

The wavefunction in the FW representation is then

$$
\Psi(\text{FW}) = \begin{pmatrix} u \\ w \end{pmatrix} = U\_{\text{FW}} \Psi(\text{Dirac}) = U\_{\text{FW}} \begin{pmatrix} \varphi \\ \chi \end{pmatrix}
$$

and after the transformation of the Hamiltonian, eqn 6.42 splits into components, becoming

$$\sqrt{\mathbf{p}^2 + m^2} \, u = \mathbf{i} \frac{\partial u}{\partial t} \, \tag{6.47}$$

$$-\sqrt{\mathbf{p}^2 + m^2} \, w = \mathbf{i}\frac{\partial w}{\partial t} \tag{6.48}$$
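The decoupling in eqns 6.47 and 6.48 can be checked numerically. The sketch below assumes NumPy and the Dirac-representation matrices given above. It uses the fact that the generator K = β**α**·**p**/|**p**| satisfies K<sup>2</sup> = −1, so the exponential reduces to cos θ + K sin θ with θ = ½ arctan(|**p**|/m). Function names and the sample momentum are illustrative choices.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Dirac-representation matrices
beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]


def alpha_dot(p):
    """Return alpha . p as a 4x4 matrix."""
    return sum(pi * a for pi, a in zip(p, alpha))


def dirac_hamiltonian(p, m):
    """Free Dirac Hamiltonian, eqn 6.42."""
    return alpha_dot(p) + m * beta


def foldy_wouthuysen(p, m):
    """U_FW = exp(theta K) = cos(theta) + K sin(theta), using K @ K = -1."""
    p_abs = np.linalg.norm(p)
    K = beta @ alpha_dot(p) / p_abs
    theta = 0.5 * np.arctan(p_abs / m)
    return np.cos(theta) * np.eye(4) + np.sin(theta) * K


p = np.array([0.3, -1.2, 0.5])  # arbitrary sample momentum
m = 1.0
E_p = np.sqrt(p @ p + m**2)

U_fw = foldy_wouthuysen(p, m)
H_fw = U_fw @ dirac_hamiltonian(p, m) @ U_fw.conj().T

unitary_ok = np.allclose(U_fw @ U_fw.conj().T, np.eye(4))
decoupled_ok = np.allclose(H_fw, E_p * beta)  # diag(+E, +E, -E, -E)

print(unitary_ok, decoupled_ok)
```

The transformed Hamiltonian is βE<sub>**p**</sub>, so the upper two components evolve with +E<sub>**p**</sub> and the lower two with −E<sub>**p**</sub>, as in eqns 6.47 and 6.48.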

We now have two decoupled equations for positive- and negative-energy solutions, respectively. But if we want to drop one of them and consider, say, only the equation for the positive energies, then there is a problem, because we would not know how to transform the positive-energy spinor without any knowledge of the other one.<sup>45</sup> In the non-relativistic limit, however, √(**p**<sup>2</sup> + m<sup>2</sup>) ≈ m + |**p**|<sup>2</sup>/2m, which leads to the Schrödinger Hamiltonian, and the Lorentz transformation becomes a Galilean one, which does not affect spin. So, in the non-relativistic limit, we can just take one equation, for example the one for the positive energy, and use it to describe a non-relativistic electron, with its spinor u being effectively the Pauli spinor. We can 'forget' about the negative-energy solution.

<sup>45</sup>We only know how to transform ξ<sup>α</sup> and η<sub>β̇</sub> and they are buried inside the u and w spinors. We would need to get ξ<sup>α</sup> and η<sub>β̇</sub> out of u and w, transform u and w back to ξ<sup>α</sup> and η<sub>β̇</sub>, do the Lorentz transformation, and transform back to get the Lorentz-transformed u and w.

## **Majorana particles**

This is a short, rather technical, detour from the main track to introduce the concept of the Majorana particle.<sup>46</sup> It is not essential for what follows. No fundamental Majorana particle has yet been discovered, although it could be that neutrinos are Majorana particles, and a composite Majorana particle has been discovered in condensed matter. From the discussion of spinors in Section 6.1.1, one might get the impression that a massless spin-1/2 particle is described by the Weyl spinor and a massive one by the Dirac spinor, which has two independent Weyl spinors as components. Although all the particles we know either fit, or could fit, this scenario, this is not the only scenario. We can start with one Weyl spinor, say, an undotted spinor ξ<sup>α</sup>, get a dotted one by complex conjugation, and then lower the index with the metric tensor to give a spinor that transforms like η<sub>β̇</sub>. We can then use these ξ- and η-like objects to construct a Dirac spinor that satisfies the Dirac equation in the Weyl representation for a massive particle, the Majorana particle, effectively defined by one Weyl spinor rather than by two independent Weyl spinors as for the electron.

<sup>46</sup>Named after Ettore Majorana, 1906–1938.

## **6.4.1 Free-particle solutions**

The Dirac equation, eqn 6.46, in the Dirac (standard) representation can be written as

$$
\begin{pmatrix} m & \mathbf{p} \cdot \boldsymbol{\sigma} \\ \mathbf{p} \cdot \boldsymbol{\sigma} & -m \end{pmatrix} \begin{pmatrix} \varphi \\ \chi \end{pmatrix} = E \begin{pmatrix} \varphi \\ \chi \end{pmatrix}
$$

which for an electron at rest simplifies to

$$
\begin{pmatrix} m & 0 \\ 0 & -m \end{pmatrix} \begin{pmatrix} \varphi \\ \chi \end{pmatrix} = \mathrm{i} \frac{\partial}{\partial t} \begin{pmatrix} \varphi \\ \chi \end{pmatrix}
$$

There are four independent, un-normalized, solutions for the energy eigenstates:

$$\begin{aligned} &\begin{pmatrix} 1\\0\\0\\0 \end{pmatrix} \mathrm{e}^{-\mathrm{i}m\tau}, \qquad \begin{pmatrix} 0\\1\\0\\0 \end{pmatrix} \mathrm{e}^{-\mathrm{i}m\tau} \qquad \text{for } E = m\\ &\begin{pmatrix} 0\\0\\1\\0 \end{pmatrix} \mathrm{e}^{+\mathrm{i}m\tau}, \qquad \begin{pmatrix} 0\\0\\0\\1 \end{pmatrix} \mathrm{e}^{+\mathrm{i}m\tau} \qquad \text{for } E = -m \end{aligned}$$

By boosting these solutions to the frame where **p** ≠ 0, we get (s = 1, 2)<sup>47</sup>

$$u^{(s)} \exp[-\mathrm{i}(E\_\mathbf{p}t - \mathbf{p} \cdot \mathbf{x})] = u^{(s)} \exp(-\mathrm{i}p \cdot x) \quad \text{for } E > 0$$

and, for the −E<sub>**p**</sub> energy eigenstate and **p** momentum eigenstate,

$$u^{(s+2)} \exp(+\mathrm{i}E\_\mathbf{p}t) \exp(\mathrm{i}\mathbf{p} \cdot \mathbf{x}) = u^{(s+2)} \exp[+\mathrm{i}(+E\_\mathbf{p}t + \mathbf{p} \cdot \mathbf{x})] \quad \text{for } E < 0$$

where

$$u^{(s)} = N \begin{pmatrix} \vartheta^{(s)} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E\_\mathbf{p} + m} \vartheta^{(s)} \end{pmatrix}, \qquad u^{(s+2)} = N \begin{pmatrix} \dfrac{-\boldsymbol{\sigma} \cdot \mathbf{p}}{E\_\mathbf{p} + m} \vartheta^{(s)} \\ \vartheta^{(s)} \end{pmatrix}$$

N is a normalization constant, E<sub>**p**</sub> = +√(**p**<sup>2</sup> + m<sup>2</sup>), and

$$\vartheta^{(1)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \vartheta^{(2)} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

<sup>47</sup>We can go back to the Weyl representation, boost ξ (eqn 6.15) and η (eqn 6.16), and return to the Dirac representation.

As in the case of the KG equation, instead of the above two solutions for E < 0 energy eigenstates,<sup>48</sup> we will follow the Feynman–Stueckelberg interpretation of negative-energy states as antiparticles that are equivalent to particles travelling backwards in time. This requires us to replace the momentum **p** by −**p** and to decide which spin state to choose in going from a particle to an antiparticle.

<sup>48</sup>In principle, we could carry on with them, as is done in a number of textbooks.

Taken together, these steps give the four independent solutions of the Dirac equation:

$$
\Psi^+(x) = u^{(s)} \mathrm{e}^{-\mathrm{i}p \cdot x}, \qquad \Psi^-(x) = v^{(s)} \mathrm{e}^{+\mathrm{i}p \cdot x} \tag{6.49}
$$

where u<sup>(s)</sup> is as above and

$$v^{(1)}(\mathbf{p}) = u^{(4)}(-\mathbf{p}), \qquad v^{(2)}(\mathbf{p}) = u^{(3)}(-\mathbf{p})\tag{6.50}$$

Both Ψ<sup>+</sup>, describing a free electron, and Ψ<sup>−</sup>, describing a free positron, are twofold-degenerate. We need a quantum number (label) to distinguish the pairs of states with the same energy. A suitable operator, commuting with the free-particle Hamiltonian, is the helicity operator

$$h(\mathbf{p}) = \begin{pmatrix} \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} & 0\\ 0 & \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|} \end{pmatrix} \tag{6.51}$$

In a convenient reference frame where **p** = (0, 0, p),

$$h(\mathbf{p}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$

and, for example,

$$u^{(1)} = N \begin{pmatrix} \begin{pmatrix} 1\\0 \end{pmatrix} \\ \frac{p}{E\_\mathrm{p} + m} \begin{pmatrix} 1\\0 \end{pmatrix} \end{pmatrix}$$

Thus u<sup>(s)</sup> (as well as v<sup>(s)</sup>) are helicity eigenstates:

$$h(\mathbf{p})u^{(1)} = +u^{(1)}, \qquad h(\mathbf{p})u^{(2)} = -u^{(2)}$$

Helicity eigenvalues correspond to the spin component along the direction of motion. Helicity + means that, in its rest frame, the electron has +1/2 spin projection on the axis parallel to **p** (likewise for helicity − and the −1/2 spin projection); ϑ<sup>(1)</sup> and ϑ<sup>(2)</sup> are also defined with respect to this axis, which has to be the z axis, given the chosen representation of the Pauli matrices.

We can see now that changing **p** to −**p** changes the direction of the quantization axis in the rest frame of the electron and the spin projection changes sign as in eqn 6.50 (see the s label).<sup>49</sup>

<sup>49</sup>See [13] for further reading.

As for the KG equation, we use covariant normalization, requiring that the integral of the probability density over the unit volume gives 2E<sub>**p**</sub> particles. This gives the normalization constant as N = √(E<sub>**p**</sub> + m).
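The free-particle spinors, their helicities, and the normalization can be checked with a few lines of linear algebra. This is a minimal sketch assuming NumPy, the Dirac representation, and N = √(E<sub>**p**</sub> + m); the function names (`u_positive`, `u_negative`, and so on) and the sample momentum along z are illustrative choices.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)


def sigma_dot(p):
    """Return sigma . p as a 2x2 matrix."""
    return sum(pi * s for pi, s in zip(p, sigma))


def energy(p, m):
    return np.sqrt(np.dot(p, p) + m**2)


def two_spinor(s):
    """Basis two-spinor theta^(s), s = 1 or 2."""
    theta = np.zeros(2, dtype=complex)
    theta[s - 1] = 1.0
    return theta


def u_positive(p, m, s):
    """u^(s): positive-energy spinor with N = sqrt(E + m)."""
    E, theta = energy(p, m), two_spinor(s)
    return np.sqrt(E + m) * np.concatenate([theta, sigma_dot(p) @ theta / (E + m)])


def u_negative(p, m, s):
    """u^(s+2): negative-energy spinor with N = sqrt(E + m)."""
    E, theta = energy(p, m), two_spinor(s)
    return np.sqrt(E + m) * np.concatenate([-sigma_dot(p) @ theta / (E + m), theta])


def hamiltonian(p, m):
    sp = sigma_dot(p)
    return np.block([[m * I2, sp], [sp, -m * I2]])


def helicity(p):
    h2 = sigma_dot(p) / np.linalg.norm(p)
    return np.block([[h2, Z2], [Z2, h2]])


p = np.array([0.0, 0.0, 0.8])  # momentum along z, as in the text
m = 1.0
E, H, h = energy(p, m), hamiltonian(p, m), helicity(p)

u1, u2 = u_positive(p, m, 1), u_positive(p, m, 2)
u3 = u_negative(p, m, 1)

energy_ok = np.allclose(H @ u1, E * u1) and np.allclose(H @ u3, -E * u3)
helicity_ok = np.allclose(h @ u1, u1) and np.allclose(h @ u2, -u2)
norm_ok = np.isclose(np.vdot(u1, u1).real, 2 * E)

print(energy_ok, helicity_ok, norm_ok)
```

The last check confirms that u<sup>†</sup>u = 2E<sub>**p**</sub> follows from N<sup>2</sup> = E<sub>**p**</sub> + m.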

## **6.4.2 Chirality ≠ helicity**

Chirality, also called handedness, is defined by a pair of projection operators<sup>50</sup>

$$P\_{\mathcal{L}} = \frac{1 - \gamma^5}{2}, \qquad P\_{\mathcal{R}} = \frac{1 + \gamma^5}{2}, \qquad \text{where} \quad \gamma^5 = \mathrm{i}\gamma^0\gamma^1\gamma^2\gamma^3 \tag{6.52}$$

They divide the space of wavefunctions into right-handed and left-handed half-spaces. P<sub>R</sub> projects a wavefunction onto the right-handed half-space, giving the right-handed component of the wavefunction. P<sub>L</sub> does the same with respect to the left-handed half-space and the left-handed component of the wavefunction.<sup>51</sup>

<sup>50</sup>These satisfy

$$\begin{aligned} P\_{\mathcal{L}} + P\_{\mathcal{R}} &= 1\\ P\_{\mathcal{L}} P\_{\mathcal{R}} &= P\_{\mathcal{R}} P\_{\mathcal{L}} = 0\\ P\_{\mathcal{L}} P\_{\mathcal{L}} &= P\_{\mathcal{L}}, \quad P\_{\mathcal{R}} P\_{\mathcal{R}} = P\_{\mathcal{R}} \end{aligned}$$

<sup>51</sup>Thus Ψ = P<sub>R</sub>Ψ + P<sub>L</sub>Ψ = Ψ<sub>R</sub> + Ψ<sub>L</sub>.

Describing the Dirac spinor in the Dirac representation (ignoring the normalization) in terms of the Weyl spinors ξ and η of the Weyl representation (see eqn 6.45), we get

$$\frac{1-\gamma^5}{2}\binom{\xi+\eta}{\xi-\eta} = \frac{1}{2}\begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix}\begin{pmatrix} \xi+\eta\\ \xi-\eta \end{pmatrix} = \begin{pmatrix} \eta\\ -\eta \end{pmatrix}$$

$$\frac{1+\gamma^5}{2}\begin{pmatrix} \xi+\eta\\ \xi-\eta \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & 1\\ 1 & 1 \end{pmatrix}\begin{pmatrix} \xi+\eta\\ \xi-\eta \end{pmatrix} = \begin{pmatrix} \xi\\ \xi \end{pmatrix}$$

We can thus define left-handed and right-handed spinors as

$$
\eta\_{\mathcal{L}} \equiv \eta, \qquad \xi\_{\mathcal{R}} \equiv \xi \tag{6.53}
$$
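The projector algebra and the identification in eqn 6.53 can be checked numerically. The sketch below assumes NumPy and the Dirac-representation γ matrices given earlier; the two-component values chosen for ξ and η are arbitrary illustrative numbers.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Dirac-representation gamma matrices
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

# Eqn 6.52
gamma5 = 1j * gamma0 @ gammas[0] @ gammas[1] @ gammas[2]
P_L = (I4 - gamma5) / 2
P_R = (I4 + gamma5) / 2

projectors_ok = (
    np.allclose(P_L + P_R, I4)
    and np.allclose(P_L @ P_R, 0)
    and np.allclose(P_L @ P_L, P_L)
    and np.allclose(P_R @ P_R, P_R)
)

# Arbitrary Weyl spinors, combined as in eqn 6.45
xi = np.array([1 + 2j, -0.5])
eta = np.array([0.3j, 2.0])
psi_dirac = np.concatenate([xi + eta, xi - eta]) / np.sqrt(2)

left_ok = np.allclose(P_L @ psi_dirac, np.concatenate([eta, -eta]) / np.sqrt(2))
right_ok = np.allclose(P_R @ psi_dirac, np.concatenate([xi, xi]) / np.sqrt(2))

print(projectors_ok, left_ok, right_ok)
```

The left projection depends only on η and the right projection only on ξ, which is the content of eqn 6.53.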

The projection operators P<sub>L</sub> and P<sub>R</sub> are important in particle physics because weak interactions are sensitive to these components.<sup>52</sup> The W boson couples only to the left-handed part of a particle wavefunction and the Z boson couples to both parts but with different strengths.

<sup>52</sup>The operators P<sub>L</sub> and P<sub>R</sub> respectively pull the dotted and undotted components out of the Dirac spinor.

In order to gain better insight, we will consider chiral components of the Dirac spinor u:

$$\frac{1-\gamma^5}{2}u = \frac{1-\gamma^5}{2}\left(\begin{array}{c} \vartheta\\ \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E\_\mathbf{p}+m}\vartheta \end{array}\right) = \frac{1}{2}\begin{pmatrix} \left(1-\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E\_\mathbf{p}+m}\right)\vartheta\\ -\left(1-\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E\_\mathbf{p}+m}\right)\vartheta \end{pmatrix}$$

where ϑ = ϑ<sup>+</sup> + ϑ<sup>−</sup> is a superposition of helicity + (ϑ<sup>+</sup>) and helicity − (ϑ<sup>−</sup>) eigenstates (not normalized). Because

$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\vartheta^{+} = \vartheta^{+}, \qquad \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\vartheta^{-} = -\vartheta^{-}$$

it follows that

$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1-\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta^{+}=0,\qquad\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1-\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta^{-}=-2\vartheta^{-}.$$

$$\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1+\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta^{+}=2\vartheta^{+},\qquad\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\left(1+\frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\vartheta^{-}=0.$$

Finally,

$$\left(1 - \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E\_\mathbf{p} + m}\right) \vartheta = \frac{a}{2} \left(1 + \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|}\right) \vartheta + \frac{b}{2} \left(1 - \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{|\mathbf{p}|}\right) \vartheta$$

where

$$a = 1 - \frac{|\mathbf{p}|}{E\_{\mathbf{p}} + m}, \qquad b = 1 + \frac{|\mathbf{p}|}{E\_{\mathbf{p}} + m}$$

gives a decomposition of the final state into helicity + and helicity − components with weights a/2 and b/2, respectively. In summary,

$$\frac{1-\gamma^5}{2}u = \frac{a}{2}(\text{helicity } +) + \frac{b}{2}(\text{helicity } -) = u\_\text{L} \quad \text{(left chiral state)}$$

$$\frac{1+\gamma^5}{2}u = \frac{b}{2}(\text{helicity } +) + \frac{a}{2}(\text{helicity } -) = u\_\text{R} \quad \text{(right chiral state)}$$
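The helicity decomposition of a chiral state can be illustrated numerically. The sketch below assumes NumPy; the momentum, mass, and two-spinor ϑ are arbitrary illustrative values, and `chiral_weights` is a hypothetical helper name. It checks the decomposition with coefficients a and b, and also evaluates the relative weight a<sup>2</sup>/(a<sup>2</sup> + b<sup>2</sup>) of the helicity + part of a left-chiral state, which works out to (1 − v)/2 with v = |**p**|/E<sub>**p**</sub> in units where c = 1.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)


def sigma_dot(p):
    """Return sigma . p as a 2x2 matrix."""
    return sum(pi * s for pi, s in zip(p, sigma))


def chiral_weights(p_abs, m):
    """Return the coefficients (a, b) defined in the text."""
    E = np.sqrt(p_abs**2 + m**2)
    x = p_abs / (E + m)
    return 1.0 - x, 1.0 + x


p = np.array([0.4, -0.3, 1.1])  # arbitrary sample momentum
m = 1.0
p_abs = np.linalg.norm(p)
E = np.sqrt(p_abs**2 + m**2)
h = sigma_dot(p) / p_abs  # two-component helicity operator

theta = np.array([0.6, 0.8j])  # arbitrary two-spinor
theta_plus = 0.5 * (I2 + h) @ theta   # helicity + part
theta_minus = 0.5 * (I2 - h) @ theta  # helicity - part

a, b = chiral_weights(p_abs, m)
lhs = (I2 - sigma_dot(p) / (E + m)) @ theta
rhs = a * theta_plus + b * theta_minus
decomposition_ok = np.allclose(lhs, rhs)

plus_fraction = a**2 / (a**2 + b**2)
fraction_ok = np.isclose(plus_fraction, 0.5 * (1.0 - p_abs / E))

print(decomposition_ok, fraction_ok)

# a shrinks towards 0 and b grows towards 2 as the speed approaches c
for speed in (0.1, 0.9, 0.999):
    p_s = m * speed / np.sqrt(1.0 - speed**2)
    print(speed, chiral_weights(p_s, m))
```

The loop shows how the helicity + admixture in the left-chiral state dies away as the particle becomes ultrarelativistic.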

It should be clear now that chirality and helicity are two different things. However, in the limit

$$\text{speed} \to c \quad \Rightarrow \quad a \to 0 \quad \text{and} \quad b \to 2.$$

chirality and helicity become identical, which is the subject of the next section. Before that, we will explore consequences, relevant at energies where masses cannot be neglected, of chirality and helicity being different, for example affecting weak decay rates of particles. As an example, we will consider π<sup>−</sup> → μ<sup>−</sup> + ν̄<sub>μ</sub> decay.

In the rest frame of the π<sup>−</sup>, the momenta of μ<sup>−</sup> and ν̄<sub>μ</sub> are back to back and the helicities are as indicated in Fig. 6.4. The ν̄<sub>μ</sub> is in the helicity + state,<sup>53</sup> with its spin along its momentum. The μ<sup>−</sup> has to be in the helicity + state to conserve total angular momentum, since the pion has spin zero. But the W<sup>−</sup> couples to the left-handed component of the μ<sup>−</sup> wavefunction and therefore the μ<sup>−</sup> needs, simultaneously, to have helicity + (to conserve angular momentum) and left-handed chirality (for the W<sup>−</sup> to couple). The probability for this is

$$\frac{|a|^2}{|a|^2 + |b|^2} = \frac{1}{2} \left( 1 - \frac{\text{speed}}{c} \right).$$

We can see that as the speed tends to c, the decay rate tends to 0, so the π<sup>−</sup> cannot decay to a massless μ<sup>−</sup>. This explains why, although favoured by the energy phase-space factor (not considered here), the decay rate for π<sup>−</sup> → e<sup>−</sup> + ν̄<sub>e</sub> is much smaller than that for μ<sup>−</sup> + ν̄<sub>μ</sub>.<sup>54</sup>

**Fig. 6.4** π<sup>−</sup> → μ<sup>−</sup> + ν̄<sub>μ</sub> decay.

<sup>53</sup>We are not going to consider the helicity − state, since such a state has never been observed. A neutrino with non-zero mass is either described by a Dirac spinor with two Weyl spinors contributing but with one sterile helicity state or as a massive Majorana particle having only one helicity state.

<sup>54</sup>See Exercise 6.4.

## **6.4.3 Helicity conservation and interactions via currents**

It is worth repeating the conclusion of the previous section that chirality (handedness) is different from helicity and that this has consequences at low energies where masses cannot be neglected. At high energies, where masses can be neglected, helicity and chirality can be treated as identical and in many textbooks this is the working assumption right from the start.

It can be shown (see e.g. [84]) that in the probability current

$$j^{\mu} = \bar{u}\gamma^{\mu}u = (\bar{u}\_{\mathcal{L}} + \bar{u}\_{\mathcal{R}})\gamma^{\mu}(u\_{\mathcal{L}} + u\_{\mathcal{R}}) = \bar{u}\_{\mathcal{L}}\gamma^{\mu}u\_{\mathcal{L}} + \bar{u}\_{\mathcal{R}}\gamma^{\mu}u\_{\mathcal{R}}$$

no cross terms like ū<sub>L</sub>γ<sup>μ</sup>u<sub>R</sub> are present. In the high-energy limit where masses can be neglected,

$$\begin{aligned} \frac{1}{2}(1-\gamma^5)u &= u\_{\mathcal{L}} \simeq u\_{\mathcal{L}}^- & \text{is the helicity } - \text{ eigenstate} \\\frac{1}{2}(1+\gamma^5)u &= u\_{\mathcal{R}} \simeq u\_{\mathcal{R}}^+ & \text{is the helicity } + \text{ eigenstate} \end{aligned}$$

and therefore the probability current does not contain any helicity-mixing terms like ū<sub>L</sub><sup>−</sup>γ<sup>μ</sup>u<sub>R</sub><sup>+</sup>. We will now discuss the significance of this.
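The absence of chirality-mixing terms in the vector current can be confirmed numerically for a massive spinor. The sketch below assumes NumPy, the Dirac representation, and the normalization N = √(E<sub>**p**</sub> + m); the sample momentum and two-spinor are arbitrary, and helper names such as `current` are illustrative.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Dirac-representation gamma matrices gamma^0 ... gamma^3
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gamma = [gamma0] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
P_L, P_R = (I4 - gamma5) / 2, (I4 + gamma5) / 2


def adjoint(psi):
    """Adjoint spinor psi-bar = psi^dagger gamma^0 (eqn 6.38)."""
    return psi.conj() @ gamma0


def current(psi_a, psi_b):
    """Four components of psi_a-bar gamma^mu psi_b."""
    return np.array([adjoint(psi_a) @ g @ psi_b for g in gamma])


# A massive positive-energy spinor with an arbitrary unit two-spinor
p = np.array([0.4, -0.3, 1.1])
m = 1.0
E = np.sqrt(p @ p + m**2)
theta = np.array([0.6, 0.8j])
sp = sum(pi * s for pi, s in zip(p, sigma))
u = np.sqrt(E + m) * np.concatenate([theta, sp @ theta / (E + m)])

u_L, u_R = P_L @ u, P_R @ u

cross_terms_vanish = (
    np.allclose(current(u_L, u_R), 0) and np.allclose(current(u_R, u_L), 0)
)
sum_rule_ok = np.allclose(current(u, u), current(u_L, u_L) + current(u_R, u_R))

print(cross_terms_vanish, sum_rule_ok)
```

The cancellation relies only on γ<sup>5</sup> anticommuting with every γ<sup>μ</sup>, so it holds for any mass; it is the further identification of chirality with helicity that requires the high-energy limit.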

In classical electromagnetism, two parallel wires carrying electric currents interact with each other with a force proportional to the product of the currents. We can try a similar idea to describe scattering of particles. Multiplying the probability current of a free electron by its electric charge will give us a 4-vector with time-like component representing the charge density and space-like component representing the number of electric charges crossing unit area per unit time, i.e. the electric current corresponding to moving electrons. If we now take that current and another one, for, say, a free muon, then we can expect that the dot product of these currents will have something to do with the electromagnetic interaction of these particles. Multiplying that by a propagator<sup>55</sup> (which contains the g<sub>μν</sub> tensor for the above dot product) allows for momentum transfer between the interacting particles and indeed gives the matrix element of the lowest-order approximation<sup>56</sup> to the scattering amplitude. That matrix element can be visualized by a Feynman diagram (see Fig. 6.5).

It is a property of the Standard Model (SM) that interactions between any two SM fermions can be described, in a leading approximation, by the current–current interaction as outlined above, although in the case of the weak interactions the left- and right-handed parts have to be treated separately because the weak interaction bosons couple to them differently (see Chapter 7). Since in the probability current there are no helicity-mixing terms like ū<sub>L</sub><sup>−</sup>γ<sup>μ</sup>u<sub>R</sub><sup>+</sup>, there are only two fundamental SM vertices, as illustrated in Fig. 6.6, where f stands for a fermion and the wiggly line of the exchanged particle represents one of the SM vector bosons: the photon, W<sup>+</sup>, W<sup>−</sup>, Z<sup>0</sup>, or any of the eight gluons.

**Fig. 6.5** (a) Feynman diagram for e<sup>−</sup> + μ<sup>−</sup> → e<sup>−</sup> + μ<sup>−</sup>. This is in fact the sum of two diagrams. The vertical wiggly line representing the exchanged particle propagator is the sum of two scenarios depending on which particle was the emitter and which was the absorber of the exchanged particle (e.g. a photon or Z<sup>0</sup>) as sketched in (b). Time goes from left to right.

<sup>55</sup>For an introduction to propagators, see Chapter 1 and, for example, [84] for greater detail.

<sup>56</sup>In this context, 'lowest order' refers to a perturbative expansion in powers of the coupling constant—here α2/4π.

The helicity is the same before and after the scattering (coupling to the exchanged particle). One says that the helicity is conserved, remaining unchanged by the interaction.

The annihilation or pair creation vertices are obtained from the scattering ones by crossing symmetry (see e.g. [13] or [84]). Keeping in mind that in Fig. 6.6, time is going from left to right, we 'cross' the incoming particle to the other side of the reaction equation by inverting its 4-momentum and swapping the helicity state so that it travels backwards in time, representing the outgoing antiparticle with the opposite helicity travelling forwards in time<sup>57</sup> (Fig. 6.6):

$$\Psi^+(x) = u^{(1)}(\mathbf{p})\mathrm{e}^{-\mathrm{i}p \cdot x} \to u^{(4)}(-\mathbf{p})\mathrm{e}^{+\mathrm{i}p \cdot x} = v^{(1)}(\mathbf{p})\mathrm{e}^{+\mathrm{i}p \cdot x}$$

<sup>57</sup>The arrow on the fermion line indicates in which direction with respect to the time arrow the fermion specified by the label at the end of the line is travelling.

Crossing symmetry allows us to obtain the annihilation amplitude, for example for e<sup>+</sup> + e<sup>−</sup> → μ<sup>+</sup> + μ<sup>−</sup>, from the scattering one, for e<sup>−</sup> + μ<sup>−</sup> → e<sup>−</sup> + μ<sup>−</sup>, by 'crossing' the relevant particles to the other side of the reaction equation by changing in the scattering amplitude the corresponding 4-momenta p → −p, the helicities, and the spinors u → v, and, something that is beyond the formalism we are using, putting in 'by hand' the minus sign in front of the whole amplitude (QFT is needed for that). This transformation also affects the description of the propagator. In the scattering amplitude, we have, in the example considered, 4-momenta p<sub>1</sub> and p<sub>2</sub> for the incoming and outgoing electrons, respectively, and therefore the 4-momentum of the exchanged boson is the difference p<sub>1</sub> − p<sub>2</sub>, which, dotted with itself, gives t ≡ (p<sub>1</sub> − p<sub>2</sub>) · (p<sub>1</sub> − p<sub>2</sub>); for that reason, the scattering is called the t-channel process. With the change p<sub>2</sub> → −p<sub>2</sub>, t → s ≡ (p<sub>1</sub> + p<sub>2</sub>) · (p<sub>1</sub> + p<sub>2</sub>), which is the square of the centre-of-mass energy; the resulting annihilation reaction is therefore called the s-channel process.<sup>58</sup>

<sup>58</sup>There is also a u-channel process; see e.g. [84].

## **6.4.4** *P* **,** *T* **, and a comment on** *C*

By construction, the Dirac equation is covariant with respect to Lorentz transformations and space inversion. It is also covariant with respect to time inversion. Space inversion P : **r** → −**r** and time inversion T : t → −t are discussed in this section. We make a comment about charge conjugation C : particle → antiparticle at the end of this section.

Consider two observers with reference frames O and O′. They are describing the same system or physical process, using their coordinates and wavefunctions Ψ(x) and Ψ′(x′), respectively. The coordinates are related by a linear coordinate transformation (x<sup>ν</sup>)′ = a<sup>ν</sup><sub>μ</sub>x<sup>μ</sup>, where a<sup>ν</sup><sub>μ</sub> could be the Lorentz transformation or a space or time inversion. The gamma matrices in their Dirac equations are also changed by that transformation, but, after a lot of algebra, it can be shown that both observers can neglect differences between their gamma matrices.<sup>59</sup> The covariance of the Dirac equation requires that the wavefunctions transform as<sup>60</sup>

$$
\Psi'(x') = \Psi'(ax) = S(a)\Psi(x) = S(a)\Psi(a^{-1}x')
$$

where S(a) is a matrix, S<sup>−1</sup>(a) exists, and S<sup>−1</sup>(a) = S(a<sup>−1</sup>). One says that the coordinate transformation a induces a transformation S(a) in the space of wavefunctions: Ψ′(x′) = S(a)Ψ(x).

<sup>59</sup>For proofs of all the results mentioned in this section, see e.g. [52].

<sup>60</sup>It can also be shown that if the wavefunctions transform this way, then the Dirac equation is covariant with respect to this coordinate transformation.

## **Space inversion** *P*

The space inversion transformation is

$$a = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$

We want to find S(a) satisfying Ψ′(x′) = S(a)Ψ(x). Considering, for example, Ψ<sup>+</sup>(x) = u<sup>(1)</sup>e<sup>−ip·x</sup> and following the expectation from classical physics that **r** → −**r** makes **r**′ = −**r**, **p**′ = −**p** and σ′ = σ (the angular momentum axial vector), we obtain

$$\begin{aligned} (\Psi^{+})'(x') &= \begin{pmatrix} \vartheta^{(1)} \\ \dfrac{\boldsymbol{\sigma}' \cdot \mathbf{p}'}{E\_\mathbf{p} + m} \vartheta^{(1)} \end{pmatrix} \exp[-\mathrm{i}(E\_\mathbf{p}t - \mathbf{p}' \cdot \mathbf{x}')] \\ &= \begin{pmatrix} \vartheta^{(1)} \\ -\dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E\_\mathbf{p} + m} \vartheta^{(1)} \end{pmatrix} \exp[-\mathrm{i}(E\_\mathbf{p}t - \mathbf{p} \cdot \mathbf{x})] \\ &= \gamma^{0} \Psi^{+}(x) \end{aligned}$$

giving S = γ<sup>0</sup>, up to a fixed phase, which is set to 1 by convention. We could get that result immediately by looking at the transformation properties of Weyl spinors (see eqn 6.9) and the transformation from the Weyl to the Dirac representation (eqn 6.45).

Eigenstates of the parity operator S = γ<sup>0</sup> are states of defined intrinsic parity. Positive-energy states in the particle rest frame,

$$u^{(1)}\mathbf{e}^{-\mathrm{i}m\tau}, \qquad u^{(2)}\mathbf{e}^{-\mathrm{i}m\tau}$$

are eigenstates of γ<sup>0</sup> with eigenvalue +1, i.e. they have intrinsic parity +1, but negative-energy states

$$v^{(1)}\mathbf{e}^{+\mathrm{i}m\tau}, \qquad v^{(2)}\mathbf{e}^{+\mathrm{i}m\tau}$$

are eigenstates with eigenvalue −1, thus having intrinsic parity −1. We can state, then, that the intrinsic parity of a spin-1/2 particle (defined in its rest frame) is the negative of the intrinsic parity of its antiparticle. This applies to higher-spin fermions as well. Free-particle states with **p** ≠ 0 are not eigenstates of the parity operator γ<sup>0</sup>.
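These parity statements can be checked directly on the free-particle spinors. The sketch below assumes NumPy, the Dirac representation, and N = √(E<sub>**p**</sub> + m); it labels the negative-energy spinors as u<sup>(s+2)</sup>, which at rest coincide with the v<sup>(s)</sup> spinors up to relabelling. Function names and the sample momentum are illustrative.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
gamma0 = np.block([[I2, Z2], [Z2, -I2]])  # parity operator S = gamma^0


def sigma_dot(p):
    return sum(pi * s for pi, s in zip(p, sigma))


def spinors(p, m, s):
    """Return (u^(s), u^(s+2)) for momentum p, with N = sqrt(E + m)."""
    E = np.sqrt(np.dot(p, p) + m**2)
    theta = np.zeros(2, dtype=complex)
    theta[s - 1] = 1.0
    small = sigma_dot(p) @ theta / (E + m)
    norm = np.sqrt(E + m)
    return (norm * np.concatenate([theta, small]),
            norm * np.concatenate([-small, theta]))


m = 1.0
rest = np.zeros(3)
p = np.array([0.4, -0.3, 1.1])  # arbitrary non-zero momentum

# Rest frame: intrinsic parity +1 (positive energy), -1 (negative energy)
rest_parity_ok = all(
    np.allclose(gamma0 @ spinors(rest, m, s)[0], spinors(rest, m, s)[0])
    and np.allclose(gamma0 @ spinors(rest, m, s)[1], -spinors(rest, m, s)[1])
    for s in (1, 2)
)

# Moving particle: gamma^0 maps u(p) to u(-p), so u(p) is not an eigenstate
u_p, u_minus_p = spinors(p, m, 1)[0], spinors(-p, m, 1)[0]
flips_momentum = np.allclose(gamma0 @ u_p, u_minus_p)
not_eigenstate = not (
    np.allclose(gamma0 @ u_p, u_p) or np.allclose(gamma0 @ u_p, -u_p)
)

print(rest_parity_ok, flips_momentum, not_eigenstate)
```

Since (γ<sup>0</sup>)<sup>2</sup> = 1, the only possible eigenvalues are ±1, so ruling out both is enough to show that a moving spinor is not a parity eigenstate.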

## **Time inversion** *T*

For time inversion,

$$a = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

The derivation of S(a) here is more complicated than for space inversion. We consider only one aspect and then give the result. From classical physics, we expect that t → −t makes **r**′ = **r**, **p**′ = −**p**, and t′ = −t, and therefore

$$\begin{split} \exp[-\mathrm{i}(E\_\mathbf{p}t' - \mathbf{p}' \cdot \mathbf{x}')] &= \exp[-\mathrm{i}(-E\_\mathbf{p}t + \mathbf{p} \cdot \mathbf{x})] \\ &= \{\exp[-\mathrm{i}(E\_\mathbf{p}t - \mathbf{p} \cdot \mathbf{x})]\}^\* \end{split}$$

so complex conjugation is involved. The full derivation gives Ψ′(t′) = S(a)Ψ(t) = iγ<sup>1</sup>γ<sup>3</sup>Ψ<sup>∗</sup>(t), again up to an arbitrary fixed phase, which, by convention, is taken to be 1.

It is useful to consider how S(T) ≡ S(a) acts on a free-particle state, for example

$$\begin{aligned} S(T) &\begin{pmatrix} \vartheta^{(1)} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E\_\mathbf{p} + m} \vartheta^{(1)} \end{pmatrix} \exp[-\mathrm{i}(E\_\mathbf{p}t - \mathbf{p} \cdot \mathbf{x})] \\ &= -\mathrm{i} \begin{pmatrix} \vartheta^{(2)} \\ \dfrac{\boldsymbol{\sigma} \cdot \mathbf{p}'}{E\_\mathbf{p} + m} \vartheta^{(2)} \end{pmatrix} \exp[-\mathrm{i}(E\_\mathbf{p}t' - \mathbf{p}' \cdot \mathbf{x}')] \end{aligned}$$

Note that ϑ<sup>(1)</sup> → ϑ<sup>(2)</sup>, but the helicity does not flip as well because the direction of the momentum changes too, so a positive-helicity state remains a positive-helicity state in the time-reversed system.

S(T) is anti-unitary, S<sup>2</sup>(T) = −1, and this has interesting consequences. If we have interactions that are invariant with respect to time inversion, then S(T) commutes with the Hamiltonian. If Ψ is an eigenstate of the Hamiltonian, then S(T)Ψ is also an eigenstate of the Hamiltonian, with the same energy. But if S(T)Ψ = ζΨ, where ζ is a phase, then

$$S(T)S(T)\Psi = S(T)\zeta\Psi = \zeta^\*\zeta\Psi = \Psi$$

in contradiction to S<sup>2</sup>(T) = −1. For that reason, S(T)Ψ is a different state to Ψ; there is at least a twofold degeneracy, known as Kramers' degeneracy. A spin-1/2 particle has a natural twofold degeneracy due to the two spin projections (2j + 1 states). A static magnetic field can lift that degeneracy by coupling to the particle's magnetic dipole moment, but a static magnetic field is not invariant with respect to time inversion, so nothing is wrong with having non-degenerate states in the magnetic field. The situation is different if the particle is put into a static electric field, which is invariant with respect to time inversion. If the particle has an electric dipole moment, the electric field will couple to it and would lift the 2j + 1 degeneracy, shifting the energy levels up or down depending on the electric dipole orientation. Kramers' degeneracy forbids this because shifted energy states are required to be at least twofold-degenerate. So, unless there is an extra degree of freedom to guarantee this, electric dipole moments are forbidden by Kramers' degeneracy. A molecule of water has a large electric dipole moment, but in molecular or atomic systems, there are extra energy-degenerate (or nearly degenerate) states that allow this (see e.g. [126]).
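The properties of the time-inversion operator used in this argument can be checked numerically. The sketch below assumes NumPy, the Dirac representation, and the convention S(T)Ψ = iγ<sup>1</sup>γ<sup>3</sup>Ψ<sup>∗</sup> quoted above; function names and sample values are illustrative. It acts on the spinor part only, leaving the plane-wave phase aside.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
Z2 = np.zeros((2, 2), dtype=complex)
gammas = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]  # gamma^1..gamma^3


def sigma_dot(p):
    return sum(pi * s for pi, s in zip(p, sigma))


def time_reverse(psi):
    """S(T) psi = i gamma^1 gamma^3 psi^* (anti-unitary: note the conjugate)."""
    return 1j * gammas[0] @ gammas[2] @ psi.conj()


def u_positive(p, m, s):
    """u^(s) with N = sqrt(E + m)."""
    E = np.sqrt(np.dot(p, p) + m**2)
    theta = np.zeros(2, dtype=complex)
    theta[s - 1] = 1.0
    return np.sqrt(E + m) * np.concatenate([theta, sigma_dot(p) @ theta / (E + m)])


def helicity(p):
    h2 = sigma_dot(p) / np.linalg.norm(p)
    return np.block([[h2, Z2], [Z2, h2]])


# S(T)^2 = -1 on an arbitrary four-component wavefunction
psi = np.array([1 + 2j, -0.5, 0.3j, 2.0])
squares_to_minus_one = np.allclose(time_reverse(time_reverse(psi)), -psi)

# Action on a free-particle spinor: theta^(1) -> -i theta^(2), p -> -p
p = np.array([0.0, 0.0, 0.8])
m = 1.0
u1 = u_positive(p, m, 1)
reversed_u1 = time_reverse(u1)
action_ok = np.allclose(reversed_u1, -1j * u_positive(-p, m, 2))

# Helicity is unchanged: + with respect to p before, + with respect to -p after
helicity_kept = (
    np.allclose(helicity(p) @ u1, u1)
    and np.allclose(helicity(-p) @ reversed_u1, reversed_u1)
)

print(squares_to_minus_one, action_ok, helicity_kept)
```

The first check is the property S<sup>2</sup>(T) = −1 on which the Kramers degeneracy argument rests.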

In a series of beautiful experiments, pioneered by N. F. Ramsey [120, 121], using a beam of ultra-cold neutrons coming from a nuclear reactor at the Institut Laue–Langevin (ILL) in Grenoble, the absolute value of the neutron electric dipole moment d was measured to be smaller than 2.9 × 10<sup>−26</sup> e cm at 90% confidence level.<sup>61</sup>

<sup>61</sup>The SM does predict a very tiny violation of time-inversion symmetry, leading to the prediction of a very tiny d, much smaller than this limit. A number of extensions of the SM have been rejected because they predicted d to be larger than the current limit.

## **A comment on** *C*

For every particle, there is a corresponding antiparticle (and vice versa). This symmetry of nature is called charge-conjugation symmetry. It has nothing to do with Lorentz invariance and is the same in all inertial frames. Charge conjugation leads to a unitary operator C in RQF (all states have positive energies in RQF). In RQM, one can construct an anti-unitary operator transforming a positive-energy state into a negative-energy state (spinor u → spinor v). Because it is anti-unitary, i.e. different from the common unitary charge-conjugation operator of RQF, we will not spend time deriving it, to avoid possible confusion.

We will, however, briefly outline basic properties of charge conjugation (see e.g. [118]). C changes the sign of the electric charge, magnetic moment, baryon number, and lepton number. Dynamical quantities like the energy, momentum, and helicity are left unchanged. Except for the weak interaction, all the interactions obey charge-conjugation symmetry. Eigenstates of charge conjugation are neutral particles like the photon, π<sup>0</sup>, η, and ρ<sup>0</sup>, with eigenvalue +1 or −1. It is −1 for the photon and therefore +1 for a two-photon state, and so, as the π<sup>0</sup> has eigenvalue +1, the decay π<sup>0</sup> → γγ is allowed. However, charge-conjugation symmetry forbids π<sup>0</sup> from decaying into three photons (eigenvalue −1). For particles built out of two fermions, f f̄, the eigenvalue is (−1)<sup>l+S</sup>, where S is the total spin of the f f̄ state and l is the relative orbital angular momentum of the fermions. The e<sup>+</sup>e<sup>−</sup> bound state, positronium, decays to two photons if it is in the singlet state (S = 0, l = 0) and to three photons if it is in the triplet state (S = 1, l = 0).
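The selection rule for positronium is a matter of multiplying signs, as the short sketch below illustrates. It is a minimal sketch with illustrative function names, and it encodes only the charge-conjugation bookkeeping described above; it says nothing about kinematics (for instance, decay to a single photon is excluded by momentum conservation, which is not modelled here).

```python
def c_parity_fermion_pair(l, s):
    """C eigenvalue (-1)^(l + S) of a fermion-antifermion state."""
    return (-1) ** (l + s)


def c_parity_photons(n):
    """C eigenvalue of an n-photon state: each photon contributes -1."""
    return (-1) ** n


def decay_allowed_by_c(l, s, n_photons):
    """True if C conservation allows the pair state to decay to n photons."""
    return c_parity_fermion_pair(l, s) == c_parity_photons(n_photons)


# Singlet positronium (S = 0, l = 0) and triplet positronium (S = 1, l = 0)
for label, (l, s) in {"singlet": (0, 0), "triplet": (0, 1)}.items():
    allowed = [n for n in (2, 3) if decay_allowed_by_c(l, s, n)]
    print(label, "C =", c_parity_fermion_pair(l, s), "allowed photon numbers:", allowed)
```

The same functions reproduce the π<sup>0</sup> statement: a state with C = +1 matches the two-photon state and not the three-photon one.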

In conclusion, we note that the weak interaction violates the symmetry of each of the three discrete transformations P, T, and C, as well as any superposition of any two of them. For all other interactions, each of the three discrete transformations is a symmetry of the interaction. But all known interactions, including the weak interaction, obey CPT symmetry.<sup>62</sup> In RQF, it is impossible to construct Lorentz-invariant interactions that would violate CPT symmetry.

<sup>62</sup> It should be noted that in RQF, CPT is an anti-unitary operator.

## **6.4.5 Electromagnetic interactions and the non-relativistic limit**

So far, we have considered only the free-particle Dirac equation. The interaction of an electron (or indeed any electrically charged spin-½ structureless fermion) with a classical electromagnetic field is introduced through so-called 'minimal coupling', following the prescription by which the electromagnetic interactions are introduced in classical mechanics. The physics of this prescription will be discussed in Section 6.5.

The classical electromagnetic field is given by a 4-vector potential A<sup>μ</sup> = (φ, **A**) and its interaction is introduced into the Dirac equation by modifications of the derivatives:

$$\mathbf{i}\frac{\partial}{\partial t} \rightarrow \mathbf{i}\frac{\partial}{\partial t} - q\phi$$

$$-\mathbf{i}\nabla \rightarrow -\mathbf{i}\nabla - q\mathbf{A}$$

where q < 0 is the electron charge. The Dirac equation then becomes

$$\left(\mathbf{i}\frac{\partial}{\partial t} - q\phi\right)\Psi(\mathbf{r}, t) = [\mathbf{\alpha}\cdot(-\mathbf{i}\nabla - q\mathbf{A}) + \beta m]\Psi(\mathbf{r}, t)\tag{6.54}$$

Solutions of this equation exist for some cases of the electromagnetic field (see e.g. [96]). We will find the non-relativistic limit of eqn 6.54 by applying an iterative method of approximation.

At low energies, the mass m of a particle is the main part of the energy, so we will factor that part out of the wavefunction:

$$\Psi(\mathbf{r},t) = \mathbf{e}^{-imt} \begin{pmatrix} \psi\_{\mathrm{U}}(\mathbf{r},t) \\ \psi\_{\mathrm{L}}(\mathbf{r},t) \end{pmatrix}.$$

ψ<sub>U</sub>(**r**, t) and ψ<sub>L</sub>(**r**, t) are known as the upper and lower components of the Dirac spinor. They contain all 'non-rest' energy information relevant for the non-relativistic limit. Inserting this Ψ into eqn 6.54, we find

$$\mathrm{i}\frac{\partial\psi\_{\rm U}}{\partial t} = \boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\psi\_{\rm L} + q\phi\psi\_{\rm U} \tag{6.55}$$

$$\mathrm{i}\frac{\partial\psi\_{\mathrm{L}}}{\partial t} = \boldsymbol{\sigma}\cdot(\mathbf{p}-q\mathbf{A})\psi\_{\mathrm{U}} + q\phi\psi\_{\mathrm{L}} - 2m\psi\_{\mathrm{L}}\tag{6.56}$$

Equation 6.56 may be rearranged to give

$$
\psi\_{\rm L} = \frac{\boldsymbol{\sigma} \cdot (\mathbf{p} - q\mathbf{A})}{2m} \psi\_{\rm U} - \frac{\mathrm{i}\frac{\partial}{\partial t} - q\phi}{2m} \psi\_{\rm L} \tag{6.57}
$$

Because m is relatively large, ψ<sub>L</sub> is small compared with ψ<sub>U</sub>. In the first-order approximation, we neglect the last term in eqn 6.57 and take

$$
\psi\_{\mathrm{L}} \simeq \psi\_{\mathrm{L}1} = \frac{\boldsymbol{\sigma} \cdot (\mathbf{p} - q\mathbf{A})}{2m} \psi\_{\mathrm{U}1}
$$

Inserting this into eqn 6.55, we obtain the first-order approximation ψ<sub>U1</sub> for ψ<sub>U</sub>:

$$\mathrm{i}\frac{\partial\psi\_{\mathrm{U1}}}{\partial t} = q\phi\psi\_{\mathrm{U1}} + \frac{[\mathbf{\sigma}\cdot(\mathbf{p}-q\mathbf{A})][\mathbf{\sigma}\cdot(\mathbf{p}-q\mathbf{A})]}{2m}\psi\_{\mathrm{U1}}$$

Using vector identities,<sup>63</sup> this equation can be written as

$$\mathrm{i}\frac{\partial \psi\_{\mathrm{U1}}}{\partial t} = q\phi\psi\_{\mathrm{U1}} + \frac{(\mathbf{p} - q\mathbf{A})^2}{2m}\psi\_{\mathrm{U1}} - \frac{q}{2m}\boldsymbol{\sigma} \cdot (\boldsymbol{\nabla} \times \mathbf{A} + \mathbf{A} \times \boldsymbol{\nabla})\psi\_{\mathrm{U1}}$$

and finally as

$$\mathrm{i}\frac{\partial\psi\_{\mathrm{U1}}}{\partial t} = H\_{\mathrm{P}}\psi\_{\mathrm{U1}}$$

where

$$H\_{\mathrm{P}} \equiv q\phi + \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} - \frac{q}{2m}\boldsymbol{\sigma} \cdot \mathbf{B}$$

is the Pauli Hamiltonian for the Schrödinger equation and **B** is the magnetic field. Since the Hamiltonian of the interaction of a magnetic dipole moment μ with an external magnetic field **B** is −μ · **B**, we can identify (q/2m)σ with the electron magnetic dipole moment μ. On the other hand, the electron magnetic moment is related to the electron spin by μ = −g ½ σ μ<sub>B</sub>, where g is the proportionality factor and μ<sub>B</sub> = |q|/2m is the Bohr magneton (q is the electron charge). From this, we find that g = 2, which was a triumph for Dirac and his equation.

<sup>63</sup> Namely,

$$\begin{aligned} (\boldsymbol{\sigma} \cdot \mathbf{A})(\boldsymbol{\sigma} \cdot \mathbf{A}) &= \mathbf{A} \cdot \mathbf{A} + \mathrm{i}\boldsymbol{\sigma} \cdot (\mathbf{A} \times \mathbf{A}) \\ (\boldsymbol{\nabla} \times \mathbf{A} + \mathbf{A} \times \boldsymbol{\nabla})f &= f(\boldsymbol{\nabla} \times \mathbf{A}) = f\mathbf{B} \end{aligned}$$

where in the first identity **A** stands for any vector operator (here **p** − q**A**), for which **A** × **A** need not vanish.
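The Pauli-matrix identity behind this step, (σ · **a**)(σ · **b**) = **a** · **b** + iσ · (**a** × **b**), can be checked numerically for ordinary (commuting) vectors. The sketch below is a sanity check only, written with plain nested tuples and hypothetical helper names; it does not reproduce the operator case in the text, where the components of **p** − q**A** do not commute and the cross product gives the σ · **B** term.

```python
# Numerical check of (s.a)(s.b) = a.b 1 + i s.(a x b) for number-valued vectors.
# 2x2 complex matrices are nested tuples; all helper names are illustrative.
I2 = ((1, 0), (0, 1))
SIGMA = (
    ((0, 1), (1, 0)),      # sigma_x
    ((0, -1j), (1j, 0)),   # sigma_y
    ((1, 0), (0, -1)),     # sigma_z
)

def mat_add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def mat_scale(c, a):
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))

def mat_mul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def sigma_dot(v):
    """Return the matrix sigma . v for a 3-vector v of numbers."""
    out = ((0, 0), (0, 0))
    for component, s in zip(v, SIGMA):
        out = mat_add(out, mat_scale(component, s))
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def identity_residual(a, b):
    """Largest |entry| of (s.a)(s.b) - [a.b 1 + i s.(a x b)]."""
    lhs = mat_mul(sigma_dot(a), sigma_dot(b))
    rhs = mat_add(mat_scale(dot(a, b), I2), mat_scale(1j, sigma_dot(cross(a, b))))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# The residual should vanish up to floating-point rounding.
print(identity_residual((0.3, -1.2, 2.0), (1.5, 0.4, -0.7)))
```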

The value of g was known from atomic physics measurements, but until the advent of the Dirac equation, there was no explanation why the experimental value was as it was. In fact, the value of g for the electron, as well as for the muon, is not exactly 2. The small difference is explained by RQF. The value of (g − 2)/2 has been measured with fantastic precision: 2.8 × 10<sup>−13</sup> for the electron [85] and 6 × 10<sup>−10</sup> for the muon [49]. As for the precision measurements of the neutron electric dipole moment, in these cases also the precision of the measurements and theoretical calculations give constraints on extensions of the SM.

It should be noted that the magnetic dipole moment of the proton is quite different from that predicted by the Dirac equation. The value of g is about 5.6 instead of about 2 (using the proper magneton for the proton). For the neutron, it is even more surprising: instead of 0 because the electric charge is 0, g is about −3.8. These significant deviations from the predictions of the Dirac equation, which are applicable for point-like particles, were the first indications that protons and neutrons are not point-like particles. As we now know, they have a complicated substructure of quarks and gluons.

We can carry on with the iterative procedure and get relativistic corrections beyond the Pauli Hamiltonian. The next one is obtained by substituting ψ<sub>L1</sub> for ψ<sub>L</sub> in the last term of eqn 6.57, thus giving the second-order approximation:

$$\psi\_{\mathrm{L}} \simeq \psi\_{\mathrm{L}2} = \frac{\boldsymbol{\sigma} \cdot (\mathbf{p} - q\mathbf{A})}{2m} \psi\_{\mathrm{U}} - \frac{\left(\mathrm{i}\frac{\partial}{\partial t} - q\phi\right) [\boldsymbol{\sigma} \cdot (\mathbf{p} - q\mathbf{A})]}{4m^2} \psi\_{\mathrm{U}}$$

Unfortunately, inserting this into eqn 6.55 gives a Hamiltonian that is not Hermitian and the electron acquires an imaginary electric dipole moment. Formal problems of this type took many years to solve after Dirac published his equation. Eventually, a consistent second-order Hermitian Hamiltonian was found:

$$\begin{aligned} H &= \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} - \frac{|\mathbf{p}|^4}{8m^3} \\ &+ q\phi - \frac{q}{2m}\mathbf{\sigma} \cdot \mathbf{B} \\ &- \frac{\mathbf{i}q}{8m^2} \mathbf{p} \cdot \mathbf{E} \\ &- \left[ \frac{\mathbf{i}q}{8m^2} \mathbf{\sigma} \cdot (\nabla \times \mathbf{E}) + \frac{q}{4m^2} \mathbf{\sigma} \cdot (\mathbf{E} \times \mathbf{p}) \right] \end{aligned}$$

The first line in this expression for the Hamiltonian represents the kinetic term with its first relativistic correction, the third the Darwin term, and the fourth the spin–orbit interaction (where **E** is the electric field). For **E** = −∇φ and a spherically symmetric potential φ, σ · (∇ × **E**) = 0 and the spin–orbit term takes the familiar form

$$\begin{aligned} H\_{\rm SO} &= -\frac{q}{4m^2} \mathbf{\sigma} \cdot (\mathbf{E} \times \mathbf{p}) \\ &= \frac{q}{4m^2} \mathbf{\sigma} \cdot \frac{\partial \phi}{\partial r} \frac{\mathbf{r}}{r} \times \mathbf{p} \\ &= \frac{q}{4m^2} \frac{1}{r} \frac{\partial \phi}{\partial r} \mathbf{\sigma} \cdot \mathbf{l} \end{aligned}$$

where **l** is the orbital angular momentum of the electron. It should be noted that Thomas precession is automatically included, as it should be, in the relativistic formalism.

Historically, the first formally successful derivation of the non-relativistic limit of the Dirac equation that included interactions with the (classical) electromagnetic field was obtained via the FW transformation. It is interesting that one has to introduce the interaction first into the Dirac equation in the Weyl or Dirac representation (or any other representation related to them by a transformation not depending on the momentum) and only then make the FW transformation. One might think that doing the FW transformation for the free-particle Dirac equation first, thus decoupling the lower and upper components of the Dirac spinor, and then introducing the interactions via 'minimal coupling' would work, but it does not.
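As a rough numerical illustration of the statement that the lower component is small at low energies, the sketch below compares the first-order estimate p/2m (the field-free limit of the expression for ψ<sub>L1</sub>) with the exact free-particle ratio p/(E + m) that appears in the free positive-energy spinor. The electron mass in MeV and the sample momenta are illustrative inputs, and the function name is hypothetical.

```python
import math

def lower_to_upper_ratio(p, m):
    """Return (exact, first_order) size of psi_L relative to psi_U, free particle.

    Natural units (c = 1): p and m must be given in the same energy units.
    """
    energy = math.sqrt(p * p + m * m)
    exact = p / (energy + m)      # from the free positive-energy spinor
    first_order = p / (2.0 * m)   # psi_L1 with A = 0
    return exact, first_order

m_e = 0.511  # electron mass in MeV (illustrative input)
for p in (0.005, 0.05, 0.5):  # momenta in MeV
    exact, approx = lower_to_upper_ratio(p, m_e)
    print(f"p = {p} MeV: exact = {exact:.5f}, first order = {approx:.5f}")
```

The two numbers agree closely while p ≪ m and drift apart as p approaches m, which is where the higher-order corrections discussed above start to matter.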

## **6.5 Gauge symmetry**

By considering transformations of space–time, one gets a relativistic description of free particles. In order to describe their interactions, one needs to consider transformations, gauge transformations, in another space, an internal space. Symmetries of the gauge transformations in that internal space are at the heart of the SM. Before discussing gauge symmetries, we need to revise the three essential ingredients of the formalism.

## **6.5.1 Covariant derivative**

Consider two coordinate systems as sketched in Fig. 6.7. The spherical basis vectors are related to the Cartesian ones in the following way:

$$\mathbf{e}\_r = \frac{\partial x}{\partial r}\mathbf{e}\_x + \frac{\partial y}{\partial r}\mathbf{e}\_y = \cos\varphi\,\mathbf{e}\_x + \sin\varphi\,\mathbf{e}\_y,\qquad |\mathbf{e}\_r| = 1$$

$$\mathbf{e}\_{\varphi} = \frac{\partial x}{\partial \varphi}\mathbf{e}\_x + \frac{\partial y}{\partial \varphi}\mathbf{e}\_y = -r\sin\varphi\,\mathbf{e}\_x + r\cos\varphi\,\mathbf{e}\_y, \qquad |\mathbf{e}\_{\varphi}| = r$$

**Fig. 6.7** Cartesian and spherical coordinate systems.

Next introduce a constant vector field **a** of unit length, as sketched in Fig. 6.8, using both coordinate systems. In the Cartesian basis,

$$\mathbf{a} = 1\,\mathbf{e}\_x + 0\,\mathbf{e}\_y = a^x \,\mathbf{e}\_x + a^y \,\mathbf{e}\_y$$

and in the spherical basis,

$$\mathbf{a} = \cos\varphi \,\,\mathbf{e}\_r - \frac{1}{r}\sin\varphi \,\,\mathbf{e}\_\varphi = a^r \,\mathbf{e}\_r + a^\varphi \,\,\mathbf{e}\_\varphi$$

Although

$$\frac{\partial a^x}{\partial x} = \frac{\partial a^x}{\partial y} = \frac{\partial a^y}{\partial x} = \frac{\partial a^y}{\partial y} = 0$$

in the spherical basis,

$$\frac{\partial a^r}{\partial \varphi} = -\sin \varphi \neq 0, \qquad \frac{\partial a^\varphi}{\partial \varphi} = -\frac{1}{r} \cos \varphi \neq 0$$

which looks wrong because the field **a** is constant—nothing is changing from point to point. It is wrong because the differentiation has not taken into account that the spherical basis vectors change from one space point to another. If we take this into account, differentiating not only coordinates but also basis vectors, everything is fine. For example,

$$\begin{split} \frac{\partial \mathbf{a}}{\partial \varphi} &= \frac{\partial}{\partial \varphi} \left( \cos \varphi \, \mathbf{e}\_r - \frac{1}{r} \sin \varphi \, \mathbf{e}\_\varphi \right) \\ &= \frac{\partial}{\partial \varphi} (\cos \varphi) \, \mathbf{e}\_r + \cos \varphi \, \frac{\partial}{\partial \varphi} (\mathbf{e}\_r) \\ &\quad + \frac{\partial}{\partial \varphi} \left( -\frac{1}{r} \sin \varphi \right) \mathbf{e}\_\varphi - \frac{1}{r} \sin \varphi \, \frac{\partial}{\partial \varphi} (\mathbf{e}\_\varphi) \\ &= - \sin \varphi \, \mathbf{e}\_r + \cos \varphi \left( \frac{1}{r} \, \mathbf{e}\_\varphi \right) - \frac{1}{r} \cos \varphi \, \mathbf{e}\_\varphi - \frac{1}{r} \sin \varphi \left( -r \mathbf{e}\_r \right) \\ &= 0 \end{split}$$

So, in general, for a vector **V**,

$$\frac{\partial \mathbf{V}}{\partial x^{\beta}} = \frac{\partial V^{\alpha}}{\partial x^{\beta}} \mathbf{e}\_{\alpha} + V^{\alpha} \frac{\partial \mathbf{e}\_{\alpha}}{\partial x^{\beta}}$$

The last derivative, ∂**e**<sub>α</sub>/∂x<sup>β</sup>, is a vector and can be described as a linear combination of the basis vectors **e**<sub>α</sub>:

$$\frac{\partial \mathbf{e}\_{\alpha}}{\partial x^{\beta}} = \Gamma^{\mu}\_{\alpha \beta} \mathbf{e}\_{\mu}$$

The geometrical object Γ<sup>μ</sup><sub>αβ</sub> is called a connection. Changing the names of the indices in the above equation, μ → α and α → μ, we can write

$$\frac{\partial \mathbf{V}}{\partial x^{\beta}} = \left(\frac{\partial V^{\alpha}}{\partial x^{\beta}} + V^{\mu} \Gamma^{\alpha}\_{\mu \beta}\right) \mathbf{e}\_{\alpha}$$

This is the covariant derivative, which takes care of the changing coordinates as well as the basis vectors.
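The example of the constant field **a** can be repeated numerically. The sketch below reads the connection coefficients off the derivatives of the basis vectors given earlier (∂**e**<sub>r</sub>/∂φ = **e**<sub>φ</sub>/r, ∂**e**<sub>φ</sub>/∂φ = −r**e**<sub>r</sub>, and, by the same differentiation, ∂**e**<sub>φ</sub>/∂r = **e**<sub>φ</sub>/r), takes the partial derivatives of the components by central differences, and adds the connection term. The index convention (0 for r, 1 for φ), the step size, and the function names are assumptions made for this illustration.

```python
import math

def components(r, phi):
    """Components (a^r, a^phi) of the constant field a = e_x."""
    return math.cos(phi), -math.sin(phi) / r

def connection(r):
    """Non-zero Gamma^mu_{alpha beta}, keyed as (mu, alpha, beta); 0 = r, 1 = phi."""
    return {
        (1, 0, 1): 1.0 / r,  # d e_r / d phi   = (1/r) e_phi
        (1, 1, 0): 1.0 / r,  # d e_phi / d r   = (1/r) e_phi
        (0, 1, 1): -r,       # d e_phi / d phi = -r e_r
    }

def covariant_derivative(r, phi, beta, h=1e-6):
    """Components of dV^alpha/dx^beta + V^mu Gamma^alpha_{mu beta} for V = a."""
    forward, backward = [r, phi], [r, phi]
    forward[beta] += h
    backward[beta] -= h
    v = components(r, phi)
    v_plus, v_minus = components(*forward), components(*backward)
    gamma = connection(r)
    result = []
    for alpha in range(2):
        partial = (v_plus[alpha] - v_minus[alpha]) / (2 * h)  # central difference
        correction = sum(v[mu] * gamma.get((alpha, mu, beta), 0.0) for mu in range(2))
        result.append(partial + correction)
    return result

print(covariant_derivative(2.0, 0.7, beta=1))  # derivative along phi
print(covariant_derivative(2.0, 0.7, beta=0))  # derivative along r
```

Both results vanish to within the finite-difference error, whereas the partial derivatives of the components alone do not.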

**Fig. 6.8** A constant vector field **a**.

## **6.5.2 Gauge invariance in electromagnetism**

In classical electromagnetism, because ∇ · **B** = 0, we can introduce a vector potential **A**, such that the magnetic field **B** = ∇ × **A**. Then

$$
\nabla \times \mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{B}}{\partial t} = 0 \quad \text{can be written as} \quad \nabla \times \left( \mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} \right) = 0
$$

allowing the introduction of a scalar potential φ, such that

$$\mathbf{E} + \frac{1}{c} \frac{\partial \mathbf{A}}{\partial t} = -\nabla \phi$$

which, after rearrangement, gives the electric field as

$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$$

If we now take an arbitrary but differentiable scalar function λ(**r**, t) and make the transformation

$$\mathbf{A} \rightarrow \mathbf{A}' = \mathbf{A} + \nabla \lambda \tag{6.58}$$

then **B** remains unchanged. If, simultaneously with this transformation, we make the change

$$
\phi \to \phi' = \phi - \frac{1}{c} \frac{\partial \lambda}{\partial t} \tag{6.59}
$$

then **E** will also stay unchanged. The transformations in eqns 6.58 and 6.59 are called gauge transformations,<sup>64</sup> and the fact that electric and magnetic fields stay the same is called gauge invariance.

<sup>64</sup> In manifestly Lorentz-covariant form (and with c = 1),

$$A^{\mu} \to A^{\prime \mu} = A^{\mu} - \partial^{\mu} \lambda$$

## **6.5.3 The Aharonov–Bohm effect**

Following Feynman et al. [75], we will examine two-slit electron diffraction. To study the Aharonov–Bohm effect, we use the set-up sketched in Fig. 6.9. With no current in the solenoid, electrons are diffracted by the slits and form an interference pattern on the screen. As soon as current is flowing through the solenoid, the interference pattern changes. The probability amplitude ψ<sub>1</sub> for an electron to follow path 1 and the amplitude ψ<sub>2</sub> for path 2 are modified in the following way:

$$
\psi\_1 = \psi\_{01} \exp(\mathrm{i}S\_1), \qquad \psi\_2 = \psi\_{02} \exp(\mathrm{i}S\_2)
$$

where ψ<sub>01</sub> and ψ<sub>02</sub> are the amplitudes without the current, q is the electron charge, and S<sub>1</sub> and S<sub>2</sub> are extra phases due to the presence of the current in the solenoid, which are given by

$$S\_1 = \frac{q}{\hbar} \int\_{\text{path1}} \mathbf{A} \cdot \mathrm{d}\mathbf{r}, \qquad S\_2 = \frac{q}{\hbar} \int\_{\text{path2}} \mathbf{A} \cdot \mathrm{d}\mathbf{r}$$

The modification of the relative phase on the screen is then

$$S\_1 - S\_2 = \frac{q}{\hbar} \left( \int\_{\text{path1}} \mathbf{A} \cdot \mathrm{d}\mathbf{r} - \int\_{\text{path2}} \mathbf{A} \cdot \mathrm{d}\mathbf{r} \right) = \frac{q}{\hbar} \oint\_{\begin{subarray}{c} \text{closed} \\ \text{path} \end{subarray}} \mathbf{A} \cdot \mathrm{d}\mathbf{r}$$

which, by Stokes' theorem, is proportional to the magnetic flux in the solenoid.

For an infinitely long thin (and therefore not obstructing the slits) solenoid, there is no magnetic field **B** outside the solenoid. The experimentally observed modification of the interference pattern [60] is therefore due to the vector potential **A**, present outside the solenoid, with a magnitude inversely proportional to the distance from the solenoid axis. Classically, the magnetic field **B** and the vector potential **A** are equivalent, in the sense that one can use either. This is not true at the quantum level, where **A** is apparently more fundamental. It should be noted that the same applies to the scalar potential φ and the electric field **E**, where instead of space dimensions one considers time.
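The role of the closed-path integral can be illustrated numerically. The sketch below assumes the standard vector potential outside an ideal thin solenoid on the z axis, with magnitude Φ/(2πr) and direction along the azimuth (consistent with the 1/r behaviour stated above), and integrates **A** · d**r** around circular loops. The flux value, loop positions, and function names are illustrative choices.

```python
import math

def vector_potential(x, y, flux):
    """A outside an ideal thin solenoid on the z axis: |A| = flux / (2 pi r)."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))

def loop_integral(centre, radius, flux, steps=2000):
    """Approximate the closed line integral of A . dr around a circle."""
    cx, cy = centre
    dt = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = k * dt
        x, y = cx + radius * math.cos(t), cy + radius * math.sin(t)
        dx, dy = -radius * math.sin(t) * dt, radius * math.cos(t) * dt
        ax, ay = vector_potential(x, y, flux)
        total += ax * dx + ay * dy
    return total

flux = 1.0  # magnetic flux in arbitrary units
print(loop_integral((0.3, -0.2), 1.0, flux))  # loop encloses the solenoid
print(loop_integral((3.0, 0.0), 1.0, flux))   # loop does not enclose it
```

A loop that encloses the solenoid returns the flux, whatever its shape or position, while a loop that does not enclose it returns zero, even though **B** = 0 everywhere along both loops. Multiplying the enclosed flux by q/ħ gives the relative phase between the two paths.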

The final conclusion that we take from the Aharonov–Bohm effect is the observation that the vector potential **A** is related to the space-dependent phase of the probability amplitude and the scalar potential φ is related to the time-dependent phase. So, from the phase of the probability amplitude, we can get the 4-vector potential, and the other way around.

## **6.5.4 Interactions from gauge symmetry**

The electromagnetic interactions have been introduced into the Dirac equation following the classical 'minimal coupling' procedure, which required modification of the derivatives:

$$\frac{\partial}{\partial t} \to D^0 \equiv \frac{\partial}{\partial t} + \mathrm{i}q\phi, \qquad \nabla \to \mathbf{D} \equiv \nabla - \mathrm{i}q\mathbf{A}$$

which can be written in manifestly Lorentz-covariant form as

$$D^{\mu} \equiv \partial^{\mu} + \mathrm{i}qA^{\mu} \tag{6.60}$$

The derivative D<sup>μ</sup> is the covariant derivative and we can start thinking about the scalar potential φ and the vector potential **A** as connections in a space to be defined. The outcome is the Dirac equation for the electron interacting with a classical electromagnetic field represented by the 4-vector potential (φ, **A**):

$$\left(\mathrm{i}\frac{\partial}{\partial t} - q\phi\right)\Psi(\mathbf{r}, t) = \left[\boldsymbol{\alpha}\cdot(-\mathrm{i}\nabla - q\mathbf{A}) + \beta m\right]\Psi(\mathbf{r}, t)\tag{6.61}$$

We know that the potentials are not unique and we can perform the gauge transformations 6.58 and 6.59 without affecting Maxwell's equations and their solutions. Would the same apply to the Dirac equation 6.61? The answer is negative. A solution of eqn 6.61 will differ from that obtained by solving eqn 6.61 after the transformations 6.58 and 6.59. But if, simultaneously with 6.58 and 6.59, we also make the transformation

$$
\Psi(\mathbf{r},t) \to \Psi'(\mathbf{r},t) = \exp[\mathrm{i}q\lambda(\mathbf{r},t)]\Psi(\mathbf{r},t) \tag{6.62}
$$

then the Dirac equation 6.61 will be covariant with respect to the three combined gauge transformations 6.58, 6.59, and 6.62, and the solution will describe the same physics as that of eqn 6.61 before the transformations. We see that changes to the 4-vector potential affect the space–time-dependent phase of the wavefunction; it needs to be changed accordingly as well. As in the Aharonov–Bohm effect, they are connected.
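The covariance can be verified in one line per derivative. As a sketch, take Ψ′ = e<sup>iqλ</sup>Ψ together with eqns 6.58 and 6.59 (with c = 1, as in eqn 6.61) and apply the product rule:

$$\begin{aligned}
\left(\mathrm{i}\frac{\partial}{\partial t} - q\phi'\right)\Psi' &= \mathrm{e}^{\mathrm{i}q\lambda}\left(\mathrm{i}\frac{\partial}{\partial t} - q\frac{\partial\lambda}{\partial t} - q\phi + q\frac{\partial\lambda}{\partial t}\right)\Psi = \mathrm{e}^{\mathrm{i}q\lambda}\left(\mathrm{i}\frac{\partial}{\partial t} - q\phi\right)\Psi \\
\left(-\mathrm{i}\nabla - q\mathbf{A}'\right)\Psi' &= \mathrm{e}^{\mathrm{i}q\lambda}\left(-\mathrm{i}\nabla + q\nabla\lambda - q\mathbf{A} - q\nabla\lambda\right)\Psi = \mathrm{e}^{\mathrm{i}q\lambda}\left(-\mathrm{i}\nabla - q\mathbf{A}\right)\Psi
\end{aligned}$$

The terms generated by differentiating the phase cancel against the shifts of the potentials. Since the matrices α and β do not act on the phase factor, both sides of eqn 6.61 acquire the same overall factor e<sup>iqλ</sup>, which drops out.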

Now comes the crucial step. We reverse the flow of arguments. We first demand that we want interactions that are invariant with respect to the transformation 6.62. For that to happen, we need to modify the derivatives of the free-particle Dirac equation in such a way that the symmetry is obeyed. This requires a move from space–time derivatives to covariant derivatives and the introduction of the 4-vector potential (φ, **A**). So, demanding gauge symmetry with respect to the gauge transformation 6.62, we get the classical electromagnetic field with which our particle is interacting. If, in addition, in all formulae, for example the one for the probability current, we replace derivatives by covariant derivatives, they will also be form-invariant.

The last step is to introduce the internal space with its basis vectors on which the connection (the 4-vector potential) operates and extend the formalism to other interactions beyond electromagnetism.<sup>65</sup> We can imagine that a particle, moving in space–time, is carrying its internal space with it as sketched in Fig. 6.10. In mathematical language, the internal space is called a fibre and the whole structure, which locally is a product of the fibre and space–time, is called a fibre bundle. In the case of electromagnetism, the fibre is a circle, as sketched in Fig. 6.11.

<sup>65</sup> The formalism will only be outlined here. For more details, see [111].

**Fig. 6.11** The internal space of electromagnetism. Adapted from [111].

Each point on the circle, parameterized by a real function λ(**r**, t), corresponds to a complex number with modulus 1, i.e. a one-dimensional unitary matrix exp[iqλ(**r**, t)]. The set of all such matrices forms a group called U(1), with '1' for one-dimensionality. When the particle moves from one space–time point to another, the phase of its wavefunction is modified (eqn 6.62) and the 4-vector potential (φ, **A**) is modified as well (eqns 6.59 and 6.58). A basis in the space U(1) consists of one particular matrix, for example the unit matrix (in this case the real number 1), which corresponds to λ = 0.

When the particle moves in space–time, λ(**r**, t) changes corresponding to the changing basis.<sup>66</sup> The change of basis is represented by the connection, which in turn gives us the change in the 4-vector potential (φ, **A**), eqns 6.59 and 6.58. So our 'differential' equation is: the connection equals the 4-vector potential.

<sup>66</sup> With respect to four coordinates, requiring four derivatives: one time and three space directions.

We are ready now to extend the formalism to include other interactions. What will happen if in eqn 6.62, instead of the real function λ(**r**, t), we insert a matrix M(**r**, t)?<sup>67</sup> Suppose that M belongs to SU(2), the group of 2 × 2 complex unitary matrices with unit determinant (as indicated by the letter S ≡ 'special'). SU(2) has 3 basis vectors, which can be the Pauli matrices σ<sub>x</sub>, σ<sub>y</sub>, and σ<sub>z</sub>. Each element of the group can then be defined by 3 real numbers λ<sup>1</sup>, λ<sup>2</sup>, and λ<sup>3</sup>, and can be represented as a point within a sphere of radius 2π in 3 dimensions. So our particle carries with itself such a sphere, its internal space, and the interaction that we get in this way will be the weak interaction. Generalizing eqn 6.62, we require gauge symmetry with respect to the following transformation:

$$\Psi(\mathbf{r},t) \to \Psi'(\mathbf{r},t) = \exp\left[\mathrm{i}q\sum\_{k=1}^{3}\lambda^{k}(\mathbf{r},t)\sigma\_{k}\right]\Psi(\mathbf{r},t) \tag{6.63}$$

<sup>67</sup> Noting that

$$\exp(\mathcal{M}) \equiv 1 + \mathcal{M} + \frac{1}{2}\mathcal{M}^2 + \dots$$

The operator acting on the wavefunction Ψ is now a series of 2×2 matrices and therefore our wavefunction Ψ gets one extra dimension and is represented by two components (for historical reasons called projections of the isotopic spin) related to the SU(2) internal space. In order to get the 4-vector potential of the weak interaction, we consider the infinitesimal change in the wavefunction when the space–time point is changed from (**r**, t) to (**r** + d**r**, t + dt). All components of the wavefunction are affected, but to get the 4-vector potential we select only the connection part of the covariant derivative, i.e. the change in the basis vectors in the internal space. That change is represented by the changes in λ<sup>1</sup>, λ<sup>2</sup>, and λ<sup>3</sup>, giving, finally,

$$A^{\mu} = \sum\_{k=1}^{3} [\partial^{\mu} \lambda^{k}(\mathbf{r}, t)] \sigma\_{k} \tag{6.64}$$

Therefore, if we want to introduce the weak interactions into the free-particle Dirac equation, we must change space–time derivatives to covariant derivatives, including the connection, i.e. the 4-vector potential given by eqn 6.64.<sup>68</sup>

<sup>68</sup> The 4-vector potential is a 2 × 2 matrix in the isotopic spin space generated by the Pauli matrices.

The strong interaction is introduced in a similar way. The internal space is now SU(3), a group of unitary complex matrices with unit determinant. There are 8 basis vectors in that space, and therefore 8 real numbers identify each matrix belonging to the group. The Pauli matrices generating the weak interactions are replaced by these 8 basis matrices of the SU(3) group and the wavefunction gains extra degrees of freedom, becoming a 3-dimensional column vector in the colour space of the strong interaction. The corresponding 4-vector potential becomes a 3×3 matrix. Changing derivatives to covariant derivatives causes the particle to interact with the classical colour field.

After quantization of the classical weak and strong fields, we have three spin-1 bosons for the weak interaction and 8 spin-1 gluons for the strong interaction. All of them are massless at this stage. To get massive physical Z<sup>0</sup>, W<sup>+</sup> and W<sup>−</sup> bosons, we need an extra mechanism, like the Higgs mechanism that will be discussed in Chapter 12. Elements of U(1) commute among themselves, but those of SU(2) or SU(3) do not. The physical consequences are that photons do not interact with each other but weak-interaction bosons and strong-interaction gluons do.

For further reading, [13] is particularly recommended.
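The statement that elements of U(1) commute while those of SU(2) do not can be made concrete with two explicit group elements. The sketch below uses the closed form exp(iθσ) = cos θ + i sin θ σ, valid for a single Pauli matrix because σ<sup>2</sup> = 1; the angles and helper names are arbitrary choices for illustration.

```python
import cmath
import math

SIGMA_X = ((0, 1), (1, 0))
SIGMA_Z = ((1, 0), (0, -1))

def su2_element(theta, sigma):
    """exp(i theta sigma) = cos(theta) 1 + i sin(theta) sigma, using sigma^2 = 1."""
    c, s = math.cos(theta), math.sin(theta)
    return tuple(
        tuple((c if i == j else 0) + 1j * s * sigma[i][j] for j in range(2))
        for i in range(2)
    )

def mat_mul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def commutator_size(a, b):
    """Largest |entry| of ab - ba."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return max(abs(ab[i][j] - ba[i][j]) for i in range(2) for j in range(2))

# U(1): the group elements are phases, i.e. ordinary complex numbers.
u1, u2 = cmath.exp(0.4j), cmath.exp(1.1j)
print(abs(u1 * u2 - u2 * u1))

# SU(2): elements generated by different Pauli matrices.
g1, g2 = su2_element(0.4, SIGMA_X), su2_element(1.1, SIGMA_Z)
print(commutator_size(g1, g2))
```

The first number is zero and the second is not; for these two elements the commutator is 2i sin(0.4) sin(1.1) σ<sub>y</sub>.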

## **Chapter summary**


## **Further reading**


representations is recommended for readers who are theory-oriented.


## **Exercises**


$$\left(p\_0 + \sqrt{\mathbf{p}^2 + m^2}\right) \left(p\_0 - \sqrt{\mathbf{p}^2 + m^2}\right)$$

and apply only the second bracket to give

$$(p\_0 - \sqrt{\mathbf{p}^2 + m^2})\psi = 0$$

The difficulty is that quantum mechanics gives operator expressions for p<sub>0</sub> and p<sub>i</sub>, which in the Schrödinger representation are first-order derivatives in time and space, respectively. The spatial derivatives are now under a square root. Dirac solved the problem by writing

$$
\sqrt{\mathbf{p}^2 + m^2} = \alpha\_1 p\_1 + \alpha\_2 p\_2 + \alpha\_3 p\_3 + \beta m
$$

where

$$p\_r = -\mathbf{i}\hbar \frac{\partial}{\partial x\_r} \quad (r = 1, 2, 3), \qquad p\_0 = \mathbf{i}\hbar \frac{\partial}{\partial t}$$

Show that to satisfy E<sup>2</sup> = p<sup>2</sup> + m<sup>2</sup>, it is necessary that

$$\begin{aligned} \alpha\_i \alpha\_j + \alpha\_j \alpha\_i &= 2\delta\_{ij}, \\ \alpha\_i \beta + \beta \alpha\_i &= 0 \end{aligned}$$

(i, j = 1, 2, 3), with β<sup>2</sup> = 1.

Dirac showed that the simplest realization of these constraints is given by 4-dimensional matrices. It should be noted that at this stage of the historical development, the wavefunction ψ had unknown Lorentz transformation properties; they had to be derived next.

(6.3) The covariant Dirac matrices are defined by

$$
\gamma^0 = \beta, \qquad \gamma^i = \beta \alpha\_i
$$

so that γ · p is a 4-vector scalar product, often written as p̸, with the Dirac equation then being (p̸ − m)ψ = 0. Using the definitions of the Dirac matrices given in Section 6.4, follow the text to derive the defining relation

$$
\gamma^{\mu}\gamma^{\nu} + \gamma^{\nu}\gamma^{\mu} = 2g^{\mu\nu} \tag{6.65}
$$


The free-electron positive-energy solutions of the Dirac equation in the standard representation are

$$\psi\_s(\mathbf{r},t) = N \begin{pmatrix} \chi^s \\ \frac{\boldsymbol{\sigma} \cdot \mathbf{p}}{E+m} \chi^s \end{pmatrix} \mathrm{e}^{\mathrm{i}(\mathbf{p} \cdot \mathbf{r} - Et)/\hbar}$$

where s = 1, 2, χ<sup>1</sup> = (1, 0)<sup>T</sup>, and χ<sup>2</sup> = (0, 1)<sup>T</sup>. Find a normalization factor N and explain why these wavefunctions are not normalized to one particle per unit volume.

The operator for spin projection on the x axis is

$$
\Sigma\_1 = \frac{1}{2} \begin{pmatrix} \sigma\_1 & 0 \\ 0 & \sigma\_1 \end{pmatrix},
$$

Find its eigenvalues and eigenvectors, and show that in the non-relativistic limit one can form linear combinations of ψ<sub>1</sub>(**r**, t) and ψ<sub>2</sub>(**r**, t) that are eigenvectors of Σ<sub>1</sub> but at relativistic energies this is impossible.

(6.6) Given that Ψ(x) = u(p)e<sup>−ip·x</sup> is a solution of the Dirac equation, where ū(p)u(p) = 2m, derive expressions for u(p) with p = (E, 0, 0, p<sub>z</sub>). Here, p is the 4-momentum, p · p = m<sup>2</sup>, and x is the 4-displacement.

What are the u(p) spinors in the ultrarelativistic limit? What are their helicities? Why is the Dirac representation more suitable in the low-energy limit?

(6.7) A free-particle solution of the Dirac equation is Ψ(x) = C(p)e<sup>−ip·x</sup>, where p = (E, 0, 0, p<sub>3</sub>). Find the normalized E > 0 spinors C<sub>+</sub>(p) and C<sub>−</sub>(p) for positive- and negative-helicity states in the standard Dirac representation and using covariant normalization.

The operator for spin projection on the x axis is

$$
\Sigma\_1 = \frac{1}{2} \begin{pmatrix} \sigma\_1 & 0 \\ 0 & \sigma\_1 \end{pmatrix},
$$

Show that neither C<sub>+</sub>(p) nor C<sub>−</sub>(p) is an eigenstate of Σ<sub>1</sub>.

The expectation value of Σ<sub>1</sub> is

$$
\langle \Sigma\_1 \rangle = \frac{1}{2E} C^\dagger(p) \Sigma\_1 C(p),
$$

where C = C<sub>+</sub>(p) cos α + C<sub>−</sub>(p) sin α, with α a real constant, and normalization is to unit volume. Calculate ⟨Σ<sub>1</sub>⟩ and interpret the result in the non-relativistic and ultrarelativistic limits for α = ±π/4.

(6.8) Give the expression for the total energy operator H, including the electromagnetic potential (A<sup>0</sup>, **A**), for a particle of mass m.

Introducing the upper and lower bispinor components (ϕ, χ) of ψ and writing H = m + H<sub>n</sub>, derive an expression for H<sub>n</sub>ψ. Using H<sub>n</sub>ψ, show that, under certain conditions to be specified, χ ∼ (velocity/c) × ϕ, for a stationary state Hψ = Eψ.

Under these conditions, show how the Dirac equation reduces to the non-relativistic Schrödinger–Pauli equation for a spin-½ particle with energy E<sub>n</sub> = E − m. Identify the term giving the interaction of the particle's spin with the magnetic field **B** = ∇ × **A** and comment on its magnitude.

[The identity (σ · **a**)(σ · **b**) = **a** · **b** + iσ · (**a** × **b**) may be assumed.]

# **7 Weak interactions**

This chapter introduces the theory of weak interactions and electroweak unification. The experimental evidence for this theory will be reviewed in Chapter 8. We give a description of electroweak interactions with emphasis on the key physics ingredients. More mathematical treatments (graduate-level textbooks) are listed in Further Reading at the end of the chapter. The Higgs mechanism and supporting experimental evidence will be covered in Chapter 12.

The weak force is responsible for the β decay of nuclei, for the decay of the lightest hadrons such as the pion, kaon, and neutron, and for neutrino interactions. The weak interaction violates spatial parity conservation and facilitates flavour-changing interactions. The quanta of the weak force are the massive W<sup>±</sup> and Z<sup>0</sup>. Weak interactions mediated by the W<sup>±</sup> are termed 'charged-current' (CC) and those by the Z<sup>0</sup> 'neutral-current' (NC). In this chapter, the evidence for the 'universality' of the weak interaction is outlined and its importance explained. Figure 7.1 shows neutron β decay at the quark level; it is a CC decay. At this energy scale (∼ 1 GeV), the W will hardly propagate and it is safe to approximate the propagator g<sub>w</sub><sup>2</sup>/(M<sub>W</sub><sup>2</sup> − q<sup>2</sup>) by a constant, which we will see is closely related to the Fermi constant G<sub>F</sub> (the concept of a force propagator is outlined at the start of Chapter 9).

The proposal by Glashow, Weinberg, and Salam (independently) that the weak and electromagnetic interactions should be unified was a key step for particle physics. At first sight, this seems difficult to achieve because the photon is massless but the weak bosons are not; the Higgs mechanism for mass generation is crucial for this. The discovery of 'weak neutral current' interactions in neutrino scattering was the first indication that these ideas might be correct. The discovery in 1983 by the UA1 and UA2 experiments at the Spp̄S collider at CERN of the W<sup>±</sup> and Z<sup>0</sup> particles with masses close to the predicted values was a seminal moment in the development of the Standard Model.

Other topics covered in this chapter are


Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg 2016. Published in 2016 by Oxford University Press.


**Fig. 7.1** Neutron β decay at the quark level, a CC weak decay. Note that the spectator ud pair in the n and p is omitted.

Much of the early work on weak interactions focused on understanding the seeming complexity of particle decays. Experiments made possible by the high energies of the LEP and LHC colliders have enabled the richness and underlying symmetry of electroweak interactions to be understood at a more profound level. After a brief review of the Fermi theory of β decay, the chapter is divided into three main sections dealing with weak interactions of leptons, weak interactions including quarks, and electroweak unification, respectively.

## **7.1 Fermi theory**

The original theory of nuclear β decay, by Fermi, uses his 'Golden Rule' of quantum mechanics to calculate rates. For example, the decay rate of the neutron (n → p e<sup>−</sup> ν̄<sub>e</sub>) is given by

$$w = \frac{2\pi}{\hbar}|M|^2 \frac{\text{d}N}{\text{d}E} \tag{7.1}$$

where M is the matrix element given by

$$M = \int \psi_{\rm f}^{*} \, O \psi_{\rm i} \, \mathrm{d}V = \int \psi_{p}^{*} \psi_{e}^{*} \psi_{\nu}^{*} O \psi_{n} \, \mathrm{d}V \tag{7.2}$$

M involves the initial- and final-state wavefunctions ψ<sub>i</sub> and ψ<sub>f</sub>, respectively, and O is an operator describing the interaction, which was assumed to be local, with a strength given by G<sub>F</sub>. This amounts to ignoring any propagator of the weak force. Given the very large mass of the W<sup>±</sup> boson (∼80 GeV), for decays of leptons and most hadrons (masses up to a few GeV), the propagator g<sub>w</sub><sup>2</sup>/(M<sub>W</sub><sup>2</sup> − q<sup>2</sup>) can be approximated by the constant G<sub>F</sub> ∼ g<sub>w</sub><sup>2</sup>/M<sub>W</sub><sup>2</sup>. The wavefunctions of the electron and the neutrino were assumed to be constant over the volume of the nucleus and were moved outside the integral; the volume over which they are normalized cancels with the volume used to calculate the density of states dN/dE.<sup>1</sup> Another simplification was to use Schrödinger wavefunctions ψ rather than the Dirac spinors that are needed for a correct relativistic description of spin-½ particles.

Some nuclear decays involve larger changes in nuclear spins than can be accommodated by the spin 0 or 1 change from the electron–neutrino pair. These decays tend to proceed more slowly. This leads to the jargon of 'forbidden decays', in which the decay can occur with a change in the nuclear orbital angular momentum but more slowly than the 'allowed decays'. The terminology is a bit unfortunate, but it does introduce a very important idea. If a process occurs at a slower rate than one would expect from the phase space available, or not at all, then there could be an inhibition from angular momentum conservation or some other conserved quantum number.

<sup>1</sup>How to calculate dN/dE is covered in Chapter 2.
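The point-like approximation used in Fermi theory can be checked with a few lines of arithmetic. The sketch below compares the full propagator factor 1/(M<sub>W</sub><sup>2</sup> − q<sup>2</sup>) with its low-energy limit 1/M<sub>W</sub><sup>2</sup>. The numerical inputs (M<sub>W</sub> ≈ 80.4 GeV, m<sub>τ</sub> ≈ 1.777 GeV, a neutron–proton mass difference of about 1.293 MeV) and the function name are assumptions made here for illustration, and |q| is simply taken to be of the order of the energy released.

```python
# Sketch: how good is the replacement 1/(M_W^2 - q^2) -> 1/M_W^2 ?
# Natural units (GeV). All numerical inputs are assumed, illustrative values.

M_W = 80.4  # W boson mass in GeV (assumed input)


def propagator_ratio(q2, m_w=M_W):
    """Return [1/(M_W^2 - q^2)] / [1/M_W^2] = 1 / (1 - q^2/M_W^2)."""
    return 1.0 / (1.0 - q2 / m_w**2)


# Rough momentum-transfer scales, |q| of order the energy released
scales = {
    "neutron beta decay (|q| ~ 1.3 MeV)": 1.293e-3,
    "tau decay (|q| ~ m_tau)": 1.777,
    "B meson decay (|q| ~ 5 GeV)": 5.0,
}

for label, q in scales.items():
    correction = propagator_ratio(q**2) - 1.0
    print(f"{label}: fractional correction = {correction:.2e}")
```

With these inputs the correction stays well below one per cent even at |q| of a few GeV, which is why a single constant G<sub>F</sub> describes low-energy weak decays so well.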

## **7.2 Weak interactions of leptons**

Many of the key features of the weak interaction can be understood using leptonic decays and interactions without the additional complication of hadronic structure. Some examples of lepton properties that need explanation are the absence of the decay μ → eγ (the upper limit on the branching ratio for μ<sup>−</sup> → e<sup>−</sup>γ is 1.2 × 10<sup>−11</sup>), the very different lifetimes of the π<sup>±</sup> (2.6 × 10<sup>−8</sup> s) and π<sup>0</sup> (10<sup>−16</sup> s), and the absence of purely leptonic decays such as μ<sup>±</sup> → e<sup>±</sup>ν or τ<sup>±</sup> → μ<sup>±</sup>ν. The much higher decay rate for the π<sup>0</sup> is straightforward: J<sup>PC</sup> of the π<sup>0</sup> is 0<sup>−+</sup>, so the electromagnetic decay π<sup>0</sup> → γγ is allowed. So far, there is no evidence that any of the leptons are other than point-like particles.

## **7.2.1 Lepton number**

The above brief summary of leptonic properties leads to the necessity for new conserved quantum numbers: the lepton numbers. For example, we assign L<sub>e</sub> = +1 to the e<sup>−</sup> and ν<sub>e</sub>, L<sub>e</sub> = −1 to the e<sup>+</sup> and ν̄<sub>e</sub>, and L<sub>e</sub> = 0 to all other particles. There are exactly analogous conservation laws for each of the other two generations involving L<sub>μ</sub> and L<sub>τ</sub>.<sup>2</sup> The details were summarized in Chapter 1, Table 1.1, which is repeated here for convenience as Table 7.1.

The lepton number conservation laws, along with charge conservation and baryon number conservation, are reflected in the fact that there is only a single vertex for each lepton generation that can be used in constructing Feynman diagrams for weak charged-current interactions. They are shown in Fig. 7.2.
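The lepton-number bookkeeping described above is easy to automate. The following is a minimal sketch, not a complete particle table: the particle labels, the dictionary, and the helper names are choices made here for illustration. It checks the forbidden decay μ → eγ, the ordinary muon decay μ<sup>−</sup> → e<sup>−</sup> ν̄<sub>e</sub> ν<sub>μ</sub> (included as a familiar allowed case), and the τ decay of Fig. 7.3.

```python
# Sketch: bookkeeping for the three lepton numbers (L_e, L_mu, L_tau).
# Particle labels and helper names are illustrative choices.

LEPTON_NUMBERS = {
    # name: (L_e, L_mu, L_tau)
    "e-": (1, 0, 0), "nu_e": (1, 0, 0),
    "e+": (-1, 0, 0), "nu_e_bar": (-1, 0, 0),
    "mu-": (0, 1, 0), "nu_mu": (0, 1, 0),
    "mu+": (0, -1, 0), "nu_mu_bar": (0, -1, 0),
    "tau-": (0, 0, 1), "nu_tau": (0, 0, 1),
    "tau+": (0, 0, -1), "nu_tau_bar": (0, 0, -1),
    "gamma": (0, 0, 0),
}


def total(particles):
    """Sum (L_e, L_mu, L_tau) over a list of particle names."""
    return tuple(sum(LEPTON_NUMBERS[p][i] for p in particles) for i in range(3))


def conserves_lepton_numbers(initial, final):
    """True if every lepton number is the same before and after."""
    return total(initial) == total(final)


# mu- -> e- gamma: changes both L_e and L_mu
print(conserves_lepton_numbers(["mu-"], ["e-", "gamma"]))
# mu- -> e- nu_e_bar nu_mu: ordinary muon decay
print(conserves_lepton_numbers(["mu-"], ["e-", "nu_e_bar", "nu_mu"]))
# tau- -> nu_tau e- nu_e_bar: the decay of Fig. 7.3
print(conserves_lepton_numbers(["tau-"], ["nu_tau", "e-", "nu_e_bar"]))
```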

## **7.2.2 Feynman rules**

**Table 7.1** Lepton properties.

<sup>2</sup>Although it was assumed at the time of its discovery that the τ lepton was a so-called sequential lepton with its own lepton number and an associated ν<sub>τ</sub>, it was not until 1997 that the DONUT experiment confirmed directly that a ν<sub>τ</sub> beam produced the charged τ.

**Fig. 7.2** Possible leptonic charged-current vertices.

Calculations of high-energy interactions respecting Lorentz covariance and allowing for particle creation and annihilation are difficult even for weak or electromagnetic interactions. Richard Feynman invented a very elegant graphical formalism that provides a considerable shortcut in calculations. The full mathematical treatment is given in the advanced textbooks in Further Reading, but a brief outline is given here since this formalism has become an essential part of the 'language' of particle physics; it will also provide a useful bridge to the more advanced texts. It ends up being formulated as a set of 'Feynman rules', in other words a recipe for performing the calculation. To show that Feynman rules actually work and give the right answer is even further beyond the scope of this book.

Feynman's recipe is as follows. First, the process from incoming particles to outgoing particles is described in terms of 'Feynman diagrams' where lines represent 'quanta' or particles in momentum space. Lines representing the initial and final particles in an interaction must be 'on-mass-shell' particles, i.e. ones where E<sup>2</sup> = p<sup>2</sup> + m<sup>2</sup>. Internal lines represent 'virtual particles', which do not have to satisfy the mass-shell constraint.<sup>3</sup> Diagrams can have as many complicated lines as you like (but all connected), although with weak and electromagnetic interactions those with the fewest vertices (the lowest-order diagrams) are the most important. To do the calculation, all diagrams up to a given order must be considered and the amplitudes added together to get the matrix element. In this way, it is possible to have interference (constructive or destructive) between two diagrams.

<sup>3</sup>Although internal particles do not have to be on the mass shell, they do have to be consistent with the constraints of the time–energy uncertainty relation ΔEΔt ≤ ℏ/2.

The second step in Feynman's recipe is to write down elements in a mathematical expression for each external line, vertex, and internal line (propagators) according to the rules, and the third step is to evaluate the expression. If there are internal loops, the evaluation involves integrating over the unconstrained momenta of those particles. The recipe ensures energy and momentum conservation and it is also built into the rules that the resulting expression is Lorentz-invariant.

Apart from being a precise mathematical tool for relativistic field theory calculations, Feynman diagrams give a very intuitive 'billiard ball' picture of what is a complicated interaction of waves and quanta. Other, now-commonplace ideas such as vertex factors (e.g. α = 1/137 for the electromagnetic interaction) and the propagators 1/q<sup>2</sup> for a massless photon and 1/(M<sub>W</sub><sup>2</sup> − q<sup>2</sup>) for the W for internal lines come from the Feynman rules.<sup>4</sup>

<sup>4</sup>Note that the W width has been ignored here; it gives an imaginary part to the propagator, analogous to the expression for a Breit–Wigner amplitude (see Section 2.5.4).

**Fig. 7.3** Feynman diagram for a tau-lepton decay to an electron and two neutrinos.

**An example: tau decay**

Using the Feynman rules, we will find the lowest-order matrix element for a purely leptonic decay mode of the tau lepton τ<sup>−</sup> → ν<sub>τ</sub> e<sup>−</sup> ν̄<sub>e</sub> (Fig. 7.3). The Fermi expression for this process involves an integral over Schrödinger wavefunctions ψ:

$$M\_{\rm fi} = G\_{\rm F} \int \psi\_{\nu\_{\tau}}^{\*} \psi\_{e}^{\*} \psi\_{\bar{\nu}}^{\*} \psi\_{\tau} \,\mathrm{d}V \tag{7.3}$$

The equivalent expression using Feynman rules and Dirac spinors is

$$\begin{split} M\_{\text{fi}} &= \frac{1}{M\_W^2 - q^2} \left[ \frac{g\_{\text{w}}}{\sqrt{2}} \bar{u}(\nu\_\tau) \gamma\_\mu \frac{1}{2} (1 - \gamma\_5) u(\tau) \right] \\ &\times \left[ \frac{g\_{\text{w}}}{\sqrt{2}} \bar{u}(e) \gamma^\mu \frac{1}{2} (1 - \gamma^5) u(\nu\_e) \right] \end{split} \tag{7.4}$$

where 1/(M<sub>W</sub><sup>2</sup> − q<sup>2</sup>) is the propagator and the two expressions in square brackets are the vertex factors. This is an example of a 'current–current' interaction as described in Chapter 6. The vertex factors involve a particular way of combining the spinors: inserting a matrix γ<sup>μ</sup>½(1 − γ<sub>5</sub>) between the spinor and the adjoint spinor. This particular combination is called 'V−A'; more details are given in Section 7.2.4. The factor g<sub>w</sub> is called the weak coupling constant and is analogous to the electric charge e = √(4πα) (in natural units) in electromagnetism. The q<sup>2</sup> in the propagator is the four-momentum squared of the W particle. The W does not have to be on the mass shell, since it is a virtual particle; however, since q<sup>2</sup> ∼ m<sub>τ</sub><sup>2</sup> and m<sub>τ</sub> ≪ M<sub>W</sub>, we can approximate 1/(M<sub>W</sub><sup>2</sup> − q<sup>2</sup>) → 1/M<sub>W</sub><sup>2</sup>, in which case we may use

$$\frac{G\_{\rm F}}{\sqrt{2}} = \frac{g\_{\rm w}^2}{8M\_W^2} \tag{7.5}$$

to relate the weak coupling g<sub>w</sub> to the Fermi constant G<sub>F</sub>. The expression for the matrix element becomes

$$M\_{\rm fi} = \frac{4G\_{\rm F}}{\sqrt{2}} [\bar{u}(\nu\_{\tau})\gamma\_{\mu}\frac{1}{2}(1-\gamma\_{5})u(\tau)][\bar{u}(e)\gamma^{\mu}\frac{1}{2}(1-\gamma^{5})u(\nu\_{e})] \tag{7.6}$$

which, apart from the γ matrices, now looks very much like the expression from Fermi theory in eqn 7.3. The factors of 4 and √2 are simply to keep the definition of G<sub>F</sub> the same as originally defined by Fermi.
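Equation 7.5 can be inverted to obtain a number for the weak coupling. The sketch below does this under stated assumptions: the values G<sub>F</sub> ≈ 1.1664 × 10<sup>−5</sup> GeV<sup>−2</sup> and M<sub>W</sub> ≈ 80.4 GeV are inputs supplied here rather than taken from the text, and the names `weak_coupling` and `alpha_w` are illustrative.

```python
import math

# Sketch: solve eqn 7.5, G_F/sqrt(2) = g_w^2 / (8 M_W^2), for g_w.
# Inputs are assumed values in natural units (GeV).
G_F = 1.1664e-5   # Fermi constant in GeV^-2 (assumed input)
M_W = 80.4        # W mass in GeV (assumed input)


def weak_coupling(g_f=G_F, m_w=M_W):
    """g_w = sqrt(8 M_W^2 G_F / sqrt(2)), from eqn 7.5."""
    return math.sqrt(8.0 * m_w**2 * g_f / math.sqrt(2.0))


g_w = weak_coupling()
alpha_w = g_w**2 / (4.0 * math.pi)   # weak analogue of alpha = e^2 / (4 pi)
alpha_em = 1.0 / 137.0               # value quoted in the text

print(f"g_w     = {g_w:.3f}")
print(f"alpha_w = {alpha_w:.4f} (about 1/{1.0 / alpha_w:.0f})")
print(f"alpha_w / alpha_em = {alpha_w / alpha_em:.1f}")
```

With these inputs g<sub>w</sub> comes out at roughly 0.65, so the dimensionless weak coupling is not smaller than the electromagnetic one; in this picture the weakness of low-energy weak processes comes from the 1/M<sub>W</sub><sup>2</sup> factor.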

## **7.2.3 Universality**

The model of weak interactions contains the postulate that

g<sub>w</sub> is the same for all weak interactions . . .

. . . well, all weak charged-current interactions involving leptons. To bring quarks into the picture, we need Cabibbo theory and its extension by Kobayashi and Maskawa, and to include the neutral current, we need the Glashow, Weinberg, and Salam theory of electroweak unification.

There have been many experiments using a wide range of interactions and energies to test universality, measuring and comparing g<sub>w</sub>. So far, the postulate that g<sub>w</sub> is the same is holding up very well. We will discuss some of these experiments in Chapter 8.

## **Left-handed particles . . .**

You will recognize from Chapter 6 that the ½(1 − γ<sub>5</sub>) in the operator is the left-hand chirality projection operator P<sub>L</sub> (which, in the high-energy limit, is equivalent to a helicity projection). Denoting the left-handed part of the τ as u<sub>L</sub>(τ), we have

$$u\_{\mathcal{L}}(\tau) = P\_{\mathcal{L}} u(\tau) = \frac{1}{2} (1 - \gamma\_5) u(\tau) \tag{7.7}$$

This means that the ½(1 − γ<sub>5</sub>)u(τ) part of eqn 7.6 can be replaced by u<sub>L</sub>(τ). We say that the weak charged-current interaction acts on only the left-handed part of the particle. We will formulate this more rigorously when we come to electroweak unification.

## **. . . and right-handed antiparticles**

The ½(1 − γ<sub>5</sub>) operator, when acting on the spinor of an antiparticle (using v rather than u to denote a spinor of an antiparticle), projects out the right-handed chirality of the antiparticle, i.e.

$$v\_{\mathcal{R}} = \frac{1}{2}(1 - \gamma\_5)v\tag{7.8}$$

The weak charged current acts only on the left-handed part of particle wavefunctions and the right-handed part of antiparticle wavefunctions.
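The projector properties behind eqns 7.7 and 7.8 can be verified explicitly. The sketch below builds P<sub>L</sub> = ½(1 − γ<sub>5</sub>) and P<sub>R</sub> = ½(1 + γ<sub>5</sub>) as 4 × 4 matrices and checks that they are idempotent, mutually orthogonal, and complete. It assumes the Dirac representation of γ<sub>5</sub> (off-diagonal identity blocks), which may differ from the convention of Chapter 6; the helper names are illustrative.

```python
# Sketch: chirality projectors built from gamma5 in the Dirac representation.
# Pure-Python 4x4 matrices; helper names are illustrative.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def add(a, b, sign=1):
    return [[a[i][j] + sign * b[i][j] for j in range(4)] for i in range(4)]


def scale(c, a):
    return [[c * a[i][j] for j in range(4)] for i in range(4)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

# gamma5 in the Dirac representation (assumed convention)
GAMMA5 = [[0, 0, 1, 0],
          [0, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 1, 0, 0]]

P_L = scale(0.5, add(IDENTITY, GAMMA5, sign=-1))  # (1 - gamma5)/2
P_R = scale(0.5, add(IDENTITY, GAMMA5, sign=+1))  # (1 + gamma5)/2

print("P_L P_L == P_L :", close(matmul(P_L, P_L), P_L))
print("P_L P_R == 0   :", close(matmul(P_L, P_R), ZERO))
print("P_L + P_R == 1 :", close(add(P_L, P_R), IDENTITY))

# An arbitrary four-component spinor splits into left- and right-handed pieces
u = [1.0, 2.0, 3.0, 4.0]
u_L = [sum(P_L[i][j] * u[j] for j in range(4)) for i in range(4)]
u_R = [sum(P_R[i][j] * u[j] for j in range(4)) for i in range(4)]
print("u_L + u_R == u :", all(abs(u_L[i] + u_R[i] - u[i]) < 1e-12 for i in range(4)))
```

Because P<sub>L</sub>P<sub>R</sub> = 0, a current built with P<sub>L</sub> never sees the right-handed part of a particle spinor, which is the statement made in the text.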

## **7.2.4 V***−***A**

We now return to the question of why the Feynman rules were constructed so that the matrix between the spinors was γ<sup>μ</sup>½(1 − γ<sub>5</sub>). The first step is to construct a scalar quantity (i.e. a number that is Lorentz-invariant) out of a spinor u. The answer is that ūu is such a quantity. It is the simplest of the 'bilinear covariants' for the Dirac equation and transforms as a Lorentz scalar. There are 16 such combinations,<sup>5</sup> and these can be arranged as shown in Table 7.2 to construct different types of Lorentz-covariant quantities. It can be shown that this is the only way to do it and this is the complete set of different combinations that can be made. It is also interesting for this discussion to see what happens under a parity transformation. The combination ūu remains the same under a parity transformation. We can also construct other numbers by inserting various combinations of gamma matrices in the middle. For example, ūγ<sub>5</sub>u is also a Lorentz-invariant scalar; however, in this case, the parity operation produces −ūγ<sub>5</sub>u, i.e. there is a change in sign. This is therefore called a 'pseudoscalar'.

<sup>5</sup>See Section 7.3 of [80] for more details, starting from how a Dirac spinor transforms under a Lorentz transformation.

**Table 7.2** Lorentz covariant combinations of spinors.

σ<sup>μν</sup> = ½ i(γ<sup>μ</sup>γ<sup>ν</sup> − γ<sup>ν</sup>γ<sup>μ</sup>)

So we have a choice of S, V, T, A, or P (or some combination) for the interaction type for weak interactions. The second step is to turn to experiment to try to select the correct combination (see [57, pp. 398–401]). It involves looking at Fermi (ΔJ = 0, where J is the total spin of the nucleus) and Gamow–Teller (ΔJ = 0, 1) allowed β decays and comparing various features of the decay electron spectra to see if the Fermi transitions are S or V or a combination of the two. The Gamow–Teller transitions tell us about the choice between A and T. After considerable experimental investigation, the interaction was determined to be a combination of V and A, with no contributions from S or T (P cannot produce a large contribution). Further measurements gave the relative contributions of V and A (equal) and the sign (negative). So:

The weak interaction is V−A, i.e. γ<sup>μ</sup>(1 − γ<sub>5</sub>).

## **7.2.5 Parity violation**

We have just decided that the weak interaction has a V−A form, i.e. that the spinor combinations ūγ<sub>μ</sub>u and ūγ<sub>μ</sub>γ<sub>5</sub>u appear in equal amounts. The first combination is a Lorentz 4-vector and the second is an axial vector. We use the V−A combination to describe the parity-violating weak interaction:

$$
\bar{u}\gamma\_\mu u - \bar{u}\gamma\_\mu \gamma\_5 u = \bar{u}\gamma\_\mu (1 - \gamma\_5)u \tag{7.9}
$$

We now turn to the properties of these quantities under a parity operation. Both ūγ<sub>μ</sub>u and ūγ<sub>μ</sub>γ<sub>5</sub>u have definite parity transformation properties; however, they are opposite. If the weak interaction were a single type, i.e. either V or A, then there would not be parity violation; the final state would have definite parity. The V−A combination ūγ<sub>μ</sub>(1 − γ<sub>5</sub>)u, however, has a mixture of parity change or non-change and this results in parity violation. Since the V and A parts come in equal amounts, parity is said to be maximally violated. We will review the experimental evidence for parity violation in the weak interaction in Chapter 8.
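The opposite parity behaviour of the V and A pieces can be made concrete with explicit matrices. The sketch below assumes the standard result that a Dirac spinor transforms under parity as u → γ<sup>0</sup>u, so that a bilinear ūΓu becomes ū(γ<sup>0</sup>Γγ<sup>0</sup>)u; it is then enough to compare γ<sup>0</sup>Γγ<sup>0</sup> with ±Γ. The Dirac representation and all helper names are choices made here for illustration.

```python
# Sketch: parity behaviour of the S, P, V and A bilinears, using u -> gamma0 u.
# Under this map, u-bar G u -> u-bar (gamma0 G gamma0) u.
# Dirac representation assumed; helper names are illustrative.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
G5 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]
GAMMAS = [G0, G1, G2, G3]


def sign_under_parity(g):
    """Return +1 or -1 if gamma0 g gamma0 equals +g or -g, else None."""
    t = matmul(matmul(G0, g), G0)
    for s in (+1, -1):
        if all(abs(t[i][j] - s * g[i][j]) < 1e-12
               for i in range(4) for j in range(4)):
            return s
    return None


scalar = sign_under_parity(IDENTITY)                        # u-bar u
pseudoscalar = sign_under_parity(G5)                        # u-bar gamma5 u
vector = [sign_under_parity(g) for g in GAMMAS]             # u-bar gamma^mu u
axial = [sign_under_parity(matmul(g, G5)) for g in GAMMAS]  # u-bar gamma^mu gamma5 u

print("S:", scalar, " P:", pseudoscalar)
print("V components (t, x, y, z):", vector)
print("A components (t, x, y, z):", axial)
```

The vector components come out with signs (+, −, −, −) and the axial ones with (−, +, +, +): each has a definite but opposite behaviour, so the difference V−A has none.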

## **7.2.6 Currents and fields**

**Fig. 7.4** (a) Feynman diagram for τ decay. (b) Same interaction showing the concept of two interacting currents. (c) Same, now showing the concept of a current interacting with a field.

<sup>6</sup>Note the absence of the projection operator because the photon couples with equal strength to the left-handed and right-handed parts.

We now give a reminder of the meaning of a current (see Chapter 6), since it will feature a lot in the electroweak theory. Consider the decay of the τ, as shown in Fig. 7.4(a). The contents of the square brackets in eqn 7.6 are each referred to as currents:

$$J^+\_{\mu} = \bar{u}(\nu\_{\tau})\gamma\_{\mu}\frac{1}{2}(1-\gamma\_5)u(\tau) \tag{7.10}$$

is a charge-raising current (because the outgoing ν<sub>τ</sub> is one positive increment in charge bigger than the incoming τ<sup>−</sup>). Similarly,

$$J^{-\mu} = \bar{u}(e)\gamma^{\mu}\frac{1}{2}(1-\gamma^{5})u(\nu\_{e})\tag{7.11}$$

is a charge-lowering current. These charge-raising (lowering) currents are the reason why this sort of interaction is called a charged current—the particle in the middle is a W<sup>+</sup> or a W<sup>−</sup> and charge needs to be conserved at each vertex. The other type of weak interaction is the neutral current and involves the exchange of the Z<sup>0</sup>.

We can now write the matrix element in eqn 7.4 as

$$M\_{\rm fi} = \frac{g\_{\rm w}}{\sqrt{2}} J\_{\mu}^{+} \cdot \frac{1}{M\_{W}^{2} - q^{2}} \cdot \frac{g\_{\rm w}}{\sqrt{2}} J^{-\mu} \tag{7.12}$$

This is the so-called current–current formalism of an interaction; i.e. the interaction occurs between the two currents, as shown in Fig. 7.4(b).

Another variation on this theme is to treat the two halves of the Feynman diagram separately. We can 'plug-and-play', i.e. change the particles at one vertex (e.g. from (e, ν<sub>e</sub>) to (d, u)) without changing the calculation at the other vertex. We do this by introducing the concept of a field W<sup>μ</sup>,

$$W^{\mu} = \frac{1}{M\_W^2 - q^2} \cdot \frac{g\_{\text{w}}}{\sqrt{2}} J^{-\mu} \tag{7.13}$$

and the matrix element is

$$M\_{\rm fi} = \frac{g\_{\rm w}}{\sqrt{2}} J\_{\mu}^{+} \cdot W^{\mu} \tag{7.14}$$

This is shown in Fig. 7.4(c).

Another example of a current × field interaction is the electromagnetic interaction. The equivalent current for an electromagnetic interaction (e.g. for an electron, with negative charge) is

$$J^{\rm EM}\_{\mu} = -\bar{u}(e)\gamma\_{\mu}u(e) \tag{7.15}$$

and the electromagnetic field is A<sup>μ</sup>, giving<sup>6</sup> M<sub>fi</sub> = J<sup>EM</sup><sub>μ</sub> · A<sup>μ</sup>.

This discussion in terms of currents and fields is reminiscent of electric currents and magnetic fields: a quantity called a magnetic field is constructed at all points in space to describe the action of all the bits of current that are flowing in all circuits nearby. This can then be used to determine the force on a particular piece of current placed somewhere.<sup>7</sup>

<sup>7</sup>As we showed in Chapter 6, the continuity equation for the Dirac equation gives the expression for the Dirac spinor equivalent of a probability current as ūγ<sup>μ</sup>u.

## **7.3 Weak interactions including quarks**

Having understood the V−A structure of the weak interaction in the lepton sector, we are now ready to extend the idea of 'universality' to the strong sector and quarks. We shall see that to maintain universality it will be necessary to accept that the quark eigenstates involved in the weak interaction are linear combinations of those involved in strong interactions. This requires the introduction of Cabibbo theory and its extension to the CKM matrix. The result is a simple picture that explains a large number of experimental measurements. The background to the problem almost predates accelerator particle physics, going back to a time when the highest-energy data came from cosmic-ray physics. It was noted that some of the particles behaved in a strange way: after correction for phase space, they seemed to decay about a factor of 20 more slowly than those that were 'not strange'.<sup>8</sup>

<sup>8</sup>The 'slow ones' turn out to contain a

## **7.3.1 Cabibbo theory**

Figure 7.5(a) shows the quarks arranged in families with the possible charged-current transitions marked (thicker lines denote the transitions with the highest probabilities). In contrast, Fig. 7.5(b) shows the lepton families. Extra transitions are required to convert quarks of different generations, which do not exist in the interactions of the leptons. If this were not the case, the K and Λ particles would be completely stable (as would some particles with b quarks). Cabibbo developed his theory when only the u, d, and s quarks were known. The theory states:

The charged-current interactions of quarks proceed with the same coupling constant g<sub>w</sub> as for leptonic interactions provided we associate a factor cos θ<sub>C</sub> to u ↔ d transitions and a factor sin θ<sub>C</sub> to u ↔ s transitions.

With this scheme, a large number of weak interactions and decays are correctly predicted; the value of θ<sub>C</sub> required is 13.1°.
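The size of the Cabibbo suppression follows directly from the quoted angle. The sketch below evaluates the two vertex factors for θ<sub>C</sub> = 13.1° and compares the corresponding rates, on the assumption that rates scale as the square of the amplitude with phase-space differences ignored; the variable names are illustrative.

```python
import math

# Sketch: relative strength of Cabibbo-favoured (u <-> d) and
# Cabibbo-suppressed (u <-> s) transitions for theta_C = 13.1 degrees.
THETA_C_DEG = 13.1  # value quoted in the text

theta = math.radians(THETA_C_DEG)
amp_ud = math.cos(theta)   # factor attached to u <-> d vertices
amp_us = math.sin(theta)   # factor attached to u <-> s vertices

# Rates go as the amplitude squared (phase space ignored)
rate_ud = amp_ud**2
rate_us = amp_us**2

print(f"cos(theta_C) = {amp_ud:.4f}, sin(theta_C) = {amp_us:.4f}")
print(f"rate(u<->d) : rate(u<->s) = {rate_ud / rate_us:.1f} : 1")
```

The ratio is about 18, the same order as the factor of roughly 20 by which the 'strange' particles were observed to decay more slowly.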

However, a remaining problem was the decay K<sup>0</sup><sub>L</sub> → μ<sup>+</sup>μ<sup>−</sup>, a second-order weak interaction requiring two W particles in the lowest-order Feynman diagram. The predicted rate was many orders of magnitude faster than experimentally measured. A solution to this paradox was suggested by Glashow, Iliopoulos, and Maiani (GIM) [78].

## **7.3.2 GIM mechanism, flavour-changing neutral currents**

The GIM mechanism introduced a fourth quark c into the theory,<sup>9</sup> with couplings cos θ<sub>C</sub> for c ↔ s transitions and −sin θ<sub>C</sub> for c ↔ d. This produces a second Feynman diagram in the process K<sup>0</sup><sub>L</sub> → μ<sup>+</sup>μ<sup>−</sup>, which interferes destructively with the one involving only u, d, and s quarks and suppresses the decay to a level compatible with experiment (see [80, p. 327] or [84, p. 282]), provided the mass difference m<sub>c</sub> − m<sub>u</sub> is not too large.

<sup>9</sup>This was in 1970, four years before the J/ψ was discovered.

**Fig. 7.5** Allowed charged-current interactions of quarks (a) and leptons (b).

**Fig. 7.6** Allowed charged-current interactions with Cabibbo-rotated quarks (a) and leptons (b).

With four quarks, the GIM mechanism gives a nice conceptual picture of what is going on. If we define a linear superposition of quarks (of flavour eigenstates d and s) to form d′ and s′,

$$
\begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos \theta_{\rm C} & \sin \theta_{\rm C} \\ -\sin \theta_{\rm C} & \cos \theta_{\rm C} \end{pmatrix} \begin{pmatrix} d \\ s \end{pmatrix} \tag{7.16}
$$

then it is possible to view weak charged-current interactions of quarks as occurring within generations with the same coupling g<sub>w</sub> as for leptons, provided we consider the rotated states d′ and s′ rather than d and s. This is shown in Fig. 7.6(a) (ignore the t and b quarks in the diagrams for the moment).

The original motivation for the GIM mechanism was to suppress weak interactions that change flavour but not the quark charge (flavour-changing neutral currents), such as s → d, which are only observed at very low rates. The GIM mechanism shows that these interactions are forbidden at tree level and it also ensures that the rates are suppressed at higher order in perturbation theory.

The idea is as follows. The Cabibbo rotation in eqn 7.16 shows that we have to consider the weak charged-current interaction as being between quarks u ↔ d′ and c ↔ s′ and that these form two families that do not mix at all, just like the leptons (ignoring neutrino oscillations). Therefore, for the weak neutral-current interaction, we can hypothesize that the same thing happens, i.e. that the quark families do not mix at all in the rotated basis; the only interactions allowed are u ↔ u, d′ ↔ d′, s′ ↔ s′, and c ↔ c.

The question is: restricting ourselves to the above hypothesis, are flavour-changing interactions of the neutral current, i.e. d ↔ s, allowed? If so, then the overlap between d and s should not be zero. Take the rotation defined in eqn 7.16, invert it, and use it to give

$$\begin{aligned} (ds) &= (d' \cos \theta_{\rm C} - s' \sin \theta_{\rm C}) (d' \sin \theta_{\rm C} + s' \cos \theta_{\rm C}) \\ &= (d'd' - s's') \cos \theta_{\rm C} \sin \theta_{\rm C} + d's' \cos^2 \theta_{\rm C} - s'd' \sin^2 \theta_{\rm C} \\ &= 0 \end{aligned} \tag{7.17}$$

where the last line comes from the hypothesis that in the new basis, there are no interactions between quark families, i.e. d′d′ = s′s′ = 1 and d′s′ = s′d′ = 0. So lowest-order flavour-changing neutral currents are not possible under the GIM mechanism. This then agrees with experiment, since flavour-changing neutral currents are observed to be highly suppressed.<sup>10</sup>

<sup>10</sup>Flavour-changing neutral currents are allowed at higher order and lead to the very important flavour oscillations such as K<sup>0</sup> → K̄<sup>0</sup>, which we will examine in Chapter 10.
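The cancellation in eqn 7.17 can also be seen from the other direction, as a minimal numerical sketch. If the neutral current is diagonal in the rotated basis, its couplings in the (d, s) basis are the elements of R<sup>T</sup>R, where R is the rotation of eqn 7.16; the off-diagonal element is the d–s coupling. For comparison the code also evaluates the case with only the d′ row present, i.e. a theory without the charm quark. The function names and the `rows` argument are illustrative, and this matrix formulation is an equivalent restatement rather than the derivation used in the text.

```python
import math

# Sketch: coefficient of the flavour-changing d-s neutral-current term.
# A neutral current diagonal in the rotated basis, d'd' + s's', has
# couplings (R^T R)[i][j] in the (d, s) basis, with R from eqn 7.16.


def cabibbo_rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]   # rows: d', s' ; columns: d, s


def neutral_current_couplings(theta, rows=(0, 1)):
    """Couplings in the (d, s) basis, summing over the chosen rotated states.

    rows=(0, 1): both d' and s' couple (four-quark GIM picture).
    rows=(0,):   only d' couples (three quarks, no charm).
    """
    r = cabibbo_rotation(theta)
    return [[sum(r[k][i] * r[k][j] for k in rows) for j in range(2)]
            for i in range(2)]


theta_c = math.radians(13.1)
with_charm = neutral_current_couplings(theta_c)
without_charm = neutral_current_couplings(theta_c, rows=(0,))

print(f"GIM (d' and s'): dd = {with_charm[0][0]:.3f}, "
      f"ss = {with_charm[1][1]:.3f}, ds = {with_charm[0][1]:.3f}")
print(f"d' only        : dd = {without_charm[0][0]:.3f}, "
      f"ss = {without_charm[1][1]:.3f}, ds = {without_charm[0][1]:.3f}")
```

With both rotated states present the d–s coupling vanishes for any angle, whereas with d′ alone it equals cos θ<sub>C</sub> sin θ<sub>C</sub>, which is not small.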

## **7.3.3 CKM matrix**

As hinted by including t and b quarks in Figs. 7.5 and 7.6, the scheme can be extended to three quark generations with a 3 × 3 matrix, which was proposed by Kobayashi and Maskawa (the resulting mixing matrix **V** is known as the CKM matrix, where C = Cabibbo):

$$
\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = \begin{pmatrix} V\_{ud} & V\_{us} & V\_{ub} \\ V\_{cd} & V\_{cs} & V\_{cb} \\ V\_{td} & V\_{ts} & V\_{tb} \end{pmatrix} \begin{pmatrix} d \\ s \\ b \end{pmatrix} = \mathbf{V} \begin{pmatrix} d \\ s \\ b \end{pmatrix} \tag{7.18}
$$

The matrix has values that come from experiment. The magnitudes (we will consider the phases later) are as follows:<sup>11</sup>

$$\mathbf{V} = \begin{pmatrix} 0.97427 \pm 0.00015 & 0.22534 \pm 0.00065 & 0.00351^{+0.00015}_{-0.00014} \\ 0.22520 \pm 0.00065 & 0.97344 \pm 0.00016 & 0.0412^{+0.0011}_{-0.0005} \\ 0.00867^{+0.00029}_{-0.00031} & 0.0404^{+0.0011}_{-0.0005} & 0.999146^{+0.000021}_{-0.000046} \end{pmatrix} \tag{7.19}$$

<sup>11</sup>For the most recent values consult the PDG Tables [115].

We will review some of the techniques used to measure the elements of the CKM matrix in Chapter 8. Note that the bottom row and right column are rather close to being all zero or one, i.e. the third generation does not mix much with the other two. This is the reason why the decays of particles involving b quarks are very slow.
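A quick consistency check on eqn 7.19 is to sum the squared magnitudes along each row and column, since unitarity requires every such sum to equal 1. The sketch below uses the central values only; the quoted uncertainties are ignored and the variable names are illustrative.

```python
# Sketch: row and column sums of |V_ij|^2 for the central values in eqn 7.19.
# Unitarity requires each sum to be 1; quoted uncertainties are ignored here.

V_MAG = [
    [0.97427, 0.22534, 0.00351],    # |V_ud|, |V_us|, |V_ub|
    [0.22520, 0.97344, 0.0412],     # |V_cd|, |V_cs|, |V_cb|
    [0.00867, 0.0404, 0.999146],    # |V_td|, |V_ts|, |V_tb|
]

row_sums = [sum(v**2 for v in row) for row in V_MAG]
col_sums = [sum(V_MAG[i][j]**2 for i in range(3)) for j in range(3)]

for name, s in zip(("u", "c", "t"), row_sums):
    print(f"row {name}:    sum |V|^2 = {s:.6f}")
for name, s in zip(("d", "s", "b"), col_sums):
    print(f"column {name}: sum |V|^2 = {s:.6f}")
```

All six sums agree with 1 to better than one part in 10<sup>4</sup>, comparable to the size of the quoted uncertainties.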

Kobayashi and Maskawa were originally searching for an excuse to allow a non-trivial complex number into the 2 × 2 mixing matrix; they realized<sup>12</sup> that this is a way of inserting (at that time recently discovered) CP violation into the theory. Because it is possible to add a phase to each quark without altering the theory (measurable quantities are proportional to |M|<sup>2</sup>), this could not be done with a 2 × 2 matrix. If the number of quark generations is extended to 3, the matrix can have one non-trivial CP violation-generating phase. Much of what is known as 'heavy-flavour physics' revolves around pinning down the values of the CKM matrix elements, since they are key parameters in calculating the decay rates of heavy mesons and baryons. In addition to measurements depending directly on the matrix elements, the CKM matrix must also respect the constraint of unitarity. Although the number of parameters to be determined is not huge, to take proper account of the mathematical constraints and to include systematic and statistical errors correctly is non-trivial and beyond the scope of this text.

<sup>12</sup>For which they won the 2008 Nobel Prize.

It is conventional to treat the charge −1/3 quarks (d, s, b) as the ones that mix to (d′, s′, b′), while the others (u, c, t) do not change. This could have been done the other way round (or even considering a combination of rotations of both the charge −1/3 and charge +2/3 quarks), but it can be shown to simplify to the same combinations as used here.

Assuming that the CKM matrix is unitary, it can be parameterized in terms of three independent mixing angles θ<sub>12</sub>, θ<sub>23</sub>, θ<sub>13</sub> and the one complex phase δ as discussed above. A popular way to parameterize the matrix, following the Particle Data Group (PDG), is as follows:

$$\mathbf{V} = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}\mathrm{e}^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}\mathrm{e}^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}\mathrm{e}^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}\mathrm{e}^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}\mathrm{e}^{i\delta} & c_{23}c_{13} \end{pmatrix} \tag{7.20}$$

Here c<sub>ij</sub> = cos θ<sub>ij</sub> and s<sub>ij</sub> = sin θ<sub>ij</sub>, where i, j = 1, 2, 3 are generation labels. In the limit θ<sub>23</sub> = θ<sub>13</sub> = 0, the third generation decouples from the first two and θ<sub>12</sub> = θ<sub>C</sub> (the Cabibbo angle). The phase δ allows for CP violation in the Standard Model. CP violation will be covered further in Chapter 10.

**Fig. 7.7** Spectator quark diagram for K̄<sup>0</sup> → π<sup>+</sup>π<sup>−</sup> decay.

<sup>13</sup>Charm, which will subsequently decay to mesons with an s quark.
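The parameterization of eqn 7.20 can be checked numerically. The sketch below builds the matrix for given angles and phase, confirms that VV<sup>†</sup> = 1, and verifies the limit θ<sub>23</sub> = θ<sub>13</sub> = 0, in which the upper-left block reduces to the Cabibbo rotation of eqn 7.16. The angle values passed in the first call are arbitrary illustrative numbers, not fitted ones, and the function names are hypothetical.

```python
import cmath
import math

# Sketch: build V from eqn 7.20 and check unitarity and the Cabibbo limit.
# The angles used in the first call are arbitrary illustrative numbers.


def ckm(theta12, theta23, theta13, delta):
    c12, s12 = math.cos(theta12), math.sin(theta12)
    c23, s23 = math.cos(theta23), math.sin(theta23)
    c13, s13 = math.cos(theta13), math.sin(theta13)
    ep = cmath.exp(1j * delta)    # e^{+i delta}
    em = cmath.exp(-1j * delta)   # e^{-i delta}
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep,
         c12 * c23 - s12 * s23 * s13 * ep,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep,
         -c12 * s23 - s12 * c23 * s13 * ep,
         c23 * c13],
    ]


def unitarity_defect(v):
    """Largest |(V V^dagger - 1)_ij|; zero for an exactly unitary matrix."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(v[i][k] * complex(v[j][k]).conjugate() for k in range(3))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


v = ckm(0.23, 0.04, 0.004, 1.2)   # illustrative angles and phase (radians)
print(f"unitarity defect: {unitarity_defect(v):.1e}")

theta_c = math.radians(13.1)
v_limit = ckm(theta_c, 0.0, 0.0, 0.0)   # third generation decoupled
print("upper-left block magnitudes:",
      [[round(abs(x), 4) for x in row[:2]] for row in v_limit[:2]])
print("|V_tb| in this limit:", abs(v_limit[2][2]))
```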

## **7.3.4 Decays of hadrons containing heavy quarks**

We are now in a position to try to make predictions of how weak decays of mesons and baryons containing heavy quarks might proceed. A useful concept is that of the 'spectator quark'. Normally during the existence of a meson, gluon exchange occurs continuously between the q–q̄ pair, so the strong force is important in understanding what is happening. The idea is that when the hadron decays, one of its constituent quarks changes flavour, emitting a virtual W particle; while this is happening, the other quark(s) are 'spectators', i.e. they play no role in the weak interaction itself. Although this is clearly not a rigorous result, it does provide a useful approximate model. An example is shown in Fig. 7.7, where the d̄ is a spectator quark during the decay K̄<sup>0</sup> → π<sup>+</sup>π<sup>−</sup>.

From inspecting the CKM matrix in eqn 7.19, we can see that since V<sub>cs</sub> is large, a c quark within a meson is going to decay preferentially to an s quark. For example, the decay D<sup>+</sup> → K + anything should have a high branching ratio, and indeed such decays dominate: K<sup>+</sup> + anything 28%; K<sup>0</sup>/K̄<sup>0</sup> + anything 61%; K<sup>−</sup> + anything 5.5%. Another indication that the spectator model is correct would be if the lifetime of a charged D meson were the same as the lifetime of a neutral D. This is not quite the case: τ(D<sup>+</sup>) = 1.040 ps and τ(D<sup>0</sup>) = 0.410 ps. However, the spectator-quark model ignores some other effects: the D<sup>0</sup> has more annihilation diagrams than the D<sup>+</sup> and non-perturbative effects from the strong interaction are still significant.

The B mesons are expected to decay mostly to particles involving charm,<sup>13</sup> because V<sub>cb</sub> is about a factor of 10 bigger than V<sub>ub</sub>, and this is indeed the case. Since neither V<sub>cb</sub> nor V<sub>ub</sub> is very big, the b quark decays relatively slowly. The measured lifetimes are τ(B<sup>+</sup>) = 1.67 ps and τ(B<sup>0</sup>) = 1.54 ps, so the spectator-quark model prediction is much better. The larger mass of B mesons means that α<sub>s</sub>(m<sub>b</sub>) is smaller and perturbative effects are less important (this is a consequence of 'running' coupling constants; see Chapter 9).

Looking again at the CKM matrix, we see that V<sub>tb</sub> is nearly 1. This means that the t quark decays nearly always to the b quark and not directly to s or d quarks. Since it is very heavy, the phase space is very large and it will decay rapidly, indeed too rapidly for any meson to be formed.
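The qualitative statements about c, b, and t decays can be turned into rough numbers using the central values of eqn 7.19. This is only a sketch: it assumes that rates scale as |V<sub>ij</sub>|<sup>2</sup> and ignores phase space and strong-interaction effects, so the ratios indicate orders of magnitude rather than branching fractions. The dictionary and variable names are illustrative.

```python
# Sketch: relative decay strengths implied by the central values of eqn 7.19.
# Rates are taken to scale as |V_ij|^2; phase space and QCD effects are ignored.

V = {
    "cd": 0.22520, "cs": 0.97344,
    "ub": 0.00351, "cb": 0.0412,
    "td": 0.00867, "ts": 0.0404, "tb": 0.999146,
}

c_to_s_over_c_to_d = (V["cs"] / V["cd"]) ** 2
b_to_c_over_b_to_u = (V["cb"] / V["ub"]) ** 2
t_to_b_fraction = V["tb"] ** 2 / (V["td"] ** 2 + V["ts"] ** 2 + V["tb"] ** 2)

print(f"c -> s favoured over c -> d by a factor of about {c_to_s_over_c_to_d:.0f}")
print(f"b -> c favoured over b -> u by a factor of about {b_to_c_over_b_to_u:.0f}")
print(f"fraction of t decays going to b: {t_to_b_fraction:.4f}")
```

The factor of about 10 between V<sub>cb</sub> and V<sub>ub</sub> becomes a factor of more than 100 in rate, which is why charm dominates the final states of B decays.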

A brief reminder about the naming of heavy mesons: the letter B or D with no subscript means the meson contains a heavy quark and either a u or a d in order to make up the charge of the meson. More exotic mesons are denoted with a subscript that indicates the less massive quark, e.g. Bs, Ds, or even Bc.

## **7.4 Introduction to electroweak unification**

In Sections 7.2 and 7.3, we have characterized the charged-current weak interaction of both quarks and leptons as follows:


By comparing the properties of the electromagnetic (EM) and weak forces, it becomes apparent that these have some very similar properties. Perhaps it might be possible to unify them, i.e. to provide a theory that covers both forces as two aspects of a more complete theory. This was achieved by Glashow, Salam, and Weinberg (GSW) and we will go through their arguments here. The theory was constructed before neutral currents had been discovered, and indeed it predicted the properties of neutral currents, which were subsequently experimentally verified. We will start with just the EM and weak charged-current interactions and see how the neutral current emerges from the theory.

First, in Table 7.3, we list the properties of the forces more carefully. Although the forces have similarities, they are clearly different. The procedure used by GSW to unify the forces is divided into four steps, since it is complicated to explain. We will use the τ<sup>−</sup> → ν<sub>τ</sub> e<sup>−</sup> ν̄<sub>e</sub> decay from Section 7.2.2 as an example.

## **7.4.1 Electroweak unification procedure**

The process of electroweak unification starts with the components of the weak interaction as they were known around 1964 (i.e. only the postulated W<sup>±</sup>, and no neutral weak interaction), and the success of the Feynman diagram view of perturbation theory for QED calculations. In exactly the same way that local gauge invariance in the Dirac equation leads to an additional spin-1 field (the photon) and the correct photon–electron interaction (see Section 6.5), we use the same mechanism to insert the weak charged-current interaction of the fermions by a spin-1 boson (the W<sup>±</sup>).<sup>14</sup> We assume the V−A structure from Section 7.2.4 and that the spin of the W is 1, as guided by experiment, so the current is as in Section 7.2.6:

$$J^+\_{\mu} = \bar{u}(\nu)\gamma\_{\mu}\frac{1}{2}(1-\gamma\_5)u(\tau) \tag{7.21}$$

<sup>14</sup>We ignore for now the fact that the local gauge invariance mechanism only works with massless particles like photons; we will return to this later.

| EM interaction | Weak CC interaction |
| --- | --- |
| Maxwell's equations; F = Q<sub>1</sub>Q<sub>2</sub>/(4πε<sub>0</sub>r<sup>2</sup>) | e.g. ν<sub>e</sub>n → pe<sup>−</sup>, μ<sup>−</sup> → e<sup>−</sup>ν̄<sub>e</sub>ν<sub>μ</sub>, K<sup>0</sup> → e<sup>+</sup>π<sup>−</sup>ν<sub>e</sub> |
| Long range | Short range |
| γ = massless spin-1 boson | Massive W<sup>±</sup> spin-1 bosons |
| Acts on particles depending on their charge | Acts on left-handed particles and right-handed antiparticles only |
| Conserves parity | Not parity-conserving |
| J<sup>EM</sup><sub>μ</sub> = ψ̄γ<sub>μ</sub>Qψ, where Q is the electric charge in units in which the electron has Q = −1; comes from the Dirac equation when insisting on local gauge invariance | J<sub>μ</sub> = ψ̄γ<sub>μ</sub>½(1 − γ<sub>5</sub>)ψ |
| Coupling constant e = √(4πα) | Coupling constant g<sub>w</sub> for all weak interactions (universality); use CKM matrix elements with quarks |

**Table 7.3** Comparison of properties of the electromagnetic and weak charged-current interactions.


R, L correspond to helicity +1, −1 if m= 0 and approximately if m∼0.

**Table 7.4** Spinors of particles and antiparticles.

## **Step 1: Left-handed particles**

Since P<sub>L</sub> = ½(1 − γ<sub>5</sub>) is a projection operator, we can operate on a spinor with it twice in succession without changing the effect, so ½(1 − γ<sub>5</sub>)u(τ) = ½(1 − γ<sub>5</sub>) ½(1 − γ<sub>5</sub>)u(τ), and

$$J^+\_{\mu} = \bar{u}(\nu)\gamma\_{\mu}\frac{1}{2}(1-\gamma\_5)\frac{1}{2}(1-\gamma\_5)u(\tau) \tag{7.22}$$

$$=\bar{u}(\nu)\frac{1}{2}(1+\gamma\_5)\gamma\_\mu\frac{1}{2}(1-\gamma\_5)u(\tau)\tag{7.23}$$

$$= \left[ \bar{u}(\nu) \frac{1}{2} (1 + \gamma\_5) \right] \gamma\_\mu \left[ \frac{1}{2} (1 - \gamma\_5) u(\tau) \right] \tag{7.24}$$

$$=\bar{u}(\nu\_{\mathcal{L}})\gamma\_{\mu}u(\tau\_{\mathcal{L}})\tag{7.25}$$

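where in line 7.23, we have used the identity γ<sub>μ</sub>γ<sub>5</sub> + γ<sub>5</sub>γ<sub>μ</sub> = 0 to move one factor of ½(1 − γ<sub>5</sub>) to the left of γ<sub>μ</sub>, where it becomes ½(1 + γ<sub>5</sub>).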

Line 7.24 is the main part of step 1 towards unification—we associate the parts with the (1±γ5) with the spinors rather than the operator. By doing so, we are left with an operator γ<sup>μ</sup> that looks like the EM operator. This looks more obvious in line 7.25, where we write the left-handed parts of the spinors directly. Table 7.4 shows the correspondence between left- and right-handed spinors and adjoint spinors for both particles and antiparticles.

W<sup>±</sup> vertex factors are the same as EM if we act only on the left-handed part of the spinor.
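The projector manipulations in lines 7.22 to 7.25 can be checked numerically. The following is a minimal sketch in plain Python, assuming the Dirac representation of the gamma matrices and the convention γ<sub>5</sub> = iγ<sup>0</sup>γ<sup>1</sup>γ<sup>2</sup>γ<sup>3</sup> (the text does not fix a representation, and the identities hold in any of them); the helper names are illustrative only.

```python
# Numerical check of the chiral projector identities used in step 1.
# Assumption: Dirac representation, gamma5 = i * gamma0 * gamma1 * gamma2 * gamma3.

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lincomb(ca, a, cb, b):
    """Return ca*a + cb*b for 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def is_close(a, b, tol=1e-12):
    """Element-wise comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

# Pauli matrices
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def gamma_spatial(sigma):
    """gamma^i = [[0, sigma_i], [-sigma_i, 0]] in 2x2 block form."""
    g = [[0] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = sigma[i][j]
            g[i + 2][j] = -sigma[i][j]
    return g

GAMMA0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
GAMMA = [GAMMA0] + [gamma_spatial(s) for s in SIGMA]

# gamma5 = i * gamma0 * gamma1 * gamma2 * gamma3
product = IDENTITY
for g in GAMMA:
    product = matmul(product, g)
GAMMA5 = [[1j * x for x in row] for row in product]

P_L = lincomb(0.5, IDENTITY, -0.5, GAMMA5)  # (1 - gamma5)/2
P_R = lincomb(0.5, IDENTITY, 0.5, GAMMA5)   # (1 + gamma5)/2

checks = {
    "P_L P_L = P_L (used in 7.22)": is_close(matmul(P_L, P_L), P_L),
    "P_L P_R = 0": is_close(matmul(P_L, P_R), ZERO),
    "gamma_mu P_L = P_R gamma_mu (used in 7.23)": all(
        is_close(matmul(g, P_L), matmul(P_R, g)) for g in GAMMA
    ),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The third check is the anticommutation identity in matrix form: pushing ½(1 − γ<sub>5</sub>) through γ<sub>μ</sub> turns it into ½(1 + γ<sub>5</sub>), which is what allows the two projectors to be attached to the two spinors separately.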

## **Interlude: weak isospin**

We now add some structure to formalize step 1 by adding a quantity I<sub>3</sub> that plays an analogous role for the weak interaction as the charge Q does for the EM interaction. We categorize the particles, indeed the separate left- and right-handed parts,<sup>15</sup> according to how they interact weakly. We define a quantity called weak isospin in a mathematically analogous way to isospin (and ordinary spin), and arrange the left-handed parts of the particles in I = ½ doublets such as (ν<sub>μL</sub>, μ<sub>L</sub>) or (u<sub>L</sub>, d′<sub>L</sub>) containing the particles that can change into each other at a weak CC vertex. The member of each multiplet with the more positive charge is assigned I<sub>3</sub> = +½ and the member with the more negative charge is assigned I<sub>3</sub> = −½.

The right-handed parts of all the particles are assigned to weak isospin singlets with I = 0 and I<sub>3</sub> = 0.

The right-handed neutrinos have been left out of the table; their nature is still being investigated. They also have I<sub>3</sub> = 0. It is possible that neutrinos are Majorana particles (see Chapter 11), i.e. they are their own antiparticles (in which case, flipping the helicity of a neutrino ν<sub>L</sub> produces the antineutrino ν<sub>R</sub>; neutrinoless double β decay becomes possible), or they could be Dirac particles in which neutrino and antineutrino are distinct, and therefore the ν<sub>R</sub> and ν̄<sub>L</sub> do not interact with any known force. If neutrinos had been exactly massless, these two situations would be experimentally indistinguishable.

The I<sub>3</sub> label, which is +½, −½, or 0, is used for the weak charged-current interaction in a similar way to the charge in an electromagnetic interaction, i.e. when it is 0, there is no interaction.

<sup>15</sup>The left- and right-handed parts are not particles in their own right. An electron spinor u(e) describes the real electron; we split it up as u(e) = u(e<sub>L</sub>) + u(e<sub>R</sub>) simply to make the similarity between the EM and weak interactions apparent in the formalism.

## **Step 2:** *W***<sup>0</sup>**

We next appeal to symmetry, and postulate a neutral partner W<sup>0</sup> to the W±. This interacts producing no change in charge within each doublet.


This is not the Z<sup>0</sup>, which also interacts with right-handed states of quarks and leptons.

## **Step 3:** *B***<sup>0</sup>**

Instead of the electromagnetic interaction, we introduce another field, the B<sup>0</sup>, which will ensure the correct electromagnetic interaction in the following step. The B<sup>0</sup> interacts with a strength proportional to a quantity called weak hypercharge Y, where Y is defined by

$$Q = I\_3 + \frac{1}{2}Y\tag{7.26}$$

The B<sup>0</sup> interacts with a current J<sup>Y</sup><sub>μ</sub>:

$$J^Y\_\mu = 2J^{\rm EM}\_\mu - 2J^0\_\mu\tag{7.27}$$

We give the B<sup>0</sup> interaction its own coupling constant g′/2 (the factor of 2 is just a convention). Examples of B<sup>0</sup> currents involving left-handed up quarks and right-handed electrons are

$$J^{Y}\_{\mu} = \bar{u}(u\_{\mathcal{L}})\gamma\_{\mu}Y(u\_{\mathcal{L}})u(u\_{\mathcal{L}}), \qquad \bar{u}(e\_{\mathcal{R}})\gamma\_{\mu}Y(e\_{\mathcal{R}})u(e\_{\mathcal{R}}) \tag{7.28}$$

where Y(u<sub>L</sub>) = +⅓ and Y(e<sub>R</sub>) = −2 using eqn 7.26. Table 7.6 at the end of this section gives Y for all the particles.
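The two hypercharge values quoted above follow directly from eqn 7.26 rearranged as Y = 2(Q − I<sub>3</sub>). As a small sketch, the same arithmetic can be applied to a few other chiral states; the charges and weak isospin assignments below are the standard ones described in the weak isospin interlude, and the state labels are illustrative names, not notation from the text.

```python
from fractions import Fraction as F

def hypercharge(charge, isospin3):
    """Weak hypercharge from eqn 7.26, Q = I3 + Y/2, i.e. Y = 2(Q - I3)."""
    return 2 * (charge - isospin3)

# (Q, I3) for a few chiral states: left-handed doublet members have
# I3 = +1/2 or -1/2, right-handed singlets have I3 = 0.
STATES = {
    "nu_L": (F(0), F(1, 2)),
    "e_L": (F(-1), F(-1, 2)),
    "e_R": (F(-1), F(0)),
    "u_L": (F(2, 3), F(1, 2)),
    "d_L": (F(-1, 3), F(-1, 2)),
    "u_R": (F(2, 3), F(0)),
    "d_R": (F(-1, 3), F(0)),
}

Y = {name: hypercharge(q, i3) for name, (q, i3) in STATES.items()}
for name, y in Y.items():
    print(f"{name}: Y = {y}")
```

Both members of a left-handed doublet come out with the same Y, which is what one would expect if the B<sup>0</sup> is to couple to the doublet as a whole.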

## **Step 4: The Weinberg angle**

We will use the notation introduced in Section 7.2.6 with currents and fields. Table 7.5 gives a summary of the symbols used, the same as those above with the addition of μ indices on the currents and fields. Since all the fields are vector or axial vector fields, they include an index μ = 0, 1, 2, 3. When a current and a field are combined to form a matrix element, there is an implied summation over μ.

The electromagnetic interaction A<sup>μ</sup> is a linear combination of W<sup>μ,0</sup> and B<sup>μ</sup>, and the orthogonal combination produces a new interaction, which is the weak neutral current Z<sup>0</sup>. A convenient way to form the linear combination is with a rotation angle. We introduce a new rotation angle θ<sub>W</sub>, the 'weak mixing angle' or 'Weinberg angle'.

$$\begin{aligned} A^{\mu} &= +B^{\mu}\cos\theta\_{\mathcal{W}} + W^{\mu,0}\sin\theta\_{\mathcal{W}} \\ Z^{\mu} &= -B^{\mu}\sin\theta\_{\mathcal{W}} + W^{\mu,0}\cos\theta\_{\mathcal{W}} \end{aligned} \tag{7.29}$$


**Table 7.5** Summary of fields discussed in this section and the symbols used.

We now write down the total GSW electroweak interaction in the form of a (current)<sub>μ</sub>(field)<sup>μ</sup> Lorentz scalar:<sup>16</sup>

$$g\_{\mathbf{w}}\left(J\_{\mu}^{+}\frac{W^{\mu,+}}{\sqrt{2}} + J\_{\mu}^{-}\frac{W^{\mu,-}}{\sqrt{2}} + J\_{\mu}^{0}W^{\mu,0}\right) + \frac{g'}{2}J\_{\mu}^{Y}B^{\mu} \tag{7.30}$$

<sup>16</sup>A more detailed treatment constructs the W<sup>+</sup> and W<sup>−</sup> in a way that makes the symmetry with W<sup>0</sup> more apparent; see Further Reading.

The factors of √2 come from group theory. Now, we invert eqn 7.29,

$$B^{\mu} = +A^{\mu}\cos\theta\_{\mathcal{W}} - Z^{\mu}\sin\theta\_{\mathcal{W}} \tag{7.31}$$

$$W^{\mu,0} = +A^{\mu}\sin\theta\_{\mathcal{W}} + Z^{\mu}\cos\theta\_{\mathcal{W}} \tag{7.32}$$

and write the neutral part of eqn 7.30 in terms of A<sup>μ</sup> and Z<sup>μ</sup> by substituting in from eqns 7.31 and 7.32:

$$\begin{aligned} g\_{\mathbf{w}} J\_{\mu}^{0} W^{\mu,0} + \frac{g'}{2} J\_{\mu}^{Y} B^{\mu} &= \left( g\_{\mathbf{w}} \sin \theta\_{\mathbf{W}} \, J\_{\mu}^{0} + g' \cos \theta\_{\mathbf{W}} \, \frac{J\_{\mu}^{Y}}{2} \right) A^{\mu} \\ &+ \left( g\_{\mathbf{w}} \cos \theta\_{\mathbf{W}} \, J\_{\mu}^{0} - g' \sin \theta\_{\mathbf{W}} \, \frac{J\_{\mu}^{Y}}{2} \right) Z^{\mu} \end{aligned} \tag{7.33}$$

## **Recovering the EM interaction**

Consider next the two parts of the right-hand side of eqn 7.33 separately. The first part must be set equal to eJ<sup>EM</sup><sub>μ</sub>A<sup>μ</sup>, otherwise the GSW theory will not reproduce the EM physics described by QED. From eqn 7.27, J<sup>EM</sup><sub>μ</sub> = J<sup>0</sup><sub>μ</sub> + ½J<sup>Y</sup><sub>μ</sub>, and so

$$\left(g\_{\mathbf{w}}\sin\theta\_{\mathbf{W}}J^{0}\_{\mu} + g'\cos\theta\_{\mathbf{W}}\frac{J^{Y}\_{\mu}}{2}\right)A^{\mu} = e\left(J^{0}\_{\mu} + \frac{1}{2}J^{Y}\_{\mu}\right)A^{\mu}\tag{7.34}$$

For this to be satisfied, we must set

$$e = g\_{\rm w} \sin \theta\_{\rm W} = g' \cos \theta\_{\rm W} \tag{7.35}$$

These two equalities are called the 'unification condition'.
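To get a feel for the sizes involved, the unification condition can be evaluated numerically. This is only a sketch: it uses e = √(4πα) from Table 7.3 with the low-energy value of α, and sin<sup>2</sup> θ<sub>W</sub> = 0.231 is an assumed illustrative input rather than a value taken from this section.

```python
import math

ALPHA = 1 / 137.036      # low-energy fine structure constant
SIN2_THETA_W = 0.231     # assumed illustrative value of sin^2(theta_W)

e = math.sqrt(4 * math.pi * ALPHA)   # e = sqrt(4 pi alpha)
sin_w = math.sqrt(SIN2_THETA_W)
cos_w = math.sqrt(1 - SIN2_THETA_W)

g_w = e / sin_w        # from e = g_w sin(theta_W)
g_prime = e / cos_w    # from e = g' cos(theta_W)

print(f"e = {e:.4f}, g_w = {g_w:.4f}, g' = {g_prime:.4f}")
print(f"g'/g_w = {g_prime / g_w:.4f}, tan(theta_W) = {sin_w / cos_w:.4f}")
```

Two consequences of eqn 7.35 are visible here: the ratio g′/g<sub>w</sub> equals tan θ<sub>W</sub>, and 1/e<sup>2</sup> = 1/g<sub>w</sub><sup>2</sup> + 1/g′<sup>2</sup>, so e is smaller than either of the two underlying couplings.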

## **The neutral-current interaction**

The second part of eqn 7.33 is the weak neutral current g<sub>z</sub>J<sup>Z</sup><sub>μ</sub>. This is completely specified (i.e. there are no free parameters) by the charged weak current and electromagnetism:

$$g\_z J^Z\_\mu = g\_\text{w} \cos\theta\_\text{W} J^0\_\mu - g' \sin\theta\_\text{W} \frac{J^Y\_\mu}{2} \tag{7.36}$$

We now manipulate this expression to remove the parts involving B<sup>0</sup>. Using eqn 7.27 and the unification condition 7.35, we have

$$g\_z J\_\mu^Z = \frac{g\_\text{w}}{\cos\theta\_\text{W}} [\cos^2\theta\_\text{W} J\_\mu^0 - \sin^2\theta\_\text{W} \left(J\_\mu^{\text{EM}} - J\_\mu^0\right)]$$

$$= \frac{g\_\text{w}}{\cos\theta\_\text{W}} (J\_\mu^0 - \sin^2\theta\_\text{W} J\_\mu^{\text{EM}}) \tag{7.37}$$

We can now put explicit forms of J<sup>0</sup><sub>μ</sub> = I<sub>3</sub>ūγ<sub>μ</sub>½(1 − γ<sub>5</sub>)u and J<sup>EM</sup><sub>μ</sub> = Qūγ<sub>μ</sub>u into this expression to give

$$g\_z J\_\mu^Z = \frac{g\_\mathbf{w}}{\cos\theta\_\mathbf{W}} \bar{u}\gamma\_\mu \left[\frac{1}{2}(1-\gamma\_5)I\_3 - \sin^2\theta\_\mathbf{W}Q\right]u$$

$$=\frac{g\_\mathbf{w}}{\cos\theta\_\mathbf{W}} \bar{u}\gamma\_\mu \left[\frac{1}{2}(1-\gamma\_5)I\_3 - \sin^2\theta\_\mathbf{W}Q\left(\frac{1}{2}(1-\gamma\_5) + \frac{1}{2}(1+\gamma\_5)\right)\right]u$$

$$=\frac{g\_\mathbf{w}}{\cos\theta\_\mathbf{W}} \bar{u}\gamma\_\mu \left[g\_\mathbf{L}\frac{1}{2}(1-\gamma\_5) + g\_\mathbf{R}\frac{1}{2}(1+\gamma\_5)\right]u\tag{7.38}$$

where g<sub>L</sub> = I<sub>3</sub> − Q sin<sup>2</sup> θ<sub>W</sub> and g<sub>R</sub> = −Q sin<sup>2</sup> θ<sub>W</sub> are the couplings of the left- and right-handed particles, respectively. Note that, apart from neutrinos for which Q = 0 and so g<sub>R</sub> = 0, the neutral current interacts with both the left- and right-handed states of the particle but with different strengths. In contrast, the charged-current interaction involves only the left-handed states of all particles.

It is also possible to rearrange eqn 7.38 to define coupling constants c<sub>V</sub> and c<sub>A</sub> in terms of g<sub>L</sub> and g<sub>R</sub>:

$$g\_z J^Z\_\mu = \frac{g\_\text{w}}{\cos \theta\_\text{W}} \bar{u} \gamma\_\mu \left[ \frac{1}{2} (g\_\text{L} + g\_\text{R}) - \frac{1}{2} (g\_\text{L} - g\_\text{R}) \gamma\_5 \right] u \tag{7.39}$$

$$= \frac{g\_{\rm w}}{\cos \theta\_{\rm W}} \bar{u} \gamma\_{\mu} \frac{1}{2} (c\_{\rm V} - c\_{\rm A} \gamma\_{5}) u \tag{7.40}$$


**Table 7.6** The electroweak properties associated with each group of fermions.

where c<sub>V</sub> = I<sub>3</sub> − 2Q sin<sup>2</sup> θ<sub>W</sub> and c<sub>A</sub> = I<sub>3</sub>. Different applications prefer to use either g<sub>L</sub>, g<sub>R</sub> or c<sub>V</sub>, c<sub>A</sub>.<sup>17</sup> The values of c<sub>V</sub><sup>f</sup> and c<sub>A</sub><sup>f</sup> for each fermion f along with other properties of the fermions are shown in Table 7.6.

<sup>17</sup>For simplicity, we can use the value of I<sub>3</sub> for the left-handed particle in these equations all the time; if we have a right-handed component to the particle, we can write it as ½(1 + γ<sub>5</sub>)u, and when combined with the (1 − γ<sub>5</sub>) in eqn 7.38, it causes the term with I<sub>3</sub> in it to be zero.

## **7.4.2 Weak neutral currents**

The existence of weak neutral currents (see Section 8.3) was the first critical prediction of the unified electroweak theory. The amplitude for ν̄<sub>μ</sub>e → ν̄<sub>μ</sub>e is

$$\begin{split} M &= \frac{g\_{\text{w}}^{2}}{8M\_{Z}^{2}\cos^{2}\theta\_{\text{W}}} \Big[ \bar{u}(\bar{\nu}\_{\mu})\gamma\_{\mu} \left( c\_{\text{V}}^{(\nu)} - c\_{\text{A}}^{(\nu)} \gamma\_{5} \right) u(\bar{\nu}\_{\mu}) \Big] \\ &\times \left[ \bar{u}(e)\gamma^{\mu} \left( c\_{\text{V}}^{(e)} - c\_{\text{A}}^{(e)} \gamma\_{5} \right) u(e) \right] \end{split} \tag{7.41}$$

We can look up the couplings from Table 7.6, which are c<sub>V</sub><sup>(ν)</sup> = c<sub>A</sub><sup>(ν)</sup> = ½, c<sub>V</sub><sup>(e)</sup> = −½ + 2 sin<sup>2</sup> θ<sub>W</sub>, and c<sub>A</sub><sup>(e)</sup> = −½, and so

$$\begin{split} M &= \frac{g\_{\rm w}^{2}}{8M\_{Z}^{2}\cos^{2}\theta\_{\rm W}} \left[ \bar{u}(\bar{\nu}\_{\mu})\gamma\_{\mu}\frac{1}{2}(1-\gamma\_{5})u(\bar{\nu}\_{\mu}) \right] \\ &\times \left[ \bar{u}(e)\gamma^{\mu} \left(2\sin^{2}\theta\_{\rm W} - \frac{1}{2}(1-\gamma\_{5})\right)u(e) \right] \end{split} \tag{7.42}$$

Note that the current involving the neutrino takes the same ½(1 − γ<sub>5</sub>) form as the charged current. What is special about the neutrino that causes this? It is the fact that it is electrically neutral, so all of its interaction comes from the W<sup>0</sup>, with none from the B<sup>0</sup>.
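The couplings defined in eqns 7.38 to 7.40 are simple functions of Q, I<sub>3</sub>, and sin<sup>2</sup> θ<sub>W</sub>, so they can be tabulated in a few lines. The sketch below assumes sin<sup>2</sup> θ<sub>W</sub> = 0.231 as an illustrative input and uses one representative fermion of each type; the function and dictionary names are hypothetical.

```python
from fractions import Fraction as F

SIN2_THETA_W = 0.231  # assumed illustrative value of sin^2(theta_W)

# (Q, I3 of the left-handed state) for one representative of each fermion type
FERMIONS = {
    "nu": (F(0), F(1, 2)),
    "e": (F(-1), F(-1, 2)),
    "u": (F(2, 3), F(1, 2)),
    "d": (F(-1, 3), F(-1, 2)),
}

def neutral_couplings(charge, isospin3, sin2=SIN2_THETA_W):
    """Return gL, gR (eqn 7.38) and cV = gL + gR, cA = gL - gR (eqn 7.40)."""
    g_left = float(isospin3) - float(charge) * sin2
    g_right = -float(charge) * sin2
    return {
        "gL": g_left,
        "gR": g_right,
        "cV": g_left + g_right,
        "cA": g_left - g_right,
    }

COUPLINGS = {name: neutral_couplings(q, i3) for name, (q, i3) in FERMIONS.items()}
for name, c in COUPLINGS.items():
    print(name, {key: round(value, 4) for key, value in c.items()})
```

For the neutrino the right-handed coupling vanishes and c<sub>V</sub> = c<sub>A</sub> = ½, which is the origin of the pure ½(1 − γ<sub>5</sub>) structure noted above; for the electron c<sub>V</sub> is small because 2 sin<sup>2</sup> θ<sub>W</sub> is close to ½ for the assumed mixing angle.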

The calculation of the cross section involves averaging over initial spins and summing over final spins. It can be done with several time-saving tricks (see Further Reading). The result for dσ(ν̄<sub>μ</sub>e → ν̄<sub>μ</sub>e)/dE<sub>e</sub> is

$$\begin{split} \frac{\mathrm{d}\sigma}{\mathrm{d}E\_{e}} &= \frac{G\_{\mathrm{F}}^{2}m\_{e}}{2\pi} \left\{ \left( c\_{\mathrm{V}}^{(e)} - c\_{\mathrm{A}}^{(e)} \right)^{2} + \left( c\_{\mathrm{V}}^{(e)} + c\_{\mathrm{A}}^{(e)} \right)^{2} \left( 1 - \frac{E\_{e}}{E\_{\nu}} \right)^{2} \\ &- \frac{m\_{e}E\_{e}}{E\_{\nu}^{2}} \left[ \left( c\_{\mathrm{V}}^{(e)} \right)^{2} - \left( c\_{\mathrm{A}}^{(e)} \right)^{2} \right] \right\} \end{split} \tag{7.43}$$

This can be integrated to give the cross section

$$
\sigma = \int \frac{\mathrm{d}\sigma}{\mathrm{d}E\_e} \,\mathrm{d}E\_e \sim E\_\nu \times 10^{-45} \,\mathrm{m}^2.
$$

(where E<sup>ν</sup> is in GeV)—a very small cross-section! The neutral current is also observable in neutrino–nucleon collisions. The cross sections are still small, but somewhat larger than for scattering off electrons. The observation and precision measurements of neutral currents will be discussed in Chapter 8.
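The integration of eqn 7.43 can be sketched as follows, under two assumptions that are not spelled out in the text: E<sub>ν</sub> ≫ m<sub>e</sub>, so that the term proportional to m<sub>e</sub>E<sub>e</sub>/E<sub>ν</sub><sup>2</sup> is negligible, and the electron energy is taken to range over approximately 0 ≤ E<sub>e</sub> ≤ E<sub>ν</sub>:

$$\sigma \approx \frac{G\_{\rm F}^{2} m\_e}{2\pi} \int\_0^{E\_\nu} \left[ \left( c\_{\rm V}^{(e)} - c\_{\rm A}^{(e)} \right)^{2} + \left( c\_{\rm V}^{(e)} + c\_{\rm A}^{(e)} \right)^{2} \left( 1 - \frac{E\_e}{E\_\nu} \right)^{2} \right] {\rm d}E\_e = \frac{G\_{\rm F}^{2} m\_e E\_\nu}{2\pi} \left[ \left( c\_{\rm V}^{(e)} - c\_{\rm A}^{(e)} \right)^{2} + \frac{1}{3} \left( c\_{\rm V}^{(e)} + c\_{\rm A}^{(e)} \right)^{2} \right]$$

Here the second term uses the elementary integral of (1 − E<sub>e</sub>/E<sub>ν</sub>)<sup>2</sup> over the stated range, which equals E<sub>ν</sub>/3. In this approximation the cross section grows linearly with E<sub>ν</sub>, which is the energy dependence quoted above.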

## **7.4.3 Masses of** *W* **and** *Z* **bosons**

We can now use the electroweak unification theory to predict the masses of the W and Z bosons in terms of the weak mixing angle θ<sub>W</sub> and the Fermi coupling constant G<sub>F</sub>. We start with the relation between G<sub>F</sub> and M<sub>W</sub>, eqn 7.5, and, substituting for g<sub>w</sub> from eqn 7.35, we find

$$\frac{G\_{\rm F}}{\sqrt{2}} = \frac{e^2}{8M\_W^2 \sin^2 \theta\_{\rm W}}$$

$$M\_W = \left(\frac{\sqrt{2}e^2}{8G\_{\rm F}}\right)^{1/2} \frac{1}{\sin \theta\_{\rm W}}\tag{7.44}$$

Using the unification condition 7.35 again, we can simply relate the masses of the W and Z bosons in terms of the weak mixing angle:

$$\frac{M\_W}{M\_Z} = \cos\theta\_W \tag{7.45}$$

Therefore, if we have measurements of sin<sup>2</sup> θ<sub>W</sub> and G<sub>F</sub> from low-energy experiments, we can predict the masses of the W and the Z bosons. Hence the discovery of the W and the Z bosons (see Chapter 8) at the expected masses was a triumph for the electroweak theory.
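As a rough numerical illustration of eqns 7.44 and 7.45, the sketch below evaluates the lowest-order masses. The inputs are assumptions made for the example: the low-energy value of α, the measured Fermi constant, and sin<sup>2</sup> θ<sub>W</sub> = 0.223. The result is sensitive to which definition of the mixing angle is used, so the numbers should be read as indicative only.

```python
import math

ALPHA = 1 / 137.036     # low-energy fine structure constant
G_F = 1.1663787e-5      # Fermi constant in GeV^-2
SIN2_THETA_W = 0.223    # assumed input value of sin^2(theta_W)

def tree_level_masses(sin2, alpha=ALPHA, g_f=G_F):
    """Return (M_W, M_Z) in GeV from eqns 7.44 and 7.45, with e^2 = 4 pi alpha."""
    e_squared = 4 * math.pi * alpha
    m_w = math.sqrt(math.sqrt(2) * e_squared / (8 * g_f)) / math.sqrt(sin2)
    m_z = m_w / math.sqrt(1 - sin2)
    return m_w, m_z

M_W, M_Z = tree_level_masses(SIN2_THETA_W)
print(f"M_W = {M_W:.1f} GeV, M_Z = {M_Z:.1f} GeV")
```

With these inputs the lowest-order estimates come out a couple of per cent below the measured masses of roughly 80.4 GeV and 91.2 GeV, which is the size of effect that higher-order corrections are needed to account for.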

**Fig. 7.8** Radiative corrections to the W and Z masses from top-quark loops.

**Fig. 7.9** Radiative corrections to the W mass from Higgs loops. Equivalent diagrams also apply to the Z.

The above discussion of the W and Z bosons is valid at lowest order in perturbation theory. There are, however, small but important radiative corrections Δr from higher-order diagrams. The results of these corrections are parameterized by modifying eqn 7.44 to

$$M\_W^2 = \frac{\sqrt{2}e^2}{8G\_F \sin^2 \theta\_W \left(1 - \Delta r\right)}\tag{7.46}$$

There are contributions to Δr from fermion loops containing t quarks (in principle, other quarks contribute, but the t quark is dominant because of its much larger mass) as shown in Fig. 7.8. These give a contribution

$$(\Delta r)\_{\rm top} = -\frac{3G\_{\rm F}m\_t^2}{8\sqrt{2}\pi^2} \frac{c\_W^2}{s\_W^2} \tag{7.47}$$

where we define s<sub>W</sub> = sin θ<sub>W</sub> and c<sub>W</sub><sup>2</sup> = 1 − s<sub>W</sub><sup>2</sup>. The masses of the W and Z are also affected by Higgs loops (see Fig. 7.9). The contribution to Δr is given by

$$(\Delta r)\_{\text{Higgs}} = \frac{11G\_\text{F}M\_Z^2c\_W^2}{24\sqrt{2}\pi^2} \ln \frac{m\_H^2}{M\_Z^2} \tag{7.48}$$

Δr (eqn 7.46) is the sum of the virtual top and Higgs loop corrections to M<sub>W</sub> and M<sub>Z</sub> as well as the running of the fine structure constant (α) from low energy to the value at M<sub>Z</sub>. The effect of the running of α is given by

$$\delta r\_0 = 1 - \alpha/\alpha(M\_Z)$$

where α is the value of the fine structure constant at low energy and α(M<sub>Z</sub>) is the value at the scale Q<sup>2</sup> = M<sub>Z</sub><sup>2</sup>. The overall sum of these radiative corrections is given by

$$
\Delta r = \delta r\_0 + \Delta r\_{\text{top}} + \Delta r\_{\text{Higgs}}
$$

Therefore, precision measurements of M<sub>W</sub>, M<sub>Z</sub>, and G<sub>F</sub> make predictions for the allowed values of the masses of the top quark and the Higgs boson. Note that the contribution from the Higgs depends logarithmically on m<sub>H</sub>, whereas the contribution from the top quark scales as m<sub>t</sub><sup>2</sup>. Therefore, even with no knowledge of the Higgs mass other than that imposed by unitarity, precision electroweak measurements including M<sub>W</sub> and M<sub>Z</sub> predicted the mass of the top quark to be around 170 GeV (see Chapter 8).
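The relative sizes of the three contributions to Δr can be illustrated numerically. The sketch below takes the top term to be −(3G<sub>F</sub>m<sub>t</sub><sup>2</sup>/8√2π<sup>2</sup>)(c<sub>W</sub><sup>2</sup>/s<sub>W</sub><sup>2</sup>), consistent with the m<sub>t</sub><sup>2</sup> scaling stated above, and uses eqn 7.48 and the expression for δr<sub>0</sub> as written. All numerical inputs (α(M<sub>Z</sub>) ≈ 1/128, sin<sup>2</sup> θ<sub>W</sub> = 0.223, m<sub>t</sub> = 173 GeV, m<sub>H</sub> = 125 GeV) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

ALPHA = 1 / 137.036     # fine structure constant at low energy
ALPHA_MZ = 1 / 128.0    # assumed approximate value at the Z mass
G_F = 1.1663787e-5      # Fermi constant in GeV^-2
M_Z = 91.19             # GeV
SIN2_THETA_W = 0.223    # assumed input value
M_TOP = 173.0           # GeV, assumed
M_HIGGS = 125.0         # GeV, assumed

def delta_r(m_top=M_TOP, m_higgs=M_HIGGS, sin2=SIN2_THETA_W):
    """Return the running, top, and Higgs pieces of Delta r and their sum."""
    cos2 = 1 - sin2
    prefactor = G_F / (math.sqrt(2) * math.pi ** 2)
    running = 1 - ALPHA / ALPHA_MZ
    top = -3 * prefactor * m_top ** 2 / 8 * cos2 / sin2
    higgs = 11 * prefactor * M_Z ** 2 * cos2 / 24 * math.log(m_higgs ** 2 / M_Z ** 2)
    return {"running": running, "top": top, "higgs": higgs,
            "total": running + top + higgs}

def corrected_m_w(total, sin2=SIN2_THETA_W):
    """M_W in GeV from eqn 7.46 for a given total Delta r."""
    e_squared = 4 * math.pi * ALPHA
    return math.sqrt(math.sqrt(2) * e_squared / (8 * G_F * sin2 * (1 - total)))

TERMS = delta_r()
M_W = corrected_m_w(TERMS["total"])
for name, value in TERMS.items():
    print(f"{name}: {value:+.4f}")
print(f"M_W = {M_W:.2f} GeV")
```

Calling `delta_r(m_top=183.0)` or `delta_r(m_higgs=250.0)` shows the different sensitivities: a modest change in the top mass shifts Δr noticeably, whereas doubling the Higgs mass changes it only slightly, in line with the quadratic and logarithmic dependences described above.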

## **7.4.4 The standard model, how good is it?**

We now look at a list of some of the difficulties with the weak interaction:

**Fig. 7.10** Antineutrino–electron scattering in the original Fermi 4-point theory (a) and including the W intermediate boson (b).

**Fig. 7.11** νν̄ → W<sup>+</sup>W<sup>−</sup> t-channel, electron exchange (a) and neutral-current, Z exchange (b).

**Fig. 7.12** The three diagrams contributing to e<sup>+</sup>e<sup>−</sup> → W<sup>+</sup>W<sup>−</sup>, via Z exchange (a), γ exchange (b), and t-channel neutrino exchange (c).

**Fig. 7.13** Higher-order Feynman diagram with an e<sup>+</sup>e<sup>−</sup> loop for the process e<sup>+</sup>e<sup>−</sup> → e<sup>+</sup>e<sup>−</sup>.

(1) The original Fermi 4-point theory had a problem: the cross section σ ∝ G<sub>F</sub><sup>2</sup>E<sup>2</sup>, where E is the centre-of-mass energy. This is fine at low energy, but at 300 GeV the scattering probability becomes bigger than 1. This is known as unitarity violation and is bad news for a theory. By introducing the W boson and moving away from a four-point interaction (Fig. 7.10), a propagator term 1/(M<sub>W</sub><sup>2</sup> − q<sup>2</sup>) is introduced and the cross section stops increasing.
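The energy scale of about 300 GeV quoted in item (1) can be motivated by dimensional analysis alone. The following is only an order-of-magnitude sketch that ignores all numerical factors such as powers of π: since G<sub>F</sub> has dimensions of inverse energy squared, the dimensionless combination G<sub>F</sub>E<sup>2</sup> reaches unity at E = G<sub>F</sub><sup>−1/2</sup>.

```python
import math

G_F = 1.1663787e-5   # Fermi constant in GeV^-2

# G_F * E^2 is dimensionless; it becomes of order one at E = G_F^(-1/2).
E_SCALE = 1 / math.sqrt(G_F)
print(f"G_F * E^2 = 1 at E = {E_SCALE:.0f} GeV")
```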


$$\int\_{}^{\infty} \frac{1}{q^4} q^3 \,\mathrm{d}q = \ln|q| \tag{7.49}$$

which is divergent. The solution, renormalization theory, took a long time to develop and to be shown to work. Eventually, this task was completed by 't Hooft and Veltman, who showed that all locally gauge-invariant theories are renormalizable [130].


## **Chapter summary**


## **Further reading**


concise account of the quantum field theory and gauge symmetry underlying electroweak unification.


## **Exercises**


Hint: Show that ū<sub>R</sub>γ<sub>μ</sub>u<sub>L</sub> = 0 by adapting eqn 7.25 and then proceed backwards through the steps used to derive the original eqn 7.25; you should find a combination of projection operators that gives zero.

(7.4) What are the possible decay modes of the τ? Given that the lifetime of the muon is 2 × 10<sup>−6</sup> s, estimate the expected lifetime of the τ. How might it be measured?

Neglecting density-of-states factors, what is the expected ratio of branching ratios for

$$\frac{\tau^+ \to K^+ \bar{\nu}\_\tau}{\tau^+ \to \pi^+ \bar{\nu}\_\tau} \quad ?$$

Starting with an intense 800 GeV proton beam, how could a high-energy neutrino beam, enriched in ν<sup>τ</sup> , be produced? Explain the origin of reducible and irreducible backgrounds of other neutrino flavours.

	- (a) μ<sup>+</sup> → e<sup>+</sup> + ¯ν<sup>μ</sup> + ν<sup>e</sup>
	- (b) K<sup>+</sup> → μ<sup>+</sup> + ν<sup>μ</sup>
	- (c) π<sup>+</sup> → μ<sup>+</sup> + ν<sup>μ</sup>
	- (d) D<sup>+</sup> → K<sup>−</sup> + π<sup>+</sup> + π<sup>+</sup>
	- (e) D<sup>+</sup> → K<sup>+</sup> + π<sup>+</sup> + π<sup>−</sup>

The Λ has a mean lifetime of 2.6 × 10<sup>−10</sup> s and decays into p + e<sup>−</sup> + ν̄<sub>e</sub> with a branching fraction of 8.3 × 10<sup>−4</sup>. The Λ<sub>c</sub><sup>+</sup> (udc) has a mean lifetime of 2.1 × 10<sup>−13</sup> s. Estimate the branching fraction of Λ<sub>c</sub><sup>+</sup> → Λ + e<sup>+</sup> + ν<sub>e</sub> and comment on how your result compares with the measured value.

[m(Λ<sub>c</sub><sup>+</sup>) = 2.285 GeV, BR(Λ<sub>c</sub><sup>+</sup> → Λe<sup>+</sup>ν<sub>e</sub>) = (2.1 ± 0.6)%.]

	- (a) the ratio of μ-pair production to single-μ production in ν<sup>μ</sup> interactions on nuclei;
	- (b) the decay rate of D<sup>+</sup> <sup>s</sup> → τ <sup>+</sup>ν<sup>τ</sup> ;
	- (c) the decay rate of B¯<sup>0</sup> → D<sup>∗</sup><sup>+</sup>μ<sup>−</sup>νμ;
	- (d) the region of the μ-momentum spectrum near the kinematic endpoint from B¯ → Xμν<sup>μ</sup> decays (where X is any hadronic final state).

How are these results affected by the strong interaction?


# **8 Experimental tests of electroweak theory**

Chapter 7 covered the basic ideas of weak-interaction theory and the unification of the electromagnetic and weak interactions. The weak charged-current interaction is expressed using a unique coupling constant g<sub>w</sub> for all leptonic vertices: (ν<sub>e</sub>, e), (ν<sub>μ</sub>, μ), (ν<sub>τ</sub>, τ). The coupling constant is also valid for charged-current interactions involving quarks using the weak quark eigenstates, (u, d′), (c, s′), (t, b′), which are a rotation of the flavour eigenstates. We showed how the absence of flavour-changing weak neutral currents is explained using the Glashow, Iliopoulos, and Maiani (GIM) mechanism. We gave an outline of how the weak and electromagnetic interactions are unified into a single consistent theory. The resulting electroweak theory includes only one additional parameter, sin θ<sub>W</sub>, and predicts all the features of the weak neutral current.

In this chapter, we review some of the experimental evidence that underpins the electroweak theory. We start with some of the key neutrino experiments, then look at the discovery of neutral currents, as this was the first step towards a unified electroweak theory. The key prediction of the theory was the existence of the massive W and Z bosons, with quite precise estimates of their masses. Their discovery with masses in the predicted range was a triumph! Electroweak theory has now been probed to much higher precision by many experiments, particularly at LEP and the Tevatron. Many details have since been filled in, but the basic structure remains unchanged.

## **8.1 Neutrinos**

When Pauli postulated the existence of neutrinos, he was afraid that the cross sections were so small that they would never be measurable. The experimental discovery of neutrinos was made possible by the intense flux of antineutrinos from nuclear reactors.<sup>1</sup> The reaction studied was ν̄<sub>e</sub>p → e<sup>+</sup>n, using water as the target. Each e<sup>+</sup> annihilated with an e<sup>−</sup> to produce two photons. The photons were detected using tanks of liquid scintillator viewed by photomultipliers. To reduce the backgrounds, cadmium chloride was added to the water. This allowed neutrons to be captured by n <sup>108</sup>Cd → <sup>109</sup>Cd γ. The neutron capture happened a few microseconds after the first reaction. Therefore, a clean signal for the reaction was a flash of light followed by a delayed coincidence.<sup>2</sup>

<sup>1</sup>Later, high-energy proton accelerators were used to create neutrino beams at much higher energies (see Chapter 11).

<sup>2</sup>The proof that the events were from the reactor as opposed to backgrounds like cosmic rays was provided by running with the reactor off.


**Fig. 8.1** Schematic of the experiment to measure the neutrino helicity.

If detecting neutrinos was challenging, how could one measure the helicity of a neutrino? The answer came in a brilliant experiment [79]. The key idea was to transfer the helicity of the neutrino to a photon, which could be measured relatively easily. The source of the neutrinos was electron capture <sup>152</sup>Eu e<sup>−</sup> → <sup>152</sup>Sm<sup>∗</sup> ν<sub>e</sub>, with the subsequent decay <sup>152</sup>Sm<sup>∗</sup> → <sup>152</sup>Sm γ. As the energy is shared between the Sm and the γ, in general the γ does not have sufficient energy to be absorbed by another Sm nucleus. However, if the Sm<sup>∗</sup> decays with the γ travelling in the direction of the Sm, then it will have sufficient energy to make a resonant scatter off Sm (see Exercise 8.1). The photon must have the same helicity as the neutrino (see Exercise 8.1). The experimental apparatus is sketched in Fig. 8.1. Photons of the correct energy will be resonantly scattered by the ring of Sm<sub>2</sub>O<sub>3</sub> and then detected by the NaI(Tl) scintillator coupled to a photomultiplier. The rate is measured with two polarities of the magnetic field. The scattering cross section for γSm is greater if the spin of the photon is anti-aligned with that of the iron than if it is aligned.<sup>3</sup> The polarization of the photons could thus be determined and hence the helicity of the neutrino also. It was found to be consistent with −1, as expected.

<sup>3</sup>The electron spins responsible for ferromagnetism are aligned antiparallel to the **B** field.

The concept of lepton number was invented to explain the absence of decays μ → eγ. Neutrinos must also carry lepton number. The experimental demonstration [69] that ν<sub>e</sub> are distinct from ν<sub>μ</sub> came from an experiment in which a ν<sub>μ</sub> beam was fired at a 5000 ton steel wall<sup>4</sup> to absorb all particles other than neutrinos. The neutrinos then interacted in aluminium plates and the resulting charged particles were detected by spark chambers. Muons could be separated from electrons because of their longer range. The observation of muons coupled with the absence of electrons showed that ν<sub>μ</sub> were distinct from ν<sub>e</sub>.

<sup>4</sup>The steel plates used came from an old battleship.

When the τ lepton was discovered, it was assumed that there would be an associated neutrino, ν<sub>τ</sub>. The experimental confirmation of the ν<sub>τ</sub> was made by the DONUT Collaboration [92]. Producing a ν<sub>τ</sub> beam is quite a challenge. The first step was to direct an intense beam of 800 GeV protons at a tungsten target. The forward-going interaction products then entered a magnetic field, which swept charged particles aside, greatly enhancing the neutrino content of the beam, including ν<sub>τ</sub> (from the decays of charm mesons, such as the D<sub>s</sub>). As for other neutrinos, to identify a ν<sub>τ</sub> it must first undergo a charged-current interaction, ν<sub>τ</sub>X → τY, producing a charged τ, which can then be detected. The short lifetime of the τ leads to tracks with 'kinks' near the primary vertex. These kinks were identified in an emulsion chamber, but in order to determine which volume of the emulsion to measure, a magnetic spectrometer was used to identify candidate τ events. Four such events were found, significantly above the background of 0.34 events.

## **8.2 Charged currents**

The theory of charged currents developed in Chapter 7 is based on a V−A structure. This is parity-violating, and the first clear experimental observation of parity violation was in the decay <sup>60</sup>Co (J<sup>P</sup> = 5<sup>+</sup>) → <sup>60</sup>Ni + e<sup>−</sup> + ν̄<sub>e</sub>. The <sup>60</sup>Co nuclei were aligned by an external magnetic field.<sup>5</sup> The rate of emission of electrons was found to be consistent with an angular distribution of the form 1 − (v/c) cos θ, where θ is the angle between the electron 3-momentum and the magnetic field direction, as predicted by a V−A interaction. A more recent and more direct demonstration of parity violation is given by the angular distribution of leptons from W decays (see Section 8.5.1). Very strong evidence for the V−A theory also comes from the ratio of branching ratios BR(π<sup>+</sup> → e<sup>+</sup>ν<sub>e</sub>)/BR(π<sup>+</sup> → μ<sup>+</sup>ν<sub>μ</sub>); see Exercise 6.4. The cleanest probe of the V−A theory comes from muon decays μ → eν̄<sub>e</sub>ν<sub>μ</sub>. Using intense muon beams (from pion decays), very precise measurements of the electron spectrum from stopped muons agree with the V−A theory and place stringent limits on contributions from other interactions.

<sup>5</sup>This required adiabatic demagnetiza-

## **8.2.1 Measurements of CKM matrix elements**

In Chapter 7, we outlined how the theory of charged-current weak interactions of leptons could be extended to quarks with a universal coupling strength. This required the CKM matrix to allow for the rotation between the weak and mass eigenstates of the quarks. The theory gives no predictions for the values of these matrix elements, so they have to be determined experimentally. However, as multiple experiments can be performed to determine the same element of the CKM matrix, powerful consistency checks of the theory can be performed. The CKM matrix should be unitary, so this gives additional constraints on the theory. This aspect will be developed in Chapter 10. We give a very brief outline here of some of the methods used to measure the CKM matrix elements.<sup>6</sup>

Measurement of CKM matrix elements proceeds by measuring many different processes; for example Vud is measured by comparing hadronic beta decays with the decay of the muon, and Vus is measured by comparing the decay K → πeν with a non-strange decay.

The ratio of Vub/Vcb can be determined from the muon spectrum in the decays b → cμν¯<sup>μ</sup> and b → uμν¯μ. As the u quark is much lighter than the c quark, the spectrum of muons from b → u decays extend beyond the end of those from b → c. This enables a clean sample of muons from b → u to be identified despite the fact that the ratio Vub/Vcb 1.

The ratio Vcd/Vud can be measured by observing the rate of dimuon to single-muon production in charged-current neutrino interactions on hadrons. If the neutrinos are above threshold, they can produce either charm or up quarks (see Fig. 8.2). A known fraction of the events with a charm quark will result in the semimuonic decay of the charm quark (c → μX) and hence result in events with two muons, whereas the events in which an up quark was produced will result in events with single muons. As the Feynman diagrams are the same for the two processes, the difference in the rates is simply given by the ratio of the CKM elements Vcd/Vud once BR(c → μX) has been accounted for.<sup>7</sup>

tion to achieve sufficiently low temperatures, 0.01 K.

<sup>6</sup>See the review from the Particle Data Group in Further Reading for a comprehensive discussion.

<sup>7</sup>We have assumed that the energy is far above threshold.

**Fig. 8.3** (a) νe<sup>−</sup> → νe<sup>−</sup>, for any neutrino flavour, via a neutral current, with Z exchange. (b) ν̄<sub>e</sub>e<sup>−</sup> → ν̄<sub>e</sub>e<sup>−</sup>, for an electron antineutrino, via a charged current, with W exchange.

## **8.3 Neutral currents**

The neutral current was first discovered in the Gargamelle bubble chamber at CERN in the reaction ν̄<sub>μ</sub>e<sup>−</sup> → ν̄<sub>μ</sub>e<sup>−</sup>, elastic scattering of ν̄<sub>μ</sub> off atomic electrons. The scattering reaction of either ν<sub>μ</sub> or ν̄<sub>μ</sub> off electrons is an unambiguous signal of a neutral current. Scattering of either ν<sub>e</sub> or ν̄<sub>e</sub> is ambiguous since there is a charged-current diagram for each of these reactions (see Fig. 8.3).

About 100 events were observed in total and, by measuring the cross section, a value of sin<sup>2</sup> θ<sub>W</sub> = 0.24 ± 0.04 was obtained. This was the first success for the unified electroweak theory and it enabled the prediction of the masses of the W and Z bosons (see Section 8.5.1).

Subsequent studies of weak neutral currents with neutrino beams used electronic detectors that allowed the accumulation of much larger numbers of events. For example, the CHARM2 experiment used the neutral-current reactions ν<sub>μ</sub>e<sup>−</sup> → ν<sub>μ</sub>e<sup>−</sup> and ν̄<sub>μ</sub>e<sup>−</sup> → ν̄<sub>μ</sub>e<sup>−</sup>. The ratio of the cross sections at the same energy, R = σ(ν<sub>μ</sub>e)/σ(ν̄<sub>μ</sub>e), is given by (see Exercise 8.2)

$$R = 3\frac{1 - 4\sin^2\theta\_W + \frac{16}{3}\sin^4\theta\_W}{1 - 4\sin^2\theta\_W + 16\sin^4\theta\_W} \tag{8.1}$$
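Equation (8.1) can be inverted numerically to turn a measured ratio into a value of sin<sup>2</sup> θ<sub>W</sub>. The following is a minimal sketch, not the CHARM2 analysis; the function names are illustrative, and the bisection relies on R falling monotonically from 3 to 1/3 as sin<sup>2</sup> θ<sub>W</sub> goes from 0 to 0.5.

```python
def ratio_r(sin2_theta_w: float) -> float:
    """Eqn (8.1): sigma(nu_mu e) / sigma(anti-nu_mu e) at the same energy."""
    s = sin2_theta_w
    return 3.0 * (1.0 - 4.0 * s + 16.0 / 3.0 * s**2) / (1.0 - 4.0 * s + 16.0 * s**2)

def solve_sin2_theta_w(r_measured: float, lo: float = 0.0, hi: float = 0.5) -> float:
    """Invert eqn (8.1) by bisection; R decreases monotonically on [lo, hi]."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ratio_r(mid) > r_measured:
            lo = mid  # R still too large, so sin^2(theta_W) must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip: predict R for a given mixing angle, then recover the angle
r_pred = ratio_r(0.2324)
print(r_pred, solve_sin2_theta_w(r_pred))
```

A convenient check is that R = 1 at sin<sup>2</sup> θ<sub>W</sub> = 0.25, so values of R slightly above 1 correspond to mixing angles just below 0.25.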

An intense beam of neutrinos was used and particular care was taken to reduce backgrounds. For example, consider a neutral-current interaction on a nucleus, ν<sub>μ</sub>N → ν<sub>μ</sub>π<sup>0</sup>X. The photons from the π<sup>0</sup> decay may pair-convert (γ → e<sup>+</sup>e<sup>−</sup>) in the detector material, potentially faking the single-electron signature. For genuine neutrino–electron scattering, however, the electron energy (E<sub>e</sub>) and angle (θ<sub>e</sub>) with respect to the neutrino beam are limited by (see Exercise 8.3)

$$E\_e \theta\_e^2 < 2m\_e \tag{8.2}$$

Hence, for signal events, the quantity E<sub>e</sub>θ<sub>e</sub><sup>2</sup> will be peaked at small values on top of a continuous background. This requires a neutrino detector with very good angular and energy resolution. The target was made from glass since this contains elements of relatively low atomic number and hence minimizes multiple scattering, which limits the angular resolution (see Chapter 4). The final result from the CHARM2 [133] experiment was sin<sup>2</sup> θ<sub>W</sub> = 0.2324 ± 0.0083.
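The bound in eqn (8.2) can be checked numerically from exact two-body kinematics for elastic scattering off an electron at rest. The sketch below assumes a 20 GeV neutrino (an illustrative value, not taken from the text) and scans electron kinetic energies above 1 GeV, since the bound is a small-angle result that applies to energetic electrons.

```python
import math

M_E = 0.000511  # electron mass in GeV

def electron_angle(e_nu: float, t_e: float) -> float:
    """Exact lab-frame angle (rad) of the recoil electron in nu e -> nu e
    for neutrino energy e_nu and electron kinetic energy t_e (both in GeV)."""
    cos_theta = (e_nu + M_E) / e_nu * math.sqrt(t_e / (t_e + 2.0 * M_E))
    return math.acos(min(1.0, cos_theta))

E_NU = 20.0  # GeV, illustrative beam energy (an assumption)
t_max = 2.0 * E_NU**2 / (2.0 * E_NU + M_E)  # largest kinematically allowed T

# Scan electron kinetic energies from 1 GeV up to just below t_max
values = []
for i in range(200):
    t_e = 1.0 + (0.999 * t_max - 1.0) * i / 199
    e_e = t_e + M_E
    values.append(e_e * electron_angle(E_NU, t_e) ** 2)

# Largest value of E_e * theta_e^2 in units of the bound 2 m_e
print(max(values) / (2.0 * M_E))
```

The printed ratio stays below 1 across the scan, which is why a cut on E<sub>e</sub>θ<sub>e</sub><sup>2</sup> separates the signal from backgrounds that populate larger values.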

## **8.4 Physics at** *e***<sup>+</sup>***e<sup>−</sup>* **colliders**

e<sup>+</sup>e<sup>−</sup> machines are the place of choice to study the Z<sup>0</sup>. The reasons are as follows:


Because of the finite size of hadrons, an 'underlying event' in a hadron–hadron collider is a superposition of multiple quark or gluon collisions occurring in the same hadron–hadron collision as the 'hard scatter' of interest between hadron constituents. Usually, the underlying event consists of particles produced at small angles to the beam axis.

The big disadvantage is that electrons and positrons in circular colliders lose energy through synchrotron radiation.<sup>8</sup> All charged particles, when accelerated, will radiate energy; for a given beam energy and bending radius, the rate is ∝ 1/m<sup>4</sup>. So, for electrons and protons of the same energy in circular colliders of the same radius, the ratio of energy loss is (m<sub>p</sub>/m<sub>e</sub>)<sup>4</sup> ∼ 10<sup>13</sup>. The alternative of using a linear e<sup>+</sup>e<sup>−</sup> collider is discussed in Chapter 13.

<sup>8</sup>More details are given in Section 3.2.2.
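The size of the mass factor is easy to verify. This short sketch evaluates (m<sub>p</sub>/m<sub>e</sub>)<sup>4</sup> with rounded particle masses, assuming equal beam energy and bending radius for the two machines.

```python
M_P = 938.272   # proton mass in MeV
M_E = 0.510999  # electron mass in MeV

# Ratio of synchrotron energy loss, electron versus proton,
# at the same beam energy and bending radius
ratio = (M_P / M_E) ** 4
print(f"(m_p/m_e)^4 = {ratio:.2e}")
```

The result is slightly above 10<sup>13</sup>, in line with the order of magnitude quoted above.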

A chain of accelerators<sup>9</sup> (see Table 8.1) was used to produce the e<sup>+</sup>e<sup>−</sup> beams and increase their energies up to the target 45 GeV per beam in the LEP ring (Fig. 8.4). The chain started with linear accelerators (LINAC) and then progressed through three circular synchrotron machines (PS, SPS, and LEP). Filling took about one hour, with beams accumulating in LEP at 20 GeV (two cycles every 14.4 s). The countercirculating LEP beams were then accelerated from 20 GeV to 45 GeV and left 'coasting' (i.e. there was no further acceleration; the radiofrequency cavities were used just to replace energy lost through synchrotron radiation) for about 8 hours, during which time the e<sup>+</sup>e<sup>−</sup> beams collided at the positions of the four detectors at LEP: ALEPH, DELPHI, L3, and OPAL.

<sup>9</sup>See Chapter 3 for an explanation of the need for a chain of accelerators between the source and the high-energy ring.

**Table 8.1** Chain of accelerators for LEP.

**Fig. 8.4** Map of the accelerator complex at CERN and the four LEP detectors. From [17].

## **8.4.1 Detailed look at the detectors**

The detectors were of a roughly cylindrical shape providing almost '4π' (steradian) coverage; i.e. they were sensitive to particles going in any direction from the interaction point (with no cracks or holes except for the beam pipe through the centre). This was very important for some of the analysis techniques we are going to discuss later. The OPAL detector is shown in Fig. 8.5 as a typical example. The LEP detectors were general-purpose collider detectors as described in Chapter 4. There were interesting differences between the detectors because of the different optimization strategies employed. For example, L3 had a very high-resolution electromagnetic calorimeter based on BGO (bismuth germanium oxide). BGO is a dense crystal with a good scintillation yield and was used as a homogeneous calorimeter. However, the smaller size and lower value of the magnetic field meant that the charged-track resolution was not as good as for the other LEP experiments.<sup>10</sup> Crucial parts of all four of the LEP detectors were the vertex detectors, which used silicon as the active material. These provided high-resolution tracking close to the beam pipe, which enabled the identification of jets from b quarks using the relatively long lifetime of B mesons (see Chapter 4).

<sup>10</sup>This is another demonstration that there is no perfect detector design.

**Fig. 8.5** The OPAL detector. From [17].

We will now summarize the different types of events seen at LEP when running at the Z<sup>0</sup> resonance as a review of what particles do as they pass through material. What the detectors 'see' is visualized using event displays, examples of which are shown in Fig. 8.6. Event displays are important tools for checking that the detectors are functioning correctly, for pedagogic purposes, and for examining unusual events.

**Fig. 8.6** Four event types at LEP from the DELPHI experiment: (a) e<sup>+</sup>e<sup>−</sup>; (b) μ<sup>+</sup>μ<sup>−</sup>; (c) τ<sup>+</sup>τ<sup>−</sup>; (d) quark–antiquark pair.


The event categories can be distinguished, in particular e<sup>+</sup>e<sup>−</sup> and μ<sup>+</sup>μ<sup>−</sup>, by their distinctive two-particle topologies. A plot of the total invariant mass of all the particles in the event versus the number of charged particles is shown in Fig. 8.7. The τ<sup>+</sup>τ<sup>−</sup> events can be distinguished from 2-jet events with these variables.

**Fig. 8.7** Main cuts used to separate the different classes of e<sup>+</sup>e<sup>−</sup> events. From [43].

## **8.4.2 Aspects of a physics analysis**

What is written here is valid for the analysis of any high-energy physics data, but is described in the context of a LEP experiment. An analysis generally goes along the following lines: the experimenters first choose selection criteria (**cuts**), which select the desired events, the **signal**, such as requiring n tracks that look like muons, or tracks above a certain energy, etc. This may involve reconstructing the mass of a combination of particles from the measured 4-momenta of the tracks and showers.<sup>11</sup> Quite often there is **background** in the sample selected, where a different physical process produces events that pass the same cuts. Some backgrounds are 'irreducible' in that they produce the same final-state particles as the signal; other backgrounds are 'reducible' in that they could be reduced by tighter cuts or with a better detector.

<sup>11</sup>Sometimes we assume that the combination has a mass m = 0.

**Calibration** concerns subtracting pedestals<sup>12</sup> and measuring the gains and linearities of each channel, particularly for calorimeters. If the calibration is not done correctly, then the energy resolution of the calorimeter suffers. The **acceptance** for the process we are studying has to be calculated. Acceptance is the probability of the particles in an event hitting a certain part of the detector and having a certain minimum energy or momentum.<sup>13</sup> **Accidentals** or pile-up concerns the problem when two events occur at the same time (within detectable resolution) and the resulting combination gets into the data sample. Accidentals can also be a problem if a preceding event causes electronics to momentarily become inactive before being ready to measure a new pulse, or if the preceding event causes a movement in the pedestals that causes the energy to be measured slightly wrongly. Accidentals were not a big problem in LEP experiments, because the rate of events was low.

<sup>12</sup>Even if there is no genuine signal in a detector channel, the readout will still deliver a non-zero value. This value is called the 'pedestal', from the shape of the distribution. The pedestal value must be measured and subtracted from genuine signals.

<sup>13</sup>Usually for a 4π LEP detector, it was almost 100%; however, this could have been reduced if cuts were made around any dead channels.

## **8.4.3 Monte Carlo simulation**

An essential tool for any analysis of data from a large particle physics detector is the Monte Carlo simulation computer program.<sup>14</sup> The idea is to produce simulated events, using a random number generator at each point where a choice in what happens must be made. Both the physics process and the detector response are simulated.

<sup>14</sup>Invented by Ulam and Metropolis at Los Alamos in the 1940s and named after the city in Monaco where there is a big casino.

Figure 8.8 shows a block diagram of the steps involved from the choice of physics channel to be simulated through the generation of the detector response, then the digitization of the Monte Carlo 'data' in an identical format to that produced by the various detector components that make up the complete detector. The Monte Carlo data are then run through the complete 'real' data reconstruction chain. They are then available for analysis by the same analysis codes that are used for the real data; the only difference is that one knows what physics process was used to generate the events.

**Fig. 8.8** Monte Carlo flow diagram.

First is the 'physics simulator': physics events are simulated starting from theoretical matrix elements for the basic quark and gluon scattering processes that could occur in a hard scatter. The output will be the type and 4-momentum of quarks, gluons, and leptons produced. In some cases, for example e<sup>+</sup>e<sup>−</sup> → τ<sup>+</sup>τ<sup>−</sup> and electromagnetic processes, the matrix elements can be calculated. The procedure is similar for hard quark and gluon scattering. The underlying event is produced from so-called phenomenological models that describe the observed properties of the small-angle scattering involved. The physics events can also be chosen to be all of one type if one is trying to work out the best way to select events from a particular physics source (e.g. top-quark production). Sometimes, it is useful to generate only a single type of particle, for example if one needs to know what type of signals it will produce in the different parts of the detector.

Next is the 'detector simulator': this is usually the most difficult part to produce and requires a thorough understanding of how the detector components work.<sup>15</sup> Random number simulation is used to decide how each particle proceeds as it passes through various types of material in the detector. For some particles, for example electromagnetic particles (e<sup>±</sup>, γ), the response of most materials is well documented, and well-honed computer codes exist that can be modified to whatever geometry is required. For hadronic particles, things are more complicated, first because most of these particles are unstable and will decay either within the beam pipe or within the detector layers nearest to the beam. The decays have to be simulated and the resulting products (mostly pions, together with some kaons and nucleons) followed through the detector layers until they are absorbed. The exceptions are neutrinos (which will leave no signal in a typical collider detector) and high-energy muons, which will penetrate through the main detector and surrounding magnet and so require special and very large-area but fairly simple charged-particle detectors covering the outside of the main detector. The final step is to collect the simulated signals, format them as though they were real data, and add the 'book keeping' records. These steps are summarized in the remaining three boxes in the flow diagram in Fig. 8.8.

<sup>15</sup>Detector simulation is very computer intensive and this is another aspect of modern high-energy physics where GRID computing is essential.

## **Example**

As an example, in generating a simulated e<sup>+</sup>e<sup>−</sup> → Z<sup>0</sup> → τ<sup>+</sup>τ<sup>−</sup> event, the choices could be as follows:


The technique works well and, by generating lots of simulated events, we are effectively performing a numerical integral over all the possible outcomes of what might happen by randomly sampling the integrand. The Monte Carlo technique is often used to estimate trigger efficiencies, acceptances, and accidental effects in an analysis.
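The idea of Monte Carlo as a numerical integral can be illustrated with a toy acceptance calculation. The sketch below is not a detector simulation: it assumes events distributed as 1 + cos<sup>2</sup> θ and a hypothetical acceptance cut |cos θ| < 0.9, then compares the sampled acceptance with the exact integral.

```python
import random

def sample_cos_theta(rng: random.Random) -> float:
    """Draw cos(theta) from a 1 + cos^2(theta) distribution by accept-reject."""
    while True:
        c = rng.uniform(-1.0, 1.0)
        # Envelope height 2 is the maximum of 1 + c^2 on [-1, 1]
        if rng.uniform(0.0, 2.0) < 1.0 + c * c:
            return c

def acceptance(n_events: int, cos_cut: float, seed: int = 1) -> float:
    """Fraction of generated events with |cos(theta)| below the cut."""
    rng = random.Random(seed)
    n_pass = sum(abs(sample_cos_theta(rng)) < cos_cut for _ in range(n_events))
    return n_pass / n_events

def exact_acceptance(cos_cut: float) -> float:
    """Analytic integral of 1 + c^2 over |c| < cos_cut, normalized to 8/3."""
    return (3.0 * cos_cut + cos_cut**3) / 4.0

mc = acceptance(100_000, cos_cut=0.9)
print(mc, exact_acceptance(0.9))
```

The two numbers agree within the statistical precision of the sample, which improves as the inverse square root of the number of generated events.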

However, there are always systematic uncertainties associated with Monte Carlo calculations, particularly at hadron colliders. Therefore, wherever possible, the calculations should be done in a 'data-driven' way that does not rely so heavily on Monte Carlo calculations. This approach will be described in Chapter 13.

## **Multivariate analysis methods**

The above description of an analysis is based on the simplest 'cut-and-count' approach in which one makes a series of selections on a sequence of variables and counts the number of events that pass all the selections. This approach is clearly not optimal if there are correlations between the variables that are different for the signal and background processes. Consider a toy example in which one is using two variables x<sub>1</sub> and x<sub>2</sub> to discriminate between signal and background. A cut-and-count analysis would select a rectangular region in (x<sub>1</sub>, x<sub>2</sub>) space (see Fig. 8.9). However, a more powerful discrimination between the signal and the background might be obtained by selecting a 'triangular' region (see Fig. 8.9). In a typical analysis, we have to deal with many variables, so the optimization of the selection is non-trivial. Powerful statistical techniques like neural networks and boosted decision trees are used (see Behnke et al. in Further Reading).
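The toy example can be made concrete with a small simulation. The model below is an assumption chosen for illustration (independent unit Gaussians, signal centred at (1, 1) and background at (0, 0)); it compares a rectangular cut with a diagonal cut on x<sub>1</sub> + x<sub>2</sub> tuned to the same signal efficiency.

```python
import random

rng = random.Random(42)
N = 50_000
# Toy model (assumed): signal centred at (1, 1), background at (0, 0)
signal = [(rng.gauss(1, 1), rng.gauss(1, 1)) for _ in range(N)]
background = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]

def efficiency(events, selector):
    """Fraction of events passing the selection."""
    return sum(1 for x1, x2 in events if selector(x1, x2)) / len(events)

# Cut-and-count: rectangular region x1 > a and x2 > a
a = 0.5
def rect(x1, x2):
    return x1 > a and x2 > a

sig_eff = efficiency(signal, rect)

# 'Triangular' region x1 + x2 > c, with c tuned to the same signal efficiency
sums = sorted(x1 + x2 for x1, x2 in signal)
c = sums[int((1.0 - sig_eff) * N)]
def diag(x1, x2):
    return x1 + x2 > c

bkg_rect = efficiency(background, rect)
bkg_diag = efficiency(background, diag)
print(f"signal eff {sig_eff:.3f}: background {bkg_rect:.3f} (rect) vs {bkg_diag:.3f} (diag)")
```

For this particular model the diagonal cut passes noticeably less background at equal signal efficiency. With many variables the best boundary is no longer obvious, which is where the multivariate methods mentioned above come in.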

## **8.4.4 Physics at LEP**

LEP operation was in two phases: for LEP1, the CMS energy was close to the mass of the Z<sup>0</sup>. Similar physics was studied at the SLAC Linear Collider (SLC), where the luminosity was much lower than at LEP, although SLC had the advantage of being able to produce longitudinally polarized electrons. In LEP2, the energy was increased to above the threshold for W<sup>+</sup>W<sup>−</sup> pair production.

We will review selected LEP1 physics in this section and discuss LEP2 physics in Section 8.4.11. LEP produced a huge number of Z<sup>0</sup>s, ∼4.5 × 10<sup>6</sup> per detector. Z<sup>0</sup>s decay into almost every type of particle we know, so many things can be studied. Examples include the following:

• **bb̄:** lifetime of the b quark; B<sup>0</sup>–B̄<sup>0</sup> mixing; B<sup>0</sup> CP violation (this is done better at BaBar, Belle, CDF, and LHC).

• **τ<sup>+</sup>τ<sup>−</sup>:** branching ratios (pre LEP, there was a crisis because ΣBR > 100%!); decay parameters (information on spins etc. gives information on W and Z currents).

**Fig. 8.9** Event selection in the (x<sub>1</sub>, x<sub>2</sub>) plane.


There were also a number of individual measurements of high importance that we discuss here in more detail: the mass of the Z<sup>0</sup> boson, the number of neutrinos, and production cross section and decay parameters of the Z<sup>0</sup>, which are used to obtain the couplings c<sub>V</sub><sup>(f)</sup> and c<sub>A</sub><sup>(f)</sup> to compare with the predictions of the electroweak theory.

The principle of all the measurements at LEP was to take runs at a variety of different beam energies around the Z<sup>0</sup> peak. The experiments recorded everything whenever there was a trigger (≡ an event), and these were reconstructed later (offline). They were then classified as ee, μμ, ττ, qq̄, or luminosity Bhabha events (see Exercise 3.7). It was then possible to measure the cross section as a function of CMS energy √s and partial cross sections of various types (e.g. according to the decay mode, or which direction the particles went).

## **8.4.5 The** *Z* **line shape**

The cross section as a function of √s (at LEP, twice the beam energy) displays a clean peak at the Z resonance. The cross section as a function of s for e<sup>+</sup>e<sup>−</sup> → Z<sup>0</sup> → ff̄ (where f is one of e, μ, τ, or a quark) is given by

$$\sigma\_f(s) = \sigma\_f^0(s) \frac{s \Gamma\_Z^2}{(s - M\_Z^2)^2 + M\_Z^2 \Gamma\_Z^2} \otimes \text{(QED corr.)} \otimes \text{(QCD corr.)}\tag{8.3}$$

where

$$
\sigma\_f^0(s) = \frac{12\pi}{M\_Z^2} \frac{\Gamma\_e \Gamma\_f}{\Gamma\_Z^2} \tag{8.4}
$$

and the convolutions take account of the higher-order QED and QCD corrections. Using the electroweak theory from Chapter 7 to compute Γ<sup>f</sup> gives

$$\Gamma\_f = \frac{G\_\mathrm{F}\sqrt{2}M\_Z^3}{12\pi} \left[ \left( c\_\mathrm{V}^{(f)} \right)^2 + \left( c\_\mathrm{A}^{(f)} \right)^2 \right] N\_{\mathrm{colours}} \otimes \left( \mathrm{QCD\ corr.} \right) \tag{8.5}$$

Equations 8.3 and 8.4 when put together give the usual relativistic Breit–Wigner formula.
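The shape of the resonance is easy to explore numerically. The sketch below evaluates eqns (8.3) and (8.4) without the QED and QCD corrections, using approximate world-average values for the mass and widths; these numbers are assumptions for illustration and are not taken from this chapter.

```python
import math

# Approximate world-average values in GeV (assumptions for illustration)
M_Z = 91.1876
GAMMA_Z = 2.4952
GAMMA_E = 0.08391
GAMMA_HAD = 1.7444
GEV2_TO_NB = 389379.0  # (hbar c)^2 in nb GeV^2

def sigma_peak(gamma_f: float) -> float:
    """Eqn (8.4): peak cross section in nb."""
    return 12.0 * math.pi / M_Z**2 * GAMMA_E * gamma_f / GAMMA_Z**2 * GEV2_TO_NB

def sigma(sqrt_s: float, gamma_f: float) -> float:
    """Eqn (8.3) without QED and QCD corrections, in nb."""
    s = sqrt_s**2
    shape = s * GAMMA_Z**2 / ((s - M_Z**2) ** 2 + M_Z**2 * GAMMA_Z**2)
    return sigma_peak(gamma_f) * shape

for sqrt_s in (89.2, 91.1876, 93.2):
    print(f"{sqrt_s:8.4f} GeV  {sigma(sqrt_s, GAMMA_HAD):6.2f} nb")
```

The uncorrected hadronic peak comes out a little above 41 nb. The QED corrections indicated in eqn (8.3), mainly initial-state radiation, lower and distort the observed peak, which is why they must be included in the fit.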

Recall that c<sub>V</sub> and c<sub>A</sub> are defined for the left- and right-handed parts of a particle separately and vary depending on what type of fermion f we have (see Table 7.6 and the discussion on page 199): c<sub>V</sub><sup>(f)</sup> = I<sub>3</sub>(f) − 2Q(f) sin<sup>2</sup> θ<sub>W</sub> and c<sub>A</sub><sup>(f)</sup> = I<sub>3</sub>(f).

**Fig. 8.10** The e<sup>+</sup>e<sup>−</sup> → Z<sup>0</sup> production cross section as a function of CMS energy. From [17].

## **8.4.6** *Z* **mass**

The mass of the Z is obtained in principle simply by taking the curve of the cross section as a function of CMS energy (as shown in Fig. 8.10) and fitting the expression given in eqns (8.3)–(8.5) to find the best value for M<sub>Z</sub>. For reasons we will come to, it is important to measure M<sub>Z</sub> very accurately. In practice, there are various complications to take care of, including the QED and QCD corrections indicated in the formulae. Another important point is to know exactly what the beam energy is, which we describe later.

## **8.4.7 The** *Z* **width; number of neutrinos**

The Z<sup>0</sup> can decay into a pair of neutrinos: Z<sup>0</sup> → νν̄. How many generations of leptons are there? We know of three: e, μ, and τ. Provided the mass of the associated neutrino is less than half the Z<sup>0</sup> mass, there is a way to detect if there are any more. The total width of the Z<sup>0</sup>, Γ<sub>Z</sub>, is made up of the sum of the partial widths of all its decay modes:

$$
\Gamma\_Z = \Gamma\_{\rm had} + \Gamma\_{ee} + \Gamma\_{\mu\mu} + \Gamma\_{\tau\tau} + N\_\nu \Gamma\_\nu \tag{8.6}
$$

$$=\Gamma\_{\rm had} + 3\Gamma\_{l\overline{l}} + N\_{\nu}\Gamma\_{\nu} \tag{8.7}$$

where lepton universality has been assumed for the second step. The fit to the experimental data is sensitive to both the width of the peak and its height. The partial widths can be predicted from the electroweak model (eqn 8.5) and the branching ratios can be used as a check. Therefore, by measuring the total width Γ<sub>Z</sub>,<sup>16</sup> the number of neutrinos, N<sub>ν</sub>, can be measured.

<sup>16</sup>Again by fitting the shape of the cross section as a function of beam energy using eqns 8.3–8.5 with the partial widths held fixed at their predicted values from the standard model.

**Fig. 8.11** Bending field in the LEP accelerator (circumference *L*).

However, since

$$
\sigma\_0 \propto \frac{1}{\Gamma\_Z^2} \tag{8.8}
$$

(eqn 8.4), the greatest sensitivity is obtained by simply measuring the cross section at the very top of the peak. The result is

$$N\_{\nu} = 2.9841 \pm 0.0083\tag{8.9}$$
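The logic of eqn (8.7) can be followed with a simple width subtraction. This is a rough sketch and not the line-shape fit that gives eqn (8.9): it takes the neutrino partial width from eqn (8.5) at lowest order with c<sub>V</sub> = c<sub>A</sub> = 1/2, and the measured widths are approximate values assumed for illustration.

```python
import math

G_F = 1.16638e-5   # Fermi constant in GeV^-2
M_Z = 91.1876      # GeV
# Approximate measured widths in GeV (assumptions for illustration)
GAMMA_Z = 2.4952
GAMMA_HAD = 1.7444
GAMMA_LL = 0.083984

def partial_width(c_v: float, c_a: float, n_colours: int = 1) -> float:
    """Eqn (8.5) at lowest order, without QCD corrections."""
    prefactor = G_F * math.sqrt(2.0) * M_Z**3 / (12.0 * math.pi)
    return prefactor * (c_v**2 + c_a**2) * n_colours

gamma_nu = partial_width(0.5, 0.5)                 # neutrino: c_V = c_A = 1/2
gamma_inv = GAMMA_Z - GAMMA_HAD - 3.0 * GAMMA_LL   # eqn (8.7) rearranged
n_nu = gamma_inv / gamma_nu
print(f"Gamma_nu = {1e3 * gamma_nu:.1f} MeV, N_nu = {n_nu:.2f}")
```

Even this crude estimate lands close to 3. The precise value in eqn (8.9) relies on the peak cross section and on radiative corrections to the predicted widths.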

## **8.4.8 LEP beam energy measurement**

The measurement of the Z mass was a very careful study and involved measuring the beam energy accurately. How do you measure the beam energy? A technique called resonant depolarization involving the anomalous magnetic moment of the electron (g − 2) was employed (recall the experiment to measure accurately the anomalous magnetic moment of the muon as one of the most stringent tests of QED). Let us approximate LEP by a circle immersed in a uniform vertical **B** field (see Fig. 8.11), which provides the bending. Then

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = -e\mathbf{v} \times \mathbf{B}, \qquad |\mathbf{p}| = eBR = \frac{e}{2\pi} BL \tag{8.10}$$
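In practical units, |**p**| = eBR becomes p [GeV] ≈ 0.3 B [T] R [m]. The sketch below applies this to the uniform-field approximation used here, assuming 45.6 GeV beams and a circumference of about 26.7 km (both values are assumptions for illustration).

```python
import math

C_GEV_PER_T_M = 0.299792458  # p [GeV] = 0.2998 * B [T] * R [m] for unit charge

def uniform_field_tesla(p_gev: float, circumference_m: float) -> float:
    """Field needed to bend momentum p around a circle of the given
    circumference, from |p| = e B L / (2 pi) in eqn (8.10)."""
    radius = circumference_m / (2.0 * math.pi)
    return p_gev / (C_GEV_PER_T_M * radius)

b_field = uniform_field_tesla(45.6, 26.7e3)
print(f"B = {b_field:.4f} T")
```

The required field is only a few hundredths of a tesla. A real ring has straight sections, so the field in the bending magnets is somewhat higher than this idealized value.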

Also, the orbital angular frequency is ω<sub>c</sub> = eB/(γm<sub>e</sub>). Electrons 'naturally' become polarized over about 5 hours while going around in LEP. The polarization can be measured with backscattered light. The spin precession angular frequency is given by

$$
\omega\_s = \frac{eB}{\gamma m\_e} \left[ 1 + \gamma \left( \frac{g-2}{2} \right) \right] \tag{8.11}
$$

so we can compute the number of precessions per turn in LEP, νs:

$$\nu\_s = \frac{\omega\_s - \omega\_c}{\omega\_c} = \gamma \left(\frac{g-2}{2}\right) = \frac{E\_{\text{beam}}}{m\_e} \left(\frac{g-2}{2}\right) \tag{8.12}$$

The value of (g − 2)/2 is known to an accuracy of 4 × 10<sup>−9</sup> and the electron mass m<sub>e</sub> to a precision of 3 × 10<sup>−7</sup>, so if we can measure ν<sub>s</sub>, we get the energy of the beam.

The technique for measuring ν<sub>s</sub> (resonant depolarization) proceeds by adding a small magnet with a field in the x direction (horizontal, transverse to the beam) that varies as sin νt. When ν = ν<sub>s</sub>, the contribution will accumulate each turn and cause the beam to depolarize (as measured in the backscattered light). The precision obtained in the beam energy at LEP was 2 MeV (out of 45 GeV). It involved understanding various effects, including the tidal pull of the Moon (which changes L slightly), movements in the water table, and even ground currents caused when the fast TGV trains to Paris passed by.<sup>17</sup>
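Equation (8.12) translates directly into a conversion between spin tune and beam energy. The sketch below uses rounded values of the electron anomaly and mass; the function names are illustrative.

```python
A_E = 1.15965218e-3   # electron anomaly (g - 2)/2
M_E_MEV = 0.51099895  # electron mass in MeV

def spin_tune(e_beam_mev: float) -> float:
    """Eqn (8.12): number of spin precessions per turn."""
    return e_beam_mev / M_E_MEV * A_E

def beam_energy_mev(nu_s: float) -> float:
    """Eqn (8.12) inverted: beam energy from a measured spin tune."""
    return nu_s * M_E_MEV / A_E

print(f"nu_s at 45.6 GeV: {spin_tune(45_600.0):.3f}")
print(f"energy per unit of spin tune: {beam_energy_mev(1.0):.2f} MeV")
```

One unit of spin tune corresponds to roughly 440 MeV of beam energy, so the spin tune near the Z peak is a little over 100, and an energy precision of 2 MeV requires locating ν<sub>s</sub> to about 0.005.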

<sup>17</sup>This was discovered on a day when there was a rail strike!

## **8.4.9 Cross sections and forward–backward asymmetries at the** *Z*

The couplings c<sup>V</sup> and c<sup>A</sup> for each fermion type can be extracted from measurements of dσ/dΩ and the forward–backward asymmetry AFB, defined as

$$A\_{\rm FB} = \frac{N\_{\rm F} - N\_{\rm B}}{N\_{\rm F} + N\_{\rm B}} \tag{8.13}$$

where N<sub>F</sub> and N<sub>B</sub> are the numbers of events with θ < 90° (forward) and θ > 90° (backward), respectively.<sup>18</sup>

Starting from eqn (7.40), we can show that (see Exercise 8.7)

$$\frac{d\sigma}{d\Omega} = \frac{G\_\mathrm{F}^2 M\_Z^4}{32\pi^2 \Gamma\_Z^2} [A(1 + \cos^2 \theta) + B \cos \theta] \tag{8.14}$$

where

$$A = \left[ \left( c\_{\rm V}^{(e)} \right)^2 + \left( c\_{\rm A}^{(e)} \right)^2 \right] \left[ \left( c\_{\rm V}^{(f)} \right)^2 + \left( c\_{\rm A}^{(f)} \right)^2 \right], \qquad B = 8c\_{\rm V}^{(e)}c\_{\rm A}^{(e)}c\_{\rm V}^{(f)}c\_{\rm A}^{(f)}$$

Integrating eqn (8.14), we find

$$
\sigma(e^{+}e^{-} \to f\bar{f}) \propto \left[ \left(c\_{\text{V}}^{(e)}\right)^{2} + \left(c\_{\text{A}}^{(e)}\right)^{2} \right] \left[ \left(c\_{\text{V}}^{(f)}\right)^{2} + \left(c\_{\text{A}}^{(f)}\right)^{2} \right] \tag{8.15}
$$

Integrating eqn (8.14) separately over the forward and backward hemispheres gives

$$A\_{\rm FB} = \frac{3}{4} \frac{2c\_{\rm V}^{(e)}c\_{\rm A}^{(e)}}{\left(c\_{\rm V}^{(e)}\right)^2 + \left(c\_{\rm A}^{(e)}\right)^2} \frac{2c\_{\rm V}^{(f)}c\_{\rm A}^{(f)}}{\left(c\_{\rm V}^{(f)}\right)^2 + \left(c\_{\rm A}^{(f)}\right)^2} = \frac{3}{4}A\_{e}A\_{f} \tag{8.16}$$

where

$$A\_e = \frac{2c\_\mathrm{V}^{(e)}c\_\mathrm{A}^{(e)}}{\left(c\_\mathrm{V}^{(e)}\right)^2 + \left(c\_\mathrm{A}^{(e)}\right)^2}, \qquad A\_f = \frac{2c\_\mathrm{V}^{(f)}c\_\mathrm{A}^{(f)}}{\left(c\_\mathrm{V}^{(f)}\right)^2 + \left(c\_\mathrm{A}^{(f)}\right)^2}.$$

What electroweak information can be obtained from these measurements? σ and A<sub>FB</sub> for a particular final state f give two equations involving c<sub>V</sub><sup>(f)</sup> and c<sub>A</sub><sup>(f)</sup>, and, in principle, these can be solved to give both c<sub>V</sub><sup>(f)</sup> and c<sub>A</sub><sup>(f)</sup> separately.<sup>19</sup> On a plot of c<sub>V</sub> versus c<sub>A</sub>, σ ∝ c<sub>V</sub><sup>2</sup> + c<sub>A</sub><sup>2</sup> is a circle and σA<sub>FB</sub> ∝ c<sub>V</sub>c<sub>A</sub> has a hyperbolic dependence. Taken together, these should enable the extraction of both c<sub>V</sub> and c<sub>A</sub> for the fermion f (see Exercise 8.6). This can be done for each final state f. In addition, note that σ and A<sub>FB</sub> also depend on the electron couplings c<sub>V</sub><sup>(e)</sup> and c<sub>A</sub><sup>(e)</sup>, which must be known before the couplings of f can be completely unravelled. These must either be taken from other experiments or obtained using separate measurements from τ<sup>+</sup>τ<sup>−</sup> events.

<sup>18</sup>The angle θ is the angle between the incoming electron direction and the outgoing lepton or quark, or between the incoming positron and outgoing antilepton or antiquark.

<sup>19</sup>Measuring σ correctly requires accurate knowledge of the luminosity, which is described in Section 8.4.10.
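To see the sizes involved, eqns (8.13) and (8.16) can be evaluated directly. The sketch below assumes sin<sup>2</sup> θ<sub>W</sub> = 0.2315 and pure Z exchange; the event counts in the last line are hypothetical.

```python
import math

SIN2_THETA_W = 0.2315  # assumed value, for illustration only

def couplings(i3: float, charge: float):
    """c_V = I3 - 2 Q sin^2(theta_W), c_A = I3."""
    return i3 - 2.0 * charge * SIN2_THETA_W, i3

def asymmetry_parameter(c_v: float, c_a: float) -> float:
    """A_f = 2 c_V c_A / (c_V^2 + c_A^2)."""
    return 2.0 * c_v * c_a / (c_v**2 + c_a**2)

def a_fb_predicted(final_state) -> float:
    """Eqn (8.16): A_FB = (3/4) A_e A_f for pure Z exchange."""
    a_e = asymmetry_parameter(*couplings(-0.5, -1.0))
    a_f = asymmetry_parameter(*couplings(*final_state))
    return 0.75 * a_e * a_f

def a_fb_measured(n_forward: int, n_backward: int):
    """Eqn (8.13) with its binomial statistical uncertainty."""
    n = n_forward + n_backward
    a = (n_forward - n_backward) / n
    return a, math.sqrt((1.0 - a * a) / n)

MUON = (-0.5, -1.0)           # (I3, Q)
B_QUARK = (-0.5, -1.0 / 3.0)  # (I3, Q)
print(a_fb_predicted(MUON), a_fb_predicted(B_QUARK))
print(a_fb_measured(5100, 4900))  # hypothetical event counts
```

The leptonic asymmetry is only a couple of per cent because c<sub>V</sub> for charged leptons is small, whereas the b-quark asymmetry is several times larger. The statistical error on a measured A<sub>FB</sub> is close to 1/√N, so large event samples are needed for the lepton channels.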

We have derived the A<sub>FB</sub> formula assuming all the interactions are mediated by the Z, but we also need to account for the γ exchange diagram and the interference between the two diagrams. Figure 8.12 shows the differential cross section as a function of cos θ when LEP was run on the Z pole and 2 GeV either side. The variation of A<sub>FB</sub> as a function of beam energy is shown in Fig. 8.13.

In summary, then, from the measured quantities σ<sub>ee</sub>, σ<sub>μμ</sub>, σ<sub>ττ</sub>, A<sub>FB</sub><sup>e</sup>, A<sub>FB</sub><sup>μ</sup>, and A<sub>FB</sub><sup>τ</sup>, and some information from τ decay, we obtain c<sub>V</sub><sup>(e)</sup>, c<sub>V</sub><sup>(μ)</sup>, c<sub>V</sub><sup>(τ)</sup>, c<sub>A</sub><sup>(e)</sup>, c<sub>A</sub><sup>(μ)</sup>, and c<sub>A</sub><sup>(τ)</sup>. From the data, we see the following:

**Fig. 8.12** Cross section as a function of θ, the angle between the incoming electron and outgoing lepton. The two plots are the results from two different LEP detectors; the three curves show the three main beam energies at and near the Z pole, where LEP was run. From [17].

**Fig. 8.13** A<sub>FB</sub> as a function of LEP beam energy. From [17].


## **8.4.10 LEP luminosity measurement**

Accurate data on the LEP luminosity is essential for cross-section measurements. Recall that the integrated luminosity L is defined by the expression N<sub>i</sub> = Lσ<sub>i</sub>, where N<sub>i</sub> is the number of events from a given process i that occur in the detector (once detector effects like trigger efficiency and acceptance have been corrected for) and σ<sub>i</sub> is the cross section. L is the same for all processes; it depends on the features of the accelerator and how well it is working (e.g. how well the two beams are steered into each other at a particular interaction region). Luminosity is measured using a process j with a high rate and for which we know how to calculate the cross section. For LEP, the channel j used to measure the luminosity was low-angle electron–positron scattering (Bhabha scattering), for which the QED single-photon-exchange diagram dominates. There are only very small contributions to this from diagrams with Z<sup>0</sup> or involving annihilation, and the theoretical uncertainty is below 0.1%. Bhabha scattering events are measured with small calorimeters situated very close to the beam line on each side of the detectors, several metres each side of the interaction point. The events are counted when two showers with the appropriate energy are seen within a radius region about the beam axis of typically 6 cm < R < 15 cm. This cut is made on one side only to reduce sensitivity to the movement of the interaction point. Using the numbers of events N<sub>i</sub> and N<sub>j</sub> and the formulae L = N<sub>j</sub>/σ<sub>j</sub> and σ<sub>i</sub> = N<sub>i</sub>/L, we obtain the cross section in which we are interested.<sup>20</sup>
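The two formulae at the end of this section combine into a short calculation. The sketch below uses hypothetical event counts and an assumed Bhabha cross section within the luminosity-monitor acceptance; only the Poisson uncertainties on the two counts are propagated.

```python
import math

def luminosity(n_bhabha: int, sigma_bhabha_nb: float) -> float:
    """Integrated luminosity in nb^-1 from L = N_j / sigma_j."""
    return n_bhabha / sigma_bhabha_nb

def cross_section(n_signal: int, n_bhabha: int, sigma_bhabha_nb: float):
    """sigma_i = N_i / L and its statistical uncertainty (counts only)."""
    lumi = luminosity(n_bhabha, sigma_bhabha_nb)
    sigma = n_signal / lumi
    rel_err = math.sqrt(1.0 / n_signal + 1.0 / n_bhabha)
    return sigma, sigma * rel_err

# Hypothetical numbers: corrected event counts and an assumed Bhabha cross section
sigma_i, err = cross_section(n_signal=30_000, n_bhabha=50_000, sigma_bhabha_nb=50.0)
print(f"sigma_i = {sigma_i:.2f} +/- {err:.2f} nb")
```

The uncertainty on the Bhabha count enters the final error alongside that on the signal count, which is why a high-rate reference process is chosen.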

## **8.4.11 Measurements at LEP2, above** *√s* **=** *M<sub>Z</sub>*

Superconducting cavities were added to the LEP ring from 1995 onwards and the accelerator was run at increasing energies as more cavities were added (with increasing beam energy, the energy lost by synchrotron radiation each turn becomes greater and more cavities are needed to replace it). The cross section as a function of beam energy is shown in Fig. 8.14. The maximum energy reached was a little over √s = 200 GeV and the last run was in 2000. The main studies at these higher energies were searches for new particles: Higgs and SUSY being the most popular. Precision measurements of the triple gauge coupling were also made.<sup>21</sup> The higher energy also facilitated other detailed studies of the W, which complemented those done at p̄p colliders.

A useful class of events at LEP2 is 'radiative return to the Z<sup>0</sup>' or initial-state radiation (ISR). This happens when either the beam electron or positron emits a bremsstrahlung photon (Fig. 8.15) and loses enough energy for the subsequent collision to have the right energy to make an on-mass-shell Z<sup>0</sup>. This is very significant, because the Z<sup>0</sup> resonance is so large. Measuring cross sections and A<sub>FB</sub> above the Z peak provides more consistency checks of the electroweak model.

<sup>20</sup>In practice, luminosity measurements will be recorded as a function of date and time after correction for acceptance and radiative corrections (higher-order QED processes) and made available for all analyses.

<sup>21</sup>These studies required the CMS energy to be above 2M<sub>W</sub>.

and above the Z pole. From [17].

**Fig. 8.15** One Feynman diagram for initial-state radiation (ISR).

## **8.4.12** *W***<sup>+</sup>***W<sup>−</sup>* **production**

This is a good example for studying some of the important aspects of a physics analysis either at LEP or at a hadron collider. The two main aspects are (1) choice of final states and (2) combinatorics. W<sup>+</sup>W<sup>−</sup> events can come in three different types depending on whether each of the Ws decays to a pair of quarks or to a charged lepton and a neutrino. For events with quarks in the final state, we need to run a 'jet' algorithm to assign measured charged particle tracks and calorimeter energies not associated with tracks to a particular jet.<sup>22</sup>

<sup>22</sup>Jet-finding algorithms are discussed in Section 13.3. A nice example of a two-jet event is shown in Fig. 8.6(d).

• **4 jets:** We get the 3-momentum components of each of 4 jets and we know the beam energy (a total of 13 numbers), so we can do a constrained fit of the event. The things we do not know are the two decay angles of the Z and of each of the two Ws. We also leave the masses of the two Ws free in the fit. It is therefore a 5-constraint fit (13 numbers, 8 unknowns). There is also a combinatorial problem, since we do not know which pairs of jets go together, so we try each of the three combinations in turn: 12, 34; 13, 24; 14, 23. For all three combinations, we make a scatter plot of the mass of one W versus the mass of the other and hope to see a peak where the real W is. Intuition tells us that the false combinations are likely to be scattered widely about the plot, not forming a mass peak, and a Monte Carlo simulation can be used to provide more quantitative information about the distribution of the false combinations.

**Fig. 8.16** W mass distribution from the 2-constraint fit for W → μν<sub>μ</sub>, W → qq̄. Similar distributions are obtained for the qqqq, eνqq, and τνqq channels. From [5].

• **2 jets,** *l, ν* **:** There are fewer constraints here, because the neutrino is undetected and deprives us of 3 of the 13 numbers we had in the 4-jet case. Nevertheless, this is a 2-constraint fit (10 numbers, 8 unknowns) and is therefore fairly powerful. There is no combinatorial problem here, because the detector tells us which is the lepton. The 2-constraint fit applies only when the lepton is an electron or a muon. If the lepton is a τ, the τ decay must involve at least one other neutrino, a τ neutrino, but there is still sufficient information for a 1-constraint fit.
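The combinatorial step in the 4-jet case can be sketched in a few lines. This is a minimal illustration, not the constrained fit used by the LEP experiments: the helper names are hypothetical, the jets are treated as massless, and the toy event (two W bosons at rest, each decaying back to back) is invented so that the correct pairing is known in advance.

```python
import math

def invariant_mass(*four_momenta):
    """Invariant mass of a set of four-momenta given as (E, px, py, pz)."""
    e, px, py, pz = (sum(p[k] for p in four_momenta) for k in range(4))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding

# The three ways of splitting jets 1-4 into two pairs: 12,34; 13,24; 14,23.
PAIRINGS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]

def dijet_mass_pairs(jets):
    """Return [(m_a, m_b), ...] for the three pairings of four jets."""
    return [
        (invariant_mass(jets[a], jets[b]), invariant_mass(jets[c], jets[d]))
        for (a, b), (c, d) in PAIRINGS
    ]

def n_constraints(n_measured, n_unknown):
    """Constraint counting used in the text: measured numbers minus unknowns."""
    return n_measured - n_unknown

# Toy event (assumption): jets 0,1 come from one W, jets 2,3 from the other.
E = 40.2  # GeV, half of an illustrative W mass of 80.4 GeV
jets = [(E, E, 0.0, 0.0), (E, -E, 0.0, 0.0),
        (E, 0.0, E, 0.0), (E, 0.0, -E, 0.0)]

masses = dijet_mass_pairs(jets)
for pairing, (m_a, m_b) in zip(PAIRINGS, masses):
    print(pairing, f"{m_a:.1f} GeV", f"{m_b:.1f} GeV")
print("4 jets:", n_constraints(13, 8), "constraints;",
      "2 jets + lepton:", n_constraints(10, 8), "constraints")
```

Only the first pairing returns two masses at the input W mass; the two false pairings land elsewhere, which is the behaviour the scatter plot described above relies on.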

An example of a W mass fit from the μν<sub>μ</sub>qq̄ channel [5] is shown in Fig. 8.16. This technique is a good way to measure the mass of the W. Each individual event gives two independent measurements of the mass and all we need do is take an average to get the mass. If we want to be more sophisticated, we can estimate the error on each mass measurement (from the errors on the track and calorimeter measurements) and do a weighted average. The spread of the individual M<sub>W</sub> measurements gives us the opportunity to measure the width of the W (the spread is determined by the natural width and the experimental resolution). The result is Γ<sub>W</sub> = 2.48 ± 0.41 GeV. The prediction from electroweak theory is 2.077 GeV, which is consistent. This is not as precise, however, as the measurements from the Tevatron (the CDF and D0 experiments) discussed in Section 8.5.3.
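The weighted average mentioned above is the standard inverse-variance combination. The sketch below assumes uncorrelated Gaussian errors, which is a simplification of a real analysis, and the mass values and errors are invented for illustration.

```python
import math

def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its error (uncorrelated inputs)."""
    weights = [1.0 / (err * err) for err in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)

# Hypothetical per-event W mass measurements (GeV) and their estimated errors.
example_masses = [80.0, 82.0]
example_errors = [1.0, 2.0]
mean, err = weighted_mean(example_masses, example_errors)
print(f"M_W = {mean:.2f} +/- {err:.2f} GeV")
```

The better-measured value dominates the result, so the combined error is smaller than either input error.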

## **8.4.13** *σ***(***e***<sup>+</sup>***e<sup>−</sup> → W***<sup>+</sup>***W<sup>−</sup>***)**

The three diagrams contributing to the e<sup>+</sup>e<sup>−</sup> → W<sup>+</sup>W<sup>−</sup> cross section are shown in Fig. 8.17. All three are needed to give the cancellation required to avoid a divergent theoretical result for this cross section as the energy increases. It is therefore very satisfying to see that the measured values agree with the theory when all three diagrams are present but completely disagree with a calculation neglecting the triple-gauge-boson coupling. This provides conclusive evidence for the existence of the triple-gauge-boson couplings γW<sup>+</sup>W<sup>−</sup> and ZWW, with the rates agreeing with those predicted by the unified electroweak theory.

**Fig. 8.17** Three lowest-order Feynman diagrams for W-pair production: (a) t-channel ν<sub>e</sub> exchange; (b) s-channel γ exchange; (c) s-channel Z exchange.

**Fig. 8.18** The √s dependence of σ(e<sup>+</sup>e<sup>−</sup> → W<sup>+</sup>W<sup>−</sup>) near threshold, from which the W mass may be determined. The lower curve, which follows the data, shows the predicted cross section including all three of the diagrams shown in Fig. 8.17, and the other curves show the prediction if some of these diagrams are omitted. From [16].

Also, the threshold energy for W<sup>+</sup>W<sup>−</sup> pair production gives a separate measurement of the mass of the W. The shape of σ(e<sup>+</sup>e<sup>−</sup> → W<sup>+</sup>W<sup>−</sup>) as a function of √s just above threshold is shown in Fig. 8.18. The result for the mass of the W from LEP (combining the cross-section threshold measurement and the individual event reconstruction method described above) is M<sub>W</sub> = 80.39 ± 0.09 GeV. The Tevatron experiments provide a better measurement, as will be discussed in Section 8.5.2.

## **8.5** *W* **and** *Z* **physics at hadron colliders**

## **8.5.1** *W* **and** *Z* **discovery**

The W and Z were discovered in 1983 at the CERN p̄p collider. The accelerator physics required to produce sufficiently intense p̄ beams is reviewed in Chapter 3. The largest branching ratios for W and Z are to hadronic final states, but these are very difficult to study in a hadron collider because of the very large cross section for QCD jets (see Chapter 9). Therefore, the experiments focused on the relatively clean leptonic decay modes: W → eν<sub>e</sub>, W → μν<sub>μ</sub>, and Z → e<sup>+</sup>e<sup>−</sup>, Z → μ<sup>+</sup>μ<sup>−</sup>.<sup>23</sup> The key detector feature that enabled the W discovery was the use of 'hermetic' detectors, which allowed the neutrino transverse momentum to be determined from the measured missing transverse energy (E<sub>T</sub><sup>miss</sup>; see Chapter 4). As the mass of the W was predicted to be very large (∼80 GeV), the signature was a high-transverse-momentum lepton (electron or muon) and a high value for E<sub>T</sub><sup>miss</sup>. While there are backgrounds from QCD that can produce 'fake' electrons or muons in the detector, they would not usually have such a large energy, nor would a QCD event produce such a large value of E<sub>T</sub><sup>miss</sup>. The signature for a Z is an e<sup>+</sup>e<sup>−</sup> or μ<sup>+</sup>μ<sup>−</sup> pair with invariant mass consistent with that of the Z, resulting in a narrow resonance on top of a low background. The measured W and Z masses were in very good agreement with the predictions from electroweak theory.<sup>24</sup> Much higher-precision measurements of the Z mass were made at LEP, as discussed in Section 8.4.

<sup>23</sup>The W was discovered before the Z because the product of cross section and branching ratio is an order of magnitude bigger than that for the Z.

<sup>24</sup>The fact that the W and Z signatures were so clean was a major turning point in the subject. Previous prejudice was that hadron colliders were too 'dirty'. This change of attitude opened the way to the construction of the LHC.

## **Test of V***−***A**

We will need to assume that the W decays are via a V−A coupling in order to be able to use their distinctive signature quantitatively. The most direct and in some sense simplest test of the V−A theory of weak decays is provided by measuring the angular distribution of the charged leptons resulting from W decay. We can calculate the CMS angular distribution, as will be discussed shortly. As the Ws are produced with finite longitudinal momentum, we need to boost the measured event to the qq̄ CMS. We do not directly determine the longitudinal momentum of the neutrino, but it can be determined up to a quadratic ambiguity by using the W mass constraint. The measured distribution [15] is shown in Fig. 8.19.

**Fig. 8.19** Measured angular distribution in W → eν<sub>e</sub> decays [15].

## **8.5.2** *W* **mass determination at the Tevatron**

The current best measurement of the W mass comes from the Tevatron p̄p collider, which had a CMS energy of 1.96 TeV. The mass of the Z was measured very precisely at LEP. Combining precision measurements of the W and Z bosons is interesting because the result provides a window on possible higher-mass particles via radiative corrections (see Section 7.4.3).

The decay modes W → lν with l = e or l = μ are used for the W mass determination, since they have very clean signals. The neutrino is not directly detected; however, the transverse component of the neutrino momentum can be determined from the missing transverse momentum in the event. Too much energy is lost in the beam pipes for the longitudinal component to be determined. Also, the quarks carry unknown fractions x and x̄ of the proton and antiproton momenta, and hence the invariant mass of the W cannot be computed from the final-state particles in an individual event. However, the mass of the W can be determined from the data on a statistical basis. The W bosons are produced via the parity-violating V−A interaction. We will assume that the quarks (antiquarks) come from the p (p̄) because the typical values of the proton momentum fraction x (antiproton momentum fraction x̄) are quite large. All the quarks and leptons are ultrarelativistic, so the V−A interaction, which couples left-handed particles (right-handed antiparticles), will result in negative-helicity particles (positive-helicity antiparticles). The spin structure in W production and decay is shown in Fig. 8.20.

**Fig. 8.20** Spin structure in qq̄ → W<sup>−</sup> → e<sup>−</sup>ν̄<sub>e</sub> events.

The W<sup>−</sup> has a spin along the proton beam direction of J<sub>z</sub> = −1 and the l<sup>−</sup>ν̄<sub>l</sub> system has J<sub>z′</sub> = −1 along an axis pointing in the direction of the l<sup>−</sup>. The angular distribution in the W CMS is then given simply by the rotation matrix (see Chapter 2):

$$\frac{\mathrm{d}N}{\mathrm{d}\cos\theta^\*} = N\_0[d\_{-1,-1}^1(\cos\theta^\*)]^2 = \frac{N\_0}{4}(1+\cos\theta^\*)^2\tag{8.17}$$

where N<sub>0</sub> is a normalization constant. From the chain rule, we can change variable to the component of the momentum of the electron perpendicular to the beam, p<sub>T</sub><sup>e</sup>:

$$\frac{\mathrm{d}N}{\mathrm{d}p\_{\mathrm{T}}^{e}} = \frac{\mathrm{d}N}{\mathrm{d}\cos\theta^{\*}} \frac{\mathrm{d}\cos\theta^{\*}}{\mathrm{d}p\_{\mathrm{T}}^{e}}\tag{8.18}$$

with

$$\cos\theta^\* = \sqrt{1 - (\sin\theta^\*)^2} = \sqrt{1 - \left(\frac{2p\_\mathrm{T}^e}{M\_W}\right)^2} \tag{8.19}$$

Differentiating gives

$$\frac{\mathrm{d}\cos\theta^{\*}}{\mathrm{d}p\_{\mathrm{T}}^{e}} = -\frac{4p\_{\mathrm{T}}^{e}/M\_{W}^{2}}{\sqrt{1 - \left(\frac{2p\_{\mathrm{T}}^{e}}{M\_{W}}\right)^{2}}}\tag{8.20}$$

<sup>25</sup>The peak arises from the Jacobian of the change of variables and is often called the 'Jacobian peak'.

Substituting from eqns 8.17, 8.19, and 8.20 into eqn 8.18 gives<sup>25</sup>

$$\left| \frac{\mathrm{d}N}{\mathrm{d}p\_{\mathrm{T}}^{e}} \right| = N\_{0} \frac{p\_{\mathrm{T}}^{e}}{M\_{W}^{2}} \frac{\left[1 + \sqrt{1 - \left(\frac{2p\_{\mathrm{T}}^{e}}{M\_{W}}\right)^{2}}\right]^{2}}{\sqrt{1 - \left(\frac{2p\_{\mathrm{T}}^{e}}{M\_{W}}\right)^{2}}} \tag{8.21}$$

<sup>26</sup>At collider energies, the masses of the leptons (e and μ) are negligible compared with their momenta, and therefore we can use the massless approximation p = E.

From this expression, it can be seen that the p<sub>T</sub> spectrum will be peaked towards its upper endpoint at M<sub>W</sub>/2. In principle, the W mass can be determined by fitting the measured charged-lepton p<sub>T</sub> distribution to eqn 8.21.<sup>26</sup> However, the shape of the distribution is distorted by the distribution of the transverse momentum of the W, P<sub>T</sub><sup>W</sup>. Therefore, it is convenient to define the transverse mass M<sub>T</sub> in a similar way to the invariant mass but only considering transverse components:

$$M\_{\rm T}^2 = (p\_{l,\rm T} + p\_{\nu,\rm T})^2 - |\mathbf{p}\_{l,\rm T} + \mathbf{p}\_{\nu,\rm T}|^2\tag{8.22}$$

This can also be expressed as (see Exercise 8.5)

$$M\_{\rm T}^2 = 2p\_{\rm T}^l p\_{\rm T}^\nu (1 - \cos \Delta \phi) \tag{8.23}$$

where Δφ is the angle between **p**<sub>T</sub><sup>l</sup> and **p**<sub>T</sub><sup>ν</sup>, the transverse momenta of the lepton and neutrino, respectively.
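The two forms of the transverse mass can be compared numerically. The sketch below assumes massless leptons, so that the transverse energy equals the magnitude of the transverse momentum; the function names and the momenta are illustrative only.

```python
import math

def transverse_mass_from_vectors(pl_t, pnu_t):
    """Eqn 8.22 for massless particles; pl_t and pnu_t are (px, py) in GeV."""
    pl = math.hypot(pl_t[0], pl_t[1])
    pnu = math.hypot(pnu_t[0], pnu_t[1])
    sx = pl_t[0] + pnu_t[0]
    sy = pl_t[1] + pnu_t[1]
    mt2 = (pl + pnu) ** 2 - (sx * sx + sy * sy)
    return math.sqrt(max(mt2, 0.0))  # guard against tiny negative rounding

def transverse_mass_from_angle(pl, pnu, dphi):
    """Eqn 8.23: magnitudes of the transverse momenta and their opening angle."""
    return math.sqrt(2.0 * pl * pnu * (1.0 - math.cos(dphi)))

# Hypothetical transverse momenta (GeV) of the lepton and the neutrino.
lepton_t = (30.0, 0.0)
neutrino_t = (-20.0, 35.0)
dphi = (math.atan2(neutrino_t[1], neutrino_t[0])
        - math.atan2(lepton_t[1], lepton_t[0]))

mt_a = transverse_mass_from_vectors(lepton_t, neutrino_t)
mt_b = transverse_mass_from_angle(math.hypot(*lepton_t),
                                  math.hypot(*neutrino_t), dphi)
print(f"eqn 8.22: {mt_a:.3f} GeV, eqn 8.23: {mt_b:.3f} GeV")
```

For a back-to-back pair with equal transverse momenta p<sub>T</sub>, both functions return 2p<sub>T</sub>.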

## **Experimental aspects of measuring** *M<sup>W</sup>* **from the** *M***<sup>T</sup> distribution**

For W bosons produced with no transverse momentum, M<sub>T</sub> = 2p<sub>T</sub><sup>l</sup> and to lowest order the effect of non-zero values of P<sub>T</sub><sup>W</sup> will not change the value of M<sub>T</sub>; hence there should be an endpoint at M<sub>W</sub>. The distribution is smeared by the finite width of the W and by experimental resolution. However, if these effects are taken into account, the data can be used in a fit to determine M<sub>W</sub>. There are also several systematic uncertainties that must be understood before a precision measurement of M<sub>W</sub> can be made.

One of the most important of these is the energy scale for electrons and muons; this can be constrained by fitting the l<sup>+</sup>l<sup>−</sup> invariant mass (where l is an e or a μ) to the peaks from Z, Υ, and J/ψ decays. Using the known masses of these resonances, the energy scale can be calibrated, allowing for any nonlinearity. As M<sub>Z</sub> > M<sub>W</sub>, it is essential to fix the nonlinearity as well as the overall energy scale of the detector.

Another key systematic uncertainty is the measurement of p<sub>T</sub><sup>ν</sup> by the method of missing transverse momentum. This can be constrained by studying the apparent missing transverse energy in Z → l<sup>+</sup>l<sup>−</sup> decays and looking at the agreement between the values of p<sub>T</sub><sup>Z</sup> inferred from the accurately measured l<sup>+</sup>l<sup>−</sup> system and the hadronic recoil. The axes perpendicular and parallel to the bisector of the l<sup>+</sup>l<sup>−</sup> system are defined as shown in Fig. 8.21. For a perfect detector, the transverse momentum of the l<sup>+</sup>l<sup>−</sup> system would be balanced by the hadronic recoil u. The l<sup>+</sup>l<sup>−</sup> system is measured with far better precision than that of the hadronic recoil. Therefore, the distribution of p<sub>T</sub><sup>ll</sup> + u can be used to determine the hadronic response. An example of such a plot from the CDF experiment [1] is shown in Fig. 8.22, which shows the mean value of p<sub>η</sub><sup>ll</sup> + u<sub>η</sub> as a function of p<sub>T</sub><sup>ll</sup>. The quantity is projected onto the η axis because the experimental error in the value of p<sub>η</sub><sup>ll</sup> is mainly due to errors in the measurements of angles, rather than energy. Measurements of angles from the tracking detector are very precise, so the measurement errors in this quantity are greatly reduced. The value of this quantity would be 0 for an ideal detector and the fact that the mean value is positive is due to energy loss outside the acceptance of the detector and in cracks between calorimeter cells.

**Fig. 8.21** Definition of the axes parallel and perpendicular to the bisector of the l<sup>+</sup>l<sup>−</sup> system.

**Fig. 8.22** Calibration of the hadronic response by measuring the mean value of p<sub>η</sub><sup>ll</sup> + u<sub>η</sub> versus p<sub>T</sub><sup>ll</sup> as measured by CDF in Z → e<sup>+</sup>e<sup>−</sup> decays [1].

**Fig. 8.23** W mass fit at CDF in the W → eνe channel [3].

The M<sub>T</sub> distribution for W → μν<sub>μ</sub> shows a very clean Jacobian peak (see Fig. 8.23). From the fits to the M<sub>T</sub> spectra from W → eν<sub>e</sub> and W → μν<sub>μ</sub>, the W mass is determined to be 80 387 ± 19 MeV, which is the most precise measurement to date [3]. The current (2014) world average value [115] is 80 385 ± 15 MeV.
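The origin of the Jacobian peak can be reproduced with a toy Monte Carlo. The sketch below is an idealization and not the CDF analysis: it assumes a W at rest in the transverse plane with zero width, a perfect detector, and the angular distribution of eqn 8.17 sampled over the full range of cos θ*. The value of M<sub>W</sub>, the binning, and the function names are illustrative choices.

```python
import math
import random

M_W = 80.4  # GeV, illustrative value

def sample_lepton_pt(rng):
    """Draw cos(theta*) from (1 + cos)^2 and return pT = (M_W / 2) sin(theta*)."""
    # The cumulative distribution of (3/8)(1 + c)^2 on [-1, 1] is (1 + c)^3 / 8,
    # so inverting u = (1 + c)^3 / 8 gives c = 2 u^(1/3) - 1.
    u = rng.random()
    cos_theta = 2.0 * u ** (1.0 / 3.0) - 1.0
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    return 0.5 * M_W * sin_theta

def pt_histogram(n_events=20000, n_bins=20, seed=1):
    """Histogram of lepton pT between 0 and M_W / 2."""
    rng = random.Random(seed)
    width = 0.5 * M_W / n_bins
    counts = [0] * n_bins
    for _ in range(n_events):
        pt = sample_lepton_pt(rng)
        counts[min(int(pt / width), n_bins - 1)] += 1
    return counts, width

counts, width = pt_histogram()
for i, n in enumerate(counts):
    print(f"{i * width:5.1f}-{(i + 1) * width:5.1f} GeV {'#' * (n // 100)}")
```

The entries pile up in the last bin, just below M<sub>W</sub>/2. In this idealized case M<sub>T</sub> = 2p<sub>T</sub>, so the same histogram with the axis doubled shows the endpoint of the transverse mass at M<sub>W</sub>.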

## **8.5.3 Width of the** *W*

The total width of the W was measured at CDF and D0 using two methods to compare it with the value predicted by the electroweak theory of Γ<sub>W</sub> = 2.077 GeV. The direct technique is to extract it from the same transverse mass distribution as described above. The distribution will be modified depending on the W width. The value obtained is Γ<sub>W</sub> = 2.11 ± 0.32 GeV.

The indirect method of obtaining the total W width involves using some information from LEP. The width is obtained by measuring the production cross-section ratio<sup>27</sup>

$$\frac{\sigma(\bar{p}p \to W \to l\nu)}{\sigma(\bar{p}p \to Z \to \bar{l}l)} = \frac{\sigma(\bar{p}p \to W)}{\sigma(\bar{p}p \to Z)} \times \frac{\Gamma(W \to l\nu)}{\text{BR}(Z \to \bar{l}l)} \times \frac{1}{\Gamma\_W} \tag{8.24}$$

The other numbers in the formula are obtained as follows: the ratio σ(p̄p → W)/σ(p̄p → Z) is predicted by modelling;<sup>28</sup> Γ(W → lν) is predicted from the electroweak theory; BR(Z → l̄l) was measured at LEP.

<sup>27</sup>Measuring a ratio of cross sections is easier than measuring a correctly normalized single cross section, since many systematic effects will cancel out.

<sup>28</sup>Again a ratio is easier to predict.

The result obtained is Γ<sub>W</sub> = 2.062 ± 0.059 GeV. The importance of this measurement is that the total W width is sensitive to any particles into which the W might decay, including some that we might not otherwise have discovered yet. This indirect method produces a much more precise check than either the direct method or the measurements from LEP. Everything is consistent with the prediction from the electroweak theory.

## **8.6 Top-quark physics**

The top quark was expected ever since the b quark was discovered in 1977. From the measurements at LEP of Z → bb̄, the b quark was confirmed to be a member of a weak isodoublet. While these results gave strong indications that the top quark must exist, it was the combination of the precise electroweak data with the radiative corrections that gave a prediction for the top-quark mass of ∼170 GeV (see Chapter 7). We will review the discovery of the top quark in Section 8.6.1 and then consider how to make precision measurements of its mass in Section 8.6.2.

## **8.6.1 Top-quark discovery**

Given the high mass, the only accelerator at the time capable of producing the top quark was the Tevatron. The largest cross sections for producing top quarks at the Tevatron are pair production from either gg or qq̄ initial states (see Fig. 8.24).

**Fig. 8.24** Lowest-order Feynman diagrams for tt̄ production.

The top quark decays very rapidly, with a branching ratio of essentially 100% into a W and a b quark. Since the mass of the top is much larger than that of the W, the W from top decay is on mass shell and it decays through all its decay modes with the known branching ratios. Therefore, it is easy to work out the fraction of events with different final-state topologies. All events will have two b-jets. The decays of the two Ws leading to final states that can be identified above the backgrounds are listed in Table 8.2.

**Table 8.2** Some tt̄ final-state topologies.

The main Standard Model background is W + jets from higher-order QCD corrections to W production. In general, this type of background event will not contain b quarks. Therefore, identifying jets containing b quarks is a critical technique for separating the top signal from backgrounds. There are two strategies for identifying b quarks (called 'b-tagging'):


The first technique is limited by the semileptonic branching ratio, so lifetime tagging is more powerful. This requires very high-precision tracking close to the interaction point. As B hadrons will typically decay inside the beam pipe, very high-precision detectors of very low mass to minimize multiple scattering are required. The only practical technique that can satisfy both requirements is based on the use of silicon detectors (see Chapter 4). Starting from the interaction point, the beam pipe within the vertex detector is made from beryllium (Z = 4), since this has a very long radiation length (which minimizes distortion of charged-particle tracks due to multiple scattering). The active detector components consist of layers of silicon strip detectors around the beam pipe. The first layer is mounted on the beam pipe to minimize the distance to the beam line. This reduces extrapolation errors and the effect of multiple scattering in the beam pipe. The transverse impact parameter is defined as the closest approach of an extrapolated track to the primary vertex location in the plane transverse to the beam line. Short-lived hadrons from light-quark jets produce a Gaussian distribution and the long-lived B hadrons generate an exponential tail at high values of the impact parameter.
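The definition of the transverse impact parameter can be made concrete with a short sketch. It treats the track as a straight line in the transverse plane, which ignores the curvature of a real track in the magnetic field, and it returns only the magnitude, not a signed value. The function name and the toy decay geometry are assumptions for illustration.

```python
import math

def transverse_impact_parameter(point_on_track, phi, primary_vertex=(0.0, 0.0)):
    """Closest approach, in the transverse plane, of a straight track to the vertex.

    point_on_track -- (x, y) of any point on the track, e.g. the decay vertex
    phi            -- azimuthal direction of the track in radians
    primary_vertex -- (x, y) of the primary interaction point
    """
    dx = point_on_track[0] - primary_vertex[0]
    dy = point_on_track[1] - primary_vertex[1]
    # Component of (dx, dy) perpendicular to the direction (cos phi, sin phi).
    return abs(dx * math.sin(phi) - dy * math.cos(phi))

# Toy geometry (assumption): a B hadron flies 3 mm along x before decaying,
# and one decay track makes an angle of 0.2 rad with the flight direction.
d0_b = transverse_impact_parameter((3.0, 0.0), 0.2)

# A track that starts at the primary vertex has zero impact parameter.
d0_prompt = transverse_impact_parameter((0.0, 0.0), 0.2)

print(f"displaced track: d0 = {d0_b:.3f} mm, prompt track: d0 = {d0_prompt:.3f} mm")
```

In this idealized picture a prompt track has exactly zero impact parameter; in a real detector the measurement resolution smears it into the Gaussian core described above, while displaced decays populate the tail.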

The power of b-tagging to identify the top-quark signal is illustrated in Fig. 8.25 [7], which shows the number of events with Ws and various numbers of high-transverse-momentum jets before and after b-tagging. The signal from tt̄ events should ideally contain 4 jets, but this can be distorted by instrumental effects. The number of W + jet events will decrease rapidly with increasing number of jets, because each additional jet has a penalty of the order of α<sub>s</sub>. A very clear signal is visible for 4 jets after b-tagging, as shown in Fig. 8.25.

## **8.6.2 Top-quark mass measurement**

Once a clean signal for tt̄ has been established, these events can be used to fit the top-quark mass m<sub>t</sub> by reconstructing the decay products of the t and t̄. This procedure is difficult, since there is no way of uniquely identifying which jet came from the t and which from the t̄. Many possible combinations have to be tried in turn and some algorithm used to select the most likely combination. In the 'all-jets' channel, all the decay products of the t and t̄ are measured directly, but this channel suffers from a large background.

**Fig. 8.25** Distribution of the number of jets before and after b-tagging as measured by CDF. The circles (triangles) are the data before (after) b-tagging and the shaded boxes represent the background estimates after b-tagging. From [7].

The semileptonic channels are cleaner, but in a hadron collider only the momentum of the neutrino transverse to the beam direction can be reconstructed. The unknown longitudinal momentum of the neutrino can be determined by exploiting the constraint from the decay W → lν:

$$(M\_W)^2 = (E\_\nu + E\_l)^2 - (p\_\nu + p\_l)^2 \tag{8.25}$$

The neutrino momentum is split into a transverse component p<sub>ν,T</sub>, which can be inferred from the measured missing transverse energy, and an unknown longitudinal component p<sub>ν,L</sub>. This gives

$$(M\_W)^2 = \left(\sqrt{p\_{\nu,\mathrm{T}}^2 + p\_{\nu,\mathrm{L}}^2} + E\_l\right)^2 - \left(p\_{\nu,\mathrm{T}} + p\_{l,\mathrm{T}}\right)^2 - \left(p\_{\nu,\mathrm{L}} + p\_{l,\mathrm{L}}\right)^2 \tag{8.26}$$

Using the known mass of the W, the value of p<sub>ν,L</sub> can be determined up to a twofold ambiguity by solving the quadratic equation 8.26. An example of a fit to the top mass is shown in Fig. 8.26 [2].
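The twofold ambiguity can be made explicit by solving eqn 8.26 in closed form. The sketch below assumes massless lepton and neutrino, for which the W mass constraint reduces to E<sub>l</sub>E<sub>ν</sub> = a + p<sub>l,L</sub>p<sub>ν,L</sub> with a = M<sub>W</sub><sup>2</sup>/2 + **p**<sub>l,T</sub> · **p**<sub>ν,T</sub>; squaring this gives a quadratic in p<sub>ν,L</sub>. The function name, the value of M<sub>W</sub>, and the toy event are illustrative, and returning an empty list when the discriminant is negative is only one of several possible conventions.

```python
import math

M_W = 80.4  # GeV, illustrative value for the W mass constraint

def neutrino_pz_solutions(lepton, nu_px, nu_py, m_w=M_W):
    """Solve eqn 8.26 for p_nu,L, assuming massless lepton and neutrino.

    lepton       -- (px, py, pz) of the charged lepton in GeV
    nu_px, nu_py -- neutrino transverse momentum from the missing energy
    Returns a sorted list with 0, 1 or 2 real solutions.
    """
    lx, ly, lz = lepton
    e_l = math.sqrt(lx * lx + ly * ly + lz * lz)
    pt_l2 = lx * lx + ly * ly
    pt_nu2 = nu_px * nu_px + nu_py * nu_py
    a = 0.5 * m_w * m_w + lx * nu_px + ly * nu_py
    disc = a * a - pt_l2 * pt_nu2
    if disc < 0.0 or pt_l2 == 0.0:
        return []  # no real solution for this event
    root = e_l * math.sqrt(disc)
    candidates = {(a * lz + root) / pt_l2, (a * lz - root) / pt_l2}
    # Keep only roots of the unsquared equation E_l * E_nu = a + lz * pz.
    return sorted(pz for pz in candidates if a + lz * pz >= 0.0)

# Toy event (assumption): W at rest, lepton at 60 degrees to the beam axis,
# neutrino exactly opposite, so the true neutrino pz is -20.1 GeV.
p = 0.5 * M_W
lepton = (p * math.sin(math.radians(60.0)), 0.0, p * math.cos(math.radians(60.0)))
solutions = neutrino_pz_solutions(lepton, -lepton[0], 0.0)
print([f"{pz:.1f}" for pz in solutions])
```

One of the two roots reproduces the true longitudinal momentum; the mass constraint alone cannot tell which, so an analysis must either try both or apply a further criterion.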

The combined result of the top-quark mass measurements from the Tevatron CDF and D0 experiments is m<sub>t</sub> = 173.20 ± 0.51(statistical) ± 0.71(systematic) GeV [131]. The top-quark measurements at LHC will be reviewed in Chapter 13.

## **8.6.3 Top-quark production cross sections**

As well as measuring the mass of the top quark, another key measurement is the top-quark production cross section. The Tevatron collider measurements [115] for the total top production cross section σ<sub>tot</sub>(pp̄ → tt̄) in pp̄ collisions at √s = 1.96 TeV are σ<sup>D0</sup><sub>tt̄</sub> = 7.56<sup>+0.63</sup><sub>−0.56</sub> (statistical + systematic) pb and σ<sup>CDF</sup><sub>tt̄</sub> = 7.50 ± 0.48 (statistical + systematic) pb. The methodology for calculating cross sections for Standard Model processes will be described in Chapter 9.

**Fig. 8.26** Top-quark mass fits from the CDF experiment. (a) Reconstructed dijet mass, which shows the expected peak at M<sub>W</sub>. (b) Distribution of reconstructed top-quark masses. From [2].

## **8.7 Summary**

All electroweak data are consistent with the Standard Model at a precision sufficient to be sensitive to radiative corrections. This enables us to place limits on masses and couplings of new particles. For example, using the measured masses of the top quark and the W boson within the context of the Standard Model, we can predict a value for the mass of the Higgs boson. The results are shown in Fig. 8.27. Combining this analysis with the results of the direct Higgs search at LEP (see Chapter 12), the 95% confidence level allowed range [97] for the Higgs mass is 114 GeV < m<sub>H</sub> < 149 GeV. The discovery of a Higgs boson in this mass range is discussed in Chapter 12.

**Fig. 8.27** The mass of the W versus the mass of the top quark, with results from direct measurements and indirect determinations from Standard Model fits. The region m<sub>H</sub> < 114 GeV is excluded by direct searches at LEP. The small ellipse shows the Standard Model fit to all the precision data, including the direct measurements of the W and top-quark masses, whereas the larger dashed ellipse includes only the precision data collected near the Z pole. The shaded diagonal bands show the Standard Model expectations for different ranges of the Higgs boson mass. From [97].

## **Chapter summary**


## **Further reading**


Model and related topics', the review article 'Electroweak model and constraints on new physics' gives a full discussion of the Standard Model fits to all the precision electroweak data.

• Behnke, O. et al. (Eds.) (2013). Data Analysis in High Energy Physics. Wiley. A good book on advanced data analysis techniques.

## **Exercises**


Hint: The spin of the photon projected on its direction of propagation can only be +1 or −1.

(8.2) Justify eqn 8.1.

Hint: first consider the allowed couplings of L and R leptons and use the Standard Model relation for the L and R coupling constants in terms of the weak mixing angle (see eqn 7.40).

	- (a) e<sup>+</sup>e<sup>−</sup> → μ<sup>+</sup>μ<sup>−</sup>
	- (b) e<sup>+</sup>e<sup>−</sup> → b¯b

$$
\sigma(\bar{\nu}\_e + p \to e^+ + n) = \frac{4G\_{\rm F}^2}{\pi} p^2
$$

Derive this cross section from Fermi's Golden Rule, ignoring spin and assuming that the matrix element M<sub>if</sub> = 2G<sub>F</sub>. In a fission reactor, the energy released per fission is ∼200 MeV and there are 6 antineutrinos released. The average energy of the antineutrinos is approximately 2 MeV. What is the minimal power of a nuclear fission reactor to have three interactions per hour in a 200 kg water tank at 10 m distance?

(8.9) At a given neutrino energy, the ratio of antineutrino to neutrino cross-sections for scattering on electrons is

$$
\sigma(\bar{\nu}\_{\mu}) / \sigma(\nu\_{\mu}) = 0.85
$$

Deduce a value for sin<sup>2</sup> θ<sub>W</sub>.


$$
\sigma\_{\rm T} \simeq \frac{2G\_{\rm F}^2 M\_W^4}{\pi s} \sqrt{1 - \frac{4M\_W^2}{s}}
$$

where G<sub>F</sub> is the Fermi coupling constant, M<sub>W</sub> the W mass, and s the CMS energy squared, all in natural units. Express σ<sub>T</sub> in terms of s and β. With e<sup>+</sup> and e<sup>−</sup> beams of energy 80.65 GeV, a cross section for all-hadronic decay modes of 1.7 pb is obtained. Hence calculate a value for the W mass.

# **9 Dynamic quarks**

In this chapter, we consider the structure of hadrons and discuss the most direct dynamical evidence that hadrons are made from quarks. This evidence comes from the scattering of high-energy leptons off protons and neutrons. The relatively high cross section for reactions at large transverse momentum, so-called deep inelastic scattering (DIS), leads to the conclusion that these hadrons are made up of point-like constituents. To observe the structure within hadrons, a resolution smaller than the size of the hadron is required; hence the wavelength of the probe should satisfy λ ≪ R (where R is the radius of a hadron). From the de Broglie relationship<sup>1</sup> λ = h/p, it follows that high-energy experiments are required to study this structure. The uncertainty relation in a form more suitable for high-energy physics, Δx Δp c ≥ ħc, implies that we require an energy transfer of the order of at least 20 GeV to achieve a resolution of 10<sup>−2</sup> fm.<sup>2</sup>
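The estimate of 20 GeV follows directly from the value of ħc quoted in the footnote. The short sketch below, with a hypothetical function name, shows the arithmetic and the contrast with a probe that only resolves the hadron as a whole.

```python
HBAR_C_MEV_FM = 197.0  # MeV fm, the value of hbar*c quoted in the footnote

def energy_scale_for_resolution(delta_x_fm):
    """Energy transfer in GeV needed to resolve delta_x_fm, from dp*c ~ hbar*c / dx."""
    return HBAR_C_MEV_FM / delta_x_fm / 1000.0

fine = energy_scale_for_resolution(0.01)   # resolution of 10^-2 fm
coarse = energy_scale_for_resolution(1.0)  # roughly the size of a hadron
print(f"0.01 fm: {fine:.1f} GeV, 1 fm: {coarse:.3f} GeV")
```

Resolving structure a hundred times smaller than the hadron therefore requires a hundred times more energy transfer than seeing the hadron itself.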

Other evidence that hadrons are made of quarks was discussed in Chapter 5, where the static quark model of hadrons was used to explain the observed multiplets of hadrons and their masses and magnetic moments. In Section 9.1, we will consider scattering off a nucleus and in Section 9.2 scattering off individual nucleons. Then we will discuss how the DIS data can be explained in terms of the quark–parton model (QPM). The DIS data give indirect evidence for the existence of gluons, and we will consider more direct evidence for gluons from e<sup>+</sup>e<sup>−</sup> experiments. A brief discussion of QCD will be given, and this will be used to explain the success of the naive QPM. Finally, the QPM will be extended to hadron–hadron collisions.

## **9.1 Rutherford scattering**

The prototype for all scattering experiments is the Rutherford scattering experiment that discovered the atomic nucleus. The experiment involved scattering α particles off a gold foil and measuring the angular distribution of the scattered particles. We can calculate the transition rate from Fermi's Golden Rule, which gives

$$w\_{\rm fi} = 2\pi |\langle \mathbf{f} | H' | \mathbf{i} \rangle|^2 \rho(E) \tag{9.1}$$

where H′ is the Hamiltonian for the perturbation that causes the transition between initial state i and final state f and ρ(E) is the density-of-states factor.

<sup>1</sup>Count Louis de Broglie 1892–1987.

<sup>2</sup>ħc = 197 MeV fm.

In this case, the perturbation is given by the Coulomb interaction between the α particle (atomic number Z<sub>1</sub>) and the nucleus (atomic number Z<sub>2</sub>). If all the positive charge is contained in a point-like nucleus, we can write the potential as

$$V(r) = \frac{Z\_1 Z\_2 \alpha}{r} \tag{9.2}$$

where α = e<sup>2</sup>/(4πε<sub>0</sub>) is the fine-structure constant. The matrix element is then given by the integral of V(r) between the initial and final states. We can use plane waves for the initial and final states of the α particles: Ψ<sub>i</sub> = exp(i**k**<sub>i</sub> · **r**) and Ψ<sub>f</sub> = exp(i**k**<sub>f</sub> · **r**). This gives

$$
\langle \mathbf{f} | H' | \mathbf{i} \rangle = \int \exp(\mathrm{i} \mathbf{k}\_{\mathrm{i}} \cdot \mathbf{r}) V(r) \exp(-\mathrm{i} \mathbf{k}\_{\mathrm{f}} \cdot \mathbf{r}) \, \mathrm{d}^3 \mathbf{r} \tag{9.3}
$$

Substituting for V (r) from eqn 9.2 into eqn 9.3 and then using **q** = **k**i−**k**<sup>f</sup> for the momentum transfer gives

$$
\langle \mathbf{f} | H' | \mathbf{i} \rangle = Z\_1 Z\_2 \alpha \int \frac{\mathbf{e}^{\mathbf{i} \mathbf{q} \cdot \mathbf{r}}}{r} \, \mathrm{d}^3 \mathbf{r} \tag{9.4}
$$

It is convenient to use spherical polar coordinates and to take the z axis of **r** to lie along **q** so that **q** · **r** = qr cos θ. Then we can perform the integral over φ and θ:

$$\langle \mathbf{f} | H' | \mathbf{i} \rangle = Z\_1 Z\_2 2 \pi \alpha \int \mathbf{e}^{iqr \cos \theta} r \,\mathrm{d}r \,\mathrm{d} \cos \theta \tag{9.5}$$

$$=\frac{Z\_1 Z\_2 2\pi\alpha}{\text{iq}} \int\_0^\infty (\mathbf{e}^{iqr} - \mathbf{e}^{-\mathbf{i}qr}) \,\mathrm{d}r \tag{9.6}$$

This integral is divergent, which is related to the fact that the total cross section for scattering of two particles via an inverse-square force law is infinite. We will proceed by modifying the potential by a factor e<sup>−r/a</sup> and at the end of the calculation we will let a → ∞. Then we can perform the integral in eqn 9.6 to give

$$\langle \mathbf{f} | H' | \mathbf{i} \rangle = Z\_1 Z\_2 \frac{2 \pi \alpha}{\mathrm{i} q} \left( \frac{1}{1/a - \mathrm{i} q} - \frac{1}{1/a + \mathrm{i} q} \right) \tag{9.7}$$

and then, if we let a → ∞, we obtain the matrix element

$$
\langle \mathbf{f} | H' | \mathbf{i} \rangle = \frac{4\pi Z\_1 Z\_2 \alpha}{q^2} \tag{9.8}
$$
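The limiting procedure above can be checked numerically. The following is a minimal sketch, assuming unit coupling (the factor Z<sub>1</sub>Z<sub>2</sub>α is dropped) and arbitrary illustrative values of q and a; the function names are ours and not part of the text. After the angular integrals, the screened matrix element reduces to (4π/q)∫ sin(qr) e<sup>−r/a</sup> dr, which should equal 4π/(q<sup>2</sup> + 1/a<sup>2</sup>) and approach 4π/q<sup>2</sup> as a grows.

```python
import math

def screened_coulomb_ft(q, a, r_max_in_a=40.0, n=100_000):
    """Numerically evaluate (4*pi/q) * integral_0^inf sin(q r) exp(-r/a) dr.

    This is the Fourier transform of exp(-r/a)/r once the angular
    integrals have been done. Composite Simpson's rule is used on
    [0, r_max_in_a * a]; n must be even. Unit coupling is assumed.
    """
    r_max = r_max_in_a * a
    h = r_max / n

    def f(r):
        return math.sin(q * r) * math.exp(-r / a)

    total = f(0.0) + f(r_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return 4.0 * math.pi / q * total * h / 3.0

def screened_closed_form(q, a):
    """Closed form obtained from eqn 9.7 before taking a -> infinity."""
    return 4.0 * math.pi / (q * q + 1.0 / (a * a))

q = 1.3  # arbitrary momentum transfer (illustrative units)
for a in (1.0, 5.0, 25.0):
    numeric = screened_coulomb_ft(q, a)
    exact = screened_closed_form(q, a)
    print(f"a = {a:5.1f}: numeric = {numeric:.6f}, closed form = {exact:.6f}")
print(f"a -> infinity limit 4*pi/q^2 = {4.0 * math.pi / q**2:.6f}")
```

As a increases, both columns approach the unscreened result of eqn 9.8.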

The density of states for a unit volume V (this volume is arbitrary and will cancel with the flux in the calculation of the cross section) is given by (see Chapter 2)

$$\mathrm{d}N\_{\mathrm{f}} = \frac{\mathrm{d}^3 \mathbf{p}\_{\mathrm{f}}}{(2\pi)^3} \tag{9.9}$$

We make a change of variable using

$$\frac{\mathrm{d}p}{\mathrm{d}E} = \frac{1}{v\_{\mathrm{f}}}\tag{9.10}$$

where v<sub>f</sub> is the final velocity. As we are assuming an infinitely massive nucleus, we can neglect the nuclear recoil energy. Writing d<sup>3</sup>**p**<sub>f</sub> = p<sub>f</sub><sup>2</sup> dp<sub>f</sub> dΩ and substituting into eqn 9.9 gives

$$\mathrm{d}N\_{\mathrm{f}} = \frac{p\_{\mathrm{f}}^2 \,\mathrm{d}p\_{\mathrm{f}} \,\mathrm{d}\Omega}{(2\pi)^3} \tag{9.11}$$

The differential scattering cross section is given by R/v, where v is the velocity of the incident beam relative to the scattering centre and R is the reaction rate.<sup>3</sup>

<sup>3</sup>We have set the normalization volume to be unity, so the incident flux is v, and we have assumed a single target nucleus.

Substituting from eqns 9.8, 9.10, and 9.11 into eqn 9.1 gives the differential cross section as (see Exercise 9.1)

$$\frac{\mathrm{d}\sigma}{\mathrm{d}\Omega} = \frac{4(Z\_1 Z\_2)^2 \alpha^2 (m\_e)^2}{q^4} \tag{9.12}$$

This is an example of a scaling cross section, since it does not depend on any fixed scale in the problem. This arises because we have assumed that the nucleus is a point charge. If instead we allow for a charge density distribution ρ(r), then the potential becomes

$$V(\mathbf{r}) = Z\alpha \int \frac{\rho(R)}{|\mathbf{r} - \mathbf{R}|} \,\mathrm{d}^3 \mathbf{R} \tag{9.13}$$

Then the matrix element is modified from eqn 9.8 to become

$$\langle \mathbf{f} | H' | \mathbf{i} \rangle = Z \alpha \int \mathrm{d}^3 \mathbf{R} \int \mathrm{e}^{\mathrm{i}\mathbf{q} \cdot \mathbf{r}} \frac{\rho(R)}{|\mathbf{r} - \mathbf{R}|} \, \mathrm{d}^3 \mathbf{r} \tag{9.14}$$

Letting **s** = **r** − **R**, we can then write the matrix element as

$$\langle \mathbf{f} | H' | \mathbf{i} \rangle = Z \alpha \iint \mathrm{e}^{\mathrm{i}\mathbf{q}\cdot\mathbf{R}} \rho(R) \, \mathrm{d}^3 \mathbf{R} \, \frac{\mathrm{e}^{\mathrm{i}\mathbf{q}\cdot\mathbf{s}}}{|\mathbf{s}|} \, \mathrm{d}^3 \mathbf{s} \tag{9.15}$$

Hence the matrix element is modified by a multiplicative factor called the 'form factor'

$$F(q^2) = \int \mathrm{e}^{\mathrm{i}\mathbf{q}\cdot\mathbf{R}} \rho(R) \, \mathrm{d}^3 \mathbf{R} \tag{9.16}$$

The form factor can be seen to be the Fourier transform of the charge distribution into momentum space. By measuring the deviations from the pure Rutherford scattering cross section, eqn 9.12, the form factor can be determined. Hence the mean size and charge density distribution of the nucleus can be inferred. However, to obtain any meaningful data on the nuclear size, one must have data with momentum transfer large enough to satisfy the inequality qR<sub>nucleus</sub> ≳ 1, where R<sub>nucleus</sub> is the mean size of the nucleus. The cross section no longer shows scaling behaviour, because of the effect of the finite size of the nucleus. For example, if the nucleus had an exponential charge distribution

$$
\rho(\mathbf{r}) = \rho\_0 \mathbf{e}^{-mr} \tag{9.17}
$$

then the form factor would be given by

$$F(q^2) = \frac{8\pi m\rho\_0}{(q^2+m^2)^2} \tag{9.18}$$

Note that for q ≪ m, F(q<sup>2</sup>) is a constant, whereas for q ≫ m, F(q<sup>2</sup>) ∼ q<sup>−4</sup>, i.e. the cross section is suppressed by a factor of 1/q<sup>8</sup>. This feature, that the cross section is suppressed for values of q large compared with 1/R, where R is the size of the nucleus, is a general result and will be true whatever the precise form of the charge distribution. An example of some real data [53] and a fit to the charge density distribution are shown in Fig. 9.1.
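The form factor of the exponential charge distribution can be verified by direct integration. The sketch below is illustrative only: it assumes the normalization ρ<sub>0</sub> = m<sup>3</sup>/(8π), chosen by us so that the total charge integrates to one and F(0) = 1, and an arbitrary value of m. After the angular integrals, eqn 9.16 becomes F = (4πρ<sub>0</sub>/q)∫ R sin(qR) e<sup>−mR</sup> dR, which is compared with eqn 9.18.

```python
import math

def form_factor_numeric(q, m, rho0, n=100_000, r_max_in_units=60.0):
    """Radial form of eqn 9.16 for rho(R) = rho0 * exp(-m R).

    F = (4*pi*rho0/q) * integral_0^inf R sin(q R) exp(-m R) dR,
    evaluated with composite Simpson's rule up to R = r_max_in_units / m.
    """
    r_max = r_max_in_units / m
    h = r_max / n

    def f(R):
        return R * math.sin(q * R) * math.exp(-m * R)

    total = f(0.0) + f(r_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return 4.0 * math.pi * rho0 / q * total * h / 3.0

def form_factor_exact(q, m, rho0):
    """Closed form of eqn 9.18."""
    return 8.0 * math.pi * m * rho0 / (q * q + m * m) ** 2

m = 1.5                           # inverse length scale (illustrative value)
rho0 = m**3 / (8.0 * math.pi)     # assumed normalization: unit total charge
for ratio in (0.1, 1.0, 10.0):    # q/m below, at, and above the scale m
    q = ratio * m
    print(f"q/m = {ratio:5.1f}: numeric = {form_factor_numeric(q, m, rho0):.6e}, "
          f"eqn 9.18 = {form_factor_exact(q, m, rho0):.6e}")
```

With this normalization the form factor is simply (1 + q<sup>2</sup>/m<sup>2</sup>)<sup>−2</sup>, so it stays close to 1 at small q/m and falls as q<sup>−4</sup> at large q/m.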

## **9.2 Scattering from nucleons**

In the previous section, we have seen how deviations from Rutherford scattering can be used to measure the charge distribution of the nucleus. Now we will consider the problem of how to look for evidence of individual nucleons (protons and neutrons) in a nucleus. The answer is to do scattering experiments where we look for evidence of inelastic scattering off the whole nucleus and elastic scattering off a single nucleon. The kinematics are defined in Fig. 9.2.

The invariant mass W of the recoiling nucleon of mass M is given by

$$W^2 = (E^\*)^2 - |\mathbf{p}^\*|^2\tag{9.19}$$

and the 4-momentum transfer q is found from 4-momentum conservation at the lower vertex as q = (E<sup>∗</sup> − M, **p**<sup>∗</sup>). Therefore, we can write down the square of the 4-momentum transfer:

$$\begin{split}q^2 &= (E^\*-M)^2 - |\mathbf{p}^\*|^2\\ &= (E^\*)^2 - |\mathbf{p}^\*|^2 - 2ME^\* + M^2\end{split}\tag{9.20}$$

Substituting for W from eqn 9.19 gives

$$q^2 = W^2 - 2ME^\* + M^2 \tag{9.21}$$

From energy conservation, ν = E − E′ = E<sup>∗</sup> − M and therefore, from eqn 9.21,

$$q^2 = W^2 - M^2 - 2M\nu\tag{9.22}$$

For elastic scattering off the entire nucleus, the mass of the nucleus is unchanged in the collision, and hence W = M, so we get a peak in the change in the electron energy given by

$$\nu = -\frac{q^2}{2M\_{\text{nucleus}}}\tag{9.23}$$

Similarly if we had scattering off a single nucleon, then we would expect

$$\nu = -\frac{q^2}{2M\_{\text{nucleon}}}\tag{9.24}$$
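To see where these two peaks would lie, the sketch below evaluates the scattered energy for 400 MeV electrons at θ = 45° on <sup>4</sup>He, the conditions quoted for Fig. 9.3. It relies on an assumption not derived in the text: for a massless electron, Q<sup>2</sup> = −q<sup>2</sup> = 4EE′ sin<sup>2</sup>(θ/2). The mass values are approximate inputs supplied by us, and Fermi motion and binding energy are ignored.

```python
import math

def elastic_scattered_energy(E, theta_deg, M):
    """Scattered energy E' for elastic scattering off a target of mass M at rest.

    Combines nu = E - E' = Q^2 / (2 M) (eqns 9.23 and 9.24 with Q^2 = -q^2)
    with the massless-electron relation Q^2 = 4 E E' sin^2(theta/2),
    which is an assumption of this sketch. Energies and masses in MeV.
    """
    s2 = math.sin(math.radians(theta_deg) / 2.0) ** 2
    return E / (1.0 + 2.0 * E * s2 / M)

M_HE4 = 3727.4      # MeV, approximate 4He nuclear mass (assumed input)
M_NUCLEON = 938.9   # MeV, approximate average nucleon mass (assumed input)

E_beam, theta = 400.0, 45.0
for label, M in (("4He nucleus", M_HE4), ("single nucleon", M_NUCLEON)):
    E_out = elastic_scattered_energy(E_beam, theta, M)
    print(f"{label}: E' = {E_out:.1f} MeV, nu = {E_beam - E_out:.1f} MeV")
```

Under these assumptions the nuclear elastic peak sits near 388 MeV and the nucleon peak near 356 MeV, so the two are well separated in the scattered-electron energy.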

Some example data [87] are shown in Fig. 9.3, where the cross section for electron scattering off helium is plotted as a function of the scattered electron energy E′. A sharp peak is seen at the value expected for elastic scattering off the entire nucleus and a smeared peak is seen for elastic scattering off individual nucleons. The smearing is caused by the Fermi motion of the nucleons within the nucleus. Naively, we might hope that we could see evidence for the quark structure of the nucleons in a similar way, but now the Fermi motion is so big that it completely smears out the peak corresponding to elastic scattering off a quark.

**Fig. 9.2** Kinematics for scattering off individual nucleons. We assume that the nucleus is initially at rest in the laboratory frame of reference.

**Fig. 9.3** Scattering of 400 MeV electrons off <sup>4</sup>He, with the scattering angle fixed at θ = 45°. E′ is the energy of the scattered electron [87].

## **9.3 Quark–parton model**

This section introduces the quark–parton model (QPM), which will be used in the following sections to make predictions for deep inelastic scattering (DIS) processes. Our approach is not historically accurate: it took physicists a long time to take the experimental data seriously enough to really believe in quarks. However, it is much easier to understand. We will compare the predictions with the experimental data and show that the key features are confirmed by the data, namely that the nucleons contain spin-½, point-like particles with the fractional electric charges expected in the quark model.

In the QPM, we assume that the inelastic scattering of a lepton at large momentum transfer q with a nucleon is due to the elastic scattering off a quark via exchange of a virtual boson (γ, W, or Z). First, we will review the kinematics of the reaction in Section 9.3.1; then in Sections 9.4 and 9.5, we develop the dynamics to allow us to predict the form of the differential cross sections. We then compare QPM predictions with the experimental data and discuss many internal consistency checks that give us confidence in the model.

## **9.3.1 Kinematics of deep inelastic scattering**

**Fig. 9.4** Kinematics for quark–parton scattering.

The key idea of the QPM is shown in Fig. 9.4, which also defines the kinematics. We assume that the quarks each have a mass that is a fraction x of the nucleon mass; i.e. the quark mass m = xM. The energy of the struck quark after the scattering is E<sup>∗</sup> = ν + xM, where ν is the energy transferred by the virtual photon. In terms of the incoming and outgoing electron energies, ν = E − E′, and the 3-momentum transfer is **p**<sup>∗</sup> = **q**. Then, using E<sup>2</sup> − p<sup>2</sup> = m<sup>2</sup> for the struck quark after the elastic collision, we get

$$(\nu + xM)^2 - |\mathbf{q}|^2 = (xM)^2 \tag{9.25}$$

Multiplying this out gives

$$
\nu^2 + 2xM\nu + (xM)^2 - |\mathbf{q}|^2 = (xM)^2 \tag{9.26}
$$

Defining Q<sup>2</sup> = −q<sup>2</sup> = −(ν<sup>2</sup> − |**q**|<sup>2</sup>), we solve for the parton mass fraction x as

$$x = \frac{Q^2}{2M\nu} \tag{9.27}$$
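As an illustration of how x follows from lepton quantities alone, the sketch below computes Q<sup>2</sup>, ν, and x from the beam energy, the scattered energy, and the scattering angle. It assumes a fixed target at rest and a massless lepton, so that Q<sup>2</sup> = 4EE′ sin<sup>2</sup>(θ/2), a standard relation not derived in the text. The function name and the numerical inputs are hypothetical.

```python
import math

M_NUCLEON = 0.938  # GeV, approximate nucleon mass (assumed input)

def dis_variables(E, E_prime, theta_deg, M=M_NUCLEON):
    """Return (Q2, nu, x) from the incoming and scattered lepton energies and angle.

    Assumes a fixed target at rest and a massless lepton, so that
    Q^2 = 4 E E' sin^2(theta/2); x then follows from eqn 9.27.
    Energies in GeV, Q2 in GeV^2.
    """
    nu = E - E_prime
    Q2 = 4.0 * E * E_prime * math.sin(math.radians(theta_deg) / 2.0) ** 2
    x = Q2 / (2.0 * M * nu)
    return Q2, nu, x

# Hypothetical measurement: 20 GeV beam, 8 GeV scattered lepton at 10 degrees.
Q2, nu, x = dis_variables(20.0, 8.0, 10.0)
print(f"Q^2 = {Q2:.2f} GeV^2, nu = {nu:.1f} GeV, x = {x:.3f}")

# Elastic scattering off the whole nucleon corresponds to x = 1.
s2 = math.sin(math.radians(10.0) / 2.0) ** 2
E_elastic = 20.0 / (1.0 + 2.0 * 20.0 * s2 / M_NUCLEON)
print(f"x at the elastic point = {dis_variables(20.0, E_elastic, 10.0)[2]:.6f}")
```

Only the lepton arm enters the calculation; nothing about the hadronic final state is needed.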

Note that x can be determined purely from a measurement of the scattered lepton momentum. This is important from an experimental point of view, since it is generally easier to make precise measurements of the momenta of electrons and muons than those of hadrons.<sup>4</sup>

<sup>4</sup>Early experiments used 'single-arm spectrometers', which could only measure the scattered lepton momentum and not that of the hadronic jet.

The variable x can also be considered as representing the fraction of the nucleon momentum carried by a parton. In DIS, the 4-momentum transfer q is large enough for the mass of the quark to be neglected in comparison with its energy. Hence, considering the 4-momentum of the quark after the scattering and using the fact that the quark mass is negligible before the interaction,

$$\left(p\_{\text{quark}} + q\right)^2 = 0$$

that is,

$$p\_{\text{quark}}^2 + 2p\_{\text{quark}} \cdot q + q^2 = 0 \tag{9.28}$$

Letting p<sub>quark</sub> = xp, where p is the 4-momentum of the nucleon, and neglecting p<sub>quark</sub><sup>2</sup>, we can solve for x from eqn 9.28 as

$$x = \frac{Q^2}{2p \cdot q} \tag{9.29}$$

Now, p · q is a Lorentz invariant, so we can evaluate it in any frame we wish. Consider the nucleon rest frame, in which p = (M, **0**) and hence p · q = Mν. Then eqn 9.29 gives

$$x = \frac{Q^2}{2M\nu} \tag{9.30}$$

This value of x is the same as that obtained from eqn 9.27, so we can consider the variable x either as representing the mass fraction of the nucleon carried by a quark or as the momentum fraction of the nucleon carried by the quark.<sup>5</sup>

<sup>5</sup>Strictly speaking, this is only correct in the 'infinite-momentum frame', in which the nucleon has an infinite momentum along the collision axis so that we can neglect transverse components of the momentum.

Before we can start the discussion of the dynamics of the scattering process, we need to evaluate one more piece of kinematics, namely the relation between the laboratory energies and the CMS scattering angle θ<sup>∗</sup>.

The Lorentz transformation from the CMS to the laboratory system for the scattered and initial leptons (we assume β = 1) is

$$\begin{aligned} E\_{\text{lab}}' &= \gamma E^\* (1 + \cos \theta^\*) \\ E\_{\text{lab}} &= \gamma E^\* (1 + 1) \end{aligned} \tag{9.31}$$

We define the scaled fractional energy transfer for the scattered lepton to be y = ν/E, so 0 < y < 1. Then we can relate the CMS scattering angle θ<sup>∗</sup> and y as<sup>6</sup>

$$y = \frac{1}{2}(1 - \cos\theta^\*) \tag{9.32}$$

<sup>6</sup>Note that the variable y is only defined in the lab frame.

Having completed our discussion of the kinematics of lepton–nucleon scattering, we can now look at the dynamics. We will start with neutrino probes in Section 9.4.1, because the spin structure is simpler for this case than for the case of charged lepton probes, which we will consider in Section 9.5.

## **9.4 Neutrino interactions**

We are now ready to use the QPM to explain the DIS data for neutrino probes.<sup>7</sup> We will proceed in steps: first neutrino–electron elastic scattering (Section 9.4.1), then neutrino–quark scattering (Section 9.4.2), and finally neutrino–nucleon cross sections (Section 9.4.3).

<sup>7</sup>We will consider electron neutrinos in the following discussion, but in the DIS regime the masses of the leptons are negligible, so the same results hold for muon neutrinos.

As discussed in Chapter 7, weak interactions are parity-violating; the angular distribution in β decay is maximally parity-violating and the leptons are emitted polarized. The parity violation is a V−A interaction, where V and A stand for vector and axial vector couplings, respectively.<sup>8</sup> We consider for now only virtual W exchange (i.e. we neglect Z exchange) and so we have a pure V−A interaction. In the high-energy limit, this leads to a very simple spin structure for the process whereby the W bosons couple only to negative-helicity leptons and positive-helicity antileptons, where the helicity is defined by the normalized projection of the spin **s** onto the momentum **p** of the particle:

$$H = \frac{\mathbf{p} \cdot \mathbf{s}}{|\mathbf{p}| |\mathbf{s}|} \tag{9.33}$$

so that for spin-½ fermions the eigenvalues of helicity are +1 and −1. Therefore, the spin structures for the interactions ν̄<sub>e</sub>e → ν̄<sub>e</sub>e and ν<sub>e</sub>e → ν<sub>e</sub>e are as shown in Fig. 9.5.

<sup>8</sup>In principle, there are five possible couplings, each with a different experimental signature. See Further Reading to follow this up.

**Fig. 9.5** Spin structure for (a) ν̄<sub>e</sub>e scattering and (b) ν<sub>e</sub>e scattering. The left (right) diagram is before (after) the scatter.

For the interaction in Fig. 9.5(a), the overall initial state must have an angular momentum about the z axis (the initial direction of the ν̄<sub>e</sub>) given by the quantum number J<sub>z</sub> = 1. The interaction proceeds via the creation of a virtual W. The spin of a real W is measured to be 1 and we will assume that the spin of the virtual W is also 1. Hence the initial state must have J = 1, J<sub>z</sub> = 1 and by conservation of angular momentum the final state will have the same angular momentum quantum numbers. However, we know that the scattered leptons will be in eigenstates of helicity (positive for the ν̄<sub>e</sub> and negative for the e<sup>−</sup>); i.e. the ν̄<sub>e</sub> will have its spin along its direction of motion. Hence, considering a z′ axis that is rotated from the z axis to lie along the final ν̄<sub>e</sub> direction of motion, J<sub>z′</sub> = ½. Similarly, the electron has J<sub>z′</sub> = ½. Therefore, the amplitude for the reaction in Fig. 9.5(a) can be found by projecting out the spin states onto the rotated axes. This requires the rotation matrix for spin-½ particles (see Chapter 2 and Exercise 2.4):

$$d\_{m'm}^{j} = \begin{pmatrix} \cos\frac{1}{2}\theta & -\sin\frac{1}{2}\theta\\ \sin\frac{1}{2}\theta & \cos\frac{1}{2}\theta \end{pmatrix} \tag{9.34}$$

The elements of this matrix give the amplitude for a state with a given quantum number m to be found with a value m′ after a rotation by a polar angle θ.

## **9.4.1 Cross section for neutrino–electron elastic scattering**

We are now in a position to calculate the cross section for elastic ν<sub>e</sub>e and ν̄<sub>e</sub>e scattering. From the spin structure for the reaction ν̄<sub>e</sub>e → ν̄<sub>e</sub>e shown in Fig. 9.5(a), the amplitude as a function of polar angle can be calculated by projecting the spin states onto the rotated axes. This projection changes the electron (ν̄<sub>e</sub>) state from m = ½ to m′ = ½, and hence the amplitude as a function of polar angle is given by<sup>9</sup>

$$A(\theta) = \left(d\_{1/2, 1/2}^{1/2}(\theta)\right)^2 \tag{9.35}$$

and, substituting for the appropriate element of the rotation matrix from eqn 9.34, we get

$$A(\theta) = \left(\cos\frac{1}{2}\theta\right)^2 = \frac{1}{2}(1+\cos\theta)\tag{9.36}$$

<sup>9</sup>We have chosen to apply the spin-½ rotation matrices to the ν̄<sub>e</sub> and e states separately. The same result could have been obtained by considering the combined system and using spin-1 rotation matrices.

**Fig. 9.6** Definition of the angles used in the crossing symmetry argument.

<sup>10</sup>We can neglect the electron mass since p<sup>∗</sup> ≫ m<sub>e</sub>.

<sup>11</sup>The calculation of the correct numerical factors requires an application of the Feynman rules. This is described in the standard graduate-level textbooks; see Griffiths in Further Reading for an example.

The cross section is proportional to |A(θ)|<sup>2</sup>. We can use eqn 9.32 to change variables from θ to y. From the two-body phase space, the cross section should also be proportional to (p<sup>∗</sup>)<sup>2</sup>, where p<sup>∗</sup> is the CMS momentum. We define s to be the CMS energy squared, and then<sup>10</sup> (p<sup>∗</sup>)<sup>2</sup> = s/4. The amplitude of the charged-current weak interaction is proportional to the Fermi coupling constant G<sub>F</sub>. Hence, from eqn 9.36 and putting in the correct numerical factor,<sup>11</sup> we get the differential cross section for elastic ν̄<sub>e</sub>e scattering:

$$\frac{\mathrm{d}\sigma(\bar{\nu}\_e e \to \bar{\nu}\_e e)}{\mathrm{d}y} = \frac{G\_\mathrm{F}^2 s}{\pi} (1 - y)^2 \tag{9.37}$$
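The origin of the (1 − y)<sup>2</sup> factor can be made explicit with a few lines of code. This is a minimal sketch using our own function names: it builds the spin-½ rotation matrix of eqn 9.34, forms the amplitude of eqn 9.35, and compares it with 1 − y obtained from eqn 9.32.

```python
import math

def d_half(theta):
    """Spin-1/2 rotation matrix of eqn 9.34; rows and columns ordered (+1/2, -1/2)."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[c, -s], [s, c]]

def amplitude_nubar_e(theta):
    """A(theta) of eqn 9.35: one factor d_{1/2,1/2} for each outgoing lepton."""
    return d_half(theta)[0][0] ** 2

def y_from_theta(theta):
    """Fractional energy transfer y as a function of the CMS angle (eqn 9.32)."""
    return 0.5 * (1.0 - math.cos(theta))

for deg in (0, 60, 90, 120, 180):
    theta = math.radians(deg)
    A = amplitude_nubar_e(theta)
    y = y_from_theta(theta)
    # A equals 1 - y, so |A|^2 carries the (1 - y)^2 dependence of eqn 9.37.
    print(f"theta* = {deg:3d} deg: A = {A:.4f}, 1 - y = {1.0 - y:.4f}, |A|^2 = {A * A:.4f}")
```

The amplitude vanishes for backward scattering (θ<sup>∗</sup> = 180°, y = 1), as required by angular momentum conservation for an initial state with J<sub>z</sub> = 1.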

Similarly, for the elastic scattering reaction ν<sub>e</sub>e, from the spin structure shown in Fig. 9.5(b), the angular momentum of the initial state must have J<sub>z</sub> = 0. The possible values of the total angular momentum quantum number J are therefore 0 or 1.

We can calculate the amplitude for the reaction ν<sub>e</sub>e → ν<sub>e</sub>e, given the amplitude for the reaction ν̄<sub>e</sub>e → ν̄<sub>e</sub>e. From the solutions to the Dirac equation, we saw that we can formally represent the states that apparently have negative energy as positive-energy states travelling backwards in time. Hence we can use crossing symmetry to relate the amplitudes of reactions with a particle exchanged for its antiparticle. This states that for two diagrams related by crossing, the structures of the matrix elements are the same and we have only to replace the momentum of the incoming (outgoing) particles with minus that of the outgoing (incoming) antiparticles. From Fig. 9.6, we can rewrite the amplitude for ν̄<sub>e</sub>e scattering (eqn 9.36) in terms of the 3-momenta and the magnitude of the CMS momentum of the particles, p<sup>∗</sup>, as

$$A\_{\bar{\nu}}(\theta) = \frac{(p^\*)^2 + \mathbf{p}\_1 \cdot \mathbf{p}\_3}{2(p^\*)^2} \tag{9.38}$$

and, in the CMS, **p**<sub>3</sub> = −**p**<sub>4</sub>, so we can rewrite this as

$$A\_{\bar{\nu}}(\theta) = \frac{(p^\*)^2 - \mathbf{p}\_1 \cdot \mathbf{p}\_4}{2(p^\*)^2} \tag{9.39}$$

We can rewrite eqn 9.39 in terms of 4-vectors as

$$A\_{\bar{\nu}}(\theta) = \frac{p\_1 \cdot p\_4}{2(p^\*)^2} \tag{9.40}$$

and then use crossing symmetry to find the amplitude for ν<sub>e</sub>e scattering by the substitution p<sub>1</sub> ↔ −p<sub>3</sub>:

$$A\_{\nu}(\theta^{\*}) = -\frac{p\_3 \cdot p\_4}{2(p^\*)^2} \tag{9.41}$$

We use (p<sub>3</sub> + p<sub>4</sub>)<sup>2</sup> = p<sub>3</sub><sup>2</sup> + p<sub>4</sub><sup>2</sup> + 2p<sub>3</sub> · p<sub>4</sub> and we can neglect the masses. Since (p<sub>3</sub> + p<sub>4</sub>)<sup>2</sup> = s = 4(p<sup>∗</sup>)<sup>2</sup>, it follows that p<sub>3</sub> · p<sub>4</sub> = 2(p<sup>∗</sup>)<sup>2</sup>, and we can see from eqn 9.41 that the amplitude for ν<sub>e</sub>e scattering is isotropic. Hence, putting in the factors of G<sub>F</sub>, s, and overall normalization as for ν̄<sub>e</sub>e → ν̄<sub>e</sub>e, we get the differential cross section for ν<sub>e</sub>e elastic scattering as

$$\frac{\mathrm{d}\sigma(\nu\_e e \to \nu\_e e)}{\mathrm{d}y} = \frac{G\_{\mathrm{F}}^2 s}{\pi} \tag{9.42}$$

In summary, the ν̄<sub>e</sub>e<sup>−</sup> scattering cross section has a factor of (1 − y)<sup>2</sup> whereas the ν<sub>e</sub>e<sup>−</sup> cross section has a factor of 1. A similar argument can be used to obtain the cross section for ν̄<sub>e</sub>e<sup>+</sup> from that for ν̄<sub>e</sub>e<sup>−</sup>, and we find that ν̄<sub>e</sub>e<sup>+</sup> scattering has a factor of 1 whereas ν<sub>e</sub>e<sup>+</sup> scattering has a factor of (1 − y)<sup>2</sup>.

## **9.4.2 Neutrino–quark scattering**

The next step is to generalize the results for neutrino–electron scattering to the case of neutrino–quark scattering. The universality of the charged-current weak interactions means that the weak interaction has the same strength for quarks as for electrons (see Chapter 7). Unfortunately, we do not have free quarks available as targets, so we have to use the quarks that are confined in nucleons. We therefore consider the kinematics of elastic scattering of a neutrino with a quark that carries a fraction x of the nucleon momentum (see Fig. 9.7).

The Lorentz-invariant CMS energy squared of the neutrino–quark system is given by

$$\begin{split} \hat{s} &= (p^\* + xp^\*)^2 - (p^\* - xp^\*)^2 \\ &= 4x(p^\*)^2 = xs \end{split} \tag{9.43}$$
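The relation between the two invariant masses is easy to confirm with explicit 4-vectors. The sketch below assumes massless particles colliding head-on along the z axis in the neutrino–nucleon CMS, with the quark carrying a fraction x of the nucleon 4-momentum; the helper names are ours.

```python
def minkowski_sq(p):
    """Invariant square E^2 - |p|^2 of a 4-vector (E, px, py, pz)."""
    E, px, py, pz = p
    return E * E - px * px - py * py - pz * pz

def s_hat(p_star, x):
    """CMS energy squared of the neutrino-quark system.

    Neutrino: (p*, 0, 0, +p*). Quark: x times the (massless) nucleon
    4-momentum (p*, 0, 0, -p*). Setting x = 1 gives the neutrino-nucleon s.
    """
    neutrino = (p_star, 0.0, 0.0, p_star)
    quark = (x * p_star, 0.0, 0.0, -x * p_star)
    total = tuple(a + b for a, b in zip(neutrino, quark))
    return minkowski_sq(total)

p_star = 10.0                 # CMS momentum in GeV (illustrative value)
s = s_hat(p_star, 1.0)        # neutrino-nucleon s = 4 (p*)^2
for x in (0.1, 0.3, 0.7):
    print(f"x = {x}: s_hat = {s_hat(p_star, x):.3f}, x * s = {x * s:.3f}")
```

The two columns agree, reproducing the result of eqn 9.43.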

Hence we can write down the cross sections for ν<sub>e</sub> (ν̄<sub>e</sub>) elastic scattering on q (q̄) by analogy with the ν<sub>e</sub>e (ν̄<sub>e</sub>e) elastic scattering cross sections<sup>12</sup> (eqns 9.37 and 9.42) by the substitution<sup>13</sup> s → ŝ = xs:

$$\begin{aligned} \frac{\mathrm{d}\sigma(\nu\_e q \to \nu\_e q)}{\mathrm{d}y} &= \frac{G\_\mathrm{F}^2 sx}{\pi} \\\\ \frac{\mathrm{d}\sigma(\bar{\nu}\_e q \to \bar{\nu}\_e q)}{\mathrm{d}y} &= \frac{G\_\mathrm{F}^2 sx}{\pi} (1 - y)^2 \\\\ \frac{\mathrm{d}\sigma(\nu\_e \bar{q} \to \nu\_e \bar{q})}{\mathrm{d}y} &= \frac{G\_\mathrm{F}^2 sx}{\pi} (1 - y)^2 \\\\ \frac{\mathrm{d}\sigma(\bar{\nu}\_e \bar{q} \to \bar{\nu}\_e \bar{q})}{\mathrm{d}y} &= \frac{G\_\mathrm{F}^2 sx}{\pi} \end{aligned} \tag{9.44}$$

**Fig. 9.7** Neutrino–quark scattering kinematics.

<sup>12</sup>We ignore threshold effects, which can be important for the case of charm production near threshold.

<sup>13</sup>The variable y is unchanged here because it depends only on the neutrino quantities, not on those of the quark.

## **9.4.3 Neutrino–nucleon cross sections**

We can now put together all the pieces and calculate the cross section for neutrino–nucleon scattering in the DIS regime. The basic assumption of the QPM is that the cross section for scattering leptons off nucleons is given by the incoherent sum of scattering off free quarks. This assumption is only valid for large values of Q<sup>2</sup> (Q<sup>2</sup> ≫ 1 GeV<sup>2</sup>). We will discuss the justification for this assumption when we consider the theory of strong interactions in Section 9.6.3. Let q(x) be the probability distribution function of quarks with a momentum fraction x; i.e. the probability of finding a quark with momentum fraction in the range (x, x + dx) is q(x) dx. Similarly, let q̄(x) be the equivalent distribution for antiquarks. The QPM makes no prediction for q(x) or q̄(x), so we have to obtain them from fits to experimental data. Nevertheless, the QPM does make many clear predictions that can be tested experimentally. With the above assumptions, we are finally able to write down the cross sections for neutrino and antineutrino scattering off nucleons as

$$\frac{\mathrm{d}^2 \sigma(\nu\_e N)}{\mathrm{d}x \, \mathrm{d}y} = \frac{G\_\mathrm{F}^2 sx}{\pi} [q(x) + (1 - y)^2 \bar{q}(x)] \tag{9.45}$$

$$\frac{\mathrm{d}^2 \sigma(\bar{\nu}\_e N)}{\mathrm{d}x \, \mathrm{d}y} = \frac{G\_\mathrm{F}^2 sx}{\pi} [(1-y)^2 q(x) + \bar{q}(x)] \tag{9.46}$$

We can compare these predictions with the measured y distributions for neutrino and antineutrino beams [115] shown in Fig. 9.8.

**Fig. 9.8** Neutrino and antineutrino cross sections on nuclei as functions of y. From [71].

If the nucleons contained only quarks and not antiquarks, then we would expect the neutrino data to be constant in y and the antineutrino data to show a (1 − y)<sup>2</sup> dependence. The data can be fit by a mixture of constant and (1 − y)<sup>2</sup> components, which tells us that nucleons are composed of quarks and antiquarks. The proportion of quarks to antiquarks can thus be estimated from these data. The fact that the data do fit the QPM prediction for the y distribution is a very important test of the theory. It tells us that the neutrinos are interacting with free spin-½ particles (the quarks and antiquarks) according to the parity-violating V−A interaction. We can estimate the total cross sections by integrating the differential cross sections in eqns 9.45 and 9.46. The y integral is trivial to perform and the x integral gives a constant of the order of unity (depending on the unknown q(x) distribution). If we assume that the q̄ content of the nucleon is negligible, we get

$$\begin{aligned} \sigma(\nu N) &\approx G\_{\rm F}^2 s \\ \sigma(\bar{\nu} N) &\approx G\_{\rm F}^2 s/3 \end{aligned} \tag{9.47}$$
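The factor of 3 comes entirely from the y integrals of eqns 9.45 and 9.46. The sketch below performs them numerically and shows how the ratio σ(ν̄N)/σ(νN) would change if antiquarks carried some of the momentum. The parameter `qbar_over_q`, the ratio of ∫x q̄(x) dx to ∫x q(x) dx, is a hypothetical input of ours and not a measured value.

```python
def integrate(f, a, b, n=1000):
    """Midpoint-rule integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# y integrals of the two terms appearing in eqns 9.45 and 9.46
Y_FLAT = integrate(lambda y: 1.0, 0.0, 1.0)               # isotropic term
Y_HELICITY = integrate(lambda y: (1.0 - y) ** 2, 0.0, 1.0)  # (1 - y)^2 term

def sigma_ratio(qbar_over_q):
    """sigma(nubar N) / sigma(nu N) after integrating eqns 9.45 and 9.46 over y.

    qbar_over_q is the assumed ratio of the antiquark to quark momentum
    integrals; the common factor G_F^2 s / pi and the quark integral cancel.
    """
    r = qbar_over_q
    return (Y_HELICITY + r * Y_FLAT) / (Y_FLAT + r * Y_HELICITY)

print(f"integral of (1 - y)^2 over y: {Y_HELICITY:.6f}")
for r in (0.0, 0.15, 1.0):
    print(f"qbar/q = {r:4.2f}: sigma(nubar N) / sigma(nu N) = {sigma_ratio(r):.3f}")
```

With no antiquarks the ratio is 1/3, as in eqn 9.47; any antiquark admixture pushes it towards 1.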

This prediction is compared with the experimental data [115] in Fig. 9.9.<sup>14</sup> The neutrino cross sections are found to be larger than the antineutrino cross sections, as expected if the nucleons consist mainly of quarks as opposed to antiquarks, but not by as much as a factor of 3, which indicates that there are some antiquarks in protons. More significantly, the fact that the cross section scales like s is telling us that the quarks are behaving as point-like particles. If the quarks had a finite size, then the cross section would be suppressed by a form factor.

<sup>14</sup>For a fixed target nucleon, s ≈ 2m<sub>N</sub>E<sub>ν</sub>, where E<sub>ν</sub> is the neutrino beam energy.

**Fig. 9.9** Neutrino and antineutrino cross sections. From [115].

## **9.4.4 Parton distribution functions**

In the previous section, we have seen that the QPM can explain some of the key features of neutrino–nucleon interactions. However, the quark distribution functions are not predicted by the model and have to be determined from experimental data. In this section, we will explain how this is done and discuss some useful consistency checks of the theory.

First, it is convenient to rewrite the neutrino–nucleon cross sections given by eqns 9.45 and 9.46 in a form more suitable for comparison with the experimental data. We use the relationship

$$\begin{split} q(x) + (1 - y)^2 \bar{q}(x) &= [q(x) + \bar{q}(x)] \frac{1 + (1 - y)^2}{2} \\ &+ [q(x) - \bar{q}(x)] \frac{1 - (1 - y)^2}{2} \end{split} \tag{9.48}$$

and substitute into eqn 9.45 (and the analogous relationship into eqn 9.46) to obtain

$$\begin{split} \frac{\mathrm{d}^2 \sigma(^{\nu}\_{\bar{\nu}} N)}{\mathrm{d}x \, \mathrm{d}y} &= \frac{G\_{\mathrm{F}}^2 sx}{\pi} \left\{ [q(x) + \bar{q}(x)] \frac{1 + (1 - y)^2}{2} \right. \\ &\quad \left. \pm \, [q(x) - \bar{q}(x)] \frac{1 - (1 - y)^2}{2} \right\} \end{split} \tag{9.49}$$

We can compare the cross-section formula of eqn 9.49 with a general phenomenological formula that is allowed by Lorentz invariance:<sup>15</sup>

$$\begin{split} \frac{\mathrm{d}^2 \sigma(^{\nu}\_{\bar{\nu}} N)}{\mathrm{d}x \, \mathrm{d}y} &= \frac{G\_\mathrm{F}^2 sx}{2\pi} \left[ \frac{F\_2(x, q^2)}{x} \frac{1 + (1 - y)^2}{2} \right. \\ &\quad \left. \pm \, F\_3(x, q^2) \frac{1 - (1 - y)^2}{2} \right] \end{split} \tag{9.50}$$

The so-called structure functions F<sub>2</sub> and F<sub>3</sub> can be determined by fits to the experimental data. Note that in general the functions F<sub>i</sub> can depend on q<sup>2</sup> as well as x. The QPM does not predict any dependence on q<sup>2</sup> and to a first approximation this agrees with data.<sup>16</sup> Then, from a comparison of eqns 9.49 and 9.50, we can obtain the quark probability distribution functions as

$$\begin{aligned} \sum\_{i} q\_i(x) &= \frac{1}{4} \left[ \frac{F\_2(x)}{x} + F\_3(x) \right] \\ \sum\_{i} \bar{q}\_i(x) &= \frac{1}{4} \left[ \frac{F\_2(x)}{x} - F\_3(x) \right] \end{aligned} \tag{9.51}$$

where the sum runs over the different quark flavours. Or, equivalently, we can write the inverse relation to give the structure functions in terms of the quark distribution functions:

$$\begin{aligned} F\_2(x) &= 2 \sum\_i x[q\_i(x) + \bar{q}\_i(x)] \\ F\_3(x) &= 2 \sum\_i [q\_i(x) - \bar{q}\_i(x)] \end{aligned} \tag{9.52}$$

So far in the discussion, we have not defined precisely which quark flavours we are considering for the quark distribution functions. We assume that the nucleons contain only the light u, d, and s quarks and the corresponding antiquarks (i.e. we neglect the heavy-quark c, b, and t content).

<sup>15</sup>This is neglecting the 'longitudinal' structure function F<sub>L</sub>, which is expected to be a good approximation at large values of momentum transfer Q<sup>2</sup>. Note also that some older analyses of DIS data use the structure function F<sub>1</sub> rather than F<sub>L</sub>, where F<sub>L</sub> = F<sub>2</sub> − 2xF<sub>1</sub>, but this hides the evidence that F<sub>L</sub> is small.

<sup>16</sup>We will return to this later in the chapter.

The Feynman diagrams corresponding to the interactions of neutrinos and antineutrinos on these light quarks are shown in Fig. 9.10, from which we see that neutrinos interact with d and s quarks and ū antiquarks, whereas antineutrinos interact with u quarks and d̄ and s̄ antiquarks. It is conventional to define the quark distribution functions as referring to protons. (It should be noted that, at this stage, we are ignoring the differences between the weak eigenstates and the mass eigenstates for the quarks. This is explained in Chapter 7.)

We assume that the s and s̄ quark distribution functions are the same in neutrons and protons. Therefore, using eqn 9.52, we can write the structure functions for proton targets in terms of the quark distribution functions as

$$\begin{aligned} F\_2^{\nu p}(x) &= 2x[d(x) + \bar{u}(x) + s(x)] \\ F\_2^{\bar{\nu}p}(x) &= 2x[u(x) + \bar{d}(x) + \bar{s}(x)] \\ F\_3^{\nu p}(x) &= 2[d(x) + s(x) - \bar{u}(x)] \\ F\_3^{\bar{\nu}p}(x) &= 2[u(x) - \bar{s}(x) - \bar{d}(x)] \end{aligned} \tag{9.53}$$
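The relations between structure functions and quark distributions can be exercised with a small numerical round trip. In the sketch below, the distribution values at a single x are hypothetical toy numbers chosen by us; the code builds the neutrino–proton structure functions from eqn 9.53 and then recovers the summed quark and antiquark densities with eqn 9.51.

```python
def nu_p_structure_functions(x, pdf):
    """F2 and F3 for a neutrino beam on a proton (first and third lines of eqn 9.53).

    pdf maps a flavour label to the value of its distribution at this x.
    Neutrinos couple to d and s quarks and to ubar antiquarks.
    """
    F2 = 2.0 * x * (pdf["d"] + pdf["ubar"] + pdf["s"])
    F3 = 2.0 * (pdf["d"] + pdf["s"] - pdf["ubar"])
    return F2, F3

def quark_sums(x, F2, F3):
    """Summed quark and antiquark densities from the structure functions (eqn 9.51)."""
    quarks = 0.25 * (F2 / x + F3)
    antiquarks = 0.25 * (F2 / x - F3)
    return quarks, antiquarks

# Hypothetical toy values of the distributions at x = 0.2 (not fitted to data).
x = 0.2
pdf = {"u": 1.8, "d": 0.9, "s": 0.05, "ubar": 0.10, "dbar": 0.12, "sbar": 0.05}

F2, F3 = nu_p_structure_functions(x, pdf)
quarks, antiquarks = quark_sums(x, F2, F3)
print(f"F2 = {F2:.4f}, F3 = {F3:.4f}")
print(f"recovered d + s = {quarks:.4f}, recovered ubar = {antiquarks:.4f}")
```

The recovered sums equal d + s and ū, the only flavours a neutrino beam probes in a proton, which is why data from several beams and targets are needed to separate the individual distributions.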

For a neutron target, we will assume isospin (SU(2) flavour) symmetry; i.e. we assume that the d quarks (d̄ antiquarks) in a neutron have the same distribution function as the u quarks (ū antiquarks) in a proton. Then we can simply use the substitution u ↔ d and ū ↔ d̄ in eqn 9.53 to write down the neutron structure functions in terms of the quark distribution functions as

$$\begin{aligned} F\_2^{\nu n}(x) &= 2x[u(x) + \bar{d}(x) + s(x)] \\ F\_2^{\bar{\nu} n}(x) &= 2x[d(x) + \bar{u}(x) + \bar{s}(x)] \\ F\_3^{\nu n}(x) &= 2[u(x) + s(x) - \bar{d}(x)] \\ F\_3^{\bar{\nu} n}(x) &= 2[d(x) - \bar{s}(x) - \bar{u}(x)] \end{aligned} \tag{9.54}$$

For an 'isoscalar' target, i.e. one with equal numbers of neutrons and protons in an isospin I = 0 state,<sup>17</sup> we can write down the structure functions from the average of the proton and neutron structure functions (eqns 9.53 and 9.54). We also assume s(x) = s̄(x), because s quarks and s̄ antiquarks have to be created together by the strong interaction, which conserves strangeness. We have

$$\begin{aligned} F\_2^{\nu N}(x) &= x[u(x) + d(x) + \bar{u}(x) + \bar{d}(x) + 2s(x)] \\ F\_2^{\bar{\nu} N}(x) &= x[u(x) + d(x) + \bar{u}(x) + \bar{d}(x) + 2s(x)] \\ F\_3^{\nu N}(x) &= u(x) - \bar{u}(x) + d(x) - \bar{d}(x) + 2s(x) \\ F\_3^{\bar{\nu} N}(x) &= u(x) - \bar{u}(x) + d(x) - \bar{d}(x) - 2s(x) \end{aligned} \tag{9.55}$$

**Fig. 9.10** Antineutrino and neutrino scattering off different flavours of quarks.

<sup>17</sup>As is the case for many easy-to-use targets in scattering experiments, e.g. carbon.

It is convenient to divide the quark distribution functions into 'valence' and 'sea' parts:

$$u(x) = u\_{\rm v}(x) + u\_{\rm s}(x)\tag{9.56}$$

where, by definition, u<sub>s</sub>(x) = ū(x). Then, if we consider an average of neutrino and antineutrino data, we have

$$F\_3(x) = u(x) - \bar{u}(x) + d(x) - \bar{d}(x) \tag{9.57}$$

Hence, by the definition of valence and sea quarks, we expect

$$F\_3(x) = u\_\mathbf{v}(x) + d\_\mathbf{v}(x)\tag{9.58}$$

Since there are three valence quarks in a proton, we can integrate F<sub>3</sub>(x) to get the Gross–Llewellyn Smith (GLS) sum rule [81]

$$\, \, \, S\_{\rm GLS} = \int\_0^1 F\_3(x) \, \mathrm{d}x = \int\_0^1 \left[ u\_\mathbf{v}(x) + d\_\mathbf{v}(x) \right] \, \mathrm{d}x = 3 \tag{9.59}$$

This prediction of the QPM is compared with experimental data in Fig. 9.11. The measured value for the GLS sum [98] is S<sub>GLS</sub> = 2.50 ± 0.018 (statistical) ± 0.078 (systematic), which is slightly lower than the predicted value of 3 in the simple QPM. However, the difference can be understood in terms of higher-order QCD corrections.
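The counting behind the GLS sum rule can be illustrated numerically. The sketch below integrates eqn 9.58 for toy valence distributions; the shapes x<sup>1/2</sup>(1 − x)<sup>3</sup> and x<sup>1/2</sup>(1 − x)<sup>4</sup> are assumptions chosen only for illustration (they are not fitted distributions), and the helper names are hypothetical. Each shape is normalized so that the proton contains two valence u quarks and one valence d quark.

```python
import math

def beta(a, b):
    """Euler beta function B(a, b): integral of x^(a-1) (1-x)^(b-1) over [0, 1]."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def make_valence(n_quarks, a, b):
    """Toy valence density n_quarks * x^(a-1) (1-x)^(b-1) / B(a, b) (assumed shape)."""
    norm = n_quarks / beta(a, b)
    return lambda x: norm * x ** (a - 1) * (1 - x) ** (b - 1)

def integrate(f, n=20000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

u_v = make_valence(2, 1.5, 4.0)  # two valence u quarks (toy shape)
d_v = make_valence(1, 1.5, 5.0)  # one valence d quark (toy shape)

# Eqn 9.59: the integral of F3 = u_v + d_v counts the valence quarks.
s_gls = integrate(lambda x: u_v(x) + d_v(x))
print(f"S_GLS = {s_gls:.4f}")
```

Whatever shapes are chosen, the integral depends only on the normalizations, which is why the QPM prediction is exactly 3.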

There is another interesting sum-rule check we can get from the QPM, namely the Adler sum rule. We define the sum [11] as

$$S\_{\rm A} = \int\_0^1 \frac{F\_2^{\nu n} - F\_2^{\nu p}}{2x} \, \mathrm{d}x \tag{9.60}$$

and we can see from eqns 9.53 and 9.54 that

$$S\_{\rm A} = \int\_0^1 \left[ u(x) - \bar{u}(x) - d(x) + \bar{d}(x) \right] \mathrm{d}x \tag{9.61}$$

Hence, from the definition of valence quarks and eqn 9.56, we expect that

$$S\_{\rm A} = \int\_0^1 \left[ u\_{\rm v}(x) - d\_{\rm v}(x) \right] dx \tag{9.62}$$

**Fig. 9.11** Experimental test of the Gross–Llewellyn Smith sum rule. The dashed line is a fit to the experimental data [98] for F3(x) and the solid line is the integral.

and since there are two (one) valence u (d) quarks, we expect S<sub>A</sub> = 1. This prediction is in good agreement with the neutrino DIS data over a range of Q<sup>2</sup> [134], as shown in Fig. 9.12.
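A short numerical sketch shows why the sea drops out of the Adler sum. It builds the neutrino structure functions of eqns 9.53 and 9.54 from toy distributions and divides their difference by 2x, the normalization that reproduces eqn 9.61. All distribution shapes and sea normalizations below are assumptions made for illustration only.

```python
# Toy distributions (assumed shapes, not fits to data).
def u_v(x): return 40.0 * x * (1 - x) ** 3   # integrates to 2
def d_v(x): return 30.0 * x * (1 - x) ** 4   # integrates to 1
def ubar(x): return 0.3 * (1 - x) ** 7       # sea, deliberately unequal to dbar
def dbar(x): return 0.4 * (1 - x) ** 7
def s(x): return 0.1 * (1 - x) ** 7

def u(x): return u_v(x) + ubar(x)            # eqn 9.56 with u_s = ubar
def d(x): return d_v(x) + dbar(x)

def f2_nu_p(x): return 2 * x * (d(x) + ubar(x) + s(x))  # eqn 9.53
def f2_nu_n(x): return 2 * x * (u(x) + dbar(x) + s(x))  # eqn 9.54

def adler_sum(n=20000):
    """Midpoint-rule integral of (F2^{nu n} - F2^{nu p}) / (2x) over [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += (f2_nu_n(x) - f2_nu_p(x)) / (2 * x)
    return h * total

s_adler = adler_sum()
print(f"S_A = {s_adler:.4f}")
```

The result is independent of the sea because every antiquark term cancels point by point in the integrand, leaving only u<sub>v</sub>(x) − d<sub>v</sub>(x).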

We will examine the measured shape of the parton distribution function in Section 9.6.6. We can integrate F<sub>2</sub> to get the momentum fraction of the proton carried by all quarks. From eqn 9.55,

$$\begin{split} I &= \int\_0^1 F\_2^{\nu N} \, \mathrm{d}x \\ &= \int\_0^1 x[u(x) + d(x) + s(x) + \bar{u}(x) + \bar{d}(x) + \bar{s}(x)] \, \mathrm{d}x \end{split} \tag{9.63}$$

Therefore, I is the integral of the momentum-fraction-weighted quark distribution functions; i.e. it represents the total momentum fraction carried by all the quarks. Naively, we might have expected that I = 1 for the sum of all the quark flavours, but from the fits of the data [4] shown in Fig. 9.13, we can see that about half of the nucleon momentum is carried by particles that do not feel the electroweak force.<sup>18</sup> In terms of the theory of strong interactions, QCD, we conclude that this missing momentum is carried by gluons. More direct evidence for the existence of gluons will be discussed in Section 9.6.1.

**Fig. 9.12** Measured value of S<sub>A</sub> (see eqn 9.62) as a function of Q<sup>2</sup> (in GeV<sup>2</sup>) from neutrino DIS [134].

<sup>18</sup>Figure 9.13 shows the results for the quarks and gluons. We will discuss how the gluon distribution is determined in Section 9.6.7, but for now the critical point is that the quarks do not carry all the momentum of the proton.

**Fig. 9.13** Momentum fraction of the proton carried by different constituents as a function of momentum transfer. The curves are the results of fitting the parton distribution functions [4] and performing the integral in eqn 9.63.

## **9.5 Charged-lepton probes**

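Complementary data on the quark distribution functions can be obtained from DIS using charged-lepton (e/μ) beams. As for DIS with neutrino beams, we proceed in stages: first electron–muon elastic scattering, then electron–quark elastic scattering, and finally electron–nucleon deep inelastic scattering.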


## **9.5.1 Electron–muon elastic scattering**

We will calculate the eμ elastic scattering cross section by analogy with the cross section for ν<sub>e</sub>e scattering. As discussed in Section 9.4.1, we can consider ν<sub>e</sub>e scattering as due to the exchange of a virtual W boson. The strength of the weak coupling constant G<sub>F</sub> can be related to the dimensionless coupling constant of the weak interaction, g, and the mass of the W, M<sub>W</sub> (see Chapter 7):

$$G\_{\rm F} \approx \frac{g^2}{M\_W^2} \tag{9.64}$$

At low values of momentum transfer Q, the strength of the weak interaction is determined by G<sub>F</sub>, but at high values of Q, of the order of M<sub>W</sub>, we have to allow for the effect of the W propagator, which is to modify the effective strength of the weak interaction to be

$$G\_{\text{effective}} \approx \frac{g^2}{M\_W^2 + Q^2} \tag{9.65}$$
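To get a feel for the size of the propagator effect, the following sketch evaluates the ratio of eqn 9.65 to eqn 9.64, G<sub>effective</sub>/G<sub>F</sub> = M<sub>W</sub><sup>2</sup>/(M<sub>W</sub><sup>2</sup> + Q<sup>2</sup>). The value M<sub>W</sub> ≈ 80.4 GeV is an assumed input and the function name is hypothetical.

```python
M_W = 80.4  # GeV, approximate W mass (assumed input)

def propagator_suppression(q2, m_w=M_W):
    """Ratio G_effective / G_F = M_W^2 / (M_W^2 + Q^2), from eqns 9.64 and 9.65."""
    return m_w ** 2 / (m_w ** 2 + q2)

for q2 in (1.0, 100.0, M_W ** 2, 1.0e5):
    ratio = propagator_suppression(q2)
    print(f"Q^2 = {q2:10.1f} GeV^2  ->  G_eff/G_F = {ratio:.4f}")
```

At fixed-target values of Q<sup>2</sup> the ratio is very close to 1, which is why G<sub>F</sub> alone suffices there; the suppression reaches one half at Q<sup>2</sup> = M<sub>W</sub><sup>2</sup>.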

Now, in the case of eμ scattering, the interaction is due to the exchange of a photon. For a real photon, M<sub>γ</sub> = 0, so the propagator contributes a factor 1/Q<sup>2</sup>, and the strength of the electromagnetic interaction is given by the dimensionless fine-structure constant α. Therefore, by analogy with eqn 9.44 for elastic ν<sub>e</sub>e scattering, the cross section for elastic eμ scattering is given by

$$\frac{\mathrm{d}\sigma}{\mathrm{d}y} = \frac{\alpha^2 s}{Q^4} F \tag{9.66}$$

where F is a spin factor that we will now evaluate. The spin factor for ν̄<sub>e</sub>e scattering was easy to evaluate since the V−A interaction ensured that there was only one possible spin configuration. In general, if we have different spin configurations, then the recipe for calculating the cross section for unpolarized beams in an experiment where one does not measure the final-state spins is to sum the squared matrix elements over the final-state spins and average over the initial-state spins.


The possible spin configurations<sup>19</sup> are shown in Fig. 9.14.

<sup>19</sup>For a pure vector interaction as in the case of electromagnetism (or for a pure axial vector), it can be shown that at high energies there is helicity conservation, i.e. the helicity of an outgoing particle is the same as the incoming particle (see Chapter 6).

**Fig. 9.14** Spin structure for electron–muon scattering.

For the reactions in Fig. 9.14(a) and (b), the initial state has J<sub>z</sub> = 0 and hence it has the same spin factor as for ν<sub>e</sub>e scattering (see eqn 9.42), giving A(θ) constant. Similarly, the amplitude for the reaction in Fig. 9.14(c) can be evaluated in the same way as for ν̄<sub>e</sub>e scattering; the electron (muon) states have an initial value of m = −1/2 and a final value of m′ = −1/2. Therefore, the amplitude is given by the spin-1/2 rotation matrix

$$A(\theta) = \left(d\_{-1/2, -1/2}^{1/2}\right)^2 = \left(\cos\frac{1}{2}\theta\right)^2 = \frac{1}{2}(1 + \cos\theta)\tag{9.67}$$

and, using eqn 9.32 to change variables, we get

$$A(y) = 1 - y \tag{9.68}$$

Similarly, for the reaction in Fig. 9.14(d), projecting out the spin states gives

$$A(\theta) = \left(d\_{1/2, 1/2}^{1/2}\right)^2 = \left(\cos\frac{1}{2}\theta\right)^2 = \frac{1}{2}(1 + \cos\theta)\tag{9.69}$$

and, again using eqn 9.32 to change variables, we get

$$A(y) = 1 - y \tag{9.70}$$

The electromagnetic interaction is not parity-violating and therefore in summing over the final-state spins and averaging over the initial spins, we can assume the same strength of interaction for left- and right-handed helicity states. Therefore, summing the matrix elements squared for the final states and averaging for the initial state, we get the spin factor

$$F = 1 + (1 - y)^2\tag{9.71}$$

so, from eqn 9.66, the cross section for elastic eμ scattering is given by

$$\frac{\mathrm{d}\sigma}{\mathrm{d}y} = \frac{2\pi\alpha^2 s}{Q^4} \left[ 1 + (1 - y)^2 \right] \tag{9.72}$$

where the correct numerical factors have to be inserted from a full calculation.<sup>20</sup>

<sup>20</sup>See Griffiths in Further Reading.
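The spin factor of eqn 9.71 can be checked numerically from the rotation-matrix elements. The sketch below is a minimal illustration: it assumes that eqn 9.32 relates y to the CMS scattering angle through 1 − y = (1 + cos θ)/2 (consistent with eqns 9.67 and 9.68), and the function names are hypothetical. The overall normalization from spin averaging is not tracked, as in the text.

```python
import math

def d_half(theta):
    """Spin-1/2 rotation-matrix element d^{1/2}_{1/2,1/2}(theta) = cos(theta/2)."""
    return math.cos(theta / 2)

def y_from_theta(theta):
    """Assumed form of eqn 9.32: y = (1 - cos(theta)) / 2 in the CMS."""
    return (1 - math.cos(theta)) / 2

def spin_factor(theta):
    """Sum of squared amplitudes for the two classes of configuration in Fig. 9.14.

    The J_z = 0 configurations have a constant amplitude (taken as 1);
    the others have A = cos^2(theta/2), as in eqns 9.67 and 9.69.
    """
    a_flat = 1.0
    a_rot = d_half(theta) ** 2
    return a_flat ** 2 + a_rot ** 2

for theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
    y = y_from_theta(theta)
    print(f"theta = {theta:.3f}  y = {y:.3f}  F = {spin_factor(theta):.4f}")
```

Forward scattering (y = 0) gives F = 2, while backward scattering (y = 1) gives F = 1 because the rotated configurations are then forbidden by angular momentum conservation.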

## **9.5.2 Electron–quark elastic scattering**

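Now we can generalize the cross section for elastic eμ scattering to the case of electron–quark scattering by allowing for the charge q<sub>i</sub> of the quark (in units of the electron charge), which multiplies the cross section by q<sub>i</sub><sup>2</sup>, and for the fraction x of the nucleon momentum carried by the quark, which replaces s by xs.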


Therefore, from eqn 9.72, the cross section is given by

$$\frac{\mathrm{d}^2 \sigma}{\mathrm{d}x \, \mathrm{d}y} = \frac{2\pi \alpha^2 x s q\_i^2}{Q^4} \left[ 1 + (1 - y)^2 \right] \tag{9.73}$$

## **9.5.3 Electron–nucleon deep inelastic scattering**

Now we can generalize further to e–nucleon<sup>21</sup> scattering in the QPM in a similar way as we did for neutrino beams. We assume that e–nucleon DIS can be calculated by adding incoherently the cross sections for scattering off all quark flavours:<sup>22</sup>

$$\frac{\mathrm{d}^2 \sigma}{\mathrm{d}x \, \mathrm{d}y} = \frac{2\pi \alpha^2 s}{Q^4} \left[ 1 + (1 - y)^2 \right] \sum\_i q\_i^2 x f\_i(x) \tag{9.74}$$

where, as above, q<sub>i</sub> is the charge of quark flavour i, f<sub>i</sub>(x) gives the probability distribution for quark flavour i, and, as usual, x is the momentum fraction of the nucleon carried by the quark. It is convenient to rearrange this formula to give

$$\frac{\mathrm{d}^2 \sigma}{\mathrm{d}x \, \mathrm{d}y} = \frac{4\pi \alpha^2 s}{Q^4} \left[ (1 - y) + \frac{1}{2} y^2 \right] \sum\_i q\_i^2 x f\_i(x) \tag{9.75}$$

We now compare this prediction of the QPM with a general phenomenological formula

$$\frac{\mathrm{d}^2 \sigma}{\mathrm{d}x \, \mathrm{d}y} = \frac{4 \pi \alpha^2 s}{Q^4} \left[ (1 - y) F\_2(x, Q^2) + \frac{1}{2} y^2 2x F\_1(x, Q^2) \right] \tag{9.76}$$

where F<sub>1</sub> and F<sub>2</sub> (the structure functions) are unknown functions of x and Q<sup>2</sup>. Equating coefficients of (1 − y) and y<sup>2</sup> between eqns 9.75 and 9.76, we have the result [59] that

$$F\_2(x, Q^2) = 2xF\_1(x, Q^2) \tag{9.77}$$

This important prediction of the QPM, called the Callan–Gross relation, arises because the quarks have spin 1/2 and interact via a vector (parity-conserving) interaction. The experimental electron scattering data from SLAC, as summarized in [117] and shown in Fig. 9.15, are in good agreement with this prediction, which thus provides further evidence for the existence of spin-1/2 quarks.

<sup>21</sup>We consider explicitly e–nucleon scattering here, but of course the analysis is identical for μ–nucleon scattering.

<sup>22</sup>This is assuming parity-conserving photon exchange only.

**Fig. 9.15** Measured value of the ratio 2xF<sub>1</sub>/F<sub>2</sub> as a function of x. From [117].
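The algebra linking eqns 9.74 to 9.77 is easy to verify numerically. In the sketch below, the common factor α<sup>2</sup>s/Q<sup>4</sup> is dropped and the quark sum Σ q<sub>i</sub><sup>2</sup> x f<sub>i</sub>(x) is replaced by an arbitrary number; the function names are hypothetical.

```python
import math

def qpm_bracket(y, quark_sum):
    """Eqn 9.74 without alpha^2 s / Q^4: 2*pi*[1 + (1-y)^2] * sum_i q_i^2 x f_i."""
    return 2 * math.pi * (1 + (1 - y) ** 2) * quark_sum

def general_bracket(y, f2, two_x_f1):
    """Eqn 9.76 without alpha^2 s / Q^4: 4*pi*[(1-y) F2 + (y^2/2) 2xF1]."""
    return 4 * math.pi * ((1 - y) * f2 + 0.5 * y ** 2 * two_x_f1)

quark_sum = 0.37  # arbitrary illustrative value of sum_i q_i^2 x f_i(x)

for y in (0.0, 0.25, 0.5, 0.75, 1.0):
    qpm = qpm_bracket(y, quark_sum)
    # Callan-Gross: F2 = 2xF1 = quark_sum reproduces the QPM result for every y.
    spin_half = general_bracket(y, quark_sum, quark_sum)
    # For comparison, 2xF1 = 0 gives a different y dependence.
    no_f1 = general_bracket(y, quark_sum, 0.0)
    print(f"y = {y:.2f}  QPM = {qpm:.4f}  F2=2xF1: {spin_half:.4f}  2xF1=0: {no_f1:.4f}")
```

The two expressions agree for all y only when F<sub>2</sub> = 2xF<sub>1</sub>, so the measured y dependence at fixed x is what tests the relation.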

Another key prediction of the QPM is obtained from a comparison of the QPM prediction (eqn 9.75) with the phenomenological formula (eqn 9.76):

$$F\_2(x, Q^2) = \sum\_i q\_i^2 x f\_i(x) \tag{9.78}$$

This means that the QPM predicts that the structure functions depend on x and not on Q<sup>2</sup>, i.e. that they show a scaling behaviour (called Bjorken scaling). This is very directly related to the quarks being point-like particles. If the quarks had a finite size, we would expect the structure functions to be suppressed by a form factor at large Q<sup>2</sup>. The experimental data are shown in Fig. 9.16 and show approximate scaling behaviour.

There is a range of intermediate x values for which F<sub>2</sub> is remarkably constant over a very large variation in Q<sup>2</sup>. This is an extension of the earlier fixed-target data, which gave the first evidence for the QPM. At low values of x, there is a strong rise in F<sub>2</sub> with Q<sup>2</sup>, whereas at large values of x there is a decrease in F<sub>2</sub> with Q<sup>2</sup>. These scaling violations will be discussed qualitatively in the context of the theory of strong interactions, QCD, in Section 9.6.6. It is interesting to note that, because Q<sup>2</sup> = sxy, at high energy (i.e. large values of s) fixed values of Q<sup>2</sup> and y correspond to lower values of x than at lower energy. This implies that the low-x region is very important for high-energy machines such as the Tevatron and particularly the LHC.

**Fig. 9.16** F<sub>2</sub> from fixed-target and HERA scattering experiments. From [115].

## **9.5.4 Further tests of the QPM**

In this section, we will look at the QPM prediction for the structure functions and consider some further internal consistency checks of the theory. We start with the QPM prediction for the structure function (eqn 9.78) and we again assume that nucleons contain only the lightest three quark flavours:

$$F\_2^{ep}(x, Q^2) = x \left\{ \frac{4}{9} [u(x) + \bar{u}(x)] + \frac{1}{9} [d(x) + \bar{d}(x) + s(x) + \bar{s}(x)] \right\} \tag{9.79}$$

If we assume isospin symmetry as we did when considering neutrino DIS, then we expect u<sup>p</sup>(x) = d<sup>n</sup>(x) and ū<sup>p</sup>(x) = d̄<sup>n</sup>(x). Therefore, the equivalent structure function for a neutron target should be

$$F\_2^{en}(x, Q^2) = x \left\{ \frac{1}{9} [u(x) + \bar{u}(x) + s(x) + \bar{s}(x)] + \frac{4}{9} [d(x) + \bar{d}(x)] \right\} \tag{9.80}$$

Then, for an isoscalar target with equal numbers of neutrons and protons,

$$F\_2^{eN}(x, Q^2) = x \left\{ \frac{5}{18} [u(x) + \bar{u}(x) + d(x) + \bar{d}(x)] + \frac{1}{9} [s(x) + \bar{s}(x)] \right\} \tag{9.81}$$

which we can compare with the equivalent structure function for neutrino scattering (eqn 9.55), and if we ignore the strange quarks (these turn out to be negligible except at very low values of x), then we get the prediction that

$$F\_2^{eN}(x, Q^2) = \frac{5}{18} F\_2^{\nu N}(x, Q^2) \tag{9.82}$$

where the numerical factor of 5/18 is just the average value of the square of the quark charges. The experimental data (Fig. 9.17) are in good agreement with the prediction and hence confirm the charge assignments for the u and d quarks.
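The factor 5/18 follows directly from the quark charges, as the following short sketch with exact fractions shows. The dictionary and function names are hypothetical.

```python
from fractions import Fraction

# Quark charges in units of the proton charge.
CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}

def mean_squared_charge(flavours=("u", "d")):
    """Average of q_i^2 over the listed flavours."""
    return sum(CHARGE[f] ** 2 for f in flavours) / len(flavours)

ratio = mean_squared_charge()
print(ratio)  # 5/18
```

Any other charge assignment for the u and d quarks would change this ratio, which is why the comparison of charged-lepton and neutrino data tests the fractional charges.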

Another interesting consistency check is given by a comparison of the structure functions for ep and en scattering (eqns 9.79 and 9.80):

$$F\_2^{ep}(x, Q^2) - F\_2^{en}(x, Q^2) = \frac{1}{3}x[u(x) - d(x) + \bar{u}(x) - \bar{d}(x)] \tag{9.83}$$

**Fig. 9.17** Comparison of F<sub>2</sub>(x) from DIS muon and neutrino data. The muon data have been multiplied by a factor of 18/5 so that the comparison with the neutrino data provides a test [115] of the prediction of eqn 9.82.

We now split the quark distribution functions into valence and sea components:

$$F\_2^{ep}(x, Q^2) - F\_2^{en}(x, Q^2) = \frac{1}{3}x \{ u\_\mathbf{v}(x) - d\_\mathbf{v}(x) + 2[\bar{u}(x) - \bar{d}(x)] \}\tag{9.84}$$

Then, if we divide eqn 9.84 by x and integrate, we obtain the Gottfried sum rule

$$\begin{split} I\_{\rm G} &= \int\_{0}^{1} \frac{F\_{2}^{ep}(x, Q^{2}) - F\_{2}^{en}(x, Q^{2})}{x} \, \mathrm{d}x \\ &= \int\_{0}^{1} \frac{1}{3} \{u\_{\rm v}(x) - d\_{\rm v}(x) + 2[\bar{u}(x) - \bar{d}(x)]\} \mathrm{d}x \end{split} \tag{9.85}$$

In the QPM, there are two valence up quarks and one valence down quark in the proton, so if we have isospin symmetry and a flavour-symmetric sea, ū(x) = d̄(x), then we should expect I<sub>G</sub> = 1/3. The experimental data [18] are shown in Fig. 9.18 and clearly do not agree with this prediction. This can be understood in terms of the Pauli exclusion principle. Antiquarks are created by the strong interaction at the same time as quarks (the strong interaction conserves quark flavour). The Pauli exclusion principle forbids the creation of a quark in the same quantum state as one of the existing valence quarks. Since there are two valence up quarks but only one valence down quark in a proton, it is easier to create dd̄ pairs than uū pairs. Hence d̄(x) > ū(x) and the value of I<sub>G</sub> should be less than 1/3.

## **9.5.5 Electroweak unification at HERA**

The high CMS energy of the HERA electron–proton collider<sup>23</sup> allowed a vivid demonstration of electroweak unification. Figure 9.19 shows the inclusive cross sections as functions of Q<sup>2</sup> for the processes e<sup>±</sup>p → e<sup>±</sup>X, mediated by photon and Z<sup>0</sup> exchange (neutral current), and e<sup>±</sup>p → νX, mediated by W<sup>±</sup> exchange (charged current). It can be seen that for Q<sup>2</sup> > M<sub>W</sub><sup>2</sup>, the four cross sections are of comparable magnitude. At smaller Q<sup>2</sup>, the 1/Q<sup>2</sup> of the photon propagator dominates and the neutral-current cross sections increase rapidly, while the charged-current cross sections become approximately constant in Q<sup>2</sup>.

<sup>23</sup>With 27.5 GeV electrons on 920 GeV protons, giving a maximum CMS energy of 318 GeV.

**Fig. 9.18** NMC data [18] for F<sub>2</sub><sup>p</sup> − F<sub>2</sub><sup>n</sup> and the Gottfried sum rule integral (eqn 9.85).

**Fig. 9.19** Total cross sections as functions of Q<sup>2</sup> for the processes ep → eX (neutral current, NC) and ep → νX (charged current, CC) measured at the HERA collider [82]. The data from the H1 and ZEUS experiments have been combined. The fit is from the HERAPDF2.0.
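The quoted CMS energy follows from the beam energies. The sketch below assumes head-on collisions and neglects the electron and proton masses, so that s ≈ 4E<sub>e</sub>E<sub>p</sub>; the function name is hypothetical.

```python
import math

def collider_cms_energy(e_lepton, e_proton):
    """CMS energy sqrt(s) ~ sqrt(4 E_e E_p) for head-on beams, masses neglected (GeV)."""
    return math.sqrt(4.0 * e_lepton * e_proton)

sqrt_s = collider_cms_energy(27.5, 920.0)
print(f"sqrt(s) = {sqrt_s:.0f} GeV")
print(f"s = {sqrt_s ** 2:.0f} GeV^2")
```

Since Q<sup>2</sup> = sxy cannot exceed s, values of Q<sup>2</sup> up to about 10<sup>5</sup> GeV<sup>2</sup> are kinematically accessible, well above M<sub>W</sub><sup>2</sup>.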

## **9.6 QCD introduction**

We have seen that about 50% of the momentum in a proton is carried by particles that do not feel the electromagnetic or weak force, and we have tentatively ascribed this momentum to gluons. We will examine direct evidence for gluons in Section 9.6.1, including a measurement of the spin of the gluon. In Section 9.6.2, we will see how to use e<sup>+</sup>e<sup>−</sup> annihilation data to measure the number of colours carried by quarks. Using this experimental knowledge as input, we can construct a theory of strong interactions called quantum chromodynamics (QCD) based on the symmetry group SU(3). We will give an introduction to this theory in Section 9.6.3. We will then introduce the concept of 'running coupling constants' in Section 9.6.4. We will examine this quantitatively for QED and then in a qualitative way for QCD. This leads to the concept of 'asymptotic freedom', which allows us to perform perturbation-theory calculations in QCD if we have large values of Q<sup>2</sup>. We then consider some experimental techniques for measuring the strong coupling constant α<sub>s</sub>(Q<sup>2</sup>) and show that the experimental measurements are consistent with QCD and provide clear evidence of the running of α<sub>s</sub>(Q<sup>2</sup>). Finally, we will review some evidence that is sensitive to the choice of the SU(3) group for the colour symmetry.

## **9.6.1 Direct evidence for gluons**

We have seen strong but indirect evidence for the existence of gluons from the momentum sum rule (Section 9.5.4), but there is much more direct evidence from e<sup>+</sup>e<sup>−</sup> annihilation. In the naive QPM we would only expect 2-jet events, but in QCD we can have multijet events. For example, 3-jet events will be produced by the gluon bremsstrahlung process e<sup>+</sup>e<sup>−</sup> → qq̄g. Such events were observed at the PETRA collider at DESY. An example of a clear 3-jet event in the TASSO detector [136] is shown in Fig. 9.20. Many statistical tests were performed to show that these events were not just due to fluctuations of '2-jet' events.

For example, if the events were due to gluon bremsstrahlung, we would expect planar events. The measured transverse momentum out of the plane of the 3 jets was much smaller than that in the plane, confirming the existence of 3-jet events [45]. Apart from confirming the existence of gluons, measurements of the angular distribution of the third jet are sensitive to the gluon spin. In a sample of 3-jet events from the TASSO experiment, the jet energies are ordered E<sub>1</sub> > E<sub>2</sub> > E<sub>3</sub>; the event is boosted to the CMS frame of jets 2 + 3; the angle θ̃ is defined as shown in Fig. 9.21. The results from the TASSO experiment [54] are shown in Fig. 9.22, from which it can be seen that the data are consistent with a spin-1 gluon but clearly exclude a spin-0 gluon.

## **9.6.2 Number of colours**

Now that we have seen compelling evidence for the existence of point-like spin-1/2 quarks and spin-1 gluons, we are almost ready to consider the theory of strong interactions, QCD. However, we first need to review the evidence that there are three 'colour' degrees of freedom for quarks. The classic experimental test of the number of colours is the ratio

$$R = \frac{\sigma(e^{+}e^{-} \to \text{hadrons})}{\sigma(e^{+}e^{-} \to \mu^{+}\mu^{-})} \tag{9.86}$$

In the QPM, the production of hadrons in e<sup>+</sup>e<sup>−</sup> interactions proceeds via qq̄ states that fragment with unit probability to two jets. Therefore, the fundamental Feynman diagrams for the two processes in eqn 9.86 are the same and the only differences are the charges of the quarks (q<sub>i</sub>) and the number of quark colours (N<sub>c</sub>). Hence, in the QPM, we expect

$$R = \frac{\sigma(e^{+}e^{-} \to \text{hadrons})}{\sigma(e^{+}e^{-} \to \mu^{+}\mu^{-})} = N\_{\text{c}} \sum\_{i} q\_{i}^{2} \tag{9.87}$$

where the sum runs over all available quark flavours (i.e. those for which the CMS energy E > 2m<sub>i</sub>, with m<sub>i</sub> being the quark mass for flavour i). The experimental data [115] are shown in Fig. 9.23.
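The step structure predicted by eqn 9.87 can be tabulated with a few lines of code. This is a minimal sketch: the quark masses below are rough assumed values used only to place the thresholds E > 2m<sub>i</sub>, and the names are hypothetical.

```python
from fractions import Fraction

# (flavour, charge in units of e, approximate mass in GeV).
# The masses are rough assumed values, used only to set the thresholds.
QUARKS = [
    ("u", Fraction(2, 3), 0.002),
    ("d", Fraction(-1, 3), 0.005),
    ("s", Fraction(-1, 3), 0.1),
    ("c", Fraction(2, 3), 1.3),
    ("b", Fraction(-1, 3), 4.2),
    ("t", Fraction(2, 3), 173.0),
]

def r_ratio(e_cms, n_colours=3):
    """QPM prediction R = N_c * sum of q_i^2 over flavours with E > 2 m_i (eqn 9.87)."""
    return n_colours * sum(q ** 2 for _, q, m in QUARKS if e_cms > 2 * m)

for e_cms in (2.0, 5.0, 20.0):
    print(f"E = {e_cms:5.1f} GeV  R(N_c=3) = {r_ratio(e_cms)}  R(N_c=1) = {r_ratio(e_cms, 1)}")
```

With three colours the prediction steps through 2, 10/3, and 11/3 as the charm and bottom thresholds are crossed; with a single colour every plateau would be three times lower.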

**Fig. 9.21** Definition of the Ellis–Karliner angle θ̃ in 3-jet events. The jets are ordered in decreasing energy and are shown in (a) their common CMS and (b) the CMS of jets 2 and 3.

**Fig. 9.22** Measurement of the angular distribution in 3-jet events [54] (see the text for the definition of the angular variable used) and comparisons with spin-1 (vector) and spin-0 (scalar) gluons.

**Fig. 9.23** Measurement of the ratio R [115] (see text) and comparisons with the QPM and QCD calculations.

<sup>24</sup>We have already looked at this group when we were considering the static quark model for hadrons. In that case, we were using an approximate flavour symmetry between the light quarks u, d, and s. Here we will be considering an exact SU(3) colour symmetry.

The data show sharp peaks corresponding to hadron resonances. In this region, the quarks are strongly bound and the QPM is not an appropriate approximation. However, away from these resonances, R is approximately constant and shows the step increases as the CMS energy crosses the thresholds for the different quark flavours. The data are clearly inconsistent with N<sub>c</sub> = 1 and are approximately consistent with N<sub>c</sub> = 3. We will consider the small discrepancies when we consider QCD theory in the next section.

## **9.6.3 QCD**

Now that we have considered the evidence that quarks come in 3 colours and that they interact via the exchange of massless coloured spin-1 gluons, we are ready to consider the theory of strong interactions, QCD. The theory is based on the gauge group SU(3).<sup>24</sup> This SU(3) group is associated with 3 × 3 unitary matrices. A general complex 3 × 3 matrix requires 3 × 3 × 2 = 18 real parameters to specify it. As the matrices are unitary, they must satisfy U<sup>†</sup>U = I, or

$$\begin{aligned} \sum\_{j} U\_{ij}^{\dagger} U\_{jk} &= \delta\_{ik} \\ \sum\_{j} U\_{ji}^{\*} U\_{jk} &= \delta\_{ik} \end{aligned} \tag{9.88}$$

For the diagonal terms, this sum is over terms like U<sup>∗</sup><sub>ji</sub>U<sub>ji</sub>, which must be real (i.e. no imaginary parts), and therefore this yields 3 constraints for SU(3). For the off-diagonal terms, both the real and imaginary parts in eqn 9.88 must be equal to 0. There are 6 off-diagonal elements, but if the (i, j) element of the product is 0, then the (j, i) element will also be 0. Therefore, there are 3 off-diagonal elements to consider, each of which provides 2 constraints. So the total number of constraints is 3 + 2 × 3 = 9, which leaves 9 parameters. Now consider the determinant of eqn 9.88:

$$\det(U^{\dagger}U) = \det(I) = 1\tag{9.89}$$

However, U<sup>†</sup> = U<sup>T∗</sup>, which implies that det(U<sup>†</sup>) = det(U<sup>T</sup>)<sup>∗</sup> = det(U)<sup>∗</sup>, so, from eqn 9.89, det(U) det(U)<sup>∗</sup> = 1, i.e. det(U) = e<sup>iφ</sup> (where φ is a real number). The special unitary group SU(3) has the additional constraint that det(U) = +1. This additional constraint removes one more parameter, leaving 8, so there are 8 generators for SU(3) and therefore 8 gluons.

We can represent the three colour states of the quarks as |r⟩, |b⟩, and |g⟩. The use of the term 'colour' can lead to some confusion, since it is just a label for a state and has nothing to do with ordinary colour. In particular, a state |r⟩|b⟩|g⟩ would not be colourless. We expect there to be 8 generators for the group and therefore 8 colours of gluons from the arguments given above. Although the physics of colour SU(3) is unrelated to that of flavour SU(3), the group theory is of course identical. We can therefore use the results from Chapter 5 to write down the colour wavefunctions of the 8 gluons as shown in Table 9.1. Mathematically, there can be a colour-singlet state

$$\sqrt{\frac{1}{3}}\left( |r\rangle|\bar{r}\rangle + |b\rangle|\bar{b}\rangle + |g\rangle|\bar{g}\rangle \right)$$

but if such a gluon state existed in nature, then, as a colour singlet, it could mediate a long-range strong force. Clearly then, there can be no such state for a massless gluon.<sup>25</sup>

$$\begin{aligned} g\_1 &= \sqrt{\frac{1}{2}} \left( |r\rangle |\bar{b}\rangle + |b\rangle |\bar{r}\rangle \right) \\ g\_2 &= -\mathrm{i} \sqrt{\frac{1}{2}} \left( |r\rangle |\bar{b}\rangle - |b\rangle |\bar{r}\rangle \right) \\ g\_3 &= \sqrt{\frac{1}{2}} \left( |r\rangle |\bar{r}\rangle - |b\rangle |\bar{b}\rangle \right) \\ g\_4 &= \sqrt{\frac{1}{2}} \left( |b\rangle |\bar{g}\rangle + |g\rangle |\bar{b}\rangle \right) \\ g\_5 &= -\mathrm{i} \sqrt{\frac{1}{2}} \left( |b\rangle |\bar{g}\rangle - |g\rangle |\bar{b}\rangle \right) \\ g\_6 &= \sqrt{\frac{1}{2}} \left( |g\rangle |\bar{r}\rangle + |r\rangle |\bar{g}\rangle \right) \\ g\_7 &= -\mathrm{i} \sqrt{\frac{1}{2}} \left( |g\rangle |\bar{r}\rangle - |r\rangle |\bar{g}\rangle \right) \\ g\_8 &= \sqrt{\frac{1}{6}} \left( |r\rangle |\bar{r}\rangle + |b\rangle |\bar{b}\rangle - 2|g\rangle |\bar{g}\rangle \right) \end{aligned}$$

**Table 9.1** Gluon colour wavefunctions.

<sup>25</sup>A meson state qq̄ will be in such a colour-singlet state and can be exchanged between hadrons, but the force is short-range because of the mass of the meson.

<sup>26</sup>For a more pedagogical look at local gauge transformations and Lagrangians, see Sections 12.1 and 12.4, respectively.

QCD is a similar gauge theory to QED in many ways. We can determine the interactions between quarks and gluons by starting with a free field theory for the quarks and then imposing local SU(3) gauge symmetry. Consider<sup>26</sup> a local gauge transformation specified by Λ<sub>a</sub>(x) such that the transformed quark field is

$$\psi'(x) = \exp[\mathrm{i}g\_s\Lambda\_a(x)T\_a]\psi(x) \tag{9.90}$$

where g<sub>s</sub> will turn out to be the strong-interaction coupling constant, T<sub>a</sub> are the generators of SU(3), and an implicit summation over repeated indices is assumed. The infinitesimal transformations of eqn 9.90 are

$$\psi'(x) = [1 + \mathrm{i}g\_s \Lambda\_a(x) T\_a] \psi(x) \tag{9.91}$$

In order to keep gauge invariance for the Lagrangian, we are obliged to introduce 8 gauge fields (gluons) G<sup>a</sup><sub>μ</sub>(x). These transform under SU(3) as

$$G^a\_{\mu}(x) \rightarrow G^a\_{\mu}(x) - \partial\_{\mu}\Lambda\_a(x) - g\_s f\_{abc} \Lambda\_b(x) G^c\_{\mu} \tag{9.92}$$

where f<sub>abc</sub> are the SU(3) structure constants, given by the commutation relations

$$[T\_a, T\_b] = \text{i}f\_{abc}T\_c\tag{9.93}$$
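The commutation relations of eqn 9.93 can be checked explicitly for one common choice of generators. The sketch below assumes T<sub>a</sub> = λ<sub>a</sub>/2 with the Gell-Mann matrices λ<sub>a</sub> in their standard ordering, which need not match the labelling of Table 9.1, and extracts structure constants using the normalization Tr(T<sub>a</sub>T<sub>b</sub>) = δ<sub>ab</sub>/2. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def trace(a):
    return sum(a[i][i] for i in range(3))

S3 = 3 ** -0.5
# Gell-Mann matrices in the standard ordering (an assumed convention).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]

def structure_constant(a, b, c):
    """f_abc = -2i Tr([T_a, T_b] T_c), using Tr(T_a T_b) = delta_ab / 2 (0-based indices)."""
    value = -2j * trace(matmul(commutator(T[a], T[b]), T[c]))
    return value.real

print(len(T))                        # number of generators, N^2 - 1 for N = 3
print(structure_constant(0, 1, 2))   # f_123
print(structure_constant(0, 3, 6))   # f_147
print(structure_constant(3, 4, 7))   # f_458
```

The non-zero structure constants are what distinguish SU(3) from the Abelian U(1) of QED, where the single generator commutes with itself.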

We proceed in a similar way to QED by replacing the partial derivative with the covariant derivative

$$\mathbf{D}\_{\mu} = \partial\_{\mu} + \mathbf{i}g\_{s}T\_{a}G^{a}\_{\mu} \tag{9.94}$$

Again following QED, we need to add a kinetic energy term for the gluon fields:

$$L\_{\rm gluon} = F^{a}\_{\mu\nu} F^{\mu\nu}\_{a} \tag{9.95}$$

where the field tensor is given by

$$F^{a}\_{\mu\nu} = \partial\_{\mu}G^{a}\_{\nu} - \partial\_{\nu}G^{a}\_{\mu} - g\_{s}f\_{abc}G^{b}\_{\mu}G^{c}\_{\nu} \tag{9.96}$$

The Lagrangian contains interactions between the quarks and the gluons in a similar way to QED. What is new is that the non-Abelian nature of SU(3) leads to the extra term in eqn 9.96, which, when substituted into the Lagrangian of eqn 9.95, generates terms proportional to G<sup>3</sup> and G<sup>4</sup>. Therefore, the gauge invariance coupled with the fact that SU(3) is non-Abelian implies that there are 3-gluon and 4-gluon vertices. As we shall see in Section 9.6.4, this makes the theory of QCD very different from QED.

We can use the gluon wavefunctions to determine the relative amplitudes for different colour combinations of quark and antiquark scattering:

	- (2) Similarly, if we consider the case of q<sub>r</sub>q̄<sub>r</sub> → q<sub>r</sub>q̄<sub>r</sub>, the exchange gluons are g<sub>3</sub> and g<sub>8</sub> as above, but we need to remember the minus sign for antiquarks (in the same way as for negatively charged particles in QED). Therefore, the amplitude is proportional to −2/3.
	- (3) Next, we will consider the amplitude for scattering of two quarks of different colours (which for convenience we will take to be red and blue). For the process rb → rb (as in Fig. 9.25(a)), we again need gluons containing rr̄ and bb̄, i.e. g<sub>3</sub> and g<sub>8</sub>, which gives a<sub>3</sub> = −√(1/2) × √(1/2) = −1/2 and a<sub>8</sub> = √(1/6) × √(1/6) = 1/6, so the resulting amplitude is a(rb → rb) = −1/3. The diagram in Fig. 9.25(b) is for the process rb → br and involves gluons with rb̄ and br̄, i.e. g<sub>1</sub> and g<sub>2</sub>, which gives a<sub>1</sub> = √(1/2) × √(1/2) = 1/2 and a<sub>2</sub> = √(1/2) × √(1/2) = 1/2. Adding the two amplitudes gives a(rb → br) = 1.

<sup>27</sup>Another choice would of course have

**Fig. 9.24** Feynman diagram for quark–quark scattering for red quarks (t-channel diagram).

We can now easily write down the amplitude for qq̄ states, remembering to use the negative sign for the antiquarks. Thus, a(rr̄ → rr̄) = −a(rr → rr) = −2/3 and a(rb̄ → rb̄) = −a(rb → rb) = 1/3. In a similar way, we can relate rr̄ → bb̄ with rb → br because they both involve exchange of gluons g<sub>1</sub> and g<sub>2</sub>. This gives a(rr̄ → bb̄) = −a(rb → br) = −1.

The results for the different combinations of colours are summarized in Table 9.2. Now that we have all the pieces, we can easily evaluate the colour factors for qq¯ in a colour-singlet configuration. This will turn out to be very interesting, since it gives some insights into the origin of quark confinement. For the colour singlet, we need

$$|q\bar{q}\rangle = \sqrt{\frac{1}{3}}(|r\bar{r}\rangle + |b\bar{b}\rangle + |g\bar{g}\rangle)\tag{9.97}$$

We have to allow for the wavefunction normalization in eqn 9.97 and for the scattering amplitudes given in Table 9.2. Thus, rr¯ → rr¯ gives a factor of <sup>1</sup> 3 <sup>1</sup> <sup>3</sup> (−<sup>2</sup> <sup>3</sup> ). Allowing for the three colours gives a factor of 3, which gives −<sup>2</sup> <sup>3</sup> . rr¯ <sup>→</sup> <sup>b</sup>¯<sup>b</sup> gives <sup>1</sup> 3 <sup>1</sup> <sup>3</sup> (−1) and we have to allow for an equivalent term rr¯ → gg¯. Again we allow for the three colours, so the result is 3(−<sup>2</sup> <sup>3</sup> ) = −2. Adding, we obtain the overall result for the colour factor for qq¯ in a colour-singlet state, which is<sup>28</sup> −<sup>8</sup> <sup>3</sup> . We can also calculate colour factors for gluon coupling g → gg. In the conventional normalization, these colour factors (called 'Casimir factors') are given by

**Fig. 9.25** Feynman diagrams for quark–quark scattering for red and blue quarks (t-channel diagrams).

<sup>28</sup>Different conventions for the definition of the strong-interaction coupling constant can give results for these factors differing by a factor of 2, but the results for any cross section are the same.


**Table 9.2** Colour factors for quark and antiquark scattering by gluon exchange.

<sup>29</sup>When we calculated the Rutherford scattering cross section in Section 9.1, we used the Fourier transform to calculate the amplitude as a function of momentum transfer from the known Coulomb potential. Here we are performing the inverse Fourier transform to determine the potential, starting from the amplitude as a function of momentum transfer.

**Fig. 9.26** Lowest-order Feynman diagram for e<sup>−</sup>e<sup>−</sup> → e<sup>−</sup>e<sup>−</sup> scattering (a) and an O(α) correction with an e<sup>+</sup>e<sup>−</sup> loop (b).

<sup>30</sup>Our explanation is based on that given in Burcham and Jobes (see Further Reading).

C<sub>F</sub> = 4/3 for q → qg and C<sub>A</sub> = 3 for g → gg coupling (see Cooper-Sarkar and Devenish in Further Reading).
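The arithmetic behind the colour-singlet factor for qq̄ quoted earlier in this section can be retraced with exact fractions. The following is a minimal sketch, assuming the two amplitudes quoted in the text from Table 9.2 (−2/3 for same-colour pairs and −1 for a change of colour pair); the variable names are illustrative only.

```python
from fractions import Fraction

# Colour factors quoted in the text (from Table 9.2) for q-qbar scattering.
SAME_PAIR = Fraction(-2, 3)   # r rbar -> r rbar (likewise for b and g)
OTHER_PAIR = Fraction(-1)     # r rbar -> b bbar (likewise r rbar -> g gbar)

# Each singlet component carries amplitude sqrt(1/3), so every
# initial-final combination is weighted by sqrt(1/3) * sqrt(1/3) = 1/3.
weight = Fraction(1, 3)
n_colours = 3

# Three initial colours, each going to the same colour pair.
diagonal = n_colours * weight * SAME_PAIR
# Three initial colours, each going to the two other colour pairs.
off_diagonal = n_colours * 2 * weight * OTHER_PAIR

singlet_factor = diagonal + off_diagonal
print(diagonal, off_diagonal, singlet_factor)
```

The sum reproduces the factor −8/3 that enters eqn 9.98.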

Note that we have focused on the contribution to the vertex factor from the gluon colour, but the overall amplitudes will also pick up a factor of the strong-interaction coupling constant and a propagator term. Since the gluons are massless like the photon, the propagator term is 1/q<sup>2</sup>, where q is the 4-momentum transfer. In summary, the overall amplitude for the interaction of the qq¯ colour-singlet state is given by

$$a(q^2) \sim -\frac{8}{3} \frac{g\_s}{q^2} \tag{9.98}$$

In the non-relativistic limit, we can Fourier transform the amplitude given in eqn 9.98 from momentum space to position space.<sup>29</sup> This gives a potential that is negative and scales with distance as 1/r. It turns out that this is the only combination of two quarks/antiquarks that gives a negative potential. This gives some explanation of why we find bound qq̄ states but not states with net colour like qq. This is suggestive of colour confinement, which states that the only stable hadrons are colour singlets. However, as we will see in the next section, the strong-interaction coupling 'constant' becomes large at low values of q<sup>2</sup>, which means that the perturbation theory we have used will no longer be valid, so this result should be considered as a qualitative indication of colour confinement, rather than a proof.

## **9.6.4 Running coupling constants**

In QCD, the effect of the running coupling constant is very important. Even in QED, the fine-structure 'constant' is not constant but varies with the scale Q<sup>2</sup> of the reaction being studied. This is due to the effects of shielding by virtual e<sup>+</sup>e<sup>−</sup> pairs, which reduces the effective strength of the interaction between two charges as the distance increases or, equivalently, as the scale Q<sup>2</sup> decreases. A first naive attempt to calculate the effects of these higher-order corrections results in meaningless infinities. We will see how renormalization theory allows us to overcome this problem.<sup>30</sup> Consider the case of ee → ee. The lowest-order Feynman diagram is shown in Fig. 9.26(a). There is an O(α) correction from the e<sup>+</sup>e<sup>−</sup> loop diagram shown in Fig. 9.26(b). Evaluation of the loop requires an integration of the 4-momentum running round the loop. It can be shown that the effect of this integral modifies the amplitude by a factor 1 − I(q<sup>2</sup>), where q is the 4-momentum of the photon and

$$I(q^2) = \frac{\alpha}{3\pi} \int\_{m^2}^{\infty} \frac{\mathrm{d}p^2}{p^2} - 2\frac{\alpha}{\pi} \int\_0^1 \mathrm{d}x \, x(1 - x) \ln\left[1 - \frac{q^2 x (1 - x)}{m^2}\right] \tag{9.99}$$

The result is formally logarithmically divergent and for now we introduce an arbitrary upper limit Λ, but we will see that our final result is independent of the value of Λ. For the case −q<sup>2</sup> ≫ m<sup>2</sup>, we can evaluate the integral (see Exercise 9.3) as

$$I(q^2) = \frac{\alpha}{3\pi} \ln\left(\frac{\Lambda^2}{-q^2}\right) \tag{9.100}$$
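The large −q<sup>2</sup> behaviour of the Feynman-parameter integral can be checked numerically. The sketch below assumes the standard vacuum-polarization weight x(1 − x) in the integrand and a simple midpoint rule; the function name and the chosen value of −q<sup>2</sup>/m<sup>2</sup> are illustrative. It compares the integral with the leading logarithm, ln(−q<sup>2</sup>/m<sup>2</sup>)/6, which is what combines with the cut-off term to give eqn 9.100.

```python
import math

def feynman_integral(ratio, n=100_000):
    """Midpoint-rule estimate of int_0^1 dx x(1-x) ln[1 + ratio*x(1-x)],
    where ratio = -q^2/m^2 > 0 (spacelike momentum transfer)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        w = x * (1.0 - x)
        total += w * math.log1p(ratio * w)
    return total * h

ratio = 1.0e8                  # illustrative value of -q^2/m^2
log_ratio = math.log(ratio)

numeric = feynman_integral(ratio)
# Leading logarithm: times 2*alpha/pi this is (alpha/3pi) ln(-q^2/m^2).
leading_log = log_ratio / 6.0
# Including the constant from int_0^1 x(1-x) ln[x(1-x)] dx = -5/18.
with_constant = log_ratio / 6.0 - 5.0 / 18.0

print(numeric, leading_log, with_constant)
```

The residual difference between the integral and its leading logarithm is a constant that does not grow with −q<sup>2</sup>, so it becomes relatively unimportant in the limit −q<sup>2</sup> ≫ m<sup>2</sup>.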

There are also higher-order diagrams to consider, with two, three, four, etc. e<sup>+</sup>e<sup>−</sup> loops. The effect on the matrix element is to introduce a multiplicative correction factor

$$F = 1 - I(q^2) + [I(q^2)]^2 - [I(q^2)]^3 + \dots \tag{9.101}$$

Summing the infinite geometric series in eqn 9.101 and substituting eqn 9.100 for I(q<sup>2</sup>), we get a factor

$$F = \frac{1}{1 + (\alpha/3\pi)\ln(\Lambda^2/Q^2)}\tag{9.102}$$

where Q<sup>2</sup> = −q<sup>2</sup>. We can interpret this result by regarding α as the bare coupling constant (which we call α<sub>0</sub>), to which the measured coupling constant is related via

$$\alpha = \frac{\alpha\_0}{1 + (\alpha/3\pi)\ln(\Lambda^2/Q^2)}\tag{9.103}$$

This result shows that the effective coupling constant depends on the scale Q<sup>2</sup> as

$$\alpha(Q^2) = \frac{\alpha\_0}{1 + [\alpha(Q^2)/3\pi] \ln(\Lambda^2/Q^2)}\tag{9.104}$$

We can of course write down an equivalent formula at a different scale μ. We can then combine these two expressions (see Exercise 9.4) to find a relationship between the coupling constants at the two different scales:<sup>31</sup>

$$\alpha(Q^2) = \frac{\alpha(\mu^2)}{1 - [\alpha(\mu^2)/3\pi] \ln(Q^2/\mu^2)}\tag{9.105}$$

This result is remarkable because the dependence on the arbitrary cut-off parameter Λ has disappeared. This means that QED is a fully consistent theory up to any arbitrary energy scale.<sup>32</sup> The result of renormalization predicts that the value of α(Q<sup>2</sup>) increases slowly with Q<sup>2</sup>. This effect amounts to about 7% when comparing Q = M<sub>Z</sub> with Q = m<sub>e</sub>, and this prediction has been verified. We can understand this result in a qualitative way by noting that the charge of one electron seen by the other will be decreased as a result of screening by the e<sup>+</sup>e<sup>−</sup> pairs. At higher values of Q<sup>2</sup>, the photon has a shorter wavelength and so penetrates more of the screening charges and 'sees' a larger effective charge.
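Eqn 9.105 is simple enough to evaluate directly. The sketch below is illustrative only: it keeps just the single e<sup>+</sup>e<sup>−</sup> loop of eqn 9.105, takes μ = m<sub>e</sub> as the reference scale, and uses rounded input values that are assumptions of this example rather than numbers from the text.

```python
import math

ALPHA_0 = 1.0 / 137.036   # fine-structure constant at Q ~ m_e (assumed input)
M_E = 0.511               # electron mass in MeV (assumed input)
M_Z = 91_188.0            # Z boson mass in MeV (assumed input)

def alpha_running(q, mu=M_E, alpha_mu=ALPHA_0):
    """Eqn 9.105 with a single e+e- loop; q and mu in MeV."""
    log_term = math.log(q * q / (mu * mu))
    return alpha_mu / (1.0 - alpha_mu / (3.0 * math.pi) * log_term)

# Fractional change of alpha between m_e and M_Z from the electron loop alone.
shift = alpha_running(M_Z) / ALPHA_0 - 1.0

# Scale at which the denominator of eqn 9.105 vanishes, as log10(Q / MeV):
# ln(Q^2/m_e^2) = 3*pi/alpha  =>  ln(Q) = 3*pi/(2*alpha) + ln(m_e).
log10_pole = (3.0 * math.pi / (2.0 * ALPHA_0) + math.log(M_E)) / math.log(10.0)

print(f"electron loop only: alpha(M_Z)/alpha(m_e) - 1 = {shift:.2%}")
print(f"denominator vanishes near Q ~ 10^{log10_pole:.0f} MeV")
```

With these inputs the electron loop alone accounts for a shift of roughly 2%, so most of the observed change comes from loops of the heavier charged leptons and the quarks, which this sketch deliberately omits. The exponent of the scale at which the denominator vanishes comes out close to 280.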

<sup>31</sup>Here we have only included the effects of e<sup>+</sup>e<sup>−</sup> loops, but if Q<sup>2</sup> is sufficiently large, we need to consider the effects of other leptons and quarks. There is an additional subtlety in that we have ignored other Feynman diagrams, but it turns out that these cancel exactly.

<sup>32</sup>From eqn 9.105, the value of the coupling constant will become infinite at Q ∼ 10<sup>280</sup> MeV, but this would only mean that perturbation theory breaks down, not that the theory is fundamentally wrong.

In the case of QCD, we can have analogous shielding effects from the quarks, but there is also an anti-shielding effect from the gluons, since they have a colour charge (unlike photons, which are neutral). The net result is that the running coupling constant for QCD is given by

$$\alpha\_{\rm s}(Q^2) = \frac{\alpha\_{\rm s}(\mu^2)}{1 + [\alpha\_{\rm s}(\mu^2)/12\pi](33 - 2n\_{\rm f})\ln(Q^2/\mu^2)}\tag{9.106}$$

where n<sub>f</sub> is the number of active flavours, which depends on the scale Q compared with the mass of the different flavours of quarks. At a particular value of Q<sup>2</sup> = Λ<sup>2</sup><sub>QCD</sub>, the denominator in eqn 9.106 will become equal to 0. This happens when

$$\ln(\Lambda\_{\rm QCD}^2/\mu^2) = -\frac{12\pi}{(33 - 2n\_{\rm f})\alpha\_{\rm s}(\mu^2)}\tag{9.107}$$

We can then invert eqn 9.107 to obtain an expression for the strong-interaction running coupling constant in terms of one unknown parameter Λ<sub>QCD</sub>:

$$\alpha\_s(Q^2) = \frac{12\pi}{(33 - 2n\_\mathrm{f})\ln(Q^2/\Lambda\_{\mathrm{QCD}}^2)}\tag{9.108}$$

For n<sub>f</sub> < 17, we can see from eqn 9.108 that the value of α<sub>s</sub>(Q<sup>2</sup>) decreases with increasing Q<sup>2</sup>, which already gives an explanation for the success of the naive QPM: at large Q<sup>2</sup>, the value of α<sub>s</sub> becomes small enough for perturbation theory to be valid and the QPM can be seen to correspond to the processes that are of lowest order in α<sub>s</sub>. At small Q<sup>2</sup>, the value of α<sub>s</sub> will approach unity, perturbation theory will break down, and the parton model will not even be a useful approximation.
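The equivalence of eqns 9.106 and 9.108 can be illustrated numerically. This is a minimal sketch, assuming a reference value α<sub>s</sub>(M<sub>Z</sub>) = 0.118 and a fixed n<sub>f</sub> = 5 at all scales; both are simplifying assumptions of this example, and the function names are illustrative.

```python
import math

def alpha_s_from_reference(q, alpha_ref, mu, n_f):
    """Eqn 9.106: evolve alpha_s from scale mu to scale q (same units)."""
    b = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return alpha_ref / (1.0 + alpha_ref * b * math.log(q * q / (mu * mu)))

def lambda_qcd(alpha_ref, mu, n_f):
    """Eqn 9.107 solved for Lambda_QCD (same units as mu)."""
    b = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return mu * math.exp(-1.0 / (2.0 * b * alpha_ref))

def alpha_s_from_lambda(q, lam, n_f):
    """Eqn 9.108: alpha_s in terms of the single parameter Lambda_QCD."""
    return 12.0 * math.pi / ((33.0 - 2.0 * n_f) * math.log(q * q / (lam * lam)))

M_Z = 91.19          # GeV (assumed input)
ALPHA_S_MZ = 0.118   # assumed reference value of alpha_s at M_Z
N_F = 5              # held fixed here; in reality it changes at quark thresholds

lam = lambda_qcd(ALPHA_S_MZ, M_Z, N_F)
for q in (10.0, M_Z, 1000.0):
    a_ref = alpha_s_from_reference(q, ALPHA_S_MZ, M_Z, N_F)
    a_lam = alpha_s_from_lambda(q, lam, N_F)
    print(f"Q = {q:8.2f} GeV: eqn 9.106 -> {a_ref:.4f}, eqn 9.108 -> {a_lam:.4f}")
```

The two forms agree at every scale, and the printed values fall as Q increases, which is the behaviour described above. Since this is a lowest-order formula, the extracted Λ<sub>QCD</sub> should not be compared directly with values from higher-order analyses.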

The QCD-improved parton model can be used to calculate the lowest-order QCD corrections to the naive QPM, so, if sufficiently precise experiments can be performed, the theory can be tested.

The value of α<sub>s</sub>(Q<sup>2</sup>) can be determined<sup>33</sup> from many different processes, including the following:

<sup>33</sup>At lowest order in perturbation theory, it is convenient to determine a value of Λ<sub>QCD</sub>; however, when higher-order calculations are considered, it is better to measure a value of α<sub>s</sub>(Q<sup>2</sup>) at a particular scale. Conventionally, the scale used is M<sub>Z</sub>.

**Fig. 9.27** One Feynman diagram for the process e<sup>+</sup>e<sup>−</sup> → qq̄g.


A summary plot [91] of these different determinations of α<sub>s</sub> over a large range of Q<sup>2</sup> is shown in Fig. 9.28. The predicted decrease in α<sub>s</sub> with increasing Q<sup>2</sup> can clearly be seen. The good agreement between the different measurements and the QCD predictions shows that the theory has now been tested to a precision of better than 1%.

## **9.6.5 Experimental tests of the gauge structure of QCD**

We have seen that there are simple experimental results that demonstrate that quarks come in 3 colours, but there are no equivalent simple demonstrations that gluons come in 8 colours. The question we need to ask is: what is the evidence to justify the choice of the SU(3) gauge group? To some extent, the question has been addressed implicitly by the measurements of α<sub>s</sub>(Q<sup>2</sup>) as a function of Q<sup>2</sup>, which were consistent with QCD (see Section 9.6.4). However, it is still interesting to see if we can do a more direct test of the choice of the gauge group as SU(3). This can be done in e<sup>+</sup>e<sup>−</sup> annihilation to 3 or 4 jets. These processes involve the vertices qqg, ggg, and gggg, and they are therefore sensitive to the colour factors for qqg and ggg, which in SU(3) are related to the colour factors C<sub>F</sub> = 4/3 and C<sub>A</sub> = 3 (see Section 9.6.3). The following are some of the variables used in this analysis:

(1) Charged-particle multiplicity in gluon jets compared with quark jets. As the amplitude for gluon splitting is greater than that for a quark to emit a gluon, the multiplicity should be higher in gluon jets than in quark jets. Quark jets from b or c quarks can be tagged by the long lifetimes of the b and c quarks, and the gluon jet will tend to be the lowest-energy jet in 3-jet events because of the bremsstrahlung process.

**Fig. 9.28** Measurement of α<sub>s</sub> with different processes as a function of the scale Q<sup>2</sup>. From [91].


$$T = \max\_{\hat{\mathbf{n}}}\left(\frac{\sum\_i |\mathbf{p}\_i \cdot \hat{\mathbf{n}}|}{\sum\_i |\mathbf{p}\_i|}\right) \tag{9.109}$$

where the sum is over all the calorimeter cells or charged-particle tracks, with momenta **p**<sub>i</sub>. The unit vector **n̂** is varied to maximize the value of T. An ideal 2-jet event has T = 1 and multijet events will have lower values. The thrust distribution therefore depends on the gluon radiation and is thus sensitive to the colour factors for the different vertices.
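The maximization over the axis direction can be illustrated with a brute-force scan. The sketch below assumes the usual convention in which the numerator sums |**p**<sub>i</sub> · **n̂**|, and it scans trial axes on a coarse angular grid; this is an illustration of the definition, not the algorithm used in the experimental analyses, and the function name and toy events are hypothetical.

```python
import math

def thrust(momenta, n_theta=90, n_phi=180):
    """Approximate thrust by scanning trial axes over a hemisphere grid.

    momenta: iterable of (px, py, pz) tuples in any common unit.
    """
    total = sum(math.sqrt(px * px + py * py + pz * pz) for px, py, pz in momenta)
    best = 0.0
    for i in range(n_theta + 1):
        # A hemisphere suffices because |p.n| is unchanged under n -> -n.
        theta = 0.5 * math.pi * i / n_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = 2.0 * math.pi * j / n_phi
            nx, ny, nz = sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t
            projected = sum(abs(px * nx + py * ny + pz * nz) for px, py, pz in momenta)
            best = max(best, projected / total)
    return best

# Toy events: an ideal back-to-back 2-jet event and a symmetric planar 3-jet event.
two_jet = [(0.0, 0.0, 10.0), (0.0, 0.0, -10.0)]
half_root3 = math.sqrt(3.0) / 2.0
three_jet = [(1.0, 0.0, 0.0), (-0.5, half_root3, 0.0), (-0.5, -half_root3, 0.0)]

print(thrust(two_jet), thrust(three_jet))
```

The back-to-back configuration gives T = 1, while the symmetric three-jet configuration gives T = 2/3, showing how additional hard gluon radiation lowers the thrust.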

The measurements were performed by the LEP experiments, with data being taken at (or close to) the Z peak, which provided the highest statistics. The clean and well-defined environment in e<sup>+</sup>e<sup>−</sup> annihilation and the high energy made it possible to identify clean 3- and 4-jet events and hence to have small systematic errors. The results of a global fit to these data [91] are shown in Fig. 9.29. The data are consistent with the choice of SU(3) for the gauge symmetry and clearly exclude other choices.

**Fig. 9.29** Results of a global fit to e<sup>+</sup>e<sup>−</sup> data [115] for the colour factors C<sub>F</sub> and C<sub>A</sub>. The contours from fits to individual analyses are shown and the shaded area is the result for the global fit to all variables. The star represents the QCD prediction.

## **9.6.6 Experimental fits to the quark distribution functions**

From all the DIS data with electron, muon, and neutrino beams, we can perform a fit to determine the quark and antiquark distribution functions. Note that although the most precise data come mainly from electron and muon scattering, the neutrino data are essential to separate antiquarks from quarks. The results of one of these global fits are shown in Fig. 9.30.

We can see that the valence quarks dominate at large x but that the sea quarks are important at very low x. The gluon distribution is also very important at low x, as discussed in the next section.

## **9.6.7 The gluon distribution function**

We have already seen (see Section 9.4.4) that the gluons carry about 50% of the momentum of a nucleon, but the question is how to determine the shape of the gluon distribution function g(x). This cannot be determined as directly as for the quarks, because the gluons carry no electric or weak charge and therefore do not interact directly with photons or W bosons. We can, however, determine the shape of g(x) from the scaling violations, the slow variation of the quark distribution functions with Q<sup>2</sup> (see Fig. 9.16). These scaling violations can be explained in QCD by higher-order corrections to the simple QPM. A quark carries strong charge, so it can emit a virtual gluon, and a gluon also carries strong charge (note the important difference with electromagnetism, where the carrier of the force, the photon, is electrically neutral) and can therefore turn into a qq̄ pair. Hence we have corrections to the QPM, as illustrated in the Feynman diagrams in Fig. 9.31.

**Fig. 9.30** QCD global fits to distribution functions [115] for two different scales: Q<sup>2</sup> = 10 GeV<sup>2</sup> (a) and Q<sup>2</sup> = 10<sup>4</sup> GeV<sup>2</sup> (b).

**Fig. 9.31** QCD corrections to the QPM.

**Fig. 9.32** QPM for hadron–hadron collisions. The generic labels a, b, c, and d refer to the type of particles participating in the reaction.

From the process in Fig. 9.31(a), we expect the quarks to move down in momentum, and hence the quark distribution functions will be enhanced at low x and depleted at high x. From the process in Fig. 9.31(b), we expect an enhancement of quarks and antiquarks at low x. How much of this sea of virtual quarks and antiquarks we resolve depends on the wavelength of the probe we use. At longer wavelengths (lower momentum transfer and hence smaller values of Q<sup>2</sup>), we do not resolve the quark–antiquark pairs and so they do not give a net contribution. Conversely, at higher values of Q<sup>2</sup>, we have sufficient resolution to resolve them. Hence we expect that as Q<sup>2</sup> increases, F<sub>2</sub>(x, Q<sup>2</sup>) should increase at low x and decrease at high x. Qualitatively, this is just the behaviour seen in the data (see Fig. 9.16). From a quantitative QCD analysis of these data, we can in fact determine g(x), and the result of such a determination [115] is shown in Fig. 9.30. These scaling violations are proportional to α<sub>s</sub> in lowest-order perturbation theory, and therefore the QCD fits can also be used to provide another determination of α<sub>s</sub>.

## **9.7 Hadron–hadron collisions**

The naive QPM and the QCD-improved QPM can be easily extended from lepton–hadron collisions to hadron–hadron collisions. Note that these calculations only apply to 'hard' processes involving large transverse momentum so that α<sub>s</sub> is small enough for perturbation theory to be valid. At low values of Q<sup>2</sup>, the process is too complicated for any QCD predictions to be made and this is a regime in which we are obliged to use simple phenomenological models. For hard processes, the QPM picture for the reaction is shown in Fig. 9.32.

At the parton level, the collision is between a parton with momentum fraction x<sub>a</sub> in one hadron and another parton with momentum fraction x<sub>b</sub> in the other hadron. The probability of finding a parton at a given momentum fraction x is given by the quark and gluon distribution functions (see Section 9.6.6). The cross section at the parton level can be calculated from perturbation theory (using QED, electroweak theory, or QCD as appropriate). This picture can then be converted into a quantitative prediction in the form of a convolution integral:

$$\sigma(pp \to cd) = \sum\_{a,b} \int\_0^1 \int\_0^1 \mathrm{d}x\_a \, \mathrm{d}x\_b \, f\_a(x\_a, Q^2) f\_b(x\_b, Q^2) \hat{\sigma}(ab \to cd) \tag{9.110}$$

where the sum runs over all the parton types, the f(x, Q<sup>2</sup>) are the parton distribution functions, σ̂ refers to the parton–parton collision, which can be calculated, and the integrals are over the parton momentum fractions in the two protons. The formula has been written for the case of pp collisions, but it can obviously be generalized to other types of hadron–hadron collisions. The momenta of the partons in terms of the CMS energy of the pp collision, √s, are given by p<sub>a</sub> = x<sub>a</sub>√s/2 and p<sub>b</sub> = x<sub>b</sub>√s/2. The square of the CMS energy in the parton–parton collision is therefore given by

$$\hat{s} = E\_{\text{total}}^2 - p\_{\text{total}}^2 = (x\_a + x\_b)^2 s/4 - (x\_a - x\_b)^2 s/4 = x\_a x\_b s \tag{9.111}$$
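Eqn 9.111 can be checked by building the parton four-momenta explicitly. The sketch below treats the partons as massless and collinear with the beams, as in the text; the collision energy and momentum fractions are illustrative values chosen for this example.

```python
import math

def parton_cms_energy_squared(x_a, x_b, sqrt_s):
    """Return E_total^2 - p_total^2 for two massless, collinear partons."""
    # Parton a travels along +z, parton b along -z.
    e_a, pz_a = x_a * sqrt_s / 2.0, +x_a * sqrt_s / 2.0
    e_b, pz_b = x_b * sqrt_s / 2.0, -x_b * sqrt_s / 2.0
    e_total = e_a + e_b
    pz_total = pz_a + pz_b
    return e_total**2 - pz_total**2

sqrt_s = 13_000.0        # GeV, an illustrative pp collision energy
x_a, x_b = 0.02, 0.005   # illustrative momentum fractions

s_hat = parton_cms_energy_squared(x_a, x_b, sqrt_s)
print(math.sqrt(s_hat), math.sqrt(x_a * x_b) * sqrt_s)
```

Both printed numbers are the parton–parton CMS energy, which is only a small fraction of √s when the momentum fractions are small.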

## **9.7.1 Drell–Yan**

An application of this formalism is to the Drell–Yan process (lepton-pair production in hadron–hadron collisions). The parton-level process is qq̄ → l<sup>+</sup>l<sup>−</sup> and the cross section is given by analogy with e<sup>+</sup>e<sup>−</sup> → l<sup>+</sup>l<sup>−</sup> as

$$
\hat{\sigma}(q\_i \bar{q}\_i \to l^+ l^-) = \frac{q\_i^2}{3} \frac{4\pi\alpha^2}{3\hat{s}} \tag{9.112}
$$

where q<sub>i</sub> is the charge of the quark in units of e and the extra factor of 3 compared with the equivalent equation for e<sup>+</sup>e<sup>−</sup> annihilation comes from the fact that only qq̄ pairs of the same colour can annihilate to give a virtual photon. By differentiating eqn 9.110 and substituting for σ̂ from eqn 9.112, we obtain

$$\frac{\mathrm{d}^2 \sigma (hh \to l^+ l^-)}{\mathrm{d}x\_a \, \mathrm{d}x\_b} = \left[ f\_1(x\_a) f\_2(x\_b) + f\_1(x\_b) f\_2(x\_a) \right] \frac{q\_i^2}{3} \frac{4 \pi \alpha^2}{3 \hat{s}} \tag{9.113}$$

It is convenient to change variables to

$$y = \frac{1}{2} \ln \frac{E + p\_z}{E - p\_z}, \qquad \tau = \frac{\hat{s}}{s}$$

where y is the rapidity of the lepton pair. The Jacobian of this transformation is unity, so that dy dτ = dx<sub>a</sub> dx<sub>b</sub>, and using ŝ = τs we obtain

$$\frac{\mathrm{d}^2 \sigma (hh \to l^+ l^-)}{\mathrm{d}y \,\mathrm{d}\tau} = [f\_1(x\_a) f\_2(x\_b) + f\_1(x\_b) f\_2(x\_a)] \frac{q\_i^2}{3} \frac{4\pi \alpha^2}{3\tau s} \tag{9.114}$$

Therefore, if we assume approximate scaling for the quark distribution functions, we should get the same scaled cross section for different values of s. This prediction is compared with experimental data [56] for pp → μ<sup>+</sup>μ<sup>−</sup>X in Fig. 9.33. There are many successful predictions for other Drell–Yan processes in p̄p interactions at the CERN Sp̄pS collider and the Tevatron and in pp interactions at the LHC. The QCD-improved parton model has been used successfully for many hard processes at the LHC. Some of these applications will be considered in Chapter 13.
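The change of variables from (x<sub>a</sub>, x<sub>b</sub>) to (τ, y) used for the Drell–Yan cross section can be explored numerically. The sketch below assumes that the lepton pair carries E = (x<sub>a</sub> + x<sub>b</sub>)√s/2 and p<sub>z</sub> = (x<sub>a</sub> − x<sub>b</sub>)√s/2, as in eqn 9.111, so that y = ½ ln(x<sub>a</sub>/x<sub>b</sub>) and τ = x<sub>a</sub>x<sub>b</sub>. The function names, the conversion constant, and the numerical inputs are assumptions of this example.

```python
import math

ALPHA = 1.0 / 137.036    # fine-structure constant (assumed input)
GEV2_TO_NB = 0.3894e6    # (hbar c)^2 in nb GeV^2 (assumed conversion constant)

def momentum_fractions(tau, y):
    """x_a and x_b for a lepton pair with tau = s_hat/s and rapidity y."""
    root = math.sqrt(tau)
    return root * math.exp(y), root * math.exp(-y)

def tau_and_rapidity(x_a, x_b):
    """Inverse map: tau = x_a * x_b and y = 0.5 * ln(x_a / x_b)."""
    return x_a * x_b, 0.5 * math.log(x_a / x_b)

def jacobian(x_a, x_b, eps=1e-6):
    """Central-difference estimate of |d(tau, y) / d(x_a, x_b)|."""
    t_ap, y_ap = tau_and_rapidity(x_a + eps, x_b)
    t_am, y_am = tau_and_rapidity(x_a - eps, x_b)
    t_bp, y_bp = tau_and_rapidity(x_a, x_b + eps)
    t_bm, y_bm = tau_and_rapidity(x_a, x_b - eps)
    dt_da, dy_da = (t_ap - t_am) / (2 * eps), (y_ap - y_am) / (2 * eps)
    dt_db, dy_db = (t_bp - t_bm) / (2 * eps), (y_bp - y_bm) / (2 * eps)
    return abs(dt_da * dy_db - dt_db * dy_da)

def sigma_hat_nb(charge, s_hat_gev2):
    """Eqn 9.112 in nb for quark charge (units of e) and s_hat in GeV^2."""
    return charge**2 / 3.0 * 4.0 * math.pi * ALPHA**2 / (3.0 * s_hat_gev2) * GEV2_TO_NB

x_a, x_b = momentum_fractions(tau=0.01, y=0.5)
print(x_a, x_b, jacobian(x_a, x_b))
print(sigma_hat_nb(2.0 / 3.0, 100.0))   # u-quark annihilation at s_hat = 100 GeV^2
```

The Jacobian evaluates to 1 within numerical precision, so the double-differential cross section takes the same value in either pair of variables.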

**Fig. 9.33** Scaling cross section for pp → μ<sup>+</sup>μ<sup>−</sup>X at different values of CMS energy. The data are from the E605 experiment [56].

## **Chapter summary**


## **Further reading**


## **Exercises**


$$r = \frac{\sigma(\pi^+\mathcal{C} \to \mu^+\mu^-X)}{\sigma(\pi^-\mathcal{C} \to \mu^+\mu^-X)}$$

equals 1/4 when ŝ/s approaches 1. What value does r take for small ŝ/s?

What are the experimental issues associated with studying Drell–Yan reactions?


this process could be used to constrain the gluon parton density function.


$$R = \frac{[4u(x\_1) + d(x\_1)][\bar{u}(x\_2) + \bar{d}(x\_2)]}{4u(x\_1)\bar{u}(x\_2) + d(x\_1)\bar{d}(x\_2)}$$

# **10 Oscillations and** *CP* **violation in meson systems**

This chapter and the next deal with the general subject of 'oscillations' and CP violation, with this chapter focusing on oscillations and CP violation in meson systems and Chapter 11 dealing with neutrino oscillations. While there are some important differences between oscillations within meson systems and neutrino oscillations, they both demonstrate quantum-mechanical interference over macroscopic distances and allow extremely small mass differences to be measured. Moreover, studies of both the neutral kaon system and neutrino oscillations have produced results challenging the theoretical orthodoxy of their times.

The neutral kaon system has been extensively studied over more than sixty years, mainly in fixed-target experiments. The CP violation phenomenon was discovered in this system in 1964. More recently, B-meson systems have been studied at 'B-factories', which have produced spectacular results demonstrating CP-violating effects. There are now four neutral meson systems in which these oscillation and CP-violating processes have been studied: kaons and D, B, and B<sub>s</sub> mesons. CP-violating effects have also been observed in decays of charged particles.

CP violation is a necessary condition to understand the observed baryon asymmetry of the universe, but the amount of CP violation observed in the quark sector is too small to explain this effect. However, it is now believed that CP violation in the neutrino sector provides the most plausible explanation, so this subject will be discussed in the context of neutrino oscillations (Chapter 11).

All the CP violation effects observed in the quark sector to date are compatible with the Standard Model. The real interest in CP violation physics in the quark sector is that it usually arises from Feynman diagrams with loops, which naturally makes it very sensitive to new heavy particles coupling within the loop. This means that very precise measurements of CP violation can give access to information about new physics at very high mass scales. This is another illustration of how the indirect search for new physics via precision measurements is complementary to direct searches at machines like the LHC.


Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg 2016. Published in 2016 by Oxford University Press.

**Table 10.1** Neutral meson properties. ΔM is the mass difference between the two mass eigenstates and the lifetimes are given for the two mass eigenstates.

<sup>2</sup>The Q value is the mass of the parent minus the mass of the decay products and gives the total amount of kinetic energy available in the centre-of-mass frame.

## **10.1 Symmetries**

Since the role of symmetries is essential to understanding the behaviour of neutral meson systems, the reader should be familiar with these concepts from Chapter 2. The symmetries we discuss here are parity P, charge conjugation C, time reversal T, and their combinations CP and CPT. Parity and charge-conjugation symmetries are conserved for strong and electromagnetic interactions but not for weak interactions. Massless neutrinos and antineutrinos have a definite helicity and only left-handed neutrinos or right-handed antineutrinos participate in the weak interactions: parity is 'maximally' violated<sup>1</sup> (see Section 7.2.5).

<sup>1</sup>Some of the first evidence for parity violation in weak interactions was provided by the so-called 'τ–θ puzzle', where what is now known as the K<sup>+</sup> was observed to decay into two or three pions, which cannot happen if parity is conserved.

As will be discussed from Section 10.5 onwards, there is evidence for violation of CP symmetry in the neutral kaon system and in systems containing heavier quarks. Since relativistic field theories predict that the combined symmetry of CPT is conserved, T-violating effects are also expected in these systems.

## **10.2 Neutral kaon decays and** *K***<sub>1</sub> and** *K***<sub>2</sub>**

The four neutral meson systems that can be studied extensively are listed in Table 10.1. Each of these is the lightest neutral meson containing a particular combination of flavours of quarks, and so the only available decay modes are via the weak interaction. The main thing that makes these particles so fascinating is that there are some decay modes where the same final state is accessible by both the particle and antiparticle, for example K<sup>0</sup> → π<sup>+</sup>π<sup>−</sup> and K̄<sup>0</sup> → π<sup>+</sup>π<sup>−</sup>. The properties shown in Table 10.1 will be discussed extensively in this chapter; they impart striking differences in the ways in which the four meson systems behave. We start by looking closely at the neutral kaon system, which has two strikingly different lifetimes associated with it, and will then return to this table to discuss the properties of the other meson systems. Because the kaon is light, it has a limited number of decay channels and so is the simplest to consider first.

For the initial discussion, we assume that CP is strictly conserved. As just mentioned, the K<sup>0</sup> (quark content s̄d) and K̄<sup>0</sup> (d̄s) must decay weakly. They decay to two- and three-pion final states, and semileptonically: K<sup>0</sup> → l<sup>+</sup> + ν<sub>l</sub> + π<sup>−</sup> and K̄<sup>0</sup> → l<sup>−</sup> + ν̄<sub>l</sub> + π<sup>+</sup>, where l is e or μ. The charge of the lepton or pion in these semileptonic decays can be used to determine the strangeness of the decaying kaon. To understand the phenomenology of kaon decays, it is first necessary to examine the properties of the final states under the operation of CP.

The semileptonic final states are CP-conjugate states; in other words, CP(l<sup>+</sup> + ν<sub>l</sub> + π<sup>−</sup>) → l<sup>−</sup> + ν̄<sub>l</sub> + π<sup>+</sup> and vice versa, whereas the two- and three-pion final states are CP eigenstates. The two-pion final states are π<sup>0</sup>π<sup>0</sup> and π<sup>+</sup>π<sup>−</sup>, with a Q value<sup>2</sup> of ∼220 MeV. In the π<sup>0</sup>π<sup>0</sup> final state, because they are identical bosons, the π<sup>0</sup>s will be in a state of even relative angular momentum, L = 0, 2, so P = [η<sub>p</sub>(π)]<sup>2</sup>(−1)<sup>L</sup> = +1 and C = [η<sub>c</sub>(π<sup>0</sup>)]<sup>2</sup> = +1; hence CP = +1. For the π<sup>+</sup>π<sup>−</sup> final state, P = [η<sub>p</sub>(π)]<sup>2</sup>(−1)<sup>L</sup> = +1; the operation of C interchanges π<sup>+</sup> and π<sup>−</sup> and C(π<sup>+</sup>, π<sup>−</sup>) = (π<sup>−</sup>, π<sup>+</sup>), so η<sub>c</sub> = (−1)<sup>L</sup> = +1 and hence again CP = +1.

The Q value of the three-pion final states is only ∼70 MeV, so this suggests that the most probable value for the relative angular momentum is L = 0.<sup>3</sup> For the π<sup>0</sup>π<sup>0</sup>π<sup>0</sup> final state, P = [η<sub>p</sub>(π)]<sup>3</sup> = −1 and C = [η<sub>c</sub>(π<sup>0</sup>)]<sup>3</sup> = (+1)<sup>3</sup> = +1; hence CP = −1. The π<sup>+</sup>π<sup>−</sup>π<sup>0</sup> final state can be considered as a π<sup>0</sup> combined with the π<sup>+</sup>π<sup>−</sup> state. With CP(π<sup>+</sup>, π<sup>−</sup>) = +1 and CP(π<sup>0</sup>) = P(π<sup>0</sup>)C(π<sup>0</sup>) = −1, CP = −1 as for the 3π<sup>0</sup> mode.

<sup>3</sup>As the momenta of the pions are so low, they would need a very large impact parameter to have L = 1.

To summarize, the 2π final states are CP-even and the 3π final states are CP-odd. The important thing to notice is that the Q values of the two- and three-pion decay modes are very different. In particular, the Q value of the three-pion decay mode is only 70 MeV, which leaves very little phase space for these decays, and hence the partial decay rates of K<sup>0</sup> to the two- and three-pion final states will be very different.

K<sup>0</sup> and K̄<sup>0</sup> are not CP eigenstates, although

$$CP|K^0\rangle \to |\bar{K}^0\rangle, \qquad CP|\bar{K}^0\rangle \to |K^0\rangle.$$

Since the K<sup>0</sup> and K̄<sup>0</sup> both decay to the same two- and three-pion final states, it means they are coupled by virtual |ΔS| = 2 second-order weak transitions such as

$$K^0 \leftrightarrow (2\pi) \leftrightarrow \bar{K}^0, \qquad K^0 \leftrightarrow (3\pi) \leftrightarrow \bar{K}^0.$$

At the quark level, the diagrams that change between K<sup>0</sup> and K̄<sup>0</sup> states are shown in Fig. 10.1; they are often referred to as box diagrams.

This second-order weak coupling 'mixes' the K<sup>0</sup> and K̄<sup>0</sup>, meaning that the physical kaon states (the states with definite masses and lifetimes) must evolve as linear superpositions of K<sup>0</sup> and K̄<sup>0</sup>, i.e.

$$|\psi(t)\rangle = a(t)|K^0\rangle + b(t)|\bar{K}^0\rangle.$$

The physical states of the neutral kaons that are CP eigenstates can easily be seen to be

$$\begin{aligned} \left| K\_1 \right\rangle &= \sqrt{\frac{1}{2}} \left( \left| K^0 \right\rangle + \left| \bar{K}^0 \right\rangle \right) \\ \left| K\_2 \right\rangle &= \sqrt{\frac{1}{2}} \left( \left| K^0 \right\rangle - \left| \bar{K}^0 \right\rangle \right) \end{aligned} \tag{10.1}$$

K<sub>1</sub> and K<sub>2</sub> are orthogonal, and expressing these states as linear combinations of K<sup>0</sup> and K̄<sup>0</sup> corresponds to a change of basis from the strong-interaction eigenstates to the weak eigenstates of CP.

**Fig. 10.1** Box diagrams for K̄<sup>0</sup> → K<sup>0</sup> transitions.

The K<sub>1</sub> and K<sub>2</sub> states contain equal amounts of K<sup>0</sup> and K̄<sup>0</sup> and are eigenstates of even (+1) and odd (−1) CP, respectively, since

$$CP|K\_1\rangle \rightarrow \sqrt{\frac{1}{2}}\left(|\bar{K}^0\rangle + |K^0\rangle\right) = +|K\_1\rangle\tag{10.2}$$

$$CP|K\_2\rangle \rightarrow \sqrt{\frac{1}{2}}\left(|\bar{K}^0\rangle - |K^0\rangle\right) = -|K\_2\rangle \tag{10.3}$$

Since K<sub>1</sub> and K<sub>2</sub> are states of different CP and CP is conserved in their decays, they will decay to different final states with different lifetimes:

$$K\_1 \to 2\pi \qquad \text{(}CP = +1\text{)}\tag{10.4}$$

$$K\_2 \to 3\pi \qquad \text{(}CP = -1\text{)}\tag{10.5}$$
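The CP eigenvalues in eqns 10.2 and 10.3 can be verified with a two-component representation. The sketch below assumes the basis ordering (K<sup>0</sup>, K̄<sup>0</sup>) and the phase convention used in the text, in which CP simply exchanges the two states; the helper functions are illustrative.

```python
import math

# Basis ordering (K0, K0bar). With the phase convention of the text,
# CP exchanges the two basis states.
CP = ((0.0, 1.0),
      (1.0, 0.0))

def apply(matrix, vec):
    """Multiply a 2x2 matrix (tuple of rows) by a 2-component vector."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

r = math.sqrt(0.5)
K1 = (r, r)     # eqn 10.1, CP-even combination
K2 = (r, -r)    # eqn 10.1, CP-odd combination

cp_K1 = apply(CP, K1)
cp_K2 = apply(CP, K2)

# Recover the strangeness eigenstate K0 as sqrt(1/2) (K1 + K2).
K0 = tuple(r * (a + b) for a, b in zip(K1, K2))

print(cp_K1, cp_K2, dot(K1, K2), K0)
```

The output shows CP acting as +1 on K<sub>1</sub> and −1 on K<sub>2</sub>, the orthogonality of the two states, and that their normalized sum returns the pure K<sup>0</sup> state.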

The small phase space available for the K<sub>2</sub> → 3π decay (Q ∼ 70 MeV) means that τ<sub>2</sub>, the lifetime of the K<sub>2</sub>, will be much longer than τ<sub>1</sub>, the lifetime of the K<sub>1</sub>.<sup>4</sup> The K<sub>1</sub> and K<sub>2</sub> are also expected to have different masses. The K<sub>1</sub> − K<sub>2</sub> mass difference is very important and is discussed further below; for the present, the discussion will concentrate on the consequences of their different lifetimes.

The strangeness eigenstates K<sup>0</sup> and K̄<sup>0</sup> can be expressed in terms of K<sub>1</sub> and K<sub>2</sub> by inverting eqns 10.1:

$$|K^0\rangle = \sqrt{\frac{1}{2}}\left(|K\_1\rangle + |K\_2\rangle\right) \tag{10.6}$$

$$|\bar{K}^0\rangle = \sqrt{\frac{1}{2}}\left(|K\_1\rangle - |K\_2\rangle\right)\tag{10.7}$$

An initially pure K<sup>0</sup> (or K̄<sup>0</sup>) state will be an equal mixture of K<sub>1</sub> and K<sub>2</sub>. If the number of K<sup>0</sup> produced at a (proper) time t = 0 is N<sub>0</sub>, then the total number of kaons at a subsequent time t will be

$$N(t) = \frac{1}{2} N\_0 (\mathbf{e}^{-t/\tau\_1} + \mathbf{e}^{-t/\tau\_2}) \tag{10.8}$$

Since τ<sub>1</sub> ≪ τ<sub>2</sub>, the K<sub>1</sub> component will die away first, and at a time much greater than the lifetime of the K<sub>1</sub> only the K<sub>2</sub> component will remain. If a pure K<sup>0</sup> beam is produced, it will be found to contain a rapidly decaying component, K<sub>1</sub>, decaying to two pions, and a slow component, K<sub>2</sub>, decaying to three pions. Although the kaons produced initially are all of the same strangeness, the K<sub>2</sub> component that remains at long times will be an equal mixture of K<sup>0</sup> and K̄<sup>0</sup>. The measured lifetimes of K<sub>1</sub> and K<sub>2</sub> are 89 ps and 51 ns, respectively, corresponding to decay lengths of cτ<sub>1</sub> = 2.7 cm and cτ<sub>2</sub> = 15.3 m. The K<sub>2</sub> lifetime is roughly 600 times the lifetime of the K<sub>1</sub>, and at modest energies (i.e. a few GeV) the K<sub>1</sub> component will decay in a few centimetres but the K<sub>2</sub> component will travel many metres before decaying; it is therefore possible to make essentially pure K<sub>2</sub> beams.

<sup>4</sup>The heavier D, B, and B<sub>s</sub> mesons also mix in a similar way; however, because they are heavier, there are many decay modes with large Q values accessible to each of the CP eigenstates and so the differences in lifetime in the heavy-meson systems are smaller than for kaons.
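The way a K<sup>0</sup> beam turns into an essentially pure K<sub>2</sub> beam can be made concrete with eqn 10.8. The sketch below uses the lifetimes quoted in the text; the kaon mass, the beam momentum, and the flight distance are illustrative assumptions of this example.

```python
import math

TAU_1 = 89e-12    # K1 lifetime in s (value quoted in the text)
TAU_2 = 51e-9     # K2 lifetime in s (value quoted in the text)
C = 2.998e8       # speed of light in m/s
M_K = 0.4976      # neutral kaon mass in GeV (assumed input)

def surviving_fractions(t):
    """K1 and K2 fractions of N0 at proper time t, the two terms of eqn 10.8."""
    return 0.5 * math.exp(-t / TAU_1), 0.5 * math.exp(-t / TAU_2)

def proper_time(distance_m, momentum_gev):
    """Proper time elapsed after a lab flight distance: t = d / (beta*gamma*c)."""
    beta_gamma = momentum_gev / M_K
    return distance_m / (beta_gamma * C)

# Decay lengths c*tau for the two components.
print(f"c*tau_1 = {100 * C * TAU_1:.1f} cm, c*tau_2 = {C * TAU_2:.1f} m")

# Illustrative case: a 2 GeV kaon beam observed 2 m from the production target.
t = proper_time(distance_m=2.0, momentum_gev=2.0)
k1, k2 = surviving_fractions(t)
print(f"K1 fraction = {k1:.2e}, K2 fraction = {k2:.3f}")
```

For these illustrative values the K<sub>1</sub> component has fallen by many orders of magnitude while the K<sub>2</sub> component is almost undiminished, which is the basis for producing K<sub>2</sub> beams.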

## **10.3 Mass differences of neutral mesons**

The discussion in Section 10.2 introduced K<sub>1</sub> and K<sub>2</sub> as the CP eigenstates of the neutral kaon system that arise as a consequence of second-order weak coupling between K<sup>0</sup> and K̄<sup>0</sup>. It was indicated that K<sub>1</sub> and K<sub>2</sub> must have different masses and it is instructive to look more closely at how this arises. We will label the states as |K⟩ etc., but the formalism is valid for any of the neutral meson systems in Table 10.1.

We introduce the formalism in two parts, first by assuming the particles do not decay, then introducing a treatment that allows the mesons to decay. We continue to assume that CP is conserved. Take a meson in a state that is composed of a linear combination of the states |K<sup>0</sup>⟩ and |K̄<sup>0</sup>⟩ that have definite strangeness (i.e. are eigenfunctions of the strangeness operator). Since |K<sup>0</sup>⟩ and |K̄<sup>0</sup>⟩ are coupled by second-order weak interactions, we can describe the kaon system by a pair of coupled equations using Schrödinger's time-dependent equation:

$$\begin{split} \mathrm{i}\frac{\partial}{\partial t} \begin{pmatrix} |K^0\rangle \\ |\bar{K}^0\rangle \end{pmatrix} &= \mathbf{H} \begin{pmatrix} |K^0\rangle \\ |\bar{K}^0\rangle \end{pmatrix} \\ &= \begin{pmatrix} M & M\_{12} \\ M\_{12}^\* & M \end{pmatrix} \begin{pmatrix} |K^0\rangle \\ |\bar{K}^0\rangle \end{pmatrix} \end{split} \tag{10.9}$$

where M is the mass of K<sup>0</sup> and K̄<sup>0</sup> (their masses are identical if we impose CPT invariance). The off-diagonal term M<sub>12</sub> = ⟨K<sup>0</sup>|H<sub>weak</sub>|K̄<sup>0</sup>⟩ represents the second-order weak coupling between |K<sup>0</sup>⟩ and |K̄<sup>0</sup>⟩. The matrix **H** represents the Hamiltonian of the system and so is Hermitian. If we instead consider the state as a combination of |K<sub>1</sub>⟩ and |K<sub>2</sub>⟩ (which are eigenfunctions of the CP operator), the Schrödinger equation can be written in terms of a pair of equations that are no longer coupled:

$$\mathrm{i}\frac{\partial}{\partial t}\begin{pmatrix}|K\_1\rangle\\|K\_2\rangle\end{pmatrix} = \mathbf{H}'\begin{pmatrix}|K\_1\rangle\\|K\_2\rangle\end{pmatrix} = \begin{pmatrix}M\_1 & 0\\0 & M\_2\end{pmatrix}\begin{pmatrix}|K\_1\rangle\\|K\_2\rangle\end{pmatrix} \tag{10.10}$$

The diagonal elements of the matrix **H′** give the masses of the K<sub>1</sub> and K<sub>2</sub> and can be found from the eigenvalues of **H**, which are M ± |M<sub>12</sub>|. This is how the mass splitting of the K<sub>1</sub> and K<sub>2</sub> arises. The eigenvectors of **H** give the states K<sub>1</sub> and K<sub>2</sub> in terms of K<sup>0</sup> and K̄<sup>0</sup> as in eqns 10.1.
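The diagonalization can be checked explicitly. The sketch below writes down the eigenvalues M ± |M<sub>12</sub>| and the corresponding eigenvectors of the matrix in eqn 10.9 in closed form and verifies them by direct multiplication. The function names and the numerical inputs (arbitrary units, with a deliberately complex M<sub>12</sub>) are illustrative assumptions only.

```python
import math


def mass_eigenstates(m, m12):
    """Eigenpairs of the Hermitian matrix [[m, m12], [conj(m12), m]] of eqn 10.9.

    Returns ((m + |m12|, v_plus), (m - |m12|, v_minus)), where each vector holds
    the coefficients of the (K0, K0bar) basis states.
    """
    r = abs(m12)
    if r == 0.0:
        raise ValueError("m12 = 0: the two states are degenerate and do not mix")
    phase_conj = (m12 / r).conjugate()      # exp(-i arg m12)
    s = 1.0 / math.sqrt(2.0)
    v_plus = (s, s * phase_conj)
    v_minus = (s, -s * phase_conj)
    return (m + r, v_plus), (m - r, v_minus)


def apply_matrix(m, m12, vec):
    """Multiply [[m, m12], [conj(m12), m]] by the column vector vec."""
    a, b = vec
    return (m * a + m12 * b, complex(m12).conjugate() * a + m * b)


if __name__ == "__main__":
    m, m12 = 1.0, 0.06 + 0.08j              # illustrative values, arbitrary units
    for eigenvalue, vec in mass_eigenstates(m, m12):
        hv = apply_matrix(m, m12, vec)
        residual = max(abs(x - eigenvalue * y) for x, y in zip(hv, vec))
        print(f"eigenvalue = {eigenvalue:.3f}, |Hv - lambda v| = {residual:.1e}")
```

For real M<sub>12</sub> the phase factor is ±1 and the eigenvectors reduce to the equal-weight combinations (K<sup>0</sup> ± K̄<sup>0</sup>)/√2; which of the two is labelled K<sub>1</sub> depends on the phase convention adopted in eqns 10.1.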

We now turn to the case where the particles are able to decay. The time evolution of a neutral meson wavefunction may be written as

$$|K^0(t)\rangle = \mathbf{e}^{-\mathbf{i}Mt} \mathbf{e}^{-t/2\tau} |K^0\rangle \tag{10.11}$$

where the first exponential is the usual plane wave for a state with energy E = M. The second exponential is imposed<sup>5</sup> to give the exponential decay for a state with proper lifetime τ (i.e. width Γ = 1/τ in natural units), so that

$$|\langle K^0 | K^0(t) \rangle|^2 \propto \mathbf{e}^{-t/\tau} \quad \left(= \mathbf{e}^{-\Gamma t}\right)$$

<sup>5</sup>Although this is standard practice for considering meson decays, it is a somewhat non-standard use of quantum mechanics. This second exponential means that the state as written does not remain normalized; there is a further piece to the wavefunction, not written here, that represents the part of the state that has decayed.

We can generalize eqn 10.11 to a two-state system:

$$\begin{pmatrix} |K^0(t)\rangle \\ |\bar{K}^0(t)\rangle \end{pmatrix} = \Sigma \begin{pmatrix} |K^0\rangle \\ |\bar{K}^0\rangle \end{pmatrix}, \qquad \text{where } \Sigma = \mathbf{e}^{-\mathbf{i}\mathbf{M}t - \mathbf{\Gamma} t/2} \tag{10.12}$$

**M** and **Γ** are 2×2 matrices encoding the time evolution of the two-state system. |K<sup>0</sup>⟩ and its antiparticle |K̄<sup>0</sup>⟩ are the flavour eigenstates. The off-diagonal terms describe the transitions (mixing) between the meson and antimeson. We can apply Schrödinger's equation, i dψ/dt = Hψ, to identify the Hamiltonian, as we did previously:

$$\mathbf{H} = \mathbf{M} - \frac{\mathbf{i}}{2}\mathbf{\Gamma}$$

Because of the way we have introduced the decaying states, **H** is not Hermitian; however, since any matrix **A** can be written in the form **A** = **H**<sub>1</sub> + i**H**<sub>2</sub>, where **H**<sub>1</sub> and **H**<sub>2</sub> are Hermitian, it follows that **M** and **Γ** are Hermitian matrices. Also, we impose CPT invariance (a particle and its antiparticle have identical mass and lifetime) to give

$$\begin{aligned} M\_{21} &= M\_{12}^\*, & \Gamma\_{21} &= \Gamma\_{12}^\* \\ M\_{11} &= M\_{22} = M, & \Gamma\_{11} &= \Gamma\_{22} = \Gamma \end{aligned}$$

So the Hamiltonian simplifies to

$$\mathbf{H} = \begin{pmatrix} M & M\_{12} \\ M\_{12}^\* & M \end{pmatrix} - \frac{\mathrm{i}}{2} \begin{pmatrix} \Gamma & \Gamma\_{12} \\ \Gamma\_{12}^\* & \Gamma \end{pmatrix} \tag{10.13}$$

The five fundamental quantities describing the mixing system are M, Γ, |M<sub>12</sub>|, |Γ<sub>12</sub>|, and their relative phase arg(M<sub>12</sub>/Γ<sub>12</sub>). M<sub>12</sub> and Γ<sub>12</sub> cannot be fully determined, since there is an arbitrary, unobservable phase in the wavefunction of |K<sup>0</sup>⟩. **H** can be diagonalized as above to give the masses and lifetimes of the K<sub>1</sub> and K<sub>2</sub> in terms of these quantities. The mass splitting is a more complicated expression than the one we derived above for the non-decaying mesons, but is still approximately ±|M<sub>12</sub>|.
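To make the diagonalization of eqn 10.13 concrete, the sketch below uses the closed-form eigenvalues of a 2×2 matrix with equal diagonal elements, λ<sub>±</sub> = (M − iΓ/2) ± √(H<sub>12</sub>H<sub>21</sub>), and reads off each mass as Re λ and each width as −2 Im λ. The function name `physical_states` and the numerical inputs are illustrative assumptions (arbitrary units, chosen so that one state is much longer lived than the other); they are not fitted to data.

```python
import cmath


def physical_states(m, gamma, m12, gamma12):
    """Masses and widths of the two eigenstates of H = M - (i/2) Gamma (eqn 10.13).

    m and gamma are the real diagonal elements; m12 and gamma12 may be complex.
    Returns a list of two (mass, width) tuples.
    """
    m12 = complex(m12)
    gamma12 = complex(gamma12)
    h12 = m12 - 0.5j * gamma12
    h21 = m12.conjugate() - 0.5j * gamma12.conjugate()
    root = cmath.sqrt(h12 * h21)            # half the eigenvalue splitting
    h_diag = m - 0.5j * gamma
    states = []
    for eigenvalue in (h_diag + root, h_diag - root):
        states.append((eigenvalue.real, -2.0 * eigenvalue.imag))
    return states


if __name__ == "__main__":
    # Illustrative CP-conserving inputs (real m12 and gamma12), arbitrary units.
    for mass, width in physical_states(10.0, 0.5, -0.24, 0.499):
        print(f"mass = {mass:.3f}, width = {width:.3f}")
```

With real M<sub>12</sub> and Γ<sub>12</sub> the masses come out as exactly M ± |M<sub>12</sub>| and the widths as Γ ± |Γ<sub>12</sub>|. Giving M<sub>12</sub> and Γ<sub>12</sub> a relative phase makes the square root genuinely complex, and the splitting is then only approximately 2|M<sub>12</sub>|.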

## **10.4 Flavour oscillations**

Carrying on with the discussion from Section 10.3, we continue to use the kaon system as an example, but these results are applicable to all the meson systems. Since M<sub>12</sub> arises from second-order weak interactions, the mass difference ΔM = M<sub>1</sub> − M<sub>2</sub> ∼ 2|M<sub>12</sub>| is very small. It can be measured by examining the strangeness of a decaying K<sup>0</sup> beam. Consider an initially pure K<sup>0</sup> state at proper time<sup>6</sup> t = 0:

$$\left|\psi(t=0)\right\rangle = \left|K^0\right\rangle = \sqrt{\frac{1}{2}}\left(\left|K\_1\right\rangle + \left|K\_2\right\rangle\right).$$

<sup>6</sup>Time in the kaon centre-of-mass system.

The state evolves in time as

$$|\psi(t)\rangle = \sqrt{\frac{1}{2}} \left( \mathbf{e}^{\left(-\mathbf{i}M\_1 - \Gamma\_1/2\right)t} |K\_1\rangle + \mathbf{e}^{\left(-\mathbf{i}M\_2 - \Gamma\_2/2\right)t} |K\_2\rangle \right) \tag{10.14}$$

where Γ<sub>1</sub> and Γ<sub>2</sub> are the decay widths of K<sub>1</sub> and K<sub>2</sub>. |ψ(t)⟩ can be re-expressed in terms of |K<sup>0</sup>⟩ and |K̄<sup>0</sup>⟩ by using eqns 10.1:

$$\begin{split} \left| \psi(t) \right> &= \frac{1}{2} \left( \mathbf{e}^{\left( -\mathbf{i}M\_1 - \Gamma\_1/2 \right)t} + \mathbf{e}^{\left( -\mathbf{i}M\_2 - \Gamma\_2/2 \right)t} \right) \left| K^0 \right> \\ &+ \frac{1}{2} \left( \mathbf{e}^{\left( -\mathbf{i}M\_1 - \Gamma\_1/2 \right)t} - \mathbf{e}^{\left( -\mathbf{i}M\_2 - \Gamma\_2/2 \right)t} \right) \left| \bar{K}^0 \right> \end{split} \tag{10.15}$$

Interference will occur between the terms of frequencies M<sub>1</sub> and M<sub>2</sub> in the amplitudes of |K<sup>0</sup>⟩ and |K̄<sup>0</sup>⟩. The K<sup>0</sup> intensity at time t is

$$\begin{split} \mathcal{I}(K^{0}) &= |\langle K^{0}|\psi(t)\rangle|^{2} \\ &= \frac{1}{4} \left( \mathbf{e}^{-\Gamma\_{1}t} + \mathbf{e}^{-\Gamma\_{2}t} \right) + \frac{1}{2} \left( \mathbf{e}^{-\left(\Gamma\_{1}/2\right)t - \left(\Gamma\_{2}/2\right)t} \right) \cos(\Delta Mt) \end{split} \tag{10.16}$$

Similarly, the intensity of K̄<sup>0</sup> from the same initial K<sup>0</sup> can be obtained from |⟨K̄<sup>0</sup>|ψ(t)⟩|<sup>2</sup>:

$$\mathcal{I}(\bar{K}^0) = \frac{1}{4} \left( \mathbf{e}^{-\Gamma\_1 t} + \mathbf{e}^{-\Gamma\_2 t} \right) - \frac{1}{2} \left( \mathbf{e}^{-(\Gamma\_1/2)t - (\Gamma\_2/2)t} \right) \cos(\Delta M t) \tag{10.17}$$
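The step from eqn 10.15 to eqns 10.16 and 10.17 is short but worth writing out once. As a sketch, introduce the shorthand a<sub>j</sub> = e<sup>(−iM<sub>j</sub> − Γ<sub>j</sub>/2)t</sup> for j = 1, 2 (the symbols a<sub>1</sub> and a<sub>2</sub> are introduced here for convenience and are not part of the original notation), so that the K<sup>0</sup> and K̄<sup>0</sup> amplitudes in eqn 10.15 are (a<sub>1</sub> ± a<sub>2</sub>)/2. Then

$$\begin{aligned} |a\_1 \pm a\_2|^2 &= |a\_1|^2 + |a\_2|^2 \pm 2\,\mathrm{Re}\left(a\_1 a\_2^\*\right) \\ &= \mathbf{e}^{-\Gamma\_1 t} + \mathbf{e}^{-\Gamma\_2 t} \pm 2\,\mathbf{e}^{-(\Gamma\_1 + \Gamma\_2)t/2} \cos\left[(M\_1 - M\_2)t\right] \end{aligned}$$

Dividing by 4 reproduces eqns 10.16 and 10.17 with ΔM = M<sub>1</sub> − M<sub>2</sub>. Since the cosine is even, the intensities do not depend on the sign of ΔM.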

Equations 10.16 and 10.17 imply that the strangeness content of a beam initially containing strangeness +1 K<sup>0</sup> mesons will oscillate with time (or distance in the laboratory). The oscillations will be observable if ΔM τ<sub>1</sub> ∼ 1 or greater. Figure 10.2(a) illustrates the behaviour of the K<sup>0</sup> and K̄<sup>0</sup> intensities given by eqns 10.16 and 10.17. Oscillations are observable for a few K<sub>1</sub> lifetimes. At times t ≫ τ<sub>1</sub>, the K<sub>1</sub> component will have entirely decayed away, leaving only K<sub>2</sub>, and the beam will be an equal mixture of K<sup>0</sup> and K̄<sup>0</sup> with zero net strangeness. The total kaon intensity at any time is given by the sum of eqns 10.16 and 10.17 and agrees with eqn 10.8, as it must.
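The behaviour described above can be reproduced numerically. The following sketch evaluates eqns 10.16 and 10.17 and checks that their sum agrees with eqn 10.8. The function name `intensities`, the rounded value of ħ, and the choice of time grid are illustrative assumptions; the lifetimes and the kaon mass difference (3.483 × 10<sup>−12</sup> MeV) are the values quoted in this chapter.

```python
import math

HBAR_MEV_S = 6.582e-22   # reduced Planck constant in MeV s (rounded)


def intensities(t, gamma_1, gamma_2, delta_m):
    """Return (I(K0), I(K0bar)) from eqns 10.16 and 10.17.

    t is the proper time; gamma_1, gamma_2 and delta_m are rates in units of 1/t.
    """
    envelope = 0.25 * (math.exp(-gamma_1 * t) + math.exp(-gamma_2 * t))
    interference = (0.5 * math.exp(-0.5 * (gamma_1 + gamma_2) * t)
                    * math.cos(delta_m * t))
    return envelope + interference, envelope - interference


if __name__ == "__main__":
    tau_1, tau_2 = 89e-12, 51e-9                  # lifetimes in s
    delta_m = 3.483e-12 / HBAR_MEV_S              # mass difference, MeV -> 1/s
    print(f"delta_m * tau_1 = {delta_m * tau_1:.2f}")
    for n in range(0, 13, 2):
        t = n * tau_1
        i_k0, i_k0bar = intensities(t, 1.0 / tau_1, 1.0 / tau_2, delta_m)
        total = 0.5 * (math.exp(-t / tau_1) + math.exp(-t / tau_2))  # eqn 10.8
        print(f"t = {n:2d} tau_1: I(K0) = {i_k0:.4f}, I(K0bar) = {i_k0bar:.4f}, "
              f"sum - eqn 10.8 = {i_k0 + i_k0bar - total:.1e}")
```

Because the interference term is damped at the rate (Γ<sub>1</sub> + Γ<sub>2</sub>)/2, it has effectively vanished after about ten K<sub>1</sub> lifetimes, and the two intensities converge to the same slowly decaying K<sub>2</sub> contribution.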

The features of the oscillations of each of the four systems are remarkably different owing to the different values of the parameters in Table 10.1. The intensities from eqns 10.16 and 10.17 are plotted for each of the meson systems in Fig. 10.2(a–d), and we describe each in the following subsections.

## **10.4.1 *K*<sup>0</sup>–*K̄*<sup>0</sup> oscillations**

Strangeness oscillations can be observed experimentally by starting with a pure S = +1 K<sup>0</sup> beam produced, for example, by K<sup>+</sup> + n → K<sup>0</sup> + p and tagging the strangeness of the decaying kaon by using the semileptonic decays K<sup>0</sup> → l<sup>+</sup> + ν + π<sup>−</sup> and K̄<sup>0</sup> → l<sup>−</sup> + ν̄ + π<sup>+</sup>. The K<sub>1</sub>–K<sub>2</sub> mass difference can be deduced from the period of the oscillation: ΔM = 3.483 × 10<sup>−12</sup> MeV and ΔM τ<sub>1</sub> = 0.47. Being less than one part in 10<sup>14</sup> of the K<sup>0</sup> mass, ΔM is truly tiny, and is the smallest mass difference that has ever been measured.

**Fig. 10.2** Intensities following the production of a pure meson state according to eqns 10.16 and 10.17 for the kaon (a), D<sup>0</sup> (b), B<sup>0</sup> (c), and B<sup>0</sup><sub>s</sub> (d) systems.

A rough dimensional estimate of ΔM can be made by considering the 'box' diagrams for ΔS = 2 K<sup>0</sup> → K̄<sup>0</sup> transitions shown in Fig. 10.1. The transition is second-order weak with s- and d-quark to u-, c-, or t-quark transitions occurring at the vertices. The s and d to u and c couplings are the most important at the relatively low energy scale of the kaons, so

$$\Delta M \sim \langle K^0 | H\_{\text{weak}} | \bar{K}^0 \rangle \sim G\_\text{F}^2 m\_K^5 \cos^2 \theta\_\text{C} \sin^2 \theta\_\text{C} \tag{10.18}$$

where θ<sub>C</sub> is the Cabibbo angle.<sup>7</sup> The use of the kaon mass m<sub>K</sub> ensures that eqn 10.18 has the correct dimensions. A full calculation involves the evaluation of loop Feynman diagrams (the box diagrams shown in Fig. 10.1), with the result

$$\Delta M \approx \frac{G\_\mathrm{F}^2 f\_K^2 m\_K}{3\pi^2} \cos^2 \theta\_\mathrm{C} \sin^2 \theta\_\mathrm{C} \frac{(m\_c^2 - m\_u^2)^2}{m\_c^2} \tag{10.19}$$

where f<sub>K</sub> ∼ 170 MeV is the experimentally determined kaon decay constant. In a model without charm, the prediction gives a result ∼4000 times higher than the experimental result, but good agreement is obtained if the charm quark is included in the calculation.<sup>8</sup> Thus measurement of neutral kaons gave indirect evidence for the existence of the c quark and provided a rough prediction for its mass before it was discovered. This is a classic example of how precision low-energy measurements are sensitive via loop diagrams to physics at much higher mass scales.

<sup>7</sup>This could equally have been written in terms of CKM elements but in two generations, V<sub>ud</sub> = V<sub>cs</sub> = cos θ<sub>C</sub> and V<sub>cd</sub> = −V<sub>us</sub> = − sin θ<sub>C</sub>.

<sup>8</sup>A similar argument applies to the rare decay K<sup>0</sup><sub>L</sub> → μ<sup>+</sup>μ<sup>−</sup>. Without the presence of the c quark, the branching ratio would be expected to be O(α<sup>2</sup>), where α is the fine-structure constant. Allowing for the existence of the c quark, the branching ratio is

$$\text{BR} \sim \alpha^2 (m\_c^2 - m\_u^2) / M\_W^2$$

where m<sub>c</sub>, m<sub>u</sub>, and M<sub>W</sub> are the masses of the c quark, u quark, and W boson, respectively.

## **10.4.2 *D*<sup>0</sup>–*D̄*<sup>0</sup> oscillations**

In general, we should expect to observe D<sup>0</sup>–D̄<sup>0</sup> oscillations because the same type of box diagrams responsible for kaon oscillations (Fig. 10.1) will cause D<sup>0</sup> ↔ D̄<sup>0</sup> transitions. However, it turns out to be much more difficult to observe oscillations involving charm quarks. First (as for B mesons), there are so many possible decay modes that the lifetimes of the D<sub>1</sub> 'short-lived' and D<sub>2</sub> 'long-lived' mesons will be very similar (unlike the case for kaons), so we cannot produce a pure sample of the long-lived neutral D mesons. Also, the particular values of the CKM elements cause the rate of oscillations of D mesons to be much slower than their decays, as illustrated in Fig. 10.2(b). Therefore, we need to observe D mesons over a time period of many lifetimes to be able to observe oscillations. This implies that we need very high-statistics samples of D mesons, and the best place to obtain these is using the LHCb experiment at the LHC.<sup>9</sup> The oscillations were studied by measuring<sup>10</sup> the time dependence of the ratio

$$R = \frac{N(D^0 \to K^+ \pi^-)}{N(D^0 \to K^- \pi^+)}\tag{10.20}$$

The decays D<sup>0</sup> → K<sup>−</sup>π<sup>+</sup> involve the quark-level transitions c → s and u → d, both of which have a large CKM matrix element.<sup>11</sup> The decays D<sup>0</sup> → K<sup>+</sup>π<sup>−</sup> involve the quark-level transitions c → d and u → s, both of which depend on the Cabibbo angle as sin θ<sub>C</sub>, and hence these rare decays are called 'doubly Cabibbo-suppressed'. We can also obtain 'wrong-sign' Kπ decays of a D<sup>0</sup> if the D<sup>0</sup> oscillates into a D̄<sup>0</sup> that then decays by a 'Cabibbo-favoured' mode to K<sup>+</sup>π<sup>−</sup>. Therefore, the signature for D<sup>0</sup> oscillations will be the ratio of 'wrong-sign' to 'right-sign' decays given by R (see eqn 10.20) increasing with time as more oscillations occur. From an experimental perspective, we need to tag the flavour of the neutral D meson at production and measure its decay time.

<sup>9</sup>The first evidence for D-meson oscillations was found at the B-factories and at the Tevatron.

<sup>10</sup>We implicitly assume the inclusion of charge-conjugate states.

<sup>11</sup>In a two-generation approximation, the CKM matrix element is cos θ<sub>C</sub> and the small value of the Cabibbo angle results in a large value of this element. Hence these decay modes are called 'Cabibbo-favoured'.


The flavour tagging is done by selecting decays D<sup>∗±</sup> → D<sup>0</sup>π<sup>±</sup>, where the charge of the π<sup>±</sup> determines the flavour of the neutral D meson at production. This is done by first reconstructing D<sup>0</sup> mesons and then combining them with π<sup>±</sup> and looking for a narrow peak in the invariant mass spectrum of Δm = m(D<sup>0</sup>π<sup>±</sup>) − m(D<sup>0</sup>). The decay time of the D<sup>0</sup> is determined by measuring the flight path L and momentum<sup>12</sup> p, so that the decay time is given by t = L m<sub>D<sup>0</sup></sub>/p. The measured value [103] of the ratio R is shown in Fig. 10.3. The value of R is increasing with time, as expected for oscillations. The oscillations are so slow compared with the lifetime that only a fraction of an oscillation period can be observed.

<sup>12</sup>The momentum of the neutral particle is reconstructed from the momenta of the charged decay products.

**Fig. 10.3** The ratio R of 'wrong-sign' to 'right-sign' decay modes as a function of decay time divided by the mean lifetime. The curve shows a fit allowing for D<sup>0</sup> oscillations and is in good agreement with the data. The dashed horizontal line shows the 'no-oscillation' model. From [103]. The data point at the largest value of t/τ covers the range 5 < t/τ < 20.

## **10.4.3 *B*<sup>0</sup>–*B̄*<sup>0</sup> mixing and oscillations**

Neutral B mesons come in two varieties, B<sup>0</sup><sub>d</sub> and B<sup>0</sup><sub>s</sub>, containing a b̄ and either a d or an s quark, respectively.<sup>13</sup> The masses of the strong-interaction eigenstates are M<sub>B</sub> = 5.28 GeV and M<sub>B<sub>s</sub></sub> = 5.37 GeV. The large mass means that they have many possible decay modes and short lifetimes, and so (as with the D mesons, but unlike the kaons), the B<sub>1</sub> and B<sub>2</sub> cannot be studied separately.

<sup>13</sup>The usual convention is that B<sup>0</sup> means B<sup>0</sup><sub>d</sub> and the subscript d is omitted. B<sup>0</sup> contains a b̄ antiquark, whereas D<sup>0</sup> contains a c quark.

**Fig. 10.4** Box diagram for B<sup>0</sup><sub>d</sub>–B̄<sup>0</sup><sub>d</sub> transitions.

As discussed in Section 10.4, the condition for observing oscillations is that ΔM τ is of the order of unity or greater, where ΔM is the mass difference between the physical eigenstates and τ is the lifetime. Figure 10.4 shows two diagrams for B<sup>0</sup><sub>d</sub>–B̄<sup>0</sup><sub>d</sub> transitions. By direct analogy with eqn 10.18, the B<sub>1</sub> − B<sub>2</sub> mass difference must be proportional to M<sub>B</sub><sup>5</sup> times the appropriate CKM elements, i.e.

$$\Delta M\_B = M\_{B\_1} - M\_{B\_2} \approx G\_\mathrm{F}^2 M\_B^5 |V\_{tb}|^2 |V\_{td}|^2.$$

B mesons have a relatively fast oscillation compared with their lifetime, which makes oscillation studies easier than for D mesons, because of two factors: first, only top-type quarks run in the 'box' diagram responsible for oscillations and, second, the lifetime is relatively long because of the small value of the CKM matrix elements in the decays (Vcb and Vub).

By assuming the lifetimes of B<sub>1</sub> and B<sub>2</sub> are the same, the expressions for the B<sup>0</sup> ↔ B̄<sup>0</sup> transition probabilities are somewhat simpler than those for K<sup>0</sup> ↔ K̄<sup>0</sup> (eqns 10.16 and 10.17):

$$\begin{aligned} \mathbf{P}(B^0 \to B^0) &= \frac{1}{2} \mathbf{e}^{-\Gamma\_B t} [1 + \cos(\Delta M\_B \, t)] \\ \mathbf{P}(B^0 \to \bar{B}^0) &= \frac{1}{2} \mathbf{e}^{-\Gamma\_B t} [1 - \cos(\Delta M\_B \, t)] \end{aligned} \tag{10.21}$$

where Γ<sub>B</sub> = 1/τ<sub>B</sub>. Equations 10.21 represent the probabilities of observing a B<sup>0</sup> or B̄<sup>0</sup> at some time t after a B<sup>0</sup> has been created; the same expressions hold, with B<sup>0</sup> and B̄<sup>0</sup> interchanged, for an initial B̄<sup>0</sup>.

We can tag the flavour of a decaying B meson using the semileptonic decay modes. Therefore, if we produce a B<sup>0</sup>B̄<sup>0</sup> pair, we can use events in which both B mesons decay semileptonically and identify events with same-sign leptons (SS) as having oscillated and those with opposite-sign leptons (OS) as not having oscillated. We define the usual asymmetry A = (OS − SS)/(OS + SS), and from eqn 10.21 we can see that A = cos(ΔM<sub>B</sub> t). Therefore, if we can measure the frequency of these oscillations, we can determine the mass difference between the light and heavy states.
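The relation between eqn 10.21 and the asymmetry can be checked with a few lines of code. The sketch below evaluates the unmixed and mixed probabilities and forms A = (unmixed − mixed)/(unmixed + mixed), in which the decay exponential cancels. It also converts an oscillation frequency in ps<sup>−1</sup> to a mass difference in MeV. The function names, the rounded value of ħ, and the B<sup>0</sup> lifetime of 1.52 ps are assumptions made for illustration (the lifetime is not quoted in the text); the frequency 0.480 ps<sup>−1</sup> is the B<sup>0</sup><sub>d</sub> value quoted in this section.

```python
import math

HBAR_MEV_PS = 6.582e-10   # reduced Planck constant in MeV ps (rounded)


def b_probabilities(t, gamma_b, delta_m_b):
    """Return (P(B0 -> B0), P(B0 -> B0bar)) from eqn 10.21.

    Assumes equal widths for the two mass eigenstates; t in ps, rates in 1/ps.
    """
    decay = 0.5 * math.exp(-gamma_b * t)
    oscillation = math.cos(delta_m_b * t)
    return decay * (1.0 + oscillation), decay * (1.0 - oscillation)


def asymmetry(t, gamma_b, delta_m_b):
    """(unmixed - mixed) / (unmixed + mixed); equals cos(delta_m_b * t)."""
    unmixed, mixed = b_probabilities(t, gamma_b, delta_m_b)
    return (unmixed - mixed) / (unmixed + mixed)


if __name__ == "__main__":
    tau_b = 1.52          # ps, assumed B0 lifetime (not taken from the text)
    delta_m_b = 0.480     # 1/ps, value quoted for the B0_d system
    print(f"delta_m = {delta_m_b * HBAR_MEV_PS:.3e} MeV")
    print(f"delta_m * tau = {delta_m_b * tau_b:.2f}")
    for t in (0.0, 2.0, 4.0, 6.0, 8.0):
        print(f"t = {t:.0f} ps: A = {asymmetry(t, 1.0 / tau_b, delta_m_b):+.3f}")
```

With these inputs ΔM<sub>B</sub> τ is a little below one, so roughly one oscillation period fits into the first eight or nine lifetimes; this is why the oscillation is measurable but not many periods are visible.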

B<sup>0</sup>–B̄<sup>0</sup> mixing can be detected by producing B<sup>0</sup>B̄<sup>0</sup> pairs, for example in an e<sup>+</sup>e<sup>−</sup> collider, and observing their semileptonic decays. At the quark level,

$$\begin{aligned} b &\rightarrow c + W^- \rightarrow c + \mu^- \bar{\nu}\_{\mu} \\ \bar{b} &\rightarrow \bar{c} + W^+ \rightarrow \bar{c} + \mu^+ \nu\_{\mu} \end{aligned}$$

and one signature for B<sup>0</sup>–B̄<sup>0</sup> oscillations is the observation of like-sign muon pairs: μ<sup>±</sup>μ<sup>±</sup>. Figure 10.5 shows the fraction of all muon pairs that have the same sign,

$$F = \frac{N^{++} + N^{--}}{N^{++} + N^{--} + N^{+-}}$$

versus proper time in the B<sup>0</sup> system measured by the DELPHI experiment at the LEP e<sup>+</sup>e<sup>−</sup> collider. There is clearly a significant, time-dependent excess of like-sign pairs, which is attributable to B<sup>0</sup>–B̄<sup>0</sup> mixing. It can be deduced from these data that ΔM<sub>B<sub>d</sub></sub> ∼ 3 × 10<sup>−10</sup> MeV and ΔM<sub>B<sub>d</sub></sub> τ ∼ 0.7.<sup>14</sup>

<sup>14</sup>The early observation of this mixing was a surprise because it was only explicable in the Standard Model if the mass of the top quark was very large, and this measurement gave the first indication that the top quark would be so heavy. Here again, we see how precision low-energy measurements can give access to physics at much higher mass scales via the loop diagrams.

**Fig. 10.5** Fraction of like-sign muons due to B<sup>0</sup>–B̄<sup>0</sup> mixing versus proper time in the B<sup>0</sup> system observed by the DELPHI experiment at LEP [72]. The curve is the result of a prediction with ΔM<sub>B<sub>d</sub></sub> = 0.480 ps<sup>−1</sup> (3.159 × 10<sup>−10</sup> MeV).

## **10.4.4 *B*<sub>s</sub> oscillations**

The study of the neutral B<sub>s</sub> system has only recently become possible, because the best place to study it is in high-energy hadron colliders such as the LHC (and the Tevatron before that). The LHCb detector at the LHC is a special facility for studying B mesons; Section 10.7.3 describes many of the general features of the LHCb detector.

To study B<sub>s</sub> oscillations, we first need to identify the flavour of the neutral B hadrons at their decay to know if they are either B<sup>0</sup><sub>s</sub> or B̄<sup>0</sup><sub>s</sub>. Second, we use the fact that the B hadrons arise from bb̄ production to infer the flavour that the signal B hadron had at creation from the flavour of the other B hadron in the event (this is called opposite-side tagging). This can be achieved by several algorithms including the charge of the leptons (e or μ)<sup>15</sup> from semileptonic decays.<sup>16</sup> Same-side tagging (SST) can also be used to identify the flavour of the B<sup>0</sup><sub>s</sub>.<sup>17</sup> The B<sup>0</sup><sub>s</sub> are identified by a flavour-specific decay mode and the D<sup>−</sup><sub>s</sub> are identified by decay modes such as D<sup>−</sup><sub>s</sub> → φπ<sup>−</sup> with φ → K<sup>+</sup>K<sup>−</sup>. The very good K/π separation provided by the RICH detectors reduces the combinatorial backgrounds in the mass reconstruction. The pion charge identifies the flavour of the B<sup>0</sup><sub>s</sub>.

An event with a B<sup>0</sup><sub>s</sub> and a B̄<sup>0</sup><sub>s</sub> would be identified as non-mixed, whereas an event with two B<sup>0</sup><sub>s</sub> or two B̄<sup>0</sup><sub>s</sub> hadrons would arise from mixing. Finally, we need to measure the decay time. The B<sub>s</sub> oscillations are very rapid, so excellent decay time resolution is essential. The oscillation probability is similar to that derived for kaon oscillations (eqn 10.17) after allowing for the effects of the finite experimental resolution and the probability of a 'mistag'.<sup>18</sup> We have the characteristic oscillation term ± cos(ΔM<sub>B<sub>s</sub></sub> t). In line with eqn 10.21, the sign is positive for the case where the flavour is the same at production and decay and negative for mixed flavour at production and decay. The characteristic oscillations are clearly seen in the data [104] shown in Fig. 10.6.

<sup>15</sup>The leptons in this analysis are restricted to e or μ.

<sup>16</sup>b → c l<sup>−</sup> ν̄<sub>l</sub>, so a negative (positive) lepton arises from the decay of a b (b̄).

<sup>17</sup>From conservation of strangeness, the B<sup>0</sup><sub>s</sub> must be produced in association with an s̄ quark. If this s̄ hadronizes to form a charged kaon, the sign of that kaon identifies the flavour of the B<sup>0</sup><sub>s</sub>.

<sup>18</sup>A mistag occurs when the tagging algorithm arrives at the wrong conclusion. The effect is to dilute the magnitude but not the period of the observed oscillations.

## **10.4.5 Regeneration**

A further interesting phenomenon related to oscillations, consequent on the very different decay widths Γ<sub>1</sub> and Γ<sub>2</sub> in the kaon system, is regeneration (the discovery paper is [61]). It is an experimental consideration that must be controlled carefully in order to measure CP violation, as we will see in the next section. Regeneration occurs when we make<sup>19</sup> a beam of pure K<sub>2</sub> and then let it interact via the strong interaction by hitting some material. Because the strong interaction is involved, the K<sup>0</sup> and K̄<sup>0</sup> components of the K<sub>2</sub> must be considered. K<sup>0</sup> and K̄<sup>0</sup> interact differently in nuclear matter. The K̄<sup>0</sup> cross-section is greater than the K<sup>0</sup> cross-section because K̄<sup>0</sup> has a d̄ antiquark, which can annihilate with a d quark in a nucleon to produce hyperons (e.g. K̄<sup>0</sup> + p → Λ<sup>0</sup> + π<sup>+</sup>). There is no equivalent interaction for the K<sup>0</sup>. Regeneration is not the effect of a single collision with the material; it is a quantum-coherent effect in which the amplitudes of interactions with many nuclei in the material all sum together. The kaon is not deviated in its path and does not suffer any energy loss.

<sup>19</sup>By making a beam of neutral kaons and letting the K<sub>1</sub> component decay away in flight.

The consequence of this is that after a K<sub>2</sub> beam passes through an absorber, the amounts of K<sup>0</sup> and K̄<sup>0</sup> will change and the beam will no longer be pure K<sub>2</sub>; it will contain some amount of K<sub>1</sub>. K<sub>1</sub> → ππ decays will be observed again after the absorber. This phenomenon is called regeneration. Figure 10.8 shows data from the KTeV experiment at Fermilab,<sup>20</sup> which is an example of the regeneration effect.

<sup>20</sup>To be described in Section 10.5.3.

Regeneration measurements allow the sign of ΔM to be determined. It is found that ΔM = M<sub>2</sub> − M<sub>1</sub> = 3.483 × 10<sup>−12</sup> MeV, i.e. that K<sub>2</sub> is heavier than K<sub>1</sub>.

## **10.5** *CP* **violation (part 1)**

In addition to the interesting oscillation effects described up to now, the neutral meson systems also exhibit the fundamental effect of violation of CP symmetry. This effect has now been seen in a wide range of places, including decays of charged mesons. There are three main ways in which CP violation can be observed, which are listed in Table 10.2 for future reference. There is now strong evidence that the CP violation we see is due to a mechanism proposed by Kobayashi and Maskawa connected with the CKM matrix. We will follow the historical route in our description here and discuss the first experimental evidence for CP violation, which came from the kaon system, then the Kobayashi–Maskawa mechanism, then briefly other CP-violating effects with kaons, which were small and required high-precision experiments. There then followed an extensive period, which is still in progress, where strong CP-violating effects in B-meson systems became experimentally accessible, leading to tight constraints on the parameters in the CKM matrix.

## **10.5.1 Discovery of** *CP* **violation**

The first experimental evidence of CP violation came in 1964 when long-lived neutral kaons were observed to decay to two pions [61]. The result of this experiment, which is entirely consistent with what is known today, was completely unexpected at the time: the decay K<sub>2</sub> → 2π was expected to be strictly forbidden by CP conservation, as described above. Figure 10.7 shows data from a more recent experiment (NA31 [112]) with high statistics and shows the number of K<sup>0</sup> → π<sup>+</sup>π<sup>−</sup> decays versus time. The fast exponential component is from the CP-conserving decay K<sub>1</sub> → 2π; the constant component that remains after t ∼ 15τ<sub>1</sub> shows that CP = +1 π<sup>+</sup>π<sup>−</sup> states appear to be produced in a region where only CP = −1 K<sub>2</sub> states should exist. Shortly after the discovery of CP violation in π<sup>+</sup>π<sup>−</sup> final states, separate experiments found π<sup>0</sup>π<sup>0</sup> states with invariant masses consistent with being kaons produced in the region where only the K<sub>2</sub> (i.e. no K<sub>1</sub>) should exist. Both of these effects are violations of CP symmetry.

At this point, the question that arises when thinking of the processes as Feynman diagrams is: Where is the CP violation happening? It could be either in the Feynman diagrams representing the mixing or in those representing the decay<sup>21</sup> (or both). The answer, it turns out, is that it is mainly in the mixing in the kaon system.

<sup>21</sup>Types (a) or (b) in Table 10.2.


**Table 10.2** Main ways of observing CP violation.

To incorporate the effect of CP violation into the mixing formalism that we have been developing, we introduce two more state labels, K<sub>S</sub> and K<sub>L</sub>, to represent the physical states that decay with the short and long lifetimes, respectively. If CP were conserved, |K<sub>S</sub>⟩ ≡ |K<sub>1</sub>⟩ and |K<sub>L</sub>⟩ ≡ |K<sub>2</sub>⟩, but to include CP violation, we maintain the definitions of K<sub>1</sub> and K<sub>2</sub> as CP eigenstates, defined by eqn 10.1.<sup>22</sup> The K<sub>L</sub> state represents the component in the beam once all the short-lived kaons have decayed away, which is mostly the CP = −1 K<sub>2</sub>, but includes an admixture of a K<sub>1</sub> component, which is what produces the ππ decays. In the other meson systems, the nomenclature is slightly different, although the concept is exactly the same. Since the B mesons all have many decay modes, the lifetimes are not very different, and so the states are labelled based on the mass splitting as light B<sub>L</sub> and heavy B<sub>H</sub>.<sup>23</sup> Returning to Fig. 10.7, the level of CP violation can be characterized by measuring the ratio of decay amplitudes η<sub>+−</sub> = A(K<sub>L</sub> → π<sup>+</sup>π<sup>−</sup>)/A(K<sub>S</sub> → π<sup>+</sup>π<sup>−</sup>), so that the ratio of decay rates is |η<sub>+−</sub>|<sup>2</sup>; the current average of measurements is |η<sub>+−</sub>| = (2.232 ± 0.011) × 10<sup>−3</sup>.

<sup>22</sup>So the formalism we have used up to now is still valid.

<sup>23</sup>It turns out that the K<sub>L</sub> is the heavier of the two kaon states, so unfortunately the names B<sub>L</sub> and K<sub>L</sub> do not correspond to each other.

**Fig. 10.7** Example of CP violation in K → π<sup>+</sup>π<sup>−</sup> decays from NA31 [112]. The data are the number of K → π<sup>+</sup>π<sup>−</sup> decays as a function of time. Below 10 lifetimes, the main feature is CP-conserving K<sub>S</sub> decay. Above about 15 lifetimes, the CP violation effect is visible, because the decay rate does not keep falling exponentially. (Interference is possible between these two ways in which the kaons can decay. The inset shows the difference between the data and a fit without the interference. A fit with interference nicely follows the oscillations of the data.)

## **10.5.2 Semileptonic charge asymmetry**

The next indication of CP violation was an observation of an asymmetry in the semileptonic decays of the K<sub>L</sub>:

$$A\_{\rm L} = \frac{\Gamma(K\_{\rm L} \to \pi^{-} l^{+} \nu) - \Gamma(K\_{\rm L} \to \pi^{+} l^{-} \bar{\nu})}{\Gamma(K\_{\rm L} \to \pi^{-} l^{+} \nu) + \Gamma(K\_{\rm L} \to \pi^{+} l^{-} \bar{\nu})} \,. \tag{10.22}$$

The choice of measuring a ratio like this minimizes experimental systematic effects; nevertheless, the experimental set-up needs to have as little material as possible in the path of the particles, because the interaction cross sections of π<sup>+</sup> and π<sup>−</sup> are different<sup>24</sup> and could affect the measurement if the amount of material were too large. The semileptonic asymmetry A<sub>L</sub> is measured to be (3.32 ± 0.06) × 10<sup>−3</sup>. If CP were conserved, it would be zero. The fact that this asymmetry is present demonstrates that the CP violation is occurring in the mixing of the kaons rather than in their decay. Another way of saying this is that this measurement shows that ⟨K<sub>L</sub>|K<sub>S</sub>⟩ ≠ 0.

<sup>24</sup>For similar reasons to those accounting for the difference in K<sup>0</sup> and K̄<sup>0</sup> interaction cross sections in Section 10.4.5.

We can formulate the combined effects of mixing and CP violation using the relations in eqns 10.1 between the K<sub>1</sub>, K<sub>2</sub> and the K<sup>0</sup>, K̄<sup>0</sup>, and then expressing K<sub>S</sub> and K<sub>L</sub> in terms of K<sup>0</sup> and K̄<sup>0</sup> as follows:

$$\begin{aligned} \vert K\_{\rm S} \rangle &= p \vert K^{0} \rangle + q \vert \bar{K}^{0} \rangle\\ \vert K\_{\rm L} \rangle &= p \vert K^{0} \rangle - q \vert \bar{K}^{0} \rangle \end{aligned} \tag{10.23}$$

where p and q are complex coefficients with |p|<sup>2</sup> + |q|<sup>2</sup> = 1. The overlap of the two states is ⟨K<sub>L</sub>|K<sub>S</sub>⟩ = |p|<sup>2</sup> − |q|<sup>2</sup>, which is zero if CP is conserved. Considering the charge asymmetry A<sub>L</sub> again, let the amplitude for K<sup>0</sup> → π<sup>−</sup>l<sup>+</sup>ν be f, so the amplitude for K̄<sup>0</sup> → π<sup>+</sup>l<sup>−</sup>ν̄ is f<sup>∗</sup>. Since only the K<sup>0</sup> component of the K<sub>L</sub> contributes to π<sup>−</sup>l<sup>+</sup>ν and only the K̄<sup>0</sup> component to π<sup>+</sup>l<sup>−</sup>ν̄, the two rates are proportional to |p|<sup>2</sup>|f|<sup>2</sup> and |q|<sup>2</sup>|f|<sup>2</sup>. It is now possible to show that A<sub>L</sub> = |p|<sup>2</sup> − |q|<sup>2</sup> and therefore that the non-zero measured value of A<sub>L</sub> indicates that the K<sub>S</sub> and K<sub>L</sub> are not orthogonal to each other, and hence that CP violation is occurring in the mixing.

An alternative way to write the effects of mixing and CP violation is to define K<sub>1</sub> and K<sub>2</sub> in terms of K<sup>0</sup> and K̄<sup>0</sup> with eqns 10.1 as before, and then define K<sub>S</sub> and K<sub>L</sub> in terms of K<sub>1</sub> and K<sub>2</sub> with a small complex impurity parameter ε̃ as

$$|K\_{\rm S}\rangle = \frac{|K\_1\rangle + \tilde{\epsilon}|K\_2\rangle}{\sqrt{1 + |\tilde{\epsilon}|^2}}\tag{10.24}$$

$$|K\_{\rm L}\rangle = \frac{|K\_2\rangle + \tilde{\epsilon}|K\_1\rangle}{\sqrt{1 + |\tilde{\epsilon}|^2}}\tag{10.25}$$

With this definition, p/q = (1 + ε̃)/(1 − ε̃), A<sub>L</sub> = 2 Re(ε̃)/(1 + |ε̃|<sup>2</sup>) ∼ 2 Re(ε̃), and η<sub>+−</sub> = ε̃. From 1964 until around 1999, all the CP-violating effects measured could be characterized by the single parameter ε̃.

Theories were proposed to explain CP violation when it was first discovered. In particular, a new ΔS = 2 'superweak' interaction specific to the kaon system was considered. The CP violation in the kaon system was studied extensively over nearly forty years before evidence for CP violation in B<sup>0</sup> systems, which will be discussed shortly, was found. The next sections will outline how CP violation can be accommodated in the Standard Model by introducing a complex phase into the 3 × 3 CKM quark mixing matrix (see Chapter 7).<sup>25</sup>

<sup>25</sup>This requires a third generation and it is interesting to note that this was proposed after CP violation had been observed but long before either of the third-generation quarks had been discovered.
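The relations of Section 10.5.2 between ε̃, p, q, and A<sub>L</sub> can be checked numerically. The following is a minimal sketch, assuming that eqns 10.1 have the usual form K<sub>1,2</sub> = (K<sup>0</sup> ± K̄<sup>0</sup>)/√2; the function names and the imaginary part of the illustrative ε̃ are hypothetical choices made only for this example.

```python
def mixing_coefficients(eps):
    """Return normalized (p, q) for K_S = p|K0> + q|K0bar>, given the impurity eps.

    Assumes K1 = (K0 + K0bar)/sqrt(2) and K2 = (K0 - K0bar)/sqrt(2).
    """
    norm = (2.0 * (1.0 + abs(eps) ** 2)) ** 0.5
    return (1 + eps) / norm, (1 - eps) / norm


def charge_asymmetry(eps):
    """A_L = |p|^2 - |q|^2, which is also the overlap <K_L|K_S>."""
    p, q = mixing_coefficients(eps)
    return abs(p) ** 2 - abs(q) ** 2


# Illustrative value only: the real part is chosen to be close to half the
# measured A_L; the imaginary part is a placeholder for this example.
EPS = complex(1.66e-3, 1.5e-3)

if __name__ == "__main__":
    p, q = mixing_coefficients(EPS)
    print("|p|^2 + |q|^2 =", abs(p) ** 2 + abs(q) ** 2)
    print("A_L (exact)   =", charge_asymmetry(EPS))
    print("2 Re(eps)     =", 2 * EPS.real)
```

Because |ε̃|<sup>2</sup> is of order 10<sup>−6</sup>, the exact expression and the approximation 2 Re(ε̃) agree far better than the experimental uncertainty on A<sub>L</sub>.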

## **10.5.3** *CP* **violation in** *K***<sup>0</sup> decay**

One way to gain insight into the mechanism causing CP violation is to look for CP violation in the decay of particles; it was first observed in the decay of K<sub>L</sub> particles. If all the CP violation were occurring in the mixing, then the decays of K<sub>L</sub> to ππ should have exactly the same features as the decays of K<sub>S</sub> to ππ, because, in both cases, it is just K<sub>1</sub> decaying. In particular, the ratio of decays to π<sup>0</sup>π<sup>0</sup>/π<sup>+</sup>π<sup>−</sup> should be the same for both K<sub>L</sub> and K<sub>S</sub>. There was an extensive programme of experimental research to measure this, and the results were expressed in terms of the double ratio R = |η<sub>00</sub>|<sup>2</sup>/|η<sub>+−</sub>|<sup>2</sup>, where

$$\eta\_{00} = \frac{A(K\_{\rm L} \to \pi^0 \pi^0)}{A(K\_{\rm S} \to \pi^0 \pi^0)}, \qquad \eta\_{+-} = \frac{A(K\_{\rm L} \to \pi^+ \pi^-)}{A(K\_{\rm S} \to \pi^+ \pi^-)}$$

and A(K → ...) represents the amplitudes of the decays. Unfortunately, most of the K → ππ decays produce the pions in a single isospin state (I = 0), and so, from isospin symmetry (see Section 5.2.1), the ratio π<sup>0</sup>π<sup>0</sup>/π<sup>+</sup>π<sup>−</sup> has the fixed value of 1/2 no matter what the production mechanism of the ππ state is. However, a small fraction of the ππ pairs are made in an isospin I = 2 state, which has a different value of the π<sup>0</sup>π<sup>0</sup>/π<sup>+</sup>π<sup>−</sup> ratio, and so it is possible to use the double ratio R to detect CP violation in decays; the effect is very small, however.
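The two values of the π<sup>0</sup>π<sup>0</sup>/π<sup>+</sup>π<sup>−</sup> ratio follow from the Clebsch–Gordan decomposition of two isospin-1 pions. As a sketch (the overall signs depend on the phase convention chosen for the pion states, which is an assumption here and does not affect the squared coefficients):

$$|I=0\rangle = \sqrt{\tfrac{1}{3}}\left(|\pi^+\pi^-\rangle + |\pi^-\pi^+\rangle - |\pi^0\pi^0\rangle\right), \qquad |I=2\rangle = \sqrt{\tfrac{1}{6}}\left(|\pi^+\pi^-\rangle + |\pi^-\pi^+\rangle\right) + \sqrt{\tfrac{2}{3}}\,|\pi^0\pi^0\rangle$$

Squaring the coefficients gives π<sup>0</sup>π<sup>0</sup>/π<sup>+</sup>π<sup>−</sup> = (1/3)/(2/3) = 1/2 for I = 0, and (2/3)/(1/3) = 2 for I = 2, so only the small I = 2 admixture can make the ratio differ between K<sub>L</sub> and K<sub>S</sub>.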

Experimentally, this double ratio was a good quantity to measure since it allowed tricks to cancel systematic errors, for example the use of the same detectors to measure the K<sub>L</sub> and K<sub>S</sub>. The NA31 experiment at CERN took data alternately in K<sub>L</sub> mode (where the target was far upstream to allow the K<sub>S</sub> to decay before reaching the experiment) and K<sub>S</sub> mode (where a target close to the experiment, with a far less intense proton beam, was used; the K<sub>S</sub> target was moved on rails to different positions to reproduce a decay distribution similar to the almost-flat K<sub>L</sub> beam to minimize acceptance differences). The KTeV experiment at Fermilab [93] used two simultaneous K<sub>L</sub> beams<sup>26</sup> with an absorber in one of them to regenerate a K<sub>S</sub> beam (its predecessor, the E731 experiment [116], used a similar technique). Figure 10.8 shows the reconstructed decay position distribution along the beam direction z and illustrates the difference in decay distributions from the K<sub>L</sub> and K<sub>S</sub>. It also illustrates the phenomenon of kaon regeneration. NA48 (a successor to NA31) had both a K<sub>L</sub> and a K<sub>S</sub> beam, produced from separate proton beams. Both KTeV and NA48 allowed simultaneous measurement of K<sub>L</sub> and K<sub>S</sub> to remove any small time-varying systematic effects in the detectors. The layout of NA48 is shown in Fig. 10.9.

<sup>26</sup>In reality one big beam made from a target at z = 0 that was collimated into two beams side by side. The regenerator beam was attenuated to avoid a huge rate of decays, which is why no K<sub>L</sub> is visible in that beam.

**Fig. 10.8** Distribution of decay positions z of π+π<sup>−</sup> events along the beam direction in the KTeV experiment [93], which used two beams of K<sup>L</sup> and a regenerator (a block of material) situated at z = 125 m in one of them to produce K<sup>S</sup> particles. The regeneration effect is clearly visible and the difference in the decay distributions due to the different K<sup>S</sup> and K<sup>L</sup> lifetimes is very apparent. The acceptance of the detector varies slowly with z, which produces the nonflat K<sup>L</sup> distribution. Decays between z = 110 and 158 m are used in the analysis.

**Fig. 10.9** Layout of the NA48 experiment at CERN to measure CP violation in kaons. By comparison of the components in the detector with a collider experiment, the detector components are really very similar, just rolled out in a line rather than in a cylinder, so that the particles encounter the tracking + magnet first, then the calorimetry (electromagnetic, then hadronic), and finally the muon detectors. The kaon experiments are very long and narrow, reflecting the large Lorentz boost to which the ∼100 GeV kaons are subjected. From [113].

All four of these experiments had extremely high-resolution electromagnetic calorimeters<sup>27</sup> capable of measuring high rates of particles, which were vital to separate the π<sup>0</sup>π<sup>0</sup> decays from the far more numerous CP-conserving K<sub>L</sub> → π<sup>0</sup>π<sup>0</sup>π<sup>0</sup> decays. The π<sup>+</sup>π<sup>−</sup> decays were measured with a spectrometer (magnet + drift chambers) and backgrounds from three-body decays (π<sup>+</sup>π<sup>−</sup>π<sup>0</sup>, π<sup>±</sup>e<sup>∓</sup>ν, and π<sup>±</sup>μ<sup>∓</sup>ν) were removed by (a) looking at the transverse momentum p<sub>T</sub> distribution of the events (most two-particle events that are background have larger p<sub>T</sub>, indicating another particle that was missed), (b) checking that the momentum in the spectrometer p was inconsistent with the energy in the electromagnetic calorimeter E (since E/p is close to 1 for electrons), and (c) checking that the reconstructed invariant mass of the two tracks was close to m<sub>K</sub>. The experiments also had sophisticated multilevel triggers.

<sup>27</sup>Liquid argon and lead for NA31, lead glass for E731, caesium iodide crystals for KTeV, and liquid krypton for NA48.

The combined result of all these experiments is R = 0.9899 ± 0.0012. Since this is not consistent with R = 1, this is evidence for CP violation in the decay of the K<sub>L</sub> particle, or, equivalently, it shows that the K<sub>2</sub> state can decay to two pions directly. This excluded the superweak interpretation of CP violation and was consistent with the CKM model of CP violation, to be described next. To incorporate this into the formalism above, the parameter ε̃ is replaced by two parameters<sup>28</sup> ε and ε′, where ε ∼ 2 × 10<sup>−3</sup> characterizes the CP violation in the mixing and ε′ ∼ 3 × 10<sup>−6</sup> characterizes CP violation in the decay (type (b) in Table 10.2).

<sup>28</sup>It can be shown that η<sub>+−</sub> = ε + ε′ (≈ ε̃), η<sub>00</sub> = ε − 2ε′, and R = 1 − 6 Re(ε′/ε).
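The relation R = 1 − 6 Re(ε′/ε) can be inverted to see what the combined double ratio implies. The sketch below does this with simple linear error propagation and also compares the first-order formula with the exact ratio |η<sub>00</sub>|<sup>2</sup>/|η<sub>+−</sub>|<sup>2</sup>; the function names and the complex test values in the usage are illustrative assumptions, not fitted results.

```python
def re_eps_prime_over_eps(r, sigma_r):
    """Invert R = 1 - 6 Re(eps'/eps); return (value, uncertainty)."""
    return (1.0 - r) / 6.0, sigma_r / 6.0


def double_ratio(eps, eps_prime):
    """Exact R = |eta_00|^2 / |eta_+-|^2 with eta_00 = eps - 2 eps', eta_+- = eps + eps'."""
    return abs(eps - 2 * eps_prime) ** 2 / abs(eps + eps_prime) ** 2


if __name__ == "__main__":
    value, sigma = re_eps_prime_over_eps(0.9899, 0.0012)
    print(f"Re(eps'/eps) = {value:.2e} +/- {sigma:.1e}")

    # Hypothetical complex inputs, used only to compare exact and first-order R.
    eps = complex(1.6e-3, 1.5e-3)
    eps_prime = value * eps
    print("exact R      =", double_ratio(eps, eps_prime))
    print("first order  =", 1.0 - 6.0 * value)
```

The result, Re(ε′/ε) of order 1.7 × 10<sup>−3</sup>, is consistent with the quoted sizes ε ∼ 2 × 10<sup>−3</sup> and ε′ ∼ 3 × 10<sup>−6</sup>.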

## **10.6** *CP* **violation in the Standard Model**

The CKM matrix was introduced in Section 7.3.3 to explain the rotation between quark states |d⟩, |s⟩, |b⟩ (flavour eigenstates) produced in strong interactions and the |d′⟩, |s′⟩, |b′⟩ states that couple with the W boson. The CKM matrix elements are needed in weak decays involving quarks. As also mentioned in Section 7.3.3, Kobayashi and Maskawa found that by extending the matrix to be a 3 × 3 matrix, they were able to insert a non-trivial complex phase into it. The presence of the phase, which appears in the transition amplitudes, can cause T violation, since

$$T(\mathbf{e}^{-\mathrm{i}Et+\delta}) \to \mathbf{e}^{\mathrm{i}Et+\delta}$$

and hence, via the CPT theorem, CP violation is expected.

This does not work with a 2 × 2 matrix, for which the unitarity condition ensures that all complex phases can be removed without affecting any observables, and so the discovery of CP violation along with the work of Kobayashi and Maskawa was an early indication of the third generation of quarks. Products of the CKM elements appear in the meson decay amplitudes, and the phase from the CKM matrix accounts for the CP-violating effects seen so far. The formalism allows insights into the likely magnitude of CP-violating effects in other processes by examining the relation between the elements.
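The statement that a phase survives for three generations but not for two can be illustrated with the standard parameter-counting argument, which is not spelled out in the text and is added here as a sketch: an n × n unitary matrix has n<sup>2</sup> real parameters, of which 2n − 1 are relative quark-field phases that can be absorbed and n(n − 1)/2 are rotation angles. The function name is a hypothetical choice for this example.

```python
def mixing_matrix_parameters(n):
    """Count (angles, physical phases) of an n x n unitary quark mixing matrix."""
    total = n * n                 # real parameters of a unitary matrix
    removable = 2 * n - 1         # relative phases absorbed into the quark fields
    angles = n * (n - 1) // 2     # rotation (mixing) angles
    phases = total - removable - angles
    return angles, phases


if __name__ == "__main__":
    for generations in (2, 3, 4):
        angles, phases = mixing_matrix_parameters(generations)
        print(f"{generations} generations: {angles} angle(s), {phases} phase(s)")
```

For two generations the count gives one angle (the Cabibbo angle) and no phase; for three it gives three angles and the single CP-violating phase δ.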

The unitarity of the CKM matrix **V** requires that **V**<sup>†</sup>**V** = 1. In terms of the individual elements, this<sup>29</sup> gives nine relationships:

$$\sum\_{i=1,3} |V\_{ij}|^2 = 1 \qquad (j = 1,2,3) \tag{10.26}$$

$$\sum\_{i=1,3} V\_{ji} V\_{ki}^\* = \sum\_{i=1,3} V\_{ij} V\_{ik}^\* = 0 \qquad (j,k = 1,2,3; \ j \neq k) \tag{10.27}$$

<sup>29</sup>**V**<sup>†</sup> = (**V**<sup>∗</sup>)<sup>T</sup>.

Each of the six relations of eqn 10.27 is a sum of three complex numbers equal to zero, and so forms a 'unitarity' triangle in the complex plane. It can be shown that

$$\left| \mathrm{Im} (V\_{km}^\* V\_{lm} V\_{kn} V\_{ln}^\*) \right| = \left| \mathrm{Im} (V\_{mk}^\* V\_{ml} V\_{nk} V\_{nl}^\*) \right| = J \tag{10.28}$$

irrespective of k, l, m, n, and all six triangles have the same area, A = J/2, independent of any phase convention. J is known as the Jarlskog invariant.
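Eqn 10.28 and the equal-area statement can be verified numerically. The sketch below builds a unitary matrix in the standard three-angle, one-phase parametrization (an assumption; the text does not fix a parametrization) with illustrative input values of roughly the observed size, then evaluates J for several index choices and the area of two different triangles. All function names are hypothetical.

```python
import cmath
import math


def ckm_standard(s12, s23, s13, delta):
    """3x3 unitary matrix in the standard three-angle, one-phase parametrization."""
    c12, c23, c13 = (math.sqrt(1.0 - s * s) for s in (s12, s23, s13))
    e = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]


def jarlskog(v, k, l, m, n):
    """|Im(V*_km V_lm V_kn V*_ln)| for rows k != l and columns m != n (eqn 10.28)."""
    product = complex(v[k][m]).conjugate() * v[l][m] * v[k][n] * complex(v[l][n]).conjugate()
    return abs(product.imag)


def triangle_area(v, j, k):
    """Area of the triangle with sides V_ij V*_ik for i = u, c, t (columns j != k)."""
    a = v[0][j] * complex(v[0][k]).conjugate()
    b = v[1][j] * complex(v[1][k]).conjugate()
    return 0.5 * abs((a.conjugate() * b).imag)


# Illustrative inputs only (assumed values, not a fit to data).
V = ckm_standard(s12=0.225, s23=0.042, s13=0.0037, delta=1.2)

if __name__ == "__main__":
    print("J (rows u,c; cols d,s):", jarlskog(V, 0, 1, 0, 1))
    print("J (rows c,t; cols s,b):", jarlskog(V, 1, 2, 1, 2))
    print("2 x area, d-b triangle:", 2 * triangle_area(V, 0, 2))
    print("2 x area, d-s triangle:", 2 * triangle_area(V, 0, 1))
```

The d-b triangle here is the one with sides V<sub>ud</sub>V<sup>∗</sup><sub>ub</sub>, V<sub>cd</sub>V<sup>∗</sup><sub>cb</sub>, and V<sub>td</sub>V<sup>∗</sup><sub>tb</sub>; its shape differs greatly from the d-s triangle even though the two areas agree.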

Figure 10.10 shows one of the triangles involving the CKM elements responsible for strange and charm (D-meson) decays and one responsible for B decays. The angles of the triangle in Fig. 10.10(b), which represents the unitarity constraint V<sub>ud</sub>V<sup>∗</sup><sub>ub</sub> + V<sub>cd</sub>V<sup>∗</sup><sub>cb</sub> + V<sub>td</sub>V<sup>∗</sup><sub>tb</sub> = 0, are large. We will see in Section 10.7.1 that the angles in the unitarity triangle determine the magnitude of the CP-violating effects. Therefore, the large angles in the triangle describing B<sup>0</sup> decays correspond to large CP-violating effects. Conversely, the angles in the triangle describing K<sup>0</sup> and D<sup>0</sup> decays are small. The CKM ansatz therefore predicts large CP violation in B-meson systems but very little in K<sup>0</sup> and D<sup>0</sup> systems. In fact, any CP violation in D<sup>0</sup> systems is expected to be almost negligibly small.<sup>30</sup> Any CP violation in D<sup>0</sup> decays at a level of more than a few parts in a thousand would be strong evidence for new physics.

<sup>30</sup>The reason for this is that the mixing and decays of K<sup>0</sup> and D<sup>0</sup> involve mostly the first two quark generations; CP violation is observable in the kaon system because of the long lifetime of K<sub>L</sub>.

A second way to explore whether phenomena beyond the CKM explanation of CP violation exist is to make accurate measurements of the structure of the CKM matrix. By making separate measurements of the angles α, β, and γ shown in Fig. 10.10, and the lengths of the sides of the triangle, we can check whether they are all consistent. Inconsistency could indicate, for example, that the 3 × 3 matrix is just a part of a larger matrix, or that the CKM formalism is not correct.

To find processes in which CP-violating effects could be present, two things are needed:

(a) all three quark generations must be able to take part in the process;

(b) several routes must lead to the same final state, so that their amplitudes interfere.

**Fig. 10.10** Unitarity triangles responsible for (a) strange and charm decays and (b) B decays. The angles α, β, and γ are also known as φ<sub>2</sub>, φ<sub>1</sub>, and φ<sub>3</sub>, respectively.

**Fig. 10.11** Two Feynman diagrams for K<sup>L</sup> decays to two pions. The interference between diagram (a) and diagram (b) (called a 'penguin diagram') generates 'direct' CP violation (i.e. there is CP violation without mixing).

<sup>31</sup>The parameters p and q can have different values in each of the meson systems, although, from unitarity, |p| <sup>2</sup> + |q| <sup>2</sup> = 1 in each case.

The CP violation in the mixing, proceeding via diagrams in Fig. 10.1, satisfies these criteria because there are several such diagrams and some of them involve the t quark inside the loop, and so some of the diagrams involve all three generations. Kaon decay can occur by diagrams of the form shown in Fig. 10.11; the more complicated 'penguin' diagrams, which can contain quarks from the third generation in the loop, provide a mechanism for CP violation in the decay.

## **10.6.1 Mixing with** *CP* **violation**

We look again at the mixing in meson decays and derive expressions for the time variation of the states, but now including CP violation as included by eqns 10.23. We have previously used symbols related to the kaons, even though many expressions have been valid for all the meson systems.<sup>31</sup> For variety (and because we will use the formulae to describe B-meson physics next), we use the symbols for the B mesons here. This involves the substitution from (K<sup>0</sup>, K¯ <sup>0</sup>, K1, K2, KS, KL) to (B<sup>0</sup>, B¯<sup>0</sup>, B1, B2, BL, BH). From eqns 10.23, the light and heavy eigenstates written with B symbols are

$$\begin{aligned} \vert B\_{\rm L} \rangle &= p \vert B^0 \rangle + q \vert \bar{B}^0 \rangle\\ \vert B\_{\rm H} \rangle &= p \vert B^0 \rangle - q \vert \bar{B}^0 \rangle \end{aligned} \tag{10.29}$$

Inverting these, we obtain

$$\begin{aligned} \left| B^0 \right> &= \frac{\left| B\_{\rm L} \right> + \left| B\_{\rm H} \right>}{2p} \\ \left| \bar{B}^0 \right> &= \frac{\left| B\_{\rm L} \right> - \left| B\_{\rm H} \right>}{2q} \end{aligned} \tag{10.30}$$

Equations 10.30 can be used to write the time evolution for a state |ψ⟩ that was created as a B<sup>0</sup> at time t = 0:

$$|\psi(t)\rangle = \frac{\mathbf{e}^{-\Gamma\_B t/2} \mathbf{e}^{-\mathrm{i}Mt}}{2p} \left( \mathbf{e}^{\mathrm{i}\Delta Mt/2} |B\_\mathrm{L}\rangle + \mathbf{e}^{-\mathrm{i}\Delta Mt/2} |B\_\mathrm{H}\rangle \right) \tag{10.31}$$

We have used the (good) approximation that the B<sup>L</sup> and B<sup>H</sup> have the same lifetime here. Using eqn 10.29, we can express this as

$$|\psi(t)\rangle = \frac{\mathbf{e}^{-\Gamma\_B t/2} \mathbf{e}^{-\mathrm{i}Mt}}{2p} \left[ \mathbf{e}^{\mathrm{i}\Delta M t/2} \left( p|B^0\rangle + q|\bar{B}^0\rangle \right) + \mathbf{e}^{-\mathrm{i}\Delta M t/2} \left( p|B^0\rangle - q|\bar{B}^0\rangle \right) \right] \tag{10.32}$$

$${} = \mathbf{e}^{-\Gamma\_B t/2} \mathbf{e}^{-\mathrm{i}Mt} \left[ \cos(\Delta M t/2) \left| B^0 \right> + \mathrm{i} \frac{q}{p} \sin(\Delta M t/2) \left| \bar{B}^0 \right> \right] \tag{10.33}$$

We can now evaluate the probability for a state that was initially a B<sup>0</sup> to be found as a B<sup>0</sup> at time t and similarly the probability that it oscillates into a B¯<sup>0</sup>:

$$\begin{aligned} |\langle B^0 | \psi(t) \rangle|^2 &= \mathbf{e}^{-\Gamma\_B t} \cos^2(\Delta M t/2) \\ |\langle \bar{B}^0 | \psi(t) \rangle|^2 &= \mathbf{e}^{-\Gamma\_B t} \frac{|q|^2}{|p|^2} \sin^2(\Delta M t/2) \end{aligned} \tag{10.34}$$
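Eqns 10.34 translate directly into a few lines of code. The sketch below is a minimal illustration in which time is measured in units of the lifetime (Γ<sub>B</sub> = 1) and ΔM/Γ<sub>B</sub> = 0.77 is an assumed, illustrative value; the function name and defaults are hypothetical.

```python
import math


def mixing_probabilities(t, gamma=1.0, delta_m=0.77, q_over_p=1.0):
    """Return (P(B0 -> B0), P(B0 -> B0bar)) at time t from eqns 10.34.

    gamma and delta_m must be in consistent units; the defaults are illustrative.
    """
    decay = math.exp(-gamma * t)
    unmixed = decay * math.cos(delta_m * t / 2.0) ** 2
    mixed = decay * abs(q_over_p) ** 2 * math.sin(delta_m * t / 2.0) ** 2
    return unmixed, mixed


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0, 4.0):
        unmixed, mixed = mixing_probabilities(t)
        print(f"t = {t:.1f}: unmixed = {unmixed:.4f}, mixed = {mixed:.4f}")
```

With |q/p| = 1 the two probabilities add up to the pure exponential e<sup>−Γ<sub>B</sub>t</sup>, so mixing redistributes the decays between the two flavours without changing the total rate.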

The amount of CP violation in the mixing of B mesons is found to be small. In the limit of no CP violation, |p|/|q| = 1 and<sup>32</sup> B<sub>L</sub> and B<sub>H</sub> would be orthogonal.

<sup>32</sup>The relationship

$$\langle B\_{\rm L} | B\_{\rm H} \rangle = |p|^2 - |q|^2$$

carries over from the kaons.

## **10.7** *CP* **violation (part 2)**

The CKM formalism for CP violation with one complex phase δ can be easily made to fit the experimental CP violation effects described up to now, and indeed the value of δ is not strongly constrained by these. One reason is that strong-force (QCD) effects on the theoretical prediction of ε′ are difficult to calculate,<sup>33</sup> so it is not easy to relate it to δ. However, the CKM formalism gives predictions (as a function of the assumed value of δ) of CP violation in other systems, and, owing to the values of the CKM elements, some of the effects in the B-meson systems can be large. Although the kaons have the advantage that the K<sub>S</sub> and K<sub>L</sub> have very different lifetimes, there are several tricks for measuring CP violation with B mesons, which do not have this feature.

<sup>33</sup>It is a complex theoretical computation and there are several large terms that nearly cancel to give a small overall number, which consequently has a large uncertainty.

As stated earlier, we need to find a situation where (a) all three quark generations can be involved and (b) several routes lead to the final state, so that when we add the amplitudes together and square to get the observable quantity, there is a dependence on the complex phase in the CKM matrix.

## **10.7.1** *CP* **violation in time-dependent asymmetries**

The first technique that exploits this, called the time-dependent asymmetry, is elegant and gives a large effect. It is our first example of type (c) CP violation from Table 10.2. It has been studied in detail at facilities called B-factories and involves making a B<sup>0</sup>B̄<sup>0</sup> pair from the decay of an Υ(4S) bb̄ meson, which we will look at shortly. It can also be studied at the LHCb experiment, which we will also examine later in this chapter. A final state that is a CP eigenstate is required, one to which both a B<sup>0</sup> and a B̄<sup>0</sup> can decay; a good example of this is B<sup>0</sup> → J/ψK<sub>S</sub>. A schematic of the entire process<sup>34</sup> is shown in Fig. 10.12. It turns out that the CP violation in the mixing of B<sup>0</sup> (in the box diagrams in Fig. 10.4) is very small, but the final state has two routes by which it can be made from a B<sup>0</sup>: one in which the B<sup>0</sup> oscillates to a B̄<sup>0</sup> before decaying and one where it does not. The amplitudes for these two parts of the decay will interfere, and the sensitivity to the angle δ in the CKM matrix is large.

<sup>34</sup>Although the K<sub>S</sub> is needed to produce a CP final state, we can draw this in the Feynman diagram as either an sd̄ or a ds̄; although there is CP violation in the mixing of the kaons, it is small compared with the main source of CP violation in this technique and can be neglected.

Considering decays of B mesons to states of definite CP (labelled as f<sub>CP</sub>), we define the amplitude for B<sup>0</sup> → f<sub>CP</sub> as A<sub>f</sub> and the amplitude for a B̄<sup>0</sup> decay to the same state as Ā<sub>f</sub>. It is conventional to define the parameter

$$
\lambda\_f = \frac{\bar{A}\_f}{A\_f} \frac{q}{p} \tag{10.35}
$$

From eqn 10.33, we can write down the amplitude representing the decay of a B<sup>0</sup> state, with wavefunction ψ(t), to the f<sub>CP</sub> state:<sup>35</sup>

$$\langle f|\psi(t)\rangle = \mathbf{e}^{-\Gamma t/2} \mathbf{e}^{-\mathrm{i}Mt} \left[ A\_f \cos(\Delta M t/2) + \mathrm{i}\frac{q}{p}\bar{A}\_f \sin(\Delta M t/2) \right].$$

<sup>35</sup>In the following equations, f<sub>CP</sub> is shortened to f.

Using the definition of λ<sup>f</sup> from eqn 10.35, this can be rewritten as

$$\langle f|\psi(t)\rangle = A\_f \mathbf{e}^{-\Gamma t/2} \mathbf{e}^{-\mathrm{i}Mt} [\cos(\Delta M t/2) + \mathrm{i}\lambda\_f \sin(\Delta M t/2)] \tag{10.36}$$

Similarly, the amplitude representing the decay of a B¯<sup>0</sup> state, with wavefunction ψ¯(t), to the same fCP state is

$$\langle f|\bar{\psi}(t)\rangle = \bar{A}\_f \mathbf{e}^{-\Gamma t/2} \mathbf{e}^{-\mathrm{i}Mt} \left[ \cos(\Delta M t/2) + \frac{\mathrm{i}}{\lambda\_f} \sin(\Delta M t/2) \right] \quad \text{(10.37)}$$

Assuming that |A<sub>f</sub>|<sup>2</sup> = |Ā<sub>f</sub>|<sup>2</sup>, as expected in the Standard Model, and that |p| ≈ |q| because the CP violation in the mixing is small, so that |λ<sub>f</sub>| ≈ 1, we can evaluate the decay rates as (Exercise 10.3)

$$\begin{aligned} \left| \langle f | \psi(t) \rangle \right|^2 &= |A\_f|^2 \mathbf{e}^{-\Gamma t} [1 - 2 \operatorname{Im}(\lambda\_f) \cos(\Delta M t/2) \sin(\Delta M t/2)] \\ |\langle f | \bar{\psi}(t) \rangle|^2 &= |A\_f|^2 \mathbf{e}^{-\Gamma t} [1 + 2 \operatorname{Im}(\lambda\_f) \cos(\Delta M t/2) \sin(\Delta M t/2)] \end{aligned} \tag{10.38}$$

We define the CP-violating asymmetry in the usual way,

$$A\_{CP} = \frac{|\langle f|\bar{\psi}(t)\rangle|^2 - |\langle f|\psi(t)\rangle|^2}{|\langle f|\bar{\psi}(t)\rangle|^2 + |\langle f|\psi(t)\rangle|^2} \tag{10.39}$$

and, substituting from eqns 10.38, we find

$$A\_{CP} = \text{Im}(\lambda\_f) \sin(\Delta M t) \tag{10.40}$$
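The step from eqns 10.36 and 10.37 to eqn 10.40 can be checked numerically. The sketch below squares the two amplitudes for a λ<sub>f</sub> of unit modulus and compares the resulting asymmetry with Im(λ<sub>f</sub>) sin(ΔMt). The phase of λ<sub>f</sub>, the value ΔM/Γ = 0.77, and the function names are illustrative assumptions.

```python
import cmath
import math


def decay_rates(t, lam, gamma=1.0, delta_m=0.77, a_f=1.0):
    """Return (rate_b0, rate_b0bar) to f_CP from eqns 10.36 and 10.37.

    Assumes |Abar_f| = |A_f| and drops overall phases, which cancel in the rates.
    """
    envelope = a_f * math.exp(-gamma * t / 2.0)
    c = math.cos(delta_m * t / 2.0)
    s = math.sin(delta_m * t / 2.0)
    rate_b0 = abs(envelope * (c + 1j * lam * s)) ** 2
    rate_b0bar = abs(envelope * (c + (1j / lam) * s)) ** 2
    return rate_b0, rate_b0bar


def cp_asymmetry(t, lam, **kwargs):
    """A_CP of eqn 10.39: (B0bar rate - B0 rate) / (sum of the two rates)."""
    rate_b0, rate_b0bar = decay_rates(t, lam, **kwargs)
    return (rate_b0bar - rate_b0) / (rate_b0bar + rate_b0)


# Hypothetical pure-phase lambda_f, chosen only for illustration.
LAMBDA_F = cmath.exp(0.75j)

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 4.0):
        direct = cp_asymmetry(t, LAMBDA_F)
        formula = LAMBDA_F.imag * math.sin(0.77 * t)
        print(f"t = {t:.1f}: from rates = {direct:+.6f}, eqn 10.40 = {formula:+.6f}")
```

The exponential decay factor cancels in the ratio, which is why the asymmetry oscillates with undiminished amplitude even at times where few events remain.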

The J/ψK<sup>0</sup><sub>S</sub> state is a CP eigenstate (with eigenvalue η<sub>CP</sub> = −1), so is suitable for a time-dependent asymmetry analysis. We can use the CKM matrix to predict the phase factors with or without mixing via the box diagram (see Fig. 10.12). In the box diagrams, the important contribution comes from exchange of virtual top quarks owing to the large value of V<sub>tb</sub>. Note that the asymmetry depends on the phase of λ<sub>f</sub>. The phase factors in the box diagram come from the CKM elements and contribute V<sup>2</sup><sub>tb</sub>V<sup>2</sup><sub>td</sub>. Since V<sub>tb</sub> is real, the non-zero phase arises from the factor of V<sup>2</sup><sub>td</sub>. Hence the amplitude of the CP-violating term is given by the imaginary part of the phase factor of V<sup>2</sup><sub>td</sub>, that is, Im(V<sup>2</sup><sub>td</sub>)/|V<sub>td</sub>|<sup>2</sup>. Now if we look at the unitarity triangle (see Fig. 10.10), we can see that the phase of V<sub>td</sub> is β. Hence A<sub>CP</sub> = sin 2β.

## **10.7.2 B-factories**

Since the B-meson system is expected to be a prolific source of physics, many recent studies of B mesons, including the time-dependent asymmetries, have been made at two 'B-factories'. These are high-luminosity e<sup>+</sup>e<sup>−</sup> colliders, KEK-B in Japan (with the Belle detector) and PEP-II (with the BaBar experiment) in the USA, operating at a centre-of-mass energy of 10.58 GeV, the mass of the Υ(4S) resonance. The Υ(4S) is a C = −1 state that decays almost entirely to B<sub>d</sub>B̄<sub>d</sub> pairs with a branching ratio of 49% to B<sup>0</sup>B̄<sup>0</sup>.<sup>37</sup> Studies at e<sup>+</sup>e<sup>−</sup> B-factories continue with the new Belle-2 experiment at the upgraded Super KEK-B accelerator.

Both machines are 'asymmetric' in that the laboratory energies of the electron and positron beams are different; specifically, at PEP-II, the low-energy ring operates at 3.1 GeV and the high-energy ring at 9.0 GeV. The asymmetry between the energies of the two rings means that the centre-of-mass system (CMS) moves at a velocity of β = 0.5 in the laboratory. The resulting Lorentz boost allows resolution of the B decay vertices, which could not be resolved if the CMS were at rest in the laboratory, because the B mesons are produced almost at rest in the CMS. This is illustrated in Fig. 10.13. Nevertheless, the detectors must have extremely good vertex resolution, since the average distance from production to decay, βγcτ<sub>B</sub>, is only about 260 μm even with the Lorentz boost.

Because B<sup>0</sup>B̄<sup>0</sup> pairs from the decay of Υ(4S) are produced essentially at rest in the CMS, and the Υ(4S) is a state of definite (odd) C, the initial state is

$$\Upsilon(4S)\_{C=-1} \to \frac{1}{\sqrt{2}} \left[ |B^0(\mathbf{p})\bar{B}^0(-\mathbf{p})\rangle - |B^0(-\mathbf{p})\bar{B}^0(\mathbf{p})\rangle \right]$$

where **p** is the CMS momentum of one of the Bs. The oscillations that occur after production are quantum-correlated: since the state must be odd under the interchange of the two mesons, they cannot both be B<sup>0</sup> or B̄<sup>0</sup> simultaneously, so they must oscillate together. The coherence is lost once one of the mesons has decayed, and oscillations can then be observed: the decay of one meson starts the clock for the time-dependent oscillations of the other. Experimentally, the time Δt = t<sub>1</sub> − t<sub>2</sub> is determined from the distance between the decay vertices, as sketched in Fig. 10.13.

<sup>36</sup>A similar analysis for the case of J/ψK<sup>0</sup><sub>L</sub> gives a similar result with the sign changed.

<sup>37</sup>B<sup>0</sup><sub>s</sub> and B̄<sup>0</sup><sub>s</sub> are too massive to be produced from Υ decays.

**Fig. 10.13** The Lorentz boost due to the finite velocity of the CMS in an asymmetric B-factory allows the resolution of the B-meson decay vertices. The distance between the two decay vertices allows the time between the two B<sup>0</sup> decays to be determined.
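The PEP-II numbers quoted in this section can be reproduced from the two beam energies alone. The sketch below neglects the electron mass and the small B momentum in the CMS, and takes cτ for the B<sup>0</sup> to be about 455 μm (an assumed input corresponding to a lifetime of roughly 1.5 ps); the function name is hypothetical.

```python
import math

C_TAU_B0_UM = 455.0  # assumed c*tau of the B0 in micrometres (lifetime about 1.5 ps)


def boost_from_beams(e_high, e_low):
    """Return (sqrt_s, beta, beta_gamma) for head-on beams, neglecting the electron mass.

    Energies are in GeV; the CMS moves along the direction of the higher-energy beam.
    """
    sqrt_s = 2.0 * math.sqrt(e_high * e_low)
    p_lab = e_high - e_low      # net momentum of the colliding system
    e_lab = e_high + e_low      # total energy of the colliding system
    return sqrt_s, p_lab / e_lab, p_lab / sqrt_s


if __name__ == "__main__":
    sqrt_s, beta, beta_gamma = boost_from_beams(9.0, 3.1)
    print(f"sqrt(s)     = {sqrt_s:.2f} GeV")
    print(f"beta        = {beta:.3f}")
    print(f"beta*gamma  = {beta_gamma:.3f}")
    print(f"mean flight = {beta_gamma * C_TAU_B0_UM:.0f} micrometres")
```

The output is close to the centre-of-mass energy of 10.58 GeV, the velocity β = 0.5, and the mean flight distance of about 260 μm quoted in the text; the small differences come from rounding of the beam energies and the assumed lifetime.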

The scheme of the processes that are selected in the detector to measure the time-dependent asymmetry is shown in Fig. 10.12, which we have partly discussed already. We consider events that decay as Υ(4S) → B<sup>0</sup>B¯<sup>0</sup> → fCP ftag, where fCP is a CP eigenstate (shown in the top part of Fig. 10.12) and ftag is a B meson that is tagged as either B<sup>0</sup> or B¯<sup>0</sup> through a semileptonic decay as shown in the bottom part of Fig. 10.12. Let n(B<sup>0</sup>)(n(B¯<sup>0</sup>)) be the number of tagged B<sup>0</sup>(B¯<sup>0</sup>) events observed as a function of Δt. We can redefine the asymmetry ACP from eqn 10.39 in terms of event numbers as

$$A\_{CP} = \frac{n(B^0) - n(\bar{B}^0)}{n(B^0) + n(\bar{B}^0)} \tag{10.41}$$
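In a single Δt bin, eqn 10.41 is a counting asymmetry, and its statistical uncertainty follows from binomial statistics. The sketch below is an idealized illustration that ignores mistagging, background, and Δt resolution, all of which matter in a real analysis; the function name and the event counts in the usage are hypothetical.

```python
import math


def tagged_asymmetry(n_b0, n_b0bar):
    """Return (A_CP, sigma) from tagged event counts in one time bin (eqn 10.41).

    sigma uses the binomial estimate sqrt((1 - A^2) / N) with N = n_b0 + n_b0bar.
    """
    total = n_b0 + n_b0bar
    asym = (n_b0 - n_b0bar) / total
    sigma = math.sqrt((1.0 - asym * asym) / total)
    return asym, sigma


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    asym, sigma = tagged_asymmetry(600, 400)
    print(f"A_CP = {asym:.3f} +/- {sigma:.3f}")
```

The uncertainty scales as 1/√N, which is one reason why samples of hundreds of millions of B<sup>0</sup>B̄<sup>0</sup> pairs are needed once the small branching ratios to CP eigenstates are taken into account.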

Figure 10.14 shows the numbers of tagged events (a, b) and A<sub>CP</sub> (c, d) from the Belle experiment [48] as functions of Δt for the processes B → J/ψK<sub>S</sub> (a, c) and B → J/ψK<sub>L</sub> (b, d). The dataset from Belle has 700 million B<sup>0</sup>B̄<sup>0</sup> events. Similar results are also obtained at the other B-factory by the BaBar experiment at SLAC with a sample of 460 million B<sup>0</sup>B̄<sup>0</sup> events. It can be deduced from these and other measurements that sin 2β = 0.671 ± 0.023.

**Fig. 10.14** Data from the Belle experiment: (a, b) numbers of events with a B<sup>0</sup> and with a B̄<sup>0</sup> tag as a function of Δt and (c, d) the time-dependent asymmetry. The data are shown separately for the CP = −1 (e.g. J/ψK<sub>S</sub>) (a, c) and CP = +1 (J/ψK<sub>L</sub>) (b, d) final states. From [48].
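The measured value of sin 2β can be converted into the angle itself with linear error propagation. The sketch below returns only the solution with 2β below 90°; sin 2β alone cannot distinguish β from 90° − β, and resolving that ambiguity needs other measurements. The function name is hypothetical.

```python
import math


def beta_from_sin2beta(s, sigma_s):
    """Return (beta, sigma_beta) in degrees for the solution with 2*beta < 90 degrees.

    Uses d(sin 2beta) = 2 cos(2beta) d(beta) for the uncertainty.
    """
    two_beta = math.asin(s)
    sigma_beta = sigma_s / (2.0 * math.cos(two_beta))
    return math.degrees(two_beta / 2.0), math.degrees(sigma_beta)


if __name__ == "__main__":
    beta, sigma = beta_from_sin2beta(0.671, 0.023)
    print(f"beta = {beta:.1f} +/- {sigma:.1f} degrees")
```

This gives β of about 21° with an uncertainty of about 1°, showing how tightly one angle of the triangle in Fig. 10.10(b) is constrained.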

The B-factories also studied time-dependent asymmetries with other final states. The B → J/ψK<sub>S</sub> described here is particularly clean, with diagrams that are easy to calculate theoretically being dominant. This mode is an example of a process where the quark-level decay is b̄ → cc̄s̄, and gives access to the angle β. Measurements of the other angles of the unitarity triangle are more involved but can be done. For example, the angle α can be determined from the time-dependent analysis of B<sup>0</sup> → π<sup>+</sup>π<sup>−</sup> decays, but this is more challenging because of the small branching ratio for this decay mode and backgrounds from other decay modes. In addition, the theoretical analysis of this mode is more complicated.

There are about eight combinations overall of quark decays that can in principle be studied with different final states. Highly significant (over 5σ) CP-violating effects have been observed in the following final states: J/ψK, η′K, φK, f<sub>0</sub>K, K<sup>+</sup>K<sup>−</sup>K<sub>S</sub>, π<sup>+</sup>π<sup>−</sup>, ψπ<sup>0</sup>, D<sup>+</sup>D<sup>−</sup>, and D<sup>∗+</sup>D<sup>∗−</sup>. In some of these cases, there are more than two particles in the measured final state, which in general do not have definite CP, and a certain region of the Dalitz plot (see Chapter 2, page 36, for an example) must be selected to isolate the decays in the CP eigenstate.

## **10.7.3 LHCb detector**

The B-factories using the Υ(4S) can produce B<sup>d</sup> mesons but they are below threshold for producing B<sup>s</sup> mesons. Hadron colliders offer a prolific source of B mesons, including the B<sup>s</sup> mesons, and this will be discussed in this section. The angle γ can be determined most cleanly from measurements of direct CP violation in B<sup>±</sup> → D0K<sup>±</sup> decays, or from effects involving the B<sup>s</sup> mesons.

The LHCb experiment [99] is a dedicated B-physics facility at the LHC, established following the success in studying B physics at the Tevatron experiments. Because of their relatively small mass compared with the CMS energy, b hadrons are produced predominantly along the beam directions, in what is called the 'forward direction'. One side of an LHC interaction region has been instrumented with detectors specifically optimized for (a) good spatial resolution (which gives good decay time resolution) to distinguish and measure B decays, (b) excellent particle identification (e.g. in distinguishing B → π<sup>+</sup>π<sup>−</sup> decays from background B → K<sup>+</sup>π<sup>−</sup>), and (c) a very efficient trigger for selecting B events.

The spatial resolution is achieved with a silicon vertex detector (see Section 4.6.2).<sup>38</sup> Because of the geometry of measuring particles in the forward direction, the LHCb silicon vertex detector fits inside the vacuum beam pipe. The detectors are split into two parts that can be moved with motors: they are brought close to the beam during normal operation and moved further away while the accelerator is being filled with particles, to avoid exposure to too much radiation. Thin foils surround the vertex detectors to provide shielding, since otherwise the pickup from the moving charges in the beam nearby would swamp the signals of the particles being detected.

<sup>38</sup>The large boost at LHC helps improve the proper-time resolution.

**Fig. 10.15** The LHCb RICH detector. Čerenkov photons created by charged particles with speed β > 1/n are focused onto the surface of the photon detectors by a series of mirrors. The radiator C<sub>4</sub>F<sub>10</sub> is selected because of its high refractive index among the inert gases. From [99].

The particle identification is mainly needed for K/π separation. This is achieved over a wide range of momenta using two ring imaging Čerenkov (RICH) detectors, one of which is shown in Fig. 10.15. The Čerenkov effect is described in Section 4.3.4; when a particle travelling at speed β is sufficiently fast that β > 1/n, light is emitted at an angle θ<sub>C</sub> given by cos θ<sub>C</sub> = 1/(nβ), where n is the refractive index of the medium. In a RICH, the locations of the Čerenkov photons are measured and used to fit a cone around the particle trajectory to determine θ<sub>C</sub>. In combination with the momentum measurement from the magnetic spectrometer, this can in principle determine the particle mass.<sup>39</sup>

<sup>39</sup>In practice, we do not want to measure the mass but simply to provide separation between particles of different masses.
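The relation cos θ<sub>C</sub> = 1/(nβ) shows directly how a RICH separates kaons from pions of the same momentum. The sketch below evaluates the angle and the threshold momentum for both particles; the refractive index n = 1.0014 is an assumed, representative value for a C<sub>4</sub>F<sub>10</sub> gas radiator, and the function names are hypothetical.

```python
import math

M_PI = 0.13957   # charged pion mass in GeV
M_K = 0.49368    # charged kaon mass in GeV
N_GAS = 1.0014   # assumed refractive index of the gas radiator


def cherenkov_angle(p, mass, n=N_GAS):
    """Return theta_C in radians, or None if the particle is below threshold."""
    beta = p / math.sqrt(p * p + mass * mass)
    cos_theta = 1.0 / (n * beta)
    if cos_theta >= 1.0:
        return None
    return math.acos(cos_theta)


def threshold_momentum(mass, n=N_GAS):
    """Momentum (GeV) at which beta = 1/n, i.e. p = m / sqrt(n^2 - 1)."""
    return mass / math.sqrt(n * n - 1.0)


if __name__ == "__main__":
    print(f"pion threshold: {threshold_momentum(M_PI):.2f} GeV")
    print(f"kaon threshold: {threshold_momentum(M_K):.2f} GeV")
    for p in (5.0, 20.0, 50.0):
        angles = [cherenkov_angle(p, m) for m in (M_PI, M_K)]
        shown = ["below threshold" if a is None else f"{1e3 * a:.1f} mrad" for a in angles]
        print(f"p = {p:4.0f} GeV: pion {shown[0]}, kaon {shown[1]}")
```

Between the two thresholds only the pion radiates, and above both thresholds the two angles converge as the momentum grows, which is why more than one radiator is needed to cover a wide momentum range.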

The LHCb trigger consists of hardware and software levels following the multilevel scheme described in Section 4.10. The hardware trigger uses simple algorithms based on energy deposition in the calorimeters and hits in the muon chambers to reduce the rate to 1 MHz. All the data are read out at this frequency and a large computer farm performs more sophisticated calculations to reduce the rate to 4 kHz; the data that pass this higher-level trigger are retained on permanent storage for subsequent analysis.

## **10.8 LHCb measurements**

LHCb has made time-dependent asymmetry measurements that are as precise as the B-factory measurements (Section 10.7.2). The decay B<sup>0</sup> → J/ΨK<sub>S</sub> can be reconstructed from the decays J/Ψ → μ<sup>+</sup>μ<sup>−</sup> and K<sub>S</sub> → π<sup>+</sup>π<sup>−</sup>. These particular final states are the simplest to detect in hadronic collisions.<sup>40</sup> The decay time of the B<sup>0</sup> is reconstructed from the separation between the primary and secondary vertices (measured by the silicon detectors) and the momentum of the B<sup>0</sup>. The flavour of the 'non-signal B<sup>0</sup>' is determined by specific decays, including semileptonic decays to electrons or muons. The time-dependent asymmetry (eqn 10.41) is shown [102] in Fig. 10.16. This is the type (c) CP violation from Table 10.2.

<sup>40</sup>A disadvantage is that the branching ratio of B<sup>0</sup> to this final state is only ∼10<sup>−3</sup>.

In contrast to the kaon system, CP violation due to mixing alone in the B system does not occur at a high level. This is the type (a) CP violation from Table 10.2 and is characterized by |q/p| ≠ 1. It can be tested with a time-independent semileptonic asymmetry; the measurements imply |q/p| = 1.0002 ± 0.0028, consistent with unity.

Direct CP violation (type (b) from Table 10.2) has been observed as a difference in the decay rates of neutral B mesons to CP-conjugate final states. For example, for the CP-conjugate decays B<sup>0</sup> → K<sup>+</sup>π<sup>−</sup> and B̄<sup>0</sup> → K<sup>−</sup>π<sup>+</sup>, the asymmetry parameter is measured to be

$$A(B^0 \to K^+ \pi^-) = \frac{\Gamma\_{K^- \pi^+} - \Gamma\_{K^+ \pi^-}}{\Gamma\_{K^- \pi^+} + \Gamma\_{K^+ \pi^-}} = -0.080 \pm 0.008 \tag{10.42}$$

A similar asymmetry has been measured in the decay of charged B mesons to the Kρ<sup>0</sup> final state: A<sub>Kρ<sup>0</sup></sub> = 0.37 ± 0.1.<sup>41</sup> This shows that CP-violating effects can be measured in charged mesons and are not limited to the four main neutral meson systems.

<sup>41</sup>Similar asymmetry parameters for over two hundred final states of charged and neutral B mesons have been measured and only a few of these, including the two mentioned, are statistically different from zero.

The first evidence for CP violation in B<sup>0</sup><sub>s</sub> decays [101] uses the mode B<sup>0</sup><sub>s</sub> → K<sup>−</sup>π<sup>+</sup> and is an example of direct CP violation. This decay mode has a very small branching ratio and therefore good background rejection is essential. This is achieved using the excellent K/π separation and mass resolution. The measurement is A(B<sup>0</sup><sub>s</sub> → K<sup>−</sup>π<sup>+</sup>) = 0.27 ± 0.07, defined similarly to eqn 10.42.
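To see what 'statistically different from zero' means for the asymmetries quoted in this section, the short Python sketch below forms an asymmetry of the type in eqn 10.42 from event counts and expresses a measured value in units of its uncertainty. It is a simplified illustration: the event counts are hypothetical, the function names are ours, and a real analysis must also correct for backgrounds and for production and detection asymmetries, which are ignored here.

```python
import math


def cp_asymmetry(n_bbar, n_b):
    """Raw asymmetry (N_bbar - N_b)/(N_bbar + N_b) and its binomial uncertainty.

    n_bbar counts decays to the K- pi+ type final state, n_b the conjugate
    final state, following the sign convention of eqn 10.42.
    """
    total = n_bbar + n_b
    a = (n_bbar - n_b) / total
    sigma = math.sqrt((1.0 - a * a) / total)
    return a, sigma


def significance(value, error):
    """Number of standard deviations by which a measurement differs from zero."""
    return abs(value) / error


if __name__ == "__main__":
    # HYPOTHETICAL counts, chosen only to illustrate the arithmetic.
    print(cp_asymmetry(460, 540))
    # Values quoted in the text.
    print(significance(-0.080, 0.008))
    print(significance(0.27, 0.07))
```

The statistical uncertainty on a small asymmetry scales roughly as 1/√N, which is why decay modes with very small branching ratios need both large samples and good background rejection.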

## **Chapter summary**


## **Further reading**


## **Exercises**


$$z^2 = \sum\_{i,j=1}^4 E\_i E\_j \frac{(x\_i - x\_j)^2 + (y\_i - y\_j)^2}{m\_K^2}.$$

Hint: First consider a π<sup>0</sup> → γγ decay with the y axis perpendicular to the plane of the decay, and discard non-leading terms in x<sup>2</sup>/z<sup>2</sup>.


similar set of diagrams to show how the process B → π<sup>+</sup>π<sup>−</sup> proceeds. Penguin diagrams (similar to Fig. 10.11) are non-negligible in this mode; add an example penguin to your diagram. Another mode that exhibits time-dependent asymmetries is B → φK<sub>S</sub>; draw the relevant set of diagrams for this mode.

(10.5) For the decay B<sub>s</sub> → π<sup>+</sup>K<sup>−</sup>, draw a tree diagram and a penguin diagram. Show that the tree diagram is proportional to V<sup>∗</sup><sub>ub</sub>V<sub>ud</sub>, and deduce the CKM matrix elements upon which the penguin diagram depends. You should find that the flavour inside the dominant penguin is charm. Repeat for the b̄d decay B<sup>0</sup> → K<sup>+</sup>π<sup>−</sup> and notice that the arrangement of the lines in these two decays is very similar. It turns out that because one can show that Im(V<sup>∗</sup><sub>ub</sub>V<sub>ud</sub>V<sub>cb</sub>V<sup>∗</sup><sub>cd</sub>) = −Im(V<sup>∗</sup><sub>ub</sub>V<sub>us</sub>V<sub>cb</sub>V<sup>∗</sup><sub>cs</sub>), the amount of direct CP violation in each of these two decays is the same, which is a result that has been experimentally tested [101].


# **11 Neutrino oscillations**

## **11.1 Introduction**

In the minimal Standard Model (SM), the neutrinos are assumed to be massless. However, as the neutrino masses are not protected by any gauge symmetry (unlike the photon), it is easy to extend the SM to accommodate neutrino masses. From an experimental perspective, it is much easier to detect small mass differences between different neutrino mass eigenstates than to measure their absolute masses. This is because small mass differences combined with mixing will cause one flavour of neutrino to oscillate into other flavours, which is an effect that can be measured. This chapter starts with a very brief review of the determination of upper limits on the neutrino masses and then gives an explanation of the theory of neutrino flavour oscillations. We begin with the simple case of two-neutrino oscillations, since this brings out the essential features with minimal complications. We then review the experimental evidence for neutrino oscillations. Three-generation mixing is very interesting, since it allows for the possibility of CP violation in the neutrino sector, and the mathematical treatment will be given before showing the recent experimental evidence for a non-zero value of the mixing angle between first- and third-generation neutrinos. It is possible that CP violation in the neutrino sector can explain the observed matter–antimatter asymmetry in the Universe, and this will be discussed in Section 11.5.4.

## **11.1.1 Neutrino masses**

The oscillation experiments measure mass differences between the different types of neutrinos, but they are not sensitive to the absolute mass of any neutrino (see Section 11.3). In principle, neutrino masses can be measured from the endpoint(s) of decay spectra. For example, the β decay of tritium, <sup>3</sup>H → <sup>3</sup>He + e<sup>−</sup> + ν̄<sub>e</sub>, has an endpoint at the Q value of this reaction of 18.6 keV, assuming m<sub>ν<sub>e</sub></sub> = 0.<sup>1</sup> The effect of a non-zero value of m<sub>ν<sub>e</sub></sub> would be to create a distortion in the spectrum near the endpoint. Similar studies have been performed using μ and τ decays. So far, only upper limits have been obtained [115]. Limits on the sum of the neutrino masses can also be obtained from cosmological arguments, since finite-mass neutrinos could contribute to the matter density of the Universe and affect structure formation in the early Universe. The upper limits depend on which theoretical assumptions are made, but the current upper limit for the sum of the neutrino masses (for all generations) is around 1 eV [115]. A more direct but less precise limit on the mass of the electron neutrino can be derived from the observations of neutrinos from the supernova SN1987A (see Exercise 11.4).

<sup>1</sup>Tritium is the best choice here because of the very low Q value.

*Particle Physics in the LHC Era*, Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg 2016. Published in 2016 by Oxford University Press.

Neutrinos could be of Dirac or Majorana type (see Chapter 6). If they are of Majorana type, this would imply that a neutrino could be its own antiparticle, which would allow neutrinoless double β decay, <sup>X</sup>Y → <sup>X</sup>Z + 2e<sup>−</sup>. The Standard Model background to its detection would be normal double β decay, <sup>X</sup>Y → <sup>X</sup>Z + 2e<sup>−</sup> + 2ν̄<sub>e</sub>. If we consider the combined energy of the two electrons, the Standard Model background would produce a continuous spectrum, as opposed to the peak at the endpoint expected for neutrinoless double β decay. Several experiments are now looking for this signal; for example, the SNO+ experiment is using the isotope <sup>130</sup>Te, which is a double β emitter. The experiment uses liquid scintillator in the SNO detector (previously filled with heavy water for solar neutrino studies). The scintillation process gives a larger number of photons than the Čerenkov process (which generates the photons in water). Hence the energy resolution for low-energy electrons is significantly better using liquid scintillator than using heavy water.

## **11.2 Neutrino states**

We know from studying the weak interaction that particle mass states need not be the same as the weak-interaction states. For quarks, the two types of state are connected by the Cabibbo rotation matrix. A similar phenomenon occurs for neutrino states. If neutrinos have a small mass, the flavour (i.e. weak-interaction) eigenstates are related to the mass eigenstates via a CKM-like matrix:

$$
\begin{pmatrix} \nu\_e \\ \nu\_\mu \\ \nu\_\tau \end{pmatrix} = \begin{pmatrix} U\_{e1} & U\_{e2} & U\_{e3} \\ U\_{\mu 1} & U\_{\mu 2} & U\_{\mu 3} \\ U\_{\tau 1} & U\_{\tau 2} & U\_{\tau 3} \end{pmatrix} \begin{pmatrix} \nu\_1 \\ \nu\_2 \\ \nu\_3 \end{pmatrix} \tag{11.1}
$$

The flavour states ν<sub>e</sub>, ν<sub>μ</sub>, and ν<sub>τ</sub> propagate in space-time as linear combinations of the mass eigenstates ν<sub>1</sub>, ν<sub>2</sub>, and ν<sub>3</sub>. The mixing matrix **U** is known as the PMNS (Pontecorvo–Maki–Nakagawa–Sakata) matrix and, as will be discussed in Section 11.5.2, can be generalized to an arbitrary number N of flavours. For the general case of N ≥ 3, the elements U<sub>αi</sub> may be complex. The mixing causes transitions between different flavours such as ν<sub>μ</sub> and ν<sub>τ</sub>. As will be shown below, the flavour transition probabilities P(ν<sub>α</sub> → ν<sub>β</sub>) depend on the differences between the squared masses of the mass eigenstates and are oscillatory functions of L/E, where L is the distance a neutrino of energy E has travelled from the source to a detector.

## **11.3 Two-flavour oscillations**

Most of the original evidence for neutrino oscillations came from the study of muon neutrinos originating at the top of the atmosphere and electron neutrinos from the Sun. Since the initial neutrino flavours are different and the respective ranges of L/E are very different, the results were usually analysed assuming only two neutrino flavours. It turns out, as will be discussed in Section 11.5.2, that this gives a good description of the main features of neutrino oscillation phenomena. Since the expressions for the transition ('oscillation') probabilities for three (or more) neutrino flavours are somewhat complicated, the physics of two-flavour oscillations will be discussed first. This has the additional advantage that the underlying physics is more transparent. A discussion of three-flavour oscillations is given in Section 11.5.2.

The derivation of two-flavour neutrino oscillation probabilities is similar to the derivation of the K<sup>0</sup>–K̄<sup>0</sup> oscillation probabilities of Section 10.3, with the important difference that it proceeds in the laboratory frame, rather than the centre-of-mass frame.

With only two flavours, neutrino mixing can be described by one ('mixing') angle. The two flavour eigenstates ν<sub>α</sub> and ν<sub>β</sub> (e.g. α, β = e, μ) are linear combinations of mass eigenstates ν<sub>1</sub> and ν<sub>2</sub>:

$$\begin{aligned} \left| \nu\_{\alpha} \right\rangle &= \cos \theta \left| \nu\_{1} \right\rangle + \sin \theta \left| \nu\_{2} \right\rangle \\ \left| \nu\_{\beta} \right\rangle &= -\sin \theta \left| \nu\_{1} \right\rangle + \cos \theta \left| \nu\_{2} \right\rangle \end{aligned} \tag{11.2}$$

Consider a neutrino of flavour α with momentum p created in a weak interaction at t = 0. The initial state is |ψ(0)⟩ = |ν<sub>α</sub>(0)⟩, which can be expressed in terms of the mass eigenstates<sup>2</sup> as

$$\left|\psi(0)\right\rangle = \cos\theta \left|\nu\_1\right\rangle + \sin\theta \left|\nu\_2\right\rangle.$$

<sup>2</sup>This treatment follows the simplified methods generally used in textbooks. There are many subtleties in a more correct analysis, but the final results are unchanged (see Akhmedov and Smirnov in Further Reading). We assume that we have a superposition of neutrinos with the same momentum but different energies. In Exercise 11.1, we investigate the effect of changing this assumption. A more general treatment involves the consideration of wavepackets with a finite spread in momentum. A possible objection to our treatment is that it appears to violate conservation of energy since we have neutrino states with (very slightly) different energies. This issue is investigated in Exercise 11.2.

At some time t later, the ν<sub>1</sub> and ν<sub>2</sub> wavefunctions have evolved:

$$\begin{split} \left| \psi(t) \right\rangle &= \cos \theta \left| \nu\_1 \right\rangle \mathrm{e}^{-\mathrm{i}E\_1t} + \sin \theta \left| \nu\_2 \right\rangle \mathrm{e}^{-\mathrm{i}E\_2t} \\ &= \left( \cos \theta \left| \nu\_1 \right\rangle + \sin \theta \left| \nu\_2 \right\rangle \mathrm{e}^{-\mathrm{i}(E\_2 - E\_1)t} \right) \mathrm{e}^{-\mathrm{i}E\_1t} \end{split} \tag{11.3}$$

where E<sub>1</sub> and E<sub>2</sub> are the energies of the ν<sub>1</sub> and ν<sub>2</sub>. Now

$$\begin{aligned} E\_i &= \sqrt{p^2 + m\_i^2} \\ &\simeq p \left( 1 + \frac{m\_i^2}{2p^2} \right) \end{aligned}$$

where, because m<sub>1</sub> and m<sub>2</sub> are undoubtedly very small, the approximation is good for all practical values of p, and p ≃ E. Ignoring the common phase factor e<sup>−iE<sub>1</sub>t</sup> (which will later be cancelled by its conjugate) and rearranging eqn 11.3,

$$\begin{aligned} \left| \psi(t) \right\rangle &= \cos \theta \left| \nu\_1 \right\rangle + \sin \theta \left| \nu\_2 \right\rangle \mathrm{e}^{-\mathrm{i}(m\_2^2 - m\_1^2)t/2E} \\ &= \cos \theta \left| \nu\_1 \right\rangle + \sin \theta \left| \nu\_2 \right\rangle \mathrm{e}^{-\mathrm{i}\Delta m^2 t/2E} \\ &= \cos \theta \left| \nu\_1 \right\rangle + \sin \theta \left| \nu\_2 \right\rangle \mathrm{e}^{-\mathrm{i}\phi} \end{aligned} \tag{11.4}$$

The phase difference φ = Δm<sup>2</sup>t/2E between the ν<sub>1</sub> and ν<sub>2</sub> components of |ψ(t)⟩ depends on Δm<sup>2</sup> = m<sub>2</sub><sup>2</sup> − m<sub>1</sub><sup>2</sup>, the difference between the squared masses of ν<sub>1</sub> and ν<sub>2</sub>, and on t/E.

The state |ψ⟩ at time t is given by eqn 11.4 in terms of |ν<sub>1</sub>⟩ and |ν<sub>2</sub>⟩. It can be re-expressed in terms of |ν<sub>α</sub>⟩ and |ν<sub>β</sub>⟩ by inverting eqn 11.2:

$$|\nu\_1\rangle = \cos\theta \left|\nu\_\alpha\right\rangle - \sin\theta \left|\nu\_\beta\right\rangle$$

$$|\nu\_2\rangle = \sin\theta \left|\nu\_\alpha\right\rangle + \cos\theta \left|\nu\_\beta\right\rangle$$

and hence

$$|\psi(t)\rangle = (\cos^2\theta + \sin^2\theta \,\mathrm{e}^{-\mathrm{i}\phi})|\nu\_\alpha\rangle + \cos\theta\sin\theta \,(\mathrm{e}^{-\mathrm{i}\phi} - 1)|\nu\_\beta\rangle$$

The probability of observing a ν<sub>β</sub> at a time t is therefore

$$P(\alpha \to \beta) = |\langle \nu\_{\beta} | \psi(t) \rangle|^{2} \tag{11.5}$$

$$=\cos^2\theta\sin^2\theta\left(\mathrm{e}^{-\mathrm{i}\phi}-1\right)\left(\mathrm{e}^{\mathrm{i}\phi}-1\right)\tag{11.6}$$

$$=\cos^2\theta\sin^2\theta\left(2-2\cos\phi\right)\tag{11.7}$$

$$=4\cos^2\theta\sin^2\theta\sin^2\left(\frac{1}{2}\phi\right)\tag{11.8}$$

$$=\sin^2(2\theta)\sin^2\left(\frac{\Delta m^2 t}{4E}\right) \tag{11.9}$$

Likewise, the probability that the ν<sub>α</sub> is still observed as a ν<sub>α</sub> at t is

$$\begin{aligned} P(\alpha \to \alpha) &= |\langle \nu\_{\alpha} | \psi(t) \rangle|^2 \\ &= 1 - \sin^2(2\theta) \sin^2\left(\frac{\Delta m^2 t}{4E}\right) \end{aligned}$$

Since neutrinos travel at speed c,<sup>3</sup> the time of flight is t = L in natural units, and the probability that a ν<sub>α</sub> of energy E is observed as a ν<sub>β</sub> at a distance L from a source is therefore

$$P(\alpha \to \beta) = \sin^2(2\theta) \sin^2\left(\frac{\Delta m^2 L}{4E}\right) \tag{11.10}$$

and the probability that the ν<sub>α</sub> is observed as a ν<sub>α</sub> at L is

$$P(\alpha \to \alpha) = 1 - \sin^2(2\theta) \sin^2\left(\frac{\Delta m^2 L}{4E}\right) \tag{11.11}$$

Note that P(α → α) + P(α → β) = 1, as required by unitarity.<sup>4</sup>

<sup>3</sup>We assume that neutrinos travel at speed c. If neutrinos have non-zero mass differences, at least one flavour of neutrino must travel at a lower speed than c and the speed of the two neutrinos must be slightly different. However, for practical purposes, the error introduced by this approximation is negligible.

<sup>4</sup>It is interesting to note that if the baseline satisfies Δm<sup>2</sup>L/E ≫ 1, then the phase will oscillate very rapidly and, on averaging, sin<sup>2</sup>(Δm<sup>2</sup>L/E) → 1/2.

Equations 11.10 and 11.11 show that if at least one pair of neutrinos have different masses and there is some mixing, i.e. θ ≠ 0, then transitions between neutrinos of different flavours can occur, violating conservation of lepton flavour number, although not of overall lepton number. The transition probabilities given by eqns 11.10 and 11.11 are simple oscillatory functions of distance and are generally described as 'oscillation probabilities'; P(α → β) is referred to as the ν<sub>β</sub> *appearance* probability and P(α → α) as the ν<sub>α</sub> *survival* probability. Before neutrino oscillations were discovered, there was no theoretical guidance as to the value of Δm<sup>2</sup> and it was assumed that any mixing, i.e. θ, would be very small.
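The algebra leading to eqns 11.10 and 11.11 can be checked numerically. The Python sketch below (function names are ours, chosen for illustration) builds the evolved state of eqn 11.4 in the mass basis, projects it onto the flavour states of eqn 11.2, and compares the squared amplitude with the closed form sin<sup>2</sup>(2θ) sin<sup>2</sup>(φ/2), where φ = Δm<sup>2</sup>t/2E.

```python
import cmath
import math


def flavour_amplitudes(theta, phi):
    """Amplitudes <nu_alpha|psi(t)> and <nu_beta|psi(t)> for phase phi = dm2*t/(2E)."""
    c, s = math.cos(theta), math.sin(theta)
    # Mass-basis coefficients of |psi(t)> from eqn 11.4 (common phase dropped).
    a1 = c
    a2 = s * cmath.exp(-1j * phi)
    # Project onto the flavour states defined in eqn 11.2.
    amp_alpha = c * a1 + s * a2
    amp_beta = -s * a1 + c * a2
    return amp_alpha, amp_beta


def appearance_probability(theta, phi):
    """Closed form sin^2(2 theta) sin^2(phi/2), as in eqn 11.9."""
    return math.sin(2.0 * theta) ** 2 * math.sin(0.5 * phi) ** 2


if __name__ == "__main__":
    theta, phi = 0.6, 2.5
    amp_alpha, amp_beta = flavour_amplitudes(theta, phi)
    print(abs(amp_beta) ** 2, appearance_probability(theta, phi))
    print(abs(amp_alpha) ** 2 + abs(amp_beta) ** 2)
```

The two printed probabilities should agree to rounding error, and the survival and appearance probabilities should sum to one for any choice of θ and φ.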

Since neutrino masses are now known to be very small, Δm<sup>2</sup> must also be small and L/E must be large for flavour-changing oscillations to be observable. For practical purposes, the dependence of the oscillation probabilities on L/E is usefully expressed as (see Exercise 11.7)

$$P(\alpha \to \beta) = \sin^2(2\theta) \sin^2\left(\frac{1.27\Delta m^2 L}{E}\right) \tag{11.12}$$

$$P(\alpha \to \alpha) = 1 - \sin^2(2\theta) \sin^2\left(\frac{1.27\Delta m^2 L}{E}\right) \tag{11.13}$$

where L/E is in km GeV<sup>−1</sup> or m MeV<sup>−1</sup> and Δm<sup>2</sup> is in eV<sup>2</sup>. Even with Δm<sup>2</sup> as (unrealistically) large as 1 eV<sup>2</sup>, a detector would have to be 1 km from a source of 1 GeV neutrinos for the phase of the energy-dependent terms in eqns 11.12 or 11.13 to approach π/2 and for such an experiment to be sensitive to oscillations. The smallness of Δm<sup>2</sup> therefore explains why neutrino oscillations were discovered with neutrinos from natural sources (cosmic rays and the Sun) rather than at particle accelerators.
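The numerical factor 1.27 arises from restoring ħ and c in the phase Δm<sup>2</sup>L/4E and converting to the practical units quoted above. A minimal Python sketch of this conversion follows; the function names are ours, and the value of ħc is the standard one rather than a number taken from the text.

```python
import math

# hbar * c in eV m (standard value, 197.327 MeV fm).
HBAR_C_EV_M = 1.973269804e-7


def phase_coefficient():
    """Coefficient k such that phase = k * dm2[eV^2] * L[km] / E[GeV].

    In SI-style units the phase is dm2 * L / (4 * E * hbar c), with dm2 in eV^2,
    L in m, E in eV, and hbar c in eV m; 1 km = 1e3 m and 1 GeV = 1e9 eV.
    """
    return 1.0e3 / (4.0 * 1.0e9 * HBAR_C_EV_M)


def appearance_probability(sin2_2theta, dm2_ev2, length_km, energy_gev):
    """Two-flavour appearance probability in the form of eqn 11.12."""
    phase = phase_coefficient() * dm2_ev2 * length_km / energy_gev
    return sin2_2theta * math.sin(phase) ** 2


if __name__ == "__main__":
    print("coefficient:", phase_coefficient())
    # The example in the text: dm2 = 1 eV^2, L = 1 km, E = 1 GeV, maximal mixing.
    print("P:", appearance_probability(1.0, 1.0, 1.0, 1.0))
```

The same coefficient applies for L in m and E in MeV, since the two factors of 10<sup>3</sup> cancel.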

## **11.4 Evidence for neutrino oscillations**

Detecting neutrinos is difficult because of the very small cross sections involved. The neutrino–proton cross section scales with neutrino energy E<sub>ν</sub> as σ ∼ G<sub>F</sub><sup>2</sup>m<sub>p</sub>E<sub>ν</sub>, so, for example, σ(ν<sub>e</sub>p) ∼ 10<sup>−41</sup> cm<sup>2</sup> for a 10 MeV neutrino.<sup>5</sup> Therefore, very large active targets and/or very intense neutrino sources, such as nuclear reactors, are required.

<sup>5</sup>This is 15 orders of magnitude smaller than the pp total cross section.

Increased sensitivity to small mass differences can be achieved by using lower-energy neutrinos and making observations at greater distances from the source (see Section 11.3). The actual energies and distances used in real experiments are a compromise between these factors and the requirement to have a measurable reaction rate. Experiments looking for neutrinos produced in the Sun or by cosmic rays in the atmosphere need to be located deep underground to reduce the background from cosmic rays. For the case of low-energy neutrinos studied in solar neutrino oscillations, extreme care must be taken to minimize radioactive backgrounds. For the higher-energy neutrinos produced by accelerators, the detectors can be similar to those used to study neutrino deep inelastic scattering (see Chapter 9). In the case of neutrinos from nuclear reactors or secondary beams from accelerators, if there are detectors at different distances from the source, the oscillations can be measured without needing to know the absolute neutrino flux of the source, thus eliminating a major source of systematic uncertainty. Alternatively, if the distance is fixed, one can study the change in the energy spectrum caused by oscillations. The approximate sensitivity for neutrino mass-squared differences using various sources of neutrinos is given in Table 11.1.

**Table 11.1** Approximate sensitivities of the different types of neutrino oscillation experiments.

We will first look at the evidence for oscillations from atmospheric neutrinos (Section 11.4.1). The laboratory confirmation of these oscillations will be briefly reviewed in Section 11.4.2. We will then review the evidence for solar neutrino oscillations (Section 11.4.3) and describe the confirmation of these oscillations from experiments using reactor neutrinos (Section 11.4.5). The explanation of the solar neutrino oscillations requires enhanced oscillation probabilities when neutrinos travel through matter (the so-called MSW effect) and a simple explanation will be given in Section 11.4.4. The prospects for studying CP violation in the neutrino sector will be briefly reviewed in Section 11.5.3. Finally, in Section 11.5.4, we will review the most exciting prospect in the area of neutrino oscillations, namely the idea that neutrino oscillations might explain the observed matter–antimatter asymmetry of the Universe.

## **11.4.1 Atmospheric neutrinos**

Neutrinos are produced by cosmic rays (protons and heavier nuclei) colliding with air nuclei in the atmosphere. The flux of high-energy cosmic rays is found to scale with energy E proportionally to E<sup>−2.7</sup>. However, below about 20 GeV, the primary cosmic rays are strongly affected by the Earth's magnetic field. Even allowing for the linear increase in neutrino cross section with energy, the neutrinos that interact will be predominantly of low energy; the event rate peaks at about 1 GeV. A large background comes from other cosmic rays that interact electromagnetically or strongly and can overwhelm the neutrino signal. This background is greatly suppressed by operating the detectors deep underground.<sup>6</sup> The primary cosmic rays interact with nitrogen and oxygen in the atmosphere,<sup>7</sup> producing charged pions, which decay in turn to muon and electron neutrinos according to the decay chain for π<sup>+</sup> (with a similar decay chain for π<sup>−</sup>):

$$\begin{aligned} \pi^+ &\rightarrow \mu^+ + \nu\_\mu \\ \mu^+ &\rightarrow e^+ + \bar{\nu}\_\mu + \nu\_e \end{aligned}$$

<sup>6</sup>Typical depths are greater than about 1 km.

<sup>7</sup>The atmosphere is approximately 11 nuclear interaction lengths deep, so the probability of a primary cosmic ray not interacting in the atmosphere is negligible.

Thus, we naively (see Exercise 11.3) expect two muon-type neutrinos for each π<sup>+</sup> but only one electron-type neutrino. The best data come from the Super-Kamiokande experiment, which uses a 50 000 ton water Čerenkov detector to detect the electrons and muons produced by the neutrino interactions in the water.<sup>8</sup> The Čerenkov light produced by a charged particle travelling with a speed greater than the speed of light in water is emitted in a ring around the direction of the particle's motion (see Chapter 4). Electron neutrinos produce electrons, which in turn produce electromagnetic showers in the water. Therefore, electron neutrinos tend to produce 'fuzzier' rings than muon neutrinos and this can be used to provide a powerful statistical separation between the two flavours of neutrinos. The different responses of the Super-Kamiokande detector to electrons and muons are illustrated in the event displays [129] in Fig. 11.1.<sup>9</sup> Although there are 20% uncertainties in the absolute neutrino fluxes, some of the uncertainties cancel in the ratio R = Flux(ν<sub>μ</sub>)/Flux(ν<sub>e</sub>). The data are usually presented in terms of the double ratio R′ = R<sub>experiment</sub>/R<sub>predicted</sub>, where the prediction assumes no neutrino oscillations. The results from the Super-Kamiokande experiment and other experiments [115] are summarized in Table 11.2. The values of R′ are significantly lower than unity, which is the value that would be expected in the absence of neutrino oscillations.

<sup>8</sup>This detector and others like Soudan were originally designed to search for proton decay, with neutrino interactions being regarded as a source of background.

<sup>9</sup>If a muon stops inside the detector, the subsequent decay will produce a delayed electron that produces a displaced ring. This provides additional identification power for muons.

Even more direct evidence for neutrino oscillations comes from the zenith-angle distribution. The zenith angle θ<sub>Z</sub> is the angle between the vertical and the direction of the incoming neutrino. So values of cos θ<sub>Z</sub> close to 1 correspond to distances travelled by the neutrinos of ∼10 km, whereas those with negative values of cos θ<sub>Z</sub> have travelled for distances of ∼10 000 km. The distributions from Super-Kamiokande [115] are shown in Fig. 11.2.

**Fig. 11.1** Super-Kamiokande event displays [129] for (a) a muon neutrino event and (b) an electron neutrino event.

**Table 11.2** Measurement of atmospheric neutrino ratio R′. Exposure is in units of kton-years and the suffixes 's' and 'm' stand for sub-GeV and multi-GeV, respectively. The errors quoted are the statistical and systematic errors.

**Fig. 11.2** Zenith-angle distributions for the electron-like (a, c, e) and muon-like (b, d, f) events [115]. The events are classified in three energy ranges: less than 400 MeV (a, b), less than 1 GeV (c, d), and greater than 1 GeV (e, f). The dotted histograms show the expectations in the absence of neutrino oscillations and the solid histogram is the result of the fit for ν<sub>μ</sub> → ν<sub>τ</sub> oscillations.

These distributions show good agreement with the no-oscillation calculations for the electron neutrinos and for the muon neutrinos at large positive values of cos θ<sub>Z</sub>. However, the muon neutrinos show a large deficit at negative values of cos θ<sub>Z</sub>. This is exactly what would be expected for muon neutrino oscillations, because the events at negative values of cos θ<sub>Z</sub> correspond to neutrinos that have travelled large distances through the Earth (∼10 000 km), whereas the events at positive cos θ<sub>Z</sub> correspond to neutrinos that have only travelled ∼20 km. The data can be explained by neutrino oscillations of the type ν<sub>μ</sub> → ν<sub>τ</sub> with sin<sup>2</sup> 2θ ∼ 1 and Δm<sup>2</sup> ∼ 0.003 eV<sup>2</sup>.
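The link between zenith angle, path length, and survival probability can be made concrete with a short Python sketch. It is a simplified illustration under stated assumptions: neutrinos are taken to be produced at a single height of 20 km (an assumed representative value), the detector is placed at the Earth's surface, and the two-flavour formula of eqn 11.13 is used with the parameters sin<sup>2</sup> 2θ = 1 and Δm<sup>2</sup> = 0.003 eV<sup>2</sup> quoted above. The function names are ours.

```python
import math

R_EARTH_KM = 6371.0
# ASSUMPTION: single representative production height in the atmosphere.
PRODUCTION_HEIGHT_KM = 20.0


def path_length_km(cos_zenith, height=PRODUCTION_HEIGHT_KM, radius=R_EARTH_KM):
    """Distance from the production point to a detector at the surface.

    Follows from the geometry of a straight line leaving the detector at
    zenith angle theta_Z and reaching a sphere of radius (radius + height).
    """
    sin2 = 1.0 - cos_zenith ** 2
    return math.sqrt((radius + height) ** 2 - radius ** 2 * sin2) - radius * cos_zenith


def survival_probability(length_km, energy_gev, dm2_ev2=0.003, sin2_2theta=1.0):
    """Two-flavour muon-neutrino survival probability (form of eqn 11.13)."""
    phase = 1.27 * dm2_ev2 * length_km / energy_gev
    return 1.0 - sin2_2theta * math.sin(phase) ** 2


if __name__ == "__main__":
    for cos_z in (1.0, 0.0, -1.0):
        length = path_length_km(cos_z)
        print(cos_z, length, survival_probability(length, 1.0))
```

For down-going neutrinos of about 1 GeV the phase is small and almost no muon neutrinos disappear. For up-going neutrinos the phase is many radians and varies rapidly with energy, so after averaging over the energy spectrum roughly half of the muon neutrinos survive, in line with the deficit seen at negative cos θ<sub>Z</sub>.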

## **11.4.2 Laboratory confirmation of atmospheric neutrino oscillation**

We start this section with the description of how a neutrino beam is produced from an accelerator. Then follows a discussion of results from the MINOS experiment.

## **The NuMI beam at Fermilab**

The creation<sup>10</sup> of a high-intensity, high-energy, laboratory beam of neutrinos (and antineutrinos) starts by extracting the primary 120 GeV proton beam from the Main Injector (2.5 × 10<sup>13</sup> protons per pulse) and directing it at a long thin target (see Fig. 11.3). This produces a large flux of charged pions and kaons, with energies in the range 2–60 GeV, which is then focused and collimated, before entering a 675 m evacuated pipe in which the πs and Ks decay to μν<sub>μ</sub> (see Exercise 11.8). The decay pipe is aimed at the MINOS far detector. The next stage is to absorb any remaining hadrons and the large flux of muons. The hadron absorber, consisting mainly of steel, is placed immediately after the decay pipe. Finally, muons must be removed before the NuMI beam passes through the MINOS near detector in a cavern 240 m beyond the hadron absorber. The intervening rock is dolomite and of sufficient density to absorb the muon flux within that distance.

<sup>10</sup>We cannot focus neutrinos, but we can produce a somewhat collimated neutrino beam by focusing the charged pions before they decay. The focusing uses toroidal fields that vary as 1/r (where r is the radial distance from the beam axis). This ensures that particles emerging at large angles to the beam axis remain in the field region for longer, which in turn ensures that charged particles within a certain angular range emerge parallel to the beam axis. This is achieved using 'magnetic horns' that require currents of ∼100 kA. This is too large for DC operation, so pulsed currents are used (synchronized to the proton bunches).

**Fig. 11.3** Schematic view of the structure used to create intense neutrino beams.

By the time the neutrinos have reached a distant detector, the beam size will have become too large for a detector to contain the beam. Therefore, the detector volume is made as large as can be afforded to compensate for the small neutrino cross sections. In practice, the optimal detector shape tends to be approximately cylindrical, with the length along the beam direction being much greater than the width. As very large detectors are required, relatively cheap and simple detector technologies must be employed (see Chapter 4 for a discussion of generic neutrino detectors).
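The choice of a 675 m decay pipe for the NuMI beam can be understood from the relativistic decay length βγcτ of the parent mesons. The Python sketch below estimates the fraction of pions and kaons that decay within the pipe for energies in the quoted 2–60 GeV range. The masses and cτ values are standard numbers rather than values taken from the text, the function names are ours, and the estimate ignores interactions with residual gas or the pipe walls as well as the branching fractions to μν<sub>μ</sub>.

```python
import math

# (mass in GeV, c*tau in m): standard values for charged pions and kaons.
PION = (0.13957, 7.8045)
KAON = (0.49368, 3.711)


def decay_length_m(energy_gev, particle):
    """Mean decay length beta*gamma*c*tau = (p/m) * c*tau in metres."""
    mass, c_tau = particle
    momentum = math.sqrt(energy_gev ** 2 - mass ** 2)
    return (momentum / mass) * c_tau


def decay_fraction(energy_gev, particle, pipe_length_m=675.0):
    """Fraction of particles decaying within the pipe, 1 - exp(-L/lambda)."""
    return 1.0 - math.exp(-pipe_length_m / decay_length_m(energy_gev, particle))


if __name__ == "__main__":
    for energy in (2.0, 10.0, 60.0):
        print(energy, decay_fraction(energy, PION), decay_fraction(energy, KAON))
```

Low-energy pions almost all decay within the pipe, whereas at the top of the energy range only a modest fraction do, so a longer pipe would mainly add high-energy neutrinos.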

## **MINOS**

For example, the MINOS<sup>11</sup> experiment [109] uses magnetized iron as the target, with scintillator layers interspersed with the iron plates. The MINOS detector has two parts: the far detector, located at a distance of 735 km from Fermilab, and a near detector at Fermilab. The far detector consists of alternating plates of iron and scintillator, with a total mass of 5400 tons.<sup>12</sup> The iron acts as the passive absorber for the calorimeter, but in addition it is magnetized. The scintillator is segmented into narrow strips, which are read out by wavelength-shifting fibres.<sup>13</sup> These fibres are coupled to photomultiplier tubes via additional clear fibres. The scintillator signals are used for calorimeter measurements as well as for muon tracking. The muons are identified because they penetrate much deeper into the detector. Their momentum can be determined from track curvature in the magnetic field, but if they are contained in the detector, a more precise determination can be made from their range.

The detector is located 716 m underground (in the Soudan mine) in order to minimize backgrounds from cosmic rays. To further reduce backgrounds, veto detectors are located around the main detector. The signal for charged-current interactions of ν<sub>μ</sub> is the presence of a high-momentum muon. The hadronic energy can be measured by the total energy deposited in the scintillators. A critical feature of the experiment is that there is a near detector close to the origin of the neutrino beam as well as a far detector at a distance of 735 km. The near detector measures the neutrino flux at the origin and can therefore be used to predict the flux and energy spectrum of neutrinos at the far detector in the absence of oscillations. It was the difference between the measured and predicted rates in the far detector that confirmed [10] the presence of muon neutrino oscillations (see Fig. 11.4).

<sup>11</sup>Main Injector Neutrino Oscillation Search.

<sup>12</sup>Unless otherwise stated, 'MINOS detector' refers to the far detector.

<sup>13</sup>The wavelength-shifting fibres are doped with a suitable chemical to convert shorter-wavelength photons to longer, which increases the absorption length. The photons trapped inside the fibre can easily be transported to the photomultipliers using total internal reflection (as in normal optical fibres).

## **11.4.3 Solar neutrinos**

The Sun generates energy by nuclear fusion reactions. The most important reactions are those of the pp cycle:

$$\begin{aligned} p + p &\to \,^2\text{H} + \nu\_e + e^+ &+ 0.42\,\text{MeV} \\ \,^2\text{H} + p &\to \,^3\text{He} + \gamma &+ 5.49\,\text{MeV} \\ \,^3\text{He} + \,^3\text{He} &\to \,^4\text{He} + 2p &+ 12.86\,\text{MeV} \end{aligned} \tag{11.14}$$
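As a check on the energetics of the pp cycle in eqn 11.14, the short Python sketch below adds up the energy released in converting four protons into one <sup>4</sup>He nucleus. The first two reactions must each occur twice to supply the two <sup>3</sup>He nuclei, and each positron subsequently annihilates with an electron; the annihilation energy uses the standard electron rest energy, which is not quoted in the text. The function name is ours.

```python
# Energy releases quoted in eqn 11.14, in MeV.
Q_PP = 0.42
Q_DP = 5.49
Q_HE3_HE3 = 12.86

# Electron rest energy in MeV (standard value).
M_E_MEV = 0.511


def pp_chain_energy_mev():
    """Total energy released per 4He nucleus formed in the pp cycle."""
    # The first two reactions each occur twice per 3He + 3He fusion.
    nuclear = 2.0 * (Q_PP + Q_DP) + Q_HE3_HE3
    # Each of the two positrons annihilates, releasing 2 * m_e c^2.
    annihilation = 2.0 * (2.0 * M_E_MEV)
    return nuclear + annihilation


if __name__ == "__main__":
    print(pp_chain_energy_mev())
```

The total is close to 27 MeV per helium nucleus, of which only a small part is carried away by the two neutrinos, since each shares at most 0.42 MeV with the positron.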

Other branches of the cycle can produce higher-energy neutrinos, which are easier to detect; for example, we can have

$$\begin{aligned} \,^3\text{He} + \,^4\text{He} &\to \,^7\text{Be} + \gamma &+ 1.59\,\text{MeV} \\ \,^7\text{Be} + p &\to \,^8\text{B} + \gamma &+ 0.14\,\text{MeV} \\ \,^8\text{B} &\to \,^8\text{Be} + e^+ + \nu\_e &+ 14.6\,\text{MeV} \\ \,^8\text{Be} &\to 2\,^4\text{He} &+ 3\,\text{MeV} \end{aligned} \tag{11.15}$$

where the endpoint of the resulting neutrino spectrum is at 14.6 MeV. The Standard Solar Model (SSM) prediction [42] for the solar neutrino spectrum is shown in Fig. 11.5. Note that the bulk of the spectrum is due to the pp cycle and is therefore at very low energy, which makes the neutrino detection much more difficult.<sup>14</sup>

<sup>14</sup>The low-energy neutrinos from the pp cycle have been measured using Ga detectors; however, the main evidence for neutrino oscillations relies on measurements of the higher-energy neutrinos.

**Fig. 11.4** Evidence for ν<sub>μ</sub> disappearance from the MINOS experiment. (a) Energy spectrum of the neutrinos as reconstructed in the far detector. The data are compared with the no-oscillation hypothesis and with an oscillation fit. (b) Ratio of the observed spectrum to that expected from the no-oscillation hypothesis [10].

**Fig. 11.5** Predicted neutrino energy spectrum in the SSM [115].

The first technique used to detect solar neutrinos was based on the reaction ν<sub>e</sub> + <sup>37</sup>Cl → <sup>37</sup>Ar + e<sup>−</sup>. The chlorine was contained in a tank of 630 tons of C<sub>2</sub>Cl<sub>4</sub>,<sup>15</sup> from which the argon was extracted periodically. The <sup>37</sup>Ar decays by electron capture (the inverse of the reaction that produced it), which leaves the daughter atom in an excited state: the capture of an atomic electron leaves a 'hole' in a low-energy state, and this will be filled by an electron from a higher energy level. The energy released can result in an outer electron being ejected from the atom (the 'Auger' effect). The resulting Auger electrons were detected in a proportional counter (see Chapter 4). This experiment was started over 40 years ago by Ray Davis and was the first to detect solar neutrinos and show that the rate was lower than expected. Davis shared the 2002 Physics Nobel Prize with Masatoshi Koshiba (Kamiokande) and Riccardo Giacconi.<sup>16</sup>

<sup>15</sup>C<sub>2</sub>Cl<sub>4</sub> (perchloroethylene) is often used in dry cleaning.

<sup>16</sup>The former director of the Space Telescope Science Institute.

Several experiments have seen a clear deficit of electron neutrinos compared with the predictions of the SSM. It is very difficult to reconcile these data with modifications of the SSM, whereas they could all be explained by electron neutrinos oscillating into other flavours [115]. However, the most model-independent demonstration that the neutrino deficit is due to oscillations rather than problems with the SSM comes from the SNO experiment [12]. The SNO experiment used 1000 tons of very pure D<sub>2</sub>O viewed by 10 000 large photomultipliers. The primary reactions are the charged-current (CC) interactions of the electron neutrinos,

$$
\nu\_e + \mathrm{D} \to p + p + e^- \tag{11.16}
$$

the neutral-current (NC) interactions of all neutrino flavours,

$$
\nu\_x + \mathrm{D} \to p + n + \nu\_x \tag{11.17}
$$

and the elastic scattering (ES) reaction

$$
\nu\_x + e^- \to \nu\_x + e^- \tag{11.18}
$$

The Feynman diagrams for the three processes are shown in Fig. 11.6. The CC interactions are sensitive only to electron neutrinos because the thresholds of the equivalent reactions with other neutrino flavours are too high. The NC interactions are equally sensitive to all neutrino flavours. All three neutrino flavours contribute to the ES reactions, but the cross section is much larger (∼6 times) for electron neutrinos. Therefore, by measuring the rates for CC, NC, and ES interactions, which provide enough information to deduce the total ν<sub>e</sub> + ν<sub>μ</sub> + ν<sub>τ</sub> rate, one can look for evidence of neutrino flavour transitions, independent of the SSM. Unambiguous evidence for neutrino flavour transitions is then provided by comparing the measured flux of all neutrino flavours (as determined by the NC interactions) with the flux of electron neutrinos as determined from the CC interactions. The signals for the CC and ES interactions come from the produced electrons generating electromagnetic showers, which then lead to Čerenkov radiation, which is detected in the photomultipliers. The NC interactions are more difficult to detect.

**Fig. 11.6** Feynman diagrams at the quark level for charged-current (a), neutral-current (b), and elastic scattering (c) reactions.

In the first phase of SNO, NC interactions were detected through capture of the neutron on deuterium, which results in 6.25 MeV photons, which produce electromagnetic showers that then lead to Čerenkov radiation. In order to increase the sensitivity of the experiment to NC interactions, in the second phase of SNO operation (SNOs), 2 tons of NaCl were added to the D<sub>2</sub>O because this enabled neutron capture on <sup>35</sup>Cl. The <sup>35</sup>Cl has a high absorption cross section for low-energy neutrons, and neutron capture leads to photons with an energy distribution peaked around 8 MeV. The measured fluxes from SNO (in units of 10<sup>6</sup> cm<sup>−2</sup> s<sup>−1</sup>) are [12]

$$\begin{aligned} \phi\_{\rm CC} &= 1.68 \pm 0.06 \,\text{(stat.)} \,\, ^{+0.08}\_{-0.09} \,\text{(syst.)}\\ \phi\_{\rm ES} &= 2.35 \pm 0.22 \,\text{(stat.)} \pm 0.15 \,\text{(syst.)}\\ \phi\_{\rm NC} &= 4.94 \pm 0.21 \,\text{(stat.)} \,\, ^{+0.38}\_{-0.24} \,\text{(syst.)} \end{aligned} \tag{11.19}$$

Combining these measurements determines the flux of muon and tau neutrinos (in units of 10<sup>6</sup> cm<sup>−2</sup> s<sup>−1</sup>) as

$$
\phi(\nu\_{\mu}) + \phi(\nu\_{\tau}) = 3.26 \pm 0.25 \,\text{(stat.)} \,\, ^{+0.40}\_{-0.35} \,\text{(syst.)} \tag{11.20}
$$
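The arithmetic behind eqn 11.20 can be sketched in a few lines. The following is a minimal illustration using only the central values of eqn 11.19; the function names are ours, and the uncertainties are deliberately not propagated, since the published systematic errors are correlated and cannot be combined from the numbers quoted here.

```python
# Central values from eqn 11.19, in units of 1e6 cm^-2 s^-1.
PHI_CC = 1.68  # charged current: electron neutrinos only
PHI_NC = 4.94  # neutral current: all three flavours


def non_electron_flux(phi_nc, phi_cc):
    """Flux of nu_mu + nu_tau: all flavours minus the electron-neutrino part."""
    return phi_nc - phi_cc


def electron_fraction(phi_nc, phi_cc):
    """Fraction of the total flux that arrives as electron neutrinos."""
    return phi_cc / phi_nc


phi_mu_tau = non_electron_flux(PHI_NC, PHI_CC)
fraction = electron_fraction(PHI_NC, PHI_CC)

print(f"phi(nu_mu) + phi(nu_tau) = {phi_mu_tau:.2f}")  # 3.26
print(f"nu_e fraction            = {fraction:.2f}")    # 0.34
```

The electron-neutrino fraction of about one third is the number that is hard to accommodate with averaged vacuum oscillations in a two-flavour picture, where the fraction cannot fall below 1/2.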

This observation of a non-zero flux of ν<sub>μ</sub> or ν<sub>τ</sub> from the Sun therefore provides model-independent evidence of solar neutrino oscillations. The SNO results are combined with those from Super-Kamiokande [115] and are shown in Fig. 11.7. The data are clearly inconsistent with the no-oscillation hypothesis (φ(ν<sub>μ</sub>) + φ(ν<sub>τ</sub>) = 0). All the data are consistent with each other and with the predictions of the SSM. Finally, the combined results from all phases of the SNO experiment can also be used to measure the total flux of <sup>8</sup>B neutrinos<sup>17</sup> from the Sun, φ<sup>SNO</sup><sub>NC</sub> = 5.25 ± 0.16 (stat.) <sup>+0.11</sup><sub>−0.13</sub> (syst.), which is in good agreement with the SSM prediction of φ<sub>SSM</sub> = 5.94 <sup>+1.01</sup><sub>−0.81</sub>. This confirms that the Sun is producing energy from nuclear fusion at the rate expected in the SSM.

<sup>17</sup>The threshold used by SNO to reduce the backgrounds meant that the detector was mainly sensitive to <sup>8</sup>B neutrinos.

**Fig. 11.7** Fluxes of <sup>8</sup>B solar neutrinos, deduced from the SNO charged-current, neutral-current, and elastic scattering results from the salt-phase measurement and the results from Super-Kamiokande. The vertical axis is the flux of muon and tau neutrinos φ<sub>μτ</sub> and the horizontal axis is the flux of electron neutrinos φ<sub>e</sub>. The expectations from the SSM are shown by dashed lines [115].

Although the SNO results confirm the hypothesis that neutrino oscillations occur, in the context of the vacuum oscillation model we have considered it is difficult to understand why the ratio of electron neutrinos to all flavours of neutrinos, ν<sub>e</sub>/ν<sub>x</sub>, is less than 1/2. If the oscillation length is short compared with the Sun–Earth distance, so that many oscillations occur in transit, then the minimum value of the ratio would be 1/2. This statement is strictly only true in a two-flavour model, but this is a good approximation for solar neutrinos. Before considering the probable explanation for this, we first need to consider other solar neutrino experiments that are sensitive to lower-energy neutrinos. Most of the total flux of neutrinos comes from the first reaction in the pp chain (see eqn. 11.14), which generates neutrinos with a continuous spectrum with an endpoint at 0.42 MeV (see Fig. 11.5). The flux of these low-energy neutrinos is very tightly constrained by the observed solar luminosity, unlike the flux of higher-energy neutrinos, which is more sensitive to details of the solar model (in particular the core temperature). Therefore, the original motivation for efforts to detect these low-energy neutrinos was to see if the solar neutrino deficit was due to the solar model or to neutrino oscillations. However, from our perspective, these data are particularly interesting when compared with the higher-energy data, because they reveal a significant energy dependence in the oscillation probability.

The detection of very low-energy neutrinos (E<sub>ν</sub> < 0.42 MeV) is challenging. The radioactive backgrounds in water Čerenkov detectors are too large to allow a sufficiently low threshold to be set. Such low-energy neutrinos have been studied by radiochemical experiments using the reaction ν<sub>e</sub> + <sup>71</sup>Ga → <sup>71</sup>Ge + e<sup>−</sup>, which has a threshold energy of 233 keV. Slightly over half of the interactions are expected to come from the low-energy pp neutrinos. In the GALLEX experiment, the produced Ge was periodically extracted chemically (about every 10 days) and the number of <sup>71</sup>Ge atoms was counted by detecting their electron-capture decays (essentially the inverse of the reaction that created the <sup>71</sup>Ge). The electron capture leaves a hole in the K or L shell, which is filled by electrons from higher energy states, resulting in X-ray emission. The X-rays are detected in proportional counters filled with xenon gas (xenon is a good absorber for X-rays because of its large Z value).<sup>18</sup> The results from the different gallium experiments are summarized in [115] and show a rate suppressed to about 1/2 of the SSM prediction. Comparing these results for low-energy neutrinos with those from SNO and Super-Kamiokande, which are sensitive to higher-energy neutrinos, we can see that the neutrino oscillation probability has a significant energy dependence. These observations are hard to reconcile with vacuum oscillations, so we need to consider the effect of matter on neutrino oscillations, which we will do in the next section.

<sup>18</sup>The target contains 101 tons of GaCl<sub>3</sub> and a typical 10-day run produces about ten <sup>71</sup>Ge atoms. Therefore, the detector has to be deep underground to be shielded from cosmic rays, and extreme care must be taken to minimize radioactive backgrounds.

## **11.4.4 MSW effect**

There is very strong evidence for neutrino oscillations, as we have seen in Sections 11.4.1–11.4.3 (see also Section 11.4.5). The atmospheric neutrino data can be explained by the phenomenology of vacuum oscillations (see Section 11.3) with a close to maximal mixing angle between ν<sub>μ</sub> and ν<sub>τ</sub>. This implies that we can consider the electron neutrino mixing in the simple two-component picture (the ν<sub>e</sub> mixes with a linear superposition of ν<sub>μ</sub> and ν<sub>τ</sub>). Therefore, in vacuum the probability of a ν<sub>e</sub> created in the Sun remaining as a ν<sub>e</sub> is given by

$$P\_{\nu\_e \to \nu\_e}(t) = 1 - \sin^2 2\theta \sin^2 \left(\frac{\Delta m^2 t}{4E}\right) \tag{11.21}$$

For Δm<sup>2</sup> > 10<sup>−4</sup> eV<sup>2</sup>, the phase factor sin<sup>2</sup>(Δm<sup>2</sup>t/4E) oscillates so rapidly that it will average to a value of 1/2, and therefore the minimum value of the ν<sub>e</sub> survival probability would be 1/2. However, the SNO and other data clearly indicate a significantly lower value (see Section 11.4.3).

This apparent paradox can be understood if we allow for the effects of matter on the propagation of neutrinos through the Sun, the so-called Mikheyev–Smirnov–Wolfenstein (MSW) effect<sup>19</sup> [108, 135]. The effective Hamiltonian differs from its vacuum counterpart by the addition of a term arising from weak interactions with the matter. All flavours of neutrinos can have neutral-current interactions, but only (low-energy) electron neutrinos can undergo charged-current interactions. From the perspective of oscillations, we are only interested in the differences between electron neutrinos and muon or tau neutrinos, and therefore we do not need to consider the neutral-current interactions. The probability of incoherent scattering in the Sun is negligible and in any case could not contribute to interference effects. We are therefore only interested in the charged-current coherent forward scattering, which gives a contribution to the Hamiltonian for electron neutrinos<sup>20</sup> of

$$H(r) = \sqrt{2} \, G\_{\rm F} N\_e(r) \tag{11.22}$$

where G<sub>F</sub> is the Fermi coupling constant and N<sub>e</sub>(r) is the electron number density (per unit volume) at a radius r in the Sun.<sup>21</sup>

<sup>19</sup>Here we give a simplified description of the MSW effect in order to illustrate how it can resolve this apparent paradox.

<sup>20</sup>We are working in the flavour basis for the neutrino eigenstates.

<sup>21</sup>The effect on neutrino propagation is similar to a change in refractive index in the optical case. This then changes the propagation speed for electron neutrinos and hence the oscillation rate.

To understand the MSW effect, we start by considering the Schrödinger equation for two neutrino flavours propagating in vacuum. For oscillations, we are only interested in the terms in the Hamiltonian that are different for electron neutrinos compared with other flavours of neutrinos. We can write this part of the Hamiltonian as

$$H\_{\rm V} = \frac{\Delta m^2}{4p} \begin{pmatrix} -\cos 2\theta & \sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix} \tag{11.23}$$

and the time-independent Schrödinger equation as

$$H\_{\rm V} \begin{pmatrix} \nu\_e \\ \nu\_x \end{pmatrix} = E \begin{pmatrix} \nu\_e \\ \nu\_x \end{pmatrix} \tag{11.24}$$

where ν<sub>x</sub> represents the non-electron neutrino flavours (i.e. those that do not interact with electrons by charged-current interactions). We can then show that the difference in the energies of the two eigenstates is given by ΔE = Δm<sup>2</sup>/2p (see Exercise 11.5). The effect of the charged-current coherent forward scattering is to change the effective potential for electron neutrinos by V<sub>e</sub> = √2 G<sub>F</sub>N<sub>e</sub>, where N<sub>e</sub> is the electron number density. We can evaluate the effect on the Hamiltonian using E<sup>2</sup> − p<sup>2</sup> = m<sup>2</sup> and assuming that neutrinos are ultrarelativistic and V<sub>e</sub> ≪ E:

$$m\_{\nu}^{2} = (E + V\_{e})^{2} - p^{2} \approx m^{2} + 2EV\_{e} \tag{11.25}$$

Therefore, the change in m<sup>2</sup> for the electron neutrino is given by

$$
\Delta m\_{\nu\_e}^2 = 2\sqrt{2} \, G\_{\text{F}} N\_e E \tag{11.26}
$$
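The approximation behind eqns 11.25 and 11.26 can be made explicit with a short expansion. This is only a sketch under the assumptions already stated (ultrarelativistic neutrinos, so that p ≈ E, and V<sub>e</sub> ≪ E); the symbol δE for the resulting shift in the electron-neutrino energy is introduced here purely for illustration.

$$\begin{aligned} (E + V\_e)^2 - p^2 &= (E^2 - p^2) + 2EV\_e + V\_e^2 \\ &= m^2 + 2EV\_e \left(1 + \frac{V\_e}{2E}\right) \approx m^2 + 2EV\_e \\ \delta E &\approx \frac{\Delta m\_{\nu\_e}^2}{2E} = \frac{2\sqrt{2} \, G\_{\rm F} N\_e E}{2E} = \sqrt{2} \, G\_{\rm F} N\_e \end{aligned}$$

The last line uses the same expansion E ≈ p + m<sup>2</sup>/2p as in the vacuum case, and shows that the shift in m<sup>2</sup> is equivalent to adding V<sub>e</sub> to the energy of the electron neutrino only.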

Again assuming that the neutrinos are ultrarelativistic, we can write the contribution from matter to the Hamiltonian as

$$
\Delta H\_{\rm M} = \sqrt{2} \, G\_{\rm F} N\_e \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{11.27}
$$

As usual, we can drop any term proportional to the unit matrix from the point of view of oscillations. It is therefore convenient to split eqn 11.27 into a term proportional to the unit matrix and a traceless matrix relevant for oscillations, and to keep only the latter:

$$
\Delta H\_{\rm M} = \frac{\sqrt{2} \, G\_{\rm F} N\_e}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11.28}
$$

We can then combine the vacuum term, eqn 11.23, with the matter term, eqn 11.28, to obtain the Hamiltonian in the presence of matter:

$$H\_{\rm M} = \frac{\Delta m^2}{4p} \begin{pmatrix} -\cos 2\theta & \sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix} + \frac{\sqrt{2} \, G\_{\rm F} N\_e}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11.29}$$

It is conventional to define A = 2√2 G<sub>F</sub>N<sub>e</sub>p/Δm<sup>2</sup>, simplifying eqn 11.29 to give

$$H\_{\rm M} = \frac{\Delta m^2}{4p} \begin{pmatrix} -\cos 2\theta + A & \sin 2\theta \\ \sin 2\theta & \cos 2\theta - A \end{pmatrix} \tag{11.30}$$

We can then define an effective mixing angle θ<sub>m</sub> and an effective mass-squared difference Δm<sup>2</sup><sub>m</sub> in the presence of matter, and write the total effective Hamiltonian in the same form as the vacuum one:

$$H\_{\rm M} = \frac{\Delta m\_{\rm m}^2}{4p} \begin{pmatrix} -\cos 2\theta\_{\rm m} & \sin 2\theta\_{\rm m} \\ \sin 2\theta\_{\rm m} & \cos 2\theta\_{\rm m} \end{pmatrix} \tag{11.31}$$

Comparing the ratio of the off-diagonal to the diagonal elements in eqn 11.30 with that in eqn 11.31, we can relate the mixing angle in the presence of matter to that in vacuum:

$$\tan 2\theta\_{\rm m} = \frac{\tan 2\theta}{1 - A \sec 2\theta} \tag{11.32}$$
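The behaviour of eqn 11.32 is easy to explore numerically. The sketch below is an illustration only: the function names are ours, the vacuum angle is taken from tan<sup>2</sup> θ ≈ 0.45 (an illustrative input), and the second helper assumes the standard result, obtained by diagonalizing eqn 11.30, that the eigenvalue splitting scales as the square root of sin<sup>2</sup> 2θ + (cos 2θ − A)<sup>2</sup>.

```python
import math


def matter_mixing_angle(theta, a):
    """Effective mixing angle theta_m for a given vacuum angle and A.

    Uses tan(2 theta_m) = sin(2 theta) / (cos(2 theta) - A), which is
    eqn 11.32 multiplied through by cos(2 theta). atan2 keeps 2*theta_m
    in (0, pi), so theta_m passes smoothly through pi/4 at resonance.
    """
    return 0.5 * math.atan2(math.sin(2 * theta), math.cos(2 * theta) - a)


def splitting_ratio(theta, a):
    """Ratio of the eigenvalue splitting in matter to that in vacuum."""
    return math.hypot(math.sin(2 * theta), math.cos(2 * theta) - a)


theta = math.atan(math.sqrt(0.45))   # illustrative vacuum mixing angle
a_res = math.cos(2 * theta)          # resonance condition A = cos(2 theta)

for label, a in (("vacuum", 0.0), ("resonance", a_res), ("dense matter", 10.0)):
    theta_m = matter_mixing_angle(theta, a)
    print(f"{label:13s} A = {a:6.3f}  theta_m = {math.degrees(theta_m):5.1f} deg  "
          f"splitting ratio = {splitting_ratio(theta, a):.3f}")
```

In vacuum the function returns the vacuum angle, at A = cos 2θ the mixing is maximal (θ<sub>m</sub> = 45°), and for A ≫ 1 the angle tends to 90°, so that an electron neutrino produced in dense matter is almost purely the heavier matter eigenstate.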

This clearly has a resonant condition if A > 0 and A = cos 2θ, or, substituting for the definition of A (with p ≈ E), we find the electron neutrino resonant energy at which the mixing becomes maximal:

$$E\_{\rm Res} = \frac{\Delta m^2 \cos 2\theta}{2\sqrt{2} \, G\_{\rm F} N\_e} \tag{11.33}$$

We can now use this MSW formalism to explain how the measured suppression of electron neutrinos can be energy-dependent and how the suppression of high-energy electron neutrinos can be larger than a factor of two, which would be the maximal value for vacuum oscillations (assuming that the electron neutrinos make many oscillations between the Sun and the Earth).

<sup>22</sup>Solar neutrinos are made by nuclear fusion in the core of the Sun. As the core of the Sun is much smaller than the full volume, they will be created near the centre.

Electron neutrinos with energy E > 2 MeV at the centre of the Sun<sup>22</sup> will have a higher energy than E<sub>Res</sub> (see eqn 11.33 and Exercise 11.6). Since the density of the Sun increases towards the centre (owing to gravity), the electron density decreases smoothly as the neutrinos travel outwards; therefore, these electron neutrinos will hit the resonance condition. As the density changes slowly with radius, this will happen 'adiabatically' and the neutrinos that propagate out to the surface of the Sun will be in the heavier mass eigenstate ν<sub>2</sub>:

$$
\nu\_2 = \nu\_e \sin \theta + \nu\_\mu \cos \theta \tag{11.34}
$$

where θ is the vacuum mixing angle. Since these states are now eigenstates of H<sub>V</sub>, they will simply propagate to the Earth with no further oscillations. From eqn 11.34, we can then easily see that the ν<sub>e</sub> survival probability is given by (see eqn 11.11)

$$P(\nu\_e \to \nu\_e) = \sin^2 \theta \tag{11.35}$$

and can therefore be less than 1/2. Electron neutrinos with lower energies will not see this resonance and will effectively propagate as vacuum states. As these neutrinos are oscillating rapidly compared with the transit time from the Sun to the Earth, the average phase factor will be 1/2, so the survival probability will be given by

$$P(\nu\_e \to \nu\_e) = 1 - \frac{1}{2}\sin^2 2\theta \tag{11.36}$$

The combined results of the solar neutrino experiments can be fitted in terms of the MSW effect, with Δm<sup>2</sup> ∼ 5 × 10<sup>−5</sup> eV<sup>2</sup> and tan<sup>2</sup> θ ∼ 0.45 [115].
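To see what these fitted values imply, the sketch below evaluates eqns 11.33, 11.35, and 11.36 numerically. It is an order-of-magnitude illustration rather than a solar-model calculation: the central electron density of 6 × 10<sup>25</sup> cm<sup>−3</sup> is an assumed round number that is not taken from the text, the values of G<sub>F</sub> and ħc are standard constants, and the function names are ours.

```python
import math

G_F_EV = 1.1663787e-23         # Fermi constant in eV^-2 (1.1663787e-5 GeV^-2)
HBARC_EV_CM = 1.973269804e-5   # hbar*c in eV cm, converts cm^-3 to eV^3
N_E_CENTRE = 6.0e25            # ASSUMED electron density at the solar centre, cm^-3


def matter_potential_ev(n_e_per_cm3):
    """V_e = sqrt(2) G_F N_e (eqn 11.22), returned in eV."""
    return math.sqrt(2) * G_F_EV * n_e_per_cm3 * HBARC_EV_CM ** 3


def resonance_energy_ev(dm2_ev2, theta, n_e_per_cm3):
    """Resonant neutrino energy of eqn 11.33, returned in eV."""
    return dm2_ev2 * math.cos(2 * theta) / (2 * matter_potential_ev(n_e_per_cm3))


def survival_adiabatic(theta):
    """High-energy limit, eqn 11.35: the neutrino leaves the Sun as nu_2."""
    return math.sin(theta) ** 2


def survival_vacuum_averaged(theta):
    """Low-energy limit, eqn 11.36: averaged vacuum oscillations."""
    return 1 - 0.5 * math.sin(2 * theta) ** 2


dm2 = 5e-5                            # eV^2, from the fit quoted in the text
theta = math.atan(math.sqrt(0.45))    # from tan^2(theta) ~ 0.45

e_res_mev = resonance_energy_ev(dm2, theta, N_E_CENTRE) / 1e6
print(f"E_Res at the assumed central density: {e_res_mev:.1f} MeV")
print(f"P(nu_e -> nu_e), E >> E_Res: {survival_adiabatic(theta):.2f}")        # 0.31
print(f"P(nu_e -> nu_e), E << E_Res: {survival_vacuum_averaged(theta):.2f}")  # 0.57
```

Under these assumptions the resonant energy comes out at the MeV scale. The two limiting survival probabilities, about 0.31 and 0.57, can be compared with the electron-neutrino fraction of roughly one third measured by SNO for <sup>8</sup>B neutrinos and with the suppression of about 1/2 seen by the gallium experiments.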

## **11.4.5 Confirmation of solar neutrino oscillations**

Confirmation that neutrino oscillation was the correct solution to the solar neutrino puzzle came from the KamLAND experiment. The detector is based deep underground in the Kamiokande site in Japan. It detects electron antineutrinos from a large number of Japanese nuclear reactors. The flux-weighted average distance from the sources to the detector is about L<sub>0</sub> = 180 km. This long baseline provides sensitivity to small values of the mass splitting. The primary reaction is ν̄<sub>e</sub> + p → e<sup>+</sup> + n. The detector consists of 1000 tons of liquid scintillator, surrounded by photomultipliers. The advantage of liquid scintillator over water (Čerenkov) is that it produces a larger signal, thus enabling it to achieve lower energy thresholds. This is important because the neutrino energies are below 8 MeV. However, extreme care must still be taken to minimize radioactive isotopes in the detector. The primary signal comes from the e<sup>+</sup>, but there is also a delayed signal from γ rays after the neutrons are captured on protons. This delayed coincidence helps reduce the background and enables the threshold to be lowered to 2.6 MeV [77].<sup>23</sup>

<sup>23</sup>There is also a background from 'geothermal' neutrinos, i.e. neutrinos from the decays of radioactive isotopes in the Earth. In future, the study of geothermal neutrinos might become interesting for geophysics research for distinguishing between different models of the Earth's core.

**Fig. 11.8** Ratio of the measured neutrino rate divided by that expected in the no-oscillation model, as a function of L<sub>0</sub>/E<sub>ν̄e</sub>. The lines are best fits for neutrino oscillation models [77].

The KamLAND experiment measured the ν̄<sub>e</sub> spectrum and, by dividing by the expected flux from the no-oscillation model, it was possible to measure the survival probability as a function of L<sub>0</sub>/E (where E is the neutrino energy). The results in Fig. 11.8 show clear evidence for neutrino oscillations and, even more interestingly, these data show the characteristic maxima and minima expected from oscillations.

From a combined fit to the solar and KamLAND data, the solution to the solar neutrino problem requires the MSW effect.

## **11.5 Three (or more)-flavour oscillations**

## **11.5.1 Generalized oscillation probabilities**

Expressions for the oscillation probabilities for the general case of N flavours and N massive neutrinos can be derived in the same way as for two flavours. The N flavour eigenstates |ν<sub>α</sub>⟩ are expressed as linear combinations of the N mass eigenstates |ν<sub>k</sub>⟩:

$$|\nu\_{\alpha}\rangle = \sum\_{k} U\_{\alpha k} |\nu\_{k}\rangle$$

where the coefficients U<sub>αk</sub> are the elements of an N × N PMNS matrix. In general, the U<sub>αk</sub> are complex, and unitarity requires that

$$\sum\_{\alpha} U\_{\alpha k} U\_{\alpha j}^\* = \delta\_{jk}$$

$$\sum\_k U\_{\alpha k} U\_{\beta k}^\* = \delta\_{\alpha \beta}$$

If a neutrino of flavour α is created at t = 0, the initial state is

$$\left|\psi(0)\right\rangle = \sum\_{k} U\_{\alpha k} \left|\nu\_{k}\right\rangle$$

At some later time t, the phases of the mass eigenstates have evolved and

$$|\psi(t)\rangle = \sum\_{k} U\_{\alpha k} \mathbf{e}^{-\mathbf{i}E\_{k}t} |\nu\_{k}\rangle$$

where E<sub>k</sub> is the energy of the ν<sub>k</sub>. The mass eigenstates can be expressed in terms of the flavour states using<sup>24</sup>

$$|\nu\_k\rangle = \sum\_{\alpha} U\_{\alpha k}^\* |\nu\_\alpha\rangle$$

so

$$|\psi(t)\rangle = \sum\_{\beta} \left( \sum\_{k} U\_{\alpha k} \mathbf{e}^{-\mathbf{i}E\_k t} U\_{\beta k}^\* \right) |\nu\_{\beta}\rangle$$

The amplitude for observing a ν<sub>β</sub> at time t is

$$
\langle \nu\_{\beta} | \psi(t) \rangle = \sum\_{k} U\_{\alpha k} U\_{\beta k}^\* e^{-iE\_k t}
$$

Then, as for the two-flavour case, taking E<sub>k</sub> ≈ p + m<sup>2</sup><sub>k</sub>/2p and p ≈ E,

$$
\langle \nu\_{\beta} | \psi(t) \rangle = \sum\_{k} \left( U\_{\alpha k} U\_{\beta k}^{\*} \mathbf{e}^{-i m\_{k}^{2} t / 2E} \right) \mathbf{e}^{-i E t}
$$

and the probability of observing a neutrino of flavour β at a distance L from the source is

$$\begin{split} \mathbf{P}(\nu\_{\alpha} \rightarrow \nu\_{\beta}) &= |\langle \nu\_{\beta} | \psi(t=L) \rangle|^{2} \\ &= \sum\_{k,j} U\_{\alpha k} U\_{\beta k}^{\*} U\_{\alpha j}^{\*} U\_{\beta j} \exp\left(-\mathbf{i} \frac{\Delta m\_{kj}^{2} L}{2E}\right) \end{split} \tag{11.37}$$

where Δm<sup>2</sup><sub>kj</sub> ≡ m<sup>2</sup><sub>k</sub> − m<sup>2</sup><sub>j</sub>.

<sup>24</sup>As the matrix U is unitary, the inverse is U<sup>−1</sup> = U<sup>†</sup>.

The neutrino oscillation probability, eqn. 11.37, can also be written as

$$\begin{split} \mathbf{P}(\nu\_{\alpha} \rightarrow \nu\_{\beta}) &= \delta\_{\alpha\beta} - 4 \sum\_{k>j} \text{Re}(U\_{\alpha k} U\_{\beta k}^{\*} U\_{\alpha j}^{\*} U\_{\beta j}) \sin^{2} \left( \frac{\Delta m\_{kj}^{2} L}{4E} \right) \\ &+ 2 \sum\_{k>j} \text{Im}(U\_{\alpha k} U\_{\beta k}^{\*} U\_{\alpha j}^{\*} U\_{\beta j}) \sin \left( \frac{\Delta m\_{kj}^{2} L}{2E} \right) \end{split} \tag{11.38}$$

For antineutrinos, the elements of the PMNS matrix U<sub>αk</sub> must be replaced by their complex conjugates U<sup>∗</sup><sub>αk</sub> and

$$|\bar{\nu}\_{\alpha}\rangle = \sum\_k U\_{\alpha k}^\* |\bar{\nu}\_k\rangle$$

The antineutrino oscillation probabilities become

$$\begin{split} \mathbf{P}(\bar{\nu}\_{\alpha} \rightarrow \bar{\nu}\_{\beta}) &= \delta\_{\alpha\beta} - 4 \sum\_{k>j} \text{Re}(U\_{\alpha k} U\_{\beta k}^{\*} U\_{\alpha j}^{\*} U\_{\beta j}) \sin^{2} \left( \frac{\Delta m\_{kj}^{2} L}{4E} \right) \\ &- 2 \sum\_{k>j} \text{Im}(U\_{\alpha k} U\_{\beta k}^{\*} U\_{\alpha j}^{\*} U\_{\beta j}) \sin \left( \frac{\Delta m\_{kj}^{2} L}{2E} \right) \end{split} \tag{11.39}$$

The third terms of eqns 11.38 and 11.39 have opposite signs, with the important consequence that for the same L/E the neutrino and antineutrino oscillation probabilities can differ. Since neutrinos and antineutrinos are CP conjugates, this immediately suggests the possibility of CP violation and, by the CPT theorem, T violation. The CP asymmetry would be

$$\begin{split} A^{CP}\_{\alpha\beta}(L, E) &= \mathbf{P}(\nu\_{\alpha} \to \nu\_{\beta}) - \mathbf{P}(\bar{\nu}\_{\alpha} \to \bar{\nu}\_{\beta}) \\ &= 4 \sum\_{k>j} \text{Im} (U\_{\alpha k} U\_{\beta k}^\* U\_{\alpha j}^\* U\_{\beta j}) \sin \left( \frac{\Delta m\_{kj}^2 L}{2E} \right) \end{split} \tag{11.40}$$
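The equivalence of eqns 11.37 and 11.38, and the sign change in eqn 11.39, can be checked numerically. The sketch below is an illustration under stated assumptions: the mixing matrix uses the three-flavour parametrization of eqn 11.42, the angles, phase, and masses are illustrative values rather than fitted ones, the function names are ours, and the factor 1.267 is the standard conversion that gives Δm<sup>2</sup>L/4E in radians when Δm<sup>2</sup> is in eV<sup>2</sup>, L in km, and E in GeV.

```python
import cmath
import math

K = 1.267  # assumed unit conversion: dm2[eV^2] * L[km] / (4 E[GeV]) -> radians


def pmns(t12, t13, t23, delta):
    """Three-flavour mixing matrix in the parametrization of eqn 11.42."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c13, s13 = math.cos(t13), math.sin(t13)
    c23, s23 = math.cos(t23), math.sin(t23)
    ep, em = cmath.exp(1j * delta), cmath.exp(-1j * delta)
    rows = [
        [c12 * c13, c13 * s12, s13 * em],
        [-c23 * s12 - c12 * s13 * s23 * ep, c12 * c23 - s12 * s13 * s23 * ep, c13 * s23],
        [s23 * s12 - c12 * c23 * s13 * ep, -c12 * s23 - c23 * s12 * s13 * ep, c13 * c23],
    ]
    return [[complex(x) for x in row] for row in rows]


def quartic(u, a, b, k, j):
    """The product U_ak U*_bk U*_aj U_bj appearing in eqns 11.37 to 11.40."""
    return u[a][k] * u[b][k].conjugate() * u[a][j].conjugate() * u[b][j]


def prob_double_sum(u, a, b, m2, l_over_e, anti=False):
    """Eqn 11.37; for antineutrinos U is replaced by its complex conjugate."""
    v = [[x.conjugate() for x in row] for row in u] if anti else u
    total = 0j
    for k in range(len(m2)):
        for j in range(len(m2)):
            phase = 2.0 * K * (m2[k] - m2[j]) * l_over_e  # dm2_kj L / 2E
            total += quartic(v, a, b, k, j) * cmath.exp(-1j * phase)
    return total.real


def prob_expanded(u, a, b, m2, l_over_e, anti=False):
    """Eqn 11.38 for neutrinos, eqn 11.39 for antineutrinos."""
    sign = -1.0 if anti else 1.0
    p = 1.0 if a == b else 0.0
    for k in range(len(m2)):
        for j in range(k):  # pairs with k > j
            w = quartic(u, a, b, k, j)
            x = K * (m2[k] - m2[j]) * l_over_e  # dm2_kj L / 4E
            p += -4.0 * w.real * math.sin(x) ** 2 + sign * 2.0 * w.imag * math.sin(2.0 * x)
    return p


# Illustrative inputs: angles in radians, masses squared in eV^2, L/E in km/GeV.
U = pmns(0.59, 0.15, 0.785, 1.0)
M2 = [0.0, 7.5e-5, 2.4e-3]
L_OVER_E = 500.0
E_IDX, MU_IDX = 0, 1

p_nu = prob_expanded(U, MU_IDX, E_IDX, M2, L_OVER_E)
p_nubar = prob_expanded(U, MU_IDX, E_IDX, M2, L_OVER_E, anti=True)
print("P(nu_mu -> nu_e)       =", round(p_nu, 4))
print("P(nubar_mu -> nubar_e) =", round(p_nubar, 4))
print("CP asymmetry (11.40)   =", round(p_nu - p_nubar, 4))
```

With a non-zero phase the two appearance probabilities differ, whereas setting the phase to zero makes every product U<sub>αk</sub>U<sup>∗</sup><sub>βk</sub>U<sup>∗</sup><sub>αj</sub>U<sub>βj</sub> real and the asymmetry vanishes.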

CP violation would be manifested as a difference between the ν<sub>α</sub> → ν<sub>β</sub> and ν̄<sub>α</sub> → ν̄<sub>β</sub> oscillation probabilities, and T violation by a difference in the time-reversed probabilities P(ν<sub>α</sub> → ν<sub>β</sub>) and P(ν<sub>β</sub> → ν<sub>α</sub>). For CP or T violation to occur, the imaginary parts in eqn 11.40 must not vanish; in other words, the PMNS matrix elements U<sub>αk</sub> must be complex.<sup>25</sup> CP violation would not occur for only two neutrino flavours since, just as in the case of two quark generations, the mixing can be described by a single angle.

<sup>25</sup>The quantity Im(U<sub>αk</sub>U<sup>∗</sup><sub>βk</sub>U<sup>∗</sup><sub>αj</sub>U<sub>βj</sub>) is directly analogous to the invariant area J of the CKM unitarity triangles, eqn 10.28 in Section 10.6, which describes CP violation in the quark sector.

The transition ν̄<sub>β</sub> → ν̄<sub>α</sub> is the CPT conjugate of ν<sub>α</sub> → ν<sub>β</sub> and, by the CPT theorem, P(ν̄<sub>β</sub> → ν̄<sub>α</sub>) = P(ν<sub>α</sub> → ν<sub>β</sub>). Therefore, the CP-conjugate survival probabilities P(ν̄<sub>α</sub> → ν̄<sub>α</sub>) and P(ν<sub>α</sub> → ν<sub>α</sub>) must be equal, and CP violation can only be observed by comparing appearance (i.e. flavour-changing) probabilities.

The rather formidable expressions for the oscillation probabilities in eqns 11.38 and 11.39 simplify in certain cases. For example, when one of the Δm<sup>2</sup><sub>kj</sub> is much greater than all the others ('one-mass-scale dominance'), they reduce to quasi-two-flavour expressions described by a single effective mixing angle.

## **11.5.2 Three-flavour oscillations**

There is currently no convincing evidence for more than three neutrino flavours. The mixing of the known neutrinos can therefore be described in terms of three mixing angles θ<sub>12</sub>, θ<sub>13</sub>, and θ<sub>23</sub>, a single phase δ, and the three mass-squared differences Δm<sup>2</sup><sub>12</sub>, Δm<sup>2</sup><sub>23</sub>, and Δm<sup>2</sup><sub>13</sub> between the three mass eigenstates ν<sub>1</sub>, ν<sub>2</sub>, and ν<sub>3</sub>. It is known that Δm<sup>2</sup><sub>13</sub> ≈ Δm<sup>2</sup><sub>23</sub> ≫ Δm<sup>2</sup><sub>12</sub>, since solar and atmospheric neutrino oscillations are described by two very different mass differences. From the solar neutrino data allowing for the MSW effect (see Section 11.4.4), we can determine that Δm<sup>2</sup><sub>12</sub> > 0. However, we do not currently know the hierarchy of the masses m<sub>2</sub> and m<sub>3</sub>; i.e. is Δm<sup>2</sup><sub>23</sub> < 0?

The three-flavour PMNS matrix (eqn 11.1)

$$\mathbf{U} = \begin{pmatrix} U\_{e1} & U\_{e2} & U\_{e3} \\ U\_{\mu 1} & U\_{\mu 2} & U\_{\mu 3} \\ U\_{\tau 1} & U\_{\tau 2} & U\_{\tau 3} \end{pmatrix} \tag{11.41}$$

can be expressed in terms of the mixing angles as

$$\mathbf{U} = \begin{pmatrix} c\_{12}c\_{13} & c\_{13}s\_{12} & s\_{13}\mathbf{e}^{-i\delta} \\ -c\_{23}s\_{12} - c\_{12}s\_{13}s\_{23}\mathbf{e}^{i\delta} & c\_{12}c\_{23} - s\_{12}s\_{13}s\_{23}\mathbf{e}^{i\delta} & c\_{13}s\_{23} \\ s\_{23}s\_{12} - c\_{12}c\_{23}s\_{13}\mathbf{e}^{i\delta} & -c\_{12}s\_{23} - c\_{23}s\_{12}s\_{13}\mathbf{e}^{i\delta} & c\_{13}c\_{23} \end{pmatrix} \tag{11.42}$$

where c<sub>ij</sub> ≡ cos θ<sub>ij</sub> and s<sub>ij</sub> ≡ sin θ<sub>ij</sub>. The mixing angles θ<sub>ij</sub> can be thought of as representing rotations around a third, k, axis. For example, θ<sub>13</sub> represents a rotation around the '2' axis.<sup>26</sup> The matrix can also be written as follows to make the three separate rotations more apparent:

$$\mathbf{U} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c\_{23} & s\_{23} \\ 0 & -s\_{23} & c\_{23} \end{pmatrix} \begin{pmatrix} c\_{13} & 0 & s\_{13} \mathbf{e}^{-i\delta} \\ 0 & 1 & 0 \\ -s\_{13} \mathbf{e}^{i\delta} & 0 & c\_{13} \end{pmatrix} \begin{pmatrix} c\_{12} & s\_{12} & 0 \\ -s\_{12} & c\_{12} & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.43}$$

<sup>26</sup>This representation of the PMNS matrix is identical with the representation of the CKM matrix of quark mixing discussed in Section 10.6. Of course the angles have entirely different physical meanings.

It can be seen from eqn 11.42 that the phase δ must be non-zero (and different from π) for CP violation to occur. It can also be shown that all the angles, including θ<sub>13</sub>, must be non-zero for CP violation to occur.

The three-flavour oscillation probabilities for neutrinos and antineutrinos can be written in terms of the mixing angles by substituting the U<sub>αi</sub> of eqn 11.42 into eqn 11.38 or 11.39, respectively, for N = 3. Since the resulting expressions are rather long, they will not be given here. An important result, however, is that the dependence of the oscillation probabilities on L/E falls into three different regimes. If Δm<sup>2</sup><sub>13</sub>L/E ≪ 1, no flavour oscillations will occur. In the second regime, where Δm<sup>2</sup><sub>23</sub>L/E ≈ 1 but Δm<sup>2</sup><sub>12</sub>L/E ≪ 1, the oscillation probability P(ν<sub>α</sub> → ν<sub>β</sub>) depends mainly on the four elements of **U** that couple ν<sub>α</sub> and ν<sub>β</sub> to ν<sub>2</sub> and ν<sub>3</sub>, i.e. U<sub>α2</sub>, U<sub>β2</sub>, U<sub>α3</sub>, and U<sub>β3</sub>. Finally, if Δm<sup>2</sup><sub>12</sub>L/E ≥ 1, the oscillation probabilities depend on all the U<sub>αi</sub>. For example, if U<sub>e3</sub> = 0 (as is approximately true), ν<sub>e</sub> does not couple to ν<sub>3</sub>, and ν<sub>e</sub> ↔ ν<sub>μ</sub> and ν<sub>e</sub> ↔ ν<sub>τ</sub> oscillations will not occur unless Δm<sup>2</sup><sub>12</sub>L/E ≥ 1, although ν<sub>μ</sub> → ν<sub>τ</sub> oscillations can take place at smaller L/E as long as Δm<sup>2</sup><sub>23</sub>L/E ≈ 1. The two regimes Δm<sup>2</sup><sub>12</sub>L/E ≪ 1 ≤ Δm<sup>2</sup><sub>23</sub>L/E and 1 ≤ Δm<sup>2</sup><sub>12</sub>L/E are often referred to as the 'atmospheric' and 'solar' oscillation regimes, respectively.

When the results of all neutrino oscillation experiments and measurements are combined, it is found that the mixing is described by

$$
\begin{pmatrix} \nu\_e \\ \nu\_\mu \\ \nu\_\tau \end{pmatrix} = \begin{pmatrix} 0.82 & 0.56 & \sim 0.15 \\ 0.31 \text{--} 0.43 & 0.51 \text{--} 0.59 & 0.75 \\ 0.37 \text{--} 0.47 & 0.59 \text{--} 0.66 & 0.66 \end{pmatrix} \begin{pmatrix} \nu\_1 \\ \nu\_2 \\ \nu\_3 \end{pmatrix} \tag{11.44}
$$

and Δm<sup>2</sup><sub>23</sub> ≈ Δm<sup>2</sup><sub>13</sub> = 2.3 × 10<sup>−3</sup> eV<sup>2</sup> and Δm<sup>2</sup><sub>12</sub> = 7.5 × 10<sup>−5</sup> eV<sup>2</sup>. At present there is no experimental evidence that the phase δ is non-zero, and the values in eqn 11.44 are the magnitudes of U<sub>αi</sub>. It can be seen that since |U<sub>e3</sub>| ≡ sin θ<sub>13</sub> ≈ 0.15, ν<sub>e</sub> is coupled only weakly to ν<sub>3</sub>; the ν<sub>3</sub> 'content' of ν<sub>e</sub> is |U<sub>e3</sub>|<sup>2</sup> ≈ 0.02, i.e. only 2%. From the discussion in the preceding paragraph, this means that ν<sub>e</sub> barely participates in oscillations involving ν<sub>2</sub> and ν<sub>3</sub> only, i.e. where Δm<sup>2</sup><sub>23</sub>L/E ≈ 1 but Δm<sup>2</sup><sub>12</sub>L/E ≪ 1, and justifies the analysis of the results of atmospheric neutrino and long-baseline accelerator experiments in terms of two-flavour oscillations characterized by Δm<sup>2</sup><sub>23</sub> and θ<sub>23</sub>. It also means that the disappearance of solar ν<sub>e</sub> can be described in terms of two flavours characterized by Δm<sup>2</sup><sub>12</sub> and θ<sub>12</sub>.<sup>27</sup> The oscillation regimes are illustrated in Fig. 11.9.

<sup>27</sup>This can relatively easily be seen by setting s<sub>13</sub> = 0 and c<sub>13</sub> = 1 in eqn 11.42 and deriving the ν<sub>e</sub> survival probability P(ν<sub>e</sub> → ν<sub>e</sub>).

In stark contrast to the CKM matrix, which is approximately diagonal, all the elements of **U**, with the exception of U<sub>e3</sub> (equivalently θ<sub>13</sub>), are large. Their approximate equality suggests that CP violation by neutrinos could be large for the same reason that CP violation by mesons depends on the size of the CKM matrix elements via the angles of the unitarity triangles as discussed in Section 10.6.

**Fig. 11.9** … changing neutrino oscillations as a function of L/E. In the atmospheric regime, the transitions are predominantly ν<sub>μ</sub> ↔ ν<sub>τ</sub>; in the solar regime, all flavours participate.
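The two regimes can be made concrete with a rough numerical sketch. It assumes the standard unit conversion in which the oscillation phase Δm<sup>2</sup>L/4E equals 1.267 Δm<sup>2</sup>[eV<sup>2</sup>] L[km]/E[GeV]; this factor, the 0.3 rad threshold, and the example baselines and energies are illustrative assumptions, not values taken from the text. Only the two mass-squared differences are those quoted above.

```python
# Mass-squared differences quoted in the text (eV^2).
DM2_ATM = 2.3e-3   # |Delta m^2_23|
DM2_SOL = 7.5e-5   # Delta m^2_12


def phase(dm2_ev2, baseline_km, energy_gev):
    """Oscillation phase dm2*L/(4E) in radians, using the 1.267 unit factor."""
    return 1.267 * dm2_ev2 * baseline_km / energy_gev


def regime(baseline_km, energy_gev, threshold=0.3):
    """Crude classification; the threshold is an arbitrary illustrative choice."""
    if phase(DM2_SOL, baseline_km, energy_gev) >= threshold:
        return "solar"
    if phase(DM2_ATM, baseline_km, energy_gev) >= threshold:
        return "atmospheric"
    return "no oscillation"


# Hypothetical example configurations: (baseline in km, energy in GeV).
examples = {
    "reactor, 10 m, 4 MeV": (0.01, 0.004),
    "reactor, 1 km, 4 MeV": (1.0, 0.004),
    "reactor, 100 km, 4 MeV": (100.0, 0.004),
    "accelerator, 295 km, 0.6 GeV": (295.0, 0.6),
}

for label, (baseline, energy) in examples.items():
    print(f"{label:30s} atm phase = {phase(DM2_ATM, baseline, energy):8.3f}  "
          f"solar phase = {phase(DM2_SOL, baseline, energy):8.3f}  "
          f"-> {regime(baseline, energy)}")
```

With these inputs, the short reactor baseline and the accelerator beam fall in the atmospheric regime, while only the 100 km reactor baseline reaches the solar regime.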

## **11.5.3 Measurement of the mixing angle** *θ***<sup>13</sup>**

Very convincing evidence for a non-zero value of θ<sub>13</sub> was shown by the Daya Bay experiment. This used an array of six identical detectors to measure the ν̄<sub>e</sub> flux around six nuclear reactors. The detectors used liquid scintillator viewed by photomultipliers. The reaction is the same as that used by KamLAND (see Section 11.4.5) and this is also a disappearance experiment, but over baselines of ∼1 km, rather than the ∼100 km studied by KamLAND, which is why there is sensitivity to the mixing angle θ<sub>13</sub>. There was a separate layer of purified water outside the scintillator to act as a veto detector (for example to reject incoming cosmic rays). The detectors nearer the reactors measured the flux, and this was used to predict the flux at the more distant detectors. The ratio of measured to predicted flux at the more distant detectors was significantly less than 1, which is evidence for a non-zero value of θ<sub>13</sub>. This result was confirmed by RENO (another reactor experiment).

Further confirmation of the non-zero value of θ<sub>13</sub> was provided by the T2K experiment. This used a very intense 30 GeV proton beam at the J-PARC accelerator to send a neutrino beam to the Super-Kamiokande detector at a distance of 295 km. In order to obtain the maximal oscillation probability for this baseline and the expected value of Δm<sup>2</sup><sub>13</sub>, the optimal neutrino energy was E<sub>ν</sub> ∼ 1 GeV. This was achieved by having the beam line at an angle of 2.5° with respect to the line from the accelerator to the detector. Clear evidence for ν<sub>e</sub> appearance in a ν<sub>μ</sub> beam is shown by the spectrum of identified ν<sub>e</sub> interactions (see Fig. 11.10) [8].

A global fit to these and other data gave a value of sin<sup>2</sup> 2θ<sub>13</sub> = 0.096 ± 0.013 [115]. A necessary but not sufficient condition for CP violation in the neutrino sector is that θ<sub>13</sub> is non-zero. Therefore, the observation of this large value for θ<sub>13</sub> opens the way to the study of CP violation in the neutrino sector, which would be very interesting for the reasons discussed in Section 11.5.4. This would require measuring different oscillation rates for neutrinos and antineutrinos. Several ideas for new very long-baseline experiments are being considered. They are very challenging because potential small differences between neutrino and antineutrino oscillation probabilities need to be measured, which will require very intense neutrino beams as well as very large detectors.

**Fig. 11.10** The energy spectrum of detected electron neutrinos from the T2K experiment with the Super-Kamiokande detector. The data are significantly above the estimated background and are in good agreement with the expectations from neutrino oscillations [8]. The solid curve is the result of a fit to the oscillation model.
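The numbers in this section can be tied together with a short sketch. It uses the two-flavour approximation P(ν̄<sub>e</sub> → ν̄<sub>e</sub>) ≈ 1 − sin<sup>2</sup> 2θ<sub>13</sub> sin<sup>2</sup>(1.267 Δm<sup>2</sup><sub>13</sub>L/E), which neglects the Δm<sup>2</sup><sub>12</sub> term; the 1.267 unit factor (eV<sup>2</sup>, km, GeV) and the 4 MeV reactor antineutrino energy are assumptions made here for illustration, while sin<sup>2</sup> 2θ<sub>13</sub>, Δm<sup>2</sup><sub>13</sub>, and the baselines are those quoted in the text.

```python
import math

DM2_13 = 2.3e-3          # eV^2, value quoted in the text
SIN2_2THETA13 = 0.096    # global-fit value quoted in the text


def survival_probability(baseline_km, energy_gev):
    """Two-flavour anti-nu_e survival probability (Delta m^2_12 term neglected)."""
    phase = 1.267 * DM2_13 * baseline_km / energy_gev
    return 1.0 - SIN2_2THETA13 * math.sin(phase) ** 2


def first_maximum_energy(baseline_km):
    """Energy (GeV) at which the oscillation phase equals pi/2 for this baseline."""
    return 1.267 * DM2_13 * baseline_km / (math.pi / 2)


# sin(theta_13) from sin^2(2 theta) = 4 s^2 (1 - s^2), taking the small-angle root.
sin_theta13 = math.sqrt(0.5 * (1.0 - math.sqrt(1.0 - SIN2_2THETA13)))

print(f"sin(theta_13)                      = {sin_theta13:.3f}")
print(f"reactor survival at 1 km, 4 MeV    = {survival_probability(1.0, 0.004):.3f}")
print(f"first maximum for 295 km baseline  = {first_maximum_energy(295.0):.2f} GeV")
```

Under these assumptions the reactor deficit at ∼1 km is a few per cent, the value of sin θ<sub>13</sub> agrees with the |U<sub>e3</sub>| ≈ 0.15 quoted earlier, and the first oscillation maximum for a 295 km baseline lies somewhat below 1 GeV, consistent in order of magnitude with the optimal energy stated above.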

## **11.5.4 Matter–antimatter asymmetry**

The Universe appears to be dominated by matter compared with antimatter. If there were stars or galaxies made of antimatter, we would expect characteristic γ rays from matter–antimatter annihilation. There is no evidence for this from astrophysics. The primary cosmic-ray flux contains only a small component of antiparticles such as positrons, and these can be explained by secondary production processes. As well as explaining this effect, we also need to understand the baryon-to-photon ratio n<sub>B</sub>/n<sub>γ</sub>. In the early Universe, with temperatures satisfying k<sub>B</sub>T ≫ 2m<sub>p</sub>, the reactions pp̄ ↔ γγ would have been in thermodynamic equilibrium, and the value of n<sub>B</sub>/n<sub>γ</sub> would be determined by the Boltzmann and statistical factors. However, as the Universe cooled, the interaction rate became lower than the inverse of the expansion time, so the baryons would have been 'frozen out'. The Standard Model prediction<sup>28</sup> of n<sub>B</sub>/n<sub>γ</sub> ∼ 10<sup>−19</sup> is much lower than the measured value n<sub>B</sub>/n<sub>γ</sub> ∼ 10<sup>−10</sup>.

<sup>28</sup>Using the CKM mechanism.

If the Universe started off with matter–antimatter symmetry, then we must satisfy the three Sakharov conditions [124] in order to have the currently observed matter-dominated Universe:<sup>29</sup>

1. baryon-number violation;
2. violation of C and CP symmetry;
3. a departure from thermodynamic equilibrium.

The first condition is obvious, but the second is not so self-evident. If C symmetry held, there would be an equal rate for the production of baryons and antibaryons. Similarly, if CP symmetry held, there would be an equal rate for the production of 'left-handed' baryons and 'right-handed' antibaryons and an equal rate for the production of 'left-handed' antibaryons and 'right-handed' baryons. Therefore, the second condition is required to produce more baryons than antibaryons. If the Universe were in thermodynamic equilibrium, CPT symmetry would ensure that there was an equal rate for reactions creating baryons and antibaryons.

The SM does have CP violation in the quark sector, but it turns out that the off-diagonal CKM elements are so small that this cannot provide sufficient CP violation to explain the observed matter–antimatter asymmetry and n<sub>B</sub>/n<sub>γ</sub>. One possible explanation would be to invoke a Grand Unified Theory (GUT), which would naturally generate baryon-number-violating interactions since quarks and leptons are in the same multiplets. However, these GUT models tend to predict a rate of proton decay that is incompatible with measurements.

<sup>29</sup>The explanation of the matter–antimatter asymmetry can also resolve …

<sup>30</sup>In Chapter 8, we looked at the search for neutrinoless double β decay, which if observed would prove that neutrinos are Majorana particles.

<sup>31</sup>We require the masses to be positive, so clearly the parameter M must be imaginary.

There are many different theoretical attempts to explain the observed asymmetry, although here we consider only one model. An attractive possibility is that the CP violation arises in the neutrino sector. We already know that the magnitudes of the off-diagonal elements of the neutrino mixing (PMNS) matrix are relatively large. If the one complex phase, δ, in the PMNS matrix turned out to be large, this would predict relatively strong CP violation in the neutrino sector. Now that we know that neutrinos have non-zero masses, they could be Dirac or Majorana particles (see Chapter 6).<sup>30</sup> We will consider a model in which the neutrinos are Majorana particles. One of the unexplained features of the SM is that the neutrino masses are so much smaller than those of the charged leptons and the quarks. One attempt to explain this is motivated by GUTs in which the neutrino mass matrix is given by

$$\mathbf{M}\_{\nu} = \begin{pmatrix} 0 & M \\ M & B \end{pmatrix} \tag{11.45}$$

The value of M is of the order of the electroweak scale, but the value of the parameter B is of the order of the GUT scale, so B ≫ M. The masses of the neutrinos are then given by the eigenvalues of the mass matrix (eqn 11.45):

$$
\lambda\_{\pm} = \frac{1}{2} \left( B \pm \sqrt{B^2 + 4M^2} \right) \tag{11.46}
$$

In the regime B ≫ M, this gives λ<sub>+</sub> ≈ B and λ<sub>−</sub> ≈ −M<sup>2</sup>/B.<sup>31</sup> Therefore, larger values of B increase (decrease) the mass of the heavier (lighter) eigenstate; hence this is called the 'see-saw' mechanism. The lighter mass corresponds to the observed left-handed (LH) neutrino states and the heavier mass state would be a right-handed (RH) neutrino.
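The see-saw behaviour of eqn 11.46 can be checked numerically with a minimal sketch. The values M = 100 GeV and B = 10<sup>15</sup> GeV are illustrative assumptions standing in for 'electroweak scale' and 'GUT scale'; they are not taken from the text. Because B<sup>2</sup> exceeds 4M<sup>2</sup> by many orders of magnitude, evaluating λ<sub>−</sub> directly from eqn 11.46 loses all precision in floating-point arithmetic, so the sketch uses the product of the eigenvalues, λ<sub>+</sub>λ<sub>−</sub> = −M<sup>2</sup> (the determinant of the matrix in eqn 11.45).

```python
import math


def seesaw_eigenvalues(M, B):
    """Eigenvalues of [[0, M], [M, B]] (eqn 11.46), computed stably for B >> M."""
    root = math.sqrt(B * B + 4.0 * M * M)
    lam_plus = 0.5 * (B + root)
    # lam_plus * lam_minus = det = -M^2, which avoids subtracting nearly equal numbers.
    lam_minus = -M * M / lam_plus
    return lam_plus, lam_minus


M = 100.0     # GeV, assumed electroweak-scale value (illustrative)
B = 1.0e15    # GeV, assumed GUT-scale value (illustrative)

lam_plus, lam_minus = seesaw_eigenvalues(M, B)
light_mass_ev = abs(lam_minus) * 1.0e9    # convert GeV to eV

print(f"heavy eigenvalue  = {lam_plus:.3e} GeV")
print(f"light eigenvalue  = {lam_minus:.3e} GeV  (|value| = {light_mass_ev:.3g} eV)")

# Doubling B halves the light mass: the 'see-saw'.
_, lam_minus_2B = seesaw_eigenvalues(M, 2.0 * B)
print(f"ratio of light masses for 2B and B = {lam_minus_2B / lam_minus:.3f}")
```

With these assumed scales the light eigenvalue has a magnitude of order 10<sup>−2</sup> eV, which illustrates how a very heavy partner can push the light neutrino mass far below the charged-lepton masses.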

In the early Universe with temperatures k<sub>B</sub>T > m(ν<sub>R</sub>), these LH and RH neutrinos would have been in thermodynamic equilibrium and there could have been no asymmetry. As the Universe cooled to lower temperatures, it would no longer be in thermodynamic equilibrium (thus satisfying the third Sakharov condition) and the abundance of the heavy RH neutrinos would have been 'frozen out'. They would then have decayed to the light neutrino states in a CP-violating way to create a lepton number asymmetry. The lepton asymmetry can be converted to a baryon asymmetry in the SM. There is no perturbative process in the SM that can violate B + L (where B is the baryon number and L the lepton number), but this can happen non-perturbatively. The SU(2) vacuum has an infinite number of minima, each with a different value of B + L. At low temperatures, the probability of transition from one vacuum state to another is suppressed by a factor of e<sup>−M<sub>W</sub>/T</sup>, where M<sub>W</sub> is the mass of the W boson. However, at very high temperatures k<sub>B</sub>T > M<sub>W</sub>, the Universe can easily hop over the barrier from one vacuum state to another, thus allowing violation of B + L. This then allows the lepton number asymmetry created by the decays of the ν<sub>R</sub> to be converted to a baryon number (and lepton number) asymmetry. As electric and colour charge are protected by gauge symmetries, the Universe remains neutral for electric and colour charge.

## **Chapter summary**


## **Further reading**


## **Exercises**


that neither the momenta nor the energies are the same, and determine the difference in momenta of the two flavours of neutrinos.


neutrino energy interval, is not obvious given the steeply changing cosmic-ray spectrum. Starting from a 2 GeV pion, compute roughly the average energy of the ν and μ into which it decays. Then, again using a rough calculation, divide the energy of the μ equally among the decay products to show that all the neutrino energies are roughly equal, so the naive result works. This is a lucky coincidence; repeat this starting with a 2 GeV kaon to show that the sharing is not equal.


Assuming that neutrinos are ultrarelativistic, derive eqn 11.23. Hint: You can drop any term in the Hamiltonian that is proportional to the unit matrix, since it does not affect oscillations.


# **12 The Higgs boson**

The Higgs mechanism is a part of the Standard Model that we postponed from the discussion of electroweak unification in Chapter 7. Until recently, it was one possible theoretical solution to an otherwise problematic part of the GSW theory that we have highlighted in earlier chapters: we want the W and Z to be massive to make a short-range force, but we want them to be massless so that gauge invariance works and the theory is renormalizable (Section 7.4.4). The Higgs mechanism is as old as the rest of the GSW theory and predicts that there should be a boson, the Higgs, but does not predict its mass.

Recently, strong evidence for a boson that fits the description has been found in several different decay channels at both of the big experiments at the LHC. In this chapter, we will first discuss the theory behind the Higgs boson and then take a look at the experimental evidence for its existence.

We start at a 'middle difficulty' level, i.e. try to explain all the key ideas without too much mathematics.<sup>1</sup> Then, having set the scene, we give a more sophisticated mathematical picture of the Higgs mechanism in Section 12.5.

The discussion of the Higgs mechanism starts with all the particles being massless. We then postulate a new field with which all particles can interact. The interaction of a particle with this field produces an effect that makes the particle appear to have mass; i.e. the field produces an effect when the particle moves and this manifests itself as *inertia*. To begin with, we have a free hand to choose whatever form of new field we like, but it must respect both Lorentz invariance and local gauge invariance.

The type of field that we need has already been introduced in Section 7.2.6. Now there are certain properties that the new field must possess: it must respect Lorentz covariance, it must be colourless and uncharged, and it must have spin zero.<sup>2</sup> The field is related to the weak interaction, to give masses to the W and Z, so we will allow it to have both weak isospin and hypercharge (recall that these are related to the electric-charge equivalents for the W<sup>+</sup>, W<sup>0</sup>, W<sup>−</sup> and for the B<sup>0</sup>, respectively; see Section 7.4.1). Finally, it must be locally gauge-invariant, because we want it to make the whole theory gauge-invariant.


<sup>1</sup>Specifically, as we made clear in earlier chapters, we are not using the language of relativistic quantum field theory (RQF). How this affects our treatment of the Higgs mechanism will be covered briefly in Section 12.4.

<sup>2</sup>We must be relativistically consistent and also not allow the field to interact via any of the known forces. Otherwise, when our massless particle (e.g. an electron) interacts with it, thus gaining mass, it could also exchange spin or charge with the field. This would give the impression that the electron spin or charge could change spontaneously, which would not agree with experiment.


## **12.1 Local gauge invariance**

We give a brief reminder of what gauge invariance entails. It is an important property that fields can, but do not have to, possess. Those that are locally gauge-invariant are, by 't Hooft's theorem, renormalizable, which is important for the reasons described in Section 7.4.4. We have already considered gauge symmetry in the familiar context of classical electric and magnetic fields (see Chapter 6). We showed that the classical observable fields **B** and **E** are unchanged by the simultaneous changes to **A** and φ as follows:

$$\mathbf{A} \rightarrow \mathbf{A}' = \mathbf{A} + \nabla \Lambda, \qquad \phi \rightarrow \phi' = \phi - \frac{\partial \Lambda}{\partial t} \tag{12.1}$$

where Λ is a scalar. Λ can be a function of **x** and t and, when it is, we call it a *local* gauge transformation. In the context of relativistic quantum mechanics, we showed that the Dirac equation for a particle of charge q in an electromagnetic field (**A**, φ) is gauge-invariant if the Dirac wavefunction ψ is also transformed:

$$
\psi \to \psi' = \psi \,\,\mathrm{e}^{\mathrm{iq}\Lambda} \tag{12.2}
$$

The simultaneous gauge transformations of the **A**, φ, and ψ fields, applied to the Dirac equation

$$\left(\mathrm{i}\frac{\partial}{\partial t} - q\phi\right)\psi = \boldsymbol{\alpha}\cdot(-\mathrm{i}\nabla - q\mathbf{A})\psi + \beta m\psi\tag{12.3}$$

leave the physics unchanged (see Exercise 12.1). From demanding that Λ can be different for different space–time coordinates (local gauge symmetry), we can determine the form of the electromagnetic interaction.
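The key step behind this statement can be sketched as follows, using only the transformations of eqns 12.1 and 12.2 (the primes denote transformed quantities, as in those equations). The derivatives acting on the phase factor e<sup>iqΛ</sup> generate terms in ∂Λ/∂t and ∇Λ that are cancelled exactly by the shifts in φ and **A**:

$$
\left(\mathrm{i}\frac{\partial}{\partial t} - q\phi'\right)\psi' = \mathrm{e}^{\mathrm{i}q\Lambda}\left(\mathrm{i}\frac{\partial}{\partial t} - q\phi\right)\psi, \qquad (-\mathrm{i}\nabla - q\mathbf{A}')\psi' = \mathrm{e}^{\mathrm{i}q\Lambda}(-\mathrm{i}\nabla - q\mathbf{A})\psi
$$

Since the Dirac matrices are constant and the mass term simply acquires the same factor, the transformed Dirac equation is the original one multiplied by the common phase e<sup>iqΛ</sup>, so ψ′ is a solution whenever ψ is.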

## **12.2 Spontaneous symmetry breaking**

We want to arrange for there to be a Higgs field so that a massless particle can feel an interaction with it and act as if it has a mass. All fields we have encountered so far have been 'in the vacuum', i.e. they are normally switched off (apart from quantum fluctuations) if nothing is happening, so the ground state is zero. Our new field is different. It must be on all the time to generate particle masses.

The simplest type of field is a **scalar field** φ, which is a single real number that can vary as a function of **x** and t. To make it gauge-invariant, we will have to make it a complex number φ = φ<sub>1</sub> + iφ<sub>2</sub>. We need to understand what is involved with making the field be on all the time. We start with a field that is real, so we set φ<sub>2</sub> = 0.

Familiar classical fields have energy associated with them when they are on; for example, there is energy associated with a charged capacitor (which is stored in its electric field) that is not present when the capacitor is discharged. This is represented in Fig. 12.1, which shows the potential energy associated with a field as a function of the value of the field.

**Fig. 12.1** Potential energy associated with a field such as the electric field in a capacitor.

It has a minimum, the value of the field when there is nothing there, which is usually taken to set the zero of the potential. Examples of how the potential energy V of a field with this property could vary with the value of φ are V ∝ φ<sup>2</sup>, V ∝ φ<sup>4</sup>, and V ∝ cosh φ − 1. There are many others.

Thinking pictorially, we can imagine a ball rolling around in the bottom of the potential well shown in Fig. 12.1. If it has no energy, it will sit in the bottom at φ = 0. If it has some energy, it can oscillate by rolling around in the bottom of the well.

Most of the time when we are using quantum mechanics, we calculate using Feynman diagrams or by applying perturbation theory, which is an expansion in terms of a 'small' parameter, for QED the fine-structure constant α. The order of perturbation theory (or Feynman diagrams) required will depend on the accuracy required. This is analogous to using a Taylor series expansion to calculate a mathematical function; however, going beyond the first order in RQF raises some difficult mathematical issues. To make perturbation theory work, we need a region in which φ is slowly varying. For the field theory case in which we are interested, this corresponds to expanding about the minimum of the potential. For example, for a potential with a minimum at φ = 0 as shown in Fig. 12.2, we can consider expansions about φ = 0.

Now consider an example of a field that does not have its potential energy minimum at φ = 0, as shown in Fig. 12.3. The potential in this example varies as

$$V(\phi) = \frac{1}{4}\lambda^2 \phi^4 - \frac{1}{2}\mu^2 \phi^2 \tag{12.4}$$

where λ and μ are both real constants. This has two minima, which are at φ = ±μ/λ. The vacuum state of the field is now not the one with 'no-field' (φ = 0) but the one with either φ = ±μ/λ. This means that we need to consider quanta of the field as excitations with respect to this non-zero field value, and the vacuum acquires a non-zero 'vacuum expectation value' (often shortened to 'vev' in RQF texts). This is the key idea of spontaneous symmetry breaking. The fundamental theory respects a symmetry but the symmetry is 'hidden' (or broken) in the vacuum state. If such a field were to exist in nature, we would have to do our Feynman diagram expansions about one of the minima, a procedure that we will explore in this chapter. These ideas are worked through for this potential in Exercise 12.6. In terms of a classical analogy and thinking pictorially with a ball in a potential well, the ball can now rattle around in the bottom of either minimum.
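The positions of the minima quoted above follow from a short calculation with the potential of eqn 12.4:

$$
\frac{\mathrm{d}V}{\mathrm{d}\phi} = \lambda^2\phi^3 - \mu^2\phi = 0 \quad\Rightarrow\quad \phi = 0 \ \ \text{or}\ \ \phi = \pm\frac{\mu}{\lambda}, \qquad \frac{\mathrm{d}^2V}{\mathrm{d}\phi^2} = 3\lambda^2\phi^2 - \mu^2
$$

The second derivative is −μ<sup>2</sup> at φ = 0, so this point is a local maximum, and 2μ<sup>2</sup> at φ = ±μ/λ, so these are the minima. The depth of the potential there is V(±μ/λ) = −μ<sup>4</sup>/(4λ<sup>2</sup>).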

However, as noted above, the real scalar field considered so far does not work for the Higgs mechanism since it is not possible to make it locally gauge-invariant. So we now consider a complex scalar field φ = φ<sub>1</sub> + iφ<sub>2</sub> in which φ<sub>2</sub> does not have to be zero. We will use the potential

$$V = \frac{1}{4}\lambda^2(\phi^\*\phi)^2 - \frac{1}{2}\mu^2(\phi^\*\phi) \tag{12.5}$$

which is a logical extension of the potential we have used up to now when we change from real φ to complex φ. If we substitute φ = φ<sub>1</sub> + iφ<sub>2</sub> into this potential, we get

$$V = \frac{1}{4}\lambda^2(\phi\_1^2 + \phi\_2^2)^2 - \frac{1}{2}\mu^2(\phi\_1^2 + \phi\_2^2) \tag{12.6}$$

**Fig. 12.2** Potential energy function with a minimum at φ = 0.

**Fig. 12.3** Potential energy function V(φ) for a field with a minimum at non-zero values of φ.

**Fig. 12.4** 'Mexican hat' potential for the Higgs field. The vertical axis represents the potential and the horizontal axes represent the real and imaginary parts of the complex Higgs field. From Millard, Rupert. Higgs Mexican hat potential. http://commons.wikimedia.org/wiki/File:Mexican_hat_potential_polar.svg.

**Fig. 12.5** Spontaneous symmetry breaking in an isotropic ferromagnet: (a) symmetry above the Curie temperature; (b) hidden symmetry below the Curie temperature.

<sup>3</sup>This discussion is greatly simplified. In a real ferromagnet below the Curie temperature, the spins are aligned within small domains but the orientation of the spins in different domains is random.

This function is shown in Fig. 12.4. The potential has a likeness to a Mexican hat; the minimum has become a ring where φ<sub>1</sub><sup>2</sup> + φ<sub>2</sub><sup>2</sup> = (μ/λ)<sup>2</sup> and the phase of the complex number is arbitrary. If we picture a ball rolling around in this potential, we can see two degrees of freedom. One of these has the ball rolling up and down the sides in the same way as for the ball in Fig. 12.3; i.e. φ<sub>1</sub><sup>2</sup> + φ<sub>2</sub><sup>2</sup> oscillates and the phase of the complex number stays fixed. The other way, which is new, is for the ball to roll around the bottom; i.e. the value of φ<sub>1</sub><sup>2</sup> + φ<sub>2</sub><sup>2</sup> stays fixed, but the phase of the complex number changes. This second degree of freedom requires no energy for it to occur. In quantum field theory, it corresponds to the existence of a boson called a 'Goldstone boson', which has no mass. No such boson is observed experimentally, but we will address this shortly.
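The two degrees of freedom can be illustrated numerically with a small sketch. The values λ = 0.5 and μ = 1 are arbitrary choices made here for illustration, and the curvatures are estimated with a simple finite-difference formula rather than analytically.

```python
import math

LAMBDA, MU = 0.5, 1.0    # arbitrary illustrative values, not from the text


def potential(phi1, phi2):
    """Mexican-hat potential of eqn 12.6."""
    r2 = phi1 ** 2 + phi2 ** 2
    return 0.25 * LAMBDA ** 2 * r2 ** 2 - 0.5 * MU ** 2 * r2


def second_derivative(f, x, h=1.0e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


radius = MU / LAMBDA     # radius of the ring of minima

# The potential takes the same value at every phase angle on the ring.
ring_values = [
    potential(radius * math.cos(a), radius * math.sin(a))
    for a in (0.0, 1.0, 2.5, 4.0)
]

# Curvature when moving up the side of the hat (radial direction) ...
radial_curvature = second_derivative(lambda r: potential(r, 0.0), radius)
# ... and when moving around the bottom of the hat (changing the phase only).
angular_curvature = second_derivative(
    lambda a: potential(radius * math.cos(a), radius * math.sin(a)), 0.0
)

print("V on the ring:", [round(v, 12) for v in ring_values])
print(f"radial curvature  = {radial_curvature:.6f}  (2*mu^2 = {2 * MU ** 2})")
print(f"angular curvature = {angular_curvature:.2e}")
```

The radial curvature comes out as 2μ<sup>2</sup>, which corresponds to a massive excitation, whereas the curvature around the ring vanishes to within rounding error, which is the flat direction associated with the massless Goldstone boson.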

The concept we have been exploring here is called spontaneous symmetry breaking. The potential is symmetric (i.e. the minimum can be anywhere around the rim of the Mexican hat), but in nature this symmetry is hidden, i.e. the field has to pick one particular value. Another example of spontaneous symmetry breaking in nature comes from the spin alignment behaviour in an isotropic ferromagnet. In Fig. 12.5(a), above the Curie temperature, the spins are randomly aligned. The situation is isotropic, i.e. rotationally symmetric. In Fig. 12.5(b), below the Curie temperature, the spins align (an electrostatic effect due to quantum-mechanical wavefunctions of the electrons).<sup>3</sup> The symmetry is now **hidden**. It is spontaneously broken. The symmetry still exists, because any direction for the spin alignment could equally be chosen. This is analogous to the Higgs field that is inserted into the standard model.

## **12.3 Higgs mechanism—the simplified story**

In Section 12.5, we will give a mathematical description of how the above concepts fit together to make the Higgs mechanism. In this section, we give a simple picture of the ideas underlying the Higgs mechanism. We start with the equation for a spin-0 particle that obeys the potential we have just described, V = ¼λ<sup>2</sup>φ<sup>4</sup> − ½μ<sup>2</sup>φ<sup>2</sup>, which can behave as if it were 'on' all the time. This contains a degree of freedom corresponding to the ball rolling round the bottom of the potential, which requires no energy and causes the theory to predict a massless boson, the 'Goldstone boson'.

We then add a 'vector' (spin-1) field that is locally gauge-invariant. This field has to be massless (in order to respect the gauge invariance) before we consider the effect of the non-zero expectation value of the scalar (Higgs) field. After spontaneous symmetry breaking, the vector field interacts with the scalar field and acquires a mass because of the non-zero vacuum expectation value of the scalar field. We then extend this in a way that allows four fields rather than just one to be added; these four fields fit the properties of the W±, W<sup>0</sup>, and B<sup>0</sup> (from the first three steps of the electroweak unification procedure from Section 7.4.1). By making a few subtle rearrangements of the equation, we are able to organise the terms so that


So, almost miraculously, the new term in the potential makes everything come out exactly as we need in order to agree with experiment.

Furthermore, we can repeat the procedure with a fermion, by adding it in a similar way and insisting on local gauge invariance, with the cancellation of terms occurring between the Higgs field and the fermion, and this generates the mass term for the fermion. We can keep doing this with all the fermions in the Standard Model (SM) to get masses for each of them. The theory does not predict the mass of each fermion, but it does predict that the couplings g<sub>ff̄H</sub> are proportional to the masses of the fermions. The theory also does not specify the mass of the Higgs boson.

## **12.4 Lagrangians**

Before moving on to applying the Higgs mechanism in more mathematical detail, we shall make a short detour. Lagrangians are a very powerful alternative to Newton's laws in classical mechanics.<sup>4</sup> The Lagrangian approach also works in quantum mechanics and it provides the mathematical framework for RQF. These ideas were touched on briefly at the start of Chapter 6. While the Dirac equation allows only a single particle or a system of a fixed number of particles to be considered, RQF allows the number of particles to change. RQF can cope with creation and annihilation of fermion pairs, for example, and with having several different fields within one equation at the same time.

<sup>4</sup>We have deliberately avoided using Lagrangian mechanics in this book because of the mathematical overheads involved. However, this approach is particularly useful for discussing the Higgs boson, so we give a flavour of it here that may be helpful when consulting more advanced texts.

## **12.4.1 Lagrangians in classical mechanics**

Lagrangian mechanics follows a specific recipe,<sup>5</sup> which has three steps. The first step is to pick the correct number of independent variables for the system. For example, a pendulum that moves in one dimension has one independent variable, θ. A pendulum that can swing in any horizontal direction has two variables; we will choose the angle to the vertical θ and the azimuthal angle in the horizontal plane φ. The second step is to write down an expression for the Lagrangian L by working out the total kinetic energy of the system T and the total potential energy of the system V.<sup>6</sup> This must be in terms of the independent variables and their first derivatives with respect to time (so for the example of the two-dimensional pendulum, these are θ, θ̇, φ, φ̇). For the two-dimensional pendulum with a mass m attached to a string of length l, the Lagrangian is

$$\mathcal{L} = \frac{1}{2}m(l\dot{\theta})^2 + \frac{1}{2}m(l\dot{\phi}\sin\theta)^2 - (1-\cos\theta)mgl\tag{12.7}$$

where g is the acceleration due to gravity.

<sup>5</sup>Why it works is explained in many textbooks on classical mechanics.

<sup>6</sup>By definition, L = T − V.

The third step is to use the Euler–Lagrange equations with L. There is one Euler–Lagrange equation for each independent variable and each gives an equation of motion (which are usually coupled). We could have tried to start directly with these equations of motion, but one of the beauties of the Lagrangian method is to be able to manipulate everything in one equation before having to deal with simultaneous equations. The Euler–Lagrange equations written for the two variables in our two-dimensional pendulum problem are (see Exercise 12.2)

$$\frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial \mathcal{L}}{\partial \dot{\theta}} \right) = \frac{\partial \mathcal{L}}{\partial \theta}, \qquad \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial \mathcal{L}}{\partial \dot{\phi}} \right) = \frac{\partial \mathcal{L}}{\partial \phi} \tag{12.8}$$

The partial derivative with respect to θ is taken holding all of the other variables and their derivatives (θ̇, φ, and φ̇) constant, and similarly for the other partial derivatives. For the pendulum, the resulting equations of motion are

$$l\ddot{\theta} - l\dot{\phi}^2\sin\theta\cos\theta + g\sin\theta = 0, \qquad 2\dot{\theta}\dot{\phi}\sin\theta\cos\theta + \ddot{\phi}\sin^2\theta = 0 \tag{12.9}$$

A general feature of Lagrangians is that if an independent variable does not appear explicitly in the Lagrangian, but only its time derivative (in our example, φ does not appear explicitly), then the right-hand side of the corresponding Euler–Lagrange equation is zero. Such a coordinate is called cyclic and there is a conserved quantity associated with it; in our example, this is ∂L/∂φ̇ = ml<sup>2</sup> sin<sup>2</sup>θ φ̇ = L<sub>φ</sub>. This result is immediately apparent from the φ Euler–Lagrange equation.<sup>7</sup>

Another general feature is the ability to move terms between being considered as kinetic or potential energy, giving what is referred to as an effective potential V<sub>eff</sub>. In the example, we can eliminate φ̇ in favour of L<sub>φ</sub>, which is constant. Some care is needed with the sign of the resulting term: the Lagrangian for θ alone that reproduces the motion in θ is

$$\mathcal{L} = \frac{1}{2}m(l\dot{\theta})^2 - \frac{L\_{\phi}^2}{2ml^2\sin^2\theta} - (1 - \cos\theta)mgl\tag{12.10}$$

We can now pretend that the second term is part of the potential energy (even though it started as part of the kinetic energy), so that V<sub>eff</sub> = (1 − cos θ)mgl + L<sub>φ</sub><sup>2</sup>/(2ml<sup>2</sup> sin<sup>2</sup>θ), and we have an equation involving only one degree of freedom moving in a fictitious potential describing our two-dimensional pendulum. This feature of swapping terms around within the Lagrangian carries over to the use of Lagrangians in RQF and is very helpful for understanding the Higgs mechanism.

<sup>7</sup>In this case, we can identify L<sub>φ</sub> as the component of angular momentum about the z axis. In the Lagrangian approach, we can see that conservation laws (e.g. angular momentum) are a result of the invariance of the Lagrangian with respect to a conjugate variable, in this case φ.
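The conservation of L<sub>φ</sub> can be checked with a minimal numerical sketch. It integrates the equations of motion obtained by applying eqn 12.8 to the Lagrangian of eqn 12.7, using a fourth-order Runge–Kutta step; the integrator, the values of m, l, and g, the time step, and the initial conditions are all choices made here for illustration.

```python
import math

G, LENGTH, MASS = 9.81, 1.0, 1.0    # SI units; illustrative values only


def derivatives(state):
    """Time derivatives of (theta, theta_dot, phi, phi_dot) from eqns 12.7 and 12.8."""
    theta, theta_dot, _phi, phi_dot = state
    s, c = math.sin(theta), math.cos(theta)
    theta_ddot = phi_dot ** 2 * s * c - (G / LENGTH) * s
    phi_ddot = -2.0 * theta_dot * phi_dot * c / s
    return (theta_dot, theta_ddot, phi_dot, phi_ddot)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def angular_momentum(state):
    """Conserved quantity L_phi = m l^2 sin^2(theta) phi_dot."""
    theta, _, _, phi_dot = state
    return MASS * LENGTH ** 2 * math.sin(theta) ** 2 * phi_dot


def energy(state):
    """Total energy T + V for the Lagrangian of eqn 12.7."""
    theta, theta_dot, _, phi_dot = state
    kinetic = 0.5 * MASS * LENGTH ** 2 * (theta_dot ** 2 + (phi_dot * math.sin(theta)) ** 2)
    return kinetic + MASS * G * LENGTH * (1.0 - math.cos(theta))


state = (1.0, 0.0, 0.0, 2.0)    # theta = 1 rad, phi_dot = 2 rad/s (arbitrary start)
l_phi_start, energy_start = angular_momentum(state), energy(state)

for _ in range(10_000):          # 10 s of motion with a 1 ms step
    state = rk4_step(state, 1.0e-3)

l_phi_drift = abs(angular_momentum(state) / l_phi_start - 1.0)
energy_drift = abs(energy(state) / energy_start - 1.0)
print(f"relative drift in L_phi  = {l_phi_drift:.2e}")
print(f"relative drift in energy = {energy_drift:.2e}")
```

Although θ and φ̇ both vary considerably during the motion, L<sub>φ</sub> and the total energy stay constant to within the accuracy of the integrator, as expected for a cyclic coordinate and a time-independent Lagrangian.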

## **12.4.2 Lagrangians in quantum mechanics**

We shall not dwell too much on Lagrangians in quantum mechanics. The terms in the Lagrangian still correspond to what we loosely call kinetic and potential energy. Just as it is possible to have several objects in one classical Lagrangian, it is possible to express a system with many fields as a single Lagrangian. For each field in the Lagrangian, there is an analogue to the classical Euler–Lagrange equations,<sup>8</sup> <sup>8</sup>These have more parts to them and different terms in the Lagrangian produce recognizable expressions on application of the Euler–Lagrange equations. For example, terms in the Lagrangian from a spin-0 field give the Klein–Gordon equation, while terms from a spin- <sup>1</sup> 2 field give the Dirac equation. This also applies to spin-1 fields, and in the case of the photon (a spin-1 field with zero mass), one can use the Lagrangian approach to generate Maxwell's equations.

## **12.5 Higgs mechanism—more mathematical**

We are going to repeat the discussion above, but now following the terms that appear in the equations more closely. We will use Lagrangians since the formalism is easier to follow and this will provide an introduction to the detailed explanations in more advanced textbooks. We will build up the concepts that go into the Higgs mechanism with four examples. The particular potentials used here are examples only, and there are a number of different fields that can be made to work theoretically, but using these examples gives a good insight into how the mechanism works in principle.

## **Example 1**

We start with the Lagrangian for a spin-0 particle with no potential energy term:

$$\mathcal{L} = \underbrace{\frac{1}{2} (\partial\_{\mu}\phi)(\partial^{\mu}\phi)}\_{\text{K.E. term}} - \underbrace{-\frac{1}{2}m^{2}\phi^{2}}\_{\text{mass term}}\tag{12.11}$$

Applying the Euler–Lagrange equation to this (see Exercise 12.4) gives the Klein–Gordon equation for a spin-0 particle, ∂μ∂<sup>μ</sup>φ + m<sup>2</sup>φ = 0. The second term in the Lagrangian gives rise to the mass term. For it to be a mass term, it must be proportional to the square of the field and it must be negative.

## **Example 2**

We now modify this by setting m = 0 (so the mass term disappears) but we add a potential V = ½μ<sup>2</sup>φ<sup>2</sup> (remember that L = T − V):

$$\mathcal{L} = \underbrace{\frac{1}{2} (\partial\_{\mu}\phi)(\partial^{\mu}\phi)}\_{\text{K.E. term}} - \underbrace{\frac{1}{2}\mu^{2}\phi^{2}}\_{\text{P.E.}}\tag{12.12}$$

<sup>8</sup>These have more parts to them than the classical ones, however, because of the need to maintain Lorentz invariance.

Notice how the second term looks just like a mass term. If the particle is in the field that has produced this type of potential, it will behave as if it has a mass. Now we change to a more sophisticated potential of the type discussed earlier, V = ¼λ<sup>2</sup>φ<sup>4</sup> − ½μ<sup>2</sup>φ<sup>2</sup>:

$$\mathcal{L} = \underbrace{\frac{1}{2} (\partial\_{\mu}\phi)(\partial^{\mu}\phi)}\_{\text{K.E. term}} - \underbrace{\frac{1}{4}\lambda^{2}\phi^{4}}\_{\text{interaction}} \underbrace{+\frac{1}{2}\mu^{2}\phi^{2}}\_{\text{not a mass term}} \tag{12.13}$$

Terms that are third or fourth powers of a field (or combination of fields) represent interactions in RQF. So the φ<sup>4</sup> term above represents an interaction. The final term in this Lagrangian is a problem—we do not know how to interpret it. It is not a mass term, because it has the wrong sign.

However, if we change our zero point in the field, what happens? To do this, we need to re-express the field φ in terms of a new field ρ that is zero at the bottom of one of the two minima. Let φ = ρ ± μ/λ, where the ± reflects the fact that there are two minima that can be used. We get (see Exercise 12.6):

$$\mathcal{L} = \underbrace{\frac{1}{2} (\partial\_{\mu}\rho)(\partial^{\mu}\rho)}\_{\text{K.E.}} \quad \underbrace{-\mu^{2}\rho^{2}}\_{\text{mass}} \quad \underbrace{\mp\,\mu\lambda\rho^{3} - \frac{1}{4}\lambda^{2}\rho^{4}}\_{\text{interactions}} \tag{12.14}$$
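The change of variable leading to eqn 12.14 can be checked numerically. The sketch below compares the non-derivative part of eqn 12.13 with the shifted form, keeping the constant term explicitly. The numerical values of μ and λ and the function names are arbitrary choices made for illustration. For the minimum at φ = +μ/λ the cubic term comes out as −μλρ<sup>3</sup>, and it has the opposite sign for the minimum at φ = −μ/λ.

```python
# Numerical check of the shift phi = rho +/- mu/lambda used to reach eqn 12.14.
# MU and LAM are arbitrary illustrative values (assumptions, not from the text).

MU = 1.3   # the parameter mu in the potential
LAM = 0.7  # the parameter lambda in the potential


def minus_v(phi, mu=MU, lam=LAM):
    """Non-derivative part of eqn 12.13: -(1/4) lam^2 phi^4 + (1/2) mu^2 phi^2."""
    return -0.25 * lam**2 * phi**4 + 0.5 * mu**2 * phi**2


def shifted(rho, sign=+1, mu=MU, lam=LAM):
    """The same quantity written in terms of rho, where phi = rho + sign * mu/lam."""
    constant = 0.25 * mu**4 / lam**2     # constant term, dropped in the text
    mass = -mu**2 * rho**2               # mass term
    cubic = -sign * mu * lam * rho**3    # sign is opposite to that of the minimum
    quartic = -0.25 * lam**2 * rho**4
    return constant + mass + cubic + quartic


def max_mismatch(sign):
    """Largest difference between the two forms on a grid of rho values."""
    v = sign * MU / LAM
    grid = [i / 10 for i in range(-30, 31)]
    return max(abs(minus_v(rho + v) - shifted(rho, sign)) for rho in grid)


if __name__ == "__main__":
    for s in (+1, -1):
        print(f"minimum at {s:+d} mu/lambda: max mismatch = {max_mismatch(s):.2e}")
```

The two forms agree to rounding error for both minima, and no term linear in ρ survives, as expected for an expansion about a minimum.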

Now we have a mass term!<sup>9</sup>

<sup>9</sup>We have dropped a constant term because it disappears when the Euler–Lagrange equations are applied, so we can ignore it.

## **Example 3**

We need to make φ complex to proceed with local gauge invariance. Let φ = φ<sub>1</sub> + iφ<sub>2</sub>, where φ<sub>1</sub> and φ<sub>2</sub> are real, and consider the potential V = ¼λ<sup>2</sup>(φ<sup>∗</sup>φ)<sup>2</sup> − ½μ<sup>2</sup>φ<sup>∗</sup>φ = ¼λ<sup>2</sup>(φ<sub>1</sub><sup>2</sup> + φ<sub>2</sub><sup>2</sup>)<sup>2</sup> − ½μ<sup>2</sup>(φ<sub>1</sub><sup>2</sup> + φ<sub>2</sub><sup>2</sup>). The Lagrangian becomes

$$\begin{aligned} \mathcal{L} &= \frac{1}{2} (\partial\_{\mu} \phi\_1)(\partial^{\mu} \phi\_1) + \frac{1}{2} (\partial\_{\mu} \phi\_2)(\partial^{\mu} \phi\_2) \\ &\quad -\frac{1}{4} \lambda^2 (\phi\_1^2 + \phi\_2^2)^2 + \frac{1}{2} \mu^2 (\phi\_1^2 + \phi\_2^2) \end{aligned} \tag{12.15}$$

The first two terms are kinetic energy terms. Next, as in Example 2, we expand around a minimum. In this case, we choose φ<sub>1</sub> = ρ + μ/λ and φ<sub>2</sub> = ρ′. This is where the minimum is on the positive real axis. We could do it about any point, and would get the same result (but perhaps after quite a lot of algebra and some redefinition of the fields). The Lagrangian is now

$$\begin{aligned} \mathcal{L} &= \overbrace{\frac{1}{2} (\partial\_{\mu}\rho)(\partial^{\mu}\rho) - \mu^{2}\rho^{2}}^{\rho \text{ scalar}, m = \sqrt{2}\mu} + \overbrace{\frac{1}{2} (\partial\_{\mu}\rho^{\prime})(\partial^{\mu}\rho^{\prime})}^{\rho^{\prime} \text{ scalar, no mass}} \\ &\underbrace{-\mu\lambda(\rho^{3} + \rho\rho^{\prime 2}) - \frac{1}{4}\lambda^{2}(\rho^{4} + \rho^{\prime 4} + 2\rho^{2}\rho^{\prime 2})}\_{\text{interactions}} \end{aligned} \tag{12.16}$$

The ρ field becomes a massive field with mass √2 μ and the ρ′ field is a massless field (the Goldstone boson we discussed earlier). There are several interaction terms and we have omitted a constant term.
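The two-field expansion can be checked in the same numerical spirit. The sketch below evaluates the non-derivative part of eqn 12.15 at φ<sub>1</sub> = ρ + μ/λ, φ<sub>2</sub> = ρ′ and compares it with the corresponding terms of eqn 12.16 plus the omitted constant. The parameter values and names are illustrative assumptions.

```python
# Numerical check of the two-field expansion behind eqn 12.16, with
# phi_1 = rho + mu/lambda and phi_2 = rho' (written rho_p below).
# MU and LAM are arbitrary illustrative values (assumptions, not from the text).
import itertools
import math

MU = 1.3
LAM = 0.7


def minus_v(phi1, phi2, mu=MU, lam=LAM):
    """Non-derivative part of eqn 12.15."""
    r2 = phi1**2 + phi2**2
    return -0.25 * lam**2 * r2**2 + 0.5 * mu**2 * r2


def expanded(rho, rho_p, mu=MU, lam=LAM):
    """Non-derivative part of eqn 12.16 plus the constant that the text omits."""
    constant = 0.25 * mu**4 / lam**2
    mass = -mu**2 * rho**2  # only rho has a quadratic term; rho' has none
    cubic = -mu * lam * (rho**3 + rho * rho_p**2)
    quartic = -0.25 * lam**2 * (rho**4 + rho_p**4 + 2 * rho**2 * rho_p**2)
    return constant + mass + cubic + quartic


def max_mismatch():
    """Largest difference between the two forms on a grid of (rho, rho') values."""
    grid = [i / 5 for i in range(-10, 11)]
    v = MU / LAM
    return max(abs(minus_v(r + v, rp) - expanded(r, rp))
               for r, rp in itertools.product(grid, grid))


def scalar_mass(mu=MU):
    """Mass read off by matching -mu^2 rho^2 to -(1/2) m^2 rho^2."""
    return math.sqrt(2.0) * mu


if __name__ == "__main__":
    print(f"max mismatch = {max_mismatch():.2e}")
    print(f"m_rho / mu   = {scalar_mass() / MU:.4f}")
```

The absence of any ρ′<sup>2</sup> term in `expanded` is the numerical counterpart of the statement that ρ′ is massless.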

## **Example 4**

Next, we apply local gauge invariance, by adding a massless vector field, i.e. a field with spin 1 like a photon field. There is a prescription for doing this, which makes the terms we are familiar with from the local gauge invariance discussion cancel properly. We start from Example 3:<sup>10</sup>

$$\begin{aligned} \mathcal{L} &= \underbrace{\frac{1}{2} [(\partial\_{\mu} - iqA\_{\mu})(\phi\_1 - i\phi\_2)][(\partial^{\mu} + iqA^{\mu})(\phi\_1 + i\phi\_2)]}\_{\text{K.E. term modified for local gauge invariance}} \\ &\quad \underbrace{-\frac{1}{4} \lambda^2 (\phi\_1^2 + \phi\_2^2)^2 + \frac{1}{2} \mu^2 (\phi\_1^2 + \phi\_2^2)}\_{-V} \; \underbrace{-\frac{1}{4} F\_{\mu\nu}F^{\mu\nu}}\_{\text{K.E. for vector field}} \end{aligned} \tag{12.17}$$

where F<sup>μν</sup> = ∂<sup>μ</sup>A<sup>ν</sup> − ∂<sup>ν</sup>A<sup>μ</sup> (see Exercise 12.5). We now expand, as is becoming familiar, about a minimum in the potential: φ<sub>1</sub> = ρ + μ/λ and φ<sub>2</sub> = ρ′. The result is

$$\begin{aligned} \mathcal{L} &= \overbrace{\frac{1}{2} (\partial\_{\mu}\rho)(\partial^{\mu}\rho) - \mu^{2}\rho^{2}}^{\rho \text{ scalar}, m = \sqrt{2}\mu} + \overbrace{\frac{1}{2} (\partial\_{\mu}\rho^{\prime})(\partial^{\mu}\rho^{\prime})}^{\rho^{\prime} \text{ scalar, no mass}} \\ &\quad \underbrace{-\frac{1}{4}F\_{\mu\nu}F^{\mu\nu}}\_{\text{vector K.E.}} + \underbrace{\frac{1}{2}q^{2}\frac{\mu^{2}}{\lambda^{2}}A\_{\mu}A^{\mu}}\_{\text{vector mass}} + \underbrace{q\frac{\mu}{\lambda}A\_{\mu}\partial^{\mu}\rho^{\prime}}\_{\text{problem}} \\ &\quad + \text{interaction terms} \end{aligned} \tag{12.18}$$

This is now locally gauge-invariant. The spin-1 field has acquired a mass (the second term on the second line). However, there remains a problem, namely the third term on the second line, which looks like an interaction that allows the A<sub>μ</sub> field to spontaneously change into the ρ′ field.

There is still another trick up our sleeves. We can now pick a particular gauge. Although the form of the Lagrangian will change when we do this, its physical meaning will stay the same (that is what we mean by gauge-invariant). We change φ with a phase as follows: φ → e<sup>iθ</sup>φ, where tan θ = −φ<sub>2</sub>/φ<sub>1</sub>. This particular choice makes the ρ′ field disappear, but we are constrained by local gauge invariance, so the A<sub>μ</sub> field must be modified to compensate for this. What we get is

$$\mathcal{L} = \overbrace{\frac{1}{2} (\partial\_{\mu}\rho)(\partial^{\mu}\rho) - \mu^{2}\rho^{2}}^{\text{massive scalar}} \; \overbrace{- \frac{1}{4} F\_{\mu\nu}F^{\mu\nu} + \frac{1}{2} q^{2} \frac{\mu^{2}}{\lambda^{2}} A\_{\mu}A^{\mu}}^{\text{massive vector}} + \text{interaction terms} \tag{12.19}$$
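As a brief worked reading of eqn 12.19, the masses follow from comparing each quadratic term with the generic form of a scalar mass term, −½m<sup>2</sup>ρ<sup>2</sup>, and of a vector mass term, +½m<sup>2</sup>A<sub>μ</sub>A<sup>μ</sup>. The labels m<sub>ρ</sub> and m<sub>A</sub> are introduced here for illustration only. The count of degrees of freedom uses the standard result that a massless vector field has two polarization states and a massive one has three.

$$m\_{\rho}^{2} = 2\mu^{2} \;\Rightarrow\; m\_{\rho} = \sqrt{2}\,\mu, \qquad m\_{A}^{2} = q^{2}\frac{\mu^{2}}{\lambda^{2}} \;\Rightarrow\; m\_{A} = \frac{q\mu}{\lambda}$$

$$\underbrace{2}\_{\phi\_1,\ \phi\_2} + \underbrace{2}\_{\text{massless } A\_{\mu}} \;=\; \underbrace{1}\_{\rho} + \underbrace{3}\_{\text{massive } A\_{\mu}} \;=\; 4$$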

<sup>10</sup>This equation is eqn 13.108 from Burcham and Jobes and eqn 10.129 from Griffiths; see Further Reading.

We have created a mass for the spin-1 (vector) field and the Goldstone boson has disappeared.<sup>11</sup>

<sup>11</sup>It hasn't quite disappeared: when the vector field became massive, it changed from having two polarization states to having three.

## **This is the Higgs mechanism.**

The real Higgs theory is an extension of the above outline. A somewhat more complicated 'doublet' of Higgs scalars is used to begin with; this allows local gauge invariance to be achieved in the manner shown above with massless versions of all four fields of interest (W<sup>±</sup>, W<sup>0</sup>, and B<sup>0</sup>). We choose a place in the Mexican hat potential to expand around as before, and choose the gauge to make the problematic 'A<sub>μ</sub>∂<sup>μ</sup>ρ′' terms go away.

When we do this, everything works out to agree with experiment, as described in the simplified description in Section 12.3; the W<sup>±</sup> and Z<sup>0</sup> each acquire a mass and the γ remains massless. Also, m<sub>W</sub>/m<sub>Z</sub> = cos θ<sub>W</sub> and the scalar Higgs remains massive. However, the theory does not predict the mass of the Higgs boson.

## **12.6 Higgs discovery**

In this section, we review the evidence for the discovery of the Higgs boson. The experimental results shown come from the 'discovery' papers.<sup>12</sup> As mentioned above, the SM does not predict the mass of the Higgs boson. However, once the mass is known, all the properties are fully specified. The predicted cross section as a function of Higgs mass m<sub>H</sub> is shown in Fig. 12.6 [114]. From the direct LEP Higgs search and the indirect constraints from the precision electroweak data, the 95% confidence level interval for the expected mass is 114 GeV < m<sub>H</sub> < 149 GeV [115].

<sup>12</sup>For the latest results, see the ATLAS and CMS links on this book's website.

**Fig. 12.6** Theoretical cross section for different Higgs production mechanisms [114] as a function of m<sub>H</sub> for pp interactions at √s = 7 TeV.

In this mass range, the dominant production mechanism is via a top-quark loop, because of the very large top-quark mass (see Fig. 12.7) and the large gluon parton distribution function in this x range.

The branching ratios of the Higgs boson as a function of m<sub>H</sub> are shown in Fig. 12.8 [114]. We can see that the Higgs boson tends to decay to the heaviest particles that are kinematically allowed, since the mass of a particle depends on its coupling to the Higgs boson. In the expected range for m<sub>H</sub>, the largest branching ratio is for decays into bb̄ pairs; however, this channel is almost impossible to study in pp interactions in this production mode<sup>13</sup> because of the large irreducible background from QCD production of bb̄ pairs.
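The tendency to decay to the heaviest kinematically allowed particles can be illustrated for fermion pairs with a rough lowest-order estimate. The sketch below assumes the standard tree-level expression Γ(H → ff̄) = N<sub>c</sub> G<sub>F</sub> m<sub>f</sub><sup>2</sup> m<sub>H</sub> β<sup>3</sup> / (4√2 π), with β<sup>2</sup> = 1 − 4m<sub>f</sub><sup>2</sup>/m<sub>H</sub><sup>2</sup>, which is not quoted in this chapter. The fermion masses are approximate values chosen for illustration, and higher-order corrections are ignored.

```python
# Lowest-order sketch of how H -> f fbar partial widths scale with fermion mass.
# Assumptions (not from the text): the standard tree-level width formula,
# approximate fermion masses in GeV, and no QCD or electroweak corrections.
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2

# name: (approximate mass in GeV, number of colours)
FERMIONS = {
    "top": (173.0, 3),
    "bottom": (4.18, 3),
    "charm": (1.27, 3),
    "tau": (1.777, 1),
    "muon": (0.1057, 1),
}


def partial_width(m_f, n_c, m_h=125.0):
    """Tree-level width in GeV; zero if the decay is kinematically forbidden."""
    if 2.0 * m_f >= m_h:
        return 0.0
    beta = math.sqrt(1.0 - 4.0 * m_f**2 / m_h**2)
    return n_c * G_F * m_f**2 * m_h * beta**3 / (4.0 * math.sqrt(2.0) * math.pi)


def widths(m_h=125.0):
    """Partial widths in GeV for every fermion in FERMIONS."""
    return {name: partial_width(m, nc, m_h) for name, (m, nc) in FERMIONS.items()}


if __name__ == "__main__":
    result = widths()
    for name, gamma in sorted(result.items(), key=lambda kv: -kv[1]):
        print(f"{name:>6}: {gamma * 1e3:.4f} MeV")
```

The top quark gives zero because the decay is kinematically closed, and the b quark dominates among the open fermionic channels. The absolute numbers should be treated as illustrative only, since QCD corrections are known to change the quark widths substantially.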

At lowest order in perturbation theory, the decay H → γγ would not occur, since the photon mass m<sub>γ</sub> = 0. However, the decay can occur through virtual t and W loops as shown in Fig. 12.9. There is significant negative interference between the two diagrams.

The most important decay channels for the Higgs boson discovery are as follows:


**Fig. 12.7** Feynman diagram for Higgs production via a top-quark loop.

<sup>13</sup>An alternative production mode is VH, where V is either a W<sup>±</sup> or a Z. The cross sections are smaller but this mode has much smaller backgrounds and is expected to be observed at the LHC.

**Fig. 12.8** Higgs branching ratios [114] as functions of mH.

**Fig. 12.9** Feynman diagrams for the decay H → γγ via a top-quark loop (a) and a W loop (b).

<sup>15</sup>The azimuthal angle can be determined from a line joining the shower centre to the centre of the detector. In the longitudinal direction (z) the event vertex distribution has a significant spread in z. The longitudinal angle can be determined from the shower centres in the different longitudinal layers. Alternatively, an algorithm can be used to determine which is the correct vertex (at typical LHC luminosities, there will be about 20 vertices) and the γ longitudinal angle can then be determined from a line joining the event vertex to the shower centre.

**Fig. 12.10** Response of the ATLAS EM calorimeter [26] for example events: (a) π<sup>0</sup> → γγ; (b) prompt γ. The prompt γ makes a single shower, whereas the two γs from the π<sup>0</sup> decay show evidence for two distinct clusters.

this channel are that it is very clean and also allows for a precise reconstruction of the invariant mass of the ZZ<sup>∗</sup>.

• *H → WW<sup>∗</sup>***:** This channel is also suppressed for the actual mass of the Higgs boson because one of the Ws is off mass shell. The decay modes of the W that are used are W → eν<sub>e</sub> and W → μν<sub>μ</sub>. The signal is larger than for ZZ<sup>∗</sup>, but as there are two neutrinos in the final state it is impossible to reconstruct an invariant mass. The transverse mass (see Section 12.6.3) is used, because the spectrum has an endpoint at the value of the invariant mass. This means that the signal peak is very broad and a very careful prediction of all the SM backgrounds is required.

## **12.6.1** *γγ* **channel**

We can have γs produced in the primary interaction (called 'prompt') or as a result of the decays of mesons (mainly π<sup>0</sup>). There is a potentially large reducible background<sup>14</sup> from events with one prompt γ and a jet faking a second prompt γ, or from two-jet events with both jets faking a γ. These reducible backgrounds can be suppressed with a very high-granularity electromagnetic (EM) calorimeter. A π<sup>0</sup> → γγ decay will tend to produce two distinct showers in the EM calorimeter as opposed to the single shower from a genuine prompt γ. In addition, the prompt γ will tend to be isolated whereas a γ from a π<sup>0</sup> decay will in general be part of a hadronic jet and will therefore not be isolated. The different responses in the ATLAS calorimeter for photons from a π<sup>0</sup> and a prompt γ are illustrated in Fig. 12.10. This allows the reducible background to be decreased to a level well below that of the irreducible background. The irreducible background is from the QCD production of prompt γγ (see Fig. 12.11).

<sup>14</sup>The 'reducible' background is one that could be removed if the detector were perfect. The 'irreducible' background has the same final state as the signal, and hence it is impossible to remove it on an event-by-event basis. However, in general, we can use some distributions (e.g. the invariant mass) to make a statistical separation between signal and background.

The very good energy resolution of the EM calorimeter is critical for a precise reconstruction of the γγ mass m<sub>γγ</sub>. However, the γ directions must also be well measured for a precise reconstruction of m<sub>γγ</sub> and this requires good granularity in the EM calorimeter.<sup>15</sup> The m<sub>γγ</sub> spectrum is fitted to a combination of an empirical background function and a signal distribution. The signal shape is taken from the SM prediction after simulating all detector effects, for given assumed values of m<sub>H</sub>. Since the expected statistical significance of various classes of events (e.g. whether the γ converted into an e<sup>+</sup>e<sup>−</sup> pair in the tracking detector) is different, it is advantageous to plot the m<sub>γγ</sub> spectrum weighted by the expected ratio of signal to background. An example of such a plot from the Compact Muon Solenoid (CMS) experiment is shown in Fig. 12.12 [64]. The plot appears to show a Higgs-like signal on top of a smooth background, but the discussion of the statistical significance will be deferred until Section 12.6.4.
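The dependence of m<sub>γγ</sub> on both the photon energies and their directions can be made concrete with a short sketch. It assumes each photon is described by its energy, pseudorapidity η, and azimuthal angle φ, a common collider parametrization that is adopted here for illustration; the function names are hypothetical.

```python
# Sketch: invariant mass of two photons from calorimeter-style measurements.
# The (E, eta, phi) parametrization and the function names are illustrative
# assumptions; photons are treated as massless.
import math


def photon_four_vector(energy, eta, phi):
    """Return (E, px, py, pz) in GeV for a massless photon."""
    pt = energy / math.cosh(eta)
    return (energy, pt * math.cos(phi), pt * math.sin(phi), pt * math.sinh(eta))


def diphoton_mass(photon1, photon2):
    """Invariant mass of two photons, each given as (E, eta, phi)."""
    e1, px1, py1, pz1 = photon_four_vector(*photon1)
    e2, px2, py2, pz2 = photon_four_vector(*photon2)
    m2 = (e1 + e2) ** 2 - (px1 + px2) ** 2 - (py1 + py2) ** 2 - (pz1 + pz2) ** 2
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


def opening_angle_mass(e1, e2, alpha):
    """Equivalent form m^2 = 2 E1 E2 (1 - cos alpha) for opening angle alpha."""
    return math.sqrt(2.0 * e1 * e2 * (1.0 - math.cos(alpha)))


if __name__ == "__main__":
    # Two back-to-back 62.5 GeV photons at eta = 0.
    print(diphoton_mass((62.5, 0.0, 0.0), (62.5, 0.0, math.pi)))
```

The second form shows directly that an error in either energy or in the opening angle feeds into the reconstructed mass, which is why both energy resolution and granularity matter.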

## **12.6.2** *ZZ <sup>∗</sup>* **channel**

In order to have a clean signal and to be able to make a precise reconstruction of the mass of the Higgs boson, we use this mode with electron and muon final states: Z/Z<sup>∗</sup> → e<sup>+</sup>e<sup>−</sup> or Z/Z<sup>∗</sup> → μ<sup>+</sup>μ<sup>−</sup>. In these decay modes, we have the best signal-to-background (S/B) ratio, but the rate is suppressed by the small branching ratios for the Z/Z<sup>∗</sup> decay modes used. The main reducible background for this channel comes from events with two 'prompt' leptons and two from the semileptonic decays of b quarks (e.g. Zbb̄ and tt̄ → WbWb̄). The relatively long flight path of B hadrons can be used to veto leptons from the decays of b quarks. The irreducible background is from ZZ<sup>∗</sup> production without a Higgs boson as an intermediate state, which is indistinguishable from the signal, apart from the fact that the mass spectrum of the four leptons (m<sub>4l</sub>) for the signal will show a peak at m<sub>H</sub>, whereas the background will be a smooth distribution. Clearly, the excellent energy and momentum resolutions for electrons and muons, respectively, help improve the S/B ratio. The measured distribution of m<sub>4l</sub> in the CMS experiment [64] is compared with the combination of the expected background and signal for a SM Higgs (with m<sub>H</sub> = 125 GeV) in Fig. 12.13.

**Fig. 12.11** Lowest-order Feynman diagram for prompt γγ production.

**Fig. 12.12** Distribution of m<sub>γγ</sub> weighted by S/B, with S and B being the expected signal and background, respectively, from the CMS experiment [64].

## **12.6.3** *WW<sup>∗</sup>* **channel**

This channel benefits from the larger Higgs branching ratio compared with that for the ZZ<sup>∗</sup> channel. The channel that is least contaminated by background and therefore has the best significance is the one with electrons and muons: W/W<sup>∗</sup> → eν<sub>e</sub> and W/W<sup>∗</sup> → μν<sub>μ</sub>. This decay mode has larger branching ratios than the decay modes we used for ZZ<sup>∗</sup> (see Section 12.6.2), so the total number of events expected is significantly greater. The disadvantage of this channel is that there are two neutrinos in the final state, so it is not possible to determine the invariant mass (M<sub>WW<sup>∗</sup></sub>). If there were only one neutrino, the transverse mass would provide a sharp endpoint to the spectrum as in the case of single-W production. It still turns out to be useful to define the transverse mass in a similar way as for single-W production:

$$m\_\mathrm{T}^2 = (E\_\mathrm{T}^{ll} + E\_\mathrm{T}^{\mathrm{miss}})^2 - (\mathbf{p}\_\mathrm{T}^{ll} + \mathbf{E}\_\mathrm{T}^{\mathrm{miss}})^2 \tag{12.20}$$

where **p**<sub>T</sub><sup>ll</sup> is the total momentum in the transverse plane of the two charged leptons, E<sub>T</sub><sup>miss</sup> and **E**<sub>T</sub><sup>miss</sup> are respectively the magnitude of the missing transverse momentum and the missing transverse momentum vector, and

$$(E\_\mathrm{T}^{ll})^2 = (\mathbf{p}\_\mathrm{T}^{ll})^2 + m\_{ll}^2\tag{12.21}$$

**Fig. 12.13** Distribution of m<sub>4l</sub> from the CMS experiment [64]. The expected background and signal for the case m<sub>H</sub> = 125 GeV are also shown. The insert shows the mass distribution for a subsample of the events that pass a kinematic selection designed to optimize the ratio of signal to background.

where m<sub>ll</sub> is the invariant mass of the two charged leptons. It turns out that the distribution of m<sub>T</sub> has an endpoint at the value of the invariant mass of the Higgs boson. However, the effective mass resolution is only ∼20%. There are significant reducible backgrounds from W + jet events with the jet faking a lepton and from Drell–Yan processes (see Chapter 9) with additional jets (e.g. e<sup>+</sup>e<sup>−</sup> jet jet or μ<sup>+</sup>μ<sup>−</sup> jet jet) with fake missing transverse energy from mismeasurements. Therefore, in ATLAS, only events with one W decaying to eν<sub>e</sub> and one decaying to μν<sub>μ</sub> were considered. There remains an irreducible background from WW production. In order to reduce this background on a statistical basis, various cuts are used. For example, for WW arising from the decay of a spin-0 Higgs, the spins of the two Ws must point in opposite directions and the two charged leptons will tend to have momenta in similar directions (see Exercise 12.11). The irreducible background from WW production (see Fig. 12.14 for an example Feynman diagram for WW production) must be determined very precisely in order to be able to detect a significant excess from Higgs events. This is done by selecting events with different kinematic cuts to obtain background-dominated samples, which are used to fix the normalization of the background. The ratio of the number of expected background events in the signal region to the number of events in the background ('control') region is taken from Monte Carlo simulation. The transverse mass distribution measured by ATLAS is shown in Fig. 12.15 [26, 30] and compared with the expectations from background and a SM Higgs signal for m<sub>H</sub> = 126 GeV.
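The transverse mass of eqns 12.20 and 12.21 is straightforward to evaluate once the dilepton system and the missing transverse momentum are known. The following is a minimal sketch; the function name and the way the inputs are passed are illustrative assumptions.

```python
# Sketch of the transverse mass of eqns 12.20 and 12.21.
# Inputs are transverse-plane components in GeV; the function name and the
# argument layout are illustrative assumptions.
import math


def transverse_mass(pt_ll, m_ll, met):
    """m_T for a dilepton system and missing transverse momentum.

    pt_ll : (px, py) of the two charged leptons combined
    m_ll  : invariant mass of the two charged leptons
    met   : (px, py) of the missing transverse momentum
    """
    et_ll = math.sqrt(pt_ll[0] ** 2 + pt_ll[1] ** 2 + m_ll ** 2)  # eqn 12.21
    et_miss = math.hypot(met[0], met[1])
    sum_x = pt_ll[0] + met[0]
    sum_y = pt_ll[1] + met[1]
    mt2 = (et_ll + et_miss) ** 2 - (sum_x ** 2 + sum_y ** 2)      # eqn 12.20
    return math.sqrt(max(mt2, 0.0))  # guard against tiny negative rounding errors


if __name__ == "__main__":
    # Dilepton system balanced against the missing transverse momentum.
    print(transverse_mass((30.0, 0.0), 40.0, (-30.0, 0.0)))
```

When the dilepton system and the missing momentum are back to back, m<sub>T</sub> is largest; when they are collinear and m<sub>ll</sub> = 0, it vanishes.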

## **12.6.4 Statistical significance**

The relatively small signals in all channels and the non-negligible backgrounds mean that a careful assessment of the statistical significance is required before any claims about a discovery can be made. The technique used is based on maximum-likelihood fits. Probability density functions (PDFs) are computed assuming there is a signal with the same shape<sup>16</sup> as the SM Higgs but scaled by a normalization constant μ (i.e. μ = 1 corresponds to the SM Higgs expectation and μ = 0 corresponds to a pure background distribution). The PDF also depends on many 'nuisance' parameters (such as uncertainties in the detector response to different particles). The likelihood of a sample<sup>17</sup> is defined as

$$L(\mu) = \prod\_{i} p(i \mid \mu, \boldsymbol{\theta}) \tag{12.22}$$

where the product runs over all events in the sample and p(i | μ, **θ**) is the probability of observing event i, assuming a particular value of the parameter μ and for some set of nuisance parameters **θ**. The test statistic is defined as<sup>18</sup>

$$\tilde{q}\_{\mu} = -2\ln\left[\frac{L(\text{data}\,\,|\,\mu)}{L(\text{data}\,\,|\,\hat{\mu})}\right] \tag{12.23}$$

where μ̂ is the value obtained by maximizing the likelihood by varying the value of μ in the denominator in eqn 12.23 and the value of μ in the numerator is fixed to a particular value depending on what statistical test is being performed. We can then define two probabilities:

Finally, we define the ratio

$$\text{CL}\_{\text{s}}(\mu) = \frac{p\_{\mu}}{1 - p\_{\text{b}}} \tag{12.24}$$

The 95% confidence level (CL) limit on μ is found by adjusting μ until CL<sub>s</sub>(μ) = 0.05. This procedure is carried out for a range of values of m<sub>H</sub> and we can state the SM Higgs boson is excluded at 95% confidence level for a particular value of m<sub>H</sub> if μ(95% CL) < 1.<sup>19</sup> The resulting exclusion plot from ATLAS is shown in Fig. 12.16(a) [26]. The SM Higgs is excluded over the ranges 111–121 GeV and 131–559 GeV. The reason why there is a gap in the exclusion range between 121 and 131 GeV is that there is evidence for an excess over the SM background. The significance of this excess can be quantified by the probability p<sub>0</sub> that a background-only hypothesis could generate a larger value of the test statistic q<sub>μ</sub> for a particular value of m<sub>H</sub>. The resulting p<sub>0</sub> versus m<sub>H</sub> plot for ATLAS is shown in Fig. 12.16(b). The minimum p<sub>0</sub> value corresponds to a 6σ fluctuation for a Gaussian distribution. However, it is important to allow for the 'look-elsewhere effect'; the p<sub>0</sub> value corresponds to the probability of observing a larger fluctuation at a particular value of m<sub>H</sub>, but we could have seen an excess over a range of values for m<sub>H</sub>. Therefore, the probability of observing an equal excess over the full mass range is larger by a factor N ∼ Δm<sub>H</sub>/σ(m<sub>H</sub>), where Δm<sub>H</sub> is the range in m<sub>H</sub> and σ(m<sub>H</sub>) is the resolution in the event-by-event measurement.

Allowing for this effect for a search over the allowed range from previous experiments at LEP and the Tevatron (110–150 GeV), the statistical significance is reduced to 5.3σ.<sup>20</sup> A very similar result was obtained from the CMS experiment [64].<sup>21</sup> Finally, one can ask if the observed excess is consistent with the SM Higgs. This is addressed in Fig. 12.16(c), which shows the 95% confidence level on the signal strength parameter μ as a function of m<sub>H</sub>. This shows that the result is consistent with the SM Higgs expectations for a Higgs with m<sub>H</sub> = 126 GeV.

<sup>16</sup>By 'shape' we mean the shape of the distributions for the relevant variables.

<sup>17</sup>In practice, we use a 'binned' likelihood, in which the events are binned in the relevant variable(s) and the product is over the bins. In addition, for ease of computation, we work with a 'log-likelihood' ln L(μ) = Σ<sub>i</sub> ln p(i | μ, **θ**).

<sup>18</sup>In order to simplify the explanation, we now ignore the nuisance parameters.

<sup>19</sup>In the frequentist interpretation of statistics, the statement that the SM Higgs is excluded at a given value of m<sub>H</sub> at 95% confidence level means that if there were a SM Higgs at this mass and the experiment were repeated many times, then in 95% of the cases a larger value of the test statistic would be obtained.

**Fig. 12.14** An example Feynman diagram at the quark level for WW scattering.

**Fig. 12.15** Distribution of the transverse mass m<sub>T</sub> from the ATLAS experiment [26] in a sample of candidate WW events. The expected background and signal for the case m<sub>H</sub> = 126 GeV are also shown.
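The likelihood-ratio test statistic of eqn 12.23 and the look-elsewhere correction can be illustrated with a deliberately simplified counting experiment. The sketch below assumes a single bin with a Poisson likelihood, known expected signal s and background b, no nuisance parameters, and the large-sample relation Z ≈ √q<sub>0</sub> between the background-only test statistic and the significance. None of these simplifications is taken from the ATLAS or CMS analyses, and the trials factor in the usage example uses hypothetical numbers.

```python
# Sketch: a single-bin counting experiment illustrating the likelihood-ratio
# test statistic of eqn 12.23 and a crude look-elsewhere correction.
# Assumptions (not from the text): a Poisson likelihood with known expected
# signal s and background b, no nuisance parameters, the large-sample relation
# Z = sqrt(q_0), and illustrative values for the mass range and resolution.
import math
from statistics import NormalDist


def log_likelihood(n_obs, mu, s, b):
    """ln L(mu) for one Poisson bin with mean mu * s + b (the mean must be > 0)."""
    mean = mu * s + b
    return n_obs * math.log(mean) - mean - math.lgamma(n_obs + 1)


def q_mu(n_obs, mu, s, b):
    """-2 ln [L(mu) / L(mu_hat)], with mu_hat maximizing the likelihood (n_obs > 0)."""
    mu_hat = (n_obs - b) / s  # analytic maximum for a single bin
    return -2.0 * (log_likelihood(n_obs, mu, s, b)
                   - log_likelihood(n_obs, mu_hat, s, b))


def local_significance(n_obs, s, b):
    """Approximate significance of an excess over the background-only case (mu = 0)."""
    if n_obs <= b:
        return 0.0  # no excess, so no evidence for a signal
    return math.sqrt(q_mu(n_obs, 0.0, s, b))


def p_value(z):
    """One-sided Gaussian tail probability for a z-sigma fluctuation."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def global_significance(z_local, n_trials):
    """Scale the local p-value by a trials factor N and convert back to sigma."""
    p_global = min(n_trials * p_value(z_local), 0.5)
    return -NormalDist().inv_cdf(p_global)


if __name__ == "__main__":
    print(f"local Z  = {local_significance(150, 50.0, 100.0):.2f}")
    n_trials = 40.0 / 2.0  # hypothetical: 40 GeV search window, 2 GeV resolution
    print(f"global Z = {global_significance(6.0, n_trials):.2f}")
```

A simple trials factor of this kind is only a rough estimate, so it should not be expected to reproduce the quoted global significance exactly.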

## **12.7 Coupling to fermions**

The SM Higgs mechanism generates masses for fermions as well as for bosons. Therefore, the Higgs boson should have decay modes to fermions, with strengths that grow with the mass of the fermion. As the Higgs boson is too light to decay into tt̄, the most massive fermion available for this process is the b quark and hence there is a large branching ratio for H → bb̄. However, this decay mode is extremely difficult to study because of the very large QCD production of bb̄.<sup>22</sup> The next heaviest fermion in the SM is the τ lepton and there is a significant branching ratio for H → ττ (see Fig. 12.8). The identification of τ leptons is much harder than the identification of e or μ. There are many backgrounds to consider. The H → ττ channel has an irreducible background from Z → ττ. Separation between the signal and this background can be achieved by reconstructing the mass of the Higgs boson, but the mass resolution is severely degraded by the presence of multiple neutrinos in the final state (see Section 12.6.3). The leptonic decay modes of the τ have smaller branching ratios than the hadronic modes. However, the hadronic decays of the τ have very large backgrounds from QCD jets that can be misinterpreted as τs. Some separation between hadronically decaying τs and QCD jets is obtained by counting the number of charged tracks in a narrow cone around the τ candidate. The number of tracks from τ decays is usually one or three, whereas QCD jets have a higher average charge multiplicity. Additional separation between τ hadronic decays and QCD jets is provided by measurements of the shower profile in the calorimeter, since the τs will tend to produce relatively narrow jets.<sup>23</sup> Although the invariant mass cannot be reconstructed, a best estimate of the mass can be made (see Exercise 12.10 for a simpler method to estimate the Higgs mass in τ decays). The mass distributions for the data and all the SM backgrounds are shown in Fig. 12.17 and a peak above the SM background can be seen. The peak is consistent with the SM expectation and provides evidence at the 4.1σ level for this decay mode of the Higgs boson [28]. The CMS experiment also found evidence for this decay mode [66].

**Fig. 12.16** (a) 95% confidence limit on the SM Higgs. (b) Local p<sub>0</sub> values as a function of m<sub>H</sub>. (c) Variation of fitted signal strength μ with m<sub>H</sub> from the ATLAS experiment [26].

**Fig. 12.17** Distribution of the best estimate for the Higgs boson mass for data and SM backgrounds for candidate ττ events. All events have been weighted by a factor ln(1 + S/B), where S/B is the expected signal-to-background ratio [28].

<sup>20</sup>The look-elsewhere effect only accounts for the range of mass for this one study. Since experiments make many independent measurements, we want to minimize the probability of falsely claiming a discovery, so a high threshold in significance is used before claiming a discovery. Conventionally, this is taken to be 5σ.

<sup>21</sup>Consistent results were obtained by the Tevatron experiment at the 3σ level, mainly in the bb̄ channel.

<sup>22</sup>This decay mode should be observable at the LHC in the VH production mode, where V = W<sup>±</sup> or Z. However, as this production mode has a smaller cross section, it will need larger data samples than are currently available (as of 2014).

<sup>23</sup>The backgrounds are so large that simply making a sequence of cuts to optimize signal and reduce background would not be sufficient to reveal a signal, and so a more sophisticated multivariate analysis is required (see Chapter 8).

## **12.8 Determination of the spin and parity of the new boson**

One fundamental prediction of the SM is that the spin/parity of the Higgs boson should be J<sup>P</sup> = 0<sup>+</sup>. Since the new boson decays to two bosons, it must have integer spin. The observation of the decay mode H → γγ excludes the spin-1 hypothesis [95]. The SM predictions can be compared with models in which the boson has alternative spin/parity assignments. For the γγ decay mode, we can compare the SM with predictions for graviton-inspired models<sup>24</sup> with J<sup>P</sup> = 2<sup>+</sup>. We can define the angle of the photons relative to the direction of the Higgs boson (θ<sup>∗</sup>). The distribution of cos θ<sup>∗</sup> should be isotropic for the case of a spin-0 Higgs. However, in the case of a boson with J<sup>P</sup> = 2<sup>+</sup>, the distribution can be forward-peaked. The angular distribution of the observed events is biased by the acceptance of the detector and the selection required to isolate the signal above the background. However, there are still significant differences between the predictions of the two models, and the data can be used to discriminate between them. The distribution of cos θ<sup>∗</sup> measured by ATLAS [29] is compared with the predictions of the two hypotheses in Fig. 12.18. Information on the spin/parity can also be obtained from the ZZ<sup>∗</sup> decay mode, although the errors are still quite large because of the limited number of events in the current data sample. The WW decay modes for the case of a spin-0 Higgs should result in correlations between the azimuthal angles of the charged leptons from the resulting decays of the W bosons (see Exercise 12.11). Combining the analyses from three decay modes, all the results are consistent with the SM quantum numbers of J<sup>P</sup> = 0<sup>+</sup> and alternative models can be excluded at a range of confidence levels from 97.7% to 99.9% [28]. Results of a similar analysis from the CMS experiment [65] in the H → ZZ decay mode also favour the SM and disfavour alternative models.

<sup>24</sup>The graviton is a hypothetical particle that is believed to be the quantum of the gravitational field. The graviton must have spin 2 to correspond to the classical theory of general relativity, in which the gravitational interaction arises from the stress–energy tensor, which is a second-rank tensor.

A clear summary of the Higgs boson coupling measurements from the CMS experiment [67] is shown in Fig. 12.19. This shows the measured strength of the Higgs boson coupling to fermions (λ) or bosons (g) as a function of the mass of the particle. For the top quark, the decay H → tt̄ is kinematically forbidden because of the large mass of the top quark. However, we can still determine the coupling of the top quark to the Higgs boson, because the dominant production mechanism is via a top-quark loop (see Fig. 12.7). In the SM, we expect the coupling strengths to increase with the mass of the particle. The measured coupling strengths show the increase with particle mass expected in the SM. However, the errors in some of the measurements are large and more data will make this test more powerful.<sup>25</sup>
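The SM expectation that couplings grow with particle mass can be sketched numerically. The example below assumes that a common 'reduced' coupling m/v can be used for both fermions and vector bosons, with v ≈ 246 GeV the Higgs vacuum expectation value. This choice, the approximate masses, and the names used are assumptions made for illustration; the exact definition of the quantities plotted in Fig. 12.19 should be taken from [67].

```python
# Sketch of the SM expectation that Higgs couplings grow with particle mass.
# Assumptions (not from the text): the vacuum expectation value v = 246 GeV,
# approximate particle masses in GeV, and the reduced coupling m / v used as
# the common quantity for fermions and vector bosons.

V_EV = 246.0  # Higgs vacuum expectation value in GeV (approximate)

MASSES = {
    "muon": 0.1057,
    "tau": 1.777,
    "bottom": 4.18,
    "W": 80.4,
    "Z": 91.19,
    "top": 173.0,
}


def reduced_coupling(mass, vev=V_EV):
    """SM expectation for the reduced coupling, linear in the particle mass."""
    return mass / vev


def expected_couplings(masses=MASSES):
    """Reduced coupling for every particle in the given mass table."""
    return {name: reduced_coupling(m) for name, m in masses.items()}


if __name__ == "__main__":
    for name, coupling in sorted(expected_couplings().items(), key=lambda kv: kv[1]):
        print(f"{name:>6}: m = {MASSES[name]:8.4f} GeV, m/v = {coupling:.5f}")
```

On a log–log plot these points fall on a straight line, and a measured coupling that departed significantly from the line would indicate a deviation from the SM.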

<sup>25</sup>Eventually, there should be sufficient data at the LHC to see the H → μ<sup>+</sup>μ<sup>−</sup> decay mode and thus provide an additional data point at much lower mass.



**Fig. 12.18** Angular distribution of cos θ<sup>∗</sup> measured by ATLAS [29] compared with two hypotheses for the spin/parity: (a) J<sup>P</sup> = 0<sup>+</sup>; (b) J<sup>P</sup> = 2<sup>+</sup>.

**Fig. 12.19** Measured values of the Higgs boson coupling to fermions (λ) or bosons (g) as a function of particle mass from the CMS experiment [67].

## **12.9 Outlook**

The results already obtained provide convincing evidence for the existence of a new boson (it decays to γγ, so it must have even spin).<sup>26</sup> The study of the spin/parity gives results consistent with the SM expectation of 0<sup>+</sup> and other possibilities are excluded. The properties appear to be consistent with the SM Higgs, but this is just the beginning of a new chapter in physics, which will involve many more detailed measurements:

<sup>26</sup>The γγ decay mode excludes spin 1.


The first three items will be studied with new data in the next few years at the LHC. The last two items are more challenging and will require a further upgrade to the LHC luminosity.<sup>27</sup> In the SM, there is only expected to be one Higgs boson, but in many Beyond the SM (BSM) theories, there can be more. For example, in the minimal supersymmetric extension of the SM (MSSM), there should be five Higgs bosons (see Chapter 13). Therefore, searching for additional Higgs bosons will clearly be a vital part of the future LHC physics programme.

<sup>27</sup>Such an upgrade to the LHC is planned for 2024. This will also require extensive upgrades to the ATLAS and CMS detectors to cope with the higher luminosity.

## **Chapter summary**

## **Further reading**


but accessible account of the Higgs mechanism in the SM.

• Aitchison, I. J. R. and Hey, A. J. G. (2013). Gauge Theories in Particle Physics, Volume 2 (4th edn). CRC Press. This is a thorough and more advanced graduate-level treatment of the Higgs mechanism.

## **Exercises**


$$\begin{aligned} f(x) &= x^4 - 2x^2 \\ f(x) &= -1 + 0(x - 1) + 4(x - 1)^2 + 4(x - 1)^3 \\ &\quad + (x - 1)^4 \\ f(x) &= -1 + 0(x + 1) + 4(x + 1)^2 - 4(x + 1)^3 \\ &\quad + (x + 1)^4 \end{aligned}$$

The second and third expressions are in the form of Taylor series of the first about the points x = 1 and x = −1, respectively, which are the two minima of the function. The series terminate (why is this?) and so we end up with terms only up to the fourth order.

(12.4) Verify that applying the Euler–Lagrange equation to the Lagrangian for a spin-0 particle (eqn 12.11) gives the Klein–Gordon equation.

> Note: If your answer has an unwanted factor 2, check by first expanding the (∂μφ)(∂νφ) term into the four components.

(12.5) Consider the Lagrangian given by

$$\mathcal{L} = - (\partial^{\mu} A^{\nu} - \partial^{\nu} A^{\mu})(\partial\_{\mu} A\_{\nu} - \partial\_{\nu} A\_{\mu})$$

Apply the Euler–Lagrange equations to show that the equations of motion are ∂<sub>μ</sub>F<sup>μν</sup> = 0, where F<sup>μν</sup> = ∂<sup>μ</sup>A<sup>ν</sup> − ∂<sup>ν</sup>A<sup>μ</sup>. Comment on the relation of this result to classical electromagnetism.


estimate of the Higgs boson production cross section. Explain whether your estimate would be an underestimate or an overestimate.

(12.10) This question looks at the reconstruction of the Higgs boson mass in H → ττ decays in the collinear approximation. The momenta of the neutrino(s) from each τ decay are assumed to be parallel to each τ. Show that if the angles in the transverse plane for each τ are measured, and the transverse momentum of the recoiling hadronic system (**R**) is measured, we can determine the transverse momenta of the two τs. What other parameters must be measured in order to determine the invariant mass of the Higgs boson? Why does this method break down if the Higgs boson is produced with no transverse momentum?

Hint: It is sufficient to consider the special case in which the two τs decay along the y axis.

(12.11) Consider the decay of a spin-0 Higgs to W<sup>+</sup>W<sup>−</sup> with subsequent leptonic decays of the Ws. Let Δφ be the azimuthal angle between the two final-state leptons. Draw diagrams to illustrate the possible polarization states of the Ws. What would be the most probable decay angle of the leptons relative to the spin of the Ws? Hence explain why the distribution of Δφ should peak at small values.

> Why do the experiments use the relative angle between the leptons in the transverse plane (Δφ) rather than the space angle?


# **13 LHC and BSM**

Now that there is growing evidence for the discovery of the Higgs boson, there are no missing particles in the Standard Model (SM). There are still many details to be checked to be sure that the new boson discovered at the LHC (see Chapter 12) is compatible with the SM Higgs boson. Even if this turns out to be the case, however, serious problems remain with the SM. A brief review of these issues, particularly the hierarchy problem, will be given in this chapter. We will justify the claim that we should expect to see new Beyond the Standard Model (BSM) physics in the TeV energy range. This then provides one of the two principal motivations for the LHC (the first being understanding the origin of mass), and we will look at how LHC experiments are searching for new physics. The study of the Higgs boson could be extended at a future linear e<sup>+</sup>e<sup>−</sup> collider (a 'Higgs factory'). Looking further into the future, we will see how a high-energy linear e<sup>+</sup>e<sup>−</sup> collider could probe deeper into any new physics that might be discovered at the LHC, if it were at an accessible energy.

## **13.1 LHC and Standard Model physics**

The main motivation for the LHC is to understand the origin of mass and to search for new physics at the TeV scale. The origin of mass has been discussed in Chapter 12, so we will now consider BSM physics. However, any new physics channel will have some SM physics backgrounds, so it is essential to check that the SM works in the new region of phase space opened up by the LHC. This is non-trivial because, even with the LHC operating at 8 TeV CMS energy (rather than the design value of 14 TeV), high-Q<sup>2</sup> processes such as W/Z production<sup>1</sup> sample very much lower values of the parton x distribution than previous experiments (see Chapter 9). This is illustrated in Fig. 13.1, from which one can see the large increase in the range of x that can be studied at high Q<sup>2</sup> at the LHC, compared with previous accelerators [115].

<sup>1</sup>Q<sup>2</sup> is the scale of the process; e.g. for W production, Q<sup>2</sup> = M<sub>W</sub><sup>2</sup>.

The predicted cross sections (see Chapter 9 for an explanation of how these calculations are performed) are shown as a function of CMS energy in Fig. 13.2. There is an enormous spread in the magnitudes of the cross sections for SM processes and most of them are much larger than those expected for Higgs production or for new physics processes.

This raises the critical question of how one can trigger and identify the interesting events in the presence of these enormous backgrounds.

*Particle Physics in the LHC Era*, Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, & Tony Weidberg 2016. Published in 2016 by Oxford University Press.

**Fig. 13.1** Kinematic plane in x and Q<sup>2</sup> for the LHC at √s = 14 TeV, compared with HERA, Tevatron, and fixed-target experiments [115].

**Fig. 13.2** Theoretical cross sections for different processes as functions of CMS energy for p̄p (Tevatron) and pp (LHC). Figure from private communication from Professor James Stirling.

## **13.2 LHC triggers**

The scale of the problem is set by the magnitude of the total cross section. At the LHC design luminosity of 10<sup>34</sup> cm<sup>−2</sup> s<sup>−1</sup>, the total interaction rate is ∼1 GHz. Each proton beam is composed of ∼2000 bunches of protons. Each bunch contains ∼10<sup>11</sup> protons. During nominal LHC operation, the bunches collide every 25 ns and there are an average of 24 interactions per bunch crossing (b.c.).<sup>2</sup>

The general principles of pipelined triggers were briefly reviewed in Chapter 4. Most of the triggers are based on identification of objects like electrons, photons, muons, or 'jets' at high transverse momentum p<sub>T</sub>. An example of a first-level (L1) trigger in ATLAS is the e/γ trigger, which uses the fact that electrons give very localized energy deposition in the electromagnetic (EM) calorimeter, whereas the background processes arise from QCD jets, which consist of mixtures of γs (mainly from π<sup>0</sup> decays) and hadrons. The L1 e/γ trigger selects candidates by requiring localized energy in the EM calorimeter and vetoing events with too much energy in the hadronic calorimeter behind the candidate e/γ object or in the surrounding cells of the EM calorimeter. The higher-level triggers can utilize data from the tracking detector to provide further background rejection. A genuine electron will have a track with a transverse momentum and direction compatible with the energy deposition in the EM calorimeter, unlike the background from QCD jets. The fine granularity of the EM calorimeter is also used to provide more background rejection because EM showers are narrower than hadronic showers (see Chapter 4).

<sup>2</sup>In 2012 LHC operation, the luminosity was slightly lower than the design value, but the b.c. spacing was 50 ns, so the average number of interactions per bunch crossing was ∼30.

The very large trigger rejection must be achieved while still maintaining high efficiency for the interesting objects like electrons and muons. This raises the crucial question as to how one can determine if the trigger system is working efficiently or not. In general, this can be done using a 'tag-and-probe' analysis. For example, Z → e<sup>+</sup>e<sup>−</sup> events can be triggered with a single-electron trigger on one of the electrons in the event. One of the electrons (we do not distinguish between electrons and positrons for this analysis) is reconstructed, passes the electron selection, and matches the trigger object in detector location; this electron serves as the 'tag' for the event. We can then look for a second electron in the event that also passes the electron selection, and this electron serves as the 'probe'. We can examine the trigger data and determine if this electron also passed the electron trigger. Let n<sub>tag</sub> be the number of tagged events and n<sub>probe</sub> the number of events in which the second object, the 'probe', also passes the trigger. The trigger efficiency can be determined to be (see Exercise 13.1)

$$\epsilon\_e = \frac{2n\_{\text{probe}}}{n\_{\text{probe}} + n\_{\text{tag}}} \tag{13.1}$$

The beauty of the tag-and-probe method is that it provides a purely data-driven efficiency determination and so does not rely on any assumptions about the detector performance that are needed for a Monte Carlo simulation. This method can be extended from a 'global' measurement to a differential one in which the efficiency is measured as a function of variables like the transverse momentum p<sub>T</sub>.
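The behaviour of eqn 13.1 can be checked with a small toy simulation. The sketch below is only an illustration under stated assumptions: both electrons always pass the offline selection, each fires the trigger independently with the same efficiency, n<sub>tag</sub> is read as the number of events with at least one triggered electron, and n<sub>probe</sub> as the number in which both electrons fired. The function names and the chosen efficiency are hypothetical.

```python
import random


def tag_and_probe_efficiency(n_tag, n_probe):
    """Eqn 13.1, with n_tag = events with at least one triggered electron
    and n_probe = events in which the second electron also triggered
    (an assumed reading of the definitions in the text)."""
    return 2.0 * n_probe / (n_probe + n_tag)


def simulate(true_eff, n_events, seed=1):
    """Toy Z -> ee sample: each electron fires the trigger independently."""
    rng = random.Random(seed)
    n_tag = 0
    n_probe = 0
    for _ in range(n_events):
        fired = [rng.random() < true_eff for _ in range(2)]
        if any(fired):      # event can be tagged
            n_tag += 1
        if all(fired):      # the probe also passed the trigger
            n_probe += 1
    return n_tag, n_probe


n_tag, n_probe = simulate(true_eff=0.8, n_events=200_000)
estimate = tag_and_probe_efficiency(n_tag, n_probe)
print(f"n_tag = {n_tag}, n_probe = {n_probe}, estimated efficiency = {estimate:.4f}")
```

Under these assumptions the estimate should reproduce the input efficiency to within the statistical precision of the toy sample, without any reference to the simulated "truth" in the estimator itself.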

We now consider the critical issue of how it is possible to search for very rare processes, despite the large backgrounds. The processes with the largest cross sections result in particles with limited transverse momentum and can easily be rejected by any trigger requiring a high-p<sub>T</sub> object. For example, jet events can be studied by triggering on large transverse energy in a localized region of the calorimeter. In order to trigger on rarer processes such as W and Z production, one can trigger on leptons. For the very small cross sections for processes like Higgs production, triggers on multiple objects can be used to further reduce the trigger rate. In addition to triggers on localized objects, there is also a very important trigger on 'missing transverse momentum' to detect weakly interacting particles like neutrinos. Let the energy measured in a calorimeter cell (assuming massless energy deposits, so E = p) be E<sub>i</sub> and let the polar and azimuthal angles be θ<sub>i</sub> and φ<sub>i</sub>. The measured momentum balance in the plane perpendicular<sup>4</sup> to the beam axis (z) is defined by

$$\begin{aligned} E\_{\mathbf{T},x}^{\text{miss}} &= -\sum\_{i} E\_i \sin \theta\_i \cos \phi\_i \\ E\_{\mathbf{T},y}^{\text{miss}} &= -\sum\_{i} E\_i \sin \theta\_i \sin \phi\_i \end{aligned} \tag{13.2}$$

and the magnitude of the missing transverse momentum is given by

$$E\_{\rm T}^{\rm miss} = \sqrt{(E\_{\rm T,x}^{\rm miss})^2 + (E\_{\rm T,y}^{\rm miss})^2} \tag{13.3}$$

This E<sub>T</sub><sup>miss</sup> trigger is useful for selecting events with neutrinos but is also essential for searches for BSM physics such as supersymmetry that have weakly interacting particles (see Section 13.4.1).

<sup>3</sup>The same tag-and-probe technique can also be used to determine efficiencies for offline electron identification. The methodology is very powerful since it can be extended to any process that provides two independent objects to select.

<sup>4</sup>In the beam direction, so much momentum is carried by particles that are 'lost' down the beam pipe that we cannot make a useful measurement.

## **13.3 SM measurements at the LHC**

Many measurements of different SM physics processes have been performed at the LHC. This section will give a brief summary of a few of these studies. The largest cross sections for high-p<sub>T</sub> processes are for dijet production, since this is governed by the strong interaction. The data for the distribution in jet transverse momentum are compared with next-to-leading-order (NLO) QCD predictions [24] in Fig. 13.3.<sup>5</sup> There is very good agreement between data and the QCD prediction up to p<sub>T</sub> ∼ 1 TeV. This very impressive agreement spans a dynamic range of 10 orders of magnitude. As the jet cross section is a steeply falling function of the jet transverse momentum p<sub>T</sub>, a relatively small error in the p<sub>T</sub> measurement will result in a large error in the cross section. It is therefore essential to make an accurate calibration of the jet energies.

<sup>5</sup>The QCD calculations are performed using perturbation theory, utilizing the fact that at large transverse momentum (large Q<sup>2</sup>), the strong coupling constant α<sub>s</sub>(Q<sup>2</sup>) ≪ 1. The NLO calculation includes Feynman diagrams with one more power of α<sub>s</sub>(Q<sup>2</sup>) than the leading-order diagram.

**Fig. 13.3** Jet cross section versus transverse energy, measured by the ATLAS experiment at the LHC. The measurements are shown in slices of the rapidity variable

$$y = \frac{1}{2} \ln \frac{E + p\_z}{E - p\_z}$$

where E is the energy and p<sub>z</sub> is the momentum component along the beam direction for the jet. The data are compared with the predictions from NLO QCD (light-shaded bands at each data point) [24].

Jets are composed of hadrons as well as photons.<sup>6</sup> The energy determination of hadrons in a calorimeter is more difficult than that of electrons or photons (see Chapter 4). In addition, there are uncertainties in the reconstruction of the energy of a hadron jet, because any jet-finding algorithm has to determine which energy depositions should be considered as part of the jet. Therefore, there will be energy depositions that fail to be counted as part of the jet and some energy depositions from the 'underlying' event that are wrongly attributed to the jet.<sup>7</sup> At high luminosity, there is also the problem that some energy from the 'pile-up' events will also be wrongly associated with the jet.<sup>8</sup>

<sup>6</sup>The main source of photons is from the decays of particles like π<sup>0</sup> and η.

<sup>7</sup>'Underlying' event refers to the interactions of the spectator partons left behind after the hard parton–parton interaction.

<sup>8</sup>'Pile-up' refers to additional pp collisions that occur in the same bunch crossing as a triggered event.

The simplest jet finder is based on the 'cone' algorithm. The highest-p<sub>T</sub> cell in the calorimeter is used as a 'seed' and the nearest cell with transverse energy above some fixed threshold and within a radius ΔR less than a fixed size is found.<sup>9</sup> An updated value of the jet direction is computed as the energy-weighted average of the two cells. The procedure is then iterated until no more cells are found to merge. The next jet is found starting from the seed of the remaining cell with the highest transverse energy. The procedure terminates when there are no unused cells above threshold.

<sup>9</sup>ΔR = √(Δφ<sup>2</sup> + Δη<sup>2</sup>), where the pseudorapidity η = −ln tan(θ/2) and θ and φ are the polar and azimuthal angles of the cell.

While the algorithm is simple from an experimental point of view, it has major theoretical problems that make comparisons with QCD predictions difficult. Consider an event with two jets that are separated and should be reconstructed as two separate jets. If there is a 'soft' particle between the two partons, this can cause the two jets to merge. This makes the results very sensitive to low-energy radiation, and the algorithm is described as not being 'infrared-safe'.

Several infrared-safe algorithms have been proposed based on sequential recombination. We define a distance d<sub>ij</sub> between cells or jets i and j. We define a similar distance d<sub>iB</sub> from a cell or a jet to the beam. The clustering starts with the smallest distance d<sub>ij</sub> or d<sub>iB</sub>. If d<sub>ij</sub> < d<sub>iB</sub>, it combines the entities i and j; otherwise the entity i is called a jet and its cells are removed from the list. The algorithm is iterated until there are no more jets to be found.
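The sequential-recombination procedure can be sketched in a few lines of Python. This is a toy illustration, not the implementation used by the experiments: the distance measure is written with an exponent `p` (an assumed parametrization in which `p = -1` gives the inverse-p<sub>T</sub><sup>2</sup> weighting of the anti-k<sub>T</sub> algorithm in eqn 13.4), entities are hypothetical `(pt, y, phi)` tuples, and two entities are merged with a simple p<sub>T</sub>-weighted average of their directions rather than by adding four-momenta.

```python
import math


def delta_phi(a, b):
    """Azimuthal difference a - b wrapped into (-pi, pi]."""
    d = (a - b) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d


def cluster(particles, R=0.4, p=-1):
    """Toy sequential recombination on (pt, y, phi) tuples.

    d_ij = min(pt_i^(2p), pt_j^(2p)) * dR_ij^2 / R^2 and d_iB = pt_i^(2p).
    Returns the jets as (pt, y, phi) tuples, ordered by decreasing pt.
    """
    objs = [tuple(x) for x in particles]
    jets = []
    while objs:
        best = None  # (distance, i, j); j is None for a beam distance
        for i, (pti, yi, phii) in enumerate(objs):
            d_ib = pti ** (2 * p)
            if best is None or d_ib < best[0]:
                best = (d_ib, i, None)
            for j in range(i + 1, len(objs)):
                ptj, yj, phij = objs[j]
                dr2 = (yi - yj) ** 2 + delta_phi(phii, phij) ** 2
                d_ij = min(pti ** (2 * p), ptj ** (2 * p)) * dr2 / R ** 2
                if d_ij < best[0]:
                    best = (d_ij, i, j)
        _, i, j = best
        if j is None:
            # Smallest distance is to the beam: entity i becomes a jet.
            jets.append(objs.pop(i))
        else:
            # Smallest distance is between i and j: merge them.
            pti, yi, phii = objs[i]
            ptj, yj, phij = objs[j]
            pt = pti + ptj
            y = (pti * yi + ptj * yj) / pt
            phi = (phii + ptj * delta_phi(phij, phii) / pt) % (2.0 * math.pi)
            objs.pop(j)  # j > i, so remove j first
            objs.pop(i)
            objs.append((pt, y, phi))
    return sorted(jets, key=lambda jet: -jet[0])


# Hypothetical event: two hard deposits, each with a nearby soft one.
event = [(100.0, 0.0, 0.0), (5.0, 0.1, 0.1), (80.0, 0.0, 3.0), (3.0, 0.2, 2.9)]
jets = cluster(event, R=0.4, p=-1)
for pt, y, phi in jets:
    print(f"jet: pT = {pt:.1f}, y = {y:.3f}, phi = {phi:.3f}")
```

With the inverse-p<sub>T</sub><sup>2</sup> weighting, soft deposits are absorbed by the nearest hard one before anything else happens, which is why a soft particle between two hard jets does not cause them to merge.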

Different algorithms use different definitions of distance. The most commonly used algorithm is called the anti-k<sub>T</sub> jet finder [58]:

$$\begin{aligned} d\_{ij} &= \min \left( p\_{\mathrm{T},i}^{-2},\, p\_{\mathrm{T},j}^{-2} \right) (\Delta R\_{ij}/R)^2 \\ d\_{i\mathrm{B}} &= p\_{\mathrm{T},i}^{-2} \end{aligned} \tag{13.4}$$

where ΔR<sub>ij</sub> is the same separation in rapidity and azimuthal angle as used for the cone algorithm and p<sub>T,i</sub> is the transverse momentum of entity i. R is a fixed radius in (y, φ) space (similar to the fixed cone radius).<sup>10</sup>

<sup>10</sup>Typical values are in the range R = 0.4–0.7.

There are many contributions to the uncertainties in the measured energy of jets, so it is vital to use data-driven techniques to calibrate the jet energy scale. The following are some of these techniques:


**Fig. 13.4** Jet energy calibration in ATLAS using the γ–jet balance technique [37]. (a) Ratio of jet to photon transverse energies for data and Monte Carlo simulation. (b) Ratio of the values from data and Monte Carlo calculations ('PYTHIA'). Note the different scales for the two plots.

calibrate the highest-p<sub>T</sub> jet.<sup>11</sup>

<sup>11</sup>This is particularly useful because there are too few events at very high p<sub>T</sub> to use the absolute calibration techniques. As multijets are produced by purely strong interactions, the rates are higher than for γ–jet or Z–jet events.


Production of W and Z bosons with subsequent leptonic decays provides very clean channels for comparisons with QCD predictions. The first comparisons we consider are measurements of the cross sections for these processes. The methodology for the theoretical calculation in the parton model has been described in Chapter 9. For an ideal detector with 100% efficiency and no background, the measured cross section would simply be related to the number of events observed, N<sub>obs</sub>, and to the integrated luminosity L by

$$N\_{\rm obs} = L\sigma \tag{13.5}$$

In a real detector, we have to subtract the estimated number of background events, N<sub>bkgd</sub>. We need to correct for the finite detector efficiency ε. Finally, we also need to allow for the 'acceptance' A, which gives the fraction of produced events for which the particle(s) are inside the angular and momentum range that can be detected. Therefore, we modify eqn 13.5 to give

$$
\sigma = \frac{N\_{\rm obs} - N\_{\rm bkgd}}{A \epsilon L} \tag{13.6}
$$
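As a numerical illustration of eqn 13.6, the short sketch below evaluates the cross section for a set of made-up inputs. The numbers are purely illustrative and are not taken from any measurement; the function name and the choice of pb<sup>−1</sup> for the integrated luminosity are assumptions of this example.

```python
def cross_section(n_obs, n_bkgd, acceptance, efficiency, lumi):
    """Eqn 13.6: sigma = (N_obs - N_bkgd) / (A * epsilon * L).

    The result is in the inverse of the units used for the integrated
    luminosity (pb if lumi is given in pb^-1)."""
    return (n_obs - n_bkgd) / (acceptance * efficiency * lumi)


# Illustrative (invented) inputs: 10 500 candidates, 500 expected background
# events, 50% acceptance, 80% efficiency, 25 pb^-1 of integrated luminosity.
sigma_pb = cross_section(n_obs=10_500, n_bkgd=500,
                         acceptance=0.5, efficiency=0.8, lumi=25.0)
print(f"sigma = {sigma_pb:.1f} pb")
```

Because A and ε enter only through their product with L, a fractional uncertainty on any one of them propagates directly into the same fractional uncertainty on the cross section.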

For these SM measurements, the backgrounds are generally small, so we will postpone a discussion of how to determine them until we consider searches for new physics (see Section 13.5). In principle, the efficiency could be determined purely from a Monte Carlo calculation. However, we can greatly reduce the uncertainty by also using 'data-driven' measurements based on the tag-and-probe technique (see Section 13.2). The acceptance, on the other hand, involves events that are not detectable for a given detector, so we have to rely on Monte Carlo calculations. The Z bosons are identified by detecting two oppositely charged leptons. In the case of electrons and muons, the invariant mass of the pair can be reconstructed and a very clean Z mass peak [22] is observed, as shown in Fig. 13.5 for the case of Z → e<sup>+</sup>e<sup>−</sup>.

The W → eν<sub>e</sub> and W → μν<sub>μ</sub> decays can be separated from the background due to dijet events by identifying a well-measured high-p<sub>T</sub> electron or a muon and requiring a large value of the missing transverse momentum (see Section 13.2) to identify a neutrino. Nearly all hadrons will be stopped in the calorimeters, so muons can be identified by finding tracks in the muon chambers behind the calorimeters. A very clean peak in the missing-transverse-momentum distribution is observed above the background [63], as shown in Fig. 13.6.
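The missing-transverse-momentum requirement used in this selection follows eqns 13.2 and 13.3. A minimal sketch of that calculation is given below; the cell list is a hypothetical toy event (energies in GeV, angles in radians), and the function name is an assumption of this example rather than anything from the experiments' software.

```python
import math


def missing_et(cells):
    """Eqns 13.2 and 13.3 for an iterable of (E, theta, phi) calorimeter cells.

    Returns (E_T,x^miss, E_T,y^miss, E_T^miss)."""
    ex = -sum(e * math.sin(theta) * math.cos(phi) for e, theta, phi in cells)
    ey = -sum(e * math.sin(theta) * math.sin(phi) for e, theta, phi in cells)
    return ex, ey, math.hypot(ex, ey)


# Toy event: two visible deposits at 90 degrees to the beam that do not
# balance each other, as would happen if an undetected neutrino recoiled
# against them.
cells = [(40.0, math.pi / 2, 0.0), (10.0, math.pi / 2, math.pi / 2)]
ex, ey, met = missing_et(cells)
print(f"ETx_miss = {ex:.2f} GeV, ETy_miss = {ey:.2f} GeV, ET_miss = {met:.2f} GeV")

# A balanced back-to-back pair gives essentially zero missing momentum.
balanced = [(50.0, math.pi / 2, 0.0), (50.0, math.pi / 2, math.pi)]
print(f"balanced event: ET_miss = {missing_et(balanced)[2]:.2e} GeV")
```

The factor sin θ<sub>i</sub> projects each deposit onto the transverse plane, so deposits close to the beam direction contribute little even when their energy is large.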

**Fig. 13.5** e<sup>+</sup>e<sup>−</sup> invariant mass spectrum measured by the ATLAS experiment at the LHC [22].

**Fig. 13.6** (a) Distribution of missing transverse momentum E<sub>T</sub><sup>miss</sup> for W → μν measured by the CMS experiment at the LHC [63]. The data are shown with error bars and the backgrounds are shown for other SM sources of genuine muons and for misidentified muons from QCD jets. The dotted histogram gives the fitted signal. (b) Fractional deviation (data − fit)/error.

The cross section for W production at the LHC [22] is compared with lower-energy p̄p data and theoretical predictions in Fig. 13.7.

In order to compare theory with data, the calculations are shown for both pp and p̄p. The theoretical predictions are in good agreement with all the data. An interesting and more differential test of the SM at the LHC is given by measuring the charge asymmetry in W production:

$$A = \frac{\sigma(W^+ \to l^+ \nu) - \sigma(W^- \to l^- \nu)}{\sigma(W^+ \to l^+ \nu) + \sigma(W^- \to l^- \nu)}\tag{13.7}$$

**Fig. 13.7** Cross section times branching ratio for W production as a function of CMS energy for p̄p and pp [23].

At the LHC, W<sup>+</sup> (W<sup>−</sup>) will be produced mainly by a valence u (d) quark colliding with a d̄ (ū) quark from the sea (see Chapter 9). Therefore, if we assume that the sea distributions are the same for u and d quarks, we should expect

$$A \approx \frac{u(x) - d(x)}{u(x) + d(x)}\tag{13.8}$$

We can determine the p<sub>T</sub> of the ν using the missing-transverse-momentum measurement, but we cannot determine the longitudinal momentum of the neutrino. Therefore, as the 4-momentum of the W cannot be uniquely reconstructed, it is convenient to measure the asymmetry as a function of an angular variable. The chosen variable is the pseudorapidity η = −ln tan(θ/2) of the charged lepton, where θ is the polar angle. In pp interactions, the asymmetry is identical for positive and negative η, so the distribution is symmetric about η = 0. At η ≈ 0, the parton momentum fractions of the two protons will tend to be similar. So typical x values will be given by M<sub>W</sub>/√s ≈ 0.01 (see Exercise 13.2). For larger values of η, the parton momenta become more unequal and so the asymmetry becomes sensitive to the ratio u(x)/d(x) at larger values of x. From the knowledge of the parton distribution functions (see Chapter 9), we know that u(x)/d(x) increases with x at high x, and, for x > 0.01, u(x)/d(x) > 1. However, at very large values of η, the effect of the angular distribution of the decay of the W (see Chapter 8) starts to dominate and this generates an asymmetry with the opposite sign. We therefore expect that the asymmetry should be positive and increasing with η over a limited range and then decrease with η in the very forward region. These general features agree with the data from ATLAS, CMS, and LHCb [21], and an NLO QCD prediction (see Chapter 9) is in good agreement with the data, as shown in Fig. 13.8.<sup>12</sup>
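The way the parton momentum fractions become unequal away from central rapidity can be made concrete with a short calculation. The sketch below assumes the leading-order relation x<sub>1,2</sub> = (M<sub>W</sub>/√s) e<sup>±y</sup>, where y is the rapidity of the W itself (not the lepton pseudorapidity used in the measurement), with √s = 7 TeV and M<sub>W</sub> ≈ 80.4 GeV; these inputs and the function name are assumptions of this example.

```python
import math

M_W = 80.4        # GeV, approximate W mass (assumed input)
SQRT_S = 7000.0   # GeV, pp CMS energy assumed for this illustration


def parton_x(y_w, sqrt_s=SQRT_S, mass=M_W):
    """Leading-order momentum fractions of the two partons producing a
    boson of the given mass at rapidity y_w: x_{1,2} = (M/sqrt(s)) e^{+-y}."""
    tau_root = mass / sqrt_s
    return tau_root * math.exp(y_w), tau_root * math.exp(-y_w)


for y in (0.0, 1.0, 2.0, 3.0):
    x1, x2 = parton_x(y)
    print(f"y_W = {y:.1f}: x1 = {x1:.4f}, x2 = {x2:.5f}")
```

At y = 0 both fractions equal M<sub>W</sub>/√s, of order 0.01, while at forward rapidity one parton is pushed into the valence-dominated region and the other to very small x, which is the regime in which the asymmetry probes u(x)/d(x).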

<sup>12</sup>These results together with other SM measurements at the LHC can be used to significantly reduce the uncertainty in the parton distribution functions at high values of Q<sup>2</sup>.

**Fig. 13.8** W charge asymmetry as a function of lepton pseudorapidity as measured by ATLAS, CMS, and LHCb at LHC [21].

## **13.3.1 Top-quark production**

Another important test of the SM at the LHC is top-quark production. The cross section for tt̄ production at a CMS energy of 7 TeV (LHC in 2011) is predicted to be 177.3<sup>+10.1</sup><sub>−10.8</sub> pb.<sup>13</sup> The leptonic decays of the t or t̄ will result in events with multijets, lepton(s), and missing transverse momentum from the neutrino(s). These events therefore represent potentially very large backgrounds for new-physics searches based on missing-transverse-momentum signatures. Many measurements of tt̄ processes have been made. The cleanest channels are those with both top quarks decaying semileptonically, t → blν. The largest background process is Z + jets. The Z + jets background can be suppressed by removing events with same-flavour, opposite-sign lepton pairs with an invariant mass consistent with the mass of the Z. The signal events have two neutrinos, which will result in missing transverse momentum (E<sub>T</sub><sup>miss</sup>, see Section 13.2). The Z + jets background will only have non-zero E<sub>T</sub><sup>miss</sup> because of the finite detector resolution. Therefore, this background can be further suppressed by requiring a large value for E<sub>T</sub><sup>miss</sup>. The signal events will always contain two b quarks, so the purity can be enhanced using 'b-tagging' based on the relatively long lifetime of b hadrons. A likelihood<sup>14</sup> for a given jet to contain a b hadron is constructed based on several variables sensitive to lifetime, including the following:


A cut on the magnitude of the likelihood is made so as to obtain a b-tagging efficiency of ∼80%. This technique obviously requires very precise tracking detectors and in particular a very high-precision silicon pixel detector as close to the beam line as possible. The 'irreducible' background (see Chapter 12 for a discussion of reducible and irreducible backgrounds) is estimated in such a way as to minimize the reliance on Monte Carlo simulations. The general strategy is based on the use of kinematic selections to define 'control' regions (in which the events arise mainly from one background source) and 'signal' regions. This methodology is explained in Section 13.5.3, where the backgrounds are much more significant. For this measurement, we define a control region in the measured data in which the mass of two same-flavour opposite-sign (SFOS) leptons is compatible with the Z. The Monte Carlo simulation is only used to extrapolate this number to the signal region in which this mass is incompatible with the Z. The distribution of number of jets in this signal region before and after b-tagging [25] is shown in Fig. 13.9. A clean signal is observed above the background even before b-tagging, but the power of b-tagging to greatly enhance the purity of the signal is clearly demonstrated. The measurements of the tt̄ cross section at CMS energies of 7 and 8 TeV from ATLAS [33] are shown in Fig. 13.10.

<sup>13</sup>The top cross section for pp at a CMS energy of 7 TeV (LHC in 2011) is a factor of ∼25 greater than that for p̄p at 2 TeV (Tevatron). At the Tevatron, at the typical value of x ∼ 0.2, the cross section is dominated by qq̄ processes. At the LHC, the smaller x values mean that the cross section is dominated by gg processes.

<sup>14</sup>If the probability of a given measurement i under a given hypothesis is p<sub>i</sub>, the likelihood is constructed from a set of N measurements as L = ∏<sub>i=1</sub><sup>N</sup> p<sub>i</sub>.

**Fig. 13.9** Jet multiplicity distributions for dilepton events in ATLAS for (a) all jets and (b) b-tagged jets [25].

**Fig. 13.10** Measurements of the tt̄ cross section at 7 and 8 TeV using eμ b-tag events together with results at 7 TeV using the ee, μμ, and eμ channels measured by ATLAS at the LHC [33].

The good agreement between the data and the SM for these and many other results at the LHC gives us confidence in the reliability of the SM in this energy regime. We can therefore use the SM to reliably predict backgrounds for a wide range of possible new physics processes.

One other very important measurement is the mass of the top quark, because this affects the mass of the W and Higgs bosons via radiative corrections (see Chapter 7). This was used in the past to pin down the mass of the SM Higgs boson, but now that we have observed a Higgs boson, we can combine the three precision mass measurements in a powerful consistency check of the SM. Any significant inconsistency would provide evidence of new physics. The current world average based on the Tevatron measurements by CDF and D0 and the ATLAS and CMS measurements at the LHC gives [19] a value of m<sub>t</sub> = 173.34 ± 0.76 GeV. This result is consistent with the SM, but further improvements in the precision will be possible with more data at the LHC.

## **13.4 Beyond the Standard Model physics**

There are several reasons to expect BSM physics to emerge at the TeV scale being explored at the LHC. The first general argument is simply that there are too many free parameters in the SM and a more unified theory should contain fewer. The Higgs mechanism is consistent with current data, but the theory is rather contrived and it is hoped that BSM physics will be able to explain the origin of the spontaneous symmetry breaking required in the Higgs mechanism.

An important argument that points to new physics being manifest at the TeV scale is known as the hierarchy problem in the SM. The mass of the Higgs boson has radiative corrections from the Feynman diagram with a fermion loop (see Fig. 13.11). In this Feynman diagram, we have to consider the propagators for the fermions in the loop. In Chapter 7, we only evaluated Feynman diagrams with boson propagators (e.g. the W<sup>±</sup> or Z). The Feynman rules for a spin- <sup>1</sup> <sup>2</sup> propagator (see Langacker in Further Reading) of momentum q and mass m give a factor<sup>15</sup> <sup>15</sup>See Chapter 6 for the definition of the

$$f\left(\frac{1}{2}\right) = \frac{\mathrm{i}}{\gamma^{\mu}q\_{\mu} - m} \tag{13.9}$$

Momentum has to be conserved at every vertex in a Feynman diagram, but this allows for an infinite range in the momentum of the virtual fermion. To evaluate this Feynman diagram, we therefore have to integrate over the momentum q of the virtual fermion in the loop. After averaging over the spin states by taking the trace of the fermion propagators [80] of the fermion, we can show that the contribution to the squared mass of the Higgs boson is given by

$$(\Delta m\_H)^2 = -C g\_\text{f}^2 \int\_0^\infty \mathbf{d}^4 q \,\mathrm{Tr} \left( \frac{\mathbf{i}}{\gamma^\mu q\_\mu - m} \frac{\mathbf{i}}{\gamma^\mu q\_\mu - m} \right) \tag{13.10}$$

**Fig. 13.11** Feynman diagram for the radiative correction to the mass of a Higgs boson from a fermion loop.

γμ matrices.

<sup>16</sup>We might be worried that we would have a similar quadratic divergence for fermion loops between gauge bosons in the SM. However, the SM contains an approximate chiral symmetry that prevents this happening (see Aitchison in Further Reading).

<sup>17</sup>The Planck scale, ∼10<sup>19</sup> GeV, is the energy scale at which quantum-gravitational effects become important. The GUT scale, ∼10<sup>15</sup> GeV, is the scale in a grand unified theory at which the strengths of the strong and electroweak interactions become equal.

**Fig. 13.12** Feynman diagram for the radiative corrections to the mass of the Higgs boson from a scalar boson loop diagram.

where g<sub>f</sub> is the coupling constant for the fermion–Higgs boson interaction and C contains other numerical constants. Note that eqn 13.10 gives the contribution from one type of fermion, so we need to sum over all fermions. However, the dominant contribution comes from the top quark: it is by far the heaviest fermion and therefore has the largest coupling. The negative sign arises because there is a fermion loop and this is related to the opposite intrinsic parities of fermions and antifermions. We can write d<sup>4</sup>q = dq<sup>0</sup> d<sup>3</sup>**q** = dq<sup>0</sup> |**q**|<sup>2</sup> d|**q**| dΩ. Therefore, simply by counting the powers of q, we can see that this integral is 'quadratically divergent',<sup>16</sup> i.e. if we introduce a cutoff Λ, the integral will have a term proportional to Λ<sup>2</sup>. The leading term gives

$$(\Delta m\_H)^2 = -\frac{g\_\text{f}^2}{8\pi^2}\Lambda^2\tag{13.11}$$

If there is no new physics up to some scale Λ, then the mass squared of the Higgs boson will also have radiative corrections of the order of Λ<sup>2</sup>. If Λ is given by the Planck scale or even by a GUT scale,<sup>17</sup> then the natural mass for the Higgs boson would be of the order of 10<sup>19</sup> GeV or 10<sup>15</sup> GeV, respectively. We know that the physical mass of the Higgs boson is of the order of 100 GeV, so in the SM this requires counterterms that have to be fine-tuned to 1 part in 10<sup>15</sup> to ensure the nearly perfect cancellation of the bare mass with the radiative correction. This is called the **hierarchy problem**.
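The power counting behind eqn 13.11 can be made explicit. What follows is a schematic sketch rather than a full calculation: signs, factors of i, and the angular factors are assumed to be absorbed into the constant C, and the cut-off Λ is imposed on the magnitude |q| of the loop momentum. Rationalizing each propagator and taking the trace gives

$$\frac{\mathrm{i}}{\gamma^\mu q\_\mu - m} = \frac{\mathrm{i}\,(\gamma^\mu q\_\mu + m)}{q^2 - m^2}, \qquad \mathrm{Tr} \left( \frac{\mathrm{i}}{\gamma^\mu q\_\mu - m} \frac{\mathrm{i}}{\gamma^\mu q\_\mu - m} \right) = -\frac{4\,(q^2 + m^2)}{(q^2 - m^2)^2} \;\to\; -\frac{4}{q^2} \quad (q^2 \gg m^2)$$

so the integrand falls only as 1/q<sup>2</sup>, while the integration measure grows as |q|<sup>3</sup> d|q|:

$$\int^{\Lambda} \mathrm{d}^4 q \, \frac{1}{q^2} \;\sim\; \int\_0^{\Lambda} \frac{|q|^3 \, \mathrm{d}|q|}{|q|^2} = \frac{\Lambda^2}{2}$$

Had the integrand fallen as 1/q<sup>4</sup>, the same counting would give only a logarithmic dependence on Λ.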

## **13.4.1 Supersymmetry**

The most popular solution to the hierarchy problem is to invoke supersymmetry (SUSY). This is a symmetry that transforms bosons into fermions and vice versa. This implies that every SM particle must have a superpartner with spin differing by 1/2. If SUSY is correct, then the Higgs boson mass squared will receive radiative corrections from Feynman diagrams with a scalar loop as shown in Fig. 13.12. The contribution to the square of the μ parameter (see Chapter 12) is given by (see Aitchison in Further Reading)

$$
\Delta \mu^2 = C\lambda \int\_0^\infty \frac{\mathrm{d}^4 k}{k^2 - m\_H^2} \tag{13.12}
$$

where C is a numerical constant. As for the calculation of the fermion loop above, we can write d<sup>4</sup>k = dk<sup>0</sup> d<sup>3</sup>**k** = dk<sup>0</sup> |**k**|<sup>2</sup> d|**k**| dΩ, which contains the fourth power of k, and hence this integral is also 'quadratically divergent'. This means that if we introduce a cut-off parameter Λ to the integral, we find (C′ is another numerical constant)

$$
\Delta \mu^2 = C' \lambda \Lambda^2 \tag{13.13}
$$

The radiative correction to the μ<sup>2</sup> parameter then introduces a correction to the mass squared of the Higgs boson of

$$\left(\Delta m\_H\right)^2 = \frac{g\_\text{s}}{16\pi^2}\Lambda^2\tag{13.14}$$

where g<sub>s</sub> is the coupling constant for the scalar interaction.

Comparing eqns 13.11 and 13.14, we see that we will have perfect cancellation of the unwanted quadratic divergence if g<sub>f</sub><sup>2</sup> = g<sub>s</sub> and we have two scalar partners for every SM fermion. This happens naturally in SUSY, which is an extension of the symmetries of the SM relating bosons and fermions. In SUSY, the two scalar particles for each SM fermion arise from the left- and right-handed fermions. The equality of the coupling constants is guaranteed by the symmetry.
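The cancellation can be checked directly by adding the fermion-loop term of eqn 13.11 to two copies of the scalar-loop term of eqn 13.14, one for each scalar partner. This short worked sum uses only the leading Λ<sup>2</sup> terms quoted above:

$$(\Delta m\_H)^2\_\text{total} = -\frac{g\_\text{f}^2}{8\pi^2}\Lambda^2 + 2 \times \frac{g\_\text{s}}{16\pi^2}\Lambda^2 = \frac{g\_\text{s} - g\_\text{f}^2}{8\pi^2}\,\Lambda^2$$

which vanishes for any value of Λ when g<sub>s</sub> = g<sub>f</sub><sup>2</sup>.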

Unbroken SUSY therefore solves the hierarchy problem and introduces no new parameters into the theory. However, SUSY is manifestly broken, because we have not yet discovered any of the SUSY partners of the SM particles, so their masses must be greater.<sup>18</sup> However, if the SUSY partners are very much heavier than their SM partners, then some fine tuning would be required to prevent the Higgs boson mass becoming too large. Therefore, 'naturalness' arguments suggest that the masses of the SUSY partners should be of the same order of magnitude as the Higgs vacuum expectation value (246 GeV).

There are many different options for how SUSY is broken.<sup>19</sup> The masses of the superpartners are not protected by the gauge symmetries that force the masses of fermions and gauge bosons to be zero before spontaneous symmetry breaking is considered. In this sense, it is not surprising that the masses of SUSY particles are larger than those of their SM partners.

The particle content of the SUSY extension to the first generation of the SM is given in Table 13.1. SUSY particles are labelled with a 'tilde' (˜) above their name; for example, the SUSY partner of the electron is the selectron ẽ. SUSY partners of quarks and leptons are called squarks and sleptons, respectively. SUSY partners of neutrinos, gauge bosons, and Higgs bosons are called sneutrinos, gauginos, and Higgsinos.

After SUSY breaking, there will in general be mixing of all states with the same quantum numbers. For example, the 'left' and 'right' t̃ squark states will mix to form the physical states called t̃<sub>1</sub> and t̃<sub>2</sub>. The resulting mass splitting can be very large, so it is possible that the lightest squark belongs to the third generation. The Higgs sector in SUSY is extended compared with the SM because two complex Higgs doublets are required to give masses to the up-type and down-type quarks. Each complex doublet consists of four fields. After spontaneous symmetry breaking, three of the eight fields are 'swallowed up' to give mass to the W<sup>±</sup> and Z<sup>0</sup> bosons. This leaves five physical spin-0 fields: two charged fields (H<sup>±</sup>) and three neutral fields (a pseudoscalar A and two neutral scalars h and H). There are also SUSY partners for these states. There is then mixing between the SUSY partners of the neutral gauge bosons γ̃, Z̃<sup>0</sup> and the neutral Higgsino states h̃, H̃ to give the physical states χ̃<sub>1</sub><sup>0</sup>, χ̃<sub>2</sub><sup>0</sup>, χ̃<sub>3</sub><sup>0</sup>, χ̃<sub>4</sub><sup>0</sup>, where the states are labelled in order of increasing mass. Similarly, the W̃<sup>±</sup> states mix with the H̃<sup>±</sup> states to give the charginos χ̃<sub>1</sub><sup>±</sup>, χ̃<sub>2</sub><sup>±</sup>.

**Table 13.1** Example of SUSY particles for the first generation of quarks and leptons and for the gauge bosons.

<sup>18</sup>There is clearly no scalar charged particle with the same mass as the electron (511 keV).

<sup>19</sup>SUSY breaking must be done in such a way as to preserve the gauge symmetries in the SM, so that the SM particles only acquire mass through spontaneous symmetry breaking.

<sup>20</sup>The renormalization group equations are derived from the underlying group theory and allow the determination of the change of coupling 'constants' with scale. See the discussion in Chapter 9 for a simple explanation in the context of QED and QCD.

Although there is currently no direct evidence for SUSY, there are some suggestive hints that it might be a valid low-energy theory (i.e. at the electroweak scale ∼100 GeV):

• SUSY helps with the unification of the electroweak and strong coupling constants. If we assume that there should be some grand unification of the electroweak and strong couplings at some high energy, then we can use the 'low-energy' data and the renormalization group equations<sup>20</sup> to see if this happens. In the context of the SM, the coupling constants do not quite coincide at any scale. However, if one assumes SUSY with a mass scale of the order of a TeV, then the coupling constants do coincide, as shown in Fig. 13.13.
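The unification argument can be illustrated with a rough one-loop extrapolation of the three inverse couplings. This is a minimal sketch under stated assumptions, not the calculation behind Fig. 13.13: the input values at M<sub>Z</sub> are approximate, the hypercharge coupling uses the conventional GUT normalization, only the standard one-loop coefficients are used, and for simplicity the SUSY particles are taken to contribute from M<sub>Z</sub> upwards rather than from a TeV. The function names are illustrative.

```python
import math

M_Z = 91.19  # GeV

# Approximate inverse couplings 1/alpha_i at M_Z (alpha_1 in GUT normalization).
# These are rough input values chosen for illustration.
ALPHA_INV_MZ = (59.0, 29.6, 8.5)

# Standard one-loop coefficients (b1, b2, b3).
B_SM = (41 / 10, -19 / 6, -7.0)
B_MSSM = (33 / 5, 1.0, -3.0)


def alpha_inverse(mu, b, alpha_inv_mz=ALPHA_INV_MZ, mu0=M_Z):
    """One-loop running: 1/alpha_i(mu) = 1/alpha_i(mu0) - b_i/(2 pi) * ln(mu/mu0)."""
    t = math.log(mu / mu0)
    return tuple(a - bi / (2 * math.pi) * t for a, bi in zip(alpha_inv_mz, b))


def closest_approach(b):
    """Scan mu from 10^10 to 10^19 GeV and return (smallest spread, mu).

    The spread is the difference between the largest and smallest of the
    three inverse couplings; it is zero only if all three meet at one scale.
    """
    best = None
    for k in range(181):
        mu = 10 ** (10 + 0.05 * k)
        values = alpha_inverse(mu, b)
        spread = max(values) - min(values)
        if best is None or spread < best[0]:
            best = (spread, mu)
    return best


for label, b in (("SM", B_SM), ("MSSM", B_MSSM)):
    spread, mu = closest_approach(b)
    print(f"{label:5s} closest approach: spread = {spread:.2f} at mu ~ {mu:.1e} GeV")
```

With these inputs the three SM lines never come closer than a few units in 1/α, whereas the MSSM coefficients bring them together to within a fraction of a unit at a scale of order 10<sup>16</sup> GeV.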

constants in the SM and in SUSY [70].


$$m\_H^2 = M\_Z^2 \cos^2 2\beta \tag{13.15}$$

where tan β is the ratio of the vacuum expectation values of the two Higgs doublets that couple to up-type and down-type quarks. Therefore, at tree level, the Higgs boson is constrained to be lighter than the Z. However, there are radiative corrections to the Higgs boson mass and an upper limit is given by [83]

$$
\Delta m\_H^2 = \frac{3g^2 m\_t^4}{8\pi^2 M\_W^2} \left[ \ln \left( \frac{M\_s^2}{m\_t^2} \right) + \frac{X\_t^2}{M\_s^2} \left( 1 - \frac{X\_t^2}{12M\_s^2} \right) \right] \tag{13.16}
$$

where g is the weak coupling constant, M<sub>s</sub> is the geometric mean of the stop quark masses (M<sub>s</sub><sup>2</sup> = m<sub>t̃<sub>1</sub></sub> m<sub>t̃<sub>2</sub></sub>), and X<sub>t</sub> is a stop mixing parameter. The MSSM still predicts a relatively low value m<sub>H</sub> < 140 GeV, which is consistent with the results from Higgs boson searches at the LHC, which give m<sub>H</sub> ∼ 125 GeV.
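To get a feel for how eqn 13.16 depends on the stop sector, the expression can be evaluated numerically. The sketch below is illustrative only: the numerical inputs (g ≈ 0.65, m<sub>t</sub> ≈ 173 GeV, M<sub>W</sub> ≈ 80.4 GeV) and the chosen values of M<sub>s</sub> and X<sub>t</sub> are assumptions, and the result is simply eqn 13.16 added to the tree-level value of eqn 13.15, so it should not be expected to reproduce the quoted bound precisely.

```python
import math

# Approximate inputs, assumed for illustration (GeV where applicable).
M_Z = 91.19
M_W = 80.4
M_TOP = 173.0
G_WEAK = 0.65  # SU(2) weak coupling constant g


def delta_mh2(m_s, x_t, m_t=M_TOP, m_w=M_W, g=G_WEAK):
    """Radiative correction of eqn 13.16, in GeV^2."""
    prefactor = 3 * g**2 * m_t**4 / (8 * math.pi**2 * m_w**2)
    ratio = x_t**2 / m_s**2
    return prefactor * (math.log(m_s**2 / m_t**2) + ratio * (1 - ratio / 12))


def higgs_mass(m_s, x_t, cos2_2beta=1.0):
    """Tree-level value (eqn 13.15) plus the correction of eqn 13.16, in GeV."""
    return math.sqrt(M_Z**2 * cos2_2beta + delta_mh2(m_s, x_t))


for m_s in (500.0, 1000.0, 2000.0):
    no_mixing = higgs_mass(m_s, 0.0)
    # The bracket in eqn 13.16 is largest when X_t^2 = 6 M_s^2.
    max_mixing = higgs_mass(m_s, math.sqrt(6) * m_s)
    print(f"M_s = {m_s:6.0f} GeV: m_H = {no_mixing:6.1f} GeV (X_t = 0), "
          f"{max_mixing:6.1f} GeV (X_t^2 = 6 M_s^2)")
```

The logarithm shows why heavy stops raise the Higgs boson mass only slowly, while the mixing term is maximized at X<sub>t</sub><sup>2</sup> = 6M<sub>s</sub><sup>2</sup>.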

## **13.4.2** *R***-parity**

One potentially fatal problem with SUSY is that in general it will allow quarks to convert to leptons and thus allow Feynman diagrams that would mediate very rapid proton decay as shown in Fig. 13.14.

The simplest solution to this problem is to invoke a new multiplicative parity, called R-parity, given by R<sub>p</sub> = (−1)<sup>3(B−L)+2s</sup>, where B is the baryon number, L the lepton number, and s the spin. All the SM particles have R<sub>p</sub> = 1 and all the SUSY partners have R<sub>p</sub> = −1. This has profound consequences:

• The lightest SUSY particle (LSP) is absolutely stable, since any decay to SM particles would violate R-parity. This has the attractive feature that the LSP is a natural candidate for dark matter (see Section 13.7).
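The R-parity assignment is easy to verify state by state. The following sketch evaluates R<sub>p</sub> = (−1)<sup>3(B−L)+2s</sup> for a few representative particles; the function name and the small particle table are illustrative choices, with exact fractions used so that the exponent stays an integer.

```python
from fractions import Fraction


def r_parity(baryon_number, lepton_number, spin):
    """Return R_p = (-1)^(3(B - L) + 2s) as +1 or -1."""
    exponent = (3 * (Fraction(baryon_number) - Fraction(lepton_number))
                + 2 * Fraction(spin))
    if exponent.denominator != 1:
        raise ValueError("3(B - L) + 2s must be an integer")
    return 1 if exponent % 2 == 0 else -1


# (B, L, s) for a few illustrative states.
PARTICLES = {
    "up quark": (Fraction(1, 3), 0, Fraction(1, 2)),
    "electron": (0, 1, Fraction(1, 2)),
    "photon": (0, 0, 1),
    "Higgs boson": (0, 0, 0),
    "up squark": (Fraction(1, 3), 0, 0),
    "selectron": (0, 1, 0),
    "gluino": (0, 0, Fraction(1, 2)),
    "neutralino": (0, 0, Fraction(1, 2)),
}

for name, (b, l, s) in PARTICLES.items():
    print(f"{name:12s} R_p = {r_parity(b, l, s):+d}")
```

Because a superpartner has the same B and L as its SM partner but a spin differing by 1/2, the exponent changes by one unit and the sign of R<sub>p</sub> flips.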

using only the lowest-order Feynman diagrams.

**Fig. 13.14** Feynman diagram mediating proton decay in SUSY, p → e<sup>+</sup>π<sup>0</sup>.


Even in the context of R-parity-conserving SUSY models, there are far too many parameters to allow a model-independent survey of the full parameter space. One way to get round this problem is to assume a simplified model in which some of the key parameters of the MSSM are set by hand.

## **13.4.3 Other BSM theories**

Another solution to the hierarchy problem is to replace the fundamental Higgs scalar with a composite object. Theories based on this idea are called 'technicolour', but are not discussed here since it is difficult to construct such a theory that is compatible with all the precision electroweak data. Technicolour theories are also disfavoured by the LHC observation of a boson that is compatible with the properties of the SM Higgs boson.

Another interesting option is to assume that there are 'large' extra dimensions. In these models, the SM particles are restricted to a four-dimensional 'brane' but gravity can propagate in the higher-dimensional 'bulk'. Gravity then appears weak in our four-dimensional world because the gravitons can propagate into the bulk. At distances shorter than the characteristic length of the theory, there would be deviations from the inverse square law for gravity. However, experiments to detect such an effect have not found any deviation from the inverse square law. In these models, the fundamental Planck scale can be as low as ∼TeV. This means that the cut-off in the divergent integral for the Higgs boson mass is ∼TeV, thereby avoiding the hierarchy problem.

There are many versions of these models, but one generic feature is that at TeV scales particles can leak from the brane into the bulk and thus cause an apparent violation of momentum conservation in four dimensions. Therefore, from an experimental point of view, we would expect events with large missing transverse momentum. More speculatively, some of these models predict the production of micro black holes at the LHC with significant cross sections. The non-observation of micro black holes at the LHC therefore provides severe constraints on these theories.

## **13.5 Experimental searches for BSM physics at the LHC**

There is a bewilderingly large number of possible BSM theories that might be experimentally accessible. It is therefore essential that some searches are geared to finding unexpected signals while others are optimized for some particular model, such as SUSY. This section merely aims to give the flavour of these searches.

## **13.5.1 Jet production**

The large QCD jet production cross sections at the LHC provide access to significant numbers of events at very high mass. These can be used to search for resonances in the mass spectra, as would be expected from models such as excited quarks, as well as for new contact interactions. A particularly sensitive way to search for new physics is to use the angular distribution in the dijet CMS, because many experimental systematic effects (such as the jet energy scale) are thereby greatly reduced. The angular distribution is studied as a function of the variable

$$\chi = \frac{1 + \cos \theta^\*}{1 - \cos \theta^\*} \tag{13.17}$$

where θ<sup>∗</sup> is the polar angle of one of the jets in the dijet CMS. The advantage of the χ variable is that the forward-peaked angular distribution in θ<sup>∗</sup> is transformed into a more uniform distribution as a function of χ. New physics such as a contact interaction will tend to produce a more isotropic distribution in cos θ<sup>∗</sup> and therefore a peak at small values of χ. The distribution of χ for a range of dijet invariant masses [32] is shown in Fig. 13.15. No significant deviation from the SM is seen and limits on new physics can be placed.<sup>22</sup>

<sup>22</sup>In this case, limits were placed on models with extra dimensions, but the data can be used to place limits on other BSM physics.

**Fig. 13.15** Distribution of χ for slices of dijet invariant mass, measured by ATLAS and compared with QCD predictions (shaded bands) and a BSM theory with extra dimensions (dashed line) [32].

## **13.5.2 Lepton pair resonances**

Many BSM theories predict extra U(1) or SU(2) symmetries, which would result in heavier versions of the W and Z bosons, called W′ and Z′. The signature for a W′ would be a Jacobian peak in the transverse mass (see Chapter 8, eqn 8.22) above M<sub>W</sub>. The signature for a Z′ would be a narrow peak in the l<sup>+</sup>l<sup>−</sup> invariant mass distribution above M<sub>Z</sub>. These would give clear signals above the SM backgrounds, but so far only upper limits have been reported; see for example Fig. 13.16 [27].

**Fig. 13.16** Distribution of dilepton invariant mass, measured by ATLAS and compared with SM expectations and models with different Z′ masses [27].

## **13.5.3 SUSY searches**

If we assume R-parity-conserving SUSY, then SUSY particles will be produced in pairs, such as q̃q̃̄. The cross sections can be calculated at the parton level, since all the couplings are the same as in the SM. Using the measured parton distribution functions (see Chapter 9), we can then calculate the cross section for SUSY pair production in pp collisions. The results are shown in Fig. 13.17 as a function of the average mass of the pair of sparticles [119]. There is a relatively large cross section for strongly interacting squarks and gluinos, but the cross sections for particles produced via electroweak interactions are much smaller. The cross section for a given type of particle decreases rapidly with the mass of the particle.

**Fig. 13.17** Predicted cross section for the production of pairs of SUSY particles for pp collisions at a CMS energy √s = 8 TeV. From [119].

While the production cross sections are well defined, the decay modes for a given sparticle depend on which modes are kinematically allowed. They also depend on the model parameters that are used for SUSY breaking. However, if we assume R-parity-conserving SUSY, then the end result of the decays of two sparticles will be two LSPs. The LSPs will not be detected directly in a detector. Therefore, a large value of E<sub>T</sub><sup>miss</sup> will be used as a signature for SUSY searches. In order to establish evidence for a possible SUSY signal, all possible sources of background must be understood. After removing instrumental backgrounds such as 'beam halo' interactions<sup>23</sup> and fake E<sub>T</sub><sup>miss</sup> generated by the finite detector resolution, the main backgrounds are SM processes that have genuine E<sub>T</sub><sup>miss</sup> from final-state neutrinos. Some of the main SM backgrounds are

The first reaction is an example of an irreducible background in that even with a perfect detector one could not separate it from signal on an event-by-event basis. Therefore, it is essential to have a very reliable prediction of the rate for this process. This can be done in a largely data-driven way, thus minimizing theoretical uncertainties: the rate for Z + jets with Z → l<sup>+</sup>l<sup>−</sup> can be measured directly, and if the l<sup>+</sup>l<sup>−</sup> are removed, this provides a simulation of Z → νν̄. The ratio of branching ratios of Z to leptons and neutrinos is very well known from LEP. The effect of finite efficiencies for triggering and identifying charged leptons can be measured in a data-driven way using Z → l<sup>+</sup>l<sup>−</sup> events. A Monte Carlo calculation is only required to allow for the finite detector acceptance of the charged leptons (the acceptance for neutrinos is 100%). The other reactions generate 'reducible' backgrounds that could be completely rejected with an ideal detector because they result in charged leptons, whereas some of the SUSY signal will not contain any charged leptons. Data-driven techniques are required to estimate the effects of the finite detector efficiency.

<sup>23</sup>'Beam halo' refers to beam particles that have escaped from the beam pipe.
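The data-driven estimate of the Z → νν̄ background described above amounts to a short piece of arithmetic. The sketch below is a simplified illustration: the branching ratios are approximate world-average values (about 20.0% for Z → νν̄ summed over flavours and about 3.37% for one charged-lepton flavour), while the event count, efficiency, and acceptance are hypothetical numbers invented for the example.

```python
# Approximate Z branching ratios (assumed inputs).
BR_Z_NUNU = 0.200    # Z -> nu nubar, summed over the three neutrino flavours
BR_Z_LL = 0.03366    # Z -> l+ l-, for a single charged-lepton flavour


def predict_z_nunu(n_zll_observed, lepton_efficiency, lepton_acceptance):
    """Estimate the number of Z(-> nu nubar) + jets events.

    n_zll_observed:    Z(-> l+ l-) + jets events selected in data (one flavour)
    lepton_efficiency: trigger and identification efficiency, measured in data
    lepton_acceptance: geometric/kinematic acceptance, taken from Monte Carlo
    """
    # Correct the observed yield back to the number of produced Z -> l+ l- events.
    n_zll_produced = n_zll_observed / (lepton_efficiency * lepton_acceptance)
    # Scale by the ratio of branching ratios; the neutrino acceptance is 100%.
    return n_zll_produced * BR_Z_NUNU / BR_Z_LL


# Hypothetical example: 150 dimuon events, 85% efficiency, 60% acceptance.
estimate = predict_z_nunu(150, lepton_efficiency=0.85, lepton_acceptance=0.60)
print(f"Estimated Z -> nu nubar background: {estimate:.0f} events")
```

Only the acceptance factor relies on simulation; the efficiency and the observed yield come from data and the branching ratios from precision measurements.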

The kinematics of SUSY signals and SM backgrounds are very different and this is exploited in SUSY searches. SUSY events will typically contain more high-p<sub>T</sub> jets than the SM backgrounds, and SUSY events will tend to be at larger values of E<sub>T</sub><sup>miss</sup> than those from the SM. At low values of E<sub>T</sub><sup>miss</sup>, the distribution will be dominated by the SM backgrounds but any significant excess at high values of E<sub>T</sub><sup>miss</sup> would be evidence for SUSY.

A general approach to estimating the SM backgrounds in searches for BSM physics is based on using Monte Carlo simulations to


We can then measure the number of events in the control region in real data (N<sub>data</sub><sup>CR</sup>) and in Monte Carlo simulated data (N<sub>MC</sub><sup>CR</sup>). Finally, we need to use the Monte Carlo simulation to calculate the 'transfer factor' (TF), defined by

$$\text{TF}^{\text{SR}} = \frac{N\_{\text{MC}}^{\text{SR}}}{N\_{\text{MC}}^{\text{CR}}} \tag{13.18}$$

The predicted background for a given process<sup>24</sup> is then given by

$$N\_{\text{predicted}} = \text{TF}^{\text{SR}} \times N\_{\text{data}}^{\text{CR}} \tag{13.19}$$
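Eqns 13.18 and 13.19 translate directly into a few lines of code. In this minimal sketch the function names and all event counts are hypothetical, and a single SM process is assumed to populate the control region.

```python
def transfer_factor(n_mc_sr, n_mc_cr):
    """Eqn 13.18: ratio of simulated yields in the signal and control regions."""
    if n_mc_cr <= 0:
        raise ValueError("the control region must contain simulated events")
    return n_mc_sr / n_mc_cr


def predicted_background(n_data_cr, n_mc_sr, n_mc_cr):
    """Eqn 13.19: extrapolate the data yield in the control region to the signal region."""
    return transfer_factor(n_mc_sr, n_mc_cr) * n_data_cr


# Hypothetical yields: the simulation predicts 12 events in the signal region and
# 480 in the control region, while 520 events are observed in the control region.
tf = transfer_factor(12.0, 480.0)
n_pred = predicted_background(520, 12.0, 480.0)
print(f"TF = {tf:.3f}, predicted background in SR = {n_pred:.1f}")
```

Because only the ratio of two simulated yields enters, a mismodelling that scales both regions by the same factor cancels in the prediction.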

While this approach works well for SM processes involving W, Z, or top-quark production, it cannot be used to predict QCD-induced backgrounds such as jets that are misidentified as electrons or 'fake' E<sub>T</sub><sup>miss</sup> arising from the finite detector resolution. In general, the QCD cross sections are so much larger than those for BSM physics that a data-driven method must be used. There are several ways of doing this. One option is to perform a 'template' fit. For example, to estimate the background from 'fake' electrons (i.e. QCD jets wrongly identified as electrons):


<sup>24</sup>This is a simplified discussion, since it assumes that each control region is only populated by events from one SM process. Nevertheless, it should give a clear idea of the approach used. Note that it does not entirely eliminate the reliance on Monte Carlo simulations, but it only requires the Monte Carlo predictions of ratios, so the systematic uncertainties are greatly reduced compared with relying on Monte Carlo to predict the absolute number of SM events in the SR.

**Fig. 13.18** Distributions for signal and background. The x axis is a proxy for a suitable variable like E<sub>T</sub><sup>miss</sup>.

This method works if the shapes of the two distributions are sufficiently different, as indicated in the sketch in Fig. 13.18. The data distribution is then fitted to the sum of the two distributions, and the fit can then be used to determine the number of background events in any SR.
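A two-template fit of this kind can be sketched as follows. This is a deliberately simplified stand-in for what an experiment would do: it uses an unweighted least-squares fit solved in closed form rather than a binned likelihood fit, and the template shapes, bin contents, and the choice of signal-region bins are all hypothetical.

```python
def fit_two_templates(data, template_a, template_b):
    """Least-squares fit of data_i ~ n_a * template_a_i + n_b * template_b_i.

    Both templates are assumed to be normalized to unit area, so n_a and n_b
    are the fitted numbers of events of each type.
    """
    s_aa = sum(a * a for a in template_a)
    s_bb = sum(b * b for b in template_b)
    s_ab = sum(a * b for a, b in zip(template_a, template_b))
    s_ad = sum(a * d for a, d in zip(template_a, data))
    s_bd = sum(b * d for b, d in zip(template_b, data))
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("templates are too similar to be separated")
    # Solve the 2x2 normal equations with Cramer's rule.
    n_a = (s_ad * s_bb - s_bd * s_ab) / det
    n_b = (s_bd * s_aa - s_ad * s_ab) / det
    return n_a, n_b


# Hypothetical unit-area templates in five bins of some discriminating variable.
real_template = [0.10, 0.20, 0.40, 0.20, 0.10]
fake_template = [0.50, 0.25, 0.15, 0.07, 0.03]

# Toy 'data' built from 300 real and 700 fake electrons (no statistical noise).
data = [300 * r + 700 * f for r, f in zip(real_template, fake_template)]

n_real, n_fake = fit_two_templates(data, real_template, fake_template)

# Fake background in a signal region defined (hypothetically) as the last two bins.
fakes_in_sr = n_fake * sum(fake_template[3:])
print(f"fitted real = {n_real:.1f}, fitted fake = {n_fake:.1f}, "
      f"fakes in SR = {fakes_in_sr:.1f}")
```

If the two templates had nearly the same shape, the determinant would approach zero and the fit could no longer separate them, which is the condition stated in the text.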

As discussed in Section 13.4.1, the number of free parameters in the minimal SUSY model (MSSM) is so large that it is not feasible to search the full parameter space. One example of a search [36] for squarks was performed with the assumption<sup>25</sup> that the gluinos are much heavier than the squarks and that the squarks decay with 100% branching ratio as q̃ → qχ̃<sub>1</sub><sup>0</sup>. The signature for the strong-interaction production of pairs of squarks is therefore (at least) two high-p<sub>T</sub> jets plus large E<sub>T</sub><sup>miss</sup>. Another SUSY search targets very large jet multiplicities that can arise from cascade decays of SUSY particles. The dominant background source for this analysis was from mismeasured values of E<sub>T</sub><sup>miss</sup>. The experimental uncertainty on the measurement of E<sub>T</sub><sup>miss</sup> is found to scale as √H<sub>T</sub>, where H<sub>T</sub> is the sum of the transverse momenta of the jets. Therefore, the signal (control) region is defined by events with large (small) values of E<sub>T</sub><sup>miss</sup>/√H<sub>T</sub>. The distribution of E<sub>T</sub><sup>miss</sup>/√H<sub>T</sub> was measured for a lower-jet-multiplicity sample and this was then used to scale the number of background events in the control region to the signal region. The resulting distribution [31] of E<sub>T</sub><sup>miss</sup>/√H<sub>T</sub> is shown in Fig. 13.19. The data are in good agreement with the background calculations, and powerful limits on the masses of the squarks and gluinos can be obtained as shown in Fig. 13.20. The curves in Fig. 13.19 show that for a particular simplified SUSY model, a clear excess would have been expected at large values of E<sub>T</sub><sup>miss</sup>/√H<sub>T</sub>.

<sup>25</sup>Other analyses make different model assumptions.

**Fig. 13.19** Distribution of E<sub>T</sub><sup>miss</sup>/√H<sub>T</sub> measured by ATLAS in a signal region with the number of jets ≥10. The SM backgrounds and the expected SUSY signal for a simplified model with masses given by m(g̃) = 900 GeV and m(χ̃<sub>1</sub><sup>0</sup>) = 150 GeV are shown [31].

**Fig. 13.20** Limits in the plane of the mass of the χ̃<sub>1</sub><sup>0</sup> versus the mass of the lightest q̃, obtained in simplified models with the assumption of one non-degenerate squark or eight degenerate squarks [31]. The areas below the solid curves are excluded at 95% confidence level.

**Fig. 13.21** Kinematically allowed decay modes of the t̃ for different regions in the mass plane (m<sub>t̃</sub>, m<sub>χ̃<sub>1</sub><sup>0</sup></sub>) [34].

The hierarchy argument suggests that the mass of the stop squark should be relatively low (see eqn 13.16) but the masses of the other squarks could be much higher without violating 'naturalness'. Therefore, the search for stop squarks is particularly important. In general, the SUSY signal for t̃t̃̄ production will suffer from SM backgrounds from tt̄ production. The kinematically allowed decay modes for the stop depend on the masses of the stop and the neutralino, as illustrated in Fig. 13.21. Several searches for the stop have been performed. One particularly powerful search used final states with one lepton and multiple jets [34]. The analysis used several signal regions to target the kinematics for the different regions in the mass plane (m<sub>t̃</sub>, m<sub>χ̃<sub>1</sub><sup>0</sup></sub>). The results were consistent with the SM, and the resulting limits are shown in Fig. 13.22.

**Fig. 13.22** Limits in the plane of the mass of the χ̃<sub>1</sub><sup>0</sup> versus the mass of the t̃ [34]. The areas below the solid curves are excluded at 95% confidence level.

In some SUSY models, the squarks and gluinos are very heavy and the lightest SUSY particles are the charginos and neutralinos. Therefore, another interesting way of searching for SUSY is to select events with charged leptons and large values of E<sub>T</sub><sup>miss</sup>. There are many possible SUSY production and decay chains that would lead to these final states. One analysis [35] considered the electroweak production of χ̃<sub>1</sub><sup>±</sup>χ̃<sub>2</sub><sup>0</sup>, followed by either the decay chain χ̃<sub>1</sub><sup>±</sup> → ν̃l and ν̃ → νχ̃<sub>1</sub><sup>0</sup> or χ̃<sub>1</sub><sup>±</sup> → l̃ν and l̃ → lχ̃<sub>1</sub><sup>0</sup>, together with similar decay modes for the χ̃<sub>2</sub><sup>0</sup> (see Fig. 13.23). The results shown in Fig. 13.24 represent a large improvement on the limits from LEP2.<sup>26</sup> Many other electroweak processes can be considered in the search for charginos and neutralinos. For example, the Feynman diagram for the electroweak production of χ̃<sub>1</sub><sup>0</sup>χ̃<sub>1</sub><sup>+</sup> is shown in Fig. 13.25. In many models, the decays of the χ̃<sub>1</sub><sup>+</sup> result in a charged lepton. This results in a distinctive experimental signature of a charged lepton and a large value of E<sub>T</sub><sup>miss</sup>.

<sup>26</sup>At LEP2, the limits on charged-

## **13.5.4 Summary of searches for new physics**

The limits on the masses of SUSY particles and other exotic physics have been greatly extended by the LHC, but as of 2015 there is no clear evidence for any new physics at the LHC. However, the mass reach will be greatly extended by the increase in energy to 13 TeV and the planned increase in luminosity up to and beyond the nominal value of 10<sup>34</sup> cm<sup>−2</sup> s<sup>−1</sup>, so there are prospects for exciting discoveries in the next few years.

## **13.6 Linear collider**

**Fig. 13.23** Electroweak SUSY production processes for χ̃<sub>1</sub><sup>±</sup>χ̃<sub>2</sub><sup>0</sup>, with decays via sleptons or sneutrinos [35].

**Fig. 13.24** Mass limits in the plane of the masses of the χ̃<sub>1</sub><sup>0</sup> and χ̃<sub>1</sub><sup>±</sup> in a simplified SUSY model [35].

**Fig. 13.25** A Feynman diagram for electroweak chargino–neutralino production.

**Fig. 13.26** Production of Higgs bosons in e<sup>+</sup>e<sup>−</sup> annihilation.

The observation at the LHC of a low-mass Higgs-boson-like particle gives a strong motivation for studying its properties in more detail in an e<sup>+</sup>e<sup>−</sup> collider. In an e<sup>+</sup>e<sup>−</sup> collider, a Higgs boson would be produced by the 'Higgsstrahlung' process (see Fig. 13.26) e<sup>+</sup>e<sup>−</sup> → HZ. The minimum CMS energy for this process is m<sub>Z</sub> + m<sub>H</sub>, but a useful rule of thumb is that a sufficiently large cross section requires an energy of ∼m<sub>H</sub> + 100 GeV. Given the mass of the boson observed at the LHC (see Chapter 12), the minimum energy of an e<sup>+</sup>e<sup>−</sup> collider to be operated as a 'Higgs factory' would be ∼230 GeV. In such a process, the events could be tagged by detecting the Z decay products, so that all decay modes of the Higgs boson could be studied precisely to determine their branching ratios. In the SM, all the branching ratios of the Higgs boson can be calculated once its mass has been determined, and therefore this would provide stringent tests of whether the properties of the observed particle are consistent with those expected in the SM. Using polarized electron beams would also allow a determination of the spin and parity of the Higgs boson (the LHC measurements strongly support the SM assignment of a spin/parity of 0<sup>+</sup>).
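The energy estimates above, and the idea of tagging events through the Z alone, can be illustrated with two-body kinematics. In the sketch below the masses are approximate inputs and the function names are illustrative. The recoil-mass relation is the standard kinematic identity for e<sup>+</sup>e<sup>−</sup> → HZ at a known CMS energy; it is not spelled out in the text and is included here as an assumption about how the tagging would work in practice.

```python
import math

M_Z = 91.19   # GeV, approximate
M_H = 125.0   # GeV, approximate mass of the boson observed at the LHC


def threshold_energy(m_h=M_H, m_z=M_Z):
    """Minimum CMS energy for e+ e- -> H Z."""
    return m_h + m_z


def rule_of_thumb_energy(m_h=M_H):
    """CMS energy suggested by the rule of thumb m_H + 100 GeV."""
    return m_h + 100.0


def z_energy(sqrt_s, m_h=M_H, m_z=M_Z):
    """Energy of the Z in the CMS for the two-body final state H Z."""
    return (sqrt_s**2 + m_z**2 - m_h**2) / (2 * sqrt_s)


def recoil_mass(sqrt_s, e_z, m_z=M_Z):
    """Mass recoiling against a Z of energy e_z, using only the Z and sqrt(s)."""
    return math.sqrt(sqrt_s**2 + m_z**2 - 2 * sqrt_s * e_z)


print(f"threshold     : {threshold_energy():.1f} GeV")
print(f"rule of thumb : {rule_of_thumb_energy():.1f} GeV")

sqrt_s = 250.0  # hypothetical operating energy
e_z = z_energy(sqrt_s)
print(f"at sqrt(s) = {sqrt_s:.0f} GeV: E_Z = {e_z:.1f} GeV, "
      f"recoil mass = {recoil_mass(sqrt_s, e_z):.1f} GeV")
```

The recoil mass is reconstructed without using any Higgs decay products, which is why all decay modes can be studied in an unbiased way.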

However, to confirm that the particle really was a Higgs boson, we would need to confirm that the self-coupling of the Higgs boson exists with the expected strength given by the SM. This could be studied with higher luminosity at the LHC or at an e<sup>+</sup>e<sup>−</sup> collider if it had sufficiently high CMS energy. If the CMS energy of a linear collider were larger than twice the top mass, many precise measurements would be possible. The mass of the top quark could be measured with a precision an order of magnitude better than achievable at LHC. By comparing the measurements of the mass of the W, the top quark, and the Higgs boson, a precision test of the radiative corrections in the SM would be possible. The interest here is that any deviations from the SM prediction would give indications of new physics at higher mass scales, through loop diagrams. Precision measurements of the top-quark axial and vector couplings could be made using the electron polarization and again any deviations from the SM predictions would give sensitivity to new physics.

If the CMS energy were large enough, an e<sup>+</sup>e<sup>−</sup> collider could be used to study SUSY particles. This would require that the beam energy be larger than the mass of the lightest charged SUSY particle. This might require a much higher-energy machine than would be required for the study of the Higgs boson and top quark. Such a machine would have to be a linear collider, because a circular machine would have unacceptable synchrotron radiation losses or require an unrealistically large radius. R&D for a linear collider is based around the development of high-gradient superconducting radiofrequency (RF) cavities. Currently, gradients of ∼35 MeV m<sup>−1</sup> have been achieved. In order to achieve useful interaction rates, very high luminosity would be required. As a linear collider is a single-pass machine (unlike circular machines), this will require very small beam sizes. This makes many demands on the machine:


Considerable progress has been made in these issues, but more remains to be done before such a machine could be built.

Looking further into the future, if a higher-energy e<sup>+</sup>e<sup>−</sup> collider were to be built, it would probably require the more exotic technology being developed for the CLIC (Compact Linear Collider). This is aimed at a collider with a CMS energy around 3 TeV. To keep the length affordable, accelerating gradients of ∼100 MeV m<sup>−1</sup> would be required, which is too large for the superconducting RF technology (see Chapter 3) being developed for lower-energy linear colliders. Such gradients could be achieved using a two-beam-acceleration concept in which the RF power is generated by a very high-current (I ∼ 100 A) electron beam (drive beam) running parallel to the main beam. This drive beam is decelerated and the generated RF power is transferred to the main beam.
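The link between gradient and machine length, and the reason a circular machine is ruled out at these energies, can be made concrete with a rough estimate. The sketch below is an illustration under simple assumptions: the linac length counts only the accelerating structures of one linac (each beam must reach half the CMS energy), and the synchrotron loss uses the standard approximation for electrons, ΔE ≈ 8.85 × 10<sup>−5</sup> E<sup>4</sup>/ρ with E in GeV and the bending radius ρ in metres. The bending radius used in the example is a hypothetical, LEP-like value.

```python
def linac_length_km(cms_energy_gev, gradient_mev_per_m):
    """Active accelerating length of ONE linac, in km."""
    beam_energy_mev = 0.5 * cms_energy_gev * 1000.0
    return beam_energy_mev / gradient_mev_per_m / 1000.0


def synchrotron_loss_per_turn_gev(beam_energy_gev, bending_radius_m):
    """Energy radiated per turn by an electron in a circular machine, in GeV."""
    return 8.85e-5 * beam_energy_gev**4 / bending_radius_m


for gradient in (35.0, 100.0):
    length = linac_length_km(3000.0, gradient)
    print(f"3 TeV CMS at {gradient:5.1f} MeV/m: {length:5.1f} km per linac")

# Hypothetical circular machine with a LEP-like bending radius of 3000 m.
for beam_energy in (100.0, 250.0):
    loss = synchrotron_loss_per_turn_gev(beam_energy, 3000.0)
    print(f"E_beam = {beam_energy:5.1f} GeV: loss per turn = {loss:7.1f} GeV")
```

Because the loss grows as the fourth power of the beam energy, raising the energy of a circular machine quickly leads to a loss per turn comparable to the beam energy itself, unless the radius is increased enormously.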

## **13.7 Dark matter**

There are strong indications from astrophysics that the universe contains a large amount of non-baryonic matter. This matter only has weak and gravitational interactions and cannot emit electromagnetic radiation,<sup>27</sup> hence the name 'dark matter'. The evidence for this dark matter comes experimentally.

<sup>27</sup>There are models in which dark matter only has gravitational interactions, but they are not discussed here as they are almost impossible to test directly

**Fig. 13.27** General picture of quark–WIMP scattering. The WIMPs

the Fermi theory of weak interactions, which ignores the effects of the W propagator and treats the interaction as a four-particle interaction. The Fermi theory works very well at low energies (∼10 MeV) because of the very high mass of the W boson.

inverse square law of gravity, <sup>29</sup> <sup>29</sup> An alternative attempt to explain this and other effects is to assume that the inverse square law needs to be modified at large values of r.

from its gravitational interaction with the luminous normal matter and this is briefly reviewed in Section 13.7.1. If such dark matter exists in the form of weakly interacting massive particles (WIMPs), there are three different approaches to studying them in laboratory or astroparticle experiments, which can be explained in terms of the cartoon shown in Fig. 13.27.

	- (3) **WIMP annihilation:** If WIMPs accumulate under gravity, there will be significant rates of annihilation. The search for these reactions is discussed in Section 13.7.3.

This qualitative comparison between the accelerator searches and direct detection can be made more quantitative using effective field theories (EFTs) to describe the interactions between quarks and WIMPs. An EFT provides a low-energy approximation to a fuller but not yet known theory. The full theory would give accurate predictions at high energies, whereas the EFT would be expected to fail at high energies. If the separation between the two energy scales is large enough, the EFT may give reliable predictions at low energy. <sup>28</sup> <sup>28</sup> A classic example of an EFT is

## **13.7.1 Astrophysical evidence for dark matter**

There are several aspects of the astrophysical evidence for dark matter. One approach uses galactic rotation curves. The velocities of stars can be measured from their Doppler shifts and compared with the velocities calculated assuming Newtonian gravitation. The latter calculation assumes that the distribution of gravitating matter follows that of the luminous matter. In this case, using Newtonian mechanics, one expects the rotation velocity v(r) to scale with distance r from the galactic centre as v(r) ∝ 1/√r for values of r beyond the bulk of the luminous matter of the galaxy. However, the rotation curves are usually much flatter at large r (see Fig. 13.28 for an example of a galactic rotation curve [47]). If one assumes that the inverse square law of gravity is correct, then there must be a halo of non-luminous matter, called 'dark matter'. To produce a flat rotation curve, the dark matter mass contained within a radius r must scale like M(r) ∝ r, or equivalently the density must scale like ρ(r) ∝ 1/r<sup>2</sup> (at very large values of r, the density must fall off faster in order for the galactic mass to be finite).

**Fig. 13.28** Rotation curve for the galaxy NGC 6503 [47] (horizontal axis: radius in kpc).
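The scalings quoted for the rotation curve can be recovered in a few lines. The sketch below assumes a spherically symmetric mass distribution and circular orbits; the symbols M<sub>lum</sub> (total luminous mass) and v<sub>0</sub> (the constant velocity of a flat curve) are introduced here for illustration only. Equating the centripetal and gravitational accelerations gives

$$\frac{v^2(r)}{r} = \frac{G\,M(r)}{r^2} \quad\Rightarrow\quad v(r) = \sqrt{\frac{G\,M(r)}{r}}$$

Outside the bulk of the luminous matter, the enclosed mass would be constant if there were no dark matter:

$$M(r) = M_{\rm lum} \quad\Rightarrow\quad v(r) = \sqrt{\frac{G\,M_{\rm lum}}{r}} \propto \frac{1}{\sqrt{r}}$$

Conversely, a flat rotation curve requires

$$v(r) = v_0 \quad\Rightarrow\quad M(r) = \frac{v_0^2\,r}{G}, \qquad \rho(r) = \frac{1}{4\pi r^2}\frac{{\rm d}M}{{\rm d}r} = \frac{v_0^2}{4\pi G\,r^2}$$

which reproduces M(r) ∝ r and ρ(r) ∝ 1/r<sup>2</sup>.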

Additional evidence comes from the motion of galaxies in clusters of galaxies. A striking piece of evidence for dark matter comes from the observation of the Bullet Cluster, in which two clusters have collided with one another. The distribution of gravitational mass can be inferred from gravitational lensing (see Perkins in Further Reading) and is found to be very different to the distribution of baryonic mass as determined from the observed light. The baryonic matter interacted via electromagnetic interactions and shows evidence for shock waves. Therefore, the interactions between the dark matter and the baryonic matter must be very much weaker than this.

Global fits to these and other data show that the dark matter content of the universe is much greater than that of ordinary baryonic matter. Direct dark matter searches are discussed in Section 13.7.2 and the search for dark matter annihilation is discussed in Section 13.7.3. The accelerator (LHC) search is not discussed further here, since the methodology is similar to that of the search for events with large values of E<sup>miss</sup><sub>T</sub> that has been discussed in the context of SUSY.

## **13.7.2 Direct dark matter detection**

At first sight, it might seem surprising that dark matter is difficult to detect if there is much more of it than ordinary matter. The problem is that the dark matter particles are expected to be 'cold', with a typical speed of ∼100 km s<sup>−1</sup>, and they interact only weakly. For typical WIMP masses in the range 10 GeV to 10 TeV, the WIMP–nucleon interaction will be elastic scattering. The signature of such an interaction would be a nuclear recoil with energy in the range 1–100 keV. This energy is very low and requires specialized detection techniques. The cross sections are very small, so the event rates will be low, and detecting a signal above background is an enormous experimental challenge. It is essential to reduce the rate of cosmic-ray interactions, which is usually done by performing the experiments in deep underground caverns. The next background to consider is that from natural radioactivity. This can be reduced by very careful control of all the materials used in the detectors. It is also advantageous if the detector provides powerful separation between the nuclear recoil signals and radioactive backgrounds. The detectors have to be large to achieve good sensitivity, and there is no perfect solution to these challenges, so many different approaches are currently being pursued. The following are some of the techniques used:


produce larger ionization yields. Therefore, the ratio S2/S1 provides powerful discrimination between nuclear recoils (signal) and radioactive backgrounds. A diagram of the LUX detector [14] is shown in Fig. 13.29.

Although there are several claims to have seen WIMP signals, they are all controversial and there are no signals confirmed by two experiments. Larger detectors are planned with target masses of the order of a ton, and these should be sensitive to WIMP signals over the range of WIMP masses and cross sections expected in common SUSY models that are compatible with the astrophysical data (see Section 13.7.1).
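The keV-scale recoil energies that make direct detection so demanding follow from simple kinematics. The sketch below evaluates the maximum (head-on) recoil energy for non-relativistic elastic scattering, T = 2 m<sub>N</sub> p<sup>2</sup>/(m<sub>N</sub> + m<sub>WIMP</sub>)<sup>2</sup> with p = m<sub>WIMP</sub> v. The function name, the approximate value used for the atomic mass unit, and the neglect of nuclear binding energy are assumptions made for illustration.

```python
# Assumed constants (standard values, not quoted in the text)
C_KM_S = 299_792.458   # speed of light in km/s
AMU_GEV = 0.9315       # one atomic mass unit in GeV (approximate)


def max_recoil_energy_kev(m_wimp_gev, mass_number, v_km_s):
    """Maximum recoil kinetic energy (keV) of a nucleus initially at rest.

    Non-relativistic elastic scattering, head-on collision:
    T = 2 m_N p^2 / (m_N + m_WIMP)^2 with p = m_WIMP * v.
    """
    m_n = mass_number * AMU_GEV      # nuclear mass in GeV (binding energy ignored)
    beta = v_km_s / C_KM_S           # WIMP speed in units of c
    p = m_wimp_gev * beta            # WIMP momentum in GeV
    t_gev = 2.0 * m_n * p**2 / (m_n + m_wimp_gev) ** 2
    return t_gev * 1.0e6             # convert GeV to keV


if __name__ == "__main__":
    # Xenon target (mass number 131), WIMP speed 220 km/s
    for m_wimp in (1.0, 100.0):
        t = max_recoil_energy_kev(m_wimp, mass_number=131, v_km_s=220.0)
        print(f"m_WIMP = {m_wimp:6.1f} GeV -> T_max = {t:.3g} keV")
```

With these inputs a 100 GeV WIMP gives a maximum recoil of a few tens of keV on xenon, whereas a 1 GeV WIMP gives only about 0.01 keV, which illustrates why low-mass WIMPs are so hard to detect with heavy targets.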

## **13.7.3 Dark matter annihilation**

If dark matter accumulates under gravity in the centres of massive objects like the Sun, it should be possible to detect some of the products of the resulting annihilation reactions. One approach uses the muon neutrinos resulting from cascade decays. Neutrino 'telescopes' should then be able to detect the neutrinos and show that they come from the centre of the Sun. One such neutrino telescope is IceCube [90], which has strings of photomultipliers buried in the Antarctic ice, giving it a volume of about 1 km<sup>3</sup>. Fast muons produce Čerenkov light, which is detected by the photomultipliers.<sup>30</sup> There is a background from downward-going muons, but upward-going muons can only come from neutrinos that have travelled through the Earth. Satellite experiments like FERMI, Pamela, and AMS are searching for the antiparticles that are expected from WIMP annihilation in the galactic halo. The measured positron-to-electron ratio is increasing with energy, as would be expected in dark matter models, but the data are not yet able to exclude conventional sources such as pulsars.

## **13.8 Dark energy**

Astronomers have been trying to measure the expansion of the Universe to determine if it is 'closed' (i.e. the expansion will be reversed at some future time) or 'open' (i.e. it will continue to expand indefinitely). This requires a determination of the rate of expansion with distance scale at the largest distances. The Universe is known to be expanding (called the 'Hubble expansion' after the astronomer who discovered this effect). The recession velocity of distant objects can be measured by looking at the redshifts of spectral lines compared with laboratory measurements of the rest values. The redshift is defined in terms of the shift in wavelength by z = Δλ/λ. The recession velocity β can be determined from the Doppler formula

$$1 + z = \sqrt{\frac{1 + \beta}{1 - \beta}}\tag{13.20}$$
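Equation (13.20) can be inverted to obtain the recession velocity from a measured redshift: squaring both sides gives β = [(1 + z)<sup>2</sup> − 1]/[(1 + z)<sup>2</sup> + 1]. The following minimal sketch implements both directions; the function names are illustrative and not taken from the text.

```python
def beta_from_redshift(z):
    """Recession velocity beta = v/c from redshift z, inverting eqn (13.20)."""
    s = (1.0 + z) ** 2
    return (s - 1.0) / (s + 1.0)


def redshift_from_beta(beta):
    """Redshift z from beta = v/c using the Doppler formula (13.20)."""
    return ((1.0 + beta) / (1.0 - beta)) ** 0.5 - 1.0


if __name__ == "__main__":
    for z in (0.01, 0.1, 1.0):
        print(f"z = {z:5.2f} -> beta = {beta_from_redshift(z):.4f}")
```

For small redshifts this reduces to β ≈ z, while z = 1 corresponds to β = 0.6.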

**Fig. 13.29** LUX detector showing the arrays of photomultiplier tubes and the electrodes to create the required electric fields [14].


<sup>30</sup>The cleanliness of the Antarctic ice results in a remarkably long attenuation length for the light of 55 m at a wavelength of 470 nm [89].

<sup>31</sup>This is referred to in astronomy as determining the intrinsic luminosity of 'standard candles'. For example, the periods of Cepheid variable stars are correlated with their intrinsic luminosities. As the distances of nearby Cepheid variables can be determined geometrically by parallax measurements, this allows an absolute calibration.

<sup>32</sup>Type Ia supernovae are believed to arise when a white dwarf star accretes enough mass from its main sequence (see Perkins in Further Reading if you are not familiar with this concept) companion that it exceeds the critical Chandrasekhar mass. The resulting nuclear fusion creates nickel and iron. The radioactive nickel atoms decay. It is assumed that more massive stars have to expand for longer before the opacity decreases enough for the photons to escape.

This picture is appropriate for nearby galaxies, but at larger distances Newtonian gravity is no longer correct. The Hubble law can be expressed as a linear relation between redshift and distance. The distances (called the luminosity distances DL) are estimated by comparing the measured fluxes F and intrinsic luminosities L of specific objects for which L can be determined using the inverse square law

$$F = \frac{L}{4\pi D_L^2} \tag{13.21}$$

The main difficulty in this approach is the determination of L.<sup>31</sup> For the largest distances the best method has used type Ia supernovae. Type Ia supernovae are bright enough to be detected at very large redshifts (and hence distances). By measuring the light output over a period of a few weeks, the width of the light curve can be measured. Empirically, for nearby supernovae for which the distances can be estimated with other techniques, there is a very good correlation between the width of the light curve and the intrinsic luminosity.<sup>32</sup> This then allows type Ia supernovae to be used as standard candles. However, the more distant and therefore older supernovae will have different chemical compositions compared with younger ones, since they would have been formed from interstellar material that had not been through a cycle of nuclear fusion in stars. This effect might bias the calibration of the light curve used to determine the absolute luminosity. The resulting Hubble plot at large redshifts is shown in Fig. 13.30 and has clear deviations from linearity. The very surprising conclusion is that the expansion of the universe appears to be accelerating. More precise but less direct evidence for this acceleration can be determined from measurements of the anisotropy of the Cosmic Microwave Background (CMB) radiation.
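Once the intrinsic luminosity L of a standard candle is known, eqn (13.21) can be inverted to give the luminosity distance, D<sub>L</sub> = √(L/4πF). The sketch below is illustrative only: the function name is hypothetical, and the Sun is used as a sanity check with assumed standard values for its luminosity and for the solar flux at the Earth.

```python
import math


def luminosity_distance(luminosity_w, flux_w_m2):
    """Luminosity distance in metres from eqn (13.21): F = L / (4 pi D_L^2)."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))


if __name__ == "__main__":
    # Sanity check with assumed solar values (not taken from the text):
    L_SUN_W = 3.828e26       # nominal solar luminosity in W
    F_EARTH_W_M2 = 1361.0    # approximate solar flux at the Earth in W/m^2
    d = luminosity_distance(L_SUN_W, F_EARTH_W_M2)
    print(f"D_L = {d:.3e} m")  # should be close to one astronomical unit
```

The check recovers a distance close to one astronomical unit (about 1.5 × 10<sup>11</sup> m). For supernovae the same relation is applied with L inferred from the width of the light curve.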

## **13.8.1 Theoretical implications**

One approach to understanding the accelerating expansion is to introduce a cosmological constant Λ into the Friedmann equation (see Perkins in Further Reading) for the expansion of the universe. The measured value corresponds to an energy density of approximately 5 GeV m<sup>−3</sup>. In quantum field theory, we expect the vacuum fluctuations to contribute to the energy density. The integral for the energy density is divergent and if we assume there is a cut-off at the Planck scale, M<sub>Planck</sub>, we would expect (see Exercise 13.10) that Λ ∼ 10<sup>121</sup> GeV m<sup>−3</sup>. New physics such as SUSY at the TeV scale would provide a much lower cut-off but, because the energy density scales as the fourth power of the cut-off, the prediction would still exceed the measured value by a factor of order 10<sup>56</sup>. One possible solution would be to assume that we need to modify Einstein's theory of general relativity. Another option is that the cosmological constant does not arise from dark energy but from a new field called 'quintessence'. A more radical alternative is based on the Anthropic Principle, according to which, in the context of inflationary models of the early universe, our universe might be just one universe in a larger multiverse and the observed value of Λ might be a selection bias: only a universe with a sufficiently small value of Λ would be old enough to allow structures and life to evolve.

**Fig. 13.30** (a) Hubble plot at moderate redshifts. (b) Hubble plot at large redshifts, with the linear trend seen at low z values divided out. The curves are fits to the Friedmann equation (see Perkins in Further Reading) with different parameters [115].
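The size of the mismatch between the naive vacuum energy and the measured value can be estimated numerically. The sketch below assumes a single massless mode with zero-point energy k/2 per state and the density of states 4πV k<sup>2</sup> dk/(2π)<sup>3</sup>, which gives ρ = E<sub>max</sub><sup>4</sup>/(16π<sup>2</sup>) in natural units; dividing by (ħc)<sup>3</sup> converts GeV<sup>4</sup> to GeV m<sup>−3</sup>. The function name and the numerical value of ħc are assumptions introduced here, and factors of order unity (for example the number of field degrees of freedom) are ignored.

```python
import math

HBAR_C_GEV_M = 1.973269804e-16   # hbar * c in GeV m (assumed standard value)
RHO_MEASURED_GEV_M3 = 5.0        # approximate measured energy density in GeV/m^3


def vacuum_energy_density(e_max_gev):
    """Zero-point energy density in GeV/m^3 for a cut-off E_max in GeV.

    rho = (1/2) * integral of k * 4 pi k^2 dk / (2 pi)^3 from 0 to E_max
        = E_max^4 / (16 pi^2)  in natural units (GeV^4),
    converted to GeV/m^3 by dividing by (hbar c)^3.
    """
    rho_gev4 = e_max_gev**4 / (16.0 * math.pi**2)
    return rho_gev4 / HBAR_C_GEV_M**3


if __name__ == "__main__":
    for label, e_max in (("Planck scale", 1.0e19), ("TeV scale", 1.0e3)):
        rho = vacuum_energy_density(e_max)
        ratio = rho / RHO_MEASURED_GEV_M3
        print(f"{label}: rho ~ 10^{math.log10(rho):.0f} GeV/m^3, "
              f"ratio to measured ~ 10^{math.log10(ratio):.0f}")
```

Under these assumptions the Planck-scale cut-off gives an energy density of order 10<sup>121</sup> GeV m<sup>−3</sup>. A TeV-scale cut-off still overshoots the measured density by a factor of order 10<sup>56</sup>, which corresponds to a mismatch of roughly 10<sup>14</sup> in energy scale because of the fourth-power dependence.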

## **Chapter summary**


## **Further reading**


## **Exercises**


$$T = \frac{2m_{\rm N}p^2}{(m_{\rm N} + m_{\rm WIMP})^2}$$

Discuss the implications for the choice of nuclear target for optimal sensitivity for a given WIMP mass. Evaluate the recoil energy for the case of WIMPs moving with a speed v ∼ 220 km s<sup>−1</sup>, for a WIMP mass m<sub>WIMP</sub> = 1 GeV, assuming that the target nucleus is xenon (atomic mass 131). Repeat the calculation for a WIMP of mass m<sub>WIMP</sub> = 100 GeV. Discuss the implications for the direct detection of WIMPs and in particular of low-mass WIMPs.


Q<sup>2</sup> = M<sub>W</sub><sup>2</sup>. If the charm quark produces a D meson, how might this process be tagged in an LHC experiment?


χ = 1. Explain from an experimental perspective the advantages to a search for new physics that uses the variable χ as opposed to the jet transverse momentum pT.

(13.10) The density of states for a quantum oscillator is 4πV k<sup>2</sup> dk/(2π)<sup>3</sup>, where V is the volume and k the momentum. Use this expression to evaluate the vacuum energy density, assuming that the divergent integral is cut off at some high energy scale E<sub>max</sub>. Estimate this value assuming that the cut-off is given by the Planck mass M<sub>Planck</sub> ∼ 10<sup>19</sup> GeV. Compare this value with the critical energy density ρ<sub>c</sub> ∼ 5 GeV m<sup>−3</sup> and comment on the significance of this comparison. How would your conclusions change if we assumed that SUSY provided a cut-off at E<sub>max</sub> ∼ 1 TeV?

## **References**

