# Mordechai Ben-Ari

# Mathematical Surprises


Mordechai Ben-Ari Department of Science Teaching Weizmann Institute of Science Rehovot, Israel

ISBN 978-3-031-13565-1 ISBN 978-3-031-13566-8 (eBook) https://doi.org/10.1007/978-3-031-13566-8

Mathematics Subject Classification (2020): 00-01, 51Nxx, 05-01, 00A07

© The Editor(s) (if applicable) and The Author(s) 2022. This book is an open access publication. **Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

## **Foreword**

If everyone were exposed to mathematics in its natural state, with all the challenging fun and surprises that that entails, I think we would see a dramatic change both in the attitude of students toward mathematics, and in our conception of what it means to be "good at math." Paul Lockhart

I'm really hungry for surprises because each one makes us ever-so-slightly but substantially smarter. Tadashi Tokieda

Mathematics, when appropriately approached, can provide us with plentiful pleasant surprises. This is confirmed by a Google search of "mathematical surprises," which, surprisingly, yields almost half a billion items. What is a surprise? The origins of the word trace back to Old French with roots in Latin: "sur" (over) and "prendre" (to take, to grasp, to seize). Literally, to surprise is to overtake. As a noun, surprise is both an unanticipated or bewildering event or circumstance, as well as the emotion caused by it.

Consider, for example, an extract from a lecture by Maxim Bruckheimer<sup>1</sup> on the Feuerbach circle: "Two points lie on one and only one straight line, this is no surprise. However, three points are not necessarily on one straight line and if, during a geometrical exploration, three points 'fall into' a straight line, this is a surprise and frequently we need to refer to this fact as a theorem to be proven. Any three points not on a straight line lie on one circle. However, if four points lie on the same circle, this is a surprise that should be formulated as a theorem. . . . Insofar as the number of points on a straight line is larger than 3, so is the theorem the more surprising.

<sup>1</sup> Maxim Bruckheimer was a mathematician who was one of the founders of the Open University UK and Dean of its Faculty of Mathematics. He was Head of the Department of Science Teaching at the Weizmann Institute of Science.

Likewise, insofar as the number of points lying on one circle is larger than 4, so is the theorem the more surprising. Thus, the statement that for any triangle there are nine related points on the same circle . . . is very surprising. Moreover, in spite of the magnitude of the surprise, its proof is elegant and easy."

In this book Mordechai Ben-Ari offers a rich collection of mathematical surprises, most of them less well known than the Feuerbach Circle and with sound reasons for including them. First, in spite of being absent from textbooks, the mathematical gems of this book are accessible with just a high school background (and patience, and paper and pencil, since fun does not come for free). Second, when a mathematical result challenges what we take for granted, we are indeed surprised (Chaps. 1, 13). Similarly, we are surprised by: the cleverness of an argument (Chaps. 2, 3), the justification of the possibility of a geometric construction by algebraic means (Chap. 16), a proof relying on an apparently unrelated topic (Chaps. 4, 5), a strange proof by induction (Chap. 6), new ways of looking at a well-known result (Chap. 7), a seemingly minor theorem becoming the foundation of a whole field of mathematics (Chap. 8), unexpected sources of inspiration (Chap. 9), rich formalizations emerging from purely recreational activities such as origami (Chaps. 10–12). These are all different reasons for the inclusion of the pleasant, beautiful and memorable mathematical surprises in this lovely book.

So far I have addressed how the book relates to the first part of the definition of surprise, the cognitive rational reasons for the unexpected. As to the second aspect, the emotional aspect, this book is a vivid instantiation of what many mathematicians claim regarding the primary reason for doing mathematics: it is fascinating! Moreover, they claim that mathematics stimulates both our intellectual curiosity and our esthetic sensibilities, and that solving a problem or understanding a concept provides a spiritual reward, which entices us to keep working on more problems and concepts.

It has been said that the function of a foreword is to tell readers why they should read the book. I have tried to accomplish this, but I believe that the fuller answer will come from you, the reader, after reading it and experiencing what the etymology of the word surprise suggests: to be overtaken by it!

*Abraham Arcavi*

## **Preface**

Godfried Toussaint's article on the "collapsing compass" [50] made a profound impression on me. It would never have occurred to me that the modern compass with a friction joint is not the one used in Euclid's day. In this book I present a selection of mathematical results that are not only interesting, but that surprised me when I first encountered them.

The mathematics required to read the book is secondary-school mathematics, but that does not mean that the material is simple. Some of the proofs are quite long and require that the reader be willing to persevere in studying the material. The reward is understanding of some of the most beautiful results in mathematics. The book is not a textbook, because the wide range of topics covered doesn't fit neatly into a syllabus. It is appropriate for enrichment activities for secondary-school students, for college-level seminars and for mathematics teachers.

The chapters can be read independently. (An exception is that Chap. 10 on the axioms of origami is a prerequisite for Chaps. 11, 12, the other chapters on origami.) Notes relevant to all chapters are given below in the list labeled Style.

#### **What Is a Surprise?**

There were three criteria for including a topic in the book:

• The theorem surprised me. Particularly surprising were the theorems on constructibility with a straightedge and compass. The extremely rich mathematics of origami was almost shocking: when a mathematics teacher proposed a project on origami, I initially turned her down because I doubted that there could be any serious mathematics associated with the art form. Other topics were included because, although I knew the results, their proofs were surprising in their elegance and accessibility, in particular, Gauss's purely algebraic proof that a regular heptadecagon can be constructed.


Each chapter concludes with a paragraph *What Is the Surprise?* which explains my choice of the topic.

#### **An Overview of the Contents**

Chapter 1 presents Euclid's proof that any construction that is possible with a fixed compass is possible with a collapsing compass. Many proofs have been given, but, as Toussaint shows, most are incorrect because they depend on diagrams which do not always correctly depict the geometry. To emphasize that one must not trust diagrams, I present the famous alleged proof that every triangle is isosceles.

Over the centuries mathematicians unsuccessfully sought to trisect an arbitrary angle (divide it into three equal parts) using only a straightedge and compass. Underwood Dudley made a comprehensive study of trisectors who find incorrect constructions; most constructions are approximations that are claimed to be accurate. Chapter 2 starts by presenting two of these constructions and developing the trigonometric formulas showing that they are only approximations. To show that trisection using just a straightedge and compass is of no practical importance, trisections using more complex tools are presented: Archimedes's *neusis* and Hippias's *quadratrix*. The chapter ends with a proof that it is impossible to trisect an arbitrary angle with a straightedge and compass.

Squaring a circle (given a circle construct a square with the same area) cannot be performed using a straightedge and compass, because the value of $\pi$ cannot be constructed. Chapter 3 presents three elegant constructions of close approximations to $\pi$, one by Kochański and two by Ramanujan. The chapter concludes by showing that a quadratrix can be used to square a circle.

The four-color theorem states that it is possible to color any planar map with four colors, such that no countries with a common boundary are colored with the same color. The proof of this theorem is extremely complicated, but the proof of the five-color theorem is elementary and elegant, as shown in Chapter 4. The chapter also presents Percy Heawood's demonstration that Alfred Kempe's "proof" of the four-color theorem is incorrect.

How many guards must be employed by an art museum so that all the walls are under constant observation by at least one guard? The proof in Chapter 5 is quite clever, using graph coloring to solve what at first sight appears to be a purely geometrical problem.

Chapter 6 presents some lesser-known results and their proofs by induction: theorems on Fibonacci numbers and Fermat numbers, McCarthy's 91 function, and the Josephus problem.

Chapter 7 discusses Po-Shen Loh's method of solving quadratic equations. The method is a critical element of Gauss's algebraic proof that a heptadecagon can be constructed (Chapter 16). The chapter includes al-Khwarizmi's geometric construction for finding roots of quadratic equations and a geometric construction used by Cardano in the development of the formula for finding roots of cubic equations.

Ramsey theory is a topic in combinatorics that is an active area of research. It looks for patterns among subsets of large sets. Chapter 8 presents simple examples of Schur triples, Pythagorean triples, Ramsey numbers and van der Waerden's problem. The proof of the theorem on Pythagorean triples was accomplished recently with the aid of a computer program based on mathematical logic. The chapter concludes with a digression on the ancient Babylonians' knowledge of Pythagorean triples.

C. Dudley Langford observed his son playing with colored blocks and noticed that he had laid them out in an interesting sequence. Chapter 9 presents his theorem on the conditions for such a sequence to be possible.

Chapter 10 contains the seven axioms of origami, together with the detailed calculations of the analytic geometry of the axioms, and characterizations of the folds as geometric loci.

Chapter 11 presents Eduard Lill's method and the origami fold proposed by Margharita P. Beloch. I introduce Lill's method as a magic trick so I won't spoil it by giving details here.

Chapter 12 shows that origami can perform constructions not possible with a straightedge and compass: trisecting an angle, squaring a circle and constructing a nonagon (a regular polygon with nine sides).

Chapter 13 presents the theorem by Georg Mohr and Lorenzo Mascheroni that any construction with a straightedge and compass can be performed using only a compass.

The corresponding claim that a straightedge only is sufficient is incorrect, because a straightedge cannot compute lengths that are square roots. Jean-Victor Poncelet conjectured and Jakob Steiner proved that a straightedge is sufficient, provided that there exists a single fixed circle somewhere in the plane (Chap. 14).

If two triangles have the same perimeter and the same area must they be congruent? That seems reasonable but it turns out not to be true, although it takes quite a bit of algebra and geometry to find a non-congruent pair as shown in Chap. 15.

Chapter 16 presents Gauss's tour-de-force: a proof that a heptadecagon (a regular polygon with seventeen sides) can be constructed using a straightedge and compass. By a clever argument on the symmetry of the roots of polynomials, he obtained a formula that uses only the four arithmetic operators and square roots. Gauss did not give an explicit construction of a heptadecagon, so the elegant construction by James Callagy is presented. The chapter concludes with constructions of a regular pentagon based on Gauss's method for the construction of a heptadecagon.

To keep the book as self-contained as possible, Appendix A collects proofs of theorems of geometry and trigonometry that may not be familiar to the reader.

#### **Style**

	- Algebra: polynomials and division of polynomials, *monic* polynomials (those whose coefficient of the highest power is 1), quadratic equations, multiplication of expressions with exponents $a^m \cdot a^n = a^{m+n}$.
	- Euclidean geometry: congruent triangles $\triangle ABC \cong \triangle DEF$ and the criteria for congruence, similar triangles $\triangle ABC \sim \triangle DEF$ and the ratios of their sides, circles and their inscribed and central angles.
	- Analytic geometry: the cartesian plane, computing lengths and slopes of line segments, the formula for a circle.
	- Trigonometry: the functions sin, cos, tan and the conversions between them, angles in the unit circle, the trigonometric functions of angles reflected around an axis such as $\cos(180^\circ - \theta) = -\cos\theta$.

#### **Acknowledgments**

This book would never have been written without the encouragement of Abraham Arcavi who welcomed me to trespass on his turf of mathematics education. He also graciously wrote the foreword. Avital Elbaum Cohen and Ronit Ben-Bassat Levy were always willing to help me (re-)learn secondary-school mathematics. Oriah Ben-Lulu introduced me to the mathematics of origami and collaborated on the proofs. I am grateful to Michael Woltermann for permission to use several sections of his reworking of Heinrich Dörrie's book. Jason Cooper, Richard Kruel, Abraham Arcavi and the anonymous reviewers provided helpful comments.

I would like to thank the team at Springer for their support and professionalism, in particular the editor Richard Kruel.

The book is published under the Open Access program and I would like to thank the Weizmann Institute of Science for funding the publication.

The LaTeX source files for the book (which include the TikZ source for the diagrams) are available at:

https://github.com/motib/surprises

*Mordechai (Moti) Ben-Ari*

## **Chapter 1 The Collapsing Compass**

A modern compass is a *fixed compass*: the distance between the two legs can be fixed so that it is possible to copy a line segment or a circle from one position to another (Fig. 1.1a). Euclid used a *collapsing compass* where a fixed distance cannot be maintained (Fig. 1.1b). Teachers often use a collapsing compass consisting of a marker tied to a string that is used to construct a circle on a whiteboard. It is impossible to maintain a fixed length when the compass is removed from the whiteboard.

**Fig. 1.1a** A fixed compass. One leg has a needle that is placed at the center of the circle. A pencil attached to the other leg is used to draw the circle. The legs are joined by a tight hinge so that the distance between the legs (the radius of the circle) is maintained even when the compass is lifted from the paper.

**Fig. 1.1b** A collapsing compass. The user holds a piece of string at the center of the circle. The other end of the string is tied to a pencil and is used to draw the circle. When the compass is lifted from the paper, the fingers (dashed) can easily slip to a new position.

This chapter begins with a discussion of the relevance of studying construction with a straightedge and compass (Sect. 1.1). Section 1.2 compares the two types of compasses in the most elementary construction: a perpendicular bisector. Section 1.3 presents Euclid's method of copying a line segment using a collapsing compass. This proves that any construction that can be done using a fixed compass can be performed using a collapsing compass. Section 1.4 shows a proof of this theorem which seems to be correct, but does not work for all configurations of lines and points. To emphasize that one must not trust diagrams, Sect. 1.5 presents a famous alleged proof that all triangles are isosceles; the proof appears to be correct but it is not, because it is based on an incorrect diagram.

#### **1.1 Construction with a Straightedge and Compass**

Construction with a straightedge and compass used to be the fundamental concept taught in Euclidean geometry. Recently, it has fallen out of favor in school curricula. It is certainly true that the topic has little, if any, practical use. As we show in Sects. 2.2, 2.3, 2.4, 3.4, the Greeks knew how to perform constructions that are impossible with a straightedge and compass by using tools only slightly more advanced. Today, using numerical methods, computers can perform constructions to any desired precision.

Nevertheless, I believe that there are advantages to studying constructions:


#### **1.2 Fixed Compasses and Collapsing Compasses**

Some geometry textbooks present the construction of a perpendicular bisector of a line segment by constructing two circles centered at the ends of the line segment such that the radii are equal and *greater than half the length of the segment* (Fig. 1.2a). This can only be done with a fixed compass because after drawing the circle centered at $A$, the distance between the legs of the compass needs to remain fixed to draw the circle centered at $B$.

**Fig. 1.2a** Construction of a perpendicular bisector with a fixed compass

**Fig. 1.2b** Construction of a perpendicular bisector with a fixed or a collapsing compass

Figure 1.2b shows the construction of a perpendicular bisector with either a fixed or a collapsing compass. Two circles are constructed: one centered at $A$ with radius $\overline{AB}$ and one centered at $B$ with radius $\overline{BA}$. This can be done with a collapsing compass because (obviously) $\overline{AB} = \overline{BA}$, so the compass does not have to "remember" the length of $\overline{AB}$ to construct a circle centered at $B$ with the same radius. The proof that the line constructed in Fig. 1.2a is a perpendicular bisector is not at all elementary because relatively advanced concepts like congruent triangles have to be used. However, the proof that the construction of a perpendicular bisector shown in Fig. 1.2b is correct is simple and based on the fact that $\triangle ABC$ is an equilateral triangle, where $C$ is a point of intersection of the two circles. In fact, this is the first proposition in Euclid's *Elements*. $\overline{AC} = \overline{AB}$ since they are radii of the same circle and, similarly, $\overline{BC} = \overline{BA}$. We have: $\overline{AC} = \overline{AB} = \overline{BA} = \overline{BC}$.

Figure 1.3a shows that for the construction with a fixed compass, the triangle will be an isosceles, not necessarily an equilateral, triangle (Fig. 1.3b).

#### **1.3 Euclid's Construction for Copying a Line Segment**

The second proposition of Euclid's *Elements* describes how to copy a given line segment $\overline{AB}$ to a segment of the same length, one of whose end points is a given point $C$. Therefore, a fixed compass adds no additional capabilities and a collapsing compass is sufficient, although constructions are easier with a fixed compass.

**Theorem 1.1** *Given a line segment $\overline{AB}$ and a point $C$, a line segment $\overline{CC'}$, one of whose endpoints is $C$, can be constructed using a collapsing compass, such that $\overline{AB} = \overline{CC'}$ (Fig. 1.4a).*

**Fig. 1.3a** Construction of an isosceles triangle with a fixed compass

*Proof* Construct the line segment $\overline{AC}$. Construct the equilateral triangle $\triangle ACD$ whose base is $\overline{AC}$ (Fig. 1.4b). By Euclid's first proposition, the triangle can be constructed using a collapsing compass. Construct the ray that is an extension of the line segment $\overline{DA}$ *from $D$ to $A$*, and construct the ray that is an extension of the line segment $\overline{DC}$ *from $D$ to $C$* (Fig. 1.5a). Construct the circle centered at $A$ with radius $\overline{AB}$ and denote the intersection of the circle and the ray extending $\overline{DA}$ by $E$ (Fig. 1.5b). Construct the circle centered at $D$ with radius $\overline{DE}$ and denote the intersection of the circle and the ray extending $\overline{DC}$ by $F$ (Fig. 1.6).

$\overline{DC} = \overline{DA}$ because $\triangle ACD$ is equilateral. $\overline{AE} = \overline{AB}$ are radii of the same circle, as are $\overline{DF} = \overline{DE}$. Therefore:

$$
\overline{CF} = \overline{DF} - \overline{DC} = \overline{DE} - \overline{DC} = \overline{DE} - \overline{DA} = \overline{AE} = \overline{AB}. \qquad \square
$$

The specification of the directions of the rays is essential. The proof here works for any line segment $\overline{AB}$ and any point $C$, regardless of its position relative to $\overline{AB}$. By specifying directions the "cone" enclosed by the two rays will intersect the circles even if $\overline{AC} > \overline{AB}$ (Fig. 1.7).
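The steps of the proof can be replayed numerically. The following is a minimal sketch in Python, assuming that points are given as coordinate pairs and that $A \neq C$; the function name `copy_segment` is hypothetical and is used only for illustration. It takes $E$ on the ray from $D$ beyond $A$ and $F$ on the ray from $D$ through $C$, as in the chain of equalities above.

```python
import math

def copy_segment(a, b, c):
    """Return the point F with |CF| = |AB|, following the steps of the proof.

    Assumes a != c so that the equilateral triangle on AC exists.
    """
    ax, ay = a
    cx, cy = c
    # D: apex of the equilateral triangle on base AC (rotate C about A by 60 degrees).
    cos60, sin60 = 0.5, math.sqrt(3) / 2
    dx = ax + (cx - ax) * cos60 - (cy - ay) * sin60
    dy = ay + (cx - ax) * sin60 + (cy - ay) * cos60
    side = math.dist(a, c)      # |DA| = |DC| = |AC|
    radius = math.dist(a, b)    # |AE| = |AB|
    # E lies on the ray from D through A, beyond A, so |DE| = |DA| + |AE|.
    de = side + radius
    # F lies on the ray from D through C at distance |DF| = |DE| from D.
    fx = dx + (cx - dx) * de / side
    fy = dy + (cy - dy) * de / side
    return (fx, fy)

examples = [((0, 0), (3, 4), (10, 1)),   # C far from A: |AC| > |AB|
            ((0, 0), (3, 4), (1, 1))]    # C close to A: |AC| < |AB|
for a, b, c in examples:
    f = copy_segment(a, b, c)
    print(math.dist(c, f), math.dist(a, b))
```

Both examples print two equal lengths (up to rounding), whichever of $\overline{AC}$ and $\overline{AB}$ is longer, which is the point of specifying the directions of the rays.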

**Fig. 1.4a** Copy the line segment $\overline{AB}$. The orientation of $\overline{CC'}$ is not important.

**Fig. 1.4b** Copying a line segment with a collapsing compass

**Fig. 1.5a** Constructing rays from $D$

**Fig. 1.5b** Constructing a circle with radius $\overline{AB}$

#### **1.4 A Flawed Construction for Copying a Line Segment**

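*Proof* Construct three circles: one centered at $A$ with radius $\overline{AB}$, one centered at $A$ with radius $\overline{AC}$, and one centered at $C$ with radius $\overline{AC} = \overline{CA}$. Denote the intersections of the circles of radius $\overline{AC}$ centered at $A$ and $C$ by $E$ and $F$, respectively, and denote an intersection of the circle centered at $C$ and the circle centered at $A$ with radius $\overline{AB}$ by $D$. If $\overline{AC} > \overline{AB}$, the construction is as shown in Fig. 1.8.

Construct a circle centered at $E$ with radius $\overline{ED}$. Denote the intersection of this circle with the circle centered at $A$ with radius $\overline{AC}$ by $G$. There are two intersections, so choose the one closer to $C$ (Fig. 1.9). $\overline{CD} = \overline{CE}$ are radii of the same circle as are $\overline{AE} = \overline{AG}$. By construction the radii $\overline{CE}$ and $\overline{AE}$ are equal. Therefore,

$$
\overline{CD} = \overline{CE} = \overline{AE} = \overline{AG}.
$$

$\overline{EG} = \overline{ED}$ are radii of the same circle, so $\triangle EAG \cong \triangle ECD$ by side-side-side and $\angle GEA = \angle DEC$.

**Fig. 1.6** Construction of $\overline{CF} = \overline{AB}$

**Fig. 1.7** Construction for $\overline{AC} > \overline{AB}$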

Since:

$$
\angle GEC = \angle GEA - \angle CEA = \angle DEC - \angle CEA = \angle DEA,
$$

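it follows that $\triangle ADE \cong \triangle CGE$ by side-angle-side. $\overline{AB} = \overline{AD}$ are radii of the smaller circle centered at $A$, so $\overline{GC} = \overline{AD} = \overline{AB}$. □

The proof is correct only if $\overline{AC} > \overline{AB}$. Figure 1.10 shows a diagram where $\overline{AC} < \overline{AB}$ and you can see that $\overline{AB} \neq \overline{GC}$.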

**Fig. 1.8** Construction for copying a line segment (1)

**Fig. 1.9** Construction for copying a line segment (2)

#### **1.5 Don't Trust a Diagram**

#### **Theorem 1.2 (Incorrect, of course)** *All triangles are isosceles.*

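*Proof (Incorrect)* Given an arbitrary triangle $\triangle ABC$, let $P$ be the intersection of the angle bisector of $\angle BAC$ and the perpendicular bisector of $\overline{BC}$, which meets $\overline{BC}$ at its midpoint $D$. The intersections of the altitudes from $P$ to the sides $\overline{AB}$, $\overline{AC}$ are denoted by $E$, $F$, respectively (Fig. 1.11). $\triangle APE \cong \triangle APF$ because they are right triangles with equal angles $\alpha$ and common side $\overline{AP}$. $\triangle DPB \cong \triangle DPC$ since they are right triangles, $\overline{PD}$ is a common side and $\overline{BD} = \overline{DC} = a$. $\triangle EPB \cong \triangle FPC$ since they are right triangles, $\overline{EP} = \overline{PF}$ by the first congruence and $\overline{PB} = \overline{PC}$ by the second congruence. By combining the equations we get that $\triangle ABC$ is isosceles: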

$$
\overline{AB} = \overline{AE} + \overline{EB} = \overline{AF} + \overline{FC} = \overline{AC} \,. \qquad \square
$$

The *logic* of the proof is correct, but the diagram upon which the proof is based is not correct because point $P$ is *outside* the triangle (Fig. 1.12).
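A coordinate computation makes the flaw concrete. The following is a minimal sketch in Python for one scalene triangle chosen purely for illustration; the helper names (`meeting_point`, `inside`, `foot_parameter`) and the name `p` for the intersection of the two bisectors are hypothetical. It locates the intersection of the angle bisector at $A$ with the perpendicular bisector of $\overline{BC}$, tests whether it lies inside the triangle, and reports where the feet $E$, $F$ of the perpendiculars fall along the lines $AB$ and $AC$.

```python
import math

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])

def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]

def cross(p, q):
    return p[0] * q[1] - p[1] * q[0]

def meeting_point(a, b, c):
    """Intersection of the bisector of angle BAC with the perpendicular bisector of BC.

    Assumes |AB| != |AC|; for an isosceles triangle the two lines coincide.
    """
    ab, ac = sub(b, a), sub(c, a)
    lab, lac = math.hypot(*ab), math.hypot(*ac)
    d = (ab[0] / lab + ac[0] / lac, ab[1] / lab + ac[1] / lac)  # bisector direction
    m = ((b[0] + c[0]) / 2, (b[1] + c[1]) / 2)                  # midpoint of BC
    n = sub(c, b)                                               # normal of the perpendicular bisector
    t = dot(sub(m, a), n) / dot(d, n)
    return (a[0] + t * d[0], a[1] + t * d[1])

def inside(p, a, b, c):
    """True if p is strictly inside triangle ABC."""
    s1 = cross(sub(b, a), sub(p, a))
    s2 = cross(sub(c, b), sub(p, b))
    s3 = cross(sub(a, c), sub(p, c))
    return (s1 > 0 and s2 > 0 and s3 > 0) or (s1 < 0 and s2 < 0 and s3 < 0)

def foot_parameter(p, a, b):
    """Position of the foot of the perpendicular from p on line AB: 0 at A, 1 at B."""
    ab = sub(b, a)
    return dot(sub(p, a), ab) / dot(ab, ab)

a, b, c = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
p = meeting_point(a, b, c)
print(inside(p, a, b, c))
print(foot_parameter(p, a, b), foot_parameter(p, a, c))
```

For this triangle the point is outside, the foot $E$ lies between $A$ and $B$, but the foot $F$ lies beyond $C$ on the extension of $\overline{AC}$. The congruences still hold, but $\overline{AC} = \overline{AF} - \overline{FC}$ rather than $\overline{AF} + \overline{FC}$, so the final chain of equalities fails.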

**Fig. 1.10** A diagram for which the proof doesn't work

**Fig. 1.11** An incorrect proof that all triangles are isosceles

#### **What Is the Surprise?**

As a student I took it for granted that a compass has a friction joint that maintains the distance between the point and the pencil when it is lifted from the paper. When the teacher used a compass made from a piece of string and a piece of chalk, I never imagined that it differed from my compass. The article by Godfried Toussaint was a real surprise, as was his demonstration that post-Euclid proofs were incorrect because they depended on diagrams that made unwarranted assumptions. I recommend the article to readers who wish to deepen their understanding of proofs in mathematics.

**Fig. 1.12** Why the construction doesn't work

#### **Sources**

This chapter is based on [50]. The incorrect construction of the equivalence of the two compasses in Sect. 1.4 is from [37]. A comprehensive English translation of Euclid's *Elements* together with an extensive commentary [22] was written by Thomas L. Heath, one of the foremost experts in Greek mathematics.


## **Chapter 2 Trisection of an Angle**

It is impossible to trisect an arbitrary angle (divide the angle into three equal parts) using only a straightedge and compass. Trisection requires the construction of cube roots, but a straightedge and compass can only construct lengths that are expressions built from integers, the four arithmetic operators and square roots. This was proved by Pierre Wantzel in 1837. Nevertheless, innumerable amateurs continue to attempt to trisect an angle. Their constructions are approximations though they are convinced that the constructions are correct. Section 2.1 presents two such constructions, develops formulas for the angles and shows the errors in the approximations.

Greek mathematicians discovered that if other instruments are allowed, angles can be trisected. Section 2.2 explains a construction by Archimedes using a simple instrument called a *neusis* and Sect. 2.3 shows how to double a cube using the neusis. Section 2.4 presents a construction for trisection by Hippias using an instrument called a *quadratrix*. The rest of the chapter contains a proof of the impossibility of trisecting an angle. Section 2.5 characterizes constructible numbers, Sect. 2.6 relates constructible numbers to roots of polynomials and Sect. 2.7 uses this theory to show that trisecting an angle and doubling a cube are impossible.

#### **2.1 Approximate Trisections**

#### **2.1.1 First Approximate Trisection**

**Construction:** Let $\theta = \angle AOB$ be an arbitrary angle and without loss of generality assume that $A$, $B$ are on a unit circle whose center is $O$. Bisect $\angle AOB$ and let $C$ be the intersection of the bisector with the unit circle. Let $D$ be the midpoint of $\overline{OA}$ and let $T$ be the midpoint of $\overline{DC}$. Denote the angle $\angle DOT$ by $\phi$ (Fig. 2.1).


**Fig. 2.1** First approximate trisection (1)

#### **Theorem 2.1**

$$\tan\phi = \frac{2\sin(\theta/2)}{1+2\cos(\theta/2)}\,.$$

*Proof* Figure 2.2 is extracted from Fig. 2.1 and contains additional annotations.

Let $\overline{CF}$ be the perpendicular to $\overline{OA}$ that intersects $\overline{OA}$ at $F$. Since $\overline{OC} = 1$, $\overline{CF} = \sin(\theta/2)$ and $\overline{OF} = \cos(\theta/2)$. Let $\overline{TE}$ be the perpendicular to $\overline{OA}$ that intersects $\overline{OA}$ at $E$.

$T$ is the midpoint of $\overline{DC}$ so $\overline{DT} = \overline{TC} = a$. But $\overline{FT}$ is the median to the hypotenuse of a right triangle, so $\overline{FT} = a$ and therefore $\triangle DTF$ is isosceles. It follows that $\overline{TE}$ is both the median and the altitude to $\overline{DF}$. Since $\overline{OD} = 1/2$ and $\overline{DE} = \frac{1}{2}\overline{DF}$, it is easy to see from the diagram that:

$$\overline{OE} = \frac{1}{2} + \frac{1}{2} \left( \cos \frac{\theta}{2} - \frac{1}{2} \right).$$

Compute the length $2a = \overline{DC}$ using Pythagoras's Theorem in $\triangle DCF$:

$$(2a)^2 = \left(\cos\frac{\theta}{2} - \frac{1}{2}\right)^2 + \sin^2\frac{\theta}{2}\,.$$


**Fig. 2.2** First approximate trisection (2)

The length $h = \overline{TE}$ can be computed from Pythagoras's Theorem in $\triangle DTE$:

$$\begin{aligned} a^2 &= h^2 + \left[\frac{1}{2}\left(\cos\frac{\theta}{2} - \frac{1}{2}\right)\right]^2\\ h^2 &= \frac{1}{4}\left(\cos\frac{\theta}{2} - \frac{1}{2}\right)^2 + \frac{1}{4}\sin^2\frac{\theta}{2} - \left[\frac{1}{2}\left(\cos\frac{\theta}{2} - \frac{1}{2}\right)\right]^2 = \frac{1}{4}\sin^2\frac{\theta}{2}\\ h &= \frac{1}{2}\sin\frac{\theta}{2} \end{aligned}$$

$$\tan\phi = \frac{h}{\overline{OE}} = \frac{\frac{1}{2}\sin\frac{\theta}{2}}{\frac{1}{2} + \frac{1}{2}\left(\cos\frac{\theta}{2} - \frac{1}{2}\right)} = \frac{2\sin\frac{\theta}{2}}{1 + 2\cos\frac{\theta}{2}}.$$

This is an approximation to a trisection $\phi = \theta/3$. For $\theta = 60^\circ$:

$$\tan^{-1}\left(\frac{2\sin 30^{\circ}}{1+2\cos 30^{\circ}}\right) = \tan^{-1} 0.366 \approx 20.1^{\circ} \approx 20^{\circ} \text{ .}$$

Table 2.1 shows the errors for a range of acute angles. The error is relatively small for small angles, rising to 1% at $85^\circ$.


**Table 2.1** Errors in the first approximate trisection
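The errors can be tabulated directly from the formula of Theorem 2.1. The following is a minimal sketch in Python; the function name `first_approximation` and the choice of angles (multiples of $5^\circ$) are assumptions made for illustration and need not match the rows of the table.

```python
import math

def first_approximation(theta_deg):
    """Angle phi (in degrees) produced by the first construction, from Theorem 2.1."""
    half = math.radians(theta_deg) / 2
    return math.degrees(math.atan(2 * math.sin(half) / (1 + 2 * math.cos(half))))

print("theta  theta/3      phi    error  error(%)")
for theta in range(5, 90, 5):
    exact = theta / 3
    phi = first_approximation(theta)
    error = phi - exact
    print(f"{theta:5d} {exact:8.3f} {phi:8.3f} {error:8.3f} {100 * error / exact:8.2f}")
```

The approximation always overshoots the true trisection slightly, and the relative error grows with the angle.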

#### **2.1.2 Second Approximate Trisection**

**Construction:** Let $\theta = \angle AOB$ be an arbitrary angle and without loss of generality assume that $A$, $B$ are on a unit circle whose center is $O$. Construct a circle of radius $1/3$ with center $O$ and let $D$ be its intersection with $\overline{OA}$. Bisect $\angle AOB$ and let $C$ be the intersection of the bisector with the circle of radius $1/3$. Construct the chord $\overline{CD}$ and the chords $\overline{AE} = \overline{EF} = \overline{CD}$ of the unit circle. Since equal chords subtend equal central angles, $\angle AOE = \angle EOF = \phi$ (Fig. 2.3).

#### **Theorem 2.2**

$$\cos\phi = 1 - \frac{1}{9}(1 - \cos(\theta/2)) = 1 - \frac{2}{9}\sin^2(\theta/4)\ .$$

*Proof* By the Law of Cosines in $\triangle DOC$:

$$\overline{CD}^2 = \left(\frac{1}{3}\right)^2 + \left(\frac{1}{3}\right)^2 - 2\left(\frac{1}{3}\right)\left(\frac{1}{3}\right)\cos(\theta/2) = \frac{2}{9}\left(1 - \cos(\theta/2)\right).$$

By the Law of Cosines in $\triangle AOE$:

$$\overline{AE}^2 = 1^2 + 1^2 - 2 \cdot 1 \cdot 1 \cdot \cos\phi = 2(1 - \cos\phi)\,.$$


**Fig. 2.3** Second approximate trisection

Equating the two expressions for $\overline{CD} = \overline{AE}$ and simplifying we get:

$$\cos\phi = 1 - \frac{1}{9}(1 - \cos(\theta/2))\,.$$

Since $\cos 2\alpha = \cos^2\alpha - \sin^2\alpha = 1 - 2\sin^2\alpha$, and therefore $1 - \cos 2\alpha = 2\sin^2\alpha$, we have the alternate formula:

$$\cos\phi = 1 - \frac{2}{9}\sin^2(\theta/4) \,. \qquad \square$$

This is an approximation to a trisection $2\phi = \theta/3$. For $\theta = 60^\circ$:

$$2\cos^{-1}\left(1-\frac{1}{9}(1-\cos 30^{\circ})\right) \approx 19.8^{\circ} \approx 20^{\circ} \text{ .}$$

Table 2.2 shows the errors for a range of acute angles. This construction is much less accurate than the one in Sect. 2.1.1.


**Table 2.2** Errors in the second approximate trisection
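The two formulas can be compared numerically. The following is a minimal sketch, assuming that the expression evaluated above for 60° in the first construction holds in general with 30° replaced by $\theta/2$; the function names are illustrative and not part of the text.

```python
import math

def first_approximation(theta_deg):
    """Angle in degrees from the first construction: atan(2 sin(t/2) / (1 + 2 cos(t/2)))."""
    half = math.radians(theta_deg) / 2
    return math.degrees(math.atan(2 * math.sin(half) / (1 + 2 * math.cos(half))))

def second_approximation(theta_deg):
    """Angle 2*phi in degrees, where cos(phi) = 1 - (1 - cos(t/2)) / 9 (Theorem 2.2)."""
    half = math.radians(theta_deg) / 2
    cos_phi = 1 - (1 - math.cos(half)) / 9
    return 2 * math.degrees(math.acos(cos_phi))

def error_table(angles):
    """Rows of (theta, exact trisection, error of first, error of second), in degrees."""
    rows = []
    for theta in angles:
        exact = theta / 3
        rows.append((theta, exact,
                     first_approximation(theta) - exact,
                     second_approximation(theta) - exact))
    return rows

if __name__ == "__main__":
    for theta, exact, err1, err2 in error_table(range(5, 90, 5)):
        print(f"{theta:3d}  {exact:7.3f}  {err1:+8.4f}  {err2:+8.4f}")
```

The printed columns are the angle, its exact trisection, and the signed error of each construction in degrees.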

#### **2.2 Trisection Using a Neusis**

The term *straightedge* is used instead of *ruler* because a straightedge has no marks on it. It can only be used to construct a straight line between two given points. Archimedes showed that a *neusis*, a straightedge with two marks that are a fixed distance apart, can be used to trisect an angle (Fig. 2.4). We define the distance between the marks to be 1.

**Construction:** Let = ∠ be an arbitrary angle in a unit circle with center , where the radius of the circle equals the distance between the marks on the neusis. Extend the radius beyond the circle. Place an edge of the neusis on and move it until it intersects the extension of at and the circle at , using the marks so that the length of the line segment is 1.<sup>1</sup> Construct the line . Denote ∠ = (Fig. 2.5).

**Fig. 2.4** A neusis

<sup>1</sup> This operation is called *verging*.

**Fig. 2.5** The neusis construction for trisecting an angle (1)

#### **Theorem 2.3** $\beta = \alpha/3$*.*

*Proof* Construct and denote the angles and line segments as shown in Fig. 2.6. △ and △ are isosceles triangles: <sup>=</sup> are radii of the same circle and = by construction using the neusis. Since the sum of the angles of a triangle is equal to 180° and the sum of supplementary angles is also equal to 180°, we have:

$$\begin{aligned} \epsilon &= 180^\circ - 2\beta \\ \gamma &= 180^\circ - \epsilon = 2\beta \\ \delta &= 180^\circ - 2\gamma = 180^\circ - 4\beta \\ \alpha &= 180^\circ - \delta - \beta = 3\beta \end{aligned}$$

**Fig. 2.6** The neusis construction for trisecting an angle (2)
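The relation $\alpha = 3\beta$ can also be checked with coordinates. The sketch below is only an illustration: the point labels (`o` for the center, `d` on the extended radius, `c` and `a` on the circle) and the placement of the extended radius along the negative $x$-axis are assumptions made here, since the labels of the figure are not reproduced in the text.

```python
import math

def neusis_angle(beta_deg):
    """Given the neusis angle beta at d, return the central angle (degrees) of point a.

    Hypothetical layout: unit circle centered at the origin o, the extended radius
    lies along the negative x-axis, and the neusis line leaves d at angle beta.
    """
    beta = math.radians(beta_deg)
    # Triangle d-c-o is isosceles with |dc| = |co| = 1, so d = (-2 cos(beta), 0).
    d = (-2 * math.cos(beta), 0.0)
    c = (d[0] + math.cos(beta), d[1] + math.sin(beta))
    assert abs(math.hypot(c[0], c[1]) - 1) < 1e-12      # c lies on the unit circle
    # Points on the line are d + t (cos(beta), sin(beta)); |point| = 1 gives
    # t^2 - 4 cos^2(beta) t + (4 cos^2(beta) - 1) = 0 with roots t = 1 (point c)
    # and t = 4 cos^2(beta) - 1 (point a).
    t = 4 * math.cos(beta) ** 2 - 1
    a = (d[0] + t * math.cos(beta), d[1] + t * math.sin(beta))
    assert abs(math.hypot(a[0], a[1]) - 1) < 1e-12      # a lies on the unit circle
    return math.degrees(math.atan2(a[1], a[0]))

if __name__ == "__main__":
    for beta in (5, 10, 20, 25):
        print(beta, neusis_angle(beta))
```

For each tested $\beta$ the central angle of `a` comes out as $3\beta$, in agreement with the theorem.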

#### **2.3 Doubling the Cube with a Neusis**

Given a cube $C$, construct another cube with twice its volume. If the volume of $C$ is $V$, its sides are of length $\sqrt[3]{V}$. The sides of a cube with twice the volume are $\sqrt[3]{2V} = \sqrt[3]{2} \cdot \sqrt[3]{V}$, so if we can construct $\sqrt[3]{2}$ we can double the cube.

**Construction:** Construct the unit equilateral triangle △ and extend with another unit line segment to . Construct rays extending and . Place the neusis on point and move it until one mark on the neusis is placed on the ray at and the other mark is placed on the ray at . Denote = and = (Fig. 2.7).

#### **Theorem 2.4** $x = \sqrt[3]{2}$*.*

*Proof* Since the triangle constructed above is equilateral, $\cos \angle CAP = \cos 60^\circ = \frac{1}{2}$ and by the Law of Cosines in $\triangle ACP$:

$$
\overline{CP}^2 = \overline{AC}^2 + \overline{AP}^2 - 2 \cdot \overline{AC} \cdot \overline{AP} \cos 60^\circ \tag{2.1a}
$$

$$(\mathbf{x} + \mathbf{1})^2 = 1^2 + (\mathbf{y} + \mathbf{1})^2 - 2 \cdot \mathbf{1} \cdot (\mathbf{y} + \mathbf{1}) \cdot \frac{1}{2} \tag{2.1b}$$

$$\mathbf{x}^2 + 2\mathbf{x} = \mathbf{y}^2 + \mathbf{y} \,. \tag{2.1c}$$

By Menelaus's theorem (Thm. A.20):

 · · = 1 .

**Fig. 2.7** Doubling the cube with a neusis


Therefore:

$$\frac{1}{y} \cdot \frac{1}{x} \cdot \frac{2}{1} = 1\tag{2.2a}$$

$$xy = 2 \,. \tag{2.2b}$$

Substituting Eq. 2.2b into Eq. 2.1c gives:

$$\begin{aligned} x^2 + 2x &= \frac{4}{x^2} + \frac{2}{x} \\ x^4 + 2x^3 &= 4 + 2x \\ x^3(x + 2) &= 2(x + 2) \\ x &= \sqrt[3]{2} \,. \end{aligned}$$
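The algebra can be double-checked numerically. This is a small sketch with illustrative function names: it substitutes $y = 2/x$ (Eq. 2.2b) into Eq. 2.1c, and it confirms on a few integers that $x^4 + 2x^3 - 2x - 4$ factors as $(x^3 - 2)(x + 2)$.

```python
def residual(x):
    """Left side minus right side of Eq. 2.1c after substituting y = 2/x (Eq. 2.2b)."""
    y = 2 / x
    return (x ** 2 + 2 * x) - (y ** 2 + y)

def quartic(x):
    """x^4 + 2x^3 - 2x - 4, obtained by clearing denominators and collecting terms."""
    return x ** 4 + 2 * x ** 3 - 2 * x - 4

def factored(x):
    """The same polynomial written as (x^3 - 2)(x + 2)."""
    return (x ** 3 - 2) * (x + 2)

if __name__ == "__main__":
    cube_root_of_two = 2 ** (1 / 3)
    print(residual(cube_root_of_two))                             # close to zero
    print(all(quartic(k) == factored(k) for k in range(-5, 6)))   # factorization check
```

Since lengths are positive, the factor $x + 2$ cannot vanish, which leaves $x^3 = 2$.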

#### **2.4 Trisection Using a Quadratrix**

Let be a square. Let <sup>1</sup> be a line segment placed initially at and let <sup>2</sup> be a line segment placed initially at . Move <sup>1</sup> at a constant linear velocity until it reaches and rotate <sup>2</sup> at a constant angular velocity clockwise on until it also reaches . Assume that they reach together. For example, if <sup>2</sup> rotates at 1°/second and the side of the square is 9 centimeters, <sup>1</sup> must move at 0.1 cm/second. The trace of their point of intersection is called a *quadratrix curve* or simply a *quadratrix* (Fig. 2.8a). Its definition is attributed to the mathematician Hippias.

**Fig. 2.8a** A quadratrix curve **Fig. 2.8b** A quadratrix compass

A quadratrix can be constructed using a *quadratrix compass* as shown in Fig. 2.8b. It consists of two (unmarked) straightedges that move as described above. A joint constrains them to move together and traces out the curve.

A quadratrix can be used to trisect an angle.

**Construction:** Let ∠<sup>1</sup> = be an arbitrary angle, where <sup>1</sup> is the intersection of the line defining the angle relative to and the quadratrix. Construct a line through <sup>1</sup> parallel to and denote its intersection with by . Denote the line segment by and trisect it (Sect. 2.5) to obtain point that is /3 from . Let <sup>2</sup> be the intersection of a line from parallel to and the quadratrix, and denote by the angle between and <sup>2</sup> (Fig. 2.9).

## **Theorem 2.5** <sup>=</sup> /3*.*

*Proof* has -coordinate 1 <sup>−</sup> so by construction has -coordinate 1 − (/3). Since the constant linear velocity of the horizontal line is proportional to the constant angular velocity of the rotating line / <sup>=</sup> (/3)/ and <sup>=</sup> /3. □

**Fig. 2.9** Trisection of an angle using a quadratrix

#### **2.5 Constructible Numbers**

Let be a line segment defined to be of length 1.

**Definition 2.1** A number is *constructible* if and only if a line segment of length can be constructed with a straightedge and compass starting from .

Given line segment = , construct a line containing and use the compass to find a point on the line that is a distance of 1 from . Then is of length 2 so the number 2 is constructible. A line segment of length 1 can be constructed perpendicular to at . The hypotenuse of the triangle △ is of length $\sqrt{2}$ so the number $\sqrt{2}$ is constructible.

**Theorem 2.6** *A number is* constructible *if and only if it is the value of an expression built from the integers, the four arithmetic operations* {+, −, ×, /} *and the operation of taking a square root* √*.*

*Proof* First we show that the values of these expressions are constructible.

**Addition and subtraction:** Given line segments = and = , construct a circle centered at with radius (Fig. 2.10). Extend until it intersects the circle at . Then is a line segment, where <sup>=</sup> <sup>−</sup> and <sup>=</sup> <sup>+</sup> .

**Fig. 2.10** Construction of addition and subtraction

**Fig. 2.11a** Construction of multiplication **Fig. 2.11b** Construction of division

**Multiplication:** By similar triangles in Fig. 2.11a, (1/) <sup>=</sup> (/), so <sup>=</sup> . **Division:** By similar triangles in Fig. 2.11b, (1/) <sup>=</sup> (/), so <sup>=</sup> (/).

**Square roots:** Given a line segment <sup>=</sup> , construct <sup>=</sup> <sup>1</sup> <sup>+</sup> and a semicircle with as its diameter. Construct a perpendicular at and let be the intersection of the perpendicular and the circle (Fig. 2.12). ∠ is a right angle because it is subtended by a diameter. By similar triangles (ℎ/1) <sup>=</sup> (/ℎ), so <sup>ℎ</sup> <sup>2</sup> = and ℎ = √ .
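The square-root construction can be imitated with coordinates. In this sketch the diameter is placed on the $x$-axis from $(0, 0)$ to $(1 + a, 0)$ and the perpendicular is erected at distance 1 from the left end; this placement and the name `constructed_root` are assumptions made for illustration.

```python
import math

def constructed_root(a):
    """Height at which the perpendicular at x = 1 meets the semicircle on [0, 1 + a]."""
    radius = (1 + a) / 2
    center_x = radius          # the diameter runs from (0, 0) to (1 + a, 0)
    foot_x = 1.0               # the perpendicular splits the diameter into 1 and a
    return math.sqrt(radius ** 2 - (foot_x - center_x) ** 2)

if __name__ == "__main__":
    for a in (0.25, 2, 9):
        print(a, constructed_root(a), math.sqrt(a))
```

The height equals $\sqrt{a}$ because $r^2 - (1 - r)^2 = 2r - 1 = a$ when $r = (1 + a)/2$.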

To prove the converse of the theorem, we need to determine what expressions can be constructed by a straightedge and compass. There are three constructions:<sup>2</sup>


**Fig. 2.12** Construction of a square root

<sup>2</sup> For clarity these are illustrated for specific values rather than the most general equations.

**Fig. 2.13a** The point of intersection of two lines

**Fig. 2.13b** The points of intersection of a line and a circle

3. Two circles intersect in zero, one or two points (Fig. 2.14). The coordinates of the intersections can be derived from the equations of the two circles $(x-1)^2 + y^2 = 4$, $(x+1)^2 + y^2 = 4$. Subtracting the equations gives $x = 0$ and then $y^2 = 3$, so the points of intersection are $(0, \sqrt{3})$, $(0, -\sqrt{3})$. □

**Fig. 2.14** The points of intersection of two circles
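The third construction can be sketched in code to make visible that only the four arithmetic operations and one square root are needed. The function below is a standard formulation written for this illustration, not taken from the text; its name and argument order are assumptions.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles given as (center, radius); returns 0, 1 or 2 points."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)           # distance between the centers
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                              # concentric, too far apart, or nested
    a = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d) # distance from c1 to the common chord
    h = math.sqrt(max(r1 ** 2 - a ** 2, 0.0))  # half the length of the common chord
    mx = x1 + a * (x2 - x1) / d                # foot of the chord on the center line
    my = y1 + a * (y2 - y1) / d
    p = (mx + h * (y2 - y1) / d, my - h * (x2 - x1) / d)
    q = (mx - h * (y2 - y1) / d, my + h * (x2 - x1) / d)
    return [p] if h == 0 else [p, q]

if __name__ == "__main__":
    # The two circles of radius 2 centered at (1, 0) and (-1, 0).
    print(circle_intersections((1, 0), 2, (-1, 0), 2))
```

For the two circles of radius 2 centered at $(\pm 1, 0)$ the function returns the two points with $x = 0$ and $y^2 = 3$.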

#### **2.6 Constructible Numbers As Roots of Polynomials**

To show that a number is not constructible, we need to prove that it cannot be expressed using just integers and the operations {+, −, ×, /, √}.

We will show that constructible numbers are the roots of a certain class of polynomials and then prove that trisecting an angle and doubling a cube require the construction of roots of polynomials that are not in this class. Today these results are proved using field theory from abstract algebra, but here I give a proof that uses elementary mathematics. The proof is based on the following definition.

**Definition 2.2** The *depth* of an expression built from the integers and the operators {+, −, ×, /, √} is the maximum level of nesting of square roots.

*Example 2.1* Consider the following expression:

$$
\sqrt{17 + 3\sqrt{17} - \sqrt{34 - 2\sqrt{17}} - 2\sqrt{34 + 2\sqrt{17}}}
$$

The depth is 3 because at the right of the expression we have $\sqrt{17}$ which is nested within $\sqrt{34 + 2\sqrt{17}}$, which in turn is nested within $\sqrt{17 + \cdots - \cdots - 2\sqrt{34 + 2\sqrt{17}}}$.
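The definition of depth translates directly into a recursive function. In this sketch an expression is encoded as a nested tuple `(operator, operand, ...)` with integers as leaves; this encoding is an assumption chosen for illustration.

```python
def depth(expr):
    """Maximum nesting level of square roots in a nested-tuple expression."""
    if isinstance(expr, int):
        return 0                          # integers contain no square roots
    op, *args = expr
    inner = max(depth(arg) for arg in args)
    return inner + 1 if op == "sqrt" else inner

# The expression of Example 2.1, built from the inside out.
s17 = ("sqrt", 17)
term_a = ("sqrt", ("-", 34, ("*", 2, s17)))
term_b = ("*", 2, ("sqrt", ("+", 34, ("*", 2, s17))))
example = ("sqrt", ("-", ("-", ("+", 17, ("*", 3, s17)), term_a), term_b))

if __name__ == "__main__":
    print(depth(s17), depth(term_a), depth(example))
```

Arithmetic operators pass the depth of their deepest operand through unchanged; only a square root adds one level.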

**Theorem 2.7** *An expression of depth $n$ can be expressed as $a + b\sqrt{c}$ where $a, b, c$ are expressions of depth at most $n-1$.*

*Proof* Simple computations show that the expressions $(a\_1 + b\_1\sqrt{c})$ *op* $(a\_2 + b\_2\sqrt{c})$ for the operators *op* $\in \{+, -, \times\}$ result in expressions $a + b\sqrt{c}$ with $a, b$ of depth at most $n-1$. For division the computation is a bit more complicated:

$$\begin{split} \frac{a\_1 + b\_1 \sqrt{c}}{a\_2 + b\_2 \sqrt{c}} &= \frac{(a\_1 + b\_1 \sqrt{c})(a\_2 - b\_2 \sqrt{c})}{(a\_2 + b\_2 \sqrt{c})(a\_2 - b\_2 \sqrt{c})} \\ &= \frac{a\_1 a\_2 - b\_1 b\_2 c}{a\_2^2 - b\_2^2 c} + \frac{a\_2 b\_1 - a\_1 b\_2}{a\_2^2 - b\_2^2 c} \sqrt{c} \,, \end{split}$$

which is of the form $a + b\sqrt{c}$ with $a, b$ of depth at most $n-1$. Finally, the square root of an expression of depth $n-1$ is an expression of depth $n$. □
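The division formula can be checked with exact rational arithmetic. The sketch below covers only the simplest case, where $a, b, c$ are rational (depth 0), and represents $a + b\sqrt{c}$ as the pair `(a, b)` for a fixed `c`; the representation and the function names are assumptions made for illustration.

```python
from fractions import Fraction as F

def multiply(u, v, c):
    """(a1 + b1 sqrt(c)) * (a2 + b2 sqrt(c)) as a pair (a, b)."""
    (a1, b1), (a2, b2) = u, v
    return (a1 * a2 + b1 * b2 * c, a1 * b2 + a2 * b1)

def divide(u, v, c):
    """(a1 + b1 sqrt(c)) / (a2 + b2 sqrt(c)) using the formula in the proof."""
    (a1, b1), (a2, b2) = u, v
    denominator = a2 * a2 - b2 * b2 * c    # nonzero when sqrt(c) is irrational and v != 0
    return ((a1 * a2 - b1 * b2 * c) / denominator,
            (a2 * b1 - a1 * b2) / denominator)

if __name__ == "__main__":
    c = F(2)
    u = (F(1), F(2))       # 1 + 2 sqrt(2)
    v = (F(3), F(-1))      # 3 - sqrt(2)
    q = divide(u, v, c)
    print(q)               # the quotient as a pair (a, b)
    print(multiply(q, v, c) == u)
```

Multiplying the quotient by the divisor returns the dividend exactly, so the quotient again has the form $a + b\sqrt{c}$ with rational $a, b$.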

**Theorem 2.8** *Let $p(x)$ be a monic cubic polynomial with rational coefficients:*

$$p(x) = x^3 + a\_2 x^2 + a\_1 x + a\_0 \,,$$

*and let $r = a + b\sqrt{c}$ be a root of $p(x)$ of minimal depth $n$, where $a, b, c$ are of depth (at most) $n-1$. Then $r' = a - b\sqrt{c}$ is a root of $p(x)$ and $r \neq r'$.*

*Proof* Let us compute $p(r)$ which is equal to 0 since $r$ is a root:

$$\begin{aligned} &(a+b\sqrt{c})^3 + a\_2(a+b\sqrt{c})^2 + a\_1(a+b\sqrt{c}) + a\_0 \\ &\quad= (a^3+3a^2b\sqrt{c}+3ab^2c+b^3c\sqrt{c}) + a\_2(a^2+2ab\sqrt{c}+b^2c) + a\_1(a+b\sqrt{c}) + a\_0 \\ &\quad= (a^3+3ab^2c+a\_2a^2+a\_2b^2c+a\_1a+a\_0) + (3a^2b+b^3c+2a\_2ab+a\_1b)\sqrt{c} \\ &\quad= d + e\sqrt{c} = 0 \,, \end{aligned}$$

where $d, e$ are expressions of depth $n-1$ formed from the rational coefficients and $a, b, c$. If $e \neq 0$ then $\sqrt{c} = -d/e$, so $a + b\sqrt{c}$ can be expressed as an expression of depth $n-1$, contradicting the assumption that $a + b\sqrt{c}$ is of minimal depth $n$. Since $\sqrt{c} \neq 0$ and is of depth $n$, for $d + e\sqrt{c}$ to be zero it must be that $d = e = 0$.

Consider now $r' = a - b\sqrt{c}$. By examining the above computation we see that $p(r') = d - e\sqrt{c} = 0 - 0 \cdot \sqrt{c} = 0$, so $r'$ is also a root of $p$.

If $r = r'$ then $0 = r - r' = 2b\sqrt{c}$, which is true only if $b = 0$ so $r, r'$ would be of depth $n-1$, again contradicting the assumption. □

**Theorem 2.9** *If a monic cubic polynomial with rational coefficients:*

$$p(x) = x^3 + a\_2 x^2 + a\_1 x + a\_0$$

*has no rational roots then none of its roots is constructible.*

*Proof* By the Fundamental Theorem of Algebra (Thm. 16.1) $p(x)$ has three roots $r\_1, r\_2, r\_3$. Let $r\_1 = a + b\sqrt{c}$ be a root of minimal depth $n$. By the assumption that there are no rational roots, $n \geq 1$, and therefore $b \neq 0$ and $c \neq 0$. By Thm. 2.8, $r\_2 = a - b\sqrt{c}$ is also a root. Perform the following multiplication:

$$(\mathbf{x} - r\_1)(\mathbf{x} - r\_2)(\mathbf{x} - r\_3) = \mathbf{x}^3 - (r\_1 + r\_2 + r\_3)\mathbf{x}^2 \tag{2.3a}$$

$$+(r\_1r\_2 + r\_1r\_3 + r\_2r\_3) \mathbf{x} - r\_1r\_2r\_3 \tag{2.3b}$$

$$a\_2 = -(r\_1 + r\_2 + r\_3) \tag{2.3c}$$

$$r\_3 = -(a\_2 + r\_1 + r\_2) \,. \tag{2.3d}$$

Since $a\_2$ is rational and $a$ is of depth at most $n-1$, so is:

$$r\_3 = -a\_2 - (r\_1 + r\_2) = -a\_2 - 2a \,,$$

contradicting the assumption that $r\_1$ is a root of minimal depth $n$. □

#### **2.7 Impossibility of the Classical Constructions**

**Theorem 2.10** $\sqrt[3]{2}$ *is irrational.*

*Proof* Assume that $\sqrt[3]{2}$ is rational and equal to $p/q$ where $p, q$ are integers with no common factors other than $\pm 1$. Then:

$$\begin{aligned} (p/q)^3 &= (\sqrt[3]{2})^3 \\ p^3 &= 2q^{3} \,, \end{aligned}$$

so $p$ must be divisible by 2, say $p = 2r$. Now:

$$\begin{aligned} 8r^3 &= 2q^3\\ q^3 &= 4r^3 \,, \end{aligned}$$

so $q$ is divisible by 2, contradicting the assumption that $p, q$ have no common factor. □

**Theorem 2.11** $x^3 - 2$ *has no rational roots so it is impossible to double a cube with a straightedge and compass.*

*Proof* One of its roots is $\sqrt[3]{2}$ which by Thm. 2.10 is irrational. The other roots are the roots of the quadratic equation $x^2 + \sqrt[3]{2}\,x + (\sqrt[3]{2})^2 = 0$ obtained by dividing $x^3 - 2$ by $x - \sqrt[3]{2}$. It is easy to check that its roots are not rational (in fact, not even real). □

**Theorem 2.12** *It is impossible to trisect an arbitrary angle with a straightedge and compass.*

*Proof* It is sufficient to show the impossibility for one angle. Let us try to trisect 60° to obtain 20°. By Thm. A.6:

$$\cos 3\alpha = 4\cos^3 \alpha - 3\cos \alpha$$

$$\cos 60^\circ = 4\cos^3 20^\circ - 3\cos 20^\circ \,.$$

Denote $x = \cos 20^\circ$ and $2x$ by $y$. Since $\cos 60^\circ = 1/2$ we have:

$$\begin{aligned} 4x^3 - 3x - \frac{1}{2} &= 0 \\ 8x^3 - 6x - 1 &= 0 \\ y^3 - 3y - 1 &= 0 \end{aligned}$$

To prove that the polynomial $y^3 - 3y - 1$ has no rational roots suppose that $y = a/b$ is a rational root with $a, b$ having no common factor other than $\pm 1$. Then:

$$(a/b)^3 - 3(a/b) - 1 = 0 \tag{2.4a}$$

$$a^3 - 3ab^2 = b^3\tag{2.4b}$$

$$a(a^2 - 3b^2) = b^3 \tag{2.4c}$$

$$a^3 = b(b^2 + 3ab)\,. \tag{2.4d}$$

By Eq. 2.4c, $b^3$ must be divisible by $a$, and by Eq. 2.4d, $a^3$ must be divisible by $b$. Since $a, b$ have no common factor, this is possible only if $a = \pm 1$ and $b = \pm 1$, so $a/b = \pm 1$. By computation, $y = a/b = 1$ and $y = a/b = -1$ are not roots of the polynomial. □

An alternate way of proving the impossibility of the constructions is to use the following theorem which we present without proof.

**Theorem 2.13** *If a monic polynomial $p(x) = x^n + a\_{n-1}x^{n-1} + \cdots + a\_0$ with integer coefficients has rational roots then it has integer roots.*

To show the impossibility of duplicating a cube we need to show that:

$$x^3 - 2 = (x - r\_2)(x - r\_1)(x - r\_0)$$

has no integer roots. Since $r\_0 r\_1 r\_2 = 2$, every integer root must divide 2, so the only possible integer roots are $\pm 1$, $\pm 2$. A quick computation shows that none of them is a root.

To show the impossibility of trisecting an angle we need to show that $y^3 - 3y - 1$ has no integer roots. An integer root must divide $-1$ but neither $1$ nor $-1$ is a root.
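The "quick computation" can be automated. The sketch below assumes a monic polynomial with integer coefficients and a nonzero constant term, given as a list from the highest power down; the function names are illustrative.

```python
def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def evaluate(coeffs, x):
    """Evaluate a polynomial (highest power first) at x by Horner's rule."""
    value = 0
    for coefficient in coeffs:
        value = value * x + coefficient
    return value

def integer_roots(coeffs):
    """Integer roots of a monic integer polynomial with nonzero constant term."""
    assert coeffs[0] == 1 and coeffs[-1] != 0
    candidates = [sign * d for d in divisors(coeffs[-1]) for sign in (1, -1)]
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)

if __name__ == "__main__":
    print(integer_roots([1, 0, 0, -2]))    # x^3 - 2
    print(integer_roots([1, 0, -3, -1]))   # y^3 - 3y - 1
    print(integer_roots([1, 0, 0, -8]))    # x^3 - 8, for contrast
```

Both polynomials from the text yield an empty list, while $x^3 - 8$ is included only to show what a positive result looks like.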

#### **What Is the Surprise?**

Underwood Dudley has made an extensive study of what he calls "cranks" who waste years of their lives trying to trisect angles with a straightedge and compass. Not only do they delude themselves into thinking that this is possible, but, even worse, they think that a solution would be important. Of course, a solution would have no practical use, since tools such as the neusis and quadratrix can solve the problem exactly. The sheer number of such constructions is surprising, especially since many of them are clever and achieve good approximations. Computing the formulas associated with the constructions is an excellent exercise in trigonometry.

It is also surprising that proofs of the impossibility of these geometric constructions are purely algebraic using properties of roots of polynomials.

#### **Sources**

Wikipedia [51, 58, 62] is a good source for the constructions in this chapter. The two approximate trisections are from [15, pp. 67–68, 95–96]. The second example is attributed to the famous philosopher Thomas Hobbes. Both [31, pp. 48–49] and [15, pp. 6–7] discuss trisection using the quadratrix. The doubling of the cube using a neusis is taken from [14].

A rigorous treatment of constructibility can be found in textbooks on abstract algebra such as [17], which contains a general proof of the converse of Thm 2.6 in Sect. 32. Theorem 2.13 is Thm. 23.11 of [17]. A relatively accessible presentation of Wantzel's proof can be found in [48]. My presentation of constructibility is based upon the presentations in [11, Chap. III] and [27].


## **Chapter 3 Squaring the Circle**

Squaring the circle, the construction of a square with the same area as a given circle, is one of the three construction problems that the Greeks posed but were unable to solve. Unlike trisecting the angle and doubling the cube, where the impossibility follows from properties of the roots of polynomials, the impossibility of squaring the circle follows from the transcendentality of $\pi$: it is not the root of any polynomial with rational coefficients. This is a difficult theorem that was proved in 1882 by Carl von Lindemann.

Approximations to $\pi \approx 3.14159265359$ have been known since ancient times. Some simple but reasonably accurate approximations are:

$$
\frac{22}{7} \approx 3.142857, \quad \frac{333}{106} \approx 3.141509, \quad \frac{355}{113} \approx 3.141593\,\,.
$$

We present three constructions by a straightedge and compass of approximations to $\pi$. One is by Adam Kochański (Sect. 3.1) and two are by Ramanujan (Sects. 3.2, 3.3). Section 3.4 shows how to square the circle using the quadratrix.

The following table shows the formulas for the lengths that are constructed, their approximate values, the difference between these values and the value of $\pi$, and the error in meters that results if the approximation is used to compute the circumference of the earth given that its radius is 6378 km.
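The quantities described in this paragraph can be recomputed from the formulas stated in Theorems 3.1 to 3.3. This is a minimal sketch, assuming the error is measured as $2r\,|\text{approximation} - \pi|$ with $r = 6378$ km; the function name `pi_table` and the row labels are illustrative.

```python
import math

EARTH_RADIUS_M = 6378e3   # radius of the earth in meters, as given in the text

def pi_table():
    """Rows of (label, value, difference from pi, circumference error in meters)."""
    approximations = [
        ("Kochanski: sqrt(40/3 - 2 sqrt(3))", math.sqrt(40 / 3 - 2 * math.sqrt(3))),
        ("Ramanujan 1: 355/113", 355 / 113),
        ("Ramanujan 2: (9^2 + 19^2/22)^(1/4)", (9 ** 2 + 19 ** 2 / 22) ** 0.25),
    ]
    rows = []
    for label, value in approximations:
        difference = abs(value - math.pi)
        rows.append((label, value, difference, 2 * EARTH_RADIUS_M * difference))
    return rows

if __name__ == "__main__":
    for label, value, difference, error_m in pi_table():
        print(f"{label:38s} {value:.11f} {difference:.3e} {error_m:.3e} m")
```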



## **3.1 Kochański's Construction**

#### **Construction (Fig. 3.1):**


**Fig. 3.1** Kochański's approximation to $\pi$

#### **Theorem 3.1**

$$\overline{BH} = \sqrt{\frac{40}{3} - 2\sqrt{3}} \approx \pi \,.$$

*Proof* Figure 3.2 is an enlarged extract from Fig. 3.1, where dashed line segments have been added. Since all the circles are unit circles the lengths of the dashed lines are 1. It follows that is a rhombus so its diagonals are perpendicular to and bisect each other at the point labeled . <sup>=</sup> <sup>1</sup>/2.


**Fig. 3.2** Detail from Kochański's construction

The diagonal forms two equilateral triangles △ , △ so <sup>∠</sup> <sup>=</sup> <sup>60</sup>◦ . Since the tangent forms a right angle with the radius , ∠ = 30◦ . Now:

$$\begin{aligned} \frac{1/2}{\overline{EA}} &= \cos 30^\circ = \frac{\sqrt{3}}{2} \\ \overline{EA} &= \frac{1}{\sqrt{3}} \\ \overline{AH} &= 3 - \overline{EA} = \left(3 - \frac{1}{\sqrt{3}}\right) = \frac{3\sqrt{3} - 1}{\sqrt{3}} \end{aligned}$$

△ is a right triangle and <sup>=</sup> <sup>3</sup> <sup>−</sup> , so by Pythagoras's Theorem:

$$\begin{aligned} \overline{BH}^2 &= \overline{AB}^2 + \overline{AH}^2 \\ &= 4 + \frac{27 - 6\sqrt{3} + 1}{3} = \frac{40}{3} - 2\sqrt{3} \\ \overline{BH} &= \sqrt{\frac{40}{3} - 2\sqrt{3}} \approx 3.141533387 \approx \pi \,. \end{aligned}$$

## **3.2 Ramanujan's First Construction**

#### **Construction (Fig. 3.3):**


**Fig. 3.3** Ramanujan's first construction

**Theorem 3.2** $\overline{RD}^2 = \frac{355}{113} \approx \pi$*.*

*Proof* <sup>=</sup> by construction and by Pythagoras's Theorem for △:

$$
\overline{RS} = \overline{QT} = \sqrt{1^2 - \left(\frac{2}{3}\right)^2} = \frac{\sqrt{5}}{3} \,.
$$

<sup>∠</sup> is subtended by a diameter so △ is a right triangle. By Pythagoras's theorem:

$$\overline{PS} = \sqrt{2^2 - \left(\frac{\sqrt{5}}{3}\right)^2} = \sqrt{4 - \frac{5}{9}} = \frac{\sqrt{31}}{3} \,.$$

By construction <sup>∥</sup> so △ ∼ △ and:

$$\begin{aligned} \frac{\overline{PM}}{\overline{PO}} &= \frac{\overline{PS}}{\overline{PR}}\\ \frac{\overline{PM}}{1} &= \frac{\sqrt{31}/3}{2} \\ \overline{PM} &= \frac{\sqrt{31}}{6} \end{aligned}$$

By construction <sup>∥</sup> so △ ∼ △ and:

$$\begin{aligned} \frac{\overline{PN}}{\overline{PT}} &= \frac{\overline{PS}}{\overline{PR}}\\ \frac{\overline{PN}}{5/3} &= \frac{\sqrt{31}/3}{2} \\ \overline{PN} &= \frac{5\sqrt{31}}{18} \\ \overline{MN} &= \overline{PN} - \overline{PM} = \sqrt{31} \left(\frac{5}{18} - \frac{1}{6}\right) = \frac{\sqrt{31}}{9} \end{aligned}$$

△ is a right triangle because <sup>∠</sup> is subtended by a diameter. By construction = and by Pythagoras's Theorem:

$$
\overline{RK} = \sqrt{2^2 - \left(\frac{\sqrt{31}}{6}\right)^2} = \frac{\sqrt{113}}{6} \,.
$$

△ is a right triangle because is a tangent so <sup>∠</sup> is a right angle. <sup>=</sup> by construction and by Pythagoras's Theorem:

$$\overline{RL} = \sqrt{2^2 + \left(\frac{\sqrt{31}}{9}\right)^2} = \frac{\sqrt{355}}{9}.$$

By construction <sup>=</sup> <sup>=</sup> <sup>3</sup>/2 and <sup>∥</sup> . By similar triangles:

$$\begin{aligned} \frac{\overline{RD}}{\overline{RC}} &= \frac{\overline{RL}}{\overline{RK}}\\ \frac{\overline{RD}}{3/2} &= \frac{\sqrt{355}/9}{\sqrt{113}/6} \\ \overline{RD} &= \sqrt{\frac{355}{113}} \\ \overline{RD}^2 &= \frac{355}{113} \approx 3.14159292035 \approx \pi \end{aligned}$$

In Fig. 3.4 the line segments are labeled with their lengths. □

**Fig. 3.4** Ramanujan's first construction with labeled line segments
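The chain of lengths in the proof can be replayed step by step. The sketch below uses only the values and ratios that appear in the displayed equations; the variable names mirror the segment names, and the function name is an assumption made for illustration.

```python
import math

def ramanujan_first():
    """Recompute the segment lengths of the proof and return RD squared."""
    rs = math.sqrt(1 ** 2 - (2 / 3) ** 2)   # RS = QT = sqrt(5)/3
    ps = math.sqrt(2 ** 2 - rs ** 2)        # PS = sqrt(31)/3
    pm = 1 * ps / 2                         # PM/PO = PS/PR with PO = 1, PR = 2
    pn = (5 / 3) * ps / 2                   # PN/PT = PS/PR with PT = 5/3
    mn = pn - pm                            # MN = sqrt(31)/9
    rk = math.sqrt(2 ** 2 - pm ** 2)        # right triangle with a leg of length PM
    rl = math.sqrt(2 ** 2 + mn ** 2)        # right triangle with a leg of length MN
    rd = (3 / 2) * rl / rk                  # RD/RC = RL/RK with RC = 3/2
    return rd ** 2

if __name__ == "__main__":
    print(ramanujan_first(), 355 / 113, math.pi)
```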

#### **3.3 Ramanujan's Second Construction**

#### **Construction (Fig. 3.5):**


**Fig. 3.5** Ramanujan's second construction

**Theorem 3.3** $3\sqrt{\overline{SO}} = \left(9^2 + \frac{19^2}{22}\right)^{1/4} \approx \pi$*.*

*Proof* △ is a right triangle so by Pythagoras's Theorem <sup>=</sup> √ 2 and:

$$
\overline{NB} = \sqrt{2} - 2/3 \ .
$$

△ is isosceles so ∠ = ∠ = 45°. By the Law of Cosines:

$$\begin{split} \overline{AN}^2 &= \overline{BA}^2 + \overline{BN}^2 - 2 \cdot \overline{BA} \cdot \overline{BN} \cdot \cos \angle NBA\\ &= 2^2 + \left(\sqrt{2} - \frac{2}{3}\right)^2 - 2 \cdot 2 \cdot \left(\sqrt{2} - \frac{2}{3}\right) \cdot \frac{\sqrt{2}}{2} = \frac{22}{9} \\ \overline{AN} &= \sqrt{\frac{22}{9}} \ . \end{split}$$

Again by the Law of Cosines:

$$\begin{split} \overline{AM}^2 &= \overline{BA}^2 + \overline{BM}^2 - 2 \cdot \overline{BA} \cdot \overline{BM} \cdot \cos \angle MBA \\ &= 2^2 + \left(\sqrt{2} - \frac{1}{3}\right)^2 - 2 \cdot 2 \cdot \left(\sqrt{2} - \frac{1}{3}\right) \cdot \frac{\sqrt{2}}{2} = \frac{19}{9} \\ \overline{AM} &= \sqrt{\frac{19}{9}} \ . \end{split}$$

By construction <sup>∥</sup> so △ ∼ △ , and by construction <sup>=</sup> :

$$\begin{aligned} \frac{\overline{AQ}}{\overline{AM}} &= \frac{\overline{AP}}{\overline{AN}} = \frac{\overline{AM}}{\overline{AN}}\\ \overline{AQ} &= \frac{\overline{AM}^2}{\overline{AN}} = \frac{19/9}{\sqrt{22/9}} = \frac{19}{3\sqrt{22}} \,. \end{aligned}$$

By construction <sup>∥</sup> so △ ∼ △ and:

$$\begin{aligned} \frac{\overline{AR}}{\overline{AQ}} &= \frac{\overline{AT}}{\overline{AO}}\\ \overline{AR} &= \overline{AQ} \cdot \frac{\overline{AT}}{\overline{AO}} = \frac{19}{3\sqrt{22}} \cdot \frac{1/3}{1} = \frac{19}{9\sqrt{22}} \,. \end{aligned}$$

By construction <sup>=</sup> and △ is a right triangle because is a tangent. By Pythagoras's Theorem:

$$\begin{aligned} \overline{SO} &= \sqrt{1^2 + \left(\frac{19}{9\sqrt{22}}\right)^2} \\ 3\sqrt{\overline{SO}} &= 3\left(1^2 + \frac{19^2}{9^2 \cdot 22}\right)^{1/4} = \left(9^2 + \frac{19^2}{22}\right)^{1/4} \approx 3.14159265258 \approx \pi \,. \end{aligned}$$

In Fig. 3.6 the line segments are labeled with their lengths. □

**Fig. 3.6** Ramanujan's second construction with labeled line segments
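As with the first construction, the proof can be replayed numerically. The sketch below follows the displayed equations only; the variable names mirror the segment names and the function name is an assumption made for illustration.

```python
import math

def ramanujan_second():
    """Recompute the lengths of the proof; returns (AN^2, AM^2, 3 * sqrt(SO))."""
    cos45 = math.sqrt(2) / 2
    bn = math.sqrt(2) - 2 / 3
    bm = math.sqrt(2) - 1 / 3
    an2 = 2 ** 2 + bn ** 2 - 2 * 2 * bn * cos45   # Law of Cosines, expected 22/9
    am2 = 2 ** 2 + bm ** 2 - 2 * 2 * bm * cos45   # Law of Cosines, expected 19/9
    aq = am2 / math.sqrt(an2)                     # AQ = AM^2 / AN
    ar = aq * (1 / 3) / 1                         # AR = AQ * AT / AO with AT = 1/3, AO = 1
    so = math.sqrt(1 ** 2 + ar ** 2)              # right triangle with legs 1 and AR
    return an2, am2, 3 * math.sqrt(so)

if __name__ == "__main__":
    an2, am2, approximation = ramanujan_second()
    print(an2, am2, approximation, math.pi)
```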

#### **3.4 Squaring a Circle Using a Quadratrix**

The quadratrix is described in Sect. 2.4.

Let = be the distance that the horizontal straightedge has moved down the -axis, and let be the corresponding angle between the rotating straightedge and the -axis. Let the position of the joint of the two straightedges. The locus of is the quadratrix curve.

Let be the projection of onto the -axis and let be the position of the joint when both straightedges reach the -axis, that is, is the intersection of the quadratrix curve and the -axis (Fig. 3.7).

**Fig. 3.7** Squaring the circle with a quadratrix

## **Theorem 3.4** <sup>=</sup> <sup>2</sup>/*.*

*Proof* Let <sup>=</sup> <sup>=</sup> <sup>=</sup> <sup>1</sup> <sup>−</sup> . Since on a quadratrix decreases at the same rate that increases:

$$
\frac{1-t}{1} = \frac{\theta}{\pi/2}
$$

$$
\theta = \frac{\pi}{2}(1-t)\,.
$$

Let <sup>=</sup> <sup>=</sup> . Then tan <sup>=</sup> / so:

$$\mathbf{x} = \frac{\mathbf{y}}{\tan \theta} = \mathbf{y} \cot \theta = \mathbf{y} \cot \frac{\pi}{2} (1 - t) = \mathbf{y} \cot \frac{\pi}{2} \mathbf{y} \,. \tag{3.1}$$

We usually express a function as $y = f(x)$ but it can also be expressed as $x = f(y)$.

To obtain the value of $x$ at this intersection we can't simply plug $y = 0$ into Eq. 3.1, because $\cot 0$ is not defined, so let us compute the limit of $x$ as $y$ goes to 0. First perform the substitution $z = (\pi/2)y$ to obtain:

$$\mathbf{x} = \mathbf{y} \cot \frac{\pi}{2} \mathbf{y} = \frac{2}{\pi} \left( \frac{\pi}{2} \mathbf{y} \cot \frac{\pi}{2} \mathbf{y} \right) = \frac{2}{\pi} (z \cot z) \,, $$

and then take the limit:

$$\lim\_{z \to 0} \mathbf{x} = \frac{2}{\pi} \lim\_{z \to 0} (z \cot z) = \frac{2}{\pi} \lim\_{z \to 0} \left( \frac{z \cos z}{\sin z} \right) = \frac{2}{\pi} \lim\_{z \to 0} \left( \frac{\cos z}{(\sin z)/z} \right) = \frac{2}{\pi} \frac{\cos 0}{1} = \frac{2}{\pi} \,,$$

where we have used $\lim\_{z \to 0} (\sin z / z) = 1$ (Thm. A.12). □
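The limit can be observed numerically by evaluating Eq. 3.1 for values of $y$ approaching 0. This is a small sketch; the function name `x_of_y` is illustrative.

```python
import math

def x_of_y(y):
    """x = y * cot(pi * y / 2) from Eq. 3.1, valid for 0 < y < 1."""
    return y / math.tan(math.pi * y / 2)

if __name__ == "__main__":
    for y in (0.5, 0.1, 0.01, 0.001, 1e-6):
        print(f"{y:8.6f}  {x_of_y(y):.12f}")
    print(f"2/pi      {2 / math.pi:.12f}")
```

The values increase toward $2/\pi$ as $y$ decreases, consistent with $z \cot z \approx 1 - z^2/3$ for small $z$.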

#### **What Is the Surprise?**

It is surprising that such accurate approximations to $\pi$ can be constructed. Of course one can't help but be astonished by Ramanujan's clever constructions.

#### **Sources**

Kochański's construction appears in [7]. Ramanujan's constructions are from [38, 39]. Squaring the circle using the quadratrix is from [31, pp. 48–49] and [62].


## **Chapter 4 The Five-Color Theorem**

Maps use colors to distinguish one region from another by ensuring that adjacent regions are colored with different colors. In 1852 Francis Guthrie noticed that a map of the counties of England could be colored using only four colors. The claim that four colors suffice to color any planar map is called the *four-color theorem* and was only proved in 1976 by Kenneth Appel and Wolfgang Haken. They used sophisticated mathematical arguments to show that if there is a counterexample (a map needing more than four colors), it had to be associated with one of 1834 configurations. They then used a computer to check these configurations.

While the four-color theorem is extremely difficult to prove, the proofs of the five- and six-color theorems are relatively simple (Sects. 4.5, 4.6). On the way to proving these theorems, we define planar maps and graphs (Sect. 4.1), prove Euler's formula (Sect. 4.2) and show that a planar graph must have a vertex whose degree is less than or equal to five. In Sect. 4.3 Euler's formula is used to show that two graphs are not planar.

In 1879 Alfred B. Kempe published a proof of the four-color theorem, but in 1890 Percy J. Heawood showed that the proof is incorrect. In Sect. 4.7 we present Kempe's flawed proof and Heawood's demonstration that it is not correct.

#### **4.1 Planar Maps and Graphs**

**Definition 4.1** A *planar map* is a set of regions in the plane separated by boundaries. A *coloring* of a map is an assignment of a color to each region such that regions sharing a boundary are assigned different colors.

Figure 4.1a shows a five-coloring of a planar map with ten regions. Figure 4.1b shows a four-coloring of the same map.

**Fig. 4.1a** Five-coloring of a planar map **Fig. 4.1b** Four-coloring of a planar map

**Definition 4.2** A *graph* is a set of *vertices* $V$ and a set of *edges* $E$, such that each edge is incident with exactly two vertices.

A *planar graph* is a graph such that no edges cross each other. In a planar graph, areas enclosed by a set of edges are called *faces*.

A *coloring* of a planar graph is an assignment of colors to vertices such that no two vertices of the same color are connected by an edge.

Planar maps and planar graphs are dual and it is convenient to investigate coloring problems in graphs rather than maps.

**Theorem 4.1** *Given a planar map, a planar graph can be constructed such that for each coloring of the regions of the map there is a coloring of the vertices of the graph, and conversely.*

*Proof* Construct one vertex for each region and construct an edge between two vertices if and only if the corresponding regions share a boundary. □

*Example 4.1* Figure 4.2a shows the planar map from Fig. 4.1b and the vertices associated with the regions. Figure 4.2b shows the planar graph that corresponds to the map.

We can further limit our graphs to those whose faces are triangular.

**Definition 4.3** A graph is *triangular* if all its faces are bounded by three edges. A graph can be *triangulated* if edges can be added so that the graph is triangular. We also say that there is a *triangulation* of the graph.

*Example 4.2* The faces in the planar graph in Fig. 4.2b are triangular since each one is bounded by three edges. The edges are curved so the faces are not triangles, which are polygons whose three edges are straight line segments.

**Fig. 4.2a** Associating vertices with the regions of a planar map

**Fig. 4.2b** The planar graph that corresponds to the planar map

**Fáry's Theorem** states that any triangular planar graph can be transformed into an equivalent planar graph whose edges are straight line segments. Therefore, with no loss of generality, proofs can be restricted to planar graphs whose faces are triangles.

*Example 4.3* Fig. 4.3 (left) shows that a square can be two-colored, but if it is triangulated (center), four colors are necessary. Our goal is to prove that *all* graphs can be $n$-colored for some $n$. If the triangulated graph is $n$-colored, so is the original graph, because deleting the extra edges does not invalidate the coloring (right).

**Fig. 4.3** Coloring a triangulated graph

#### **4.2 Euler's Formula**

**Theorem 4.2** *Let $G$ be a connected planar graph with $V$ vertices, $E$ edges and $F$ faces. Then* $V - E + F = 2$*.*

*Proof* By induction on the number of edges. If the number of edges in the graph is zero, there is only a single vertex and a single face, so $1 - 0 + 1 = 2$. Otherwise, there is at least one edge $e$ and it connects two vertices $v_1, v_2$. Delete edge $e$.

*Case 1:* The graph becomes disconnected (Fig. 4.4a). Merge $v_1$ with $v_2$ (Fig. 4.4b). The resulting graph $G'$ is a planar connected graph and has fewer edges than $G$, so by the induction hypothesis $(V - 1) - (E - 1) + F = 2$ since the number of vertices is also reduced by one. Simplifying, we get $V - E + F = 2$ for $G$.

**Fig. 4.4a** Removing an edge disconnects the graph

*Case 2:* The graph remains connected (Fig. 4.5a). $G'$ has fewer edges than $G$ (Fig. 4.5b), so by the induction hypothesis $V - (E - 1) + (F - 1) = 2$ since removing the edge joins two faces into one. Simplifying, we get $V - E + F = 2$ for $G$. □

**Fig. 4.5a** Removing an edge does not disconnect the graph

**Fig. 4.5b** The graph remains connected and has fewer edges

**Theorem 4.3** *Let $G$ be a connected, triangulated planar graph with $E$ edges and $V$ vertices. Then* $E = 3V - 6$*.*

*Proof* Each face is bounded by three edges, so $E = 3F/2$, where we divided by 2 because each edge has been counted twice, once for each face it bounds. Substituting $F = 2E/3$ into Euler's formula:

$$\begin{aligned} E &= V + F - 2 \\ &= V + 2E/3 - 2, \end{aligned}$$

so $E/3 = V - 2$ and $E = 3V - 6$. □

**Fig. 4.6a** Fewer edges than the upper limit **Fig. 4.6b** In a triangulated graph the number of edges is maximal

*Example 4.4* The planar graph in Fig. 4.2b has 10 vertices and $3 \cdot 10 - 6 = 24$ edges.

**Theorem 4.4** *Let $G$ be a connected planar graph. Then* $E \le 3V - 6$*.*

*Proof* Triangulate $G$ to obtain $G'$. $E' = 3V' - 6$ by Thm. 4.3. Now remove edges from $G'$ to obtain $G$. The number of vertices does not change so $E \le E' = 3V - 6$. □

*Example 4.5* The graph in Fig. 4.6a has 8 edges and 6 vertices and $8 < 3 \cdot 6 - 6 = 12$. Figure 4.6b shows a triangulated graph with 6 vertices and $3 \cdot 6 - 6 = 12$ edges.
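The edge bound can be checked mechanically on the counts quoted in the examples. The following is a minimal sketch; the function names are illustrative and not from the text, and the test is only a necessary condition: a graph that passes it need not be planar.

```python
def satisfies_edge_bound(num_vertices: int, num_edges: int) -> bool:
    """Necessary condition E <= 3V - 6 for a connected planar graph (V >= 3)."""
    return num_edges <= 3 * num_vertices - 6


def is_edge_maximal(num_vertices: int, num_edges: int) -> bool:
    """True when E = 3V - 6, as in a triangulated planar graph."""
    return num_edges == 3 * num_vertices - 6


# Vertex and edge counts taken from Examples 4.4 and 4.5.
fig_4_6a = (6, 8)
fig_4_6b = (6, 12)
fig_4_2b = (10, 24)
```

Under these assumptions `fig_4_6a` satisfies the bound strictly, while `fig_4_6b` and `fig_4_2b` meet it with equality.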

#### **4.3 Non-planar Graphs**

Let us take a short detour to show how Thms. 4.2 and 4.4 can be used to prove that certain graphs are not planar.

**Theorem 4.5** *$K_5$, the complete graph on five vertices, is not planar (Fig. 4.7a).*

*Proof* For $K_5$, $V = 5$ and $E = 10$. By Thm. 4.4 the number of edges must be less than or equal to $3 \cdot 5 - 6 = 9$ so the graph is not planar. □

**Theorem 4.6** *$K_{3,3}$, the bipartite graph with three vertices on each side, is not planar (Fig. 4.8a).*

*Proof* $V = 6$ and $E = 9$. By Thm. 4.2 if $K_{3,3}$ is planar, $F = E - V + 2 = 9 - 6 + 2 = 5$. But each face is bounded by four edges (Fig. 4.8b), so $E = 4F/2 = 10 \ne 9$. □
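The arithmetic in both proofs can be reproduced with a few lines of Python. This is a sketch that only recomputes the counts used above; the helper names are hypothetical, and the face count for $K_{3,3}$ follows the text's assumption that every face is bounded by four edges.

```python
from itertools import combinations


def complete_graph_edges(n):
    """Edges of the complete graph on vertices 0..n-1."""
    return list(combinations(range(n), 2))


def complete_bipartite_edges(m, n):
    """Edges of the complete bipartite graph with sides 0..m-1 and m..m+n-1."""
    return [(i, m + j) for i in range(m) for j in range(n)]


# K5: compare the edge count with the bound 3V - 6 of Thm. 4.4.
k5_vertices = 5
k5_edges = len(complete_graph_edges(5))
k5_violates_bound = k5_edges > 3 * k5_vertices - 6

# K3,3: compute F from Euler's formula, then the edge count that
# four-sided faces would require (each edge bounds two faces).
k33_vertices = 6
k33_edges = len(complete_bipartite_edges(3, 3))
k33_faces = k33_edges - k33_vertices + 2
k33_edges_required = 4 * k33_faces // 2
```

The mismatch between `k33_edges` and `k33_edges_required` is the contradiction used in the proof of Thm. 4.6.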

In 1930 Kazimierz Kuratowski proved a converse to these theorems: if a graph is not planar it contains (in a certain sense) $K_5$ or $K_{3,3}$.

**Fig. 4.7a** $K_5$ is not planar **Fig. 4.7b** A failed attempt to draw $K_5$ as planar

#### **4.4 The Degrees of the Vertices**

**Definition 4.4** $d(v)$, the *degree* of vertex $v$, is the number of edges incident with $v$.

*Example 4.6* The graph in Fig. 4.2b contains 8 vertices corresponding to the two rings and each vertex is of degree 5. The vertex corresponding to the outer face is of degree 4 as is the vertex corresponding to the inner face. Therefore:

$$\sum_{v \in V} d(v) = 5 \cdot 8 + 4 \cdot 2 = 48.$$

To get the total number of edges divide 48 by 2 because each edge was counted twice, once for each of the vertices it is incident to.

**Fig. 4.8a** $K_{3,3}$ is not planar **Fig. 4.8b** A failed attempt to draw $K_{3,3}$ as planar

By generalizing the argument we get:

**Theorem 4.7** *Let $d_i$ for $i$ in $\{1, 2, 3, \ldots, k\}$ be the number of vertices of degree $i$ in a connected planar graph $G$ with $V$ vertices and $E$ edges, where $k$ is the highest degree of a vertex in $G$. Then:*

$$\sum_{v \in V} d(v) = \sum_{i=1}^k i \cdot d_i = 2E.$$
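The degree count can be illustrated on a small triangulated graph. The sketch below uses the octahedron (6 vertices, 12 edges) as an assumed example; it is not one of the book's figures, and the variable names are ours.

```python
from collections import Counter

# Octahedron: vertex u is adjacent to every vertex except itself
# and its opposite vertex (u + 3) % 6.
vertices = range(6)
edges = [(u, v) for u in vertices for v in vertices
         if u < v and v != (u + 3) % 6]

# d(v) for every vertex v.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# d_i = number of vertices of degree i.
count_by_degree = Counter(degree.values())

# Sum of i * d_i, which should equal 2E.
degree_sum = sum(i * d_i for i, d_i in count_by_degree.items())
```

Every vertex of the octahedron has degree 4, so the sum of the degrees is $4 \cdot 6 = 24 = 2 \cdot 12$, and the graph also satisfies $E = 3V - 6$.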

**Theorem 4.8** *Let $G$ be a connected planar graph with $E$ edges and $V$ vertices, and let $d_i$ for $i$ in $\{1, 2, 3, \ldots, k\}$ be the number of vertices of degree $i$, where $k$ is the highest degree of a vertex in $G$. Then there must be a vertex $v$ in $G$ such that $d(v) \le 5$.*

*Proof (1)* If there are $d_1$ vertices of degree 1, $d_2$ vertices of degree 2, ..., $d_k$ vertices of degree $k$, then $V = \sum_{i=1}^{k} d_i$. From Thms. 4.4 and 4.7:

$$\sum_{i=1}^{k} i \cdot d_i = 2E \le 2(3V - 6) = 6V - 12 = 6\sum_{i=1}^{k} d_i - 12.$$

Therefore:

$$\sum\_{i=1}^{k} i \cdot d\_i \le 6 \sum\_{i=1}^{k} d\_i - 12.$$

$$\sum\_{i=1}^{k} (6 - i)d\_i \ge 12 \text{ .}$$

Since $12 > 0$ and all the $d_i$ are non-negative, there is at least one $i$ with $d_i > 0$ and $6 - i > 0$, and for that $i$, $i < 6$. □

*Proof (2)* Let us compute the *average* degree of the vertices which is the sum of the degrees divided by the number of vertices:

$$d\_{\rm avg} = \frac{\sum\_{i=1}^{k} i \cdot d\_i}{V}.$$

But the sum of the degrees is twice the number of edges which by Thm. 4.4 gives:

$$d_{\text{avg}} = \frac{2E}{V} \le \frac{6V - 12}{V} = 6 - \frac{12}{V} < 6.$$

If the average is less than six there must be a vertex of degree less than six. □

*Example 4.7* In Fig. 4.2b the sum of the degrees is $8 \cdot 5 + 2 \cdot 4 = 48$. There are 10 vertices so the average degree is $48/10 = 4.8$ and there must be a vertex of degree 4 or less.

#### **4.5 The Six-Color Theorem**

#### **Theorem 4.9** *Any planar graph can be six-colored.*

*Proof* By induction on the number of vertices. If $G$ has six vertices or fewer, six colors suffice. For the inductive step, by Thm. 4.8 $G$ has a vertex $v$ with degree 5 or fewer. Delete vertex $v$ to obtain the graph $G'$. By the induction hypothesis $G'$ can be six-colored, but $v$ has at most 5 neighbors and at most 5 colors are used to color them (Fig. 4.9a), so $v$ can be colored using the sixth color (Fig. 4.9b). □
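The induction in this proof translates directly into a recursive procedure. The following is a minimal sketch, assuming the graph is given as a dictionary mapping each vertex to the set of its neighbors; the function name `six_color` and the octahedron test graph are illustrative choices, not part of the text. For brevity the recursion bottoms out at the empty graph rather than at six vertices.

```python
def six_color(adjacency):
    """Color a planar graph with colors 0..5 by repeatedly deleting
    a vertex of degree at most 5, as in the inductive proof."""
    if not adjacency:
        return {}
    # A planar graph always has a vertex of degree <= 5 (Thm. 4.8).
    v = min(adjacency, key=lambda u: len(adjacency[u]))
    if len(adjacency[v]) > 5:
        raise ValueError("no vertex of degree <= 5: the graph is not planar")
    # Delete v to obtain the smaller graph G'.
    smaller = {u: nbrs - {v} for u, nbrs in adjacency.items() if u != v}
    coloring = six_color(smaller)
    # The neighbors of v use at most 5 colors, so one of the 6 is free.
    used = {coloring[u] for u in adjacency[v]}
    coloring[v] = next(c for c in range(6) if c not in used)
    return coloring


# Octahedron: each vertex is adjacent to all but itself and its opposite.
octahedron = {u: {v for v in range(6) if v != u and v != (u + 3) % 6}
              for u in range(6)}
coloring = six_color(octahedron)
```

The recursion depth equals the number of vertices, so this sketch is only suitable for small graphs.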

**Fig. 4.9a** Five colors suffice for coloring the neighbors of $v$

**Fig. 4.9b** Color $v$ with the sixth color

#### **4.6 The Five-Color Theorem**

**Definition 4.5** Let $G$ be a colored planar graph. A *(Kempe) chain* $G'$ is a maximal, two-colored, connected subgraph of $G$.
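A Kempe chain can be found by a breadth-first search that only follows vertices of the two chosen colors. The sketch below assumes the graph is an adjacency dictionary and the coloring is a dictionary from vertices to color names; `kempe_chain`, `swap_chain` and the five-vertex path used as an example are hypothetical, not taken from the figures.

```python
from collections import deque


def kempe_chain(adjacency, coloring, start, other_color):
    """Vertices of the maximal connected subgraph that contains `start`
    and uses only the color of `start` and `other_color`."""
    pair = {coloring[start], other_color}
    chain = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adjacency[u]:
            if w not in chain and coloring[w] in pair:
                chain.add(w)
                queue.append(w)
    return chain


def swap_chain(coloring, chain, color_a, color_b):
    """Return a new coloring with the two colors exchanged on the chain."""
    swapped = dict(coloring)
    for u in chain:
        swapped[u] = color_b if coloring[u] == color_a else color_a
    return swapped


# A path 1-2-3-4-5 colored blue, red, blue, green, red.
adjacency = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
coloring = {1: "blue", 2: "red", 3: "blue", 4: "green", 5: "red"}
chain = kempe_chain(adjacency, coloring, 1, "red")
swapped = swap_chain(coloring, chain, "blue", "red")
```

Because the chain is maximal, no vertex outside it that is adjacent to it has either of the two colors, so exchanging the colors on the chain keeps the coloring valid.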

**Theorem 4.10** *Any planar graph can be five-colored.*

*Proof* By induction on the number of vertices. If $G$ has five vertices or fewer, five colors suffice. For the inductive step, by Thm. 4.8 $G$ has a vertex $v$ with degree 5 or less. Delete $v$ to obtain $G'$. By the induction hypothesis, $G'$ can be five-colored. In $G$, if the degree of $v$ is less than 5, or if $v_1, \ldots, v_5$, the neighbors of $v$, are colored with four colors or fewer, $v$ can be colored with the fifth color. Otherwise, $v_1, \ldots, v_5$ are colored with different colors in $G'$ (Fig. 4.10, top).

**Fig. 4.10** Proof of the five-color theorem

Consider vertex $v_1$ which is colored blue and vertex $v_3$ which is colored red. If $v_1, v_3$ are not connected by a blue-red path (say if the edge $v_6 v_7$ did not exist), we can exchange the colors along the path from $v_1$ to $v_6$ and color $v$ blue. Otherwise, consider the blue-red chain which contains $v_1, v_3$. By adding $v$ and the edges $v v_1$, $v v_3$ we obtain a closed path $P$ (double line) that divides the plane into an "inside" region and an "outside" region (Fig. 4.10, middle).

Consider $v_2$ which is colored green and $v_4$ which is colored orange. These vertices *cannot* be contained in a single green-orange chain, because $v_2$ is *inside* $P$ and $v_4$ is *outside* $P$, so any path connecting them must cross $P$, contradicting the assumption that the graph is planar. Therefore, they must be contained in two *unconnected* green-orange chains (double dashed line, in Fig. 4.10, middle). Exchange the colors on the chain containing $v_2$ and then $v$ can be colored green to obtain a five-coloring of $G$ (Fig. 4.10, bottom). □

The statement that a continuous path from the *inside* of a closed continuous curve $P$ to the *outside* of $P$ must intersect $P$ is the **Jordan Curve Theorem**. The theorem is intuitively obvious but difficult to prove.

#### **4.7 Kempe's Incorrect Proof of the Four-Color Theorem**

**Theorem 4.11** *Any planar graph can be four-colored.*

*Proof (Incorrect)* The base case of the induction and most of the proof is the same as that of the five-color theorem. The new case that must be considered is a vertex $v$ with five neighbors which, by the inductive hypothesis, can be colored with four colors after removing $v$.

In Fig. 4.11a there are two vertices $v_2, v_5$ colored blue. Consider the blue-green chain containing $v_2$ and the blue-yellow chain containing $v_5$. The blue-green chain is contained within the closed path defined by the red-yellow chain containing $v_1, v_3$ (double line) and the blue-yellow chain is contained within the closed path defined by the red-green chain containing $v_1, v_4$ (double dashed line).

Exchange the colors of both the blue-green chain and the blue-yellow chain (Fig. 4.11b). The result is that the neighbors of $v$ are colored with the three colors red, green and yellow, leaving blue free to color $v$. □

Heawood noted that the closed paths defined by the red-yellow chain and the red-green chain can share red vertices ($v_1, v_8$ in Fig. 4.12a). When the colors are exchanged in the blue-green and blue-yellow chains, it is possible for blue vertices $v_6, v_7$ to be connected (Fig. 4.12b) and the coloring is no longer correct.

**Fig. 4.11a** Blue-green and blue-yellow Kempe chains

**Fig. 4.12a** Red-yellow and red-green chains share red vertices

**Fig. 4.11b** Exchange the colors of the two Kempe chains

**Fig. 4.12b** Exchanging colors causes the blue vertices to become connected

#### **What Is the Surprise?**

The four-color theorem is notorious because it is so easy to state but extremely difficult to prove. Therefore, it is surprising that the proof of the five-color theorem is elementary. The clever part of the proof is Thm. 4.8 (a planar graph must have a vertex of degree at most 5), which is a theorem that has nothing to do with coloring. Instead, it results just from counting vertices and edges.

#### **Sources**

For the four-color theorem see [49, 54]. The proof of the five-color theorem is based on [1, 53]. [16] presents numerous proofs of Euler's formula. Kempe's incorrect proof of the four-color theorem is described in [46].


## **Chapter 5 How to Guard a Museum**

In 1973 Victor Klee asked how many guards are needed to observe all the walls of a museum. If the walls form a regular polygon or even a convex polygon, one guard is sufficient (Fig. 5.1).

**Fig. 5.1** A museum whose walls form a convex polygon

Consider now a museum with saw-toothed walls (Fig. 5.2). Verify by counting that the museum has 15 walls. Each "tooth" defines a triangle that is shaded gray in Fig. 5.3. A guard placed anywhere within one of the triangles can observe all the walls bounding that triangle (red arrows).

**Fig. 5.2** A museum whose walls do not form a convex polygon

**Fig. 5.3** Visibility within each "tooth"

If at least one of the guards is placed near the top wall spanning the entire museum, she can observe all the horizontal walls (blue arrows in Fig. 5.4). Thus $5 = 15/3$ guards are sufficient to observe all the walls of the museum. Since the triangles do not overlap, a guard in one triangle will not be able to observe all the walls of another triangle (green arrow), so 5 guards are necessary.

**Fig. 5.4** Visibility of the walls of the museum

The example in Fig. 5.2 can be generalized to $n/3$ teeth with $n$ walls, so we conclude that *at least* $n/3$ guards are necessary. We wish to prove that $n/3$ guards are sufficient to guard any museum.

Section 5.1 proves that any triangulated polygon can be three-colored. This is used in Sect. 5.2 to prove the theorem that $n/3$ guards are sufficient. Section 5.3 completes the proof by showing that any polygon can be triangulated.

#### **5.1 Coloring Triangulated Polygons**

**Definition 5.1** A *diagonal* of a polygon is an edge connecting two vertices that is not one of the (outside) edges of the polygon.

**Definition 5.2** A polygon can be *triangulated* if non-intersecting diagonals can be constructed such that the interior of the polygon is covered by triangles.

**Theorem 5.1** *Any polygon can be triangulated.*

We defer the proof of Thm. 5.1.

**Definition 5.3** A vertex of a polygon is *convex* if its interior angle is less than $180^\circ$; a vertex is *concave* if its interior angle is greater than $180^\circ$.

In Fig. 5.5 vertex 1 is convex and vertex 2 is concave.

**Fig. 5.5** A polygon with a convex vertex (1) and a concave vertex (2)

**Definition 5.4** A polygon with vertices $V$ can be *three-colored* if there is a map:

$$c: V \mapsto \{red, blue, green\},$$

such that no edge has two vertices that are assigned the same color.

**Theorem 5.2** *A triangulated polygon can be three-colored.*

*Proof* By induction on the number of vertices $n$. A triangle can be three-colored. A triangulated polygon with $n > 3$ vertices must have a diagonal. Choose an arbitrary diagonal (Fig. 5.6a) and divide the polygon along this diagonal into two smaller polygons (Fig. 5.6b). By induction each of these smaller polygons can be three-colored (Fig. 5.7a).

Since the colors assigned are arbitrary, if different colors are assigned to the endpoints of the diagonal in the two polygons, we can rename the colors in one of them so that the colors of the endpoints are the same in both polygons. For example, in Fig. 5.7b exchange *red* and *green* in the lower polygon. Paste the two polygons together to recover the original polygon with $n$ vertices. It will be three-colored (Fig. 5.8). □

**Fig. 5.6a** An arbitrary diagonal in a polygon **Fig. 5.6b** Divide the polygon

**Fig. 5.7a** Three-color the two smaller polygons

**Fig. 5.7b** Exchange the colors of one polygon to match the other

**Fig. 5.8** Paste the two smaller polygons back together

#### **5.2 From Coloring of Polygons to Guarding a Museum**

**Theorem 5.3** *A museum with $n$ walls can be guarded by $n/3$ guards.*

*Proof* By Thm. 5.1 the polygon can be triangulated and by Thm. 5.2 the polygon can be three-colored. All three vertices of each triangle in the triangulation must be colored by *different* colors in order to satisfy the condition of being three-colored. Since the polygon is three-colored, at least one color, say red, can appear at most $n/3$ times, and every triangle must have a vertex colored red. Station a guard at each red vertex; she can observe all the walls of each triangle the vertex belongs to. Since the triangles of the triangulation include all the edges of the polygon, $n/3$ guards are sufficient to observe all the walls of the museum. □

**Fig. 5.9** The exterior angles of a convex polygon
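The proof is constructive once a triangulation is known. The sketch below assumes the triangulation is given as a list of vertex triples; the function names and the fan triangulation of a hexagon are illustrative and do not come from the text. It colors the first triangle arbitrarily and then extends the coloring to any triangle that already has two colored vertices, whose third vertex is forced to take the remaining color.

```python
from collections import Counter


def three_color(triangles):
    """Three-color the vertices of a triangulated polygon with colors 0, 1, 2."""
    coloring = dict(zip(triangles[0], range(3)))
    remaining = list(triangles[1:])
    while remaining:
        progressed = False
        for tri in list(remaining):  # iterate over a copy while removing
            colored = [v for v in tri if v in coloring]
            if len(colored) >= 2:
                used = {coloring[u] for u in colored}
                for v in tri:
                    if v not in coloring:
                        coloring[v] = ({0, 1, 2} - used).pop()
                remaining.remove(tri)
                progressed = True
        if not progressed:
            raise ValueError("triangles do not form a connected triangulation")
    return coloring


def place_guards(triangles):
    """Vertices of the least-used color: one guard per such vertex."""
    coloring = three_color(triangles)
    counts = Counter(coloring.values())
    rarest = min(counts, key=counts.get)
    return sorted(v for v, c in coloring.items() if c == rarest)


# Hexagon with vertices 0..5, triangulated by diagonals from vertex 0.
hexagon = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 5)]
hexagon_coloring = three_color(hexagon)
guards = place_guards(hexagon)
```

For this convex hexagon a single guard at vertex 0 is found, which is within the bound of $6/3 = 2$ guards.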

If $n$ is not divisible by 3 the number of guards needed is $\lfloor n/3 \rfloor$, the largest integer less than or equal to $n/3$. For example, 4 guards are sufficient for museums with 12, 13 or 14 walls since $\lfloor 12/3 \rfloor = \lfloor 13/3 \rfloor = \lfloor 14/3 \rfloor = 4$. For simplicity we ignore this complication.

#### **5.3 Any Polygon Can Be Triangulated**

**Theorem 5.4** *The sum of the interior angles of a polygon with $n$ vertices is:*

$$180^\circ (n-2).$$

*Proof* Consider a convex polygon and denote its *exterior angles* by $\theta_i$ (Fig. 5.9). As you move from one dashed line in sequence to the next dashed line, you complete a rotation around a circle so:

$$\sum_{i=1}^{n} \theta_i = 360^\circ.$$

For each exterior angle $\theta_i$ denote its corresponding interior angle by $\phi_i$. Then:

$$\begin{aligned} \sum_{i=1}^n \theta_i &= \sum_{i=1}^n (180^\circ - \phi_i) = 360^\circ \\ \sum_{i=1}^n \phi_i &= n \cdot 180^\circ - 360^\circ = 180^\circ (n-2). \end{aligned}$$

**Fig. 5.10** A concave vertex

If there is a concave vertex (Fig. 5.10), there is a triangle formed by the two edges incident with the concave vertex and the line connecting the other two vertices. By summing the angles of the triangle we get:

$$\begin{aligned} (180^\circ - \alpha) + (360^\circ - \beta) + (180^\circ - \gamma) &= 180^\circ \\ \alpha + \beta + \gamma &= 3 \cdot 180^\circ. \end{aligned}$$

The sum of the interior angles increases by $\alpha + \beta + \gamma$ while the number of vertices increases by three, preserving the equation in the theorem:

$$\begin{aligned} \sum_{i=1}^{n} \phi_i + (\alpha + \beta + \gamma) &= 180^\circ (n - 2) + 3 \cdot 180^\circ \\ &= 180^\circ ((n + 3) - 2). \end{aligned}$$

#### **Theorem 5.5** *There must be at least three convex vertices in a polygon.*

*Proof* Let $k$ be the number of concave vertices where the interior angle of each is $180^\circ + \epsilon_i$, $\epsilon_i > 0$. The sum of the interior angles of the *concave* vertices is certainly less than or equal to the sum of the interior angles of *all* the vertices:

$$\begin{aligned} k \cdot 180^\circ + \sum\_{i=1}^k \epsilon\_i &\le 180^\circ (n-2) \\ (k+2) \cdot 180^\circ + \sum\_{i=1}^k \epsilon\_i &\le n \cdot 180^\circ \\ (k+2) \cdot 180^\circ &< n \cdot 180^\circ \\ k &< n-2 \end{aligned}$$

Since $k$ is an integer, $k \le n - 3$, so there must be at least three vertices that are convex, not concave. □
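Whether a vertex is convex or concave can be decided from the sign of a cross product. The following is a sketch under the assumption that the polygon is simple and its vertices are listed in counterclockwise order; the function name and the L-shaped example are ours, not from the text.

```python
def count_convex_vertices(polygon):
    """Count the vertices whose interior angle is less than 180 degrees.

    `polygon` is a list of (x, y) pairs in counterclockwise order.
    """
    n = len(polygon)
    convex = 0
    for i in range(n):
        ax, ay = polygon[i - 1]          # previous vertex
        bx, by = polygon[i]              # current vertex
        cx, cy = polygon[(i + 1) % n]    # next vertex
        # z-component of (b - a) x (c - b); positive means a left turn.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross > 0:
            convex += 1
    return convex


# L-shaped hexagon; the vertex (1, 1) is concave.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
num_convex = count_convex_vertices(l_shape)
num_concave = len(l_shape) - num_convex
```

Here $n = 6$ and there is one concave vertex, consistent with the bound $k < n - 2$.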

*Proof (Theorem 5.1)* By induction on the number of vertices $n$. For $n = 3$ there is nothing to prove. If $n > 3$, by Thm. 5.5 there must be a convex vertex. Consider the line segment connecting its two adjacent vertices. If this segment is contained within the polygon (Fig. 5.11a), it is a diagonal and the polygon can be split into the triangle formed by the three vertices and another polygon which has the diagonal as an edge and which is smaller than the original polygon. By the inductive hypothesis, the smaller polygon can be triangulated and then pasted back to the triangle, triangulating the original polygon.

If the segment is not contained in the polygon, there must be a concave vertex that is *closest* to the convex vertex (Fig. 5.11b). The segment connecting the convex vertex to this concave vertex is a diagonal and splits the polygon into two smaller polygons. By the inductive hypothesis these can be triangulated and pasted together. □

**Fig. 5.11a** Triangulation where a diagonal is contained within the polygon

**Fig. 5.11b** Triangulation where a diagonal is not contained within the polygon

#### **What Is the Surprise?**

The museum theorem is surprising because what seems to be a theorem in geometry is proved rather elegantly by an appeal to coloring a graph.

#### **Sources**

This chapter is based on [1, Chap. 39].


## **Chapter 6 Induction**

The axiom of mathematical induction is used extensively as a method of proof in mathematics. This chapter presents inductive proofs of results that may not be known to the reader. We begin with a short review of mathematical induction (Sect. 6.1). Section 6.2 proves results about the familiar Fibonacci numbers while Sect. 6.3 proves results about Fermat numbers. Section 6.4 presents the 91-function discovered by John McCarthy; the proof is by induction on an unusual sequence: integers in an inverse ordering. The proof of the formula for the Josephus problem (Sect. 6.5) is also unusual because of the double induction on two different parts of an expression.

#### **6.1 The Axiom of Mathematical Induction**

Mathematical induction is the primary method of proving statements to be true for an unbounded set of numbers. Consider:

$$1 = 1, \quad 1 + 2 = 3, \quad 1 + 2 + 3 = 6, \quad 1 + 2 + 3 + 4 = 10.$$

We might notice that:

$$1 = (1 \cdot 2) / 2, \quad 3 = (2 \cdot 3) / 2, \quad 6 = (3 \cdot 4) / 2, \quad 10 = (4 \cdot 5) / 2,$$

and then conjecture that for *all* integers $n \ge 1$:

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}.$$

If you have enough patience, checking this formula for any specific value of $n$ is easy, but how can it be proved for *all* of the infinite number of positive integers? This is where mathematical induction comes in.

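**Axiom 6.1** Let $P(n)$ be a property (such as an equation, a formula, or a theorem), where $n$ is a positive integer. Suppose that you can:

- *Base case:* Prove that $P(1)$ is true.
- *Inductive step:* For arbitrary $m$, prove that $P(m+1)$ is true provided that you assume that $P(m)$ is true.

Then $P(n)$ is true for all $n \ge 1$. The assumption that $P(m)$ is true for arbitrary $m$ is called the *inductive hypothesis*.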

Here is a simple example of a proof by mathematical induction.

**Theorem 6.2** *For $n \ge 1$:*

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}.$$

*Proof* The base case is trivial:

$$\sum_{i=1}^{1} i = 1 = \frac{1(1+1)}{2}.$$

The inductive hypothesis is that the following equation is true for $m$:

$$\sum_{i=1}^{m} i = \frac{m(m+1)}{2}.$$

The inductive step is to prove the equation for $m + 1$:

$$\begin{aligned} \sum_{i=1}^{m+1} i &= \sum_{i=1}^{m} i + (m+1) \\ &= \frac{m(m+1)}{2} + (m+1) = \frac{(m+1)(m+2)}{2}. \end{aligned}$$

By the principle of mathematical induction, for any $n \ge 1$:

$$\sum_{i=1}^{n} i = \frac{n(n+1)}{2}.$$
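Checking the formula for many values of $n$ is no substitute for the inductive proof, but it is a quick way to catch a mistaken conjecture. A minimal sketch, with a function name and an upper limit chosen only for illustration:

```python
def triangular(n: int) -> int:
    """Closed form n(n+1)/2 for the sum 1 + 2 + ... + n."""
    return n * (n + 1) // 2


# Compare the closed form with the explicit sum for n = 1, ..., 1000.
formula_holds = all(sum(range(1, n + 1)) == triangular(n)
                    for n in range(1, 1001))
```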

The inductive hypothesis can be confusing because it seems that we are assuming what we are trying to prove. The reasoning is *not* circular because we assume the truth of a property for something *small* and then use the assumption to prove the property for something *larger*.

Mathematical induction is an axiom so there can be no question of proving induction. You just have to accept induction like you accept other axioms of mathematics such as $x + 0 = x$. Of course, you are free to reject mathematical induction, but then you will have to reject much of modern mathematics.

Mathematical induction is a rule of inference that is one of the *Peano axioms* for formalizing natural numbers. The *well-ordering axiom* can be used to prove the axiom of induction and, conversely, the axiom of induction can be used to prove the well-ordering axiom. However, the axiom of induction cannot be proved from the other, more elementary, Peano axioms.

#### **6.2 Fibonacci Numbers**

Fibonacci numbers are the classic example of a recursive definition:

$$\begin{aligned} f_1 &= 1 \\ f_2 &= 1 \\ f_n &= f_{n-1} + f_{n-2} \text{ for } n \ge 3. \end{aligned}$$

The first twelve Fibonacci numbers are:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144 .

**Theorem 6.3** *Every fourth Fibonacci number is divisible by* 3*.*

*Example 6.1* $f_4 = 3 = 3 \cdot 1$, $f_8 = 21 = 3 \cdot 7$, $f_{12} = 144 = 3 \cdot 48$.

*Proof* Base case: $f_4 = 3$ is divisible by 3. The inductive hypothesis is that $f_{4n}$ is divisible by 3. The inductive step is:

$$\begin{aligned} f_{4(n+1)} &= f_{4n+4} \\ &= f_{4n+3} + f_{4n+2} \\ &= \left( f_{4n+2} + f_{4n+1} \right) + f_{4n+2} \\ &= \left( \left( f_{4n+1} + f_{4n} \right) + f_{4n+1} \right) + f_{4n+2} \\ &= \left( \left( f_{4n+1} + f_{4n} \right) + f_{4n+1} \right) + \left( f_{4n+1} + f_{4n} \right) \\ &= 3 f_{4n+1} + 2 f_{4n}. \end{aligned}$$

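$3 f_{4n+1}$ is divisible by 3 and, by the inductive hypothesis, $f_{4n}$ is divisible by 3. Therefore, $f_{4(n+1)}$ is divisible by 3. □

**Theorem 6.4** $f_n < \left(\frac{7}{4}\right)^n$*.*

*Proof* Base cases: $f_1 = 1 < \left(\frac{7}{4}\right)^1$ and $f_2 = 1 < \left(\frac{7}{4}\right)^2 = \frac{49}{16}$. The inductive step is:

$$\begin{aligned} f_{n+1} = f_n + f_{n-1} &< \left(\frac{7}{4}\right)^n + \left(\frac{7}{4}\right)^{n-1} \\ &= \left(\frac{7}{4}\right)^{n-1} \cdot \left(\frac{7}{4} + 1\right) \\ &< \left(\frac{7}{4}\right)^{n-1} \cdot \left(\frac{7}{4}\right)^2 \\ &= \left(\frac{7}{4}\right)^{n+1}, \end{aligned}$$

since:

$$
\left(\frac{7}{4} + 1\right) = \frac{11}{4} = \frac{44}{16} < \frac{49}{16} = \left(\frac{7}{4}\right)^2.
$$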
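Both Fibonacci results, divisibility of every fourth term by 3 and the bound $f_n < (7/4)^n$, can be spot-checked numerically. The sketch below uses exact rational arithmetic so that the comparison with $(7/4)^n$ is not affected by rounding; the function name and the limit of 60 terms are arbitrary choices, and a finite check does not replace the proofs.

```python
from fractions import Fraction


def fibonacci(count):
    """Return the list [f_1, ..., f_count] with f_1 = f_2 = 1."""
    fibs = [1, 1]
    while len(fibs) < count:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[:count]


fibs = fibonacci(60)

# f_4, f_8, f_12, ... are divisible by 3 (fibs[i - 1] holds f_i).
every_fourth_divisible = all(fibs[i - 1] % 3 == 0 for i in range(4, 61, 4))

# f_n < (7/4)^n, compared exactly as fractions.
below_bound = all(fibs[n - 1] < Fraction(7, 4) ** n for n in range(1, 61))
```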

#### **Theorem 6.5 (Binet's formula)**

$$f\_n = \frac{\phi^n - \bar{\phi}^n}{\sqrt{5}}, \quad \text{where} \quad \phi = \frac{1 + \sqrt{5}}{2}, \ \bar{\phi} = \frac{1 - \sqrt{5}}{2}.$$

*Proof* We first show that $\phi^2 = \phi + 1$:

$$\begin{aligned} \phi^2 &= \left(\frac{1+\sqrt{5}}{2}\right)^2 \\ &= \frac{1}{4} + \frac{2\sqrt{5}}{4} + \frac{5}{4} = \left(\frac{1}{2} + \frac{\sqrt{5}}{2}\right) + 1 \\ &= \phi + 1. \end{aligned}$$

Similarly, we can show that $\bar{\phi}^2 = \bar{\phi} + 1$.

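The base case for Binet's formula is:

$$\frac{\phi^1 - \bar{\phi}^1}{\sqrt{5}} = \frac{\frac{1 + \sqrt{5}}{2} - \frac{1 - \sqrt{5}}{2}}{\sqrt{5}} = \frac{\sqrt{5}}{\sqrt{5}} = 1 = f_1.$$

The inductive step uses the formula for both $f_n$ and $f_{n-1}$, so we also check $f_2$: since $\phi^2 - \bar{\phi}^2 = (\phi + 1) - (\bar{\phi} + 1) = \phi - \bar{\phi}$, we get $(\phi^2 - \bar{\phi}^2)/\sqrt{5} = 1 = f_2$.

Assume the inductive hypothesis for all $k \le n$. The inductive step is: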

$$\begin{aligned} \phi^{n+1} - \bar{\phi}^{n+1} &= \phi^2 \phi^{n-1} - \bar{\phi}^2 \bar{\phi}^{n-1} \\ &= (\phi + 1)\phi^{n-1} - (\bar{\phi} + 1)\bar{\phi}^{n-1} \\ &= (\phi^n - \bar{\phi}^n) + (\phi^{n-1} - \bar{\phi}^{n-1}) \\ &= \sqrt{5}f\_n + \sqrt{5}f\_{n-1} \\ \frac{\phi^{n+1} - \bar{\phi}^{n+1}}{\sqrt{5}} &= f\_n + f\_{n-1} = f\_{n+1} \end{aligned}$$
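Binet's formula can be compared with the recursive definition in floating-point arithmetic. This is only a sketch: the names are ours, the result is rounded to the nearest integer, and floating-point error limits the range of $n$ for which the comparison is reliable (the first 40 terms are well within it).

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2
PHI_BAR = (1 - sqrt(5)) / 2


def binet(n):
    """Binet's formula, rounded to absorb floating-point error."""
    return round((PHI ** n - PHI_BAR ** n) / sqrt(5))


def fib(n):
    """f_n from the recursive definition with f_1 = f_2 = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


binet_matches = all(binet(n) == fib(n) for n in range(1, 41))
```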

#### **Theorem 6.6**

$$f_{n+1} = \binom{n}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \cdots,$$

*where $\binom{m}{k} = 0$ if $k > m$, so that the sum has finitely many nonzero terms.*

*Proof* Let us first prove Pascal's rule:

$$
\binom{n}{k} + \binom{n}{k+1} = \binom{n+1}{k+1}.
$$

$$\begin{aligned} \binom{n}{k} + \binom{n}{k+1} &= \frac{n!}{k!(n-k)!} + \frac{n!}{(k+1)!(n-(k+1))!} \\ &= \frac{n!(k+1)}{(k+1)!(n-k)!} + \frac{n!(n-k)}{(k+1)!(n-k)!} \\ &= \frac{n!(n+1)}{(k+1)!(n-k)!} \\ &= \frac{(n+1)!}{(k+1)!((n+1)-(k+1))!} \\ &= \binom{n+1}{k+1}. \end{aligned}$$

We will also use the equality $\binom{k}{0} = \frac{k!}{0!(k - 0)!} = 1$ for any $k \ge 0$.

We can now prove the theorem. The base cases are:

$$f_1 = \binom{0}{0} = 1, \qquad f_2 = \binom{1}{0} + \binom{0}{1} = 1 + 0 = 1.$$

The inductive step, for $n \ge 2$, is:

$$\begin{aligned} f_{n+1} = f_n + f_{n-1} &= \binom{n-1}{0} + \binom{n-2}{1} + \binom{n-3}{2} + \binom{n-4}{3} + \cdots \\ &\quad + \binom{n-2}{0} + \binom{n-3}{1} + \binom{n-4}{2} + \cdots \\ &= \binom{n-1}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \binom{n-3}{3} + \cdots \\ &= \binom{n}{0} + \binom{n-1}{1} + \binom{n-2}{2} + \binom{n-3}{3} + \cdots \end{aligned}$$

#### **6.3 Fermat Numbers**

**Definition 6.1** The integers $F_n = 2^{2^n} + 1$ for $n \ge 0$ are called *Fermat numbers*.

The first five Fermat numbers are prime:

$$F_0 = 3, \quad F_1 = 5, \quad F_2 = 17, \quad F_3 = 257, \quad F_4 = 65537.$$

The seventeenth-century mathematician Pierre de Fermat claimed that all Fermat numbers are prime, but nearly a hundred years later Leonhard Euler showed that:

$$F_5 = 2^{2^5} + 1 = 2^{32} + 1 = 4294967297 = 641 \times 6700417.$$

Fermat numbers become extremely large as $n$ increases. It is known that Fermat numbers are not prime for $5 \le n \le 32$, but the factorization of some of those numbers is still not known.
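The first few Fermat numbers are small enough to examine directly. The sketch below uses a simple trial-division primality test, which is adequate for $F_0, \ldots, F_4$ but not intended for larger numbers; the function names are illustrative.

```python
def fermat(n):
    """The Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


def is_prime(m):
    """Primality by trial division; fine for small m only."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


first_five = [fermat(n) for n in range(5)]
first_five_prime = all(is_prime(f) for f in first_five)

# Euler's factorization of F_5.
euler_factorization_holds = fermat(5) == 641 * 6700417
```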

**Theorem 6.7** *For $n \ge 2$, the last digit of $F_n$ is 7.*

*Proof* The base case is $F_2 = 2^{2^2} + 1 = 17$. The inductive hypothesis is $F_n = 10 k_n + 7$ for some $k_n \ge 1$. The inductive step is:

$$\begin{aligned} F_{n+1} &= 2^{2^{n+1}} + 1 = 2^{2^n \cdot 2^1} + 1 = \left( 2^{2^n} \right)^2 + 1 \\ &= \left( \left( 2^{2^n} + 1 \right) - 1 \right)^2 + 1 = \left( F_n - 1 \right)^2 + 1 \\ &= \left( 10k_n + 7 - 1 \right)^2 + 1 = \left( 10k_n + 6 \right)^2 + 1 \\ &= 100k_n^2 + 120k_n + 36 + 1 \\ &= 10(10k_n^2 + 12k_n + 3) + 6 + 1 \\ &= 10k_{n+1} + 7, \quad \text{for some} \quad k_{n+1} \ge 1. \end{aligned}$$

**Theorem 6.8** *For $n \ge 1$,* $F_n = \prod_{k=0}^{n-1} F_k + 2$*.*

*Proof* The base case is:

$$F_1 = \prod_{k=0}^{0} F_k + 2 = F_0 + 2 = 3 + 2 = 5.$$

The inductive step is:

$$\begin{aligned} \prod_{k=0}^{n} F_k &= \left(\prod_{k=0}^{n-1} F_k \right) F_n \\ &= (F_n - 2) F_n \\ &= \left(2^{2^n} + 1 - 2\right) \left(2^{2^n} + 1\right) \\ &= \left(2^{2^n}\right)^2 - 1 = \left(2^{2^{n+1}} + 1\right) - 2 \\ &= F_{n+1} - 2 \\ F_{n+1} &= \prod_{k=0}^{n} F_k + 2. \end{aligned}$$
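Python's arbitrary-precision integers make it easy to spot-check the last-digit property and the product identity for the first few Fermat numbers. A minimal sketch, assuming Python 3.8 or later for `math.prod`; the upper limit of 12 is arbitrary.

```python
from math import prod


def fermat(n):
    """The Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


# Last digit of F_n is 7 for n >= 2.
last_digit_is_seven = all(fermat(n) % 10 == 7 for n in range(2, 12))

# F_n equals the product of all earlier Fermat numbers plus 2, for n >= 1.
product_identity = all(
    fermat(n) == prod(fermat(k) for k in range(n)) + 2
    for n in range(1, 12)
)
```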

#### **6.4 McCarthy's 91-function**

We usually associate induction with proofs of properties defined on the set of positive integers. Here we bring an inductive proof based on a strange ordering where larger numbers are less than smaller numbers. The induction works because the only property required of the set is that it be ordered under some relational operator.

Consider the following recursive function defined on the integers:

$$f(x) = \text{if } x > 100 \text{ then } x - 10 \text{ else } f(f(x + 11)).$$

For numbers greater than 100 the result of applying the function is trivial:

$$f(101) = 91, \quad f(102) = 92, \quad f(103) = 93, \quad f(104) = 94, \quad \ldots$$

What about numbers less than or equal to 100? Let us compute $f(x)$ for some numbers, where the computation in each line uses the results of previous lines:

$$f(100) = f(f(100+11)) = f(f(111)) = f(101) = 91$$

$$f(99) = f(f(99+11)) = f(f(110)) = f(100) = 91$$

$$f(98) = f(f(98+11)) = f(f(109)) = f(99) = 91$$

$$\dots$$

$$f(91) = f(f(91+11)) = f(f(102)) = f(92)$$

$$= f(f(103)) = f(93) = \dots = f(98) = 91$$

$$f(90) = f(f(90+11)) = f(f(101)) = f(91) = 91$$

$$f(89) = f(f(89+11)) = f(f(100)) = f(91) = 91$$

Define the function $g$ as:

$$g(x) = \text{if } x > 100 \text{ then } x - 10 \text{ else } 91.$$

**Theorem 6.9** *For all $x$, $f(x) = g(x)$.*

*Proof* The proof is by induction over the set of integers <sup>=</sup> { <sup>|</sup> <sup>≤</sup> <sup>101</sup>} using the relational operator ≺ defned by:

$$
\text{If } \mathbf{y} \prec \mathbf{x} \text{ if and only if } \mathbf{x} < \mathbf{y},
$$

where on the right-hand side < is the usual relational operator on the integers. This defnition results in the following ordering:

$$101 \times 100 \times 99 \times 98 \times 97 \times \dotsb$$

There are three cases to the proof. We use the results of the above computations.

*Case 1:* $x > 100$. This is trivial by the definitions of $f$ and $g$.

*Case 2:* $90 \le x \le 100$. The base case of the induction is:

$$f(100) = 91 = g(100) \,,$$

since we showed that $f(100) = 91$ and by definition $g(100) = 91$.

The inductive assumption is $f(y) = g(y)$ for $y \prec x$ and the inductive step is:

$$f(x) = f(f(x + 11))\tag{6.1a}$$

$$= f(x + 11 - 10) = f(x + 1)\tag{6.1b}$$

$$= g(x + 1)\tag{6.1c}$$

$$= 91\tag{6.1d}$$

$$= g(x) \,.\tag{6.1e}$$

Equation 6.1a holds by definition of $f$ since $x \le 100$. The equality of Eq. 6.1a and Eq. 6.1b holds by the definition of $f$, because $x \ge 90$ so $x + 11 > 100$. The equality of Eq. 6.1b and Eq. 6.1c follows by the inductive hypothesis: $x \le 100$, so $x + 1 \le 101$, which implies that $x + 1 \in S$ and $x + 1 \prec x$. The equality of Eq. 6.1c, Eq. 6.1d and Eq. 6.1e follows by definition of $g$ and $x + 1 \le 101$, so $x \le 100$.

*Case 3:* $x < 90$. The base case is: $f(89) = f(f(100)) = f(91) = 91 = g(89)$ by definition of $g$ since $89 < 100$.

The inductive assumption is $f(y) = g(y)$ for $y \prec x$ and the inductive step is:

$$f(x) = f(f(x + 11))\tag{6.2a}$$

$$= f(g(x + 11))\tag{6.2b}$$

$$= f(91)\tag{6.2c}$$

$$= 91\tag{6.2d}$$

$$= g(x) \,.\tag{6.2e}$$

Equation 6.2a holds by definition of $f$ and $x < 90 \le 100$. The equality of Eq. 6.2a and Eq. 6.2b follows from the inductive hypothesis: $x < 90$, so $x + 11 < 101$, which implies that $x + 11 \in S$ and $x + 11 \prec x$. The equality of Eq. 6.2b and Eq. 6.2c follows by definition of $g$ and $x + 11 < 101$. Finally, we have already shown that $f(91) = 91$, and $g(x) = 91$ for $x < 90$ by definition. □
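Theorem 6.9 can also be checked experimentally. The sketch below transcribes the definitions of $f$ and $g$ into Python and compares them on a finite range of integers; the range from $-200$ to $200$ is an arbitrary choice that keeps the recursion shallow.

```python
def f(x: int) -> int:
    """McCarthy's 91-function, transcribed from the recursive definition."""
    return x - 10 if x > 100 else f(f(x + 11))

def g(x: int) -> int:
    """The non-recursive function that Theorem 6.9 claims is equal to f."""
    return x - 10 if x > 100 else 91

# Compare the two functions on a finite range of integers.
agree = all(f(x) == g(x) for x in range(-200, 201))
print(agree)
```

The inner call `f(x + 11)` moves the argument toward 101, which is the direction in which the ordering $\prec$ decreases; this is why the recursion terminates.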

#### **6.5 The Josephus Problem**

Josephus was the commander of the city of Yodfat during the Jewish rebellion against the Romans. The overwhelming strength of the Roman army eventually crushed the city's resistance and Josephus took refuge in a cave with some of his men. They preferred to commit suicide rather than be killed or captured by the Romans. According to the account by Josephus, he arranged to save himself, became an observer with the Romans and later wrote a history of the rebellion. We present the problem as an abstract mathematical one.

**Definition 6.2 (Josephus problem)** Consider the numbers $1, \dots, n+1$ arranged in a circle. Delete every $q$'th number going around the circle, $q, 2q, 3q, \dots$ (where the computation is performed modulo $n+1$), until only one number remains. The number that remains, $J(n+1, q)$, is the *Josephus number* for $n+1$ and $q$.

*Example 6.2* Let $n + 1 = 41$ and let $q = 3$. Arrange the numbers in a circle:

→ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 ↓
↑ 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 ←

The first round of deletions leads to:

→ 1 2 3̸ 4 5 6̸ 7 8 9̸ 10 11 1̸2̸ 13 14 1̸5̸ 16 17 1̸8̸ 19 20 2̸1̸ ↓
↑ 41 40 3̸9̸ 38 37 3̸6̸ 35 34 3̸3̸ 32 31 3̸0̸ 29 28 2̸7̸ 26 25 2̸4̸ 23 22 ←

After removing the deleted numbers this can be written as:

1 2 4 5 7 8 10 11 13 14 16 17 19 20 22 23 25 26 28 29 31 32 34 35 37 38 40 41

The second round of deletions (starting at the last deletion of 39) leads to:

1̸ 2 4 5̸ 7 8 1̸0̸ 11 13 1̸4̸ 16 17 1̸9̸ 20 22 2̸3̸ 25 26 2̸8̸ 29 31 3̸2̸ 34 35 3̸7̸ 38 40 4̸1̸

We continue deleting every third number until only one remains:

```
2 4 7̸ 8 11 1̸3̸ 16 17 2̸0̸ 22 25 2̸6̸ 29 31 3̸4̸ 35 38 4̸0̸
2 4 8̸ 11 16 1̸7̸ 22 25 2̸9̸ 31 35 3̸8̸
2 4 1̸1̸ 16 22 2̸5̸ 31 35
2̸ 4 16 2̸2̸ 31 35
4̸ 16 31 3̸5̸
1̸6̸ 31
31
```
It follows that $J(41, 3) = 31$.

The reader is invited to perform the computation for deleting every seventh number from a circle of 40 numbers in order to verify that the last number is 30.

**Theorem 6.10** $J(n+1, q) = (J(n, q) + q) \pmod{n+1}$*.*

*Proof* The first number deleted in the first round is the $q$'th number and the numbers that remain after the deletion are the $n$ numbers:

$$1 \quad 2 \quad \dots \quad q-1 \quad q+1 \quad \dots \quad n \quad n+1 \pmod{n+1} \,.$$

Counting to find the next deletion starts with $q + 1$. Mapping $1, \dots, n$ into this sequence we get:

$$
\begin{array}{ccccccccc}
1 & 2 & \dots & n-q & n+1-q & n+2-q & \dots & n-1 & n \\
\downarrow & \downarrow & & \downarrow & \downarrow & \downarrow & & \downarrow & \downarrow \\
q+1 & q+2 & \dots & n & n+1 & 1 & \dots & q-2 & q-1
\end{array} \pmod{n+1}
$$

Remember that the computations are modulo $n + 1$:

$$\begin{array}{ccc}(n+2-q)+q = (n+1)+1 & = 1 \pmod{n+1} \\ (n)+q & = (n+1)-1+q = q-1 \pmod{n+1} \end{array}$$

This is the Josephus problem for $n$ numbers, except that the numbers are offset by $q$. It follows that:

$$J(n+1,q) = \left(J(n,q) + q\right) \pmod{n+1} \,. \qquad \square$$
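The recurrence of Theorem 6.10 can be compared with a direct simulation of the deletions. The following Python sketch makes two assumptions that are not spelled out in the text: the recurrence is started from $J(1, q) = 1$, and a result of $0$ modulo $m$ is read as the number $m$ itself, because the circle is numbered from $1$. The function names are ours.

```python
def josephus_simulated(n: int, q: int) -> int:
    """Delete every q-th number from the circle 1..n and return the survivor."""
    circle = list(range(1, n + 1))
    index = 0
    while len(circle) > 1:
        index = (index + q - 1) % len(circle)  # position of the next deletion
        circle.pop(index)                      # counting resumes at this index
    return circle[0]

def josephus_recurrence(n: int, q: int) -> int:
    """Apply J(m, q) = (J(m-1, q) + q) mod m, starting from J(1, q) = 1."""
    j = 1
    for m in range(2, n + 1):
        j = (j + q) % m
        if j == 0:      # the residue 0 stands for the number m itself
            j = m
    return j

print(josephus_simulated(41, 3), josephus_recurrence(41, 3))
```

Both functions return 31 for the circle of Example 6.2. The simulation takes time proportional to $n^2$ because of the list deletions, while the recurrence needs only $n$ steps.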

**Theorem 6.11** *For $n \ge 1$ there exist numbers $a \ge 0$, $0 \le t < 2^a$, such that $n = 2^a + t$.*

*Proof* This can be proved from repeated application of the division algorithm with divisors $2^0, 2^1, 2^2, 2^3, \dots$, but it is easy to see from the binary representation of $n$. For some $a$ and $b\_{a-1}, b\_{a-2}, \dots, b\_1, b\_0$, where for all $i$, $b\_i = 0$ or $b\_i = 1$, $n$ can be expressed as:

$$\begin{aligned} n &= 2^a + b\_{a-1}2^{a-1} + \dots + b\_0 2^0 \\ n &= 2^a + (b\_{a-1}2^{a-1} + \dots + b\_0 2^0) \\ n &= 2^a + t, \quad \text{where } t \le 2^a - 1 \end{aligned}$$

We now prove that there is a simple closed form for $J(n, 2)$.

**Theorem 6.12** *For $n = 2^a + t$, $a \ge 0$, $0 \le t < 2^a$, $J(n, 2) = 2t + 1$.*

*Proof* By Thm. 6.11, $n$ can be expressed as stated in the theorem. The proof that $J(n, 2) = 2t + 1$ is by a double induction, first on $a$ and then on $t$.

*First induction:*

Base case. Assume that $t = 0$ so that $n = 2^a$. Let $a = 1$ so that there are two numbers in the circle $1, 2$. Since $q = 2$, the second number will be deleted, so the remaining number is 1 and $J(2^1, 2) = 1$.

The inductive hypothesis is that $J(2^a, 2) = 1$. What is $J(2^{a+1}, 2)$? In the first round all the even numbers are deleted:

$$1 \quad \not 2 \quad 3 \quad \not 4 \quad \dots \quad 2^{a+1} - 1 \quad \not{2^{a+1}} \,.$$

There are now $2^a$ numbers left:

$$1 \quad 3 \quad \dots \quad 2^{a+1} - 1 \,.$$

By the inductive hypothesis $J(2^{a+1}, 2) = J(2^a, 2) = 1$, so by induction $J(n, 2) = 1$ whenever $n = 2^a + 0$.

*Second induction:*

We have proved $J(2^a + 0, 2) = 2 \cdot 0 + 1$, the base case of the second induction. The inductive hypothesis is $J(2^a + t, 2) = 2t + 1$. By Thm. 6.10:

$$J(2^a + (t+1), 2) = J(2^a + t, 2) + 2 = 2t + 1 + 2 = 2(t+1) + 1 \,. \qquad \square$$

Theorems 6.11 and 6.12 give a simple algorithm for computing $J(n, 2)$. From the proof of Thm. 6.11:

$$n = 2^a + t = 2^a + \left(b\_{a-1}2^{a-1} + \dots + b\_0 2^0\right),$$

so $t = b\_{a-1}2^{a-1} + \dots + b\_0 2^0$. We simply multiply $t$ by 2 (shift left by one digit) and add 1. For example, since $n = 41 = 2^5 + 2^3 + 2^0 = 101001$ in binary, it follows that $J(41, 2) = 2t + 1$, and in binary notation:

$$\begin{aligned} 41 &= 101001 \\ 9 &= 01001 \\ 2t + 1 &= 10011 = 16 + 2 + 1 = 19 \,. \end{aligned}$$

The reader can verify the result by deleting every second number in a circle 1, . . . , 41.
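The algorithm given by Theorems 6.11 and 6.12 can be sketched in Python as follows. The function names are ours; `int.bit_length` is used to find the exponent $a$ of the leading binary digit, and a direct simulation is included only for comparison.

```python
def josephus_two(n: int) -> int:
    """Compute J(n, 2) = 2t + 1, where n = 2^a + t and 0 <= t < 2^a."""
    a = n.bit_length() - 1   # exponent of the leading binary digit of n
    t = n - (1 << a)         # drop the leading 1 to obtain t
    return 2 * t + 1         # shift t left by one digit and add 1

def josephus_two_simulated(n: int) -> int:
    """Delete every second number from the circle 1..n and return the survivor."""
    circle = list(range(1, n + 1))
    index = 0
    while len(circle) > 1:
        index = (index + 1) % len(circle)  # skip one number, delete the next
        circle.pop(index)
    return circle[0]

print(josephus_two(41), josephus_two_simulated(41))
```

For $n = 41$ both functions return 19, in agreement with the binary computation above.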

There is a closed form for $J(n, 3)$ but it is quite complicated.

#### **What Is the Surprise?**

Induction is perhaps the most important proof technique in modern mathematics. While Fibonacci numbers are extremely well-known and Fermat numbers are also easy to understand, I was surprised to find so many formulas that I never knew (such as Thms. 6.3 and 6.4) that can be proved by induction. McCarthy's 91-function was discovered in the context of computer science although it is a purely mathematical result. What is surprising is not the function itself, but the strange induction used to prove it where 98 ≺ 97. The surprise of the Josephus problem is the bidirectional inductive proof.

#### **Sources**

For a comprehensive presentation of induction see [21]. The proof of McCarthy's 91 function is from [30] where it is attributed to Rod M. Burstall. The presentation of the Josephus problem is based on [21, Chapter 17], which also discusses the historical background. That chapter contains other interesting problems with inductive proofs, such as the muddy children, the counterfeit coin and the pennies in a box. Additional material on the Josephus problem can be found in [44, 57].


## **Chapter 7 Solving Quadratic Equations**

Po-Shen Loh proposed a method for solving quadratic equations that is based on a relation between the coefficients of the quadratic polynomial and its roots. Section 7.1 reviews the traditional methods for solving quadratic equations. Section 7.2 tries to convince the reader that Loh's method makes sense and then explains how to compute the roots. In Sect. 7.3 the computation is carried out for two quadratic polynomials and a similar computation for a quartic polynomial. Section 7.4 derives the traditional formula for the roots from Loh's formulas.

The introduction of algebra and modern algebraic notation is relatively recent. Previously, mathematicians used geometry almost exclusively, so it is interesting to look at al-Khwarizmi's geometric construction of the formula for the roots of quadratic equations (Sect. 7.5). Section 7.6 shows a clever geometric construction used by Cardano in developing the formula for the roots of cubic equations.

Section 7.8 presents other geometric methods for finding the roots of quadratic equations.<sup>1</sup> The chapter concludes with Sect. 7.9 which discusses numerical computation of the roots of quadratic equations.

#### **7.1 Traditional Methods for Solving Quadratic Equations**

Every student of mathematics memorizes the formula for obtaining the roots of a quadratic equation $ax^2 + bx + c = 0$:

$$x\_1, x\_2 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

<sup>1</sup> Chapter 11 is a prerequisite for a full understanding of these methods.

For now we will work with monic polynomials, $x^2 + bx + c = 0$, whose roots are:

$$
x\_1, x\_2 = \frac{-b \pm \sqrt{b^2 - 4c}}{2} \,. \tag{7.1}
$$

Another method of solving quadratic equations is by factoring the polynomials more-or-less by trial-and-error. Sometimes it is easy to obtain the roots by factoring:

$$
x^2 - 4x + 3 = (x - 1)(x - 3) \,. \tag{7.2}
$$

It is much harder to factor $x^2 - 2x - 24$ because there are many possible pairs of roots that must be considered:

$$(\pm 1, \mp 24), \quad (\pm 2, \mp 12), \quad (\pm 3, \mp 8), \quad (\pm 4, \mp 6) \,.$$

#### **7.2 The Relation Between the Roots and the Coefficients**

**Theorem 7.1** *If $r\_1, r\_2$ are the roots of $x^2 + bx + c$ then:*

$$(x - r\_1)(x - r\_2) = x^2 - (r\_1 + r\_2)x + r\_1r\_2 = x^2 + bx + c \ .$$

*Therefore, even if we do not know the values of the roots, we do know that:*

$$r\_1 + r\_2 = -b \,, \qquad r\_1 r\_2 = c \,. \tag{7.3}$$

There is really nothing to prove because the result emerges from the computation. Consider some values of $-b, r\_1, r\_2$ and let $m\_{1,2}$ be the average of $r\_1, r\_2$:


For any quadratic equation the average of the two roots is constant:

$$m\_{1,2} = \frac{r\_1 + r\_2}{2} = \frac{(-b - r\_2) + r\_2}{2} = -\frac{b}{2} \,.$$

**Fig. 7.1** Relation between the roots $r\_1, r\_2 = 2, 6$ and their average $m\_{1,2} = 4$

Let $s$ be any number. Then:

$$-b = -b + s + (-s) = \left(\frac{-b}{2} + s\right) + \left(\frac{-b}{2} - s\right) = r\_1 + r\_2 \,.$$

If one root is at distance $s$ from the average, the other root is at distance $-s$ from the average. For $r\_1, r\_2 = 2, 6$, where $m\_{1,2} = 4$, $s = 2$, we have:


Figure 7.1 visualizes this relation. If we use other values $r\_1, r\_2 = 3, 5$ for which $r\_1 + r\_2 = 8$ then $m\_{1,2} = 4$ remains the same while $s$ becomes 1 (Fig. 7.2).

The offset $s$ seems to be arbitrary in:

$$r\_1 = \left(\frac{-b}{2} + s\right), \quad r\_2 = \left(\frac{-b}{2} - s\right),$$

but there is an additional constraint $r\_1 r\_2 = c$, where $c$ is the constant term in the polynomial. By multiplying the two expressions we have derived for $r\_1, r\_2$, we can determine $s$ and then $r\_1, r\_2$:

$$\begin{aligned} c &= \left(-\frac{b}{2} + s\right)\left(-\frac{b}{2} - s\right) = \frac{b^2}{4} - s^2 \\ s &= \frac{\sqrt{b^2 - 4c}}{2} \,. \end{aligned}$$

**Fig. 7.2** Relation between the roots $r\_1, r\_2 = 3, 5$ and their average $m\_{1,2} = 4$

#### **7.3 Examples of Loh's Method**

*Example 7.1* Consider the polynomial $x^2 - 2x - 24$ where $b = -2$, $c = -24$:

$$\begin{aligned} c &= \left(-\frac{(-2)}{2} + s\right) \left(-\frac{(-2)}{2} - s\right) \\ -24 &= (1 + s)(1 - s) \\ s &= 5 \\ r\_1 &= 1 + 5 = 6 \\ r\_2 &= 1 - 5 = -4 \end{aligned}$$

Check: $(x - 6)(x - (-4)) = x^2 - 2x - 24$.

*Example 7.2* Let us find the roots of $x^2 - 83x - 2310$:

$$\begin{aligned} -2310 &= \left(\frac{83}{2} + s\right)\left(\frac{83}{2} - s\right) \\ s^2 &= \frac{6889}{4} + 2310 = \frac{16129}{4} \\ s &= \frac{127}{2} \\ r\_1 &= \frac{83}{2} - \frac{127}{2} = -22 \\ r\_2 &= \frac{83}{2} + \frac{127}{2} = 105 \end{aligned}$$

Check: $(x + 22)(x - 105) = x^2 - 83x - 2310$.


Compare this computation with the computation using the traditional formula:

$$\begin{aligned} \frac{-b \pm \sqrt{b^2 - 4c}}{2} &= \frac{-(-83) \pm \sqrt{(-83)^2 - 4 \cdot (-2310)}}{2} \\ &= \frac{83 \pm \sqrt{16129}}{2} = \frac{83 \pm 127}{2} \\ r\_1 &= \frac{83 - 127}{2} = -22 \\ r\_2 &= \frac{83 + 127}{2} = 105 \text{.} \end{aligned}$$
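Loh's method is easy to turn into a short program. The sketch below is one possible transcription in Python, not taken from the text: the function name `loh_roots` is ours, and it assumes that the polynomial is monic and has real roots.

```python
from math import sqrt

def loh_roots(b: float, c: float) -> tuple:
    """Roots of the monic polynomial x^2 + bx + c by Loh's method.

    Assumes real roots, that is, b^2/4 - c >= 0.
    """
    m = -b / 2               # average of the two roots
    s = sqrt(m * m - c)      # offset, from c = (m + s)(m - s) = m^2 - s^2
    return m - s, m + s

print(loh_roots(-2, -24))    # Example 7.1
print(loh_roots(-83, -2310)) # Example 7.2
```

The two calls reproduce the roots $-4, 6$ and $-22, 105$ obtained in Examples 7.1 and 7.2.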

*Example 7.3* Theorem 7.1 can be generalized to polynomials of higher degrees. Here is an interesting example for a *quartic equation* $x^4 - 10x^2 - x + 20 = 0$. As with quadratic equations there are formulas for solving cubic and quartic equations (though not equations of higher powers), but the formulas are quite complicated.

Does this polynomial of degree four factor into two quadratic polynomials with integer coefficients? If so, the coefficients of the $x$ terms must be *equal and of opposite signs* since the coefficient of the $x^3$ term is zero. Therefore, the form of the quadratic factors is:

$$f(x) = (x^2 - nx + k\_1)(x^2 + nx + k\_2) \,.$$

Carrying out the multiplication results in:

$$\begin{aligned} f(\mathbf{x}) &= \mathbf{x}^4 \ + n\mathbf{x}^3 \ + k\_2\mathbf{x}^2 \\ &\quad -n\mathbf{x}^3 \ -n^2\mathbf{x}^2 \ -nk\_2\mathbf{x} \\ &\quad + k\_1\mathbf{x}^2 \ +nk\_1\mathbf{x} \ +k\_1k\_2\ \ . \end{aligned}$$

Equating the coefficients gives three equations in the three unknowns $n, k\_1, k\_2$:

$$\begin{aligned} \left(k\_1 + k\_2\right) - n^2 &= -10\\ n(k\_1 - k\_2) &= -1\\ k\_1 k\_2 &= 20 \end{aligned}$$

Since we are looking for factors with integer coefficients, from the last two equations it is clear that:

$$n=1, \ k\_1 = 4, \ k\_2 = 5 \qquad \text{or} \qquad n=1, \ k\_1 = -5, \ k\_2 = -4 \,.$$

Only $n = 1$, $k\_1 = -5$, $k\_2 = -4$ satisfy the first equation for the coefficient of $x^2$:

$$f(x) = (x^2 - x - 5)(x^2 + x - 4) \,.$$

Solving these quadratic equations gives four solutions of the quartic equation:

$$x = \frac{1 \pm \sqrt{21}}{2} \quad \text{or} \quad x = \frac{-1 \pm \sqrt{17}}{2} \,.$$
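The factorization in Example 7.3 can be confirmed by multiplying the two candidate pairs of quadratic factors. The following Python sketch uses a small helper of our own, `poly_mul`, which represents a polynomial as a list of coefficients with the highest degree first.

```python
def poly_mul(p: list, q: list) -> list:
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, p_coefficient in enumerate(p):
        for j, q_coefficient in enumerate(q):
            result[i + j] += p_coefficient * q_coefficient
    return result

# Candidate n = 1, k1 = -5, k2 = -4: (x^2 - x - 5)(x^2 + x - 4)
accepted = poly_mul([1, -1, -5], [1, 1, -4])

# Candidate n = 1, k1 = 4, k2 = 5: (x^2 - x + 4)(x^2 + x + 5)
rejected = poly_mul([1, -1, 4], [1, 1, 5])

print(accepted, rejected)
```

The first product has the coefficients $1, 0, -10, -1, 20$ of the quartic, while the second has $+8$ instead of $-10$ as the coefficient of $x^2$.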

#### **7.4 Derivation of the Traditional Formula**

For an arbitrary monic polynomial $x^2 + bx + c$, Loh's formulas are:

$$\begin{aligned} c &= r\_1 r\_2 = \left(\frac{-b}{2} + s\right) \left(\frac{-b}{2} - s\right) = \left(\frac{b^2}{4} - s^2\right) \\ s &= \sqrt{\left(\frac{b^2}{4}\right) - c} \\ r\_1, r\_2 &= \frac{-b}{2} \pm \sqrt{\left(\frac{b^2}{4}\right) - c} = \frac{-b \pm \sqrt{b^2 - 4c}}{2}, \end{aligned}$$

the traditional formula for obtaining the roots of a monic quadratic polynomial. If the polynomial is not monic, divide it by $a$, substitute in the equation and simplify:

$$\begin{aligned} x^2 + \frac{b}{a}x + \frac{c}{a} &= 0\\ r\_1, r\_2 &= \frac{-(b/a) \pm \sqrt{(b/a)^2 - 4(c/a)}}{2} \\ &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} .\end{aligned}$$

#### **7.5 Al-Khwarizmi's Geometric Solution of Quadratic Equations**

Let us write a monic quadratic polynomial as $x^2 + bx - c$. The roots can be found by *completing the square*:

$$\begin{aligned} \left(\mathbf{x}\right)^2 + 2\left(\frac{b}{2}\right)\mathbf{x} + \left(\frac{b}{2}\right)^2 &= c + \left(\frac{b}{2}\right)^2\\ \left(\mathbf{x} + \frac{b}{2}\right)^2 &= c + \left(\frac{b}{2}\right)^2\\ \mathbf{x} &= -\frac{b}{2} \pm \sqrt{c + \left(\frac{b}{2}\right)^2} = \frac{-b \pm \sqrt{b^2 + 4c}}{2} \end{aligned}$$

This is the familiar formula for finding the roots of a quadratic equation, except that $4c$ has the opposite sign since the coefficient of the constant term was $-c$.

Completing the square was developed in the 9th century by Muhammad ibn Musa al-Khwarizmi in a geometric context. Given the equation $x^2 + bx = c$, assume that there is a square whose side is $x$ so that its area is $x^2$. To the area $x^2$ add $bx$ by adding four rectangles of area $bx/4$ whose sides are $b/4$ and $x$ (Fig. 7.3a). Now complete the diagram to a square by adding the four little squares of area $(b/4)^2$ (Fig. 7.3b).

We can't construct the diagram in Fig. 7.3a because we don't know what $x$ is, but the area of the larger square in Fig. 7.3b is:

$$x^2 + bx + \frac{b^2}{4} = c + \frac{b^2}{4},$$

which we do know since we are given the coefficients $b, c$. By constructing the diagram and erasing the small squares whose sides are $b/4$ (another known quantity) we obtain the line segment of length $x$.

*Example 7.4* Let $x^2 + 12x = 64$. Then $c + (b^2/4) = 64 + 36 = 100$. It is easy to construct a square of area 100 since each side has length 10. Now subtract $(b/4) + (b/4) = 6$, the sides of the smaller squares, to get $x = 10 - 6 = 4$.
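The arithmetic behind the construction can be sketched in a few lines of Python. The function name `al_khwarizmi_root` is ours, and the sketch returns only the positive root that corresponds to a line segment, assuming that $c + b^2/4 \ge 0$.

```python
from math import sqrt

def al_khwarizmi_root(b: float, c: float) -> float:
    """Positive root of x^2 + bx = c, following the geometric construction."""
    big_square_area = c + b * b / 4          # area of the completed square
    big_square_side = sqrt(big_square_area)  # side of the completed square
    return big_square_side - 2 * (b / 4)     # remove the two small-square sides

print(al_khwarizmi_root(12, 64))
```

For Example 7.4 the completed square has side 10 and the function returns 4.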

#### **7.6 Cardano's Construction for Solving Cubic Equations**

The formula for the roots of cubic equations was first published in the 16th century by Gerolamo Cardano. We will not develop the formula here, but it is interesting that the central idea is based on a geometric construction similar to al-Khwarizmi's.

**Fig. 7.3a** The area is $x^2 + 4(b/4)x = x^2 + bx$

**Fig. 7.3b** The area is $x^2 + 4(b/4)x + 4(b/4)^2 = x^2 + bx + (b^2/4)$

The construction can be obtained very simply using algebra. By multiplication:

$$(a+b)^3 = a^3 + 3a^2b + 3ab^2 + b^3 = (a^3+b^3) + 3ab(a+b)\,\,. \tag{7.4}$$

Geometrically, we start with a cube whose side is $a + b$ so that its volume is $(a + b)^3$. The cube is decomposed into five pieces. The first two are cubes whose sides are $a$ and $b$ with volumes $a^3$ (blue) and $b^3$ (red), respectively (Fig. 7.4).

The other three parts are boxes (the technical term is *cuboid*) each with one side of length $a + b$ coinciding with a side of the cube, one side of length $a$ and one side of length $b$, so that the volume of each of the three boxes is $ab(a + b)$. In Fig. 7.5, there is one box at the left side of the cube (blue), one at the back of the cube (red) and one at the top of the cube (green). By combining the five solids in Fig. 7.4 and Fig. 7.5 we obtain Eq. 7.4.

**Fig. 7.4** $(a + b)^3 = (a^3 + b^3) + \cdots$

**Fig. 7.5** $(a + b)^3 = \cdots + 3ab(a + b)$

#### **7.7 They Weren't Intimidated by Imaginary Numbers**

The history of mathematics demonstrates a progression of concepts that were initially considered to be meaningless, but were eventually understood, accepted and proved to be useful. "Obviously," since numbers count things, $-1$, a negative number, is meaningless. "Obviously," since numbers are ratios of integers (rational numbers), $\sqrt{2}$, which can easily be proved to be irrational, is meaningless. "Obviously," $\sqrt{-1}$, the square root of a negative number, is meaningless since there is no number (integer, rational or real) whose square is $-1$.

A full understanding of the square roots of negative numbers, to this day called *imaginary numbers* although they are no less real than real numbers, was not achieved until the nineteenth century. Therefore, it is surprising that already in the sixteenth century, Gerolamo Cardano and Rafael Bombelli refused to be intimidated by the concept, and took the first small steps towards understanding these numbers.

Consider the quadratic equation:

$$
x^2 - 10x + 40 = 0 \,. \tag{7.5}
$$

By the familiar formula (Eq. 7.1):

$$r\_1, r\_2 = \frac{10 \pm \sqrt{100 - 160}}{2} = 5 \pm \sqrt{-15} \,.$$

Well, we don't know anything about the square roots of negative numbers and we don't know what these values are, but like Cardano we do know by Thm. 7.1 that:

$$\begin{array}{l}r\_1 + r\_2 = (5 + \sqrt{-15}) + (5 - \sqrt{-15}) = 10 = -b\\r\_1r\_2 = (5 + \sqrt{-15})(5 - \sqrt{-15}) = 25 - 5\sqrt{-15} + 5\sqrt{-15} - (-15) = 40 = c\ .\end{array}$$

which correspond with the coefficients of the quadratic equation Eq. 7.5. It is rather intuitive that $\sqrt{-15} + (-\sqrt{-15}) = 0$ even if we know nothing about $\sqrt{-15}$, and, similarly, it is rather intuitive that $\sqrt{-15} \cdot -(\sqrt{-15}) = -(-15) = 15$ even if we don't know what $\sqrt{-15}$ is.

Consider now the cubic equation:

$$x^3 - 15x - 4 = 0 \,. \tag{7.6}$$

It is not hard to observe that 4 is a root, but how can it be computed? Cardano's formula gives the root:

$$r = \sqrt[3]{2 + 11\sqrt{-1}} + \sqrt[3]{2 - 11\sqrt{-1}},\tag{7.7}$$

a quite complicated formula that bears no obvious relation to 4.

Bombelli courageously performed the following computation (see Eq. 7.4):

$$\begin{aligned} (2+\sqrt{-1})^3 &= 8+3\cdot 4\sqrt{-1}+3\cdot 2(-1)+(-1\sqrt{-1}) = 2+11\sqrt{-1} \\ (2-\sqrt{-1})^3 &= 8-3\cdot 4\sqrt{-1}+3\cdot 2(-1)-(-1\sqrt{-1}) = 2-11\sqrt{-1} \end{aligned}$$

and by Eq. 7.7:

$$\begin{aligned} r &= \sqrt[3]{2 + 11\sqrt{-1}} + \sqrt[3]{2 - 11\sqrt{-1}} \\ &= \sqrt[3]{(2 + \sqrt{-1})^3} + \sqrt[3]{(2 - \sqrt{-1})^3} \\ &= (2 + \sqrt{-1}) + (2 - \sqrt{-1}) = 4 \end{aligned}$$
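Bombelli's computation can be repeated with the complex numbers that are built into Python. This is only a modern check of the arithmetic, not a reconstruction of his reasoning; the variable names are ours, and the cubes are computed by explicit multiplication so that the integer arithmetic stays exact.

```python
i = complex(0, 1)                  # Python's representation of sqrt(-1)

cube_plus = (2 + i) * (2 + i) * (2 + i)    # (2 + sqrt(-1))^3
cube_minus = (2 - i) * (2 - i) * (2 - i)   # (2 - sqrt(-1))^3
print(cube_plus, cube_minus)

r = (2 + i) + (2 - i)              # the sum of the two cube roots in Eq. 7.7
residual = r * r * r - 15 * r - 4  # value of the cubic of Eq. 7.6 at r
print(r, residual)
```

The two cubes are $2 + 11\sqrt{-1}$ and $2 - 11\sqrt{-1}$, their cube roots sum to 4, and the cubic evaluates to zero at that value.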

**Fig. 7.6** Lill's method on $x^2 - 4x + 3$

#### **7.8 Lill's Method and Carlyle's Circle**

Lill's method can be applied to solve quadratic equations.<sup>2</sup> As an example we use Eq. 7.2 which gives the roots of a quadratic equation obtained by factorization:

$$ax^2 + bx + c = x^2 - 4x + 3 = (x - 1)(x - 3) \,.$$

Applying Lill's method results in the paths shown in Fig. 7.6.

Check that the angles are correct:

$$-\tan(-45^\circ) = 1, \quad -\tan(-71.57^\circ) \approx 3 \,.$$

For quadratic equations we can find the two points as the intersections of the line representing the coefficient $b$ and the circle whose diameter is the line connecting the starting point and the end point of the paths (Fig. 7.7). In order for a point on the line $b$ to be a root, the reflection of the line must be 90° and therefore the inscribed angle is subtended by a diameter.

This can also be checked by computation. The center of the circle is the midpoint of the diameter $(-1, -2)$. The length of the diameter is:

$$
\sqrt{(-2)^2 + (-4)^2} = \sqrt{20},
$$

<sup>2</sup> This section assumes that you have read about Lill's method in Chap. 11.

**Fig. 7.7** Constructing a circle to find the roots

so the square of the length of the radius is $\left(\sqrt{20}/2\right)^2 = 5$. We need the intersection of this circle and the line $x = 1$:

$$\begin{aligned} \left(x - (-1)\right)^2 + \left(y - (-2)\right)^2 &= r^2\\ \left(x^2 + 2x + 1\right) + \left(y^2 + 4y + 4\right) &= 5 \\ y^2 + 4y + 3 &= 0 \\ y &= -1, \ -3 \,. \end{aligned}$$

A similar method for solving quadratic equations is the Carlyle circle which predates Lill's method. Given a quadratic equation $x^2 - bx + c$ (note the minus sign on the linear term), construct points at $(0, 1)$ and $(b, c)$. Construct a circle whose diameter is the line connecting the two points (Fig. 7.8). Its intersections (if any) with the $x$-axis are the roots of the equation.

In the general case, the center of the circle is $(b/2, (c - (-1))/2)$ and the length of the diameter is $\sqrt{b^2 + (c - 1)^2}$, so the equation of the circle is:

$$
\left(x - \frac{b}{2}\right)^2 + \left(y - \frac{c+1}{2}\right)^2 = \frac{b^2 + (c-1)^2}{4} \,.
$$

For the example, substituting $b = 4$, $c = 3$ and $y = 0$, we see that $x = 1$ and $x = 3$ are the roots of the quadratic equation.
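The substitution $y = 0$ into the equation of the Carlyle circle can be carried out for arbitrary coefficients. The Python sketch below, with a function name of our own choosing, computes the intersections with the $x$-axis and assumes that the circle actually meets the axis, that is, that the roots are real.

```python
from math import sqrt

def carlyle_roots(b: float, c: float) -> tuple:
    """Roots of x^2 - bx + c as the x-intercepts of the Carlyle circle.

    The circle has the segment from (0, 1) to (b, c) as a diameter.
    Assumes the circle meets the x-axis (real roots).
    """
    center_x, center_y = b / 2, (c + 1) / 2
    radius_squared = (b * b + (c - 1) ** 2) / 4
    # Set y = 0 in (x - center_x)^2 + (y - center_y)^2 = radius_squared.
    half_chord = sqrt(radius_squared - center_y ** 2)
    return center_x - half_chord, center_x + half_chord

print(carlyle_roots(4, 3))
```

For $b = 4$, $c = 3$ the center is $(2, 2)$, the squared radius is 5, and the intercepts are $x = 1$ and $x = 3$.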

**Fig. 7.8** Carlyle circle for $x^2 - 4x + 3$

#### **7.9 Numerical Computation of the Roots**

Students learn symbolic computation of roots, derivatives and so on. Today, most computation is performed by computers so symbolic computation is less important. *Numerical analysis* is the branch of mathematics and computer science that develops accurate and efficient computational methods. The main challenge is to deal with the finiteness of values stored in the computer's memory. The computation:

$$0.12 \times 0.14 = 0.0168$$

is easy to do, but:

$$0.123456789 \times 0.123456789$$

needs eighteen digits to be accurately represented and this cannot be done in a memory word that stores sixteen digits. This error is called a *round-off error*.

An even more serious problem is encountered when *floating-point arithmetic* is performed. Clearly:

$$(0.12 \times 10^{-10}) \times (0.14 \times 10^{-8})$$

would not be computed by writing out all the zero digits. Instead, we multiply the mantissas and add the exponents to obtain $0.0168 \times 10^{-18}$, which is normalized to $0.168 \times 10^{-19}$ so that the most significant digit appears after the decimal point, ensuring maximum precision given the fixed size of the mantissa. If the smallest exponent that can be represented is $-16$, the result cannot be stored at all. This error is called *floating-point underflow*.

The formula for finding the roots of the quadratic equation $x^2 + bx + c$ is:

$$r\_1, r\_2 = \frac{-b \pm \sqrt{b^2 - 4c}}{2} \,. \tag{7.8}$$

Consider what happens if $b = 1000$ and $c = 4$. The roots are:

$$r\_1, r\_2 = \frac{-1000 \pm \sqrt{1000000 - 16}}{2}.$$

Depending on the precision of the arithmetic, it is possible that one of the roots is so close to zero that the value stored is zero. Evaluating the quadratic equation gives the surprising result $0^2 + b \cdot 0 + 4 = 4 = 0$.

Can we do better? By Eq. 7.3:

$$r\_1 + r\_2 = -b \ , \qquad r\_1 r\_2 = c \ . $$

If $r\_2$ is much less than $r\_1$ in absolute value, written $r\_2 \ll r\_1$, then $r\_1 \approx -b$ and $r\_2 = c/r\_1$. Table 7.1, computed by a computer program, compares the values of the roots computed by these formulas with the values obtained from the traditional formula Eq. 7.8. The value of $c$ is fixed at 4 and the roots for increasing values of $b$ are shown.

Initially, the true values computed by the traditional formula for $r\_2$ are more accurate (the difference between the two computed values of $r\_2$ is negative) but from $b = 100000$, the computation based upon Eq. 7.3 is more accurate. Such are the surprises of numerical analysis.

**Table 7.1** Two computations of the roots of a quadratic equation: one pair of roots $r\_1, r\_2$ is computed by Eq. 7.8 and the other pair is computed using Eq. 7.3. The errors are the differences between the corresponding values. The values are truncated to four decimal places. Floating-point numbers are written `-4e-5` in place of $-4 \times 10^{-5}$ because computer programs are normally written as linear sequences of characters.
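A comparison of this kind can be reproduced with a short program. The following Python sketch is our own and is not the program used for Table 7.1: the function names and the range of exponents are assumptions, and the numbers printed depend on the floating-point format of the machine (IEEE double precision on most platforms).

```python
from math import sqrt

def roots_traditional(b: float, c: float) -> tuple:
    """Roots of x^2 + bx + c by Eq. 7.8 (assumes b^2 - 4c >= 0)."""
    d = sqrt(b * b - 4 * c)
    return (-b - d) / 2, (-b + d) / 2   # for b > 0 the second root is the small one

def roots_from_sum_and_product(b: float, c: float) -> tuple:
    """Approximate roots from Eq. 7.3, assuming |r2| is much smaller than |r1|."""
    r1 = -b        # r1 + r2 = -b with r2 negligible
    r2 = c / r1    # r1 * r2 = c
    return r1, r2

c = 4.0
for exponent in range(2, 10):
    b = 10.0 ** exponent
    print(b, roots_traditional(b, c)[1], roots_from_sum_and_product(b, c)[1])
```

In double precision, $b^2 - 4c$ rounds to $b^2$ once $b$ reaches $10^9$, so the traditional formula returns exactly zero for the small root, while the computation based on Eq. 7.3 still returns a value close to $-c/b$.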


#### **What Is the Surprise?**

Po-Shen Loh's approach provides a new way of looking at the relation between the coefficients and the roots that one doesn't see simply by memorizing the traditional formula. What is surprising is that this relation is fundamental in Gauss's algebraic proof of the constructibility of a regular heptadecagon (Chap. 16).

With the modern dominance of algebraic methods in geometry it is important to be reminded that the reverse once held. As shown by the constructions of Al-Khwarizmi and Cardano, geometric methods were used to obtain results in algebra. Lill and Carlyle both developed geometric methods for solving quadratic equations. Considerations of numerical computation on computers will surprise students who have not experienced it before.

#### **Sources**

Po-Shen Loh's method is from [28, 29]. Al-Khwarizmi's construction is from [6, Chapter 1] and [32]. Cardano's construction can be found in [6, Chap. 1]. For the colorful history of the development of Cardano's formula see [52]. The early attempts at computing with imaginary numbers are from [6, Chapter 2]. Lill's method and Carlyle's circle can be found in [61] together with a discussion of numerical computation of the roots.


# **Chapter 8 Ramsey Theory**

Ramsey theory is a branch of combinatorics that asks questions of the form: How large must a set be so that if it is divided into subsets, at least one subset has a certain property? Results in Ramsey theory are difficult to prove and there remain many open problems. In this chapter we present simple cases of four problems to give a taste of this fascinating subject: Schur triples (Sect. 8.1), which are triples of integers such that $a + b = c$; Pythagorean triples (Sect. 8.2), which are triples of integers such that $a^2 + b^2 = c^2$; van der Waerden's problem (Sect. 8.3), which concerns sequences of numbers; and Ramsey's theorem (Sect. 8.4) on coloring graphs. Section 8.5 shows how the probabilistic method in combinatorics can be used to develop a lower bound for Ramsey numbers.

The Pythagorean triples problem was recently solved with the aid of computers, using a relatively new method called SAT solving. For readers familiar with propositional logic, Sect. 8.6 gives an overview of how this is done.

Section 8.7 describes Pythagorean triples as known to the Babylonians four thousand years ago.

Terminology: *monochromatic* means *of the same color*.

#### **8.1 Schur triples**

**Definition 8.1** Given *any* decomposition of the set of positive integers:

$$S(n) = \{1, \ldots, n\}$$

into two disjoint subsets $S\_1$, $S\_2$, do there exist $\{a, b, c\} \subseteq S\_1$ or $\{a, b, c\} \subseteq S\_2$ (or both) such that $a < b < c$ and $a + b = c$? If so, the set $\{a, b, c\}$ is called a *Schur triple*.

*Example 8.1* For $n = 8$, in the decomposition:

$$\mathcal{S}\_1 = \{1, 2, 3, 4\}, \ \mathcal{S}\_2 = \{5, 6, 7, 8\}\,,\tag{8.1}$$

the set $\mathcal{S}\_1$ includes the Schur triple $\{1, 2, 3\}$. However, the decomposition:

$$\mathcal{S}'\_1 = \{1, 2, 4, 8\}, \; \mathcal{S}'\_2 = \{3, 5, 6, 7\}, \; \tag{8.2}$$

does not contain a Schur triple, as you can check by enumerating all the triples in each subset.

**Theorem 8.1** *In* all *decompositions of* $S(9) = \{1, \ldots, 9\}$ *into two disjoint subsets, at least one subset contains a Schur triple.*

Of course we could check the $2^9 = 512$ decompositions of $S(9)$ into two disjoint subsets, but let us try to come up with a more succinct proof.

*Proof* We try to construct a decomposition that *does not* contain a Schur triple and show that the constraints of the problem make this impossible. Start by placing 1 and 3 into the subset $S\_1$. 2 must be placed in $S\_2$ because $1 + 2 = 3$ and we are trying to construct a decomposition that does not contain a Schur triple. Similarly, 4 must be placed in $S\_2$ because $1 + 3 = 4$. Continuing, 6 is placed in $S\_1$ because $2 + 4 = 6$ and 7 is placed in $S\_2$ because $1 + 6 = 7$. However, $3 + 6 = 9$ and $2 + 7 = 9$, so 9 must appear in both $S\_1$ and $S\_2$, a contradiction. The sequence of inferences is shown in the following table:


Backtracking, we search for a decomposition where 1, 3 are in different subsets. If we place 5 into $S\_2$, a sequence of inferences again leads to a contradiction because 9 must appear in both subsets. The reader should justify each of the inferences shown in the following table:


Backtracking again, we try to place 5 into $S\_1$, but that also leads to a contradiction, as shown in the following table:


It follows that there is no decomposition that does not include a Schur triple. □
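The exhaustive check that the proof avoids is small enough to run directly. The following sketch enumerates every decomposition of $S(n)$ into two subsets; the function names are illustrative and not from the text.

```python
from itertools import product


def has_schur_triple(subset: set[int]) -> bool:
    """True if the subset contains a < b < c with a + b = c."""
    return any(a < b and a + b in subset for a in subset for b in subset)


def schur_free_decompositions(n: int) -> list[tuple[set[int], set[int]]]:
    """All decompositions of S(n) in which neither subset has a Schur triple."""
    result = []
    for labels in product((1, 2), repeat=n):
        s1 = {i + 1 for i, label in enumerate(labels) if label == 1}
        s2 = {i + 1 for i, label in enumerate(labels) if label == 2}
        if not has_schur_triple(s1) and not has_schur_triple(s2):
            result.append((s1, s2))
    return result


print(schur_free_decompositions(8))  # decompositions of S(8) with no Schur triple
print(schur_free_decompositions(9))  # empty, in agreement with Theorem 8.1
```

Because the two subsets are labeled, each decomposition of $S(8)$ is reported twice, once for each assignment of the names $S\_1$ and $S\_2$.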

Issai Schur proved the following theorem:

**Theorem 8.2 (Schur)** *For every* $k \ge 2$ *there is a smallest* $n$ *such that in any disjoint decomposition of* $S(n)$ *into* $k$ *subsets, at least one of the subsets must contain a Schur triple.*

#### **8.2 Pythagorean Triples**

**Definition 8.2** Given *any* decomposition of the set of positive integers:

$$S(n) = \{1, \ldots, n\}$$

into two disjoint subsets $S\_1$, $S\_2$, do there exist $\{a, b, c\} \subseteq S\_1$ or $\{a, b, c\} \subseteq S\_2$ (or both) such that $a < b < c$ and $a^2 + b^2 = c^2$? If so, $\{a, b, c\}$ is called a *Pythagorean triple*.

*Example 8.2* For $n = 10$, in the decomposition into even and odd numbers:

$$S\_1 = \{1, 3, 5, 7, 9\}, \ S\_2 = \{2, 4, 6, 8, 10\}\,,$$

there are no Pythagorean triples in $S\_1$ but $\{6, 8, 10\}$ in $S\_2$ is a Pythagorean triple since $6^2 + 8^2 = 10^2$.
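The example can be checked mechanically. This is a small sketch with an illustrative function name; it lists the Pythagorean triples contained in one subset.

```python
def pythagorean_triples(subset: set[int]) -> list[tuple[int, int, int]]:
    """All (a, b, c) in the subset with a < b < c and a^2 + b^2 = c^2."""
    numbers = sorted(subset)
    return [
        (a, b, c)
        for a in numbers
        for b in numbers
        for c in numbers
        if a < b < c and a * a + b * b == c * c
    ]


odds = {1, 3, 5, 7, 9}
evens = {2, 4, 6, 8, 10}
print(pythagorean_triples(odds))   # no triple among the odd numbers
print(pythagorean_triples(evens))  # the triple 6, 8, 10
```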

Marijn J.H. Heule and Oliver Kullmann proved the following theorems. Their method of proof is discussed in Sect. 8.6.

**Theorem 8.3** *For all* $n \le 7824$*, there is* some *decomposition of* $S(n)$ *into two disjoint subsets such that both subsets* do not contain *a Pythagorean triple.*

**Theorem 8.4** *For all* $n \ge 7825$*, in* all *decompositions of* $S(n)$ *into two disjoint subsets at least one subset* contains *a Pythagorean triple.*

It is impossible to check all $2^{7825}$ decompositions of $S(7825)$. If we could check one decomposition every microsecond, $2^{7825}$ microseconds $\approx 10^{2342}$ years (since $2^{7825} \approx 10^{2355}$ and a year is about $3 \times 10^{13}$ microseconds), while the estimated age of the universe is only about $10^{10}$ years.

#### **8.3 Van der Waerden's problem**

Consider the sequences of eight colored dots in Fig. 8.1. In the top sequence there are red dots at positions $(1, 2, 3)$ and blue dots at positions $(4, 5, 6)$. In each case, the positions form an arithmetic progression. Similarly, in the middle sequence the red dots at positions $(1, 3, 5)$ form an arithmetic progression. However, in the bottom sequence there is no set of three monochromatic dots whose positions form an arithmetic progression. Triples of red dots are at positions $(1, 2, 5)$, $(1, 2, 6)$, $(1, 5, 6)$, $(2, 5, 6)$, none of which form arithmetic progressions, and similarly for the blue dots.

**Fig. 8.1** van der Waerden's problem for eight colored dots

With nine dots *any* coloring *must* contain a sequence of three monochromatic dots that form an arithmetic progression. For example, let us add a red dot or a blue dot at the end of the bottom sequence in Fig. 8.1 to obtain the sequences in Fig. 8.2. In the top sequence there are red dots at positions $(1, 5, 9)$, an arithmetic progression, and in the bottom sequence there are blue dots at positions $(7, 8, 9)$, also an arithmetic progression.

Bartel L. van der Waerden posed the following problem: For any positive integer $k$, what is the smallest number $n$ such that *any* sequence of $n$ colored dots *must* contain a sequence of $k$ monochromatic dots that form an arithmetic progression? For $k = 3$, $n = 9$, as demonstrated above for one coloring. The next result is more difficult to show: for $k = 4$, $n = 35$.

**Fig. 8.2** van der Waerden's problem for nine colored dots
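For three dots in progression the claim can be verified by enumerating all colorings. In the sketch below the function names are illustrative, and the two colors are encoded as 0 and 1, which is an assumption of the sketch.

```python
from itertools import product


def has_monochromatic_progression(coloring: tuple[int, ...], k: int) -> bool:
    """coloring[i] is the color of the dot at position i + 1."""
    n = len(coloring)
    for start in range(n):
        for step in range(1, n):
            if start + (k - 1) * step >= n:
                break  # the progression would run past the last dot
            if all(coloring[start + j * step] == coloring[start] for j in range(k)):
                return True
    return False


def every_coloring_has_progression(n: int, k: int) -> bool:
    """True if all 2^n colorings of n dots contain a monochromatic k-progression."""
    return all(
        has_monochromatic_progression(coloring, k)
        for coloring in product((0, 1), repeat=n)
    )


print(every_coloring_has_progression(8, 3))  # some coloring of 8 dots escapes
print(every_coloring_has_progression(9, 3))  # no coloring of 9 dots does
```

The same function could in principle be called with $k = 4$, but $2^{35}$ colorings are too many for this naive enumeration.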

#### **8.4 Ramsey's Theorem**

Color the edges of $K\_5$, the complete graph on 5 vertices, with two colors as shown in Fig. 8.3a. There are no monochromatic subgraphs $K\_3$ (triangles) in the graph. Figure 8.3b shows one coloring of $K\_6$ and it is easy to see that it contains two monochromatic triangles. In this section we prove a simple case of a theorem by Frank P. Ramsey on the existence of subsets with a certain property.

**Fig. 8.3a** A coloring of $K\_5$ with two colors **Fig. 8.3b** A coloring of $K\_6$ with two colors

**Definition 8.3** $R(k)$, the *Ramsey number* for $k$, is the smallest number $n$ such that in *any* coloring of $K\_n$, the complete graph on $n$ vertices, with two colors there is a monochromatic complete subgraph $K\_k$.

**Theorem 8.5 (Ramsey)** $R(3) = 6$*.*

*Proof* Figure 8.3a shows that $R(3) > 5$. To show that $R(3) \le 6$, consider any vertex $v$ in $K\_6$. $v$ is connected to five other vertices, and when the edges are colored with two colors there must be at least three monochromatic edges incident with $v$.

**Fig. 8.4a** One vertex of $K\_6$ **Fig. 8.4b** Monochromatic triangles in $K\_6$

In Fig. 8.4a, three of the edges incident with the chosen vertex are colored red. Since the graph is complete all the vertices are connected, so if any one of the three edges joining the other endpoints of these red edges is also colored red, a red triangle is formed. Otherwise, all three of these edges are colored blue and they form a blue triangle (Fig. 8.4b). □
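The theorem is small enough to confirm by enumeration: $K\_6$ has 15 edges and therefore $2^{15} = 32768$ colorings. The sketch below uses illustrative function names, numbers the vertices from 0 and encodes the colors as 0 and 1.

```python
from itertools import combinations, product


def has_monochromatic_triangle(n: int, color: dict[tuple[int, int], int]) -> bool:
    """color maps each edge (a, b) with a < b to one of two colors."""
    return any(
        color[(a, b)] == color[(a, c)] == color[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_coloring_has_triangle(n: int) -> bool:
    """True if every 2-coloring of the edges of K_n has a monochromatic triangle."""
    edges = list(combinations(range(n), 2))
    return all(
        has_monochromatic_triangle(n, dict(zip(edges, colors)))
        for colors in product((0, 1), repeat=len(edges))
    )


print(every_coloring_has_triangle(5))  # False, so R(3) > 5
print(every_coloring_has_triangle(6))  # True, so R(3) <= 6
```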

The theorem can be generalized to any number of colors, as well as to colorings where the sizes of the subgraphs are not the same. $R(r, b, g)$ is the size of the smallest complete graph such that in any coloring with three colors there must be a complete subgraph $K\_r$ with red edges, a complete subgraph $K\_b$ with blue edges or a complete subgraph $K\_g$ with green edges.

#### **8.5 The Probabilistic Method**

The only known non-trivial Ramsey numbers are $R(3) = 6$ and $R(4) = 18$. In 1947 Paul Erdős developed the *probabilistic method* and used it to show lower and upper bounds on $R(k)$. Subsequent research has improved both bounds, but this is still a significant research area since the bounds are not tight. For example, it has been proved that $43 \le R(5) \le 48$ and $798 \le R(10) \le 23556$. In this section elementary probability is used to obtain a lower bound on $R(k)$.

To show that there exists an element of a set that has a given property, prove that the probability that a *random* element of the set has the property is greater than zero. It is important to understand that the method is *non-constructive*: it just proves that such an element exists but does not construct one. Although from Thm. 8.5 we know that $R(3) = 6$, let us use the probabilistic method to obtain a lower bound for $R(3)$.

**Theorem 8.6 (Erdős)** $R(3) > 4$*.*

*Proof* Given a *random* coloring of $K\_n$ by the two colors red and blue, consider an arbitrary subgraph $K\_3$, that is, an arbitrary triangle with $\binom{3}{2} = 3$ sides. The probability that all sides are colored red is $2^{-3}$, as is the probability that all sides are colored blue, so the probability that the triangle is monochromatic is $2^{-3} + 2^{-3} = 2^{-2} = 1/4$. The number of triangles in $K\_n$ is $\binom{n}{3}$, so $P(n, 3)$, the probability that *some* triangle contained in a random coloring of $K\_n$ is monochromatic, is:

$$P(n, 3) = \binom{n}{3} \cdot \frac{1}{4} \,.$$

If $P(n, 3) < 1$ then its complement $\overline{P}(n, 3) = 1 - P(n, 3) > 0$, that is, the probability that a random coloring of $K\_n$ does *not* contain a monochromatic triangle is greater than zero, so at least one must exist.

The following table shows $\overline{P}(n, 3)$ for several values of $n$, and whether the value of $\overline{P}(n, 3)$ proves that there exists a coloring with no monochromatic triangle:


At first glance the result is strange because Fig. 8.3a shows that there exists a coloring of $K\_5$ with no monochromatic triangle. However, the probabilistic criterion is sufficient but not necessary; it is a lower bound, meaning that $R(3) > 4$ which is true because Thm. 8.5 showed that $R(3) = 6$.

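The same proof works for arbitrary $k$. A complete subgraph $K\_k$ has $\binom{k}{2}$ edges and there are $\binom{n}{k}$ such subgraphs in $K\_n$, so the probability that *some* $K\_k$ in a random coloring of $K\_n$ is monochromatic is:

$$P(n,k) = \binom{n}{k} \cdot 2 \cdot 2^{-\binom{k}{2}} \,.$$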

For $k = 4$:

$$\begin{aligned} \overline{P}(n,4) &= 1 - \binom{n}{4} \cdot 2^{-5} = \left(32 - \binom{n}{4}\right) / \, 32 \\\ \overline{P}(6,4) &= (32 - 15)/32 = 17/32 \\\ \overline{P}(7,4) &= (32 - 35)/32 = -3/32 \, . \end{aligned}$$

It follows that $R(4) > 6$ which is much less than the known value $R(4) = 18$.

□
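The arithmetic for $k = 4$ can be reproduced with exact fractions. This is a minimal sketch with illustrative function names; it evaluates the expression $\binom{n}{k} \cdot 2 \cdot 2^{-\binom{k}{2}}$ and its complement, and assumes $k \ge 3$.

```python
from fractions import Fraction
from math import comb


def p_some_monochromatic(n: int, k: int) -> Fraction:
    """The value C(n, k) * 2 * 2^(-C(k, 2)) used in the proof."""
    return Fraction(2 * comb(n, k), 2 ** comb(k, 2))


def p_bar(n: int, k: int) -> Fraction:
    """Complement 1 - P(n, k); a positive value proves a good coloring exists."""
    return 1 - p_some_monochromatic(n, k)


def largest_n_with_positive_complement(k: int) -> int:
    """Largest n (starting the search at n = k) for which p_bar(n, k) > 0."""
    n = k
    while p_bar(n + 1, k) > 0:
        n += 1
    return n


for n in (4, 5, 6, 7):
    print(n, p_bar(n, 4))
```

The loop prints the values of $\overline{P}(n, 4)$ for $n$ from 4 to 7; the last positive value occurs at $n = 6$.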

#### **8.6 SAT Solving**

SAT solving is a method for solving problems that works by encoding a problem as a formula in propositional logic and then using a computer program to check the truth value of the formula. Advances in algorithms and implementations have made SAT solving a viable approach for problem solving. We give an overview of SAT solving and explain how it can be used to solve the mathematical problems described in this chapter. The reader is assumed to have an elementary knowledge of propositional logic as summarized in Def. 8.4.

#### **8.6.1 Propositional Logic and the SAT Problem**

**Definition 8.4**


The following formula is in CNF:

$$(\neg p \lor q \lor \neg r) \land (\neg p \lor r) \land (\neg r) \land (p \lor q \lor \neg r) \,.$$

The *SAT problem* is to decide if a given formula in CNF is satisfiable or not. A *SAT solver* is a computer program that can solve the SAT problem. Most SAT solvers are based on the DPLL algorithm which goes back to the 1960s, but recent developments have made very significant improvements to the algorithm. Highly optimized implementations of these algorithms have made SAT solvers an important tool for solving problems in many fields including mathematics.

#### **8.6.2 Schur triples**

Let us encode the Schur triples problem $S(8)$ as a formula in CNF. The formula will be satisfiable if and only if there is a decomposition of $S(8)$ into disjoint subsets $S\_1$, $S\_2$ such that neither $S\_1$ nor $S\_2$ contains a Schur triple. There is an atom $p\_i$ for each of the numbers $1 \le i \le 8$. The intended meaning of an interpretation for the formula is that it assigns $T$ to $p\_i$ if $i$ is in the first subset $S\_1$ and it assigns $F$ to $p\_i$ if $i$ is in the second subset $S\_2$. For neither subset to contain a Schur triple, the interpretation must ensure that for each possible Schur triple at least one atom is assigned $T$ and at least one atom is assigned $F$.

For example, $\{2, 4, 6\}$ is a Schur triple so at least one of the three integers must be in $S\_1$ and at least one of them must be in $S\_2$. Therefore, $p\_2 \lor p\_4 \lor p\_6$ must be true and also $\neg p\_2 \lor \neg p\_4 \lor \neg p\_6$ must be true. There are 12 possible Schur triples so the CNF formula is:

$$\begin{array}{l}
(p\_1 \lor p\_2 \lor p\_3) \land (\neg p\_1 \lor \neg p\_2 \lor \neg p\_3) \land \\
(p\_1 \lor p\_3 \lor p\_4) \land (\neg p\_1 \lor \neg p\_3 \lor \neg p\_4) \land \\
(p\_1 \lor p\_4 \lor p\_5) \land (\neg p\_1 \lor \neg p\_4 \lor \neg p\_5) \land \\
(p\_1 \lor p\_5 \lor p\_6) \land (\neg p\_1 \lor \neg p\_5 \lor \neg p\_6) \land \\
(p\_1 \lor p\_6 \lor p\_7) \land (\neg p\_1 \lor \neg p\_6 \lor \neg p\_7) \land \\
(p\_1 \lor p\_7 \lor p\_8) \land (\neg p\_1 \lor \neg p\_7 \lor \neg p\_8) \land \\
(p\_2 \lor p\_3 \lor p\_5) \land (\neg p\_2 \lor \neg p\_3 \lor \neg p\_5) \land \\
(p\_2 \lor p\_4 \lor p\_6) \land (\neg p\_2 \lor \neg p\_4 \lor \neg p\_6) \land \\
(p\_2 \lor p\_5 \lor p\_7) \land (\neg p\_2 \lor \neg p\_5 \lor \neg p\_7) \land \\
(p\_2 \lor p\_6 \lor p\_8) \land (\neg p\_2 \lor \neg p\_6 \lor \neg p\_8) \land \\
(p\_3 \lor p\_4 \lor p\_7) \land (\neg p\_3 \lor \neg p\_4 \lor \neg p\_7) \land \\
(p\_3 \lor p\_5 \lor p\_8) \land (\neg p\_3 \lor \neg p\_5 \lor \neg p\_8) \,.
\end{array} \tag{8.3}$$

When a SAT solver is given this formula it answers that the formula is satisfiable under either of the interpretations:

$$
\begin{array}{cccccccc}
p\_1 & p\_2 & p\_3 & p\_4 & p\_5 & p\_6 & p\_7 & p\_8 \\
\hline
F & F & T & F & T & T & T & F \\
T & T & F & T & F & F & F & T \\
\end{array}
$$

One interpretation corresponds to the decomposition in Eq. 8.2: $S\_1 = \{1, 2, 4, 8\}$, $S\_2 = \{3, 5, 6, 7\}$, while the other corresponds to the symmetrical decomposition $S\_1 = \{3, 5, 6, 7\}$, $S\_2 = \{1, 2, 4, 8\}$.

For $S(9)$, four pairs of subformulas are added for the additional possible triples:

$$\begin{array}{rcl}(p\_1 \lor p\_8 \lor p\_9) & \land & (\neg p\_1 \lor \neg p\_8 \lor \neg p\_9) \land \\ (p\_2 \lor p\_7 \lor p\_9) & \land & (\neg p\_2 \lor \neg p\_7 \lor \neg p\_9) \land \\ (p\_3 \lor p\_6 \lor p\_9) & \land & (\neg p\_3 \lor \neg p\_6 \lor \neg p\_9) \land \\ (p\_4 \lor p\_5 \lor p\_9) & \land & (\neg p\_4 \lor \neg p\_5 \lor \neg p\_9) \;.\end{array}$$

When the SAT solver is given the extended formula, it answers that the formula is unsatisfiable, meaning that *no* decomposition has *no* Schur triple. Removing the double negative, this states that in *every* decomposition of $S(9)$ there exists a Schur triple.
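The encoding is easy to generate by program. The sketch below is not the SAT solver referred to in the text: it builds the clauses for $S(n)$ and then simply tries all $2^n$ interpretations. The representation of a literal as a signed integer and the function names are assumptions of the sketch.

```python
from itertools import product

Clause = list[int]  # literal i stands for p_i, literal -i for "not p_i"


def schur_cnf(n: int) -> list[Clause]:
    """Two clauses for each possible Schur triple a + b = c in S(n)."""
    clauses: list[Clause] = []
    for a in range(1, n + 1):
        for b in range(a + 1, n - a + 1):
            clauses.append([a, b, a + b])        # not all three in S_2
            clauses.append([-a, -b, -(a + b)])   # not all three in S_1
    return clauses


def satisfying_interpretations(clauses: list[Clause], n: int) -> list[tuple[bool, ...]]:
    """All assignments to p_1..p_n that make every clause true."""
    models = []
    for values in product((False, True), repeat=n):
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            models.append(values)
    return models


print(len(schur_cnf(8)), satisfying_interpretations(schur_cnf(8), 8))
print(len(schur_cnf(9)), satisfying_interpretations(schur_cnf(9), 9))
```

For $S(8)$ the 24 clauses have exactly the two satisfying interpretations listed above; for $S(9)$ the 32 clauses have none.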

#### **8.6.3 Pythagorean Triples**

Heule and Kullmann solved the Pythagorean triple problem using a highly optimized SAT solver. There was a significant difference in efficiency between finding a decomposition that does not have Pythagorean triples (you just need one decomposition), and showing that all decompositions have a Pythagorean triple (you have to check all of them). To show that for all $S(n)$, $1 \le n \le 7824$, there is a decomposition with no triple took only one minute of computing time, whereas to show that every decomposition of $S(7825)$ has a triple took about two days of computing time for a computer with 800 *cores* (processors) working in parallel, altogether 40,000 hours of computing time.

The use of computers in mathematics naturally raises the question: Can we trust a proof generated by a computer? Of course, even "ordinary" mathematical proofs can be incorrect (Sect. 4.7), but our experience with frequent computer bugs, as well as the opaqueness of large computer programs, makes us more sensitive to potential errors in computer-generated proofs.

One approach to increasing confidence in the correctness of a computer-generated proof is to use two or more programs, written independently by two or more researchers. If the multiple programs are written in different programming languages and for different computers and operating systems, this lessens the possibility of a bug in the computer hardware and software.

Heule and Kullmann's SAT solver wrote out a log of the steps in the proof so that it could be examined for correctness. The log was so massive, 200 terabytes, that it was impossible to examine directly. To put this into perspective, 200 terabytes is 200,000 gigabytes while your computer might have an internal memory of 16 gigabytes and a solid-state disk of 128 gigabytes. Instead, they wrote a small program to verify the correctness of the data in the log. To ensure that *this* program was correct, they wrote a formal proof using the Coq proof assistant that supports and checks the work of mathematicians without totally automating the proof process.

#### **8.6.4 An Overview of the DPLL Algorithm**

The first algorithm that one learns for SAT solving is *truth tables*. Given a formula $A$ in propositional logic with $n$ different atoms, there are $2^n$ interpretations since each atom can be independently assigned $T$ or $F$. For each interpretation it is straightforward to compute the truth value of $A$ using the definition of the propositional operators. However, to check $2^n$ interpretations is very inefficient for even moderately large $n$.

The DPLL algorithm works by incrementally assigning $T$ or $F$ to an atom and then attempting to evaluate the formula. For example, given $A = p \land q \land \neg r$, if $p$ is assigned $F$ then $A$ evaluates to $F$, regardless of the assignments to $q$ and $r$, and there is no need to perform further evaluations. Similarly, $A = p \lor q \lor \neg r$ evaluates to $T$ if $p$ is assigned $T$, regardless of the assignments to $q$ and $r$.

The efficiency of DPLL comes from *unit propagation*. Consider part of the formula for Schur triples:

$$\begin{array}{l}(p\_1 \lor p\_2 \lor p\_3) \land (\neg p\_1 \lor \neg p\_2 \lor \neg p\_3) \land \\ (p\_1 \lor p\_3 \lor p\_4) \land (\neg p\_1 \lor \neg p\_3 \lor \neg p\_4) \land \\ \dots \\ (p\_3 \lor p\_4 \lor p\_7) \land (\neg p\_3 \lor \neg p\_4 \lor \neg p\_7) \land \\ (p\_3 \lor p\_5 \lor p\_8) \land (\neg p\_3 \lor \neg p\_5 \lor \neg p\_8) \; .\end{array} \tag{8.4}$$

Suppose that we have assigned $F$ to $p\_1$, $p\_2$. The first subformula reduces to the unit formula consisting of the single atom $p\_3$. If the formula is to be satisfied, we *must* assign $T$ to $p\_3$ and all the subformulas:

$$(p\_1 \lor p\_2 \lor p\_3), \ (p\_1 \lor p\_3 \lor p\_4), \ (p\_3 \lor p\_4 \lor p\_7), \ (p\_3 \lor p\_5 \lor p\_8) \,,$$

immediately evaluate to $T$.

Since $\neg p\_3$ evaluates to $F$, each subformula containing $\neg p\_3$ can be satisfied only if some other literal in the subformula is assigned $T$. In $\neg p\_3 \lor \neg p\_5 \lor \neg p\_8$, either $p\_5$ or $p\_8$ must be assigned $F$ so that either $\neg p\_5$ or $\neg p\_8$ evaluates to $T$.

This analysis shows that once $p\_1$, $p\_2$ have been assigned $F$, the formula in Eq. 8.4 is satisfiable if and only if $(\neg p\_4 \lor \neg p\_7) \land (\neg p\_5 \lor \neg p\_8)$ is satisfiable. By performing the propagation of $p\_3$ on all the subformulas of Eq. 8.3, the formula is reduced to:

$$\begin{array}{l}(p\_4 \lor p\_5) \land (p\_4 \lor p\_6) \land (p\_5 \lor p\_6) \land (p\_5 \lor p\_7) \land \\ (p\_6 \lor p\_7) \land (p\_6 \lor p\_8) \land (p\_7 \lor p\_8) \land \\ (\neg p\_4 \lor \neg p\_7) \land (\neg p\_5 \lor \neg p\_8) \;. \end{array}$$

One more assignment of $F$ to $p\_4$ results in a satisfying interpretation which we have found after only three arbitrary assignments.
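The steps above can be sketched as a short recursive program. It is a bare-bones illustration of assignment plus unit propagation, not the optimized algorithm used by real SAT solvers; the signed-integer representation of literals, the function names and the choice to try $F$ before $T$ on the first atom of the first remaining clause are all assumptions of the sketch.

```python
from typing import Optional

Clause = list[int]  # literal i stands for p_i, literal -i for "not p_i"


def schur_cnf(n: int) -> list[Clause]:
    """Two clauses for each possible Schur triple a + b = c in S(n)."""
    clauses: list[Clause] = []
    for a in range(1, n + 1):
        for b in range(a + 1, n - a + 1):
            clauses.append([a, b, a + b])
            clauses.append([-a, -b, -(a + b)])
    return clauses


def assign(clauses: list[Clause], literal: int) -> list[Clause]:
    """Make `literal` true: drop satisfied clauses, delete the false literal."""
    return [[lit for lit in clause if lit != -literal]
            for clause in clauses if literal not in clause]


def dpll(clauses: list[Clause],
         model: Optional[dict[int, bool]] = None) -> Optional[dict[int, bool]]:
    """Return a satisfying assignment, or None if the clauses are unsatisfiable."""
    model = dict(model or {})
    while True:  # unit propagation
        if any(not clause for clause in clauses):
            return None  # an empty clause cannot be satisfied
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        model[abs(unit)] = unit > 0
        clauses = assign(clauses, unit)
    if not clauses:
        return model  # every clause has been satisfied
    atom = abs(clauses[0][0])
    for literal in (-atom, atom):  # try F first, then T
        result = dpll(assign(clauses, literal), {**model, atom: literal > 0})
        if result is not None:
            return result
    return None


print(dpll(schur_cnf(8)))
print(dpll(schur_cnf(9)))
```

With this branching order the first call makes the same three arbitrary assignments as the text ($F$ to $p\_1$, $p\_2$ and $p\_4$), and unit propagation determines the remaining atoms.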

#### **8.7 Pythagorean Triples in Babylonian Mathematics**

This section is a digression from Ramsey theory; it is included to give a taste of the rich theory of Pythagorean triples and to demonstrate the depth of mathematical knowledge in the ancient world. Pythagorean triples were known in Babylonian mathematics since at least 1800 BCE.

**Definition 8.5** A *primitive Pythagorean triple* is a set of three positive integers $\{a, b, c\}$ such that $a^2 + b^2 = c^2$ and $a$, $b$, $c$ have no common factor greater than 1.

*Example 8.3* $\{3, 4, 5\}$ is a primitive Pythagorean triple but $\{6, 8, 10\}$ is a Pythagorean triple that is not primitive since 2 is a common factor.

A cuneiform tablet called *Plimpton* 322 is one of the earliest examples of Babylonian mathematics. It lists fifteen primitive Pythagorean triples by giving $a$ and $c$. Table 8.1 displays four of these triples, together with the computed values of $b$ and other values that will be discussed below. Historians of mathematics have proposed several explanations for how these triples were found. One explanation is that *Euclid's formula* was used to obtain the triples from a pair of generating numbers.

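**Theorem 8.7 (Euclid)** $\{a, b, c\}$ *is a primitive Pythagorean triple if and only if there exist two positive integers* $\mu$, $\nu$*, called* generating numbers*, such that:*

1. $\mu > \nu$*,*
2. *they are not both odd,*
3. *they have no common factor greater than 1,*
4. *the following relations hold between* $\{a, b, c\}$ *and* $\mu$, $\nu$*:*

$$a = \mu^2 - \nu^2, \quad b = 2\mu\nu, \quad c = \mu^2 + \nu^2.$$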

*Proof* By computation it follows immediately that if $\{a, b, c\}$ can be expressed as required in item 4 they form a Pythagorean triple:


**Table 8.1** Babylonian triples from the Plimpton 322 tablet

$$\begin{aligned} a^2 + b^2 &= \left(\mu^2 - \nu^2\right)^2 + \left(2\mu\nu\right)^2 \\ &= \mu^4 - 2\left(\mu\nu\right)^2 + \nu^4 + 4\left(\mu\nu\right)^2 \\ &= \mu^4 + 2\left(\mu\nu\right)^2 + \nu^4 \\ &= \left(\mu^2 + \nu^2\right)^2 = c^2 \ . \end{aligned}$$

The proof of the other direction is more complicated and is omitted. □

If it is true that the Babylonians used Euclid's formula, the question remains: How did they discover the generating numbers $\mu$, $\nu$?

Each row of Table 8.1 displays $a$ *factors* and $b$ *factors*, the factorizations of $a$ and $b$, respectively, showing that they have no common factors. The reader is invited to check that $c$ has no common factor with $a$, $b$, so the triples are primitive. The generating numbers $\mu$, $\nu$ and $\mu$ *factors*, $\nu$ *factors* are also displayed. Not only do they not have any common factors as required by Thm. 8.7, but the only factors greater than 1 in $\mu$ and $\nu$ are powers of 2, 3, 5.

**Definition 8.6** A *Babylonian triple* is a primitive Pythagorean triple such that the only prime factors of $\mu$, $\nu$ are 2, 3, 5.
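Euclid's formula and the definition of a Babylonian triple are straightforward to express as code. In this sketch the function names are illustrative, and the generating numbers 125 and 54 used in the example are derived here from the triple $\{12709, 13500, 18541\}$ quoted in this chapter; they are not read from Table 8.1.

```python
from math import gcd


def euclid_triple(mu: int, nu: int) -> tuple[int, int, int]:
    """(a, b, c) obtained from the generating numbers by Euclid's formula."""
    return mu * mu - nu * nu, 2 * mu * nu, mu * mu + nu * nu


def is_primitive_pythagorean(a: int, b: int, c: int) -> bool:
    """a^2 + b^2 = c^2 and no common factor greater than 1."""
    return a * a + b * b == c * c and gcd(gcd(a, b), c) == 1


def only_prime_factors_2_3_5(m: int) -> bool:
    """True if m is a product of powers of 2, 3 and 5."""
    for p in (2, 3, 5):
        while m % p == 0:
            m //= p
    return m == 1


a, b, c = euclid_triple(125, 54)
print(a, b, c, is_primitive_pythagorean(a, b, c))
print(only_prime_factors_2_3_5(125), only_prime_factors_2_3_5(54))
```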

The reason that the Babylonians restricted themselves to these factors is that they used the *sexagesimal* or base 60 = 2 · 2 · 3 · 5 number system whose prime factors are 2, 3 and 5.

For readers who are not familiar with non-decimal number systems, here is a brief overview of the concept. The "number" 12345 is a shorthand for the number:

$$(1 \times 10^4) + (2 \times 10^3) + (3 \times 10^2) + (4 \times 10^1) + (5 \times 10^0) \,.$$

This number system is called the *decimal* or base 10 number system. There are ten digits 0, 1, 2, . . . , 8, 9 for the coefficients of the powers, and the powers are represented by the places of coefficients with powers increasing from right to left.

The number could also be represented in the binary or base 2 number system by:

$$12345 = 8192 + 4096 + 32 + 16 + 8 + 1 = 2^{13} + 2^{12} + 2^5 + 2^4 + 2^3 + 2^0 = 11000000111001 \,.$$

Binary notation uses two digits 0, 1 for the coefficients and the powers of two are indicated by the places of the coefficients.

Another popular number system is the *hexadecimal* or base 16 number system which is used in computing. For this number system we need 16 "digits" and the convention is to use 0, 1, 2, . . . , 8, 9, A, B, C, D, E, F.

The base 60 number system is not as unfamiliar as it may seem, because we represent time, geographical coordinates and angles in that system. We are comfortable carrying out computations such as (1 hour 40 minutes) plus (1 hour 30 minutes) equals (3 hours 10 minutes).

**Table 8.2** Babylonian triples in base 60


Table 8.2 shows the values of $a$, $c$ that appear in the tablet in base 60 notation where $\langle d \rangle$ represents the $d$'th "digit" for $0 \le d < 60$. The reader can check that these values are the same as the decimal values given in Table 8.1, for example:

$$(3 \times 60^2) + (31 \times 60^1) + (49 \times 60^0) = 12709$$

$$(5 \times 60^2) + (9 \times 60^1) + (1 \times 60^0) = 18541$$

The Babylonians did not have 60 distinct symbols for the digits. Instead, they used a hybrid system where the coefficients were represented with two symbols: one for the tens coefficient and the other for the ones coefficient, and the places of the coefficients indicated the powers of 60. Using ♥ for the tens coefficient and ♦ for the ones coefficient, the decimal number $(38 \times 60^1) + (16 \times 60^0) = 2296$ would be represented as:

> ♥♥♥ ♦♦♦♦ ♦♦♦♦ ♥ ♦♦ ♦♦♦♦ .
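Conversion between bases follows directly from the positional definition. The following sketch uses illustrative function names; it returns the digits of a number in a given base, most significant first, and converts back.

```python
def to_base(number: int, base: int) -> list[int]:
    """Digits of a non-negative integer in the given base, most significant first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        number, digit = divmod(number, base)
        digits.append(digit)
    return digits[::-1]


def from_base(digits: list[int], base: int) -> int:
    """Value of a digit list, most significant first, in the given base."""
    value = 0
    for digit in digits:
        value = value * base + digit
    return value


print(to_base(12709, 60), to_base(18541, 60), to_base(2296, 60))
print("".join(str(d) for d in to_base(12345, 2)))
```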

#### **What Is the Surprise?**

Frank P. Ramsey's theorem appeared to be a minor result in combinatorics. Surprisingly, the theorem was the foundation of an entirely new and challenging field of mathematics with many open problems. The nature of Ramsey theory is also surprising: if a set is large enough there exist regularities in its subsets.

I was introduced to Ramsey theory by the article by Marijn J. H. Heule and Oliver Kullmann on Pythagorean triples whose proof bears some similarity to the proof of the four-color theorem: the use of massive computing resources that is only successful after theoretical advances. Hence the title of their article: *The Science of Brute Force*.

Problems in combinatorics ask for specific numerical values, for example, $R(k)$ must be a specific positive integer. It is surprising that probabilistic methods have proved so fruitful in obtaining results in this field.

We tend to think that humans are smarter today than they were thousands of years ago. It can be a surprise to find out that four thousand years ago Babylonian mathematics was sufficiently advanced to discover that $\{12709, 13500, 18541\}$ is a Pythagorean triple.

#### **Sources**

For an overview of Ramsey theory see [9], while an advanced presentation can be found in [20]. The section on the probabilistic method is based on [43, Example 4o] and [9, Chapter 4]. A database of Ramsey numbers can be found in [34].

The method of proof of the theorem on Pythagorean triples is explained in detail in [23]. See [4] for an introduction to logic and to SAT solving. The archive of my SAT solver for education [5] contains formulas for Schur triples, Ramsey graphs and van der Waerden's problem.

Section 8.7 is based upon [60], [42]. The sexagesimal number system is described in [63].


# **Chapter 9 Langford's Problem**

C. Dudley Langford noticed that his son had arranged colored blocks as shown in Fig. 9.1. There is one block between the red blocks, two blocks between the blue blocks and three blocks between the green blocks.

**Fig. 9.1** Layout of blocks for Langford's problem

**Definition 9.1 (Langford's Problem** $L(n)$**)** Given the multiset<sup>1</sup> of positive integers:

$$\{1, 1, 2, 2, 3, 3, \ldots, n, n\},$$

can they be arranged in a sequence such that for $1 \le i \le n$ there are $i$ numbers between the two occurrences of $i$?

Figure 9.1 shows that 312132 is a solution for $L(3)$.

Section 9.1 restates Langford's problem using a mathematical formalism that facilitates solving the problem. Section 9.2 characterizes values of $n$ for which $L(n)$ is solvable and presents two proofs of the theorem. The first proof, which is relatively simple, uses the technique of double-counting: counting the same value in two different ways and equating the resulting formulas. The second proof is a clever induction but the "bookkeeping" involved requires careful attention to the details. Section 9.3 works out the solution for $L(4)$.

<sup>1</sup> A *multiset* or *bag* is like a set except that there may be more than one occurrence of an element.

#### **9.1 Langford's Problem as a Covering Problem**

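Langford's problem can be posed using an array. For $L(3)$ there are six columns, one for each position at which the six numbers can be placed. There is one row for each possible placement of one of the numbers $k$, that is, the two occurrences of $k$ must have $k$ numbers between them. There are four possible placements of 1's, three of 2's and two of 3's:

| Row | 1 | 2 | 3 | 4 | 5 | 6 |
|-----|---|---|---|---|---|---|
| 1 | 1 |   | 1 |   |   |   |
| 2 |   | 1 |   | 1 |   |   |
| 3 |   |   | 1 |   | 1 |   |
| 4 |   |   |   | 1 |   | 1 |
| 5 | 2 |   |   | 2 |   |   |
| 6 |   | 2 |   |   | 2 |   |
| 7 |   |   | 2 |   |   | 2 |
| 8 | 3 |   |   |   | 3 |   |
| 9 |   | 3 |   |   |   | 3 |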


To solve the problem we need to select one row for the 1's in the sequence, one row for the 2's and one row for the 3's, such that if we stack these rows on top of each other, no column contains more than one number.

Row 9 need not be considered because of symmetry: starting with row 9 just gives the reversal of the sequence obtained by starting with row 8.

Row 8 is the only remaining row containing 3's so it must be chosen and the sequence is 3 _ _ _ 3 _. Any row with a number in column 1 or column 5 can no longer be used, because only one number can be placed at each position. Let us denote the permissible and forbidden rows by:

1̸, 2, 3̸, 4, 5̸, 6̸, 7, 8.

Row 7 is the only remaining row containing 2's so it must be chosen and the sequence is 3 _ 2 _ 3 2. Deleting rows that can no longer be used gives:

1̸, 2, 3̸, 4̸, 5̸, 6̸, 7, 8.

Choosing the only remaining row, row 2, gives the solution 312132:


The analysis has shown that this is the only solution, except for the symmetrical solution obtained by starting with row 9.
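The rows of the array can be generated directly from the definition. The sketch below is illustrative only; the name `placement_rows` and the representation of a row as a triple (number, first position, second position) are assumptions. The rows are produced in the same order as the row numbers used above, so that row 8 places the 3's at positions 1 and 5.

```python
def placement_rows(n):
    """List the rows of the array for L(n) as (number, first position, second position)."""
    rows = []
    for k in range(1, n + 1):
        # the second occurrence is at position i + k + 1, which must not exceed 2n
        for i in range(1, 2 * n - k):
            rows.append((k, i, i + k + 1))
    return rows


for number, (k, i, j) in enumerate(placement_rows(3), start=1):
    print(f"row {number}: {k} at positions {i} and {j}")
```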

#### **9.2 For Which Values of $n$ Is Langford's Problem Solvable?**

**Theorem 9.1** $L(n)$ *has a solution if and only if* $n = 4k$ *or* $n = 4k + 3$*.*

We prove the forward direction of the theorem. Proof 1 shows that if $L(n)$ has a solution then $n = 4k$ or $n = 4k + 3$. Proof 2 shows the contrapositive: if $n = 4k + 1$ or $n = 4k + 2$ then $L(n)$ has no solution.

*Proof (1)* If the first occurrence of the number $k$ is at position $i_k$, the second occurrence is at position $i_k + k + 1$. For example, in 312132, the solution for $L(3)$, choosing $k = 2$ gives $i_k = 3$ and $i_k + k + 1 = 3 + 2 + 1 = 6$.

$S_n$, the sum of the positions of all the numbers, is:

$$\begin{aligned} S\_n &= \sum\_{k=1}^n i\_k + \sum\_{k=1}^n (i\_k + k + 1) \\ &= 2 \sum\_{k=1}^n i\_k + \sum\_{k=1}^n (k+1) \\ &= 2 \sum\_{k=1}^n i\_k + \frac{n(n+3)}{2} \end{aligned}$$

But $S_n$ is simply $1 + 2 + 3 + \cdots + 2n$, so:

$$S_n = \sum_{k=1}^{2n} k = \frac{2n(2n+1)}{2}.$$

Equating the two formulas for $S_n$ gives:

$$\begin{aligned} 2\sum\_{k=1}^n i\_k + \frac{n(n+3)}{2} &= \frac{2n(2n+1)}{2} \\ \sum\_{k=1}^n i\_k &= \frac{1}{2} \left( \frac{2n(2n+1)}{2} - \frac{n(n+3)}{2} \right) \\ &= \frac{3n^2 - n}{4} .\end{aligned}$$

The left-hand side is an integer since it is the sum of integers (the positions), so the right-hand side must also be an integer. When is $3n^2 - n$ divisible by 4? Factoring $3n^2 - n$ gives $n(3n - 1)$.

If $n$ is a multiple of 4, the product is divisible by 4. Otherwise, since $n$ and $3n - 1$ have opposite parity, only one of the factors is even, so the product is divisible by 4 only if $3n - 1$ itself is divisible by 4.

When is $3n - 1$ divisible by 4? Any integer $n$ can be expressed as $n = 4i + j$ for $j = 0, 1, 2, 3$. If $3n - 1$ is divisible by 4, then so is $3(4i + j) - 1 = 12i + 3j - 1$. $12i$ is divisible by 4. For $j = \{0, 1, 2, 3\}$, $3j - 1 = \{-1, 2, 5, 8\}$ is divisible by 4 if and only if $j = 3$, that is, $n = 4i + 3$. □
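The divisibility condition of Proof 1 is easy to tabulate. The following sketch, with the assumed helper names `first_position_sum` and `passes_counting_test`, evaluates $(3n^2 - n)/4$ exactly and lists the values of $n$ up to 12 for which it is an integer. Passing the test is only a necessary condition for $L(n)$ to be solvable.

```python
from fractions import Fraction


def first_position_sum(n):
    """The value (3n^2 - n)/4 that the sum of the first positions must take."""
    return Fraction(3 * n * n - n, 4)


def passes_counting_test(n):
    """True if (3n^2 - n)/4 is an integer."""
    return first_position_sum(n).denominator == 1


print([n for n in range(1, 13) if passes_counting_test(n)])
```

For $n = 3$ the value is 6, which agrees with 312132: the first occurrences of 3, 1 and 2 are at positions 1, 2 and 3.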

To introduce the idea of the second proof consider what a solution for $n = 4$ might look like. In the following tables the positions of the occurrences of 4 are 1 and 6, and the positions of the occurrences of 2 are 5 and 8. In both cases, one position is odd and the other is even.


Let $k = 2m$ be an *even* number. If $i$ is the position of the first occurrence of $k$, then the position of the second occurrence is $i + k + 1$. The sum of the positions is:

$$i + (i + k + 1) = 2i + 2m + 1 = 2(i + m) + 1\ ,$$

which is an odd number. For the sum of two numbers to be odd, one must be odd and the other even.

Let us now check the positions of the occurrences of odd numbers. The positions of the occurrences of 1 are 2 and 4, both even numbers, and the positions of the occurrences of 3 are 3 and 7, both odd numbers.



Let $k = 2m + 1$ be an *odd* number. The sum of the positions is:

$$i + (i + k + 1) = 2i + 2m + 1 + 1 = 2(i + m + 1) \,,$$

which is an even number. For the sum of two numbers to be even, both must be odd or both even.

The positions $1, 2, \ldots, 2n - 1, 2n$ contain an equal number of even and odd positions. The two occurrences of a number in a row "cover" two positions. When the set of rows covers all the positions, they must cover an equal number of even positions and odd positions. Define the *parity* of a set of rows to be the difference between the number of even and odd positions covered. Initially, the parity is zero, and if the problem has a solution, the set of rows in the solution also has zero parity.

When two occurrences of an even number are placed, they cover one even position and one odd position, so the parity remains the same:



When two occurrences of an odd number are placed, the parity becomes +2 or −2, so we must be able to associate this pair with a pair of occurrences of *another* odd number that are placed at positions that balance out the parity:

We have shown that there can be a solution to Langford's problem if and only if *there is an even number of odd numbers in* $\{1, \ldots, n\}$! The theorem claims that if this is true then either $n = 4k$ or $n = 4k - 1$, and if not then either $n = 4k - 2$ or $n = 4k - 3$.

*Proof (2)* The proof is by induction. There are four base cases:


The inductive hypothesis is that the theorem is true for $\{1, \ldots, 4k - j\}$, $k \geq 1$, $0 \leq j \leq 3$, and we will prove that it is true for $n = 4(k + 1) - j$.


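#### **9.3 Solution for $L(4)$**

Here is the array for $L(4)$. Try to find the solution yourself.

| Row | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|-----|---|---|---|---|---|---|---|---|
| 1 | 1 |   | 1 |   |   |   |   |   |
| 2 |   | 1 |   | 1 |   |   |   |   |
| 3 |   |   | 1 |   | 1 |   |   |   |
| 4 |   |   |   | 1 |   | 1 |   |   |
| 5 |   |   |   |   | 1 |   | 1 |   |
| 6 |   |   |   |   |   | 1 |   | 1 |
| 7 | 2 |   |   | 2 |   |   |   |   |
| 8 |   | 2 |   |   | 2 |   |   |   |
| 9 |   |   | 2 |   |   | 2 |   |   |
| 10 |   |   |   | 2 |   |   | 2 |   |
| 11 |   |   |   |   | 2 |   |   | 2 |
| 12 | 3 |   |   |   | 3 |   |   |   |
| 13 |   | 3 |   |   |   | 3 |   |   |
| 14 |   |   | 3 |   |   |   | 3 |   |
| 15 |   |   |   | 3 |   |   |   | 3 |
| 16 | 4 |   |   |   |   | 4 |   |   |
| 17 |   | 4 |   |   |   |   | 4 |   |
| 18 |   |   | 4 |   |   |   |   | 4 |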

By symmetry row 18 may be eliminated.

Choose row 16 and the sequence is 4 _ _ _ _ 4 _ _. Any row with an element in position 1 or position 6 can no longer be part of the solution.

1̸, 2, 3, 4̸, 5, 6̸, 7̸, 8, 9̸, 10, 11, 1̸2̸, 1̸3̸, 14, 15, 16, 1̸7̸

Choose row 14 and the sequence is 4 _ 3 _ _ 4 3 _.

1̸, 2, 3̸, 4̸, 5̸, 6̸, 7̸, 8, 9̸, 1̸0̸, 11, 1̸2̸, 1̸3̸, 14, 1̸5̸, 16, 1̸7̸

Choose row 8. The sequence is 4 2 3 _ 2 4 3 _.

1̸, 2̸, 3̸, 4̸, 5̸, 6̸, 7̸, 8, 9̸, 1̸0̸, 1̸1̸, 1̸2̸, 1̸3̸, 14, 1̸5̸, 16, 1̸7̸

All of the choices for 1's have been eliminated so we must backtrack.

Instead of row 8 choose row 11 and the sequence is 4 _ 3 _ 2 4 3 2.

1̸, 2, 3̸, 4̸, 5̸, 6̸, 7̸, 8̸, 9̸, 1̸0̸, 11, 1̸2̸, 1̸3̸, 14, 1̸5̸, 16, 1̸7̸

Choose row 2 and we have a solution 41312432.

Continue backtracking to see if there is another solution.

Instead of row 14 choose row 15 and the sequence is 4 _ _ 3 _ 4 _ 3.

1̸, 2̸, 3, 4̸, 5, 6̸, 7̸, 8, 9̸, 1̸0̸, 1̸1̸, 1̸2̸, 1̸3̸, 1̸4̸, 15, 16, 1̸7̸

Row 8 must be chosen and the sequence is 4 2 _ 3 2 4 _ 3.

1̸, 2̸, 3̸, 4̸, 5̸, 6̸, 7̸, 8, 9̸, 1̸0̸, 1̸1̸, 1̸2̸, 1̸3̸, 1̸4̸, 15, 16, 1̸7̸

All of the choices for 1's have been eliminated so again we backtrack.

Instead of row 16 choose row 17 and the sequence is _ 4 _ _ _ _ 4 _.

1, 2̸, 3, 4, 5̸, 6, 7, 8̸, 9, 1̸0̸, 11, 12, 1̸3̸, 1̸4̸, 15, 1̸6̸, 17

Choose row 15 and the sequence is _ 4 _ 3 _ _ 4 3.

1, 2̸, 3, 4̸, 5̸, 6̸, 7̸, 8̸, 9, 1̸0̸, 1̸1̸, 1̸2̸, 1̸3̸, 1̸4̸, 15, 1̸6̸, 17

Row 9 must be chosen and the sequence is _ 4 2 3 _ 2 4 3.

1̸, 2̸, 3̸, 4̸, 5̸, 6̸, 7̸, 8̸, 9, 1̸0̸, 1̸1̸, 1̸2̸, 1̸3̸, 1̸4̸, 15, 1̸6̸, 17

All of the choices for 1's have been eliminated. We can backtrack one last time.

Instead of row 15 choose row 12 and the sequence is 3 4 _ _ 3 _ 4 _. Row 9 is then the only remaining row containing 2's, so the sequence is 3 4 2 _ 3 2 4 _.

1̸, 2̸, 3̸, 4̸, 5̸, 6̸, 7̸, 8̸, 9, 1̸0̸, 1̸1̸, 12, 1̸3̸, 1̸4̸, 1̸5̸, 1̸6̸, 17

Again, all of the choices for 1's have been eliminated.

Therefore the only solution is 41312432.
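The backtracking search carried out by hand can also be left to a program. The sketch below is one possible implementation, not the procedure of the text verbatim: it places the numbers from the largest to the smallest and, unlike the manual search, does not discard the symmetric placements. The name `langford_solutions` is an assumption.

```python
def langford_solutions(n):
    """Return all solutions of L(n) as tuples, found by backtracking."""
    seq = [0] * (2 * n)  # 0 marks an empty position
    found = []

    def place(k):
        if k == 0:
            found.append(tuple(seq))
            return
        for i in range(2 * n - k - 1):  # 0-based position of the first occurrence
            j = i + k + 1               # position of the second occurrence
            if seq[i] == 0 and seq[j] == 0:
                seq[i] = seq[j] = k
                place(k - 1)
                seq[i] = seq[j] = 0     # backtrack

    place(n)
    return found


print(langford_solutions(4))
```

For $n = 4$ the search returns 41312432 together with its reversal, in agreement with the analysis above.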

#### **What Is the Surprise?**

The source of the inspiration for a mathematical theorem can be surprising. Langford noticed a pattern in his son's colored blocks which led to the interesting Thm. 9.1. Students should also be introduced to the fact that a theorem can have many completely different proofs.

#### **Sources**

This chapter is based on [35]. [12] shows how to find a solution for $n = 4k$ and $n = 4k + 3$.


# **Chapter 10 The Axioms of Origami**

Origami, the art of paper folding, was developed several centuries ago in Japan and now has a worldwide following. In the late twentieth century the mathematical theory of origami was developed. Its foundation is a set of seven axioms, the *Huzita–Hatori axioms*, named after Humiaki Huzita who formalized the first six axioms and Koshiro Hatori who found the seventh. Jacques Justin published all seven axioms several years before Huzita and Hatori, and Margherita P. Beloch formulated the sixth axiom in 1936. Nevertheless, the axioms are known as the Huzita–Hatori axioms.

In a sequence of three chapters we will explore the mathematics of origami. This chapter presents the axioms, Chap. 11 connects origami with the roots of polynomials and Chap. 12 shows that constructions with origami can solve problems that are impossible using a straightedge and compass.

This chapter contains a section for each of the seven axioms. Following a statement of an axiom and a diagram of the *fold* it specifies, the equations of the fold and the points of intersection are developed using analytic geometry. A fold can also be defined as a *geometric locus*, the set of all points satisfying some property. The term fold comes from the origami operation of folding a piece of paper, but here it is used to refer to the geometric line that would be created by folding the paper.

Folds result in *reflections*. Given a point $p$, its reflection around a fold $l$ results in a point $p'$ such that $l$ is the perpendicular bisector of the line segment $\overline{pp'}$ (Fig. 10.1).

**Fig. 10.1** The fold is the perpendicular bisector of the line connecting a point and its reflection

#### **10.1 Axiom 1**

**Axiom 10.1** Given two distinct points $p_1 = (x_1, y_1)$, $p_2 = (x_2, y_2)$, there is a unique fold $l$ that passes through both of them (Fig. 10.2).

**Fig. 10.2** Axiom 1

**Derivation of the equation of the fold:** The equation of the fold $l$ is derived from the coordinates of $p_1$ and $p_2$. The slope is the quotient of the differences of the coordinates and the intercept is derived from $p_1$:

$$y - y_1 = \frac{y_2 - y_1}{x_2 - x_1} (x - x_1). \tag{10.1}$$

*Example 10.1* Let $p_1 = (2, 2)$, $p_2 = (6, 4)$. The equation of $l$ is:

$$\begin{aligned} y - 2 &= \frac{4 - 2}{6 - 2}(x - 2) \\ y &= \frac{1}{2}x + 1 \end{aligned}$$
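Equation 10.1 translates directly into a small function. This is a sketch under the assumption that the two points have integer coordinates and different $x$-coordinates (a vertical fold has no slope); the name `fold_through` is chosen for illustration.

```python
from fractions import Fraction


def fold_through(p1, p2):
    """Slope and intercept of the line through p1 and p2 (Eq. 10.1); assumes x1 != x2."""
    (x1, y1), (x2, y2) = p1, p2
    m = Fraction(y2 - y1, x2 - x1)
    b = y1 - m * x1  # the intercept is obtained from p1
    return m, b


print(fold_through((2, 2), (6, 4)))
```

With the points of Example 10.1 the function returns slope $1/2$ and intercept $1$.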

#### **10.2 Axiom 2**

**Axiom 10.2** Given two distinct points $p_1 = (x_1, y_1)$, $p_2 = (x_2, y_2)$, there is a unique fold $l$ that places $p_1$ onto $p_2$ (Fig. 10.3).

The fold is the geometric locus of all points equidistant from $p_1$ and $p_2$.

**Fig. 10.3** Axiom 2

**Derivation of the equation of the fold:** The fold $l$ is the perpendicular bisector of $\overline{p_1 p_2}$. Its slope is the negative reciprocal of the slope of the line connecting $p_1$ and $p_2$. $l$ passes through the midpoint between the points:

$$y - \frac{y_1 + y_2}{2} = -\frac{x_2 - x_1}{y_2 - y_1} \left(x - \frac{x_1 + x_2}{2}\right). \tag{10.2}$$

*Example 10.2* Let $p_1 = (2, 2)$, $p_2 = (6, 4)$. The equation of $l$ is:

$$\begin{aligned} y - \left(\frac{2+4}{2}\right) &= -\frac{6-2}{4-2} \left(x - \left(\frac{2+6}{2}\right)\right) \\ y &= -2x + 11. \end{aligned}$$
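Equation 10.2 can be sketched in the same way. The function name `perpendicular_bisector` is an assumption, and the code assumes integer coordinates with different $y$-coordinates so that the fold is not vertical.

```python
from fractions import Fraction


def perpendicular_bisector(p1, p2):
    """Slope and intercept of the fold that places p1 onto p2 (Eq. 10.2); assumes y1 != y2."""
    (x1, y1), (x2, y2) = p1, p2
    m = -Fraction(x2 - x1, y2 - y1)                        # negative reciprocal slope
    xm, ym = Fraction(x1 + x2, 2), Fraction(y1 + y2, 2)    # midpoint of p1 and p2
    return m, ym - m * xm


print(perpendicular_bisector((2, 2), (6, 4)))
```

For the points of Example 10.2 the result is slope $-2$ and intercept $11$.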

#### **10.3 Axiom 3**

**Axiom 10.3** Given two lines $l_1$, $l_2$, there is a fold $l$ that places $l_1$ onto $l_2$ (Fig. 10.4).

The fold is the geometric locus of the points that are equidistant from $l_1$ and $l_2$, where the distance from a point to a line is the length of the line segment through the point that is perpendicular to the line. Using congruent triangles it is easy to show that the fold is a bisector of the angle formed by $l_1$ and $l_2$.

**Fig. 10.4** Axiom 3

**Derivation of the equation of the fold:**

$l_1$, $l_2$ *are parallel:* Let $l_1$ be $y = mx + b_1$ and let $l_2$ be $y = mx + b_2$. The fold is the line parallel to $l_1$ and $l_2$ that is halfway between them:

$$y = mx + \frac{b_1 + b_2}{2}.$$

$l_1$, $l_2$ *intersect:* Let $l_1$ be $y = m_1 x + b_1$ and let $l_2$ be $y = m_2 x + b_2$. $p_i = (x_i, y_i)$, the point of intersection of the two lines, is:

$$\begin{aligned} m_1 x_i + b_1 &= m_2 x_i + b_2 \\ x_i &= \frac{b_2 - b_1}{m_1 - m_2} \\ y_i &= m_1 x_i + b_1. \end{aligned}$$

*Example 10.3* Let $l_1$ be $y = 2x - 2$ and let $l_2$ be $y = -x + 8$. Then $p_i = (x_i, y_i)$ is:

$$\begin{aligned} x_i &= \frac{8 - (-2)}{2 - (-1)} = \frac{10}{3} \approx 3.33\\ y_i &= 2 \cdot \frac{10}{3} - 2 = \frac{14}{3} \approx 4.67. \end{aligned}$$

The fold is the bisector of the angle formed by $l_1$ and $l_2$ at their point of intersection. There are two possible folds since there are two pairs of vertical angles. We need to determine the slopes of the angle bisectors. If the angle of line $l_1$ relative to the $x$-axis is $\theta_1$ and the angle of line $l_2$ relative to the $x$-axis is $\theta_2$, then the fold is the line which makes an angle of $\theta_b = (\theta_1 + \theta_2)/2$ relative to the $x$-axis.

Let $m_1 = \tan \theta_1$, $m_2 = \tan \theta_2$. By Thm. A.9, $m_s$, the slope of the line making an angle of $\theta_1 + \theta_2$ relative to the $x$-axis, is:

$$m_s = \tan(\theta_1 + \theta_2) = \frac{\tan \theta_1 + \tan \theta_2}{1 - \tan \theta_1 \tan \theta_2} = \frac{m_1 + m_2}{1 - m_1 m_2}.$$

By Thm. A.10, $m_b$, the slope of the angle bisector, is:

$$m_b = \tan\frac{\theta_1 + \theta_2}{2} = \frac{-1 \pm \sqrt{1 + \tan^2(\theta_1 + \theta_2)}}{\tan(\theta_1 + \theta_2)} = \frac{-1 \pm \sqrt{1 + m_s^2}}{m_s}.$$

*Example 10.4* For $y = 2x - 2$ and $y = -x + 8$, the slope of the angle bisector is:

$$\begin{aligned} m_s &= \frac{2 + (-1)}{1 - (2 \cdot (-1))} = \frac{1}{3} \\ m_b &= \frac{-1 \pm \sqrt{1 + (1/3)^2}}{1/3} = -3 \pm \sqrt{10} \approx -6.16, \ 0.162. \end{aligned}$$

Let us derive the equation of the fold $l_{f_1}$ with the positive slope. From Example 10.3, the coordinates of the intersection of the two lines are $(10/3, 14/3)$. Therefore:

$$\begin{aligned} \frac{14}{3} &= (-3 + \sqrt{10}) \cdot \frac{10}{3} + b_i \\ b_i &= \frac{44 - 10\sqrt{10}}{3} \\ y &= (-3 + \sqrt{10})x + \frac{44 - 10\sqrt{10}}{3} \approx 0.162x + 4.13. \end{aligned}$$
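The computation for intersecting lines can be reproduced numerically. The sketch below uses floating-point arithmetic; the function names `bisector_slopes` and `intersection` are assumptions, and the code assumes $m_1 m_2 \neq 1$, $m_1 + m_2 \neq 0$ and $m_1 \neq m_2$ so that every quotient is defined.

```python
import math


def bisector_slopes(m1, m2):
    """Slopes of the two angle bisectors of lines with slopes m1 and m2."""
    ms = (m1 + m2) / (1 - m1 * m2)   # slope for the angle theta_1 + theta_2
    root = math.sqrt(1 + ms * ms)
    return (-1 + root) / ms, (-1 - root) / ms


def intersection(m1, b1, m2, b2):
    """Point of intersection of y = m1*x + b1 and y = m2*x + b2."""
    x = (b2 - b1) / (m1 - m2)
    return x, m1 * x + b1


mb, mb_other = bisector_slopes(2, -1)
xi, yi = intersection(2, -2, -1, 8)
print(mb, yi - mb * xi)  # slope and intercept of one fold
```

For the lines of Examples 10.3 and 10.4 the first slope returned is $-3 + \sqrt{10}$, and the intercept is obtained by requiring the fold to pass through the point of intersection.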

#### **10.4 Axiom 4**

**Axiom 10.4** Given a point $p_1$ and a line $l_1$, there is a unique fold $l$ perpendicular to $l_1$ that passes through point $p_1$ (Fig. 10.5).

The fold is the geometric locus of all points on the line perpendicular to $l_1$ that passes through $p_1$.

**Fig. 10.5** Axiom 4

**Derivation of the equation of the fold:** Let $l_1$ be $y = m_1 x + b_1$ and let $p_1 = (x_1, y_1)$. $l$ is perpendicular to $l_1$ so its slope is $-(1/m_1)$. Since it passes through $p_1$ we can compute the intercept $b$ and write down its equation:

$$\begin{aligned} y_1 &= -\frac{1}{m_1}x_1 + b\\ b &= \frac{m_1 y_1 + x_1}{m_1} \\ y &= -\frac{1}{m_1}x + \frac{m_1 y_1 + x_1}{m_1}. \end{aligned}$$

*Example 10.5* Let $p_1 = (2, 6)$ and let $l_1$ be $y = 2x - 4$. The equation of the fold is:

$$y = -\frac{1}{2}x + \frac{2\cdot 6 + 2}{2} = -\frac{1}{2}x + 7.$$
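As a quick check of the derivation, the fold can be computed from the slope of $l_1$ and the point $p_1$. The function name `perpendicular_through` is an assumption, and the sketch assumes that $l_1$ is not horizontal, so that $m_1 \neq 0$.

```python
from fractions import Fraction


def perpendicular_through(m1, p1):
    """Slope and intercept of the fold perpendicular to a line of slope m1 through p1."""
    x1, y1 = p1
    m = -1 / Fraction(m1)   # negative reciprocal of the slope of l_1
    return m, y1 - m * x1   # the intercept is obtained from p1


print(perpendicular_through(2, (2, 6)))
```

The intercept of $l_1$ plays no role, since only the direction of $l_1$ constrains the fold.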

#### **10.5 Axiom 5**

**Axiom 10.5** Given two points $p_1$, $p_2$ and a line $l_1$, there is a fold $l$ that places $p_1$ onto $l_1$ and passes through $p_2$ (Fig. 10.6).

Since the fold passes through $p_2$ and $p_2$ is on the perpendicular bisector of $\overline{p_1 p_1'}$, the geometric locus of the reflection of $p_1$ is the circle centered at $p_2$ with radius $\overline{p_1 p_2}$. The fold is constrained so that the reflection $p_1'$ is on the given line $l_1$.

**Derivation of the equations of the folds:** Let $l_1$ be $y = m_1 x + b_1$ and let $p_1 = (x_1, y_1)$, $p_2 = (x_2, y_2)$. The equation of the circle centered at $p_2$ with radius $\overline{p_1 p_2}$ is:

$$(x - x_2)^2 + (y - y_2)^2 = r^2, \quad \text{where} \quad r^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2.$$

Substituting the equation of $l_1$ into the equation for the circle gives:

$$\begin{aligned} (x - x_2)^2 + ((m_1 x + b_1) - y_2)^2 &= r^2\\ (x - x_2)^2 + (m_1 x + (b_1 - y_2))^2 &= r^2 \end{aligned}$$

and we obtain a quadratic equation for the $x$-coordinates of the possible intersections:

$$x^2(1+m_1^2) + 2(-x_2 + m_1(b_1 - y_2))x + (x_2^2 + (b_1^2 - 2b_1 y_2 + y_2^2) - r^2) = 0. \tag{10.3}$$

Since a quadratic equation has at most two solutions, for a given pair of points and a line there may be zero, one or two folds. From the solutions $x_1'$, $x_1''$ we can compute $y_1'$, $y_1''$ from $y = m_1 x + b_1$. The reflected points are $p_1' = (x_1', y_1')$, $p_1'' = (x_1'', y_1'')$.

*Example 10.6* Let $p_1 = (2, 8)$, $p_2 = (4, 4)$ and let $l_1$ be $y = -\frac{1}{2}x + 3$. The equation of the circle is $(x-4)^2 + (y-4)^2 = (4-2)^2 + (4-8)^2 = 20$. Substitute the equation of the line into the equation of the circle to obtain a quadratic equation for the $x$-coordinates of the intersections (or use Eq. 10.3):

$$\begin{aligned} (x-4)^2 + \left(\left(-\frac{1}{2}x+3\right)-4\right)^2 &= 20\\ (x-4)^2 + (-1)^2 \cdot \left(\frac{1}{2}x+1\right)^2 - 20 &= 0\\ 5x^2 - 28x - 12 &= 0\\ (5x+2)(x-6) &= 0 \end{aligned}$$

The two points of intersection are:

$$p_1' = (-2/5, 16/5) = (-0.4, 3.2), \quad p_1'' = (6, 0).$$

The folds will be the perpendicular bisectors of $\overline{p_1 p_1'}$ and $\overline{p_1 p_1''}$.

*Example 10.7* For $p_1 = (2, 8)$ and $p_1' = (-2/5, 16/5)$ the equation of $l_{f_1}$ is:

$$\begin{aligned} y - \frac{8 + (16/5)}{2} &= -\frac{(-2/5) - 2}{(16/5) - 8} \left( x - \frac{2 + (-2/5)}{2} \right) \\ y &= -\frac{1}{2} x + 6. \end{aligned}$$

*Example 10.8* For $p_1 = (2, 8)$ and $p_1'' = (6, 0)$ the equation of $l_{f_2}$ is:

$$\begin{aligned} y - \frac{8+0}{2} &= -\frac{6-2}{0-8} \left( x - \frac{2+6}{2} \right) \\ y &= \frac{1}{2} x + 2. \end{aligned}$$
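The whole procedure for Axiom 5 can be sketched as follows: solve the quadratic of Eq. 10.3, compute the reflections, and take the perpendicular bisectors. The function name `axiom5_folds` is an assumption, the arithmetic is floating point, and the code assumes that no reflection has the same $y$-coordinate as $p_1$, so that no fold is vertical.

```python
import math


def axiom5_folds(p1, p2, m1, b1):
    """Folds through p2 that place p1 onto the line y = m1*x + b1.

    Returns a list of (reflection, slope, intercept), one entry per fold."""
    (x1, y1), (x2, y2) = p1, p2
    r2 = (x2 - x1) ** 2 + (y2 - y1) ** 2
    # coefficients of the quadratic equation, Eq. 10.3
    A = 1 + m1 ** 2
    B = 2 * (-x2 + m1 * (b1 - y2))
    C = x2 ** 2 + (b1 - y2) ** 2 - r2
    disc = B * B - 4 * A * C
    if disc < 0:
        return []  # the circle does not meet the line: no fold
    xs = {(-B + sign * math.sqrt(disc)) / (2 * A) for sign in (1, -1)}
    folds = []
    for x in sorted(xs):
        y = m1 * x + b1
        m = -(x - x1) / (y - y1)              # perpendicular to the segment p1 p1'
        b = (y1 + y) / 2 - m * (x1 + x) / 2   # passes through the midpoint
        folds.append(((x, y), m, b))
    return folds


for reflection, m, b in axiom5_folds((2, 8), (4, 4), -0.5, 3):
    print(reflection, m, b)
```

With the data of Example 10.6 the two entries correspond to the folds of Examples 10.7 and 10.8.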

#### **10.6 Axiom 6**

**Axiom 10.6** Given two points $p_1$, $p_2$ and two lines $l_1$, $l_2$, there is a fold $l$ that places $p_1$ onto $l_1$ and places $p_2$ onto $l_2$ (Fig. 10.7).

A fold that places $p_i$ onto $l_i$ is a line $l_f$ such that the distance from $p_i$ to $l_f$ is equal to the distance from $l_i$ to $l_f$. The geometric locus of points equidistant from a point $p_i$ and a line $l_i$ is a *parabola*. $p_i$ is called the *focus* and $l_i$ is called the *directrix*. A fold is any line tangent to the parabola (Sect. 10.6.3).

**Fig. 10.7** Axiom 6

For a fold to simultaneously place $p_1$ onto $l_1$ and $p_2$ onto $l_2$, it must be a tangent common to the two parabolas. There may be zero, one, two or three common tangents (Figs. 10.8a, 10.8b, 10.9a, 10.9b).

The formula for an arbitrary parabola is quite complex so we limit the presentation to parabolas whose axis of symmetry is the $x$- or $y$-axis.

**Fig. 10.9a** Two common tangents **Fig. 10.9b** Three common tangents

#### **10.6.1 Derivation of the Equation of a Fold**

Let $(0, f)$ be the focus of a parabola with directrix $y = d$. Define $p = f - d$, the signed length of the line segment between the focus and the directrix.<sup>1</sup> If the vertex of the parabola is on the $x$-axis the equation of the parabola is $y = x^2/2p$. To move the parabola up or down the $y$-axis so that its vertex is at $(0, h)$, add $h$ to the equation of the parabola (Fig. 10.10):

$$y = \frac{x^2}{2p} + h.$$

<sup>1</sup> We have been using the notation $p_i$ for points; the use of $p$ here might be confusing but it is the standard notation. The formal name for $p$ is one-half the *latus rectum*.

**Fig. 10.10** The elements in the definition of a parabola

Define $a = 2ph$ so that the equation of the parabola is:

$$y = \frac{x^2}{2p} + \frac{a}{2p} \tag{10.4a}$$

$$x^2 - 2py + a = 0. \tag{10.4b}$$

The equation of the parabola in Fig. 10.10 is $x^2 - 12y + 12 = 0$.

Substitute the equation of an *arbitrary* line $y = mx + b$ into Eq. 10.4b to obtain an equation for the points of intersection of the line and the parabola:

$$\begin{aligned} x^2 - 2p(mx + b) + a &= 0\\ x^2 + (-2mp)x + (-2pb + a) &= 0. \end{aligned}$$

The line will be tangent to the parabola if and only if this quadratic equation has *exactly one solution*, that is, if and only if its discriminant is zero:

$$(-2mp)^2 - 4 \cdot 1 \cdot (-2pb + a) = 0 \tag{10.5a}$$

$$m^2 p^2 + 2pb - a = 0. \tag{10.5b}$$

This is an equation with variables $m$, $b$ for the tangents to the parabola. To obtain the common tangents to both parabolas we must simultaneously solve the equations for the two parabolas.

*Example 10.9*

**Parabola 1:** Focus $(0, 4)$, directrix $y = 2$, vertex $(0, 3)$. $p = 2$, $a = 2 \cdot 2 \cdot 3 = 12$. The equation of the parabola is:

$$x^2 - 4y + 12 = 0.$$

Substituting $p$ and $a$ into Eq. 10.5b and simplifying gives:

$$m^2 + b - 3 = 0.$$

**Parabola 2:** Focus $(0, -4)$, directrix $y = -2$, vertex $(0, -3)$. $p = -2$, $a = 2 \cdot (-2) \cdot (-3) = 12$. The equation of the parabola is:

$$x^2 + 4y + 12 = 0.$$

Substituting $p$ and $a$ into Eq. 10.5b and simplifying gives:

$$m^2 - b - 3 = 0.$$

The solutions of the two equations:

$$\begin{aligned} m^2 + b - 3 &= 0\\ m^2 - b - 3 &= 0 \end{aligned}$$

are $m = \pm\sqrt{3} \approx \pm 1.73$ and $b = 0$. There are two common tangents:

$$y = \sqrt{3}x, \quad y = -\sqrt{3}x.$$

*Example 10.10*

**Parabola 1:** Unchanged.

**Parabola 2:** Focus $(0, -6)$, directrix $y = -2$, vertex $(0, -4)$. $p = -4$, $a = 2 \cdot (-4) \cdot (-4) = 32$. The equation of the parabola is:

$$x^2 + 8y + 32 = 0.$$

Substituting $p$ and $a$ into Eq. 10.5b and simplifying gives:

$$2m^2 - b - 4 = 0.$$

The solutions of the two equations:

$$\begin{aligned} m^2 + b - 3 &= 0\\ 2m^2 - b - 4 &= 0 \end{aligned}$$

are $m = \pm\sqrt{\frac{7}{3}} \approx \pm 1.53$ and $b = \frac{2}{3}$. There are two common tangents:

$$y = \sqrt{\frac{7}{3}}x + \frac{2}{3}, \quad y = -\sqrt{\frac{7}{3}}x + \frac{2}{3}.$$
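When both parabolas have the $y$-axis as their axis of symmetry, the two instances of Eq. 10.5b are linear in $m^2$ and $b$, so they can be solved by Cramer's rule. The sketch below assumes integer values of $p$ and $a$ with $p_1 \neq p_2$ and both nonzero; the function name `common_tangent` is an assumption.

```python
from fractions import Fraction


def common_tangent(p1, a1, p2, a2):
    """Solve m^2 p^2 + 2pb - a = 0 (Eq. 10.5b) for two parabolas simultaneously.

    Returns (m squared, b); a negative first value means there is no real tangent."""
    det = Fraction(2 * p1 * p2 * (p1 - p2))       # determinant of the 2x2 system
    m_squared = (2 * p2 * a1 - 2 * p1 * a2) / det
    b = (p1 * p1 * a2 - p2 * p2 * a1) / det
    return m_squared, b


print(common_tangent(2, 12, -2, 12))   # Example 10.9
print(common_tangent(2, 12, -4, 32))   # Example 10.10
```

The slopes of the common tangents are the two square roots of the first value returned.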

*Example 10.11*

Let us now define a parabola whose axis of symmetry is the $x$-axis.

**Parabola 1:** Unchanged.

**Parabola 2:** Focus $(4, 0)$, directrix $x = 2$, vertex $(3, 0)$. $p = 2$, $a = 2 \cdot 2 \cdot 3 = 12$. The equation of the parabola is:

$$y^2 - 4x + 12 = 0. \tag{10.6}$$

This is an equation with $x$ and $y^2$ instead of $x^2$ and $y$, so Eq. 10.5b can't be used and we must perform the derivation again.

Substitute the equation for a line into Eq. 10.6:

$$\begin{aligned} (mx + b)^2 - 4x + 12 &= 0\\ m^2 x^2 + (2mb - 4)x + (b^2 + 12) &= 0. \end{aligned}$$

Set the discriminant equal to zero and simplify:

$$\begin{aligned} \left(2mb - 4\right)^2 - 4m^2(b^2 + 12) &= 0\\ -3m^2 - mb + 1 &= 0 \end{aligned}$$

If we try to solve the two equations:

$$\begin{aligned} m^2 + b - 3 &= 0\\ -3m^2 - mb + 1 &= 0 \end{aligned}$$

we obtain a *cubic* equation with variable $m$:

$$m^3 - 3m^2 - 3m + 1 = 0. \tag{10.7}$$

Since a cubic equation has at least one and at most three real solutions, there can be one, two or three common tangents.

The formula for solving general cubic equations is quite complicated, so I used a calculator on the internet and obtained the three solutions:

$$m = 3.73, \ m = -1, \ m = 0.27.$$

From the form of Eq. 10.7 we might guess that $m = 1$ or $m = -1$ is a solution:

$$\begin{aligned} 1^3 - 3 \cdot 1^2 - 3 \cdot 1 + 1 &= -4 \\ (-1)^3 - 3 \cdot (-1)^2 - 3 \cdot (-1) + 1 &= 0. \end{aligned}$$

Divide Eq. 10.7 by $m - (-1) = m + 1$ to obtain the quadratic equation $m^2 - 4m + 1 = 0$ whose roots are the other two solutions of the cubic equation, $m = 2 \pm \sqrt{3} \approx 3.73, 0.27$.
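The three slopes can be verified by substituting them back into the two tangency conditions of Example 10.11. This is a numerical sketch with an assumed function name `is_common_tangent` and an arbitrarily chosen tolerance.

```python
import math


def is_common_tangent(m, tol=1e-9):
    """Check a slope m against both tangency conditions of Example 10.11."""
    b = 3 - m * m                                  # from m^2 + b - 3 = 0
    return abs(-3 * m * m - m * b + 1) < tol       # from -3m^2 - mb + 1 = 0


roots = [-1, 2 + math.sqrt(3), 2 - math.sqrt(3)]
for m in roots:
    print(round(m, 2), round(3 - m * m, 2), is_common_tangent(m))
```

Each line of output shows a slope, the corresponding intercept $b = 3 - m^2$, and whether the second condition is satisfied.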

#### **10.6.2 Derivation of the Equations of the Reflections**

We derive the position of the reflection $p_1' = (x_1', y_1')$ of $p_1 = (x_1, y_1)$ around a tangent line $l_t$ whose equation is $y = m_t x + b_t$. First, find the line $l_p$ with equation $y = m_p x + b_p$ that is perpendicular to $l_t$ and passes through $p_1$:

$$\begin{aligned} y &= -\frac{1}{m_t}x + b_p \\ y_1 &= -\frac{1}{m_t}x_1 + b_p \\ y &= \frac{-x}{m_t} + \left(y_1 + \frac{x_1}{m_t}\right). \end{aligned}$$

Next find the intersection $p_t = (x_t, y_t)$ of $l_t$ and $l_p$:

$$\begin{aligned} m_t x_t + b_t &= \frac{-x_t}{m_t} + \left( y_1 + \frac{x_1}{m_t} \right) \\ x_t &= \frac{\left( y_1 + \frac{x_1}{m_t} - b_t \right)}{\left( m_t + \frac{1}{m_t} \right)} \\ y_t &= m_t x_t + b_t. \end{aligned}$$

$p_t$ is the midpoint between $p_1$ and $p_1'$:

$$\begin{aligned} x_t &= \frac{x_1 + x_1'}{2}, & x_1' &= 2x_t - x_1, \\ y_t &= \frac{y_1 + y_1'}{2}, & y_1' &= 2y_t - y_1. \end{aligned}$$

*Example 10.12* Let $l_t$ be $y = \sqrt{3}x + 0$ and let $p_1 = (0, 4)$:

$$\begin{aligned} x_t &= \frac{\left(4 + \frac{0}{\sqrt{3}} - 0\right)}{\left(\sqrt{3} + \frac{1}{\sqrt{3}}\right)} = \sqrt{3} \\ y_t &= \sqrt{3}\sqrt{3} + 0 = 3 \\ x_1' &= 2x_t - x_1 = 2\sqrt{3} \approx 3.46 \\ y_1' &= 2y_t - y_1 = 2. \end{aligned}$$
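The formulas for the reflection can be collected into one function. The name `reflect` is an assumption, and the sketch assumes that the tangent is not horizontal ($m_t \neq 0$), as the derivation does.

```python
import math


def reflect(p1, mt, bt):
    """Reflection of p1 around the line y = mt*x + bt; assumes mt != 0."""
    x1, y1 = p1
    xt = (y1 + x1 / mt - bt) / (mt + 1 / mt)   # foot of the perpendicular from p1
    yt = mt * xt + bt
    return 2 * xt - x1, 2 * yt - y1            # p_t is the midpoint of p1 and p1'


print(reflect((0, 4), math.sqrt(3), 0))
```

With the data of Example 10.12 the reflection lies on the directrix $y = 2$ of Parabola 1, as a fold of Axiom 6 requires.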

#### **10.6.3 Tangents to a Parabola**

We wish to prove that the folds of Axiom 6 are tangents to the parabolas. Figure 10.11 shows five points $p_i$, $i = 1, \ldots, 5$, each point $p_i$ at a distance $a_i$ from both the focus and the directrix. Drop perpendicular lines from $p_i$ to the directrix and denote the intersections of these lines with the directrix by $p_i'$. By Axiom 2 there are folds $l_i$ through $p_i$ that place the focus $p$ onto the directrix. The points $p_i'$ are the reflections of $p$ around the folds. The figure shows the fold $l_1$ through $p_1$ and the reflection $p_1'$.

**Fig. 10.11** The tangent as a geometric locus

**Fig. 10.12** The proof that the fold is a tangent

**Theorem 10.7** *The folds of Axiom* 6 *are the tangents to the parabolas that are the loci of the points equidistant to the points* $p_1$, $p_2$ *and* $l_1$, $l_2$*, respectively.*

*Proof* In Fig. 10.12, the focus is $p$ and the directrix is $d$. $p'$ is a point on the directrix and $l$ is the fold that reflects $p$ onto $p'$. Let $s$ be the intersection of $\overline{pp'}$ and $l$. Then $\overline{ps} = \overline{p's} = a$ and $l \perp \overline{pp'}$ since $l$ is the perpendicular bisector of $\overline{pp'}$.

Let $r$ be the intersection of the line perpendicular to $d$ through $p'$ and the fold $l$. Then $\triangle psr \cong \triangle p'sr$ by side-angle-side. It follows that $\overline{pr} = \overline{p'r} = b$ so $r$ is a point on the parabola. Choose a point $p''$ on the directrix that is distinct from $p'$ and assume that the fold $l$ also reflects $p$ onto $p''$. Let $q$ be the intersection of the perpendicular to $d$ through $p''$ and the fold $l$. $\triangle psq \cong \triangle p'sq$ so $\overline{pq} = \overline{p'q} = c$. Denote $\overline{qp''} = e$. If $q$ is a point on the parabola then $e = \overline{qp''} = \overline{qp} = c$, but $c$ is the hypotenuse of the right triangle $\triangle qp''p'$ and it is not possible that the hypotenuse is equal to one of the other sides of the right triangle. Therefore the fold $l$ has only one intersection with the parabola and must be a tangent. □

#### **10.7 Axiom 7**

**Axiom 10.8** Given a point $p_1$ and two lines $l_1$ and $l_2$, there is a fold $l$ that places $p_1$ onto $l_1$ and is perpendicular to $l_2$ (Fig. 10.13).

The fold is the geometric locus of all points on the line perpendicular to $l_2$ and equidistant from $p_1$ and $p_1'$, the reflection of $p_1$ onto $l_1$.

**Derivation of the equation of the fold:** Let $p_1 = (x_1, y_1)$, let $l_1$ be $y = m_1 x + b_1$ and let $l_2$ be $y = m_2 x + b_2$. Let $l_p$ be the line containing $\overline{p_1 p_1'}$. Since $l \perp l_2$ and $l \perp l_p$, it follows that $l_p \parallel l_2$ and the equation of $l_p$ is $y = m_2 x + b_p$.

**Fig. 10.13** Axiom 7

$l_p$ passes through $p_1$ so $y_1 = m_2 x_1 + b_p$ and its equation is $y = m_2 x + (y_1 - m_2 x_1)$. The reflection $p_1' = (x_1', y_1')$ is the intersection of $l_1$ and $l_p$:

$$\begin{aligned} m_1 x_1' + b_1 &= m_2 x_1' + (y_1 - m_2 x_1) \\ x_1' &= \frac{y_1 - m_2 x_1 - b_1}{m_1 - m_2} \\ y_1' &= m_1 x_1' + b_1\,. \end{aligned}$$

The midpoint $p_m = (x_m, y_m)$ of $\overline{p_1 p_1'}$ is:

$$(x_m, y_m) = \left(\frac{x_1 + x_1'}{2}, \frac{y_1 + y_1'}{2}\right).$$

$l \perp l_2$ and it passes through $p_m$ so its equation is:

$$y = -\frac{1}{m_2}x + b_m,$$

where $b_m$ can be computed from $y_m = -\frac{1}{m_2}x_m + b_m$:

$$b_m = y_m + \frac{x_m}{m_2}\,.$$

The equation of the fold $l$ is therefore:

$$y = -\frac{1}{m_2}x + \left(y_m + \frac{x_m}{m_2}\right).$$

*Example 10.13* Let $p_1 = (5, 3)$, let $l_1$ be $y = 3x - 3$ and let $l_2$ be $y = -x + 11$. Then:

$$\begin{aligned} x_1' &= \frac{3 - (-1) \cdot 5 - (-3)}{3 - (-1)} = \frac{11}{4} \\ y_1' &= 3 \cdot \frac{11}{4} + (-3) = \frac{21}{4} \\ p_m &= \left(\frac{5 + \frac{11}{4}}{2}, \frac{3 + \frac{21}{4}}{2}\right) = \left(\frac{31}{8}, \frac{33}{8}\right). \end{aligned}$$

The equation of the fold is:

$$y = -\frac{1}{-1} \cdot x + \left(\frac{33}{8} + \frac{\frac{31}{8}}{-1}\right) = x + \frac{1}{4}\,.$$
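The derivation for Axiom 7 can be checked mechanically. The following is a minimal sketch, not part of the original text: the function name `axiom7_fold` and the representation of a line as a pair `(m, b)` are assumptions made for illustration. It uses exact rational arithmetic to reproduce Example 10.13.

```python
from fractions import Fraction as F

def axiom7_fold(p1, l1, l2):
    """Return (slope, intercept) of the fold that places p1 onto l1
    and is perpendicular to l2.

    p1 is a point (x1, y1); l1 and l2 are lines (m, b) meaning y = m*x + b.
    Assumes m2 != 0 and m1 != m2, as the derivation in the text does.
    """
    x1, y1 = p1
    m1, b1 = l1
    m2, _ = l2
    # Reflection p1' is the intersection of l1 with the line through p1 parallel to l2.
    xr = (y1 - m2 * x1 - b1) / (m1 - m2)
    yr = m1 * xr + b1
    # Midpoint of p1 and p1'.
    xm, ym = (x1 + xr) / 2, (y1 + yr) / 2
    # The fold passes through the midpoint with slope -1/m2.
    return -1 / m2, ym + xm / m2

slope, intercept = axiom7_fold((F(5), F(3)), (F(3), F(-3)), (F(-1), F(11)))
print(slope, intercept)
```

For the data of Example 10.13 the sketch yields slope $1$ and intercept $1/4$, that is, the fold $y = x + 1/4$.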

#### **What Is the Surprise?**

Origami, the art of paper folding, has been practiced for hundreds of years, so it is surprising that the mathematical formalization goes back only to the twentieth century. It is even more surprising that there is an axiomatization of paper folding. The mathematics of origami is an excellent way to learn analytic geometry, properties of parabolas and the concept of geometric locus.

#### **Sources**

The axioms of origami are presented in [56]. Lang [26] gives descriptions of origami constructions. [31, Chap. 10] contains the detailed theory of the mathematics of origami, including the proof that two parabolas can have zero, one, two or three common tangents. The proof of Thm. 10.7 was shown to me by Oriah Ben-Lulu. I found that geometric software like GeoGebra is useful for understanding the relation between the geometry and the algebra of the axioms.

A clear presentation of cubic equations can be found in [6, Chapters 1, 2].


# **Chapter 11 Lill's Method and the Beloch Fold**

## **11.1 A Magic Trick**

Construct a path consisting of four line segments $\{a_3 = 1, a_2 = 6, a_1 = 11, a_0 = 6\}$, starting from the origin along the positive direction of the $x$-axis and turning $90^\circ$ counterclockwise between segments. Construct a second path as follows: construct a line from the origin at an angle of $63.4^\circ$ and mark its intersection with $a_2$ by $P$. Turn left $90^\circ$, construct a line and mark its intersection with $a_1$ by $Q$. Turn left $90^\circ$ once again, construct a line and note that it intersects the end of the first path at $(-10, 0)$ (Fig. 11.1).

**Fig. 11.1** A magic trick

Compute the negation of the tangent of the angle at the start of the second path: $-\tan 63.4^\circ \approx -2$. Substitute this value into the polynomial whose coefficients are the lengths of the segments of the first path:

$$\begin{aligned} p(x) &= a_3 x^3 + a_2 x^2 + a_1 x + a_0 \\ &= x^3 + 6x^2 + 11x + 6 \\ p(-\tan 63.4^\circ) &= (-2)^3 + 6(-2)^2 + 11(-2) + 6 = 0\,. \end{aligned}$$

We have found a root of the cubic polynomial $x^3 + 6x^2 + 11x + 6$!

Let us continue the example. The polynomial $p(x) = x^3 + 6x^2 + 11x + 6$ has three roots $-1, -2, -3$. Compute the negation of the arc tangent of each root:

$$
\alpha = -\tan^{-1}(-1) = 45^{\circ}, \quad \beta = -\tan^{-1}(-2) \approx 63.4^{\circ}, \quad \gamma = -\tan^{-1}(-3) \approx 71.6^{\circ}.
$$

For each angle the second path intersects the end of the first path (Fig. 11.2).
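The three starting angles can be recomputed from the roots. This is a small sketch rather than part of the original text; the helper name `lill_angle_degrees` is hypothetical.

```python
import math

def lill_angle_degrees(root):
    """Starting angle (in degrees) of the second path for a candidate root."""
    return -math.degrees(math.atan(root))

# Roots of x^3 + 6x^2 + 11x + 6 and the corresponding angles, rounded to one decimal.
for root in (-1, -2, -3):
    print(root, round(lill_angle_degrees(root), 1))
```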

The value $-\tan 56.3^\circ \approx -1.5$ is not a root of the equation. Fig. 11.3 shows the result of applying the method for this angle. The second path does not intersect the line segment for the coefficient $a_0$ at $(-10, 0)$.

This example demonstrates a method discovered by Eduard Lill in 1867 for graphically finding the real roots of any polynomial. We are not actually finding the roots but verifying that a given value is a root.

Section 11.2 presents a formal specification of Lill's method (limited to cubic polynomials) and gives examples of how it works in special cases. A proof of the correctness of Lill's method is given in Sect. 11.3. Section 11.4 shows how the method can be implemented using origami Axiom 6. This is called the Beloch fold and preceded the formalization of the axioms of origami by many years.

**Fig. 11.2** Lill's method for the three roots of the polynomial

**Fig. 11.3** A path that does not lead to a root

#### **11.2 Specification of Lill's Method**

#### **11.2.1 Lill's Method as an Algorithm**

- For each coefficient $a_3, a_2, a_1, a_0$ (in that order) construct a line segment of that length, starting at the origin $O = (0, 0)$ in the positive direction of the $x$-axis. Turn $90^\circ$ counterclockwise between each segment.
- Construct a line from $O$ at an angle of $\theta$ with the positive $x$-axis that intersects $a_2$ at point $P$.
- Turn $\pm 90^\circ$ and construct a line from $P$ that intersects $a_1$ at $Q$.
- Turn $\pm 90^\circ$ and construct a line from $Q$ that intersects $a_0$ at $R$.
- If $R$ is the end point of the first path then $-\tan \theta$ is a root of $p(x)$.

**Fig. 11.4** Lill's method with negative roots

- When constructing the line segments of the first path, if a coefficient is negative, construct the line segment *backwards*.
- When constructing the line segments of the first path, if a coefficient is zero, do not construct a line segment but continue with the next $\pm 90^\circ$ turn.
- The phrase *intersects* $a_i$ means *intersects the line segment $a_i$ or any extension of* $a_i$.
- When building the second path choose to turn left or right by $90^\circ$ so that there is an intersection with the next segment of the first path or its extension.
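The steps above can be sketched in code. This is an illustrative sketch under stated assumptions, not the author's implementation: the names `first_path` and `lill_gap` are hypothetical, coefficients are passed as a list `[a_3, a_2, a_1, a_0]`, and floating-point arithmetic is used, so "the second path ends at the end of the first path" is tested with a small tolerance. Because each segment of the second path is perpendicular to the previous one, the choice between turning left or right is handled implicitly by allowing the intersection parameter to be negative.

```python
import math

DIRECTIONS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # right, up, left, down

def first_path(coeffs):
    """Vertices of the first path for coefficients [a_n, ..., a_0]."""
    vertices = [(0.0, 0.0)]
    for k, a in enumerate(coeffs):
        dx, dy = DIRECTIONS[k % 4]
        x, y = vertices[-1]
        # A negative coefficient moves backwards; a zero coefficient only turns.
        vertices.append((x + a * dx, y + a * dy))
    return vertices

def lill_gap(coeffs, candidate):
    """Distance between the end of the second path and the end of the first path."""
    vertices = first_path(coeffs)
    theta = -math.atan(candidate)          # starting angle of the second path
    px, py = vertices[0]
    ux, uy = math.cos(theta), math.sin(theta)
    for k in range(1, len(coeffs)):
        # Line containing segment k: through vertices[k] with direction DIRECTIONS[k % 4].
        dx, dy = DIRECTIONS[k % 4]
        vx, vy = vertices[k]
        rx, ry = vx - px, vy - py
        det = dx * uy - ux * dy
        t = (dx * ry - rx * dy) / det      # may be negative: segment or its extension
        px, py = px + t * ux, py + t * uy
        ux, uy = -uy, ux                   # turn by 90 degrees
    return math.dist((px, py), vertices[-1])

print(lill_gap([1, 6, 11, 6], -2))    # close to 0: -2 is a root
print(lill_gap([1, 6, 11, 6], -1.5))  # not 0: -1.5 is not a root
```

For the polynomial of Sect. 11.1 the gap for the candidate $-1.5$ is $0.375$, which equals $|p(-1.5)|$.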

#### **11.2.2 Negative Coefficients**

Let us demonstrate Lill's method on the polynomial $p(x) = x^3 - 3x^2 - 3x + 1$ with negative coefficients (Sect. 10.6). Start by constructing a segment of length 1 to the right. Next, turn $90^\circ$ to face up, but since the coefficient is negative, construct a segment of length 3 down, that is, in a direction opposite of the arrow. After turning $90^\circ$ to the left, the coefficient is again negative, so construct a segment of length 3 to the right. Finally, turn downwards and construct a segment of length 1 (Fig. 11.4, the loosely dashed lines will be discussed in Sect. 11.2.4).

**Fig. 11.5** Lill's method with polynomials with zero coefficients

Start the second path with a line at $45^\circ$ with the positive $x$-axis. It intersects the extension of the line segment for $a_2$ at $(1, 1)$. Turning $-90^\circ$ (to the right), the line intersects the extension of the line segment for $a_1$ at $(5, -3)$. Turning $-90^\circ$ again, the line intersects the end of the first path at $(4, -4)$. Since $-\tan 45^\circ = -1$, we have found a root of the polynomial:

$$p(-1) = (-1)^3 - 3(-1)^2 - 3(-1) + 1 = 0\,.$$

#### **11.2.3 Zero Coefficients**

$a_2$, the coefficient of the $x^2$ term in the polynomial $x^3 - 7x - 6 = 0$, is zero. Construct a line segment of length 0, that is, do not construct a line, but still make the $\pm 90^\circ$ turn as indicated by the arrow pointing up at $(1, 0)$ in Fig. 11.5. Turn again and construct a line segment of length $-7$, that is, of length 7 backwards, to $(8, 0)$. Finally, turn once more and construct a line segment of length $-6$ to $(8, 6)$.

The second paths with the following angles intersect the end of the first path:

$$-\tan^{-1}(-1) = 45^\circ, \quad -\tan^{-1}(-2) \approx 63.4^\circ, \quad -\tan^{-1} 3 \approx -71.6^\circ\,.$$

**Fig. 11.6** Lill's method with non-integer roots

We conclude that there are three real roots $\{-1, -2, 3\}$. Check:

$$(x+1)(x+2)(x-3) = (x^2+3x+2)(x-3) = x^3-7x-6\,.$$

#### **11.2.4 Non-integer Roots**

Figure 11.6 shows Lill's method for $p(x) = x^3 - 2x + 1$. The first path goes from $(0, 0)$ to $(1, 0)$ and then turns up. The coefficient of $x^2$ is zero so no line segment is constructed and the path turns left. The next line segment is of length $-2$ so it goes backwards from $(1, 0)$ to $(3, 0)$. Finally, the path turns down and a line segment of length 1 is constructed from $(3, 0)$ to $(3, -1)$.

It is easy to see that if the second path starts at an angle of $-45^\circ$ it will intersect the first path at $(3, -1)$. Therefore, $-\tan(-45^\circ) = 1$ is a root. If we divide $p(x)$ by $x - 1$, we obtain the quadratic polynomial $x^2 + x - 1$ whose roots are:

$$\frac{-1 \pm \sqrt{5}}{2} \approx 0.62,\ -1.62\,.$$

There are two additional second paths: one starting at $-\tan^{-1} 0.62 \approx -31.8^\circ$, and the other starting at $-\tan^{-1}(-1.62) \approx 58.3^\circ$.

In addition to the root $-1$, the polynomial $p(x) = x^3 - 3x^2 - 3x + 1$ (Sect. 11.2.2) has roots $2 \pm \sqrt{3} \approx 3.73, 0.27$. The corresponding angles are $-\tan^{-1} 3.73 \approx -75^\circ$ and $-\tan^{-1} 0.27 \approx -15^\circ$ as shown by the loosely dashed lines in Fig. 11.4.

**Fig. 11.7** The cube root of two

#### **11.2.5 The Cube Root of Two**

To double a cube, compute $\sqrt[3]{2}$, a root of the cubic polynomial $x^3 - 2$. In the construction of the first path, turn left twice without constructing any line segments, because $a_2$ and $a_1$ are both zero. Then turn left again (to face down) and construct backwards (up) because $a_0 = -2$ is negative. The first segment of the second path is constructed at an angle of $-\tan^{-1} \sqrt[3]{2} \approx -51.6^\circ$ (Fig. 11.7).

#### **11.3 Proof of Lill's Method**

The proof is for monic cubic polynomials $p(x) = x^3 + a_2 x^2 + a_1 x + a_0$. If the polynomial is not monic, divide it by $a_3$ and the resulting polynomial has the same roots. In Fig. 11.8 the line segments of the first path are labeled with the coefficients and with $b_2, b_1, a_2 - b_2, a_1 - b_1$. In a right triangle if one acute angle is $\theta$ the other angle is $90^\circ - \theta$. Therefore, the angle above $P$ and the angle to the left of $Q$ are equal to $\theta$. Here are the formulas for $\tan \theta$ as computed from the three triangles:

$$\begin{aligned} \tan \theta &= \frac{b_2}{1} = b_2\\ \tan \theta &= \frac{b_1}{a_2 - b_2} = \frac{b_1}{a_2 - \tan \theta} \\ \tan \theta &= \frac{a_0}{a_1 - b_1} = \frac{a_0}{a_1 - \tan \theta (a_2 - \tan \theta)}\,. \end{aligned}$$

**Fig. 11.8** Proof of Lill's method

Simplify the last equation, multiply by $-1$ and absorb $-1$ into the powers:

$$\begin{aligned} (\tan\theta)^3 - a_2(\tan\theta)^2 + a_1(\tan\theta) - a_0 &= 0\\ (-\tan\theta)^3 + a_2(-\tan\theta)^2 + a_1(-\tan\theta) + a_0 &= 0\,. \end{aligned}$$

It follows that $-\tan \theta$ is a real root of $p(x) = x^3 + a_2 x^2 + a_1 x + a_0$.
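The chain of tangent formulas in the proof can be evaluated step by step. The sketch below is an illustration only: the function name `lill_lengths` is hypothetical and the parameter `t` stands for $\tan \theta$. It computes $b_2$, $b_1$ and the length cut off on the $a_0$ segment; the candidate $-t$ is a root exactly when that length equals $a_0$.

```python
from fractions import Fraction

def lill_lengths(a2, a1, t):
    """For a monic cubic, return (b2, b1, c) with t playing the role of tan(theta).

    b2 and b1 are the lengths labeled in the proof; c is the length that the
    second path cuts off on the a0 segment. -t is a root when c == a0.
    """
    b2 = t                  # tan(theta) = b2 / 1
    b1 = t * (a2 - b2)      # tan(theta) = b1 / (a2 - b2)
    c = t * (a1 - b1)       # tan(theta) = c / (a1 - b1)
    return b2, b1, c

# x^3 + 6x^2 + 11x + 6 with tan(theta) = 2, that is, candidate root -2.
print(lill_lengths(6, 11, 2))                # c equals a0 = 6
# tan(theta) = 3/2, that is, candidate -3/2, which is not a root.
print(lill_lengths(6, 11, Fraction(3, 2)))   # c differs from a0
```

The computation has the same nested shape as evaluating the polynomial by repeated multiplication, which is one way to see why the geometric construction tests for a root.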

#### **11.4 The Beloch Fold**

Margherita P. Beloch discovered a remarkable connection between folding and Lill's method: one application of the operation later known as origami Axiom 6 generates a real root of a cubic polynomial. The operation is often called the *Beloch fold*.

Consider the polynomial $p(x) = x^3 + 6x^2 + 11x + 6$ (Sect. 11.1). Recall that a fold is the perpendicular bisector of the line segment between any point and its reflection around the fold. In Fig. 11.9 we want the middle segment of the second path to lie on a fold that is the perpendicular bisector of two line segments: the one joining the origin to its reflection and the one joining the end point of the first path to its reflection.

Construct a line $a_2'$ parallel to $a_2$ at the same distance from $a_2$ as $a_2$ is from the origin, and construct a line $a_1'$ parallel to $a_1$ at the same distance from $a_1$ as $a_1$ is from the end point of the first path. Apply Axiom 6 to simultaneously place the origin on $a_2'$ and the end point on $a_1'$. The fold is the perpendicular bisector of the line segments joining these two points to their reflections, so the second path turns through a right angle at each of its intersections with $a_2$ and $a_1$, as required by Lill's method.
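The claim can be checked numerically for the root $-2$ of $x^3 + 6x^2 + 11x + 6$. The sketch below is not from the original text. It assumes the coordinates that follow from Sect. 11.1: the second path meets $a_2$ at $(1, 2)$ and $a_1$ at $(-7, 6)$, the first path ends at $(-10, 0)$, the line $a_2'$ is $x = 2$ and the line $a_1'$ is $y = 12$. The helper `reflect` is a hypothetical name.

```python
from fractions import Fraction

def reflect(point, a, b):
    """Reflect point across the line through a and b, using exact arithmetic."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    dx, dy = bx - ax, by - ay
    # Parameter of the foot of the perpendicular from point to the line.
    t = Fraction((px - ax) * dx + (py - ay) * dy, dx * dx + dy * dy)
    fx, fy = ax + t * dx, ay + t * dy
    return 2 * fx - px, 2 * fy - py

fold_a, fold_b = (1, 2), (-7, 6)         # middle segment of the second path
origin_image = reflect((0, 0), fold_a, fold_b)
end_image = reflect((-10, 0), fold_a, fold_b)
print(origin_image)   # lies on x = 2
print(end_image)      # lies on y = 12
```

Both reflected points land on the constructed parallel lines, so the line through the two intersection points is a fold of the kind that Axiom 6 produces.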

**Fig. 11.9** The Beloch fold for finding a root of $x^3 + 6x^2 + 11x + 6$

Figure 11.10 shows the Beloch fold for the polynomial $x^3 - 3x^2 - 3x + 1$ (Sect. 11.2.2). $a_2$ is the vertical line segment of length 3 whose equation is $x = 1$, and its parallel line is $a_2'$ whose equation is $x = 2$, because the origin is at a distance of 1 from $a_2$. $a_1$ is the horizontal line segment of length 3 whose equation is $y = -3$, and its parallel line is $a_1'$ whose equation is $y = -2$ because the end point $(4, -4)$ of the first path is at a distance of 1 from $a_1$. The fold is the perpendicular bisector of both line segments joining these points to their reflections, and the path from the origin along the fold to the end point is the same as the second path in Fig. 11.4.

**Fig. 11.10** The Beloch fold for finding a root of $x^3 - 3x^2 - 3x + 1$

#### **What Is the Surprise?**

Performing Lill's method as a magic trick never fails to surprise. It can be performed during a lecture using graphics software such as GeoGebra. It is also surprising that Lill's method, published in 1867, and Beloch's fold, published in 1936, preceded the axiomatization of origami by many years.

#### **Sources**

This chapter is based on [8, 24, 40].


# **Chapter 12 Geometric Constructions Using Origami**

This chapter shows that constructions with origami are more powerful than constructions with a straightedge and compass. We give two constructions for trisecting an angle, one by Hisashi Abe (Sect. 12.1) and the other by George E. Martin (Sect. 12.2), two constructions for doubling a cube, one by Peter Messer (Sect. 12.3) and the other by Margherita P. Beloch (Sect. 12.4), and the construction of a nonagon, a regular polygon with nine sides (Sect. 12.5).

#### **12.1 Abe's Trisection of an Angle**

**Construction:** Given an acute angle $\angle PQR$, construct $p$, the perpendicular to $\overline{QR}$ at $Q$. Construct $q$, a perpendicular to $p$ that intersects $p$ at point $A$, and construct $r$, the perpendicular to $p$ at $B$ that is halfway between $Q$ and $A$. Using Axiom 6 construct the fold $l$ that places $A$ at $A'$ on $\overline{PQ}$ and $Q$ at $Q'$ on $r$. Let $B'$ be the reflection of $B$ around $l$. Construct lines through $\overline{QB'}$ and $\overline{QQ'}$ (Fig. 12.1).

**Theorem 12.1** $\angle PQB' = \angle B'QQ' = \angle Q'QR = \angle PQR/3$*.*

*Proof (1)* $A'$, $B'$, $Q'$ are reflections around the line $l$ of the points $A$, $B$, $Q$ on the line $p$, so they are on the reflected line $p'$. By construction $\overline{AB} = \overline{BQ}$, $\angle ABQ' = \angle QBQ' = 90^\circ$ and $\overline{BQ'}$ is a common side, so $\triangle ABQ' \cong \triangle QBQ'$ by side-angle-side. Therefore, $\angle AQ'B = \angle QQ'B = \alpha$ so $\triangle AQ'Q$ is isosceles (Fig. 12.2).

By reflection $\triangle AQ'Q \cong \triangle A'QQ'$, so $\triangle A'QQ'$ is also an isosceles triangle. $\overline{QB'}$, the reflection of $\overline{Q'B}$, is the perpendicular bisector of an isosceles triangle, so $\angle A'QB' = \angle Q'QB' = \angle QQ'B = \alpha$. By alternating interior angles, $\angle Q'QR = \angle QQ'B = \alpha$. Together we have:

$$
\angle PQB' = \angle A'QB' = \angle B'QQ' = \angle Q'QR = \alpha \,. \qquad \Box
$$

**Fig. 12.1** Abe's trisection of an angle

*Proof (2)* Since $l$ is a fold it is the perpendicular bisector of $\overline{QQ'}$. Denote the intersection of $l$ with $\overline{QQ'}$ by $U$ and its intersection with $\overline{QB'}$ by $V$ (Fig. 12.2). $\triangle VUQ \cong \triangle VUQ'$ by side-angle-side since $\overline{VU}$ is a common side, the angles at $U$ are right angles and $\overline{QU} = \overline{Q'U}$. Therefore, $\angle VQU = \angle VQ'U = \alpha$ and $\angle Q'QR = \angle VQ'U = \alpha$ by alternating interior angles.

As in the first proof $A'$, $B'$, $Q'$ are all reflections around $l$, so they are on the line $p'$ and $\overline{A'B'} = \overline{AB} = \overline{BQ} = \overline{B'Q'} = a$. Then $\triangle A'B'Q \cong \triangle Q'B'Q$ by side-angle-side and $\angle A'QB' = \angle Q'QB' = \alpha$. □

**Fig. 12.2** Proofs of Abe's trisection ($U$, $V$ are used in Proof 2)

#### **12.2 Martin's Trisection of an Angle**

**Construction:** Given an acute angle $\angle PQR$, let $M$ be the midpoint of $\overline{PQ}$. Construct $p$, the perpendicular to $\overline{QR}$ through $M$, and construct $q$ perpendicular to $p$ through $M$ so $q \parallel \overline{QR}$. Using Axiom 6 construct the fold $l$ that places $P$ at $P'$ on $p$ and $Q$ at $Q'$ on $q$. If more than one fold is possible choose the one that intersects $\overline{PM}$. Construct $\overline{PP'}$ and $\overline{QQ'}$ (Fig. 12.3).

**Theorem 12.2** $\angle Q'QR = \angle PQR/3$*.*

*Proof* Denote the intersection of ′ with by and its intersection with by . Denote the intersection of and ′′ with by . It is not immediate that and ′′ intersect at the same point. But △′ ∼ △′ so the altitudes bisect both vertical angles ∠′ , ∠′ and they must be on the same line.

△ △′ by angle-side-angle since <sup>∠</sup> ′ = ∠ = by alternate interior angles, = = because is the midpoint of and ∠ = ∠′ = are vertical angles. Therefore, ′ = = .

**Fig. 12.3** Martin's trisection of an angle

△ ′′ △′ by side-angle-side, since ′ <sup>=</sup> <sup>=</sup> , the angles at are right angles and ′ is a common side. Since the altitude of the isoceles triangle △ ′ ′ is the bisector of ∠ ′ ′, it follows that ∠ ′ ′ = ∠′ = . Furthermore, ∠′ = ∠ ′ <sup>=</sup> by alternate interior angles. △ △ ′ by side-angle-side since = ′ = , the angles at are right angles and is a common side. Therefore:

$$
\begin{aligned}
\angle WQV &= \beta = \angle WQ'V = 2\alpha \\
\angle PQR &= \beta + \alpha = 3\alpha\,. \qquad \Box
\end{aligned}
$$

#### **12.3 Messer's Doubling of a Cube**

A cube of volume $V$ has sides of length $\sqrt[3]{V}$. A cube with twice the volume has sides of length $\sqrt[3]{2V} = \sqrt[3]{2}\,\sqrt[3]{V}$, so if we can construct $\sqrt[3]{2}$ we can multiply by the given length $\sqrt[3]{V}$ to double the cube.

**Construction:** Divide the side of a unit square into thirds as follows: Fold the square in half to locate the points $(0, 1/2)$ and $(1, 1/2)$. Next, construct the two lines whose equations are $y = 1 - x$ and $y = x/2$ (Fig. 12.4). The point of intersection $(2/3, 1/3)$ can be obtained by solving these two equations.

**Fig. 12.4** Dividing a length into thirds

Construct $\overline{EF}$, the perpendicular to $\overline{AB}$ through the point of intersection, and construct $\overline{GH}$, the reflection of $\overline{BC}$ around $\overline{EF}$. The side of the square has now been divided into thirds.

Using Axiom 6 place $C$ at $C'$ on $\overline{AB}$ and $F$ at $F'$ on $\overline{GH}$. Denote by $L$ the point of intersection of the fold with $\overline{BC}$ and denote by $b$ the length of $\overline{BL}$. Rename the length of the side of the square to $a + 1$ where $a = \overline{AC'}$. The length of $\overline{LC}$ is $(a + 1) - b$ (Fig. 12.5).

**Theorem 12.3** $\overline{AC'} = \sqrt[3]{2}$*.*

*Proof* When the fold is performed the line segment $\overline{LC}$ is reflected onto the line segment $\overline{LC'}$ and $\overline{CF}$ is folded onto the line segment $\overline{C'F'}$. Therefore:

$$
\overline{GC'} = a - \frac{a+1}{3} = \frac{2a-1}{3} \,. \tag{12.1}
$$

Since $\angle FCL$ is a right angle, so is $\angle F'C'L$.

$\triangle C'BL$ is a right triangle with legs $\overline{C'B} = 1$ and $\overline{BL} = b$ and hypotenuse $\overline{LC'} = \overline{LC} = (a+1) - b$, so by Pythagoras's Theorem:

$$1^2 + b^2 = \left((a+1) - b\right)^2\tag{12.2a}$$

$$b = \frac{a^2 + 2a}{2(a+1)}\,.\tag{12.2b}$$

$\angle GC'F' + \angle F'C'L + \angle LC'B = 180^\circ$ since they form the straight line $\overline{GB}$. Denote $\angle GC'F'$ by $\alpha$. Then:

**Fig. 12.5** Construction of $\sqrt[3]{2}$

$$
\angle LC'B = 180^\circ - \angle F'C'L - \angle GC'F' = 180^\circ - 90^\circ - \alpha = 90^\circ - \alpha,
$$

which we denote by $\alpha'$. The triangles $\triangle C'BL$, $\triangle F'GC'$ are right triangles so $\angle C'LB = \alpha$ and $\angle C'F'G = \alpha'$. Therefore, $\triangle C'BL \sim \triangle F'GC'$ and:

$$\frac{\overline{BL}}{\overline{C'L}} = \frac{\overline{GC'}}{\overline{C'F'}}\,.$$

Using Eq. 12.1 we have:

$$\frac{b}{(a+1)-b} = \frac{\frac{2a-1}{3}}{\frac{a+1}{3}}\,.$$

Substituting for $b$ using Eq. 12.2b gives:

$$\frac{\frac{a^2+2a}{2(a+1)}}{(a+1)-\frac{a^2+2a}{2(a+1)}}=\frac{2a-1}{a+1}\,.$$

The left-hand side equals $(a^2 + 2a)/(a^2 + 2a + 2)$, so cross-multiplying gives $(a^2 + 2a)(a + 1) = (2a - 1)(a^2 + 2a + 2)$. Simplify the equation to obtain $a^3 = 2$ and $a = \sqrt[3]{2}$. □
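The algebra at the end of the proof can be checked numerically. This is a minimal sketch with hypothetical function names, assuming only Eqs. 12.2a and 12.2b and the final proportion from the proof. It confirms that $a = \sqrt[3]{2}$ satisfies both relations and that another value such as $a = 1$ does not satisfy the proportion.

```python
def messer_b(a):
    """Length b from Eq. 12.2b for a square of side a + 1."""
    return (a**2 + 2 * a) / (2 * (a + 1))

def pythagoras_residual(a):
    """1^2 + b^2 - ((a + 1) - b)^2, which should vanish for every a (Eq. 12.2a)."""
    b = messer_b(a)
    return 1 + b**2 - ((a + 1) - b) ** 2

def proportion_residual(a):
    """Difference of the two sides of the final proportion in the proof."""
    b = messer_b(a)
    return b / ((a + 1) - b) - (2 * a - 1) / (a + 1)

cube_root_of_two = 2 ** (1 / 3)
print(pythagoras_residual(cube_root_of_two))  # approximately 0
print(proportion_residual(cube_root_of_two))  # approximately 0
print(proportion_residual(1.0))               # not 0
```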

#### **12.4 Beloch's Doubling of a Cube**

Since the Beloch fold (Axiom 6) can solve cubic equations it is reasonable to conjecture that it can be used to double a cube. Here we give a direct construction that uses the fold.

**Construction:** Let $A = (-1, 0)$, $B = (0, -2)$. Let $p$ be the line $x = 1$ and let $q$ be the line $y = 2$. Use the Beloch fold to construct the fold $l$ that places $A$ at $A'$ on $p$ and $B$ at $B'$ on $q$. Denote the intersection of the fold and the $y$-axis by $Y$ and the intersection of the fold and the $x$-axis by $X$ (Fig. 12.6).

**Theorem 12.4** $\overline{OY} = \sqrt[3]{2}$*.*

*Proof* The fold is the perpendicular bisector of both $\overline{AA'}$ and $\overline{BB'}$ so $\overline{AA'} \parallel \overline{BB'}$. By alternate interior angles $\angle YAO = \angle BXO = \alpha$. The labeling of the other angles in the figure follows from the properties of right triangles.

**Fig. 12.6** Beloch's doubling of the cube

$\triangle AOY \sim \triangle YOX \sim \triangle XOB$ and $\overline{OA} = 1$, $\overline{OB} = 2$ are given so:

$$
\frac{\overline{OY}}{\overline{OA}} = \frac{\overline{OX}}{\overline{OY}} = \frac{\overline{OB}}{\overline{OX}}
$$

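$$
\frac{\overline{OY}}{1} = \frac{\overline{OX}}{\overline{OY}} = \frac{2}{\overline{OX}}\,.
$$

From the first and second ratios we have $\overline{OY}^2 = \overline{OX}$ and from the first and third ratios we have $\overline{OY} \cdot \overline{OX} = 2$. Substituting for $\overline{OX}$ gives $\overline{OY}^3 = 2$ and $\overline{OY} = \sqrt[3]{2}$. □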
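The result can be checked by reflecting the two given points around the claimed fold. The sketch below is an illustration, not the author's method: it assumes the coordinates $A = (-1, 0)$ and $B = (0, -2)$, the target lines $x = 1$ and $y = 2$, and a fold through $(0, \sqrt[3]{2})$ on the $y$-axis and $(\sqrt[3]{4}, 0)$ on the $x$-axis. The helper `reflect` is a hypothetical name.

```python
def reflect(point, a, b):
    """Reflect point across the line through a and b (floating point)."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    fx, fy = ax + t * dx, ay + t * dy
    return 2 * fx - px, 2 * fy - py

c = 2 ** (1 / 3)
Y = (0.0, c)        # intersection of the fold with the y-axis
X = (c * c, 0.0)    # intersection of the fold with the x-axis

A_image = reflect((-1.0, 0.0), X, Y)
B_image = reflect((0.0, -2.0), X, Y)
print(A_image)  # x-coordinate close to 1, so A' lies on the line x = 1
print(B_image)  # y-coordinate close to 2, so B' lies on the line y = 2
```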

#### **12.5 Construction of a Regular Nonagon**

A nonagon (a regular polygon with nine sides) is constructed by deriving the cubic equation for its central angle and then solving the equation using Lill's method and the Beloch fold. The central angle is $\theta = 360^\circ/9 = 40^\circ$. By Thm. A.6:

$$\cos 3\theta = 4\cos^3 \theta - 3\cos \theta\,.$$

Let $x = \cos 40^\circ$. Then for the nonagon the equation is $4x^3 - 3x + (1/2) = 0$ since $\cos (3 \cdot 40^\circ) = \cos 120^\circ = -(1/2)$. Figure 12.7 shows the paths for the equation constructed according to Lill's method.

**Fig. 12.7** Lill's method for a nonagon

The second path starts from the beginning of the first path at an angle of approximately $-37.45^\circ$. Turns of $90^\circ$ at its intersection with $a_2$ and then $-90^\circ$ at its intersection with $a_1$ cause the path to intersect the first path at its end point. Therefore, $x = -\tan(-37.45^\circ) \approx 0.766$ is a root of $4x^3 - 3x + (1/2)$.
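The numbers in this step can be reproduced directly. The following sketch is not from the original text and its variable names are arbitrary. It checks that $\cos 40^\circ$ is a root of the cubic and recovers the starting angle of the second path.

```python
import math

x = math.cos(math.radians(40))

# cos 40 degrees should satisfy 4x^3 - 3x + 1/2 = 0.
residual = 4 * x**3 - 3 * x + 0.5

# Starting angle of the second path in Lill's method for this root.
angle = -math.degrees(math.atan(x))

print(round(x, 3))       # approximately 0.766
print(residual)          # approximately 0
print(round(angle, 2))   # approximately -37.45
```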

The root can be obtained using the Beloch fold. Construct the line $a_2'$ parallel to $a_2$ at the same distance from $a_2$ as $a_2$ is from the beginning of the first path. Although the length of $a_2$ is zero, it still has a direction (upwards) so the parallel line can be constructed. Similarly, construct the line $a_1'$ parallel to $a_1$ at the same distance from $a_1$ as $a_1$ is from the end point of the first path. The Beloch fold simultaneously places the beginning of the first path on $a_2'$ and its end point on $a_1'$. This constructs the angle of $-37.45^\circ$ at which the second path starts (Fig. 12.8).

**Fig. 12.8** The Beloch fold for solving the equation of the nonagon

By Lill's method $-\tan(-37.45^\circ) \approx 0.766$ and therefore $\cos \theta \approx 0.766$ is a root of the equation for the central angle $\theta$. We finish the construction of the nonagon by constructing $\cos^{-1} 0.766 \approx 40^\circ$.

The right triangle △ with <sup>∠</sup> <sup>≈</sup> <sup>37</sup>.45◦ and = 1 has opposite side <sup>≈</sup> <sup>0</sup>.766 by defnition of tangent (Fig. 12.9a). Fold onto the so that the refection of is and = 0.766. Extend and construct so that = 1. Fold to refect at on the extension of (Fig. 12.9b). Then:

$$
\angle BDF = \cos^{-1} \frac{0.766}{1} \approx 40^{\circ} \text{ .}
$$

**Fig. 12.9a** The tangent that is the solution of the equation for the nonagon

**Fig. 12.9b** The cosine of the central angle of the nonagon

#### **What Is the Surprise?**

We saw in Chaps. 2 and 3 that tools such as the neusis can perform constructions that cannot be done with a straightedge and compass. It is surprising that trisecting an angle and doubling a cube can be constructed using only paper folding. Roger C. Alperin has developed a hierarchy of four methods of construction each more powerful than the previous one.

#### **Sources**

This chapter is based on [2, 26, 31, 36].


## **Chapter 13 A Compass Is Sufficient**

In 1797 Lorenzo Mascheroni proved that any construction carried out with a straightedge and compass can be carried out with only a compass. Later it came to light that this theorem had already been proved by Georg Mohr in 1672. After explaining in Sect. 13.1 what is meant by performing a construction with only a compass, the proof is presented in stages starting with four auxiliary constructions: reflection of a point (Sect. 13.2), construction of a circle with a given radius (Sect. 13.3), addition and subtraction of line segments (Sect. 13.4) and construction of a line segment as a ratio of segments (Sect. 13.5). Section 13.6 shows how to find the intersection of two lines and Sect. 13.7 shows how to find the intersection of a line and a circle.

## **13.1 What Is a Construction With Only a Compass?**

Figure 13.1a shows the construction of an equilateral triangle using a straightedge and compass. How can we construct a triangle without drawing the line segments that form its sides? A line segment is defined by two points, so it is sufficient to construct these points in order to obtain a construction equivalent to the one with a straightedge (Fig. 13.1b). There is no need to actually *see* the line segments. There will be lines in the figures in this chapter, but they are used only to understand the construction and the proof of its correctness. It is important to convince yourself that the construction itself uses only a compass.

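A construction using a straightedge and compass is a sequence of three operations:

- Find the point of intersection of two straight lines.
- Find the points of intersection of a straight line and a circle.
- Find the points of intersection of two circles.

The third operation can be done with only a compass. We need to show that the first two operations can be done with a compass alone.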

**Fig. 13.1a** Construction of an equilateral triangle with a straightedge and a compass

**Fig. 13.1b** Construction of an equilateral triangle with only a compass

Notation:


#### **13.2 Reflection of a Point**

**Definition 13.1** A point $C'$ is a *reflection* of the point $C$ around a line segment $\overline{AB}$ if and only if $\overline{AB}$ (or the line containing $\overline{AB}$) is the perpendicular bisector of the line segment $\overline{CC'}$.

**Theorem 13.1** *Given a line* $\overline{AB}$ *and a point* $C$ *not on* $\overline{AB}$*, it is possible to build* $C'$*, the reflection of* $C$ *around* $\overline{AB}$*.*

*Proof* Construct a circle centered on $A$ passing through $C$ and a circle centered on $B$ passing through $C$. The other intersection of the two circles is the point $C'$ which is the reflection of $C$ (Fig. 13.2). $\triangle ABC \cong \triangle ABC'$ by side-side-side since $\overline{AC}$, $\overline{AC'}$ are radii of the same circle, as are $\overline{BC}$, $\overline{BC'}$, and $\overline{AB}$ is a common side. Therefore, $\angle CAB = \angle C'AB$ so $\overline{AB}$ is the angle bisector of $\angle CAC'$. But $\triangle CAC'$ is an isosceles triangle and the angle bisector $\overline{AB}$ is also the perpendicular bisector of $\overline{CC'}$, the base of $\triangle CAC'$. By definition $C'$ is the reflection of $C$ around $\overline{AB}$. □
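The construction in the proof uses only two circles, which can be imitated numerically. This is a sketch under stated assumptions, not part of the original text: the function names are hypothetical, points are coordinate pairs, the two centers are assumed to be distinct, and the reflection is taken to be the intersection point of the two circles that lies farther from the given point.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Intersection points of two circles with distinct centers c1, c2."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    a = (r1**2 - r2**2 + d**2) / (2 * d)       # distance from c1 to the chord
    h = math.sqrt(max(r1**2 - a**2, 0.0))      # half the length of the chord
    mx, my = x1 + a * (x2 - x1) / d, y1 + a * (y2 - y1) / d
    ox, oy = h * (y2 - y1) / d, h * (x2 - x1) / d
    return (mx + ox, my - oy), (mx - ox, my + oy)

def reflect_with_compass(a, b, c):
    """Reflection of c around the line through a and b, using two circles only."""
    p, q = circle_intersections(a, math.dist(a, c), b, math.dist(b, c))
    # One intersection is c itself; the other one is the reflection.
    return max((p, q), key=lambda point: math.dist(point, c))

print(reflect_with_compass((0, 0), (4, 0), (1, 2)))  # close to (1, -2)
```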

**Fig. 13.2** Construction of a reflection

#### **13.3 Construction of a Circle With a Given Radius**

**Theorem 13.2** *Given points* , , *it is possible to construct* (, )*, the circle centered at with radius .*

*Proof* Construct (, ) and (, ) and let , be their points of intersection (Fig. 13.3). is the refection of around since △ △ by side-sideside. By Thm. 13.1 construct ′ , the refection of around and then construct (, ′ ) (Fig. 13.4).

 is the perpendicular bisector of ′ and . Denote the intersection of and by and the intersection of and ′ by . Then ′ = , = and ∠ = ∠′ is a right angle, so △ △′ by sideangle-side. Therefore, = ′ and ∠′ = ∠ (they are complementary to <sup>∠</sup> ′ <sup>=</sup> <sup>∠</sup> ). It follows that △′ △ by side-angle-side so ′ = . □

**Fig. 13.3** Construction of a circle with a given radius (1)

**Fig. 13.4** Construction of a circle with a given radius (2)

#### **13.4 Addition and Subtraction of Line Segments**

**Theorem 13.3** *Given a line segment of length and a line segment of length , it is possible to construct line segments* , *such that is a line segment, the length of is* <sup>−</sup> *and the length of is* <sup>+</sup> *(Fig. 13.5).*

**Fig. 13.5** Addition and subtraction of line segments

The proof is quite long and will be presented as a sequence of constructions.

**Theorem 13.4** *An isosceles trapezoid can be constructed.*

*Proof* Let be any point on (, ). Construct ′ its reflection around . Denote the length of ′ by ℎ (Fig. 13.6).

Construct the circles (, ), (, ℎ). Let be a point of intersection of the circles and construct ′ the reflection of around (Fig. 13.7).

**Fig. 13.6** Construction of an isosceles trapezoid (1)

The line containing is the perpendicular bisector of ′ and ′ so ′ <sup>∥</sup> ′ . = since it is the radius of the circle centered on , and ′ , ′ are reflections of , . △′ △′ ′ by side-side-side and △ △ ′′ by side-angle-side, so ′′ = = . It follows that ′′ is an isosceles trapezoid whose bases are ′ = ℎ, ′ = 2ℎ (Fig. 13.8). Denote the length of the diagonals ′ = ′ by . □

**Theorem 13.5** *An isosceles trapezoid can be circumscribed by a circle.*

*Proof* The theorem follows immediately from Thms. A.15 and A.16. □

**Theorem 13.6** *For* $b, d, h$ *shown in Fig. 13.8,* $d^2 = b^2 + 2h^2$*.*

*Proof* The theorem follows from Ptolemy's theorem (Thm. A.18) which says that in a quadrilateral circumscribed by a circle the product of the diagonals equals the sum of the products of the opposite sides. □
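As a sanity check of the relation between the leg, the diagonal and the bases, the sketch below places an isosceles trapezoid with bases of length $h$ and $2h$ in coordinates and compares the squared diagonal with the squared leg. The coordinate placement, the function name `leg_and_diagonal_squared` and the sample values are assumptions made for this illustration; they do not come from the figures.

```python
from fractions import Fraction

def leg_and_diagonal_squared(h, height):
    """Squared leg and squared diagonal of an isosceles trapezoid
    with bases h and 2h and the given height."""
    p, q = (-h, Fraction(0)), (h, Fraction(0))        # endpoints of the base of length 2h
    p2, q2 = (-h / 2, height), (h / 2, height)        # endpoints of the base of length h

    def dist_sq(u, v):
        return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2

    # p-p2 is a leg, p-q2 is a diagonal
    return dist_sq(p, p2), dist_sq(p, q2)

h = Fraction(2)
b_sq, d_sq = leg_and_diagonal_squared(h, Fraction(3))
print(b_sq, d_sq, b_sq + 2 * h**2)  # 10 18 18
```

Ptolemy's theorem for this cyclic quadrilateral reads $d \cdot d = b \cdot b + h \cdot 2h$, which is the equality the last two printed values confirm.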

**Fig. 13.7** Construction of an isosceles trapezoid (2)

**Fig. 13.8** Construction of an isosceles trapezoid (3)

The proof of Thm. 13.3 can now be given.

*Proof* Let be the point on line that extends by . (We will eventually construct .) Define $x$ = ′. From Thm. 13.6:

$$d^2 = b^2 + 2h^2 = (x^2 - h^2) + 2h^2 = x^2 + h^2\,.$$

Here $b^2 = x^2 - h^2$ because △′ is a right triangle, so that $x^2 = b^2 + h^2$ (Fig. 13.9).

**Fig. 13.9** Application of Ptolemy's theorem

Construct as the intersection of (, ), ( ′ , ) (Fig. 13.10). △′ is a right triangle so by Pythagoras's Theorem 2 = <sup>2</sup> <sup>−</sup> <sup>ℎ</sup> <sup>2</sup> = 2 and = .

**Fig. 13.10** Construction of the point for addition and subtraction (1)

Construct as the intersection of (, ), ( ′ , ) (Fig. 13.11). Since the length of is <sup>√</sup> <sup>2</sup> <sup>−</sup> <sup>ℎ</sup> <sup>2</sup> <sup>=</sup> the length of is <sup>+</sup> and the length of ′ is <sup>−</sup> . □

**Fig. 13.11** Construction of the point for addition and subtraction (2)

#### **13.5 Construction of a Line Segment as a Ratio of Segments**

**Theorem 13.7** *Given line segments of length* , , *, it is possible to construct a line segment of length:*

$$x = \frac{n}{m}s\ .$$

*Proof* Construct two concentric circles <sup>1</sup> <sup>=</sup> (, ) and <sup>2</sup> <sup>=</sup> (, ),<sup>1</sup> and choose an arbitrary point on 1. By Thm. 13.2 construct a chord of length on <sup>1</sup> (Fig. 13.12a). If intersects 2, by Thm. 13.3 multiply , by a number so that the chord does not intersect the circle. Note that this does not change the value that we are trying to construct since = = .

Choose a point on <sup>2</sup> and denote the length of by . Construct on <sup>2</sup> such that the length of is (Fig. 13.12b). △ △ by side-side-side since = = are the radii of the same circle, as are = = , and <sup>=</sup> <sup>=</sup> by construction (Fig. 13.13a). From △ △ it follows ∠ = ∠ and then ∠ = ∠. It is difficult to see this equality from the diagram, but Fig. 13.13b should clarify the relation among the angles.

△ ∼ △ since both are isosceles triangles and we have shown that they have the same vertex angle. Label by . Then:

$$\begin{aligned} \frac{m}{s} &= \frac{n}{x} \\ x &= \frac{n}{m}s \end{aligned} \tag{7}$$

**Fig. 13.12a** Construction of $x = \frac{n}{m}s$, step 1

**Fig. 13.12b** Construction of $x = \frac{n}{m}s$, step 2

<sup>1</sup> We assume that > ; if not, exchange the notation.

**Fig. 13.13a** Construction of $x = \frac{n}{m}s$, step 3

**Fig. 13.13b** <sup>∠</sup> <sup>=</sup> <sup>∠</sup>

#### **13.6 Construction of the Intersection of Two Lines**

**Theorem 13.8** *Given two lines containing the line segments* ,*, it is possible to construct their intersection .*

*Proof* Let ′ , ′ be the reflections of , around . There are two cases depending on whether , lie on the same side of or on different sides. Label = , = ′ , = ′ , = as shown in Figs. 13.14, 13.15. We compute the value of for each case.

*Case 1:* , are on different sides of . lies on because △ △ ′ by side-angle-side: = ′, ∠ = ∠ ′ = 90◦ and is a common side. Therefore ′ <sup>=</sup> and similarly ′ <sup>=</sup> . △′ ∼ △′ are similar so <sup>−</sup> = and solving the equation gives = <sup>+</sup> .

**Fig. 13.14** Construction of the intersection of two lines (1)

**Fig. 13.15** Construction of the intersection of two lines (2)

*Case 2:* , are on the same side of . △′ ∼ △′ gives <sup>−</sup> = and solving the equation gives = <sup>−</sup> .

Construct the circles ( ′ , ), (, ) and denote their intersection by (Fig. 13.16). The sum of the line segments ′ ,′ is <sup>+</sup> . We have to show that is on the extension of ′ so that is a line segment of length <sup>+</sup> . <sup>=</sup> <sup>−</sup> in case is on the same side of as (not shown in the diagram).

 is the intersection of ( ′ , ), (, ) so <sup>=</sup> , ′ <sup>=</sup> . By construction ′′ = , ′ = so the quadrilateral ′′ is a parallelogram.

**Fig. 13.16** Construction of the intersection of two lines (3)

**Fig. 13.17** Construction of the intersection of two lines (4)

By construction ′ <sup>∥</sup> ′ so ′ <sup>∥</sup> ′ and therefore ′ <sup>∥</sup> ′ . Since one of its end points is ′ it must be on the line containing ′ . By Thm. 13.3, from the lengths , , a line segment of length <sup>+</sup> can be constructed and by Thm. 13.7 a line segment of length = <sup>+</sup> can be constructed. , the intersection of ( ′ , ) and (, ), is also the intersection of , (Fig. 13.17). □

#### **13.7 Construction of the Intersection of a Line and a Circle**

**Theorem 13.9** *Given a circle* <sup>=</sup> (, ) *and a line it is possible to construct the intersections of and .*

*Proof* Construct ′ , the reflection of around and construct the circle ′ = (′ , ). Since ′ △ ′ , , , the points of intersection of , ′ , are the points of intersection of and (Fig. 13.18).

This construction cannot be done if is on the line . In that case choose an arbitrary point on that is at a distance more than from . Using Thm. 13.3 shorten and lengthen by . , , the endpoints of these segments, are the intersections of and (Fig. 13.19). □

**Fig. 13.18** Construction of the intersection of a line and a circle (1)

#### **What Is the Surprise?**

When one learns about constructions with a straightedge and compass it is obvious that both tools are necessary. Therefore, it was quite a surprise to find out that a compass is sufficient. The proof is quite long so we are not going to leave the straightedge at home, but the theorem shows that we should not assume that there are no alternatives to well-known mathematical concepts.

#### **Sources**

This chapter is based on problem 33 of [13] reworked by Michael Woltermann [14]. An additional proof can be found in [25].

**Fig. 13.19** Construction of the intersection of a line and a circle (2)

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

## **Chapter 14 A Straightedge and One Circle is Sufficient**

Can every construction with a straightedge and compass be done with only a straightedge? The answer is no because lines are defined by linear equations and cannot represent circles which are defined by quadratic equations. In 1822 Jean-Victor Poncelet conjectured that a straightedge only is sufficient provided that *one circle* exists in the plane. This was proved in 1833 by Jakob Steiner.

After explaining in Sect. 14.1 what is meant by performing a construction with only a straightedge and one circle, the proof is presented in stages starting with five auxiliary constructions: construction of a line parallel to a given line (Sect. 14.2), construction of a perpendicular to a given line (Sect. 14.3), copying a line segment in a given direction (Sect. 14.4), construction of a line segment as a ratio of segments (Sect. 14.5) and construction of a square root (Sect. 14.6). Section 14.7 shows how to find the intersection(s) of a line with a circle and Sect. 14.8 shows how to find the intersection(s) of two circles.

#### **14.1 What Is a Construction With Only a Straightedge?**

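A construction using a straightedge and compass is a sequence of three operations:

1. Find the point of intersection of two straight lines.
2. Find the point(s) of intersection of a straight line and a circle.
3. Find the point(s) of intersection of two circles.

The first operation can be performed with a straightedge only.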

A circle is defined by a point , its *center*, and by a *radius* , a line segment of length one of whose endpoints is the center. If we can construct the points labeled and in Fig. 14.1a we can claim to have successfully constructed the points of intersection of a given circle with a given line. Similarly, the construction of , in Fig. 14.1b is the construction of the points of intersection of two given circles. The circles drawn with dashed lines in a diagram do not actually appear in a construction; they are just used to help understand the construction.

**Fig. 14.1a** , are the points of intersection of a line and a circle

**Fig. 14.1b** , are the points of intersection of two circles

The single given circle used in the constructions, called the *fixed circle*, can appear anywhere in the plane and can have an arbitrary radius.

#### **14.2 Construction of a Line Parallel to a Given Line**

**Theorem 14.1** *Given a line defined by two points* , *and a point not on the line, it is possible to construct a line through that is parallel to .*

*Proof* There are two cases to the proof.

*Case 1:* is a *directed line segment* if the midpoint of is given. Construct a ray that extends and choose any point on the ray beyond . Construct the lines , , . The intersection of and is denoted . Construct a ray that extends and denote by the intersection of the ray with (Fig. 14.2).

We claim that $\overline{PQ} \parallel \overline{AB}$.

**Fig. 14.2** Construction of a parallel line in the case of a directed line

The proof uses Ceva's theorem.

*Ceva's theorem (Thm. A.19):* If the line segments from the vertices of a triangle to the opposite edges intersect in a point (as in Fig. 14.2), the lengths of the segments satisfy:

$$
\frac{\overline{AM}}{\overline{MB}} \cdot \frac{\overline{BQ}}{\overline{QS}} \cdot \frac{\overline{SP}}{\overline{PA}} = 1.
$$

In Fig. 14.2 $M$ is the midpoint of $\overline{AB}$ so $\frac{\overline{AM}}{\overline{MB}} = 1$ and the equation becomes:

$$\frac{\overline{BQ}}{\overline{QS}} = \frac{\overline{PA}}{\overline{SP}} = \frac{\overline{AP}}{\overline{PS}}\,. \tag{14.1}$$

since the order of the endpoints of a line segment is not important.

We claim that $\triangle ABS \sim \triangle PQS$:

$$\frac{\overline{BS}}{\overline{QS}} = \frac{\overline{BQ}}{\overline{QS}} + \frac{\overline{QS}}{\overline{QS}} = \frac{\overline{BQ}}{\overline{QS}} + 1$$

$$\frac{\overline{AS}}{\overline{PS}} = \frac{\overline{AP}}{\overline{PS}} + \frac{\overline{PS}}{\overline{PS}} = \frac{\overline{AP}}{\overline{PS}} + 1$$

Using Eq. 14.1:

$$\frac{\overline{BS}}{\overline{QS}} = \frac{\overline{BQ}}{\overline{QS}} + 1 = \frac{\overline{AP}}{\overline{PS}} + 1 = \frac{\overline{AP}}{\overline{PS}} + \frac{\overline{PS}}{\overline{PS}} = \frac{\overline{AS}}{\overline{PS}}\,,$$

and since the two triangles share the angle at $S$ and the sides enclosing it are proportional, it follows that $\triangle ABS \sim \triangle PQS$ and therefore $\overline{PQ} \parallel \overline{AB}$.
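The argument from Ceva's theorem can be checked with exact arithmetic. The sketch below is a hedged illustration: the coordinates are arbitrary sample values, `line_intersection` is a helper written for this example, and the name `o` for the common point of the three cevians is an assumption of the sketch. The points $A$, $B$, $M$, $P$, $S$, $Q$ play the roles they have in Ceva's equation above.

```python
from fractions import Fraction as F

def line_intersection(p1, p2, p3, p4):
    """Intersection of the line p1p2 with the line p3p4 (assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)

A, B = (F(0), F(0)), (F(6), F(0))
M = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)      # the given midpoint of AB
P = (F(1), F(2))                                # the point not on AB
S = (A[0] + 3 * (P[0] - A[0]), A[1] + 3 * (P[1] - A[1]))  # on the ray AP beyond P

o = line_intersection(B, P, S, M)               # common point of BP and SM
Q = line_intersection(A, o, S, B)               # the ray from A through o meets SB at Q

# PQ is parallel to AB exactly when the cross product of the directions is zero
cross = (Q[0] - P[0]) * (B[1] - A[1]) - (Q[1] - P[1]) * (B[0] - A[0])
print(Q, cross)  # (Fraction(5, 1), Fraction(2, 1)) 0
```

Only line intersections are used, so every step of the sketch corresponds to something a straightedge can do once the midpoint is known.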

*Case 2:* is not necessarily a directed line segment. The fixed circle has center and radius . is the point not on the line through which it is required to construct a line parallel to (Fig. 14.3a).

Choose , any point on , and construct a ray extending that intersects the circle at , . is a directed line segment because , the center of the circle, bisects the diameter . Choose a point on and use the construction for a directed line segment (Case 1) to construct a line through parallel to which intersects the circle at , (Fig. 14.3b).

Construct a diameter from through that intersects the other side of the circle at ′ , and similarly construct the diameter ′ . Construct the ray from ′ through ′ and denote by its intersection with . We claim that is the bisector of so that is a directed line segment and therefore a line can be constructed through parallel to (Fig. 14.4).

**Fig. 14.3a** Construction of a directed line

**Fig. 14.3b** Construction of a line parallel to the directed line

, ′ , , ′ are all radii of the circle and ∠ = ∠ ′′ since they are vertical angles, so △ △ ′′ by side-angle-side. Define<sup>1</sup> ′ to be a line through parallel to that intersects at and ′ ′ at ′ . ∠ = ∠ ′′ are vertical angles, ∠ = ∠ ′ ′ are alternate interior angles and = ′ are radii, so △ △ ′′ by angle-side-angle and = ′ . Therefore, and ′ are parallelograms and = = ′ = . □

**Theorem 14.2** *Given a line segment and a point not on the line, it is possible to construct a line segment that is parallel to and whose length is equal to the length of , that is, it is possible to copy parallel to itself with as one of its endpoints.*

**Fig. 14.4** Proof that ′ is parallel to

<sup>1</sup> Define, not construct, because we are in the middle of the proof that such a line can be constructed.

**Fig. 14.5** Construction of a copy of a line parallel to an existing line

*Proof* We have proved that it is possible to construct a line through parallel to and a line through parallel to . The quadrilateral is a parallelogram so opposite sides are equal: = (Fig. 14.5). □

#### **14.3 Construction of a Perpendicular to a Given Line**

**Theorem 14.3** *Given a line segment and a point not on , it is possible to construct a perpendicular to through .*

*Proof* By Thm. 14.1 construct a line ′ parallel to that intersects the fixed circle at , . Construct the diameter ′ and the chord ′ (Fig. 14.6). ∠′ is a right angle because it is subtended by a diameter. Therefore ′ is perpendicular to and . Again by Thm. 14.1 construct the parallel to ′ through . □

**Fig. 14.6** Construction of a perpendicular line

**Fig. 14.7** Copying a line segment in a given direction

#### **14.4 Copying a Line Segment in a Given Direction**

**Theorem 14.4** *It is possible to construct a copy of a given line segment in the direction of another line.*

The meaning of "direction" is that the line defined by two points ′ , ′ is at an angle relative to some axis and the goal is to construct = such that will have the same angle relative to that axis (Fig. 14.7).

*Proof* By Thm. 14.1 it is possible to construct a line segment such that <sup>∥</sup> ′′ , and to construct a line segment such that <sup>∥</sup> . <sup>∠</sup> <sup>=</sup> so it remains to find a point on so that = .

Construct two radii , of the fixed circle which are parallel to , , respectively, and construct a ray through parallel to . Denote its intersection with by (Fig. 14.8). By construction, <sup>∥</sup> and <sup>∥</sup> , so <sup>∠</sup> <sup>=</sup> <sup>∠</sup> <sup>=</sup> <sup>∠</sup> <sup>=</sup> . <sup>∥</sup> and △ ∼ △ by angle-angle-angle, △ is isosceles because , are radii of the same circle. Therefore, △ is isosceles and = = . □

**Fig. 14.8** Using the fixed circle to copy the line segment

**Fig. 14.9** Similar triangles to construct the ratio of lengths

#### **14.5 Construction of a Line Segment as a Ratio of Segments**

**Theorem 14.5** *Given line segments of lengths* , , *, it is possible to construct a line segment of length:*

$$x = \frac{n}{m}s\ .$$

*Proof* Choose points , , not on the same line and construct rays , . By Thm. 14.4 it is possible to construct points , , such that = , = , = . By Thm. 14.1 construct a line through parallel to which intersects at and label by (Fig. 14.9). △ ∼ △ by angle-angle-angle so = and = . □

#### **14.6 Construction of a Square Root**

**Theorem 14.6** *Given line segments of lengths* , *, it is possible to construct a line segment of length* <sup>√</sup> *.*

*Proof* We want to express = √ as = in order to use Thm. 14.5.


Define ℎ = and = and then compute: = √ = √ ℎ = √ 2 ℎ = √ ℎ = <sup>ℎ</sup> <sup>+</sup> <sup>=</sup> + = ( <sup>+</sup> ) = = .

**Fig. 14.10** Construction of a square root

By Thm. 14.4 construct = ℎ on a diameter of the fixed circle. From <sup>ℎ</sup> <sup>+</sup> <sup>=</sup> we have <sup>=</sup> (Fig. 14.10). By Thm. 14.3 construct a perpendicular to at and denote the intersection of this line with the circle by . <sup>=</sup> <sup>=</sup> /<sup>2</sup> and <sup>=</sup> (/2) − .

By Pythagoras's Theorem:

$$\begin{split} s^2 &= \left(\frac{d}{2}\right)^2 - \left(\frac{d}{2} - k\right)^2 \\ &= \left(\frac{d}{2}\right)^2 - \left(\frac{d}{2}\right)^2 + 2\frac{dk}{2} - k^2 \\ &= k(d-k) = kh \\ s &= \sqrt{hk} \ . \end{split}$$

Now = can be constructed by Thm. 14.5. □

#### **14.7 Construction of the Intersection of a Line and a Circle**

**Theorem 14.7** *Given a line and a circle* (, )*, it is possible to construct their points of intersection (Fig. 14.11).*

*Proof* By Thm. 14.3 it is possible to construct a perpendicular from the center of the circle to the line . The intersection of with the perpendicular is denoted by . bisects the chord , where , are the intersections of the line with the circle (Fig. 14.12). Define = 2 and = . Note that , , are just definitions, not entities that have been constructed.

**Fig. 14.11** Construction of the points of intersection of a line and a circle (1)

By Pythagoras's Theorem <sup>2</sup> = <sup>2</sup> <sup>−</sup> <sup>2</sup> <sup>=</sup> ( <sup>+</sup>) ( <sup>−</sup>). By Thm. 14.4 it is possible to construct line segments of length from in the two directions and . The result is two line segments of length <sup>+</sup> , <sup>−</sup> .

By Thm. 14.6 a line segment of length = √ ( <sup>+</sup> ) ( <sup>−</sup> ) can be constructed, and by Thm. 14.4 line segments of length from along in both directions can be constructed. Their other endpoints are the points of intersection of and . □
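The computation behind Theorem 14.7 can be mirrored numerically: drop a perpendicular from the center to the line, measure its length, take the square root of a product of a sum and a difference, and copy that length along the line in both directions. In the sketch below the names `line_circle`, `t` (distance from the center to the line) and `s` (half the chord) are assumptions made for the illustration, and the line is assumed to meet the circle.

```python
import math

def line_circle(o, r, p1, p2):
    """Intersections of the line p1p2 with the circle of center o and radius r."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length                 # unit vector along the line
    proj = (o[0] - p1[0]) * ux + (o[1] - p1[1]) * uy
    mx, my = p1[0] + proj * ux, p1[1] + proj * uy     # foot of the perpendicular from o
    t = math.dist(o, (mx, my))                        # distance from the center to the line
    s = math.sqrt((r + t) * (r - t))                  # half the chord: s^2 = r^2 - t^2
    return (mx + s * ux, my + s * uy), (mx - s * ux, my - s * uy)

first, second = line_circle((0.0, 0.0), 5.0, (-10.0, 3.0), (10.0, 3.0))
print(first, second)  # approximately (4.0, 3.0) and (-4.0, 3.0)
```

Writing $r^2 - t^2$ as a product is what lets the square-root construction of Thm. 14.6 be applied to two lengths that are themselves constructible.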

**Fig. 14.12** Construction of the points of intersection of a line and a circle (2)

**Fig. 14.13** Construction of the intersection of two circles (1)

#### **14.8 Construction of the Intersection of Two Circles**

**Theorem 14.8** *Given two circles* (1, 1), (2, 2)*, it is possible to construct their points of intersection.*

*Proof* Construct 1<sup>2</sup> and label its length (Fig. 14.13). Label by be the point of intersection of 1<sup>2</sup> and , and label = 1, = (Fig. 14.14). has not yet been constructed, but if , are constructed then by Thm. 14.4 the point at length from <sup>1</sup> in the direction 1<sup>2</sup> can be constructed.

Once has been constructed, by Thm. 14.3 a perpendicular to 1<sup>2</sup> at can be constructed, and by Thm. 14.4 it is possible to construct line segments of length from in both directions along the perpendicular. Their other endpoints are the points of intersection of the circles.

**Construction of the length :** Defne = √ 2 1 + 2 , the hypotenuse of a right triangle, which can be constructed from the known lengths 1, . Note that △1<sup>2</sup> is not necessarily a right triangle; the right triangle can be constructed anywhere in the plane. In the right triangle △ 1, cos <sup>∠</sup>1 <sup>=</sup> /1. By the Law of Cosines for △12:

**Fig. 14.14** Construction of the intersection of two circles (2)

$$\begin{aligned} r_2^2 &= t^2 + r_1^2 - 2r_1 t \cos\angle XO_1O_2\\ &= t^2 + r_1^2 - 2tq\\ 2tq &= (t^2 + r_1^2) - r_2^2 = d^2 - r_2^2\\ q &= \frac{(d+r_2)(d-r_2)}{2t}\,.\end{aligned}$$

By Thm. 14.4 these lengths can be constructed and by Thm. 14.5 $q$ can be constructed from $d + r_2$, $d - r_2$, $2t$.

**Construction of the length $x$:** By Pythagoras's Theorem:

$$x = \sqrt{r_1^2 - q^2} = \sqrt{(r_1 + q)(r_1 - q)}\,.$$

By Thm. 14.4, $h = r_1 + q$, $k = r_1 - q$ can be constructed, as can $x = \sqrt{hk}$ by Thm. 14.6. □
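The two lengths derived in this proof are enough to locate the intersection points of the circles. The Python sketch below follows the same formulas, with $t$ the distance between the centers, $d = \sqrt{r_1^2 + t^2}$, $q = (d+r_2)(d-r_2)/(2t)$ and $x = \sqrt{(r_1+q)(r_1-q)}$. The function name `circle_circle` and the sample circles are assumptions for the illustration, and the circles are assumed to intersect.

```python
import math

def circle_circle(o1, r1, o2, r2):
    """Intersection points of circle (o1, r1) and circle (o2, r2)."""
    t = math.dist(o1, o2)                       # distance between the centers
    d = math.hypot(r1, t)                       # hypotenuse with legs r1 and t
    q = (d + r2) * (d - r2) / (2 * t)           # distance from o1 to the common chord
    x = math.sqrt((r1 + q) * (r1 - q))          # half the length of the common chord
    ux, uy = (o2[0] - o1[0]) / t, (o2[1] - o1[1]) / t   # unit vector from o1 to o2
    mx, my = o1[0] + q * ux, o1[1] + q * uy     # point on the line of centers
    return (mx - x * uy, my + x * ux), (mx + x * uy, my - x * ux)

upper, lower = circle_circle((0.0, 0.0), 5.0, (6.0, 0.0), 5.0)
print(upper, lower)  # approximately (3.0, 4.0) and (3.0, -4.0)
```

Both $q$ and $x$ are obtained from sums, differences, one ratio and one square root of known lengths, matching the tools supplied by Thms. 14.4, 14.5 and 14.6.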

#### **What Is the Surprise?**

A compass is necessary because a straightedge can only compute the roots of linear equations and not values such as $\sqrt{2}$, the hypotenuse of an isosceles right triangle with sides of length 1. However, it is surprising that the existence of only one circle, regardless of the position of its center and the length of its radius, is sufficient to perform any construction that is possible with a straightedge and compass.

#### **Sources**

This chapter is based on problem 34 of [13] reworked by Michael Woltermann [14].


## **Chapter 15 Are Triangles with Equal Areas and Perimeters Congruent?**

Are two triangles with the same area and the same perimeter congruent? Not necessarily: the triangles with sides $(17, 25, 28)$ and $(20, 21, 29)$ both have perimeter 70 and area 210 but they are not congruent (Fig. 15.1).<sup>1</sup> This chapter shows that given a triangle with rational sides it is possible to construct a non-congruent triangle, also with rational sides, that has the same area and perimeter. We carry out the derivation using an example, showing that the triangle with sides $(3, 4, 5)$ and the triangle with sides $\left(\frac{156}{35}, \frac{101}{21}, \frac{41}{15}\right)$ both have perimeter 12 and area 6.
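The claim about the two integer triangles is easy to confirm with Heron's formula. The following Python sketch uses a hypothetical helper `perimeter_and_area`, written for this illustration, which works only for triangles whose area turns out to be an integer.

```python
from fractions import Fraction
from math import isqrt

def perimeter_and_area(a, b, c):
    """Perimeter and area (Heron's formula) of a triangle with integer sides.
    Raises ValueError if the area is not an integer."""
    s = Fraction(a + b + c, 2)                        # semi-perimeter
    area_sq = s * (s - a) * (s - b) * (s - c)         # square of the area
    n = area_sq.numerator
    if area_sq.denominator != 1 or isqrt(n) ** 2 != n:
        raise ValueError("area is not an integer")
    return a + b + c, isqrt(n)

print(perimeter_and_area(17, 25, 28))  # (70, 210)
print(perimeter_and_area(20, 21, 29))  # (70, 210)
```

The side lengths differ, so the triangles cannot be congruent even though both printed pairs agree.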

#### **15.1 From a Triangle to an Elliptic Curve**

The three angle bisectors in a triangle intersect in a point called the *incenter* of the triangle. The incenter is the center of a circle inscribed within the triangle (Fig. 15.2).

**Fig. 15.1** Non-congruent triangles with the same area and the same perimeter

<sup>1</sup> The areas were computed using Heron's formula (Thm. A.3) and the angles using the Law of Cosines (Thm. A.8).

**Fig. 15.2** A circle inscribed within a triangle

Drop altitudes from the center $O$ to the sides. The altitudes have length $r$, the radius of the inscribed circle. The altitudes and angle bisectors create three pairs of congruent right triangles:

$$
\triangle AOB' \cong \triangle AOC', \quad \triangle BOA' \cong \triangle BOC', \quad \triangle COA' \cong \triangle COB'.
$$

The altitudes divide the sides $a, b, c$ into segments $u, v, w$. The area of $\triangle ABC$ is the sum of the areas of $\triangle BOC$, $\triangle AOB$, $\triangle AOC$:

$$A = \frac{1}{2}(w + v)r + \frac{1}{2}(v + u)r + \frac{1}{2}(u + w)r \tag{15.1a}$$

$$=\frac{1}{2}\cdot 2(u+v+w)r\tag{15.1b}$$

$$=\frac{1}{2}(a+b+c)r\tag{15.1c}$$

$$=sr\,,\tag{15.1d}$$

where $s$ is the *semi-perimeter*, one-half the perimeter of the triangle $\triangle ABC$. The lengths of $u, v, w$ can be expressed using the radius of the circle and the central angles $\alpha/2, \beta/2, \gamma/2$:

$$
\tan\frac{\alpha}{2} = \frac{u}{r}, \quad \tan\frac{\beta}{2} = \frac{v}{r}, \quad \tan\frac{\gamma}{2} = \frac{w}{r} \,. \tag{15.2}
$$

The semi-perimeter can now be expressed in terms of the tangents:

$$s = u + v + w = r \tan\frac{\alpha}{2} + r \tan\frac{\beta}{2} + r \tan\frac{\gamma}{2} = r \left(\tan\frac{\alpha}{2} + \tan\frac{\beta}{2} + \tan\frac{\gamma}{2}\right)\,,$$

and by Eq. 15.1d the area is:

$$A = sr = r^2 \left( \tan \frac{\alpha}{2} + \tan \frac{\beta}{2} + \tan \frac{\gamma}{2} \right) \,. \tag{15.3}$$

From $r = A/s$, Eq. 15.3 can be written as:

$$
\tan\frac{\alpha}{2} + \tan\frac{\beta}{2} + \tan\frac{\gamma}{2} = \frac{A}{r^2} = \frac{A}{(A/s)^2} = \frac{s^2}{A} \,. \tag{15.4}
$$

Since the sum of the angles $\alpha, \beta, \gamma$ is $360^\circ$:

$$
\gamma/2 = 360^\circ/2 - (\alpha/2 + \beta/2) \tag{15.5a}
$$

$$
\tan \gamma / 2 = \tan (180^\circ - (\alpha / 2 + \beta / 2))\tag{15.5b}
$$

$$=-\tan(\alpha/2+\beta/2)\tag{15.5c}$$

$$=\frac{\tan\alpha/2 + \tan\beta/2}{\tan\alpha/2\tan\beta/2 - 1},\tag{15.5d}$$

using the formula for the tangent of the sum of two angles (Thm. A.9).

Let us simplify the notation by defining variables for the tangents:

$$x = \tan\frac{\alpha}{2}, \quad y = \tan\frac{\beta}{2}, \quad z = \tan\frac{\gamma}{2} \,. \tag{15.6}$$

By Eq. 15.5d we can express $z = \tan \gamma/2$ in terms of $x, y$:

$$z = \frac{x + y}{xy - 1}\,. \tag{15.7}$$

With this notation, Eq. 15.4 becomes:

$$
x + y + \frac{x + y}{xy - 1} = \frac{s^2}{A} \,. \tag{15.8}
$$

Given fixed values of $A$ and $s$, are there multiple solutions of Eq. 15.8?

For the right triangle $(3, 4, 5)$:

$$\frac{s^2}{A} = \frac{\left(\frac{1}{2}(3+4+5)\right)^2}{\frac{1}{2}\cdot 3\cdot 4} = \frac{6^2}{6} = 6\,. \tag{15.9}$$

If there is another solution of Eq. 15.8 with $s^2/A = 6$, it can be written as:

$$x + y + \frac{x + y}{xy - 1} = 6 \tag{15.10a}$$

$$x^2y + xy^2 - 6xy + 6 = 0\,. \tag{15.10b}$$

This is an equation for an *elliptic curve*.

#### **15.2 Solving the Equation for the Elliptic Curve**

A portion of the graph of Eq. 15.10b is shown in Fig. 15.3. Any point on the closed curve in the first quadrant is a solution to the equation because the lengths of the sides of the triangle must be positive. $A, B, D$ correspond to the triangle $(3, 4, 5)$ as shown below. To find additional rational solutions the *method of two secants* is used.

Construct a secant through the points $A = (2, 3)$, $B = (1, 2)$. It intersects the curve at $C = (-1.5, -0.5)$, but this does not give a solution because the values are negative. Construct a second secant from $C$ to $D = (3, 2)$. The intersection with the curve at $E \approx (1.5, 1.2)$ does give a new solution whose coordinates will be computed below.

**Fig. 15.3** The method of two secants

The equation of the (red) line through $A, B$ is $y = x + 1$. From Eq. 15.10b:

$$\begin{aligned} x^2(x+1) + x(x+1)^2 - 6x(x+1) + 6 &= 0\\ 2x^3 - 3x^2 - 5x + 6 &= 0\,. \end{aligned}$$

From $A, B$ we know two roots $x = 2$, $x = 1$ so we can factor the cubic polynomial:

$$(x-2)(x-1)(ax+b)=0\,,$$

where the third root is unknown. Multiply the factors and conclude that $a = 2$, $b = 3$ since $2x^3 - 3x^2 - 5x + 6 = ax^3 + \cdots + 2b$. The third factor is $2x + 3$ which gives the third root $x = -\frac{3}{2}$ and $y = x + 1 = -\frac{1}{2}$. This is the point $C = \left(-\frac{3}{2}, -\frac{1}{2}\right)$ in the graph.

The equation of the (blue) line through , is:

$$y = \frac{5}{9}x + \frac{1}{3}\,. \tag{15.11}$$

Substitute for $y$ in Eq. 15.10b:

$$\begin{aligned} x^2 \left( \frac{5}{9} x + \frac{1}{3} \right) + x \left( \frac{5}{9} x + \frac{1}{3} \right)^2 - 6x \left( \frac{5}{9} x + \frac{1}{3} \right) + 6 &= 0\\ \frac{70}{81} x^3 - \frac{71}{27} x^2 - \frac{17}{9} x + 6 &= 0\,. \end{aligned}$$

From $C, D$ we know two roots $x = 3$, $x = -\frac{3}{2}$ so we can factor the cubic polynomial:

$$(x-3)\left(x+\frac{3}{2}\right)(ax+b)=0\,.$$

Equating the coefficients of the cubic term and the constant terms gives $a = \frac{70}{81}$ and $b = -\frac{4}{3}$, so the third root satisfies:

$$\begin{aligned} \frac{70}{81}x - \frac{4}{3} &= 0\\ x &= \frac{54}{35} \approx 1.543 \end{aligned}$$

and $y$ can be computed from Eq. 15.11:

$$y = \frac{25}{21} \approx 1.190\,.$$

The coordinates of $E$ are:

$$
\left(\frac{54}{35}, \frac{25}{21}\right) \approx (1.543, 1.190)\,,
$$

which are close to the approximations $(1.5, 1.2)$ obtained from the graph.

Finally, compute $z$ from Eq. 15.7:

$$z = \frac{x + y}{xy - 1} = \left(\frac{54}{35} + \frac{25}{21}\right) \Big/ \left(\frac{54}{35}\cdot\frac{25}{21} - 1\right) = \frac{2009}{615} = \frac{49}{15}\,.$$
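The two secant computations can be reproduced with exact rational arithmetic. The sketch below is an illustration under stated assumptions: the helper `third_point` is hypothetical, the point names are labels for the coordinates $(2,3)$, $(1,2)$, $(3,2)$ used in the text, and instead of factoring the cubic it uses the fact that the three roots of $a_3x^3 + a_2x^2 + \cdots = 0$ sum to $-a_2/a_3$.

```python
from fractions import Fraction as F

def third_point(p, q):
    """Third intersection of the secant through p and q with the curve
    x^2*y + x*y^2 - 6*x*y + 6 = 0 (the secant is assumed not to be vertical)."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)                 # slope of the secant
    k = y1 - m * x1                           # intercept: y = m*x + k
    # Substituting y = m*x + k gives (m + m^2) x^3 + (k + 2mk - 6m) x^2 + ... = 0,
    # so the three roots sum to -(k + 2mk - 6m) / (m + m^2).
    x3 = -(k + 2 * m * k - 6 * m) / (m + m * m) - x1 - x2
    return x3, m * x3 + k

A, B, D = (F(2), F(3)), (F(1), F(2)), (F(3), F(2))
C = third_point(A, B)                         # first secant
E = third_point(C, D)                         # second secant
x, y = E
z = (x + y) / (x * y - 1)                     # Eq. 15.7
print(C, E, z)
```

Because every step uses only field operations on rational numbers, the new point is guaranteed to have rational coordinates, which is why the method produces a triangle with rational sides.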

#### **15.3 Derivation of a Triangle From the Elliptic Curve**

Using Eqs. 15.2, 15.6, $a, b, c$, the sides of the triangle $\triangle ABC$, can be computed from $x, y, z$ and $r = A/s = 6/6 = 1$:

$$a = w + v = r(z + y) = z + y$$

$$b = u + w = r(x + z) = x + z$$

$$c = u + v = r(x + y) = x + y\,.$$

For solution $A$ of the elliptic curve the sides of the triangle are:

$$\begin{aligned} a &= z + y = 1 + 3 = 4 \\ b &= x + z = 2 + 1 = 3 \\ c &= x + y = 2 + 3 = 5\,. \end{aligned}$$

For solution $E$ of the elliptic curve the sides of the triangle are:

$$a = z + y = \frac{49}{15} + \frac{25}{21} = \frac{156}{35}$$

$$b = x + z = \frac{54}{35} + \frac{49}{15} = \frac{101}{21}$$

$$c = x + y = \frac{54}{35} + \frac{25}{21} = \frac{41}{15}\,.$$

Let us check this result. The semi-perimeter is:

$$s = \frac{1}{2} \left( \frac{156}{35} + \frac{101}{21} + \frac{41}{15} \right) = \frac{1}{2} \left( \frac{468 + 505 + 287}{105} \right) = \frac{1}{2} \left( \frac{1260}{105} \right) = 6\,,$$

and the area can be computed using Heron's formula (Thm. A.3):

$$A = \sqrt{6\left(6 - \frac{156}{35}\right)\left(6 - \frac{101}{21}\right)\left(6 - \frac{41}{15}\right)} = \sqrt{36} = 6\,.$$
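The check of the semi-perimeter and the area can be repeated with exact fractions. In this sketch the function names `sides_from_tangents` and `semi_perimeter_and_area_squared` are assumptions made for the illustration; the formulas are the ones used above, with the inradius $r$ passed as a parameter.

```python
from fractions import Fraction as F

def sides_from_tangents(x, y, z, r=1):
    """Sides a, b, c from the half-angle tangents: a = r(z + y), b = r(x + z), c = r(x + y)."""
    return r * (z + y), r * (x + z), r * (x + y)

def semi_perimeter_and_area_squared(a, b, c):
    """Semi-perimeter s and the square of the area by Heron's formula."""
    s = (a + b + c) / 2
    return s, s * (s - a) * (s - b) * (s - c)

sides = sides_from_tangents(F(54, 35), F(25, 21), F(49, 15))
s, area_sq = semi_perimeter_and_area_squared(*sides)
print(sides)        # (Fraction(156, 35), Fraction(101, 21), Fraction(41, 15))
print(s, area_sq)   # 6 36
```

Working with the square of the area avoids any floating-point square root, so the comparison with $6^2 = 36$ is exact.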

Is $\left(\frac{156}{35}, \frac{101}{21}, \frac{41}{15}\right) \cong (3, 4, 5)$? To simplify the computation let us use the decimal approximations $(4.46, 4.81, 2.73)$. Then:

$$
\sqrt{4.46^2 + 2.73^2} \approx 5.23 \neq 4.81\,,
$$

so this is not a right triangle and not congruent to $(3, 4, 5)$.

**Fig. 15.4** The triangle with the same perimeter and area as (3, 4, 5)

The Law of Cosines can be used to compute the angles of the triangle as shown in Fig. 15.4.

#### **What Is the Surprise?**

Are triangles with the same area and perimeter congruent? My first impression was to say "yes" because it is not easy to find counterexamples. What is surprising is that given an arbitrary triangle with rational sides, it is possible to construct a non-congruent triangle with rational sides which has the same area and perimeter, although the result can be strange as with the triangles $(3, 4, 5)$ and $\left(\frac{156}{35}, \frac{101}{21}, \frac{41}{15}\right)$.

#### **Sources**

This chapter is based on [33]. In [3] it is shown that given an isosceles triangle there are non-congruent triangles with the same area and perimeter, but the proof does not include an explicit construction.


## **Chapter 16 Construction of a Regular Heptadecagon**

The only regular polygons that the Greeks knew how to construct with a straightedge and compass were the triangle, the square, the pentagon and the regular polygon with 15 sides. Given a regular polygon with $n$ sides, a polygon with $2n$ sides can be constructed by circumscribing the polygon with a circle and bisecting the central angle (Fig. 16.1). No further progress was made until 1796 when Carl Friedrich Gauss awoke one morning, just before his 19th birthday, and by "concentrated thought" figured out how to construct a regular *heptadecagon*, a regular polygon with 17 sides. This achievement inspired him to become a mathematician.

Section 16.1 discusses the relation between the side of a polygon inscribed in a circle and the central angle that it subtends. Section 16.2 states without proof the Fundamental Theorem of Algebra. Section 16.3 presents the *roots of unity*, the roots of the polynomial $x^n - 1$, which are central to Gauss's proof. Sections 16.4 and 16.5 present Gauss's proof which is based on symmetries of roots of polynomials. Gauss derived a *formula* proving that the heptadecagon is constructible, but a geometric construction was not given for almost a century. Section 16.6 gives an elegant construction by James J. Callagy. Section 16.7 shows how constructions of a regular pentagon can be derived using both geometry and trigonometry.

Some of the material is more straightforward if presented using complex numbers. This material is set off in boxes that can be skipped.

**Fig. 16.1** Constructing a regular polygon with 10 sides from a regular pentagon

#### **16.1 Construction of Regular Polygons**

The construction of the regular heptadecagon led to the Gauss-Wantzel theorem, which states that a regular polygon with $n$ sides can be constructed with a straightedge and compass if and only if $n$ is the product of a power of 2 and zero or more *distinct* Fermat numbers $2^{2^k} + 1$ which are prime. The known Fermat primes are:

$$F_0 = 3, \quad F_1 = 5, \quad F_2 = 17, \quad F_3 = 257, \quad F_4 = 65537\,.$$

A regular polygon with 257 sides was constructed by Magnus Georg Paucker in 1822 and by Friedrich Julius Richelot in 1832. In 1894 Johann Gustav Hermes claimed to have constructed a regular polygon with 65537 sides.
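The Gauss-Wantzel criterion is easy to test mechanically. The sketch below is an illustration only: the function names are invented here, and the test relies on the assumption that the five Fermat primes listed above are the only ones that matter, which is certainly true for the small values of $n$ shown.

```python
def is_prime(m):
    """Trial division; adequate for the small numbers used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

# F_0 .. F_4 = 3, 5, 17, 257, 65537 (the known Fermat primes)
FERMAT_PRIMES = [2 ** (2 ** k) + 1 for k in range(5)]

def is_constructible(n):
    """Gauss-Wantzel test: n is a power of 2 times distinct Fermat primes."""
    if n < 3:
        return False
    while n % 2 == 0:          # remove the power of 2
        n //= 2
    for p in FERMAT_PRIMES:    # remove each Fermat prime at most once
        if n % p == 0:
            n //= p
    return n == 1

constructible = [n for n in range(3, 41) if is_constructible(n)]
print(constructible)
```

Dividing by each Fermat prime only once enforces the word *distinct* in the theorem: for $n = 9$ a factor of 3 is left over, so the regular nonagon is rejected.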

To construct a regular polygon it is sufficient to construct a line segment of length $\cos\theta$, where $\theta$ is the central angle subtended by a chord that is a side of the polygon inscribed in a unit circle. Given the line segment $\overline{OB} = \cos\theta$, construct a perpendicular at $B$ and label its intersection with the unit circle by $C$. Then:

$$\begin{aligned} \cos \theta &= \frac{\overline{OB}}{\overline{OC}} = \overline{OB} \\ \theta &= \cos^{-1}(\overline{OB}) \end{aligned}$$

The chord $\overline{AC}$ is a side of the regular polygon (Fig. 16.2).

Given a line segment defined to have length 1, the lengths that are constructible are those which can be obtained from line segments of known length using the operations $\{+, -, \times, /, \sqrt{\ }\}$ (Sect. 2.5). Gauss showed that $\cos(360^\circ/17)$, the cosine of the central angle of a heptadecagon, is constructible since it can be expressed using only these operations:

$$\begin{aligned} \cos\left(\frac{360^\circ}{17}\right) &= -\frac{1}{16} + \frac{1}{16}\sqrt{17} + \frac{1}{16}\sqrt{34 - 2\sqrt{17}} \\ &\quad + \frac{1}{8}\sqrt{17 + 3\sqrt{17} - \sqrt{34 - 2\sqrt{17}} - 2\sqrt{34 + 2\sqrt{17}}}\,. \end{aligned}$$

**Fig. 16.2** The cosine of the central angle of a regular polygon

#### **16.2 The Fundamental Theorem of Algebra**

The following theorem will be used without proof.

**Theorem 16.1** *Every polynomial of degree $n$ has exactly $n$ roots.*

The statement of the theorem has been simplified because all we will need to know is that $n$ roots *exist*.

**The Fundamental Theorem of Algebra** states that every non-constant polynomial of degree $n$ in a single variable with *complex* coefficients has exactly $n$ *complex* roots. If there are multiple roots with the same value, they are all counted: $x^2 - 4x + 4 = (x - 2)(x - 2)$ has two roots both equal to 2. The polynomial $x^2 + 1$ with integer coefficients has two complex roots $\pm\sqrt{-1}$. Strangely, even though the theorem is about finite algebraic entities (polynomials of degree $n$ with $n$ roots), methods of analysis, usually complex analysis, are needed to prove the theorem.

## **16.3 Roots of Unity**

By the Fundamental Theorem of Algebra (Thm. 16.1) the polynomial $x^n - 1$ has $n$ roots for any integer $n > 1$. One root is $x = 1$ so there are $n - 1$ other roots. Denote one of these roots by $r$. Since $r^n = 1$ it is called an *$n$-th root of unity*. What about $r^2$?

$$(r^2)^n = (r^n)^2 = 1^2 = 1\ .$$

It follows that the numbers:

$$1, r, r^2, \dots, r^{n-2}, r^{n-1}$$

are $n$-th roots of unity.

Let $r = \cos\left(\frac{2\pi}{n}\right) + i\sin\left(\frac{2\pi}{n}\right)$. By de Moivre's formula:

$$\left[\cos\left(\frac{2\pi}{n}\right) + i\sin\left(\frac{2\pi}{n}\right)\right]^n = \cos\left(\frac{2n\pi}{n}\right) + i\sin\left(\frac{2n\pi}{n}\right) = 1\,.$$

**Theorem 16.2** *Let $n$ be a* prime number *and let $r \neq 1$ be an $n$-th root of unity. Then:*

$$\{1, r, r^2, \dots, r^{n-2}, r^{n-1}\}$$

*are distinct so they are* all *the $n$-th roots of unity.*

*Proof* Suppose that the powers are not distinct so that $r^i = r^j$ for some $0 \leq i < j \leq n - 1$. Then $r^j / r^i = r^{j-i} = 1$ so there exists at least one positive integer $i'$ less than $n$ such that $r^{i'} = 1$. Let $m$ be the smallest such positive integer. By the division algorithm for integers $n = ml + k$ for some $0 < l < n$ and $0 \leq k < m$. From:

$$1 = r^n = r^{ml+k} = (r^m)^l \cdot r^k = 1^l \cdot r^k = r^k \,,$$

we have $0 \leq k < m$ and $r^k = 1$. Since $m$ was defined to be the smallest such positive integer, $k = 0$ and $n = ml$. Because $r \neq 1$ we have $1 < m < n$, so $n$ is not prime, a contradiction. □

**Theorem 16.3** *Let* $\{a_1, a_2, \ldots, a_{n-1}, a_n\}$ *be the roots of a monic $n$-th degree polynomial* $f(x)$*. Then:*

$$f(x) = (x - a_1)(x - a_2) \cdots (x - a_{n-1})(x - a_n) \,. \tag{16.1}$$

*Proof* If $a_i$ is a root of $f(x)$ by definition $f(a_i) = 0$ but:

$$\begin{aligned} f(a_i) &= (a_i - a_1)(a_i - a_2) \cdots (a_i - a_{n-1})(a_i - a_n) \\ &= \cdots (a_i - a_i) \cdots = 0 \,. \end{aligned}$$

Therefore, $f(x) = (x - a_i)g_i(x)$ for some $g_i(x)$ and by induction this holds for all the roots. □

From Eq. 16.1 it is easy to see that the coefficient of $x^{n-1}$ is:

$$-(a_1 + a_2 + \dots + a_{n-1} + a_n) \,.$$

Since the coefficient of $x^{n-1}$ in $x^n - 1$ for $n \geq 2$ is zero, we have:

$$\begin{aligned} -(1+r+r^2+\cdots+r^{n-2}+r^{n-1}) &= 0\\ r+r^2+\cdots+r^{n-2}+r^{n-1} &= -1 \end{aligned}$$

For the heptadecagon this is:

$$\begin{aligned} r + r^2 + r^3 + r^4 + r^5 + r^6 + r^7 + r^8 + \\ r^9 + r^{10} + r^{11} + r^{12} + r^{13} + r^{14} + r^{15} + r^{16} = -1. \end{aligned} \quad (16.2)$$
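Equation 16.2 can be checked numerically with complex arithmetic. This is a small sketch rather than part of the proof; `roots_of_unity` is a name chosen for the illustration, and the comparisons use a tolerance because floating-point powers are only approximate.

```python
import cmath

def roots_of_unity(n):
    """Return 1, r, r^2, ..., r^(n-1) for r = cos(2*pi/n) + i*sin(2*pi/n)."""
    r = cmath.exp(2j * cmath.pi / n)
    return [r ** k for k in range(n)]

roots = roots_of_unity(17)

# Every power of r is again a 17th root of unity.
max_error = max(abs(z ** 17 - 1) for z in roots)

# The sum r + r^2 + ... + r^16 (all roots except 1) should be -1.
sum_without_one = sum(roots[1:])

print(max_error)
print(sum_without_one)
```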

#### **16.4 Gauss's Proof That a Heptadecagon Is Constructible**

What Gauss understood is that one need not work with the roots in their natural order $r, r^2, \ldots, r^{16}$. Raising $r$ to successive powers of 3 gives all the roots but in a different order:

$$\begin{aligned} &r^1, \; r^{1 \cdot 3 = 3}, \; r^{3 \cdot 3 = 9}, \; r^{9 \cdot 3 = 27 = 10}, \; r^{10 \cdot 3 = 30 = 13}, \; r^{13 \cdot 3 = 39 = 5}, \; r^{5 \cdot 3 = 15}, \; r^{15 \cdot 3 = 45 = 11}, \\ &r^{11 \cdot 3 = 33 = 16}, \; r^{16 \cdot 3 = 48 = 14}, \; r^{14 \cdot 3 = 42 = 8}, \; r^{8 \cdot 3 = 24 = 7}, \; r^{7 \cdot 3 = 21 = 4}, \; r^{4 \cdot 3 = 12}, \; r^{12 \cdot 3 = 36 = 2}, \; r^{2 \cdot 3 = 6}, \end{aligned}$$

where the exponents have been reduced modulo 17:

$$r^{17m+k} = (r^{17})^m \cdot r^k = 1^m \cdot r^k = r^k \,.$$

Check that the list contains all the roots (except 1) exactly once:

$$r^1, r^3, r^9, r^{10}, r^{13}, r^5, r^{15}, r^{11}, r^{16}, r^{14}, r^8, r^7, r^4, r^{12}, r^2, r^6\,.\tag{16.3}$$
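The reordering in Eq. 16.3 is just the sequence of powers of 3 reduced modulo 17. A short sketch (the function name and its default arguments are choices made for this illustration) reproduces the list of exponents and confirms that each of $1, \ldots, 16$ appears exactly once:

```python
def exponent_order(g=3, p=17):
    """Exponents g^0, g^1, ..., g^(p-2), each reduced modulo p."""
    exps, e = [], 1
    for _ in range(p - 1):
        exps.append(e)
        e = (e * g) % p
    return exps

order = exponent_order()
print(order)                              # the exponents of Eq. 16.3, in order
print(sorted(order) == list(range(1, 17)))

# Not every base works: the powers of 2 modulo 17 repeat after 8 steps.
print(len(set(exponent_order(2, 17))))
```

The base 3 works because its powers do not repeat until all 16 nonzero residues have been visited; the base 2 reaches only 8 of them.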

Given a monic quadratic polynomial whose roots are $a, b$:

$$y^2 + py + q = (y - a)(y - b) = 0\,,$$

we can compute the coefficients $p, q$ from the roots (Chap. 7):

$$p = -(a+b)\,, \quad q = ab\,.$$

Therefore, *given* $a + b$ and $ab$ we can write down the quadratic equation of which $a, b$ are the roots.

Let $a_0$ be the sum of the roots in the odd positions in Eq. 16.3:

$$a_0 = r + r^9 + r^{13} + r^{15} + r^{16} + r^8 + r^4 + r^2 \,,$$

and let $a_1$ be the sum of the roots in the even positions in Eq. 16.3:

$$a_1 = r^3 + r^{10} + r^5 + r^{11} + r^{14} + r^7 + r^{12} + r^6 \,.$$

To obtain $a_0, a_1$ as roots of a quadratic equation first compute their sum and use Eq. 16.2:

$$a_0 + a_1 = r + r^2 + \dots + r^{16} = -1 \,.$$

Now we have to work very hard to compute their product. Figure 16.3 shows the computation where the values of $r^i r^j = r^{i+j}$ are written after reducing the exponents modulo 17. Check that each root occurs exactly four times so that (again using Eq. 16.2) the value of the product is $-4$.

$$\begin{aligned} a_0 a_1 &= (r + r^9 + r^{13} + r^{15} + r^{16} + r^8 + r^4 + r^2) \times (r^3 + r^{10} + r^5 + r^{11} + r^{14} + r^7 + r^{12} + r^6) \\ &= 4\,(r + r^2 + \cdots + r^{16}) \\ &= -4 \,. \end{aligned}$$

**Fig. 16.3** Computation of $a_0 a_1$; below each root is the number of occurrences of the root so far
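The bookkeeping of Fig. 16.3 can be delegated to a few lines of code. The sketch below works only with exponents: multiplying two roots adds their exponents modulo 17, so it is enough to count how often each sum occurs. The names `ORDER` and `product_exponents` are introduced here for the illustration.

```python
from collections import Counter

# Exponents of the roots in the order of Eq. 16.3.
ORDER = [1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6]

def product_exponents(xs, ys, p=17):
    """Count the exponents (i + j) mod p over all products r^i * r^j."""
    return Counter((i + j) % p for i in xs for j in ys)

a0 = ORDER[0::2]   # odd positions:  1, 9, 13, 15, 16, 8, 4, 2
a1 = ORDER[1::2]   # even positions: 3, 10, 5, 11, 14, 7, 12, 6

counts = product_exponents(a0, a1)
print(sorted(counts.items()))
```

All 64 products fall on the exponents $1, \ldots, 16$, four times each, and the exponent 0 (the root 1) never occurs, so the product is $4 \cdot (-1) = -4$.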

Since $a_0 + a_1 = -1$ and $a_0 a_1 = -4$, $a_0, a_1$ are the roots of the quadratic equation $y^2 + y - 4 = 0$ and they can be computed using the simple formula for the roots of a quadratic equation:

$$a_{0,1} = \frac{-1 \pm \sqrt{17}}{2} \,.$$

Now, let $b_0, b_1, b_2, b_3$ be the sums of every fourth root starting from $r^1, r^3, r^9, r^{10}$, respectively:

$$\begin{aligned} b_0 &= r^1 + r^{13} + r^{16} + r^4 \\ b_1 &= r^3 + r^5 + r^{14} + r^{12} \\ b_2 &= r^9 + r^{15} + r^8 + r^2 \\ b_3 &= r^{10} + r^{11} + r^7 + r^6 \,. \end{aligned}$$

Check that $b_0 + b_2 = a_0$, $b_1 + b_3 = a_1$ and compute the corresponding products:

$$\begin{aligned} b_0 b_2 &= (r + r^{13} + r^{16} + r^4) \times (r^9 + r^{15} + r^8 + r^2) \\ &= (r^{10} + r^{16} + r^9 + r^3) + (r^5 + r^{11} + r^4 + r^{15}) \\ &\quad + (r^8 + r^{14} + r^7 + r^1) + (r^{13} + r^2 + r^{12} + r^6) \\ &= -1 \,. \end{aligned}$$

$$\begin{aligned} b_1 b_3 &= (r^3 + r^5 + r^{14} + r^{12}) \times (r^{10} + r^{11} + r^7 + r^6) \\ &= (r^{13} + r^{14} + r^{10} + r^9) + (r^{15} + r^{16} + r^{12} + r^{11}) \\ &\quad + (r^7 + r^8 + r^4 + r^3) + (r^5 + r^6 + r^2 + r^1) \\ &= -1 \,. \end{aligned}$$

To summarize these computations:

$$\begin{aligned} b_0 + b_2 &= a_0 \\ b_0 b_2 &= -1 \\ b_1 + b_3 &= a_1 \\ b_1 b_3 &= -1 \end{aligned}$$

so $b_0, b_2$ are the solutions of $y^2 - a_0 y - 1 = 0$, and $b_1, b_3$ are the solutions of $y^2 - a_1 y - 1 = 0$. Using the values previously computed for $a_0, a_1$ we can compute the roots $b_0, b_1$ (Fig. 16.4).

Finally, let $c_0, c_4$ be the sums of every eighth root starting with $r^1, r^{13}$:

$$\begin{aligned} c_0 &= r^1 + r^{16} \\ c_4 &= r^{13} + r^4 \\ c_0 + c_4 &= r^1 + r^{16} + r^{13} + r^4 = b_0 \\ c_0 c_4 &= (r^1 + r^{16}) \cdot (r^{13} + r^4) \\ &= r^{14} + r^{5} + r^{12} + r^3 = b_1 \end{aligned}$$

so $c_0, c_4$ are the roots of $y^2 - b_0 y + b_1 = 0$. Since $\cos(360^\circ/17) = c_0/2$ (Fig. 16.5) it suffices to compute the root $c_0 = r^1 + r^{16}$ (Fig. 16.6).
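The chain of three quadratic equations can be followed numerically. The sketch below is only a floating-point check under one assumption that should be stated plainly: in each quadratic the larger root (the one with the plus sign before the square root) is taken, which is the choice made in Figs. 16.4 and 16.6.

```python
from math import sqrt, cos, pi

# Roots of y^2 + y - 4 = 0
a0 = (-1 + sqrt(17)) / 2
a1 = (-1 - sqrt(17)) / 2

# Larger roots of y^2 - a0*y - 1 = 0 and y^2 - a1*y - 1 = 0
b0 = (a0 + sqrt(a0 * a0 + 4)) / 2
b1 = (a1 + sqrt(a1 * a1 + 4)) / 2

# Larger root of y^2 - b0*y + b1 = 0
c0 = (b0 + sqrt(b0 * b0 - 4 * b1)) / 2

cos_central = c0 / 2
print(cos_central, cos(2 * pi / 17))
```

Only the four arithmetic operations and square roots are used, which is exactly why the value is constructible.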

The cosine of the central angle of a heptadecagon is constructible with a straightedge and compass since it is composed only of rational numbers and the operations $\{+, -, \times, /, \sqrt{\ }\}$:

$$\cos\left(\frac{360^\circ}{17}\right) = \frac{c_0}{2}\tag{16.4}$$

$$=-\frac{1}{16} + \frac{1}{16}\sqrt{17} + \frac{1}{16}\sqrt{34 - 2\sqrt{17}} \; + \tag{16.5}$$

$$\frac{1}{16}\sqrt{68+12\sqrt{17}+2(-1+\sqrt{17})\sqrt{34-2\sqrt{17}}-16\sqrt{34+2\sqrt{17}}}\,.\tag{16.6}$$

$$\begin{aligned} b_0 &= \frac{a_0 + \sqrt{a_0^2 + 4}}{2} \\ &= \frac{\frac{(-1 + \sqrt{17})}{2} + \sqrt{\left(\frac{(-1 + \sqrt{17})}{2}\right)^2 + 4}}{2} \\ &= \frac{(-1 + \sqrt{17}) + \sqrt{\left(-1 + \sqrt{17}\right)^2 + 16}}{4} \\ &= \frac{(-1 + \sqrt{17}) + \sqrt{34 - 2\sqrt{17}}}{4} \\ b_1 &= \frac{a_1 + \sqrt{a_1^2 + 4}}{2} \\ &= \frac{\frac{(-1 - \sqrt{17})}{2} + \sqrt{\left(\frac{(-1 - \sqrt{17})}{2}\right)^2 + 4}}{2} \\ &= \frac{(-1 - \sqrt{17}) + \sqrt{\left(-1 - \sqrt{17}\right)^2 + 16}}{4} \\ &= \frac{(-1 - \sqrt{17}) + \sqrt{34 + 2\sqrt{17}}}{4} \end{aligned}$$

**Fig. 16.4** Computation of $b_0$ and $b_1$

$$\begin{split} r_1 + r_{16} &= \cos\left(\frac{2\pi}{17}\right) + i\sin\left(\frac{2\pi}{17}\right) + \cos\left(\frac{2 \cdot 16\pi}{17}\right) + i\sin\left(\frac{2 \cdot 16\pi}{17}\right) \\ &= \cos\left(\frac{2\pi}{17}\right) + i\sin\left(\frac{2\pi}{17}\right) + \cos\left(\frac{-2\pi}{17}\right) + i\sin\left(\frac{-2\pi}{17}\right) \\ &= 2\cos\left(\frac{2\pi}{17}\right) \end{split}$$

**Fig. 16.5** The cosine of the central angle computed from $r_1, r_{16}$

$$\begin{split} c_{0} &= \frac{b_{0} + \sqrt{b_{0}^{2} - 4b_{1}}}{2} \\ &= \frac{1}{2} \cdot \frac{(-1 + \sqrt{17}) + \sqrt{34 - 2\sqrt{17}}}{4} \; + \\ &\quad \frac{1}{2} \sqrt{\left(\frac{(-1 + \sqrt{17}) + \sqrt{34 - 2\sqrt{17}}}{4}\right)^{2} - 4 \left(\frac{(-1 - \sqrt{17}) + \sqrt{34 + 2\sqrt{17}}}{4}\right)} \\ &= -\frac{1}{8} + \frac{1}{8} \sqrt{17} + \frac{1}{8} \sqrt{34 - 2\sqrt{17}} \; + \\ &\quad \frac{1}{8} \sqrt{\left((-1 + \sqrt{17}) + \sqrt{34 - 2\sqrt{17}}\right)^{2} - 16 \left((-1 - \sqrt{17}) + \sqrt{34 + 2\sqrt{17}}\right)} \\ &= -\frac{1}{8} + \frac{1}{8} \sqrt{17} + \frac{1}{8} \sqrt{34 - 2\sqrt{17}} \; + \\ &\quad \frac{1}{8} \sqrt{(-1 + \sqrt{17})^{2} + 2 \left(-1 + \sqrt{17}\right) \sqrt{34 - 2\sqrt{17}} + (34 - 2\sqrt{17}) - \left((-16 - 16\sqrt{17}) + 16\sqrt{34 + 2\sqrt{17}}\right)} \\ &= -\frac{1}{8} + \frac{1}{8} \sqrt{17} + \frac{1}{8} \sqrt{34 - 2\sqrt{17}} \; + \\ &\quad \frac{1}{8} \sqrt{68 + 12\sqrt{17} + 2(-1 + \sqrt{17})\sqrt{34 - 2\sqrt{17}} - 16\sqrt{34 + 2\sqrt{17}}} \end{split}$$

**Fig. 16.6** Computation of $c_0$

#### **16.5 Derivation of Gauss's Formula**

The above formula for $\cos(360^\circ/17)$ is not the one given by Gauss. Here is a derivation of Gauss's formula:

Let us simplify $2(-1 + \sqrt{17})\sqrt{34 - 2\sqrt{17}}$:

$$\begin{aligned} 2(-1+\sqrt{17})\sqrt{34-2\sqrt{17}} &= -2\sqrt{34-2\sqrt{17}} + 2\sqrt{17}\sqrt{34-2\sqrt{17}} \\ &+ 4\sqrt{34-2\sqrt{17}} - 4\sqrt{34-2\sqrt{17}} \\ &= 2\sqrt{34-2\sqrt{17}} + 2\sqrt{17}\sqrt{34-2\sqrt{17}} \\ &- 4\sqrt{34-2\sqrt{17}} \\ &= 2(1+\sqrt{17})\sqrt{34-2\sqrt{17}} - 4\sqrt{34-2\sqrt{17}} \end{aligned}$$

We will remember the term $-4\sqrt{34 - 2\sqrt{17}}$ for now and simplify the first term by squaring it and then taking the square root:

$$\begin{aligned} 2(1+\sqrt{17})\sqrt{34-2\sqrt{17}} &= 2\sqrt{\left[(1+\sqrt{17})\sqrt{34-2\sqrt{17}}\right]^2} \\ &= 2\sqrt{(18+2\sqrt{17})(34-2\sqrt{17})} \\ &= 2\sqrt{(18\cdot 34-4\cdot 17)+\sqrt{17}(2\cdot 34-2\cdot 18)} \\ &= 2\cdot 4\sqrt{34+2\sqrt{17}} \end{aligned}$$

Substituting terms results in Gauss's formula:

$$\begin{split} \cos\left(\frac{360^{\circ}}{17}\right) &= -\frac{1}{16} + \frac{1}{16}\sqrt{17} + \frac{1}{16}\sqrt{34 - 2\sqrt{17}} \\ &+ \frac{1}{16}\sqrt{68 + 12\sqrt{17} + 8\sqrt{34 + 2\sqrt{17}} - 4\sqrt{34 - 2\sqrt{17}} - 16\sqrt{34 + 2\sqrt{17}}} \\ &= -\frac{1}{16} + \frac{1}{16}\sqrt{17} + \frac{1}{16}\sqrt{34 - 2\sqrt{17}} \\ &+ \frac{1}{8}\sqrt{17 + 3\sqrt{17} - \sqrt{34 - 2\sqrt{17}} - 2\sqrt{34 + 2\sqrt{17}}}. \end{split}$$
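Both forms of the formula can be compared numerically with the value of the cosine. The following sketch is a floating-point sanity check only; the variable names `derived` (Eqs. 16.5 and 16.6) and `gauss` (the formula just obtained) are labels chosen for the illustration.

```python
from math import sqrt, cos, pi

s17 = sqrt(17)
p = sqrt(34 - 2 * s17)
q = sqrt(34 + 2 * s17)

# The first three terms are the same in both formulas.
common = (-1 + s17 + p) / 16

# Eqs. 16.5-16.6: the form obtained directly from c_0 / 2.
derived = common + sqrt(68 + 12 * s17 + 2 * (-1 + s17) * p - 16 * q) / 16

# Gauss's form after the simplification of this section.
gauss = common + sqrt(17 + 3 * s17 - p - 2 * q) / 8

target = cos(2 * pi / 17)
print(derived, gauss, target)
```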

#### **16.6 Construction of a Heptadecagon**

Construct a unit circle centered at $O$ with perpendicular diameters $\overline{QP}$ and $\overline{SR}$ (Fig. 16.7). Construct $A$ on $\overline{OR}$ so that $\overline{OA} = (1/4)\overline{OR}$.

**Fig. 16.7** Construction of a heptadecagon (1)


By Pythagoras's Theorem:

$$
\overline{AP} = \sqrt{\overline{OA}^2 + \overline{OP}^2} = \sqrt{(1/4)^2 + 1^2} = \sqrt{17}/4\,.
$$

Let $B$ be the intersection of the internal bisector of $\angle OAP$ and the line segment $\overline{OP}$ and let $C$ be the intersection of the external bisector of $\angle OAP$ and the line segment $\overline{QO}$. By the internal angle bisector theorem (Thm. A.13):

$$\begin{aligned} \frac{\overline{OB}}{\overline{BP}} &= \frac{\overline{AO}}{\overline{AP}}\\ \frac{\overline{OB}}{1 - \overline{OB}} &= \frac{1/4}{\sqrt{17}/4} \\ \overline{OB} &= \frac{1}{1 + \sqrt{17}} = \frac{1}{1 + \sqrt{17}} \cdot \frac{1 - \sqrt{17}}{1 - \sqrt{17}} \\ &= \frac{-1 + \sqrt{17}}{16} \end{aligned}$$

and by the external angle bisector theorem (Thm. A.14):

$$\begin{aligned} \frac{\overline{OC}}{\overline{CP}} &= \frac{\overline{AO}}{\overline{AP}}\\ \frac{\overline{OC}}{1 + \overline{OC}} &= \frac{1/4}{\sqrt{17}/4} \\ \overline{OC} &= \frac{1}{-1 + \sqrt{17}} = \frac{1}{-1 + \sqrt{17}} \cdot \frac{1 + \sqrt{17}}{1 + \sqrt{17}} \\ &= \frac{1 + \sqrt{17}}{16} \end{aligned}$$

Construct $D$ on $\overline{OP}$ such that $\overline{CD} = \overline{CA}$ (Fig. 16.8). By Pythagoras's Theorem:

$$\begin{split} \overline{CD} &= \overline{CA} = \sqrt{\overline{OA}^2 + \overline{OC}^2} \\ &= \sqrt{\left(\frac{1}{4}\right)^2 + \left(\frac{1+\sqrt{17}}{16}\right)^2} = \frac{1}{16}\sqrt{16+1+17+2\sqrt{17}} \\ &= \frac{1}{16}\sqrt{34+2\sqrt{17}} \ . \end{split}$$

**Fig. 16.8** Construction of a heptadecagon (2)

Construct $E$ on $\overline{OP}$ such that $\overline{BE} = \overline{BA}$; again by Pythagoras's Theorem:

$$\begin{split} \overline{BE} = \overline{BA} &= \sqrt{\overline{OA}^2 + \overline{OB}^2} \\ &= \sqrt{\left(\frac{1}{4}\right)^2 + \left(\frac{-1 + \sqrt{17}}{16}\right)^2} = \frac{1}{16}\sqrt{16 + 1 + 17 - 2\sqrt{17}} \\ &= \frac{1}{16}\sqrt{34 - 2\sqrt{17}} \end{split}$$

Construct $M$ as the midpoint of $\overline{QD}$ and construct $F$ on $\overline{OS}$ such that $\overline{MF} = \overline{MQ}$:

$$\begin{split} \overline{MF} = \overline{MQ} &= \frac{1}{2} \overline{QD} = \frac{1}{2} (\overline{QC} + \overline{CD}) = \frac{1}{2} ((1 - \overline{OC}) + \overline{CD}) \\ &= \frac{1}{2} \left[ 1 - \left( \frac{1 + \sqrt{17}}{16} \right) + \frac{\sqrt{34 + 2\sqrt{17}}}{16} \right] \\ &= \frac{1}{32} \left( 15 - \sqrt{17} + \sqrt{34 + 2\sqrt{17}} \right) . \end{split}$$

Note that $\overline{MO} = 1 - \overline{MQ} = 1 - \overline{MF}$.

**Fig. 16.9** Construction of a heptadecagon (3)

Construct a semicircle whose diameter is $\overline{OE}$. Construct a chord $\overline{OG} = \overline{OF}$ (Fig. 16.9). By Pythagoras's Theorem:

$$\begin{split} \overline{OG} = \overline{OF} &= \sqrt{\overline{MF}^2 - \overline{MO}^2} = \sqrt{\overline{MF}^2 - (1 - \overline{MF})^2} \\ &= \sqrt{2\overline{MF} - 1} \\ &= \sqrt{\frac{1}{16} \left( 15 - \sqrt{17} + \sqrt{34 + 2\sqrt{17}} \right) - 1} \\ &= \frac{1}{4}\sqrt{-1 - \sqrt{17} + \sqrt{34 + 2\sqrt{17}}} \end{split}$$

$\angle OGE$ is a right angle since it is subtended by a diameter of the circle. Construct $H$ on $\overline{OP}$ such that $\overline{EH} = \overline{EG}$; again by Pythagoras's Theorem:

$$\begin{split} \overline{EH} = \overline{EG} &= \sqrt{\overline{OE}^2 - \overline{OG}^2} = \sqrt{(\overline{OB} + \overline{BE})^2 - \overline{OG}^2} \\ &= \sqrt{\left(\frac{-1 + \sqrt{17}}{16} + \frac{\sqrt{34 - 2\sqrt{17}}}{16}\right)^2 - \frac{1}{16} \left(-1 - \sqrt{17} + \sqrt{34 + 2\sqrt{17}}\right)} \\ &= \frac{1}{16} \sqrt{(18 - 2\sqrt{17}) + 2(-1 + \sqrt{17})\sqrt{34 - 2\sqrt{17}} + (34 - 2\sqrt{17}) + \left(16 + 16\sqrt{17} - 16\sqrt{34 + 2\sqrt{17}}\right)} \\ &= \frac{1}{16} \sqrt{68 + 12\sqrt{17} - 16\sqrt{34 + 2\sqrt{17}} - 2(1 - \sqrt{17})\sqrt{34 - 2\sqrt{17}}}\,. \end{split}$$

Compute $\overline{OE}$:

$$\begin{split} \overline{OE} = \overline{OB} + \overline{BE} &= \frac{-1 + \sqrt{17}}{16} + \frac{1}{16} \sqrt{34 - 2\sqrt{17}} \\ &= \frac{1}{16} \left( -1 + \sqrt{17} + \sqrt{34 - 2\sqrt{17}} \right) \end{split}$$

Finally, $\overline{OH} = \overline{OE} + \overline{EH}$ which is the formula for $\cos(360^\circ/17)$ in Eqs. 16.5–16.6, shown in Sect. 16.5 to be equivalent to Gauss's formula.
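The lengths produced by the construction can be traced step by step in floating point. This sketch follows the equations of this section; each variable is named after the segment it measures (for example `OB` for $\overline{OB}$), and the two angle bisector relations have been solved for $\overline{OB}$ and $\overline{OC}$ beforehand.

```python
from math import sqrt, cos, pi

OA = 1 / 4
OP = 1.0
AP = sqrt(OA ** 2 + OP ** 2)           # sqrt(17)/4

# Angle bisector theorems: OB/(1 - OB) = OA/AP and OC/(1 + OC) = OA/AP.
OB = OA / (AP + OA)
OC = OA / (AP - OA)

CD = sqrt(OA ** 2 + OC ** 2)           # CD = CA
BE = sqrt(OA ** 2 + OB ** 2)           # BE = BA

MF = ((1 - OC) + CD) / 2               # MF = MQ = QD/2
OG = sqrt(MF ** 2 - (1 - MF) ** 2)     # OG = OF

OE = OB + BE
EH = sqrt(OE ** 2 - OG ** 2)           # EH = EG
OH = OE + EH

print(OH, cos(2 * pi / 17))
```

The final length agrees with $\cos(360^\circ/17)$ to within rounding error, as the algebra of this section predicts.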

#### **16.7 Construction of a Regular Pentagon**

The complex fifth roots of unity are:

$$1 + i \cdot 0, \quad \frac{\sqrt{5} - 1}{4} \pm i \frac{\sqrt{10 + 2\sqrt{5}}}{4}, \quad \frac{-\sqrt{5} - 1}{4} \pm i \frac{\sqrt{10 - 2\sqrt{5}}}{4}.$$

#### **16.7.1 Trigonometry**

The central angle of a regular pentagon is $360^\circ/5 = 72^\circ$. Let us compute $\cos 36^\circ$ using the trigonometric identities for $2\theta$ and $\theta/2$ (Thms. A.2.1, A.7):

$$\begin{split} 0 &= \cos 90^\circ = \cos(72^\circ + 18^\circ) = \cos 2 \cdot 36^\circ \cos 36^\circ / 2 - \sin 2 \cdot 36^\circ \sin 36^\circ / 2 \\ &= (2\cos^2 36^\circ - 1) \sqrt{\frac{1 + \cos 36^\circ}{2}} - 2\sin 36^\circ \cos 36^\circ \sqrt{\frac{1 - \cos 36^\circ}{2}}. \end{split}$$

There is now only one angle in the formula; let $x = \cos 36^\circ$. Then:

$$\begin{aligned} (2x^2 - 1)\sqrt{\frac{1+x}{2}} &= 2\sqrt{1-x^2} \cdot x \cdot \sqrt{\frac{1-x}{2}}\\ (2x^2 - 1)\sqrt{1+x} &= 2\sqrt{1-x} \cdot \sqrt{1+x} \cdot x \cdot \sqrt{1-x} \\ 2x^2 - 1 &= 2x(1-x) \\ 4x^2 - 2x - 1 &= 0 \end{aligned}$$

Solving the quadratic equation gives a constructible value:

$$\cos 36^\circ = \frac{1 + \sqrt{5}}{4} \,.$$
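A brief numerical check ties this value to the fifth roots of unity listed at the start of the section. It is a sketch only; the variable names are chosen for the illustration, and the double-angle formula is used to pass from $36^\circ$ to the central angle $72^\circ$.

```python
from math import sqrt, cos, radians

x = (1 + sqrt(5)) / 4                  # positive root of 4x^2 - 2x - 1 = 0
residual = 4 * x * x - 2 * x - 1       # should vanish

cos72 = 2 * x * x - 1                  # cos 72 = 2 cos^2 36 - 1

# One of the fifth roots of unity given above: cos 72 + i sin 72.
fifth_root = complex((sqrt(5) - 1) / 4, sqrt(10 + 2 * sqrt(5)) / 4)

print(x, cos(radians(36)))
print(cos72, fifth_root.real)
print(abs(fifth_root ** 5 - 1))
```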

**Fig. 16.10** Construction of a regular pentagon (1)

#### **16.7.2 Geometry**

Let $ABCDE$ be a regular pentagon (Fig. 16.10). By definition all the sides and all the interior angles are equal. It is easy to show by congruent triangles that all diagonals are equal. Let the length of the sides be 1 and the length of the diagonals be $x$.

$\triangle ACE \cong \triangle CAD$ by side-side-side so $\angle ACE = \angle CAD = \theta$. $\triangle AED \cong \triangle CDE$ by side-side-side so $\angle ADE = \angle CED = \phi$. Let $G$ be the intersection of $\overline{AD}$ and $\overline{CE}$; then $\angle AGC = \angle EGD = \psi$ are vertical angles. In both triangles $\triangle AGC$ and $\triangle EGD$ the sum of the angles is $180^\circ$ so $\psi + 2\theta = \psi + 2\phi$ and $\theta = \phi$. By alternate interior angles we conclude that $\overline{AC} \parallel \overline{DE}$.

Construct a line through $E$ parallel to $\overline{DC}$ and let $F$ be its intersection with $\overline{AC}$ (Fig. 16.11). $CDEF$ is a rhombus so $\overline{EF} = \overline{CD} = \overline{AE} = 1$. $\triangle ACE$ is an isosceles triangle with base angles $\alpha$. $\triangle AEF$ is also isosceles and $\angle AFE = \angle FAE = \alpha$ so $\triangle ACE \sim \triangle AEF$. Taking ratios of the sides gives:

$$\frac{x}{1} = \frac{1}{x-1} \,.$$

The result is a quadratic equation $x^2 - x - 1 = 0$ whose positive root is constructible:

$$x = \frac{1 + \sqrt{5}}{2} \,.$$

**Fig. 16.11** Construction of a regular pentagon (2)

#### **What Is the Surprise?**

It is surprising that two millennia passed from the work of the Greeks on construction to the discovery by Gauss of the constructibility of the regular heptadecagon. It is also surprising that the problem was solved not by using geometry but by inventing new algebraic methods that had a far-reaching influence in mathematics.

#### **Sources**

This chapter is based on [6]. Gauss's original work is available in an English translation [18]. Equations 16.5–16.6 appear in [41]; the author assigns an exercise to transform them into Gauss's formula as it appears in [18, p. 458] and [6, p. 68].

The construction of the heptadecagon is taken from [10] while other constructions can be found in [55]. The trigonometric construction of the regular pentagon is from [59]. The geometric construction of the regular pentagon was obtained by solving exercises 2.3.3 and 2.3.4 in [47].


## **Appendix A Theorems From Geometry and Trigonometry**

This appendix presents theorems in geometry and trigonometry that may not be familiar to the reader, as well as theorems that may be familiar but whose proofs are not. Section A.1 presents three formulas for computing the area of a triangle. Section A.2 proves trigonometric identities. Although the formulas and identities are mostly familiar, students frequently learn these identities by heart or look them up without ever seeing a proof. The following sections contain proofs of advanced theorems in geometry: Sect. A.3—the angle bisector theorems, Sect. A.4—Ptolemy's theorem that relates the sides and diagonals in a quadrilateral circumscribed by a circle, Sect. A.5—Ceva's theorem relating the three line segments of a triangle, and Sect. A.6—Menelaus's theorem on the segments of a transversal in a triangle.

#### **A.1 Theorems About Triangles**

#### **A.1.1 Computing the Area of a Triangle**

The standard formula for computing the area of a triangle from the base and the height is well-known. It can be proved using various geometric methods.

**Theorem A.1** *The area of the triangle* $\triangle ABC$ *is given by:*

$$
\triangle ABC = \frac{1}{2}bh\,,\tag{A.1}
$$

*where $b$, the base, is one of the sides of the triangle, and* $h$*, the height, is the length of the altitude to $b$ from the opposite vertex (Fig. A.1a).*

*Proof* Figure A.1b shows that by "cutting" the triangle at half the height, we can "move" the shaded triangles to form a rectangle of the same area as the triangle. The rectangle's base is $b$ and its height is $h/2$. □

**Fig. A.1a** Computation of the area of a triangle from the base and the height

**Fig. A.1b** Computation of the area of a triangle from the base and the height

**Theorem A.2** *The area of the triangle* $\triangle ABC$ *is given by:*

$$
\triangle ABC = \frac{1}{2}bc\sin\theta \,. \tag{A.2}
$$

*Proof* From Thm. A.1 using $h = c\sin\theta$. □

**Theorem A.3 (Heron)** *The area of the triangle* $\triangle ABC$ *is given by:*

$$
\triangle ABC = \sqrt{s(s-a)(s-b)(s-c)}\,,
$$

*where $s$, the* semi-perimeter *of the triangle, is equal to* $\frac{1}{2}(a + b + c)$*.*

*Proof* A radius of a circle and a tangent that intersects the radius are perpendicular. Furthermore, the lengths of the line segments of two tangents from the same point to the circle are equal. Therefore (Fig. A.2):<sup>1</sup>

$$
\triangle AOB' \cong \triangle AOC', \quad \triangle BOA' \cong \triangle BOC', \quad \triangle COA' \cong \triangle COB'
$$

The area $\triangle ABC$ is the sum of the six triangles listed above. Since the height of each of the six triangles is $r$, the radius of the inscribed circle, we obtain:

$$
\triangle ABC = \triangle AOB' + \triangle AOC' + \triangle BOA' + \triangle BOC' + \triangle COA' + \triangle COB' \tag{A.3a}
$$

$$
\triangle ABC = \frac{1}{2}r(u + u + v + v + w + w) \tag{A.3b}
$$

$$
\triangle ABC = \frac{1}{2}r(a+b+c) \tag{A.3c}
$$

$$
\triangle ABC = rs\,.\tag{A.3d}
$$

<sup>1</sup> This shows that the *incenter*, the center of the inscribed circle, is the common intersection of the three angle bisectors.

**Fig. A.2** Triangle with an inscribed circle

Let us now define the sides in terms of the tangents of the central angles:

$$
\tan\frac{\alpha}{2} = \frac{u}{r}, \quad \tan\frac{\beta}{2} = \frac{v}{r}, \quad \tan\frac{\gamma}{2} = \frac{w}{r} \,.
$$

From these definitions and $s = \frac{1}{2}(2u + 2v + 2w)$ we get:

$$s = u + v + w = r \left( \tan \frac{\alpha}{2} + \tan \frac{\beta}{2} + \tan \frac{\gamma}{2} \right) .$$

Since $\frac{\alpha}{2} + \frac{\alpha}{2} + \frac{\beta}{2} + \frac{\beta}{2} + \frac{\gamma}{2} + \frac{\gamma}{2} = 360^\circ$ and thus $\frac{\alpha}{2} + \frac{\beta}{2} + \frac{\gamma}{2} = 180^\circ$, by Thm. A.11:

$$\begin{aligned} s &= r \left( \tan \frac{\alpha}{2} \tan \frac{\beta}{2} \tan \frac{\gamma}{2} \right) \\ &= r \left( \frac{u}{r} \frac{v}{r} \frac{w}{r} \right) = \frac{1}{r^2} (u\,v\,w) \\ r &= \sqrt{\frac{u\,v\,w}{s}}\,. \end{aligned}$$

By Eq. A.3d:

$$
\triangle ABC = rs = s\sqrt{\frac{u\,v\,w}{s}} = \sqrt{s\,u\,v\,w}\,.
$$

Heron's formula follows from $u = s - a$, $v = s - b$, $w = s - c$. □
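The quantities in the proof can be computed directly for a concrete triangle. The sketch below is an illustration with invented function names; it takes integer (or rational) side lengths, forms the tangent lengths $u, v, w$, and evaluates both the area $\sqrt{s\,u\,v\,w}$ and the inradius $\sqrt{u\,v\,w/s}$.

```python
from fractions import Fraction
from math import sqrt

def tangent_lengths(a, b, c):
    """Return (s, u, v, w) with u = s - a, v = s - b, w = s - c."""
    s = Fraction(a + b + c, 2)
    return s, s - a, s - b, s - c

def heron_area(a, b, c):
    s, u, v, w = tangent_lengths(a, b, c)
    return sqrt(s * u * v * w)

def inradius(a, b, c):
    s, u, v, w = tangent_lengths(a, b, c)
    return sqrt(u * v * w / s)

s, u, v, w = tangent_lengths(3, 4, 5)
print(s, u, v, w)
print(heron_area(3, 4, 5), inradius(3, 4, 5))
```

For the triangle $(3, 4, 5)$ the tangent lengths are $3, 2, 1$, so that $a = w + v$, $b = u + w$ and $c = u + v$, and the area equals $r s$ as in Eq. A.3d.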

#### **A.2 Trigonometric Identities**

#### **A.2.1 The Sine and Cosine of the Sum and Difference of Two Angles**

#### **Theorem A.4**

$$\begin{aligned} \sin(\alpha + \beta) &= \sin\alpha \cos\beta + \cos\alpha \sin\beta \\ \sin(\alpha - \beta) &= \sin\alpha \cos\beta - \cos\alpha \sin\beta \\ \cos(\alpha + \beta) &= \cos\alpha \cos\beta - \sin\alpha \sin\beta \\ \cos(\alpha - \beta) &= \cos\alpha \cos\beta + \sin\alpha \sin\beta \end{aligned}$$

We will prove the first formula; the other formulas can be obtained using the values of sine and cosine for $-\alpha$ and $90^\circ - \alpha$.

Given a right triangle $\triangle ABC$ with acute angle $\alpha$ and a right triangle $\triangle ACD$ with acute angle $\beta$, we can join them to obtain geometric figures with an angle $\alpha + \beta$ (Fig. A.3). The left diagram is the one most often used in proofs of the identities. Here we give two proofs based on the center and right diagrams.

**Fig. A.3** Diagrams for proving the identity for the sine of sums of angles

*Proof (1)* Let us compute the area of $\triangle ABD$ in two different ways: (1) using Eq. A.2 on $\triangle ABD$, and (2) using the equation separately on $\triangle ABC$ and $\triangle ADC$ (Fig. A.4). $h$ is also computed twice using the definition of the trigonometric functions:

**Fig. A.4** Computation of the area of a triangle in two ways

$$\begin{aligned} \triangle ABD &= \frac{1}{2}bc\sin(\alpha + \beta) \\ \triangle ABD &= \triangle ABC + \triangle ADC \\ &= \frac{1}{2}ch\sin\alpha + \frac{1}{2}bh\sin\beta \\ &= \frac{1}{2}c(b\cos\beta)\sin\alpha + \frac{1}{2}b(c\cos\alpha)\sin\beta \end{aligned}$$

Equating the two formulas for $\triangle ABD$ and canceling $\frac{1}{2}bc$, we get:

$$
\sin(\alpha + \beta) = \sin\alpha\cos\beta + \cos\alpha\sin\beta\,. \qquad \Box
$$

The second proof uses the following theorem:

**Theorem A.5** *In a circle of* diameter 1 *the length of a chord that subtends an inscribed angle is equal to the sine of the angle (Fig. A.5).*

*Proof* Let $\overline{AB}$ be a diameter and let $\angle BAC = \alpha$. Let $D$ be any other point on the circle, forming a triangle $\triangle BDC$ one of whose sides is the chord $\overline{BC}$. Since equal chords subtend equal inscribed angles $\angle BDC = \alpha$. In the right triangle $\triangle ABC$:

**Fig. A.5** All inscribed angles subtended by a chord are equal

**Fig. A.6** A quadrilateral circumscribed by a circle

$$
\sin \alpha = \frac{\overline{BC}}{\overline{AB}} = \frac{\overline{BC}}{1} = \overline{BC} \,. \qquad \square
$$

*Proof (2)* This proof is based on the right diagram in Fig. A.3 reproduced in Fig. A.6, where the quadrilateral $ABCD$ has been inscribed in a circle. By Thm. A.15 a quadrilateral can be circumscribed by a circle if and only if the sum of each pair of opposite angles is $180^\circ$. $\angle ABC + \angle ADC = 180^\circ$ since both angles are right angles. From Thm. 5.4 the sum of the interior angles of a quadrilateral is $360^\circ$, so $\angle DAB + \angle DCB = 180^\circ$.

Let the diameter of the circle be 1 (otherwise, multiply everything by the length of the diameter). Then the sides of the quadrilateral are:

$$
\overline{BC} = \sin \alpha, \quad \overline{CD} = \sin \beta, \quad \overline{AB} = \sin \gamma, \quad \overline{DA} = \sin \delta \,,
$$

and their diagonals are:

$$
\overline{BD} = \sin(\alpha + \beta), \quad \overline{CA} = \sin(\alpha + \gamma) \,.
$$

By Ptolemy's Theorem (Thm. A.18) the product of the diagonals of a quadrilateral circumscribed by a circle is equal to the sum of the products of opposite sides of the quadrilateral. Since $\angle ABC$ and $\angle ADC$ are right angles, $\gamma = 90^\circ - \alpha$ and $\delta = 90^\circ - \beta$, so we have:

$$\begin{aligned} \sin(\alpha + \beta)\sin(\alpha + \gamma) &= \sin\alpha\sin\delta + \sin\beta\sin\gamma \\ \sin(\alpha + \beta)\sin(90^\circ) &= \sin\alpha\sin(90^\circ - \beta) + \sin\beta\sin(90^\circ - \alpha) \\ \sin(\alpha + \beta) &= \sin\alpha\cos\beta + \cos\alpha\sin\beta \end{aligned}$$
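Proof (2) can be checked numerically. The following Python sketch builds the side lengths of the quadrilateral from the sines of the inscribed angles and solves Ptolemy's relation for the diagonal $\overline{BD}$. The function name `sine_of_sum_via_ptolemy` and the sample angles are illustrative assumptions, not part of the original text, and the sketch assumes $0^\circ < \alpha, \beta < 90^\circ$.

```python
import math

def sine_of_sum_via_ptolemy(alpha_deg: float, beta_deg: float) -> float:
    """Return the diagonal BD of a quadrilateral inscribed in a circle of
    diameter 1 with right angles at two opposite vertices, computed from
    Ptolemy's relation. By Proof (2) this should equal sin(alpha + beta)."""
    alpha = math.radians(alpha_deg)
    beta = math.radians(beta_deg)
    gamma = math.pi / 2 - alpha   # right angle at B
    delta = math.pi / 2 - beta    # right angle at D
    # Each side equals the sine of the inscribed angle it subtends (Thm. A.5).
    bc, cd = math.sin(alpha), math.sin(beta)
    ab, da = math.sin(gamma), math.sin(delta)
    ca = math.sin(alpha + gamma)  # the diagonal CA is a diameter
    # Ptolemy: BD * CA = BC * DA + CD * AB, solved for BD.
    return (bc * da + cd * ab) / ca

for a, b in [(30, 45), (20, 60), (10, 15)]:
    direct = math.sin(math.radians(a + b))
    print(a, b, sine_of_sum_via_ptolemy(a, b), direct)
```

For each pair of angles the two printed values should agree up to floating-point rounding.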

#### **A.2.2 The Cosine of a Triple Angle**

#### **Theorem A.6**

$$
\cos 3\alpha = 4\cos^3 \alpha - 3\cos \alpha \,.
$$

*Proof* The proof uses the formulas in Thm. A.4 and the formula $\sin^2\alpha + \cos^2\alpha = 1$:

$$\begin{aligned} \cos 3\alpha &= \cos(2\alpha + \alpha) \\ &= \cos 2\alpha \cos \alpha - \sin 2\alpha \sin \alpha \\ &= (\cos^2 \alpha - \sin^2 \alpha) \cos \alpha - (2 \sin \alpha \cos \alpha) \sin \alpha \\ &= \cos^3 \alpha - \cos \alpha \sin^2 \alpha - 2 \sin^2 \alpha \cos \alpha \\ &= \cos^3 \alpha - \cos \alpha + \cos^3 \alpha - 2 \cos \alpha + 2 \cos^3 \alpha \\ &= 4 \cos^3 \alpha - 3 \cos \alpha \end{aligned}$$
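The identity is easy to spot-check numerically. The sketch below compares both sides for a few angles, including angles outside the first quadrant; the helper name `cos_triple` and the sample angles are assumptions made for illustration.

```python
import math

def cos_triple(alpha: float) -> float:
    """Right-hand side of Thm. A.6 for alpha in radians."""
    c = math.cos(alpha)
    return 4 * c**3 - 3 * c

for deg in (0, 20, 60, 135, 250):
    alpha = math.radians(deg)
    # abs_tol guards the comparison when cos(3*alpha) is close to zero.
    assert math.isclose(math.cos(3 * alpha), cos_triple(alpha), abs_tol=1e-12)
    print(deg, math.cos(3 * alpha), cos_triple(alpha))
```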

#### **A.2.3 The Sine and Cosine of a Half-Angle**

**Theorem A.7** *If* $\alpha$ *is an angle in a* triangle *then:*<sup>2</sup>

$$
\cos\left(\frac{\alpha}{2}\right) = \sqrt{\frac{1+\cos\alpha}{2}}
$$

$$
\sin\left(\frac{\alpha}{2}\right) = \sqrt{\frac{1-\cos\alpha}{2}}.
$$

*Proof* The proof uses the formulas in Thm. A.4 and the formula $\sin^2\alpha + \cos^2\alpha = 1$:

$$\begin{aligned} \cos \alpha &= \cos 2\left(\frac{\alpha}{2}\right) = \cos \left(\frac{\alpha}{2}\right) \cos \left(\frac{\alpha}{2}\right) - \sin \left(\frac{\alpha}{2}\right) \sin \left(\frac{\alpha}{2}\right) \\ &= 2 \cos^2 \left(\frac{\alpha}{2}\right) - 1 \\ \cos \left(\frac{\alpha}{2}\right) &= \sqrt{\frac{1 + \cos \alpha}{2}} \\ \sin^2 \left(\frac{\alpha}{2}\right) &= 1 - \cos^2 \left(\frac{\alpha}{2}\right) = 1 - \frac{1 + \cos \alpha}{2} \\ \sin \left(\frac{\alpha}{2}\right) &= \sqrt{\frac{1 - \cos \alpha}{2}} \end{aligned}$$

<sup>2</sup> The general formula is more complex because the square roots can be either positive or negative depending on the quadrant in which $\alpha/2$ is located. For a triangle $0^\circ < \alpha < 180^\circ$, so $0^\circ < \alpha/2 < 90^\circ$ is in the first quadrant and both the sine and the cosine are positive.

#### **A.2.4 The Law of Cosines**

**Theorem A.8 (Law of cosines)** *In a triangle* $\triangle ABC$ *with sides* $a$, $b$, $c$ *(Fig. A.7):*

$$c^2 = a^2 + b^2 - 2ab \cos{\angle ACB} \,.$$

*Proof (1)* Drop an altitude from $C$ to $\overline{AB}$ and use the definition of cosine and Pythagoras's Theorem:

$$c = x + (c - x) = a\cos\beta + b\cos\alpha\tag{A.4a}$$

$$c^2 = ac\cos\beta + bc\cos\alpha \,. \tag{A.4b}$$

Similarly, drop altitudes from $A$ to $\overline{BC}$ and from $B$ to $\overline{AC}$ to obtain:

$$a^2 = ca\cos\beta + ba\cos\gamma\tag{A.5a}$$

$$b^2 = cb\cos\alpha + ab\cos\gamma \,. \tag{A.5b}$$

Adding Eqs. A.5a and A.5b and subtracting Eq. A.4b gives:

$$\begin{aligned} a^2 + b^2 - c^2 &= ca\cos\beta + ba\cos\gamma\\ &+ cb\cos\alpha + ab\cos\gamma\\ &- ac\cos\beta - bc\cos\alpha\\ &= 2ab\cos\gamma\\ c^2 &= a^2 + b^2 - 2ab\cos\gamma\end{aligned}$$
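Both the projection formula (Eq. A.4a) and the Law of Cosines can be checked on a concrete triangle given by coordinates. This is a minimal sketch: the helper names `angle_at` and `check_law_of_cosines` and the sample points are assumptions for illustration, with $a$, $b$, $c$ taken as the sides opposite $A$, $B$, $C$.

```python
import math

def angle_at(P, Q, R):
    """Angle at vertex P between rays PQ and PR, in radians (0 to pi)."""
    vx, vy = Q[0] - P[0], Q[1] - P[1]
    wx, wy = R[0] - P[0], R[1] - P[1]
    return math.atan2(abs(vx * wy - vy * wx), vx * wx + vy * wy)

def check_law_of_cosines(A, B, C):
    """Return (c, a*cos(beta) + b*cos(alpha), c**2, a**2 + b**2 - 2ab*cos(gamma))."""
    a, b, c = math.dist(B, C), math.dist(A, C), math.dist(A, B)
    alpha, beta, gamma = angle_at(A, B, C), angle_at(B, A, C), angle_at(C, A, B)
    projection = a * math.cos(beta) + b * math.cos(alpha)   # Eq. A.4a
    law = a * a + b * b - 2 * a * b * math.cos(gamma)       # Thm. A.8
    return c, projection, c * c, law

print(check_law_of_cosines((0, 0), (4, 0), (1, 3)))    # acute triangle
print(check_law_of_cosines((0, 0), (4, 0), (-1, 2)))   # obtuse angle at A
```

In each printed tuple the first two entries should agree, as should the last two. The obtuse case works because the cosine of the obtuse angle is negative, so the projection is subtracted.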

**Fig. A.7** Proof 1 of the Law of Cosines

*Proof (2)* The second proof uses Ptolemy's theorem (Thm. A.18).<sup>3</sup>

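The triangle $\triangle ABC$ can be circumscribed by a circle. Construct another triangle $\triangle ABC'$ congruent with $\triangle ABC$ and inscribed within the same circle (Fig. A.8). This can be done by constructing an angle from $\overline{AB}$ equal to $\angle CAB$ which intersects the circle at $C'$ and then constructing the line $\overline{AC'}$. Since angles that are subtended by the same chord are equal, $\angle AC'B = \angle BCA$, so also $\angle C'AB = \angle CBA$ and thus $\triangle ABC' \cong \triangle BAC$ by angle-side-angle with the common side $\overline{AB}$.

Drop perpendiculars from $C$ to $D$ and from $C'$ to $D'$ on $\overline{AB}$ so that $x = a\cos\beta$. By Ptolemy's theorem for the quadrilateral $ABCC'$, whose diagonals both have length $b$ and whose sides have lengths $c$, $a$, $c - 2x$ and $a$: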

$$\begin{aligned} b^2 &= a^2 + c(c - 2x) \\ &= a^2 + c(c - 2a\cos\beta) \\ &= a^2 + c^2 - 2ac\cos\beta \,. \end{aligned}$$

**Fig. A.8** Proof 2 of the Law of Cosines

<sup>3</sup> Section A.4 uses the Law of Cosines to prove Ptolemy's theorem! The first proof of the Law of Cosines avoids this circular reasoning. Furthermore, there are proofs of Ptolemy's theorem that do not use the Law of Cosines.

#### **A.2.5 The Tangent of the Sum of Two Angles**

**Theorem A.9**

$$
\tan(\alpha + \beta) = \frac{\tan \alpha + \tan \beta}{1 - \tan \alpha \tan \beta} \,.
$$

*Proof*

$$\begin{split} \tan(\alpha + \beta) &= \frac{\sin(\alpha + \beta)}{\cos(\alpha + \beta)} \\ &= \frac{\sin\alpha\cos\beta + \cos\alpha\sin\beta}{\cos\alpha\cos\beta - \sin\alpha\sin\beta} \\ &= \frac{\sin\alpha + \cos\alpha\tan\beta}{\cos\alpha - \sin\alpha\tan\beta} \\ &= \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} .\end{split}$$

#### **A.2.6 The Tangent of a Half-Angle**

#### **Theorem A.10**

$$\tan\left(\frac{\alpha}{2}\right) = \frac{-1 \pm \sqrt{1 + \tan^2 \alpha}}{\tan \alpha}.$$

*Proof* We derive and solve a quadratic equation in $\tan\left(\frac{\alpha}{2}\right)$:

$$\begin{aligned} \tan\alpha &= \frac{\tan\left(\frac{\alpha}{2}\right) + \tan\left(\frac{\alpha}{2}\right)}{1 - \tan\left(\frac{\alpha}{2}\right)\tan\left(\frac{\alpha}{2}\right)}\\ \tan\alpha \tan^2\left(\frac{\alpha}{2}\right) + 2\tan\left(\frac{\alpha}{2}\right) - \tan\alpha &= 0\\ \tan\left(\frac{\alpha}{2}\right) &= \frac{-1 \pm \sqrt{1 + \tan^2\alpha}}{\tan\alpha} .\end{aligned}$$
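The $\pm$ in the formula raises the question of which root is the half-angle tangent. The sketch below computes both roots for an angle of a triangle ($0^\circ < \alpha < 180^\circ$, $\alpha \neq 90^\circ$) and compares them with a direct evaluation; the function name `tan_half_candidates` is an assumption for illustration.

```python
import math

def tan_half_candidates(alpha_deg: float):
    """Return both roots (-1 + r)/t and (-1 - r)/t of the quadratic in
    tan(alpha/2), where t = tan(alpha) and r = sqrt(1 + t**2)."""
    t = math.tan(math.radians(alpha_deg))
    r = math.sqrt(1 + t * t)
    return (-1 + r) / t, (-1 - r) / t

for deg in (60, 120):
    plus, minus = tan_half_candidates(deg)
    direct = math.tan(math.radians(deg / 2))
    print(deg, plus, minus, direct)
```

In this range $\tan(\alpha/2)$ is positive, so the root with $+$ applies when $\tan\alpha > 0$ and the root with $-$ applies when $\tan\alpha < 0$. The product of the two roots is $-1$, so the other root is $-\cot(\alpha/2)$.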

#### **A.2.7 The Product of Three Tangents**

**Theorem A.11** *If* $\alpha + \beta + \gamma = 180^\circ$ *then:*

$$
\tan\alpha + \tan\beta + \tan\gamma = \tan\alpha \tan\beta \tan\gamma
$$

*Proof*

$$\begin{aligned} \tan \gamma &= \tan(180^\circ - (\alpha + \beta)) \\ &= -\tan(\alpha + \beta) \\ &= -\frac{\tan \alpha + \tan \beta}{1 - \tan \alpha \tan \beta} \\ \tan \alpha \tan \beta \tan \gamma &= \tan \alpha + \tan \beta + \tan \gamma \end{aligned}$$
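A quick numerical check of Thm. A.11 chooses two angles, sets the third so that the sum is $180^\circ$, and compares the sum of the tangents with their product. The function name `tangent_sum_and_product` and the sample angles are illustrative assumptions; no angle may equal $90^\circ$, where the tangent is undefined.

```python
import math

def tangent_sum_and_product(alpha_deg: float, beta_deg: float):
    """Return (sum, product) of tan(alpha), tan(beta), tan(gamma),
    where gamma = 180 - alpha - beta (all in degrees)."""
    gamma_deg = 180 - alpha_deg - beta_deg
    tans = [math.tan(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg)]
    return sum(tans), math.prod(tans)

for a, b in [(60, 60), (30, 45), (20, 130)]:
    print(a, b, tangent_sum_and_product(a, b))
```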

#### **A.2.8 The Limit of $\sin \alpha / \alpha$**

**Theorem A.12**

$$\lim_{\alpha \to 0} \frac{\sin \alpha}{\alpha} = 1 \,.$$

*Proof* By examining regular polygons inscribed within a circle (Fig. A.9), we see that the more sides that a polygon has, the closer its perimeter is to the circumference of the circle. The circumference of the circle divided by the number of sides is the length of an arc with the same endpoints as the corresponding side, since in a regular polygon all sides have the same length. Since the ratio of the circumference of the circle to the perimeter of an inscribed polygon approaches 1 as the number of sides increases, so does the ratio of the length of an arc to the corresponding chord. This is demonstrated by the following numerical examples:

**Fig. A.9** Regular polygons with 3, 8 and 16 sides inscribed within a circle

**Fig. A.10** The length of a chord corresponding to an arc of size $\alpha$


Since $a = b = 1$ the length of the chord $c$ subtending $\alpha$ can be computed from the Law of Cosines (Fig. A.10):

$$\begin{aligned} c^2 &= a^2 + b^2 - 2ab\cos\alpha\\ c &= \sqrt{2 - 2\cos\alpha}\\ \lim_{\alpha \to 0} c &= \sqrt{2 - 2 \cdot 1} = 0 \,. \end{aligned}$$

Referring to Fig. A.11:

$$\lim_{\alpha \to 0} \frac{\sin \alpha}{\alpha} = \lim_{\alpha \to 0} \frac{2 \sin \alpha}{2 \alpha} \,.$$

This is the ratio of the length of the chord to the length of the arc that it subtends. But we have seen that this ratio converges to 1 as the subtended angle $2\alpha$ tends to 0, so:

$$\lim_{\alpha \to 0} \frac{\sin \alpha}{\alpha} = 1 \,. \qquad \square$$

**Fig. A.11** Ratio of $\sin \alpha$ to $\alpha$
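The convergence used in the proof can be tabulated directly. The sketch below assumes a unit circle, so an arc subtending a central angle $\alpha$ (in radians) has length $\alpha$, and it uses the chord length $\sqrt{2 - 2\cos\alpha}$ obtained from the Law of Cosines. The function name `chord_length` and the sample angles are illustrative choices, not values taken from the original.

```python
import math

def chord_length(alpha: float) -> float:
    """Chord subtending a central angle alpha (radians) in a unit circle,
    from the Law of Cosines with a = b = 1."""
    return math.sqrt(2 - 2 * math.cos(alpha))

print("deg   arc        chord      arc/chord  sin(a)/a")
for deg in (80, 60, 40, 5, 1):
    alpha = math.radians(deg)
    chord = chord_length(alpha)
    print(f"{deg:3d}   {alpha:.6f}   {chord:.6f}   "
          f"{alpha / chord:.6f}   {math.sin(alpha) / alpha:.6f}")
```

Both ratios in the last two columns should approach 1 as the angle decreases.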

#### **A.3 The Angle Bisector Theorems**

**Theorem A.13** *In* $\triangle ABC$ *let the angle bisector of* $\angle BAC$ *intersect* $\overline{BC}$ *at* $D$ *(Fig. A.12). Then:*

$$\frac{\overline{BD}}{\overline{CD}} = \frac{\overline{AB}}{\overline{AC}} \,.$$

*Proof* We prove the theorem by computing the areas of two triangles using both the base and height (Eq. A.1), and the base, angle and side (Eq. A.2):

$$
\begin{aligned}
\triangle ABD &= \frac{1}{2}\overline{BD}\,h = \frac{1}{2}\overline{AB}\,\overline{AD}\,\sin\alpha \\
\frac{\overline{BD}}{\overline{AB}} &= \frac{\overline{AD}\sin\alpha}{h} \\
\triangle ACD &= \frac{1}{2}\overline{CD}\,h = \frac{1}{2}\overline{AC}\,\overline{AD}\,\sin\alpha \\
\frac{\overline{CD}}{\overline{AC}} &= \frac{\overline{AD}\sin\alpha}{h} \\
\frac{\overline{BD}}{\overline{CD}} &= \frac{\overline{AB}}{\overline{AC}} \,.
\end{aligned}
$$
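The theorem can be verified on coordinates by constructing the bisector explicitly. In this minimal sketch the bisector direction is taken as the sum of the unit vectors along $\overline{AB}$ and $\overline{AC}$, a standard construction that is not part of the proof above; the function names and sample triangles are assumptions for illustration.

```python
import math

def bisector_foot(A, B, C):
    """Point D where the internal bisector of the angle at A meets line BC."""
    ab, ac = math.dist(A, B), math.dist(A, C)
    # Direction of the bisector: sum of the unit vectors along AB and AC.
    dx = (B[0] - A[0]) / ab + (C[0] - A[0]) / ac
    dy = (B[1] - A[1]) / ab + (C[1] - A[1]) / ac
    # Solve A + t*(dx, dy) = B + s*(C - B) for s by Cramer's rule.
    ex, ey = C[0] - B[0], C[1] - B[1]
    rx, ry = B[0] - A[0], B[1] - A[1]
    det = ex * dy - dx * ey
    s = (dx * ry - dy * rx) / det
    return (B[0] + s * ex, B[1] + s * ey)

def bisector_ratios(A, B, C):
    """Return (BD/CD, AB/AC); Thm. A.13 states that they are equal."""
    D = bisector_foot(A, B, C)
    return math.dist(B, D) / math.dist(C, D), math.dist(A, B) / math.dist(A, C)

print(bisector_ratios((0, 0), (4, 0), (0, 3)))
print(bisector_ratios((1, 2), (7, -1), (3, 6)))
```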

There is also an angle bisector theorem for the *external bisector*:


**Theorem A.14** *In* $\triangle ABC$ *let* $\overline{AE}$ *be the bisector of the angle supplementary to the angle* $\angle BAC$ *(Fig. A.13) and let the bisector intersect the extension of* $\overline{BC}$ *at* $E$ *(Fig. A.12). Then:*

$$\frac{\overline{BE}}{\overline{CE}} = \frac{\overline{AB}}{\overline{AC}} \,.$$

**Fig. A.12** The internal angle bisector theorem

**Fig. A.13** The external angle bisector theorem

*Proof* Let $\alpha = \angle BAE$ be half of the exterior angle at $A$. Since the extension of $\overline{CA}$ through $A$ is a straight line, $\angle EAC = 180^\circ - \alpha$.

$$\begin{aligned} \triangle ABE &= \frac{1}{2} \overline{BE} \, h = \frac{1}{2} \overline{AE} \, \overline{AB} \sin \alpha \\ \triangle ACE &= \frac{1}{2} \overline{CE} \, h = \frac{1}{2} \overline{AE} \, \overline{AC} \sin(180^\circ - \alpha) = \frac{1}{2} \overline{AE} \, \overline{AC} \sin \alpha \\ \frac{\overline{BE}}{\overline{AB}} &= \frac{\overline{AE} \sin \alpha}{h} = \frac{\overline{CE}}{\overline{AC}} \\ \frac{\overline{BE}}{\overline{CE}} &= \frac{\overline{AB}}{\overline{AC}} \,. \end{aligned}$$

#### **A.4 Ptolemy's Theorem**

#### **A.4.1 A Trapezoid Circumscribed by a Circle**

Before giving the proof of Ptolemy's theorem we prove theorems on quadrilaterals and trapezoids.

**Theorem A.15** *A quadrilateral can be circumscribed by a circle if and only if the opposite angles are supplementary (sum to* 180◦ *).*

Geometry textbooks give the simple proof of the forward direction, but it is hard to find a proof of the converse so both proofs are given here.

*Proof (Forward direction)* An inscribed angle is equal to half the arc that subtends it, so $\angle DAB$ is half of the arc $\widehat{DCB}$ and $\angle DCB$ is half of the arc $\widehat{DAB}$ (Fig. A.14a). The two arcs form the entire circumference of the circle so their sum is $360^\circ$. Therefore, $\angle DAB + \angle DCB = \frac{1}{2} \cdot 360^\circ = 180^\circ$, and similarly $\angle ADC + \angle ABC = 180^\circ$.

**Fig. A.14a** A quadrilateral circumscribed by a circle

**Fig. A.14b** The fourth vertex must be on the circumference

*Proof (Converse direction)* Any triangle can be circumscribed by a circle. Circumscribe $\triangle DAB$ by a circle and suppose that $C'$ is a point such that $\angle DAB + \angle DC'B = 180^\circ$, but $C'$ is *not* on the circumference of the circle. Without loss of generality, let $C'$ be within the circle (Fig. A.14b).

Construct a ray that extends $\overline{DC'}$ and let $C$ be its intersection with the circle. $ABCD$ is circumscribed by a circle so:

$$
\angle DAB + \angle DCB = 180^{\circ} = \angle DAB + \angle DC'B
$$

$$
\angle DCB = \angle DC'B
$$

which is impossible if $C$ is on the circle and $C'$ is inside the circle. □

**Theorem A.16** *The opposite angles of an isosceles trapezoid are supplementary.*

*Proof* Construct the line $\overline{AB'}$ parallel to $\overline{CD}$ (Fig. A.15). $AB'CD$ is a parallelogram and $\triangle ABB'$ is an isosceles triangle, so $\angle C = \angle AB'B = \angle ABB' = \angle B$. Similarly, $\angle A = \angle D$. Since the sum of the internal angles of any quadrilateral is equal to $360^\circ$:

$$
\angle A + \angle B + \angle C + \angle D = 360^\circ
$$

$$
2\angle A + 2\angle C = 360^\circ
$$

$$
\angle A + \angle C = 180^\circ
$$

and similarly $\angle B + \angle D = 180^\circ$. □

**Theorem A.17** *An isosceles trapezoid can be circumscribed by a circle.*

The proof is immediate by Thms. A.15, A.16.

**Fig. A.15** An isosceles trapezoid

#### **A.4.2 Proof of Ptolemy's Theorem**

**Theorem A.18 (Ptolemy)** *Given a quadrilateral circumscribed by a circle, the following formula relates the lengths of the diagonals and the lengths of the sides (Fig. A.16).*

$$ef = ac + bd \ .$$

*Proof* By the Law of Cosines for the four triangles $\triangle ABC$, $\triangle ADC$, $\triangle DAB$, $\triangle DCB$:

$$\begin{aligned} e^2 &= a^2 + b^2 - 2ab\cos\angle B\\ e^2 &= c^2 + d^2 - 2cd\cos\angle D\\ f^2 &= a^2 + d^2 - 2ad\cos\angle A\\ f^2 &= b^2 + c^2 - 2bc\cos\angle C\ .\end{aligned}$$

$\angle C = 180^\circ - \angle A$ and $\angle D = 180^\circ - \angle B$ because they are opposite angles of a quadrilateral circumscribed by a circle, so $\cos \angle D = -\cos \angle B$ and $\cos \angle C = -\cos \angle A$. Eliminate the cosine term from the above equations to obtain:

**Fig. A.16** Ptolemy's theorem

$$\begin{aligned} e^2(cd+ab) &= abc^2 + abd^2 + a^2cd + b^2cd\\ e^2 &= \frac{(ac+bd)(ad+bc)}{(ab+cd)}\\ f^2 &= \frac{(ab+cd)(ac+bd)}{(ad+bc)} \end{aligned}$$

Multiply the two equations and simplify to get Ptolemy's theorem:

$$\begin{aligned} e^2 \cdot f^2 &= \left(ac + bd\right)^2\\ ef &= ac + bd \,. \qquad \square \end{aligned}$$
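Ptolemy's theorem can be checked by placing four points on a unit circle and measuring the sides and diagonals. In the sketch below the points are given by polar angles in increasing order so that $A$, $B$, $C$, $D$ go around the circle; the labeling $a = \overline{AB}$, $b = \overline{BC}$, $c = \overline{CD}$, $d = \overline{DA}$, $e = \overline{AC}$, $f = \overline{BD}$ is inferred from the equations in the proof, and the function name is an assumption for illustration.

```python
import math

def ptolemy_sides(angles_deg):
    """Given four increasing polar angles (degrees) of A, B, C, D on the
    unit circle, return (e*f, a*c + b*d)."""
    A, B, C, D = [(math.cos(math.radians(t)), math.sin(math.radians(t)))
                  for t in angles_deg]
    a, b = math.dist(A, B), math.dist(B, C)
    c, d = math.dist(C, D), math.dist(D, A)
    e, f = math.dist(A, C), math.dist(B, D)
    return e * f, a * c + b * d

print(ptolemy_sides((0, 90, 180, 270)))    # a square
print(ptolemy_sides((10, 80, 200, 300)))   # an irregular quadrilateral
```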

#### **A.5 Ceva's Theorem**

**Theorem A.19 (Ceva)** *Given line segments from the vertices of a triangle to the opposite edges that intersect in a point, the lengths of the segments satisfy (Fig. A.17):*

$$\frac{\overline{AM}}{\overline{MB}} \cdot \frac{\overline{BQ}}{\overline{QS}} \cdot \frac{\overline{SP}}{\overline{PA}} = 1 \; .$$

*Proof* If the altitudes of two triangles are equal, their areas are proportional to the bases. In both diagrams in Fig. A.18, the altitudes of the gray triangles are equal, so:

$$\frac{\triangle BQO}{\triangle SQO} = \frac{\overline{BQ}}{\overline{QS}}\,, \qquad \frac{\triangle BQA}{\triangle SQA} = \frac{\overline{BQ}}{\overline{QS}} \,.$$

By subtracting the areas of the indicated triangles, we get the proportion between the gray triangles shown in Fig. A.19:

**Fig. A.17** Ceva's theorem

**Fig. A.18** Triangles in Ceva's theorem

$$\frac{\triangle BOA}{\triangle SOA} = \frac{\triangle BQA - \triangle BQO}{\triangle SQA - \triangle SQO} = \frac{\overline{BQ}}{\overline{QS}} \,.$$

This might look strange at first so we explain it using a simpler notation:

$$\begin{aligned} \frac{c}{d} &= \frac{a}{b} \\ \frac{e}{f} &= \frac{a}{b} \\ c - e &= \frac{ad}{b} - \frac{af}{b} = \frac{a}{b}(d - f) \\ \frac{c - e}{d - f} &= \frac{a}{b} \end{aligned}$$

**Fig. A.19** Subtracting areas in Ceva's theorem


Similarly, we can prove:

$$
\begin{aligned}
\frac{\overline{AM}}{\overline{MB}} &= \frac{\triangle AOS}{\triangle BOS} \\
\frac{\overline{SP}}{\overline{PA}} &= \frac{\triangle SOB}{\triangle AOB}
\end{aligned}
$$

so:

$$\frac{\overline{AM}}{\overline{MB}} \frac{\overline{BQ}}{\overline{QS}} \frac{\overline{SP}}{\overline{PA}} = \frac{\triangle AOS}{\triangle BOS} \frac{\triangle BOA}{\triangle SOA} \frac{\triangle SOB}{\triangle AOB} = 1 \,,$$

since the order of the vertices in a triangle makes no difference. □
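Ceva's theorem can be checked on coordinates by choosing a triangle $\triangle ABS$ and an interior point $O$, and intersecting the lines through each vertex and $O$ with the opposite edges. This is a minimal sketch: the helper names `intersect` and `ceva_product` and the sample points are assumptions for illustration.

```python
import math

def intersect(P1, P2, P3, P4):
    """Intersection of line P1P2 with line P3P4 (assumed not parallel)."""
    x1, y1 = P1
    x2, y2 = P2
    x3, y3 = P3
    x4, y4 = P4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    d12 = x1 * y2 - y1 * x2
    d34 = x3 * y4 - y3 * x4
    return ((d12 * (x3 - x4) - (x1 - x2) * d34) / den,
            (d12 * (y3 - y4) - (y1 - y2) * d34) / den)

def ceva_product(A, B, S, O):
    """Return (AM/MB) * (BQ/QS) * (SP/PA) for the three segments through O."""
    M = intersect(S, O, A, B)   # from S through O to edge AB
    Q = intersect(A, O, B, S)   # from A through O to edge BS
    P = intersect(B, O, S, A)   # from B through O to edge SA
    return (math.dist(A, M) / math.dist(M, B)
            * math.dist(B, Q) / math.dist(Q, S)
            * math.dist(S, P) / math.dist(P, A))

print(ceva_product((0, 0), (6, 0), (1, 5), (2, 1)))
```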

#### **A.6 Menelaus's Theorem**

#### **Theorem A.20 (Menelaus)**

*Let* $\triangle APC$ *be a triangle and let the line through* $B$, $Q$ *and* $D$ *be a* transversal line *that intersects all three of the edges of the triangle or their extensions (Fig. A.20). Then:*<sup>4</sup>

$$
\frac{\overline{AB}}{\overline{BP}} \cdot \frac{\overline{PQ}}{\overline{QC}} \cdot \frac{\overline{CD}}{\overline{AD}} = 1 \,\tag{A.6}
$$

*Proof* Draw a line through $C$ parallel to $\overline{AB}$ and extend the transversal until it intersects the parallel at $K$. From $\triangle ADB \sim \triangle CDK$ it follows that:

**Fig. A.20** Menelaus's theorem

<sup>4</sup> Depending on the configuration of the triangle and the transversal line, the result of the multiplication can be either plus or minus one.


$$\frac{\overline{CD}}{\overline{AD}} = \frac{\overline{CK}}{\overline{AB}} \,.$$

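From $\triangle BQP \sim \triangle KQC$ it follows that:

$$\frac{\overline{PQ}}{\overline{QC}} = \frac{\overline{BP}}{\overline{CK}} \,.$$

Eliminating $\overline{CK}$ gives $\overline{AB} \cdot \overline{CD} \cdot \overline{PQ} = \overline{BP} \cdot \overline{QC} \cdot \overline{AD}$, which can be rearranged to obtain Eq. A.6. □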
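Eq. A.6 can be checked on coordinates. The sketch below reads the ratios in Eq. A.6 as Menelaus's relation for a triangle with vertices $A$, $P$, $C$ cut by a transversal at $B$ (on line $AP$), $Q$ (on line $PC$) and $D$ (on line $CA$); this reading is inferred from the form of the equation and is an assumption. The helper names and sample points are also illustrative, and unsigned lengths are used, so the product is $+1$.

```python
import math

def intersect(P1, P2, P3, P4):
    """Intersection of line P1P2 with line P3P4 (assumed not parallel)."""
    x1, y1 = P1
    x2, y2 = P2
    x3, y3 = P3
    x4, y4 = P4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    d12 = x1 * y2 - y1 * x2
    d34 = x3 * y4 - y3 * x4
    return ((d12 * (x3 - x4) - (x1 - x2) * d34) / den,
            (d12 * (y3 - y4) - (y1 - y2) * d34) / den)

def menelaus_product(A, P, C, T1, T2):
    """Return (AB/BP) * (PQ/QC) * (CD/AD), where the transversal through
    T1 and T2 meets lines AP, PC and CA at B, Q and D."""
    B = intersect(T1, T2, A, P)
    Q = intersect(T1, T2, P, C)
    D = intersect(T1, T2, C, A)
    return (math.dist(A, B) / math.dist(B, P)
            * math.dist(P, Q) / math.dist(Q, C)
            * math.dist(C, D) / math.dist(A, D))

print(menelaus_product((0, 0), (4, 0), (0, 4), (1, 0), (0, -1)))
```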

#### **Sources**

The appendix is based primarily on [19]. Ceva's theorem and Menelaus's theorem can be proved from each other [45].

## **References**

Wikipedia references are listed by their title only.

All links were alive on 7 December 2021.


Wayback Machine. https://web.archive.org/web/20191223032114/http://www2.washjeff.edu/users/mwoltermann/Dorrie/DorrieContents.htm (2010)


## **Index**

#### **A**

Abe, Hisashi 141 Alperin, Roger C. 150 Appel, Kenneth 41 Archimedes 11, 16

#### **B**

Babylonian mathematics 100 Beloch fold 138 Beloch, Margherita P. 113, 138, 141 Binet's formula 64 Burstall, Rod M. 72

#### **C**

Callagy, James J. 183 Cardano, Gerolamo 79 Carlyle circle 84 Ceva's theorem 165, 215 Circumscribed circle 204, 212–214 around a trapezoid 213 Collapsing compass 1 construction of a perpendicular bisector 2 construction of an equilateral triangle 3 Coloring of a planar graph 42 of a planar map 41 of a polygon 55 Constructible number 21 arithmetic operation 21 depth of square roots 24 square root 22

Construction with only a compass 151–161 of a regular heptadecagon 192 of a regular pentagon 196 Cubic equation 79

#### **D**

de Moivre's formula 185 Degree of a vertex 46 Directrix of a parabola 121 Doubling a cube impossibility of 26 with a neusis 18 using origami 144, 147

#### **E**

Elliptic curve 178 Euclid's *Elements* 3 Euclid's formula 100 Euler's formula 44 Euler, Leonhard 66

#### **F**

Fáry's theorem 43 Fermat numbers 66, 184 Fermat, Pierre de 66 Fibonacci numbers 63 Five-color theorem 48 Floating-point underflow 85 Focus of a parabola 121 Four-color theorem 50 Fundamental theorem of algebra 25, 185

#### **G**

Galois, Évariste 2 Gauss, Carl Friedrich 183 Gauss-Wantzel theorem 184 Geometric locus 113 Guthrie, Francis 41

#### **H**

Haken, Wolfgang 41 Hatori, Koshiro 113 Heath, Thomas L. 9 Heawood, Percy J. 41 Heptadecagon 183 construction 192 cosine of the central angle 184 Hermes, Johann Gustav 184 Heron's formula *see* Triangle, Heron's formula Heule, Marijn J.H. 91, 98, 102 Hippias 11, 19 Huzita, Humiaki 113

#### **I**

Inscribed circle 175

#### **J**

Josephus problem 69

#### **K**

$K_{3,3}$ is not planar 45 $K_5$ is not planar 45 Kempe chain 48 Kempe, Alfred B. 41 Klee, Victor 53 Kochański, Adam 30 Kullman, Oliver 91, 98, 102 Kuratowski, Kazimierz 45

#### **L**

Langford's problem 105 as a covering problem 106 solution of (4) 110 solvability, conditions for 107 Langford, C. Dudley 105

Law of cosines 18, 36, 172, 175, 206, 210, 214 Lill's method 131 algorithm 133 cube root of two 137 multiple roots 132 negative coefficients 134 non-integer roots 136 paths that do not lead to roots 132 proof of 137 quadratic equations 83 zero coefficients 135 Lindemann, Carl von 29 Loh, Poh-Shen 73

#### **M**

Martin, George E. 141 Mascheroni, Lorenzo 151 Mathematical induction 61 McCarthy's 91-function 67 McCarthy, John 67 Menelaus's theorem 217 Messer, Peter 141 Mohr, Georg 151 Mohr-Mascheroni theorem 151 Monic polynomial 24, 25, 27, 74, 78, 137, 187 Museum guard a 53 and triangulated polygons 56

#### **N**

Neusis doubling of a cube 18 trisection of an angle 16 Nonagon *see* Origami, construction of a nonagon

#### **O**

Origami angle bisector 117 axiom 1 114 axiom 2 115 axiom 3 116 axiom 4 118 axiom 5 119 axiom 6 121


axiom 7 128 construction of a nonagon 148 doubling a cube 144, 147 geometric constructions 141 perpendicular bisector 115 reflection 113 trisection of an angle 141, 143

#### **P**

Parabola common tangents to two parabolas 121 cubic equation for the common tangents 125 folds of Axiom 6 are tangents 127 as the locus of a fold 121 Pascal's rule 65 Paucker, Magnus Georg 184 Paul Erdős 94 Peano axioms 63 Planar graph 42 map 41 Plimpton 322 100 Polygon convex and concave vertices 55 triangulated 54, 57 Poncelet, Jean-Victor 163 Probabilistic method 94 Propositional logic 96 Ptolemy's theorem 204, 212 Pythagorean triples 91, 98, 100 primitive 100

#### **Q**

Quadratic equation 73 completing the square 78 numerical computation of the roots 85 roots of 74 traditional formula 78 Quadratrix 19 compass 20 squaring a circle 38 Quadrilateral inscribed in a circle 204 Quartic equation 77

#### **R**

Ramanujan 32, 35

Ramsey's theorem 93 lower bound 94 lower bound for 94 proof that $R(3) = 6$ 93 Ramsey, Frank P. 93, 102 Regular polygon cosine of the central angle 184 Richelot, Friedrich Julius 184 Roots of cubic polynomials 24–25 of unity 185 Round-off error 85

#### **S**

SAT solver 96 DPLL algorithm 99 unit propagation 99 Schur triples 89, 97 Sexagesimal number system 101 Six-color theorem 48 Squaring a circle approximation 30, 32, 35 with a quadratrix 38 Steiner, Jakob 163

#### **T**

Trapezoid, isosceles 154, 213 Triangle angle bisector theorem 211 computing the area 199–201 Heron's formula 175, 200 incenter 175, 200 inscribed circle 200 all are isosceles 7 same area and perimeter 175 Triangulated graph 42 Trigonometric identities 202–210 cosine of a triple angle 205 limit of $\sin \alpha / \alpha$ 209 product of three tangents 209 sine and cosine of a half-angle 205 sine and cosine of the sum and difference of two angles 202 tangent of a half-angle 208 tangent of the sum of two angles 208 Trisection of an angle approximation 11, 14 impossibility of 26


#### **V**

van der Waerden's problem 92

#### **W**

Wantzel, Pierre 11