
Third edition

Mathematical Proofs: A Transition to Advanced Mathematics

Gary Chartrand Western Michigan University

Albert D. Polimeni State University of New York at Fredonia

Ping Zhang Western Michigan University

Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto Delhi Mexico City Sao Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo

Editor-in-Chief: Deirdre Lynch
Senior Acquisitions Editor: William Hoffman
Assistant Editor: Brandon Rawnsley
Executive Marketing Manager: Jeff Weidenaar
Marketing Assistant: Caitlin Crain
Senior Production Project Manager: Beth Houston
Manager, Visual Cover Research and Permissions: Jayne Conte
Cover Designer: Suzanne Behnke
Cover Photo: Shutterstock.com
Full-Service Project Management: Kailash Jadli, Aptara®, Inc.
Composition: Aptara®, Inc.
Printer/Binder: Courier Westford
Cover Printer: Lehigh/Phoenix

Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this book appear on the relevant page within the text.

Copyright © 2013, 2008, 2003 by Pearson Education, Inc. All rights reserved. Manufactured in the United States of America. This publication is protected by copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or fax your request to 201-236-3290.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publication Data
Chartrand, Gary.
Mathematical Proofs: A Transition to Advanced Mathematics / Gary Chartrand, Albert D. Polimeni, Ping Zhang. – 3rd ed.
p. cm.
Includes references and index.
ISBN-13: 978-0-321-79709-4
ISBN-10: 0-321-79709-4
1. Proof theory - Textbooks. I. Polimeni, Albert D., 1938- II. Zhang, Ping, 1957- III. Title.
QA9.54.C48 2013
511.3'6-dc23

2012012552

10 9 8 7 6 5 4 3 2 1 – CW – 16 15 14 13 12 ISBN-13: 978-0-321-79709-4 ISBN-10: 0-321-79709-4

To the memory of my mother and father
GC

the memory of my Uncle Joe and my brothers John and Rocky
ADP

my mother and the memory of my father
PZ

Contents

0 Communicating Mathematics
  Learning Mathematics
  What Others Have Said About Writing
  Mathematical Writing
  Using Symbols
  Writing Mathematical Expressions
  Common Words and Phrases in Mathematics
  Some Closing Comments About Writing

1 Sets
  1.1 Describing a Set
  1.2 Subsets
  1.3 Set Operations
  1.4 Indexed Collections of Sets
  1.5 Partitions of Sets
  1.6 Cartesian Products of Sets
  Exercises for Chapter 1

2 Logic
  2.1 Statements
  2.2 The Negation of a Statement
  2.3 The Disjunction and Conjunction of Statements
  2.4 The Implication
  2.5 More on Implications
  2.6 The Biconditional
  2.7 Tautologies and Contradictions
  2.8 Logical Equivalence
  2.9 Some Fundamental Properties of Logical Equivalence
  2.10 Quantified Statements
  2.11 Characterizations of Statements
  Exercises for Chapter 2

3 Direct Proof and Proof by Contrapositive
  3.1 Trivial and Vacuous Proofs
  3.2 Direct Proofs
  3.3 Proof by Contrapositive
  3.4 Proof by Cases
  3.5 Proof Evaluations
  Exercises for Chapter 3

4 More on Direct Proof and Proof by Contrapositive
  4.1 Proofs Involving Divisibility of Integers
  4.2 Proofs Involving Congruence of Integers
  4.3 Proofs Involving Real Numbers
  4.4 Proofs Involving Sets
  4.5 Fundamental Properties of Set Operations
  4.6 Proofs Involving Cartesian Products of Sets
  Exercises for Chapter 4

5 Existence and Proof by Contradiction
  5.1 Counterexamples
  5.2 Proof by Contradiction
  5.3 A Review of Three Proof Techniques
  5.4 Existence Proofs
  5.5 Disproving Existence Statements
  Exercises for Chapter 5

6 Mathematical Induction
  6.1 The Principle of Mathematical Induction
  6.2 A More General Principle of Mathematical Induction
  6.3 Proof by Minimum Counterexample
  6.4 The Strong Principle of Mathematical Induction
  Exercises for Chapter 6

7 Prove or Disprove
  7.1 Conjectures in Mathematics
  7.2 Revisiting Quantified Statements
  7.3 Testing Statements
  Exercises for Chapter 7

8 Equivalence Relations
  8.1 Relations
  8.2 Properties of Relations
  8.3 Equivalence Relations
  8.4 Properties of Equivalence Classes
  8.5 Congruence Modulo n
  8.6 The Integers Modulo n
  Exercises for Chapter 8

9 Functions
  9.1 The Definition of Function
  9.2 The Set of All Functions from A to B
  9.3 One-to-One and Onto Functions
  9.4 Bijective Functions
  9.5 Composition of Functions
  9.6 Inverse Functions
  9.7 Permutations
  Exercises for Chapter 9

10 Cardinalities of Sets
  10.1 Numerically Equivalent Sets
  10.2 Countable Sets
  10.3 Uncountable Sets
  10.4 Comparing Cardinalities of Sets
  10.5 The Schröder-Bernstein Theorem
  Exercises for Chapter 10

11 Proofs in Number Theory
  11.1 Divisibility Properties of Integers
  11.2 The Division Algorithm
  11.3 Greatest Common Divisors
  11.4 The Euclidean Algorithm
  11.5 Relatively Prime Integers
  11.6 The Fundamental Theorem of Arithmetic
  11.7 Concepts Involving Sums of Divisors
  Exercises for Chapter 11

12 Proofs in Calculus
  12.1 Limits of Sequences
  12.2 Infinite Series
  12.3 Limits of Functions
  12.4 Fundamental Properties of Limits of Functions
  12.5 Continuity
  12.6 Differentiability
  Exercises for Chapter 12

13 Proofs in Group Theory
  13.1 Binary Operations
  13.2 Groups
  13.3 Permutation Groups
  13.4 Fundamental Properties of Groups
  13.5 Subgroups
  13.6 Isomorphic Groups
  Exercises for Chapter 13

14 Proofs in Ring Theory (Online)
  14.1 Rings
  14.2 Elementary Properties of Rings
  14.3 Subrings
  14.4 Integral Domains
  14.5 Fields
  Exercises for Chapter 14

15 Proofs in Linear Algebra (Online)
  15.1 Properties of Vectors in 3-Space
  15.2 Vector Spaces
  15.3 Matrices
  15.4 Some Properties of Vector Spaces
  15.5 Subspaces
  15.6 Spans of Vectors
  15.7 Linear Dependence and Independence
  15.8 Linear Transformations
  15.9 Properties of Linear Transformations
  Exercises for Chapter 15

16 Proofs in Topology (Online)
  16.1 Metric Spaces
  16.2 Open Sets in Metric Spaces
  16.3 Continuity in Metric Spaces
  16.4 Topological Spaces
  16.5 Continuity in Topological Spaces
  Exercises for Chapter 16

Answers and Hints to Selected Odd-Numbered Exercises in Chapters 14-16 (Online)
Answers and Hints to Odd-Numbered Section Exercises
References
Index of Symbols
Index

PREFACE TO THE THIRD EDITION

As we noted in the prefaces to the first two editions, the teaching of calculus in many colleges and universities has become more problem-oriented, with increased emphasis on the use of calculators and computers. As a result, the theoretical gap between the material presented in calculus and the mathematical background expected (or at least hoped for) in more advanced courses such as abstract algebra and advanced calculus has widened. To narrow this gap and better prepare students for the more abstract mathematics courses to come, many colleges and universities have introduced courses that are now commonly referred to as transition courses. In these courses, students are introduced to problems whose solution requires mathematical reasoning and a knowledge of proof techniques, and the writing of clear proofs is emphasized. Topics such as relations, functions, and cardinalities of sets are encountered throughout theoretical mathematics courses. In addition, transition courses often include theoretical aspects of number theory, abstract algebra, and calculus. This book was written for such a course.

The idea for this textbook originated in the early 1980s, long before transition courses became fashionable, during the supervision of undergraduate mathematics research projects. We came to realize that even advanced undergraduates lack a sound understanding of proof techniques and struggle to write correct and clear proofs. At that time, a set of notes was developed for these students. This was followed by the introduction of a transition course, for which a more detailed collection of notes was written. These notes gave rise to the first edition of this book, which in turn led to a second edition and now to this third edition.

While understanding proofs and proof techniques and writing good proofs are the main goals here, they cannot be accomplished to any great degree in a single course during a single semester. They must continue to be emphasized and practiced in succeeding mathematics courses.

Our Approach

Because this textbook grew out of notes written exclusively for undergraduates, to help them understand proof techniques and write good proofs, all editions of this book have been written in the same tone: to be student-friendly. Numerous examples of proofs are presented in the text. Following common practice, we mark the end of a proof with the square symbol. We often precede a proof with a discussion, called a proof strategy, in which we think through what is needed to present a proof of the result in question. At other times we find it useful to reflect on a proof we have just presented in order to point out certain key details. We refer to such a discussion as a proof analysis. Periodically, problems are presented and solved, and we may find it convenient to discuss some features of the solution, which we refer to simply as an analysis. For clarity, we mark the end of a discussion of a proof strategy, a proof analysis, an analysis, or a solution of an example with the diamond symbol.

A major goal of this book is to help students learn to construct proofs of their own that are not only mathematically correct but also clearly written. Advanced mathematics students should strive to present proofs that are convincing, readable, notationally consistent, and grammatically correct. A secondary goal is for students to gain enough knowledge of and confidence with proofs to recognize, understand, and appreciate a proof that is properly written.

Like the first two editions, the third edition of this book is designed to help students make the transition to courses that rely more heavily on mathematical proof and reasoning. We expect students to take a course based on this book after completing a year of calculus (and possibly another course such as elementary linear algebra). It is likely that, before taking this course, a student's mathematical training consisted primarily of doing patterned problems; that is, students were taught methods for solving problems, probably with some explanation of why those methods worked. Students may have encountered some proofs in earlier courses but, more than likely, were unaware of the logic involved and of the method of proof being used. There may even have been times when the students were not certain what was being proved.

Outline of the Contents

Since writing good proofs requires a certain degree of competence in writing, we have devoted Chapter 0 to writing mathematics. The emphasis in this chapter is on effective and clear exposition, the correct use of symbols, writing and displaying mathematical expressions, and the use of key words and phrases. Although every instructor will emphasize writing in his or her own way, we have found it useful to read Chapter 0 periodically throughout the course. It will mean more as the student progresses through the course.

Among the additions and changes to the second edition that resulted in this third edition are the following.

• More than 250 exercises have been added, many of which require more thought to solve.
• New exercises dealing with conjectures have been added to give students practice with this important aspect of more advanced mathematics.
• Additional examples have been provided to assist in understanding and solving new exercises.
• In several instances, the discussion of a topic has been expanded to provide added clarity. In particular, the important topic of quantified statements is introduced in Section 2.10 and then reviewed in Section 7.2 to strengthen understanding of it.

• A discussion of cosets and Lagrange's theorem has been added to Chapter 13 (Proofs in Group Theory).

Each chapter is divided into sections, and the exercises for each chapter appear at the end of the chapter, divided into sections in the same way. There is also a final section of exercises for the entire chapter.

Chapter 1 provides a gentle introduction to sets, so that everyone has the same background and uses the same notation as we prepare for what lies ahead. No proofs involving sets occur until Chapter 4. Much of Chapter 1 may well be a review for many.

Chapter 2 deals exclusively with logic. The goal here is to present what is needed to get into proofs as quickly as possible. Much of the emphasis in Chapter 2 is on statements, implications, and quantified statements, including a discussion of mixed quantifiers. Sets are introduced before logic so that the student's first encounter with mathematics here is a familiar one, and because sets are needed to discuss quantified statements properly in Chapter 2.

The two proof techniques of direct proof and proof by contrapositive are introduced in Chapter 3 in the familiar setting of even and odd integers. Proof by cases is discussed in this chapter, as are proofs of "if and only if" statements. Chapter 4 continues this discussion in other settings, namely divisibility of integers, congruence, real numbers, and sets.

The technique of proof by contradiction is introduced in Chapter 5. Since existence proofs and counterexamples are connected with proof by contradiction, they appear in Chapter 5 as well. The topic of uniqueness (of an element with specified properties) is also addressed in Chapter 5.

The main goal of Chapter 7 (Prove or Disprove) is the testing of statements of unknown truth value, where it must be determined, with justification, whether a given statement is true or false. In addition to the challenge of determining whether a statement is true or false, such problems provide added practice with counterexamples and the various proof techniques. In this chapter, the testing of statements is preceded by a historical discussion of conjectures in mathematics and a review of quantifiers.

Chapter 8 deals with relations, especially equivalence relations. Many examples involving congruence are presented, and the set of integers modulo n is described.

Chapter 9 deals with functions, with emphasis on the one-to-one and onto properties. This leads to a discussion of bijective functions and inverse functions. The well-defined property of functions is discussed in more detail in this edition. There is also a discussion of images and inverse images of sets with respect to functions, along with several additional exercises on these concepts.

Chapter 10 deals with infinite sets and a discussion of cardinalities of sets. This chapter contains a historical discussion of infinite sets, beginning with Cantor and his fascination and difficulties with the Schröder-Bernstein theorem, moving on to Zermelo and the axiom of choice, and ending with a proof of the Schröder-Bernstein theorem.

All of the proof techniques are used in Chapter 11, where numerous results from number theory are presented and proved.

Chapter 12 deals with proofs that occur in calculus. Because these proofs are quite different from those encountered earlier, but are often more predictable in nature, many illustrations are given that involve limits of sequences and limits of functions and their connections with infinite series, continuity, and differentiability.

The final chapter, Chapter 13, deals with modern algebra, beginning with binary operations and ending with proofs in group theory.

Mathematical Proofs Website

Three additional chapters, Chapters 14-16 (dealing with proofs in ring theory, linear algebra, and topology), can be found at the website http://www.aw.com/info/chartrand.

Teaching a Course Using This Text

Although a course using this book can be designed in many ways, here are our thoughts on such a course. As mentioned earlier, we have found it helpful for students to re-read Chapter 0 (at least part of it) throughout the course, since we feel the chapter gains meaning with each reading. The first part of Chapter 1 (Sets) will probably be familiar to most students, although the last part may not be. Chapters 2-6 will likely be part of any such course, although certain topics could receive varying degrees of emphasis (with, perhaps, proof by minimum counterexample in Chapter 6 omitted altogether). Little or much time can be spent on Chapter 7, depending on how much time is devoted to the many "Prove or Disprove" exercises. We believe that most of Chapters 8 and 9 would be covered in such a course. It would be useful to cover some of the fundamental ideas of Chapter 10 (Cardinalities of Sets). If time permits, portions of the later chapters can be covered, particularly those of interest to the instructor, including the three online chapters, which offer even more variety.

Exercises

Chapters 1-13 (as well as Chapters 14-16 on the website) contain numerous exercises. The exercises range in difficulty from routine to medium difficulty to moderately challenging. As mentioned, the third edition has more exercises in the medium-difficulty category. There are exercises that present students with statements and ask them to decide whether the statements are true or false (with justification). There are proposed proofs of statements, where students are asked whether the argument is valid. There are proofs given without a statement, where students are asked to supply a statement of what has been proved. In addition, there are exercises that ask students to make conjectures of their own and possibly to prove them.

Chapter 3 is the first chapter in which students are asked to write proofs. We believe that at such an early stage, students need to (1) concentrate on constructing a valid proof and not be distracted by unfamiliarity with the mathematics, (2) develop some self-confidence with this process, and (3) learn how to write a proof properly. With this in mind, many of the exercises in Chapter 3 have been intentionally structured to resemble the examples in that chapter.

Typically, the end of a chapter contains exercises related to each section (section exercises) and additional exercises for the entire chapter (chapter exercises). Answers or hints to the odd-numbered section exercises appear at the end of the text. It should be kept in mind, however, that proofs of results are generally not unique.

Acknowledgments

It is a pleasure to thank the reviewers of the third edition:

Daniel Acosta, Southeastern Louisiana University
Scott Annin, California State University, Fullerton
J. Marshall Ash, DePaul University
Ara Basmajian, Hunter College of CUNY
Matthias Beck, San Francisco State University
Richard Belshoff, Missouri State University
James Brawner, Armstrong Atlantic State University
Manav Das, University of Louisville
David Dempsey, Jacksonville State University
Cristina Domokos, California State University, Sacramento
José D. Flores, University of South Dakota
Eric Gottlieb, Rhodes College
Richard Hammack, Virginia Commonwealth University
Alan Koch, Agnes Scott College
M. Harper Langston, Courant Institute of Mathematical Sciences, New York University
Maria Nogin, California State University, Fresno
Daniel Nucinkis, University of Southampton
Thomas Polaski, Winthrop University
John Randall, Rutgers University
Eileen T. Shugart, Virginia Tech
Brian A. Snyder, Lake Superior State University
Melissa Sutherland, SUNY Geneseo
M.B. Ulmer, University of South Carolina Upstate
Mike Winders, Worcester State University

We also thank Renato Mirollo, Boston College, and Tom Weglaitner for a final reading of portions of the third edition.

We have been fortunate to receive the enthusiastic support of many at Pearson. First, we would like to thank the editorial team, as well as others at Pearson who have been so helpful: Greg Tobin, Publisher, Mathematics and Statistics; William Hoffman, Senior Acquisitions Editor; Jeff Weidenaar, Executive Marketing Manager, Mathematics; and Brandon Rawnsley, Associate Editor, Arts and Sciences, Higher Education. Our thanks to all of you. Finally, thanks also to Beth Houston, Senior Production Project Manager; Kailash Jadli, Project Manager, Aptara, Inc.; and Mercedes Heston, copy editor, for guiding us through the final stages of the third edition.

Gary Chartrand
Albert D. Polimeni
Ping Zhang


Chapter 0

Communicating Mathematics

Quite likely, the mathematics you have encountered so far consists of doing problems using a specific approach or procedure. This may include solving equations in algebra, simplifying algebraic expressions, verifying trigonometric identities, using certain rules to find and simplify the derivatives of functions, and setting up and evaluating a definite integral that gives the area of a region or the volume of a solid. Accomplishing all of this is usually a matter of practice.

Many of the methods used to solve problems in mathematics are based on mathematical results that were discovered by people and shown to be true. This kind of mathematics may well be new to you and, as with anything new, there are things to learn. But learning something new can be (in fact, should be) fun. Several steps are involved here. The first step is discovering something in mathematics that we believe to be true. How does one discover new mathematics? This usually comes about by considering examples and observing that a pattern seems to be emerging from them. This may lead to a guess on our part about what appears to be happening. We then have to convince ourselves that our guess is correct. In mathematics, this involves constructing a proof that what we believe to be true is, in fact, true. But that is not enough. We need to convince others that we are right. So we must write a proof that is so clear and so logical that people who know the methods of mathematics will be convinced. What distinguishes mathematics from all other scholarly fields is that once a certain mathematical statement has been proved, there is no longer any doubt. The statement is true. Period. There is no other alternative.

Our main emphasis here will be on learning how to construct mathematical proofs and how to write them so that they are clear to, and understood by, others. Although learning to guess new mathematics is important and can be fun, we will spend only a little time on it, since it often requires an understanding of more mathematics than can be discussed here. But why would we want to discover new mathematics? While one possible answer is that it comes from the curiosity that most mathematicians possess, a more common explanation is that we have a problem to solve that requires knowing that some mathematical statement is true.


Learning Mathematics

One of the main goals of this book is to help you progress from someone who uses mathematics to someone who understands mathematics. Perhaps this will mark the beginning of your becoming someone who actually develops mathematics of your own. This is an attainable goal if you have the desire.

The fact that you have come this far in your study of mathematics suggests that you have ability in mathematics. This is a real opportunity for you. Much of the mathematics you will encounter in the future is based on what you are about to learn here. The better you learn the material and the mathematical thought process now, the more you will understand later. Certainly, any area of study is far more enjoyable when you understand it. But getting to that point requires effort on your part.

There are probably as many excuses for doing poorly in mathematics as there are strategies for doing well. We have all heard students say (sometimes, remarkably, even with pride) that they are not good at mathematics. That is only an alibi. Mathematics can be learned like any other subject. Even some students who have done well in mathematics say that they are not good with proofs. This, too, is unacceptable. What is required is determination and effort. Doing well on an exam with little or no studying is nothing to be proud of. Confidence based on being well prepared, however, is good.

Here is some advice that has worked for many students. First, it is important to understand what goes on in class each day. That means being present and prepared for every class. After each class, recopy your lecture notes. As you recopy the notes, express the sentences in your own words and add details so that everything is as clear as possible. If you run into snags (and you will), talk them over with a classmate or your instructor. In fact, it is a good idea (at least in our opinion) to have someone with whom to discuss the material on a regular basis. Not only does this often clarify ideas, it also gets you into the habit of using correct terminology and notation.

Besides learning mathematics from your instructor and reinforcing your understanding through careful note-taking and conversations with classmates, your text is (or at least should be) an excellent resource. Read your text carefully with pen (or pencil) and paper in hand. Make a serious effort to do every homework problem assigned and, eventually, be certain that you know how to solve each one. If there are exercises in the text that have not been assigned, you might try to solve those as well. Another good idea is to try to create your own problems. When studying for an exam, even try creating your own exam. If you start doing this in all of your classes, you might be surprised at how good you become.

Creativity is an important part of mathematics. Discovering mathematics not only contributes to your understanding of the subject but also has the potential to contribute to mathematics itself. Creativity can come in all forms. The following quote is from the well-known writer J. K. Rowling (author of the Harry Potter novels).

Sometimes ideas just come to me. Other times I have to sweat and almost bleed to make ideas come. It's a mysterious process, but I hope I never find out exactly how it works.


In the book Defying Gravity: The Creative Career of Stephen Schwartz from Godspell to Wicked, author Carol de Giere writes a biography of Stephen Schwartz, one of the most successful composer-lyricists, in which she discusses not only creativity in music but also how an idea can lead to something special and interesting and how creative people may have to deal with disappointment. Indeed, de Giere dedicates her book to the creative spirit within each of us. Although he wrote the music for such famous shows as Godspell and Wicked, Schwartz addresses creativity directly in the song "The Spark of Creation," which he wrote for the musical Children of Eden. In her book, de Giere writes:

In many ways, this song expresses the theme of Stephen Schwartz's life - the naturalness and importance of the creative urge within us. At the same time he created an anthem for artists.

In mathematics, our goal is to seek the truth. Finding answers to mathematical questions is important, but we cannot be satisfied with that alone. We must be certain that we are right and that our explanation of why we believe we are right is convincing to others. The reasoning we use to get from what we know to what we want to show must be logical. It must make sense to others, not just to ourselves.

There is a shared responsibility here. As writers, it is our responsibility to give an accurate, clear argument with enough detail for the reader to understand what we have written and to be convinced. It is the reader's responsibility to know the basics of logic and to study the concepts involved, so that a well-presented argument will be understood. Consequently, writing is important in mathematics, very important. Is it really important to write mathematics well? After all, isn't mathematics mainly equations and symbols? Not at all. It is not only important to write mathematics well; it is important to write well. You will be writing for the rest of your life, at the very least reports, letters, and email. Many people who never meet you will know you only by what you write and how you write.

Mathematics is a sufficiently complicated subject that we do not need vague, hazy, and boring writing to add to it. A teacher forms a very positive impression of a student who hands in well-written, well-organized assignments and examinations. You want people to enjoy reading what you have written. It is important to have a good reputation as a writer. It is part of being an educated person. Especially with the large number of emails that so many of us write, it has become common for writing to be more casual. Although most people would probably accept this (since it is more efficient), we should know how to write well, formally and professionally, when the situation calls for it.

You might think that, considering how long you have been writing and how set you are in your ways, it will be very difficult to improve your writing. Not really. If you want to improve, you can and you will. Even if you are a good writer, your writing can always be improved. Ordinarily, people do not think much about their writing. Often, just thinking about your writing is the first step to writing better.


What Others Have Said About Writing

Many people who are well known in their fields have shared their thoughts on writing. Here are quotes from some of them.

Anything that helps communication is good. Anything that hurts it is bad. I like words more than numbers, and I always did - conceptual more than computational.
Paul Halmos, mathematician

Writing is easy. All you have to do is cross out all the wrong words.
Mark Twain, author (The Adventures of Huckleberry Finn)

You don't write because you want to say something; you write because you have something to say.
F. Scott Fitzgerald, author (The Great Gatsby)

Writing comes more easily if you have something to say.
Sholem Asch, author

Either write something worth reading or do something worth writing about.
Benjamin Franklin, statesman, writer, inventor

What is written without effort is in general read without pleasure.
Samuel Johnson, writer

Easy reading is damned hard writing.
Nathaniel Hawthorne, novelist (The Scarlet Letter)

Anything that is written to please the author is worthless.
The last thing one knows when writing a book is what to put first.
I have made this letter longer because I lack the time to make it shorter.
Blaise Pascal, mathematician and physicist

The best way to become acquainted with a subject is to write a book about it.
Benjamin Disraeli, Prime Minister of England

In a very real sense, the writer writes in order to teach himself, to understand himself, to satisfy himself; the publishing of his ideas, though it brings gratifications, is a curious anticlimax.
Alfred Kazin, literary critic

The skill of writing is to create a context in which other people can think.
Edwin Schlossberg, exhibit designer

A writer needs three things, experience, observation, and imagination, any two of which, at times any one of which, can supply the lack of the others.
William Faulkner, writer (The Sound and the Fury)

If confusion runs rampant in the passage just read,
It may very well be that too much has been said.
So that's what he meant! Then why didn't he say so?
Frank Harary, mathematician

A mathematical theory is not to be considered complete until you have made it so clear that you can explain it to the first man whom you meet on the street.
David Hilbert, mathematician

Everything should be made as simple as possible, but not simpler.
Albert Einstein, physicist

Never let anything you write be published without having had others critique it.
Donald E. Knuth, computer scientist and writer

Some books are to be tasted, others to be swallowed, and some few to be chewed and digested.
Reading maketh a full man, conference a ready man, and writing an exact man.
Francis Bacon, writer and philosopher

Judge an article not by the quality of what is framed and hanging on the wall, but by the quality of what's in the wastebasket.
Anonymous (quote from Leslie Lamport)

We are all apprentices in a craft where no one ever becomes a master.
Ernest Hemingway, author (For Whom the Bell Tolls)

There are three rules for writing a novel. Unfortunately, no one knows what they are.
W. Somerset Maugham, author (Of Human Bondage)

Mathematical Writing

Most of the quotes above relate to writing in general, not to mathematical writing in particular. However, the suggestions apply to mathematical writing as well. For us, mathematical writing means writing assignments for a mathematics course (especially a course with proofs). Such an assignment might consist of writing a single proof, writing solutions to a set of problems, or perhaps writing a term paper that will most likely include definitions, examples, background, and proofs. We will refer to any of these as an assignment. Your goal should be to write correctly, clearly, and in an interesting manner.

Before you even start writing, you need to have a few things in mind. First, you need to know which examples and proofs you intend to include, if these are appropriate for your assignment. You shouldn't worry too much about writing good proofs on your first try, but make sure that you do have the proofs.

As you write your assignment, you need to be aware of your audience. Who is the intended reader of your assignment? Of course, it has to be written for your instructor. But it should be written in such a way that a classmate can understand it. As you grow mathematically, your audience will grow with you, and you will adapt your writing to this new audience.

Give yourself enough time to write your assignment. Don't try to put it together just minutes before it is due. The disappointing result will be obvious to your instructor. And to you! Find a place to write that is comfortable for you: your room, an office, a study room, the library; sitting at a desk, at a table, in a chair. Do what works best for you. Perhaps you write best when it is quiet, or when music is playing in the background.

Now that you are comfortably settled and have allowed enough time to do a good job, let's put a plan together. If the assignment is fairly long, you may need an outline, which will most likely include one or more of the following:

1. Background and motivation
2. The definitions to be presented and the notation to be used, if any
3. The examples to include
4. The results to be presented (whose proofs have already been written, probably in rough form)
5. References to other results you intend to use
6. The order of everything mentioned above.

If the assignment is a term paper, it may include extensive background material and may need careful motivation. The topic of the paper should be placed in perspective. How does it fit in with what we already know?

Many writers write in spirals. Even if you have a plan for your assignment that includes an ordered list of the things you want to say, you will probably find at some point (perhaps sooner than you think) that you should have included something earlier: perhaps a definition, a theorem, an example, or some notation. (This happened to us many times while writing this book.) Insert the missing material, start over, and write until you once again find something missing. It is important to start from the beginning each time you reread. Then repeat the steps above.

We are about to give you some advice, some tips on how to write mathematics. Such advice is inevitably subjective. Not everyone agrees with these suggestions. In fact, writing experts do not agree on every point. For now, your instructor is your best guide. But writing does not follow a list of rules. As you mature mathematically, perhaps the best advice about your writing is the same advice Jiminy Cricket gave to Disney's Pinocchio: Always let your conscience be your guide. You must be yourself. One more piece of advice: be careful about accepting advice on writing. Originality and creativity follow no rules. However, until you have reached the stage where you feel comfortable and confident with your own writing, we believe it is helpful to consider a few writing tips.

Since some of these writing tips may not make sense yet (after all, we don't have anything to write yet), it will probably be most helpful to return to this chapter regularly as you progress through the following chapters.

Using Symbols

Because mathematics is a symbol-oriented subject, mathematical writing requires a mixture of words and symbols. Here are some guidelines that many mathematicians follow.

1. Never start a sentence with a symbol. Mathematical writing follows the same practice as all writing: the first word of a sentence is capitalized. This is confusing when the sentence begins with a symbol, since the sentence appears to be incomplete. Also, a sentence generally sounds better if it begins with a word. Instead of writing:
       x^2 − 6x + 8 = 0 has two distinct roots.
   write:
       The equation x^2 − 6x + 8 = 0 has two distinct roots.

2. Whenever possible, separate symbols that are not in a list by words. Separating symbols by words makes the sentence easier to read and therefore easier to understand. The sentence:
       Except for a, b is the only root of (x − a)(x − b) = 0.
   would be clearer if it were written as:
       Except for a, the number b is the only root of (x − a)(x − b) = 0.

3. Unless you are discussing logic, avoid writing the following symbols in your assignment: ⇒, ∀, ∃, ∋, etc. The first four symbols stand for "implies", "for every", "there exists" and "such that", respectively. You may have seen these symbols before and know what they mean. If so, that's fine. It is useful to use shorthand symbols when taking notes or writing a first draft, but many mathematicians avoid such symbols in their professional writing. (We will visit these symbols later.)

4. Be careful about using i.e. and e.g. These stand for "that is" and "for example", respectively. There are situations where writing the words is preferable to writing the abbreviations, because there can be confusion with nearby symbols. For example, √−1 and lim_{n→∞} (1 + 1/n)^n are not rational numbers, that is, i and e are not rational numbers.

5. Write integers as words when they are used as adjectives and when the numbers are relatively small or easy to describe in words. Write numbers numerically when they specify the value of something.
       There are exactly two groups of order 4.
       Fifty million Frenchmen can't be wrong.
       There are one million positive integers less than 1,000,001.

6. Don't mix words and symbols improperly. Instead of writing:
       Every integer ≥ 2 is prime or composite.
   it is better to write:
       Every integer exceeding 1 is prime or composite.
   or
       If n ≥ 2 is an integer, then n is prime or composite.
   Although
       Since (x − 2)(x − 3) = 0, it follows that x = 2 or 3.
   sounds correct, it is not written correctly. It should read:
       Since (x − 2)(x − 3) = 0, it follows that x = 2 or x = 3.

7. Avoid using a symbol in the statement of a theorem when it is not needed. Do not write:
       Theorem. Every bijective function f has an inverse.
   Delete "f". It serves no useful purpose. The theorem does not depend on what the function is called. A symbol should not be used in the statement of a theorem (or in its proof) exactly once. If it is useful to have a name for an arbitrary bijective function in the proof (as it probably will be), then "f" can be introduced there.

8. Explain the meaning of every symbol that you introduce. Even if what you intend seems clear, don't assume it. For example, if you write n = 2k + 1 and k has never appeared before, then say that k is an integer (if indeed k is an integer).

9. Use "frozen symbols" properly. If m and n are typically used for integers (as they probably are), then don't use them for real numbers. If A and B are used for sets, then don't use them for typical elements of a set. If f is used for a function, then don't use it for an integer. Write symbols that the reader would expect. To do otherwise could very well confuse the reader.

10. Use consistent symbols. Unless there is some special reason to the contrary, use symbols that "fit" together. Otherwise, it is distracting to the reader. Instead of writing:
       If x and y are even integers, then x = 2a and y = 2r for some integers a and r.
    replace 2r by 2b (where then, of course, we write "for some integers a and b"). On the other hand, you might prefer to write x = 2r and y = 2s.

Writing Mathematical Expressions

There will be many occasions when you want to write mathematical expressions in your assignment, such as algebraic equations, inequalities, and formulas. If these expressions are relatively short, they should probably be written within the text of the proof or discussion. (We'll explain this in a moment.) If the expressions are rather lengthy, it is probably preferable to write them as displays.

Suppose that we are discussing the Binomial Theorem. (It doesn't matter if you don't remember what this theorem says.) It is possible that what we write contains the following passage:

For example, if we expand (a + b)^4, we get (a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4.

It would probably be better to write the expansion of (a + b)^4 as a display, where the mathematical expression is placed on a line or lines by itself and is centered. This is illustrated below.

For example, if we expand (a + b)^4, we get

    (a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4.

If there are several mathematical expressions linked by equal signs and inequality signs, we would almost certainly write this as a display. For example, suppose that we wanted to write n^3 + 3n^2 − n + 4 in terms of k, where n = 2k + 1. A possible display is given next:

Since n = 2k + 1, it follows that

    n^3 + 3n^2 − n + 4 = (2k + 1)^3 + 3(2k + 1)^2 − (2k + 1) + 4
                       = (8k^3 + 12k^2 + 6k + 1) + 3(4k^2 + 4k + 1) − 2k − 1 + 4
                       = 8k^3 + 24k^2 + 16k + 7 = 8k^3 + 24k^2 + 16k + 6 + 1
                       = 2(4k^3 + 12k^2 + 8k + 3) + 1.

Notice how the equal signs are lined up. (We wrote two equal signs on one line since that line would have contained very little material otherwise, and the line lengths are better balanced this way.)

Let's return to the expression (a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4 for a moment. If we write this expression within the text of a paragraph (as we are doing) and if we find it necessary to write portions of the expression on two separate lines, then the expression should be broken so that the first line ends with an operation or comparative symbol such as +, −,

a}, [a, ∞) = {x ∈ R : x ≥ a}. The interval (−∞, ∞) is the set R. Note that the infinity symbols ∞ and −∞ are not real numbers; they are used only to help describe certain intervals.
Therefore, [1, ∞], for example, has no meaning.

Two sets A and B are equal, indicated by writing A = B, if they have exactly the same elements. Another way of saying A = B is that every element of A is in B and every element of B is in A, that is, A ⊆ B and B ⊆ A. In particular, whenever some element x belongs to A, then x ∈ B because A ⊆ B. Also, if y is an element of B, then B ⊆ A implies that y ∈ A. That is, whenever an element belongs to one of these sets, it must belong to the other, and so A = B. This fact will be very useful to us in Chapter 4. If A ≠ B, then there must be some element that belongs to one of A and B but not to the other.

It is often convenient to represent sets by diagrams called Venn diagrams. For example, Figure 1.1 shows Venn diagrams for two sets A and B. The diagram on the left represents two sets A and B that have no elements in common, while the diagram on the right is more general. The element x belongs to A but not to B, the element y belongs to B but not to A, the element z belongs to both A and B, while w belongs to neither A nor B. In general, the elements of a set are understood to be those displayed within the region that describes the set. A rectangle in a Venn diagram represents the universal set in this case. Since every element under consideration belongs to the universal set, each element in a Venn diagram lies within the rectangle.

Figure 1.1 Venn diagrams for two sets A and B

A set A is a proper subset of a set B if A ⊆ B but A ≠ B. If A is a proper subset of B, then we write A ⊂ B. For example, if S = {4, 5, 7} and T = {3, 4, 5, 6, 7}, then S ⊂ T. (Although we write A ⊂ B to indicate that A is a proper subset of B, it should be mentioned that some prefer to write A ⊊ B to indicate that A is a proper subset of B. Indeed, there are some who write A ⊂ B, rather than A ⊆ B, to indicate that A is a subset of B. We will follow the notation introduced above, however.)

The set consisting of all subsets of a given set A is called the power set of A and is denoted by P(A).

Example 1.8

For each set A below, determine P(A). In each case, determine |A| and |P(A)|.

(a) A = ∅,   (b) A = {a, b},   (c) A = {1, 2, 3}.

Solution

(a) P(A) = {∅}. In this case, |A| = 0 and |P(A)| = 1.
(b) P(A) = {∅, {a}, {b}, {a, b}}. In this case, |A| = 2 and |P(A)| = 4.
(c) P(A) = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}. In this case, |A| = 3 and |P(A)| = 8.

Notice that for each set A in Example 1.8, we have |P(A)| = 2^|A|. In fact, if A is any finite set, with n elements say, then P(A) has 2^n elements; that is, |P(A)| = 2^|A| for every finite set A. (Later we will explain why this is true.)
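The count |P(A)| = 2^|A| can be checked by brute force on small sets. The following is a minimal sketch, not part of the text's development; the helper name `power_set` is our own, and subsets are stored as `frozenset` objects because Python sets cannot contain ordinary (mutable) sets.

```python
from itertools import chain, combinations


def power_set(elements):
    """Return the power set of `elements` as a set of frozensets."""
    items = list(elements)
    # Collect the subsets of size 0, 1, ..., len(items).
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


# The three sets of Example 1.8.
for A in (set(), {"a", "b"}, {1, 2, 3}):
    P = power_set(A)
    print(len(A), len(P))
    assert len(P) == 2 ** len(A)
```

The loop only confirms the formula for the three sets of Example 1.8; it is an experiment, not a proof.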

Example 1.9

If C = {∅, {∅}}, then P(C) = {∅, {∅}, {{∅}}, {∅, {∅}}}. It is important to note that no two of the sets ∅, {∅} and {{∅}} are equal. (An empty box and a box containing an empty box are not the same.) For the set C above, it is therefore correct to write

    ∅ ⊆ C, ∅ ⊂ C, ∅ ∈ C, {∅} ⊆ C, {∅} ⊂ C, {∅} ∈ C,

as well as

    {{∅}} ⊆ C, {{∅}} ∉ C, {{∅}} ∈ P(C).

1.3 Operations on Sets

Just as there are several ways of combining two integers to produce another integer (addition, subtraction, multiplication, and sometimes division), there are several ways of combining two sets to produce another set. The union of two sets A and B, denoted by A ∪ B, is the set of all elements belonging to A or B, that is,

    A ∪ B = {x : x ∈ A or x ∈ B}.

The use of the word "or" here, and in mathematics in general, allows an element of A ∪ B to belong to both A and B. That is, x is in A ∪ B if x is in A, or x is in B, or x is in both A and B. A Venn diagram for A ∪ B is shown in Figure 1.2.

Example 1.10

For the sets A1 = {2, 5, 7, 8}, A2 = {1, 3, 5} and A3 = {2, 4, 6, 8}, we have

    A1 ∪ A2 = {1, 2, 3, 5, 7, 8},
    A1 ∪ A3 = {2, 4, 5, 6, 7, 8},
    A2 ∪ A3 = {1, 2, 3, 4, 5, 6, 8}.

Also, N ∪ Z = Z and Q ∪ I = R.
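The finite unions in Example 1.10 can be reproduced with Python's built-in `set` type, whose `|` operator computes a union. This is only an illustrative sketch; the variable names simply mirror the sets in the example, and the comprehension restates the defining condition with the inclusive "or" over an assumed small universe 1, ..., 8.

```python
A1 = {2, 5, 7, 8}
A2 = {1, 3, 5}
A3 = {2, 4, 6, 8}

# The | operator forms the union of two sets.
print(A1 | A2)
print(A1 | A3)
print(A2 | A3)

# The defining condition uses the inclusive "or": an element that lies
# in both sets (here, 5) still belongs to the union.
union_by_definition = {x for x in range(1, 9) if x in A1 or x in A2}
assert union_by_definition == A1 | A2
assert 5 in A1 and 5 in A2 and 5 in (A1 | A2)
```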

The intersection of two sets A and B is the set of all elements belonging to both A and B. The intersection of A and B is denoted by A ∩ B. In symbols,

    A ∩ B = {x : x ∈ A and x ∈ B}.

A Venn diagram for A ∩ B is shown in Figure 1.3.

Figure 1.2 A Venn diagram for A ∪ B

Figure 1.3 A Venn diagram for A ∩ B

Example 1.11

For the sets A1, A2 and A3 described in Example 1.10, A1 ∩ A2 = {5}, A1 ∩ A3 = {2, 8} and A2 ∩ A3 = ∅. Also, N ∩ Z = N and Q ∩ R = Q.

For every two sets A and B, it follows that A ∩ B ⊆ A ∪ B. To see why this is true, suppose that x is an element belonging to A ∩ B. Then x belongs to both A and B. Since x ∈ A, for example, x ∈ A ∪ B, and so A ∩ B ⊆ A ∪ B.

If two sets A and B have no elements in common, then A ∩ B = ∅ and A and B are said to be disjoint. Consequently, the sets A2 and A3 described in Example 1.10 are disjoint; however, A1 and A3 are not disjoint since 2 and 8 belong to both sets. Also, Q and I are disjoint.

The difference A − B of two sets A and B (also written as A \ B by some mathematicians) is defined as

    A − B = {x : x ∈ A and x ∉ B}.

A Venn diagram for A − B is shown in Figure 1.4.

Example 1.12

For the sets A1 = {2, 5, 7, 8} and A2 = {1, 3, 5} in Examples 1.10 and 1.11, A1 − A2 = {2, 7, 8} and A2 − A1 = {1, 3}. Furthermore, R − Q = I.
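Intersection, disjointness and difference for the finite sets of Examples 1.10 through 1.12 can be checked the same way. The sketch below relies only on Python's built-in `set` operations (`&`, `-`, `<=` and `isdisjoint`); it illustrates the definitions and does not replace the argument given above for A ∩ B ⊆ A ∪ B.

```python
A1 = {2, 5, 7, 8}
A2 = {1, 3, 5}
A3 = {2, 4, 6, 8}

# Intersection: elements belonging to both sets.
assert A1 & A2 == {5}
assert A1 & A3 == {2, 8}

# Disjoint sets have an empty intersection.
assert A2.isdisjoint(A3)
assert not A1.isdisjoint(A3)

# A ∩ B is always a subset of A ∪ B (<= tests the subset relation).
assert (A1 & A2) <= (A1 | A2)

# Difference: elements of the first set that are not in the second.
# The operation is not symmetric.
assert A1 - A2 == {2, 7, 8}
assert A2 - A1 == {1, 3}
```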

Figure 1.4 A Venn diagram for A − B

Figure 1.5 A Venn diagram for Ā

Example 1.13

For A = {x ∈ R : |x| ≤ 3}, B = {x ∈ R : |x| > 2} and C = {x ∈ R : |x − 1| ≤ 4}: (a) Express A, B and C in interval notation. (b) Determine A ∩ B, A − B, B ∩ C, B ∪ C, B − C and C − B.

Solution (a) A = [−3, 3], B = (−∞, −2) ∪ (2, ∞) and C = [−3, 5]. (b) A ∩ B = [−3, −2) ∪ (2, 3], A − B = [−2, 2], B ∩ C = [−3, −2) ∪ (2, 5], B ∪ C = (−∞, ∞), B − C = (−∞, −3) ∪ (5, ∞) and C − B = [−2, 2].

Suppose that we are considering a certain universal set U, that is, all sets being discussed are subsets of U. For a set A, its complement is

    Ā = U − A = {x : x ∈ U and x ∉ A}.

If U = Z, then N̄ = {0, −1, −2, . . .}; while if U = R, then Q̄ = I. A Venn diagram for Ā is shown in Figure 1.5. The set difference A − B is sometimes called the relative complement of B in A. Indeed, by definition, A − B = {x : x ∈ A and x ∉ B}. The set A − B can also be expressed in terms of complements, namely A − B = A ∩ B̄. This fact will be established later.

Example 1.14

Let U = {1, 2, . . . , 10} be the universal set, A = {2, 3, 5, 7} and B = {2, 4, 6, 8, 10}. Determine each of the following:

(a) B̄,   (b) A − B,   (c) A ∩ B̄,   (d) the complement of B̄.

Solution

(a) B̄ = {1, 3, 5, 7, 9}.
(b) A − B = {3, 5, 7}.
(c) A ∩ B̄ = {3, 5, 7} = A − B.
(d) The complement of B̄ is B = {2, 4, 6, 8, 10}.

Example 1.15

Let A = {0, {0}, {0, {0}}}.
(a) Determine which of the following are elements of A: 0, {0}, {{0}}.
(b) Determine |A|.

(c) Determine which of the following are subsets of A: 0, {0}, {{0}}.
For (d)-(i), determine the indicated sets.
(d) {0} ∩ A   (e) {{0}} ∩ A   (f) {{{0}}} ∩ A
(g) {0} ∪ A   (h) {{0}} ∪ A   (i) {{{0}}} ∪ A.

Solution

(a) While 0 and {0} are elements of A, {{0}} is not an element of A.
(b) The set A has three elements: 0, {0}, {0, {0}}. Therefore, |A| = 3.
(c) The integer 0 is not a set and so cannot be a subset of A (or a subset of any other set). Since 0 ∈ A and {0} ∈ A, it follows that {0} ⊆ A and {{0}} ⊆ A.
(d) Since 0 is the only element that belongs to both {0} and A, it follows that {0} ∩ A = {0}.
(e) Since {0} is the only element that belongs to both {{0}} and A, it follows that {{0}} ∩ A = {{0}}.
(f) Since {{0}} is not an element of A, it follows that {{{0}}} and A are disjoint sets, and so {{{0}}} ∩ A = ∅.
(g) Since 0 ∈ A, it follows that {0} ∪ A = A.
(h) Since {0} ∈ A, it follows that {{0}} ∪ A = A.
(i) Since {{0}} ∉ A, it follows that {{{0}}} ∪ A = {0, {0}, {{0}}, {0, {0}}}.
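The distinction between "element of" and "subset of" in Example 1.15 can be explored mechanically. In the sketch below (our own illustration, with variable names chosen for this purpose), nested sets are built from `frozenset`, since a Python set can only contain hashable objects; `in` tests membership and `<=` tests the subset relation.

```python
s0 = frozenset({0})            # {0}
s00 = frozenset({s0})          # {{0}}
s000 = frozenset({s00})        # {{{0}}}
s0_s0 = frozenset({0, s0})     # {0, {0}}

A = frozenset({0, s0, s0_s0})  # A = {0, {0}, {0, {0}}}

# (a), (b): membership and cardinality.
assert 0 in A and s0 in A and s00 not in A
assert len(A) == 3

# (c): {0} and {{0}} are subsets of A because 0 and {0} are elements of A.
assert s0 <= A and s00 <= A

# (d)-(f): intersections.
assert s0 & A == s0
assert s00 & A == s00
assert s000 & A == frozenset()

# (g)-(i): unions.
assert s0 | A == A
assert s00 | A == A
assert s000 | A == frozenset({0, s0, s00, s0_s0})
```

Part (c) of the example also notes that the integer 0 is not a set; in this sketch that shows up as `0 <= A` raising a `TypeError` rather than returning a truth value.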

1.4 Indexed Collections of Sets

We will often encounter situations where more than two sets are combined using the set operations described above. In the case of three sets A, B and C, the standard Venn diagram is shown in Figure 1.6.

Figure 1.6 A Venn diagram for three sets

The union A ∪ B ∪ C is defined as

    A ∪ B ∪ C = {x : x ∈ A or x ∈ B or x ∈ C}.

Thus, in order for an element to belong to A ∪ B ∪ C, the element must belong to at least one of the sets A, B and C. Because it is often useful to consider the union of several sets, additional notation is needed. The union of the n ≥ 2 sets A1, A2, . . . , An is denoted by A1 ∪ A2 ∪ · · · ∪ An or ⋃_{i=1}^{n} Ai and is defined as

    ⋃_{i=1}^{n} Ai = {x : x ∈ Ai for some i, 1 ≤ i ≤ n}.

Thus, for an element a to belong to ⋃_{i=1}^{n} Ai, it is necessary that a belong to at least one of the sets A1, A2, . . . , An.

Example 1.16

Let B1 = {1, 2}, B2 = {2, 3}, . . . , B10 = {10, 11}; that is, Bi = {i, i + 1} for i = 1, 2, . . . , 10. Determine each of the following:

(a) ⋃_{i=1}^{5} Bi.   (b) ⋃_{i=1}^{10} Bi.   (c) ⋃_{i=3}^{7} Bi.   (d) ⋃_{i=j}^{k} Bi, where 1 ≤ j ≤ k ≤ 10.

Solution

(a) ⋃_{i=1}^{5} Bi = {1, 2, . . . , 6}.
(b) ⋃_{i=1}^{10} Bi = {1, 2, . . . , 11}.
(c) ⋃_{i=3}^{7} Bi = {3, 4, . . . , 8}.
(d) ⋃_{i=j}^{k} Bi = {j, j + 1, . . . , k + 1}.
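The unions in Example 1.16 can be computed by accumulating the sets Bi one at a time. The sketch below is a hypothetical helper of our own (`union_over` is not notation from the text); it treats the family as a function from an index to a set and checks all four parts of the example, including every pair 1 ≤ j ≤ k ≤ 10 in part (d).

```python
def B(i):
    """The set B_i = {i, i + 1} of Example 1.16."""
    return {i, i + 1}


def union_over(indices, family):
    """Union of family(i) over all i in indices."""
    result = set()
    for i in indices:
        result |= family(i)
    return result


assert union_over(range(1, 6), B) == set(range(1, 7))     # (a) {1, ..., 6}
assert union_over(range(1, 11), B) == set(range(1, 12))   # (b) {1, ..., 11}
assert union_over(range(3, 8), B) == set(range(3, 9))     # (c) {3, ..., 8}

# (d): for 1 <= j <= k <= 10 the union is {j, j + 1, ..., k + 1}.
for j in range(1, 11):
    for k in range(j, 11):
        assert union_over(range(j, k + 1), B) == set(range(j, k + 2))
```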

We are often interested in the intersection of several sets as well. The intersection of the n ≥ 2 sets A1, A2, . . . , An is expressed as A1 ∩ A2 ∩ · · · ∩ An or ⋂_{i=1}^{n} Ai and is defined by

    ⋂_{i=1}^{n} Ai = {x : x ∈ Ai for all i, 1 ≤ i ≤ n}.

The next example concerns the sets mentioned in Example 1.16.

Example 1.17

Let Bi = {i, i + 1} for i = 1, 2, . . . , 10. Determine the following:

(a) ⋂_{i=1}^{10} Bi.   (b) Bi ∩ B_{i+1}.   (c) ⋂_{i=j}^{j+1} Bi, where 1 ≤ j < 10.   (d) ⋂_{i=j}^{k} Bi, where 1 ≤ j < k ≤ 10.

Solution

(a) ⋂_{i=1}^{10} Bi = ∅.
(b) Bi ∩ B_{i+1} = {i + 1}.
(c) ⋂_{i=j}^{j+1} Bi = {j + 1}.
(d) ⋂_{i=j}^{k} Bi = {j + 1} if k = j + 1; while ⋂_{i=j}^{k} Bi = ∅ if k > j + 1.
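The answers in Example 1.17 can be confirmed by intersecting the sets Bi directly. In this sketch the helper `intersection_over` is our own name; it uses `set.intersection`, which needs at least one set, so the index range is assumed to be nonempty, as it is throughout the example.

```python
def B(i):
    """The set B_i = {i, i + 1} of Examples 1.16 and 1.17."""
    return {i, i + 1}


def intersection_over(indices, family):
    """Intersection of family(i) over a nonempty collection of indices."""
    sets = [family(i) for i in indices]
    return set.intersection(*sets)


# (a): no integer belongs to all ten sets.
assert intersection_over(range(1, 11), B) == set()

# (b), (c): consecutive sets share exactly the element i + 1.
for i in range(1, 10):
    assert B(i) & B(i + 1) == {i + 1}
    assert intersection_over(range(i, i + 2), B) == {i + 1}

# (d): for 1 <= j < k <= 10 the intersection is {j + 1} when k = j + 1
# and empty when k > j + 1.
for j in range(1, 10):
    for k in range(j + 1, 11):
        expected = {j + 1} if k == j + 1 else set()
        assert intersection_over(range(j, k + 1), B) == expected
```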

There are instances when the union or intersection of a collection of sets cannot be described conveniently (or perhaps at all) in the manner mentioned above. For this reason, we introduce a (nonempty) set I, called an index set, which is used as a mechanism for selecting the sets we want to consider. For example, for an index set I, suppose that there is a set Sα for each α ∈ I. We write {Sα}_{α∈I} to describe the collection of all sets Sα, where α ∈ I. Such a collection is called an indexed collection of sets. We define the union of the sets in {Sα}_{α∈I} by

    ⋃_{α∈I} Sα = {x : x ∈ Sα for some α ∈ I},

and the intersection of these sets by

    ⋂_{α∈I} Sα = {x : x ∈ Sα for all α ∈ I}.

Hence an element a belongs to ⋃_{α∈I} Sα if a belongs to at least one of the sets in the collection {Sα}_{α∈I}, while a belongs to ⋂_{α∈I} Sα if a belongs to every set in the collection {Sα}_{α∈I}. We refer to ⋃_{α∈I} Sα as the union of the collection {Sα}_{α∈I} and to ⋂_{α∈I} Sα as the intersection of the collection {Sα}_{α∈I}. Just as there is nothing special about our choice of i in ⋃_{i=1}^{n} Ai (that is, we could just as well describe this set by ⋃_{j=1}^{n} Aj, say), there is nothing special about α in ⋃_{α∈I} Sα. We could also describe this set by ⋃_{x∈I} Sx. The variables i and α above are dummy variables, and any appropriate symbol could be used. Indeed, we could write J or some other symbol for an index set.

Example 1.18

For n ∈ N, define Sn = {n, 2n}. For example, S1 = {1, 2}, S2 = {2, 4} and S4 = {4, 8}. Then S1 ∪ S2 ∪ S4 = {1, 2, 4, 8}. We can also describe this set by means of an index set. If we let I = {1, 2, 4}, then

    ⋃_{α∈I} Sα = S1 ∪ S2 ∪ S4.

Example 1.19

For each n ∈ N, define An to be the closed interval [−1/n, 1/n] of real numbers; that is,

    An = {x ∈ R : −1/n ≤ x ≤ 1/n}.

So A1 = [−1, 1], A2 = [−1/2, 1/2], A3 = [−1/3, 1/3], and so on. We have now defined the sets A1, A2, A3, . . . . The union of these sets can be written as A1 ∪ A2 ∪ A3 ∪ · · · or ⋃_{i=1}^{∞} Ai. Using N as an index set, we can also write this union as ⋃_{n∈N} An. Since An ⊆ A1 = [−1, 1] for every n ∈ N, it follows that ⋃_{n∈N} An = [−1, 1]. Certainly, 0 ∈ An for every n ∈ N; in fact, ⋂_{n∈N} An = {0}.

Example 1.20

Let A denote the set of the letters of the alphabet, that is, A = {a, b, . . . , z}. For α ∈ A, let Aα consist of α and the two letters that follow α. So Aa = {a, b, c} and Ab = {b, c, d}. By Ay, we will mean the set {y, z, a} and Az = {z, a, b}. Hence |Aα| = 3 for every α ∈ A. Therefore, ⋃_{α∈A} Aα = A. Indeed, if B = {a, d, g, j, m, p, s, v, y}, then ⋃_{α∈B} Aα = A as well. On the other hand, if I = {p, q, r}, then ⋃_{α∈I} Aα = {p, q, r, s, t}, while ⋂_{α∈I} Aα = {r}.

Example 1.21

Let S = {1, 2, . . . , 10}. Each of the sets S1 = {1, 2, 3, 4}, S2 = {4, 5, 6, 7, 8} and S3 = {7, 8, 9, 10} is a subset of S. Furthermore, S1 ∪ S2 ∪ S3 = S. This union can be described in several ways. Define I = {1, 2, 3} and J = {S1, S2, S3}. Then the union of the three sets belonging to J is precisely S1 ∪ S2 ∪ S3, which can also be written as

    S = S1 ∪ S2 ∪ S3 = ⋃_{i=1}^{3} Si = ⋃_{α∈I} Sα = ⋃_{X∈J} X.
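The unions and intersections over index sets in Examples 1.20 and 1.21 can be checked with a short sketch. The names `A_sub`, `union_over` and `intersection_over` are our own, and the wrap-around from z back to a is modelled with arithmetic modulo 26 on the lowercase English alphabet.

```python
import string

ALPHABET = string.ascii_lowercase  # the set A = {a, b, ..., z}


def A_sub(letter):
    """The set of `letter` and the two letters after it, wrapping past z."""
    i = ALPHABET.index(letter)
    return {ALPHABET[(i + k) % 26] for k in range(3)}


def union_over(index_set, family):
    return set().union(*(family(a) for a in index_set))


def intersection_over(index_set, family):
    # Assumes index_set is nonempty.
    return set.intersection(*(family(a) for a in index_set))


# Example 1.20
assert A_sub("y") == {"y", "z", "a"} and A_sub("z") == {"z", "a", "b"}
assert union_over(ALPHABET, A_sub) == set(ALPHABET)
assert union_over("adgjmpsvy", A_sub) == set(ALPHABET)
assert union_over("pqr", A_sub) == set("pqrst")
assert intersection_over("pqr", A_sub) == {"r"}

# Example 1.21: the same union described by the index set I = {1, 2, 3}
# and by the collection J = {S1, S2, S3} itself.
S_family = {1: {1, 2, 3, 4}, 2: {4, 5, 6, 7, 8}, 3: {7, 8, 9, 10}}
J = {frozenset(s) for s in S_family.values()}
S = set(range(1, 11))
assert union_over({1, 2, 3}, S_family.get) == S
assert set().union(*J) == S
```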

1.5 Partitions of Sets

Recall that two sets are disjoint if their intersection is the empty set. A collection S of subsets of a set A is called pairwise disjoint if every two distinct subsets that belong to S are disjoint. For example, let A = {1, 2, . . . , 7}, B = {1, 6}, C = {2, 5}, D = {4, 7} and S = {B, C, D}. Then S is a pairwise disjoint collection of subsets of A since B ∩ C = B ∩ D = C ∩ D = ∅. On the other hand, let A′ = {1, 2, 3}, B′ = {1, 2}, C′ = {1, 3}, D′ = {2, 3} and S′ = {B′, C′, D′}. Although S′ is a collection of subsets of A′ and B′ ∩ C′ ∩ D′ = ∅, the set S′ is not a pairwise disjoint collection of sets since B′ ∩ C′ ≠ ∅, for example. Indeed, B′ ∩ D′ and C′ ∩ D′ are also nonempty.

We will often have occasion (especially in Chapter 8) to encounter, for a nonempty set A, a collection S of pairwise disjoint nonempty subsets of A with the additional property that every element of A belongs to some subset in S. Such a collection is called a partition of A. A partition of A can also be defined as a collection S of nonempty subsets of A such that every element of A belongs to exactly one subset in S. Furthermore, a partition of A can be defined as a collection S of subsets of A satisfying the three properties:

(1) X ≠ ∅ for every set X ∈ S;
(2) for every two sets X, Y ∈ S, either X = Y or X ∩ Y = ∅;
(3) ⋃_{X∈S} X = A.

Example 1.22

Consider the following collections of subsets of the set A = {1, 2, 3, 4, 5, 6}: S1 = {{1, 3, 6}, {2, 4}, {5}}; S2 = {{1, 2, 3}, {4}, ∅, {5, 6}}; S3 = {{1, 2}, {3, 4, 5}, {5, 6}}; S4 = {{1, 4}, {3, 5}, {2}}. Determine which of these sets are partitions of A.

Solution

The set S1 is a partition of A. The set S2 is not a partition of A since ∅ is one of the elements of S2. The set S3 is not a partition of A either, since the element 5 belongs to two distinct subsets in S3, namely {3, 4, 5} and {5, 6}. Finally, S4 is also not a partition of A because the element 6 belongs to no subset in S4.

As the word partition probably suggests, a partition of a nonempty set A is a division of A into nonempty subsets. The partition S1 of the set A in Example 1.22 is illustrated in the diagram shown in Figure 1.7.

Figure 1.7 A partition of a set

For example, the set Z of integers can be partitioned into the set of even integers and the set of odd integers. The set R of real numbers can be partitioned into the set R+ of positive real numbers, the set of negative real numbers, and the set {0} consisting of the number 0. In addition, R can be partitioned into the set Q of rational numbers and the set I of irrational numbers.
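The three defining properties of a partition translate directly into a small checker. The function below is a sketch under our own naming (`is_partition`); it takes the collection as a list of blocks and tests properties (1) through (3) in turn on the four collections of Example 1.22.

```python
def is_partition(collection, A):
    """Check whether `collection` (an iterable of sets) is a partition of A."""
    blocks = [set(b) for b in collection]
    # (1) Every block must be nonempty.
    if any(len(b) == 0 for b in blocks):
        return False
    # (2) Distinct blocks must be disjoint. The blocks are listed once each,
    #     so any overlap between two list entries is a violation.
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            if blocks[i] & blocks[j]:
                return False
    # (3) The union of the blocks must be all of A.
    return set().union(*blocks) == set(A)


A = {1, 2, 3, 4, 5, 6}
S1 = [{1, 3, 6}, {2, 4}, {5}]
S2 = [{1, 2, 3}, {4}, set(), {5, 6}]
S3 = [{1, 2}, {3, 4, 5}, {5, 6}]
S4 = [{1, 4}, {3, 5}, {2}]

print([is_partition(S, A) for S in (S1, S2, S3, S4)])
```

Only S1 passes all three tests: S2 fails (1), S3 fails (2), and S4 fails (3), which matches the solution above.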

Example 1.23

Let A = {1, 2, . . . , 12}.
(a) Give an example of a partition S of A such that |S| = 5.
(b) Give an example of a subset T of the partition S in (a) such that |T| = 3.
(c) List all those elements B in the partition S in (a) such that |B| = 2.

Solution

(a) We are seeking a partition S of A consisting of five subsets. One such example is S = {{1, 2}, {3, 4}, {5, 6}, {7, 8, 9}, {10, 11, 12}}.
(b) We are seeking a subset T of S (given in (a)) consisting of three elements. One such example is T = {{1, 2}, {3, 4}, {7, 8, 9}}.
(c) We have been asked to list all those elements of S (given in (a)) consisting of two elements of A. These elements are: {1, 2}, {3, 4}, {5, 6}.

1.6 Cartesian Products of Sets

We have already mentioned that when a set A is described by listing its elements, the order in which the elements of A are listed doesn't matter. That is, if the set A consists of two elements x and y, then A = {x, y} = {y, x}. When we speak of the ordered pair (x, y), however, this is another story. The ordered pair (x, y) is a single element consisting of a pair of elements in which x is the first element (or first coordinate) of the ordered pair (x, y) and y is the second element (or second coordinate). Moreover, for two ordered pairs (x, y) and (w, z) to be equal, that is, (x, y) = (w, z), we must have x = w and y = z. So, if x ≠ y, then (x, y) ≠ (y, x).

The Cartesian product (or simply the product) A × B of two sets A and B is the set consisting of all ordered pairs whose first coordinate belongs to A and whose second coordinate belongs to B. In other words,

    A × B = {(a, b) : a ∈ A and b ∈ B}.

Example 1.24

If A = {x, y} and B = {1, 2, 3}, then

    A × B = {(x, 1), (x, 2), (x, 3), (y, 1), (y, 2), (y, 3)},

while

    B × A = {(1, x), (1, y), (2, x), (2, y), (3, x), (3, y)}.

Since, for example, (x, 1) ∈ A × B and (x, 1) ∉ B × A, these two sets do not contain the same elements; so A × B ≠ B × A. Also,

    A × A = {(x, x), (x, y), (y, x), (y, y)}

and

    B × B = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)}.

We also note that if A = ∅ or B = ∅, then A × B = ∅. The Cartesian product R × R is the set of all points in the Euclidean plane. For example, the graph of the straight line y = 2x + 3 is the set {(x, y) ∈ R × R : y = 2x + 3}. For the sets A = {x, y} and B = {1, 2, 3} given in Example 1.24, |A| = 2 and |B| = 3, while |A × B| = 6. Indeed, for all finite sets A and B, |A × B| = |A| · |B|. Cartesian products will be explored in more detail in Chapter 7.
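Cartesian products of finite sets can be generated with `itertools.product` from Python's standard library, with tuples playing the role of ordered pairs. The sketch below revisits Example 1.24; the strings "x" and "y" stand in for the elements x and y, which is our own modelling choice.

```python
from itertools import product

A = {"x", "y"}
B = {1, 2, 3}

AxB = set(product(A, B))  # all pairs (a, b) with a in A and b in B
BxA = set(product(B, A))

# Order matters in an ordered pair, so A x B and B x A differ.
assert ("x", "y") != ("y", "x")
assert ("x", 1) in AxB and ("x", 1) not in BxA
assert AxB != BxA

# |A x B| = |A| * |B| for finite sets.
assert len(AxB) == len(A) * len(B) == 6
assert len(set(product(B, B))) == 9

# If either factor is empty, so is the product.
assert set(product(set(), B)) == set()
```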

EXERCISES FOR CHAPTER 1

Section 1.1: Describing a Set

1.1. Which of the following are sets?
(a) 1, 2, 3
(b) {1, 2}, 3
(c) {{1}, 2}, 3
(d) {1, {2}, 3}
(e) {1, 2, a, b}.

1.2. Let S = {−2, −1, 0, 1, 2, 3}. Describe each of the following sets as {x ∈ S : p(x)}, where p(x) is some condition on x.
(a) A = {1, 2, 3}
(b) B = {0, 1, 2, 3}
(c) C = {−2, −1}
(d) D = {−2, 2, 3}

1.3. Determine the cardinality of each of the following sets:
(a) A = {1, 2, 3, 4, 5}
(b) B = {0, 2, 4, . . . , 20}
(c) C = {25, 26, 27, . . . , 75}
(d) D = {{1, 2}, {1, 2, 3, 4}}
(e) E = {∅}
(f) F = {2, {2, 3, 4}}

1.4. Write each of the following sets by listing its elements within braces.
(a) A = {n ∈ Z : −4 < n ≤ 4}
(b) B = {n ∈ Z : n^2 < 5}
(c) C = {n ∈ N : n^3 < 100}
(d) D = {x ∈ R : x^2 − x = 0}
(e) E = {x ∈ R : x^2 + 1 = 0}

1.5. Write each of the following sets in the form {x ∈ Z : p(x)}, where p(x) is a property concerning x.
(a) A = {−1, −2, −3, . . .}
(b) B = {−3, −2, . . . , 3}
(c) C = {−2, −1, 1, 2}

1.6. The set E = {2x : x ∈ Z} can be described by listing its elements, namely E = {. . . , −4, −2, 0, 2, 4, . . .}. List the elements of the following sets in a similar manner.
(a) A = {2x + 1 : x ∈ Z}
(b) B = {4n : n ∈ Z}
(c) C = {3q + 1 : q ∈ Z}

1.7. The set E = {. . . , −4, −2, 0, 2, 4, . . .} of even integers can be described by means of a defining condition by E = {y : y = 2x for some x ∈ Z} = {2x : x ∈ Z}. Describe the following sets in a similar manner.
(a) A = {. . . , −4, −1, 2, 5, 8, . . .}
(b) B = {. . . , −10, −5, 0, 5, 10, . . .}
(c) C = {1, 8, 27, 64, 125, . . .}

1.8. Let A = {n ∈ Z : 2 ≤ |n| < 4}, B = {x ∈ Q : 2 < x ≤ 4}, C = {x ∈ R : x^2 − (2 + √2)x + 2√2 = 0} and D = {x ∈ Q : x^2 − (2 + √2)x + 2√2 = 0}.
(a) Describe the set A by listing its elements.
(b) Give an example of three elements that belong to B but do not belong to A.
(c) Describe the set C by listing its elements.
(d) Describe the set D in another manner.
(e) Determine the cardinality of each of the sets A, C and D.

1.9. For A = {2, 3, 5, 7, 8, 10, 13}, let B = {x ∈ A : x = y + z, where y, z ∈ A} and C = {r ∈ B : r + s ∈ B for some s ∈ B}. Determine C.

Section 1.2: Subsets

1.10. Give examples of three sets A, B and C such that
(a) A ⊆ B ⊂ C
(b) A ∈ B, B ∈ C and A ∉ C
(c) A ∈ B and A ⊂ C.

1.11. Let (a, b) be an open interval of real numbers and let c ∈ (a, b). Describe an open interval I centered at c such that I ⊆ (a, b).

1.12. Which of the following sets are equal?
    A = {n ∈ Z : |n| < 2}
    B = {n ∈ Z : n^3 = n}
    C = {n ∈ Z : n^2 ≤ n}
    D = {n ∈ Z : n^2 ≤ 1}
    E = {−1, 0, 1}.

1.13. For a universal set U = {1, 2, . . . , 8} and two sets A = {1, 3, 4, 7} and B = {4, 5, 8}, draw a Venn diagram that represents these sets.

1.14. Find P(A) and |P(A)| for
(a) A = {1, 2}.
(b) A = {∅, 1, {a}}.

1.15. Find P(A) for A = {0, {0}}.

1.16. Find P(P({1})) and its cardinality.

1.17. Find P(A) and |P(A)| for A = {0, ∅, {∅}}.

1.18. Find P(A) for A = {x : x = 0 or x ∈ P({0})}.

1.19. Give an example of a set S such that
(a) S ⊆ P(N)
(b) S ∈ P(N)
(c) S ⊆ P(N) and |S| = 5
(d) S ∈ P(N) and |S| = 5

1.20. Determine whether the following statements are true or false.
(a) If {1} ∈ P(A), then 1 ∈ A but {1} ∉ A.
(b) If A, B and C are sets such that A ⊂ P(B) ⊂ C and |A| = 2, then |C| can be 5 but |C| cannot be 4.
(c) If a set B has one more element than a set A, then P(B) has at least two more elements than P(A).
(d) If four sets A, B, C and D are subsets of {1, 2, 3} such that |A| = |B| = |C| = |D| = 2, then at least two of these sets are equal.

1.21. Three subsets A, B and C of {1, 2, 3, 4, 5} have the same cardinality. Furthermore,
(a) 1 belongs to A and B but not to C.
(b) 2 belongs to A and C but not to B.
(c) 3 belongs to A and exactly one of B and C.
(d) 4 belongs to an even number of A, B and C.
(e) 5 belongs to an odd number of A, B and C.
(f) The sums of the elements in two of the sets A, B and C differ by 1.
What is B?

Section 1.3: Set Operations

1.22. Let U = {1, 3, . . . , 15} be the universal set, A = {1, 5, 9, 13} and B = {3, 9, 15}. Determine the following:
(a) A ∪ B   (b) A ∩ B   (c) A − B   (d) B − A   (e) Ā   (f) A ∩ B̄.

1.23. Give examples of two sets A and B such that |A − B| = |A ∩ B| = |B − A| = 3. Draw the accompanying Venn diagram.

1.24. Give examples of three sets A, B and C such that B ≠ C but B − A = C − A.

1.25. Give examples of three sets A, B and C such that
(a) A ∈ B, A ⊆ C and B ⊆ C
(b) B ∈ A, B ⊂ C and A ∩ C = ∅
(c) A ∈ B, B ⊆ C and A ⊆ C.

1.26. Let U be a universal set and let A and B be two subsets of U. Draw a Venn diagram for each of the following sets.
(a) the complement of A ∪ B   (b) Ā ∩ B̄   (c) the complement of A ∩ B   (d) Ā ∪ B̄.
What can you say about parts (a) and (b)? Parts (c) and (d)?

1.27. Give an example of a universal set U, two sets A and B, and an accompanying Venn diagram such that |A ∩ B| = |A − B| = |B − A| = 2 and the complement of A ∪ B also has exactly two elements.

1.28. Let A, B and C be nonempty subsets of a universal set U. Draw a Venn diagram for each of the following set operations.
(a) (C − B) ∪ A   (b) C ∩ (A − B).

1.29. Let A = {∅, {∅}, {{∅}}}.
(a) Determine which of the following are elements of A: ∅, {∅}, {∅, {∅}}.
(b) Determine |A|.
(c) Determine which of the following are subsets of A: ∅, {∅}, {∅, {∅}}.
For (d)-(i), determine the indicated sets.
(d) ∅ ∩ A   (e) {∅} ∩ A   (f) {∅, {∅}} ∩ A
(g) ∅ ∪ A   (h) {∅} ∪ A   (i) {∅, {∅}} ∪ A.

1.30. Let A = {x ∈ R : |x − 1| ≤ 2}, B = {x ∈ R : |x| ≥ 1} and C = {x ∈ R : |x + 2| ≤ 3}.
(a) Express A, B and C using interval notation.
(b) Determine each of the following sets using interval notation: A ∪ B, A ∩ B, B ∩ C, B − C.

1.31. Give an example of four different sets A, B, C and D such that (1) A ∪ B = {1, 2} and C ∩ D = {2, 3} and (2) if B and C are interchanged and ∪ and ∩ are interchanged, then we get the same result.


1.32. Give an example of four distinct subsets A, B, C and D of {1, 2, 3, 4} such that all intersections of two subsets are distinct.
1.33. Give an example of two nonempty sets A and B such that {A ∪ B, A ∩ B, A − B, B − A} is the power set of some set.
1.34. Give an example of two subsets A and B of {1, 2, 3} such that all of the following sets are distinct: A ∪ B, A ∪ $\overline{B}$, $\overline{A}$ ∪ B, $\overline{A}$ ∪ $\overline{B}$, A ∩ B, A ∩ $\overline{B}$, $\overline{A}$ ∩ B, $\overline{A}$ ∩ $\overline{B}$.
1.35. Give examples of a universal set U and sets A, B and C such that each of the following sets contains exactly one element: A ∩ B ∩ C, (A ∩ B) − C, (A ∩ C) − B, (B ∩ C) − A, A − (B ∪ C), B − (A ∪ C), C − (A ∪ B), $\overline{A \cup B \cup C}$. Draw the accompanying Venn diagram.

Section 1.4: Indexed Collections of Sets

1.36. For a real number r, define S_r to be the interval [r − 1, r + 2]. Let A = {1, 3, 4}. Determine $\bigcup_{\alpha \in A} S_\alpha$ and $\bigcap_{\alpha \in A} S_\alpha$.
1.37. Let A = {1, 2, 5}, B = {0, 2, 4}, C = {2, 3, 4} and S = {A, B, C}. Determine $\bigcup_{X \in S} X$ and $\bigcap_{X \in S} X$.
1.38. For a real number r, define A_r = {r^2}, B_r as the closed interval [r − 1, r + 1] and C_r as the interval (r, ∞). For S = {1, 2, 4}, determine (a) $\bigcup_{\alpha \in S} A_\alpha$ and $\bigcap_{\alpha \in S} A_\alpha$ (b) $\bigcup_{\alpha \in S} B_\alpha$ and $\bigcap_{\alpha \in S} B_\alpha$ (c) $\bigcup_{\alpha \in S} C_\alpha$ and $\bigcap_{\alpha \in S} C_\alpha$.
1.39. Let A = {a, b, . . . , z} be the set consisting of the letters of the alphabet. For α ∈ A, let A_α consist of α and the two letters that follow it, where A_y = {y, z, a} and A_z = {z, a, b}. Find a set S ⊆ A of smallest cardinality such that $\bigcup_{\alpha \in S} A_\alpha = A$. Explain why your set S has the required properties.
1.40. For i ∈ Z, let A_i = {i − 1, i + 1}. Determine the following: (a) $\bigcup_{i=1}^{5} A_{2i}$ (b) $\bigcup_{i=1}^{5} (A_i \cap A_{i+1})$ (c) $\bigcup_{i=1}^{5} (A_{2i-1} \cap A_{2i+1})$.

1.41. For each of the following, find an indexed collection {A_n}_{n∈N} of distinct sets (that is, no two sets are equal) satisfying the given conditions. (a) $\bigcap_{n=1}^{\infty} A_n = \{0\}$ and $\bigcup_{n=1}^{\infty} A_n = [0, 1]$ (b) $\bigcap_{n=1}^{\infty} A_n = \{-1, 0, 1\}$ and $\bigcup_{n=1}^{\infty} A_n = \mathbf{Z}$.
1.42. For each of the following collections of sets, define a set A_n for each n ∈ N such that the indexed collection {A_n}_{n∈N} is precisely the given collection of sets. Then find both the union and the intersection of the indexed collection of sets. (a) {[1, 2 + 1), [1, 2 + 1/2), [1, 2 + 1/3), . . .} (b) {(−1, 2), (−3/2, 4), (−5/3, 6), (−7/4, 8), . . .}.
1.43. For r ∈ R⁺, let A_r = {x ∈ R : |x| < r}. Determine $\bigcup_{r \in \mathbf{R}^+} A_r$ and $\bigcap_{r \in \mathbf{R}^+} A_r$.

1.44. Each of the following sets is a subset of A = {1, 2, . . . , 10}: A1 = {1, 5, 7, 9, 10}, A2 = {1, 2, 3, 8, 9}, A3 = {2, 4, 6, 8, 9}, A4 = {2, 4, 8}, A5 = {3, 6, 7}, A6 = {3, 8, 10}, A7 = {4, 5, 7, 9}, A8 = {4, 5, 10}, A9 = {4, 6, 8}, A10 = {5, 6, 10}, A11 = {5, 8, 9}, A12 = {6, 7, 10}, A13 = {6, 8, 9}. Find a set I ⊆ {1, 2, . . . , 13} such that for every two distinct elements j, k ∈ I, A_j ∩ A_k = ∅ and $\left|\bigcup_{i \in I} A_i\right|$ is maximum.
1.45. For n ∈ N, let A_n = (−1/n, 2 − 1/n). Determine $\bigcup_{n \in \mathbf{N}} A_n$ and $\bigcap_{n \in \mathbf{N}} A_n$.


Section 1.5: Partitions of Sets

1.46. Which of the following are partitions of A = {a, b, c, d, e, f, g}? For each collection of subsets that is not a partition of A, explain your answer. (a) S1 = {{a, c, e, g}, {b, f}, {d}} (b) S2 = {{a, b, c, d}, {e, f}} (c) S3 = {A} (d) S4 = {{a}, ∅, {b, c, d}, {e, f, g}} (e) S5 = {{a, c, d}, {b, g}, {e}, {b, f}}.
1.47. Which of the following sets are partitions of A = {1, 2, 3, 4, 5}? (a) S1 = {{1, 3}, {2, 5}} (b) S2 = {{1, 2}, {3, 4, 5}} (c) S3 = {{1, 2}, {2, 3}, {3, 4}, {4, 5}} (d) S4 = A.
1.48. Let A = {1, 2, 3, 4, 5, 6}. Give an example of a partition S of A such that |S| = 3.
1.49. Give an example of a set A with |A| = 4 and two disjoint partitions S1 and S2 of A with |S1| = |S2| = 3.
1.50. Give an example of a partition of N into three subsets.
1.51. Give an example of a partition of Q into three subsets.
1.52. Give an example of three sets A, S1 and S2 such that S1 is a partition of A, S2 is a partition of S1 and |S2| < |S1| < |A|.
1.53. Give an example of a partition of Z into four subsets.
1.54. Let A = {1, 2, . . . , 12}. Give an example of a partition S of A satisfying the following requirements: (i) |S| = 5, (ii) there is a subset T of S such that |T| = 4 and |∪_{X∈T} X| = 10 and (iii) there is no element B ∈ S such that |B| = 3.
1.55. A set S is partitioned into two subsets S1 and S2. This produces a partition P1 of S, where P1 = {S1, S2} and so |P1| = 2. One of the sets in P1 is then partitioned into two subsets, producing a partition P2 of S with |P2| = 3. A total of |P1| sets in P2 are then partitioned into two subsets each, producing a partition P3 of S. Next, a total of |P2| sets in P3 are partitioned into two subsets each, producing a partition P4 of S. This is continued until a partition P6 of S is produced. What is |P6|?
1.56. We mentioned that there are three ways that a collection S of subsets of a nonempty set A is defined to be a partition of A.
Definition 2 The collection S consists of nonempty subsets of A and every element of A belongs to exactly one subset in S.
Definition 3 The collection S consists of subsets of A satisfying the three properties (1) every subset in S is nonempty, (2) every two subsets in S are equal or disjoint and (3) the union of all subsets in S is A.
(a) Show that any collection S of subsets of A satisfying Definition 1 satisfies Definition 2. (b) Show that any collection S of subsets of A satisfying Definition 2 satisfies Definition 3. (c) Show that any collection S of subsets of A satisfying Definition 3 satisfies Definition 1.

Section 1.6: Cartesian Products of Sets

1.57. Let A = {x, y, z} and B = {x, y}. Determine A × B.
1.58. Let A = {1, {1}, {{1}}}. Determine A × A.
1.59. For A = {a, b}, determine A × P(A).
1.60. For A = {∅, {∅}}, determine A × P(A).
1.61. For A = {1, 2} and B = {∅}, determine A × B and P(A) × P(B).
1.62. Describe the graph of the circle whose equation is x^2 + y^2 = 4 as a subset of R × R.


1.63. List the elements of the set S = {(x, y) ∈ Z × Z : |x| + |y| = 3}. Plot the corresponding points in the Euclidean xy-plane.
1.64. For A = {1, 2} and B = {1}, determine P(A × B).
1.65. For A = {x ∈ R : |x − 1| ≤ 2} and B = {y ∈ R : |y − 4| ≤ 2}, give a geometric description of the points in the xy-plane belonging to A × B.
1.66. For A = {a ∈ R : |a| ≤ 1} and B = {b ∈ R : |b| = 1}, give a geometric description of the points in the xy-plane belonging to (A × B) ∪ (B × A).

ADDITIONAL EXERCISES FOR CHAPTER 1

1.67. The set T = {2k + 1 : k ∈ Z} can be described as T = {. . . , −3, −1, 1, 3, . . .}. Describe the following sets in a similar manner. (a) A = {4k + 3 : k ∈ Z} (b) B = {5k − 1 : k ∈ Z}.
1.68. Let S = {−10, −9, . . . , 9, 10}. Describe each of the following sets as {x ∈ S : p(x)}, where p(x) is some condition on x. (a) A = {−10, −9, . . . , −1, 1, . . . , 9, 10} (b) B = {−10, −9, . . . , −1, 0} (c) C = {−5, −4, . . . , 0, 1, . . . , 7} (d) D = {−10, −9, . . . , 4, 6, 7, . . . , 10}.
1.69. Describe each of the following sets by listing its elements within braces. (a) {x ∈ Z : x^3 − 4x = 0} (b) {x ∈ R : |x| = −1} (c) {m ∈ N : 2 < m ≤ 5} (d) {n ∈ N : 0 ≤ n ≤ 3} (e) {k ∈ Q : k^2 − 4 = 0} (f) {k ∈ Z : 9k^2 − 3 = 0} (g) {k ∈ Z : 1 ≤ k^2 ≤ 10}.
1.70. Determine the cardinality of each of the following sets. (a) A = {1, 2, 3, {1, 2, 3}, 4, {4}} (b) B = {x ∈ R : |x| = −1} (c) C = {m ∈ N : 2 < m ≤ 5} (d) D = {n ∈ N : n < 0} (e) E = {k ∈ N : 1 ≤ k^2 ≤ 100} (f) F = {k ∈ Z : 1 ≤ k^2 ≤ 100}.
1.71. For A = {−1, 0, 1} and B = {x, y}, determine A × B.
1.72. Let U = {1, 2, 3} be the universal set and let A = {1, 2}, B = {2, 3} and C = {1, 3}. Determine the following. (a) (A ∪ B) − (B ∩ C) (b) $\overline{A}$ (c) B ∪ C (d) A × B.
1.73. Let A = {1, 2, . . . , 10}. Give an example of two sets S and B such that S ⊆ P(A), |S| = 4, B ∈ S and |B| = 2.
1.74. For A = {1} and C = {1, 2}, give an example of a set B such that P(A) ⊂ B ⊂ P(C).
1.75. Give examples of two sets A and B such that A ∩ P(A) ∈ B and P(A) ⊆ A ∪ B.


1.76. Which of the following sets are equal? A = {n ∈ Z : −4 ≤ n ≤ 4}, B = {x ∈ N : 2x + 2 = 0}, C = {x ∈ Z : 3x − 2 = 0}, D = {x ∈ Z : x^3 = 4x}, E = {−2, 0, 2}.
1.77. Let A and B be subsets of some unknown universal set U. Suppose that $\overline{A}$ = {3, 8, 9}, A − B = {1, 2}, B − A = {8} and A ∩ B = {5, 7}. Determine U, A and B.
1.78. Let I denote the interval [0, ∞). For each r ∈ I, define
A_r = {(x, y) ∈ R × R : x^2 + y^2 = r^2},
B_r = {(x, y) ∈ R × R : x^2 + y^2 ≤ r^2},
C_r = {(x, y) ∈ R × R : x^2 + y^2 > r^2}.
(a) Determine $\bigcup_{r \in I} A_r$ and $\bigcap_{r \in I} A_r$. (b) Determine $\bigcup_{r \in I} B_r$ and $\bigcap_{r \in I} B_r$. (c) Determine $\bigcup_{r \in I} C_r$ and $\bigcap_{r \in I} C_r$.
1.79. Give an example of four sets A1, A2, A3, A4 such that |A_i ∩ A_j| = |i − j| for every two integers i and j with 1 ≤ i < j ≤ 4.
1.80. (a) Give an example of two problems suggested by Exercise 1.79 (above). (b) Solve one of the problems in (a).
1.81. Let A = {1, 2, 3}, B = {1, 2, 3, 4} and C = {1, 2, 3, 4, 5}. For the sets S and T described below, explain whether |S| < |T|, |S| > |T| or |S| = |T|. (a) Let B be the universal set and let S be the set of all subsets X of B for which |X| = |$\overline{X}$|. Let T be the set of 2-element subsets of C. (b) Let S be the set of all partitions of the set A and let T be the set of 4-element subsets of C. (c) Let S = {(b, a) : b ∈ B, a ∈ A, a + b is odd} and let T be the set of all nonempty proper subsets of A.
1.82. Give an example of a set A = {1, 2, . . . , k} for a smallest k ∈ N containing subsets A1, A2, A3 such that |A_i − A_j| = |A_j − A_i| = |i − j| for every two integers i and j with 1 ≤ i < j ≤ 3.
1.83. (a) For A = {−3, −2, . . . , 4} and B = {1, 2, . . . , 6}, determine S = {(a, b) ∈ A × B : a^2 + b^2 = 25}. (b) For C = {a ∈ B : (a, b) ∈ S} and D = {b ∈ A : (a, b) ∈ S}, where A, B, S are the sets in (a), determine C × D.
1.84. For A = {1, 2, 3}, let B be the set of 2-element sets belonging to P(A) and let C be the set consisting of the sets that are the intersections of two distinct elements of B. Determine D = P(C).
1.85. For a real number r, let A_r = {r, r + 1}. Let S = {x ∈ R : x^2 + 2x − 1 = 0}. (a) Determine B = A_s × A_t for the distinct elements s, t ∈ S, where s < t. (b) Let C = {ab : (a, b) ∈ B}. Determine the sum of the elements of C.

2

Logic

In mathematics, our goal is to seek the truth. Are there connections between two given mathematical concepts? If so, what are they? Under what conditions does an object have a certain property? Finding answers to questions like these is important, but we cannot stop there. We need to be sure that we are right and that our explanation of why we believe we are right will convince others. The reasoning we use to get from what we know to what we want to show must be logical. It has to make sense to others, not just to ourselves. However, there is a shared responsibility here. It is the writer's responsibility to use the rules of logic to provide a valid and clear argument with enough detail for the reader to understand what has been written and be persuaded. It is the reader's responsibility to know the basics of logic and to study the concepts involved sufficiently well that he or she is not only able to understand a well-presented argument, but also to decide whether it is valid. Consequently, both the writer and the reader must be familiar with logic. While it is possible to spend a great deal of time studying logic, we will present only what we really need and instead spend most of our time putting what we have learned into practice.

2.1 Statements

In mathematics we are constantly dealing with statements. By a statement we mean a declarative sentence or assertion that is either true or false (but not both). Statements therefore declare or assert the truth of something. The statements that interest us primarily deal with mathematics, of course. For example, the sentences

The integer 3 is odd.
The integer 57 is prime.

are statements (only the first of which is true). Every statement has a truth value, namely true (denoted by T) or false (denoted by F). We often use P, Q and R to denote statements, or perhaps P1, P2, . . . , Pn if there are several statements involved. We have seen that

P1: The integer 3 is odd.

and

P2: The integer 57 is prime.

are statements, where P1 has truth value T and P2 has truth value F. Sentences that are imperative (commands) such as

Substitute the number 2 for x.
Find the derivative of f(x) = e^(−x) cos 2x.

or are interrogative (questions) such as

Are these sets disjoint?
What is the derivative of f(x) = e^(−x) cos 2x?

or are exclamatory such as

What an interesting question!
How difficult this problem is!

are not statements, since these sentences are not declarative. It may not be immediately clear whether a statement is true or false. For example, the sentence "The 100th digit in the decimal expansion of π is 7." is a statement, but it may be necessary to look this information up (on an Internet site, say) to determine whether the statement is true. Indeed, for a sentence to be a statement, it is not necessary that we be able to determine its truth value.

The sentence "The real number r is rational." is a statement provided we know which real number r is being referred to. Without this additional information, however, it is impossible to assign a truth value to it. This is an example of what is often referred to as an open sentence. In general, an open sentence is a declarative sentence that contains one or more variables, each variable representing a value in some prescribed set, called the domain of the variable, and which becomes a statement when values from their respective domains are substituted for these variables. For example, the open sentence "3x = 12", where the domain of x is the set of integers, is a true statement only when x = 4. An open sentence that contains a variable x is typically represented by P(x), Q(x) or R(x). If P(x) is an open sentence, where the domain of x is S, then we say that P(x) is an open sentence over the domain S. Also, P(x) is a statement for each x ∈ S.
For example, the open sentence P(x): (x − 3)^2 ≤ 1 over the domain Z is a true statement when x ∈ {2, 3, 4} and is a false statement otherwise.

P
T
F

Q
T
F

P  Q
T  T
T  F
F  T
F  F

P  Q  R
T  T  T
T  T  F
T  F  T
T  F  F
F  T  T
F  T  F
F  F  T
F  F  F

Figure 2.1 Truth tables for one, two and three statements

Example 2.1

For the open sentence P(x, y): |x + 1| + |y| = 1 in two variables, suppose that the domain of the variable x is S = {−2, −1, 0, 1} and the domain of the variable y is T = {−1, 0, 1}. Then P(−1, 1): |(−1) + 1| + |1| = 1 is a true statement, while P(1, −1): |1 + 1| + |−1| = 1 is a false statement. In fact, P(x, y) is a true statement when (x, y) ∈ {(−2, 0), (−1, −1), (−1, 1), (0, 0)}, while P(x, y) is a false statement for all other elements (x, y) ∈ S × T.
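The claim in Example 2.1 can be checked mechanically by evaluating the open sentence at every element of S × T. The following is a minimal Python sketch, not part of the text; the names `P`, `S`, `T` and `true_pairs` are illustrative choices that simply mirror the notation of the example.

```python
from itertools import product

# Domains from Example 2.1.
S = [-2, -1, 0, 1]   # domain of x
T = [-1, 0, 1]       # domain of y

def P(x, y):
    """Open sentence P(x, y): |x + 1| + |y| = 1."""
    return abs(x + 1) + abs(y) == 1

# Collect every pair in S x T for which P(x, y) is a true statement.
true_pairs = [(x, y) for x, y in product(S, T) if P(x, y)]
print(true_pairs)
```

Each call `P(x, y)` corresponds to substituting values for the variables, which turns the open sentence into a statement with a definite truth value.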

The possible truth values of a statement are often listed in a table, called a truth table. The truth tables for two statements P and Q are given in Figure 2.1. Since there are two possible truth values for each of P and Q, there are four possible combinations of truth values for P and Q. The truth table showing all of these combinations is also given in Figure 2.1. If a third statement R is involved, then there are eight possible combinations of truth values for P, Q and R. This is displayed in Figure 2.1 as well. In general, a truth table involving n statements P1, P2, . . . , Pn contains 2^n possible combinations of truth values for these statements, and a truth table showing these combinations would have n columns and 2^n rows. Most of the time we will be dealing with two statements, usually denoted by P and Q; the associated truth table then has four rows, with the first two columns headed by P and Q. In this case, it is customary to consider the four combinations of truth values in the order TT, TF, FT, FF, from top to bottom.
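The count of 2^n rows and the customary ordering TT, TF, FT, FF can be illustrated with a short Python sketch. This is only an illustration under the assumption that T and F are represented by `True` and `False`; the helper name `truth_rows` is hypothetical.

```python
from itertools import product

def truth_rows(n):
    """Return all combinations of truth values for n statements.

    Listing True before False reproduces the customary order
    TT, TF, FT, FF for two statements.
    """
    return list(product([True, False], repeat=n))

# Print the four rows for two statements P and Q.
for row in truth_rows(2):
    print(" ".join("T" if value else "F" for value in row))

# The number of rows doubles with each additional statement.
for n in range(1, 5):
    print(n, len(truth_rows(n)))
```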

2.2 The Negation of a Statement

Much of the interest in integers and other familiar sets of numbers comes not only from the numbers themselves but from the properties of the numbers that result from performing operations on them (such as taking their negatives, adding or multiplying them, or combinations of these). Likewise, much of our interest in statements comes from investigating the truth or falsity of new statements that can be produced from one or more given statements by performing certain operations on them. Our first example concerns producing a new statement from a single given statement. The negation of a statement P is the statement

not P.

and is denoted by ∼P. Although ∼P could always be expressed as

It is not the case that P.

there are usually better ways to express the statement ∼P.

Example 2.2

For the statement P1: The integer 3 is odd. described above, we have ∼P1: It is not the case that the integer 3 is odd. but it would be better to write ∼P1: The integer 3 is not odd. or better yet, to write ∼P1: The integer 3 is even. Similarly, the negation of the statement P2: The integer 57 is prime. considered above is ∼P2: The integer 57 is not prime. Note that ∼P1 is false, while ∼P2 is true.

In fact, the negation of a true statement is always false, and the negation of a false statement is always true; that is, the truth value of ∼P is opposite to that of P. This is summarized in Figure 2.2, which gives the truth table for ∼P (in terms of the possible truth values of P).

P   ∼P
T   F
F   T

Figure 2.2 The truth table for negation


2.3 The Disjunction and Conjunction of Statements

For two given statements P and Q, a common way to produce a new statement from them is to insert the word "or" or "and" between P and Q. The disjunction of the statements P and Q is the statement

P or Q

and is denoted by P ∨ Q. The disjunction P ∨ Q is true if at least one of P and Q is true; otherwise, P ∨ Q is false. Hence P ∨ Q is true if exactly one of P and Q is true or if both P and Q are true.

Example 2.3

For the statements P1: The integer 3 is odd. and P2: The integer 57 is prime. described earlier, the disjunction is the new statement P1 ∨ P2: Either 3 is odd or 57 is prime. which is true since at least one of P1 and P2 is true (namely, P1 is true). Of course, in this case exactly one of P1 and P2 is true.

For two statements P and Q, the truth table for P ∨ Q is shown in Figure 2.3. This truth table then describes precisely when P ∨ Q is true (or false). Although the truth of "P or Q" allows for both P and Q to be true, there are instances when the use of "or" does not allow that possibility. For example, for an integer n, if we were to say "n is even or n is odd", then it is certainly not possible for both "n is even" and "n is odd" to be true. When "or" is used in this manner, it is called the exclusive or. As another example, suppose that P = {S1, S2, . . . , Sk}, where k ≥ 2, is a partition of a set S and x is some element of S. If x ∈ S1 or x ∈ S2 is true, then it is impossible for both x ∈ S1 and x ∈ S2 to be true.

P   Q   P ∨ Q
T   T   T
T   F   T
F   T   T
F   F   F

Figure 2.3 The truth table for disjunction

P   Q   P ∧ Q
T   T   T
T   F   F
F   T   F
F   F   F

Figure 2.4 The truth table for conjunction

The conjunction of the statements P and Q is the statement

P and Q

and is denoted by P ∧ Q. The conjunction P ∧ Q is true only when both P and Q are true; otherwise, P ∧ Q is false.

Example 2.4

For P1: The integer 3 is odd. and P2: The integer 57 is prime., the statement P1 ∧ P2: 3 is odd and 57 is prime. is false since P2 is false and so not both P1 and P2 are true.

The truth table for the conjunction of two statements is shown in Figure 2.4.
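The truth tables for disjunction and conjunction can be reproduced in a few lines of Python, whose built-in operators `or` and `and` behave like ∨ and ∧ on the values `True` and `False`. The sketch below is only an illustration; the names `rows` and `label` are hypothetical, and the extra column for the exclusive or uses `p != q` as one convenient way to express "exactly one of P and Q is true".

```python
from itertools import product

def label(value):
    """Render a Boolean as the letter used in the truth tables."""
    return "T" if value else "F"

# One tuple per row: P, Q, P or Q, P and Q, exclusive or.
rows = [(p, q, p or q, p and q, p != q)
        for p, q in product([True, False], repeat=2)]

print("P Q | or and xor")
for p, q, disj, conj, xor in rows:
    print(label(p), label(q), "|", label(disj), " ", label(conj), " ", label(xor))
```

The only row where the ordinary (inclusive) "or" and the exclusive or differ is the first one, in which both P and Q are true.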

2.4 The Implication

A statement formed from two given statements in which we will be most interested is the implication (also called the conditional). For statements P and Q, the implication (or conditional) is the statement

If P, then Q.

and is denoted by P ⇒ Q. In addition to the wording "If P, then Q", we also express P ⇒ Q in words as

P implies Q.

The truth table for P ⇒ Q is given in Figure 2.5. Note that P ⇒ Q is false only when P is true and Q is false (P ⇒ Q is true otherwise).

P   Q   P ⇒ Q
T   T   T
T   F   F
F   T   T
F   F   T

Figure 2.5 The truth table for implication

Example 2.5

For P1: The integer 3 is odd. and P2: The integer 57 is prime., the implication P1 ⇒ P2: If 3 is an odd integer, then 57 is prime. is a false statement. The implication P2 ⇒ P1: If 57 is prime, then 3 is odd. is true, however.

While the truth tables for the negation ∼P, the disjunction P ∨ Q and the conjunction P ∧ Q are probably what one would expect, this may not be so for the implication P ⇒ Q. There is good justification, however, for the truth values in the truth table of P ⇒ Q. We illustrate this with an example.

Example 2.6

A student is taking a math class (let's say this one) and is currently receiving a B+. He visits his instructor a few days before the final examination and asks her, "Is there any chance that I can get an A in this course?" His instructor looks through her grade book and says, "If you earn an A on the final exam, then you will receive an A for your final grade." We now check the truth or falsity of this implication based on the various combinations of truth values of the statements P: You earn an A on the final exam. and Q: You receive an A for your final grade. that make up the implication.

Analysis

Suppose first that P and Q are both true. That is, the student receives an A on the final exam and later learns that he received an A for his final grade in the course. Did his instructor tell the truth? We can all agree that she did. So if P and Q are both true, then so too is P ⇒ Q, which agrees with the first row of the truth table in Figure 2.5.

Second, suppose that P is true and Q is false. In this case, the student received an A on the final exam but did not receive an A for his final grade; say he received a B. Certainly, his instructor did not do what she had promised her student. What she said was false, which agrees with the second row of the table in Figure 2.5.

Third, suppose that P is false and Q is true. In this case, the student did not receive an A on the final exam (say he earned a B), but when he received his final grade, he learned (and was pleasantly surprised) that it was an A. How could this happen? Perhaps his instructor was lenient. Perhaps the final exam was unusually difficult and a grade of B on it indicated an exceptionally good performance. Perhaps the instructor made a mistake. In any case, the instructor did not lie, so she told the truth. Indeed, she never promised anything if the student did not receive an A on the final exam. This agrees with the third row of the table in Figure 2.5.

Finally, suppose that P and Q are both false. That is, suppose the student did not receive an A on the final exam and also did not receive an A for a final grade. Again, the instructor did not lie. She only promised the student an A if he received an A on the final exam. Once more, she promised nothing if the student did not receive an A on the final exam. So the instructor told the truth, and this agrees with the fourth and final row of the table.


In summary, the only situation in which P ⇒ Q is false is when P is true and Q is false (so ∼Q is true). That is, the truth tables for ∼(P ⇒ Q) and P ∧ (∼Q) are the same. We will return to this observation shortly. We have already mentioned that the implication P ⇒ Q can be expressed as "If P, then Q" and "P implies Q". In fact, there are several ways of expressing P ⇒ Q in words, namely:

If P, then Q.
Q if P.
P implies Q.
P only if Q.
P is sufficient for Q.
Q is necessary for P.

It is probably not surprising that the first three of these say the same thing, but perhaps it is not so obvious that the last three say the same thing as the first three. Consider the statement "P only if Q". This says that P is true only under the condition that Q is true; in other words, it cannot be the case that P is true and Q is false. Thus it says that if P is true, then necessarily Q must be true. From this we can also see that the statement "Q is necessary for P" has the same meaning as "P only if Q". The statement "P is sufficient for Q" asserts that the truth of P is enough to guarantee the truth of Q. In other words, the truth of P implies the truth of Q; that is, "P implies Q".
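The observation that ∼(P ⇒ Q) and P ∧ (∼Q) have the same truth table can be confirmed by checking all four rows. Python has no built-in implication operator, so the sketch below assumes a small helper, here called `implies` (a hypothetical name), defined to be false exactly when P is true and Q is false, as in Figure 2.5.

```python
from itertools import product

def implies(p, q):
    """Truth value of P => Q: false only when P is true and Q is false."""
    return (not p) or q

# Compare the negation of the implication with P and (not Q), row by row.
comparison = []
for p, q in product([True, False], repeat=2):
    negated_implication = not implies(p, q)
    p_and_not_q = p and (not q)
    comparison.append((p, q, negated_implication, p_and_not_q))
    print(p, q, negated_implication, p_and_not_q)

# The last two columns agree in every row.
same_table = all(row[2] == row[3] for row in comparison)
print(same_table)
```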

2.5 More on Implications

We have just discussed four ways to create new statements from one or two given statements. In mathematics, however, we are often interested in declarative sentences that contain variables and whose truth or falsity is known only after we have assigned values to the variables. The values assigned to the variables come from their respective domains. These sentences are, of course, precisely what we have called open sentences. Just as new statements can be formed from statements P and Q by negation, disjunction, conjunction or implication, new open sentences can be constructed from open sentences in the same manner.

Example 2.7

Consider the open sentences P1(x): x = −3. and P2(x): |x| = 3., where x ∈ R, that is, where the domain of x is R in each case. We can then form the following open sentences:

∼P1(x): x ≠ −3.
P1(x) ∨ P2(x): x = −3 or |x| = 3.
P1(x) ∧ P2(x): x = −3 and |x| = 3.
P1(x) ⇒ P2(x): If x = −3, then |x| = 3.

Figure 2.6 Isosceles and equilateral triangles

For a specific real number x, the truth value of each resulting statement can be determined. For example, ∼P1(−3) is a false statement, while each of the remaining sentences above results in a true statement when x = −3. Both P1(2) ∨ P2(2) and P1(2) ∧ P2(2) are false statements. On the other hand, both ∼P1(2) and P1(2) ⇒ P2(2) are true statements. In fact, for each real number x ≠ −3, the implication P1(x) ⇒ P2(x) is a true statement since P1(x): x = −3 is a false statement. Hence P1(x) ⇒ P2(x) is true for all x ∈ R. We will see that open sentences which result in true statements for all values of the domain will be of particular interest to us. Several ways of wording the implication P1(x) ⇒ P2(x) are listed below:

If x = −3, then |x| = 3.
|x| = 3 if x = −3.
x = −3 implies that |x| = 3.
x = −3 only if |x| = 3.
x = −3 is sufficient for |x| = 3.
|x| = 3 is necessary for x = −3.
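The truth values discussed in Example 2.7 can be spot-checked numerically. The sketch below evaluates the open sentences at a handful of sample values; the list `samples` is an arbitrary choice made for illustration, and a finite check of this kind only illustrates the claim that P1(x) ⇒ P2(x) is true for all x ∈ R, it does not prove it. The helper `implies` is a hypothetical name for the truth table in Figure 2.5.

```python
def implies(p, q):
    """Truth value of P => Q: false only when P is true and Q is false."""
    return (not p) or q

def P1(x):
    """Open sentence P1(x): x = -3."""
    return x == -3

def P2(x):
    """Open sentence P2(x): |x| = 3."""
    return abs(x) == 3

samples = [-3, -1.5, 0, 2, 3]   # arbitrary sample values from the domain R

# Truth value of P1(x) => P2(x) at each sample value.
implication_values = {x: implies(P1(x), P2(x)) for x in samples}
print(implication_values)

# The specific statements mentioned in the example.
print(not P1(-3))              # ~P1(-3)
print(P1(2) or P2(2))          # P1(2) or P2(2)
print(P1(2) and P2(2))         # P1(2) and P2(2)
```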

We now consider another example, this time from geometry. You may recall that a triangle is called equilateral if the lengths of its three sides are equal, while a triangle is isosceles if the lengths of two of its three sides are equal. Figure 2.6 shows an isosceles triangle T1 and an equilateral triangle T2. Since the lengths of any two of the three sides of T2 are equal, T2 is isosceles as well. Indeed, this is precisely the fact we wish to discuss.

Example 2.8

For a triangle T, let P(T): T is equilateral. and Q(T): T is isosceles. Thus P(T) and Q(T) are open sentences over the domain S of all triangles. Consider the implication P(T) ⇒ Q(T), where then the domain of the variable T is the set S. For an equilateral triangle T1, both P(T1) and Q(T1) are true statements and so P(T1) ⇒ Q(T1) is a true statement as well. If T2 is not an equilateral triangle, then P(T2) is a false statement and so P(T2) ⇒ Q(T2) is true. Therefore, P(T) ⇒ Q(T) is a true statement for all T ∈ S. We now state P(T) ⇒ Q(T) in a variety of ways:

If T is an equilateral triangle, then T is isosceles.
A triangle T is isosceles if T is equilateral.
A triangle T being equilateral implies that T is isosceles.
A triangle T is equilateral only if T is isosceles.
For a triangle T to be isosceles, it is sufficient that T be equilateral.
For a triangle T to be equilateral, it is necessary that T be isosceles.


Notice that at times we change the wording to make the sentence sound better. In general, the statement P in the implication P ⇒ Q is commonly referred to as the hypothesis or premise of P ⇒ Q, while Q is called the conclusion of P ⇒ Q. It is often easier to deal with an implication when it is expressed in an "if, then" form. This allows us to identify the hypothesis and the conclusion more easily. Since implications can be stated in many ways (even beyond those mentioned above), it is particularly useful to be able to reword an implication in an "if, then" form. For example, the implication P(T) ⇒ Q(T) described in Example 2.8 can be encountered in several ways, including the following:

• Let T be an equilateral triangle. Then T is isosceles.
• Suppose that T is an equilateral triangle. Then T is isosceles.
• Every equilateral triangle is isosceles.
• Whenever a triangle is equilateral, it is isosceles.

We now investigate the truth or falsity of implications involving open sentences for values of their variables.

Example 2.9

Let S = {2, 3, 5} and let P(n): n^2 − n + 1 is prime. and Q(n): n^3 − n + 1 is prime. be open sentences over the domain S. Determine the truth or falsity of the implication P(n) ⇒ Q(n) for each n ∈ S.

Solution

In this case, we have the following:

P(2): 3 is prime. Q(2): 7 is prime.
P(3): 7 is prime. Q(3): 25 is prime.
P(5): 21 is prime. Q(5): 121 is prime.

Consequently, P(2) ⇒ Q(2) and P(5) ⇒ Q(5) are true, while P(3) ⇒ Q(3) is false.

Example 2.10

Let S = {1, 2} and let T = {−1, 4}. Also, let P(x, y): ||x + y| − |x − y|| = 2. and Q(x, y): x^(y+1) = y^x. be open sentences, where the domain of the variable x is S and the domain of y is T. Determine the truth or falsity of the implication P(x, y) ⇒ Q(x, y) for all (x, y) ∈ S × T.

Solution

For (x, y) = (1, −1), we have P(1, −1) ⇒ Q(1, −1): If 2 = 2, then 1 = −1. which is false. For (x, y) = (1, 4), we have P(1, 4) ⇒ Q(1, 4): If 2 = 2, then 1 = 4. which is also false. For (x, y) = (2, −1), we have P(2, −1) ⇒ Q(2, −1): If 2 = 2, then 1 = 1. which is true; while for (x, y) = (2, 4), we have P(2, 4) ⇒ Q(2, 4): If 4 = 2, then 32 = 16. which is true.
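The case-by-case evaluations in Examples 2.9 and 2.10 can be reproduced with a short Python sketch. The helpers `implies` and `is_prime` are hypothetical names introduced here for illustration; `is_prime` uses plain trial division, which is adequate only because the integers involved are small.

```python
from itertools import product

def implies(p, q):
    """Truth value of P => Q: false only when P is true and Q is false."""
    return (not p) or q

def is_prime(n):
    """Trial division; sufficient for the small integers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Example 2.9: P(n): n^2 - n + 1 is prime, Q(n): n^3 - n + 1 is prime.
example_2_9 = {
    n: implies(is_prime(n**2 - n + 1), is_prime(n**3 - n + 1))
    for n in [2, 3, 5]
}
print(example_2_9)

# Example 2.10: P(x, y): ||x + y| - |x - y|| = 2, Q(x, y): x^(y+1) = y^x.
def P(x, y):
    return abs(abs(x + y) - abs(x - y)) == 2

def Q(x, y):
    return x ** (y + 1) == y ** x

example_2_10 = {
    (x, y): implies(P(x, y), Q(x, y))
    for x, y in product([1, 2], [-1, 4])
}
print(example_2_10)
```

In the last case of Example 2.10 the hypothesis P(2, 4) is false, which is why the implication is true even though Q(2, 4) is false.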


2.6 The bicondiction For propositions (or open sentences) P and Q the implication is called Q ⇒ P the reciprocal of P ⇒ Q. The reciprocal of an implication will often interest us, either alone or in connection with the original implication . Example 2.11

For P1 instructions: 3 is an odd integer.

P2: 57 is a prime number.

the converse of the implication P1 ⇒ P2 : if 3 is an odd integer, then 57 is prime. the implication is P2 ⇒ P1 : if 57 is prime, then 3 is an odd integer.

For propositions (or open sentences) P and Q, the conjunction (P ⇒ Q) ∧ (Q ⇒ P) of the implication P ⇒ Q and its reciprocal is called the biconditional of P and Q and is denoted by P ⇔ Q. For statements P and Q, the truth table for P ⇔ Q can be determined. This is shown in Figure 2.7. From this table we see that P ⇔ Q is true if the statements P and Q are both true or false, while P ⇔ Q is false otherwise. That is, P ⇔ Q is true if and only if P and Q have the same truth values. The two-condition P ⇔ Q is often stated as P equals Q.

P Q P ⇒ Q T T F F

T F T F

Q⇒P

T F T T

T T F T

(P ⇒ Q) ∧ (Q ⇒ P ) T F F T

P QP ⇔ Q T T T T F F F T F F Abbildung 2.7

F T

The truth table for a biconditional

48

Chapter 2

Logic

or P if and only if Q. or like P is a necessary and sufficient condition for Q. For the statements P and Q it follows that the biconditional "P iff Q" is true only if P and Q have truth values. Example 2.12

The biconditional

    3 is an odd integer if and only if 57 is prime.

is false; while the biconditional

    100 is even if and only if 101 is prime.

is true. Furthermore, the biconditional

    5 is even if and only if 4 is odd.

is also true.

The expression “if and only if” is common in mathematics and we will discuss it in more detail later. For now, let's consider two examples of statements that include the phrase "if and only if." Example 2.13

We saw in Example 2.7 that for the open sentences

    P1(x): x = −3.
    P2(x): |x| = 3.

over the domain R, the implication

    P1(x) ⇒ P2(x): If x = −3, then |x| = 3.

is a true statement for each x ∈ R. The converse

    P2(x) ⇒ P1(x): If |x| = 3, then x = −3.

is false when x = 3, since P2(3) is true and P1(3) is false. For all other real numbers x, the implication P2(x) ⇒ P1(x) is true. Therefore, the biconditional

    P1(x) ⇔ P2(x): x = −3 if and only if |x| = 3.

is false when x = 3 and is true for all other real numbers x.

Example 2.14

For open sentences P(T ): T is equilateral. and Q(T): T is isosceles.


over the domain S of all triangles, the converse of the implication

    P(T) ⇒ Q(T): If T is equilateral, then T is isosceles.

is the implication

    Q(T) ⇒ P(T): If T is isosceles, then T is equilateral.

We noted that P(T) ⇒ Q(T) is a true statement for all triangles T, while Q(T) ⇒ P(T) is a false statement when T is an isosceles triangle that is not equilateral. On the other hand, the second implication becomes a true statement for all other triangles T. Therefore, the biconditional

    P(T) ⇔ Q(T): T is equilateral if and only if T is isosceles.

is false for all triangles that are isosceles and not equilateral, while it is true for all other triangles T.

We now investigate the truth or falsity of biconditionals obtained by assigning to a variable each value in its domain.

Example 2.15

Let S = {0, 1, 4}. Consider the following open sentences over the domain S:

    P(n): n(n + 1)(2n + 1)/6 is odd.
    Q(n): (n + 1)^3 = n^3 + 1.

Determine three distinct elements a, b, c in S such that P(a) ⇒ Q(a) is false, Q(b) ⇒ P(b) is false, and P(c) ⇔ Q(c) is true.

Solution

Observe that

    P(0): 0 is odd.     Q(0): 1 = 1.
    P(1): 1 is odd.     Q(1): 8 = 2.
    P(4): 30 is odd.    Q(4): 125 = 65.

Thus P(0) and P(4) are false, while P(1) is true. Also, Q(1) and Q(4) are false, while Q(0) is true. Hence P(1) ⇒ Q(1) and Q(0) ⇒ P(0) are false, while P(4) ⇔ Q(4) is true. Therefore, we may take a = 1, b = 0 and c = 4.

Analysis

In Example 2.15, notice that both P(0) ⇔ Q(0) and P(1) ⇔ Q(1) are false biconditionals. Hence the value 4 in S is the only choice for c.
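The search in Example 2.15 is small enough to carry out by brute force. The sketch below is illustrative only; the names `implies`, `iff`, and the three `*_choices` lists are assumptions introduced here, not notation from the text.

```python
S = [0, 1, 4]

def P(n):
    # P(n): n(n + 1)(2n + 1)/6 is odd.
    # The product n(n + 1)(2n + 1) is always divisible by 6,
    # so integer division is exact.
    return (n * (n + 1) * (2 * n + 1) // 6) % 2 == 1

def Q(n):
    # Q(n): (n + 1)^3 = n^3 + 1
    return (n + 1) ** 3 == n ** 3 + 1

def implies(p, q):
    return (not p) or q

def iff(p, q):
    # A biconditional is true exactly when both sides have the same truth value.
    return p == q

a_choices = [n for n in S if not implies(P(n), Q(n))]  # P(a) => Q(a) is false
b_choices = [n for n in S if not implies(Q(n), P(n))]  # Q(b) => P(b) is false
c_choices = [n for n in S if iff(P(n), Q(n))]          # P(c) <=> Q(c) is true

print(a_choices, b_choices, c_choices)
```

Each list contains a single element, which agrees with the observation that 4 is the only possible choice for c.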

2.7 Tautologies and Contradictions

The symbols ∼, ∨, ∧, ⇒ and ⇔ are sometimes referred to as logical connectives. From given statements, we can use these logical connectives to form more intricate statements. For example, the statement (P ∨ Q) ∧ (P ∨ R) is a statement formed from the given statements P, Q and R and the logical connectives ∨ and ∧. We call (P ∨ Q) ∧ (P ∨ R) a compound statement. More generally, a compound statement is a statement composed of one or more given statements (called component statements in this context) and at least one logical connective. For example, for a given component statement P, its negation ∼P is a compound statement.

The compound statement P ∨ (∼P), whose truth table is given in Figure 2.8, has the feature that it is true regardless of the truth value of P. A compound statement S is called a tautology if it is true for all possible combinations of truth values of the component statements that comprise S. Hence P ∨ (∼P) is a tautology, as is (∼Q) ∨ (P ⇒ Q). This latter fact is verified in the truth table shown in Figure 2.9. Letting

    P1: 3 is odd.
    P2: 57 is prime.

we see that not only is

    57 is not prime, or 57 is prime if 3 is odd.

a true statement, but (∼P2) ∨ (P1 ⇒ P2) is true regardless of which statements P1 and P2 are being considered.

On the other hand, a compound statement S is called a contradiction if it is false for all possible combinations of truth values of the component statements that are used to form S. The statement P ∧ (∼P) is a contradiction, as is shown in Figure 2.10. Hence the statement

    3 is odd and 3 is not odd.

is false. Another example of a contradiction is (P ∧ Q) ∧ (Q ⇒ (∼P)), which is verified in the truth table shown in Figure 2.11. Indeed, if a compound statement S is a tautology, then its negation ∼S is a contradiction.

P   ∼P   P ∨ (∼P)
T   F       T
F   T       T

Figure 2.8: An example of a tautology

P   Q   ∼Q   P ⇒ Q   (∼Q) ∨ (P ⇒ Q)
T   T   F      T            T
T   F   T      F            T
F   T   F      T            T
F   F   T      T            T

Figure 2.9: Another tautology

P   ∼P   P ∧ (∼P)
T   F       F
F   T       F

Figure 2.10: An example of a contradiction

P   Q   ∼P   P ∧ Q   Q ⇒ (∼P)   (P ∧ Q) ∧ (Q ⇒ (∼P))
T   T   F      T         F                F
T   F   F      F         T                F
F   T   T      F         T                F
F   F   T      F         T                F

Figure 2.11: Another contradiction
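Truth tables such as those in Figures 2.8 through 2.11 can be generated by enumerating every assignment of truth values. The following sketch assumes compound statements are encoded as Python functions of their component statements; the helper names are illustrative and do not come from the text.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def truth_values(statement, num_vars):
    """Evaluate a compound statement on every row of its truth table."""
    return [statement(*row) for row in product([True, False], repeat=num_vars)]

def is_tautology(statement, num_vars):
    # True for all combinations of truth values of the components.
    return all(truth_values(statement, num_vars))

def is_contradiction(statement, num_vars):
    # False for all combinations of truth values of the components.
    return not any(truth_values(statement, num_vars))

# Figure 2.8: P or (not P)
taut_1 = is_tautology(lambda p: p or (not p), 1)
# Figure 2.9: (not Q) or (P => Q)
taut_2 = is_tautology(lambda p, q: (not q) or implies(p, q), 2)
# Figure 2.10: P and (not P)
contra_1 = is_contradiction(lambda p: p and (not p), 1)
# Figure 2.11: (P and Q) and (Q => (not P))
contra_2 = is_contradiction(lambda p, q: (p and q) and implies(q, not p), 2)

print(taut_1, taut_2, contra_1, contra_2)
```

Because a truth table with n component statements has 2^n rows, this exhaustive check is practical only for a small number of components, which is all that the examples in this section require.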

2.8 Logical Equivalence

Figure 2.12 shows a truth table for the two statements P ⇒ Q and (∼P) ∨ Q. The corresponding columns of these compound statements are identical; in other words, these two compound statements have exactly the same truth value for every combination of truth values of the statements P and Q. Let R and S be two compound statements involving the same component statements. Then R and S are called logically equivalent if R and S have the same truth values for all combinations of truth values of their component statements. If R and S are logically equivalent, then this is denoted by R ≡ S. Hence P ⇒ Q and (∼P) ∨ Q are logically equivalent, and so P ⇒ Q ≡ (∼P) ∨ Q.

Another, even simpler, example of logical equivalence concerns P ∧ Q and Q ∧ P. That P ∧ Q ≡ Q ∧ P is verified in the truth table shown in Figure 2.13.

What is the practical significance of logical equivalence? Suppose that R and S are logically equivalent compound statements. Then we know that R and S have the same truth values for all possible combinations of truth values of their component statements. This means that the biconditional R ⇔ S is true for all possible combinations of truth values of its component statements, and hence R ⇔ S is a tautology. Conversely, if R ⇔ S is a tautology, then R and S are logically equivalent.

Let R be a mathematical statement that we would like to show is true, and suppose that R and some statement S are logically equivalent. If we can show that S is true, then R is true as well. For example, suppose that we want to verify the truth of an implication P ⇒ Q. If we can establish the truth of the statement (∼P) ∨ Q, then the logical equivalence of P ⇒ Q and (∼P) ∨ Q guarantees that P ⇒ Q is true as well.

P   Q   ∼P   P ⇒ Q   (∼P) ∨ Q
T   T   F      T         T
T   F   F      F         F
F   T   T      T         T
F   F   T      T         T

Figure 2.12: Verification of P ⇒ Q ≡ (∼P) ∨ Q

P   Q   P ∧ Q   Q ∧ P
T   T     T       T
T   F     F       F
F   T     F       F
F   F     F       F

Figure 2.13: Verification of P ∧ Q ≡ Q ∧ P

Example 2.16

Returning to the mathematics instructor in Example 2.6 and whether this instructor kept the promise that

    If you earn an A on the final exam, then you will receive an A for the final grade.

we need only know that the student did not receive an A on the final exam or that the student received an A as a final grade to see that the instructor kept the promise.

Since the logical equivalence of P ⇒ Q and (∼P) ∨ Q, verified in Figure 2.12, is especially important and we will have occasion to use this fact often, we state it as a theorem.

Theorem 2.17

Let P and Q be two statements. Then P ⇒ Q and (∼P) ∨ Q are logically equivalent.

Let's return to the truth table in Figure 2.13, where we showed that P ∧ Q and Q ∧ P are logically equivalent for any two statements P and Q. In particular, this says that (P ⇒ Q) ∧ (Q ⇒ P) and (Q ⇒ P) ∧ (P ⇒ Q) are logically equivalent. Of course, (P ⇒ Q) ∧ (Q ⇒ P) is precisely what is called the biconditional of P and Q. Since (P ⇒ Q) ∧ (Q ⇒ P) and (Q ⇒ P) ∧ (P ⇒ Q) are logically equivalent, (Q ⇒ P) ∧ (P ⇒ Q) represents the biconditional of P and Q as well. Since Q ⇒ P can be written as "P if Q" and P ⇒ Q can be expressed as "P only if Q," their conjunction can be written as "P if Q and P only if Q" or, more simply, as "P if and only if Q." Consequently, expressing P ⇔ Q as "P if and only if Q" is justified. Furthermore, since Q ⇒ P can be phrased as "P is necessary for Q" and P ⇒ Q can be expressed as "P is sufficient for Q," writing P ⇔ Q as "P is necessary and sufficient for Q" is likewise justified.
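The two descriptions of logical equivalence given in this section (identical truth-table columns, and the biconditional being a tautology) can be compared directly. In the sketch below, the implication is deliberately defined by its truth table rather than by the formula (∼P) ∨ Q, so that the check of Theorem 2.17 is not circular. All function names are illustrative assumptions.

```python
from itertools import product

# Truth table of the implication, taken as its definition.
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}

def implies(p, q):
    return IMPLIES[(p, q)]

def rows(num_vars):
    return list(product([True, False], repeat=num_vars))

def logically_equivalent(r, s, num_vars):
    # R and S agree on every row of the truth table.
    return all(r(*row) == s(*row) for row in rows(num_vars))

def biconditional_is_tautology(r, s, num_vars):
    # (R => S) and (S => R) is true on every row.
    return all(
        implies(r(*row), s(*row)) and implies(s(*row), r(*row))
        for row in rows(num_vars)
    )

def implication(p, q):
    return implies(p, q)

def disjunction_form(p, q):
    return (not p) or q

def converse(p, q):
    return implies(q, p)

theorem_2_17 = logically_equivalent(implication, disjunction_form, 2)
same_by_tautology = biconditional_is_tautology(implication, disjunction_form, 2)
converse_equivalent = logically_equivalent(implication, converse, 2)

print(theorem_2_17, same_by_tautology, converse_equivalent)
```

The last check illustrates that an implication and its converse are not logically equivalent in general, which is why the biconditional requires both.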


2.9 Some Fundamental Properties of Logical Equivalence

It probably comes as no surprise that the statements P and ∼(∼P) are logically equivalent. This fact is verified in Figure 2.14. We also noted in Figure 2.13 that, for two statements P and Q, the statements P ∧ Q and Q ∧ P are logically equivalent. There are other fundamental logical equivalences that we often encounter as well.

Theorem 2.18

For statements P, Q and R:

(1) Commutative Laws
    (a) P ∨ Q ≡ Q ∨ P
    (b) P ∧ Q ≡ Q ∧ P
(2) Associative Laws
    (a) P ∨ (Q ∨ R) ≡ (P ∨ Q) ∨ R
    (b) P ∧ (Q ∧ R) ≡ (P ∧ Q) ∧ R
(3) Distributive Laws
    (a) P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R)
    (b) P ∧ (Q ∨ R) ≡ (P ∧ Q) ∨ (P ∧ R)
(4) De Morgan's Laws
    (a) ∼(P ∨ Q) ≡ (∼P) ∧ (∼Q)
    (b) ∼(P ∧ Q) ≡ (∼P) ∨ (∼Q).

Each part of Theorem 2.18 is verified by means of a truth table. We have already established the commutative law for conjunction (namely P ∧ Q ≡ Q ∧ P) in Figure 2.13. In Figure 2.15, P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R) is verified by observing that the columns corresponding to the statements P ∨ (Q ∧ R) and (P ∨ Q) ∧ (P ∨ R) are identical.

The laws given in Theorem 2.18, together with other known logical equivalences, can sometimes be used to good advantage to prove other logical equivalences (without introducing a truth table).
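Since each part of Theorem 2.18 is verified by a truth table, all eight laws can be checked at once by enumerating the eight rows for P, Q and R. This is a sketch under the assumption that each side of a law is encoded as a function of (p, q, r); the labels such as "3a" simply mirror the numbering of the theorem.

```python
from itertools import product

def equivalent(lhs, rhs, num_vars=3):
    # Two compound statements are logically equivalent when their
    # truth-table columns agree on every row.
    return all(
        lhs(*row) == rhs(*row)
        for row in product([True, False], repeat=num_vars)
    )

laws = {
    "1a": (lambda p, q, r: p or q, lambda p, q, r: q or p),
    "1b": (lambda p, q, r: p and q, lambda p, q, r: q and p),
    "2a": (lambda p, q, r: p or (q or r), lambda p, q, r: (p or q) or r),
    "2b": (lambda p, q, r: p and (q and r), lambda p, q, r: (p and q) and r),
    "3a": (lambda p, q, r: p or (q and r), lambda p, q, r: (p or q) and (p or r)),
    "3b": (lambda p, q, r: p and (q or r), lambda p, q, r: (p and q) or (p and r)),
    "4a": (lambda p, q, r: not (p or q), lambda p, q, r: (not p) and (not q)),
    "4b": (lambda p, q, r: not (p and q), lambda p, q, r: (not p) or (not q)),
}

checks = {name: equivalent(lhs, rhs) for name, (lhs, rhs) in laws.items()}

for name, ok in checks.items():
    print(name, ok)
```

Laws (1) and (4) involve only P and Q; passing an unused third argument keeps one uniform signature and does not affect the outcome.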

Example 2.19

Suppose we are asked to verify that ∼(P ⇒ Q) ≡ P ∧ (∼Q) for every two statements P and Q. Using the logical equivalence of P ⇒ Q and (∼P) ∨ Q from Theorem 2.17, De Morgan's law in Theorem 2.18(4a), and the logical equivalence of P and ∼(∼P), we see that

    ∼(P ⇒ Q) ≡ ∼((∼P) ∨ Q) ≡ (∼(∼P)) ∧ (∼Q) ≡ P ∧ (∼Q),    (2.1)

implying that the statements ∼(P ⇒ Q) and P ∧ (∼Q) are logically equivalent, which we alluded to earlier.

It is important to keep in mind what we have said about logical equivalence. For example, the logical equivalence of P ∧ Q and Q ∧ P allows us to replace a statement of the type P ∧ Q by Q ∧ P without changing its truth value. As another example, by De Morgan's Laws in Theorem 2.18, if it is not the case that an integer a is even or an integer b is even, then it follows that a and b are both odd.

P   ∼P   ∼(∼P)
T   F      T
F   T      F

Figure 2.14: Verification of P ≡ ∼(∼P)

P   Q   R   Q ∧ R   P ∨ Q   P ∨ R   P ∨ (Q ∧ R)   (P ∨ Q) ∧ (P ∨ R)
T   T   T     T       T       T          T                 T
T   T   F     F       T       T          T                 T
T   F   T     F       T       T          T                 T
T   F   F     F       T       T          T                 T
F   T   T     T       T       T          T                 T
F   T   F     F       T       F          F                 F
F   F   T     F       F       T          F                 F
F   F   F     F       F       F          F                 F

Figure 2.15: Verification of the distributive law P ∨ (Q ∧ R) ≡ (P ∨ Q) ∧ (P ∨ R)

Example 2.20

Using the second of De Morgan's Laws and (2.1), we can establish a useful logically equivalent form of the negation of P ⇔ Q by the following string of logical equivalences:

    ∼(P ⇔ Q) ≡ ∼((P ⇒ Q) ∧ (Q ⇒ P))
             ≡ (∼(P ⇒ Q)) ∨ (∼(Q ⇒ P))
             ≡ (P ∧ (∼Q)) ∨ (Q ∧ (∼P)).

What we have observed about the negation of an implication and of a biconditional is summarized in the following theorem.

Theorem 2.21

For the statements P and Q, (a) ∼ (P ⇒ Q) ≡ P ∧ (∼Q) (b) ∼ (P ⇔ Q) ≡ (P ∧ (∼Q)) ∨ (Q ∧ (∼P)) .

Example 2.22

Once again, let's return to what the mathematics instructor in Example 2.6 said:

    If you earn an A on the final exam, then you will receive an A for your final grade.

If this instructor was not truthful, then it follows by Theorem 2.21(a) that

    You earned an A on the final exam and did not receive an A for your final grade.


Suppose, on the other hand, that the mathematics instructor had said:

    If you earn an A on the final exam, then you will receive an A for the final grade, and that's the only way that you will get an A for a final grade.

If this instructor was not truthful, then it follows by Theorem 2.21(b) that

    Either you earned an A on the final exam and didn't receive an A as your final grade, or you received an A for your final grade and you didn't get an A on the final exam.

2.10 Quantified Statements

We have mentioned that if P(x) is an open sentence over a domain S, then P(x) is a statement for each x ∈ S. We illustrate this again.

Example 2.23

If S = {1, 2, ..., 7}, then

    P(n): (2n^2 + 5 + (−1)^n)/2 is prime.

is a statement for each n ∈ S. Therefore,

    P(1): 3 is prime.
    P(2): 7 is prime.
    P(3): 11 is prime.
    P(4): 19 is prime.

are true statements; while

    P(5): 27 is prime.
    P(6): 39 is prime.
    P(7): 51 is prime.

are false statements.

There is another way that an open sentence can be converted into a statement, namely by a method called quantification. Let P(x) be an open sentence over a domain S. Adding the phrase "For every x ∈ S" to P(x) produces a statement called a quantified statement. The phrase "for every" is referred to as the universal quantifier and is denoted by the symbol ∀. Other ways to express the universal quantifier are "for each" and "for all." This quantified statement is expressed in symbols by

    ∀x ∈ S, P(x)    (2.2)

and is expressed in words by

    For every x ∈ S, P(x).    (2.3)

The quantified statement (2.2) (or (2.3)) is true if P(x) is true for every x ∈ S, while the quantified statement (2.2) is false if P(x) is false for at least one element x ∈ S.


Another way to convert an open sentence P(x) over a domain S into a statement through quantification is by the introduction of a quantifier called an existential quantifier. Each of the phrases there exists, there is, for some and for at least one is referred to as an existential quantifier and is denoted by the symbol ∃. The quantified statement

    ∃x ∈ S, P(x)    (2.4)

can be expressed in words by

    There exists x ∈ S such that P(x).    (2.5)

The quantified statement (2.4) (or (2.5)) is true if P(x) is true for at least one element x ∈ S, while the quantified statement (2.4) is false if P(x) is false for all x ∈ S.

We now consider two quantified statements constructed from the open sentence we saw in Example 2.23.

Example 2.24

For the open sentence

    P(n): (2n^2 + 5 + (−1)^n)/2 is prime.

over the domain S = {1, 2, ..., 7}, the quantified statement

    ∀n ∈ S, P(n): For every n ∈ S, (2n^2 + 5 + (−1)^n)/2 is prime.

is false since P(5) is false, for example; while the quantified statement

    ∃n ∈ S, P(n): There exists n ∈ S such that (2n^2 + 5 + (−1)^n)/2 is prime.

is true since P(1) is true, for example.

The quantified statement ∀x ∈ S, P(x) can also be expressed as

    If x ∈ S, then P(x).

Consider the open sentence P(x): x^2 ≥ 0. over the set R of real numbers. Then ∀x ∈ R, P(x) or, equivalently, ∀x ∈ R, x^2 ≥ 0 can be expressed as

    For every real number x, x^2 ≥ 0.

or

    If x is a real number, then x^2 ≥ 0.

as well as

    The square of every real number is nonnegative.
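Over a finite domain, the universal and existential quantifiers correspond to Python's built-in `all` and `any`. The sketch below re-creates Examples 2.23 and 2.24; the helper `is_prime` and the name `value` are assumptions introduced for illustration.

```python
import math

def is_prime(m):
    # Trial division; adequate for the small values that occur here.
    if m < 2:
        return False
    return all(m % d != 0 for d in range(2, math.isqrt(m) + 1))

S = range(1, 8)  # the domain {1, 2, ..., 7}

def value(n):
    # (2n^2 + 5 + (-1)^n)/2; the numerator is always even.
    return (2 * n * n + 5 + (-1) ** n) // 2

def P(n):
    return is_prime(value(n))

for_all = all(P(n) for n in S)        # truth value of: for every n in S, P(n)
there_exists = any(P(n) for n in S)   # truth value of: there exists n in S with P(n)
counterexamples = [n for n in S if not P(n)]

print([value(n) for n in S])
print(for_all, there_exists, counterexamples)
```

A single counterexample is enough to make the universal statement false, and a single witness is enough to make the existential statement true. This approach does not extend to infinite domains such as R, where a proof is needed instead.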


In general, the universal quantifier is used to claim that the statement resulting from a given open sentence is true when each value of the domain of the variable is assigned to the variable. Consequently, the statement ∀x ∈ R, x^2 ≥ 0 is true, since x^2 ≥ 0 is true for every real number x.

Suppose now that we were to consider the open sentence Q(x): x^2 ≤ 0. The statement ∀x ∈ R, Q(x) (that is, for every real number x, we have x^2 ≤ 0) is false since, for example, Q(1) is false. Of course, this means that its negation is true. If it were not the case that for every real number x, we have x^2 ≤ 0, then there must exist some real number x such that x^2 > 0. This negation

    There exists a real number x such that x^2 > 0.

can be written in symbols as ∃x ∈ R, x^2 > 0 or ∃x ∈ R, ∼Q(x). More generally, if we are considering an open sentence P(x) over a domain S, then

    ∼(∀x ∈ S, P(x)) ≡ ∃x ∈ S, ∼P(x).

Example 2.25

Suppose that we consider the set A = {1, 2, 3} and its power set P(A), the set of all subsets of A. Then the quantified statement

    For every set B ∈ P(A), A − B ≠ ∅.    (2.6)

is false, since for the subset B = A = {1, 2, 3}, we have A − B = ∅. The negation of the statement (2.6) is

    There exists B ∈ P(A) such that A − B = ∅.    (2.7)

The statement (2.7) is therefore true, since for B = A ∈ P(A), we have A − B = ∅. The statement (2.6) can also be written as

    If B ⊆ A, then A − B ≠ ∅.    (2.8)

Consequently, the negation of (2.8) can be expressed as

    There exists some subset B of A such that A − B = ∅.
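Example 2.25 quantifies over the eight subsets of A, so both the statement and its negation can be evaluated by listing the power set. In this sketch the function `power_set` and the variable names are illustrative choices, not notation from the text.

```python
from itertools import combinations

A = {1, 2, 3}

def power_set(s):
    # All subsets of s, built by choosing k elements for k = 0, 1, ..., |s|.
    items = sorted(s)
    return [
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    ]

subsets = power_set(A)

# (2.6): for every B in P(A), A - B is nonempty.
statement_2_6 = all(A - B != set() for B in subsets)
# (2.7): there exists B in P(A) such that A - B is empty.
statement_2_7 = any(A - B == set() for B in subsets)
# The subsets that witness (2.7).
witnesses = [set(B) for B in subsets if A - B == set()]

print(len(subsets), statement_2_6, statement_2_7, witnesses)
```

The two truth values are opposite, as they must be for a statement and its negation, and the only witness is B = A.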

The existential quantifier is used to claim that at least one statement resulting from a given open sentence is true when the values of a variable are assigned from its domain. We know that for an open sentence P(x) over a domain S, the quantified statement ∃x ∈ S, P(x) is true provided P(x) is a true statement for at least one element x ∈ S. Thus the statement ∃x ∈ R, x^2 > 0 is true since, for example, x^2 > 0 is true for x = 1. The quantified statement ∃x ∈ R, 3x = 12 is true as well, since there is some real number x for which 3x = 12, namely x = 4. (Indeed, x = 4 is the only real number for which 3x = 12.) On the other hand, the quantified statement ∃n ∈ Z, 4n − 1 = 0


is false, since there is no integer n for which 4n − 1 = 0. (Of course, 4n − 1 = 0 when n = 1/4, but 1/4 is not an integer.)

Suppose that Q(x) is an open sentence over a domain S. If the statement ∃x ∈ S, Q(x) is not true, then it must be the case that Q(x) is false for every x ∈ S. That is,

    ∼(∃x ∈ S, Q(x)) ≡ ∀x ∈ S, ∼Q(x).

We illustrate this with a specific example.

Example 2.26

The following statement contains the existential quantifier:

    There exists a real number x such that x^2 = 3.    (2.9)

If we let P(x): x^2 = 3, then (2.9) can be rewritten as ∃x ∈ R, P(x). The statement (2.9) is true since P(x) is true when x = √3 (or when x = −√3). Hence the negation of (2.9) is

    For every real number x, x^2 ≠ 3.    (2.10)

The statement (2.10) is therefore false.

Let P(x, y) be an open sentence, where the domain of the variable x is S and the domain of the variable y is T. Then the quantified statement

    For all x ∈ S and y ∈ T, P(x, y).

can be expressed symbolically as

    ∀x ∈ S, ∀y ∈ T, P(x, y).    (2.11)

The negation of the statement (2.11) is

    ∼(∀x ∈ S, ∀y ∈ T, P(x, y)) ≡ ∃x ∈ S, ∼(∀y ∈ T, P(x, y))
                                 ≡ ∃x ∈ S, ∃y ∈ T, ∼P(x, y).    (2.12)

We now consider examples of quantified statements with two variables. Example 2.27

Consider the statement

    For every two real numbers x and y, x^2 + y^2 ≥ 0.    (2.13)

If we let P(x, y): x^2 + y^2 ≥ 0, where the domain of both x and y is R, then statement (2.13) can be expressed as

    ∀x ∈ R, ∀y ∈ R, P(x, y)    (2.14)

or as ∀x, y ∈ R, P(x, y). Since x^2 ≥ 0 and y^2 ≥ 0 for all real numbers x and y, it follows that x^2 + y^2 ≥ 0 and so P(x, y) is true for all real numbers x and y. Thus the quantified statement (2.14) is true.


The negation of statement (2.14) is therefore

    ∼(∀x ∈ R, ∀y ∈ R, P(x, y)) ≡ ∃x ∈ R, ∃y ∈ R, ∼P(x, y) ≡ ∃x, y ∈ R, ∼P(x, y),    (2.15)

which, in words, is

    There exist real numbers x and y such that x^2 + y^2 < 0.    (2.16)

The statement (2.16) is therefore false.

For an open sentence containing two variables, the domains of the variables need not be the same.

Example 2.28

Consider the statement

    For every s ∈ S and t ∈ T, st + 2 is a prime.    (2.17)

where the domain of the variable s is S = {1, 3, 5} and the domain of the variable t is T = {3, 9}. If we let Q(s, t): st + 2 is a prime. then the statement (2.17) can be expressed as

    ∀s ∈ S, ∀t ∈ T, Q(s, t).    (2.18)

Since all of the statements

    Q(1, 3): 1 · 3 + 2 is a prime.    Q(1, 9): 1 · 9 + 2 is a prime.
    Q(3, 3): 3 · 3 + 2 is a prime.    Q(3, 9): 3 · 9 + 2 is a prime.
    Q(5, 3): 5 · 3 + 2 is a prime.    Q(5, 9): 5 · 9 + 2 is a prime.

are true, the quantified statement (2.18) is true. As we saw in (2.12), the negation of the quantified statement (2.18) is

    ∼(∀s ∈ S, ∀t ∈ T, Q(s, t)) ≡ ∃s ∈ S, ∃t ∈ T, ∼Q(s, t),

and so the negation of (2.17) is

    There exist s ∈ S and t ∈ T such that st + 2 is not a prime.    (2.19)

The statement (2.19) is therefore false.
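A statement with two universal quantifiers over finite domains can be checked by running through all ordered pairs. The sketch below does this for Example 2.28 and evaluates the negation in its ∃∃ form; `is_prime` is an illustrative helper, not something defined in the text.

```python
import math
from itertools import product

def is_prime(m):
    # Trial division; adequate for the small values that occur here.
    if m < 2:
        return False
    return all(m % d != 0 for d in range(2, math.isqrt(m) + 1))

S = [1, 3, 5]
T = [3, 9]

def Q(s, t):
    # Q(s, t): st + 2 is a prime.
    return is_prime(s * t + 2)

# (2.18): for every s in S and every t in T, Q(s, t).
statement_2_18 = all(Q(s, t) for s, t in product(S, T))
# (2.19): there exist s in S and t in T such that Q(s, t) is false.
negation_2_19 = any(not Q(s, t) for s, t in product(S, T))

print(sorted(s * t + 2 for s, t in product(S, T)))
print(statement_2_18, negation_2_19)
```

The two results always have opposite truth values, which is the content of the equivalence in (2.12) restricted to finite domains.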

Again, let P(x, y) be an open sentence, where the domain of the variable x is S and the domain of the variable y is T. The quantified statement

    There exist x ∈ S and y ∈ T such that P(x, y).

can be expressed in symbols as

    ∃x ∈ S, ∃y ∈ T, P(x, y).    (2.20)

The negation of the statement (2.20) is then

    ∼(∃x ∈ S, ∃y ∈ T, P(x, y)) ≡ ∀x ∈ S, ∼(∃y ∈ T, P(x, y))
                                 ≡ ∀x ∈ S, ∀y ∈ T, ∼P(x, y).    (2.21)

We now illustrate this situation.

Example 2.29

Consider the open sentence R(s, t): |s − 1| + |t − 2| ≤ 2. where the domain of the variable s is the set S of even integers and the domain of the variable t is the set T of odd integers. Then the quantified statement

    ∃s ∈ S, ∃t ∈ T, R(s, t)    (2.22)

can be expressed in words as

    There exist an even integer s and an odd integer t such that |s − 1| + |t − 2| ≤ 2.    (2.23)

Since R(2, 3): 1 + 1 ≤ 2 is true, the quantified statement (2.23) is true. The negation of (2.22) is therefore

    ∼(∃s ∈ S, ∃t ∈ T, R(s, t)) ≡ ∀s ∈ S, ∀t ∈ T, ∼R(s, t),    (2.24)

and so the negation of (2.22), in words, is

    For every even integer s and every odd integer t, |s − 1| + |t − 2| > 2.    (2.25)

The quantified statement (2.25) is therefore false.

De Morgan's laws are also used in the next two examples of negating quantified statements. Example 2.30

The negation of

    For all integers a and b, if ab is even, then a is even and b is even.

is

    There exist integers a and b such that ab is even and a or b is odd.

Example 2.31

The negation of

    There exists a rational number r such that r ∈ A = {√2, π} or r ∈ B = {−√2, √3, e}.

is

    For every rational number r, both r ∉ A and r ∉ B.

Quantified statements can contain universal and existential quantifiers. Although we present examples of them now, we will discuss them in more detail in Section 7.2.

Example 2.32

Consider the open sentence P(a, b): ab = 1. where the domain of both a and b is the set Q+ of positive rational numbers. Then the quantified statement

    ∀a ∈ Q+, ∃b ∈ Q+, P(a, b)    (2.26)

can be expressed in words as

    For every positive rational number a, there exists a positive rational number b such that ab = 1.

It turns out that the quantified statement (2.26) is true (for a given a, take b = 1/a). If we replace Q+ by R, then we have

    ∀a ∈ R, ∃b ∈ R, P(a, b).    (2.27)

The negation of this statement is

    ∼(∀a ∈ R, ∃b ∈ R, P(a, b)) ≡ ∃a ∈ R, ∼(∃b ∈ R, P(a, b)) ≡ ∃a ∈ R, ∀b ∈ R, ∼P(a, b),

which, in words, says that

    There exists a real number a such that for every real number b, ab ≠ 1.

This negation is true, since for a = 0 and every real number b, we have ab = 0 ≠ 1. Thus the quantified statement (2.27) is false.

Example 2.33

Consider the open sentence Q(a, b): ab is odd. where the domain of both a and b is the set N of positive integers. Then the quantified statement

    ∃a ∈ N, ∀b ∈ N, Q(a, b),    (2.28)

expressed in words, is

    There exists a positive integer a such that for every positive integer b, ab is odd.

The statement (2.28) turns out to be false. The negation of (2.28), in symbols, is

    ∼(∃a ∈ N, ∀b ∈ N, Q(a, b)) ≡ ∀a ∈ N, ∼(∀b ∈ N, Q(a, b)) ≡ ∀a ∈ N, ∃b ∈ N, ∼Q(a, b).

In words, this says

    For every positive integer a, there exists a positive integer b such that ab is even.

This statement is therefore true (for any a, take b = 2, for example).


Suppose that P(x, y) is an open sentence, where the domain of x is S and the domain of y is T. Then the quantified statement ∀x ∈ S, ∃y ∈ T, P(x, y) is true if ∃y ∈ T, P(x, y) is true for each x ∈ S. This means that for every x ∈ S, there is some y ∈ T for which P(x, y) is true.

Example 2.34

Consider the open sentence P(x, y): x + y is a prime number. where the domain of x is S = {2, 3} and the domain of y is T = {3, 4}. The quantified statement ∀x ∈ S, ∃y ∈ T, P(x, y), expressed in words, is For all x ∈ S there is y ∈ T such that x + y is a prime number. This statement is true. For x = 2, P(2, 3) is true and for x = 3, P(3, 4) is true.

Suppose that Q(x, y) is an open sentence, where S is the domain of x and T is the domain of y. The quantified statement ∃x ∈ S, ∀y ∈ T, Q(x, y) is true if ∀y ∈ T, Q(x, y) is true for some x ∈ S. This means that for some element x in S, the open sentence Q(x, y) is true for all y ∈ T.

Example 2.35

Consider the open sentence Q(x, y): x + y is a prime. where the domain of x is S = {3, 5, 7} and the domain of y is T = {2, 6, 8, 12}. The quantified statement

    ∃x ∈ S, ∀y ∈ T, Q(x, y),    (2.29)

expressed in words, is

    There exists some x ∈ S such that for every y ∈ T, x + y is a prime.

For x = 5, all of the numbers 5 + 2, 5 + 6, 5 + 8 and 5 + 12 are prime. Consequently, the quantified statement (2.29) is true.
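Statements that mix the two quantifiers translate into nested uses of `all` and `any`, with the outer quantifier becoming the outer call. The following sketch evaluates the statements of Examples 2.34 and 2.35; the helper `is_prime` and all variable names are assumptions made for illustration.

```python
import math

def is_prime(m):
    # Trial division; adequate for the small values that occur here.
    if m < 2:
        return False
    return all(m % d != 0 for d in range(2, math.isqrt(m) + 1))

def sum_is_prime(x, y):
    # The open sentence used in both examples: x + y is a prime.
    return is_prime(x + y)

# Example 2.34: for every x in S there exists y in T such that x + y is prime.
S_234, T_234 = [2, 3], [3, 4]
forall_exists = all(any(sum_is_prime(x, y) for y in T_234) for x in S_234)

# Example 2.35: there exists x in S such that for every y in T, x + y is prime.
S_235, T_235 = [3, 5, 7], [2, 6, 8, 12]
exists_forall = any(all(sum_is_prime(x, y) for y in T_235) for x in S_235)
witnesses = [x for x in S_235 if all(sum_is_prime(x, y) for y in T_235)]

print(forall_exists, exists_forall, witnesses)
```

In the ∀∃ case the choice of y may depend on x (here y = 3 works for x = 2 and y = 4 works for x = 3), whereas in the ∃∀ case a single x must work for every y.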


Let's review the symbols that we have introduced in this chapter:

    ∼    negation (not)
    ∨    disjunction (or)
    ∧    conjunction (and)
    ⇒    implication
    ⇔    biconditional
    ∀    universal quantifier (for every)
    ∃    existential quantifier (there exists)

2.11 Characterizations of Statements

Let's return to the biconditional P ⇔ Q. Recall that P ⇔ Q represents the compound statement (P ⇒ Q) ∧ (Q ⇒ P). Earlier, we described how this compound statement can be expressed as

    P if and only if Q.

Many mathematicians abbreviate the phrase "if and only if" by writing "iff." Although "iff" is informal and, of course, is not a word, its use is common and you should be familiar with it.

Recall that whenever you see

    P if and only if Q.

or

    P is necessary and sufficient for Q.

this means

    If P then Q, and if Q then P.

Example 2.36

Suppose that

    P(x): x = −3.
    Q(x): |x| = 3.

where x ∈ R. Then the biconditional P(x) ⇔ Q(x) can be expressed as

    x = −3 if and only if |x| = 3.

or

    x = −3 is necessary and sufficient for |x| = 3.

or, perhaps better, as

    x = −3 is a necessary and sufficient condition for |x| = 3.

Let's now consider the quantified statement ∀x ∈ R, P(x) ⇔ Q(x). This statement is false because P(3) ⇔ Q(3) is false.


Suppose that some concept (or object) is expressed in an open sentence P(x) over a domain S and Q(x) is another open sentence over the domain S concerning this concept. We say that this concept is characterized by Q(x) if ∀x ∈ S, P(x) ⇔ Q(x) is a true statement. The statement ∀x ∈ S, P(x) ⇔ Q(x) is then called a characterization of this concept. For example, irrational numbers are defined as real numbers that are not rational and are characterized as real numbers whose decimal expansions are nonrepeating. This provides a characterization of irrational numbers:

    A real number r is irrational if and only if r has a nonrepeating decimal expansion.

We have seen that equilateral triangles are defined as triangles whose sides are equal. They are characterized, however, as triangles whose angles are equal. Therefore, we have the characterization:

    A triangle T is equilateral if and only if T has three equal angles.

You might think that equilateral triangles are also characterized as those triangles having three equal sides, but the associated biconditional

    A triangle T is equilateral if and only if T has three equal sides.

is not a characterization of equilateral triangles. Indeed, this is the definition we gave of equilateral triangles. A characterization of a concept gives an alternative, but equivalent, way of looking at this concept. Characterizations are often valuable in studying concepts or in proving other results. We will see examples of this in future chapters.

We mentioned that the following biconditional, though true, is not a characterization: A triangle T is equilateral if and only if T has three equal sides. Although this is the definition of equilateral triangles, mathematicians rarely use the phrase "if and only if" in a definition, since this is already understood in what a definition means. That is, a triangle is defined to be equilateral if it has three equal sides. Consequently, a triangle with three equal sides is equilateral, but a triangle that does not have three equal sides is not equilateral.

EXERCISES FOR CHAPTER 2

Section 2.1: Statements

2.1. Which of the following sentences are statements? For those that are, indicate the truth value.
(a) The integer 123 is prime.
(b) The integer 0 is even.
(c) Is 5 × 2 = 10?
(d) x^2 − 4 = 0.
(e) Multiply 5x + 2 by 3.
(f) 5x + 3 is an odd integer.
(g) What an impossible question!


2.2. Consider the sets A, B, C and D below. Which of the following statements are true? Give an explanation for each false statement.

    A = {1, 4, 7, 10, 13, 16, ...}    C = {x ∈ Z : x is prime and x ≠ 2}
    B = {x ∈ Z : x is odd}            D = {1, 2, 3, 5, 8, 13, 21, 34, 55, ...}

(a) 25 ∈ A (b) 33 ∈ D (c) 22 ∉ A ∪ D (d) C ⊆ B (e) ∅ ∈ B ∩ D (f) 53 ∉ C.

2.3. Which of the following statements are true? Give an explanation for each false statement.
(a) ∅ ∈ ∅ (b) ∅ ∈ {∅} (c) {1, 3} = {3, 1} (d) ∅ = {∅} (e) ∅ ⊂ {∅} (f) 1 ⊆ {1}.

2.4. Consider the open sentence P(x): x(x − 1) = 6 over the domain R.
(a) For what values of x is P(x) a true statement?
(b) For what values of x is P(x) a false statement?

2.5. For the open sentence P(x): 3x − 2 > 4 over the domain Z, determine:
(a) the values of x for which P(x) is true.
(b) the values of x for which P(x) is false.

2.6. For the open sentence P(A): A ⊆ {1, 2, 3} over the domain S = P({1, 2, 4}), determine:
(a) all A ∈ S for which P(A) is true.
(b) all A ∈ S for which P(A) is false.
(c) all A ∈ S for which A ∩ {1, 2, 3} = ∅.

2.7. Let P(n): n and n + 2 are primes. be an open sentence over the domain N. Find six positive integers n for which P(n) is true. If n ∈ N such that P(n) is true, then the two integers n, n + 2 are called twin primes. It has been conjectured that there are infinitely many twin primes.

2.8. Let P(n): (n^2 + 5n + 6)/2 is even.
(a) Find a set S1 of three integers such that P(n) is an open sentence over the domain S1 and P(n) is true for each n ∈ S1.
(b) Find a set S2 of three integers such that P(n) is an open sentence over the domain S2 and P(n) is false for each n ∈ S2.

2.9. Find an open sentence P(n) over the domain S = {3, 5, 7, 9} such that P(n) is true for half of the integers in S and false for the other half.

2.10. Find two open sentences P(n) and Q(n), both over the domain S = {2, 4, 6, 8}, such that P(2) and Q(2) are both true, P(4) and Q(4) are both false, P(6) is true and Q(6) is false, while P(8) is false and Q(8) is true.

Section 2.2: The Negation of a Statement

2.11. State the negation of each of the following statements.
(a) √2 is a rational number.
(b) 0 is not a negative integer.
(c) 111 is a prime number.

2.12. Complete the truth table in Figure 2.16.

P   Q   ∼P   ∼Q
T   T
T   F
F   T
F   F

Figure 2.16: The truth table for Exercise 2.12

2.13. State the negation of each of the following statements.
(a) The real number r is at most √2.
(b) The absolute value of the real number a is less than 3.
(c) Two angles of the triangle measure 45°.
(d) The area of the circle is at least 9π.
(e) Two sides of the triangle have the same length.
(f) The point P in the plane lies outside of the circle C.

2.14. State the negation of each of the following statements.
(a) At least two of my library books are overdue.
(b) One of my two friends misplaced his homework assignment.
(c) No one expected that to happen.
(d) It's not often that my instructor teaches that course.
(e) It's surprising that two students received the same exam score.

Section 2.3: The Disjunction and Conjunction of Statements

2.15. Complete the truth table in Figure 2.17.

P   Q   ∼Q   P ∧ (∼Q)
T   T
T   F
F   T
F   F

Figure 2.17: The truth table for Exercise 2.15

2.16. For the sets A = {1, 2, ..., 10} and B = {2, 4, 6, 9, 12, 25}, consider the statements

    P: A ⊆ B.    Q: |A − B| = 6.

Determine which of the following statements are true.
(a) P ∨ Q (b) P ∨ (∼Q) (c) P ∧ Q (d) (∼P) ∧ Q (e) (∼P) ∨ (∼Q).

2.17. Let P: 15 is odd. and Q: 21 is prime. State each of the following in words, and determine whether they are true or false.
(a) P ∨ Q (b) P ∧ Q (c) (∼P) ∨ Q (d) P ∧ (∼Q).

2.18. Let S = {1, 2, ..., 6} and let P(A): A ∩ {2, 4, 6} = ∅. and Q(A): A ≠ ∅. be open sentences over the domain P(S).
(a) Find all A ∈ P(S) such that P(A) ∧ Q(A) holds.


(b) Find all A ∈ P(S) such that P(A) ∨ (∼Q(A)) holds. (c) Find all A ∈ P(S) such that (∼P(A)) ∧ (∼Q(A)) holds.

Section 2.4: The Implication

2.19. Consider the statements P: 17 is even. and Q: 19 is prime. Write each of the following statements in words and indicate whether it is true or false. (a) ∼P (b) P ∨ Q (c) P ∧ Q (d) P ⇒ Q.

2.20. For statements P and Q, construct a truth table for (P ⇒ Q) ⇒ (∼P).

2.21. Consider the statements P: √2 is rational. and Q: 22/7 is rational. Write each of the following statements in words and indicate whether it is true or false. (a) P ⇒ Q (b) Q ⇒ P (c) (∼P) ⇒ (∼Q) (d) (∼Q) ⇒ (∼P).

2.22. Consider the statements P: 2/3 is rational. Q: √2 is rational. R: √3 is rational. Write each of the following statements in words and indicate whether the statement is true or false. (a) (P ∧ Q) ⇒ R (b) (P ∧ Q) ⇒ (∼R) (c) ((∼P) ∧ Q) ⇒ R (d) (P ∨ Q) ⇒ (∼R).

2.23. Suppose that {S1, S2} is a partition of a set S and x ∈ S. Which of the following statements are true?
(a) If we know that x ∉ S1, then x must belong to S2.
(b) It is possible that x ∉ S1 and x ∉ S2.
(c) Either x ∉ S1 or x ∉ S2.
(d) Either x ∈ S1 or x ∈ S2.
(e) It is possible that x ∈ S1 and x ∈ S2.

2.24. Two sets A and B are nonempty disjoint subsets of a set S. If x ∈ S, which of the following statements are true?
(a) It is possible that x ∈ A ∩ B.
(b) If x is an element of A, then x cannot be an element of B.
(c) If x is not an element of A, then x must be an element of B.
(d) It is possible that x ∉ A and x ∉ B.
(e) For every nonempty set C, either x ∈ A ∩ C or x ∈ B ∩ C.
(f) For some nonempty set C, both x ∈ A ∪ C and x ∈ B ∪ C.

2.25. A college student makes the following statement: If I receive an A in both Calculus I and Discrete Mathematics this semester, then I'll take either Calculus II or Computer Programming this summer. For each of the following, determine whether this statement is true or false.
(a) The student doesn't get an A in Calculus I but decides to take Calculus II this summer anyway.
(b) The student gets an A in both Calculus I and Discrete Mathematics but decides not to take any class this summer.
(c) The student does not get an A in Calculus I and decides not to take Calculus II but takes Computer Programming this summer.
(d) The student gets an A in both Calculus I and Discrete Mathematics and decides to take both Calculus II and Computer Programming this summer.
(e) The student gets an A in neither Calculus I nor Discrete Mathematics and takes neither Calculus II nor Computer Programming this summer.


2.26. A student makes the following statement: If I don't see my advisor today, then I'll see her tomorrow. For each of the following, determine whether this statement is true or false.
(a) The student doesn't see his advisor on either day.
(b) The student sees his advisor on both days.
(c) The student sees his advisor on one of the two days.
(d) The student doesn't see his advisor today and waits until next week to see her.

2.27. The instructor of a computer science class announces to her class that a well-known guest lecturer will be on campus later that day. Four students in the class are Alice, Ben, Cindy and Don. Ben says that he'll attend the lecture if Alice does. Cindy says that she'll attend the lecture if Ben does. Don says that he'll go to the lecture if Cindy does. That afternoon exactly two of the four students attend the lecture. Which two students attended the lecture?

2.28. Consider the statement (implication): If Bill takes Sam to the concert, then Sam will take Bill to dinner. Which of the following implies that this statement is true?
(a) Sam takes Bill to dinner only if Bill takes Sam to the concert.
(b) Either Bill doesn't take Sam to the concert or Sam takes Bill to dinner.
(c) Bill takes Sam to the concert.
(d) Bill takes Sam to the concert and Sam takes Bill to dinner.
(e) Bill takes Sam to the concert and Sam doesn't take Bill to dinner.
(f) The concert is canceled.
(g) Sam doesn't attend the concert.

2.29. Let P and Q be statements. Which of the following statements implies that P ∨ Q is false? (a) (∼ P) ∨ (∼ Q) is false. (b) (∼ P) ∨ Q is true. (c) (∼ P) ∧ (∼ Q) is true. (d) Q ⇒ P is true. (e) P ∧ Q is false.

Section 2.5: More on Implications

2.30. Consider the open sentences P(n): 5n + 3 is prime. and Q(n): 7n + 1 is prime, both over the domain N. (a) State P(n) ⇒ Q(n) in words. (b) State P(2) ⇒ Q(2) in words. Is this statement true or false? (c) State P(6) ⇒ Q(6) in words. Is this statement true or false?

2.31. In each of the following, two open sentences P(x) and Q(x) over a domain S are given. Determine the truth value of P(x) ⇒ Q(x) for each x ∈ S.
(a) P(x): |x| = 4; Q(x): x = 4; S = {−4, −3, 1, 4, 5}.
(b) P(x): x² = 16; Q(x): |x| = 4; S = {−6, −4, 0, 3, 4, 8}.
(c) P(x): x > 3; Q(x): 4x − 1 > 12; S = {0, 2, 3, 4, 6}.

2.32. In each of the following, two open sentences P(x) and Q(x) over a domain S are given. Determine all x ∈ S for which P(x) ⇒ Q(x) is a true statement.
(a) P(x): x − 3 = 4; Q(x): x ≥ 8; S = R.
(b) P(x): x² ≥ 1; Q(x): x ≥ 1; S = R.
(c) P(x): x² ≥ 1; Q(x): x ≥ 1; S = N.
(d) P(x): x ∈ [−1, 2]; Q(x): x² ≤ 2; S = [−1, 1].


2.33. In each of the following, two open sentences P(x, y) and Q(x, y) are given, where the domain of both x and y is Z. Determine the truth value of P(x, y) ⇒ Q(x, y) for the given values of x and y.
(a) P(x, y): x² − y² = 0. and Q(x, y): x = y. (x, y) ∈ {(1, −1), (3, 4), (5, 5)}.
(b) P(x, y): |x| = |y|. and Q(x, y): x = y. (x, y) ∈ {(1, 2), (2, −2), (6, 6)}.
(c) P(x, y): x² + y² = 1. and Q(x, y): x + y = 1. (x, y) ∈ {(1, −1), (−3, 4), (0, −1), (1, 0)}.

2.34. Each of the following describes an implication. Write the implication in the form “if, then”.
(a) Every point on the straight line with equation 2y + x − 3 = 0 whose x-coordinate is an integer also has an integer for its y-coordinate.
(b) The square of every odd integer is odd.
(c) Let n ∈ Z. Whenever 3n + 7 is even, n is odd.
(d) The derivative of the function f(x) = cos x is f′(x) = −sin x.
(e) Let C be a circle of circumference 4π. Then the area of C is also 4π.
(f) The integer n³ is even only if n is even.

Section 2.6: The Biconditional

2.35. Let P: 18 is odd. and Q: 25 is even. State P ⇔ Q in words. Is P ⇔ Q true or false?

2.36. Let P(x): x is odd. and Q(x): x² is odd. be open sentences over the domain Z. State P(x) ⇔ Q(x) in two ways: (1) using “if and only if” and (2) using “necessary and sufficient”.

2.37. For the open sentences P(x): |x − 3| < 1. and Q(x): x ∈ (2, 4). over the domain R, state the biconditional P(x) ⇔ Q(x) in two different ways.

2.38. Consider the open sentences P(x): x = −2. and Q(x): x² = 4. over the domain S = {−2, 0, 2}. State each of the following in words and determine all values of x ∈ S for which the resulting statements are true. (a) ∼P(x) (b) P(x) ∨ Q(x) (c) P(x) ∧ Q(x) (d) P(x) ⇒ Q(x) (e) Q(x) ⇒ P(x) (f) P(x) ⇔ Q(x).

2.39. For the following open sentences P(x) and Q(x) over a domain S, determine all values of x ∈ S for which the biconditional P(x) ⇔ Q(x) is true.
(a) P(x): |x| = 4; Q(x): x = 4; S = {−4, −3, 1, 4, 5}.
(b) P(x): x ≥ 3; Q(x): 4x − 1 > 12; S = {0, 2, 3, 4, 6}.
(c) P(x): x² = 16; Q(x): x² − 4x = 0; S = {−6, −4, 0, 3, 4, 8}.

2.40. In each of the following, two open sentences P(x, y) and Q(x, y) are given, where the domain of both x and y is Z. Determine the truth value of P(x, y) ⇔ Q(x, y) for the given values of x and y.
(a) P(x, y): x² − y² = 0 and Q(x, y): x = y. (x, y) ∈ {(1, −1), (3, 4), (5, 5)}.
(b) P(x, y): |x| = |y| and Q(x, y): x = y. (x, y) ∈ {(1, 2), (2, −2), (6, 6)}.
(c) P(x, y): x² + y² = 1 and Q(x, y): x + y = 1. (x, y) ∈ {(1, −1), (−3, 4), (0, −1), (1, 0)}.

2.41. Determine all values of n in the domain S = {1, 2, 3} for which the following is a true statement: A necessary and sufficient condition for (n³ + n)/2 to be even is that (n² + n)/2 is odd.


2.42. Determine all values of n in the domain S = {2, 3, 4} for which the following is a true statement: The integer n(n − 1)/2 is odd if and only if n(n + 1)/2 is even.

2.43. Let S = {1, 2, 3}. Consider the following open sentences over the domain S:
P(n): (n + 4)(n + 5)/2 is odd.
Q(n): 2^(n−2) + 3^(n−2) + 6^(n−2) > (2.5)^(n−1).
Determine three distinct elements a, b, c in S such that P(a) ⇒ Q(a) is false, Q(b) ⇒ P(b) is false, and P(c) ⇔ Q(c) is true.

2.44. Let S = {1, 2, 3, 4}. Consider the following open sentences over the domain S:
P(n): n(n − 1)/2 is even.
Q(n): 2^(n−2) − (−2)^(n−2) is even.
R(n): 5^(n−1) + 2^n is prime.
Determine four distinct elements a, b, c, d in S such that (i) P(a) ⇒ Q(a) is false; (ii) Q(b) ⇒ P(b) is true; (iii) P(c) ⇔ R(c) is true; (iv) Q(d) ⇔ R(d) is false.

2.45. Let P(n): 2^n − 1 is a prime. and Q(n): n is a prime. be open sentences over the domain S = {2, 3, 4, 5, 6, 11}. Determine all values of n ∈ S for which P(n) ⇔ Q(n) is a true statement.

Section 2.7: Tautologies and Contradictions

2.46. For statements P and Q, show that P ⇒ (P ∨ Q) is a tautology.

2.47. For statements P and Q, show that (P ∧ (∼Q)) ∧ (P ∧ Q) is a contradiction.

2.48. For statements P and Q, show that (P ∧ (P ⇒ Q)) ⇒ Q is a tautology. Then state (P ∧ (P ⇒ Q)) ⇒ Q in words. (This is an important logical argument form, called modus ponens.)

2.49. For statements P, Q and R, show that ((P ⇒ Q) ∧ (Q ⇒ R)) ⇒ (P ⇒ R) is a tautology. Then state this compound statement in words. (This is another important logical argument form, called syllogism.)

2.50. Let R and S be compound statements involving the same component statements. If R is a tautology and S is a contradiction, then what can be said of the following? (a) R ∨ S (b) R ∧ S (c) R ⇒ S (d) S ⇒ R
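A truth table can also be generated mechanically, which gives a way to check work on exercises of this kind. The following is a minimal sketch in Python; the helper names implies, truth_values, is_tautology and is_contradiction are our own and do not come from the text. It evaluates a compound statement on every row of its truth table and tests whether the result is always true or always false.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of p => q: false only when p is true and q is false."""
    return (not p) or q


def truth_values(formula, num_vars: int) -> list:
    """Evaluate formula on every row of its truth table (rows start at T ... T)."""
    return [formula(*row) for row in product([True, False], repeat=num_vars)]


def is_tautology(formula, num_vars: int) -> bool:
    """A tautology is true for every combination of truth values."""
    return all(truth_values(formula, num_vars))


def is_contradiction(formula, num_vars: int) -> bool:
    """A contradiction is false for every combination of truth values."""
    return not any(truth_values(formula, num_vars))


# Illustrative statements (chosen here, not taken from the exercises).
excluded_middle = lambda p: p or (not p)   # P or (not P)
never_true = lambda p: p and (not p)       # P and (not P)

print(is_tautology(excluded_middle, 1))    # True
print(is_contradiction(never_true, 1))     # True
print(is_tautology(implies, 2))            # False: P => Q fails when P is T and Q is F
```

Such a check confirms a truth table but is no substitute for writing the table out, which is what the exercises ask for.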

Section 2.8: Logical Equivalence

2.51. For statements P and Q, the implication (∼P) ⇒ (∼Q) is called the inverse of the implication P ⇒ Q. (a) Use a truth table to show that these statements are not logically equivalent. (b) Find another implication that is logically equivalent to (∼P) ⇒ (∼Q) and verify your answer.

2.52. Let P and Q be statements. (a) Is ∼(P ∨ Q) logically equivalent to (∼P) ∨ (∼Q)? Explain. (b) What can you say about the biconditional ∼(P ∨ Q) ⇔ ((∼P) ∨ (∼Q))?

2.53. For statements P, Q and R, use a truth table to show that each of the following pairs of statements is logically equivalent. (a) (P ∧ Q) ⇔ P and P ⇒ Q. (b) P ⇒ (Q ∨ R) and (∼Q) ⇒ ((∼P) ∨ R).

2.54. For statements P and Q, show that (∼Q) ⇒ (P ∧ (∼P)) and Q are logically equivalent.

2.55. For statements P, Q and R, show that (P ∨ Q) ⇒ R and (P ⇒ R) ∧ (Q ⇒ R) are logically equivalent.
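Two compound statements are logically equivalent exactly when their truth tables agree in every row, so equivalence can be tested by comparing the two statements on all combinations of truth values. The sketch below assumes Python; the function name equivalent and the three sample statements are our own choices for illustration. It compares an implication with its contrapositive and with its converse.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of p => q."""
    return (not p) or q


def equivalent(f, g, num_vars: int) -> bool:
    """True when f and g have the same truth value on every row of the table."""
    return all(f(*row) == g(*row)
               for row in product([True, False], repeat=num_vars))


implication = lambda p, q: implies(p, q)             # P => Q
contrapositive = lambda p, q: implies(not q, not p)  # (not Q) => (not P)
converse = lambda p, q: implies(q, p)                # Q => P

print(equivalent(implication, contrapositive, 2))    # True
print(equivalent(implication, converse, 2))          # False
```

The second comparison fails because the two statements differ in the rows where exactly one of P and Q is true.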


2.56. Two compound statements S and T are composed of the same component statements P, Q and R. If S and T are not logically equivalent, then what can we conclude from this?

2.57. Five compound statements S1, S2, S3, S4 and S5 are all composed of the same component statements P and Q, and their truth tables have identical first and fourth rows. Show that at least two of these five statements are logically equivalent.

Section 2.9: Some Basic Properties of Logical Equivalence

2.58. Verify the following laws stated in Theorem 2.18: (a) Let P, Q and R be statements. Then P ∨ (Q ∧ R) and (P ∨ Q) ∧ (P ∨ R) are logically equivalent. (b) Let P and Q be statements. Then ∼(P ∨ Q) and (∼P) ∧ (∼Q) are logically equivalent.

2.59. Write negations of the following open sentences: (a) Either x = 0 or y = 0. (b) The integers a and b are both even.

2.60. Consider the implication: If x and y are even, then xy is even.
(a) State the implication using “only if”.
(b) State the inverse of the implication.
(c) State the implication as a disjunction (see Theorem 2.17).
(d) State the negation of the implication as a conjunction (see Theorem 2.21(a)).

2.61. For a real number x, let P(x): x² = 2. and Q(x): x = √2. State the negation of the biconditional P ⇔ Q in words (see Theorem 2.21(b)).

2.62. Let P and Q be statements. Show that [(P ∨ Q) ∧ ∼(P ∧ Q)] ≡ ∼(P ⇔ Q).

2.63. Let n ∈ Z. For which implication is its negation the following? The integer 3n + 4 is odd and 5n − 6 is even.

2.64. For which biconditional is its negation the following? n³ and 7n + 2 are odd or n³ and 7n + 2 are even.

Section 2.10: Quantified Statements

2.65. Let S denote the set of odd integers and let P(x): x² + 1 is even. and Q(x): x² is even. be open sentences over the domain S. State ∀x ∈ S, P(x) and ∃x ∈ S, Q(x) in words.

2.66. Define an open sentence R(x) over some domain S and then state ∀x ∈ S, R(x) and ∃x ∈ S, R(x) in words.

2.67. State the negations of the following quantified statements, where all sets are subsets of some universal set U: (a) For every set A, A ∩ A = ∅. (b) There exists a set A such that A ⊆ A.

2.68. State the negations of the following quantified statements: (a) For every rational number r, the number 1/r is rational. (b) There exists a rational number r such that r² = 2.


2.69. Let P(n): (5n − 6)/3 is an integer. be an open sentence over the domain Z. Determine, with explanations, whether the following statements are true: (a) ∀n ∈ Z, P(n). (b) ∃n ∈ Z, P(n).

2.70. Determine the truth value of each of the following statements.
(a) ∃x ∈ R, x² − x = 0.
(b) ∀n ∈ N, n + 1 ≥ 2.
(c) ∀x ∈ R, √(x²) = x.
(d) ∃x ∈ Q, 3x² − 27 = 0.
(e) ∃x ∈ R, ∃y ∈ R, x + y + 3 = 8.
(f) ∀x, y ∈ R, x + y + 3 = 8.
(g) ∃x, y ∈ R, x² + y² = 9.
(h) ∀x ∈ R, ∀y ∈ R, x² + y² = 9.

2.71. The statement For every integer m, either m ≤ 1 or m² ≥ 4. can be expressed using a quantifier as: ∀m ∈ Z, m ≤ 1 or m² ≥ 4. Do this for the two statements in (a) and (b).
(a) There exist integers a and b such that both ab < 0 and a + b > 0.
(b) For all real numbers x and y, x = y implies that x² + y² > 0.
(c) Express in words the negations of the statements in (a) and (b).
(d) Using quantifiers, express in symbols the negations of the statements in (a) and (b).

2.72. Let P(x) and Q(x) be open sentences where the domain of the variable x is S. Which of the following implies that (∼P(x)) ⇒ Q(x) is false for some x ∈ S?
(a) P(x) ∧ Q(x) is false for all x ∈ S.
(b) P(x) is true for all x ∈ S.
(c) Q(x) is true for all x ∈ S.
(d) P(x) ∨ Q(x) is false for some x ∈ S.
(e) P(x) ∧ (∼Q(x)) is false for all x ∈ S.

2.73. Let P(x) and Q(x) be open sentences where the domain of the variable x is T. Which of the following implies that P(x) ⇒ Q(x) is true for all x ∈ T?
(a) P(x) ∧ Q(x) is false for all x ∈ T.
(b) Q(x) is true for all x ∈ T.
(c) P(x) is false for all x ∈ T.
(d) P(x) ∧ (∼Q(x)) is true for some x ∈ T.
(e) P(x) is true for all x ∈ T.
(f) (∼P(x)) ∧ (∼Q(x)) is false for all x ∈ T.

2.74. Consider the open sentence P(x, y, z): (x − 1)² + (y − 2)² + (z − 2)² > 0. where the domain of each of the variables x, y and z is R.
(a) Express the quantified statement ∀x ∈ R, ∀y ∈ R, ∀z ∈ R, P(x, y, z) in words.
(b) Is the quantified statement in (a) true or false? Explain.
(c) Express the negation of the quantified statement in (a) in symbols.
(d) Express the negation of the quantified statement in (a) in words.
(e) Is the negation of the quantified statement in (a) true or false? Explain.

2.75. Consider the quantified statement For every s ∈ S and t ∈ S, st − 2 is prime. where the domain of the variables s and t is S = {3, 5, 11}.
(a) Express this quantified statement in symbols.
(b) Is the quantified statement in (a) true or false? Explain.
(c) Express the negation of the quantified statement in (a) in symbols.
(d) Express the negation of the quantified statement in (a) in words.
(e) Is the negation of the quantified statement in (a) true or false? Explain.

2.76. Let A be the set of circles in the plane with center (0, 0) and let B be the set of circles in the plane with center (1, 1). Furthermore, let P(C1, C2): C1 and C2 have exactly two points in common. be an open sentence where the domain of C1 is A and the domain of C2 is B.
(a) Express the following quantified statement in words:

∀C1 ∈ A, ∃C2 ∈ B, P(C1, C2).    (2.30)

(b) Express the negation of the quantified statement in (2.30) in symbols.
(c) Express the negation of the quantified statement in (2.30) in words.

2.77. For a triangle T, let r(T) denote the ratio of the length of the longest side of T to the length of the shortest side of T. Let A denote the set of all triangles and let P(T1, T2): r(T2) ≥ r(T1). be an open sentence where the domain of both T1 and T2 is A.
(a) Express the following quantified statement in words:

∃T1 ∈ A, ∀T2 ∈ A, P(T1, T2).    (2.31)

(b) Express the negation of the quantified statement in (2.31) in symbols.
(c) Express the negation of the quantified statement in (2.31) in words.

2.78. Consider the open sentence P(a, b): a/b < 1. where the domain of a is A = {2, 3, 5} and the domain of b is B = {2, 4, 6}. (a) State the quantified statement ∀a ∈ A, ∃b ∈ B, P(a, b) in words. (b) Show that the quantified statement in (a) is true.

2.79. Consider the open sentence Q(a, b): a − b < 0. where the domain of a is A = {3, 5, 8} and the domain of b is B = {3, 6, 10}. (a) State the quantified statement ∃b ∈ B, ∀a ∈ A, Q(a, b) in words. (b) Show that the quantified statement in (a) is true.
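When the domains are finite, a quantified statement can be evaluated by direct search: a universal quantifier corresponds to checking every element and an existential quantifier to finding at least one. The following Python sketch illustrates this with the built-in functions all and any. The domains A and B and the open sentence P(a, b) below are hypothetical, chosen only for illustration, and are not taken from the exercises.

```python
# Hypothetical finite domains and open sentence, chosen for illustration.
A = [1, 2, 3]
B = [1, 2, 3]


def P(a: int, b: int) -> bool:
    """P(a, b): a + b is even."""
    return (a + b) % 2 == 0


# For every a in A there exists b in B such that P(a, b).
forall_exists = all(any(P(a, b) for b in B) for a in A)

# There exists b in B such that for every a in A, P(a, b).
exists_forall = any(all(P(a, b) for a in A) for b in B)

# Negation of the first statement:
# there exists a in A such that for every b in B, P(a, b) is false.
negation = any(all(not P(a, b) for b in B) for a in A)

print(forall_exists)   # True
print(exists_forall)   # False
print(negation)        # False
```

The first two results differ, which shows that the order of the quantifiers matters, and the third is the opposite of the first, as the rule for negating quantified statements requires.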

Section 2.11: Characterizations of Statements

2.80. Give a definition of each of the following, and then state a characterization of each. (a) Two lines in the plane are perpendicular. (b) A rational number.

2.81. Define an integer n to be odd if n is not even. State a characterization of odd integers.

2.82. Define a triangle to be isosceles if it has two equal sides. Which of the following statements are characterizations of isosceles triangles? If a statement is not a characterization of isosceles triangles, then explain why.
(a) If a triangle is equilateral, then it is isosceles.
(b) A triangle T is isosceles if and only if T has two equal sides.
(c) If a triangle has two equal sides, then it is isosceles.
(d) A triangle T is isosceles if and only if T is equilateral.
(e) If a triangle has two equal angles, then it is isosceles.
(f) A triangle T is isosceles if and only if T has two equal angles.

2.83. By definition, a right triangle is a triangle one of whose angles is a right angle. Also, two angles in a triangle are complementary if the sum of their degrees is 90°. Which of the following statements are characterizations of a right triangle? If a statement is not a characterization of a right triangle, then explain why.
(a) A triangle is a right triangle if and only if two of its sides are perpendicular.
(b) A triangle is a right triangle if and only if it has two complementary angles.
(c) A triangle is a right triangle if and only if its area is half of the product of the lengths of some pair of its sides.
(d) A triangle is a right triangle if and only if the square of the length of its longest side equals the sum of the squares of the lengths of the two shortest sides.
(e) A triangle is a right triangle if and only if twice the area of the triangle equals the area of some rectangle.

2.84. Two distinct lines in the plane are defined to be parallel if they don't intersect. Which of the following is a characterization of parallel lines?
(a) Two distinct lines ℓ1 and ℓ2 are parallel if and only if every line ℓ3 that is perpendicular to ℓ1 is also perpendicular to ℓ2.
(b) Two distinct lines ℓ1 and ℓ2 are parallel if and only if every line distinct from ℓ1 and ℓ2 that doesn't intersect ℓ1 also doesn't intersect ℓ2.
(c) Two distinct lines ℓ1 and ℓ2 are parallel if and only if whenever a line ℓ intersects ℓ1 in an acute angle α, then ℓ also intersects ℓ2 in an acute angle α.
(d) Two distinct lines ℓ1 and ℓ2 are parallel if and only if whenever a point P is not on ℓ1, the point P is not on ℓ2.

ADDITIONAL EXERCISES FOR CHAPTER 2

2.85. Construct a truth table for P ∧ (Q ⇒ (∼P)).

2.86. Given that the implication (Q ∨ R) ⇒ (∼P) is false and Q is false, determine the truth values of R and P.

2.87. Find a compound statement involving the component statements P and Q that has the truth table given in Figure 2.18.

P    Q    ∼Q    ____
T    T    F     T
T    F    T     T
F    T    F     F
F    F    T     T

Figure 2.18 Truth table for Exercise 2.87.


2.88. Determine the truth value of each of the following quantified statements: (a) ∃x ∈ R, x³ + 2 = 0. (b) ∀n ∈ N, 2 ≥ 3 − n. (c) ∀x ∈ R, |x| = x. (d) ∃x ∈ Q, x⁴ − 4 = 0. (e) ∃x, y ∈ R, x + y = π. (f) ∀x, y ∈ R, x + y = x² + y².

2.89. Rewrite each of the following implications using (1) “only if” and (2) “sufficient”. (a) If a function f is differentiable, then f is continuous. (b) If x = −5, then x² = 25.

2.90. Let P(n): n² − n + 5 is a prime. be an open sentence over a domain S. (a) Determine the truth values of the quantified statements ∀n ∈ S, P(n) and ∃n ∈ S, ∼P(n) for S = {1, 2, 3, 4}. (b) Determine the truth values of the quantified statements ∀n ∈ S, P(n) and ∃n ∈ S, ∼P(n) for S = {1, 2, 3, 4, 5}. (c) How are the statements in (a) and (b) related?

2.91. (a) For statements P, Q and R, show that ((P ∧ Q) ⇒ R) ≡ ((P ∧ (∼R)) ⇒ (∼Q)). (b) For statements P, Q and R, show that ((P ∧ Q) ⇒ R) ≡ ((Q ∧ (∼R)) ⇒ (∼P)).

2.92. For a fixed integer n, use Exercise 2.91 to restate the following implication in two different ways: If n is a prime and n > 2, then n is odd.

2.93. For fixed integers m and n, use Exercise 2.91 to restate the following implication in two different ways: If m is even and n is odd, then m + n is odd.

2.94. For a real-valued function f and a real number x, use Exercise 2.91 to restate the following implication in two different ways: If f′(x) = 3x² − 2x and f(0) = 4, then f(x) = x³ − x² + 4.

2.95. For the set S = {1, 2, 3}, give an example of three open sentences P(n), Q(n) and R(n), each over the domain S, such that (1) each of P(n), Q(n) and R(n) is a true statement for exactly two elements of S, (2) all of the implications P(1) ⇒ Q(1), Q(2) ⇒ R(2) and R(3) ⇒ P(3) are true, and (3) the converse of each implication in (2) is false.

2.96. Do there exist a set S of cardinality 2 and a set {P(n), Q(n), R(n)} of three open sentences over the domain S such that (1) the implications P(a) ⇒ Q(a), Q(b) ⇒ R(b) and R(c) ⇒ P(c) are true, where a, b, c ∈ S, and (2) the converses of the implications in (1) are false? Necessarily, at least two of these elements a, b and c of S are equal.

2.97. Let A = {1, 2, ..., 6} and B = {1, 2, ..., 7}. For x ∈ A, let P(x): 7x + 4 is odd. For y ∈ B, let Q(y): 5y + 9 is odd. Let S = {(P(x), Q(y)) : x ∈ A, y ∈ B, P(x) ⇒ Q(y) is false}. What is |S|?

2.98. Let P(x, y, z) be an open sentence, where the domains of x, y and z are A, B and C, respectively. (a) State the quantified statement ∀x ∈ A, ∀y ∈ B, ∃z ∈ C, P(x, y, z) in words. (b) State the quantified statement ∀x ∈ A, ∀y ∈ B, ∃z ∈ C, P(x, y, z) in words for P(x, y, z): x = yz.


(c) Determine whether the quantified statement in (b) is true when A = {4, 8}, B = {2, 4} and C = {1, 2, 4}.

2.99. Let P(x, y, z) be an open sentence, where the domains of x, y and z are A, B and C, respectively. (a) Express the negation of ∀x ∈ A, ∀y ∈ B, ∃z ∈ C, P(x, y, z) in symbols. (b) Express ∼(∀x ∈ A, ∀y ∈ B, ∃z ∈ C, P(x, y, z)) in words. (c) Determine whether ∼(∀x ∈ A, ∀y ∈ B, ∃z ∈ C, P(x, y, z)) is true when P(x, y, z): x + z = y. for A = {1, 3}, B = {3, 5, 7} and C = {0, 2, 4, 6}.

2.100. Write each of the following using “if, then”.
(a) A sufficient condition for a triangle to be isosceles is that it has two equal angles.
(b) Let C be a circle of diameter √(2/π). Then the area of C is 1/2.
(c) The fourth power of every odd integer is odd.
(d) Suppose that the slope of a line ℓ is 2. Then an equation of ℓ is y = 2x + b for some real number b.
(e) Whenever a and b are nonzero rational numbers, a/b is a nonzero rational number.
(f) For every three integers, there exist two of them whose sum is even.
(g) A triangle is a right triangle if the sum of two of its angles is 90°.
(h) The number √3 is irrational.

3

Direct Proof and Proof by Contrapositive

We are now prepared to begin discussing our main topic: mathematical proofs. Initially, we will be concerned with one question: for a given true mathematical statement, how can we show that it is true? In this chapter you will be introduced to two important proof techniques.

A true mathematical statement whose truth is accepted without proof is referred to as an axiom. For example, an axiom of Euclid in geometry states that for every line ℓ and point P not on ℓ, there is a unique line containing P that is parallel to ℓ. A true mathematical statement whose truth can be verified is often referred to as a theorem, although many mathematicians reserve the word “theorem” for such statements that are especially significant or interesting. For example, the mathematical statement “2 + 3 = 5” is true, but few, if any, would consider it to be a theorem under this latter interpretation. In addition to the word “theorem”, other common terms for such statements include proposition, result, observation and fact, the choice often depending on the significance of the statement or the degree of difficulty of its proof. We will use the word “theorem” sparingly, however, primarily reserving it for true mathematical statements that will be used to verify other mathematical statements that we will encounter later. Otherwise, we will simply use the word “result”. For the most part, our results are examples used to illustrate proof techniques, and our goal is to prove these results.

A corollary is a mathematical result that can be deduced from, and is thereby a consequence of, some earlier result. A lemma is a mathematical result that is useful in establishing the truth of some other result. Some people like to think of a lemma as a “helping result”. Indeed, the German word for lemma is Hilfssatz, whose English translation is “helping sentence”. Ordinarily, then, a lemma is not of primary importance itself. Indeed, its very existence is due only to its usefulness in proving another (more interesting) result.

Most theorems (or results) are stated as implications. We now begin our study of proofs of such mathematical statements.


3.1 Trivial and Vacuous Proofs

In nearly all of the implications P ⇒ Q that we will encounter, P and Q are open sentences; that is, we will actually be considering P(x) ⇒ Q(x) or P(n) ⇒ Q(n) or some related implication, depending on which variable is being used. The variables x or n (or some other symbols) are used to represent elements of some set S being discussed; that is, S is the domain of the variable. As we have seen, for each value of a variable from its domain, a statement results. (It is possible, of course, that P and Q are expressed in terms of two or more variables.) Whether P(x) (or Q(x)) is true ordinarily depends on which element x ∈ S we are considering; that is, it is rarely the case that P(x) is true for all x ∈ S (or that P(x) is false for all x ∈ S). For example, for P(n): 3n² − 4n + 1 is even, where n ∈ Z, P(1) is a true statement while P(2) is a false statement. Likewise, it is seldom the case that Q(x) is true for all x ∈ S or that Q(x) is false for all x ∈ S.

When the quantified statement ∀x ∈ S, P(x) ⇒ Q(x) is expressed as a result or theorem, we often write such a statement as

For x ∈ S, if P(x) then Q(x).

or as

Let x ∈ S. If P(x), then Q(x).    (3.1)

Thus (3.1) is true if P(x) ⇒ Q(x) is a true statement for each x ∈ S, while (3.1) is false if P(x) ⇒ Q(x) is false for at least one element x ∈ S. For a given element x ∈ S, let's recall (see the truth table in Figure 3.1) the conditions under which P(x) ⇒ Q(x) has a particular truth value.

P(x)    Q(x)    P(x) ⇒ Q(x)
T       T       T
T       F       F
F       T       T
F       F       T

Figure 3.1 The truth table for the implication P(x) ⇒ Q(x) for an element x in its domain

Accordingly, if Q(x) is true for all x ∈ S or P(x) is false for all x ∈ S, then determining the truth or falseness of (3.1) becomes considerably easier. Indeed, if it can be shown that Q(x) is true for all x ∈ S (regardless of the truth value of P(x)), then, according to the truth table for the implication (shown in Figure 3.1), (3.1) is true. This constitutes a proof of (3.1) and is called a trivial proof. Accordingly, the statement

Let n ∈ Z. If n³ > 0, then 3 is odd.

is true, and a (trivial) proof consists only of observing that 3 is an odd integer. The following provides a more interesting example of a trivial proof.

Result 3.1 Let x ∈ R. If x < 0, then x² + 1 > 0.

Proof. Since x² ≥ 0 for each real number x, it follows that x² + 1 > x² ≥ 0. Hence x² + 1 > 0. ∎

Consider P(x): x < 0 and Q(x): x² + 1 > 0, where x ∈ R. Then Result 3.1 asserts the truth of: for all x ∈ R, P(x) ⇒ Q(x). Since we verified that Q(x) is true for every x ∈ R, it follows that P(x) ⇒ Q(x) is true for all x ∈ R, and so Result 3.1 is true. In this case, Q(x) is a true statement for each x in the domain R. It is this fact that allows us to give a trivial proof of Result 3.1.

The proof of Result 3.1 does not depend on x < 0. Indeed, provided that x ∈ R, we could have replaced “x < 0” by any hypothesis (including the more satisfying “x ∈ R”) and the result would still be true. In fact, this new result has the same proof. To be sure, it is rare indeed that a trivial proof is used to verify an implication; nevertheless, this is an important reminder of the truth table in Figure 3.1.

The symbol ∎ that occurs at the end of the proof of Result 3.1 indicates that the proof is complete. There are definite advantages to using ∎ (or some other symbol) to indicate the conclusion of a proof. First, as you start reading a proof, you can look ahead for this symbol to determine the length of the proof. Also, without this symbol, you may continue to read past the end of the proof, still thinking that you are reading a proof of the result. When you reach this symbol, you are supposed to be convinced that the result is true. If you are, this is good! Everything happened as planned. On the other hand, if you are not convinced, then, to you, the writer has not given a proof. This may not be the writer's fault, however.

In the past, the most common way to indicate that a proof had concluded was to write Q.E.D., which stands for the Latin phrase “quod erat demonstrandum”, whose English translation is “which was to be demonstrated”. Some still use it.

Let P(x) and Q(x) be open sentences over a domain S. Then ∀x ∈ S, P(x) ⇒ Q(x) is a true statement if it can be shown that P(x) is false for all x ∈ S (regardless of the truth value of Q(x)), according to the truth table for implication. Such a proof is called a vacuous proof of ∀x ∈ S, P(x) ⇒ Q(x). Therefore, the statement

Let n ∈ Z. If 3 is even, then n³ > 0.

is true. Let's take a look, however, at a more interesting example of a vacuous proof.

Result 3.2 Let x ∈ R. If x² − 2x + 2 ≤ 0, then x³ ≥ 8.

Proof. First observe that x² − 2x + 1 = (x − 1)² ≥ 0. Therefore x² − 2x + 2 = (x − 1)² + 1 ≥ 1 > 0. Thus x² − 2x + 2 ≤ 0 is false for all x ∈ R, and the implication is true. ∎
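The inequality at the heart of this proof can be spot-checked numerically. The sketch below, in Python, evaluates the hypothesis x² − 2x + 2 ≤ 0 on a sample of rational numbers using exact arithmetic; the sample grid and the function names hypothesis, conclusion and implies are assumptions made for this illustration. A finite sample proves nothing about all real numbers, of course; only the algebraic argument above does that.

```python
from fractions import Fraction


def hypothesis(x) -> bool:
    """P(x): x^2 - 2x + 2 <= 0."""
    return x * x - 2 * x + 2 <= 0


def conclusion(x) -> bool:
    """Q(x): x^3 >= 8."""
    return x ** 3 >= 8


def implies(p: bool, q: bool) -> bool:
    """Truth value of p => q."""
    return (not p) or q


# Hypothetical sample: the rationals k/10 for k = -100, ..., 100.
samples = [Fraction(k, 10) for k in range(-100, 101)]

# The hypothesis is never satisfied on the sample ...
print(any(hypothesis(x) for x in samples))                           # False
# ... so the implication holds at every sample point, whatever Q(x) is.
print(all(implies(hypothesis(x), conclusion(x)) for x in samples))   # True
# The smallest sampled value of x^2 - 2x + 2 occurs at x = 1.
print(min(x * x - 2 * x + 2 for x in samples))                       # 1
```

The minimum value 1 agrees with the identity x² − 2x + 2 = (x − 1)² + 1 used in the proof.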


For P(x): x² − 2x + 2 ≤ 0 and Q(x): x³ ≥ 8 over the domain R, Result 3.2 asserts the truth of ∀x ∈ R, P(x) ⇒ Q(x). Since we verified that P(x) is false for every x ∈ R, it follows that P(x) ⇒ Q(x) is true for each x ∈ R. Hence Result 3.2 is true. In this case, P(x) is a false statement for each x ∈ R. This is what permitted us to give a vacuous proof of Result 3.2. In the proof of Result 3.2, the truth or falseness of x³ ≥ 8 played no role whatsoever. Indeed, had we replaced x³ ≥ 8 by x³ ≤ 8, for example, then neither the truth nor the proof of Result 3.2 would be affected.

Whenever there is a vacuous proof of a result, we often say that the result follows vacuously. As we mentioned, a trivial proof is almost never encountered in mathematics; however, the same cannot be said of vacuous proofs, as we will see later. We consider one additional example.

Result 3.3 Let S = {n ∈ Z : n ≥ 2} and let n ∈ S. If 2n + 2/n < 5, then 4n² + 4/n² < 25.

Proof. First, we observe that if n = 2, then 2n + 2/n = 5. Of course, 5 < 5 is false. If n ≥ 3, then 2n + 2/n > 2n ≥ 6. So when n ≥ 3, 2n + 2/n < 5 is false as well. Thus 2n + 2/n < 5 is false for all n ∈ S. Hence the implication is true. ∎

In two of the examples that we presented to illustrate trivial and vacuous proofs, we used (and assumed known) the fact that 3 is odd. Furthermore, in the proofs of Results 3.1 and 3.2, we used the fact that if r is any real number, then r² ≥ 0. Although you are certainly familiar with this property of real numbers, it is essential that any facts used within a proof are known to and likely to be recalled by the reader. Facts used within a proof should not come as a surprise to the reader. This subject will be discussed in more detail shortly.

Even though trivial and vacuous proofs are rare in mathematics, they are important reminders of the truth table for implication. We are now prepared to be introduced to the first major proof technique in mathematics.
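The two shortcuts of this section can be summarized in a few lines of code. The following Python sketch reports, for a finite sample of a domain, whether the conclusion is true at every sample point (the situation behind a trivial proof), whether the hypothesis is false at every sample point (the situation behind a vacuous proof), or neither. The function name classify and the finite ranges standing in for Z and S are assumptions made for illustration, and a check on finitely many values is only a sanity test, not a proof.

```python
from fractions import Fraction


def classify(P, Q, domain) -> str:
    """Say which shortcut, if any, settles 'for all x, P(x) => Q(x)' on domain."""
    xs = list(domain)
    if all(Q(x) for x in xs):
        return "trivial"       # conclusion true for every x
    if not any(P(x) for x in xs):
        return "vacuous"       # hypothesis false for every x
    return "neither"


integers = range(-5, 6)        # hypothetical finite stand-in for Z

# "Let n be an integer. If n^3 > 0, then 3 is odd."
print(classify(lambda n: n ** 3 > 0, lambda n: 3 % 2 == 1, integers))   # trivial

# "Let n be an integer. If 3 is even, then n^3 > 0."
print(classify(lambda n: 3 % 2 == 0, lambda n: n ** 3 > 0, integers))   # vacuous

# Result 3.3, checked exactly for n = 2, ..., 50.
print(classify(lambda n: 2 * n + Fraction(2, n) < 5,
               lambda n: 4 * n * n + Fraction(4, n * n) < 25,
               range(2, 51)))                                           # vacuous

# An implication settled by neither shortcut: "If n is even, then n > 0."
print(classify(lambda n: n % 2 == 0, lambda n: n > 0, integers))        # neither
```

An implication in the last category may be true or false; deciding which requires an actual argument or a counterexample.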

3.2 Direct Proofs

Typically, when we are discussing an implication P(x) ⇒ Q(x) over a domain S, there is some connection between P(x) and Q(x). That is, the truth value of Q(x) for a particular x ∈ S often depends on the truth value of P(x) for that same element x, or the truth value of P(x) depends on the truth value of Q(x). These are the kinds of implications in which we are primarily interested, and it is the proofs of these types of results that will occupy much of our attention. We begin with the first major proof technique, which occurs more often in mathematics than any other technique.

Let P(x) and Q(x) be open sentences over a domain S. Suppose that our goal is to show that P(x) ⇒ Q(x) is true for every x ∈ S; that is, our goal is to show that the quantified statement ∀x ∈ S, P(x) ⇒ Q(x) is true. If P(x) is false for some x ∈ S, then P(x) ⇒ Q(x) is true for this element x. Hence we need only be concerned with showing that P(x) ⇒ Q(x) is true for all x ∈ S for which P(x) is true. In a direct proof of P(x) ⇒ Q(x) for all x ∈ S, we consider an arbitrary element x ∈ S for which P(x) is


true and show that Q(x) is true for this element x. To summarize, to give a direct proof of P(x) ⇒ Q(x) for all x ∈ S, we assume that P(x) is true for an arbitrary element x ∈ S and show that Q(x) must be true as well for this element x.

In order to illustrate this type of proof (and others as well), we need to deal with mathematical topics with which we are all familiar. Let us first consider the integers and some of their elementary properties. We assume that you are familiar with the integers and the following properties of integers:

1. The negative of every integer is an integer.
2. The sum (and difference) of every two integers is an integer.
3. The product of every two integers is an integer.

We will agree that we can use any of these properties. No justification is required or expected. Initially, we will use even and odd integers to illustrate our proof techniques. In this case, however, any properties of even and odd integers must be verified before they can be used. For example, you probably know that the sum of every two even integers is even, but this must first be proved before it can be used.

We need to lay some groundwork before any examples of direct proofs are given. Since we will be working with even and odd integers, it is essential that we have precise definitions of these kinds of numbers. An integer n is defined to be even if n = 2k for some integer k. For example, 10 is even since 10 = 2 · 5 (where, of course, 5 is an integer). Also, −14 = 2(−7) is even, as is 0 = 2 · 0. The integer 17 is not even since there is no integer k for which 17 = 2k. Thus we see that the set of all even integers is the set

S = {2k : k ∈ Z} = {· · · , −4, −2, 0, 2, 4, · · ·}.

We could define an integer n to be odd if it is not even, but it would be difficult to work with this definition. Instead, we define an integer n to be odd if n = 2k + 1 for some integer k. Now 17 is odd since 17 = 2 · 8 + 1. Also, −5 is odd because −5 = 2(−3) + 1. On the other hand, 26 is not odd since there is no integer k such that 26 = 2k + 1. In fact, 26 is even.
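The two definitions above are existential: an integer is even (or odd) when a suitable integer k exists. A minimal Python sketch of this idea follows; the helper names `even_witness` and `odd_witness` are hypothetical and simply return the integer k required by the definition, or `None` when no such k exists.

```python
from typing import Optional

def even_witness(n: int) -> Optional[int]:
    """Return the integer k with n == 2*k, or None if no such k exists."""
    k = n // 2
    return k if n == 2 * k else None

def odd_witness(n: int) -> Optional[int]:
    """Return the integer k with n == 2*k + 1, or None if no such k exists."""
    k = (n - 1) // 2
    return k if n == 2 * k + 1 else None

# The examples from the text: 10 = 2*5, -14 = 2*(-7), 0 = 2*0,
# 17 = 2*8 + 1 and -5 = 2*(-3) + 1.
examples = {
    10: even_witness(10),
    -14: even_witness(-14),
    0: even_witness(0),
    17: odd_witness(17),
    -5: odd_witness(-5),
}
```

Note that `even_witness(0)` returns 0, which is a valid witness, so results should be compared with `None` rather than tested for truthiness.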
By the definition of an odd integer just given, we see that the set of all odd integers is precisely the set

T = {2k + 1 : k ∈ Z} = {· · · , −5, −3, −1, 1, 3, 5, · · ·}.

Observe that S and T are disjoint sets and S ∪ T = Z; that is, Z is partitioned into S and T. Therefore, every integer is either even or odd.

From time to time, we will find ourselves in a position where we have a result to prove and it may not be entirely clear how to proceed. In such a case, we need to consider our options and develop a plan, which we refer to as a proof strategy. The idea is to discuss a proof strategy for the result and, from it, construct a proof. At other times, we may wish to reflect on a proof that we have just given in order to understand it better. Such a discussion will be referred to as a proof analysis. As with examples, we mark the end of a proof strategy or proof analysis with a symbol.

We are now prepared to illustrate the direct proof technique. We follow the proof with a proof analysis.

Result 3.4

If n is an odd integer, then 3n + 7 is an even integer.

Proof

Suppose n is an odd integer. Since n is odd, we can write n = 2k + 1 for an integer k. Now 3n + 7 = 3(2k + 1) + 7 = 6k + 3 + 7 = 6k + 10 = 2(3k + 5). Since 3k + 5 is an integer, 3n + 7 is even.
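The key step of the proof is exhibiting the integer 3k + 5 with 3n + 7 = 2(3k + 5). The sketch below recomputes that integer for a range of odd n and confirms the identity; it is an illustrative check over a finite range rather than a substitute for the proof, and the function name is an assumption.

```python
def result_3_4_witness(n: int) -> int:
    """For odd n = 2k + 1, return the integer 3k + 5 used in the proof."""
    if n % 2 != 1:
        raise ValueError("n must be odd")
    k = (n - 1) // 2          # n = 2k + 1
    return 3 * k + 5

# Odd integers from -99 to 99; the range is an arbitrary choice.
odd_sample = range(-99, 100, 2)
identity_holds = all(3 * n + 7 == 2 * result_3_4_witness(n) for n in odd_sample)
```

For example, n = 1 gives k = 0 and 3n + 7 = 10 = 2 · 5.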

PROOF ANALYSIS

First, notice that Result 3.4 could have been stated as

For every odd integer n, the integer 3n + 7 is even.

or

Let n be an odd integer. Then 3n + 7 is even.

Thus the domain of the variable n in Result 3.4 is the set of odd integers. In the proof of Result 3.4, the expression 2k + 1 was substituted for n in 3n + 7 and simplified to 6k + 10. Since our goal was to show that 3n + 7 is even, we needed to show that 3n + 7 can be expressed as twice an integer. So we factored 2 out of 6k + 10 and wrote it as 2(3k + 5). Since 3 and k are integers, so is 3k (the product of two integers is an integer). Since 3k and 5 are integers, so is 3k + 5 (the sum of two integers is an integer). Therefore, 3n + 7 satisfies the definition of an even integer.

One other remark deserves mention here. In the second sentence of the proof, we wrote: Since n is odd, we can write n = 2k + 1 for some integer k. It would be incorrect to write "If n is odd" rather than "Since n is odd" because we have already assumed that n is odd, and therefore n is now known to be odd.

We defined an integer n to be odd if we can write n as 2k + 1 for some integer k. This means that whenever we want to show that an integer, say m, is odd, we must follow this definition; that is, we must show that m = 2k + 1 for some integer k. (Of course, the use of the symbol k is not important. For example, an odd integer n can be written as n = 2ℓ + 1 for some integer ℓ.) We could have defined an integer n to be odd if we can write n = 2k − 1 for some integer k, but we didn't. However, if we could prove that an integer n is odd if and only if n can be expressed as 2k − 1 for some integer k, then we could use this characterization of odd integers to show that an integer is odd. This would require additional work on our part, however, with no obvious benefit. Similarly, we could have defined an integer n to be even if we can write n = 2k + 2, or n = 2k − 2, or perhaps n = 2k + 100 for some integer k. The definitions of even and odd integers that we chose are probably the most commonly used. Any other definitions that could have been used would provide us with no special advantage.

The proof of Result 3.4 is an example of a direct proof. Let

Q(n): 3n + 7 is an even integer.

over the domain of odd integers. We verified Result 3.4 by assuming that n is an arbitrary odd integer and then showing that Q(n) is true for this element n. Showing that Q(n) is true essentially required one step on our part. As we venture further into proofs, we will see that we cannot always establish the truth of the desired conclusion so quickly.


It may be necessary to establish the truth of some other mathematical statements along the way, which can then be used to establish the truth of Q(n). We will see examples of this later.

Let's consider another example. For variety, we use an alternative opening sentence and different symbols in the proof of the following result.

Result 3.5

If n is an even integer, then −5n − 3 is an odd integer.

Proof

Let n be an even integer. Then n = 2x, where x is an integer. Therefore, −5n − 3 = −5(2x) − 3 = −10x − 3 = −10x − 4 + 1 = 2(−5x − 2) + 1. Since −5x − 2 is an integer, −5n − 3 is an odd integer.

We now consider another example, which may have a surprising ending.

Result 3.6

If n is an odd integer, then 4n^3 + 2n − 1 is odd.

Proof

Assume that n is odd. Then n = 2y + 1 for some integer y. Therefore,

4n^3 + 2n − 1 = 4(2y + 1)^3 + 2(2y + 1) − 1
             = 4(8y^3 + 12y^2 + 6y + 1) + 4y + 2 − 1
             = 32y^3 + 48y^2 + 28y + 5
             = 2(16y^3 + 24y^2 + 14y + 2) + 1.

Since 16y^3 + 24y^2 + 14y + 2 is an integer, 4n^3 + 2n − 1 is odd.

PROOF ANALYSIS

Although the direct proof of Result 3.6 that we gave is correct, it is not the most desirable proof. Indeed, had we observed that

4n^3 + 2n − 1 = 4n^3 + 2n − 2 + 1 = 2(2n^3 + n − 1) + 1

and that 2n^3 + n − 1 ∈ Z, we could have concluded immediately that 4n^3 + 2n − 1 is odd for every integer n. Hence a trivial proof of Result 3.6 can be given and is, in fact, preferable. The fact that 4n^3 + 2n − 1 is odd does not depend on n being odd. Indeed, it would be far better to replace the statement of Result 3.6 by

If n is an integer, then 4n^3 + 2n − 1 is odd.
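The observation in this analysis is that the factorization 4n^3 + 2n − 1 = 2(2n^3 + n − 1) + 1 holds for every integer n, even or odd. A short illustrative check over a finite range (the function name and range are assumptions chosen for the example):

```python
def half_of_even_part(n: int) -> int:
    """The integer 2n^3 + n - 1 appearing in 4n^3 + 2n - 1 = 2(2n^3 + n - 1) + 1."""
    return 2 * n**3 + n - 1

# Both even and odd n are included: the identity does not depend on parity.
identity_holds = all(
    4 * n**3 + 2 * n - 1 == 2 * half_of_even_part(n) + 1
    for n in range(-50, 51)
)
```

Because the identity is purely algebraic, no case distinction on the parity of n is needed, which is why the hypothesis "n is odd" plays no role.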

We now give an example of a somewhat different type.

Result 3.7

Let S = {1, 2, 3} and let n ∈ S. If n(n + 3)/2 is even, then (n + 2)(n − 5)/2 is even.

Proof

Let n ∈ S such that n(n + 3)/2 is even. Since n(n + 3)/2 = 2 when n = 1, n(n + 3)/2 = 5 when n = 2, and n(n + 3)/2 = 9 when n = 3, it follows that n = 1. When n = 1, (n + 2)(n − 5)/2 = −6, which is even. Hence the implication is true.
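Because the domain S has only three elements, the proof above amounts to an exhaustive check. The following sketch mirrors it: first find the elements of S that satisfy the hypothesis, then test the conclusion only for those. The names `p` and `q` are illustrative stand-ins for the two open sentences.

```python
S = [1, 2, 3]

def p(n: int) -> bool:
    """Hypothesis: n(n + 3)/2 is even. The product n(n + 3) is always even."""
    return (n * (n + 3) // 2) % 2 == 0

def q(n: int) -> bool:
    """Conclusion: (n + 2)(n - 5)/2 is even. The product is always even."""
    return ((n + 2) * (n - 5) // 2) % 2 == 0

satisfying = [n for n in S if p(n)]            # elements we must consider
result_holds = all(q(n) for n in satisfying)   # direct proof on those elements
```

Only the elements in `satisfying` matter; for the others the implication is true because its hypothesis is false.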

PROOF ANALYSIS

The proof of Result 3.7 is only concerned with the elements n ∈ S for which n(n + 3)/2 is even. In addition, it is initially unclear for which elements n of S the integer n(n + 3)/2 is even. Since S consists of only three elements, this can be determined


rather quickly, which is what we did. We saw that only n = 1 has the desired property, and this is the only element we need to consider.

If our goal is to establish the truth of P(x) ⇒ Q(x) for all x in a domain S by means of a direct proof, then the proof begins by assuming that P(x) is true for an arbitrary element x ∈ S. It is often common in this situation, however, to omit the initial assumption that P(x) is true for an arbitrary element x ∈ S. It is then understood that we are giving a direct proof. We illustrate this with a short example.

Result 3.8

If n is an even integer, then 3n^5 is an even integer.

Proof

Since n is an even integer, n = 2x for some integer x. Therefore, 3n^5 = 3(2x)^5 = 3(32x^5) = 96x^5 = 2(48x^5). Since 48x^5 ∈ Z, the integer 3n^5 is even.

For the present, when we give a direct proof of P(x) ⇒ Q(x) for all x in a domain S, we will often include the initial assumption that P(x) is true for an arbitrary element x ∈ S in order to solidify this technique in your mind.

3.3 Proof by Contrapositive

For statements P and Q, the contrapositive of the implication P ⇒ Q is the implication (∼Q) ⇒ (∼P). For example, for P1: 3 is odd and P2: 57 is prime, the contrapositive of the implication

P1 ⇒ P2: If 3 is odd, then 57 is prime.

is the implication

(∼P2) ⇒ (∼P1): If 57 is not prime, then 3 is even.

The most important feature of the contrapositive (∼Q) ⇒ (∼P) is that it is logically equivalent to P ⇒ Q. This fact is stated formally as a theorem and is verified in the truth table shown in Figure 3.2.

Theorem 3.9

For every two statements P and Q, the implication P ⇒ Q and its contrapositive are logically equivalent; that is, P ⇒ Q ≡ (∼Q) ⇒ (∼P).

P   Q   P ⇒ Q   ∼P   ∼Q   (∼Q) ⇒ (∼P)
T   T     T     F    F         T
T   F     F     F    T         F
F   T     T     T    F         T
F   F     T     T    T         T

Figure 3.2 The logical equivalence of an implication and its contrapositive

Let P(x): x = 2 and Q(x): x^2 = 4, where x ∈ R. The contrapositive of the implication

P(x) ⇒ Q(x): If x = 2, then x^2 = 4.

is the implication

(∼Q(x)) ⇒ (∼P(x)): If x^2 ≠ 4, then x ≠ 2.

Suppose that we wish to prove a result (or theorem) that is expressed as

Let x ∈ S. If P(x), then Q(x).    (3.2)

or as

For all x ∈ S, if P(x), then Q(x).    (3.3)
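The logical equivalence asserted in Theorem 3.9 can be confirmed mechanically by enumerating all four truth assignments, exactly as the truth table of Figure 3.2 does. The sketch below is one way to do this; the helper `implies` is an illustrative name for the usual truth-functional implication.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of P => Q: false only when P is true and Q is false."""
    return (not p) or q

# One row per truth assignment: (P, Q, P => Q, (~Q) => (~P)).
rows = [
    (p, q, implies(p, q), implies(not q, not p))
    for p, q in product([True, False], repeat=2)
]

# The implication and its contrapositive agree in every row.
equivalent = all(direct == contra for _, _, direct, contra in rows)
```

The rows are generated in the order TT, TF, FT, FF, matching the usual layout of a truth table.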

We have seen that a proof of such a result consists of establishing the truth of the implication P(x) ⇒ Q(x) for all x ∈ S. If it can be shown that (∼Q(x)) ⇒ (∼P(x)) is true for all x ∈ S, then P(x) ⇒ Q(x) is true for all x ∈ S. A proof by contrapositive of the result (3.2) (or of (3.3)) is a direct proof of its contrapositive:

Let x ∈ S. If ∼Q(x), then ∼P(x).

or

For all x ∈ S, if ∼Q(x), then ∼P(x).

Thus, to give a proof by contrapositive of (3.2) (or of (3.3)), we assume that ∼Q(x) is true for an arbitrary element x ∈ S and show that ∼P(x) is true for this element x.

There are certain types of results for which a proof by contrapositive is preferable or perhaps even essential. We now give some examples to illustrate this method of proof.

Result 3.10

Let x ∈ Z. If 5x − 7 is even, then x is odd.

Proof

Assume that x is even. Then x = 2a for some integer a. So 5x − 7 = 5(2a) − 7 = 10a − 7 = 10a − 8 + 1 = 2(5a − 4) + 1. Since 5a − 4 ∈ Z, the integer 5x − 7 is odd.
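The proof above establishes the contrapositive "if x is even, then 5x − 7 is odd" and thereby the original implication. As an illustration on a finite range of integers (an assumption made for the example, not a proof), the sketch below checks both forms and the algebraic identity 5(2a) − 7 = 2(5a − 4) + 1 used in the proof.

```python
def is_even(n: int) -> bool:
    return n % 2 == 0

def is_odd(n: int) -> bool:
    return n % 2 == 1

xs = range(-100, 101)

# Contrapositive: if x is even, then 5x - 7 is odd.
contrapositive_holds = all(is_odd(5 * x - 7) for x in xs if is_even(x))

# Original statement: if 5x - 7 is even, then x is odd.
original_holds = all(is_odd(x) for x in xs if is_even(5 * x - 7))

# The witness from the proof: for x = 2a, 5x - 7 = 2(5a - 4) + 1.
witness_ok = all(5 * (2 * a) - 7 == 2 * (5 * a - 4) + 1 for a in range(-50, 51))
```

The two checks necessarily agree on any range, since an implication and its contrapositive are logically equivalent.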

PROOF ANALYSIS

Some comments are now in order. The goal of Result 3.10 was to prove P(x) ⇒ Q(x) for all x ∈ Z, where P(x): 5x − 7 is even and Q(x): x is odd. Since we chose to give a proof by contrapositive, we gave a direct proof of (∼Q(x)) ⇒ (∼P(x)) for all x ∈ Z. Hence the proof began by assuming that x is not odd; that is, x is even. The goal then was to show that 5x − 7 is odd.

If we had attempted to prove Result 3.10 with a direct proof, we would have begun by assuming that 5x − 7 is even for an arbitrary integer x. Then 5x − 7 = 2a for some integer a. So x = (2a + 7)/5. We then want to show that x is odd. With the expression we have for x, it is not even clear that x is an integer, let alone that x is an odd integer, although, of course, we were told in the statement of Result 3.10 that the domain of x is the set of integers. Therefore, not only does a proof by contrapositive provide us with a rather simple method of proving Result 3.10, it may not be immediately clear how or whether a direct proof can be used.

How did we know beforehand that a proof by contrapositive is the technique to use here? This is not as difficult as it may appear. If we use a direct proof, then we begin by assuming that 5x − 7 is even for an arbitrary integer x; if, on the other hand, we use a proof by contrapositive, then we begin by assuming that x is even. Therefore, using a proof by contrapositive allows us to work with x initially rather than with the more complicated expression 5x − 7.

In all of the examples that we have seen so far, we have considered only implications. Now we consider a biconditional.

Result 3.11

Let x ∈ Z. Then 11x − 7 is even if and only if x is odd.

Proof

There are two implications to prove here, namely, (1) if x is odd, then 11x − 7 is even, and (2) if 11x − 7 is even, then x is odd. We begin with (1). In this case, a direct proof is appropriate. Assume that x is odd. Then x = 2r + 1, where r ∈ Z. So 11x − 7 = 11(2r + 1) − 7 = 22r + 11 − 7 = 22r + 4 = 2(11r + 2). Since 11r + 2 is an integer, 11x − 7 is even.

We now prove (2), which is the converse of (1). We use a proof by contrapositive here. Assume that x is even. Then x = 2s, where s ∈ Z. Therefore, 11x − 7 = 11(2s) − 7 = 22s − 7 = 22s − 8 + 1 = 2(11s − 4) + 1. Since 11s − 4 is an integer, 11x − 7 is odd.

A comment concerning the statements of Results 3.10 and 3.11 bears repeating. These results begin with the sentence: Let x ∈ Z. This, of course, tells us that the domain in this case is Z. That is, we are being told that x represents an integer. We need not state this assumption in the proof. The sentence "Let x ∈ Z." is commonly called an overriding assumption or hypothesis, and so x is assumed to be an integer in the proofs of Results 3.10 and 3.11.

In the proof of Result 3.11, we discussed our plan of attack. That is, we stated that there were two implications to prove and we specifically stated each. Ordinarily we do not include such information within a proof unless the proof is quite long, in which case a roadmap indicating the steps we plan to take may be helpful. We give an additional example of this type, where this time a more conventional, condensed proof is presented. The following example will be useful to us in the future, so we refer to it as a theorem.

Theorem 3.12

Let x ∈ Z. Then x^2 is even if and only if x is even.

Proof

Assume that x is even. Then x = 2a for some integer a. Therefore, x^2 = (2a)^2 = 4a^2 = 2(2a^2). Because 2a^2 ∈ Z, the integer x^2 is even.

For the converse, assume that x is odd. Then x = 2b + 1, where b ∈ Z. So x^2 = (2b + 1)^2 = 4b^2 + 4b + 1 = 2(2b^2 + 2b) + 1. Since 2b^2 + 2b is an integer, x^2 is odd.

Suppose now that you were asked to prove the following result:

Let x ∈ Z. Then x^2 is odd if and only if x is odd.    (3.4)

How would you do this? You might think of proving the implication "If x is odd, then x^2 is odd." by a direct proof and its converse "If x^2 is odd, then x is odd." by a proof by contrapositive, where, of course, the domain of x is Z. If we look at what is happening here, we see that we are duplicating the proof of Theorem 3.12. This is not a surprise. Theorem 3.12 states that if x is even, then x^2 is even; and if x^2 is even, then x is even. The contrapositive of the first implication is "If x^2 is odd, then x is odd," while the contrapositive of the second implication is "If x is odd, then x^2 is odd." In other words, (3.4) simply states Theorem 3.12 in terms of contrapositives. Thus (3.4) requires no proof at all. It is essentially a restatement of Theorem 3.12.

While we are on the subject of restatements of Theorem 3.12, we should recognize that this theorem can be restated in other ways. For example, we could restate

If x is an even integer, then x^2 is even.

as

The square of every even integer is even.

Hence Theorem 3.12 can be stated as: An integer is even if and only if its square is even. Not only is it useful to state results in different ways at times, it is also important to recognize what a result says no matter how it is stated.

It is useful at this point to pause and discuss how theorems (or results) can be used and why we might be interested in proving a particular theorem. First, it is only by giving a proof of a statement that we know for certain that the statement is true and can therefore be called a theorem. A major reason mathematicians want to give a proof of a mathematical statement is that they find it challenging; that is what mathematicians do. This raises a question that many mathematicians consider very important. Where do such statements come from? The answer, of course, is that they come from mathematicians or students. How these people come up with such questions follows no set rule. But this concerns the creative aspect of mathematics.


People are curious and imaginative. Perhaps while proving some theorem, a person realizes that the method of proof used could be applied to prove something even more interesting. (Of course, what is interesting is quite subjective.) More likely, though, a person has observed a relationship that exists in an example being considered and appears to occur in a more general setting. That person then attempts to show that this is the case by giving a proof. This whole process involves the idea of conjectures (guesses) and trying to show that a conjecture is true. We will discuss this at greater length later.

Suppose that we have succeeded in proving (by some method) P(x) ⇒ Q(x) for all x in a domain S. We therefore know that for every x ∈ S for which the statement P(x) is true, the statement Q(x) is true. Also, for every x ∈ S for which the statement Q(x) is false, the statement P(x) is false. For example, since we know that Result 3.10 is true, if we should ever encounter an integer n for which 5n − 7 is even, then we know that n is odd. Moreover, if we should encounter an integer n for which n^2 is odd, then we can conclude from (3.4), or better yet from Theorem 3.12, that n itself must be odd. It is not only knowing that a certain theorem might be useful to us in the future; it may be that a theorem seems surprising, interesting, or even beautiful. (Yes, to mathematicians, and hopefully to you as well, a theorem can be beautiful.)

We next describe a type of result that we have not yet encountered. Consider the following result, which we would like to prove.

Result to Prove

Let x ∈ Z. If 5x − 7 is odd, then 9x + 2 is even.

PROOF STRATEGY

This result does not seem to fit into the kinds of results that we have been proving. (This is not unusual. After we have learned how to prove certain statements, we encounter new statements that force us to think.) If we attempt to give a direct proof or a proof by contrapositive of this result, we may run into difficulties. There is, however, another approach. Although we must be very careful about what we are assuming, from what we know about even and odd integers, it appears that if 5x − 7 is odd, then x must be even. If we knew that whenever 5x − 7 is odd, then x is even, this fact would be extremely helpful. We illustrate this next. Keep in mind that our goal is to prove the following result, which we will refer to as Result 3.14:

Let x ∈ Z. If 5x − 7 is odd, then 9x + 2 is even.

The (unusual) numbering of this result is because we will first state and prove a lemma (Lemma 3.13) that will help us to prove Result 3.14. To verify the truth of Result 3.14, we first prove the following lemma.

Lemma 3.13

Let x ∈ Z. If 5x − 7 is odd, then x is even.

Proof

Assume that x is odd. Then x = 2y + 1, where y ∈ Z. Therefore, 5x − 7 = 5(2y + 1) − 7 = 10y − 2 = 2(5y − 1). Since 5y − 1 is an integer, 5x − 7 is even.

We are now prepared to give a proof of Result 3.14.

Result 3.14

Let x ∈ Z. If 5x − 7 is odd, then 9x + 2 is even.

Proof

Let 5x − 7 be an odd integer. By Lemma 3.13, the integer x is even. Since x is even, x = 2z for some integer z. Thus 9x + 2 = 9(2z) + 2 = 18z + 2 = 2(9z + 1). Because 9z + 1 is an integer, 9x + 2 is even.

So, with the aid of Lemma 3.13, we have produced a fairly straightforward (and, hopefully, easy to follow) proof of Result 3.14. The main reason for presenting Result 3.14 was to show how helpful a lemma can be in producing a proof of another result. Having said this, however, we now show how to prove Result 3.14 without the aid of a lemma, by performing a bit of algebraic manipulation.

Alternative Proof of Result 3.14

Assume that 5x − 7 is odd. Then 5x − 7 = 2n + 1 for some integer n. Observe that

9x + 2 = (5x − 7) + (4x + 9) = 2n + 1 + 4x + 9 = 2n + 4x + 10 = 2(n + 2x + 5).

Because n + 2x + 5 is an integer, 9x + 2 is even.

You may prefer one proof of Result 3.14 over the other. Whether you do or not, it is important to know that two different methods can be used. These methods may prove useful for future results that you encounter. Also, you might think that we used a trick to give the second proof of Result 3.14; but, as we will see, if the same "trick" can be used often, then it becomes a technique.
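Both proofs of Result 3.14 can be traced numerically. The sketch below, which checks a finite range of integers chosen only for illustration, confirms the conclusion of Lemma 3.13 and the identity 9x + 2 = 2(n + 2x + 5) from the alternative proof. The helper `n_for` is a hypothetical name for the integer n with 5x − 7 = 2n + 1.

```python
def n_for(x: int) -> int:
    """For 5x - 7 odd, return the integer n with 5x - 7 = 2n + 1."""
    value = 5 * x - 7
    if value % 2 != 1:
        raise ValueError("5x - 7 must be odd")
    return (value - 1) // 2

# Integers x for which the hypothesis "5x - 7 is odd" holds.
xs = [x for x in range(-100, 101) if (5 * x - 7) % 2 == 1]

# Lemma 3.13: every such x is even.
lemma_holds = all(x % 2 == 0 for x in xs)

# Alternative proof: 9x + 2 = (5x - 7) + (4x + 9) = 2(n + 2x + 5).
identity_holds = all(9 * x + 2 == 2 * (n_for(x) + 2 * x + 5) for x in xs)
```

For instance, x = 2 gives 5x − 7 = 3, so n = 1 and 9x + 2 = 20 = 2(1 + 4 + 5).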

3.4 Proof by Cases

While attempting to give a proof of a mathematical statement concerning an element x in some set S, it is sometimes useful to observe that x possesses one of two or more properties. A common property that x may possess is that of belonging to a particular subset of S. If we can verify the truth of the statement for each property that x may have, then we have a proof of the statement. Such a proof is then divided into parts called cases, one case for each property that x may possess or for each subset to which x may belong. This method is called proof by cases. Indeed, it may be useful in a proof by cases to further divide a case into other cases, called subcases.

For example, in a proof of ∀n ∈ Z, R(n), it might be convenient to use a proof by cases in which the proof is divided into the two cases

Case 1. n is even. and Case 2. n is odd.

Other possible proofs by cases might involve proving ∀x ∈ R, P(x) using the cases

Case 1. x = 0, Case 2. x < 0, and Case 3. x > 0.


Furthermore, we might attempt to prove ∀n ∈ N, P(n) using the cases

Case 1. n = 1 and Case 2. n ≥ 2.

Also, for S = Z − {0}, we might attempt to prove ∀x, y ∈ S, P(x, y) using the cases

Case 1. xy > 0 and Case 2. xy < 0.

Case 1 could, in fact, be divided into two subcases:

Subcase 1.1. x > 0 and y > 0. and Subcase 1.2. x < 0 and y < 0,

while Case 2 could be divided into two subcases:

Subcase 2.1. x > 0 and y < 0. and Subcase 2.2. x < 0 and y > 0.

Let's look at an example of a proof by cases.

Result 3.15

If n ∈ Z, then n^2 + 3n + 5 is an odd integer.

Proof

We proceed by cases, according to whether n is even or odd.

Case 1. n is even. Then n = 2x for some x ∈ Z. So n^2 + 3n + 5 = (2x)^2 + 3(2x) + 5 = 4x^2 + 6x + 5 = 2(2x^2 + 3x + 2) + 1. Since 2x^2 + 3x + 2 ∈ Z, the integer n^2 + 3n + 5 is odd.

Case 2. n is odd. Then n = 2y + 1, where y ∈ Z. Thus n^2 + 3n + 5 = (2y + 1)^2 + 3(2y + 1) + 5 = 4y^2 + 10y + 9 = 2(2y^2 + 5y + 4) + 1. Since 2y^2 + 5y + 4 ∈ Z, the integer n^2 + 3n + 5 is odd.
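The structure of this proof by cases translates naturally into a function with one branch per case, each branch returning the integer that the corresponding case exhibits. The following is a minimal sketch over a finite range; the function name is an assumption made for illustration.

```python
def odd_witness_3_15(n: int) -> int:
    """Return the integer m with n^2 + 3n + 5 = 2m + 1, following the two cases."""
    if n % 2 == 0:
        # Case 1: n = 2x, so n^2 + 3n + 5 = 2(2x^2 + 3x + 2) + 1.
        x = n // 2
        return 2 * x**2 + 3 * x + 2
    # Case 2: n = 2y + 1, so n^2 + 3n + 5 = 2(2y^2 + 5y + 4) + 1.
    y = (n - 1) // 2
    return 2 * y**2 + 5 * y + 4

both_cases_hold = all(
    n**2 + 3 * n + 5 == 2 * odd_witness_3_15(n) + 1
    for n in range(-100, 101)
)
```

Since every integer falls into exactly one of the two branches, the two cases together cover the whole domain Z.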

Two integers x and y are said to be of the same parity if x and y are both even or are both odd. The integers x and y are of opposite parity if one of x and y is even and the other is odd. For example, 5 and 13 are of the same parity, while 8 and 11 are of opposite parity. Because the definition of two integers having the same (or opposite) parity requires the two integers to satisfy one of two properties, any result containing these terms is likely to be proved by cases. The following theorem presents a characterization of two integers that are of the same parity.

Theorem 3.16

Let x, y ∈ Z. Then x and y are of the same parity if and only if x + y is even.

Proof

First, assume that x and y are of the same parity. We consider two cases.

Case 1. x and y are even. Then x = 2a and y = 2b for some integers a and b. So x + y = 2a + 2b = 2(a + b). Since a + b ∈ Z, the integer x + y is even.

Case 2. x and y are odd. Then x = 2a + 1 and y = 2b + 1, where a, b ∈ Z. Therefore, x + y = (2a + 1) + (2b + 1) = 2a + 2b + 2 = 2(a + b + 1). Since a + b + 1 is an integer, x + y is even.


For the converse, assume that x and y are of opposite parity. Again, we consider two cases.

Case 1. x is even and y is odd. Then x = 2a and y = 2b + 1, where a, b ∈ Z. So x + y = 2a + (2b + 1) = 2(a + b) + 1. Since a + b ∈ Z, the integer x + y is odd.

Case 2. x is odd and y is even. The proof is similar to the proof of the preceding case and is therefore omitted.

PROOF ANALYSIS

A comment concerning the proof of Theorem 3.16 is useful here. Although there is always some concern when omitting steps or proofs, it should be clear that it is truly a waste of effort, by both writer and reader, to give a proof of the case when x is odd and y is even in Theorem 3.16. Indeed, there is an alternative when considering the converse.

For the converse, assume that x and y are of opposite parity. Without loss of generality, assume that x is even and y is odd. Then x = 2a and y = 2b + 1, where a, b ∈ Z. So x + y = 2a + (2b + 1) = 2(a + b) + 1. Since a + b ∈ Z, the integer x + y is odd.
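Theorem 3.16 is a biconditional, so a numerical illustration should compare the truth values of both sides for each pair of integers. The sketch below does this on a finite grid; the function name and the grid bounds are assumptions chosen for the example. It also records why "without loss of generality" is legitimate here: both conditions are symmetric in x and y.

```python
def same_parity(x: int, y: int) -> bool:
    """True when x and y are both even or both odd."""
    return x % 2 == y % 2

grid = [(x, y) for x in range(-20, 21) for y in range(-20, 21)]

# Biconditional: x and y have the same parity exactly when x + y is even.
theorem_holds = all(same_parity(x, y) == ((x + y) % 2 == 0) for x, y in grid)

# Symmetry behind the WLOG step: swapping x and y changes neither side.
symmetric = all(same_parity(x, y) == same_parity(y, x) for x, y in grid)
```

Comparing truth values with `==` checks both implications at once, which mirrors proving a statement and its converse.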

We used the phrase without loss of generality (some abbreviate this as WOLOG or WLOG) to indicate that the proofs of the two situations are similar, so the proof of only one of them is needed. Sometimes it is rather subjective to say that two situations are similar. We present one additional example to illustrate this.

Theorem to Prove

Let a and b be integers. Then ab is even if and only if a is even or b is even.

PROOF STRATEGY

Before we begin a proof of this result (Theorem 3.17 below), let's see what we will be required to show. We need to prove two implications, namely, (1) if a is even or b is even, then ab is even, and (2) if ab is even, then a is even or b is even.

We consider (1) first. A direct proof seems appropriate. Here, we will assume that a is even or b is even. We could give a proof by cases: (i) a is even, (ii) b is even. On the other hand, since the proofs of these cases will certainly be similar, we could say, without loss of generality, that a is even. We will see that it is unnecessary to make any assumption about b.

If we were to give a direct proof of (2), we would begin by assuming that ab is even, say ab = 2k for some integer k. But how could we deduce any information about a and b individually? Let's try another approach. If we use a proof by contrapositive, then we would begin by assuming that it is not the case that a is even or b is even. This is exactly the situation covered by one of De Morgan's laws: ∼(P ∨ Q) is logically equivalent to (∼P) ∧ (∼Q). It is important not to forget this. In this case, we have P: a is even and Q: b is even. So the negation of "a is even or b is even" is "a is odd and b is odd." Let us now prove this result.

Theorem 3.17

Let a and b be integers. Then ab is even if and only if a is even or b is even.

Proof

First, assume that a is even or b is even. Without loss of generality, let a be even. Then a = 2x for some integer x. Thus ab = (2x)b = 2(xb). Since xb is an integer, ab is even.

For the converse, assume that a is odd and b is odd. Then a = 2x + 1 and b = 2y + 1, where x, y ∈ Z. Hence ab = (2x + 1)(2y + 1) = 4xy + 2x + 2y + 1 = 2(2xy + x + y) + 1. Since 2xy + x + y is an integer, ab is odd.
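Two ingredients of this proof lend themselves to a quick mechanical check: the biconditional of Theorem 3.17 itself, and the De Morgan law used to negate "a is even or b is even." The sketch below, on a finite grid of integer pairs chosen only for illustration, checks both; the variable names are assumptions.

```python
pairs = [(a, b) for a in range(-20, 21) for b in range(-20, 21)]

# Theorem 3.17: ab is even exactly when a is even or b is even.
theorem_holds = all(
    ((a * b) % 2 == 0) == (a % 2 == 0 or b % 2 == 0)
    for a, b in pairs
)

# De Morgan's law: ~(P or Q) is equivalent to (~P) and (~Q).
de_morgan_holds = all(
    (not (p or q)) == ((not p) and (not q))
    for p in (True, False)
    for q in (True, False)
)

# Contrapositive used in the proof: a odd and b odd implies ab odd.
contrapositive_holds = all(
    (a * b) % 2 == 1 for a, b in pairs if a % 2 == 1 and b % 2 == 1
)
```

The De Morgan check is exhaustive, since only four truth assignments exist, whereas the checks on integers are finite samples.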

3.5 Proof Evaluations

We have now stated several results and given a proof of each result (sometimes preceding a proof by a proof strategy or following the proof with a proof analysis). Let's reverse this process by giving an example of a proof of a result but not stating the result being proved. We follow the proof with several options for the statement of the result being proved.

Example 3.18

Given below is a proof of a result.

Proof

Assume that n is an odd integer. Then n = 2k + 1 for some integer k. Then 3n − 5 = 3(2k + 1) − 5 = 6k + 3 − 5 = 6k − 2 = 2(3k − 1). Since 3k − 1 is an integer, 3n − 5 is even.

Which of the following is proved above?

(1) 3n − 5 is an even integer.
(2) If n is an odd integer, then 3n − 5 is an even integer.
(3) Let n be an integer. If 3n − 5 is an even integer, then n is an odd integer.
(4) Let n be an integer. If 3n − 5 is an odd integer, then n is an even integer.

The correct answers are (2) and (4). The proof given is a direct proof of (2) and a proof by contrapositive of (4). Sentence (1) is an open sentence, not a statement, and is only the conclusion of (2). Statement (3) is the converse of (2).

When learning any mathematical subject, it is not unusual to make mistakes along the way. In fact, part of learning mathematics is to learn from your own mistakes and those of others. For this reason, you will see at the end of most chapters (beginning with this chapter) some problems asking you to evaluate the proof of a result. That is, a result and a proposed proof of this result will be given. You are then asked to read this proposed proof and determine whether, in your opinion, it is, in fact, a proof. If you do not believe that the given argument provides a proof of the result, then you should point out the mistake (or a mistake). We give two examples of this.

Example 3.19

Evaluate the proposed proof of the following result.

Result

If x and y are integers of the same parity, then x − y is even.

Proof

Let x and y be two integers of the same parity. We consider two cases, according to whether x and y are both even or are both odd.

Case 1. x and y are both even. Let x = 6 and y = 2, which are both even. Then x − y = 4, which is even.

Case 2. x and y are both odd. Let x = 7 and y = 1, which are both odd. Then x − y = 6, which is even.

Proof Evaluation

Although the proof started correctly, by assuming that x and y are two integers of the same parity and dividing the proof into these two cases, the proof of each case is incorrect. When we assume that x and y are both even, for example, x and y must represent arbitrary even integers, not specific even integers.

Example 3.20

Evaluate the proposed proof of the following result.

Result

If m is an even integer and n is an odd integer, then 3m + 5n is odd.

Proof

Let m be an even integer and n an odd integer. Then m = 2k and n = 2k + 1, where k ∈ Z. Therefore, 3m + 5n = 3(2k) + 5(2k + 1) = 6k + 10k + 5 = 16k + 5 = 2(8k + 2) + 1. Since 8k + 2 is an integer, 3m + 5n is odd.

Proof Evaluation

There is a mistake in the second sentence of the proposed proof, where it is written that m = 2k and n = 2k + 1, where k ∈ Z. Since the same symbol k is used for both m and n, we have inadvertently added the assumption that n = m + 1. This is incorrect, however, as it was never stated that m and n must be consecutive integers. In other words, we should write m = 2k and n = 2ℓ + 1, say, where k, ℓ ∈ Z.
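To see how much is lost by reusing the symbol k, one can enumerate the pairs (m, n) that the flawed parametrization m = 2k, n = 2k + 1 can represent and compare them with all pairs of an even m and an odd n in a small window. The sketch below does this; the window bounds and variable names are assumptions made for illustration.

```python
evens = range(-6, 7, 2)    # -6, -4, ..., 6
odds = range(-5, 6, 2)     # -5, -3, ..., 5

# Every pair covered by the hypothesis: m even, n odd.
all_pairs = {(m, n) for m in evens for n in odds}

# Pairs representable with a single symbol k: m = 2k and n = 2k + 1.
same_k_pairs = {(2 * k, 2 * k + 1) for k in range(-3, 4)}

# Pairs the flawed proof never considers.
missed = all_pairs - same_k_pairs

# With independent symbols k and l: 3(2k) + 5(2l + 1) = 2(3k + 5l + 2) + 1.
corrected_identity = all(
    3 * (2 * k) + 5 * (2 * l + 1) == 2 * (3 * k + 5 * l + 2) + 1
    for k in range(-10, 11)
    for l in range(-10, 11)
)
```

Every pair in `same_k_pairs` satisfies n = m + 1, so a pair such as (2, 5) is never examined by the flawed argument, even though the result itself is true for it.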

EXERCISES FOR CHAPTER 3

Section 3.1: Trivial and Vacuous Proofs

3.1. Let x ∈ R. Prove that if 0 < x < 1, then x^2 − 2x + 2 ≠ 0.
3.2. Let n ∈ N. Prove that if |n − 1| + |n + 1| ≤ 1, then |n^2 − 1| ≤ 4.
3.3. Let r ∈ Q+. Prove that if (r^2 + 1)/r ≤ 1, then (r^2 + 2)/r ≤ 2.
3.4. Let x ∈ R. Prove that if x^3 − 5x − 1 ≥ 0, then (x − 1)(x − 3) ≥ −2.
3.5. Let n ∈ N. Prove that if n + 1/n < 2, then n^2 + 1/n^2 < 4.
3.6. Prove that if a, b, and c are odd integers such that a + b + c = 0, then abc < 0. (You are permitted to use well-known properties of integers here.)
3.7. Prove that if x, y, and z are three real numbers such that x^2 + y^2 + z^2 < xy + xz + yz, then x + y + z > 0.


Section 3.2: Direct Proofs

3.8. Prove that if x is an odd integer, then 9x + 5 is even.
3.9. Prove that if x is an even integer, then 5x − 3 is an odd integer.
3.10. Prove that if a and c are odd integers, then ab + bc is even for every integer b.
3.11. Let n ∈ Z. Prove that if 1 − n² > 0, then 3n − 2 is an even integer.
3.12. Let x ∈ Z. Prove that if 2^(2x) is an odd integer, then 2^(−2x) is an odd integer.
3.13. Let S = {0, 1, 2} and let n ∈ S. Prove that if (n + 1)²(n + 2)²/4 is even, then (n + 2)²(n + 3)²/4 is even.
3.14. Let S = {1, 5, 9}. Prove that if n ∈ S and (n² + n − 6)/2 is odd, then (2n³ + 3n² + n)/6 is even.
3.15. Let A = {n ∈ Z : n > 2 and n is odd} and B = {n ∈ Z : n < 11}. Prove that if n ∈ A ∩ B, then n² − 2 is prime.

Section 3.3: Proof by Contrapositive

3.16. Let x ∈ Z. Prove that if 7x + 5 is odd, then x is even.
3.17. Let n ∈ Z. Prove that if 15n is even, then 9n is even.
3.18. Let x ∈ Z. Prove that 5x − 11 is even if and only if x is odd.
3.19. Let x ∈ Z. Use a lemma to prove that if 7x + 4 is even, then 3x − 11 is odd.
3.20. Let x ∈ Z. Prove that 3x + 1 is even if and only if 5x − 2 is odd.
3.21. Let n ∈ Z. Prove that (n + 1)² − 1 is even if and only if n is even.
3.22. Let S = {2, 3, 4} and let n ∈ S. Use a proof by contrapositive to prove that if n²(n − 1)²/4 is even, then n²(n + 1)²/4 is even.
3.23. Let A = {0, 1, 2} and B = {4, 5, 6} be subsets of S = {0, 1, ..., 6}. Let n ∈ S. Prove that if n(n − 1)(n − 2)/6 is even, then n ∈ A ∪ B.
3.24. Let n ∈ Z. Prove that 2n² + n is odd if and only if cos(nπ/2) is even.
3.25. Let {A, B} be a partition of the set S = {1, 2, ..., 7}, where A = {1, 4, 5} and B = {2, 3, 6, 7}. Let n ∈ S. Prove that if (n² + 3n − 4)/2 is even, then n ∈ A.

Section 3.4: Proof by Cases

3.26. Prove that if n ∈ Z, then n² − 3n + 9 is odd.
3.27. Prove that if n ∈ Z, then n³ − n is even.
3.28. Let x, y ∈ Z. Prove that if xy is odd, then x and y are odd.
3.29. Let a, b ∈ Z. Prove that if ab is odd, then a² + b² is even.
3.30. Let x, y ∈ Z. Prove that x − y is even if and only if x and y are of the same parity.
3.31. Let a, b ∈ Z. Prove that if a + b and ab are of the same parity, then a and b are even.
3.32. (a) Let x and y be integers. Prove that (x + y)² is even if and only if x and y are of the same parity. (b) Restate the result in (a) in terms of odd integers.
3.33. Let A = {1, 2, 3} and B = {2, 3, 4} be subsets of S = {1, 2, 3, 4}. Let n ∈ S. Prove that 2n² − 5n is either (a) positive and even or (b) negative and odd if and only if n ∉ A ∩ B.
3.34. Let A = {3, 4} be a subset of S = {1, 2, ..., 6}. Let n ∈ S. Prove that if n²(n + 1)²/4 is even, then n ∈ A.
3.35. Prove for every nonnegative integer n that 2ⁿ + 6ⁿ is an even integer.


3.36. A collection of nonempty subsets of a nonempty set S is called a cover of S if every element of S belongs to at least one of the subsets. (A cover of S is a partition of S if every element of S belongs to exactly one of the subsets.) Consider the following.

Result Let a, b ∈ Z. If a is even or b is even, then ab is even.

Proof Assume that a is even or b is even. We consider the following cases.

Case 1. a is even. Then a = 2k, where k ∈ Z. Thus ab = (2k)b = 2(kb). Since kb ∈ Z, it follows that ab is even.

Case 2. b is even. Then b = 2ℓ, where ℓ ∈ Z. Thus ab = a(2ℓ) = 2(aℓ). Since aℓ ∈ Z, it follows that ab is even.

Since the domain for both a and b is Z, we may think of Z × Z as the domain of (a, b). Consider the following subsets of Z × Z:

S1 = {(a, b) ∈ Z × Z : a and b are odd}
S2 = {(a, b) ∈ Z × Z : a is even}
S3 = {(a, b) ∈ Z × Z : b is even}.

(a) Why is {S1, S2, S3} a cover of Z × Z and not a partition of Z × Z?
(b) Why does the set S1 not appear in the proof above?
(c) Give a proof by cases of the result above in which the cases are determined by a partition and not by a cover.

Section 3.5: Proof Evaluations

3.37. Below is a proof of a result.

Proof We consider two cases. Case 1. a and b are even. Then a = 2r and b = 2s for some integers r and s. Thus a² − b² = (2r)² − (2s)² = 4r² − 4s² = 2(2r² − 2s²). Since 2r² − 2s² is an integer, a² − b² is even. Case 2. a and b are odd. Then a = 2r + 1 and b = 2s + 1 for some integers r and s. Thus a² − b² = (2r + 1)² − (2s + 1)² = (4r² + 4r + 1) − (4s² + 4s + 1) = 4r² + 4r − 4s² − 4s = 2(2r² + 2r − 2s² − 2s). Since 2r² + 2r − 2s² − 2s is an integer, a² − b² is even.

Which of the following is proved?
(1) Let a, b ∈ Z. Then a and b are of the same parity if and only if a² − b² is even.
(2) Let a, b ∈ Z. Then a² − b² is even.
(3) Let a, b ∈ Z. If a and b are of the same parity, then a² − b² is even.
(4) Let a, b ∈ Z. If a² − b² is even, then a and b are of the same parity.

3.38. Below is a proof of a result. Which result is proved?


Proof Assume that x is even. Then x = 2a for some integer a. So 3x² − 4x − 5 = 3(2a)² − 4(2a) − 5 = 12a² − 8a − 5 = 2(6a² − 4a − 3) + 1. Since 6a² − 4a − 3 is an integer, 3x² − 4x − 5 is odd. Next, assume that x is odd. Then x = 2b + 1, where b ∈ Z. So 3x² − 4x − 5 = 3(2b + 1)² − 4(2b + 1) − 5 = 3(4b² + 4b + 1) − 8b − 4 − 5 = 12b² + 4b − 6 = 2(6b² + 2b − 3). Since 6b² + 2b − 3 is an integer, 3x² − 4x − 5 is even.

3.39. Evaluate the proof of the following result.
Result Let n ∈ Z. If 3n − 8 is odd, then n is odd.
Proof Assume that n is odd. Then n = 2k + 1 for some integer k. Then 3n − 8 = 3(2k + 1) − 8 = 6k + 3 − 8 = 6k − 5 = 2(3k − 3) + 1. Since 3k − 3 is an integer, 3n − 8 is odd.

3.40. Evaluate the proof of the following result.
Result Let a, b ∈ Z. Then a − b is even if and only if a and b are of the same parity.
Proof We consider two cases.
Case 1. a and b are of the same parity. We now consider two subcases.
Subcase 1.1. a and b are both even. Then a = 2x and b = 2y, where x, y ∈ Z. Then a − b = 2x − 2y = 2(x − y). Since x − y is an integer, a − b is even.
Subcase 1.2. a and b are both odd. Then a = 2x + 1 and b = 2y + 1, where x, y ∈ Z. Then a − b = (2x + 1) − (2y + 1) = 2(x − y). Since x − y is an integer, a − b is even.
Case 2. a and b are of opposite parity. Again we have two subcases.
Subcase 2.1. a is odd and b is even. Then a = 2x + 1 and b = 2y, where x, y ∈ Z. Then a − b = (2x + 1) − 2y = 2(x − y) + 1. Since x − y is an integer, a − b is odd.
Subcase 2.2. a is even and b is odd. Then a = 2x and b = 2y + 1, where x, y ∈ Z. Then a − b = 2x − (2y + 1) = 2x − 2y − 1 = 2(x − y − 1) + 1. Since x − y − 1 is an integer, a − b is odd.

3.41. The following is an attempted proof of a result. What is the result, and is the attempted proof correct?
Proof Assume, without loss of generality, that x is even. Then x = 2a for some integer a. Thus xy² = (2a)y² = 2(ay²). Since ay² is an integer, xy² is even.

3.42. Below is a proof of a result. What is the result?
Proof Assume, without loss of generality, that x and y are even. Then x = 2a and y = 2b for some integers a and b. Therefore xy + xz + yz = (2a)(2b) + (2a)z + (2b)z = 2(2ab + az + bz). Since 2ab + az + bz is an integer, xy + xz + yz is even.

3.43. What result is proved below, and what procedure is used to verify the result? First we present the following proof.

Proof Assume that x is even. Then x = 2a for some integer a. So 7x − 3 = 7(2a) − 3 = 14a − 3 = 14a − 4 + 1 = 2(7a − 2) + 1. Since 7a − 2 is an integer, 7x − 3 is odd.

We are now prepared to prove our main result.

Proof Assume that 7x − 3 is even. By the preceding result, x is odd. So x = 2b + 1 for some integer b. Then 3x + 8 = 3(2b + 1) + 8 = 6b + 11 = 2(3b + 5) + 1. Since 3b + 5 is an integer, 3x + 8 is odd.

3.44. Consider the following statement. Let n ∈ Z. Then (n − 5)(n + 7)(n + 13) is odd if and only if n is even. Which of the following would be an appropriate way to begin a proof of this statement?
(a) Assume that (n − 5)(n + 7)(n + 13) is odd.
(b) Assume that (n − 5)(n + 7)(n + 13) is even.
(c) Assume that n is even.
(d) Assume that n is odd.
(e) We consider two cases, according to whether n is even or n is odd.

ADDITIONAL EXERCISES FOR CHAPTER 3

3.45. Let x ∈ Z. Prove that if 7x − 8 is even, then x is even.
3.46. Let x ∈ Z. Prove that x³ is even if and only if x is even.
3.47. Let x ∈ Z. Use a lemma or two to prove that 3x³ is even if and only if 5x² is even.
3.48. Give a direct proof of the following: Let x ∈ Z. If 11x − 5 is odd, then x is even.
3.49. Let x, y ∈ Z. Prove that if x + y is odd, then x and y are of opposite parity.
3.50. Let x, y ∈ Z. Prove that if 3x + 5y is even, then x and y are of the same parity.
3.51. Let x, y ∈ Z. Prove that (x + 1)y² is even if and only if x is odd or y is even.
3.52. Let x, y ∈ Z. Prove that if xy and x + y are even, then x and y are even.
3.53. Prove, for every integer x, that the integers 3x + 1 and 5x + 2 are of opposite parity.
3.54. Prove the following two results: (a) Result A: Let n ∈ Z. If n³ is even, then n is even. (b) Result B: If n is an odd integer, then 5n⁹ + 13 is even.
3.55. Prove for every two distinct real numbers a and b that either (a + b)/2 > a or (a + b)/2 > b.
3.56. Let x, y ∈ Z. Prove that if a and b are even integers, then ax + by is even.
3.57. Evaluate the proof of the following result.
Result Let x, y ∈ Z and let a and b be odd integers. If ax + by is even, then x and y are of the same parity.
Proof Assume that x and y are of opposite parity. Then x = 2p and y = 2q + 1 for some integers p and q. Since a and b are odd integers, a = 2r + 1 and b = 2s + 1 for some integers r and s. Hence ax + by = (2r + 1)(2p) + (2s + 1)(2q + 1) = 4pr + 2p + 4qs + 2s + 2q + 1 = 2(2pr + p + 2qs + s + q) + 1. Since 2pr + p + 2qs + s + q is an integer, ax + by is odd.


3.58. Let S = {a, b, c, d} be a set of four distinct integers. Prove that if either (1) for each x ∈ S, the integer x and the sum of any two of the remaining three integers of S are of the same parity, or (2) for each x ∈ S, the integer x and the sum of any two of the remaining three integers of S are of opposite parity, then every pair of integers of S are of the same parity.
3.59. Prove that if a and b are two positive integers, then a²(b + 1) + b²(a + 1) ≥ 4ab.
3.60. Let a, b ∈ Z. Prove that if ab = 4, then (a − b)³ − 9(a − b) = 0.
3.61. Let a, b and c be the lengths of the sides of a triangle T, where a ≤ b ≤ c. Prove that if T is a right triangle, then (abc)² = (c⁶ − a⁶ − b⁶)/3.

3.62. Consider the following statement. Let n ∈ Z. Then 3n³ + 4n² + 5 is even if and only if n is even. Which of the following would be an appropriate way to begin a proof of this statement?
(a) Assume that 3n³ + 4n² + 5 is odd.
(b) Assume that 3n³ + 4n² + 5 is even.
(c) Assume that n is even.
(d) Assume that n is odd.
(e) We consider two cases, according to whether n is even or n is odd.

3.63. Let P = {A, B, C} be a partition of a set S of integers, where A = {n ∈ S : n is odd and n > 0}, B = {n ∈ S : n is odd and n < 0} and C = {n ∈ S : n is even and n > 0}. Prove that if x and y are elements of S belonging to distinct subsets in P, then xy is odd, even and greater than 1, or even and less than −1.
3.64. Let n ∈ N. Prove that if n³ − 5n − 10 > 0, then n ≥ 3.
3.65. Prove for every odd integer a that (a² + 3)(a² + 7) = 32b for some integer b.
3.66. Prove for every two positive integers a and b that (a + b)(1/a + 1/b) ≥ 4.

3.67. What result is proved below, and what procedure is used to verify the result? We begin with the following proof.

Proof First assume that x is even. Then x = 2a, where a ∈ Z. So 3x − 2 = 3(2a) − 2 = 6a − 2 = 2(3a − 1). Since 3a − 1 is an integer, 3x − 2 is even. Next assume that x is odd. Then x = 2b + 1 for some integer b. Therefore 3x − 2 = 3(2b + 1) − 2 = 6b + 1 = 2(3b) + 1. Since 3b is an integer, 3x − 2 is odd.

We can now give the following proof.

Proof First assume that 3x − 2 is even. By the preceding result, x is even, so x = 2a, where a ∈ Z. Then 5x + 1 = 5(2a) + 1 = 2(5a) + 1. Since 5a is an integer, 5x + 1 is odd. Next assume that 3x − 2 is odd. By the preceding result, x is odd. So x = 2b + 1 for some integer b. Then 5x + 1 = 5(2b + 1) + 1 = 10b + 6 = 2(5b + 3). Since 5b + 3 is an integer, 5x + 1 is even.

4

More on Direct Proof and Proof by Contrapositive

The vast majority of the examples we have seen that illustrate direct proof and proof by contrapositive involve properties of even and odd integers. In this chapter we give additional examples of direct proofs and proofs by contrapositive concerning integers, but in new settings. First we see how even and odd integers can be studied in a more general setting through the divisibility of integers. Next we examine some properties of real numbers and, finally, some properties of set operations.

4.1 Proofs Involving Divisibility of Integers

We have seen many examples of integers that can be written as 2x for some integer x. These are, of course, exactly the even integers. However, some integers can also be expressed as 3x or 4x, or as −5x for some integer x. In general, for integers a and b with a ≠ 0, we say that a divides b if there is an integer c such that b = ac. In this case we write a | b. So if n is an even integer, then 2 | n; conversely, if 2 divides an integer n, then n is even. That is, an integer n is even if and only if 2 | n. Theorem 3.17 (which states for integers a and b that ab is even if and only if a is even or b is even) can therefore be restated as: 2 | ab if and only if 2 | a or 2 | b. If a | b, then we also say that b is a multiple of a and that a is a divisor of b. So every even integer is a multiple of 2. If a does not divide b, then we write a ∤ b. For example, 4 | 48 since 48 = 4 · 12, and −3 | 57 since 57 = (−3) · (−19). On the other hand, 4 ∤ 66 since there is no integer c such that 66 = 4c. We now apply the techniques we have learned to prove some results concerning divisibility properties of integers.

Result to Prove

Let a, b and c be integers with a ≠ 0 and b ≠ 0. If a | b and b | c, then a | c.

PROOF STRATEGY

It makes sense here to use a direct proof and first assume that a | b and b | c. This means that b = ax and c = by for some integers x and y. Since our goal is to show that a | c, we need to show that c can be written as the product of a and some other integer. So it is logical to consider c and determine how we can express it.

Result 4.1 Let a, b and c be integers with a ≠ 0 and b ≠ 0. If a | b and b | c, then a | c.

Proof Assume that a | b and b | c. Then b = ax and c = by, where x, y ∈ Z. Hence c = by = (ax)y = a(xy). Since xy is an integer, a | c.

We now verify two further divisibility properties of integers.

Result 4.2 Let a, b, c and d be integers with a ≠ 0 and b ≠ 0. If a | c and b | d, then ab | cd.

Proof Let a | c and b | d. Then c = ax and d = by, where x, y ∈ Z. Then cd = (ax)(by) = (ab)(xy). Since xy is an integer, ab | cd.

Result 4.3 Let a, b, c, x, y ∈ Z, where a ≠ 0. If a | b and a | c, then a | (bx + cy).

Proof Assume that a | b and a | c. Then b = ar and c = as, where r, s ∈ Z. Then bx + cy = (ar)x + (as)y = a(rx + sy). Since rx + sy is an integer, a | (bx + cy).

The examples presented so far concern general divisibility properties of integers. We now look at some specialized divisibility properties.
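The definition of divisibility and the three general properties above (Results 4.1 to 4.3) can be spot-checked by brute force. The sketch below is only an illustration under our own assumptions: the helper `divides` and the search range from −6 to 6 are hypothetical choices, and a finite search supports the results but does not replace their proofs.

```python
from itertools import product

def divides(a, b):
    """Return True if a | b, that is, b = a*c for some integer c (a must be nonzero)."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return b % a == 0

R = list(range(-6, 7))          # small range of integers to search
NZ = [n for n in R if n != 0]   # the nonzero integers in that range

# Result 4.1: if a | b and b | c, then a | c.
transitive = all(
    divides(a, c)
    for a, b, c in product(NZ, NZ, R)
    if divides(a, b) and divides(b, c)
)

# Result 4.2: if a | c and b | d, then ab | cd.
products_ok = all(
    divides(a * b, c * d)
    for a, b in product(NZ, NZ)
    for c, d in product(R, R)
    if divides(a, c) and divides(b, d)
)

# Result 4.3: if a | b and a | c, then a | (bx + cy).
combinations_ok = all(
    divides(a, b * x + c * y)
    for a in NZ
    for b, c, x, y in product(R, repeat=4)
    if divides(a, b) and divides(a, c)
)
```

The same helper reproduces the examples from the text: `divides(4, 48)` and `divides(-3, 57)` are true, while `divides(4, 66)` is false.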

Result 4.4 Let x ∈ Z. If 2 | (x² − 1), then 4 | (x² − 1).

Proof Assume that 2 | (x² − 1). Then x² − 1 = 2y for some integer y. So x² = 2y + 1 is an odd integer. It then follows by Theorem 3.12 that x is also odd. So x = 2z + 1 for some integer z. Then x² − 1 = (2z + 1)² − 1 = (4z² + 4z + 1) − 1 = 4z² + 4z = 4(z² + z). Since z² + z is an integer, 4 | (x² − 1).

For each of Results 4.1 to 4.4, a direct proof worked very well. For the following result, however, the situation is quite different.

Result to Prove

Let x, y ∈ Z. If 3 ∤ xy, then 3 ∤ x and 3 ∤ y.

PROOF STRATEGY

If we let P: 3 ∤ xy, Q: 3 ∤ x and R: 3 ∤ y, then we wish to prove that P ⇒ Q ∧ R for all integers x and y. (It should be clear that P, Q, and R are open sentences in this case, but we omit the variables here for simplicity.) If we use a direct proof, then we would assume that 3 ∤ xy and attempt to show that 3 ∤ x and 3 ∤ y. Thus we would know that xy cannot be expressed as 3 times an integer. On the other hand, if we use a proof by contrapositive, then we consider the implication (∼(Q ∧ R)) ⇒ (∼P), which by De Morgan's law is logically equivalent to


((∼Q) ∨ (∼R)) ⇒ (∼P) and which, in words, is: If 3 | x or 3 | y, then 3 | xy. This method looks more promising.

Result 4.5 Let x, y ∈ Z. If 3 ∤ xy, then 3 ∤ x and 3 ∤ y.

Proof Assume that 3 | x or 3 | y. Without loss of generality, assume that 3 divides x. Then x = 3z for some integer z. Hence xy = (3z)y = 3(zy). Since zy is an integer, 3 | xy.

We have already mentioned that if an integer n is not a multiple of 2, then we can write n = 2q + 1 for some integer q (that is, if an integer n is not even, then it is odd). This is a consequence of knowing that 0 and 1 are the only possible remainders when an integer is divided by 2. Similarly, if an integer n is not a multiple of 3, then we can write n = 3q + 1 or n = 3q + 2 for some integer q; that is, every integer can be expressed as 3q, 3q + 1, or 3q + 2 for some integer q, since 0, 1, and 2 are the only remainders that can result when an integer is divided by 3. Likewise, if an integer n is not a multiple of 4, then n can be expressed as 4q + 1, 4q + 2, or 4q + 3 for some integer q. This topic concerns a well-known theorem called the Division Algorithm, which we discuss in more detail in Chapter 11.

Result to Prove

Let x ∈ Z. If 3 ∤ (x² − 1), then 3 | x.

PROOF STRATEGY

We have two options here, namely (1) use a direct proof and begin by assuming that 3 ∤ (x² − 1), or (2) use a proof by contrapositive and begin by assuming that 3 ∤ x. Either way, we cannot avoid assuming that 3 does not divide some integer. However, it seems much easier to know that 3 ∤ x and try to show that 3 | (x² − 1) than to know that 3 ∤ (x² − 1) and show that 3 | x. Moreover, if 3 ∤ x, then we know that x = 3q + 1 or x = 3q + 2 for some integer q, which suggests a proof by cases.

Result 4.6 Let x ∈ Z. If 3 ∤ (x² − 1), then 3 | x.

Proof Assume that 3 ∤ x. Then either x = 3q + 1 for some integer q or x = 3q + 2 for some integer q. We consider these two cases.

Case 1. x = 3q + 1 for some integer q. Then x² − 1 = (3q + 1)² − 1 = (9q² + 6q + 1) − 1 = 9q² + 6q = 3(3q² + 2q). Since 3q² + 2q is an integer, 3 | (x² − 1).

Case 2. x = 3q + 2 for some integer q. Then x² − 1 = (3q + 2)² − 1 = (9q² + 12q + 4) − 1 = 9q² + 12q + 3 = 3(3q² + 4q + 1). Since 3q² + 4q + 1 is an integer, 3 | (x² − 1).

We now consider a biconditional involving divisibility.

Result 4.7 Let x, y ∈ Z. Then 4 | (x² − y²) if and only if x and y are of the same parity.

Proof First, assume that x and y are of the same parity. We show that 4 | (x² − y²). There are two cases.

Case 1. x and y are both even. Then x = 2a and y = 2b for some integers a and b. Then x² − y² = (2a)² − (2b)² = 4a² − 4b² = 4(a² − b²). Since a² − b² is an integer, 4 | (x² − y²).

Case 2. x and y are both odd. Then x = 2c + 1 and y = 2d + 1 for some integers c and d. Then x² − y² = (2c + 1)² − (2d + 1)² = (4c² + 4c + 1) − (4d² + 4d + 1) = 4c² + 4c − 4d² − 4d = 4(c² + c − d² − d). Since c² + c − d² − d is an integer, 4 | (x² − y²).

For the converse, assume that x and y are of opposite parity. We show that 4 ∤ (x² − y²). We consider two cases.

Case 1. x is even and y is odd. Then x = 2a and y = 2b + 1 for some integers a and b. Then x² − y² = (2a)² − (2b + 1)² = 4a² − [4b² + 4b + 1] = 4a² − 4b² − 4b − 1 = 4a² − 4b² − 4b − 4 + 3 = 4(a² − b² − b − 1) + 3. Since a² − b² − b − 1 is an integer, it follows that there is a remainder of 3 when x² − y² is divided by 4, and so 4 ∤ (x² − y²).

Case 2. x is odd and y is even. The proof of this case is similar to that of Case 1 and is therefore omitted.

We now consider a result of a different nature.
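As a sanity check on Result 4.7, the biconditional can be tested over a small range of integers. The following is a minimal sketch; the helper name `same_parity` and the range from −30 to 30 are our own illustrative assumptions.

```python
def same_parity(x, y):
    """Two integers are of the same parity exactly when their difference is even."""
    return (x - y) % 2 == 0

values = range(-30, 31)

# 4 | (x^2 - y^2) should hold exactly when x and y are of the same parity.
biconditional_holds = all(
    ((x * x - y * y) % 4 == 0) == same_parity(x, y)
    for x in values
    for y in values
)

# Remainders of x^2 - y^2 on division by 4 when x and y are of opposite parity.
opposite_parity_remainders = {
    (x * x - y * y) % 4
    for x in values
    for y in values
    if not same_parity(x, y)
}
```

The set `opposite_parity_remainders` contains 3 (x even, y odd, as in Case 1 of the converse) and 1 (x odd, y even, the omitted Case 2), and never 0.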

Result to Prove

For every integer n ≥ 7, there exist positive integers a and b such that n = 2a + 3b.

PROOF STRATEGY

First notice that we can write 7 = 2 · 2 + 3 · 1, 8 = 2 · 1 + 3 · 2 and 9 = 2 · 3 + 3 · 1. So the result is certainly true for n = 7, 8, 9. On the other hand, there is no pair a, b of positive integers such that 6 = 2a + 3b. Of course, this observation only shows that we cannot replace n ≥ 7 by n ≥ 6. Suppose that n is an integer with n ≥ 7. We can bring the integer 2 into the discussion by noting that we can write n = 2q or n = 2q + 1, where q ∈ Z. Indeed, if n = 2q, then q ≥ 4 since n ≥ 7; whereas if n = 2q + 1, then q ≥ 3 since n ≥ 7. This is a useful observation.

Result 4.8 For every integer n ≥ 7, there exist positive integers a and b such that n = 2a + 3b.

Proof Let n be an integer with n ≥ 7. Then n = 2q or n = 2q + 1 for some integer q. We consider these two cases.


Case 1. n = 2q. Since n ≥ 7, it follows that q ≥ 4. Hence n = 2q = 2(q − 3) + 6 = 2(q − 3) + 3 · 2. Since q ≥ 4, it follows that q − 3 ∈ N.

Case 2. n = 2q + 1. Since n ≥ 7, it follows that q ≥ 3. Hence n = 2q + 1 = 2(q − 1) + 2 + 1 = 2(q − 1) + 3 · 1. Since q ≥ 3, it follows that q − 1 ∈ N.
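The proof of Result 4.8 is constructive: each case names explicit values of a and b. The sketch below turns the two cases into a small function (the name `two_three_decomposition` is a hypothetical choice of ours) so that the construction can be tried on particular values of n.

```python
def two_three_decomposition(n):
    """Return positive integers (a, b) with n == 2*a + 3*b, for an integer n >= 7."""
    if n < 7:
        raise ValueError("n must be at least 7")
    q, r = divmod(n, 2)
    if r == 0:
        # Case 1: n = 2q with q >= 4, so n = 2(q - 3) + 3*2.
        return q - 3, 2
    # Case 2: n = 2q + 1 with q >= 3, so n = 2(q - 1) + 3*1.
    return q - 1, 1

# The construction works for every n tried, and 6 has no such representation.
construction_ok = all(
    a > 0 and b > 0 and 2 * a + 3 * b == n
    for n in range(7, 500)
    for a, b in [two_three_decomposition(n)]
)
six_is_impossible = not any(
    2 * a + 3 * b == 6 for a in range(1, 4) for b in range(1, 3)
)
```

For instance, `two_three_decomposition(7)` gives a = 2 and b = 1, matching 7 = 2 · 2 + 3 · 1 from the proof strategy.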

4.2 Proofs Involving Congruence of Integers

We know that an integer x is even if x = 2q for some integer q, while x is odd if x = 2q + 1 for some integer q. Furthermore, two integers x and y are of the same parity if they are both even or both odd. It follows that x and y are of the same parity if and only if 2 | (x − y). Consequently, 2 | (x − y) if and only if x and y have the same remainder when divided by 2. We also know that an integer x can be expressed as 3q, 3q + 1, or 3q + 2 for some integer q, according to whether the remainder is 0, 1, or 2 when x is divided by 3. If integers x and y are both of the form 3q + 1, then x = 3s + 1 and y = 3t + 1, where s, t ∈ Z, and so x − y = 3(s − t). Since s − t is an integer, 3 | (x − y). If x and y are both of the form 3q, or both of the form 3q + 2, then 3 | (x − y) as well. So if x and y have the same remainder when divided by 3, then 3 | (x − y). The converse of this implication is also true. This suggests a special interest in pairs x, y of integers such that 2 | (x − y) or 3 | (x − y) or, more generally, in pairs x, y of integers such that n | (x − y) for some integer n ≥ 2.

For integers a, b and n ≥ 2, we say that a is congruent to b modulo n, written a ≡ b (mod n), if n | (a − b). For example, 15 ≡ 7 (mod 4) since 4 | (15 − 7), and 3 ≡ −15 (mod 9) since 9 | (3 − (−15)). On the other hand, 14 is not congruent to 4 modulo 6, written 14 ≢ 4 (mod 6), since 6 ∤ (14 − 4).

Since we know that every integer x can be expressed as x = 2q or as x = 2q + 1 for some integer q, it follows that either 2 | (x − 0) or 2 | (x − 1), that is, x ≡ 0 (mod 2) or x ≡ 1 (mod 2). Also, since every integer x can be expressed as x = 3q, x = 3q + 1, or x = 3q + 2 for some integer q, it follows that 3 | (x − 0), 3 | (x − 1) or 3 | (x − 2). So

x ≡ 0 (mod 3), x ≡ 1 (mod 3) or x ≡ 2 (mod 3).

In addition, for each integer x, exactly one of

x ≡ 0 (mod 4), x ≡ 1 (mod 4), x ≡ 2 (mod 4), x ≡ 3 (mod 4)

is true, according to whether the remainder is 0, 1, 2, or 3 when x is divided by 4. Similar statements can be made when x is divided by n for each integer n ≥ 5. We now consider some properties of congruence of integers.

Result to Prove

Let a, b, k and n be integers with n ≥ 2. If a ≡ b (mod n), then ka ≡ kb (mod n).

PROOF STRATEGY

A direct proof makes sense here. So we begin by assuming that a ≡ b (mod n). Our goal is to show that ka ≡ kb (mod n). Since we know that a ≡ b (mod n), it follows

from the definition that n | (a − b), which implies that a − b = nx for some integer x. We need to show that ka ≡ kb (mod n), which means that we need to show that n | (ka − kb). So we must show that ka − kb = nt for some integer t. This suggests considering the expression ka − kb.

Result 4.9 Let a, b, k and n be integers with n ≥ 2. If a ≡ b (mod n), then ka ≡ kb (mod n).

Proof Assume that a ≡ b (mod n). Then n | (a − b). So a − b = nx for some integer x. Hence ka − kb = k(a − b) = k(nx) = n(kx). Since kx is an integer, n | (ka − kb) and thus ka ≡ kb (mod n).

Result 4.10 Let a, b, c, d, n ∈ Z with n ≥ 2. If a ≡ b (mod n) and c ≡ d (mod n), then a + c ≡ b + d (mod n).

Proof Assume that a ≡ b (mod n) and c ≡ d (mod n). Then a − b = nx and c − d = ny for some integers x and y. Adding these two equations, we obtain (a − b) + (c − d) = nx + ny and thus (a + c) − (b + d) = n(x + y). Since x + y is an integer, n | [(a + c) − (b + d)]. Hence a + c ≡ b + d (mod n).

The next result parallels Result 4.10 in terms of multiplication.

Result to Prove

Let a, b, c, d, n ∈ Z with n ≥ 2. If a ≡ b (mod n) and c ≡ d (mod n), then ac ≡ bd (mod n).

PROOF STRATEGY

This result and Result 4.10 have the same hypothesis. In the proof of Result 4.10 we arrived at the equations a − b = nx and c − d = ny and only needed to add them to complete the proof. This suggests that in the current result it would make sense to multiply these two equations. However, if we multiply them, we obtain (a − b)(c − d) = (nx)(ny), which does not give us the desired conclusion that ac − bd is a multiple of n. It is essential, however, that we work with ac − bd in the proof. By rewriting a − b = nx and c − d = ny as a = b + nx and c = d + ny, respectively, and then multiplying, we can accomplish this.

Result 4.11 Let a, b, c, d, n ∈ Z with n ≥ 2. If a ≡ b (mod n) and c ≡ d (mod n), then ac ≡ bd (mod n).

Proof Assume that a ≡ b (mod n) and c ≡ d (mod n). Then a − b = nx and c − d = ny, where x, y ∈ Z. So a = b + nx and c = d + ny. Multiplying these two equations, we obtain ac = (b + nx)(d + ny) = bd + dnx + bny + n²xy = bd + n(dx + by + nxy)

and thus ac − bd = n(dx + by + nxy). Since dx + by + nxy is an integer, ac ≡ bd (mod n).

The proofs of the three preceding results all use a direct proof. This is not a convenient proof technique for the next result, however.

Result to Prove

Let n ∈ Z. If n² ≢ n (mod 3), then n ≢ 0 (mod 3) and n ≢ 1 (mod 3).

PROOF STRATEGY

Let P(n): n² ≢ n (mod 3), Q(n): n ≢ 0 (mod 3) and R(n): n ≢ 1 (mod 3). Our goal is then to show that P(n) ⇒ (Q(n) ∧ R(n)) for every integer n. A direct proof does not appear to be a good choice. However, a proof by contrapositive leads us to the implication ∼(Q(n) ∧ R(n)) ⇒ (∼P(n)), which by De Morgan's law is logically equivalent to ((∼Q(n)) ∨ (∼R(n))) ⇒ (∼P(n)). In words, then: If n ≡ 0 (mod 3) or n ≡ 1 (mod 3), then n² ≡ n (mod 3).

Result 4.12 Let n ∈ Z. If n² ≢ n (mod 3), then n ≢ 0 (mod 3) and n ≢ 1 (mod 3).

Proof Let n be an integer such that n ≡ 0 (mod 3) or n ≡ 1 (mod 3). We consider these two cases.

Case 1. n ≡ 0 (mod 3). Then n = 3k for some integer k. So n² − n = (3k)² − (3k) = 9k² − 3k = 3(3k² − k). Since 3k² − k is an integer, 3 | (n² − n). Thus n² ≡ n (mod 3).

Case 2. n ≡ 1 (mod 3). Then n = 3ℓ + 1 for some integer ℓ and n² − n = (3ℓ + 1)² − (3ℓ + 1) = (9ℓ² + 6ℓ + 1) − (3ℓ + 1) = 9ℓ² + 3ℓ = 3(3ℓ² + ℓ). Since 3ℓ² + ℓ is an integer, 3 | (n² − n) and thus n² ≡ n (mod 3).

As a consequence of Result 4.12, if an integer n and its square n² have distinct remainders when divided by 3, then the remainder for n (when divided by 3) is 2.
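The congruence results of this section lend themselves to a quick brute-force check. The sketch below is illustrative only: the helper `congruent`, the moduli 2 through 5 and the range from −6 to 6 are our own assumptions, and passing such a check is evidence for, not a proof of, Results 4.9 to 4.12.

```python
from itertools import product

def congruent(a, b, n):
    """Return True if a is congruent to b modulo n, that is, n | (a - b), for n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return (a - b) % n == 0

R = range(-6, 7)
moduli = range(2, 6)

# Result 4.9: a = b (mod n) implies ka = kb (mod n).
scaling_ok = all(
    congruent(k * a, k * b, n)
    for n in moduli
    for a, b, k in product(R, repeat=3)
    if congruent(a, b, n)
)

# Results 4.10 and 4.11: congruences can be added and multiplied.
sums_and_products_ok = all(
    congruent(a + c, b + d, n) and congruent(a * c, b * d, n)
    for n in moduli
    for a, b, c, d in product(R, repeat=4)
    if congruent(a, b, n) and congruent(c, d, n)
)

# Result 4.12 and its consequence: if n^2 is not congruent to n modulo 3,
# then n is congruent to 2 modulo 3.
result_4_12_ok = all(
    congruent(m, 2, 3)
    for m in range(-60, 61)
    if not congruent(m * m, m, 3)
)
```

The examples from the text behave as stated: `congruent(15, 7, 4)` and `congruent(3, -15, 9)` are true, and `congruent(14, 4, 6)` is false.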

4.3 Proofs Involving Real Numbers

We now apply the proof techniques we have introduced to verify some mathematical statements involving real numbers. To be certain that we are working under the same set of rules, let us recall some facts about real numbers whose truth we accept without justification. We have already mentioned that a² ≥ 0 for every real number a. In fact, aⁿ ≥ 0 for every real number a if n is a positive even integer. If a < 0 and n is a positive odd integer, then aⁿ < 0. Of course, the product of two real numbers is positive if and only if both numbers are positive or both are negative.

Now let a, b, c ∈ R. If a ≥ b and c ≥ 0, then ac ≥ bc. Indeed, if c > 0, then a/c ≥ b/c as well. Moreover,

if a > b and c > 0, then ac > bc and a/c > b/c. (4.1)

If c < 0, then the inequalities in (4.1) are reversed; namely,

if a > b and c < 0, then ac < bc and a/c < b/c. (4.2)

Another important and well-known property of real numbers is that if the product of two real numbers is 0, then at least one of these numbers is 0.

Theorem to Prove

If x and y are real numbers such that xy = 0, then x = 0 or y = 0.

PROOF STRATEGY

If we use a direct proof, then we begin by assuming that xy = 0. If x = 0, then we already have the desired result. On the other hand, if x ≠ 0, then we must show that y = 0. However, if x ≠ 0, then 1/x is a real number. This suggests multiplying xy = 0 by 1/x.

Theorem 4.13 Let x, y ∈ R. If xy = 0, then x = 0 or y = 0.

Proof Assume that xy = 0. We consider two cases, according to whether x = 0 or x ≠ 0.

Case 1. x = 0. Then we have the desired result.

Case 2. x ≠ 0. Multiplying xy = 0 by the number 1/x, we obtain (1/x)(xy) = (1/x) · 0 = 0. Since (1/x)(xy) = ((1/x)x)y = 1 · y = y, it follows that y = 0.

We now use Theorem 4.13 to prove the next result.

Result 4.14 Let x ∈ R. If x³ − 5x² + 3x = 15, then x = 5.

Proof Assume that x³ − 5x² + 3x = 15. Thus x³ − 5x² + 3x − 15 = 0. Observe that x³ − 5x² + 3x − 15 = x²(x − 5) + 3(x − 5) = (x² + 3)(x − 5). Since x³ − 5x² + 3x − 15 = 0, it follows that (x² + 3)(x − 5) = 0. By Theorem 4.13, either x² + 3 = 0 or x − 5 = 0. Since x² + 3 > 0, it follows that x − 5 = 0 and hence x = 5.

Next we consider an example of a proof by contrapositive involving an inequality.
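The factorization used in the proof of Result 4.14 can be checked with exact rational arithmetic. This is a sketch under our own assumptions (the function names `p` and `factored` and the sample points are illustrative); since two cubic polynomials that agree at more than three points are identical, agreement on the samples confirms the identity.

```python
from fractions import Fraction

def p(x):
    """The polynomial x^3 - 5x^2 + 3x - 15 from the proof of Result 4.14."""
    return x**3 - 5 * x**2 + 3 * x - 15

def factored(x):
    """The factored form (x^2 + 3)(x - 5)."""
    return (x**2 + 3) * (x - 5)

# Exact rational sample points, so no floating-point error is involved.
samples = [Fraction(n, d) for n in range(-40, 41) for d in (1, 2, 3, 7)]

identity_holds = all(p(x) == factored(x) for x in samples)

# Among the samples, the only solution of p(x) = 0 is x = 5.
roots_in_samples = {x for x in samples if p(x) == 0}
```

The factor x² + 3 is at least 3 for every real x, which is why Theorem 4.13 forces x − 5 = 0.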

Result 4.15 Let x ∈ R. If x⁵ − 3x⁴ + 2x³ − x² + 4x − 1 ≥ 0, then x ≥ 0.

Proof Assume that x < 0. Then x⁵ < 0, 2x³ < 0 and 4x < 0. In addition, −3x⁴ < 0 and −x² < 0. Hence x⁵ − 3x⁴ + 2x³ − x² + 4x − 1 < 0 − 1 < 0, as desired.

Occasionally we encounter problems that involve verifying a certain equality or inequality, where it is convenient to find an equivalent formulation of the equality or inequality whose truth is clear. This then becomes the starting point of a proof. We now consider an inequality whose proof uses this common approach.

Result to Prove

If x, y ∈ R, then (1/3)x² + (3/4)y² ≥ xy.

PROOF STRATEGY

First, let us eliminate the fractions from the expression. Showing that (1/3)x² + (3/4)y² ≥ xy is equivalent to showing that

12((1/3)x² + (3/4)y²) ≥ 12xy,

that is, 4x² + 9y² ≥ 12xy, which in turn is equivalent to 4x² − 12xy + 9y² ≥ 0. That is, if we could show that 4x² − 12xy + 9y² ≥ 0, then we could show that (1/3)x² + (3/4)y² ≥ xy. A simple observation about 4x² − 12xy + 9y² leads to a proof.

Result 4.16 If x, y ∈ R, then (1/3)x² + (3/4)y² ≥ xy.

Proof Since (2x − 3y)² ≥ 0, it follows that 4x² − 12xy + 9y² ≥ 0 and thus 4x² + 9y² ≥ 12xy. Dividing this inequality by 12, we obtain (1/3)x² + (3/4)y² ≥ xy, which is the desired inequality.

Recall that for a real number x, its absolute value |x| is defined by

|x| = x if x ≥ 0, and |x| = −x if x < 0.

A well-known property of absolute values is that |xy| = |x||y| for every two real numbers x and y (see Exercise 30). The following theorem gives another well-known property of absolute values of real numbers (called the triangle inequality) that has numerous applications. Since the definition of |x| is essentially a definition by cases, proofs involving |x| are often by cases.

Theorem 4.17

(The triangle inequality) For any two real numbers x and y, |x + y| ≤ |x| + |y|.

108

Chapter 4 More on direct proof and proof by contrapositive proof

Proof

Since |x + y| = |x| + |y| if either x or y is 0, we can assume that x and y are nonzero. We proceed by cases.

Case 1. x > 0 and y > 0. Then x + y > 0 and |x + y| = x + y = |x| + |y|.

Case 2. x < 0 and y < 0. Since x + y < 0, |x + y| = −(x + y) = (−x) + (−y) = |x| + |y|.

Case 3. One of x and y is positive and the other is negative. Assume, without loss of generality, that x > 0 and y < 0. We consider two subcases.

Subcase 3.1. x + y ≥ 0. Then |x| + |y| = x + (−y) = x − y > x + y = |x + y|.

Subcase 3.2. x + y < 0. Here |x| + |y| = x + (−y) = x − y > −x − y = −(x + y) = |x + y|.

Therefore, |x + y| ≤ |x| + |y| for every two real numbers x and y.
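The case analysis in the proof of the triangle inequality can be spot-checked numerically. The following is only a sketch: the helper names `abs_by_cases` and `triangle_holds` and the list of sample values are assumptions made for illustration, and checking finitely many pairs is of course not a proof.

```python
from itertools import product


def abs_by_cases(x):
    """Absolute value defined by cases: x if x >= 0, otherwise -x."""
    return x if x >= 0 else -x


def triangle_holds(x, y):
    """Return True when |x + y| <= |x| + |y| for this particular pair."""
    return abs_by_cases(x + y) <= abs_by_cases(x) + abs_by_cases(y)


# Sample values covering zero, both-positive, both-negative and mixed-sign pairs.
samples = [-7, -2.5, -1, 0, 1, 2.5, 7]

violations = [(x, y) for x, y in product(samples, repeat=2)
              if not triangle_holds(x, y)]
print(violations)  # an empty list means no sampled pair violates the inequality

# When x and y have opposite signs (Case 3), the inequality is strict.
strict_mixed = all(
    abs_by_cases(x + y) < abs_by_cases(x) + abs_by_cases(y)
    for x, y in product(samples, repeat=2)
    if x * y < 0
)
print(strict_mixed)
```

The strictness check mirrors Subcases 3.1 and 3.2, where the proof obtains ">" rather than "=".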

Example 4.18

Show that if |x − 1| < 1 and |x − 1| < r/4, where r ∈ R+, then |x² + x − 2| < r.

Solution

First, observe that |x² + x − 2| = |(x + 2)(x − 1)| = |x + 2||x − 1|. By Theorem 4.17, |x + 2| = |(x − 1) + 3| ≤ |x − 1| + |3| < 1 + 3 = 4. Therefore, |x² + x − 2| = |x + 2||x − 1| < 4(r/4) = r.

4.4 Proofs with Sets

We now turn our attention to proofs concerning properties of sets. Recall that for sets A and B contained in some universal set U, the intersection of A and B is A ∩ B = {x : x ∈ A and x ∈ B}, the union of A and B is A ∪ B = {x : x ∈ A or x ∈ B}, and the difference of A and B is A − B = {x : x ∈ A and x ∉ B}. The set A − B is also called the relative complement of B in A, and the relative complement of A in U is called simply the complement of A and is denoted by $\overline{A}$. Thus $\overline{A} = U - A$. In what follows, we always assume that the sets under discussion are subsets of some universal set U.

Figure 4.1 Venn diagrams for A − B and $A \cap \overline{B}$

Figure 4.1 shows Venn diagrams of A − B and $A \cap \overline{B}$ for arbitrary sets A and B. The diagrams suggest that these two sets are equal. This is, in fact, the case. Recall that to show the equality of two sets C and D, we can verify the two set inclusions C ⊆ D and D ⊆ C. To establish the inclusion C ⊆ D, we show that every element of C is also an element of D; that is, if x ∈ C, then x ∈ D. This is accomplished with a direct proof by letting x be an (arbitrary) element of C and showing that x must belong to D as well. Recall that we need not be concerned if C contains no elements; for in this case x ∈ C is false for every element x, and so the implication "If x ∈ C, then x ∈ D" is true for all x ∈ U; that is, the statement holds vacuously. As a consequence of this observation, if C = ∅, then C contains no elements and it follows that C ⊆ D.

Result 4.19

For every two sets A and B, $A - B = A \cap \overline{B}$.

Proof

First we show that $A - B \subseteq A \cap \overline{B}$. Let x ∈ A − B. Then x ∈ A and x ∉ B. Since x ∉ B, it follows that $x \in \overline{B}$. Therefore, x ∈ A and $x \in \overline{B}$; so $x \in A \cap \overline{B}$. Hence $A - B \subseteq A \cap \overline{B}$.

Next we show that $A \cap \overline{B} \subseteq A - B$. Let $y \in A \cap \overline{B}$. Then y ∈ A and $y \in \overline{B}$. Since $y \in \overline{B}$, we see that y ∉ B. Now because y ∈ A and y ∉ B, we conclude that y ∈ A − B. Thus $A \cap \overline{B} \subseteq A - B$.
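Result 4.19 can be checked exhaustively over a small universal set. In the sketch below, the universal set `U = {0, 1, 2, 3}` and the helpers `subsets` and `complement` are hypothetical choices made for illustration; the check confirms the identity for every pair of subsets of this one universe and does not replace the proof.

```python
from itertools import combinations


def subsets(universe):
    """Yield every subset of a finite universe as a frozenset."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


U = frozenset(range(4))  # hypothetical universal set {0, 1, 2, 3}


def complement(s):
    """Complement of s relative to the universal set U."""
    return U - s


all_subsets = list(subsets(U))

# A - B should equal A intersected with the complement of B, for all A and B.
result_4_19_holds = all(
    A - B == A & complement(B)
    for A in all_subsets
    for B in all_subsets
)
print(result_4_19_holds)
```

The empty set is among the subsets tested, so the vacuous case C = ∅ discussed before the result is covered as well.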

PROOF ANALYSIS

In the second paragraph of the proof of Result 4.19, we used y (rather than x) to denote an arbitrary element of $A \cap \overline{B}$. We did this only for variety. We could have used x twice. Once we decided to use distinct symbols, y was the logical choice since x was used in the first paragraph of the proof. This keeps our use of symbols consistent. Another possibility would have been to use a in the first paragraph and b in the second. This has some disadvantages, however. Since the sets are named A and B, we might be inclined to assume that a ∈ A and b ∈ B, which could confuse the reader. For this reason, we chose x and y over a and b.

Before leaving the proof of Result 4.19, we have one more remark. At one point in the second paragraph, we learned that y ∈ A and y ∉ B. From this we could have concluded (correctly) that y ∉ A ∩ B, but this is not what we wanted. Instead, we wrote that y ∈ A − B. It is always a good idea to keep our goal in sight. We wanted to show that y ∈ A − B; so it was important to keep in mind that it was the set A − B in which we were interested, not A ∩ B.

Figure 4.2 Venn diagrams for (A ∪ B) − (A ∩ B) and (A − B) ∪ (B − A)

Next we consider the Venn diagrams for (A ∪ B) − (A ∩ B) and (A − B) ∪ (B − A), shown in Figure 4.2. From these two diagrams, we might conclude (correctly) that the two sets (A ∪ B) − (A ∩ B) and (A − B) ∪ (B − A) are equal. Indeed, all that is lacking is a proof that these two sets are equal. That is, Venn diagrams can be useful in suggesting certain results concerning sets, but they are only drawings and do not constitute a proof.

Result 4.20

For every two sets A and B, (A ∪ B) − (A ∩ B) = (A − B) ∪ (B − A).

Proof

First we show that (A ∪ B) − (A ∩ B) ⊆ (A − B) ∪ (B − A). Let x ∈ (A ∪ B) − (A ∩ B). Then x ∈ A ∪ B and x ∉ A ∩ B. Since x ∈ A ∪ B, it follows that x ∈ A or x ∈ B. Without loss of generality, we can assume that x ∈ A. Since x ∉ A ∩ B, the element x ∉ B. Therefore, x ∈ A − B and so x ∈ (A − B) ∪ (B − A). Hence (A ∪ B) − (A ∩ B) ⊆ (A − B) ∪ (B − A).

Next we show that (A − B) ∪ (B − A) ⊆ (A ∪ B) − (A ∩ B). Let x ∈ (A − B) ∪ (B − A). Then x ∈ A − B or x ∈ B − A, say the former. So x ∈ A and x ∉ B. Thus x ∈ A ∪ B and x ∉ A ∩ B. Consequently, x ∈ (A ∪ B) − (A ∩ B). Therefore, (A − B) ∪ (B − A) ⊆ (A ∪ B) − (A ∩ B), as desired.
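The two sides of Result 4.20 can be computed directly for concrete sets. The sets `A` and `B` below are arbitrary illustrative choices, not taken from the text; the sketch also compares both sides with Python's built-in symmetric difference operator `^`, which computes the same set.

```python
# Hypothetical example sets chosen only for illustration.
A = {1, 2, 3, 4}
B = {3, 4, 5}

left = (A | B) - (A & B)    # (A ∪ B) − (A ∩ B)
right = (A - B) | (B - A)   # (A − B) ∪ (B − A)

print(left, right)

# The built-in symmetric difference A ^ B agrees with both expressions.
same_as_builtin = left == right == (A ^ B)
print(same_as_builtin)
```

As with a Venn diagram, one agreeing example suggests the result but does not establish it; the element-chasing argument above is what covers all sets A and B.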

PROOF ANALYSIS

In the proof of Result 4.20, when we were verifying the set inclusion (A ∪ B) − (A ∩ B) ⊆ (A − B) ∪ (B − A), we concluded that x ∈ A or x ∈ B. At that point, we could have divided the proof into two cases (Case 1. x ∈ A and Case 2. x ∈ B); however, the proofs of the two cases would be identical, except that A and B would be interchanged. Therefore, we decided to consider only one of these. Since it really didn't matter which case we handled, we simply chose the case where x ∈ A. We did this by writing: Without loss of generality, assume that x ∈ A.

In the proof of the reverse set inclusion, we found ourselves in a similar situation, namely, x ∈ A − B or x ∈ B − A. Again, these two situations were basically identical and we simply chose to work with the first (former) situation. (Had we decided to assume that x ∈ B − A, we would have considered the latter case.)

Now let's look at an example of a biconditional concerning sets.

Result 4.21

Let A and B be sets. Then A ∪ B = A if and only if B ⊆ A.

Proof

First we prove that if A ∪ B = A, then B ⊆ A. We use a proof by contrapositive. Assume that B is not a subset of A. Then there must be some element x ∈ B such that x ∉ A. Since x ∈ B, it follows that x ∈ A ∪ B. However, since x ∉ A, we have A ∪ B ≠ A.

Next we verify the converse, namely, if B ⊆ A, then A ∪ B = A. We use a direct proof here. Assume that B ⊆ A. To verify that A ∪ B = A, we show that A ⊆ A ∪ B and A ∪ B ⊆ A. The set inclusion A ⊆ A ∪ B is immediate (if x ∈ A, then x ∈ A ∪ B). It remains only to show that A ∪ B ⊆ A. Let y ∈ A ∪ B. Thus y ∈ A or y ∈ B. If y ∈ A, then we already have the desired result. If y ∈ B, then since B ⊆ A, it follows that y ∈ A. Thus A ∪ B ⊆ A.
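The biconditional in Result 4.21 lends itself to a brute-force check over all pairs of subsets of a small set. In this sketch the universe `{1, 2, 3}` and the helper `powerset` are assumptions made for illustration. The last lines mirror the contrapositive half of the proof: when B is not a subset of A, an element of B − A witnesses that A ∪ B ≠ A.

```python
from itertools import chain, combinations


def powerset(universe):
    """Return all subsets of a finite set as a list of Python sets."""
    items = list(universe)
    combos = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(c) for c in combos]


subsets_of_U = powerset({1, 2, 3})  # hypothetical small universe

# "A ∪ B = A" and "B ⊆ A" should have the same truth value for every pair.
biconditional_holds = all(
    ((A | B) == A) == (B <= A)
    for A in subsets_of_U
    for B in subsets_of_U
)
print(biconditional_holds)

# Contrapositive direction on one example: B is not a subset of A.
A, B = {1}, {1, 2}
witness = next(iter(B - A))   # an element of B that is not in A
print(witness in (A | B), witness in A)
```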

PROOF ANALYSIS

In the first paragraph of the proof of Result 4.21, we indicated that we were using a proof by contrapositive, while in the second paragraph we mentioned that we were using a direct proof. This really wasn't necessary, as the assumptions we made would tell the reader which technique we were applying. Also, in the proof of Result 4.21, we used a proof by contrapositive for one implication and a direct proof for its converse. This wasn't necessary either. Indeed, it is quite possible to interchange the techniques we used (see Exercise 41).

4.5 Basic Properties of Set Operations

Many results concerning sets follow from some very fundamental properties of sets, which, in turn, follow from corresponding results about logical statements that were described in Chapter 2. For example, we know that if P and Q are two statements, then P ∨ Q and Q ∨ P are logically equivalent. Similarly, if A and B are two sets, then A ∪ B = B ∪ A. We list some of the basic properties of set operations in the following theorem.

Theorem 4.22

For sets A, B and C,

(1) Commutative Laws
    (a) A ∪ B = B ∪ A
    (b) A ∩ B = B ∩ A
(2) Associative Laws
    (a) A ∪ (B ∪ C) = (A ∪ B) ∪ C
    (b) A ∩ (B ∩ C) = (A ∩ B) ∩ C
(3) Distributive Laws
    (a) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
    (b) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
(4) De Morgan's Laws
    (a) $\overline{A \cup B} = \overline{A} \cap \overline{B}$
    (b) $\overline{A \cap B} = \overline{A} \cup \overline{B}$.

We present proofs of only three parts of Theorem 4.22, beginning with the commutative law of the union of two sets.

Proof of Theorem 4.22(1a)

We show that A ∪ B ⊆ B ∪ A. Assume that x ∈ A ∪ B. Then x ∈ A or x ∈ B. Applying the commutative law for the disjunction of statements, we conclude that x ∈ B or x ∈ A; so x ∈ B ∪ A. Thus A ∪ B ⊆ B ∪ A. The proof of the reverse set inclusion B ∪ A ⊆ A ∪ B is similar and is therefore omitted.

Next we verify one of the distributive laws.

Proof of Theorem 4.22(3a)

First we show that A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C). Let x ∈ A ∪ (B ∩ C). Then x ∈ A or x ∈ B ∩ C. If x ∈ A, then x ∈ A ∪ B and x ∈ A ∪ C. Thus x ∈ (A ∪ B) ∩ (A ∪ C), as desired. On the other hand, if x ∈ B ∩ C, then x ∈ B and x ∈ C; and again, x ∈ A ∪ B and x ∈ A ∪ C. So x ∈ (A ∪ B) ∩ (A ∪ C). Therefore, A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C).

To verify the reverse set inclusion, let x ∈ (A ∪ B) ∩ (A ∪ C). Then x ∈ A ∪ B and x ∈ A ∪ C. If x ∈ A, then x ∈ A ∪ (B ∩ C). So we may assume that x ∉ A. Then the fact that x ∈ A ∪ B and x ∉ A implies that x ∈ B. By the same reasoning, x ∈ C. Thus x ∈ B ∩ C and so x ∈ A ∪ (B ∩ C). Therefore, (A ∪ B) ∩ (A ∪ C) ⊆ A ∪ (B ∩ C).

As a final example, we prove one of De Morgan's laws.

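Proof of Theorem 4.22(4a)

First, we show that $\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$. Let $x \in \overline{A \cup B}$. Then x ∉ A ∪ B. Hence x ∉ A and x ∉ B. Therefore, $x \in \overline{A}$ and $x \in \overline{B}$, so that $x \in \overline{A} \cap \overline{B}$. Consequently, $\overline{A \cup B} \subseteq \overline{A} \cap \overline{B}$.

Next we show that $\overline{A} \cap \overline{B} \subseteq \overline{A \cup B}$. Let $x \in \overline{A} \cap \overline{B}$. Then $x \in \overline{A}$ and $x \in \overline{B}$. Thus x ∉ A and x ∉ B; so x ∉ A ∪ B. Therefore, $x \in \overline{A \cup B}$. Hence $\overline{A} \cap \overline{B} \subseteq \overline{A \cup B}$.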

PROOF ANALYSIS

In the proof of the De Morgan law that we just presented, we arrived at the step x ∉ A ∪ B at one point and then wrote x ∉ A and x ∉ B. Since x ∈ A ∪ B implies that x ∈ A or x ∈ B, you might have expected us to write that x ∉ A or x ∉ B after writing x ∉ A ∪ B, but this would not be the correct conclusion. When we say that x ∉ A ∪ B, this is equivalent to writing ∼(x ∈ A ∪ B), which is logically equivalent to ∼((x ∈ A) or (x ∈ B)). By the De Morgan law for the negation of the disjunction of two statements (or two open sentences), we have that ∼((x ∈ A) or (x ∈ B)) is logically equivalent to ∼(x ∈ A) and ∼(x ∈ B); that is, x ∉ A and x ∉ B.

Proofs of some other parts of Theorem 4.22 are left as exercises.
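The distinction drawn in the analysis above, between "x ∉ A and x ∉ B" and "x ∉ A or x ∉ B", is easy to see on a concrete example. The universal set `U` and the sets `A` and `B` in this sketch are hypothetical values chosen for illustration.

```python
# Hypothetical universal set and subsets, chosen only for illustration.
U = set(range(1, 7))
A = {1, 2, 3}
B = {3, 4}

not_in_union = U - (A | B)   # complement of A ∪ B

# Correct negation: x ∉ A AND x ∉ B.
neither = {x for x in U if x not in A and x not in B}

# Tempting but incorrect negation: x ∉ A OR x ∉ B.
either_missing = {x for x in U if x not in A or x not in B}

print(not_in_union == neither)          # the "and" form matches the complement of the union
print(not_in_union == either_missing)   # the "or" form does not
print(either_missing == U - (A & B))    # the "or" form is the complement of the intersection
```

The "or" form is not wasted: it describes the complement of A ∩ B, which is the content of De Morgan's law 4.22(4b).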


4.6 Proofs using Cartesian products of sets

Recall that the Cartesian product (or simply the product) A × B of two sets A and B is defined as A × B = {(a, b) : a ∈ A and b ∈ B}. If A = ∅ or B = ∅, then A × B = ∅. Before looking at several examples of proofs concerning Cartesian products of sets, it is important to keep in mind that an arbitrary element of the Cartesian product A × B of two sets A and B is of the form (a, b), where a ∈ A and b ∈ B.

Result 4.23

Let A, B, C and D be sets. If A ⊆ C and B ⊆ D, then A × B ⊆ C × D.

Proof

Let (x, y) ∈ A × B. Then x ∈ A and y ∈ B. Since A ⊆ C and B ⊆ D, it follows that x ∈ C and y ∈ D. Hence (x, y) ∈ C × D.

Result 4.24

For sets A, B and C, A × (B ∪ C) = (A × B) ∪ (A × C).

Proof

We first show that A × (B ∪ C) ⊆ (A × B) ∪ (A × C). Let (x, y) ∈ A × (B ∪ C). Then x ∈ A and y ∈ B ∪ C. Thus y ∈ B or y ∈ C, say the former. Then (x, y) ∈ A × B and so (x, y) ∈ (A × B) ∪ (A × C). Consequently, A × (B ∪ C) ⊆ (A × B) ∪ (A × C).

Next we show that (A × B) ∪ (A × C) ⊆ A × (B ∪ C). Let (x, y) ∈ (A × B) ∪ (A × C). Then (x, y) ∈ A × B or (x, y) ∈ A × C, say the former. Then x ∈ A and y ∈ B ⊆ B ∪ C. Hence (x, y) ∈ A × (B ∪ C), implying that (A × B) ∪ (A × C) ⊆ A × (B ∪ C).

We give one additional example of a proof concerning Cartesian products of sets.

Result 4.25

For sets A, B and C, A × (B − C) = (A × B) − (A × C).

Proof

First we show that A × (B − C) ⊆ (A × B) − (A × C). Let (x, y) ∈ A × (B − C). Then x ∈ A and y ∈ B − C. Since y ∈ B − C, it follows that y ∈ B and y ∉ C. Because x ∈ A and y ∈ B, we have (x, y) ∈ A × B. Since y ∉ C, however, (x, y) ∉ A × C. Therefore, (x, y) ∈ (A × B) − (A × C). Hence A × (B − C) ⊆ (A × B) − (A × C).

We now show that (A × B) − (A × C) ⊆ A × (B − C). Let (x, y) ∈ (A × B) − (A × C). Then (x, y) ∈ A × B and (x, y) ∉ A × C. Since (x, y) ∈ A × B, it follows that x ∈ A and y ∈ B. Also, since x ∈ A and (x, y) ∉ A × C, it follows that y ∉ C. So y ∈ B − C. Thus (x, y) ∈ A × (B − C) and (A × B) − (A × C) ⊆ A × (B − C).

PROOF ANALYSIS

We add one comment concerning the preceding proof. During the proof of (A × B) − (A × C) ⊆ A × (B − C), we needed to show that y ∉ C. We had learned that (x, y) ∉ A × C. However, this information alone did not allow us to conclude that y ∉ C. Indeed, if (x, y) ∉ A × C, then x ∉ A or y ∉ C. Since we knew, however, that x ∈ A and (x, y) ∉ A × C, we were able to conclude that y ∉ C.
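Results 4.24 and 4.25, together with the remark in the analysis above, can be illustrated with small finite sets. The sets `A`, `B`, `C`, the helper `cart` and the sample `pair` below are all hypothetical, introduced only for this sketch.

```python
from itertools import product


def cart(X, Y):
    """Cartesian product X × Y as a set of ordered pairs."""
    return set(product(X, Y))


# Hypothetical example sets.
A = {1, 2}
B = {"p", "q", "r"}
C = {"q", "s"}

result_4_24 = cart(A, B | C) == cart(A, B) | cart(A, C)
result_4_25 = cart(A, B - C) == cart(A, B) - cart(A, C)
print(result_4_24, result_4_25)

# Knowing only that (x, y) is not in A × C does not give y ∉ C:
# here the first coordinate is outside A, while the second lies in C.
pair = (3, "q")
print(pair not in cart(A, C), pair[1] in C)
```

The last line shows why the proof of Result 4.25 needed the extra fact x ∈ A before concluding that y ∉ C.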

EXERCISES FOR CHAPTER 4

Section 4.1: Proofs Involving Divisibility of Integers

4.1. Let a and b be integers, where a ≠ 0. Prove that if a | b, then a² | b².
4.2. Let a, b ∈ Z, where a ≠ 0 and b ≠ 0. Prove that if a | b and b | a, then a = b or a = −b.
4.3. Let m ∈ Z.
(a) Give a direct proof of the following: If 3 | m, then 3 | m².
(b) State the contrapositive of the implication in (a).
(c) Give a direct proof of the following: If 3 ∤ m, then 3 ∤ m².
(d) State the contrapositive of the implication in (c).
(e) State the conjunction of the implications in (a) and (c) using "if and only if."
4.4. Let x, y ∈ Z. Prove that if 3 ∤ x and 3 ∤ y, then 3 | (x² − y²).
4.5. Let a, b, c ∈ Z, where a ≠ 0. Prove that if a ∤ bc, then a ∤ b and a ∤ c.
4.6. Let a ∈ Z. Prove that if 3 | 2a, then 3 | a.
4.7. Let n ∈ Z. Prove that 3 | (2n² + 1) if and only if 3 ∤ n.
4.8. In Result 4.4, it was proved for an integer x that if 2 | (x² − 1), then 4 | (x² − 1). Prove that if 2 | (x² − 1), then 8 | (x² − 1).
4.9. (a) Let x ∈ Z. Prove that if 2 | (x² − 5), then 4 | (x² − 5). (b) Give an example of an integer x such that 2 | (x² − 5) but 8 ∤ (x² − 5).
4.10. Let n ∈ Z. Prove that 2 | (n⁴ − 3) if and only if 4 | (n² + 3).
4.11. Prove that for every integer n ≥ 8, there exist nonnegative integers a and b such that n = 3a + 5b.
4.12. In Result 4.7, it was proved for integers x and y that 4 | (x² − y²) if and only if x and y are of the same parity. In particular, this says that if x and y are both even, then 4 | (x² − y²); while if x and y are both odd, then 4 | (x² − y²). Prove that if x and y are both odd, then 8 | (x² − y²).
4.13. Prove that if a, b, c ∈ Z and a² + b² = c², then 3 | ab.

Section 4.2: Proofs Involving Congruence of Integers

4.14. Let a, b, n ∈ Z, where n ≥ 2. Prove that if a ≡ b (mod n), then a² ≡ b² (mod n).
4.15. Let a, b, c, n ∈ Z, where n ≥ 2. Prove that if a ≡ b (mod n) and a ≡ c (mod n), then b ≡ c (mod n).
4.16. Let a, b ∈ Z. Prove that if a² + 2b² ≡ 0 (mod 3), then either a and b are both congruent to 0 modulo 3 or neither is congruent to 0 modulo 3.
4.17. (a) Prove that if a is an integer such that a ≡ 1 (mod 5), then a² ≡ 1 (mod 5). (b) Given that b is an integer such that b ≡ 1 (mod 5), what can we conclude from (a)?
4.18. Let m, n ∈ N such that m ≥ 2 and m | n. Prove that if a and b are integers such that a ≡ b (mod n), then a ≡ b (mod m).
4.19. Let a, b ∈ Z. Show that if a ≡ 5 (mod 6) and b ≡ 3 (mod 4), then 4a + 6b ≡ 6 (mod 8).
4.20. (a) Result 4.12 states: Let n ∈ Z. If n² ≢ n (mod 3), then n ≢ 0 (mod 3) and n ≢ 1 (mod 3). State and prove the converse of this result. (b) State the conjunction of Result 4.12 and its converse using "if and only if."
4.21. Let a ∈ Z. Prove that a³ ≡ a (mod 3).
4.22. Let n ∈ Z. Prove each of the statements (a)–(f).
(a) If n ≡ 0 (mod 7), then n² ≡ 0 (mod 7).
(b) If n ≡ 1 (mod 7), then n² ≡ 1 (mod 7).
(c) If n ≡ 2 (mod 7), then n² ≡ 4 (mod 7).
(d) If n ≡ 3 (mod 7), then n² ≡ 2 (mod 7).
(e) For each integer n, n² ≡ (7 − n)² (mod 7).
(f) For every integer n, n² is congruent to exactly one of 0, 1, 2 or 4 modulo 7.
4.23. Prove for any set S = {a, a + 1, ..., a + 5} of six integers, where 6 | a, that 24 | (x² − y²) for distinct odd integers x and y in S if and only if one of x and y is congruent to 1 modulo 6 while the other is congruent to 5 modulo 6.
4.24. Let x and y be even integers. Prove that x² ≡ y² (mod 16) if and only if either (1) x ≡ 0 (mod 4) and y ≡ 0 (mod 4) or (2) x ≡ 2 (mod 4) and y ≡ 2 (mod 4).

Section 4.3: Proofs Involving Real Numbers

4.25. Let x, y ∈ R. Prove that if x² − 4x = y² − 4y and x ≠ y, then x + y = 4.
4.26. Let a, b and m be integers. Prove that if 2a + 3b ≥ 12m + 1, then a ≥ 3m + 1 or b ≥ 2m + 1.
4.27. Let x ∈ R. Prove that if 3x⁴ + 1 ≤ x⁷ + x³, then x > 0.
4.28. Prove that if r is a real number such that 0 < r < 1, then 1/(r(1 − r)) ≥ 4.
4.29. Prove that if r is a real number such that |r − 1| < 1, then 4/(r(4 − r)) ≥ 1.
4.30. Let x, y ∈ R. Prove that |xy| = |x| |y|.
4.31. Prove for every two real numbers x and y that |x + y| ≥ |x| − |y|.
4.32. (a) Recall that √r > 0 for every positive real number r. Prove that if a and b are positive real numbers, then 0 < √(ab) ≤ (a + b)/2. (The number √(ab) is called the geometric mean of a and b, while (a + b)/2 is called the arithmetic mean or average of a and b.) (b) Under what conditions does √(ab) = (a + b)/2 for positive real numbers a and b? Justify your answer.
4.33. The geometric mean of three positive real numbers a, b and c is ∛(abc) and the arithmetic mean is (a + b + c)/3. Prove that ∛(abc) ≤ (a + b + c)/3. [Note: The numbers a, b and c can be expressed as a = r³, b = s³ and c = t³ for positive numbers r, s and t.]
4.34. Prove for every three real numbers x, y and z that |x − z| ≤ |x − y| + |y − z|.
4.35. Prove that if x is a real number such that x(x + 1) > 2, then x < −2 or x > 1.
4.36. Prove for every positive real number x that 1 + 1/x⁴ ≥ 1/x + 1/x³.
4.37. Prove for x, y, z ∈ R that x² + y² + z² ≥ xy + xz + yz.
4.38. Let a, b, x, y ∈ R and r ∈ R+. Prove that if |x − a| < r/2 and |y − b| < r/2, then |(x + y) − (a + b)| < r.
4.39. Prove that if a, b, c, d ∈ R, then (ab + cd)² ≤ (a² + c²)(b² + d²).


Section 4.4: Proofs with Sets

4.40. Let A and B be sets. Prove that A ∪ B = (A − B) ∪ (B − A) ∪ (A ∩ B).
4.41. In Result 4.21, it was proved for sets A and B that A ∪ B = A if and only if B ⊆ A. Provide another proof of this result by giving a direct proof of the implication "If A ∪ B = A, then B ⊆ A" and a proof by contrapositive of its converse.
4.42. Let A and B be sets. Prove that A ∩ B = A if and only if A ⊆ B.
4.43. (a) Give an example of three sets A, B and C such that A ∩ B = A ∩ C but B ≠ C. (b) Give an example of three sets A, B and C such that A ∪ B = A ∪ C but B ≠ C. (c) Let A, B and C be sets. Prove that if A ∩ B = A ∩ C and A ∪ B = A ∪ C, then B = C.
4.44. Prove that if A and B are sets such that A ∪ B ≠ ∅, then A ≠ ∅ or B ≠ ∅.
4.45. Let A = {n ∈ Z : n ≡ 1 (mod 2)} and B = {n ∈ Z : n ≡ 3 (mod 4)}. Prove that B ⊆ A.
4.46. Let A and B be sets. Prove that A ∪ B = A ∩ B if and only if A = B.
4.47. Let A = {n ∈ Z : n ≡ 2 (mod 3)} and B = {n ∈ Z : n ≡ 1 (mod 2)}. (a) Describe the elements of the set A − B. (b) Prove that if n ∈ A ∩ B, then n² ≡ 1 (mod 12).
4.48. Let A = {n ∈ Z : 2 | n} and B = {n ∈ Z : 4 | n}. Let n ∈ Z. Prove that n ∈ A − B if and only if n = 2k for some odd integer k.
4.49. Prove for every two sets A and B that A = (A − B) ∪ (A ∩ B).
4.50. Prove for every two sets A and B that A − B, B − A and A ∩ B are pairwise disjoint.
4.51. Let A and B be subsets of a universal set. Which of the following conditions is a necessary condition for A and B to be disjoint?
(a) Either A = ∅ or B = ∅.
(b) Whenever x ∉ A, it must occur that x ∈ B.
(c) Whenever x ∉ A, it must occur that x ∉ B.
(d) Whenever x ∈ A, it must occur that x ∈ B.
(e) Whenever x ∈ A, it must occur that x ∉ B.

Section 4.5: Basic Properties of Set Operations

4.52. Prove that A ∩ B = B ∩ A for every two sets A and B (Theorem 4.22(1b)).
4.53. Prove that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) for every three sets A, B and C (Theorem 4.22(3b)).
4.54. Prove that $\overline{A \cap B} = \overline{A} \cup \overline{B}$ for every two sets A and B (Theorem 4.22(4b)).
4.55. Let A, B and C be sets. Prove that (A − B) ∩ (A − C) = A − (B ∪ C).
4.56. Let A, B and C be sets. Prove that (A − B) ∪ (A − C) = A − (B ∩ C).
4.57. Let A, B and C be sets. Use Theorem 4.22 to prove that $\overline{\overline{A} \cup (\overline{B} \cap C)} = (A \cap B) \cup (A - C)$.
4.58. Let A, B and C be sets. Prove that A ∩ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
4.59. Show for every three sets A, B and C that A − (B − C) = (A ∩ C) ∪ (A − B).

Section 4.6: Proofs with Cartesian Products of Sets

4.60. For A = {x, y}, determine A × P(A).
4.61. For A = {1} and B = {2}, determine P(A × B) and P(A) × P(B).
4.62. Let A and B be sets. Prove that A × B = ∅ if and only if A = ∅ or B = ∅.
4.63. For sets A and B, find a necessary and sufficient condition for A × B = B × A.
4.64. For sets A and B, find a necessary and sufficient condition for (A × B) ∩ (B × A) = ∅. Verify that this condition is necessary and sufficient.
4.65. Let A, B and C be nonempty sets. Prove that A × C ⊆ B × C if and only if A ⊆ B.
4.66. Result 4.23 states that if A, B, C and D are sets such that A ⊆ C and B ⊆ D, then A × B ⊆ C × D. (a) Show that the converse of Result 4.23 is false. (b) Under what added hypothesis is the converse true? Prove your assertion.
4.67. Let A, B and C be sets. Prove that A × (B ∩ C) = (A × B) ∩ (A × C).
4.68. Let A, B, C and D be sets. Prove that (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
4.69. Let A, B, C and D be sets. Prove that (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
4.70. Let A and B be sets. Show, in general, that $\overline{A \times B} \neq \overline{A} \times \overline{B}$.

ADDITIONAL EXERCISES FOR CHAPTER 4

4.71. Let n ∈ Z. Prove that 5 | n² if and only if 5 | n.
4.72. Prove for integers a and b that 3 | ab if and only if 3 | a or 3 | b.
4.73. Prove that if n is an odd integer, then 8 | [n² + (n + 6)² + 6].
4.74. Prove that if n is an odd integer, then 8 | (n⁴ + 4n² + 11).
4.75. Let n, m ∈ Z. Prove that if n ≡ 1 (mod 2) and m ≡ 3 (mod 4), then n² + m ≡ 0 (mod 4).
4.76. Find two distinct positive integer values of a for which the following is true and give a proof in each case: For every integer n, a ∤ (n² + 1).
4.77. Prove for every two real numbers a and b that ab ≤ √(a²) √(b²).
4.78. Prove for every two positive real numbers a and b that a/b + b/a ≥ 2.
4.79. Prove the following: Let x ∈ R. If x(x − 5) = −4, then √(5x² − 4) = 1 implies that x + 1/x = 2.
4.80. Let x, y ∈ R. Prove that if x < 0, then x³ − x²y ≤ x²y − xy².
4.81. Prove that 3 | (n³ − 4n) for every integer n.
4.82. Evaluate the proposed proof of the following result.

Result Let x, y ∈ Z. If x ≡ 2 (mod 3) and y ≡ 2 (mod 3), then xy ≡ 1 (mod 3).

Proof Let x ≡ 2 (mod 3) and y ≡ 2 (mod 3). Then x = 3k + 2 and y = 3k + 2 for some integer k. Hence xy = (3k + 2)(3k + 2) = 9k² + 12k + 4 = 9k² + 12k + 3 + 1 = 3(3k² + 4k + 1) + 1. Since 3k² + 4k + 1 is an integer, xy ≡ 1 (mod 3).

4.83. Below is given a proof of a result. What result is being proved?

Proof Assume that x ≡ 1 (mod 5) and y ≡ 2 (mod 5). Then 5 | (x − 1) and 5 | (y − 2). Hence x − 1 = 5a and y − 2 = 5b for some integers a and b. So x = 5a + 1 and y = 5b + 2. Therefore, x² + y² = (5a + 1)² + (5b + 2)² = (25a² + 10a + 1) + (25b² + 20b + 4) = 25a² + 10a + 25b² + 20b + 5 = 5(5a² + 2a + 5b² + 4b + 1). Since 5a² + 2a + 5b² + 4b + 1 is an integer, 5 | (x² + y²) and so x² + y² ≡ 0 (mod 5).

4.84. A proof of the following result is given.

Result Let n ∈ Z. If n⁴ is even, then 3n + 1 is odd.

Proof Assume that n⁴ = (n²)² is even. Since n⁴ is even, n² is even. Furthermore, since n² is even, n is even. Because n is even, n = 2k for some integer k. Then 3n + 1 = 3(2k) + 1 = 6k + 1 = 2(3k) + 1. Since 3k is an integer, 3n + 1 is odd.

Answer the following questions.
(1) Which proof technique is being used?
(2) What is the starting assumption?
(3) What must be shown to give a complete proof?
(4) Give a reason for each of the following steps in the proof.
(a) Since n⁴ is even, n² is even.
(b) Furthermore, since n² is even, n is even.
(c) Because n is even, n = 2k for some integer k.
(d) Then 3n + 1 = 3(2k) + 1 = 6k + 1 = 2(3k) + 1.
(e) Since 3k is an integer, 3n + 1 is odd.

4.85. Given below is an attempted proof of a result.

Proof First, we show that A ⊆ (A ∪ B) − B. Let x ∈ A. Since A ∩ B = ∅, it follows that x ∉ B. Therefore, x ∈ A ∪ B and x ∉ B; so x ∈ (A ∪ B) − B. Thus A ⊆ (A ∪ B) − B. Next, we show that (A ∪ B) − B ⊆ A. Let x ∈ (A ∪ B) − B. Then x ∈ A ∪ B and x ∉ B. From this, it follows that x ∈ A. Thus (A ∪ B) − B ⊆ A.

(a) What result is being proved above?
(b) What change (or changes) in this proof would make it better (from your point of view)?

4.86. Evaluate the proposed proof of the following result.

Result Let x, y ∈ Z such that 3 | x. If 3 | (x + y), then 3 | y.

Proof Since 3 | x, it follows that x = 3a, where a ∈ Z. Assume that 3 | (x + y). Then x + y = 3b for some integer b. Hence y = 3b − x = 3b − 3a = 3(b − a). Since b − a is an integer, 3 | y. For the converse, assume that 3 | y. Therefore, y = 3c, where c ∈ Z. Thus x + y = 3a + 3c = 3(a + c). Since a + c is an integer, 3 | (x + y).

4.87. Evaluate the proposed proof of the following result.

Result Let x, y ∈ Z. If x ≡ 1 (mod 3) and y ≡ 1 (mod 3), then xy ≡ 1 (mod 3).

Proof Assume that x ≡ 1 (mod 3) and y ≡ 1 (mod 3). Then 3 | (x − 1) and 3 | (y − 1). Hence x − 1 = 3q and y − 1 = 3q for some integer q and so x = 3q + 1 and y = 3q + 1. Thus xy = (3q + 1)(3q + 1) = 9q² + 6q + 1 = 3(3q² + 2q) + 1 and so xy − 1 = 3(3q² + 2q). Since 3q² + 2q is an integer, 3 | (xy − 1). Hence xy ≡ 1 (mod 3).

4.88. Evaluate the proposed proof of the following result.

Result For every three sets A, B and C, (A × C) − (B × C) ⊆ (A − B) × C.

Proof Let (x, y) ∈ (A × C) − (B × C). Then (x, y) ∈ A × C and (x, y) ∉ B × C. Since (x, y) ∈ A × C, it follows that x ∈ A and y ∈ C. Since (x, y) ∉ B × C, we have x ∉ B. Thus x ∈ A − B. Hence (x, y) ∈ (A − B) × C.

4.89. Prove that for every three integers a, b and c, the sum |a − b| + |a − c| + |b − c| is an even integer.
4.90. Prove for every four real numbers a, b, c and d that ac + bd ≤ √(a² + b²) √(c² + d²).
4.91. Prove for every real number x that sin⁶ x + 3 sin² x cos² x + cos⁶ x = 1.
4.92. Let a ∈ Z. Prove that if 6 | a and 10 | a, then 15 | a.
4.93. Let A = {x}. Give an example of a set, related to the set A, to which each of the following elements belongs. (a) (x, {x}) (b) ({x}, x) (c) (x, x) (d) ({x}, {x}) (e) x (f) {x} (g) {(x, x)} (h) ({x}, {{x}}).
4.94. Let a, b ∈ Z. Prove that if a ≡ b (mod 2) and b ≡ a (mod 3), then a ≡ b (mod 6).
4.95. Let a, b, c ∈ R. Prove that (3/2)(a² + b² + c² + 1) ≥ a(b + 1) + b(c + 1) + c(a + 1).
4.96. Prove that if a, b and c are positive real numbers, then (a + b + c)(1/a + 1/b + 1/c) ≥ 9.
4.97. Let T = {1, 2, ..., 8}.
(a) Determine the elements of the set A = {a ∈ T : 2^m ≡ a (mod 9) for some m ∈ N}.
(b) Determine the elements of the set B = {a ∈ T : 5^m ≡ a (mod 9) for some m ∈ N}.
(c) What is interesting about the sets A and B in (a) and (b)?
4.98. Consider the open sentence P(m) : 5m + 1 = a² for some a ∈ Z, where m ∈ N. That is, P(m) is the open sentence: 5m + 1 is a perfect square.
(a) Determine four distinct solutions t of t² ≡ 4 (mod 5). For each solution t, determine m = 4(t² − 4)/5 + 3 and show that P(m) is a true statement.
(b) Show that the set S = {t ∈ Z : t² ≡ 4 (mod 5)} contains infinitely many elements.
(c) Let t be an element of the set S in (b). Prove that if m = 4(t² − 4)/5 + 3, then 5m + 1 is a perfect square.
(d) As a consequence of the results established in (a)–(c), what can be concluded about the set M = {m ∈ N : 5m + 1 is a perfect square}?

4.99. Let a₁, a₂, ..., aₙ (n ≥ 3) be n integers such that |aᵢ₊₁ − aᵢ| ≤ 1 for 1 ≤ i ≤ n − 1. Prove that if k is any integer that lies strictly between a₁ and aₙ, then there is an integer j with 1 < j < n such that aⱼ = k.

5 Existence and Proof by Contradiction

Thus far, we have been primarily concerned with quantified statements involving universal quantifiers, namely statements of the type ∀x ∈ S, R(x). We now consider problems that involve, either directly or indirectly, quantified statements involving existential quantifiers, that is, statements of the type ∃x ∈ S, R(x).

5.1 Counterexamples

It must certainly come as no surprise that some quantified statements of the type ∀x ∈ S, R(x) are false. We have seen that ∼(∀x ∈ S, R(x)) ≡ ∃x ∈ S, ∼R(x), that is, if the statement ∀x ∈ S, R(x) is false, then there exists some element x ∈ S for which R(x) is false. Such an element x is called a counterexample of the (false) statement ∀x ∈ S, R(x). Finding a counterexample verifies that ∀x ∈ S, R(x) is false.

Example 5.1

Consider the statement:

If x ∈ R, then (x² − 1)² > 0. (5.1)

or, equivalently: For every real number x, (x² − 1)² > 0. Show that the statement (5.1) is false by exhibiting a counterexample.

Solution

For x = 1, (x² − 1)² = (1² − 1)² = 0. Thus x = 1 is a counterexample.

It might be noticed that the number x = −1 is also a counterexample. In fact, x = 1 and x = −1 are the only two counterexamples of the statement (5.1). That is, the statement

If x ∈ R − {1, −1}, then (x² − 1)² > 0 (5.2)

is true.
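A brute-force search is one informal way to look for counterexamples to a statement such as (5.1). The sketch below tests only the integers from −10 to 10, an arbitrary range chosen for illustration, and the function name `claim_5_1` is hypothetical. Such a search can expose counterexamples, but it says nothing about real numbers that were not sampled.

```python
def claim_5_1(x):
    """The open sentence in statement (5.1): (x^2 - 1)^2 > 0."""
    return (x**2 - 1) ** 2 > 0


# Collect every sampled integer for which the open sentence is false.
counterexamples = [x for x in range(-10, 11) if not claim_5_1(x)]
print(counterexamples)  # [-1, 1]
```

Any single element of the printed list is enough to disprove (5.1).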

If a statement P is shown to be false in some manner, then P is said to be disproved. The counterexample x = 1 therefore disproves the statement (5.1).

Example 5.2

Disprove the statement:

If x is a real number, then tan² x + 1 = sec² x. (5.3)

Solution

Since tan x and sec x are not defined when x = π/2, it follows that tan² x + 1 and sec² x have no numerical value when x = π/2 and, consequently, tan² x + 1 and sec² x are not equal when x = π/2. That is, x = π/2 is a counterexample to statement (5.3).

Although tan² x + 1 = sec² x is a well-known identity from trigonometry, statement (5.3) is false as it is presented. The following statement is true, however:

If x is a real number for which tan x and sec x are defined, then tan² x + 1 = sec² x. (5.4)

Indeed, it is probably statement (5.4) that was intended in Example 5.2 rather than statement (5.3). Since tan x and sec x are defined for precisely the same real numbers x (namely, those numbers x such that cos x ≠ 0), we can restate (5.4) as

If x ∈ R − {nπ + π/2 : n ∈ Z}, then tan² x + 1 = sec² x.

Example 5.3

Disprove the statement:

If x ∈ Z, then (x² + x)/(x² − x) = (x + 1)/(x − 1). (5.5)

Solution

If x = 0, then x² − x = 0 and so (x² + x)/(x² − x) is undefined. On the other hand, if x = 0, then (x + 1)/(x − 1) = −1; so the expressions (x² + x)/(x² − x) and (x + 1)/(x − 1) are certainly not equal when x = 0. Thus x = 0 is a counterexample to statement (5.5).

Since neither (x² + x)/(x² − x) nor (x + 1)/(x − 1) is defined when x = 1, it follows that x = 1 is also a counterexample to statement (5.5). Indeed, x = 0 and x = 1 are the only counterexamples to statement (5.5) and so the statement

If x ∈ Z − {0, 1}, then (x² + x)/(x² − x) = (x + 1)/(x − 1)

is true.

The three preceding examples illustrate the fact that an open sentence R(x) that is false over some domain S may very well be true over a subset of S. Therefore, the truth (or falseness) of a statement ∀x ∈ S, R(x) depends not only on the open sentence R(x) but on its domain as well.

Example 5.4

Disprove the statement:

For every odd positive integer n, 3 | (n² − 1). (5.6)

Solution

Since 3 ∤ (3² − 1), it follows that n = 3 is a counterexample.

You might have noticed that even though 3 ∤ (3² − 1), it is the case that 3 | (n² − 1) for some odd positive integers. For example, 3 | (n² − 1) if n = 1, 5, 7, 11, 13, 17, while 3 ∤ (n² − 1) if n = 3, 9, 15, 21. This should make you wonder for which odd positive integers n the open sentence 3 | (n² − 1) is true. (See Result 4.6.)

We have seen that a quantified statement of the type ∀x ∈ S, R(x) is false if ∃x ∈ S, ∼R(x) is true, that is, if there exists some element x ∈ S for which R(x) is false. There will be many instances when R(x) is an implication P(x) ⇒ Q(x). Therefore, the quantified statement

∀x ∈ S, P(x) ⇒ Q(x) (5.7)

is false if

∃x ∈ S, ∼(P(x) ⇒ Q(x)) (5.8)

is true. By Theorem 2.4(a), the statement (5.8) can be expressed as ∃x ∈ S, (P(x) ∧ (∼Q(x))). That is, to show that statement (5.7) is false, we need to exhibit a counterexample, which is then an element x ∈ S for which P(x) is true and Q(x) is false.

Example 5.5

Disprove the claim: Let n ∈ Z. If n 2 + 3n is even, then n is odd.

solution

If n = 2, then n 2 + 3n = 22 + 3 x 2 = 10 is even and 2 is even. So n = 2 is a counterexample. In the previous example, not only is 2 a counterexample, but any even integer is a counterexample.
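The description of a counterexample to an implication (an element for which $P(x)$ is true and $Q(x)$ is false) translates directly into a search. The sketch below is illustrative only: the helper name `find_counterexamples` and the finite ranges are assumptions, not part of the text.

```python
from typing import Callable, Iterable, List

def find_counterexamples(domain: Iterable[int],
                         p: Callable[[int], bool],
                         q: Callable[[int], bool]) -> List[int]:
    """Return every x in the domain with P(x) true and Q(x) false."""
    return [x for x in domain if p(x) and not q(x)]

# Example 5.4: P(n) is "n is odd", Q(n) is "3 divides n^2 - 1",
# searched over the positive integers 1..21 (an arbitrary finite range).
ex_5_4 = find_counterexamples(range(1, 22),
                              lambda n: n % 2 == 1,
                              lambda n: (n * n - 1) % 3 == 0)

# Example 5.5: P(n) is "n^2 + 3n is even", Q(n) is "n is odd",
# searched over the integers -6..6 (again an arbitrary finite range).
ex_5_5 = find_counterexamples(range(-6, 7),
                              lambda n: (n * n + 3 * n) % 2 == 0,
                              lambda n: n % 2 != 0)

print(ex_5_4)
print(ex_5_5)
```

The first list consists of the odd multiples of 3 in the range, and the second consists of the even integers in the range, matching the remarks after Examples 5.4 and 5.5.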

Example 5.6

Disprove the statement: If $n$ is an odd integer, then $n^2 - n$ is odd. (5.9)

Solution

For the odd integer $n = 1$, the integer $n^2 - n = 1^2 - 1 = 0$ is even. Thus $n = 1$ is a counterexample.

In fact, it is not difficult to prove that the statement

If $n$ is an odd integer, then $n^2 - n$ is even.

is true. Although it may very well be of interest to know this, showing that statement (5.9) is false requires exhibiting only a single counterexample. It does not require proving a different result. One should know the difference between these two.

Example 5.7

Show that the statement: Let $n \in \mathbb{Z}$. If $4 \mid (n^2 - 1)$, then $4 \mid (n - 1)$. is false.

Solution

Since $4 \mid (3^2 - 1)$ but $4 \nmid (3 - 1)$, it follows that $n = 3$ is a counterexample.

Example 5.8

Show that the statement: For positive integers $a, b, c$, $a^{b^c} = (a^b)^c$. is false.

Solution

Let $a = 2$, $b = 2$, and $c = 3$. Then $a^{b^c} = 2^{2^3} = 2^8 = 256$, while $(a^b)^c = (2^2)^3 = 4^3 = 64$. Since $256 \neq 64$, the positive integers $a = 2$, $b = 2$, and $c = 3$ constitute a counterexample.

Example 5.9

Show that the statement: Let $a$ and $b$ be nonzero real numbers. If $x, y \in \mathbb{R}^+$, then

$$\frac{a^2}{2b^2} x^2 + \frac{b^2}{2a^2} y^2 > xy \tag{5.10}$$

is false.

Solution

Let $x = b^2$ and $y = a^2$. Then

$$\frac{a^2}{2b^2} x^2 + \frac{b^2}{2a^2} y^2 = \frac{a^2 b^2}{2} + \frac{a^2 b^2}{2} = a^2 b^2 = xy.$$

Thus $x = b^2$ and $y = a^2$ is a counterexample, and so the statement is false.

Analysis

After reading the solution of Example 5.9, the only question that may remain is where the counterexample $x = b^2$ and $y = a^2$ came from. Multiplying the inequality (5.10) by $2a^2 b^2$ (which eliminates all fractions) produces the equivalent inequality

$$a^4 x^2 + b^4 y^2 > 2a^2 b^2 xy$$

and so $a^4 x^2 - 2a^2 b^2 xy + b^4 y^2 > 0$, which can be expressed as $(a^2 x - b^2 y)^2 > 0$. Of course, $(a^2 x - b^2 y)^2 \geq 0$. Thus any values of $x$ and $y$ for which $a^2 x - b^2 y = 0$ produce a counterexample. Although there are many choices for $x$ and $y$, one such choice is $x = b^2$ and $y = a^2$.
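The arithmetic in Examples 5.8 and 5.9 can be confirmed with exact rational arithmetic. This is a minimal sketch under stated assumptions: the function name `lhs_5_10` and the sample values of $a$ and $b$ are hypothetical choices made for illustration, not values from the text.

```python
from fractions import Fraction

# Example 5.8: compare a^(b^c) with (a^b)^c for a = 2, b = 2, c = 3.
tower = 2 ** (2 ** 3)
power_of_power = (2 ** 2) ** 3
print(tower, power_of_power)

def lhs_5_10(a, b, x, y):
    """Left side of (5.10): (a^2 / (2 b^2)) x^2 + (b^2 / (2 a^2)) y^2."""
    return a**2 / (2 * b**2) * x**2 + b**2 / (2 * a**2) * y**2

# Hypothetical sample values of the nonzero numbers a and b.
samples = [(Fraction(1), Fraction(2)),
           (Fraction(-3), Fraction(5, 7)),
           (Fraction(2, 3), Fraction(-4))]

equality_cases = []
for a, b in samples:
    x, y = b**2, a**2  # the counterexample used in the solution
    # The strict inequality (5.10) fails because the two sides are equal.
    equality_cases.append(lhs_5_10(a, b, x, y) == x * y)

print(equality_cases)
```

For each sample pair the two sides of (5.10) come out equal at $x = b^2$, $y = a^2$, so the strict inequality fails there, as the analysis predicts.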

5.2 Proof by Contradiction

Suppose, as usual, that we would like to show that a certain mathematical statement $R$ is true. If $R$ is expressed as the quantified statement $\forall x \in S, P(x) \Rightarrow Q(x)$, then we have already introduced two proof techniques, namely direct proof and proof by contrapositive, that could be used to establish the truth of $R$. We now introduce a third method that can be used to establish the truth of $R$, regardless of whether $R$ is expressed in terms of an implication.

Suppose that we assume $R$ is a false statement and, from this assumption, we are able to arrive at or deduce a statement that contradicts some assumption we made in the proof or some known fact. (The known fact might be a definition, an axiom, or a theorem.) If we denote this assumption or known fact by $P$, then what we have deduced is $\sim P$, and we have thereby produced the contradiction $C: P \wedge (\sim P)$. We have therefore established the truth of the implication $(\sim R) \Rightarrow C$. However, because $(\sim R) \Rightarrow C$ is true and $C$ is false, it follows that $\sim R$ is false and so $R$ is true, as desired. This technique is called proof by contradiction.

If $R$ is the quantified statement $\forall x \in S, P(x) \Rightarrow Q(x)$, then a proof by contradiction of this statement consists of verifying the implication

$\sim (\forall x \in S, P(x) \Rightarrow Q(x)) \Rightarrow C$

for some contradiction $C$. However, since

$\sim (\forall x \in S, P(x) \Rightarrow Q(x)) \equiv \exists x \in S, \sim (P(x) \Rightarrow Q(x)) \equiv \exists x \in S, (P(x) \wedge (\sim Q(x)))$,

it follows that a proof by contradiction of $\forall x \in S, P(x) \Rightarrow Q(x)$ would begin by assuming the existence of some element $x \in S$ such that $P(x)$ is true and $Q(x)$ is false. That is, a proof by contradiction of $\forall x \in S, P(x) \Rightarrow Q(x)$ begins by assuming that there is a counterexample to this quantified statement.

Often the reader is alerted that a proof by contradiction is being used by saying (or writing)

Suppose that $R$ is false.

or

Assume, to the contrary, that $R$ is false.

Therefore, if $R$ is the quantified statement $\forall x \in S, P(x) \Rightarrow Q(x)$, then a proof by contradiction might begin with:

Assume, to the contrary, that there exists some element $x \in S$ for which $P(x)$ is true and $Q(x)$ is false.

(or something along these lines). The remainder of the proof then consists of showing that this assumption leads to a contradiction. Let's now look at some examples of proof by contradiction. We begin by establishing a fact about positive real numbers.

Result to Prove

There is no smallest positive real number.

PROOF STRATEGY

In a proof by contradiction, we begin by assuming that the statement is false and attempt to show that this leads to a contradiction. Hence we begin by assuming that there is a smallest positive real number. It is useful to represent this number by a symbol, say $r$. Our goal is to produce a contradiction. How do we go about doing this? Of course, if we could think of a positive real number that is less than $r$, then this would give us a contradiction.

Result 5.10

There is no smallest positive real number.

Proof

Assume, to the contrary, that there is a smallest positive real number, say $r$. Since $0 < r/2 < r$, it follows that $r/2$ is a positive real number that is smaller than $r$. This, however, is a contradiction.

PROOF ANALYSIS

The contradiction referred to in the proof of Result 5.10 is the statement: $r$ is the smallest positive real number and $r/2$ is a positive real number that is less than $r$. This statement is certainly false. We have assumed that the reader understands what contradiction has been obtained. If we think that the reader may not see this, then, of course, we should state specifically (in the proof) what the contradiction is. There is another point concerning Result 5.10 that should be made. This result states that "there is no smallest positive real number." This is a negative-sounding result. In the vast majority of cases, negative-sounding results are proved by contradiction. Thus the proof technique used in Result 5.10 is not unexpected.

Let's consider two additional examples.
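The logical principle behind proof by contradiction, introduced at the start of this section, can be checked mechanically with a truth table. The sketch below is an illustration only (the helper name `implies` is an assumption): it confirms that $((\sim R) \Rightarrow (P \wedge (\sim P))) \Rightarrow R$ is true for every assignment of truth values to $R$ and $P$.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of the implication p => q."""
    return (not p) or q

rows = []
for r, p in product([True, False], repeat=2):
    c = p and (not p)  # the contradiction C: P and (not P), always False
    # If (not R) => C holds, then R must hold.
    rows.append(implies(implies(not r, c), r))

is_tautology = all(rows)
print(is_tautology)
```

Since the statement is true in all four rows, establishing $(\sim R) \Rightarrow C$ for a contradiction $C$ does force $R$ to be true.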

Result 5.11

No odd integer can be expressed as the sum of three even integers.

Proof

Assume, to the contrary, that there exists an odd integer $n$ that can be expressed as the sum of three even integers $x$, $y$, and $z$. Then $x = 2a$, $y = 2b$, and $z = 2c$ with $a, b, c \in \mathbb{Z}$. Therefore,

$n = x + y + z = 2a + 2b + 2c = 2(a + b + c)$.

Since $a + b + c$ is an integer, $n$ is even. This is a contradiction.

PROOF ANALYSIS

Consider the statement:

$R$: No odd integer can be expressed as the sum of three even integers.

Obviously, Result 5.11 states that $R$ is a true statement. In order to prove Result 5.11 by contradiction, we attempt to prove an implication of the type $(\sim R) \Rightarrow C$ for some contradiction $C$. The negation $\sim R$ is

$\sim R$: There exists an odd integer that can be expressed as the sum of three even integers.

The proof we gave of Result 5.11 began by assuming the truth of $\sim R$. We introduced symbols for the four integers involved to make the proof easier to explain. Eventually, we were able to show that $n$ is an even integer. On the other hand, we knew that $n$ is odd. Hence $n$ was both even and odd. This was our contradiction $C$.

In the two examples of proof by contradiction that we have given, neither statement to be proved is expressed as an implication. For our next example, we consider an implication.

Result 5.12

If $a$ is an even integer and $b$ is an odd integer, then $4 \nmid (a^2 + 2b^2)$.

Proof

Assume, to the contrary, that there exist an even integer $a$ and an odd integer $b$ such that $4 \mid (a^2 + 2b^2)$. Then $a = 2x$, $b = 2y + 1$, and $a^2 + 2b^2 = 4z$ for some integers $x$, $y$, and $z$. Hence

$a^2 + 2b^2 = (2x)^2 + 2(2y + 1)^2 = 4z$.

Simplifying, we obtain $4x^2 + 8y^2 + 8y + 2 = 4z$ or, equivalently,

$2 = 4z - 4x^2 - 8y^2 - 8y = 4(z - x^2 - 2y^2 - 2y)$.

Since $z - x^2 - 2y^2 - 2y$ is an integer, $4 \mid 2$, which is impossible.

PROOF ANALYSIS

Let $S$ be the set of even integers and $T$ the set of odd integers. In Result 5.12, our goal was to prove that

$\forall a \in S, \forall b \in T, P(a, b)$ (5.11)

is true, where $P(a, b)$: $4 \nmid (a^2 + 2b^2)$. Since we were attempting to prove (5.11) by contradiction, we wanted to establish the truth of

$\sim (\forall a \in S, \forall b \in T, P(a, b)) \Rightarrow C$

for some contradiction $C$ or, equivalently, the truth of

$(\exists a \in S, \exists b \in T, \sim P(a, b)) \Rightarrow C$.

Hence we began by assuming that there exist an even integer $a$ and an odd integer $b$ such that $4 \mid (a^2 + 2b^2)$. We eventually deduced that $4 \mid 2$, which is a false statement and thereby produced the desired contradiction.

Using some of the facts we have already discussed, we could have proved Result 5.12 by a direct proof. Once we wrote $a = 2x$ and $b = 2y + 1$, we have

$a^2 + 2b^2 = (2x)^2 + 2(2y + 1)^2 = 4x^2 + 8y^2 + 8y + 2 = 4(x^2 + 2y^2 + 2y) + 2$.

Hence we have expressed $a^2 + 2b^2$ as $4q + 2$, where $q = x^2 + 2y^2 + 2y$. That is, dividing $a^2 + 2b^2$ by 4 results in a remainder of 2, and so $4 \nmid (a^2 + 2b^2)$. At this stage, however, a proof by contradiction of Result 5.12 is probably preferred, both for practicing and for understanding this proof technique.

Let's look at two other negative-sounding results.

Result 5.13

The integer 100 cannot be written as the sum of three integers, an odd number of which are odd.

Proof

Assume, to the contrary, that 100 can be written as the sum of three integers $a$, $b$, and $c$, an odd number of which are odd. We consider two cases.

Case 1. Exactly one of $a$, $b$, and $c$ is odd, say $a$. Then $a = 2x + 1$, $b = 2y$, and $c = 2z$, where $x, y, z \in \mathbb{Z}$. So

$100 = a + b + c = (2x + 1) + 2y + 2z = 2(x + y + z) + 1$.

Since $x + y + z \in \mathbb{Z}$, the integer 100 is odd, producing a contradiction.

Case 2. All of $a$, $b$, and $c$ are odd. Then $a = 2x + 1$, $b = 2y + 1$, and $c = 2z + 1$, where $x, y, z \in \mathbb{Z}$. So

$100 = a + b + c = (2x + 1) + (2y + 1) + (2z + 1) = 2(x + y + z + 1) + 1$.

Since $x + y + z + 1 \in \mathbb{Z}$, the integer 100 is odd, again a contradiction.

PROOF ANALYSIS

Observe that the proof of Result 5.13 begins by assuming that 100 can be written as the sum of three integers, an odd number of which are odd (as expected). However, introducing symbols for these integers, namely $a$, $b$, and $c$, made for an easier and clearer proof.
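Results 5.12 and 5.13 are statements about all integers, so no finite computation can replace their proofs, but a small search can make the patterns visible. The sketch below is illustrative only; the ranges and the helper name `odd_count` are assumptions chosen to keep the search small.

```python
from itertools import product

# Result 5.12: for even a and odd b, the remainder of a^2 + 2b^2 on
# division by 4 is collected over a small (arbitrary) range.
remainders = {(a * a + 2 * b * b) % 4
              for a in range(-20, 21, 2)     # even integers a
              for b in range(-21, 22, 2)}    # odd integers b
print(remainders)

def odd_count(triple) -> int:
    """Number of odd entries in a triple of integers."""
    return sum(1 for t in triple if t % 2 != 0)

# Result 5.13: parity of a + b + c when an odd number of a, b, c are odd.
sum_parities = {sum(t) % 2
                for t in product(range(-10, 11), repeat=3)
                if odd_count(t) % 2 == 1}
print(sum_parities)
```

In the ranges searched, the remainder is always 2 (as the direct proof of Result 5.12 shows in general) and the sum is always odd, so it can never equal 100.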

Result 5.14

For every integer $m$ such that $2 \mid m$ and $4 \nmid m$, there exist no integers $x$ and $y$ for which $x^2 + 3y^2 = m$.

Proof

Assume, to the contrary, that there exist an integer $m$ such that $2 \mid m$ and $4 \nmid m$ and integers $x$ and $y$ for which $x^2 + 3y^2 = m$. Since $2 \mid m$, it follows that $m$ is even. By Theorem 3.16, $x^2$ and $3y^2$ are of the same parity. We consider two cases.

Case 1. $x^2$ and $3y^2$ are even. Since $3y^2$ is even and 3 is odd, it follows by Theorem 3.17 that $y^2$ is even. Because $x^2$ and $y^2$ are both even, we have by Theorem 3.12 that $x$ and $y$ are even. Thus $x = 2a$ and $y = 2b$, where $a, b \in \mathbb{Z}$. Therefore,

$x^2 + 3y^2 = (2a)^2 + 3(2b)^2 = 4a^2 + 12b^2 = 4(a^2 + 3b^2) = m$.

Since $a^2 + 3b^2 \in \mathbb{Z}$, it follows that $4 \mid m$, producing a contradiction.

Case 2. $x^2$ and $3y^2$ are odd. Since $3y^2$ is odd and 3 is odd, it follows by (the contrapositive formulation of) Theorem 3.17 that $y^2$ is odd. By (the contrapositive formulation of) Theorem 3.12, $x$ and $y$ are both odd. Then $x = 2a + 1$ and $y = 2b + 1$, where $a, b \in \mathbb{Z}$. Thus

$x^2 + 3y^2 = (2a + 1)^2 + 3(2b + 1)^2 = (4a^2 + 4a + 1) + 3(4b^2 + 4b + 1) = 4a^2 + 4a + 12b^2 + 12b + 4 = 4(a^2 + a + 3b^2 + 3b + 1) = m$.

Since $a^2 + a + 3b^2 + 3b + 1 \in \mathbb{Z}$, it follows that $4 \mid m$, producing a contradiction.

The next result concerns irrational numbers. Recall that a real number is rational if it can be expressed as $m/n$ for some $m, n \in \mathbb{Z}$ with $n \neq 0$. Since "irrational" means "not rational," it is not surprising that proof by contradiction is the proof technique we will use.

Result 5.15

The sum of a rational number and an irrational number is irrational.

Proof

Assume, to the contrary, that there exist a rational number $x$ and an irrational number $y$ whose sum is a rational number $z$. Thus $x + y = z$, where $x = a/b$ and $z = c/d$ for some integers $a, b, c, d \in \mathbb{Z}$ with $b, d \neq 0$. This implies that

$$y = \frac{c}{d} - \frac{a}{b} = \frac{bc - ad}{bd}.$$

Since $bc - ad$ and $bd$ are integers and $bd \neq 0$, it follows that $y$ is rational, which is a contradiction.
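The key computation in the proof of Result 5.15 is that the difference of two rational numbers $c/d$ and $a/b$ is the rational number $(bc - ad)/(bd)$. A minimal sketch with exact fractions follows; the sample integers are hypothetical values chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical sample integers with b and d nonzero.
a, b, c, d = 3, 4, -5, 6

x = Fraction(a, b)   # the rational number x = a/b
z = Fraction(c, d)   # the rational number z = c/d

y = z - x            # if x + y = z, then y = z - x
formula = Fraction(b * c - a * d, b * d)

print(y, formula)
```

Both values agree, which is the step that forces $y$ to be rational in the proof and produces the contradiction.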

Result 5.15 concerns the irrationality of numbers. One of the best-known irrational numbers is $\sqrt{2}$. Although we have never verified that this number is irrational, we establish this fact now.

Theorem to Prove

The real number $\sqrt{2}$ is irrational.

PROOF STRATEGY

In the proof of this result, we will use Theorem 3.12, which states that an integer $x$ is even if and only if $x^2$ is even. In the proof, it will also be useful to express a rational number $m/n$, where $m, n \in \mathbb{Z}$ and $n \neq 0$, in lowest terms, which means that $m$ and $n$ have no common divisor greater than 1.

Theorem 5.16

The real number $\sqrt{2}$ is irrational.

Proof

Assume, to the contrary, that $\sqrt{2}$ is rational. Then $\sqrt{2} = a/b$, where $a, b \in \mathbb{Z}$ and $b \neq 0$. We may further assume that $a/b$ has been expressed in (or reduced to) lowest terms. Then $2 = a^2/b^2$; so $a^2 = 2b^2$. Since $b^2$ is an integer, $a^2$ is even. By Theorem 3.12, $a$ is even. So $a = 2c$, where $c \in \mathbb{Z}$. Thus $(2c)^2 = 2b^2$ and so $4c^2 = 2b^2$. Therefore, $b^2 = 2c^2$. Since $c^2$ is an integer, $b^2$ is even, which implies by Theorem 3.12 that $b$ is even. Since $a$ and $b$ are even, each has 2 as a divisor, which is a contradiction because $a/b$ has been reduced to lowest terms.

The Three Prisoners Problem

We now take a brief diversion from our discussion of proof by contradiction to present a "story" problem. Three prisoners (see Figure 5.1) have been sentenced to long terms in prison, but due to overcrowded conditions, one prisoner must be released. The warden devises a scheme to determine which prisoner is to be released. He tells the prisoners that he will blindfold them and then paint a red dot or a blue dot on each forehead. After he paints the dots, he will remove the blindfolds, and a prisoner should raise his hand if he sees a red dot on at least one of the other two prisoners. The first prisoner to identify the color of the dot on his own forehead will be released. Of course, the prisoners agree to this. (What do they have to lose?) The warden blindfolds the prisoners, as promised, and then paints a red dot on the foreheads of all three prisoners. He removes the blindfolds and, since each prisoner sees a red dot (indeed, two red dots), each prisoner raises his hand. Some time passes when one of the prisoners exclaims, "I know what color my dot is! It's red!" This prisoner is then released. Although the story of the three prisoners is over, there is a lingering question: How did this prisoner correctly identify the color of the dot painted on his forehead? The solution is given next, but try to determine the answer for yourself before reading on.

Figure 5.1 The three prisoners (#1, #2, and #3)

Solution to the Three Prisoners Problem

Let's assume (without loss of generality) that it is prisoner #1 (see Figure 5.1) who determined that he had a red dot painted on his forehead. How did he come to this conclusion? Perhaps you think he just guessed, since he had nothing to lose anyway. But this is not the answer we were looking for. Prisoner #1 knows that the color of his dot is either red or blue. He thinks, "Assume, to the contrary, that my dot is blue. Then, of course, #2 knows this and he knows that #3 has a red dot. (That's why #2 raised his hand.) But #2 also knows that #3 raised his hand. So if my dot is blue, #2 knows that his dot is red. Similarly, if my dot is blue, then #3 knows that his dot is red. In other words, if my dot is blue, then both #2 and #3 should be able to identify the colors of their dots quite quickly. But time has passed and they have not determined the colors of their dots. So my dot can't be blue." Therefore, #1 exclaims, "I know what color my dot is! It's red!"

What you may have noticed is that the reasoning #1 used to conclude that his dot is red is proof by contradiction. It seems there is more to prisoner #1 than we know. But that is another story.

5.3 A Review of Three Proof Techniques

We have seen that we are often in the position of wanting to establish the truth of a statement of the form $\forall x \in S, P(x) \Rightarrow Q(x)$. You have now been introduced to three proof techniques: direct proof, proof by contrapositive, and proof by contradiction. For each of these three techniques, you should know how to begin a proof and what your goal should be. You should also know what not to do. Figure 5.2 lists several ways in which we might begin a proof. Only some of these can lead to a proof, however. Let's now compare the three proof techniques with two examples.

    First step of "proof" | Remarks/Goal
 1. Assume that there exists x ∈ S such that P(x) is true. | A direct proof is being used. Show that Q(x) is true for the element x.
 2. Assume that there exists x ∈ S such that P(x) is false. | A mistake has been made.
 3. Assume that there exists x ∈ S such that Q(x) is true. | A mistake has been made.
 4. Assume that there exists x ∈ S such that Q(x) is false. | A proof by contrapositive is being used. Show that P(x) is false for the element x.
 5. Assume that there exists x ∈ S such that P(x) and Q(x) are true. | A mistake has been made.
 6. Assume that there exists x ∈ S such that P(x) is true and Q(x) is false. | A proof by contradiction is being used. Produce a contradiction.
 7. Assume that there exists x ∈ S such that P(x) is false and Q(x) is true. | A mistake has been made.
 8. Assume that there exists x ∈ S such that P(x) and Q(x) are false. | A mistake has been made.
 9. Assume that there exists x ∈ S such that P(x) ⇒ Q(x) is true. | A mistake has been made.
10. Assume that there exists x ∈ S such that P(x) ⇒ Q(x) is false. | A proof by contradiction is being used. Produce a contradiction.

Figure 5.2 How to prove (and not to prove) that ∀x ∈ S, P(x) ⇒ Q(x) is true
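Two logical facts underlie Figure 5.2: an implication is equivalent to its contrapositive (row 4), and the negation of an implication $P \Rightarrow Q$ is $P \wedge (\sim Q)$ (which is why rows 6 and 10 describe the same starting assumption). The sketch below checks both by truth table; the helper name `implies` is an assumption introduced for illustration.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Truth value of the implication p => q."""
    return (not p) or q

assignments = list(product([True, False], repeat=2))

# P => Q has the same truth value as (not Q) => (not P) in every row.
contrapositive_ok = all(implies(p, q) == implies(not q, not p)
                        for p, q in assignments)

# not (P => Q) has the same truth value as P and (not Q) in every row.
negation_ok = all((not implies(p, q)) == (p and not q)
                  for p, q in assignments)

print(contrapositive_ok, negation_ok)
```

Both checks succeed on all four truth assignments.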

Result 5.17

If $n$ is an even integer, then $3n + 7$ is odd.

Direct Proof

Let $n$ be an even integer. Then $n = 2x$ for some integer $x$. Therefore,

$3n + 7 = 3(2x) + 7 = 6x + 7 = 2(3x + 3) + 1$.

Since $3x + 3$ is an integer, $3n + 7$ is odd.

Proof by Contrapositive

Assume that $3n + 7$ is even. Then $3n + 7 = 2y$ for some integer $y$. Hence

$n = (3n + 7) + (-2n - 7) = 2y - 2n - 7 = 2(y - n - 4) + 1$.

Since $y - n - 4$ is an integer, $n$ is odd.

Proof by Contradiction

Assume, to the contrary, that there exists an even integer $n$ such that $3n + 7$ is even. Since $n$ is even, $n = 2x$ for some integer $x$. Hence

$3n + 7 = 3(2x) + 7 = 6x + 7 = 2(3x + 3) + 1$.

Since $3x + 3$ is an integer, $3n + 7$ is odd, which is a contradiction.

Although a direct proof of Result 5.17 is certainly the preferred proof technique in this case, it is useful to compare all three techniques. The following example is more intricate.

Result 5.18

Let $x$ be a nonzero real number. If $x + \frac{1}{x} < 2$, then $x < 0$.

Direct Proof

Assume that $x + \frac{1}{x} < 2$. Since $x \neq 0$, we know that $x^2 > 0$. Multiplying both sides of the inequality $x + \frac{1}{x} < 2$ by $x^2$, we obtain $x^2 \left(x + \frac{1}{x}\right) < 2x^2$. Simplifying this inequality, we have $x^3 + x - 2x^2 < 0$; so

$x(x^2 - 2x + 1) = x(x - 1)^2 < 0$.

Since $(x - 1)^2 \geq 0$ and $x(x - 1)^2 \neq 0$, we must have $(x - 1)^2 > 0$. Since $x(x - 1)^2 < 0$ and $(x - 1)^2 > 0$, it follows that $x < 0$, as desired.

For a proof by contrapositive, we begin by assuming that $x \geq 0$ (so that $x > 0$, since $x \neq 0$) and attempt to show that $x + \frac{1}{x} \geq 2$. This inequality can be simplified by multiplying through by $x$, obtaining $x^2 + 1 \geq 2x$. Subtracting $2x$ from both sides, we have $x^2 - 2x + 1 = (x - 1)^2 \geq 0$, which, of course, we know to be true. A proof is suggested then by reversing the order of these steps:

$x + \frac{1}{x} \geq 2$
$x^2 + 1 \geq 2x$
$x^2 - 2x + 1 = (x - 1)^2 \geq 0$.

This method is common when dealing with inequalities.

Proof by Contrapositive

Assume that $x \geq 0$. Since $x \neq 0$, it follows that $x > 0$. Since $(x - 1)^2 \geq 0$, we have $(x - 1)^2 = x^2 - 2x + 1 \geq 0$. Adding $2x$ to both sides of this inequality, we obtain $x^2 + 1 \geq 2x$. Dividing both sides of the inequality $x^2 + 1 \geq 2x$ by the positive number $x$, we obtain $x + \frac{1}{x} \geq 2$, as desired.

Proof by Contradiction

Assume, to the contrary, that there exists a nonzero real number $x$ such that $x + \frac{1}{x} < 2$ and $x \geq 0$. Since $x \neq 0$, it follows that $x > 0$. Multiplying both sides of the inequality $x + \frac{1}{x} < 2$ by $x$, we obtain $x^2 + 1 < 2x$. Subtracting $2x$ from both sides, we have $x^2 - 2x + 1 < 0$. It follows that $(x - 1)^2 < 0$, which is a contradiction.

Many mathematicians feel that if a result can be verified by a direct proof, then this is the proof technique that should be used, as it is normally easier to understand. This is only a general guideline, however; it is not a hard and fast rule.
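The algebra used in the proofs of Result 5.18 can be spot-checked with exact rational arithmetic. The following sketch is illustrative only: the grid of sample values (multiples of $1/8$ between $-10$ and $10$) is an arbitrary assumption, and a finite check of this kind supports the result without proving it.

```python
from fractions import Fraction

# Hypothetical grid of nonzero rational sample points.
grid = [Fraction(k, 8) for k in range(-80, 81) if k != 0]

# Points where the hypothesis x + 1/x < 2 holds but the conclusion x < 0 fails.
violations = [x for x in grid if x + 1 / x < 2 and not x < 0]

# The identity used in the direct proof: x^2 (x + 1/x) - 2x^2 = x (x - 1)^2.
identity_ok = all(x**2 * (x + 1 / x) - 2 * x**2 == x * (x - 1)**2
                  for x in grid)

print(violations, identity_ok)
```

No sample point violates the implication, and the identity holds at every sample point.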

5.4 Existence Proofs

In an existence theorem, the existence of an object (or objects) possessing some specified property or properties is asserted. Typically then, an existence theorem concerning an open sentence $R(x)$ over a domain $S$ can be expressed as a quantified statement

$\exists x \in S, R(x)$: There exists $x \in S$ such that $R(x)$. (5.12)

We have seen that such a statement (5.12) is true provided that $R(x)$ is true for some $x \in S$. A proof of an existence theorem is called an existence proof. An existence proof may then consist of displaying or constructing an example of such an object or perhaps, with the aid of known results, verifying that such objects must exist without ever producing a single example of the desired type. For example, there are theorems in mathematics that tell us that every polynomial of odd degree with real coefficients has at least one real number as a solution, but we do not know how to find a real number solution for every such polynomial. Indeed, we quote the famous mathematician David Hilbert, who used the following example in his lectures to illustrate the idea of an existence proof:

There is at least one student in this class ... let us name him 'X' ... for whom the following statement is true: No other student in the class has more hairs on his head than X. Which student is it? That we shall never know; but of his existence we can be absolutely certain.

Let's now give some examples of existence proofs.

Result to Prove

There exists an integer whose cube equals its square.

PROOF STRATEGY

Since this result only asserts the existence of an integer whose cube equals its square, we have a proof once we can think of an example. The integer 1 has this property.

Result 5.19

There exists an integer whose cube equals its square.

Proof

Since $1^3 = 1^2 = 1$, the integer 1 has the desired property.

Suppose that we did not notice that the integer 1 satisfies the required condition in the preceding result. Then an alternative proof may go something like this: Let $x \in \mathbb{Z}$ such that $x^3 = x^2$. Then $x^3 - x^2 = 0$ or $x^2(x - 1) = 0$. Thus there are only two possible integers with this property, namely 1 and 0, and, in fact, both integers have the desired property.

A common error in elementary algebra is to write $(a + b)^2 = a^2 + b^2$. Can this ever be true?
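One way to explore this question before proving anything is to search small integer values of $a$ and $b$. The sketch below is only an experiment under stated assumptions: the search range from $-5$ to $5$ is arbitrary, and the statement in question concerns real numbers, of which integers are a special case.

```python
# Search a small (arbitrary) range of integer pairs for solutions of
# (a + b)^2 = a^2 + b^2.
solutions = [(a, b)
             for a in range(-5, 6)
             for b in range(-5, 6)
             if (a + b) ** 2 == a ** 2 + b ** 2]

# Record whether every solution found has a = 0 or b = 0.
all_have_a_zero_factor = all(a * b == 0 for a, b in solutions)

print(len(solutions), all_have_a_zero_factor)
```

The search finds solutions, so the equation can hold, and every solution found has at least one of $a$ and $b$ equal to 0.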

Result 5.20

There exist real numbers $a$ and $b$ such that $(a + b)^2 = a^2 + b^2$.

Proof

Let $a, b \in \mathbb{R}$ such that $(a + b)^2 = a^2 + b^2$. Then $a^2 + 2ab + b^2 = a^2 + b^2$, so $2ab = 0$. Since $a = 1$, $b = 0$ is a solution of this equation, we have

$(a + b)^2 = (1 + 0)^2 = 1^2 = 1^2 + 0^2 = a^2 + b^2$.

The proof presented of Result 5.20 is longer than necessary. We could have written the following proof:

Proof

Let $a = 1$ and $b = 0$. Then $(a + b)^2 = (1 + 0)^2 = 1^2 = 1^2 + 0^2 = a^2 + b^2$.

In the first proof, we actually presented an argument for how we thought of $a = 1$ and $b = 0$. In a proof, we are not required to explain where we got the idea for the proof, although it may very well be interesting to know this. If we feel that such information might be interesting or valuable, it may be worthwhile to include it in a discussion preceding or following the proof. The first proof we gave of Result 5.20 actually informs us of all real numbers $a$ and $b$ for which $(a + b)^2 = a^2 + b^2$; namely, $(a + b)^2 = a^2 + b^2$ if and only if at least one of $a$ and $b$ is 0. This is more than what was requested of us but, nevertheless, it seems interesting.

We saw in Section 5.2 that $\sqrt{2}$ is irrational. Since $\sqrt{2} = 2^{1/2}$, it follows that there exist rational numbers $a$ and $b$ such that $a^b$ is irrational; namely, $a = 2$ and $b = 1/2$ have this property. Let's reverse this question. That is, do there exist irrational numbers $a$ and $b$ such that $a^b$ is rational? Although there are many irrational numbers (in fact, infinitely many), we have only verified that $\sqrt{2}$ is irrational. (On the other hand, we know from the exercises that $r + \sqrt{2}$ is irrational for every rational number $r$ and that both $r\sqrt{2}$ and $r/\sqrt{2}$ are irrational for every nonzero rational number $r$.)

Result to Prove

There exist irrational numbers $a$ and $b$ such that $a^b$ is rational.

PROOF STRATEGY

As we mentioned, there are few numbers that we know to be irrational, $\sqrt{2}$ being the simplest. This might suggest considering the (real) number $\sqrt{2}^{\sqrt{2}}$. If this number is rational, then this answers our question. But perhaps $\sqrt{2}^{\sqrt{2}}$ is irrational. Then what do we do? This discussion suggests two cases.

Result 5.21

There exist irrational numbers $a$ and $b$ such that $a^b$ is rational.

Proof

Consider the number $\sqrt{2}^{\sqrt{2}}$. Of course, this number is either rational or irrational. We consider these possibilities separately.

Case 1. $\sqrt{2}^{\sqrt{2}}$ is rational. Then we can take $a = b = \sqrt{2}$ and we have the desired result.

Case 2. $\sqrt{2}^{\sqrt{2}}$ is irrational. In this case, consider the number obtained by raising the (irrational) number $\sqrt{2}^{\sqrt{2}}$ to the (irrational) power $\sqrt{2}$; that is, consider $a^b$, where $a = \sqrt{2}^{\sqrt{2}}$ and $b = \sqrt{2}$. Observe that

$$a^b = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{\sqrt{2} \cdot \sqrt{2}} = \sqrt{2}^{2} = 2,$$

which is rational.

The proof of Result 5.21 may seem unsatisfactory to you since we still do not know two specific irrational numbers $a$ and $b$ such that $a^b$ is rational. We only know that two such numbers exist. We actually know a bit more, namely, either (1) $\sqrt{2}^{\sqrt{2}}$ is rational or (2) $\sqrt{2}^{\sqrt{2}}$ is irrational and $\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}}$ is rational. (It has actually been proved that $\sqrt{2}^{\sqrt{2}}$ is an irrational number. Hence there are also irrational numbers of the form $a^b$, where $a$ and $b$ are both irrational.)

In the next result, we want to show that the equation $x^5 + 2x - 5 = 0$ has a real number solution between $x = 1$ and $x = 2$. It is not easy to find a number that satisfies this equation. Instead, we use a well-known theorem from calculus to show that such a solution exists. You may not remember all the terms used in the following theorem, but this is not crucial.

Intermediate Value Theorem of Calculus

If $f$ is a function that is continuous on the closed interval $[a, b]$ and $k$ is a number between $f(a)$ and $f(b)$, then there exists a number $c \in (a, b)$ such that $f(c) = k$.

We now give an example to show how this theorem can be used.

Result 5.22

The equation $x^5 + 2x - 5 = 0$ has a real number solution between $x = 1$ and $x = 2$.

Proof

Let $f(x) = x^5 + 2x - 5$. Since $f$ is a polynomial function, it is continuous on the set of all real numbers, and so $f$ is continuous on the interval $[1, 2]$. Now $f(1) = -2$ and $f(2) = 31$. Since 0 is between $f(1)$ and $f(2)$, it follows by the Intermediate Value Theorem of Calculus that there is a number $c$ between 1 and 2 such that $f(c) = c^5 + 2c - 5 = 0$. Hence $c$ is a solution.

As we just saw, the equation $x^5 + 2x - 5 = 0$ has a real number solution between $x = 1$ and $x = 2$. In fact, the equation $x^5 + 2x - 5 = 0$ has exactly one real number solution between $x = 1$ and $x = 2$. This brings up the topic of uniqueness. An element belonging to some prescribed set $A$ and possessing a certain property $P$ is unique if it is the only element of $A$ having property $P$. Typically, to prove that only one element of $A$ has property $P$, we proceed in one of two ways:

(1) We assume that $a$ and $b$ are elements of $A$ possessing property $P$ and show that $a = b$.
(2) We assume that $a$ and $b$ are distinct elements of $A$ possessing property $P$ and show that $a = b$.

Although (1) results in a direct proof and (2) results in a proof by contradiction, it is often the case that either proof technique can be used. To illustrate, we return to Result 5.22 and show, in fact, that the equation $x^5 + 2x - 5 = 0$ has a unique real number solution between $x = 1$ and $x = 2$.

Result 5.23

The equation $x^5 + 2x - 5 = 0$ has a unique real number solution between $x = 1$ and $x = 2$.

Proof

By Result 5.22, the equation has at least one real number solution between $x = 1$ and $x = 2$, so it remains to show that there is only one. Assume, to the contrary, that the equation $x^5 + 2x - 5 = 0$ has two distinct solutions $a$ and $b$ between $x = 1$ and $x = 2$. We may assume that $a < b$. Since $1 < a < b < 2$, it follows that $a^5 + 2a - 5 < b^5 + 2b - 5$. On the other hand, $a^5 + 2a - 5 = 0$ and $b^5 + 2b - 5 = 0$. Thus

$0 = a^5 + 2a - 5 < b^5 + 2b - 5 = 0$,

which produces a contradiction.
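Results 5.22 and 5.23 guarantee a unique solution between 1 and 2 without saying what it is. The bisection method, which is not used in the text and is introduced here only as an illustration, repeatedly applies the sign-change idea behind the Intermediate Value Theorem to approximate that solution. The function names and tolerance below are assumptions.

```python
def f(x: float) -> float:
    """The polynomial from Result 5.22."""
    return x**5 + 2 * x - 5

def bisect(lo: float, hi: float, tol: float = 1e-12) -> float:
    """Approximate a root of f in [lo, hi], assuming f(lo) < 0 < f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid   # a sign change remains in [mid, hi]
        else:
            hi = mid   # a sign change remains in [lo, mid]
    return (lo + hi) / 2

c = bisect(1.0, 2.0)

# Sample check that f increases on [1, 2], the fact behind uniqueness.
increasing = all(f(1 + i / 1000) < f(1 + (i + 1) / 1000) for i in range(1000))

print(c, f(c), increasing)
```

The computed value of $c$ lies between 1 and 2 with $f(c)$ very close to 0, and the sampled values of $f$ increase across the interval, in line with the inequality used in the proof of Result 5.23.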

In fact, we could have omitted Result 5.22 altogether and replaced it by Result 5.23 alone (renumbering it as Result 5.22), including the proofs of both Results 5.22 and 5.23. We now present another uniqueness result.

Result to Prove

For an irrational number $r$, let $S = \{sr + t : s, t \in \mathbb{Q}\}$. For every $x \in S$, there exist unique rational numbers $a$ and $b$ such that $x = ar + b$.

PROOF STRATEGY

To verify that $a$ and $b$ are unique, we assume that $x$ can be expressed in two ways, say as $ar + b$ and $cr + d$, where $a, b, c, d \in \mathbb{Q}$, and then show that $a = c$ and $b = d$. Hence $ar + b = cr + d$. If $a \neq c$, then we can show that $r$ is a rational number, producing a contradiction. Hence $a = c$. Subtracting $ar$ from both sides of $ar + b = cr + d$, we obtain $b = d$ as well. We now give a complete proof.

Result 5.24

For an irrational number $r$, let $S = \{sr + t : s, t \in \mathbb{Q}\}$. For every $x \in S$, there exist unique rational numbers $a$ and $b$ such that $x = ar + b$.

Proof

Let $x \in S$ and suppose that $x = ar + b$ and $x = cr + d$, where $a, b, c, d \in \mathbb{Q}$. Then $ar + b = cr + d$. If $a \neq c$, then $(a - c)r = d - b$ and so

$$r = \frac{d - b}{a - c}.$$

Since $\frac{d - b}{a - c}$ is a rational number, this is impossible. So $a = c$. Subtracting $ar = cr$ from both sides of $ar + b = cr + d$, we obtain $b = d$.

Example 5.25

(a) Show that the equation $6x^3 + x^2 - 2x = 0$ has a root in the interval $[-1, 1]$.

(b) Does this equation have a unique root in the interval $[-1, 1]$?

Solution

(a) Observe that $x = 0$ is a root of the equation.

(b) Observe that $6x^3 + x^2 - 2x = x(6x^2 + x - 2) = x(3x + 2)(2x - 1)$. Thus $x = -2/3$ and $x = 1/2$ are also roots of the equation $6x^3 + x^2 - 2x = 0$, and so this equation does not have a unique root in the interval $[-1, 1]$.
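The three roots named in the solution of Example 5.25 can be verified exactly. This is a minimal sketch using rational arithmetic; the function name `g` is an assumption introduced for illustration.

```python
from fractions import Fraction

def g(x):
    """The polynomial 6x^3 + x^2 - 2x from Example 5.25."""
    return 6 * x**3 + x**2 - 2 * x

# The candidate roots read off from the factorization x(3x + 2)(2x - 1).
candidates = [Fraction(-2, 3), Fraction(0), Fraction(1, 2)]

roots_in_interval = [x for x in candidates if g(x) == 0 and -1 <= x <= 1]

print(roots_in_interval)
```

All three candidates are roots lying in $[-1, 1]$, so the root in that interval is not unique.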

5.5 Refutation of Existence Claims Let R(x) be an open sentence in which the domain of x is S . We have already seen that to refute a quantified claim of type ∀ x ∈ S, R(x) it suffices to provide a counterexample (ie an element x in S for which R(x) is false). However, the refutation of a quantified statement of the type ∃x ∈ S, R(x) requires a completely different approach. Since ∼ (∃x ∈ S, R(x)) ≡ ∀x ∈ S, ∼ R(x), it follows that the statement ∃x ∈ S, R(x) is false if each R(x) is false x ∈ S. Let's look at some examples of refuting existence claims. Example 5.26

solution

Disprove the claim: There is an odd integer n such that n 2 + 2n + 3 is odd. We have shown that n 2 + 2n + 3 is even if n is an odd integer. Let n be an odd integer. Then n = 2k + 1 for an integer k. So n 2 + 2n + 3 = (2k + 1)2 + 2(2k + 1) + 3 = 4k 2 + 4k + 1 + 4k + 2 + 3 = 4k 2 + 8k + 6 = 2(2k 2 + 4k +3). Since 2k 2 + 4k + 3 is an integer, n 2 + 2n + 3 is even.

Example 5.27

solution

Disprove the statement: There is a real number x such that x 6 + 2x 4 + x 2 + 2 = 0. Let x ∈ R. Since x 6 , x 4 and x 2 are all even powers of the real number x, it is it follows that x 6 ≥ 0, x 4 ≥ 0 and x 2 ≥ 0. Hence x 6 + 2x 4 + x 2 + 2 ≥ 0 + 0 + 0 + 2 = 2 and so on

Exercises for chapter 5

137

x 6 + 2x 4 + x 2 + 2 = 0. Therefore the equation x 6 + 2x 4 + x 2 + 2 = 0 has no real number solution. Example 5.28

solution

Disprove the claim: There is an integer n such that n 3 − n + 1 is even. Let n ∈ Z. We consider two cases. Case 1. n is even. Then n = 2a, where a ∈ Z. Then n 3 − n + 1 = (2a)3 − (2a) + 1 = 8a 3 − 2a + 1 = 2(4a 3 − a) + 1. Since 4a 3 − a is an integer, n 3 − n + 1 is odd and therefore not even. Case 2. n is odd. Then n = 2b + 1, where b ∈ Z. So n 3 − n + 1 = (2b + 1)3 − (2b + 1) + 1 = 8b3 + 12b2 + 6b + 1 − 2b − 1 + 1 = 8b3 + 12b2 + 4b + 1 = 2(4b3 + 6b2 + 2b) + 1. Since 4b3 + 6b2 + 2b is an integer, n 3 − n + 1 is odd and therefore not even.

If we had replaced the claim in Example 5.26 by "For every odd integer n, $n^2 + 2n + 3$ is even.", the statement in Example 5.27 by "For every real number x, $x^6 + 2x^4 + x^2 + 2 \ne 0$.", and the claim in Example 5.28 by "For every integer n, $n^3 - n + 1$ is odd.", then we would have a true statement in each case, and the solutions of Examples 5.26–5.28 would become proofs.

EXERCISES FOR CHAPTER 5

Section 5.1: Counterexamples

5.1. Disprove the statement: If a and b are any two real numbers, then log(ab) = log(a) + log(b).
5.2. Disprove the statement: If n ∈ {0, 1, 2, 3, 4}, then $2^n + 3^n + n(n - 1)(n - 2)$ is a prime number.
5.3. Disprove the statement: If n ∈ {1, 2, 3, 4, 5}, then $3 \mid (2n^2 + 1)$.
5.4. Disprove the statement: Let n ∈ N. If $\frac{n(n+1)}{2}$ is odd, then $\frac{(n+1)(n+2)}{2}$ is odd.
5.5. Disprove the statement: For any two positive integers a and b, we have $(a + b)^3 = a^3 + 2a^2b + 2ab + 2ab^2 + b^3$.
5.6. Let a, b ∈ Z. Disprove the statement: If ab and $(a + b)^2$ are of opposite parity, then $a^2b^2$ and a + ab + b are of opposite parity.
5.7. For positive real numbers a and b, it can be shown that $(a + b)\left(\frac{1}{a} + \frac{1}{b}\right) \ge 4$. If a = b, then this inequality is an equality. Consider the following statement: If a and b are positive real numbers such that $(a + b)\left(\frac{1}{a} + \frac{1}{b}\right) = 4$, then a = b. Is there a counterexample to this statement?


5.8. Exercise 5.7 states that $(a + b)\left(\frac{1}{a} + \frac{1}{b}\right) \ge 4$ for any two positive real numbers a and b. Does it then follow that $(c^2 + d^2)\left(\frac{1}{c^2} + \frac{1}{d^2}\right) \ge 4^2$ for any two positive real numbers c and d?
5.9. Disprove the statement: For every positive integer x and every integer n ≥ 2, the equation $x^n + (x + 1)^n = (x + 2)^n$ has no solution.

Section 5.2: Proof by Contradiction

5.10. Prove that there is no largest negative rational number.
5.11. Prove that there is no smallest positive irrational number.
5.12. Prove that 200 cannot be written as the sum of an odd integer and two even integers.
5.13. Use a proof by contradiction to prove that if a and b are odd integers, then $4 \nmid (a^2 + b^2)$.
5.14. Prove that if a ≥ 2 and b are integers, then $a \nmid b$ or $a \nmid (b + 1)$.
5.15. Prove that 1000 cannot be written as the sum of three integers, an even number of which are even.
5.16. Prove that the product of an irrational number and a nonzero rational number is irrational.
5.17. Prove that when an irrational number is divided by a nonzero rational number, the resulting number is irrational.
5.18. Let a be an irrational number and r a nonzero rational number. Prove that if s is a real number, then ar + s or ar − s is irrational.
5.19. Prove that $\sqrt{3}$ is irrational. [Hint: First prove for an integer a that $3 \mid a^2$ if and only if $3 \mid a$. Recall that every integer can be written as 3q, 3q + 1, or 3q + 2 for some integer q. See also Exercise 3 of Chapter 4.]
5.20. Prove that $\sqrt{2} + \sqrt{3}$ is an irrational number.
5.21. (a) Prove that $\sqrt{6}$ is an irrational number. (b) Prove that there are infinitely many positive integers n such that $\sqrt{n}$ is irrational.
5.22. Let $S = \{p + q\sqrt{2} : p, q \in \mathbf{Q}\}$ and $T = \{r + s\sqrt{3} : r, s \in \mathbf{Q}\}$. Prove that S ∩ T = Q.
5.23. Prove that there is no integer a such that a ≡ 5 (mod 14) and a ≡ 3 (mod 21).
5.24. Prove that there is no positive integer x such that $2x < x^2 < 3x$.
5.25. Prove that there do not exist three distinct positive integers a, b, and c such that each integer divides the difference of the other two.
5.26. Prove that the sum of the squares of two odd integers cannot be the square of an integer.
5.27. Prove that if x and y are positive real numbers, then $\sqrt{x + y} \ne \sqrt{x} + \sqrt{y}$.
5.28. Prove that there are no positive integers m and n with $m^2 - n^2 = 1$.
5.29. Let m be a positive integer of the form m = 2s, where s is an odd integer. Prove that there are no positive integers x and y such that $x^2 - y^2 = m$.
5.30. Prove that there do not exist three distinct real numbers a, b, and c such that all of the numbers a + b + c, ab, ac, bc, abc are equal.
5.31. Use a proof by contradiction to prove the following. Let m ∈ Z. If $3 \nmid (m^2 - 1)$, then $3 \mid m$. (A proof by contrapositive of this result is given in Result 4.6.)
5.32. Prove that there are no positive integers m and n such that $m^2 + m + 1 = n^2$.
5.33. (a) Prove that there is no rational number solution of the equation $x^2 - 3x + 1 = 0$. (b) The problem in (a) should suggest a more general problem. Formulate this problem and sketch a proof of it.


Section 5.3: An Overview of the Three Proof Techniques

5.34. Prove that if n is an odd integer, then 7n − 5 is even by (a) a direct proof, (b) a proof by contrapositive, and (c) a proof by contradiction.
5.35. Let x be a positive real number. Prove that if $x - \frac{2}{x} > 1$, then x > 2 by (a) a direct proof, (b) a proof by contrapositive, and (c) a proof by contradiction.
5.36. Let a, b ∈ R. Prove that if ab ≠ 0, then a ≠ 0 by using as many of the three proof techniques as possible.
5.37. Let $x, y \in \mathbf{R}^+$. Prove that if x ≤ y, then $x^2 \le y^2$ by (a) a direct proof, (b) a proof by contrapositive, and (c) a proof by contradiction.
5.38. Prove the following statement using more than one method of proof. Let a, b ∈ Z. If a is odd and a + b is even, then b is odd and ab is odd.
5.39. Prove the following statement using more than one method of proof. For any three integers a, b, and c, exactly two of the integers ab, ac, and bc cannot be odd.

Section 5.4: Proof of Existence

5.40. Show that there is a rational number a and an irrational number b such that $a^b$ is rational.
5.41. Show that there is a rational number a and an irrational number b such that $a^b$ is irrational.
5.42. Show that there are two distinct irrational numbers a and b such that $a^b$ is rational.
5.43. Show that there are no nonzero real numbers a and b such that $\sqrt{a^2 + b^2} = \sqrt[3]{a^3 + b^3}$.
5.44. Prove that there is a unique real solution of the equation $x^3 + x^2 - 1 = 0$ between x = 2/3 and x = 1.
5.45. Let R(x) be an open sentence over a domain S. Suppose that ∀x ∈ S, R(x) is a false statement and that the set T of counterexamples is a proper subset of S. Show that there is a subset W of S such that ∀x ∈ W, R(x) is true.
5.46. (a) Prove that there are four distinct positive integers such that each integer divides the sum of the remaining integers. (b) The problem in (a) should suggest another problem to you. Formulate and solve such a problem.
5.47. Let S be a set of three integers. For a nonempty subset A of S, let $\sigma_A$ be the sum of the elements of A. Prove that there are two distinct nonempty subsets B and C of S such that $\sigma_B \equiv \sigma_C \pmod 6$.
5.48. Prove that the equation $\cos^2(x) - 4x + \pi = 0$ has a real number solution in the interval [0, 4]. (You may assume that $\cos^2(x)$ is continuous on [0, 4].)

Section 5.5: Refutation of Existence Claims

5.49. Disprove the statement: There are odd integers a and b such that $4 \mid (3a^2 + 7b^2)$.
5.50. Disprove the claim: There is a real number x with $x^6 + x^4 + 1 = 2x^2$.
5.51. Disprove the claim: There is an integer n such that $n^4 + n^3 + n^2 + n$ is odd.
5.52. The integers 1, 2, 3 have the property that each divides the sum of the other two. In fact, for any positive integer a, the integers a, 2a, 3a have the property that each divides the sum of the other two. Show that the following statement is false: There is an example of three distinct positive integers, other than a, 2a, 3a for some a ∈ N, with the property that each divides the sum of the other two.


ADDITIONAL EXERCISES FOR CHAPTER 5

5.53. Show that the following statement is false. If A and B are two sets of positive integers with |A| = |B| = 3 such that whenever an integer s is the sum of the elements of some subset of A, then s is also the sum of the elements of some subset of B, then A = B.
5.54. (a) Prove that if a ≥ 2 and n ≥ 1 are integers such that $a^2 + 1 = 2^n$, then a is odd. (b) Prove that there are no integers a ≥ 2 and n ≥ 1 such that $a^2 + 1 = 2^n$.
5.55. Prove that there are no positive integers a and n such that $a^2 + 3 = 3^n$.
5.56. Let $x, y \in \mathbf{R}^+$. Use a proof by contradiction to prove that if x < y, then $\sqrt{x} < \sqrt{y}$.
5.57. The king's daughter had three suitors and could not decide which one to marry. So the king said, "I have three gold crowns and two silver crowns. I will put a gold or silver crown on each of you. The suitor who can tell me which crown he has will marry my daughter." The first suitor looked around and said he could not tell. The second did the same. The third suitor said, "I have a gold crown." He was right, but the daughter was puzzled: this suitor was blind. How did he know? (Source: Ask Marilyn, Parade Magazine, July 6, 2003.)
5.58. Prove that if a, b, c, d are four real numbers, then at most four of the numbers ab, ac, ad, bc, bd, cd are negative.
5.59. Evaluate the proposed proof of the following result.

Result The number 25 cannot be written as the sum of three integers, an even number of which are odd.

Proof Assume, to the contrary, that 25 can be written as the sum of three integers, an even number of which are odd. Then 25 = x + y + z, where x, y, z ∈ Z. We consider two cases.

Case 1. x and y are odd. Then x = 2a + 1, y = 2b + 1 and z = 2c, where a, b, c ∈ Z. Therefore, 25 = x + y + z = (2a + 1) + (2b + 1) + 2c = 2a + 2b + 2c + 2 = 2(a + b + c + 1). Since a + b + c + 1 is an integer, 25 is even, a contradiction.

Case 2. x, y and z are even. Then x = 2a, y = 2b and z = 2c, where a, b, c ∈ Z. Therefore, 25 = x + y + z = 2a + 2b + 2c = 2(a + b + c). Since a + b + c is an integer, 25 is even, again a contradiction.

5.60. (a) Let n be a positive integer. Show that every integer m with 1 ≤ m ≤ 2n can be expressed as $2^{\ell} k$, where ℓ is a nonnegative integer and k is an odd integer with 1 ≤ k < 2n. (b) Prove for every positive integer n and every subset S of {1, 2, . . . , 2n} with |S| = n + 1 that there are integers a, b ∈ S such that $a \mid b$.
5.61. Prove that the sum of the irrational numbers $\sqrt{2}$, $\sqrt{3}$, and $\sqrt{5}$ is also irrational.
5.62. Let $a_1, a_2, \ldots, a_r$ be odd integers with $a_i > 1$ for i = 1, 2, . . . , r. Prove that if $n = a_1 a_2 \cdots a_r + 2$, then $a_i \nmid n$ for every integer i (1 ≤ i ≤ r).
5.63. Below is a proof of a result. Which result is being proved?

Proof Let a, b, c ∈ Z such that $a^2 + b^2 = c^2$. Assume, to the contrary, that a, b, and c are all odd. Then a = 2r + 1, b = 2s + 1 and c = 2t + 1, where r, s, t ∈ Z. Thus
\[ a^2 + b^2 = (4r^2 + 4r + 1) + (4s^2 + 4s + 1) = 2(2r^2 + 2r + 2s^2 + 2s + 1). \]
Since $2r^2 + 2r + 2s^2 + 2s + 1$ is an integer, it follows that $a^2 + b^2$ is even. On the other hand, $c^2 = (2t + 1)^2 = 4t^2 + 4t + 1 = 2(2t^2 + 2t) + 1$. Since $2t^2 + 2t$ is an integer, it follows that $c^2$ is odd. Hence $a^2 + b^2$ is even and $c^2$ is odd, contradicting the fact that $a^2 + b^2 = c^2$.

5.64. Evaluate the proposed proof of the following result.

Result If x is an irrational number and y is a rational number, then z = x − y is irrational.

Proof Assume, to the contrary, that z = x − y is rational. Then z = a/b, where a, b ∈ Z and b ≠ 0. Since $\sqrt{2}$ is irrational, we let $x = \sqrt{2}$. Since y is rational, we have y = c/d, where c, d ∈ Z and d ≠ 0. Therefore,
\[ \sqrt{2} = x = y + z = \frac{c}{d} + \frac{a}{b} = \frac{ad + bc}{bd}. \]
Since ad + bc and bd are integers, where bd ≠ 0, it follows that $\sqrt{2}$ is rational, which produces a contradiction.

5.65. Prove that there are four distinct real numbers a, b, c, d such that exactly four of the numbers ab, ac, ad, bc, bd, cd are irrational.
5.66. Below is a proof of a result. Which result is being proved?

Proof Let a ≡ 2 (mod 4) and b ≡ 1 (mod 4) and assume, to the contrary, that $4 \mid (a^2 + 2b)$. Since a ≡ 2 (mod 4) and b ≡ 1 (mod 4), it follows that a = 4r + 2 and b = 4s + 1, where r, s ∈ Z. Hence
\[ a^2 + 2b = (4r + 2)^2 + 2(4s + 1) = (16r^2 + 16r + 4) + (8s + 2) = 16r^2 + 16r + 8s + 6. \]
Since $4 \mid (a^2 + 2b)$, we have $a^2 + 2b = 4t$, where t ∈ Z. So $16r^2 + 16r + 8s + 6 = 4t$ and
\[ 6 = 4t - 16r^2 - 16r - 8s = 4(t - 4r^2 - 4r - 2s). \]
Since $t - 4r^2 - 4r - 2s$ is an integer, $4 \mid 6$, which is a contradiction.

6 Mathematical Induction

We have seen three proof techniques that can be used to prove that a quantified statement ∀x ∈ S, P(x) is true: direct proof, proof by contrapositive, proof by contradiction. For certain sets S there is another proof method: mathematical induction.

6.1 The Principle of Mathematical Induction

Let A be a nonempty set of real numbers. A number m ∈ A is called a least element (or a minimum or smallest element) of A if x ≥ m for every x ∈ A. Some nonempty sets of real numbers have a least element; others do not. The set N has a least element, namely 1, while Z has no least element. The closed interval [2, 5] has the least element 2, but the open interval (2, 5) has no least element. The set
\[ A = \left\{ \frac{1}{n} : n \in \mathbf{N} \right\} \]
also has no least element. If a nonempty set A of real numbers has a least element, then this element is necessarily unique. We will verify this fact. Recall that when attempting to prove that an element with a certain property is unique, it is customary to assume that there are two elements with this property. We then show that these elements are equal, which implies that there is exactly one such element.

Theorem 6.1 If a set A of real numbers has a least element, then A has a unique least element.

Proof Let $m_1$ and $m_2$ be least elements of A. Since $m_1$ is a least element, $m_2 \ge m_1$. Also, since $m_2$ is a least element, $m_1 \ge m_2$. Hence $m_1 = m_2$.

The proof we gave of Theorem 6.1 is a direct proof. Suppose that we had replaced the first sentence of this proof by

Assume, to the contrary, that A contains distinct least elements $m_1$ and $m_2$.

If the rest of the proof of Theorem 6.1 were the same, apart from the addition of a concluding sentence stating that we have a contradiction, then this too would be a proof of Theorem 6.1. That is, with a little modification, the proof technique used to verify Theorem 6.1 can be transformed from a direct proof into a proof by contradiction.

There is a property possessed by some sets of real numbers that will be of interest to us here. A nonempty set S of real numbers is said to be well-ordered if every nonempty subset of S has a least element. Let S = {−7, −1, 2}. The nonempty subsets of S are {−7, −1, 2}, {−7, −1}, {−7, 2}, {−1, 2}, {−7}, {−1}, and {2}. Since each of these subsets has a least element, S is well-ordered. In fact, it should be clear that every nonempty finite set of real numbers is well-ordered. (See Exercise 6.20 in the Exercises for Section 6.2.) The open interval (0, 1) is not well-ordered since, for example, (0, 1) itself has no least element. The closed interval [0, 1] has the least element 0; however, [0, 1] is not well-ordered since the open interval (0, 1) is a (nonempty) subset of [0, 1] without a least element. Since none of the sets Z, Q, and R has a least element, none of these sets is well-ordered. Hence having a least element is a necessary condition for a nonempty set to be well-ordered, but it is not a sufficient condition. Although it may appear evident that the set N of positive integers is well-ordered, this statement cannot be proved from the properties of positive integers that we have used and derived thus far. Consequently, this statement is accepted as an axiom, which we state below.

The Well-Ordering Principle

The set N of positive integers is well-ordered.
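For a finite set such as S = {−7, −1, 2}, the defining property of a well-ordered set can be checked exhaustively. The Python sketch below lists every nonempty subset of S together with its least element; the helper name `nonempty_subsets` and the dictionary `least` are illustrative assumptions. No such enumeration is possible for N, which is one way to see why an axiom is needed there.

```python
from itertools import combinations

def nonempty_subsets(s):
    """Yield every nonempty subset of the finite set s."""
    items = sorted(s)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

S = {-7, -1, 2}

# Map each nonempty subset (stored as a sorted tuple) to its least element.
least = {tuple(sorted(sub)): min(sub) for sub in nonempty_subsets(S)}
```

A set with three elements has $2^3 - 1 = 7$ nonempty subsets, and `min` returns a least element for each of them.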

A consequence of the Well-Ordering Principle is another principle, which serves as the foundation of another important proof technique.

Theorem 6.2 (The Principle of Mathematical Induction) For every positive integer n, let P(n) be a statement. If (1) P(1) is true and (2) the implication

If P(k), then P(k + 1).

is true for every positive integer k, then P(n) is true for every positive integer n.

Proof Assume, to the contrary, that the theorem is false. Then conditions (1) and (2) are satisfied, but there are some positive integers n for which P(n) is false. Let S = {n ∈ N : P(n) is false}. Since S is a nonempty subset of N, the Well-Ordering Principle implies that S contains a least element s. Since P(1) is true, 1 ∉ S. So s ≥ 2 and s − 1 ∈ N. Since s is the least element of S, it follows that s − 1 ∉ S and hence P(s − 1) is a true statement. By condition (2), P(s) is also true and thus s ∉ S. However, this contradicts our assumption that s ∈ S.

The Principle of Mathematical Induction is stated more symbolically below.

The Principle of Mathematical Induction

For every positive integer n, let P(n) be a statement. If (1) P(1) is true and (2) ∀k ∈ N, P(k) ⇒ P(k + 1) is true, then ∀n ∈ N, P(n) is true.

As a consequence of the Principle of Mathematical Induction, the quantified statement ∀n ∈ N, P(n) can be proved to be true if (1) we can show that the statement P(1) is true and (2) we can establish the truth of the implication

If P(k), then P(k + 1).

for every positive integer k. A proof using the Principle of Mathematical Induction is called an induction proof or a proof by induction. The verification of the truth of P(1) in an induction proof is called the base step, basis step, or anchor of the induction. In the implication "If P(k), then P(k + 1)." for an arbitrary positive integer k, the statement P(k) is called the inductive (or induction) hypothesis. We often use a direct proof to show that
\[ \forall k \in \mathbf{N},\ P(k) \Rightarrow P(k + 1), \tag{6.1} \]
although any proof technique is acceptable. That is, we usually assume that the inductive hypothesis P(k) is true for an arbitrary positive integer k and try to show that P(k + 1) is true. Establishing the truth of (6.1) is called the inductive step in the induction proof. We illustrate this proof technique by showing that for every positive integer n, the sum of the first n positive integers is given by n(n + 1)/2, that is,
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}. \]

Result 6.3 Let
\[ P(n):\ 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}, \]
where n ∈ N. Then P(n) is true for every positive integer n.

Proof We use induction. Since 1 = (1 · 2)/2, the statement P(1) is true. Assume that P(k) is true for an arbitrary positive integer k, that is, assume that
\[ 1 + 2 + 3 + \cdots + k = \frac{k(k + 1)}{2}. \]
We show that P(k + 1) is true, that is, we show that
\[ 1 + 2 + 3 + \cdots + (k + 1) = \frac{(k + 1)(k + 2)}{2}. \]
Thus
\[ 1 + 2 + 3 + \cdots + (k + 1) = (1 + 2 + 3 + \cdots + k) + (k + 1) = \frac{k(k + 1)}{2} + (k + 1) = \frac{k(k + 1) + 2(k + 1)}{2} = \frac{(k + 1)(k + 2)}{2}, \]
as required. By the Principle of Mathematical Induction, P(n) is true for every positive integer n.

Normally, a statement to be proved by induction is not expressed in terms of P(n) or any other symbol. To illustrate this, we give an alternative statement and proof of Result 6.3, leaving it to the reader to understand what P(n) would represent.

Result 6.4 For every positive integer n,
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}. \]

Proof We use induction. Since 1 = (1 · 2)/2, the statement is true for n = 1. Assume that
\[ 1 + 2 + 3 + \cdots + k = \frac{k(k + 1)}{2}, \]
where k is a positive integer. We show that
\[ 1 + 2 + 3 + \cdots + (k + 1) = \frac{(k + 1)(k + 2)}{2}. \]
Thus
\[ 1 + 2 + 3 + \cdots + (k + 1) = (1 + 2 + 3 + \cdots + k) + (k + 1) = \frac{k(k + 1)}{2} + (k + 1) = \frac{k(k + 1) + 2(k + 1)}{2} = \frac{(k + 1)(k + 2)}{2}. \]
By the Principle of Mathematical Induction,
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2} \]
for every positive integer n.
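The two parts of the induction proof above, the base step and the inductive step, can each be spot-checked numerically. The Python sketch below does this for the closed form n(n + 1)/2; the function names are illustrative assumptions, and checking finitely many values of k is a sanity check on the algebra, not a substitute for the proof.

```python
def formula(n):
    """Closed form n(n + 1)/2 for 1 + 2 + ... + n."""
    return n * (n + 1) // 2

def base_step_holds():
    """P(1): the one-term sum 1 equals the closed form at n = 1."""
    return 1 == formula(1)

def inductive_step_holds(k):
    """Adding k + 1 to the closed form for k gives the closed form for k + 1."""
    return formula(k) + (k + 1) == formula(k + 1)

def direct_sum(n):
    """Compute 1 + 2 + ... + n term by term."""
    return sum(range(1, n + 1))
```

For example, `direct_sum(100)` and `formula(100)` both give 5050, and `inductive_step_holds(k)` mirrors the identity k(k + 1)/2 + (k + 1) = (k + 1)(k + 2)/2 used in the proof.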

PROOF ANALYSIS

The proof of Result 6.3 (or Result 6.4) began by stating that induction was being used. This alerts the reader to what to expect in the proof. Furthermore, in the proof of the inductive step, it is assumed that
\[ 1 + 2 + 3 + \cdots + k = \frac{k(k + 1)}{2} \]
for a positive integer k, that is, for an arbitrary positive integer k. We do not assume that
\[ 1 + 2 + 3 + \cdots + k = \frac{k(k + 1)}{2} \]
for every positive integer k, since this would be assuming what we are trying to prove in Result 6.3 (and Result 6.4).

Carl Friedrich Gauss (1777-1855) is considered one of the most brilliant mathematicians of all time. The story goes that at a very young age (in elementary school in Germany) his teacher gave him and his classmates the supposedly laborious task of adding the integers from 1 to 100. He very quickly had the correct result of 5050. It is believed that he considered both the sum 1 + 2 + · · · + 100 and the reverse sum 100 + 99 + · · · + 1 and added them together to get the sum 101 + 101 + · · · + 101, which has 100 terms and is therefore equal to 10,100. Since this is twice the required sum, 1 + 2 + · · · + 100 = 10100/2 = 5050. Of course, this can easily be generalized to find a formula for 1 + 2 + 3 + · · · + n, where n ∈ N. Let
\[ S = 1 + 2 + 3 + \cdots + n. \tag{6.2} \]
If we reverse the order of the terms on the right-hand side of (6.2), we get
\[ S = n + (n - 1) + (n - 2) + \cdots + 1. \tag{6.3} \]
Adding (6.2) and (6.3), we have
\[ 2S = (n + 1) + (n + 1) + (n + 1) + \cdots + (n + 1). \tag{6.4} \]
Since there are n terms on the right-hand side of (6.4), we conclude that 2S = n(n + 1) or S = n(n + 1)/2. Hence
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}. \]
You may think that the proof of Result 6.3 (and Result 6.4) given by mathematical induction is longer (and more complicated) than the one we just gave, and that may very well be true. But in general, mathematical induction is a technique that can be used to prove a wide range of statements. In this chapter we will see a variety of statements where mathematical induction is a natural technique for verifying their truth. We start with an example that leads to a problem involving mathematical induction.

Suppose that an n × n square S is composed of $n^2$ 1 × 1 squares. How many different squares does S contain, counting the k × k squares for all integers k with 1 ≤ k ≤ n? (See Figure 6.1 for the case where n = 3.) For n = 3, the square S contains the 3 × 3 square S itself, four 2 × 2 squares, and nine 1 × 1 squares (see Figure 6.1). Therefore, the number of different squares that S contains is $1 + 4 + 9 = 1^2 + 2^2 + 3^2 = 14$.

Figure 6.1 The squares in a 3 × 3 square

Figure 6.2 A k × k square in an n × n square (S has corners (0, 0) and (n, n); S∗ has corners (x, y) and (x + k, y + k))

To find the number of different k × k squares in an n × n square S, we place S in the first quadrant of the coordinate plane so that the lower-left corner of S is at the origin (0, 0). (See Figure 6.2.) Then the upper-right corner of S is at the point (n, n). Consequently, the lower-left corner of a k × k square S∗, where 1 ≤ k ≤ n, lies at some point (x, y), while the upper-right corner of S∗ lies at (x + k, y + k). Necessarily, x and y are nonnegative integers with x + k ≤ n and y + k ≤ n (again, see Figure 6.2). Since 0 ≤ x ≤ n − k and 0 ≤ y ≤ n − k, the number of choices for each of x and y is n − k + 1, and therefore the number of choices for (x, y) is $(n - k + 1)^2$. Since k is one of the integers 1, 2, . . . , n, the total number of different squares in S is
\[ \sum_{k=1}^{n} (n - k + 1)^2 = n^2 + (n - 1)^2 + \cdots + 2^2 + 1^2 = 1^2 + 2^2 + \cdots + n^2 = \sum_{k=1}^{n} k^2. \]
Is there a compact formula for the expression
\[ \sum_{k=1}^{n} k^2 = 1^2 + 2^2 + \cdots + n^2\,? \]
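The counting argument above can be checked by brute force for small n. The Python sketch below enumerates every candidate lower-left corner (x, y) and side length k and keeps those that fit inside the n × n square; the function names are illustrative assumptions.

```python
def count_squares(n):
    """Count k x k squares with integer corners that fit inside [0, n] x [0, n]."""
    count = 0
    for k in range(1, n + 1):
        for x in range(0, n + 1):
            for y in range(0, n + 1):
                # The upper-right corner (x + k, y + k) must not leave the square.
                if x + k <= n and y + k <= n:
                    count += 1
    return count

def sum_of_squares(n):
    """Compute 1^2 + 2^2 + ... + n^2 term by term."""
    return sum(k * k for k in range(1, n + 1))
```

For n = 3 the enumeration finds 9 + 4 + 1 = 14 squares, in agreement with the count obtained from Figure 6.1.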

For the problem we are describing, it would be very useful to know the answer to this question. Since we asked this question, you may have guessed that the answer is yes. A formula is given below, along with a proof by induction.

Result 6.5 For every positive integer n,
\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}. \]

Proof We proceed by induction. Since $1^2 = (1 \cdot 2 \cdot 3)/6 = 1$, the statement is true when n = 1. Assume that
\[ 1^2 + 2^2 + \cdots + k^2 = \frac{k(k + 1)(2k + 1)}{6} \]
for an arbitrary positive integer k. We show that
\[ 1^2 + 2^2 + \cdots + (k + 1)^2 = \frac{(k + 1)(k + 2)(2k + 3)}{6}. \]
Observe that
\begin{align*}
1^2 + 2^2 + \cdots + (k + 1)^2 &= [1^2 + 2^2 + \cdots + k^2] + (k + 1)^2 \\
&= \frac{k(k + 1)(2k + 1)}{6} + (k + 1)^2 \\
&= \frac{k(k + 1)(2k + 1)}{6} + \frac{6(k + 1)^2}{6} \\
&= \frac{(k + 1)[k(2k + 1) + 6(k + 1)]}{6} \\
&= \frac{(k + 1)(2k^2 + 7k + 6)}{6} \\
&= \frac{(k + 1)(k + 2)(2k + 3)}{6},
\end{align*}
as desired. By the Principle of Mathematical Induction,
\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6} \]
for every positive integer n.

In fact, the last sentence in the proof of Result 6.5 is typical of the last sentence of every proof using mathematical induction, since the aim is to show that the hypothesis of the Principle of Mathematical Induction is satisfied, after which the conclusion follows. Some therefore omit this final sentence, since it is understood that once properties (1) and (2) of Theorem 6.2 are satisfied, we have a proof. For emphasis, however, we will continue to include this concluding sentence.

You may have noticed another question. We explained why 1 + 2 + · · · + n equals n(n + 1)/2, but how did we know that $1^2 + 2^2 + \cdots + n^2$ equals n(n + 1)(2n + 1)/6? In fact, we can show that $1^2 + 2^2 + \cdots + n^2 = n(n + 1)(2n + 1)/6$ by using the formula 1 + 2 + · · · + n = n(n + 1)/2. We start by solving $(k + 1)^3 = k^3 + 3k^2 + 3k + 1$ for $k^2$. Since $3k^2 = (k + 1)^3 - k^3 - 3k - 1$, it follows that
\[ k^2 = \frac{1}{3}\left[(k + 1)^3 - k^3\right] - k - \frac{1}{3} \]
and so
\[ \sum_{k=1}^{n} k^2 = \frac{1}{3}\sum_{k=1}^{n}\left[(k + 1)^3 - k^3\right] - \sum_{k=1}^{n} k - \frac{1}{3}\sum_{k=1}^{n} 1. \]
Thus
\begin{align*}
\sum_{k=1}^{n} k^2 &= \frac{1}{3}\left[(n + 1)^3 - 1^3\right] - \frac{1}{2}n(n + 1) - \frac{1}{3}n \\
&= \frac{n^3 + 3n^2 + 3n}{3} - \frac{n^2 + n}{2} - \frac{n}{3} \\
&= \frac{2n^3 + 6n^2 + 6n - 3n^2 - 3n - 2n}{6} = \frac{2n^3 + 3n^2 + n}{6} \\
&= \frac{n(2n^2 + 3n + 1)}{6} = \frac{n(n + 1)(2n + 1)}{6}.
\end{align*}
This is, in fact, an alternative proof that
\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6} \]
for every positive integer n, but of course this proof depends on knowing that
\[ 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2} \]
for every positive integer n.

We have now used mathematical induction to establish the formulas
\[ 1 + 2 + \cdots + n = \frac{n(n + 1)}{2} \tag{6.5} \]
and
\[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6} \tag{6.6} \]
for every positive integer n. We have seen that (6.6) gives the number of different squares in an n × n square composed of $n^2$ 1 × 1 squares. In fact, (6.5) gives the number of intervals in an interval of length n that is composed of n intervals of length 1. You can probably guess what $1^3 + 2^3 + \cdots + n^3$ counts. Exercise 6.6 deals with this expression. We now present a formula for
\[ \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(n + 1)(n + 2)} \]
for every positive integer n.
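Before moving on, the telescoping derivation of (6.6) given earlier in this section can be spot-checked with exact rational arithmetic. In the Python sketch below, the function names are illustrative assumptions; the three terms mirror the three sums on the right-hand side of the derivation.

```python
from fractions import Fraction

def telescoped_sum_of_squares(n):
    """Evaluate (1/3)[(n+1)^3 - 1] - n(n+1)/2 - n/3 exactly."""
    cubes = Fraction((n + 1) ** 3 - 1, 3)   # telescoping sum of (k+1)^3 - k^3, divided by 3
    linear = Fraction(n * (n + 1), 2)       # sum of k, using formula (6.5)
    ones = Fraction(n, 3)                   # (1/3) times the sum of n ones
    return cubes - linear - ones

def closed_form(n):
    """Evaluate n(n + 1)(2n + 1)/6 exactly."""
    return Fraction(n * (n + 1) * (2 * n + 1), 6)

def direct_sum(n):
    """Compute 1^2 + 2^2 + ... + n^2 term by term."""
    return sum(k * k for k in range(1, n + 1))
```

All three functions should agree for every positive integer n that is tried; agreement on a finite range is evidence that no algebra slip occurred, not a proof.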

Result 6.6 For every positive integer n,
\[ \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(n + 1)(n + 2)} = \frac{n}{2n + 4}. \]

Proof We use induction. Since
\[ \frac{1}{2 \cdot 3} = \frac{1}{2 \cdot 1 + 4} = \frac{1}{6}, \]
the formula holds for n = 1. Assume that
\[ \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(k + 1)(k + 2)} = \frac{k}{2k + 4} \]
for a positive integer k. We show that
\[ \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(k + 2)(k + 3)} = \frac{k + 1}{2(k + 1) + 4} = \frac{k + 1}{2k + 6}. \]
Observe that
\begin{align*}
\frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(k + 2)(k + 3)}
&= \left[\frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(k + 1)(k + 2)}\right] + \frac{1}{(k + 2)(k + 3)} \\
&= \frac{k}{2k + 4} + \frac{1}{(k + 2)(k + 3)} = \frac{k}{2(k + 2)} + \frac{1}{(k + 2)(k + 3)} \\
&= \frac{k(k + 3) + 2}{2(k + 2)(k + 3)} = \frac{k^2 + 3k + 2}{2(k + 2)(k + 3)} \\
&= \frac{(k + 1)(k + 2)}{2(k + 2)(k + 3)} = \frac{k + 1}{2(k + 3)} = \frac{k + 1}{2k + 6},
\end{align*}
giving us the desired result. By the Principle of Mathematical Induction,
\[ \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(n + 1)(n + 2)} = \frac{n}{2n + 4} \]
for every positive integer n.

PROOF ANALYSIS

Each of the examples of induction proofs that we have seen involves a certain amount of algebra, and there is more algebra to come. Many errors in these proofs are due to errors in algebra. Therefore, care must be taken. For example, in the proof of Result 6.6 we encountered the sum
\[ \frac{k}{2(k + 2)} + \frac{1}{(k + 2)(k + 3)}. \]
To add these fractions, we needed to find a common denominator (actually the least common denominator), which is 2(k + 2)(k + 3). This was used to obtain the next fraction, that is,
\[ \frac{k}{2(k + 2)} + \frac{1}{(k + 2)(k + 3)} = \frac{k(k + 3)}{2(k + 2)(k + 3)} + \frac{2}{2(k + 2)(k + 3)} = \frac{k(k + 3) + 2}{2(k + 2)(k + 3)}. \]
When we expanded and factored the numerator and canceled the term k + 2, this was actually expected, since the final result we were looking for was
\[ \frac{k + 1}{2k + 6} = \frac{k + 1}{2(k + 3)}, \]
which does not contain k + 2 as a factor in the denominator.
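Since algebra slips are the usual source of errors in proofs like that of Result 6.6, it can help to confirm the key identities with exact fractions. The Python sketch below is one way to do so; the function names are illustrative assumptions, and the check covers only finitely many values of k.

```python
from fractions import Fraction

def partial_sum(n):
    """Compute 1/(2*3) + 1/(3*4) + ... + 1/((n+1)(n+2)) exactly."""
    return sum(Fraction(1, (j + 1) * (j + 2)) for j in range(1, n + 1))

def closed_form(n):
    """Evaluate n/(2n + 4) exactly."""
    return Fraction(n, 2 * n + 4)

def inductive_step_holds(k):
    """Check k/(2k+4) + 1/((k+2)(k+3)) == (k+1)/(2k+6)."""
    return closed_form(k) + Fraction(1, (k + 2) * (k + 3)) == closed_form(k + 1)
```

Using `Fraction` rather than floating-point numbers avoids rounding, so an equality test here is exact.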

6.2 A More General Principle of Mathematical Induction

The Principle of Mathematical Induction described in the previous section gives us a technique for proving that a statement of the type

For every positive integer n, P(n).

is true. However, there are situations where the domain of P(n) consists of those integers that are greater than or equal to some fixed integer m different from 1. We now describe an analogous technique for verifying the truth of a statement of the following type, where m denotes a fixed integer:

For every integer n ≥ m, P(n).

According to the Well-Ordering Principle, the set N of natural numbers is well-ordered; that is, every nonempty subset of N has a least element. As a consequence of the Well-Ordering Principle, other sets are also well-ordered.

Theorem 6.7 For every integer m, the set S = {i ∈ Z : i ≥ m} is well-ordered.

The proof of Theorem 6.7 is left as an exercise (see Exercise 6.17). The following is a consequence of Theorem 6.7. This is a slightly more general form of the Principle of Mathematical Induction. Consequently, it is commonly referred to by the same name.

Theorem 6.8 (The Principle of Mathematical Induction) For a fixed integer m, let S = {i ∈ Z : i ≥ m}. For every integer n ∈ S, let P(n) be a statement. If (1) P(m) is true and (2) the implication

If P(k), then P(k + 1).

is true for every integer k ∈ S, then P(n) is true for every integer n ∈ S.

The proof of Theorem 6.8 is similar to the proof of Theorem 6.2. We also state Theorem 6.8 symbolically.

The Principle of Mathematical Induction

For a fixed integer m, let S = {i ∈ Z : i ≥ m}. For every n ∈ S, let P(n) be a statement. If (1) P(m) is true and (2) ∀k ∈ S, P(k) ⇒ P(k + 1) is true, then ∀n ∈ S, P(n) is true.

This (more general) Principle of Mathematical Induction can be used to prove that certain quantified statements of the type ∀n ∈ S, P(n) are true when S = {i ∈ Z : i ≥ m} for a prescribed integer m. Of course, if m = 1, then S = N. We now consider several examples.

Result 6.9 For every nonnegative integer n, $2^n > n$.

Proof We proceed by induction. The inequality holds for n = 0 since $2^0 > 0$. Assume that $2^k > k$, where k is a nonnegative integer. We show that $2^{k+1} > k + 1$. If k = 0, then $2^{k+1} = 2 > 1 = k + 1$. Hence we may assume that k ≥ 1. Then
\[ 2^{k+1} = 2 \cdot 2^k > 2k = k + k \ge k + 1. \]
By the Principle of Mathematical Induction, $2^n > n$ for every nonnegative integer n.

PROOF ANALYSIS

Let's look again at the proof of Result 6.9. Since Result 6.9 concerns nonnegative integers, we are applying Theorem 6.8 with m = 0. We began by observing that $2^n > n$ when n = 0. Next we assumed that $2^k > k$, where k is a nonnegative integer. Our goal was to show that $2^{k+1} > k + 1$. It seems logical to observe that $2^{k+1} = 2 \cdot 2^k$. Knowing that $2^k > k$, we have $2^{k+1} = 2 \cdot 2^k > 2k$. If we could show that 2k ≥ k + 1, then we would have a proof. However, when k = 0, the inequality 2k ≥ k + 1 does not hold. That is why we treated k = 0 separately in the proof. This allowed us to assume that k ≥ 1 and then to conclude that 2k ≥ k + 1. We could have proved Result 6.9 slightly differently. We could have first observed that $2^n > n$ for n = 0 and then proved by induction that $2^n > n$ for n ≥ 1.

Our next example is to show that $2^n > n^2$ if n is a sufficiently large integer. We start by trying some values of n, as shown below. It appears that $2^n > n^2$ if n ≥ 5.

n      0   1   2   3   4    5    6
2^n    1   2   4   8   16   32   64
n^2    0   1   4   9   16   25   36

Comparing $2^n$ and $n^2$

Result to Prove For every integer n ≥ 5, $2^n > n^2$.

PROOF STRATEGY

Let's see what an induction proof of this result would look like. Of course, $2^n > n^2$ when n = 5. We assume that $2^k > k^2$, where k ≥ 5 (and k is an integer), and want to prove that $2^{k+1} > (k + 1)^2$. We start with $2^{k+1} = 2 \cdot 2^k > 2k^2$. We would have a proof if we could show that $2k^2 \ge (k + 1)^2$, that is, $2k^2 \ge k^2 + 2k + 1$. There are several convincing ways to show that $2k^2 \ge k^2 + 2k + 1$ for integers k ≥ 5. Here is one way: Observe that $2k^2 = k^2 + k^2 = k^2 + k \cdot k \ge k^2 + 5k$ since k ≥ 5. Also, $k^2 + 5k = k^2 + 2k + 3k \ge k^2 + 2k + 3 \cdot 5 = k^2 + 2k + 15$, again since k ≥ 5. Finally, $k^2 + 2k + 15 > k^2 + 2k + 1$. We now present a formal proof. (Here we use the Principle of Mathematical Induction with m = 5.)

Result 6.10 For every integer n ≥ 5, $2^n > n^2$.

Proof We proceed by induction. Since $2^5 > 5^2$, the inequality holds for n = 5. Assume that $2^k > k^2$, where k ≥ 5. We show that $2^{k+1} > (k + 1)^2$. Observe that
\begin{align*}
2^{k+1} = 2 \cdot 2^k &> 2k^2 = k^2 + k^2 \ge k^2 + 5k \\
&= k^2 + 2k + 3k \ge k^2 + 2k + 15 \\
&> k^2 + 2k + 1 = (k + 1)^2.
\end{align*}
Therefore, $2^{k+1} > (k + 1)^2$. By the Principle of Mathematical Induction, $2^n > n^2$ for every integer n ≥ 5.
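The role of the starting value m = 5 in Result 6.10 can be seen numerically. The Python sketch below records where the inequality $2^n > n^2$ fails for small n and checks the chain of inequalities from the inductive step for a range of k ≥ 5; the names `holds`, `chain_holds`, and `failures` are illustrative assumptions.

```python
def holds(n):
    """Return True when 2^n > n^2."""
    return 2 ** n > n ** 2

def chain_holds(k):
    """Check 2k^2 >= k^2 + 5k >= k^2 + 2k + 15 > (k + 1)^2, as used for k >= 5."""
    return (2 * k * k >= k * k + 5 * k
            >= k * k + 2 * k + 15
            > (k + 1) ** 2)

# Values of n below 5 for which the inequality fails.
failures = [n for n in range(0, 5) if not holds(n)]
```

The inequality is true for n = 0 and n = 1, fails for n = 2, 3, 4, and then holds from n = 5 onward, which is why the induction is anchored at 5 rather than at 0.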

Result 6.11

For every nonnegative integer $n$, $3 \mid (2^{2n} - 1)$.

Proof

We proceed by induction. The result is true when $n = 0$, since in this case $2^{2n} - 1 = 0$ and $3 \mid 0$. Assume that $3 \mid (2^{2k} - 1)$, where $k$ is a nonnegative integer. We show that $3 \mid (2^{2k+2} - 1)$. Since $3 \mid (2^{2k} - 1)$, there is an integer $x$ such that $2^{2k} - 1 = 3x$ and so $2^{2k} = 3x + 1$. Now

$2^{2k+2} - 1 = 4 \cdot 2^{2k} - 1 = 4(3x + 1) - 1 = 12x + 3 = 3(4x + 1)$.

Since $4x + 1$ is an integer, $3 \mid (2^{2k+2} - 1)$. By the principle of mathematical induction, $3 \mid (2^{2n} - 1)$ for every nonnegative integer $n$.

PROOF ANALYSIS

Let's look at the proof. To establish the induction step, we assumed, as expected, that $3 \mid (2^{2k} - 1)$ for an arbitrary nonnegative integer $k$ and attempted to show that $3 \mid (2^{2k+2} - 1)$. To verify that $3 \mid (2^{2k+2} - 1)$, it was necessary to show that $2^{2k+2} - 1$ is a multiple of 3; that is, we needed to show that $2^{2k+2} - 1$ can be expressed as $3z$ for some integer $z$. Since our goal was to show that $2^{2k+2} - 1$ can be expressed in a certain way, it is natural to look at $2^{2k+2} - 1$ and see how we can rewrite it. Knowing that $2^{2k} - 1 = 3x$, where $x \in \mathbf{Z}$, it was logical to rewrite $2^{2k+2} - 1$ so that $2^{2k}$ appears. This is quite simple, because $2^{2k+2} = 2^2 \cdot 2^{2k} = 4 \cdot 2^{2k}$. So $2^{2k+2} - 1 = 4 \cdot 2^{2k} - 1$. We have to be a bit careful here because the expression we are looking at is $4 \cdot 2^{2k} - 1$, not $4(2^{2k} - 1)$. That is, it would be wrong to say that $4 \cdot 2^{2k} - 1 = 4(3x)$. So here we need to substitute for $2^{2k}$, not for $2^{2k} - 1$. This is why, in the proof, we rewrote $2^{2k} - 1 = 3x$ as $2^{2k} = 3x + 1$. We reinforce this kind of proof with another example.

Result 6.12

For every nonnegative integer $n$, $9 \mid (4^{3n} - 1)$.

Proof

We proceed by induction. When $n = 0$, $4^{3n} - 1 = 0$. Since $9 \mid 0$, the statement is true when $n = 0$. Assume that $9 \mid (4^{3k} - 1)$, where $k$ is a nonnegative integer. We now show that $9 \mid (4^{3k+3} - 1)$. Since $9 \mid (4^{3k} - 1)$, it follows that $4^{3k} - 1 = 9x$ for some integer $x$. Hence $4^{3k} = 9x + 1$. Now observe that

$4^{3k+3} - 1 = 4^3 \cdot 4^{3k} - 1 = 64(9x + 1) - 1 = 64 \cdot 9x + 64 - 1 = 64 \cdot 9x + 63 = 9(64x + 7)$.

Since $64x + 7$ is an integer, $9 \mid (4^{3k+3} - 1)$. By the principle of mathematical induction, $9 \mid (4^{3n} - 1)$ for every nonnegative integer $n$.
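The induction steps of Results 6.11 and 6.12 are constructive: from an integer $x$ with $2^{2k} - 1 = 3x$ they produce the integer $4x + 1$ for the next case, and from $4^{3k} - 1 = 9x$ they produce $64x + 7$. The sketch below follows these recipes; the function names are hypothetical and chosen only for illustration.

```python
def witness_6_11(n: int) -> int:
    """Return z with 2^(2n) - 1 = 3z, built as in the proof of Result 6.11."""
    z = 0                  # base case: 2^0 - 1 = 0 = 3 * 0
    for _ in range(n):
        z = 4 * z + 1      # induction step: 2^(2k+2) - 1 = 3(4x + 1)
    return z


def witness_6_12(n: int) -> int:
    """Return z with 4^(3n) - 1 = 9z, built as in the proof of Result 6.12."""
    z = 0                  # base case: 4^0 - 1 = 0 = 9 * 0
    for _ in range(n):
        z = 64 * z + 7     # induction step: 4^(3k+3) - 1 = 9(64x + 7)
    return z
```

For instance, `witness_6_11(2)` is 5, in agreement with $2^4 - 1 = 15 = 3 \cdot 5$.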

As a final comment on the proof of Result 6.12, notice that we did not multiply 64 and 9, since in the next step we wanted to factor 9 from the expression.

For a positive integer $n$, the integer $n$ factorial, denoted by $n!$, is defined as $n! = n(n - 1) \cdots 3 \cdot 2 \cdot 1$. In particular, $1! = 1$, $2! = 2 \cdot 1 = 2$ and $3! = 3 \cdot 2 \cdot 1 = 6$. Also, $0!$ is defined as $0! = 1$. Among the many equalities and inequalities involving $n!$ is the following.

Result to Prove

For every positive integer $n$,

$1 \cdot 3 \cdot 5 \cdots (2n - 1) = \frac{(2n)!}{2^n \cdot n!}$.

PROOF STRATEGY

First, observe that

$(2n)! = 2n \cdot (2n - 1) \cdot (2n - 2) \cdots 3 \cdot 2 \cdot 1$.    (6.7)

Of the $2n$ factors on the right side of expression (6.7), $n$ are even, namely $2, 4, 6, \ldots, 2n$. If we factor 2 from each of these numbers, we obtain $2^n \cdot 1 \cdot 2 \cdot 3 \cdots n = 2^n \cdot n!$, while the remaining $n$ integers, namely $1, 3, 5, \ldots, 2n - 1$, are all odd. So $(2n)! = 2^n \cdot n! \cdot 1 \cdot 3 \cdot 5 \cdots (2n - 1)$. Therefore

$1 \cdot 3 \cdot 5 \cdots (2n - 1) = \frac{(2n)!}{2^n \cdot n!}$.
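The factoring argument can be mirrored in a few lines of Python. This is only an illustrative sketch with names of our own choosing (`odd_product`, `even_product`). It splits the factors of $(2n)!$ into the even ones and the odd ones, exactly as in the strategy above.

```python
from math import factorial, prod


def odd_product(n: int) -> int:
    """1 * 3 * 5 * ... * (2n - 1)."""
    return prod(range(1, 2 * n, 2))


def even_product(n: int) -> int:
    """2 * 4 * 6 * ... * (2n), which should equal 2^n * n!."""
    return prod(range(2, 2 * n + 1, 2))


def identity_holds(n: int) -> bool:
    """Check the two facts used in the strategy for a given n."""
    evens_ok = even_product(n) == 2 ** n * factorial(n)
    split_ok = even_product(n) * odd_product(n) == factorial(2 * n)
    return evens_ok and split_ok
```

Integer arithmetic is used throughout, so no rounding is involved when the two sides are compared.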

This equality can also be established by induction.

Result 6.13

For every positive integer $n$,

$1 \cdot 3 \cdot 5 \cdots (2n - 1) = \frac{(2n)!}{2^n \cdot n!}$.

Proof

We proceed by induction. Since

$1 = \frac{2}{2} = \frac{(2 \cdot 1)!}{2^1 \cdot 1!}$,

the statement holds for $n = 1$. Assume, for a positive integer $k$, that

$1 \cdot 3 \cdot 5 \cdots (2k - 1) = \frac{(2k)!}{2^k \cdot k!}$.

We show that

$1 \cdot 3 \cdot 5 \cdots (2k + 1) = \frac{(2k + 2)!}{2^{k+1} \cdot (k + 1)!}$.

Observe that

$\frac{(2k + 2)!}{2^{k+1} \cdot (k + 1)!} = \frac{(2k + 2)(2k + 1)}{2 \cdot (k + 1)} \cdot \frac{(2k)!}{2^k \cdot k!} = (2k + 1)[1 \cdot 3 \cdot 5 \cdots (2k - 1)] = 1 \cdot 3 \cdot 5 \cdots (2k + 1)$.

By the principle of mathematical induction,

$1 \cdot 3 \cdot 5 \cdots (2n - 1) = \frac{(2n)!}{2^n \cdot n!}$

for every positive integer $n$.

We saw in Theorem 3.12 that the square $x^2$ of an integer $x$ is even if and only if $x$ is even. This is actually a consequence of Theorem 3.17, which states that for integers $a$ and $b$, their product $ab$ is even if and only if $a$ is even or $b$ is even. We now present a generalization of Theorem 3.12.

Result 6.14

Let $x \in \mathbf{Z}$. For every integer $n \ge 2$, $x^n$ is even if and only if $x$ is even.

Proof

First assume that $x$ is even. So $x = 2y$ for some integer $y$. Hence $x^n = x \cdot x^{n-1} = (2y)x^{n-1} = 2(yx^{n-1})$. Since $yx^{n-1}$ is an integer, $x^n$ is even.

We now verify the converse, namely that if $x^n$ is even, where $n \ge 2$, then $x$ is even. We proceed by induction. If $x^2$ is even, then we have already seen that $x$ is even. Hence the statement holds for $n = 2$. Assume, for an integer $k \ge 2$, that if $x^k$ is even, then $x$ is even. We show that if $x^{k+1}$ is even, then $x$ is even. Let $x^{k+1}$ be an even integer. So $x \cdot x^k$ is even. By Theorem 3.17, $x$ is even or $x^k$ is even. If $x$ is even, then the result is proved. On the other hand, if $x^k$ is even, then by the induction hypothesis $x$ is even as well. By the principle of mathematical induction, it follows for every integer $n \ge 2$ that if $x^n$ is even, then $x$ is even.

Although it is impossible to illustrate every kind of result where induction can be used, we give two examples that differ significantly from those we have seen so far. One of De Morgan's laws (see Theorem 4.22) states that $\overline{A \cup B} = \overline{A} \cap \overline{B}$ for every two sets $A$ and $B$. It is possible to use this law to show that $\overline{A \cup B \cup C} = \overline{A} \cap \overline{B} \cap \overline{C}$ for every three sets $A$, $B$ and $C$. We show how induction can be used to prove De Morgan's law for any finite number of sets.

Theorem 6.15

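If $A_1, A_2, \ldots, A_n$ are $n \ge 2$ sets, then

$\overline{A_1 \cup A_2 \cup \cdots \cup A_n} = \overline{A_1} \cap \overline{A_2} \cap \cdots \cap \overline{A_n}$.

Proof

We proceed by induction. For $n = 2$, the result is De Morgan's law and is therefore true. Assume that the result is true for any $k$ sets, where $k \ge 2$; that is, assume that if $B_1, B_2, \ldots, B_k$ are any $k$ sets, then

$\overline{B_1 \cup B_2 \cup \cdots \cup B_k} = \overline{B_1} \cap \overline{B_2} \cap \cdots \cap \overline{B_k}$.

We prove that the result holds for any $k + 1$ sets. Let $S_1, S_2, \ldots, S_{k+1}$ be any $k + 1$ sets. We show that

$\overline{S_1 \cup S_2 \cup \cdots \cup S_{k+1}} = \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_{k+1}}$.

Let $T = S_1 \cup S_2 \cup \cdots \cup S_k$. Then

$\overline{S_1 \cup S_2 \cup \cdots \cup S_{k+1}} = \overline{(S_1 \cup S_2 \cup \cdots \cup S_k) \cup S_{k+1}} = \overline{T \cup S_{k+1}}$.

By De Morgan's law, $\overline{T \cup S_{k+1}} = \overline{T} \cap \overline{S_{k+1}}$. By the definition of $T$ and the induction hypothesis,

$\overline{T} = \overline{S_1 \cup S_2 \cup \cdots \cup S_k} = \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_k}$.

So

$\overline{S_1 \cup S_2 \cup \cdots \cup S_{k+1}} = \overline{T \cup S_{k+1}} = \overline{T} \cap \overline{S_{k+1}} = \overline{S_1} \cap \overline{S_2} \cap \cdots \cap \overline{S_k} \cap \overline{S_{k+1}}$.

By the principle of mathematical induction, for every $n \ge 2$ sets $A_1, A_2, \ldots, A_n$,

$\overline{A_1 \cup A_2 \cup \cdots \cup A_n} = \overline{A_1} \cap \overline{A_2} \cap \cdots \cap \overline{A_n}$,

as desired.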
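The structure of this proof (peel off the last set, apply the two-set law, then use the result for $k$ sets) translates directly into a recursive function. The sketch below assumes that complements are taken relative to a finite universal set `universe`, and the function names are our own.

```python
def complement(s: frozenset, universe: frozenset) -> frozenset:
    """Complement of s relative to the universal set."""
    return universe - s


def complement_of_union(sets: list, universe: frozenset) -> frozenset:
    """Compute the complement of a union of n >= 2 sets as the proof does."""
    if len(sets) == 2:
        a, b = sets
        # Base case: De Morgan's law for two sets.
        return complement(a, universe) & complement(b, universe)
    *first_k, last = sets
    # Induction step: with T the union of the first k sets, the complement
    # of (T union S_{k+1}) is complement(T) intersected with complement(S_{k+1}).
    return complement_of_union(first_k, universe) & complement(last, universe)


universe = frozenset(range(10))
example = [frozenset({1, 2}), frozenset({2, 3, 4}), frozenset({7}), frozenset({0, 9})]
direct = universe - frozenset().union(*example)
```

For the sample sets, the recursive computation and the direct computation `direct` both give $\{5, 6, 8\}$.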

PROOF ANALYSIS

A few remarks on the notation used in the statement and proof of Theorem 6.15 may be useful. First, the sets $A_1, A_2, \ldots, A_n$ were used in the statement of Theorem 6.15 only as an aid in describing the result. Theorem 6.15 could also be stated as follows: For every integer $n \ge 2$, the complement of the union of any $n$ sets equals the intersection of the complements of these sets. To verify the inductive step in the proof of Theorem 6.15, we assumed that the statement holds for any $k \ge 2$ sets, which we denoted by $B_1, B_2, \ldots, B_k$. The fact that we used $A_1, A_2, \ldots, A_n$ to describe the statement of Theorem 6.15 does not mean that we should use $A_1, A_2, \ldots, A_k$ for the $k$ sets in the induction hypothesis. In fact, it is probably best not to use this notation. In the induction step, we needed to show that the result holds for any $k + 1$ sets. We used $S_1, S_2, \ldots, S_{k+1}$ for these sets. It would have been a bad idea to denote the $k + 1$ sets by $B_1, B_2, \ldots, B_{k+1}$, because this would have suggested (incorrectly) that $k$ of the $k + 1$ sets must be the very sets mentioned in the induction hypothesis.

Now we can prove another well-known theorem about sets, one to which we have referred earlier.

Theorem 6.16

If $A$ is a finite set of cardinality $n \ge 0$, then the cardinality of its power set $\mathcal{P}(A)$ is $2^n$.

Proof

We proceed by induction. If $A$ is a set with $|A| = 0$, then $A = \emptyset$. So $\mathcal{P}(A) = \{\emptyset\}$ and thus $|\mathcal{P}(A)| = 1 = 2^0$. Hence the theorem holds for $n = 0$. Assume that if $B$ is any set with $|B| = k$ for some nonnegative integer $k$, then $|\mathcal{P}(B)| = 2^k$. We show that if $C$ is a set with $|C| = k + 1$, then $|\mathcal{P}(C)| = 2^{k+1}$. Let $C = \{c_1, c_2, \ldots, c_{k+1}\}$. By the induction hypothesis, there are $2^k$ subsets of the set $\{c_1, c_2, \ldots, c_k\}$; that is, there are $2^k$ subsets of $C$ that do not contain $c_{k+1}$. Any subset of $C$ that contains $c_{k+1}$ can be expressed as $D \cup \{c_{k+1}\}$, where $D \subseteq \{c_1, c_2, \ldots, c_k\}$. Again, by the induction hypothesis, there are $2^k$ such subsets $D$. So there are $2^k + 2^k = 2 \cdot 2^k = 2^{k+1}$ subsets of $C$. By the principle of mathematical induction, it follows for every nonnegative integer $n$ that if $|A| = n$, then $|\mathcal{P}(A)| = 2^n$.

Of course, Theorem 6.16 can also be stated as: The number of subsets of a finite set with $n$ elements is $2^n$.
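The counting argument in this proof is also a recipe for listing the subsets: take every subset that avoids the last element, then add the last element to each of them. A minimal sketch, with a function name of our own choosing and assuming the input list has no repeated elements:

```python
def power_set(elements: list) -> list:
    """List all subsets of `elements`, following the proof of Theorem 6.16."""
    if not elements:
        return [frozenset()]              # the power set of the empty set is {emptyset}
    *rest, last = elements
    without_last = power_set(rest)        # 2^k subsets not containing the last element
    with_last = [d | {last} for d in without_last]  # the subsets D together with the last element
    return without_last + with_last       # 2^k + 2^k = 2^(k+1) subsets in all
```

The two halves of the returned list correspond to the two groups of $2^k$ subsets counted in the proof.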

6.3 Proof by minimal counterexample

For each positive integer $n$, let $P(n)$ be a statement. We have seen that induction is a natural proof technique that can be used to verify the truth of the quantified statement

$\forall n \in \mathbf{N}, P(n)$.    (6.8)

There are certainly statements where induction does not work or does not work well. If we try to prove (6.8) by a proof by contradiction, then we begin this proof by assuming that the statement $\forall n \in \mathbf{N}, P(n)$ is false. Consequently, there are positive integers $n$ such that $P(n)$ is false. By the Well-Ordering Principle, there is a smallest positive integer $n$ such that $P(n)$ is false. Denote this integer by $m$. Therefore, $P(m)$ is false, and $P(k)$ is true for every integer $k$ with $1 \le k < m$. The integer $m$ is called a minimal counterexample of the statement (6.8). If a proof (by contradiction) of $\forall n \in \mathbf{N}, P(n)$ can be given using the fact that $m$ is a minimal counterexample, then such a proof is called a proof by minimal counterexample.

We now illustrate this proof technique. For the example we are about to describe, it is helpful to recall from algebra that $(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$.

Suppose that we want to prove that $6 \mid (n^3 - n)$ for every positive integer $n$. A proof by induction would probably begin like this: If $n = 1$, then $n^3 - n = 0$. Since $6 \mid 0$, the result is true for $n = 1$. Assume that $6 \mid (k^3 - k)$, where $k$ is a positive integer. We want to prove that $6 \mid [(k + 1)^3 - (k + 1)]$. Since $6 \mid (k^3 - k)$, it follows that $k^3 - k = 6x$ for some integer $x$. So

$(k + 1)^3 - (k + 1) = k^3 + 3k^2 + 3k + 1 - k - 1 = (k^3 - k) + 3k^2 + 3k = 6x + 3k(k + 1)$.

If we can show that $6 \mid 3k(k + 1)$, then we have a proof. So we need to show that $k(k + 1)$ is even for every positive integer $k$. To verify this, a lemma could be introduced. This lemma could be proved by two cases ($k$ is even and $k$ is odd), or induction could be used. Although such a lemma is not difficult to prove, we give an alternative proof that avoids the need for a lemma.

Result 6.17

For every positive integer $n$, $6 \mid (n^3 - n)$.

Proof

Assume, to the contrary, that there are positive integers $n$ such that $6 \nmid (n^3 - n)$. Then there is a smallest positive integer $n$ such that $6 \nmid (n^3 - n)$. Let $m$ be this integer. If $n = 1$, then $n^3 - n = 0$; while if $n = 2$, then $n^3 - n = 6$. Since $6 \mid 0$ and $6 \mid 6$, it follows that $6 \mid (n^3 - n)$ for $n = 1$ and $n = 2$. So $m \ge 3$. Then we can write $m = k + 2$, where $1 \le k < m$. Observe that

$m^3 - m = (k + 2)^3 - (k + 2) = (k^3 + 6k^2 + 12k + 8) - (k + 2) = (k^3 - k) + (6k^2 + 12k + 6)$.

Since $k < m$, it follows that $6 \mid (k^3 - k)$. So $k^3 - k = 6x$ for some integer $x$. Thus we have

$m^3 - m = 6x + 6(k^2 + 2k + 1) = 6(x + k^2 + 2k + 1)$.

Since $x + k^2 + 2k + 1$ is an integer, $6 \mid (m^3 - m)$, which produces a contradiction.
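To get a feel for what a minimal counterexample is, one can search for the smallest integer in a finite range at which a statement fails. The helper below is hypothetical and written only for illustration. A search over finitely many integers cannot prove a statement about all positive integers, but it can show where a false statement first fails.

```python
from typing import Callable, Optional


def minimal_counterexample(p: Callable[[int], bool], start: int, stop: int) -> Optional[int]:
    """Return the smallest n with start <= n <= stop for which p(n) is false, if any."""
    for n in range(start, stop + 1):
        if not p(n):
            return n
    return None


def divisible_by_6(n: int) -> bool:
    """The statement of Result 6.17 for a single n."""
    return (n ** 3 - n) % 6 == 0


def key_identity(m: int) -> bool:
    """With k = m - 2, check m^3 - m = (k^3 - k) + 6(k^2 + 2k + 1)."""
    k = m - 2
    return m ** 3 - m == (k ** 3 - k) + 6 * (k * k + 2 * k + 1)
```

For example, `minimal_counterexample(divisible_by_6, 1, 1000)` finds nothing, whereas the false statement "$2^n > n^2$ for every positive integer $n$" has minimal counterexample 2.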

PROOF ANALYSIS

Let's see how this proof was constructed. In this proof, $m$ is a positive integer such that $6 \nmid (m^3 - m)$, while $6 \mid (n^3 - n)$ for every positive integer $n$ with $n < m$. We needed to determine how large $m$ must be in order to obtain a contradiction. We saw that $6 \mid (1^3 - 1)$ and $6 \mid (2^3 - 2)$; so $m \ge 3$. Knowing that $m \ge 3$ allowed us to write $m = k + 2$, where $1 \le k < m$. Since $1 \le k < m$, we know that $6 \mid (k^3 - k)$ and thus $k^3 - k = 6x$, where $x \in \mathbf{Z}$. So in the proof we wrote

$m^3 - m = (k + 2)^3 - (k + 2) = (k^3 + 6k^2 + 12k + 8) - (k + 2) = (k^3 - k) + (6k^2 + 12k + 6) = 6x + 6k^2 + 12k + 6$.

The fact that we can factor 6 from $6x + 6k^2 + 12k + 6$ allowed us to conclude that $6 \mid (m^3 - m)$ and so obtain a contradiction.

But how did we know that we wanted $m \ge 3$? Had we observed only that $6 \mid (1^3 - 1)$ and not that $6 \mid (2^3 - 2)$, then we would have known only that $m \ge 2$, which would have allowed us to write $m = k + 1$, where $1 \le k < m$. Of course, we would still know that $6 \mid (k^3 - k)$ and thus $k^3 - k = 6x$, where $x \in \mathbf{Z}$. However, when we consider $m^3 - m$, we would have

$m^3 - m = (k + 1)^3 - (k + 1) = (k^3 + 3k^2 + 3k + 1) - (k + 1) = (k^3 - k) + 3k^2 + 3k = (k^3 - k) + 3k(k + 1) = 6x + 3k(k + 1)$.

As it stands, we can factor 3 from $6x + 3k(k + 1)$, but we cannot factor 6 unless we can prove that $k(k + 1)$ is even. This is the same difficulty we encountered when considering a proof by induction. In any case, no contradiction is obtained.

If a result can be proved by induction, then it can also be proved by minimal counterexample. It is not difficult to use induction to prove that $3 \mid (2^{2n} - 1)$ for every nonnegative integer $n$. We also give a proof by minimal counterexample of this statement.

Result 6.18

For every nonnegative integer $n$, $3 \mid (2^{2n} - 1)$.

Proof

Assume, to the contrary, that there are nonnegative integers $n$ for which $3 \nmid (2^{2n} - 1)$. By Theorem 6.7, there is a smallest nonnegative integer $n$ such that $3 \nmid (2^{2n} - 1)$. Denote this integer by $m$. So $3 \nmid (2^{2m} - 1)$, while $3 \mid (2^{2n} - 1)$ for all nonnegative integers $n$ with $0 \le n < m$. Since $3 \mid (2^{2n} - 1)$ when $n = 0$, it follows that $m \ge 1$. Hence $m$ can be expressed as $m = k + 1$, where $0 \le k < m$. So $3 \mid (2^{2k} - 1)$, which implies that $2^{2k} - 1 = 3x$ for some integer $x$. Therefore $2^{2k} = 3x + 1$. Observe that

$2^{2m} - 1 = 2^{2(k+1)} - 1 = 2^{2k+2} - 1 = 2^2 \cdot 2^{2k} - 1 = 4(3x + 1) - 1 = 12x + 3 = 3(4x + 1)$.

Since $4x + 1$ is an integer, $3 \mid (2^{2m} - 1)$, which produces a contradiction.

We give one additional example of a proof by minimal counterexample.

Result 6.19

For every positive integer $n$,

$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}$.

Proof

Assume, to the contrary, that

$1 + 2 + 3 + \cdots + n \ne \frac{n(n + 1)}{2}$

for some positive integers $n$. By the Well-Ordering Principle, there is a smallest positive integer $n$ such that

$1 + 2 + 3 + \cdots + n \ne \frac{n(n + 1)}{2}$.

Denote this integer by $m$. So

$1 + 2 + 3 + \cdots + m \ne \frac{m(m + 1)}{2}$,

while

$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}$

for every integer $n$ with $1 \le n < m$. Since $1 = 1(1 + 1)/2$, it follows that $m \ge 2$. Hence we can write $m = k + 1$, where $1 \le k < m$. Consequently,

$1 + 2 + 3 + \cdots + k = \frac{k(k + 1)}{2}$.

Observe that

$1 + 2 + 3 + \cdots + m = 1 + 2 + 3 + \cdots + (k + 1) = (1 + 2 + 3 + \cdots + k) + (k + 1) = \frac{k(k + 1)}{2} + (k + 1) = \frac{k(k + 1) + 2(k + 1)}{2} = \frac{(k + 1)(k + 2)}{2} = \frac{m(m + 1)}{2}$,

which produces a contradiction.

6.4 The strong principle of mathematical induction

We conclude with one last form of mathematical induction. This principle goes by many names: the strong principle of mathematical induction, the strong form of induction, the alternate form of mathematical induction, and the second principle of mathematical induction are common names.

Theorem 6.20

(The Strong Principle of Mathematical Induction) For each positive integer $n$, let $P(n)$ be a statement. If

(a) $P(1)$ is true and
(b) the implication

    If $P(i)$ for every integer $i$ with $1 \le i \le k$, then $P(k + 1)$.

is true for every positive integer $k$,

then $P(n)$ is true for every positive integer $n$.

Like the principle of mathematical induction (Theorem 6.2), the strong principle of mathematical induction is also a consequence of the Well-Ordering Principle. The strong principle of mathematical induction is now stated more symbolically below.

The Strong Principle of Mathematical Induction. For each positive integer $n$, let $P(n)$ be a statement. If

(1) $P(1)$ is true and
(2) $\forall k \in \mathbf{N}$, $P(1) \wedge P(2) \wedge \cdots \wedge P(k) \Rightarrow P(k + 1)$ is true,

then $\forall n \in \mathbf{N}, P(n)$ is true.

The difference between the statements of the principle of mathematical induction and the strong principle of mathematical induction lies in the induction step (condition 2).

To prove that $\forall n \in \mathbf{N}, P(n)$ is true by the principle of mathematical induction, we need to show that $P(1)$ is true and verify that the implication

If $P(k)$, then $P(k + 1)$.    (6.9)

is true for every positive integer $k$. On the other hand, to prove that $\forall n \in \mathbf{N}, P(n)$ is true by the strong principle of mathematical induction, we need to show that $P(1)$ is true and verify that the implication

If $P(i)$ for every integer $i$ with $1 \le i \le k$, then $P(k + 1)$.    (6.10)

is true for every positive integer $k$. If we were to give direct proofs of the implications (6.9) and (6.10), then we are permitted to assume more in the induction step (6.10) of the strong principle of mathematical induction than in the induction step (6.9) of the principle of mathematical induction, and yet we obtain the same conclusion. If the assumption that $P(k)$ is true is insufficient to verify the truth of $P(k + 1)$ for an arbitrary positive integer $k$, but the assumption that all of the statements $P(1), P(2), \ldots, P(k)$ are true is sufficient to verify the truth of $P(k + 1)$, then this suggests that we should use the strong principle of mathematical induction. In fact, any result that can be proved by the principle of mathematical induction can also be proved by the strong principle of mathematical induction.

Just as there is a more general version of the principle of mathematical induction (namely Theorem 6.8), there is a more general version of the strong principle of mathematical induction. We will also refer to this as the strong principle of mathematical induction.

Theorem 6.21

(The Strong Principle of Mathematical Induction) For a fixed integer $m$, let $S = \{i \in \mathbf{Z} : i \ge m\}$. For each $n \in S$, let $P(n)$ be a statement. If

(1) $P(m)$ is true and
(2) the implication

    If $P(i)$ for every integer $i$ with $m \le i \le k$, then $P(k + 1)$.

is true for every integer $k \in S$,

then $P(n)$ is true for every integer $n \in S$.

We now consider a class of mathematical statements for which the strong principle of mathematical induction is usually the appropriate proof technique. Suppose that we are considering a sequence $a_1, a_2, a_3, \ldots$ of numbers. One way of defining a sequence $\{a_n\}$ is to specify the $n$th term $a_n$ explicitly (as a function of $n$). For example, for each $n \in \mathbf{N}$ we might have $a_n = \frac{1}{n}$, $a_n = \frac{(-1)^n}{n^2}$ or $a_n = n^3 + n$. A sequence can also be defined recursively. In a recursively defined sequence $\{a_n\}$, only the first term or perhaps the first few terms are defined specifically, say $a_1, a_2, \ldots, a_k$ for some fixed $k \in \mathbf{N}$. These are called the initial values. Then $a_{k+1}$ is expressed in terms of $a_1, a_2, \ldots, a_k$; and more generally, for $n > k$, $a_n$ is expressed in terms of $a_1, a_2, \ldots, a_{n-1}$. This is called the recurrence relation.

A specific example of this is the sequence $\{a_n\}$ defined by $a_1 = 1$, $a_2 = 3$ and $a_n = 2a_{n-1} - a_{n-2}$ for $n \ge 3$. In this case, there are two initial values, namely $a_1 = 1$ and $a_2 = 3$. The recurrence relation here is $a_n = 2a_{n-1} - a_{n-2}$ for $n \ge 3$. Letting $n = 3$, we find that $a_3 = 2a_2 - a_1 = 5$; while letting $n = 4$, we have $a_4 = 2a_3 - a_2 = 7$. Similarly, $a_5 = 9$ and $a_6 = 11$. From this information, one might well conjecture (guess) that $a_n = 2n - 1$ for every $n \in \mathbf{N}$. (Conjectures are discussed in more detail in Section 7.1.) Using the strong principle of mathematical induction, we can, in fact, prove that this conjecture is true.

Result 6.22

A sequence $\{a_n\}$ is defined recursively by $a_1 = 1$, $a_2 = 3$ and $a_n = 2a_{n-1} - a_{n-2}$ for $n \ge 3$. Then $a_n = 2n - 1$ for all $n \in \mathbf{N}$.

Proof

We proceed by induction. Since $a_1 = 2 \cdot 1 - 1 = 1$, the formula holds for $n = 1$. Assume, for an arbitrary positive integer $k$, that $a_i = 2i - 1$ for all integers $i$ with $1 \le i \le k$. We show that $a_{k+1} = 2(k + 1) - 1 = 2k + 1$. If $k = 1$, then $2k + 1 = 2 \cdot 1 + 1 = 3$. Since $a_2 = 3$, it follows that $a_{k+1} = 2k + 1$ when $k = 1$. Hence we may assume that $k \ge 2$. Since $k + 1 \ge 3$, it follows that

$a_{k+1} = 2a_k - a_{k-1} = 2(2k - 1) - (2k - 3) = 2k + 1$,

which is the desired result. By the strong principle of mathematical induction, $a_n = 2n - 1$ for all $n \in \mathbf{N}$.
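Before attempting such a proof, it is often helpful to generate terms of the sequence and compare them with the conjectured formula. The following sketch does this for the sequence of Result 6.22; the function name `sequence_terms` is our own, and agreement on finitely many terms supports the conjecture without proving it.

```python
def sequence_terms(count: int) -> list[int]:
    """First `count` terms of a_1 = 1, a_2 = 3, a_n = 2a_{n-1} - a_{n-2} (n >= 3)."""
    terms = [1, 3]                                # the two initial values
    while len(terms) < count:
        terms.append(2 * terms[-1] - terms[-2])   # the recurrence relation
    return terms[:count]


# Compare each term a_n with the conjectured value 2n - 1.
matches = all(a == 2 * n - 1 for n, a in enumerate(sequence_terms(200), start=1))
```

Note that each new term needs the two preceding terms, which is exactly why the proof calls on the strong principle of mathematical induction.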

PROOF ANALYSIS

A few remarks on the proof of Result 6.22 are in order. First, for an arbitrary positive integer $k$, we assumed that $a_i = 2i - 1$ for all integers $i$ with $1 \le i \le k$. Our goal was to show that $a_{k+1} = 2k + 1$. Since $k$ is a positive integer, it may occur that $k = 1$ or $k \ge 2$. If $k = 1$, then we need to show that $a_{k+1} = a_2 = 2 \cdot 1 + 1 = 3$. We know that $a_2 = 3$ because it is one of the initial values. If $k \ge 2$, then $k + 1 \ge 3$ and $a_{k+1}$ can be expressed by the recurrence relation as $2a_k - a_{k-1}$. To show that $a_{k+1} = 2k + 1$ when $k \ge 2$, it was necessary to know that $a_k = 2k - 1$ and that $a_{k-1} = 2(k - 1) - 1 = 2k - 3$. Because we were using the strong principle of mathematical induction, we knew both pieces of information. Had we used the principle of mathematical induction, we would have assumed (and therefore known) that $a_k = 2k - 1$, but we would not have known that $a_{k-1} = 2k - 3$ and so could not have established the desired expression for $a_{k+1}$.

Example 6.23

A sequence $\{a_n\}$ is defined recursively by $a_1 = 1$, $a_2 = 4$ and $a_n = 2a_{n-1} - a_{n-2} + 2$ for $n \ge 3$. Conjecture a formula for $a_n$ and verify that your conjecture is correct.

Solution

We begin by finding a few more terms of the sequence. Observe that $a_3 = 2a_2 - a_1 + 2 = 9$, while $a_4 = 2a_3 - a_2 + 2 = 16$ and $a_5 = 2a_4 - a_3 + 2 = 25$. The obvious conjecture is that $a_n = n^2$ for every positive integer $n$. In the next result we verify that this conjecture is correct.

Result 6.24

A sequence $\{a_n\}$ is defined recursively by $a_1 = 1$, $a_2 = 4$ and $a_n = 2a_{n-1} - a_{n-2} + 2$ for $n \ge 3$. Then $a_n = n^2$ for all $n \in \mathbf{N}$.

Proof

We proceed by induction. Since $a_1 = 1 = 1^2$, the formula holds for $n = 1$. Assume, for an arbitrary positive integer $k$, that $a_i = i^2$ for every integer $i$ with $1 \le i \le k$. We show that $a_{k+1} = (k + 1)^2$. Since $a_2 = 4$, it follows that $a_{k+1} = (k + 1)^2$ when $k = 1$. Thus we may assume that $k \ge 2$. Hence $k + 1 \ge 3$ and so

$a_{k+1} = 2a_k - a_{k-1} + 2 = 2k^2 - (k - 1)^2 + 2 = 2k^2 - (k^2 - 2k + 1) + 2 = k^2 + 2k + 1 = (k + 1)^2$.

By the strong principle of mathematical induction, $a_n = n^2$ for all $n \in \mathbf{N}$.

Although we have mentioned that problems involving recurrence relations are commonly solved using the strong principle of mathematical induction, this is by no means the only kind of problem where the strong principle of mathematical induction can be applied. While the best examples of this require a mathematical background beyond what we have covered so far, we present a different type of example.

Result 6.25

For every integer $n \ge 8$, there are nonnegative integers $a$ and $b$ such that $n = 3a + 5b$.

Proof

We proceed by induction. Since $8 = 3 \cdot 1 + 5 \cdot 1$, the statement holds for $n = 8$. Assume, for every integer $i$ with $8 \le i \le k$, where $k \ge 8$ is an arbitrary integer, that there exist nonnegative integers $s$ and $t$ such that $i = 3s + 5t$. Consider the integer $k + 1$. We show that there are nonnegative integers $x$ and $y$ such that $k + 1 = 3x + 5y$. Since $9 = 3 \cdot 3 + 5 \cdot 0$ and $10 = 3 \cdot 0 + 5 \cdot 2$, this is true when $k + 1 = 9$ and when $k + 1 = 10$. Hence we may assume that $k + 1 \ge 11$. So $8 \le (k + 1) - 3 < k$. By the induction hypothesis, there are nonnegative integers $a$ and $b$ such that $(k + 1) - 3 = 3a + 5b$ and so $k + 1 = 3(a + 1) + 5b$. Letting $x = a + 1$ and $y = b$, we have the desired conclusion. By the strong principle of mathematical induction, for every integer $n \ge 8$ there exist nonnegative integers $a$ and $b$ such that $n = 3a + 5b$.
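This proof is constructive, and its steps can be followed by a recursive function: the cases $n = 8, 9, 10$ are handled directly, and every larger $n$ is reduced to $n - 3$. The sketch below uses a function name of our own choosing.

```python
def three_five(n: int) -> tuple[int, int]:
    """Return nonnegative integers (a, b) with n = 3a + 5b, for an integer n >= 8."""
    if n < 8:
        raise ValueError("the result is stated only for n >= 8")
    base_cases = {8: (1, 1), 9: (3, 0), 10: (0, 2)}   # 8 = 3+5, 9 = 3*3, 10 = 5*2
    if n in base_cases:
        return base_cases[n]
    a, b = three_five(n - 3)   # induction hypothesis applied to n - 3 >= 8
    return a + 1, b            # n = 3(a + 1) + 5b
```

For example, `three_five(11)` returns `(2, 1)`, since $11 = 3 \cdot 2 + 5 \cdot 1$. The call on $n - 3$ rather than on $n - 1$ reflects the use of the strong principle of mathematical induction.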

EXERCISES FOR CHAPTER 6

Section 6.1: The principle of mathematical induction

6.1. Which of the following sets are well-ordered?
(a) $S = \{x \in \mathbf{Q} : x \ge -10\}$
(b) $S = \{-2, -1, 0, 1, 2\}$
(c) $S = \{x \in \mathbf{Q} : -1 \le x \le 1\}$
(d) $S = \{p : p \text{ is a prime}\} = \{2, 3, 5, 7, 11, 13, 17, \ldots\}$.

6.2. Prove that if $A$ is a well-ordered set of real numbers and $B$ is a nonempty subset of $A$, then $B$ is also well-ordered.

6.3. Prove that every nonempty set of negative integers has a largest element.

6.4. Prove that $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ for every positive integer $n$ (1) by induction and (2) by adding $1 + 3 + 5 + \cdots + (2n - 1)$ and $(2n - 1) + (2n - 3) + \cdots + 1$.

6.5. Use mathematical induction to prove that $1 + 5 + 9 + \cdots + (4n - 3) = 2n^2 - n$ for every positive integer $n$.

6.6. (a) We have seen that $1^2 + 2^2 + \cdots + n^2$ is the number of squares in an $n \times n$ square composed of $n^2$ $1 \times 1$ squares. What does $1^3 + 2^3 + 3^3 + \cdots + n^3$ represent geometrically?
(b) Prove by induction that $1^3 + 2^3 + 3^3 + \cdots + n^3 = \frac{n^2(n + 1)^2}{4}$ for every positive integer $n$.

6.7. Find another formula suggested by Exercises 6.4 and 6.5, and verify your formula by mathematical induction.

6.8. Find a formula for $1 + 4 + 7 + \cdots + (3n - 2)$ for positive integers $n$, and then verify your formula by mathematical induction.

6.9. Prove that $1 \cdot 3 + 2 \cdot 4 + 3 \cdot 5 + \cdots + n(n + 2) = \frac{n(n + 1)(2n + 7)}{6}$ for every positive integer $n$.

6.10. Let $r \ne 1$ be a real number. Use induction to prove that $a + ar + ar^2 + \cdots + ar^{n-1} = \frac{a(1 - r^n)}{1 - r}$ for every positive integer $n$.

6.11. Prove that $\frac{1}{3 \cdot 4} + \frac{1}{4 \cdot 5} + \cdots + \frac{1}{(n + 2)(n + 3)} = \frac{n}{3n + 9}$ for every positive integer $n$.

6.12. Consider the open sentence $P(n)$: $9 + 13 + \cdots + (4n + 5) = \frac{4n^2 + 14n + 1}{2}$, where $n \in \mathbf{N}$.
(a) Verify the implication $P(k) \Rightarrow P(k + 1)$ for an arbitrary positive integer $k$.
(b) Is $\forall n \in \mathbf{N}, P(n)$ true?

6.13. Prove that $1 \cdot 1! + 2 \cdot 2! + \cdots + n \cdot n! = (n + 1)! - 1$ for every positive integer $n$.

6.14. Prove that $2! \cdot 4! \cdot 6! \cdots (2n)! \ge [(n + 1)!]^n$ for every positive integer $n$.

6.15. Prove that $\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} \le 2\sqrt{n} - 1$ for every positive integer $n$.

6.16. Prove that $7 \mid [3^{4n+1} - 5^{2n-1}]$ for every positive integer $n$.

Section 6.2: A more general principle of mathematical induction

6.17. Prove Theorem 6.7: For every integer $m$, the set $S = \{i \in \mathbf{Z} : i \ge m\}$ is well-ordered. [Hint: For every subset $T$ of $S$, either $T \subseteq \mathbf{N}$ or $T - \mathbf{N}$ is a nonempty finite set.]

6.18. Prove that $2^n > n^3$ for every integer $n \ge 10$.

6.19. Prove the following implication for every integer $n \ge 2$: If $x_1, x_2, \ldots, x_n$ are any $n$ real numbers such that $x_1 \cdot x_2 \cdots x_n = 0$, then at least one of the numbers $x_1, x_2, \ldots, x_n$ is 0. (Use the fact that if the product of two real numbers is 0, then at least one of the numbers is 0.)

6.20. (a) Use mathematical induction to prove that every nonempty finite set of real numbers has a largest element.
(b) Use (a) to prove that every nonempty finite set of real numbers has a smallest element.

6.21. Prove that $4 \mid (5^n - 1)$ for every nonnegative integer $n$.

6.22. Prove that $3^n > n^2$ for every positive integer $n$.

6.23. Prove that $7 \mid (3^{2n} - 2^n)$ for every nonnegative integer $n$.

6.24. Prove Bernoulli's Inequality: For every real number $x > -1$ and every positive integer $n$, $(1 + x)^n \ge 1 + nx$.

6.25. Prove that $n! > 2^n$ for every integer $n \ge 4$.

6.26. Prove that $81 \mid (10^{n+1} - 9n - 10)$ for every nonnegative integer $n$.

6.27. Prove that $1 + \frac{1}{4} + \frac{1}{9} + \cdots + \frac{1}{n^2} \le 2 - \frac{1}{n}$ for every positive integer $n$.

6.28. In Problem 6 of Chapter 4, you were asked to prove that if $3 \mid 2a$, where $a \in \mathbf{Z}$, then $3 \mid a$. Assume that this result is true. Prove the following generalization: Let $a \in \mathbf{Z}$. For every positive integer $n$, if $3 \mid 2^n a$, then $3 \mid a$.

6.29. Prove that if $A_1, A_2, \ldots, A_n$ are any $n \ge 2$ sets, then $\overline{A_1 \cap A_2 \cap \cdots \cap A_n} = \overline{A_1} \cup \overline{A_2} \cup \cdots \cup \overline{A_n}$.

6.30. Recall for integers $n \ge 2$, $a$, $b$, $c$, $d$ that if $a \equiv b \pmod{n}$ and $c \equiv d \pmod{n}$, then both $a + c \equiv b + d \pmod{n}$ and $ac \equiv bd \pmod{n}$. Use these results and mathematical induction to prove the following: For any $2m$ integers $a_1, a_2, \ldots, a_m$ and $b_1, b_2, \ldots, b_m$ with $a_i \equiv b_i \pmod{n}$ for $1 \le i \le m$,
(a) $a_1 + a_2 + \cdots + a_m \equiv b_1 + b_2 + \cdots + b_m \pmod{n}$ and
(b) $a_1 a_2 \cdots a_m \equiv b_1 b_2 \cdots b_m \pmod{n}$.

6.31. Prove for every $n \ge 1$ positive real numbers $a_1, a_2, \ldots, a_n$ that

$\left(\sum_{i=1}^{n} a_i\right)\left(\sum_{i=1}^{n} \frac{1}{a_i}\right) \ge n^2$.

6.32. Prove for every $n \ge 2$ positive real numbers $a_1, a_2, \ldots, a_n$ that

$(n - 1) \sum_{i=1}^{n} a_i^2 \ge 2 \sum_{1 \le i < j \le n} a_i a_j$.    (6.11)

[Note: When $n = 4$, for example, (6.11) states that $3(a_1^2 + a_2^2 + a_3^2 + a_4^2) \ge 2(a_1 a_2 + a_1 a_3 + a_1 a_4 + a_2 a_3 + a_2 a_4 + a_3 a_4)$.]

Section 6.3: Proof by minimal counterexample

6.33. Use proof by minimal counterexample to prove that $6 \mid 7n(n^2 - 1)$ for every positive integer $n$.

6.34. Use the method of minimal counterexample to prove that $3 \mid (2^{2n} - 1)$ for every positive integer $n$.

6.35. Give a proof by minimal counterexample that $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ for every positive integer $n$.

6.36. Prove that $5 \mid (n^5 - n)$ for every integer $n$.

6.37. Use proof by minimal counterexample to prove that $3 \mid (2^n + 2^{n+1})$ for every nonnegative integer $n$.

6.38. Give a proof by minimal counterexample that $2^n > n^2$ for every integer $n \ge 5$.

6.39. Prove that $12 \mid (n^4 - n^2)$ for every positive integer $n$.

6.40. Let $S = \{2^r : r \in \mathbf{Z}, r \ge 0\}$. Use proof by minimal counterexample to prove that for every $n \in \mathbf{N}$, there exists a subset $S_n$ of $S$ such that $\sum_{i \in S_n} i = n$.

Section 6.4: The strong principle of mathematical induction

6.41. A sequence $\{a_n\}$ is defined recursively by $a_1 = 1$ and $a_n = 2a_{n-1}$ for $n \ge 2$. Conjecture a formula for $a_n$ and verify that your conjecture is correct.

6.42. A sequence $\{a_n\}$ is defined recursively by $a_1 = 1$, $a_2 = 2$ and $a_n = a_{n-1} + 2a_{n-2}$ for $n \ge 3$. Conjecture a formula for $a_n$ and verify that your conjecture is correct.

6.43. A sequence $\{a_n\}$ is defined recursively by $a_1 = 1$, $a_2 = 4$, $a_3 = 9$ and $a_n = a_{n-1} - a_{n-2} + a_{n-3} + 2(2n - 3)$ for $n \ge 4$. Conjecture a formula for $a_n$ and prove that your conjecture is correct.

6.44. Consider the sequence $F_1, F_2, F_3, \ldots$, where $F_1 = 1$, $F_2 = 1$, $F_3 = 2$, $F_4 = 3$, $F_5 = 5$ and $F_6 = 8$. The terms of this sequence are called Fibonacci numbers.
(a) Define the sequence of Fibonacci numbers by means of a recurrence relation.
(b) Prove that $2 \mid F_n$ if and only if $3 \mid n$.

6.45. Use the strong principle of mathematical induction to prove that for every integer $n \ge 12$, there exist nonnegative integers $a$ and $b$ such that $n = 3a + 7b$.

6.46. Use the strong principle of mathematical induction to prove the following. Let $S = \{i \in \mathbf{Z} : i \ge 2\}$ and let $P$ be a subset of $S$ with the properties that $2, 3 \in P$ and if $n \in S$, then either $n \in P$ or $n = ab$, where $a, b \in S$. Then every element of $S$ either belongs to $P$ or can be expressed as a product of elements of $P$. [Note: You might recognize the set $P$ of primes. This is an important theorem in mathematics, appearing in Chapter 11 as Theorem 11.17.]

6.47. Prove that there exists an odd integer $m$ such that every odd integer $n$ with $n \ge m$ can be expressed either as $3a + 11b$ or as $5c + 7d$ for nonnegative integers $a$, $b$, $c$ and $d$.

ADDITIONAL EXERCISES FOR CHAPTER 6

6.48. Prove that $1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n + 1) = \frac{n(n + 1)(n + 2)}{3}$ for every positive integer $n$.

6.49. Prove that $4^n > n^3$ for every positive integer $n$.

6.50. Prove that $24 \mid (5^{2n} - 1)$ for every positive integer $n$.

6.51. By Result 6.5,

$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n + 1)(2n + 1)}{6}$    (6.12)

for every positive integer $n$.
(a) Use (6.12) to determine a formula for $2^2 + 4^2 + 6^2 + \cdots + (2n)^2$ for every positive integer $n$.
(b) Use (6.12) and (a) to determine a formula for $1^2 + 3^2 + 5^2 + \cdots + (2n - 1)^2$ for every positive integer $n$.
(c) Use (a) and (b) to determine a formula for $1^2 - 2^2 + 3^2 - 4^2 + \cdots + (-1)^{n+1} n^2$ for every positive integer $n$.
(d) Use mathematical induction to verify the formulas in (b) and (c).

6.52. Use the strong principle of mathematical induction to prove that for every integer $n \ge 28$, there exist nonnegative integers $x$ and $y$ such that $n = 5x + 8y$.

6.53. Find a positive integer $m$ such that for every integer $n \ge m$, there are positive integers $x$ and $y$ such that $n = 3x + 5y$. Use the principle of mathematical induction to prove this.

6.54. Find a positive integer $m$ such that for every integer $n \ge m$, there are integers $x, y \ge 2$ such that $n = 2x + 3y$. Use the principle of mathematical induction to prove this.

6.55. A sequence $\{a_n\}$ of real numbers is defined recursively by $a_1 = 1$, $a_2 = 2$ and $a_n = \sum_{i=1}^{n-1} (i - 1)a_i$ for $n \ge 3$. Prove that $a_n = (n - 1)!$ for every integer $n \ge 3$.

6.56. Consider the sequence $a_1 = 2$, $a_2 = 5$, $a_3 = 9$, $a_4 = 14$, etc.
(a) Find a recurrence relation that expresses $a_n$ in terms of $a_{n-1}$ for every integer $n \ge 2$.
(b) Conjecture an explicit formula for $a_n$ and then prove that your conjecture is correct.

6.57. The following theorem allows us to prove certain quantified statements over some finite sets.

The Principle of Finite Induction. For a fixed positive integer $m$, let $S = \{1, 2, \ldots, m\}$. For each $n \in S$, let $P(n)$ be a statement. If
(a) $P(1)$ is true and
(b) the implication "If $P(k)$, then $P(k + 1)$." is true for every integer $k$ with $1 \le k < m$,
then $P(n)$ is true for every integer $n \in S$.

Use the Principle of Finite Induction to prove the following result. Let $S = \{1, 2, \ldots, 24\}$. For every integer $t$ with $1 \le t \le 300$, there is a subset $S_t \subseteq S$ such that $\sum_{i \in S_t} i = t$.

6.58. Evaluate the proposed proof of the following result.

Result. For every positive integer $n$, $1 + 3 + 5 + \cdots + (2n - 1) = n^2$.

Proof. We proceed by induction. Since $2 \cdot 1 - 1 = 1^2$, the formula holds for $n = 1$. Assume that $1 + 3 + 5 + \cdots + (2k - 1) = k^2$ for a positive integer $k$. We prove that $1 + 3 + 5 + \cdots + (2k + 1) = (k + 1)^2$. Observe that

$1 + 3 + 5 + \cdots + (2k + 1) = (k + 1)^2$
$1 + 3 + 5 + \cdots + (2k - 1) + (2k + 1) = (k + 1)^2$
$k^2 + (2k + 1) = (k + 1)^2$
$(k + 1)^2 = (k + 1)^2$.

6.59. Below is a proof of a result. Which result is proved and which proof technique is used?

Proof Assume, to the contrary, that there is a positive integer n such that $8 \nmid (3^{2n} - 1)$. Let m be the smallest positive integer such that $8 \nmid (3^{2m} - 1)$. For n = 1, $3^{2n} - 1 = 8$. Since 8 | 8, it follows that m ≥ 2. Let m = k + 1. Since 1 ≤ k < m, it follows that $8 \mid (3^{2k} - 1)$. Therefore, $3^{2k} - 1 = 8x$ for some integer x and so $3^{2k} = 8x + 1$. Consequently,

$3^{2m} - 1 = 3^{2(k+1)} - 1 = 3^{2k+2} - 1 = 9 \cdot 3^{2k} - 1 = 9(8x + 1) - 1 = 72x + 8 = 8(9x + 1)$.

Since 9x + 1 is an integer, $8 \mid (3^{2m} - 1)$, which produces a contradiction.

6.60. Below is given a proof of a result. Which result is being proved and which proof technique is being used?

Proof First observe that $a_1 = 8 = 3 \cdot 1 + 5$ and $a_2 = 11 = 3 \cdot 2 + 5$. Thus $a_n = 3n + 5$ for n = 1 and n = 2. Assume that $a_i = 3i + 5$ for all integers i with 1 ≤ i ≤ k, where k ≥ 2. Since k + 1 ≥ 3, it follows that

$a_{k+1} = 5a_k - 4a_{k-1} - 9 = 5(3k + 5) - 4(3k + 2) - 9 = 15k + 25 - 12k - 8 - 9 = 3k + 8 = 3(k + 1) + 5$.

6.61. By an n-gon we mean an n-sided polygon. So a 3-gon is a triangle and a 4-gon is a quadrilateral. It is well known that the sum of the interior angles of a triangle is 180°. Use induction to prove that for every integer n ≥ 3, the sum of the interior angles of an n-gon is (n − 2) · 180°.

6.62. Suppose that $\{a_n\}$ is a sequence of real numbers defined recursively by $a_1 = 1$, $a_2 = 2$, $a_3 = 3$ and $a_n = 2a_{n-1} - a_{n-3}$ for n ≥ 4. Prove that $a_n = a_{n-1} + a_{n-2}$ for every integer n ≥ 3.

6.63. Suppose that $\{a_n\}$ is a sequence of real numbers defined recursively by $a_1 = 1$, $a_2 = 2$ and $a_n = a_{n-1}/a_{n-2}$ for n ≥ 3. (a) Prove that

$$a_n = \begin{cases} 1 & \text{if } n \equiv 1, 4 \pmod{6} \\ 2 & \text{if } n \equiv 2, 3 \pmod{6} \\ 1/2 & \text{if } n \equiv 0, 5 \pmod{6} \end{cases}$$

for every positive integer n. (b) Prove that $\sum_{i=1}^{6} a_{j+i} = 7$ for every nonnegative integer j.

6.64. Let x ∈ R where x ≥ 3. Prove that $(1 + x)^n \ge \frac{n(n-1)(n-2)}{6} x^3$ for every integer n ≥ 3.

6.65. Prove that $\sum_{j=1}^{n} \sum_{i=1}^{j} i = \frac{n(n+1)(n+2)}{6}$ for every positive integer n.

6.66. Prove that $\sum_{j=1}^{n} \sum_{i=1}^{j} (2i - 1) = \frac{n(n+1)(2n+1)}{6}$ for every positive integer n.

6.67. Prove that there exists an odd integer m such that every odd integer n with n ≥ m can be expressed as 3a + 5b + 7c for positive integers a, b, and c.

7

Prove or Disprove

In every mathematical statement that you have seen so far, you were told its truth value. If the statement was true, then we either provided a proof or asked you to provide one of your own. What you (perhaps) didn't know was how we or you were to verify its truth. If the statement was false, then here too we either verified this or asked you to show that it was false. As you proceed further into the world of mathematics, you will more and more often encounter statements whose truth is in question. Each such statement therefore presents you with two problems: (1) Determine whether the statement is true or false. (2) Show that your belief is correct.

7.1 Conjectures in Mathematics

In mathematics, when we do not know whether a certain statement is true but there is good reason to believe that it is, we refer to the statement as a conjecture. So the word conjecture is used in mathematics as a sophisticated synonym for an intelligent guess (or perhaps just a guess). Once a conjecture has been proved, it becomes a theorem. If, on the other hand, the conjecture is shown to be false, then we have made an incorrect guess. This is how mathematics works: we guess and show that our guess is right or wrong, then possibly make a new conjecture and repeat the process (possibly often). Learning what is true and what is false about the mathematics we study influences the questions we ask and the conjectures we make.

Let's consider an example of a conjecture (although it is always possible that someone has settled the conjecture between the time it was written here and the time you read it). A word is called a palindrome if it reads the same forward and backward (such as deed, noon and radar). Indeed, a sentence is a palindrome if it reads the same forward and backward, ignoring spaces (Name no one man). A positive integer is called a palindrome if it is the same number when its digits are reversed. (It is considerably easier to give an example of a number that is a palindrome than a word that is a palindrome.) For example, 1221 and 47374 are palindromes. Consider the integer 27.

It is not a palindrome. Reversing its digits gives 72. Of course, 72 is not a palindrome either. Adding 27 and 72, we have

27 + 72 = 99,

and the result is a palindrome. Consider another positive integer, say 59. It is not a palindrome. Reverse its digits and add:

59 + 95 = 154.

The result is not a palindrome either. Reverse its digits and add:

154 + 451 = 605.

Once again we arrive at a number that is not a palindrome. But reverse its digits and add:

605 + 506 = 1111.

This time the result is a palindrome. It has been conjectured that if we begin with any positive integer and apply the technique described above to it, then we will eventually arrive at a palindrome. However, no one knows whether this is true. (It is known to be true for all two-digit numbers.)

Some conjectures have become famous because it took years, decades, or even centuries to establish their truth or falsity. Other conjectures remain undecided to this day. We now look at four conjectures in mathematics, each with a long history.

In 1852 a question occurred to the British student Francis Guthrie as he was coloring a map of the counties of England. Suppose that some country (real or imaginary) has been divided into counties in some manner. Is it possible to color the counties in this map with four or fewer colors such that one color is used for each county and two counties that share a common boundary (not just a single point) are colored differently? For example, the map of the "country" shown in Figure 7.1 has eight "counties," which are colored with the four colors red (R), blue (B), green (G), and yellow (Y) according to the rules described above. This map can also be colored with more than four colors, but not with fewer than four.

Figure 7.1 Coloring the counties of a country with four colors

Within a few years, Francis Guthrie's question caught the attention of some of the most prominent mathematicians of the time, and it eventually became a famous conjecture.

The Four Color Conjecture

Every map can be colored with four or fewer colors.

Many attempted to settle this conjecture. In fact, an article was published in 1879 that was purported to contain a proof of the conjecture. However, in 1890 an error was discovered in the proof, and the "theorem" returned to its status as a conjecture. It was not until 1976 that an actual proof, combining mathematics and computers, was presented by Kenneth Appel and Wolfgang Haken. The period between the origin of the problem and its solution was about 124 years. It is now a theorem.

The Four Color Theorem

Every map can be colored with four or fewer colors.

We now describe a conjecture with an even longer history. One of the famous mathematicians of the 17th century was Pierre Fermat. He is undoubtedly best known for one particular assertion he made. He wrote that for every integer n ≥ 3, there are no nonzero integers x, y, and z such that $x^n + y^n = z^n$. Of course, there are many nonzero integer solutions to the equation $x^2 + y^2 = z^2$. For example, $3^2 + 4^2 = 5^2$, $5^2 + 12^2 = 13^2$ and $8^2 + 15^2 = 17^2$. A triple (x, y, z) of positive integers such that $x^2 + y^2 = z^2$ is often called a Pythagorean triple. Therefore, (3, 4, 5), (5, 12, 13) and (8, 15, 17) are Pythagorean triples. Indeed, if (a, b, c) is a Pythagorean triple and k ∈ N, then (ka, kb, kc) is also a Pythagorean triple.

Fermat's assertion was discovered after his death, without justification, in the margin of a book that had belonged to him. He wrote in the margin that there was not enough space for his "truly remarkable demonstration." Consequently, this assertion became known as Fermat's Last Theorem. However, it would have been more appropriate to call it Fermat's Last Conjecture, since the truth or falsity of the assertion remained in question for approximately 350 years. In 1993, however, the British mathematician Andrew Wiles settled the conjecture by giving a truly remarkable proof. Therefore, Fermat's Last Theorem is, at last, a theorem.

Fermat's Last Theorem

For every integer n ≥ 3, there are no nonzero integers x, y, and z such that $x^n + y^n = z^n$.

The final two conjectures we mention concern primes. Although we have mentioned primes from time to time, we have not yet given a formal definition. We do this now. An integer p ≥ 2 is a prime if its only positive integer divisors are 1 and p. A Fermat number (yes, the same Fermat!) is an integer of the form $F_t = 2^{2^t} + 1$, where t is a nonnegative integer. The first five Fermat numbers are $F_0 = 3$, $F_1 = 5$, $F_2 = 17$, $F_3 = 257$, $F_4 = 65{,}537$, all of which are primes.

In 1640 Fermat wrote to many correspondents (including the famous mathematician Blaise Pascal) that he believed all such numbers (he did not call them Fermat numbers) to be prime, but that he was unable to prove this. Hence we have the following.

Fermat's Conjecture

Every Fermat number is a prime.

Almost a century later (in 1732), the famous mathematician Leonhard Euler showed that $F_5 = 4{,}294{,}967{,}297$ is divisible by 641, thereby disproving Fermat's conjecture. More specifically, Euler proved the following.

Euler's theorem

If p is a prime factor of $F_t$, then $p = 2^{t+1}k + 1$ for some positive integer k.

Letting t = 5 in Euler's theorem, we see that every prime factor of $F_5$ has the form 64k + 1. The first five primes of this form are 193, 257, 449, 577, and 641, the last of which divides $F_5$. In recent decades, other Fermat numbers have been studied and shown not to be prime. Indeed, many who have studied this subject now lean toward the opposite view (and conjecture): except for the Fermat numbers $F_0, F_1, \ldots, F_4$ (all of which Fermat observed to be primes), no Fermat number is prime.

The final conjecture we describe here dates back to 1742. The German mathematician Christian Goldbach conjectured that every even integer exceeding 2 is the sum of two primes. Of course, this is easy to see for small even integers. For example, 4 = 2 + 2, 6 = 3 + 3, 8 = 5 + 3, and 10 = 7 + 3 = 5 + 5. The major difference between this conjecture and the three preceding ones is that this conjecture has never been settled. Therefore, we conclude with the following.

Goldbach's conjecture

Every even integer at least 4 is the sum of two prime numbers.
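The conjectures of this section lend themselves to small computational experiments. The following Python sketch is an illustration only (the function names and the search limits are our own choices, not part of the text): it replays the reverse-and-add process for 27 and 59, checks that the first five Fermat numbers are prime while 641 divides the sixth, and looks for a Goldbach decomposition of every even integer from 4 to 1000. Such checks can support or refute a conjecture on a finite range, but they never prove it.

```python
def is_palindrome(n):
    """Return True if the positive integer n reads the same in both directions."""
    digits = str(n)
    return digits == digits[::-1]


def reverse_and_add(n, max_steps=100):
    """Repeat 'reverse the digits and add', starting from n.

    Returns the list of values visited. The list ends at the first
    palindrome reached, or after max_steps additions if none is found.
    """
    values = [n]
    while not is_palindrome(values[-1]) and len(values) <= max_steps:
        values.append(values[-1] + int(str(values[-1])[::-1]))
    return values


def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def fermat_number(t):
    """Return F_t = 2^(2^t) + 1."""
    return 2 ** (2 ** t) + 1


def goldbach_pair(n):
    """Return primes (p, q) with p + q = n and p <= q, or None if none exist."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None


print(reverse_and_add(27))
print(reverse_and_add(59))
print([is_prime(fermat_number(t)) for t in range(5)])
print(fermat_number(5) % 641)
print(all(goldbach_pair(n) is not None for n in range(4, 1001, 2)))
```

The cap max_steps is there because nobody knows whether the process terminates for every starting value.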

7.2 Verification of Quantified Statements

Many (indeed, most) of the statements we have encountered are quantified statements. Indeed, for an open sentence P(x) over a domain S, we have often considered a quantified statement with a universal quantifier, namely

∀x ∈ S, P(x): For every x ∈ S, P(x). or If x ∈ S, then P(x).

or a quantified statement with an existential quantifier, namely

∃x ∈ S, P(x): There exists x ∈ S such that P(x).

Recall that ∀x ∈ S, P(x) is a true statement if P(x) is true for every x ∈ S, while ∃x ∈ S, P(x) is a true statement if P(x) is true for at least one x ∈ S. Let's review these once again.

Example 7.1

Let S = {1, 3, 5, 7} and consider

P(n): $n^2 + n + 1$ is prime.

for each n ∈ S. Then both

∀n ∈ S, P(n): For every n ∈ S, $n^2 + n + 1$ is prime.

and

∃n ∈ S, P(n): There exists n ∈ S such that $n^2 + n + 1$ is prime.

are quantified statements. Since

P(1): $1^2 + 1 + 1 = 3$ is prime. is true
P(3): $3^2 + 3 + 1 = 13$ is prime. is true
P(5): $5^2 + 5 + 1 = 31$ is prime. is true
P(7): $7^2 + 7 + 1 = 57$ is prime. is false (57 = 3 · 19)

it follows that ∀n ∈ S, P(n) is false and ∃n ∈ S, P(n) is true. On the other hand, the statement

Q: 323 is prime.

is not a quantified statement, but Q is false (since 323 = 17 · 19 is not prime).
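Over a finite domain, the two quantifiers can be evaluated mechanically. The sketch below is an illustration in Python (the helper is_prime and the variable names are our own): the built-in all plays the role of ∀ and any plays the role of ∃ for the set S and the open sentence P(n) of Example 7.1.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


S = [1, 3, 5, 7]


def P(n):
    """Open sentence P(n): n^2 + n + 1 is prime."""
    return is_prime(n * n + n + 1)


for_all = all(P(n) for n in S)        # truth value of "for every n in S, P(n)"
there_exists = any(P(n) for n in S)   # truth value of "there exists n in S with P(n)"
counterexamples = [n for n in S if not P(n)]

print(for_all, there_exists, counterexamples)
```

A universal statement over a finite set is false exactly when the list of counterexamples is nonempty, which is the computational form of negating ∀ into ∃.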

Let P(x) be a statement for each x in a domain S. Recall that the negation of ∀x ∈ S, P(x) is

∼(∀x ∈ S, P(x)) ≡ ∃x ∈ S, ∼P(x)

and the negation of ∃x ∈ S, P(x) is

∼(∃x ∈ S, P(x)) ≡ ∀x ∈ S, ∼P(x).

Consider again P(n): $n^2 + n + 1$ is prime. from Example 7.1, which is a statement for each n in S = {1, 3, 5, 7}. The negation of ∀n ∈ S, P(n) is

∃n ∈ S, ∼P(n): There exists n ∈ S such that $n^2 + n + 1$ is not prime.

which is true since 7 ∈ S but $7^2 + 7 + 1 = 57$ is not prime. On the other hand, the negation of ∃n ∈ S, P(n) is

∀n ∈ S, ∼P(n): If n ∈ S, then $n^2 + n + 1$ is not prime.

which is false since, for example, 1 ∈ S and $1^2 + 1 + 1 = 3$ is prime.

In Section 2.10, we began discussing quantified statements that contain two quantifiers. We continue that discussion here.

Example 7.2

Consider

P(s, t): $2^s + 3^t$ is prime.

where s is a positive even integer and t is a positive odd integer. If we let S denote the set of positive even integers and T the set of positive odd integers, then the quantified statement

∃s ∈ S, ∃t ∈ T, P(s, t).

can be expressed in words as

There exist a positive even integer s and a positive odd integer t such that $2^s + 3^t$ is prime. (7.1)

Statement (7.1) is true since P(2, 1): $2^2 + 3^1 = 7$ is prime. is true. On the other hand, the quantified statement

∀s ∈ S, ∀t ∈ T, P(s, t).

can be expressed in words as

For every positive even integer s and every positive odd integer t, $2^s + 3^t$ is prime. (7.2)

Statement (7.2) is false since P(6, 3): $2^6 + 3^3 = 91$ is prime. is false, as 91 = 7 · 13 is not prime.
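Statements with two quantifiers can be explored in the same way, by searching over pairs. Since S and T are infinite, the Python sketch below only examines a small window of each (the window bounds and all names are our own assumptions). A single witness settles the existential statement (7.1), and a single counterexample settles the universal statement (7.2).

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def P(s, t):
    """Open sentence P(s, t): 2^s + 3^t is prime."""
    return is_prime(2 ** s + 3 ** t)


S_window = range(2, 7, 2)   # the positive even integers 2, 4, 6
T_window = range(1, 4, 2)   # the positive odd integers 1, 3

pairs = [(s, t) for s in S_window for t in T_window]
witnesses = [(s, t) for (s, t) in pairs if P(s, t)]
counterexamples = [(s, t) for (s, t) in pairs if not P(s, t)]

print("witnesses for (7.1):", witnesses)
print("counterexamples to (7.2):", counterexamples)
```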

Let P(s, t) be an open sentence, where the domain of the variable s is S and the domain of the variable t is T. Recall that the negations of the quantified statements ∃s ∈ S, ∃t ∈ T, P(s, t) and ∀s ∈ S, ∀t ∈ T, P(s, t) are

∼(∃s ∈ S, ∃t ∈ T, P(s, t)) ≡ ∀s ∈ S, ∀t ∈ T, ∼P(s, t)

and

∼(∀s ∈ S, ∀t ∈ T, P(s, t)) ≡ ∃s ∈ S, ∃t ∈ T, ∼P(s, t).

Hence the negation of statement (7.1) is

For every positive even integer s and every positive odd integer t, $2^s + 3^t$ is not prime.

which is a false statement. On the other hand, the negation of statement (7.2) is

There exist a positive even integer s and a positive odd integer t such that $2^s + 3^t$ is not prime.

which is a true statement.

We have seen that quantified statements may also contain different kinds of quantifiers. For example, the definition of an even integer implies that for every even integer n, there exists an integer k such that n = 2k. There is another mathematical symbol with which you should be familiar. The symbol ∋ denotes the phrase such that (although some mathematicians simply write s.t. for "such that"). For example, let S denote the set of even integers. Then

∀n ∈ S, ∃k ∈ Z ∋ n = 2k. (7.3)

states: For every even integer n, there exists an integer k such that n = 2k. This statement can be reworded as: If n is an even integer, then n = 2k for some integer k. If we interchange the two quantifiers in (7.3), we obtain, in words: There exists an even integer n such that for every integer k, n = 2k. This statement can also be reworded as: There exists an even integer n such that n = 2k for every integer k. This statement can be expressed in symbols as

∃n ∈ S, ∀k ∈ Z, n = 2k. (7.4)

Certainly, statements (7.3) and (7.4) say something totally different. Indeed, (7.3) is true and (7.4) is false. Another example is

For every real number x, there exists an integer n such that |x − n| < 1. (7.5)

This statement can also be expressed as: If x is a real number, then there exists an integer n such that |x − n| < 1. To express (7.5) in symbols, let P(x, n): |x − n| < 1. where the domain of the variable x is R and the domain of the variable n is Z. Thus (7.5) can be expressed in symbols as ∀x ∈ R, ∃n ∈ Z, P(x, n). Statement (7.5) is true, as we now verify. In the proof of the following result, we refer to the ceiling ⌈x⌉ of a real number x, which is the smallest integer greater than or equal to x.

Result 7.3 For every real number x, there exists an integer n such that |x − n| < 1.

Proof Let x be a real number. If we let n = ⌈x⌉, then |x − n| = |x − ⌈x⌉| = ⌈x⌉ − x < 1.
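The proof of Result 7.3 produces the integer n from x, which is what the order ∀x, ∃n permits. As an illustrative check (not a proof), the Python sketch below tests the choice n = ⌈x⌉ on a sample of rational numbers, and also shows that no single integer n works for every x in the sample, which is what the interchanged order ∃n, ∀x would require. The sample and the names are our own assumptions.

```python
import math
from fractions import Fraction

# A finite sample of rational numbers standing in for the real numbers.
samples = [Fraction(k, 7) for k in range(-50, 51)]

# For each x, the integer n = ceil(x) depends on x, as in the proof of Result 7.3.
distances = [abs(x - math.ceil(x)) for x in samples]
choice_works = all(d < 1 for d in distances)

# With the quantifiers interchanged, one n would have to work for every x.
no_single_n = all(
    any(abs(x - n) >= 1 for x in samples)
    for n in range(-10, 11)
)

print(choice_works, no_single_n)
```

Exact rational arithmetic is used here so that the comparison with 1 is not affected by floating-point rounding.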

Another example of a quantified statement containing two different quantifiers is

There exists a positive even integer m such that for every positive integer n, $\left|\frac{1}{m} - \frac{1}{n}\right| \le \frac{1}{2}$. (7.6)

Let S denote the set of positive even integers and let

P(m, n): $\left|\frac{1}{m} - \frac{1}{n}\right| \le \frac{1}{2}$.

where the domain of the variable m is S and the domain of the variable n is N. Thus (7.6) can be expressed in symbols as ∃m ∈ S, ∀n ∈ N, P(m, n). The truth of statement (7.6) is now verified.

Result 7.4 There exists a positive even integer m such that for every positive integer n, $\left|\frac{1}{m} - \frac{1}{n}\right| \le \frac{1}{2}$.

Proof Consider m = 2. Let n be a positive integer. We consider three cases.

Case 1. n = 1. Then $\left|\frac{1}{m} - \frac{1}{n}\right| = \left|\frac{1}{2} - \frac{1}{1}\right| = \frac{1}{2}$.

Case 2. n = 2. Then $\left|\frac{1}{m} - \frac{1}{n}\right| = \left|\frac{1}{2} - \frac{1}{2}\right| = 0 < \frac{1}{2}$.

Case 3. n ≥ 3. Then $\left|\frac{1}{m} - \frac{1}{n}\right| = \left|\frac{1}{2} - \frac{1}{n}\right| = \frac{1}{2} - \frac{1}{n} < \frac{1}{2}$.

Thus $\left|\frac{1}{2} - \frac{1}{n}\right| \le \frac{1}{2}$ for all n ∈ N.

Let P(s, t) be an open sentence, where the domain of the variable s is S and the domain of the variable t is T. The negation of the quantified statement ∀s ∈ S, ∃t ∈ T, P(s, t) is

∼(∀s ∈ S, ∃t ∈ T, P(s, t)) ≡ ∃s ∈ S, ∼(∃t ∈ T, P(s, t)) ≡ ∃s ∈ S, ∀t ∈ T, ∼P(s, t);

while the negation of the quantified statement ∃s ∈ S, ∀t ∈ T, P(s, t) is

∼(∃s ∈ S, ∀t ∈ T, P(s, t)) ≡ ∀s ∈ S, ∼(∀t ∈ T, P(s, t)) ≡ ∀s ∈ S, ∃t ∈ T, ∼P(s, t).

Consequently, the negation of statement (7.5) is

There exists a real number x such that for every integer n, |x − n| ≥ 1.

This statement is therefore false. The negation of statement (7.6) is

For every positive even integer m, there exists a positive integer n such that $\left|\frac{1}{m} - \frac{1}{n}\right| > \frac{1}{2}$.

This statement is also false.
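For a statement of the form ∃m, ∀n, a candidate m must survive every n. The Python sketch below is a finite experiment under assumed bounds (even m up to 20, n up to 1000; the names are hypothetical): it tests which candidates m satisfy the inequality of (7.6) for every n in the range. A finite search like this cannot replace the three-case proof of Result 7.4, but it shows why m = 2 is the natural choice.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def P(m, n):
    """Open sentence P(m, n): |1/m - 1/n| <= 1/2, evaluated exactly."""
    return abs(Fraction(1, m) - Fraction(1, n)) <= HALF


N = 1000                       # largest n tested (an arbitrary bound)
even_candidates = range(2, 21, 2)

# Candidates m for which P(m, n) holds for every tested n.
surviving = [m for m in even_candidates
             if all(P(m, n) for n in range(1, N + 1))]

# For the other candidates, record the first n at which P(m, n) fails.
first_failure = {m: next(n for n in range(1, N + 1) if not P(m, n))
                 for m in even_candidates if m not in surviving}

print(surviving)
print(first_failure)
```

Every even m ≥ 4 already fails at n = 1, since then |1/m − 1| ≥ 3/4.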

Consider the following statement, which has more than two quantifiers.

For every positive real number ε, there exists a positive real number δ such that for every real number x, |x| < δ implies that |2x| < ε. (7.7)

If we let P(x, δ): |x| < δ and Q(x, ε): |2x| < ε, where the domain of the variables ε and δ is R+ and the domain of the variable x is R, then (7.7) can be expressed in symbols as

∀ε ∈ R+, ∃δ ∈ R+, ∀x ∈ R, P(x, δ) ⇒ Q(x, ε).

Statement (7.7) is in fact true, which we now verify.

Result 7.5 For every positive real number ε, there exists a positive real number δ such that if x is a real number with |x| < δ, then |2x| < ε.

Proof Let ε be a positive real number. Now choose δ = ε/2. Let x be a real number with |x| < δ = ε/2. Then |2x| = 2|x| < 2(ε/2) = ε, as desired.

7.3 Testing Statements

We now turn our attention to the main topic of this chapter. For a given statement whose truth value is not provided to us, our task is to determine the truth or falsity of the statement and, in addition, to show that our conclusion is correct by proving or disproving the statement, as appropriate.

Example 7.6

Prove or disprove: There is a real number solution of the equation $x^6 + 2x^2 + 1 = 0$.

Strategy

Observe that $x^6$ and $x^2$ are even powers of x. Thus if x is any real number, then $x^6 \ge 0$ and $x^2 \ge 0$, and so $2x^2 \ge 0$. Adding 1 to $x^6 + 2x^2$ shows that $x^6 + 2x^2 + 1 \ge 1$. Hence it is impossible for $x^6 + 2x^2 + 1$ to be 0. These thoughts lead us to our solution. We begin by informing the reader that the statement is false, so that the reader knows what we are trying to do.

Solution to Example 7.6

The statement is false. Let x ∈ R. Since $x^6 \ge 0$ and $x^2 \ge 0$, it follows that $x^6 + 2x^2 + 1 \ge 1$ and so $x^6 + 2x^2 + 1 \ne 0$.

For the preceding example, we wrote "Strategy" rather than "Proof Strategy" for two reasons: (1) Since the statement may be false, there may be no proof in this case. (2) We are essentially "thinking out loud," trying to convince ourselves whether the statement is true or false. Of course, if the statement turns out to be true, then our strategy may very well turn into a proof strategy.

Example 7.7

Prove or disprove: Let x, y, z ∈ Z. Then two of the integers x, y and z have the same parity.

Strategy

For any three integers, either two are even or two are odd. So it certainly appears that the statement is true. The only question seems to be whether what we said in the first sentence is convincing enough for all readers. We try another approach.

Solution to Example 7.7 Proof

The statement is true. Consider x and y. If x and y are of the same parity, then the proof is complete. Thus we may assume that x and y are of opposite parity, say x is even and y is odd. If z is even, then x and z are of the same parity; while if z is odd, then y and z are of the same parity. Of course, the preceding proof could also have been done by cases.

Example 7.8

Prove or disprove: Let A, B and C be sets. If A × C = B × C, then A = B.

Strategy

The elements of the set A × C are ordered pairs of elements; that is, they have the form (x, y), where x ∈ A and y ∈ C. Let (x, y) ∈ A × C. If A × C = B × C, then it follows that (x, y) must also be an element of B × C. This says that x ∈ B and y ∈ C. Conversely, if (x, y) ∈ B × C, then (x, y) ∈ A × C, which implies that x ∈ A as well. These observations certainly seem to suggest that under these conditions it should be possible to show that A = B. However, this argument depends on A × C containing an element (x, y). Could A × C contain no elements? If A or C is empty, this would happen. However, if C ≠ ∅ and A × C = ∅, then A must be empty. But then B × C = A × C = ∅ would mean that B must be empty as well, and so A = B. Hence the only remaining possibility is C = ∅, and this suggests a different answer.

Solution to Example 7.8

The statement is false. Let A = {1}, B = {2} and C = ∅. Then A × C = B × C = ∅, but A ≠ B. Thus these sets A, B and C form a counterexample.

In some cases we may consider modifying a false statement so that the revised statement is true. Our preceding discussion seems to suggest that the statement would be true if the set C were required to be nonempty.

Result 7.9 Let A, B and C be sets, where C ≠ ∅. If A × C = B × C, then A = B.

Proof Assume that A × C = B × C. Since C ≠ ∅, the set C contains some element c. Let x ∈ A. Then (x, c) ∈ A × C. Since A × C = B × C, it follows that (x, c) ∈ B × C. Hence x ∈ B and so A ⊆ B. By a similar argument, it follows that B ⊆ A. Thus A = B.
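The role of the empty set in Example 7.8 and Result 7.9 can be seen concretely with small finite sets. In the Python sketch below (an illustration; the helper cartesian and the sample sets are our own), the counterexample with C = ∅ is reproduced, and for a nonempty C the set A is recovered from A × C as the set of first coordinates, which mirrors the argument in the proof of Result 7.9.

```python
from itertools import product


def cartesian(X, Y):
    """Return the Cartesian product X x Y as a set of ordered pairs."""
    return set(product(X, Y))


# The counterexample of Example 7.8: C is empty.
A, B, C = {1}, {2}, set()
same_product = cartesian(A, C) == cartesian(B, C)
print(same_product, A == B)

# With a nonempty C, the first coordinates of A x C give back A,
# so equal products force equal first factors.
C_nonempty = {"c"}
first_coordinates = {x for (x, _) in cartesian(A, C_nonempty)}
print(first_coordinates == A)
print(cartesian(A, C_nonempty) == cartesian(B, C_nonempty))
```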

Example 7.10

Prove or disprove: There exists a real number x such that $x^3 < x < x^2$.

Strategy

If there is a real number x such that $x^3 < x < x^2$, then this number is certainly not 0. Consequently, any real number x with this property is either positive or negative. If x > 0, then we can divide $x^3 < x < x^2$ by x, obtaining $x^2 < 1 < x$. However, if x > 1, then $x^2 > 1$. Thus there is no positive real number x for which $x^3 < x < x^2$. Hence any real number x satisfying $x^3 < x < x^2$ must be negative. Dividing $x^3 < x < x^2$ by x (and reversing the inequalities, since x < 0), we obtain $x^2 > 1 > x$, that is, $x < 1 < x^2$. Experimenting with some negative numbers tells us that any number less than −1 has the desired property.

Solution to Example 7.10

The statement is true.

Proof Consider x = −2. Then $x^3 = -8$ and $x^2 = 4$. Thus $x^3 < x < x^2$.
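The "experimenting with some negative numbers" step of the strategy can be automated. The Python sketch below is an illustration under assumptions of our own (a sample of rational numbers between −5 and 5 in steps of 1/4): it tests the double inequality on each sample value and compares the outcome with the condition x < −1 found in the strategy.

```python
from fractions import Fraction


def holds(x):
    """Return True if x^3 < x < x^2."""
    return x ** 3 < x < x ** 2


# Rational sample points -5, -4.75, ..., 4.75, 5 (exact arithmetic).
samples = [Fraction(k, 4) for k in range(-20, 21)]
solutions = [x for x in samples if holds(x)]

print(holds(-2))
print(all(x < -1 for x in solutions))
print(all(holds(x) == (x < -1) for x in samples))
```

For an existence statement such as Example 7.10, a single value found this way (here x = −2) is already a complete proof once it is verified by hand.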

Example 7.11

Prove or disprove: For every positive irrational number b there is an irrational number a with 0 < a < b.

Strategy

We start with a positive irrational number b. If this statement is true, we have to show that there is an irrational number a such that 0 < a < b. If we consider a = b/2, then surely 0 < a < b. The only question is whether b/2 is necessarily irrational. However, we have seen that b/2 is irrational (in Exercise 5.17 in Section 5.2).

Solution to Example 7.11

The statement is true.

Proof Let b be a positive irrational number. Now let a = b/2. Then 0 < a < b and a is irrational by Exercise 5.17 in Section 5.2.

Example 7.12

Prove or disprove: Every even integer is the sum of three distinct even integers.

Strategy

This statement can be reworded in a variety of ways. One rewording is: If n is an even integer, then there exist three distinct even integers a, b, and c such that n = a + b + c. What this statement does not say is that the sum of three distinct even integers is even; that is, we do not begin with three distinct even integers and show that their sum is even. We begin with an even integer n and ask whether we can find three distinct even integers a, b and c such that n = a + b + c. This is certainly true for n = 0 since 0 = (−2) + 0 + 2. It is also true for n = 2 since 2 = (−2) + 0 + 4. If n = 4, then 4 = (−2) + 2 + 4. This last example may suggest a general proof. For any even integer n, we can write n = 2 + (−2) + n. Certainly, n, 2 and −2 are even. But are they distinct? They are not distinct if n = 2 or n = −2. This provides a plan for a proof.

Solution to Example 7.12

The statement is true.

Proof Let n be an even integer. We show that n is the sum of three distinct even integers by considering the following three cases.

Case 1. n = 2. Observe that 2 = (−2) + 0 + 4.

Case 2. n = −2. Observe that −2 = (−4) + 0 + 2.

Case 3. n ≠ 2, −2. Then n = 2 + (−2) + n.

Example 7.13

Prove or disprove: Let k ∈ N. If $k^2 + 5k$ is odd, then $(k + 1)^2 + 5(k + 1)$ is odd.

Strategy

One idea that might occur to us is to assume that $k^2 + 5k$ is an odd integer, where k ∈ N, and see whether we can show that $(k + 1)^2 + 5(k + 1)$ is also odd. If $k^2 + 5k$ is odd, then we can write $k^2 + 5k = 2\ell + 1$ for some integer ℓ. Then

$(k + 1)^2 + 5(k + 1) = k^2 + 2k + 1 + 5k + 5 = (k^2 + 5k) + (2k + 6) = (2\ell + 1) + (2k + 6) = (2\ell + 2k + 6) + 1 = 2(\ell + k + 3) + 1$,

which is an odd integer, and we have a proof.

Solution to Example 7.13

The statement is true.

Proof Assume that $k^2 + 5k$ is an odd integer, where k ∈ N. Then $k^2 + 5k = 2\ell + 1$ for some integer ℓ. So

$(k + 1)^2 + 5(k + 1) = k^2 + 2k + 1 + 5k + 5 = (k^2 + 5k) + (2k + 6) = (2\ell + 1) + (2k + 6) = (2\ell + 2k + 6) + 1 = 2(\ell + k + 3) + 1$.

Since ℓ + k + 3 is an integer, $(k + 1)^2 + 5(k + 1)$ is an odd integer.
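An implication "if P(k), then Q(k)" fails only at a value of k where P(k) is true and Q(k) is false. The Python sketch below is a finite experiment, not a proof (the bound N and the names are our own assumptions): it searches for such a k in Example 7.13 and also counts how often the hypothesis is satisfied at all, since an implication whose hypothesis never holds is true for a rather uninformative reason.

```python
def f(k):
    """The expression k^2 + 5k from Example 7.13."""
    return k * k + 5 * k


def is_odd(m):
    return m % 2 == 1


N = 1000  # largest k tested (an arbitrary bound)

# The implication fails at k only if f(k) is odd and f(k + 1) is not.
failures = [k for k in range(1, N + 1)
            if is_odd(f(k)) and not is_odd(f(k + 1))]

# How many tested k satisfy the hypothesis "f(k) is odd"?
hypothesis_count = sum(1 for k in range(1, N + 1) if is_odd(f(k)))

print(failures)
print(hypothesis_count)
```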

Example 7.14

Prove or disprove: For every positive integer n, $n^2 + 5n$ is an odd integer.

Strategy

It seems reasonable to examine $n^2 + 5n$ for a few values of n. For n = 1, we have $n^2 + 5n = 1^2 + 5 \cdot 1 = 6$. We have already solved the problem! For n = 1, $n^2 + 5n$ is not an odd integer. We have found a counterexample.

Solution to example 7.14

The statement is false. For n = 1, $n^2 + 5n = 1^2 + 5 \cdot 1 = 6$, which is even. Thus n = 1 is a counterexample.

Looking back at Examples 7.13 and 7.14, we might wonder just what is going on. Certainly these two examples seem to be related. Perhaps the following thought occurs to us. For each positive integer n, let

P(n): The integer $n^2 + 5n$ is odd.

and consider the (quantified) statement

For every positive integer n, $n^2 + 5n$ is odd.

or, in symbols,

∀n ∈ N, P(n). (7.8)

We might ask whether (7.8) is true. Because of the domain, a proof by induction seems appropriate. In fact, the statement in Example 7.13 is the inductive step in an induction proof of (7.8). By Example 7.13 the inductive step is true. On the other hand, statement (7.8) is false since n = 1 is a counterexample. This underscores the importance of verifying both the base step and the inductive step in an induction proof. Returning to Example 7.13, we can show (by a proof by cases) that $k^2 + 5k$ is even for every k ∈ N, which would give a vacuous proof of the statement in Example 7.13.

In this chapter we have discussed analyzing statements, particularly understanding statements, determining whether they are true or false, and proving or disproving them. All of the statements we analyzed were provided to us. But how do we obtain statements to analyze for ourselves? This is an important question and concerns the creative aspect of mathematics: how new mathematics is discovered. Obviously there is no rule or formula for creativity, but creating new statements often comes from studying old ones. Let's illustrate how we might create statements to analyze. In Exercise 4.6 you were asked to prove the following:

Let a ∈ Z. If 3 | 2a, then 3 | a.

(7.9)

What other statements does this suggest? For example, is the converse true? (The answer is yes, but the converse is not very interesting.) Is (7.9) true if 3 and 2 are interchanged? Which integers can replace 2 in (7.9) and give a true statement? That is, for which positive integers k is it true that if 3 | ka, then 3 | a? This is certainly true for k = 1, and we know that it is true for k = 2. It is not true for k = 3; that is, it is not true that if 3 | 3a, then 3 | a. The integer a = 1 is a counterexample. On the other hand, it is possible to prove that if 3 | 4a, where a ∈ Z, then 3 | a. What we are trying to do is to extend the result in (7.9) so that we have a result of the type:

Let a ∈ Z. If 3 | ka, then 3 | a.

(7.10)

for a fixed integer k greater than 2. Better yet, we want to find a set S of positive integers such that:

Let a ∈ Z. If 3 | ka, where k ∈ S, then 3 | a.

(7.11)

Certainly 2 ∈ S. The result (7.9) then becomes a special case and a consequence of (7.11). For this reason, (7.11) is called a generalization of (7.9). Ideally, we would like S to have the additional property that (7.11) is true if k ∈ S, while (7.11) is false if k ∉ S. So we are seeking a set S of positive integers such that the following is true:

Let a ∈ Z. Then 3 | ka implies that 3 | a if and only if k ∈ S.

If we succeed in finding this set S, then we can start over by replacing 3 in (7.9) with some other positive integer. In mathematics, a new result is often obtained by looking at an old result in a new way and extending it to obtain a generalization of the old result. This is what often happens: today's theorem becomes tomorrow's corollary.

We close this chapter with a quiz. Solutions are given after the quiz.
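A search over small cases is one way to arrive at a conjecture for the set S. The Python sketch below is an experiment under bounds of our own choosing (k from 1 to 12 and a from −30 to 30; the function name is hypothetical): for each k it tests whether "3 | ka implies 3 | a" holds for every a in the window. The output only suggests what S might be; deciding whether the suggested description is correct still requires a proof.

```python
def implication_holds(k, a_values):
    """Check 'if 3 | ka, then 3 | a' for every a in a_values."""
    return all((k * a) % 3 != 0 or a % 3 == 0 for a in a_values)


a_window = range(-30, 31)      # integers a tested (an arbitrary window)
k_window = range(1, 13)        # positive integers k tested

candidates = [k for k in k_window if implication_holds(k, a_window)]
rejected = [k for k in k_window if k not in candidates]

print("k for which no counterexample was found:", candidates)
print("k with a counterexample in the window:", rejected)
```

The rejected values are exactly the multiples of 3 in the window, for which a = 1 is a counterexample, as observed in the text for k = 3.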

Quiz

Prove or disprove each of the following statements.

1. If n is a positive integer and s is an irrational number, then n/s is an irrational number.
2. For every integer b, there exists a positive integer a such that |a − |b|| ≤ 1.
3. If x and y are integers of the same parity, then xy and $(x + y)^2$ are of the same parity.
4. Let a, b ∈ Z. If $6 \nmid ab$, then either (1) $2 \nmid a$ and $3 \nmid b$ or (2) $3 \nmid a$ and $2 \nmid b$.
5. For every positive integer n, $2^{2^n} \ge 4^{n!}$.
6. If A, B and C are sets, then (A − B) ∪ (A − C) = A − (B ∪ C).
7. Let n ∈ N. If (n + 1)(n + 4) is odd, then (n + 1)(n + 4) + 3n is odd.
8. (a) There exist distinct rational numbers a and b such that (a − 1)(b − 1) = 1. (b) There exist distinct rational numbers a and b such that $\frac{1}{a} + \frac{1}{b} = 1$.
9. Let a, b, c ∈ Z. If every two of a, b and c are of the same parity, then a + b + c is even.
10. If n is a nonnegative integer, then 5 divides $2 \cdot 4^n + 3 \cdot 9^n$.

Quiz Answers

1. The statement is true.

Proof Assume, to the contrary, that there exist a positive integer n and an irrational number s such that n/s is a rational number. Then n/s = a/b, where a, b ∈ Z and a, b ≠ 0. Hence s = nb/a, where nb, a ∈ Z and a ≠ 0. Thus s is rational, which produces a contradiction.

2. The statement is true.

Proof Let b ∈ Z. Now let a = |b| + 1. Then a ∈ N and |a − |b|| = |(|b| + 1) − |b|| = 1 ≤ 1.

3. The statement is false. Observe that x = 1 and y = 3 are of the same parity. Then xy = 3 and $(x + y)^2 = 16$ are of opposite parity. Thus x = 1 and y = 3 produce a counterexample.

4. The statement is false. Let a = b = 2. Then ab = 4 and so $6 \nmid ab$. Since 2 | a and 2 | b, both (1) and (2) are false. Thus a = b = 2 is a counterexample.

5. The statement is false. For n = 3, $2^{2^n} = 2^{2^3} = 2^8 = 256$, while $4^{n!} = 4^{3!} = 4^6 = 4096$. Thus $2^{2^3} < 4^{3!}$ and so n = 3 is a counterexample.

6. The statement is false. Let A = {1, 2, 3}, B = {2} and C = {3}. Then B ∪ C = {2, 3}. So A − B = {1, 3}, A − C = {1, 2} and A − (B ∪ C) = {1}. Thus (A − B) ∪ (A − C) = {1, 2, 3} ≠ A − (B ∪ C). Hence A = {1, 2, 3}, B = {2} and C = {3} form a counterexample.

7. The statement is true.

Proof Let n ∈ N and consider (n + 1)(n + 4). We show that (n + 1)(n + 4) is even, thereby giving a vacuous proof. There are two cases.

Case 1. n is even. Then n = 2k for some integer k. So (n + 1)(n + 4) = (2k + 1)(2k + 4) = 2(2k + 1)(k + 2). Since (2k + 1)(k + 2) ∈ Z, it follows that (n + 1)(n + 4) is even.

Case 2. n is odd. Then n = 2ℓ + 1 for some integer ℓ. So (n + 1)(n + 4) = (2ℓ + 2)(2ℓ + 5) = 2(ℓ + 1)(2ℓ + 5). Since (ℓ + 1)(2ℓ + 5) ∈ Z, it follows that (n + 1)(n + 4) is even.
8. (a) The statement is true. Proof Let a = 3 and $b = \frac{3}{2}$. Then $(a - 1)(b - 1) = 2 \cdot \frac{1}{2} = 1$.
(b) The statement is true. Proof Let $a = \frac{1}{2}$ and b = −1. Then $\frac{1}{a} + \frac{1}{b} = \frac{1}{1/2} + \frac{1}{-1} = 2 - 1 = 1$.

Proof Analysis Observe that if a and b are two (distinct) rational numbers that satisfy $\frac{1}{a} + \frac{1}{b} = 1$, then $\frac{a + b}{ab} = 1$ and so a + b = ab. Thus ab − a − b = 0, which is equivalent to ab − a − b + 1 = 1 and hence to (a − 1)(b − 1) = 1. Therefore, two distinct rational numbers a and b satisfy (a − 1)(b − 1) = 1 if and only if a and b satisfy $\frac{1}{a} + \frac{1}{b} = 1$ if and only if a and b satisfy a + b = ab.
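The chain of equivalences in the analysis can be spot-checked with exact rational arithmetic. The following is only a sketch: the helper name `satisfies_all` is ours, not the book's, and it tests the two pairs used in the answers to 8(a) and 8(b) plus one pair that satisfies none of the conditions.

```python
from fractions import Fraction


def satisfies_all(a, b):
    """Return the truth values of the three conditions from the analysis."""
    return (
        (a - 1) * (b - 1) == 1,   # (a - 1)(b - 1) = 1
        1 / a + 1 / b == 1,       # 1/a + 1/b = 1
        a + b == a * b,           # a + b = ab
    )


# The pairs used in the answers to 8(a) and 8(b).
pairs = [(Fraction(3), Fraction(3, 2)), (Fraction(1, 2), Fraction(-1))]
results = [satisfies_all(a, b) for a, b in pairs]

# A pair of distinct rationals that fails every condition.
non_example = satisfies_all(Fraction(2), Fraction(3))

print(results)
print(non_example)
```

For each pair the three conditions hold or fail together, as the analysis predicts.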

9. The statement is false. Let a = 1, b = 3 and c = 5. Then every two of a, b and c have the same parity; nevertheless, a + b + c = 9 is odd. Therefore, a = 1, b = 3 and c = 5 provide a counterexample.
10. The statement is true. Proof We proceed by induction. For n = 0, $2 \cdot 4^n + 3 \cdot 9^n = 2 \cdot 1 + 3 \cdot 1 = 5$. So $5 \mid (2 \cdot 4^0 + 3 \cdot 9^0)$ and the statement holds for n = 0. Assume that $5 \mid (2 \cdot 4^k + 3 \cdot 9^k)$ for a nonnegative integer k. We show that $5 \mid (2 \cdot 4^{k+1} + 3 \cdot 9^{k+1})$. Since $5 \mid (2 \cdot 4^k + 3 \cdot 9^k)$, it follows that $2 \cdot 4^k + 3 \cdot 9^k = 5x$ for some integer x. Hence $2 \cdot 4^k = 5x - 3 \cdot 9^k$. Thus
$$2 \cdot 4^{k+1} + 3 \cdot 9^{k+1} = 4(2 \cdot 4^k) + 3 \cdot 9^{k+1} = 4(5x - 3 \cdot 9^k) + 3 \cdot 9^{k+1} = 20x - 12 \cdot 9^k + 27 \cdot 9^k = 20x + 15 \cdot 9^k = 5(4x + 3 \cdot 9^k).$$
Since $4x + 3 \cdot 9^k \in \mathbf{Z}$, it follows that $5 \mid (2 \cdot 4^{k+1} + 3 \cdot 9^{k+1})$. By the Principle of Mathematical Induction, 5 divides $2 \cdot 4^n + 3 \cdot 9^n$ for every nonnegative integer n.
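A numerical check is no substitute for the induction proof, but it is a quick way to catch an algebra slip. The sketch below (function names are ours) tests the divisibility claim for the first 200 values of n and confirms the identity behind the inductive step, namely that the (k+1)-st term equals four times the k-th term plus $15 \cdot 9^k$.

```python
def divisible_by_5(n):
    """Check that 5 divides 2*4^n + 3*9^n for one value of n."""
    return (2 * 4**n + 3 * 9**n) % 5 == 0


def step_identity_holds(k):
    """Check 2*4^(k+1) + 3*9^(k+1) == 4*(2*4^k + 3*9^k) + 15*9^k."""
    left = 2 * 4**(k + 1) + 3 * 9**(k + 1)
    right = 4 * (2 * 4**k + 3 * 9**k) + 15 * 9**k
    return left == right


all_divisible = all(divisible_by_5(n) for n in range(200))
all_steps_ok = all(step_identity_holds(k) for k in range(200))
print(all_divisible, all_steps_ok)
```

Python integers have arbitrary precision, so the large powers are computed exactly.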


EXERCISES FOR CHAPTER 7
Section 7.1: Conjectures in Mathematics
7.1. Consider the following sequence of equalities:
1 = 0 + 1
2 + 3 + 4 = 1 + 8
5 + 6 + 7 + 8 + 9 = 8 + 27
10 + 11 + 12 + 13 + 14 + 15 + 16 = 27 + 64
(a) What is the next equality in this sequence? (b) What conjecture is suggested by these equalities? (c) Prove the conjecture in (b) by induction.
7.2. Consider the following statements:
$(1 + 2)^2 - 1^2 = 2^3$
$(1 + 2 + 3)^2 - (1 + 2)^2 = 3^3$
$(1 + 2 + 3 + 4)^2 - (1 + 2 + 3)^2 = 4^3$
(a) Based on the three statements given above, what is the next statement suggested by these? (b) What conjecture is suggested by these statements? (c) Verify the conjecture in (b).
7.3. A sequence $\{a_n\}$ of real numbers is defined recursively by $a_1 = 2$ and, for n ≥ 2,
$$a_n = \frac{2 + 1 \cdot a_1^2 + 2 \cdot a_2^2 + \cdots + (n - 1) a_{n-1}^2}{n}.$$

(a) Determine $a_2$, $a_3$ and $a_4$. (b) Clearly, $a_n$ is a rational number for every n ∈ N. Based on the information in (a), however, what conjecture does this suggest?
7.4. The German mathematician Christian Goldbach is known for a conjecture he made concerning prime numbers. We refer to this conjecture as Conjecture A.
Conjecture A Every even integer that is at least 4 is the sum of two prime numbers.
Goldbach made two other conjectures concerning prime numbers.
Conjecture B Every integer that is at least 6 is the sum of three prime numbers.
Conjecture C Every odd integer that is at least 9 is the sum of three odd primes.
Prove that the truth of one or more of these three conjectures implies the truth of one or two of the other conjectures.
7.5. By an ordered partition of an integer n ≥ 2 is meant a sequence of positive integers whose sum is n. For example, the ordered partitions of 3 are 3, 1 + 2, 2 + 1, 1 + 1 + 1. (a) Determine the ordered partitions of 4. (b) Make a conjecture concerning the number of ordered partitions of an integer n ≥ 2.
7.6. Two recursively defined sequences $\{a_n\}$ and $\{b_n\}$ of positive integers have the same recurrence relation, namely $a_n = 2a_{n-1} + a_{n-2}$ and $b_n = 2b_{n-1} + b_{n-2}$ for n ≥ 3. The initial values for $\{a_n\}$ are $a_1 = 1$ and $a_2 = 3$, while the initial values for $\{b_n\}$ are $b_1 = 1$ and $b_2 = 2$. (a) Determine $a_3$ and $a_4$. (b) Determine whether the following is true or false: Conjecture: $a_n = 2^{n-2} \cdot n + 1$ for every integer n ≥ 2.

(c) Determine $b_3$ and $b_4$. (d) Determine whether the following is true or false: Conjecture: $b_n = \frac{(1 + \sqrt{2})^n - (1 - \sqrt{2})^n}{2\sqrt{2}}$ for every integer n ≥ 2.

7.7. We know that 1 + 2 + 3 = 1 · 2 · 3; that is, there exist three positive integers whose sum equals their product. Prove or disprove (a) and (b). (a) There exist four positive integers whose sum equals their product. (b) There exist five positive integers whose sum equals their product. (c) What conjecture does this suggest to you?
7.8. Observe that 3 = 1 + 2, 6 = 1 + 2 + 3, 9 = 4 + 5, and 12 = 3 + 4 + 5. (a) Show that 13 and 14 can each be expressed as the sum of two or more consecutive positive integers. (b) Make a conjecture as to which integers n ≥ 3 can be expressed as the sum of two or more consecutive positive integers. (c) Prove your conjecture in (b). [Note that every positive integer n can be expressed as $n = 2^r s$, where r is a nonnegative integer and s is an odd positive integer.]
7.9. An m × n chessboard has m ≥ 2 rows, n ≥ 2 columns and a total of mn squares. Two squares are adjacent if they belong to the same row or column and no square lies strictly between them. Make a conjecture as to the integers m and n for which it is possible to number the squares from 1 to mn such that consecutively numbered squares are adjacent, as are the squares numbered 1 and mn.

Section 7.2: Verification of Quantified Claims
7.10. (a) Express the following quantified statement in symbols: For every odd integer n, the integer 3n + 1 is even. (b) Prove that the statement in (a) is true.
7.11. (a) Express the following quantified statement in symbols: There exists an even positive integer n such that $3n + 2^{n-2}$ is odd. (b) Prove that the statement in (a) is true.
7.12. (a) Express the following quantified statement in symbols: For every positive integer n, the integer $n^{n-1}$ is even. (b) Show that the statement in (a) is false.
7.13. (a) Express the following quantified statement in symbols: There exists an integer n such that $3n^2 - 5n + 1$ is an even integer. (b) Show that the statement in (a) is false.
7.14. (a) Express the following quantified statement in symbols: For every integer n ≥ 2, there exists an integer m such that n < m < 2n. (b) Prove that the statement in (a) is true.
7.15. (a) Express the following quantified statement in symbols: There exists an integer n such that m(n − 3) < 1 for every integer m. (b) Prove that the statement in (a) is true.
7.16. (a) Express the following quantified statement in symbols: For every integer n, there exists an integer m such that (n − 2)(m − 2) > 0. (b) Express in symbols the negation of the statement in (a). (c) Show that the statement in (a) is false.
7.17. (a) Express the following quantified statement in symbols: There exists a positive integer n such that −nm < 0 for every integer m.


(b) Express in symbols the negation of the statement in (a). (c) Show that the statement in (a) is false.
7.18. (a) Express the following quantified statement in symbols: For every positive integer a, there exists an integer b with |b| < a such that |bx| < a for every real number x. (b) Prove that the statement in (a) is true.
7.19. (a) Express the following quantified statement in symbols: For every real number x, there exist integers a and b such that a ≤ x ≤ b and b − a = 1. (b) Prove that the statement in (a) is true.
7.20. (a) Express the following quantified statement in symbols: There exists an integer n such that for every two real numbers x and y, $x^2 + y^2 \ge n$. (b) Prove that the statement in (a) is true.
7.21. (a) Express the following quantified statement in symbols: For every even integer a and every odd integer b, there exists a rational number c such that either a < c < b or b < c < a. (b) Prove that the statement in (a) is true.
7.22. (a) Express the following quantified statement in symbols: There exist two integers a and b such that for every positive integer n, $a < \frac{1}{n} < b$. (b) Prove that the statement in (a) is true.

7.23. (a) Express the following quantified statement in symbols: There exist odd integers a, b and c such that a + b + c = 1. (b) Prove that the statement in (a) is true.
7.24. (a) Express the following quantified statement in symbols: For every three odd integers a, b and c, their product abc is odd. (b) Prove that the statement in (a) is true.
7.25. Consider the following statement. R: There exists a real number L such that for every positive real number e, there exists a positive real number d such that if x is a real number with |x| < d, then |3x − L| < e. (a) Use P(x, d): |x| < d and Q(x, L, e): |3x − L| < e to express the statement R in symbols. (b) Prove that the statement R is true.
7.26. Prove the following statement. For every positive real number a and every positive rational number b, there exist a real number c and an irrational number d such that ac + bd = 1.
7.27. Prove the following statement. For every integer a, there exist integers b and c such that |a − b| > cd for every integer d.

Section 7.3: Assertion Testing
7.28. For the set S = {1, 2, 3, 4}, let P(n): $2^{n+1} + (-1)^{n+1} 2^n + 2^{n-1}$ is a prime number, and Q(n): $2^n + 3$ is a prime number. Prove or disprove: ∀n ∈ S, P(n) ⇒ Q(n).
7.29. Let P(n): $n^2 + 3n + 1$ is even. Prove or disprove: (a) ∀k ∈ N, P(k) ⇒ P(k + 1). (b) ∀n ∈ N, P(n).


For each of Exercises 7.30–7.81: Prove or disprove.
7.30. Let x ∈ Z. If 4x + 7 is odd, then x is even.
7.31. For every nonnegative integer n, there exists a nonnegative integer k such that k < n.
7.32. Every even integer can be expressed as the sum of two odd integers.
7.33. If x, y, z ∈ Z such that x + y + z = 101, then some two of the integers x, y and z are of opposite parity.
7.34. For every two sets A and B, (A ∪ B) − B = A.
7.35. Let A be a set. If A ∩ B = ∅ for every set B, then A = ∅.
7.36. There exists an odd integer whose digit sum is even and whose digit product is odd.
7.37. For every nonempty set A, there exists a set B such that A ∪ B = ∅.
7.38. If x and y are real numbers, then |x + y| = |x| + |y|.
7.39. Let S be a nonempty set. For every proper subset A of S, there exists a nonempty subset B of S such that A ∪ B = S and A ∩ B = ∅.
7.40. There is a real solution of the equation $x^4 + x^2 + 1 = 0$.
7.41. There is an integer a with a c ≥ 0 for every integer c.
7.42. There exist real numbers a, b and c such that $\frac{a + b}{a + c} = \frac{b}{c}$.

7.43. If x, y ∈ R and $x^2 < y^2$, then x < y.
7.44. Let x, y, z ∈ Z. If z = x − y and z is even, then x and y are odd.
7.45. Every odd integer can be expressed as the sum of three odd integers.
7.46. Let x, y, z ∈ Z. If z = x + y and x is odd, then y is even and z is odd.
7.47. For every two integers a and c, there exists an integer b such that a + b = c.
7.48. Every even integer can be expressed as the sum of two even integers.
7.49. For every two rational numbers a and b with a < b, there exists a rational number r such that a < r < b.
7.50. Let A, B, C and D be sets with A ⊆ C and B ⊆ D. If A and B are disjoint, then C and D are disjoint.
7.51. Let A and B be sets. If A ∪ B = ∅, then A and B are not empty.
7.52. For every two positive integers a and c, there exists a positive integer b such that a + b = c.
7.53. For every odd integer a, there exist integers b and c of opposite parity such that a + b = c.
7.54. For every rational number a/b, where a, b ∈ N, there exists a rational number c/d, where c and d are odd positive integers, such that $0 < \frac{c}{d} < \frac{a}{b}$.
7.55. The equation $x^3 + x^2 - 1 = 0$ has a real solution between x = 0 and x = 1.
7.56. There exists a real number x such that $x^2 < x < x^3$.
7.57. Let A and B be sets. If A − B = B − A, then A − B = ∅.
7.58. If x ∈ Z, then $\frac{x^3 + x}{x^4 - 1} = \frac{x}{x^2 - 1}$.

7.59. For every positive rational number b, there exists an irrational number a with 0 < a < b.
7.60. Let A be a set. If A − B = ∅ for every set B, then A = ∅.
7.61. Let A, B and C be sets. If A ∩ B = A ∩ C, then B = C.
7.62. For every nonempty set A, there exists a set B such that |A − B| = |B − A|.
7.63. Let A be a set. If A ∪ B = ∅ for every set B, then A = ∅.
7.64. There is an irrational number a and a rational number b such that a b is irrational.


7.65. There is a real solution of the equation $x^2 + x + 1 = 0$.
7.66. For every two sets A and B, P(A ∪ B) = P(A) ∪ P(B).
7.67. Every nonzero rational number can be expressed as the product of two irrational numbers.
7.68. If A and B are disjoint sets, then P(A) and P(B) are disjoint.
7.69. Let S be a nonempty set and let T be a collection of subsets of S. If A ∩ B ≠ ∅ for all pairs A, B of elements of T, then there exists an element x ∈ S such that x ∈ C for all C ∈ T.
7.70. Let A, B and C be sets. Then A ∪ (B − C) = (A ∪ B) − (A ∪ C).
7.71. Let a, b, c ∈ Z. If ab, ac and bc are even, then a, b and c are even.
7.72. Let n ∈ Z. If $n^3 + n$ is even, then n is even.
7.73. There are three distinct integers a, b, and c such that a b = bc .
7.74. Let a, b, c ∈ Z. Then at least one of the numbers a + b, a + c and b + c is even.
7.75. Every integer can be expressed as the sum of two unequal integers.
7.76. There exist positive integers x and y such that $x^2 - y^2 = 101$.
7.77. For every positive integer n, $n^2 - n + 11$ is a prime number.
7.78. For every odd prime p, there exist positive integers a and b such that $a^2 - b^2 = p$.
7.79. If the product of two consecutive integers is not divisible by 3, then their sum is.
7.80. The sum of every five consecutive integers is divisible by 5 and the sum of no six consecutive integers is divisible by 6.
7.81. There exist three distinct positive real numbers a, b, c, none of which are integers, such that ab, ac, bc and abc are all integers.

ADDITIONAL EXERCISES FOR CHAPTER 7
7.82. (a) Show that the following statement is false: For every natural number x, there exists a natural number y such that $x < y < x^2$. (b) Make a small change in the statement in (a) so that the new statement is true. Prove the new statement.
7.83. (a) Show that the following statement is false: Every positive integer is the sum of two distinct positive odd integers. (b) Make a small change in the statement in (a) so that the new statement is true. Prove the new statement.
7.84. (a) Prove or disprove: There exist two distinct positive integers whose sum exceeds their product. (b) Your solution to (a) should suggest another problem to you. State and solve this new problem.
7.85. (a) Prove or disprove: If a and b are positive integers, then $\sqrt{a + b} = \sqrt{a} + \sqrt{b}$. (b) Prove or disprove: There exist positive real numbers a and b such that $\sqrt{a + b} = \sqrt{a} + \sqrt{b}$. (c) Complete the following statement so that it is true and provide a proof: Let a, b ∈ R+ ∪ {0}. Then $\sqrt{a + b} = \sqrt{a} + \sqrt{b}$ if and only if ____.
n n
7.86. Consider the open sentence P(n): n! > 2 for n ∈ N. Prove or disprove: ∀n ∈ N, P(n).
7.87. Evaluate the proof of the following statement. Result Every even integer can be expressed as the sum of three distinct even integers. Proof Let n be an even integer. Since n + 2, n − 2 and −n are distinct even integers and n = (n + 2) + (n − 2) + (−n), the desired result follows.


7.88. It is known (although challenging to prove) that for every nonnegative integer m, the integer 8m + 3 can be expressed as $a^2 + b^2 + c^2$ for positive integers a, b and c. (a) For every integer m with 0 ≤ m ≤ 10, find positive integers a, b and c such that $8m + 3 = a^2 + b^2 + c^2$. (b) Prove or disprove: If a, b and c are positive integers such that $a^2 + b^2 + c^2 = 8m + 3$ for some integer m, then all of a, b and c are odd.
7.89. In Exercise 6 of Chapter 4 you were asked to prove the statement P: Let a ∈ Z. If 3 | 2a, then 3 | a. (a) Prove that the converse of P holds. Now state P and its converse in a more familiar manner. (b) Is the statement obtained by replacing 2 by 3 in P true? (c) Find a set S of positive integers with 2 ∈ S and |S| ≥ 3 such that the following is true: Let a ∈ Z. If 3 | ka, where k ∈ S, then 3 | a. Prove this generalization of the statement P. (d) In Exercise 72 of Chapter 4 it was shown for integers a and b that 3 | ab if and only if 3 | a or 3 | b. How can this be used to answer (c)?
7.90. In Exercise 20 of Chapter 5 you were asked to prove that $\sqrt{2} + \sqrt{3}$ is irrational. (a) Prove that $\sqrt{2} + \sqrt{5}$ is irrational. (b) Use the proof in (a) to find another positive integer a such that $\sqrt{2} + \sqrt{a}$ is irrational. (c) State and prove a generalization of the result in (a).
7.91. In Exercise 27 of Chapter 3 you were asked to prove the statement P: If n ∈ Z, then $n^3 - n$ is even. This can be expressed as P: If n ∈ Z, then $2 \mid (n^3 - n)$. (a) Find a positive integer a ≠ 2 such that the statement "If n ∈ Z, then $a \mid (n^3 - n)$" is true, and prove this statement. (b) Find a positive integer k ≠ 3 such that the statement "If n ∈ Z, then $2 \mid (n^k - n)$" is true, and prove this statement. (c) Ask yourself a question about P and give an answer.
7.92. Let A be the set of odd integers. Investigate the truth (or falseness) of the following statements.
(a) For all x, y ∈ A, $2 \mid (x^2 + 3y^2)$.
(b) There exist x, y ∈ A such that $4 \mid (x^2 + 3y^2)$.
(c) For all x, y ∈ A, $4 \mid (x^2 + 3y^2)$.
(d) There exist x, y ∈ A such that $8 \mid (x^2 + 3y^2)$.
(e) There exist x, y ∈ A such that $6 \mid (x^2 + 3y^2)$.
(f) Provide a related statement of your own and determine whether it is true or false.


7.93. (a) Prove or disprove the following: There exist four positive integers a, b, c and d such that $a^2 + b^2 + c^2 = d^2$. (b) Prove or disprove the following: There exist four distinct positive integers a, b, c and d such that $a^2 + b^2 + c^2 = d^2$. (c) The problems in (a) and (b) above should suggest another problem that you can solve. State and solve such a problem. (d) The problems in (a) and (b) above should suggest a conjecture to you (which you are unlikely to be able to settle). State such a conjecture.

8 Equivalence Relations

There are many common examples of relations in mathematics. For example, there are three different ways that a real number x can be related to a real number y: (1) x < y, (2) $y = x^2 + 1$, or (3) x = y.

Three different ways an integer a can be related to an integer b are: (1) a | b, (2) a and b are of opposite parity, or (3) a ≡ b (mod 3). In geometry, there are three different ways in which a line in 3-space can be related to a plane in 3-space: (1) the line lies in the plane, (2) the line is parallel to the plane, or (3) the line intersects the plane in exactly one point. Three different ways a triangle T can be related to a triangle T′ are: (1) T is congruent to T′, (2) T is similar to T′, or (3) T has the same area as T′. All of the preceding examples involve two sets A and B (where possibly A = B) such that elements of A are related to elements of B in some manner. We now examine this idea in a more general setting.

8.1 Relations
Let A and B be two sets. By a relation R from A to B we mean a subset of A × B. That is, R is a set of ordered pairs, where the first coordinate of each pair belongs to A and the second coordinate belongs to B. If (a, b) ∈ R, then we say that a is related to b by R and write a R b. If (a, b) ∉ R, then a is not related to b by R and we write $a \not R\, b$. For the sets A = {x, y, z} and B = {1, 2}, the set
R = {(x, 2), (y, 1), (y, 2)}    (8.1)
is a subset of A × B and is therefore a relation from A to B. Thus x R 2 (x is related to 2) and $x \not R\, 1$ (x is not related to 1). For any two sets A and B, it is always the case that ∅ and A × B are subsets of A × B. Hence ∅ and A × B are both examples of relations from A to B. (In fact, these are the extreme examples.) For the relation ∅, no element of A is related to any element of B; while for the relation A × B, each element of A is related to each element of B. Simply put, a relation from a set A to a set


B tells us which elements of A are related to which elements of B. While this may seem like a simple idea, it is important that we understand it fully.
Let R be a relation from A to B. The domain of R, denoted by dom(R), is the subset of A defined by
dom(R) = {a ∈ A : (a, b) ∈ R for some b ∈ B};
while the range of R, denoted by range(R), is the subset of B defined by
range(R) = {b ∈ B : (a, b) ∈ R for some a ∈ A}.
Hence dom(R) is the set of elements of A that occur as first coordinates among the ordered pairs in R, and range(R) is the set of elements of B that occur as second coordinates among the ordered pairs in R. The domain and range of the relation R given in (8.1) are dom(R) = {x, y} and range(R) = {1, 2}. The reason that z ∉ dom(R) is that there is no ordered pair in R whose first coordinate is z.
Let R be a relation from A to B. By the inverse relation of R we mean the relation $R^{-1}$ from B to A defined by
$R^{-1}$ = {(b, a) : (a, b) ∈ R}.
For example, the inverse relation of the relation R = {(x, 2), (y, 1), (y, 2)} from A = {x, y, z} to B = {1, 2} is the relation $R^{-1}$ = {(1, y), (2, x), (2, y)} from B to A.
By a relation on a set A we mean a relation from A to A. That is, a relation on a set A is a collection of ordered pairs whose first and second coordinates belong to A. So {(1, 2), (1, 3), (2, 2), (2, 3)} is an example of a relation on the set A = {1, 2, 3, 4}. If A = {1, 2}, then A × A = {(1, 1), (1, 2), (2, 1), (2, 2)}. Since |A × A| = 4, the number of subsets of A × A is $2^4 = 16$. Since a relation on A is, by definition, a subset of A × A, there are 16 relations on A. Six of these 16 relations are
∅, {(1, 2)}, {(1, 1), (1, 2)}, {(1, 2), (2, 1)}, {(1, 1), (1, 2), (2, 2)}, A × A.
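Since a relation is simply a set of ordered pairs, these definitions translate almost verbatim into code. The sketch below models the relation (8.1) as a Python set of tuples; the function names `dom`, `rng`, `inverse` and `all_relations` are our own choices for illustration.

```python
from itertools import combinations, product

A = {"x", "y", "z"}
B = {1, 2}
R = {("x", 2), ("y", 1), ("y", 2)}  # the relation (8.1)


def dom(R):
    """First coordinates occurring among the ordered pairs of R."""
    return {a for a, _ in R}


def rng(R):
    """Second coordinates occurring among the ordered pairs of R."""
    return {b for _, b in R}


def inverse(R):
    """The inverse relation: every pair (a, b) becomes (b, a)."""
    return {(b, a) for a, b in R}


def all_relations(A, B):
    """Every subset of A x B, that is, every relation from A to B."""
    pairs = list(product(A, B))
    return [set(c) for r in range(len(pairs) + 1)
            for c in combinations(pairs, r)]


print(dom(R), rng(R))
print(inverse(R))
print(len(all_relations({1, 2}, {1, 2})))
```

Enumerating all subsets is feasible only for very small sets, since a set with n elements admits $2^{n^2}$ relations on itself.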

8.2 Properties of Relations
For a relation defined on a single set, there are three properties that a relation may possess and which will be of special interest to us.
A relation R defined on a set A is called reflexive if x R x for every x ∈ A. That is, R is reflexive if (x, x) ∈ R for every x ∈ A. Let S = {a, b, c} and consider the following six relations defined on S:
R1 = {(a, b), (b, a), (c, a)}
R2 = {(a, b), (b, b), (b, c), (c, b), (c, c)}
R3 = {(a, a), (a, c), (b, b), (c, a), (c, c)}
R4 = {(a, a), (a, b), (b, b), (b, c), (a, c)}
R5 = {(a, a), (a, b)}
R6 = {(a, b), (a, c)}.
The relation R1 is not reflexive since (a, a) ∉ R1, for example. Since (a, a) ∉ R2, it follows that R2 is not reflexive either. Since (a, a), (b, b), (c, c) ∈ R3, the relation R3 is reflexive. None of the relations R4, R5, R6 is reflexive.
A relation R defined on a set A is called symmetric if whenever x R y, then y R x for all x, y ∈ A. For a relation R on A to be "not symmetric", there must be some ordered pair (w, z) in R for which (z, w) ∉ R. If such an ordered pair (w, z) exists, then of course w ≠ z. The relation R1 is not symmetric since (c, a) ∈ R1 but (a, c) ∉ R1. Note that (a, b) ∈ R1 and (b, a) ∈ R1, but this alone does not mean that R1 is symmetric. Recall that the definition of a symmetric relation R on a set A requires that whenever x R y, then y R x for all x, y ∈ A. The relation R3 is symmetric, however, since both (a, c) and (c, a) belong to R3. None of the ordered pairs (a, a), (b, b), (c, c) in R3 is relevant to whether R3 is symmetric. None of the relations R2, R4, R5, R6 is symmetric.
A relation R defined on a set A is called transitive if whenever x R y and y R z, then x R z for all x, y, z ∈ A. Note that in this definition it is not required that x, y and z be distinct. Thus, for a relation R on A to be "not transitive", there must be two ordered pairs (u, v) and (v, w) in R such that (u, w) ∉ R. If this is the case, then necessarily u ≠ v and v ≠ w (although perhaps u = w). For example, the relation R2 is not transitive since (a, b), (b, c) ∈ R2 but (a, c) ∉ R2. In fact, R1 is not transitive either because (a, b), (b, a) ∈ R1 but (a, a) ∉ R1. The example (counterexample) showing that R1 is not transitive illustrates the fact that it may not be easy to show that a relation is not transitive. All of the relations R3, R4, R5, R6 are transitive.
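For a finite relation, each of the three properties can be checked mechanically by translating its definition into a loop over ordered pairs. The following sketch (the function names are ours) classifies the six relations R1 through R6 on S = {a, b, c}.

```python
def is_reflexive(R, S):
    """(x, x) must belong to R for every x in S."""
    return all((x, x) in R for x in S)


def is_symmetric(R):
    """Whenever (x, y) is in R, (y, x) must be in R as well."""
    return all((y, x) in R for x, y in R)


def is_transitive(R):
    """Whenever (x, y) and (y, z) are in R, (x, z) must be in R."""
    return all((x, z) in R for x, y in R for w, z in R if y == w)


S = {"a", "b", "c"}
relations = {
    "R1": {("a", "b"), ("b", "a"), ("c", "a")},
    "R2": {("a", "b"), ("b", "b"), ("b", "c"), ("c", "b"), ("c", "c")},
    "R3": {("a", "a"), ("a", "c"), ("b", "b"), ("c", "a"), ("c", "c")},
    "R4": {("a", "a"), ("a", "b"), ("b", "b"), ("b", "c"), ("a", "c")},
    "R5": {("a", "a"), ("a", "b")},
    "R6": {("a", "b"), ("a", "c")},
}

# (reflexive, symmetric, transitive) for each relation
summary = {name: (is_reflexive(R, S), is_symmetric(R), is_transitive(R))
           for name, R in relations.items()}
for name, flags in summary.items():
    print(name, flags)
```

Note that `is_transitive` returns True for R6 because no two of its pairs chain together, so the condition is satisfied vacuously.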
It is not always easy to convince oneself that a relation is transitive either. Let's argue carefully why the relations R5 and R6 are transitive. For R5 to be transitive, (x, z) must belong to R5 whenever (x, y) and (y, z) belong to R5, for all x, y, z ∈ S. To verify that the transitive property holds for R5, we have to consider all possible pairs of ordered pairs of the type (x, y) and (y, z). There are two possibilities for (x, y) in R5, namely (a, a) and (a, b); that is, x = a and y = a, or x = a and y = b. If (x, y) = (a, a), then y = a, so (y, z) = (a, a) or (y, z) = (a, b). In the first case (a, a) ∈ R5 and (a, a) ∈ R5, and (x, z) = (a, a) belongs to R5. In the second case (a, a) ∈ R5 and (a, b) ∈ R5, and (x, z) = (a, b) belongs to R5. This example suggests (correctly!) that if (x, y) and (y, z) belong to some relation R and x = y, then certainly (x, z) ∈ R. The same can be said if y = z. So when checking transitivity, we need only consider ordered pairs (x, y) and (y, z) for which x ≠ y and y ≠ z. Next, suppose that (x, y) = (a, b), so y = b. There is no ordered pair of the type (y, z) here; that is, there is no ordered pair in R5 whose first coordinate is b. Therefore there is nothing to check when (x, y) = (a, b). For R5 there are thus only two possibilities for two ordered pairs of the type (x, y), (y, z), and in each case (x, z) ∈ R5. So R5 is transitive.


Now let's return to R6. The relation R6 does not contain two ordered pairs of the type (x, y), (y, z): if (x, y) = (a, b), then no ordered pair has b as its first coordinate; while if (x, y) = (a, c), then no ordered pair has c as its first coordinate. Consequently, the hypothesis of the transitive property is false and the implication "If (x, y) ∈ R6 and (y, z) ∈ R6, then (x, z) ∈ R6." is satisfied vacuously. Hence R6 is transitive. Another way to convince yourself that R6 is transitive is to think about what must happen if R6 is not transitive; namely, there must be two ordered pairs (x, y), (y, z) in R6 such that (x, z) ∉ R6. But no such ordered pairs (x, y) and (y, z) exist!
In the preceding discussion we used an important point about testing a relation for transitivity. It bears repeating. If we are attempting to determine whether a relation R is transitive and, consequently, are checking all pairs of the type (x, y) and (y, z), then we need not consider the situation where x = y or y = z. In that case the ordered pair (x, z) always belongs to R. If a relation R is not transitive, then R must contain ordered pairs (x, y) and (y, z), where x ≠ y and y ≠ z, such that (x, z) is not in R. That is, (x, y) and (y, z) form a counterexample to the implication "If (x, y) ∈ R and (y, z) ∈ R, then (x, z) ∈ R", which must hold for R to be transitive.
We have already mentioned that relations occur frequently in mathematics. Let R be the relation defined on the set Z of integers by a R b if a ≤ b; that is, R is the relation ≤. Since x ≤ x for every integer x, it follows that x R x for every x ∈ Z; that is, R is reflexive. Certainly 2 R 3 since 2 ≤ 3. However, 3 ≰ 2; so $3 \not R\, 2$. Thus R is not symmetric. On the other hand, it is a well-known property of integers that if a ≤ b and b ≤ c, then a ≤ c. So if a R b and b R c, then a R c. Thus R is transitive. Another relation R that we could consider on the set Z is defined by a R b if a ≠ b. However, since 1 = 1, it follows that $1 \not R\, 1$.
Therefore, this relation is not reflexive. If a and b are integers such that a ≠ b, then b ≠ a as well. So if a R b, then b R a. This says that the relation is symmetric. Note that 2 ≠ 3 and 3 ≠ 2 but 2 = 2. That is, 2 R 3 and 3 R 2 but $2 \not R\, 2$. Hence R is not transitive.
The distance between two real numbers a and b is |a − b|. Therefore, the distance between 2 and 4.5 is |2 − 4.5| = |−2.5| = 2.5. If the real numbers (points) a and b are plotted on the real number line (x-axis), then the length of the segment between them is this distance. This is shown in Figure 8.1 for the real numbers a = 3 and b = −2, whose distance is |a − b| = |3 − (−2)| = 5 = |(−2) − 3| = |b − a|.
[Figure 8.1: The distance between 3 and −2]
Define a relation R on the set R of real numbers by a R b if |a − b| ≤ 1; that is, a is related to b if the distance between a and b is at most 1. Of course, the distance from a real number to itself is 0; that is, |a − a| = 0 ≤ 1 for every a ∈ R. So a R a and R is reflexive. If the distance between two real numbers a and b is at most 1, then the distance between b and a is at most 1. In symbols, if |a − b| ≤ 1, then |b − a| = |a − b| ≤ 1; that is, if a R b, then b R a. So R is symmetric. Now to the transitive property. If a R b and b R c, is a R c? That is, if the distance between a and b is at most 1 and the distance between b and c is at most 1, does it follow that the distance between a and c is at most 1? The answer is no. For example, 3 R 2 and 2 R 1 since |3 − 2| ≤ 1 and |2 − 1| ≤ 1. However, |3 − 1| = 2 > 1. So $3 \not R\, 1$ and R is not transitive.
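The distance relation is defined on all real numbers, so it cannot be checked exhaustively, but a finite sample is enough to exhibit the failure of transitivity. The sketch below assumes a sample of half-integers from −4 to 4 (our choice, made so that floating-point arithmetic is exact) and collects every triple violating the transitive property.

```python
def related(a, b):
    """a R b when the distance between a and b is at most 1."""
    return abs(a - b) <= 1


# Half-integers from -4.0 to 4.0; these are exactly representable floats.
sample = [x / 2 for x in range(-8, 9)]

reflexive = all(related(a, a) for a in sample)
symmetric = all(related(b, a)
                for a in sample for b in sample if related(a, b))

# Triples (a, b, c) with a R b and b R c but a not related to c.
counterexamples = [(a, b, c)
                   for a in sample for b in sample for c in sample
                   if related(a, b) and related(b, c) and not related(a, c)]

print(reflexive, symmetric)
print((3.0, 2.0, 1.0) in counterexamples)
```

The triple (3, 2, 1) from the text appears among the counterexamples. Passing the reflexive and symmetric checks on a sample does not prove those properties; the proofs are the arguments given above.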

8.3 Equivalence Relations
Perhaps the most familiar relation in mathematics is the equality relation. For example, let R be the relation defined on Z by a R b if a = b. For every integer a, we have a = a and so a R a. If a = b, then b = a. So if a R b, then b R a. Also, if a = b and b = c, then a = c. So if a R b and b R c, then a R c. These observations tell us that the equality relation on the set of integers possesses all three properties: reflexive, symmetric and transitive. This suggests the question of asking which other relations (on the set Z or on any set whatsoever) have these same three properties that the equality relation has. These are the relations that will be our primary focus in this chapter.
A relation R on a set A is called an equivalence relation if R is reflexive, symmetric and transitive. Of course, the equality relation R defined on Z by a R b if a = b is an equivalence relation on Z. As another example, consider the set A = {1, 2, 3, 4, 5, 6} and the relation
R = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (1, 3), (1, 6), (6, 1), (6, 3), (3, 1), (3, 6), (2, 4), (4, 2)}    (8.2)

defined on A. This relation has all three properties reflexive, symmetric and transitive and is therefore an equivalence relation.
Suppose that R is an equivalence relation on a set A. If a ∈ A, then a is related to a since R is reflexive. Other elements of A may be related to a as well. The set of all elements that are related to a given element of A will turn out to be important, and for this reason these sets are given special names. For an equivalence relation R defined on a set A and for a ∈ A, the set
[a] = {x ∈ A : x R a},
consisting of all elements in A that are related to a, is called an equivalence class; in fact, it is the equivalence class containing a, since a ∈ [a] (because R is reflexive). Loosely speaking, then, [a] consists of the "relatives" of a. For the equivalence relation R defined in (8.2), the resulting equivalence classes are
[1] = {1, 3, 6}, [2] = {2, 4}, [3] = {1, 3, 6}, [4] = {2, 4}, [5] = {5}, [6] = {1, 3, 6}.    (8.3)

Since [1] = [3] = [6] and [2] = [4], there are only three distinct equivalence classes in this case, namely [1], [2] and [5]. Let us return to the equality relation R defined on Z by a R b if a = b, and determine the equivalence classes for this equivalence relation. For a ∈ Z, [a] = {x ∈ Z : x R a} = {x ∈ Z : x = a} = {a}; that is, every integer is in an equivalence class by itself.
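The equivalence classes in (8.3) can be computed directly from the definition [a] = {x ∈ A : x R a}. The sketch below rebuilds the relation (8.2) and collects the classes; the helper name `equivalence_class` is ours, and frozensets are used only so that equal classes can be merged inside a set.

```python
A = {1, 2, 3, 4, 5, 6}

# The relation (8.2): all pairs (a, a) together with the listed pairs.
R = {(a, a) for a in A} | {
    (1, 3), (1, 6), (6, 1), (6, 3), (3, 1), (3, 6), (2, 4), (4, 2)
}


def equivalence_class(a, R, A):
    """All elements of A that are related to a."""
    return frozenset(x for x in A if (x, a) in R)


classes = {a: equivalence_class(a, R, A) for a in sorted(A)}
distinct = set(classes.values())

for a, cls in classes.items():
    print(a, sorted(cls))
print(len(distinct))
```

Collecting the six classes into a set leaves exactly three distinct ones, matching the observation that [1] = [3] = [6] and [2] = [4].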


As a further illustration, define a relation R on the set L of straight lines in a plane by ℓ1 R ℓ2 if either ℓ1 = ℓ2 (the lines coincide) or ℓ1 is parallel to ℓ2. Since every line coincides with itself, R is reflexive. If a line ℓ1 is parallel to (or coincides with) a line ℓ2, then ℓ2 is parallel to (or coincides with) ℓ1. So R is symmetric. Finally, if ℓ1 is parallel to ℓ2 and ℓ2 is parallel to ℓ3 (including the possibility that such pairs of lines coincide), then either ℓ1 is parallel to ℓ3 or they coincide. Indeed, it is quite possible that ℓ1 and ℓ2 are distinct parallel lines, as are ℓ2 and ℓ3, but ℓ1 and ℓ3 coincide. In any case, this relation is transitive. Therefore R is an equivalence relation. Hence for ℓ ∈ L, the equivalence class is [ℓ] = {x ∈ L : x R ℓ} = {x ∈ L : x = ℓ or x is parallel to ℓ}; that is, the equivalence class [ℓ] consists of ℓ and all lines in the plane parallel to ℓ.
To describe other examples of relations from geometry, let T be the set of all triangles in a plane. For two triangles T and T′ in T, define relations R1 and R2 on T by T R1 T′ if T is congruent to T′, and T R2 T′ if T is similar to T′. Then R1 and R2 are equivalence relations. For a triangle T and the relation R1, [T] is the set of triangles in T that are congruent to T; while for R2, [T] is the set of triangles in T that are similar to T.
The relation R defined on Z by x R y if |x| = |y| is also an equivalence relation. In this case, for a ∈ Z, the equivalence class [a] consists of the two integers a and −a, unless a = 0, in which case [0] = {0}. Let's now look at an example that requires more thought and explanation.
Result 8.1

A relation R is defined on Z by x R y if x + 3y is even. Then R is an equivalence relation.

Before proving this result, let's make sure we understand this relation. First, notice that 5 R 7 since 5 + 3 · 7 = 26 is even. However, 8 is not related to 9 since 8 + 3 · 9 = 35 is not even. On the other hand, 4 R 4 because 4 + 3 · 4 = 16 is even.

Proof of Result 8.1

We first show that R is reflexive. Let a ∈ Z. Then a + 3a = 4a = 2(2a) is even since 2a ∈ Z. Hence a R a and R is reflexive.

Next we show that R is symmetric. Assume that a R b. Then a + 3b is even, so a + 3b = 2k for some integer k. Hence a = 2k − 3b and so b + 3a = b + 3(2k − 3b) = b + 6k − 9b = 6k − 8b = 2(3k − 4b). Since 3k − 4b is an integer, b + 3a is even. Therefore, b R a and R is symmetric.

Finally we show that R is transitive. Assume that a R b and b R c. Then a + 3b and b + 3c are even; so a + 3b = 2k and b + 3c = 2ℓ for some integers k and ℓ. Adding these two equations gives (a + 3b) + (b + 3c) = 2k + 2ℓ. So a + 4b + 3c = 2k + 2ℓ and a + 3c = 2k + 2ℓ − 4b = 2(k + ℓ − 2b). Since k + ℓ − 2b is an integer, a + 3c is even. Therefore a R c and thus R is transitive. Therefore R is an equivalence relation.

PROOF ANALYSIS

A few remarks on the preceding proof are in order. Recall that a relation R defined on a set A is reflexive if x R x for every x ∈ A. The reflexive property can also be restated as: For every x ∈ A, x R x; or as: If x ∈ A, then x R x. Therefore, when we proved in Result 8.1 that R is reflexive, we began by assuming that a is an arbitrary element of Z. (We gave a direct proof.) We were then required to show that a + 3a is even, which we did. It would have been incorrect, however, to assume that a + 3a is even or that a R a. That is exactly what we wanted to prove.

Since the relation defined in Result 8.1 is an equivalence relation, there are equivalence classes; that is, for every a ∈ Z there is an equivalence class [a]. For example, let's start with 0. The equivalence class [0] is the set of all integers related to 0. In symbols, this equivalence class is

[0] = {x ∈ Z : x R 0} = {x ∈ Z : x + 3 · 0 is even} = {x ∈ Z : x is even} = {0, ±2, ±4, . . .};

that is, [0] is the set of even integers. It is not difficult to see that if a is an even integer, say a = 2k, where k ∈ Z, then

[a] = {x ∈ Z : x R a} = {x ∈ Z : x + 3a is even} = {x ∈ Z : x + 3(2k) is even} = {x ∈ Z : x + 6k is even}

is also the set of even integers, since x + 6k is even exactly when x is even. On the other hand, the equivalence class consisting of those integers related to 1 is

[1] = {x ∈ Z : x R 1} = {x ∈ Z : x + 3 · 1 is even} = {x ∈ Z : x + 3 is even} = {±1, ±3, ±5, . . .},

which is the set of odd integers. If a is an odd integer, then a = 2ℓ + 1 for some integer ℓ and

[a] = {x ∈ Z : x + 3a is even} = {x ∈ Z : x + 3(2ℓ + 1) is even} = {x ∈ Z : x + 6ℓ + 3 is even}

is the set of odd integers. So if a and b are even, then [a] = [b] is the set of even integers; while if a and b are odd, then [a] = [b] is the set of odd integers. Hence there are only two distinct equivalence classes, namely [0] and [1], the sets of even and odd integers, respectively. We shall soon see that there is good reason for this observation.
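The two classes just described can be observed on a finite sample of integers. The following is a minimal Python sketch, not part of the text's argument: the window `WINDOW` and the helper names `related` and `equivalence_class` are assumptions chosen for illustration, and a finite check of this kind illustrates Result 8.1 but does not prove it.

```python
# Illustrative sketch: equivalence classes of x R y <=> x + 3y is even,
# restricted to a finite window of Z (the window itself is an assumption).

def related(x: int, y: int) -> bool:
    """Return True exactly when x R y, that is, when x + 3y is even."""
    return (x + 3 * y) % 2 == 0

def equivalence_class(a: int, universe: range) -> set[int]:
    """Return [a] = {x : x R a}, keeping only the x that lie in `universe`."""
    return {x for x in universe if related(x, a)}

WINDOW = range(-6, 7)  # hypothetical finite sample of Z

class_of_0 = equivalence_class(0, WINDOW)
class_of_1 = equivalence_class(1, WINDOW)

print(sorted(class_of_0))  # the even integers in the window
print(sorted(class_of_1))  # the odd integers in the window
```

Choosing any other even representative, such as 4, yields the same set as `class_of_0`, and any odd representative yields `class_of_1`, in line with the observation that there are only two distinct classes.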

8.4 Properties of Equivalence Classes

You may have noticed that in the preceding examples of equivalence relations we saw several situations in which two equivalence classes are equal. It is possible to determine exactly when this occurs.

Theorem 8.2

Let R be an equivalence relation on a nonempty set A and let a and b be elements of A. Then [a] = [b] if and only if a R b.

Proof

Assume that a R b. We show that the sets [a] and [b] are equal by verifying that [a] ⊆ [b] and [b] ⊆ [a]. First we show that [a] ⊆ [b]. Let x ∈ [a]. Then x R a. Since a R b and R is transitive, x R b. So x ∈ [b] and therefore [a] ⊆ [b]. Next let y ∈ [b]. Then y R b. Since a R b and R is symmetric, b R a. Again, by the transitivity of R, we have y R a. So y ∈ [a] and thus [b] ⊆ [a]. Hence [a] = [b].


For the converse, assume that [a] = [b]. Since R is reflexive, a ∈ [a]. But since [a] = [b], it follows that a ∈ [b]. Hence a R b.

So if R is an equivalence relation on a set A and a is related to b, then by Theorem 8.2 the set [a] of elements of A related to a and the set [b] of elements of A related to b are equal, that is, [a] = [b]. Because Theorem 8.2 characterizes exactly when [a] = [b], we also know that if a is not related to b, then [a] ≠ [b].

Let us return to the equivalence relation defined in (8.2) on the set A = {1, 2, 3, 4, 5, 6} and the equivalence classes given in (8.3). We have already established that [1] = [3] = [6]. Since any two of the integers 1, 3, 6 are related (by the definition of R), Theorem 8.2 tells us that [1], [3], and [6] must be equal. The same applies to [2] and [4]. But since, for example, (5, 6) ∉ R, Theorem 8.2 tells us that [5] ≠ [6], which is the case. Therefore, as already mentioned, there are only three distinct equivalence classes, namely

[1] = [3] = [6] = {1, 3, 6},   [2] = [4] = {2, 4},   [5] = {5}.   (8.4)

Now you may have noticed something else. Each element of A belongs to exactly one equivalence class. This observation may remind you of a concept we discussed earlier. Recall that a partition P of a nonempty set S is a collection of nonempty subsets of S such that each element of S belongs to exactly one of those subsets; that is, P is a collection of pairwise disjoint nonempty subsets of S whose union is S. Hence the set of distinct equivalence classes in (8.4) is a partition of the set A = {1, 2, 3, 4, 5, 6}. We now show that this, too, is to be expected.

Theorem 8.3

Let R be an equivalence relation defined on a nonempty set A. Then the set P = {[a] : a ∈ A} of equivalence classes resulting from R is a partition of A.

Proof

Certainly, every equivalence class [a] is nonempty since a ∈ [a], and so every element of A belongs to at least one equivalence class. We show that each element of A belongs to exactly one equivalence class. Assume that some element x of A belongs to two equivalence classes, say [a] and [b]. Since x ∈ [a] and x ∈ [b], it follows that x R a and x R b. Since R is symmetric, a R x. So a R x and x R b. Since R is transitive, a R b. By Theorem 8.2, it follows that [a] = [b]. Therefore, any two equivalence classes to which x belongs are equal. Hence x belongs to a unique equivalence class.

In the proof of Theorem 8.3 we had to show that every element x ∈ A belongs to a unique equivalence class. In this proof we assumed that x belongs to two equivalence classes [a] and [b]. Note that we did not assume that [a] and [b] are distinct. We then showed that [a] = [b]; so x can belong to only one equivalence class. With very little modification, we could have reached the same conclusion using a different proof technique. We could have said: Assume, to the contrary, that x belongs to two distinct equivalence classes [a] and [b]. Using the same argument as above, we can show that [a] = [b]. This now produces a contradiction, and we would have given a proof by contradiction.
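Theorems 8.2 and 8.3 can be checked mechanically on a small example. The sketch below is only an illustration under stated assumptions: it uses the relation x R y if |x| = |y| mentioned earlier, restricted to the hypothetical finite set A = {−3, . . . , 3}, and the names `related`, `cls`, and the boolean flags are invented for this example.

```python
from itertools import product

# Hypothetical finite set on which the relation |x| = |y| is examined.
A = range(-3, 4)

def related(x: int, y: int) -> bool:
    """x R y exactly when |x| = |y|."""
    return abs(x) == abs(y)

def cls(a: int) -> frozenset:
    """The equivalence class [a] = {x in A : x R a}."""
    return frozenset(x for x in A if related(x, a))

# Theorem 8.2: [a] = [b] if and only if a R b.
theorem_8_2_holds = all(
    (cls(a) == cls(b)) == related(a, b) for a, b in product(A, repeat=2)
)

# Theorem 8.3: the distinct classes are nonempty, pairwise disjoint,
# and their union is A.
classes = {cls(a) for a in A}
all_nonempty = all(len(c) > 0 for c in classes)
pairwise_disjoint = all(
    c == d or c.isdisjoint(d) for c, d in product(classes, repeat=2)
)
covers_A = set().union(*classes) == set(A)

print(sorted(sorted(c) for c in classes))
```

Storing each class as a `frozenset` lets equal classes collapse to a single element of `classes`, which mirrors the statement that [a] and [b] are the same set whenever a R b.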

By Theorem 8.3, whenever we have defined an equivalence relation R on a nonempty set A, the set A is partitioned into the equivalence classes of R. Perhaps unexpectedly, the converse is also true. That is, if we are given a partition of A, then there is a corresponding equivalence relation that can be defined on A whose resulting equivalence classes are the elements of the given partition. For example, let P = {{1, 3, 4}, {2, 7}, {5, 6}} be a given partition of the set A = {1, 2, 3, 4, 5, 6, 7}. (Note that each element of A belongs to exactly one subset in P.) Then

R = {(1, 1), (1, 3), (1, 4), (2, 2), (2, 7), (3, 1), (3, 3), (3, 4), (4, 1), (4, 3), (4, 4), (5, 5), (5, 6), (6, 5), (6, 6), (7, 2), (7, 7)}

is an equivalence relation on A whose distinct equivalence classes are

[1] = {1, 3, 4},   [2] = {2, 7}   and   [5] = {5, 6},

and thus P = {[1], [2], [5]}.

We now state this result in general; that is, if we have a nonempty set A and a partition P of A, then it is possible to construct an equivalence relation R on A such that the distinct equivalence classes of R are exactly the subsets in P. Since we are trying to verify this in general (and not for a specific example), we need to describe the subsets in P with the aid of an index set. Since we want every subset in P to be an equivalence class, every two elements in the same subset must be related. On the other hand, since we want two distinct subsets in P to be distinct equivalence classes, elements in different subsets must not be related.

Theorem 8.4

Let P = {Aα : α ∈ I} be a partition of a nonempty set A. Then there exists an equivalence relation R on A such that P is the set of equivalence classes determined by R, that is, P = {[a] : a ∈ A}.

Proof

Define a relation R on A by x R y if x and y belong to the same subset in P; that is, x R y if x, y ∈ Aα for some α ∈ I. We now show that R is an equivalence relation. Let a ∈ A. Since P is a partition of A, it follows that a ∈ Aβ for some β ∈ I. Trivially, a and a belong to Aβ; so a R a and R is reflexive. Next, let a, b ∈ A and assume that a R b. Then a and b belong to Aγ for some γ ∈ I. So b and a belong to Aγ; thus b R a and R is symmetric. Finally, let a, b, and c be elements of A such that a R b and b R c. So a, b ∈ Aβ and b, c ∈ Aγ for some β, γ ∈ I. Since P is a partition of A, the element b belongs to only one set in P. Hence Aβ = Aγ and so a, c ∈ Aβ. Thus a R c and R is transitive. Therefore, R is an equivalence relation on A.

We now consider the equivalence classes resulting from R. Let a ∈ A. Then a ∈ Aα for some α ∈ I. The equivalence class [a] consists of all elements of A that are related to a. But the only elements related to a are those that belong to the same subset in P to which a belongs, that is, [a] = Aα. So {[a] : a ∈ A} = {Aα : α ∈ I} = P.
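The construction in the proof of Theorem 8.4 can be carried out directly for the partition P = {{1, 3, 4}, {2, 7}, {5, 6}} discussed before the theorem. The following Python sketch is a hedged illustration: the variable names `P`, `A`, `R`, `cls`, and `recovered` are chosen here for convenience and are not notation fixed by the text.

```python
from itertools import product

# The partition from the example preceding Theorem 8.4.
P = [{1, 3, 4}, {2, 7}, {5, 6}]
A = set().union(*P)  # the underlying set {1, 2, ..., 7}

# x R y exactly when x and y lie in the same block of P.
R = {(x, y) for block in P for x, y in product(block, repeat=2)}

def cls(a: int) -> frozenset:
    """The equivalence class [a] = {x in A : x R a}."""
    return frozenset(x for x in A if (x, a) in R)

# The distinct equivalence classes determined by R.
recovered = {cls(a) for a in A}

print(len(R))                              # number of ordered pairs in R
print(sorted(sorted(c) for c in recovered))
```

Each block with m elements contributes m² ordered pairs, so the relation built here has 9 + 4 + 4 = 17 pairs, which agrees with the list of pairs given in the example.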


We now give another example of an equivalence relation. Although this example is similar to the one described in Result 8.1, it is different enough to require some thought.

Result to Prove

A relation R is defined on Z by x R y if 11x − 5y is even. Then R is an equivalence relation.

PROOF STRATEGY

Since we want to verify that R is an equivalence relation, we need to show that R is reflexive, symmetric, and transitive. Let's start with the first of these. We begin with an integer a. To show that a R a, we need to show that 11a − 5a is even. However, 11a − 5a = 6a = 2(3a), so this should not cause any difficulty.

To prove that R is symmetric, we begin with a R b (where a, b ∈ Z, of course) and attempt to show that b R a. Since a R b, it follows that 11a − 5b is even. To show that b R a, we need to show that 11b − 5a is even. Since 11a − 5b is even, we can write 11a − 5b = 2k for some integer k. At first, it might seem like a good idea to solve for a in terms of b, or for b in terms of a. However, since neither the coefficient of a nor the coefficient of b in 11a − 5b = 2k is 1 or −1, fractions would be introduced. We need a different approach. Note that if we write 11b − 5a = (11a − 5b) + (? a + ? b), then we have

11b − 5a = (11a − 5b) + (−16a + 16b) = 2k − 16a + 16b = 2(k − 8a + 8b).

This will work.

To verify that R is transitive, we begin by assuming that a R b and b R c (and attempt to show that a R c). So 11a − 5b and 11b − 5c are even, and thus

11a − 5b = 2k   and   11b − 5c = 2ℓ   (8.5)

for some integers k and ℓ. To show that a R c, we need to verify that 11a − 5c is even. We need to work the expression 11a − 5c into the discussion. However, this can be done by adding the equations in (8.5). We are now ready to give a proof.

Result 8.5

A relation R is defined on Z by x R y if 11x − 5y is even. Then R is an equivalence relation.

Proof

First we show that R is reflexive. Let a ∈ Z. Then 11a − 5a = 6a = 2(3a). Since 3a is an integer, 11a − 5a is even. Thus a R a and R is reflexive.

Next we show that R is symmetric. Assume that a R b, where a, b ∈ Z. Then 11a − 5b is even. Hence 11a − 5b = 2k, where k ∈ Z. Observe that

11b − 5a = (11a − 5b) + (−16a + 16b) = 2k − 16a + 16b = 2(k − 8a + 8b).

Since k − 8a + 8b is an integer, 11b − 5a is even. Therefore, b R a and R is symmetric.

Finally we show that R is transitive. Assume that a R b and b R c. Then 11a − 5b and 11b − 5c are even. Hence 11a − 5b = 2k and 11b − 5c = 2ℓ, where k, ℓ ∈ Z. Adding these equations, we obtain (11a − 5b) + (11b − 5c) = 2k + 2ℓ. Solving for 11a − 5c, we have 11a − 5c = 2k + 2ℓ − 6b = 2(k + ℓ − 3b). Since k + ℓ − 3b is an integer, 11a − 5c is even. Therefore a R c and R is transitive. Hence R is an equivalence relation.

We now determine the equivalence classes for the equivalence relation just discussed. Let's begin with the equivalence class containing 0, for example. Then

[0] = {x ∈ Z : x R 0} = {x ∈ Z : 11x is even} = {x ∈ Z : x is even} = {0, ±2, ±4, . . .}.

Recall that the distinct equivalence classes always produce a partition of the set involved (Z in this case). Since the class [0] does not consist of all integers, at least one other equivalence class exists. To determine another equivalence class, we look for an element that does not belong to [0]. Since 1 ∉ [0], the equivalence class [1] is distinct (and disjoint) from [0]. Then

[1] = {x ∈ Z : x R 1} = {x ∈ Z : 11x − 5 is even} = {x ∈ Z : x is odd} = {±1, ±3, ±5, . . .}.

Since [0] and [1] produce a partition of Z (that is, every integer belongs to exactly one of [0] and [1]), these are the only equivalence classes in this case.
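The three properties established in Result 8.5, together with the algebraic identity used for symmetry, can be spot-checked by brute force. The sketch below is an illustration only: `SAMPLE` is an assumed finite range of integers, the flag names are invented here, and passing these checks on a sample is not a substitute for the proof above.

```python
from itertools import product

def related(x: int, y: int) -> bool:
    """x R y exactly when 11x - 5y is even (the relation of Result 8.5)."""
    return (11 * x - 5 * y) % 2 == 0

SAMPLE = range(-8, 9)  # hypothetical finite sample of Z

reflexive = all(related(a, a) for a in SAMPLE)

symmetric = all(
    related(b, a)
    for a, b in product(SAMPLE, repeat=2)
    if related(a, b)
)

transitive = all(
    related(a, c)
    for a, b, c in product(SAMPLE, repeat=3)
    if related(a, b) and related(b, c)
)

# The identity behind the symmetry argument:
# 11b - 5a = (11a - 5b) + (-16a + 16b).
identity_holds = all(
    11 * b - 5 * a == (11 * a - 5 * b) + (-16 * a + 16 * b)
    for a, b in product(SAMPLE, repeat=2)
)

print(reflexive, symmetric, transitive, identity_holds)
```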

8.5 Congruence Modulo n

Next we describe one of the most important equivalence relations. If you study more mathematics in the future, you will probably see the equivalence relation we are about to describe again, and often. Recall that for integers a and b, where a ≠ 0, the integer a divides b, written a | b, if there is an integer c such that b = ac. Also, for integers a, b, and n ≥ 2, a is said to be congruent to b modulo n, written a ≡ b (mod n), if n | (a − b). For example, 24 ≡ 6 (mod 9) since 9 | (24 − 6), while 1 ≡ 5 (mod 2) since 2 | (1 − 5). In addition, 4 ≡ 4 (mod 5) since 5 | (4 − 4). However, 8 ≢ 2 (mod 4) since 4 ∤ (8 − 2). These concepts were introduced in Chapter 4.

Let's consider some examples of pairs a, b of integers such that a ≡ b (mod 5). Note that 7 ≡ 7 (mod 5), −1 ≡ −1 (mod 5), and 0 ≡ 0 (mod 5). Also, 2 ≡ −8 (mod 5) and −8 ≡ 2 (mod 5). Note too that 2 ≡ 17 (mod 5). Hence both −8 ≡ 2 (mod 5) and 2 ≡ 17 (mod 5). Also, −8 ≡ 17 (mod 5). These examples might suggest that the reflexive, symmetric, and transitive properties are satisfied here, a fact that we will verify shortly. This is the important equivalence relation referred to at the beginning of this section, not only for n = 5, but for any integer n ≥ 2.

Theorem 8.6

Let n ∈ Z, where n ≥ 2. Then congruence modulo n (that is, the relation R defined on Z by a R b if a ≡ b (mod n)) is an equivalence relation on Z.

Proof

Let a ∈ Z. Since n | 0, it follows that n | (a − a) and thus a ≡ a (mod n). Thus a R a, which means that R is reflexive.


Next we show that R is symmetric. Assume that a R b, where a, b ∈ Z. Since a R b, it follows that a ≡ b (mod n) and hence n | (a − b). So there exists k ∈ Z such that a − b = nk. Then b − a = −(a − b) = −(nk) = n(−k). Since −k ∈ Z, it follows that n | (b − a) and thus b ≡ a (mod n). Therefore, b R a and R is symmetric.

Finally we show that R is transitive. Assume that a R b and b R c, where a, b, c ∈ Z. We show that a R c. Since a R b and b R c, we know that a ≡ b (mod n) and b ≡ c (mod n). Thus n | (a − b) and n | (b − c). Hence

a − b = nk   and   b − c = nℓ   (8.6)

for some integers k and ℓ. Adding the equations in (8.6), we obtain (a − b) + (b − c) = nk + nℓ = n(k + ℓ); so a − c = n(k + ℓ). Since k + ℓ ∈ Z, we have n | (a − c) and thus a ≡ c (mod n). Therefore a R c and R is transitive.

PROOF ANALYSIS

Theorem 8.6 describes a well-known equivalence relation. Let's review how we verified this. The proof we gave to show that congruence modulo n is an equivalence relation uses a common proof technique for this kind of result, and we need to be familiar with it. To prove that R is reflexive, we began with an arbitrary element of Z. We called this element a. Our goal was to show that a R a. By definition, a R a if and only if a ≡ a (mod n). However, a ≡ a (mod n) if and only if n | (a − a), which is equivalent to the statement n | 0. Certainly, n | 0, and this is where we decided to start.

To prove that R is symmetric, we assumed (as usual) that a R b. Our goal was to show that b R a. Since a R b, the definition of the relation R tells us that a ≡ b (mod n). From this we knew that n | (a − b) and so a − b = nk for some integer k. However, to show that b R a we had to verify that b ≡ a (mod n). But this can be done only if we can show that n | (b − a) or, equivalently, that b − a = nℓ for some integer ℓ. So we needed to verify that b − a can be expressed as the product of n and some other integer. Since b − a is the negative of a − b and we have a convenient expression for a − b, this provided us with a key step.

Finally, to prove that R is transitive, we began by assuming that a R b and b R c, which led us to the expressions a − b = nk and b − c = nℓ, where k, ℓ ∈ Z. Since our goal was to show that a R c, we had to show that a − c is a multiple of n. So we had to somehow work the term a − c into the problem, knowing that a − b = nk and b − c = nℓ. The key step here was the observation that a − c = (a − b) + (b − c).

By Theorem 8.6, then, congruence modulo 3 is an equivalence relation. In other words, if we define a relation R on Z by a R b if a ≡ b (mod 3), then it follows that R is an equivalence relation. Let's determine the distinct equivalence classes in this case. First choose an integer, say 0. Then [0] is an equivalence class. In fact,

[0] = {x ∈ Z : x R 0} = {x ∈ Z : x ≡ 0 (mod 3)} = {x ∈ Z : 3 | x} = {0, ±3, ±6, ±9, . . .}.

The class [0] consists of the multiples of 3. This class can also be denoted by [3] or [6] or even [−300]. Since there is an integer that is not in [0], there must be at least one equivalence class distinct from [0]. In particular, since 1 ∉ [0], it follows that [1] ≠ [0]; in fact, necessarily [1] ∩ [0] = ∅. The equivalence class

[1] = {x ∈ Z : x R 1} = {x ∈ Z : x ≡ 1 (mod 3)} = {x ∈ Z : 3 | (x − 1)} = {1, −2, 4, −5, 7, −8, . . .}.

Since 2 ∉ [0] and 2 ∉ [1], the equivalence class [2] is distinct from both [0] and [1]. By definition,

[2] = {x ∈ Z : x R 2} = {x ∈ Z : x ≡ 2 (mod 3)} = {x ∈ Z : 3 | (x − 2)} = {2, −1, 5, −4, 8, −7, . . .}.

Since every integer belongs to (exactly) one of these classes, we have exactly three distinct equivalence classes in this case, namely:

[0] = {0, ±3, ±6, ±9, . . .},
[1] = {1, −2, 4, −5, 7, −8, . . .},
[2] = {2, −1, 5, −4, 8, −7, . . .}.

These equivalence classes have a connection with some very familiar mathematical concepts, division and remainders, which we encountered in Section 4.1 and which are useful to review here. If m and n ≥ 2 are integers and m is divided by n, then we can express this division as m = nq + r, where q is the quotient and r is the remainder. The remainder r is required to satisfy 0 ≤ r < n. With this requirement, q and r are unique, and the result just referred to is the Division Algorithm. (The Division Algorithm is explored in more detail in Chapter 11.) Consequently, every integer m can be expressed as 3q + r, where 0 ≤ r < 3; that is, r has one of the values 0, 1, or 2. Therefore, every integer can be expressed as 3q, 3q + 1, or 3q + 2 for some integer q. In this case, the equivalence class [0] consists of the multiples of 3, and therefore every integer having remainder 0 when divided by 3 belongs to [0]. Also, every integer having remainder 1 when divided by 3 belongs to [1], while every integer having remainder 2 when divided by 3 belongs to [2]. Since

73 = 24 · 3 + 1   and   −22 = (−8) · 3 + 2,

for example, it follows that 73 ∈ [1] and −22 ∈ [2]. In fact, [73] = [1] and [−22] = [2].

In general, for n ≥ 2, the equivalence relation congruence modulo n results in n distinct equivalence classes. In other words, if we define a R b by a ≡ b (mod n), then there are n distinct equivalence classes: [0], [1], . . . , [n − 1]. In fact, for an integer r with 0 ≤ r < n, an integer m belongs to the set [r] if and only if there is an integer q (the quotient) such that m = nq + r. Indeed, the equivalence class [r] consists of all integers having remainder r when divided by n.

We now consider another equivalence relation defined on Z involving congruence, but one that is seemingly different from the class of examples just described.

Result to Prove

Let R be the relation defined in Z by a R b if 2a + b ≡ 0 (mod 3). Then R is an equivalence relation.

PROOF STRATEGY

To prove that R is reflexive, we need to show that x R x for all x ∈ Z. This means that we need to show that 2x + x ≡ 0 (mod 3), that is, 3x ≡ 0 (mod 3). This is equivalent to showing that 3 | 3x, which is clear. This tells us where to start in proving the reflexive property. The proof that R is symmetric is a little more subtle. Of course, we know where to start. We assume that x R y. From this, we have 2x + y ≡ 0 (mod 3). So 3 | (2x + y), or 2x + y = 3r for some integer r. Our goal is to show that y R x or, equivalently, that 2y + x ≡ 0 (mod 3). Eventually, then, we have to show that 2y + x = 3s for some integer s. Of course, we cannot assume this. Since 2x + y = 3r, it follows that y = 3r − 2x. So 2y + x = 2(3r − 2x) + x = 6r − 3x = 3(2r − x). Since 2r − x ∈ Z, we have 3 | (2y + x), and the verification of symmetry is nearly complete. The proof that R is transitive proceeds as one would expect.

Result 8.7

Let R be the relation defined in Z by a R b if 2a + b ≡ 0 (mod 3). Then R is an equivalence relation.

Proof

Let x ∈ Z. Since 3 | 3x, it follows that 3x ≡ 0 (mod 3). So 2x + x ≡ 0 (mod 3). Thus x R x and R is reflexive.

Next we verify that R is symmetric. Assume that x R y, where x, y ∈ Z. Then 2x + y ≡ 0 (mod 3) and hence 3 | (2x + y). So 2x + y = 3r for some integer r. Hence y = 3r − 2x, and so 2y + x = 2(3r − 2x) + x = 6r − 3x = 3(2r − x). Since 2r − x is an integer, 3 | (2y + x). So 2y + x ≡ 0 (mod 3). Therefore y R x and R is symmetric.

Finally we show that R is transitive. Assume that x R y and y R z, where x, y, z ∈ Z. Then 2x + y ≡ 0 (mod 3) and 2y + z ≡ 0 (mod 3). So 3 | (2x + y) and 3 | (2y + z). It follows that 2x + y = 3r and 2y + z = 3s for some integers r and s. Adding these two equations, we obtain 2x + 3y + z = 3r + 3s; so 2x + z = 3r + 3s − 3y = 3(r + s − y). Since r + s − y is an integer, 3 | (2x + z); so 2x + z ≡ 0 (mod 3). Therefore x R z and R is transitive.

PROOF ANALYSIS

Some additional remarks on the proof of the symmetric property in Result 8.7 may be helpful. At one point in the proof we knew that 2x + y = 3r for some integer r and we wanted to show that 2y + x = 3s for some integer s. If we were to add these two equations, we would obtain 3x + 3y = 3r + 3s. Of course, we cannot add them because we do not know that 2y + x = 3s. But this does suggest another idea.

Assume that x R y. Then 2x + y ≡ 0 (mod 3). Therefore 3 | (2x + y); so 2x + y = 3r for some integer r. Observe that 3x + 3y = (2x + y) + (2y + x) = 3r + (2y + x). Therefore 2y + x = 3x + 3y − 3r = 3(x + y − r). Since x + y − r ∈ Z, it follows that 3 | (2y + x). Hence 2y + x ≡ 0 (mod 3), y R x, and R is symmetric.

The distinct equivalence classes for the equivalence relation described in Result 8.7 are

[0] = {x ∈ Z : x R 0} = {x ∈ Z : 2x ≡ 0 (mod 3)} = {x ∈ Z : 3 | 2x} = {0, ±3, ±6, ±9, . . .},
[1] = {x ∈ Z : x R 1} = {x ∈ Z : 2x + 1 ≡ 0 (mod 3)} = {x ∈ Z : 3 | (2x + 1)} = {1, −2, 4, −5, 7, −8, . . .},
[2] = {x ∈ Z : x R 2} = {x ∈ Z : 2x + 2 ≡ 0 (mod 3)} = {x ∈ Z : 3 | (2x + 2)} = {2, −1, 5, −4, 8, −7, . . .}.

Let's discuss how we obtained these equivalence classes. We started with the integer 0 and saw that [0] = {x ∈ Z : 3 | 2x}. If we try different values of x (for example, 0, 1, 2, 3, 4, 5, and so on, and −1, −2, −3, −4, and so on), we see that we obtain the multiples of 3. (Exercise 6 in Chapter 4 asks you to show that x is a multiple of 3 if 3 | 2x.) The contents of [1] and [2] can be justified in a similar way.

We have seen that if we define a relation R1 on Z by a R1 b if a ≡ b (mod 3), then we have three distinct equivalence classes; whereas if we define a relation R2 on Z by a R2 b if 2a + b ≡ 0 (mod 3), then we also have three distinct equivalence classes, in fact the same equivalence classes. Let's see why this is true.

Result 8.8

Let a, b ∈ Z. Then a ≡ b (mod 3) if and only if 2a + b ≡ 0 (mod 3).

Proof

First assume that a ≡ b (mod 3). Then 3 | (a − b) and so a − b = 3x for some integer x. Thus a = 3x + b. Now 2a + b = 2(3x + b) + b = 6x + 3b = 3(2x + b). Since 2x + b is an integer, 3 | (2a + b) and thus 2a + b ≡ 0 (mod 3).

For the converse, assume that 2a + b ≡ 0 (mod 3). Then 3 | (2a + b), which implies that 2a + b = 3y for some integer y. Thus b = 3y − 2a. Observe that a − b = a − (3y − 2a) = 3a − 3y = 3(a − y). Since a − y is an integer, 3 | (a − b) and thus a ≡ b (mod 3).

We must not conclude that just because we are dealing with an equivalence relation defined in terms of congruence modulo 3, we will necessarily have three distinct equivalence classes. Suppose that we define a relation R on Z by a R b if a² ≡ b² (mod 3). Here, too, R is an equivalence relation. In this case, however, there are only two distinct equivalence classes, namely

[0] = {0, ±3, ±6, ±9, . . .}   and   [1] = {±1, ±2, ±4, ±5, . . .},

because whenever an integer n has a remainder of 1 or 2 when divided by 3, the integer n² has a remainder of 1 when divided by 3.
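The comparison of these three relations can be made concrete on a finite sample. In the Python sketch below, `SAMPLE`, `classes`, and the three relation functions are names assumed for this illustration; the sketch simply collects the distinct classes that each relation produces on the sample.

```python
SAMPLE = range(-9, 10)  # hypothetical finite sample of Z

def congruent_mod_3(a: int, b: int) -> bool:
    """a R1 b exactly when a ≡ b (mod 3)."""
    return (a - b) % 3 == 0

def result_8_7(a: int, b: int) -> bool:
    """a R2 b exactly when 2a + b ≡ 0 (mod 3)."""
    return (2 * a + b) % 3 == 0

def squares_mod_3(a: int, b: int) -> bool:
    """a R b exactly when a^2 ≡ b^2 (mod 3)."""
    return (a * a - b * b) % 3 == 0

def classes(related) -> set:
    """Distinct classes [a] = {x : x R a} of `related`, restricted to SAMPLE."""
    return {
        frozenset(x for x in SAMPLE if related(x, a))
        for a in SAMPLE
    }

classes_r1 = classes(congruent_mod_3)
classes_r2 = classes(result_8_7)
classes_sq = classes(squares_mod_3)

print(len(classes_r1), len(classes_r2), len(classes_sq))
```

On this sample the first two relations give identical collections of classes, as Result 8.8 predicts, while the relation defined through squares merges the classes of 1 and 2 into one.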

8.6 The Integers Modulo n

We have already seen that for any integer n ≥ 2 the relation R defined on Z by a R b if a ≡ b (mod n) is an equivalence relation. Furthermore, this equivalence relation results in the n distinct equivalence classes [0], [1], . . . , [n − 1]. We denote the set of these equivalence classes by Zn and refer to this set as the integers modulo n. Thus Z3 = {[0], [1], [2]} and, in general, Zn = {[0], [1], . . . , [n − 1]}. Hence every element [r] of Zn, where 0 ≤ r < n, is a set containing infinitely many integers; in fact, as we have already noted, [r] consists of all those integers having remainder r when divided by n. For this reason, the elements of Zn are sometimes called residue classes.

While it makes perfect sense to take the union and intersection of two elements of Zn, since those elements are sets (actually subsets of Z), at this point it makes no sense to add or multiply two elements of Zn. However, since the elements of Zn have the form [a] and [b], where a, b ∈ Z, this suggests the possibility of defining addition and multiplication in Zn. We now discuss how these operations on the set Zn can be defined. Of course, we have seen addition and multiplication defined many times before. When we speak of addition and multiplication as operations on a set S, we mean that for x, y ∈ S the sum x + y and the product xy must both belong to S. For example, in the set Q of rational numbers, the sum and the product of two rational numbers a/b and c/d (where a, b, c, d ∈ Z and b, d ≠ 0) are defined by

a/b + c/d = (ad + bc)/(bd)   and   a/b · c/d = (ac)/(bd),

both of which are rational numbers and therefore belong to Q.

As we have just mentioned, if addition and multiplication are operations on a set S, then x + y ∈ S and xy ∈ S for all x, y ∈ S. So if T is a nonempty subset of S and x, y ∈ T, then x + y ∈ S and xy ∈ S. The set T is closed under addition if x + y ∈ T whenever x, y ∈ T. Likewise, T is closed under multiplication if xy ∈ T whenever x, y ∈ T. If addition and multiplication are operations on a set S, then S is necessarily closed under addition and multiplication. For example, addition and multiplication are operations on Z. If A and B denote the sets of even integers and odd integers, respectively, then A is closed under both addition and multiplication, but B is closed only under multiplication.

If addition and multiplication are to be defined in Zn, we would certainly expect the sum and the product of two elements of Zn to be elements of Zn. There appears to be a natural definition of addition and multiplication in Zn; namely, for two equivalence classes [a] and [b] in Zn we define

[a] + [b] = [a + b]   and   [a] · [b] = [ab].   (8.7)

For example, suppose that we consider Z6, where then Z6 = {[0], [1], . . . , [5]}. From the definitions of addition and multiplication just given, [1] + [3] = [1 + 3] = [4] and [1] · [3] = [1 · 3] = [3]. This certainly seems harmless enough, but consider adding and multiplying two other equivalence classes, say [2] and [3]. Again, according to the definitions in (8.7), [2] + [3] = [2 + 3] = [5] and [2] · [3] = [2 · 3] = [6]. However, we have expressed the elements of Z6 as [0], [1], [2], [3], [4], and [5], and we do not see [2 · 3] = [6] explicitly among these elements. Since 6 ≡ 0 (mod 6), it follows that 6 ∈ [0], and hence [6] = [0]. Also, the remainder is 0 when 6 is divided by 6, and so [6] = [0]. Therefore, [2] · [3] = [0]. By similar reasoning, [3] + [5] = [2] and [3] · [5] = [3]. The complete addition and multiplication tables for Z6 are shown in Figure 8.2.

 +  | [0] [1] [2] [3] [4] [5]
----+------------------------
[0] | [0] [1] [2] [3] [4] [5]
[1] | [1] [2] [3] [4] [5] [0]
[2] | [2] [3] [4] [5] [0] [1]
[3] | [3] [4] [5] [0] [1] [2]
[4] | [4] [5] [0] [1] [2] [3]
[5] | [5] [0] [1] [2] [3] [4]

 ·  | [0] [1] [2] [3] [4] [5]
----+------------------------
[0] | [0] [0] [0] [0] [0] [0]
[1] | [0] [1] [2] [3] [4] [5]
[2] | [0] [2] [4] [0] [2] [4]
[3] | [0] [3] [0] [3] [0] [3]
[4] | [0] [4] [2] [0] [4] [2]
[5] | [0] [5] [4] [3] [2] [1]

Figure 8.2 The addition and multiplication tables for Z6

If we add [1] to [0], add [1] to [1], and so on, we obtain [0] + [1] = [1], [1] + [1] = [2], [2] + [1] = [3], . . . , [5] + [1] = [6] = [0], [6] + [1] = [0] + [1] = [1], and so on; that is, we return to [0] and cycle through all the classes of Z6 again (and again). If we were dealing with Z12 instead of Z6, we would have [0] + [1] = [1], [1] + [1] = [2], [2] + [1] = [3], . . . , [11] + [1] = [12] = [0], [12] + [1] = [0] + [1] = [1], and so on, and this should remind you of what occurs when a certain number of hours is added to a time (in hours), where, of course, 12 o'clock is represented as 0 o'clock. (For example, if it is 11 o'clock now, what time will it be in 45 hours?)

While the definitions of addition and multiplication in Zn that we gave in (8.7) should seem quite reasonable and expected, there is a potential difficulty here that needs to be addressed. According to the definition of addition in Z6, [4] + [5] = [3]. However, the class [4], consisting of all integers x such that x ≡ 4 (mod 6), need not be represented in this way. Since 10 ∈ [4], it follows that [10] = [4]. Also, for example, [16] = [4] and [−2] = [4]. Moreover, [11] = [5], [17] = [5], and [−25] = [5]. Therefore, adding the equivalence classes [4] and [5] is the same as adding [10] and [−25], say, since [10] = [4] and [−25] = [5]. But according to the definition we have given, [10] + [−25] = [−15]. Fortunately, [−15] = [3] and we obtain the same sum as before. But will this always happen? That is, does the definition of the sum of the equivalence classes [a] and [b] given in (8.7) depend on the representatives a and b of these classes? If the sum (or product) of two equivalence classes does not depend on the representatives, then we say that this sum (or product) is well-defined. We certainly would want this to be the case, which, fortunately, it is. More precisely, addition and multiplication in Zn are well-defined if whenever [a] = [b] and [c] = [d] in Zn, then [a + c] = [b + d] and [ac] = [bd].

Theorem 8.9

Addition in Zn, n ≥ 2, is well-defined.

Proof   The set Zn is the set of equivalence classes resulting from the equivalence relation R defined on Z by a R b if a ≡ b (mod n). Let [a], [b], [c], [d] ∈ Zn, where [a] = [b] and [c] = [d]. We prove that [a + c] = [b + d]. Since [a] = [b], it follows by Theorem 8.2 that a R b. Similarly, c R d. Therefore, a ≡ b (mod n) and c ≡ d (mod n). Thus n | (a − b) and n | (c − d). Hence there exist integers x and y such that

a − b = nx and c − d = ny.     (8.8)

Adding the equations in (8.8), we obtain (a − b) + (c − d) = nx + ny = n(x + y); so (a + c) − (b + d) = n(x + y). This implies that n | [(a + c) − (b + d)]. Thus (a + c) ≡ (b + d) (mod n). From this, we conclude that (a + c) R (b + d), which implies that [a + c] = [b + d].

If the proof of Theorem 8.9 looks a bit familiar, review Result 4.10 and its proof. As an example, in Z7, [118] + [26] = [144]. Since the remainder is 4 when 144 is divided by 7, it follows that [118] + [26] = [4]. Also, [118] = [6] and [26] = [5]; so [118] + [26] = [6] + [5] = [11] = [4].

As we have already mentioned, the multiplication in Zn described in (8.7) is also well-defined. The proof of this fact is left as an exercise (Exercise 8.58). Addition and multiplication in Zn satisfy many familiar properties. Among these are:

Commutative Properties   [a] + [b] = [b] + [a] and [a] · [b] = [b] · [a] for all a, b ∈ Z;

Associative Properties   ([a] + [b]) + [c] = [a] + ([b] + [c]) and ([a] · [b]) · [c] = [a] · ([b] · [c]) for all a, b, c ∈ Z;

Distributive Property   [a] · ([b] + [c]) = [a] · [b] + [a] · [c] for all a, b, c ∈ Z.
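The tables and properties above can be checked by computer for a particular modulus. The following is a minimal sketch in Python, not part of the text's development. It assumes that each class [a] in Zn is represented by its remainder r with 0 ≤ r < n, and the function names add_mod, mul_mod, table and laws_hold are illustrative choices. An exhaustive check for one value of n is evidence for that n only and does not replace a proof.

```python
from itertools import product

def add_mod(a, b, n):
    """Remainder representing the class [a] + [b] = [a + b] in Z_n."""
    return (a + b) % n

def mul_mod(a, b, n):
    """Remainder representing the class [a] . [b] = [ab] in Z_n."""
    return (a * b) % n

def table(op, n):
    """Operation table of Z_n; row a, column b holds the remainder of op(a, b)."""
    return [[op(a, b, n) for b in range(n)] for a in range(n)]

def laws_hold(n):
    """Exhaustively check the commutative, associative and distributive properties in Z_n."""
    reps = range(n)
    commutative = all(
        add_mod(a, b, n) == add_mod(b, a, n) and mul_mod(a, b, n) == mul_mod(b, a, n)
        for a, b in product(reps, repeat=2)
    )
    associative = all(
        add_mod(add_mod(a, b, n), c, n) == add_mod(a, add_mod(b, c, n), n)
        and mul_mod(mul_mod(a, b, n), c, n) == mul_mod(a, mul_mod(b, c, n), n)
        for a, b, c in product(reps, repeat=3)
    )
    distributive = all(
        mul_mod(a, add_mod(b, c, n), n)
        == add_mod(mul_mod(a, b, n), mul_mod(a, c, n), n)
        for a, b, c in product(reps, repeat=3)
    )
    return commutative and associative and distributive

add_table = table(add_mod, 6)
mul_table = table(mul_mod, 6)
print(mul_table[2])           # row [2] of the multiplication table: [0, 2, 4, 0, 2, 4]
print(add_mod(118, 26, 7))    # [118] + [26] in Z_7: 4
print(laws_hold(6))           # True
```

The row printed for [2] agrees with the corresponding row of the multiplication table in Figure 8.2, and the sum in Z7 agrees with the worked example [118] + [26] = [4].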

Although we have defined multiplication in Zn in the manner you probably expected, this is not the only way it could have been defined. For example, suppose that we consider the set Z3 of integers modulo 3. For equivalence classes [a] and [b] in Z3, define the "product" [a] · [b] to be equal to [q], where q is the quotient when ab is divided by 3. Since the "product" of every two elements of Z3 is an element of Z3, the set Z3 is closed under this operation. In particular, [2] · [2] = [1] since the quotient is 1 when 2 · 2 = 4 is divided by 3. However, [2] = [5] but [5] · [5] = [8] = [2]. Also note that [5] · [2] = [3] = [0]. Therefore, this multiplication is not well-defined.
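The difference between a well-defined operation and one that is not can be explored numerically. The sketch below, written in Python under stated assumptions, replaces each representative a by a + in for a few integers i and tests whether the resulting class changes. The names is_well_defined, quotient_product and the parameter spread are hypothetical helpers introduced only for this illustration. A result of True on a finite sample merely supports well-definedness, whereas a single False is a genuine counterexample.

```python
def is_well_defined(op, n, spread=2):
    """Sample test: does the class of op(a, c) in Z_n depend only on [a] and [c]?

    Each representative a (0 <= a < n) is replaced by a + i*n for
    i = -spread, ..., spread, and likewise for c.
    """
    for a in range(n):
        for c in range(n):
            expected = op(a, c) % n
            for i in range(-spread, spread + 1):
                for j in range(-spread, spread + 1):
                    if op(a + i * n, c + j * n) % n != expected:
                        return False  # two choices of representatives disagree
    return True

def quotient_product(a, b):
    """The alternative "product" on Z_3: the quotient when ab is divided by 3."""
    return (a * b) // 3

print(is_well_defined(lambda a, c: a + c, 6))   # True: consistent with Theorem 8.9
print(is_well_defined(lambda a, c: a * c, 6))   # True: consistent with Exercise 8.58
print(is_well_defined(quotient_product, 3))     # False: representatives matter
print(quotient_product(2, 2) % 3, quotient_product(5, 5) % 3)   # 1 2
```

The last line reproduces the counterexample in the text: [2] = [5] in Z3, yet the quotient "product" gives [1] from the representatives 2, 2 and [2] from the representatives 5, 5.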

EXERCISES FOR CHAPTER 8

Section 8.1: Relations

8.1. Let A = {a, b, c} and B = {r, s, t, u}. Furthermore, let R = {(a, s), (a, t), (b, t)} be a relation from A to B. Determine dom(R) and range(R).
8.2. Let A be a nonempty set and B ⊆ P(A). Define a relation R from A to B by x R Y if x ∈ Y. Give an example of two sets A and B that illustrate this. What is R for these two sets?
8.3. Let A = {0, 1}. Determine all the relations on A.
8.4. Let A = {a, b, c} and B = {1, 2, 3, 4}. Then R1 = {(a, 2), (a, 3), (b, 1), (b, 3), (c, 4)} is a relation from A to B, while R2 = {(1, b), (1, c), (2, a), (2, b), (3, c), (4, a), (4, c)} is a relation from B to A. A relation R is defined on A by x R y if there exists z ∈ B such that x R1 z and z R2 y. Express R by listing its elements.
8.5. For the relation R = {(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)} defined on the set {1, 2, 3}, what is R⁻¹?
8.6. A relation R is defined on N by a R b if a/b ∈ N. For c, d ∈ N, under what conditions is c R⁻¹ d?
8.7. For the relation R = {(x, y) : x + 4y is odd} defined on N, what is R⁻¹?
8.8. For the relation R = {(x, y) : x ≤ y} defined on N, what is R⁻¹?
8.9. Let A and B be sets with |A| = |B| = 4. (a) Prove or disprove: If R is a relation from A to B where |R| = 9 and R = R⁻¹, then A = B. (b) Show that by making a small change in the statement in (a), a different response to the resulting statement can be obtained.
8.10. Let A be a set with |A| = 4. What is the maximum number of elements that a relation R on A can contain so that R ∩ R⁻¹ = ∅?

Section 8.2: Properties of Relations

8.11. Let A = {a, b, c, d} and let R = {(a, a), (a, b), (a, c), (a, d), (b, b), (b, c), (b, d), (c, c), (c, d), (d, d)} be a relation on A. Which of the properties reflexive, symmetric and transitive does the relation R possess? Justify your answers.
8.12. Let S = {a, b, c}. Then R = {(a, a), (a, b), (a, c)} is a relation on S. Which of the properties reflexive, symmetric and transitive does the relation R possess? Justify your answers.
8.13. Let S = {a, b, c}. Then R = {(a, b)} is a relation on S. Which of the properties reflexive, symmetric and transitive does the relation R possess? Justify your answers.
8.14. Let A = {a, b, c, d}. Give an example (with justification) of a relation R on A that has none of the following properties: reflexive, symmetric, transitive.
8.15. A relation R is defined on Z by a R b if |a − b| ≤ 2. Which of the properties reflexive, symmetric and transitive does the relation R possess? Justify your answers.
8.16. Let A = {a, b, c, d}. How many relations defined on A are reflexive, symmetric and transitive and contain the ordered pairs (a, b), (b, c), (c, d)?
8.17. Let R = ∅ be the empty relation on a nonempty set A. Which of the properties reflexive, symmetric and transitive does R possess?


8.18. Let A = {1, 2, 3, 4}. Give an example of a relation on A that is:
(a) reflexive and symmetric, but not transitive.
(b) reflexive and transitive, but not symmetric.
(c) symmetric and transitive, but not reflexive.
(d) reflexive, but neither symmetric nor transitive.
(e) symmetric, but neither reflexive nor transitive.
(f) transitive, but neither reflexive nor symmetric.

8.19. A relation R is defined on Z by x R y if x · y ≥ 0. Prove or disprove the following: (a) R is reflexive, (b) R is symmetric, (c) R is transitive.
8.20. Determine the maximum number of elements in a relation R on a 3-element set such that R has none of the properties reflexive, symmetric and transitive.
8.21. Prove or disprove: If there is a relation R1 on the set {a1, a2} that is not reflexive, not symmetric and not transitive, then there exists a relation R2 on the set {b1, b2, b3} that is not reflexive, not symmetric and not transitive.
8.22. Let S be the set of all polynomials of degree at most 3. An element s(x) of S can then be expressed as s(x) = ax³ + bx² + cx + d, where a, b, c, d ∈ R. A relation R is defined on S by p(x) R q(x) if p(x) and q(x) have a real root in common. (For example, p = (x − 1)² and q = x² − 1 have the root 1 in common, so p R q.) Determine which of the properties reflexive, symmetric and transitive are possessed by R.
8.23. A relation R is defined on N by a R b if a | b or b | a. Determine which of the properties reflexive, symmetric and transitive are possessed by R.

Section 8.3: Equivalence Relations

8.24. Let R be an equivalence relation on A = {a, b, c, d, e, f, g} such that a R c, c R d, d R g and b R f. If three distinct equivalence classes result from R, then determine these equivalence classes and determine all elements of R.
8.25. Let A = {1, 2, 3, 4, 5, 6}. The relation R = {(1, 1), (1, 5), (2, 2), (2, 3), (2, 6), (3, 2), (3, 3), (3, 6), (4, 4), (5, 1), (5, 5), (6, 2), (6, 3), (6, 6)} is an equivalence relation on A. Determine the distinct equivalence classes.
8.26. Let A = {1, 2, 3, 4, 5, 6}. The distinct equivalence classes resulting from an equivalence relation R on A are {1, 4, 5}, {2, 6} and {3}. What is R?
8.27. Let R be a relation defined on Z by a R b if a³ = b³. Show that R is an equivalence relation on Z and determine the distinct equivalence classes.
8.28. (a) Let R be the relation defined on Z by a R b if a + b is even. Show that R is an equivalence relation and determine the distinct equivalence classes. (b) Suppose that "even" is replaced by "odd" in (a). Which of the properties reflexive, symmetric and transitive does R possess?
8.29. Let R be an equivalence relation defined on a set A containing the elements a, b, c and d. Prove that if a R b, c R d and a R d, then b R c.
8.30. Let H = {2^m : m ∈ Z}. A relation R is defined on the set Q⁺ of positive rational numbers by a R b if a/b ∈ H. (a) Show that R is an equivalence relation. (b) Describe the elements of the equivalence class [3].


8.31. A relation R on a nonempty set A is defined to be circular if whenever x R y and y R z, then z R x for all x, y, z ∈ A. Prove that a relation R on A is an equivalence relation if and only if R is circular and reflexive.
8.32. A relation R is defined on the set A = {a + b√2 : a, b ∈ Q, a + b√2 ≠ 0} by x R y if x/y ∈ Q. Show that R is an equivalence relation and determine the distinct equivalence classes.
8.33. Let H = {4k : k ∈ Z}. A relation R is defined on Z by a R b if a − b ∈ H. (a) Show that R is an equivalence relation. (b) Determine the distinct equivalence classes.
8.34. Let H be a nonempty subset of Z. Suppose that the relation R defined on Z by a R b if a − b ∈ H is an equivalence relation. Verify the following: (a) 0 ∈ H. (b) If a ∈ H, then −a ∈ H. (c) If a, b ∈ H, then a + b ∈ H.
8.35. Prove or disprove: There exist equivalence relations R1 and R2 on the set S = {a, b, c} such that R1 ⊈ R2, R2 ⊈ R1 and R1 ∪ R2 = S × S.

Section 8.4: Properties of Equivalence Classes

8.36. Give an example of an equivalence relation R on the set A = {v, w, x, y, z} such that there are exactly three distinct equivalence classes. What are the equivalence classes for your example?
8.37. A relation R is defined on N by a R b if a² + b² is even. Prove that R is an equivalence relation. Determine the distinct equivalence classes.
8.38. Let R be a relation defined on the set N by a R b if either a | 2b or b | 2a. Prove or disprove: R is an equivalence relation.
8.39. Let S be a nonempty subset of Z and let R be a relation defined on S by x R y if 3 | (x + 2y). (a) Prove that R is an equivalence relation. (b) If S = {−7, −6, −2, 0, 1, 4, 5, 7}, then what are the distinct equivalence classes in this case?
8.40. A relation R is defined on Z by x R y if 3x − 7y is even. Prove that R is an equivalence relation. Determine the distinct equivalence classes.
8.41. (a) Prove that the intersection of two equivalence relations on a nonempty set is an equivalence relation. (b) Consider the equivalence relations R2 and R3 defined on Z by a R2 b if a ≡ b (mod 2) and a R3 b if a ≡ b (mod 3). By (a), R1 = R2 ∩ R3 is an equivalence relation on Z. Determine the distinct equivalence classes in R1.
8.42. Prove or disprove: The union of two equivalence relations on a nonempty set is an equivalence relation.
8.43. Let A = {u, v, w, x, y, z}. The relation R = {(u, u), (u, v), (u, w), (v, u), (v, v), (v, w), (w, u), (w, v), (w, w), (x, x), (x, y), (y, x), (y, y), (z, z)} defined on A is an equivalence relation. In particular, [u] = [v] = [w] = {u, v, w}, [x] = [y] = {x, y} and [z] = {z}; so |[u]| = |[v]| = |[w]| = 3 and |[x]| = |[y]| = 2, while |[z]| = 1. Therefore, |[u]| + |[v]| + |[w]| + |[x]| + |[y]| + |[z]| = 14. Let A = {a1, a2, ..., an} be an n-element set and let R be an equivalence relation defined on A. Prove that the sum |[a1]| + |[a2]| + ... + |[an]| is even if and only if n is even.


Section 8.5: Congruence Modulo n

8.44. Classify each of the following statements as true or false. (a) 25 ≡ 9 (mod 8), (b) −17 ≡ 9 (mod 8), (c) −14 ≡ −14 (mod 4), (d) 25 ≡ −3 (mod 11).
8.45. A relation R is defined on Z by a R b if 3a + 5b ≡ 0 (mod 8). Prove that R is an equivalence relation.
8.46. Let R be the relation defined on Z by a R b if a + b ≡ 0 (mod 3). Show that R is not an equivalence relation.
8.47. The relation R on Z defined by a R b if a² ≡ b² (mod 4) is known to be an equivalence relation. Determine the distinct equivalence classes.
8.48. The relation R defined on Z by x R y if x³ ≡ y³ (mod 4) is known to be an equivalence relation. Determine the distinct equivalence classes.
8.49. A relation R is defined on Z by a R b if 5a ≡ 2b (mod 3). Prove that R is an equivalence relation. Determine the distinct equivalence classes.
8.50. A relation R is defined on Z by a R b if 2a + 2b ≡ 0 (mod 4). Prove that R is an equivalence relation. Determine the distinct equivalence classes.
8.51. Let R be the relation defined on Z by a R b if 2a + 3b ≡ 0 (mod 5). Prove that R is an equivalence relation and determine the distinct equivalence classes.
8.52. Let R be the relation defined on Z by a R b if a² ≡ b² (mod 5). Prove that R is an equivalence relation and determine the distinct equivalence classes.
8.53. For an integer n ≥ 2, the relation R defined on Z by a R b if a ≡ b (mod n) is an equivalence relation. Equivalently, a R b if a − b = kn for some k ∈ Z. Define a relation R on the set R of real numbers by a R b if a − b = kπ for some k ∈ Z. Is R an equivalence relation? If not, explain why. If so, prove this and determine [0], [π] and [2].

Section 8.6: The Integers Modulo n

8.54. Construct the addition and multiplication tables for Z4 and Z5.
8.55. In Z8, express the following sums and products as [r], where 0 ≤ r < 8. (a) [2] + [6] (b) [2][6] (c) [−13] + [138] (d) [−13][138]
8.56. In Z11, express the following sums and products as [r], where 0 ≤ r < 11. (a) [7] + [5] (b) [7][5] (c) [−82] + [207] (d) [−82][207]
8.57. Let S = Z and T = {4k : k ∈ Z}. Thus T is a nonempty subset of S.
(a) Prove that T is closed under addition and multiplication.
(b) If a ∈ S − T and b ∈ T, is ab ∈ T?
(c) If a ∈ S − T and b ∈ T, is a + b ∈ T?
(d) If a, b ∈ S − T, is it possible that ab ∈ T?
(e) If a, b ∈ S − T, is it possible that a + b ∈ T?

8.58. Prove that the multiplication in Zn, n ≥ 2, defined by [a][b] = [ab] is well-defined. (See Result 4.11.)
8.59. (a) Let [a], [b] ∈ Z8. If [a] · [b] = [0], does it follow that [a] = [0] or [b] = [0]? (b) How is the question in (a) answered if Z8 is replaced by Z9? By Z10? By Z11? (c) For which integers n ≥ 2 is the following statement true? (You are only asked to make a conjecture, not to provide a proof.) Let [a], [b] ∈ Zn, n ≥ 2. If [a] · [b] = [0], then [a] = [0] or [b] = [0].
8.60. For integers m, n ≥ 2, consider Zm and Zn. Let [a] ∈ Zm, where 0 ≤ a ≤ m − 1. Then a, a + m ∈ [a] in Zm. If a, a + m ∈ [b] for some [b] ∈ Zn, then what can be said of m and n?


8.61. (a) For integers m, n ≥ 2 consider Zm and Zn . If an element of Zm also belongs to Zn, what can be said about Zm and Zn? (b) Are there examples of integers m, n ≥ 2 such that Zm ∩ Zn = ∅?

ADDITIONAL EXERCISES FOR CHAPTER 8

8.62. Prove or disprove: (a) There exists an integer a such that ab ≡ 0 (mod 3) for every integer b. (b) If a ∈ Z, then ab ≡ 0 (mod 3) for every b ∈ Z. (c) For every integer a, there exists an integer b such that ab ≡ 0 (mod 3).
8.63. A relation R is defined on R by a R b if a − b ∈ Z. Prove that R is an equivalence relation and determine the equivalence classes [1/2] and [√2].
8.64. A relation R is defined on Z by a R b if |a − 2| = |b − 2|. Prove that R is an equivalence relation and determine the distinct equivalence classes.
8.65. Let k and ℓ be integers such that k + ℓ ≡ 0 (mod 3) and let a, b ∈ Z. Prove that if a ≡ b (mod 3), then ka + ℓb ≡ 0 (mod 3).
8.66. State and prove a generalization of Exercise 8.65.
8.67. A relation R is defined on Z by a R b if 3 | (a³ − b). Prove or disprove the following: (a) R is reflexive. (b) R is transitive.
8.68. A relation R is defined on Z by a R b if a ≡ b (mod 2) and a ≡ b (mod 3). Prove or disprove: R is an equivalence relation on Z.
8.69. A relation R is defined on Z by a R b if a ≡ b (mod 2) or a ≡ b (mod 3). Prove or disprove: R is an equivalence relation on Z.
8.70. Determine each of the following. (a) [4]³ = [4][4][4] in Z5 (b) [7]⁵ in Z10
8.71. Let S = {(a, b) : a, b ∈ R, a ≠ 0}. (a) Show that the relation R defined on S by (a, b) R (c, d) if ad = bc is an equivalence relation. (b) Describe geometrically the elements of the equivalence classes [(1, 2)] and [(3, 0)].
8.72. In Exercise 8.19 a relation R was defined on Z by x R y if x · y ≥ 0, and we were asked to determine which of the properties reflexive, symmetric and transitive are satisfied. (a) How would our answers have changed if x · y ≥ 0 had been replaced by: (i) x · y ≤ 0, (ii) x · y > 0, (iii) x · y = 0, (iv) x · y ≥ 1, (v) x · y is odd, (vi) x · y is even, (vii) x · y ≡ 2 (mod 3)? (b) What are some additional questions you could ask?
8.73. For the following statement S and proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect or (3) S is false (and the proof is incorrect). Explain which of these occurs.
S: Every symmetric and transitive relation on a nonempty set is an equivalence relation.
Proof   Let R be a symmetric and transitive relation defined on a nonempty set A. We need only show that R is reflexive. Let x ∈ A. We show that x R x. Let y ∈ A such that x R y. Since R is symmetric, y R x. Now x R y and y R x. Since R is transitive, x R x. Thus R is reflexive.
8.74. Evaluate the proposed proof of the following result.
Result   A relation R is defined on Z by a R b if 3 | (a + 2b). Then R is an equivalence relation.


Proof   Assume that a R a. Then 3 | (a + 2a). Since a + 2a = 3a and a ∈ Z, it follows that 3 | 3a or 3 | (a + 2a). Therefore, a R a and R is reflexive. Next, we show that R is symmetric. Assume that a R b. Then 3 | (a + 2b). So a + 2b = 3x, where x ∈ Z. Hence a = 3x − 2b. Therefore, b + 2a = b + 2(3x − 2b) = b + 6x − 4b = 6x − 3b = 3(2x − b). Since 2x − b is an integer, 3 | (b + 2a). So b R a and R is symmetric. Finally, we show that R is transitive. Assume that a R b and b R c. Then 3 | (a + 2b) and 3 | (b + 2c). So a + 2b = 3x and b + 2c = 3y, where x, y ∈ Z. Adding, we have (a + 2b) + (b + 2c) = 3x + 3y. So a + 2c = 3x + 3y − 3b = 3(x + y − b). Since x + y − b is an integer, 3 | (a + 2c). Hence a R c and R is transitive.
8.75. (a) Show that the relation R defined on R × R by (a, b) R (c, d) if |a| + |b| = |c| + |d| is an equivalence relation. (b) Describe geometrically the elements of the equivalence classes [(1, 2)] and [(3, 0)].
8.76. Let x ∈ Zm and y ∈ Zn, where m, n ≥ 2. If x ⊆ y, then what can be said about m and n?
8.77. Let A be a nonempty set and let B be a fixed subset of A. A relation R is defined on P(A) by X R Y if X ∩ B = Y ∩ B. (a) Prove that R is an equivalence relation. (b) Let A = {1, 2, 3, 4} and B = {1, 3, 4}. For X = {2, 3, 4}, determine [X].
8.78. Let R1 and R2 be equivalence relations on a nonempty set A. Prove or disprove each of the following. (a) If R1 ∩ R2 is reflexive, then so are R1 and R2. (b) If R1 ∩ R2 is symmetric, then so are R1 and R2. (c) If R1 ∩ R2 is transitive, then so are R1 and R2.
8.79. Prove that if R is an equivalence relation on a set A, then the inverse relation R⁻¹ is an equivalence relation on A.
8.80. Let R1 and R2 be equivalence relations on a nonempty set A. A relation R = R1R2 is defined on A as follows: For a, b ∈ A, a R b if there exists c ∈ A such that a R1 c and c R2 b. Prove or disprove: R is an equivalence relation on A.
8.81. A relation R on a nonempty set S is called sequential if for every sequence x, y, z of elements of S (distinct or not), at least one of the ordered pairs (x, y) and (y, z) belongs to R. Prove or disprove: Every symmetric, sequential relation on a nonempty set is an equivalence relation.
8.82. Consider the subset H = {[3k] : k ∈ Z} of Z12. (a) Determine the distinct elements of H and construct an addition table for H. (b) A relation R on Z12 is defined by [a] R [b] if [a − b] ∈ H. Show that R is an equivalence relation and determine the distinct equivalence classes.
8.83. For elements a, b ∈ Zn, n ≥ 2, a = [c] and b = [d] for some integers c and d. Define a − b = [c] − [d] as the equivalence class [c − d]. Let H = {x1, x2, ..., xd} be a subset of Zn, n ≥ 2, such that a relation R defined on Zn by a R b if a − b ∈ H is an equivalence relation. (a) For each a ∈ Zn, determine the equivalence class [a] and show that [a] consists of d elements. (b) Prove that d | n.

9

Functions

If R is a relation from a set A to a set B and x is an element of A, then either x is related to no elements of B or x is related to at least one element of B. In the latter case, it may occur that x is related to all elements of B or perhaps to exactly one element of B. If every element of A is related to no element of B, then R is the empty set ∅. If every element of A is related to every element of B, then R is the Cartesian product A × B. However, if every element of A is related to exactly one element of B, then we have the most studied relation of all: the function. You have probably encountered functions before, at least in calculus and precalculus. But it is likely that you have not studied functions in the manner we are about to describe here.

9.1 The Definition of Function

Let A and B be nonempty sets. By a function f from A to B, written f : A → B, we mean a relation from A to B with the property that every element a in A is the first coordinate of exactly one ordered pair in f. Since f is a relation, the set A in this case is the domain of f, denoted by dom(f). The set B is called the codomain of f.

For a function f : A → B, let (a, b) ∈ f. Since f contains only one ordered pair whose first coordinate is a, it follows that b is the unique second coordinate of an ordered pair whose first coordinate is a; that is, if (a, b) ∈ f and (a, c) ∈ f, then b = c. If (a, b) ∈ f, then we write b = f(a) and refer to b as the image of a. Sometimes f is said to map a into b. Indeed, f itself is sometimes called a mapping. The set

range(f) = {b ∈ B : b is an image under f of some element of A} = {f(x) : x ∈ A}

is the range of f and consists of the second coordinates of the elements of f. If A is a finite set, then the function f is a finite set and the number of elements in f is |A| since there is exactly one ordered pair in f corresponding to each element of A. Throughout this chapter, as in previous chapters, whenever we refer to cardinalities of sets, we are concerned with finite sets only.

Suppose that f : A → B and g : A → B are two functions from A to B and a ∈ A. Then f and g each contain exactly one ordered pair having a as its first coordinate, say (a, x) ∈ f and (a, y) ∈ g. If the sets f and g are equal, then (a, x) belongs to g as well. Since g contains only one ordered pair whose first coordinate is a, it follows that (a, x) = (a, y). But this implies that x = y, that is, f(a) = g(a). Therefore, it is natural to define two


functions f : A → B and g : A → B to be equal, written f = g, if f(a) = g(a) for all a ∈ A.

Let A = {1, 2, 3} and B = {w, x, y, z}. Then f1 = {(1, y), (2, w), (3, y)} is a function from A to B and so we may write f1 : A → B. On the other hand, f2 = {(1, x), (2, z), (3, y), (2, x)} is not a function since there are two ordered pairs whose first coordinate is 2. In addition, f3 = {(1, z), (3, x)} is not a function from A to B either because dom(f3) ≠ A. On the other hand, f3 is a function from A − {2} to B.

It is often convenient to "visualize" a function f : A → B by representing the two sets A and B by diagrams and drawing an arrow (a directed line segment) from an element x ∈ A to its image f(x) ∈ B. This is illustrated for the function f1 described above in Figure 9.1. Therefore, to represent a function in this manner, exactly one directed line segment must leave each element of A and proceed to an element of B.

In calculus, functions such as f(x) = x² are considered. This function f is from R to R, that is, A = R and B = R. Although f(x) = x² is commonly referred to as a function in calculus and elsewhere, strictly speaking f(x) is the image of a real number x under f. The function f itself is actually the set f = {(x, x²) : x ∈ R}. So, for example, (2, 4) and (−3, 9) belong to f. The set {(x, x²) : x ∈ R} of points in the plane is the graph of f. In this case, the graph is a parabola. Here the function f : R → R defined by f(x) = x² can also be thought of as being defined by a rule, namely the rule that associates the number x² with each real number x.

Example 9.1   Another function encountered in calculus is g(x) = e^x. As mentioned above, this function is actually the set g = {(x, e^x) : x ∈ R}. More precisely, this is the function g : R → R defined by g(x) = e^x for all x ∈ R. In general, we will follow this latter convention for defining functions that are often described by some rule or formula. Therefore, the function h(x) = 1/(x − 1) from calculus becomes the function h : R − {1} → R defined by h(x) = 1/(x − 1) for all x ∈ R with x ≠ 1; and the function φ(x) = ln x becomes the function φ : R⁺ → R defined by φ(x) = ln x for all x ∈ R⁺, where, recall, R⁺ is the set of all positive real numbers.

Figure 9.1   A function f1 : A → B (arrow diagram with 1 → y, 2 → w, 3 → y)

For a function f : A → B and a subset C of A, the image f(C) of C is defined as f(C) = {f(x) : x ∈ C}. Hence f(C) ⊆ B for every subset C of A. If C = A, then f(A) is the range of f.

Example 9.2   For A = {a, b, c, d, e} and B = {1, 2, ..., 6}, f = {(a, 3), (b, 5), (c, 2), (d, 3), (e, 6)} is a function from A to B. For C1 = {a, b, c}, C2 = {a, d}, C3 = {e} and C4 = A,

f(C1) = {2, 3, 5}, f(C2) = {3}, f(C3) = {6}, f(C4) = range(f) = {2, 3, 5, 6}.

For a function f : A → B and a subset D of B, the inverse image f⁻¹(D) of D is defined as f⁻¹(D) = {a ∈ A : f(a) ∈ D}. Hence f⁻¹(D) ⊆ A for every subset D of B. Necessarily then, f⁻¹(B) = A. In particular, for an element b ∈ B, f⁻¹({b}) = {a ∈ A : f(a) = b}.

Example 9.3   For the function f : A = {a, b, c, d, e} → B = {1, 2, ..., 6} defined in Example 9.2 by f = {(a, 3), (b, 5), (c, 2), (d, 3), (e, 6)}, it follows that

f⁻¹(B) = A, f⁻¹({3}) = {a, d}, f⁻¹({1, 3}) = {a, d} and f⁻¹({4}) = ∅.
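The computations in Examples 9.2 and 9.3 can be reproduced mechanically. Here is a minimal Python sketch, assuming that a function on a finite set is stored as a dictionary whose keys are the elements of the domain and whose values are their images. The helper names image and inverse_image are illustrative and do not come from the text.

```python
# The function f of Example 9.2, stored as {element of A: its image in B}.
f = {"a": 3, "b": 5, "c": 2, "d": 3, "e": 6}
B = {1, 2, 3, 4, 5, 6}

def image(f, C):
    """f(C) = {f(x) : x in C} for a subset C of the domain."""
    return {f[x] for x in C}

def inverse_image(f, D):
    """The inverse image {a in A : f(a) in D} for a subset D of the codomain."""
    return {a for a in f if f[a] in D}

print(image(f, {"a", "b", "c"}) == {2, 3, 5})        # True
print(image(f, set(f)) == {2, 3, 5, 6})              # True: the range of f
print(inverse_image(f, {1, 3}) == {"a", "d"})        # True
print(inverse_image(f, {4}) == set())                # True: no element maps to 4
print(inverse_image(f, B) == set(f))                 # True: the inverse image of B is A
```

Storing f as a dictionary automatically enforces the defining property of a function, since each key can have only one value.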

Among the many classes of functions encountered in calculus are the polynomial functions, rational functions and exponential functions. The function f : R → R defined earlier by f(x) = x² for x ∈ R is a polynomial function. The function h in Example 9.1 is a rational function, and g is an exponential function. Other important classes of functions often encountered in calculus are the continuous functions and the differentiable functions.

The definition of function that we have given is most likely not the definition you recall from calculus; indeed, you may not recall the definition of function given in calculus at all. If this is the case, then it is not surprising. The evolution of what is meant by a function has taken place over several hundred years. As calculus developed, the need for a formal definition of function became apparent. Early in the 18th century, the Swiss mathematician Johann Bernoulli wrote:

I call a function of a variable magnitude a quantity composed in any manner whatsoever from this variable magnitude and from constants.


Later in the 18th century, the famous Swiss mathematician Leonhard Euler studied calculus as a theory of functions and did not appeal to diagrams and geometric interpretations, as many of his predecessors had done. The definition of function given by Euler in his work on calculus is:

A function of a variable quantity is an analytic expression composed in any way whatsoever of the variable quantity and numbers or constant quantities.

In the early 19th century, the German mathematician Peter Dirichlet developed a more modern definition of function:

y is a function of x when to each value of x in a given interval there corresponds a unique value of y.

Dirichlet said that it did not matter whether y depends on x according to some formula, law or mathematical operation. He emphasized this by considering the function f : R → R defined by

f(x) = 1 if x is rational, and f(x) = 0 if x is irrational.

Later in the 19th century, the German mathematician Richard Dedekind wrote:

A function φ on a set S is a law according to which to every determinate element s of S there belongs a determinate thing which is called the transform of s and is denoted by φ(s).

So, by this time, the modern definition of function had nearly arrived.

9.2 The Set of All Functions from A to B

For nonempty sets A and B, we denote the set of all functions from A to B by B^A. That is, B^A = {f : f is a function from A to B}, or more simply B^A = {f : f : A → B}. Although this may seem like peculiar notation, it is actually quite logical. In particular, let us determine B^A for A = {a, b} and B = {x, y, z}. Each function f from A to B is necessarily of the form f = {(a, α), (b, β)}, where α, β ∈ B. Since there are three choices for α and three choices for β, the total number of such functions f is 3 · 3 = 3² = 9. These nine functions are listed below:

f1 = {(a, x), (b, x)}, f2 = {(a, x), (b, y)}, f3 = {(a, x), (b, z)},
f4 = {(a, y), (b, x)}, f5 = {(a, y), (b, y)}, f6 = {(a, y), (b, z)},
f7 = {(a, z), (b, x)}, f8 = {(a, z), (b, y)}, f9 = {(a, z), (b, z)}.

Hence the number of elements in B^A is 3². In general, for finite sets A and B, the number of functions from A to B is |B^A| = |B|^|A|. If B = {0, 1}, then it is common to represent the set of all functions from A to B by 2^A.
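The counting argument above (one independent choice of image for each element of A) translates directly into an enumeration. The following Python sketch is one possible illustration. It assumes functions are stored as dictionaries, and the name all_functions is a hypothetical helper chosen for this example.

```python
from itertools import product

def all_functions(A, B):
    """List every function from A to B, each as a dict {element of A: image in B}.

    product(B, repeat=len(A)) runs through every way of choosing one image
    in B for each element of A, which yields |B|**|A| functions in total.
    """
    A = list(A)
    return [dict(zip(A, images)) for images in product(B, repeat=len(A))]

functions = all_functions(["a", "b"], ["x", "y", "z"])
print(len(functions))                          # 9, that is, 3**2
print(functions[0])                            # {'a': 'x', 'b': 'x'}, the function f1
print(len(all_functions(range(3), [0, 1])))    # 8, that is, 2**3
```

With this ordering of choices, the list follows the numbering f1, f2, ..., f9 used in the text.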


9.3 One-to-One and Onto Functions

We now consider two important properties that a function may possess. A function f from a set A to a set B is called one-to-one if every two distinct elements of A have distinct images in B. In symbols, a function f : A → B is one-to-one if whenever x, y ∈ A and x ≠ y, then f(x) ≠ f(y). Thus, if a function f : A → B is not one-to-one, then there exist distinct elements w and z in A such that f(w) = f(z).

Let A = {a, b, c, d}, B = {r, s, t, u, v} and C = {x, y, z}. Then f1 = {(a, s), (b, u), (c, v), (d, r)} is a one-to-one function from A to B since distinct elements of A have distinct images in B; while the function f2 = {(a, s), (b, t), (c, s), (d, u)} from A to B is not one-to-one since a and c have the same image, namely s. There is, however, no one-to-one function from A to C. In general, for a function f : A → B to be one-to-one, where A and B are finite sets, every two elements of A must have distinct images in B and so there must be at least as many elements in B as in A, that is, |A| ≤ |B|.

The definition of a one-to-one function is sometimes difficult to work with since it deals with unequal elements. However, there is a useful equivalent formulation of the definition using the contrapositive: A function f : A → B is one-to-one if whenever f(x) = f(y), where x, y ∈ A, then x = y. We show how this formulation can be applied to functions defined by formulas.

Result 9.4   The function f : R → R defined by f(x) = 3x − 5 is one-to-one.

Proof   Assume that f(a) = f(b), where a, b ∈ R. Then 3a − 5 = 3b − 5. Adding 5 to both sides, we obtain 3a = 3b. Dividing by 3, we have a = b and so f is one-to-one.

Example 9.5   Let the function f : R → R be defined by f(x) = x² − 3x − 2. Determine whether f is one-to-one.

Solution   Since f(0) = −2 and f(3) = −2, it follows that f is not one-to-one.

Analysis

To show that the function f defined in Example 9.5 is not unique, we have to show that under f there are two different real numbers with the same image. This was done by showing that f(0) = f(3). But what if we can't find two real numbers with this property? Of course, if we can't find two of these numbers, we might think that f is one to one. In this case we have to try to prove that f is one-to-one. We would probably start such a proof by assuming that f(a) = f(b), so a 2 − 3a − 2 = b2 − 3b − 2. We would then try to show that a = b. We can simplify a 2 − 3a − 2 = b2 − 3b − 2 by adding 2 to both sides, resulting in a 2 − 3a = b2 − 3b. When trying to solve an equation, it's often convenient to collect all the terms on one side of the equation

9.3

One-to-One and Onto functions

221

with 0 on the other side. Rewriting this equation, we have a² − 3a − b² + 3b = 0. Rearranging some terms and factoring, we obtain

a² − 3a − b² + 3b = (a² − b²) − 3(a − b) = (a − b)(a + b) − 3(a − b) = (a − b)(a + b − 3) = 0.

Hence if f(a) = f(b), then (a − b)(a + b − 3) = 0. It follows that either a − b = 0 (and so a = b) or a + b − 3 = 0. Therefore, f(a) = f(b) does not imply that a = b. It only implies that a = b or a + b = 3. Since 0 + 3 = 3, we now see why f(0) = f(3). Indeed, if a and b are any two real numbers with a + b = 3, then f(a) = f(b). This tells us how to find all counterexamples to the statement: f is one-to-one. Looking at f(x) = x² − 3x − 2 once more, we see that f(x) = x(x − 3) − 2. Since x(x − 3) = 0 when x = 0 or x = 3, it is now more apparent why 0 and 3 are numbers for which f(0) = f(3).

A function f : A → B is called onto or surjective if every element of the codomain B is the image of some element of A. Equivalently, f is onto if f(A) = B. A function we considered earlier was f1 : A → B, where A = {1, 2, 3}, B = {x, y, z, w} and f1 = {(1, y), (2, w), (3, y)}. This function f1 is not onto since neither x nor z is an image of any element of A. You may have noticed that for these two sets A and B, there is no onto function from A to B, since every function from A to B has exactly three ordered pairs but B has four elements. Thus if A and B are finite sets and f : A → B is a surjective function, then |B| ≤ |A|. The function g : B → A with g = {(x, 3), (y, 1), (z, 3), (w, 2)} is a surjective function since each of the elements 1, 2 and 3 is an image of some element of B. Next we determine which of the functions defined in Result 9.4 and Example 9.5 are onto.

Result to Prove

The function f : R → R defined by f(x) = 3x − 5 is onto.

PROOF STRATEGY

Let's make a few remarks before we begin the proof. To show that f is onto, we must show that every element in the codomain B = R is the image of some element in the domain A = R. Since f(0) = −5 and f(1) = −2, certainly −5 and −2 are images of elements of R. The real number 10 is also an image since f(5) = 10. Is π an image of some real number? To answer this question, we need to determine whether there is a real number x such that f(x) = π. Since f(x) = 3x − 5, we need only find a solution for x of the equation 3x − 5 = π. Solving this equation for x, we find that x = (π + 5)/3, which is certainly a real number. Finally, observe that

f(x) = f((π + 5)/3) = 3((π + 5)/3) − 5 = π.

This discussion gives us the information we need to prove that f is onto since, for an arbitrary real number r, we need to find a real number x such that f(x) = r. But then 3x − 5 = r and x = (r + 5)/3.
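The two observations above can be spot-checked mechanically. The following is a minimal sketch, assuming exact rational arithmetic (Python's fractions module) as a stand-in for the real numbers; the names f, q, preimage, targets, onto_checks and collisions are illustrative and do not come from the text. A finite sample of course proves nothing; it only illustrates what the proofs establish.

```python
from fractions import Fraction

def f(x):
    """f(x) = 3x - 5, the function to be shown onto."""
    return 3 * x - 5

def preimage(r):
    """Solve 3x - 5 = r for x, giving x = (r + 5)/3."""
    return (Fraction(r) + 5) / 3

def q(x):
    """q(x) = x^2 - 3x - 2, the function of Example 9.5 (called f in the text)."""
    return x * x - 3 * x - 2

# Onto: each sampled target r is the image of x = (r + 5)/3.
targets = [Fraction(-5), Fraction(-2), Fraction(10), Fraction(22, 7)]
onto_checks = [f(preimage(r)) == r for r in targets]

# Not one-to-one: q(a) = q(b) whenever a + b = 3, for example a = 0 and b = 3.
collisions = [(a, 3 - a) for a in targets if q(a) == q(3 - a)]
```

Every sampled pair (a, 3 − a) collides under q, in line with the factorization (a − b)(a + b − 3) = 0.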

Result 9.6

The function f : R → R defined by f(x) = 3x − 5 is onto.

Proof

Let r ∈ R. We show that there exists x ∈ R such that f(x) = r. Choose x = (r + 5)/3. Then x ∈ R and

f(x) = f((r + 5)/3) = 3((r + 5)/3) − 5 = r.

PROOF ANALYSIS

Notice that the proof of Result 9.6 itself does not include consideration of the equation 3x − 5 = r. Our goal was to show that some real number x exists such that f(x) = r. How we obtain this number, though possibly interesting, is not part of the proof. On the other hand, it may be a good idea to accompany the proof with this information.

Let A = {1, 2, 3}, B = {x, y, z, w} and C = {a, b, c}. Four functions g1 : A → B, g2 : B → C, g3 : A → C and g4 : A → C are defined as follows:

g1 = {(1, y), (2, w), (3, x)},
g2 = {(x, b), (y, a), (z, c), (w, b)},
g3 = {(1, a), (2, c), (3, b)},
g4 = {(1, b), (2, b), (3, b)}.

The functions g1 and g3 are one-to-one, while g2 and g4 are not one-to-one since g2(x) = g2(w) = b and g4(1) = g4(2) = b. Both g2 and g3 are onto. The function g1 is not onto because z is not an image of any element of A, while g4 is not onto since neither a nor c is an image of any element of A.
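For finite sets, the two properties can be tested directly from the definitions. The sketch below represents each of g1 through g4 as a Python dictionary; the helper names is_injective and is_surjective and the summary table are illustrative choices, not notation from the text.

```python
def is_injective(func):
    """A finite function (dict) is one-to-one if no image is repeated."""
    images = list(func.values())
    return len(images) == len(set(images))

def is_surjective(func, codomain):
    """A finite function (dict) is onto if its range equals the codomain."""
    return set(func.values()) == set(codomain)

A = {1, 2, 3}
B = {"x", "y", "z", "w"}
C = {"a", "b", "c"}

g1 = {1: "y", 2: "w", 3: "x"}                  # A -> B
g2 = {"x": "b", "y": "a", "z": "c", "w": "b"}  # B -> C
g3 = {1: "a", 2: "c", 3: "b"}                  # A -> C
g4 = {1: "b", 2: "b", 3: "b"}                  # A -> C

# (one-to-one?, onto?) for each function
summary = {
    "g1": (is_injective(g1), is_surjective(g1, B)),
    "g2": (is_injective(g2), is_surjective(g2, C)),
    "g3": (is_injective(g3), is_surjective(g3, C)),
    "g4": (is_injective(g4), is_surjective(g4, C)),
}
```

The table agrees with the discussion: only g3 has both properties, and g4 has neither.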

9.4 Bijective Functions

We have already mentioned for finite sets A and B that if f : A → B is a surjective function, then |A| ≥ |B|. We also mentioned that if f : A → B is one-to-one, then |A| ≤ |B|. Hence if A and B are finite sets and there exists a function f : A → B that is both one-to-one and onto, then |A| = |B|. What happens when A and B are infinite sets is discussed in detail in Chapter 10.

A function f : A → B is called bijective or a one-to-one correspondence if it is both one-to-one and onto. As we have already mentioned, if a function f : A → B is bijective and A and B are finite sets, then |A| = |B|. Perhaps it is also clear that if A and B are finite sets with |A| = |B|, then there exists a bijective function f : A → B. A bijective function from a set A to a set B produces a pairing of the elements of A with the elements of B. In the case where A and B are sets with |A| = |B| = 3, say A = {a, b, c} and B = {x, y, z}, the bijective functions from A to B are

f1 = {(a, x), (b, y), (c, z)}   f2 = {(a, y), (b, z), (c, x)}   f3 = {(a, z), (b, x), (c, y)}
f4 = {(a, y), (b, x), (c, z)}   f5 = {(a, z), (b, y), (c, x)}   f6 = {(a, x), (b, z), (c, y)}.


That is, there are six bijective functions from A to B. In fact, there are six bijective functions from any 3-element set to any 3-element set. More generally, we have the following. (See Exercise 9.34.)

Theorem 9.7

If A and B are finite sets with |A| = |B| = n, then there are n! bijective functions from A to B.

Proof

Suppose that A = {a1, a2, . . . , an}. Then every bijective function f : A → B can be expressed as f = {(a1, −), (a2, −), . . . , (an, −)}, where the second coordinate of each ordered pair of f belongs to B. There are n possible images for a1 in f. Once an image for a1 has been determined, there are n − 1 possible images for a2, since f is one-to-one and so no element of B can be the image of two elements of A. Since neither the image of a1 nor the image of a2 can be an image of a3, there are n − 2 possibilities for a3. Continuing in this manner, we see that there is only one possibility for the image of an. The total number of possible bijective functions f is obtained by multiplying these numbers, and so there are n(n − 1)(n − 2) · · · 1 = n! bijective functions from A to B.

There is another interesting fact concerning the existence of bijective functions f : A → B for finite sets A and B with |A| = |B|.

Theorem 9.8

Let A and B be nonempty finite sets with |A| = |B| and let f be a function from A to B. Then f is one-to-one if and only if f is onto.

Proof

Let |A| = |B| = n. Assume first that f is one-to-one. Since the n elements of A have distinct images, there are n distinct images. Thus range(f) = B and so f is onto. For the converse, assume that f is onto. Thus each of the n elements of B is an image of some element of A. Consequently, the n elements of A have n distinct images in B, which implies that no two distinct elements of A can have the same image and so f is one-to-one.

Theorem 9.8 concerns finite sets A and B with |A| = |B|. Although we have not defined cardinality for infinite sets, we would certainly expect that |A| = |A| for every infinite set A. With this understanding, Theorem 9.8 is false for infinite sets A and B, even when A = B. For example, the function f : Z → Z defined by f(n) = 2n is one-to-one; however, its range is the set of all even integers. That is, f is not onto even though f is a one-to-one function from Z to Z. The function g : N → N defined by g(n) = n − 1 when n ≥ 2 and g(1) = 1 is onto but not one-to-one since g(1) = g(2) = 1.

For the sets A = {1, 2, 3}, B = {x, y, z, w} and C = {a, b, c} described at the end of Section 9.3, no function from A to B or from B to C can be bijective. However, it is possible to have a bijective function from A to C since |A| = |C|. Indeed, g3 is such a function, although there are other bijective functions from A to C. Of course, not every function from A to C is bijective, as g4 illustrates.
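Theorems 9.7 and 9.8 can be confirmed by brute force when n is small. The following sketch enumerates every function between two n-element sets; the sets are taken to be {0, 1, . . . , n − 1} purely for convenience, and the helper names (all_functions, check, results) are hypothetical.

```python
from itertools import product

def all_functions(domain, codomain):
    """Yield every function from domain to codomain as a dict."""
    domain = list(domain)
    for images in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, images))

def is_injective(func):
    return len(set(func.values())) == len(func)

def is_surjective(func, codomain):
    return set(func.values()) == set(codomain)

def check(n):
    """Return (number of bijections, whether one-to-one <=> onto held throughout)."""
    A = range(n)
    B = range(n)
    bijections = 0
    equivalent = True
    for func in all_functions(A, B):
        inj = is_injective(func)
        sur = is_surjective(func, B)
        equivalent = equivalent and (inj == sur)
        if inj and sur:
            bijections += 1
    return bijections, equivalent

# n = 1, 2, 3, 4 involve 1, 4, 27 and 256 functions respectively.
results = {n: check(n) for n in range(1, 5)}
```

For n = 3 this finds the six bijections listed earlier, and in every case a function turns out to be one-to-one exactly when it is onto.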

For a nonempty set A, the function i_A : A → A defined by i_A(a) = a for each a ∈ A is called the identity function on A. If the set A under discussion is clear, we write the identity function i_A simply as i. For S = {1, 2, 3}, the identity function is i_S = i = {(1, 1), (2, 2), (3, 3)}. Not only is this identity function bijective, the identity function i_A is bijective for every nonempty set A. Identity functions are important and we will see them again soon. We give one additional example of a bijective function.

Result 9.9

The function f : R − {2} → R − {3} defined by f(x) = 3x/(x − 2) is bijective.

Proof

Here it is necessary to show that f is both one-to-one and onto. We begin with the former. Assume that f(a) = f(b), where a, b ∈ R − {2}. Then 3a/(a − 2) = 3b/(b − 2). Multiplying both sides by (a − 2)(b − 2), we obtain 3a(b − 2) = 3b(a − 2). Simplifying, we have 3ab − 6a = 3ab − 6b. Adding −3ab to both sides and dividing by −6, we obtain a = b. Thus f is one-to-one.

To show that f is onto, let r ∈ R − {3}. We show that there exists x ∈ R − {2} such that f(x) = r. Choose x = 2r/(r − 3). Then

f(x) = f(2r/(r − 3)) = 3(2r/(r − 3)) / (2r/(r − 3) − 2) = 6r / (2r − 2(r − 3)) = 6r/6 = r,

implying that f is onto. Therefore, f is bijective.
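The computation in this proof can be spot-checked with exact arithmetic. This is only a sketch on a finite sample of rational inputs, assuming Python's Fraction type as a stand-in for real numbers; the names preimage, targets and inputs are illustrative.

```python
from fractions import Fraction

def f(x):
    """f(x) = 3x/(x - 2), defined for x != 2."""
    return 3 * x / (x - 2)

def preimage(r):
    """x = 2r/(r - 3), defined for r != 3."""
    return 2 * r / (r - 3)

# Sample targets r in R - {3}: quarters from -5 to 5, skipping r = 3.
targets = [Fraction(n, 4) for n in range(-20, 21) if n != 12]
onto_ok = all(preimage(r) != 2 and f(preimage(r)) == r for r in targets)

# Sample inputs x in R - {2}: distinct inputs should give distinct images,
# and the value 3 should never occur as an image.
inputs = [Fraction(n, 4) for n in range(-20, 21) if n != 8]
images = [f(x) for x in inputs]
injective_ok = len(set(images)) == len(inputs)
never_three = 3 not in images
```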

PROOF ANALYSIS

Some remarks concerning the proof that the function f in Result 9.9 is onto may be useful. For a given real number r in R − {3}, we needed to find a real number x in R − {2} such that f(x) = r. Since we wanted f(x) = 3x/(x − 2) = r, it was necessary to solve this equation for x. This can be accomplished by rewriting the equation as 3x = r(x − 2) and then simplifying it to obtain rx − 3x = 2r. Now, factoring x from rx − 3x and dividing by r − 3, we have the desired choice of x, namely x = 2r/(r − 3). Incidentally, it was perfectly permissible to divide by r − 3 since r ∈ R − {3} and so r ≠ 3. Also, notice that x ∈ R − {2}, for if x = 2r/(r − 3) = 2, then 2r = 2r − 6, which is impossible. Again, although solving 3x/(x − 2) = r for x is not part of the proof, it may be useful to include this scratch work in addition to the proof.

Suppose that f is a function from A to B, that is, f : A → B. If f(x) = f(y) implies that x = y for all x, y ∈ A, then f is one-to-one. It may seem obvious that if x = y, then f(x) = f(y) for all x, y ∈ A, since this is simply a requirement of a function. For a relation f from a set A to a set B to be a function from A to B, the following two conditions must be satisfied:

(1) For each element a ∈ A, there exists an element b ∈ B such that (a, b) ∈ f.
(2) If (a, b), (a, c) ∈ f, then b = c.


Condition (1) states that the domain of f is A, that is, every element of A has an image in B, while condition (2) says that if an element of A has an image in B, then this image is unique. Occasionally, a function f that satisfies condition (2) is called well-defined. Since (2) is a requirement of every function, however, it follows that every function must be well-defined. There are situations, though, in which the definition of a function f makes it unclear whether f is well-defined. This can often occur when a function is defined on the set of equivalence classes of an equivalence relation. The next result illustrates this with the equivalence classes for the relation of congruence modulo 4 on the set of integers.

Result to Prove

The function f : Z4 → Z4 defined by f([x]) = [3x + 1] is a well-defined bijective function.

PROOF STRATEGY

To prove that this function is well-defined, we must show that if [a] = [b], then f([a]) = f([b]), that is, [3a + 1] = [3b + 1]. It seems reasonable to use a direct proof, so we assume that [a] = [b]. Since [a] and [b] are elements of Z4, the statement [a] = [b] means that a ≡ b (mod 4). Since a ≡ b (mod 4), it follows that 4 | (a − b) and so a − b = 4k for some integer k. To verify that [3a + 1] = [3b + 1], we need to show that 3a + 1 ≡ 3b + 1 (mod 4) or, equivalently, that (3a + 1) − (3b + 1) = 3a − 3b = 3(a − b) is a multiple of 4.

Since Z4 consists of only four elements, namely [0], [1], [2], [3], to prove that f is bijective we need only observe that the elements f([0]), f([1]), f([2]), f([3]) are distinct.

Result 9.10

The function f : Z4 → Z4 defined by f ([x]) = [3x + 1] is a well-defined bijective function.

Proof

First we verify that this function is well-defined; that is, if [a] = [b], then f([a]) = f([b]). Assume then that [a] = [b]. Thus a ≡ b (mod 4) and so 4 | (a − b). Hence a − b = 4k for some integer k. Therefore, (3a + 1) − (3b + 1) = 3(a − b) = 3(4k) = 4(3k). Since 3k is an integer, 4 | [(3a + 1) − (3b + 1)]. Thus 3a + 1 ≡ 3b + 1 (mod 4) and [3a + 1] = [3b + 1]; so f([a]) = f([b]). Hence f is well-defined. Since f([0]) = [1], f([1]) = [0], f([2]) = [3] and f([3]) = [2], it follows that f is both one-to-one and onto; that is, f is bijective.
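The notion of being well-defined becomes concrete when an equivalence class is represented by one of its integers. The sketch below applies the rule to representatives and checks that congruent representatives always land in the same class. For contrast it also tests a second rule, [x] → [⌊x/2⌋], which is an assumption introduced here for illustration and does not appear in the text; that rule fails the test.

```python
n = 4

def f_rep(x):
    """Apply [x] -> [3x + 1] to a representative x, reduced mod 4."""
    return (3 * x + 1) % n

def half_rep(x):
    """Hypothetical rule [x] -> [floor(x/2)], used only as a contrast."""
    return (x // 2) % n

def respects_classes(rule, bound=20):
    """True if congruent representatives in [-bound, bound] get equal images."""
    reps = range(-bound, bound + 1)
    return all(rule(a) == rule(b) for a in reps for b in reps if (a - b) % n == 0)

well_defined = respects_classes(f_rep)
half_well_defined = respects_classes(half_rep)   # fails: [0] = [4], yet 0 -> [0] and 4 -> [2]

# Images of [0], [1], [2], [3] under f, using representatives 0..3.
table = {r: f_rep(r) for r in range(n)}
bijective = sorted(table.values()) == list(range(n))
```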

9.5 Composition of Functions

Just as it is possible to define operations on certain sets of numbers (and on the set Zn of equivalence classes, as described in Chapter 8), it is possible, under suitable circumstances, to define operations on certain sets of functions. For example, for functions f : R → R and g : R → R, you may recall from calculus that the sum f + g and the product fg of f and g are defined by

(f + g)(x) = f(x) + g(x) and (fg)(x) = f(x) · g(x)    (9.1)

for all x ∈ R. So if f is defined by f(x) = x² and g is defined by g(x) = sin x, then (f + g)(x) = x² + sin x and (fg)(x) = x² sin x for all x ∈ R. In calculus we are especially interested in these operations because once we have learned how to find the derivatives of f and g, we want to know how to use this information to find the derivatives of f + g and fg. The derivative of fg, for example, gives the well-known product rule for derivatives:

(fg)′(x) = f(x) · g′(x) + g(x) · f′(x).

Later this led us to study the quotient rule for derivatives.

The definitions in (9.1) of the sum f + g and the product fg of the functions f : R → R and g : R → R depend on the fact that the codomain of these two functions is R, whose elements can be added and multiplied, so that f(x) + g(x) and f(x) · g(x) make sense. On the other hand, if f : A → B and g : A → B, where B = {a, b, c}, say, then f(x) + g(x) and f(x) · g(x) have no meaning.

There is an operation that can be defined on pairs of functions satisfying appropriate conditions that has nothing to do with numbers. For nonempty sets A, B and C and functions f : A → B and g : B → C, it is possible to create a new function from f and g, called their composition. The composition g ◦ f of f and g is the function from A to C defined by

(g ◦ f)(a) = g(f(a)) for all a ∈ A.

To illustrate this definition, let A = {1, 2, 3, 4}, B = {a, b, c, d} and C = {r, s, t, u, v} and define the functions f : A → B and g : B → C by

f = {(1, b), (2, d), (3, a), (4, a)},
g = {(a, u), (b, r), (c, r), (d, s)}.

We now have the correct arrangement of sets and functions to consider the composition g ◦ f. Since g ◦ f is a function from A to C, it has the form

g ◦ f = {(1, α), (2, β), (3, γ), (4, δ)},

where α, β, γ, δ ∈ C. It remains only to determine the image of each element of A. First we find the image of 1. By the definition of g ◦ f, (g ◦ f)(1) = g(f(1)) = g(b) = r, so (1, r) ∈ g ◦ f. Similarly, (g ◦ f)(2) = g(f(2)) = g(d) = s and so (2, s) ∈ g ◦ f. Continuing in this manner, we obtain g ◦ f = {(1, r), (2, s), (3, u), (4, u)}.

A diagram illustrating how g ◦ f is determined is shown in Figure 9.2. To find the image of 1 under g ◦ f, we follow the arrow from 1 to b and then the arrow from b to r. In effect, the function g ◦ f is found by removing the set B.

The fact that g ◦ f is defined does not necessarily imply that f ◦ g is defined. Since g is a function from B to C and f is a function from A to B, the only way that f ◦ g can be defined is if range(g) ⊆ A. In the example we just saw, f ◦ g is not defined since range(g) = {r, s, u} ⊈ A.
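The example can be replayed with dictionaries standing in for the finite functions. In this sketch, compose is a hypothetical helper that refuses to build g ◦ f unless every image of f lies in the domain of g, which mirrors the condition discussed above.

```python
def compose(g, f):
    """Return g ∘ f as a dict; requires range(f) to lie in the domain of g."""
    missing = set(f.values()) - set(g.keys())
    if missing:
        raise ValueError(f"composition undefined: {sorted(missing)} not in the domain")
    return {a: g[f[a]] for a in f}

f = {1: "b", 2: "d", 3: "a", 4: "a"}          # f : A -> B
g = {"a": "u", "b": "r", "c": "r", "d": "s"}  # g : B -> C

g_after_f = compose(g, f)

# f ∘ g is not defined: range(g) = {r, s, u} is not a subset of A = {1, 2, 3, 4}.
try:
    compose(f, g)
    f_after_g_defined = True
except ValueError:
    f_after_g_defined = False
```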

Figure 9.2  The composition function g ◦ f

Composition of functions was also encountered in calculus. Let's consider an example of composition that you might have seen there. Suppose again that the functions f : R → R and g : R → R are defined by f(x) = x² and g(x) = sin x. In this case, we can determine both g ◦ f and f ◦ g; namely,

(g ◦ f)(x) = g(f(x)) = g(x²) = sin(x²)
(f ◦ g)(x) = f(g(x)) = f(sin x) = (sin x)² = sin² x.

This example also serves to illustrate that even when g ◦ f and f ◦ g are both defined, they need not be equal. The study of composition of functions in calculus led us to the well-known chain rule for differentiation:

(g ◦ f)′(x) = g′(f(x)) · f′(x).

There are two facts concerning properties of composition of functions that will be especially useful to us. First, if f and g are injective functions such that g ◦ f is defined, then g ◦ f is injective. The corresponding statement is also true for surjective functions.

Result to Prove

Let f : A → B and g : B → C be two functions.
(a) If f and g are injective, then so is g ◦ f.
(b) If f and g are surjective, then so is g ◦ f.

PROOF STRATEGY

To verify (a), we use a direct proof and begin by assuming that f and g are injective. To show that g ◦ f is one-to-one, we prove that whenever (g ◦ f)(a1) = (g ◦ f)(a2), then a1 = a2. Now (g ◦ f)(a1) = (g ◦ f)(a2) means that g(f(a1)) = g(f(a2)). But g is one-to-one; so g(x) = g(y) implies that x = y. The form g(x) = g(y) is exactly what we have, where x = f(a1) and y = f(a2). This leads us to f(a1) = f(a2). But we also know that f is one-to-one, and so a1 = a2.

To verify (b), we need to prove that if f and g are onto, then g ◦ f is onto. To show that g ◦ f is onto, it is necessary to show that every element of C is an image of some element of A under the function g ◦ f. So we begin with an element c ∈ C. Since g is onto, there is an element b ∈ B such that g(b) = c. But f is onto; so there is an element a ∈ A such that f(a) = b. This suggests considering (g ◦ f)(a).

Theorem 9.11

Let f : A → B and g : B → C be two functions.
(a) If f and g are injective, then so is g ◦ f.
(b) If f and g are surjective, then so is g ◦ f.

Proof

Let f : A → B and g : B → C be injective functions. Assume that (g ◦ f)(a1) = (g ◦ f)(a2), where a1, a2 ∈ A. By definition, g(f(a1)) = g(f(a2)). Since g is injective, it follows that f(a1) = f(a2). Since f is injective, it follows that a1 = a2. This implies that g ◦ f is injective.

Next let f : A → B and g : B → C be surjective functions and let c ∈ C. Since g is surjective, there exists b ∈ B such that g(b) = c. Since f is surjective, there exists a ∈ A such that f(a) = b. Hence (g ◦ f)(a) = g(f(a)) = g(b) = c, implying that g ◦ f is also surjective.

Combining the two parts of Theorem 9.11 gives an immediate consequence.

Corollary 9.12

If f : A → B and g : B → C are bijective functions, then g ◦ f is bijective.

For nonempty sets A, B, C and D, let f : A → B, g : B → C and h : C → D be functions. Then the compositions g ◦ f : A → C and h ◦ g : B → D are defined, as are the compositions h ◦ (g ◦ f) : A → D and (h ◦ g) ◦ f : A → D. Composition of the functions f, g and h is associative if the functions h ◦ (g ◦ f) and (h ◦ g) ◦ f are equal. This is indeed the case.

Theorem 9.13

For nonempty sets A, B, C and D, let f : A → B, g : B → C and h : C → D be functions. Then (h ◦ g) ◦ f = h ◦ (g ◦ f).

Proof

Let a ∈ A and suppose that f(a) = b, g(b) = c and h(c) = d. Then

((h ◦ g) ◦ f)(a) = (h ◦ g)(f(a)) = (h ◦ g)(b) = h(g(b)) = h(c) = d,

while

(h ◦ (g ◦ f))(a) = h((g ◦ f)(a)) = h(g(f(a))) = h(g(b)) = h(c) = d.

Thus (h ◦ g) ◦ f = h ◦ (g ◦ f).

As we have mentioned, when considering the composition of functions, it is customary to begin with two functions f and g, where f : A → B and g : B → C, and to arrive at the function g ◦ f : A → C. Strictly speaking, however, all that is needed is for the domain of g to be a set B′, where range(f) ⊆ B′. In other words, if f and g are functions with f : A → B and g : B′ → C, where range(f) ⊆ B′, then the composition g ◦ f : A → C is defined.
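Theorems 9.11 and 9.13 can be checked exhaustively on small sets. The sketch below does so for a few arbitrarily chosen set sizes, which are assumptions made only to keep the enumeration short, as are the helper names. Such a check illustrates the theorems but does not replace their proofs.

```python
from itertools import product

def compose(g, f):
    """Return g ∘ f as a dict, assuming range(f) lies in the domain of g."""
    return {a: g[f[a]] for a in f}

def all_functions(domain, codomain):
    """Yield every function from domain to codomain as a dict."""
    domain = list(domain)
    for images in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, images))

def is_injective(func):
    return len(set(func.values())) == len(func)

def is_surjective(func, codomain):
    return set(func.values()) == set(codomain)

def check_injective_part():
    """Theorem 9.11(a) with |A| = 2, |B| = 3, |C| = 4."""
    A, B, C = range(2), range(3), range(4)
    return all(
        is_injective(compose(g, f))
        for f in all_functions(A, B) if is_injective(f)
        for g in all_functions(B, C) if is_injective(g)
    )

def check_surjective_part():
    """Theorem 9.11(b) with |A| = 4, |B| = 3, |C| = 2."""
    A, B, C = range(4), range(3), range(2)
    return all(
        is_surjective(compose(g, f), C)
        for f in all_functions(A, B) if is_surjective(f, B)
        for g in all_functions(B, C) if is_surjective(g, C)
    )

def check_associativity():
    """Theorem 9.13 with |A| = 2, |B| = 3, |C| = 2, |D| = 3."""
    A, B, C, D = range(2), range(3), range(2), range(3)
    return all(
        compose(compose(h, g), f) == compose(h, compose(g, f))
        for f in all_functions(A, B)
        for g in all_functions(B, C)
        for h in all_functions(C, D)
    )
```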

Example 9.14

For the sets A = {−3, −2, . . . , 3}, B = {0, 1, . . . , 10}, B′ = {0, 1, 4, 5, 8, 9} and C = {1, 2, . . . , 10}, let f : A → B and g : B′ → C be functions defined by f(n) = n² for all n ∈ A and g(n) = n + 1 for all n ∈ B′.

(a) Show that the composition g ◦ f : A → C is defined.
(b) For n ∈ A, determine (g ◦ f)(n).

Solution

(a) Since range(f) = {0, 1, 4, 9} and range(f) ⊆ B′, it follows that the composition g ◦ f : A → C is defined.
(b) For n ∈ A, we have (g ◦ f)(n) = g(f(n)) = g(n²) = n² + 1.
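Both parts of this solution can be recomputed in a few lines. In the sketch below, B_prime is an assumed name for the six-element set {0, 1, 4, 5, 8, 9} on which g is defined; the other variable names are likewise illustrative.

```python
A = range(-3, 4)                 # {-3, -2, ..., 3}
B_prime = {0, 1, 4, 5, 8, 9}     # the domain of g
C = set(range(1, 11))            # {1, 2, ..., 10}

f = {n: n * n for n in A}        # f(n) = n^2
g = {n: n + 1 for n in B_prime}  # g(n) = n + 1

# (a) The composition is defined because range(f) lies inside the domain of g.
range_f = set(f.values())
defined = range_f <= B_prime

# (b) (g ∘ f)(n) = n^2 + 1 for every n in A.
g_after_f = {n: g[f[n]] for n in A}
```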

9.6 Inverse Functions

Next we describe a property possessed by all bijective functions. In preparation for doing this, we return to relations to recall a concept introduced in Chapter 8. For a relation R from a set A to a set B, the inverse relation R⁻¹ from B to A is defined as

R⁻¹ = {(b, a) : (a, b) ∈ R}.

For example, if A = {a, b, c, d}, B = {1, 2, 3} and R = {(a, 1), (a, 3), (c, 2), (c, 3), (d, 1)} is a relation from A to B, then R⁻¹ = {(1, a), (3, a), (2, c), (3, c), (1, d)} is the inverse relation of R. Of course, every function f : A → B is also a relation from A to B and so there is an inverse relation f⁻¹ from B to A. This brings up a natural question: Under what conditions is the inverse relation f⁻¹ from B to A also a function from B to A? If the inverse relation f⁻¹ is a function from B to A, then certainly dom(f⁻¹) = B. This implies that f must be onto. If f is not one-to-one, then f(a1) = f(a2) = b for some a1, a2 ∈ A and b ∈ B, where a1 ≠ a2. But then (b, a1), (b, a2) ∈ f⁻¹, which cannot occur if f⁻¹ is a function. This leads us to the following theorem. Two basic facts are used repeatedly in the proof, namely (1) f(a) = b if and only if (a, b) ∈ f, and (2) if f⁻¹ is a function and f(a) = b, then (b, a) ∈ f⁻¹.

Theorem 9.15

Let f : A → B be a function. Then the inverse relation f⁻¹ is a function from B to A if and only if f is bijective. Furthermore, if f is bijective, then f⁻¹ is also bijective.

Proof

First assume that f⁻¹ is a function from B to A. We show that f is both one-to-one and onto. Assume that f(a1) = f(a2) = y, where y ∈ B. Then (a1, y), (a2, y) ∈ f, implying that (y, a1), (y, a2) ∈ f⁻¹. Since f⁻¹ is a function from B to A, every element of B has a unique image under f⁻¹. In particular, y has a unique image under f⁻¹. Since f⁻¹(y) = a1 and f⁻¹(y) = a2, it follows that a1 = a2 and so f is one-to-one.

To show that f is onto, let b ∈ B. Since f⁻¹ is a function from B to A, there exists a unique element a ∈ A such that f⁻¹(b) = a. Hence (b, a) ∈ f⁻¹, implying that (a, b) ∈ f, that is, f(a) = b. Therefore, f is onto.

For the converse, assume that the function f : A → B is bijective. We show that f⁻¹ is a function from B to A. Let b ∈ B. Since f is onto, there exists a ∈ A such that (a, b) ∈ f. Hence (b, a) ∈ f⁻¹. It remains to show that (b, a) is the only element of f⁻¹ whose first coordinate is b. Assume that (b, a) and (b, a′) are both in f⁻¹. Then (a, b), (a′, b) ∈ f, which implies that f(a) = f(a′) = b. Since f is one-to-one, a = a′. Thus we have shown that for every b ∈ B there exists a unique element a ∈ A such that (b, a) ∈ f⁻¹; that is, f⁻¹ is a function from B to A.

Finally, we show that if f is bijective, then f⁻¹ is bijective. Assume that f is bijective. We have just seen that f⁻¹ is a function from B to A. First we show that f⁻¹ is one-to-one. Assume that f⁻¹(b1) = f⁻¹(b2) = a. Then (b1, a), (b2, a) ∈ f⁻¹ and so (a, b1), (a, b2) ∈ f. Since f is a function, b1 = b2 and so f⁻¹ is one-to-one. To show that f⁻¹ is onto, let a ∈ A. Since f is a function, there is an element b ∈ B such that (a, b) ∈ f. Hence (b, a) ∈ f⁻¹, so that f⁻¹(b) = a and f⁻¹ is onto. Therefore, f⁻¹ is bijective.
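Theorem 9.15 can be observed on the finite examples from earlier in the chapter. In the following sketch a relation is a Python set of ordered pairs; inverse_relation and is_function are illustrative helpers that follow the two conditions for a relation to be a function.

```python
def inverse_relation(rel):
    """R^(-1) = {(b, a) : (a, b) in R}."""
    return {(b, a) for (a, b) in rel}

def is_function(rel, domain):
    """True if each element of domain is the first coordinate of exactly one pair."""
    firsts = [a for (a, _) in rel]
    return len(firsts) == len(set(firsts)) and set(firsts) == set(domain)

# The relation R from the start of this section and its inverse.
R = {("a", 1), ("a", 3), ("c", 2), ("c", 3), ("d", 1)}
R_inv = inverse_relation(R)

# Functions from Section 9.3, written as sets of ordered pairs.
B = {"x", "y", "z", "w"}
C = {"a", "b", "c"}
g1 = {(1, "y"), (2, "w"), (3, "x")}   # A -> B, one-to-one but not onto
g3 = {(1, "a"), (2, "c"), (3, "b")}   # A -> C, bijective
g4 = {(1, "b"), (2, "b"), (3, "b")}   # A -> C, neither

g1_inv_is_function = is_function(inverse_relation(g1), B)   # z has no image
g3_inv_is_function = is_function(inverse_relation(g3), C)
g4_inv_is_function = is_function(inverse_relation(g4), C)   # b has three images
```

Only the bijective function g3 has an inverse relation that is again a function, as the theorem predicts.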

Let f : A → B be a bijective function. By Theorem 9.15, f⁻¹ : B → A is then a bijective function, called the inverse function or simply the inverse of f. Thus both composition functions f⁻¹ ◦ f and f ◦ f⁻¹ are defined. Indeed, f⁻¹ ◦ f is a function from A to A and f ◦ f⁻¹ is a function from B to B. As we are about to learn, f⁻¹ ◦ f and f ◦ f⁻¹ are functions that we have visited earlier. Let a ∈ A and suppose that f(a) = b. So (a, b) ∈ f and therefore (b, a) ∈ f⁻¹, that is, f⁻¹(b) = a. Thus (f⁻¹ ◦ f)(a) = f⁻¹(f(a)) = f⁻¹(b) = a and (f ◦ f⁻¹)(b) = f(f⁻¹(b)) = f(a) = b. Hence f⁻¹ ◦ f = i_A and f ◦ f⁻¹ = i_B are the identity functions on the sets A and B. (See Figure 9.3.) Indeed, if f : A → B and g : B → A are functions for which g ◦ f = i_A and f ◦ g = i_B, then f and g have some important properties.

Theorem 9.16

If f : A → B and g : B → A are two functions such that g ◦ f = i_A and f ◦ g = i_B, then f and g are bijective and g = f⁻¹.

Proof

First, we show that f is one-to-one. Assume that f(a1) = f(a2), where a1, a2 ∈ A. Then g(f(a1)) = g(f(a2)). Since g ◦ f = i_A, it follows that

a1 = (g ◦ f)(a1) = g(f(a1)) = g(f(a2)) = (g ◦ f)(a2) = a2

and so f is one-to-one.

Figure 9.3  A bijective function and its inverse

Next we show that f is onto. Let b ∈ B and suppose that g(b) = a. Since f ◦ g = i_B, it follows that (f ◦ g)(b) = b. Thus (f ◦ g)(b) = f(g(b)) = f(a) = b and so f is onto. Hence f is bijective and so f⁻¹ exists. Similarly, g is bijective. Let a ∈ A and suppose that f(a) = b ∈ B. Then f⁻¹(b) = a. Since g ◦ f = i_A, it follows that a = (g ◦ f)(a) = g(f(a)) = g(b). Therefore, g = f⁻¹.

If a bijective function f has a relatively small number of ordered pairs, then it is easy to find f⁻¹. But what if f is a bijective function that one might encounter in calculus, say? We illustrate this next with the function described in Result 9.9.

Example 9.17

The function f : R − {2} → R − {3} defined by f(x) = 3x/(x − 2) is known to be bijective. Determine f⁻¹(x), where x ∈ R − {3}.

Solution

Since (f ◦ f⁻¹)(x) = x for all x ∈ R − {3}, it follows that

(f ◦ f⁻¹)(x) = f(f⁻¹(x)) = 3f⁻¹(x)/(f⁻¹(x) − 2) = x.

Thus 3f⁻¹(x) = x(f⁻¹(x) − 2) and so 3f⁻¹(x) = x f⁻¹(x) − 2x. Collecting the terms involving f⁻¹(x) on the same side of the equation and then factoring out f⁻¹(x), we have x f⁻¹(x) − 3f⁻¹(x) = 2x, that is, f⁻¹(x)(x − 3) = 2x. Solving for f⁻¹(x), we obtain

f⁻¹(x) = 2x/(x − 3).

Analysis

You may have encountered the problem of finding the inverse of a function before and may recall a slightly different approach from the one just given. Let's consider this example again, but from a different perspective. When considering functions in calculus, rather than writing them as f(x) = x², g(x) = 5x + 1 or h(x) = x + 1/x, we sometimes write them as y = x², y = 5x + 1 or y = x + 1/x. In Example 9.17 we were given f(x) = 3x/(x − 2) and found that f⁻¹(x) = 2x/(x − 3). Let's write the inverse as y = 2x/(x − 3) instead; that is, (x, y) ∈ f⁻¹, where y = 2x/(x − 3). Of course, at first we do not know what y is. But if (x, y) ∈ f⁻¹, then (y, x) ∈ f and we know that x = f(y) = 3y/(y − 2). Solving this equation for y, we have x(y − 2) = 3y and so xy − 2x = 3y. Collecting the terms involving y on the same side of the equation and factoring out y, we obtain

xy − 3y = 2x and y(x − 3) = 2x.

Solving for y gives y = 2x/(x − 3); that is,

f⁻¹(x) = 2x/(x − 3).

In short, to find f⁻¹ when f(x) = 3x/(x − 2), we replace f(x) by x and x by y and then solve for y. The result is f⁻¹(x). Of course, the procedure we have described for finding f⁻¹(x) is exactly the same as before. The only difference is the notation. You may also have noticed that the algebra performed to find f⁻¹(x) in Example 9.17 is exactly the same as the algebra performed to prove that f is onto in Result 9.9.

Finding the inverse of a bijective function is not always possible by algebraic manipulation. For example, the function f : R → (0, ∞) defined by f(x) = e^x is bijective, but f⁻¹(x) = ln x, which cannot be obtained by algebra alone. Similarly, the function g : R → R defined by g(x) = 3x⁷ + 5x³ + 4x − 1 is bijective, but there is no apparent way to find an expression for g⁻¹(x).

Of course, if f : A → B is a one-to-one function from A to B that is not onto, then f is not bijective and, by Theorem 9.15, f does not have an inverse (from B to A). On the other hand, if we define a new function g : A → range(f) by g(x) = f(x) for all x ∈ A, then g is a bijective function and so its inverse function g⁻¹ : range(f) → A exists. For example, let E denote the set of all even integers and consider the function f : Z → Z defined by f(n) = 2n. This function f is injective but not surjective, so there is no inverse function of f from Z to Z. Observe that range(f) = E. If we define g : Z → E by g(n) = f(n) for all n ∈ Z, then g is bijective and g⁻¹ : E → Z is a (bijective) function. In fact, g⁻¹(n) = n/2 for all n ∈ E.
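Even when no formula for an inverse is available, its values can be approximated numerically. The sketch below uses bisection, a technique not discussed in the text, to approximate g⁻¹(y) for g(x) = 3x⁷ + 5x³ + 4x − 1. It relies on the fact that g is strictly increasing, since g′(x) = 21x⁶ + 15x² + 4 > 0 for all real x. The name g_inverse, the starting bracket and the iteration count are illustrative choices, and the result is a floating-point approximation only.

```python
def g(x):
    """g(x) = 3x^7 + 5x^3 + 4x - 1, a strictly increasing bijection on R."""
    return 3 * x**7 + 5 * x**3 + 4 * x - 1

def g_inverse(y, iterations=200):
    """Approximate the unique x with g(x) = y by bisection."""
    lo, hi = -1.0, 1.0
    # Widen the bracket until g(lo) <= y <= g(hi).
    while g(lo) > y:
        lo *= 2
    while g(hi) < y:
        hi *= 2
    # Halve the bracket a fixed number of times, keeping y between g(lo) and g(hi).
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For instance, g(1) = 11 and g(0) = −1, so g_inverse(11) and g_inverse(-1) should come out very close to 1 and 0 respectively.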

9.7 Permutations

We have already mentioned that the identity function i_A defined on a nonempty set A is bijective. In general, there are many bijective functions that can be defined on a nonempty set. Indeed, the number of bijective functions on a set with n elements is n! by Theorem 9.7. These types of functions occur often in mathematics, especially in the area of mathematics called abstract (or modern) algebra. A permutation of (or on) a nonempty set A is a bijective function on A, that is, a function from A to A that is both one-to-one and onto. By Results 9.4 and 9.6, the function f : R → R defined by f(x) = 3x − 5 is a permutation of R.

Let us consider an even simpler example. For A = {1, 2, 3}, let f be a permutation of A. Then f is completely determined once we know the images of 1, 2 and 3 under f. We have seen that there are three choices for f(1), two choices for f(2) once f(1) has been specified, and one choice for f(3) once f(1) and f(2) have been specified. It follows that there are 3 · 2 · 1 = 3! = 6 different permutations f of the set A = {1, 2, 3}. This agrees with Theorem 9.7. One such function is the identity function defined on {1, 2, 3}, which we denote by α1; that is, α1 = {(1, 1), (2, 2), (3, 3)}.


Another permutation of {1, 2, 3} is α2 = {(1, 1), (2, 3), (3, 2)}. There are other common ways to represent these permutations. A permutation of {1, 2, 3} is also written as a two-row array

    ( 1 2 3 )
    ( − − − )

where the numbers directly below 1, 2 and 3 are their images. Hence α1, α2 and the other four permutations of {1, 2, 3} can be expressed as:

    α1 = ( 1 2 3 )    α2 = ( 1 2 3 )    α3 = ( 1 2 3 )
         ( 1 2 3 )         ( 1 3 2 )         ( 3 2 1 )

    α4 = ( 1 2 3 )    α5 = ( 1 2 3 )    α6 = ( 1 2 3 )
         ( 2 1 3 )         ( 2 3 1 )         ( 3 1 2 )

Since every permutation αi (1 ≤ i ≤ 6) is a bijective function from {1, 2, 3} to {1, 2, 3}, it follows from Corollary 9.12 that the composition of any two permutations of {1, 2, 3} is again a permutation of {1, 2, 3}. For example, consider

    α2 ◦ α5 = ( 1 2 3 ) ◦ ( 1 2 3 ) = ( 1 2 3 )
              ( 1 3 2 )   ( 2 3 1 )   ( − − − )

Since (α2 ◦ α5)(1) = α2(α5(1)) = α2(2) = 3, (α2 ◦ α5)(2) = α2(α5(2)) = α2(3) = 2 and (α2 ◦ α5)(3) = α2(α5(3)) = α2(1) = 1, it follows that

    α2 ◦ α5 = ( 1 2 3 ) ◦ ( 1 2 3 ) = ( 1 2 3 ) = α3.
              ( 1 3 2 )   ( 2 3 1 )   ( 3 2 1 )

It follows from Theorem 9.13 that composition of permutations on the same nonempty set A is associative. Therefore, for all integers i, j, k ∈ {1, 2, . . . , 6},

(αi ◦ αj) ◦ αk = αi ◦ (αj ◦ αk).

Furthermore, since a permutation is a bijective function, it follows by Theorem 9.15 that every permutation has an inverse, which is also a permutation. Thus for each i (1 ≤ i ≤ 6), αi⁻¹ = αj for some j (1 ≤ j ≤ 6). The inverse of a permutation can be found by interchanging the two rows and then rearranging the columns so that the top row is in the natural order 1, 2, 3, . . . . Thus

    α5⁻¹ = ( 2 3 1 ) = ( 1 2 3 ) = α6.
           ( 1 2 3 )   ( 3 1 2 )

The set of all n! permutations of the set {1, 2, . . . , n} is denoted by Sn. Thus S3 = {α1, α2, . . . , α6}. As we saw in S3, the elements of Sn satisfy the properties of closure, associativity and the existence of inverses for every positive integer n. This will be revisited in Chapter 13.


EXERCISES FOR CHAPTER 9 Section 9.1: The function definition 9.1. Let A = {a, b, c, d} and B = {x, y, z}. Then f = {(a, y), (b, z), (c, y), (d, z)} is a function from A to B. Determine dom( f ) and amplitude( f ). 9.2. Let A = {1, 2, 3} and B = {a, b, c, d}. Give an example of a relation R from A to B that contains exactly three elements such that R is not a function from A to B. Explain why R is not a function. 9.3. Let A be a nonempty set. If R is a relation from A to A that is both an equivalence relation and a function, what familiar function is R? Justify your answer. 9.4. Given the subset Ai of R and the relation Ri (1 ≤ i ≤ 3) from Ai to R, determine whether Ri is a function from Ai to R. (a) A1 = R, R1 = {(x, y): x ∈ A1 , y = 4x − 3} (b) A2 = [0, ∞), R2 = {(x, y) : x ∈ A2 , (y + 2)2 = x} (c) A3 = R, R3 = {(x, y) : x ∈ A3 , (x + y)2 = 4} 9.5. Let A and B be nonempty sets and let R be a nonempty relation from A to B. Show that there is a subset A of A and a subset f of R such that f is a function from A to B. 9.6 . In the following, a function f i : Ai → R (1≦i≦5) is defined in each case, where the domain of definition Ai consists of all real numbers x for which f i (x) is defined. Determine the domain of definition Ai and the domain of f i . (a) (b) (c) (d) (e)

f 1 (x) = 1 + x 2 f 2 (x) = 1 − x1 √ f 3 (x) = 3x − 1 f 4 (x) = x 3 − 8 x f 5 (x) = x−3 .

9.7. Let A = {3, 17, 29, 45} and B = {4, 6, 22, 60}. A relation R from A to B is defined by a R b if a + b is prime. Is R a function from A to B? 9.8. Let A = {5, 6}, B = {5, 7, 8} and S = {n : n ≥ 3 is an odd integer}. A relation R from A × B to S is defined as (a, b) R s if s | (a+b). Is R a function from A × B to S? 9.9. Determine which of the following five relations Ri (i = 1, 2, . . . , 5) are functions. (a) (b) (c) (d) (e)

R1 R2 R3 R4 R5

is defined in R by x R1 y if x 2 + y 2 = 1. is defined in R by x R2 y if 4x 2 + 3y 2 = 1. is defined from N to Q by a R3 b if 3a + 5b = 1 is defined in R by x R4 y if y = 4 − |x − 2|. is defined in R by x R5 y if |x + y| = 1

9.10. A function g : Q → Q is defined by g(r) = 4r + 1 for each r ∈ Q.
  (a) Determine g(Z) and g(E), where E is the set of even integers.
  (b) Determine g^(−1)(N) and g^(−1)(D), where D is the set of odd integers.
9.11. Let C = {x ∈ R : x ≥ 1} and D = R^+. For each function f defined below, determine f(C), f^(−1)(C), f^(−1)(D) and f^(−1)({1}).
  (a) f : R → R is defined by f(x) = x^2.
  (b) f : R^+ → R is defined by f(x) = ln x.
  (c) f : R → R is defined by f(x) = e^x.
  (d) f : R → R is defined by f(x) = sin x.
  (e) f : R → R is defined by f(x) = 2x − x^2.


9.12. For a function f : A → B and subsets C and D of A and E and F of B, prove the following.
  (a) f(C ∪ D) = f(C) ∪ f(D)
  (b) f(C ∩ D) ⊆ f(C) ∩ f(D)
  (c) f(C) − f(D) ⊆ f(C − D)
  (d) f^(−1)(E ∪ F) = f^(−1)(E) ∪ f^(−1)(F)
  (e) f^(−1)(E ∩ F) = f^(−1)(E) ∩ f^(−1)(F)
  (f) f^(−1)(E − F) = f^(−1)(E) − f^(−1)(F)

Section 9.2: The Set of All Functions from A to B

9.13. Let A = {1, 2, 3} and B = {x, y}. Determine B^A.
9.14. For the sets A = {1, 2, 3, 4} and B = {x, y, z}, give an example of a function g ∈ B^A and a function h ∈ B^B.
9.15. For A = {a, b, c}, determine 2^A.
9.16. (a) Give an example of two sets A and B such that |B^A| = 8.
  (b) For the sets A and B given in (a), give an example of an element in B^A.
9.17. (a) For nonempty sets A, B and C, what is a possible interpretation of the notation C^(B^A)?
  (b) According to the definition given in (a), determine C^(B^A) for A = {0, 1}, B = {a, b} and C = {x, y}.

Section 9.3: One-to-One and Onto Functions

9.18. Let A = {w, x, y, z} and B = {r, s, t}. Give an example of a function f : A → B that is neither one-to-one nor onto. Explain why f fails to have these properties.
9.19. Give an example of two finite sets A and B and two functions f : A → B and g : B → A such that f is one-to-one but not onto and g is onto but not one-to-one.
9.20. A function f : Z → Z is defined by f(n) = 2n + 1. Determine whether f is (a) injective, (b) surjective.
9.21. A function f : Z → Z is defined by f(n) = n − 3. Determine whether f is (a) injective, (b) surjective.
9.22. A function f : Z → Z is defined by f(n) = 5n + 2. Determine whether f is (a) injective, (b) surjective.
9.23. Prove or disprove: For every nonempty set A, there exists an injective function f : A → P(A).
9.24. Determine whether the function f : R → R defined by f(x) = x^2 + 4x + 9 is (a) one-to-one, (b) onto.
9.25. Is there a function f : R → R that is onto but not one-to-one? Explain your answer.
9.26. Give an example of a function f : N → N that is
  (a) one-to-one and onto
  (b) one-to-one but not onto
  (c) onto but not one-to-one.
9.27. Let A = {2, 3, 4, 5} and B = {6, 8, 10}. A relation R is defined from A to B by a R b if a | b and b/a + 1 is a prime.
  (a) Is R a function from A to B?
  (b) If R is a function from A to B, then determine whether this function is one-to-one and/or onto.
9.28. Let A = {2, 4, 6} and B = {1, 3, 4, 7, 9}. A relation f is defined from A to B by a f b if 5 divides ab + 1. Is f a one-to-one function?
9.29. Let f be a function with dom(f) = A and let C and D be subsets of A. Prove that if f is one-to-one, then f(C ∩ D) = f(C) ∩ f(D).


Section 9.4: Bijective Functions

9.30. Prove that the function f : R → R defined by f(x) = 7x − 2 is bijective.
9.31. Let f : Z_5 → Z_5 be a function defined by f([a]) = [2a + 3].
  (a) Show that f is well defined.
  (b) Determine whether f is bijective.
9.32. Prove that the function f : R − {2} → R − {5} defined by f(x) = (5x + 1)/(x − 2) is bijective.

9.33. Let A = [0, 1] denote the closed interval of real numbers between 0 and 1. Give an example of two different bijective functions f_1 and f_2 from A to A, neither of which is the identity function.
9.34. Prove Theorem 9.7 using mathematical induction.
9.35. For two finite nonempty sets A and B, let R be a relation from A to B such that range(R) = B. Define the domination number γ(R) of R as the smallest cardinality of a subset S ⊆ A such that for every element y of B, there is an element x ∈ S such that x is related to y.
  (a) Let A = {1, 2, 3, 4, 5, 6, 7} and B = {a, b, c, d, e, f, g} and let R = {(1, c), (1, e), (2, c), (2, f), (2, g), (3, b), (3, f), (4, a), (4, c), (4, g), (5, a), (5, b), (5, c), (6, d), (6, e), (7, a), (7, g)}. Determine γ(R).
  (b) If R is an equivalence relation defined on a finite nonempty set A (and so B = A), then what is γ(R)?
  (c) If f is a bijective function from A to B, then what is γ(f)?
9.36. Let A = {a, b, c, d, e, f} and B = {u, v, w, x, y, z}. With each element r ∈ A, there is associated a list or subset L(r) ⊆ B. The goal is to define a "list function" φ : A → B with the property that φ(r) ∈ L(r) for each r ∈ A.
  (a) For L(a) = {w, x, y}, L(b) = {u, z}, L(c) = {u, v}, L(d) = {u, w}, L(e) = {u, x, y}, L(f) = {v, y}, does there exist a bijective list function φ : A → B for these lists?
  (b) For L(a) = {u, v, x, y}, L(b) = {v, w, y}, L(c) = {v, y}, L(d) = {u, w, x, z}, L(e) = {v, w}, L(f) = {w, y}, does there exist a bijective list function φ : A → B for these lists?

Section 9.5: Composition of Functions

9.37. Let A = {1, 2, 3, 4}, B = {a, b, c} and C = {w, x, y, z}. Consider the functions f : A → B and g : B → C, where f = {(1, b), (2, c), (3, c), (4, a)} and g = {(a, x), (b, y), (c, x)}. Determine g ◦ f.
9.38. Two functions f : R → R and g : R → R are defined by f(x) = 3x^2 + 1 and g(x) = 5x − 3 for all x ∈ R. Determine (g ◦ f)(1) and (f ◦ g)(1).
9.39. Two functions f : Z_10 → Z_10 and g : Z_10 → Z_10 are defined by f([a]) = [3a] and g([a]) = [7a].
  (a) Determine g ◦ f and f ◦ g.
  (b) What can be concluded as a result of (a)?
9.40. Let A and B be nonempty sets. Prove that if f : A → B, then f ◦ i_A = f and i_B ◦ f = f.
9.41. Let A be a nonempty set and let f : A → A be a function. Prove that if f ◦ f = i_A, then f is bijective.
9.42. Prove or disprove the following:
  (a) If two functions f : A → B and g : B → C are both bijective, then g ◦ f : A → C is bijective.
  (b) Let f : A → B and g : B → C be two functions. If g is onto, then g ◦ f : A → C is onto.
  (c) Let f : A → B and g : B → C be two functions. If g is one-to-one, then g ◦ f : A → C is one-to-one.
  (d) There exist functions f : A → B and g : B → C such that f is not onto and g ◦ f : A → C is onto.
  (e) There exist functions f : A → B and g : B → C such that f is not one-to-one and g ◦ f : A → C is one-to-one.


9.43. For nonempty sets A, B and C, let f : A → B and g : B → C be functions.
  (a) Prove: If g ◦ f is one-to-one, then f is one-to-one. Use as many of the following proof techniques as possible: direct proof, proof by contrapositive, proof by contradiction.
  (b) Disprove: If g ◦ f is one-to-one, then g is one-to-one.
9.44. Let A denote the set of integers that are multiples of 4, let B denote the set of integers that are multiples of 8, and let B′ denote the set of even integers. Thus A = {4k : k ∈ Z}, B = {8k : k ∈ Z} and B′ = {2k : k ∈ Z}. The functions f : A × A → B and g : B′ → Z are defined by f((x, y)) = xy for x, y ∈ A and g(n) = n/2 for n ∈ B′.
  (a) Show that the composition function g ◦ f : A × A → Z is defined.
  (b) For k, ℓ ∈ Z, determine (g ◦ f)((4k, 4ℓ)).
9.45. Let A be the set of even integers and B the set of odd integers. A function f : A × B → B × A is defined by f((a, b)) = f(a, b) = (a + b, a) and a function g : B × A → B × B is defined by g(c, d) = (c + d, c).
  (a) Determine (g ◦ f)(18, 11).
  (b) Determine whether the function g ◦ f : A × B → B × B is one-to-one.
  (c) Determine whether g ◦ f is onto.
9.46. Let A be the set of odd integers and B the set of even integers. A function f : A × B → A × A is defined by f(a, b) = (3a − b, a + b) and a function g : A × A → B × A is defined by g(c, d) = (c − d, 2c + d).
  (a) Determine (g ◦ f)(3, 8).
  (b) Determine whether the function g ◦ f : A × B → B × A is one-to-one.
  (c) Determine whether g ◦ f is onto.
9.47. Prove or disprove the following for functions f, g and h with domain and codomain R.
  (a) (g + h) ◦ f = (g ◦ f) + (h ◦ f)
  (b) f ◦ (g + h) = (f ◦ g) + (f ◦ h).
9.48. The composition g ◦ f : (0, 1) → R of two functions f and g is given by (g ◦ f)(x) = (4x − 1)/(2√(x − x^2)), where f : (0, 1) → (−1, 1) is defined by f(x) = 2x − 1 for x ∈ (0, 1). Determine the function g.

Section 9.6: Inverse Functions

9.49. Let A = {a, b, c}. Give an example of a function f : A → A such that the inverse (relation) f^(−1) is not a function.
9.50. Show that the function f : R → R defined by f(x) = 4x − 3 is bijective and determine f^(−1)(x) for x ∈ R.
9.51. Show that the function f : R − {3} → R − {5} defined by f(x) = 5x/(x − 3) is bijective and determine f^(−1)(x) for x ∈ R − {5}.

9.52. The functions f : R → R and g : R → R defined by f(x) = 2x + 1 and g(x) = 3x − 5 for x ∈ R are bijective. Determine the inverse function of g ◦ f^(−1).
9.53. Let A and B be sets with |A| = |B| = 3. How many functions from A to B have inverse functions?
9.54. The functions f : R → R and g : R → R are defined by f(x) = 2x + 3 and g(x) = −3x + 5.
  (a) Show that f is one-to-one and onto.
  (b) Show that g is one-to-one and onto.
  (c) Determine the composition function g ◦ f.
  (d) Determine the inverse functions f^(−1) and g^(−1).
  (e) Determine the inverse function (g ◦ f)^(−1) of g ◦ f and the composition f^(−1) ◦ g^(−1).

9.55. Let A = R − {1} and define f : A → A by f(x) = x/(x − 1) for all x ∈ A.
  (a) Prove that f is bijective.
  (b) Determine f^(−1).
  (c) Determine f ◦ f ◦ f.
9.56. Let A, B and C be nonempty sets and let f, g and h be functions such that f : A → B, g : B → C and h : B → C. For each of the following, prove or disprove:
  (a) If g ◦ f = h ◦ f, then g = h.
  (b) If f is one-to-one and g ◦ f = h ◦ f, then g = h.
9.57. The function f : R → R is defined by

$$f(x) = \begin{cases} \dfrac{1}{x-1} & \text{if } x < 1 \\[4pt] \sqrt{x-1} & \text{if } x \ge 1. \end{cases}$$

  (a) Show that f is a bijection.
  (b) Determine the inverse f^(−1) of f.
9.58. Suppose, for a function f : A → B, that there is a function g : B → A such that f ◦ g = i_B. Prove that if g is surjective, then g ◦ f = i_A.
9.59. Let f : A → B, g : B → C and h : B → C be functions, where f is a bijection. Prove that if g ◦ f = h ◦ f, then g = h.

Section 9.7: Permutations

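9.60. Let

$$\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 4 & 5 & 1 \end{pmatrix} \quad\text{and}\quad \beta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 5 & 2 & 4 & 1 \end{pmatrix}$$

be permutations in S_5. Determine α ◦ β and β^(−1).
9.61. Let

$$\alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 2 & 6 & 4 & 1 & 5 & 3 \end{pmatrix} \quad\text{and}\quad \beta = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 5 & 3 & 6 & 2 & 1 & 4 \end{pmatrix}$$

be elements of S_6.
  (a) Determine α^(−1) and β^(−1).
  (b) Determine α ◦ β and β ◦ α.
9.62. Prove that for every integer n ≥ 3, there exist α, β ∈ S_n such that α ◦ β ≠ β ◦ α.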

ADDITIONAL EXERCISES FOR CHAPTER 9

9.63. Let f : R → R be the function defined by f(x) = x^2 + 3x + 4.
  (a) Show that f is not injective.
  (b) Find all pairs r_1, r_2 of real numbers such that f(r_1) = f(r_2).
  (c) Show that f is not surjective.
  (d) Find the set S of all real numbers such that if s ∈ S, then there is no real number x such that f(x) = s.
  (e) To which well-known set is the set S in (d) related?
9.64. Let f : R → R be the function defined by f(x) = x^2 + ax + b, where a, b ∈ R. Show that f is not one-to-one. [Hint: It may be useful to consider the cases a = 0 and a ≠ 0 separately.]


9.65. In Result 9.4 we saw that the (linear) function f : R → R defined by f(x) = 3x − 5 is one-to-one. In fact, we have seen that other linear functions are one-to-one. Prove the following generalization of this result: The function f : R → R defined by f(x) = ax + b, where a, b ∈ R and a ≠ 0, is one-to-one.
9.66. Evaluate the proposed proof of the following result.

Result The function f : R − {1} → R − {3} defined by f(x) = 3x/(x − 1) is bijective.

Proof First, we show that f is one-to-one. Assume that f(a) = f(b), where a, b ∈ R − {1}. Then 3a/(a − 1) = 3b/(b − 1). Cross-multiplying, we obtain 3a(b − 1) = 3b(a − 1). Simplifying, we have 3ab − 3a = 3ab − 3b. Subtracting 3ab from both sides and dividing by −3, we obtain a = b. Thus f is one-to-one.
Next, we show that f is onto. Let f(x) = r. Then 3x/(x − 1) = r; so 3x = r(x − 1). Simplifying, we have 3x = rx − r and so 3x − rx = −r. Therefore, x(3 − r) = −r. Since r ∈ R − {3}, we can divide by 3 − r and obtain x = −r/(3 − r) = r/(r − 3). Therefore,

$$f(x) = f\!\left(\frac{r}{r-3}\right) = \frac{3\left(\frac{r}{r-3}\right)}{\frac{r}{r-3} - 1} = \frac{3r}{r - (r-3)} = \frac{3r}{3} = r.$$

Thus f is onto.

9.67. For each of the following functions, determine, with explanation, whether the function is one-to-one and whether it is onto.
  (a) f : R × R → R × R, where f(x, y) = (3x − 2, 5y + 7)
  (b) g : Z × Z → Z × Z, where g(m, n) = (n + 6, 2 − m)
  (c) h : Z × Z → Z × Z, where h(r, s) = (2r + 1, 4s + 3)
  (d) φ : Z × Z → S = {a + b√2 : a, b ∈ Z}, where φ(a, b) = a + b√2
  (e) α : R → R × R, where α(x) = (x^2, 2x + 1).

9.68. Let S be a nonempty set. Show that there exists an injective function from P(S) to P(P(S)).
9.69. Let A = {a, b, c, d, e}. Then f = {(a, c), (b, e), (c, d), (d, b), (e, a)} is a bijective function from A to A.
  (a) Show that it is possible to list the five elements of A in such a way that the image of each of the first four elements in the list is immediately to the right of the element and that the image of the last element in the list is the first element in the list.
  (b) Show that it is not possible to list the elements of A as in (a) for every bijective function from A to A.
9.70. Let A = R − {0, 1} and let f : A → A be defined by f(x) = 1 − 1/x for all x ∈ A.
  (a) Show that f ◦ f ◦ f = i_A.
  (b) Determine f^(−1).

9.71. Give an example of a finite nonempty set A and a bijective function f : A → A such that (1) f ≠ i_A, (2) f ◦ f ≠ i_A and (3) f ◦ f ◦ f = i_A.
9.72. For nonempty sets A and B and functions f : A → B and g : B → A, suppose that g ◦ f = i_A, the identity function on A.
  (a) Prove that f is one-to-one and that g is onto.
  (b) Show that f need not be onto.
  (c) Show that g need not be one-to-one.
  (d) Prove that if f is onto, then g is one-to-one.
  (e) Prove that if g is one-to-one, then f is onto.
  (f) Combine the results in (d) and (e) into a single statement.


9.73. Let A = {1, 2}, B = {1, −1, 2, −2} and C = {1, 2, 3, 4}. Then f = {(1, 1), (1, −1), (2, 2), (2, −2)} is a relation from A to B, while g = {(1, 1), (−1, 1), (2, 4), (−2, 4)} is a relation from B to C. Furthermore, gf = {(x, z) : (x, y) ∈ f and (y, z) ∈ g for some y ∈ B} is a relation from A to C. Observe that even though the relation f is not a function from A to B, the relation gf is a function from A to C. Explain why.
9.74. A relation f on R is defined by f = {(x, y) : x ∈ R and y = x or y = −x} and a function g : R → R is defined by g(x) = x^2. Then gf = {(x, z) : (x, y) ∈ f and (y, z) ∈ g for some y ∈ R}.
  (a) Explain why f is not a function from R to R.
  (b) Show that gf is a function from R to R and determine it explicitly.
  (c) Even though the relation f is not a function from R to R, the relation gf is a function from R to R. Explain why.
9.75. Let A = {1, 2}, B = {1, 2, 3, 4} and C = {1, 2, 3, 4, 5, 6}. Give an example of a function f from A to B and a relation g from B to C that is not a function from B to C such that gf = {(x, z) : (x, y) ∈ f and (y, z) ∈ g for some y ∈ B} is a function from A to C.
9.76. Let F be the set of all functions with domain and codomain R. Define a relation R on F by f R g if there exists a constant C such that f(x) = g(x) + C for all x ∈ R.
  (a) Show that R is an equivalence relation.
  (b) Let f ∈ F. If the derivative of f is defined for all x ∈ R, then use this information to describe the elements in the equivalence class [f].
9.77. Let S be the set of odd positive integers. A function F : N → S is defined by F(n) = k for each n ∈ N, where k is the odd positive integer for which 3n + 1 = 2^m k for some nonnegative integer m. Prove or disprove the following:
  (a) F is injective.
  (b) F is surjective.
9.78. A function F : N → N ∪ {0} is defined by F(n) = m for each n ∈ N, where m is the nonnegative integer for which 3n + 1 = 2^m k and k is an odd integer. Prove or disprove the following:
  (a) F is injective.
  (b) F is surjective.
9.79. Recall that the derivative of ln x is 1/x and that the derivative of x^n is n x^(n−1) for every integer n. In symbols, d/dx (ln x) = 1/x and d/dx (x^n) = n x^(n−1). Let f : R^+ → R be defined by f(x) = ln x for all x ∈ R^+. Prove that the nth derivative of f(x) is given by f^(n)(x) = (−1)^(n+1) (n − 1)! / x^n for every positive integer n.

9.80. Let f : R → R be defined by f(x) = x e^(−x) for all x ∈ R. Prove that the nth derivative of f(x) is given by f^(n)(x) = (−1)^n e^(−x) (x − n) for every positive integer n.
9.81. The function h : Z_16 → Z_24 is defined by h([a]) = [3a] for a ∈ Z.
  (a) Prove that the function h is well defined; that is, prove that if [a] = [b] in Z_16, then h([a]) = h([b]) in Z_24.
  (b) For the subsets A = {[0], [3], [6], [9], [12], [15]} and B = {[0], [8]} of Z_16, determine the subsets h(A) and h(B) of Z_24.
  (c) For the subsets C = {[0], [6], [16], [18]} and D = {[4], [8], [16]} of Z_24, determine h^(−1)(C) and h^(−1)(D).


9.82. Let U be a universal set and A a subset of U. A function g_A : U → {0, 1} is defined by g_A(x) = 1 if x ∈ A and g_A(x) = 0 if x ∉ A. Verify each of the following.
  (a) g_U(x) = 1 for all x ∈ U.
  (b) g_∅(x) = 0 for all x ∈ U.
  (c) For U = R and A = [0, ∞), we have (g_A ◦ g_A)(x) = 1 for x ∈ R.
  (d) For subsets A and B of U and C = A ∩ B, g_C = (g_A)(g_B), where ((g_A)(g_B))(x) = g_A(x) g_B(x).
  (e) For A ⊆ U and its complement Ā, we have g_Ā(x) = 1 − g_A(x) for every x ∈ U.

9.83. (a) Let S = {a, b, c, d} and let T be the set of all six 2-element subsets of S. Show that there exists an injective function f : S → {0, 1, 2, . . . , |T|} such that the function g : T → {1, 2, . . . , |T|} defined by g({i, j}) = |f(i) − f(j)| is bijective.
  (b) Let S = {a, b, c, d, e} and let T be the set of all ten 2-element subsets of S. Show that there exists no injective function f : S → {0, 1, 2, . . . , |T|} such that the function g : T → {1, 2, . . . , |T|} defined by g({i, j}) = |f(i) − f(j)| is bijective.
  (c) For the sets S and T in (b), show that there exists an injective function f : S → {0, 1, 2, . . . , |T| + 2} such that the function g : T → {1, 2, . . . , |T| + 2} defined by g({i, j}) = |f(i) − f(j)| is injective.
  (d) The results in (b) and (c) should suggest a question to you. Ask and answer this question.

10

Cardinalities of Sets

Many consider the Italian mathematician and scientist Galileo Galilei to be the founder of modern physics. Among his major contributions was his mathematical view of the laws of motion. Early in the 17th century, Galileo applied mathematics to study the motion of the Earth. He was convinced that the Earth revolved about the sun, an opinion not shared by the Catholic Church at that time. This led to his being imprisoned for the last nine years of his life. Galileo's two main scientific writings were Dialogue Concerning the Two Chief World Systems and Discourses and Mathematical Demonstrations Concerning Two New Sciences, the first published before his arrest and the second (in the Netherlands) during his imprisonment. In these two works, he discussed scientific theories by means of a dialogue among fictional characters. In this way he was able to present his positions on various theories.

One topic that fascinated Galileo was infinite sets. Galileo observed that there is a one-to-one correspondence (that is, a bijective function) between the set N of positive integers and the subset S of N consisting of the squares of positive integers. This led Galileo to observe that even though there are many positive integers that are not squares, there are as many squares as there are positive integers. Galileo had thus come upon a property of an infinite set that he found troubling: there can be a one-to-one correspondence between a set and a proper subset of the set. Although Galileo correctly concluded that the number of squares of positive integers is not less than the number of positive integers, he stopped short of saying that these sets have the same number of elements.

Bernhard Bolzano was a Bohemian priest, philosopher and mathematician. Although best known for his work in calculus during the first half of the 19th century, he too was interested in infinite sets. His Paradoxes of the Infinite, published two years after his death and unnoticed for twenty years, contained many ideas of the modern theory of sets. He noted that one-to-one correspondences between an infinite set and a proper subset of itself are common and was comfortable with this fact, contrary to Galileo's feelings.

The German mathematician Richard Dedekind studied under the brilliant Carl Friedrich Gauss. Dedekind had a long and productive career in mathematics and made many contributions to the study of irrational numbers. What amazed Galileo and interested Bolzano led, near the end of the 19th century, to Dedekind's definition of an infinite set: a set S is infinite if it contains a proper subset that can be put in one-to-one correspondence with S. So understanding infinite sets was not an easy task, even among well-known mathematicians of the past.

We mentioned in Chapter 1 that the cardinality |S| of a set S is the number of elements in S, and for the present we will use the notation |S| only when S is a finite set. A set S is finite if either S = ∅ or |S| = n for some n ∈ N, while a set is infinite if it is not finite. It may seem that we should write |S| = ∞ if S is infinite, but we will soon see that this is not particularly informative. Indeed, it is considerably more challenging to assign a meaning to |S| if S is an infinite set; but it is precisely this topic that we are about to explore.

10.1 Numerically Equivalent Sets

It is fairly obvious that the sets A = {a, b, c} and B = {x, y, z} have the same cardinality since each set has exactly three elements. That is, if we count the number of elements in two sets and arrive at the same value, then these two sets have the same cardinality. However, there is another way to see that the sets A and B described above have the same cardinality without counting the elements in each set. Observe that we can pair off the elements of A and B, say as (a, x), (b, y) and (c, z). This implies that A and B have the same number of elements, that is, |A| = |B|. In fact, we have described a bijective function f : A → B, namely f = {(a, x), (b, y), (c, z)}. Although it is much easier to see that |A| = |B| by observing that each set has three elements than by exhibiting a bijective function from A to B, it is this latter method of showing that |A| = |B| that can be generalized to the situation where A and B are infinite sets.

Two sets A and B (finite or infinite) are said to have the same cardinality, written |A| = |B|, if either A and B are both empty or there is a bijective function f from A to B. Two sets having the same cardinality are also called numerically equivalent sets. Two finite sets are therefore numerically equivalent if they are both empty or both have n elements for some positive integer n. Consequently, two nonempty sets A and B are not numerically equivalent, written |A| ≠ |B|, if there is no bijective function from one set to the other. The study of numerically equivalent infinite sets is more challenging but far more interesting than the study of numerically equivalent finite sets.

The justification for the term "numerically equivalent sets" lies in the following theorem, which combines the main concepts of Chapters 8 and 9.

Theorem 10.1 Let S be a nonempty collection of nonempty sets. A relation R is defined on S by A R B if there exists a bijective function from A to B. Then R is an equivalence relation.

Proof Let A ∈ S. Since the identity function i_A : A → A is bijective, it follows that A R A. Thus R is reflexive. Next, assume that A R B, where A, B ∈ S. Then there is a bijective function f : A → B. By Theorem 9.15, f has an inverse function f^(−1) : B → A and, furthermore, f^(−1) is bijective. Therefore, B R A and R is symmetric. Finally, assume that A R B and B R C, where A, B, C ∈ S. Then there are bijective functions f : A → B and g : B → C. It follows from Corollary 9.12 that the composition g ◦ f : A → C is also bijective and so A R C. Therefore, R is transitive. Consequently, R is an equivalence relation.
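The three steps in the proof of Theorem 10.1 can be tried out on small finite sets. The following is only an illustrative sketch, assuming a function is stored as a Python dict; the helper names `is_bijection`, `identity`, `inverse` and `compose` are hypothetical and not part of the text.

```python
def is_bijection(f, domain, codomain):
    """Return True if the dict f is a bijective function from domain to codomain."""
    if set(f) != set(domain):
        return False                      # f must be defined on every element of the domain
    images = list(f.values())
    if not set(images) <= set(codomain):
        return False                      # every image must lie in the codomain
    one_to_one = len(set(images)) == len(images)
    onto = set(images) == set(codomain)
    return one_to_one and onto

def identity(A):
    """The identity function i_A as a dict."""
    return {a: a for a in A}

def inverse(f):
    """The inverse relation of f; it is a function when f is bijective."""
    return {b: a for a, b in f.items()}

def compose(g, f):
    """The composition g o f, defined by (g o f)(a) = g(f(a))."""
    return {a: g[b] for a, b in f.items()}

A = {"a", "b", "c"}
B = {"x", "y", "z"}
C = {1, 2, 3}
f = {"a": "x", "b": "y", "c": "z"}        # a bijection from A to B
g = {"x": 1, "y": 2, "z": 3}              # a bijection from B to C

reflexive = is_bijection(identity(A), A, A)
symmetric = is_bijection(inverse(f), B, A)
transitive = is_bijection(compose(g, f), A, C)
print(reflexive, symmetric, transitive)   # True True True
```

A check on one example is of course not a proof; it only mirrors the roles played by the identity function, the inverse function and the composition in the argument above.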


If A is a nonempty set in S, then, according to the equivalence relation defined in Theorem 10.1, the equivalence class [A] consists of all those elements of S having the same cardinality as A. Hence the term "numerically equivalent sets" is natural for two sets having the same cardinality.

Example 10.2 Let S = {A_1, A_2, A_3, A_4, A_5, A_6}, where A_1 = {1, 2, 3}, A_2 = {a, b, c, d}, A_3 = {x, y, z}, A_4 = {r, s, t}, A_5 = {m, n}, A_6 = {7, 8, 9, 10}. Then every two of the sets A_1, A_3 and A_4 are numerically equivalent, while A_2 and A_6 are numerically equivalent. This says that |A_1| = |A_3| = |A_4| and |A_2| = |A_6|. The only set in S that is numerically equivalent to A_5 is A_5 itself. Hence [A_1] = {A_1, A_3, A_4}, [A_2] = {A_2, A_6} and [A_5] = {A_5} are the distinct equivalence classes of S.
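For finite sets, a bijective function exists exactly when the two sets have the same number of elements, so the equivalence classes in Example 10.2 can be recovered by grouping the sets by size. The sketch below assumes the six sets are stored in a dict keyed by hypothetical labels such as "A1"; the function name `equivalence_classes` is likewise an assumption made for illustration.

```python
from collections import defaultdict

# The six sets of Example 10.2, keyed by illustrative labels.
S = {
    "A1": {1, 2, 3},
    "A2": {"a", "b", "c", "d"},
    "A3": {"x", "y", "z"},
    "A4": {"r", "s", "t"},
    "A5": {"m", "n"},
    "A6": {7, 8, 9, 10},
}

def equivalence_classes(named_sets):
    """Group the labels of finite sets by the number of elements in each set."""
    classes = defaultdict(list)
    for name, members in named_sets.items():
        classes[len(members)].append(name)
    return sorted(classes.values())

classes = equivalence_classes(S)
print(classes)  # [['A1', 'A3', 'A4'], ['A2', 'A6'], ['A5']]
```

Grouping by size works only because every set here is finite; for infinite sets one has to exhibit a bijective function, which is the subject of the rest of the chapter.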

Although Example 10.2 only deals with finite sets, in this chapter we are mostly interested in infinite sets. In particular, we will be interested in sets that are numerically equivalent to N or R.

10.2 Denumerable Sets

To understand the cardinality of an infinite set, we begin with a particular class of infinite sets. A set A is called denumerable if |A| = |N|, that is, if A has the same cardinality as the set of natural numbers. Certainly, if A is denumerable, then A is infinite. If A is a denumerable set, then, by definition, there is a bijective function f : N → A and so f = {(1, f(1)), (2, f(2)), (3, f(3)), . . .}. Consequently, A = {f(1), f(2), f(3), . . .}; that is, we can list the elements of A as f(1), f(2), f(3), . . . . Equivalently, we can list the elements of A as a_1, a_2, a_3, . . . , where a_i = f(i) for i ∈ N. Conversely, if the elements of A can be listed as a_1, a_2, a_3, . . . , where a_i ≠ a_j for i ≠ j, then A is denumerable since the function g : N → A defined by g(n) = a_n for each n ∈ N is certainly bijective. Therefore, A is a denumerable set if and only if it is possible to list the elements of A as a_1, a_2, a_3, . . . and so A = {a_1, a_2, a_3, . . .}.

A set is countable if it is either finite or denumerable. Hence countably infinite sets are precisely the denumerable sets. Thus if A is a nonempty countable set, then we can write either A = {a_1, a_2, a_3, . . . , a_n} for some n ∈ N or A = {a_1, a_2, a_3, . . .}. A set that is not countable is called uncountable. An uncountable set is necessarily infinite. It may not be entirely clear that any set is uncountable, but we will soon see that such sets do exist.

Let's look at a few examples of denumerable sets. Certainly N itself is denumerable since the identity function i_N : N → N is bijective. But not only is the set of positive integers denumerable, the set of all integers is denumerable as well. The proof of this fact that we give illustrates a common technique for showing that a set is denumerable; namely, if we can list the elements of a set A as a_1, a_2, a_3, . . . such that every element of A appears exactly once in the list, then A is denumerable.

Result 10.3 The set Z of integers is denumerable.

Proof Observe that the elements of Z can be listed as 0, 1, −1, 2, −2, . . . . The function f : N → Z described in Figure 10.1 is therefore bijective and so Z is denumerable.

  f:   1    2    3    4    5   · · ·
       ↓    ↓    ↓    ↓    ↓
       0    1   −1    2   −2   · · ·

Figure 10.1 A bijective function f : N → Z

The function f : N → Z given in Figure 10.1 can also be defined by

$$f(n) = \frac{1 + (-1)^n (2n - 1)}{4}. \tag{10.1}$$
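As a quick numerical check, formula (10.1), f(n) = (1 + (−1)^n (2n − 1))/4, can be evaluated for the first few positive integers. This is only a sketch for experimentation and does not replace the formal proof of bijectivity; the variable name `first_values` is illustrative.

```python
def f(n):
    """Evaluate f(n) = (1 + (-1)^n (2n - 1)) / 4 for a positive integer n."""
    numerator = 1 + (-1) ** n * (2 * n - 1)
    # The numerator is 2n for even n and 2 - 2n for odd n, so it is a multiple of 4.
    assert numerator % 4 == 0
    return numerator // 4

first_values = [f(n) for n in range(1, 8)]
print(first_values)  # [0, 1, -1, 2, -2, 3, -3]
```

The output agrees with the listing 0, 1, −1, 2, −2, . . . of the integers used above.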

Although we have already noted that this function f is bijective, Exercise 10.8 asks for a formal proof of this fact. The fact that Z is denumerable illustrates what Galileo had observed centuries ago: it is possible for two sets to have the same cardinality when one is a proper subset of the other. (Such a situation could never occur for finite sets, however.) For example, N ⊂ Z and |N| = |Z|. This fact serves to illustrate a result whose proof is somewhat intricate.

Theorem to Prove Every infinite subset of a denumerable set is denumerable.

PROOF STRATEGY In the proof, we begin with two sets, which we'll call A and B, where A is denumerable, B ⊆ A and B is infinite. Since A is denumerable, we can write A = {a_1, a_2, a_3, . . .}. Because our goal is to show that B is denumerable, we need to show that we can write B = {b_1, b_2, b_3, . . .}. The question, of course, is how to do this. Since B is an infinite subset of A, some of the elements of A belong to B (in fact, infinitely many elements of A belong to B), while quite possibly some elements of A do not belong to B. We can keep track of the elements of A that belong to B by means of a set, which we'll denote by S. If a_1 ∈ B, then 1 ∈ S; if a_1 ∉ B, then 1 ∉ S. In general, n ∈ S if and only if a_n ∈ B. Of course, S ⊆ N. Since N is a well-ordered set (by the Well-Ordering Principle), S contains a least element, say s. That is, a_s ∈ B. Furthermore, if r is an integer such that 1 ≤ r < s, then a_r ∉ B. It is the element a_s that we will call b_1. It is now logical to look at the (infinite) set S − {s} and consider its least element, say t. So t > s. The element a_t becomes b_2. And so on.

Since we are trying to give a precise and careful proof, we already have two problems. First, it will be awkward to denote the smallest element of S by s and the smallest element of S − {s} by t. We need to use better notation. So let's denote the smallest element of S by i_1 (hence b_1 = a_{i_1}) and the smallest element of S − {i_1} by i_2 (hence b_2 = a_{i_2}). This is much better notation. The other problem is our writing "And so on." Once we have the positive integers i_1 and i_2, the positive integer i_3 is the smallest element of S − {i_1, i_2}. In general, once we have determined the positive integers i_1, i_2, . . . , i_k, where k ∈ N, the positive integer i_{k+1} is the smallest element of S − {i_1, i_2, . . . , i_k}. In fact, this suggests that the elements b_1, b_2, b_3, . . . can be located in A by induction.

After we have obtained the set {b_1, b_2, b_3, . . .}, which we will call B′, say, we still have one more concern. How do we know that B′ = B? Since each element of B′ belongs to B, we know that B′ ⊆ B. To show that B′ = B, we must also be certain that B ⊆ B′. As we know, the standard way to show that B ⊆ B′ is to take a typical element b ∈ B and show that b ∈ B′. Let's now write a complete proof.

Theorem 10.4 Every infinite subset of a denumerable set is denumerable.

Proof Let A be a denumerable set and let B be an infinite subset of A. Since A is denumerable, we can write A = {a_1, a_2, a_3, . . .}. Let S = {i ∈ N : a_i ∈ B}; that is, S consists of all those positive integers that are subscripts of the elements in A that also belong to B. Since B is infinite, S is infinite.

First we use induction to show that B contains a denumerable subset. Since S is a nonempty subset of N, it follows from the Well-Ordering Principle that S has a least element, say i_1. Let b_1 = a_{i_1}. Let S_1 = S − {i_1}. Since S_1 ≠ ∅ (indeed, S_1 is infinite), S_1 has a least element, say i_2. Let b_2 = a_{i_2}, which, of course, is distinct from b_1. Assume, for an arbitrary integer k ≥ 2, that the (distinct) elements b_1, b_2, . . . , b_k have been defined by b_j = a_{i_j} for each integer j with 1 ≤ j ≤ k, where i_1 is the smallest element in S and i_j is the smallest element in S_{j−1} = S − {i_1, i_2, . . . , i_{j−1}} for 2 ≤ j ≤ k. Now let i_{k+1} be the least element of S_k = S − {i_1, i_2, . . . , i_k} and let b_{k+1} = a_{i_{k+1}}. Hence it follows that for each integer n ≥ 2, an element b_n belongs to B that is distinct from b_1, b_2, . . . , b_{n−1}. Thus we have exhibited the elements b_1, b_2, b_3, . . . in B.

Let B′ = {b_1, b_2, b_3, . . .}. Certainly B′ ⊆ B. We claim, in fact, that B = B′. It remains only to show that B ⊆ B′. Let b ∈ B. Since B ⊆ A, it follows that b = a_n for some n ∈ N and so n ∈ S. If n = i_1, then b = b_1 = a_n and so b ∈ B′. Thus we may assume that n > i_1. Let S′ consist of those positive integers less than n that belong to S. Since n > i_1 and i_1 ∈ S, it follows that S′ ≠ ∅. Certainly, 1 ≤ |S′| ≤ n − 1; so S′ is finite. Thus |S′| = m for some m ∈ N. The set S′ therefore consists of the m smallest integers of S, that is, S′ = {i_1, i_2, . . . , i_m}. The smallest integer that belongs to S and is greater than i_m must be i_{m+1}, of course, and i_{m+1} ≥ n. But n ∈ S, so n = i_{m+1} and b = a_n = a_{i_{m+1}} = b_{m+1} ∈ B′. Hence B = B′ = {b_1, b_2, b_3, . . .}, which is denumerable.
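The construction in the proof of Theorem 10.4 can be mimicked computationally: scanning the indices 1, 2, 3, . . . in increasing order and keeping those whose elements lie in B produces b_1, b_2, b_3, . . . in exactly the order described. The sketch below is an illustration under stated assumptions, not part of the proof: the listing `a` of Z is taken from formula (10.1), B is chosen to be the even integers, and the names `enumerate_subset` and `is_even` are hypothetical.

```python
from itertools import count, islice

def a(n):
    """Listing of Z from formula (10.1): a(1), a(2), ... = 0, 1, -1, 2, -2, ..."""
    return (1 + (-1) ** n * (2 * n - 1)) // 4

def enumerate_subset(a, in_B):
    """Yield b_1, b_2, b_3, ..., where b_k = a(i_k) and i_k is the k-th
    smallest index i for which a(i) belongs to B."""
    for i in count(1):        # scan the indices 1, 2, 3, ... in increasing order
        if in_B(a(i)):        # i belongs to S exactly when a(i) belongs to B
            yield a(i)

def is_even(z):
    """Membership test for B = the set of even integers."""
    return z % 2 == 0

first_six = list(islice(enumerate_subset(a, is_even), 6))
print(first_six)  # [0, 2, -2, 4, -4, 6]
```

Each request for the next element terminates only because B is infinite, which is precisely the hypothesis of the theorem; for a finite B the generator would eventually search forever.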
In order to use Theorem 10.4 to describe other denumerable sets, it is convenient to introduce some additional notation. Let $k \in \mathbf{N}$. Then the set $k\mathbf{Z}$ is defined by $k\mathbf{Z} = \{kn : n \in \mathbf{Z}\}$. Similarly, $k\mathbf{N} = \{kn : n \in \mathbf{N}\}$. Thus $1\mathbf{Z} = \mathbf{Z}$ and $1\mathbf{N} = \mathbf{N}$, while $2\mathbf{Z}$ is the set of even integers. An immediate consequence of Theorem 10.4 is stated next.

Result 10.5  The set $2\mathbf{Z}$ of even integers is denumerable.

Proof  Since $2\mathbf{Z}$ is infinite and $2\mathbf{Z} \subseteq \mathbf{Z}$, it follows by Theorem 10.4 that $2\mathbf{Z}$ is denumerable.

Figure 10.2  Constructing a bijective function $f : \mathbf{N} \to A \times B$. (a) A table with rows labeled $a_1, a_2, a_3, \ldots$ and columns labeled $b_1, b_2, b_3, \ldots$, whose entry in row $i$, column $j$ is $(a_i, b_j)$. (b) The same table, with directed lines indicating the order in which the entries are encountered.

Obviously, $k\mathbf{Z}$ is denumerable for every nonzero integer $k$. We now describe a denumerable set that can be obtained from two given denumerable sets. For sets $A$ and $B$, recall that the Cartesian product is $A \times B = \{(a, b) : a \in A, b \in B\}$.

Result 10.6  If $A$ and $B$ are denumerable sets, then $A \times B$ is denumerable.

Proof  Since $A$ and $B$ are denumerable sets, we can write $A = \{a_1, a_2, a_3, \ldots\}$ and $B = \{b_1, b_2, b_3, \ldots\}$. Consider the table shown in Figure 10.2(a), which has a (countably) infinite number of rows and columns, where the elements $a_1, a_2, a_3, \ldots$ are written along the side and $b_1, b_2, b_3, \ldots$ are written across the top. In row $i$, column $j$ of the table, we place the ordered pair $(a_i, b_j)$. Certainly, every element of $A \times B$ appears exactly once in this table. This table is reproduced in Figure 10.2(b), where the directed lines indicate the order in which we encounter the entries in the table. That is, we encounter the elements of $A \times B$ in the order $(a_1, b_1), (a_1, b_2), (a_2, b_1), (a_1, b_3), (a_2, b_2), \ldots$. Since every element of $A \times B$ occurs exactly once in this list, this describes a bijective function $f : \mathbf{N} \to A \times B$, where $f(1) = (a_1, b_1)$, $f(2) = (a_1, b_2)$, $f(3) = (a_2, b_1)$, $f(4) = (a_1, b_3)$, $f(5) = (a_2, b_2)$, and so on. Therefore, $A \times B$ is denumerable.

A technique similar to that used in the proof of Result 10.6 shows that another familiar set is denumerable.
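The diagonal order used in the proof of Result 10.6 can be generated mechanically. The sketch below works with index pairs $(i, j)$ standing for $(a_i, b_j)$; the function name and the use of $d = i + j$ to label a diagonal are illustrative assumptions rather than notation from the text.

```python
from itertools import count, islice

def diagonal_pairs():
    """Yield index pairs (i, j) in the order (1,1), (1,2), (2,1), (1,3), (2,2), ..."""
    for d in count(2):            # d = i + j is constant along each diagonal
        for i in range(1, d):     # walk down the diagonal: row i, column d - i
            yield (i, d - i)

first_five = list(islice(diagonal_pairs(), 5))
print(first_five)
```

Each diagonal is finite, so every pair $(i, j)$ is reached after finitely many steps, and no pair is produced twice.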

Result 10.7  The set $\mathbf{Q}^+$ of positive rational numbers is denumerable.

Proof  Consider the table shown in Figure 10.3(a). In row $i$, column $j$, we place the rational number $j/i$. Certainly, then, every positive rational number appears in the table of Figure 10.3(a); indeed, it appears infinitely often. For example, the number $1/2$ appears in row 2, column 1, as well as in row 4, column 2. The table of Figure 10.3(a) is reproduced in Figure 10.3(b), where the arrows indicate the order in which we consider the entries in the table. That is, we now consider the positive rational numbers in the order

\[ \frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{3}{1}, \frac{2}{2}, \frac{1}{3}, \frac{4}{1}, \ldots. \]

Figure 10.3  A table used to show that $\mathbf{Q}^+$ is denumerable. (a) The entry in row $i$, column $j$ is $j/i$. (b) The same table, with arrows indicating the order in which the entries are considered.

With this list, we can now describe a bijective function $f : \mathbf{N} \to \mathbf{Q}^+$. In particular, we define $f(1) = 1/1 = 1$, $f(2) = 2/1 = 2$, $f(3) = 1/2$, and $f(4) = 3/1 = 3$, as expected. However, since $2/2 = 1$ and we have already defined $f(1) = 1$, we do not define $f(5) = 1$ (since $f$ must be one-to-one). We bypass $2/2 = 1$ and, following the arrows, go directly to the next number in the list, namely $1/3$. Whenever we encounter a number in the list that we have seen before, we move on to the next number in the list. In this manner, the function $f$ being described is one-to-one. The function $f$ is shown in Figure 10.4. Since every element of $\mathbf{Q}^+$ is eventually encountered, $f$ is also onto and so $f$ is bijective. Therefore, $\mathbf{Q}^+$ is denumerable.

Figure 10.4  A bijective function $f : \mathbf{N} \to \mathbf{Q}^+$ with $f(1) = 1$, $f(2) = 2$, $f(3) = 1/2$, $f(4) = 3$, $f(5) = 1/3$, and so on.

The function $f$ described in Figure 10.4 is by no means unique. There are many ways to traverse the positive rational numbers in the table shown in Figure 10.3(a). The tables in Figure 10.5 show two additional methods.

Figure 10.5  Traversing the positive rational numbers

Some care must be taken in how we proceed through the entries in the table of Figure 10.3(a). For example, it will not work to traverse the positive rational numbers one row at a time (see Figure 10.6). Since the first row never ends, we would encounter only the positive integers.

Figure 10.6  How not to traverse the positive rational numbers

The set $\mathbf{Q}^+$ can also be shown to be denumerable with the aid of the table in Figure 10.7. All positive rational numbers $j/i$ with $i = 1$ are shown in the first row. The second row shows all positive rational numbers $j/i$ with $i = 2$ such that $j/i$ is reduced to lowest terms. This results in the rational number $(2j - 1)/2$ in row 2, column $j$. We continue in this manner with all other rows. In this way, every positive rational number occurs exactly once in the table. Thus, if we proceed through the entries as indicated by the arrows, we obtain the positive rational numbers in the order

\[ \frac{1}{1}, \frac{2}{1}, \frac{1}{2}, \frac{3}{1}, \frac{3}{2}, \frac{1}{3}, \frac{4}{1}, \ldots \]

and the corresponding bijective function $g : \mathbf{N} \to \mathbf{Q}^+$. So $g(1) = 1$, $g(2) = 2$, $g(3) = 1/2$, $g(4) = 3$, $g(5) = 3/2$, and so on.

Figure 10.7  Another bijective function $g : \mathbf{N} \to \mathbf{Q}^+$

Now that we have shown that $\mathbf{Q}^+$ is denumerable, it is not difficult to show that the set $\mathbf{Q}$ of all rational numbers is denumerable.

Result 10.8  The set $\mathbf{Q}$ of all rational numbers is denumerable.

Proof  Since $\mathbf{Q}^+$ is denumerable, we can write $\mathbf{Q}^+ = \{q_1, q_2, q_3, \ldots\}$. Thus $\mathbf{Q} = \{0\} \cup \{q_1, q_2, q_3, \ldots\} \cup \{-q_1, -q_2, -q_3, \ldots\}$. Therefore, $\mathbf{Q} = \{0, q_1, -q_1, q_2, -q_2, \ldots\}$, and the function $f : \mathbf{N} \to \mathbf{Q}$ shown in Figure 10.8 is bijective; hence $\mathbf{Q}$ is denumerable.

Figure 10.8  A bijective function $f : \mathbf{N} \to \mathbf{Q}$ with $f(1) = 0$, $f(2) = q_1$, $f(3) = -q_1$, $f(4) = q_2$, $f(5) = -q_2$, and so on.
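The enumerations in Results 10.7 and 10.8 can be sketched in code. This is an illustration only: the function names are hypothetical, and the `seen` set is simply one way to realize the instruction to bypass any number already encountered (it grows without bound, so this is a demonstration, not an efficient implementation).

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Traverse the table with entry j/i in row i, column j along diagonals,
    skipping every value that has already been encountered."""
    seen = set()
    for d in count(2):                # d = i + j labels a diagonal
        for i in range(1, d):         # row i, column j = d - i
            q = Fraction(d - i, i)
            if q not in seen:
                seen.add(q)
                yield q

def rationals():
    """List Q as 0, q1, -q1, q2, -q2, ... as in the proof of Result 10.8."""
    yield Fraction(0)
    for q in positive_rationals():
        yield q
        yield -q

first_q_plus = list(islice(positive_rationals(), 5))
first_q = list(islice(rationals(), 5))
print(first_q_plus, first_q)
```

The first five values produced by `positive_rationals` agree with $f(1), \ldots, f(5)$ above: the entry $2/2$ is skipped because $1$ has already appeared.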

10.3 Uncountable Sets

Although we have now given several examples of denumerable sets (and consequently countable sets), we have yet to give an example of an uncountable set. We do this next. First, though, let's review some facts about decimal expansions of real numbers. Every irrational number has a unique decimal expansion and this expansion is nonrepeating, while every rational number has a repeating decimal expansion. For example, $3/11 = 0.272727\cdots$. Some rational numbers, however, have two (repeating) decimal expansions. For example, $1/2 = 0.5000\cdots$ and $1/2 = 0.4999\cdots$. (The number $3/11$ has only one decimal expansion.) In particular, a rational number $a/b$, where $a, b \in \mathbf{N}$, that is reduced to lowest terms has two decimal expansions if and only if the only primes that divide $b$ are 2 or 5. If a rational number has two decimal expansions, then one of the expansions repeats the digit 0 from some point on (that is, the decimal expansion terminates), while the alternate expansion repeats the digit 9 from some point on.

We are now prepared to give an example of an uncountable set. Recall that for real numbers $a$ and $b$ with $a < b$, the open interval $(a, b)$ is defined by $(a, b) = \{x \in \mathbf{R} : a < x < b\}$. Although, as we will see, all open intervals $(a, b)$ of real numbers are uncountable, we prove now only that $(0, 1)$ is uncountable.

Theorem to Prove  The open interval $(0, 1)$ of real numbers is uncountable.

PROOF STRATEGY  Since uncountable means not countable, it is not surprising that we try a proof by contradiction here. So the proof would begin by assuming that $(0, 1)$ is countable. Since $(0, 1)$ is an infinite set, this means that we are assuming that $(0, 1)$ is denumerable, which in turn means that there must exist a bijective function $f : \mathbf{N} \to (0, 1)$. Therefore, for each $n \in \mathbf{N}$, $f(n)$ is a number in the set $(0, 1)$. It might be convenient to introduce some notation for the number $f(n)$, say $f(n) = a_n$, where then $0 < a_n < 1$. Since $f$ is assumed to be one-to-one, it follows that $a_i \neq a_j$ for distinct positive integers $i$ and $j$. Each number $a_n$ has a decimal expansion, say $a_n = 0.a_{n1}a_{n2}a_{n3}\cdots$, where $a_{n1}$ is the first digit in the expansion, $a_{n2}$ is the second digit in the expansion, and so on. We have to be a bit careful here, however, since, as we have seen, some real numbers have two decimal expansions. To avoid possible confusion, we can choose the decimal expansion that repeats the digit 0 from some point on. That is, no real number $a_n$ is written with a decimal expansion that repeats 9 from some point on.

But where does this lead to a contradiction? From what we have said, $(0, 1) = \{a_1, a_2, a_3, \ldots\}$. If we can think of some real number $b \in (0, 1)$ such that $b \notin \{a_1, a_2, a_3, \ldots\}$, then this would give us a contradiction, since it would say that $f$ is not onto. So we need to find a number $b \in (0, 1)$ such that $b \neq a_n$ for each $n \in \mathbf{N}$. Since $b \in (0, 1)$, the number $b$ has a decimal expansion, say $b = 0.b_1b_2b_3\cdots$. How can we choose the digits $b_1, b_2, b_3, \ldots$ so that $b \neq a_n$ for every $n \in \mathbf{N}$? We could choose $b_1 \neq a_{11}$, $b_2 \neq a_{22}$, and so on. But would this mean that $b \neq a_1$, $b \neq a_2$, etc.? We have to be careful here. For example, $0.500\cdots$ and $0.499\cdots$ are two equal numbers whose first digits in their expansions are not equal. The reason for this, of course, is that one is the alternate decimal expansion of the other. So as long as we can avoid selecting a decimal expansion for $b$ that is the alternate decimal expansion of some number $a_n$ with $n \in \mathbf{N}$, we will have found a number $b \in (0, 1)$ such that $b \notin \{a_1, a_2, a_3, \ldots\}$. This will give us a contradiction.

Theorem 10.9  The open interval $(0, 1)$ of real numbers is uncountable.

Proof  Assume, to the contrary, that $(0, 1)$ is countable. Since $(0, 1)$ is infinite, it is denumerable. Therefore, there exists a bijective function $f : \mathbf{N} \to (0, 1)$. For $n \in \mathbf{N}$, let $f(n) = a_n$. Since $a_n \in (0, 1)$, the number $a_n$ has a decimal expansion, say $0.a_{n1}a_{n2}a_{n3}\cdots$, where $a_{ni} \in \{0, 1, 2, \ldots, 9\}$ for all $i \in \mathbf{N}$. If $a_n$ is irrational, then its decimal expansion is unique. If $a_n \in \mathbf{Q}$, then the expansion may be unique. If it is not unique, then, without loss of generality, we assume that the digits of the decimal expansion $0.a_{n1}a_{n2}a_{n3}\cdots$ are 0 from some position on. For example, since $f$ is bijective, $2/5$ is the image of exactly one positive integer and this image is written as $0.4000\cdots$ (rather than as $0.3999\cdots$). To summarize, we have

\[
\begin{aligned}
f(1) &= a_1 = 0.a_{11}a_{12}a_{13}\cdots \\
f(2) &= a_2 = 0.a_{21}a_{22}a_{23}\cdots \\
f(3) &= a_3 = 0.a_{31}a_{32}a_{33}\cdots \\
&\ \ \vdots
\end{aligned}
\]

We show, however, that the function $f$ is not onto. Define the number $b = 0.b_1b_2b_3\cdots$, where $b_i \in \{0, 1, 2, \ldots, 9\}$ for all $i \in \mathbf{N}$, by

\[
b_i = \begin{cases} 4 & \text{if } a_{ii} = 5 \\ 5 & \text{if } a_{ii} \neq 5. \end{cases}
\]

(For example, suppose that $a_1 = 0.31717\cdots$, $a_2 = 0.151515\cdots$, and $a_3 = 0.04000\cdots$. Then the first three digits in the decimal expansion of $b$ are 5, 4, and 5, so $b = 0.545\cdots$.) For each $i \in \mathbf{N}$, the digit $b_i \neq a_{ii}$, which implies that $b \neq a_n$ for each $n \in \mathbf{N}$, since $b$ is not the alternate expansion of any rational number, as no digit in the expansion of $b$ is 9. Thus $b$ is not the image of any element of $\mathbf{N}$. Therefore, $f$ is not onto and, consequently, not bijective, producing a contradiction.

In the proof of Theorem 10.9, each digit in the decimal expansion of the constructed number $b$ is either 4 or 5. We could have chosen any two distinct digits, as long as neither is 9. With the aid of Theorem 10.10, it is now easy to give examples of other uncountable sets.
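The diagonal rule in the proof of Theorem 10.9 can be tried on finitely many digits. The sketch below is only an illustration: it represents each $a_n$ by a string holding its first few digits (an assumption made for convenience, since an actual expansion is infinite) and reuses the three numbers from the example in the proof.

```python
def diagonal_digits(rows):
    """Given rows[i] = leading digits of a_(i+1) as a string, return the first
    len(rows) digits of b, where b_i = 4 if a_ii = 5 and b_i = 5 otherwise."""
    return "".join("4" if row[i] == "5" else "5" for i, row in enumerate(rows))

# Leading digits of a_1 = 0.31717..., a_2 = 0.151515..., a_3 = 0.04000...
rows = ["31717", "15151", "04000"]
b = diagonal_digits(rows)
print("0." + b)
```

By construction the $i$th digit of `b` differs from the $i$th digit of the $i$th row, which is the property the proof uses.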

Theorem 10.10  Let $A$ and $B$ be sets with $A \subseteq B$. If $A$ is uncountable, then $B$ is uncountable.

Proof  Let $A$ and $B$ be two sets such that $A \subseteq B$ and $A$ is uncountable. Necessarily, then, $A$ and $B$ are infinite. Assume, to the contrary, that $B$ is denumerable. Since $A$ is an infinite subset of a denumerable set, it follows by Theorem 10.4 that $A$ is denumerable, producing a contradiction.

Corollary 10.11  The set $\mathbf{R}$ of real numbers is uncountable.

Proof  Since $(0, 1)$ is uncountable by Theorem 10.9 and $(0, 1) \subseteq \mathbf{R}$, it follows by Theorem 10.10 that $\mathbf{R}$ is uncountable.

Let's pause to review a few facts that we have discovered about infinite sets (at least certain infinite sets). First, recall that two nonempty sets $A$ and $B$ are defined to have the same cardinality (the same number of elements) if there exists a bijective function from $A$ to $B$. We are especially interested in the situation where $A$ and $B$ are infinite. One family of infinite sets that we have introduced is the class of denumerable sets. Recall too that a set $S$ is denumerable if there exists a bijective function from $\mathbf{N}$ to $S$. Suppose that $A$ and $B$ are two denumerable sets. Then there exist bijective functions $f : \mathbf{N} \to A$ and $g : \mathbf{N} \to B$. Since $f$ is bijective, $f$ has an inverse function $f^{-1} : A \to \mathbf{N}$, where $f^{-1}$ is also bijective (Theorem 9.15). Since $f^{-1} : A \to \mathbf{N}$ and $g : \mathbf{N} \to B$ are bijective functions, it follows that the composition $g \circ f^{-1} : A \to B$ is also bijective (Corollary 9.12). This tells us that $|A| = |B|$; that is, $A$ and $B$ have the same number of elements. We state this as a theorem for emphasis.

Theorem 10.12  Every two denumerable sets are numerically equivalent.

Next, let $B$ be an uncountable set. So $B$ is an infinite set that is not denumerable. Also, let $A$ be a denumerable set. Therefore, there exists a bijective function $f : \mathbf{N} \to A$. We claim that $|A| \neq |B|$; that is, $A$ and $B$ do not have the same number of elements. Let's prove this. Assume, to the contrary, that $|A| = |B|$. Hence there exists a bijective function $g : A \to B$. Since the functions $f : \mathbf{N} \to A$ and $g : A \to B$ are bijective, the composition $g \circ f : \mathbf{N} \to B$ is bijective. This means, however, that $B$ is a denumerable set, which is a contradiction. We state this fact as a theorem as well.

Theorem 10.13  If $A$ is a denumerable set and $B$ is an uncountable set, then $A$ and $B$ are not numerically equivalent.

Theorems 10.12 and 10.13 can also be considered as consequences of Theorem 10.1. In particular, Theorem 10.13 says that $\mathbf{Z}$ and $\mathbf{R}$ are not numerically equivalent and so $|\mathbf{Z}| \neq |\mathbf{R}|$. So here are two infinite sets that do not have the same number of elements. In other words, there are different sizes of infinity. This brings up a number of questions, one of which is: Do there exist three infinite sets, no two of which have the same number of elements? If $A$ is a denumerable set and $B$ is an uncountable set, is one of these sets "bigger" than the other in some sense? In other words, we would like to be able to compare $|A|$ and $|B|$ in some precise manner. Since $|\mathbf{Z}| \neq |\mathbf{R}|$ and $\mathbf{Z} \subset \mathbf{R}$, it is tempting to conclude that $|\mathbf{Z}| < |\mathbf{R}|$, but we must still give a meaning to $|A| < |B|$ for sets $A$ and $B$. This idea is addressed in Section 10.4. We should remind ourselves, however, that for infinite sets $C$ and $D$, it is possible that both $C \subset D$ and $|C| = |D|$. For example, $\mathbf{Z} \subset \mathbf{Q}$ and $|\mathbf{Z}| = |\mathbf{Q}|$ since $\mathbf{Z}$ and $\mathbf{Q}$ are both denumerable.

Before leaving our discussion of $\mathbf{Z}$ and $\mathbf{R}$, one other observation is useful. Recall that by Theorem 10.4, if $B$ is an infinite subset of a denumerable set $A$, then $B$ is denumerable as well. But what if $A$ is uncountable? That is, if $B$ is an infinite subset of an uncountable set $A$, can we conclude that $B$ is uncountable? The sets $\mathbf{Z}$ and $\mathbf{R}$ answer this question, since $\mathbf{Z}$ is infinite, $\mathbf{R}$ is uncountable, and $\mathbf{Z} \subset \mathbf{R}$. However, $\mathbf{Z}$ is not uncountable.

We have now seen two examples of uncountable sets, namely the open interval $(0, 1)$ of real numbers and the set $\mathbf{R}$ of all real numbers. Neither of these sets has the same number of elements as any denumerable set. But how do they compare with each other? We show that these two sets do, in fact, have the same number of elements. Before verifying this, we show that the open interval $(-1, 1)$ and $\mathbf{R}$ have the same number of elements.

Theorem to Prove  The sets $(-1, 1)$ and $\mathbf{R}$ are numerically equivalent.

PROOF STRATEGY  The obvious approach to proving this theorem is to locate a bijective function $f : (-1, 1) \to \mathbf{R}$. In fact, there are several functions with this property. For each such function, we face the problem of deciding how to show that the function is bijective. We describe one of these functions here. Another is given in Exercise 10.25. Consider the function $f : (-1, 1) \to \mathbf{R}$ defined by

\[ f(x) = \frac{x}{1 - |x|}. \]

(See Figure 10.9.) This function is defined for all $x \in (-1, 1)$. Note that $f(0) = 0$, $f(x) > 0$ when $0 < x < 1$, and $f(x) < 0$ when $-1 < x < 0$. This function also has the property that

\[ \lim_{x \to 1^-} \frac{x}{1 - |x|} = +\infty \quad \text{and} \quad \lim_{x \to -1^+} \frac{x}{1 - |x|} = -\infty. \]

Figure 10.9  The graph of $y = \dfrac{x}{1 - |x|}$, with the vertical lines $x = -1$ and $x = 1$.

If you recall enough information about continuous functions from calculus, you may see that this function is continuous on the interval $(-1, 1)$. From this information, it follows that $f((-1, 1)) = \mathbf{R}$ and so $f$ is onto. Furthermore, the derivative of this function on the interval $(-1, 1)$ is

\[
f'(x) = \begin{cases}
\dfrac{1}{(1 - x)^2} & \text{if } x \in (0, 1) \\
1 & \text{if } x = 0 \\
\dfrac{1}{(1 + x)^2} & \text{if } x \in (-1, 0).
\end{cases}
\]

This says that $f'(x) > 0$ for all $x \in (-1, 1)$ and so $f$ is an increasing function on the interval $(-1, 1)$. This information tells us that $f$ must be one-to-one and therefore $f$ is bijective. Although the argument just given depends on calculus, not all of which you may recall, an argument of the kind we have been using can also be given.

Theorem 10.14  The sets $(-1, 1)$ and $\mathbf{R}$ are numerically equivalent.

Proof  Consider the function $f : (-1, 1) \to \mathbf{R}$ defined by

\[ f(x) = \frac{x}{1 - |x|}. \tag{10.2} \]

We show that $f$ is bijective. First we verify that $f$ is one-to-one. Assume that $f(a) = f(b)$, where $a, b \in (-1, 1)$. Then $\frac{a}{1 - |a|} = \frac{b}{1 - |b|}$. If $\frac{a}{1 - |a|} = \frac{b}{1 - |b|} = 0$, then $a = b = 0$. If $\frac{a}{1 - |a|} = \frac{b}{1 - |b|} > 0$, then $a > 0$ and $b > 0$. So $\frac{a}{1 - a} = \frac{b}{1 - b}$. Thus $a(1 - b) = b(1 - a)$ and so $a = b$. If $\frac{a}{1 - |a|} = \frac{b}{1 - |b|} < 0$, then $a < 0$ and $b < 0$. So $\frac{a}{1 + a} = \frac{b}{1 + b}$. Thus $a(1 + b) = b(1 + a)$ and so $a = b$. Therefore, $f$ is one-to-one.

Next we show that $f$ is onto. Let $r \in \mathbf{R}$. Since $f(0) = 0$, we may assume that $r \neq 0$. If $r > 0$, then $\frac{r}{1 + r} \in (0, 1)$ and $f\left(\frac{r}{1 + r}\right) = r$. If $r < 0$, then $\frac{r}{1 - r} \in (-1, 0)$ and $f\left(\frac{r}{1 - r}\right) = r$. Therefore, $f$ is onto. Since $f$ is a bijective function, the sets $(-1, 1)$ and $\mathbf{R}$ are numerically equivalent.

It is straightforward to show that the function $g : (0, 1) \to (-1, 1)$ defined by $g(x) = 2x - 1$ is bijective. For this function $g$ and the function $f$ in (10.2) in the proof of Theorem 10.14, it follows that $f \circ g : (0, 1) \to \mathbf{R}$ is also bijective. This gives us an immediate corollary.

Corollary 10.15  The sets $(0, 1)$ and $\mathbf{R}$ are numerically equivalent.

Not only are $(0, 1)$ and $\mathbf{R}$ numerically equivalent (as are $(-1, 1)$ and $\mathbf{R}$), but every open interval $(a, b)$ of real numbers with $a < b$ is numerically equivalent to $\mathbf{R}$. (See Exercise 10.23.)
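The preimages used in the proof of Theorem 10.14 can be checked with exact rational arithmetic. The sketch below is an illustration under one assumption: the single formula $r/(1 + |r|)$ in `f_inverse` is a compact way of writing the two cases $r/(1 + r)$ for $r > 0$ and $r/(1 - r)$ for $r < 0$ from the proof; it is not notation used in the text.

```python
from fractions import Fraction

def f(x):
    """f : (-1, 1) -> R, f(x) = x / (1 - |x|)."""
    return x / (1 - abs(x))

def f_inverse(r):
    """Preimage of r under f: r/(1+r) if r > 0, r/(1-r) if r < 0, and 0 if r = 0."""
    return r / (1 + abs(r))

def g(x):
    """g : (0, 1) -> (-1, 1), g(x) = 2x - 1."""
    return 2 * x - 1

for r in [Fraction(-7, 2), Fraction(0), Fraction(1, 3), Fraction(5)]:
    x = f_inverse(r)
    print(r, "has preimage", x, "and f maps it back to", f(x))
```

Composing in the order $f \circ g$ then sends a point of $(0, 1)$ to a real number; for instance `f(g(Fraction(3, 4)))` evaluates $f(1/2)$.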

10.4 Comparing Cardinalities of Sets

As we know, two nonempty sets $A$ and $B$ have the same cardinality if there exists a bijective function $f : A \to B$. Let's illustrate this concept one more time by showing that two familiar sets associated with a given set are numerically equivalent. Recall that the power set $\mathcal{P}(A)$ of a set $A$ is the set of all subsets of $A$ and that $2^A$ is the set of all functions from $A$ to $\{0, 1\}$. If $A = \{a, b, c\}$, then $|\mathcal{P}(A)| = 2^3 = 8$. Also, the set $2^A$ contains $2^{|A|} = 2^3 = 8$ functions. Hence in this case, $\mathcal{P}(A)$ and $2^A$ have the same number of elements. This is not a coincidence.

Theorem to Prove  For every nonempty set $A$, the sets $\mathcal{P}(A)$ and $2^A$ are numerically equivalent.

PROOF STRATEGY  If we can construct a bijective function $\phi : \mathcal{P}(A) \to 2^A$, then this proves that $\mathcal{P}(A)$ and $2^A$ are numerically equivalent. We use $\phi$ for this function since $2^A$ is a set of functions and it is probably better to reserve more standard notation, such as $f$, for the elements of $2^A$. But how can such a function $\phi$ be defined? Let's take a look at $\mathcal{P}(A)$ and $2^A$ for $A = \{a, b\}$. In this case, $\mathcal{P}(A) = \{\emptyset, \{a\}, \{b\}, \{a, b\}\}$, while $2^A = \{f_1, f_2, f_3, f_4\}$, where

\[
\begin{aligned}
f_1 &= \{(a, 0), (b, 0)\}, & f_2 &= \{(a, 1), (b, 0)\}, \\
f_3 &= \{(a, 0), (b, 1)\}, & f_4 &= \{(a, 1), (b, 1)\}.
\end{aligned}
\]

Since $\mathcal{P}(A)$ and $2^A$ each have four elements, we can easily find a bijective function from $\mathcal{P}(A)$ to $2^A$. But this is not the point. What we are looking for is a bijective function $\phi : \mathcal{P}(A) \to 2^A$ for $A = \{a, b\}$ that suggests a way to define a bijective function from $\mathcal{P}(A)$ to $2^A$ for any set $A$ (finite or infinite). For $A = \{a, b\}$, notice the connection between the following pairs of elements, where the first element belongs to $\mathcal{P}(A)$ and the second to $2^A$:

\[
\begin{aligned}
\emptyset, &\quad f_1 = \{(a, 0), (b, 0)\} \\
\{a\}, &\quad f_2 = \{(a, 1), (b, 0)\} \\
\{b\}, &\quad f_3 = \{(a, 0), (b, 1)\} \\
\{a, b\}, &\quad f_4 = \{(a, 1), (b, 1)\}.
\end{aligned}
\]

For example, the subset $\{a\}$ of $\{a, b\}$ contains $a$ but not $b$, while $f_2$ maps $a$ to 1 and $b$ to 0. For an arbitrary set $A$, this suggests defining $\phi$ so that a subset $S$ of $A$ is mapped into the function in which 1 is the image of those elements of $A$ belonging to $S$ and 0 is the image of those elements of $A$ not belonging to $S$.

Theorem 10.16  For every nonempty set $A$, the sets $\mathcal{P}(A)$ and $2^A$ are numerically equivalent.

Proof  We show that there exists a bijective function $\phi$ from $\mathcal{P}(A)$ to $2^A$. Define $\phi : \mathcal{P}(A) \to 2^A$ such that for $S \in \mathcal{P}(A)$, we have $\phi(S) = f_S$, where, for $x \in A$,

\[
f_S(x) = \begin{cases} 1 & \text{if } x \in S \\ 0 & \text{if } x \notin S. \end{cases}
\]

Certainly, $f_S \in 2^A$. First we show that $\phi$ is one-to-one. Let $\phi(S) = \phi(T)$. Thus $f_S = f_T$, which implies that $f_S(x) = f_T(x)$ for every $x \in A$. Hence $f_S(x) = 1$ if and only if $f_T(x) = 1$ for every $x \in A$; that is, $x \in S$ if and only if $x \in T$ and so $S = T$. It remains to show that $\phi$ is onto. Let $f \in 2^A$. Define $S = \{x \in A : f(x) = 1\}$. Hence $f_S = f$ and so $\phi(S) = f$. Thus $\phi$ is onto and, consequently, $\phi$ is bijective.

It is clear that $A = \{x, y, z\}$ has fewer elements than $B = \{a, b, c, d, e\}$; that is, $|A| < |B|$. And it certainly seems that $|B| < |\mathbf{N}|$ and that, in general, any finite set has fewer elements than any denumerable set (or any infinite set). Also, our discussion of denumerable and uncountable sets appears to suggest that uncountable sets have more elements than denumerable sets. These assertions are based on intuition, however. We now make this more precise.

A set $A$ is said to have smaller cardinality than a set $B$, written as $|A| < |B|$, if there exists a one-to-one function from $A$ to $B$ but no bijective function from $A$ to $B$. That is, $|A| < |B|$ if it is possible to pair off the elements of $A$ with some of the elements of $B$ but not with all of the elements of $B$. If $|A| < |B|$, then we also write $|B| > |A|$. For example, since $\mathbf{N}$ is denumerable and $\mathbf{R}$ is uncountable, there is no bijective function from $\mathbf{N}$ to $\mathbf{R}$. Since the function $f : \mathbf{N} \to \mathbf{R}$ defined by $f(n) = n$ for all $n \in \mathbf{N}$ is injective, it follows that $|\mathbf{N}| < |\mathbf{R}|$. Moreover, $|A| \leq |B|$ means that $|A| = |B|$ or $|A| < |B|$. Hence to verify that $|A| \leq |B|$, we need only show the existence of a one-to-one function from $A$ to $B$.

The cardinality of the set $\mathbf{N}$ of natural numbers is often denoted by $\aleph_0$ (often read "aleph null"); so $|\mathbf{N}| = \aleph_0$. The symbol $\aleph$ is the first letter of the Hebrew alphabet.
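The correspondence $\phi$ from the proof of Theorem 10.16 is easy to experiment with on a small set. The sketch below is illustrative only: representing a function in $2^A$ as a Python dictionary, and the helper name `phi_inverse`, are assumptions made here and are not part of the text.

```python
def phi(S, A):
    """Return f_S as a dict on A: f_S(x) = 1 if x is in S, and 0 otherwise."""
    return {x: (1 if x in S else 0) for x in A}

def phi_inverse(f):
    """Recover S = {x in A : f(x) = 1} from a function f in 2^A."""
    return {x for x, value in f.items() if value == 1}

A = {"a", "b"}
subsets = [set(), {"a"}, {"b"}, {"a", "b"}]
images = [phi(S, A) for S in subsets]
for S, f_S in zip(subsets, images):
    print(sorted(S), "->", sorted(f_S.items()))
```

The four printed functions are $f_1, f_2, f_3, f_4$ from the proof strategy, and applying `phi_inverse` to each returns the subset it came from, mirroring the argument that $\phi$ is onto.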
Indeed, if $A$ is any denumerable set, then $|A| = \aleph_0$. The set $\mathbf{R}$ of real numbers is also referred to as the continuum, and its cardinality is denoted by $c$. Hence $|\mathbf{R}| = c$ and, from what we have seen, $\aleph_0 < c$. It was the German mathematician Georg Cantor who helped to put the theory of sets on a firm foundation. One of his conjectures became known as:

The Continuum Hypothesis  There exists no set $S$ such that $\aleph_0 < |S| < c$.

Of course, if the Continuum Hypothesis were true, then this would imply that every subset of $\mathbf{R}$ is either countable or numerically equivalent to $\mathbf{R}$. The Austrian mathematician Kurt Gödel proved that it is impossible to disprove the Continuum Hypothesis from the axioms on which set theory is based. In 1963 the American mathematician Paul Cohen went one step further by showing that it is also impossible to prove the Continuum Hypothesis from these axioms. Thus the Continuum Hypothesis is independent of the axioms of set theory.

Another question that may occur is the following: Is there a set $S$ such that $|S| > c$? This is a question we can answer, and the answer may come as a surprise.

Theorem to Prove  If $A$ is a set, then $|A| < |\mathcal{P}(A)|$.

PROOF STRATEGY  First, it is not surprising that $|A| < |\mathcal{P}(A)|$ if $A$ is finite, since if $A$ has $n$ elements, where $n \in \mathbf{N}$, then $\mathcal{P}(A)$ has $2^n$ elements and $2^n > n$ (which was proved by induction in Result 6.9). Of course, we still must show that $|A| < |\mathcal{P}(A)|$ when $A$ is infinite. First, we show that for every set $A$, there exists a one-to-one function $f : A \to \mathcal{P}(A)$. Let's take an example, say $A = \{a, b\}$. Then $\mathcal{P}(A) = \{\emptyset, \{a\}, \{b\}, \{a, b\}\}$. Although there are many injective functions from $A$ to $\mathcal{P}(A)$, there is one natural one-to-one function: $f = \{(a, \{a\}), (b, \{b\})\}$; that is, define $f : A \to \mathcal{P}(A)$ by $f(x) = \{x\}$. Once we have verified that this function is one-to-one, we know that $|A| \leq |\mathcal{P}(A)|$. To show that the inequality is strict, however, we must prove that there is no bijective function from $A$ to $\mathcal{P}(A)$. The natural technique for such a proof is proof by contradiction.

Theorem 10.17  If $A$ is a set, then $|A| < |\mathcal{P}(A)|$.

Proof  If $A = \emptyset$, then $|A| = 0$ and $|\mathcal{P}(A)| = 1$; so $|A| < |\mathcal{P}(A)|$. Hence we may assume that $A \neq \emptyset$. First, we show that there is a one-to-one function from $A$ to $\mathcal{P}(A)$. Define the function $f : A \to \mathcal{P}(A)$ by $f(x) = \{x\}$ for each $x \in A$. Let $f(x_1) = f(x_2)$. Then $\{x_1\} = \{x_2\}$. So $x_1 = x_2$ and $f$ is one-to-one.

To show that $|A| < |\mathcal{P}(A)|$, it remains to prove that there is no bijective function from $A$ to $\mathcal{P}(A)$. Assume, to the contrary, that there exists a bijective function $g : A \to \mathcal{P}(A)$. For each $x \in A$, let $g(x) = A_x$, where $A_x \subseteq A$. We show that there is a subset of $A$ that is distinct from $A_x$ for each $x \in A$. Define the subset $B$ of $A$ by $B = \{x \in A : x \notin A_x\}$. By assumption, there exists an element $y \in A$ such that $B = A_y$. If $y \in A_y$, then $y \notin B$ by the definition of $B$. On the other hand, if $y \notin A_y$, then, according to the definition of the set $B$, it follows that $y \in B$. In either case, $y$ belongs to exactly one of $A_y$ and $B$. Hence $B \neq A_y$, producing a contradiction.

By Theorem 10.17, there is no largest set. In particular, there is a set $S$ with $|S| > c$.
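For a small finite set, the key step in the proof of Theorem 10.17 can be checked exhaustively. The sketch below is an illustration, not part of the text: it takes the hypothetical set $A = \{1, 2, 3\}$, runs over every function $g : A \to \mathcal{P}(A)$, and confirms that the set $B = \{x \in A : x \notin g(x)\}$ is never in the range of $g$.

```python
from itertools import combinations, product

def power_set(A):
    """Return all subsets of A as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(A, g):
    """B = {x in A : x not in g(x)} for g given as a dict from A to subsets of A."""
    return frozenset(x for x in A if x not in g[x])

A = {1, 2, 3}
subsets = power_set(A)
# Try every function g : A -> P(A) (8**3 = 512 of them) and check that B is missed.
missed_every_time = all(
    diagonal_set(A, dict(zip(sorted(A), choice))) not in choice
    for choice in product(subsets, repeat=len(A))
)
print(missed_every_time)
```

A finite check of this kind proves nothing about infinite sets, of course; it only illustrates why no $g$ can have $B$ as an image.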

10.5 The Schröder-Bernstein Theorem

For two nonempty sets $A$ and $B$, let $f$ be a function from $A$ to $B$ and let $D$ be a nonempty subset of $A$. By the restriction $f_1$ of $f$ to $D$, we mean the function $f_1 = \{(x, y) \in f : x \in D\}$. Hence a restriction of $f$ refers to restricting the domain of $f$. For example, for the sets $A = \{a, b, c, d\}$ and $B = \{1, 2, 3\}$, let $f = \{(a, 2), (b, 1), (c, 3), (d, 2)\}$ be a function from $A$ to $B$. For $D = \{a, c\}$, the restriction of $f$ to $D$ is the function $f_1 : D \to B$ given by $f_1 = \{(a, 2), (c, 3)\}$. At times we may also consider a new codomain $B'$ for such a restriction $f_1$ of $f$. Of course, we must have $\mathrm{range}(f_1) \subseteq B'$.

Next, consider the function $g : \mathbf{R} \to [0, \infty)$ defined by $g(x) = x^2$ for $x \in \mathbf{R}$. Although $g$ is onto, $g$ is not one-to-one since $g(1) = g(-1) = 1$, for example. On the other hand, the restriction $g_1$ of $g$ to $[0, \infty)$ is one-to-one, and so the restricted function $g_1 : [0, \infty) \to [0, \infty)$ defined by $g_1(x) = g(x) = x^2$ for all $x \in [0, \infty)$ is bijective. Moreover, if $f : A \to B$ is one-to-one, then any restriction of $f$ to a subset of $A$ is also one-to-one.

Let $f : A \to B$ and $g : C \to D$ be functions, where $A$ and $C$ are disjoint sets. We define a function $h$ from $A \cup C$ to $B \cup D$ by

\[
h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) & \text{if } x \in C. \end{cases}
\]

Recalling that a function is a set of ordered pairs, we see that $h$ is the union of the two sets $f$ and $g$. It is important, of course, that $A$ and $C$ be disjoint in order to guarantee that $h$ is a function. If $f$ and $g$ are onto, then $h$ must also be onto; however, if $f$ and $g$ are one-to-one, then $h$ need not be one-to-one. The following result does provide a sufficient condition for $h$ to be one-to-one, however.

Lemma 10.18  Let $f : A \to B$ and $g : C \to D$ be one-to-one functions, where $A \cap C = \emptyset$, and define $h : A \cup C \to B \cup D$ by

\[
h(x) = \begin{cases} f(x) & \text{if } x \in A \\ g(x) & \text{if } x \in C. \end{cases}
\]

If $B \cap D = \emptyset$, then $h$ is also a one-to-one function. Consequently, if $f$ and $g$ are bijective functions, then $h$ is a bijective function.

Proof  Assume that $h(x_1) = h(x_2) = y$, where $x_1, x_2 \in A \cup C$. So $y \in B \cup D$. Hence $y \in B$ or $y \in D$, say the former. Since $B \cap D = \emptyset$, it follows that $y \notin D$. Hence $x_1, x_2 \in A$ and so $h(x_1) = f(x_1)$ and $h(x_2) = f(x_2)$. Since $f(x_1) = f(x_2)$ and $f$ is one-to-one, it follows that $x_1 = x_2$.

Let $A$ and $B$ be nonempty sets such that $B \subseteq A$ and let $f : A \to B$. Thus for $x \in A$, the element $f(x) \in B$. Since $B \subseteq A$, it follows, of course, that $f(x) \in A$ and so $f(f(x)) \in B$. It is convenient to introduce some notation in this case. Let $f^1(x) = f(x)$ and let $f^2(x) = f(f(x))$. In general, for an integer $k \geq 2$, let $f^k(x) = f(f^{k-1}(x))$. Hence $f^1(x), f^2(x), f^3(x), \ldots$ is a recursively defined sequence of elements of $B$ (and of $A$ as well). Thus $f^n(x)$ is defined for every positive integer $n$.

For example, consider the function $f : \mathbf{Z} \to 2\mathbf{Z}$ defined by $f(n) = 4n$ for all $n \in \mathbf{Z}$. Then $f^1(3) = f(3) = 4 \cdot 3 = 12$ and $f^2(3) = f(f(3)) = f(12) = 4 \cdot 12 = 48$.

If $A$ and $B$ are nonempty sets such that $B \subseteq A$, then the function $\phi : B \to A$ defined by $\phi(x) = x$ for all $x \in B$ is injective. This gives us the expected result that $|B| \leq |A|$. On the other hand, if there exists an injective function from $A$ to $B$, then a more interesting consequence results.

Theorem 10.19  Let $A$ and $B$ be nonempty sets such that $B \subseteq A$. If there exists an injective function from $A$ to $B$, then there exists a bijective function from $A$ to $B$.

Proof  If $B = A$, then the identity function $i_A : A \to B = A$ is bijective. Thus we can assume that $B \subset A$ and so $A - B \neq \emptyset$. Let $f : A \to B$ be an injective function. If $f$ is bijective, then the proof is complete. Thus we can assume that $f$ is not onto. Hence $\mathrm{range}(f) \subset B$ and so $B - \mathrm{range}(f) \neq \emptyset$. Consider the subset $B'$ of $B$ defined by

\[ B' = \{f^n(x) : x \in A - B,\ n \in \mathbf{N}\}. \]

Thus $B' \subseteq \mathrm{range}(f)$. Hence for each $x \in A - B$, its image $f(x)$ belongs to $B'$. Moreover, for $x \in A - B$, the elements $f^2(x) = f(f(x)) \in B'$, $f^3(x) = f(f^2(x)) \in B'$, and so on. Let $C = (A - B) \cup B'$ and consider the restriction $f_1 : C \to B'$ of $f$ to $C$. We show that $f_1$ is onto. Let $y \in B'$. Then $y = f^n(x)$ for some $x \in A - B$ and some $n \in \mathbf{N}$. It then follows that $y = f(x)$ for some $x \in A - B$ or $y = f(x)$ for some $x \in B'$. Therefore, $f_1(x) = y$ for some $x \in C$ and so $f_1$ is onto. Furthermore, since $f$ is one-to-one, the function $f_1$ is also one-to-one. Hence $f_1 : C \to B'$ is bijective.

Let $D = B - B'$. Since $B - \mathrm{range}(f) \neq \emptyset$ and $B - \mathrm{range}(f) \subseteq B - B'$, it follows that $D \neq \emptyset$. Also, $D$ and $B'$ are disjoint, as are $D$ and $C$. Certainly, the identity function $i_D : D \to D$ is bijective. Let $h : C \cup D \to B' \cup D$ be defined by

\[
h(x) = \begin{cases} f_1(x) & \text{if } x \in C \\ i_D(x) & \text{if } x \in D. \end{cases}
\]

By Lemma 10.18, $h$ is bijective. However, $C \cup D = A$ and $B' \cup D = B$; so $h$ is a bijective function from $A$ to $B$.

From what we know about inequalities (of real numbers), it might seem that if $A$ and $B$ are sets with $|A| \leq |B|$ and $|B| \leq |A|$, then $|A| = |B|$. This is indeed the case. This theorem is often referred to as the Schröder-Bernstein Theorem.
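The construction in the proof of Theorem 10.19 can be traced on the earlier example $f : \mathbf{Z} \to 2\mathbf{Z}$ with $f(n) = 4n$. The sketch below is a worked illustration under stated assumptions: for this particular $f$, the set $A - B$ consists of the odd integers, so $C = (A - B) \cup B'$ is the set of integers of the form $4^n m$ with $m$ odd and $n \geq 0$; the helper names `in_C`, `h`, and `h_inverse` are hypothetical.

```python
def f(n):
    return 4 * n                     # injective map from Z into 2Z

def iterate(x, k):
    """Return f^k(x) for an integer k >= 1."""
    for _ in range(k):
        x = f(x)
    return x

def in_C(x):
    """For this f, x is in C exactly when x = 4**n * (odd integer) with n >= 0."""
    if x == 0:
        return False
    while x % 4 == 0:
        x //= 4
    return x % 2 != 0

def h(x):
    """h = f on C and the identity on D = B - B'."""
    return f(x) if in_C(x) else x

def h_inverse(y):
    """Preimage of an even integer y: y lies in B' exactly when in_C(y) holds."""
    return y // 4 if in_C(y) else y

print(iterate(3, 1), iterate(3, 2))
print([(x, h(x)) for x in range(0, 9)])
```

On a finite window one can confirm that every value of `h` is even, that distinct inputs give distinct outputs, and that `h(h_inverse(y))` returns `y` for each even `y`, in line with the conclusion that $h : \mathbf{Z} \to 2\mathbf{Z}$ is bijective.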

Theorem 10.20  (The Schröder-Bernstein Theorem) If $A$ and $B$ are sets such that $|A| \leq |B|$ and $|B| \leq |A|$, then $|A| = |B|$.

Proof  Since $|A| \leq |B|$ and $|B| \leq |A|$, there are injective functions $f : A \to B$ and $g : B \to A$. Thus $g_1 : B \to \mathrm{range}(g)$ defined by $g_1(x) = g(x)$ for all $x \in B$ is a bijective function. By Theorem 9.15, $g_1^{-1}$ exists and $g_1^{-1} : \mathrm{range}(g) \to B$ is a bijective function. Since $f : A \to B$ and $g_1 : B \to \mathrm{range}(g)$ are injective functions, it follows by Theorem 9.11 that $g_1 \circ f : A \to \mathrm{range}(g)$ is an injective function. Because $\mathrm{range}(g) \subseteq A$, we have by Theorem 10.19 that there exists a bijective function $h : A \to \mathrm{range}(g)$. Thus $h : A \to \mathrm{range}(g)$ and $g_1^{-1} : \mathrm{range}(g) \to B$ are bijective functions. By Corollary 9.12, $g_1^{-1} \circ h : A \to B$ is a bijective function and so $|A| = |B|$.

The Schröder-Bernstein Theorem is referred to by some as the Cantor-Schröder-Bernstein Theorem. Although the history of this theorem has never been fully documented, there are several established facts. One mathematician who will forever be associated with the theory of sets is Georg Cantor (1845–1918). Born in Russia, Cantor received his doctoral degree in mathematics from the University of Berlin in 1867 and became a faculty member at the University of Halle in Germany in 1869. While there, he became interested in set theory. In 1873 Cantor proved that the set of rational numbers is denumerable. Shortly afterward, he proved that the set of real numbers is uncountable. In this paper, he essentially introduced the idea of a one-to-one correspondence (bijective function). Over the next several years, he made numerous contributions to set theory, including the study of sets of the same cardinality. There were, however, a number of problems that proved difficult for Cantor. Consider the following two theorems:

Theorem A

For any two cardinal numbers a and b, exactly one of the following occurs: (1) a = b, (2) a < b, (3) a > b.

Theorem B

If A and B are two sets for which there is a one-to-one function from A to B and a one-to-one function from B to A, then |A| = |B|.

Cantor observed that once Theorem A had been proved, Theorem B could be proved. On the other hand, there was never any evidence that Cantor was able to prove Theorem A. Ernst Zermelo (1871–1953) was able to prove Theorem A in 1904. However, Zermelo's proof used an axiom formulated by Zermelo himself. This axiom, which was controversial in the mathematical world for many years, is known as the Axiom of Choice.

The Axiom of Choice. For every collection of pairwise disjoint nonempty sets, there is at least one set that contains exactly one element from each of those nonempty sets.

As it turns out, not only can the Axiom of Choice be used to prove Theorem A, but Theorem A is true if and only if the Axiom of Choice is true.

Ernst Schröder (1841–1902), a German mathematician, was one of the leading figures in mathematical logic. In 1897–1898 Schröder presented a "proof" of Theorem B, but it contained an error. Around the same time, Felix Bernstein (1878–1956) gave his own proof of Theorem B in his doctoral thesis, which became the first complete proof of Theorem B. His proof required no knowledge of Theorem A.

It may be surprising to learn that R and the power set of N are numerically equivalent. But how could one find a bijective function between these two sets? Theorem 10.20 tells us that finding such a function is unnecessary.

Theorem 10.21

The sets P(N) and R are numerically equivalent.

Proof

First we show that there is a one-to-one function f : (0, 1) → P(N). Recall that a real number a ∈ (0, 1) can be expressed uniquely as a = 0.a_1 a_2 a_3 ···, where each a_i ∈ {0, 1, ..., 9} and there is no positive integer N with a_n = 9 for all n ≥ N. We define f(a) = {10^(n−1) a_n : n ∈ N} = A. For example, f(0.1234) = {0, 1, 20, 300, 4000} and f(1/3) = {3, 30, 300, ...}.

Now we show that f is one-to-one. Suppose f(a) = f(b), where a, b ∈ (0, 1), a = 0.a_1 a_2 a_3 ··· and b = 0.b_1 b_2 b_3 ···, with a_i, b_i ∈ {0, 1, ..., 9} for every i ∈ N, and where neither decimal expansion is 9 from some point on. Hence A = {10^(n−1) a_n : n ∈ N} = {10^(n−1) b_n : n ∈ N} = B. Consider the ith digit a_i in the decimal expansion of a. So 10^(i−1) a_i ∈ A.

If a_i ≠ 0, then 10^(i−1) a_i is the only number in the interval [10^(i−1), 9 · 10^(i−1)] that belongs to A. Since A = B, it follows that 10^(i−1) a_i ∈ B. However, 10^(i−1) b_i is the only number belonging to B in the interval [10^(i−1), 9 · 10^(i−1)]; so 10^(i−1) a_i = 10^(i−1) b_i and a_i = b_i.

If a_i = 0, then 0 ∈ A and there is no number belonging to A in the interval [10^(i−1), 9 · 10^(i−1)]. Since A = B, it follows that 0 ∈ B and there is no number in the interval [10^(i−1), 9 · 10^(i−1)] belonging to B. So b_i = 0 and therefore a_i = b_i.

Thus a_i = b_i for all i ∈ N, and so a = b. Hence f is one-to-one and |(0, 1)| ≤ |P(N)|.

Next we define a function g : P(N) → (0, 1). For S ⊆ N, define g(S) = 0.s_1 s_2 s_3 ···, where

s_n = 1 if n ∈ S
s_n = 2 if n ∉ S.

Thus g(S) is a real number in (0, 1) whose decimal expansion consists only of 1s and 2s. We show that g is one-to-one. Suppose g(S) = g(T), where S, T ⊆ N. Thus g(S) = s = 0.s_1 s_2 s_3 ··· = 0.t_1 t_2 t_3 ··· = t = g(T), where

s_n = 1 if n ∈ S and s_n = 2 if n ∉ S,
t_n = 1 if n ∈ T and t_n = 2 if n ∉ T.

Since the decimal expansions of s and t contain no 0s or 9s, both s and t have unique decimal expansions. We show that S = T. First we verify that S ⊆ T. Let k ∈ S. Then s_k = 1. Since s = t, it follows that t_k = 1, which implies that k ∈ T. So S ⊆ T. The proof that T ⊆ S is similar and is therefore omitted. Thus S = T and g is one-to-one. Hence |P(N)| ≤ |(0, 1)|.

By the Schröder-Bernstein Theorem, |P(N)| = |(0, 1)|. By Corollary 10.15, |(0, 1)| = |R|. Thus |P(N)| = |R|.

As a consequence of Theorems 10.16 and 10.21, we have the following result.

Corollary 10.22

The sets 2^N and R are numerically equivalent.
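The two injections used in the proof of Theorem 10.21 can be explored on finite prefixes of decimal expansions. The sketch below is only an illustration: the helper names `f_prefix` and `g_prefix` are hypothetical, digits are passed as strings to avoid floating-point rounding, and only finitely many digits are examined, whereas the functions f and g in the proof act on entire expansions.

```python
def f_prefix(digits):
    """Set {10^(n-1) * a_n} built from the leading digits a_1 a_2 ... of 0.a_1 a_2 ..."""
    return {10 ** (n - 1) * int(d) for n, d in enumerate(digits, start=1)}


def g_prefix(subset, length):
    """First `length` digits of g(S): digit n is 1 if n is in S and 2 otherwise."""
    digits = "".join("1" if n in subset else "2" for n in range(1, length + 1))
    return "0." + digits


print(f_prefix("1234"))        # contributions of the digits 1, 2, 3, 4
print(f_prefix("12340"))       # a trailing zero digit contributes the element 0
print(g_prefix({1, 3, 4}, 6))  # S = {1, 3, 4}
```

Two subsets that differ at some n ≤ 6 produce different six-digit prefixes under `g_prefix`, which mirrors the argument that g is one-to-one.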


We have already mentioned that |A| = ℵ0 for every denumerable set A and that |R| = c. If A is denumerable, then we represent the cardinality of the set 2^A by 2^ℵ0. By Corollary 10.22, 2^ℵ0 = c.

EXERCISES FOR CHAPTER 10

Section 10.1: Numerically Equivalent Sets

10.1. Let S = {A1, A2, ..., A5} be a collection of five subsets of the set A = {−5, −4, ..., 5}, where
A1 = {x ∈ A : 1 < x^2 < 10}
A2 = {x ∈ A : (x + 2)(x − 4) > 0}
A3 = {x ∈ A : |x + 2| + |x − 3| ≤ 5}
A4 = {x ∈ A : 1/(x^2 + 1) > 2/5}
A5 = {x ∈ A : sin(πx/4) = 0}.
A relation R is defined on S by Ai R Aj (1 ≤ i, j ≤ 5) if Ai and Aj are numerically equivalent. By Theorem 10.1, R is an equivalence relation. Determine the distinct equivalence classes for this equivalence relation.

10.2. (a) Let S be a collection of n ≥ 2 numerically equivalent sets. Prove that these sets can be shown to be numerically equivalent by using n − 1 bijective functions between pairs of sets in S. (b) What other question does the problem in (a) suggest?

Section 10.2: Denumerable Sets

10.3. Prove that if A and B are disjoint countable sets, then A ∪ B is countable.

10.4. Let R+ be the set of positive real numbers and let A and B be countable subsets of R+. Define C = {x ∈ R : −x ∈ B}. Show that A ∪ C is countable.

10.5. Prove that |Z| = |Z − {2}|.

10.6. (a) Prove that the function f : R − {1} → R − {2} defined by f(x) = 2x/(x − 1) is bijective. (b) Explain why |R − {1}| = |R − {2}|.

10.7. Let S = {x ∈ R : x = (n^2 + √2)/n, n ∈ N}. Define f : N → S by f(n) = (n^2 + √2)/n.
(a) List three elements that belong to S.
(b) Show that f is one-to-one.
(c) Show that f is onto.
(d) Is S countable? Explain.

10.8. Prove that the function f : N → Z defined in (10.1) by f(n) = (1 + (−1)^n (2n − 1))/4 is bijective.

10.9. Show that every denumerable set A can be partitioned into two denumerable subsets of A.

10.10. Let A be a countable set and let B = {x, y}. Prove that A × B is countable.

10.11. Let B be a countable set and A a nonempty set of unspecified cardinality. If f : A → B is a one-to-one function, then what can be said about the cardinality of A? Explain.

10.12. Prove that the set of all 2-element subsets of N is countable.

10.13. A Gaussian integer is a complex number of the form a + bi, where a, b ∈ Z and i = √−1. Show that the set G of Gaussian integers is countable.

10.14. Prove that S = {(a, b) : a, b ∈ N and b ≥ 2a} is countable.

10.15. Let S ⊆ N × N be defined by S = {(i, j) : i ≤ j}. Show that S is countable.

10.16. Let A1, A2, A3, ... be pairwise disjoint denumerable sets. Prove that the union of all the sets Ai (i = 1, 2, 3, ...) is denumerable.

10.17. Let A = {a1, a2, a3, ...}. Define B = A − {a_(n^2) : n ∈ N}. Prove that |A| = |B|.

10.18. A function f : N × N → N is defined by f(m, n) = 2^(m−1) (2n − 1). (a) Prove that f is one-to-one and onto. (b) Show that N × N is countable.

10.19. Prove that every denumerable set A can be partitioned into a denumerable number of denumerable subsets of A.

Section 10.3: Uncountable Sets

10.20. Prove that the set of irrational numbers is uncountable.

10.21. Prove that the set of complex numbers is uncountable.

10.22. Prove that the open interval (−2, 2) and R are numerically equivalent by finding a bijective function h : (−2, 2) → R. (Show that your function is indeed bijective.)

10.23. (a) Prove that the function f : (0, 1) → (0, 2), mapping the open interval (0, 1) into the open interval (0, 2) and defined by f(x) = 2x, is bijective. (b) Explain why (0, 1) and (0, 2) have the same cardinality. (c) Let a, b ∈ R, where a < b. Prove that (0, 1) and (a, b) have the same cardinality.

10.24. Prove that R and R+ are numerically equivalent.

10.25. Consider the function g : (−1, 1) → R defined by g(x) = x/(1 − x^2). (a) Prove that g is onto. (b) Prove that g is one-to-one. (c) What conclusions can be drawn from the information obtained in (a) and (b)?

Section 10.4: Comparing Cardinalities of Sets

10.26. Prove or disprove the following:
(a) If A is an uncountable set, then |A| = |R|.
(b) There is a bijective function f : Q → R.
(c) If A, B and C are sets such that A ⊆ B ⊆ C and A and C are countable, then B is countable.
(d) The set S = {√2/n : n ∈ N} is countable.
(e) There is a countable subset of the set of irrational numbers.
(f) Every infinite set is a subset of a countable set.
(g) If A and B are sets such that there is an injective function f : A → B, then |A| = |B|.

10.27. Let A and B be nonempty sets. Prove that |A| ≤ |A × B|.

10.28. Prove or disprove: If A and B are two sets such that A is countable and |A| < |B|, then B is uncountable.

10.29. How do the cardinalities of the sets [0, 1] and [1, 3] compare? Justify your answer.

10.30. Let A = {a, b, c}. Then P(A) consists of the following subsets of A: A_a = ∅, A_b = A, A_c = {a, b}, A_d = {a, c}, A_e = {b, c}, A_f = {a}, A_g = {b}, A_h = {c}.


In one part of the proof of Theorem 10.17, it was established (by a contradiction argument) that |A| < |P(A)| for every nonempty set A. In this argument we assume the existence of a bijective function g : A → P(A), where g(x) = A_x for every x ∈ A. Then a subset B of A is defined by B = {x ∈ A : x ∉ A_x}. (a) For the sets A and P(A) described above, what is the set B? (b) What does the set B in (a) illustrate?

10.31. Prove or disprove: There is no set A such that 2^A is countable.

Section 10.5: The Schröder-Bernstein Theorem

10.32. Prove that if A, B and C are nonempty sets such that A ⊆ B ⊆ C and |A| = |C|, then |A| = |B|.

10.33. Use the Schröder-Bernstein Theorem to prove that |(0, 1)| = |[0, 1]|.

10.34. Prove that |Q − {q}| = ℵ0 for every rational number q and |R − {r}| = c for every real number r.

10.35. Let R* be the set obtained by removing the number 0 from R. Prove that |R*| = |R|.

10.36. Let f : Z → 2Z be defined by f(k) = 4k for all k ∈ Z. (a) Prove that f^n(k) = 4^n k for every k ∈ Z and every n ∈ N. (b) For this function f, describe the sets B′, C and D given in Theorem 10.19 and the functions f_1 and h.

10.37. Express each positive rational number as m/n, where m, n ∈ N and m/n is reduced to lowest terms. Let d_a denote the number of digits in a ∈ N. So d_2 = 1, d_13 = 2 and d_100 = 3. Define the function f : Q+ → N so that f(m/n) is the positive integer with 2(d_m + d_n) digits whose first d_m digits form the integer m, whose last d_n digits form the integer n, and all of whose remaining d_m + d_n digits are 0. Hence f(2/3) = 2003 and f(10/271) = 1000000271. (a) Prove that f is one-to-one. (b) Use the Schröder-Bernstein Theorem to prove that Q+ is countable.

ADDITIONAL EXERCISES FOR CHAPTER 10

10.38. Evaluate the proposed proof of the following result.

Result Let A and B be two sets with |A| = |B|. If a ∈ A and b ∈ B, then |A − {a}| = |B − {b}|.

Proof Since A and B have the same number of elements and one element is removed from each of A and B, it follows that |A − {a}| = |B − {b}|.

10.39. Evaluate the proposed proof of the following result.

Result The sets (0, ∞) and [0, ∞) are numerically equivalent.

Proof Define the function f : (0, ∞) → [0, ∞) by f(x) = x. First we show that f is one-to-one. Suppose f(a) = f(b). Then a = b and so f is one-to-one. Next we show that f is onto. Let r ∈ [0, ∞). Since f(r) = r, the function f is onto. Since f is bijective, |(0, ∞)| = |[0, ∞)|.

10.40. For a real number x, the floor ⌊x⌋ of x is the largest integer less than or equal to x. Therefore ⌊5.5⌋ = 5, ⌊3⌋ = 3 and ⌊−5.5⌋ = −6. Let f : N → Z be defined by f(n) = (−1)^n ⌊n/2⌋. (a) Prove that f is bijective. (b) What does (a) tell us about Z? (See Result 10.3.)


10.41. Show that the following pairs of intervals are numerically equivalent. (a) (0, 1) and (0, ∞) (b) (0, 1] and [0, ∞) (c) [b, c) and [a, ∞), where a, b, c ∈ R and b < c.

10.42. Let S and T be two sets. Prove that if |S − T| = |T − S|, then |S| = |T|.

10.43. Prove each of the following statements: (a) A nonempty set S is countable if and only if there is a surjective function f : N → S. (b) A nonempty set S is countable if and only if there is an injective function g : S → N.

10.44. Prove that |A| < |N| for every nonempty finite set A.

10.45. Let A = (0, 1) be the open interval of real numbers between 0 and 1. For every number r ∈ A, let 0.r1 r2 r3 ... denote its unique decimal expansion in which the digits are not all 9 from some point on. For (a, b) ∈ A × A, let
f((a, b)) = f((0.a1 a2 a3 ..., 0.b1 b2 b3 ...)) = 0.a1 b1 a2 b2 ...,
while for a ∈ A, let
g(a) = g(0.a1 a2 a3 ...) = (0.a1 a3 a5 ..., 0.a2 a4 a6 ...).
Which of the following statements can we conclude using f and g? Explain your answer.
(a) |A × A| ≤ |A|.
(b) |A| ≤ |A × A|.
(c) |A × A| = |A|.
(d) Nothing, because neither f : A × A → A nor g : A → A × A is a function.
(e) Nothing, because both f : A × A → A and g : A → A × A are functions, but neither is injective.
(f) Nothing, for reasons other than those set out in (d) and (e).

10.46. Prove for every integer n ≥ 2 that if A1, A2, ..., An are countable sets, then A1 × A2 × ··· × An is countable.

10.47. As a consequence of Exercise 10.20 in Section 10.3, the set of all irrational numbers is uncountable. Among the (many) irrational numbers are √2, √3, √5, ∛2, ∛3, ∛5 and ∜2. Prove that the set S = {k^(1/n) : k, n ∈ N and k, n ≥ 2} of all nth roots of k is denumerable.

10.48. Let b, c ∈ Z. A number r_(b,c) (real or complex) belongs to a set S if r_(b,c) is a root of the polynomial x^2 + bx + c. Prove that S is countable.

10.49. We have seen that R is an uncountable set. (a) Show that R can be partitioned into an uncountable number of uncountable sets. (b) Show that R can be partitioned into an uncountable number of countable sets.

11

Proofs in Number Theory

Number theory is the area of mathematics that deals with the integers and their properties. It is one of the oldest branches of mathematics, dating back at least to the Pythagoreans (500 B.C.). Number theory is considered one of the most beautiful branches of mathematics. In fact, it has been said that mathematics is the queen of the sciences while number theory is the queen of mathematics. This discipline is characterized in large part by the attractiveness, clarity and simplicity of many of its problems and by the elegance and style of its solutions. The main purpose of this chapter is to apply some of what we have learned to illustrate the types of proofs that occur in number theory.

11.1 Divisibility Properties of Integers

You may already know that every integer n ≥ 2 can be expressed as a product of primes, and in only one way, apart from the order in which the primes are written. We will see later how to prove this fact, but first let us return to the divisibility of integers (introduced in Chapter 4) and establish some elementary properties of divisibility. Recall that a prime is an integer p ≥ 2 whose only positive integer divisors are 1 and p. An integer n ≥ 2 that is not prime is called a composite number (or simply composite). The first ten primes are 2, 3, 5, 7, 11, 13, 17, 19, 23 and 29. The first ten composite numbers are 4, 6, 8, 9, 10, 12, 14, 15, 16 and 18. If an integer n ≥ 2 is composite, then there exist integers a and b such that n = ab, where 1 < a < n and 1 < b < n. Conversely, if a and b are integers such that n = ab, where 1 < a < n and 1 < b < n, then n is composite. We summarize these observations in the following lemma.

Lemma 11.1

An integer n ≥ 2 is composite if and only if there are integers a and b such that n = ab, where 1 < a < n and 1 < b < n.

Chapter 4 introduced some basic divisibility properties. We recall some of these in the following theorem, where the proofs are repeated as well in order to review the proof techniques we used.

Theorem 11.2

Let a, b and c be integers with a ≠ 0.
(i) If a | b, then a | bc.
(ii) If a | b and b | c, where b ≠ 0, then a | c.
(iii) If a | b and a | c, then a | (bx + cy) for all integers x and y.

Proof

We start with part (i). Since a | b, there is an integer q such that b = aq. So bc = a(qc). Since qc is an integer, a | bc.

For part (ii), let a | b and b | c. Then there are integers q1 and q2 such that b = aq1 and c = bq2. Hence c = bq2 = (aq1)q2 = a(q1 q2). Since q1 q2 is an integer, a | c.

For part (iii), let a | b and a | c. Then there are integers q1 and q2 such that b = aq1 and c = aq2. Therefore, for integers x and y, bx + cy = (aq1)x + (aq2)y = a(q1 x + q2 y). Since q1 x + q2 y is an integer, a | (bx + cy).

PROOF ANALYSIS

In each of the three parts of the preceding theorem, we had to show that r | s for some integers r and s, where r ≠ 0. To do this, we showed that we can write s as rt for some integer t. Of course, this is just the definition of what it means for r to divide s. The proof of both parts of the next theorem again depends on the definition of r | s, in addition to certain observations. For example, in the second part we use the fact that |xy| = |x||y| for any two real numbers x and y.

Theorem 11.3

Let a and b be nonzero integers.
(i) If a | b and b | a, then a = b or a = −b.
(ii) If a | b, then |a| ≤ |b|.

Proof

We first prove (i). Since a | b and b | a, it follows that b = aq1 and a = bq2 for some integers q1 and q2. So a = bq2 = (aq1)q2 = a(q1 q2). Dividing by a gives 1 = q1 q2. So q1 = q2 = 1 or q1 = q2 = −1. Hence a = b or a = −b.

Next we prove (ii). Since a | b, it follows that b = aq for some integer q. Also, since b ≠ 0, we have q ≠ 0. So |q| ≥ 1. Hence |b| = |aq| = |a| · |q| ≥ |a| · 1 = |a|.

11.2 The Division Algorithm

We have discussed the concept of divisibility several times. Of course, when we use this term, we are referring to the statement a | b, where a, b ∈ Z and a ≠ 0. We are more familiar with the term division. For positive integers a and b, dividing b by a and asking for the quotient q and the remainder r is an elementary problem. For example, for a = 5 and b = 17, we have q = 3 and r = 2; that is, when 17 is divided by 5, a quotient of 3 and a remainder of 2 result. This division can be expressed as 17 = 5 · 3 + 2. If a = 6 and b = 42, then q = 7 and r = 0; so 42 = 6 · 7 + 0, that is, 6 | 42. More generally, for positive integers a and b, it is always possible to write b = aq + r, where 0 ≤ r < a. The number q is the quotient and r is the remainder when b is divided by a. In fact, the integers q and r not only exist, they are unique. This is the essence of a theorem called the Division Algorithm. Although this theorem may seem obvious because you have probably used it often, it is important and its proof is not obvious.

Theorem to Prove

(The Division Algorithm) For positive integers a and b, there exist unique integers q and r such that b = aq + r and 0 ≤ r < a.

PROOF STRATEGY

We start the proof with two positive integers a and b. We have two problems ahead of us. First, we have to show that there are integers q and r such that b = aq + r and 0 ≤ r < a. Second, we need to show that only one integer q and one integer r satisfy the equation b = aq + r and the inequality 0 ≤ r < a. How can we find integers q and r that satisfy these conditions?

Our main concern is to show that there are integers q and r such that b = aq + r and 0 ≤ r < a. If a | b, then we know that b = aq for some integer q. Then b = aq + 0 and r = 0 satisfies 0 ≤ r < a. If a ∤ b, then b ≠ aq for every integer q, and so b − aq ≠ 0 for every integer q. However, if we perform the operation of dividing b by a, we get a quotient q and a nonzero remainder r. The integer q has the properties that b − aq > 0 and b − aq is as small as possible. Whether a | b or a ∤ b, this suggests considering the set S = {b − ax : x ∈ Z and b − ax ≥ 0}, which is a set of nonnegative integers. Once we have shown that S ≠ ∅, we can apply Theorem 6.7 (which says for any integer m that the set {i ∈ Z : i ≥ m} is well-ordered) to conclude that S has a smallest element r, which means there is an integer q such that b − aq = r. Then b = aq + r and we have the beginning of a proof.

Theorem 11.4

(The Division Algorithm) For positive integers a and b, there exist unique integers q and r such that b = aq + r and 0 ≤ r < a.

Proof

We first show that there are integers q and r such that b = aq + r with 0 ≤ r < a. We will verify uniqueness later. Consider the set S = {b − ax : x ∈ Z and b − ax ≥ 0}. If we set x = 0, we see that b ∈ S and so S is not empty. Hence, by Theorem 6.7, S has a smallest element r and necessarily r ≥ 0. Also, since r ∈ S, there is an integer q with r = b − aq. So b = aq + r with r ≥ 0.

Next we show that r < a. Assume, to the contrary, that r ≥ a. Let t = r − a. Then t ≥ 0. Since a > 0, it follows that t < r. Also, t = r − a = (b − aq) − a = b − (aq + a) = b − a(q + 1), which implies that t ∈ S, contradicting the fact that r is the smallest element of S. Hence r < a, as desired.


It remains to show that q and r are the only integers such that b = aq + r and 0 ≤ r < a. Let q′ and r′ be integers such that b = aq′ + r′, where 0 ≤ r′ < a. We show that q = q′ and r = r′. Without loss of generality, assume that r′ ≥ r; so r′ − r ≥ 0. Since aq + r = aq′ + r′, it follows that a(q − q′) = r′ − r. Since q − q′ is an integer, a | (r′ − r). Since 0 ≤ r′ − r < a, we must have r′ − r = 0 and therefore r′ = r. But a(q − q′) = r′ − r = 0 and a ≠ 0; so q − q′ = 0 and q = q′.

In Theorem 11.4 we restricted a and b to be positive. With slight modifications of the proof, we can remove these restrictions, although we must still require that a ≠ 0. Exercise 11.20 asks for a proof of the following result.

Corollary 11.5

(The Division Algorithm, General Form) For integers a and b with a ≠ 0, there exist unique integers q and r such that b = aq + r and 0 ≤ r < |a|.

The proof of Theorem 11.4 is an existence proof, since it proves the existence of the integers q and r but does not provide a method for finding q and r. However, there is an implicit connection between the proof of the theorem and how you learned to divide one positive integer by another to find the quotient and the remainder (as we mentioned earlier). For example, when dividing 89 by 14, you first determine how many times 14 goes into 89, which is 6. More formally, you have found the largest nonnegative integer whose product with 14 does not exceed 89. That number is 6. Then you subtract 14 · 6 = 84 from 89 to find the remainder, 5. This determines the smallest nonnegative value of 89 − 14q, where q is an integer, which corresponds to the smallest element of the set S for a = 14 and b = 89 in the proof of Theorem 11.4.
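The connection between the set S and ordinary division can be made concrete. The following sketch, under the assumption that a and b are positive as in Theorem 11.4, walks down through the elements b, b − a, b − 2a, ... of S until it reaches the smallest one; the function name `divide` is illustrative only.

```python
def divide(b, a):
    """Return (q, r) with b = a*q + r and 0 <= r < a, for positive integers a and b."""
    if a <= 0 or b <= 0:
        raise ValueError("this sketch assumes positive integers a and b")
    q, r = 0, b          # x = 0 gives the element b - a*0 = b of S
    while r >= a:        # r - a is still nonnegative, so it is a smaller element of S
        q, r = q + 1, r - a
    return q, r          # r is now the smallest element of S


print(divide(89, 14))    # the example from the text
print(divide(17, 5))
```

The loop stops exactly when r < a, which is the inequality established by contradiction in the proof.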

Example 11.6

In accordance with the notation of Corollary 11.5, find the integers q and r for the given integers a and b.

(i) a = 17, b = 78
(ii) a = −17, b = 78
(iii) a = 17, b = −78
(iv) a = −17, b = −78

Solution

(i) By simple division, we see that dividing 78 by 17 gives a quotient of 4 and a remainder of 10, so

78 = 17 · 4 + 10; (11.1)

thus q = 4 and r = 10.

(ii) Replacing 17 and 4 in (11.1) by −17 and −4, respectively, we get 78 = (−17)(−4) + 10; so q = −4 and r = 10.

(iii) Multiplying (11.1) by −1, we have

−78 = −(17 · 4) + (−10) = 17(−4) + (−10). (11.2)

Since the remainder must be nonnegative, we subtract 17 from and add 17 to the right-hand side of (11.2), giving

−78 = 17(−5) + 7; (11.3)

so q = −5 and r = 7.

(iv) If in (11.3) we replace 17 and −5 by −17 and 5, respectively, we have −78 = (−17) · 5 + 7; so q = 5 and r = 7.
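The four cases of Example 11.6 can be checked mechanically. The sketch below assumes Python's `%` operator, which returns a result in the range 0, ..., m − 1 when the modulus m is positive; the function name `divide_general` is hypothetical. It is meant as a numerical check of Corollary 11.5, not as a replacement for the reasoning in the example.

```python
def divide_general(b, a):
    """Return (q, r) with b = a*q + r and 0 <= r < |a|, for integers a != 0 and b."""
    if a == 0:
        raise ValueError("a must be nonzero")
    r = b % abs(a)       # with a positive modulus, Python returns 0 <= r < |a|
    q = (b - r) // a     # exact, since a divides b - r
    return q, r


for a, b in [(17, 78), (-17, 78), (17, -78), (-17, -78)]:
    q, r = divide_general(b, a)
    print(f"a = {a}, b = {b}: q = {q}, r = {r}")
```

Note that Python's built-in `divmod(b, a)` follows a different convention when a is negative (its remainder takes the sign of a), which is why the remainder is computed with respect to |a| here.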

We now discuss some consequences of the Division Algorithm. If we apply this algorithm to an arbitrary integer b with a = 2, we see that b must have one of the two forms 2q or 2q + 1 (corresponding to r = 0 or r = 1). Of course, if b = 2q, then b is even; while if b = 2q + 1, then b is odd. We have seen this before. This shows that every integer is either even or odd. If we apply the Division Algorithm with a = 3 to an arbitrary integer b, then b has exactly one of the three forms 3q, 3q + 1 or 3q + 2 (corresponding to r = 0, 1, 2). We also saw this in Chapter 4. In general, for any integer b and any positive integer a, the remainder r is one of 0, 1, 2, ..., a − 1. Hence b is exactly one of aq, aq + 1, ..., aq + (a − 1).

This last observation should also sound familiar. In Chapter 8 we considered, for an integer n ≥ 2, a relation R defined on Z by a R b if a ≡ b (mod n), that is, if n | (a − b). This relation was shown to be an equivalence relation, and the set Z_n = {[0], [1], ..., [n − 1]} of distinct equivalence classes was referred to as the set of integers modulo n. Let us take a closer look at this relation using the Division Algorithm. For every integer b there are, by the Division Algorithm, unique integers q and r such that b = nq + r, where 0 ≤ r < n. So b − r = nq and n | (b − r); hence b ≡ r (mod n). Since b R r, it follows that b ∈ [r]. But since r is the only integer with 0 ≤ r ≤ n − 1 for which b ≡ r (mod n), we see that b belongs to exactly one of the classes [0], [1], ..., [n − 1]. These observations show that (i) the classes [0], [1], ..., [n − 1] are pairwise disjoint and (ii) Z = [0] ∪ [1] ∪ ··· ∪ [n − 1], neither of which should be surprising since we recall that the distinct equivalence classes always produce a partition of the set on which the equivalence relation is defined. In addition, for each r = 0, 1, ..., n − 1, [r] = {nq + r : q ∈ Z}; that is, [r] consists of all integers having remainder r when divided by n.
For this reason, these equivalence classes are also called residue classes modulo n. Chapter 8 considered the special case n = 3 and presented the resulting residue classes. In the present context, these residue classes are

[0] = {3q : q ∈ Z} = {..., −6, −3, 0, 3, 6, ...}
[1] = {3q + 1 : q ∈ Z} = {..., −5, −2, 1, 4, ...}
[2] = {3q + 2 : q ∈ Z} = {..., −4, −1, 2, 5, ...}.
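The residue classes modulo 3 listed above can be reproduced by sorting a window of integers according to their remainders. The sketch below is only an illustration over a finite window; the function name `residue_classes` and the bounds are assumptions made for the example.

```python
def residue_classes(n, lo, hi):
    """Group the integers lo, ..., hi by their remainder on division by n >= 2."""
    classes = {r: [] for r in range(n)}
    for b in range(lo, hi + 1):
        classes[b % n].append(b)   # for n > 0, b % n is the remainder 0 <= r < n
    return classes


for r, members in residue_classes(3, -6, 6).items():
    print(f"[{r}]: {members}")
```

Every integer in the window lands in exactly one list, in line with observations (i) and (ii).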


11.3 Greatest Common Divisors

We now move from the divisors of an integer to the divisors of a pair of integers. An integer c ≠ 0 is a common divisor of two integers a and b if c | a and c | b. We are primarily interested in the largest integer that is a common divisor of a and b. Formally, the greatest common divisor of two integers a and b, not both 0, is the greatest positive integer that is a common divisor of a and b. The requirement that a and b are not both 0 is necessary since every positive integer divides 0. We denote the greatest common divisor of two integers a and b by gcd(a, b), although (a, b) is also a common notation. When a and b are relatively small (in absolute value), it is usually easy to find gcd(a, b). For example, it should be clear that gcd(8, 12) = 4, gcd(4, 9) = 1 and gcd(18, 54) = 18. The definition of gcd(a, b) does not require a and b to be positive; in fact, it only requires that at least one of a and b be nonzero. For example, gcd(−10, −15) = 5, gcd(16, −72) = 8 and gcd(0, −9) = 9.

There are two properties of the greatest common divisor of two integers, particularly useful from a theoretical point of view, that we want to mention. For integers a and b, an integer of the form ax + by, where x, y ∈ Z, is called a linear combination of a and b. Using this terminology, we can now restate Theorem 11.2(iii): Every nonzero integer dividing two integers b and c divides every linear combination of b and c. Although there does not seem to be an obvious connection between linear combinations of a and b and gcd(a, b), we will see shortly that there is actually a very close connection.

For example, let a = 10 and b = 16. Then 6, −4 and 0 are linear combinations of a and b, since 6 = 10 · (−1) + 16 · 1, −4 = 10 · (−2) + 16 · 1 and 0 = 10 · 0 + 16 · 0. The integer 4 is also a linear combination of a and b since 4 = 10(2) + 16(−1). Also, 2 is a linear combination of a and b since 2 = 10(−3) + 16(2). On the other hand, no odd integer can be a linear combination of a and b, because if n is a linear combination of a and b, then there exist integers x and y such that n = ax + by = 10x + 16y = 2(5x + 8y). Since 5x + 8y is an integer, n is even. Therefore, 2 is the smallest positive integer that is a linear combination of 10 and 16. Interestingly, gcd(10, 16) = 2. We now show that this observation is not accidental. Once again, the Well-Ordering Principle will come in handy.

Theorem 11.7

Let a and b be integers that are not both 0. Then gcd(a, b) is the smallest positive integer that is a linear combination of a and b.

Proof

Let S be the set of all positive integers that are linear combinations of a and b, that is, S = {ax + by : x, y ∈ Z and ax + by > 0}. First we show that S is not empty. By assumption, at least one of a and b is nonzero; so a · a + b · b = a^2 + b^2 > 0. Thus a · a + b · b ∈ S and, as claimed, S ≠ ∅.


Since S is a nonempty subset of N, the Well-Ordering Principle implies that S contains a smallest element, which we denote by d. Thus there are integers x0 and y0 such that d = ax0 + by0. Now we show that d = gcd(a, b). Applying the Division Algorithm to a and d, we have a = dq + r, where 0 ≤ r < d. Consequently, r = a − dq = a − q(ax0 + by0) = a(1 − qx0) + b(−qy0); that is, r is a linear combination of a and b. If r > 0, then necessarily r ∈ S, which would contradict the fact that d is the smallest element of S. Hence r = 0, which implies that d | a. By a similar argument, it follows that d | b and so d is a common divisor of a and b.

It remains to show that d is the greatest common divisor of a and b. Let c be a positive integer that is also a common divisor of a and b. By Theorem 11.2(iii), c divides every linear combination of a and b, and thus c divides d = ax0 + by0. Since c and d are positive and c | d, it follows by Theorem 11.3(ii) that c ≤ d and therefore d = gcd(a, b).

PROOF ANALYSIS

The proof of Theorem 11.7 illustrates a common proof technique involving the divisibility of integers. At one point in the proof we wanted to show that d | a. A common way to show that an integer d divides another integer a is to apply the Division Algorithm and divide a by d, giving a = dq + r, where 0 ≤ r < |d| or, if d > 0, then 0 ≤ r < d. The goal is then to show that r = 0.

There is another characterization of the greatest common divisor of two integers, not both 0, that is useful to know. This characterization offers an alternative definition of the greatest common divisor, which is in fact occasionally used as the definition.

Theorem 11.8

Let a and b be two integers, not both 0. Then d = gcd(a, b) if and only if d is the positive integer that satisfies the following two conditions:
(1) d is a common divisor of a and b;
(2) if c is any common divisor of a and b, then c | d.

Proof

First, assume that d = gcd(a, b). We show that d satisfies (1) and (2). By definition, d satisfies (1); so it only remains to show that d satisfies (2). Let c be an integer such that c | a and c | b. Since d = gcd(a, b), there exist, by Theorem 11.7, integers x0 and y0 such that d = ax0 + by0. Since c | a and c | b, it follows from Theorem 11.2(iii) that c divides ax0 + by0 = d. So d satisfies (2).

For the converse, suppose that d is a positive integer that satisfies properties (1) and (2). We show that d = gcd(a, b). Since d is already a common divisor of a and b, it suffices to show that d is the greatest common divisor of a and b. Let c be any positive integer that is a common divisor of a and b. Since d satisfies (2), c | d. Since c and d are both positive, Theorem 11.3(ii) implies that c ≤ d, which implies that d = gcd(a, b).
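Theorem 11.7 can be tested numerically on small examples. The sketch below searches a finite window of coefficients x and y for the smallest positive value of ax + by and compares it with Python's `math.gcd`. The function name and the window size `bound` are assumptions made for illustration; a finite search of this kind is a check on examples, not a proof, and it only finds the gcd when suitable coefficients lie inside the window.

```python
from math import gcd


def smallest_positive_combination(a, b, bound=20):
    """Search |x|, |y| <= bound for the smallest positive value of a*x + b*y.

    Returns (value, x, y), or None if no positive value occurs in the window.
    """
    best = None
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            value = a * x + b * y
            if value > 0 and (best is None or value < best[0]):
                best = (value, x, y)
    return best


for a, b in [(10, 16), (8, 12), (4, 9), (0, -9)]:
    value, x, y = smallest_positive_combination(a, b)
    print(f"a = {a}, b = {b}: {value} = {a}*({x}) + {b}*({y}), gcd = {gcd(a, b)}")
```

For a = 10 and b = 16 the smallest positive value found is 2, in agreement with the discussion preceding Theorem 11.7.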

11.4 The Euclidean Algorithm

Although we know the definition of the greatest common divisor d of two integers a and b, not both 0, as well as two characterizations of it, none of these is useful for computing d. For this reason, we describe an algorithm for determining d = gcd(a, b) that is attributed to the famous mathematician Euclid, best known for his work in geometry.

First we note that if b = 0, then a ≠ 0 and gcd(a, b) = gcd(a, 0) = |a|. Hence we can assume that a and b are nonzero. Also, since gcd(a, b) = gcd(a, −b) = gcd(−a, b) = gcd(−a, −b), we can assume that a and b are positive. For example, gcd(0, −12) = gcd(0, 12) = 12 and gcd(−12, −54) = gcd(12, −54) = gcd(−12, 54) = gcd(12, 54) = 6. So in general we can assume that 0 < a ≤ b. The method of computing d = gcd(a, b) that we shall describe shortly, and which is called the Euclidean Algorithm, uses repeated applications of the Division Algorithm and the following lemma.

Lemma 11.9

Let a and b be positive integers. If b = aq + r for some integers q and r, then gcd(a, b) = gcd(r, a).

Proof

Let d = gcd(a, b) and e = gcd(r, a). We show that d = e. First, note that b = aq + r = aq + r · 1; that is, b is a linear combination of a and r. Since e = gcd(r, a), it follows that e | a and e | r. By Theorem 11.2(iii), e | (aq + r · 1) and so e | b. Hence e is a common divisor of a and b. Since d = gcd(a, b), we have e ≤ d. Since b = aq + r, we can write r = b − aq = b · 1 + a(−q), so r is a linear combination of a and b. From the fact that d = gcd(a, b), we get d | (b · 1 + a(−q)); that is, d | r. Thus d is a common divisor of r and a. Since e = gcd(r, a), it follows that d ≤ e. So e = d.

Now we are ready to describe the Euclidean Algorithm. We start with two integers a and b, where 0 < a ≤ b. By the Division Algorithm, b = aq_1 + r_1, where 0 ≤ r_1 < a. By Lemma 11.9, gcd(a, b) = gcd(r_1, a). So if r_1 = 0, then gcd(a, b) = gcd(0, a) = a. Hence we may assume that r_1 ≠ 0 and apply the Division Algorithm to r_1 and a, obtaining a = r_1 q_2 + r_2, where 0 ≤ r_2 < r_1. At this point, we have gcd(a, b) = gcd(r_1, a) = gcd(r_2, r_1) and 0 ≤ r_2 < r_1 < a. If r_2 = 0, then gcd(a, b) = gcd(r_1, a) = gcd(0, r_1) = r_1. By now, we should see the usefulness of Lemma 11.9 and also what we mean by repeated applications of the Division Algorithm. Continuing this process, we get the following sequence of equalities and inequalities:

    b = a q_1 + r_1,                      0 < r_1 < a
    a = r_1 q_2 + r_2,                    0 ≤ r_2 < r_1
      . . .
    r_{k−1} = r_k q_{k+1} + r_{k+1},      0 ≤ r_{k+1} < r_k
      . . .

By Lemma 11.9, gcd(a, b) = gcd(r_1, a) = gcd(r_2, r_1) = · · · = gcd(r_{k+1}, r_k) = · · · and · · · < r_{k+1} < r_k < · · · < r_2 < r_1 < a. Since these remainders are nonnegative integers, the strictly decreasing sequence r_1, r_2, . . . of remainders contains at most a terms. Let r_{n−1}


be the last nonzero remainder. Thus r_n = 0. We then have

    b = a q_1 + r_1,                         where 0 ≤ r_1 < a
    a = r_1 q_2 + r_2,                       where 0 ≤ r_2 < r_1
      . . .
    r_{n−4} = r_{n−3} q_{n−2} + r_{n−2},     where 0 ≤ r_{n−2} < r_{n−3}        (11.4)
    r_{n−3} = r_{n−2} q_{n−1} + r_{n−1},     where 0 ≤ r_{n−1} < r_{n−2}
    r_{n−2} = r_{n−1} q_n + 0

and we know that gcd(a, b) = gcd(r_1, a) = gcd(r_2, r_1) = · · · = gcd(r_{n−1}, r_{n−2}) = gcd(0, r_{n−1}) = r_{n−1}.

The Euclidean Algorithm can now be described. We start with two integers a and b, where 0 < a ≤ b. If a | b, then gcd(a, b) = a; while if a ∤ b, then we apply the Division Algorithm repeatedly until a remainder of 0 is obtained. In the latter case, the last nonzero remainder is gcd(a, b). Let's see how the Euclidean Algorithm works in practice.

Example 11.10

Use the Euclidean Algorithm to determine d = gcd(374, 946).

Solution

Dividing 946 by 374, we find that 946 = 374 · 2 + 198. Now, dividing 374 by 198, we obtain 374 = 198 · 1 + 176. Continuing in this manner, we have

    198 = 176 · 1 + 22
    176 = 22 · 8 + 0.

Therefore, gcd(374, 946) = 22.
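The repeated divisions in this example can be sketched in a few lines of Python. This is only an illustrative sketch, assuming 0 < a ≤ b as in the text; the function name euclid_gcd and the list of recorded steps are our own additions, not notation from the book.

```python
def euclid_gcd(a, b):
    """Return (gcd(a, b), steps) for integers 0 < a <= b.

    Each step (b, a, q, r) records one use of the Division Algorithm: b = a*q + r.
    """
    steps = []
    while a != 0:
        q, r = divmod(b, a)          # b = a*q + r with 0 <= r < a
        steps.append((b, a, q, r))
        b, a = a, r                  # Lemma 11.9: gcd(a, b) = gcd(r, a)
    return b, steps                  # b now holds the last nonzero remainder

d, steps = euclid_gcd(374, 946)
for b, a, q, r in steps:
    print(f"{b} = {a} * {q} + {r}")
print(d)  # → 22
```

The printed lines reproduce the four divisions of Example 11.10, ending with 176 = 22 * 8 + 0.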

Given integers a and b, not both 0, we know that there exist integers s and t such that gcd(a, b) = as + bt. We now describe an algorithm for finding such integers s and t, using the notation in the computations that follow the proof of Lemma 11.9. Since gcd(a, b) = r_{n−1}, our goal is to find integers s and t such that r_{n−1} = as + bt. We begin with the equation r_{n−3} = r_{n−2} q_{n−1} + r_{n−1} and rewrite it in the form

    r_{n−1} = r_{n−3} − r_{n−2} q_{n−1}.        (11.5)

Then, using the equations in (11.4), we solve for r_{n−2} and obtain r_{n−2} = r_{n−4} − r_{n−3} q_{n−2}.


If we now substitute this expression for r_{n−2} into equation (11.5), we obtain

    r_{n−1} = r_{n−3} − q_{n−1}(r_{n−4} − r_{n−3} q_{n−2}) = (1 + q_{n−1} q_{n−2}) r_{n−3} + (−q_{n−1}) r_{n−4}.

Here r_{n−1} is expressed as a linear combination of r_{n−3} and r_{n−4}. Notice that r_{n−2} no longer appears in the expression. Continuing with this method of back substitution, we eliminate the remainders r_{n−3}, r_{n−4}, . . . , r_2, r_1 in turn and finally arrive at an equation of the form r_{n−1} = as + bt.

Example 11.11

For a = 374 and b = 946, find integers s and t such that as + bt = gcd(a, b).

Solution

Using the computations from Example 11.10, we have

    22 = 198 − 176 · 1 = 198 · 1 + 176 · (−1)
    176 = 374 − 198 · 1 = 374 · 1 + 198 · (−1)
    198 = 946 − 374 · 2 = 946 · 1 + 374 · (−2).

Therefore,

    22 = 198 · 1 + 176 · (−1)
       = 198 · 1 + [374 · 1 + 198 · (−1)] · (−1)
       = 198 · 1 + 374 · (−1) + 198 · 1 = 198 · 2 + 374 · (−1)
       = [946 · 1 + 374 · (−2)] · 2 + 374 · (−1)
       = 946 · 2 + 374 · (−4) + 374 · (−1) = 946 · 2 + 374 · (−5).

Hence s = −5 and t = 2.

The integers s and t just found are not unique. Indeed, if gcd(a, b) = d and d = as + bt, then d = a(s + b) + b(t − a) as well, since a(s + b) + b(t − a) = as + ab + bt − ab = as + bt.
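The back substitution described above can be sketched recursively in Python. This is a minimal sketch under the assumption that a and b are nonnegative and not both 0; the name extended_gcd is ours. Each recursive call corresponds to one division step, and the return statement performs one substitution.

```python
def extended_gcd(a, b):
    """Return (d, s, t) with d = gcd(a, b) = a*s + b*t, for nonnegative a, b not both 0."""
    if b == 0:
        return a, 1, 0                       # a = a*1 + 0*0
    d, s, t = extended_gcd(b, a % b)         # d = b*s + (a % b)*t
    # Substitute a % b = a - (a // b)*b to express d in terms of a and b.
    return d, t, s - (a // b) * t

d, s, t = extended_gcd(374, 946)
print(d, s, t)                               # → 22 -5 2
print(374 * s + 946 * t)                     # → 22

# The pair (s + b, t - a) is a second solution, as noted in the text.
print(374 * (s + 946) + 946 * (t - 374))     # → 22
```

For this input the sketch happens to return the same pair s = −5, t = 2 found in Example 11.11; other correct implementations may return a different valid pair.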

11.5 Relatively Prime Integers

For two integers a and b, not both 0, we know that if gcd(a, b) = 1, then there exist integers s and t such that as + bt = 1. What may be surprising is that the converse is also true in this special case.

Theorem 11.12

Let a and b be integers, not both 0. Then gcd(a, b) = 1 if and only if there exist integers s and t such that 1 = as + bt.

Proof

If gcd(a, b) = 1, then by Theorem 11.7 there exist integers s and t such that as + bt = 1. We now consider the converse. Let a and b be integers, not both 0, for which there exist integers s and t such that as + bt = 1. By Theorem 11.7, gcd(a, b) is the smallest positive


integer that is a linear combination of a and b. Since 1 is a linear combination of a and b, it follows that gcd(a, b) = 1.

Two integers a and b, not both 0, are called relatively prime if gcd(a, b) = 1. By Theorem 11.12, then, two integers a and b are relatively prime if and only if 1 is a linear combination of a and b. This fact is extremely useful, as we are about to see.

If a, b, and c are integers such that a | bc, then there is no reason to believe that a | b or a | c. For example, let a = 4, b = 6, and c = 2. Then 4 | 6 · 2 but 4 ∤ 6 and 4 ∤ 2. However, if a and b are relatively prime, then we can draw a different conclusion. The following result is often referred to as Euclid's Lemma.

Theorem to Prove

(Euclid's Lemma) Let a, b, and c be integers, where a ≠ 0. If a | bc and gcd(a, b) = 1, then a | c.

PROOF STRATEGY

In a direct proof, we assume that a | bc and gcd(a, b) = 1. To show that a | c, we need to show that c can be expressed as ar for some integer r. Because a | bc, we know that bc = aq for some integer q. Also, since gcd(a, b) = 1, there exist integers s and t such that as + bt = 1. If we multiply as + bt = 1 by c, then we have c = acs + bct. However, bc = aq, so we can factor a from acs + bct. That is the plan.

Theorem 11.13

(Euclid's Lemma) Let a, b, and c be integers, where a ≠ 0. If a | bc and gcd(a, b) = 1, then a | c.

Proof

Since a | bc, there is an integer q such that bc = aq. Since a and b are relatively prime, there exist integers s and t such that 1 = as + bt. Thus c = c · 1 = c(as + bt) = a(cs) + (bc)t = a(cs) + (aq)t = a(cs + qt). Since cs + qt is an integer, a | c.

Euclid's Lemma is of particular interest when the integer a is a prime.

Corollary 11.14

Let b and c be integers and p a prime. If p | bc, then either p | b or p | c.

Proof

If p divides b, then the corollary is proved. Suppose then that p does not divide b. Since the only positive divisors of p are 1 and p, it follows that gcd(p, b) = 1. Thus, by Euclid's Lemma, p | c and the proof is complete.

The preceding corollary can be extended to the case where a prime p divides any product of integers.

Corollary 11.15

Let a_1, a_2, . . . , a_n, where n ≥ 2, be integers and let p be a prime. If p | a_1 a_2 · · · a_n, then p | a_i for some integer i (1 ≤ i ≤ n).

Proof

We proceed by induction. For n = 2, this is simply a restatement of Corollary 11.14. Assume, then, that if a prime p divides the product of k integers (k ≥ 2), then p divides at least one of the integers. Now let a_1, a_2, . . . , a_{k+1} be k + 1 integers, where p | a_1 a_2 · · · a_{k+1}. We show that p | a_i for some i (1 ≤ i ≤ k + 1). Let b = a_1 a_2 · · · a_k. Then p | b a_{k+1}. By Corollary 11.14, either p | b or p | a_{k+1}. If p | a_{k+1}, then the proof is complete. Otherwise, p | b; that is, p | a_1 a_2 · · · a_k. However, by the induction hypothesis, p | a_i for some i (1 ≤ i ≤ k). In either case, p | a_i for some i (1 ≤ i ≤ k + 1). By the Principle of Mathematical Induction, if a prime p divides the product of any n ≥ 2 integers, then p divides at least one of the integers.

There is one more useful fact concerning relatively prime integers. Once again, we will have an opportunity to use the result that whenever two integers a and b are relatively prime, 1 is a linear combination of a and b.

Theorem 11.16

Let a, b, c ∈ Z, where a and b are relatively prime nonzero integers. If a | c and b | c, then ab | c.

Proof

Since a | c and b | c, there exist integers x and y such that c = ax and c = by. Furthermore, since a and b are relatively prime, there exist integers s and t such that 1 = as + bt. Multiplying by c and substituting, we obtain c = c · 1 = c(as + bt) = c(as) + c(bt) = (by)(as) + (ax)(bt) = ab(sy + xt). Since sy + xt is an integer, ab | c.

By Theorem 11.16, if we wish to show that 12 divides some integer c, for example, then we need only show that 3 | c and 4 | c, since 12 = 3 · 4 and 3 and 4 are relatively prime.
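The proof of Theorem 11.16 is constructive: it produces the integer sy + xt explicitly. The following sketch (our own illustration, assuming positive relatively prime a and b and Python 3.8 or later for the modular inverse pow(a, -1, b)) follows the proof step by step. The function name witness is hypothetical.

```python
def witness(a, b, c):
    """For relatively prime positive integers a, b that both divide c,
    return the integer r = s*y + x*t from the proof, so that c = a*b*r."""
    s = pow(a, -1, b)          # a*s ≡ 1 (mod b); needs Python 3.8+
    t = (1 - a * s) // b       # exact division, so a*s + b*t == 1
    assert a * s + b * t == 1
    x, y = c // a, c // b      # c = a*x and c = b*y
    return s * y + x * t

r = witness(3, 4, 60)
print(r)                # → 5
print(3 * 4 * r)        # → 60
```

Here 3 | 60 and 4 | 60, and the sketch recovers 60 = 12 · 5, in line with the remark about divisibility by 12.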

11.6 The Fundamental Theorem of Arithmetic

It is a basic fact about divisibility that every integer n ≥ 2 is either prime or can be expressed as a product of primes. This fact is made precise in a famous theorem of number theory. Its proof serves as one of the most interesting applications of the Strong Principle of Mathematical Induction.

Theorem 11.17

(Fundamental Theorem of Arithmetic) Every integer n ≥ 2 is either prime or can be expressed as a product of primes; that is, n = p_1 p_2 · · · p_m, where p_1, p_2, . . . , p_m are primes. Furthermore, this factorization is unique except possibly for the order in which the factors occur.

Proof

To show the existence of such a factorization, we use the Strong Principle of Mathematical Induction. Since 2 is prime, the statement is certainly true for n = 2. For an integer k ≥ 2, assume that every integer i with 2 ≤ i ≤ k is either prime or can be expressed as a product of primes. We show that k + 1 is either prime or can be expressed as a product of primes. Of course, if k + 1 is prime, then there is nothing further to prove. So we may assume that k + 1 is composite. By Lemma 11.1,


there exist integers a and b such that k + 1 = ab, where 2 ≤ a ≤ k and 2 ≤ b ≤ k. Therefore, by the induction hypothesis, each of a and b is prime or can be expressed as a product of primes. In either case, k + 1 = ab is a product of primes. By the Strong Principle of Mathematical Induction, every integer n ≥ 2 is either prime or can be expressed as a product of primes.

To prove that such a factorization is unique, we proceed by contradiction. Assume, to the contrary, that there is an integer n ≥ 2 that can be expressed as a product of primes in two different ways, say n = p_1 p_2 · · · p_s = q_1 q_2 · · · q_t, where in each factorization the primes are arranged in nondecreasing order; that is, p_1 ≤ p_2 ≤ · · · ≤ p_s and q_1 ≤ q_2 ≤ · · · ≤ q_t. Since the factorizations are different, there is a smallest positive integer r such that p_r ≠ q_r. In other words, if r ≥ 2, then p_i = q_i for every i with 1 ≤ i ≤ r − 1. After cancelling, we have

    p_r p_{r+1} · · · p_s = q_r q_{r+1} · · · q_t.        (11.6)

Consider the prime p_r. Either s = r and the left side of (11.6) is exactly p_r, or s > r and p_{r+1} p_{r+2} · · · p_s is an integer that is the product of s − r primes. In either case, p_r | q_r q_{r+1} · · · q_t. Therefore, by Corollary 11.15, p_r | q_j for some j with r ≤ j ≤ t. Since q_j is prime, p_r = q_j. Since q_r ≤ q_j, it follows that q_r ≤ p_r. By considering the integer q_r (instead of p_r), we can show in the same way that p_r ≤ q_r. Therefore p_r = q_r. But this contradicts the fact that p_r ≠ q_r. Hence, as claimed, every integer n ≥ 2 has a unique factorization.

An immediate consequence of Theorem 11.17 is stated next.

Corollary 11.18

Every integer greater than 1 has a prime factor.

Actually, we can say a little more.

Lemma 11.19

If n is a composite number, then n has a prime factor p such that p ≤ √n.

Proof

Since n is composite, we know that n = ab, where 1 < a < n and 1 < b < n. Assume, without loss of generality, that a ≤ b. Then a^2 ≤ ab = n and so a ≤ √n. Since a > 1, we know that a has a prime factor, say p. Since a is a factor of n, it follows that p is also a factor of n, and p ≤ a ≤ √n.

If an integer n ≥ 2 is expressed as a product q_1 q_2 · · · q_m of primes, then the primes q_1, q_2, . . . , q_m need not be distinct. Consequently, we can group equal prime factors and express n in the form

    n = p_1^{a_1} p_2^{a_2} · · · p_k^{a_k},

where p_1, p_2, . . . , p_k are primes such that p_1 < p_2 < · · · < p_k and each exponent a_i is a positive integer. We call this the canonical factorization of n. By the Fundamental Theorem of Arithmetic, every integer n ≥ 2 has a unique canonical factorization. For example, the canonical factorizations of 12, 210, and 1000 are 12 = 2^2 · 3, 210 = 2 · 3 · 5 · 7, and 1000 = 2^3 · 5^3.

Of course, it is relatively easy to determine whether a small


positive integer is prime or composite and, if it is composite, to express it as a product of primes. We mention some divisibility tests for certain integers. You may already be familiar with these.

1. Divisibility by 2, 4, and other powers of 2: An integer n is divisible by 2 if and only if n is even (that is, the last digit of n is even). In fact, n is divisible by 4 if and only if the 2-digit number consisting of the last two digits of n is divisible by 4, the integer n is divisible by 8 if and only if the 3-digit number consisting of the last three digits of n is divisible by 8, and so on. Thus the number 14220 is divisible by 4 since 20 is divisible by 4, but it is not divisible by 8 since 220 is not divisible by 8.

2. Divisibility by 3 and 9: An integer is divisible by 3 if and only if the sum of its digits is divisible by 3. In fact, an integer is divisible by 9 if and only if the sum of its digits is divisible by 9. However, this pattern stops at 9; that is, it does not extend to 27. For example, the sum of the digits of 27 itself is 9, which is not divisible by 27, although 27 is certainly divisible by 27. The sum of the digits of the integer 4278 is 21, which is divisible by 3 but not by 9. So 4278 is divisible by 3 but not by 9. Clearly, 4278 is divisible by 2 but not by 4, since 78 is not divisible by 4. Hence 4278 is divisible by 6 by Theorem 11.16.

3. Divisibility by 5: An integer is divisible by 5 if and only if it ends in 5 or 0; that is, an integer is divisible by 5 if and only if its last digit is divisible by 5.

4. Divisibility by 11: Beginning with the first digit of n, add the alternate digits (every other digit) and call the resulting sum a. Then add the remaining digits to obtain b. Then n is divisible by 11 if and only if a − b is divisible by 11. For example, consider the number 71929. Observe that a = 7 + 9 + 9 = 25, while b = 1 + 2 = 3. Since a − b = 25 − 3 = 22 is divisible by 11, the number 71929 is divisible by 11. In fact, 71929 = 11 · 6539. However, since (6 + 3) − (5 + 9) = −5 is not divisible by 11, the integer 6539 is not divisible by 11; that is, 71929 is not divisible by 11^2 = 121.
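The digit-based tests in items 1, 2, and 4 can be sketched as follows. This is an illustrative sketch only; the function names are ours, and each function works from the decimal digits alone rather than dividing n itself.

```python
def digits(n):
    """Decimal digits of |n|, from first (leftmost) to last."""
    return [int(ch) for ch in str(abs(n))]

def divisible_by_power_of_2(n, k):
    """Test divisibility by 2**k using only the last k digits of n (k >= 1)."""
    return int(str(abs(n))[-k:]) % (2 ** k) == 0

def divisible_by_3(n):
    return sum(digits(n)) % 3 == 0

def divisible_by_9(n):
    return sum(digits(n)) % 9 == 0

def divisible_by_11(n):
    ds = digits(n)
    a = sum(ds[0::2])      # first digit and every other digit after it
    b = sum(ds[1::2])      # the remaining digits
    return (a - b) % 11 == 0

print(divisible_by_power_of_2(14220, 2), divisible_by_power_of_2(14220, 3))  # → True False
print(divisible_by_3(4278), divisible_by_9(4278))                            # → True False
print(divisible_by_11(71929), divisible_by_11(6539))                         # → True False
```

The printed values match the worked examples for 14220, 4278, 71929, and 6539 above.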

Although there are tests for divisibility by other primes, such as 7 (see Exercise 11.66) and 13, none is practical enough to warrant inclusion here. Applying the tests above to the number n = 471240, we find that n is divisible by 5, 8 (but not 16), 9, and 11. In fact, n = 5 · 8 · 9 · 11 · 119 = 5 · 8 · 9 · 11 · 7 · 17 = 2^3 · 3^2 · 5 · 7 · 11 · 17.

We are now able to describe an infinite class of irrational numbers.

Theorem 11.20

Let n be a positive integer. Then √n is a rational number if and only if √n is an integer.

Proof

Certainly, if √n is an integer, then √n is rational. Thus we need only verify the converse. Assume, to the contrary, that there exists a positive integer n such that √n is a rational number but √n is not an integer. Then √n = a/b for some positive integers a and b. We may further assume that a and b have no common factors, that is, gcd(a, b) = 1. Since a/b is not an integer, b ≥ 2. Thus n = a^2/b^2 and so a^2 = nb^2. By Corollary 11.18, b has a prime factor p. Thus p | nb^2 and so p | a^2. By Corollary 11.14, p | a. But then p | a and p | b, which contradicts our assumption that gcd(a, b) = 1.
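The canonical factorization of 471240 obtained earlier in this passage can be double-checked by trial division. The sketch below is our own illustration (the name canonical_factorization is hypothetical); it relies on Lemma 11.19 to stop as soon as the trial divisor exceeds the square root of the part of n that remains, at which point any remaining part greater than 1 must be prime.

```python
def canonical_factorization(n):
    """Return [(p_1, a_1), ..., (p_k, a_k)] with p_1 < ... < p_k for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    p = 2
    while p * p <= n:              # Lemma 11.19: a composite n has a prime factor <= sqrt(n)
        if n % p == 0:
            exponent = 0
            while n % p == 0:      # divide out every copy of p
                n //= p
                exponent += 1
            factors.append((p, exponent))
        p += 1
    if n > 1:                      # whatever remains has no factor <= its square root
        factors.append((n, 1))
    return factors

print(canonical_factorization(471240))
# → [(2, 3), (3, 2), (5, 1), (7, 1), (11, 1), (17, 1)]
```

Trial division is practical only for fairly small integers, which is consistent with the remark that small integers are relatively easy to factor.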

A consequence of this theorem is the following.

Corollary 11.21

If p is a prime, then √p is irrational.

Proof

Assume, to the contrary, that there exists a prime p such that √p is rational. By Theorem 11.20, √p = n for some integer n ≥ 2. Then p = n^2. Since n^2 is composite, this is a contradiction.

Although our observations suggest that there are infinitely many primes, we have not yet proved this. We do so now. Since our goal is to prove that the number of primes is not finite, proof by contradiction is the expected technique.

Theorem 11.22

The number of primes is infinite.

Proof

Assume, to the contrary, that the number of primes is finite. Let P = {p_1, p_2, . . . , p_n} be the set of all primes. Consider the integer m = p_1 p_2 · · · p_n + 1. Clearly, m ≥ 2. Since m has a prime factor and every prime belongs to P, there is a prime p_i (1 ≤ i ≤ n) such that p_i | m. Thus m = p_i k for some integer k. Let ℓ = p_1 p_2 · · · p_{i−1} p_{i+1} · · · p_n. Then 1 = m − p_1 p_2 · · · p_n = p_i k − p_i ℓ = p_i (k − ℓ). Since k − ℓ is an integer, p_i | 1, which is impossible.

Two primes p and q with p < q are called twin primes if q = p + 2. Twin primes are therefore necessarily odd. For example, 5, 7 and 11, 13 are twin primes. Although we have just established that there are infinitely many primes, it is not known whether the number of twin primes is infinite.

Conjecture 11.23

There are infinitely many twin primes.
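Although the conjecture itself remains open, twin primes are easy to list in any bounded range. The following sketch is our own illustration, using a standard sieve of Eratosthenes with hypothetical function names; it prints every pair of twin primes up to 100. It provides evidence only and says nothing about whether the list continues forever.

```python
from math import isqrt

def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes p <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p in range(2, limit + 1) if is_prime[p]]

def twin_primes_up_to(limit):
    """All pairs (p, p + 2) of primes with p + 2 <= limit."""
    primes = primes_up_to(limit)
    prime_set = set(primes)
    return [(p, p + 2) for p in primes if p + 2 in prime_set]

print(twin_primes_up_to(100))
# → [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73)]
```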

11.7 Concepts Involving Sums of Divisors

For an integer n ≥ 2, a positive integer a is called a proper divisor of n if a | n and a < n. Thus the proper divisors of 6 are 1, 2, and 3, while the proper divisors of 28 are 1, 2, 4, 7, and 14. Notice also that 1 + 2 + 3 = 6 and 1 + 2 + 4 + 7 + 14 = 28. A positive integer n ≥ 2 is called perfect if the sum of its proper divisors is n. Hence 6 and 28 are perfect integers; in fact, they are the two smallest perfect integers. The third smallest perfect integer is 496. The largest prime divisors of 6, 28, and 496 are 3, 7, and 31, respectively. Adding the integers from 1 to each of these primes yields perhaps unexpected results:

    1 + 2 + 3 = 6
    1 + 2 + 3 + 4 + 5 + 6 + 7 = 28
    1 + 2 + · · · + 31 = 496.

The integers 6, 28, and 496 can also be expressed as 6 = 2^1 (2^2 − 1), 28 = 2^2 (2^3 − 1), and 496 = 2^4 (2^5 − 1).
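These observations are easy to confirm by direct computation. The sketch below is our own illustration with hypothetical function names; it finds the perfect integers below 1000 by summing proper divisors and then checks the two patterns noted above for 6, 28, and 496.

```python
def proper_divisor_sum(n):
    """Sum of the positive divisors a of n with a < n."""
    return sum(a for a in range(1, n) if n % a == 0)

def is_perfect(n):
    return n >= 2 and proper_divisor_sum(n) == n

perfect = [n for n in range(2, 1000) if is_perfect(n)]
print(perfect)  # → [6, 28, 496]

# (perfect integer, its largest prime divisor, exponent k in 2^k (2^(k+1) - 1))
for n, largest_prime, k in [(6, 3, 1), (28, 7, 2), (496, 31, 4)]:
    assert sum(range(1, largest_prime + 1)) == n      # 1 + 2 + ... + largest prime
    assert 2 ** k * (2 ** (k + 1) - 1) == n           # the product form noted above
```

The brute-force divisor sum is adequate for small n only; it is meant to mirror the definition, not to be efficient.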


Euclid, the famous geometer who lived more than 2000 years ago, showed that whenever 2^p − 1 is a prime, 2^{p−1}(2^p − 1) is a perfect integer. In the 18th century, the brilliant Swiss mathematician Leonhard Euler proved that every even perfect integer is of the form 2^{p−1}(2^p − 1), where 2^p − 1 is a prime. Primes of the form 2^p − 1 are called Mersenne primes. As of June 2010, 47 Mersenne primes were known, and therefore 47 perfect integers were known. There are many mysteries surrounding perfect integers. Are there any odd perfect integers? No one knows. Are there infinitely many even perfect integers? No one knows this either.

Let's look at the first few primes, the first seven to be precise: 2, 3, 5, 7, 11, 13, 17. The 1st, 2nd, 4th, and 7th primes are 2, 3, 7, and 17, and the remaining primes (the 3rd, 5th, and 6th) are 5, 11, and 13. If we add the integers 1, 2, 4, and 7, we get the same result as when we add 3, 5, and 6, namely 1 + 2 + 4 + 7 = 3 + 5 + 6 = 14. While this fact may not seem particularly special, it is also the case that the sums of the primes corresponding to these two sets of integers are equal: 2 + 3 + 7 + 17 = 5 + 11 + 13 = 29. Although the sums of these primes are equal, the products of these primes, namely 2 · 3 · 7 · 17 and 5 · 11 · 13, cannot be equal. This, of course, is a consequence of the Fundamental Theorem of Arithmetic. On the other hand, these products are surprisingly close, since 2 · 3 · 7 · 17 = 714 and 5 · 11 · 13 = 715. The serious baseball fan will recognize these numbers. For years, 714 was the career home run record, held by Babe Ruth. That record was broken in 1974, when Hank Aaron hit his 715th home run. Two consecutive integers n, n + 1 are called a Ruth-Aaron pair of integers if the sums of their prime divisors are equal. Thus 714 and 715 are a Ruth-Aaron pair, as are 5 and 6. Although such pairs of integers may seem rare, the famous Hungarian mathematician Paul Erdős proved that there are in fact infinitely many Ruth-Aaron pairs of integers.
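The two Ruth-Aaron pairs mentioned above can be checked with a short sketch. One assumption needs to be stated plainly: the code below sums the distinct prime divisors of each integer, which is one reading of the definition (another convention counts repeated prime factors). The two readings agree for 5, 6 and for 714, 715, the only pairs tested here. The function names are hypothetical.

```python
def distinct_prime_divisors(n):
    """Distinct primes dividing an integer n >= 2, found by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def is_ruth_aaron_pair(n):
    """True if n and n + 1 have equal sums of (distinct) prime divisors."""
    return sum(distinct_prime_divisors(n)) == sum(distinct_prime_divisors(n + 1))

print(distinct_prime_divisors(714), distinct_prime_divisors(715))
# → [2, 3, 7, 17] [5, 11, 13]
print(is_ruth_aaron_pair(714), is_ruth_aaron_pair(5), is_ruth_aaron_pair(6))
# → True True False
```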

EXERCISES FOR CHAPTER 11

Section 11.1: Divisibility Properties of Integers

11.1. Let a, b, c, d ∈ Z with a, c ≠ 0. Prove that if a | b and c | d, then ac | (ad + bc).
11.2. Let a, b ∈ Z with a ≠ 0. Prove that if a | b, then a | (−b) and (−a) | b.
11.3. Let a, b, c ∈ Z with a, c ≠ 0. Prove that if ac | bc, then a | b.
11.4. Prove that 3 | (n^3 − n) for every integer n.
11.5. Prove that if n = k^3 + 1 ≥ 3, where k ∈ Z, then n is not prime. [Hint: Recall that k^3 + 1 = (k + 1)(k^2 − k + 1).]
11.6. Find all primes that are 1 less than a perfect cube.
11.7. Prove that 8 | (5^{2n} + 7) for every positive integer n.


11.8. Prove that 5 | (3^{3n+1} + 2^{n+1}) for every positive integer n.
11.9. Prove that for every positive integer n, there exist n consecutive positive integers, each of which is composite. [Hint: Consider the numbers 2 + (n + 1)!, 3 + (n + 1)!, . . . , n + (n + 1)!, n + 1 + (n + 1)!.]
11.10. (a) Prove that 6 | (5n^3 + 7n) for every positive integer n. (b) Observe that 5 + 7 = 12 is a multiple of 6. State and prove a generalization of the problem in (a).
11.11. Prove the following: Let d be a nonzero integer. If a_1, a_2, . . . , a_n and x_1, x_2, . . . , x_n are 2n ≥ 2 integers such that d | a_i for all i (1 ≤ i ≤ n), then d | Σ_{i=1}^{n} a_i x_i.
11.12. Let p_n denote the nth prime and c_n the nth composite number. Thus p_1 = 2 and p_2 = 3, while c_1 = 4 and c_2 = 6. Of course, p_n ≠ c_n for all n ∈ N. Determine all positive integers n such that |p_n − c_n| = 1.
11.13. (a) Suppose that there are k distinct positive integers that divide an odd positive integer n. How many distinct positive integers divide 2n? How many divide 4n? (b) Suppose that there are k distinct positive integers that divide a positive integer n that is not divisible by 3. How many distinct positive integers divide 3n? How many divide 9n? (c) State and answer a question suggested by the questions in (a) and (b).
11.14. For an integer n ≥ 2, let m be the largest positive integer less than n such that m | n. Then n = mk for some positive integer k. Prove that k is a prime.

Section 11.2: The Division Algorithm

11.15. Illustrate the Division Algorithm for:
(a) a = 17, b = 125      (b) a = −17, b = 125
(c) a = 8, b = 96        (d) a = −8, b = 96
(e) a = 22, b = −17      (f) a = −22, b = −17
(g) a = 15, b = 0        (h) a = −15, b = 0.

11.16. Give an example of a prime p of each of the following forms: (a) 4k + 1 (b) 4k + 3 (c) 6k + 1 (d) 6k + 5.
11.17. Let p be an odd prime. Prove each of the following. (a) p is of the form 4k + 1 or of the form 4k + 3 for some nonnegative integer k. (b) If p ≥ 5, then p is of the form 6k + 1 or of the form 6k + 5 for some nonnegative integer k.
11.18. Show that every prime except 2 and 5 can be expressed as 10k + 1, 10k + 3, 10k + 7, or 10k + 9, where k ∈ Z.
11.19. (a) Prove that if an integer n is of the form 6q + 5 for some q ∈ Z, then n is of the form 3k + 2 for some k ∈ Z. (b) Is the converse of (a) true?
11.20. Prove the general form of the Division Algorithm (Corollary 11.5): For integers a and b with a ≠ 0, there exist unique integers q and r such that b = aq + r and 0 ≤ r < |a|.
11.21. Prove that the square of every odd integer is of the form 4k + 1, where k ∈ Z (that is, for every odd integer a ∈ Z, there exists k ∈ Z such that a^2 = 4k + 1).
11.22. (a) Prove that the square of every integer that is not a multiple of 3 is of the form 3k + 1, where k ∈ Z. (b) Prove that the square of no integer is of the form 3m − 1, where m ∈ Z.
11.23. Complete the following statement as best you can and prove it. (See Exercise 11.22(a).) The square of an integer that is not a multiple of 5 is of the form ______ or ______.


11.24. (a) Prove that for every integer m, one of the integers m, m + 4, m + 8, m + 12, m + 16 is a multiple of 5. (b) State and prove a generalization of the result in (a).
11.25. Prove the following: If a_1, a_2, . . . , a_n are n ≥ 2 integers such that a_i ≡ 1 (mod 3) for every integer i (1 ≤ i ≤ n), then a_1 a_2 · · · a_n ≡ 1 (mod 3).
11.26. Let a, b, and c be integers. Prove that if abc ≡ 1 (mod 3), then an odd number of a, b, and c are congruent to 1 modulo 3.
11.27. Prove or disprove: If a and b are odd integers, then 4 | (a − b) or 4 | (a + b).
11.28. Prove for every positive integer n that n^2 + 1 is not a multiple of 6.
11.29. It is known that there are infinitely many positive integers whose square is the sum of the squares of two positive integers. For example, 5^2 = 3^2 + 4^2 and 13^2 = 5^2 + 12^2. (a) Prove that there are infinitely many positive integers whose square is the sum of the squares of three positive integers. For example, 59^2 = 50^2 + 30^2 + 9^2. (b) Prove that there are infinitely many positive integers whose square is the sum of the squares of four positive integers.
11.30. (a) Let n ∈ N. Show that for every set S of n distinct integers, there is a nonempty subset T of S such that n divides the sum of the elements of T. [Hint: Let S = {a_1, a_2, . . . , a_n} and consider the subsets S_k = {a_1, a_2, . . . , a_k} for each k (1 ≤ k ≤ n).] (b) Is the word "distinct" in (a) necessary?
11.31. For an integer n ≥ 2, let S_n be the set of all positive integers m such that n is the smallest positive integer for which dividing m by n leaves a remainder of 1. (a) What does S_2 consist of? (b) To which set S_n does 14 belong? (c) To which set S_n does 16 belong? (d) Prove or disprove: For every integer n ≥ 2, the set S_n either contains infinitely many elements or is empty.

Section 11.3: Greatest Common Divisors

11.32. Give an example of a set S of four (distinct) positive integers such that the greatest common divisor of all six pairs of elements of S is 6.
11.33. Give an example of a set S of four (distinct) positive integers such that the greatest common divisors of all six pairs of elements of S are six distinct positive integers.
11.34. Prove for a ∈ Z and n ∈ N that gcd(a, a + n) | n.
11.35. Let a and b be two integers, not both 0, where gcd(a, b) = d. For a positive integer k, prove that gcd(ka, kb) = kd.
11.36. For positive integers a, b, and c, the greatest common divisor gcd(a, b, c) of a, b, and c is the largest positive integer that divides all of a, b, and c. Let d = gcd(a, b, c), e = gcd(a, b), and f = gcd(e, c). Prove that d = f.

Section 11.4: The Euclidean Algorithm

11.37. Use the Euclidean Algorithm to find the greatest common divisor of each of the following pairs of integers: (a) 51 and 288 (b) 357 and 629 (c) 180 and 252.
11.38. Determine integers x and y such that (see Exercise 11.37): (a) gcd(51, 288) = 51x + 288y (b) gcd(357, 629) = 357x + 629y (c) gcd(180, 252) = 180x + 252y.


11.39. Let a and b be integers, not both 0. Show that there are infinitely many pairs s, t of integers such that gcd(a, b) = as + bt.
11.40. Let a, b ∈ Z, where neither a nor b is 0, and let d = gcd(a, b). Show that an integer n is a linear combination of a and b if and only if d | n.
11.41. An integer n > 1 has the properties that n | (35m + 26) and n | (7m + 3) for some integer m. What is n?
11.42. Let a, b ∈ Z, where not both a and b are 0. Prove that if d = gcd(a, b), a = a_1 d, and b = b_1 d, then gcd(a_1, b_1) = 1.
11.43. Prove the following: Let a, b, c, m, n ∈ Z, where m, n ≥ 2. If a ≡ b (mod m) and a ≡ c (mod n), where d = gcd(m, n), then b ≡ c (mod d).
11.44. In Exercise 11.36, it was shown that gcd(a, b, c) = gcd(gcd(a, b), c) for positive integers a, b, and c. Show that there are integers x, y, and z such that gcd(a, b, c) = ax + by + cz.
11.45. Suppose that the Euclidean Algorithm is being applied to determine gcd(a, b) for two positive integers a and b. If, at some stage of the algorithm, we arrive at a remainder r_i that is a prime, what conclusion can be drawn about gcd(a, b)?

Section 11.5: Relatively Prime Integers

11.46. (a) Let a, b, c ∈ Z with a ≠ 0 and a | bc. Show that if gcd(a, b) ≠ 1, then a need not divide c. (b) Let a, b, c ∈ Z with a, b ≠ 0, a | c, and b | c. Show that if gcd(a, b) ≠ 1, then ab need not divide c.
11.47. Use Corollary 11.14 to prove that √3 is irrational.
11.48. Prove that if p and q are distinct primes, then √(pq) is irrational.
11.49. Let p be a prime and let n ∈ Z, where n ≥ 2. Prove that p^{1/n} is irrational.
11.50. Let n ∈ N. Prove or disprove each of the following: (a) 2n and 4n + 3 are relatively prime. (b) 2n + 1 and 3n + 2 are relatively prime.
11.51. (a) Prove that every two consecutive odd positive integers are relatively prime. (b) State and prove a generalization of the result in (a).
11.52. Prove that if p ≥ 2 is an integer with the property that for every pair b, c of integers, p | bc implies that p | b or p | c, then p is a prime. (This result is related to Corollary 11.14.)
11.53. Prove that if p and q are primes with p ≥ q ≥ 5, then 24 | (p^2 − q^2).
11.54. A triple (a, b, c) of positive integers such that a^2 + b^2 = c^2 is called a Pythagorean triple. A Pythagorean triple (a, b, c) is called primitive if gcd(a, b) = 1. (In this case, it also happens that gcd(a, c) = gcd(b, c) = 1.) (a) Prove that if (a, b, c) is a Pythagorean triple, then (an, bn, cn) is a Pythagorean triple for every n ∈ N. (b) In Exercise 13 of Chapter 4 it was shown that if (a, b, c) is a Pythagorean triple, then 3 | ab. Use this fact and Theorem 11.16 to show that 12 | ab. (c) Prove that if (a, b, c) is a primitive Pythagorean triple, then a and b are of opposite parity.
11.55. Prove the following: Let a, b, m, n ∈ Z, where m, n ≥ 2. If a ≡ b (mod m) and a ≡ b (mod n), where gcd(m, n) = 1, then a ≡ b (mod mn).
11.56. Prove the following: Let a, b, c, n ∈ Z, where n ≥ 2. If ac ≡ bc (mod n) and gcd(c, n) = 1, then a ≡ b (mod n).
11.57. For two integers a and b, not both 0, suppose that d = gcd(a, b). Then there exist integers x and y such that d = ax + by; that is, d is a linear combination of a and b. This implies that d is also a linear combination of x and y. Find a necessary and sufficient condition that d = gcd(x, y).


11.58. Suppose that the Euclidean Algorithm is being applied to determine gcd(a, b) for two positive integers a and b. If, at some stage of the algorithm, we arrive at remainders r_i and r_{i+1} such that r_{i+1} = r_i − 1, what conclusion can be drawn about gcd(a, b)?
11.59. (a) Let a and b be nonzero integers with d = gcd(a, b) and let c ∈ Z. Prove that if a | c and b | c, then ab | cd. (b) Show that Theorem 11.16 is a consequence of the result in (a).
11.60. Prove that there are infinitely many positive integers n such that each of n, n + 1, and n + 2 can be expressed as the sum of the squares of two nonnegative integers. (For example, observe that 8 = 2^2 + 2^2, 9 = 3^2 + 0^2, 10 = 3^2 + 1^2 and 80 = 8^2 + 4^2, 81 = 9^2 + 0^2, 82 = 9^2 + 1^2, and that 3 = 2 + 1 and 9 = 8 + 1.)
11.61. (a) Give an example of integers m, n ≥ 5 such that x ∩ y ≠ ∅ for every x ∈ Z_m and y ∈ Z_n. (b) State a conjecture that provides conditions under which integers m, n ≥ 2 have the property that x ∩ y ≠ ∅ for every x ∈ Z_m and y ∈ Z_n.

Section 11.6: The Fundamental Theorem of Arithmetic

11.62. Find the smallest prime factor of each integer below: (a) 539 (b) 1575 (c) 529 (d) 1601.
11.63. Find the canonical factorization of each of the following integers: (a) 4725 (b) 9702 (c) 180625.
11.64. Prove each of the following: (a) Every prime of the form 3n + 1 is also of the form 6k + 1. (b) If n is a positive integer of the form 3k + 2, then n has a prime factor of this form as well.
11.65. (a) Express each of the integers 4278 and 71929 as a product of primes. (b) What is gcd(4278, 71929)?
11.66. Consider the periodic sequence 1, 3, 2, −1, −3, −2, 1, 3, 2, −1, −3, −2, . . . , which we write in reverse order: . . . , −2, −3, −1, 2, 3, 1, −2, −3, −1, 2, 3, 1. Next, consider the 8-digit positive integer n = a_7 a_6 a_5 a_4 a_3 a_2 a_1 a_0, where each a_i is a digit. It turns out that 7 | n if and only if 3 · a_7 + 1 · a_6 + (−2) · a_5 + (−3) · a_4 + (−1) · a_3 + 2 · a_2 + 3 · a_1 + 1 · a_0 is a multiple of 7. Use this to determine which of the following are multiples of 7: (a) 56 (b) 821,317 (c) 31,142,524.
11.67. In the proof of Theorem 11.22, it was shown that there are infinitely many primes by assuming that there are finitely many primes, say p_1, p_2, . . . , p_n, where p_1 < p_2 < · · · < p_n. The number m = p_1 p_2 · · · p_n + 1 was then considered to obtain a contradiction. Show that an alternative proof of Theorem 11.22 can be obtained by considering p_n! + 1 instead of m.
11.68. Determine a necessary and sufficient condition that p_1^{a_1} p_2^{a_2} · · · p_k^{a_k} is the canonical factorization of the square of some integer n ≥ 2.
11.69. For two integers m, n ≥ 2, let p_1, p_2, . . . , p_r be those distinct primes such that each p_i (1 ≤ i ≤ r) divides at least one of m and n. Then m and n can be expressed as m = p_1^{a_1} p_2^{a_2} · · · p_r^{a_r} and n = p_1^{b_1} p_2^{b_2} · · · p_r^{b_r}, where the integers a_i and b_i (1 ≤ i ≤ r) are nonnegative. Let c_i = min(a_i, b_i) for 1 ≤ i ≤ r. Prove that gcd(m, n) = p_1^{c_1} p_2^{c_2} · · · p_r^{c_r}.

Section 11.7: Concepts Involving Sums of Divisors

11.70. Let k be a positive integer. (a) Prove that if $2^k - 1$ is prime, then k is prime. (b) Prove that if $2^k - 1$ is prime, then $n = 2^{k-1}(2^k - 1)$ is perfect.

11.71. For a real number r, the floor $\lfloor r \rfloor$ of r is the greatest integer less than or equal to r. The greatest number of distinct positive integers whose sum is 5 is 2 (5 = 1 + 4 = 2 + 3), while the greatest number of distinct positive integers whose sum is 8 is 3 (8 = 1 + 2 + 5 = 1 + 3 + 4). Prove that the maximum number of distinct positive integers whose sum is the positive integer n is $\lfloor (\sqrt{1 + 8n} - 1)/2 \rfloor$.

ADDITIONAL EXERCISES FOR CHAPTER 11

11.72. Evaluate the proposed solution of the following problem. Prove or disprove the following statement: There do not exist three integers n, n + 2 and n + 4, all of which are primes.

Solution  This statement is true.

Proof  Assume, to the contrary, that there exist three integers n, n + 2 and n + 4, all of which are primes. We can write n as 3q, 3q + 1 or 3q + 2, where $q \in \mathbf{Z}$. We consider these three cases. Case 1. n = 3q. Then $3 \mid n$ and so n is not prime. This is a contradiction. Case 2. n = 3q + 1. Then n + 2 = 3q + 3 = 3(q + 1). Since q + 1 is an integer, $3 \mid (n + 2)$ and so n + 2 is not prime. Again, we have a contradiction. Case 3. n = 3q + 2. Then n + 4 = 3q + 6 = 3(q + 2). Since q + 2 is an integer, $3 \mid (n + 4)$ and so n + 4 is not prime. This produces a contradiction.

11.73. An integer $a \ge 2$ is defined to be lucky if $f(n) = n^2 - n + a$ is prime for every integer n with $1 \le n \le a - 1$. It is known that (1) 41 is lucky and (2) only nine other integers $a \ge 2$ are lucky. (a) Prove that if a is lucky, then a is prime. (b) Give an example of three other lucky integers. (c) If a is lucky, what can be said about f(a)?

11.74. Prove that $\log_2 3$ is irrational.

11.75. Formulate and prove a more general result than that in Exercise 11.74.

11.76. Given below is an incomplete result with an incomplete proof. This result is intended to determine all twin primes (primes of the form p and q = p + 2) such that pq − 2 is also prime.

Result  Let p and q = p + 2 be two primes. Then pq − 2 is prime if and only if (complete this sentence).

Proof  Let p and q = p + 2 be two primes such that pq − 2 is also prime. Since p and p + 2 are both primes, it follows that p is odd. By the Division Algorithm, we can write p = 3k + r, where $k \in \mathbf{Z}$ and $0 \le r \le 2$. Since p is an odd prime, $k \ge 1$. We consider three cases for p, depending on the value of r. Case 1. p = 3k. Then p = ____, q = ____ and pq − 2 = ____. Case 2. p = 3k + 1. Then q = 3k + 3. Since $k \ge 1$, it follows that q = 3(k + 1). Thus ____. Case 3. p = 3k + 2. Then q = 3k + 4. Hence ____.

11.77. Exercise 11.76 should suggest another exercise to you. State a result suggested by Exercise 11.76 and prove this result.

11.78. Assume that every positive rational number is expressed as m/n, where $m, n \in \mathbf{N}$ and m and n are relatively prime. A function $f : \mathbf{Q}^+ \to \mathbf{N}$ is defined by $f(m/n) = 2^m 3^n$. (a) Prove that f is one-to-one. (b) If you have studied the Schröder-Bernstein Theorem (Theorem 10.20) in Chapter 10, what information about $\mathbf{Q}^+$ can be obtained from part (a)?

11.79. We saw in Section 11.7 that there is a partition of the set of the first seven primes into two subsets such that the sums of the elements in these two subsets are equal. Show that there is no such partition of the set of the first eight primes, but there is such a partition of the set of the first nine primes.

11.80. (a) Show that 5039 = 5040 − 1 is prime, while 5041 = 5040 + 1 is not prime. (b) Show that, apart from 5039, there is no prime between 5033 = 5040 − 7 and 5047 = 5040 + 7.

11.81. Let $a_0, a_1, a_2, \ldots$ be a sequence of positive integers such that (1) $a_0 = 1$, (2) $a_{2n+1} = a_n$ for $n \ge 0$ and (3) $a_{2n+2} = a_n + a_{n+1}$ for $n \ge 0$. Prove that $a_n$ and $a_{n+1}$ are relatively prime for every nonnegative integer n.

11.82. Every positive integer n can be expressed as $n = a_k a_{k-1} \cdots a_2 a_1 a_0$, so that $n = a_k 10^k + a_{k-1} 10^{k-1} + \cdots + a_2 10^2 + a_1 10 + a_0$. In Section 11.6, it was mentioned that $9 \mid n$ if and only if $9 \mid (a_k + a_{k-1} + \cdots + a_2 + a_1 + a_0)$. For example, for the integer n = 32,751, $9 \mid (3 + 2 + 7 + 5 + 1)$ and so $9 \mid 32{,}751$. Verify this by using the fact that 10 = 9 + 1 and that, for a positive integer r, $10^r = (9 + 1)^r = 9s + 1$ for some integer s.

11.83. Let A be the set of 2-element subsets of N and B the set of 3-element subsets of N. Let $f : A \to B$ and $g : B \to A$ be functions defined by $f(\{i, j\}) = \{i, j, i + j\}$ and $g(\{i, j, k\}) = \{2^i, 3^j 5^k\}$, where i < j < k. Which of the following statements can we conclude using the functions f and g and possibly the Schröder-Bernstein Theorem (Theorem 10.20 in Chapter 10)? (a) $|A| \le |B|$ (b) $|B| \le |A|$ (c) $|A| = |B|$

11.84. Let $p_1, p_2, p_3, \ldots$ be the primes, where $2 = p_1 < p_2 < p_3 < \cdots$. Let A be a denumerable set, where $A = \{a_1, a_2, a_3, \ldots\}$. For every integer $n \ge 2$, let $A^n$ be the Cartesian product of n copies of A; that is, $A^n$ is the set of ordered n-tuples of elements of A. Define a function $f : A^n \to \mathbf{N}$ by $f((a_{i_1}, a_{i_2}, \ldots, a_{i_n})) = p_1^{i_1} p_2^{i_2} \cdots p_n^{i_n}$. (a) Prove that f is one-to-one. (b) Use (a) to show that $A^n$ and A are numerically equivalent. (c) For any two denumerable sets A and B and any two integers $n, m \ge 2$, show that $A^n$ and $B^m$ are numerically equivalent.

11.85. Let $p_1, p_2, \ldots, p_{n+1}$ denote the first n + 1 primes. Suppose that $\{U, V\}$ is a partition of the set $S = \{p_1, p_2, \ldots, p_n\}$, where $U = \{q_1, q_2, \ldots, q_s\}$ and $V = \{r_1, r_2, \ldots, r_t\}$. Prove that if $M = q_1 q_2 \cdots q_s + r_1 r_2 \cdots r_t < p_{n+1}^2$, then M is prime.

12

Proofs in Calculus

Your introduction to calculus most certainly included a study of limits, both limits of sequences (including infinite series) and limits of functions (including continuity and differentiability). Although you learned methods for computing limits in these areas, the methods presented were probably based on facts that were not carefully verified. In this chapter, some proofs of fundamental results from calculus are presented. The proofs that occur in calculus are considerably different from any of those we have seen thus far. The functions encountered in calculus are real-valued functions defined on sets of real numbers. That is, each function that we study in calculus is of the type $f : X \to \mathbf{R}$, where $X \subseteq \mathbf{R}$. In the study of limits, we are often interested in such functions having the property that either (1) $X = \mathbf{N}$ and increasing values in the domain N result in functional values approaching some real number L, or (2) the function is defined for all real numbers near some specified real number a, and values approaching a result in functional values approaching some real number L. We begin with (1), where $X = \mathbf{N}$.

12.1 Limits of Sequences

A sequence (of real numbers) is a real-valued function defined on the set of natural numbers; that is, a sequence is a function $f : \mathbf{N} \to \mathbf{R}$. If $f(n) = a_n$ for every $n \in \mathbf{N}$, then $f = \{(1, a_1), (2, a_2), (3, a_3), \ldots\}$. Since only the numbers $a_1, a_2, a_3, \ldots$ are relevant in f, this sequence is usually denoted by $a_1, a_2, a_3, \ldots$ or by $\{a_n\}$. The numbers $a_1, a_2, a_3$, etc. are called the terms of the sequence $\{a_n\}$, where $a_1$ is the first term, $a_2$ is the second term, and so on. So $a_n$ is the nth term of the sequence. Thus $\{1/n\}$ is the sequence 1, 1/2, 1/3, ..., while $\{n/(2n+1)\}$ is the sequence 1/3, 2/5, 3/7, .... In these two examples, the nth term of a sequence is given, and from it we can easily find the first few terms and, indeed, any particular term. On the other hand, finding the nth term of a sequence whose first few terms are given can be difficult. For example, the nth term of the sequence 1/2, 1/4, 1/6, ... is $1/(2n)$; the nth term of the sequence 1 + 1/2, 1 + 1/4, 1 + 1/8, ... is $1 + 1/2^n$; the nth term of the sequence 1, 3/5, 1/2, 5/11, 3/7, 7/17, ... is $(n+1)/(3n-1)$; the nth term of the sequence 1, −1, 1, −1, 1, −1, ... is $(-1)^{n+1}$; while the nth term of the sequence 1, 4, 9, 16, ... is $n^2$.

For the sequence $\{1/n\}$, the larger the integer n, the closer 1/n is to 0; and for the sequence $\{n/(2n+1)\}$, the larger the integer n, the closer $n/(2n+1)$ is to 1/2. On the other hand, for the sequence $\{n^2\}$, as the integer n increases, $n^2$ becomes larger and larger and does not approach any real number.

When we discuss the closeness of two numbers, we are actually considering the distance between them. We saw in Chapter 8 that the distance between two real numbers a and b is defined as $|a - b|$. Recall that the absolute value of a real number x is

\[ |x| = \begin{cases} x & \text{if } x \ge 0 \\ -x & \text{if } x < 0. \end{cases} \]

Hence the distance between a = 3 and b = 5 is $|3 - 5| = |5 - 3| = 2$; while the distance between 0 and 1/n, where $n \in \mathbf{N}$, is $|0 - \tfrac{1}{n}| = |\tfrac{1}{n} - 0| = \tfrac{1}{n}$. For a fixed positive real number r, the inequality $|x| < r$ is equivalent to the inequalities $-r < x < r$. So $|x| < 3$ is equivalent to $-3 < x < 3$, while $|x - 2| < 4$ is equivalent to $-4 < x - 2 < 4$. Adding 2 throughout these inequalities gives $-4 + 2 < (x - 2) + 2 < 4 + 2$ and thus $-2 < x < 6$. We saw in Exercise 4.30 and Theorem 4.17 in Chapter 4 that for real numbers x and y,

\[ |xy| = |x|\,|y| \quad \text{and} \quad |x + y| \le |x| + |y|. \]

Both properties are useful throughout calculus. We have mentioned that for some sequences $\{a_n\}$, there is a real number L (or at least there appears to be a real number L) such that the larger the integer n becomes, the closer $a_n$ is to L. We have now arrived at an important and fundamental idea in the study of sequences and are prepared to introduce a concept that describes this situation. A sequence $\{a_n\}$ of real numbers is said to converge to a real number L if the larger the integer n, the closer $a_n$ is to L. Since the words larger and closer are vague and therefore open to interpretation, we need to make these words considerably more precise. What we mean is that we can make $a_n$ as close to L as we wish (that is, we can make $|a_n - L|$ as small as we wish) provided that n is large enough. We let $\varepsilon$ (the Greek letter epsilon) denote how small we want $|a_n - L|$ to be; that is, we want $|a_n - L| < \varepsilon$ by choosing n large enough. This is equivalent to $-\varepsilon < a_n - L < \varepsilon$, that is, $L - \varepsilon < a_n < L + \varepsilon$. Therefore, we require that $a_n$ be a number in the open interval $(L - \varepsilon, L + \varepsilon)$ when n is large enough. Now we need to know what we mean by "large enough". What we mean is that there exists a positive integer N such that if n is an integer greater than N, then $a_n \in (L - \varepsilon, L + \varepsilon)$. If such a positive integer N can be found for every positive number $\varepsilon$, no matter how small, then we say that $\{a_n\}$ converges to L. This is illustrated in Figure 12.1.

Figure 12.1  A sequence $\{a_n\}$ that converges to L

Formally then, a sequence $\{a_n\}$ of real numbers is said to converge to the real number L if for every real number $\varepsilon > 0$, there exists a positive integer N such that if n is an integer with n > N, then $|a_n - L| < \varepsilon$. As we have already indicated, the number $\varepsilon$ is a measure of how close the terms $a_n$ are required to be to the number L, and N indicates a position in the sequence beyond which the required condition is satisfied. If a sequence $\{a_n\}$ converges to L, then L is called the limit of $\{a_n\}$ and we write $\lim_{n \to \infty} a_n = L$. If a sequence does not converge, it is said to diverge. So if a sequence $\{a_n\}$ diverges, then there is no real number L such that $\lim_{n \to \infty} a_n = L$.

Before looking at some examples, we introduce some useful notation. For a real number x, recall that $\lceil x \rceil$ denotes the smallest integer greater than or equal to x. The integer $\lceil x \rceil$ is often referred to as the ceiling of x. Hence $\lceil 8/3 \rceil = 3$, $\lceil \sqrt{2} \rceil = 2$, $\lceil -1.6 \rceil = -1$ and $\lceil 5 \rceil = 5$. From the definition of $\lceil x \rceil$, it follows that if x is an integer, then $\lceil x \rceil = x$; whereas if x is not an integer, then $\lceil x \rceil > x$. In particular, if n is an integer such that $n > \lceil x \rceil$, then n > x. We now show how the definition of a convergent sequence is used to prove that a sequence converges to a number.

Result to Prove  The sequence $\{1/n\}$ converges to 0.

PROOF STRATEGY  Here, given a real number $\varepsilon > 0$, we need to show that there exists a positive integer N such that if n > N, then $|\tfrac{1}{n} - 0| = |\tfrac{1}{n}| = \tfrac{1}{n} < \varepsilon$. The inequality $\tfrac{1}{n} < \varepsilon$ is equivalent to $n > 1/\varepsilon$. So if we let $N = \lceil 1/\varepsilon \rceil$ and take n to be an integer greater than N, then $n > 1/\varepsilon$. We can now present a formal proof.

Result 12.1  The sequence $\{1/n\}$ converges to 0.

Proof  Let $\varepsilon > 0$. Choose $N = \lceil 1/\varepsilon \rceil$ and let n be any integer such that n > N. Then $n > 1/\varepsilon$ and thus $|\tfrac{1}{n} - 0| = \tfrac{1}{n} < \varepsilon$.

PROOF ANALYSIS  Although the proof of Result 12.1 is fairly short, the real work in constructing the proof took place in the proof strategy (our "scratch paper" work), which preceded the proof but is not part of the proof. This explains why we chose N as we did and why this choice of N was successful. In the proof of Result 12.1, we chose $N = \lceil 1/\varepsilon \rceil$ and showed that this value of N gives $|\tfrac{1}{n} - 0| < \varepsilon$ for every integer n with n > N, which, of course, was our goal. However, there is nothing unique about this choice of N. In fact, we could have chosen N to be any integer greater than $\lceil 1/\varepsilon \rceil$ (indeed, any integer $N \ge 1/\varepsilon$) and still obtained the desired result. However, we cannot in general choose N to be an integer less than $1/\varepsilon$. Nor can we, in general, choose $N = 1/\varepsilon$, since there is no guarantee that $1/\varepsilon$ is an integer.
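The remarks on the choice of N can be tried out numerically. The following is only a sketch: the function names `choose_N` and `holds_beyond` are illustrative and not from the text, and a finite spot check is of course not a proof. Exact rational arithmetic is used so that no rounding error enters the comparison.

```python
from fractions import Fraction
from math import ceil


def choose_N(eps):
    """N = ceil(1/eps), the choice made in the proof of Result 12.1."""
    return ceil(1 / eps)


def holds_beyond(N, eps, samples=1000):
    """Spot check |1/n - 0| < eps for n = N+1, ..., N+samples (finite, not a proof)."""
    return all(abs(Fraction(1, n) - 0) < eps for n in range(N + 1, N + 1 + samples))


eps = Fraction(3, 100)            # here 1/eps = 33.33...
N = choose_N(eps)                 # ceil(33.33...) = 34
print(N, holds_beyond(N, eps))    # the choice from the proof
print(holds_beyond(N + 5, eps))   # a larger N works as well
print(holds_beyond(32, eps))      # N = 32 < 1/eps fails, because 1/33 > 3/100
```

Since the terms 1/n decrease, the first index n = N + 1 already decides the outcome; the loop over further indices is kept only to mirror the wording "for every integer n with n > N".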

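Now we consider another illustration of a convergent sequence.

Result to Prove  The sequence $\{3 + 2/n^2\}$ converges to 3.

PROOF STRATEGY  Here, given $\varepsilon > 0$, we need to show that there exists a positive integer N such that if n > N, then

\[ \left| \left(3 + \frac{2}{n^2}\right) - 3 \right| = \left| \frac{2}{n^2} \right| = \frac{2}{n^2} < \varepsilon. \]

The inequality $\frac{2}{n^2} < \varepsilon$ is equivalent to $\frac{n^2}{2} > \frac{1}{\varepsilon}$ and to $n > \sqrt{2/\varepsilon}$. So if we let $N = \lceil \sqrt{2/\varepsilon} \rceil$ and choose n to be an integer greater than N, then $n > \sqrt{2/\varepsilon}$. We can now give a proof.

Result 12.2  The sequence $\{3 + 2/n^2\}$ converges to 3.

Proof  Let $\varepsilon > 0$. Choose $N = \lceil \sqrt{2/\varepsilon} \rceil$ and let n be any integer such that n > N. Then $n > \sqrt{2/\varepsilon}$ and $n^2 > 2/\varepsilon$. So $\frac{1}{n^2} < \frac{\varepsilon}{2}$ and $\frac{2}{n^2} < \varepsilon$. Hence

\[ \left| \left(3 + \frac{2}{n^2}\right) - 3 \right| = \left| \frac{2}{n^2} \right| = \frac{2}{n^2} < \varepsilon. \]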
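As a hedged illustration of the choice $N = \lceil \sqrt{2/\varepsilon} \rceil$, the sketch below computes the ceiling of a square root with integer arithmetic only, so that the check is exact. The helper names `ceil_sqrt` and `term` are assumptions made for this example and do not appear in the text.

```python
from fractions import Fraction
from math import ceil, isqrt


def ceil_sqrt(q):
    """Smallest positive integer N with N*N >= q, i.e. ceil(sqrt(q)) for q > 0."""
    m = ceil(q)                 # N*N is an integer, so N*N >= q iff N*N >= ceil(q)
    return isqrt(m - 1) + 1


def term(n):
    """n-th term of the sequence {3 + 2/n^2}, as an exact fraction."""
    return 3 + Fraction(2, n * n)


eps = Fraction(1, 50)
N = ceil_sqrt(2 / eps)          # 2/eps = 100, so N = 10
ok = all(abs(term(n) - 3) < eps for n in range(N + 1, N + 500))
print(N, ok)
```

For this particular $\varepsilon$ the index n = 10 itself does not satisfy the inequality, since $2/10^2$ equals $\varepsilon$ exactly; this is why the definition asks only about integers n > N.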

Now let's look at a slightly more complicated example.

Result to Prove  The sequence $\{n/(2n+1)\}$ converges to 1/2.

PROOF STRATEGY  Note that

\[ \left| \frac{n}{2n+1} - \frac{1}{2} \right| = \left| \frac{2n - 2n - 1}{2(2n+1)} \right| = \left| -\frac{1}{4n+2} \right| = \frac{1}{4n+2}. \]

The inequality $\frac{1}{4n+2} < \varepsilon$ is equivalent to $4n + 2 > 1/\varepsilon$, which in turn is equivalent to $n > \frac{1}{4\varepsilon} - \frac{1}{2}$. It may appear that the proper choice for N is $\lceil \frac{1}{4\varepsilon} - \frac{1}{2} \rceil$; but if $\varepsilon \ge 1/2$, then N = 0, which is unacceptable since N must be a positive integer. Note, however, that $\frac{1}{4\varepsilon} > \frac{1}{4\varepsilon} - \frac{1}{2}$. So if $n > \frac{1}{4\varepsilon}$, then also $n > \frac{1}{4\varepsilon} - \frac{1}{2}$. Hence, if we choose $N = \lceil \frac{1}{4\varepsilon} \rceil$, we can obtain the desired inequality.

Result 12.3  The sequence $\{n/(2n+1)\}$ converges to 1/2.

Proof

Let $\varepsilon > 0$. Choose $N = \lceil \frac{1}{4\varepsilon} \rceil$ and let n > N. Then $n > \frac{1}{4\varepsilon} > \frac{1}{4\varepsilon} - \frac{1}{2}$ and so $4n > \frac{1}{\varepsilon} - 2$ and $4n + 2 > 1/\varepsilon$. Therefore $\frac{1}{4n+2} < \varepsilon$. So

\[ \left| \frac{n}{2n+1} - \frac{1}{2} \right| = \left| \frac{2n - 2n - 1}{2(2n+1)} \right| = \left| -\frac{1}{4n+2} \right| = \frac{1}{4n+2} < \varepsilon. \]

Here, too, the choice made for N in the proof of Result 12.3 is not unique. We could choose N to be any positive integer greater than $\lceil \frac{1}{4\varepsilon} \rceil$.

We have mentioned that a sequence $\{a_n\}$ is said to diverge if it does not converge. To prove that a sequence $\{a_n\}$ diverges, a proof by contradiction would be anticipated. We would begin such a proof by assuming, to the contrary, that $\{a_n\}$ converges, say to a real number L. We know that for every $\varepsilon > 0$, there exists a positive integer N such that if n > N, then $|a_n - L| < \varepsilon$. If we could show for even one choice of $\varepsilon > 0$ that no such positive integer N exists, then we would have produced a contradiction and proved the desired result. Let's see how this works with two examples.

Result to Prove  The sequence $\{(-1)^{n+1}\}$ is divergent.

PROOF STRATEGY  In a proof by contradiction, we first assume that $\{(-1)^{n+1}\}$ converges, say to the limit L. Our goal is to show that there is a value of $\varepsilon > 0$ for which there is no positive integer N that meets the requirement of the definition. We choose $\varepsilon = 1$. By the definition of what it means for $\{(-1)^{n+1}\}$ to converge to L, there must exist a positive integer N such that if n is an integer with n > N, then $|(-1)^{n+1} - L| < \varepsilon = 1$. Let k be an odd integer such that k > N. Then $|(-1)^{k+1} - L| = |1 - L| = |L - 1| < 1$. Hence $-1 < L - 1 < 1$ and $0 < L < 2$. Now let $\ell$ be an even integer such that $\ell > N$. Then $|(-1)^{\ell+1} - L| = |-1 - L| = |L + 1| < 1$. So $-1 < L + 1 < 1$ and $-2 < L < 0$. So $L < 0 < L$, which of course is impossible. Now we repeat what we just said in a formal proof.

Result 12.4  The sequence $\{(-1)^{n+1}\}$ is divergent.

Proof  Assume, to the contrary, that the sequence $\{(-1)^{n+1}\}$ converges. Then $\lim_{n \to \infty} (-1)^{n+1} = L$ for some real number L. Let $\varepsilon = 1$. Then there exists a positive integer N such that if n > N, then $|(-1)^{n+1} - L| < \varepsilon = 1$. Let k be an odd integer such that k > N. Then $|(-1)^{k+1} - L| = |1 - L| = |L - 1| < 1$. So $-1 < L - 1 < 1$ and $0 < L < 2$. Next, let $\ell$ be an even integer such that $\ell > N$. Then $|(-1)^{\ell+1} - L| = |-1 - L| = |L + 1| = |1 + L| < 1$. So $-1 < L + 1 < 1$ and $-2 < L < 0$. Therefore $L < 0 < L$, which is a contradiction.

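PROOF ANALYSIS  A question that now arises is how we knew to choose $\varepsilon = 1$. If $\varepsilon$ denotes an arbitrary positive number, then both inequalities $|L - 1| < \varepsilon$ and $|L + 1| < \varepsilon$ must be satisfied, and these lead to the inequalities $1 - \varepsilon < L < 1 + \varepsilon$ and $-1 - \varepsilon < L < -1 + \varepsilon$. In particular, $1 - \varepsilon < L < -1 + \varepsilon$ and thus $1 - \varepsilon < -1 + \varepsilon$. This is only possible if $2\varepsilon > 2$, that is, if $\varepsilon > 1$. So choosing any number $\varepsilon$ with $0 < \varepsilon \le 1$ produces a contradiction. We chose $\varepsilon = 1$.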

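Result to Prove  The sequence $\{(-1)^{n+1} \frac{n}{n+1}\}$ is divergent.

PROOF STRATEGY  As expected, we try a proof by contradiction and assume that $\{(-1)^{n+1} \frac{n}{n+1}\}$ is a convergent sequence, say with limit L. Then for every $\varepsilon > 0$, there is a positive integer N such that $|(-1)^{n+1} \frac{n}{n+1} - L| < \varepsilon$ for every integer n with n > N. There are some useful observations. First, if n > N and n is odd, then $|\frac{n}{n+1} - L| < \varepsilon$ and hence $-\varepsilon < \frac{n}{n+1} - L < \varepsilon$. So $L - \varepsilon < \frac{n}{n+1} < L + \varepsilon$. Second, if n > N and n is even, then $|-\frac{n}{n+1} - L| < \varepsilon$ and hence $-\varepsilon < -\frac{n}{n+1} - L < \varepsilon$. So $L - \varepsilon < -\frac{n}{n+1} < L + \varepsilon$. Depending on whether L = 0, L > 0 or L < 0, we are faced with deciding how to choose $\varepsilon$ in order to produce a contradiction. Also, since $n \ge 1$, we have $n + n \ge n + 1$ and therefore $2n \ge n + 1$. So $\frac{n}{n+1} \ge \frac{1}{2}$.

Result 12.5  The sequence $\{(-1)^{n+1} \frac{n}{n+1}\}$ is divergent.

Proof  Assume, to the contrary, that $\{(-1)^{n+1} \frac{n}{n+1}\}$ converges. Then $\lim_{n \to \infty} (-1)^{n+1} \frac{n}{n+1} = L$ for some real number L. We consider three cases, depending on whether L = 0, L > 0 or L < 0.

Case 1. L = 0. Let $\varepsilon = \frac{1}{2}$. Then there exists a positive integer N such that if n > N, then $|(-1)^{n+1} \frac{n}{n+1} - 0| < \frac{1}{2}$, that is, $\frac{n}{n+1} < \frac{1}{2}$. So 2n < n + 1 and then n < 1, which is a contradiction.

Case 2. L > 0. Let $\varepsilon = \frac{L}{2}$. Then there exists a positive integer N such that if n > N, then $|(-1)^{n+1} \frac{n}{n+1} - L| < \frac{L}{2}$. Let n be an even integer such that n > N. Then $-\frac{L}{2} < -\frac{n}{n+1} - L < \frac{L}{2}$. Therefore $\frac{L}{2} < -\frac{n}{n+1} < \frac{3L}{2}$. However, $-\frac{n}{n+1} < 0 < \frac{L}{2}$, which is a contradiction.

Case 3. L < 0. Let $\varepsilon = -\frac{L}{2}$. Then there exists a positive integer N such that if n > N, then $|(-1)^{n+1} \frac{n}{n+1} - L| < -\frac{L}{2}$. Let n be an odd integer such that n > N. Then $\frac{L}{2} < \frac{n}{n+1} - L < -\frac{L}{2}$. Therefore $\frac{3L}{2} < \frac{n}{n+1} < \frac{L}{2} < 0$, which is a contradiction since $\frac{n}{n+1} > 0$.

A sequence $\{a_n\}$ is said to diverge to infinity, written $\lim_{n \to \infty} a_n = \infty$, if for every positive number M, there exists a positive integer N such that if n > N, then $a_n > M$. The sequence $\{(-1)^{n+1}\}$ encountered in Result 12.4, although divergent, does not diverge to infinity. However, the sequence $\{n^2 + \frac{1}{n}\}$ does diverge to infinity.

Result to Prove  $\lim_{n \to \infty} \left(n^2 + \frac{1}{n}\right) = \infty$.

PROOF STRATEGY  Given a positive number M, we need to show the existence of a positive integer N such that if n > N, then $n^2 + \frac{1}{n} > M$. Note that if $n^2 > M$, then $n^2 + \frac{1}{n} > n^2 > M$. Since M > 0, it follows that $n^2 > M$ is equivalent to $n > \sqrt{M}$. A formal proof can now be constructed.

Result 12.6  $\lim_{n \to \infty} \left(n^2 + \frac{1}{n}\right) = \infty$.

Proof  Let M be a positive number. Choose $N = \lceil \sqrt{M} \rceil$ and let n be any integer such that n > N. Then $n > \sqrt{M}$ and therefore $n^2 > M$. So $n^2 + \frac{1}{n} > n^2 > M$.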

12.2 Infinite Series

An important concept in calculus involving sequences is that of infinite series. For real numbers $a_1, a_2, a_3, \ldots$, we write $\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \cdots$ to denote an infinite series (often simply referred to as a series). For example,

\[ \sum_{k=1}^{\infty} \frac{1}{k^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots \quad \text{and} \quad \sum_{k=1}^{\infty} \frac{k}{2k^2 + 1} = \frac{1}{3} + \frac{2}{9} + \frac{3}{19} + \cdots \]

are infinite series. The numbers $a_1, a_2, a_3, \ldots$ are the terms of the series $\sum_{k=1}^{\infty} a_k = a_1 + a_2 + a_3 + \cdots$. The notation certainly seems to indicate that we are adding the terms $a_1, a_2, a_3, \ldots$. But what does it mean to add infinitely many numbers? A meaning must be given to this. For this reason, we construct a sequence $\{s_n\}$, called the sequence of partial sums of the series. Here $s_1 = a_1$, $s_2 = a_1 + a_2$, $s_3 = a_1 + a_2 + a_3$ and, in general, for $n \in \mathbf{N}$,

\[ s_n = a_1 + a_2 + \cdots + a_n = \sum_{k=1}^{n} a_k. \]

Since $s_n$ is determined by adding a finite number of terms, there is no confusion in understanding the terms of the sequence $\{s_n\}$. If the sequence $\{s_n\}$ converges, say to the number L, then the series $\sum_{k=1}^{\infty} a_k$ is said to converge to L and we write $\sum_{k=1}^{\infty} a_k = L$. This number L is called the sum of $\sum_{k=1}^{\infty} a_k$. If $\{s_n\}$ diverges, then $\sum_{k=1}^{\infty} a_k$ is said to diverge.

The French mathematician Augustin-Louis Cauchy was one of the most prolific mathematicians of the 19th century. Among his many achievements was his definition of the convergence of infinite series, a definition that is still used today. In his work Cours d'Analyse, Cauchy considered the sequence $\{s_n\}$ of partial sums of a series. He stated that if, for increasing values of n, the sum $s_n$ approaches a certain limit s, then the series is called convergent and the limit in question is called the sum of the series. We consider an example of a convergent series.

Result to Prove  The infinite series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ converges to 1.

PROOF STRATEGY  First we consider the sequence $\{s_n\}$ of partial sums for this series. Since

\[ \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots, \]

it follows that $s_1 = \frac{1}{1 \cdot 2} = \frac{1}{2}$, $s_2 = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} = \frac{1}{2} + \frac{1}{6} = \frac{2}{3}$ and $s_3 = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} = \frac{1}{2} + \frac{1}{6} + \frac{1}{12} = \frac{3}{4}$. Based on these three terms, it appears that $s_n = \frac{n}{n+1}$ for every positive integer n. We prove that this is indeed the case.

Lemma 12.7  For every positive integer n,

\[ s_n = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n+1)} = \frac{n}{n+1}. \]

Proof of Lemma 12.7  We proceed by induction. For n = 1, we have $s_1 = \frac{1}{1 \cdot 2} = \frac{1}{1+1}$ and the result holds. Assume that $s_k = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{k(k+1)} = \frac{k}{k+1}$, where k is a positive integer. We show that

\[ s_{k+1} = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{(k+1)(k+2)} = \frac{k+1}{k+2}. \]

Observe that

\[ s_{k+1} = \left[ \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{k(k+1)} \right] + \frac{1}{(k+1)(k+2)} = \frac{k}{k+1} + \frac{1}{(k+1)(k+2)} = \frac{k(k+2) + 1}{(k+1)(k+2)} = \frac{k^2 + 2k + 1}{(k+1)(k+2)} = \frac{(k+1)^2}{(k+1)(k+2)} = \frac{k+1}{k+2}. \]

By the Principle of Mathematical Induction, $s_n = \frac{n}{n+1}$ for every positive integer n.

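Is there another way we could have seen that $s_n = \frac{n}{n+1}$? If we had observed that

\[ a_n = \frac{1}{n(n+1)} = \frac{1}{n} - \frac{1}{n+1}, \]

then $a_1 = \frac{1}{1 \cdot 2} = 1 - \frac{1}{2}$, $a_2 = \frac{1}{2 \cdot 3} = \frac{1}{2} - \frac{1}{3}$, etc. In particular,

\[ s_n = a_1 + a_2 + a_3 + \cdots + a_n = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right) = 1 - \frac{1}{n+1} = \frac{n}{n+1}. \]

In any case, now that we know that $s_n = \frac{n}{n+1}$, it only remains to prove that $\lim_{n \to \infty} s_n = \lim_{n \to \infty} \frac{n}{n+1} = 1$.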
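The closed form for the partial sums can be checked for particular values of n with exact rational arithmetic. This is a minimal sketch (the function name `partial_sum` is chosen here for illustration); it confirms the formula only for the sampled values of n, whereas Lemma 12.7 covers all positive integers.

```python
from fractions import Fraction


def partial_sum(n):
    """s_n = 1/(1*2) + 1/(2*3) + ... + 1/(n(n+1)), computed exactly."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


for n in (1, 2, 3, 10, 100):
    s = partial_sum(n)
    assert s == Fraction(n, n + 1)        # closed form from Lemma 12.7
    assert 1 - s == Fraction(1, n + 1)    # distance of s_n from the limit 1
    print(n, s)
```

The second assertion records the quantity $|s_n - 1| = \frac{1}{n+1}$, which is exactly what has to be made smaller than $\varepsilon$ when proving that the limit is 1.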

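Lemma to Prove  $\lim_{n \to \infty} \frac{n}{n+1} = 1$.

PROOF STRATEGY  Given $\varepsilon > 0$, we need to find a positive integer N such that if n > N, then $|\frac{n}{n+1} - 1| < \varepsilon$. Now

\[ \left| \frac{n}{n+1} - 1 \right| = \left| \frac{n - n - 1}{n+1} \right| = \left| \frac{-1}{n+1} \right| = \frac{1}{n+1}. \]

The inequality $\frac{1}{n+1} < \varepsilon$ is equivalent to $n + 1 > \frac{1}{\varepsilon}$, which is equivalent to $n > \frac{1}{\varepsilon} - 1$. If $n > \frac{1}{\varepsilon}$, then $n > \frac{1}{\varepsilon} - 1$. We can now present a proof of this lemma.

Lemma 12.8  $\lim_{n \to \infty} \frac{n}{n+1} = 1$.

Proof of Lemma 12.8  Let $\varepsilon > 0$. Choose $N = \lceil 1/\varepsilon \rceil$ and let n > N. Then $n > \frac{1}{\varepsilon} > \frac{1}{\varepsilon} - 1$. So $n + 1 > \frac{1}{\varepsilon}$ and $\frac{1}{n+1} < \varepsilon$. Hence

\[ \left| \frac{n}{n+1} - 1 \right| = \left| \frac{-1}{n+1} \right| = \frac{1}{n+1} < \varepsilon. \]

We are now ready to provide a proof of the result.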

Result 12.9  The infinite series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ converges to 1.

Proof  The nth term of the sequence $\{s_n\}$ of partial sums of the series $\sum_{k=1}^{\infty} \frac{1}{k(k+1)}$ is

\[ s_n = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n+1)}. \]

By Lemma 12.7, we have

\[ s_n = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{n(n+1)} = \frac{n}{n+1} \]

and thus $s_n = \frac{n}{n+1}$. By Lemma 12.8, we have $\lim_{n \to \infty} \frac{n}{n+1} = 1$. Since $\lim_{n \to \infty} s_n = 1$, it follows that $\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1$.

We now turn to a divergent series. The series $\sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$ is famous and is called the harmonic series. In fact, it is probably the best-known divergent series.

Result 12.10  The harmonic series $\sum_{k=1}^{\infty} \frac{1}{k}$ diverges.

Proof  Assume, to the contrary, that $\sum_{k=1}^{\infty} \frac{1}{k}$ converges, say to the number L. For every positive integer n, let $s_n = \sum_{k=1}^{n} \frac{1}{k} = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$. Hence the sequence $\{s_n\}$ of partial sums converges to L. Therefore, for every $\varepsilon > 0$, there exists a positive integer N such that if n > N, then $|s_n - L| < \varepsilon$. Consider $\varepsilon = 1/4$ and let n be an integer with n > N. Then

\[ -\frac{1}{4} < s_n - L < \frac{1}{4}. \]

Since 2n > N, we also have $|s_{2n} - L| < \frac{1}{4}$ and so $-\frac{1}{4} < s_{2n} - L < \frac{1}{4}$. Notice that

\[ s_{2n} = s_n + \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} > s_n + n \cdot \frac{1}{2n} = s_n + \frac{1}{2}. \]

So

\[ \frac{1}{4} > s_{2n} - L > s_n + \frac{1}{2} - L = (s_n - L) + \frac{1}{2} > -\frac{1}{4} + \frac{1}{2} = \frac{1}{4}, \]

which is impossible.
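The key inequality in this proof, $s_{2n} > s_n + \frac{1}{2}$, can be observed on exact partial sums. The sketch below uses illustrative names (`harmonic`, `gap`) that are not from the text. Note that for n = 1 the difference equals 1/2 exactly, so the strict inequality relies on $n \ge 2$, which holds in the proof because $n > N \ge 1$.

```python
from fractions import Fraction


def harmonic(n):
    """s_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def gap(n):
    """s_{2n} - s_n = 1/(n+1) + 1/(n+2) + ... + 1/(2n)."""
    return harmonic(2 * n) - harmonic(n)


for n in (1, 2, 5, 50):
    print(n, gap(n), gap(n) > Fraction(1, 2))
```

Two numbers that differ by more than 1/2 cannot both lie within 1/4 of the same number L, which is the impossibility reached at the end of the proof.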

PROOF ANALYSIS  In Result 12.10, we showed that a certain series diverges; that is, it does not converge. Hence it is not surprising that we proved this by contradiction. Assuming that the sequence $\{s_n\}$ converges means that the sequence has a limit L. This tells us that an inequality of the type $|s_n - L| < \varepsilon$ holds for every positive number $\varepsilon$ and for sufficiently large integers n (which depend on $\varepsilon$). The goal, of course, was to obtain a contradiction. We did this by making a choice of $\varepsilon$ ($\varepsilon = 1/4$ worked!) that led to a mathematical impossibility.

The harmonic series $\sum_{k=1}^{\infty} \frac{1}{k}$ not only diverges, it diverges to infinity; that is, if $\{s_n\}$ is the sequence of partial sums for the harmonic series, then $\lim_{n \to \infty} s_n = \infty$. We establish this fact as well. First, we verify a lemma that shows once again that mathematical induction can be a useful proof technique in calculus.

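Lemma 12.11  Let $s_n = \sum_{k=1}^{n} \frac{1}{k} = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$, where $n \in \mathbf{N}$. Then $s_{2^n} \ge 1 + \frac{n}{2}$ for every positive integer n.

Proof  We proceed by induction. For n = 1, $s_{2^1} = 1 + \frac{1}{2}$ and so the result holds for n = 1. Assume that $s_{2^k} \ge 1 + \frac{k}{2}$, where $k \in \mathbf{N}$. We show that $s_{2^{k+1}} \ge 1 + \frac{k+1}{2}$. Now observe that

\[ \begin{aligned} s_{2^{k+1}} &= 1 + \frac{1}{2} + \cdots + \frac{1}{2^{k+1}} \\ &= s_{2^k} + \frac{1}{2^k + 1} + \frac{1}{2^k + 2} + \cdots + \frac{1}{2^{k+1}} \\ &\ge s_{2^k} + \frac{1}{2^{k+1}} + \frac{1}{2^{k+1}} + \cdots + \frac{1}{2^{k+1}} \\ &= s_{2^k} + \frac{2^k}{2^{k+1}} = s_{2^k} + \frac{1}{2} \\ &\ge 1 + \frac{k}{2} + \frac{1}{2} = 1 + \frac{k+1}{2}. \end{aligned} \]

By the Principle of Mathematical Induction, $s_{2^n} \ge 1 + \frac{n}{2}$ for every positive integer n.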

Result 12.12  The harmonic series $\sum_{k=1}^{\infty} \frac{1}{k}$ diverges to infinity.

Proof  For $n \in \mathbf{N}$, let $s_n = \sum_{k=1}^{n} \frac{1}{k}$. So $\{s_n\}$ is the sequence of partial sums for the harmonic series. We show that $\lim_{n \to \infty} s_n = \infty$. Let M be a positive integer and choose $N = 2^{2M}$. Let n > N. Then, using Lemma 12.11, we have

\[ \begin{aligned} s_n &= 1 + \frac{1}{2} + \cdots + \frac{1}{N} + \frac{1}{N+1} + \cdots + \frac{1}{n} \\ &= s_N + \frac{1}{N+1} + \frac{1}{N+2} + \cdots + \frac{1}{n} \\ &> s_N = s_{2^{2M}} \ge 1 + \frac{2M}{2} > M. \end{aligned} \]
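The bound of Lemma 12.11 and the choice $N = 2^{2M}$ can be illustrated for small values. This is a sketch under the assumption that a few sample values suffice to convey the idea; the name `harmonic` is chosen for this example, and the check proves nothing beyond the values tested.

```python
from fractions import Fraction


def harmonic(n):
    """s_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


# Lemma 12.11 for n = 1, ..., 10: s_{2^n} >= 1 + n/2.
for n in range(1, 11):
    assert harmonic(2 ** n) >= 1 + Fraction(n, 2)

# Result 12.12 with M = 3: N = 2^(2M) = 64, and every s_n with n > N exceeds M.
M = 3
N = 2 ** (2 * M)
print(N, float(harmonic(N)), harmonic(N + 1) > M)
```

The choice $N = 2^{2M}$ grows very quickly with M, which reflects how slowly the harmonic partial sums increase.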

12.3 Limits of Functions

We now turn to another common type of limit problem (perhaps the most common). Here we consider functions $f : X \to \mathbf{R}$, where $X \subseteq \mathbf{R}$, and study the behavior of such a function f near a real number (point) a. For the moment, we are not concerned with whether $a \in X$, but since we are concerned with the numbers f(x) for real numbers x close to a, it is necessary that f be defined in a "deleted neighborhood" of a. By a deleted neighborhood of a, we mean a set of the type $(a - \delta, a) \cup (a, a + \delta) = (a - \delta, a + \delta) - \{a\} \subseteq X$ for some positive real number $\delta$ (the Greek letter delta). (See Figure 12.2.) In fact, it may be that $(a - \delta, a + \delta) \subseteq X$ for some $\delta > 0$. For example, if $f : X \to \mathbf{R}$ is defined by $f(x) = \frac{|x|}{x}$ and we are interested in the behavior of f near 0, then $0 \notin X$. Indeed, it may be that $X = \mathbf{R} - \{0\}$, in which case $(-\delta, 0) \cup (0, \delta) \subseteq X$ for every positive real number $\delta$. On the other hand, if $f : X \to \mathbf{R}$ is defined by $f(x) = \frac{x}{x^2 - 1}$ and again we are interested in the behavior of f near 0, then $1, -1 \notin X$. A natural choice for X is $\mathbf{R} - \{1, -1\}$, in which case $(-\delta, \delta) \subseteq X$ for every real number $\delta$ with $0 < \delta \le 1$. We are now ready to present the definition of the limit of a function.

Figure 12.2  A deleted neighborhood of a

Let f be a real-valued function defined on a set X of real numbers. Also, let $a \in \mathbf{R}$ be such that f is defined in a deleted neighborhood of a. Then we say that the real number L is the limit of f(x) as x approaches a, written $\lim_{x \to a} f(x) = L$, if the closer x is to a, the closer f(x) is to L. Again, the imprecision of the word closer calls for a much more precise definition. The positive number $\varepsilon$ is intended to indicate how close f(x) must be to L; that is, we require that $|f(x) - L| < \varepsilon$. So the claim is: if x is close enough to a, then $|f(x) - L| < \varepsilon$. We use the positive number $\delta$ to represent how close x must be to a for the inequality $|f(x) - L| < \varepsilon$ to be satisfied, remembering that we are not concerned with how, or even if, f is defined at a. More precisely, L is the limit of f(x) as x approaches a, written $\lim_{x \to a} f(x) = L$, if for every real number $\varepsilon > 0$, there is a real number $\delta > 0$ such that for every real number x with $0 < |x - a| < \delta$, it follows that $|f(x) - L| < \varepsilon$. This implies that if $0 < |x - a| < \delta$, then f(x) is certainly defined. If there is a number L such that $\lim_{x \to a} f(x) = L$, then we say that the limit $\lim_{x \to a} f(x)$ exists and is equal to L; otherwise, this limit does not exist.

Thus, to show that $\lim_{x \to a} f(x) = L$, we first let $\varepsilon > 0$ be given and then show the existence of a real number $\delta > 0$ with the required property. Usually, the smaller the value of $\varepsilon$, the smaller the value of $\delta$. However, we must be sure that the chosen number $\delta$, no matter how small (or large) it may be, satisfies the requirement. Although our choice of $\delta$ depends on $\varepsilon$, it should not depend on which real number x with $0 < |x - a| < \delta$ is considered. Thus, if $\lim_{x \to a} f(x) = L$, then for a given $\varepsilon > 0$, there exists a $\delta > 0$ such that if x is any number in the open interval $(a - \delta, a + \delta)$ other than a, then f(x) is a number in the interval $(L - \varepsilon, L + \varepsilon)$. This geometric interpretation of the definition of limit is shown in Figure 12.3. We illustrate these ideas with an example.

Figure 12.3  A geometric interpretation of $\lim_{x \to a} f(x) = L$

Result to Prove  $\lim_{x \to 4} (3x - 7) = 5$.

PROOF STRATEGY  Before we provide a formal proof of this limit, let's discuss the procedure we will use. The proof begins by letting $\varepsilon > 0$ be given. What we need to do is find a number $\delta > 0$ such that if $0 < |x - 4| < \delta$, then $|(3x - 7) - 5| < \varepsilon$ or, equivalently, $|3(x - 4)| < \varepsilon$. This is also equivalent to $|3| \cdot |x - 4| < \varepsilon$ and to $|x - 4| < \varepsilon/3$. This suggests our choice of $\delta$. Now we can give a proof.

Result 12.13  $\lim_{x \to 4} (3x - 7) = 5$.

Proof  Let $\varepsilon > 0$ be given. Choose $\delta = \varepsilon/3$. Let $x \in \mathbf{R}$ with $0 < |x - 4| < \delta = \varepsilon/3$. Then

\[ |(3x - 7) - 5| = |3x - 12| = |3(x - 4)| = 3|x - 4| < 3(\varepsilon/3) = \varepsilon. \]

Let's consider another example.

Result to Prove  $\lim_{x \to -3} (-2x + 1) = 7$.

PROOF STRATEGY  First, let's do some preliminary algebra. The inequality $|(-2x + 1) - 7| < \varepsilon$ is equivalent to $|-2x - 6| < \varepsilon$ and to $2|x + 3| < \varepsilon$, that is, to $|x + 3| < \varepsilon/2$. This suggests a desired value of $\delta$. Now we can give a proof.

Result 12.14  $\lim_{x \to -3} (-2x + 1) = 7$.

Proof  Let $\varepsilon > 0$ and choose $\delta = \varepsilon/2$. Let $x \in \mathbf{R}$ with $0 < |x - (-3)| < \delta = \varepsilon/2$, so that $0 < |x + 3| < \varepsilon/2$. Then

\[ |(-2x + 1) - 7| = |-2x - 6| = |-2(x + 3)| = |-2|\,|x + 3| = 2|x + 3| < 2(\varepsilon/2) = \varepsilon. \]

The two examples we have seen so far show us how to proceed when the function is linear (that is, f(x) = ax + b, where $a, b \in \mathbf{R}$). We now present a small variation of this.

to prove result

TEST STRATEGY

Result 12.15

Study

lim

x→ 32

4x 2 − 9 = 6. 2x − 3

2 4x − 9 − 6 < or after simplification: In this example | f(x) − L| < becomes 2x − 3 (2x + 3)(2x − 3) − 6 < . However, since the numbers x lie in an excluded neighbor 2x − 3 (2x + 3)(2x − 3) hood of 3/2, it follows that 2x − 3 = 0 and thus − 6 < to 2x − 3 |(2x + 3) − 6| < or |2x − 3| 🇧🇷 So 2|x − 3/2| < e |x − 3/2| </2. Now we are ready to try. limited

x→ 32

4x 2 − 9 = 6. 2x − 3

Let > 0 and choose δ = /2. Let x ∈ R with 0 < |x − 3/2| < δ = /2. So 2|x − 3/2| +23) − 6| 0, we need to find δ > 0 such that if 0 < |x − 3| < δ, then |x 2 − 9| 🇧🇷 To find an appropriate choice of δ with respect to , we start with |x 2 − 9| 🇧🇷 We want with the expression |x − 3| work in this inequality. This is actually quite easy since |x 2 − 9| < is equivalent to |x − 3||x + 3| 🇧🇷 This can prompt us to choose δ = . However, for δ it is necessary to think about |x − 3| to write < |x + 3| |x + 3| to be a positive number (a constant) that depends on x but is not a function of x. The expression |x + 3| can be eliminated, as we shall now show. Since it is our choice how we choose δ, we can certainly require δ ≤ 1, which we do. So |x − 3| < 1 and thus −1 < x − 3 < 1. So 2 < x < 4. So 5 < x + 3 < 7 and therefore |x + 3| < 7. Then, under this constraint for δ, it follows that |x − 3||x + 3| < 7|x − 3|. If now 7|x − 3| 🇧🇷


that is, if |x − 3| < ε/7, then it certainly follows that |x − 3||x + 3| < ε. To arrive at this inequality, we need both |x − 3| < 1 and |x − 3| < ε/7. This suggests an appropriate choice of δ.

Result 12.16  lim_{x→3} x^2 = 9.

Proof

Let ε > 0 be given and choose δ = min(1, ε/7). Let x ∈ R such that 0 < |x − 3| < δ = min(1, ε/7). Since |x − 3| < 1, it follows that −1 < x − 3 < 1 and so 5 < x + 3 < 7. In particular, |x + 3| < 7. Because |x − 3| < ε/7, it follows that |x^2 − 9| = |x − 3||x + 3| < (ε/7) · 7 = ε.

We have now seen four proofs of limits of the type lim_{x→a} f(x) = L. In Result 12.13, we chose δ = ε/3 for a given ε > 0, and in Result 12.14 we chose δ = ε/2. In each case, if we had considered a different value of a for the same function, the same choice of δ would be successful. In both cases, the function is linear. In Result 12.15, for a given ε > 0, the choice of δ = ε/2 would also be successful if a ≠ 3/2, provided that 3/2 ∉ (a − δ, a + δ). This is because the function f defined by f(x) = (4x^2 − 9)/(2x − 3) in Result 12.15 is "nearly linear"; that is, f(x) = 2x + 3 if x ≠ 3/2 and f(3/2) is not defined. However, our choice of δ = min(1, ε/7) in the proof of Result 12.16 depended on a = 3; that is, if a ≠ 3, a different choice of δ is needed. For example, if we were to prove that lim_{x→4} x^2 = 16, then for a given ε > 0 an appropriate choice for δ is min(1, ε/9).
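The dependence of δ on a for f(x) = x^2 can be explored numerically. The sketch below uses δ = min(1, ε/(2|a| + 1)), which gives min(1, ε/7) for a = 3 and min(1, ε/9) for a = 4; this general formula is our own extrapolation from the two cases in the text (it follows from |x + a| ≤ |x − a| + 2|a| < 1 + 2|a| when |x − a| < 1), and the function names are illustrative.

```python
def delta_for_square(a: float, eps: float) -> float:
    """Candidate delta for lim_{x->a} x^2 = a^2 (assumed generalization)."""
    return min(1.0, eps / (2 * abs(a) + 1))

def worst_gap(a: float, eps: float, samples: int = 2001) -> float:
    """Largest |x^2 - a^2| over sample points with 0 < |x - a| < delta."""
    delta = delta_for_square(a, eps)
    worst = 0.0
    for i in range(1, samples):
        offset = delta * i / samples          # 0 < offset < delta
        for x in (a - offset, a + offset):
            worst = max(worst, abs(x * x - a * a))
    return worst

gap_at_3 = worst_gap(3.0, 0.5)   # uses delta = min(1, 0.5/7)
gap_at_4 = worst_gap(4.0, 0.5)   # uses delta = min(1, 0.5/9)
```

In both cases the largest sampled gap stays below ε = 0.5, in agreement with the estimates |x + 3| < 7 and |x + 4| < 9.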

Next we consider a limit involving a polynomial function of higher degree.

Result to Prove  lim_{x→2} (x^5 − 2x^3 − 3x − 7) = 3.

Proof Strategy

For a given ε > 0, we are required to show that |(x^5 − 2x^3 − 3x − 7) − 3| < ε if 0 < |x − 2| < δ for a suitable choice of δ > 0. We then need to work |x − 2| into the expression |x^5 − 2x^3 − 3x − 10|. Dividing x^5 − 2x^3 − 3x − 10 by x − 2, we obtain x^5 − 2x^3 − 3x − 10 = (x − 2)(x^4 + 2x^3 + 2x^2 + 4x + 5). Hence |x^5 − 2x^3 − 3x − 10| = |x − 2||x^4 + 2x^3 + 2x^2 + 4x + 5|. So we seek an upper bound for |x^4 + 2x^3 + 2x^2 + 4x + 5|. To do this, we impose the restriction δ ≤ 1. Thus |x − 2| < δ ≤ 1, so −1 < x − 2 < 1 and 1 < x < 3. Hence |x^4 + 2x^3 + 2x^2 + 4x + 5| ≤ |x^4| + |2x^3| + |2x^2| + |4x| + |5| < 81 + 54 + 18 + 12 + 5 = 170. We are now prepared to prove Result 12.17.
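The factorization and the bound 170 used in this strategy can be double-checked mechanically. The following sketch multiplies the two factors as coefficient lists and evaluates the triangle-inequality bound at |x| = 3; the helper names `poly_mul` and `quartic_abs_bound` are our own and purely illustrative.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# x - 2 and x^4 + 2x^3 + 2x^2 + 4x + 5, lowest degree first.
linear = [-2, 1]
quartic = [5, 4, 2, 2, 1]

# Expected to represent x^5 - 2x^3 - 3x - 10.
product = poly_mul(linear, quartic)

def quartic_abs_bound(x_max):
    """Triangle-inequality bound on |quartic(x)| valid for |x| <= x_max."""
    return sum(abs(c) * x_max ** k for k, c in enumerate(quartic))

bound_at_3 = quartic_abs_bound(3)
```

Since 1 < x < 3 is a strict inequality, the value of the bound at x = 3 is a strict upper bound on the quartic factor over that interval.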

Result 12.17  lim_{x→2} (x^5 − 2x^3 − 3x − 7) = 3.

Proof

Let ε > 0 be given and choose δ = min(1, ε/170). Let x ∈ R such that 0 < |x − 2| < δ = min(1, ε/170). Since |x − 2| < 1, it follows that 1 < x < 3 and so |x^4 + 2x^3 + 2x^2 + 4x + 5| ≤ |x^4| + |2x^3| + |2x^2| + |4x| + |5| < 170.


Since |x − 2| < ε/170, we have |(x^5 − 2x^3 − 3x − 7) − 3| = |x^5 − 2x^3 − 3x − 10| = |x − 2| · |x^4 + 2x^3 + 2x^2 + 4x + 5| < (ε/170) · 170 = ε.

Our next example involves a rational function (the ratio of two polynomials).

Result to Prove  lim_{x→1} (x^2 + 1)/(x^2 + 4) = 2/5.

Proof Strategy

First observe that

|(x^2 + 1)/(x^2 + 4) − 2/5| = |5(x^2 + 1) − 2(x^2 + 4)|/(5(x^2 + 4)) = |3x^2 − 3|/(5(x^2 + 4)) = 3|x − 1||x + 1|/(5(x^2 + 4)).

Thus we need an upper bound for 3|x + 1|/(5(x^2 + 4)). Once again, we restrict δ so that δ ≤ 1. Then |x − 1| < 1, or 0 < x < 2. So 1 < x + 1 < 3 and 3|x + 1| < 9. Also, since x > 0, it follows that 5(x^2 + 4) > 20. Hence 1/(5(x^2 + 4)) < 1/20 and therefore

3|x + 1|/(5(x^2 + 4)) < 9/20.

Result 12.18  lim_{x→1} (x^2 + 1)/(x^2 + 4) = 2/5.

Proof

Let ε > 0 be given and choose δ = min(1, 20ε/9). Let x ∈ R such that 0 < |x − 1| < δ. Since |x − 1| < 1, we have 0 < x < 2 and 1 < x + 1 < 3, so 3|x + 1| < 3 · 3 = 9. Also, 5(x^2 + 4) > 20 and so 1/(5(x^2 + 4)) < 1/20. Hence 3|x + 1|/(5(x^2 + 4)) < 9/20. Since |x − 1| < 20ε/9, it follows that

|(x^2 + 1)/(x^2 + 4) − 2/5| = |5(x^2 + 1) − 2(x^2 + 4)|/(5(x^2 + 4)) = |3x^2 − 3|/(5(x^2 + 4)) = (3|x + 1|/(5(x^2 + 4))) · |x − 1| < (9/20) · (20ε/9) = ε.

We now present another example on this topic.

Example 12.19  Determine lim_{x→1} (x^2 − 1)/(2x − 1) and verify that your answer is correct.

Solution

Since it appears that lim_{x→1} (x^2 − 1) = 0 and lim_{x→1} (2x − 1) = 1, we would expect that lim_{x→1} (x^2 − 1)/(2x − 1) = 0/1 = 0. To verify this, we need to show that for a given ε > 0 there exists


δ > 0 such that if 0 < |x − 1| < δ, then |(x^2 − 1)/(2x − 1) − 0| = |(x^2 − 1)/(2x − 1)| < ε. Observe that

|(x^2 − 1)/(2x − 1)| = |(x − 1)(x + 1)|/|2x − 1| = (|x + 1|/|2x − 1|) · |x − 1|.

Proceeding as before, we find an upper bound for |x + 1|/|2x − 1|. Ordinarily, we might restrict δ so that δ ≤ 1, as before, but in this situation there is a problem. If δ ≤ 1, then 0 < |x − 1| < δ and so |x − 1| < 1. Thus 0 < x < 2, that is, x ∈ (0, 2). This interval of real numbers contains 1/2, and |x + 1|/|2x − 1| is not defined when x = 1/2. Hence we place a tighter restriction on δ. The restriction δ ≤ 1/2 is not sufficient either, for if |x − 1| < δ ≤ 1/2, then 1/2 < x < 3/2. Although |x + 1|/|2x − 1| is defined for all real numbers x in this interval, this expression becomes arbitrarily large when x is arbitrarily close to 1/2, since |2x − 1| is then arbitrarily close to 0. That is, we cannot find an upper bound for |x + 1|/|2x − 1| if δ = 1/2. Hence we require that δ ≤ 1/4, say, and so |x − 1| < δ ≤ 1/4. Then 3/4 < x < 5/4, so |x + 1| < 9/4. Moreover, |2x − 1| > 2(3/4) − 1 = 1/2 and so 1/|2x − 1| < 2. Hence |x + 1|/|2x − 1| < (9/4) · 2 = 9/2. We now give a formal proof.
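The reason the restriction δ ≤ 1/2 fails while δ ≤ 1/4 works can be seen numerically. The sketch below, whose names and sample grids are our own illustrative choices, evaluates |x + 1|/|2x − 1| at points approaching 1/2 from the right and then over a grid inside (3/4, 5/4).

```python
def ratio(x: float) -> float:
    """The factor |x + 1| / |2x - 1| that multiplies |x - 1|."""
    return abs(x + 1) / abs(2 * x - 1)

# Values at x = 1/2 + 10^(-k): these grow without bound as k increases.
blow_up = [ratio(0.5 + 10.0 ** (-k)) for k in range(1, 6)]

# Largest value on a grid of points strictly inside (3/4, 5/4).
sup_on_quarter = max(ratio(0.75 + 0.5 * i / 1000) for i in range(1, 1000))
```

On the grid inside (3/4, 5/4) the ratio stays below 9/2, the bound obtained in the discussion, whereas near 1/2 no bound of this kind is available.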

Result 12.20  lim_{x→1} (x^2 − 1)/(2x − 1) = 0.

Proof

Let ε > 0 be given and choose δ = min(1/4, 2ε/9). Let x ∈ R such that 0 < |x − 1| < δ. Since δ ≤ 1/4, it follows that |x − 1| < 1/4 and so 3/4 < x < 5/4. Hence |x + 1| < 5/4 + 1 = 9/4. Moreover, |2x − 1| > 2(3/4) − 1 = 1/2 and so 1/|2x − 1| < 2. Thus |x + 1|/|2x − 1| < (9/4) · 2 = 9/2. Since |x − 1| < δ ≤ 2ε/9, it follows that

|(x^2 − 1)/(2x − 1) − 0| = |(x^2 − 1)/(2x − 1)| = (|x + 1|/|2x − 1|) · |x − 1| < (9/2) · (2ε/9) = ε.

Next we consider a limit problem in which the limit does not exist.

Result to Prove  lim_{x→0} 1/x does not exist.

Proof Strategy

As expected, we give a proof by contradiction. If lim_{x→0} 1/x exists, then there exists a real number L such that lim_{x→0} 1/x = L. Hence for every ε > 0, there exists δ > 0 such that


if 0 < |x| < δ, then |1/x − L| < ε. For numbers x "close" to 0, it certainly appears that 1/x is "large" (in absolute value). So, regardless of the value of L, it seems that there should be a real number x with 0 < |x| < δ such that |1/x − L| ≥ ε. It is our plan to show that this is indeed the case. We choose ε = 1, for example, and show that no desired δ can be found.

Result 12.21  lim_{x→0} 1/x does not exist.

Proof

Assume, to the contrary, that lim_{x→0} 1/x exists. Then there exists a real number L such that lim_{x→0} 1/x = L. Let ε = 1. Then there exists δ > 0 such that if x is a real number for which 0 < |x| < δ, then |1/x − L| < ε = 1. Choose an integer n such that n > ⌈1/δ⌉ ≥ 1. Since n > 1/δ, it follows that 0 < 1/n < δ. We consider two cases.

Case 1. L ≤ 0. Let x = 1/n. Then 0 < |x| < δ. Since −L ≥ 0, it follows that |1/x − L| = |n − L| = n − L ≥ n > 1 = ε, which is a contradiction.

Case 2. L > 0. Let x = −1/n. Then 0 < |x| < δ. So |1/x − L| = |−n − L| = |−(n + L)| = n + L > n > 1 = ε, which produces a contradiction in this case as well.

Result to Prove

Let f(x) = |x|/x, where x ∈ R and x ≠ 0. Then lim_{x→0} f(x) does not exist.

Proof Strategy

The graph of this function is shown in Figure 12.4. If x > 0, then f(x) = |x|/x = x/x = 1; while if x < 0, then f(x) = |x|/x = −x/x = −1. Hence there are numbers x that are "close" to 0 such that f(x) = 1 and numbers x that are "close" to 0 such that f(x) = −1. This suggests a proof.

Result 12.22  Let f(x) = |x|/x, where x ∈ R and x ≠ 0. Then lim_{x→0} f(x) does not exist.

Proof

Assume, to the contrary, that lim_{x→0} f(x) exists. Then there exists a real number L such that lim_{x→0} f(x) = L. Let ε = 1. Then there exists δ > 0 such that if x is a real number satisfying 0 < |x − 0| = |x| < δ, then |f(x) − L| < ε = 1. We consider two cases.

Case 1. L ≥ 0. Consider x = −δ/2. Then |x| = δ/2 < δ. However, f(x) = f(−δ/2) = (δ/2)/(−δ/2) = −1. So |f(x) − L| = |−1 − L| = 1 + L ≥ 1, a contradiction.

Case 2. L < 0. Let x = δ/2. Then |x| = δ/2 < δ and f(x) = f(δ/2) = (δ/2)/(δ/2) = 1. So |f(x) − L| = |1 − L| = 1 − L > 1, a contradiction.

Figure 12.4  The graph of the function f(x) = |x|/x
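The two cases of this proof amount to a recipe: for any proposed limit L and any δ > 0, one can name a point x with 0 < |x| < δ at which |f(x) − L| ≥ 1. The sketch below spells this recipe out; the function name `witness` is our own and is used only for illustration.

```python
def f(x: float) -> float:
    """f(x) = |x|/x for x != 0."""
    return abs(x) / x

def witness(L: float, delta: float) -> float:
    """Return x with 0 < |x| < delta and |f(x) - L| >= 1, as in the proof.

    Case 1 (L >= 0) uses x = -delta/2; Case 2 (L < 0) uses x = delta/2.
    """
    return -delta / 2 if L >= 0 else delta / 2

# A few proposed limits and radii; each pair yields a violating point.
checks = [
    (L, d, abs(f(witness(L, d)) - L))
    for L in (-2.0, -0.5, 0.0, 0.3, 1.0, 5.0)
    for d in (1.0, 1e-3, 1e-9)
]
```

Every entry of `checks` has third component at least 1, so no choice of δ can satisfy the definition of the limit with ε = 1.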

12.4 Basic Properties of Limits of Functions

If we were to continue computing limits, it would be essential to have some theorems at our disposal that would allow us to compute limits more rapidly. We now present some theorems that make it easier to determine limits. We begin with a standard theorem on limits of sums of functions.

Theorem to Prove  If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M, then lim_{x→a} (f(x) + g(x)) = L + M.

Proof Strategy

In this case, for a given ε > 0, we are required to show that |(f(x) + g(x)) − (L + M)| < ε if 0 < |x − a| < δ for a suitable choice of δ > 0. Now |(f(x) + g(x)) − (L + M)| = |(f(x) − L) + (g(x) − M)| ≤ |f(x) − L| + |g(x) − M|. So if we can show that both |f(x) − L| < ε/2 and |g(x) − M| < ε/2, then we will have obtained the desired inequality. Because of the hypothesis, however, this can be accomplished. We now make all of this precise.

Theorem 12.23  If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M, then lim_{x→a} (f(x) + g(x)) = L + M.

Proof

Let ε > 0 be given. Since ε/2 > 0, there exists δ1 > 0 such that if 0 < |x − a| < δ1, then |f(x) − L| < ε/2. Also, there exists δ2 > 0 such that if 0 < |x − a| < δ2, then |g(x) − M| < ε/2. Choose δ = min(δ1, δ2) and let x ∈ R such that 0 < |x − a| < δ. Since 0 < |x − a| < δ, it follows that both 0 < |x − a| < δ1 and 0 < |x − a| < δ2. Hence |(f(x) + g(x)) − (L + M)| = |(f(x) − L) + (g(x) − M)| ≤ |f(x) − L| + |g(x) − M| < ε/2 + ε/2 = ε.
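The proof of Theorem 12.23 is constructive in the sense that a δ for the sum is built from a δ for each summand. The following sketch carries this out for one concrete pair of our own choosing, f(x) = −2x + 1 and g(x) = x^2 at a = 3 (so L = −5 and M = 9), reusing the choices of δ from Results 12.14 and 12.16. All names are illustrative assumptions.

```python
def delta_f(eps: float) -> float:
    """For f(x) = -2x + 1 near a = 3: |f(x) - (-5)| = 2|x - 3| < eps."""
    return eps / 2

def delta_g(eps: float) -> float:
    """For g(x) = x^2 near a = 3, as in Result 12.16."""
    return min(1.0, eps / 7)

def delta_sum(eps: float) -> float:
    """delta = min(delta1, delta2), each obtained for eps/2."""
    return min(delta_f(eps / 2), delta_g(eps / 2))

def sum_within_eps(eps: float, samples: int = 1000) -> bool:
    """Test |(f(x) + g(x)) - (L + M)| < eps on a grid with 0 < |x - 3| < delta."""
    a, L, M = 3.0, -5.0, 9.0
    d = delta_sum(eps)
    for i in range(1, samples):
        for x in (a - d * i / samples, a + d * i / samples):
            if abs((-2 * x + 1) + x * x - (L + M)) >= eps:
                return False
    return True
```

Calling `sum_within_eps` with several values of ε, small and large, gives an informal check that the combined δ behaves as the theorem predicts.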


Theorem 12.23 states that the limit of the sum of two functions is the sum of their limits. Next we show that the corresponding statement is true for products. Before getting to this theorem, let us see what goes into its proof. Let lim_{x→a} f(x) = L and lim_{x→a} g(x) = M. This means that we can make the expressions |f(x) − L| and |g(x) − M| as small as we wish. Our goal is to show that we can make |f(x) · g(x) − LM| as small as we wish, say less than ε for any given ε > 0. The question then becomes how to use what we know about |f(x) − L| and |g(x) − M| when considering |f(x) · g(x) − LM|. A common way to do this is to add and subtract the same quantity to and from f(x) · g(x) − LM. For example,

|f(x) · g(x) − LM| = |f(x) · g(x) − f(x) · M + f(x) · M − LM| = |f(x)(g(x) − M) + (f(x) − L)M| ≤ |f(x)||g(x) − M| + |f(x) − L||M|.

If we can make each of |f(x)||g(x) − M| and |f(x) − L||M| less than ε/2, say, then we will have accomplished our goal. Since |M| is a nonnegative constant and |f(x) − L| and |g(x) − M| can be made arbitrarily small, only |f(x)| is in question. In fact, it suffices to show that f(x) can be bounded in some deleted neighborhood of a, that is, |f(x)| ≤ B for some constant B > 0.

Lemma 12.24  Suppose that lim_{x→a} f(x) = L. Then there exists δ > 0 such that if 0 < |x − a| < δ, then |f(x)| < 1 + |L|.

Proof

Let ε = 1. Then there exists δ > 0 such that if 0 < |x − a| < δ, then |f(x) − L| < 1. Thus |f(x)| = |f(x) − L + L| ≤ |f(x) − L| + |L| < 1 + |L|.

We can now show that the limit of the product of two functions is the product of their limits.

Theorem to Prove  If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M, then lim_{x→a} f(x) · g(x) = LM.

Proof Strategy

As we have already discussed,

|f(x) · g(x) − LM| = |f(x) · g(x) − f(x) · M + f(x) · M − LM| = |f(x)(g(x) − M) + (f(x) − L)M| ≤ |f(x)||g(x) − M| + |f(x) − L||M|.

For a given ε > 0, we show that each of |f(x)||g(x) − M| and |f(x) − L||M| can be made less than ε/2, which will give us a proof of the result. This is immediate for |f(x) − L||M| if M = 0. Otherwise, we can make |f(x) − L| less than ε/(2|M|). By Lemma 12.24, we can make |f(x)| less than 1 + |L|. So we make |g(x) − M| < ε/(2(1 + |L|)). We now put all the pieces together.

Theorem 12.25  If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M, then lim_{x→a} f(x) · g(x) = LM.

Proof

Let ε > 0 be given. By Lemma 12.24, there exists δ1 > 0 such that if 0 < |x − a| < δ1, then |f(x)| < 1 + |L|. Since lim_{x→a} g(x) = M, there exists δ2 > 0 such that if 0 < |x − a| < δ2, then |g(x) − M| < ε/(2(1 + |L|)). We consider two cases.

Case 1. M = 0. Choose δ = min(δ1, δ2). Let x ∈ R such that 0 < |x − a| < δ. Then

|f(x) · g(x) − LM| = |f(x) · g(x) − f(x) · M + f(x) · M − LM| = |f(x)(g(x) − M) + (f(x) − L)M| ≤ |f(x)||g(x) − M| + |f(x) − L||M| < (1 + |L|) · ε/(2(1 + |L|)) + 0 = ε/2 < ε.

Case 2. M ≠ 0. Since lim_{x→a} f(x) = L, there exists δ3 > 0 such that if 0 < |x − a| < δ3, then |f(x) − L| < ε/(2|M|). In this case, we choose δ = min(δ1, δ2, δ3). Now let x ∈ R such that 0 < |x − a| < δ. Then

|f(x) · g(x) − LM| = |f(x) · g(x) − f(x) · M + f(x) · M − LM| = |f(x)(g(x) − M) + (f(x) − L)M| ≤ |f(x)||g(x) − M| + |f(x) − L||M| < (1 + |L|) · ε/(2(1 + |L|)) + (ε/(2|M|)) · |M| = ε/2 + ε/2 = ε.

Next we consider the limit of the quotient of two functions. As before, let lim_{x→a} f(x) = L and lim_{x→a} g(x) = M. Our goal is to show that lim_{x→a} f(x)/g(x) = L/M. Of course, this is not true if M = 0; hence we must assume that M ≠ 0. To prove that lim_{x→a} f(x)/g(x) = L/M, we have to show that |f(x)/g(x) − L/M| can be made arbitrarily small. Observe that

|f(x)/g(x) − L/M| = |f(x) · M − L · g(x)|/(|g(x)||M|) = |f(x) · M − LM + LM − L · g(x)|/(|g(x)||M|) = |(f(x) − L)M + L(M − g(x))|/(|g(x)||M|) ≤ (|f(x) − L||M| + |L||M − g(x)|)/(|g(x)||M|) = |f(x) − L|/|g(x)| + |L||M − g(x)|/(|g(x)||M|).

So, to show that |f(x)/g(x) − L/M| can be made less than ε for every given positive number ε, it suffices to show that each of |f(x) − L|/|g(x)| and |L||M − g(x)|/(|g(x)||M|) can be made less than ε/2. Only 1/|g(x)| requires study. In particular, we need to show that there is an upper bound for 1/|g(x)| in some deleted neighborhood of a.

Lemma 12.26  If lim_{x→a} g(x) = M ≠ 0, then 1/|g(x)| < 2/|M| for all x in some deleted neighborhood of a.

Proof

Let ε = |M|/2. Then there exists δ > 0 such that if 0 < |x − a| < δ, then |g(x) − M| < |M|/2. Hence |M| = |M − g(x) + g(x)| ≤ |M − g(x)| + |g(x)|, so |g(x)| ≥ |M| − |M − g(x)| > |M| − |M|/2 = |M|/2. Therefore, 1/|g(x)| < 2/|M|.
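Lemma 12.26 is the general form of a step already seen in Result 12.20, where g(x) = 2x − 1, a = 1 and M = 1. For that particular g, |g(x) − M| = 2|x − 1|, so δ = |M|/4 = 1/4 guarantees |g(x) − M| < |M|/2. The sketch below checks the resulting bound 1/|g(x)| < 2/|M| on a grid; the choice δ = |M|/4 is specific to this example and the variable names are our own.

```python
def g(x: float) -> float:
    """The denominator from Result 12.20."""
    return 2 * x - 1

a, M = 1.0, 1.0

# For this g, |g(x) - M| = 2|x - a|, so delta = |M|/4 gives |g(x) - M| < |M|/2.
delta = abs(M) / 4

# Evaluate 1/|g(x)| at grid points with 0 < |x - a| < delta on both sides of a.
reciprocals = [
    1 / abs(g(a + sign * delta * i / 500))
    for i in range(1, 500)
    for sign in (-1, 1)
]
largest = max(reciprocals)
```

The largest sampled value of 1/|g(x)| remains below 2/|M| = 2, which is the bound used in the proof of Result 12.20.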

Theorem to Prove  If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M ≠ 0, then lim_{x→a} f(x)/g(x) = L/M.

Proof Strategy

Returning to our earlier discussion, we now have

|f(x)/g(x) − L/M| ≤ |f(x) − L|/|g(x)| + |L||M − g(x)|/(|g(x)||M|) < |f(x) − L| · (2/|M|) + |L||M − g(x)| · (2/|M|^2).

This suggests how small we should make |f(x) − L| and |g(x) − M| = |M − g(x)| to accomplish our goal.

Theorem 12.27  If lim_{x→a} f(x) = L and lim_{x→a} g(x) = M ≠ 0, then lim_{x→a} f(x)/g(x) = L/M.

Proof

Let ε > 0 be given. By Lemma 12.26, there exists δ1 > 0 such that if 0 < |x − a| < δ1, then 1/|g(x)| < 2/|M|. Since lim_{x→a} f(x) = L, there exists δ2 > 0 such that if 0 < |x − a| < δ2, then |f(x) − L| < |M|ε/4. We consider two cases.

Case 1. L = 0. Define δ = min(δ1, δ2). Let x ∈ R such that 0 < |x − a| < δ. Then

|f(x)/g(x) − L/M| ≤ |f(x) − L|/|g(x)| + |L||M − g(x)|/(|g(x)||M|) < (|M|ε/4) · (2/|M|) + 0 = ε/2 < ε.

Case 2. L ≠ 0. Since lim_{x→a} g(x) = M, there exists δ3 > 0 such that if 0 < |x − a| < δ3, then |g(x) − M| < |M|^2 ε/(4|L|). In this case, define δ = min(δ1, δ2, δ3). Let x ∈ R such that 0 < |x − a| < δ. Then

|f(x)/g(x) − L/M| ≤ |f(x) − L|/|g(x)| + |L||M − g(x)|/(|g(x)||M|) < (|M|ε/4) · (2/|M|) + (|L|/|M|) · (|M|^2 ε/(4|L|)) · (2/|M|) = ε/2 + ε/2 = ε.

Using Theorems 12.23, 12.25 and 12.27 and some other general results, it is possible to give simpler arguments for some of the limits we have discussed. First, we present some additional results, beginning with an observation about constant functions, followed by limits of polynomial functions defined by f(x) = x^n for some n ∈ N.

Theorem 12.28  Let a, c ∈ R. If f(x) = c for all x ∈ R, then lim_{x→a} f(x) = c.

Proof

Let ε > 0 be given and choose δ to be any positive number. Let x ∈ R such that 0 < |x − a| < δ. Since f(x) = c for all x ∈ R, it follows that |f(x) − c| = |c − c| = 0 < ε.

Theorem 12.29  Let a ∈ R. If f(x) = x for all x ∈ R, then lim_{x→a} f(x) = a.

Proof

Let ε > 0 be given and choose δ = ε. Let x ∈ R such that 0 < |x − a| < δ. Then |f(x) − a| = |x − a| < δ = ε.

We now extend the result of Theorem 12.29.

Theorem 12.30  Let n ∈ N and let f(x) = x^n for all x ∈ R. Then for each a ∈ R, lim_{x→a} f(x) = a^n.

Proof

We proceed by induction. The statement is true for n = 1 since if f(x) = x, then lim_{x→a} f(x) = a by Theorem 12.29. Assume that lim_{x→a} x^k = a^k, where k ∈ N. We show that lim_{x→a} x^(k+1) = a^(k+1). Observe that lim_{x→a} x^(k+1) = lim_{x→a} (x^k · x). By Theorems 12.25 and 12.29 and the induction hypothesis,

lim_{x→a} x^(k+1) = lim_{x→a} (x^k · x) = (lim_{x→a} x^k) · (lim_{x→a} x) = a^k · a = a^(k+1).

By the Principle of Mathematical Induction, lim_{x→a} x^n = a^n for every n ∈ N.

It is also possible to prove the following theorem by induction. We leave the proof as an exercise (Exercise 12.32).

Theorem 12.31  Let f1, f2, · · · , fn be n ≥ 2 functions such that lim_{x→a} fi(x) = Li for 1 ≤ i ≤ n. Then

lim_{x→a} (f1(x) + f2(x) + · · · + fn(x)) = L1 + L2 + · · · + Ln.

With the results we have just presented, it is possible to prove that if p(x) = c_n x^n + c_(n−1) x^(n−1) + · · · + c_1 x + c_0 is a polynomial, then

lim_{x→a} p(x) = c_n a^n + c_(n−1) a^(n−1) + · · · + c_1 a + c_0 = p(a).    (12.1)

For example, applying this to Result 12.13, we have lim_{x→4} (3x − 7) = 3 · 4 − 7 = 5. Result 12.14 can be established in the same manner. Result 12.15 cannot be established directly since lim_{x→3/2} (2x − 3) = 0. Applying what we now know to Result 12.16, we have lim_{x→3} x^2 = 3^2 = 9, and to Result 12.17,

lim_{x→2} (x^5 − 2x^3 − 3x − 7) = 2^5 − 2 · 2^3 − 3 · 2 − 7 = 3.

If r is a rational function, that is, if r(x) is the ratio p(x)/q(x) of two polynomials p(x) and q(x), such that q(a) ≠ 0 for a ∈ R, then by Theorem 12.27,

lim_{x→a} r(x) = lim_{x→a} p(x)/q(x) = (lim_{x→a} p(x))/(lim_{x→a} q(x)) = p(a)/q(a) = r(a).    (12.2)

Therefore, in Result 12.18, we have

lim_{x→1} (x^2 + 1)/(x^2 + 4) = (1^2 + 1)/(1^2 + 4) = 2/5.

Although it is simpler and certainly less time-consuming to verify certain limits with the aid of these theorems, we should also know how to verify limits by the ε−δ definition.
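Formulas (12.1) and (12.2) reduce the limits of polynomial and rational functions to evaluation at a, provided the denominator does not vanish there. The sketch below evaluates the limits from Results 12.13, 12.17 and 12.18 with exact rational arithmetic and reports that (12.2) does not apply to Result 12.15. The helper names `poly_at` and `rational_limit` are our own illustrative choices.

```python
from fractions import Fraction

def poly_at(coeffs, a):
    """Evaluate c_n x^n + ... + c_0 at a by Horner's rule.

    coeffs lists the coefficients from highest degree to lowest.
    """
    value = Fraction(0)
    for c in coeffs:
        value = value * a + c
    return value

def rational_limit(num, den, a):
    """Return p(a)/q(a) when q(a) != 0; return None when (12.2) does not apply."""
    q = poly_at(den, a)
    if q == 0:
        return None
    return poly_at(num, a) / q

limit_12_13 = poly_at([3, -7], 4)                       # 3x - 7 at x = 4
limit_12_17 = poly_at([1, 0, -2, 0, -3, -7], 2)         # x^5 - 2x^3 - 3x - 7 at x = 2
limit_12_18 = rational_limit([1, 0, 1], [1, 0, 4], 1)   # (x^2 + 1)/(x^2 + 4) at x = 1
not_direct = rational_limit([4, 0, -9], [2, -3], Fraction(3, 2))  # Result 12.15
```

A return value of `None` only signals that the denominator vanishes at a; as Result 12.15 shows, the limit may still exist and must then be found by other means.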

12.5 Continuity

Once again, let f : X → R be a function, where X ⊆ R, and let a be a real number such that f is defined in some deleted neighborhood of a. Recall that lim_{x→a} f(x) = L for some real number L if for every ε > 0, there exists δ > 0 such that if x ∈ (a − δ, a + δ) and x ≠ a, then |f(x) − L| < ε. If f is defined at a and f(a) = L, then f is said to be continuous at a. That is, f is continuous at a if lim_{x→a} f(x) = f(a). Hence a function f is continuous at a if for every ε > 0, there exists δ > 0 such that if |x − a| < δ, then |f(x) − f(a)| < ε. (Note that in this case, 0 < |x − a| < δ is replaced by |x − a| < δ.) Thus, for f to be continuous at a, three conditions must be satisfied:

(1) f is defined at a;
(2) lim_{x→a} f(x) exists;
(3) lim_{x→a} f(x) = f(a).

We now illustrate this.

Example 12.32

A function f is defined by f(x) = (x^2 − 3x + 2)/(x^2 − 1) for all x ∈ R − {−1, 1}. Is f continuous at 1 under any of the following circumstances: (a) f is not defined at 1; (b) f(1) = 0; (c) f(1) = −1/2?

Solution

For f to be continuous at 1, the function f must be defined at 1. So we can answer question (a) immediately: the answer is no. In order to answer questions (b) and (c), we must first determine whether lim_{x→1} f(x) exists. Observe that

f(x) = (x^2 − 3x + 2)/(x^2 − 1) = (x − 1)(x − 2)/((x − 1)(x + 1)) = (x − 2)/(x + 1)

when x ≠ 1. Since (x − 2)/(x + 1) is a rational function, we can apply (12.2) to obtain

lim_{x→1} f(x) = lim_{x→1} (x − 2)/(x + 1) = (lim_{x→1} (x − 2))/(lim_{x→1} (x + 1)) = −1/2.

Thus if f(1) = −1/2, then f is continuous at 1. Hence the answer to question (b) is no and the answer to (c) is yes.

As additional practice, we present an ε−δ proof that lim_{x→1} (x^2 − 3x + 2)/(x^2 − 1) = −1/2.

Result to Prove  lim_{x→1} (x^2 − 3x + 2)/(x^2 − 1) = −1/2.

Proof Strategy

Observe that

|(x^2 − 3x + 2)/(x^2 − 1) − (−1/2)| = |(x − 1)(x − 2)/((x − 1)(x + 1)) + 1/2| = |(x − 2)/(x + 1) + 1/2| = |(2(x − 2) + (x + 1))/(2(x + 1))| = |(3x − 3)/(2(x + 1))| = (3/2) · |x − 1|/|x + 1|.

If |x − 1| < 1, then 0 < x < 2 and |x + 1| > 1, so 1/|x + 1| < 1. We can now prove that lim_{x→1} f(x) = −1/2.

Result 12.33  lim_{x→1} (x^2 − 3x + 2)/(x^2 − 1) = −1/2.

Proof

Let ε > 0 be given and choose δ = min(1, 2ε/3). Let x ∈ R such that 0 < |x − 1| < δ. Since |x − 1| < 1, it follows that 0 < x < 2. So |x + 1| > 1 and 1/|x + 1| < 1. Hence

|(x^2 − 3x + 2)/(x^2 − 1) − (−1/2)| = |(x − 1)(x − 2)/((x − 1)(x + 1)) + 1/2| = |(x − 2)/(x + 1) + 1/2| = |(3x − 3)/(2(x + 1))| = (3/2) · |x − 1|/|x + 1| < (3/2) · (2ε/3) = ε.

In fact, (12.2) states that if a rational function r is defined by r(x) = p(x)/q(x), where p(x) and q(x) are polynomials such that q(a) ≠ 0, then r is continuous at a. Furthermore, (12.1) implies that if p is a polynomial function defined by p(x) = c_n x^n + c_(n−1) x^(n−1) + · · · + c_1 x + c_0, then p is continuous at every real number a. We now present some examples concerning continuity of functions that are neither polynomials nor rational functions.

Result to Prove  The function f defined by f(x) = √x for x ≥ 0 is continuous at 4.

Proof Strategy

Since f(4) = 2, it suffices to show that lim_{x→4} √x = 2. Thus |f(x) − L| = |√x − 2|. To work x − 4 into the expression √x − 2, we multiply √x − 2 by (√x + 2)/(√x + 2), obtaining

|√x − 2| = |(√x − 2)(√x + 2)|/(√x + 2) = |x − 4|/(√x + 2).

First we require that δ ≤ 1, so that |x − 4| < 1 and 3 < x < 5. Since √x + 2 > 3, it follows that 1/(√x + 2) < 1/3. Hence

|√x − 2| = |x − 4|/(√x + 2) < |x − 4|/3.

This suggests an appropriate choice for δ.

Result 12.34  The function f defined by f(x) = √x for x ≥ 0 is continuous at 4.

Proof

Let ε > 0 be given and choose δ = min(1, 3ε). Let x ∈ R such that |x − 4| < δ. Since |x − 4| < 1, it follows that 3 < x < 5 and so √x + 2 > 3. Hence 1/(√x + 2) < 1/3. Therefore,

|√x − 2| = |(√x − 2)(√x + 2)|/(√x + 2) = |x − 4|/(√x + 2) < (1/3) · (3ε) = ε.

Figure 12.5  The graph of the ceiling function f(x) = ⌈x⌉

Figure 12.5 shows the graph of the ceiling function f : R → Z defined by f(x) = ⌈x⌉. This function is not continuous at any integer but is continuous at all other real numbers. We verify the first of these observations and leave the proof of the second as an exercise (Exercise 12.37).

Result 12.35  The ceiling function f : R → Z defined by f(x) = ⌈x⌉ is not continuous at any integer.

Proof

Assume, to the contrary, that there is some integer k such that f is continuous at k. Thus lim_{x→k} f(x) = f(k) = ⌈k⌉ = k. Hence for ε = 1, there exists δ > 0 such that if |x − k| < δ, then |f(x) − f(k)| = |f(x) − k| < ε = 1. Let δ1 = min(δ, 1) and let x1 ∈ (k, k + δ1). Thus k < x1 < k + δ and k < x1 < k + 1. Hence f(x1) = ⌈x1⌉ = k + 1 and |f(x1) − k| = |(k + 1) − k| = 1 < 1, a contradiction.
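The point x1 chosen in this proof can be exhibited concretely. The sketch below picks the midpoint x1 = k + δ1/2 of the interval (k, k + δ1), which is one admissible choice among many, and measures the jump |f(x1) − f(k)| using `math.ceil` from the Python standard library. The function name `ceiling_jump` is our own.

```python
import math

def ceiling_jump(k: int, delta: float) -> int:
    """Return |f(x1) - f(k)| for f = ceiling and x1 = k + min(delta, 1)/2.

    x1 lies in (k, k + delta1) with delta1 = min(delta, 1), as in the proof.
    """
    delta1 = min(delta, 1.0)
    x1 = k + delta1 / 2
    return abs(math.ceil(x1) - math.ceil(k))

jumps = [ceiling_jump(k, d) for k in (-3, 0, 7) for d in (2.0, 0.5, 1e-6)]
```

Each jump equals 1, so no δ can force |f(x) − f(k)| < 1. Note that in floating-point arithmetic an extremely small δ relative to |k| could make k + δ1/2 round to k, which is a limitation of the sketch and not of the proof.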

12.6 Differentiability

We have discussed the existence and nonexistence of limits lim_{x→a} f(x) for functions f : X → R with X ⊆ R, where f is defined in a deleted neighborhood of the real number a and, in the case of continuity at a, whether lim_{x→a} f(x) = f(a) if f is defined in a neighborhood of a. If f is defined in a neighborhood of a, then there is an important limit concerning the ratio of the differences f(x) − f(a) and x − a.

A function f : X → R with X ⊆ R that is defined in a neighborhood of a real number a is said to be differentiable at a if lim_{x→a} (f(x) − f(a))/(x − a) exists. This limit is called the derivative of f at a and is denoted by f′(a). Therefore,

f′(a) = lim_{x→a} (f(x) − f(a))/(x − a).

Figure 12.6  Derivatives and slopes of tangent lines. Equation of the tangent line: y − f(a) = m(x − a), where m = f′(a).

You probably already know that f′(a) is the slope of the tangent line to the graph of y = f(x) at the point (a, f(a)). Indeed, if f′(a) = m, then the equation of this line is y − f(a) = m(x − a). See Figure 12.6. We illustrate derivatives with an example.

Example 12.36

Show that the function f defined by f(x) = 1/x^2 for x ≠ 0 is differentiable at 1 and determine f′(1).

Solution

We need to show that

lim_{x→1} (f(x) − f(1))/(x − 1) = lim_{x→1} (1/x^2 − 1)/(x − 1)

exists. In a deleted neighborhood of 1,

(1/x^2 − 1)/(x − 1) = ((1 − x^2)/x^2)/(x − 1) = (1 − x^2)/(x^2 (x − 1)) = (1 − x)(1 + x)/(x^2 (x − 1)) = −(1 + x)/x^2.    (12.3)

Since (1 + x)/(−x^2) is a rational function, we can again use (12.2) to see that

lim_{x→1} (1 + x)/(−x^2) = (lim_{x→1} (1 + x))/(lim_{x→1} (−x^2)) = 2/(−1) = −2

and so f′(1) = −2. We also give an ε−δ proof of this limit. For a given ε > 0, we are required to find δ > 0 such that if x ∈ R with 0 < |x − 1| < δ, then |(1/x^2 − 1)/(x − 1) − (−2)| < ε. Observe that

|(1/x^2 − 1)/(x − 1) − (−2)| = |−(1 + x)/x^2 + 2| = |(2x^2 − x − 1)/x^2| = |x − 1||2x + 1|/x^2.

If we restrict δ so that δ ≤ 1/2, then |x − 1| < 1/2 and so 1/2 < x < 3/2. Since x > 1/2, we have x^2 > 1/4 and 1/x^2 < 4. Since x < 3/2, we also have |2x + 1| < 4. Hence |x − 1||2x + 1|/x^2 < 16|x − 1|. This shows us how to choose δ. We now prove that f′(1) = −2.

Result 12.37  Let f be the function defined by f(x) = 1/x^2 for x ≠ 0. Then f′(1) = −2.

Proof

Let ε > 0 be given and choose δ = min(1/2, ε/16). Let x ∈ R such that 0 < |x − 1| < δ. Since |x − 1| < 1/2, it follows that 1/2 < x < 3/2. So x^2 > 1/4 and thus 1/x^2 < 4. Also, |2x + 1| < 4. Since |x − 1| < ε/16, it follows that

|(f(x) − f(1))/(x − 1) − (−2)| = |(1/x^2 − 1)/(x − 1) − (−2)| = |−(1 + x)/x^2 + 2| = |(2x^2 − x − 1)/x^2| = (|2x + 1|/x^2) · |x − 1| < 4 · 4 · (ε/16) = ε.

It now follows from Result 12.37 that the slope of the tangent line to the graph of y = 1/x^2 at the point (1, 1) is −2, and so the equation of this tangent line is y − 1 = −2(x − 1).

Differentiability of a function at a number a implies continuity at a, as we now show.

Theorem 12.38  If a function f is differentiable at a, then f is continuous at a.

Proof

Since f is differentiable at a, it follows that lim_{x→a} (f(x) − f(a))/(x − a) exists and equals the real number f′(a). To show that f is continuous at a, we need to show that lim_{x→a} f(x) = f(a). For x ≠ a, we write f(x) as

f(x) = ((f(x) − f(a))/(x − a)) · (x − a) + f(a).

Using properties of limits, we now have

lim_{x→a} f(x) = (lim_{x→a} (f(x) − f(a))/(x − a)) · (lim_{x→a} (x − a)) + lim_{x→a} f(a) = f′(a) · 0 + f(a) = f(a).

The converse of Theorem 12.38 is not true. For example, the functions f and g defined by f(x) = |x| and g(x) = ∛x are continuous at 0 but are not differentiable at 0. Indeed, that f is not differentiable at 0 is essentially established in Result 12.22, since (f(x) − f(0))/(x − 0) = |x|/x for x ≠ 0.
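Difference quotients give an informal numerical view of both phenomena in this section. The sketch below computes (f(a + h) − f(a))/h for f(x) = 1/x^2 at a = 1, where the values should approach f′(1) = −2 within the bound 16|h| found above, and for f(x) = |x| at a = 0, where the one-sided quotients disagree. The helper names and the step sizes are our own illustrative choices.

```python
def difference_quotient(f, a: float, h: float) -> float:
    """Return (f(a + h) - f(a)) / h for h != 0."""
    return (f(a + h) - f(a)) / h

def inv_sq(x: float) -> float:
    """f(x) = 1/x^2 for x != 0."""
    return 1 / x ** 2

# Quotients for 1/x^2 at a = 1 with h = 10^(-1), ..., 10^(-5).
steps = [10.0 ** (-k) for k in range(1, 6)]
quotients = [difference_quotient(inv_sq, 1.0, h) for h in steps]

# One-sided quotients for |x| at a = 0.
abs_right = difference_quotient(abs, 0.0, 1e-3)
abs_left = difference_quotient(abs, 0.0, -1e-3)
```

The quotients for 1/x^2 stay within 16h of −2 for each step h ≤ 1/10, matching the estimate in the proof of Result 12.37, while the right-hand and left-hand quotients of |x| at 0 are 1 and −1.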

CHAPTER 12 EXERCISES

Section 12.1: Limits of Sequences

12.1. Give an example of a sequence that is not expressed in terms of trigonometry but whose terms are exactly those of the sequence {cos(nπ)}.

12.2. Give an example of two sequences different from the sequence {n^2 − n! + |n − 2|} whose first three terms are the same as those of {n^2 − n! + |n − 2|}.

12.3. Prove that the sequence {1/(2n)} converges to 0.

12.4. Prove that the sequence {1/(n^2 + 1)} converges to 0.

12.5. Prove that the sequence {1 + 1/2^n} converges to 1.

12.6. Prove that the sequence {(n + 2)/(2n + 3)} converges to 1/2.

12.7. By definition, lim_{n→∞} a_n = L if for every ε > 0, there exists a positive integer N such that if n is an integer with n > N, then |a_n − L| < ε. By taking the negation of this definition, write out the meaning of lim_{n→∞} a_n ≠ L using quantifiers. Then write out the meaning of "{a_n} diverges" using quantifiers.

12.8. Show that the sequence {n^4} diverges to infinity.

12.9. Show that the sequence {(n^5 + 2n)/n^2} diverges to infinity.

12.10. (a) Prove that 1 + 1/2 + 1/3 + · · · + 1/n < 2√n for every positive integer n.
(b) Let s_n = 1/n + 1/(2n) + 1/(3n) + · · · + 1/n^2 for each n ∈ N. Prove that the sequence {s_n} converges to 0.

12.11. Prove that if a sequence {s_n} converges to L, then the sequence {s_(n^2)} also converges to L.

Section 12.2: Infinite Series

12.12. Prove that the series Σ_{k=1}^∞ 1/((3k − 2)(3k + 1)) converges and determine its sum by
(a) computing the first few terms of the sequence {s_n} of partial sums and conjecturing a formula for s_n;
(b) using mathematical induction to verify that your conjecture in (a) is correct;
(c) completing the proof.

12.13. Prove that the series Σ_{k=1}^∞ 1/2^k converges and determine its sum by
(a) computing the first few terms of the sequence {s_n} of partial sums and conjecturing a formula for s_n;
(b) using mathematical induction to verify that your conjecture in (a) is correct;
(c) completing the proof.

12.14. The terms a_1, a_2, a_3, · · · of the series Σ_{k=1}^∞ a_k are defined recursively by a_1 = 1/6 and a_n = a_(n−1) − 2/(n(n + 1)(n + 2)) for n ≥ 2. Prove that Σ_{k=1}^∞ a_k converges and determine its value.

12.15. Prove that the series Σ_{k=1}^∞ (k + 3)/(k + 1)^2 diverges to infinity.

12.16. (a) Prove that if Σ_{k=1}^∞ a_k is a convergent series, then lim_{n→∞} a_n = 0.
(b) Show that the converse of the result in (a) is false.

12.17. Let Σ_{k=1}^∞ a_k be an infinite series whose sequence of partial sums is {s_n}, where s_n = 3n/(4n + 2).
(a) What is the series Σ_{k=1}^∞ a_k?
(b) Determine the sum s of Σ_{k=1}^∞ a_k and prove that Σ_{k=1}^∞ a_k = s.


Section 12.3: Limits of Functions

12.18. Give an ε−δ proof that lim_{x→2} ((3/2)x + 1) = 4.

12.19. Give an ε−δ proof that lim_{x→−1} (3x − 5) = −8.

12.20. Give an ε−δ proof that lim_{x→2} (2x^2 − x − 5) = 1.

12.21. Give an ε−δ proof that lim_{x→2} x^3 = 8.

12.22. Determine lim_{x→1} 1/(5x − 4) and verify that your answer is correct with an ε−δ proof.

12.23. Give an ε−δ proof that lim_{x→3} (3x + 1)/(4x + 3) = 2/3.

12.24. Determine lim_{x→3} (x^2 − 2x − 3)/(x^2 − 8x + 15) and verify that your answer is correct with an ε−δ proof.

12.25. Show that lim_{x→0} 1/x^2 does not exist.

12.26. The function f : R → R is defined by
f(x) = 1 if x < 3,
f(x) = 1.5 if x = 3,
f(x) = 2 if x > 3.
(a) Determine whether lim_{x→3} f(x) exists and verify your answer.
(b) Determine whether lim_{x→π} f(x) exists and verify your answer.

12.27. A function g : R → R is bounded if there exists a positive real number B such that |g(x)| < B for each x ∈ R.
(a) Let g : R → R be a bounded function and suppose that f : R → R and a ∈ R such that lim_{x→a} f(x) = 0. Prove that lim_{x→a} f(x)g(x) = 0.
(b) Use the result in (a) to determine lim_{x→0} x^2 sin(1/x^2).

12.28. Suppose that lim_{x→a} f(x) = L, where L > 0. Prove that lim_{x→a} √(f(x)) = √L.

12.29. Suppose that f : R → R is a function such that lim_{x→0} f(x) = L.
(a) Let c ∈ R. Prove that lim_{x→c} f(x − c) = L.
(b) Suppose that f also has the property that f(a + b) = f(a) + f(b) for all a, b ∈ R. Use the result in (a) to prove that lim_{x→c} f(x) exists for all c ∈ R.

12.30. Let f : R → R be a function.
(a) Prove that if lim_{x→a} f(x) = L, then lim_{x→a} |f(x)| = |L|.
(b) Prove or disprove: If lim_{x→a} |f(x)| = |L|, then lim_{x→a} f(x) exists.

Section 12.4: Basic Properties of Limits of Functions

12.31. Use limit theorems to determine the following:
(a) lim_{x→1} (x^3 − 2x^2 − 5x + 8)
(b) lim_{x→1} (4x + 7)(3x^2 − 2)
(c) lim_{x→2} (2x^2 − 1)/(3x^3 + 1)

12.32. Use induction to prove that for every integer n ≥ 2 and every n functions f1, f2, · · · , fn such that lim_{x→a} fi(x) = Li for 1 ≤ i ≤ n,

lim_{x→a} (f1(x) + f2(x) + · · · + fn(x)) = L1 + L2 + · · · + Ln.

12.33. Use Exercise 12.32 to prove that lim_{x→a} p(x) = p(a) for every polynomial p(x) = c_n x^n + c_(n−1) x^(n−1) + · · · + c_1 x + c_0.

12.34. Prove that if f1, f2, · · · , fn are any n ≥ 2 functions such that lim_{x→a} fi(x) = Li for 1 ≤ i ≤ n, then

lim_{x→a} (f1(x) · f2(x) · · · fn(x)) = L1 · L2 · · · Ln.

Section 12.5: Continuity

12.35. The function f : R − {0, 2} → R is defined by f(x) = (x^2 − 4)/(x^3 − 2x^2). Use limit theorems to determine whether f can be defined at 2 such that f is continuous at 2.

12.36. The function f defined by f(x) = (x^2 − 9)/(x^2 − 3x) is not defined at 3. Is it possible to define f at 3 such that f is continuous there? Verify your answer with an ε−δ proof.

12.37. Let f : R → Z be the ceiling function defined by f(x) = ⌈x⌉. Give an ε−δ proof that if a is a real number that is not an integer, then f is continuous at a.

12.38. Show that Exercise 12.33 implies that every polynomial is continuous at every real number.

12.39. Prove that the function f : [1, ∞) → [0, ∞) defined by f(x) = √(x − 1) is continuous at x = 10.

12.40. (a) Let f : R → R be defined by
f(x) = 0 if x is rational,
f(x) = 1 if x is irrational.
In particular, f(0) = 0. Prove or disprove: f is continuous at x = 0.
(b) The problem in (a) should suggest another problem to you. State and solve such a problem.

Section 12.6: Differentiability

12.41. The function f : R → R is defined by f(x) = x^2. Determine f′(3) and verify that your answer is correct with an ε−δ proof.

12.42. The function f : R − {−2} → R is defined by f(x) = 1/(x + 2). Determine f′(1) and verify that your answer is correct with an ε−δ proof.

12.43. The function f : R → R is defined by f(x) = x^3. Determine f′(a) for a ∈ R^+ and verify that your answer is correct with an ε−δ proof.

12.44. The function f : R → R is defined by
f(x) = x^2 sin(1/x) if x ≠ 0,
f(x) = 0 if x = 0.
Determine f′(0) and verify that your answer is correct with an ε−δ proof.

ADDITIONAL EXERCISES FOR CHAPTER 12

12.45. Prove that the sequence $\left\{ \frac{n+1}{3n-1} \right\}$ converges to $\frac{1}{3}$.

12.46. Prove that $\lim_{n \to \infty} \frac{2n^2}{4n^2 + 1} = \frac{1}{2}$.

12.47. Prove that the sequence $\{1 + (-2)^n\}$ diverges.

12.48. Prove that $\lim_{n \to \infty} \left( \sqrt{n^2 + 1} - n \right) = 0$.


12.49. Prove that the sequence $\left\{ (-1)^{n+1} \frac{n}{2n+1} \right\}$ diverges.

12.50. Prove that $\lim_{n \to \infty} \frac{n}{3n + 1} = \frac{1}{3}$.

12.51. Let a, c_0, c_1 ∈ R with c_1 ≠ 0. Give an ε−δ proof that $\lim_{x \to a} (c_1 x + c_0) = c_1 a + c_0$.

12.52. Evaluate the proposed solution of the following problem.

Problem The function f : R → R is defined by

\[ f(x) = \begin{cases} \frac{x^2 - 4}{x - 2} & \text{if } x \neq 2 \\ 2 & \text{if } x = 2. \end{cases} \]

Determine whether $\lim_{x \to 2} f(x)$ exists.

Solution Consider

\[ \lim_{x \to 2} f(x) = \lim_{x \to 2} \frac{x^2 - 4}{x - 2} = \lim_{x \to 2} \frac{(x - 2)(x + 2)}{x - 2} = \lim_{x \to 2} (x + 2) = 4. \]

However, since $\lim_{x \to 2} f(x) = 4 \neq 2 = f(2)$, the limit does not exist.

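12.53. Evaluate the proposed proof of the following result.

Result The sequence $\left\{ \frac{2n}{3n + 5} \right\}$ converges to $\frac{2}{3}$.

Proof Let ε > 0. Choose $N = \left\lceil \frac{10}{9\varepsilon} - \frac{5}{3} \right\rceil$ and let n > N. Then $n > \frac{10}{9\varepsilon} - \frac{5}{3}$. So $9n > \frac{10}{\varepsilon} - 15$ and $9n + 15 > \frac{10}{\varepsilon}$. Thus $\frac{9n + 15}{10} > \frac{1}{\varepsilon}$ and $\frac{10}{9n + 15} < \varepsilon$. Now

\[ \left| \frac{2n}{3n + 5} - \frac{2}{3} \right| = \left| \frac{6n - 2(3n + 5)}{3(3n + 5)} \right| = \frac{|-10|}{9n + 15} = \frac{10}{9n + 15} < \varepsilon. \]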

12.54. Evaluate the proposed proof of the following result.

Result $\lim_{x \to 1} \frac{1}{2x - 3} = -1$.

Proof Let ε > 0 and choose δ = min(1, 7ε/2). Let 0 < |x − 1| < δ = min(1, 7ε/2). Since |x − 1| < 1, it follows that 0 < x < 2 and so |2x − 3| ≤ |2x| + |−3| = 2|x| + 3 < 4 + 3 = 7. Since |x − 1| < 7ε/2, we have

\[ \left| \frac{1}{2x - 3} + 1 \right| = \left| \frac{2x - 2}{2x - 3} \right| = \frac{2|x - 1|}{|2x - 3|} < \frac{2}{7} \cdot \frac{7\varepsilon}{2} = \varepsilon. \]

12.55. Let {a_n}, {b_n} and {c_n} be sequences of real numbers such that a_n ≤ b_n ≤ c_n for every positive integer n and $\lim_{n \to \infty} a_n = \lim_{n \to \infty} c_n = L$.
(a) Prove that $\lim_{n \to \infty} (c_n - a_n) = 0$.
(b) Prove that $\lim_{n \to \infty} b_n = L$.


12.56. In Chapter 10 it was shown that the set Q of rational numbers is countable and can therefore be expressed as Q = {q_1, q_2, q_3, ...}. A function f : R → R is defined by

\[ f(x) = \begin{cases} \frac{1}{n} & \text{if } x = q_n \ (n = 1, 2, 3, \ldots) \\ 0 & \text{if } x \text{ is irrational.} \end{cases} \]

(a) Prove that f is continuous at every irrational number.
(b) Prove that f is not continuous at any rational number.
(c) If the function f were defined as above, except that f(0) = 0, then prove that f would be continuous at 0.

13

Proofs in Group Theory

Many of the proofs we have seen involve familiar sets of numbers (particularly the integers, the rational numbers, and the real numbers). Furthermore, most of the theorems and examples we have encountered involve either additive properties of these numbers or multiplicative properties (or both). Many important properties of the integers (or of the rational or real numbers) come not from the integers themselves but from the addition and multiplication of integers. This raises a fundamental question in mathematics: given a nonempty set S, can we describe other, less familiar ways of associating an element of S with each pair of elements of S in such a way that some interesting properties appear? The mathematical subject that deals with such questions is abstract algebra (also called modern algebra or simply algebra). In this chapter we consider one of the best-known concepts in abstract algebra. First, however, we must have a clear understanding of what it means to associate an element of a given set with each pair of elements of that set, and of which properties might be considered interesting.

13.1 Binary Operations

When we add two integers a and b, we perform an operation (namely, addition) to produce an integer, which we denote by a + b. Likewise, when we multiply these two integers, we perform another operation (namely, multiplication) to produce an integer, which we denote by a · b (or ab). Both operations do something very similar: each takes a pair a, b of integers, actually an ordered pair (a, b) of integers, and assigns to that pair a unique integer. These operations are therefore functions, namely functions from Z × Z to Z. They are examples of a concept called a binary operation.

By a binary operation ∗ on a nonempty set S we mean a function from S × S to S; that is, ∗ is a function that maps every ordered pair of elements of S to an element of S. So ∗ : S × S → S. In particular, if the ordered pair (a, b) in S × S is mapped to the element c in S by a binary operation ∗ (that is, c is the image of (a, b) under ∗), then we write c = a ∗ b instead of the more cumbersome notation ∗((a, b)) = c. Thus addition + and multiplication · are binary operations on Z. For example, under addition the ordered pair (3, 5) is mapped to 3 + 5 = 8; under multiplication, (3, 5) is mapped to 3 · 5 = 15. Subtraction is also a binary operation on Z, but it is not a binary operation on N because, for example, subtraction maps the ordered pair (3, 5) to 3 − 5 = −2, which does not belong to N. Hence subtraction is not a function from N × N to N, since it is not defined at (3, 5), nor at many other ordered pairs of positive integers. Likewise, division is not a binary operation on Z or on N, since it is not defined for many ordered pairs, including (1, 0) ∈ Z × Z and (2, 3) ∈ N × N, because 1/0 ∉ Z and 2/3 ∉ N. However, division is a binary operation on the set Q+ of positive rational numbers, since the quotient of two positive rational numbers is again a positive rational number. Not only are addition and multiplication binary operations on Z, they are binary operations on Q and R as well, and also on R+ (the positive real numbers) and Q+. For the set R∗ of nonzero real numbers, multiplication is a binary operation but addition is not (since, for example, 1 + (−1) = 0 ∉ R∗).

If ∗ is a binary operation on a set S, then by definition a ∗ b ∈ S for all a, b ∈ S. If T is a nonempty subset of S and a, b ∈ T, then certainly a ∗ b ∈ S; however, a ∗ b need not belong to T. A nonempty subset T of S is said to be closed under ∗ if a ∗ b ∈ T whenever a, b ∈ T. If ∗ is a binary operation on S, then S is certainly closed under ∗. Although subtraction is a binary operation on Z, the subset N of Z is not closed under subtraction.

Familiar sets with well-known binary operations include:

(a) the set Zn = {[0], [1], ..., [n − 1]}, n ≥ 2, of residue classes modulo n, under residue class addition [a] + [b] = [a + b] and under residue class multiplication [a] · [b] = [ab] (as defined in Chapter 8);

(b) the set M2(R) of all 2 × 2 matrices over R (that is, whose entries are real numbers) under matrix addition

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} a + e & b + f \\ c + g & d + h \end{pmatrix} \]

and under matrix multiplication

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}; \]

(c) the set F_R = R^R of functions from R to R under function addition (f + g)(x) = f(x) + g(x), under function multiplication (f g)(x) = f(x) · g(x), and under function composition (f ◦ g)(x) = f(g(x));

(d) the power set P(A) of a set A under set union, under set intersection, and under set difference.

For a more abstract example of a binary operation, let S = {a, b, c}. A binary operation ∗ on S is shown in the table of Figure 13.1, where a ∗ a = b, a ∗ b = c, a ∗ c = a, and so on. Since every entry of the table belongs to S, ∗ is indeed a binary operation on S.

∗ | a  b  c
--+---------
a | b  c  a
b | a  c  a
c | c  a  b

Figure 13.1 A binary operation ∗ on S = {a, b, c}

Although it may seem relatively clear that each of the examples given above defines a binary operation, not all binary operations are so simple.

Result 13.1

For a, b ∈ R − {−2}, define a ∗ b = ab + 2a + 2b + 2, where the operations indicated in ab + 2a + 2b + 2 are ordinary addition and multiplication in R. Then ∗ is a binary operation on R − {−2}.

Proof

We must show that if a, b ∈ R − {−2}, then a ∗ b ∈ R − {−2}. Assume, to the contrary, that there is a pair x, y ∈ R − {−2} such that x ∗ y ∉ R − {−2}. Then x ∗ y = xy + 2x + 2y + 2 = −2. This equation is equivalent to xy + 2x + 2y + 4 = (x + 2)(y + 2) = 0, so x = −2 or y = −2, which is impossible. Hence ∗ is a binary operation on R − {−2}.

A nonempty set S with a binary operation ∗ is often denoted by (S, ∗). We call (S, ∗) an algebraic structure. There are certain properties that (S, ∗) may possess that will be of particular interest to us. In particular,

G1 (S, ∗) is associative if a ∗ (b ∗ c) = (a ∗ b) ∗ c for all a, b, c ∈ S;
G2 (S, ∗) has an identity element (or simply an identity) e if a ∗ e = e ∗ a = a for every a ∈ S;
G3 (S, ∗) has an identity e and for every element a ∈ S there exists an element s ∈ S, called an inverse of a, such that a ∗ s = s ∗ a = e;
G4 (S, ∗) is commutative if a ∗ b = b ∗ a for all a, b ∈ S.

Two elements a, b ∈ S are said to commute if a ∗ b = b ∗ a. If every two elements of S commute, then (S, ∗) satisfies property G4. By property G2, an identity commutes with every element of S; and by property G3, every element of S commutes with an inverse of that element (assuming, of course, that it has an inverse). An algebraic structure (S, ∗) may satisfy all, some, or none of the properties G1–G4; however, (S, ∗) cannot satisfy G3 without first satisfying G2.

Strictly speaking, the expression a ∗ b ∗ c is undefined for elements a, b, c ∈ S. Since ∗ is a binary operation, it is defined only for pairs of elements of S. There are two natural interpretations of a ∗ b ∗ c: does it mean a ∗ (b ∗ c) or does it mean (a ∗ b) ∗ c? If (S, ∗) satisfies property G1 (the associative property), then a ∗ (b ∗ c) = (a ∗ b) ∗ c, so in this case both interpretations give the same element, and it is permissible to write a ∗ b ∗ c (without parentheses).
However, we will usually continue to write a ∗ (b ∗ c) or (a ∗ b) ∗ c to emphasize the role of the parentheses, even when (S, ∗) satisfies the associative property. Certainly (Z, +) satisfies the properties G1–G4, where 0 is an identity element and −n is an inverse of the integer n. In addition, (R, ·) satisfies the properties G1, G2, and G4, where the integer 1 is an identity element. Turning to property G3, we see that every real number r has 1/r as an inverse, except 0, which has no inverse since there is no real number s such that 0 · s = s · 0 = 1. Thus (R, ·) does not satisfy G3. In the case of (R∗, ·), where R∗ is the set of all nonzero real numbers, all four properties G1–G4 are satisfied.

The algebraic structure (Zn, +), n ≥ 2, also satisfies all of the properties G1–G4, where [0] is an identity and [−a] is an inverse of [a]. On the other hand, (Zn, ·) satisfies only G1, G2, and G4, where [1] is an identity; (Zn, ·) does not satisfy G3 because, for example, there is no element [s] ∈ Zn such that [0][s] = [1] in Zn, and so [0] has no inverse.

The algebraic structure (M2(R), +) satisfies all of the properties G1–G4, where

\[ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \text{ is an identity and } \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix} \text{ is an inverse of } \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

The algebraic structure (M2(R), ·) satisfies only G1 and G2, where $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is an identity. For $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to have an inverse, the number ad − bc (the determinant of A) must be nonzero. Thus (M2(R), ·) does not satisfy G3. Furthermore,

\[ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \]

shows that (M2(R), ·) does not satisfy property G4.

The algebraic structure (S, ∗) shown in Figure 13.1 is not associative since, for example, b ∗ (b ∗ c) = b ∗ a = a while (b ∗ b) ∗ c = c ∗ c = b, and so b ∗ (b ∗ c) ≠ (b ∗ b) ∗ c. Since S contains no element e such that e ∗ x = x ∗ e = x for all x ∈ S, it follows that (S, ∗) has no identity. Furthermore, a ∗ c ≠ c ∗ a, since a ∗ c = a and c ∗ a = c. Hence (S, ∗) has none of the properties G1–G4. Let's look at another binary operation defined on S = {a, b, c}.

Example 13.2

A binary operation ∗ is defined on the set S = {a, b, c} by x ∗ y = x for all x, y ∈ S. Determine which of the properties G1 – G4 are satisfied by (S, ∗).

Solution

Let x, y and z be any three elements of S (distinct or not). Then x ∗ (y ∗ z) = x ∗ y = x, while (x ∗ y) ∗ z = x ∗ z = x. Hence (S, ∗) is associative. Now (S, ∗) has no identity, since for every element e ∈ S we have e ∗ a = e ∗ b = e, and so it is impossible that both e ∗ a = a and e ∗ b = b. Since (S, ∗) has no identity, the question of inverses does not apply here. Finally, (S, ∗) is not commutative since a ∗ b = a while b ∗ a = b.

The verification of the associative law in Example 13.2 would probably read better if we had written x ∗ (y ∗ z) = x ∗ y = x = x ∗ z = (x ∗ y) ∗ z.
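For a finite set S, each of the properties G1–G4 can be checked by brute force over all pairs and triples. The following is a minimal Python sketch of such a check, applied to the operation of Figure 13.1 and to the operation x ∗ y = x of Example 13.2. The names `check_properties` and `FIG_13_1` are illustrative choices made here, not notation from the text.

```python
from itertools import product

def check_properties(elements, op):
    """Report which of G1-G4 hold for a finite algebraic structure (S, *)."""
    # G1: associativity, checked on every ordered triple.
    g1 = all(op(a, op(b, c)) == op(op(a, b), c)
             for a, b, c in product(elements, repeat=3))
    # G2: search for an element e with a*e == e*a == a for every a.
    identity = next((e for e in elements
                     if all(op(a, e) == a and op(e, a) == a for a in elements)),
                    None)
    g2 = identity is not None
    # G3: every element needs a two-sided inverse (requires G2 first).
    g3 = g2 and all(any(op(a, s) == identity and op(s, a) == identity
                        for s in elements) for a in elements)
    # G4: commutativity, checked on every ordered pair.
    g4 = all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))
    return {"G1": g1, "G2": g2, "G3": g3, "G4": g4}

S = ["a", "b", "c"]

# The table of Figure 13.1, keyed by (row label, column label).
FIG_13_1 = {
    ("a", "a"): "b", ("a", "b"): "c", ("a", "c"): "a",
    ("b", "a"): "a", ("b", "b"): "c", ("b", "c"): "a",
    ("c", "a"): "c", ("c", "b"): "a", ("c", "c"): "b",
}

fig_13_1 = check_properties(S, lambda x, y: FIG_13_1[(x, y)])
example_13_2 = check_properties(S, lambda x, y: x)
print(fig_13_1)      # all four properties fail
print(example_13_2)  # only G1 holds
```

Such a check is only possible because S is finite; for infinite sets such as Z or R the properties must be proved, as in the examples that follow.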

Example 13.3

Let N0 be the set of nonnegative integers and consider (N0, ∗), where ∗ is the binary operation defined by a ∗ b = |a − b| for all a, b ∈ N0. Determine which of the four properties G1–G4 are satisfied by (N0, ∗).

Solution

Since 1 ∗ (2 ∗ 3) = 1 ∗ |2 − 3| = 1 ∗ 1 = |1 − 1| = 0 and (1 ∗ 2) ∗ 3 = |1 − 2| ∗ 3 = 1 ∗ 3 = |1 − 3| = 2, it follows that (1 ∗ 2) ∗ 3 ≠ 1 ∗ (2 ∗ 3) and thus (N0, ∗) is not associative. Let a ∈ N0. Then a ∗ 0 = 0 ∗ a = |a| = a and thus 0 is an identity for (N0, ∗). Since a ∗ a = |a − a| = 0 for all a ∈ N0, it follows that a is an inverse of itself. Because |a − b| = |b − a|, we have a ∗ b = b ∗ a and (N0, ∗) is commutative. Therefore (N0, ∗) satisfies the properties G2–G4 but not G1.

Example 13.4

Let ∗ be the binary operation on Z defined by a ∗ b = a + b − 1 for a, b ∈ Z, where the operations indicated in a + b − 1 are ordinary addition and subtraction. Determine which of the four properties G1–G4 are satisfied by (Z, ∗).

Solution

For integers a, b and c, a ∗ (b ∗ c) = a ∗ (b + c − 1) = a + (b + c − 1) − 1 = a + b + c − 2, while (a ∗ b) ∗ c = (a + b − 1) ∗ c = (a + b − 1) + c − 1 = a + b + c − 2. So a ∗ (b ∗ c) = (a ∗ b) ∗ c and (Z, ∗) is associative. Since a + b − 1 = b + a − 1, it follows that a ∗ b = b ∗ a for all a, b ∈ Z and hence (Z, ∗) is commutative. Let a be an integer. Note that a ∗ 1 = a + 1 − 1 = a. So 1 is an identity for (Z, ∗). For b = −a + 2 ∈ Z we have a ∗ b = a ∗ (−a + 2) = a + (−a + 2) − 1 = 1. Hence b is an inverse of a and every integer has an inverse. Therefore (Z, ∗) satisfies all four properties G1–G4.

Analysis

Let's discuss this example further. It was shown that 1 is an identity for (Z, ∗). How did we know to choose 1? In fact, this was a natural choice since we were looking for an integer e such that e ∗ a = a for every integer a. Since e ∗ a = e + a − 1 = a, it follows that e = 1. The choice of b = −a + 2 for an inverse of a results from solving a ∗ b = a + b − 1 = 1 for b.
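The conclusions of Example 13.4 can be spot-checked numerically. The sketch below tests the identity 1, the inverse −a + 2, associativity and commutativity on a finite sample of integers; the sample range and the function names `star` and `inverse` are assumptions made for illustration. A finite check of this kind is a sanity test, not a substitute for the algebraic argument above.

```python
def star(a, b):
    """The operation of Example 13.4: a * b = a + b - 1."""
    return a + b - 1

def inverse(a):
    """Candidate inverse from the Analysis: solve a + b - 1 = 1 for b."""
    return 2 - a

sample = range(-10, 11)  # an arbitrary finite window of integers

identity_ok = all(star(a, 1) == a and star(1, a) == a for a in sample)
inverse_ok = all(star(a, inverse(a)) == 1 and star(inverse(a), a) == 1
                 for a in sample)
assoc_ok = all(star(a, star(b, c)) == star(star(a, b), c)
               for a in sample for b in sample for c in sample)
comm_ok = all(star(a, b) == star(b, a) for a in sample for b in sample)

print(identity_ok, inverse_ok, assoc_ok, comm_ok)
```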

13.2 Groups

One of the most elementary yet fundamental properties of the algebraic structure (Z, +) is the ability to solve linear equations, that is, equations of the type a + x = b. By this we mean that, given integers a and b, we seek an integer x such that a + x = b. How is this equation solved? First, we know that (Z, +) has the identity element 0. Also, a has −a as an inverse. If we add −a to a + x, which is the same as adding −a to b since a + x and b are the same integer, we obtain

−a + (a + x) = −a + b.    (13.1)

Applying the associative law to (13.1), we obtain (−a + a) + x = −a + b and thus 0 + x = x = −a + b.

This tells us that if a + x = b has a solution, then the only possible solution is −a + b. It does not tell us that −a + b is actually a solution, but this is easy to verify. Setting x = −a + b, we have a + x = a + (−a + b) = (a + (−a)) + b = 0 + b = b. Since (Z, +) also satisfies the commutative law, the solution −a + b can also be written as b + (−a) = b − a.

Let's look at the corresponding linear equation when the operation is multiplication, say in (R∗, ·). Here, for a, b ∈ R∗, we seek x ∈ R∗ such that a · x = b. Recall that 1 is an identity element in (R∗, ·). If we multiply both sides of a · x = b by 1/a (which belongs to R∗), we obtain

\[ \frac{1}{a} (a \cdot x) = \frac{1}{a} \cdot b = \frac{b}{a}. \tag{13.2} \]

Applying the associative law to (13.2), we obtain

\[ \frac{1}{a} (a \cdot x) = \left( \frac{1}{a} \cdot a \right) \cdot x = 1 \cdot x = x \]

and so x = b/a. To show that b/a is actually a solution of a · x = b, we let x = b/a and obtain

\[ a \cdot x = a \cdot \frac{b}{a} = b. \]

The solutions of the two equations just discussed should look very familiar to you. However, they were given to illustrate a more general situation. Suppose we have an algebraic structure (S, ∗) in which we want to solve all linear equations; that is, for a, b ∈ S we want to show that there is an element x ∈ S such that a ∗ x = b. If (S, ∗) is associative, has an identity e, and a has an inverse s ∈ S, then a ∗ x = b gives s ∗ (a ∗ x) = s ∗ b and so x = e ∗ x = (s ∗ a) ∗ x = s ∗ (a ∗ x) = s ∗ b. To show that s ∗ b is indeed a solution of the linear equation a ∗ x = b, we let x = s ∗ b and obtain a ∗ x = a ∗ (s ∗ b) = (a ∗ s) ∗ b = e ∗ b = b, so s ∗ b is a solution.

You may have noticed that our method for solving all linear equations in an algebraic structure (S, ∗) relied on (S, ∗) satisfying the three properties G1–G3 (property G4 was not needed). Algebraic structures that satisfy the properties G1–G3 are so important in abstract algebra that they are given a special name and will be the focus of this chapter.

A group is a nonempty set G together with a binary operation ∗ that satisfies the following three properties:

G1 Associative Law: a ∗ (b ∗ c) = (a ∗ b) ∗ c for all a, b, c ∈ G;
G2 Existence of Identity: There exists an element e ∈ G such that a ∗ e = e ∗ a = a for all a ∈ G;
G3 Existence of Inverses: For every element a ∈ G there exists an element s ∈ G such that a ∗ s = s ∗ a = e.
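The recipe x = s ∗ b can be carried out mechanically in a finite group. The following Python sketch finds the identity and the inverse of a by search and then returns s ∗ b; it is illustrated on (Z7, +), with residue classes represented by the integers 0, ..., 6. The function name `solve_left` and this representation are assumptions made for the example.

```python
def solve_left(elements, op, a, b):
    """Solve a * x = b in a finite group via x = s * b, where s is the inverse of a."""
    # G2: locate the identity e.
    e = next(e for e in elements
             if all(op(e, g) == g and op(g, e) == g for g in elements))
    # G3: locate an inverse s of a.
    s = next(s for s in elements if op(a, s) == e and op(s, a) == e)
    return op(s, b)

n = 7
Zn = list(range(n))  # the integer r stands for the residue class [r]

def add_mod(x, y):
    """Residue class addition in Z_n."""
    return (x + y) % n

x = solve_left(Zn, add_mod, 3, 1)  # solve [3] + x = [1] in Z_7
print(x, add_mod(3, x))
```

Here the inverse of [3] is [4], so the sketch returns [4] + [1] = [5], and indeed [3] + [5] = [8] = [1] in Z7.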

 +  | [0] [1] [2]
----+------------
[0] | [0] [1] [2]
[1] | [1] [2] [0]
[2] | [2] [0] [1]

 +  | [0] [1] [2] [3]
----+----------------
[0] | [0] [1] [2] [3]
[1] | [1] [2] [3] [0]
[2] | [2] [3] [0] [1]
[3] | [3] [0] [1] [2]

Figure 13.2 The group tables for (Z3, +) and (Z4, +)

Hence a group is a special kind of algebraic structure (G, ∗), namely one that satisfies the properties G1–G3. When the operation ∗ is clear from context, we usually denote this group simply by G rather than by (G, ∗). An element e ∈ G that satisfies property G2 is called an identity for the group G, while an element s that satisfies property G3 for an element a is called an inverse of a. If a group G also satisfies the commutative property G4, then G is called an abelian group, named after the Norwegian mathematician Niels Henrik Abel. If a group G does not satisfy G4, then G is called a nonabelian group. We have already seen several abelian groups, namely (Z, +), (Q, +), (R, +), (Zn, +), (Q+, ·), (R+, ·), (R∗, ·) and the algebraic structure (Z, ∗) described in Example 13.4.

The order of a group G, denoted by |G|, is the cardinality of G. If the order of G is finite, then G is a finite group; whereas if G has infinitely many elements, then G is an infinite group. All of the groups listed above are infinite groups except (Zn, +), which has order n. When a finite group G has relatively few elements, we often describe the operation ∗ by means of a table, called the group table (or operation table). For example, the group tables for (Z3, +) and (Z4, +) are shown in Figure 13.2.

Although (Zn, +) is a group for every integer n ≥ 2, (Zn, ·) is not a group for any n ≥ 2 since, as already mentioned, the element [0] has no inverse. This suggests considering the set Z∗n = Zn − {[0]} = {[1], [2], ..., [n − 1]}, n ≥ 2, under multiplication. For some integers n ≥ 2, multiplication is not a binary operation on Z∗n. For example, [2] ∈ Z∗4 but [2] · [2] = [0] ∉ Z∗4. On the other hand, multiplication is a binary operation on Z∗5. In fact, (Z∗5, ·) satisfies the properties G1, G2 and G4, where [1] is an identity. Since [1][1] = [1], [2][3] = [3][2] = [1] and [4][4] = [1], each element of (Z∗5, ·) has an inverse. Therefore (Z∗5, ·) also satisfies property G3 and is consequently an abelian group.
This of course raises the question of which algebraic structures (Z∗n, ·) are groups. Perhaps the examples above suggest the answer.

Theorem to Prove

The set Z∗n, n ≥ 2, is a group under multiplication if and only if n is a prime.

PROOF STRATEGY

If n is a composite number, then there exist integers a and b such that 2 ≤ a, b ≤ n − 1 and n = ab. Then [a], [b] ∈ Z∗n and [a][b] = [n] = [0] ∉ Z∗n, so multiplication is not a binary operation on Z∗n. Therefore the only possibility for (Z∗n, ·) to be a group is for n to be a prime. So suppose that p is a prime. First we must check that multiplication is in fact a binary operation on Z∗p; that is, if [a], [b] ∈ Z∗p, then [a][b] ∈ Z∗p. If [ab] ∉ Z∗p, then [ab] = [0]; so ab ≡ 0 (mod p), which implies that p | ab. By Corollary 11.14, p | a or p | b and so [a] = [0] or [b] = [0]. That is, either [a] ∉ Z∗p or [b] ∉ Z∗p, a contradiction.

Multiplication of residue classes is associative and [1] is an identity, so to show that (Z∗p, ·) is a group it only remains to verify that property G3 is satisfied. Let r be an integer with 1 ≤ r ≤ p − 1. We need to show that [r] has an inverse; that is, there is [s] ∈ Z∗p with [r][s] = [1]. Since p is a prime and 1 ≤ r ≤ p − 1, the integers r and p are relatively prime. By Theorem 11.12, the integer 1 is a linear combination of r and p. Hence there are integers x and y such that 1 = rx + py. Using the definition of addition and multiplication in Zp and noting that [p] = [0] in Zp, we have [1] = [rx + py] = [rx] + [py] = [r][x] + [p][y] = [r][x] + [0][y] = [r][x] + [0] = [r][x]. So [x] is an inverse of [r].

We now give a concise proof of the theorem.

Theorem 13.5

The set Z∗n, n ≥ 2, is a group under multiplication if and only if n is a prime.

Proof

Assume first that n is a composite number. Then there exist integers a and b with 2 ≤ a, b ≤ n − 1 and n = ab. Hence [a], [b] ∈ Z∗n and [a][b] = [n] = [0] ∉ Z∗n, which implies that multiplication is not a binary operation on Z∗n. Certainly then, Z∗n is not a group under multiplication.

For the converse, assume that p is a prime. First we show that multiplication is a binary operation on Z∗p. Assume, to the contrary, that this is not the case. Then there exist [a], [b] ∈ Z∗p with [a][b] ∉ Z∗p. Since [a][b] ∉ Z∗p, it follows that [a][b] = [ab] = [0]. So ab ≡ 0 (mod p) and thus p | ab. By Corollary 11.14, p | a or p | b. Hence [a] = [0] or [b] = [0], contradicting the fact that [a], [b] ∈ Z∗p. Since multiplication of residue classes is associative and [1] is an identity, (Z∗p, ·) is an algebraic structure that satisfies the properties G1 and G2.

It remains to show that (Z∗p, ·) satisfies property G3. Let [r] ∈ Z∗p, where we may assume that 1 ≤ r ≤ p − 1. Since r and p are relatively prime, 1 is a linear combination of r and p by Theorem 11.12. So 1 = rx + py for some integers x and y. Thus [1] = [rx + py] = [rx] + [py] = [r][x] + [p][y] = [r][x] + [0][y] = [r][x]. In particular [x] ≠ [0], so [x] ∈ Z∗p. Thus [x] is an inverse of [r] and (Z∗p, ·) is a group.

By Theorem 13.5, (Z∗p, ·) is an abelian group of order p − 1 for every prime p. Another example of an abelian group is (G, ∗), where G = {a, b, c} and ∗ is defined in Figure 13.3. It is not difficult to see that a is an identity for (G, ∗) and that a, b, and c are inverses of a, c, and b, respectively. Since a ∗ b = b ∗ a, a ∗ c = c ∗ a and b ∗ c = c ∗ b, it follows that G is abelian, provided that G is a group. To show that G is a group, we must verify one additional property, namely the associative property. What we need to show is that x ∗ (y ∗ z) = (x ∗ y) ∗ z for all x, y, z ∈ G. Since there are three possibilities for each of x, y, and z, there are 27 equalities to verify. For example, since b ∗ (c ∗ b) = b ∗ a = b and (b ∗ c) ∗ b = a ∗ b = b, it follows that b ∗ (c ∗ b) = (b ∗ c) ∗ b. Since the remaining 26 equalities can also be verified, G is indeed an abelian group.

∗ | a  b  c
--+---------
a | a  b  c
b | b  c  a
c | c  a  b

Figure 13.3 An abelian group with three elements
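The two halves of the proof of Theorem 13.5 can be illustrated computationally. In the Python sketch below, `is_closed` tests whether multiplication is a binary operation on Z∗n, and `inverse_mod` finds the inverse of [r] in Z∗p by writing 1 as a linear combination rx + py, as in the proof. Both function names, the recursive extended Euclidean algorithm used to obtain x and y, and the representation of [r] by the integer r are choices made for this example, not part of the text.

```python
def is_closed(n):
    """True if multiplication mod n is a binary operation on Z_n^* = {1, ..., n-1}."""
    return all((a * b) % n != 0 for a in range(1, n) for b in range(1, n))

def extended_gcd(r, p):
    """Return (g, x, y) with g = gcd(r, p) = r*x + p*y."""
    if p == 0:
        return r, 1, 0
    g, x, y = extended_gcd(p, r % p)
    return g, y, x - (r // p) * y

def inverse_mod(r, p):
    """Inverse of [r] in Z_p^*, read off from 1 = r*x + p*y."""
    g, x, _ = extended_gcd(r, p)
    if g != 1:
        raise ValueError("r and p are not relatively prime")
    return x % p

# Values of n in 2..19 for which multiplication is closed on Z_n^*.
closed = [n for n in range(2, 20) if is_closed(n)]
# Inverses in Z_5^*, to compare with [2][3] = [1] and [4][4] = [1].
inverses_mod_5 = {r: inverse_mod(r, 5) for r in range(1, 5)}

print(closed)
print(inverses_mod_5)
```

In this range the values of n that pass the closure test are exactly the primes, in agreement with the theorem.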

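We have already mentioned that the algebraic structure (M2(R), ·) does not satisfy property G3, although it does satisfy properties G1 and G2. The matrix $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is an identity for (M2(R), ·). Furthermore, a matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_2(R)$ has an inverse if and only if its determinant det A = ad − bc ≠ 0. In this case

\[ B = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \]

is an inverse of A and so AB = BA = I. Let M2∗(R) = {A ∈ M2(R) : det A ≠ 0}. Since det(AB) = det(A) · det(B) for all A, B ∈ M2(R), it follows that if A, B ∈ M2∗(R), then AB ∈ M2∗(R), and hence M2∗(R) is closed under matrix multiplication. Since, in addition, I ∈ M2∗(R) and the inverse of each matrix in M2∗(R) again has nonzero determinant, it follows that (M2∗(R), ·) is a group. On the other hand, since the matrices

\[ A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \]

belong to M2∗(R) and

\[ AB = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \neq \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} = BA, \]

it follows that (M2∗(R), ·) is a nonabelian group.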
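The computation showing that (M2∗(R), ·) is nonabelian is easy to reproduce. The following Python sketch multiplies the matrices with rows (1, 1), (1, 0) and (0, 1), (1, 1) in both orders and also checks the determinant identity det(AB) = det(A) · det(B) for this pair. Storing a matrix as a pair of row tuples, and the helper names `matmul2` and `det2`, are assumptions made for illustration.

```python
def matmul2(A, B):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def det2(A):
    """Determinant ad - bc of a 2x2 matrix."""
    (a, b), (c, d) = A
    return a * d - b * c

A = ((1, 1), (1, 0))
B = ((0, 1), (1, 1))

AB = matmul2(A, B)
BA = matmul2(B, A)

print(AB, BA, AB == BA)
print(det2(A), det2(B), det2(AB))
```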

13.3 Permutation Groups

One of the most important classes of groups concerns a concept introduced in Chapter 9. Recall that a permutation of a nonempty set A is a bijective function f : A → A; that is, f is one-to-one and onto. In Chapter 9 it was shown that:

(1) the composition of any two permutations of A is a permutation of A;
(2) composition of permutations of A is an associative operation;
(3) the identity function i_A : A → A defined by i_A(a) = a for all a ∈ A is a permutation of A;
(4) every permutation of A has an inverse, which is also a permutation of A.

By a permutation group we mean a group (G, ◦), where G is a set of permutations of a set A and ◦ denotes composition. Let S_A be the set of all permutations of A. Then by (1)–(4) above we have the following result.

Theorem 13.6

For every nonempty set A, the algebraic structure (S_A, ◦) is a permutation group.

The group (S_A, ◦) is called the symmetric group on A. Therefore every symmetric group is a permutation group. We have already noted that for the set F_R of all functions from R to R, (F_R, ◦) is an algebraic structure, where ◦ denotes function composition. By Theorem 13.6, for the set S_R of bijective functions from R to R, the algebraic structure (S_R, ◦) is a group, namely the symmetric group on R. The identity function i_R in S_R, defined by i_R(x) = x for every x ∈ R, is an identity in the group (S_R, ◦). The functions f and g defined by f(x) = x + 1 and g(x) = 2x for all x ∈ R belong to S_R; however, (f ◦ g)(x) = f(g(x)) = f(2x) = 2x + 1 and (g ◦ f)(x) = g(f(x)) = g(x + 1) = 2x + 2. Since f ◦ g ≠ g ◦ f, it follows that (S_R, ◦) is a nonabelian group.

If A = {1, 2, ..., n}, where n ∈ N, the group S_A is commonly denoted by Sn. The group (Sn, ◦) has n! elements (see Theorem 9.7) and is called the symmetric group (of degree n). The symmetric group (Sn, ◦) is thus a finite group of order n!. In the notation introduced in Chapter 9, S3 = {α1, α2, α3, α4, α5, α6}, where

\[ \alpha_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} \quad \alpha_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} \quad \alpha_3 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \]

\[ \alpha_4 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \quad \alpha_5 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \quad \alpha_6 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}. \]

Recall that in this notation for a permutation, an element of {1, 2, 3} listed in the first row is mapped to the element directly below it in the second row. So α1 is the identity of S3. Let's consider the composition α3 ◦ α6. For example, (α3 ◦ α6)(1) = α3(α6(1)) = α3(3) = 1. Also, (α3 ◦ α6)(2) = 3 and (α3 ◦ α6)(3) = 2. Therefore

\[ \alpha_3 \circ \alpha_6 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \circ \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} = \alpha_2. \]

Similarly, α6 ◦ α3 = α4. So α3 ◦ α6 ≠ α6 ◦ α3. Therefore (S3, ◦) is a nonabelian group. This shows that there is a nonabelian group of order 6. However, there is no nonabelian group of order less than 6. When we take the composition of two elements α, β ∈ Sn, where n ∈ N, we often write α ◦ β as αβ and say that we are multiplying α and β. With this notation we have α3 α6 = α2, α6 α3 = α4, α3^2 = α3 α3 = α1 and α6^2 = α6 α6 = α5. The group table for (S3, ◦) is shown in Figure 13.4 (with all 36 products!).

◦  | α1 α2 α3 α4 α5 α6
---+------------------
α1 | α1 α2 α3 α4 α5 α6
α2 | α2 α1 α5 α6 α3 α4
α3 | α3 α6 α1 α5 α4 α2
α4 | α4 α5 α6 α1 α2 α3
α5 | α5 α4 α2 α3 α6 α1
α6 | α6 α3 α4 α2 α1 α5

Figure 13.4 The group table for (S3, ◦)
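The 36 products in the group table for (S3, ◦) can be regenerated by composing the six permutations directly. In the Python sketch below, each permutation is stored as the tuple of its images of 1, 2, 3 (the second row of the two-row notation); this encoding and the names `ALPHA`, `INDEX`, `compose` and `table` are assumptions made for illustration.

```python
# alpha_i is stored as (alpha_i(1), alpha_i(2), alpha_i(3)).
ALPHA = {
    1: (1, 2, 3), 2: (1, 3, 2), 3: (3, 2, 1),
    4: (2, 1, 3), 5: (2, 3, 1), 6: (3, 1, 2),
}
INDEX = {perm: i for i, perm in ALPHA.items()}

def compose(f, g):
    """Return f o g: apply g first and then f."""
    return tuple(f[g[x] - 1] for x in range(len(g)))

# table[i][j] is the index k for which alpha_i o alpha_j = alpha_k.
table = {i: {j: INDEX[compose(ALPHA[i], ALPHA[j])] for j in ALPHA}
         for i in ALPHA}

# Print the rows of the group table, one row per alpha_i.
for i in ALPHA:
    print(f"a{i} |", " ".join(f"a{table[i][j]}" for j in ALPHA))

# Each row and each column lists every element exactly once.
rows_ok = all(sorted(table[i].values()) == list(ALPHA) for i in ALPHA)
cols_ok = all(sorted(table[i][j] for i in ALPHA) == list(ALPHA) for j in ALPHA)
print(rows_ok, cols_ok)
```

The row and column check at the end reflects a property shared by every group table: no element is repeated within a row or within a column.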

A permutation group need not consist of all permutations of a set A. For example, if we consider the subsets G1 = {α1, α2} and G2 = {α1, α5, α6} of S3, then (G1, ◦) and (G2, ◦) are both permutation groups. Their group tables are shown in Figure 13.5. Furthermore, let

\[ \beta_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix} \quad \beta_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 4 \end{pmatrix} \quad \beta_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix} \quad \beta_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix} \]

be permutations of the set {1, 2, 3, 4} and let G3 = {β1, β2, β3, β4}. Then (G3, ◦) is an abelian permutation group, whose group table is shown in Figure 13.6.

◦  | α1 α2
---+------
α1 | α1 α2
α2 | α2 α1

◦  | α1 α5 α6
---+---------
α1 | α1 α5 α6
α5 | α5 α6 α1
α6 | α6 α1 α5

Figure 13.5 The group tables for (G1, ◦) and (G2, ◦)

◦  | β1 β2 β3 β4
---+------------
β1 | β1 β2 β3 β4
β2 | β2 β1 β4 β3
β3 | β3 β4 β1 β2
β4 | β4 β3 β2 β1

Figure 13.6 The group table for (G3, ◦)

In the second half of the 18th century, a major problem in mathematics was whether the roots of every fifth-degree polynomial with real coefficients could be expressed in terms of radicals and the usual arithmetic operations. It was well known that the roots of a quadratic polynomial ax^2 + bx + c, where a, b, c ∈ R and a ≠ 0, are $(-b + \sqrt{b^2 - 4ac})/2a$ and $(-b - \sqrt{b^2 - 4ac})/2a$, which is a consequence of the quadratic formula. In addition, it had been known since the 16th century that the roots of all third-degree (cubic) and fourth-degree (quartic) polynomials with real coefficients can be expressed in terms of radicals and the usual arithmetic operations. Fifth-degree polynomials, however, turned out to be a different story. In 1824, Niels Henrik Abel proved that the roots of fifth-degree polynomials with real coefficients cannot, in general, be expressed in this way. It follows that for every integer n ≥ 5 there are polynomials of degree n with real coefficients whose roots cannot be expressed in terms of radicals and the usual arithmetic operations. This went unnoticed, however, until after his death at the age of 26.

Some time later, the French mathematician Évariste Galois characterized those polynomials of degree 5 and higher whose roots can be expressed in terms of radicals and the usual arithmetic operations. Like Abel, Galois died very young (at the age of 20), but in his case from an unlikely cause: a duel. Galois' work also went unrecognized until 11 years after his death, when Joseph Liouville addressed the Paris Academy of Sciences: "I hope to interest the Academy in announcing that among the papers of Évariste Galois I have found a solution, as precise as it is profound, of this beautiful problem: whether or not it is possible to solve by radicals." In developing his theory, Galois associated with a given polynomial a set G of permutations of the roots of the polynomial. This set G had the property that whenever s, t ∈ G, the composition s ◦ t ∈ G; that is, G was closed under composition. He called G a group, a notion that now has a firm and prominent place in abstract algebra.

13.4 Basic Properties of Groups

Let us now consider some properties that all groups possess. Every property satisfied by all groups must be a consequence of the properties G1–G3. Unless otherwise indicated, the symbol e denotes an identity in the group under consideration. A simple but important property satisfied by every group (G, ∗) allows us to cancel a in a ∗ b = a ∗ c and conclude that b = c. Since a group need not be abelian, there are two such cancellation properties.

Theorem 13.7

Every group (G, ∗) satisfies:
(a) The Left Cancellation Law: Let a, b, c ∈ G. If a ∗ b = a ∗ c, then b = c.
(b) The Right Cancellation Law: Let a, b, c ∈ G. If b ∗ a = c ∗ a, then b = c.

Proof

We only prove (a). (The proof of (b) is similar. See Exercise 13.21.) Assume that a ∗ b = a ∗ c. Let s be an inverse of a. Then s ∗ (a ∗ b) = s ∗ (a ∗ c). Now s ∗ (a ∗ b) = (s ∗ a) ∗ b = e ∗ b = b, while s ∗ (a ∗ c) = (s ∗ a) ∗ c = e ∗ c = c. So b = c.

The last two sentences of this proof could have been replaced by: Then b = e ∗ b = (s ∗ a) ∗ b = s ∗ (a ∗ b) = s ∗ (a ∗ c) = (s ∗ a) ∗ c = e ∗ c = c.

The next result will not surprise you.

Theorem 13.8

Let (G, ∗) be a group and let a, b ∈ G. The linear equations a ∗ x = b and x ∗ a = b have unique solutions in G.

Proof

We prove only that a ∗ x = b has a unique solution. (The remaining part of the proof is Exercise 13.22.) Let e be an identity for G, let s be an inverse of a, and let x = s ∗ b. Then a ∗ x = a ∗ (s ∗ b) = (a ∗ s) ∗ b = e ∗ b = b. Thus x = s ∗ b is a solution of the equation a ∗ x = b. It remains to show that s ∗ b is the only solution of a ∗ x = b. Suppose that x1 and x2 are both solutions of a ∗ x = b. Then a ∗ x1 = b and a ∗ x2 = b. So a ∗ x1 = a ∗ x2. Applying the Left Cancellation Law (Theorem 13.7(a)), we have x1 = x2.

The preceding theorem provides some interesting information. Suppose we have a group table for a group G and we consider the row corresponding to the element a. This row contains the elements a ∗ g for all g ∈ G. Let b ∈ G. By Theorem 13.8 there is x ∈ G with a ∗ x = b. That is, the element b must appear in the row corresponding to a. This is illustrated in Figure 13.7. On the other hand, b cannot appear twice in this row because the equation a ∗ x = b has a unique solution. From this we conclude that each element of G occurs exactly once in each row of the group table of G. From the equation x ∗ a = b we can likewise conclude that each element of G occurs exactly once in each column of the table of G.

Figure 13.7  The equation a ∗ x = b

As with composition in a symmetric group, it is customary in a group G to refer to the binary operation ∗ as multiplication and, for simplicity of notation, to denote the product of elements a and b in G by ab rather than a ∗ b. So we write a ∗ a = aa = a². The only exception to this practice is when we have a group whose operation is addition. In this case we continue to use + as the operation. It is also customary never to use + as the operation when the group is nonabelian. Let us use this newly adopted notation to present a theorem showing that every group G has a unique identity and that each element of G has a unique inverse, two facts you may have already guessed to be true.

Theorem 13.9

Let G be a group. Then (a) G has a unique identity and (b) every element in G has a unique inverse.

Proof

Assume that e and f are two identities in G. Since e is an identity, ef = f; and since f is an identity, ef = e. So e = ef = f. This verifies (a). Next, let g ∈ G and assume that s and t are both inverses of g. So gs = sg = e and gt = tg = e. Thus s = se = s(gt) = (sg)t = et = t, which verifies (b).

It is customary to denote the (unique) inverse of an element a in a group by a⁻¹. When the operation of the group under consideration is addition, we follow the standard practice of denoting the identity by 0 and the inverse of a by −a. We now present two theorems involving inverses in a group.

Theorem 13.10

Let G be a group. If a ∈ G, then (a⁻¹)⁻¹ = a.

Proof

Since aa⁻¹ = a⁻¹a = e, the element a is the inverse of a⁻¹; that is, (a⁻¹)⁻¹ = a.

For elements a and b in a group, the next theorem establishes a connection among the inverses a⁻¹, b⁻¹ and (ab)⁻¹.

Theorem to Prove

Let G be a group. For a, b ∈ G, we have (ab)⁻¹ = b⁻¹a⁻¹.

PROOF STRATEGY

Suppose the group under consideration is G. For a, b ∈ G, their product is ab ∈ G. Since ab ∈ G, the element ab has an inverse; indeed, a unique inverse. The inverse of ab is denoted by (ab)⁻¹. The theorem says that the inverse of ab is the element b⁻¹a⁻¹ of G. To show that an element s ∈ G is the inverse of x ∈ G, we must show that sx = xs = e. So to show that b⁻¹a⁻¹ is the inverse of ab, we need to show that (ab)(b⁻¹a⁻¹) = (b⁻¹a⁻¹)(ab) = e.

Theorem 13.11

Let G be a group. If a, b ∈ G, then (ab)⁻¹ = b⁻¹a⁻¹.

Proof

To show that b⁻¹a⁻¹ is the inverse of ab, it suffices to show that (ab)(b⁻¹a⁻¹) = e and (b⁻¹a⁻¹)(ab) = e. We verify only the first equality, since the proof of the second is similar. Note that (ab)(b⁻¹a⁻¹) = ((ab)b⁻¹)a⁻¹ = (a(bb⁻¹))a⁻¹ = (ae)a⁻¹ = aa⁻¹ = e.

In words, Theorem 13.11 says that the inverse of the product of two elements in a group is the product of their inverses in reverse order. If G is an abelian group and a, b ∈ G, then of course (ab)⁻¹ = b⁻¹a⁻¹ = a⁻¹b⁻¹. (See Exercise 13.25.)
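The properties established in this section can be checked by brute force on a small nonabelian group. The following is only an illustrative sketch, not part of the text's development: it models S3 as tuples of images of the points 0, 1, 2, and the helper names (`compose`, `inverse`, `table_is_latin_square`, and so on) are our own choices.

```python
from itertools import permutations

# S3 modeled as tuples p, where p[i] is the image of the point i in {0, 1, 2}.
# (Assumption: 0-based points are used only for convenient indexing.)
S3 = list(permutations(range(3)))

def compose(p, q):
    """Return p ∘ q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def table_is_latin_square(elements, op):
    """Each element occurs exactly once in every row and every column of the table."""
    target = sorted(elements)
    rows = all(sorted(op(a, g) for g in elements) == target for a in elements)
    cols = all(sorted(op(g, a) for g in elements) == target for a in elements)
    return rows and cols

def identities(elements, op):
    """All e with eg = ge = g for every g (Theorem 13.9 predicts exactly one)."""
    return [e for e in elements if all(op(e, g) == g == op(g, e) for g in elements)]

def reverse_order_law_holds(elements, op, inv):
    """Check (ab)^-1 = b^-1 a^-1 for all a, b (Theorem 13.11)."""
    return all(inv(op(a, b)) == op(inv(b), inv(a))
               for a in elements for b in elements)

def same_order_law_holds(elements, op, inv):
    """Check (ab)^-1 = a^-1 b^-1 for all a, b; this should fail in a nonabelian group."""
    return all(inv(op(a, b)) == op(inv(a), inv(b))
               for a in elements for b in elements)

print(table_is_latin_square(S3, compose))
print(identities(S3, compose))
print(reverse_order_law_holds(S3, compose, inverse))
print(same_order_law_holds(S3, compose, inverse))
```

A finite check of this kind proves nothing about groups in general, but it is a quick way to see why the order of the factors in Theorem 13.11 matters when the group is not abelian.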

13.5 Subgroups

There have been occasions when we considered a group (G, ∗) and a subset H of G such that H is a group under the same operation ∗; that is, (H, ∗) is also a group. If (G, ∗) is a group and H is a subset of G such that (H, ∗) is a group, then (H, ∗) is called a subgroup of G. For example, (Z, +) is a subgroup of (Q, +), which in turn is a subgroup of (R, +). Furthermore, the groups G1 and G2 in Figure 13.5 are subgroups of S3, while the group G3 in Figure 13.6 is a subgroup of S4. If (G, ∗) is a group with identity e, then ({e}, ∗) and (G, ∗) are always subgroups of (G, ∗). So if G has at least two elements, then G has at least two subgroups.

The group (2Z, +) of even integers under addition is a subgroup of (Z, +). To see this, first note that 2Z ⊆ Z. Since the sum of two even integers is an even integer, 2Z is closed under addition. Since the associative law of addition holds in Z, it also holds in 2Z. The identity of (Z, +) is 0. Since 0 is an even integer, 0 ∈ 2Z. Finally, the additive inverse (the negative) of an even integer is an even integer. So (2Z, +) is a group. In fact, showing that (2Z, +) is a subgroup of (Z, +) requires less work than showing directly that it is a group. This observation applies to all subgroups.

Theorem 13.12

(The Subgroup Test) A nonempty subset H of a group G is a subgroup of G if and only if (1) ab ∈ H for all a, b ∈ H and (2) a⁻¹ ∈ H for all a ∈ H.

Proof

We first show that if H is a subgroup of G, then properties (1) and (2) are satisfied. Since H is closed under multiplication, property (1) certainly holds. Next we show that the identity e of G is also the identity of H. Let f be the identity of H. So ff = f. Since e is the identity of G, it follows that fe = f. So ff = fe. By the Left Cancellation Law (Theorem 13.7(a)), f = e. Hence, as claimed, the identity of G is also the identity of H. Next let a ∈ H. Since a ∈ G, the inverse a⁻¹ of a belongs to G and aa⁻¹ = e. It remains to show that a⁻¹ ∈ H. Since H is a subgroup of G, the element a has an inverse a′ in H. So aa′ = e in H and therefore aa′ = e in G as well. Hence aa′ = aa⁻¹. Again by the Left Cancellation Law, a′ = a⁻¹ and thus property (2) is satisfied.

Next we verify the converse, that is, if H is a nonempty subset of G that satisfies properties (1) and (2), then H is a subgroup of G. Let a, b, c ∈ H. Since H ⊆ G, it follows that a, b, c ∈ G. Since G is a group, a(bc) = (ab)c by property G1. So multiplication in H is associative. By property (1), H is closed under multiplication, and by property (2), every element of H has an inverse in H. It remains only to show that H contains an identity. Because H ≠ ∅, there is an element a ∈ H. By (2), a⁻¹ ∈ H; and by (1), aa⁻¹ = e ∈ H. Since H contains the identity e of G, it follows that xe = ex = x for all x ∈ H and hence e is also the identity of H.
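For a finite subset of a known finite group, the two conditions of the Subgroup Test can be checked mechanically. The sketch below is only an illustration under our own assumptions: the ambient group is taken to be (Z12, +), represented by the integers 0 through 11 with addition modulo 12, and the function names are hypothetical.

```python
N = 12  # assumption: we work inside (Z12, +), with classes represented by 0..11

def add(a, b):
    """The group operation of (Z12, +)."""
    return (a + b) % N

def neg(a):
    """The inverse of a in (Z12, +)."""
    return (-a) % N

def is_subgroup(H, op, inv):
    """Subgroup Test: H is nonempty, closed under op, and closed under inverses."""
    H = set(H)
    if not H:
        return False
    closed = all(op(a, b) in H for a in H for b in H)
    has_inverses = all(inv(a) in H for a in H)
    return closed and has_inverses

print(is_subgroup({0, 3, 6, 9}, add, neg))   # the multiples of 3
print(is_subgroup({0, 3, 6}, add, neg))      # 3 + 6 = 9 is missing
print(is_subgroup({0}, add, neg))            # the trivial subgroup
```

Note that the function never checks associativity or searches for an identity; as in the proof above, both come for free from the ambient group.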


We now illustrate the Subgroup Test.

Result to Prove

Let H = { $\begin{pmatrix} a & b \\ c & 0 \end{pmatrix}$ : a, b, c ∈ R }. Then (H, +) is a group.

PROOF STRATEGY

The elements of H are matrices, in fact matrices in M2(R). Indeed, a matrix in M2(R) belongs to H if and only if its entry in row 2, column 2 is 0. We have already seen that (M2(R), +) is a group. Since H uses the same operation as M2(R), namely addition, it is convenient to prove that (H, +) is a group by means of the Subgroup Test (Theorem 13.12). To use the Subgroup Test, we first need to know that the set H is nonempty. Since the zero matrix $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ belongs to H, we need only show that conditions (1) and (2) of Theorem 13.12 are satisfied, that is, that H is closed under addition and that if A is a matrix in H, then its inverse (in this case its negative) −A also belongs to H. This is relatively routine.

Result 13.13

Let H = { $\begin{pmatrix} a & b \\ c & 0 \end{pmatrix}$ : a, b, c ∈ R }. Then (H, +) is a group.

Proof

We show, in fact, that (H, +) is a subgroup of (M2(R), +). Certainly H is a nonempty subset of M2(R), since the zero matrix $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ belongs to H. Let A, B ∈ H. Then

$A = \begin{pmatrix} a_1 & a_2 \\ a_3 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} b_1 & b_2 \\ b_3 & 0 \end{pmatrix}$,

where $a_i, b_i \in \mathbf{R}$ (1 ≤ i ≤ 3). So

$A + B = \begin{pmatrix} a_1 + b_1 & a_2 + b_2 \\ a_3 + b_3 & 0 \end{pmatrix} \in H$

and the inverse of A is

$-A = \begin{pmatrix} -a_1 & -a_2 \\ -a_3 & 0 \end{pmatrix} \in H$.

Hence, by the Subgroup Test, H is a subgroup of M2(R) and so (H, +) is a group.

If G is an abelian group, then we know that every two elements of G commute. Even if G is not abelian, we know that its identity commutes with all elements of G. There may very well be other elements of G that commute with all elements of G. The set of all elements in a group G that commute with every element of G is called the center of G, and it is in fact always a subgroup of G. This subgroup is usually denoted by Z(G). Since Z(G) = G when G is abelian, the center is more interesting when G is nonabelian.

Result to Prove

For a group G, the center Z(G) = {a ∈ G : ga = ag for all g ∈ G} is a subgroup of G.

PROOF STRATEGY

To prove this result, it makes sense to use the Subgroup Test. Since e ∈ Z(G), it follows that Z(G) ≠ ∅. We now have to show that Z(G) satisfies the two required properties of the Subgroup Test.

First we show that Z(G) is closed under multiplication; that is, if a, b ∈ Z(G), then ab ∈ Z(G). We use a direct proof. Let a, b ∈ Z(G). To show that ab ∈ Z(G), we have to show that ab commutes with every element of G. So let g ∈ G. We have to show that (ab)g = g(ab). This suggests starting with (ab)g. By the associative law, (ab)g = a(bg). But since b ∈ Z(G), it follows that a(bg) = a(gb). We can continue in this way to complete the proof of this property.

Second, we have to show that if a ∈ Z(G), then a⁻¹ ∈ Z(G). Again, we use a direct proof. Let a ∈ Z(G). Then a commutes with all elements of G. To show that a⁻¹ ∈ Z(G), we have to verify that a⁻¹ commutes with all elements of G. Let g ∈ G. We then need to show that a⁻¹g = ga⁻¹ to complete the proof. But how do we do that? Theorem 13.11, which deals with inverses of products, can be helpful. We know that (xy)⁻¹ = y⁻¹x⁻¹ for all x, y ∈ G. So (ag)⁻¹ = g⁻¹a⁻¹. However, this involves g⁻¹ rather than g. But if we start with ag⁻¹ = g⁻¹a, then we have (ag⁻¹)⁻¹ = (g⁻¹a)⁻¹.

Result 13.14

For a group G the center Z(G) = {a ∈ G : ga = ag for all g ∈ G} is a subgroup of G.

Proof

Since eg = ge for all g ∈ G, it follows that e ∈ Z(G) and hence Z(G) is nonempty. First we show that Z(G) is closed under multiplication. Let a, b ∈ Z(G). Hence ag = ga and bg = gb for all g ∈ G. We show that ab ∈ Z(G). Since (ab)g = a(bg) = a(gb) = (ag)b = (ga)b = g(ab), it follows that ab ∈ Z(G). Hence Z(G) is closed under multiplication.

Next we show that every element of Z(G) has its inverse in Z(G). Let a ∈ Z(G) and g ∈ G. We show that a⁻¹ ∈ Z(G); that is, a⁻¹ and g commute. Since a commutes with all elements of G, it follows that a and g⁻¹ commute and thus ag⁻¹ = g⁻¹a. Since each element of G has a unique inverse, (ag⁻¹)⁻¹ = (g⁻¹a)⁻¹. By Theorems 13.10 and 13.11, (ag⁻¹)⁻¹ = (g⁻¹)⁻¹a⁻¹ = ga⁻¹ and (g⁻¹a)⁻¹ = a⁻¹(g⁻¹)⁻¹ = a⁻¹g. Hence a⁻¹g = ga⁻¹.

For A = {1, 2, . . . , n}, n ≥ 2, and k ∈ A, let Gk consist of those permutations α in the symmetric group (Sn, ◦) such that α(k) = k (that is, α fixes k). So Gk consists of all those permutations of A that "stabilize" or fix k. The set Gk is called the stabilizer of k in Sn.

Result 13.15

For integers k and n with 1 ≤ k ≤ n and n ≥ 2, the stabilizer Gk of k in Sn is a subgroup of Sn.

Proof

We use the Subgroup Test. Certainly the identity α1 in Sn belongs to Gk, so Gk ≠ ∅. Let α, β ∈ Gk. Then α(k) = β(k) = k. So (α ◦ β)(k) = α(β(k)) = α(k) = k and hence α ◦ β ∈ Gk. Next, consider the inverse α⁻¹ of α. Then α⁻¹ ◦ α = α1. Hence (α⁻¹ ◦ α)(k) = α1(k) = k. On the other hand, (α⁻¹ ◦ α)(k) = α⁻¹(α(k)) = α⁻¹(k). So α⁻¹(k) = k and thus α⁻¹ ∈ Gk. By the Subgroup Test, Gk is a subgroup of Sn.

The group (G1, ◦) shown in Figure 13.5 is the stabilizer of 1 in (S3, ◦). In contrast, (G2, ◦) in Figure 13.5 is not the stabilizer of 2 in (S3, ◦).
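Both the center and a stabilizer can be computed directly in small symmetric groups. The following sketch is an illustration only; it represents a permutation of {1, . . . , n} as a tuple p with p[i − 1] the image of i, and all function names are our own.

```python
from itertools import permutations

def symmetric_group(n):
    """All permutations of {1, ..., n}; p[i - 1] is the image of i."""
    return list(permutations(range(1, n + 1)))

def compose(p, q):
    """Return p ∘ q: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)

def center(G):
    """Elements of G that commute with every element of G."""
    return [a for a in G if all(compose(a, g) == compose(g, a) for g in G)]

def stabilizer(G, k):
    """Permutations in G that fix the point k."""
    return [p for p in G if p[k - 1] == k]

def passes_subgroup_test(H):
    """Nonempty, closed under composition, closed under inverses."""
    members = set(H)
    return (bool(members)
            and all(compose(a, b) in members for a in H for b in H)
            and all(inverse(a) in members for a in H))

S3 = symmetric_group(3)
S4 = symmetric_group(4)
print(center(S3))                      # only the identity commutes with everything
print(stabilizer(S3, 1))               # the two permutations fixing 1
print(len(stabilizer(S4, 2)))          # permutations of the other three points
print(passes_subgroup_test(stabilizer(S4, 2)), passes_subgroup_test(center(S4)))
```

The stabilizer of a point in S4 has 6 elements because such a permutation is free to rearrange the remaining three points in any way.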


We have already mentioned that the set 2Z of even integers is a subgroup of (Z, +). Indeed, for every integer n ≥ 2, the set nZ = {nk : k ∈ Z} of multiples of n is a subgroup of (Z, +) (see Exercise 13.31). In Chapter 8 we saw that the relation R defined on Z by a R b if a ≡ b (mod n) is an equivalence relation. This relation can also be described in another way, namely a R b if a − b ∈ nZ; that is, a − b = h for some h ∈ nZ, or a = b + h. It turns out that this equivalence relation is a special case of a more general situation. Suppose H is a subgroup of a group (G, ·) and a relation R is defined on G by a R b if a = bh for some h ∈ H. (Note that b + h in Z is replaced by bh here, since the operation in G is multiplication.) Then this relation is also an equivalence relation.

Theorem 13.16

Let H be a subgroup of a group (G, ·). The relation R defined in G by a R b if a = bh for some h ∈ H is an equivalence relation.

Proof

First we show that R is reflexive. Let a ∈ G. Since a = ae (where e is the identity of G and hence of H), a R a and hence R is reflexive. Next we show that R is symmetric. Suppose a R b, where a, b ∈ G. Then a = bh for some h ∈ H. Since H is a group, h⁻¹ ∈ H and thus ah⁻¹ = (bh)h⁻¹ = b(hh⁻¹) = be = b, so b = ah⁻¹. Therefore b R a and R is symmetric. Finally we show that R is transitive. Suppose a R b and b R c, where a, b, c ∈ G. Then a = bh1 and b = ch2 for some elements h1 and h2 in H. Therefore a = bh1 = (ch2)h1 = c(h2h1). Since h2, h1 ∈ H, it follows that h2h1 ∈ H. So a R c and R is transitive.

For a subgroup H of a group (G, ·), the equivalence relation defined in Theorem 13.16 gives rise to equivalence classes. For each element g ∈ G the equivalence class [g] is defined by [g] = {x ∈ G : x R g} = {x ∈ G : x = gh for some h ∈ H} = {gh : h ∈ H}. The set {gh : h ∈ H} is often denoted by gH and is called a left coset of H in G; that is, [g] = gH. We saw in Chapter 8 that for an equivalence relation defined on a set S, the distinct equivalence classes form a partition of S. Consequently, the equivalence relation defined in Theorem 13.16 results in a partition of G into the distinct left cosets of H in G.

An important feature of a left coset gH of H in G is that gH and H have the same number of elements, that is, |gH| = |H|. To see this, we show for an element g ∈ G that there is a bijection from H to gH. Let φ : H → gH be defined by φ(h) = gh. First we show that φ is one-to-one. Suppose that φ(h1) = φ(h2). So gh1 = gh2. By the Left Cancellation Law, h1 = h2. Therefore φ is one-to-one. Next we show that φ is onto. Let gh ∈ gH. Since φ(h) = gh, it follows that φ is onto. Hence φ is a bijection and |gH| = |H|. Therefore, every two left cosets of H in G have the same number of elements.

What we have just observed provides all the information needed to prove a fundamental theorem of group theory due to Joseph-Louis Lagrange, probably the greatest French mathematician of the 18th century.


Theorem 13.17

(Lagrange's theorem) If H is a subgroup of order m in a (finite) group G of order n, then m | n.

Proof

We have already seen that the distinct left cosets of H in G form a partition of G and that every two left cosets have the same number of elements. Suppose there are k distinct left cosets of H in G. Then n = mk and hence m | n.

Lagrange's theorem first appeared in 1770-1771 in connection with the problem of solving the general polynomial of degree 5 or higher. Although this theorem was not presented in this general form by Lagrange and, in fact, group theory had not yet been invented, it is commonly referred to as Lagrange's theorem.

We have seen that the set of nonzero elements of Z7 forms a group under multiplication, that is, G = Z∗7 = {[1], [2], . . . , [6]}. Since H = {[1], [6]} is a subgroup of order 2 in G, the distinct left cosets of H in G are [1]H = H, [2]H = {[2], [5]} and [3]H = {[3], [4]}.
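The coset computation in the example above is easy to reproduce. The sketch below is an illustration under our own conventions: residue classes in Z∗7 are represented by the integers 1 through 6, and `left_cosets` is a hypothetical helper name.

```python
def mul7(a, b):
    """Multiplication in Z7*, with classes represented by 1..6."""
    return (a * b) % 7

def left_cosets(G, H, op):
    """Return the distinct left cosets gH as frozensets, in order of first appearance."""
    cosets = []
    for g in G:
        coset = frozenset(op(g, h) for h in H)
        if coset not in cosets:
            cosets.append(coset)
    return cosets

G = [1, 2, 3, 4, 5, 6]
H = [1, 6]
cosets = left_cosets(G, H, mul7)

for coset in cosets:
    print(sorted(coset))

# The cosets partition G and all have |H| elements, so |G| = |H| * (number of cosets).
print(len(G) == len(H) * len(cosets))
```

The final line is exactly the counting step n = mk in the proof of Lagrange's theorem, here with n = 6, m = 2 and k = 3.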

13.6 Isomorphic Groups

Suppose that we are asked to give examples of two groups of order 3. One possible example is (Z3, +). On the other hand, we could try to construct two groups of order 3, say G = {a, b, c} and H = {x, y, z}. Of course, we must also describe binary operations for G and H. Let us denote the binary operation on G by ∗ and the binary operation on H by ◦. Thus we have two groups (G, ∗) and (H, ◦), both of order 3. One of the elements of G is the identity for G and one of the elements of H is the identity for H. Suppose we choose a as the identity of G and x as the identity of H. Hence the operations ∗ and ◦ on G and H, respectively, satisfy the partial tables shown in Figure 13.8.

    ∗ | a  b  c          ◦ | x  y  z
    --+---------         --+---------
    a | a  b  c          x | x  y  z
    b | b                y | y
    c | c                z | z

Figure 13.8  Partial tables for the groups (G, ∗) and (H, ◦)

Since each element of a group must appear exactly once in each row and each column of its table, the complete tables for ∗ and ◦ must be those shown in Figure 13.9.

    ∗ | a  b  c          ◦ | x  y  z
    --+---------         --+---------
    a | a  b  c          x | x  y  z
    b | b  c  a          y | y  z  x
    c | c  a  b          z | z  x  y

Figure 13.9  Complete tables for the groups (G, ∗) and (H, ◦)

Now we can easily see that in G we have a⁻¹ = a, b⁻¹ = c, and c⁻¹ = b, while in H we have x⁻¹ = x, y⁻¹ = z and z⁻¹ = y. Checking the associative law requires more effort, but it can be shown that the associative law holds in each case. So (G, ∗) and (H, ◦) are both groups, and we have just given examples of two groups of order 3. Or have we? There is something very similar about these two examples. They are not really two different groups. In fact, the group (H, ◦) is just a disguised form of the group (G, ∗). Let us describe what we mean by this. If the elements a, b, c in G are replaced by x, y, z, respectively, then we obtain the identical table. What is important here is not only that x, y, and z in H play the roles of a, b, and c in G, but that the operations in both groups do the same thing. For example, if we multiply b and c in G (obtaining the element a), then multiplying the corresponding elements y and z in H gives the element corresponding to a, namely x.

Although this may appear to be the natural correspondence between the elements of G and the elements of H, we must not be fooled by the order in which the elements of these two groups are listed. Suppose we consider the two groups (G, ∗) and (H, ◦) again (in Figure 13.10), with the elements of H listed in the order x, z, y. We can see that the elements a, b, c in G also correspond to x, z, and y, respectively.

    ∗ | a  b  c          ◦ | x  z  y
    --+---------         --+---------
    a | a  b  c          x | x  z  y
    b | b  c  a          z | z  y  x
    c | c  a  b          y | y  x  z

Figure 13.10  The groups (G, ∗) and (H, ◦)

We actually consider the two groups (G, ∗) and (H, ◦) to be the same group, since these two groups have the same order (although the sets are different) and their operations behave in the same way (although different symbols are used for the operations). The technical term for this is that they are isomorphic groups (groups with the same structure). In general, two groups (G, ∗) and (H, ◦) are isomorphic if there is a bijective function φ : G → H that satisfies the property φ(a ∗ b) = φ(a) ◦ φ(b)

(13.3)

for all a, b ∈ G. A function φ that satisfies (13.3) is said to be operation-preserving. Thus, for (G, ∗) and (H, ◦) to be isomorphic, there must exist an operation-preserving bijective function φ : G → H. If φ has these properties, then φ is called an isomorphism. If φ : G → H is an isomorphism, then, in particular, φ is a bijective function. Thus φ has an inverse function φ⁻¹ : H → G, which is also an isomorphism (Exercise 13.51). For isomorphic groups (G, ∗) and (H, ◦), there are certain properties that every isomorphism from G to H must have. We consider two of these.

Theorem 13.18

Let (G, ∗) and (H, ◦) be isomorphic groups, where the identity of G is e and the identity of H is f. If φ : G → H is an isomorphism, then (a) φ(e) = f and (b) φ(g⁻¹) = (φ(g))⁻¹ for all g ∈ G.

Proof

We first prove (a). Let h ∈ H. Since φ is onto, there is a g ∈ G with φ(g) = h. Since e ∗ g = g ∗ e = g and φ is operation-preserving, it follows that φ(e) ◦ φ(g) = φ(e ∗ g) = φ(g) = φ(g ∗ e) = φ(g) ◦ φ(e) and thus φ(e) ◦ h = h ◦ φ(e) = h. This implies that φ(e) is the identity of H and thus φ(e) = f.

Next we prove (b). Let g ∈ G. Since g ∗ g⁻¹ = g⁻¹ ∗ g = e, it follows that φ(g ∗ g⁻¹) = φ(g⁻¹ ∗ g) = φ(e) and thus φ(g) ◦ φ(g⁻¹) = φ(g⁻¹) ◦ φ(g) = φ(e) = f. This says that φ(g⁻¹) is the inverse of φ(g), so φ(g⁻¹) = (φ(g))⁻¹.

If two groups (G, ∗) and (H, ◦) are isomorphic, then by definition there exists an isomorphism φ : G → H. Since φ is a bijective function, it follows that |G| = |H|. In fact, it is not surprising that isomorphic groups have the same number of elements, because when we say that G and H are isomorphic, we are essentially saying that these groups are the same, apart from what the elements and binary operations are called.

On the other hand, consider the tables of the two groups shown in Figure 13.11. You may notice that the first group is (Z4, +). The second group G is abelian, has identity e, and satisfies x⁻¹ = x for all x ∈ G. In addition, G has order 4. So both Z4 and G have order 4. Nevertheless, Z4 and G are not isomorphic; for assume, to the contrary, that they are isomorphic. Then there is an isomorphism φ : Z4 → G. By Theorem 13.18 we know that φ([0]) = e. Let φ([1]) = x ∈ G. Then φ([2]) = φ([1 + 1]) = φ([1] + [1]) = φ([1])φ([1]) = xx = x² = e. Thus φ([2]) = φ([0]) = e, but this contradicts the fact that φ is one-to-one. Consequently, these two groups of order 4 are not isomorphic. So two groups with the same number of elements need not be isomorphic. However, it turns out that every group of order 4 is isomorphic to one of the two groups of order 4 shown in Figure 13.11.

     +  | [0] [1] [2] [3]          · | e  a  b  c
    ----+----------------         --+------------
    [0] | [0] [1] [2] [3]          e | e  a  b  c
    [1] | [1] [2] [3] [0]          a | a  e  c  b
    [2] | [2] [3] [0] [1]          b | b  c  e  a
    [3] | [3] [0] [1] [2]          c | c  b  a  e

Figure 13.11  Two groups of order 4
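For groups this small, the definition of an isomorphism can be tested exhaustively: try every bijection and keep those that preserve the operation. The sketch below is only an illustration; the tables are typed in from Figures 13.9 and 13.11, the elements of Z4 are represented by the integers 0 through 3, and the function names are our own.

```python
from itertools import permutations, product

def table_from_rows(elements, rows):
    """Build {(a, b): ab} from a table whose rows and columns are listed in `elements` order."""
    return {(a, b): rows[i][j]
            for i, a in enumerate(elements)
            for j, b in enumerate(elements)}

def isomorphisms(G, op_G, H, op_H):
    """Return every bijection phi: G -> H (as a dict) with phi(ab) = phi(a)phi(b)."""
    if len(G) != len(H):
        return []
    found = []
    for images in permutations(H):
        phi = dict(zip(G, images))
        if all(phi[op_G[a, b]] == op_H[phi[a], phi[b]]
               for a, b in product(G, repeat=2)):
            found.append(phi)
    return found

# The two groups of order 3 from Figure 13.9.
G3 = "abc"
H3 = "xyz"
star = table_from_rows(G3, ["abc", "bca", "cab"])
circ = table_from_rows(H3, ["xyz", "yzx", "zxy"])

# The two groups of order 4 from Figure 13.11.
Z4 = [0, 1, 2, 3]
add4 = {(a, b): (a + b) % 4 for a in Z4 for b in Z4}
V = "eabc"
dot = table_from_rows(V, ["eabc", "aecb", "bcea", "cbae"])

order3 = isomorphisms(G3, star, H3, circ)
order4 = isomorphisms(Z4, add4, V, dot)
print(order3)   # both correspondences described in the text
print(order4)   # no bijection preserves the operation
```

The search finds exactly the two correspondences discussed for the groups of order 3 (a, b, c to x, y, z and a, b, c to x, z, y), each sending identity to identity as Theorem 13.18(a) requires, and it finds none between the two groups of order 4. Exhaustive search is feasible only for very small groups, since the number of bijections grows factorially.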

We saw in Chapter 10 that |Z| = |Q| although Z is a proper subset of Q. However, (Z, +) and (Q, +) are not isomorphic.

Result 13.19

The groups (Z, +) and (Q, +) are not isomorphic.

Proof

Assume, to the contrary, that (Z, +) and (Q, +) are isomorphic. Then there is an isomorphism φ : Z → Q. Let φ(1) = a ∈ Q. Since φ(0) = 0 and φ is one-to-one, it follows that a ≠ 0. So a/2 ∈ Q and a/2 ≠ 0. Since φ is onto, there is an integer n ≠ 0 such that φ(n) = a/2. So φ(2n) = φ(n + n) = φ(n) + φ(n) = a/2 + a/2 = a. Since φ is one-to-one, 2n = 1. However, then n = 1/2 ∉ Z, which is a contradiction.

On the other hand, the set 2Z of even integers is a proper subset of Z; nevertheless (2Z, +) and (Z, +) are isomorphic.

Result 13.20

The groups (2Z, +) and (Z, +) are isomorphic.

Proof

Define the function φ : Z → 2Z by φ(n) = 2n for every n ∈ Z. First we show that φ is one-to-one. Assume that φ(a) = φ(b). So 2a = 2b. Dividing by 2, we obtain a = b and so φ is one-to-one. Next we show that φ is onto. Let n ∈ 2Z. Since n is even, n = 2k for some integer k. Then φ(k) = 2k = n. This shows that φ is onto. Finally we show that φ is operation-preserving. Let a, b ∈ Z. Then φ(a + b) = 2(a + b) = 2a + 2b = φ(a) + φ(b).

It should be clear that every group G is isomorphic to itself. Indeed, the identity function iG : G → G defined by iG(g) = g for all g ∈ G is an isomorphism. However, other permutations of G can be isomorphisms as well.

Result 13.21

Let G be a group and let g ∈ G. The function φ : G → G defined by φ(a) = gag −1 for all a ∈ G is an isomorphism.

Proof

First we show that φ is one-to-one. Assume that φ(a) = φ(b). So gag⁻¹ = gbg⁻¹. Canceling g on the left and g⁻¹ on the right, we obtain a = b. Next we show that φ is onto. Let c ∈ G. Then φ(g⁻¹cg) = g(g⁻¹cg)g⁻¹ = (gg⁻¹)c(gg⁻¹) = ece = c. Finally we show that φ is operation-preserving. Let a, b ∈ G. Then φ(ab) = g(ab)g⁻¹ = (gag⁻¹)(gbg⁻¹) = φ(a)φ(b).
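The three steps of this proof can be checked concretely for every choice of g in a small group. The sketch below is an illustration under our own assumptions: it uses S3, modeled as tuples of images of the points 0, 1, 2, and the helper names are hypothetical.

```python
from itertools import permutations

# S3 as tuples p, where p[i] is the image of the point i in {0, 1, 2}.
S3 = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def compose(p, q):
    """Return p ∘ q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugation(g, elements):
    """The map a -> g a g^-1, returned as a dict."""
    g_inv = inverse(g)
    return {a: compose(compose(g, a), g_inv) for a in elements}

def is_isomorphism(phi, elements, op):
    """phi is a bijection of `elements` onto itself that preserves op."""
    bijective = sorted(phi.values()) == sorted(elements)
    preserves = all(phi[op(a, b)] == op(phi[a], phi[b])
                    for a in elements for b in elements)
    return bijective and preserves

all_isomorphisms = all(is_isomorphism(conjugation(g, S3), S3, compose) for g in S3)
moves_something = [g for g in S3 if any(conjugation(g, S3)[a] != a for a in S3)]

print(all_isomorphisms)
print(len(moves_something))   # every non-identity g gives a map different from the identity function
```

In S3 every element other than the identity yields a conjugation map different from the identity function, which reflects the fact that the center of S3 contains only the identity.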


EXERCISES FOR CHAPTER 13

Section 13.1: Binary Operations

13.1. Consider the algebraic structure (S, ∗), where S = {x, y, z} and ∗ is described in the table of Figure 13.12. Compute
(a) x ∗ (y ∗ z) and (x ∗ y) ∗ z.
(b) x ∗ (x ∗ x) and (x ∗ x) ∗ x.
(c) y ∗ (y ∗ y) and (y ∗ y) ∗ y.
(d) What conclusion can you draw from (a)-(c)?

    ∗ | x  y  z
    --+---------
    x | y  z  y
    y | y  x  x
    z | z  z  y

Figure 13.12  A binary operation on the set S = {x, y, z}

13.2. For each pair a, b of elements in the given set, the element a ∗ b is defined. Which of these are binary operations? For those that are binary operations, determine which of the properties G1 – G4 are satisfied.
(a) a ∗ b = 1 on the set Z
(b) a ∗ b = a/b on the set N
(c) a ∗ b = a^b on the set N
(d) a ∗ b = max{a, b} on the set N
(e) a ∗ b = a + b + ab on the set Z
(f) a ∗ b = a + b − 1 on the set Z
(g) a ∗ b = √ab + 2a on the set Z
(h) a ∗ b = ab − a − b + 2 on the set R − {1}
(i) a ∗ b = ab on the set Q
(j) a ∗ b = a + b on the set S of odd integers.

13.3. Let T = { $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$ : a, b ∈ R }. Is T closed under (a) matrix addition? (b) matrix multiplication?

13.4. Suppose ∗ is an associative binary operation on a set S. Let T = {a ∈ S : a ∗ x = x ∗ a for all x ∈ S}. Prove that T is closed under ∗.

13.5. Suppose ∗ is an associative and commutative binary operation on a set S. Let T = {a ∈ S : a ∗ a = a}. Prove that T is closed under ∗.

13.6. For matrices $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $B = \begin{pmatrix} e & f \\ g & h \end{pmatrix}$ in M2(R), the binary operation ∗ on M2(R) is defined by

$A ∗ B = \begin{pmatrix} a & b \\ c & d \end{pmatrix} ∗ \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} a + e - 1 & b + f \\ c + g & d + h + 1 \end{pmatrix}$.

Which of the properties G1 – G4 are satisfied?

13.7. For n ≥ 2 and [a], [b] ∈ Zn, the binary operation ∗ on Zn is defined by [a] ∗ [b] = [a + b + 1]. Which of the properties G1 – G4 are satisfied?

Section 13.2: Groups

13.8. Let S = {a, b, c, d}. Figure 13.13 shows a partially completed table for an associative binary operation ∗ defined on S. (a) Complete the table. (b) Is the algebraic structure (S, ∗) a group?

Figure 13.13  A binary operation on the set S = {a, b, c, d} in Exercise 13.8 [partially completed table not reproduced]

13.9. Let (G, ∗) be a group with G = {a, b, c, d}, where a partially completed table for (G, ∗) is given in Figure 13.14. Complete the table.

Figure 13.14  A partially completed table for (G, ∗) in Exercise 13.9 [partially completed table not reproduced]

13.10. None of the following binary operations ∗ on the given set results in a group. What is the first of the properties G1, G2, G3 that fails?
(a) Let ∗ be defined on R+ by a ∗ b = √(ab).
(b) Let ∗ be defined on R∗ by a ∗ b = a/b.
(c) Let ∗ be defined on R+ by a ∗ b = a + b + ab.

13.11. (a) Determine whether for all [a], [b] ∈ Z∗6 = {[1], [2], [3], [4], [5]} there exists [x] ∈ Z∗6 such that [a][x] = [b]. (b) Why is the answer to the question asked in (a) not surprising?

13.12. Let G = { $\begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}$ : a, b ∈ R and a ≠ 0 }.
(a) Prove that G is closed under matrix multiplication.
(b) Prove that there exists E ∈ G such that E · A = A for every A ∈ G.
(c) For every A ∈ G, prove that there exists A′ ∈ G such that A · A′ = E.
(d) Prove or disprove: (G, ·) is a group.

13.13. Let ∗ be an associative binary operation on the set G such that: (i) there exists e ∈ G such that g ∗ e = g for all g ∈ G; (ii) for every g ∈ G there exists g′ ∈ G with g ∗ g′ = e. Prove that (G, ∗) is a group.


Section 13.3: Permutation Groups

13.14. For 1 ≤ i ≤ 6, each function fi is a permutation on the set Q − {0, 1}:

f1(x) = x,  f2(x) = 1 − x,  f3(x) = 1/x,  f4(x) = (x − 1)/x,  f5(x) = 1/(1 − x),  f6(x) = x/(x − 1).

Show that the set F = {f1, f2, . . . , f6} is a group under composition.

13.15. Prove that if A is a set with at least three elements, then the symmetric group (SA, ◦) is nonabelian.

13.16. Give examples of the following (if they exist):
(a) a finite abelian group
(b) a finite nonabelian group
(c) an infinite abelian group
(d) an infinite nonabelian group

13.17. Determine all elements x in the group S3 such that x² = α1 and all elements y in S3 such that y³ = α1.

13.18. For the permutations

$\gamma_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}$, $\gamma_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$, $\gamma_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}$, $\gamma_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix}$

of the set {1, 2, 3, 4}, show that the set G = {γ1, γ2, γ3, γ4} is an abelian group under composition.

13.19. Consider the following permutations of the set A = {1, 2, 3, 4, 5}:

$\beta_1 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix}$, $\beta_2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 1 & 4 & 5 \end{pmatrix}$, $\beta_3 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 1 & 2 & 4 & 5 \end{pmatrix}$,

$\beta_4 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 3 & 5 & 4 \end{pmatrix}$, $\beta_5 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 1 & 5 & 4 \end{pmatrix}$, $\beta_6 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 1 & 2 & 5 & 4 \end{pmatrix}$.

For G = {β1, β2, . . . , β6}, show that (G, ◦) is a group of permutations on A.

13.20. For a group G of permutations on a set A, a relation R on A is defined by a R b if there exists g ∈ G such that g(a) = b. (a) Prove that R is an equivalence relation on A. (The equivalence classes resulting from this equivalence relation R are called the orbits of A under G.) (b) Determine the orbits of A under G for the group G in Exercise 13.19.

Section 13.4: Basic Properties of Groups

13.21. Prove Theorem 13.7(b) (The Right Cancellation Law): Let (G, ∗) be a group. If b ∗ a = c ∗ a, where a, b, c ∈ G, then b = c.

13.22. Prove (see Theorem 13.8): Let (G, ∗) be a group and let a, b ∈ G. The linear equation x ∗ a = b has a unique solution x in G.

13.23. Let (G, ∗) be a group and let a, b, c ∈ G. Prove that each of the following equations has a unique solution for x in G and find the solution. (a) a ∗ x ∗ b = c (b) a ∗ b ∗ x = c.

13.24. Let a and b be two elements in a group G. Prove that if a and b commute, then a⁻¹ and b⁻¹ also commute.

13.25. Let G be a group. Prove that G is abelian if and only if (ab)⁻¹ = a⁻¹b⁻¹ for all a, b ∈ G.

Figure 13.15  Constructing a table for the group (Z9, +) in Exercise 13.26 [empty 9 × 9 table, divided into nine 3 × 3 regions, not reproduced]

13.26. Construct a table for the group (Z9, +) by listing the elements of Z9 across the top of the table in some order and down the left side of the table in some order such that each element of Z9 occurs in each of the nine 3 × 3 regions in the table of Figure 13.15.

13.27. By Theorem 13.9, every group G has a unique identity. That is, G contains only one element e such that ae = ea = a for all a ∈ G. Suppose e′ is an element of G such that e′b = b for some element b ∈ G. Prove or disprove: e′ is the identity of G.

13.28. Let (G, ∗) be a group. Prove that if g ∗ g = e for all g ∈ G, then G is abelian.

13.29. Let (G, ∗) be a finite group of even order. Prove that there exists g ∈ G with g ≠ e and g ∗ g = e.

13.30. Suppose G is a finite abelian group of order n, say G = {g1, g2, . . . , gn}. Let g = g1 g2 · · · gn g1 g2 · · · gn. What is g?

Section 13.5: Subgroups

13.31. For an integer n ≥ 2, prove that the set nZ = {nk : k ∈ Z} of multiples of n is a subgroup of (Z, +).

13.32. Which of the following are subgroups of the given group?
(a) The subset N in (Z, +)
(b) The subset {[0], [2], [4]} in (Z7, +)
(c) The subset {[1], [2], [4]} in (Z∗7, ·)
(d) The subset {2^n : n ∈ Z} in (Q∗, ·).

13.33. Let H and K be two subgroups of a group G. Prove or disprove: (a) H ∩ K is a subgroup of G. (b) H ∪ K is a subgroup of G.

13.34. For each of the following subsets H of M2∗(R), prove or disprove that (H, ·) is a subgroup of (M2∗(R), ·).
(a) H = { $\begin{pmatrix} a & b \\ c & 0 \end{pmatrix}$ : a, b, c ∈ R, bc ≠ 0 }
(b) H = { $\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$ : a, b, c ∈ R, ac ≠ 0 }.

13.35. Let H = {a + b√3 : a, b ∈ Q, a ≠ 0 or b ≠ 0}. Prove that H is a subgroup of (R∗, ·).

13.36. For n ∈ N, let T be a nonempty subset of {1, 2, . . . , n} and define GT = {α ∈ Sn : α(t) = t for all t ∈ T}. Prove that GT is a subgroup of (Sn, ◦).


13.37. Recall that M2∗(R) = {A ∈ M2(R) : det(A) ≠ 0}. Let H = {A ∈ M2∗(R) : det(A) = 1 or det(A) = −1}. Prove that H is a subgroup of (M2∗(R), ·).

13.38. Let G be an abelian group and let H = {a^2 : a ∈ G}. Prove that H is a subgroup of G.

13.39. Let G be an abelian group and let H = {a ∈ G : a^2 = e}. Prove that H is a subgroup of G.

13.40. What are all subgroups of a group of order p, where p is a prime number?

13.41. Prove or disprove: There is a group of order 372 that contains a subgroup of order 22.

13.42. Prove that a nonempty subset H of a group G is a subgroup of G if and only if ab^{-1} ∈ H for all a, b ∈ H.

13.43. (a) Let (G, ∗) be a finite group. Prove that if H is a nonempty subset of G that is closed under ∗, then H is a subgroup of G. (b) Show that the result in (a) is false if (G, ∗) is an infinite group.

13.44. Let B be a nonempty subset of a set A, S = {f ∈ S_A : f(B) = B} and T = {f ∈ S_A : f(b) = b for every b ∈ B}. (a) Prove that (S, ◦) is a subgroup of (S_A, ◦). (b) Prove that (T, ◦) is a subgroup of (S, ◦).

13.45. A group G has order 48. If there are six distinct left cosets of a subgroup H in G, what is the order of H?

13.46. For the subgroup H = {α1, α2} of (S3, ◦) with $\alpha_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$ and $\alpha_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$, determine the distinct left cosets of H in (S3, ◦).

13.47. For a subgroup H of a group (G, ·), let gH be a left coset of H. By the element g^2 ∈ G we mean g · g. Prove or disprove: g^2 ∈ gH.

Section 13.6: Isomorphic Groups

13.48. Let $H = \left\{ \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} : n \in \mathbf{Z} \right\}$.
(a) Prove that H is a subgroup of (M2∗(R), ·).
(b) Prove that the function f : (Z, +) → (H, ·) given by $f(n) = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$ is an isomorphism.
(c) Parts (a) and (b) should suggest another question to you. Ask and answer a related question.

13.49. In each of the following, determine whether the function φ is an isomorphism from the first group to the second group.
(a) φ : (Z, +) → (Z, +) defined by φ(n) = 2n.
(b) φ : (Z, +) → (Z, +) defined by φ(n) = n + 1.
(c) φ : (R, +) → (R+, ·) defined by φ(r) = 2^r.
(d) φ : (M2∗(R), ·) → (R∗, ·) defined by φ(A) = det(A).

13.50. Certainly (R+, ·) and (R+, ·) are isomorphic groups. Consider the function φ : R+ → R+ defined by φ(r) = r^2 for all r ∈ R+. Is φ an isomorphism?

13.51. Let (G, ∗) and (H, ◦) be two groups. Prove that if φ : G → H is an isomorphism, then the inverse function φ^{-1} of φ is an isomorphism from H to G.

13.52. Let G, H and K be three groups. Prove that if φ1 : G → H and φ2 : H → K are isomorphisms, then the composition φ2 ◦ φ1 : G → K is an isomorphism.


13.53. Let (G, ∗) be a group. Define a binary operation ◦ on G by a ◦ b = b ∗ a. (a) Prove that (G, ◦) is a group. (b) Prove that (G, ∗) and (G, ◦) are isomorphic. [Hint: Consider the function φ(g) = g^{-1}.]

13.54. Explain why the groups (Q, +) and (R, +) are not isomorphic.

13.55. We saw in Example 13.4 that, with the binary operation ∗ on Z defined by a ∗ b = a + b − 1, (Z, ∗) is an abelian group. Prove that (Z, ∗) is isomorphic to (Z, +).

13.56. (a) Let G and H be isomorphic groups. Prove that if G is abelian, then H is abelian. (b) Show that the groups (Z6, +) and (S3, ◦) are not isomorphic.

13.57. Let B = {1/n : n ∈ Z − {0}} and let A = R − B. (a) Prove for every n ∈ Z that the function f_n : A → A defined by f_n(x) = x/(1 + nx) is bijective. (b) Let P = {f_n : n ∈ Z}. Prove that (P, ◦) is a permutation group on A. (c) Prove that the groups (Z, +) and (P, ◦) are isomorphic.

13.58. Let (G, ◦) and (H, ∗) be groups with identities e and e′, respectively. Suppose f : G → H is a function with the property that f(a ◦ b) = f(a) ∗ f(b) for all a, b ∈ G. (a) Let M = range(f). Prove that (M, ∗) is a subgroup of (H, ∗). (b) Let K = {a ∈ G : f(a) = e′}. Prove that (K, ◦) is a subgroup of (G, ◦).

13.59. Let A = {m/n : m and n are odd integers, gcd(m, n) = 1 and n ≥ 1}. Let R∗ = R − {0} and for a ∈ A let f_a : R∗ → R∗ be defined by f_a(x) = x^a. (a) Prove that (A, ·) is a subgroup of (Q∗, ·), where the product of any two elements of A is reduced to lowest terms. (b) For every a ∈ A, show that f_a is a permutation on R∗. (c) Let F = {f_a : a ∈ A}. Prove that (F, ◦) is a subgroup of (S_{R∗}, ◦). (d) Prove that (A, ·) and (F, ◦) are isomorphic groups.

ADDITIONAL EXERCISES FOR CHAPTER 13

13.60. Let (G, ∗) be a group. An element g of G is an idempotent for ∗ if g ∗ g = g. Prove that there is exactly one idempotent in G.

13.61. Let G be a group and let a ∈ G. The set Z(a) = {g ∈ G : ga = ag} is called the centralizer of a. Prove that the centralizer of a is a subgroup of G.

13.62. Let a, b ∈ Z, where a, b ≠ 0, and let H = {am + bn : m, n ∈ Z} be the set of all linear combinations of a and b. (a) Prove that H is a subgroup of (Z, +). (b) Let d = gcd(a, b). Prove that H = dZ.

13.63. Define ∗ on R − {1} by a ∗ b = a + b − ab. (a) Prove that (R − {1}, ∗) is an abelian group. (b) Prove that (R − {1}, ∗) is isomorphic to (R∗, ·).

13.64. Let G be a group of order pq, where p and q are distinct primes. What are the possible orders of a subgroup of G?


13.65. Let H be a subgroup of (Z, +) with at least two elements and let m be the smallest positive integer in H. Prove that H = mZ. [Hint: Use the Division Algorithm.]

13.66. Evaluate the proposed proof of the following statement.
Result There is no group that contains exactly two distinct elements that do not commute.
Proof Assume, to the contrary, that there is a group G containing exactly two distinct elements, say x and y, that do not commute. So xy ≠ yx. Since x and y are the only two elements of G that do not commute, x^{-1} and y commute. So x^{-1} y = y x^{-1}. Multiplying by x on the left and on the right gives x x^{-1} y x = x y x^{-1} x. Simplifying, we have yx = xy. This is a contradiction.

13.67. A group G of order n ≥ 2 contains a subgroup H. For a left coset decomposition of G by H, the number of distinct left cosets is the same as the order of H. If a left coset contains p elements, where p is a prime number, then what is n?

13.68. Evaluate the proposed proof of the following statement.
Result There is no abelian group that contains exactly three distinct elements x such that x^2 = e.
Proof Assume, to the contrary, that there is an abelian group G such that x^2 = e for exactly three distinct elements x of G. Certainly e^2 = e, so there are two nonidentity elements a and b such that a^2 = b^2 = e. Note that (ab)^2 = a^2 b^2 = ee = e. So either ab = a, ab = b, or ab = e, which implies that b = e, a = e, or a = b, producing a contradiction.

13.69. Prove or disprove the following: for every odd integer k ≥ 3, there is no abelian group containing exactly k elements x such that x^2 = e.

13.70. For a function f : N → N we set f^1 = f and f^2 = f ◦ f. More generally, for k ≥ 2, the function f^k is defined recursively by f^k = f ◦ f^{k−1}. So f^n : N → N for every n ∈ N. Give an example of two elements f and g in S_N such that f^2 = g^2 = i_N but (f ◦ g)^m ≠ i_N for all m ∈ N.

Chapter 14

Proofs in Ring Theory

We have seen that many of the proofs encountered so far involve integers and their properties. This was certainly the case in Chapter 11, where we were mainly concerned with the additive and multiplicative properties of integers. Many important properties of integers follow from just a few familiar additive and multiplicative properties. In particular, any three integers a, b, and c satisfy the following:

(1) a + b = b + a
(2) (a + b) + c = a + (b + c)
(3) a + 0 = a
(4) a + (−a) = 0
(5) a(bc) = (ab)c
(6) a(b + c) = ab + ac          (14.1)

Properties (1)-(4) tell us that the integers form an abelian group under addition, a fact we noted in Chapter 13. You can probably think of other well-known properties of the integers (such as ab = ba), but for now let us concentrate on the six properties listed above. We saw in Chapter 13 that some of these properties have names. For example, (1) is called the commutative law of addition, while (2) and (5) are the associative laws of addition and multiplication, respectively. Property (6) is called the distributive law. Property (3) states that the integer 0 is the additive identity, while property (4) tells us that for an integer a, the integer −a is its inverse under addition. Properties (3) and (4) in particular seem so fundamental that they hardly need to be mentioned. But the very fact that these six properties are so basic and natural is what makes them important and draws our attention to them.

Now the question arises: which facts about the integers follow from these six properties alone? An even more fundamental question is this: if we have a nonempty set S of objects (not necessarily integers) for which it is possible to add and multiply any two elements of S (obtaining an element of S each time) so that properties (1)-(6) are satisfied, then which additional properties must S have? Clearly, every property that can be derived for the elements of S is also a property of the integers. Indeed, this is the essence of the area of abstract algebra we are about to encounter (and often of all of mathematics). By studying a familiar set of objects, we may discover an interesting fact about that set. But which characteristics of this set led us to that conclusion? And if another set has the same characteristics, does the interesting fact hold for that set as well? We are now ready to study nonempty sets on which addition and multiplication have been defined and satisfy properties (1)-(6).
14.1 Rings

Addition and multiplication of integers are binary operations, since each assigns an integer to each (ordered) pair of integers. Binary operations in general are discussed in Chapter 13.


In the current context, a nonempty set with one or more binary operations that are required to satisfy certain prescribed properties is called an algebraic structure. Therefore, we have already seen examples of algebraic structures. In fact, every group (see Chapter 13) is an algebraic structure. The study of algebraic structures is fundamental to abstract algebra.

We have mentioned that the familiar operations of addition and multiplication defined on the integers satisfy the six properties listed in (14.1). Other familiar sets of numbers with these operations also satisfy these six properties, including the rational numbers, the real numbers, and the complex numbers. The situation is different for the irrational numbers, however, since addition and multiplication are not even binary operations on the set of irrational numbers. For example, √2 and −√2 are irrational numbers, while √2 · √2 = 2 and √2 + (−√2) = 0 are not. These and other examples suggest a general concept.

A set R (this is not the symbol used for the set of real numbers) with two binary operations, one called addition and denoted by + and the other called multiplication and denoted by · (where we usually write ab instead of a · b for a, b ∈ R), is called a ring if it satisfies the following six properties:

R1 Commutative law of addition: a + b = b + a for all a, b ∈ R;
R2 Associative law of addition: (a + b) + c = a + (b + c) for all a, b, c ∈ R;
R3 Existence of an additive identity: there exists an element 0 ∈ R such that a + 0 = a for all a ∈ R;
R4 Existence of additive inverses: for every a ∈ R there is an element −a ∈ R such that a + (−a) = 0;
R5 Associative law of multiplication: a(bc) = (ab)c for all a, b, c ∈ R;
R6 Distributive laws: a(b + c) = ab + ac and (a + b)c = ac + bc for all a, b, c ∈ R.

Note that property R3 requires the existence of at least one element in R, which implies that every ring is nonempty. Recall that if S is a set with a binary operation ∗ and e ∈ S is an identity for S under ∗, then e ∗ a = a ∗ e = a for all a ∈ S.
Since a ring R has two binary operations and an identity element is required only for addition, we refer to an element 0 specified in property R3 as an additive identity. The notation 0 for an additive identity is chosen because the integer 0 is an additive identity in Z. In other words, an additive identity in a ring R has the same properties as the integer 0 under addition in Z. When we speak of an additive identity 0 in a ring R, we are referring only to an element of R that we denote by 0 and that satisfies property R3, that is, a + 0 = a for all a ∈ R. Since property R1 holds in every ring, we also have 0 + a = a.

Furthermore, if an algebraic structure (S, ∗) has an identity e, then an element a ∈ S has an inverse b ∈ S if a ∗ b = b ∗ a = e. Each element of a ring R is required to have this property only for the operation of addition. An inverse of an element a ∈ R with respect to addition is called an additive inverse of a. In Z, an additive inverse of an integer m is its negative −m. For this reason we use −a to denote an additive inverse of an element a in a ring R. We must remember that −a represents only an element of R that satisfies property R4, namely a + (−a) = 0. By property R1 we also know that (−a) + a = 0. Since properties R1-R4 are required of every ring R, it follows that (R, +) is an abelian group.

A ring with binary operations + and · is usually denoted by (R, +, ·). However, when the two operations involved are clear, we simply write R. In particular, when we are dealing with a familiar set under the standard operations of addition and multiplication (and these are the operations we are using), we just write the symbol for that set. So Z, Q, R and C are rings. Now let us look at some other common examples of rings.

Result 14.1

The set 2Z of even integers is a ring under ordinary addition and multiplication.

Proof. First we show that ordinary addition and multiplication are binary operations on 2Z. Let a, b ∈ 2Z. Then a = 2x and b = 2y for some x, y ∈ Z. So a + b = 2x + 2y = 2(x + y) and ab = (2x)(2y) = 2(2xy). Since x + y and 2xy are integers, a + b and ab belong to 2Z. Since 2Z ⊆ Z and the binary operations on 2Z are the same as those on Z, properties R1, R2, R5 and R6 are automatically satisfied. Furthermore, since the integer 0 is even, 0 ∈ 2Z and hence 2Z has an additive identity. To show that property R4 is also satisfied, let a ∈ 2Z. Then a = 2x, where x ∈ Z. So −a = −(2x) = 2(−x). Since −x ∈ Z, it follows that −a ∈ 2Z.

Result 14.2 The set Zn = {[0], [1], [2], · · · , [n − 1]}, n ≥ 2, of residue classes is a ring under addition of residue classes and multiplication of residue classes.

Proof. In Chapter 7 it was pointed out that addition and multiplication of residue classes, defined by [a] + [b] = [a + b] and [a][b] = [ab], are well defined and so are binary operations on Zn. That properties R1, R2, R5 and R6 are satisfied depends only on the corresponding properties in the ring Z. For example, to see that R1 and R2 are satisfied, let [a], [b], [c] ∈ Zn. Then [a] + [b] = [a + b] = [b + a] = [b] + [a] and ([a] + [b]) + [c] = [a + b] + [c] = [(a + b) + c] = [a + (b + c)] = [a] + [b + c] = [a] + ([b] + [c]). The verifications of properties R5 and R6 are similar. The residue class [0] is an additive identity in Zn and an additive inverse for [a] is [−a], since [a] + [−a] = [a + (−a)] = [0].

The ring (Zn, +, ·) described in Result 14.2 is called the ring of residue classes modulo n.

Result 14.3 The set M2(R) of 2 × 2 matrices over R is a ring under matrix addition and matrix multiplication.

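Proof. Recall that for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $B = \begin{pmatrix} e & f \\ g & h \end{pmatrix}$ in M2(R), addition and multiplication are defined by

$A + B = \begin{pmatrix} a+e & b+f \\ c+g & d+h \end{pmatrix}$ and $AB = \begin{pmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{pmatrix}$.

An additive identity for M2(R) is the zero matrix $Z = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ and an additive inverse for the matrix A given above is the matrix $-A = \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}$. The verification of properties R1, R2, R5 and R6 depends only on properties of the ring R of real numbers.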

Not only is M2(R) a ring under matrix addition and matrix multiplication, but so is Mn(R) for every integer n ≥ 2.
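For small finite examples, the six ring properties can also be checked exhaustively by machine. The following is a minimal sketch, not part of the text's development: the function name ring_axioms_hold is hypothetical, and the 2 × 2 matrices over Z2 are used only as a finite stand-in for M2(R), an assumption made so that every pair and triple of elements can be tested.

```python
from itertools import product


def ring_axioms_hold(elems, add, mul):
    """Brute-force check of closure and properties R1-R6 on a finite set."""
    elems = list(elems)
    members = set(elems)
    pairs = list(product(elems, repeat=2))
    triples = list(product(elems, repeat=3))

    # Both operations must be binary operations on the set (closure).
    if not all(add(a, b) in members and mul(a, b) in members for a, b in pairs):
        return False
    # R1 and R2: addition is commutative and associative.
    if not all(add(a, b) == add(b, a) for a, b in pairs):
        return False
    if not all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in triples):
        return False
    # R3: some element z satisfies a + z = a for all a.
    zeros = [z for z in elems if all(add(a, z) == a for a in elems)]
    if not zeros:
        return False
    zero = zeros[0]
    # R4: every element has an additive inverse.
    if not all(any(add(a, b) == zero for b in elems) for a in elems):
        return False
    # R5: multiplication is associative.
    if not all(mul(a, mul(b, c)) == mul(mul(a, b), c) for a, b, c in triples):
        return False
    # R6: both distributive laws.
    return all(
        mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
        and mul(add(a, b), c) == add(mul(a, c), mul(b, c))
        for a, b, c in triples
    )


# Z_6 with residue classes represented by 0..5 (Result 14.2 with n = 6).
z6_ok = ring_axioms_hold(range(6), lambda a, b: (a + b) % 6, lambda a, b: (a * b) % 6)


# 2 x 2 matrices over Z_2, stored as tuples (a, b, c, d) for rows (a b) and (c d).
def mat_add(x, y):
    return tuple((p + q) % 2 for p, q in zip(x, y))


def mat_mul(x, y):
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % 2, (a * f + b * h) % 2,
            (c * e + d * g) % 2, (c * f + d * h) % 2)


m2_ok = ring_axioms_hold(product(range(2), repeat=4), mat_add, mat_mul)

# Swapping the roles of the two operations on Z_6 breaks R4 (0 has no inverse).
swapped_ok = ring_axioms_hold(range(6), lambda a, b: (a * b) % 6, lambda a, b: (a + b) % 6)
```

Such a check is no substitute for the proofs above, since it covers only the particular finite set supplied, but it is a quick way to catch a failed property before attempting a proof.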


Result 14.4 The set FR = {f : f : R → R} of real-valued functions with domain R is a ring under function addition and function multiplication.

Proof. Recall that for f, g ∈ FR, addition and multiplication are defined by (f + g)(x) = f(x) + g(x) and (fg)(x) = f(x)g(x) for all x ∈ R. The verifications of properties R1, R2, R5 and R6 depend only on properties of the ring R. For example, property R1 follows because (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x) for all x ∈ R and so f + g = g + f; while property R5 follows because ((fg)h)(x) = (fg)(x)h(x) = (f(x)g(x))h(x) = f(x)(g(x)h(x)) = f(x)(gh)(x) = (f(gh))(x) for all x ∈ R and so (fg)h = f(gh). The zero function f0 : R → R defined by f0(x) = 0 for all x ∈ R is an additive identity, since for every f ∈ FR and all x ∈ R, (f + f0)(x) = f(x) + f0(x) = f(x) + 0 = f(x) and thus f + f0 = f. For f ∈ FR, the function −f ∈ FR defined by (−f)(x) = −(f(x)) for all x ∈ R is an additive inverse for f, since for all x ∈ R, (f + (−f))(x) = f(x) + (−f)(x) = f(x) + (−f(x)) = 0 = f0(x) and thus f + (−f) = f0.

A less common but useful example of a ring is given next.

Result 14.5 The set R × R = R2 is a ring under the addition (a, b) + (c, d) = (a + c, b + d) and the multiplication (a, b)(c, d) = (ac, bd).

Before proving this result, it is important to note that we are defining a new sum (a, b) + (c, d) in terms of the known sums a + c and b + d of two real numbers. The symbol + has two different meanings here. There is a similar distinction between the product on R2 and the standard product of real numbers.

Proof of Result 14.5. Certainly, the addition and multiplication defined here are binary operations on R2. That R2 satisfies property R1 follows because addition in R is commutative. Let (a, b), (c, d) ∈ R2. Then (a, b) + (c, d) = (a + c, b + d) = (c + a, d + b) = (c, d) + (a, b). Let (a, b) ∈ R2. Note that (0, 0) ∈ R2 and that (a, b) + (0, 0) = (a + 0, b + 0) = (a, b). So (0, 0) is an additive identity in R2. In addition, (−a, −b) ∈ R2 and (a, b) + (−a, −b) = (a + (−a), b + (−b)) = (0, 0).

Thus (−a, −b) is an additive inverse of (a, b) and properties R3 and R4 hold. We verify only one of the distributive laws, since the argument for the remaining law is similar. Let (a, b), (c, d), (e, f) ∈ R2. Applying the distributive law for addition and multiplication in R, we have (a, b)[(c, d) + (e, f)] = (a, b)(c + e, d + f) = (a(c + e), b(d + f)) = (ac + ae, bd + bf) = (ac, bd) + (ae, bf) = (a, b)(c, d) + (a, b)(e, f), which establishes this distributive law in R2. The associative properties R2 and R5 can be established in R2 similarly.

Next we show that familiar sets need not be rings under unfamiliar binary operations.

Example 14.6 For a, b ∈ R, define addition ⊕ and multiplication ⊙ by

a ⊕ b = a + b − 1 and a ⊙ b = ab,

where the operations indicated in a + b − 1 and ab are ordinary addition, subtraction, and multiplication. Then (R, ⊕, ⊙) is not a ring.

Solution. In Example 13.4 it was shown that the binary operation ⊕ satisfies properties R1-R4, that is, (R, ⊕) is an abelian group. Because ⊙ is ordinary multiplication, property R5 holds as well. However, property R6 is not satisfied, since for a = b = c = 0,

a ⊙ (b ⊕ c) = 0 ⊙ (−1) = 0 and (a ⊙ b) ⊕ (a ⊙ c) = 0 ⊕ 0 = −1.

Therefore (R, ⊕, ⊙) is not a ring. ♦

Let us see what happens when ordinary addition and multiplication of real numbers are interchanged.

Example 14.7 The set R of real numbers is not a ring if addition ∗ is defined as ordinary multiplication and multiplication ◦ is defined as ordinary addition.

Solution. We denote ordinary addition of real numbers by + and ordinary multiplication by · (although, as usual, we write a · b as ab). We show that a distributive law fails in (R, ∗, ◦). Let a = b = c = −1. Then a ◦ (b ∗ c) = a + (bc) = (−1) + (−1)(−1) = (−1) + 1 = 0, while (a ◦ b) ∗ (a ◦ c) = (a + b)(a + c) = [(−1) + (−1)][(−1) + (−1)] = (−2)(−2) = 4. So (R, ∗, ◦) is not a ring.

♦
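The two counterexamples above amount to evaluating both sides of a distributive law at a single choice of a, b and c. A minimal sketch of that arithmetic follows; the function names shifted_add, ordinary_mul, star and circ are hypothetical labels introduced here for the operations of Examples 14.6 and 14.7.

```python
def left_distributes(add, mul, a, b, c):
    """Return True if a*(b+c) equals (a*b)+(a*c) for the supplied operations."""
    return mul(a, add(b, c)) == add(mul(a, b), mul(a, c))


# Example 14.6: "addition" is a + b - 1, multiplication is ordinary.
def shifted_add(a, b):
    return a + b - 1


def ordinary_mul(a, b):
    return a * b


lhs_146 = ordinary_mul(0, shifted_add(0, 0))                   # 0 * (-1)
rhs_146 = shifted_add(ordinary_mul(0, 0), ordinary_mul(0, 0))  # 0 + 0 - 1


# Example 14.7: "addition" * is ordinary multiplication,
# "multiplication" o is ordinary addition.
def star(a, b):
    return a * b


def circ(a, b):
    return a + b


lhs_147 = circ(-1, star(-1, -1))            # (-1) + (-1)(-1)
rhs_147 = star(circ(-1, -1), circ(-1, -1))  # (-2)(-2)

print(lhs_146, rhs_146)  # 0 -1
print(lhs_147, rhs_147)  # 0 4
```

One failing triple is enough to show that property R6 does not hold, which is why each solution exhibits a single choice of a, b and c rather than a general argument.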

Some rings have properties that go beyond the six properties required of all rings. We have already mentioned that the integers satisfy the well-known property ab = ba for all a, b ∈ Z. This is, of course, the commutative property of multiplication. Rings are not required to have this property. When they do, however, we give these rings a special name. A ring (R, +, ·) is called a commutative ring if it satisfies


R7 Commutative law of multiplication: ab = ba for all a, b ∈ R.

A ring (R, +, ·) that does not satisfy the commutative law of multiplication is called a noncommutative ring. While the rings Z, Q, R, C, 2Z, Zn, FR and R2 are commutative, the ring M2(R) is noncommutative. For example, if we let

$A = \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}$,

then

$AB = \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix} \neq \begin{pmatrix} 4 & 0 \\ 0 & 0 \end{pmatrix} = BA$.
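The two products can be recomputed directly from the definition of matrix multiplication. The sketch below stores a 2 × 2 matrix as a pair of rows and takes A to have rows (0, 0), (2, 0) and B to have rows (0, 2), (0, 0); the helper name mat_mul is hypothetical.

```python
def mat_mul(X, Y):
    """Product of two 2 x 2 matrices given as nested tuples ((a, b), (c, d))."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


A = ((0, 0), (2, 0))
B = ((0, 2), (0, 0))

AB = mat_mul(A, B)
BA = mat_mul(B, A)

print(AB)        # ((0, 0), (0, 4))
print(BA)        # ((4, 0), (0, 0))
print(AB == BA)  # False
```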

Another basic but important property of Z is that it contains an integer e with the property that a · e = e · a = a for every integer a. Of course, 1 has this property in Z. In general, a ring (R, +, ·) is called a ring with unity (or a ring with multiplicative identity) if it satisfies

R8 Existence of a multiplicative identity: there exists an element 1 ∈ R such that a · 1 = 1 · a = a for all a ∈ R.

If (R, +, ·) has an element 1 that satisfies property R8, then 1 is called a unity for R. Again, we emphasize that care is needed here. When we write 1, we mean only an element of R that satisfies property R8, that is, a · 1 = 1 · a = a for all a ∈ R. This does not mean that 1 is the integer 1. Indeed, R could be a ring with unity that contains no integers. If R is a commutative ring, then proving that an element 1 ∈ R is a unity requires showing only that a · 1 = a for all a ∈ R, since a · 1 = 1 · a for all a ∈ R.

The rings Z, Q, R, C, Zn, FR, and R2 are rings with unity. The number 1 is a unity for Z, Q and R, just as 1 = 1 + 0i is a unity for C. The residue class [1] is a unity for Zn, while the constant function f1 : R → R defined by f1(x) = 1 for all x ∈ R is a unity for FR. The ordered pair (1, 1) is a unity for R2. The noncommutative ring M2(R) also has a unity. Indeed, the 2 × 2 identity matrix $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is a unity for M2(R), since

$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$

for all a, b, c, d ∈ R. On the other hand, not all rings have a unity. In particular, the ring 2Z of even integers has no unity, since the only integer e with e · a = a for every nonzero even integer a is e = 1, but 1 ∉ 2Z.

14.2 Elementary Properties of Rings

Although there are many different types of rings, there are properties that all rings have in common. Necessarily, such properties are consequences of the six defining properties of a ring. We now present some properties that all rings have in common, beginning with the uniqueness of certain types of elements in rings.

The definition of a ring R guarantees that it contains an additive identity, that is, an element 0 such that a + 0 = a for all a ∈ R. Although the definition does not specify that there is only one such element, there is, in fact, just one. Furthermore, the definition says that for every a ∈ R there is an element −a ∈ R such that a + (−a) = 0. Again, nothing in the definition says that each element of R has only one additive inverse, but this is in fact the case. These are consequences of the fact that R is a group under addition (see Theorem 13.9), but we verify these facts here.

Theorem 14.8 Let R be a ring. Then (i) R has a unique additive identity and (ii) each element of R has a unique additive inverse.

Proof. We first verify (i). Assume that 0 and 0′ are additive identities for R. Since 0 is an additive identity, 0′ + 0 = 0′. Also, since 0′ is an additive identity, 0 + 0′ = 0. Then, by the commutative property, 0 = 0 + 0′ = 0′ + 0 = 0′, and therefore 0 = 0′. Hence R has only one additive identity and (i) holds. Now we verify (ii). Suppose that −x and x′ are both additive inverses for the element x ∈ R. Then x + (−x) = 0 and x + x′ = 0. So −x = −x + 0 = −x + (x + x′) = (−x + x) + x′ = 0 + x′ = x′. Hence every element of R has a unique additive inverse.

Proof Analysis Consider again the proof of the uniqueness of additive inverses in Theorem 14.8 to see how this proof might have been constructed. We know that x + (−x) = 0 and x + x′ = 0. Since x + (−x) = 0 = x + x′, adding −x to the equal elements x + (−x) and x + x′ gives

−x + (x + (−x)) = −x + (x + x′). (14.2)

The left side of (14.2) is −x + 0 = −x. Our goal is to show that −x = x′, so this suggests starting with −x = −x + 0 = −x + (x + x′), and the rest of the proof follows naturally. The resulting proof in Theorem 14.8 is certainly much clearer than a list of equations without further explanation:

x + (−x) = x + x′
−x + (x + (−x)) = −x + (x + x′)
(−x + x) + (−x) = (−x + x) + x′
0 + (−x) = 0 + x′
−x = x′. ♦

In view of Theorem 14.8, we can now refer to the additive identity of a ring and the additive inverse of an element in a ring. The additive identity of a ring R is called the zero element of R. Not only are the additive identity and the additive inverse of each element in a ring R unique, but if R has a unity, then that element is unique as well.

Theorem 14.9 If R is a ring with unity, then R has a unique unity.


Proof. Let 1 and 1′ be unities in R. Since 1 is a unity, 1 · 1′ = 1′ · 1 = 1′; while since 1′ is a unity, 1 · 1′ = 1′ · 1 = 1. Hence 1 = 1 · 1′ = 1′.

A basic fact about rings allows us to simplify certain algebraic expressions. Although the next theorem follows from the fact that R is an abelian group under addition (see Theorem 13.7), we provide a proof of this theorem.

Theorem 14.10 (Cancellation Law of Addition) If a, b and c are elements of a ring (R, +, ·) such that a + b = a + c, then b = c.

Proof. Observe that b = 0 + b = [(−a) + a] + b = (−a) + (a + b) = (−a) + (a + c) = [(−a) + a] + c = 0 + c = c.

Hence the Cancellation Law of Addition holds in (R, +, ·).

Proof Analysis Another version of the preceding proof starts with a + b = a + c (that is, a + b and a + c represent the same element of R). Adding the additive inverse −a of a to this element gives −a + (a + b) = −a + (a + c). By the associative law, (−a + a) + b = (−a + a) + c; so 0 + b = 0 + c and thus b = c.

♦

We have seen that the zero element 0 in a ring R has the property that 0 + 0 = 0. So R contains an element c such that c + c = c, namely c = 0. As a direct consequence of the Cancellation Law of Addition, however, no other element of R has this property.

Corollary 14.11 Let (R, +, ·) be a ring. If c is an element of R such that c + c = c, then c = 0.

Proof. Since c + c = c, we also have c + c = c + 0. Canceling c, we conclude from Theorem 14.10 that c = 0.

Although the defining property of the zero element of a ring (R, +, ·) involves only one of the two operations, namely addition, the zero element has a property involving multiplication that is probably not unexpected.

Theorem 14.12 For every element a in a ring (R, +, ·), we have a · 0 = 0 · a = 0.

Proof. Since the proofs that a · 0 = 0 and 0 · a = 0 are similar, we verify only the first of these. Observe that a · 0 = a · (0 + 0) = a · 0 + a · 0. The result now follows from Corollary 14.11 (with c = a · 0).

We now turn to properties of rings that involve additive inverses. Sometimes a very simple argument for a fact can be given by recognizing that −a is the only element that, when added to a, equals 0. Two examples of this appear in the following theorem.

Theorem 14.13 Let (R, +, ·) be a ring and let a, b ∈ R. Then (i) −(−a) = a and (ii) if a = −b, then b = −a.

Proof. Since a + (−a) = 0, it follows that a is the additive inverse of −a, that is, a = −(−a). This verifies (i). To establish (ii), let a = −b. Then a is the additive inverse of b and therefore a + b = 0. However, this implies that b is the additive inverse of a and therefore b = −a.

We now consider some results on the product of two elements in a ring, at least one of which is an additive inverse. Since an additive inverse is defined only in terms of addition, it seems natural that any property of such an element that involves multiplication is a consequence of the distributive laws. (This is exactly what happened in Theorem 14.12.)

Theorem 14.14

Let (R, +, ·) be a ring and let a, b ∈ R. Then (−a) · b = a · (−b) = −(a · b).

Proof. To show that (−a) · b = −(a · b), it suffices to verify that (−a) · b is the additive inverse of a · b. This can be done by showing that a · b + (−a) · b = 0. Observe that a · b + (−a) · b = [a + (−a)] · b = 0 · b = 0. The proof that a · (−b) = −(a · b) is similar.

Corollary 14.15 Let (R, +, ·) be a ring and let a, b ∈ R. Then (−a) · (−b) = a · b.

Proof. By Theorem 14.14, (−a) · (−b) = −(a · (−b)) = −(−(a · b)), and by Theorem 14.13, −(−(a · b)) = a · b. So (−a) · (−b) = a · b.

In the ring of integers we know that if a, b ∈ Z, then a + (−b) = a − b. We follow this convention in any ring. If (R, +, ·) is a ring and a, b ∈ R, then we define the subtraction of b from a by a − b = a + (−b). In particular, when a = b in R, we arrive at the seemingly obvious fact that a − b = b − b = b + (−b) = 0. We present a basic fact about subtraction.

Result 14.16 Let (R, +, ·) be a ring and let a, b, c ∈ R. Then a(b − c) = ab − ac.

Proof. Observe that a(b − c) = a[b + (−c)] = ab + a(−c). By Theorem 14.14, a(−c) = −(ac), so a(b − c) = ab + [−(ac)] = ab − ac.

14.3 Subrings

We have seen that the subset 2Z of Z is a ring when the operations of addition and multiplication used in 2Z are the same as those in Z. Since Z is already a ring, it was relatively easy to prove that 2Z is a ring. We saw that 2Z inherits properties R1, R2, R5 and R6 of a ring from Z. What we did not know automatically, and therefore needed to check, was that 2Z is closed under addition and multiplication, that the zero element of Z is also in 2Z, and that every element of 2Z has an additive inverse in 2Z. In general, then, it is much easier to prove that a subset S of a known ring R is a ring when the operations on S are the same as those defined on R. This observation leads us to an important concept in the study of rings.

Let R be a ring. If S is a subset of R such that S is a ring under the same operations defined on R, then S is called a subring of R. If R contains at least two elements, then R contains at least two subrings, namely R itself and the "zero subring" {0}. We now describe precisely which properties must be checked in order to show that a subset of a known ring R is a subring of R.

Theorem 14.17 (The Subring Test) A nonempty subset S of a ring R is a subring of R if and only if S is closed under subtraction and multiplication.

Proof. If S is a subring of R, then S is certainly closed under subtraction and multiplication. For the converse, let R be a ring and S a nonempty subset of R that is closed under subtraction and multiplication. We show that S itself is a ring. Since S ≠ ∅, there is an element s ∈ S. Since S is closed under subtraction, s − s = 0 ∈ S, that is, the zero element of R belongs to S, and therefore property R3 holds. Now let a ∈ S. Again, since S is closed under subtraction, 0 − a = 0 + (−a) = −a ∈ S. So the additive inverse of every element of S also belongs to S, and hence property R4 holds. For a, b ∈ S, we know that −b ∈ S and so a − (−b) = a + [−(−b)] = a + b ∈ S. Thus S is also closed under addition. It remains only to show that addition is commutative, that addition and multiplication are associative, and that the distributive laws hold, that is, that properties R1, R2, R5 and R6 hold in S. But all these properties are inherited from R and hence hold in S as well.
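For a finite ring such as Zn, the Subring Test can be carried out mechanically: one checks that the subset is nonempty and closed under subtraction and multiplication. The following is a minimal sketch under the assumption that residue classes are represented by the integers 0, 1, ..., n − 1; the function names is_subring and all_subrings are hypothetical.

```python
from itertools import combinations


def is_subring(subset, n):
    """Subring Test in Z_n: nonempty, closed under subtraction and multiplication."""
    S = {s % n for s in subset}
    if not S:
        return False
    return all((a - b) % n in S and (a * b) % n in S for a in S for b in S)


def all_subrings(n):
    """List every subset of Z_n that passes the Subring Test (small n only)."""
    found = []
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if is_subring(subset, n):
                found.append(set(subset))
    return found


print(is_subring({0, 2, 4}, 6))  # True
print(is_subring({0, 1}, 6))     # False, since 0 - 1 = 5 is missing
print(all_subrings(6))           # [{0}, {0, 3}, {0, 2, 4}, {0, 1, 2, 3, 4, 5}]
```

Note that addition never has to be checked separately in the code, which mirrors the proof above: closure under subtraction already forces 0, additive inverses, and sums to lie in S.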
So to show that a subset S of a ring R is a subring, we need only show that S is nonempty and that S is closed under subtraction and multiplication. We now illustrate how to use the Subring Test by presenting several examples, beginning with a new proof that 2Z is a ring under ordinary addition and multiplication.

Result 14.18 The subset 2Z of even integers is a subring of Z.

Proof. Since 0 is an even integer, 2Z is nonempty. Let a, b ∈ 2Z. Then a = 2x and b = 2y, where x, y ∈ Z. Observe that a − b = 2x − 2y = 2(x − y) and ab = (2x)(2y) = 2(2xy). Since x − y and 2xy are integers, a − b and ab belong to 2Z. By the Subring Test, 2Z is a subring of Z.

Result 14.19 The subset R × {0} = {(x, 0) : x ∈ R} of the ring R × R is a subring of R × R.

Proof. Since (0, 0) ∈ R × {0}, the set R × {0} is nonempty. Let a, b ∈ R × {0}. Then a = (x, 0) and b = (y, 0) for some x, y ∈ R. So a − b = (x, 0) − (y, 0) = (x − y, 0 − 0) = (x − y, 0) ∈ R × {0} and ab = (x, 0)(y, 0) = (xy, 0) ∈ R × {0}. By the Subring Test, R × {0} is a subring of R × R.

The next example involves a subring of the ring of complex numbers. A complex number of the form a + bi, where a, b ∈ Z and i = √−1, is called a Gaussian integer.

Result 14.20 The set G = {a + bi : a, b ∈ Z} of Gaussian integers is a subring of the ring C of complex numbers.

11 proof. Since 0 = 0 + 0i ∈ G, the set G is not empty. Let x, y ∈ G. Then x = a + bi and y = c + di, where a, b, c, d ∈ Z. Note that x − y = (a + bi) − (c + di) = ( a − c) + (b − d)i and xy = (a + bi) (c + di) = (ac − bd) + (ad + bc)i. Since a − c, b − d, ac − bd, and ad + bc are integers, x − y and xy are Gaussian integers. According to the subring test, G is a subring of C. Elements belonging to two subrings of a ring R also generate a subring of R. Result 14.21

If S1 and S2 are subrings of a ring R, then S1 ∩ S2 is also a subring of R.

Proof. Since 0 ∈ S1 and 0 ∈ S2, it follows that 0 ∈ S1 ∩ S2 and so S1 ∩ S2 is nonempty. Let a, b ∈ S1 ∩ S2. Then a, b ∈ Si for i = 1, 2. Since S1 and S2 are subrings of R, it follows that a − b ∈ Si and ab ∈ Si for i = 1, 2. So a − b ∈ S1 ∩ S2 and ab ∈ S1 ∩ S2. Hence, by the Subring Test, S1 ∩ S2 is a subring of R.

14.4 Integral Domains

The properties of the integers led us to the concept of rings and to two special types of rings, namely commutative rings and rings with unity. We have seen that if R is a ring, then a · 0 = 0 · a = 0 for all a ∈ R. This property can also be expressed in another way:

Let a, b ∈ R. If a = 0 or b = 0, then a · b = 0.    (14.3)

Of course, the converse of (14.3) also holds in the ring Z:

Let a, b ∈ Z. If a · b = 0, then a = 0 or b = 0.    (14.4)

Implication (14.4) also holds in the ring R of real numbers. In fact, (14.4) is the crucial property of real numbers needed to solve many equations. For example, if (x − 3)(x + 2) = 0, where x ∈ R, then x = 3 or x = −2. This brings us to another important concept.

A nonzero element a in a ring R is called a zero divisor of R if there is a nonzero element b in R such that ab = 0 or ba = 0. In this case, b is clearly a zero divisor of R as well. Certainly the rings Z and R have no zero divisors. Also, 2Z, Q and C are rings that have no zero divisors. However, some familiar rings do have zero divisors. In Z6, we saw that [2][3] = [6] = [0]. Since [2] ≠ [0] and [3] ≠ [0], it follows that [2] and [3] are zero divisors in Z6. The residue class [4] is also a zero divisor in Z6 since [4][3] = [0]. Consider the functions f and g in FR defined by

\[ f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q} \\ 0 & \text{if } x \in \mathbb{I} \end{cases} \qquad\text{and}\qquad g(x) = \begin{cases} 0 & \text{if } x \in \mathbb{Q} \\ 1 & \text{if } x \in \mathbb{I}. \end{cases} \]

Then (fg)(x) = f(x) · g(x) = 0 = f0(x) for all x ∈ R. So fg = f0, the zero element of FR, but f ≠ f0 and g ≠ f0. Thus f and g are zero divisors in FR. In M2(R), let


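\[ A = \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}. \]

Then

\[ AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \quad\text{while}\quad BA = \begin{bmatrix} 0 & 2 \\ 0 & 0 \end{bmatrix}. \]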
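The two products can be checked by direct computation. The following is a small sketch, taking A to have rows (0, 1), (0, 1) and B to have rows (1, 1), (0, 0); the helper matmul2 is a hypothetical name used only here.

```python
def matmul2(X, Y):
    """Product of two 2x2 matrices given as nested lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, 1], [0, 1]]
B = [[1, 1], [0, 0]]

print(matmul2(A, B))  # AB is the zero matrix
print(matmul2(B, A))  # BA is not the zero matrix
```

The computation shows both that the product of two nonzero matrices can be zero and that AB and BA need not agree.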

So A and B are zero divisors in the noncommutative ring M2(R).

From what we have seen, it is not at all unusual for a ring R to contain nonzero elements whose product is the zero element of R. Hence it is useful to distinguish the rings that contain zero divisors from those that do not. Before we continue, however, we need to address a special kind of ring. A ring R is said to be trivial if it contains only one element, necessarily the zero element. That is, R is trivial if R = {0}. If R is nontrivial, then it contains at least two elements and consequently at least one nonzero element. If R is a trivial ring, then certainly a · 0 = 0 · a = a for all a ∈ R, since a = 0 is the only element of R. So if R is trivial, then it has a unity (namely 0). A trivial ring is obviously commutative as well. On the other hand, if R is a nontrivial ring with unity, then the unity and the zero element cannot be equal.

Theorem 14.22 If R is a nontrivial ring with unity 1, then 1 ≠ 0.

Proof. Assume, to the contrary, that 1 = 0. Since R is a nontrivial ring, there exists an element a ∈ R with a ≠ 0. However, then a = a · 1 = a · 0 = 0, which is a contradiction.

A nontrivial commutative ring with unity that contains no zero divisors is called an integral domain. Therefore, the rings Z, Q, R and C are all integral domains. However, not all commutative rings with unity are integral domains. For example, we have seen that [2] and [3] are zero divisors in Z6. We have also seen that FR has zero divisors. Since (0, 1) · (1, 0) = (0, 0) in R², it follows that (0, 1) and (1, 0) are zero divisors in R². Thus, although Z6, FR and R² are all commutative rings with unity, none of them is an integral domain. Since an integral domain must be a commutative ring with unity, 2Z is not an integral domain, even though it is commutative and contains no zero divisors, because it has no unity.

We have seen that every ring satisfies the cancellation law of addition. For multiplication, the situation can be quite different. In this case, there are two possible cancellation laws.

Cancellation Laws of Multiplication: Let R be a ring and let a, b, c ∈ R.
(1) If ab = ac, where a ≠ 0, then b = c.
(2) If ac = bc, where c ≠ 0, then a = b.

Of course, if R is a commutative ring, then (1) and (2) say the same thing. In a noncommutative ring, (1) is called the left cancellation law of multiplication and (2) the right cancellation law of multiplication. In the ring Z6, [3][2] = [3][4] but [2] ≠ [4]. Hence the cancellation laws of multiplication fail in Z6. However, the cancellation laws of multiplication never fail in rings with no zero divisors.
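The connection between zero divisors and cancellation can be explored by brute force in the rings Zn. The sketch below assumes that residue classes are represented by 0, 1, . . . , n − 1; the function names are hypothetical and serve only as an illustration.

```python
def zero_divisors(n):
    """Nonzero residues a in Z_n with a*b = 0 (mod n) for some nonzero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

def left_cancellation_holds(n):
    """Check that ab = ac with a != 0 forces b = c, over all of Z_n."""
    return all(b == c
               for a in range(1, n)
               for b in range(n)
               for c in range(n)
               if (a * b) % n == (a * c) % n)

print(zero_divisors(6), left_cancellation_holds(6))
print(zero_divisors(5), left_cancellation_holds(5))
```

For n = 6 the search finds the zero divisors [2], [3] and [4] discussed above and reports that cancellation fails, while for n = 5 it finds no zero divisors and cancellation holds.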

Theorem 14.23 Let R be a ring. Then the cancellation laws of multiplication hold in R if and only if R contains no zero divisors.

Proof. First, assume that R is a ring with no zero divisors. We verify only the left cancellation law (1), since the proof of (2) is similar. Let a, b, c ∈ R, where a ≠ 0 and ab = ac. Since ab = ac, it follows that ab + (−(ac)) = ac + (−(ac)) and so ab − ac = 0. Thus a(b − c) = 0. Since R contains no zero divisors and a ≠ 0, it follows that b − c = 0 and so b = c.

For the converse, assume that R is a ring in which the cancellation laws of multiplication hold. We show that R contains no zero divisors. Let a, b ∈ R with ab = 0. We show that a = 0 or b = 0. If a = 0, then we have the desired result. Hence we may assume that a ≠ 0. Then a · b = 0 = a · 0 and so a · b = a · 0. By the left cancellation law of multiplication, the element a can be canceled in a · b = a · 0, giving b = 0. So R has no zero divisors.

Since a ring R satisfies the cancellation laws of multiplication if and only if R has no zero divisors, we have an immediate consequence of Theorem 14.23.

Corollary 14.24 Let R be a nontrivial commutative ring with unity. Then R is an integral domain if and only if the cancellation law of multiplication holds in R.

Although Z6 is not an integral domain, it is not difficult to show that Z5 is (by constructing a table for Z5 in the same way as for Z6 in Figure 7.1 of Chapter 7). Consequently, some rings Zn are integral domains while others are not. You may already have noticed a difference between Z6 and Z5, namely that 5 is prime and 6 is not. We will see shortly that this is the key observation. In the proof of the next theorem, we use the fact that if a and b are integers and p is a prime such that p | ab, then p | a or p | b. (This is discussed in detail in Chapter 11. See Corollary 11.14 in particular.)

Theorem 14.25 For an integer n ≥ 2, the ring Zn is an integral domain if and only if n is a prime number.

Proof. First we show that if Zn is an integral domain, then n is prime. Suppose that n is not a prime number. Then n = ab for some integers a and b with 1 < a < n and 1 < b < n. So [a] ≠ [0] and [b] ≠ [0] in Zn. On the other hand, [a][b] = [ab] = [n] = [0] in Zn. Therefore [a] and [b] are zero divisors in Zn and so Zn is not an integral domain.

For the converse, assume that n is a prime number. We show that Zn is an integral domain. Certainly Zn is a nontrivial commutative ring with unity. It remains only to show that Zn has no zero divisors. Let [a], [b] ∈ Zn such that [a] · [b] = [0]. Then [a][b] = [ab] = [0], which implies that ab ≡ 0 (mod n). Hence n | ab. Since n is prime, it follows by Corollary 11.14 that n | a or n | b; so [a] = [0] or [b] = [0]. Thus Zn contains no zero divisors.

14.5 Fields

We saw earlier that many fundamental properties of the integers are shared by other algebraic structures. This led us to the concept of rings. Among the many rings we have encountered are Z, 2Z, Q, R, C, Zn, FR and M2(R). However, only some of these are commutative rings with unity, namely Z, Q, R, C, Zn and FR; and only some of these are integral domains, namely Z, Q, R, C and Zp, where p is a prime. There is a property possessed by Q, R, C and Zp but not by Z that will allow us to distinguish Z from these other rings.
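Returning briefly to Theorem 14.25, the first half of its proof is constructive: a factorization n = ab with 1 < a, b < n immediately supplies zero divisors [a] and [b]. The sketch below illustrates this for small n; the helper names is_prime and witness_zero_divisors are assumptions made for this example, and trial division is used only because the values of n are small.

```python
def is_prime(n):
    """Trial-division primality check, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def witness_zero_divisors(n):
    """For composite n, return (a, b) with 1 < a, b < n and ab = n; else None."""
    for a in range(2, n):
        if n % a == 0:
            return a, n // a
    return None

for n in range(2, 13):
    w = witness_zero_divisors(n)
    print(n, "prime" if is_prime(n) else f"[{w[0]}][{w[1]}] = [0]")
```

For each composite n, the printed pair is exactly the kind of factorization used in the proof to show that Zn is not an integral domain.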


Let a be a nonzero integer. If a is neither 1 nor −1, then there is no integer b with ab = 1. On the other hand, if a is a nonzero rational number, then there is always a rational number b with ab = 1; indeed, b = 1/a ∈ Q has this property. This discussion leads us to another concept.

Let R be a ring with unity 1. A nonzero element a of R is called a unit if there is an element b in R such that ab = ba = 1. In this case, b is called a multiplicative inverse of a. (Of course, b is then also a unit, with multiplicative inverse a.) We must carefully distinguish between the terms "unity" and "unit" in a ring R. A unity in R is an element 1 ∈ R such that a · 1 = 1 · a = a for all a ∈ R. On the other hand, if R is a nontrivial ring with unity 1, then a nonzero element a ∈ R is a unit if a · b = b · a = 1 for some b ∈ R. The unity 1 is always a unit since 1 · 1 = 1. Like additive inverses, multiplicative inverses are unique in a ring.

Theorem 14.26 Let R be a nontrivial ring with unity. Then every unit in R has a unique multiplicative inverse.

Proof. Let a be a unit in R and assume that b and c are multiplicative inverses of a. Hence ab = ba = 1 and ac = ca = 1. It follows that b = b · 1 = b(ac) = (ba)c = 1 · c = c. Hence a has a unique multiplicative inverse.

For a unit a in a nontrivial ring with unity, we write a⁻¹ for the (unique) multiplicative inverse of a. The only units in Z are 1 and −1, since these are the only integers a for which there is an integer b such that ab = 1. In Q and R, however, every nonzero element is a unit. In Z6, [1] · [1] = [1] and [5] · [5] = [25] = [1], so [1] and [5] are units. Moreover, there are no other units in Z6, as can be seen from the multiplication table (Figure 7.1) in Chapter 7.

A nontrivial commutative ring with unity in which every nonzero element is a unit is called a field. Besides Q and R, the ring C of complex numbers is a field.

Result 14.27

The ring C of complex numbers is a field.

Proof. We have already seen that C is a commutative ring with unity, so it remains only to show that every nonzero complex number is a unit. Let x be a nonzero complex number. Then x = a + bi, where a, b ∈ R and a ≠ 0 or b ≠ 0. Thus a² + b² ≠ 0. We show that there is a complex number y = c + di, where c, d ∈ R, such that xy = 1 = 1 + 0i. Let

\[ c = \frac{a}{a^2+b^2} \quad\text{and}\quad d = \frac{-b}{a^2+b^2} \]

and observe that

\[ xy = (a+bi)(c+di) = (a+bi)\left(\frac{a}{a^2+b^2} + \frac{-b}{a^2+b^2}\,i\right) = \frac{1}{a^2+b^2}(a+bi)(a-bi) = \frac{1}{a^2+b^2}\left(a^2 - b^2 i^2\right) = \frac{a^2+b^2}{a^2+b^2} = 1 = 1 + 0i. \]

Therefore x has a multiplicative inverse, namely

\[ x^{-1} = \frac{a}{a^2+b^2} + \frac{-b}{a^2+b^2}\,i. \]

Proof Analysis In the proof of the preceding result, for the complex number x = a + bi, how did we know to choose y = c + di as we did so that xy = 1 = 1 + 0i? That is, how did we know what the multiplicative inverse of x is? Actually, this was not difficult. Since xy = (a + bi)(c + di) = 1 + 0i, it follows that (ac − bd) + (ad + bc)i = 1 + 0i. So

ac − bd = 1    (14.5)

and

ad + bc = 0.    (14.6)

Multiplying equation (14.5) by a and equation (14.6) by b and adding, we obtain

(a² + b²)c = a;    (14.7)

while multiplying equation (14.5) by −b and equation (14.6) by a and adding, we obtain

(a² + b²)d = −b.    (14.8)

Solving (14.7) for c and (14.8) for d, we find that

\[ c = \frac{a}{a^2+b^2} \quad\text{and}\quad d = \frac{-b}{a^2+b^2}. \]

Thus

\[ \frac{a}{a^2+b^2} + \frac{-b}{a^2+b^2}\,i \]

is the natural choice for x⁻¹, and the proof of Result 14.27 verifies that c + di is indeed x⁻¹.
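The formulas for c and d obtained from (14.7) and (14.8) are easy to test with exact arithmetic. The following sketch assumes, purely for the sake of exact computation, that a and b are rational; the names complex_inverse and multiply are hypothetical, and a complex number a + bi is represented by the pair (a, b).

```python
from fractions import Fraction

def complex_inverse(a, b):
    """Return (c, d) with c = a/(a^2+b^2) and d = -b/(a^2+b^2)."""
    a, b = Fraction(a), Fraction(b)
    norm = a * a + b * b
    if norm == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return a / norm, -b / norm

def multiply(x, y):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i, with pairs as complex numbers."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)

c, d = complex_inverse(3, 4)
print(c, d)                          # 3/25 -4/25
print(multiply((3, 4), (c, d)) == (1, 0))
```

For x = 3 + 4i, the computed inverse is 3/25 − (4/25)i, and the product check confirms that xy = 1 + 0i.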

Fields are actually special types of integral domains, as we shall now show. Theorem 14.28

Each field is an integral domain.

Proof. Let F be a field. To verify that F is an integral domain, we need only show that F contains no zero divisors. Let a be a nonzero element of F and let b ∈ F with ab = 0. Then 0 = a⁻¹ · 0 = a⁻¹(ab) = (a⁻¹a)b = 1 · b = b. Since b = 0, it follows that a is not a zero divisor.

Of course, the converse of Theorem 14.28 does not hold, since Z is an integral domain that is not a field. Under an added restriction, however, an integral domain is also a field.

Theorem 14.29 Every finite integral domain is a field.

Proof. Let D be a finite integral domain, say D = {a1, a2, . . . , an}. To show that D is a field, we need only show that every nonzero element of D has a multiplicative inverse. Let a ∈ D, where a ≠ 0, and consider the elements aa1, aa2, . . . , aan. If aai = aaj, where 1 ≤ i ≤ n and 1 ≤ j ≤ n, then ai = aj by the cancellation law of multiplication. This implies that the elements aa1, aa2, . . . , aan are distinct and so are in fact all n elements of D. Thus one of these elements is 1, that is, aak = 1 for some integer k with 1 ≤ k ≤ n. Hence ak = a⁻¹ and a has a multiplicative inverse.

We saw in Theorem 14.25 that Zn is an integral domain if and only if n is a prime number. Theorem 14.29 now gives us the following result.

Corollary 14.30

The ring Zn is a field if and only if n is a prime number.
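The argument in the proof of Theorem 14.29 can be carried out concretely in Zp: list the products of a with every element and locate the one equal to [1]. The sketch below does this with residues 0, 1, . . . , p − 1; the function names units and inverse_by_listing are hypothetical.

```python
def units(n):
    """Residues a in Z_n having some b with ab = 1 (mod n)."""
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]

def inverse_by_listing(a, p):
    """Follow the proof of Theorem 14.29 in Z_p: list a*x for all x and find 1."""
    products = [(a * x) % p for x in range(p)]
    # For prime p and a != 0 the products are distinct, so they exhaust Z_p.
    assert len(set(products)) == p
    return products.index(1)

print(units(6))                   # [1, 5]
print(units(5))                   # [1, 2, 3, 4]
print(inverse_by_listing(3, 7))   # 5, since 3 * 5 = 15 = 1 (mod 7)
```

In Z6 only [1] and [5] are units, in agreement with the discussion above, whereas in Z5 every nonzero element is a unit.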

Exercises for Chapter 14

14.1 Verify that each of the following is a ring by showing that (1) the given addition and multiplication are binary operations and (2) the six required properties are satisfied. (You may assume that both Z and R are rings under ordinary addition and multiplication.)
(a) The set kZ, where k ∈ Z and k ≥ 2, under ordinary addition and multiplication.
(b) The set Z[√2] = {a + b√2 : a, b ∈ Z} under ordinary addition and multiplication.

14.2 Show that each of the following is not a ring.
(a) The set FR under function addition and function composition.
(b) The set Z under addition defined by a ∗ b = a and ordinary multiplication.
(c) The set Z under ordinary addition and multiplication defined by a ∗ b = a.
(d) The set Z under addition defined by a ∗ b = min{a, b} and ordinary multiplication.
(e) The set Z under ordinary addition and multiplication defined by a ∗ b = min{a, b}.

14.3 For the given set S and binary operations ∗ and ◦, determine whether (S, ∗, ◦) is a ring.
(a) S = R, a ∗ b = a + b + 1, a ◦ b = ab.
(b) S = R⁺, the set of positive real numbers, a ∗ b = ab and a ◦ b = ab .

14.4 Let a be an element of a ring (R, +, ·). Complete the proof of Theorem 14.12 by proving that 0 · a = 0.

14.5 Let a and b be elements of a ring (R, +, ·). Complete the proof of Theorem 14.14 by proving that a · (−b) = −(a · b).

14.6 Let R be a ring with unity 1. Use Theorem 14.14 to prove that (−1)a = −a for all a ∈ R.

14.7 Let (R, +, ·) be a ring with the property that a² = a · a = a for all a ∈ R.
(a) Prove that every element of R is its own additive inverse; that is, prove that −a = a for all a ∈ R. [Hint: Consider (a + a)².]
(b) Prove that R is a commutative ring. [Hint: Consider (a + b)².]

14.8 Does there exist an example of a nontrivial ring (R, +, ·), that is, a ring R with at least two elements, in which addition and multiplication are the same, that is, a + b = ab for all a, b ∈ R? Justify your answer.

14.9 Determine whether each of the following subsets is a subring of the given ring.
(a) \( S = \left\{ \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} : a, b \in \mathbb{R} \right\} \) in the ring M2(R).
(b) S = {a + b∛2 + c∛4 : a, b, c ∈ Q} in the ring R.

14.10 Prove that the subset S = {[0], [2], [4]} is a subring of Z6.

14.11 Recall that a Gaussian integer is a complex number of the form a + bi, where a, b ∈ Z and i = √−1, and that the set G of Gaussian integers is a subring of the ring C of complex numbers. Define an even Gaussian integer to be a complex number of the form a + bi, where a, b ∈ 2Z. Is the set 2G of even Gaussian integers a subring of G? Justify your answer.

14.12 If S1 and S2 are subrings of a ring R, then S1 ∩ S2 is a subring of R by Result 14.21. Both 2Z and 3Z are subrings of the ring Z. Give a simple description of the subring 2Z ∩ 3Z of Z. Justify your answer.

14.13 Let \( S = \left\{ \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} : a, b \in \mathbb{R} \right\} \).
(a) Prove that S is a subring of M2(R).
(b) Prove that there is an element E ∈ S such that EA = A for all A ∈ S, but there is an element C ∈ S such that CE ≠ C.
(c) Prove that S has no unity.

14.14 Use Theorem 14.23 to prove Corollary 14.24.

14.15 Define multiplication ◦ on 2Z by a ◦ b = ab/2. Prove that (2Z, +, ◦) is an integral domain, where + is ordinary addition.

14.16 Let R be a commutative ring with unity.
(a) Prove that a unit of R is not a zero divisor in R.
(b) Determine whether the converse of (a) holds.
(c) Prove that if R is a finite ring and a is not a zero divisor of R, then a has a multiplicative inverse in R.

14.17 Define addition ∗ and multiplication ◦ on Z as follows: a ∗ b = a + b − 1 and a ◦ b = a + b + ab. Prove that (Z, ∗, ◦) is a ring with unity and answer the following questions.
(a) Is this ring commutative?
(b) Is this ring an integral domain?
(c) Is this ring a field?

14.18 Show that Z[√2] = {a + b√2 : a, b ∈ Z} is not a field.

14.19 Give an example of a ring that is not a field but has a subring that is a field.

14.20 Let R be a nontrivial commutative ring with unity. Prove that R is a field if and only if for all a, b ∈ R with a ≠ 0, the equation ax = b has a solution x ∈ R.

14.21 Prove that Q[i] = {a + bi : a, b ∈ Q} is a field.

14.22 Let (F, +, ·) be a field and let a, b ∈ F with a ≠ 0. Show that the equation a · x = b has a unique solution x ∈ F.


14.23 Give an example of each of the following (if such an example exists): (a) a finite ring (b) an infinite ring (c) a finite noncommutative ring (d) an infinite noncommutative ring (e) a ring with unity (f) a ring without unity (g) a noncommutative ring with unity (h) a noncommutative ring without unity (i) a ring that is not an integral domain (j) a finite integral domain (k) an infinite integral domain (l) an integral domain that is not a field (m) a finite field (n) an infinite field.

14.24 For the following statement S and proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect or (3) S is false and the proof is incorrect. Explain which of these occurs.
S: Let A = {n ∈ N : n = 0}. Then A is a subring of (Z, +, ·).
Proof. Let a, b ∈ S. Then a = 0 and b = 0. Since a − b = 0 − 0 = 0 ∈ A and a · b = 0 · 0 = 0 ∈ A, it follows that A is closed under subtraction and multiplication. By the Subring Test, (A, +, ·) is a subring of (Z, +, ·).

14.25 For the following statement S and proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect or (3) S is false and the proof is incorrect. Explain which of these occurs.
S: Let R be a ring with unity containing at least two elements, and let R′ = {a ∈ R : a − r is a unit for every r ∈ R}. Then R′ is a subring of R.
Proof. Let a, b ∈ R′. First consider a − b and let r ∈ R. Then (a − b) − r = a − (b + r). Since a ∈ R′ and b + r ∈ R, it follows that (a − b) − r is a unit and hence a − b ∈ R′. Next consider ab and let r ∈ R. Then ab − r = a − (a − ab + r). Since a ∈ R′ and a − ab + r ∈ R, it follows that ab − r is a unit. So ab ∈ R′. By the Subring Test, R′ is a subring of R.

Chapter 15

Proofs in Linear Algebra

A topic you may have studied in geometry, calculus or physics is vectors. You may recall vectors both in the plane R² = R × R and in three-dimensional space R³ = R × R × R. We often think of a vector as a line segment directed from the origin to some other point. Examples of this (both in the plane and in three-dimensional space) are shown in Figure 15.1.

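[Figure 15.1: (a) the vector from the origin to the point (4, 3) in the plane; (b) the vector from the origin to the point (2, 3, 4) in three-dimensional space]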

Figure 15.1 Vectors in the plane and in three-dimensional space

The vector u in the plane shown in Figure 15.1(a) (it is customary to print vectors in bold type) can be expressed as u = (4, 3), while the vector v in three-dimensional space shown in Figure 15.1(b) can be expressed as v = (2, 3, 4). The vectors i = (1, 0) and j = (0, 1) in the plane and i = (1, 0, 0), j = (0, 1, 0) and k = (0, 0, 1) in three-dimensional space will be of special interest to us.

15.1 Properties of Vectors in Three-Dimensional Space

An important feature of vectors is that they can be added (to produce another vector); another is that a vector can be multiplied by an element of some set, usually a real number (again, to produce another vector). In this context, these elements are called scalars. We now focus on vectors in three-dimensional space. Let u = (a1, b1, c1) and v = (a2, b2, c2), where ai, bi, ci (i = 1, 2) are real numbers. The sum of u and v is defined by u + v = (a1 + a2, b1 + b2, c1 + c2)


and the scalar multiple of u by a scalar (real number) α is defined by αu = (αa1, αb1, αc1). From these definitions, it follows that u = (a1, b1, c1) = (a1, 0, 0) + (0, b1, 0) + (0, 0, c1) = a1(1, 0, 0) + b1(0, 1, 0) + c1(0, 0, 1) = a1 i + b1 j + c1 k. That is, every vector u in three-dimensional space can be expressed in terms of the vectors i, j and k (and is called a linear combination of them). Listed below are eight simple but fundamental properties that follow from these definitions of vector addition and scalar multiplication in R³:

1. u + v = v + u for all u, v ∈ R³.
2. (u + v) + w = u + (v + w) for all u, v, w ∈ R³.
3. For z = (0, 0, 0), we have u + z = u for all u ∈ R³.
4. For every u ∈ R³, there exists a vector in R³, denoted by −u, such that u + (−u) = z = (0, 0, 0).
5. α(u + v) = αu + αv for all α ∈ R and all u, v ∈ R³.
6. (α + β)u = αu + βu for all α, β ∈ R and all u ∈ R³.
7. (αβ)u = α(βu) for all α, β ∈ R and all u ∈ R³.
8. 1u = u for all u ∈ R³.

These properties are fairly easy to verify, as we illustrate with Properties 1, 4 and 6. To verify Property 1, write u = (a1, a2, a3) and v = (b1, b2, b3) and observe that u + v = (a1, a2, a3) + (b1, b2, b3) = (a1 + b1, a2 + b2, a3 + b3) = (b1 + a1, b2 + a2, b3 + a3) = v + u. Here we use only the definition of vector addition in R³ and the fact that addition of real numbers is commutative.

To verify Property 4, we begin with a vector v = (b1, b2, b3) ∈ R³ and show that there is a vector in R³, denoted by −v, such that v + (−v) = z = (0, 0, 0). There is an obvious choice for −v, however, namely (−b1, −b2, −b3). Observe that v + (−b1, −b2, −b3) = (b1, b2, b3) + (−b1, −b2, −b3) = (b1 + (−b1), b2 + (−b2), b3 + (−b3)) = (0, 0, 0). Thus −v = (−b1, −b2, −b3) has the desired property. We also note that, by the definition of scalar multiplication in R³, (−1)v = ((−1)b1, (−1)b2, (−1)b3) = (−b1, −b2, −b3) = −v. We will return to this observation later.

To verify Property 6, let u = (a1, b1, c1) and observe that (α + β)u = (α + β)(a1, b1, c1) = ((α + β)a1, (α + β)b1, (α + β)c1) = (αa1 + βa1, αb1 + βb1, αc1 + βc1) = (αa1, αb1, αc1) + (βa1, βb1, βc1) = α(a1, b1, c1) + β(a1, b1, c1) = αu + βu. The proof that (α + β)u = αu + βu also depends only on some familiar properties of addition and multiplication of real numbers. Vectors in the plane can be added and multiplied by scalars in the expected manner and, in fact, also satisfy Properties 1–8.

15.2 Vector Spaces

In addition to vectors in the plane and in three-dimensional space, there are other mathematical objects that can be added and multiplied by scalars in such a way that Properties 1–8 are satisfied. In fact, these objects provide a generalization of vectors in the plane and in three-dimensional space. For this reason, we refer to these more abstract objects as vectors as well. The study of vectors is an important subject in the area of mathematics called linear algebra.

A nonempty set V, every two elements of which can be added (that is, if u, v ∈ V, then u + v is a unique vector of V) and each element of which can be multiplied by any real number (that is, if α ∈ R and v ∈ V, then αv is a unique element of V), is called a vector space (actually a vector space over R) if it satisfies the following eight properties:

1. u + v = v + u for all u, v ∈ V. (Commutative Law)
2. (u + v) + w = u + (v + w) for all u, v, w ∈ V. (Associative Law)
3. There is an element z ∈ V such that v + z = v for all v ∈ V.
4. For every v ∈ V, there is an element −v ∈ V such that v + (−v) = z.
5. α(u + v) = αu + αv for all α ∈ R and all u, v ∈ V.
6. (α + β)v = αv + βv for all α, β ∈ R and all v ∈ V.
7. (αβ)v = α(βv) for all α, β ∈ R and all v ∈ V.
8. 1v = v for all v ∈ V.

The elements of V are called vectors and the real numbers in this definition are called scalars. So if u, v ∈ V and α, β ∈ R, then both αu and βv belong to V. Hence αu + βv ∈ V. The vector αu + βv is called a linear combination of u and v.
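The eight properties can be spot-checked for vectors in R³ with exact rational arithmetic. The sketch below is only a sanity check on a few sample vectors and scalars, not a proof; the helper names add, scale and neg and the sample values are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def add(u, v):
    """Componentwise sum of two vectors given as tuples."""
    return tuple(x + y for x, y in zip(u, v))

def scale(alpha, u):
    """Scalar multiple alpha * u."""
    return tuple(alpha * x for x in u)

def neg(u):
    """The negative of u, computed as (-1)u."""
    return scale(-1, u)

z = (0, 0, 0)
vectors = [(1, 2, 3), (Fraction(1, 2), 0, -4), (0, -1, 5)]
scalars = [Fraction(3, 2), -2, 0]

for u, v, w in product(vectors, repeat=3):
    assert add(u, v) == add(v, u)                                    # Property 1
    assert add(add(u, v), w) == add(u, add(v, w))                    # Property 2
for u in vectors:
    assert add(u, z) == u                                            # Property 3
    assert add(u, neg(u)) == z                                       # Property 4
    assert scale(1, u) == u                                          # Property 8
for a, b in product(scalars, repeat=2):
    for u, v in product(vectors, repeat=2):
        assert scale(a, add(u, v)) == add(scale(a, u), scale(a, v))  # Property 5
        assert scale(a + b, u) == add(scale(a, u), scale(b, u))      # Property 6
        assert scale(a * b, u) == scale(a, scale(b, u))              # Property 7
print("all eight properties hold on the sample")
```

Each assertion reduces to a property of addition or multiplication of real numbers applied to every coordinate, which is exactly how the verifications in the text proceed.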
We can also discuss linear combinations of more than two vectors. Let u, v, w be three vectors in V and let α, β, γ be three scalars (real numbers). Then αu, βv and γw are three vectors in V and αu + βv + γw is a linear combination of u, v and w. Here we encounter a familiar situation in mathematics. Since addition in V is defined only for two vectors, what exactly does αu + βv + γw mean? There are two obvious interpretations of αu + βv + γw, namely (αu + βv) + γw (where αu and βv are added first, producing the vector αu + βv, which is then added to γw) and αu + (βv + γw). Property 2 (the associative law


of vector addition) guarantees that both interpretations yield the same vector, and consequently there is no ambiguity in writing αu + βv + γw without parentheses. More generally, if v1, v2, . . . , vn ∈ V and α1, α2, . . . , αn ∈ R, then α1v1 + α2v2 + · · · + αnvn is a linear combination of the vectors v1, v2, . . . , vn.

The element z ∈ V described in Property 3 (and used in Property 4) is called a zero vector and an element −v in Property 4 is called a negative of v. By the commutative law, we also know that z + v = v and (−v) + v = z for every vector v ∈ V. Since V satisfies Properties 1–4, the set V forms an abelian group under addition (see Chapter 13).

Although we have defined a vector space only over the set R of real numbers (and this is all we will consider), the scalars need not always be real numbers. Indeed, there are situations where complex numbers are not only suitable scalars but are the preferred scalars. There are other possibilities as well.

We have, of course, seen two examples of vector spaces, namely R² and R³ (with the addition and scalar multiplication defined above). More generally, n-space Rⁿ = R × R × · · · × R (n factors) is a vector space, where the sum of two vectors u = (a1, a2, . . . , an) and v = (b1, b2, . . . , bn) is defined by u + v = (a1 + b1, a2 + b2, . . . , an + bn) and the scalar multiple αu, where α ∈ R, is defined by αu = (αa1, αa2, . . . , αan).

We now describe two vector spaces of a very different nature. Recall that FR is the set of all functions from R to R, that is, FR = {f : f : R → R}. Hence the familiar trigonometric function f1 : R → R defined by f1(x) = sin x for all x ∈ R belongs to FR. The function f2 : R → R defined by f2(x) = 3x + x/(x² + 1) for all x ∈ R also belongs to FR. For f, g ∈ FR and a scalar (real number) α, addition and scalar multiplication are defined by

(f + g)(x) = f(x) + g(x)    for all x ∈ R,
(αf)(x) = α(f(x))    for all x ∈ R.

For the functions f1 and f2 defined above,

(f1 + f2)(x) = sin x + 3x + x/(x² + 1)    and    (5f2)(x) = 15x + 5x/(x² + 1).
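Addition and scalar multiplication in FR translate directly into functions that build new functions. The following sketch uses floating-point numbers in place of real numbers, which is an approximation; the names f_add and f_scale are hypothetical.

```python
import math

def f_add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def f_scale(alpha, f):
    """Scalar multiple: (alpha f)(x) = alpha * f(x)."""
    return lambda x: alpha * f(x)

f1 = math.sin
f2 = lambda x: 3 * x + x / (x ** 2 + 1)

s = f_add(f1, f2)       # the function f1 + f2
t = f_scale(5, f2)      # the function 5 f2

print(s(1.0))           # sin 1 + 3 + 1/2
print(t(1.0))           # 15 + 5/2 = 17.5
```

Note that f_add and f_scale return functions rather than numbers, reflecting the fact that f + g and αf are themselves elements of FR.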

Under these definitions of addition and scalar multiplication, FR is a vector space, the verification of which depends only on familiar properties of addition and multiplication of real numbers. To illustrate, we verify that FR satisfies Properties 2–5 of a vector space.

First we verify Property 2. Let f, g, h ∈ FR. Then ((f + g) + h)(x) = (f + g)(x) + h(x) = (f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x)) = f(x) + (g + h)(x) = (f + (g + h))(x) for all x ∈ R. So (f + g) + h = f + (g + h).

Second, we show that FR satisfies Property 3 of a vector space. Define the (constant) function f0 : R → R by f0(x) = 0 for all x ∈ R. We show that f0 is a zero vector for FR. For f ∈ FR, we have (f + f0)(x) = f(x) + f0(x) = f(x) + 0 = f(x) for all x ∈ R. So f + f0 = f. The function f0 is called the zero function in FR.

Next we show that FR satisfies Property 4 of a vector space. For each function f ∈ FR, define the function −f : R → R by (−f)(x) = −(f(x)) for all x ∈ R. Since (f + (−f))(x) = f(x) + (−f)(x) = f(x) + (−f(x)) = 0 = f0(x) for all x ∈ R, it follows that f + (−f) = f0 and so −f is a negative of f.

Finally, we show that FR satisfies Property 5 of a vector space. Let f, g ∈ FR and α ∈ R. Then for each x ∈ R, (α(f + g))(x) = α((f + g)(x)) = α(f(x) + g(x)) = αf(x) + αg(x) = (αf)(x) + (αg)(x) = (αf + αg)(x) and so α(f + g) = αf + αg.

We now consider a special class of real-valued functions defined on R. These functions are important in many areas of mathematics, not only linear algebra. A function p : R → R is called a polynomial function (actually a polynomial function over R) if p(x) = a₀ + a₁x + · · · + aₙxⁿ for all x ∈ R, where n is a nonnegative integer and a₀, a₁, . . . , aₙ are real numbers. The expression p(x) itself is called a polynomial in x. You may recall that if aₙ ≠ 0, then n is the degree of p(x). The zero function f0 is a polynomial function; however, it is not assigned a degree. We denote the set of all polynomial functions over R by R[x]. So R[x] ⊆ FR.

Let f, g ∈ R[x] and α ∈ R. Then f(x) = a₀ + a₁x + · · · + aₙxⁿ and g(x) = b₀ + b₁x + · · · + bₘxᵐ, where n and m are nonnegative integers and aᵢ, bⱼ ∈ R for 0 ≤ i ≤ n and 0 ≤ j ≤ m. If, say, m ≥ n, then the sum f + g is the polynomial function defined by (f + g)(x) = f(x) + g(x) = (a₀ + b₀) + (a₁ + b₁)x + · · · + (aₙ + bₙ)xⁿ + bₙ₊₁xⁿ⁺¹ + · · · + bₘxᵐ; while the scalar multiple αf of f by α is the polynomial function defined by (αf)(x) = α(f(x)) = (αa₀) + (αa₁)x + · · · + (αaₙ)xⁿ. These definitions are, of course, exactly the same as the sum of two elements of FR and the scalar multiple of an element of FR by a real number.

In fact, R[x] is itself a vector space over R under the addition and scalar multiplication just defined. For example, let f, g ∈ R[x]. Since R[x] ⊆ FR and addition in R[x] is defined in the same way as in FR, it follows that f + g = g + f; that is, Property 1 of a vector space is satisfied. By the same reasoning, Property 2 and Properties 5–8 are satisfied as well. The zero function f0 belongs to R[x] and we know that f + f0 = f for all f ∈ FR. So p + f0 = p for all p ∈ R[x]. Hence f0 is a zero vector for R[x]. For f ∈ R[x] defined by f(x) = a₀ + a₁x + · · · + aₙxⁿ, we know that −f is given by (−f)(x) = −(f(x)) = (−a₀) + (−a₁)x + · · · + (−aₙ)xⁿ. So −f ∈ R[x] and −f is a negative of f. Hence Properties 3 and 4 are also satisfied and so R[x] is a vector space over R.
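The sum and scalar multiple of polynomial functions described above can be modeled by storing a polynomial a₀ + a₁x + · · · + aₙxⁿ as its list of coefficients [a₀, a₁, . . . , aₙ]. This representation and the names poly_add, poly_scale and poly_eval are assumptions made for this sketch.

```python
from itertools import zip_longest

def poly_add(f, g):
    """Add coefficient lists, padding the shorter one with zeros."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]

def poly_scale(alpha, f):
    """Multiply every coefficient by the scalar alpha."""
    return [alpha * a for a in f]

def poly_eval(f, x):
    """Evaluate a0 + a1 x + ... + an x^n by Horner's rule."""
    result = 0
    for a in reversed(f):
        result = result * x + a
    return result

f = [1, 2]           # 1 + 2x
g = [0, -1, 0, 4]    # -x + 4x^3

print(poly_add(f, g))      # [1, 1, 0, 4]
print(poly_scale(3, f))    # [3, 6]
print(poly_eval(poly_add(f, g), 2) == poly_eval(f, 2) + poly_eval(g, 2))
```

The final line checks at x = 2 that adding coefficients agrees with adding function values, which is the sense in which addition in R[x] is the same as addition in FR.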


15.3 Matrices

Among the best-known and most important examples of vector spaces are those concerning matrices. A rectangular array of real numbers is called a matrix. The plural of "matrix" is "matrices". (In general, a matrix need not be an array of real numbers; it can be a rectangular array of elements from any prescribed set. However, we deal only with real numbers.) So a matrix has m rows and n columns for some pair m, n of positive integers and contains mn real numbers, each of which lies in some row i and column j for integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. A matrix with m rows and n columns has size m × n and is called an m × n matrix (read "m by n matrix"). Hence

\[ B = \begin{bmatrix} 1 & \sqrt{2} & -3/2 \\ 0 & -0.8 & 4 \end{bmatrix} \]

is a 2 × 3 matrix, while

\[ C = \begin{bmatrix} 4 & 1 & 9 \\ 0 & 3 & 2 \\ 7 & -1 & 1 \end{bmatrix} \]

is a 3 × 3 matrix. A general m × n matrix A is commonly written as

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}. \]

Therefore, aij denotes the element located in row i and column j of A. This is called the (i, j)-entry of A. It is often convenient to represent the matrix A by [aij] and to write A = [aij]. The ith row of A is [ai1 ai2 · · · ain] and the jth column is

\[ \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}. \]

For two matrices to be equal, they must have the same size. Furthermore, two m × n matrices A = [aij] and B = [bij] are equal, written A = B, if aij = bij for all integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. That is, A = B if A and B have the same size and corresponding entries are equal. So for

\[ A = \begin{bmatrix} 2 & x & -3 \\ 1/2 & 4 & 0 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 2 & 4/5 & -3 \\ y & 4 & 0 \end{bmatrix} \]

to be equal, we must have x = 4/5 and y = 1/2.

For positive integers m and n, let Mmn[R] denote the set of all m × n matrices whose entries are real numbers. If m = n, then the matrices are called square matrices. The set of all m × m (square) matrices whose entries are real numbers is also denoted by Mm[R]. We now define addition and scalar multiplication in Mmn[R]. Let A, B ∈ Mmn[R], where A = [aij] and B = [bij]. The sum A + B of A and B is defined as the m × n matrix [cij], where cij = aij + bij for all integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. For α ∈ R, the scalar multiple αA of A by α is defined as αA = [dij], where dij = αaij for all integers i and j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. For example, if

7

A = then

A + B =

2 −1 −3 0 4 0

e

5 −10 −1 −2 9 0

B=

3 −9 2 −2 5 0

e

(−2)A =

,

−4 2 6 0 −8 0

.
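The two entrywise definitions can be tried out on the example just given. The following is a minimal Python sketch, not part of the text: matrices are represented as lists of rows, and the helper names `mat_add` and `scalar_mul` are hypothetical choices made for this illustration.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size (lists of rows)."""
    same_size = len(A) == len(B) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(A, B)
    )
    if not same_size:
        raise ValueError("A + B is defined only for matrices of the same size")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def scalar_mul(alpha, A):
    """Scalar multiple alpha*A: every entry of A is multiplied by alpha."""
    return [[alpha * a for a in row] for row in A]


A = [[2, -1, -3], [0, 4, 0]]
B = [[3, -9, 2], [-2, 5, 0]]

print(mat_add(A, B))      # [[5, -10, -1], [-2, 9, 0]]
print(scalar_mul(-2, A))  # [[-4, 2, 6], [0, -8, 0]]
```

The size check mirrors the requirement that a sum is defined only for two matrices in the same set $M_{mn}[\mathbf{R}]$.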

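Under this addition and scalar multiplication, $M_{mn}[\mathbf{R}]$ is a vector space. To illustrate, we verify that properties 1 and 3-5 of a vector space are satisfied in $M_2[\mathbf{R}]$. Let $\alpha \in \mathbf{R}$ and let
\[ A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}. \]
Then
\begin{align*}
A + B &= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix} \\
&= \begin{bmatrix} b_{11} + a_{11} & b_{12} + a_{12} \\ b_{21} + a_{21} & b_{22} + a_{22} \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} + \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = B + A.
\end{align*}
This verifies property 1 of a vector space. We see here that verifying property 1 depends only on the definition of matrix addition and the fact that addition of real numbers is commutative.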

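Let $Z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$, often referred to as the $2 \times 2$ zero matrix. Then
\[ A + Z = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a_{11} + 0 & a_{12} + 0 \\ a_{21} + 0 & a_{22} + 0 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = A, \]
so $Z$ is a zero vector of $M_2[\mathbf{R}]$, which verifies property 3.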

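Next, let $-A = \begin{bmatrix} -a_{11} & -a_{12} \\ -a_{21} & -a_{22} \end{bmatrix}$. Consequently,
\[ A + (-A) = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} -a_{11} & -a_{12} \\ -a_{21} & -a_{22} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = Z, \]
and so $-A$ is a negative of $A$. Hence property 4 is satisfied. We also note that if $A$ is multiplied by the scalar $-1$, then we obtain
\[ (-1)A = (-1)\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} -a_{11} & -a_{12} \\ -a_{21} & -a_{22} \end{bmatrix} = -A. \]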

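Finally,
\begin{align*}
\alpha(A + B) &= \alpha \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix} = \begin{bmatrix} \alpha(a_{11} + b_{11}) & \alpha(a_{12} + b_{12}) \\ \alpha(a_{21} + b_{21}) & \alpha(a_{22} + b_{22}) \end{bmatrix} \\
&= \begin{bmatrix} \alpha a_{11} + \alpha b_{11} & \alpha a_{12} + \alpha b_{12} \\ \alpha a_{21} + \alpha b_{21} & \alpha a_{22} + \alpha b_{22} \end{bmatrix} = \begin{bmatrix} \alpha a_{11} & \alpha a_{12} \\ \alpha a_{21} & \alpha a_{22} \end{bmatrix} + \begin{bmatrix} \alpha b_{11} & \alpha b_{12} \\ \alpha b_{21} & \alpha b_{22} \end{bmatrix} \\
&= \alpha \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \alpha \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \alpha A + \alpha B.
\end{align*}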

In certain circumstances, matrices can also be multiplied, although this is not a requirement of a vector space. Let $A = [a_{ij}]$ be an $m \times n$ matrix and $B = [b_{ij}]$ an $n \times r$ matrix; that is, let $A$ and $B$ be two matrices such that the number of columns in $A$ equals the number of rows in $B$. In this case, we define the product $AB$ of $A$ and $B$ as the $m \times r$ matrix $[c_{ij}]$, where
\[ c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj} \tag{15.1} \]
for all integers $i$ and $j$ with $1 \le i \le m$ and $1 \le j \le r$. Therefore, the $(i, j)$-entry of $AB$ is obtained from the $i$th row of $A$ and the $j$th column of $B$, that is, from
\[ [a_{i1}\ a_{i2}\ \cdots\ a_{in}] \quad\text{and}\quad \begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix}, \]
by multiplying the corresponding terms of this row and column and then adding all $n$ products. The expression (15.1) is called the inner product of the $i$th row of $A$ and the $j$th column of $B$.

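For example, let
\[ A = \begin{bmatrix} 1 & -3 & 5 & 0 \\ -1 & 0 & 6 & 2 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 1 & -6 & 5 \\ 2 & 0 & 1 \\ 3 & 3 & 2 \\ -6 & 9 & 0 \end{bmatrix}. \]
Since $A$ is a $2 \times 4$ matrix and $B$ is a $4 \times 3$ matrix, the product $AB$ is defined, and in fact $AB = [c_{ij}]$ is the $2 \times 3$ matrix whose six entries are the inner products
\begin{align*}
c_{11} &= 1 \cdot 1 + (-3) \cdot 2 + 5 \cdot 3 + 0 \cdot (-6) = 10 \\
c_{12} &= 1 \cdot (-6) + (-3) \cdot 0 + 5 \cdot 3 + 0 \cdot 9 = 9 \\
c_{13} &= 1 \cdot 5 + (-3) \cdot 1 + 5 \cdot 2 + 0 \cdot 0 = 12 \\
c_{21} &= (-1) \cdot 1 + 0 \cdot 2 + 6 \cdot 3 + 2 \cdot (-6) = 5 \\
c_{22} &= (-1) \cdot (-6) + 0 \cdot 0 + 6 \cdot 3 + 2 \cdot 9 = 42 \\
c_{23} &= (-1) \cdot 5 + 0 \cdot 1 + 6 \cdot 2 + 2 \cdot 0 = 7.
\end{align*}
So
\[ AB = \begin{bmatrix} 10 & 9 & 12 \\ 5 & 42 & 7 \end{bmatrix}. \]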
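Formula (15.1) translates directly into a short computation. The sketch below is illustrative only: the function name `mat_mul` and the list-of-rows representation are assumptions made here, not notation from the text. It recomputes the six inner products of the example and refuses products whose sizes do not match.

```python
def mat_mul(A, B):
    """Product AB of an m x n matrix A and an n x r matrix B, following (15.1)."""
    n = len(B)  # number of rows of B
    if any(len(row) != n for row in A):
        raise ValueError("AB requires (columns of A) == (rows of B)")
    r = len(B[0])
    # c_ij is the inner product of row i of A with column j of B.
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
        for i in range(len(A))
    ]


A = [[1, -3, 5, 0], [-1, 0, 6, 2]]
B = [[1, -6, 5], [2, 0, 1], [3, 3, 2], [-6, 9, 0]]

print(mat_mul(A, B))  # [[10, 9, 12], [5, 42, 7]]
```

Calling `mat_mul(B, A)` with these two matrices raises `ValueError`, since $B$ has 3 columns while $A$ has 2 rows.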

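On the other hand, since the matrix $B$ above is a $4 \times 3$ matrix and $A$ is a $2 \times 4$ matrix, the product $BA$ is not defined. However, if $A$ and $B$ are any two square matrices of the same size, then $AB$ and $BA$ are both defined, although they need not be equal. For example, if
\[ A = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \]
then
\[ AB = \begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}, \quad\text{while}\quad BA = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}. \]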

15.4 Some Properties of Vector Spaces

Although we have seen several different vector spaces, these vector spaces have some properties in common in addition to the eight defining properties. Since vector spaces are defined by eight properties, it is reasonable to expect that any other properties they share are consequences of these eight. By property 3, every vector space contains at least one zero vector, and by property 4, every vector has at least one negative. We show that "at least one" can be replaced by "exactly one" in both cases. In fact, these are consequences of the fact that every vector space is a group under addition (Chapter 13). Nevertheless, we verify them here.

Theorem 15.1 Every vector space has a unique zero vector.

Proof. Let $V$ be a vector space and assume that $z$ and $z'$ are both zero vectors in $V$. Since $z$ is a zero vector, $z' + z = z'$. Also, since $z'$ is a zero vector, $z + z' = z$. So $z = z + z' = z' + z = z'$.

As a consequence of Theorem 15.1, we now know that a vector space $V$ has only one zero vector $z$ satisfying property 3 of a vector space. So we can now refer to $z$ as the zero vector of $V$.

Theorem 15.2 Let $V$ be a vector space. Then every vector in $V$ has a unique negative.

Proof. Let $v \in V$ and assume that $v_1$ and $v_2$ are both negatives of $v$. Thus $v + v_1 = z$ and $v + v_2 = z$. Hence $v_1 = v_1 + z = v_1 + (v + v_2) = (v_1 + v) + v_2 = z + v_2 = v_2$.

Proof Analysis Consider again the proof of Theorem 15.2. We wanted to show that every vector $v$ has only one negative. We assumed that there were two negatives of $v$, namely $v_1$ and $v_2$. Our goal was then to show that $v_1 = v_2$. We started with $v_1$. Our idea was to add $z$ to $v_1$, since this sum is again the vector $v_1$. Since $z$ can also be expressed as $v + v_2$, we made this substitution and so brought the vector $v_2$ into the discussion. Finally, we showed that this expression for $v_1$ is also equal to $v_2$.

There is another approach we could have tried. Since $v_1$ and $v_2$ are both negatives of $v$, it follows that $v + v_1 = z$ and $v + v_2 = z$, that is, $v + v_1 = v + v_2$. If we add the same vector to $v + v_1$ and $v + v_2$, we obtain equal vectors (since $v + v_1 = v + v_2$). A good choice of a vector to add to $v + v_1$ and $v + v_2$ is a negative of $v$ (either one of them!). This gives us the following list of equalities:
\begin{align*}
v_1 + (v + v_1) &= v_1 + (v + v_2) \\
(v_1 + v) + v_1 &= (v_1 + v) + v_2 \\
z + v_1 &= z + v_2 \\
v_1 &= v_2.
\end{align*}


Although this sequence of equalities leads to $v_1 = v_2$, it is not a particularly well-written proof. However, since our goal is to show that $v_1 = v_2$, it suggests a way to reach that goal. We begin with $v_1$ (at the bottom of the left column), proceed upward, then to the right, and then downward, producing
\[ v_1 = z + v_1 = (v_1 + v) + v_1 = v_1 + (v + v_1) = v_1 + (v + v_2) = (v_1 + v) + v_2 = z + v_2 = v_2, \]
which is similar to the proof given of Theorem 15.2 (although a bit longer). ♦

As a consequence of Theorem 15.2, we can now refer to $-v$ as the negative of $v$. Of course, the zero vector $z$ has the property that $z + z = z$. However, no other vector has this property.

Theorem 15.3 Let $V$ be a vector space. If $v$ is a vector such that $v + v = v$, then $v = z$.

Proof. Since $v + (-v) = z$, it follows that $z = v + (-v) = (v + v) + (-v) = v + (v + (-v)) = v + z = v$.

Another proof of Theorem 15.3 can be obtained by adding $-v$ to the equal vectors $v + v$ and $v$ and proceeding as in the discussion following the proof of Theorem 15.2. See also Exercise 15.6(b). We now describe two further properties involving the zero vector, both of which are consequences of Theorem 15.3.

Corollary 15.4 Let $V$ be a vector space. Then (i) $0v = z$ for every vector $v$ in $V$ and (ii) $\alpha z = z$ for every scalar $\alpha \in \mathbf{R}$.

Proof. First we prove (i). Observe that $0v = (0 + 0)v = 0v + 0v$. By Theorem 15.3, $0v = z$. Next we verify (ii). Observe that $\alpha z = \alpha(z + z) = \alpha z + \alpha z$. Again, by Theorem 15.3, $\alpha z = z$.

Hence, by Corollary 15.4, $0v = z$ for every vector $v$ in a vector space and $\alpha z = z$ for every scalar $\alpha$. That is, if $\alpha = 0$ or $v = z$, then $\alpha v = z$. We now show that the converse of this statement is also true.

Theorem 15.5

Let $V$ be a vector space. If $\alpha v = z$, then $\alpha = 0$ or $v = z$.

Proof. If $\alpha = 0$, then the statement is certainly true. So we may assume that $\alpha \ne 0$. In this case,
\[ v = 1v = \left(\frac{1}{\alpha}\,\alpha\right)v = \frac{1}{\alpha}(\alpha v) = \frac{1}{\alpha}\,z = z. \]

Another useful property is that the scalar multiple of a vector by $-1$ is the negative of that vector. In fact, we saw this earlier for two particular vector spaces, but it holds in general.

Theorem to Prove If $v$ is a vector in a vector space, then $(-1)v = -v$.

Proof Strategy Since $v$ has a unique negative, to show that $(-1)v = -v$ we need only verify that the sum of $v$ and $(-1)v$ is $z$. ♦

Theorem 15.6 If $v$ is a vector in a vector space, then $(-1)v = -v$.

Proof. Observe that $v + (-1)v = 1v + (-1)v = (1 + (-1))v = 0v = z$.

So $(-1)v = -v$.

15.5 Subspaces

We have already seen that $\mathcal{F}_{\mathbf{R}} = \{f \mid f : \mathbf{R} \to \mathbf{R}\}$ is a vector space (under function addition and scalar multiplication). Since the set $\mathbf{R}[x]$ of all polynomial functions over $\mathbf{R}$ is a subset of $\mathcal{F}_{\mathbf{R}}$, and the addition and scalar multiplication defined on $\mathbf{R}[x]$ are exactly the same as those defined on $\mathcal{F}_{\mathbf{R}}$, it was considerably easier to show that $\mathbf{R}[x]$ is a vector space. This idea can be made more general. For a vector space $V$, a subset $W$ of $V$ is called a subspace of $V$ if $W$ is a vector space under the same addition and scalar multiplication defined on $V$. So if $W$ is a subspace of a known vector space $V$, then $W$ is itself a vector space. Since every subspace contains a zero vector, $W$ must be nonempty. As we study vector spaces further, we will see that certain subspaces occur regularly, so it is useful to understand subspaces well. Moreover, some sets on which addition and scalar multiplication are defined are subsets of known vector spaces, and they can be shown to be vector spaces more easily by verifying that they are subspaces.

What must be shown to establish that a subset $W$ of a vector space $V$ is a subspace of $V$? Of course, $W$ must satisfy the eight required properties of a vector space. In addition, if $u, v \in W$, then $u + v$ must belong to $W$. We express this property by saying that $W$ is closed under addition. Furthermore, if $\alpha$ is a scalar (a real number) and $v \in W$, then $\alpha v$ must belong to $W$. We express this property by saying that $W$ is closed under scalar multiplication. Property 1 (the commutative property) requires that $u + v = v + u$ for every two vectors $u$ and $v$ in $W$. However, $u$ and $v$ also belong to $V$, and $V$ is a vector space and so satisfies property 1. Thus $u + v = v + u$, and $W$ satisfies property 1. By the same reasoning, $W$ satisfies property 2 and properties 5-8. These properties of $W$ are said to be inherited from $V$. Therefore, for a nonempty subset $W$ of a vector space $V$ to be a subspace of $V$, it is necessary that $W$ be closed under addition and scalar multiplication.
Perhaps surprisingly, these requirements are also sufficient for a nonempty subset $W$ of $V$ to be a subspace of $V$.


Theorem 15.7 (The Subspace Test) A nonempty subset $W$ of a vector space $V$ is a subspace of $V$ if and only if $W$ is closed under addition and scalar multiplication.

Proof. First, let $W$ be a subspace of $V$. Certainly, $W$ is closed under addition and scalar multiplication. For the converse, let $W$ be a nonempty subset of $V$ that is closed under addition and scalar multiplication. As noted earlier, $W$ inherits properties 1, 2 and 5-8 of a vector space from $V$. Since $W$ is nonempty and closed under addition and scalar multiplication, only properties 3 and 4 remain to be verified. Since $W \ne \emptyset$, there is a vector $v$ in $W$. Since $W$ is closed under scalar multiplication, $0v \in W$, and by Corollary 15.4(i), $0v = z$. Hence $W$ contains a zero vector (namely the zero vector of $V$) and property 3 is satisfied. Now let $w$ be an arbitrary vector of $W$. Again, since $W$ is closed under scalar multiplication, $(-1)w \in W$. By Theorem 15.6, $(-1)w = -w$, so $-w \in W$ and $w$ has a negative in $W$ (namely the negative of $w$ in $V$). Thus property 4 is also satisfied in $W$.

The proof of Theorem 15.7 reveals two important facts: if $W$ is a subspace of a vector space $V$, then $W$ contains the zero vector of $V$, and for every vector $w \in W$, its negative $-w$ also belongs to $W$. Every vector space $V$ (containing at least two elements) always has two subspaces, namely $V$ itself and the subspace consisting only of the zero vector of $V$. We now present several examples to illustrate how the Subspace Test (Theorem 15.7) can be applied to show that certain subsets of a vector space are (or are not) subspaces of that vector space. The first two examples concern the vector space $\mathbf{R}^3$.

Result 15.8

The set
\[ W = \{(a, b, 2a - b) : a, b \in \mathbf{R}\} \]
is a subspace of $\mathbf{R}^3$.

First, observe that $W$ consists of all vectors of $\mathbf{R}^3$ whose third coordinate is twice the first coordinate minus the second coordinate. For example, $W$ contains $(3, 2, 4)$ (take $a = 3$ and $b = 2$) and $(0, 0, 0)$ (take $a = b = 0$). Of course, if $W$ is to be a subspace of $\mathbf{R}^3$, then it is essential that $W$ contain the zero vector of $\mathbf{R}^3$.

Proof of Result 15.8. Since $W$ contains the zero vector of $\mathbf{R}^3$, it follows that $W \ne \emptyset$. To show that $W$ is a subspace of $\mathbf{R}^3$, we need only show that $W$ is closed under addition (that is, if $u, v \in W$, then $u + v \in W$) and that $W$ is closed under scalar multiplication (that is, if $u \in W$ and $\alpha \in \mathbf{R}$, then $\alpha u \in W$). Let $u, v \in W$ and $\alpha \in \mathbf{R}$. Then $u = (a, b, 2a - b)$ and $v = (c, d, 2c - d)$, where $a, b, c, d \in \mathbf{R}$. Then
\[ u + v = (a + c,\ b + d,\ 2(a + c) - (b + d)) \in W \]
and
\[ \alpha u = (\alpha a,\ \alpha b,\ 2(\alpha a) - (\alpha b)) \in W. \]
By the Subspace Test, $W$ is a subspace of $\mathbf{R}^3$.

Example 15.9

Determine whether
\[ W = \{(a, b, a^2 + b) : a, b \in \mathbf{R}\} \]
is a subspace of $\mathbf{R}^3$.

Solution. Taking $a = b = 1$, we see that $u = (1, 1, 2) \in W$. Now $2u = (2, 2, 4)$. Since $4 \ne 2^2 + 2$, it follows that $2u \notin W$. Since $W$ is not closed under scalar multiplication, $W$ is not a subspace of $\mathbf{R}^3$. (The subset $W$ of $\mathbf{R}^3$ is also not closed under addition, since $u + u \notin W$.) ♦

Next we consider the vector space $\mathcal{F}_{\mathbf{R}}$. We have already mentioned that $\mathbf{R}[x]$ is a subspace of $\mathcal{F}_{\mathbf{R}}$. Furthermore, the set $\mathcal{C}_{\mathbf{R}} = \{f \in \mathcal{F}_{\mathbf{R}} : f \text{ is continuous}\}$ is a subspace of $\mathcal{F}_{\mathbf{R}}$. In fact, $\mathbf{R}[x]$ is also a subspace of $\mathcal{C}_{\mathbf{R}}$.

Result 15.10

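Let $\mathcal{F}_0 = \{f \in \mathcal{F}_{\mathbf{R}} : f(1) = 0\}$. Then $\mathcal{F}_0$ is a subspace of $\mathcal{F}_{\mathbf{R}}$.

Thus the function $f_1 : \mathbf{R} \to \mathbf{R}$ defined by $f_1(x) = x - 1$ belongs to $\mathcal{F}_0$, as does the zero function $f_0 : \mathbf{R} \to \mathbf{R}$ defined by $f_0(x) = 0$ for all $x$.

Proof of Result 15.10. Since $\mathcal{F}_0$ contains the zero function, $\mathcal{F}_0 \ne \emptyset$. Let $f, g \in \mathcal{F}_0$ and $\alpha \in \mathbf{R}$. Then
\[ (f + g)(1) = f(1) + g(1) = 0 + 0 = 0 \quad\text{and}\quad (\alpha f)(1) = \alpha f(1) = \alpha \cdot 0 = 0. \]
So $f + g \in \mathcal{F}_0$ and $\alpha f \in \mathcal{F}_0$. By the Subspace Test, $\mathcal{F}_0$ is a subspace of $\mathcal{F}_{\mathbf{R}}$.

Example 15.11 Determine whether $\mathcal{F}_1 = \{f \in \mathcal{F}_{\mathbf{R}} : f(0) = 1\}$ is a subspace of $\mathcal{F}_{\mathbf{R}}$.

Solution. Observe that the functions $g, h \in \mathcal{F}_{\mathbf{R}}$ defined by $g(x) = x + 1$ and $h(x) = x^2 + 1$ belong to $\mathcal{F}_1$. However, $(g + h)(x) = g(x) + h(x) = x^2 + x + 2$ and $(g + h)(0) = 2$, so $g + h \notin \mathcal{F}_1$. Hence $\mathcal{F}_1$ is not a subspace of $\mathcal{F}_{\mathbf{R}}$. ♦

The next example concerns the vector space $M_2[\mathbf{R}]$ of $2 \times 2$ matrices with real entries.

Result 15.12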

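The set
\[ W = \left\{ \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} : a, b, c \in \mathbf{R} \right\} \]
is a subspace of $M_2[\mathbf{R}]$.

Thus $W$ consists of all $2 \times 2$ matrices whose $(1, 2)$-entry is 0. In particular, the zero matrix, all of whose entries are 0, belongs to $W$.

Proof of Result 15.12. Since $W$ contains the zero matrix, $W \ne \emptyset$. Let $A, B \in W$ and $\alpha \in \mathbf{R}$. Then
\[ A = \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} d & 0 \\ e & f \end{bmatrix}, \]
where $a, b, c, d, e, f \in \mathbf{R}$. Then
\[ A + B = \begin{bmatrix} a + d & 0 \\ b + e & c + f \end{bmatrix} \quad\text{and}\quad \alpha A = \begin{bmatrix} \alpha a & 0 \\ \alpha b & \alpha c \end{bmatrix}. \]
Hence $A + B$ and $\alpha A$ belong to $W$, and by the Subspace Test, $W$ is a subspace of $M_2[\mathbf{R}]$.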
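The closure conditions of the Subspace Test can be spot-checked numerically for the two subsets of $\mathbf{R}^3$ considered in Result 15.8 and Example 15.9. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, vectors are Python tuples of integers, and checking a few sample vectors can expose a failure of closure but can never replace the general proof.

```python
def add(u, v):
    """Coordinatewise sum of two vectors in R^3."""
    return tuple(x + y for x, y in zip(u, v))


def scale(alpha, v):
    """Scalar multiple alpha*v."""
    return tuple(alpha * x for x in v)


def in_W_158(v):
    """Membership in {(a, b, 2a - b)} from Result 15.8."""
    a, b, c = v
    return c == 2 * a - b


def in_W_159(v):
    """Membership in {(a, b, a^2 + b)} from Example 15.9."""
    a, b, c = v
    return c == a * a + b


# Sample vectors of the first set stay in the set under both operations.
u, v = (3, 2, 4), (-1, 5, -7)
print(in_W_158(add(u, v)), in_W_158(scale(-4, u)))  # True True

# The counterexample from Example 15.9: u = (1, 1, 2) is in the set, 2u is not.
w = (1, 1, 2)
print(in_W_159(w), in_W_159(scale(2, w)), in_W_159(add(w, w)))  # True False False
```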


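15.6 Spans of Vectors

In Result 15.12 we showed that the set
\[ W = \left\{ \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} : a, b, c \in \mathbf{R} \right\} \]
is a subspace of $M_2[\mathbf{R}]$. Thus, if $A \in W$, then $A = \begin{bmatrix} a & 0 \\ b & c \end{bmatrix}$ for some $a, b, c \in \mathbf{R}$. Observe also that
\begin{align*}
A = \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} &= \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ b & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & c \end{bmatrix} \\
&= a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + c \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.
\end{align*}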

In other words, $A$ (and therefore every matrix in $W$) is a linear combination of
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}. \]
So $W$ is the set of all linear combinations of these three matrices. This observation illustrates a more general situation.

Recall that if $V$ is a vector space, $v_1, v_2, \ldots, v_n \in V$ and $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbf{R}$, then every vector of the form $\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$ is a linear combination of the vectors $v_1, v_2, \ldots, v_n$. If we take $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$, then we see that the zero vector is a linear combination of $v_1, v_2, \ldots, v_n$. If instead we take $\alpha_i = 1$ for a fixed integer $i$ ($1 \le i \le n$) and all other scalars 0, then we see that each vector $v_i$ is a linear combination of $v_1, v_2, \ldots, v_n$. We have observed that every linear combination of vectors in $V$ is a vector in $V$, so the set of all such linear combinations is a subset of $V$. In fact, more can be said about this subset.

Theorem 15.13 Let $V$ be a vector space containing the vectors $v_1, v_2, \ldots, v_n$. Then the set $W$ of all linear combinations of $v_1, v_2, \ldots, v_n$ is a subspace of $V$.

Proof. Since $W$ contains the zero vector of $V$, it follows that $W \ne \emptyset$. Let $u, w \in W$ and $\alpha \in \mathbf{R}$. Then $u = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$ and $w = \beta_1 v_1 + \beta_2 v_2 + \cdots + \beta_n v_n$, where $\alpha_i, \beta_i \in \mathbf{R}$ for $1 \le i \le n$. So
\[ u + w = (\alpha_1 + \beta_1)v_1 + (\alpha_2 + \beta_2)v_2 + \cdots + (\alpha_n + \beta_n)v_n \]
and
\[ \alpha u = (\alpha\alpha_1)v_1 + (\alpha\alpha_2)v_2 + \cdots + (\alpha\alpha_n)v_n. \]
Hence both $u + w$ and $\alpha u$ are linear combinations of $v_1, v_2, \ldots, v_n$ and so belong to $W$. Thus, by the Subspace Test, $W$ is a subspace of $V$.

For vectors $v_1, v_2, \ldots, v_n$ in a vector space $V$, the subspace $W$ of $V$ consisting of all linear combinations of $v_1, v_2, \ldots, v_n$ is called the span of $v_1, v_2, \ldots, v_n$ and is denoted by $\langle v_1, v_2, \ldots, v_n \rangle$. Furthermore, $W$ is referred to as the subspace of $V$ spanned by $v_1, v_2, \ldots, v_n$. Thus, for the subspace $W$ of Result 15.12,
\[ W = \left\{ \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} : a, b, c \in \mathbf{R} \right\} = \left\langle \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\rangle. \]
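The description of $W$ as a set of linear combinations can be made concrete with a small computation. The following Python sketch is an illustration only: the function `linear_combination` and the names `E11`, `E21`, `E22` for the three matrices are hypothetical labels introduced here, and matrices are stored as lists of rows.

```python
def linear_combination(scalars, matrices):
    """Return s1*M1 + s2*M2 + ... + sn*Mn for matrices of one common size."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(s * M[i][j] for s, M in zip(scalars, matrices)) for j in range(cols)]
        for i in range(rows)
    ]


E11 = [[1, 0], [0, 0]]
E21 = [[0, 0], [1, 0]]
E22 = [[0, 0], [0, 1]]

# a = 7, b = -2, c = 5 gives the matrix with rows (a, 0) and (b, c).
print(linear_combination([7, -2, 5], [E11, E21, E22]))  # [[7, 0], [-2, 5]]
```

Whatever scalars are chosen, the $(1, 2)$-entry of the result is 0, which is why every such linear combination lies in $W$.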

We saw in Result 15.8 that $W = \{(a, b, 2a - b) : a, b \in \mathbf{R}\}$ is a subspace of $\mathbf{R}^3$. Since $(a, b, 2a - b) = a(1, 0, 2) + b(0, 1, -1)$, it follows that $W$ is spanned by the vectors $(1, 0, 2)$ and $(0, 1, -1)$, that is, $W = \langle (1, 0, 2), (0, 1, -1) \rangle$. We consider another illustration of spans of vectors.

Result 15.14 Let $f_1$, $f_2$, $f_3$, $g_2$ and $g_3$ be five functions in $\mathbf{R}[x]$ defined by $f_1(x) = 1$, $f_2(x) = 1 + x^2$, $f_3(x) = 1 + x^2 + x^4$, $g_2(x) = x^2$ and $g_3(x) = x^4$ for all $x \in \mathbf{R}$, and let $W = \langle f_1, f_2, f_3 \rangle$ and $W' = \langle f_1, g_2, g_3 \rangle$. Then $W = W'$.

Since $W$ and $W'$ are sets of vectors (polynomial functions) and our goal is to show that $W = W'$, we proceed in the standard manner by showing that each of $W$ and $W'$ is a subset of the other.

Proof of Result 15.14. First we show that $W \subseteq W'$. Let $f \in W$. Then $f = af_1 + bf_2 + cf_3$ for some $a, b, c \in \mathbf{R}$. Hence, for every $x \in \mathbf{R}$,
\[ f(x) = a \cdot 1 + b(1 + x^2) + c(1 + x^2 + x^4) = (a + b + c) + (b + c)x^2 + cx^4. \]
So $f$ is also a linear combination of $f_1$, $g_2$ and $g_3$, and $W \subseteq W'$. It remains to show that $W' \subseteq W$. Let $g \in W'$. Then $g = af_1 + bg_2 + cg_3$ for some $a, b, c \in \mathbf{R}$. Hence, for every $x \in \mathbf{R}$,
\begin{align*}
g(x) &= a \cdot 1 + b \cdot x^2 + c \cdot x^4 = (a - b) \cdot 1 + b(1 + x^2) + c \cdot x^4 \\
&= (a - b) \cdot 1 + (b - c)(1 + x^2) + c(1 + x^2 + x^4).
\end{align*}
Hence $g$ is also a linear combination of $f_1$, $f_2$ and $f_3$, and so $W' \subseteq W$.

If $V$ is a vector space containing the vectors $v_1, v_2, \ldots, v_n$, then $W = \langle v_1, v_2, \ldots, v_n \rangle$ is a subspace of $V$ (which contains $v_1, v_2, \ldots, v_n$). Possibly other subspaces of $V$ contain $v_1, v_2, \ldots, v_n$ as well. Of course, $V$ itself is a subspace of $V$ containing $v_1, v_2, \ldots, v_n$. In a certain sense, however, $W$ is the smallest subspace of $V$ containing $v_1, v_2, \ldots, v_n$.

Theorem 15.15 Let $V$ be a vector space containing the vectors $v_1, v_2, \ldots, v_n$ and let $W = \langle v_1, v_2, \ldots, v_n \rangle$. If $W'$ is a subspace of $V$ containing $v_1, v_2, \ldots, v_n$, then $W$ is a subspace of $W'$.

Proof. Since $W$ and $W'$ are subspaces of $V$, we need only show that $W \subseteq W'$. Let $v \in W$. Then $v = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n$, where $\alpha_i \in \mathbf{R}$ for $1 \le i \le n$. Since $v_i \in W'$ for $1 \le i \le n$ and $W'$ is a subspace of $V$, it follows that $v \in W'$. So $W \subseteq W'$.

There is a consequence of Theorem 15.15 that is especially useful.

Corollary 15.16 Let $V$ be a vector space spanned by the vectors $v_1, v_2, \ldots, v_n$. If $W$ is a subspace of $V$ containing $v_1, v_2, \ldots, v_n$, then $W = V$.


Proof. Since $W$ is a subspace of $V$, certainly $W \subseteq V$. By Theorem 15.15, $V \subseteq W$. Thus $W = V$.

To illustrate some of the concepts and results introduced thus far, we consider an example involving 3-space.

Result 15.17 (i) For the vectors $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$ and $\mathbf{k} = (0, 0, 1)$, we have $\mathbf{R}^3 = \langle \mathbf{i}, \mathbf{j}, \mathbf{k} \rangle$. (ii) If $w_1 = (1, 1, 0)$, $w_2 = (0, 1, 1)$ and $w_3 = (1, 1, 1)$, then $\mathbf{R}^3 = \langle w_1, w_2, w_3 \rangle$. (iii) Let $u_1 = (1, 1, 1)$, $u_2 = (1, 1, 0)$ and $u_3 = (0, 0, 1)$. Then $\langle u_1, u_2, u_3 \rangle = \langle u_1, u_2 \rangle$.

Proof. Let $W_1 = \langle \mathbf{i}, \mathbf{j}, \mathbf{k} \rangle$. Since $W_1$ is a subspace of $\mathbf{R}^3$, it follows that $W_1 \subseteq \mathbf{R}^3$. We now show that $\mathbf{R}^3 \subseteq W_1$. Let $v \in \mathbf{R}^3$. Then $v = (a, b, c)$, where $a, b, c \in \mathbf{R}$. Then
\[ v = (a, 0, 0) + (0, b, 0) + (0, 0, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}. \]
So $v$ is a linear combination of $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$, that is, $v \in W_1$. So $\mathbf{R}^3 \subseteq W_1$. This implies that $\mathbf{R}^3 = \langle \mathbf{i}, \mathbf{j}, \mathbf{k} \rangle$ and (i) is verified.

Next we verify (ii). Let $W_2 = \langle w_1, w_2, w_3 \rangle$. To verify that $\mathbf{R}^3 = W_2$, it suffices, by Corollary 15.16 and part (i) of this result, to show that each of the vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ belongs to $W_2$, that is, that each of $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ is a linear combination of $w_1$, $w_2$ and $w_3$. Since $\mathbf{i} = (1, 0, 0) = (1, 1, 1) + (-1)(0, 1, 1)$, it follows that $\mathbf{i} = 0 \cdot w_1 + (-1)w_2 + 1 \cdot w_3$. Now $\mathbf{j} = (0, 1, 0) = (1, 1, 0) + (0, 1, 1) + (-1)(1, 1, 1)$, so $\mathbf{j} = 1 \cdot w_1 + 1 \cdot w_2 + (-1)w_3$. Finally, $\mathbf{k} = (0, 0, 1) = (1, 1, 1) + (-1)(1, 1, 0)$ and thus $\mathbf{k} = (-1)w_1 + 0 \cdot w_2 + 1 \cdot w_3$. Hence $\mathbf{R}^3 = W_2$ and (ii) is established.

Finally we verify (iii). Let $W = \langle u_1, u_2 \rangle$ and $W' = \langle u_1, u_2, u_3 \rangle$. Since $W'$ is a subspace containing the vectors $u_1$ and $u_2$, it follows from Theorem 15.15 that $W \subseteq W'$. To show that $W' \subseteq W$, it suffices by Corollary 15.16 to show that each of the vectors $u_1$, $u_2$ and $u_3$ belongs to $W$, that is, that each of these three vectors is a linear combination of $u_1$ and $u_2$. This is obvious for $u_1$ and $u_2$, since $u_1 = 1 \cdot u_1 + 0 \cdot u_2$ and $u_2 = 0 \cdot u_1 + 1 \cdot u_2$. So it only remains to show that $u_3$ is a linear combination of $u_1$ and $u_2$. However, $u_3 = (0, 0, 1) = (1, 1, 1) + (-1)(1, 1, 0) = 1 \cdot u_1 + (-1)u_2$, which completes the proof.

15.7 Linear Dependence and Independence

For the vectors $u_1 = (1, 1, 0)$ and $u_2 = (0, 1, 1)$ in $\mathbf{R}^3$, the vector $u_3 = (-1, 1, 2) \in \mathbf{R}^3$ is a linear combination of $u_1$ and $u_2$, since
\[ u_3 = (-1, 1, 2) = (-1)u_1 + 2u_2 = (-1)(1, 1, 0) + 2(0, 1, 1). \]
Thus the vector $u_3$ depends linearly, so to speak, on $u_1$ and $u_2$. This linear dependence can be restated as $(-1)u_1 + 2u_2 + (-1)u_3 = (0, 0, 0)$. This type of dependence plays an important role in linear algebra.

Let $S = \{u_1, u_2, \ldots, u_m\}$ be a nonempty set of vectors in a vector space $V$. The set $S$ is called linearly dependent if there exist scalars $c_1, c_2, \ldots, c_m$, not all 0, such that $c_1u_1 + c_2u_2 + \cdots + c_mu_m = z$. If $S$ is not linearly dependent, then $S$ is called linearly independent. For $S = \{u_1, u_2, \ldots, u_m\}$, we also say that the vectors $u_1, u_2, \ldots, u_m$ are linearly dependent or linearly independent according to whether the set $S$ is linearly dependent or linearly independent. Consequently, the vectors $u_1, u_2, \ldots, u_m$ are linearly independent if, whenever $c_1u_1 + c_2u_2 + \cdots + c_mu_m = z$, then $c_i = 0$ for every $i$ ($1 \le i \le m$). We now consider some examples.

Example 15.18 Determine whether $S = \{(1, 1, 1), (1, 1, 0), (0, 1, 1)\}$ is a linearly independent set of vectors in $\mathbf{R}^3$.

Solution. Let $a$, $b$ and $c$ be scalars such that
\[ a(1, 1, 1) + b(1, 1, 0) + c(0, 1, 1) = (0, 0, 0). \]
By scalar multiplication and vector addition, we have $(a + b,\ a + b + c,\ a + c) = (0, 0, 0)$, which results in the following system of equations:
\begin{align*}
a + b &= 0 \\
a + b + c &= 0 \\
a + c &= 0.
\end{align*}
Subtracting the first equation from the second, we obtain $c = 0$. Substituting $c = 0$ into the third equation, we obtain $a = 0$. Substituting $a = 0$ and $c = 0$ into the second equation, we obtain $b = 0$. Hence $a = b = c = 0$ and $S$ is linearly independent. ♦

Example 15.19

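Determine whether
\[ S = \left\{ \begin{bmatrix} 2 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix}, \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \right\} \]
is a linearly independent set of vectors in $M_2[\mathbf{R}]$.

Solution. Again, let $a$, $b$ and $c$ be scalars such that
\[ a \begin{bmatrix} 2 & 1 \\ 1 & 0 \end{bmatrix} + b \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix} + c \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]
By scalar multiplication and matrix addition, we have
\[ \begin{bmatrix} 2a + c & a + b + c \\ a + b + c & 2b + c \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]
This results in the system of equations
\begin{align*}
2a + c &= 0 \\
a + b + c &= 0 \\
2b + c &= 0,
\end{align*}
where the second equation actually occurs twice. From the first and third equations, it follows that $c = -2a$ and $c = -2b$, and so $a = b = -c/2$. Substituting these values for $a$ and $b$ in the second equation gives $(-c/2) + (-c/2) + c = -c + c = 0$; that is, the second equation is satisfied for every value of $c$. Hence, if we let $c = -2$, say, then $a = b = 1$ and
\[ 1 \cdot \begin{bmatrix} 2 & 1 \\ 1 & 0 \end{bmatrix} + 1 \cdot \begin{bmatrix} 0 & 1 \\ 1 & 2 \end{bmatrix} + (-2) \cdot \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]
Hence $S$ is a linearly dependent set of vectors. ♦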
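The conclusions of Examples 15.18 and 15.19 can be cross-checked by machine. The sketch below does not follow the substitution method used in the solutions; instead it relies on a standard fact not proved in this chapter, namely that $m$ vectors are linearly independent exactly when Gaussian elimination on the $m$ rows they form leaves $m$ nonzero rows. The function names are hypothetical, exact arithmetic is done with Python's `fractions.Fraction`, and each $2 \times 2$ matrix is flattened to a 4-tuple of its entries (an assumption that is harmless here because addition and scalar multiplication are entrywise).

```python
from fractions import Fraction


def rank(rows):
    """Number of nonzero rows left after Gaussian elimination (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    ncols = len(M[0]) if M else 0
    for col in range(ncols):
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r


def is_linearly_independent(vectors):
    """True when the given list of vectors is linearly independent."""
    return rank(vectors) == len(vectors)


S_15_18 = [(1, 1, 1), (1, 1, 0), (0, 1, 1)]
S_15_19 = [(2, 1, 1, 0), (0, 1, 1, 2), (1, 1, 1, 1)]  # matrices read row by row

print(is_linearly_independent(S_15_18))  # True
print(is_linearly_independent(S_15_19))  # False
```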

We now show that a well-known set of polynomial functions is linearly independent.

Theorem to Prove For every nonnegative integer $n$, the set $S_n = \{1, x, x^2, \ldots, x^n\}$ is linearly independent in $\mathbf{R}[x]$.

Proof Strategy The elements of $S_n$ are actually functions, say $S_n = \{f_0, f_1, f_2, \ldots, f_n\}$, where $f_i : \mathbf{R} \to \mathbf{R}$ is defined by $f_i(x) = x^i$ for $0 \le i \le n$ and for all $x \in \mathbf{R}$. To show that $S_n$ is linearly independent, we are required to show that if $c_0 \cdot 1 + c_1 x + c_2 x^2 + \cdots + c_n x^n = 0$, where $c_i \in \mathbf{R}$ for $0 \le i \le n$, then $c_i = 0$ for all $i$. The question, of course, is how to do this. By choosing various values of $x$, we could produce a system of equations to solve. For example, we could begin by letting $x = 0$, obtaining $c_0 \cdot 1 + c_1 \cdot 0 + c_2 \cdot 0 + \cdots + c_n \cdot 0 = 0$, so $c_0 = 0$. Hence $c_1 x + c_2 x^2 + \cdots + c_n x^n = 0$. Letting $x = 1$ and $x = 2$, we have $c_1 + c_2 + \cdots + c_n = 0$ and $2c_1 + 2^2 c_2 + \cdots + 2^n c_n = 0$. We could eventually obtain a system of $n$ equations in $n$ unknowns, but this appears complicated.

On the other hand, the statement of the theorem suggests another approach. When a theorem is stated as "for every nonnegative integer $n$", we often think of applying induction. The main challenge in such a proof would be to show that if $\{1, x, x^2, \ldots, x^k\}$ is linearly independent, where $k \ge 0$, then $\{1, x, x^2, \ldots, x^{k+1}\}$ is linearly independent. So we would be dealing with the equation $c_0 \cdot 1 + c_1 x + c_2 x^2 + \cdots + c_{k+1} x^{k+1} = 0$ for $c_i \in \mathbf{R}$, $0 \le i \le k + 1$, and attempting to show that $c_i = 0$ for all $i$ ($0 \le i \le k + 1$). We have already noted that showing $c_0 = 0$ is not difficult. To make use of the induction hypothesis, we need a linear combination of the polynomials $1, x, x^2, \ldots, x^k$. One idea for obtaining this is to take the derivative of $c_0 \cdot 1 + c_1 x + c_2 x^2 + \cdots + c_{k+1} x^{k+1}$. ♦

Theorem 15.20 For every nonnegative integer $n$, the set $S_n = \{1, x, x^2, \ldots, x^n\}$ is linearly independent in $\mathbf{R}[x]$.

Proof. We proceed by induction. For $n = 0$, we are required to show that $S_0 = \{1\}$ is linearly independent in $\mathbf{R}[x]$. Let $c$ be a scalar such that $c \cdot 1 = 0$. Then certainly $c = 0$, and so $S_0$ is linearly independent. Assume that $S_k = \{1, x, x^2, \ldots, x^k\}$ is linearly independent in $\mathbf{R}[x]$, where $k$ is a nonnegative integer. We show that $S_{k+1} = \{1, x, x^2, \ldots, x^{k+1}\}$ is linearly independent in $\mathbf{R}[x]$. Let $c_0, c_1, \ldots, c_{k+1}$ be scalars such that
\[ c_0 \cdot 1 + c_1 x + c_2 x^2 + \cdots + c_{k+1} x^{k+1} = 0 \tag{15.2} \]
for all $x \in \mathbf{R}$. Letting $x = 0$ in (15.2), we see that $c_0 = 0$. Now taking the derivatives of both sides of (15.2), we obtain
\[ c_1 \cdot 1 + 2c_2 x + 3c_3 x^2 + \cdots + (k + 1)c_{k+1} x^k = 0 \]
for all $x \in \mathbf{R}$. By the induction hypothesis, $S_k$ is a linearly independent set of vectors in $\mathbf{R}[x]$ and so $c_1 = 2c_2 = 3c_3 = \cdots = (k + 1)c_{k+1} = 0$, which implies that $c_1 = c_2 = c_3 = \cdots = c_{k+1} = 0$. Since $c_0 = 0$ as well, it follows that $S_{k+1}$ is linearly independent.

Proof Analysis Before proceeding further, it is important that we understand the proof just given. The proof began by showing that $S_0 = \{1\}$ is linearly independent. Here $S_0$ consists of the single constant polynomial function $f$ defined by $f(x) = 1$ for all $x \in \mathbf{R}$. Let $c$ be a scalar (a real number) such that $cf = f_0$, where $f_0$ is the zero polynomial function defined by $f_0(x) = 0$ for all $x \in \mathbf{R}$. Thus, for every $x \in \mathbf{R}$, $(cf)(x) = f_0(x) = 0$, that is,
\[ (cf)(x) = c\,f(x) = c \cdot 1 = 0 = f_0(x), \]
and so $c = 0$. ♦

We now consider a result for a general vector space.

Result 15.21 If $v_1$, $v_2$ and $v_3$ are linearly independent vectors in a vector space $V$, then $v_1$, $v_1 + v_2$ and $v_1 + v_2 + v_3$ are also linearly independent in $V$.

Proof. Let $a$, $b$ and $c$ be scalars such that
\[ a v_1 + b(v_1 + v_2) + c(v_1 + v_2 + v_3) = z. \]
From this, we have $(a + b + c)v_1 + (b + c)v_2 + c v_3 = z$. Since $v_1$, $v_2$ and $v_3$ are linearly independent, $a + b + c = b + c = c = 0$, which implies that $a = b = c = 0$. Hence $v_1$, $v_1 + v_2$ and $v_1 + v_2 + v_3$ are linearly independent.

Let $S = \{v_1, v_2, \ldots, v_n\}$ be a set of $n$ vectors, where $n \in \mathbf{N}$, and let $S'$ be a nonempty subset of $S$. Then $|S'| = m$ for some integer $m$ with $1 \le m \le n$. Since the order in which the elements of $S$ are listed is irrelevant, these elements can be rearranged and relabeled if necessary so that $S' = \{v_1, v_2, \ldots, v_m\}$. This fact is sometimes very useful.

Theorem 15.22 Let $S$ be a finite nonempty set of vectors in a vector space $V$. If $S$ is linearly independent in $V$ and $S'$ is a nonempty subset of $S$, then $S'$ is also linearly independent in $V$.

Proof. We may assume that $S' = \{v_1, v_2, \ldots, v_m\}$ and $S = \{v_1, v_2, \ldots, v_m, v_{m+1}, \ldots, v_n\}$, where $1 \le m \le n$. If $m = n$, then $S' = S$ and certainly $S'$ is linearly independent. So we may assume that $m < n$. Let $c_1, c_2, \ldots, c_m$ be scalars such that $c_1 v_1 + c_2 v_2 + \cdots + c_m v_m = z$. But then
\[ c_1 v_1 + c_2 v_2 + \cdots + c_m v_m + 0v_{m+1} + 0v_{m+2} + \cdots + 0v_n = z. \tag{15.3} \]
Since $S$ is linearly independent, all scalars in (15.3) are 0. In particular, $c_1 = c_2 = \cdots = c_m = 0$, which implies that $S'$ is linearly independent.

We can restate Theorem 15.22 as follows: Let $V$ be a vector space and let $S$ and $S'$ be nonempty finite subsets of $V$ with $S' \subseteq S$. If $S$ is linearly independent, then $S'$ is linearly independent. The contrapositive of this implication is: if $S'$ is linearly dependent, then $S$ is linearly dependent.


Although we have only discussed linear independence and linear dependence in the context of finite sets of vectors, these concepts also exist for infinite sets of vectors. An infinite set of vectors in a vector space V is linearly independent if every nonempty finite subset of S is linearly independent. Correspondingly, an infinite set S of vectors in a vector space V is linearly dependent if a nonempty finite subset of S is linearly dependent. Every instance of a (finite) set S of linearly dependent vectors in a vector space V leads to an infinite set T of linearly dependent vectors; that is, every infinite subset T of V such that S ⊆ T is linearly dependent. But what is an example of a vector space containing infinitely many linearly independent vectors? We now provide such an example. Result 15.23

The set T = {1, x, x2 , . 🇧🇷 .} is linearly independent in R[x].

Check. Let S be a nonempty finite subset of T . Then there is the largest nonnegative integer m with xm ∈ S. So S ⊆ Sm = {1, x, x2 , . 🇧🇷 🇧🇷 , xm }. By Theorem 15.20, Sm is linearly independent in R[x] and by Theorem 15.22, S is linearly independent. Consequently, T in R[x] is linearly independent. 15.8 Linear Transformations We have seen that many properties of a vector space V , subspaces of V , the span of a set of vectors in V , and linear independence and linear dependence of vectors in V deal with a common concept: linear combinations of vectors. Perhaps this is not unexpected in a field of mathematics called linear algebra. There are cases where two vector spaces V and V are so closely related that for every vector w ∈ V there is a corresponding vector w ∈ V such that the vector corresponding to αu + βv in V is equal to αu + βv in V is. Such an association describes a function from V to V . In particular, a function f : V → V is said to conserve linear combinations of vectors if f(αu + βv) = αf(u) + βf(v) for all u, v ∈ V and all two scalars α and β. If f : V → V has the property that f(u + v) = f(u) + f(v) for all u, v ∈ V , then f obtains the addition; while if f (αu) = αf (u) for any u ∈ V and any scalar α, then f is said to preserve scalar multiplication. Let z be the zero vector of V. If f : V → V conserves linear combinations and u, v ∈ V , then f (u + v) = f (1 u + 1 v) = 1 f (u) + 1 f (v) = f ( u) + f (v) and f (αu) = f (αu + 0v) = αf (u) + 0f (v) = αf (u) + z = αf (u). So if f : V → V is a function that preserves linear combinations, then f also preserves scalar addition and multiplication. Conversely, suppose that f : V → V is a function that preserves both addition and scalar multiplication. For u, v ∈ V and scalars α and β we have f (αu + βv) = f (αu) + f (βv) = αf (u) + βf (v), i.e. H. f preserves linear combinations. Since functions that preserve linear combinations are so important in linear algebra, they are given a special name. 
Let V and V′ be vector spaces. A function T : V → V′ is called a linear transformation if it preserves both addition and scalar multiplication, that is, if it satisfies the following conditions:

1. T(u + v) = T(u) + T(v)
2. T(αv) = αT(v)

for all u, v ∈ V and all α ∈ R.

There are some points related to these conditions that need to be addressed and that may not be obvious. Condition 1 states that T(u + v) = T(u) + T(v) for every two vectors u and v of V. Hence the addition indicated in T(u + v) takes place in V; on the other hand, since T(u) and T(v) are vectors in V′, the addition indicated in T(u) + T(v) takes place in V′. Furthermore, condition 2 states that T(αv) = αT(v) for every vector v in V and every scalar α. By the same reasoning, the scalar multiplication indicated in T(αv) takes place in V, while the scalar multiplication in αT(v) takes place in V′. From what we have seen, every linear transformation preserves linear combinations of vectors (hence the name). Let's consider an example of a linear transformation.

Result 15.24 The function T : R^3 → R^2 defined by T((a, b, c)) = T(a, b, c) = (2a + c, 3c − b) is a linear transformation.

Before proving Result 15.24, let's make sure we understand what this function does. For example, T(1, 2, 3) = (5, 7) and T(1, −6, −2) = (0, 0), while T(0, 0, 0) = (0, 0) as well. Now we show that T is a linear transformation.

Proof of Result 15.24. Let u, v ∈ R^3 and α ∈ R. Then u = (a, b, c) and v = (d, e, f) for some a, b, c, d, e, f ∈ R. So

T(u + v) = T(a + d, b + e, c + f) = (2(a + d) + (c + f), 3(c + f) − (b + e)) = (2a + c, 3c − b) + (2d + f, 3f − e) = T(a, b, c) + T(d, e, f) = T(u) + T(v)

and

T(αu) = T(α(a, b, c)) = T(αa, αb, αc) = (2αa + αc, 3αc − αb) = α(2a + c, 3c − b) = αT(u),

as desired.
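The two conditions verified in the proof of Result 15.24 can also be spot-checked numerically. The following is only a sketch: the sample vectors and scalars are arbitrary choices made for illustration, and a finite check of this kind can refute linearity but cannot replace the proof above. The helper names (`add`, `scale`, `preserves_addition`, `preserves_scaling`) are hypothetical, not part of the text.

```python
from fractions import Fraction
from itertools import product


def T(vec):
    """The map of Result 15.24: (a, b, c) -> (2a + c, 3c - b)."""
    a, b, c = vec
    return (2 * a + c, 3 * c - b)


def add(u, v):
    """Componentwise addition of two tuples of equal length."""
    return tuple(x + y for x, y in zip(u, v))


def scale(alpha, u):
    """Scalar multiple of a tuple."""
    return tuple(alpha * x for x in u)


def preserves_addition(f, vectors):
    """Check f(u + v) == f(u) + f(v) on every pair of sample vectors."""
    return all(f(add(u, v)) == add(f(u), f(v))
               for u, v in product(vectors, repeat=2))


def preserves_scaling(f, vectors, scalars):
    """Check f(alpha u) == alpha f(u) on the sample vectors and scalars."""
    return all(f(scale(alpha, u)) == scale(alpha, f(u))
               for alpha in scalars for u in vectors)


# Arbitrary sample data (an assumption made for this illustration).
samples = [(1, 2, 3), (1, -6, -2), (0, 0, 0), (Fraction(1, 2), -4, 7)]
scalars = [0, 1, -3, Fraction(2, 5)]

print(T((1, 2, 3)))                            # (5, 7)
print(preserves_addition(T, samples))          # True
print(preserves_scaling(T, samples, scalars))  # True
```

Exact rational arithmetic (`Fraction`) is used so that the equality tests are not affected by floating-point rounding.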

Sometimes vectors in R^3 are written as "column vectors", that is, as

\[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} \]

instead of (a, b, c) or the "row vector" [a b c]. In this case, note that the linear transformation T : R^3 → R^2 defined by T(a, b, c) = (2a + c, 3c − b) can be written as

\[ T(a, b, c) = T\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} 2 & 0 & 1 \\ 0 & -1 & 3 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 2a + c \\ -b + 3c \end{bmatrix}. \]

If we let

\[ v = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 2 & 0 & 1 \\ 0 & -1 & 3 \end{bmatrix}, \]

then this linear transformation can be defined in terms of the matrix A, namely

T(v) = Av.
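To see that the matrix description T(v) = Av agrees with the formula T(a, b, c) = (2a + c, 3c − b), the product can be computed directly. This is a minimal sketch in plain Python; the helper `mat_vec` and the test vectors are assumptions introduced here for illustration.

```python
# The 2 x 3 matrix A associated with T(a, b, c) = (2a + c, 3c - b).
A = [[2, 0, 1],
     [0, -1, 3]]


def mat_vec(matrix, vec):
    """Multiply an m x n matrix (a list of rows) by a length-n vector."""
    return tuple(sum(entry * x for entry, x in zip(row, vec))
                 for row in matrix)


def T(vec):
    """The formula form of the same transformation."""
    a, b, c = vec
    return (2 * a + c, 3 * c - b)


# Arbitrary test vectors (an assumption made for this illustration).
test_vectors = [(1, 2, 3), (1, -6, -2), (0, 0, 0), (-4, 5, 9)]
agree = all(mat_vec(A, v) == T(v) for v in test_vectors)

print(mat_vec(A, (1, 2, 3)))  # (5, 7)
print(agree)                  # True
```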

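In general, if A is an m × n matrix, then the function T : R^n → R^m defined by T(u) = Au for an n × 1 column vector u ∈ R^n is a linear transformation. For example, consider the 3 × 2 matrix

\[ A = \begin{bmatrix} 1 & -2 \\ 3 & -1 \\ 2 & 5 \end{bmatrix}. \]

For

\[ u = \begin{bmatrix} a \\ b \end{bmatrix}, \quad v = \begin{bmatrix} c \\ d \end{bmatrix}, \]

and α ∈ R,

\[
\begin{aligned}
T(u + v) &= T\left(\begin{bmatrix} a + c \\ b + d \end{bmatrix}\right)
 = \begin{bmatrix} 1 & -2 \\ 3 & -1 \\ 2 & 5 \end{bmatrix} \begin{bmatrix} a + c \\ b + d \end{bmatrix}
 = \begin{bmatrix} a + c - 2b - 2d \\ 3a + 3c - b - d \\ 2a + 2c + 5b + 5d \end{bmatrix} \\
&= \begin{bmatrix} a - 2b \\ 3a - b \\ 2a + 5b \end{bmatrix} + \begin{bmatrix} c - 2d \\ 3c - d \\ 2c + 5d \end{bmatrix}
 = T\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) + T\left(\begin{bmatrix} c \\ d \end{bmatrix}\right)
 = T(u) + T(v)
\end{aligned}
\]

and

\[
\begin{aligned}
T(\alpha u) &= T\left(\begin{bmatrix} \alpha a \\ \alpha b \end{bmatrix}\right)
 = \begin{bmatrix} 1 & -2 \\ 3 & -1 \\ 2 & 5 \end{bmatrix} \begin{bmatrix} \alpha a \\ \alpha b \end{bmatrix}
 = \begin{bmatrix} \alpha a - 2\alpha b \\ 3\alpha a - \alpha b \\ 2\alpha a + 5\alpha b \end{bmatrix} \\
&= \alpha \begin{bmatrix} a - 2b \\ 3a - b \\ 2a + 5b \end{bmatrix}
 = \alpha T\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) = \alpha T(u).
\end{aligned}
\]

Thus T : R^2 → R^3 is a linear transformation. The proof for a general m × n matrix is similar.

As a further illustration of a linear transformation, consider a well-known function from R[x] to itself.

Result 15.25 The function D (for differentiation) from R[x] to R[x] defined by

D(c_0 + c_1 x + c_2 x^2 + ... + c_n x^n) = c_1 + 2c_2 x + ... + n c_n x^(n−1)

is a linear transformation.

Proof. Let f, g ∈ R[x], where f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_r x^r and g(x) = b_0 + b_1 x + b_2 x^2 + ... + b_s x^s and, say, r ≤ s. Then

\[
\begin{aligned}
D(f(x) + g(x)) &= D\big((a_0 + a_1 x + \cdots + a_r x^r) + (b_0 + b_1 x + \cdots + b_s x^s)\big) \\
&= D\big((a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_r + b_r)x^r + b_{r+1}x^{r+1} + \cdots + b_s x^s\big) \\
&= (a_1 + b_1) + \cdots + r(a_r + b_r)x^{r-1} + (r+1)b_{r+1}x^r + \cdots + s b_s x^{s-1} \\
&= (a_1 + 2a_2 x + \cdots + r a_r x^{r-1}) + (b_1 + 2b_2 x + \cdots + s b_s x^{s-1}) \\
&= D(f(x)) + D(g(x))
\end{aligned}
\]

and

\[
\begin{aligned}
D(\alpha f(x)) &= D(\alpha a_0 + \alpha a_1 x + \alpha a_2 x^2 + \cdots + \alpha a_r x^r) \\
&= \alpha a_1 + 2\alpha a_2 x + \cdots + r\alpha a_r x^{r-1} \\
&= \alpha(a_1 + 2a_2 x + \cdots + r a_r x^{r-1}) = \alpha D(f(x)).
\end{aligned}
\]

Since D preserves addition and scalar multiplication, it is a linear transformation.

There is a special kind of function from a vector space to itself that is always a linear transformation.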
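Returning briefly to the differentiation map D of Result 15.25, its action on coefficients is easy to sketch in code. The representation below is an assumption made for illustration only: a polynomial c_0 + c_1 x + ... + c_n x^n is stored as the list [c_0, c_1, ..., c_n], and the empty list stands for the zero polynomial. The sample polynomials are arbitrary.

```python
from itertools import zip_longest


def D(coeffs):
    """Differentiate c0 + c1 x + ... + cn x^n, given as [c0, c1, ..., cn].

    The coefficient of x^(k-1) in the derivative is k * ck.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_add(f, g):
    """Add two coefficient lists, padding the shorter one with zeros."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]


def poly_scale(alpha, f):
    """Multiply every coefficient by the scalar alpha."""
    return [alpha * a for a in f]


# Arbitrary sample polynomials (an assumption made for this illustration).
f = [1, 0, 3]       # 1 + 3x^2
g = [2, -1, 0, 4]   # 2 - x + 4x^3
alpha = 5

print(D(f))  # [0, 6], that is, 6x
print(D(poly_add(f, g)) == poly_add(D(f), D(g)))        # True
print(D(poly_scale(alpha, f)) == poly_scale(alpha, D(f)))  # True
```

As with any finite check, this only illustrates the two identities established in the proof; it does not prove them.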

Result 15.26 Let V be a vector space over the set R of real numbers. For c ∈ R, the function T : V → V defined by T(v) = cv is a linear transformation.

Proof. Let u, w ∈ V. Then T(u + w) = c(u + w) = cu + cw = T(u) + T(w); while for α ∈ R, T(αu) = c(αu) = (cα)u = (αc)u = α(cu) = αT(u). Hence T is a linear transformation.

For c = 1, the function T defined in Result 15.26 is the identity function; while for c = 0, the function T maps every vector into the zero vector. Consequently, both of these functions are linear transformations. We now consider functions involving other vector spaces. For a function f ∈ F_R and a real number r, we define the function f + r by (f + r)(x) = f(x) + r for all x ∈ R.

Example 15.27 Let r be a nonzero real number. Prove or disprove: The function T : F_R → F_R defined by T(f) = f + r is a linear transformation.

Solution. Let f, g ∈ F_R. Note that T(f + g) = (f + g) + r, while T(f) + T(g) = (f + r) + (g + r) = (f + g) + 2r. Since r ≠ 0, it follows that T(f + g) ≠ T(f) + T(g). Hence T is not a linear transformation. ♦

Example 15.28

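Let T : M_2(R) → M_2(R) be the function defined by

\[ T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} ad & 0 \\ 0 & bc \end{bmatrix}. \]

Prove or disprove: T is a linear transformation.

Solution. Since

\[ T\left(2\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\right) = T\left(\begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}\right) = \begin{bmatrix} 4 & 0 \\ 0 & 4 \end{bmatrix} \]

and

\[ 2T\left(\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\right) = 2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}, \]

T is not a linear transformation. ♦

Example 15.29 The function T : M_2(R) → M_2(R) is defined by

\[ T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} a & a \\ c & c \end{bmatrix}. \]

Prove or disprove: T is a linear transformation.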

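Solution. Let

\[ \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix}, \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} \in M_2(R) \]

and α ∈ R. Then

\[
\begin{aligned}
T\left(\begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} + \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix}\right)
&= T\left(\begin{bmatrix} a_1 + a_2 & b_1 + b_2 \\ c_1 + c_2 & d_1 + d_2 \end{bmatrix}\right)
 = \begin{bmatrix} a_1 + a_2 & a_1 + a_2 \\ c_1 + c_2 & c_1 + c_2 \end{bmatrix} \\
&= \begin{bmatrix} a_1 & a_1 \\ c_1 & c_1 \end{bmatrix} + \begin{bmatrix} a_2 & a_2 \\ c_2 & c_2 \end{bmatrix}
 = T\left(\begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix}\right) + T\left(\begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix}\right);
\end{aligned}
\]

while

\[
\begin{aligned}
T\left(\alpha\begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix}\right)
&= T\left(\begin{bmatrix} \alpha a_1 & \alpha b_1 \\ \alpha c_1 & \alpha d_1 \end{bmatrix}\right)
 = \begin{bmatrix} \alpha a_1 & \alpha a_1 \\ \alpha c_1 & \alpha c_1 \end{bmatrix} \\
&= \alpha\begin{bmatrix} a_1 & a_1 \\ c_1 & c_1 \end{bmatrix}
 = \alpha T\left(\begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix}\right).
\end{aligned}
\]

Since T preserves addition and scalar multiplication, T is a linear transformation.

♦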
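The contrast between Examples 15.28 and 15.29 can be reproduced with a short computation. The sketch below stores a 2 × 2 matrix as a list of two rows; the function names `T_prod` and `T_col` and the sample matrices `m1`, `m2` are hypothetical labels chosen here, not notation from the text.

```python
def T_prod(m):
    """Example 15.28: [[a, b], [c, d]] -> [[ad, 0], [0, bc]]."""
    (a, b), (c, d) = m
    return [[a * d, 0], [0, b * c]]


def T_col(m):
    """Example 15.29: [[a, b], [c, d]] -> [[a, a], [c, c]]."""
    (a, b), (c, d) = m
    return [[a, a], [c, c]]


def mat_add(m, n):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(m, n)]


def mat_scale(alpha, m):
    """Scalar multiple of a matrix."""
    return [[alpha * x for x in row] for row in m]


ones = [[1, 1], [1, 1]]

# The counterexample of Example 15.28: T(2M) differs from 2T(M).
print(T_prod(mat_scale(2, ones)))  # [[4, 0], [0, 4]]
print(mat_scale(2, T_prod(ones)))  # [[2, 0], [0, 2]]

# Arbitrary sample matrices for Example 15.29 (assumed for illustration).
m1 = [[1, 2], [3, 4]]
m2 = [[-5, 0], [7, 2]]
print(T_col(mat_add(m1, m2)) == mat_add(T_col(m1), T_col(m2)))  # True
print(T_col(mat_scale(3, m1)) == mat_scale(3, T_col(m1)))       # True
```

A single failing instance is enough to show that the first map is not linear, whereas the passing checks for the second map merely illustrate the general argument given in the solution.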

15.9 Properties of Linear Transformations

An important property of linear transformations is that the composition of any two linear transformations (if the composition is defined) is also a linear transformation. This fact also has an interesting consequence.

Theorem 15.30 Let V, V′ and V″ be vector spaces. If T_1 : V → V′ and T_2 : V′ → V″ are linear transformations, then the composition T_2 ∘ T_1 : V → V″ is also a linear transformation.

Proof. For u, v ∈ V and a scalar α, observe that

(T_2 ∘ T_1)(u + v) = T_2(T_1(u + v)) = T_2(T_1(u) + T_1(v)) = T_2(T_1(u)) + T_2(T_1(v)) = (T_2 ∘ T_1)(u) + (T_2 ∘ T_1)(v)

and

(T_2 ∘ T_1)(αv) = T_2(T_1(αv)) = T_2(αT_1(v)) = αT_2(T_1(v)) = α(T_2 ∘ T_1)(v).

Hence T_2 ∘ T_1 is a linear transformation.

As an illustration of the preceding theorem, let T_1 : R^3 → R^2 and T_2 : R^2 → R^3 be defined by T_1(a, b, c) = (a + 2b − c, 3b + 2c) and T_2(a, b) = (b, 2a, a + b). Then T_2 ∘ T_1 : R^3 → R^3 is given by

(T_2 ∘ T_1)(a, b, c) = T_2(T_1(a, b, c)) = T_2(a + 2b − c, 3b + 2c) = (3b + 2c, 2a + 4b − 2c, a + 5b + c).
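The composition just computed can be checked by composing the two maps directly and comparing with the stated formula. This is a sketch only; `compose`, `closed_form`, and the sample vectors are names and data introduced here for illustration.

```python
def T1(v):
    """T1(a, b, c) = (a + 2b - c, 3b + 2c)."""
    a, b, c = v
    return (a + 2 * b - c, 3 * b + 2 * c)


def T2(v):
    """T2(a, b) = (b, 2a, a + b)."""
    a, b = v
    return (b, 2 * a, a + b)


def compose(g, f):
    """Return the function v -> g(f(v))."""
    return lambda v: g(f(v))


def closed_form(v):
    """The formula for T2 o T1 stated in the text."""
    a, b, c = v
    return (3 * b + 2 * c, 2 * a + 4 * b - 2 * c, a + 5 * b + c)


T2_after_T1 = compose(T2, T1)

# Arbitrary sample vectors (an assumption made for this illustration).
samples = [(1, 2, 3), (0, 0, 0), (-1, 4, 2), (5, -3, 7)]
print(T2_after_T1((1, 2, 3)))  # (12, 4, 14)
print(all(T2_after_T1(v) == closed_form(v) for v in samples))  # True
```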

As already mentioned, T_1 and T_2 can also be defined by

\[ T_1\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} \quad \text{and} \quad T_2\left(\begin{bmatrix} a \\ b \end{bmatrix}\right) = \begin{bmatrix} 0 & 1 \\ 2 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix}. \]

Interestingly,

\[ (T_2 \circ T_1)\left(\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right) = \begin{bmatrix} 0 & 1 \\ 2 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix}, \]

that is, the composition T_2 ∘ T_1 can be obtained by multiplying the matrices that describe T_1 and T_2. So if we represent the linear transformations T_1 and T_2 by the matrices A_1 and A_2, respectively, then the matrix representing T_2 ∘ T_1 is A_2 A_1. This also explains why the definition of matrix multiplication, which may seem strange at first, is actually quite logical.

Two basic properties of a linear transformation are given in the next theorem.

Theorem 15.31 Let V and V′ be vector spaces with respective zero vectors z and z′. If T : V → V′ is a linear transformation, then (i) T(z) = z′ and (ii) T(−v) = −T(v) for all v ∈ V.

Proof.

We first verify (i). Since T preserves scalar multiplication, T(z) = T(0z) = 0T(z) = z′.

Next we verify (ii). Let v ∈ V. Then T(v) + T(−v) = T(v + (−v)) = T(z) = z′, where the last equality follows from (i). Since the vector T(v) in V′ has a unique negative, namely −T(v), we conclude that T(−v) = −T(v).

When T : V → V′ is a linear transformation, it is often of interest to know how T acts on subspaces of V. Let's recall some terminology and notation concerning functions. For a linear transformation T : V → V′, the set V is the domain of T and the set V′ is the codomain of T. If W is a subset of V, then T(W) = {T(w) : w ∈ W} is the image of W under T. In particular, T(V) is the range of T.

Theorem 15.32 Let V and V′ be vector spaces and let T : V → V′ be a linear transformation. If W is a subspace of V, then T(W) is a subspace of V′.

Proof. Let z and z′ be the zero vectors in V and V′, respectively. Since z ∈ W and T(z) = z′ by Theorem 15.31, it follows that z′ ∈ T(W) and thus T(W) ≠ ∅. So we only need to show that T(W) is closed under addition and scalar multiplication. Let x and y be two vectors in T(W). Then there are vectors u and v in W such that T(u) = x and T(v) = y. So x + y = T(u) + T(v) = T(u + v).

Since u, v ∈ W and W is a subspace of V, it follows that u + v ∈ W. So x + y = T(u + v) ∈ T(W). Next, let α be a scalar and x ∈ T(W). We show that αx ∈ T(W). Since x ∈ T(W), there exists u ∈ W with T(u) = x. Now αx = αT(u) = T(αu). Since αu ∈ W, it follows that αx = T(αu) ∈ T(W). By the Subspace Test, T(W) is a subspace of V′.

To illustrate Theorem 15.32, we return to the linear transformation T : R^3 → R^2 defined in Result 15.24 by T(a, b, c) = (2a + c, 3c − b). Let W = {(a, b, 0) : a, b ∈ R}. We use the Subspace Test to show that W is a subspace of R^3. Since (0, 0, 0) ∈ W, it follows that W ≠ ∅. Let (a_1, b_1, 0), (a_2, b_2, 0) ∈ W and α ∈ R. Then (a_1, b_1, 0) + (a_2, b_2, 0) = (a_1 + a_2, b_1 + b_2, 0) ∈ W and α(a_1, b_1, 0) = (αa_1, αb_1, 0) ∈ W. Since W is closed under addition and scalar multiplication, W is a subspace of R^3. By Theorem 15.32, T(W) = {(2a, −b) : a, b ∈ R} is a subspace of R^2. In fact, we show that T(W) = R^2. Of course, R^2 = ⟨(1, 0), (0, 1)⟩. Thus, to show that T(W) = R^2, by Corollary 15.16 it suffices to show that (1, 0) and (0, 1) belong to T(W). Setting a = 1/2 and b = 0, we see that (1, 0) ∈ T(W); setting a = 0 and b = −1, we see that (0, 1) ∈ T(W).

For the same linear transformation T, we have seen that T(1, −6, −2) = (0, 0) and T(0, 0, 0) = (0, 0). Therefore, both (1, −6, −2) and (0, 0, 0) map to the zero vector of R^2. That (0, 0, 0) maps to (0, 0) is of course not surprising, since Theorem 15.31 guarantees it.

If T : V → V′ is a linear transformation and W′ is a subset of V′, then T^(−1)(W′) = {v ∈ V : T(v) ∈ W′} is called the inverse image of W′ under T. If W′ = {z′}, where z′ is the zero vector of V′, then T^(−1)(W′) is called the kernel of T and is denoted by ker(T). That is, the kernel of T : V → V′ is the set ker(T) = T^(−1)({z′}) = {v ∈ V : T(v) = z′}. An interesting feature of the kernel is given in the following theorem.

Theorem 15.33 Let V and V′ be vector spaces and let T : V → V′ be a linear transformation. Then the kernel of T is a subspace of V.

Proof. Let z and z′ be the zero vectors of V and V′, respectively. Since T(z) = z′, it follows that z ∈ ker(T) and so ker(T) ≠ ∅. Now let u, v ∈ ker(T) and α ∈ R. Then

T(u + v) = T(u) + T(v) = z′ + z′ = z′ and T(αu) = αT(u) = αz′ = z′.

It follows that u + v ∈ ker(T) and αu ∈ ker(T). By the Subspace Test, ker(T) is a subspace of V.

Returning to the linear transformation T : R^3 → R^2 in Result 15.24 defined by T(a, b, c) = (2a + c, 3c − b), we see that ker(T) = {(a, b, c) : 2a + c = 0 and 3c − b = 0} is a subspace of R^3. Since 2a + c = 0 and 3c − b = 0, it follows that a = −c/2 and b = 3c. So ker(T) = {(−c/2, 3c, c) : c ∈ R}. In other words, ker(T) is the subspace of R^3 that consists of all scalar multiples of (−1/2, 3, 1).
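The description of ker(T) as the set of scalar multiples of (−1/2, 3, 1), together with the earlier observation that (1, 0) and (0, 1) lie in T(W), can be confirmed on a few instances. This is a minimal sketch using exact fractions; `in_kernel` and the chosen multipliers are assumptions made for illustration.

```python
from fractions import Fraction


def T(v):
    """The map of Result 15.24: (a, b, c) -> (2a + c, 3c - b)."""
    a, b, c = v
    return (2 * a + c, 3 * c - b)


def in_kernel(v):
    """True when T sends v to the zero vector of R^2."""
    return T(v) == (0, 0)


direction = (Fraction(-1, 2), 3, 1)

# A few scalar multiples of the direction vector (arbitrary multipliers).
multiples = [tuple(c * x for x in direction)
             for c in (-2, 0, 1, Fraction(7, 3))]

print(all(in_kernel(v) for v in multiples))  # True
print(in_kernel((1, -6, -2)))                # True: this is -2 * direction
print(in_kernel((1, 2, 3)))                  # False: T(1, 2, 3) = (5, 7)

# Vectors of W = {(a, b, 0)} that map onto the standard basis of R^2.
print(T((Fraction(1, 2), 0, 0)))  # equals (1, 0)
print(T((0, -1, 0)))              # (0, 1)
```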

Exercises for Chapter 15

15.1 Prove that the set C = {a + bi : a, b ∈ R} of complex numbers is a vector space under the addition (a + bi) + (c + di) = (a + c) + (b + d)i and scalar multiplication α(a + bi) = αa + αbi, where α ∈ R.

15.2 Although we have taken R to be the set of scalars in a vector space, this need not always be the case. Let V = {([a], [b]) : [a], [b] ∈ Z_3} and let Z_3 be the set of scalars. (a) Show that V is a vector space over the set Z_3 of scalars under the addition ([a], [b]) + ([c], [d]) = ([a + c], [b + d]) and scalar multiplication [c]([a], [b]) = ([ca], [cb]). (b) Write out the elements of V. (Hence a vector space can have more than one vector and still be finite.)

15.3 In each of the following, addition or scalar multiplication is defined in R^3. (Any operation that is not defined is the standard one.) Under these operations, determine whether R^3 is a vector space.
(a) (a, b, c) + (d, e, f) = (a, b, c)
(b) (a, b, c) + (d, e, f) = (a − d, b − e, c − f)
(c) (a, b, c) + (d, e, f) = (0, 0, 0)
(d) α(a, b, c) = (a, b, c)
(e) α(a, b, c) = (b, c, a)
(f) α(a, b, c) = (0, 0, 0)
(g) α(a, b, c) = (αa, 3αb, αc)

15.4 Let V be a vector space, where u, v ∈ V. Prove that there is a unique vector x in V such that u + x = v.

15.5 Let V be a vector space with v ∈ V and α ∈ R. Prove that α(−v) = (−α)v = −(αv).

15.6 (a) Let V be a vector space and u, v, w ∈ V. Prove that if u + v = u + w, then v = w. (This is the cancellation property for addition of vectors.) (b) Use (a) to prove Theorem 15.3.

15.7 Prove or disprove: (a) No vector is its own negative. (b) Every vector is the negative of some vector. (c) Every vector space has at least two vectors.

15.8 Let V be a vector space containing nonzero vectors u and v. Prove that if u ≠ αv for every α ∈ R, then u ≠ β(u + v) for every β ∈ R.

15.9 Determine which of the following subsets of R^4 are subspaces of R^4.
(a) W_1 = {(a, a, a, a) : a ∈ R}
(b) W_2 = {(a, 2b, 3a, 4b) : a, b ∈ R}
(c) W_3 = {(a, 0, 0, 1) : a ∈ R}
(d) W_4 = {(a, a^2, 0, 0) : a ∈ R}
(e) W_5 = {(a, b, a + b, b) : a, b ∈ R}

15.10 Let F_R be the vector space of all functions from R to R. Determine which of the following subsets of F_R are subspaces of F_R.
(a) W_1 consists of all functions f with f(1) = 0 = f(2).
(b) W_2 consists of all functions f with f(1) = 0 or f(2) = 0.
(c) W_3 consists of all functions f with f(2) = 2f(1).
(d) W_4 consists of all functions f with f(1) = f(2).
(e) W_5 consists of all functions f with f(1) = 0.

15.11 Recall that the set R[x] of polynomial functions is a subspace of F_R. Now determine which of the following subsets of R[x] are subspaces of R[x].
(a) U_1 = {f : f(x) = a for a fixed real number a} (the set of all constant polynomials)
(b) U_2 = {f : f(x) = a + bx + cx^2 + dx^3, a, b, c, d ∈ R, d ≠ 0}
(c) U_3 = {f : f(x) = a + bx + cx^2 + dx^3, a, b, c, d ∈ R}
(d) U_4 = {f : f(x) = a_0 + a_2 x^2 + a_4 x^4 + ... + a_(2m) x^(2m), m ≥ 0, and a_(2i) ∈ R for 0 ≤ i ≤ m}
(e) U_5 = {f : f(x) = (x^3 + 1)g(x) for some g ∈ R[x]}

15.12 Let M_2(R) be the vector space of 2 × 2 matrices whose entries are real numbers. Determine which of the following subsets of M_2(R) are subspaces of M_2(R).

(a) \[ W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} : ad - bc = 0 \right\} \]

(b) \[ W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} : \alpha_1 a + \alpha_2 b + \alpha_3 c + \alpha_4 d = 0 \right\}, \] where α_1, α_2, α_3, α_4 are fixed real numbers.

15.13 Prove that

\[ W = \left\{ \begin{bmatrix} a_1 & a_2 & a_3 \\ 0 & a_4 & a_5 \\ 0 & 0 & a_6 \end{bmatrix} : a_i \in R \text{ for } 1 \le i \le 6 \right\} \]

is a subspace of the vector space M_3[R].

15.14 Let U and W be subspaces of a vector space V. Prove that U ∩ W is a subspace of V.

15.15 The graph of the function f : R → R defined by f(x) = (3/5)x is a straight line in R^2 passing through the origin. Every point (x, y) on this graph is a solution of the equation 3x − 5y = 0. Prove that the set S of solutions of this equation is a subspace of R^2.

15.16 Determine the following linear combinations:
(a) 4(1, −2, 3) + (−2)(1, −1, 0)
(b) \[ (-1)\begin{bmatrix} 3 & -2 \\ 1 & -3 \end{bmatrix} + 2\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} + 5\begin{bmatrix} -1 & -1 \\ -1 & -1 \end{bmatrix} \]

15.17 In R^3, write i = (1, 0, 0) as a linear combination of u_1 = (0, 1, 1), u_2 = (1, 0, 1), and u_3 = (1, 1, 0).

15.18 Let u = (1, 2, 3), v = (0, 1, 2) and w = (3, 1, −1) be vectors in R^3. (a) Show that w can be expressed as a linear combination of u and v. (b) Show that the vector x = (8, 5, 2) can be expressed as a linear combination of u, v and w in more than one way.

15.19 Let V be a vector space containing the vectors v_1, v_2, ..., v_n and the vectors w_1, w_2, ..., w_m. Let W = ⟨v_1, v_2, ..., v_n⟩ and W′ = ⟨w_1, w_2, ..., w_m⟩. Prove that if every vector v_i (1 ≤ i ≤ n) is a linear combination of the vectors w_1, w_2, ..., w_m, then W ⊆ W′.

15.20 Prove that ⟨(1, 2, 3), (0, 4, 1)⟩ = ⟨(1, 6, 4), (1, −2, 2)⟩ in R^3.

15.21 Let V be a vector space and let u and v be in V. Prove that (a) ⟨u, v⟩ = ⟨u, 2u + v⟩ (b) ⟨u, v⟩ = ⟨u + v, u − v⟩.

15.22 Determine which sets S of vectors in the given vector space V are linearly independent.
(a) S = {(1, 1, 1), (1, −2, 3), (2, 5, −1)}; V = R^3.
(b) S = {(1, 0, −1), (2, 1, 1), (0, 1, 3)}; V = R^3.
(c) \[ S = \left\{ \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \right\}; \] V = M_2(R).

15.23 For the vectors u = (1, 1, 1) and v = (1, 0, 2), find a vector w such that u, v, w are linearly independent in R^3. Verify that u, v, w are linearly independent.

15.24 Prove or disprove: If u_1, u_2, u_3 are linearly independent vectors in a vector space V, then u_1 + u_2, u_1 + u_3, 2u_3 are linearly independent vectors in V.

15.25 Determine which sets S of vectors in F_R are linearly independent.
(a) S = {1, sin^2 x, cos^2 x}
(b) S = {1, sin x, cos x}
(c) S = {1, e^x, e^(−x)}
(d) S = {1, x, x/(x^2 + 1)}

15.26 Let S = {u_1, u_2, ..., u_n} be a linearly dependent set of n ≥ 2 vectors in a vector space V. Prove that if every subset of S consisting of n − 1 vectors is linearly independent, then there exist nonzero scalars c_1, c_2, ..., c_n such that c_1 u_1 + c_2 u_2 + ... + c_n u_n = z.

15.27 Prove that if T : V → V′ is a linear transformation, then T(α_1 v_1 + α_2 v_2 + ... + α_n v_n) = α_1 T(v_1) + α_2 T(v_2) + ... + α_n T(v_n), where v_1, v_2, ..., v_n ∈ V and α_1, α_2, ..., α_n ∈ R.

15.28 Let V and V′ be vector spaces and let T : V → V′ be a linear transformation. Prove that if W′ is a subspace of V′, then T^(−1)(W′) is a subspace of V.

15.29 Prove that there is a bijective linear transformation T : R^2 → C, where C = {a + bi : a, b ∈ R} is the set of complex numbers.

15.30 For vector spaces V and V′, let T_1 and T_2 be linear transformations from V to V′. Define T_1 + T_2 : V → V′ by (T_1 + T_2)(v) = T_1(v) + T_2(v). Prove that T_1 + T_2 is also a linear transformation.

15.31 Let

\[ W = \left\{ \begin{bmatrix} a & b \\ 0 & a + b \end{bmatrix} : a, b \in R \right\}. \]

(a) Prove that W is a subspace of M_2(R). (b) Prove that there is a bijective linear transformation T : R^2 → W.

15.32 For the 2 × 3 matrix

\[ A = \begin{bmatrix} 3 & 1 & -1 \\ 2 & -5 & 2 \end{bmatrix}, \]

a function T : R^3 → R^2 is defined by T(u) = Au, where u is a 3 × 1 column vector in R^3.

(a) Determine T(u) for \[ u = \begin{bmatrix} 4 \\ -1 \\ -2 \end{bmatrix}. \]
(b) Prove that T is a linear transformation.

15.33 Let D : R[x] → R[x] be the linear transformation (differentiation) defined by D(c_0 + c_1 x + ... + c_n x^n) = c_1 + 2c_2 x + ... + n c_n x^(n−1). Determine each of the following. (a) D(W), where W = {a + bx : a, b ∈ R}. (b) D(W), where W = R. (c) ker(D).

15.34 Let T : M_2(R) → M_2(R) be the linear transformation defined by

\[ T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} a & a \\ c & c \end{bmatrix} \]

and consider the subset

\[ W = \left\{ \begin{bmatrix} a & 0 \\ 0 & d \end{bmatrix} : a, d \in R \right\} \]

of M_2(R).

(a) Prove that W is a subspace of M_2(R). (b) Determine the subspace T(W) of M_2(R). (c) Determine the subspace ker(T) of M_2(R).

15.35 For the following statement S and proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect, or (3) S is false and the proof is incorrect. Explain which of these occurs.

S: Let V be a vector space. If u is a vector of V such that u + v = v for some v ∈ V, then u + v = v for all v ∈ V.

Proof. Suppose that u + v = v for some v ∈ V. We also know that z + v = v, where z is the zero vector of V. So u + v = z + v. By Exercise 15.6, u = z and thus u + v = v for all v ∈ V.

Chapter 16

Proofs in Topology

Recall from calculus that a function f : X → R, where X ⊆ R, is continuous at a ∈ X if for every ε > 0 there exists a δ > 0 such that if |x − a| < δ, then |f(x) − f(a)| < ε. When we write |x − a|, we are referring to how far apart x and a are, that is, the distance between them. Likewise, |f(x) − f(a)| is the distance between f(x) and f(a). It is not surprising that distance enters the picture here, because when we say that f is continuous at a, we mean that if x is a number close to a, then f(x) is close to f(a). The term "close" only makes sense once we understand how the distances between the two pairs of numbers involved are measured. It may seem obvious that the distance between two real numbers x and y is |x − y|; however, it turns out that the distance between x and y need not be defined as |x − y|, although this is certainly the most common definition. Furthermore, when considering the continuity of a function f : A → B, it is not essential that A and B be sets of real numbers. That is, it is possible to place these concepts from calculus in a more general setting. The area of mathematics that deals with this is topology.

16.1 Metric Spaces

We have already mentioned that the distance between two real numbers x and y is given by |x − y|. There are four properties of this distance that will be of particular interest to us:

(1) |x − y| ≥ 0 for all x, y ∈ R;
(2) |x − y| = 0 if and only if x = y for all x, y ∈ R;
(3) |x − y| = |y − x| for all x, y ∈ R;
(4) |x − z| ≤ |x − y| + |y − z| for all x, y, z ∈ R.        (16.1)

Many of the fundamental results of calculus depend on these four properties. Using these properties as a guide, we now define distance more generally. Let X be a nonempty set and let d : X × X → R be a function from the Cartesian product X × X to the set R of real numbers. Thus, for every ordered pair (x, y) ∈ X × X, it follows that d((x, y)) is a real number. For simplicity we write d(x, y) rather than d((x, y)) and refer to d(x, y) as the distance from x to y. The distance d is called a metric on X if it satisfies the following properties:

(1) d(x, y) ≥ 0 for all x, y ∈ X;
(2) d(x, y) = 0 if and only if x = y for all x, y ∈ X;
(3) d(x, y) = d(y, x) for all x, y ∈ X (symmetric property);
(4) d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ X (triangle inequality).

A set X together with a metric d defined on X is called a metric space and is denoted by (X, d). Since the set R of real numbers together with the distance d defined on R by d(x, y) = |x − y| satisfies the properties listed in (16.1), it follows that (R, d) is a metric space. Now let's consider two other ways to define the distance between two real numbers.

Example 16.1 For X = R, let the distance d : X × X → R be defined by d(x, y) = x − y. Determine which of the four properties of a metric are satisfied by this distance.

Solution. Since d(1, 2) = −1, property 1 is not satisfied. On the other hand, since d(x, y) = x − y = 0 if and only if x = y, property 2 is satisfied. Since d(2, 1) = 1, it follows that d(1, 2) ≠ d(2, 1) and thus the symmetric property (property 3) is not satisfied. Finally, d(x, z) = x − z = (x − y) + (y − z) = d(x, y) + d(y, z), and so the triangle inequality (property 4) holds.

♦
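The conclusions of Example 16.1 can be mirrored by testing the four properties on a small finite sample. This is only a sketch under stated assumptions: the sample points are arbitrary, and while a failed check on a sample disproves a property, a passed check merely agrees with (and does not prove) the general argument given in the solution.

```python
from itertools import product


def d(x, y):
    """The distance of Example 16.1: d(x, y) = x - y."""
    return x - y


# Arbitrary sample points (an assumption made for this illustration).
points = [-2, 0, 1, 2, 5]
pairs = list(product(points, repeat=2))
triples = list(product(points, repeat=3))

nonnegative = all(d(x, y) >= 0 for x, y in pairs)
zero_iff_equal = all((d(x, y) == 0) == (x == y) for x, y in pairs)
symmetric = all(d(x, y) == d(y, x) for x, y in pairs)
triangle = all(d(x, z) <= d(x, y) + d(y, z) for x, y, z in triples)

print(nonnegative)     # False, since d(1, 2) = -1
print(zero_iff_equal)  # True
print(symmetric)       # False, since d(1, 2) = -1 but d(2, 1) = 1
print(triangle)        # True
```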

In our next example, we present a distance function that is in fact a metric on R.

Result 16.2 For X = R, let d : X × X → R be defined by d(x, y) = |2^x − 2^y|. Then (X, d) is a metric space.

Proof. Certainly d(x, y) = |2^x − 2^y| ≥ 0, and d(x, y) = 0 if and only if 2^x = 2^y. If x = y, then of course 2^x = 2^y. Next, assume that 2^x = 2^y. Taking logarithms of 2^x and 2^y to base 2 gives x = y. So d(x, y) = 0 if and only if x = y. Since d(x, y) = |2^x − 2^y| = |2^y − 2^x| = d(y, x), it follows that d satisfies the symmetric property. Finally, by property 4 in (16.1), we have d(x, z) = |2^x − 2^z| = |(2^x − 2^y) + (2^y − 2^z)| ≤ |2^x − 2^y| + |2^y − 2^z| = d(x, y) + d(y, z), and the triangle inequality holds.

Another set on which you have undoubtedly seen a distance defined is R × R = R^2. An element P ∈ R^2 can be expressed as (x, y), where x, y ∈ R. Here we are speaking of points in the Cartesian plane, which you encountered in the study of analytic geometry. There the (Euclidean) distance d(P_1, P_2) between two points P_1 = (x_1, y_1) and P_2 = (x_2, y_2) is given by

\[ d(P_1, P_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}. \]

This distance is actually a metric on R^2. That the first three properties are satisfied depends only on the following facts for real numbers a and b: (1) a^2 ≥ 0, (2) a^2 + b^2 = 0 if and only if a = b = 0, (3) a^2 = (−a)^2. The triangle inequality is more difficult to verify, however, and its proof depends on the following lemma, which is a special case of a result commonly called Schwarz's inequality.

Lemma 16.3 If a, b, c, d ∈ R, then

\[ ab + cd \le \sqrt{(a^2 + c^2)(b^2 + d^2)}. \]

Proof. Certainly, (ab + cd)^2 + (ad − bc)^2 ≥ (ab + cd)^2. Since

\[
\begin{aligned}
(ab + cd)^2 + (ad - bc)^2 &= (a^2 b^2 + 2abcd + c^2 d^2) + (a^2 d^2 - 2abcd + b^2 c^2) \\
&= a^2 b^2 + a^2 d^2 + b^2 c^2 + c^2 d^2 = (a^2 + c^2)(b^2 + d^2),
\end{aligned}
\]

it follows that (ab + cd)^2 ≤ (a^2 + c^2)(b^2 + d^2). Taking square roots gives ab + cd ≤ |ab + cd| ≤ \sqrt{(a^2 + c^2)(b^2 + d^2)}, which is the desired inequality.

Now we can show that this distance is a metric.

Result 16.4 For X = R^2, let P_1 = (x_1, y_1) and P_2 = (x_2, y_2) be two points in R^2 and let d : X × X → R be defined by

\[ d(P_1, P_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}. \]

Then (X, d) is a metric space.

Proof. We have already mentioned that the first three properties of a metric are satisfied, so it only remains to verify the triangle inequality. Let P_1 = (x_1, y_1), P_2 = (x_2, y_2) and P_3 = (x_3, y_3). Using Lemma 16.3 with a = x_1 − x_2, b = x_2 − x_3, c = y_1 − y_2 and d = y_2 − y_3, we have

\[
\begin{aligned}
[d(P_1, P_3)]^2 &= (x_1 - x_3)^2 + (y_1 - y_3)^2 \\
&= [(x_1 - x_2) + (x_2 - x_3)]^2 + [(y_1 - y_2) + (y_2 - y_3)]^2 \\
&= (x_1 - x_2)^2 + (x_2 - x_3)^2 + 2(x_1 - x_2)(x_2 - x_3) + 2(y_1 - y_2)(y_2 - y_3) + (y_1 - y_2)^2 + (y_2 - y_3)^2 \\
&\le (x_1 - x_2)^2 + (x_2 - x_3)^2 + 2\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}\,\sqrt{(x_2 - x_3)^2 + (y_2 - y_3)^2} + (y_1 - y_2)^2 + (y_2 - y_3)^2 \\
&= \left[\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} + \sqrt{(x_2 - x_3)^2 + (y_2 - y_3)^2}\right]^2 \\
&= [d(P_1, P_2) + d(P_2, P_3)]^2,
\end{aligned}
\]

which gives us the desired result.

There is a metric defined on N × N = N^2 known as the Manhattan metric or the taxicab metric. For points P_1 = (x_1, y_1) and P_2 = (x_2, y_2) in N^2, the distance d(P_1, P_2) is defined by d(P_1, P_2) = |x_1 − x_2| + |y_1 − y_2|. For example, consider the points P_1 = (2, 2) and P_2 = (4, 6) shown in Figure 16.1(a). The Manhattan distance between these two points is d(P_1, P_2) = |2 − 4| + |2 − 6| = 6. If we think of the points (x, y) as street intersections in a city (Manhattan), then we must travel (by taxi) at least 6 blocks to get from one of these points to the other. Two such routes are shown in Figure 16.1(b), (c). The Manhattan metric is not only a metric on N^2, but also a metric on Z^2 and R^2. The proof of the following result is left as an exercise (Exercise 16.2).

Figure 16.1: The Manhattan metric. Panels (a), (b), (c) show the points P_1 = (2, 2) and P_2 = (4, 6) on a grid.

Result 16.5 For points P_1 = (x_1, y_1) and P_2 = (x_2, y_2) in R^2, the distance d(P_1, P_2) defined by

d(P_1, P_2) = |x_1 − x_2| + |y_1 − y_2|

is a metric on R^2 (the Manhattan metric).

We have seen that there is more than one metric on R and on R^2. The metric spaces (R, d), where d(x, y) = |x − y|, and (R^2, d), where

\[ d((x_1, y_1), (x_2, y_2)) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}, \]

are called Euclidean spaces, and the associated metrics are the Euclidean metrics. These are certainly the best-known metrics on R and R^2.

For any nonempty set A, it is always possible to define a distance d : A × A → R that is a metric. For x, y ∈ A, the distance

\[ d(x, y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \ne y \end{cases} \]

is called the discrete metric on A.

Result 16.6 The discrete metric d defined on a nonempty set A is a metric.

Proof. By definition, d(x, y) ≥ 0 for all x, y ∈ A, and d(x, y) = 0 if and only if x = y. Also, by the definition of this distance, d(x, y) = d(y, x) for all x, y ∈ A. Now let x, y, z ∈ A. If x = z, then certainly 0 = d(x, z) ≤ d(x, y) + d(y, z). If x ≠ z, then d(x, z) = 1. Since x ≠ y or y ≠ z, it follows that d(x, y) + d(y, z) ≥ 1 = d(x, z). In either case, the triangle inequality holds.

16.2 Open Sets in Metric Spaces

Returning to our discussion of a real-valued function f from calculus, we said that f is continuous at a real number a in the domain of f if for every ε > 0 there exists a number δ > 0 such that if |x − a| < δ, then |f(x) − f(a)| < ε. Naturally, this led us to reconsider what we mean by distance, and this in turn led us to metric spaces. However, continuity itself can be described somewhat differently. A function f is continuous at a if for every ε > 0, there is a number δ > 0 such that if x is a number in the open interval (a − δ, a + δ), then f(x) is a number in the open interval (f(a) − ε, f(a) + ε). That is, continuity can be defined in terms of open intervals. What are the properties of open intervals? Of course, an open interval is a certain kind of subset of the set of real numbers. But every open interval has a property that can be generalized in a very useful way. An open interval I of real numbers has the property that for every x ∈ I there is a real number r > 0 such that (x − r, x + r) ⊆ I; that is, for every x ∈ I there is an open interval I_1 centered at x that is contained in I.

Let (X, d) be a metric space. Furthermore, let a ∈ X and let r > 0 be a real number. The subset of X consisting of the points (elements) x ∈ X with d(x, a) < r is called the open sphere with center a and radius r and is denoted by S_r(a). Hence x ∈ S_r(a) if and only if d(x, a) < r. For example, the open sphere S_r(a) in Euclidean space (R, d) is the open interval (a − r, a + r) with midpoint a and length 2r. Conversely, every open interval in (R, d) is an open sphere according to this definition. Thus the open spheres in (R, d) are precisely the open intervals of the form (a, b), where a < b and a, b ∈ R. In Euclidean space (R^2, d), the open sphere S_r(P) is the interior of the circle with center P and radius r. In the Manhattan metric space (R^2, d), where the distance between two points P_1 = (x_1, y_1) and P_2 = (x_2, y_2) is defined by d(P_1, P_2) = |x_1 − x_2| + |y_1 − y_2|, the open sphere S_3(P) for P = (5, 4) is the interior of the square shown in Figure 16.2.
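Membership in an open sphere depends only on the metric, which a short sketch makes concrete for the Manhattan metric and the sphere S_3(P) with P = (5, 4). The function names `manhattan` and `in_open_sphere` and the test points are hypothetical choices made for illustration.

```python
def manhattan(p, q):
    """Manhattan distance |x1 - x2| + |y1 - y2| between points of R^2."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def in_open_sphere(point, center, radius, metric):
    """True when point lies in the open sphere S_radius(center)."""
    return metric(point, center) < radius


P = (5, 4)

print(manhattan((2, 2), (4, 6)))                  # 6
print(in_open_sphere((5, 4), P, 3, manhattan))    # True: the center itself
print(in_open_sphere((6, 5), P, 3, manhattan))    # True: distance 2
print(in_open_sphere((5, 7), P, 3, manhattan))    # False: distance exactly 3
print(in_open_sphere((7, 6), P, 3, manhattan))    # False: distance 4
```

The strict inequality in `in_open_sphere` is what excludes points at distance exactly 3, such as (5, 7), (8, 4), (5, 1) and (2, 4).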

Figure 16.2: An open sphere S_3(P) for P = (5, 4), the interior of the square with vertices (5, 7), (8, 4), (5, 1) and (2, 4)

Since every point in a metric space (X, d) belongs to an open sphere in X (indeed, it is the center of an open sphere), it is immediate that every two distinct points of X belong to distinct open spheres. In fact, they belong to disjoint open spheres.

Theorem 16.7 Every two distinct points in a metric space belong to disjoint open spheres.

Proof. Let $a$ and $b$ be distinct points in a metric space $(X, d)$ and suppose that $d(a, b) = r$. Necessarily $r > 0$. Consider the open spheres $S_{r/2}(a)$ and $S_{r/2}(b)$ of radius $r/2$ centered at $a$ and $b$, respectively. We claim that $S_{r/2}(a) \cap S_{r/2}(b) = \emptyset$. Assume, to the contrary, that $S_{r/2}(a) \cap S_{r/2}(b) \neq \emptyset$. Then there exists $c \in S_{r/2}(a) \cap S_{r/2}(b)$. So $d(c, a) < r/2$ and $d(c, b) < r/2$. By the triangle inequality, $r = d(a, b) \leq d(a, c) + d(c, b) < r/2 + r/2 = r$, which is a contradiction.

A subset $O$ of a metric space $(X, d)$ is defined to be open if for every point $a$ of $O$, there exists a positive real number $r$ such that $S_r(a) \subseteq O$; that is, every point of $O$ is the center of an open sphere contained in $O$. In the Euclidean space $(\mathbb{R}, d)$, every open interval $(a, b)$ where

CHAPTER 16. PROOFS IN TOPOLOGY

$a < b$ is an open set. To see this, let $x \in (a, b)$ and let $r = \min(x - a, b - x)$. Then the open sphere $S_r(x) = (x - r, x + r)$ is contained in $(a, b)$. In fact, the set $(-\infty, a) \cup (a, \infty)$ is open in $(\mathbb{R}, d)$ for every $a \in \mathbb{R}$. On the other hand, the half-open (or half-closed) interval $(a, b]$ is not open since there is no open sphere centered at $b$ that is contained in $(a, b]$. Similarly, the sets $[a, b]$, $[a, b)$, $(-\infty, a]$, and $[a, \infty)$ are not open in $(\mathbb{R}, d)$.

Every metric space contains some open sets, as we now show.

Theorem to Prove

In a metric space (X, d),

(i) the empty set $\emptyset$ and the set $X$ are open, and
(ii) every open sphere is an open set.

Proof Strategy  To show that a subset $A$ of $X$ is open, we must show that if $a$ is a point of $A$, then $a$ is the center of an open sphere contained in $A$. The empty set satisfies this condition vacuously and $X$ satisfies it trivially; so we concentrate on verifying (ii). We begin with an open sphere $S_r(a)$ having center $a$ and radius $r$. For an arbitrary element $x \in S_r(a)$, we must show that there is an open sphere with center $x$ and some suitable radius that is contained in $S_r(a)$. Since the theorem concerns an arbitrary metric space $(X, d)$, the open sphere $S_r(a)$ need not have any geometric appearance. Nevertheless, it is helpful to visualize $S_r(a)$ as the interior of a circle (see Figure 16.3). Since $d(x, a) < r$, it follows that $r' = r - d(x, a)$ is a positive real number. It appears likely that $S_{r'}(x) \subseteq S_r(a)$. To show this, it remains to show that if $y \in S_{r'}(x)$, then $y \in S_r(a)$; that is, if $d(y, x) < r'$, then $d(y, a) < r$. It is natural to use the triangle inequality to verify this. ♦

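[Figure 16.3: A diagram of an open sphere $S_r(a)$ in a metric space $(X, d)$, showing a point $x$ at distance $d(x, a)$ from the center $a$ and the radius $r' = r - d(x, a)$ of a smaller open sphere centered at $x$]

Theorem 16.8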

In a metric space (X, d),

(i) the empty set $\emptyset$ and the set $X$ are open, and
(ii) every open sphere is an open set.

Proof. Since $\emptyset$ contains no points, the statement that $\emptyset$ is open is true vacuously. For every point $a \in X$, every open sphere with center $a$ is contained in $X$. Thus $X$ is open and (i) is verified. To verify (ii), let $S_r(a)$ be an open sphere in $(X, d)$ and let $x \in S_r(a)$. We show that there exists an open sphere centered at $x$ and contained in $S_r(a)$. Since $d(x, a) < r$, it follows that

$r' = r - d(x, a) > 0$. We show that $S_{r'}(x) \subseteq S_r(a)$. Let $y \in S_{r'}(x)$. Since $d(y, x) < r'$ and $d(x, a) = r - r'$, it follows by the triangle inequality that $d(y, a) \leq d(y, x) + d(x, a) < r' + (r - r') = r$. Therefore $y \in S_r(a)$ and so $S_{r'}(x) \subseteq S_r(a)$.

To illustrate Theorem 16.8, we return to the metric space $(X, d)$ described in Result 16.2, namely $X = \mathbb{R}$ with $d(x, y) = |2^x - 2^y|$ for $x, y \in \mathbb{R}$. Hence $\emptyset$ and $X = \mathbb{R}$ are open sets, as are all open spheres $S_r(a)$, where $a \in \mathbb{R}$ and $r > 0$. One of these open spheres is $S_1(0) = \{x \in \mathbb{R} : |2^x - 2^0| < 1\}$. The inequality $|2^x - 2^0| < 1$ is equivalent to $-1 < 2^x - 1 < 1$ and so to $0 < 2^x < 2$. Since $2^x > 0$ for all $x$, the inequality $0 < 2^x < 2$ holds exactly for the real numbers $x$ in the infinite interval $(-\infty, 1)$, and so $(-\infty, 1)$ is the open sphere with center 0 and radius 1 (with respect to the given metric). We also consider the open sphere $S_6(1) = \{x \in \mathbb{R} : |2^x - 2^1| < 6\}$. Here $|2^x - 2^1| < 6$ is equivalent to $-6 < 2^x - 2 < 6$ and so to $-4 < 2^x < 8$. Hence $S_6(1)$ is the open sphere $(-\infty, 3)$ with center 1 and radius 6.

The open sets in any metric space can be characterized in terms of open spheres.

Theorem 16.9 A subset $O$ of a metric space is open if and only if it is a union (finite or infinite) of open spheres.

Proof. Let $(X, d)$ be a metric space. First, let $O$ be an open set in $(X, d)$. We show that $O$ is a union of open spheres. If $O = \emptyset$, then $O$ is the union of zero open spheres. So we may assume that $O \neq \emptyset$. Let $x \in O$. Since $O$ is open, there exists a positive number $r_x$ such that $S_{r_x}(x) \subseteq O$. This implies that $\bigcup_{x \in O} S_{r_x}(x) \subseteq O$. On the other hand, if $x \in O$, then $x \in S_{r_x}(x) \subseteq \bigcup_{x \in O} S_{r_x}(x)$, implying that $O \subseteq \bigcup_{x \in O} S_{r_x}(x)$. Thus $O = \bigcup_{x \in O} S_{r_x}(x)$.

Next we show that if $O$ is a subset of $(X, d)$ that is a union of open spheres, then $O$ is open. If $O = \emptyset$, then $O$ is open. Hence we may assume that $O \neq \emptyset$. Let $x \in O$. Since $O$ is a union of open spheres, $x$ belongs to one of these open spheres, say $S_r(a)$, where $S_r(a) \subseteq O$. Since $S_r(a)$ is open, there exists $r' > 0$ (as we saw in the proof of Theorem 16.8) such that $S_{r'}(x) \subseteq S_r(a) \subseteq O$. Therefore $O$ is open.

Two important properties of open sets are stated in the next theorem.

Theorem 16.10

Let $(X, d)$ be a metric space. Then

(i) the intersection of any finite number of open sets in $X$ is open, and
(ii) the union of any number of open sets in $X$ is open.

Proof. We first verify (i). Let $O_1, O_2, \ldots, O_k$ be $k$ open sets in $X$ and let $O = \bigcap_{i=1}^{k} O_i$. If $O$ is empty, then $O$ is open by Theorem 16.8(i). So we may assume that $O$ is nonempty. Let $x \in O$. We show that $x$ is the center of an open sphere contained in $O$. Since $x \in O$, it follows that $x \in O_i$ for all $i$ ($1 \leq i \leq k$). Since each set $O_i$ is open, there exists an open sphere $S_{r_i}(x) \subseteq O_i$, where $1 \leq i \leq k$. Let $r = \min\{r_1, r_2, \ldots, r_k\}$. Then $r > 0$ and $S_r(x) \subseteq S_{r_i}(x) \subseteq O_i$ for each $i$ ($1 \leq i \leq k$). Thus $S_r(x) \subseteq \bigcap_{i=1}^{k} O_i = O$. Therefore $O$ is open.

Next we verify (ii). Let $\{O_\alpha\}_{\alpha \in I}$ be an indexed collection of open sets in $X$ and let $O = \bigcup_{\alpha \in I} O_\alpha$. We show that $O$ is open. If $O = \emptyset$, then again $O$ is open. So we assume that $O \neq \emptyset$. By Theorem 16.9, each open set $O_\alpha$ ($\alpha \in I$) is a union of open spheres. Thus $O$ is a union of open spheres. Again by Theorem 16.9, $O$ is open.


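For the Euclidean space $(\mathbb{R}, d)$, each open interval $I_n = \left(-1 - \frac{1}{n}, 1 + \frac{1}{n}\right)$, $n \in \mathbb{N}$, is an open set. By Theorem 16.10, $\bigcup_{n=1}^{\infty} I_n = (-2, 2)$ is an open set, as is $\bigcap_{n=1}^{100} I_n = \left(-\frac{101}{100}, \frac{101}{100}\right)$. However, Theorem 16.10 does not guarantee that $\bigcap_{n=1}^{\infty} I_n$ is open. As a matter of fact, $\bigcap_{n=1}^{\infty} I_n$ is the closed interval $[-1, 1]$, which is not an open set. Each open interval $J_n = \left(0, \frac{1}{n}\right)$, $n \in \mathbb{N}$, is also an open set. Now $\bigcup_{n=1}^{\infty} J_n = (0, 1)$ is an open set. In this case, $\bigcap_{n=1}^{\infty} J_n = \emptyset$, also an open set.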
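The behavior of the intervals $I_n = \left(-1 - \frac{1}{n}, 1 + \frac{1}{n}\right)$ can be explored with exact rational arithmetic. The sketch below is only a finite spot check, not a proof; the function names `in_I` and `first_excluding_index` and the search limit are assumptions made for this illustration. It shows that the endpoint 1 survives every $I_n$ tested, while a point slightly larger than 1 is eventually excluded.

```python
from fractions import Fraction

def in_I(n, x):
    """Membership in I_n = (-1 - 1/n, 1 + 1/n), using exact rationals."""
    bound = 1 + Fraction(1, n)
    return -bound < x < bound

def first_excluding_index(x, limit=10_000):
    """Smallest n <= limit with x not in I_n, or None if no such n exists."""
    for n in range(1, limit + 1):
        if not in_I(n, x):
            return n
    return None

# The endpoint 1 belongs to every I_n that is tested.
endpoint_result = first_excluding_index(Fraction(1))

# The point 51/50 lies in I_n only while 1/n > 1/50, so it first fails at n = 50.
outside_result = first_excluding_index(1 + Fraction(1, 50))
```

This matches the description of the infinite intersection as $[-1, 1]$: the point 1 is never removed, but no open interval around 1 survives, since every point $1 + \frac{1}{m}$ is excluded from $I_m$.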

We now turn to the Euclidean space $(\mathbb{R}^2, d)$. Let $P_0 = (0, 0)$. For $n \in \mathbb{N}$, the open sphere $S_n(P_0)$ with center $(0, 0)$ and radius $n$ is an open set. Here $\bigcup_{n=1}^{\infty} S_n(P_0) = \mathbb{R}^2$, which is open; while $\bigcap_{n=1}^{\infty} S_n(P_0) = S_1(P_0)$, which is open. In $(\mathbb{R}^2, d)$, where $d$ is the discrete metric, $S_1(P_0) = \{P_0\}$, while $S_2(P_0) = \mathbb{R}^2$. Of course, all sets in a discrete metric space are open.

There is another important class of sets in metric spaces that arise naturally from open sets. Let $(X, d)$ be a metric space. A subset $F$ of $X$ is called closed if its complement $\overline{F}$ is open. For example, in the Euclidean space $(\mathbb{R}, d)$, every closed interval $[a, b]$, where $a < b$, is closed since its complement $(-\infty, a) \cup (b, \infty)$ is open. Let $a$ be a point in a metric space $(X, d)$, let $r > 0$ be a real number, and let $S_r[a]$ consist of those points $x \in X$ such that $d(x, a) \leq r$. The set $S_r[a]$ is called a closed sphere with center $a$ and radius $r$. Not surprisingly, $S_r[a]$ is closed, as we show next. Also, $\emptyset$ and $X$ are both open and closed.

Theorem 16.11

In a metric space (X, d),

(i) $\emptyset$ and $X$ are closed, and
(ii) every closed sphere is closed.

Proof. Since $\emptyset$ and $X$ are complements of each other and both are open, it follows that both are closed. To verify (ii), let $S_r[a]$ be a closed sphere in $(X, d)$, where $a \in X$. We show that its complement $\overline{S_r[a]}$ is open. We may assume that $\overline{S_r[a]}$ is nonempty, that is, that $S_r[a]$ is a proper subset of $X$. Let $x \in \overline{S_r[a]}$. So $d(x, a) > r$ and $r^* = d(x, a) - r > 0$. We show that $S_{r^*}(x) \subseteq \overline{S_r[a]}$, that is, that $y \notin S_r[a]$ for every $y \in S_{r^*}(x)$. Let $y \in S_{r^*}(x)$. Since $d(x, y) < r^* = d(x, a) - r$, it follows by the triangle inequality that $d(y, a) \geq d(x, a) - d(x, y) > d(x, a) - r^* = r$ and so $d(y, a) > r$. Thus $y \in \overline{S_r[a]}$, which implies that $S_{r^*}(x) \subseteq \overline{S_r[a]}$.

Several other useful facts about closed sets follow readily from Theorem 16.10. First, it is useful to recall from Result 9.15 and Exercise 9.24 that if $A_1, A_2, \ldots, A_n$ are $n \geq 2$ sets, then

$$\overline{\bigcup_{i=1}^{n} A_i} = \bigcap_{i=1}^{n} \overline{A_i} \quad \text{and} \quad \overline{\bigcap_{i=1}^{n} A_i} = \bigcup_{i=1}^{n} \overline{A_i}.$$

These are DeMorgan's laws for any finite number of sets. There is a more general form of DeMorgan's laws.
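The finite DeMorgan laws can be spot-checked on a small example with Python's built-in sets. This is a minimal sketch under assumptions made here for illustration: the universe `range(10)`, the three sample sets, and the helper name `complement` are not from the text, and complements are taken relative to that universe.

```python
def complement(universe, subset):
    """Complement of subset relative to the given universe."""
    return universe - subset

universe = frozenset(range(10))
family = [frozenset({0, 1, 2}), frozenset({2, 3, 4}), frozenset({2, 4, 6, 8})]

union = frozenset().union(*family)
intersection = universe.intersection(*family)

# First law: the complement of the union is the intersection of the complements.
lhs_union = complement(universe, union)
rhs_union = universe.intersection(*(complement(universe, a) for a in family))

# Second law: the complement of the intersection is the union of the complements.
lhs_inter = complement(universe, intersection)
rhs_inter = frozenset().union(*(complement(universe, a) for a in family))
```

A check of this kind only confirms the identities for one family of sets; the general statement still requires the element-chasing argument used in the proofs.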

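Theorem 16.12 (Extended DeMorgan Laws)  For an indexed collection $\{A_\alpha\}_{\alpha \in I}$ of sets,

(a) $\overline{\bigcup_{\alpha \in I} A_\alpha} = \bigcap_{\alpha \in I} \overline{A_\alpha}$ and
(b) $\overline{\bigcap_{\alpha \in I} A_\alpha} = \bigcup_{\alpha \in I} \overline{A_\alpha}$.

We present the proof of (a) only, leaving the proof of (b) as an exercise (Exercise 16.14).

Proof of Theorem 16.12(a). We first show that $\overline{\bigcup_{\alpha \in I} A_\alpha} \subseteq \bigcap_{\alpha \in I} \overline{A_\alpha}$. Let $x \in \overline{\bigcup_{\alpha \in I} A_\alpha}$. Then $x \notin \bigcup_{\alpha \in I} A_\alpha$. Thus $x \notin A_\alpha$ for each $\alpha \in I$, which implies that $x \in \overline{A_\alpha}$ for each $\alpha \in I$. Therefore $x \in \bigcap_{\alpha \in I} \overline{A_\alpha}$ and so $\overline{\bigcup_{\alpha \in I} A_\alpha} \subseteq \bigcap_{\alpha \in I} \overline{A_\alpha}$.

Next we show that $\bigcap_{\alpha \in I} \overline{A_\alpha} \subseteq \overline{\bigcup_{\alpha \in I} A_\alpha}$. Let $x \in \bigcap_{\alpha \in I} \overline{A_\alpha}$. Then $x \in \overline{A_\alpha}$ for each $\alpha \in I$. So $x \notin A_\alpha$ for all $\alpha \in I$. It follows that $x \notin \bigcup_{\alpha \in I} A_\alpha$ and so $x \in \overline{\bigcup_{\alpha \in I} A_\alpha}$. Therefore $\bigcap_{\alpha \in I} \overline{A_\alpha} \subseteq \overline{\bigcup_{\alpha \in I} A_\alpha}$.

Corollary 16.13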

Let $(X, d)$ be a metric space. Then

(i) the union of any finite number of closed sets in $X$ is closed, and
(ii) the intersection of any number of closed sets in $X$ is closed.

Proof. We first verify (i). Let $F_1, F_2, \ldots, F_k$ be $k$ closed sets in $X$ and let $F = \bigcup_{i=1}^{k} F_i$. Then

$$\overline{F} = \overline{\bigcup_{i=1}^{k} F_i} = \bigcap_{i=1}^{k} \overline{F_i}.$$

Since each set $F_i$ ($1 \leq i \leq k$) is closed, each set $\overline{F_i}$ is open. By Theorem 16.10, $\overline{F}$ is open and so $F$ is closed. This verifies (i).

Next we verify (ii). Let $\{F_\alpha\}_{\alpha \in I}$ be an indexed collection of closed sets in $X$ and let $F = \bigcap_{\alpha \in I} F_\alpha$. Then

$$\overline{F} = \overline{\bigcap_{\alpha \in I} F_\alpha} = \bigcup_{\alpha \in I} \overline{F_\alpha}$$

by Theorem 16.12. Since each set $F_\alpha$ ($\alpha \in I$) is closed, each set $\overline{F_\alpha}$ is open. By Theorem 16.10, $\overline{F}$ is open and so $F$ is closed.

16.3 Continuity in Metric Spaces

We have now seen that, in calculus, the definition of a function $f$ being continuous at a real number can be formulated in terms of distance or in terms of open intervals, each of which can be generalized. We now generalize the concept of continuity itself. Let $(X, d)$ and $(Y, d')$ be metric spaces and let $a \in X$. A function $f : X \to Y$ is said to be continuous at the point $a$ if for every positive real number $\epsilon$, there exists a positive real number $\delta$ such that if $x \in X$ and $d(x, a) < \delta$, then $d'(f(x), f(a)) < \epsilon$. The function $f : X \to Y$ is continuous on $X$ if it is continuous at every point of $X$. If $X = Y = \mathbb{R}$ and $d = d'$ is defined by $d(x, y) = |x - y|$ for all $x, y \in \mathbb{R}$, then we are giving the standard definition of continuity in calculus. We now consider some examples of continuous functions in this more general setting.

Result 16.14 Let $(\mathbb{R}^2, d)$ be the Manhattan metric space, whose distance $d(P_1, P_2)$ between two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ in $\mathbb{R}^2$ is defined by $d(P_1, P_2) = |x_1 - x_2| + |y_1 - y_2|$, and let $(\mathbb{R}, d')$ be the Euclidean space, where $d'(a, b) = |a - b|$. Then


(i) the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f((x, y)) = f(x, y) = x + y$ is continuous, and
(ii) the function $g : \mathbb{R}^2 \to \mathbb{R}$ defined by $g(x, y) = d'(x, y) = |x - y|$ is continuous.

Proof. We first verify (i). Let $\epsilon > 0$ be given and let $P_0 = (x_0, y_0) \in \mathbb{R}^2$. We choose $\delta = \epsilon$. Now let $P = (x, y) \in \mathbb{R}^2$ such that $d(P, P_0) = |x - x_0| + |y - y_0| < \delta$. Then

$$d'(f(x, y), f(x_0, y_0)) = |(x + y) - (x_0 + y_0)| = |(x - x_0) + (y - y_0)| \leq |x - x_0| + |y - y_0| < \delta = \epsilon.$$

Therefore $f$ is continuous.

Next we verify (ii). Again, let $\epsilon > 0$ be given and let $P_0 = (x_0, y_0) \in \mathbb{R}^2$. For the given $\epsilon > 0$, choose $\delta = \epsilon$. Let $P = (x, y) \in \mathbb{R}^2$ such that $d(P, P_0) = |x - x_0| + |y - y_0| < \delta$. We show that $d'(g(P), g(P_0)) = d'(|x - y|, |x_0 - y_0|) = \big| |x - y| - |x_0 - y_0| \big| < \epsilon$, which is equivalent to $-\epsilon < |x - y| - |x_0 - y_0| < \epsilon$. By the triangle inequality, observe that

$$|x - y| - |x_0 - y_0| = |(x - x_0) + (x_0 - y_0) + (y_0 - y)| - |x_0 - y_0| \leq |x - x_0| + |x_0 - y_0| + |y_0 - y| - |x_0 - y_0| = |x - x_0| + |y_0 - y| < \delta = \epsilon.$$

Similarly,

$$|x_0 - y_0| - |x - y| \leq |x - x_0| + |x - y| + |y_0 - y| - |x - y| = |x - x_0| + |y_0 - y| < \delta = \epsilon.$$

Proof Analysis  Let's see how we arrived at the proof of Result 16.14(ii). The main goal was to show that $\big| |x - y| - |x_0 - y_0| \big| < \epsilon$ given that $|x - x_0| + |y_0 - y| < \delta$. Letting $a = |x - y|$ and $b = |x_0 - y_0|$, we have the inequality $|a - b| < \epsilon$, which is equivalent to $-\epsilon < a - b < \epsilon$, which in turn is equivalent to $a - b < \epsilon$ and $b - a < \epsilon$. Hence one of the inequalities we wish to establish is $|x - y| - |x_0 - y_0| < \epsilon$. Since we know that $|x - x_0| + |y_0 - y| < \delta$, this suggests working the expression $|x - x_0| + |y_0 - y|$ into the expression $|x - y| - |x_0 - y_0|$. This can be done by adding and subtracting the appropriate quantities. Observe that

$$|x - y| - |x_0 - y_0| = |(x - x_0) + (x_0 - y_0) + (y_0 - y)| - |x_0 - y_0| \leq |x - x_0| + |x_0 - y_0| + |y_0 - y| - |x_0 - y_0| = |x - x_0| + |y_0 - y| < \delta.$$

This suggests choosing $\delta = \epsilon$. Of course, we must be certain that with this choice of $\delta$, we can also show that $|x_0 - y_0| - |x - y| < \epsilon$. ♦

The function $i : \mathbb{R} \to \mathbb{R}$ defined by $i(x) = x$ for all $x \in \mathbb{R}$ is, of course, the identity function. It appears likely that this function must surely be continuous. However, this depends on the metrics being used.

Example 16.15 Let $(\mathbb{R}, d)$ be the discrete metric space and $(\mathbb{R}, d')$ the Euclidean space with $d'(x, y) = |x - y|$ for all $x, y \in \mathbb{R}$. Then

(i) the function $f : (\mathbb{R}, d) \to (\mathbb{R}, d')$ defined by $f(x) = x$ for all $x \in \mathbb{R}$ is continuous, and

(ii) the function $g : (\mathbb{R}, d') \to (\mathbb{R}, d)$ defined by $g(x) = x$ for all $x \in \mathbb{R}$ is not continuous.

Solution. We first verify (i). Let $a \in \mathbb{R}$ and let $\epsilon > 0$ be given. Choose $\delta = 1/2$. Let $x \in \mathbb{R}$ such that $d(x, a) < \delta = 1/2$. We show that $d'(f(x), f(a)) < \epsilon$. Since $d$ is the discrete metric and $d(x, a) < 1/2$, it follows that $x = a$. Thus $d'(f(x), f(a)) = |f(x) - f(a)| = |x - a| = |a - a| = 0 < \epsilon$.

Next we verify (ii). Let $a \in \mathbb{R}$ and choose $\epsilon = 1/2$. Let $\delta$ be any positive real number. Let $x = a + \delta/2 \in \mathbb{R}$. Then $d'(x, a) = |x - a| = |(a + \delta/2) - a| = \delta/2 < \delta$. Since $x \neq a$, $d(g(x), g(a)) = d(x, a) = 1 > \epsilon$. Hence for $\epsilon = 1/2$, there is no $\delta > 0$ such that if $d'(x, a) < \delta$, then $d(g(x), g(a)) < \epsilon$. Therefore $g$ is not continuous at $a$. ♦

Continuity of functions defined from one metric space to another can also be described by means of open sets. For this, we need some additional definitions and notation. Let $(X, d)$ and $(Y, d')$ be metric spaces and let $f : X \to Y$. If $A$ is a subset of $X$, then its image $f(A)$ is the subset of $Y$ defined by $f(A) = \{f(x) : x \in A\}$. If $B$ is a subset of $Y$, then its inverse image $f^{-1}(B)$ is defined by $f^{-1}(B) = \{x \in X : f(x) \in B\}$.

To illustrate these concepts, consider a function $f : \mathbb{R} \to \mathbb{R}$, for some metric $d$ on $\mathbb{R}$, where $f$ is defined by $f(x) = x^2$ for all $x \in \mathbb{R}$. Thus $f(x)$ is a polynomial (whose graph is a parabola). Let $A = (-1, 2]$, $B = [-2, 2]$, and $C = [0, 4]$. Then $f(A) = C$, while $f^{-1}(C) = B$.

Now let $(X, d)$ and $(Y, d')$ be metric spaces, let $f : X \to Y$, and let $a \in X$. Suppose that for every $\epsilon > 0$, there exists $\delta > 0$ such that if $x \in X$ and $d(x, a) < \delta$, then $d'(f(x), f(a)) < \epsilon$. Then $f$ is continuous at $a$. Equivalently, $f$ is continuous at $a$ if for every $\epsilon > 0$ there exists $\delta > 0$ such that if $x \in S_\delta(a)$, then $f(x) \in S_\epsilon(f(a))$. Hence $f$ is continuous at $a$ if for every $\epsilon > 0$, there exists $\delta > 0$ such that $f(S_\delta(a)) \subseteq S_\epsilon(f(a))$. We now present a characterization of those functions $f$ that are continuous on the entire set $X$.
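The definitions of image and inverse image translate directly into set comprehensions. The following sketch is an illustration only: it replaces $\mathbb{R}$ by a small finite set of integers, and the names `image`, `inverse_image`, `square`, and `domain` are assumptions introduced here rather than notation from the text.

```python
def image(f, subset):
    """f(A) = {f(x) : x in A}."""
    return {f(x) for x in subset}

def inverse_image(f, domain, target):
    """f^{-1}(B) = {x in domain : f(x) in B}."""
    return {x for x in domain if f(x) in target}

def square(x):
    return x * x

domain = set(range(-3, 4))             # {-3, ..., 3}, a finite stand-in for R
A = {0, 1, 2}
C = image(square, A)                   # the image of A under squaring
B = inverse_image(square, domain, C)   # every point of the domain mapped into C
```

As with $A = (-1, 2]$ and $B = [-2, 2]$ in the example with $f(x) = x^2$, the inverse image of $f(A)$ contains $A$ but can be strictly larger, here because squaring also sends $-1$ and $-2$ into $C$.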
Theorem to Prove  Let $(X, d)$ and $(Y, d')$ be metric spaces and let $f : X \to Y$. Then $f$ is continuous on $X$ if and only if for each open set $O$ in $Y$, the inverse image $f^{-1}(O)$ is an open set in $X$.

Proof Strategy  Let's begin with the implication: If $f$ is continuous on $X$, then for each open set $O$ in $Y$, the inverse image $f^{-1}(O)$ is an open set in $X$. In a direct proof, we would assume that $f$ is continuous and that $O$ is an open set in $Y$. If $f^{-1}(O) = \emptyset$, then $f^{-1}(O)$ is an open set in $X$; while if $f^{-1}(O) \neq \emptyset$, then we must show that every element $x \in f^{-1}(O)$ is the center of an open sphere contained in $f^{-1}(O)$. So let $x \in f^{-1}(O)$. Thus $f(x) \in O$. We know that $O$ is open; so $O$ contains an open sphere $S_\epsilon(f(x))$. However, $f$ is continuous at $x$; so there exists $\delta > 0$ such that $f(S_\delta(x)) \subseteq S_\epsilon(f(x))$. Hence $S_\delta(x) \subseteq f^{-1}(O)$.

We also attempt a direct proof to verify the converse. We begin by assuming that for each open set $O$ in $Y$, the set $f^{-1}(O)$ is open in $X$. Our goal is to show that $f$ is continuous on $X$. We let $a \in X$ and $\epsilon > 0$ be given. The open sphere $S_\epsilon(f(a))$ is an open set in $Y$. By hypothesis, $f^{-1}(S_\epsilon(f(a)))$ is an open set in $X$. Furthermore, $a \in f^{-1}(S_\epsilon(f(a)))$. So there exists $\delta > 0$ such that $S_\delta(a) \subseteq f^{-1}(S_\epsilon(f(a)))$, that is, $f(S_\delta(a)) \subseteq S_\epsilon(f(a))$, and so $f$ is continuous at $a$. ♦

We now give a more concise proof.


Theorem 16.16 Let $(X, d)$ and $(Y, d')$ be metric spaces and let $f : X \to Y$. Then $f$ is continuous on $X$ if and only if for each open set $O$ in $Y$, the inverse image $f^{-1}(O)$ is an open set in $X$.

Proof. First assume that $f$ is continuous on $X$. Let $O$ be an open set in $Y$. We show that $f^{-1}(O)$ is open in $X$. If $f^{-1}(O) = \emptyset$, then $f^{-1}(O)$ is open; so we may assume that $f^{-1}(O) \neq \emptyset$. Let $x \in f^{-1}(O)$. Since $x \in f^{-1}(O)$, it follows that $f(x) \in O$. Since $O$ is open, there is an open sphere $S_\epsilon(f(x))$ contained in $O$. Since $f$ is continuous at $x$, there exists $\delta > 0$ such that $f(S_\delta(x)) \subseteq S_\epsilon(f(x)) \subseteq O$. Thus $S_\delta(x) \subseteq f^{-1}(O)$, as desired.

For the converse, assume that for each open set $O$ of $Y$, the inverse image $f^{-1}(O)$ is an open set of $X$. We show that $f$ is continuous on $X$. Let $a$ be an arbitrary point of $X$ and let $\epsilon > 0$ be given. The set $S_\epsilon(f(a))$ is open in $Y$ and so its inverse image $f^{-1}(S_\epsilon(f(a)))$ is open in $X$ and contains $a$. Then there exists $\delta > 0$ such that the open sphere $S_\delta(a) \subseteq f^{-1}(S_\epsilon(f(a)))$. Hence $f(S_\delta(a)) \subseteq S_\epsilon(f(a))$ and so $f$ is continuous at $a$. Thus $f$ is continuous on $X$.

With the aid of Theorem 16.16, it can now be shown that every constant function from one metric space to another is continuous.

Result 16.17 Let $(X, d)$ and $(Y, d')$ be metric spaces and let $f : X \to Y$ be a constant function, that is, $f(x) = c$ for all $x \in X$, where $c \in Y$. Then $f$ is continuous.

Proof. Let $O$ be an open set in $Y$. Then $f^{-1}(O) = \emptyset$ if $c \notin O$; otherwise $f^{-1}(O) = X$. In either case, $f^{-1}(O)$ is open. By Theorem 16.16, $f$ is continuous on $X$.

16.4 Topological Spaces

In the preceding section, we introduced the concept of a continuous function from one metric space to another, and the definition was formulated in terms of the metrics of the spaces involved. However, Theorem 16.16 shows that continuity of a function from one metric space to another can be characterized solely in terms of open sets, without direct reference to the metrics. This suggests the possibility of discarding metrics entirely, replacing them by open sets, and describing continuity in an even more general setting. This leads to another mathematical structure, called a topological space.

Let $X$ be a nonempty set and let $\tau$ (the Greek letter "tau") be a collection of subsets of $X$. Then $(X, \tau)$ is called a topological space, and $\tau$ itself is called a topology on $X$, if the following properties are satisfied:

(1) $X \in \tau$ and $\emptyset \in \tau$.
(2) If $O_1, O_2, \ldots, O_n \in \tau$, where $n \in \mathbb{N}$, then $\bigcap_{i=1}^{n} O_i \in \tau$.
(3) If, for an index set $I$, $O_\alpha \in \tau$ for each $\alpha \in I$, then $\bigcup_{\alpha \in I} O_\alpha \in \tau$.

In a topological space $(X, \tau)$, we refer to each element of $\tau$ as an open set of $X$. Property (1) states that $X$ and the empty set are open. Property (2) states that the intersection of any finite number of open sets is open; while property (3) states that the union of any number of open sets is open.

For example, for a nonempty set $X$, let $\tau_1 = \{\emptyset, X\}$ and $\tau_2 = \mathcal{P}(X)$, the set of all subsets of $X$. Then for $i = 1, 2$, $(X, \tau_i)$ is a topological space. The topology $\tau_1$ is called the trivial topology on $X$, while $\tau_2$ is the discrete topology on $X$. In $(X, \tau_1)$, the only open sets are $X$ and $\emptyset$; while in $(X, \tau_2)$, every subset of $X$ is open.
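For a finite set $X$, properties (1) through (3) can be checked by brute force. The sketch below is one possible checker, with names (`powerset`, `is_topology`) and sample collections chosen here for illustration. It relies on the fact that, when $\tau$ is finite, closure under pairwise intersections and pairwise unions already gives closure under all finite intersections and all unions.

```python
from itertools import chain, combinations

def powerset(xs):
    """The set of all subsets of xs, each as a frozenset."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))
    return {frozenset(s) for s in subsets}

def is_topology(X, tau):
    """Check properties (1)-(3) for a finite set X and a collection tau."""
    X = frozenset(X)
    tau = {frozenset(o) for o in tau}
    if any(not o <= X for o in tau):
        return False                      # every member must be a subset of X
    if X not in tau or frozenset() not in tau:
        return False                      # property (1)
    for a in tau:
        for b in tau:
            # Pairwise checks suffice for properties (2) and (3) when tau is finite.
            if (a & b) not in tau or (a | b) not in tau:
                return False
    return True

X = {1, 2, 3}
trivial = [set(), X]
discrete = powerset(X)
not_a_topology = [set(), {1}, {2}, X]     # {1} | {2} = {1, 2} is missing

results = (is_topology(X, trivial), is_topology(X, discrete),
           is_topology(X, not_a_topology))
```

The third collection fails only property (3), which shows that the three properties have to be checked separately.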

From the definition of a topological space and the properties of open sets in a metric space, it follows immediately that every metric space is a topological space. The converse, however, is not true. When we say that a topological space $(X, \tau)$ is a metric space, we mean that it is possible to define a metric $d$ on $X$ such that the set of open sets of $(X, d)$ is $\tau$.

Example 16.18 Let $X = \{a, b, c\}$ and $\tau = \{\emptyset, X, \{a\}, \{a, b\}, \{a, c\}\}$. Then $(X, \tau)$ is a topological space that is not a metric space.

Solution. To see that $(X, \tau)$ is a topological space, we need only observe that the union or intersection of any elements of $\tau$ also belongs to $\tau$. We now show that $(X, \tau)$ is not a metric space; that is, there is no way to define a metric on $X$ such that the resulting open sets are precisely the elements of $\tau$. We verify this by contradiction. Assume, to the contrary, that there is a metric $d$ such that the open sets in $(X, d)$ are the elements of $\tau$. Let $r = \min\{d(a, b), d(b, c)\}$. Necessarily $r > 0$. Then $S_r(b) = \{x \in X : d(x, b) < r\} = \{b\}$, which does not belong to $\tau$, a contradiction. ♦

We now present two additional examples of topological spaces, the first of which is suggested by the preceding example.

Result 16.19 Let $X$ be a nonempty set. For $a \in X$, let $\tau$ consist of $\emptyset$ and each subset of $X$ containing $a$. Then $(X, \tau)$ is a topological space.

Proof. Since $a \in X$, it follows that $X \in \tau$. Moreover, $\emptyset \in \tau$; so property (1) is satisfied. Let $O_1, O_2, \ldots, O_n$ be $n$ elements of $\tau$. If $O_i = \emptyset$ for some $i$ ($1 \leq i \leq n$), then $\bigcap_{i=1}^{n} O_i = \emptyset$ and so $\bigcap_{i=1}^{n} O_i \in \tau$. Otherwise, $a \in O_i$ for all $i$ with $1 \leq i \leq n$. So $a \in \bigcap_{i=1}^{n} O_i$, which implies that $\bigcap_{i=1}^{n} O_i \in \tau$. Thus property (2) is satisfied. Finally, for an index set $I$, let $\{O_\alpha\}_{\alpha \in I}$ be a collection of elements of $\tau$. If $O_\alpha = \emptyset$ for all $\alpha \in I$, then $\bigcup_{\alpha \in I} O_\alpha = \emptyset$ and so $\bigcup_{\alpha \in I} O_\alpha \in \tau$. Otherwise, $a \in O_\alpha$ for some $\alpha \in I$ and so $a \in \bigcup_{\alpha \in I} O_\alpha$. Thus $\bigcup_{\alpha \in I} O_\alpha \in \tau$ and property (3) is satisfied. Hence $(X, \tau)$ is a topological space.

Our next example of a topological space uses the extended DeMorgan laws (Theorem 16.12).

Result to Prove  Let $X$ be a nonempty set and let $\tau$ be the set consisting of $\emptyset$ and each subset of $X$ whose complement is finite. Then $(X, \tau)$ is a topological space.

Proof Strategy  If $X$ is a finite set, then $\tau$ consists of all subsets of $X$. In this case, $\tau$ is the discrete topology on $X$ and $(X, \tau)$ is a topological space. So we need only concern ourselves with the case where $X$ is infinite. We already know that $\emptyset \in \tau$. Also, $\overline{X} = \emptyset$, which is finite; so $X \in \tau$. Thus $(X, \tau)$ satisfies property (1) required of a topological space. To show that $(X, \tau)$ satisfies property (2), we let $O_1, O_2, \ldots, O_n \in \tau$ for $n \in \mathbb{N}$. We must show that $\bigcap_{i=1}^{n} O_i \in \tau$. If any of the open sets $O_1, O_2, \ldots, O_n$ is empty, then $\bigcap_{i=1}^{n} O_i = \emptyset$ and so $\bigcap_{i=1}^{n} O_i$ belongs to $\tau$. So we may assume that $O_i \neq \emptyset$ for all $i$ ($1 \leq i \leq n$). It must be shown that $\overline{\bigcap_{i=1}^{n} O_i}$ is finite. However, $\overline{\bigcap_{i=1}^{n} O_i} = \bigcup_{i=1}^{n} \overline{O_i}$ by DeMorgan's law. Since each set $\overline{O_i}$ is finite ($1 \leq i \leq n$), the union of these sets is also finite. Hence $\bigcap_{i=1}^{n} O_i \in \tau$ and property (2) is satisfied.

To show that property (3) is satisfied, we begin with an indexed family $\{O_\alpha\}_{\alpha \in I}$ of open sets in $X$ and are required to show that $\bigcup_{\alpha \in I} O_\alpha \in \tau$. We can proceed in a manner similar to the verification of property (2). ♦


Result 16.20 Let $X$ be a nonempty set and let $\tau$ be the set consisting of $\emptyset$ and each subset of $X$ whose complement is finite. Then $(X, \tau)$ is a topological space.

Proof. If $X$ is finite, then $\tau$ is the discrete topology. Hence we may assume that $X$ is infinite. Since the complement of $X$ is $\emptyset$, it follows that $X \in \tau$. Since $\emptyset \in \tau$ as well, (1) holds. Let $O_1, O_2, \ldots, O_n$ be $n$ elements of $\tau$. If $O_i = \emptyset$ for some $i$ ($1 \leq i \leq n$), then $\bigcap_{i=1}^{n} O_i = \emptyset \in \tau$. Hence we may assume that $O_i \neq \emptyset$ for all $i$ ($1 \leq i \leq n$). Thus each set $\overline{O_i}$ is finite. By DeMorgan's law, $\overline{\bigcap_{i=1}^{n} O_i} = \bigcup_{i=1}^{n} \overline{O_i}$. Since $\overline{\bigcap_{i=1}^{n} O_i}$ is a finite union of finite sets, it is finite. Thus $\bigcap_{i=1}^{n} O_i \in \tau$ and (2) is satisfied.

To verify (3), let $\{O_\alpha\}_{\alpha \in I}$ be any collection of elements of $\tau$. Again by DeMorgan's law,

$$\overline{\bigcup_{\alpha \in I} O_\alpha} = \bigcap_{\alpha \in I} \overline{O_\alpha}.$$

If $O_\alpha = \emptyset$ for all $\alpha \in I$, then $\overline{O_\alpha} = X$ for all $\alpha \in I$ and hence $\bigcap_{\alpha \in I} \overline{O_\alpha} = X$; that is, $\bigcup_{\alpha \in I} O_\alpha = \emptyset \in \tau$. So we may assume that there is some $\beta \in I$ such that $O_\beta \neq \emptyset$. Thus $\overline{O_\beta}$ is finite and $\bigcap_{\alpha \in I} \overline{O_\alpha} \subseteq \overline{O_\beta}$. Hence $\bigcap_{\alpha \in I} \overline{O_\alpha}$ is finite as well. Therefore $\bigcup_{\alpha \in I} O_\alpha \in \tau$ and (3) holds.

We saw in Theorem 16.7 that every two distinct points in a metric space $(X, d)$ belong to disjoint open spheres in $X$. Since open spheres are open sets in $X$, every two distinct points in $X$ belong to disjoint open sets. This is often a useful property for a topological space to have. A topological space $(X, \tau)$ is called a Hausdorff space (named for the mathematician Felix Hausdorff) if for each pair $a, b$ of distinct points of $X$, there exist disjoint open sets $O_a$ and $O_b$ of $X$ containing $a$ and $b$, respectively. The following result is a consequence of Theorem 16.7.

Corollary 16.21

Every metric space is a Hausdorff space.
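On a finite set, the Hausdorff condition can be tested directly from the definition. The sketch below uses a hypothetical helper `is_hausdorff`, introduced here for illustration, and applies it to the topology of Example 16.18 and to the discrete topology on the same three-element set.

```python
def is_hausdorff(X, tau):
    """Check that every two distinct points lie in disjoint open sets."""
    tau = [frozenset(o) for o in tau]
    for a in X:
        for b in X:
            if a == b:
                continue
            separated = any(a in U and b in V and not (U & V)
                            for U in tau for V in tau)
            if not separated:
                return False
    return True

X = {"a", "b", "c"}
# The topology of Example 16.18: every nonempty open set contains "a".
tau_example = [set(), X, {"a"}, {"a", "b"}, {"a", "c"}]
# The discrete topology on X: every subset is open.
tau_discrete = [set(), {"a"}, {"b"}, {"c"},
                {"a", "b"}, {"a", "c"}, {"b", "c"}, X]

example_is_hausdorff = is_hausdorff(X, tau_example)
discrete_is_hausdorff = is_hausdorff(X, tau_discrete)
```

Because every nonempty open set in the first topology contains $a$, no two open sets separate $a$ from $b$. Together with Corollary 16.21, this gives a second way to see that the space of Example 16.18 is not a metric space.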

On the other hand, not every topological space is a Hausdorff space and not every Hausdorff space is a metric space. We verify the first of these. The second is a deeper topic in topology.

Example 16.22 Let $X$ be an infinite set and let $\tau$ be the set consisting of $\emptyset$ and each subset of $X$ whose complement is finite. Then $(X, \tau)$ is a topological space that is not a Hausdorff space.

Solution. We saw in Result 16.20 that $(X, \tau)$ is a topological space; so it only remains to show that $(X, \tau)$ is not a Hausdorff space. Let $a$ and $b$ be two distinct elements of $X$. We claim that there do not exist two disjoint open sets, one containing $a$ and the other containing $b$. Assume, to the contrary, that there exist (nonempty) open sets $O_a$ and $O_b$ containing $a$ and $b$, respectively, such that $O_a \cap O_b = \emptyset$. Then, by DeMorgan's law, $\overline{O_a \cap O_b} = X = \overline{O_a} \cup \overline{O_b}$. Since $X$ is infinite, at least one of $\overline{O_a}$ and $\overline{O_b}$ is infinite. This implies that at least one of $O_a$ and $O_b$ is not open, which is a contradiction. ♦

16.5 Continuity in Topological Spaces

If $(X, d)$ and $(Y, d')$ are metric spaces, then by Theorem 16.16 a function $f : X \to Y$ is continuous if and only if $f^{-1}(O)$ is an open set in $X$ for every open set $O$ in $Y$. So instead of defining a function $f$ to be continuous in terms of distances in the two metric spaces (as we did), we could have defined $f$ to be continuous in terms of open sets. Since it would be meaningless to define a function from one topological space to another to be continuous in terms of distance, we have a logical alternative.

Let $(X, \tau)$ and $(Y, \tau')$ be two topological spaces. A function $f : X \to Y$ is defined to be continuous if $f^{-1}(O)$ is an open set in $X$ for every open set $O$ in $Y$. Let's see how this definition works in practice.

Result 16.23 Let $(X, \tau)$ and $(Y, \tau')$ be two topological spaces.

(i) If $\tau$ is the discrete topology on $X$, then every function $f : X \to Y$ is continuous.
(ii) Let $\tau$ be the trivial topology on $X$ and let $f : X \to Y$ be a surjective function. Then $f$ is continuous if and only if $\tau'$ is the trivial topology on $Y$.

Proof. We first verify (i). Let $O$ be an open set in $Y$. Since $f^{-1}(O)$ is a subset of $X$ and every subset of $X$ is open in the discrete topology, $f^{-1}(O)$ is an open set in $X$ and so $f$ is continuous.

Next we verify (ii). First assume that $\tau'$ is the trivial topology on $Y$. Then $Y$ and $\emptyset$ are the only open sets in $Y$. Since $f^{-1}(Y) = X$ and $f^{-1}(\emptyset) = \emptyset$ are open sets in $X$, it follows that $f$ is continuous. For the converse, assume that $\tau'$ is a topology on $Y$ that is not the trivial topology. Then there is an open set $O$ in $Y$ distinct from $Y$ and $\emptyset$. Since $f$ is surjective, $f^{-1}(O)$ is distinct from $X$ and $\emptyset$. Thus $f^{-1}(O)$ is not an open set in $X$, which implies that $f$ is not continuous.

Result 16.24 Let $(X, \tau)$ and $(Y, \tau')$ be topological spaces.

(i) The identity function $i : X \to X$ (defined by $i(x) = x$ for all $x \in X$) is continuous.
(ii) If $g : X \to Y$ is a constant function, that is, if $g(x) = c$ for all $x \in X$, where $c \in Y$, then $g$ is continuous.

Proof. We first verify (i). Let $O$ be an open set in $X$. Since $i^{-1}(O) = O$ is an open set in $X$, the function $i$ is continuous. Next we verify (ii). Let $O$ be an open set in $Y$. If $c \in O$, then $g^{-1}(O) = X$; while if $c \notin O$, then $g^{-1}(O) = \emptyset$. In either case, $g^{-1}(O)$ is an open set in $X$ and so $g$ is continuous.

Example 16.25 Let $X = \{a, b, c\}$ with the topology $\tau = \{\emptyset, X, \{a\}, \{a, b\}, \{a, c\}\}$ and let $f : X \to X$ be defined by $f(a) = b$, $f(b) = c$, and $f(c) = a$. Determine whether $f$ is continuous.

Solution. Since $O = \{a\}$ is an open set in $X$ and $f^{-1}(O) = \{c\}$ is not an open set in $X$, the function $f$ is not continuous. ♦

Based on the definition given of a continuous function from one metric space to another, for topological spaces $(X, \tau)$ and $(Y, \tau')$ it might seem more natural to define a function $f : X \to Y$ to be continuous if for every $x \in X$ and every open set $O$ of $Y$ containing $f(x)$, there is an open set $U$ of $X$ containing $x$ such that $f(U) \subseteq O$. This is equivalent to our definition, as we now show. First, a lemma is useful.

Lemma 16.26 Let $X$ and $Y$ be nonempty sets and let $f : X \to Y$ be a function. For every subset $B$ of $Y$, $f(f^{-1}(B)) \subseteq B$.

Proof. Let $y \in f(f^{-1}(B))$. Then there is $x \in f^{-1}(B)$ such that $f(x) = y$. Since $x \in f^{-1}(B)$, we have $f(x) \in B$. This implies that $y \in B$.

Result to Prove  Let $(X, \tau)$ and $(Y, \tau')$ be topological spaces. Then $f : X \to Y$ is continuous if and only if for every $x \in X$ and every open set $O$ of $Y$ containing $f(x)$, there exists an open set $U$ of $X$ containing $x$ such that $f(U) \subseteq O$.

Proof Strategy  Assume first that $f$ is continuous. Let $x \in X$ and let $O$ be an open set in $Y$ containing $y = f(x)$. What we are required to do is find an open set $U$ of $X$ containing $x$ such that $f(U) \subseteq O$. However, there is an obvious choice for $U$, namely $f^{-1}(O)$. An application of Lemma 16.26 completes the proof of this implication.

We then consider the converse. Assume that for every $x \in X$ and every open set $O$ of $Y$ containing $f(x)$, there is an open set $U$ of $X$ containing $x$ such that $f(U) \subseteq O$. Since our goal is to show that $f$ is continuous, we must show that for every open set $B$ of $Y$, the set $f^{-1}(B)$ is open in $X$. If $f^{-1}(B) = \emptyset$, then certainly $f^{-1}(B)$ is an open set; so we assume that $f^{-1}(B) \neq \emptyset$. If we can show that $f^{-1}(B)$ is the union of open sets, then $f^{-1}(B)$ is open. Let $x \in f^{-1}(B)$. Then $f(x) \in B$. By hypothesis, there exists an open set $U_x$ in $X$ containing $x$ such that $f(U_x) \subseteq B$. It follows that $f^{-1}(B)$ is a union of open sets in $X$. ♦

Result 16.27 Let $(X, \tau)$ and $(Y, \tau')$ be topological spaces. Then $f : X \to Y$ is continuous if and only if for every $x \in X$ and every open set $O$ of $Y$ containing $f(x)$, there exists an open set $U$ of $X$ containing $x$ such that $f(U) \subseteq O$.

Proof. First assume that $f$ is continuous. Let $x \in X$ and let $O$ be an open set in $Y$ containing $f(x)$. Since $f$ is continuous, $f^{-1}(O)$ is an open set in $X$ containing $x$. Let $U = f^{-1}(O)$. By Lemma 16.26, $f(U) = f(f^{-1}(O)) \subseteq O$.

For the converse, assume that for every $x \in X$ and every open set $O$ of $Y$ containing $f(x)$, there is an open set $U$ of $X$ containing $x$ such that $f(U) \subseteq O$. Let $B$ be an open set in $Y$. We show that $f^{-1}(B)$ is an open set in $X$. If $f^{-1}(B) = \emptyset$, then $f^{-1}(B)$ is open in $X$. Hence we may assume that $f^{-1}(B) \neq \emptyset$. For each $x \in f^{-1}(B)$, the set $B$ is an open set in $Y$ containing $f(x)$. By hypothesis, there is an open set $U_x$ in $X$ containing $x$ such that $f(U_x) \subseteq B$. Thus $U_x \subseteq f^{-1}(B)$. Hence $f^{-1}(B) = \bigcup_{x \in f^{-1}(B)} U_x$ and so $f^{-1}(B)$ is an open set in $X$.
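For finite topological spaces, both the definition of continuity and the criterion of Result 16.27 can be checked exhaustively. The sketch below is an illustration under assumptions made here: functions are represented as Python dictionaries, and the names `preimage`, `is_continuous`, and `is_continuous_pointwise` are hypothetical helpers. It is applied to the function of Example 16.25 and to the identity and a constant function on the same space.

```python
def preimage(f, X, O):
    """Inverse image of O under f, where f is a dict defined on X."""
    return frozenset(x for x in X if f[x] in O)

def is_continuous(f, X, tau_X, tau_Y):
    """Definition: the inverse image of every open set of Y is open in X."""
    open_X = {frozenset(o) for o in tau_X}
    return all(preimage(f, X, O) in open_X for O in tau_Y)

def is_continuous_pointwise(f, X, tau_X, tau_Y):
    """Criterion of Result 16.27: for each x and each open O containing f(x),
    some open U containing x has its image f(U) contained in O."""
    for x in X:
        for O in tau_Y:
            if f[x] not in O:
                continue
            if not any(x in U and all(f[u] in O for u in U) for U in tau_X):
                return False
    return True

X = {"a", "b", "c"}
tau = [set(), X, {"a"}, {"a", "b"}, {"a", "c"}]

f = {"a": "b", "b": "c", "c": "a"}       # the function of Example 16.25
identity = {x: x for x in X}
constant = {x: "b" for x in X}

checks = {name: (is_continuous(g, X, tau, tau),
                 is_continuous_pointwise(g, X, tau, tau))
          for name, g in [("f", f), ("identity", identity), ("constant", constant)]}
```

In each case the two tests agree, as Result 16.27 predicts. For $f$, both fail because the inverse image of the open set $\{a\}$ is $\{c\}$, which is not open.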

Chapter 16 Exercises

16.1 In each of the following, a distance $d$ is defined on the set $\mathbb{R}$ of real numbers. Determine which of the four properties of a metric space are satisfied by $d$. Verify your answers.
(a) $d(x, y) = y - x$
(b) $d(x, y) = (x - y) + (y - x)$
(c) $d(x, y) = |x - y| + |y - x|$
(d) $d(x, y) = x^2 + y^2$
(e) $d(x, y) = |x^2 - y^2|$
(f) $d(x, y) = |x^3 - y^3|$

16.2 For the points P1 = (x1 , y1 ) and P2 = (x2 , y2 ) on R2, the Manhattan metric d(P1 , P2 ) is defined by d(P1 , P2 ) = |x1 − x2 | + |y1 − y2 |. Prove that the Manhattan metric is in fact a metric on R2. 16.3 Let (X, d) be a metric space. For two points P1 = (x1 , y1 ) and P2 = (x2 , y2 ) in X 2 define: X × X → R by d (P1 , P2 ) = d(x1 , x2 ) + d(y1 , y2 ) . Which of the four properties of a metric space does d satisfy?

16.4 Let $(X, d)$ be a metric space. For two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ in $X^2$, define $d^* : X^2 \times X^2 \to \mathbf{R}$ by $d^*(P_1, P_2) = [d(x_1, x_2)]^2 + [d(y_1, y_2)]^2$. Which of the four properties of a metric space does $d^*$ satisfy?

16.5 Let $A$ be a set and let $a$ and $b$ be two distinct elements of $A$. A distance $d : A \times A \to \mathbf{R}$ is defined as follows:
\[
d(x, y) =
\begin{cases}
0 & \text{if } x = y \\
1 & \text{if } \{x, y\} = \{a, b\} \\
2 & \text{if } x \ne y \text{ and } \{x, y\} \ne \{a, b\}.
\end{cases}
\]

Which of the four properties of a metric space does this distance satisfy?

16.6 Let $(X, d)$ be a metric space.
(a) Define $d_1(x, y) = d(x, y)/[1 + d(x, y)]$. Prove that $d_1$ is a metric for $X$.
(b) Define $d_2(x, y) = \min\{1, d(x, y)\}$. Prove that $d_2$ is a metric for $X$.

16.7 In each of the following parts, a distance $d(P_1, P_2)$ between two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ in $\mathbf{R}^2$ is defined. Determine which of the four properties of a metric space are satisfied by each distance $d$. For those distances that are metrics, describe the associated open spheres.
(a) $d(P_1, P_2) = \min\{|x_1 - x_2|, |y_1 - y_2|\}$
(b) $d(P_1, P_2) = \max\{|x_1 - x_2|, |y_1 - y_2|\}$
(c) $d(P_1, P_2) = (|x_1 - x_2| + |y_1 - y_2|)/2$

16.8 Let $(\mathbf{R}^2, d)$ be the metric space whose distance $d(P_1, P_2)$ between two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ is given by $d(P_1, P_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. Prove that the set $S = \{(x, y) : -1 < x < 1 \text{ and } -1 < y < 1\}$ is open in $(\mathbf{R}^2, d)$.

16.9 Let $(\mathbf{R}^2, d)$ and $(\mathbf{R}^2, d')$ be metric spaces, where for two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ in $\mathbf{R}^2$, $d(P_1, P_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ and $d'(P_1, P_2) = |x_1 - x_2| + |y_1 - y_2|$. Prove each of the following.
(a) Every open set in $(\mathbf{R}^2, d)$ is open in $(\mathbf{R}^2, d')$.
(b) Every open set in $(\mathbf{R}^2, d')$ is open in $(\mathbf{R}^2, d)$.

16.10 In the metric space $(\mathbf{R}, d)$ with $d(x, y) = |x - y|$, determine which of the following sets are open, closed, or neither, and verify your answers.
(a) $(0, 1)$
(b) $[0, 1]$
(c) $(-\infty, 1]$
(d) $(0, \infty)$
(e) $(0, 2) - \{1\}$
(f) $\mathbf{Q}$
(g) $\mathbf{I}$
(h) $\{\frac{1}{n} \mid n \in \mathbf{N}\}$
(i) $\{\frac{1}{n} \mid n \in \mathbf{N}\} \cup \{0\}$

16.11 Let $(\mathbf{R}^2, d)$ be the metric space whose distance $d(P_1, P_2)$ between two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ in $\mathbf{R}^2$ is defined by $d(P_1, P_2) = |x_1 - x_2| + |y_1 - y_2|$, and let $(\mathbf{R}, d')$ be the metric space with $d'(a, b) = |a - b|$. Prove each of the following.
(a) The function $f : (\mathbf{R}^2, d) \to (\mathbf{R}, d')$ defined by $f(x, y) = \frac{1}{2}(x - y)$ is continuous.


(b) The function $g : (\mathbf{R}^2, d) \to (\mathbf{R}, d')$ defined by $g(x, y) = x$ is continuous.

16.12 Let $(\mathbf{R}^2, d)$ be the metric space whose distance $d(P_1, P_2)$ between two points $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ in $\mathbf{R}^2$ is defined by $d(P_1, P_2) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$ and let $d'$ be the discrete metric, that is,
\[
d'(P_1, P_2) =
\begin{cases}
0 & \text{if } P_1 = P_2 \\
1 & \text{if } P_1 \ne P_2.
\end{cases}
\]

Prove each of the following.
(a) The function $f : (\mathbf{R}^2, d') \to (\mathbf{R}^2, d)$ defined by $f(x, y) = (x, y)$ is continuous.
(b) The function $g : (\mathbf{R}^2, d) \to (\mathbf{R}^2, d')$ defined by $g(x, y) = (x, y)$ is not continuous.

16.13 Let $X = \{a, b, c, d\}$. Determine which of the following collections of subsets of $X$ are topologies on $X$. Verify your answers.
(a) $S_1 = \{\emptyset, \{a\}, \{a, b\}, \{a, c\}\}$
(b) $S_2 = \{\emptyset, X, \{a, b\}, \{a, c\}\}$
(c) $S_3 = \{\emptyset, X, \{a\}, \{a, b\}, \{a, d\}, \{a, b, d\}\}$

16.14 Prove the extended De Morgan law in Theorem 16.12(b).

16.15 Let $X$ be a nonempty set and let $S \subseteq X$. Let $\tau$ consist of $\emptyset$ and every subset of $X$ that contains $S$. Prove that $(X, \tau)$ is a topological space.

16.16 Let $(X, \tau)$ be a topological space. Prove that if $\{x\}$ is an open set for every $x \in X$, then $\tau$ is the discrete topology.

16.17 Let $(X, \tau)$ be a topological space, where $X$ is finite. Prove that $(X, \tau)$ is a metric space if and only if $\tau$ is the discrete topology on $X$.

16.18 (a) For a set $X$ with $a \in X$, let $\tau$ consist of $X$ together with all sets $S$ such that $a \notin S$. Prove that $(X, \tau)$ is a topological space.
(b) State and prove a generalization of the result in (a).

16.19 Let $X$ be a nonempty set and let $\tau$ be the set consisting of $\emptyset$ and every subset of $X$ whose complement is countable. Prove that $(X, \tau)$ is a topological space.

16.20 Let $a$, $b$ and $c$ be three distinct elements in a Hausdorff space $(X, \tau)$. Prove that there exist pairwise disjoint open sets $O_a$, $O_b$ and $O_c$ containing $a$, $b$ and $c$, respectively.

16.21 Let $\tau$ be the set consisting of $\emptyset$, $\mathbf{R}$ and each interval $(a, \infty)$, where $a \in \mathbf{R}$. It is known that $(\mathbf{R}, \tau)$ is a topological space. (Do not attempt to prove this.) Show that $(\mathbf{R}, \tau)$ is not a Hausdorff space.

16.22 Prove that if $(X, \tau)$ is a topological space with the discrete topology, then $(X, \tau)$ is a Hausdorff space.

16.23 Let $(\mathbf{N}, \tau)$ be a topological space, where $\tau$ consists of $\emptyset$ and every set $S \subseteq \mathbf{N}$ with $1 \in S$, and let $f : \mathbf{N} \to \mathbf{N}$ be a continuous permutation. Determine $f(1)$.
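For a finite set, the question asked in Exercise 16.13 can be explored mechanically. The following Python sketch uses a hypothetical helper `is_topology`, introduced here for illustration only. It relies on the fact that, for a finite collection, closure under pairwise unions and pairwise intersections implies closure under all unions and finite intersections. The two sample collections are on a three-element set and are not taken from the exercise.

```python
from itertools import combinations

def is_topology(X, collection):
    """Check whether `collection` is a topology on the finite set X."""
    X = frozenset(X)
    tau = {frozenset(s) for s in collection}
    # Every member must be a subset of X, and both the empty set and X must belong.
    if not all(s <= X for s in tau):
        return False
    if frozenset() not in tau or X not in tau:
        return False
    # For a finite collection, pairwise closure suffices.
    return all(A | B in tau and A & B in tau for A, B in combinations(tau, 2))

# Strings stand for sets of one-letter elements: "ab" is {a, b}, "" is the empty set.
print(is_topology("abc", ["", "abc", "a", "ab", "ac"]))
print(is_topology("abc", ["", "abc", "a", "b"]))
```

A negative answer from such a check pinpoints a failing property, but the exercise still asks for a written verification.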

16.24 Let $X = \{a, b, c\}$ with the topology $\tau = \{\emptyset, X, \{a\}, \{a, b\}, \{a, c\}\}$. Determine all continuous functions from $X$ to $X$.

16.25 Let $(X, \tau_1)$, $(Y, \tau_2)$ and $(Z, \tau_3)$ be topological spaces and let $f : X \to Y$ and $g : Y \to Z$ be functions. Prove that if $f$ and $g$ are continuous, then the composition $g \circ f$ is a continuous function from $X$ to $Z$.

16.26 Let $\tau$ be the trivial topology on a nonempty set $X$. Prove that if $f : X \to X$ is continuous, then $f$ is a constant function.

16.27 For the following statement S and the proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect, or (3) S is false and the proof is incorrect. Explain which of these occurs.
S: Let $X$ be an infinite set and let $\tau$ consist of $\emptyset$ and all infinite subsets of $X$. Then $(X, \tau)$ is a topological space.
Proof. Since $X$ is an infinite subset of $X$, it follows that $X \in \tau$. Since $\emptyset \in \tau$, property (1) of a topological space is satisfied. Let $O_1, O_2, \ldots, O_n$ be elements of $\tau$ for $n \in \mathbf{N}$. We show that $\bigcap_{i=1}^{n} O_i \in \tau$. If $O_i = \emptyset$ for some $i$ with $1 \le i \le n$, then $\bigcap_{i=1}^{n} O_i = \emptyset$ and so $\bigcap_{i=1}^{n} O_i \in \tau$. Otherwise, $O_i$ is infinite for all $i$ ($1 \le i \le n$). So $\bigcap_{i=1}^{n} O_i$ is infinite and hence $\bigcap_{i=1}^{n} O_i \in \tau$. Thus property (2) is satisfied. Next, let $\{O_\alpha\}_{\alpha \in I}$ be an indexed family of open sets. If $O_\alpha = \emptyset$ for every $\alpha \in I$, then $\bigcup_{\alpha \in I} O_\alpha = \emptyset$ and so $\bigcup_{\alpha \in I} O_\alpha \in \tau$. Otherwise, $O_\alpha$ is infinite for some $\alpha \in I$ and so $\bigcup_{\alpha \in I} O_\alpha$ is infinite. Thus $\bigcup_{\alpha \in I} O_\alpha \in \tau$. Hence $\tau$ is a topology on $X$.

16.28 Let $(X, \tau)$ and $(Y, \tau')$ be two topological spaces. If $\tau$ is the discrete topology on $X$, then by Result 16.23(i) every function $f : X \to Y$ is continuous. The converse of Result 16.23(i) is stated below along with a "proof."
Converse of Result 16.23(i): Let $(X, \tau)$ and $(Y, \tau')$ be two topological spaces. If every function from $X$ to $Y$ is continuous, then $\tau$ is the discrete topology on $X$.
"Proof." Assume that every function $f : X \to Y$ is continuous and assume, to the contrary, that $\tau$ is not the discrete topology on $X$. Then there is a subset $S$ of $X$ such that $S$ is not open in $X$. Thus $S$ is distinct from $X$ and $\emptyset$. Let $T$ be an open set in $Y$ and let $a, b \in Y$ with $a \in T$ and $b \notin T$. Define a function $f : X \to Y$ by
\[
f(x) =
\begin{cases}
a & \text{if } x \in S \\
b & \text{if } x \notin S.
\end{cases}
\]
Since $T$ is open in $Y$ and $f^{-1}(T) = S$ is not open in $X$, it follows that $f$ is not continuous, which is a contradiction.
(a) Is the proposed proof of the converse correct?
(b) If the answer to (a) is yes, then state Result 16.23(i) and its converse using "if and only if." If the answer to (a) is no, then revise the hypothesis of the converse so that the resulting statement is true (and supply a proof).

16.29 Let $X$ be a set with at least two elements and let $a \in X$. Prove or disprove:
(a) If $(X, d)$ is a metric space, then $X - \{a\}$ is an open set.

(b) If $(X, \tau)$ is a topological space, then $X - \{a\}$ is an open set.

16.30 For the following statement S and the proposed proof, either (1) S is true and the proof is correct, (2) S is true and the proof is incorrect, or (3) S is false and the proof is incorrect. Explain which of these occurs.
S: Let $(X, d)$ be a metric space. For every open set $O$ in $X$ with $O \ne \emptyset$ and every element $b \in O$, there exists an open sphere $S_r(b)$ in $X$ such that $S_r(b)$ and $O$ are disjoint.
Proof. Let $r = \min\{d(b, x) : x \in O\}$. Consider the open sphere $S_r(b)$. We claim that $S_r(b) \cap O = \emptyset$. Assume, to the contrary, that $S_r(b) \cap O \ne \emptyset$. Then there exists $y \in S_r(b) \cap O$. Since $y \in S_r(b)$, it follows that $d(b, y) < r$. However, since $y \in O$, this contradicts the fact that $r$ is the minimum distance between $b$ and an element of $O$.

16.31 Prove or disprove: Let $(X, d)$ be a metric space. For every open set $O$ in $X$ with $O \ne \emptyset$, there exist $b \in O$ and an open sphere $S_r(b)$ in $X$ such that $S_r(b)$ and $O$ are disjoint.

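Solutions and Hints to Selected Odd-Numbered Exercises in Chapters 14-16

Chapter 14

14.1 (a) Proof. Let $a, b \in k\mathbf{Z}$. Then $a = kx$ and $b = ky$ for some $x, y \in \mathbf{Z}$. Observe that $a + b = kx + ky = k(x + y)$ and $ab = (kx)(ky) = k(kxy)$. Since $x + y, kxy \in \mathbf{Z}$, it follows that $a + b, ab \in k\mathbf{Z}$; so addition and multiplication are binary operations on $k\mathbf{Z}$. Since $k\mathbf{Z} \subseteq \mathbf{Z}$ and the binary operations on $k\mathbf{Z}$ are the same as those on $\mathbf{Z}$, properties R1, R2, R5 and R6 are automatically satisfied. Furthermore, since $0 = k \cdot 0$ and $0 \in \mathbf{Z}$, it follows that $0 \in k\mathbf{Z}$, and so $k\mathbf{Z}$ has an additive identity. To show that property R4 is also satisfied, let $a \in k\mathbf{Z}$. Then $a = kx$, where $x \in \mathbf{Z}$. Thus $-a = -(kx) = k(-x)$. Since $-x \in \mathbf{Z}$, it follows that $-a \in k\mathbf{Z}$.

14.3 (a) Solution. We show that $(S, *, \circ)$ is not a ring. Certainly, $*$ and $\circ$ are binary operations on $S$. However, property R6 is not satisfied. To see this, let $a = b = c = 0$. Then $a \circ (b * c) = 0 \circ 1 = 0$ and $(a \circ b) * (a \circ c) = 0 * 0 = 1$. ♦

14.7 Proof. Let $a \in R$. Then $a^2 = a$. So $(a + a)^2 = (a + a)(a + a) = a(a + a) + a(a + a) = (a^2 + a^2) + (a^2 + a^2) = (a + a) + (a + a)$. Since $(a + a)^2 = a + a$, it follows that $(a + a) + (a + a) = (a + a) + 0$. Applying the Cancellation Law of Addition (Theorem 14.10), we obtain $a + a = 0$. Thus $-a = a$.

14.9 (a) Since the zero matrix $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ belongs to $S$, it follows that $S \ne \emptyset$. Let $M_1, M_2 \in S$. Then $M_1 = \begin{bmatrix} a_1 & 0 \\ 0 & b_1 \end{bmatrix}$ and $M_2 = \begin{bmatrix} a_2 & 0 \\ 0 & b_2 \end{bmatrix}$, where $a_i, b_i \in \mathbf{R}$ for $i = 1, 2$. Then $M_1 - M_2 = \begin{bmatrix} a_1 - a_2 & 0 \\ 0 & b_1 - b_2 \end{bmatrix}$ and $M_1 M_2 = \begin{bmatrix} a_1 a_2 & 0 \\ 0 & b_1 b_2 \end{bmatrix}$ belong to $S$. By the Subring Test, $S$ is a subring of $M_2(\mathbf{R})$.

14.11 Solution. The set $2G$ of even Gaussian integers is a subring of $G$.

Proof. Since $0 \in 2\mathbf{Z}$, it follows that $0 = 0 + 0i \in 2G$ and so $2G \ne \emptyset$. Let $x, y \in 2G$. Then $x = a_1 + b_1 i$ and $y = a_2 + b_2 i$, where $a_i, b_i \in 2\mathbf{Z}$ for $i = 1, 2$. Then $x - y = (a_1 - a_2) + (b_1 - b_2)i$ and $xy = (a_1 a_2 - b_1 b_2) + (a_1 b_2 + a_2 b_1)i$. Since $a_1 - a_2,\ b_1 - b_2,\ a_1 a_2 - b_1 b_2,\ a_1 b_2 + a_2 b_1 \in 2\mathbf{Z}$, it follows by the Subring Test that $2G$ is a subring of $G$.

14.13 (a) Since the zero matrix $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ belongs to $S$, it follows that $S \ne \emptyset$. Let $M_1, M_2 \in S$. Then $M_1 = \begin{bmatrix} a_1 & b_1 \\ 0 & 0 \end{bmatrix}$ and $M_2 = \begin{bmatrix} a_2 & b_2 \\ 0 & 0 \end{bmatrix}$, where $a_i, b_i \in \mathbf{R}$ for $1 \le i \le 2$. Then $M_1 - M_2 = \begin{bmatrix} a_1 - a_2 & b_1 - b_2 \\ 0 & 0 \end{bmatrix}$ and $M_1 M_2 = \begin{bmatrix} a_1 a_2 & a_1 b_2 \\ 0 & 0 \end{bmatrix}$ belong to $S$. By the Subring Test, $S$ is a subring of $M_2(\mathbf{R})$.
(b) Let $E = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$ and let $A = \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}$ be any element of $S$. Then $EA = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}$. Let $C = \begin{bmatrix} 2 & 3 \\ 0 & 0 \end{bmatrix} \in S$. Then $CE = \begin{bmatrix} 2 & 3 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 2 \\ 0 & 0 \end{bmatrix} \ne C$.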
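The matrix products in the solution of Exercise 14.13(b) are easy to confirm numerically. The sketch below is a minimal illustration in plain Python; the helper `matmul2` is a hypothetical name, and matrices are stored as tuples of rows. It checks that $EA = A$ for several matrices $A$ of the required form and that $CE \ne C$ for the matrix $C$ used in the solution.

```python
def matmul2(P, Q):
    """Multiply two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

E = ((1, 1), (0, 0))
C = ((2, 3), (0, 0))

# E acts as a left identity on matrices of the form [[a, b], [0, 0]] ...
left_identity = all(
    matmul2(E, ((a, b), (0, 0))) == ((a, b), (0, 0))
    for a in range(-2, 3)
    for b in range(-2, 3)
)
print(left_identity)

# ... but multiplying C by E on the right changes C.
print(matmul2(C, E), matmul2(C, E) != C)
```

The sampled values of $a$ and $b$ are integers for convenience; the algebraic computation in the solution covers all real entries.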

14.15 Proof. We first show that $(2\mathbf{Z}, +, \circ)$ is a ring. Certainly, $2\mathbf{Z}$ is closed under addition. Let $a, b, c \in 2\mathbf{Z}$. Then $a = 2x$, $b = 2y$ and $c = 2z$, where $x, y, z \in \mathbf{Z}$. So $a \circ b = (2x)(2y)/2 = 2(xy)$. Since $xy$ is an integer, $2\mathbf{Z}$ is closed under this multiplication. Since $(2\mathbf{Z}, +, \cdot)$ is a ring, where $\cdot$ is ordinary multiplication, $(2\mathbf{Z}, +, \circ)$ satisfies properties R1–R4 and the integer 0 is the zero element. Now $a \circ (b \circ c) = a \circ (bc/2) = a(bc)/4 = (ab)c/4 = (ab/2) \circ c = (a \circ b) \circ c$; so $(2\mathbf{Z}, +, \circ)$ satisfies property R5. Finally, $a \circ (b + c) = a(b + c)/2 = (ab/2) + (ac/2) = (a \circ b) + (a \circ c)$, and so $(2\mathbf{Z}, +, \circ)$ satisfies property R6. Therefore, $(2\mathbf{Z}, +, \circ)$ is a ring. Since $a \circ b = ab/2 = ba/2 = b \circ a$, the ring $(2\mathbf{Z}, +, \circ)$ is commutative. Since $a \circ 2 = (a \cdot 2)/2 = a$ and $2 \in 2\mathbf{Z}$, the integer 2 is a unity for $(2\mathbf{Z}, +, \circ)$. Next, assume that $a \circ b = 0$, where $a, b \in 2\mathbf{Z}$. Then $ab/2 = 0$ and so $ab = 0$, which implies that $a = 0$ or $b = 0$. Hence $(2\mathbf{Z}, +, \circ)$ is an integral domain.

14.19 Hint: Consider the following rings $R$ and subrings $S$:
(a) $R = M_2(\mathbf{R})$; $S = \left\{ \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix} : a \in \mathbf{R} \right\}$.
(b) $R = \mathbf{R}[x]$; $S = \{f \in \mathbf{R}[x] : f \text{ is a constant function}\}$.
(c) $R = \mathbf{Q} \times \mathbf{Z}$; $S = \mathbf{Q} \times \{0\}$.

14.21 Hint: First show that $\mathbf{Q}[i]$ is a subring of $\mathbf{C}$. Then show that every nonzero element of $\mathbf{Q}[i]$ is a unit.

14.23 (a) $\mathbf{Z}_n$ ($n \ge 2$) (b) $\mathbf{Z}$ (c) $M_2(\mathbf{Z}_2)$ (d) $M_2(\mathbf{R})$

14.25 (3). Now justify your answer.
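The identities used in the proof for Exercise 14.15 can be spot-checked on a sample of even integers. The Python sketch below is illustrative only; the name `circ` and the sample `evens` are assumptions introduced here. Integer division is exact in every call because the product of two even integers is divisible by 4.

```python
from itertools import product

def circ(a, b):
    """The multiplication a ∘ b = ab/2 on even integers."""
    return a * b // 2

evens = list(range(-6, 8, 2))  # sample: -6, -4, ..., 6

closed = all(circ(a, b) % 2 == 0 for a, b in product(evens, repeat=2))
associative = all(
    circ(a, circ(b, c)) == circ(circ(a, b), c)
    for a, b, c in product(evens, repeat=3)
)
distributive = all(
    circ(a, b + c) == circ(a, b) + circ(a, c)
    for a, b, c in product(evens, repeat=3)
)
commutative = all(circ(a, b) == circ(b, a) for a, b in product(evens, repeat=2))
unity_is_2 = all(circ(a, 2) == a for a in evens)
no_zero_divisors = all(
    circ(a, b) != 0 for a, b in product(evens, repeat=2) if a != 0 and b != 0
)

print(closed, associative, distributive, commutative, unity_is_2, no_zero_divisors)
```

Agreement on a finite sample supports the computations in the proof but does not replace them.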

Chapter 15

15.1 Proof. Let $u, v \in \mathbf{C}$ and $\alpha, \beta \in \mathbf{R}$. Then $u = a + bi$ and $v = c + di$, where $a, b, c, d \in \mathbf{R}$. Then $u + v = (a + bi) + (c + di) = (a + c) + (b + d)i$ and $\alpha u = \alpha(a + bi) = \alpha a + \alpha b i$. Since $a + c,\ b + d,\ \alpha a,\ \alpha b \in \mathbf{R}$, it follows that $u + v \in \mathbf{C}$ and $\alpha u \in \mathbf{C}$. Now $u + v = (a + c) + (b + d)i = (c + a) + (d + b)i = v + u$, and so property 1 is satisfied. Let $w = e + fi$, where $e, f \in \mathbf{R}$. Then $(u + v) + w = [(a + c) + (b + d)i] + (e + fi) = [(a + c) + e] + [(b + d) + f]i = [a + (c + e)] + [b + (d + f)]i = (a + bi) + [(c + e) + (d + f)i] = (a + bi) + [(c + di) + (e + fi)] = u + (v + w)$; so property 2 is satisfied. Let $z = 0 + 0i$. Since $u + z = (a + bi) + (0 + 0i) = a + bi = u$, property 3 is satisfied. Let $-u = (-a) + (-b)i$. Then $u + (-u) = (a + bi) + [(-a) + (-b)i] = 0 + 0i = z$, and so property 4 is satisfied. Since $\alpha(u + v) = \alpha[(a + bi) + (c + di)] = \alpha[(a + c) + (b + d)i] = (\alpha a + \alpha c) + (\alpha b + \alpha d)i = (\alpha a + \alpha b i) + (\alpha c + \alpha d i) = \alpha(a + bi) + \alpha(c + di) = \alpha u + \alpha v$, property 5 is satisfied. Now $(\alpha + \beta)u = (\alpha + \beta)(a + bi) = (\alpha + \beta)a + (\alpha + \beta)bi = \alpha a + \beta a + \alpha b i + \beta b i = (\alpha a + \alpha b i) + (\beta a + \beta b i) = \alpha(a + bi) + \beta(a + bi) = \alpha u + \beta u$. So property 6 is satisfied. Since $(\alpha\beta)u = (\alpha\beta)(a + bi) = (\alpha\beta)a + (\alpha\beta)bi = \alpha(\beta a) + \alpha(\beta b i) = \alpha(\beta a + \beta b i) = \alpha(\beta(a + bi)) = \alpha(\beta u)$, property 7 is satisfied. Finally, $1 \cdot u = 1(a + bi) = 1 \cdot a + 1 \cdot bi = a + bi = u$, and so property 8 is satisfied.

15.3 (a) Since $(1, 0, 0) + (0, 1, 0) = (1, 0, 0)$ and $(0, 1, 0) + (1, 0, 0) = (0, 1, 0)$, property 1 is not satisfied and so $\mathbf{R}^3$ is not a vector space.
(c) Let $v = (1, 0, 0)$ and let $z = (a, b, c)$ be a zero vector, where $a, b, c \in \mathbf{R}$. Then $v + z = (0, 0, 0) \ne v$; so property 3 is not satisfied and $\mathbf{R}^3$ is not a vector space.
(e) Let $v = (1, 0, 0)$. Since $1v = (0, 0, 1) \ne v$, property 8 is not satisfied and $\mathbf{R}^3$ is not a vector space.

15.5 Proof. Observe that $\alpha(-v) = \alpha((-1)v) = (\alpha(-1))v = (-\alpha)v = ((-1)\alpha)v = (-1)(\alpha v) = -(\alpha v)$.

15.7 (a) The statement is false. Since $z + z = z$, it follows that $-z = z$. ♦

15.9 (a) The set $W_1$ is a subspace of $\mathbf{R}^4$.
Proof. Since $(0, 0, 0, 0) \in W_1$, it follows that $W_1 \ne \emptyset$. Let $u, v \in W_1$ and $\alpha \in \mathbf{R}$. Then $u = (a, a, a, a)$ and $v = (b, b, b, b)$ for some $a, b \in \mathbf{R}$. Then $u + v = (a + b, a + b, a + b, a + b)$ and $\alpha u = (\alpha a, \alpha a, \alpha a, \alpha a)$. Since $u + v, \alpha u \in W_1$, it follows by the Subspace Test that $W_1$ is a subspace of $\mathbf{R}^4$.
(c) Since $(0, 0, 0, 1) \in W_3$ but $2(0, 0, 0, 1) \notin W_3$, it follows that $W_3$ is not closed under scalar multiplication and so $W_3$ is not a subspace of $\mathbf{R}^4$. ♦

15.11 (a) The set $U_1$ is a subspace of $\mathbf{R}[x]$.
Proof. Since the zero function $f_0$ defined by $f_0(x) = 0$ for all $x \in \mathbf{R}$ belongs to $U_1$, it follows that $U_1 \ne \emptyset$. Let $f, g \in U_1$ and $\alpha \in \mathbf{R}$. Then there are constants $a$ and $b$ such that $f(x) = a$ and $g(x) = b$ for all $x \in \mathbf{R}$. Then $(f + g)(x) = f(x) + g(x) = a + b$ and $(\alpha f)(x) = \alpha f(x) = \alpha a$. Since $f + g, \alpha f \in U_1$, it follows by the Subspace Test that $U_1$ is a subspace of $\mathbf{R}[x]$.
(b) Since the function $h$ defined by $h(x) = x^3$ for all $x \in \mathbf{R}$ belongs to $U_2$ but $(0 \cdot h)(x) = 0 \cdot h(x) = 0 \cdot x^3 = 0$ does not belong to $U_2$, it follows that $U_2$ is not closed under scalar multiplication and so $U_2$ is not a subspace of $\mathbf{R}[x]$. ♦

15.15 Proof. Since $(0, 0)$, that is, $x = 0$ and $y = 0$, is a solution of the equation, $(0, 0) \in S$ and so $S \ne \emptyset$. Let $(x_1, y_1), (x_2, y_2) \in S$ and $\alpha \in \mathbf{R}$. Then $3x_1 - 5y_1 = 0$ and $3x_2 - 5y_2 = 0$. Thus $3(x_1 + x_2) - 5(y_1 + y_2) = (3x_1 - 5y_1) + (3x_2 - 5y_2) = 0$. So $(x_1 + x_2, y_1 + y_2) \in S$. Also, $3(\alpha x_1) - 5(\alpha y_1) = \alpha(3x_1 - 5y_1) = \alpha \cdot 0 = 0$, and so $\alpha(x_1, y_1) = (\alpha x_1, \alpha y_1) \in S$. Therefore, by the Subspace Test, $S$ is a subspace of $\mathbf{R}^2$.

15.17 $i = -\frac{1}{2}u_1 + \frac{1}{2}u_2 + \frac{1}{2}u_3$.

15.19 Proof. Let $v \in W$. Then $v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$, where $c_i \in \mathbf{R}$ for $1 \le i \le n$. Also, let $v_i = a_{i1} w_1 + a_{i2} w_2 + \cdots + a_{im} w_m$, where $a_{ij} \in \mathbf{R}$ for $1 \le i \le n$ and $1 \le j \le m$. Then
\[
v = [c_1\ c_2\ \cdots\ c_n]
\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
= [c_1\ c_2\ \cdots\ c_n]\, A
\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{bmatrix},
\quad \text{where } A =
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1m} \\
a_{21} & a_{22} & \cdots & a_{2m} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nm}
\end{bmatrix}.
\]
Thus $v$ is a linear combination of $w_1, w_2, \ldots, w_m$ and so $v \in W'$.

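15.21 Proof. We first show that $\langle u, v \rangle \subseteq \langle u, 2u + v \rangle$. Observe that $u \in \langle u, 2u + v \rangle$ and $v = (-2)u + 1 \cdot (2u + v) \in \langle u, 2u + v \rangle$. By Exercise 15.19, $\langle u, v \rangle \subseteq \langle u, 2u + v \rangle$. Next we show that $\langle u, 2u + v \rangle \subseteq \langle u, v \rangle$. Since $u \in \langle u, v \rangle$ and $2u + v$ is a linear combination of $u$ and $v$, it again follows by Exercise 15.19 that $\langle u, 2u + v \rangle \subseteq \langle u, v \rangle$.

15.23 Hint: One possibility is to choose $w = (1, 0, 0)$. Now consider $au + bv + cw = (0, 0, 0)$, where $a, b, c \in \mathbf{R}$.

15.25 (a) The set $S$ is not linearly independent since $1 + (-1)\sin^2 x + (-1)\cos^2 x = 0$ for all $x \in \mathbf{R}$. (b) The set S is linearly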