BIB-VERSION:: CS-TR-v2.0
ID:: CORNELLCS//TR94-1411
ENTRY:: 1994-03-17
ORGANIZATION:: Cornell University, Computer Science Department
LANGUAGE:: English
TITLE:: A New Approach to Teaching Mathematics
AUTHOR:: Gries, David 
AUTHOR:: Schneider, Fred B. 
DATE:: February 1994
PAGES:: 22
ABSTRACT::
We propose a new approach to teaching discrete math: First, teach logic as a 
powerful and versatile tool for discovering and communicating truths; then 
use this tool in all other topics of the course. We spend 6 weeks teaching an 
equational style of propositional and predicate calculus, thereby ensuring 
that students gain a fluency in logical notation and some skill in its use. 
We teach basic heuristics for developing proofs, and we relate such proofs to 
more common informal proofs in mathematics. Then, we use logic extensively 
and rigorously in teaching topics like set theory,  relations and functions, 
a theory of integers, induction, combinatorics, and solving recurrence 
relations.

Success in teaching logic as a tool means that students lose their fear of 
mathematics and formalism, gain a positive view of rigorous proofs, learn to 
appreciate the use of syntactic manipulation, and begin using logic in other 
areas of study. Our experience in teaching discrete math at Cornell shows
that such success is possible.
END:: CORNELLCS//TR94-1411
BODY::
A New Approach to
Teaching Mathematics
David Gries*
Fred B Schneider**
TR 94-1411
February 1994
Department of Computer Science
Cornell University
Ithaca, NY 14853-7501
* Supported by DARPA under ONR grant N00014-91-J-4123.
** Supported by ONR under contract N00014-91-J-1219, AFOSR under proposal
93NM312, NSF under grant CCR-8701103, and DARPA/NSF grant CCR-9014363.
A New Approach to Teaching Mathematics
David Gries1 and Fred B. Schneider2
Computer Science, Cornell University
February 1994
1 Introduction
Generally speaking, mathematicians and computer scientists are not satisfied with the level
at which college students understand math. Students have difficulty constructing proofs,
their reasoning abilities are inadequate, and they don't know what "rigor" means. Moreover,
their fear of formalism and rigor impedes learning in science and engineering courses, as well
as in math courses.
This paper describes a new approach to teaching discrete math at the freshman level.
Our experience with the approach leads us to believe that it engenders a fresh and positive
outlook on mathematics and proof, addressing the problems mentioned above.
A key ingredient of our approach is an equational treatment of propositional and pred-
icate calculus instead of the natural-deduction and Hilbert styles that currently prevail.
The equational treatment has several advantages.
1. It is easily learned, since it builds on students' familiarity with equational reasoning
from high-school algebra.
2. After just three days of teaching logic, problems can be solved that would be difficult
without the logic, thus providing motivation for further study and for developing a
skill.
3. The logic can be applied rigorously, without complexity and detail overwhelming (a
criticism of other logics), so it becomes a useful tool.
4. The accompanying rigid proof format allows the introduction and discussion of proof
principles and strategies, thereby giving students direction in developing proofs.
5. The logic can be extended to other areas, including all those typically taught in discrete
math courses (e.g. set theory and a theory of integers).
6. The problem-solving paradigm that students see in high school is reinforced: (0) formalize,
then (1) manipulate the formalization according to rigorous rules, and finally
(2) interpret the results.
1 Supported by DARPA under ONR grant N00014-91-J-4123.
2 Supported by ONR under contract N00014-91-J-1219, AFOSR under proposal 93NM312, NSF under
grant CCR-8701103, and DARPA/NSF grant CCR-9014363.
Our course in discrete math starts with six weeks of propositional and predicate logic.
The emphasis is on giving students a skill in formal manipulation. Students see many
rigorous proofs and develop many themselves, in a rigid proof style (not in English). They
practice applying proof principles and strategies made possible by the proof style. Along the
way, they learn that attention to rigor may be a simplifying force instead of an onerous
burden.
The unit on logic does include a discussion of informal presentations of proofs typically
found in mathematics (e.g. proof by contradiction) and how they are based in logic (e.g.
proof by contradiction is based on the theorem $p \equiv \neg p \Rightarrow \mathit{false}$). Thus, students see the
connection between the informal proof outlines they see in other courses and our rigorous
proof style.
The rest of our course covers a variety of other topics of discrete math, e.g. set theory,
a theory of integers, mathematical induction, relations and functions, combinatorics, and
solving recurrence relations. Each topic is presented in the same rigorous style in which
logic was presented, by giving the necessary axioms for the theory and building up a library
of theorems. For example, set theory is introduced by adding to pure predicate logic the
axioms that characterize set comprehension and the set operators. Proofs about these set
operators are then presented in the same style as before. Thus, the notions of proof and
proof style become the unifying force in the course.
Many believe that stressing rigor turns students away from math. Our experience is just
the opposite. Students take comfort in the rigor, because it gives them a basis for knowing
when a proof is correct. And, we find our students applying what they have learned to other
courses before our discrete math course ends. However, success in teaching logic and rigor
does require that students have time to digest the new notations and to become skilled in
formula manipulation.
The paper proceeds as follows. First, we compare our equational proofs with more
conventional approaches. We then turn (in Section 3) to our equational propositional logic
and discuss how we teach it. We follow this up with a discussion of quantification and
predicate logic. In Section 5, we show how to use our logic as the basis for other topics in
discrete math. Our approach to logic meets resistance from those who feel that meaning
and understanding are sacrificed when formal manipulation is emphasized. We address this
criticism in Section 6. We conclude in Section 7 with some comments about experiences in
teaching the new approach.
2 A comparison of proof methods
Consider proving, from set theory, that union distributes over intersection:
(1) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
Most mathematicians favor the following kind of proof, or at least their texts contain such
proofs. 3
Proof. We first show that $A \cup (B \cap C) \subseteq (A \cup B) \cap (A \cup C)$. If $x \in A \cup (B \cap C)$,
then either $x \in A$ or $x \in B \cap C$. If $x \in A$, then certainly $x \in A \cup B$ and $x \in A \cup C$,
so $x \in (A \cup B) \cap (A \cup C)$. On the other hand, if $x \in B \cap C$, then $x \in B$ and $x \in C$,
so $x \in A \cup B$ and $x \in A \cup C$, so $x \in (A \cup B) \cap (A \cup C)$. Hence, $A \cup (B \cap C) \subseteq
(A \cup B) \cap (A \cup C)$.
Conversely, if $y \in (A \cup B) \cap (A \cup C)$, then $y \in A \cup B$ and $y \in A \cup C$. We consider
two cases: $y \in A$ and $y \notin A$. If $y \in A$, then $y \in A \cup (B \cap C)$, and this part is done.
If $y \notin A$, then, since $y \in A \cup B$, we must have $y \in B$. Similarly, since $y \in A \cup C$ and
$y \notin A$, we have $y \in C$. Thus, $y \in B \cap C$, and this implies $y \in A \cup (B \cap C)$. Hence
$(A \cup B) \cap (A \cup C) \subseteq A \cup (B \cap C)$. The theorem follows.
This proof has several shortcomings:
o+ It is, unnecessarily, a ping-pong argument --Hequality is shown by proving two set
inclusions.
o+ The two case analyses complicate the argument.
o+ The use of English obscures the structure of the proof.
o+ Some expressions, for example x E A and x E A u B , are repeated several times, thus
lengthening the proof unnecessarily. (Replacing some of the expressions with pronouns
would only make matters worse by introducing ambiguity)
The justifications for inferences within the proof are not given; too much is left for the
reader to fill in. For example, the proof says, "If x ? A, then, since x E A u B we
must have x c B", but no reference is given to the theorem being used to substantiate
this inference.
The proof style is too undisciplined to invite useful discussion of proof strategies and
principles. After seeing one proof like this, students still have difficulty constructing
similar proofs.
Lay [7, p. 361 gives this proof, with a few fon??ulas replaced by blanks for the reader to 1111 in.
3
Some mathematicians and computer scientists favor teaching Gerhard Gentzen's natural-
deduction system [3] or something similar. They argue (correctly) that Gentzen developed
natural deduction in order to formalize how mathematicians argue in English. However,
formalizing contorted and difficult English arguments is of dubious utility: the contortions
and difficulties remain. We favor proof methods that allow us to bypass problems introduced
by the use of imprecise and unwieldy natural language.
Note that we do not advocate altogether avoiding English, informality, ping-pong ar-
guments, or case analysis. They should be used when they lead to proofs that are more
accessible to the reader. Too often, however, they are used to ease the task of the writer at
the expense of the reader, when a more mathematical style would be more effective.
Having criticized the traditional proof style, we now illustrate a better one, by giving our
own proof that union distributes over intersection. Our treatment of set theory extends an
equational logic. For this discussion, then, we assume that theorems of propositional logic
and pure predicate logic, like the following one, have already been proved.
(2) $p \lor (q \land r) \equiv (p \lor q) \land (p \lor r)$
Equality of sets is defined using the axiom of Extensionality (we describe our notation for
quantification in Section 4).
(3) Extensionality: $S = T \equiv (\forall v \mid: v \in S \equiv v \in T)$
Further, union and intersection of sets are defined by membership tests.
(4) $v \in B \cup C \equiv v \in B \lor v \in C$
(5) $v \in B \cap C \equiv v \in B \land v \in C$
Note that set membership, set equality, and other set operators are defined axiomatically,
and in a way that is useful in formal manipulation, not, as is traditional, by an informal
model of evaluation.
We now prove theorem (1): union distributes over intersection.
Proof. By Extensionality (3), we can prove (1) by showing that an arbitrary element v is
in the LHS of (1) exactly when it is in the RHS:
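      $v \in A \cup (B \cap C)$
$\equiv$    $\langle$ Definition of $\cup$ (4) $\rangle$
      $v \in A \lor v \in B \cap C$
$\equiv$    $\langle$ Definition of $\cap$ (5) $\rangle$
      $v \in A \lor (v \in B \land v \in C)$
$\equiv$    $\langle$ Distributivity of $\lor$ over $\land$ (2) $\rangle$
      $(v \in A \lor v \in B) \land (v \in A \lor v \in C)$
$\equiv$    $\langle$ Definition of $\cup$ (4), twice $\rangle$
      $(v \in A \cup B) \land (v \in A \cup C)$
$\equiv$    $\langle$ Definition of $\cap$ (5) $\rangle$
      $v \in (A \cup B) \cap (A \cup C)$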
This proof uses substitution of equals for equals (or "Leibniz", as we call it) as its main
inference rule, as well as transitivity of equality to link steps. College freshmen are familiar
with these rules from high-school algebra and therefore become comfortable with the proof
style rather quickly.
The proof has the following nice properties:
o+ All arguments are explicit and rigorous: ??ith each step, a hint between the two equal
terms gives the theorem that indicates why the two terms are equal.
o+ The proof is simple and direct. Its simplicity accentuates, rather than hides, the
relation between disjunction and union and between conjunction and intersection.
The strategy used in tlie proof is easy to identify and tea4?: To prove something
about operators (here, u and n), use their definitions to eliminate them, perform
some manipulations, and then use their definitions to reintroduce them.
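The equational proof can be mirrored by a small brute-force check. The following is only a sketch in Python, not part of the course material: the function names and the three-element universe are assumptions made for illustration, and a finite check is a sanity test rather than a proof. It evaluates each formula of the calculation for one element and confirms that all of them agree, alongside a truth-table check of theorem (2).

```python
from itertools import combinations, product

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def propositional_law_holds():
    """Truth-table check of theorem (2)."""
    return all((p or (q and r)) == ((p or q) and (p or r))
               for p, q, r in product([False, True], repeat=3))

def membership_steps(v, a, b, c):
    """Evaluate the six formulas of the equational proof for one element v."""
    return [
        v in a | (b & c),
        v in a or v in b & c,
        v in a or (v in b and v in c),
        (v in a or v in b) and (v in a or v in c),
        (v in a | b) and (v in a | c),
        v in (a | b) & (a | c),
    ]

def set_law_holds(universe):
    """Check theorem (1) for every A, B, C among the subsets of `universe`."""
    subsets = powerset(universe)
    return all(a | (b & c) == (a | b) & (a | c)
               for a in subsets for b in subsets for c in subsets)

if __name__ == "__main__":
    print(propositional_law_holds())
    print(set_law_holds({0, 1, 2}))
```

Each entry of `membership_steps` corresponds to one line of the calculation, so equal values across the list reflect the chain of equivalences linked by transitivity.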
Now consider a theorem whose proof involves quantification (even though the statement
of the theorem does not). At the recent AMS meeting in Cincinnati, a mathematician
demonstrated a computer program for helping in the development of proofs. The program's
interface was elegant and easy to use, but the underlying proof system was not. The logic
was natural deduction, augmented with definitions from set theory. The mathematician
demonstrated his system with a proof of
(6) $A \subseteq C \land B \subseteq C \Rightarrow A \cup B \subseteq C$
Because the definition of $\subseteq$ is in terms of universal quantification,
(7) $B \subseteq C \equiv (\forall x \mid x \in B : x \in C)$
his formal proof required several levels of nesting (there were proofs within proofs within
proofs), as well as case analysis. The proof was not easy to follow or develop. Below is
our simpler, equational proof. It happens to use the same proof strategy as the previous
equational proof: eliminate an operator, manipulate, and reintroduce the operator.
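      $A \subseteq C \land B \subseteq C$
$\equiv$    $\langle$ Definition of $\subseteq$ (7), twice $\rangle$
      $(\forall x \mid x \in A : x \in C) \land (\forall x \mid x \in B : x \in C)$
$\equiv$    $\langle$ Split range $\rangle$
      $(\forall x \mid x \in A \lor x \in B : x \in C)$
$\equiv$    $\langle$ Definition of $\cup$ (4) $\rangle$
      $(\forall x \mid x \in A \cup B : x \in C)$
$\equiv$    $\langle$ Definition of $\subseteq$ (7) $\rangle$
      $A \cup B \subseteq C$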
An equivalence has been proved, instead of implication (6)! Extending the speaker's proof
to justify this equivalence would double the length of his proof, because natural deduction
is being used.
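That the calculation establishes an equivalence, and not just implication (6), can be spot-checked exhaustively on a small universe. This is a hedged sketch: `powerset` and `subset_equivalence_holds` are hypothetical helper names, and the check covers only the finite universe supplied.

```python
from itertools import combinations

def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for size in range(len(items) + 1)
            for c in combinations(items, size)]

def subset_equivalence_holds(universe):
    """Check (A <= C and B <= C) == (A | B <= C) for all subsets A, B, C."""
    subsets = powerset(universe)
    return all(((a <= c) and (b <= c)) == ((a | b) <= c)
               for a in subsets for b in subsets for c in subsets)

if __name__ == "__main__":
    print(subset_equivalence_holds({0, 1, 2}))
```

The comparison uses `==` between the two boolean sides, so a failure in either direction of the equivalence would be detected.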
In the two examples presented, the equational approach is far superior to more conven-
tional approaches. There is much less writing, the proofs are absolutely rigorous yet simple,
and the proofs are easy to digest and repeat to others. All one has to know is an equational
version of the pure predicate calculus and the few definitions of set theory.
Were these just isolated instances of the superiority of the equational approach, there
would be little reason to discuss it. However, in our experience, the equational approach
exhibits these advantages in all topics that are traditionally included in a first-semester discrete
math course: set theory, theory of integers, modular arithmetic, mathematical induction,
relations, functions, combinatorics, solving recurrence relations, and modern algebra. This
should not be surprising, since logic is the glue that binds together arguments in all domains.
Moreover, bringing this glue to the fore provides the unifying theme that has been missing
for so long in discrete math courses.
3 Equational propositional logic
We now outline our equational propositional logic and discuss teaching it. The logic has
the same theorems as any conventional propositional logic; the difference is in the inference
rules, the axioms (and the order in which they are introduced), and the definition of a proof.
For example, equivalence (i.e. equality over the boolean domain) plays a prominent role in
our logic; in comparison, it is a second-class citizen in most other propositional logics, where
implication dominates.
We profit from using two different symbols for equality. Generally speaking, the expression
$b = c$ is defined as long as $b$ and $c$ have the same type, e.g. both booleans, both
integers, both sets of integers, both graphs. Further, equality is treated conjunctionally:
$b = c = d$ is an abbreviation for $b = c \land c = d$.
Symbol $\equiv$ is used only for equality over the booleans: $b \equiv c$ is the same as $b = c$ if
$b$ and $c$ are boolean. Symbol $\equiv$ is assigned a lower precedence than $=$; this allows us to
eliminate some parentheses. For example, we can write
$x = y \lor x < y \quad\equiv\quad x \le y$
(Note how spacing is used to help the reader with precedences.) Formal manipulation will
be used often, and we need ways to keep formulas simple. 4
A more important benefit arises from having two different symbols for equality: we can
make use of the associativity of equality over the booleans. Thus, we write $b \equiv c \equiv d$
for either $b \equiv (c \equiv d)$ or $(b \equiv c) \equiv d$, since they are equivalent. Had we used
only $=$ for equality, we could not have benefited from associativity because $a = b = c$
already means $a = b \land b = c$. As with arithmetic manipulations, we often use symmetry
(commutativity) and associativity of operators without explicit mention. Logicians have
not made use of the associativity of $\equiv$. For example, in [8], Rosser uses $\equiv$ conjunctionally
instead of associatively. Perhaps this is because logicians have been more interested in
studying rather than using logic.
The three inference rules of our equational logic are given below, using the notation
E[v := P] to denote textual substitution of expression P for free occurrences of variable v
in expression E:
Leibniz: $\dfrac{P = Q}{E[v := P] = E[v := Q]}$
Transitivity: $\dfrac{P = Q,\;\; Q = R}{P = R}$
Substitution: $\dfrac{P}{P[v := Q]}$
A theorem of our propositional calculus is either (i) an axiom or (ii) a boolean expression
that, using the inference rules, is proved equal to an axiom or a previously proved theorem.
We also have, as a metatheorem, that $P = Q$ is a theorem iff $P$ can be transformed to
$Q$ (or $Q$ to $P$) using these inference rules.
Our proofs generally follow the format of the equational proofs given earlier (although
there are some extensions). With this proof format, uses of inference rules can be left
implicit: there is no need to mention the rules. Contrast this with natural deduction and
Hilbert-style logics, where the plethora of inference rules dictates the explicit mention of
each use of a rule.
Inference rule Leibniz is used to infer an equality; its use is indicated in proofs as:
      $E[v := P]$
$=$    $\langle$ $P = Q$ $\rangle$
      $E[v := Q]$
4 As another simplification, we write the application of a function f to a simple argument b as f.b.
Inference rule Transitivity is used without mention in the usual fashion; for example, it is
transitivity that allows us to conclude $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$ in the proof
of theorem (1) in Section 2. And, inference rule Substitution is used to generate an instance of a theorem
that is to be the premise of an application of Leibniz. For example, in the step
      $p \land q \land r$
$\equiv$    $\langle$ $p \land q \equiv q \land p$, with $q := q \land r$ $\rangle$
      $q \land r \land p$
the premise of the instance of Leibniz that is being used is $p \land q \equiv q \land p$ with $q := q \land r$,
i.e. $p \land (q \land r) \equiv (q \land r) \land p$. Often in our proofs, Substitution is unmentioned when its
use is obvious.
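Table 1: Axioms of Equational Logic
(8)  Associativity of $\equiv$: $((p \equiv q) \equiv r) \equiv (p \equiv (q \equiv r))$
(9)  Symmetry of $\equiv$: $p \equiv q \equiv q \equiv p$
(10) Identity of $\equiv$: $\mathit{true} \equiv q \equiv q$
(11) Definition of false: $\mathit{false} \equiv \neg \mathit{true}$
(12) Distributivity of $\neg$ over $\equiv$: $\neg(p \equiv q) \equiv \neg p \equiv q$
(13) Definition of $\not\equiv$: $(p \not\equiv q) \equiv \neg(p \equiv q)$
(14) Symmetry of $\lor$: $p \lor q \equiv q \lor p$
(15) Associativity of $\lor$: $(p \lor q) \lor r \equiv p \lor (q \lor r)$
(16) Idempotency of $\lor$: $p \lor p \equiv p$
(17) Distributivity of $\lor$ over $\equiv$: $p \lor (q \equiv r) \equiv p \lor q \equiv p \lor r$
(18) Excluded Middle: $p \lor \neg p$
(19) Golden rule: $p \land q \equiv p \equiv q \equiv p \lor q$
(20) Definition of Implication: $p \Rightarrow q \equiv p \lor q \equiv q$
(21) Consequence: $p \Leftarrow q \equiv q \Rightarrow p$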
The axioms of our equational logic are given in Table 1, ordered and grouped as we teach
them. Equivalence is introduced first. And, because the first axiom says that equivalence
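is associative, thereafter we eliminate parentheses from sequences of equivalences. For example,
Symmetry of $\equiv$ (9) is given as $p \equiv q \equiv q \equiv p$. Associativity of $\equiv$ allows us to
parse (9) in five ways, thus reducing the number of axioms and theorems that have to be
listed: $((p \equiv q) \equiv q) \equiv p$, $(p \equiv q) \equiv (q \equiv p)$, $(p \equiv (q \equiv q)) \equiv p$, $p \equiv ((q \equiv q) \equiv p)$, and
$p \equiv (q \equiv (q \equiv p))$.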
Our definition of conjunction is called the Golden rule (19). To see that (19) is valid,
check its truth table or else use associativity and symmetry to rewrite it as
$p \equiv q \quad\equiv\quad p \land q \equiv p \lor q$
Now, it may be recognized as the law that says that two booleans are equal iff their con-
junction and disjunction are equal.
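The claim that an unparenthesized chain of equivalences may be parsed in any way can be explored mechanically. The sketch below is an illustration in Python under stated assumptions: `parses`, `evaluate`, and `is_valid` are hypothetical helpers, and formulas are represented as nested pairs whose leaves are functions of a truth assignment. It enumerates every full parenthesization of a chain and checks each one by truth table, for Symmetry (9) and for the Golden rule (19).

```python
from itertools import product

def parses(operands):
    """Yield every full parenthesization of the chain as nested pairs."""
    if len(operands) == 1:
        yield operands[0]
        return
    for split in range(1, len(operands)):
        for left in parses(operands[:split]):
            for right in parses(operands[split:]):
                yield (left, right)

def evaluate(tree, env):
    """A pair denotes an equivalence; a leaf is a function of the assignment."""
    if isinstance(tree, tuple):
        return evaluate(tree[0], env) == evaluate(tree[1], env)
    return tree(env)

def is_valid(tree, names):
    """True when the formula holds under every truth assignment to `names`."""
    return all(evaluate(tree, dict(zip(names, values)))
               for values in product([False, True], repeat=len(names)))

# Leaves of the two chains, written as functions of an assignment.
p = lambda env: env["p"]
q = lambda env: env["q"]
p_and_q = lambda env: env["p"] and env["q"]
p_or_q = lambda env: env["p"] or env["q"]

symmetry_parses = list(parses([p, q, q, p]))             # axiom (9)
golden_parses = list(parses([p_and_q, p, q, p_or_q]))    # axiom (19)

if __name__ == "__main__":
    print(len(symmetry_parses), all(is_valid(t, ["p", "q"]) for t in symmetry_parses))
    print(len(golden_parses), all(is_valid(t, ["p", "q"]) for t in golden_parses))
```

A chain of four operands has five parenthesizations, which matches the five parses of (9) mentioned in the text.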
Operator $\Leftarrow$ (see (21)) is included because some proofs are more readily developed or
understood using it rather than $\Rightarrow$. As an example, we prove that any integer divides itself,
i.e. $c \mid c$ holds, where $\mid$ is defined by
$b \mid c \equiv (\exists d \mid: b \cdot d = c)$
Note that the proof format below has been extended to allow $\Leftarrow$ or $\Rightarrow$ to appear in place
of $\equiv$ and that part of the proof is given in English.
Proof. We calculate:
      $c \mid c$
$\equiv$    $\langle$ Definition of $\mid$ $\rangle$
      $(\exists d \mid: c \cdot d = c)$
$\Leftarrow$    $\langle$ $\exists$-Introduction, $P[x := E] \Rightarrow (\exists x \mid: P)$ $\rangle$
      $c \cdot 1 = c$
$\equiv$    $\langle$ Multiplicative identity $\rangle$
      $\mathit{true}$
Hence, $\mathit{true} \Rightarrow c \mid c$. By Left Identity of $\Rightarrow$ (i.e. $P \equiv \mathit{true} \Rightarrow P$), $c \mid c$ is a theorem. $\Box$
Were $\Leftarrow$ not available, this proof would have to be presented in reverse, as shown below.
But this second proof is difficult to follow. The first few formulas are rabbits pulled out of a
hat. Why start with $\mathit{true}$? Where did the insight come from to replace $\mathit{true}$ by $c \cdot 1 = c$?
A goal in writing a proof should be to avoid rabbits, to have each step guided or even forced
by the structure of the formulas, so that an experienced reader would say, "Oh, I would
have done that too." A proof strategy that helps in this regard is: Start with the more
complicated side of an equivalence or implication and use its structure to help guide the
proof.
      $\mathit{true}$
$\equiv$    $\langle$ Multiplicative identity $\rangle$
      $c \cdot 1 = c$
$\Rightarrow$    $\langle$ $\exists$-Introduction $\rangle$
      $(\exists d \mid: c \cdot d = c)$
$\equiv$    $\langle$ Definition of $\mid$ $\rangle$
      $c \mid c$
Table 2: Portia's Suitor's Dilemma
This story is a take-off on a scene in Shakespeare's Merchant of Venice. Portia has a
suitor, who wants to marry her. She does not know whether she wants to marry him.
So, she gives him a puzzle to solve. She gets two boxes, one gold and the other silver,
and puts her portrait into one of them. On the gold box she writes the inscription "The
portrait is not here." On the silver box, she writes, "Exactly one of these two inscriptions
is true." She tells the suitor that the inscriptions may be true or false, she won't say
which, but if he can determine which box contains the portrait, she will marry him.
Teaching preliminary concepts (e.g. textual substitution) and the propositional calculus
takes us at least eleven 50-minute lectures. Students are shown various proof principles and
strategies, are given many proofs, and are asked to prove many theorems themselves. The
emphasis is on achieving familiarity with the notation and theorems and on gaining a skill
in formal manipulation.
Here is a list of proof principles and strategies that can be taught with our approach.
They are not obvious to mathematical novices, and discussing them can be enlightening.
• Principle. Structure proofs to avoid repeating the same subexpression on many lines.
• Principle. Structure proofs to minimize the number of "rabbits pulled out of a hat":
make each step obvious, based on the structure of the expression and the goal of
the manipulation.
• Principle. Lemmas may provide structure, bring to light interesting facts, and ultimately
shorten a proof.
• Strategy. To prove something about an operator, eliminate it, manipulate the result,
and then, if necessary, reintroduce the operator.
• Strategy. The shape of an expression can focus the choice of theorems to be used in
manipulating it. Thus, identify applicable theorems by matching the structure of the
expression and its subexpressions with the theorems.
• Strategy. To prove $P \equiv Q$, transform the term with the most structure (either $P$
or $Q$) into the other. The idea here is that the one with the most structure provides
the most insight into the next Leibniz transformation.
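Table 3: Informal Proof Techniques
Informal proof technique    Basis for the informal technique
Case analysis               $(p \lor q) \land (p \Rightarrow r) \land (q \Rightarrow r) \Rightarrow r$
Mutual implication          $(p \equiv q) \equiv (p \Rightarrow q) \land (q \Rightarrow p)$
Contradiction               $\neg p \Rightarrow \mathit{false} \equiv p$
Contrapositive              $p \Rightarrow q \equiv \neg q \Rightarrow \neg p$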
Upwards of 80 theorems of the propositional calculus can be used as examples and
exercises in illustrating these proof principles and strategies. Students are not asked to
memorize all 80-odd theorems, since mathematicians don't. Rather, students always have
on hand a list of the theorems in the order in which they can be proved. Through continual
use, students begin to know many of these theorems as their formal friends. On a test, the
list of theorems is distributed and students are asked to prove certain ones.
We intersperse the study of propositional calculus with interesting applications. Many
seemingly difficult word problems can be solved easily through formalization and manipula-
tion, so here is where students begin to see that they are learning a new, powerful, mental
tool. One interesting example is Portia's Suitor's Dilemma (see Table 2). Its solution is
amazingly simple using our equational logic when one formalizes, manipulates, and then
interprets. Its solution in natural deduction is much harder (see [9, 6] for a discussion).
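One way to see the formalize, manipulate, interpret pattern at work on Portia's puzzle is to enumerate the cases. The formalization below is an assumption made for illustration, not the authors' stated solution: `in_gold` says the portrait is in the gold box, and `gold_true` and `silver_true` say whether each inscription is true. The sketch keeps only the assignments in which each inscription's truth value matches what it asserts.

```python
from itertools import product

def consistent_worlds():
    """Enumerate the assignments in which both inscriptions mean what they say."""
    worlds = []
    for in_gold, gold_true, silver_true in product([False, True], repeat=3):
        # Gold inscription: "The portrait is not here."
        gold_ok = gold_true == (not in_gold)
        # Silver inscription: "Exactly one of these two inscriptions is true."
        silver_ok = silver_true == (gold_true != silver_true)
        if gold_ok and silver_ok:
            worlds.append({"in_gold": in_gold,
                           "gold_true": gold_true,
                           "silver_true": silver_true})
    return worlds

if __name__ == "__main__":
    for world in consistent_worlds():
        print(world)
```

Under this formalization every consistent assignment places the portrait in the gold box, while the truth of the silver inscription remains undetermined. The equational route reaches the same point in a few steps: the silver inscription's condition simplifies, by associativity of equivalence, to the negation of the gold inscription.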
Another interesting problem is to make sense of the following sentence: "For every value
of array section b[1..9], if that value is in array section b[21..25], then it is not in b[21..25]."
This sentence may seem contradictory, but formalizing it, simplifying the formal statement,
and then interpreting the result yields a surprising answer.
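A hedged sketch of how this sentence might be formalized and simplified follows. The reading is an assumption: the sentence is taken as a universal quantification over the values of b[1..9] whose body has the shape "p implies not p", which simplifies to "not p". The Python below compares the literal reading with the simplified one on pseudo-random arrays; the function names, the array contents, and the seed are illustrative only.

```python
import random

def original_sentence(b):
    """Each value of b[1..9]: if it is in b[21..25], then it is not in b[21..25]."""
    section = b[21:26]
    return all((v not in section) if (v in section) else True
               for v in b[1:10])

def simplified_sentence(b):
    """No value of b[1..9] occurs in b[21..25]."""
    section = b[21:26]
    return all(v not in section for v in b[1:10])

def agree_on_random_arrays(trials=1000, seed=0):
    """Compare the two readings on pseudo-random arrays of 26 small integers."""
    rng = random.Random(seed)
    for _ in range(trials):
        b = [rng.randrange(30) for _ in range(26)]
        if original_sentence(b) != simplified_sentence(b):
            return False
    return True

if __name__ == "__main__":
    print(agree_on_random_arrays())
```

Read this way, the sentence is not contradictory at all: it says that the two array sections have no value in common.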
Our study of equational logic is completed with a look at how conventional, seemingly
informal, proof techniques in mathematics are based on theorems of propositional calculus
(see Table 3). Several proofs are given in both an informal and the equational style and
compared.
4 Quantification and the predicate calculus
The treatment of quantification in our course unifies what, until now, has been a rather
chaotic topic. Quantification in mathematics assumes many forms, for example:
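$1^2 + 2^2 + 3^2 \;=\; \sum_{i=1}^{3} i^2$
$b[1] = 0 \land b[2] = 0 \land b[3] = 0 \;\equiv\; (\forall x).\, 1 \le x \le 3 \Rightarrow b[x] = 0$
$b[1] = 0 \lor b[2] = 0 \lor b[3] = 0 \;\equiv\; (\exists x).\, 1 \le x \le 3 \land b[x] = 0$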
There appears to be no consistency of concept or notation here. Compounding the problem
is that students are not taught rules for manipulating specific quantifiers, much less general
rules that hold for all.
Table 4: Axioms for Quantification
(23) Empty range: $(\star x \mid \mathit{false} : P) =$ (the identity of $\star$)
(24) One-point rule: $(\star x \mid x = E : P) = P[x := E]$
(25) Distributivity: $(\star x \mid R : P) \star (\star x \mid R : Q) = (\star x \mid R : P \star Q)$
(26) Range split: Provided $\neg(R \land S)$ holds or $\star$ is idempotent,
     $(\star x \mid R \lor S : P) = (\star x \mid R : P) \star (\star x \mid S : P)$
(27) Range split: $(\star x \mid R \lor S : P) \star (\star x \mid R \land S : P) = (\star x \mid R : P) \star (\star x \mid S : P)$
(28) Interchange of dummies: $(\star x \mid R : (\star y \mid Q : P)) = (\star y \mid Q : (\star x \mid R : P))$
(29) Nesting: $(\star x, y \mid R \land Q : P) = (\star x \mid R : (\star y \mid Q : P))$
(30) Dummy renaming: $(\star x \mid R : P) = (\star y \mid R[x := y] : P[x := y])$
Note: The usual caveats concerning the absence of free occurrences of dummies in some
expressions are needed to avoid capture of variables. Further, some of the axioms require
the ranges of quantifications to be finite or $\star$ to be idempotent.
We use a single notation for all quantifications. Let $\star$ be any binary, associative, and
symmetric operator that has an identity. 5 The notation
(22) $(\star i{:}T \mid R.i : P.i)$
denotes the accumulation of values $P.i$, using operator $\star$, over all values $i$ for which range
predicate $R.i$ holds. $T$ is the type of dummy $i$; it is often omitted in contexts in which
the type is obvious. 6 If range $R.i$ is $\mathit{true}$, we may write the quantification as $(\star i \mid: P.i)$.
In the examples below, the type is omitted; also, gcd is the greatest common divisor
operator.
$(\Sigma i \mid 1 \le i \le 3 : i^2) \;=\; 1^2 + 2^2 + 3^2$
$(\land x \mid 3 \le x < 9 \land \mathit{prime}.x : b[x] = 0) \;\equiv\; b[3] = 0 \land b[5] = 0 \land b[7] = 0$
$(\gcd i \mid 2 \le i \le 4 : i^2) \;=\; 2^2 \gcd 3^2 \gcd 4^2$
With the single notation (22) for quantification, we can discuss bound variables, scope of
variables, free variables, and textual substitution for all quantifications, once and for all.
Further, we can give axioms that hold for all such quantifications (see Table 4).
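The uniform notation (22) corresponds closely to a fold over the values that satisfy the range. The following Python sketch is illustrative only: `quantify`, `is_prime`, the finite dummy domain `range(10)`, and the all-zero array `b` are assumptions introduced here, and the identity must be supplied by the caller.

```python
from functools import reduce
from math import gcd
import operator

def quantify(op, identity, dummies, range_pred, term):
    """Accumulate term(i) with op over the dummies i that satisfy range_pred(i)."""
    return reduce(op, (term(i) for i in dummies if range_pred(i)), identity)

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

b = [0] * 10  # hypothetical array, all zeros

# Sum of squares over the range 1 <= i <= 3.
sum_of_squares = quantify(operator.add, 0, range(10),
                          lambda i: 1 <= i <= 3, lambda i: i * i)

# Conjunction of b[x] == 0 over the primes x with 3 <= x < 9.
primes_zero = quantify(lambda x, y: x and y, True, range(10),
                       lambda x: 3 <= x < 9 and is_prime(x),
                       lambda x: b[x] == 0)

# gcd of the squares over the range 2 <= i <= 4; 0 acts as the identity of gcd.
gcd_of_squares = quantify(gcd, 0, range(10),
                          lambda i: 2 <= i <= 4, lambda i: i * i)

if __name__ == "__main__":
    print(sum_of_squares, primes_zero, gcd_of_squares)
```

With this reading, the Empty range axiom says that a fold over no values returns the identity, and the One-point rule says that a fold over the single value satisfying the range returns the term at that value.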
Having discussed quantification in detail, we then turn to pure predicate logic itself.
This calls for just a few more axioms that deal specifically with universal and existential
quantification (see Table 5). We do bow to convention and use $\forall$ and $\exists$ instead of $\land$ and
$\lor$.
5 If $\star$ is associative and symmetric but has no identity, then instances of the axioms of Table 4 that have
a false range do not hold.
6 Our logic deals with types, but space limitations preclude a discussion of types here.
Table 5: Additional Axioms for Predicate Calculus
(31) Trading: $(\forall x \mid R : P) \equiv (\forall x \mid: R \Rightarrow P)$
(32) Distributivity of $\lor$ over $\forall$: $P \lor (\forall x \mid R : Q) \equiv (\forall x \mid R : P \lor Q)$
(33) (Generalized) De Morgan: $(\exists x \mid R : P) \equiv \neg(\forall x \mid R : \neg P)$
Range R.i in notation (22) for quantification can be any predicate. For universal and
existential quantification, however, a range is not necessary. Nevertheless, consistency of
notation encourages us to use a single notation, even for universal and existential quantification.
Furthermore, in many manipulations, range R.i may remain unchanged while term
P.i is modified, and the separation of the range and term makes this easier to see. Here,
the desire for ease of formal manipulation has dictated the choice of notations.
Issues of scope, bound variables, etc., make quantification and predicate calculus far
more complicated than propositional calculus. Some may even feel that quantification is
too complicated for freshmen and sophomores. However, many courses in math, computer
science, physics, and engineering require quantification in one form or another, so not teach-
ing quantification means that students are unprepared for those classes. In fact, the lack of
knowledge of basic tools like quantification may explain partly why students are apprehen-
sive about mathematics.
Thus, we advocate teaching quantification carefully, completely, and rigorously, but in a
manner that instills confidence. We have found that this can be done.
5 Examples of the use of predicate calculus
In our course, equational logic serves as a springboard to the study of other topics. After
all, many topics in discrete math can be viewed as extensions of pure predicate calculus:
new types of values are introduced and operations on them are defined by adding axioms
to the logic. We provide some brief glimpses of how this works.
Set theory
In this article, we use conventional notation for set comprehension, $\{x \mid P.x\}$, but restrict
$x$ to be a bound (dummy) variable (i.e. not an expression). We define membership in a set
comprehension as follows.
(34) Set membership: Provided $x$ does not occur free in $E$,
     $E \in \{x \mid P\} \equiv (\exists x \mid P : x = E)$
Recall from (3) that set equality is defined by $S = T \equiv (\forall x \mid: x \in S \equiv x \in T)$.
Section 2 already compared equational proofs in set theory to informal English, so we
won't do that again here. Instead, we discuss Russell's paradox, which is usually covered in
elementary set theory. We define Russell's paradoxical set $S$ (if it is allowed) by its membership
test; remember that set union and intersection were also defined by membership
tests.
(35) $x \in S \equiv x \notin x$ for all sets $x$
$S$ is a set, and instantiating $x$ with $S$ in (35) yields
$S \in S \equiv S \notin S$
which is false. An inconsistency arose by introducing set $S$. We conclude that $S$ is not
well defined and refuse to allow (35). That's all there is to it! There is no need for the
confusing, English, ping-pong argument that one finds in many discrete-math texts.
Theory of integers
The integers can be explored by extending predicate logic with the new type Z, giving the
axioms for operations on its elements, and then proving various theorems. The same idea was
used in introducing sets, so the students see a pattern emerging. Once the pattern is seen,
one need not dwell on the proofs of all the theorems concerning addition, multiplication,
etc., since the students are already familiar with most of them.
One can spend time on new integer functions and operations. One example is the greatest
common divisor. We use infix operator gcd for this and define it as follows. (We write the
maximum of two integers $b$ and $c$ as $b \uparrow c$ and assume that theorems for $\uparrow$ and abs have
already been taught.)
(36) $b \gcd c = (\uparrow d \mid d \mid b \land d \mid c : d)$    (for $b$, $c$ not both 0)
     $0 \gcd 0 = 0$
The first line of (36) does not define $0 \gcd 0$; since all integers divide 0, 0 has no maximum
divisor. We define $0 \gcd 0$ to be 0, so that gcd is total over $\mathbb{Z} \times \mathbb{Z}$.
Note that $\uparrow$ is associative and symmetric, so we can use it in quantifications. Having
spent a great deal of time on quantification, little has to be said about using $\uparrow$ in this
manner. However, since $\uparrow$ over the integers does not have an identity, we must avoid using
quantification theorems that rely on an identity.
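Definition (36) can be transcribed almost literally and compared against a library routine. This is a sketch under stated assumptions: `divides` and `gcd_by_definition` are hypothetical names, the search for divisors is restricted to positive candidates (enough to find the maximum), and the comparison with Python's `math.gcd` covers only a small range of integers.

```python
from math import gcd as library_gcd

def divides(d, c):
    """d | c: some integer e satisfies d * e == c."""
    if d == 0:
        return c == 0
    return c % d == 0

def gcd_by_definition(b, c):
    """Maximum common divisor of b and c, with 0 gcd 0 defined to be 0."""
    if b == 0 and c == 0:
        return 0
    bound = max(abs(b), abs(c))
    # The maximum common divisor is positive, so only 1..bound is searched.
    return max(d for d in range(1, bound + 1)
               if divides(d, b) and divides(d, c))

if __name__ == "__main__":
    print(gcd_by_definition(12, 18), gcd_by_definition(0, 0), gcd_by_definition(-4, 6))
```

The special case for `0 gcd 0` is needed for the reason given in the text: every integer divides 0, so the set of common divisors has no maximum and `max` would have nothing bounded to search.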
The next step in exploring gcd is to state (and prove some of) its properties. This
step has two goals: to familiarize the student with the new operation and to provide a basis
for later manipulations using gcd. Function gcd is symmetric, is associative, satisfies
$b \gcd b = \mathrm{abs}.b$, has 1 as its zero (i.e. $1 \gcd b = 1$), has 0 as its identity (i.e. $0 \gcd b = b$),
and so on.
Let us prove b gcd b = abs.b. A few points about the proof given below are worth
mentioning. It contains a case analysis, because the definition of gcd does. We try to
avoid case analysis, but it is not always possible. Second, the proof is written partly in
English, since that is the easiest way to see the case analysis. Thus, we are not completely
rigid in our proof style. However, the proof does contain two equational subproofs.
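Proof. There are two cases to consider: $b = 0$ and $b \neq 0$.
Case $b = 0$:
        $0 \;\mathrm{gcd}\; 0 = \mathit{abs}.0$
    =     ⟨Def. of gcd (36); Def. of abs⟩
        $0 = 0$    (Reflexivity of =)
Case $b \neq 0$:
        $b \;\mathrm{gcd}\; b$
    =     ⟨Def. of gcd (36); $b \neq 0$ by case assumption⟩
        $(\uparrow d \;|\; d \mid b \,\land\, d \mid b : d)$
    =     ⟨Idempotency of $\land$, $p \land p \equiv p$⟩
        $(\uparrow d \;|\; d \mid b : d)$
    =     ⟨The maximum divisor of $b$ is $\mathit{abs}.b$; $b \neq 0$ by case assumption⟩
        $\mathit{abs}.b$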
Mathematical induction
Strong induction over the integers can be defined by the following axiom, where the dummies
are of type natural number.
Mathematical Induction:
    $(\forall n \;|: P.n) \;\equiv\; P.0 \,\land\, (\forall n \;|: (\forall i \;|\; 0 \le i \le n : P.i) \Rightarrow P(n+1))$
Our earlier study of quantification has given students the manipulative skills they need to
prove formulas by induction. For example, proving $n^2 = (\Sigma i \;|\; 1 \le i \le n : 2 \cdot i - 1)$ requires
knowledge of axioms and theorems for manipulating summations.
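The summation example can be explored numerically before it is proved. The following Python sketch is illustrative only, with the hypothetical names `sum_of_odds` and `inductive_step_holds`. It evaluates the quantification term by term and mirrors the inductive step, in which the last term is split off the summation.

```python
def sum_of_odds(n: int) -> int:
    """(Sigma i | 1 <= i <= n : 2*i - 1), computed term by term."""
    return sum(2 * i - 1 for i in range(1, n + 1))


def inductive_step_holds(n: int) -> bool:
    """Check the step from P.n to P(n+1) for one value of n.

    Splitting off the last term gives
        sum_of_odds(n + 1) = sum_of_odds(n) + (2*(n + 1) - 1),
    and replacing sum_of_odds(n) by n**2 (the induction hypothesis)
    yields n**2 + 2*n + 1 = (n + 1)**2.
    """
    return sum_of_odds(n + 1) == n**2 + (2 * (n + 1) - 1) == (n + 1) ** 2
```

The base case corresponds to `sum_of_odds(0)`, an empty summation, which is 0, the identity of addition.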
Furthermore, the rigorous formulation of induction and proofs by induction helps clarify
certain points that are usually confusing. Here is an example of this. When proving
statements by induction, we always put them in the form
    $(\forall n \;|\; 0 \le n : P.n)$ where $P.n : \ldots$
Thus, we formalize the theorem to be proved and name the induction hypothesis. For
example, suppose we want to prove $b^{m+n} = b^m \cdot b^n$ for $n$, $m$ natural numbers. Writing this
formula as
    $(\forall n \;|\; 0 \le n : P.n)$ where $P.n : (\forall m \;|\; 0 \le m : b^{m+n} = b^m \cdot b^n)$
clarifies what the "induction variable" is, by making it an explicit argument of P. Here,
our style imposes a measure of precision that is almost impossible to obtain with an English
argument. If one does not know how to write quantifications, it is difficult to explain the
different roles of m and n in this proof by induction.

Table 6: Equivalence of Mathematical Induction and Well-foundedness
        $(U, \prec)$ is well founded
    =     ⟨Definition of well-foundedness⟩
        $(\forall S \;|: S \neq \emptyset \;\equiv\; (\exists x \;|: x \in S \land (\forall y \;|\; y \prec x : y \notin S)))$
    =     ⟨$X \equiv Y \;\equiv\; \neg X \equiv \neg Y$; Double negation⟩
        $(\forall S \;|: S = \emptyset \;\equiv\; \neg(\exists x \;|: x \in S \land (\forall y \;|\; y \prec x : y \notin S)))$
    =     ⟨De Morgan, twice⟩
        $(\forall S \;|: S = \emptyset \;\equiv\; (\forall x \;|: x \notin S \lor \neg(\forall y \;|\; y \prec x : y \notin S)))$
    =     ⟨Change dummy, using $P.z \equiv z \notin S$⟩
        $(\forall P \;|: (\forall x \;|: P.x) \;\equiv\; (\forall x \;|: P.x \lor \neg(\forall y \;|\; y \prec x : P.y)))$
    =     ⟨Law of implication, $p \Rightarrow q \;\equiv\; \neg p \lor q$⟩
        $(\forall P \;|: (\forall x \;|: P.x) \;\equiv\; (\forall x \;|: (\forall y \;|\; y \prec x : P.y) \Rightarrow P.x))$
    =     ⟨Definition of induction over $(U, \prec)$⟩
        $(U, \prec)$ admits induction
Finally, the background on predicate logic allows us to go into mathematical induction
more deeply than is usually possible in a first course. Consider a pair $(U, \prec)$, where $U$
is a nonempty set and $\prec$ a binary relation over $U$. When can one use induction on
$(U, \prec)$? Answering this question rigorously gives the students a much better understanding
of induction than has hitherto been achieved.
The pair $(U, \prec)$ is well founded iff every nonempty subset of $U$ has a minimal element
(with respect to $\prec$). In Table 6, we prove that $(U, \prec)$ is well founded exactly when it admits
induction (the second formula in the proof is the formal definition of well-foundedness,
while the penultimate formula is the definition of induction; see footnote 7). This proof is
so simple that students can be asked to reconstruct it on a test, and 95% of them do so. Such
performance is impossible using a proof in English, even though the two proofs rely on the
same idea of defining a predicate P in terms of a set S (and vice versa). Students don't
memorize the proof character for character; instead, they understand the basic idea and
develop the proof afresh each time they want to present it.

Footnote 7: The formal definitions of well-foundedness and induction require second-order
predicate calculus.
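For a finite set U, both definitions quantify over finitely many subsets, so the equivalence can be checked exhaustively. The Python sketch below is an illustration under that finiteness assumption; all function names are hypothetical. It represents a predicate P by the set of elements on which it holds, evaluates the two formal definitions directly, and compares them for every binary relation on a three-element set.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite universe, each as a frozenset."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_well_founded(universe, precedes):
    """Every S satisfies: S is nonempty == S has a minimal element."""
    for s in subsets(universe):
        has_minimal = any(
            all(y not in s for y in universe if precedes(y, x)) for x in s
        )
        if (len(s) > 0) != has_minimal:
            return False
    return True


def admits_induction(universe, precedes):
    """Every P satisfies: P holds everywhere == P is inductive."""
    for p in subsets(universe):  # p holds the elements x where P.x is true
        holds_everywhere = all(x in p for x in universe)
        inductive = all(
            (not all(y in p for y in universe if precedes(y, x))) or x in p
            for x in universe
        )
        if holds_everywhere != inductive:
            return False
    return True


def equivalence_holds_for_all_relations(universe):
    """Compare the two definitions on every binary relation over universe."""
    elements = list(universe)
    pairs = list(product(elements, repeat=2))
    for bits in product([False, True], repeat=len(pairs)):
        relation = {pair for pair, bit in zip(pairs, bits) if bit}

        def precedes(y, x, r=relation):
            return (y, x) in r

        if (is_well_founded(elements, precedes)
                != admits_induction(elements, precedes)):
            return False
    return True
```

Calling `equivalence_holds_for_all_relations(range(3))` examines all 512 relations on a three-element set. Such a check covers only small finite cases; the proof covers every pair, finite or not.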
Generating functions
Solving recurrence relations (homogeneous difference equations) using generating functions
is sometimes taught in discrete math. The generating function $G(z)$ for a sequence
$x_0, x_1, x_2, \ldots$ is the polynomial $x_0 \cdot z^0 + x_1 \cdot z^1 + x_2 \cdot z^2 + \cdots$, or
    $(\Sigma i \;|\; 0 \le i : x_i \cdot z^i)$.
Thus, the generating function for a sequence is just a different representation of the sequence.
Many useful generating functions can be written in a closed form, but for the student
with little knowledge of the axioms for manipulating quantifications, the derivation of these
closed forms is a black art. However, students with skill in manipulating quantifications can
do more than follow the proofs; they can discover the closed forms themselves, once they
have been shown the basic idea.
To see this, we now calculate the closed form of the generating function $G(z)$ for the
sequence $c^0, c^1, c^2, \ldots$ for some nonzero $c$. The calculation starts with the definition of
the generating function. Thereafter, the calculation unfolds in an opportunistic or forced
fashion: at each step, the shape of the formula guides the development in an almost unique
way. We have extended the hints of the proof with comments that motivate each step.
        $G(z) = (\Sigma i \;|\; 0 \le i : c^i \cdot z^i)$
    =     ⟨Split off term: this is the simplest and almost the only
           change possible.⟩
        $G(z) = c^0 \cdot z^0 + (\Sigma i \;|\; 1 \le i : c^i \cdot z^i)$
    =     ⟨Arithmetic: isolate the complicated term⟩
        $G(z) - 1 = (\Sigma i \;|\; 1 \le i : c^i \cdot z^i)$
    =     ⟨Change dummy: the obvious way to remove the summation
           is to use the definition of $G(z)$; change the range of the
           summation to make it the same as that of $G(z)$.⟩
        $G(z) - 1 = (\Sigma i \;|\; 0 \le i : c^{i+1} \cdot z^{i+1})$
    =     ⟨Distributivity: expose the RHS of the definition of $G(z)$⟩
        $G(z) - 1 = c \cdot z \cdot (\Sigma i \;|\; 0 \le i : c^i \cdot z^i)$
    =     ⟨Definition of $G(z)$⟩
        $G(z) - 1 = c \cdot z \cdot G(z)$
    =     ⟨Arithmetic⟩
        $G(z) = 1/(1 - c \cdot z)$
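The closed form can be compared with truncated sums of the series. The following Python sketch is an illustration, not part of the derivation; the names `partial_sum` and `closed_form` are hypothetical, and exact rational arithmetic is used so that no rounding is involved. For a truncation after n terms, the key step of the derivation becomes partial_sum(n) - 1 = c·z·partial_sum(n - 1), and the gap to the closed form is exactly (c·z)^n / (1 - c·z).

```python
from fractions import Fraction


def partial_sum(c: Fraction, z: Fraction, terms: int) -> Fraction:
    """First `terms` terms of (Sigma i | 0 <= i : c^i * z^i), exactly."""
    return sum((c**i * z**i for i in range(terms)), Fraction(0))


def closed_form(c: Fraction, z: Fraction) -> Fraction:
    """The closed form 1/(1 - c*z); requires c*z != 1."""
    return 1 / (1 - c * z)


c, z = Fraction(2), Fraction(1, 4)  # sample values with |c*z| < 1
gap = closed_form(c, z) - partial_sum(c, z, 10)
```

With these sample values, c·z is 1/2, so the gap shrinks by half with each additional term, which is why the truncated sums approach the closed form when |c·z| < 1.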
6 Intuition and understanding
The approach we have presented has not, as yet, been universally accepted. The most
frequent criticism we hear is that stressing formal manipulation will impede the development
of intuition and understanding. Also, our critics fear that students will lose track of
"meaning", or semantics. We now address these issues.
On intuition and discovery
By intuition, one usually means direct perception of truth or fact, independent of any
reasoning process; keen and quick direct insight; or pure, untaught, noninferential knowledge
(Webster's Encyclopedic Unabridged Dictionary, 1989). There is simply no hope of teaching
this: how can one teach something that is untaught, noninferential, and independent of
any reasoning process? Of course, one can hope that students will develop an ability to
intuit by watching instructors in math courses over the years. But this hit-or-miss prospect
cannot be called teaching intuition.
On the other hand, a good part of mathematics is concerned with the opposite of intu-
ition: with new and different reasoning processes that complement our ability to reason in
English. This part of mathematics can be taught, and our approach to logic is an excellent
vehicle for that task.
The question may then arise whether students can be taught something about discovery
that does not hinge on intuition. Here, our syntactic method of proof outshines the more
conventional proof methods of reasoning in English. We are able to teach aids to discovery.
In particular, with our disciplined, syntactic, proof style, we can teach principles and
strategies whose application can indeed lead to proofs in many (but not all) cases. We have
yet to see comparable principles and strategies for conventional English proofs.
Finally, in many cases, formalization and syntactic manipulation can set the stage for
discovery. For example, recall the earlier discussion of closed forms of generating functions.
The chance of intuition or English reasoning helping to discover closed forms is small, but
freshmen and sophomores can be taught to discover them through syntactic manipulation.
Recall also the pigeonhole principle, which for a hundred years has been presented as:
If more than n pigeons are placed in n holes, at least one hole will contain
more than one pigeon.
We formalize it. Consider the bag (multiset) $S$ of numbers giving the number of pigeons
in each hole. For example, if the first two holes contain 3 pigeons each and the third hole
1, the bag is $S = \{|\,1, 3, 3\,|\}$. The statement "more than n pigeons are placed in n holes"
is formalized as $\mathit{average}.S > 1$. The statement "At least one hole contains more than
one pigeon" is formalized as $\mathit{maximum}.S > 1$. Hence, the pigeonhole principle can be
formalized as
(37) Pigeonhole principle: $\mathit{average}.S > 1 \;\Rightarrow\; \mathit{maximum}.S > 1$.
Now, (37) should not be surprising, since the average of a set of numbers is bounded above
by their maximum. In fact, we could take the following as a generalized principle.
(38) Generalized pigeonhole principle: $\mathit{average}.S \le \mathit{maximum}.S$.
Clearly, (38) implies (37) but not the converse. And, there are situations in which (38) can
be used, but not (37). Hence, formalization led to an improvement.
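Both formalizations are easy to test on small bags. The Python sketch below is illustrative only: the function names are hypothetical, and a bag is represented as a tuple of pigeon counts, one per hole (an assumption of this sketch, so that the same bag may be enumerated in several orders). It checks (37) and (38) exhaustively for a few holes and shows a bag about which (37) says nothing while (38) still gives a bound.

```python
from fractions import Fraction
from itertools import product


def average(bag) -> Fraction:
    """Exact average of a nonempty bag of integers."""
    return Fraction(sum(bag), len(bag))


def pigeonhole_37(bag) -> bool:
    """(37): average.S > 1  =>  maximum.S > 1."""
    return not (average(bag) > 1) or max(bag) > 1


def generalized_38(bag) -> bool:
    """(38): average.S <= maximum.S."""
    return average(bag) <= max(bag)


def all_bags(max_holes: int, max_pigeons: int):
    """Every assignment of 0..max_pigeons pigeons to 1..max_holes holes."""
    for holes in range(1, max_holes + 1):
        yield from product(range(max_pigeons + 1), repeat=holes)


example = (1, 3, 3)    # the bag from the text
borderline = (1, 1, 1)  # average is exactly 1, so (37) does not apply
```

For `borderline`, the antecedent of (37) is false, so (37) yields no information, whereas (38) still guarantees that some hole holds at least one pigeon.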
On meaning or semantics
It has also been said that semantics or "meaning" is lost in our approach to proofs. The
English proof of distribution of union over intersection on page 3 is full of meaning, it is
said, while our syntactic proof suppresses meaning.
However, an English argument can be formalized in a natural-deduction logic. The
resulting natural-deduction proof is just as much a syntactic argument as an equational
proof, so its English counterpart is just an informal version of a syntactic proof (in which
inference rules are not mentioned). There is nothing "semantic" about it.
Thus, comparing proofs based on meaning or semantics is a red herring.
Let us discuss meaning or semantics in relation to the introduction of an operator like
set union. The operator can be explained in several ways. One can illustrate set union
with a Venn diagram, define in English what it means, show how to evaluate a set union,
give examples, and give a formal definition. Having different views provides redundancy,
which is helpful in promoting full comprehension, and our formalistic approach to teaching
mathematics does not preclude meaningful discussions of alternative views. However, the
student should be made to realize that for purposes of reasoning (constructing proofs) it
is the axiomatic definition that is important. In fact, the student should view the axiomatic
definition as encoding all the meaning of the object being defined.
On understanding
What really is at issue is the question of understanding: which approach, the equational or
the conventional one, provides more understanding?
A proof should convey belief in the theorem. A proof provides evidence for belief, where
the evidence consists of the facts (e.g. previously proved theorems) on which it rests and on
how these facts interact to convince. We understand the proof when we understand which
facts are used and how they interact. Full understanding also implies the ability to explain
the proof to others and perhaps to prove other theorems with similar proofs.
Now look at the English proof of distribution of union over intersection on page 3. That
proof does not state the facts on which it rests. For example, it says, "If $y \notin A$, then, since
$y \in A \cup B$ we must have $y \in B$", but there is no reference to the theorem that explains why
this inference holds. This is typical of most proofs in English: the underlying facts are left
to be inferred by the reader.
Second, in that English proof, it is difficult to see precisely how the facts interact. The
proof cannot be easily learned and then explained to others. And, having seen the proof,
students still have difficulty proving similar theorems. In fact, showing the proof to students
gives little insight into proofs in general.
Thus, from the view of providing "understanding", the English proof is deficient.
Now turn to the equational proof on page 4. Every fact used in the proof is stated. The
uses of the three inference rules are given implicitly by the structure of the proof, and the
premises of uses of Leibniz's rule are stated explicitly. Thus, "understanding" this proof, in
terms of the facts it uses and their interaction, is almost trivial. Further, the proof uses a
simple proof strategy: eliminate operators using their definitions, perform manipulations,
and reintroduce the operators. Since the strategy is one that is used often, it can be taught
and then applied by students in many different cases. In fact, after studying this proof,
a student should have little difficulty developing it afresh when it is to be explained to
someone else. The student has full understanding.
In short, the conventional proof method has trouble promoting full understanding, while
the equational proof method makes understanding easy.
As another example, look at the proof of equivalence of well-foundedness of $(U, \prec)$ and
mathematical induction over $(U, \prec)$. Rarely do texts give an English proof of this
equivalence, because it would be too difficult. The equational proof, however, is fully
understood by most of our students. Further, the equational proof stunningly brings to light
the important fact upon which the proof is based, namely the ability to put predicates and
sets in correspondence using the relation $P.z \equiv z \notin S$. The corresponding English proof,
because its structure is so complex, hides this fact.
7 Experience with our approach
In Spring 1993, we taught a discrete-math course to 70 students (mainly freshmen and
sophomores) in the Computer Science Department at Cornell, using a draft of our text A
Logical Approach to Discrete Math (Springer-Verlag, New York, 1993) [5]. This experience
showed that students:
o+ Lose their fear of mathematics and mathematical notation.
o+ Gain a good understanding of proofs and their development.
o+ Acquire some skill in formal manipulation and begin almost immediately to apply that
skill in other courses.
20
o+ Gain an appreciation for rigor, so much so that they ask for it in later courses.
Further, we found that part of the six weeks spent on logic could be made up in the latter
part of the course, because students understood the later material more easily. We also feel
that material in subsequent courses can be taught more effectively and efficiently because
students are better prepared, although we have no firm evidence on this yet.
Near the end of our discrete-math course, we asked the students to write a short essay
explaining what they had learned. Five wrote negative comments, but the other 65 were
overwhelmingly positive. This is the first time we experienced or heard of an overwhelmingly
enthusiastic response to a discrete math course. All their comments are summarized in the
first chapter of the Instructor's Manual [6], where the gestalt of the course is discussed. We
repeat a few comments here.
This course was really groovy, perhaps my favorite one ... . I am a math major
who never thought I would enjoy doing proofs, until taking this course.
Stressing logic has helped my methods of thinking. I definitely have a better
understanding of proof.
In my linear algebra class, I understand proofs with much less difficulty, and
proving theorems about matrices has become easier as well.
My math homework from last week ended with two proofs, and I was happy
about it!
There are real applications out there for predicate and propositional calculus
and an observable benefit to formalism.
I really enjoyed ... [this course], because of the rigor and techniques of proof I
learned.
The class dissolved my fear of proving things, and I find proofs easier to tackle
now.
To be perfectly honest, I enjoyed it, which is surprising considering I do not
enjoy math.
This course taught me the most about how to think logically in all the many
courses I have taken in my Cornell career.
A draft of text [5] was also used in Fall 1993 by Sam Warford at Pepperdine, who
considered his course "a big success".
We hope that others will try out this new approach to teaching math. We think that
the approach produces students with far better math skills and that it should be integrated
into all math curricula. In fact, a course for non-majors based on this approach could serve
as a replacement for calculus.
Acknowledgements
The approach discussed in this paper was developed over the past 20 years by researchers
in the formal development of algorithms. Researchers in that field continually needed to
manipulate formulas of the predicate calculus, but conventional logics, like natural deduc-
tion, required too much formal detail and gave no insight. The equational approach was
used informally in the late 1970's, and the first text on the formal development of programs
[4] used a rudimentary equational logic. Dijkstra and Scholten [2] developed the equational
logic on which ours is based. The unification of quantification was also developed in the early
1980's; the first text we know of that talks about this topic is by Backhouse [1]. Dijkstra is
responsible for the generalized pigeonhole principle.
References
[1] Backhouse, R. Program Construction and Verification. Prentice-Hall International,
Englewood Cliffs, N.J., 1986.
[2] Dijkstra, E.W., and C. Scholten. Predicate Calculus and Program Semantics.
Springer-Verlag, New York, 1990.
[3] Gentzen, G. Untersuchungen über das logische Schliessen. Mathematische Zeitschrift 39
(1935), 176-210, 405-432. (An English translation appears in M.E. Szabo (ed.). Collected
Papers of Gerhard Gentzen. North Holland, Amsterdam, 1969.)
[4] Gries, D. The Science of Programming. Springer-Verlag, New York, 1981.
[5] Gries, D., and F.B. Schneider. A Logical Approach to Discrete Math. Springer-Verlag,
New York, 1993.
[6] Gries, D., and F.B. Schneider. Instructor's Manual for "A Logical Approach to Discrete
Math". Gries and Schneider, Ithaca, 1993. (To discuss obtaining a copy, email
gries@cs.cornell.edu.)
[7] Lay, S.R. Analysis with an Introduction to Proof, Second Edition. Prentice Hall,
Englewood Cliffs, NJ, 1990.
[8] Rosser, B. Logic for Mathematicians. McGraw-Hill, New York, 1953.
[9] Wiltink. A deficiency of natural deduction. Information Processing Letters 25 (1987),
233-234.
