Lecture 21: Structural induction

• Proofs by structural induction

• Review exercises:
• Prove that $$len(cat(x,y)) = len(x) + len(y)$$.

• Prove that $$len(reverse(x)) = len(x)$$.

• Use the inductive definitions of $$\mathbb{N}$$ and $$plus$$ to show that $$plus(a,b) = plus(b,a)$$.

Idea behind structural induction

Consider the definition $$x \in Σ^* ::= ε \mid xa$$. I will refer to $$x ::= ε$$ as "rule 1" and $$x ::= xa$$ as "rule 2". This definition says that there are two kinds of strings: empty strings (formed using rule 1), and strings of the form $$xa$$, where $$x$$ is a smaller string (formed using rule 2); these are the only kinds of strings.

If we want to prove that property $$P$$ holds on all strings (i.e. $$∀x \in Σ^*, P(x)$$), we can do it by giving a proof for strings formed using rule 1 (let's call it proof 1), and another proof for strings formed using rule 2 (let's call it proof 2). In the second proof, where the string has the form $$xa$$, we may assume that $$P(x)$$ holds for the smaller string $$x$$.

Why can we make this assumption? Suppose we have some complicated string, like $$εabc$$, and we want to conclude $$P(εabc)$$. We build the string $$εabc$$ by snapping together smaller strings using rules 1 and 2; we can imagine building a proof of $$P(εabc)$$ by snapping together smaller proofs using proofs 1 and 2.

To show that $$εabc$$ is a string, we first use rule 1 to show that $$ε$$ is a string, then rule 2 to show that $$εa$$ is a string (this assumes that $$ε$$ is a string, but we just argued it was), and then rule 2 again to show that $$εab$$ is a string (using the fact that $$εa$$ is a string), and finally use rule 2 a third time to show that $$εabc$$ is a string.

Similarly, we can use proof 1 to show that $$P(ε)$$ holds, then use proof 2 to show that $$P(εa)$$ holds (this assumes that $$P(ε)$$ holds, but we just argued it does), and then use proof 2 again to show that $$P(εab)$$ holds (using the fact that $$P(εa)$$ holds), and finally use proof 2 a third time to show that $$P(εabc)$$ holds.

In general, any element of an inductively defined set is built up by applying the rules defining the set, so if you provide a proof for each rule, you have given a proof for every element. Before you can build a complex structure, you have to build the parts, so while building the proof that some property holds on a complex structure, you can assume that you have already proved it for the subparts.
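To make the "snapping together" picture concrete, here is a minimal Python sketch that represents strings by the rules used to build them and lists the rule applications for $$εabc$$. The class names `Empty` and `Snoc` and the helpers `show` and `construction_steps` are my own illustrative choices, not part of the definitions above.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Empty:
    """Rule 1: the empty string ε."""


@dataclass(frozen=True)
class Snoc:
    """Rule 2: the string xa, built from a smaller string x and a character a."""
    x: "Str"
    a: str


Str = Union[Empty, Snoc]


def show(s: Str) -> str:
    """Write a string the way the notes do, e.g. εabc."""
    if isinstance(s, Empty):
        return "ε"
    return show(s.x) + s.a


def construction_steps(s: Str) -> List[str]:
    """List the rule applications that build s, smallest piece first."""
    if isinstance(s, Empty):
        return ["rule 1: ε"]
    # s = xa: x has to be built first, then rule 2 is applied once more.
    return construction_steps(s.x) + ["rule 2: " + show(s)]


abc = Snoc(Snoc(Snoc(Empty(), "a"), "b"), "c")
steps = construction_steps(abc)
for step in steps:
    print(step)
```

A proof of $$P(εabc)$$ has the same shape as this list: one use of proof 1 for the first line, then one use of proof 2 for each later line, each relying on the line before it.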

Structural induction step by step

In general, if an inductive set $$X$$ is defined by a set of rules (rule 1, rule 2, etc.), then we can prove $$∀x \in X, P(x)$$ by giving a separate proof of $$P(x)$$ for $$x$$ formed by each of the rules. In the cases where the rule recursively uses elements $$y_1, y_2, \dots$$ of the set being defined, we can assume $$P(y_1), P(y_2), \dots$$.

Example structures:

• $$Σ^*$$ is defined by $$x ∈ Σ^* ::= ε \mid xa$$. To prove $$∀x \in Σ^*, P(x)$$, you must prove (1) $$P(ε)$$, and (2) $$P(xa)$$; but in the proof of (2) you may assume $$P(x)$$.

• If a set $$T$$ is defined by $$t \in T ::= empty \mid node(a,t_1,t_2)$$, you must prove (1) $$P(empty)$$ and (2) $$P(node(a,t_1,t_2))$$. But, in the proof of (2) you may assume $$P(t_1)$$ and $$P(t_2)$$.

• If a set $$F$$ is defined by $$φ \in F ::= Q \mid \lnot φ \mid φ_1 \land φ_2 \mid φ_1 \lor φ_2$$, you can prove $$∀φ ∈ F, P(φ)$$ by proving (1) $$P(Q)$$, (2) $$P(\lnot φ)$$ [assuming $$P(φ)$$], (3) $$P(φ_1 \land φ_2)$$ [assuming $$P(φ_1)$$ and $$P(φ_2)$$], (4) $$P(φ_1 \lor φ_2)$$ [assuming $$P(φ_1)$$ and $$P(φ_2)$$].
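As an illustration of the tree case, the sketch below encodes $$T$$ in Python and checks a sample property on all small trees. The property (the number of $$empty$$ subtrees is one more than the number of nodes) and the names `EmptyTree`, `Node`, `nodes`, `empties` and `trees_up_to` are assumptions chosen for this example; they do not come from the lecture. A finite check like this is not a proof, but the two functions have exactly one clause per rule, which is the shape the proof would take.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class EmptyTree:
    """Rule 1: empty."""


@dataclass(frozen=True)
class Node:
    """Rule 2: node(a, t1, t2), built from two smaller trees."""
    a: int
    t1: "Tree"
    t2: "Tree"


Tree = Union[EmptyTree, Node]


def nodes(t: Tree) -> int:
    """Number of node(...) constructors in t."""
    if isinstance(t, EmptyTree):
        return 0
    return 1 + nodes(t.t1) + nodes(t.t2)


def empties(t: Tree) -> int:
    """Number of empty subtrees in t."""
    if isinstance(t, EmptyTree):
        return 1
    return empties(t.t1) + empties(t.t2)


def P(t: Tree) -> bool:
    """Sample property: empties(t) = nodes(t) + 1."""
    return empties(t) == nodes(t) + 1


def trees_up_to(depth: int) -> List[Tree]:
    """All tree shapes of height at most depth (every label is 0)."""
    if depth == 0:
        return [EmptyTree()]
    smaller = trees_up_to(depth - 1)
    return [EmptyTree()] + [Node(0, l, r) for l in smaller for r in smaller]


all_hold = all(P(t) for t in trees_up_to(3))
print(all_hold)
```

In the corresponding proof, case (1) checks $$P(empty)$$ directly, and case (2) uses $$P(t_1)$$ and $$P(t_2)$$ in the same place where `nodes` and `empties` make their recursive calls.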

Example proof

Recall $$Σ^*$$ is defined by $$x \in Σ^* ::= ε \mid xa$$ and $$len : Σ^* → \mathbb{N}$$ is given by $$len(ε) ::= 0$$ and $$len(xa) ::= 1 + len(x)$$.

Claim: For all $$x \in Σ^*$$, $$len(x) \geq 0$$.

Proof: By induction on the structure of $$x$$. Let $$P(x)$$ be the statement "$$len(x) \geq 0$$". We must prove $$P(ε)$$, and $$P(xa)$$ assuming $$P(x)$$.

$$P(ε)$$ case: we want to show $$len(ε) \geq 0$$. Well, by definition, $$len(ε) = 0 \geq 0$$.

$$P(xa)$$ case: assume $$P(x)$$. That is, $$len(x) \geq 0$$. We wish to show $$P(xa)$$, i.e. that $$len(xa) \geq 0$$. Well, by the definition of $$len$$ and the inductive assumption, $$len(xa) = 1 + len(x) \geq 1 + 0 = 1 \geq 0$$.
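The same claim can be spot-checked mechanically. The sketch below is an illustration under assumptions of my own: strings are encoded with hypothetical classes `Empty` and `Snoc`, the function `length` plays the role of $$len$$ (renamed to avoid Python's built-in `len`), and `strings_up_to` enumerates strings over the sample alphabet $$\{a, b\}$$. Checking finitely many strings does not replace the proof; it only shows the two cases at work.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Empty:
    """Rule 1: the empty string ε."""


@dataclass(frozen=True)
class Snoc:
    """Rule 2: the string xa."""
    x: "Str"
    a: str


Str = Union[Empty, Snoc]


def length(s: Str) -> int:
    """len(ε) = 0 and len(xa) = 1 + len(x)."""
    if isinstance(s, Empty):
        return 0
    return 1 + length(s.x)


def strings_up_to(n: int, alphabet: str = "ab") -> List[Str]:
    """All strings over the alphabet with at most n characters."""
    level: List[Str] = [Empty()]
    result = list(level)
    for _ in range(n):
        # Apply rule 2 to every string of the previous length.
        level = [Snoc(x, a) for x in level for a in alphabet]
        result.extend(level)
    return result


small = strings_up_to(3)
all_nonnegative = all(length(s) >= 0 for s in small)
print(all_nonnegative)
```

Note that `length` has one branch per rule, and the proof has one case per branch: the $$P(ε)$$ case reasons about the first `return`, and the $$P(xa)$$ case reasons about the second, where the recursive call corresponds to the inductive assumption.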

Proofs on pairs

Often, we want to prove something about all pairs $$x$$ and $$y$$, where $$x$$ and $$y$$ are both in an inductively defined set $$X$$. Pairs of elements of $$X$$ are formed by pairs of rules of $$X$$, so one can give a proof for each pair of rules. For example, to prove $$∀x,y \in Σ^*, len(cat(x,y)) = len(x) + len(y)$$, you can give a proof for the case where $$x$$ and $$y$$ are both $$ε$$, a proof for the case when $$x = ε$$ and $$y$$ is of the form $$zc$$, a proof for the case when $$x = zc$$ and $$y = ε$$, and a proof for the case where $$x = zc$$ and $$y = wd$$.

What inductive assumptions can be made in these cases? You can inductively assume that $$P$$ holds on any pair that is formed from a subpiece of $$x$$ and a subpiece of $$y$$, and at least one of those subpieces needs to be smaller. For example, while proving $$P(zc,wd)$$, you can assume $$P(z,wd)$$, you can assume $$P(zc,w)$$, and you can assume $$P(z,w)$$. You can't assume $$P(zc,wd)$$ (since that's what you're trying to prove). You can't assume $$P(c,d)$$, because that doesn't even make sense: $$c$$ and $$d$$ are elements of $$Σ$$ not $$Σ^*$$, and $$P$$ is a property of pairs of strings, not pairs of characters. You can't assume $$P(εc, wd)$$ because $$εc$$ is not a subpiece of $$zc$$. You can't assume $$P(cat(z,w),w)$$ because $$cat(z,w)$$ is not a substructure of $$zc$$. You shouldn't assume $$P(w,z)$$, although this can be justified using more advanced techniques.
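The rule for which assumptions are available can be sketched in code. This is only an illustration, and it assumes that a "subpiece" of a string means the string itself or anything obtained by repeatedly removing its last character; the names `Empty`, `Snoc`, `subpieces` and `allowed_hypotheses` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Empty:
    """Rule 1: the empty string ε."""


@dataclass(frozen=True)
class Snoc:
    """Rule 2: the string xa."""
    x: "Str"
    a: str


Str = Union[Empty, Snoc]


def subpieces(s: Str) -> List[Str]:
    """s itself, followed by every string reached by undoing rule 2."""
    pieces = [s]
    while isinstance(s, Snoc):
        s = s.x
        pieces.append(s)
    return pieces


def allowed_hypotheses(x: Str, y: Str) -> List[Tuple[Str, Str]]:
    """Pairs (u, v) for which P(u, v) may be assumed while proving P(x, y)."""
    # Excluding (x, y) itself guarantees at least one component is smaller.
    return [(u, v) for u in subpieces(x) for v in subpieces(y) if (u, v) != (x, y)]


# Sample instance: z = εa, c = c, w = εb, d = d.
z = Snoc(Empty(), "a")
zc = Snoc(z, "c")
w = Snoc(Empty(), "b")
wd = Snoc(w, "d")
hyps = allowed_hypotheses(zc, wd)
print(len(hyps))
```

For this instance, `hyps` contains `(z, wd)`, `(zc, w)` and `(z, w)`, but not `(zc, wd)`, not `(Snoc(Empty(), "c"), wd)` (the pair for $$P(εc, wd)$$), and not `(w, z)`.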

Here is an example:

Claim: for all $$x$$ and $$y$$ in $$Σ^*$$, $$len(cat(x,y)) = len(x) + len(y)$$.

Proof: Recall $$len(ε) ::= 0$$ and $$len(xa) ::= 1 + len(x)$$. Recall also that $$cat(ε,ε) ::= ε$$, $$cat(ε,xa) ::= xa$$, $$cat(xa, ε) ::= xa$$ and $$cat(xa, yb) ::= cat(xa,y)b$$.

We proceed by induction on the structure of $$x$$ and $$y$$. Let $$P(x,y)$$ be the statement $$len(cat(x,y)) = len(x) + len(y)$$.

$$P(ε,ε)$$ case: we want to show $$len(cat(ε,ε)) = len(ε) + len(ε)$$. By definition, the left hand side is $$len(ε) = 0$$, and the right hand side is $$0 + 0 = 0$$.

$$P(ε,xa)$$ case: we want to show $$len(cat(ε,xa)) = len(ε) + len(xa)$$. By definition, $$cat(ε,xa) = xa$$, so $$len(cat(ε,xa)) = len(xa)$$. We also know $$len(ε) = 0$$, so the right hand side also simplifies to $$len(xa)$$.

The $$P(xa,ε)$$ case is symmetric to the $$P(ε,xa)$$ case.

In the $$P(xa,yb)$$ case, we want to show that $$len(cat(xa,yb)) = len(xa) + len(yb)$$. We may assume $$P(xa,y)$$, i.e. that $$len(cat(xa,y)) = len(xa) + len(y)$$. Using this, we have

$$\begin{aligned} len(cat(xa,yb)) &= len(cat(xa,y)b) && \text{by definition of } cat \\ &= 1 + len(cat(xa,y)) && \text{by definition of } len \\ &= 1 + (len(xa) + len(y)) && \text{by inductive assumption} \\ &= len(xa) + (1 + len(y)) && \text{by arithmetic} \\ &= len(xa) + len(yb) && \text{by definition of } len \end{aligned}$$

This concludes the proof.
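As a sanity check on the claim, here is a small Python sketch that implements $$cat$$ with one branch per clause and tests the identity on every pair of short strings. The encoding (`Empty`, `Snoc`), the name `length` for $$len$$, and the sample alphabet $$\{a, b\}$$ are assumptions made for the example, and the last clause is taken to be $$cat(xa, yb) = cat(xa,y)b$$, as used in the final case of the proof.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Empty:
    """Rule 1: the empty string ε."""


@dataclass(frozen=True)
class Snoc:
    """Rule 2: the string xa."""
    x: "Str"
    a: str


Str = Union[Empty, Snoc]


def length(s: Str) -> int:
    """len(ε) = 0 and len(xa) = 1 + len(x)."""
    if isinstance(s, Empty):
        return 0
    return 1 + length(s.x)


def cat(x: Str, y: Str) -> Str:
    """Concatenation, with one branch per clause of the definition."""
    if isinstance(x, Empty) and isinstance(y, Empty):
        return Empty()                 # cat(ε, ε) = ε
    if isinstance(x, Empty):
        return y                       # cat(ε, xa) = xa
    if isinstance(y, Empty):
        return x                       # cat(xa, ε) = xa
    return Snoc(cat(x, y.x), y.a)      # cat(xa, yb) = cat(xa, y)b


def strings_up_to(n: int, alphabet: str = "ab") -> List[Str]:
    """All strings over the alphabet with at most n characters."""
    level: List[Str] = [Empty()]
    result = list(level)
    for _ in range(n):
        level = [Snoc(x, a) for x in level for a in alphabet]
        result.extend(level)
    return result


def P(x: Str, y: Str) -> bool:
    """The statement len(cat(x, y)) = len(x) + len(y)."""
    return length(cat(x, y)) == length(x) + length(y)


small = strings_up_to(3)
all_hold = all(P(x, y) for x in small for y in small)
print(all_hold)
```

The only recursive call in `cat` is `cat(x, y.x)`, which is why the only inductive assumption the proof needed was $$P(xa,y)$$.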

Note that the structure of this proof very closely follows the structure of the function we were proving something about. In this case, we were proving a property of the $$cat$$ function; $$cat(xa,yb)$$ was defined in terms of $$cat(xa,y)$$, and in the proof of $$P(xa,yb)$$, we had to use the assumption $$P(xa,y)$$. This is a common occurrence in proofs by structural induction.