Lecture 20: Inductive definitions

Reading: MCS 7,7.1
Inductively defined sets
- BNF notation
- examples: lists, trees, $\mathbb{N}$ , logical formulae
- Strings ( $Σ^*$ )
Inductively defined functions
- examples: length, concatenation
Proofs by structural induction
Review Exercises:
- Give inductive definitions for the following sets: $\mathbb{N}$ ; the set of strings with alphabet $Σ$ ; the set of binary trees; the set of arithmetic expressions formed using addition, multiplication, exponentiation
- Give inductive definitions of the length of a string, the concatenation of two strings, the reverse of a string, the maximum element of a list of integers, the sum of two natural numbers, the product of two natural numbers, etc.
- Prove that $len(cat(x,y)) = len(x) + len(y)$ .
- Prove that $len(reverse(x)) = len(x)$ .
- Use the inductive definitions of $\mathbb{N}$ and $plus$ to show that $plus(a,b) = plus(b,a)$ .

Inductively defined sets

An inductively defined set is a set where the elements are constructed by a finite number of applications of a given set of rules.

Examples:

the set $\mathbb{N}$ of natural numbers is the set of elements defined by the following rules:
1. $Z \in \mathbb{N}$
2. If $n \in \mathbb{N}$ then $Sn \in \mathbb{N}$ .
thus the elements of $\mathbb{N}$ are $\{Z, SZ, SSZ, SSSZ, \dots\}$ . $S$ stands for successor. You can then define $1$ as $SZ$ , $2$ as $SSZ$ , and so on.
the set $\Sigma^{*}$ of strings with characters in $\Sigma$ is defined by
1. $\epsilon \in \Sigma^*$
2. If $a \in \Sigma$ and $x \in \Sigma^{*}$ then $xa \in \Sigma^*$ .
thus if $\Sigma = \{0,1\}$ , then the elements of $\Sigma^*$ are $\{ε, ε0, ε1, ε00, ε01, \dots, ε1010101, \dots\}$ . we usually leave off the $ε$ at the beginning of strings of length 1 or more.
the set $T$ of binary trees with integers in the nodes is given by the rules
1. the empty tree (, written $nil$ ) is a tree
2. if $t_1$ and $t_2$ are trees, then , written $node(a,t_1,t_2)$ ) is a tree.
thus the elements of $T$ are things like the picture to the right (click for tex), which might be written textually as $node(3,node(0,nil,nil),node(1,node(2,nil,nil),nil))$

BNF

Compact way of writing down inductively defined sets: BNF (Backus Naur Form)

Only the name of the set and the rules are written down; they are separated by a "::=", and the rules are separated by vertical bar ( $|$ ).

Examples (from above):

$n \in \mathbb{N} ::= 0 \mid Sn$
$x \in \Sigma^* ::= \epsilon \mid xa$ where $a \in \Sigma$
$t \in T ::= nil \mid node(a,t_1,t_2)$ where $a \in Z$
(basic mathematical expresssions) $\begin{aligned}e \in E &::= n \mid e_1 + e_2 \mid e_1 * e_2 \mid - e \mid e_1 / e_2 \\ n \in \mathbb{Z}\end{aligned}$

Here, the variables to the left of the $\in$ indicate metavariables. When the same characters appear in the rules on the right-hand side of the $::=$ , they indicate an arbitrary element of the set being defined. For example, the $e_1$ and $e_2$ in the $e_1 + e_2$ rule could be arbitrary elements of the set $E$ , but $+$ is just the symbol $+$ .

Inductively defined functions

If $X$ is an inductively defined set, you can define a function from $X$ to $Y$ by defining the function on each of the types of elements of $X$ ; i.e. for each of the rules. In the inductive rules (i.e. the ones containing the metavariable being defined), you can assume the function is already defined on the subterms.

Examples:

$add2 : \mathbb{N} → \mathbb{N}$ is given by $add2(0) ::= SS0$ and $add2 (Sn) ::= S(add2(n))$ .
$plus : \mathbb{N} \times \mathbb{N} → \mathbb{N}$ given by $plus (0,n) ::= n$ and $plus (Sn, n') ::= S(plus(n,n'))$ . Note that we don't need to use induction on both of the inputs.
$len : Σ^* → \mathbb{N}$ is given by $len(ε) ::= 0$ and $len(xa) ::= 1 + len(x)$ .
$cat : Σ^* \times Σ^* → Σ^*$ is given by $cat(ε,ε) ::= ε$ , $cat(xa,ε) ::= xa$ , $cat(ε,xa) ::= xa$ and $cat(xa,yb) ::= cat(xa,y)b$ .

Idea behind structural induction

Consider the definition $x \in Σ^* ::= ε \mid xa$ . I will refer to $x ::= ε$ as "rule 1" and $x ::= xa$ as "rule 2". This definition says that there are two kinds of strings: empty strings (formed using rule 1), and strings of the form $xa$ , where $x$ is a smaller string (formed using rule 2); these are the only kinds of strings.

If we want to prove that property $P$ holds on all strings (i.e. $∀x \in Σ^*, P(x)$ ), we can do it by giving a proof for strings formed using rule 1 (let's call it proof 1), and another proof for strings formed using rule 2 (let's call it proof 2). In the second proof, we may assume that $P(y)$ holds.

Why can we make this assumption? Suppose we have some complicated string, like $εabc$ , and we want to conclude $P(εabc)$ . We build the string $εabc$ by snapping together smaller strings using rules 1 and 2; we can imagine building a proof of $P(εabc)$ by snapping together smaller proofs using proofs 1 and 2.

To show that $εabc$ is a string, we first use rule 1 to show that $ε$ is a string, then rule 2 to show that $εa$ is a string (this assumes that $ε$ is a string, but we just argued it was), and then rule 2 again to show that $εab$ is a string (using the fact that $εa$ is a string), and finally use rule 2 a third time to show that $εabc$ is a string.

Similarly, we can use proof 1 to show that $P(ε)$ holds, then use proof 2 to show that $P(εa)$ holds (this assumes that $P(ε)$ holds, but we just argued it does), and then use proof 2 again to show that $P(εab)$ holds (using the fact that $P(εa)$ holds), and finally use proof 2 a third time to show that $P(εabc)$ holds.

In general, any element of an inductively defined set is built up by applying the rules defining the set, so if you provide a proof for each rule, you have given a proof for every element. Before you can build a complex structure, you have to build the parts, so while building the proof that some property holds on a complex structure, you can assume that you have already proved it for the subparts.

Structural induction step by step

In general, if an inductive set $X$ is defined by a set of rules (rule 1, rule 2, etc.), then we can prove $∀x \in X, P(X)$ by giving a separate proof of $P(x)$ for $x$ formed by each of the rules. In the cases where the rule recursively uses elements $y_1, y_2, \dots$ of the set being defined, we can assume $P(y_1), P(y_2), \dots$ .

Example structures:

$Σ^*$ is defined by $x ∈ Σ^* ::= ε \mid xa$ . To prove $∀x \in Σ^*, P(x)$ , you must prove (1) $P(ε)$ , and (2) $P(xa)$ ; but in the proof of (2) you may assume $P(x)$ .
If a set $T$ is defined by $t \in T ::= nil \mid node(a,t_1,t_2)$ , you must prove (1) $P(nil)$ and (2) $P(node(a,t_1,t_2))$ . But, in the proof of (2) you may assume $P(t_1)$ and $P(t_2)$ .
If a set $F$ is defined by $φ \in F ::= Q \mid \lnot φ \mid φ_1 \land φ_2 \mid φ_1 \lor φ_2$ , you can prove $∀φ ∈ F, P(φ)$ by proving (1) $P(Q)$ , (2) $P(\lnot φ)$ [assuming $P(φ)$ ], (3) $P(φ_1 \land φ_2)$ [assuming $P(φ_1)$ and $P(φ_2)$ ], (4) $P(φ_1 \lor φ_2)$ [assuming $P(φ_1)$ and $P(φ_2)$ ].

Example proofs

lengths of strings are nonnegative

Recall $Σ^*$ is defined by $x \in Σ^* ::= ε \mid xa$ and $len : Σ^* → \N$ is given by $len(ε) ::= 0$ and $len(xa) ::= 1 + len(x)$ .

Claim: For all $x \in Σ^*$ , $len(x) \geq 0$ Proof: By induction on the structure of $x$ . Let $P(x)$ be the statement " $len(x) \geq 0$ ". We must prove $P(ε)$ , and $P(xa)$ assuming $P(x)$ .

$P(ε)$ case: we want to show $len(ε) \geq 0$ . Well, by definition, $P(ε) = 0 \geq 0$ .

$P(xa)$ case: assume $P(x)$ . That is, $len(x) \geq 0$ . We wish to show $P(xa)$ , i.e. that $len(xa) \geq 0$ . Well, $len(xa) = 1 + len(x) \geq 1 + 0 = 1$ .

balanced trees of height $k$ have height $2^k - 1$

Here is another example proof by structural induction, this time using the definition of trees. We proved this in lecture 21 but it has been moved here.

Definition: We say that a tree $t \in T$ is balanced of height $k$ if either 1. $t = nil$ and $k = 0$ , or 2. $t = node(a,t_1,t_2)$ and $t_1$ and $t_2$ are both balanced of height $k-1$ .

Definition: We define $n : T → \mathbb{N}$ by the rules $n(nil) := 0$ and $n(node(a,t_1,t_2)) := 1 + n(t_1) + n(t_2)$ .

Claim: for all $t \in T$ and for all $k \in \mathbb{N}$ , If $t$ is balanced of height $k$ then $n(t) = 2^{k}-1$ .

Proof: By structural induction on $t$ . Let $P(t)$ be the statement "for all $k \in \mathbb{N}$ , if $t$ is balanced of height $k$ , then $n(t) = 2^{k}-1$ ." We must show $P(nil)$ and $P(node(a,t_1,t_2))$ .

We start by proving $P(nil)$ , i.e. that for all $k$ , if $nil$ is balanced of height $k$ then $n(nil) = 2^k-1$ . Well, the only way for $nil$ to be balanced of height $k$ is if $k = 0$ . Therefore $2^k - 1 = 2^0 - 1 = 0$ . The definition of $n$ shows that $n(nil)$ is also 0, so $n(nil) = 2^k-1$ in this case.

For the $node$ case, we must show that if $node(a,t_1,t_2)$ is balanced of height $k$ for some $k$ , then $n(node(a,t_1,t_2)) = 2^k-1$ . We get to assume the inductive hypotheses: $P(t_1)$ says that if $t_1$ is balanced of height $k'$ for some $k'$ then $n(t_1) = 2^{k'}-1$ , and similarly for $t_2$ .

Since $node(a,t_1,t_2)$ is balanced of height $k$ , we know that $t_1$ and $t_2$ must both be balanced of height $k-1$ (this is the definition of balanced of height $k$ ). Therefore, by $P(t_1)$ we see that $n(t_1) = 2^{k-1}-1$ , and $n(t_2) = 2^{k-1}-1$ . Therefore, by definition of $n$ , we see

$n(node(a,t_1,t_2)) = 1 + n(t_1) + n(t_2) = 1 + (2^{k-1}-1) + (2^{k-1}-1) = 2^k$

as required.