Lecture 28: Structural induction, languages

Inductively defined sets
- inductively defined functions
- proof by structural induction
Language of an automaton
- key terms: extended transition function δ^, language, language of a machine L(M), M recognizes L

Inductively defined sets

An inductively defined set is a set where the elements are constructed by a finite number of applications of a given set of rules.

Examples:

the set ℕ of natural numbers is the set of elements defined by the following rules:
1. 0 ∈ ℕ
2. If n ∈ ℕ then Sn ∈ ℕ.
thus the elements of ℕ are {0, S0, SS0, SSS0, …}. S stands for successor. You can then define 1 as S0, 2 as SS0, and so on.
the set Σ^* of strings with characters in Σ is defined by
1. ϵ ∈ Σ^*
2. If a ∈ Σ and x ∈ Σ^* then xa ∈ Σ^*.
thus, for example with Σ = {0, 1}, the elements of Σ^* are {ε, ε0, ε1, ε00, ε01, …, ε1010101, …}. We usually leave off the ε at the beginning of strings of length 1 or more.
the set T of binary trees with integers in the nodes is given by the rules
1. the empty tree (written nil) is a tree
2. if a ∈ ℤ and t₁ and t₂ are trees, then the tree with a at the root and subtrees t₁ and t₂ (written node(a, t₁, t₂)) is a tree.
thus the elements of T are things like node(3, node(0, nil, nil), node(1, node(2, nil, nil), nil)), a tree with 3 at the root whose two subtrees have roots 0 and 1.
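As a rough illustration, the two rules for T can be mirrored by two Python classes, one per rule. This is only a sketch: the class names `Nil` and `Node` and the field names `value`, `left`, and `right` are choices made here, not part of the lecture's notation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Nil:
    """Rule 1: the empty tree nil is a tree."""


@dataclass(frozen=True)
class Node:
    """Rule 2: if t1 and t2 are trees, then node(a, t1, t2) is a tree."""
    value: int       # the integer a stored in the node
    left: "Tree"     # t1 (calling it "left" is an assumption of this sketch)
    right: "Tree"    # t2


# A tree is anything built by finitely many applications of the two rules.
Tree = Union[Nil, Node]

# The example tree from the notes.
example_tree = Node(3, Node(0, Nil(), Nil()), Node(1, Node(2, Nil(), Nil()), Nil()))
```

Every value of type `Tree` is built by applying the constructors a finite number of times, which is exactly the sense in which the set is inductively defined.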

BNF

Compact way of writing down inductively defined sets: BNF (Backus Naur Form)

Only the name of the set and the rules are written down; they are separated by a "::=", and the rules are separated by vertical bar (|).

Examples (from above):

n ∈ ℕ ::= 0 | Sn
x ∈ Σ^* ::= ε | xa
a ∈ Σ
t ∈ T ::= nil | node(a, t₁, t₂)
a ∈ ℤ
(basic mathematical expressions)
e ∈ E ::= n | e₁ + e₂ | e₁ * e₂ | −e | e₁/e₂
n ∈ ℤ

Here, the variables to the left of the ∈ indicate metavariables. When the same characters appear in the rules on the right-hand side of the ::=, they indicate an arbitrary element of the set being defined. For example, the e₁ and e₂ in the e₁ + e₂ rule could be arbitrary elements of the set E, but + is just the symbol +.
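One way to read the BNF for E is that each alternative on the right-hand side becomes one constructor. The Python sketch below follows that reading. The class names (`Num`, `Add`, `Mul`, `Neg`, `Div`) and the helper `render` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:          # e ::= n, with n an integer
    n: int


@dataclass(frozen=True)
class Add:          # e ::= e1 + e2
    e1: "Expr"
    e2: "Expr"


@dataclass(frozen=True)
class Mul:          # e ::= e1 * e2
    e1: "Expr"
    e2: "Expr"


@dataclass(frozen=True)
class Neg:          # e ::= -e
    e: "Expr"


@dataclass(frozen=True)
class Div:          # e ::= e1 / e2
    e1: "Expr"
    e2: "Expr"


Expr = Union[Num, Add, Mul, Neg, Div]


def render(e: Expr) -> str:
    """Write an expression back out as fully parenthesized text."""
    if isinstance(e, Num):
        return str(e.n)
    if isinstance(e, Add):
        return f"({render(e.e1)} + {render(e.e2)})"
    if isinstance(e, Mul):
        return f"({render(e.e1)} * {render(e.e2)})"
    if isinstance(e, Neg):
        return f"(-{render(e.e)})"
    if isinstance(e, Div):
        return f"({render(e.e1)} / {render(e.e2)})"
    raise TypeError(f"not an expression: {e!r}")


example = Add(Num(1), Mul(Num(2), Neg(Num(3))))
print(render(example))  # (1 + (2 * (-3)))
```

The fields `e1` and `e2` play the role of the metavariables e₁ and e₂: they can hold any element of E, while the constructor itself stands for the fixed symbol.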

Inductively defined functions

If X is an inductively defined set, you can define a function from X to Y by defining the function on each of the types of elements of X; i.e. for each of the rules. In the inductive rules (i.e. the ones containing the metavariable being defined), you can assume the function is already defined on the subterms.

Examples:

add2 : ℕ → ℕ is given by add2 : 0 ↦ SS0 and add2 : Sn ↦ S(add2(n)).
plus : ℕ × ℕ → ℕ given by plus : (0, n)↦n and plus : (Sn, n′) ↦ S(plus(n, n′)). Note that we don't need to use induction on both of the inputs.
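δ^ : Q × Σ^* → Q is given by δ^ : (q, ε) ↦ q and δ^ : (q, xa) ↦ δ(δ^(q, x), a) (this is the extended transition function, discussed below).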
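The definitions of add2 and plus translate almost line for line into recursive functions, with one branch per rule. The sketch below assumes a Python encoding of ℕ with classes `Zero` and `S`. The helpers `from_int` and `to_int` are hypothetical conveniences for reading the results and are not part of the definitions.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Zero:
    """Rule 1: 0 is a natural number."""


@dataclass(frozen=True)
class S:
    """Rule 2: if n is a natural number then Sn is a natural number."""
    pred: "Nat"


Nat = Union[Zero, S]


def add2(n: Nat) -> Nat:
    if isinstance(n, Zero):
        return S(S(Zero()))        # add2 : 0 ↦ SS0
    return S(add2(n.pred))         # add2 : Sn ↦ S(add2(n))


def plus(m: Nat, n: Nat) -> Nat:
    # Induction on the first argument only; the second is passed along unchanged.
    if isinstance(m, Zero):
        return n                   # plus : (0, n) ↦ n
    return S(plus(m.pred, n))      # plus : (Sn, n′) ↦ S(plus(n, n′))


def from_int(k: int) -> Nat:
    """Build S...S0 with k applications of S (helper for this sketch)."""
    n: Nat = Zero()
    for _ in range(k):
        n = S(n)
    return n


def to_int(n: Nat) -> int:
    """Count the applications of S (helper for this sketch)."""
    count = 0
    while isinstance(n, S):
        count += 1
        n = n.pred
    return count


print(to_int(add2(from_int(3))))               # 5
print(to_int(plus(from_int(2), from_int(3))))  # 5
```

In each recursive branch the function is only called on the subterm `n.pred`, which is what makes the definition well-founded.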

Proofs by structural induction

If X is an inductively defined set, then you can prove statements of the form ∀x ∈ X, P(x) by giving a separate proof for each rule. For the inductive/recursive rules (i.e. the ones containing metavariables), you can assume that P holds on all subexpressions of x.

Examples:

Proof that M is correct (see homework solutions) can be simplified using structural induction
A proof by structural induction on the natural numbers as defined above is the same thing as a proof by weak induction. You must prove P(0) and also prove P(Sn) assuming P(n).

Language of a Machine

Extended transition function (Note: because this doesn't render nicely in HTML, I will write δ^ for "delta-hat")

δ^:Q × Σ^* → Q
- informally: δ^(q, x) tells you where you end up after processing the string x starting in state q.
- compare with δ : Q × Σ → Q: δ^ processes strings, while δ processes single characters
- domain of δ is finite, so description of δ is finite; it is part of the machine.
- domain of δ^ is infinite, so δ^ is not part of the description of the machine (but it is built from the description of the machine).
δ^:(q, ε)↦q, and δ^:(q, xa)↦δ(δ^(q, x),a)
- informally: to process xa, first process x; starting there, take a single step with the (non-extended) transition function δ
- Note: the outer δ in this definition is the ordinary transition function, not δ^; its second argument a is a single character, not a string.
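The two-rule definition of δ^ can be sketched directly as a recursive Python function. The representation of δ as a dictionary keyed by (state, character) pairs is an assumption of this sketch, and the parity machine used to exercise it (states `even` and `odd`) is a hypothetical example, not a machine from the lecture.

```python
from typing import Dict, Tuple

State = str
Delta = Dict[Tuple[State, str], State]


def delta_hat(delta: Delta, q: State, s: str) -> State:
    """Extended transition function, following the inductive definition."""
    if s == "":
        return q                              # δ^ : (q, ε) ↦ q
    x, a = s[:-1], s[-1]                      # split s as xa
    return delta[(delta_hat(delta, q, x), a)]  # δ^ : (q, xa) ↦ δ(δ^(q, x), a)


# Hypothetical machine tracking whether the number of 1's seen is even or odd.
parity_delta: Delta = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

print(delta_hat(parity_delta, "even", "1011"))  # odd
```

The recursion peels characters off the right end of the string, matching the rule x ::= ε | xa. A left-to-right loop that repeatedly applies δ computes the same state and avoids Python's recursion limit on long inputs.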

Language - A language is a set of strings

Language of a machine - L(M) stands for the "language of M".
- contains all (and only) strings that M accepts
- informally: a string x is accepted by a machine M if, after processing x starting at the start state, the machine ends in a final state.
- formal definition: L(M) = {x ∈ Σ^* | δ^(q₀, x) ∈ F}, where M = (Q, Σ, δ, q₀, F).
- We say x is accepted by M if x ∈ L(M); x is rejected otherwise.
- We say that M recognizes L if L = L(M).
- A language L is DFA-recognizable if there is some machine M with L = L(M).
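The formal definition of L(M) can be sketched as a membership test: run δ^ from q₀ and check whether the result is in F. In the sketch below the `DFA` tuple type, the function names, and the example machine (which accepts strings with an even number of 1's) are all assumptions made for illustration. Since L(M) may be infinite, `language_up_to` only lists the members up to a given length.

```python
from itertools import product
from typing import Dict, FrozenSet, List, NamedTuple, Tuple


class DFA(NamedTuple):
    """M = (Q, Σ, δ, q0, F)."""
    states: FrozenSet[str]
    alphabet: FrozenSet[str]
    delta: Dict[Tuple[str, str], str]
    start: str
    final: FrozenSet[str]


def delta_hat(m: DFA, q: str, s: str) -> str:
    """δ^(q, s), computed by applying δ to one character at a time."""
    for a in s:
        q = m.delta[(q, a)]
    return q


def accepts(m: DFA, s: str) -> bool:
    """x ∈ L(M) exactly when δ^(q0, x) ∈ F."""
    return delta_hat(m, m.start, s) in m.final


def language_up_to(m: DFA, max_len: int) -> List[str]:
    """The strings of L(M) of length at most max_len, shortest first."""
    found = []
    for n in range(max_len + 1):
        for chars in product(sorted(m.alphabet), repeat=n):
            s = "".join(chars)
            if accepts(m, s):
                found.append(s)
    return found


# Hypothetical example: accept strings over {0, 1} with an even number of 1's.
even_ones = DFA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset({"0", "1"}),
    delta={
        ("even", "0"): "even", ("even", "1"): "odd",
        ("odd", "0"): "odd", ("odd", "1"): "even",
    },
    start="even",
    final=frozenset({"even"}),
)

print(language_up_to(even_ones, 2))  # ['', '0', '00', '11']
```

The empty string appears in the output as `''`, which corresponds to ε.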

Proof of correctness of an automaton

Given a language L, we may wish to build a machine that recognizes L, and prove that it is correct.

In other words, we wish to prove that L = L(M).

In other words, we wish to prove that ∀x ∈ Σ^*, x ∈ L if and only if x ∈ L(M).

In other words, we wish to prove that ∀x ∈ Σ^*, x ∈ L if and only if δ^(q₀, x)∈F.

A straightforward approach is induction on the structure of x; however the induction hypothesis usually needs to be strengthened to describe all of the states (and not just the final state); essentially you want to prove that each state "satisfies its specification".

For example: we may want to build a machine that recognizes strings that contain at least two ones. We might build a machine with three states: q₀ represents strings with no 1's, q₁ represents strings with one 1, and q₂ represents strings with two or more 1's.

A proof of correctness for this machine might go as follows:

Let P(x) be the statement "δ^(q₀, x)=q₀ if and only if x has no 1's, and δ^(q₀, x)=q₁ if and only if x has exactly one 1, and δ^(q₀, x)=q₂ if and only if x has two or more 1's." I claim that ∀x ∈ Σ^*, P(x) holds.

We will prove this claim by induction on the structure of x. We must show P(ε) and, assuming P(x), P(xa).

To prove P(ε), note that δ^(q₀, ε)=q₀ and that ε has no 1's, so the first part of P(ε) holds. The other two parts hold because both sides of each "if and only if" are false: δ^(q₀, ε) is neither q₁ nor q₂, and ε has neither exactly one 1 nor two or more 1's.

Now, to prove P(xa), assume P(x). a can be either 0 or 1, and δ^(q₀, x) could be any of q₀, q₁, and q₂. We consider each case:

If a = 0, then note that δ(q_i, a)=q_i for every state q_i, so δ^(q₀, xa)=δ(δ^(q₀, x),a)=δ^(q₀, x). Moreover, note that xa has the same number of 1's as x. Thus in this case, P(xa) follows directly from P(x).

If a = 1 and δ^(q₀, x)=q₀, then by P(x), x must have no 1's. Thus xa has exactly one 1. Moreover, δ^(q₀, xa)=δ(δ^(q₀, x),a)=δ(q₀, 1)=q₁, so P(xa) holds in this case.

If a = 1 and δ^(q₀, x)=q₁, then by P(x), x must have exactly one 1. Thus xa has exactly two 1's. Moreover, δ^(q₀, xa)=δ(δ^(q₀, x),a)=δ(q₁, 1)=q₂, so P(xa) holds in this case.

If a = 1 and δ^(q₀, x)=q₂, then by P(x), x must have two or more 1's. Thus xa has more than two 1's. Moreover, δ^(q₀, xa)=δ(δ^(q₀, x),a)=δ(q₂, 1)=q₂, so P(xa) holds in this case.

In all possible cases, we have shown P(xa). This concludes the inductive proof.

Note that the framework provided by the inductive proof forces you to write down a specification for each state, and then reason about each transition.
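The per-state specification in P(x) can also be spot-checked by brute force, which is a useful sanity test but not a substitute for the inductive proof, since it only covers finitely many strings. The sketch below encodes the three-state machine using the transitions that appear in the proof. The names `DELTA`, `spec_state`, and `check_P` are hypothetical.

```python
from itertools import product

# Transitions used in the proof: reading 0 stays put, reading 1 moves
# q0 -> q1 -> q2, and q2 loops to itself.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}


def delta_hat(q: str, s: str) -> str:
    """δ^(q, s), computed by applying δ to one character at a time."""
    for a in s:
        q = DELTA[(q, a)]
    return q


def spec_state(x: str) -> str:
    """The state that P(x) says the machine should be in after reading x."""
    ones = x.count("1")
    if ones == 0:
        return "q0"
    if ones == 1:
        return "q1"
    return "q2"


def check_P(max_len: int) -> bool:
    """Check P(x) for every x over {0, 1} of length at most max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            x = "".join(chars)
            if delta_hat("q0", x) != spec_state(x):
                return False
    return True


print(check_P(10))  # True
```

Because every string has exactly one of "no 1's", "exactly one 1", or "two or more 1's", and δ^(q₀, x) is exactly one of the three states, the three "if and only if" clauses of P(x) together are equivalent to the single equality tested in `check_P`.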