Reading: Pass and Tseng 8.1

- Definitions and examples:
  - DFA, language of a machine, recognizable language, transition function/extended transition function

**Review exercises:**

- Write the definition of \(\hat{δ}\).
- Let \(M\) be a machine with a single state \(q\), which is both the start state and a final state, and a transition function that takes \(q\) back to \(q\) on every character of \(Σ\). Find \(L(M)\).
- If every \(x \in L\) is accepted by \(M\), is \(L\) recognized by \(M\)?
- If \(L\) is recognized by \(M\), is every string in \(L\) accepted by \(M\)?
- Draw an automaton that recognizes the set of even length strings, the set of all strings, the empty set, etc.

An automaton is an extremely simple model of a computer or a program. The automata we will study examine an input string character by character and either say "yes" (accept the string) or "no" (reject the string).

Automata are defined by state transition diagrams. Here is one example:

This automaton processes strings containing the characters 0 and 1. It contains 4 states, \(q_{ee}\), \(q_{eo}\), \(q_{oe}\) and \(q_{oo}\).

While processing a string \(x\), the machine starts at the beginning of \(x\), in the *start state* \(q_{ee}\) (as indicated by the arrow pointing to \(q_{ee}\)). As it processes each character, it follows the corresponding *transition* (arrow). When it has finished processing the string, if it is in a final state (\(q_{eo}\) in this case, as indicated by the double circle), it accepts \(x\); otherwise it rejects \(x\).

For example, while processing \(1000110\), the machine will start in state \(q_{ee}\), then transition in order to states \(q_{eo}\) (after processing the 1), \(q_{oo}\) (after the first 0), then \(q_{eo}\), \(q_{oo}\), \(q_{oe}\), \(q_{oo}\), and finally end up in \(q_{eo}\). Since \(q_{eo}\) is a final state, the string \(1000110\) is accepted.

Although this model of computation is very limited, it is sophisticated enough to demonstrate several kinds of analysis that apply to more sophisticated models:

- We'll talk about translating "programs" (automata) from one representation to another, and proving that those translations are correct. This is analogous to building and verifying compilers.
- We'll show how to optimize automata, again proving that our optimizations don't change the behavior of the program.
- We'll show that there are specifications (sets of strings) that can't be recognized by any automaton. Similar results apply to fully general models of computation, and have important practical implications.
- We'll talk about non-determinism, an important concept when reasoning about programs.

Before we do any of that, we need to formalize the informal definition of an automaton and its operation.

**Definitions:** A **deterministic finite automaton** \(M\) is a 5-tuple \(M = (Q,Σ,δ,q_0,F)\), where

- \(Q\) is a finite set, called the **set of states of \(M\)**.
- \(Σ\) is a finite set, called the **alphabet of \(M\)** (elements of \(Σ\) are called **characters**).
- \(δ\) is a function \(δ : Q \times Σ → Q\), called the **transition function**. In the picture, there is a transition from \(q\) to \(q'\) on input \(a\) if \(δ(q,a) = q'\).
- \(q_0 \in Q\) is the **start state**.
- \(F \subseteq Q\) is the **set of final states**.

In the example diagram above,

- \(Q = \{q_{ee}, q_{eo}, q_{oe}, q_{oo}\}\)
- \(Σ = \{0,1\}\)
- \(δ\) is given by the following table:

| \(q\) | \(a\) | \(δ(q,a)\) |
|---|---|---|
| \(q_{ee}\) | 0 | \(q_{oe}\) |
| \(q_{ee}\) | 1 | \(q_{eo}\) |
| \(q_{eo}\) | 0 | \(q_{oo}\) |
| \(q_{eo}\) | 1 | \(q_{ee}\) |
| \(q_{oe}\) | 0 | \(q_{ee}\) |
| \(q_{oe}\) | 1 | \(q_{oo}\) |
| \(q_{oo}\) | 0 | \(q_{eo}\) |
| \(q_{oo}\) | 1 | \(q_{oe}\) |

- \(q_0 = q_{ee}\)
- \(F = \{q_{eo}\}\)
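As a rough illustration, the 5-tuple for this example can be written down directly as data. The sketch below is only one possible encoding: the class name `DFA`, the field names, and the string labels `"ee"`, `"eo"`, `"oe"`, `"oo"` (standing for \(q_{ee}\), \(q_{eo}\), \(q_{oe}\), \(q_{oo}\)) are assumptions made for the illustration, not part of the definition. The method `is_well_formed` checks the side conditions the definition imposes: \(δ\) is a total function \(Q \times Σ → Q\), \(q_0 \in Q\), and \(F \subseteq Q\).

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset    # Q
    alphabet: frozenset  # Σ
    delta: dict          # δ, keyed by (state, character)
    start: str           # q_0
    finals: frozenset    # F

    def is_well_formed(self) -> bool:
        """Check the side conditions of the 5-tuple definition."""
        domain = {(q, a) for q in self.states for a in self.alphabet}
        # δ must be defined on exactly Q × Σ ...
        total = set(self.delta) == domain
        # ... and must only produce states in Q.
        closed = all(r in self.states for r in self.delta.values())
        return (total and closed
                and self.start in self.states
                and self.finals <= self.states)


# The example machine, transcribed from the table above.
EXAMPLE = DFA(
    states=frozenset({"ee", "eo", "oe", "oo"}),
    alphabet=frozenset({"0", "1"}),
    delta={
        ("ee", "0"): "oe", ("ee", "1"): "eo",
        ("eo", "0"): "oo", ("eo", "1"): "ee",
        ("oe", "0"): "ee", ("oe", "1"): "oo",
        ("oo", "0"): "eo", ("oo", "1"): "oe",
    },
    start="ee",
    finals=frozenset({"eo"}),
)

print(EXAMPLE.is_well_formed())  # True
```

Because \(Q\) and \(Σ\) are finite, the dictionary for \(δ\) has exactly \(|Q| \cdot |Σ| = 8\) entries here.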

Given an automaton \(M\), we define the extended transition function \(\hat{δ} : Q \times Σ^* → Q\). Informally, \(\hat{δ}(q,x)\) tells us where \(M\) ends up after starting in state \(q\) and processing the entire *string* \(x\); contrast the domain with that of \(δ\), which processes only a single character. This distinction is important: since \(δ\) only processes characters, its domain \(Q \times Σ\) is finite, so the description of the machine is finite; but \(\hat{δ}\) (which is not part of the description of the machine) is defined on infinitely many strings.

**Definition:** Formally, we define the **extended transition function** \(\hat{δ} : Q \times Σ^* → Q\) inductively by \(\hat{δ}(q,ε) = q\), and \(\hat{δ}(q,xa) = δ(\hat{δ}(q,x), a)\).

The first part of this definition says that processing the empty string doesn't move the machine; the second part says that to process \(xa\), you first process \(x\), and then take one more step using \(a\) from wherever \(x\) gets you.
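As a small worked example, here is the definition unfolded on the two-character string \(10\) in the example machine, starting from \(q_{ee}\). Each step applies either the inductive clause (with \(10 = (1)0\) and \(1 = (ε)1\)), the base clause, or a row of the table for \(δ\):

\[
\begin{aligned}
\hat{δ}(q_{ee}, 10) &= δ(\hat{δ}(q_{ee}, 1), 0) \\
&= δ(δ(\hat{δ}(q_{ee}, ε), 1), 0) \\
&= δ(δ(q_{ee}, 1), 0) \\
&= δ(q_{eo}, 0) \\
&= q_{oo}
\end{aligned}
\]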

With these definitions, we can say what it means for the machine to accept or reject a string:

**Definition:** If \(\hat{δ}(q_0,x) \in F\) then \(M\) **accepts** \(x\). Otherwise, \(M\) **rejects** \(x\).
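The inductive definition of \(\hat{δ}\) and the definition of acceptance translate almost line for line into a recursive function. The following is a minimal sketch for the example machine only; the names `DELTA`, `START`, `FINALS`, `delta_hat`, and `accepts`, and the string labels for the states, are assumptions made for the illustration.

```python
# Transition table of the example machine, transcribed from the δ table above.
# The labels "ee", "eo", "oe", "oo" stand for q_ee, q_eo, q_oe, q_oo.
DELTA = {
    ("ee", "0"): "oe", ("ee", "1"): "eo",
    ("eo", "0"): "oo", ("eo", "1"): "ee",
    ("oe", "0"): "ee", ("oe", "1"): "oo",
    ("oo", "0"): "eo", ("oo", "1"): "oe",
}
START = "ee"
FINALS = {"eo"}


def delta_hat(q: str, x: str) -> str:
    """Extended transition function: the state reached from q after reading x."""
    if x == "":
        return q                                # δ̂(q, ε) = q
    prefix, last = x[:-1], x[-1]                # split x as (prefix)(last)
    return DELTA[(delta_hat(q, prefix), last)]  # δ̂(q, xa) = δ(δ̂(q, x), a)


def accepts(x: str) -> bool:
    """M accepts x exactly when δ̂(q_0, x) is a final state."""
    return delta_hat(START, x) in FINALS


print(delta_hat(START, "1000110"))  # eo
print(accepts("1000110"))           # True
print(accepts(""))                  # False
```

The recursion peels characters off the *end* of the string, exactly as the definition does; a loop that reads characters from the front computes the same state.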

**Definition:** A **language** is a set of strings, i.e. a subset of \(Σ^*\).

**Definition:** The **language of \(M\)**, denoted \(L(M)\), is given by \[L(M) ::= \{x \in \Sigma^* \mid M\text{ accepts }x\} = \{x \in \Sigma^* \mid \hat{\delta}(q_0,x) \in F\}\]
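Since \(L(M)\) is usually infinite, it cannot be listed in full, but its short members can. The sketch below enumerates the strings of length at most \(n\) accepted by the example machine; the helper names `accepts` and `language_up_to` are hypothetical, and the loop in `accepts` is an iterative way of computing \(\hat{δ}(q_0, x)\).

```python
from itertools import product

# Transition table of the example machine; "ee" stands for q_ee, and so on.
DELTA = {
    ("ee", "0"): "oe", ("ee", "1"): "eo",
    ("eo", "0"): "oo", ("eo", "1"): "ee",
    ("oe", "0"): "ee", ("oe", "1"): "oo",
    ("oo", "0"): "eo", ("oo", "1"): "oe",
}


def accepts(x: str) -> bool:
    """Run the machine from the start state and test for the final state q_eo."""
    q = "ee"
    for a in x:
        q = DELTA[(q, a)]
    return q == "eo"


def language_up_to(n: int) -> list:
    """All accepted strings of length at most n, shortest first."""
    words = ("".join(w) for k in range(n + 1) for w in product("01", repeat=k))
    return [w for w in words if accepts(w)]


print(language_up_to(3))  # ['1', '001', '010', '100', '111']
```

Every string in the output has an even number of 0s and an odd number of 1s, which matches the subscripts on the final state \(q_{eo}\).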

**Definition:** We say a language \(L\) is **recognized by \(M\)** if \(L = L(M)\).

**Definition:** A language \(L\) is **DFA-recognizable** if there exists some DFA \(M\) that recognizes \(L\).

Here is a mistake that students often make, and one that it is important to avoid. We are primarily interested in whether *entire languages* are recognized by machines, rather than particular strings. It does not make sense to say that a string is recognized, or that a language is accepted. Moreover, the fact that all of the strings in a language \(L\) are accepted by \(M\) does not mean that \(L\) is recognized by \(M\).

This is important, because otherwise the whole theory of automata becomes trivial:

**(False) claim:** Every language \(L\) is recognizable.

**(Bogus) proof:** Let \(M\) be a machine with a single state \(q\), which is both the start state and a final state, and a transition function that takes \(q\) back to \(q\) on every character of \(Σ\). Clearly \(M\) accepts every string in \(L\), so \(M\) recognizes \(L\), and therefore \(L\) is recognizable.

This is bogus, because it only proves that \(L \subseteq L(M)\), not that \(L = L(M)\). To show \(L = L(M)\), you must show **both** that every \(x \in L\) is in \(L(M)\) **and** that every \(x \notin L\) is not in \(L(M)\).
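The gap in the bogus proof can be seen concretely on a small case. The sketch below assumes a particular language for illustration (the even-length strings over \(\{0,1\}\)) and takes \(q\) to be both the start state and the final state of the one-state machine; it then tests both inclusions on all strings of length at most 3. A bounded check like this is not a proof of anything, but a single counterexample is enough to show \(L \neq L(M)\).

```python
from itertools import product


def accepts_everything(x: str) -> bool:
    """The one-state machine from the bogus proof.

    Assumption: q is both the start state and the final state, and every
    character loops from q back to q, so every string ends in q.
    """
    q = "q"
    for _ in x:
        q = "q"
    return q == "q"


def in_L(x: str) -> bool:
    """Hypothetical example language L: strings of even length."""
    return len(x) % 2 == 0


strings = ["".join(w) for k in range(4) for w in product("01", repeat=k)]

# L ⊆ L(M): every string in L is accepted by M.
subset = all(accepts_everything(x) for x in strings if in_L(x))
# L(M) ⊆ L: every string accepted by M is in L.
superset = all(in_L(x) for x in strings if accepts_everything(x))

print(subset, superset)  # True False
```

The second check fails because, for example, the string `0` is accepted by \(M\) but has odd length, so it is in \(L(M)\) and not in \(L\).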