- Regular expressions
- *L*(*r*)
- *ε*-NFA (discussed in notes for next lecture)

Regular expressions are patterns that match certain strings. They give a way to define a language: the language of a regular expression is the set of all strings that match the pattern.

There are six ways to construct regular expressions. Formally, the set of regular expressions is formed by the following grammar:

*r* ∈ *RE* ::= ∅ ∣ *ε* ∣ *a* ∣ *r*_{1}*r*_{2} ∣ (*r*_{1}∣*r*_{2}) ∣ *r*^{ * }
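One way to make the grammar concrete is to represent each of the six alternatives as its own kind of tree node. The following is only a sketch: the class names (`Empty`, `Epsilon`, `Char`, `Concat`, `Alt`, `Star`) and the helper `show` are hypothetical names chosen for illustration and are not part of these notes.

```python
from dataclasses import dataclass


class Regex:
    """Base class for the six kinds of regular expression (hypothetical name)."""


@dataclass(frozen=True)
class Empty(Regex):      # the expression ∅
    pass


@dataclass(frozen=True)
class Epsilon(Regex):    # the expression ε
    pass


@dataclass(frozen=True)
class Char(Regex):       # a single character a
    c: str


@dataclass(frozen=True)
class Concat(Regex):     # r1 r2
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Alt(Regex):        # (r1 | r2)
    left: Regex
    right: Regex


@dataclass(frozen=True)
class Star(Regex):       # r*
    inner: Regex


def show(r: Regex) -> str:
    """Render a tree back into the notation of the grammar."""
    if isinstance(r, Empty):
        return "∅"
    if isinstance(r, Epsilon):
        return "ε"
    if isinstance(r, Char):
        return r.c
    if isinstance(r, Concat):
        return show(r.left) + show(r.right)
    if isinstance(r, Alt):
        return "(" + show(r.left) + "|" + show(r.right) + ")"
    if isinstance(r, Star):
        inner = show(r.inner)
        # Assumption: parenthesise a starred concatenation so that the
        # printed form is unambiguous; the grammar itself does not say this.
        if isinstance(r.inner, Concat):
            inner = "(" + inner + ")"
        return inner + "*"
    raise TypeError("not a regular expression")


# An "a" followed by any number of "a"s and "b"s.
example = Concat(Char("a"), Star(Alt(Char("a"), Char("b"))))
print(show(example))  # a(a|b)*
```

Each constructor corresponds to exactly one alternative of the grammar, so a function over regular expressions (such as `show`) needs exactly six cases.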

∅ matches no strings. *L*(∅) = ∅.

*ε* matches only the empty string. *L*(*ε*) = {*ε*}.

*a* (where *a* is any single character of the alphabet) matches only the one-character string "a". *L*(*a*) = {*a*}.

*r*_{1}*r*_{2} (the **concatenation** of *r*_{1} and *r*_{2}) matches any string that can be broken into two parts *x* and *y*, with *x* matching *r*_{1} and *y* matching *r*_{2}. *L*(*r*_{1}*r*_{2}) = {*x**y* ∣ *x* ∈ *L*(*r*_{1}), *y* ∈ *L*(*r*_{2})}.

(*r*_{1}∣*r*_{2}) (the **alternation** of *r*_{1} and *r*_{2}, sometimes written *r*_{1} + *r*_{2} or *r*_{1} ∪ *r*_{2}) matches any string that matches either *r*_{1} or *r*_{2}. Formally, *L*(*r*_{1}∣*r*_{2}) = *L*(*r*_{1}) ∪ *L*(*r*_{2}).

*r*^{ * } (the **Kleene star** or **Kleene closure** of *r*) matches the concatenation of any number (including 0) of strings, each of which matches *r*. Formally, *L*(*r*^{ * }) = {*x*_{1}*x*_{2}⋯*x*_{n} ∣ *n* ≥ 0 and each *x*_{i} ∈ *L*(*r*)}. Taking *n* = 0 gives the empty string, so *ε* ∈ *L*(*r*^{ * }) for every *r*.
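The set definitions above can be tried out directly on small finite languages. The sketch below represents a language as a Python set of strings. Because *L*(*r*^{ * }) is usually infinite, the star is computed only up to a length bound. The function names `concat`, `union`, and `star` and the parameter `max_len` are assumptions made for this illustration, not part of these notes.

```python
def concat(l1, l2):
    """L(r1 r2): every string xy with x in l1 and y in l2."""
    return {x + y for x in l1 for y in l2}


def union(l1, l2):
    """L(r1 | r2): the strings that are in l1 or in l2."""
    return l1 | l2


def star(l, max_len):
    """The strings of L(r*) that have length at most max_len.

    Assumption: l is a finite set; the bound is needed because the full
    Kleene closure is infinite whenever l contains a non-empty string.
    """
    result = {""}      # n = 0: the empty concatenation
    frontier = {""}    # strings found in the previous round
    while frontier:
        # Append one more string from l, keeping only new, short-enough results.
        longer = {w for w in concat(frontier, l) if len(w) <= max_len}
        frontier = longer - result
        result |= frontier
    return result


a_or_b = union({"a"}, {"b"})   # L(a | b) = {"a", "b"}
print(sorted(star(a_or_b, 2)))
print(star(set(), 3))          # the star of the empty language
```

The last line illustrates a corner case of the definition: *L*(∅) contains no strings, yet its Kleene star still contains the empty string, because a concatenation of zero strings is always allowed.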