%%% This is the scribe notes template for CS611
%%% There are several comments preceded by CS611: and boxed in %%%%'s 
%%% which indicate where macros should be altered to set up the header
%%% for the paper.  Your Notes should go at the comment SCRIBE NOTES GO HERE!.

%%% In the various .sty files that accompany this .tex file you will    
%%% find LaTeX macros that make it easier to typeset inference rules    
%%% and programming language constructs.  You must make sure that the   
%%% file proof.sty is in a path searched by LaTeX when you try to       
%%% use this file.  Take a look to see what macros are available--it    
%%% will save you time and make the notes look better.  Feel free to    
%%% extend the set of macros--post them to the newsgroup and contact    
%%% the course staff if you come up with some good ones so they can be  
%%% added to the template.                                              

%%% This template includes examples of how to use some of the macros
%%% to give you an idea of how they work.  (Delete the examples when
%%% you do your scribing.)

\documentclass{article} \usepackage{611-lecture}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% CS611: Please fill in these macros as appropriate:
\lecture{6}                  		%% Lecture number
\title{Well Founded Induction}   	%% Title of lecture
%\author{James Worthington, Soam Vasani} %% name of scribe
\date{11 September 2006}     		%% Date of lecture, e.g., 1 January 2001
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% See 611.sty for a variety of macros that will be helpful in
% typesetting the lecture. Here are a few of particular interest:
%
% "x"	 	x in keyword font (e.g., "if", "#t")
% _x_	 	x in italics
% \nm{n}   	n in slanted font (used for abbreviations)
% <e> 	 	e in angle brackets
% \lt 	 	less-than sign
% \gt 	 	greater-than sign
% \SB{x}	x in semantic brackets
% \Tr x{y} 	x[[y]] with x in calligraphic font
%          	(if x is more than a single character, use \Tr{x}{y})

\begin{document}

\maketitle

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% CS611: SCRIBE NOTES GO HERE!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section {Summary}

In this lecture we will

\begin{itemize}
\item define induction on a well-founded relation;
\item illustrate the definition with some examples, including the inductive definition of free variables $FV(e)$;
\item take another look at inference rules.
\end{itemize}

\section {Introduction}

Recall that some of the substitution rules mentioned the function $FV:\{\mbox{$\lambda$-terms}\}\rightarrow 2^{\mathbf{Var}}$, which maps each $\lambda$-term to its set of free variables.
\[
\begin{array}{rcll}
(\lambda y.e_0)\{e_1/x\} &=& \lambda y. e_0\{e_1/x\}, & \mbox{where $y \neq x$ and $y \notin FV(e_1)$},\\
(\lambda y.e_0)\{e_1/x\} &=& \lambda z.e_0\{z/y\} \{e_1/x\}, & \mbox{where $z \neq x$, $z \notin FV(e_0)$, and $z \notin FV(e_1)$}.
\end{array}
\]

\noindent
Let's examine the definition of the free variable function \nm{FV}.
\begin{eqnarray*}
\nm{FV}(x) &=& \{x\}\\
\nm{FV}(e_1~e_2) &=& \nm{FV}(e_1) \mathrel\cup \nm{FV}(e_2)\\
\nm{FV}(\lam xe) &=& \nm{FV}(e) - \{x\}.
\end{eqnarray*}
Why does this definition uniquely determine the function \nm{FV}?  There are two issues here:
\begin{itemize}
\item
Existence: whether \nm{FV} is defined on all $\lambda$-terms;
\item
Uniqueness: whether at most one function satisfies the definition.
\end{itemize}
Of relevance here is the fact that there are three clauses in the definition of $\nm{FV}$ corresponding to the three clauses in the definition of $\lambda$-terms, and that a $\lambda$-term can be formed in one and only one way by one of these three clauses.  Note also that although the symbol \nm{FV} occurs on the right-hand side in two of these three clauses, it is applied there only to proper (\emph{proper} = strictly smaller) subterms.
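
\noindent
As a small illustration of how the clauses unwind, consider the term $\lambda x.\,x~y$, chosen here only as an example, with $x$ and $y$ assumed to be distinct variables.  Each step below applies exactly one clause, and every recursive use of \nm{FV} is on a strictly smaller term:
\begin{eqnarray*}
\nm{FV}(\lambda x.\,x~y) &=& \nm{FV}(x~y) - \{x\}\\
&=& (\nm{FV}(x) \mathrel\cup \nm{FV}(y)) - \{x\}\\
&=& (\{x\} \mathrel\cup \{y\}) - \{x\}\\
&=& \{y\}.
\end{eqnarray*}
At no point is there a choice of which clause to use, and the computation stops once only variables remain.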

The idea underlying this definition is called \emph{structural induction}.  This is an instance of a general induction principle called \emph{induction on a well-founded relation}.

\section {Well-Founded Relations}

A binary relation $\prec$ is said to be \emph{well-founded} if it has no infinite descending chains. An \emph{infinite descending chain} is an infinite sequence of elements $a_0,a_1,a_2,\ldots$ such that $a_{i+1} \prec a_i$ for all $i\geq 0$.  Note that a well-founded relation cannot be reflexive: if $a\prec a$, then $a,a,a,\ldots$ is an infinite descending chain.

Here are some examples of well-founded relations:
\begin{itemize}
\item the successor relation $\{(m,m+1)\mid m\in\mathbb{N}\}$ on $\mathbb{N}$;
\item the less-than relation $\lt$ on $\mathbb{N}$;
\item the element-of relation $\in$ on sets.  The axiom of foundation (or axiom of regularity) of Zermelo--Fraenkel (ZF) set theory asserts exactly that $\in$ is well-founded.  Among other things, this prevents a set from being a member of itself;
\item the proper subset relation $\subset$ on the set of finite subsets of $\mathbb{N}$.
\end{itemize}
The following are not well-founded relations:
\begin{itemize}
\item the predecessor relation $\{(m+1,m)\mid m\in\mathbb{N}\}$ on $\mathbb{N}$ ($0,1,2,\ldots$ is an infinite \emph{descending} chain!);
\item the greater-than relation $\gt$ on $\mathbb{N}$;
\item the less-than relation $\lt$ on $\mathbb{Z}$ ($0,-1,-2,\ldots$ is an infinite descending chain);
\item the less-than relation $\lt$ on the real interval $[0,1]$ ($1,\frac 12,\frac 13,\frac 14,\ldots$ is an infinite descending chain);
\item the proper subset relation $\subset$ on subsets of $\mathbb{N}$ ($\mathbb{N},\mathbb{N}-\{0\},\mathbb{N}-\{0,1\},\ldots$ is an infinite descending chain).
\end{itemize}

\section{Well-Founded Induction}

Let $\prec$ be a well-founded binary relation on a set $A$.  Abstractly, a \emph{property} is just a map $P:A\rightarrow\{\mathit{true},\mathit{false}\}$, or equivalently, a subset $P\subseteq A$ (the set of all elements of $A$ for which the property is true).

The principle of well-founded induction on the relation $\prec$ says that in order to prove that a property $P$ holds for all elements of $A$, it suffices to prove that $P$ holds of any $a\in A$ whenever $P$ holds for all $b\prec a$.  In other words,
\begin{eqnarray}
\bigl(\forall a\in A\ (\forall b\in A\ b\prec a\Rightarrow P(b)) \Rightarrow P(a)\bigr) &\ \Rightarrow\ & \forall a\in A\ P(a).\label{eqn:induction}
\end{eqnarray}
Expressed as a proof rule,
\begin{equation}
\frac{\forall a\in A\ (\forall b\in A\ b\prec a\Rightarrow P(b)) \Rightarrow P(a)}{\forall a\in A\ P(a)} .\label{eqn:induction2}
\end{equation}
The basis of the induction is the case when $a$ has no $\prec$-predecessors; in that case, the statement $\forall b\in A\ b\prec a\Rightarrow P(b)$ is vacuously true.

For the well-founded relation $\{(m,m+1)\mid m\in\mathbb{N}\}$, (\ref{eqn:induction}) and (\ref{eqn:induction2}) reduce to the familiar notion of mathematical induction on $\mathbb{N}$: to prove $\forall n\ P(n)$, it suffices to prove that $P(0)$ and that $P(n+1)$ whenever $P(n)$.  

For the well-founded relation $\lt$ on $\mathbb{N}$, (\ref{eqn:induction}) and (\ref{eqn:induction2}) reduce to \emph{strong} induction on $\mathbb{N}$: to prove $\forall n\ P(n)$, it suffices to prove that $P(n)$ whenever $P(0),P(1),\ldots,P(n-1)$.  When $n = 0$, the induction hypothesis is vacuously true.
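
To spell out the first of these reductions as a sketch: for the successor relation, $b\prec a$ holds exactly when $a = b+1$, so $0$ has no predecessors and $n+1$ has the single predecessor $n$.  Splitting the premise of (\ref{eqn:induction2}) by cases on $a$ gives
\[
\begin{array}{lll}
a = 0: & (\forall b\in\mathbb{N}\ b\prec 0\Rightarrow P(b)) \Rightarrow P(0) & \mbox{is equivalent to}\ \ P(0),\\
a = n+1: & (\forall b\in\mathbb{N}\ b\prec n+1\Rightarrow P(b)) \Rightarrow P(n+1) & \mbox{is equivalent to}\ \ P(n)\Rightarrow P(n+1).
\end{array}
\]
The first line is the basis and the second is the induction step of ordinary mathematical induction.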

\subsection{Equivalence of Well-Foundedness and the Validity of Induction}

In fact, one can show that the induction principle (\ref{eqn:induction})--(\ref{eqn:induction2}) is valid for a binary relation $\prec$ on $A$ if and only if $\prec$ is well-founded.

To show that well-foundedness implies the validity of the induction principle, suppose the induction principle is not valid.  Then there exists a property $P$ for which the premise of (\ref{eqn:induction2}) holds but not the conclusion.  Thus $P$ is false for some element $a_0\in A$.  The premise of (\ref{eqn:induction2}) is equivalent to
\[
\forall a\in A\ \neg P(a) \Rightarrow \exists b\in A\ b\prec a\wedge \neg P(b);
\]
this implies that there exists an $a_1\prec a_0$ such that $P$ is false for $a_1$.  Continuing in this fashion, using the axiom of choice one can construct an infinite descending chain $a_0,a_1,a_2,\ldots$ for which $P$ is false, so $\prec$ is not well-founded.

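Conversely, suppose that there is an infinite descending chain $a_0,a_1,a_2,\ldots$.  Then the property ``$a\notin\{a_0,a_1,a_2,\ldots\}$'' violates (\ref{eqn:induction2}).  The conclusion fails because the property is false of $a_0$.  The premise holds because the property is true of every $a$ outside the chain, while every $a_i$ in the chain has the predecessor $a_{i+1}\prec a_i$ of which the property is false, so the induction hypothesis fails at $a_i$ and the implication holds vacuously.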
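
As a concrete sketch of this direction, take the relation $\lt$ on $\mathbb{Z}$ with the chain $0,-1,-2,\ldots$ from the list of examples above.  The property in question is then
\[
P(a) \ \Leftrightarrow\ a\notin\{0,-1,-2,\ldots\} \ \Leftrightarrow\ a \gt 0.
\]
For every $a\in\mathbb{Z}$ there is some $b \lt a$ with $b\leq 0$ (for instance $b = a-1$ when $a\leq 0$, and $b = 0$ when $a \gt 0$), so the hypothesis $\forall b\in\mathbb{Z}\ b \lt a\Rightarrow P(b)$ is false for every $a$.  The premise of (\ref{eqn:induction2}) therefore holds vacuously, yet the conclusion $\forall a\in\mathbb{Z}\ P(a)$ fails at $a = 0$.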

\section{Structural Induction}

Now let's define a well-founded relation on the set of all $\lambda$-terms.  Define $e \lt e'$ if $e$ is a \emph{proper} subterm of $e'$.  A $\lambda$-term $e$ is a \emph{proper} (or \emph{strict}) subterm of $e'$ if it is a subterm of $e'$ and if $e\neq e'$.  If we think of $\lambda$-terms as syntax trees, then $e'$ is a tree that has $e$ as a subtree.  Since these trees are finite, the relation is well-founded.  Induction on this relation is called \emph{structural induction}.

We can now show that $\nm{FV}(e)$ exists and is uniquely defined for any $\lambda$-term $e$.  In the grammar for $\lambda$-terms, for any $e$, exactly one case in the definition of $\nm{FV}$ applies to $e$, and all references in the definition of $\nm{FV}$ are to subterms, which are strictly smaller.  The function $\nm{FV}$ exists and is uniquely defined for the base case of the smallest $\lambda$-terms $x\in\nm{Var}$.  So $\nm{FV}(e)$ exists and is uniquely defined for any $\lambda$-term $e$ by induction on the well-founded subexpression relation.
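
The same relation supports proofs as well as definitions.  As a sketch, with the claim chosen here only for illustration, let $P(e)$ be the property ``$\nm{FV}(e)$ is a finite set''.  Assume $P(e')$ for every proper subterm $e'$ of $e$ and consider the three possible forms of $e$:
\begin{itemize}
\item if $e = x$, then $\nm{FV}(e) = \{x\}$, which is finite;
\item if $e = e_1~e_2$, then $\nm{FV}(e) = \nm{FV}(e_1) \mathrel\cup \nm{FV}(e_2)$ is a union of two sets that are finite by the induction hypothesis;
\item if $e = \lambda x.e_0$, then $\nm{FV}(e) = \nm{FV}(e_0) - \{x\}$ is a subset of a set that is finite by the induction hypothesis.
\end{itemize}
In each case $P(e)$ follows, so $P(e)$ holds for all $\lambda$-terms $e$ by well-founded induction on the proper subterm relation.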

We often have a set of expressions in a language built from a set of \emph{constructors} starting from a set of \emph{generators}.  For example, in the case of $\lambda$-terms, the generators are the variables $x\in\nm{Var}$ and the constructors are the application operator $\cdot$ and the abstraction operators $\lambda x$.  The set of expressions defined by the generators and constructors is the smallest set containing the generators and closed under the constructors.  

If a function is defined on expressions in such a way that
\begin{itemize}
\item there is exactly one clause in the definition for every generator or constructor pattern,
\item the right-hand sides refer to the value of the function only on proper subexpressions,
\end{itemize}
then the function is well-defined and unique.
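
For instance, the following clauses satisfy both conditions and so should determine a unique function from $\lambda$-terms to $\mathbb{N}$.  The name \nm{size} is hypothetical and is introduced here only for illustration:
\begin{eqnarray*}
\nm{size}(x) &=& 1\\
\nm{size}(e_1~e_2) &=& 1 + \nm{size}(e_1) + \nm{size}(e_2)\\
\nm{size}(\lambda x.e) &=& 1 + \nm{size}(e).
\end{eqnarray*}
By contrast, a hypothetical clause such as $\nm{bad}(\lambda x.e) = 1 + \nm{bad}(\lambda x.\lambda x.e)$ refers to the function on a larger term, so the second condition fails and the argument above gives no guarantee that such a function exists.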

\section {Inference rules}

We defined small-step and big-step semantics using inference rules.  These rules are another kind of inductive definition.  To prove properties of them, we would like to use well-founded induction.

To do this, we can change our view and look at reduction as a binary relation.  To say that $<c,\sigma> \stepsone <c',\sigma'>$ according to the small-step SOS rules just means that $(<c,\sigma>,\;<c',\sigma'>)$ is a member of some reduction relation, which is a subset of $(\nm{Com} \times \Sigma) \times (\nm{Com} \times \Sigma)$.  In fact, not only is it a relation, it is a partial function.

Here is an example of the kind of rule we have been looking at so far.
\begin{equation}
\infer[(|a_1| \gt 0)]{a_1 + a_2\stepsone a'_1 + a_2}{a_1\stepsone a'_1}\label{eqn:examplerule1}
\end{equation}
Here $a_1,a_2$, and $a_1'$ are _metavariables_.  Everything above the line is part of the _premise_, and everything below the line is the _conclusion_.  The expression on the right side is a _side condition_.

A \emph{rule instance} is a substitution for all the metavariables such that the side condition is satisfied.  For example, here is an instance of the above rule:
\[
\infer[(|3*4| \gt 0)]{(3 * 4 + 1)\stepsone(12 + 1)}{3 * 4\stepsone 12}
\]
where the substitutions are $a_1 = 3*4$, $a'_1 = 12$, $a_2 = 1$.

Another valid instance of the rule is
\[
\infer[(|3*4| \gt 0)]{(3 * 4 + 1)\stepsone(11 + 1)}{3 * 4\stepsone 11}
\]
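where the substitutions are $a_1 = 3*4$, $a'_1 = 11$, $a_2 = 1$.  This is a valid rule instance even though one would not expect $3*4\stepsone 11$ to be derivable: a rule instance only requires a substitution for the metavariables that satisfies the side condition, not that the premises actually hold.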

With rules like (\ref{eqn:examplerule1}), we are usually trying to define some set or relation.  For example, this rule might be part of the definition of some reduction relation $\stepsone$ that is a subset of $\nm{AExp} \times \nm{AExp}$.  Such rules are typically of the form
\begin{equation}
\infer[(\phi)]{X}{X_1\ X_2\ \ldots\ X_n}\label{eqn:examplerule2}
\end{equation}
where $X_1, X_2, \ldots, X_n$ represent elements that are already members of the set or relation being defined, $X$ represents a new member of the relation added by this rule, and $\phi$ is a collection of side conditions that must hold in order for the rule to be applied.

The difference between a premise and a side condition is that the side condition is not part of the relation that the rule is trying to define, while the premises are.  The side condition is some restriction that determines when an instance of the rule may be applied.

Now suppose we have written down a set of rules in an attempt to define a set $A$.  How do we know whether $A$ is well-defined?  Certainly we would like to have $X\in A$ whenever $X_1,X_2,\ldots,X_n\in A$ and
\[
\infer{X}{X_1\ X_2\ \ldots\ X_n}
\]
is a rule instance, but this is hardly a definition of $A$.  What do we put in $A$ to start with?

One approach is to find a well-founded relation such that the rules constitute an inductively defined function, as described above.  Define a _rule operator_ $R$ on sets as follows.  Given a set $B$, let
\begin{eqnarray*}
R(B) &\definedas& \{ X \mid \{X_1,X_2,\ldots,X_n\} \subseteq B \mbox{ and } \frac{X_1~X_2~\ldots~X_n}{X} \mbox{  is a rule instance}\}
\end{eqnarray*}
Then
\begin{itemize}
\item $R(B)$ is the set of members of $A$ that can be inferred from the members of $B$ by a single rule instance;
\item $R(\emptyset)$ is the set of members that can be inferred from nothing;
\item $R(R(\emptyset))$ is the set of members that can be inferred from $R(\emptyset)$; the elements of $R(\emptyset)$ are in this set because they are inferred from the empty set, which is a subset of $R(\emptyset)$.
\end{itemize}
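
As a sketch of how $R$ behaves, consider a hypothetical pair of rules over $\mathbb{N}$ (not taken from the lecture), one with no premises and one with a single premise:
\[
\frac{~}{0} \qquad\qquad \frac{n}{n+2}
\]
Every $n\in\mathbb{N}$ gives an instance of the second rule.  Applying the definition of $R$ repeatedly, starting from the empty set, gives
\begin{eqnarray*}
R(\emptyset) &=& \{0\}\\
R(R(\emptyset)) &=& \{0,2\}\\
R(R(R(\emptyset))) &=& \{0,2,4\}.
\end{eqnarray*}
The element $0$ reappears at every stage because the rule with no premises applies to any set $B$, and each stage adds the conclusions of the second rule whose premise was obtained at the previous stage.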

Next time we will use the operator $R$ to define $A$ precisely.

\end{document}
