\documentclass{article} \usepackage{611-lecture}
\usepackage{amsthm,amsmath,amssymb}

\renewcommand\emptyset\varnothing

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% CS611: Please fill in these macros as appropriate:
\lecture{7}                  %% Lecture number
\title{Inductive Definitions and the Knaster--Tarski Theorem}   %% Title of lecture
%\author{Yisong Yue, Yunsong Guo}       %% name of scribe
\date{13 September 2006}     %% Date of lecture, e.g., 1 January 2001
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% See 611.sty for a variety of macros that will be helpful in
% typesetting the lecture. Here are a few of particular interest:
%
% "x"	 	x in keyword font (e.g., "if", "#t")
% _x_	 	x in italics
% \nm{n}   	n in slanted font (used for abbreviations)
%          	(if x is more than a single character, use \Tr{x}{y})

\newcommand\powerset[1]{2^{#1}}

\begin{document}

\maketitle

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% CS611: SCRIBE NOTES GO HERE!
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Set Operators}

Recall from last time that a \emph{rule instance} is of the form
\begin{equation}
\frac{X_1\ X_2\ \ldots\ X_n}{X},\label{eqn:rule}
\end{equation}
where $X$ and the $X_i$ are members of some set $S$.  The \emph{premises} are $X_1,X_2,\ldots,X_n$ and the \emph{conclusion} is $X$.  We use such rules to specify an inductively defined subset $A\subseteq S$; intuitively, the rule (\ref{eqn:rule}) says that if you have $X_1,\ldots,X_n\in A$, then you must also take $X\in A$.

Formally, given a set of rules and a subset $B\subseteq S$, define
\begin{eqnarray}
R(B) &\definedas& \{X \mid \mbox{$\{X_1,X_2,\ldots,X_n\}\subseteq B$ and $\displaystyle\frac{X_1\ X_2\ \ldots\ X_n}X$ is a rule instance}\}.\label{eqn:rulebased}
\end{eqnarray}
Then $R$ is a function mapping subsets of $S$ to subsets of $S$; that is, $R:\powerset S\rightarrow\powerset S$, where $\powerset S$ denotes the \emph{powerset} (set of all subsets) of $S$.  An important property of $R$ is that it is \emph{monotone}: if $B\subseteq C$, then $R(B)\subseteq R(C)$.
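
As a concrete instance (an illustrative example, not one from the lecture), let $S=\mathbb N$ and take as rule instances the axiom $\displaystyle\frac{}{0}$ together with $\displaystyle\frac{n}{n+2}$ for each $n\in\mathbb N$.  For this rule set, (\ref{eqn:rulebased}) specializes to
\begin{eqnarray*}
R(B) &=& \{0\} \cup \{n+2 \mid n\in B\},
\end{eqnarray*}
since the axiom has no premises and therefore contributes $0$ to $R(B)$ for every $B$.  For example,
\begin{eqnarray*}
R(\emptyset) &=& \{0\},\\
R(\{0\}) &=& \{0,2\},\\
R(\{0,1\}) &=& \{0,2,3\}.
\end{eqnarray*}
Monotonicity is visible here: $\{0\}\subseteq\{0,1\}$ and $R(\{0\})=\{0,2\}\subseteq\{0,2,3\}=R(\{0,1\})$.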

What set $A\subseteq S$ is defined by the rules?  At the very least, we would like $A$ to satisfy the following two properties:
\begin{itemize}
\item $A$ is \emph{$R$-consistent}: $A\subseteq R(A)$.  We would like this to hold because we would like every element of $A$ to be included in $A$ only as a result of applying a rule.
\item $A$ is \emph{$R$-closed}: $R(A)\subseteq A$.  We would like this to hold because we would like every element that the rules say should be in $A$ to actually be in $A$.
\end{itemize}
These two properties together say that $A=R(A)$, or in other words, $A$ should be a fixpoint of $R$.  There are two natural questions to ask:
\begin{itemize}
\item
Does $R$ actually have a fixpoint?
\item
Is the fixpoint unique?  If not, which one should we take?
\end{itemize}

\section{Least Fixpoints}

In fact, every monotone set operator $R$ has at least one fixpoint, and it may have many.  Among all its fixpoints there is a least one with respect to set inclusion $\subseteq$; that is, a fixpoint that is a subset of every fixpoint of $R$.  We call this the \emph{least fixpoint} of $R$, and we take $A$ to be this set.
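
To see that fixpoints need not be unique, consider a deliberately tiny example of our own: let $S=\{a\}$ with the single rule instance $\displaystyle\frac{a}{a}$.  Then
\begin{eqnarray*}
R(\emptyset) &=& \emptyset,\\
R(\{a\}) &=& \{a\},
\end{eqnarray*}
so both $\emptyset$ and $\{a\}$ are fixpoints of $R$.  The set $\{a\}$ is both $R$-consistent and $R$-closed, yet the only justification for $a$ is $a$ itself.  The least fixpoint $\emptyset$ is the set we want here, since $a$ has no derivation that starts from axioms.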

The least fixpoint of $R$ can be defined in two different ways, ``from below'' and ``from above'':
\begin{eqnarray}
A_* &\definedas& \bigcup\ \{R^n(\emptyset) \mid n\geq 0\}\ \ =\ \ R(\emptyset) \cup R(R(\emptyset)) \cup R(R(R(\emptyset)))\cup \cdots\label{eqn:KT1}\\
A^* &\definedas& \bigcap\ \{B\subseteq S \mid R(B)\subseteq B\}.\label{eqn:KT2}
\end{eqnarray}
The set $A_*$ is the union of all sets of the form $R^n(\emptyset)$, the sets obtained by applying $R$ some finite number of times to the empty set.  It consists of all elements of $S$ that are in $R^n(\emptyset)$ for some $n\geq 0$.

The set $A^*$ is the intersection of all the $R$-closed subsets of $S$.  It consists of all elements of $S$ that are in every $R$-closed set.

We will show that $A_* = A^*$ and that this set is the least fixpoint of $R$, so we will take $A\definedas A_*=A^*$.
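
As a sanity check, here is a worked example (ours, chosen only for illustration) in which both constructions can be computed by hand.  Let $S=\mathbb N$ with the axiom $\displaystyle\frac{}{1}$ and the rule instances $\displaystyle\frac{n}{2n}$ for each $n\in\mathbb N$, so that $R(B)=\{1\}\cup\{2n\mid n\in B\}$.  From below,
\begin{eqnarray*}
R^0(\emptyset) &=& \emptyset,\\
R^1(\emptyset) &=& \{1\},\\
R^2(\emptyset) &=& \{1,2\},\\
R^3(\emptyset) &=& \{1,2,4\},
\end{eqnarray*}
and in general $R^n(\emptyset)=\{2^k\mid 0\leq k<n\}$, so $A_*=\{2^k\mid k\geq 0\}$.  From above, a set $B$ is $R$-closed exactly when $1\in B$ and $2n\in B$ whenever $n\in B$.  Examples of $R$-closed sets are $\mathbb N$ itself and $\{2^k\mid k\geq 0\}\cup\{3\cdot 2^k\mid k\geq 0\}$.  Every $R$-closed set contains all powers of $2$, and the set of powers of $2$ is itself $R$-closed, so $A^*=\{2^k\mid k\geq 0\}=A_*$.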

\section{The Knaster--Tarski Theorem}

The fact that $A_* = A^*$ is a special case of a more general theorem called the \emph{Knaster--Tarski theorem}.  It states that any monotone set operator $R$ has a unique least fixpoint, and that this fixpoint can be obtained either ``from below'' by iteratively applying $R$ to the empty set, or ``from above'' by taking the intersection of all $R$-closed sets.

For general monotone operators $R$, the ``from below'' construction may require iteration through transfinite ordinals.  However, the operators $R$ defined from rule systems as described above are \emph{chain-continuous} (definition below).  This is a stronger property than monotonicity.  It guarantees that the ``from below'' construction converges to a fixpoint after only $\omega$ steps, where $\omega$ is the first transfinite ordinal.
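
To see why monotonicity alone does not suffice, consider the following operator (an illustrative example of our own, not one arising from a rule system of the form (\ref{eqn:rule})).  Let $S=\mathbb N\cup\{\infty\}$ and define
\begin{eqnarray*}
R(B) &\definedas& \{0\} \cup \{n+1 \mid n\in B\cap\mathbb N\} \cup \{\infty \mid \mathbb N\subseteq B\},
\end{eqnarray*}
where the last set is $\{\infty\}$ if $\mathbb N\subseteq B$ and $\emptyset$ otherwise.  This $R$ is monotone, and $R^n(\emptyset)=\{k\in\mathbb N\mid k<n\}$, so
\begin{eqnarray*}
\bigcup\ \{R^n(\emptyset) \mid n\geq 0\} &=& \mathbb N.
\end{eqnarray*}
But $R(\mathbb N)=\mathbb N\cup\{\infty\}\neq\mathbb N$, so the union of the finite iterates is not a fixpoint; one further application of $R$ is needed to reach the least fixpoint $\mathbb N\cup\{\infty\}$.  Intuitively, $R$ behaves like a rule with the infinitely many premises $0,1,2,\ldots$ and conclusion $\infty$, which the finite-premise format (\ref{eqn:rule}) does not allow.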

\subsection{Monotone, Continuous, and Finitary Operators}

A set operator $R:\powerset S\rightarrow\powerset S$ is said to be
\begin{itemize}
\item \emph{monotone} if $B\subseteq C$ implies $R(B)\subseteq R(C)$;
\item \emph{chain-continuous} if for any chain of sets $\mathcal C$,
\begin{eqnarray*}
R(\,\bigcup\,\mathcal C) &=& \bigcup\ \{R(B) \mid B\in\mathcal C\}
\end{eqnarray*}
(a \emph{chain} is a set $\mathcal C$ of subsets of $S$ that is linearly ordered by the set inclusion relation $\subseteq$; that is, for all $B,C\in\mathcal C$, either $B\subseteq C$ or $C\subseteq B$);
\item \emph{finitary} if for any set $C$, the value of $R(C)$ is determined by the values of $R(B)$ for finite subsets $B\subseteq C$ in the following sense:
\begin{eqnarray*}
R(C) &=& \bigcup\ \{R(B) \mid \mbox{$B\subseteq C$, $B$ finite}\}.
\end{eqnarray*}
\end{itemize}
One can show that
\begin{enumerate}
\item
every rule-based operator of the form (\ref{eqn:rulebased}) is finitary;
\item
every finitary operator is chain-continuous (in fact, the converse holds as well);
\item
every chain-continuous operator is monotone.
\end{enumerate}
The proofs of 1, 2, and 3 are fairly straightforward and we will leave them as exercises.  The converse of 2 requires transfinite induction and is more difficult.
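
For orientation, here is one possible sketch of items 1 and 3; it is offered as a hint rather than as the intended solution.

For item 1, let $R$ be defined by (\ref{eqn:rulebased}).  If $B\subseteq C$, then $R(B)\subseteq R(C)$ directly from the definition, so $\bigcup\ \{R(B) \mid \mbox{$B\subseteq C$, $B$ finite}\}\subseteq R(C)$.  Conversely, if $X\in R(C)$, then there is a rule instance with conclusion $X$ whose premises $X_1,\ldots,X_n$ all lie in $C$.  The set $B=\{X_1,\ldots,X_n\}$ is a finite subset of $C$, and $X\in R(B)$.

For item 3, suppose $R$ is chain-continuous and $B\subseteq C$.  Then $\{B,C\}$ is a chain whose union is $C$, so
\begin{eqnarray*}
R(C) &=& R(\,\bigcup\,\{B,C\})\ \ =\ \ R(B)\cup R(C),
\end{eqnarray*}
and therefore $R(B)\subseteq R(C)$.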

\subsection{Proof of the Knaster--Tarski Theorem for Chain-Continuous Operators}

Let us prove the Knaster--Tarski theorem in the special case of chain-continuous operators.  This special case is all we need to handle rule-based inductive definitions, and it allows us to avoid introducing transfinite ordinals (not that they are not worth introducing!).

\medskip

\noindent
\textbf{Theorem (Knaster--Tarski)}\quad
Let $R:\powerset S\rightarrow\powerset S$ be a chain-continuous set operator, and let $A_*$ and $A^*$ be defined as in (\ref{eqn:KT1}) and (\ref{eqn:KT2}), respectively.  Then $A_*=A^*$, and this set is the $\subseteq$-least fixpoint of $R$.
\begin{proof}
The theorem follows from two observations:
\begin{enumerate}
\def\labelenumi{(\roman{enumi})}
\item
For every $n$ and every $R$-closed set $B$, $R^n(\emptyset)\subseteq B$.  This can be proved by induction on $n$.  It follows that $A_*\subseteq A^*$.
\item
$A_*$ is a fixpoint of $R$, thus is $R$-closed.  Since $A^*$ is contained in all $R$-closed sets, $A^*\subseteq A_*$.
\end{enumerate}

For (i), let $B$ be an $R$-closed set.  We proceed by induction on $n$.  The basis for $n=0$ is $\emptyset\subseteq B$, which is trivially true.  Now suppose $R^n(\emptyset)\subseteq B$.  We have
\begin{eqnarray*}
R^{n+1}(\emptyset) &=& R(R^n(\emptyset))\\
&\subseteq& R(B) \quad\mbox{by the induction hypothesis and monotonicity}\\
&\subseteq& B \quad\mbox{since $B$ is $R$-closed}.
\end{eqnarray*}
We conclude that for all $n$ and all $R$-closed sets $B$, $R^n(\emptyset)\subseteq B$, therefore
\begin{eqnarray*}
A_* &=& \bigcup\ \{R^n(\emptyset) \mid n\geq 0\}\ \ \subseteq\ \ \bigcap\ \{B\subseteq S \mid R(B)\subseteq B\}\ \ =\ \ A^*.
\end{eqnarray*}

For (ii), we want to show that $R(A_*)=A_*$.  It can be proved by induction on $n$ that the sets $R^n(\emptyset)$ form a chain:
\begin{eqnarray*}
\emptyset &\subseteq& R(\emptyset)\ \ \subseteq\ \ R^2(\emptyset)\ \ \subseteq\ \ R^3(\emptyset)\ \ \subseteq\ \ \cdots
\end{eqnarray*}
We have $\emptyset\subseteq R(\emptyset)$ trivially, and by monotonicity, if $R^n(\emptyset)\subseteq R^{n+1}(\emptyset)$, then \begin{eqnarray*}
R^{n+1}(\emptyset) &=& R(R^{n}(\emptyset))\ \ \subseteq\ \ R(R^{n+1}(\emptyset))\ \ =\ \ R^{n+2}(\emptyset).
\end{eqnarray*}
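Now by chain-continuity,
\begin{eqnarray*}
R(A_*) &=& R(\bigcup_{n\geq 0}\ R^n(\emptyset))\ \ =\ \ \bigcup_{n\geq 0}\ R(R^n(\emptyset))\ \ =\ \ \bigcup_{n\geq 0}\ R^{n+1}(\emptyset)\ \ =\ \ A_*,
\end{eqnarray*}
where the last step uses $R^0(\emptyset)=\emptyset$, so omitting this term does not change the union.

Finally, $A_*=A^*$ is the least fixpoint of $R$: any fixpoint $B$ of $R$ is in particular $R$-closed, so $A^*\subseteq B$ by the definition of $A^*$.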
\end{proof}

\section{Rule Induction}

Let us use our newfound wisdom on well-founded induction and least fixpoints of monotone maps to prove some properties of the reduction rules.

\medskip

\noindent
\textbf{Theorem}\quad If $e\rightarrow e'$ under the CBV reduction rules, then $\nm{FV}(e')\subseteq\nm{FV}(e)$.  In other words, CBV reductions cannot introduce any new free variables.
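
\medskip

\noindent
For example (an illustration of our own), take $e=(\lam x{x~y})(\lam ww)$.  By the third CBV rule, $e\rightarrow e'$ where $e'=(\lam ww)~y$, and
\begin{eqnarray*}
\nm{FV}(e) &=& (\{x,y\}-\{x\})\cup\emptyset\ \ =\ \ \{y\},\\
\nm{FV}(e') &=& \emptyset\cup\{y\}\ \ =\ \ \{y\},
\end{eqnarray*}
so $\nm{FV}(e')\subseteq\nm{FV}(e)$ as claimed.  If we allow values with free variables, the inclusion can be strict: $(\lam xz)(\lam wu)\rightarrow z$, where the left-hand side has free variables $\{z,u\}$ and the right-hand side has only $\{z\}$.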

\begin{proof}
By induction on the CBV derivation of $e\rightarrow e'$.  There is one case for each CBV rule, corresponding to each way $e\rightarrow e'$ could be derived.

\medskip

\noindent
\emph{Case 1}: $\displaystyle\frac{e_1\rightarrow e'_1}{e_1~e_2\rightarrow e'_1~e_2}$.

\medskip

\noindent
We assume that the desired property is true of the premise---this is the induction hypothesis---and we wish to prove under this assumption that it is true for the conclusion.  Thus we are assuming that $\nm{FV}(e'_1)\subseteq\nm{FV}(e_1)$ and wish to prove that $\nm{FV}(e'_1~e_2)\subseteq\nm{FV}(e_1~e_2)$.
\begin{eqnarray*}
\nm{FV}(e'_1~e_2) &=& \nm{FV}(e'_1) \cup \nm{FV}(e_2)\quad\mbox{by the definition of $\nm{FV}$}\\
&\subseteq& \nm{FV}(e_1) \cup \nm{FV}(e_2)\quad\mbox{by the induction hypothesis}\\
&=& \nm{FV}(e_1~e_2)\quad\mbox{again by the definition of $\nm{FV}$.}
\end{eqnarray*}

\medskip

\noindent
\emph{Case 2}: $\displaystyle\frac{e_2\rightarrow e'_2}{v~e_2\rightarrow v~e'_2}$.

\medskip

\noindent
This case is similar to Case 1, where now $e_2\rightarrow e'_2$ is used in the induction hypothesis.

\medskip

\noindent
\emph{Case 3}: $\displaystyle\frac{}{(\lam xe)v\rightarrow\subst evx}$.

\medskip

\noindent
There is no induction hypothesis for this case, since there is no premise in the rule; thus this case constitutes the basis of our induction.  We wish to show, independently of any inductive assumption, that $\nm{FV}(\subst evx)\subseteq\nm{FV}((\lam xe)v)$.

This case requires a lemma, stated below, to show that $\nm{FV}(\subst evx) \subseteq (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v)$.  Once that is shown, we have
\begin{eqnarray*}
\nm{FV}(\subst evx) &\subseteq& (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v)\quad\mbox{by the lemma to be proved}\\
&=& \nm{FV}(\lam xe) \cup \nm{FV}(v)\quad\mbox{by the definition of \nm{FV}}\\
&=& \nm{FV}((\lam xe)v)\quad\mbox{again by the definition of \nm{FV}.}
\end{eqnarray*}

We have now considered all three rules of derivation for the CBV $\lambda$-calculus, so the theorem is proved.
\end{proof}

\medskip

\noindent
\textbf{Lemma}\quad $\nm{FV}(\subst evx) \subseteq (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v)$ (this lemma is used in Case 3 of the above theorem).

\begin{proof}
By structural induction on $e$.  There is one case for each clause in the definition of the substitution operator.  We have assumed previously that values are closed terms, so $\nm{FV}(v)=\emptyset$ for any value $v$; but actually we do not need this for the proof, and we do not assume it.

\medskip

\noindent
\emph{Case 1}: $e = x$.
\begin{eqnarray*}
\nm{FV}(\subst evx) &=& \nm{FV}(\subst xvx)\\
&=& \nm{FV}(v)\quad\mbox{by the definition of the substitution operator}\\
&=& (\{x\} - \{x\}) \cup \nm{FV}(v)\\
&=& (\nm{FV}(x) - \{x\}) \cup \nm{FV}(v)\quad\mbox{by the definition of \nm{FV}}\\
&=& (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v).
\end{eqnarray*}

\medskip

\noindent
\emph{Case 2}: $e = y$, $y\neq x$.
\begin{eqnarray*}
\nm{FV}(\subst evx) &=& \nm{FV}(\subst yvx)\\
&=& \nm{FV}(y)\quad\mbox{by the definition of the substitution operator}\\
&=& \{y\}\quad\mbox{by the definition of \nm{FV}}\\
&\subseteq& (\{y\} - \{x\}) \cup \nm{FV}(v)\\
&=& (\nm{FV}(y) - \{x\}) \cup \nm{FV}(v)\quad\mbox{again by the definition of \nm{FV}}\\
&=& (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v).
\end{eqnarray*}

\medskip

\noindent
\emph{Case 3}: $e = e_1~e_2$.
\begin{eqnarray*}
\nm{FV}(\subst evx) &=& \nm{FV}(\subst{(e_1~e_2)}vx)\\
&=& \nm{FV}(\subst{e_1}vx~\subst{e_2}vx)\quad\mbox{by the definition of the substitution operator}\\
&=& \nm{FV}(\subst{e_1}vx) \cup \nm{FV}(\subst{e_2}vx)\quad\mbox{by the definition of \nm{FV}}\\
&\subseteq& (\nm{FV}(e_1) - \{x\}) \cup \nm{FV}(v) \cup (\nm{FV}(e_2) - \{x\}) \cup \nm{FV}(v)\quad\mbox{by the induction hypothesis}\\
&=& ((\nm{FV}(e_1) \cup \nm{FV}(e_2)) - \{x\}) \cup \nm{FV}(v)\\
&=& (\nm{FV}(e_1~e_2) - \{x\}) \cup \nm{FV}(v)\quad\mbox{again by the definition of \nm{FV}}\\
&=& (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v).
\end{eqnarray*}

\medskip

\noindent
\emph{Case 4}: $e = \lam x{e'}$.
\begin{eqnarray*}
\nm{FV}(\subst evx) &=& \nm{FV}(\subst{(\lam x{e'})}vx)\\
&=& \nm{FV}(\lam x{e'})\quad\mbox{by the definition of the substitution operator}\\
&=& \nm{FV}(\lam x{e'}) - \{x\}\quad\mbox{because $x\not\in\nm{FV}(\lam x{e'})$}\\
&\subseteq& (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v).
\end{eqnarray*}

\medskip

\noindent
\emph{Case 5}: $e = \lam y{e'}$, $y \neq x$.  This is the most interesting case, because it involves a change of bound variable.  Using the fact $\nm{FV}(v) = \emptyset$ for values $v$ would give a slightly simpler proof.  Let $v$ be a value and $z$ a variable such that $z\neq x$, $z\not\in\nm{FV}(e')$, and $z\not\in\nm{FV}(v)$.
\begin{eqnarray*}
  \nm{FV}(\subst evx) &=& \nm{FV}(\subst{(\lam y{e'})}vx)\\
    &=& \nm{FV}(\lam z{\subst{\subst{e'}zy}vx})\quad\mbox{by the definition of the substitution operator}\\
    &=& \nm{FV}(\subst{\subst{e'}zy}vx)  -  \{z\}\quad\mbox{by the definition of \nm{FV}}\\
    &\subseteq& ((((\nm{FV}(e') - \{y\}) \cup \nm{FV}(z)) - \{x\}) \cup \nm{FV}(v))  -  \{z\}\quad\mbox{by the induction hypothesis twice}\\
    &=& (((\nm{FV}(\lam y{e'}) \cup \{z\}) - \{x\}) \cup \nm{FV}(v))  -  \{z\}\quad\mbox{by the definition of \nm{FV}}\\
    &=& ((\nm{FV}(\lam y{e'}) - \{x\}) \cup \nm{FV}(v) \cup \{z\})  -  \{z\}\quad\mbox{since $z\neq x$}\\
    &=& (\nm{FV}(e) - \{x\}) \cup \nm{FV}(v)\quad\mbox{since $z\not\in\nm{FV}(e')$ and $z\not\in\nm{FV}(v)$.}
\end{eqnarray*}
\end{proof}
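
\medskip

\noindent
To see the change of bound variable in Case 5 at work, here is a small worked instance of our own.  We take $v$ to have a free variable, which the proof permits.  Let $e=\lam y{x~y}$ and $v=\lam wy$, and pick a fresh variable $z$.  Then
\begin{eqnarray*}
\subst{(\lam y{x~y})}{(\lam wy)}x &=& \lam z{\subst{\subst{(x~y)}zy}{(\lam wy)}x}\\
&=& \lam z{\subst{(x~z)}{(\lam wy)}x}\\
&=& \lam z{(\lam wy)~z},
\end{eqnarray*}
whose set of free variables is $(\{y\}\cup\{z\})-\{z\}=\{y\}$.  The bound from the lemma is $(\nm{FV}(e)-\{x\})\cup\nm{FV}(v)=(\{x\}-\{x\})\cup\{y\}=\{y\}$, so the inclusion holds, here with equality.  Without the renaming, the result would be $\lam y{(\lam wy)~y}$, in which the free $y$ of $v$ has been captured by the binder.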

There is a subtle point that arises in Case 5.  We said at the beginning of the proof that we would be doing structural induction on $e$; that is, induction on the well-founded subterm relation $\lt$.  This was a lie.  Because of the change of bound variable necessary in Case 5, we are actually doing induction on the relation of subterm modulo $\alpha$-equivalence:
\begin{eqnarray*}
e \lt_\alpha e' &\definedas& \exists e''\ e'' \lt e' \wedge e=_\alpha e''.
\end{eqnarray*}
But a moment's thought reveals that this relation is still well-founded, since $\alpha$-reduction does not change the size or shape of the term, so we are ok.
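
One way to make this precise is with a size function.  The notation $|e|$ below is introduced only for this sketch and is not part of the lecture.  Define
\begin{eqnarray*}
|x| &\definedas& 1,\\
|e_1~e_2| &\definedas& 1 + |e_1| + |e_2|,\\
|\lam xe| &\definedas& 1 + |e|.
\end{eqnarray*}
Renaming a bound variable does not affect $|e|$, so $e=_\alpha e''$ implies $|e|=|e''|$, and a proper subterm is strictly smaller than the term containing it.  Hence $e\lt_\alpha e'$ implies $|e|<|e'|$.  Any infinite descending chain in $\lt_\alpha$ would therefore give an infinite strictly decreasing sequence of natural numbers, which is impossible.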

\section{Remark}

These proofs may seem rather tedious, and one may wonder why we are doing them in such detail.  But of course, this is exactly the point.  Formal reasoning about the semantics of the $\lambda$-calculus, including such seemingly complicated notions as reductions and substitutions, can be reduced to the mindless application of a few simple rules.  There is no hand-waving or magic involved.  Nothing is hidden; it is all right there in front of you.  To the extent that we can do this for real programming languages, we will be better able to understand what is going on.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\end{document}