CS 212 Spring 2000 Lecture 28

CS212 Notes for Lecture 28
May 3, 2000

Today's topic: Computability and Undecidability

Announcements

Final exam
Review session
PS6
Party 2.0
Course evals

We've enjoyed teaching this class very much, and we hope that we've given you some sense of what Computer Science is like. We hope to see you later on in some of our upper level courses.

In this lecture we explore the far reaches of what computers can and cannot do. We also prove one of the deepest results in computer science: Turing's Halting Theorem. It's related to Gödel's Incompleteness Theorem, among other things. We'll discuss bigger and smaller infinities. At the end we'll relate it all back to the Scheme interpreter.

We've been showing you lots of powerful programming tools.

You might be tempted to think that a computer can solve any problem, given the right program.

You'd soon realize that there are problems worth solving, say, social or philosophical or religious problems, that can't really be formulated as computational problems.

But you might think that any mathematical problem could be solved by a computer.

Not so.

Today we'll look at some things that cannot be computed.

And prove that they cannot be.

No matter

what program,
what language,
what machine,
how long you wait
anything.

Here is the basic reason:

There are more mathematical functions than programs.

Lots more.
There are infinitely many of both,
but there is a bigger infinity of functions.

We'll even show you some interesting functions that cannot be computed. It turns out that many desirable compiler optimizations are not computable. We can approximate their solutions, but it's impossible to always get it right.

Here's what we'll do:

We'll show there are countably many programs. This means we can put them in one-to-one correspondence with the natural numbers N = {0,1,2,...}.
We'll show there are uncountably many functions. It's impossible to put them in one-to-one correspondence with N.
That means that there must be some function which isn't programmable.
Then we'll show you a particular one that you may find interesting.

Countability and Uncountability

We're going to show that the set of all Scheme programs is countable.

Definition: We'll say that two sets A and B are the same size if there is an exact pairing between them; that is, if there is a set R of pairs (a b) such that every element of A occurs on the left-hand side of exactly one pair in R and every element of B occurs on the right-hand side of exactly one pair in R.

Example: the sets {0,1,2} and {2,4,6} are the same size because we can pair them up as follows: (0 2), (1 4), (2 6). This definition goes for infinite sets as well.
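To make the definition concrete, here is a minimal sketch of a checker for finite sets. It assumes sets are represented as Scheme lists and a pairing as a list of two-element lists (a b); the names exact-pairing?, count-left, count-right and all? are made up for this illustration and are not part of the course code.

```scheme
;; How many pairs have x on the left-hand side?
(define (count-left x pairs)
  (cond ((null? pairs) 0)
        ((equal? x (car (car pairs))) (+ 1 (count-left x (cdr pairs))))
        (else (count-left x (cdr pairs)))))

;; How many pairs have y on the right-hand side?
(define (count-right y pairs)
  (cond ((null? pairs) 0)
        ((equal? y (cadr (car pairs))) (+ 1 (count-right y (cdr pairs))))
        (else (count-right y (cdr pairs)))))

;; Does pred hold for every element of lst?
(define (all? pred lst)
  (or (null? lst)
      (and (pred (car lst)) (all? pred (cdr lst)))))

;; Every element of A is on the left of exactly one pair,
;; and every element of B is on the right of exactly one pair.
(define (exact-pairing? A B pairs)
  (and (all? (lambda (a) (= 1 (count-left a pairs))) A)
       (all? (lambda (b) (= 1 (count-right b pairs))) B)))

(exact-pairing? '(0 1 2) '(2 4 6) '((0 2) (1 4) (2 6)))   ; ==> #t
(exact-pairing? '(0 1 2) '(2 4 6) '((0 2) (1 2) (2 6)))   ; ==> #f  (2 is used twice, 4 never)
```

Of course this only works for finite sets. For infinite sets we can't check a pairing by brute force; we have to argue that it is exact.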

A set S is countable if it is the same size as the natural numbers N = {0,1,2,...}; that is, if there is a way to pair the elements of S with 0,1,2,...

Do you think this should be possible for every infinite set? It's not!

Countable sets are all the same size; their "size" is the size of N, called "countably infinite". This is the smallest infinite size. There are strictly larger infinite sets, as we'll see.

So, a set S is countable if the elements can be listed out:

s₀, s₁, s₂,...

For example, N is countable: take the identity pairing (n n).

The set of integers Z = {...,-3,-2,-1,0,1,2,3,...} is countable:

0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 ...
0 -1  1 -2  2 -3  3 -4  4 -5  5 -6  6 -7  7 -8  8 ...

Wait a minute. Something's weird. N is a proper subset of Z.

So how can it be the same size?

But it is. You can pair up the elements of Z with the elements of N exactly.

So for infinite sets, a proper subset can be exactly the same size as the whole set.
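The pairing between N and Z in the table can be written down as a pair of procedures, one for each direction. This is just a sketch; the names nat->int and int->nat are made up for this illustration.

```scheme
;; Even naturals go to 0, 1, 2, ...; odd naturals go to -1, -2, -3, ...
(define (nat->int n)
  (if (even? n)
      (quotient n 2)
      (- (quotient (+ n 1) 2))))

;; The inverse direction: which natural number is paired with z?
(define (int->nat z)
  (if (>= z 0)
      (* 2 z)
      (- (* -2 z) 1)))

(map nat->int '(0 1 2 3 4 5 6))   ; ==> (0 -1 1 -2 2 -3 3)
(int->nat -8)                     ; ==> 15
(int->nat 8)                      ; ==> 16
```

Because each procedure undoes the other, every integer is hit exactly once, which is what makes the pairing exact.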

The even numbers E = {0,2,4,6,8,...} are countable: pair n with 2n.

Again, E is a proper subset of N, so how can it be the same size?

But it is.

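The rationals are countable. It's enough to list the positive rationals; the same alternating trick we used for Z then picks up zero and the negatives. Put the positive rationals in a table, with i/j in row i and column j: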

1/1 1/2 1/3 1/4 1/5
2/1 2/2 2/3 2/4 2/5
3/1 3/2 3/3 3/4 3/5
4/1 4/2 4/3 4/4 4/5
5/1 5/2 5/3 5/4 5/5

Number these diagonal by diagonal, walking each diagonal from the top row down to the first column:

0   1   3   5   9
2   *   6   *
4   7   *
8   *
10

etc... Skip over duplicates (marked in the table above with "*").
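Here is one way the diagonal numbering could be sketched in Scheme. It assumes a fraction i/j is represented as the list (i j), and it treats an entry as a duplicate exactly when i and j have a common factor. The name first-rationals is made up for this illustration.

```scheme
;; Return the first `count` positive rationals in the diagonal order.
;; Diagonal s holds the entries i/j with i + j = s; we visit i = 1, 2, ..., s-1.
(define (first-rationals count)
  (let loop ((s 2) (i 1) (found 0) (acc '()))
    (cond ((= found count) (reverse acc))
          ((= i s) (loop (+ s 1) 1 found acc))        ; diagonal finished, start the next one
          ((= 1 (gcd i (- s i)))                      ; lowest terms: a new rational
           (loop s (+ i 1) (+ found 1) (cons (list i (- s i)) acc)))
          (else (loop s (+ i 1) found acc)))))        ; duplicate such as 2/2: skip it

(first-rationals 11)
;; ==> ((1 1) (1 2) (2 1) (1 3) (3 1) (1 4) (2 3) (3 2) (4 1) (1 5) (5 1))
```

The position of each fraction in the result is its number in the table above: 1/1 is number 0, 3/1 is number 4, 5/1 is number 10.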

Enumerating Programs

There are countably many Scheme programs:

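A program is a finite string of ASCII characters.
So there are certainly no more programs than finite ASCII strings.
We can list out all the finite ASCII strings, shortest first. (The table uses only the letters a-z to keep it short; the full ASCII character set works the same way.)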

0   - empty string   strings of length 0
1   - a              strings of length 1
2   - b              .
...                  .
26  - z              .
27  - aa             strings of length 2
28  - ab             .
...                  .
702 - zz             .
703 - aaa            strings of length 3
704 - aab            .
...                  .

etc.

Now, not all of these are legal Scheme programs, but all Scheme programs are in this list.

We can recognize the legal Scheme programs and skip over the ones that are not.

So, there are (only) countably many Scheme programs.
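The listing of strings can itself be computed. The sketch below handles only the 26-letter alphabet used in the table, not all of ASCII, and the names alphabet and nth-string are made up for this illustration. Given n, it returns the string that appears at position n in the list.

```scheme
(define alphabet "abcdefghijklmnopqrstuvwxyz")

;; Position 0 is the empty string, 1..26 are the one-letter strings, and so on.
;; Each step peels off the last letter of the string.
(define (nth-string n)
  (let loop ((n n) (chars '()))
    (if (= n 0)
        (list->string chars)
        (let ((m (- n 1)))
          (loop (quotient m 26)
                (cons (string-ref alphabet (remainder m 26)) chars))))))

(map nth-string '(0 1 26 27 702 703 704))
;; ==> ("" "a" "z" "aa" "zz" "aaa" "aab")
```

Listing by length matters here. If we tried to list all the strings starting with a before any string starting with b, we would never get to b.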

Some Uncountable Sets

There are only countably many Scheme programs, but we'll show that there are uncountably many functions.

There are even uncountably many Boolean-valued functions of one argument f:N -> {#t,#f}. The set of all such functions is an infinite set, but its size is a bigger infinity than the size of N.

The following sets are all the same size (they can be put into one-to-one correspondence with each other), and they are all uncountable:

All Boolean-valued functions of one argument
All infinite binary strings, e.g. 01101001010010... (0 in position n if f(n)=#f, 1 in position n if f(n)=#t)
All real numbers in the interval [0,1] (take the binary expansion .01101001010010... There are some duplicates here, e.g. .000111111... and .001000000..., but only countably many numbers have duplicate representations, and uncountable minus countable is still uncountable)
All paths in the infinite complete binary tree (0=go left, 1=go right)
All subsets of N
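To see why the first, second and last of these are really the same thing, here is a small sketch that looks at a Boolean-valued function through the other two views. We can only ever print a finite prefix, so both procedures take a bound k. The names bits and members are made up for this illustration.

```scheme
;; The first k bits of the infinite binary string for f:
;; 1 in position n if f(n) is true, 0 otherwise.
(define (bits f k)
  (let loop ((n (- k 1)) (acc '()))
    (if (< n 0)
        acc
        (loop (- n 1) (cons (if (f n) 1 0) acc)))))

;; The elements below k of the subset of N that f describes.
(define (members f k)
  (let loop ((n (- k 1)) (acc '()))
    (cond ((< n 0) acc)
          ((f n) (loop (- n 1) (cons n acc)))
          (else (loop (- n 1) acc)))))

(bits even? 6)      ; ==> (1 0 1 0 1 0)
(members even? 6)   ; ==> (0 2 4)
```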

Diagonalization

Let S be the set of all Boolean-valued functions of one (natural number) argument. Let's show this set is not countable. Any element of S can be thought of as a row in an infinite table, where we list in column n the value of the function on input n:

inputs->  0  1  2  3  4  5  6  7  8  9 ...
------------------------------------------
odd?   | #f #t #f #t #f #t #f #t #f #t ...
prime? | #f #f #t #t #f #t #f #t #f #f ...
even?  | #t #f #t #f #t #f #t #f #t #f ...
=4?    | #f #f #f #f #t #f #f #f #f #f ...

etc.

Suppose that S were countable. (This is a fallacious assumption, and we're going to show that it leads to a contradiction.)

If S were countable, then there would be some exact pairing of S with N. (That's the definition of countable.) Then we could list the elements of S out as S₀, S₁, S₂,... where S_n is the element of S that is paired with n.

We'll show that there can be no such pairing. More precisely, we'll construct an element of S that isn't S_n for any n. This will be a contradiction.

We can list all the elements of S as rows in a table:

inputs->  0  1  2  3  4  5  6  7  8  9 ...
------------------------------------------
S₀     | #f #t #f #t #f #t #f #t #f #t ...
S₁     | #f #f #t #t #f #t #f #t #f #f ...
S₂     | #t #f #t #f #t #f #t #f #t #f ...
S₃     | #f #f #f #f #t #f #f #f #f #f ...
S₄     | #f #t #f #f #t #t #t #f #f #t ...
...

Now we will construct a function d: N -> {#t,#f} that does not appear as a row in this table. Thus, there is no integer n such that S_n = d.

Define the function d:N -> {#t,#f} that on input n returns the negation of the value of S_n applied to n:

d(n) = Boolean negation of S_n(n)

If we could write this in Scheme, it would be

(define d
   (lambda (n) (not ((S n) n))))   ; (S n) stands for row S_n of the table

In other words, we look down the main diagonal of the table and look at the list of values:

#f #f #t #f #t ...

The first value is S₀(0), the next is S₁(1), and so on.

We take the Boolean complement of this list:

#t #t #f #t #f ...

and let d be the function that takes these values:

inputs-> 0  1  2  3  4 ...
d =     #t #t #f #t #f ...

Now, d cannot appear as a row of the table. It isn't S₀ because S₀ and d differ on input 0. It isn't S₁ because S₁ and d differ on input 1... etc.

Therefore d is not S_n for any n. This is a contradiction, because the pairing (n S_n) was supposed to be an exact pairing between N and S, but it's not: it missed d.

So the set of all functions f:N -> {#t,#f} is not countable.
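We can't run the argument on an infinite table, but we can try it on the first five rows. The sketch below assumes the listing starts with the five rows shown in the table; the fifth row is stored as a vector of its ten listed values, so it is only defined for inputs 0 through 9. The names rows, S and prime? are made up for this illustration.

```scheme
(define (prime? n)
  (and (> n 1)
       (let loop ((k 2))
         (cond ((> (* k k) n) #t)
               ((= 0 (remainder n k)) #f)
               (else (loop (+ k 1)))))))

;; The first five rows of the (supposed) listing.
(define rows
  (vector odd?
          prime?
          even?
          (lambda (n) (= n 4))
          (lambda (n)    ; values copied from the table, valid for n = 0..9 only
            (vector-ref (vector #f #t #f #f #t #t #t #f #f #t) n))))

(define (S n) (vector-ref rows n))

;; The diagonal function: flip the n-th entry of the n-th row.
(define (d n) (not ((S n) n)))

(map d '(0 1 2 3 4))                                  ; ==> (#t #t #f #t #f)
(map (lambda (n) (eq? (d n) ((S n) n))) '(0 1 2 3 4)) ; ==> (#f #f #f #f #f)
```

The second result says that d disagrees with row n at input n, for every row we have. Adding more rows doesn't help: d is defined to disagree with each of them.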

And since there are (only) countably many programs, there are not enough to compute all functions. So there must exist a function we cannot compute.

Note that this argument holds not just for Scheme, but for any programming language. No matter how powerful your programming language, there is always some function it cannot compute.

We have proved that the functions N -> {#t,#f} are not countable (neither are the real numbers, subsets of N, paths in the infinite complete binary tree).

Cantor discovered this about a hundred years ago and flipped out...

The real numbers are uncountable:

There is no way to pair the integers with the reals.
There are more reals than integers.
Way more.

The style of argument is called diagonalization because we constructed d by going down the diagonal of that infinite matrix.

The Halting Problem

Now you may be asking, so what?

Why would you ever want to compute one of these functions?

Well, many interesting questions concerning programs are uncomputable.

Here's a real useful one: it would be nice to have a program halts? that checks whether a given program will eventually halt on a given argument.

(halts? prog arg) ==> #t   if (prog arg) would halt and return a value
                  ==> #f   if not

This would be very, very useful! We could use it to check whether a given program is safe to simulate before jumping into the simulation.

But unfortunately, halts? cannot possibly exist.

We'll prove this by contradiction. It's essentially a diagonalization argument.

Suppose there were such a function halts?. Let's construct a new function, which we'll call Cantor:

(define Cantor
  (lambda (p)
    (if (halts? p p)
        (not (p p))
        #f)))

Note that Cantor always halts on any argument, because on input p it uses halts? to check whether p would halt on input p before it actually calls it.

For example,

(Cantor (lambda (x) 4))
==> (not ((lambda (x) 4) (lambda (x) 4)))
==> (not 4)
==> #f

Now the big question is: what about (Cantor Cantor)?

(Cantor Cantor) halts, as we have argued. What value does it have?

(Cantor Cantor)
==> (if (halts? Cantor Cantor)
        (not (Cantor Cantor))
        #f)
==> (not (Cantor Cantor))     ; because (halts? Cantor Cantor) is #t

Thus (Cantor Cantor) returns a value, and that value is the Boolean complement of the value it returns.

huh?

This is impossible. There's no Scheme value that is equal to the negation of itself. We have a contradiction.

Where was the mistake in our argument? It was in the fallacious assumption that there exists a function halts? with the contract given above. No such procedure halts? can exist.
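We can watch the construction defeat a particular candidate for halts?. This is only a sketch: make-cantor builds the Cantor procedure from whatever candidate we hand it, and pessimist is a deliberately wrong candidate that claims nothing ever halts. Both names are made up for this illustration.

```scheme
;; Build the Cantor procedure on top of a candidate halting tester.
(define (make-cantor halts?)
  (lambda (p)
    (if (halts? p p)
        (not (p p))
        #f)))

;; A candidate that always answers "does not halt".
(define (pessimist prog arg) #f)

(define cantor (make-cantor pessimist))

(pessimist cantor cantor)   ; ==> #f   the candidate says (cantor cantor) won't halt
(cantor cantor)             ; ==> #f   but it just did halt, so the candidate is wrong
```

The opposite candidate, one that always answers #t, fails the other way: the Cantor procedure built from it calls itself forever on (cantor cantor), even though the candidate promised it would halt. (Don't try that one at the prompt.) The proof says that every candidate, no matter how clever, is caught out like this on some input.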

We have just proved

The Undecidability of the Halting Problem (Alan Turing, 1936)

It is undecidable whether a given program halts on a given input.

...and none of the neat programming tricks we've taught you can possibly help.

There are lots of useful and interesting problems that compiler writers would love to be able to solve. Unfortunately, many of them are provably unsolvable. E.g.,

the dead code problem: given a block of code in a program, will that code ever be executed?
the equivalence problem: given two programs, do they compute the same function?
and lots more.

There do not exist programs that can solve these problems in general.