******************************************************************* We've enjoyed teaching this class very much, and we hope that we've given you some sense of what Computer Science is like. We hope to see you later on in some of our upper level courses. ******************************************************************* Computability and Undecidability -------------------------------- In this lecture we explore the far reaches of what computers can and cannot do. We also prove one of the deepest results in computer science: Turing's Halting Theorem. It's related to Goedel's Incompleteness Theorem, among other things. We'll discuss bigger and smaller infinities. At the end we'll relate it all back to the Dylan interpreter. -------------------------------------------------------------------- * We've been showing you lots of powerful programming tools. * You might be tempted to think that computer can solve any problem, given the right program. * You'd soon realize that there are problems worth solving, say, social or philosophical or religious problems, that can't really be formulated as computational problems. * But you might think that any mathematical problem could be solved by a computer. Not so. Today we'll look at some things that CANNOT be computed. * And PROVE that they can't be. * No matter - what program, - what language, - what machine, - how long you wait - ANYTHING. There are TWO reasons: 1. There are more mathematical functions than programs. * Lots more. * There are infinitely many of both, * But there is a *bigger* infinity of functions 2. We'll even show you some INTERESTING functions that cannot be computed. It turns out that many desirable compiler optimizations are not computable. We can approximate their solutions, but it's impossible to always get it right. Here's what we'll do: 1. We'll show there are *countably* many programs. This means we can put them in one-to-one correspondence with the natural numbers N = {0,1,2,...}. 2. We'll show there are *uncountably* many functions. It's impossible to put them in one-to-one correspondence with N. 3. That means that there must be some function which isn't programmable. 4. Then we'll show you a particular one that you may find interesting. ---------------------------------------------------------------------- We're going to show that the set of all Dylan programs is *countable*. DEFINITION: We'll say that two sets A and B are /the same size/ if there is an exact pairing between them; that is, if there is a set R of pairs (a b) such that every element of A occurs on the left-hand side of *exactly one* pair in R and every element of B occurs on the right-hand side of *exactly one* pair in R. Example: the sets {0,1,2} and {2,4,6} are the same size because we can pair them up as follows: (0 2), (1 4), (2 6). This definition goes for infinite sets as well. A set S is *countable* if it is the same size as the natural numbers N = {0,1,2,...}; that is, if there is a way to pair the elements of S with 0,1,2,... Do you think this should be possible for every infinite set? It's not! Countable sets are all the same size; their "size" is the size of N, countably infinite. This is the smallest infinite size. There are strictly larger infinite sets, as we'll see. So, a set S is countable if the elements can be listed out: s0, s1, s2,... For example, N is countable: take the identity pairing (n n) The set of integers Z = {...,-3,-2,-1,0,1,2,3,...} is countable: pair n with 2n, -n with 2n+1. Wait a minute. Something's weird. N is a *proper subset* of Z. so how can it be the same size? But it is. You can pair up the elements of Z with the elements of N exactly. The even numbers E = {0,2,4,6,8,...} are countable: pair n with 2n. Again, E is a proper subset of N, so how can it be the same size? But it is. The *rationals* are countable. 1/1 1/2 1/3 1/4 1/5 2/1 2/2 2/3 2/4 2/5 3/1 3/2 3/3 3/4 3/5 4/1 4/2 4/3 4/4 4/5 5/1 5/2 5/3 5/4 5/5 Number these in a diagonal zigzag: 1 2 4 6 10 3 * 7 * 5 8 * 9 * 11 etc... Skip over duplicates (marked in the table above with "*"). ---------------------------------------------------------------------- There are countably many Dylan programs: * A program is a finite string of ASCII characters. * We can list out all the finite ASCII strings. 0 - empty string strings of length 0 1 - a strings of length 1 2 - b . ... . 26 - z . 27 - aa strings of length 2 28 - ab . etc. . * Now, not all of these are legal Dylan programs, * but all Dylan programs are in this list. * We can recognize the legal Dylan programs and skip over the ones that are not. * So, there are (only) countably many Dylan programs. ------------------------------------------------------------------------ But there are UNCOUNTABLY many functions. There are even UNCOUNTABLY many Boolean valued functions of one argument f:N -> {#t,#f} The set of all such functions is not countable. It is an infinite set, but its size is a bigger infinity than the size of N. All the following sets are all the same size (they can be put into one-to-one correspondence with each other), and they are all uncountable: - all Boolean valued functions of one argument - all infinite binary strings, e.g. 01101001010010... (0 in position n if f(n)=#f, 1 in position n if f(n)=#t) - all real numbers in the interval [0,1] (take the binary expansion .01101001010010...) (there are some duplicates here, e.g. .000111111... and .001000000.... but only countably many) (and uncountable minus countable is still uncountable) - all paths in the infinite complete binary tree (0=go left, 1=go right) - all subsets of N ------------------------------------------------------------------------ Let S be the set of all Boolean-valued functions of one (natural number) argument. Let's show this set is not countable. Any element of S can be thought of as a row in an infinite table, where we list in column n the value of the function on input n: inputs 0 1 2 3 4 5 6 7 8 9 ... ------------------------------ odd? | #f #t #f #t #f #t #f #t #f #t ... prime? | #f #f #t #t #f #t #f #t #f #f ... even? | #t #f #t #f #t #f #t #f #t #f ... =4? | #f #f #f #f #t #f #f #f #f #f ... etc. Suppose that S were countable. (This is a fallacious assumption, and we're going to show that it leads to a contradiction.) If S is countable, then there is some exact pairing of S with N. (That's the definition of countable.) Then we can list the elements of S out as S0, S1, S2,... where Sn is the element of S that is paired with n. We'll show that there can be no such pairing. More precisely, we'll construct an element of S that isn't Sn for any n. This will be a contradiction. We can list all the elements of S as rows in a table: inputs 0 1 2 3 4 5 6 7 8 9 ... ------------------------------ S0 | #f #t #f #t #f #t #f #t #f #t ... S1 | #f #f #t #t #f #t #f #t #f #f ... S2 | #t #f #t #f #t #f #t #f #t #f ... S3 | #f #f #f #f #t #f #f #f #f #f ... S4 | #f #t #f #f #t #t #t #f #f #t ... . . . Now we will construct a function d: N -> {#t,#f} that does not appear as a row in this table. Thus, there is no integer n such that Sn = d. Define the function d:N -> {#t,#f} that on input n returns the negation of the value of Sn applied to n: d(n) = Boolean negation of Sn(n) If we could write this in Dylan, it would be (define (d ) (method ((n )) (not (Sn n)))) In other words, we look down the main diagonal of the table and look at the list of values: #f #f #t #f #t ... The first value is S0(0), the next is S1(1), and so on. We take the Boolean complement of this list: #t #t #f #t #f ... and let d be the function that takes these values: inputs 0 1 2 3 4 ... d = #t #t #f #t #f ... Now, d cannot appear as a row of the table. It isn't S0 because S0 and d differ on input 0. It isn't S1 because S1 and d differ on input 1... etc. Therefore d is not Sn for any n. This is a contradiction, because the pairing (n Sn) was supposed to be an exact pairing between N and S, but it's not: it missed d. So the set of all functions f:N -> {#t,#f} is not countable. And since there are (only) countably many programs, there are not enough to compute all functions. So there must exist a function we cannot compute. Note that this argument holds not just for Dylan, but for any programming language. No matter how powerful your programming language, there is always something it cannot compute. ---------------------------------------------------------------------- We have proved that the functions N -> {#t,#f} are NOT countable (neither are the real numbers, subsets of N, paths in the infinite complete binary tree). Cantor discovered this about a hundred years ago and flipped out... The real numbers are UNCOUNTABLE -- * There is NO way to pair the integers with the reals * There are MORE reals than integers. - WAY more. The style of argument is called DIAGONALIZATION because we constructed d by looking at the diagonal of that infinite matrix. ---------------------------------------------------------------------- Now you may be asking, so what? Why would you ever want to compute one of these functions? Well, many interesting questions concerning programs are uncomputable. Here's a real useful one: it would be nice to have a program halts? that checks whether a given program will eventually halt on a given argument. (halts? prog arg) ==> #t if (prog arg) halts and returns a value #f if it doesn't. This would be very, very useful! We could use it to check whether a given program is safe to simulate before jumping into the simulation. But unfortunately, halts? cannot possibly exist. We'll prove this by contradiction. It's essentially a diagonalization argument. Suppose there were such a function halts?. Let's construct a new function, which we'll call Cantor: (define Cantor (method ((p )) (if (halts? p p) (not (p p)) #f))) Note that Cantor always halts on any argument, because on input p it uses halts? to check whether it is safe to simulate p on input p before it actually does the simulation. For example, (Cantor (method (x) 4)) ==> (not ((method (x) 4) (method (x) 4))) ==> (not 4) ==> #f Now the big question is: what about (Cantor Cantor)? (Cantor Cantor) halts, as we have argued. What value does it have? (Cantor Cantor) ==> (if (halts? Cantor Cantor) (not (Cantor Cantor)) #f ==> (not (Cantor Cantor)) Thus (Cantor Cantor) returns a value, and that value is the Boolean complement of the value it returns. ***huh?*** This is impossible. There's no Dylan value that is equal to the negation of itself. We have a contradiction. Where was the mistake in our argument? It was in the fallacious assumption that there exists a function halts? with the contract given above. No such procedure halts? can exist. We have just proved Turing's Halting Theorem: It is UNDECIDABLE whether a given program halts on a given input. ...and none of the neat programming tricks we've taught you can possibly help. ---------------------------------------------------------------------- There are *lots* of useful and interesting problems that compiler writers would love to be able to solve. Unfortunately, many of them are *provably* unsolvable. E.g., - the DEAD CODE PROBLEM: given a block of code in a program, will that code ever be executed? - the EQUIVALENCE PROBLEM: given two programs, do they compute the same function? There do not exist programs that can solve these problems in general. One can apply heuristics in special cases, but there is no general algorithm that solves all instances and be correct 100% of the time.