We have seen many useful algorithms and data structures for solving computational problems. However, some problems are intractably hard in the sense that they would require too much time or space. Some problems are not solvable at all, no matter how much time or space is available.
The class of problems that are generally considered to be tractable are those that can be solved in polynomial time, which is to say O(n^k) for some k. In practice, algorithms that take polynomial time with k larger than 1 scale poorly, and algorithms that require k≥5 are almost never used in practice (and even k=3 and k=4 are often impractically slow).
We define the complexity class P as the set of all problems that can be solved by an algorithm taking polynomial time, that is, time O(n^k) for some constant k.
Beyond P is the class EXPTIME, which includes all problems solvable by algorithms taking time O(2^(n^k)) for some constant k. Exponential-time algorithms effectively hit a wall when the problem size n becomes even moderately large. For example, if an algorithm takes time 2^n, increasing the problem size from n to size n+1 requires twice as much time. Even if the algorithm is fast for small n, as n grows, we quickly reach a size where even a small increase is far too expensive. Contrast this with the case of an O(n^k) algorithm, where going from n to n+1 means increasing the time only by a factor of roughly 1 + k/n, which gets smaller as n increases. (To see why, note that ((n+1)/n)^k ≈ 1 + k/n when n ≫ k.)
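To make the contrast concrete, here is a small sketch (the degree k = 3 is an arbitrary choice for illustration) that computes the factor by which the running time grows when n increases to n+1 under each kind of bound:

```java
// Compare how the step from n to n+1 scales the running time
// for a polynomial bound n^k versus an exponential bound 2^n.
public class GrowthRatios {
    public static void main(String[] args) {
        int k = 3; // hypothetical polynomial degree, chosen for illustration
        for (int n = 10; n <= 1000; n *= 10) {
            // Exact ratio ((n+1)/n)^k and its approximation 1 + k/n:
            double polyRatio = Math.pow((n + 1.0) / n, k);
            System.out.printf("n=%4d  poly ratio=%.4f  approx 1+k/n=%.4f%n",
                    n, polyRatio, 1.0 + (double) k / n);
        }
        // For a 2^n algorithm, each increment of n doubles the running
        // time, no matter how large n already is.
    }
}
```

Note how the polynomial ratio approaches 1 as n grows, while the exponential ratio stays fixed at 2.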
Another important class is nondeterministic polynomial time, or NP. These are the problems for which a potential solution can be checked in polynomial time. It may be intractably difficult to discover a solution to a problem in NP, but once we are given a candidate solution, we can check in (deterministic) polynomial time whether it is indeed a solution.
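As a concrete illustration, the graph coloring problem discussed below has exactly this character: finding a k-coloring may be hard, but checking a candidate one is easy. The following sketch (the edge-list representation and method name are our own, for illustration) verifies a candidate coloring in time polynomial in the size of the graph:

```java
import java.util.List;

// Sketch of an NP-style verifier: given a graph (as an edge list) and a
// candidate coloring, check in polynomial time whether it is a valid
// k-coloring.
public class ColoringVerifier {
    /** Each edge is a 2-element array {u, v}; color[i] is vertex i's color. */
    static boolean isValidColoring(List<int[]> edges, int[] color, int k) {
        for (int c : color) {
            if (c < 0 || c >= k) return false;            // color out of range
        }
        for (int[] e : edges) {
            if (color[e[0]] == color[e[1]]) return false; // endpoints clash
        }
        return true;                                      // O(V + E) work in total
    }

    public static void main(String[] args) {
        // Triangle on vertices 0,1,2: every pair of vertices is adjacent.
        List<int[]> triangle =
                List.of(new int[]{0,1}, new int[]{1,2}, new int[]{0,2});
        System.out.println(isValidColoring(triangle, new int[]{0,1,2}, 3)); // true
        System.out.println(isValidColoring(triangle, new int[]{0,1,1}, 3)); // false
    }
}
```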
Examples of problems in NP are the following:
These are three famous problems in NP. All of them can be solved by exponential-time algorithms that essentially try all exponentially many possible solutions. But this approach is infeasible for all but very small instances of the problem. It is not known whether polynomial-time algorithms for these problems exist, although by now most computer scientists believe they do not.
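To illustrate the brute-force approach, the following sketch (our own illustrative code, using graph coloring as the example) tries all k^V candidate colorings. Each individual check is cheap, but the number of candidates grows exponentially with the number of vertices:

```java
import java.util.List;

// Brute-force search for a valid k-coloring: enumerate all k^V
// assignments of colors to vertices and check each one.
public class BruteForceColoring {
    static boolean colorable(List<int[]> edges, int numVertices, int k) {
        return search(edges, new int[numVertices], 0, k);
    }

    // Try every color for vertex v, recursing on the remaining vertices.
    private static boolean search(List<int[]> edges, int[] color, int v, int k) {
        if (v == color.length) return valid(edges, color);
        for (int c = 0; c < k; c++) {
            color[v] = c;
            if (search(edges, color, v + 1, k)) return true;
        }
        return false;
    }

    private static boolean valid(List<int[]> edges, int[] color) {
        for (int[] e : edges) {
            if (color[e[0]] == color[e[1]]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<int[]> triangle =
                List.of(new int[]{0,1}, new int[]{1,2}, new int[]{0,2});
        System.out.println(colorable(triangle, 3, 2)); // false: no 2-coloring exists
        System.out.println(colorable(triangle, 3, 3)); // true
    }
}
```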
Each of the three problems above has the interesting property that in polynomial time, any problem in NP can be reduced to it (encoded in it). That is, if a computational problem A is in NP, then there is a polynomial-time translation that, given an instance x of A, transforms x into an instance (G,k) of the graph coloring problem such that a valid k-coloring of G can be transformed back into a solution of x (and G has no valid k-coloring if x has no solution).
Because they can express any problem in NP, these problems are said to be NP-complete. If we had a polynomial-time algorithm to solve an NP-complete problem, we could solve any problem in NP in polynomial time! This result would mean that the complexity classes P and NP were the same.
Most computer scientists believe that P and NP are not the same and that there is no algorithm that solves any NP-complete problem in worst-case polynomial time. However, no one has managed to prove that the two classes are different either. The P=NP question has intrigued and stymied researchers for decades. It remains the single most important unsolved problem of computer science.
It is also possible to classify algorithms in terms of the memory space they require. Algorithms in PSPACE require a polynomial amount of space. Algorithms in L require only a logarithmic amount of space in addition to the input data; in effect, they can use a constant number of pointers into the input data. A recent surprising result is that undirected graph reachability is in L.
Space classes have nondeterministic versions too. A surprising result, proved by Savitch in the 1970s and known as Savitch's theorem, states that PSPACE = NPSPACE; that is, for a particular problem, if there is a polynomial-space algorithm to check whether a potential solution is indeed a solution, then there is a polynomial-space algorithm to find solutions.
Some relationships among complexity classes have been proved, such as the following chain of inclusions: L ⊆ P ⊆ NP ⊆ PSPACE ⊆ EXPTIME.
It is also known that L≠PSPACE and that P≠EXPTIME. However, many important things are not known. For example, because L≠PSPACE, we know that at least one of the inequalities L≠P, P≠NP, and NP≠PSPACE must hold, but we don't know which.
The complexity of some important problems is not known either. For example, even though the security of RSA encryption rests on the difficulty of factoring numbers, it is not known whether factoring is in P. (However, it is known that factoring can in principle be solved in polynomial time on a quantum computer, though no one has yet been able to build a useful quantum computer.) On the other hand, testing primality—whether a given integer is prime or composite—is known to be in polynomial time.
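The asymmetry between finding and checking shows up clearly with factoring: verifying a claimed factorization takes a single multiplication, even though no efficient classical algorithm for finding the factors is known. A small sketch (the numbers are toy examples):

```java
import java.math.BigInteger;

// Checking a claimed factorization of N is trivial; finding the factors
// is believed to be hard (classically).
public class FactorCheck {
    static boolean isFactorization(BigInteger n, BigInteger p, BigInteger q) {
        return p.multiply(q).equals(n);   // polynomial in the bit length of n
    }

    public static void main(String[] args) {
        BigInteger n = new BigInteger("3233");  // 61 * 53
        System.out.println(isFactorization(n,
                BigInteger.valueOf(61), BigInteger.valueOf(53))); // true
        // Primality testing, by contrast, is known to be in P;
        // Java's isProbablePrime uses a fast randomized test.
        System.out.println(n.isProbablePrime(50)); // false: 3233 is composite
    }
}
```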
Beyond the problems mentioned above, there are even computational problems that cannot be solved at all, even in principle, and even with unlimited time and space. We can prove mathematically that such problems exist. We will focus on decision problems where the goal is to decide whether a given property holds of some input, and we will show that some decision problems are undecidable by any algorithm.
An example of such a decision problem is the halting problem: Given a program p and input x to that program, does p terminate when run on input x? We will see that the halting problem is undecidable in general. That is, there is no algorithm that halts and answers this question correctly for all programs and inputs.
We have seen in previous lectures and from the programming assignments that a program can be represented as a data structure such as an abstract syntax tree (AST) or bytecode. Let us assume there is a class Program that can be used to represent programs.
We want to know whether we can implement a method with the following specification:
/**
 * Returns true if the given program p terminates when given x as input,
 * otherwise returns false.
 */
boolean terminates(Program p, Object x);
That is, for every p and input x, it should successfully return either true or false depending on whether the program represented by p halts on input x.
Note that although terminates
is just one method, it is allowed to use as many other classes and
methods as it likes. We have the full power of Java at our disposal.
For simplicity let us consider only Java programs that implement a decision problem and have the following form:
class P {
    public static boolean main(Object x) {
        ...
    }
}
If p is an instance of Program that represents the Java program P, then terminates(p,x) should return true if P.main(x) halts and false if it does not.
The method main is allowed to use other classes and methods. However, we will only consider programs that receive no input from and send no output to the outside environment. The only input to the program is x and the only output is the boolean result of main.
If we can't determine whether such simple programs terminate, then of
course we have no hope of determining whether more complex programs do.
We have seen that it is possible to write an interpreter for a programming language. An interpreter for programs like P (as coded by the Program object p) can be written with the following signature:
/**
 * Simulate the execution of p on input x, returning
 * the same result as p would. If p would fail to terminate on
 * this input, so does interpret(p, x).
 */
public static boolean interpret(Program p, Object x);
In other words, interpret(p,x) gives exactly the same result as P.main(x).
Now consider the following program.
1  class H {
2      public static boolean main(Object x) {
3          if (!(x instanceof Program)) return false;
4          Program p = (Program) x;
5          if (!terminates(p,p)) return true;
6          return !interpret(p,p);
7      }
8  }
Note that H.main terminates on any input x, because the only possibility for nontermination is in line 6 in the call to interpret(p,p), but we have already guaranteed that this will terminate by the call to terminates(p,p) in line 5. Therefore H.main(x) always returns either true or false.
But now let h be an instance of Program representing H, and consider what happens when we run H.main(h). As argued above, this must terminate. The program does not return in line 3, and the cast succeeds in line 4, because h is an instance of Program. The program does not return in line 5, because that would happen only if terminates(h,h) returned false, which would happen only if H.main(h) did not terminate, but it does. Thus we get all the way to line 6. But now observe that at line 6, the code returns the value !interpret(h,h), which, if the interpreter works correctly, is equal to !H.main(h).
In other words, the result of H.main(h) is equal to !H.main(h). This is a contradiction:

H.main(h) returns true ⟺ interpret(h,h) returns false ⟺ H.main(h) returns false.
This contradiction means that our original assumption, that we can test halting in line 5, must be wrong. More precisely, in a language expressive enough to implement interpret, we cannot implement terminates to work correctly on all input programs.
We know how to implement interpreters for programming languages (including Java) in Java;
we did it for a much simpler language in Assignment 5, but the principle is the same.
So the erroneous assumption was that we could decide termination. The method terminates cannot exist.
The conclusion is that some useful results are simply not computable by any algorithm.
This result may sound obscure but it has some far-reaching practical implications.
In particular, many different program analyses besides termination are also undecidable. For example, we cannot reliably identify dead code (code that cannot be reached), or tell whether a given program will ever generate a NullPointerException, or tell whether a run-time type error will occur. These problems are all provably undecidable.
To see why they are undecidable, suppose that we had an analysis tool that could always tell at compile time whether a program could generate a NullPointerException. We could use that analysis to figure out whether arbitrary code terminates, by constructing code like the following:
String x = null;
while (...) {
    // some computation that does not refer to x
}
System.out.println(x.toString());
Assuming the body of the while loop does not generate any NullPointerExceptions, this code generates a NullPointerException if and only if the while loop terminates. If we had an analysis tool that could predict NullPointerExceptions at compile time, we could use it to determine whether the while loop terminates. But we know that termination is undecidable; therefore, predicting NullPointerExceptions is impossible in general.
As a result, all “interesting” program analyses must be conservative, giving answers “true”, “false”, or “not sure”. Type checking is another example of such an analysis. A compiler type-checks programs conservatively, allowing only programs that (if the type system is sound) definitely have no run-time type errors. However, a type checker will sometimes complain that a program is ill-typed even though it causes no run-time type errors.
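A tiny Java example of this conservatism (our own illustration): the commented-out assignment below can never fail at run time, yet the type checker rejects it, because in general it cannot prove that an Object holds a String:

```java
// The type checker is conservative: it rejects some code that is in fact safe.
public class Conservative {
    public static void main(String[] args) {
        Object o = "hello";        // statically an Object, dynamically a String
        // String s = o;           // rejected: "incompatible types", though safe here
        String s = (String) o;     // accepted once we take responsibility via a cast
        System.out.println(s.length()); // prints 5
    }
}
```

The explicit cast moves the check to run time, where it succeeds; the static analysis alone could not rule out failure.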
By similar arguments, we see that we have to be conservative about many other facts we'd like to know about programs—for example, whether they are correct, whether they are secure, or whether they leak memory. All automatic tools for analyzing programs will either be incomplete, meaning that they reject some safe programs as possibly unsafe (this is a false positive or a false alarm), or else unsound, meaning that they accept some unsafe programs as safe (this is a false negative). Despite these limitations, software developers use incomplete and even unsound automatic tools all the time: they can still be useful despite their limitations.