CS 312 Lecture 26
Debugging Techniques

Testing a program against a well-chosen set of input tests gives the programmer confidence that the program is correct. During the testing process, the programmer observes input-output relationships, that is, the output that the program produces for each input test case. If the program produces the expected output and obeys the specification for each test case, then the program is successfully tested.
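
As a minimal sketch of observing input-output relationships, the Python snippet below runs a function against a small table of test cases and collects every case whose actual output differs from the expected one. The function `median`, the helper `run_tests`, and the chosen cases are hypothetical, introduced only for illustration.

```python
def median(values):
    """Return the median of a non-empty list of numbers (hypothetical example)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Each test case pairs an input with the output the specification requires.
TEST_CASES = [
    ([5], 5),
    ([3, 1, 2], 2),
    ([4, 1, 3, 2], 2.5),
    ([7, 7, 7, 7], 7.0),
]


def run_tests(function, cases):
    """Return a list of (input, expected, actual) for every failing case."""
    failures = []
    for test_input, expected in cases:
        actual = function(test_input)
        if actual != expected:
            failures.append((test_input, expected, actual))
    return failures


if __name__ == "__main__":
    failures = run_tests(median, TEST_CASES)
    print("all tests passed" if not failures else failures)
```

A non-empty failure list shows which inputs expose a defect, but not where in the code the defect lies.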

But if the output for one of the test cases is not the one expected, then the program is incorrect -- it contains errors (or defects, or "bugs"). In such situations, testing only reveals the presence of errors, but doesn't tell us what the errors are, or how the code needs to be fixed. In other words, testing reveals the effects (or symptoms) of errors, not the cause of errors. The programmer must then go through a debugging process, to identify the causes and fix the errors.

Bug Prevention and Defensive Programming

Surprisingly, debugging may take significantly more time than writing the code in the first place. A large share (if not most) of the effort of developing a piece of software goes into debugging and maintaining the code, rather than writing it.

Therefore, the best thing to do is to avoid the bug when you write the program in the first place! It is important to sit and think before you code: decide exactly what needs to be achieved and how you plan to accomplish it, design the high-level algorithm cleanly, convince yourself it is correct, and decide which concrete data structures you plan to use and which invariants you plan to maintain. All the effort spent in designing and thinking about the code before you write it will pay off later. The benefits are twofold. First, having a clean design will reduce the probability of defects in your program. Second, even if a bug shows up during testing, a clean design with clear invariants will make it much easier to track down and fix the bug.
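
To make the idea of a maintained invariant concrete, here is a small Python sketch, assuming a hypothetical `SortedBag` class whose only invariant is that its internal list stays sorted. The class, its method names, and the choice to re-check the invariant after every update are illustrative assumptions, not a prescribed design.

```python
import bisect


class SortedBag:
    """A multiset of numbers stored in a list (hypothetical example).

    Invariant: self._items is always in non-decreasing order.
    """

    def __init__(self):
        self._items = []
        self._check_invariant()

    def _check_invariant(self):
        # Every adjacent pair must be in order; fail loudly if not.
        for left, right in zip(self._items, self._items[1:]):
            assert left <= right, f"invariant violated: {left} > {right}"

    def add(self, value):
        # insort places the value at a position that preserves sorted order.
        bisect.insort(self._items, value)
        self._check_invariant()

    def smallest(self):
        # Relies on the invariant: the minimum is always the first element.
        if not self._items:
            raise ValueError("smallest() called on an empty bag")
        return self._items[0]
```

Because the invariant is written down and checked in one place, a later change that breaks it fails at the point of the update rather than much later in `smallest`.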

It may be very tempting to write the program as fast as possible, leaving little or no time to think about it beforehand. The programmer will be happy to see the program done in a short amount of time. But it's likely he will get frustrated shortly afterwards: without careful thought, the program will be complex and unclear, so maintenance and bug fixing will become an endless process.

Once the programmer starts coding, he should use defensive programming. This is similar to defensive driving, which means driving with worst-case scenarios in mind (e.g., other drivers violating traffic laws, unexpected events or obstacles, etc.). Similarly, defensive programming means developing code so that it works correctly under the worst-case scenarios its environment can present. For instance, when writing a function, one should assume worst-case inputs to that function, i.e., inputs that are too large, too small, or inputs that violate some property, condition, or invariant; the code should deal with these cases, even if the programmer doesn't expect them to happen under normal circumstances.
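
A minimal Python sketch of a defensively written function follows. The function `letter_grade`, its grading scale, and the decision to raise exceptions on bad input are assumptions made for illustration; other ways of dealing with such inputs (for example, returning an error value) may suit a given program better.

```python
def letter_grade(score):
    """Map a score in the range 0..100 to a letter grade (hypothetical scale)."""
    # Reject values of the wrong type; bool is a subclass of int, so exclude it.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise TypeError(f"score must be a number, got {score!r}")
    # Reject NaN explicitly: NaN is the only value not equal to itself.
    if score != score:
        raise ValueError("score must not be NaN")
    # Reject inputs that are too small or too large.
    if score < 0 or score > 100:
        raise ValueError(f"score must be between 0 and 100, got {score!r}")
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    return "F"
```

The function rejects out-of-range and malformed inputs at its boundary, so a caller's mistake surfaces immediately instead of producing a plausible but wrong grade.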

Remember, the goal is not to become an expert at fixing bugs, but rather to get better at writing robust, (mostly) error-free programs in the first place. As a matter of attitude, programmers should not feel proud when they fix bugs, but rather embarrassed that their code had bugs. If there is a bug in the program, it is only because the programmer made mistakes.

Classes of Defects

Even after careful thought and defensive programming, a program may still have defects. Generally speaking, there are several kinds of errors one may run into:

Difficulties

The debugging process usually consists of the following steps: examine the error symptoms, identify the cause, and finally fix the error. This process may be quite difficult and require a large amount of work, for the following reasons:

Debugging Strategies

Although there is no precise procedure for fixing all bugs, there are a number of useful strategies that can reduce the debugging effort. A significant part (if not all) of this process is spent localizing the error, that is, figuring out the cause from its symptoms. Below are several useful strategies to help with this. Keep in mind that different techniques are better suited to different cases; there is no clear best method. It is good to have knowledge of and experience with all of these approaches. Sometimes, a combination of two or more of these approaches will lead you to the error.
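
As one illustration of what localizing an error can look like in practice, the Python sketch below checks the output of each stage of a small pipeline and reports the first stage whose result is wrong. Everything here is hypothetical: the three stages, the deliberately planted bug in `deduplicate`, the checks, and the helper `first_failing_stage` are assumptions chosen for the example, and this is only one possible way to narrow down a cause from its symptoms.

```python
def parse(lines):
    """Convert text lines to integers."""
    return [int(line) for line in lines]


def deduplicate(numbers):
    """Deliberately buggy: meant to drop later duplicates while keeping order."""
    return sorted(set(numbers))  # bug: sorting discards the original order


def running_total(numbers):
    """Return the list of prefix sums."""
    totals, current = [], 0
    for number in numbers:
        current += number
        totals.append(current)
    return totals


def keeps_first_occurrences(inp, out):
    """Slow but simple oracle: out must be inp without later duplicates."""
    expected = []
    for item in inp:
        if item not in expected:
            expected.append(item)
    return out == expected


# Each stage is (name, function, check), where check(input, output) -> bool.
STAGES = [
    ("parse", parse, lambda inp, out: len(out) == len(inp)),
    ("deduplicate", deduplicate, keeps_first_occurrences),
    ("running_total", running_total,
     lambda inp, out: not inp or out[-1] == sum(inp)),
]


def first_failing_stage(stages, data):
    """Return the name of the first stage whose output fails its check,
    or None if every stage passes."""
    for name, function, check in stages:
        result = function(data)
        if not check(data, result):
            return name
        data = result
    return None


if __name__ == "__main__":
    print(first_failing_stage(STAGES, ["3", "1", "3", "2"]))  # prints: deduplicate
```

The final output of the whole pipeline would only show that something is wrong; checking intermediate results points at the stage where the data first goes bad.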

A number of other strategies can be viewed as a matter of attitude about where to expect the errors:

All of the above are techniques for localizing errors. Once they have been identified, errors need to be corrected. In some cases, this is trivial (e.g., for typos and simple errors). In other cases, the fix may be fairly straightforward, but the change must preserve certain invariants. The programmer must think carefully about how the fix is going to affect the rest of the code, and make sure no additional problems are created by fixing the error. Of course, this requires that the invariants be properly documented. Finally, bugs that represent conceptual errors in an algorithm are the most difficult to fix. The programmer must re-think and fix the logic of the algorithm.