Lecture 26:
Modular Programming. Interfaces. Specifications. Refactoring.


The programs we wrote or otherwise examined in this course were at most a few hundred lines of code long. This length was sufficient to illustrate the concepts, data structures, and programming techniques that were the subject of our investigations. Many real-life applications, however, are much bigger, often running to 10,000 - 100,000 lines of code (LOC), and sometimes to a few million LOC. The size of the source code of modern operating systems is in the range of tens of millions of LOC; precise numbers are hard to come by, for proprietary reasons, but also for methodological reasons. (What is a 'line of code', really? Should we accept the authors' formatting, or should we reformat the input before counting?)

Experience has shown that large programs are characterized by new behaviors that emerge due to the sheer complexity of the underlying code. In this lecture we will discuss techniques that allow programmers to manage and master this complexity.

Large programs will most often be written by teams of programmers, often over an extended period of time. The individuals involved in development must coordinate and document their work, and must be able to adapt to changes in the composition and/or administrative structure of the team. Once written, large programs tend to be used for extended periods of time during which the original requirements often evolve, while the underlying technologies (software: compilers, operating systems; hardware: processor speed, memory size, hard disk size, network bandwidth and latency) change spectacularly. Changes in requirements and/or the underlying technologies often impose changes in the structure and functionality of programs.

The inherent complexity of a large program prevents any individual from comprehending it in its entirety; inevitably, programmers will acquire true expertise in only part of the program. On the other hand, even large, monolithic programs have an internal structure that emerges naturally; the interactions (and dependencies) among the parts of the program that address a specific sub-problem are much stronger than the interactions between these and other parts.

In languages with imperative features, where values can be changed, one further disadvantage of large, monolithic programs is that any global value might be changed from anywhere else, even inadvertently (for example, because a programmer misspells a variable name).

One solution to these problems is to carefully break up large programs into parts (subsystems) that can be designed, developed and tested separately before being put together to solve the problem at hand. Such a subsystem is usually called a module.

Good modular design decomposes the problem into units that map naturally onto subproblems of the task at hand. A typical module will have a few hundred lines of code, and as such, it is much easier for programmers to understand than the entire program. Testing at the module level allows rapid isolation and resolution of errors, significantly speeding up development. Once modules are debugged, they can be put together to build and debug the final system. Note that it is possible for a system composed of individually correct modules not to be correct as a whole (e.g. when modules do not interact correctly).

It is practical to isolate modules as much as possible from each other. Many programming languages provide specific mechanisms that enforce, or at least encourage, such strict isolation. There are several goals that can be achieved by isolating modules: (a) the source code can be maintained separately; (b) data structures in one module cannot be changed from another module, or can only be changed in controlled ways; (c) if modules allow only the minimum amount of interaction that is needed for the correct performance of their task, then changing the internal structure of modules becomes easier (we say that modules are loosely coupled).

Programmers control the interaction between a module and its users by defining interfaces. Simply put, an interface is a contract, in which the module guarantees that it will implement all the advertised functionality, while the users guarantee that they will only rely on advertised functionality (and that they will use the module only through the respective interface - this latter requirement is often automatically enforced by compilers).

Programming languages differ widely in the way in which they support and/or enforce modular programming. The relevant language mechanisms for three well-known languages are shown below:

             C               Java                            ML
Module       source file     class                           structure
Interface    header files    interfaces, or                  signature
                             public members of a class

Note that the Java interface is not identical to the notion of interface that we discuss here. The interface of a class that does not implement any interfaces is the collection of public functions and variables that the class declares. Besides their role in specifying the interaction between a module (class) and its users, a Java interface also has a role in simulating the implementation of multiple inheritance (which exists, for example, in C++).
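In ML terms, the contract between a module and its users can be sketched as a signature plus a structure that implements it. The names COUNTER and Counter below are hypothetical and chosen only for illustration; the point is that the opaque ascription ":>" lets the compiler enforce that users touch the module only through the interface.

```sml
(* Interface: the only operations users of the module may rely on.
   COUNTER and Counter are hypothetical names used for illustration. *)
signature COUNTER =
sig
  type counter                       (* abstract: the representation is hidden *)
  val new   : unit -> counter
  val tick  : counter -> unit        (* effects: increments the counter *)
  val value : counter -> int
end

(* Module: one possible implementation. The opaque ascription ":>" makes
   the compiler reject any use that is not listed in COUNTER. *)
structure Counter :> COUNTER =
struct
  type counter = int ref
  fun new () = ref 0
  fun tick c = c := !c + 1
  fun value c = !c
end

val c = Counter.new ()
val () = Counter.tick c
val () = Counter.tick c
val n = Counter.value c   (* 2 *)
```

Because the type counter is abstract, code outside the structure cannot assign to the underlying reference directly; the state can change only through tick, which is the "controlled way" mentioned earlier.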

Inexperienced programmers, often under pressure to show quick and tangible results, are tempted to split a bigger problem into modules informally, and to start writing code before clearly defining interfaces. This can be an extremely dangerous mistake, as it might lead to incompatibilities between modules. Fixing such problems can be very expensive, and often involves extensive rewrites of the source code.

There is a natural tension between module implementers and users. Users are often interested in having complex, powerful functionality accessible through the interface ("give me a function for every small problem I need to solve"). Implementers, on the other hand, are interested in exposing only a small number of simple operations, which are easy to implement, test, and maintain. The situation is not always so clear cut, however, as sometimes implementers are tempted to expose functionality that is complex and requires a lot of effort just because it is "cool."

It is often said that a complex interface is wide, while a simple one is narrow.

Designing good interfaces requires a good understanding of the problem at hand, good skills, and experience. In many cases complex functionality is rarely, if ever, used, but its implementation might require a lot of resources and might have forced design decisions that make the entire module much less efficient than it could otherwise be. Some principles of interface design have been distilled down to statements like "do a few things, but do them well," or "don't do it just because it's cool." While every case is different, you should be strongly biased toward simple interfaces and loose coupling between modules.


Type declarations are too coarse to capture the semantics of the entities that an interface exposes. For example, SML (or C, or Java) does not provide a type of positive integers. We can check inside a function whether an argument is positive, but this will not be obvious from the declaration of the function that an interface exposes.

Information that cannot be expressed purely using language mechanisms is often provided in comments that give the behavioral description of the functions involved. Often, these comments have a fixed form, to facilitate quick understanding, and to reduce the possibility that an essential element will be overlooked.
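As a small illustration of this gap between types and meaning, consider the hypothetical function countDigits below (the name and the function are assumptions made for this sketch). Its type, int -> int, says nothing about the argument having to be positive; only the comment and the run-time check carry that information.

```sml
(* countDigits : int -> int
   The argument n must be positive; the type alone cannot say so.
   Returns the number of decimal digits of n.
   Raises Fail "non-positive argument" if n <= 0. *)
fun countDigits (n: int): int =
  if n <= 0
  then raise Fail "non-positive argument"
  else if n < 10
  then 1
  else 1 + countDigits (n div 10)

val d = countDigits 12345   (* 5 *)
```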

When describing a function in an interface it is common to provide comments that include:

The description of what a function does is called its specification. It is often useful to write specifications in the form of clauses (the example below uses requires, results, checks, and effects):

Here is one example:

(* requires: x >= 0
   results: r = sqrt(x) >= 0; | r * r - x | < 0.0001
   checks: x >= 0, if not true raises exception Fail "negative argument"
   effects: prints the value of the square root *)
fun sqrt (x: real): real = 
  if x < 0.0
  then raise Fail "negative argument"
  else
    let
      (* Newton's iteration: refine the estimate until it is close enough *)
      fun helper(root: real): real = 
        if Real.abs(root * root - x) < 0.0001
        then root
        else helper(0.5 * (root + x/root))
      val root = helper 1.0
    in
      print ("root = " ^ (Real.toString root) ^ "\n");
      root
    end

- sqrt 4.0;
root = 2.00000009292
val it = 2.00000009292 : real
- sqrt ~4.0;

uncaught exception Fail: negative argument
  raised at: stdIn:62.14-62.38

In keeping with the general principle that an interface must expose as little functionality as possible, specifications should be kept as simple and as abstract as possible. The module implementor must assume that any information exposed in the past has been relied on by users. If a lot of information has been disclosed through the interface, then the respective module cannot be changed easily (and - if changed - must respect the specification as written originally). Such modules are said to be tightly coupled.

We say that a specification is definitional if the respective specification states what the interface does, but provides as little detail as possible about how the functionality is achieved. An operational specification provides a lot of information on how the functionality is achieved. Definitional specifications tend to lead to loose coupling, while operational specifications tend to lead to tight coupling.

Consider the following two partial specifications:

version 1: returns j such that a[j] = y
version 2: loop on j from 0 to n - 1, 
             compare a[j] to y, if equal, returns j

Note: These specifications are incomplete (what happens if there is no j such that a[j] = y?), and are only meant to illustrate the difference between definitional and operational specifications.

Version 1 should be preferred to version 2 whenever possible. To clarify why this is so, let us assume that the array a holds a complex data structure, whose comparison operation is performed by a user-supplied custom function. A user could define a comparison operation that has side effects, and the rest of the program might come to rely on these side effects occurring in a certain order, corresponding to the loop index growing from 0 upwards. If this happens, the modules become tightly coupled, and we cannot change the module implementation in ways that do not implement the linear search pattern the interface exposes. On the other hand, version 1 would have forced the module's user not to rely on comparisons occurring in a certain order. This might require slightly more complex code at the user's end, but this is often not the case in practice. Version 1 does not disclose the existence of a linear search pattern even though it might actually use it in the implementation. The advantage is that now we can change the function more easily, and we are not prevented from abandoning the linear search pattern.

Version 1 is also an example of non-deterministic specification, i.e. it allows for several possible results (any j such that a[j] = y is acceptable). A specification given as "returns the smallest j such that a[j] = y" would make the answer unambiguous, i.e. it would be deterministic. We note that the specification we have just given is still better than version 2 above - we do not necessarily have to search in a linear fashion in the array to find the smallest j with the given property.
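The difference can be made concrete with a minimal sketch. The two functions below (findFirst and findLast are hypothetical names, and raising Fail "not found" is an assumption that fills the case the partial specifications leave open) both satisfy version 1, but only the first satisfies the deterministic "smallest j" specification.

```sml
(* Both functions take a vector a and a value y and return an index j
   with a[j] = y. Raising Fail when no such j exists is an assumption;
   the partial specifications in the text leave that case open. *)

(* Scans from index 0 upwards: returns the smallest matching index. *)
fun findFirst (a: int vector, y: int): int =
  let
    fun loop j =
      if j >= Vector.length a then raise Fail "not found"
      else if Vector.sub (a, j) = y then j
      else loop (j + 1)
  in
    loop 0
  end

(* Scans from the last index downwards: returns the largest matching index. *)
fun findLast (a: int vector, y: int): int =
  let
    fun loop j =
      if j < 0 then raise Fail "not found"
      else if Vector.sub (a, j) = y then j
      else loop (j - 1)
  in
    loop (Vector.length a - 1)
  end

val a = Vector.fromList [7, 3, 9, 3, 5]
val i = findFirst (a, 3)   (* 1 *)
val k = findLast (a, 3)    (* 3 *)
```

A user who relied only on version 1 cannot tell which of the two is in use, so the implementer remains free to switch between them.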

Natural languages are inherently ambiguous, so their use in specifications might lead to misunderstandings. Formal specifications use a special, unambiguous notation to describe clauses. Being precise, these specifications reduce the possibility of a misunderstanding. Their other big advantage is that formal specifications can be processed using specialized tools which can determine whether they are sound, and whether a certain piece of code (say, a function) actually conforms to its specification. Automated program checking greatly speeds development and reduces costs, while offering guarantees of program correctness. Next time you fly, try to decide whether you would want the software that keeps the airplane in the air to have been debugged by a tired hacker working overnight, or by tools that used formal specifications to warrant that there are no hidden errors. Unfortunately, developing formal specifications is not always easy, and many domains prove remarkably resistant to efforts in this direction.

Module Hierarchies

Truly large programs cannot simply be split into an unstructured collection of modules; at some point the number of modules becomes so great that the system becomes unmanageable. The answer to this problem is given by module hierarchies. Hierarchies of modules emerge naturally if one breaks down the problem into a few big subproblems first, then attacks and further decomposes each subproblem using a similar strategy, repeating this process as long as necessary to obtain subproblems of reasonable size. This is not always easy to do well, and wrong choices will lead to subproblems that exhibit many mutual dependencies. A good decomposition minimizes the number and complexity of dependencies.

In such a system, one will have modules that implement functionality which does not rely on any other module. There will be, however, modules that integrate the functionality of simpler, more basic modules. These dependencies naturally create a hierarchy of modules. Care must be taken not to introduce circular dependencies in module hierarchies; e.g. one does not want module A to depend on module B, module B to depend on module C, and module C in turn to depend on module A (can you tell why?). Good module hierarchies can be represented as trees or DAGs.

Once modules and interfaces have been defined, one must proceed to implementation. There are two basic implementation methods: top-down and bottom-up.

The bottom-up method starts with the implementation of modules at the lowest level of the hierarchy, followed by the implementation of modules that depend only on modules on the lowest level, and proceeds in similar fashion up to the root of the hierarchy. The bottom-up method is preferred by many because code that is developed can be tested immediately (all the functionality a module depends on has already been implemented). Immediate testing allows for the discovery and early fix of errors, but it also allows for the early discovery of low-level efficiency problems. The downside of this method is that the "big picture" is somewhat blurred during development, and high-level, systemic design flaws are discovered only late, possibly too late for any changes to be possible.
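One way to picture the bottom-up order is as a listing of the module hierarchy in which every module appears after the modules it depends on. The sketch below computes such an order with a depth-first traversal and reports a circular dependency if it meets one. The module names and the dependency table are hypothetical, and the function bottomUpOrder is an illustration, not part of any course code.

```sml
(* Each module is paired with the modules it uses directly. *)
type table = (string * string list) list

(* results: the modules of deps, each listed after all its dependencies
   checks:  raises Fail if the dependencies form a cycle *)
fun bottomUpOrder (deps: table): string list =
  let
    fun depsOf m =
      case List.find (fn (name, _) => name = m) deps of
          SOME (_, ds) => ds
        | NONE => []
    fun member (x, xs) = List.exists (fn y => y = x) xs
    (* path holds the modules on the current chain of dependencies;
       placed holds the modules already ordered, most recent first. *)
    fun visit path (m, placed) =
      if member (m, placed) then placed
      else if member (m, path)
      then raise Fail ("circular dependency at " ^ m)
      else m :: List.foldl (visit (m :: path)) placed (depsOf m)
  in
    List.rev (List.foldl (visit []) [] (List.map (fn (name, _) => name) deps))
  end

(* A hypothetical hierarchy for a small interpreter. *)
val order =
  bottomUpOrder [ ("ui",     ["parser", "eval"]),
                  ("eval",   ["env"]),
                  ("parser", ["lexer"]),
                  ("lexer",  []),
                  ("env",    []) ]
(* ["lexer", "parser", "env", "eval", "ui"] *)
```

Reading the resulting list from left to right gives one valid bottom-up schedule; reading it from right to left gives an order in which a top-down implementation could proceed.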

The top-down method starts with the development of top-level modules, then implements the modules upon which top-level modules rely, and proceeds in a similar fashion toward the modules that are lowest in the hierarchy. The advantages and disadvantages of this method mirror those of the bottom-up method. High-level design flaws become visible early, and allow for quick corrections and - often - for higher-quality specifications. The major disadvantage is that testing becomes much more difficult in the early stages of a project, as not all the functionality needed by already implemented modules is available. Testing is not impossible, however; one can develop modules that simulate the behavior of as-yet unimplemented modules, for example (note that such simulation will often not allow for full-fledged testing of existing modules).
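A simulated module of this kind is often called a stub. The sketch below assumes a hypothetical higher-level spell checker that depends on a dictionary module which has not been written yet; all names (DICTIONARY, StubDictionary, SpellChecker) are invented for illustration. Writing the higher-level module as a functor over the interface is one possible way to plug in the stub now and the real module later.

```sml
(* Interface of a lower-level module that has not been written yet. *)
signature DICTIONARY =
sig
  val isWord : string -> bool
end

(* Stand-in used only for testing: it knows a handful of fixed words. *)
structure StubDictionary :> DICTIONARY =
struct
  fun isWord w = List.exists (fn known => known = w) ["cat", "dog"]
end

(* Higher-level module, written against the interface only. *)
functor SpellChecker (D : DICTIONARY) =
struct
  (* results: the words of text that D does not recognize *)
  fun misspelled (text: string): string list =
    List.filter (not o D.isWord) (String.tokens Char.isSpace text)
end

structure TestChecker = SpellChecker (StubDictionary)
val bad = TestChecker.misspelled "cat dgo dog"   (* ["dgo"] *)
```

The stub exercises the logic of the spell checker, but it says nothing about how the checker will behave with a realistic dictionary, which is the limitation noted above.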

Changing Module Implementations. Refactoring.

As mentioned above, big programs tend to have a long life, during which technologies change, requirements evolve, and - likely - various flaws of the original implementation become visible. All these factors can prompt the rewrite of some parts of the code base. Rewrites must often be accomplished so that the program remains functional at all times. In a well-designed system changes should only occur inside modules, and all these changes should leave the interfaces unaffected. Refactoring is a technique that allows for the internal restructuring of a module while its external behavior remains unchanged.

Consider our Mini-ML interpreter. We have mentioned previously that we could add various constant and function definitions to the interpreter's global (initial) environment. Because the environment is implemented as a list, adding a very large number of predefined constants and functions to the global environment would slow down lookups of predefined identifiers. To improve performance, we could redesign the environment so that it consists of, say, a splay tree (with the identifiers used as keys), and a list. The tree will contain the global environment (which - in Mini-ML - never changes), while the list could be used to implement the dynamic part of the environment, as it is done now. The only changes that we need to make are to the functions env_lookup and env_add; all the other parts of the interpreter rely on these two functions to access environments. Given enough identifiers defined in the global environment, and Mini-ML programs that use these predefined identifiers intensively enough, refactoring will likely have a significant payoff. Equally important, the change we had to make is very localized.

Suggested problem: Implement this refactoring idea. Hint: Look up the various tree datatypes that SML implements.

Certain aspects of refactoring are highly structured, and consist of the systematic identification (detection) of certain design patterns. The design patterns are called smells (seriously!). Big repositories of smells grouped by categories are reachable through the web, and they are also available in the published technical literature.

Assume that in many functions internal to a module a pair of arguments with the meaning "start date" and "end date," respectively, is always used together. In such a case it is a good idea to create a new type, say, a pair of dates, with the meaning of "interval." This change better expresses the logic of the situation, makes the program easier to understand, reduces the number of function arguments (making the program more readable), and reduces the possibility of errors (since the chance of pairing incorrect dates is reduced). All these advantages emerge from recognizing a simple smell! Systematically identifying smells and restructuring programs accordingly can greatly improve the performance, readability, and maintainability of a big program.
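The date example can be sketched as follows. Dates are simplified to day numbers here, which is an assumption made only to keep the sketch short, and all function names are hypothetical.

```sml
(* Dates are simplified to day numbers; this representation is an assumption. *)
type date = int

(* Before: the two dates travel separately and can be swapped or mismatched. *)
fun daysBetween (startDate: date, endDate: date): int = endDate - startDate

(* After: one value with the meaning "interval". *)
datatype interval = Interval of date * date

(* checks: startDate <= endDate, otherwise raises Fail "end before start" *)
fun makeInterval (startDate: date, endDate: date): interval =
  if startDate <= endDate
  then Interval (startDate, endDate)
  else raise Fail "end before start"

fun duration (Interval (s, e)): int = e - s

fun overlaps (Interval (s1, e1), Interval (s2, e2)): bool =
  s1 <= e2 andalso s2 <= e1

val term  = makeInterval (60, 151)
val exams = makeInterval (130, 140)
val len   = duration term            (* 91 *)
val clash = overlaps (term, exams)   (* true *)
```

Functions such as duration and overlaps now take one or two intervals instead of two or four loose dates, and the check in makeInterval is performed in a single place.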

We have stated that refactoring should not change the external behavior of modules. But how do we know that the behavior did not change? Ideally, we would have generated formal specifications and we would have used automated tools to confirm the correctness of our implementation unambiguously. In practice, however, we need to rely on testing.

As the project develops, one must develop tests that thoroughly and exhaustively examine the functionality of every module. The results of tests should be deterministic (i.e. they should always be the same), so that the correctness of the result can be easily checked using automated tools. Ideally, the number of tests should only grow in time, and no old tests should be discarded (it is better to test something twice than not to test it at all; besides, it is not always clear that two tests exercise exactly the same code path). When a program is changed, all tests should be run on it, to make sure that no correct functionality has been lost. This technique is called regression testing.
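A regression suite can be as simple as a list of named checks that is rerun after every change. The following is a minimal sketch under that assumption; the function under test (maxOf) and the helper runTests are hypothetical and stand in for whatever module is being refactored.

```sml
(* Function under test; hypothetical, standing in for a real module. *)
fun maxOf [] = raise Empty
  | maxOf (x :: xs) = List.foldl Int.max x xs

(* A test pairs a name with a function that returns true on success. *)
type test = string * (unit -> bool)

(* results: the names of the failing tests, in suite order;
   an uncaught exception inside a test counts as a failure *)
fun runTests (tests: test list): string list =
  List.map (fn (name, _) => name)
           (List.filter (fn (_, t) => not (t () handle _ => false)) tests)

(* The suite only grows: new tests are appended, old ones are kept. *)
val regressionSuite : test list =
  [ ("max of singleton", fn () => maxOf [4] = 4),
    ("max of several",   fn () => maxOf [3, 9, 2] = 9),
    ("max of negatives", fn () => maxOf [~5, ~2, ~8] = ~2),
    ("empty list raises Empty",
       fn () => (maxOf []; false) handle Empty => true) ]

val failures = runTests regressionSuite   (* [] when every test passes *)
```

After a change to maxOf, rerunning runTests regressionSuite and comparing the result with the empty list is the whole regression check; each test is deterministic, so the comparison can be automated.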