Good software engineering is about dividing code into modules that separate concerns and localize them within modules. These modules then interact via interfaces that provide abstraction barriers supporting local reasoning. Let's look more closely at the problem of designing good interfaces.
Interfaces exist at the boundaries of the modular decomposition. An interface will be most effective when it has the following three properties:
It provides a strong abstraction barrier between modules.
It is as narrow as possible while providing the functionality needed by clients.
It is clearly specified.
We discussed abstraction earlier; our goal here is to examine the latter two properties of a good interface.
By a narrow interface, we mean an interface that exposes few operations or other potential dependencies between modules. The opposite of a narrow interface is a wide interface, one that exposes many operations or potential dependencies between modules.
The choice between a narrow interface and a wide interface is not always obvious, because there are benefits to each approach. We can compare and contrast the philosophies:
| narrow | wide |
|---|---|
| few operations, limited functionality for clients to use | many operations, much functionality available for clients |
| easy to extend, maintain, reimplement | hard to extend, maintain, reimplement |
| loose coupling: clients less likely to be disrupted by changes | tight coupling: clients more likely to be disrupted by changes |
In principle, it's possible to make the interface so narrow that it interferes with clients getting their job done in an efficient and straightforward way. But this is not the usual mistake of software designers; more typically, they make interfaces too wide, leading to software that is hard to maintain, extend, and reimplement without breaking client code.
The rule of thumb, then, is that interfaces should be made only as wide as necessary for efficient client code to be written in a straightforward way.
Often when a narrow interface feels awkward to use, it is possible to address this problem by writing convenience methods that are implemented outside the module, using only the narrow interface that the module provides. Clients can then use the convenience functions to avoid code duplication, but without widening the interface and thereby introducing new dependencies between modules.
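As a minimal sketch of this idea, consider a stack whose interface has only three operations. The names IntStack, ArrayIntStack, and IntStacks are hypothetical and chosen only for illustration. The convenience methods live in a separate utility class and are written entirely in terms of the narrow interface, so they add no new dependency on the implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** A deliberately narrow stack interface (hypothetical): three operations only. */
interface IntStack {
    void push(int x);
    int pop();            // requires: !isEmpty()
    boolean isEmpty();
}

/** One possible implementation; clients never depend on it directly. */
class ArrayIntStack implements IntStack {
    private final Deque<Integer> items = new ArrayDeque<>();
    public void push(int x) { items.push(x); }
    public int pop() { return items.pop(); }
    public boolean isEmpty() { return items.isEmpty(); }
}

/** Convenience methods implemented outside the module, using only IntStack. */
final class IntStacks {
    private IntStacks() {}

    /** Pushes every element of xs onto s, in order. */
    static void pushAll(IntStack s, int... xs) {
        for (int x : xs) s.push(x);
    }

    /** Removes all elements of s and returns their sum. */
    static int drainSum(IntStack s) {
        int sum = 0;
        while (!s.isEmpty()) sum += s.pop();
        return sum;
    }
}
```

If ArrayIntStack is later reimplemented, IntStacks keeps working unchanged, because it never touches anything except push, pop, and isEmpty.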
When a module's interface is wide and there doesn't seem to be a way to avoid this by writing convenience functions or by separating the module into multiple modules, it is often a sign that you haven't managed to separate program concerns into different modules. When concerns are not sufficiently separated, there are inherently too many interactions between the different parts of the program to define a narrow interface between the components.
Object-oriented languages offer a nice pattern for separating convenience methods from
the core functionality of a class: convenience methods are factored out into an
abstract class that is intended to be subclassed to provide the missing core
functionality. The core functionality, corresponding to a narrow interface, is
provided by the subclass by overriding the unimplemented methods. This
implementation strategy is used extensively in the Java Collections Framework.
The collection classes offer a wide interface to client code, but most of their
methods are implemented by abstract classes such as
AbstractCollection, AbstractList,
AbstractQueue, AbstractSequentialList, and
AbstractSet. This implementation strategy keeps interfaces
narrower and separates concerns. Another nice feature is that the subclass can
even override convenience methods if there is a more efficient way to implement
them than the generic way in the context of this particular implementation.
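To illustrate the pattern, here is a small sketch of a class built on java.util.AbstractList. The class IntRange is a hypothetical example, not part of the framework. It supplies only the two methods that AbstractList leaves abstract, get and size, and inherits the rest of the List interface. It also overrides one convenience method, contains, because a range check is faster than the inherited linear scan.

```java
import java.util.AbstractList;
import java.util.List;

/** An unmodifiable list view of the integers lo, lo+1, ..., hi-1 (hypothetical example). */
class IntRange extends AbstractList<Integer> {
    private final int lo, hi;

    IntRange(int lo, int hi) {
        if (hi < lo) throw new IllegalArgumentException("hi < lo");
        this.lo = lo;
        this.hi = hi;
    }

    // The narrow core: AbstractList leaves only get and size unimplemented.
    @Override public Integer get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return lo + index;
    }

    @Override public int size() { return hi - lo; }

    // Optional: override a convenience method when a faster way exists.
    // The inherited contains() scans the list; here a range check suffices.
    @Override public boolean contains(Object o) {
        return o instanceof Integer && (Integer) o >= lo && (Integer) o < hi;
    }
}
```

With just these definitions, inherited operations such as indexOf, subList, iterator, and toString are already available on an IntRange.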
Once we've decided what operations and other functionality belong in an interface, what documentation should be added? An important principle can guide us here: documentation is code, code for humans to run. The documentation is a human-readable abstraction of the code that (depending on which documentation we're talking about) supports programmers writing client code or maintaining the implementation.
The most important function of documentation is to provide specifications of what code does. Specifications are particularly useful for supporting client code, but also help implementers.
According to the principle that documentation is code, the best place for documentation is with the code itself, in the form of program comments. When this is not practical, code documentation should be linked from code so it can be easily accessed. Javadoc documentation is a good example of this principle in action: the documentation is extracted from the code, so it cannot be separated from it.
Documenting code in separate documents may be appealing, but the further the documentation is from the code it describes, the more it tends to diverge from the code. The more it diverges from the code, the less useful it becomes and the less programmers rely on (or look at) the documentation. Both documentation and code require programmers' attention to stay fresh!
Too often, programmers write their documentation at the end of the design and implementation process, as a kind of afterthought. The workflow of design, coding, debugging, and documenting tends to look like the figure on the left. A lot of time is spent debugging because the design is not worked out carefully enough. In general, spending a lot of time debugging is a sign you haven't worked hard enough on the design.
Documenting the design early, as shown in the figure on the right, helps you work bugs out of your design and understand it better. Typically, this makes both coding and debugging faster. Sometimes your code just works on the first try!
The moral is that documentation is not some kind of esthetic decoration for your code. It is a tool that can improve your designs and save you time in the long run.
Know your audience: Tell your reader the things they need to know in a way they can understand. But your reader's attention is precious, so don't waste space on obvious, boring things. Avoid “filler” comments that add no value and distract from what's important, such as:
x = x + 1 // add one to x
Be coherent: avoid interruptions. Better to write one clear explanation than to intersperse explanatory comments throughout the code.
Respect the abstraction barrier: write specifications in terms the reader/client can understand without knowing the implementation.
/** A polynomial over a single variable x
* Example: 2 + 3x - 5.3x^3
*/
interface Poly {
...
}
Well-designed methods usually fall into one of three categories: creators (factory methods), observers (also known as queries), and mutators (also known as commands).
Abstractions that do not have any mutators that can change their state, such as
String and Integer, are immutable
abstractions. Abstractions with mutators are mutable. Both kinds of abstractions
have their uses. The advantage of immutable abstractions is that their objects can be
shared freely by different code modules.
The useful principle of command-query separation can guide how we design methods. The principle says that a given operation should fall into one of these three categories, rather than multiple categories. This makes the interface easier to use. For example, you don't want to be forced to have side effects in order to check the state of an object.
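A small sketch can make the difference concrete. Both classes below are hypothetical. In the first, the only way to read the counter also changes it; in the second, the observer and the mutator are separate operations, so a client can check the state as often as it likes without side effects.

```java
/** Mixes categories: the only way to read the value also changes it. */
class MixedCounter {
    private int count = 0;

    /** Returns the current count and then increments it (observer and mutator at once). */
    int nextValue() { return count++; }
}

/** Follows command-query separation: one observer, one mutator. */
class SeparatedCounter {
    private int count = 0;

    /** Observer: returns the current count, with no side effects. */
    int value() { return count; }

    /** Mutator: adds one to the count and returns nothing. */
    void increment() { count++; }
}
```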
Considering each of the categories in turn, we might come up with operations like the following:
Creators:
zero: create the zero polynomial.
monomial: create a polynomial of the form ax^b.
fromArray: create a polynomial with coefficients defined by an array of doubles.
derivative: create a polynomial that is the derivative of the given polynomial (also an observer).
plus: create a polynomial that is the sum of two polynomials.
minus: create a polynomial that is the difference of two polynomials.
Observers:
degree: report the maximum exponent with a non-zero coefficient.
coefficient: report one coefficient of the polynomial.
evaluate: evaluate the polynomial at a given value for x.
toString: generate a string representation of the polynomial.
equals: report equality of a polynomial with another object.
Mutators:
clear: set this to the zero polynomial.
add: add another polynomial to this.
Notice that we have not discussed how we are going to implement this polynomial abstraction. That is a good thing. We want to expose the operations that clients are going to need. We might have to make sacrifices because some operations are hard or expensive to implement, but that should be done only after thinking about the ideal interface.
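The operations listed above could be written down as Java signatures along the following lines. This is only one possible sketch: the parameter names and types are assumptions, and the creators that need no existing polynomial (zero, monomial, fromArray) appear only in a comment, since they would be constructors or static factory methods of whatever class eventually implements the interface.

```java
/** A polynomial over a single variable x, with double coefficients. */
interface Poly {
    // Creators that need no existing polynomial would be static factories
    // or constructors of an implementing class (assumed signatures):
    //   zero(), monomial(double a, int b), fromArray(double[] coefficients)

    // Creators that build a new polynomial from existing ones
    /** Returns: a new polynomial that is the derivative of this one. */
    Poly derivative();
    /** Returns: a new polynomial that is the sum of this and p. */
    Poly plus(Poly p);
    /** Returns: a new polynomial that is the difference of this and p. */
    Poly minus(Poly p);

    // Observers
    /** Returns: the largest exponent with a non-zero coefficient. */
    int degree();
    /** Returns: the coefficient of the term with exponent n. */
    double coefficient(int n);
    /** Returns: the value of this polynomial at x. */
    double evaluate(double x);
    /** Returns: a string representation of this polynomial. */
    String toString();
    /** Returns: whether o is a polynomial equal to this one. */
    boolean equals(Object o);

    // Mutators
    /** Sets this to the zero polynomial. */
    void clear();
    /** Adds p to this. */
    void add(Poly p);
}
```

Nothing here commits to a representation; the sketch also leaves open questions a full specification would have to answer, such as the degree of the zero polynomial.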
We want to avoid adding operations that we can implement efficiently using
existing operations. For example, we might be tempted to have an operation that
finds zeros of the polynomial. However, such an operation can probably be
implemented efficiently outside the abstraction, either by factoring (for
low-degree polynomials) or numerically via Newton's method, using evaluate and
derivative.
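To see that zero finding really can live outside the abstraction, here is a sketch of Newton's method written against a cut-down Poly interface that has only evaluate and derivative. The class ArrayPoly is a hypothetical implementation included only so the example can run, and PolyRoots, the tolerance, and the iteration limit are likewise assumptions made for illustration.

```java
/** Cut-down interface: only the two operations Newton's method needs. */
interface Poly {
    double evaluate(double x);
    Poly derivative();
}

/** Hypothetical array-backed implementation: coeffs[i] is the coefficient of x^i. */
final class ArrayPoly implements Poly {
    private final double[] coeffs;

    ArrayPoly(double... coeffs) { this.coeffs = coeffs.clone(); }

    public double evaluate(double x) {
        double result = 0.0;
        // Horner's rule, from the highest-order coefficient down
        for (int i = coeffs.length - 1; i >= 0; i--) result = result * x + coeffs[i];
        return result;
    }

    public Poly derivative() {
        if (coeffs.length <= 1) return new ArrayPoly(0.0);
        double[] d = new double[coeffs.length - 1];
        for (int i = 1; i < coeffs.length; i++) d[i - 1] = i * coeffs[i];
        return new ArrayPoly(d);
    }
}

/** Client-side root finding that uses only the Poly interface. */
final class PolyRoots {
    private PolyRoots() {}

    /**
     * Returns: an approximate zero of p found by Newton's method starting at
     * guess, or NaN if the derivative vanishes at an iterate or the method
     * does not reach |p(x)| <= tol within maxIter steps.
     */
    static double newton(Poly p, double guess, double tol, int maxIter) {
        Poly dp = p.derivative();
        double x = guess;
        for (int i = 0; i < maxIter; i++) {
            double fx = p.evaluate(x);
            if (Math.abs(fx) <= tol) return x;
            double slope = dp.evaluate(x);
            if (slope == 0.0) return Double.NaN;
            x -= fx / slope;   // Newton step
        }
        return Double.NaN;
    }
}
```

For example, PolyRoots.newton(new ArrayPoly(-2.0, 0.0, 1.0), 1.0, 1e-12, 50) approximates the square root of 2 as a zero of x^2 - 2, without the interface having to grow a zero-finding operation.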
Standard operations. Some operations are so useful that it is worth thinking about whether you will want them for every data abstraction you define:
equals. Testing whether two values are equal is fundamental to mathematics and to programming. Implementing the equals() method is surprisingly tricky, because the notion of equality itself is more complex than it might appear at first glance. There are two natural ways to define equality on two values:
Reference equality. Two references are equal if and only if they refer to the same object; this is what the default Object.equals() tests.
Leibniz equality. The mathematician Leibniz said that two things should be equal if and only if they are indistinguishable.
As a rule of thumb, it is best to use Leibniz equality when feasible. Since
mutable objects are only really indistinguishable when they are the same
object, this rule implies that mutable objects should simply inherit
Object.equals(). However, two immutable objects, such as
Strings, can be indistinguishable through all of their observer
operations and should in that case be considered equal.
toString. It is very useful for debugging to be able to
print a string representation of an object. Ideally, two objects should have
equal answers for toString() if and only if they are equal
according to equals.
hashCode. If you want to use an object as a key in a hash table,
it needs a hashCode() method that is consistent with equals(). Two objects that
are equal according to equals() must have the same hash code.
Two application-generated objects that are not equal according to
equals() should have different hash codes with high probability.
a copy constructor. For mutable abstractions (that is,
abstractions that have mutators), it is often useful to be able to make a copy of an
existing mutable object. Unless mutators are used on either the copy or the
original, the two should be indistinguishable. There should be no way to affect
the original by mutating the copy, or vice versa. Among other uses, copies are
handy for building test cases. In Java, a copying method can also be provided by overriding Object.clone().
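The following sketch puts these standard operations together. Both classes are hypothetical. Term is immutable, so it defines equals, hashCode, and toString over its observable state. TermList is mutable, so it simply inherits Object.equals() and offers a copy constructor instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Immutable value: two terms are equal when their observers cannot tell them apart. */
final class Term {
    private final int coefficient;
    private final int exponent;

    Term(int coefficient, int exponent) {
        this.coefficient = coefficient;
        this.exponent = exponent;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Term)) return false;
        Term t = (Term) o;
        return coefficient == t.coefficient && exponent == t.exponent;
    }

    // Equal objects must have equal hash codes, so hash the same fields equals compares.
    @Override public int hashCode() { return 31 * coefficient + exponent; }

    // Equal terms print identically; unequal terms print differently.
    @Override public String toString() { return coefficient + "x^" + exponent; }
}

/** Mutable abstraction: inherits Object.equals() and provides a copy constructor. */
final class TermList {
    private final List<Term> terms;

    TermList() { terms = new ArrayList<>(); }

    /** Copy constructor: the new list shares no mutable state with other. */
    TermList(TermList other) { terms = new ArrayList<>(other.terms); }

    void add(Term t) { terms.add(t); }    // mutator
    int size() { return terms.size(); }   // observer
}
```

Sharing the Term objects between the original and the copy is safe precisely because Term is immutable; only the mutable list itself has to be duplicated.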
Getters and setters. Getters are observer methods that report the contents of fields, and setters are mutators that change the values of fields to a given value. Both getters and setters allow access to your object's data, and care should be taken to allow access only in cases where it is appropriate. It is not a good idea to indiscriminately add getters and setters to classes you write. Think about whether the client needs read or write access to the object's data and what restrictions apply.
One can make a field public, but it is generally better to keep the field private and use a setter. Public fields allow completely unchecked access. With a setter, one can check to make sure data is of the correct form before updating. For example, if an integer field must never be negative, that check can be put in a setter.
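As a minimal sketch, here is a hypothetical Account class whose integer balance must never be negative. Because the field is private, the setter is the only way to change it, and the setter refuses values that would break the invariant.

```java
/** Hypothetical account whose balance, in cents, must never be negative. */
class Account {
    private long balanceCents = 0;   // class invariant: balanceCents >= 0

    /** Observer: returns the current balance in cents. */
    long getBalanceCents() { return balanceCents; }

    /** Mutator: sets the balance, rejecting values that would break the invariant. */
    void setBalanceCents(long cents) {
        if (cents < 0) {
            throw new IllegalArgumentException("balance must be non-negative: " + cents);
        }
        balanceCents = cents;
    }
}
```

With a public field, a client could write a negative balance directly and nothing would notice.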
One pitfall with getters to be aware of is that they may inadvertently allow modification of the data by exposing mutable state from inside the abstraction. For example, suppose we store the coefficients of a polynomial as an array in the instance variable coefficients and provide a getter:
private double[] coefficients;
public double[] getCoefficients() {
return coefficients;
}
Now a client can get access to the internal array of coefficients and change the polynomial in an unconstrained way, possibly breaking its class invariant. To avoid this, a copy of the array must be returned.
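One way to return a copy is sketched below, using Arrays.copyOf. The class name ArrayPoly and its constructor are assumptions added so the example is self-contained; the constructor also copies the array it is given, for the same reason.

```java
import java.util.Arrays;

/** Hypothetical polynomial class storing coefficients in an array. */
class ArrayPoly {
    private double[] coefficients;

    /** Copies the argument so the caller keeps no reference to internal state. */
    ArrayPoly(double... coefficients) {
        this.coefficients = coefficients.clone();
    }

    /** Returns: a fresh copy of the coefficients; changing it does not affect this polynomial. */
    public double[] getCoefficients() {
        return Arrays.copyOf(coefficients, coefficients.length);
    }
}
```

The copy costs time and memory on every call, which is one more reason to ask whether clients need the whole array or only an observer such as coefficient.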
The moral is that some observers and some mutators may look like getters and setters, but you should choose observers and mutators that are appropriate to the abstraction being designed rather than reflexively exporting access to all instance variables.
Another operation that is overrated is the default constructor. It is the job of constructors to create a properly initialized object. Unfortunately, one often sees code that uses a default constructor to create the object, then initializes the fields using setters. A better approach is to use a constructor for initialization.
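For instance, a hypothetical Point class might take both coordinates in its constructor, as in the sketch below. No client can ever observe a half-initialized Point, and the fields can be final, which a default constructor followed by setters would not allow.

```java
/** A 2D point that is fully initialized by its constructor (hypothetical example). */
final class Point {
    private final double x;
    private final double y;

    /** Creates the point (x, y). */
    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }

    double getX() { return x; }
    double getY() { return y; }
}
```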
We have a rough idea of the operations we want to support. But before we start implementing, we should write clear, precise specifications so that we know when we have implemented the operations properly.
For each method, we need to define a signature that gives the types of the parameters, the type of the return value, and the possible exceptions that could be thrown. We should also define a specification (spec) that describes what the client needs to know (and is allowed to know) about the behavior of the method.
For example, we might write a spec for the degree method as follows:
/** Returns: the degree of the polynomial, the largest exponent with a non-zero
 *  coefficient. */
int degree();
To help us construct a good spec, it is useful to think of the spec as being composed of various clauses. These cover different things the client needs to know about:
The key to writing good specs is to think of the spec as a contract between the client and the implementer. Like a legal contract, its main goal is to help everyone figure out who to blame when things go wrong. This is very important for successful software engineering, especially in a large team.
Javadoc doesn't completely support the clauses we have described so far, although there are ongoing efforts to improve it. If you want to use Javadoc to generate HTML documentation, you will need to adapt this documentation strategy accordingly. The key is not that you need to have explicit clauses, but that you should know for each thing you write in the comment which clause it belongs to.
Now we are ready to implement our specifications. And we will want to write some documentation of that implementation.