CS 3110 Lecture 7
Modular Programming: Modules and Signatures

We've been building very small programs. When a program is small enough, we can keep all of the details of the program in our heads at once. Real application programs are 100 to 10,000 times larger than any program you have likely written or worked on; they are simply too large and complex to hold all their details in our heads. They are also written by multiple authors because otherwise it would take too long. Building large software systems requires techniques we haven't talked about so far.

One key solution to managing complexity of large software is modular programming: the code is composed of many different code modules that are developed separately. This allows different developers to take on discrete pieces of the system and design and implement them without having to understand all the rest. However, to build large programs out of modules effectively, we need to be able to write code modules that we can convince ourselves are correct in isolation from the rest of the program. Rather than have to think about every other part of the program when developing a code module, we need to be able to use local reasoning: that is, reasoning about just the module and the contract it needs to satisfy with respect to the rest of the program. If everyone has done their job, separately developed code modules can be plugged together to form a working program without every developer needing to understand everything done by every other developer in the team. This is the idea of modular programming.

Therefore, to build large programs that work we must use abstraction to make it manageable to think about the program. Abstraction is simply the removal of detail. A well-written program has the property that we can think about its components (such as functions) abstractly, without concerning ourselves with all the details of how those components are implemented.

Modules are abstracted by giving specifications of what they are supposed to do. A good module specification is clear, understandable, and gives just enough information about what the module does for clients to successfully use it. This abstraction makes the programmer's job much easier; it is helpful even when there is only one programmer working on a moderately large program, and it is crucial when there is more than one programmer.

Languages often contain mechanisms that support modules directly. OCaml is one of those languages, as we'll see shortly. Object-oriented languages support modular programming with classes. In general, a module specification is known as an interface, which abstracts the module implementation. This general sense of “interface” should not be confused with the Java interface mechanism, although a Java interface can indeed serve as one. Even just the public methods of a class constitute an interface in the more general sense -- an abstract description of what the module can do.

Once we have defined a module and its interface, developers working with the module take on distinct roles. Ideally, most developers are clients of the module who understand the interface but do not need to understand the implementation of the module. A developer who works on the module implementation is an implementer. The module interface is a contract between the client and the implementer, defining the responsibilities of both. Contracts are very important because they allow us to figure out where the problem is when something goes wrong!

Good interface design practice involves both clients and implementers in the specification of a module's interface. Interfaces designed solely by implementers (an easy trap to fall into, as they are responsible for building the module) are generally seriously deficient. Changing interfaces gets more and more difficult as the development process proceeds -- the interfaces are what was agreed upon so that separate developers or teams can work independently. Thus getting interfaces right is important. Equally important is specifying what interfaces do in an unambiguous manner. In OCaml the signature is part of writing an unambiguous specification but is by no means the whole story, as we will see shortly. While beyond the scope of this course, IDLs (Interface Description, or Definition, Languages) are used to specify interfaces in a language-independent way, so that different modules do not even need to be implemented in the same language.

In modular programming, modules are used only through their declared interfaces, which the language may help enforce. This is true even when the client and the implementer are the same person. Modules decouple the system design and implementation problem into separate tasks that can be carried out largely independently. When modules are used only through their interface, the implementer has the flexibility to change the module as long as the module still satisfies its interface. The interface ensures that the module is loosely coupled to its clients. Loose coupling gives implementers and clients the freedom to work on their code mostly independently, and it also means that changes in one code module are less likely to require changes to others.

Abstraction mechanisms

We will be concerned with two abstraction mechanisms: modules and signatures.

Modules

Modules in OCaml are implemented by module declarations that have the following syntax:

module ModuleName = struct implementation end

The module name ModuleName must begin with an upper case letter.

Modules partition the namespace, so that any symbol x that is bound in the implementation of a module named Module must be referenced by the qualified name Module.x outside the implementation of the module (unless the namespace has been exposed using open).
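As a minimal sketch of qualified names, consider the following. The module Geometry and its contents are hypothetical names chosen only for illustration; the example shows the same function reached once by its qualified name and once after a local open.

```ocaml
(* Hypothetical module, used only to illustrate name qualification. *)
module Geometry = struct
  let pi = 3.14159
  let area r = pi *. r *. r
end

(* Outside the module, area must be written with its qualified name. *)
let a1 = Geometry.area 2.0

(* A local open exposes the module's names inside one expression only. *)
let a2 =
  let open Geometry in
  area 2.0
```

A local open keeps the exposed names confined to one expression, which avoids accidentally shadowing other names in the rest of the file.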

The implementation of a module can contain type definitions, exception definitions, let definitions, open statements to open the namespace of another module, include statements to include the contents of another module and signature definitions.
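The following sketch shows several of these kinds of definitions together. The modules Stack and CountedStack are hypothetical examples, not part of any library; the second module uses include to reuse everything from the first and then adds one function.

```ocaml
(* Hypothetical example: a module containing a type definition,
   an exception definition and several let definitions. *)
module Stack = struct
  type 'a t = 'a list
  exception Empty
  let empty = []
  let push x s = x :: s
  let top = function
    | [] -> raise Empty
    | x :: _ -> x
  let pop = function
    | [] -> raise Empty
    | _ :: rest -> rest
end

(* include copies every definition of Stack into the new module. *)
module CountedStack = struct
  include Stack
  let size s = List.length s
end

let s = CountedStack.push 2 (CountedStack.push 1 CountedStack.empty)
let n = CountedStack.size s
let t = CountedStack.top s
```

Because Stack has no signature, its representation type is visible, which is why size can apply List.length directly.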

Unlike functions, which are first-class values, modules (and the signatures discussed below) are not first-class values in the core OCaml language: a module cannot be passed as an argument to an ordinary function or returned as its result. (Recent versions of OCaml do allow a module to be packaged explicitly as a first-class value, but we will not use that feature here.)

Signatures

To successfully develop large programs, we need more than the ability to group related operations together in a module. We need to be able to use the compiler to enforce the separation between different modules, so that clients cannot depend on, or interfere with, the internal details of a module. Signatures are the mechanism that enforces this separation.

Signature declarations have the following syntax:

module type SIGNAME = sig definitions end

By convention the signature name SIGNAME is written in all capital letters. The definitions of a signature declare a set of types and values that any module implementing it must provide. They may be type definitions, val declarations that give the type of a name, and exception definitions that specify exceptions the module can raise.

A module that implements a particular signature specifies the name of that signature in its definition, after the module name and separated by a : as with types. The signature must be defined before the module is defined.

module ModuleName : SIGNAME = struct implementation end

A module that implements a signature must specify concrete types for the abstract types in the signature and must provide all the declarations in the signature. Outside the module, a type declared abstractly in the signature is visible only as an abstract type (its representation is hidden), and only the declarations in the signature are accessible. For instance, functions defined in the implementation but not listed in the signature are not accessible.
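A small sketch of this hiding, using a hypothetical COUNTER signature and Counter module invented for illustration: the type t is abstract to clients, and the helper step is not listed in the signature, so neither the representation nor the helper can be used from outside.

```ocaml
(* Hypothetical signature: t is abstract, so clients cannot see
   how counters are represented. *)
module type COUNTER = sig
  type t
  val zero : t
  val incr : t -> t
  val to_int : t -> int
end

module Counter : COUNTER = struct
  type t = int              (* concrete type for the abstract t *)
  let zero = 0
  let step = 1              (* helper: not in COUNTER, so hidden *)
  let incr c = c + step
  let to_int c = c
end

(* Clients can only go through the operations in the signature. *)
let two = Counter.to_int (Counter.incr (Counter.incr Counter.zero))

(* Both of the following would be rejected by the type checker:
     Counter.step        (step is not part of COUNTER)
     Counter.zero + 1    (Counter.t is not known to be int)   *)
```

Since clients cannot observe that t is an int, the implementer remains free to change the representation later without breaking client code.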

For example, here is a signature for a simple set data abstraction, together with a particular implementation of that interface using lists:

(* Sets with union and intersection
 * mem x (add x S) = true
 * mem x (rem x S) = false
 * mem x (union S1 S2) = (mem x S1) || (mem x S2)
 * mem x (inter S1 S2) = (mem x S1) && (mem x S2)
 *)

module type SETSIG = sig
  type 'a set
  val empty : 'a set
  val add : 'a -> 'a set -> 'a set
  val mem : 'a -> 'a set -> bool
  val rem : 'a -> 'a set -> 'a set
  val union: 'a set -> 'a set -> 'a set
  val inter: 'a set -> 'a set -> 'a set
end

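module Set : SETSIG = struct
  (* Representation: an unordered list that may contain duplicates *)
  type 'a set = 'a list
  let empty = []
  let add x l = x :: l
  let mem x l = List.mem x l
  (* filter removes every copy of x, so duplicates are harmless *)
  let rem x l = List.filter (fun h -> h <> x) l
  let union l1 l2 = l1 @ l2
  let inter l1 l2 = List.filter (fun h -> List.mem h l2) l1
end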
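Because clients see only the signature, the implementation can be replaced without changing client code. The sketch below is one possible alternative, assuming we prefer lists without duplicates; the module name NoDupSet is hypothetical, and the signature is repeated only so that the block compiles on its own.

```ocaml
(* Signature repeated from the text so this block is self-contained. *)
module type SETSIG = sig
  type 'a set
  val empty : 'a set
  val add : 'a -> 'a set -> 'a set
  val mem : 'a -> 'a set -> bool
  val rem : 'a -> 'a set -> 'a set
  val union : 'a set -> 'a set -> 'a set
  val inter : 'a set -> 'a set -> 'a set
end

(* Hypothetical alternative: a list that never holds duplicates. *)
module NoDupSet : SETSIG = struct
  type 'a set = 'a list
  let empty = []
  let mem x l = List.mem x l
  let add x l = if mem x l then l else x :: l
  let rem x l = List.filter (fun h -> h <> x) l
  (* add each element of l1 to l2, skipping ones already present *)
  let union l1 l2 = List.fold_left (fun acc x -> add x acc) l2 l1
  let inter l1 l2 = List.filter (fun h -> mem h l2) l1
end

(* Client code written only against the operations in SETSIG. *)
let s1 = NoDupSet.(add 1 (add 2 empty))
let s2 = NoDupSet.(add 2 (add 3 empty))
let in_union = NoDupSet.(mem 3 (union s1 s2))
let in_inter = NoDupSet.(mem 1 (inter s1 s2))
let after_rem = NoDupSet.(mem 1 (rem 1 s1))
```

The client code at the bottom uses only the names declared in SETSIG, so it would work unchanged with any other module that implements the same signature and satisfies the same specification.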