# Specification of Modules

Let us now consider the use of specification in module implementations. The first question we must ask ourselves is who is going to read the comments written in module implementations. Because we are going to work hard to allow module clients to program against the module while reading only its interface, clearly clients are not the intended audience. Rather, the purpose of implementation comments is to explain the implementation to other implementers or maintainers of the module. This is done by writing comments that convince the reader that the implementation correctly implements its interface.

It is inappropriate to copy the specifications of functions found in the module interface into the module implementation. Copying runs the risk of introducing inconsistency as the program evolves, because programmers don't keep the copies in sync. Copying code and specifications is a major source (if not the major source) of program bugs. In any case, implementers can always look at the interface for the specification.

Implementation comments fall into two categories. The first category arises because a module implementation may define new types and functions that are purely internal to the module. If their significance is not obvious, these types and functions should be documented in much the same style that we have suggested for documenting interfaces. Often, as the code is written, it becomes apparent that the new types and functions defined in the module form an internal data abstraction or at least a collection of functionality that makes sense as a module in its own right. This is a signal that the internal data abstraction might be moved to a separate module and manipulated only through its operations.

The second category of implementation comments is associated with the use of data abstraction. Suppose we are implementing an abstraction for a set of items of type 'a. The interface might look something like this:

(** A set is an unordered collection in which multiplicity is ignored. *)
module type Set = sig

(** the type representing a set whose elements are type ['a] *)
type 'a set

(** the set containing no elements *)
val empty : 'a set

(** [mem x s] is whether [x] is a member of set [s] *)
val mem : 'a -> 'a set -> bool

(** [add x s] is the set containing all the elements of [s]
as well as [x]. *)
val add : 'a -> 'a set -> 'a set

(** [rem x s] is the set containing all the elements of [s],
minus [x]. *)
val rem : 'a -> 'a set -> 'a set

(** [size s] is the cardinality of [s] *)
val size: 'a set -> int

(** [union s1 s2] is the set containing all the elements that
are in either [s1] or [s2]. *)
val union: 'a set -> 'a set -> 'a set

(** [inter s1 s2] is the set containing all the elements that
are in both [s1] and [s2]. *)
val inter: 'a set -> 'a set -> 'a set
end


In a real signature for sets, we'd want operations such as map and fold as well, but let's omit these for now for simplicity. There are many ways to implement this abstraction. One easy way is as a list:

(* Implementation of sets as lists with duplicates *)
module ListSetDups : Set = struct
type 'a set = 'a list
let empty = []
let mem = List.mem
let add x l = x :: l
let rem x = List.filter ((<>) x)
let rec size = function
| [] -> 0
| h :: t -> size t + (if mem h t then 0 else 1)
let union l1 l2 = l1 @ l2
let inter l1 l2 = List.filter (fun h -> mem h l2) l1
end


This implementation has the advantage of simplicity. For small sets that tend not to have duplicate elements, it will be a fine choice. Its performance will be poor for large sets or applications with many duplicates but for some applications that's not an issue.

Notice that the types of the functions do not need to be written down in the implementation. They aren't needed because they're already present in the signature, just like the specifications that are also in the signature don't need to be replicated in the structure.

Here is another implementation of Set that also uses 'a list but requires the lists to contain no duplicates. This implementation is also correct (and also slow for large sets). Notice that we are using the same representation type, yet some important aspects of the implementation are quite different.

(* Implementation of sets as lists without duplicates *)
module ListSetNoDups : Set = struct
type 'a set = 'a list
let empty = []
let mem = List.mem
let add x l = if mem x l then l else x :: l
let rem x = List.filter ((<>) x)
let size = List.length (* size is just length if no duplicates *)
let union l1 l2 = (* check if already in other set *)
List.fold_left (fun a x -> if mem x l2 then a else x :: a) l2 l1
let inter l1 l2 = List.filter (fun h -> mem h l2) l1
end


An important reason why we introduced the writing of function specifications was to enable local reasoning: once a function has a spec, we can judge whether the function does what it is supposed to without looking at the rest of the program. We can also judge whether the rest of the program works without looking at the code of the function. However, we cannot reason locally about the individual functions in the three module implementations just given. The problem is that we don't have enough information about the relationship between the concrete type (int list) and the corresponding abstract type (set). This lack of information can be addressed by adding two new kinds of comments to the implementation: the abstraction function and the representation invariant for the abstract data type. We turn to discussion of those, next.