Developing Software in Groups
CS 412/413, Spring 2000
Andrew Myers

Like most medium to large software projects, compilers are generally developed by groups about the size of your project group. In fact, this size is nearly optimal for developing such projects -- if the group functions effectively. One of the goals of this class is to give you experience doing real software development; an important part of that experience is learning how to make a small group work.

I thought it might be helpful to share a few lessons about working in groups that I have learned during a number of group projects in industry and academia. A group project like this will be successful only if every member of the group is doing useful work. The projects in this course are very difficult, if not impossible, for a single person to finish in the time allotted. So your goal is to organize the project development so that everyone is helping to make the project happen. The strategies suggested below are largely common sense, but do think about whether you could improve your development process by applying them. I also recommend the classic The Mythical Man-Month, by Fred Brooks, a great source of ideas and cautionary tales about group projects.

Partition the project with interfaces
It is very important to figure out how to cut a software project into pieces that can be designed and worked on separately, in parallel. When you divide up the coding, define the interfaces and data structures that the various pieces share with each other. It is very valuable to have at least a rough cut of these interfaces before doing a more detailed design. As the interfaces are refined or corrected, the changes must be communicated to, and discussed with, everyone whose code depends on them.
For example, in this project there is an obvious division between syntactic analysis and semantic analysis. The shared interface is the AST data structure, and it should be designed carefully because it is the output of the first phase and the input to the second. Once it is defined, the two parts of the project can be worked on almost independently.
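As an illustration of partitioning along the AST, here is a minimal Java sketch of a shared interface. All of the names (Expr, Visitor, IntLit, Add, NodeCounter) are hypothetical and are not taken from the course project; a real AST would have many more node kinds. The point is only that the parser side needs the constructors and the analysis side needs the visitor, so each can be written against the interface without seeing the other's code.

```java
import java.util.Objects;

// Hypothetical shared interface: the parser constructs these nodes,
// and later phases consume them only through Visitor.
interface Expr {
    <R> R accept(Visitor<R> v);
}

interface Visitor<R> {
    R visitIntLit(IntLit e);
    R visitAdd(Add e);
}

final class IntLit implements Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    @Override public <R> R accept(Visitor<R> v) { return v.visitIntLit(this); }
}

final class Add implements Expr {
    final Expr left;   // never null
    final Expr right;  // never null
    Add(Expr left, Expr right) {
        this.left = Objects.requireNonNull(left);
        this.right = Objects.requireNonNull(right);
    }
    @Override public <R> R accept(Visitor<R> v) { return v.visitAdd(this); }
}

// Stand-in for a semantic-analysis pass (assumed for illustration);
// it depends only on the interface above, not on the parser.
final class NodeCounter implements Visitor<Integer> {
    @Override public Integer visitIntLit(IntLit e) { return 1; }
    @Override public Integer visitAdd(Add e) {
        return 1 + e.left.accept(this) + e.right.accept(this);
    }
}

class AstSketch {
    public static void main(String[] args) {
        // What a parser might build for the input "1 + 2".
        Expr tree = new Add(new IntLit(1), new IntLit(2));
        System.out.println(tree.accept(new NodeCounter()));
    }
}
```

With an interface like this agreed on early, one subgroup can hand-build trees to test its analysis pass while the other is still finishing the parser.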
Document interfaces and invariants with specifications
When these interfaces are defined, it is very important to document anything that is not obvious or that might be forgotten. A little effort spent writing down this kind of information pays off even on a single-person project; on a three- or four-person project the payoff is tremendous.
As a corollary, when you have to figure something out about some code that you or someone else has written, it should be written down in the code at the time you figure it out. This will prevent other people from having to waste time figuring it out too.
Invariants on data structures (classes) are very important to document, because you will need to know them when extending the data structure. A representation invariant for a class states any restrictions on the instance variables (fields) of that class: conditions that every method of the class may assume and must preserve. Knowing the invariant helps when the code is modified, because the rules that new code must follow are explicit. When the representation invariant is not obvious, it should be recorded.
Another useful piece of a data structure specification is the abstraction function: a mapping from the concrete representation (the instance variables) to the abstract value represented by that representation. The abstraction function needs to be defined only for representations that satisfy the representation invariant.
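To make the two ideas concrete, here is a small Java sketch of a class whose representation invariant and abstraction function are recorded as comments. The class IntSet and its repOk method are hypothetical examples, not part of the course project; repOk is simply one common way to make the invariant checkable during testing.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

/** A mutable set of integers (hypothetical example class). */
final class IntSet {
    // Representation invariant:
    //   elems contains no duplicate values.
    // Abstraction function:
    //   AF(elems) = { elems.get(i) | 0 <= i < elems.size() }
    private final List<Integer> elems = new ArrayList<>();

    /** Adds x to this set; has no effect if x is already present. */
    void add(int x) {
        // Checking first is what preserves the no-duplicates invariant.
        if (!elems.contains(x)) {
            elems.add(x);
        }
    }

    /** Removes x from this set; has no effect if x is absent. */
    void remove(int x) {
        // Removing one occurrence suffices only because of the invariant.
        elems.remove(Integer.valueOf(x));
    }

    /** Returns true if x is in this set. */
    boolean contains(int x) {
        return elems.contains(x);
    }

    /** Returns the number of elements in this set. */
    int size() {
        // Correct only because the invariant rules out duplicates.
        return elems.size();
    }

    /** Returns true if the representation invariant holds (for testing). */
    boolean repOk() {
        return new HashSet<>(elems).size() == elems.size();
    }
}

class IntSetSketch {
    public static void main(String[] args) {
        IntSet s = new IntSet();
        s.add(3);
        s.add(3);
        s.add(5);
        System.out.println(s.size() + " " + s.repOk());
    }
}
```

Note how remove and size are only correct given the invariant; someone extending the class without that comment could easily add a method that breaks them.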
Convince others your code is correct
Correctness: Once you have written down a specification for what a class is supposed to do, you now have a powerful tool: you can talk about whether a given piece of code is correct -- without reference to the system that it is a part of! When bugs arise, there are two possibilities: either the specification is wrong, or the code does not meet the specification. Since the latter case is very common, this means that bugs are less likely to cause redesign of large parts of the system -- changes will tend to be encapsulated within an interface boundary. In addition, when a bug arises, you will more often know whose job it is to fix it.
Code Review: A very important process is arguing that a piece of code is correct. Software is much more likely to work properly the first time if the person who writes the code must explain to someone else why it is that the code actually meets its specification. If you can't tell whether a piece of code is correct, it is probably because the specification the code is supposed to meet is not clear; in this case, you have learned that the specification needs refinement. This process of arguing for correctness saves immense time in debugging, particularly when applied to code that is tricky in any way. For completely trivial code, it may be a waste of time. However, a compiler contains very little trivial code.
Pilot/Co-pilot: A good trick for writing code and simultaneously doing this argument for correctness is the pilot/co-pilot or pair programming model. Two people code together, with the co-pilot looking over the pilot's shoulder. They discuss interfaces and work up specifications together; the pilot writes the code and has the burden of convincing the co-pilot that the code meets the specification. In cases where one member of the team is more familiar with the kind of coding involved, that person should be the co-pilot, not the pilot, so that he or she can draw on that experience to provide higher-level guidance on the code being written. In 4-person groups, it often works well to split into two pilot/co-pilot subgroups. It is also useful to rotate who you work with in this arrangement -- everyone learns more that way. (For a nice article on this approach, see "All I really need to know about pair programming I learned in kindergarten.")
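As a small illustration of arguing that code meets its specification, as described under Correctness and Code Review above, consider the following Java sketch. The class Search and the method find are hypothetical; the specification is written as a comment, and the loop invariant is the core of the argument a pilot would make to a co-pilot or reviewer.

```java
final class Search {
    /**
     * Requires: a is sorted in ascending order.
     * Returns: an index i such that a[i] == key,
     *          or -1 if key does not occur in a.
     */
    static int find(int[] a, int key) {
        int lo = 0;
        int hi = a.length - 1;
        // Loop invariant: if key occurs in a, it occurs within a[lo..hi].
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            if (a[mid] == key) {
                return mid;
            }
            if (a[mid] < key) {
                lo = mid + 1;  // sortedness: key cannot be in a[lo..mid]
            } else {
                hi = mid - 1;  // sortedness: key cannot be in a[mid..hi]
            }
        }
        // lo > hi, so a[lo..hi] is empty; by the invariant, key is not in a.
        return -1;
    }
}

class SearchSketch {
    public static void main(String[] args) {
        int[] a = {1, 3, 5, 7};
        System.out.println(Search.find(a, 5));
    }
}
```

The argument refers only to the specification and the code of find, not to any caller. If a bug later shows up in code that uses find, either a caller violated the "Requires" clause or find fails its "Returns" clause, and it is clear who has to fix it.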
Prepare for group meetings
Group meetings are a very important part of a group development effort, and should be treated seriously by everyone. They can be an efficient way to communicate information to everyone in the group and to arrive at consensus about how to develop the project. However, it is also easy for group meetings to be unproductive because there is no concrete material to discuss. Avoid vague discussions; when such a discussion seems to be happening, it is time for people to go off and think about the problem alone or in a subgroup. The best meetings are ones in which some members have worked up straw man proposals that the group can sink its teeth into and refine. The best approach is to delegate the issues that need to be figured out to different people in the group. These people then present a proposal or proposals, and the group discusses them and decides whether further off-line work is required or whether a solution has been arrived at and understood by all. This approach uses the energies of all of the group members effectively.
