A major topic of this course is object-oriented design. We use an object-oriented language, Java, as a vehicle for exploring object-oriented design. We assume some prior familiarity with Java, but will focus on how to use it in an object-oriented way.
It is useful to distinguish between object-oriented (OO) languages and the object-oriented programming model. A programming model is an approach to solving programming problems. There are many programming models and variants thereof. For example, in addition to the object-oriented model, there is a functional programming model that you will learn about in CS 3110. And there are variations on the object-oriented model such as the event-driven model.
Some languages are designed to support some programming models better than others, and it makes sense to use an OO language like Java for learning OO design. But this course is not primarily about Java. It is a course about object-oriented design (and other computer science topics), and the lessons you learn about object-oriented design should apply to other programming languages.
What makes a language object-oriented? It should support the essential elements of object-oriented programming. There are three key elements, which we will discuss in the following three lectures:
We'll start by talking about encapsulation, an idea that Java supports with the keywords public, private, and protected. The use of these keywords prevents code and data from being used, which may seem strange: why is it important to take power away from the programmer? The answer is that limiting access supports modularity.
In early programming languages, the information manipulated by the program got short shrift. Programs were organized around the algorithms doing the computation, as illustrated in Figure 1. Those algorithms had full access to the data that they were computing with. This is the procedural model of programming. It involves building bigger procedures out of smaller ones, creating procedural abstractions that could be reused in a powerful way.
As software systems became increasingly complex, the procedural model stopped working well. Procedural abstraction by itself did not scale up to big software systems. The problem was that the procedural model offers no control over which program code can access a given part of the data. Code can reach into the program data and use it or update it in arbitrary ways. Working on a team of programmers is difficult in the procedural model because the different parts of the code tend to be tightly coupled. A bug in one software component can corrupt program data and look like a bug in a different component. Code is hard to maintain and to evolve without breaking it, and changes to the way data is represented in the program tend to affect all of the program code rather than being isolated to a small part of the code.
To address these problems, modular programming was developed. (Object-oriented programming extends modular programming with some additional ideas that we get to soon.) The idea of modular programming is that the software should be broken up into distinct modules that can be developed relatively independently. A good modular design respects the principle of separation of concerns, which says that different aspects of the design should be designed separately. With a good modular design, changes can be made to one module without changing other modules, and it is relatively easy for programmers to know whether their changes will affect other modules that are perhaps owned by other programmers. Separation of concerns is strengthened by information hiding in which modules do not take advantage of knowledge about how other modules are implemented. Information hiding provides loose coupling that tends to make code evolution easier. In a loosely coupled system, changes to the way information is represented or modules are implemented tend to propagate less to other modules, much as loosely coupled train cars can start moving without trying to move the whole train.
A key insight in programming language design was that modular programming can be enforced by a language mechanism that lets a module encapsulate its state and behavior, enforcing information hiding and controlling how other modules can access it. This approach is suggested by Figure 2. Code outside a module cannot directly access the data that is internal to the module. Any access from outside to a module's data must occur via the module's code, and only via the code that the module chooses to expose to the outside. Access to the data is mediated by this public code.
By client code we mean code outside the module that uses the module through a set of publicly exposed operations. We call these operations the interface of the module, which should not be confused with the Java language feature interface (which is, however, an example of the more general interface idea, and one we will return to).
The interface that a module exposes to client code is a kind of contract with the rest of the program. The idea of modular programming is that if every module lives up to its contract, the whole program will work correctly. Programmers can then think about and program each module in isolation from the rest. Instead of thinking about the correctness of the entire program, a bewilderingly complex problem, they can just think about the correctness of the particular module they are working on now. This nice property of modular programming is called local reasoning.
Modules also make it possible to use data from other modules without knowing exactly how that data is represented. All they have to know is what operations (from the public interface) can be performed on the data. The data is opaque to the client code, which means that the module implementer is free to change how the data is represented because no client code can depend on the precise representation. This powerful idea is called data abstraction. The word abstraction refers to the idea of hiding inessential detail. In this case, the inessential detail is, for the client code, the precise way that information is represented inside the module.
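As a small illustration of data abstraction, here is a minimal sketch of a module whose representation is opaque to its clients. The class IntStack and its client IntStackClient are hypothetical names chosen for this example; the point is only that the client uses the public operations and cannot depend on the array inside.

```java
import java.util.Arrays;

/** A hypothetical module: a stack of ints. Clients see only push, pop, and isEmpty. */
class IntStack {
    // Hidden representation. It could be replaced by, say, a linked list
    // without changing any client code, because no client can name these fields.
    private int[] elems = new int[4];
    private int size = 0;

    /** Adds x to the top of the stack. */
    public void push(int x) {
        if (size == elems.length) {
            elems = Arrays.copyOf(elems, 2 * size); // grow the hidden array
        }
        elems[size++] = x;
    }

    /** Removes and returns the top element. The stack must not be empty. */
    public int pop() {
        return elems[--size];
    }

    /** Returns whether the stack has no elements. */
    public boolean isEmpty() {
        return size == 0;
    }
}

class IntStackClient {
    public static void main(String[] args) {
        IntStack s = new IntStack();
        s.push(1);
        s.push(2);
        System.out.println(s.pop()); // prints 2
        // s.size = 0;               // would not compile: size is private to IntStack
    }
}
```

Because the client never mentions elems or size, the implementer of IntStack remains free to change them.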
In an object-oriented language like Java, encapsulation and data abstraction are primarily provided by classes, though packages are also used as an encapsulation mechanism. A class and its code are shared by all objects of that class (the instances of the class), and the class's code can mediate access to all information stored in instances. For example, suppose we want objects that act like rational numbers, allowing us to write code like the following, using a class Rational:
Main.java
A class implementing rational numbers is shown here.
Rational.java. Example: Rational numbers
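The following is a minimal sketch of what such a class and its client code might look like. It assumes the fields p and q, the methods add, plus, equals, and toString, and the reduced-form invariant described in this section. The helper gcd, the hashCode method, and the client class RationalDemo are hypothetical additions for this sketch.

```java
/** A rational number p/q, where p is the numerator and q is the denominator. */
class Rational {
    // Class invariant: q > 0, and p and q are relatively prime.
    private final int p;
    private final int q;

    /** Creates the rational number num/den. Requires: den != 0. */
    public Rational(int num, int den) {
        int g = gcd(Math.abs(num), Math.abs(den));
        if (den < 0) g = -g;   // dividing by a negative g makes q positive
        p = num / g;
        q = den / g;
    }

    /** Returns: the sum of this and r. */
    public Rational add(Rational r) {
        return new Rational(p * r.q + q * r.p, q * r.q);
    }

    /** Returns: the sum of r1 and r2. */
    public static Rational plus(Rational r1, Rational r2) {
        return r1.add(r2);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Rational)) return false;
        Rational r = (Rational) o;
        return p == r.p && q == r.q;   // sound only because of the invariant
    }

    @Override
    public int hashCode() {
        return 31 * p + q;
    }

    @Override
    public String toString() {
        return q == 1 ? Integer.toString(p) : p + "/" + q;
    }

    /** Returns: the greatest common divisor of a and b. Requires: a, b >= 0, not both 0. */
    private static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }
}

class RationalDemo {
    public static void main(String[] args) {
        Rational r1 = new Rational(1, 2);
        Rational r2 = new Rational(2, -6);                  // stored as -1/3
        Rational r3 = r1.add(r2);
        System.out.println(r3);                             // prints 1/6
        System.out.println(r3.equals(new Rational(2, 12))); // prints true
    }
}
```

This sketch uses int arithmetic throughout, so large numerators and denominators can overflow.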
There are many interesting things going on in this implementation. We start out with a very important comment, which we call the class overview. This describes how client code programmers should think about the values of class Rational. To the client, the objects are simply rational numbers, with a numerator p and a denominator q. The overview also gives the client a notation for talking about these objects abstractly, as a fraction p/q. Having a notation for objects of the class is helpful for expressing the specifications of the methods.
The data in Rational objects is stored in the fields p and q, both of type int. This is just one possible representation of rational numbers. For example, it would probably be better to make the types of these fields long or even to use BigInteger. We could also imagine keeping track of the sign of the number in a separate boolean field, leaving both p and q as nonnegative numbers. The point is that the client doesn't and shouldn't need to know how the number is represented internally. The client should think of the objects of class Rational as simply rational numbers.
The fields of the class are marked private to ensure that they are encapsulated inside the class. The keywords public and private are known as visibility modifiers, because they control which parts of the class are visible outside the class, and hence can be accessed.
The methods add and equals and the constructor Rational are marked public and hence can be used by external clients.
Inside the method add, there is a special variable this that refers to the object on which the method was invoked, called the receiver object. It happens that add does not mention this explicitly, but it does refer to the fields of this as p and q. Writing these names is equivalent to writing this.p and this.q respectively.
The method plus is a static method, which means that it does not have a receiver object. The special variable this is not in scope in a static method. That means it cannot be used within the method. A static method should be called using the name of its class, as in the following code:
Rational r3 = Rational.plus(r1, r2);
It is also possible to declare fields to be static, in which case they are shared by all objects of their class.
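To make the difference between static and instance members concrete, here is a minimal sketch. The classes Ticket and TicketDemo are hypothetical and are not part of the Rational example.

```java
class Ticket {
    // Static field: a single copy, shared by every Ticket object.
    private static int nextId = 1;

    // Instance field: each Ticket object has its own copy.
    private final int id;

    public Ticket() {
        id = nextId;
        nextId++;
    }

    /** Instance method: this refers to the receiver object. */
    public int id() {
        return this.id;
    }

    /** Static method: there is no receiver, so this is not in scope here. */
    public static int issued() {
        return nextId - 1;
    }
}

class TicketDemo {
    public static void main(String[] args) {
        Ticket a = new Ticket();
        Ticket b = new Ticket();
        // Instance methods are called on a receiver object.
        System.out.println(a.id() + " " + b.id());
        // Static methods are called using the class name.
        System.out.println(Ticket.issued());
    }
}
```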
The specification of the constructor has a Requires clause stating that the argument den must be non-zero. This clause is a form of precondition that must be satisfied by any correct client implementation. It is a mistake for client code to call the constructor without satisfying the precondition: in particular, it is a mistake by the programmer writing the client code. Thus, when mistakes are made, preconditions help us figure out whose fault they are.
Similarly, postconditions (sometimes called Returns clauses) specify what a method is supposed to do. If a method doesn't satisfy its postcondition, the mistake is not the client's; it's the implementer's.
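As an illustration of how the two kinds of clauses divide responsibility, here is a minimal sketch of a specified method. The class ArrayUtil and its method max are hypothetical names for this example.

```java
class ArrayUtil {
    /**
     * Requires: a is non-null and has at least one element.
     * Returns: the largest element of a.
     */
    public static int max(int[] a) {
        int best = a[0];
        for (int i = 1; i < a.length; i++) {
            if (a[i] > best) {
                best = a[i];
            }
        }
        return best;
    }
}

class ArrayUtilDemo {
    public static void main(String[] args) {
        // The client satisfies the precondition, so the implementer owes the
        // result promised by the Returns clause.
        System.out.println(ArrayUtil.max(new int[] {3, 7, 2})); // prints 7

        // ArrayUtil.max(new int[0]) would violate the precondition; whatever
        // happens then is the client's mistake, not the implementer's.
    }
}
```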
An early comment expresses an invariant regarding the fields p and q. An invariant is something that is always true at certain points in the program, though in programming, invariants can be violated temporarily. This particular invariant is an invariant about the state of objects of the class, and is variously called a class invariant, data structure invariant, or representation invariant.
The invariant states that q is positive and that the rational number is always stored in the reduced form where p and q are relatively prime. A class invariant is expected to hold at the beginning and end of every public method, but may be temporarily violated in the middle of a method. Knowing that the class invariant is true is very helpful because when writing the code for the methods of the class, you can ignore the possibility of a zero denominator.
Having invariants that you can rely on is critical to being able to easily write working code. It is much easier to make sure you can rely on invariants if the code that enforces the invariant is localized to one class, as a class invariant (or, at least, to a small number of classes in a package).
Encapsulation aids with this goal of localization.
Because the fields p and q are private, the code of the class can enforce this invariant. Code outside the current class has no way to, say, modify q to be zero.
Conversely, we can see that making any assignable
field public completely destroys the ability to enforce class invariants involving
that field: client code can assign an arbitrary value to the field.
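The following sketch shows the problem in miniature. LeakyRational is a hypothetical, deliberately flawed class whose fields are public, so nothing stops a client from breaking the intended invariant that q is positive.

```java
class LeakyRational {
    // Intended invariant: q > 0. It cannot be enforced, because any client
    // can assign to these public fields.
    public int p;
    public int q;

    /** Creates num/den. Requires: den > 0. */
    public LeakyRational(int num, int den) {
        p = num;
        q = den;
    }

    /** Returns: an approximation of this number as a double. */
    public double toDouble() {
        return (double) p / q;   // silently relies on q != 0
    }
}

class LeakyDemo {
    public static void main(String[] args) {
        LeakyRational r = new LeakyRational(1, 2);
        r.q = 0;                           // compiles: the field is public
        System.out.println(r.toDouble());  // prints Infinity
    }
}
```

If q were private, the assignment r.q = 0 would be rejected by the compiler, and only code inside the class could be responsible for a broken invariant.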
Like the method plus, the class constructor, which must have the same name as its class (here, Rational), is static in the sense that it is not called using a receiver object. Unlike in a static method, the variable this and its fields are in scope inside the constructor. They refer to the fields of the object currently being constructed. Notice that the constructor does not simply accept the numerator and denominator directly, but instead computes a new numerator and denominator that represent the same number while satisfying the invariant. This ensures that at the end of the constructor, the invariant holds.
Because of the invariant, the method equals can be implemented very simply and efficiently, by comparing the corresponding fields of this and r. This implementation relies on the fact that the invariant we chose ensures there is only one way to represent a given rational number. In general, it is not required that there be a unique representation for each rational number, but it's handy here. Without the invariant, we would have to write something more expensive like the following:
public boolean equals(Object o) { ... return (p*r.q == q*r.p); }
There is no free lunch here, of course. We had to pay up front for the simplicity of equals (and toString), by enforcing the invariant elsewhere in the code.
As we have seen, the annotations public and private can be used to control which code outside a class can access its components. The full list of visibility modifiers is as follows:
Modifier | Significance | Comments |
---|---|---|
public | Accessible everywhere | Instance variables should not normally be public |
private | Accessible only within the class | May limit future extensibility |
(no modifier) | Accessible from classes in the same package | Does not apply to nested packages |
protected | Accessible from subclasses and other classes in the same package | |
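The rows of the table can be seen side by side in a short sketch. The classes Account, SamePackageClient, and VisibilityDemo are hypothetical, and all three are assumed to live in the same package.

```java
class Account {
    private int balance = 100;        // accessible only inside Account
    int branchCode = 42;              // no modifier: accessible within the same package
    protected int rewardPoints = 10;  // accessible from subclasses and the same package

    /** Public method: accessible everywhere, and mediates access to balance. */
    public int balance() {
        return balance;
    }
}

class SamePackageClient {
    /** Returns: the sum of the values this class is allowed to read from a. */
    static int readable(Account a) {
        // int b = a.balance;   // would not compile: balance is private
        return a.balance() + a.branchCode + a.rewardPoints;
    }
}

class VisibilityDemo {
    public static void main(String[] args) {
        System.out.println(SamePackageClient.readable(new Account())); // prints 152
    }
}
```

A class in a different package could still call balance(), but it could not read branchCode, and it could read rewardPoints only if it were a subclass of Account.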
Preconditions and postconditions define a contract between the client and the implementer, and class invariants are an internal contract between the module implementation and itself. If everyone is obeying the contract, the program will work. But if someone doesn't follow their part of a contract, the program may fail in a way that is hard to debug. How can we gain confidence that these contracts are all being obeyed?
Using assertions is very helpful for catching these contract violations and speeding up debugging. The assert statement stops the program (with an AssertionError) if the tested condition is false. Assertions can be used to double-check that anything the programmer believes ought to be true actually is. While this has some performance impact, assertions can be turned off for production code.
One thing to watch out for with assertions is that they are turned off by default! You should always have them turned on when developing code. This is achieved by giving the Java VM the -ea flag. We recommend setting this flag by default for your Eclipse projects. You can find it in Eclipse in Preferences→Java→Installed JREs→Edit, where you set “Default VM arguments” to -ea.
withAssertions.Rational. Example: class Rational with assertions added to check class invariants and preconditions
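A minimal sketch of how such checks might be written is shown below. The class name CheckedRational, the helper methods classInv and gcd, and the demo class are hypothetical names chosen for this sketch; only the assert statement and the -ea flag come from Java itself.

```java
/** A rational number p/q, with assertions checking its contracts. */
class CheckedRational {
    // Class invariant: q > 0, and p and q are relatively prime.
    private final int p;
    private final int q;

    /** Creates the rational number num/den. Requires: den != 0. */
    public CheckedRational(int num, int den) {
        assert den != 0 : "precondition violated: zero denominator";
        int g = gcd(Math.abs(num), Math.abs(den));
        if (den < 0) g = -g;
        p = num / g;
        q = den / g;
        assert classInv() : "class invariant violated";
    }

    /** Returns: whether the class invariant holds for this object. */
    private boolean classInv() {
        return q > 0 && gcd(Math.abs(p), q) == 1;
    }

    private static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    @Override
    public String toString() {
        return p + "/" + q;
    }
}

class CheckedRationalDemo {
    public static void main(String[] args) {
        System.out.println(new CheckedRational(2, -4)); // prints -1/2
        // With assertions enabled (java -ea CheckedRationalDemo), the call
        // new CheckedRational(1, 0) would stop with an AssertionError.
    }
}
```

Calling classInv only inside assert statements means the check costs nothing when assertions are turned off.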
Classes in Java live in packages. For example, the class String is really a shorthand for the fully qualified class name java.lang.String, where java.lang is the name of a Java package containing many standard Java types.
The dot symbol . is used for several things in Java. As we've seen earlier, it is used to indicate use of a method or field. Beyond this, it is used to indicate which package a class lives in. Package names can have dots inside them, and those happen to define how Java source code and compiled code are stored in the file system of the computer.
Perhaps surprisingly, it is incorrect to think of java.lang as being “inside” the package java. In particular, something that is made visible just to classes in the java package will not be visible to classes in the java.lang package, or vice versa. These are two different packages whose names happen to be related.
When referring to a class in a different package, it is necessary either to use the fully qualified name of that class or to use an import statement at the top of the source file. Thus, we can write the fully qualified class name cs2112.lec03.Rational to refer to the Rational class from a package other than cs2112.lec03; alternatively, we can write a statement import cs2112.lec03.Rational; at the top of the file and then just refer to the class as Rational. It is also possible to use a wildcard import to import all classes in a single package: for example, import cs2112.lec03.*;. In fact, the entire package java.lang is automatically imported in this way, which is why we can talk about classes like String and Integer without qualifying their names.
In general, however, the danger of a wildcard import is that it may import
many classes you don't need, creating confusion about what some names refer to.
It is usually better to import just the classes you are actually using.
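The options for naming classes from other packages can be compared in a short sketch that uses only standard library classes. The class ImportDemo and its method describe are hypothetical names for this example.

```java
import java.util.ArrayList;  // single-class imports: only the names actually used
import java.util.List;

class ImportDemo {
    /** Returns: a short string built with imported and fully qualified classes. */
    static String describe() {
        // List and ArrayList can be named directly because of the imports above.
        List<String> names = new ArrayList<>();
        names.add("x");

        // Map and HashMap are not imported, so their fully qualified names are used.
        java.util.Map<String, Integer> counts = new java.util.HashMap<>();
        counts.put(names.get(0), 1);

        // String and Integer need no import: java.lang is imported automatically.
        return names.get(0) + "=" + counts.get("x");
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints x=1
    }
}
```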