We look at some important language features and see how object diagrams can help us to understand how they work and to avoid common programming errors involving them.
Like objects, arrays in Java are boxed values. The type int[] is the type of an array of int, and any type can be substituted for the int to obtain a corresponding array type. Since arrays are boxes, we ordinarily create them with the new expression.
Consider the following example:
int[] a = new int[2];
int[] b = new int[] { 10, 20, 30 };
This code creates objects and initializes variables as shown by the following diagram.
As with objects, the variables a and b contain references to the arrays rather than the arrays themselves. If we wrote an assignment a = b; they would subsequently refer to the same underlying array; a and b would be aliases.
Each array contains a first slot that keeps track of the type of the elements in the array, and each has a single immutable instance variable length that keeps track of the number of elements. The two elements of a are initialized to the default value of type int, which is 0.
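To make the aliasing and default-value behavior concrete, here is a minimal sketch. The class name ArrayAliasDemo and the method writeThroughAlias are made up for illustration; only the two array declarations come from the text above.

```java
public class ArrayAliasDemo {
    // Returns a[0] as seen after writing through the alias b.
    static int writeThroughAlias() {
        int[] a = new int[2];                 // both elements default to 0
        int[] b = new int[] { 10, 20, 30 };
        a = b;                                // a and b now refer to the same array
        b[0] = 99;                            // the update is visible through a as well
        return a[0];
    }

    public static void main(String[] args) {
        int[] a = new int[2];
        System.out.println(a.length + " " + a[0] + " " + a[1]); // prints "2 0 0"
        System.out.println(writeThroughAlias());                // prints "99"
    }
}
```

After the assignment a = b, the original two-element array is no longer reachable from either variable.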
When declaring and initializing an array of type T[] for some T, an abbreviated syntax is allowed in which the usual new T[] is omitted. For example, the previous declaration of b could be written more compactly:
int[] b = { 10, 20, 30 };
In general, we can't construct arrays completely at declaration time. To initialize them, it is common to use loops, and the for loop is well suited to the job. For example, here is a loop that initializes an array of Points, where the class Point is defined as in the previous lecture:
for_loop.java
This code creates an array whose entries are references to a series of newly created Point objects:
The for loop repeatedly executes its body (here, in braces) until a condition is false. It has an interesting syntax. There are three clauses in the parentheses, separated by semicolons. The first clause is the loop initializer. It is executed once at the beginning of the loop and may be a variable declaration. The second clause is the loop guard. It is evaluated at the beginning of every loop iteration, and the loop terminates if it evaluates to false. The third clause is the increment statement. It is executed at the end of every loop iteration.
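The three clauses can be seen in a loop like the following. This is a minimal sketch rather than the lecture's own listing: the nested Point class is a stand-in assumed to have a two-argument constructor, and the name makeDiagonal and the choice of coordinates are arbitrary.

```java
public class ForLoopDemo {
    // Stand-in for the Point class from the earlier lecture (an assumption).
    static class Point {
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Builds an array of n points (0,0), (1,1), ..., (n-1,n-1).
    static Point[] makeDiagonal(int n) {
        Point[] pts = new Point[n];       // every entry starts out as null
        for (int i = 0; i < n; i++) {     // initializer; guard; increment
            pts[i] = new Point(i, i);     // fill slot i with a fresh object
        }
        return pts;
    }

    public static void main(String[] args) {
        Point[] pts = makeDiagonal(3);
        for (int i = 0; i < pts.length; i++) {
            System.out.println(pts[i].x + "," + pts[i].y);
        }
    }
}
```

Note that new Point[n] creates only the array of references. The Point objects themselves exist only once the loop body has run.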
Another way to exit from a loop is to use the break statement. It immediately terminates the closest enclosing loop. The less frequently used continue statement causes the current loop iteration to end and the next loop iteration to begin immediately (although the increment statement is still executed, and the guard is still checked).
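A small sketch may help distinguish the two statements. The method sumUntilZero and its input are invented for illustration.

```java
public class BreakContinueDemo {
    // Sums the positive entries of xs, stopping at the first zero.
    static int sumUntilZero(int[] xs) {
        int sum = 0;
        for (int i = 0; i < xs.length; i++) {
            if (xs[i] == 0) break;      // leave the loop entirely
            if (xs[i] < 0) continue;    // skip the rest of the body; i++ and the guard still run
            sum += xs[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // 3 is added, -1 is skipped, 4 is added, 0 ends the loop before 5 is seen.
        System.out.println(sumUntilZero(new int[] { 3, -1, 4, 0, 5 })); // prints "7"
    }
}
```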
A multidimensional array is really an array of references to arrays. For example, consider the following code that creates a two-dimensional array (aka matrix):
int[][] m = {{10, 20, 30}, {40, 50, 60}};
This code actually creates three objects:
Java does not try very hard to ensure that m continues to represent a nice rectangular matrix. For example, we can change the length of one of its rows:
m[1] = new int[1];
Or we can even make the rows alias each other!
m[1] = m[0];
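The consequences of both assignments can be checked with a short sketch. The class and method names here are hypothetical; the array m and the two assignments are taken from the text.

```java
public class MatrixRowsDemo {
    // Row lengths after replacing the second row with a shorter array.
    static int[] rowLengthsAfterResize() {
        int[][] m = {{10, 20, 30}, {40, 50, 60}};
        m[1] = new int[1];                 // m is no longer rectangular
        return new int[] { m[0].length, m[1].length };
    }

    // Value read through row 1 after writing through row 0, once the rows alias.
    static int readThroughAliasedRow() {
        int[][] m = {{10, 20, 30}, {40, 50, 60}};
        m[1] = m[0];                       // both rows are now the same array object
        m[0][0] = 99;
        return m[1][0];
    }

    public static void main(String[] args) {
        int[] lengths = rowLengthsAfterResize();
        System.out.println(lengths[0] + " " + lengths[1]); // prints "3 1"
        System.out.println(readThroughAliasedRow());       // prints "99"
    }
}
```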
Strings in Java are really objects, which leads to some surprises for programmers. A string literal like "Hello" actually evaluates to a reference to a String object. For example, the code on the left has the effect on the right:
String x = "Hello";
String y = x;
String z = y + "World";
String w = y + "World";
The operator + denotes concatenation when applied to strings, rather than addition. It creates new string objects. Notice that variables z and w are initialized to refer to string objects that have exactly the same state, but are actually different objects. Since strings are immutable (they cannot be changed after they are created), the fact that they are different objects normally does not matter.
Strings support a large selection of useful methods. One such method is charAt, which returns the character at a given position in the string. For example, the expression z.charAt(1) evaluates to the character 'e', and the same is true for w.
The strings referenced by z and w can be distinguished in one way, however. If they are compared using the == operator, the result of z == w is false. This happens because the == operator on boxed values simply returns whether the operands are the same box (that is, the same object). Probably this isn't what we want when we compare two strings!
Therefore, when comparing two objects generally, and strings particularly, you should almost always use the equals method, which returns whether two objects should be considered interchangeable. The expression z.equals(w) evaluates to true, as we'd like. Think twice before you use == on object values.
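The difference between the two comparisons can be observed directly. In this sketch the helper concat is a made-up name; it performs the concatenation at run time on a non-constant operand, as in the example with y above, so each call produces a fresh String object.

```java
public class StringEqualityDemo {
    // Concatenation on a non-constant operand creates a new String each time.
    static String concat(String s) {
        return s + "World";
    }

    // Compares the two results by identity (are they the same object?).
    static boolean sameObject() {
        String z = concat("Hello");
        String w = concat("Hello");
        return z == w;
    }

    // Compares the two results by state (do they hold the same characters?).
    static boolean sameState() {
        String z = concat("Hello");
        String w = concat("Hello");
        return z.equals(w);
    }

    public static void main(String[] args) {
        System.out.println(sameObject());                 // prints "false"
        System.out.println(sameState());                  // prints "true"
        System.out.println(concat("Hello").charAt(1));    // prints "e"
    }
}
```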
Based on the discussion of strings above, it is tempting to think that strings are very special objects in Java. Actually, they aren't: the only special thing about strings is that string objects can be created using the convenient quotation mark syntax. The object diagram above is a bit of a white lie, because strings are actually implemented using arrays of characters. For example, the string "Hello" is really implemented as two objects, as shown in the object diagram on the right. The String object contains an instance variable value that refers to the array of characters making up the string.
Since the entries in the character array never change, you have to work pretty hard to figure out that this is what a string really is in Java, because you can only access strings through the operations of the String class.
And that is a Good Thing, because it means that the designers of Java can
change the way strings work in future versions of Java without breaking all the
existing programs! In fact, the implementation of Strings has changed significantly
in the past few versions of Java, so even this object diagram is a white lie.
Sometimes we want to use an unboxed value like an int where a boxed value is expected. For example, a variable of the type Object can refer to any object, but can't refer directly to a primitive value.
To address this issue, Java introduces a set of classes corresponding to the primitive values. For int there is Integer, for boolean there is Boolean, for double, Double, and so on. Each of these classes defines objects that contain a value of the appropriate primitive type, and defines equals to compare state.
In addition, Java will automatically box primitive values into the corresponding object type when necessary, and will automatically unbox them in the other direction, too. This feature is called autoboxing. It can have some counterintuitive effects, however. For example, consider this code:
Integer i = 200;
Object l = i;
int j = i;
Object k = j;
i == j // true
i == l // true
j == k // static error: can't compare Object and int.
i == k // false!
There are a couple of surprises here: first, the compiler does not let us compare j and k. Autoboxing causes j to be boxed into an Integer object, but the static type of k is Object, so the Java compiler does not know that k can be unboxed into an int.
Another surprise is the last line of code. Since i and k are different objects representing the same number, they compare as unequal. As with strings, we should use the equals method to compare values of type Integer.
Perhaps even more surprisingly, changing the number 200 to anything between -128 and 127 will cause the code above to report true for i == k. This happens because there is a table of Integer objects that is used only for small integers. Autoboxing is performed by the method Integer.valueOf, which uses this table when it can and only resorts to new for integers outside that range.
One moral of the story, again, is that to compare two Integer objects, we need to use the equals() method on the objects. Even though the expression i == k is false, the expression i.equals(j) is true.
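The effect of the small-integer table can be seen with a sketch like the following. The method names sameBox and equalState are invented here, and the result for 200 assumes a default JVM configuration, since the upper bound of the table can be raised with JVM options.

```java
public class IntegerCacheDemo {
    // Boxes n twice, as in the example above, and compares the boxes by identity.
    static boolean sameBox(int n) {
        Integer i = n;      // boxed via Integer.valueOf(n)
        int j = i;          // unboxed
        Object k = j;       // boxed again via Integer.valueOf(j)
        return i == k;      // reference comparison: Integer against Object
    }

    // Same setup, but compares the boxes by state.
    static boolean equalState(int n) {
        Integer i = n;
        int j = i;
        Object k = j;
        return i.equals(k);
    }

    public static void main(String[] args) {
        System.out.println(sameBox(100));     // true: both boxes come from the table
        System.out.println(sameBox(200));     // false on a default JVM: two separate objects
        System.out.println(equalState(200));  // true: same number either way
    }
}
```

Code that relies on == for Integer values may therefore appear to work in tests with small numbers and fail later with larger ones.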
Clearly, the assignment j = i is doing more than just an assignment. In fact, it's really executing the following code: j = i.intValue(). The intValue() method extracts the int value from the Integer object. This is an example of syntactic sugar, in which the language permits us to abbreviate how we write code.
Conversely, if we assigned i = j, this would be syntactic sugar for i = Integer.valueOf(j), which calls a method that, depending on the value of j, either looks up an appropriate preexisting object in a table or creates a new Integer object.
Calls to the valueOf and intValue methods are automatically inserted by the Java compiler to implement boxing and unboxing. Similar methods exist for the other primitive types.
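Written out by hand, the desugared forms look roughly like this. The wrapper methods boxExplicit and unboxExplicit are hypothetical names; Integer.valueOf, intValue, Double.valueOf, and doubleValue are the standard library methods.

```java
public class BoxingDesugarDemo {
    // What the assignment `Integer i = j;` expands to.
    static Integer boxExplicit(int j) {
        return Integer.valueOf(j);
    }

    // What the assignment `int j = i;` expands to.
    static int unboxExplicit(Integer i) {
        return i.intValue();
    }

    public static void main(String[] args) {
        Integer i = boxExplicit(200);
        int j = unboxExplicit(i);
        System.out.println(j);                        // prints "200"

        // The analogous pair of methods for double.
        Double d = Double.valueOf(1.5);
        double e = d.doubleValue();
        System.out.println(e);                        // prints "1.5"
    }
}
```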
Names can refer to a variety of things: local variables (including formal parameters), instance variables (aka fields), methods, types, classes, and packages. The basic rule for deciding what kind of thing a name refers to is to find the definition of the name with the smallest scope that includes the use of the name. Different kinds of names have different rules for scope. Local variables are in scope from the point of declaration until the end of the block in which they are declared. Method and field names are in scope throughout their class. Class and interface names are in scope throughout the program unless they are nested inside another class, in which case they are in scope throughout the containing class.
If a name is in the scope of two different declarations at once, the outer declaration is said to be shadowed by the inner one. Java considers some shadowing to be illegal. For example, this code will not compile because the variable x is shadowed inside the while loop:
int x = 2;
while (x != 0) {
    int x = 5;
    // both x's in scope here.
}
// only outer x in scope here.
One place where shadowing is allowed, and which often gets programmers into trouble, is when a local variable shadows an instance variable. This often arises with constructors, because it is tempting to name formal parameters in the same way as instance variables:
class Point {
    int x, y;
    Point(int x, int y) {
        // locals x and y shadow instance variables x and y
        this.x = x;
        this.y = y;
    }
}
As the example shows, there is a way to talk about shadowed instance variables, using the object reference this. The expression this can only be used inside instance methods and constructors (not static methods) and refers to the current receiver object: in this case, the object being constructed.
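To see how this kind of shadowing gets programmers into trouble, compare a constructor that uses this with one that forgets it. The classes GoodPoint and BadPoint are hypothetical variants of Point written for this sketch.

```java
public class ShadowingDemo {
    static class GoodPoint {
        int x, y;
        GoodPoint(int x, int y) {
            this.x = x;   // assigns the parameter to the instance variable
            this.y = y;
        }
    }

    static class BadPoint {
        int x, y;
        BadPoint(int x, int y) {
            x = x;        // assigns the parameter to itself; the field is untouched
            y = y;
        }
    }

    public static void main(String[] args) {
        System.out.println(new GoodPoint(3, 4).x);  // prints "3"
        System.out.println(new BadPoint(3, 4).x);   // prints "0", the default for int fields
    }
}
```

The BadPoint constructor compiles, which is why this mistake is easy to miss: inside the constructor, the name x always means the parameter.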