CS 2110: Object-Oriented Programming and Data Structures


Transition to Java

This document aims to help you adapt your knowledge of another procedural programming language to the context of Java, the language used in CS 2110. It provides explicit comparisons to Python and MATLAB, as those are the languages currently used in Cornell’s introductory programming courses CS 1110 and CS 1112. See the setup page for instructions on how to install a Java development environment on your computer.

Where to write code

While many aspects of a language’s syntax can be illustrated in isolated code snippets, it is important to understand the context in which that code might appear. In particular, fluency in a language requires practice (i.e., writing and running code and interpreting the results), so you need to know where to write code so you can see its effects first-hand.

Interactive programming

A great way to explore new programming concepts is to interact with them one statement at a time, printing a text representation of each statement’s value (if any) after it is run. In Python, this can be done by running the python (or ipython) interpreter, and in MATLAB, this is done by typing into the Command Window. A program that facilitates this kind of interaction is called a read-eval(uate)-print loop, or REPL, though more colloquially it may be called an “interpreter” or “shell”.

Modern versions of Java include a program called JShell that serves this purpose. If your JDK is on your PATH (optional for this class), you can launch its command-line interface by running jshell. Alternatively, IntelliJ IDEA provides a graphical interface to JShell under Tools | JShell Console; this is especially useful because you can interact with the classes in a Project (such as a lecture demo or programming assignment) in addition to those from Java’s standard library.

In JShell, you can treat Java like a scripting language, and you can even redeclare variables and types. This is great for trying out code snippets, but remember that this mixture of definitions and execution does not reflect how most Java code is written and run.

Source files

Interacting with code is great for learning or performing one-off tasks, but most Java code is written to create applications or libraries that are used multiple times by many people. Such code is saved to a file so that it can be shared and revised over time, much like a word processor document. Professional software engineers actually spend much more time reading old code than writing new code, so the style of code saved to files is just as important as its correctness and efficiency.

In Python and MATLAB, code can be written in a file just as it would have been typed interactively; this creates a script that can be executed multiple times. Here is where Java diverges: while you could ask JShell to read inputs from a file, “real” Java code is organized differently from a script. Java is a compiled language, meaning that applications do not execute source code directly. Instead, a program called a compiler first reads all of your source code and translates it into lower-level instructions without executing any of it. This is typically done by the software’s developer, who may not be the end user (think manufacturer vs. customer). The output of this process, called bytecode in Java (instructions for the Java Virtual Machine rather than for a particular processor), can then be packaged with other bytecode to form one or more applications that the end user can execute.

Java encourages modularity and reuse, so the basic unit of Java code is a class. Typically, every class in Java is defined in its own file, so the code defining a Vehicle class would be saved to a file named “Vehicle.java”. Within a class, you can define methods, which are analogous to functions in other languages. And within a method body you can write statements of Java code. Executing Java code therefore implies calling methods. Most methods are called by other methods, forming a chain of method calls known as the call stack. But one method has to come first in this chain, defining the first code that is executed when an application is started. This method must be named main(); it can be defined in any class, with the name of that class effectively being the name of the application (if multiple classes in a project have a main() method, then your project effectively contains multiple applications; there’s nothing wrong with that).

Here, then, is a minimal “hello world” application written in all three languages:

Python:
print('Hello world')

Java (HelloWorld.java):
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello world");
    }
}

MATLAB:
disp('Hello world')

Wow, does Java look complicated! The amount of “boilerplate” code required for such a simple task is a common criticism of the language—it’ll be a few lectures before you understand public, static, String[], and System.out. But keep in mind that most software in the world is not as simple as printing “Hello world”, and the rigid structure imposed by Java becomes a big advantage when managing large projects. For now, focus on the following:

Exercise

Write a Java application in a class named HelloGoodbye that prints “Hola” on the first line, followed by “Adios” on the next line. Run this application from your IDE to ensure that it compiles and behaves as you expect.

Basic syntax

Statements

In Python and MATLAB, the end of a line of code implies the end of whatever statement was on that line, unless special “line continuation” syntax is used. If you want to put multiple short statements on the same line (usually considered poor style), you can separate them with semicolons. And in MATLAB, ending an assignment statement with a semicolon will prevent MATLAB from printing the result of the assignment.

Java works differently—statements may span several lines of code without any line continuation syntax because every statement must be terminated by a semicolon. Here is an example:

Python:
vehicleMass = dryMass + \
              propellantMass
print(vehicleMass)

Java (snippet):
vehicleMass = dryMass +
              propellantMass;
System.out.println(vehicleMass);

MATLAB:
vehicleMass = dryMass + ...
              propellantMass;
disp(vehicleMass)
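As a small runnable sketch of the same idea (the class name StatementDemo and the mass values are made up for illustration), the following application splits one statement across two lines and puts two short statements on a single line. In both cases the semicolons, not the line breaks, decide where a statement ends.

```java
public class StatementDemo {
    /** Return the total mass, computed by a statement that spans two lines. */
    static double vehicleMass(double dryMass, double propellantMass) {
        double total = dryMass +
                       propellantMass;  // The statement ends here, at the semicolon
        return total;
    }

    public static void main(String[] args) {
        // Two statements on one line are legal (though usually poor style)
        double dryMass = 1200.0; double propellantMass = 300.0;
        System.out.println(vehicleMass(dryMass, propellantMass));
    }
}
```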

Operators

Many arithmetic and relational operators in Java are identical to those in other languages (+ for addition, / for division, <= for less-than-or-equal-to, etc.). Here is a table of the most common operators that may differ in syntax from other languages you have learned:

Operator                        Python                 Java                         MATLAB
NOT                             not                    !                            ~
OR (short-circuit)              or                     ||                           ||
AND (short-circuit)             and                    &&                           &&
Equality                        ==                     == (primitive types)         == (value types)
                                                       .equals() (reference types)  isequal() (handle types)
Non-equality                    !=                     != (primitive types)         ~= (value types)
Identity (reference types)      is                     ==                           ==
Non-identity (reference types)  is not                 !=                           ~=
Remainder (aka modulus)         % (positive operands)  %                            rem()
                                                                                    (mod() matches Python)
Exponentiation                  **                     Math.pow()                   ^

Additionally, the + operator is used for string concatenation (this is the only operator besides identity ==/!= that can act on operands of reference type).
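The sketch below exercises a few of these operators in one place; the class name OperatorDemo and its helper methods are hypothetical, chosen only for illustration. It contrasts == with .equals() on strings, shows how % treats a negative operand, and uses + for concatenation.

```java
public class OperatorDemo {
    /** Return whether a and b contain the same characters. */
    static boolean sameText(String a, String b) {
        return a.equals(b);
    }

    /** Return base raised to the power exponent (as a double). */
    static double power(double base, double exponent) {
        return Math.pow(base, exponent);
    }

    public static void main(String[] args) {
        String greeting = "hel" + "lo";      // + concatenates strings
        String copy = new String(greeting);  // A distinct object with equal contents
        System.out.println(sameText(greeting, copy));  // true: same characters
        System.out.println(greeting == copy);          // false: different objects
        System.out.println(-7 % 3);                    // -1 in Java (Python gives 2)
        System.out.println(power(5, 3));               // 125.0
        System.out.println(!(1 < 2) || (2 <= 3 && 3 != 4));  // true
    }
}
```

Because == on reference types tests identity, comparing strings with == can give surprising answers; .equals() is the safer default for text.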

Note that type widening rules come into play when determining the type of an arithmetic operation. Here is a quick summary:

This is most relevant for division: integer division is truncating (rounds towards zero). If you want to perform floating-point division between two integers, you must cast at least one of them to a floating-point type:

Python:
a = 1
b = 2
a // b == 0
a / b == 0.5

Java (snippet):
int a = 1;
int b = 2;
a/b == 0
(double)a/b == 0.5

MATLAB:
% Default type is double
a = 1;
b = 2;
% Uses narrowest type instead of widest
% (and rounds to nearest, not toward zero)
int32(a)/b == 1
a/b == 0.5
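To see the Java behavior in a runnable form, here is a minimal sketch (the class DivisionDemo and its method names are invented for this example). It also shows two easy mistakes: assuming that negative quotients round down as in Python, and casting after the division has already truncated.

```java
public class DivisionDemo {
    /** Return the integer quotient of a and b, truncated toward zero. */
    static int truncatedQuotient(int a, int b) {
        return a / b;
    }

    /** Return the floating-point quotient of a and b. */
    static double exactQuotient(int a, int b) {
        return (double) a / b;  // The cast applies to a, before the division
    }

    public static void main(String[] args) {
        System.out.println(truncatedQuotient(1, 2));   // 0
        System.out.println(truncatedQuotient(-7, 2));  // -3 (Python's -7 // 2 is -4)
        System.out.println(exactQuotient(1, 2));       // 0.5
        System.out.println((double) (1 / 2));          // 0.0: the cast came too late
    }
}
```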

Literals

A literal is a value that can be expressed as a single token in source code. false is a Boolean literal (type boolean); 5 is an integer literal (type int); 3.14 is a floating-point literal (type double), as is 1e-6 (scientific notation). "Hello" is a string literal (type String; strings and arrays are the only reference types with literals other than null).

In Java, string literals are always enclosed in double quotes (""). Single quotes ('') denote a character literal of type char. This is different from Python, which can use either single or double quotes around strings, and MATLAB, which uses single quotes for character arrays.

An int can only represent values whose magnitude is less than two billion (roughly). If you want an integer literal to have type long instead of int, add a suffix of l or L (e.g. 987654321L). If you want a floating-point literal to have type float instead of double (sometimes desired for multimedia performance), add a suffix of f or F.
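The following sketch gathers one literal of each kind mentioned above; the class name LiteralDemo, the variable names, and the millisInDays() helper are all hypothetical. The helper illustrates why the L suffix matters: the number of milliseconds in 30 days does not fit in an int.

```java
public class LiteralDemo {
    /** Return the number of milliseconds in the given number of days. */
    static long millisInDays(int days) {
        // The L suffix makes the first operand a long, so the whole product
        // is computed as a long and cannot overflow the int range.
        return 24L * 60 * 60 * 1000 * days;
    }

    public static void main(String[] args) {
        boolean done = false;     // boolean literal
        int count = 5;            // int literal
        double tolerance = 1e-6;  // double literal in scientific notation
        float volume = 0.5f;      // float literal (f suffix)
        char initial = 'J';       // char literal (single quotes)
        String name = "Java";     // String literal (double quotes)
        long big = 9876543210L;   // Too large for an int, so the L suffix is required

        System.out.println(done + " " + count + " " + tolerance + " " + volume);
        System.out.println(name.charAt(0) == initial);  // true
        System.out.println(big);
        System.out.println(millisInDays(30));           // 2592000000
    }
}
```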

Comments

Java has three different kinds of comments: line comments (//, analogous to # in Python and % in MATLAB), block comments (/* */), and documentation comments (/** */, analogous to Python docstrings). Note that documentation comments come before the declaration being documented, not after.

Python:
t = 0  # Reset the clock

# In Earth-centered frame, must subtract
# Sun's gravity at Earth's center.
fSun = fSun - fSunEarth

def reciprocal(x):
    """Return 1 divided by the argument."""
    return 1/x

Java (snippet):
t = 0;  // Reset the clock

/*
 * In Earth-centered frame, must subtract
 * Sun's gravity at Earth's center.
 */
fSun = fSun - fSunEarth;

/** Return 1 divided by the argument. */
static double reciprocal(double x) {
    return 1.0/x;
}

MATLAB:
t = 0;  % Reset the clock

% In Earth-centered frame, must subtract
% Sun's gravity at Earth's center.
fSun = fSun - fSunEarth;

function y = reciprocal(x)
    % Return 1 divided by the argument.
    y = 1/x;
end

Blocks

It is often necessary to specify that a group of statements all go together; examples include defining the body of a method or the else branch of an if statement. A group of statements is called a block, and blocks are often nested. Python uses indentation to indicate which statements belong to the same block, while MATLAB uses the end keyword to denote the end of a block. Java uses curly braces ({}) to enclose blocks. Note that, while indentation is not syntactically relevant in Java and MATLAB, it is an essential part of good style, so blocks should always be indented in addition to using the appropriate delimiters.

Python:
def max3(x, y, z):
    if x > y or z > y:
        if x > z:
            return x
        else:
            return z
    else:
        return y

Java (snippet):
static int max3(int x, int y, int z) {
    if (x > y || z > y) {
        if (x > z) {
            return x;
        } else {
            return z;
        }
    } else {
        return y;
    }
}

MATLAB:
function m = max3(x, y, z)
    if x > y || z > y
        if x > z
            m = x;
        else
            m = z;
        end
    else
        m = y;
    end
end

In Java, braces are technically optional if a block only consists of a single statement. However, this shorthand tends to lead to bugs and maintenance headaches and is banned by most professional style guides. Blocks should always be enclosed by braces in this course (but you should be aware of the shorthand because older materials may use it).

Blocks in Java also establish a scope for local variables, to be discussed later. Occasionally you may see braces used to segment a block without any associated control structure, typically with the intent of restricting variable scope.
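As an illustration of both points, here is a small sketch (the class BlockDemo and its methods are made up for this example). The first method keeps its braces even though each branch is a single statement; the second uses two bare blocks so that the name square can be reused without the two variables interfering.

```java
public class BlockDemo {
    /** Return the absolute value of x. */
    static int abs(int x) {
        // Braces are used even though each branch is a single statement
        if (x < 0) {
            return -x;
        } else {
            return x;
        }
    }

    /** Return the sum of the squares of a and b. */
    static int sumOfSquares(int a, int b) {
        int total = 0;
        {
            int square = a * a;  // Only visible inside this bare block
            total = total + square;
        }
        {
            int square = b * b;  // A separate variable that reuses the name
            total = total + square;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(abs(-4));             // 4
        System.out.println(sumOfSquares(3, 4));  // 25
    }
}
```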

Methods

We’ve said that “methods” are what Java calls functions and procedures. They are object-oriented by default—“instance methods” are used to request that an object perform some operation for you. But object-oriented programming comes later; for now, know that if you put static in front of a method declaration, it will behave as a “free function” and can be called without needing to construct an object. You will need to write the name of its enclosing class when calling it; for example, calling Math.sqrt(4.0) invokes the static sqrt() method in the Math class in order to compute the square root of its argument.

To declare a method, you need to give it a name, a return type (or void if the method does not return a value), and a list of parameters. Each parameter must have both a type and a name. The body of a method is enclosed in curly braces ({}). The return keyword is used to indicate when to stop executing the method and what value to produce for the caller. A method’s body may have multiple return statements (though only one will be executed per call); they are not restricted to the last line. Some people consider early returns to be “unstructured” (you may have been told to avoid them in prior CS classes), but in Java they are common and can improve the readability of code when used judiciously.

In Java (and unlike Python and MATLAB), a method can only return a single value (or none if the return type is void). To produce multiple pieces of information, you must aggregate them in a custom class (to be discussed later in the course).

In the snippets above, main(), reciprocal(), and max3() are all examples of method definitions. The return type comes first, followed by the name, followed by the parameter list. Remember that methods (even static ones) need to be defined inside of a class.
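Here is one more sketch that puts these pieces together in a complete class. The class name MethodDemo, its methods, and the passing threshold of 60 are all hypothetical. It shows a static method that calls Math.sqrt(), a method with early returns, and a void method.

```java
public class MethodDemo {
    /** Return the hypotenuse of a right triangle with legs a and b. */
    static double hypotenuse(double a, double b) {
        return Math.sqrt(a * a + b * b);  // Static method of the Math class
    }

    /** Return "invalid", "pass", or "fail" for the given score. */
    static String describe(int score) {
        if (score < 0) {
            return "invalid";  // Early return for bad input
        }
        if (score >= 60) {
            return "pass";
        }
        return "fail";
    }

    /** Print a greeting for name. Nothing is returned, so the return type is void. */
    static void greet(String name) {
        System.out.println("Hello, " + name);
    }

    public static void main(String[] args) {
        System.out.println(MethodDemo.hypotenuse(3.0, 4.0));  // 5.0
        System.out.println(describe(75));  // Class name may be omitted within the same class
        greet("CS 2110");
    }
}
```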

Variables and assignment

Let’s consider how to store the integer value 5 in a variable named score:

Python:
score = 5

Java (snippet):
int score;  // Declaration
score = 5;  // Assignment

MATLAB:
score = int32(5);
% Note: by default, numeric literals are treated
% as floating-point values (doubles), not integers

In all three cases the syntax for assignment is similar: a variable name goes on the left, followed by an equals sign (=), which represents the assignment operator, followed by the value that should be assigned to the variable. MATLAB even ends the statement with a semicolon (though that is just to suppress automatically printing the assigned value, whereas in Java it is required to indicate that the statement is finished).

Remember that, despite the use of an equals sign, assignment does not establish an equality relation between two symbols; it simply stores the value (or reference) on the right-hand side in the variable named on the left-hand side. Other languages denote assignment differently to break the visual symmetry and avoid this confusion (score := 5, score ← 5), but Java, like many others, chose to use an equals sign, which you should pronounce as “gets” or “becomes”, not “equals”. In order to test an equality relation, use the double-equals (==) operator (and if you want to assert symbolic equality, well, that’s a different kind of programming altogether).

But Java has something extra—a variable declaration (int score;). In Java, variables must be declared before they can be used, which helps catch typos (in other languages, if an assignment is made to a variable that hasn’t been used before, a new variable is created, whether or not that was your intent).

To declare a variable in Java, write the type of the variable, followed by its name. In this example, we want score to be a variable that can hold integers, so we declare its type to be int. Other built-in types in Java include double (for double-precision floating-point numbers), boolean (for true/false), and String (for text).

To keep code shorter, declaration and assignment can be combined into a single initialization statement: int score = 5;. But remember that a variable only needs to be declared once (declaration is a prerequisite to assignment, not a substep of it), so an initialization statement may only be used the first time a variable is assigned (to assign a new value later on, simply write score = 4; without a type prefix).
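The difference between declaration, initialization, and later reassignment can be sketched as follows (the class AssignmentDemo and the method penalizeTwice() are invented for illustration).

```java
public class AssignmentDemo {
    /** Return the score after applying a penalty of one point, twice. */
    static int penalizeTwice(int initialScore) {
        int score = initialScore;  // Initialization: declaration and assignment together
        score = score - 1;         // Reassignment: no type prefix
        score = score - 1;
        // int score = 0;          // Would not compile: score is already declared
        return score;
    }

    public static void main(String[] args) {
        int score;   // Declaration
        score = 5;   // Assignment
        System.out.println(score);                 // 5
        System.out.println(penalizeTwice(score));  // 3
    }
}
```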

Static typing

One of the biggest differences between Java and languages you have learned before is that Java is statically typed, while Python and MATLAB are dynamically typed. In a dynamically-typed language, any variable can hold any kind of value; you could assign the integer 5 to variable a, say, then later reassign it to point to the string 'hello'. Function return values are similarly flexible—a function may return a floating-point number when given one set of arguments, then return a boolean given a different set of arguments. As a result, one cannot know whether the values involved in an operation are compatible until the program is actually run.

In this context, “static” refers to properties of a program that can be inferred from its source code alone, regardless of what values may actually be present when it is run. By declaring types for variables and return types for methods, Java allows us to reason about the types of expressions statically. This means that Java can ensure that operands will be compatible (to an extent) regardless of what inputs a user may provide when a program is run.

The upshot is that, when a variable is declared in Java, you must specify its type, and by doing so, it is only legal to assign values of that type to the variable. So if variable score is declared as int score;, then you may only ever assign integer values to score, and if variable name is declared as String name;, then you may only ever assign string values to name. You cannot change the type of a variable after it is declared (except in JShell). Because every variable, method return value, and operator result has a statically-known type, every expression in Java has a statically-known type: the type of (1 + 1) is int, the type of Math.sqrt(2) is double, the type of (5 < 3) is boolean, and the type of "hello".substring(1, 5) is String.

The benefit of this type annotation is that bugs are caught much earlier, saving you time in the long run (even if it feels frustrating at first). Incompatible types are caught during compilation without having to run the program; this process is so fast that, in practice, such bugs are underlined in red as you type. This gives you immediate feedback on the logical consistency of the code you are trying to write—pretty spiffy.
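A short sketch of what this looks like in practice (TypeDemo and lastFour() are hypothetical names): each variable below is declared with the static type of the expression assigned to it, and the commented-out lines are assignments that the compiler would reject.

```java
public class TypeDemo {
    /** Return the last four characters of "hello". */
    static String lastFour() {
        return "hello".substring(1, 5);
    }

    public static void main(String[] args) {
        int sum = 1 + 1;             // (1 + 1) has type int
        double root = Math.sqrt(2);  // Math.sqrt(2) has type double
        boolean smaller = 5 < 3;     // (5 < 3) has type boolean
        String tail = lastFour();    // substring() returns a String

        // Each line below would be rejected by the compiler if uncommented:
        // int score = "five";       // A String cannot be assigned to an int
        // String name = 5;          // An int cannot be assigned to a String
        // boolean flag = 1;         // An int cannot be assigned to a boolean

        System.out.println(sum + " " + root + " " + smaller + " " + tail);
    }
}
```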

Scope

Above, we noted that most Java code is written in method bodies, and methods get defined inside of classes. And within a method body, we’re likely to have blocks and nested blocks associated with various control structures. Each of these levels of nesting provides scope for variables declared within them.

For the procedural code snippets in this document, we are concerned with local variables, which are variables declared inside of a method body. Such variables are only accessible by other code in that method body; in fact, they are only accessible by other code in the same block (including nested blocks). Local variables are also confined to a single method invocation; if a method is called recursively, each recursive call gets its own copy of any local variables. Local variables are never used to communicate results between different pieces of code; such coordination must instead be done via return values, shared mutable values (not “shared variables”), or fields.

In the following example, variable x is declared inside of the if block, which restricts its scope. x does not exist outside of that scope, so trying to access its value is meaningless and results in a compile-time error. This forces you to reconsider your intent—if you want to refer to the same variable both inside and outside the if block, then it must be declared outside that block (and that declaration would need to provide an initial value). Note that Python and MATLAB would encounter a runtime error if the condition were changed to false (e.g. -1 > 0) because they do not provide an initial value.

Python:
if 1 > 0:
    x = 5
print(x)
# Prints 5

Java (snippet):
if (1 > 0) {
    int x = 5;
}
System.out.println(x);
// Compile-time error:
// x is not in scope

MATLAB:
if 1 > 0
    x = 5;
end
disp(x)
% Prints 5
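One way to repair the Java snippet, sketched here with made-up names (ScopeDemo, choose(), factorial()), is to declare x before the if block and give it an initial value. The second method illustrates the earlier remark that each recursive call gets its own copy of a local variable.

```java
public class ScopeDemo {
    /** Return 5 if flag is true, and 0 otherwise. */
    static int choose(boolean flag) {
        int x = 0;  // Declared (and initialized) outside the if block
        if (flag) {
            x = 5;  // Assignment only; no type prefix, so no new variable
        }
        return x;   // x is still in scope here
    }

    /** Return n factorial, for n >= 0. */
    static int factorial(int n) {
        int result = 1;  // Every invocation gets its own result variable
        if (n > 1) {
            result = n * factorial(n - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(choose(true));   // 5
        System.out.println(choose(false));  // 0
        System.out.println(factorial(5));   // 120
    }
}
```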

In addition to local variables, there is another kind of variable in Java used for object-oriented programming. Variables declared at class scope are known as instance variables, member variables, or fields (in MATLAB they are called “properties”; in Python, “data attributes”). From a class’s own methods, these would be prefixed by self. in Python or MATLAB, but Java does not require any prefix to distinguish them (unless they are shadowed by a parameter or local variable of the same name). You may optionally distinguish them with the prefix this., however (which is also how you would avoid shadowing ambiguity).
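Although fields belong to the object-oriented material that comes later, a tiny preview may help make the shadowing remark concrete. In this sketch (the Counter class is hypothetical), increment() refers to the field without any prefix, while reset() needs this. because its parameter has the same name as the field.

```java
public class Counter {
    int count = 0;  // A field: declared at class scope, one per Counter object

    /** Increase this counter's count by 1. */
    void increment() {
        count = count + 1;  // No prefix is needed to refer to the field
    }

    /** Replace this counter's count with the given value. */
    void reset(int count) {
        // The parameter shadows the field, so this. is needed to reach the field
        this.count = count;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.increment();
        c.increment();
        System.out.println(c.count);  // 2
        c.reset(10);
        System.out.println(c.count);  // 10
    }
}
```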

It is possible to mimic “global variables” in Java via static fields; don’t do this. Global variables make code difficult to reuse and will not play a role in this course.

Arrays

Arrays are somewhat analogous to Python lists or MATLAB vectors, but there are some very important differences:

Python:
u = [3, 1, 4]
v = 3*[0]
for i in range(len(u)):
    v[i] = 2*u[i]

Java (snippet):
int[] u = new int[] {3, 1, 4};
int[] v = new int[u.length];
for (int i = 0; i < u.length; ++i) {
    v[i] = 2*u[i];
}

MATLAB:
u = [3, 1, 4];
v = zeros(1, 3);
for i = 1:length(u)
    v(i) = 2*u(i);
end
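The Java snippet can be wrapped in a complete class to experiment with; the sketch below uses invented names (ArrayDemo, doubled()) and adds comments on a few well-known properties of Java arrays: indices start at 0, the length is fixed when the array is created, and new int arrays start out filled with zeros.

```java
import java.util.Arrays;

public class ArrayDemo {
    /** Return a new array containing each element of u multiplied by 2. */
    static int[] doubled(int[] u) {
        int[] v = new int[u.length];  // Elements start out as 0
        for (int i = 0; i < u.length; ++i) {
            v[i] = 2 * u[i];
        }
        return v;
    }

    public static void main(String[] args) {
        int[] u = {3, 1, 4};  // Shorthand initializer, allowed in a declaration
        int[] v = doubled(u);
        System.out.println(Arrays.toString(v));  // [6, 2, 8]
        System.out.println(v.length);            // 3; the length is fixed at creation
        System.out.println(v[0]);                // 6; indices start at 0
        // v[3] = 5;  // Would throw ArrayIndexOutOfBoundsException at runtime
    }
}
```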

Control structures

Conditionals and loops typically behave similarly to other languages, so this comparison table should help you translate the syntax in most cases:

Python:
# Conditional
if x > y:
    return 1
elif x < y:
    return -1
else:
    return 0

# Definite iteration
for i in range(10):
    print(v[i])

# Iteration over list
for student in roster:
    print(student)

# Indefinite iteration
while v[i] != 0:
    i = i + 1

Java (snippet):
// Conditional
if (x > y) {
    return 1;
} else if (x < y) {
    return -1;
} else {
    return 0;
}

// Definite iteration
for (int i = 0; i < 10; ++i) {
    System.out.println(v[i]);
}

// Iteration over list
for (String student : roster) {
    System.out.println(student);
}

// Indefinite iteration
while (v[i] != 0) {
    i = i + 1;
}

MATLAB:
% Conditional
if x > y
    z = 1
elseif x < y
    z = -1
else
    z = 0
end

% Definite iteration
for i = 0:9
    % MATLAB array indices
    % are 1-based
    disp(v(i+1))
end

% Iteration over list
for student = roster
    disp(student)
end

% Indefinite iteration
while v(i) ~= 0
    i = i + 1;
end

Misc. syntax

You won’t need to use the following syntax in your own code, but you may come across it in examples, so it’s worth knowing what to look out for.

Arithmetic-assignment statements: i += 1;, b /= 2;, etc.
Shorthand for operating on a variable and storing the result back in that variable. Equivalent to i = i + 1;, b = b/2;, etc. Can also be used for string concatenation. They’re harmless, and some programmers prefer them.
Pre-increment operators: ++i, --j
Increment (or decrement) the variable by 1 and evaluate to the new value. This forms a side-effecting expression, which can be dangerous for programmers to reason about. While some data structure operations can be written very succinctly using such operators, saving a line of code is not worth the risk. This course will only use these operators as statements, not expressions (for example, incrementing a loop variable in a for-loop, where it is the dominant idiom).
Post-increment operators: i++, j--
Increment (or decrement) the variable by 1 and evaluate to the old value. The above remarks about pre-increment operators also apply here. Note that, when the expression’s value is unused (i.e., when employed as an increment statement), there is no difference between pre-increment and post-increment operators.
Conditional expression: (a < b) ? -1 : 1
This is a ternary operator, meaning it involves three operands. It acts like an if/else statement, but in an expression context. If the first operand is true, the expression evaluates to the second operand; otherwise, it evaluates to the third operand. Code that uses this operator can be hard to read, but it is much more succinct and takes on a “functional programming” flavor. We will use this operator sparingly in this course.
Exponentiation
Java does not have an exponentiation operator to take numbers to various powers (i.e. 5**3 in Python or 5^3 in MATLAB to compute 5*5*5==125). The closest analogue is the static method Math.pow(5, 3), but note that it returns a floating-point result. For small integer powers, just write out the multiplication, or write a utility function. Do not try to use the caret (^) operator for exponentiation in Java; it has a very different meaning, just as in Python (see below).
Bitwise operators: ~, ^, &, |, <<, >>, >>>
Sometimes you need to manipulate the bits that make up binary numbers. These operators perform bitwise NOT, XOR, AND, and OR operations, or shift bits to the left or right (possibly preserving the sign bit). We do not intend to use them in this course, but expect to see them in CS 3410.
Hexadecimal and octal literals: 0x1F == 31 == 037
The prefix 0x signifies that the following letters and digits represent a base-16 (hexadecimal) integer; this is often convenient when doing bitwise arithmetic. Never write a leading zero in front of an integer in Java (e.g. 037)—this interprets the number in base-8 (octal), which no one will be expecting.
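To make the syntax in this list easier to recognize, here is a sketch that uses each form once. The class MiscSyntaxDemo and its methods order() and intPow() are hypothetical; intPow() is one possible shape for the integer-power utility function mentioned above, assuming a non-negative exponent and a result small enough to fit in an int.

```java
public class MiscSyntaxDemo {
    /** Return -1 if a < b and 1 otherwise, using a conditional expression. */
    static int order(int a, int b) {
        return (a < b) ? -1 : 1;
    }

    /** Return base raised to a non-negative integer power, using int arithmetic. */
    static int intPow(int base, int exponent) {
        int result = 1;
        for (int i = 0; i < exponent; ++i) {  // ++i used as a statement
            result *= base;                   // Same as result = result * base;
        }
        return result;
    }

    public static void main(String[] args) {
        int i = 5;
        int pre = ++i;   // i becomes 6; pre is 6 (the new value)
        int post = i++;  // i becomes 7; post is 6 (the old value)
        System.out.println(i + " " + pre + " " + post);  // 7 6 6

        System.out.println(order(2, 3));     // -1
        System.out.println(intPow(5, 3));    // 125
        System.out.println(Math.pow(5, 3));  // 125.0 (a double)
        System.out.println(5 ^ 3);           // 6: bitwise XOR, not exponentiation
        System.out.println(0x1F == 31 && 037 == 31);  // true
        System.out.println(-8 >> 1);         // -4: sign bit preserved
        System.out.println(-8 >>> 28);       // 15: zeros shifted in from the left
    }
}
```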

Summary

We know this is a lot to take in at once, so bookmark this page as a reference. For more details and examples of Java syntax, read the Language Basics lesson in the Java Tutorial.