Concurrency
Concurrency is the simultaneous execution of multiple threads. These threads may execute within the same program, which is called multithreading. Modern computers also allow different processes, running separate programs, to execute at the same time; this is called multiprogramming. As we have already seen in the context of user interfaces, Java, like most modern programming languages, supports concurrent threads: Java is a multithreaded programming language.
Concurrency is important for two major reasons. First, it helps us build good user interfaces, because one thread can be taking care of the UI while other threads are doing the real computational work of the application. JavaFX has a separate Application thread that starts when the first JavaFX window is opened, and handles delivery of events to JavaFX nodes. JavaFX applications can also start other threads that run in the background and get work done for the application even while the user is using it. Second, modern computers have multiple processors that can execute threads in parallel, so concurrency lets you take full advantage of your computer's processing power by giving all the available processors work to do.
Programming concurrent applications is challenging because different threads can interfere with each other, and it is hard to reason about all the ways that this can happen. Some additional techniques and design patterns help.
Concurrency vs. parallelism
Modern computers usually have multiple processors that can be simultaneously computing different things. The individual processors are called cores when they are located together on the same chip, as they are in most modern multicore machines. Multiprocessor systems have existed for a long time, but prior to multicore systems, the processors were located on different chips.
Concurrency is different from, but related to, parallelism. Parallelism is when different hardware units (e.g., cores) are doing work at the same time. Other forms of parallelism exist: graphics processing units (GPUs), network, and disk hardware all do work in parallel with the main processor(s). Modern processors even use parallelism when they are executing a single thread, because they use pipelining and other techniques to execute multiple machine instructions from a single thread at the same time.
Thus, concurrency can be present even when there is no parallelism, and parallelism can be present without concurrency. However, parallelism makes concurrency more effective, because concurrent threads can execute in parallel on different cores.
Concurrent threads can also execute on a single core. To implement the abstraction that the threads are all running at the same time, the core rapidly switches among the threads, making a little progress on executing each thread before moving on to the next thread. This is called context switching. One problem with context switching is that it takes a little time to set up the hardware state for a new context.
The JVM and your operating system automatically allocate threads to cores, so you usually don't need to worry about how many cores you have. However, creating a very large number of threads is usually not very efficient, because it forces cores to context-switch frequently.
Programming with threads in Java
In Java, the key class for concurrency is java.lang.Thread. A Thread object
represents a thread of execution that can be started to carry out some
computation. The most important part of its interface is as follows:
class Thread {
    /** Start a new thread that executes this.run(). */
    public void start();

    /** Effects: do this thread's work. The default implementation does nothing,
     *  so subclasses override it. */
    public void run();

    /** Hint that the current thread is willing to let other threads run.
     *  (This is actually a static method.) Other threads may preempt the
     *  current thread even if yield() is never called. */
    public static void yield();

    /** Set whether this thread is a daemon thread. Must be called before start(). */
    public void setDaemon(boolean b);
}
Thread objects have other methods, such as the deprecated stop(), but these
should be avoided; there are better ways to accomplish what they do.
To start a new thread, we create a subclass of Thread
whose run()
method is overridden to do something useful. Whatever it does will be
run concurrently with other threads in the program.
For example, consider a program where we want to start a long-running computation when the user clicks a button. We don't want to do this computation inside the Application thread because this will stop the user interface while the computation completes. Therefore, we can start a new thread when the button is clicked. This can be done very conveniently using two inner classes:
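The following sketch is ours, not the original listing; button and doLongComputation() are placeholder names. An anonymous EventHandler responds to the click, and an anonymous Thread subclass does the work off the Application thread:

button.setOnAction(new EventHandler<ActionEvent>() {
    public void handle(ActionEvent e) {
        // Starting a new thread lets handle() return immediately,
        // so the Application thread can keep processing UI events.
        Thread worker = new Thread() {
            public void run() {
                doLongComputation(); // hypothetical long-running work
            }
        };
        worker.start();
    }
});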
In Java, threads can preempt each other by starting to run
even when yield() is not
called. With preemptive concurrency, a thread that has run long enough might be
suspended automatically to allow other threads to run. It is nearly impossible
for the programmer to predict when preemption will occur, so careful
programming is needed to ensure the program works no matter when threads
are preempted.
Another useful method of Thread is setDaemon. A Java program
will not stop running until all non-daemon threads have stopped. If a thread should
automatically stop when the program is done, it should be marked as a daemon thread.
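For example (a sketch of our own, not from the notes), a thread that does periodic background housekeeping can be marked as a daemon so that it never keeps the program alive by itself:

Thread cleaner = new Thread() {
    public void run() {
        while (true) {
            // ...periodically do background housekeeping (hypothetical work)...
        }
    }
};
cleaner.setDaemon(true); // must be called before start()
cleaner.start();
// Once all non-daemon threads have finished, the JVM exits and this thread stops with it.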
Race conditions
We have to be careful about having threads share objects, because threads can interfere with each other. If two threads access the same object but only read information from it, it is not a problem. Read-only sharing is safe. But if one or both of the threads is updating the object's state, we need to make sure that the order in which the updates happen is fixed. Otherwise we have a race condition. Both read-write and write-write races are a problem.
For example, consider the following bank account simulation:
class Account {
    int balance;

    void withdraw(int n) {
        int b = balance - n;  // R1
        balance = b;          // W1
    }

    void deposit(int n) {
        int b = balance + n;  // R2
        balance = b;          // W2
    }
}
If the initial balance is 100 and two threads T1 and T2 are respectively
concurrently executing withdraw(50) and deposit(50), what can happen?
Clearly the final balance ought to be 100. But the actions of the
different threads can be interleaved in many different ways. Under some
of those interleavings, such as (R1, W1, R2, W2) or (R2, W2, R1, W1),
the final balance is indeed 100. But other interleavings are more
problematic: (R1, R2, W2, W1) destroys 50 dollars, and (R2, R1, W1, W2)
creates 50 dollars. The problem is the races between R1 and W2 and
between R2 and W1, as well as the write-write race between W1 and W2.
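To see the race in action, we can run the two operations in separate threads. The harness below is a sketch of ours (it assumes it runs in the same package as Account, since balance has default access):

Account account = new Account();
account.balance = 100;
Thread t1 = new Thread() {
    public void run() { account.withdraw(50); }
};
Thread t2 = new Thread() {
    public void run() { account.deposit(50); }
};
t1.start();
t2.start();
t1.join();  // join() waits for a thread to finish; it can throw InterruptedException
t2.join();
// account.balance is usually 100, but a bad interleaving can leave 50 or 150.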
We can fix this code by controlling which interleavings are possible. In
particular, we want only interleavings in which the methods
withdraw() and deposit() execute atomically,
meaning that their execution can be thought of as an indivisible unit that
cannot be interrupted by another thread. This does not mean that when one
thread executes, say, withdraw(), all other threads are suspended.
However, it does mean that as far as the programmer is concerned, the system
acts as if this were true.
Critical sections and atomicity
We have been seeing that sharing mutable objects between different
threads is tricky. We need some kind of synchronization between the
different threads to prevent them from interfering with each other in
undesirable ways. For example, we saw that the following two methods
on the Account object got us into trouble:
void withdraw(int n) {
    balance -= n;
}

void deposit(int n) {
    balance += n;
}
There is a problem here even though the updates to balance are done
in one statement rather than in two as in the example above.
Execution of one thread may pause in the middle of that statement,
because a compound assignment like balance -= n still performs a read
of balance followed by a write, so writing the update as one statement
doesn't help. Two threads that are simultaneously executing withdraw
and deposit, or even two threads both simultaneously executing
withdraw, may cause the balance to be updated in a way that doesn't
make sense.
This example shows that sometimes a piece of code needs to be executed as though nothing else in the system is making updates. Such code segments are called critical sections. They need to be executed atomically and in isolation: that is, without interruption from or interaction with other threads.
However, we don't want to stop all threads just because one thread has entered a critical section. So we need a mechanism that only stops the interactions of other threads with this one. This is usually achieved by using locks. (Recently, software- and hardware-based transaction mechanisms have become a popular research topic, but locks remain for now the standard way to isolate threads.)
Mutexes and synchronized
Mutexes are mutual exclusion locks. There are two main operations on
mutexes: acquire() and release(). The acquire() operation tries
to acquire the mutex for the current thread. At most one thread can
hold a mutex at a time. While a lock is being held by a thread, all
other threads that try to acquire the lock will be blocked until the
lock is released, at which point just one waiting thread will manage
to acquire it.
Java supports mutexes directly. Every object has a mutex implicitly
associated with it. There is no way to directly invoke the acquire()
and release() operations on an object o; instead, we use the
synchronized statement to acquire the object's mutex, to perform
some action, and to release the mutex:
synchronized (o) {
    // ...perform some action while holding o's mutex...
}
The synchronized statement is useful because it makes sure that the
mutex is released no matter how the statement finishes executing,
even if it is through an exception. You can't call the underlying
acquire() and release() operations explicitly, but if you could,
the above code using synchronized would be equivalent to this:
o.acquire();
try {
    // ...perform some action while holding o's mutex...
} finally {
    o.release();
}
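Java's standard library does provide locks with explicit operations: java.util.concurrent.locks.ReentrantLock has lock() and unlock() methods, and the idiom for using them follows exactly this pattern (a sketch):

ReentrantLock lock = new ReentrantLock();

lock.lock();            // acquire
try {
    // ...perform some action while holding the lock...
} finally {
    lock.unlock();      // release, even if an exception was thrown
}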
Mutexes take up space, but a mutex is created for an object only when
the object is first used for a synchronized statement, so normally
they don't add much overhead.
Mutex syntactic sugar
Using mutexes we can protect the withdraw() and deposit() methods
from themselves and from each other, using the receiver object's
mutex:
void withdraw(int n) {
    synchronized (this) {
        balance -= n;
    }
}

void deposit(int n) {
    synchronized (this) {
        balance += n;
    }
}
Because the pattern of wrapping entire method bodies in
synchronized(this) is so common, Java has syntactic sugar for it.
Declaring a method to be synchronized has the same effect:
synchronized void withdraw(int n) {
    balance -= n;
}

synchronized void deposit(int n) {
    balance += n;
}
Mutex variations
Java mutexes are reentrant mutexes, meaning that it is harmless for a single thread to acquire the same mutex more than once. One consequence is that one synchronized method can call another on the same object without getting stuck trying to acquire the same mutex. Each mutex keeps track of the number of times the thread has acquired the mutex, and the mutex is only really released once it has been released by the holding thread the same number of times.
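For example, reentrancy is what makes it safe to add a method like the following to the synchronized version of Account (a sketch of ours):

/** Deposit n dollars twice. This method already holds the receiver's mutex,
 *  and each call to deposit(n) re-acquires that same mutex, which succeeds
 *  only because Java mutexes are reentrant. */
synchronized void depositTwice(int n) {
    deposit(n);
    deposit(n);
}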
A locking mechanism closely related to the mutex is the semaphore, named after railway semaphores. A binary semaphore acts just like a (non-reentrant) mutex, except that a thread is not required to hold the semaphore in order to release it. In general, a semaphore can be held by up to some fixed number of threads at once, and additional threads trying to acquire it block until some releases happen. Semaphores are the original locking abstraction, and they make possible some additional concurrent algorithms. But semaphores are harder than mutexes to use successfully.
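Java provides this abstraction as java.util.concurrent.Semaphore. As a sketch, a semaphore created with three permits limits a resource to three concurrent users:

Semaphore permits = new Semaphore(3); // at most 3 threads may hold a permit at once

void useResource() throws InterruptedException {
    permits.acquire();      // blocks if all 3 permits are taken
    try {
        // ...use the shared resource...
    } finally {
        permits.release();  // any thread may release; permits are not owned
    }
}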
When is synchronization needed?
Synchronization is not free. It involves manipulating data structures, and on a machine with multiple processors (or cores), requires communication between the processors. When one is trying to make code run fast, it is tempting to cheat on synchronization. Usually this leads to disaster.
Synchronization is needed whenever we need to rely on invariants on the state of objects, either between different fields of one or more objects, or between contents of the same field at different times. Without synchronization there is no guarantee that some other thread won't be simultaneously modifying the fields in question, leading to an inconsistent view of their contents.
Synchronization is also needed when we need to make sure that one thread sees the updates caused by another thread. It is possible for one thread to update an instance variable and another thread to later read the same instance variable and see the value it had before the update. This inconsistency arises because different threads may run on different processors. For speed, each processor has its own local copy of memory, but updates to local memory need not propagate immediately to other processors. For example, consider two threads executing the following code in parallel:
Suppose x and y are shared fields that are both initially 0.

Thread 1:
    y = 1;
    x = 1;

Thread 2:
    while (x == 0) {}
    System.out.println(y);
What possible values of y might be printed by thread 2? Naively it
looks like the only possible value is 1. But without synchronization
between these two threads, the update to x can be seen by thread 2
without seeing the update to y. The fact that the assignment to y
happened before x does not matter!
The reliable way to ensure that updates done by one thread are seen by another is to explicitly synchronize the two threads. Synchronization is needed for all accesses to mutable state that is shared between threads. The mutable state might be entire objects, or, for finer-grained synchronization, just mutable fields of objects. Each piece of mutable state should be protected by a lock. When the lock protecting a shared mutable field is not being held by the current thread, the programmer must assume that its value can change at any time. Any invariant that involves the value of such a field cannot be relied upon.
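For the x/y example above, one fix (a sketch) is to guard both fields with a single mutex, so that thread 2's reads synchronize with thread 1's writes:

class Shared {
    private int x = 0, y = 0;

    synchronized void publish() { y = 1; x = 1; } // thread 1 calls this
    synchronized int getX() { return x; }
    synchronized int getY() { return y; }
}

// Thread 2:
//   while (shared.getX() == 0) {}
//   System.out.println(shared.getY()); // guaranteed to print 1

Because both threads acquire the same mutex, once thread 2 observes the update to x it is also guaranteed to observe the earlier update to y.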
Note that immutable state shared between threads doesn't need to be locked, because no thread will try to update it. This fact encourages a style of programming that avoids mutable state.
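For example, an object like the following can be freely shared among threads without locking, because its state cannot change after construction (a sketch):

final class Point {
    final int x, y;

    Point(int x, int y) { this.x = x; this.y = y; }
    // No setters: a Point never changes after construction,
    // so concurrent reads need no synchronization.
}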