Concurrency
Concurrency is the simultaneous execution of multiple threads. These threads may execute within the same program, which is called multithreading. Modern computers also allow different processes, running separate programs, to execute at the same time; this is called multiprogramming. As we have already seen in the context of user interfaces, Java, like most modern programming languages, supports concurrent threads: Java is a multithreaded programming language.
Concurrency is important for two major reasons. First, it helps us build good user interfaces, because one thread can be taking care of the UI while other threads are doing the real computational work of the application. JavaFX has a separate Application thread that starts when the first JavaFX window is opened, and handles delivery of events to JavaFX nodes. JavaFX applications can also start other threads that run in the background and get work done for the application even while the user is using it. Second, modern computers have multiple processors that can execute threads in parallel, so concurrency lets you take full advantage of your computer's processing power by giving all the available processors work to do.
Programming concurrent applications is challenging because different threads can interfere with each other, and it is hard to reason about all the ways that this can happen. Some additional techniques and design patterns help.
Concurrency vs. parallelism
Modern computers usually have multiple processors that can be simultaneously computing different things. The individual processors are called cores when they are located together on the same chip, as they are in most modern multicore machines. Multiprocessor systems have existed for a long time, but prior to multicore systems, the processors were located on different chips.
Concurrency is different from, but related to, parallelism. Parallelism is when different hardware units (e.g., cores) are doing work at the same time. Other forms of parallelism exist: graphics processing units (GPUs), network, and disk hardware all do work in parallel with the main processor(s). Modern processors even use parallelism when they are executing a single thread, because they use pipelining and other techniques to execute multiple machine instructions from a single thread at the same time.
Thus, concurrency can be present even when there is no parallelism, and parallelism can be present without concurrency. However, parallelism makes concurrency more effective, because concurrent threads can execute in parallel on different cores.
Concurrent threads can also execute on a single core. To implement the abstraction that the threads are all running at the same time, the core rapidly switches among the threads, making a little progress on executing each thread before moving on to the next thread. This is called context switching. One problem with context switching is that it takes a little time to set up the hardware state for a new context.
The JVM and your operating system automatically allocate threads to cores, so you usually don't need to worry about how many cores you have. However, creating a very large number of threads is usually not very efficient, because it forces cores to context-switch frequently.
Programming with threads in Java
In Java, the key class for concurrency is java.lang.Thread. A Thread object
represents a thread of execution that can be started to carry out some
computation. The most important part of its interface is as follows:
class Thread {
    /** Start a new thread that executes this.run(). */
    public void start();

    /** Effects: do this thread's work. The default implementation does nothing,
     *  so subclasses override it. */
    public void run();

    /** Hint that the current thread is willing to let other threads run.
     *  (This is actually a static method.) Other threads may preempt the
     *  current thread even if yield() is never called. */
    public static void yield();

    /** Set whether this thread is a daemon thread. Must be called before start(). */
    public void setDaemon(boolean b);
}
Thread objects have other methods, such as the deprecated stop(), but these
should be avoided; there are better ways to accomplish what they do.
To start a new thread, we create a subclass of Thread
whose run()
method is overridden to do something useful. Whatever it does will be
run concurrently with other threads in the program.
For example, consider a program where we want to start a long-running computation when the user clicks a button. We don't want to do this computation inside the Application thread because this will stop the user interface while the computation completes. Therefore, we can start a new thread when the button is clicked. This can be done very conveniently using two inner classes:
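The following sketch is ours, not the original listing; button and doLongComputation() are placeholder names. An anonymous EventHandler responds to the click, and an anonymous Thread subclass does the work off the Application thread:

button.setOnAction(new EventHandler<ActionEvent>() {
    public void handle(ActionEvent e) {
        // Starting a new thread lets handle() return immediately,
        // so the Application thread can keep processing UI events.
        Thread worker = new Thread() {
            public void run() {
                doLongComputation(); // hypothetical long-running work
            }
        };
        worker.start();
    }
});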
In Java, threads can preempt each other by starting to run
even when yield() is not
called. With preemptive concurrency, a thread that has run long enough might be
suspended automatically to allow other threads to run. It is nearly impossible
for the programmer to predict when preemption will occur, so careful
programming is needed to ensure the program works no matter when threads
are preempted.
Another useful method of Thread is setDaemon. A Java program
will not stop running until all non-daemon threads have stopped. If a thread should
automatically stop when the program is done, it should be marked as a daemon thread.
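For example (a sketch of our own, not from the notes), a thread that does periodic background housekeeping can be marked as a daemon so that it never keeps the program alive by itself:

Thread cleaner = new Thread() {
    public void run() {
        while (true) {
            // ...periodically do background housekeeping (hypothetical work)...
        }
    }
};
cleaner.setDaemon(true); // must be called before start()
cleaner.start();
// Once all non-daemon threads have finished, the JVM exits and this thread stops with it.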
Race conditions
We have to be careful about having threads share objects, because threads can interfere with each other. If two threads access the same object but only read information from it, it is not a problem. Read-only sharing is safe. But if one or both of the threads is updating the object's state, we need to make sure that the order in which the updates happen is fixed. Otherwise we have a race condition. Both read-write and write-write races are a problem.
For example, consider the following bank account simulation:
class Account {
    int balance;

    void withdraw(int n) {
        int b = balance - n;  // R1
        balance = b;          // W1
    }

    void deposit(int n) {
        int b = balance + n;  // R2
        balance = b;          // W2
    }
}
If the initial balance is 100 and two threads T1 and T2 are respectively
concurrently executing withdraw(50) and deposit(50), what can happen?
Clearly the final balance ought to be 100. But the actions of the
different threads can be interleaved in many different ways. Under some
of those interleavings, such as (R1, W1, R2, W2) or (R2, W2, R1, W1),
the final balance is indeed 100. But other interleavings are more
problematic: (R1, R2, W2, W1) destroys 50 dollars, and (R2, R1, W1, W2)
creates 50 dollars. The problem is the races between R1 and W2 and
between R2 and W1, as well as the write-write race between W1 and W2.
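To see the race in action, we can run the two operations in separate threads. The harness below is a sketch of ours (it assumes it runs in the same package as Account, since balance has default access):

Account account = new Account();
account.balance = 100;
Thread t1 = new Thread() {
    public void run() { account.withdraw(50); }
};
Thread t2 = new Thread() {
    public void run() { account.deposit(50); }
};
t1.start();
t2.start();
t1.join();  // join() waits for a thread to finish; it can throw InterruptedException
t2.join();
// account.balance is usually 100, but a bad interleaving can leave 50 or 150.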
We can fix this code by controlling which interleavings are possible. In
particular, we want only interleavings in which the methods
withdraw() and deposit() execute atomically,
meaning that their execution can be thought of as an indivisible unit that
cannot be interrupted by another thread. This does not mean that when one
thread executes, say, withdraw(), all other threads are suspended.
However, it does mean that as far as the programmer is concerned, the system
acts as if this were true.
Critical sections and atomicity
We have been seeing that sharing mutable objects between different
threads is tricky. We need some kind of synchronization between the
different threads to prevent them from interfering with each other in
undesirable ways. For example, we saw that the following two methods
on the Account object got us into trouble:
void withdraw(int n) {
    balance -= n;
}

void deposit(int n) {
    balance += n;
}
There is a problem here even though the updates to balance are done
in one statement rather than in two as in the example above.
Execution of one thread may pause in the middle of that statement,
because a compound assignment like balance -= n still performs a read
of balance followed by a write, so writing the update as one statement
doesn't help. Two threads that are simultaneously executing withdraw
and deposit, or even two threads both simultaneously executing
withdraw, may cause the balance to be updated in a way that doesn't
make sense.
This example shows that sometimes a piece of code needs to be executed as though nothing else in the system is making updates. Such code segments are called critical sections. They need to be executed atomically and in isolation: that is, without interruption from or interaction with other threads.
However, we don't want to stop all threads just because one thread has entered a critical section. So we need a mechanism that only stops the interactions of other threads with this one. This is usually achieved by using locks. (Recently, software- and hardware-based transaction mechanisms have become a popular research topic, but locks remain for now the standard way to isolate threads.)
Mutexes and synchronized
Mutexes are mutual exclusion locks. There are two main operations on
mutexes: acquire() and release(). The acquire() operation tries
to acquire the mutex for the current thread. At most one thread can
hold a mutex at a time. While a lock is being held by a thread, all
other threads that try to acquire the lock will be blocked until the
lock is released, at which point just one waiting thread will manage
to acquire it.
Java supports mutexes directly. Every object has a mutex implicitly
associated with it. There is no way to directly invoke the acquire()
and release() operations on an object o; instead, we use the
synchronized statement to acquire the object's mutex, to perform
some action, and to release the mutex:
synchronized (o) {
    // ...perform some action while holding o's mutex...
}
The synchronized statement is useful because it makes sure that the
mutex is released no matter how the statement finishes executing,
even if it is through an exception. You can't call the underlying
acquire() and release() operations explicitly, but if you could,
the above code using synchronized would be equivalent to this:
o.acquire();
try {
    // ...perform some action while holding o's mutex...
} finally {
    o.release();
}
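Java's standard library does provide locks with explicit operations: java.util.concurrent.locks.ReentrantLock has lock() and unlock() methods, and the idiom for using them follows exactly this pattern (a sketch):

ReentrantLock lock = new ReentrantLock();

lock.lock();            // acquire
try {
    // ...perform some action while holding the lock...
} finally {
    lock.unlock();      // release, even if an exception was thrown
}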
Mutexes take up space, but a mutex is created for an object only when
the object is first used for a synchronized statement, so normally
they don't add much overhead.
Mutex syntactic sugar
Using mutexes we can protect the withdraw() and deposit() methods
from themselves and from each other, using the receiver object's
mutex:
void withdraw(int n) {
    synchronized (this) {
        balance -= n;
    }
}

void deposit(int n) {
    synchronized (this) {
        balance += n;
    }
}
Because the pattern of wrapping entire method bodies in
synchronized(this) is so common, Java has syntactic sugar for it.
Declaring a method to be synchronized has the same effect:
synchronized void withdraw(int n) {
    balance -= n;
}

synchronized void deposit(int n) {
    balance += n;
}
Mutex variations
Java mutexes are reentrant mutexes, meaning that it is harmless for a single thread to acquire the same mutex more than once. One consequence is that one synchronized method can call another on the same object without getting stuck trying to acquire the same mutex. Each mutex keeps track of the number of times the thread has acquired the mutex, and the mutex is only really released once it has been released by the holding thread the same number of times.
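For example, reentrancy is what makes it safe to add a method like the following to the synchronized version of Account (a sketch of ours):

/** Deposit n dollars twice. This method already holds the receiver's mutex,
 *  and each call to deposit(n) re-acquires that same mutex, which succeeds
 *  only because Java mutexes are reentrant. */
synchronized void depositTwice(int n) {
    deposit(n);
    deposit(n);
}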
A locking mechanism closely related to the mutex is the semaphore, named after railway semaphores. A binary semaphore acts just like a (non-reentrant) mutex, except that a thread is not required to hold the semaphore in order to release it. In general, a semaphore can be held by up to some fixed number of threads at once, and additional threads trying to acquire it block until some releases happen. Semaphores are the original locking abstraction, and they make possible some additional concurrent algorithms. But semaphores are harder than mutexes to use successfully.
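Java provides this abstraction as java.util.concurrent.Semaphore. As a sketch, a semaphore created with three permits limits a resource to three concurrent users:

Semaphore permits = new Semaphore(3); // at most 3 threads may hold a permit at once

void useResource() throws InterruptedException {
    permits.acquire();      // blocks if all 3 permits are taken
    try {
        // ...use the shared resource...
    } finally {
        permits.release();  // any thread may release; permits are not owned
    }
}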
When is synchronization needed?
Synchronization is not free. It involves manipulating data structures, and on a machine with multiple processors (or cores), requires communication between the processors. When one is trying to make code run fast, it is tempting to cheat on synchronization. Usually this leads to disaster.
Synchronization is needed whenever we need to rely on invariants on the state of objects, either between different fields of one or more objects, or between contents of the same field at different times. Without synchronization there is no guarantee that some other thread won't be simultaneously modifying the fields in question, leading to an inconsistent view of their contents.
Synchronization is also needed when we need to make sure that one thread sees the updates caused by another thread. It is possible for one thread to update an instance variable and another thread to later read the same instance variable and see the value it had before the update. This inconsistency arises because different threads may run on different processors. For speed, each processor has its own local copy of memory, but updates to local memory need not propagate immediately to other processors. For example, consider two threads executing the following code in parallel:
Suppose x and y are shared fields that are both initially 0.

Thread 1:
    y = 1;
    x = 1;

Thread 2:
    while (x == 0) {}
    System.out.println(y);
What possible values of y might be printed by thread 2? Naively it
looks like the only possible value is 1. But without synchronization
between these two threads, the update to x can be seen by thread 2
without seeing the update to y. The fact that the assignment to y
happened before x does not matter!
The reliable way to ensure that updates done by one thread are seen by another is to explicitly synchronize the two threads. Synchronization is needed for all accesses to mutable state that is shared between threads. The mutable state might be entire objects, or, for finer-grained synchronization, just mutable fields of objects. Each piece of mutable state should be protected by a lock. When the lock protecting a shared mutable field is not being held by the current thread, the programmer must assume that its value can change at any time. Any invariant that involves the value of such a field cannot be relied upon.
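For the x/y example above, one fix (a sketch) is to guard both fields with a single mutex, so that thread 2's reads synchronize with thread 1's writes:

class Shared {
    private int x = 0, y = 0;

    synchronized void publish() { y = 1; x = 1; } // thread 1 calls this
    synchronized int getX() { return x; }
    synchronized int getY() { return y; }
}

// Thread 2:
//   while (shared.getX() == 0) {}
//   System.out.println(shared.getY()); // guaranteed to print 1

Because both threads acquire the same mutex, once thread 2 observes the update to x it is also guaranteed to observe the earlier update to y.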
Note that immutable state shared between threads doesn't need to be locked, because no thread will try to update it. This fact encourages a style of programming that avoids mutable state.
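For example, an object like the following can be freely shared among threads without locking, because its state cannot change after construction (a sketch):

final class Point {
    final int x, y;

    Point(int x, int y) { this.x = x; this.y = y; }
    // No setters: a Point never changes after construction,
    // so concurrent reads need no synchronization.
}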