Threads
The next several lectures will all be about writing software that does multiple computations at once. There are two primary motivations:
- Concurrency: You want to write a program that can manage different events in the “real world” that might all happen at the same time.
- Parallelism: You have a multicore machine, and you want to harness its cores to get work done faster.
Here are some examples of software that needs concurrency or parallelism:
- A web server needs to handle concurrent requests from clients. It cannot control when requests arrive, so they may be concurrent.
- A web browser might want to issue concurrent requests to servers. This time, the software can control when requests happen—but for performance, it is a good idea to let requests overlap. For example, you can start a request to server A, start a request to server B, and only then wait for either request to finish. That’s concurrency.
- GUI applications are naturally concurrent. You want to do the actual computational work for the program while simultaneously updating on-screen animations and responding quickly to mouse and keyboard events.
- A machine learning application wants to harness multiple CPU cores to make its linear-algebra operations go faster: for example, by dividing a matrix across several cores and working on each partition in parallel.
Compared to sequential code, concurrency and parallelism require fundamental changes to the way software works and how it interacts with hardware.
One strategy would be to use fork() to launch multiple processes whenever you need concurrency and parallelism.
This is possible but has downsides: processes must use an inter-process communication (IPC) mechanism (e.g., signals, pipes, or files) to exchange data with each other.
This lecture is about threads, which are an OS concept that lets a single process do multiple things at once.
What Is a Thread?
Recall that processes let different programs (or different “copies” of the same program) run on a single machine. A process consists of some execution state (page table, register values, program counter, etc.) that is distinct from every other process’s execution state.
A thread is an execution state within a process. One process has one or more threads. The threads within a process share some aspects of their state but not others. Namely:
- The threads within a process all share a single virtual address space. They also share one heap and one text segment, i.e., the same machine-code program.
- However, each thread within a process has its own program counter, CPU register contents, and call stack.
The consequence is that, impressionistically speaking, the threads within a process are all running the same program on the same data, but they are all at different points within that program at any given time.
One way to define a thread is to think of it as “like a process, but within a process.” That is, you already know that processes have separate code (so they can run separate programs), register states, separate program counters, and separate memory address spaces. Threads are similar, except that threads exist within a process, and all threads within a process share their virtual memory. Threads also share the data segment and file descriptors.
When a process has multiple threads, it has multiple stacks in memory. Recall the typical memory layout for a process. When there are multiple threads, everything remains the same (the heap, text, and data segments are all unchanged) except that there are multiple stacks coexisting side-by-side in the virtual address space.
Because the threads within a process share a single heap, they can easily communicate through the heap. One thread can allocate some memory and put some data there and then simply let another thread read that data. This shared memory mechanism is both incredibly convenient and ridiculously error prone. (We will get more experience with the problems it can cause later.)
A Small Demo
Let’s try writing a program that uses multiple threads. We’ll start with a somewhat odd-looking program without threads:
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

void* go(void* arg) {
    sleep(1);
    printf("GO\n");
    fflush(stdout);
    return NULL;
}

void* big(void* arg) {
    sleep(1);
    printf("BIG\n");
    fflush(stdout);
    return NULL;
}

void* red(void* arg) {
    sleep(1);
    printf("RED\n");
    fflush(stdout);
    return NULL;
}

int main() {
    printf("starting...\n");
    go(NULL);
    big(NULL);
    red(NULL);
    return 0;
}
That sleep function asks the OS to “pause” the calling program for (at least) the given number of seconds.
(Recalling what you know about how processes work, that concretely means that the OS kernel sets the process’s state to “blocked,” lets other processes run, and then switches it back to “ready” after the specified number of seconds elapses.)
Also, the fflush call flushes stdio’s output buffer so the text appears immediately instead of sitting in the buffer.
Try compiling and running this program. You should see little pauses between each line of output.
Next, let’s try running those three functions in three different threads.
Don’t worry about how those pthread_* function calls work exactly; we’ll cover those in a future lecture.
We’ll keep the three printing functions the same and just change main:
int main() {
    printf("starting...\n");

    pthread_t thread1;
    pthread_create(&thread1, NULL, go, NULL);
    pthread_t thread2;
    pthread_create(&thread2, NULL, big, NULL);
    pthread_t thread3;
    pthread_create(&thread3, NULL, red, NULL);

    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);
    pthread_join(thread3, NULL);
    return 0;
}
Try compiling and running this program too. You should see all three lines of output show up at the same time, after a one-second delay. And at least on my machine, the lines appear in a nondeterministic order, i.e., the order is different every time I run the program. Now is a good time to ponder why this indicates that the functions are running in different threads.
The OS View
Remember process control blocks (PCBs)? The PCB is the data structure within the kernel that contains all the relevant state for a given process.
To support threads, operating systems use an analogous data structure with an analogous name: a thread control block (TCB). There is one TCB per thread. Like a PCB, a TCB contains identity, status, and state information. The state must at least include register values, including the stack pointer and program counter.
The OS scheduler takes care of switching not only between processes but also between the threads in a process. When the computer has multiple CPU cores (as all modern machines do), the OS may also choose to schedule concurrent threads onto separate cores when there are multiple threads with work to do.
Summary
Here are some important takeaways to remember about threads:
- Threads are like processes in that they consist of separate execution states that can run concurrently, but they are unlike processes because they exist at a different level: namely, processes contain threads. Within a process, all threads share a virtual-memory address space. Different processes have different page tables and therefore different virtual-memory address spaces.
- Threads are useful for both concurrency (handling multiple, possibly-simultaneous external events) and parallelism (harnessing multiple processors for performance). That means that threads are useful both on single-core machines (for concurrency only) and on multicore machines (for both concurrency and parallelism).