Threads

The next several lectures will all be about doing multiple computations at once. As we saw in the previous lecture, real software needs to deal with concurrency (managing different events that might all happen at the same time) and parallelism (harnessing multiple processors to get work done faster than a single processor on its own). Compared to sequential code, concurrency and parallelism require fundamental changes to the way software works and how it interacts with hardware.

Here are some examples of software that needs concurrency or parallelism:

  • A web server needs to handle concurrent requests from clients. It cannot control when requests arrive, so they may be concurrent.
  • A web browser might want to issue concurrent requests to servers. This time, the software can control when requests happen—but for performance, it is a good idea to let requests overlap. For example, you can start a request to server A, start a request to server B, and only then wait for either request to finish. That’s concurrency.
  • A machine learning application wants to harness multiple CPU cores to make its linear-algebra operations go faster: for example, by dividing a matrix across several cores and working on each partition in parallel.

Threads are an OS concept that a single process can use to exploit concurrency and parallelism.

What Is a Thread?

A thread is an execution state within a process. One process has one or more threads. Each thread has its own thread-specific state: the program counter, the contents of all the CPU registers, and the stack. However, all the threads within a process share a virtual address space, and they share a single heap.

One way to define a thread is to think of it as “like a process, but within a process.” That is, you already know that processes have separate code (so they can run separate programs), separate register state, separate program counters, and separate memory address spaces. Threads are similar, except that threads exist within a process, and all threads within a process share their virtual memory. All threads within a process run the same program (they have the same text segment)—they may just execute different parts of that program concurrently. Threads also share the data segment and file descriptors.
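
To make this concrete, here is a minimal sketch using POSIX threads (pthreads), one common threading API. Both threads run code from the same program, but each starts executing at its own function:

    #include <pthread.h>
    #include <stdio.h>

    /* Both threads execute code from the same text segment,
     * but each one starts at this function with its own argument. */
    void *say_hello(void *arg) {
        printf("hello from thread %s\n", (char *)arg);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, say_hello, "one");
        pthread_create(&t2, NULL, say_hello, "two");
        pthread_join(t1, NULL);   /* wait for each thread to finish */
        pthread_join(t2, NULL);
        return 0;
    }

(On most systems, this compiles with cc threads.c -pthread.)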

When a process has multiple threads, it has multiple stacks in memory. Recall the typical memory layout for a process. When there are multiple threads, everything remains the same (the heap, text, and data segments are all unchanged) except that there are multiple stacks coexisting side-by-side in the virtual address space.
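
One way to observe the separate stacks is to print the address of a stack-allocated local variable from each thread next to the address of a shared global. This is just an illustrative sketch (the exact addresses will vary from run to run):

    #include <pthread.h>
    #include <stdio.h>

    int shared = 42;              /* data segment: one copy for all threads */

    void *show_addresses(void *arg) {
        int local = 0;            /* lives on this thread's own stack */
        printf("thread %s: &local = %p, &shared = %p\n",
               (char *)arg, (void *)&local, (void *)&shared);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, show_addresses, "A");
        pthread_create(&t2, NULL, show_addresses, "B");
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
    }

The two threads print different addresses for local (each has its own stack) but the same address for shared.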

The threads within a process share a single heap. That means that threads can easily communicate through the heap: one thread can allocate some memory and put some data there and then simply let another thread read that data. This shared memory mechanism is both incredibly convenient and ridiculously error prone. (We will get more experience with the problems it can cause later.)
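
As a sketch of that pattern: the main thread allocates a buffer on the heap, a worker thread fills it in, and the main thread reads the result once the worker finishes. The pthread_join call is what makes this safe here; without it, reading the buffer would race with the worker's writes:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    void *fill_buffer(void *arg) {
        int *buf = arg;           /* the same heap pointer main allocated */
        for (int i = 0; i < 4; i++)
            buf[i] = i * i;
        return NULL;
    }

    int main(void) {
        int *buf = malloc(4 * sizeof(int));
        pthread_t worker;
        pthread_create(&worker, NULL, fill_buffer, buf);
        pthread_join(worker, NULL);   /* wait before reading the shared data */
        for (int i = 0; i < 4; i++)
            printf("%d ", buf[i]);
        printf("\n");
        free(buf);
        return 0;
    }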

A thread’s state includes its registers, among them the program counter and the stack pointer. The OS scheduler takes care of switching not only between processes but also between the threads within a process. When the computer has multiple CPU cores (as all modern machines do), the OS may also schedule a process’s threads onto separate cores whenever more than one of them has work to do.

Why Threads?

You may be wondering why we would use threads at all. And do threads make sense with just a single core (spoiler: yes!)?

The key benefit of threads over processes is that all threads within a process run the same program and share virtual memory. This encourages a natural program structure that would be hard to achieve with separate processes: it would be rather clunky and tedious to fork() off separate child processes to update the screen, fetch data, and receive user input. Processes must use an inter-process communication mechanism (e.g., signals, pipes, files) to pass data between each other, and these mechanisms tend to be significantly more expensive than simply reading and writing shared memory.
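
To make the contrast concrete, here is roughly what passing a single integer from a child process back to its parent looks like with fork() and a pipe:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        pipe(fds);            /* fds[0] is the read end, fds[1] the write end */
        if (fork() == 0) {
            int result = 42;  /* child: compute something and send it back */
            write(fds[1], &result, sizeof(result));
            _exit(0);
        }
        int result;
        read(fds[0], &result, sizeof(result));   /* parent: receive the value */
        wait(NULL);
        printf("child sent %d\n", result);
        return 0;
    }

Every byte travels through the kernel. With threads, the "communication" would be a plain shared variable or pointer.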

Since they share memory, threads make it easy to write programs that must manage logically concurrent tasks. Even on a system with a single core, threads can make programs more responsive and efficient. One thread could be processing data in a buffer while another fetches new data to push onto the end of the same buffer. Yet another thread could be responsible for updating the screen.
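
Here is a sketch of that buffer idea with two threads and a mutex protecting the shared state. (A real implementation would use condition variables rather than the consumer's busy-wait loop, but this keeps the sketch short.)

    #include <pthread.h>
    #include <stdio.h>

    #define N 8
    int buffer[N];                /* shared buffer, protected by lock */
    int count = 0;
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    void *producer(void *arg) {
        for (int i = 0; i < N; i++) {
            pthread_mutex_lock(&lock);
            buffer[count++] = i;  /* push new data onto the end */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    void *consumer(void *arg) {
        int seen = 0;
        while (seen < N) {
            pthread_mutex_lock(&lock);
            if (count > 0) {      /* pop one item off the end and process it */
                printf("consumed %d\n", buffer[--count]);
                seen++;
            }
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t p, c;
        pthread_create(&p, NULL, producer, NULL);
        pthread_create(&c, NULL, consumer, NULL);
        pthread_join(p, NULL);
        pthread_join(c, NULL);
        return 0;
    }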