Lecture 4: Threads and Scheduling

Threads
- TCB, kernel vs. user-level threads
Scheduling algorithms
- fairness, responsiveness, waiting time, simplicity, user-specified priority
- FCFS, LCFS, RR, SJF, SRTF, Adaptive multilevel queue

Threads

Multiple processes allow programmers to "do multiple things at the same time", but communication between processes is difficult. Inter-process communication (IPC) is possible, but it requires system calls and may require copying large amounts of data between processes.

Threads are like processes, but they share an address space. Multiple threads within a single process can communicate by simply reading and writing to shared variables in memory.
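
As a small illustration of communication through shared memory, the sketch below has four threads update one shared variable. The names `counter`, `lock` and `worker` are made up for this example, and the lock is an addition not discussed above: it is there so that concurrent increments are not lost.

```python
import threading

counter = 0                      # shared variable, visible to every thread
lock = threading.Lock()          # assumption: a lock guards updates to counter

def worker(n):
    """Increment the shared counter n times."""
    global counter
    for _ in range(n):
        with lock:               # without the lock, increments could be lost
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                     # wait for every thread to finish

print(counter)  # 40000
```

No data is copied between the threads and no IPC call is needed: each thread reads and writes the same `counter`.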

To create a thread, a programmer simply makes a system call to the kernel to fork a new thread (similar to how one forks a new process). The kernel creates a new Thread Control Block (TCB) to store the state of the new thread. The new TCB shares a Process Control Block (PCB) with the parent thread's TCB.

The state of the computation (registers, ready/running/waiting) is stored in the TCB, while the shared process-level information (VM configuration, permissions) is stored in the shared PCB.
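
The split between per-thread and per-process state can be sketched as two record types. This is only a toy model: the field names (`page_table`, `permissions`, `registers`) and the helper `fork_thread` are hypothetical, and a real kernel stores far more in both structures.

```python
from dataclasses import dataclass, field
from enum import Enum

class ThreadState(Enum):
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"

@dataclass
class PCB:
    """Process-level information shared by all threads of a process."""
    pid: int
    page_table: dict = field(default_factory=dict)   # stands in for VM configuration
    permissions: set = field(default_factory=set)

@dataclass
class TCB:
    """Per-thread state of the computation."""
    tid: int
    pcb: PCB                                          # shared, not copied
    registers: dict = field(default_factory=dict)
    state: ThreadState = ThreadState.READY

def fork_thread(parent: TCB, tid: int) -> TCB:
    """Hypothetical helper: the new TCB points at the parent's PCB."""
    return TCB(tid=tid, pcb=parent.pcb)

process = PCB(pid=1, permissions={"read"})
main = TCB(tid=1, pcb=process)
child = fork_thread(main, tid=2)
```

Here `child.pcb is main.pcb` holds, while each TCB has its own `registers` and `state`.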

This design is referred to as kernel-level threading (or simply "kernel threads"), because the kernel is responsible for managing the TCBs. An alternative design is user-level threading, in which processes manage their own threading, and switch between threads using normal jump instructions inside an application-level scheduler. The Async library used in recent offerings of CS3110 is an example of a user-level threading library.
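
To give a feel for user-level threading, here is a minimal sketch of an application-level scheduler built from Python generators. It is not how Async is implemented; `task` and `run` are names invented for this example, and each "thread" gives up control voluntarily with `yield` rather than being preempted.

```python
from collections import deque

def task(name, steps, log):
    """A user-level thread: does one step of work, then yields to the scheduler."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                      # voluntarily hand control back

def run(tasks):
    """Application-level scheduler: cycle through tasks until all have finished."""
    ready = deque(tasks)
    while ready:
        t = ready.popleft()
        try:
            next(t)                # resume the task until its next yield
            ready.append(t)        # still has work: back of the ready queue
        except StopIteration:
            pass                   # task finished: drop it

log = []
run([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

The kernel sees only one thread of control here; all switching happens inside the process.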

In order to support user-level threading, the kernel must provide a way for applications to request I/O without being transitioned to the waiting state. This is referred to as non-blocking or asynchronous I/O.
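
A rough sketch of what non-blocking I/O looks like from the application's side, assuming a POSIX system and using a pipe as a stand-in for any I/O source. The helper `try_read` is hypothetical; `os.pipe`, `os.set_blocking`, `os.read` and `os.write` are standard library calls.

```python
import os

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)    # ask the kernel not to block reads on this descriptor

def try_read(fd):
    """Return available bytes, or None if reading would have blocked."""
    try:
        return os.read(fd, 1024)
    except BlockingIOError:
        return None                # nothing ready: the caller can run another thread

first = try_read(read_fd)          # pipe is empty, so this returns immediately
os.write(write_fd, b"hello")
second = try_read(read_fd)         # data is now available

os.close(read_fd)
os.close(write_fd)
print(first, second)  # None b'hello'
```

A user-level scheduler can use the `None` result as a signal to switch to another thread instead of letting the whole process enter the waiting state.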

Scheduling

Thus far when discussing time-sharing between processes, we've simply said that when it is time to switch processes, the operating system selects a new process and then runs it. The details of "when it is time" and which process to select can have major impacts on system behavior.

We would like a scheduler that satisfies the following criteria:

- simplicity: the scheduler should be easy to reason about, fast, and must not use too many resources
- fairness: no process can be starved forever
- responsiveness: user-facing processes should respond quickly to input
- low waiting time: the total waiting time for all processes should be short
- flexible priority: perhaps it is useful for users to be able to indicate that some processes are "more important" than others
- predictability: bounded variation in any of the above metrics

We discussed the following algorithms:

First-come, first-served (FCFS): whenever a process becomes ready, it is placed at the tail of a queue. Whenever a process relinquishes the CPU, a new process is taken from the head of the queue and scheduled.
- Pros: simple, fair (no process starves).
- Cons: I/O-bound and CPU-bound processes are treated the same, so waiting time, responsiveness, priority and predictability can be poor.
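
To see why FCFS waiting time can be poor, the following sketch computes how long each job waits before it first runs. It assumes every job is ready at time 0, runs a single CPU burst to completion, and that context switches are free; the job names and burst lengths are made up.

```python
def fcfs_waiting_times(bursts):
    """bursts: list of (name, cpu_time) in arrival order, all ready at time 0."""
    waits, clock = {}, 0
    for name, burst in bursts:
        waits[name] = clock        # the job waits until everything ahead of it finishes
        clock += burst
    return waits

jobs = [("long", 10), ("short1", 1), ("short2", 1)]
waits = fcfs_waiting_times(jobs)
print(waits)                 # {'long': 0, 'short1': 10, 'short2': 11}
print(sum(waits.values()))   # 21
```

The two short jobs spend almost all of their time waiting behind the long one, simply because it arrived first.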

Round-robin (RR): FCFS with preemption. Before starting a process, the OS sets a timer to a fixed quantum. When the timer expires, the currently running process is placed at the tail of the queue and a new process is selected.
- Pros: simple, fair, more responsive and predictable than FCFS
- Cons: can have bad waiting time if there is a mix of long and short processes, not responsive if there are many processes
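
A minimal round-robin simulation, under the simplifying assumptions that all jobs are ready at time 0, each has a single CPU burst, and context switches cost nothing. The function name `round_robin` and the job list are illustrative only.

```python
from collections import deque

def round_robin(bursts, quantum):
    """Return the completion time of each job. bursts: list of (name, cpu_time)."""
    queue = deque(bursts)
    clock, finish = 0, {}
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)       # run until the timer fires or the job ends
        clock += run
        if remaining > run:
            queue.append((name, remaining - run))   # preempted: back to the tail
        else:
            finish[name] = clock
    return finish

finish = round_robin([("long", 10), ("short1", 1), ("short2", 1)], quantum=2)
print(finish)  # {'short1': 3, 'short2': 4, 'long': 12}
```

With a quantum of 2, the short jobs finish at times 3 and 4 instead of waiting for the long job's full 10 units.
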
Last-come, first-served (LCFS): like FCFS, but choose the most recently added process instead of the oldest.
- Pros: I/O bound processes become ready more often than CPU bound processes, so may be more responsive in some cases
- Cons: not fair: processes can starve (including I/O bound processes in pathological cases, hurting responsiveness)
Shortest Job First (SJF): Run processes without preemption. Always select the job whose next CPU burst will complete most quickly.
- Pros: Theoretically optimal waiting time.
- Cons: Theoretically impossible to know how long the next CPU burst will take (in general, predicting this is as hard as the halting problem).
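
The waiting-time benefit of SJF can be illustrated if we pretend the burst lengths are known in advance, which is exactly the assumption that cannot be met in practice. The sketch also assumes all jobs are ready at time 0; names and numbers are made up.

```python
def sjf_waiting_times(bursts):
    """bursts: list of (name, cpu_time), all ready at time 0, lengths known up front."""
    waits, clock = {}, 0
    # Run the shortest burst first; sorted() is stable, so ties keep arrival order.
    for name, burst in sorted(bursts, key=lambda job: job[1]):
        waits[name] = clock
        clock += burst
    return waits

jobs = [("long", 10), ("short1", 1), ("short2", 1)]
waits = sjf_waiting_times(jobs)
print(waits)                 # {'short1': 0, 'short2': 1, 'long': 2}
print(sum(waits.values()))   # 3
```

Running the same three jobs in arrival order would give a total wait of 0 + 10 + 11 = 21, so putting the short bursts first cuts the total from 21 to 3.
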
Shortest remaining time first (SRTF): Like shortest job first, but preempt processes with a fixed quantum. This improves responsiveness by allowing newly arrived short jobs (such as responding to a keypress) to execute immediately instead of waiting for the currently running job to finish.
- Pros: better responsiveness than SJF
- Cons: still impossible
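
The effect of preemption on a newly arrived short job can be sketched with a tick-by-tick simulation. This assumes a quantum of one time unit, known burst lengths (the impossible part), and free context switches; the job names `batch` and `keypress` are invented for the example.

```python
def srtf(jobs):
    """jobs: list of (name, arrival_time, cpu_time). Returns finish time per job."""
    remaining = {name: burst for name, _, burst in jobs}
    arrival = {name: arr for name, arr, _ in jobs}
    clock, finish = 0, {}
    while remaining:
        ready = [n for n in remaining if arrival[n] <= clock]
        if not ready:                  # nothing has arrived yet: idle for one tick
            clock += 1
            continue
        # Each tick, run the ready job with the least remaining time.
        current = min(ready, key=lambda n: remaining[n])
        remaining[current] -= 1
        clock += 1
        if remaining[current] == 0:
            del remaining[current]
            finish[current] = clock
    return finish

finish = srtf([("batch", 0, 8), ("keypress", 2, 1)])
print(finish)  # {'keypress': 3, 'batch': 9}
```

The keypress job arrives at time 2 and finishes at time 3. Without preemption it would have had to wait for the batch job to finish at time 8.
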
Adaptive multi-level queue: we approximate the running time for the next CPU burst by measuring the running time of the previous CPU burst. We'll add more details to this in tomorrow's lecture, but the basic idea is that we maintain a collection of queues ranging in priority from highest to lowest. If there are high priority jobs, we run them; if not, we look for slightly lower priority jobs, and so on.

Within a queue, jobs are scheduled using RR.

Each job is run with a quantum. If it consumes its quantum, then it looks like a CPU-bound job, so we decrease its priority by moving it to a lower priority queue: our goal is to quickly service user-facing (I/O-bound) processes. If it does I/O before its quantum expires, it looks like an I/O bound process, so we increase its priority.
- Pros: approximates optimal SJF if past behavior predicts future behavior. Can be easily integrated with user-specified priorities.
- Cons: less simple than the other schemes; low priority jobs can starve (until we adapt it tomorrow)
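
The promote/demote rule can be sketched as a small simulation. Several simplifications are assumed here and are not part of the lecture: I/O between CPU bursts is instantaneous, every job starts in the highest priority queue, and a burst that ends exactly when the quantum expires counts as doing I/O. The function `mlfq` and the job names are hypothetical.

```python
from collections import deque

def mlfq(jobs, quantum=2, levels=3):
    """jobs: {name: [cpu_burst, ...]}. Returns (name, level) pairs in scheduling order."""
    queues = [deque() for _ in range(levels)]        # index 0 is the highest priority
    bursts = {name: list(b) for name, b in jobs.items()}
    for name in bursts:
        queues[0].append(name)
    trace = []
    while any(queues):
        level = next(i for i, q in enumerate(queues) if q)   # highest non-empty queue
        name = queues[level].popleft()               # round-robin within the queue
        trace.append((name, level))
        burst = bursts[name][0]
        if burst > quantum:
            # Consumed the whole quantum: looks CPU-bound, so demote.
            bursts[name][0] = burst - quantum
            new_level = min(level + 1, levels - 1)
        else:
            # Did I/O before the quantum expired: looks I/O-bound, so promote.
            bursts[name].pop(0)
            new_level = max(level - 1, 0)
        if bursts[name]:                             # job still has work left
            queues[new_level].append(name)
    return trace

trace = mlfq({"cpu": [6], "io": [1, 1, 1]})
print(trace)
# [('cpu', 0), ('io', 0), ('io', 0), ('io', 0), ('cpu', 1), ('cpu', 2)]
```

The `cpu` job sinks one level each time it uses its full quantum and does not run again until the `io` job has finished, which shows both the intended responsiveness and the starvation risk.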

None of these algorithms are perfect. In reality, schedulers are hand-tweaked over time to work well in practice based on known and experimental workloads.