Lecture 15: Page replacement

Page replacement

The advantage of virtual memory is that processes can use more memory than physically exists in the machine; when memory that is not present is accessed (a page fault), the page must be paged in (sometimes referred to as being "swapped in", although some people reserve "swapped in" for bringing in an entire address space).

Swapping in pages is very expensive (it requires using the disk), so we'd like to avoid page faults as much as possible. The algorithm that we use to choose which pages to evict to make space for the new page can have a large impact on the number of page faults that occur. We discuss a number of these algorithms in this lecture.

Random

When we need to evict a page, choose one at random.
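
A minimal sketch of this policy (the function name and frame-list representation are illustrative, not from any particular kernel):

```python
import random

def choose_victim_random(frames):
    """Pick a victim uniformly at random; `frames` is the list of resident pages."""
    return random.randrange(len(frames))   # index of the frame to evict
```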

First-in, First-out (FIFO)

When we need to evict a page, choose the first one that was paged in. This can be easily implemented by treating the frames as a circular buffer and storing a single head pointer. On eviction, replace the head, and then advance it. It will always point to the first-in page.
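For concreteness, here is a sketch of that bookkeeping (the class and method names are illustrative, not from any particular kernel); the fault handler would call `evict_and_load` when it needs a frame for the new page:

```python
class FIFOReplacer:
    """FIFO eviction using the frames as a circular buffer."""

    def __init__(self, num_frames):
        self.frames = [None] * num_frames  # frame -> resident page (None = free)
        self.head = 0                      # always points at the first-in page

    def evict_and_load(self, page):
        """Replace the page at the head with `page`, then advance the head."""
        victim = self.frames[self.head]
        self.frames[self.head] = page
        self.head = (self.head + 1) % len(self.frames)
        return victim  # the page that was evicted (None if the frame was free)
```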

FIFO is also susceptible to "Belady's anomaly": it is possible that adding more frames can actually make performance worse! For example, consider the trace

|Step   | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |12 |
|-------|---|---|---|---|---|---|---|---|---|---|---|---|
|Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |

Using FIFO with three frames, we incur 9 page faults (work this out!). With four frames, we incur 10 faults! It would be nice if buying more RAM gave us better performance.
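
You can check the anomaly mechanically; the following sketch simulates FIFO on the trace above (the names are illustrative, and the counts agree with the ones claimed here):

```python
def fifo_faults(trace, num_frames):
    """Count page faults for FIFO replacement on an access trace."""
    frames = [None] * num_frames
    head = 0
    faults = 0
    for page in trace:
        if page in frames:
            continue                     # hit: FIFO does no bookkeeping
        faults += 1                      # miss: evict the first-in page
        frames[head] = page
        head = (head + 1) % num_frames
    return faults

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(trace, 3))  # 9 faults
print(fifo_faults(trace, 4))  # 10 faults: Belady's anomaly
```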

Belady's algorithm (OPT)

When we need to evict a page, evict the page that will be unused for the longest time in the future.

Example: on the same trace as above with three frames:

|Step   | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |12 |
|-------|---|---|---|---|---|---|---|---|---|---|---|---|
|Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |

Initially, memory is empty:

frame: 0 1 2
page:

In the first three steps, we incur three page faults and load pages 1, 2, and 3:

frame: 0 1 2
page: 1 2 3

In step 4, we access page 4, incurring a page fault. Page 1 is used in step 5, page 2 is used in step 6, but page 3 is not used until step 10, so we evict page 3.

frame: 0 1 2
page: 1 2 4

Steps 5 and 6 do not incur page faults. In step 7, we need to evict a page. Page 1 is used in step 8, page 2 is used in step 9, but page 4 isn't needed until step 11, so we evict page 4.

frame: 0 1 2
page: 1 2 5

Steps 8 and 9 do not incur page faults, but step 10 does. Again, we consider the future uses of the data in memory; neither page 1 nor 2 will be used in the future, so we can evict either.

frame: 0 1 2
page: 3 2 5

Similarly in step 11, we can evict either page 3 or page 2:

frame: 0 1 2
page: 3 4 5

Finally, we execute step 12, which incurs no page fault. This gives a total of 7 page faults, which is guaranteed to be the minimum possible for this trace.
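
A small simulation of OPT confirms the count; on each fault it evicts the resident page whose next use lies farthest in the future (this is a sketch, with illustrative helper names):

```python
def opt_faults(trace, num_frames):
    """Count page faults for Belady's optimal (OPT) replacement."""
    frames = []
    faults = 0
    for i, page in enumerate(trace):
        if page in frames:
            continue
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)          # a free frame is still available
            continue
        # Evict the resident page whose next use is farthest away (or never).
        future = trace[i + 1:]
        def next_use(p):
            return future.index(p) if p in future else float("inf")
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(opt_faults(trace, 3))  # 7 faults, matching the walkthrough
```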

LRU, MRU, LFU

Although we cannot predict the future, we can estimate it based on past behavior. Most programs exhibit both spatial and temporal locality:

- Temporal locality: addresses that have been accessed recently are likely to be accessed again soon.
- Spatial locality: addresses near recently accessed addresses are likely to be accessed soon.

Exploiting these assumptions leads to the following algorithms:

- Least frequently used (LFU) assumes that pages that have been accessed rarely are unlikely to be accessed again. Keep a count of how many times each page is accessed, and evict the page with the lowest count.
- Least recently used (LRU) assumes that pages that were accessed recently are likely to be needed again soon. Keep a timestamp of the latest access, and evict the page with the lowest (oldest) timestamp (see the sketch after this list).
- Most recently used (MRU) assumes that programs do not read the same addresses multiple times. For example, a media player will read a byte and then move on, never to read it again. As with LRU, keep a timestamp of the latest access, but evict the page with the highest timestamp.
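
As a concrete picture of the bookkeeping LRU requires, here is a sketch that keeps a last-access timestamp per resident page (a software simulation of the policy; as discussed below, real systems need hardware help to maintain these timestamps):

```python
def lru_faults(trace, num_frames):
    """Count page faults for LRU, tracking a last-access timestamp per page."""
    last_used = {}   # resident page -> timestamp of its most recent access
    faults = 0
    for now, page in enumerate(trace):
        if page not in last_used:
            faults += 1
            if len(last_used) == num_frames:
                # Evict the page with the oldest (smallest) timestamp.
                victim = min(last_used, key=last_used.get)
                del last_used[victim]
        last_used[page] = now   # every access updates the timestamp
    return faults

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(trace, 3))  # 10 faults: this trace has little locality, so LRU does not shine here
```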

These algorithms exploit locality to approximate OPT, and thus can often do a good job of reducing page faults. However, implementing them is very difficult:

- A count or timestamp needs to be updated on every access; this requires hardware support and an extra register per TLB entry (expensive!).
- There is one count/timestamp per page; to find the page to evict, we have to traverse the entire frame list.

Approximating LRU: second chance and clock

Instead of finding the least recently used page, we can simply find a page that was not "recently used", for some definition of "recently used". This requires only a bit per page, and it makes finding a candidate to evict easy (since there may be many candidates to choose from).

To support these approximations, many TLBs support an additional "use bit" that is set automatically whenever a page is accessed.

second chance

The second chance algorithm (which some people call the clock algorithm) works just like FIFO, but it skips over any pages with the use bit set (and clears the use bit).

Example: Let's consider the same trace as above with the clock algorithm, for the first few steps:

|Step   | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |12 |
|-------|---|---|---|---|---|---|---|---|---|---|---|---|
|Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |

Initially, memory is empty:

frame: 0 1 2
page:
use:
next:  ^

In the first three steps, we incur three page faults and load pages 1, 2, and 3, advancing the next pointer each time. The use bits are set (since we're using the pages).

frame: 0 1 2
page:  1 2 3
use:   1 1 1
next:  ^

In step 4, we incur a page fault. We look for an unused page, clearing bits as we go:

frame: 0 1 2
page:  1 2 3
use:   0 1 1
next:    ^

frame: 0 1 2
page:  1 2 3
use:   0 0 1
next:      ^

frame: 0 1 2
page:  1 2 3
use:   0 0 0
next:  ^

Once we find a frame with a clear use bit, we evict its page and load page 4, setting the use bit and advancing the pointer:

frame: 0 1 2
page:  4 2 3
use:   1 0 0
next:    ^

Step 5 (an access to page 1, which we just evicted) is also a page fault; again we look for a frame with a clear use bit, starting from the next pointer. Frame 1's use bit is clear, so we evict page 2 and load page 1.

frame: 0 1 2
page:  4 1 3
use:   1 1 0
next:      ^

On step 6, we again have a page fault; frame 2's use bit is clear, so we evict page 3 and load page 2.

frame: 0 1 2
page:  4 1 2
use:   1 1 1
next:  ^

The rest is left as an exercise: in the end, we will have incurred 9 total page faults and end in the following state:

|Step   | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |10 |11 |12 |
|-------|---|---|---|---|---|---|---|---|---|---|---|---|
|Access | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 1 | 2 | 3 | 4 | 5 |
|Fault? | F | F | F | F | F | F | F |   |   | F | F |   |

frame: 0 1 2
page:  5 3 4
use:   1 1 1
next:  ^
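
The walkthrough can be reproduced with a short simulation. This sketch assumes the simulator sets the use bit itself on every access, where real hardware would set it in the page table or TLB entry:

```python
def second_chance_faults(trace, num_frames):
    """Count page faults for the second chance (one-hand clock) algorithm."""
    pages = [None] * num_frames   # frame -> resident page
    use = [0] * num_frames        # frame -> use bit
    hand = 0                      # the "next" pointer
    faults = 0
    for page in trace:
        if page in pages:
            use[pages.index(page)] = 1   # hit: hardware would set the use bit
            continue
        faults += 1
        # Advance the hand, clearing use bits, until we find a clear one.
        while use[hand]:
            use[hand] = 0
            hand = (hand + 1) % num_frames
        pages[hand] = page               # evict and load
        use[hand] = 1                    # the faulting access sets the bit
        hand = (hand + 1) % num_frames
    return faults

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(second_chance_faults(trace, 3))  # 9 faults, matching the walkthrough
```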

clock

Second chance clears the use bit only when a page is considered for eviction. Depending on the locality and size of memory, we can end up in a state where almost every use bit is set (so that most evictions require sweeping over a large number of frames) or almost every bit is clear (so that we degenerate to FIFO).

To solve this, we can "decouple" eviction from the clearing of the use bit. We can have two "hands" that traverse the frames:

- a clearing hand, which clears the use bit of every frame it passes, and
- an eviction hand, which evicts the page in any frame whose use bit is still clear.

The clearing hand should lead the eviction hand; the distance between them defines "recent". A page's use bit will still be set when the eviction hand reaches it only if the page has been accessed more recently than the clearing hand's last pass.
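
Here is a simplified, fault-driven sketch of the two-handed variant. The fixed `gap` between the hands is an illustrative parameter, and in a real system the hands typically advance in a background daemon rather than only on faults:

```python
def two_hand_clock_faults(trace, num_frames, gap=1):
    """Count faults for a two-handed clock with a fixed gap between the hands."""
    pages = [None] * num_frames
    use = [0] * num_frames
    evict_hand = 0
    faults = 0
    for page in trace:
        if page in pages:
            use[pages.index(page)] = 1        # hardware would set this bit
            continue
        faults += 1
        # The clearing hand leads the eviction hand by `gap` frames; both
        # advance together until the eviction hand finds a clear use bit.
        while use[evict_hand]:
            clear_hand = (evict_hand + gap) % num_frames
            use[clear_hand] = 0               # clearing hand clears as it passes
            evict_hand = (evict_hand + 1) % num_frames
        pages[evict_hand] = page              # evict and load
        use[evict_hand] = 1
        evict_hand = (evict_hand + 1) % num_frames
    return faults

trace = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(two_hand_clock_faults(trace, 3, gap=0))  # a gap of 0 behaves like plain second chance: 9 faults
```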