Concurrency: Condition Variables

Condition Module

For this module we refer to OCaml's condition module documentation.

A condition variable is another synchronization primitive that many threading libraries will provide. It allows threads to wait (stop running) until they are signaled by some other thread that some condition has been fulfilled.

Condition variables are always used in conjunction with a mutex, and have three operations:

These are useful when using simpler primitives (such as mutices alone) would cause excessive locking, unlocking, and waiting. Let's see an example where condition variables are useful.

The reader writer concurrency pattern

The reader/writer problem is a classic conurrency pattern (concurrent readers and exclusive writer, CRXW). We have some shared data that many threads want to access. Some of these threads are writers who may update this data, and others are readers who will only read the data, never change it.

The rule is that when there is any writer updating the shared data, no other thread may be accessing the data. Conversely, of course if a reader is reading the data, writers must wait until the reader is finished reading. However, many readers may operate at the same time, because they do not change any shared state.

We'll enforce the CRXW constraints with a reader-writer lock, whose interface is defined like so:

module type RWLOCK = sig
  type t
  val create: unit -> t
  val read_lock: t -> unit
  val read_unlock: t -> unit
  val write_lock: t -> unit
  val write_unlock: t -> unit
end

Whenever a reader wants to read the shared data protected by the reader-writer lock, the rule is that it will call read_lock before reading, and call read_unlock after it is done. The same rules apply for writers. As with a mutex, the two lock functions must not return until it is safe for the reader/writer to proceed.

Let's explore a few possible implementations of the reader-writer lock.

A simple attempt

module Rwlock : RWLOCK = struct
  (* A reader count, and a bool to indicate whether there's a writer *)
  type t = int ref * bool ref

  let create () = (ref 0, ref false)

  let read_lock (num_readers, _) = num_readers := !num_readers + 1
  let read_unlock (num_readers, _) = num_readers := !num_readers - 1
  let write_lock (_, writer_here) = writer_here := true
  let write_unlock (_, writer_here) = writer_here := false
end

Hopefully alarm bells are going off in your head right now, as this code has zero chance of being correct. One therad (a reader) could call read_lock while another thread (a writer) calls write_lock, and they would both succeed! We have violated the invariant; this code is obviously incorrect.

A second attempt: checking before return

Okay, the problem with the previous attempt was that both of the lock functions just returned without checking that it was safe for the caller to proceed. Let's add such a check, then!

module Rwlock : RWLOCK = struct
  (* A reader count, and a bool to indicate whether there's a writer *)
  type t = int ref * bool ref

  let create () = (ref 0, ref false)

  let read_lock (num_readers, writer_here) =
    while !writer_here do () done;
    num_readers := !num_readers + 1
  let read_unlock (num_readers, _) = num_readers := !num_readers - 1
  let write_lock (num_readers, writer_here) =
    while !writer_here || !num_readers > 0 do () done;
    writer_here := true
  let write_unlock (_, writer_here) = writer_here := false
end

And that's it, right? This code is completely correct. We're done...!

Not quite. Remember that when multiple threads are running, the processor may switch between two threads at any time. Let's again consider the case where a reader and a writer both call their respective lock functions on a newly created reader-writer lock.

If the reader's thread runs to completion first, everything is good: read_lock will increment the reader count and return, so the reader does its reading. write_lock will see that the reader count is positive and get stuck in its while loop. Presumably the reader will call reader_unlock at some point in time, at which point the writer will get to exit its while loop and do its thing. Similarly, if the writer's thread runs to completion first, there are no problems.

But terrible things could happen if the timing of the context switch between threads is chosen very unfortunately. Consider this: The reader first checks that there are no writers. There really are none, so it is done with its while loop. We switch to the writer who checks that there are neither writers nor readers. Indeed there are none, so the writer is also done with its while loop. Now the calls to read_lock and write_lock return, and we have again broken the rules.

A third attempt: using a mutex

The previous attempts should have convinced you that only one thread should be calling the lock functions at once. The clear solution is to protect these functions with a mutex. You may question why we are using a mutual exclusion lock inside our reader-writer lock. The answer is that we intend to use this mutex only to protect calls to the lock functions, not to protect access to the shared data structure itself (which is what the reader-writer lock protects). The mutex will only be acquired momentarily to make sure that updates to the reader count and writer boolean happen atomically.

module Rwlock : RWLOCK = struct
  (* A reader count, a bool to indicate whether there's a writer, *)
  (* and a mutex to protect calls to the lock operations. *)
  type t = int ref * bool ref * Mutex.t

  let create () = (ref 0, ref false, Mutex.create ())

  let read_lock (num_readers, writer_here, mutex) =
    Mutex.lock mutex;
    while !writer_here do () done;
    num_readers := !num_readers + 1;
    Mutex.unlock mutex

  let read_unlock (num_readers, _, mutex) =
    Mutex.lock mutex;
    num_readers := !num_readers - 1;
    Mutex.unlock mutex

  let write_lock (num_readers, writer_here, mutex) =
    Mutex.lock mutex;
    while !writer_here || !num_readers > 0 do () done;
    writer_here := true;
    Mutex.unlock mutex

  let write_unlock (_, writer_here, mutex) =
    Mutex.lock mutex;
    writer_here := false;
    Mutex.unlock mutex
end

Unfortunately, this code could lead to deadlock! Consider the case when there is a reader currently reading the shared data, and a writer wants to write. The writer will grab the mutex, and then get stuck in its while loop, without unlocking the mutex. Now if the reader calls reader_unlock to let the writer in, instead it gets stuck waiting for the mutex to be unlocked (and it never will be!).

Some students have asked why it was necessary to protect the unlock functions with a mutex. Admittedly in this code it may not be, as the unlock functions are very simple.

This does not change the fundamental point: busy-waiting (doing nothing in a loop repeatedly) while holding a mutex is bad.

A fourth attempt: unlocking the mutex if unsuccessful

The problem with the previous attempt was we got stuck in a while loop when we were holding the mutex, waiting for a condition that would never come true. The next step we should take, then is to unlock the mutex if we were unsuccessful.

module Rwlock : RWLOCK = struct
  (* A reader count, a bool to indicate whether there's a writer, *)
  (* and a mutex to protect calls to the lock operations. *)
  type t = int ref * bool ref * Mutex.t

  let create () =
    (ref 0, ref false, Mutex.create ())

  let read_lock (num_readers, writer_here, mutex) =
    let success = ref false in
    while not !success; do
      Mutex.lock mutex;
      if !writer_here then Mutex.unlock mutex
      else success := true
    done;

    num_readers := !num_readers + 1;
    Mutex.unlock mutex

  let read_unlock (num_readers, _, mutex) =
    Mutex.lock mutex;
    num_readers := !num_readers - 1;
    Mutex.unlock mutex

  let write_lock (num_readers, writer_here, mutex) =
    let success = ref false in
    while not !success; do
      Mutex.lock mutex;
      if !num_readers > 0 || !writer_here; then Mutex.unlock mutex
      else success := true
    done;

    writer_here := true;
    Mutex.unlock mutex

  let write_unlock (_, writer_here, mutex) =
    Mutex.lock mutex;
    writer_here := false;
    Mutex.unlock mutex

end

Examine how read_lock is implemented now. First it goes into a loop, and every loop iteration it will first lock the mutex. It will check whether it is safe for the reader to enter. If it is unsafe (there are writers), it will unlock the mutex, then stays in the while loop. Once it is successful, it exits the while loop, increments the reader counter, then unlocks the mutex. The write_lock function does the analogous operation.

(This code could have been written differently to avoid repeated statements, but this way of writing it transitions nicely to the next attempt)

This code is (as far as we can tell) correct, but there is a large problem with it: it is wasteful! As long as there is a writer updating the shared data, a reader who calls read_lock will acquire the mutex, discover it is unsafe to continue, release the mutex, then try again. This repeated locking, checking, and unlocking wastes CPU time and also prevents other threads from calling functions that require the mutex (most importantly, the writer who wants to call write_unlock).

The fifth (and final) attempt: condition variables!

We want to avoid the wasteful cycle of locking, checking, and unlocking required by the previous attempt. The ideal situation is that if a reader discovers there is currently a writer updating the data, it should go to sleep (stop running) until the writer has called write_unlock, because there is no chance the reader will be able to proceed until the writer has exited. The same should hold true of writers: if they discover any other writer or reader has the reader-writer lock, they should sleep until nobody else is reading or writing the shared data. This is where condition variables come in.

module Rwlock : RWLOCK = struct
  (* (reader_count, writer_here, mutex, reader_can_enter, writer_can_enter) *)
  type t = int ref * bool ref * Mutex.t * Condition.t * Condition.t

  let create () =
    (ref 0, ref false, Mutex.create (), Condition.create (), Condition.create ())

  let read_lock (num_readers, writer_here, mutex, reader_can_enter, _) =
    Mutex.lock mutex;
    while !writer_here; do
      Condition.wait reader_can_enter mutex;
    done;
    num_readers := !num_readers + 1;
    Mutex.unlock mutex

  let read_unlock (num_readers, _, mutex, _, writer_can_enter) =
    Mutex.lock mutex;
    num_readers := !num_readers - 1;
    if !num_readers = 0 then Condition.signal writer_can_enter;
    Mutex.unlock mutex

  let write_lock (num_readers, writer_here, mutex, _, writer_can_enter) =
    Mutex.lock mutex;
    while !num_readers > 0 || !writer_here; do
      Condition.wait writer_can_enter mutex;
    done;
    writer_here := true;
    Mutex.unlock mutex

  let write_unlock (_, writer_here, mutex, reader_can_enter, writer_can_enter) =
    Mutex.lock mutex;
    writer_here := false;
    Condition.signal writer_can_enter;
    Condition.broadcast reader_can_enter;
    Mutex.unlock mutex;

end

Compare this code with the previous attempt. Instead of locking and unlocking a mutex repeatedly, each call to read_lock or write_lock locks the mutex once initially. It checks whether it is safe to proceed; if it is not, it calls Condition.wait on the appropriate condition variable, which suspends the calling thread and unlocks the mutex. The calling thread will wake up holding the mutex when signaled.

Since threads will be waiting for a signal, the read_unlock and write_unlock functions must now be modified to give this signal. In read_unlock, the reader checks whether there are no more readers reading the data; if this is true, it signals the waiting writers that it is safe for them to enter. The same thing happens in write_unlock: the writer may have stopped other writers or reader from entering, so it should signal both. In the case of the readers, many readers may be able to enter once the writer exits, so it broadcasts (sends a signal to all the waiting readers).

Testing our code

The following code tests our implementation of reader-write locks:

let lock = Rwlock.create ()

let reader id =
  while true; do
    Printf.printf "Reader %i trying to read...\n" id;
    Rwlock.read_lock lock;
    Printf.printf "Reader %i is reading!\n" id;
    Thread.delay (Random.float 1.5);
    Rwlock.read_unlock lock;
    Printf.printf "Reader %i exits.\n" id;
    flush stdout;
    Thread.delay (Random.float 1.5)
  done

let writer id =
  while true; do
    Printf.printf "Writer %i trying to read...\n" id;
    Rwlock.write_lock lock;
    Printf.printf "Writer %i is reading!\n" id;
    Thread.delay (Random.float 1.5);
    Rwlock.write_unlock lock;
    Printf.printf "Writer %i exits.\n" id;
    flush stdout;
    Thread.delay (Random.float 1.5)
  done

let test_reader_writer () =
  for i = 0 to 3 do ignore (Thread.create reader i) done;
  for i = 0 to 2 do ignore (Thread.create writer i) done

Conclusion

In this recitation we iteratively implemented a reader-writer lock as motivation for condition variables. We hope we have shown you that condition variables are a powerful tool for dealing for situations where threads need to wait for certain conditions to be true.