Lectures 16: Advanced Async

Last time we saw how to describe concurrent computations using bind (often written >>=) and return primitives. The return function takes a value and returns a deferred computation that is determined immediately, while d >>= f waits for d to become determined, and then passes the resulting value to f, which also computes a deferred value. In general, there may be many deferred computations executing concurrently, and we will need to coordinate and synchronize their behavior.

These notes build on material and code examples originally developed in Real World OCaml Chapter 18.

An echo server



Simple Synchronization

One simple way to coordinate deferred computations is to wait for them to become determined. The function both does this for the case where there are two computations:

val both : 'a Deferred.t -> 'b Deferred.t -> ('a * 'b) Deferred.t
Recall that both can be easily implemented using bind and return:
let both (d1:'a Deferred.t) (d2:'b Deferred.t) : ('a * 'b) Deferred.t = 
  d1 >>= (fun v1 -> 
  d2 >>= (fun v2 -> 
  return (v1,v2)))
More generally, we can wait for a list of deferred values to become determined using all
val all : ('a Deferred.t) list -> ('a list) Deferred.t
One can think of all as a kind of barrier which ensures that all of the deferred values in the list have been fully computed before proceeding. The resulting values are available in the final list. Here is one possible implementation of all:
let rec all (l:('a Deferred.t) list) : ('a list) Deferred.t = 
  List.fold_right
    (fun x acc -> 
      x >>= (fun h -> 
      acc >>= (fun t -> 
      return (h::t))))
    l (return [])

There is a dual to all called any that waits for some value in a list of deferreds to become determined:

val any : ('a Deferred.t) list -> 'a Deferred.t
Note that, if several deferreds from the list become determined in close proximity, the Async scheduler makes no guarantees that the deferred computation selected will be the first that became determined.

Timeouts

One application of any is to create functions that execute computations with a fixed time budget. The function after produces a deferred computation that becomes determined at a fixed time in the future:

val after : Core.Span.t -> unit Deferred.t
The type Core.Span.t represents a span of time. The function sec computes a span in seconds:
val sec : float -> Core.Span.t
Hence, the expression
after (sec 5.3)
is a deferred computation that becomes determined after approximately 5.3 seconds. Async does not make real-time guarantees: the after will become determined after 5.3 seconds, but not necessarily immediately after.

Using after and any, we can write a higher-order function that executes a deferred computation f until it becomes determined or reaches a timeout:

let timeout (thunk:unit -> 'a Deferred.t) (n:float) 
  : ('a option) Deferred.t
  = Deferred.any 
    [ after (sec n) >>| (fun () -> None) ; 
      thunk () >>| (fun x -> Some x) ]
The expression timeout thunk n attempts to execute the thunk and wraps the result in the Some constructor for the option datatype. However, if the time budget is exceeded, then the unit value computed by after become determined and the whole computation returns None instead.

The 3110 autograder uses a similar function to handle the issue of student submissions spinning off into infinite loops.

Ivars

Most common patterns of coordination and synchronization can be implemented using simple functions like both, all, and any. But in certain other situations, more intricate behavior is needed. An ivar can be used to implement fine-grained coordination and synchronization between deferred computations. Intuitively, an ivar is like a reference that can be assigned to, or filled, exactly once. Hence, an ivar is similar to a deferred computation, which is intially undetermined and then becomes determined exactly once.

The ivars interface is captured by the following code:

module IVar : sig
  type 'a t
  val create : unit -> 'a t
  val fill : 'a t -> 'a -> unit
  val fill_if_empty : 'a t -> 'a -> unit
  val is_empty : 'a t -> bool
  val is_full : 'a t -> bool
  val read : 'a t -> 'a Deferred.t
  ...
end = 
struct 
  ... 
end
Consult the Ivar documentation for further details.

The key thing to notice about the Ivar module is the read function, which constructs a deferred value that becomes determined when the ivar is filled. We will use this functionality in the next example.

The ivar abstraction was invented by John Reppy (Cornell PhD, 1992) in a language called Concurrent ML.

Delaying Scheduler

To illustrate the use of ivars, we will implement a simple scheduler that delays jobs by a fixed amount of time. We will need the function upon,

upon : 'a Deferred.t -> ('a -> unit) -> unit
which waits for a value in the first argument to become determined, and then executes the code in the (possibly side-effecting) second argument.

The interface for the delaying scheduler is as follows:

module type DELAYER = sig
  type t
  val create : float -> t
  val schedule : t -> (unit -> 'a Deferred.t) -> 'a Deferred.t
end
The create function initializes a scheduler that delays incoming jobs by a fixed amount of time. The schedule function schedules a job provided by a thunk that produces a deferred value. The overall result of schedule is determined when the job has been run and its value has become determined.

The implementation for the delaying scheduler is as follows:

module Delayer : DELAYER = struct
  type t = { delay: Core.Span.t;
             jobs: (unit -> unit) Core.Std.Queue.t }

  let create n =
    { delay = Core.Time.Span.of_sec n; 
      jobs = Core.Std.Queue.create () }

  let schedule t thunk =
    let ivar = Ivar.create () in
    Core.Std.Queue.enqueue t.jobs (fun () ->
      upon (thunk ()) (fun x -> Ivar.fill ivar x));
    upon (after t.delay) (fun () ->
      (Core.Std.Queue.dequeue_exn t.jobs) ());
    Ivar.read ivar
end
Note that the job is enqueued immediately and dequeued after the delay has elapsed. Also note how ivar is filled and used to communicate the final result.

Choice

The any function provides a way to choose just one of several deferred computations. However, if those computations have effects, such as opening files or socket connections, printing to the console, etc., the behavior may not be what is desired. This can lead to problems in general if these side effects are not correctly handled. However, OCaml does not provide general mechanisms to cleanly revert or shutdown a computation that has been discarded.

To address this problem, Async provides the notion of a choice:

type Deferred.choice 
To construct a value of type choice, one provides a deferred computation as well a function:
val choice : 'a Deferred.t -> ('a -> 'b) -> 'b Deferred.choice
Intuitively a choice can be thought of as representing a computation where the function is executed if the deferred computation becomes determined and the choice is chosen. By placing all side-effecting pieces of the computation in the function, programmers can obtain finer-grained control if and when computations are discarded.

The function choose selects a list of choices:

val choose : ('a Deferred.choice) list -> 'a Deferred.t
Just like any, the implementation provides no guarantees about which element will be chosen, or when it will be determined. However, importantly it does guarantee that only one of the functions for the choice elements is executed. This can be used to write clean shutdown code in programs that combine deferred computation and side effects. See Real World OCaml Chapter 18 for a fully worked out example using choice to build a web search client.