So far in this class we have only considered sequential programs. Execution of a sequential program proceeds one step at a time, with no choice about which step to take next. Sequential programs are limited in that they are not very good at dealing with multiple sources of simultaneous input and they can only execute on a single processor. For these reasons, many modern applications are concurrent. There are many different approaches to concurrent programming, but they all share the fact that a program is split into multiple independent threads of execution. Each thread runs a sequential program, but the overall collection of threads no longer produces a single overall predictable sequence of execution steps. Instead, execution proceeds concurrently, resulting in potentially unpredictable order of execution for steps in one thread with respect to steps in other threads.
The granularity of concurrent programming varies widely, from coarse-grained techniques that loosely coordinate the execution of separate programs, such as pipes in Unix (or even the HTTP protocol between Web servers and clients), to fine-grained techniques where the concurrent threads share the same memory, such as lightweight threads.
In this lecture we will introduce concurrent programming through the simple mechanisms provided in Jane Street's async library.
Concurrency is powerful and it enables new kinds of applications,
but it also makes writing correct programs more difficult, because
execution of a concurrent program is
nondeterministic: the order in which operations occur is not
always known ahead of time. As a result, the programmer must think
about all possible orders in which the different threads might
execute, and make sure that in all of them the program works
correctly. If the program is purely functional, nondeterminism is
easier to manage because evaluation of an expression always returns the same
value no matter what. For example, in the expression
(2*4)+(3*5), the operations can be executed
concurrently (e.g., with the left and right products evaluated
simultaneously) without changing the answer. Imperative programming is
more problematic. For example, the expression
(x := !x+1), if executed by two different threads,
could give different results depending on how the threads' reads and
writes of x are interleaved.
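To make the contrast concrete, here is a minimal sketch that simulates two threads each running x := !x+1. It uses no real threads: each increment is split by hand into a read step and a write step, and the two schedules (the names One_after_other and Interleaved are made up for this illustration) simply fix the order of those steps.

```ocaml
(* Deterministic simulation of two threads, A and B, that each run
   x := !x + 1. No real threads are used; we only fix the step order. *)
type schedule =
  | One_after_other   (* A finishes before B starts *)
  | Interleaved       (* both threads read before either one writes *)

let run (s : schedule) : int =
  let x = ref 0 in
  (match s with
   | One_after_other ->
     let a = !x in      (* A reads 0 *)
     x := a + 1;        (* A writes 1 *)
     let b = !x in      (* B reads 1 *)
     x := b + 1         (* B writes 2 *)
   | Interleaved ->
     let a = !x in      (* A reads 0 *)
     let b = !x in      (* B reads 0 *)
     x := a + 1;        (* A writes 1 *)
     x := b + 1);       (* B writes 1, so the update by A is lost *)
  !x

(* The pure expression gives the same answer in either evaluation order. *)
let left_first = let l = 2 * 4 in let r = 3 * 5 in l + r
let right_first = let r = 3 * 5 in let l = 2 * 4 in l + r

let () =
  Printf.printf "pure: %d %d\n" left_first right_first;
  Printf.printf "imperative: %d %d\n" (run One_after_other) (run Interleaved)
```

The pure expression yields 23 both ways, while the imperative one ends with x equal to 2 under one schedule and 1 under the other.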
The async library attempts to combine the best features of lightweight threads and event loops. The simplest way to use async is through utop. To start, invoke utop and load async:
    % utop
    utop # #require "async";;
    utop # open Async.Std;;

The library is organized around a collection of primitives built on the notion of a deferred computation. Documentation for async is available online.
A partial signature for the Async.Std module is as follows:
    module Std : sig
      module Deferred : sig
        type 'a t
        val return : 'a -> 'a Deferred.t
        val bind : 'a Deferred.t -> ('a -> 'b Deferred.t) -> 'b Deferred.t
        val peek : 'a Deferred.t -> 'a option
        val map : 'a Deferred.t -> ('a -> 'b) -> 'b Deferred.t
        val both : 'a Deferred.t -> 'b Deferred.t -> ('a * 'b) Deferred.t
        val don't_wait_for : unit Deferred.t -> unit
        module List : sig
          val map : 'a list -> ('a -> 'b Deferred.t) -> 'b list Deferred.t
          val iter : 'a list -> ('a -> unit Deferred.t) -> unit Deferred.t
          val fold : 'a list -> 'b -> ('b -> 'a -> 'b Deferred.t) -> 'b Deferred.t
          val filter : 'a list -> ('a -> bool Deferred.t) -> 'a list Deferred.t
          val find : 'a list -> ('a -> bool Deferred.t) -> 'a option Deferred.t
          ...
        end
        ...
      end
      ...
    end

A value of type 'a Deferred.t represents a deferred computation. The value encapsulated within a deferred computation is typically not available initially. Such a deferred computation is called indeterminate. Once the value becomes determined, it can be accessed and used by the rest of the computation.
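The two states of a deferred can be pictured with a toy model. This is only a sketch: the names toy_deferred, create, and determine are hypothetical, and this is not how the async library is implemented. It shows just the one-way transition from indeterminate to determined and a non-blocking peek.

```ocaml
(* Toy model only: hypothetical names, not the async implementation. *)
type 'a state =
  | Indeterminate          (* no value yet *)
  | Determined of 'a       (* the value is available *)

type 'a toy_deferred = { mutable state : 'a state }

let create () : 'a toy_deferred = { state = Indeterminate }

(* Move from Indeterminate to Determined; this may happen only once. *)
let determine (d : 'a toy_deferred) (v : 'a) : unit =
  match d.state with
  | Indeterminate -> d.state <- Determined v
  | Determined _ -> invalid_arg "determine: already determined"

(* Non-blocking inspection, in the spirit of peek. *)
let peek (d : 'a toy_deferred) : 'a option =
  match d.state with
  | Indeterminate -> None
  | Determined v -> Some v

let () =
  let d : int toy_deferred = create () in
  assert (peek d = None);        (* still indeterminate *)
  determine d 42;
  assert (peek d = Some 42)      (* now determined *)
```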
As an example to warm up, consider the following program, which
defines an internal function
f that prints out an integer
and then returns a deferred unit.
    open Async.Std

    let main () =
      (* f prints its argument and returns an already-determined deferred unit *)
      let f i =
        printf "Value is %d\n" i;
        return ()
      in
      Deferred.both
        (Deferred.List.iter [1;2;3;4;5] f)
        (Deferred.List.iter [1;2;3;4;5] f)

    let () =
      (* arrange to exit once main is determined, then start the scheduler *)
      don't_wait_for (main () >>= fun _ -> exit 0);
      ignore (Scheduler.go ())

The function Deferred.List.iter applies a function that produces a deferred unit to each element of a list and combines the resulting deferred units into a single deferred unit. The both function combines a pair of deferred values into a single deferred pair.
If executed sequentially, this program would simply print the integers from 1 to 5 twice. However, if executed concurrently, as in async, the calls to printf can be interleaved. For example:
    Value is 1
    Value is 1
    Value is 2
    Value is 2
    Value is 3
    Value is 3
    Value is 4
    Value is 4
    Value is 5
    Value is 5

The reason for this behavior is that the deferred values are executed concurrently, as determined by the scheduler. Hence, the values printed to the console may appear in a different order than would be specified using the normal sequential control flow of the program.
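One way to see how such an interleaving can arise is with a toy round-robin scheduler. This is a sketch under simplifying assumptions, not a description of async's actual scheduler: a task is modeled as a list of steps, and the hypothetical round_robin function runs one step of each task in turn.

```ocaml
(* Toy model only: a task is a list of steps, and round_robin is a
   hypothetical scheduler that runs one step from each task in turn. *)
type task = (unit -> unit) list

let rec round_robin (tasks : task list) : unit =
  match tasks with
  | [] -> ()
  | [] :: rest -> round_robin rest         (* finished task: drop it *)
  | (step :: more) :: rest ->
    step ();                               (* run one step of this task *)
    round_robin (rest @ [more])            (* then send it to the back *)

(* Log lines here instead of printing, so the order can be inspected. *)
let output : string list ref = ref []

let printer (i : int) : unit -> unit =
  fun () -> output := Printf.sprintf "Value is %d" i :: !output

let () =
  let task () = List.map printer [1; 2; 3; 4; 5] in
  round_robin [task (); task ()];
  List.iter print_endline (List.rev !output)
```

Under this policy the two tasks alternate, which reproduces the listing above; a different scheduling policy would produce a different order.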
The simplest way to create a deferred computation is to use the return function:
    let d = return 42;;
    val d : int Deferred.t = <abstr>

It produces a deferred value that is determined immediately, as can be verified using the peek function:
    Deferred.peek d;;
    - : int option = Some 42
A more interesting way to create a deferred computation is to combine two smaller deferred computations sequentially. The bind operator, written infix as >>=, takes the result of one deferred computation and feeds it to a function that produces another deferred computation:
    let d = return 42 >>= fun n -> return (n,3110);;
    val d : (int * int) Deferred.t = <abstr>

Execution of an expression that uses bind proceeds as follows: when the first computation becomes determined, the value is supplied to the function, which schedules another deferred computation. The overall computation is determined when this second deferred is determined. The idiom used in the above code snippet can be used as the implementation of the both function described previously:
    let both (d1:'a Deferred.t) (d2:'b Deferred.t) : ('a * 'b) Deferred.t =
      d1 >>= fun v1 ->
      d2 >>= fun v2 ->
      return (v1,v2)

This function waits until d1 is determined and passes the resulting value to the first function as v1. That function waits until d2 is determined and passes the resulting value to the second function as v2, which returns the pair (v1,v2) in a new deferred computation.
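The behavior of bind and both described above can be traced in a small callback-based model. Everything here is a hypothetical sketch written against the standard library, not the async implementation; in particular, this model runs callbacks immediately when a value is filled in, whereas a real scheduler decides when they run.

```ocaml
(* Toy model only: hypothetical definitions, not the async implementation. *)
type 'a deferred = {
  mutable value : 'a option;               (* None while indeterminate *)
  mutable waiters : ('a -> unit) list;     (* callbacks to run once determined *)
}

let create () : 'a deferred = { value = None; waiters = [] }
let return (v : 'a) : 'a deferred = { value = Some v; waiters = [] }
let peek (d : 'a deferred) : 'a option = d.value

(* Run k on the value of d: now if determined, otherwise later. *)
let upon (d : 'a deferred) (k : 'a -> unit) : unit =
  match d.value with
  | Some v -> k v
  | None -> d.waiters <- k :: d.waiters

(* Determine d and wake up everything that was waiting on it. *)
let fill (d : 'a deferred) (v : 'a) : unit =
  match d.value with
  | Some _ -> invalid_arg "fill: already determined"
  | None ->
    d.value <- Some v;
    let ws = List.rev d.waiters in
    d.waiters <- [];
    List.iter (fun k -> k v) ws

(* The result is determined once d, and then f applied to its value, are. *)
let bind (d : 'a deferred) (f : 'a -> 'b deferred) : 'b deferred =
  let result = create () in
  upon d (fun v -> upon (f v) (fill result));
  result

let ( >>= ) = bind

let both d1 d2 = d1 >>= fun v1 -> d2 >>= fun v2 -> return (v1, v2)

let () =
  let d1 : int deferred = create () in
  let d2 : string deferred = create () in
  let pair = both d1 d2 in
  fill d2 "cs";
  assert (peek pair = None);               (* still waiting on d1 *)
  fill d1 3110;
  assert (peek pair = Some (3110, "cs"))   (* both are now determined *)
```

Note that the pair stays indeterminate until both inputs are determined, regardless of which one is filled first.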
A more interesting example of composing deferreds arises with programs that read from and write to the file system. I/O is a particularly good match for concurrent programming using deferreds, because I/O operations can often block, depending on the behavior of the operating system and underlying devices. For example, a read may block waiting for the disk to become available, or for the disk controller to move the read head to the appropriate place on the physical disk itself. The async library includes variants of the standard functions for opening and manipulating files. For example, here is a small snippet of the Reader module:
    module Read_result : sig
      type 'a t = [ `Eof | `Ok of 'a ]
      ...
    end

    module Reader : sig
      type t
      val open_file : string -> t Deferred.t
      val read_line : t -> string Read_result.t Import.Deferred.t
      ...
    end

The type used to define 'a Read_result.t is known as a polymorphic variant, and uses some new notation we have not seen before. For the purposes of this course, it can be treated as an ordinary datatype whose constructors happen to be prefixed with the backtick symbol, "`".
Using these functions, we can write a function that reads in the contents of a file:
    let file_contents (fn:string) : string Deferred.t =
      (* read lines until end of file, appending each one to acc *)
      let rec loop (r:Reader.t) (acc:string) : string Deferred.t =
        Reader.read_line r >>= fun res ->
        match res with
        | `Eof -> return acc
        | `Ok s -> loop r (acc ^ s)
      in
      Reader.open_file fn >>= fun r ->
      loop r ""

Note that each I/O operation is encapsulated in a deferred computation, so the async scheduler is free to interleave them with other computations that might be executing concurrently (e.g., another deferred computation also performing I/O).
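For comparison, here is a sketch of the same loop written with the blocking functions from the standard library. The name file_contents_blocking is made up for this illustration. Each call to input_line blocks the whole program until a line is available, which is exactly what the deferred version avoids. Since input_line returns each line without its terminating newline, the result contains no newlines; if Reader.read_line also drops them, the deferred version behaves the same way.

```ocaml
(* Blocking counterpart of the loop, using only the standard library. *)
let file_contents_blocking (fn : string) : string =
  let ic = open_in fn in
  let rec loop (acc : string) : string =
    match input_line ic with
    | s -> loop (acc ^ s)                  (* a line was read: keep going *)
    | exception End_of_file -> acc         (* end of file: return the result *)
  in
  let contents = loop "" in
  close_in ic;
  contents

let () =
  (* Write a small temporary file, then read it back. *)
  let fn = Filename.temp_file "demo" ".txt" in
  let oc = open_out fn in
  output_string oc "hello\nworld\n";
  close_out oc;
  print_endline (file_contents_blocking fn);
  Sys.remove fn
```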
Going a step further, we can write a function that computes the number of characters in a file:
    let file_length (fn:string) : int Deferred.t =
      file_contents fn >>= fun s ->
      return (String.length s)

This pattern of sequencing a deferred computation with a computation that consumes the value and immediately returns a value is so common that the async library includes a primitive for implementing it directly:
    val map : 'a Deferred.t -> ('a -> 'b) -> 'b Deferred.t

The map function can be written infix as >>|. Hence, the above function could be written more succinctly as:
    let file_length (fn:string) : int Deferred.t =
      file_contents fn >>| String.length

Note that String.length is passed directly as a function, with no need for an explicit fun.
In the next lecture, we will see further examples of creating and programming with deferred computations using async.