SML code examples for this lecture
So far in this class we've been talking about sequential programs. Execution of a sequential program proceeds one step at a time according to the evaluation rules, with no choice about which step to take next. We saw this in the various SML semantics that we explored earlier. Sequential programs are somewhat limited because they are not very good at dealing with multiple sources of simultaneous input. For this reason, many modern applications are concurrent (or multi-threaded, parallel): there are multiple threads of execution concurrently executing in parallel.
For example, a web browser must be simultaneously handling input from the user interface, reading and rendering web pages incrementally as new data comes in, and running embedded programs written in Java, Javascript and other languages. All these activities must happen at the same time, so separate threads are used to handle each of them. Another example of a naturally concurrent application is a web crawler, which traverses the web collecting information about its structure and content. It doesn't make sense for the web crawler to access sites sequentially, because most of the time would be spent waiting for the remote server and network to respond to each request. Therefore, a typical web crawler is highly concurrent, simultaneously accessing thousands of different web sites. This design uses the processor and network efficiently.
Concurrency is a powerful language feature that enables new kinds of applications, but it also makes writing correct programs more difficult, because execution of a concurrent program is nondeterministic: the order in which things happen is not known ahead of time. The programmer must think about all possible orders in which the different threads might execute, and make sure that the program works correctly in all of them. If the program is purely functional, nondeterminism is not a problem because evaluation of an expression always returns the same value no matter what. For example, the expression (2*4)+(3*5) could be executed concurrently, with the left and right products evaluated at the same time; the answer would not change. Imperative programming is much more problematic. For example, the expressions (!x) and (a := !a+1), if executed by two different threads, could give different results depending on which thread executed first, if it happened that x and a were the same ref.
A few modern languages directly support concurrent programming. Java is one. Languages like C and C++ don't directly support concurrency, though most operating systems allow concurrent programs to be written in these languages, somewhat awkwardly. It turns out that the SML distribution includes Concurrent ML (CML), an extension to SML that supports a relatively clean model of concurrent programming. Concurrent ML is found in the sml/src/cml directory of the distribution. To execute a program in CML, you use the function RunCML.doit:
structure RunCML = struct
  (* doit(f, t) evaluates the expression f() with thread quantum t.
   * It returns the return status of the program. *)
  val doit: (unit -> unit) * (Time.time option) -> Word32.word
  ...
end
The thread quantum is the amount of time that a processor will work on executing any one thread before switching to another thread. Although we think of the machine as running all the threads at once, it is much more efficient for a processor to execute one at a time; for one thing, the various caches work better. As long as the quantum is sufficiently small (usually, a few milliseconds), it isn't noticeable. The machine may have multiple processors that can each work on running a separate thread, but the semantics of running a program don't change depending on the number of processors. A concurrent system has a scheduler that decides what thread to run on a given processor. When the current thread's quantum expires, the scheduler is invoked.
CML provides a special operation that creates a new thread:
structure CML = struct
  (* spawn(f) creates a new thread that evaluates the expression f()
   * concurrently with the current thread. It returns the thread
   * identifier of the new thread. *)
  val spawn : (unit -> unit) -> thread_id
  ...
end
For example, we can write a program that spawns two threads that generate output:
- fun prog() = (CML.spawn (fn() => print "hello!"); print "goodbye!");
- val q = Time.fromMilliseconds(1);
- RunCML.doit(prog, SOME q);
There are two possible executions of this code: it might print "hello!goodbye!" or "goodbye!hello!", depending on whether the spawned thread gets to run first or its parent thread does. If we care which one we get, this code won't do.
You've probably noticed that the computation of a thread is given type unit->unit, which doesn't give a lot of opportunity for a thread to send a result back to its parent thread. For example, if the web browser spawns a thread to read an image embedded in a web page, it needs to get the actual image data back from that thread. One obvious way to accomplish this is using refs. Here is a circuitous way to add two numbers:
fun prog() =
  let
    val result = ref 0
  in
    CML.spawn (fn() => result := 2+2);
    print(Int.toString(!result))
  end
If we're lucky this will work, but what if the parent tries to access the contents of result before it is updated? In that case we'll read the original 0. Assuming that we know the result isn't zero, we could try to wait until it gets updated:
fun prog() =
  let
    val result = ref 0
    fun wait() = if !result = 0 then wait() else ()
  in
    CML.spawn (fn() => result := 2+2);
    wait();
    print(Int.toString(!result))
  end
This is an example of a primitive synchronization technique known as spinning. Two threads synchronize when they each figure out what the other thread is doing. In this case we don't want the printing thread to print until the computing thread is done. On a single-processor system, this is probably an unsatisfactory synchronization technique because the parent thread might waste processor time waiting for the result to arrive. It can make sense in a multiprocessor system if the expected spinning duration is small. (CML provides a function yield() that allows a thread to give up its quantum, which can be helpful.)
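For instance, the spinning loop above could surrender its quantum on each iteration, so the computing thread gets a chance to run sooner. This is only a sketch; it assumes CML.yield takes unit and returns unit, as described above:

```sml
fun prog() =
  let
    val result = ref 0
    (* spin, but give up the rest of our quantum each time
     * around the loop so the computing thread can make progress *)
    fun wait() =
      if !result = 0
      then (CML.yield(); wait())
      else ()
  in
    CML.spawn (fn() => result := 2+2);
    wait();
    print(Int.toString(!result))
  end
```

On a single processor this avoids burning whole quanta on a loop that cannot make progress until the other thread runs.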
For real programs we need more powerful synchronization techniques. Consider what happens if we write a simple web server that allows money transfers between two accounts (represented as refs). A web server typically spawns threads to handle each incoming request. We could easily end up with code with an effect like the following:
fun prog() =
  let
    val acct_from = ref 1000
    val acct_to = ref 1000
    fun transfer(n: int) =
      (acct_from := !acct_from - n;
       acct_to := !acct_to + n)
  in
    CML.spawn(fn() => transfer(100));  (* thread 1 *)
    CML.spawn(fn() => transfer(100));  (* thread 2 *)
    print(Int.toString(!acct_from) ^ " " ^ Int.toString(!acct_to))
  end
Clearly, we would expect this to print out "800 1200". But it might not, because the threads can be scheduled in other ways. Each thread does a read and a write of each of acct_to and acct_from. Consider some possible orders of execution on a single-processor machine:
thread 1                    thread 2
read acct_from (1000)
write acct_from (900)
read acct_to (1000)
write acct_to (1100)
                            read acct_from (900)
                            write acct_from (800)
                            read acct_to (1100)
                            write acct_to (1200)
Result: 800 1200
thread 1                    thread 2
read acct_from (1000)
                            read acct_from (1000)
write acct_from (900)
                            write acct_from (900)
read acct_to (1000)
write acct_to (1100)
                            read acct_to (1100)
                            write acct_to (1200)
Result: 900 1200
With the second, entirely possible schedule of execution, $100 is manufactured from thin air. Worse yet, we could test this code quite a bit and have it return the right result every time, yet when deployed as a product it would occasionally create or consume money. The problem is that we really cannot allow two threads to execute the transfer code at the same time; it is an example of a critical section, code that only one thread should be able to run at a time.
This kind of problem is the reason for the synchronized statement and attribute in Java. In Java we could wrap synchronized around the whole transfer function, preventing the interleaved executions shown above. Another language feature that can be used to prevent interleaved access is locks. One thread acquires a lock, does the transfer, and releases the lock. If a thread tries to acquire a lock that is currently held by another thread, it blocks until the first thread releases the lock. This kind of simple lock is known as a mutex, for "mutual exclusion". Locks are difficult to program with if there is more than one lock, because of the possibility of deadlock, when two or more threads each try to acquire a lock that another one holds, e.g.
thread 1        thread 2
acquire(L1)
                acquire(L2)
...             ...
acquire(L2)
                acquire(L1)
In this example both threads will block and the program will stop. Debugging programs to eliminate deadlocks can be very difficult.
These mutual exclusion features (such as synchronized and mutexes) can be implemented using just refs, but it turns out to be amazingly difficult to get right; for this reason they are usually provided as primitives.
What we have just been describing is known as a shared-memory approach to thread communication, because the state of refs is shared among the various threads. Shared-memory communication does not work in all concurrent programming models; for example, the standard programming model of Unix (Linux, etc.) is based on processes rather than threads. The major difference is that processes do not share any state; a spawned process gets a copy of the state of its parent process.
CML discourages communication through refs; instead, it takes the other major approach to managing thread communication and synchronization, called message-passing. Message passing has the benefit of being easier to reason about, and also easier to implement in a distributed system. In CML, threads communicate and synchronize using channels, mailboxes, and events. (These are terms specific to CML.) Channels and mailboxes provide the ability to deliver values from one thread to another. Events give a thread the ability to synchronize on activity by multiple other threads.
structure CML = struct
  ...
  type 'a chan
  val channel: unit -> 'a chan
  val send: 'a chan * 'a -> unit
  val recv: 'a chan -> 'a
  ...
end
A value of type T chan is a channel that transmits values of type T. A new channel is created using channel. The channel allows two threads to synchronize: a sending thread and a receiving thread. When a thread evaluates send(c,x) for some channel c and message value x, it blocks waiting for some thread to receive the value by calling recv(c). Once one thread is waiting on send and another on recv, the value x is transferred and becomes the result of the recv. The two threads then both resume execution. Similarly, if a thread performs a recv(c) but there is no other thread doing a send already, the receiving thread blocks waiting for a sender. This is known as synchronous message-passing because the sender and receiver synchronize at the moment that the message is delivered.
Here is a simple example of using channels:
open CML
fun prog() =
  let
    val c1: int chan = channel()
  in
    spawn (fn() => send(c1,2));
    spawn (fn() => print(Int.toString(recv(c1))));
    ()
  end
structure Mailbox = struct
  type 'a mbox
  val mailbox : unit -> 'a mbox
  val send : ('a mbox * 'a) -> unit
  val recv : 'a mbox -> 'a
  ...
end
Mailboxes provide asynchronous messages: the sender does not wait for the receiver before going on. Otherwise they act like channels. A mailbox provides a FIFO message queue: messages are delivered in the order they were sent. This is important because a mailbox can contain a large number of messages. Mailboxes can be implemented using channels and threads; it's a good exercise to think about how to do this.
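As a small illustration of the asynchrony, neither send below blocks, even though no receiver exists when they execute. This is a sketch using only the Mailbox signature above:

```sml
open CML
fun prog() =
  let
    val mb: int Mailbox.mbox = Mailbox.mailbox()
  in
    (* both sends complete immediately; the messages are queued *)
    Mailbox.send(mb, 1);
    Mailbox.send(mb, 2);
    (* the receiver drains the queue in FIFO order: 1, then 2 *)
    spawn (fn() =>
      (print(Int.toString(Mailbox.recv(mb)));
       print(Int.toString(Mailbox.recv(mb)))));
    ()
  end
```

With a channel in place of the mailbox, the first send would block until the receiving thread ran; here the parent continues immediately.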
Concurrent applications need the ability to select from several different possible input sources. CML provides this ability through the event abstraction:
structure CML = struct
  ...
  val recvEvt: 'a chan -> 'a event
  val select: 'a event list -> 'a
  ...
end

structure Mailbox = struct
  val recvEvt: 'a mbox -> 'a event
  ...
end
Given a channel or a mailbox, we can generate a corresponding event to synchronize on. Given a list of events, the select function blocks until one of the events arrives, then reads from the corresponding channel or mailbox. Without select the program can only test for incoming data on one channel at a time, blocking if there is no data. In Unix there is a system call select that provides similar functionality.
Using events we can write an extended version of the banking example from earlier. Since we want only one thread to be able to do the update at a time, we invent a thread whose job that is. This thread also processes requests to read the balance, because otherwise a read might be interleaved with an update, resulting in inconsistent account balances. Other threads communicate with it via channels:
open CML
fun prog() =
  let
    val c1: int chan = channel()
    val e1 = recvEvt(c1)
    val c2: int chan = channel()
    val e2 = recvEvt(c2)
  in
    spawn(fn() => send(c1,100));
    spawn(fn() => send(c2,100));
    spawn(fn() =>
      let
        val acct_from = ref 1000
        val acct_to = ref 1000
        fun server() = (
          let val amount = select([e1,e2])
          in
            acct_from := !acct_from - amount;
            acct_to := !acct_to + amount
          end;
          server())
      in
        server()
      end);
    print "main thread done"
  end
(What if we wanted the server to send back results? What kind of channel could we use then?)
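One possible answer, sketched below: each client creates its own reply channel and sends it along with the request, and the server answers on that channel. The request type and the names reqChan, reply, client, and server are our inventions for this sketch:

```sml
open CML
fun prog() =
  let
    (* each request carries the amount and a channel for the answer *)
    val reqChan: (int * int chan) chan = channel()
    fun client(amount) =
      let val reply: int chan = channel()
      in
        send(reqChan, (amount, reply));
        print("balance now " ^ Int.toString(recv(reply)) ^ "\n")
      end
    fun server(balance) =
      let val (amount, reply) = recv(reqChan)
          val balance' = balance - amount
      in
        send(reply, balance');  (* answer this client only *)
        server(balance')
      end
  in
    spawn(fn() => server(1000));
    spawn(fn() => client(100));
    spawn(fn() => client(100));
    ()
  end
```

Because the reply channel is private to one client, the server's answer cannot be intercepted by the other client.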
A thread may also want to select from a number of different channels to send output on. In this case it might want to choose a channel on which there is already a receiver waiting. Send events provide this functionality. A send event is created by using the sendEvt function:
val sendEvt: 'a chan * 'a -> unit event
Selection on a send event created with sendEvt(c,v) enables it to send the value v when there is a receiver waiting on the channel c. The select call then returns a unit value to indicate that the send has occurred.
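For example, a thread holding one value could offer it on two channels at once; select performs whichever send finds a waiting receiver first. This is a sketch: which receiver gets the value is nondeterministic, and the receiver on the other channel simply remains blocked.

```sml
open CML
fun prog() =
  let
    val c1: int chan = channel()
    val c2: int chan = channel()
  in
    spawn(fn() => print("c1 got " ^ Int.toString(recv(c1))));
    spawn(fn() => print("c2 got " ^ Int.toString(recv(c2))));
    (* offer 42 on both channels; only one send actually happens *)
    spawn(fn() => select([sendEvt(c1, 42), sendEvt(c2, 42)]));
    ()
  end
```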
In general a CML thread may want to wait on various different events, with different associated types. The events cannot be put onto a common event list because the types are not equal. Events can be wrapped to give them a different type:
val wrap: 'a event * ('a -> 'b) -> 'b event
This allows simultaneous selection on receive and send events, for example. It also helps keep track of which of several channels delivered an event. In the server example above, we might want to know which client thread sent a value, which can be accomplished by tagging the request:
let val (client: int, amount: int) =
      select([wrap(e1, fn(a) => (1,a)),
              wrap(e2, fn(a) => (2,a))])
When a value arrives on the channel, the function wrapped around the event is automatically applied to that value.
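These combinators are also enough for the mailbox exercise mentioned earlier: a buffer thread keeps the pending messages in a FIFO list and uses select either to accept a new message or to deliver the oldest one, whichever partner arrives first. This is only a sketch; the names mailbox, inChan, outChan, and buffer are ours:

```sml
open CML
fun mailbox() =
  let
    val inChan = channel()    (* senders use send(inChan, m) *)
    val outChan = channel()   (* receivers use recv(outChan) *)
    (* the buffer thread's state is the FIFO list of queued messages *)
    fun buffer([]) = buffer([recv(inChan)])  (* empty: wait for a sender *)
      | buffer(queue as m::rest) =
          buffer(select([
            (* a new message arrives: append it to the queue *)
            wrap(recvEvt(inChan), fn(m') => queue @ [m']),
            (* a receiver is waiting: deliver the oldest message *)
            wrap(sendEvt(outChan, m), fn() => rest)
          ]))
  in
    spawn(fn() => buffer([]));
    (inChan, outChan)
  end
```

Sends never block (the buffer thread is always willing to accept), while receives block only when the queue is empty, which is exactly the mailbox behavior described above.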
Concurrent ML home page
John H. Reppy, Concurrent Programming in ML, Cambridge University Press, 1999.