# Futures

This lab focuses on Jane Street's Async library. We'll experiment with `return`, `bind`, `upon`, and some other Async functions.

As of the writing of this lab, the most recent release of Async is version 0.9. The HTML documentation is available from https://ocaml.janestreet.com/ocaml-core/v0.9/doc/. Unfortunately, the heavy use of `include` makes functions and types rather difficult to find there. Rather than the HTML documentation, your instructor finds it more useful to read the `.mli` and `.ml` files directly:

- https://github.com/janestreet/async_kernel/tree/v0.9/src
- https://github.com/janestreet/async_unix/tree/v0.9/src
- https://github.com/janestreet/async_extra/tree/v0.9/src

Also, [chapter 18][rwo18] of *Real World OCaml* has a tutorial on Async.

[rwo18]: https://realworldocaml.org/v1/en/html/concurrent-programming-with-async.html

## Async Warmup

Read the following code:

```
open Async

let output s = printf "%s\n%!" s

let d1 = return 42
let d2 = return 42
let d3 = return 42

let _ = output "A"
let _ = upon d2 (fun _ -> upon d3 (fun _ -> output "B"))
let _ = upon d1 (fun _ -> output "C")
let _ = output "D"
```

Recall that

```
upon : 'a Deferred.t -> ('a -> unit) -> unit
```

registers a callback with the Async scheduler. The callback gets to use the contents of the deferred after it becomes determined, but the callback doesn't return anything interesting: just `()`, which is consumed by the scheduler.

##### Exercise: predict order [✭]

Before trying to run that code, first think about an answer to this question: in what order will the four strings be output? Write down a guess before proceeding.

□

##### Exercise: order1 [✭]

Now put the above code in a file named `warmup.ml`. If you're using Atom, add the following to your `.merlin` file:

```
PKG async
```

Compile and run it with

```
$ corebuild -pkg async warmup.byte
$ ./warmup.byte
```

Note that we are using `corebuild` for Async, not `ocamlbuild`.
The `corebuild` tool is Jane Street's customized version of `ocamlbuild`.

What output do you get? Does it match what you guessed?

□

You don't get any output for two reasons: the scheduler is not running, and `printf` in this code is actually Async's own version of `printf`, not the standard library's `Printf.printf`. Async's version of `printf` is non-blocking and relies on the scheduler to be running. The extra `"%!"` at the end of the `printf` string causes the output to be printed immediately rather than being buffered.

##### Exercise: order2 [✭]

Add this as the last line of the file:

```
let _ = Scheduler.go()
```

Recompile and run. You should now see all four strings output, but they might be in a different order than you guessed. Afterwards, the scheduler keeps running even though it has nothing more to do. To terminate the program, you'll have to enter Control-C. We'll come back to that later in the lab.

□

There's really **no guarantee** about what order the four strings will be output. The scheduler is free to reorder callbacks as it wishes. But the most likely output you'll receive is:

```
A
D
C
B
```

If so, it's likely because the scheduler never encountered a deferred that was undetermined, hence it processed all callbacks/outputs in exactly the order they were registered:

* The output of `A` was registered first and the scheduler chose to run it first.
* The callback for `d2` was registered next.
* The callback for `d1` was registered next.
* The output of `D` was registered next, and that occurs.
* The scheduler then examines the registered callback for `d2`, to check whether `d2` is determined. It is, so the scheduler runs the callback `fun _ -> upon d3 (fun _ -> output "B")`. That immediately registers another callback for `d3`.
* The scheduler then examines the callback for `d1`, to check whether `d1` is determined. It is, so the scheduler runs the callback `fun _ -> output "C"`. That output occurs.
* The scheduler then examines the callback for `d3`, to check whether `d3` is determined. It is, so the scheduler runs the callback `fun _ -> output "B"`. That output occurs.

We can mess with that order, though, by inserting some delays in when deferreds become determined.

##### Exercise: delay [✭✭]

Here is code to create a deferred that becomes determined after about 5 seconds:

```
after (Core.sec 5.)
```

Individually change each of `d1`, `d2`, and `d3` in the warmup program to that deferred, recompile, and run, to see what effect it has. Explain in your own words what you observe (output order and timing) and why.

□

## Bind

Recall that

```
Deferred.bind : 'a Deferred.t -> ('a -> 'b Deferred.t) -> 'b Deferred.t
```

is used to schedule a deferred computation to take place after another deferred computation finishes. The function `bind` takes two inputs:

* a deferred `d : 'a Deferred.t`, and
* a callback `c : ('a -> 'b Deferred.t)`.

The expression `bind d c` immediately returns a new deferred `d'`. Sometime after `d` is determined (if ever), the scheduler runs `c` on the contents of `d`. The callback `c` itself produces a new deferred; if that deferred ever becomes determined, `d'` becomes determined with the same value.

##### Exercise: bind [✭]

Enter the following code in a file named `bind.ml`:

```
open Async

let d1 = Reader.file_contents Sys.argv.(1)
let cb = fun s -> return (String.uppercase_ascii s)
let d2 = Deferred.bind d1 cb
let _ = upon d2 (fun s -> printf "%s\n" s)
let _ = Scheduler.go ()
```

Compile the file. You will get Warning 6 about an omitted label; that's okay, we'll take care of that later. Run the program, supplying a filename as a command-line argument, e.g.

```
$ ./bind.byte bind.ml
```

□

The code creates a deferred `d1` that will become determined after the contents of a file (named by the first command-line argument, which is `Sys.argv.(1)`) have been read.
`Reader.file_contents` is a non-blocking I/O function that immediately returns while a file's contents are being read "in the background". The code then uses `bind` to register a callback `cb` to be run when those contents are available. That callback returns a deferred that is immediately determined and is the upper-cased version of the file. The code finally registers another callback to print those uppercased contents when they are available.

##### Exercise: return [✭]

Remove the `return` function call from the above code and attempt to compile it. Why does compilation fail? Why is `return` needed?

□

In all our code above, we've had to press Control-C to cause the scheduler to terminate. We can eliminate that annoyance by calling `Async.exit` when the program is assured that all work has been finished.

##### Exercise: exit [✭]

Change `bind.ml` to this code:

```
open Async

let d1 = Reader.file_contents Sys.argv.(1)
let cb = fun s -> return (String.uppercase_ascii s)
let d2 = Deferred.bind d1 cb
let d3 = Deferred.bind d2 (fun s -> printf "%s\n" s; return ())
let _ = upon d3 (fun _ -> ignore(exit 0))
let _ = Scheduler.go ()
```

Recompile and run. The program should now terminate when the file contents have been uppercased and printed.

The function `exit : int -> 'a Deferred.t` is provided by `Async` and causes execution to terminate. The return code `0` means execution ended normally. The function `ignore : 'a -> unit` is provided by `Pervasives`; we use it here to ignore the deferred returned by `exit`.

□

##### Exercise: infix bind [✭]

Change the calls to `Deferred.bind` in the previous exercise **exit** to the infix operator `>>=`. Recompile (Warning 6 will now go away) and run.

□

## Idiomatic use of >>=

Programming directly with `bind` can lead to code that is difficult to read.
Here's a more idiomatic way of writing the code we've been developing in the last few exercises:

```
open Async

let _ =
  Reader.file_contents Sys.argv.(1) >>= fun s ->
  printf "%s\n" (String.uppercase_ascii s);
  exit 0

let _ = Scheduler.go ()
```

Observe how we put `>>= fun s ->` at the end of a line. Read that as "take the deferred from the left of the `>>=`, extract the value inside it, bind that value to `s`, and keep going."

Here's a larger example. The code on the left is a hypothetical program written without Async for reading a string from a file, converting the string to an integer, waiting for that integer number of seconds, then reading a message from the network, then terminating. The code on the right is a hypothetical program written with Async for doing the same things. (The functions `read_file`, `wait`, and `read_from_network` are fictitious here.)

```
(* synchronous function *)                (* asynchronous function *)

(* let program () : unit =             *) let program () : unit Deferred.t =
(*   let s = read_file () in           *)   read_file () >>= fun s ->
(*   let n = int_of_string s in        *)   let n = int_of_string s in
(*   let _ = wait n in                 *)   wait n >>= fun _ ->
(*   let p = read_from_network () in   *)   read_from_network () >>= fun p ->
(*   print "done";                     *)   print "done";
(*   ()                                *)   return ()
```

Note how in the asynchronous program each line contains pretty much the same subexpressions as the corresponding line in the synchronous program, but with an anonymous function instead of a let expression (we know those are the same thing anyway!). That is, instead of

```
let x = e in ...
```

we write

```
e >>= fun x -> ...
```

With some practice, you'll soon become accustomed to reading code written in this style.

##### Exercise: sequence [✭✭]

Complete the following function `shout`, which prompts the user to enter some input, reads a line of input, waits 3 seconds, prints the input converted to all caps, and exits the program.
*Hints: each comment corresponds to one line of code that you need to write; you will need to use `>>=` twice.*

```
open Async

(**
 * [stdin] is used to read input from the command line.
 * [Reader.read_line stdin] will return a deferred that becomes determined when
 * the user types in a line and presses enter.
 *)
let stdin : Reader.t = Lazy.force Reader.stdin

let shout () : unit Deferred.t =
  (* prompt the user using [printf] *)

  (* read the input using [Reader.read_line stdin] *)

  (* wait 3 seconds using [after] *)

  (* print out whatever the user entered, but converted to all caps, using [printf];
     or if EOF was reached, print nothing *)

  (* exit the program using [exit] *)

let _ = shout ()
let _ = Scheduler.go ()
```

`Reader.read_line` is documented [here][reader]. To pattern match against the result of `Reader.read_line`, you need to write a pattern like the following:

```
| `Ok line -> (* do something with the line, which is a string *)
| `Eof -> (* do something because the end of file is reached *)
```

[reader]: https://ocaml.janestreet.com/ocaml-core/v0.9/doc/async_unix/Async_unix/Reader/index.html

□

## Idiomatic use of let%bind

There is another, relatively new way of writing the bind operator. Instead of

```
bind d (fun x -> e)
```

or

```
d >>= fun x -> e
```

you can now write

```
let%bind x = d in e
```

This might be the clearest way yet to write the operator.

To make that syntax work with Atom, add the following to your `.merlin` file:

```
PKG async
FLG -ppx 'ppx-jane -as-ppx'
```

##### Exercise: sequence let [✭✭]

Rewrite your solution to **sequence** above to use `let%bind`.

□

## Additional exercises

##### Exercise: loop [✭✭]

Based on the **sequence** exercise above, write an Async program that repeatedly: prompts for input, reads the input, waits for three seconds, and prints the input. If the end of the file is reached, then the program instead prints "done" and exits.

Note: type Control+D at the terminal to send the EOF character.
□

##### Exercise: either [✭✭✭]

Use ivars to implement a function

```
either : 'a Deferred.t -> 'b Deferred.t -> [`Left of 'a | `Right of 'b] Deferred.t
```

The deferred returned from `either` should become determined when at least one of the input deferreds becomes determined. The value of the result should contain the results of either the first or the second input deferred.

Hint: first create a new `Ivar.t` and then use `upon` to schedule a function on each of the two input deferreds.

□

Just as you can use recursive functions to repeatedly process input in a synchronous program, you can write recursive functions to repeatedly process input in an asynchronous program.

##### Monitor a file [✭✭✭]

Write an Async program that monitors the contents of a log file. Specifically, your program should open the file, continually read a line from the file, and as each line becomes available, print the line to stdout. When you reach the end of the file (EOF), your program should terminate.

To get started, open a new terminal window and enter the following commands:

```
$ mkfifo log
$ cat >log
```

Now anything you type into the terminal window (after pressing return) will be added to the log file. This will enable you to test your program interactively.

Next, enter the following code into a file named `monitor.ml`:

```
open Async

let rec monitor file =
  (* hint: use Reader.read_line *)

let _ = Reader.open_file "log"
  (* hint: use >>= and monitor *)

let _ = Scheduler.go ()
```

Complete the code.

□
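
The **either** exercise above relies on ivars, which the rest of this lab does not show in use. The following is a minimal sketch of the basic ivar operations, not a solution to that exercise. It assumes the same Async 0.9 setup as the earlier programs; the string `"filled"` and the one-second delay are arbitrary values chosen for illustration.

```ocaml
open Async

(* An ivar is a write-once cell. [Ivar.read] returns the deferred that
   becomes determined when the ivar is filled. *)
let iv : string Ivar.t = Ivar.create ()
let d : string Deferred.t = Ivar.read iv

(* Register a callback on the deferred; it cannot run until the ivar
   has been filled. It prints the value and then ends the program. *)
let _ = upon d (fun s -> printf "%s\n" s; ignore (exit 0))

(* About one second from now, fill the ivar, which determines [d].
   The value "filled" is an arbitrary illustrative choice. *)
let _ = upon (after (Core.sec 1.)) (fun () -> Ivar.fill iv "filled")

let _ = Scheduler.go ()
```

If you save this under a name of your choosing, say `ivar_demo.ml` (a hypothetical filename), it should build the same way as the other programs in this lab, with `corebuild -pkg async ivar_demo.byte`. An ivar can be filled only once, and calling `Ivar.fill` on an ivar that is already full raises an exception, so it is worth thinking about what happens when more than one callback might try to fill the same ivar.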