Recitation: Futures

# Futures

Today we will experiment with some Async code, using `return`, `bind`, `upon`, and some other Async functions.

It can be hard to come by up-to-date Async documentation. Here are some pointers:

* The source code for version 113.33 (which is the currently available version through OPAM as of the semester this course is being taught) is on GitHub. Reading the `.mli` and `.ml` files is the most reliable way to get current documentation.

  - https://github.com/janestreet/async_kernel/tree/113.33/src
  - https://github.com/janestreet/async_unix/tree/113.33/src
  - https://github.com/janestreet/async_extra/tree/113.33/src

* There is HTML documentation for version 111.28 available from https://ocaml.janestreet.com/ocaml-core/111.28.00/doc/. Although it's out of date, a lot of it is still applicable.

* [Chapter 18][rwo18] of *Real World OCaml* has a tutorial on Async.

[rwo18]: https://realworldocaml.org/v1/en/html/concurrent-programming-with-async.html

## Async Warmup

Read the following code:

```
open Async.Std

let output s = printf "%s\n" s

let d1 = return 42
let d2 = return 42
let d3 = return 42

let _ = output "A"
let _ = upon d2 (fun _ -> upon d3 (fun _ -> output "B"))
let _ = upon d1 (fun _ -> output "C")
let _ = output "D"
```

Recall that

```
upon : 'a Deferred.t -> ('a -> unit) -> unit
```

registers a callback with the Async scheduler. The callback gets to use the contents of the deferred after it becomes determined, but the callback doesn't return anything interesting: it returns just `()`, which is consumed by the scheduler.

##### Exercise: predict order [✭]

Before trying to run that code, first think about an answer to this question: in what order will the four strings be output? Write down a guess before proceeding.

&square;

##### Exercise: order1 [✭]

Now put the above code in a file named `warmup.ml`. Compile and run it with

```
$ corebuild -pkg async warmup.byte
$ ./warmup.byte
```

What output do you get? Does it match what you guessed?
&square;

The reason why you don't get any output is that the scheduler is not running, and that `printf` in this code is actually `Async.Std.printf` (which is a synonym for `Async_unix.Async_print.printf`), not the standard library's `Printf.printf`. Async's version of `printf` is non-blocking and relies on the scheduler to be running.

##### Exercise: order2 [✭]

Add this as the last line of the file:

```
let _ = Scheduler.go()
```

Recompile and run. You should now see all four strings output. But they might be in a different order than you guessed. Afterwards, the scheduler keeps running even though it has nothing more to do. To terminate the program, you'll have to enter Control-C. We'll come back to that later in the lab.

&square;

There's really **no guarantee** about what order the four strings will be output. The scheduler is free to reorder callbacks as it wishes. But the most likely output you'll receive is:

```
A
D
C
B
```

If so, it's likely because the scheduler never encountered a deferred that was undetermined, hence it processed all callbacks/outputs in exactly the order they were registered:

* The output of `A` was registered first and the scheduler chose to run it first.

* The callback for `d2` was registered next, and the scheduler runs it until the callback for `d3` is registered, at which point the scheduler moves on.

* The callback for `d1` was registered next, and the scheduler runs it until the output of `C` is registered, at which point the scheduler moves on.

* The output of `D` was registered next, and that occurs.

* The scheduler then returns to the callback for `d3`, which runs until the output of `B` is registered, at which point the scheduler moves on.

* The scheduler then returns to the output of `C`.

* And finally, the output of `B`.

We can mess with that order, though, by inserting some delays in when deferreds become determined.
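Before adding delays, it may help to see the registration-order walkthrough above as a toy model. The sketch below uses only the OCaml standard library and a single first-in, first-out queue of jobs. It is an illustration of the walkthrough, not a description of how Async is implemented, and the names `register`, `output`, `upon_determined`, `go`, and `printed` are hypothetical stand-ins rather than Async functions.

```ocaml
(* Toy model only: one FIFO queue of jobs stands in for the scheduler.
   None of these names belong to Async. *)

let jobs : (unit -> unit) Queue.t = Queue.create ()

(* Strings that have been "printed" so far, most recent first. *)
let printed : string list ref = ref []

(* Register a job to be run later, after everything already queued. *)
let register job = Queue.add job jobs

(* Model of the non-blocking [output]: the write itself is a queued job. *)
let output s = register (fun () -> printed := s :: !printed)

(* Model of [upon] applied to an already-determined deferred holding [v]:
   the callback is queued instead of being run on the spot. *)
let upon_determined v callback = register (fun () -> callback v)

(* Model of [Scheduler.go]: run jobs in registration order until none remain.
   Unlike the real scheduler, this loop stops when the queue is empty. *)
let go () =
  while not (Queue.is_empty jobs) do
    (Queue.pop jobs) ()
  done

let () =
  output "A";
  upon_determined 42 (fun _ -> upon_determined 42 (fun _ -> output "B"));
  upon_determined 42 (fun _ -> output "C");
  output "D";
  go ();
  (* Prints: A D C B *)
  print_endline (String.concat " " (List.rev !printed))
```

In this model the nested callback for `d3` and the outputs of `C` and `B` are only registered while earlier jobs run, so they end up behind `D` in the queue. That is the same reasoning as in the bullets above.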
##### Exercise: delay [✭✭]

Here is code to create a deferred that becomes determined after about 5 seconds:

```
after (Core.Std.sec 5.)
```

Individually change each of `d1`, `d2`, and `d3` in the warmup program to that deferred, recompile, and run, to see what effect it has. Then try delaying two or even all three of them. Explain in your own words what you observe (output order and timing) and why.

&square;

## Bind

Recall that

```
Deferred.bind : 'a Deferred.t -> ('a -> 'b Deferred.t) -> 'b Deferred.t
```

is used to schedule a deferred computation to take place after another deferred computation finishes. The function `bind` takes two inputs:

* a deferred `d : 'a Deferred.t`, and

* a callback `c : ('a -> 'b Deferred.t)`.

The expression `bind d c` immediately returns with a new deferred `d'`. Sometime after `d` is determined (if ever), the scheduler runs `c` on the contents of `d`. The callback `c` itself produces a new deferred; if that deferred ever becomes determined, `d'` becomes determined with the same value.

##### Exercise: bind [✭]

Enter the following code in a file named `bind.ml`:

```
open Async.Std

let d1 = Reader.file_contents Sys.argv.(1)

let cb = fun s -> return (String.uppercase_ascii s)

let d2 = Deferred.bind d1 cb

let _ = upon d2 (fun s -> printf "%s\n" s)

let _ = Scheduler.go ()
```

Compile the file. Then run it, supplying a filename as a command-line argument, e.g.

```
$ ./bind.byte bind.ml
```

&square;

The code creates a deferred `d1` that will become determined after the contents of a file (named by the first command-line argument, which is `Sys.argv.(1)`) have been read. `Reader.file_contents` is a non-blocking I/O function that immediately returns while a file's contents are being read "in the background". The code then uses `bind` to register a callback `cb` to be run when those contents are available. That callback returns a deferred that is immediately determined with the upper-cased contents of the file.
The code finally registers another callback to print those uppercased contents when they are available.

##### Exercise: return [✭]

Remove the `return` function call from the above code and attempt to compile it. Why does compilation fail? Why is `return` needed?

&square;

In all our code above, we've had to press Control-C to cause the scheduler to terminate. We can eliminate that annoyance by calling `Async.Std.exit` when the program is assured that all work has been finished.

##### Exercise: exit [✭]

Change `bind.ml` to this code:

```
open Async.Std

let d1 = Reader.file_contents Sys.argv.(1)

let cb = fun s -> return (String.uppercase_ascii s)

let d2 = Deferred.bind d1 cb

let d3 = Deferred.bind d2 (fun s -> printf "%s\n" s; return ())

let _ = upon d3 (fun _ -> ignore(exit 0))

let _ = Scheduler.go ()
```

Recompile and run. The program should now terminate when the file contents have been uppercased and printed.

The function `exit : int -> 'a Deferred.t` is provided by `Async.Std` and causes execution to terminate. The return code `0` means execution ended normally. The function `ignore : 'a -> unit` is provided by `Pervasives`; we use it here to ignore the deferred returned by `exit`.

&square;

##### Exercise: infix bind [✭]

Change the calls to `Deferred.bind` in the previous exercise **exit** to the infix operator `>>=`. Recompile and run.

&square;

## Idiomatic use of >>=

Programming with `bind` and `upon` can lead to code that is difficult to read. Here's a more idiomatic way of writing the code we've been developing in the last few exercises:

```
open Async.Std

let _ =
  Reader.file_contents Sys.argv.(1) >>= fun s ->
  printf "%s\n" (String.uppercase_ascii s);
  exit 0

let _ = Scheduler.go ()
```

Here's a larger example. The code on the left is a hypothetical program written without Async for reading a string from a file, converting the string to an integer, waiting for that integer number of seconds, then reading a message from the network, then terminating.
The code on the right is a hypothetical program written with Async for doing the same things. (The functions `read_file`, `wait`, and `read_from_network` are fictitious here.)

```
(* synchronous function *)               (* asynchronous function *)
(* let program () : unit =            *) let program () : unit Deferred.t =
(*   let s = read_file () in          *)   read_file () >>= fun s ->
(*   let n = int_of_string s in       *)   let n = int_of_string s in
(*   let _ = wait n in                *)   wait n >>= fun _ ->
(*   let p = read_from_network () in  *)   read_from_network () >>= fun p ->
(*   print "done";                    *)   print "done";
(*   ()                               *)   return ()
```

Note how in the asynchronous program each line contains pretty much the same subexpressions as the corresponding line in the synchronous program, but with an anonymous function instead of a let expression (we know those are the same thing anyway!). That is, instead of

```
let x = e in ...
```

we write

```
e >>= fun x -> ...
```

It might take some practice, but you'll soon become accustomed to reading code written in this style.

##### Exercise: sequence [✭✭]

Complete the following function `f`, which prompts the user to enter some input, reads a line of input, waits 3 seconds, prints "done", and exits the program.

*Hints: each comment corresponds to one line of code that you need to write; you will need to use `>>=` twice.*

```
open Async.Std

(**
 * [stdin] is used to read input from the command line.
 * [Reader.read_line stdin] will return a deferred that becomes determined when
 * the user types in a line and presses enter.
 *)
let stdin : Reader.t = Lazy.force Reader.stdin

let f () : unit Deferred.t =
  (* prompt the user using [printf] *)
  (* read the input using [Reader.read_line stdin] *)
  (* wait 3 seconds using [after] *)
  (* print out "done" using [printf] *)
  (* exit the program using [exit] *)

let _ = f ()

let _ = Scheduler.go ()
```

&square;

## Additional exercises

##### Exercise: loop [✭✭]

Based on the **sequence** exercise above, write an Async program that repeatedly: prompts for input, reads the input, waits for three seconds, and prints the input. If the end of the file is reached, then the program instead prints "done" and exits. Note: type Control-D at the terminal to signal end of file (EOF).

&square;

##### Exercise: either [✭✭✭]

Use ivars to implement a function

```
either : 'a Deferred.t -> 'b Deferred.t -> [`Left of 'a | `Right of 'b] Deferred.t
```

The deferred returned from `either` should become determined when at least one of the input deferreds becomes determined. The value of the result should contain the results of either the first or the second input deferred. Hint: first create a new `Ivar.t` and then use `upon` to schedule a function on each of the two input deferreds.

&square;

Just as you can use recursive functions to repeatedly process input in a synchronous program, you can write recursive functions to repeatedly process input in an asynchronous program.

##### Monitor a file [✭✭✭]

Write an Async program that monitors the contents of a log file. Specifically, your program should open the file, continually read a line from the file, and as each line becomes available, print the line to stdout. When you reach the end of the file (EOF), your program should terminate.

To get started, open a new terminal window and enter the following commands:

```
$ mkfifo log
$ cat >log
```

Now anything you type into the terminal window (after pressing return) will be added to the log file. This will enable you to test your program interactively.
Next, enter the following code into a file named `monitor.ml`:

```
open Async.Std

let rec monitor file =
  (* hint: use Reader.read_line *)

let _ = Reader.open_file "log"
  (* hint: use >>= and monitor *)

let _ = Scheduler.go ()
```

Complete the code. `Reader.read_line` and `Reader.open_file` are documented in this somewhat out-of-date [Async documentation][reader]. To pattern match against the result of `Reader.read_line`, you need to write a pattern like the following:

```
| `Ok line -> (* do something with the line, which is a string *)
| `Eof -> (* do something because the end of file is reached *)
```

[reader]: https://ocaml.janestreet.com/ocaml-core/111.28.00/doc/async/#Std.Reader

&square;
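For reference, here is the synchronous analogue of that pattern match, written with only the standard library. It is a sketch under stated assumptions: `make_source` is a hypothetical in-memory stand-in for a reader (it is not part of Async), and `drain` is an illustrative name. The asynchronous version you write for `monitor` will differ in that each read yields a deferred.

```ocaml
(* Hypothetical in-memory line source. It stands in for a real reader and
   is not part of Async. Each call returns the next line or [`Eof]. *)
let make_source (lines : string list) : unit -> [ `Ok of string | `Eof ] =
  let remaining = ref lines in
  fun () ->
    match !remaining with
    | [] -> `Eof
    | line :: rest ->
      remaining := rest;
      `Ok line

(* Recursively consume lines until [`Eof], returning them in reading order. *)
let rec drain next acc =
  match next () with
  | `Ok line -> drain next (line :: acc)
  | `Eof -> List.rev acc

let () =
  let next = make_source [ "first"; "second" ] in
  List.iter print_endline (drain next [])
```

The two branches have the same shape as the pattern shown above: one handles a line that was read, the other handles end of file and stops the recursion.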