Part 4: Appendices

These appendices are Part 4 of 4 of The Error Monad tutorial.

In depth discussion: what even is a monad?

This tutorial has been pretty loose with the word monad. It has focused on usage, with very little explanation of the fundamental concepts, and on the surface syntax rather than the underlying mathematical model. This section goes into a bit more detail about the general principles of monads. It does not attempt to give a complete description of monads; it simply gives more detail than the previous sections.

For coding purposes, a monad is a parametric type equipped with a set of specific operators.

type 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
val return : 'a -> 'a t

The return operator injects a value into the monad and the bind operator continues within the monad.

The set of operators must also follow the monad laws. For example bind (return x) f must be equivalent to f x.

Monads are used as a generic way to encode different abstractions within a programming language: I/O, errors, collections, etc. For example, the option monad is defined as

module OptionMonad = struct
  type 'a t = 'a option
  let bind x f = match x with | None -> None | Some x -> f x
  let return x = Some x
end

This monad is useful when dealing with queries that may have no answer. It can be used as a lighter form of error management than the result monad.
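As an illustration, the monad laws can be spot-checked on the option monad. The sketch below repeats the OptionMonad definition so that it is self-contained; the helpers half and pred_pos are hypothetical and exist only to exercise the laws. Checking a few values is of course not a proof.

```ocaml
(* Same definition as in the text, repeated so the block is self-contained. *)
module OptionMonad = struct
  type 'a t = 'a option
  let bind x f = match x with None -> None | Some x -> f x
  let return x = Some x
end

(* Hypothetical helper functions used only to exercise the laws. *)
let half n = if n mod 2 = 0 then Some (n / 2) else None
let pred_pos n = if n > 0 then Some (n - 1) else None

let () =
  let open OptionMonad in
  (* Left identity: [bind (return x) f] is equivalent to [f x]. *)
  assert (bind (return 4) half = half 4);
  (* Right identity: [bind m return] is equivalent to [m]. *)
  assert (bind (Some 4) return = Some 4);
  assert (bind None return = (None : int option));
  (* Associativity: the nesting of the binds does not matter. *)
  assert (
    bind (bind (Some 4) half) pred_pos
    = bind (Some 4) (fun x -> bind (half x) pred_pos))
```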

Some programming languages also offer syntactic sugar for monads. This is to avoid having to write bind within bind within bind. E.g., Haskell relies heavily on monads and has the dedicated do-notation. In OCaml, you can use one of the following methods:

  • binding operators (since OCaml 4.08.0)

    let add x y =
      let ( let* ) = OptionMonad.bind in
      let* x = int_of_string_opt x in
      let* y = int_of_string_opt y in
      Some (string_of_int (x + y))
    
  • infix operators

    let add x y =
      let ( >>= ) = OptionMonad.bind in
      int_of_string_opt x >>= fun x ->
      int_of_string_opt y >>= fun y ->
      Some (string_of_int (x + y))
    

    Note that mixing multiple infix operators is not always easy because of precedence and associativity.

  • partial application and infix @@

    let add x y =
      OptionMonad.bind (int_of_string_opt x) @@ fun x ->
      OptionMonad.bind (int_of_string_opt y) @@ fun y ->
      Some (string_of_int (x + y))
    

    This is useful for the occasional application: you do not need to declare a dedicated operator nor open a dedicated syntax module.

Monads can have additional operators besides the required core. E.g., you can add OptionMonad.join : 'a option option -> 'a option.
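One possible way to write such a join is in terms of bind, as in the sketch below (the OptionMonad definition is repeated so that the block is self-contained). The OCaml Stdlib offers the same operation as Option.join.

```ocaml
module OptionMonad = struct
  type 'a t = 'a option
  let bind x f = match x with None -> None | Some x -> f x
  let return x = Some x

  (* [join] flattens one level of nesting; it is [bind] applied to the
     identity function. *)
  let join (x : 'a option option) : 'a option = bind x (fun inner -> inner)
end

let () =
  assert (OptionMonad.join (Some (Some 1)) = Some 1);
  assert (OptionMonad.join (Some None) = (None : int option));
  assert (OptionMonad.join None = (None : int option))
```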

In depth discussion: Error_monad, src/lib_error_monad/, Mavryk_base__TzPervasives, etc.

The different parts of the error monad (syntax modules, extended stdlib, tracing primitives, etc.) are defined in separate files. Yet, they are all available to you directly. This section explains where each part is defined and how it reaches the scope of your code.

From your code, working back to the definitions.

In most of Mavkit, the Error_monad module is available. Specifically, it is available in all the packages that depend on mavryk-base. This covers everything except the protocols and a handful of low-level libraries.

In those parts of Mavkit, the build files include -open Mavryk_base__TzPervasives.

The module Mavryk_base__TzPervasives is defined by the compilation unit src/lib_base/TzPervasives.ml.

This compilation unit gathers multiple low-level modules together. Of interest to us is include Mavryk_error_monad.Error_monad (left untouched in the mli) and include Mavryk_error_monad.TzLwtreslib (not present in the mli, used to shadow the Stdlib modules List, Option, Result, etc.).

The Error_monad module exports:

  • the error type along with the register_error_kind function,

  • the 'a tzresult type,

  • the TzTrace module,

  • the Result_syntax and Lwt_result_syntax modules (re-exported from modules with different, more generic names),

  • and a few more functions.

The rest of the mavryk-error-monad package:

  • defines the 'a trace type (in TzTrace.ml), and

  • instantiates TzLwtreslib by applying Lwtreslib’s Traced functor to TzTrace.

The Lwtreslib module exports a Traced (T: TRACE) functor. This functor takes a definition of traces and returns a group of modules intended to shadow the Stdlib.
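The general shape of such a functor can be sketched as follows. This is a heavily simplified, hypothetical model: the TRACE signature, the functions fail_traced and add_context, and the list-based ListTrace are all assumptions made for illustration and are not the actual Lwtreslib interface.

```ocaml
(* HYPOTHETICAL, simplified trace interface; not the actual Lwtreslib one. *)
module type TRACE = sig
  type 'error trace
  val make : 'error -> 'error trace
  val cons : 'error -> 'error trace -> 'error trace
end

(* HYPOTHETICAL functor: given a trace definition, it returns a module meant
   to shadow part of the Stdlib (here, a tiny [Result]). *)
module Traced (T : TRACE) = struct
  module Result = struct
    include Stdlib.Result

    (* Fail with a fresh single-error trace. *)
    let fail_traced e = Error (T.make e)

    (* Add [e] on top of the trace of a failed result. *)
    let add_context e = function
      | Ok _ as ok -> ok
      | Error trace -> Error (T.cons e trace)
  end
end

(* A list-based trace, chosen only for illustration. *)
module ListTrace = struct
  type 'error trace = 'error list
  let make e = [e]
  let cons e trace = e :: trace
end

module Shadowing = Traced (ListTrace)

let () =
  let open Shadowing in
  (* [Result] now refers to the traced version. *)
  let r : (int, string list) result = Result.fail_traced "timeout" in
  assert (
    Result.add_context "while connecting" r
    = Error ["while connecting"; "timeout"])
```

Opening the module returned by the functor is what makes the traced Result take precedence over the Stdlib one in the rest of the scope.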

From the underlying definitions, working all the way up to your code.

At the lowest level is Lwtreslib.

  • src/lib_lwt_result_stdlib/bare/sigs: defines interfaces for basic, non-traced syntax modules and Stdlib-replacement modules.

  • src/lib_lwt_result_stdlib/bare/structs: defines implementations for basic, non-traced syntax modules and Stdlib-replacement modules.

  • src/lib_lwt_result_stdlib/traced/sigs: defines interfaces for traced syntax modules and Stdlib-replacement modules. These interfaces are built on top of the non-traced interfaces, mostly by addition and occasionally by shadowing.

  • src/lib_lwt_result_stdlib/traced/structs: defines implementations for traced syntax modules and Stdlib-replacement modules. These implementations are built on top of the non-traced implementations, mostly by addition and occasionally by shadowing. These are defined as functors over some abstract tracing primitives.

  • src/lib_lwt_result_stdlib/lwtreslib.mli: puts together the traced implementations into a single functor Traced that takes a trace definition and returns fully instantiated modules to shadow the Stdlib.

Above Lwtreslib is the Error monad.

  • src/lib_error_monad/TzTrace.ml: defines the 'a trace type along with the low-level trace-construction primitives.

  • src/lib_error_monad/TzLwtreslib.ml: instantiates Lwtreslib.Traced with TzTrace.

  • src/lib_error_monad/monad_extension_maker.ml: provides a functor which, given a tracing module, provides some higher level functions for tracing as well as a few other functions.

  • src/lib_error_monad/core_maker.ml: provides a functor which, given a name, provides an error type, a register_error_kind function, and a few other related functions. This is a functor so we can instantiate it separately for the shell and for each of the protocols.

  • src/lib_error_monad/TzCore.ml: instantiates the core_maker functor for the shell.

  • src/lib_error_monad/error_monad.ml: puts together all of the above into a single module.

Above the Error monad is lib-base:

  • src/lib_base/TzPervasives.ml: exports the Error_monad module, includes the Error_monad module, exports each of the TzLwtreslib modules.

In depth discussion: result as data and result as control-flow

Note that result (and similarly, tzresult) is a data type. Specifically

type ('a, 'b) result =
| Ok of 'a
| Error of 'b

You can treat values of type result as data of that data-type. In this case, you construct and match the values, you pass them around, etc.

Note however that, in Mavkit, we also use the result type as a control-flow mechanism. Specifically, in conjunction with the let* binding operator, the result type has a continue/abort meaning.
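The continue/abort meaning can be seen in a small self-contained sketch. It assumes a local let* defined from the Stdlib's Result.bind, standing in for the one provided by Result_syntax; parse_positive and area are hypothetical functions.

```ocaml
(* A local [let*] for [result], standing in for the one from [Result_syntax]. *)
let ( let* ) = Result.bind

(* Hypothetical parser: each step either continues ([Ok]) or aborts ([Error]). *)
let parse_positive s =
  match int_of_string_opt s with
  | None -> Error ("not an integer: " ^ s)
  | Some n when n <= 0 -> Error ("not positive: " ^ s)
  | Some n -> Ok n

let area w h =
  (* On [Error], everything below the failing [let*] is skipped. *)
  let* w = parse_positive w in
  let* h = parse_positive h in
  Ok (w * h)

let () =
  assert (area "3" "4" = Ok 12);
  assert (area "x" "4" = Error "not an integer: x");
  assert (area "3" "-1" = Error "not positive: -1")
```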

Within your code, you can go from one use to the other. E.g.,

let xs =
  List.rev_map
    (fun x ->
      (* [result] as control-flow *)
      let open Result_syntax in
      let* .. = .. in
      let* .. = .. in
      return ..)
    ys
in
let successes xs =
  (* [result] as data *)
  List.length (List.rev_filter_ok xs)
in
..

Using result sometimes as data and sometimes as control-flow is the main reason to bend the guidelines about which syntax module to open. E.g., if your function returns (_, _) result Lwt.t but the result is data returned by the function rather than control-flow used within the function, then you should open Lwt_syntax (rather than Lwt_result_syntax).

As a significant aside, note that in OCaml you can also use exceptions for control-flow (with raise and try-with and match-with-exception) and as data (the type exn is an extensible variant data-type).

(** [iter_no_raise f xs] applies [f] to all the elements of [xs]. If [f] raises
    an exception, the iteration continues and [f] is still applied to the other
    elements. The function returns pairs of the exceptions raised by [f] along
    with the elements of [xs] that triggered these exceptions. *)
let iter_no_raise f xs =
  List.fold_left
    (fun excs x ->
      match f x with
      | exception exc -> (exc, x) :: excs
      | () -> excs)
    []
    xs

You can find uses of exceptions as data within the error monad itself. First, the generic failure functions (error_with, error_with_exn, failwith, and fail_with_exn) are just wrappers around an error which carries an exception (as data).

Second, Lwtreslib provides helpers to catch exceptions. E.g., Result.catch : (unit -> 'a) -> ('a, exn) result calls a function and wraps any raised exception inside an Error constructor.
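A helper with this type can be sketched in a few lines. The version below is a hypothetical re-implementation for illustration only; the actual Lwtreslib helper may differ, for instance in which exceptions it chooses to catch.

```ocaml
(* HYPOTHETICAL re-implementation, for illustration; the real helper lives in
   Lwtreslib and may behave differently. *)
let catch (f : unit -> 'a) : ('a, exn) result =
  match f () with
  | v -> Ok v
  | exception exc -> Error exc

let () =
  assert (catch (fun () -> int_of_string "42") = Ok 42);
  (* The exception becomes a plain value that can be matched on. *)
  match catch (fun () -> int_of_string "forty-two") with
  | Error (Failure _) -> ()
  | Ok _ | Error _ -> assert false
```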

In depth discussion: pros and cons of result compared to other error management techniques

In Mavkit, we use result and the specialised tzresult. For this reason, this tutorial is focused on result/tzresult. However, there are other techniques for handling errors. This section compares them briefly.

In general you should use result and tzresult but in some specific cases you can deviate from that. The comparisons below may help you decide.

Exceptions

In exception-based error handling, you raise an exception (via raise) when an error occurs and you catch it (via try-with) to recover. Exceptions are fast because the OCaml compiler and runtime provide the necessary mechanisms directly.

Whether a function can raise an exception or not cannot be determined by its type. This means that it is easy to forget to recover from an exception. If an external library changes the set of exceptions that a function raises, you need to update the calls to this function, but the type-checker cannot warn you about it. This places a heavy burden on the developer, who is responsible for checking the documentation of all the functions they call.

Exception-raising functions should be documented as such using the @raise documentation keyword.
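As a small illustration, the hypothetical function parse_port below has the type string -> int, which says nothing about the two exceptions it may raise. Only the documentation comment tells the caller what to recover from.

```ocaml
(** [parse_port s] converts [s] to a port number.
    @raise Failure if [s] is not a valid integer.
    @raise Invalid_argument if the integer is outside of 0..65535. *)
let parse_port (s : string) : int =
  let n = int_of_string s in
  if n < 0 || n > 65535 then invalid_arg "parse_port: out of range" else n

let () =
  assert (parse_port "8080" = 8080);
  (* Recovery relies on the caller knowing which exceptions to expect. *)
  let port = try parse_port "http" with Failure _ -> 0 in
  assert (port = 0);
  let port = try parse_port "70000" with Invalid_argument _ -> 0 in
  assert (port = 0)
```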

Pros: performance is good, used widely in the larger ecosystem.
Cons: you cannot rely on the type-checker to help you at all, you depend on the quality of the documentation of your external and internal dependencies.

Note that within the protocol, you should not use exceptions at all.

tzresult

With tzresult, errors are carried by the Error constructor of a result. In this way an 'a tzresult represents the result of a computation that normally returns an 'a but may fail.

Because the type of errors is an abstract wrapper (trace) around an extensible variant (error), you can only recover from these errors in a generic way.

Pros: the type of a function indicates if it can fail or not, you cannot forget to check for success/failure.
Cons: you cannot check which error was raised, registration is heavy and complicated.

result

With result, errors are carried by the Error constructor. Each function defines its own type of errors.

Pros: the type of a function indicates if and how it can fail, you cannot forget to check for success/failure, you can check the payload of failures.
Cons: different errors from different functions cannot be used together (need conversions), and* is unusable.
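The need for conversions can be sketched with two hypothetical functions, parse and lookup, each with its own error type. Combining them under a local let* (defined here from the Stdlib's Result.bind) requires mapping both error types into a common one, for example with Result.map_error. In exchange, the caller can inspect the payload of each failure.

```ocaml
let ( let* ) = Result.bind

(* Two hypothetical functions, each with its own error type. *)
type parse_error = Not_an_int of string
type lookup_error = Unknown_key of int

let parse s =
  match int_of_string_opt s with
  | Some n -> Ok n
  | None -> Error (Not_an_int s)

let lookup table key =
  match List.assoc_opt key table with
  | Some v -> Ok v
  | None -> Error (Unknown_key key)

(* A third type unifies them; [Result.map_error] does the conversion. *)
type error = Parse of parse_error | Lookup of lookup_error

let find table s =
  let* key = Result.map_error (fun e -> Parse e) (parse s) in
  let* v = Result.map_error (fun e -> Lookup e) (lookup table key) in
  Ok v

let () =
  let table = [(1, "one"); (2, "two")] in
  assert (find table "2" = Ok "two");
  assert (find table "x" = Error (Parse (Not_an_int "x")));
  assert (find table "3" = Error (Lookup (Unknown_key 3)))
```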

option

With option, errors are represented by the None constructor. Errors are completely devoid of payload.

Because there are no payloads attached to an error, you should generally treat the error directly at the call site. Otherwise you might lose track of the origin of the failure. E.g., what was not found in the following code fragment?

match
  let open Option_syntax in
  let* z = find "zero" in
  let* o = find "one" in
  Some (z, o)
with
| None -> ..
| Some (z, o) -> ..

Pros: the type of a function indicates if it can fail, you cannot forget to check for success/failure.
Cons: a single kind of error means it cannot be very informative.

Option is a common enough strategy that the Option_syntax and Lwt_option_syntax modules are available in the Mavkit source.
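One way to treat the error directly at the call site is to attach a payload to each None as soon as it appears. The sketch below does so with the Stdlib's Option.to_result; the find function and its one-entry table are hypothetical, and the local let* is defined from Result.bind.

```ocaml
let ( let* ) = Result.bind

(* Hypothetical lookup: only "zero" is known. *)
let find key = List.assoc_opt key [("zero", 0)]

let both () =
  (* [Option.to_result] attaches a payload at each call site. *)
  let* z = Option.to_result ~none:"zero" (find "zero") in
  let* o = Option.to_result ~none:"one" (find "one") in
  Ok (z, o)

(* The error now says which key was not found. *)
let () = assert (both () = Error "one")
```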

fallback

Another approach to errors is to have a default or fallback value. In that case, the function returns a sensible default value when it would otherwise raise an exception or return an error. Alternatively, it can take this fallback value as a parameter.

(** @raise Not_found if argument is [None] *)
val get : 'a option -> 'a

(** returns [default] if argument is [None] *)
val value : default:'a -> 'a option -> 'a

Pros: there is no error.
Cons: doesn’t work for every function, works differently on different functions.
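Both flavours of fallback can be sketched with the Stdlib. The function names port_of_string and max_of_list are hypothetical. The second function also hints at the cons: min_int is a debatable fallback for the maximum of an empty list.

```ocaml
(* [Option.value] takes the fallback as a parameter. *)
let port_of_string s = Option.value ~default:8080 (int_of_string_opt s)

(* Fallback chosen by the function itself: the empty list has no maximum, so
   this hypothetical helper returns [min_int] instead of failing. *)
let max_of_list xs = List.fold_left max min_int xs

let () =
  assert (port_of_string "9000" = 9000);
  assert (port_of_string "default" = 8080);
  assert (max_of_list [3; 7; 5] = 7);
  assert (max_of_list [] = min_int)
```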

Legacy code

The codebase contains only let-style binding operators. However, you might encounter infix bindings in older protocols. If you do and you are unsure about what the many infix operators do, read on.

The legacy code is written with infix bindings instead of let-style binding operators. The binding operators are >>? for result and tzresult, >>= for Lwt, and >>=? for Lwt-result and Lwt-tzresult. A full equivalence table follows.

Modern                               Legacy

let open Result_syntax in            e >>? fun x ->
let* x = e in                        e'
e'

let open Result_syntax in            e >|? fun x ->
let+ x = e in                        e'
e'

let open Lwt_syntax in               e >>= fun x ->
let* x = e in                        e'
e'

let open Lwt_syntax in               e >|= fun x ->
let+ x = e in                        e'
e'

let open Lwt_result_syntax in        e >>=? fun x ->
let* x = e in                        e'
e'

let open Lwt_result_syntax in        e >|=? fun x ->
let+ x = e in                        e'
e'

and*, and+ (any syntax module)       No equivalent, uses both_e, both_p, or both_ep

let open Lwt_result_syntax in        (e >>= ok) >>=? fun x ->
let*! x = e in                       e'
e'

let open Lwt_result_syntax in        e >>?= fun x ->
let*? x = e in                       e'
e'
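The correspondence between the two styles can be checked on a small example restricted to result. All four operators below are local definitions built from the Stdlib's Result.bind and Result.map; they stand in for the Mavkit ones and are assumptions made for this sketch, as are the functions parse, modern and legacy.

```ocaml
(* Local definitions standing in for the Mavkit ones, for [result] only. *)
let ( let* ) = Result.bind
let ( let+ ) x f = Result.map f x
let ( >>? ) = Result.bind
let ( >|? ) x f = Result.map f x

(* Hypothetical parser returning a [result]. *)
let parse s =
  Option.to_result ~none:("not an integer: " ^ s) (int_of_string_opt s)

(* Modern style: let-style binding operators. *)
let modern s =
  let* x = parse s in
  let+ y = parse "2" in
  x * y

(* Legacy style: infix operators followed by anonymous functions. *)
let legacy s =
  parse s >>? fun x ->
  parse "2" >|? fun y ->
  x * y

let () =
  assert (modern "21" = Ok 42);
  assert (legacy "21" = Ok 42);
  assert (modern "x" = legacy "x")
```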

In addition, instead of dedicated return and fail functions from a given syntax module, the legacy code relied on global values.

Modern                               Legacy

let open Result_syntax in            ok x
return x

let open Result_syntax in            No equivalent, uses Error e
fail e

let open Lwt_syntax in               No equivalent, uses Lwt.return x
return x

let open Lwt_result_syntax in        return x
return x

let open Lwt_result_syntax in        No equivalent, uses Lwt.return_error e
fail e

let open Result_syntax in            ok x
return x

let open Result_syntax in            error e
tzfail e

let open Lwt_result_syntax in        return x
return x

let open Lwt_result_syntax in        fail e
tzfail e

In addition to these syntactic differences, there are also usage differences. You might encounter the following patterns which you should not repeat:

  • Matching against a trace:

    match f () with
    | Ok .. -> ..
    | Error (Timeout :: _) -> ..
    | Error trace -> ..
    

    This is discouraged because the compiler is unable to warn you if the matching is affected by a change in the code. E.g., if you add context to an error in one place in the code, you may change the result of the matching somewhere else in the code.
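The fragility can be reproduced in a simplified, hypothetical model where error is a plain variant and a trace is a list of errors (both are assumptions made for this sketch; the function names are invented as well).

```ocaml
(* HYPOTHETICAL, simplified model: a plain variant and a list-based trace. *)
type error = Timeout | Context of string
type trace = error list

let classify (r : (unit, trace) result) =
  match r with
  | Ok () -> "ok"
  | Error (Timeout :: _) -> "timeout"
  | Error _ -> "other"

let fetch () : (unit, trace) result = Error [Timeout]

(* A later change elsewhere adds context on top of the trace. *)
let fetch_with_context () =
  match fetch () with
  | Ok () -> Ok ()
  | Error trace -> Error (Context "while fetching" :: trace)

let () =
  assert (classify (fetch ()) = "timeout");
  (* Same underlying failure, but the match now takes the generic branch,
     and the compiler raises no warning. *)
  assert (classify (fetch_with_context ()) = "other")
```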