Part 4: Appendices¶
These appendices are Part 4 of 4 of the The Error Monad tutorial.
In depth discussion: what even is a monad?¶
This tutorial has been pretty loose with the word monad. It has focused on usage with very little explanations of fundamental concepts. It is focused on the surface syntax instead of the underlying mathematical model. This section goes into a bit more details about the general principles of monads and such. It does not claim to even attempt to give a complete description of monads, it simply gives more details than the previous sections.
For coding purposes, a monad is a parametric type equipped with a set of specific operators.
type 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
val return : 'a -> 'a t
The return operator injects a value into the monad and the bind
operator continues within the monad.
The set of operators must also follow the monad laws. For example
bind (return x) f must be equivalent to f x.
Monads are used as a generic way to encode different abstractions within
a programming language: I/O, errors, collections, etc. For example, the
option monad is defined as
module OptionMonad = struct
type 'a t = 'a option
let bind x f = match x with | None -> None | Some x -> f x
let return x = Some x
end
And it is useful when dealing with queries that may have no answer. This
can be used as a lighter form of error management than the result
monad.
Some programming languages also offer syntactic sugar for monads. This
is to avoid having to write bind within bind within bind.
E.g., Haskell relies heavily on monads and has the dedicated
do-notation. In OCaml, you can use one of the following methods:
binding operators (since OCaml 4.08.0)
let add x y = let ( let* ) = OptionMonad.bind in let* x = int_of_string_opt x in let* y = int_of_string_opt y in Some (string_of_int (x + y))
-
let add x y = let ( >>= ) = OptionMonad.bind in int_of_string_opt x >>= fun x -> int_of_string_opt y >>= fun y -> Some (string_of_int (x + y))
Note that mixing multiple infix operators is not always easy because of precedence and associativity.
partial application and infix
@@let add x y = OptionMonad.bind (int_of_string_opt x) @@ fun x -> OptionMonad.bind (int_of_string_opt y) @@ fun y -> Some (string_of_int (x + y))
This is useful for the occasional application: you do not need to declare a dedicated operator nor open a dedicated syntax module.
Monads can have additional operators beside the required core. E.g., you
can add OptionMonad.join : 'a option option -> 'a option.
In depth discussion: Error_monad, src/lib_error_monad/, Mavryk_base__TzPervasives, etc.¶
The different parts of the error monad (syntax modules, extended stdlib, tracing primitives, etc.) are defined in separate files. Yet, they are all available to you directly. This section explains where each part is defined and how it reaches the scope of your code.
From your code, working back to the definitions.
In most of Mavkit, the Error_monad module is available. Specifically, it is
available in all the packages that depend on mavryk-base. This covers
everything except the protocols and a handful of low-level libraries.
In those part of Mavkit, the build files include
-open Mavryk_base__TzPervasives.
The module Mavryk_base__TzPervasives is defined by the compilation
unit src/lib_base/TzPervasives.ml.
This compilation unit gathers multiple low-level modules together. Of
interest to us is include Mavryk_error_monad.Error_monad (left
untouched in the mli) and include Mavryk_error_monad.TzLwtreslib
(not present in the mli, used to shadow the Stdlib modules List,
Option, Result, etc.).
The Error_monad module exports:
the
errortype along with theregister_error_kindfunction,the
'a tzresulttype,the
TzTracemodule,the
Result_syntaxandLwt_result_syntaxmodules (from a different, more generic name),and exports a few more functions.
The rest of the mavryk-error-monad package:
defines the
'a tracetype (inTzTrace.ml), andinstantiates
TzLwtreslibby applyingLwtreslib’sTracedfunctor toTzTrace.
The Lwtreslib module exports a Traced (T: TRACE) functor. This
functor takes a definition of traces and returns a group of modules
intended to shadow the Stdlib.
From the underlying definitions, working all the way up to your code.
At the low-level is Lwtreslib.
src/lib_lwt_result_stdlib/bare/sigs: defines interfaces for basic, non-traced syntax modules and Stdlib-replacement modules.src/lib_lwt_result_stdlib/bare/structs: defines implementations basic, non-traced syntax modules and Stdlib-replacement modules.src/lib_lwt_result_stdlib/traced/sigs: defines interfaces for traced syntax modules and Stdlib-replacement modules. These interfaces are built on top of the non-traced interfaces, mostly by addition and occasionally by shadowing.src/lib_lwt_result_stdlib/traced/structs: defines implementations for traced syntax modules and Stdlib-replacement modules. These implementations are built on top of the non-traced implementations, mostly by addition and occasionally by shadowing. These are defined as functors over some abstract tracing primitives.src/lib_lwt_result_stdlib/lwtreslib.mli: puts together the traced implementations into a single functorTracedthat takes a trace definition and returns fully instantiated modules to shadow the Stdlib.
Above Lwtreslib is the Error monad.
src/lib_error_monad/TzTrace.ml: defines the'a tracetype along with the low-level trace-construction primitives.src/lib_error_monad/TzLwtreslib.ml: instantiatesLwtreslib.TracedwithTzTrace.src/lib_error_monad/monad_extension_maker.ml: provides a functor which, given a tracing module, provides some higher level functions for tracing as well as a few other functions.src/lib_error_monad/core_maker.ml: provides a functor which, given a name, provides anerrortype, aregister_error_kindfunction, and a few other related functions. This is a functor so we can instantiate it separately for the shell and for each of the protocols.src/lib_error_monad/TzCore.ml: instantiates thecore_makerfunctor for the shell.src/lib_error_monad/error_monad.ml: puts together all of the above into a single module.
Above the Error monad is lib-base:
src/lib_base/TzPervasives.ml: exports theError_monadmodule, includes theError_monadmodule, exports each of theTzLwtreslibmodule.
In depth discussion: result as data and result as control-flow¶
Note that result (and similarly, tzresult) is a data type.
Specifically
type ('a, 'b) result =
| Ok of 'a
| Error of 'b
You can treat values of type result as data of that data-type. In
this case, you construct and match the values, you pass them around,
etc.
Note however that, in Mavkit, we also use the result type as a
control-flow mechanism. Specifically, in conjunction with the let*
binding operator, the result type has a continue/abort meaning.
Within your code, you can go from one use to the other. E.g.,
let xs =
List.rev_map
(fun x ->
(* [result] as control-flow *)
let open Result_syntax in
let* .. = .. in
let* .. = .. in
return ..)
ys
in
let successes xs =
(* [result] as data *)
List.length (List.rev_filter_ok xs)
in
..
Using result as sometimes data and sometimes control-flow is the
main reason to bend the guidelines about which syntax module to
open. E.g., if your function returns (_, _) result Lwt.t but the
result is data returned by the function rather than control-flow
used within the function, then you should open Lwt_syntax (rather
then Lwt_result_syntax).
As a significant aside, note that in OCaml you can also use exceptions
for control-flow (with raise and try-with and
match-with-exception) and as data (the type exn is an
extensible variant data-type).
(** [iter_no_raise f xs] applies [f] to all the elements of [xs]. If [f] raises
an exception, the iteration continues and [f] is still applies to other
elements. The function returns pairs of the exceptions raised by [f] along
the elements of [xs] that triggered these exceptions. *)
let iter_no_raise f xs =
List.fold_left
(fun excs x ->
match f x with
| exception exc -> exc :: excs
| () -> excs)
[]
xs
You can find uses of exception as data within the error monad itself.
First, the generic failure functions (error_with,
error_with_exn, failwith, and fail_with_exn) are just
wrapper around an error which carries an exception (as data).
Second, Lwtreslib provides helpers to catch exceptions. E.g.,
Result.catch : (unit -> 'a) -> ('a, exn) result calls a function and
wraps any raised exception inside an Error constructor.
In depth discussion: pros and cons of result compared to other error management techniques¶
In Mavkit, we use result and the specialised tzresult. For this
reason, this tutorial is focused on result/tzresult. However,
there are other techniques for handling errors. This section compares
them briefly.
In general you should use result and tzresult but in some
specific cases you can deviate from that. The comparisons below may help
you decide.
Exceptions
In exception-based error handling, you raise an exception (via
raise) when an error occurs and you catch it (via try-with)
to recover. Exceptions are fast because the OCaml compiler and runtime
provide the necessary mechanisms directly.
Whether a function can raise an exception or not cannot be determined by its type. This means that it is easy to forget to recover from an exception. An external library may change the set of exceptions that a function raises and you need to update calls to this function, but the type-checker cannot warn you about it. This places a heavy burden on the developer who is responsible for checking the documentation of all the functions they call.
Exception-raising functions should be documented as such using the
@raise documentation keyword.
Note that within the protocol, you should not use exceptions at all.
tzresult
With tzresult, errors are carried by the Error constructor of a
result. In this way an 'a tzresult represents the result of a
computation that normally returns an 'a but may fail.
Because the type of errors is an abstract wrapper (trace) around an
extensible variant (error), you can only recover from these errors
in a generic way.
result
With result, errors are carried by the Error constructor. Each
function defines its own type of errors.
and* is unusable.option
With option, errors are represented by the None constructor.
Errors are completely void of payload.
Because there are no payloads attached to an error, you should generally treat the error directly at the call site. Otherwise you might lose track of the origin of the failure. E.g., what was not found in the following code fragment?
match
let open Option_syntax in
let* z = find "zero" in
let* o = find "one" in
Some (z, o)
with
| None -> ..
| Some (z, o) -> ..
Option is a common enough strategy that the Option_syntax and
Lwt_option_syntax modules are available in the Mavkit source.
fallback
Another approach to errors is to have a default or fallback value. In that case, the function returns a default sensible value when it would raise and exception or return an error. Alternatively, it can take this fallback value as parameter.
(** @raise Not_found if argument is [None] *)
val get : 'a option -> 'a
(** returns [default] if argument is [None] *)
val value : default:'a -> 'a option -> 'a
Legacy code¶
The codebase contains only let-style binding operators. However,
you might encounter infix bindings in older protocols. If you do and
you are unsure about what the many infix operators do, read on.
The legacy code is written with infix bindings instead of let-style
binding operators. The binding >>? for result and tzresult,
>>= for Lwt, and >>=? for Lwt-result and Lwt-tzresult. A
full equivalence table follows.
Modern |
Legacy |
let open Result_syntax in
let* x = e in
e'
|
e >>? fun x ->
e'
|
let open Result_syntax in
let+ x = e in
e'
|
e >|? fun x ->
e'
|
let open Lwt_syntax in
let* x = e in
e'
|
e >>= fun x ->
e'
|
let open Lwt_syntax in
let+ x = e in
e'
|
e >|= fun x ->
e'
|
let open Lwt_result_syntax in
let* x = e in
e'
|
e >>=? fun x ->
e'
|
let open Lwt_result_syntax in
let+ x = e in
e'
|
e >|=? fun x ->
e'
|
|
No equivalent, uses
|
let open Lwt_result_syntax in
let*! x = e in
e'
|
(e >>= ok) >>=? fun x ->
e'
|
let open Lwt_result_syntax in
let*? x = e in
e'
|
e >>?= fun x ->
e'
|
In addition, instead of dedicated return and fail functions from
a given syntax module, the legacy code relied on global values.
Modern |
Legacy |
let open Result_syntax in
return x
|
ok x
|
let open Result_syntax in
fail e
|
No equivalent, uses
|
let open Lwt_syntax in
return x
|
No equivalent, uses
|
let open Lwt_result_syntax in
return x
|
return x
|
let open Lwt_result_syntax in
fail e
|
No equivalent, uses
|
let open Result_syntax in
return x
|
ok x
|
let open Result_syntax in
tzfail e
|
error e
|
let open Lwt_result_syntax in
return x
|
return x
|
let open Lwt_result_syntax in
tzfail e
|
fail e
|
In addition to these syntactic differences, there are also usage differences. You might encounter the following patterns which you should not repeat:
Matching against a trace:
match f () with | Ok .. -> .. | Error (Timeout :: _) -> .. | Error trace -> ..
This is discouraged because the compiler is unable to warn you if the matching is affected by a change in the code. E.g., if you add context to an error in one place in the code, you may change the result of the matching somewhere else in the code.