Rewriting Micheline with mavryk-micheline-rewriting

mavryk-micheline-rewriting allows to construct rewriting rules on Micheline, enumerate matches and perform rewritings. It is fairly generic and deserves a documentation of its own. This library manipulates the following basic elements:

  • Micheline terms

  • Paths

  • Patterns

Informally, the library allows to find patterns in terms: the result is a list of paths, corresponding to the location of these patterns. Rewriting is then performed by substituting a subterm at a given path by another one. The library also provides facilities for manipulating hash-consed terms and patterns, which is crucial in all non-trivial applications.

Constructing Micheline terms

The type of Micheline terms has two parameters: the type of locations and the type of primitives. We abstract away from the underlying representation of terms via the Micheline_sig.S module type:

module type S = sig
  type label
  type head
  type node = (label, head) Micheline.node
  val default_label : label
  val int : Z.t -> node
  val string : string -> node
  val bytes : Bytes.t -> node
  val prim : head -> node list -> string list -> node
  val seq : node list -> node
  val label : node -> label
end

We provide two functors for manipulating Micheline terms:

  • Micheline_without_hash_consing.Make

  • Micheline_with_hash_consing.Make

Both functors provide implementations of the Micheline_sig.S signature (with distinct type constraints however). They both take as input some additional structure on the type of primitives, as specified in the Signature.S module type:

module type S = sig
  type t
  val compare : t -> t -> int
  val hash : t -> int
  val pp : Format.formatter -> t -> unit
end

ie primitives must be totally ordered, hashable and pretty-printable.

Micheline_without_hash_consing.Make has type

functor (X : Signature.S) (Label : sig type t val default : t end) ->  Micheline_sig.S with type label = Label.t and type head = X.t

and provides a simple overlay over the Micheline constructors, setting the label to its default value.

Micheline_with_hash_consing.Make has type

functor (X : Signature.S) (P : sig val initial_size : int option end) -> Micheline_sig.S with type label = hcons_info and type head = X.t

We use the label slot to store hash-consing information on terms: the type hcons_info contains a unique id for the term and its hash. This implementation of Micheline_sig.S performs maximal sharing: it is important that terms are only constructed through the interface provided by the module for the required invariants to hold true. A weak hash table is used to store terms, its initial size is fixed by P.initial_size. One can use the unique id contained in hcons_info to construct a separate map from the term to some other annotation.

Manipulating paths in Micheline terms

A Micheline term is either a primitive application, a sequence, or one of three atoms (int, string or byte). The primitive application and sequence constructors have variable arity (the take lists of subterms as arguments). It follows that one can designate a subterm of a given term as a list of integers, each integer denoting an index in the list of subterms of either a primitive or sequence application.

We provide two implementations of path-manipulating modules: one without hash-consing and one with. Both implement the following signature:

module type S = sig
  type desc = private Root | At_index of int * t
  and t = private {tag : int; hash : int; rev_path_desc : desc}
  val compare : t -> t -> int
  val root : t
  val at_index : int -> t -> t
  val concat : above:t -> under:t -> t
  val to_string : t -> string
end

We observe that paths, contrary to our intuition, are not directed from the root to the subterm but rather from the subterm to the root. This allows easy hash-consing and follows the usual way paths are constructed during the pattern matching process. The two fundamental operations are root, corresponding to an empty path, and at_index i p, corresponding to the ith subterm of the term at path p.

Pattern-matching Micheline

The Pattern module provides two functors implementing a small pattern description language, as well as functions for enumerating matches of a pattern in a given term. The signature is the following:

module type S = sig
  type head
  type path
  type t
  type plist
  type node
  val pattern_matches : t -> node -> bool
  val all_matches : t -> node -> path list
  val focus_matches : t -> path list -> path list
  val int : (Z.t -> bool) option -> t
  val string : (string -> bool) option -> t
  val bytes : (Bytes.t -> bool) option -> t
  val prim : head -> plist -> t
  val prim_pred : (head -> bool) -> plist -> t
  val seq : plist -> t
  val any : t
  val focus : t -> t
  val list_any : plist
  val list_empty : plist
  val list_cons : t -> plist -> plist
  val ( @. ) : t -> plist -> plist
  val pp : Format.formatter -> t -> unit
  val uid : t -> int
end

The comments describing all these constructs can be found in src/lib_benchmark/lib_micheline_rewriting/pattern.mli. It is worth describing a subset of these functions here:

  • pattern_matches patt node returns true if and only if patt matches node.

  • all_matches patt node returns the list of all paths in of subterms of node matching patt.

  • focus patt constructs a focused subpattern. There can be several focused subpatterns but the cannot be nested.

  • focus_matches patt matches converts a list of matches for patt into a list of matches for the focused subpatterns of patt.

The focusing mechanism allows patterns to have a contextual part, corresponding to the subterm matched by the whole pattern, and a “point of interest” in the context, corresponding to a subterm of the subterm matched by the whole pattern. For instance, we can match on integers that are directly under a particular primitive, etc.

The signatures of the non-hash-consing functor is as follows:

Make : functor (X : Signature.S) (Micheline : Micheline_sig.S with type head = X.t) (Path : Path.S) -> S with type head = X.t and type path = Path.t and type node = Micheline.node

While the hash-consing implementation has the following slightly more complicated type:

module Make_with_hash_consing : functor
    (X : Signature.S)
    (Micheline : Micheline_sig.S
                   with type head = X.t
                    and type label = Micheline_with_hash_consing.hcons_info)
    (Path : Path.S) -> sig
  include
    S
      with type head = X.t
       and type path = Path.t
       and type node = Micheline.node

  val all_matches_with_hash_consing : t -> node -> path list
end

I.e the default implementation of match enumeration does not use hash-consing; one has to use all_matches_with_hash_consing to do so.

Performing substitutions

The Rewrite module provides facilities for performing substitutions. There is only one implementation here (as all hash-consing is taken care of in previously described modules). The module provides a functor taking implementations for terms, paths and patterns and provides the following functions:

module type S = sig
  type label
  type head
  type path
  type patt
  type node = (label, head) Micheline.node
  exception Rewrite_error of string * node option
  val get_subterm : term:node -> path:path -> node
  val subst : term:node -> path:path -> replacement:node -> node
  val pattern_matches : patt -> node -> bool
  val all_matches : patt -> node -> path list
end

The key function here is subst which performs the substitution. The implementation proceeds as one might expect, by recursive descent on the term together with the specified path.

An example?

An example can be found in the test subdirectory. It consists in a reimplementation of the migration of addresses towards pairs of addresses and chain ids in multisignature contracts.