Module Mavryk_store_unix.Block_store

Persistent and cached generic block store

The store instantiates a cemented block store and multiple floating block stores. The floating stores serve as buffers until enough blocks have arrived, at which point the store performs a "cementing" (also called a "merge"). Under normal circumstances, there are two kinds (Floating_block_store.floating_kind) of floating store instances: an RO (read-only) and an RW (read-write). Newly arrived blocks are always pushed to the RW instance. A block lookup is first tried in RW, then in RO, and finally in the cemented blocks.
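The lookup order described above can be pictured with a small, self-contained model. This is only a sketch: the `store` type, the string hashes and the `lookup` function are hypothetical stand-ins, not part of this module's API.

```ocaml
(* Hypothetical model: each store is an association list from hash to block. *)
type store = (string * string) list

(* Try each store in order and return the first hit, if any. *)
let lookup (stores : store list) (hash : string) : string option =
  List.find_map (fun store -> List.assoc_opt hash store) stores

let rw : store = [ ("h3", "block 3") ]
let ro : store = [ ("h2", "block 2") ]
let cemented : store = [ ("h1", "block 1") ]

(* The order of the list encodes the lookup order: RW, then RO, then cemented. *)
let find = lookup [ rw; ro; cemented ]
```

Here `find "h3"` is answered by the RW model without consulting the other two, while `find "h1"` falls through to the cemented model.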

This store also instantiates an LRU block cache to reduce the number of I/O operations. This cache is updated whenever a block is read or stored.
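As an illustration of the caching policy, the following is a minimal list-based LRU sketch in which both reads and writes refresh an entry. The `lru` type and its functions are assumptions made for this example; the actual cache implementation used by the store is not described here.

```ocaml
(* Hypothetical list-based LRU cache: most recently used entry first. *)
type ('k, 'v) lru = { capacity : int; entries : ('k * 'v) list }

let empty capacity = { capacity; entries = [] }

(* Keep at most [n] leading elements of a list. *)
let rec take n = function
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

(* Insert or refresh a binding, evicting the least recently used one if full. *)
let add key value cache =
  let others = List.remove_assoc key cache.entries in
  { cache with entries = take cache.capacity ((key, value) :: others) }

(* Look a key up; a hit moves the binding to the front. *)
let find key cache =
  match List.assoc_opt key cache.entries with
  | None -> (None, cache)
  | Some value -> (Some value, add key value cache)
```

A real cache would use a structure with constant-time updates; the list keeps the sketch short.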

When a merge occurs, the RW instance is promoted to a second read-only instance RO', and a new RW' instance replaces it. This allows the new cycle to be cemented to be retrieved asynchronously from RO and RO' (the former RW), so that new blocks can be stored in the newly instantiated RW' store without pausing. While retrieving the cycle to cement, the asynchronous merging thread also combines RO and RO' into a new RO'' that excludes the cemented cycle. When the merging thread is done, the former RO and RO' instances are deleted from the disk and the new RO'' replaces them. A merging thread has to wait for the previous one to finish.
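The succession of floating instances during a merge can be sketched as a tiny state model. The `role` and `instance` types and the two transition functions below are hypothetical and only mirror the description above; they are not the module's internal representation.

```ocaml
type role = Read_only | Read_write
type instance = { name : string; role : role }

(* Ordered floating instances: read-only ones first, the writable one last. *)
type floating = instance list

let initial : floating =
  [ { name = "RO"; role = Read_only }; { name = "RW"; role = Read_write } ]

(* Starting a merge demotes the current RW to read-only (it plays the role
   of RO') and appends a fresh writable instance (RW'). *)
let start_merge ~fresh (stores : floating) : floating =
  List.map (fun i -> { i with role = Read_only }) stores
  @ [ { name = fresh; role = Read_write } ]

(* Finishing a merge replaces every read-only instance with the combined
   one (RO''). *)
let finish_merge ~merged (stores : floating) : floating =
  { name = merged; role = Read_only }
  :: List.filter (fun i -> i.role = Read_write) stores

let names = List.map (fun i -> i.name)
```

In this model, three instances exist only between `start_merge` and `finish_merge`; two exist otherwise.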

Retrieving the new cycle from RO and RO' between blocks B_start and B_end means that we must retrieve the set of blocks between them but also trim potential branches that have roots in this set. To achieve that, we iterate over RO and RO' linearly. This means that every block's predecessor in the floating stores must be previously known: either we previously encountered it in the same floating store file, or it is in RO if the block is in RO', or it is in the cemented store (see invariants below). This invariant is required to ensure minimal memory usage. The iterations done to retrieve the cycle and merge the floating stores work similarly to a stop-and-copy GC algorithm. It works as follows:

The result is a correct, in order and trimmed new RO floating store. A visual example of merging is given below.

The merging thread will also trigger a garbage-collection of the cemented block store w.r.t. the given history mode.

Invariants

This store is expected to respect the following invariants:

Merging example

           RO          RW

                 | C' - D' - E'   G'
              /  |               /
         A - B - | C - D - E - F - G
                 |  \
                 |    D'' - E''
                 |     \
                 |       E'''

For instance, a merge from A to C will first retrieve blocks C, B and A. It then iterates over all the blocks in both the RO and RW files, reading first in RO, then in RW. By construction, blocks are stored after their predecessors. For example, [ A ; B ; C ; C' ; D'' ; D' ; D ; E ; E'' ; F ; E''' ; G' ; G ] is a valid storing sequence.

The algorithm starts iterating over this sequence and will only copy blocks whose predecessor is present in the set S of hashes (initially S = { hash(C) }). Thus, for the given sequence, D'' will be considered first, S will be updated to { hash(C), hash(D'') } and so on, until RO and RW are fully read.

The new RO will then be:

                G'
               /
    - D - E - F - G

    - D'' - E''
       \
        E'''

where its storing order will be correct with regard to the invariant.
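The filtering pass of this example can be reproduced with a short, self-contained sketch. The `block` record, the string hashes, the `trim` function and the `"genesis"` placeholder used as the predecessor of A are all assumptions made for illustration; the predecessor links follow the diagram above.

```ocaml
module S = Set.Make (String)

(* Hypothetical block model: a hash and the hash of its predecessor. *)
type block = { hash : string; pred : string }

(* One linear pass: a block is copied only if its predecessor is already in
   the set of kept hashes; its own hash is then added to that set. *)
let trim ~(root : string) (blocks : block list) : block list =
  let _, kept =
    List.fold_left
      (fun (seen, acc) blk ->
        if S.mem blk.pred seen then (S.add blk.hash seen, blk :: acc)
        else (seen, acc))
      (S.singleton root, [])
      blocks
  in
  List.rev kept

let b hash pred = { hash; pred }

(* The storing sequence from the example; "genesis" is a placeholder. *)
let sequence =
  [ b "A" "genesis"; b "B" "A"; b "C" "B"; b "C'" "B"; b "D''" "C";
    b "D'" "C'"; b "D" "C"; b "E" "D"; b "E''" "D''"; b "F" "E";
    b "E'''" "D''"; b "G'" "F"; b "G" "F" ]

(* Hashes of the blocks kept in the new RO, in storing order. *)
let new_ro = List.map (fun blk -> blk.hash) (trim ~root:"C" sequence)
```

With S initially containing only the hash of C, `new_ro` is `["D''"; "D"; "E"; "E''"; "F"; "E'''"; "G'"; "G"]`: the C' branch is dropped and every kept block still appears after its predecessor.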

type block_store

The type of the block store

type t = block_store
type key =
  | Block of Mavryk_base.TzPervasives.Block_hash.t * int

The type of the block's key to be accessed: a hash and an offset.

  • Block (h, 0) represents the block h itself;
  • Block (h, n) represents the nth predecessor of block h.

A block key may represent an invalid block (wrong hash and/or offset), as its validity is not ensured by construction.
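To make the offset semantics concrete, here is a minimal sketch that resolves a key over a toy chain. The string hashes, the `known` and `pred` tables and the `resolve` function are hypothetical; the real key carries a Block_hash.t and is resolved by the store.

```ocaml
(* Hypothetical model: hashes are strings. *)
type key = Block of string * int

(* Toy chain A <- B <- C: known hashes and predecessor links. *)
let known = [ "A"; "B"; "C" ]
let pred = [ ("C", "B"); ("B", "A") ]

(* Walk [n] predecessor links from [h]. The result is None when the hash is
   unknown, the offset is negative, or the chain is shorter than [n]. *)
let rec resolve (Block (h, n)) =
  if n < 0 then None
  else if n = 0 then if List.mem h known then Some h else None
  else
    match List.assoc_opt h pred with
    | None -> None
    | Some p -> resolve (Block (p, n - 1))
```

In this model, `Block ("C", 2)` resolves to A, whereas `Block ("C", 3)` and `Block ("Z", 0)` are invalid keys and resolve to `None`.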

type merge_status =
  | Not_running
  | Running
  | Merge_failed of Mavryk_base.TzPervasives.tztrace

The status of the merging thread

type status =
  | Idle
  | Merging

The status of the store

val status_encoding : status Mavryk_base.TzPervasives.Data_encoding.t
val cemented_block_store : block_store -> Cemented_block_store.t

cemented_block_store block_store returns the instance of the cemented block store for block_store.

val floating_block_stores : block_store -> Floating_block_store.t list

floating_block_stores block_store returns all running floating block store instances for block_store. It will always return two or three ordered floating stores:

  • [RO] ; [RW] if a merge is not occurring;
  • [RO] ; [RO'] ; [RW] if a merge is occurring.

Warning These stores should only be accessed when the store is not active.

val status : block_store -> status Lwt.t
val write_status : block_store -> status -> unit Mavryk_base.TzPervasives.tzresult Lwt.t

mem block_store key tests the existence of the block key in block_store.

get_hash block_store key retrieves the hash corresponding to the given key in block_store. Return None if the block is unknown.

val resulting_context_hash : block_store -> fetch_expect_predecessor_context: (unit -> bool Mavryk_base.TzPervasives.tzresult Lwt.t) -> key -> Mavryk_base.TzPervasives.Context_hash.t option Mavryk_base.TzPervasives.tzresult Lwt.t

resulting_context_hash block_store ~fetch_expect_predecessor_context key retrieves the resulting context hash from the block application, corresponding to the given key.

val read_block : block_store -> read_metadata:bool -> key -> Mavryk_store_shared.Block_repr.t option Mavryk_base.TzPervasives.tzresult Lwt.t

read_block block_store ~read_metadata key reads the block key in block_store if present. Returns None if the block is unknown. If read_metadata is set to true, it tries to retrieve the metadata but does not fail if it is not available.

read_block_metadata block_store key reads the metadata for the block key in block_store if present. Return None if the block is unknown or if the metadata are not present.

store_block block_store block resulting_context_hash stores the block in the current RW floating store with its associated resulting_context_hash.

val cement_blocks : ?check_consistency:bool -> write_metadata:bool -> block_store -> Cemented_block_store.chunk_iterator -> cycle_range:(int32 * int32) -> unit Mavryk_base.TzPervasives.tzresult Lwt.t

cement_blocks ?check_consistency ~write_metadata block_store chunk_iterator ~cycle_range

Wrapper of Cemented_block_store.cement_blocks.

move_floating_store block_store ~src ~dst_kind closes the floating store src, checks whether a store of kind dst_kind is open in block_store and, if so, tries to close it. It then moves the files from src to dst.

This function is unsafe and should only be called in very specific cases.

Warning block_store remains unchanged, meaning that a floating store that may have been deleted is still referenced in the structure.

Fails if both src and dst (if it exists) have the same Floating_block_store.floating_kind.

val await_merging : block_store -> unit Lwt.t

await_merging block_store waits for the current merging thread in block_store to finish if any.

val default_cycle_size_limit : int32

Default maximum size of a cemented cycle, i.e. 2^16 - 1.
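As a quick check of the arithmetic, 2^16 - 1 is 65535 when expressed as an int32. The sketch below uses a local stand-in named `limit` rather than the library value, and the `range_size` and `fits` helpers are hypothetical; in particular, treating a range of levels as inclusive on both ends is an assumption of this example.

```ocaml
(* Local stand-in for the documented default: 2^16 - 1. *)
let limit : int32 = Int32.(sub (shift_left 1l 16) 1l)

(* Number of blocks in a range of levels, assumed inclusive on both ends. *)
let range_size (first, last) = Int32.(succ (sub last first))

(* Whether such a range fits within the limit. *)
let fits (first, last) = Int32.compare (range_size (first, last)) limit <= 0
```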

val merge_stores : ?cycle_size_limit:int32 -> block_store -> on_error: (Mavryk_base.TzPervasives.tztrace -> unit Mavryk_base.TzPervasives.tzresult Lwt.t) -> finalizer:(int32 -> unit Mavryk_base.TzPervasives.tzresult Lwt.t) -> history_mode:Mavryk_shell_services.History_mode.t -> new_head:Mavryk_store_shared.Block_repr.t -> new_head_metadata:Mavryk_store_shared.Block_repr.metadata -> cementing_highwatermark:int32 -> unit Mavryk_base.TzPervasives.tzresult Lwt.t

(* TODO UPDATE MERGE DOC *) merge_stores block_store ?finalizer ~nb_blocks_to_preserve ~history_mode ~from_block ~to_block triggers a merge as described in the above description. This will result, asynchronously, in:

  • the cementing (if need be) of a cycle from from_block to to_block (included);
  • the trimming of the floating stores, which preserves to_block - nb_blocks_to_preserve blocks (iff these blocks are present, or the longest suffix otherwise) along with their metadata in the floating store. These blocks may be duplicated in the cemented block store.

After the cementing, Cemented_block_store.trigger_gc will be called with the given history_mode. When the merging thread succeeds, the callback finalizer will be called.

If a merge thread is already running, this function will first wait for the previous merge to be done.

The cemented cycles will have a maximum size of cycle_size_limit blocks, which defaults to default_cycle_size_limit.

Warning For a given block_store, the caller must wait for this function to terminate before calling it again; otherwise, concurrent calls may intertwine and cause the cementing to be out of order.

val get_merge_status : t -> merge_status
val pp_merge_status : Stdlib.Format.formatter -> merge_status -> unit
val switch_history_mode : block_store -> current_head:Mavryk_store_shared.Block_repr.t -> previous_history_mode:Mavryk_shell_services.History_mode.t -> new_history_mode:Mavryk_shell_services.History_mode.t -> unit Mavryk_base.TzPervasives.tzresult Lwt.t

switch_history_mode block_store ~current_head ~previous_history_mode ~new_history_mode switches the store from the previous_history_mode to the given new_history_mode. To do so, it infers and updates both the caboose and savepoint. If needed, a garbage collection of unnecessary cycles is performed.

val create : ?block_cache_limit:int -> [ `Chain_dir ] Mavryk_store_shared.Naming.directory -> genesis_block:Mavryk_store_shared.Block_repr.t -> block_store Mavryk_base.TzPervasives.tzresult Lwt.t

create ?block_cache_limit ~chain_dir ~genesis_block instantiates a fresh block_store in directory chain_dir and stores the genesis_block in it. It fails if the given chain_dir is already populated. Setting block_cache_limit overrides the default block cache size.

val load : ?block_cache_limit:int -> [ `Chain_dir ] Mavryk_store_shared.Naming.directory -> genesis_block:Mavryk_store_shared.Block_repr.t -> readonly:bool -> block_store Mavryk_base.TzPervasives.tzresult Lwt.t

load ?block_cache_limit chain_dir ~genesis_block ~readonly loads an existing block_store from directory chain_dir. Setting readonly will prevent new blocks from being stored. Setting block_cache_limit overrides the default block cache size.

val register_gc_callback : block_store -> (Mavryk_base.TzPervasives.Block_hash.t -> unit Mavryk_base.TzPervasives.tzresult Lwt.t) option -> unit

register_gc_callback block_store callback installs a callback that may be triggered during a block store merge in order to garbage-collect old contexts.

val register_split_callback : block_store -> (unit -> unit Mavryk_base.TzPervasives.tzresult Lwt.t) option -> unit

register_split_callback block_store callback installs a callback that may be triggered during a set_head in order to split the context into a new chunk.

val split_context : t -> Stdlib.Int32.t -> unit Mavryk_base.TzPervasives.tzresult Lwt.t

split_context block_store new_head_lpbl calls the callback registered by register_split_callback if any.

val close : block_store -> unit Lwt.t

close block_store closes the block_store and every underlying open store.

Warning If a merging thread is running, it will wait up to 5s for its termination before effectively closing the store.

val may_recover_merge : block_store -> unit Mavryk_base.TzPervasives.tzresult Lwt.t

may_recover_merge block_store recovers, if needed, from a block_store where the merge procedure was interrupted.

val v_3_0_upgrade : [ `Chain_dir ] Mavryk_store_shared.Naming.directory -> cleanups:(unit -> unit Lwt.t) list Stdlib.ref -> finalizers:(unit -> unit Lwt.t) list Stdlib.ref -> unit Mavryk_base.TzPervasives.tzresult Lwt.t

Upgrade a v_2 to v_3 block store by retrieving resulting_context_hash of all blocks present in the floating stores and updating their index.

Warning Not backward-compatible.