Profiling the Mavkit node
Memory profiling the OCaml heap
Install an OCaml switch with the statmemprof patch: 4.04.2+statistical-memprof or 4.06.0+statistical-memprof
Install statmemprof-emacs.
Enable loading statmemprof into the node.
Add the statmemprof-emacs package as a dependency to the main package, and add
let () = Statmemprof_emacs.start 1E-4 30 5
to the node_main.ml file.
Arguments:
sampling_rate is the sampling rate of the profiler. Good value: 1e-4.
callstack_size is the size of the fragment of the call stack which is captured for each sampled allocation.
min_sample_print is the minimum number of samples under which the location of an allocation is not displayed.
Load sturgeon into emacs, by adding this to your .emacs:
(let ((opam-share (ignore-errors (car (process-lines "opam" "config" "var" "share")))))
  (when (and opam-share (file-directory-p opam-share))
    (add-to-list 'load-path (expand-file-name "emacs/site-lisp" opam-share))))
(require 'sturgeon)
Launch the node then connect to it with sturgeon.
If the process is launched with pid 1234, then run:
M-x sturgeon-connect
mavkit-nodememprof.1234.sturgeon
(tab-completion works for finding the socket name)
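The socket name in the example above follows the pattern mavkit-nodememprof.<pid>.sturgeon. As a minimal sketch, assuming that pattern holds for your build, a small shell helper can print the name to enter at the sturgeon-connect prompt. The name sturgeon_socket_name is hypothetical and is not part of Mavkit or sturgeon.

```shell
# Hypothetical helper, not part of Mavkit or sturgeon.
# Prints the socket name for a given node pid, following the
# "mavkit-nodememprof.<pid>.sturgeon" pattern from the example above.
sturgeon_socket_name() {
    printf 'mavkit-nodememprof.%s.sturgeon\n' "$1"
}

# Example: print the socket name for pid 1234.
sturgeon_socket_name 1234
```

With a single node running, sturgeon_socket_name "$(pgrep -x mavkit-node)" would fill in the pid, assuming the process is named mavkit-node.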
Memory profiling the C heap
Install valgrind and massif-visualizer
valgrind --tool=massif mavkit-node run ...
Stop with Ctrl-C, then display the result, replacing pid with the process ID of the profiled run:
massif-visualizer massif.out.pid
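Massif names its output file massif.out.<pid> by default, so repeated runs leave several files behind. As a minimal sketch, assuming all of them sit in one directory, the hypothetical helper latest_massif_out below picks the most recently modified one.

```shell
# Hypothetical helper: print the most recently modified massif.out.* file
# in the given directory (default: current directory).
# Prints nothing if no such file exists.
latest_massif_out() {
    ls -t "${1:-.}"/massif.out.* 2>/dev/null | head -n 1
}
```

The result can be passed to the visualizer, for example massif-visualizer "$(latest_massif_out)", or to ms_print, the text viewer shipped with valgrind.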
Performance profiling
Install perf (the linux-perf package for Debian).
If the package does not exist for your current kernel, a previous version can be used (substitute perf_4.9 for the perf command if your kernel is 4.9).
Either:
Run the node, find the pid, and attach perf with
perf record -p pid -F 99 --call-graph dwarf
Then stop capturing with Ctrl-C. This can represent a lot of data. Don’t do that for too long. If this is too much you can remove the --call-graph dwarf option to get something more manageable, but interpreting the information can be harder.
Or let perf run mavkit-node:
perf record -g -F 99 --call-graph=dwarf -- ./mavkit-node run ...
This will write perf.data after having stopped the node with Ctrl-C.
In both cases, the -F argument specifies the frequency of sampling of data (in hertz). If too much data is generated, use a smaller value. If data is not precise enough, try using a higher value.
Display the result with perf report, or use a more advanced visualizer (recommended). Such visualizers include:
flamegraph: command-line tool for generating flamegraphs (example for mavkit-node)
gprof2dot: command-line tool for generating callgraphs (example for mavkit-node)
hotspot: a GUI for the perf tool
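As a sketch of the flamegraph route, assuming the stackcollapse-perf.pl and flamegraph.pl scripts from the FlameGraph repository are on your PATH, the recorded data can be turned into an SVG as follows. The function name perf_to_flamegraph and the default output name flame.svg are illustrative only.

```shell
# Hypothetical wrapper: turn a perf recording into a flamegraph SVG.
# Assumes perf, stackcollapse-perf.pl and flamegraph.pl are on PATH.
# $1: perf data file (default: perf.data)
# $2: output SVG file (default: flame.svg)
perf_to_flamegraph() (
    data="${1:-perf.data}"
    out="${2:-flame.svg}"
    # perf script dumps the recorded samples as text,
    # stackcollapse-perf.pl folds each call stack into a single line,
    # and flamegraph.pl renders the folded stacks as an SVG.
    perf script -i "$data" | stackcollapse-perf.pl | flamegraph.pl > "$out"
)
```

Running perf_to_flamegraph with no arguments in the directory that contains perf.data would write flame.svg next to it; the SVG can be opened in a web browser.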