Skip to content

Performance Tips

Batched Reads and Writes

Generated protocol reader and writer classes have read and write methods for each step. When a step is a stream, there will also be overloads that read or write a batch of values as an std::vector in one go. For small data types, using batched reads and writes can make a dramatic difference in throughput, especially for HDF5 files.

Use Fixed Data Types When Possible

For HDF5, using variable-length collections (like !vector without a length or !array without fixed dimension sizes) has lower throughput than their fixed-sized counterparts.

Avoid NDJSON Encoding

The NDJSON serialization format is great for debugging and interoperability with other tools (like jq) but is it orders of magnitude less efficient than the binary or HDF5 formats.