A cached-size, single-pass encoder for [WalEntry].
§Why
prost does not cache message sizes the way protobuf-C++ does. When
WalEntry::encode_to_vec runs, the encode pass recomputes the
encoded_len of every nested message each time it needs a length
delimiter. For the deep WAL tree
(WalEntry → Mutation → Rows → Row → Value) the length of a leaf Value
ends up being recomputed roughly once per ancestor level — i.e. ~5 times.
Microbenchmarks show ~87% of encode_to_vec is spent in these repeated
length walks, not in writing bytes.
§How
This encoder walks the tree once to compute and cache the body length of
every length-delimited message node (in pre-order into a flat sizes
vector), then walks it a second time to write bytes, reading each cached
length back via a cursor. Every node’s length is computed exactly once.
Leaf messages whose sizing is not recursively redundant (ColumnSchema,
Value, WriteHint, BulkWalEntry) are delegated to prost’s own
encoded_len/encode_raw, but their (single) computed length is still
cached so the encode pass never recomputes it.
The output is byte-for-byte identical to WalEntry::encode_to_vec; this is
asserted in tests and must hold to preserve WAL replay compatibility.
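The two passes described in this section can be sketched on a toy tree. This is only an illustration under stated assumptions, not the actual encoder: the Node type, its two-field wire layout (field 1 = payload bytes, field 2 = nested child message) and the function names size_pass, encode_body and encode are all hypothetical, and no prost types are involved.

```rust
/// Hypothetical toy message: raw payload bytes plus nested child messages.
/// It stands in for the WalEntry -> Mutation -> Rows -> Row -> Value tree.
pub struct Node {
    pub payload: Vec<u8>,
    pub children: Vec<Node>,
}

/// Number of bytes a protobuf varint needs to represent `v`.
fn varint_len(mut v: u64) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

/// Appends `v` to `out` as a protobuf varint.
fn put_varint(mut v: u64, out: &mut Vec<u8>) {
    while v >= 0x80 {
        out.push((v as u8 & 0x7f) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

/// Size pass: computes the body length of `node` and of every descendant,
/// storing each one exactly once in `sizes` in pre-order.
fn size_pass(node: &Node, sizes: &mut Vec<usize>) -> usize {
    let slot = sizes.len();
    sizes.push(0); // reserve this node's pre-order slot
    // Field 1 (assumed layout): key byte + length varint + payload bytes.
    let mut body = 1 + varint_len(node.payload.len() as u64) + node.payload.len();
    for child in &node.children {
        let child_body = size_pass(child, sizes);
        // Field 2 (assumed layout): key byte + length varint + child body.
        body += 1 + varint_len(child_body as u64) + child_body;
    }
    sizes[slot] = body;
    body
}

/// Encode pass: writes the body of `node`, reading each child's cached
/// length through `cursor` instead of recomputing it.
fn encode_body(node: &Node, sizes: &[usize], cursor: &mut usize, out: &mut Vec<u8>) {
    out.push(0x0A); // field 1, wire type 2
    put_varint(node.payload.len() as u64, out);
    out.extend_from_slice(&node.payload);
    for child in &node.children {
        let child_len = sizes[*cursor]; // next pre-order slot is this child
        *cursor += 1;
        out.push(0x12); // field 2, wire type 2
        put_varint(child_len as u64, out);
        encode_body(child, sizes, cursor, out);
    }
}

/// Runs both passes and returns the encoded body of `root`.
pub fn encode(root: &Node) -> Vec<u8> {
    let mut sizes = Vec::new();
    let total = size_pass(root, &mut sizes);
    let mut out = Vec::with_capacity(total);
    let mut cursor = 1; // slot 0 holds the root's own body length
    encode_body(root, &sizes, &mut cursor, &mut out);
    debug_assert_eq!(out.len(), total);
    out
}
```

Because the size pass and the encode pass visit nodes in the same pre-order, a single incrementing cursor is enough to pair each node with its cached length; no map keyed by node identity is needed.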
§Maintenance
This encoder hard-codes the wire layout (field tags and field order) of
WalEntry, Mutation, Rows and Row. If any of these messages change in
greptime-proto, this file MUST be updated to match:
- Adding or removing a field is caught at compile time: every one of these messages is destructured exhaustively (no ..), so a changed field set fails to compile here.
- Changing a field’s tag number or type is caught by the byte-for-byte equality tests against prost (which populate all fields). Keep those tests exhaustive when adding fields.
Leaf messages are delegated to prost, so changes to them need no update here.
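The compile-time guard relies on a plain Rust property: a struct pattern without .. must name every field. A minimal sketch, using a hypothetical DemoRow struct rather than the generated greptime-proto types:

```rust
/// Hypothetical stand-in for a generated message struct.
pub struct DemoRow {
    pub key: Vec<u8>,
    pub values: Vec<Vec<u8>>,
}

/// Sums the raw payload bytes of a row. This is not a wire-size
/// computation; it only demonstrates the destructuring pattern.
pub fn payload_bytes(row: &DemoRow) -> usize {
    // Exhaustive pattern: there is no `..` here. If a field is added to or
    // removed from `DemoRow`, this line stops compiling, which forces
    // whoever changed the struct to revisit this function.
    let DemoRow { key, values } = row;
    key.len() + values.iter().map(|v| v.len()).sum::<usize>()
}
```

Writing row.key and row.values directly would compile unchanged after a new field is added, so the new field could silently be left out of the encoding.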
Structs§
- WalEntryEncoder - A reusable encoder that caches message body sizes between its size pass and its encode pass.
Constants§
- BULK_ENTRY_TAG 🔒
- MUTATION_TAG 🔒
- OP_TYPE_TAG 🔒
- ROWS_TAG 🔒
- ROW_TAG 🔒
- SCHEMA_TAG 🔒
- SEQUENCE_TAG 🔒
- VALUE_TAG 🔒
- WRITE_HINT_TAG 🔒
Functions§
- msg_field_len 🔒 - Length contribution of a length-delimited message field: key + length varint + body.
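The key + length varint + body sum follows the protobuf wire format for length-delimited fields. The real function is private and its signature is not shown on this page, so the sketch below assumes one: msg_field_len_sketch is a hypothetical name, and it takes a field number and a precomputed body length.

```rust
/// Number of bytes a protobuf varint needs to represent `v`.
fn varint_len(mut v: u64) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

/// Total bytes a length-delimited field contributes to its parent:
/// the key varint, the length varint, and the body itself.
/// (Assumed signature; the real `msg_field_len` may differ.)
pub fn msg_field_len_sketch(field_number: u32, body_len: usize) -> usize {
    // Key = (field_number << 3) | wire_type, with wire type 2 for
    // length-delimited fields.
    let key = ((field_number as u64) << 3) | 2;
    varint_len(key) + varint_len(body_len as u64) + body_len
}
```

For example, a 300-byte body in field 2 costs one key byte, two length bytes (300 does not fit in a single 7-bit varint group) and the 300 body bytes.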