Function modify_batch_sparse 

pub fn modify_batch_sparse(
    batch: RecordBatch,
    table_id: u32,
    sorted_tag_columns: &[TagColumnInfo],
    extra_column_indices: &[usize],
) -> Result<RecordBatch>

Modifies a [RecordBatch] to include a sparse primary key column.

This function builds a new RecordBatch whose first column is the generated primary key (named [PRIMARY_KEY_COLUMN_NAME]), followed by the columns of the input batch selected by extra_column_indices.

The primary key uses a “sparse” encoding, which compactly represents the row’s identity by only including non-null tag values. The encoding, handled by [SparsePrimaryKeyCodec], consists of:

  1. The table_id.
  2. A tsid (Time Series ID), which is a hash of the tags that are present (non-null).
  3. The actual non-null tag values paired with their column_id.
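
The three-part layout above can be illustrated with a minimal, standard-library-only sketch. It is not the actual [SparsePrimaryKeyCodec] format: the `TagValue` type, the `toy_tsid` and `encode_sparse_key` helpers, the use of `DefaultHasher`, and the byte layout (big-endian ids, length-prefixed values) are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for one tag column's value in a single row.
struct TagValue<'a> {
    column_id: u32,
    /// `None` models a null tag.
    value: Option<&'a str>,
}

/// Hashes only the non-null tags into a stand-in "tsid".
/// The real TSID hash function is not specified here; this is an assumption.
fn toy_tsid(tags: &[TagValue<'_>]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for tag in tags {
        if let Some(v) = tag.value {
            tag.column_id.hash(&mut hasher);
            v.hash(&mut hasher);
        }
    }
    hasher.finish()
}

/// Toy layout: table_id | tsid | (column_id, value length, value bytes)*
/// Null tags contribute no bytes, which is what makes the key "sparse".
fn encode_sparse_key(table_id: u32, tags: &[TagValue<'_>]) -> Vec<u8> {
    let mut key = Vec::new();
    key.extend_from_slice(&table_id.to_be_bytes());
    key.extend_from_slice(&toy_tsid(tags).to_be_bytes());
    for tag in tags {
        if let Some(v) = tag.value {
            key.extend_from_slice(&tag.column_id.to_be_bytes());
            key.extend_from_slice(&(v.len() as u32).to_be_bytes());
            key.extend_from_slice(v.as_bytes());
        }
    }
    key
}
```

In this toy layout, a row whose tag is null encodes to the same bytes as a row that lacks the tag column entirely, so the key length grows only with the tags that are actually set.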

§Parameters

  • batch: The source [RecordBatch].
  • table_id: The ID of the table.
  • sorted_tag_columns: Metadata for tag columns, used for both TSID computation and PK encoding.
  • extra_column_indices: Indices of columns from the original batch to keep in the output (typically the timestamp and value fields).
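
The output shape described above (key column first, then the selected columns) can be sketched without Arrow. `ToyBatch` and `with_key_column` are hypothetical stand-ins, not the real [RecordBatch] API, and the key column name is passed in by the caller because the value of [PRIMARY_KEY_COLUMN_NAME] is not stated here.

```rust
/// Hypothetical stand-in for a columnar batch: named columns of equal length.
#[derive(Debug, Clone, PartialEq)]
struct ToyBatch {
    columns: Vec<(String, Vec<String>)>,
}

/// Builds a new batch whose first column holds the precomputed keys,
/// followed by the columns of `batch` at `extra_column_indices`.
/// Tag columns that are not listed are dropped from the output.
fn with_key_column(
    batch: &ToyBatch,
    key_name: &str,
    keys: Vec<String>,
    extra_column_indices: &[usize],
) -> Result<ToyBatch, String> {
    let num_rows = keys.len();
    let mut columns = Vec::with_capacity(1 + extra_column_indices.len());
    columns.push((key_name.to_string(), keys));
    for &idx in extra_column_indices {
        let col = batch
            .columns
            .get(idx)
            .ok_or_else(|| format!("column index {idx} out of bounds"))?;
        // Every projected column must have one entry per key.
        if col.1.len() != num_rows {
            return Err(format!("column {idx} has {} rows, expected {num_rows}", col.1.len()));
        }
        columns.push(col.clone());
    }
    Ok(ToyBatch { columns })
}
```

For a toy batch with columns `host`, `ts`, and `value`, calling `with_key_column(&batch, "pk", keys, &[1, 2])` yields the columns `pk`, `ts`, `value`; the `host` tag survives only inside the encoded key.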