Module sst


Sorted string tables (SSTs).

Modules

file
Structures to describe metadata of files.
file_purger
file_ref
index
location
parquet
SST in parquet format.
version 🔒
SST version.

Structs

FlatSchemaOptions
Options of flat schema.
SeriesEstimator 🔒
Gets the estimated number of series from record batches.

Enums

FormatType
Format type of the SST file.

Constants

DEFAULT_WRITE_BUFFER_SIZE
Default write buffer size. It should be greater than the default minimum upload part size of S3 (5 MiB).
DEFAULT_WRITE_CONCURRENCY
Default number of concurrent writes. It only takes effect on object store backends (e.g., S3).
INTERNAL_PARQUET_FIELD_ID_BASE 🔒
Parquet field ID base for internal columns (__primary_key, __sequence, __op_type). Uses bit 30 to distinguish them from user column IDs while still fitting in the positive i32 range.
OP_TYPE_PARQUET_FIELD_ID 🔒
Parquet field ID for the __op_type column.
PARQUET_FIELD_ID_KEY
Iceberg-compatible column field ID key stored in Parquet column metadata.
PRIMARY_KEY_PARQUET_FIELD_ID 🔒
Parquet field ID for the __primary_key column.
SEQUENCE_PARQUET_FIELD_ID 🔒
Parquet field ID for the __sequence column.
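
The bit-30 scheme described for INTERNAL_PARQUET_FIELD_ID_BASE can be illustrated with a small sketch. Only the use of bit 30 comes from the descriptions above. The constant names, the per-column offsets, and the helper `is_internal_field_id` are hypothetical and may not match the crate's actual values.

```rust
// Hypothetical base: bit 30 set. This still fits in a positive i32,
// because bit 31 is the sign bit.
const SKETCH_INTERNAL_BASE: i32 = 1 << 30;

// Assumed offsets for illustration only; the real constants are private
// and their exact values are not shown in this listing.
const SKETCH_PRIMARY_KEY_ID: i32 = SKETCH_INTERNAL_BASE;
const SKETCH_SEQUENCE_ID: i32 = SKETCH_INTERNAL_BASE + 1;
const SKETCH_OP_TYPE_ID: i32 = SKETCH_INTERNAL_BASE + 2;

/// Returns true if `id` is non-negative and has bit 30 set,
/// which marks it as an internal column ID in this sketch.
fn is_internal_field_id(id: i32) -> bool {
    id >= 0 && (id & SKETCH_INTERNAL_BASE) != 0
}
```

Under these assumptions, user column IDs below 2^30 never collide with internal IDs, and a single bit test is enough to tell the two groups apart.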

Functions

concretize_json_type 🔒
flat_sst_arrow_schema_column_num
Returns the number of columns in the flat format.
internal_fields 🔒
Fields for internal columns.
override_pk_field_to_binary 🔒
Returns a copy of schema with the __primary_key field replaced by a plain Binary field.
tag_maybe_to_dictionary_field 🔒
Helper function to create a dictionary field from a field if it is a string column.
to_dictionary_field 🔒
Helper function to create a dictionary field from a field.
to_flat_sst_arrow_schema
Gets the arrow schema to store in parquet.
to_sst_arrow_schema
Gets the arrow schema to store in parquet.
with_field_id
Adds PARQUET:field_id metadata to an Arrow field.
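
As a rough illustration of what attaching PARQUET:field_id metadata to a field involves, the sketch below models a field as a name plus a string-to-string metadata map. It uses only the standard library. `SketchField` and `attach_field_id` are hypothetical stand-ins, not the Arrow `Field` type or the signature of with_field_id.

```rust
use std::collections::HashMap;

/// Metadata key named in the description of with_field_id above.
const FIELD_ID_KEY: &str = "PARQUET:field_id";

/// Simplified stand-in for an Arrow field: a name plus string metadata.
#[derive(Debug, Clone, PartialEq)]
struct SketchField {
    name: String,
    metadata: HashMap<String, String>,
}

/// Returns a copy of `field` whose metadata stores `field_id` (as a
/// decimal string) under the `PARQUET:field_id` key. The input field
/// is left unchanged.
fn attach_field_id(field: &SketchField, field_id: i32) -> SketchField {
    let mut out = field.clone();
    out.metadata
        .insert(FIELD_ID_KEY.to_string(), field_id.to_string());
    out
}
```

Storing the ID as a string follows from the assumption that field metadata is a string-to-string map, so the numeric ID has to be serialized before it is inserted.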