Module prefilter

Prefilter framework for the Parquet reader.

Prefilter optimization reduces I/O by reading only a subset of columns first (the prefilter phase), applying filters to compute a refined row selection, then reading the remaining columns with the refined selection.
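
The two-phase read described above can be illustrated with a minimal, self-contained sketch. It is not this module's implementation: `RowGroup`, `prefilter`, and `read_selected` are hypothetical names, the "row group" is an in-memory set of integer columns, and the row selection is modelled as a plain list of row indices.

```rust
/// Hypothetical in-memory stand-in for a row group: each column is a Vec<i64>.
struct RowGroup {
    columns: Vec<Vec<i64>>,
}

/// Phase 1 (prefilter): read only the filter column and return the
/// indices of the rows that satisfy the predicate.
fn prefilter(rg: &RowGroup, filter_col: usize, pred: impl Fn(i64) -> bool) -> Vec<usize> {
    rg.columns[filter_col]
        .iter()
        .enumerate()
        .filter(|(_, v)| pred(**v))
        .map(|(i, _)| i)
        .collect()
}

/// Phase 2: read the remaining columns, touching only the selected rows.
fn read_selected(rg: &RowGroup, cols: &[usize], selection: &[usize]) -> Vec<Vec<i64>> {
    cols.iter()
        .map(|&c| selection.iter().map(|&r| rg.columns[c][r]).collect())
        .collect()
}

fn main() {
    let rg = RowGroup {
        columns: vec![vec![1, 5, 9, 5], vec![10, 20, 30, 40], vec![7, 8, 9, 10]],
    };
    // Filter on column 0 first, then fetch columns 1 and 2 for matching rows only.
    let selection = prefilter(&rg, 0, |v| v == 5);
    let rest = read_selected(&rg, &[1, 2], &selection);
    println!("selection = {:?}, remaining columns = {:?}", selection, rest);
}
```

The saving comes from phase 2: the more rows the prefilter eliminates, the less of the remaining columns has to be read.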

Structs

BulkFilterPlan 🔒
How the bulk-memtable read should apply each predicate.
CachedPrimaryKeyFilter 🔒
PrefilterContext 🔒
Context for prefiltering a row group.
PrefilterContextBuilder 🔒
Pre-built state for constructing PrefilterContext per row group.
PrefilterResult 🔒
Result of prefiltering a row group.
ReaderFilterPlan 🔒
How the parquet reader should apply each predicate.

Constants

PREFILTER_COLUMN_RATIO_THRESHOLD 🔒
PREFILTER_MIN_REMAINING_COLUMNS 🔒

Functions

apply_filters_to_batch 🔒
build_bulk_filter_plan 🔒
build_reader_filter_plan 🔒
Splits a query [Predicate] into a ReaderFilterPlan: predicates that can run during the prefilter pass (on a reduced projection, to compute a refined row selection) versus predicates that must run on the normal read path (alongside the full projection).
compute_projection_mask 🔒
Executes prefiltering on a row group.
execute_prefilter 🔒
matching_row_ranges_by_primary_key 🔒
prefilter_flat_batch_by_primary_key 🔒
Filters a flat-format record batch by primary key, returning only rows whose primary key matches the filter. Returns None if all rows are filtered out.
should_use_prefilter 🔒
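
Two of the behaviours listed above can be sketched in isolation: splitting predicates into a prefilter set and a normal-path set, and a primary-key filter that returns None when no row survives. This is an illustration under stated assumptions, not the module's code. `Pred`, `Plan`, `split`, and `filter_by_key` are hypothetical names, and the rule used here (a predicate goes to the prefilter pass when its column is part of the reduced projection) is an assumption about how such a split could be decided.

```rust
use std::collections::HashSet;

/// Hypothetical predicate of the form `column == value`.
#[derive(Debug, Clone, PartialEq)]
struct Pred {
    column: String,
    value: i64,
}

/// Hypothetical plan: which predicates run in the prefilter pass and
/// which run on the normal read path.
#[derive(Debug, PartialEq)]
struct Plan {
    prefilter: Vec<Pred>,
    normal: Vec<Pred>,
}

/// Assumed rule: a predicate can run during the prefilter pass only if
/// its column is part of the reduced (prefilter) projection.
fn split(preds: Vec<Pred>, prefilter_columns: &HashSet<&str>) -> Plan {
    let (prefilter, normal): (Vec<Pred>, Vec<Pred>) = preds
        .into_iter()
        .partition(|p| prefilter_columns.contains(p.column.as_str()));
    Plan { prefilter, normal }
}

/// Keeps the values whose primary key equals `wanted`.
/// Returns None when every row is filtered out.
fn filter_by_key(keys: &[&str], values: &[i64], wanted: &str) -> Option<Vec<i64>> {
    let kept: Vec<i64> = keys
        .iter()
        .zip(values)
        .filter(|(k, _)| **k == wanted)
        .map(|(_, v)| *v)
        .collect();
    if kept.is_empty() { None } else { Some(kept) }
}

fn main() {
    let preds = vec![
        Pred { column: "host".to_string(), value: 1 },
        Pred { column: "cpu".to_string(), value: 90 },
    ];
    let prefilter_columns: HashSet<&str> = ["host"].into_iter().collect();
    let plan = split(preds, &prefilter_columns);
    println!("{:?}", plan);

    println!("{:?}", filter_by_key(&["a", "b", "a"], &[1, 2, 3], "a"));
    println!("{:?}", filter_by_key(&["a", "b", "a"], &[1, 2, 3], "z"));
}
```

Returning None rather than an empty batch lets a caller skip all downstream work for that batch with a single check.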