Module prefilter

Prefilter framework for parquet reader.

Prefilter optimization reduces I/O by reading only a subset of columns first (the prefilter phase), applying filters to compute a refined row selection, then reading the remaining columns with the refined selection.
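
The two-phase read described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's implementation: `RowGroup`, `prefilter`, and `read_selected` are hypothetical names, and an in-memory column vector stands in for parquet column chunks.

```rust
/// Hypothetical stand-in for a parquet row group: one `Vec<i64>` per column.
struct RowGroup {
    columns: Vec<Vec<i64>>,
}

/// Prefilter phase: read only the filter column and evaluate the predicate,
/// producing a refined row selection (indices of the rows that survive).
fn prefilter(group: &RowGroup, filter_col: usize, pred: impl Fn(i64) -> bool) -> Vec<usize> {
    group.columns[filter_col]
        .iter()
        .enumerate()
        .filter(|&(_, &v)| pred(v))
        .map(|(i, _)| i)
        .collect()
}

/// Second phase: read the remaining columns, touching only the selected rows.
fn read_selected(group: &RowGroup, cols: &[usize], selection: &[usize]) -> Vec<Vec<i64>> {
    cols.iter()
        .map(|&c| selection.iter().map(|&r| group.columns[c][r]).collect())
        .collect()
}
```

In this sketch the saving comes from the second phase: the more rows the prefilter eliminates, the less data has to be fetched and decoded for the remaining columns.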

Structs

BulkFilterPlan 🔒
How the bulk-memtable read should apply each predicate.
CachedPrimaryKeyFilter 🔒
PrefilterContext 🔒
Context for prefiltering a row group.
PrefilterContextBuilder 🔒
Pre-built state for constructing PrefilterContext per row group.
PrefilterEntry 🔒
PrefilterResult 🔒
Result of prefiltering a row group.
ReaderFilterPlan 🔒
How the parquet reader should apply each predicate.

Enums

PrefilterEntryKind 🔒

Constants

PREFILTER_COLUMN_RATIO_THRESHOLD 🔒
PREFILTER_MIN_REMAINING_COLUMNS 🔒

Functions

all_prefilter_entries 🔒
build_bulk_filter_plan 🔒
build_prefilter_cache_entries 🔒
build_prefilter_masks 🔒
build_reader_filter_plan 🔒
Splits a query [Predicate] into a ReaderFilterPlan: predicates that can run during the prefilter pass (on a reduced projection, to compute a refined row selection) versus predicates that must run on the normal read path (alongside the full projection).
compute_projection_count 🔒
compute_projection_mask 🔒
Executes prefiltering on a row group.
eval_entry_mask 🔒
eval_physical_filter_mask 🔒
eval_pk_group_mask 🔒
eval_simple_filter_mask 🔒
execute_prefilter 🔒
execute_prefilter_by_reading_columns 🔒
execute_prefilter_with_result_cache 🔒
matching_row_ranges_by_primary_key 🔒
non_cacheable_physical_filters 🔒
prefilter_column_names_for_entries 🔒
prefilter_flat_batch_by_primary_key 🔒
Filters a flat-format record batch by primary key, returning only rows whose primary key matches the filter. Returns None if all rows are filtered out.
projection_indices 🔒
refined_selection_from_mask 🔒
rows_before_filter 🔒
should_use_prefilter 🔒
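
Two ideas from the function list can be illustrated with a small sketch: turning a boolean row mask into a refined selection of contiguous row ranges, and a primary-key filter that returns None when every row is filtered out. The names `ranges_from_mask` and `filter_rows_by_key` are hypothetical, and rows are modelled as plain `(key, value)` tuples rather than record batches; this is an assumption-laden illustration, not the signatures or logic of the items listed above.

```rust
use std::ops::Range;

/// Collapses a boolean row mask into contiguous ranges of selected rows.
/// (Hypothetical helper; models the idea of deriving a selection from a mask.)
fn ranges_from_mask(mask: &[bool]) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &keep) in mask.iter().enumerate() {
        match (keep, start) {
            // A run of selected rows begins here.
            (true, None) => start = Some(i),
            // The current run ends just before this row.
            (false, Some(s)) => {
                ranges.push(s..i);
                start = None;
            }
            _ => {}
        }
    }
    // Close a run that extends to the end of the mask.
    if let Some(s) = start {
        ranges.push(s..mask.len());
    }
    ranges
}

/// Keeps only rows whose key satisfies `matches`.
/// Returns `None` when no row survives, so callers can skip the batch entirely.
fn filter_rows_by_key<'a>(
    rows: &[(&'a str, i64)],
    matches: impl Fn(&str) -> bool,
) -> Option<Vec<(&'a str, i64)>> {
    let kept: Vec<(&'a str, i64)> = rows
        .iter()
        .copied()
        .filter(|&(k, _)| matches(k))
        .collect();
    if kept.is_empty() {
        None
    } else {
        Some(kept)
    }
}
```

Returning an `Option` rather than an empty collection lets the caller distinguish "nothing left to read" without inspecting the result, which fits the goal of avoiding further work on fully filtered data.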