pub struct ExternalSorter {
index_name: String,
temp_file_provider: Arc<dyn ExternalTempFileProvider>,
segment_null_bitmap: Bitmap,
values_buffer: BTreeMap<Bytes, (Bitmap, usize)>,
total_row_count: usize,
segment_row_count: NonZeroUsize,
current_memory_usage: usize,
current_memory_usage_threshold: Option<usize>,
global_memory_usage: Arc<AtomicUsize>,
global_memory_usage_sort_limit: Option<usize>,
}
Expand description
ExternalSorter
manages the sorting of data using both in-memory structures and external files.
It dumps data to external files when the in-memory buffer crosses a certain memory threshold.
Fields§
§index_name: String
The index name associated with the sorting operation
temp_file_provider: Arc<dyn ExternalTempFileProvider>
Manages creation and access to external temporary files
segment_null_bitmap: Bitmap
Bitmap indicating which segments have null values
values_buffer: BTreeMap<Bytes, (Bitmap, usize)>
In-memory buffer to hold values and their corresponding bitmaps until memory threshold is exceeded
total_row_count: usize
Count of all rows ingested so far
segment_row_count: NonZeroUsize
The number of rows per group for bitmap indexing which determines how rows are batched for indexing. It is used to determine which segment a row belongs to.
current_memory_usage: usize
Tracks memory usage of the buffer
current_memory_usage_threshold: Option<usize>
The threshold of current memory usage below which the buffer is not dumped, even if the global memory
usage exceeds global_memory_usage_sort_limit
. This allows for smaller buffers to remain in memory,
providing a buffer against unnecessary dumps to external files, which can be costly in terms of performance.
None
indicates that only the global memory usage threshold is considered for dumping the buffer.
global_memory_usage: Arc<AtomicUsize>
Tracks the global memory usage of all sorters
global_memory_usage_sort_limit: Option<usize>
The memory usage limit that, when exceeded by the global memory consumption of all sorters, necessitates
a reassessment of buffer retention. Surpassing this limit signals that there is a high overall memory pressure,
potentially requiring buffer dumping to external storage for memory relief.
None
value indicates that no specific global memory usage threshold is established for triggering buffer dumps.
Implementations§
Source§impl ExternalSorter
impl ExternalSorter
Sourcepub fn new(
index_name: String,
temp_file_provider: Arc<dyn ExternalTempFileProvider>,
segment_row_count: NonZeroUsize,
current_memory_usage_threshold: Option<usize>,
global_memory_usage: Arc<AtomicUsize>,
global_memory_usage_sort_limit: Option<usize>,
) -> Self
pub fn new( index_name: String, temp_file_provider: Arc<dyn ExternalTempFileProvider>, segment_row_count: NonZeroUsize, current_memory_usage_threshold: Option<usize>, global_memory_usage: Arc<AtomicUsize>, global_memory_usage_sort_limit: Option<usize>, ) -> Self
Constructs a new ExternalSorter
Sourcepub fn factory(
temp_file_provider: Arc<dyn ExternalTempFileProvider>,
current_memory_usage_threshold: Option<usize>,
global_memory_usage: Arc<AtomicUsize>,
global_memory_usage_sort_limit: Option<usize>,
) -> SorterFactory
pub fn factory( temp_file_provider: Arc<dyn ExternalTempFileProvider>, current_memory_usage_threshold: Option<usize>, global_memory_usage: Arc<AtomicUsize>, global_memory_usage_sort_limit: Option<usize>, ) -> SorterFactory
Generates a factory function that creates new ExternalSorter
instances
Sourcefn push_not_null(
&mut self,
value: BytesRef<'_>,
segment_index_range: RangeInclusive<usize>,
) -> usize
fn push_not_null( &mut self, value: BytesRef<'_>, segment_index_range: RangeInclusive<usize>, ) -> usize
Pushes the non-null values to the values buffer and sets the bits within the specified range in the given bitmap to true. Returns the memory usage difference of the buffer after the operation.
Sourceasync fn may_dump_buffer(&mut self, memory_diff: usize) -> Result<()>
async fn may_dump_buffer(&mut self, memory_diff: usize) -> Result<()>
Checks if the in-memory buffer exceeds the threshold and offloads it to external storage if necessary
Sourcefn segment_index_range(&self, n: usize) -> RangeInclusive<usize>
fn segment_index_range(&self, n: usize) -> RangeInclusive<usize>
Determines the segment index range for the row index range
[row_begin, row_begin + n - 1]
Sourcefn segment_index(&self, row_index: usize) -> usize
fn segment_index(&self, row_index: usize) -> usize
Determines the segment index for the given row index
Trait Implementations§
Source§impl Sorter for ExternalSorter
impl Sorter for ExternalSorter
Source§fn push_n<'life0, 'life1, 'async_trait>(
&'life0 mut self,
value: Option<BytesRef<'life1>>,
n: usize,
) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>where
Self: 'async_trait,
'life0: 'async_trait,
'life1: 'async_trait,
fn push_n<'life0, 'life1, 'async_trait>(
&'life0 mut self,
value: Option<BytesRef<'life1>>,
n: usize,
) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>where
Self: 'async_trait,
'life0: 'async_trait,
'life1: 'async_trait,
Pushes n identical values into the sorter, adding them to the in-memory buffer and dumping the buffer to an external file if necessary
Source§fn output<'life0, 'async_trait>(
&'life0 mut self,
) -> Pin<Box<dyn Future<Output = Result<SortOutput>> + Send + 'async_trait>>where
Self: 'async_trait,
'life0: 'async_trait,
fn output<'life0, 'async_trait>(
&'life0 mut self,
) -> Pin<Box<dyn Future<Output = Result<SortOutput>> + Send + 'async_trait>>where
Self: 'async_trait,
'life0: 'async_trait,
Finalizes the sorting operation, merging data from both in-memory buffer and external files into a sorted stream
Source§fn push<'life0, 'life1, 'async_trait>(
&'life0 mut self,
value: Option<BytesRef<'life1>>,
) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>where
Self: 'async_trait,
'life0: 'async_trait,
'life1: 'async_trait,
fn push<'life0, 'life1, 'async_trait>(
&'life0 mut self,
value: Option<BytesRef<'life1>>,
) -> Pin<Box<dyn Future<Output = Result<()>> + Send + 'async_trait>>where
Self: 'async_trait,
'life0: 'async_trait,
'life1: 'async_trait,
push_n
with n = 1Auto Trait Implementations§
impl Freeze for ExternalSorter
impl !RefUnwindSafe for ExternalSorter
impl Send for ExternalSorter
impl Sync for ExternalSorter
impl Unpin for ExternalSorter
impl !UnwindSafe for ExternalSorter
Blanket Implementations§
§impl<T> AnySync for T
impl<T> AnySync for T
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
§impl<T> Conv for T
impl<T> Conv for T
§impl<T, V> Convert<T> for Vwhere
V: Into<T>,
impl<T, V> Convert<T> for Vwhere
V: Into<T>,
fn convert(value: Self) -> T
fn convert_box(value: Box<Self>) -> Box<T>
fn convert_vec(value: Vec<Self>) -> Vec<T>
fn convert_vec_box(value: Vec<Box<Self>>) -> Vec<Box<T>>
fn convert_matrix(value: Vec<Vec<Self>>) -> Vec<Vec<T>>
fn convert_option(value: Option<Self>) -> Option<T>
fn convert_option_box(value: Option<Box<Self>>) -> Option<Box<T>>
fn convert_option_vec(value: Option<Vec<Self>>) -> Option<Vec<T>>
§impl<T> Downcast for Twhere
T: Any,
impl<T> Downcast for Twhere
T: Any,
§fn into_any(self: Box<T>) -> Box<dyn Any>
fn into_any(self: Box<T>) -> Box<dyn Any>
Box<dyn Trait>
(where Trait: Downcast
) to Box<dyn Any>
. Box<dyn Any>
can
then be further downcast
into Box<ConcreteType>
where ConcreteType
implements Trait
.§fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
fn into_any_rc(self: Rc<T>) -> Rc<dyn Any>
Rc<Trait>
(where Trait: Downcast
) to Rc<Any>
. Rc<Any>
can then be
further downcast
into Rc<ConcreteType>
where ConcreteType
implements Trait
.§fn as_any(&self) -> &(dyn Any + 'static)
fn as_any(&self) -> &(dyn Any + 'static)
&Trait
(where Trait: Downcast
) to &Any
. This is needed since Rust cannot
generate &Any
’s vtable from &Trait
’s.§fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
fn as_any_mut(&mut self) -> &mut (dyn Any + 'static)
&mut Trait
(where Trait: Downcast
) to &Any
. This is needed since Rust cannot
generate &mut Any
’s vtable from &mut Trait
’s.§impl<T> DowncastSync for T
impl<T> DowncastSync for T
§impl<T> FmtForward for T
impl<T> FmtForward for T
§fn fmt_binary(self) -> FmtBinary<Self>where
Self: Binary,
fn fmt_binary(self) -> FmtBinary<Self>where
Self: Binary,
self
to use its Binary
implementation when Debug
-formatted.§fn fmt_display(self) -> FmtDisplay<Self>where
Self: Display,
fn fmt_display(self) -> FmtDisplay<Self>where
Self: Display,
self
to use its Display
implementation when
Debug
-formatted.§fn fmt_lower_exp(self) -> FmtLowerExp<Self>where
Self: LowerExp,
fn fmt_lower_exp(self) -> FmtLowerExp<Self>where
Self: LowerExp,
self
to use its LowerExp
implementation when
Debug
-formatted.§fn fmt_lower_hex(self) -> FmtLowerHex<Self>where
Self: LowerHex,
fn fmt_lower_hex(self) -> FmtLowerHex<Self>where
Self: LowerHex,
self
to use its LowerHex
implementation when
Debug
-formatted.§fn fmt_octal(self) -> FmtOctal<Self>where
Self: Octal,
fn fmt_octal(self) -> FmtOctal<Self>where
Self: Octal,
self
to use its Octal
implementation when Debug
-formatted.§fn fmt_pointer(self) -> FmtPointer<Self>where
Self: Pointer,
fn fmt_pointer(self) -> FmtPointer<Self>where
Self: Pointer,
self
to use its Pointer
implementation when
Debug
-formatted.§fn fmt_upper_exp(self) -> FmtUpperExp<Self>where
Self: UpperExp,
fn fmt_upper_exp(self) -> FmtUpperExp<Self>where
Self: UpperExp,
self
to use its UpperExp
implementation when
Debug
-formatted.§fn fmt_upper_hex(self) -> FmtUpperHex<Self>where
Self: UpperHex,
fn fmt_upper_hex(self) -> FmtUpperHex<Self>where
Self: UpperHex,
self
to use its UpperHex
implementation when
Debug
-formatted.§fn fmt_list(self) -> FmtList<Self>where
&'a Self: for<'a> IntoIterator,
fn fmt_list(self) -> FmtList<Self>where
&'a Self: for<'a> IntoIterator,
§impl<T> FutureExt for T
impl<T> FutureExt for T
§fn with_context(self, otel_cx: Context) -> WithContext<Self>
fn with_context(self, otel_cx: Context) -> WithContext<Self>
§fn with_current_context(self) -> WithContext<Self>
fn with_current_context(self) -> WithContext<Self>
§impl<T> Instrument for T
impl<T> Instrument for T
§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self
into a Left
variant of Either<Self, Self>
if into_left
is true
.
Converts self
into a Right
variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self
into a Left
variant of Either<Self, Self>
if into_left(&self)
returns true
.
Converts self
into a Right
variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> IntoRequest<T> for T
impl<T> IntoRequest<T> for T
Source§fn into_request(self) -> Request<T>
fn into_request(self) -> Request<T>
T
in a tonic::Request
Source§impl<T> IntoRequest<T> for T
impl<T> IntoRequest<T> for T
Source§fn into_request(self) -> Request<T>
fn into_request(self) -> Request<T>
T
in a tonic::Request
§impl<T> Pipe for Twhere
T: ?Sized,
impl<T> Pipe for Twhere
T: ?Sized,
§fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> Rwhere
Self: Sized,
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> Rwhere
Self: Sized,
§fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> Rwhere
R: 'a,
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> Rwhere
R: 'a,
self
and passes that borrow into the pipe function. Read more§fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> Rwhere
R: 'a,
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> Rwhere
R: 'a,
self
and passes that borrow into the pipe function. Read more§fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
§fn pipe_borrow_mut<'a, B, R>(
&'a mut self,
func: impl FnOnce(&'a mut B) -> R,
) -> R
fn pipe_borrow_mut<'a, B, R>( &'a mut self, func: impl FnOnce(&'a mut B) -> R, ) -> R
§fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
self
, then passes self.as_ref()
into the pipe function.§fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
self
, then passes self.as_mut()
into the pipe
function.§fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
self
, then passes self.deref()
into the pipe function.§impl<T> Pointable for T
impl<T> Pointable for T
§impl<SS, SP> SupersetOf<SS> for SPwhere
SS: SubsetOf<SP>,
impl<SS, SP> SupersetOf<SS> for SPwhere
SS: SubsetOf<SP>,
§fn to_subset(&self) -> Option<SS>
fn to_subset(&self) -> Option<SS>
self
from the equivalent element of its
superset. Read more§fn is_in_subset(&self) -> bool
fn is_in_subset(&self) -> bool
self
is actually part of its subset T
(and can be converted to it).§fn to_subset_unchecked(&self) -> SS
fn to_subset_unchecked(&self) -> SS
self.to_subset
but without any property checks. Always succeeds.§fn from_subset(element: &SS) -> SP
fn from_subset(element: &SS) -> SP
self
to the equivalent element of its superset.§impl<T> Tap for T
impl<T> Tap for T
§fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Borrow<B>
of a value. Read more§fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
BorrowMut<B>
of a value. Read more§fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
AsRef<R>
view of a value. Read more§fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
AsMut<R>
view of a value. Read more§fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Deref::Target
of a value. Read more§fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Deref::Target
of a value. Read more§fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
.tap()
only in debug builds, and is erased in release builds.§fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
.tap_mut()
only in debug builds, and is erased in release
builds.§fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
.tap_borrow()
only in debug builds, and is erased in release
builds.§fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
.tap_borrow_mut()
only in debug builds, and is erased in release
builds.§fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
.tap_ref()
only in debug builds, and is erased in release
builds.§fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
.tap_ref_mut()
only in debug builds, and is erased in release
builds.§fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
.tap_deref()
only in debug builds, and is erased in release
builds.