Function merge_seq_files

pub fn merge_seq_files<T: Item>(
    input_files: &[T],
    max_file_size: Option<u64>,
) -> Vec<T>

Finds the optimal set of adjacent files to merge based on a scoring system.

This function evaluates all possible contiguous subsets of files to find the best candidates for merging, considering:

  1. File reduction - prioritizes merging more files to reduce the total count
  2. Write amplification - minimizes the ratio of the largest file's size to the total size
  3. Size efficiency - prefers merges that utilize available space effectively

When multiple merge candidates have the same score, older files (those with lower indices) are preferred.
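The search described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `SizedFile` trait, the `DemoFile` type, the `score` formula and its equal weighting, and the choice to skip windows larger than the size limit are all assumptions made for the example. Only the overall shape (scan every contiguous window, score it on the three criteria, keep the earliest window on ties) follows the description.

```rust
/// Hypothetical stand-in for the crate's `Item` trait; only a size accessor is assumed.
trait SizedFile {
    fn size(&self) -> u64;
}

/// Hypothetical file record used only for this illustration.
#[derive(Clone, Debug, PartialEq)]
struct DemoFile {
    id: usize,
    size: u64,
}

impl SizedFile for DemoFile {
    fn size(&self) -> u64 {
        self.size
    }
}

/// Illustrative score (higher is better). The real weighting is not documented,
/// so the three criteria are simply combined with equal weight here.
fn score(sizes: &[u64], max_size: u64) -> f64 {
    let total: u64 = sizes.iter().fold(0u64, |acc, &s| acc.saturating_add(s));
    if total == 0 {
        return 0.0;
    }
    let largest = sizes.iter().copied().max().unwrap_or(0);
    let file_reduction = sizes.len().saturating_sub(1) as f64; // files removed by the merge
    let write_amp = largest as f64 / total as f64; // lower is better
    let size_efficiency = total as f64 / max_size as f64; // how full the merged file is
    file_reduction - write_amp + size_efficiency
}

/// Scans every contiguous window of at least two files and returns the best one.
fn pick_window<T: SizedFile + Clone>(files: &[T], max_size: u64) -> Vec<T> {
    if files.len() < 2 {
        return Vec::new();
    }
    let sizes: Vec<u64> = files.iter().map(|f| f.size()).collect();
    let mut best: Option<(f64, usize, usize)> = None;
    for start in 0..sizes.len() {
        let mut total = 0u64;
        for end in start..sizes.len() {
            total = total.saturating_add(sizes[end]);
            if total > max_size {
                break; // assumption: windows over the limit are not candidates
            }
            if end == start {
                continue; // a single file is not a merge
            }
            let s = score(&sizes[start..=end], max_size);
            // Strict `>` keeps the earliest (lowest-index) window when scores tie.
            if best.map_or(true, |(b, _, _)| s > b) {
                best = Some((s, start, end));
            }
        }
    }
    match best {
        Some((_, start, end)) => files[start..=end].to_vec(),
        None => Vec::new(),
    }
}
```

With these assumed weights, files of sizes 10, 10, 10, 100 and a limit of 30 select the first three files. With sizes 10, 10, 10 and a limit of 20, the two candidate pairs tie and the pair at indices 0 and 1 is chosen.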

§Arguments

  • input_files - Slice of files to consider for merging
  • max_file_size - Optional maximum size constraint for the merged file. If None, uses 1.5 times the average file size.
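The fallback for max_file_size can be illustrated with a small helper. This is a sketch under stated assumptions: `effective_max_size` is a hypothetical name, it works on raw sizes rather than on `Item` values, and both the truncating integer rounding and the result of 0 for an empty input are choices made for the example, since the documentation does not specify them.

```rust
/// Hypothetical helper: resolves the size limit used for a merge.
/// `Some(limit)` is returned unchanged; `None` falls back to
/// 1.5 times the average input size.
fn effective_max_size(sizes: &[u64], max_file_size: Option<u64>) -> u64 {
    match max_file_size {
        Some(limit) => limit,
        // Assumption: with no files there is no average, so the limit is 0.
        None if sizes.is_empty() => 0,
        None => {
            // Sum in u128 so that many large files cannot overflow.
            let total: u128 = sizes.iter().map(|&s| s as u128).sum();
            // 1.5 * (total / n) == 3 * total / (2 * n); integer division truncates.
            let limit = total * 3 / (2 * sizes.len() as u128);
            limit.min(u64::MAX as u128) as u64
        }
    }
}
```

For example, sizes of 10, 20, and 30 average 20, so the fallback limit is 30.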

§Returns

A vector containing the best set of adjacent files to merge. Returns an empty vector if the input is empty or contains only one file, since a merge needs at least two files.