Commits


Weston Pace authored and David Li committed 24f372297c5
ARROW-16294: [C++] Improve performance of parquet readahead

* Turns out the batch size we were slicing internally for parquet (in the TableBatchReader) was not the batch size from the scanner. I added batch slicing in file_parquet to leave the parquet reader itself more or less unchanged here (and it's not clear it would make more sense to use the smaller batch size slicing inside the reader anyways).
* Changed parquet readahead so it reads ahead more than one row group. It now tries to keep `batch_size * batch_readahead` reads in flight. Users that want more parallel reads can increase batch readahead.

Closes #12967 from westonpace/feature/ARROW-16294--improve-parquet-readahead

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
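
To make the two changes in this commit message concrete, here is a minimal, self-contained sketch. It is not the code from the patch and uses no Arrow APIs. `RowGroupsToKeepInFlight` and `SliceLengths` are hypothetical helpers. The sketch also assumes that the `batch_size * batch_readahead` target is counted in rows, which is one reading of the message rather than something it states outright.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow API): counts how many upcoming row groups
// would need to be in flight so that at least batch_size * batch_readahead
// rows are being read. Assumes the readahead target is measured in rows.
std::size_t RowGroupsToKeepInFlight(const std::vector<int64_t>& row_group_rows,
                                    std::size_t next_row_group,
                                    int64_t batch_size,
                                    int64_t batch_readahead) {
  assert(batch_size > 0 && batch_readahead >= 0);
  const int64_t target_rows = batch_size * batch_readahead;
  int64_t rows_in_flight = 0;
  std::size_t count = 0;
  // Keep adding row groups until the target is covered or the file ends.
  for (std::size_t i = next_row_group;
       i < row_group_rows.size() && rows_in_flight < target_rows; ++i) {
    rows_in_flight += row_group_rows[i];
    ++count;
  }
  return count;
}

// Hypothetical helper (not an Arrow API): lengths of the slices produced when
// a decoded batch of `rows` rows is re-sliced to the scanner's batch size.
std::vector<int64_t> SliceLengths(int64_t rows, int64_t batch_size) {
  assert(rows >= 0 && batch_size > 0);
  std::vector<int64_t> lengths;
  for (int64_t offset = 0; offset < rows; offset += batch_size) {
    // The final slice may be shorter than batch_size.
    lengths.push_back(std::min(batch_size, rows - offset));
  }
  return lengths;
}
```

Under these assumptions, a file with four row groups of 1000 rows each, a batch size of 500, and a batch readahead of 3 gives a target of 1500 rows, so two row groups would be kept in flight. Raising the batch readahead to 8 raises the target to 4000 rows and covers all four row groups, which matches the note that increasing batch readahead yields more parallel reads.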