Commits


Jinpeng authored and GitHub committed 16eb02649fd
PARQUET-2423: [C++][Parquet] Avoid allocating buffer object in RecordReader's SkipRecords (#39818)

### Rationale for this change

Currently, each invocation of `SkipRecords()` for non-repeated fields [creates a new buffer](https://github.com/apache/arrow/blob/main/cpp/src/parquet/column_reader.cc#L1482) object to hold a decoded validity bitmap. The allocation is unnecessary because we are merely counting how many defined values are in the internal buffer, not reusing the validity bitmap.

### What changes are included in this PR?

* Remove the temporary validity bitmap and just count the definition levels at the max value instead. This improves performance when skipping non-repeated records.
* Add a new microbenchmark for alternately reading and skipping from a RecordReader.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

Lead-authored-by: jp0317 <zjpzlz@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Jinpeng <zjpzlz@163.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
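
For illustration only, the idea of counting definition levels at the max value, rather than materializing a validity bitmap, might look roughly like the sketch below. The helper name `CountDefinedValues` and its signature are hypothetical and are not taken from the Arrow code base; the sketch assumes definition levels are stored as a contiguous array of `int16_t` and that a value is defined exactly when its level equals the column's maximum definition level.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not the actual Arrow implementation.
// For a non-repeated field, a value is present (non-null) when its
// definition level equals the maximum definition level of the column.
// Counting those levels gives the number of values to skip in the
// value buffer, with no temporary bitmap allocation.
int64_t CountDefinedValues(const int16_t* def_levels, int64_t num_levels,
                           int16_t max_def_level) {
  return std::count(def_levels, def_levels + num_levels, max_def_level);
}
```

A single pass with `std::count` touches only the existing level buffer, which is why this style of counting avoids the per-call allocation described in the rationale.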