Commits


Antoine Pitrou authored and GitHub committed 20c975d03f8
GH-39122: [C++][Parquet] Optimize FLBA record reader (#39124)

### Rationale for this change

The FLBA implementation of RecordReader is suboptimal:
* it doesn't preallocate the output array
* it reads the decoded validity bitmap one bit at a time and recreates it, one bit at a time

### What changes are included in this PR?

Optimize the FLBA implementation of RecordReader so as to avoid the aforementioned inefficiencies.

I did a quick-and-dirty benchmark on a Parquet file with two columns:
* column 1: uncompressed, PLAIN-encoded, FLBA<3> with no nulls
* column 2: uncompressed, PLAIN-encoded, FLBA<3> with 25% nulls

With git main, the file can be read at 465 MB/s. With this PR, the file can be read at 700 MB/s.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: #39122

Lead-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: Antoine Pitrou <antoine@python.org>
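
### Illustration (not part of the PR)

The two inefficiencies named in the rationale can be shown with a small standalone sketch. This is not the Arrow code and it uses no Arrow or Parquet APIs: `FixedWidthColumn`, `GetBit`, `AppendOneByOne` and `AppendPreallocated` are hypothetical names invented for this example. It assumes decoded values arrive densely packed (non-null values only) alongside an LSB-first validity bitmap with one bit per slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical output container for fixed-width values (not an Arrow type).
struct FixedWidthColumn {
  std::vector<std::uint8_t> data;      // num_slots * byte_width bytes
  std::vector<std::uint8_t> validity;  // LSB-first bitmap, 1 = non-null
};

// Read bit i of an LSB-first bitmap.
inline bool GetBit(const std::uint8_t* bits, std::size_t i) {
  return ((bits[i / 8] >> (i % 8)) & 1) != 0;
}

// Slow variant: the data buffer grows on every slot and the validity
// bitmap is rebuilt one bit at a time.
FixedWidthColumn AppendOneByOne(const std::uint8_t* dense_values,
                                const std::uint8_t* validity,
                                std::size_t num_slots,
                                std::size_t byte_width) {
  assert(byte_width > 0);
  FixedWidthColumn out;
  out.validity.assign((num_slots + 7) / 8, 0);
  const std::vector<std::uint8_t> zeros(byte_width, 0);
  std::size_t next = 0;  // index of the next non-null value in dense_values
  for (std::size_t i = 0; i < num_slots; ++i) {
    if (GetBit(validity, i)) {
      const std::uint8_t* v = dense_values + next * byte_width;
      ++next;
      out.data.insert(out.data.end(), v, v + byte_width);
      out.validity[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    } else {
      // Null slot: append zero padding so every slot keeps a fixed width.
      out.data.insert(out.data.end(), zeros.begin(), zeros.end());
    }
  }
  return out;
}

// Faster variant: the data buffer is sized once up front and the
// validity bitmap is copied in whole bytes.
FixedWidthColumn AppendPreallocated(const std::uint8_t* dense_values,
                                    const std::uint8_t* validity,
                                    std::size_t num_slots,
                                    std::size_t byte_width) {
  assert(byte_width > 0);
  FixedWidthColumn out;
  out.data.assign(num_slots * byte_width, 0);  // single allocation, zero-filled
  out.validity.assign(validity, validity + (num_slots + 7) / 8);
  if (num_slots % 8 != 0) {
    // Clear padding bits past num_slots in the last byte.
    out.validity.back() &= static_cast<std::uint8_t>((1u << (num_slots % 8)) - 1);
  }
  std::size_t next = 0;
  for (std::size_t i = 0; i < num_slots; ++i) {
    if (GetBit(validity, i)) {
      std::memcpy(out.data.data() + i * byte_width,
                  dense_values + next * byte_width, byte_width);
      ++next;
    }
  }
  return out;
}
```

Both functions produce the same output for the same input. The second one allocates the data buffer once and moves the bitmap in whole bytes instead of rebuilding it bit by bit, which is the kind of change the rationale describes. The sketch makes no claim about how the PR itself implements it.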