Commits


Micah Kornfield authored and Wes McKinney committed 5389008df02
ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

This change does the following:

* Vectorizes computation of validity bitmaps with no repeated parents for all little-endian architectures
* Vectorizes computation of validity bitmaps with repeated parents on AVX2+ architectures (strictly speaking, it requires BMI2)
* Exposes some building blocks for vectorized computation of all validity bitmaps for Parquet level data (more cases need to be handled to generate appropriate bitmaps and offset data for lists). These will be added to level_conversions.h in a future PR.
* Replaces loops over bitmaps with SetBitsTo
* Leaves a fallback for machines that are not BMI2-capable or not little-endian

With AVX2 enabled this seems to improve benchmarks by 20% for nullable columns (BM_Read..) on my box. I didn't see any impact on other benchmarks.

See checklist for what is remaining. If possible I'd like to get early feedback on naming and my approach to checking for BMI2.

Still needed:
- [x] Still adding more focused unit tests for what I've added.
- [x] Move the changes to Bitmap::ToString to their own PR.

Closes #6985 from emkornfield/ARROW-8413

Lead-authored-by: Micah Kornfield <emkornfield@gmail.com>
Co-authored-by: François Saint-Jacques <fsaintjacques@gmail.com>
Signed-off-by: Wes McKinney <wesm@apache.org>
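
For readers unfamiliar with the technique, the following is a minimal sketch of the idea behind the change description above, not the code from this commit. It assumes that, for a column with no repeated parents, a value is valid when its definition level reaches the maximum definition level, and that the validity bitmap is LSB-first. All names here (LevelsToBitmask, DefLevelsToValidityBitmap, ExtractBitsPortable) are hypothetical. ExtractBitsPortable emulates in plain C++ what the BMI2 PEXT instruction does in hardware, which is presumably the kind of bit selection the repeated-parent path relies on.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs up to 64 definition levels into one 64-bit word.
// Bit i is set when def_levels[i] >= threshold. The loop body is branch-free,
// which gives compilers a chance to auto-vectorize it.
inline uint64_t LevelsToBitmask(const int16_t* def_levels, int count,
                                int16_t threshold) {
  assert(count >= 0 && count <= 64);
  uint64_t mask = 0;
  for (int i = 0; i < count; ++i) {
    mask |= static_cast<uint64_t>(def_levels[i] >= threshold) << i;
  }
  return mask;
}

// Hypothetical sketch of the non-repeated case: builds an LSB-first validity
// bitmap 64 levels at a time and reports the number of nulls.
std::vector<uint8_t> DefLevelsToValidityBitmap(
    const std::vector<int16_t>& def_levels, int16_t max_def_level,
    int64_t* null_count) {
  const size_t n = def_levels.size();
  std::vector<uint8_t> bitmap((n + 7) / 8, 0);
  int64_t valid = 0;
  for (size_t offset = 0; offset < n; offset += 64) {
    const int count = static_cast<int>(n - offset < 64 ? n - offset : 64);
    const uint64_t mask =
        LevelsToBitmask(def_levels.data() + offset, count, max_def_level);
    // Store the word byte by byte so the layout does not depend on endianness.
    for (int b = 0; b < (count + 7) / 8; ++b) {
      bitmap[offset / 8 + b] = static_cast<uint8_t>(mask >> (8 * b));
    }
    valid += static_cast<int64_t>(std::bitset<64>(mask).count());
  }
  *null_count = static_cast<int64_t>(n) - valid;
  return bitmap;
}

// Portable emulation of PEXT: gathers the bits of `value` at the positions
// set in `select` and packs them contiguously starting at bit 0.
inline uint64_t ExtractBitsPortable(uint64_t value, uint64_t select) {
  uint64_t out = 0;
  int out_pos = 0;
  while (select != 0) {
    const uint64_t lowest = select & (~select + 1);  // isolate lowest set bit
    if (value & lowest) out |= uint64_t{1} << out_pos;
    ++out_pos;
    select &= select - 1;  // clear lowest set bit
  }
  return out;
}
```

With repeated parents, some level entries (for example, those describing an empty list) have no slot in the values column at all. One way to handle that, under the assumptions above, is to compute two masks with LevelsToBitmask, one marking entries that have a slot and one marking entries that are fully defined, and then use the extract step to keep only the "defined" bits at positions that have a slot.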