Commits


Wes McKinney authored and Antoine Pitrou committed 7cebe082e60
ARROW-9210: [C++] Use BitBlockCounter in array/visitor_inline.h

This significantly speeds up processing of mostly-not-null or mostly-null data, while adding almost no overhead in the other scenarios, where you rarely have a word-sized run of all-not-null or all-null data.

Because `BitUtil::GetBit` is used for bit-checking in the scenario where every bit in the whole array must be checked individually, I see a slight but inconclusive perf regression, similar to the perf difference we've seen when comparing BitmapReader with the naive approach of calling GetBit inside a loop. This small perf degradation seems to be present mostly with gcc and not meaningfully with clang on Linux.

For data with null_count 0, data is processed in blocks of INT16_MAX values at a time, so this adds no meaningful overhead for this case either.

I modified the hash benchmarks where this code is used to exhibit both the cases that benefit from this optimization and the ones that don't.

Closes #7521 from wesm/ARROW-9210

Lead-authored-by: Wes McKinney <wesm@apache.org>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
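
The block-wise idea described in the message can be illustrated with a minimal, self-contained sketch. This is not Arrow's `BitBlockCounter` or the code in `array/visitor_inline.h`; the names `GetBit`, `PopCount64`, and `SumValid` are hypothetical helpers written for this illustration, and the 64-value block size and LSB-first bitmap layout are assumptions. The sketch counts the set bits of each 64-bit validity word, then skips per-bit checks entirely when the word is all-valid or all-null, and falls back to checking each bit only for mixed words.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not Arrow's API): read bit i of an LSB-first bitmap.
inline bool GetBit(const uint8_t* bits, int64_t i) {
  return (bits[i / 8] >> (i % 8)) & 1;
}

// Portable population count for one 64-bit word.
inline int PopCount64(uint64_t w) {
  int n = 0;
  while (w != 0) {
    w &= w - 1;  // clear the lowest set bit
    ++n;
  }
  return n;
}

// Sum the non-null values, picking a code path per 64-value block.
// Assumes the validity bitmap holds at least one bit per value.
inline int64_t SumValid(const std::vector<int32_t>& values,
                        const std::vector<uint8_t>& validity) {
  const int64_t length = static_cast<int64_t>(values.size());
  assert(static_cast<int64_t>(validity.size()) * 8 >= length);

  int64_t sum = 0;
  int64_t pos = 0;
  while (length - pos >= 64) {
    uint64_t word;
    std::memcpy(&word, validity.data() + pos / 8, sizeof(word));
    const int set_bits = PopCount64(word);
    if (set_bits == 64) {
      // All valid: no per-bit checks needed.
      for (int64_t i = 0; i < 64; ++i) sum += values[pos + i];
    } else if (set_bits != 0) {
      // Mixed block: fall back to checking every bit.
      for (int64_t i = 0; i < 64; ++i) {
        if (GetBit(validity.data(), pos + i)) sum += values[pos + i];
      }
    }
    // set_bits == 0: all null, the whole block is skipped.
    pos += 64;
  }
  // Remaining values that do not fill a whole block.
  for (; pos < length; ++pos) {
    if (GetBit(validity.data(), pos)) sum += values[pos];
  }
  return sum;
}
```

In this sketch the mixed-block path does the same per-bit work as a plain `GetBit` loop plus one population count per word, which is consistent with the message's observation that data needing every bit checked sees little benefit from the block-wise approach.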