Commits


Felipe Oliveira Carvalho authored and GitHub committed fcbcb7dab71
GH-34569: [C++] Diffing of Run-End Encoded arrays (#35003) ### What changes are included in this PR? Ability to diff run-end encoded arrays without expanding them first. This is useful for debugging and understanding of unit test failures. Many successive random accesses to REE arrays by logical indices incur many successive binary searches on the run-ends array to find the physical index of the values array, so this PR implements a few interesting optimizations -- the most complicated one is encapsulated in a new utility available via `ree_util.h` as there are many potential uses for it other than diffing. +-------------+---------------------------+----------------------------------------------------+ | | Optimizations | +-------------+---------------------------+-----------------------+----------------------------+ | # bsearches | Null checks on Comparator | RunLengthOfEqualsFrom | Cached FindPhysicalIndex() | +-------------+---------------------------+----------------------+-----------------------------+ | 3798 | | | | +-------------+---------------------------+----------------------+-----------------------------+ | 1914 | X | | | +-------------+---------------------------+----------------------+-----------------------------+ | 180 | X | X | | +-------------+---------------------------+----------------------+-----------------------------+ | 153 | X | | X | +-------------+---------------------------+----------------------+-----------------------------+ | 75 | X | X | X | +-------------+---------------------------+----------------------+-----------------------------+ ### Are these changes tested? Yes. By existing and newly added REE-specific tests. ### Are there any user-facing changes? No. * Closes: #34569 Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com> Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>