Commits


Weston Pace authored and GitHub committed ee2e9448c85
ARROW-17115: [C++] HashJoin fails if it encounters a batch with more than 32Ki rows (#13679)

The swiss join was correctly breaking up probe-side batches, but build-side batches were partitioned as-is before any breaking up happened. That partitioning assumed 16-bit addressable indices, so it failed if a build-side batch was too large.

Rather than break batches up in the hash-join node, I went ahead and started breaking batches up in the source node. This matches the morsel / batch model and is basically a small precursor for future scheduler changes.

This will have some small end-user impact, as the output for larger queries is going to be batched more finely. However, we were already slicing batches up into 128Ki chunks in the scanner starting with 8.0.0, so I don't think this is a significant difference.

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
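
The idea of breaking a large batch into bounded pieces at the source can be illustrated with a minimal standalone sketch. This is not the code from the commit: `RowSlice`, `SliceBatch`, and `kMaxRowsPerBatch` are hypothetical names, and the 32Ki cap is an assumption taken from the limit named in the commit title.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of Arrow): the half-open row range
// [offset, offset + length) within a larger batch.
struct RowSlice {
  int64_t offset;
  int64_t length;
};

// Assumed cap of 32Ki rows per batch, taken from the commit title.
constexpr int64_t kMaxRowsPerBatch = 32 * 1024;

// Split a batch of `num_rows` rows into consecutive slices of at most
// `max_rows` rows each. Only the final slice may be shorter than `max_rows`.
std::vector<RowSlice> SliceBatch(int64_t num_rows,
                                 int64_t max_rows = kMaxRowsPerBatch) {
  assert(num_rows >= 0 && max_rows > 0);
  std::vector<RowSlice> slices;
  for (int64_t offset = 0; offset < num_rows; offset += max_rows) {
    slices.push_back({offset, std::min(max_rows, num_rows - offset)});
  }
  return slices;
}
```

With these assumptions, a 100,000-row batch yields three slices of 32,768 rows and a final slice of 1,696 rows. In Arrow C++, each range could then be passed to `RecordBatch::Slice(offset, length)`, which produces a zero-copy view of the original batch.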