Commits

Antoine Pitrou authored 8ae596a9b62
ARROW-10796: [C++] Implement optimized RecordBatch sorting Add two RecordBatch sorting implementations: * A single-pass left-to-right radix sort that's fast up to ~8 sort keys * A single-pass multiple-key-comparing sort that gives decent performance for large numbers of sort keys Both implementations benefit from direct indexed access into the contiguous RecordBatch columns (as opposed to table sorting, which must index into the chunks). Add some RecordBatch-sorting benchmarks. Also, add and improve tests; and fix a bug related to sorting of NaNs and nulls. Benchmarks (changes less than 10% in absolute value not shown): ``` benchmark baseline contender change % counters 10 RecordBatchSortIndicesInt64Narrow/1048576/100/8 1.482m items/sec 5.410m items/sec 265.083 {'run_name': 'RecordBatchSortIndicesInt64Narrow/1048576/100/8', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 20 RecordBatchSortIndicesInt64Narrow/1048576/0/8 1.524m items/sec 5.478m items/sec 259.478 {'run_name': 'RecordBatchSortIndicesInt64Narrow/1048576/0/8', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 60 RecordBatchSortIndicesInt64Narrow/1048576/100/2 2.276m items/sec 7.803m items/sec 242.839 {'run_name': 'RecordBatchSortIndicesInt64Narrow/1048576/100/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2} 21 RecordBatchSortIndicesInt64Narrow/1048576/0/2 2.340m items/sec 7.802m items/sec 233.369 {'run_name': 'RecordBatchSortIndicesInt64Narrow/1048576/0/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2} 23 RecordBatchSortIndicesInt64Wide/1048576/100/2 4.673m items/sec 9.867m items/sec 111.164 {'run_name': 'RecordBatchSortIndicesInt64Wide/1048576/100/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3} 61 RecordBatchSortIndicesInt64Wide/1048576/0/2 4.677m items/sec 9.820m items/sec 109.971 {'run_name': 'RecordBatchSortIndicesInt64Wide/1048576/0/2', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3} 35 RecordBatchSortIndicesInt64Wide/1048576/100/8 4.680m items/sec 9.822m items/sec 109.850 {'run_name': 'RecordBatchSortIndicesInt64Wide/1048576/100/8', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3} 55 RecordBatchSortIndicesInt64Wide/1048576/0/8 4.755m items/sec 9.933m items/sec 108.895 {'run_name': 'RecordBatchSortIndicesInt64Wide/1048576/0/8', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3} 59 RecordBatchSortIndicesInt64Wide/1048576/0/16 4.794m items/sec 8.408m items/sec 75.389 {'run_name': 'RecordBatchSortIndicesInt64Wide/1048576/0/16', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3} 16 RecordBatchSortIndicesInt64Wide/1048576/100/16 4.733m items/sec 8.177m items/sec 72.780 {'run_name': 'RecordBatchSortIndicesInt64Wide/1048576/100/16', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 3} 29 RecordBatchSortIndicesInt64Narrow/1048576/0/16 1.640m items/sec 2.627m items/sec 60.146 {'run_name': 'RecordBatchSortIndicesInt64Narrow/1048576/0/16', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 9 RecordBatchSortIndicesInt64Narrow/1048576/100/16 1.559m items/sec 2.342m items/sec 50.201 {'run_name': 'RecordBatchSortIndicesInt64Narrow/1048576/100/16', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 4 TableSortIndicesInt64Narrow/1048576/0/2/1 2.415m items/sec 2.699m items/sec 11.723 {'run_name': 'TableSortIndicesInt64Narrow/1048576/0/2/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2} 51 TableSortIndicesInt64Narrow/1048576/0/2/32 1.814m items/sec 2.023m items/sec 11.513 {'run_name': 'TableSortIndicesInt64Narrow/1048576/0/2/32', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 49 TableSortIndicesInt64Narrow/1048576/0/16/4 1.542m items/sec 1.717m items/sec 11.361 {'run_name': 'TableSortIndicesInt64Narrow/1048576/0/16/4', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 30 TableSortIndicesInt64Narrow/1048576/0/2/4 2.272m items/sec 2.516m items/sec 10.733 {'run_name': 'TableSortIndicesInt64Narrow/1048576/0/2/4', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 2} 25 TableSortIndicesInt64Narrow/1048576/0/8/4 1.542m items/sec 1.706m items/sec 10.628 {'run_name': 'TableSortIndicesInt64Narrow/1048576/0/8/4', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 11 TableSortIndicesInt64Narrow/1048576/0/16/1 1.691m items/sec 1.866m items/sec 10.316 {'run_name': 'TableSortIndicesInt64Narrow/1048576/0/16/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} 12 TableSortIndicesInt64Narrow/1048576/0/8/1 1.683m items/sec 1.856m items/sec 10.280 {'run_name': 'TableSortIndicesInt64Narrow/1048576/0/8/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 1} [...] 6 RecordBatchSortIndicesInt64Narrow/1048576/0/1 185.050m items/sec 164.579m items/sec -11.062 {'run_name': 'RecordBatchSortIndicesInt64Narrow/1048576/0/1', 'run_type': 'iteration', 'repetitions': 0, 'repetition_index': 0, 'threads': 1, 'iterations': 122} ``` Closes #8890 from pitrou/ARROW-10796-batch-sort Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>