Commits


Heres, Daniel authored and Andy Grove committed a054c78813c
ARROW-10968: [Rust][DataFusion] Don't build hash table for right side of join

This PR stops building an index for the probe side of the join. As I observed while writing the PR that adds an optimization pass for the build/probe side of joins, it currently takes more time to have the biggest table on the probe side, which is not what's expected. The current implementation also creates a hashset for both the left and right side for each new batch for inner joins.

This change has a big impact on join performance, e.g. TPC-H query 12 has a >4x speedup and query 5 a 16x speedup.

Query 12 (locally, in memory): >4x speedup

Master:
```
Query 12 iteration 0 took 1102 ms
Query 12 iteration 1 took 1084 ms
Query 12 iteration 2 took 1099 ms
Query 12 iteration 3 took 1077 ms
Query 12 iteration 4 took 1082 ms
Query 12 iteration 5 took 1098 ms
Query 12 iteration 6 took 1081 ms
Query 12 iteration 7 took 1101 ms
Query 12 iteration 8 took 1138 ms
Query 12 iteration 9 took 1084 ms
```

PR:
```
Query 12 iteration 0 took 257 ms
Query 12 iteration 1 took 255 ms
Query 12 iteration 2 took 255 ms
Query 12 iteration 3 took 254 ms
Query 12 iteration 4 took 260 ms
Query 12 iteration 5 took 261 ms
Query 12 iteration 6 took 266 ms
Query 12 iteration 7 took 259 ms
Query 12 iteration 8 took 256 ms
Query 12 iteration 9 took 255 ms
```

Query 5: ~16x speedup

Master:
```
Query 5 iteration 0 took 15857 ms
Query 5 iteration 1 took 15428 ms
Query 5 iteration 2 took 15234 ms
Query 5 iteration 3 took 15024 ms
Query 5 iteration 4 took 14942 ms
Query 5 iteration 5 took 14926 ms
Query 5 iteration 6 took 14900 ms
Query 5 iteration 7 took 15073 ms
Query 5 iteration 8 took 15176 ms
Query 5 iteration 9 took 15076 ms
```

PR:
```
Query 5 iteration 0 took 1282 ms
Query 5 iteration 1 took 930 ms
Query 5 iteration 2 took 940 ms
Query 5 iteration 3 took 882 ms
Query 5 iteration 4 took 891 ms
Query 5 iteration 5 took 903 ms
Query 5 iteration 6 took 903 ms
Query 5 iteration 7 took 900 ms
Query 5 iteration 8 took 905 ms
Query 5 iteration 9 took 905 ms
```

FYI @andygrove @jorgecarleitao

Closes #8965 from Dandandan/right_hash

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Andy Grove <andygrove73@gmail.com>
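
For readers unfamiliar with the build/probe distinction described in the commit message, the idea can be sketched roughly as follows. This is a minimal illustration and not the DataFusion implementation, which operates on Arrow record batches. The names `build_index` and `probe_batch` and the use of plain `i64` keys are assumptions made for the example. The hash table is built once over the build (left) side, and each probe (right) batch is only scanned and looked up, never indexed.

```rust
use std::collections::HashMap;

/// Build a hash index over the build (left) side: join key -> build row indices.
/// Hypothetical helper; done once, before any probe batch arrives.
fn build_index(build_keys: &[i64]) -> HashMap<i64, Vec<usize>> {
    let mut index: HashMap<i64, Vec<usize>> = HashMap::new();
    for (row, key) in build_keys.iter().enumerate() {
        index.entry(*key).or_default().push(row);
    }
    index
}

/// Probe one batch of right-side keys against the prebuilt index.
/// No hash table is created for the probe batch itself.
/// Returns (build_row, probe_row) pairs for an inner join.
fn probe_batch(
    index: &HashMap<i64, Vec<usize>>,
    probe_keys: &[i64],
) -> Vec<(usize, usize)> {
    let mut matches = Vec::new();
    for (probe_row, key) in probe_keys.iter().enumerate() {
        if let Some(build_rows) = index.get(key) {
            for &build_row in build_rows {
                matches.push((build_row, probe_row));
            }
        }
    }
    matches
}
```

With build keys `[1, 2, 2, 3]`, probing the batch `[2, 4]` yields the pairs `(1, 0)` and `(2, 0)`. The per-batch cost is one lookup per probe row plus the emitted matches, which is why a large probe side no longer pays for building a second index.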