Commits


Andy Grove authored and Andrew Lamb committed 53a36f5435b
ARROW-10703: [Rust] [DataFusion] Compute build-side of hash join once

This simply introduces a mutex so that the left-side is computed once and then re-used.

I partitioned the TPC-H SF=1 data set so that the order table had 2 partitions and the lineitem had 8 partitions. I confirmed that the build-side only got built once by adding debug logging.

I ran query 12 with master and this PR.

Benchmark results from master branch:

```
Query 12 iteration 0 took 856.2 ms
Query 12 iteration 1 took 577.5 ms
Query 12 iteration 2 took 579.7 ms
Query 12 iteration 3 took 562.9 ms
Query 12 iteration 4 took 590.9 ms
Query 12 avg time: 633.44 ms
```

Benchmark results from this PR:

```
Query 12 iteration 0 took 307.6 ms
Query 12 iteration 1 took 296.8 ms
Query 12 iteration 2 took 307.4 ms
Query 12 iteration 3 took 304.5 ms
Query 12 iteration 4 took 318.5 ms
Query 12 avg time: 306.95 ms
```

Performance is very close to 2x and that was the expected outcome since the build-side got built once instead of twice.

Closes #8981 from andygrove/join-compute-build-side-once

Authored-by: Andy Grove <andygrove73@gmail.com>
Signed-off-by: Andrew Lamb <andrew@nerdnetworks.org>
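
The compute-once idea described in the message can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the code from this commit: `HashJoin`, `BuildSide`, `probe_partition`, `run_join`, and the `build_count` counter are all hypothetical names, and the build side is simplified to a map from join key to left-side row indices.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the build-side hash table: join key -> left row indices.
type BuildSide = HashMap<i64, Vec<usize>>;

/// Hypothetical join operator (an assumption for illustration, not DataFusion's type).
struct HashJoin {
    /// Left (build) input, simplified to a single column of join keys.
    left_keys: Vec<i64>,
    /// Cached build side behind a mutex; `None` until the first caller builds it.
    build_side: Mutex<Option<Arc<BuildSide>>>,
    /// Counts how many times the build side was actually constructed.
    build_count: AtomicUsize,
}

impl HashJoin {
    fn new(left_keys: Vec<i64>) -> Self {
        HashJoin {
            left_keys,
            build_side: Mutex::new(None),
            build_count: AtomicUsize::new(0),
        }
    }

    /// Returns the shared build side, constructing it on the first call only.
    fn build_side(&self) -> Arc<BuildSide> {
        // The lock is held for the whole build, so concurrent callers wait
        // for the first build to finish instead of starting their own.
        let mut guard = self.build_side.lock().unwrap();
        if let Some(table) = guard.as_ref() {
            return Arc::clone(table);
        }
        self.build_count.fetch_add(1, Ordering::SeqCst);
        let mut table = BuildSide::new();
        for (row, key) in self.left_keys.iter().enumerate() {
            table.entry(*key).or_default().push(row);
        }
        let table = Arc::new(table);
        *guard = Some(Arc::clone(&table));
        table
    }

    /// Probes one right-side partition and returns the number of matching row pairs.
    fn probe_partition(&self, right_keys: &[i64]) -> usize {
        let table = self.build_side();
        right_keys
            .iter()
            .map(|key| table.get(key).map_or(0, |rows| rows.len()))
            .sum()
    }
}

/// Probes each right-side partition on its own thread.
/// Returns (total matching row pairs, number of build-side constructions).
fn run_join(left_keys: Vec<i64>, right_partitions: Vec<Vec<i64>>) -> (usize, usize) {
    let join = Arc::new(HashJoin::new(left_keys));
    let handles: Vec<_> = right_partitions
        .into_iter()
        .map(|partition| {
            let join = Arc::clone(&join);
            thread::spawn(move || join.probe_partition(&partition))
        })
        .collect();
    let matches: usize = handles.into_iter().map(|h| h.join().unwrap()).sum();
    (matches, join.build_count.load(Ordering::SeqCst))
}
```

Calling `run_join(vec![1, 2, 2, 3], vec![vec![1, 2], vec![2, 4], vec![3, 3]])` probes three partitions concurrently, yet the build counter ends at 1 because later callers find the cached table under the lock. Holding the mutex during the build trades some waiting on the first call for never building the table twice.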