Commits


Heres, Daniel authored and Andy Grove committed 15503ee1d24
ARROW-11042: [Rust][DataFusion] Increase default batch size

This increases the default batch size 8x, from `4096` to `32768`, as it improves the performance of quite a few operations. I simply increased the size until performance stopped improving on my machine. Note that CSV reading is also faster with bigger batches on the bigger data sources.

This PR:

```
Loading table 'part' into memory
Loaded table 'part' into memory in 125 ms
Loading table 'supplier' into memory
Loaded table 'supplier' into memory in 10 ms
Loading table 'partsupp' into memory
Loaded table 'partsupp' into memory in 381 ms
Loading table 'customer' into memory
Loaded table 'customer' into memory in 126 ms
Loading table 'orders' into memory
Loaded table 'orders' into memory in 961 ms
Loading table 'lineitem' into memory
Loaded table 'lineitem' into memory in 6382 ms
Loading table 'nation' into memory
Loaded table 'nation' into memory in 2 ms
Loading table 'region' into memory
Loaded table 'region' into memory in 2 ms
Query 12 iteration 0 took 220.2 ms
Query 12 iteration 1 took 223.2 ms
Query 12 iteration 2 took 222.4 ms
Query 12 iteration 3 took 222.2 ms
Query 12 iteration 4 took 221.8 ms
Query 12 iteration 5 took 222.0 ms
Query 12 iteration 6 took 223.1 ms
Query 12 iteration 7 took 223.7 ms
Query 12 iteration 8 took 222.5 ms
Query 12 iteration 9 took 222.9 ms
Query 12 avg time: 222.40 ms
```

Master:

```
Loading table 'part' into memory
Loaded table 'part' into memory in 116 ms
Loading table 'supplier' into memory
Loaded table 'supplier' into memory in 7 ms
Loading table 'partsupp' into memory
Loaded table 'partsupp' into memory in 386 ms
Loading table 'customer' into memory
Loaded table 'customer' into memory in 115 ms
Loading table 'orders' into memory
Loaded table 'orders' into memory in 1048 ms
Loading table 'lineitem' into memory
Loaded table 'lineitem' into memory in 7673 ms
Loading table 'nation' into memory
Loaded table 'nation' into memory in 0 ms
Loading table 'region' into memory
Loaded table 'region' into memory in 0 ms
Query 12 iteration 0 took 596.1 ms
Query 12 iteration 1 took 602.0 ms
Query 12 iteration 2 took 608.1 ms
Query 12 iteration 3 took 607.9 ms
Query 12 iteration 4 took 613.5 ms
Query 12 iteration 5 took 615.3 ms
Query 12 iteration 6 took 611.6 ms
Query 12 iteration 7 took 609.8 ms
Query 12 iteration 8 took 615.7 ms
Query 12 iteration 9 took 616.9 ms
Query 12 avg time: 609.68 ms
```

Query 1 also improves, though by a smaller margin.

This PR:

```
Query 1 iteration 0 took 653.0 ms
Query 1 iteration 1 took 653.4 ms
Query 1 iteration 2 took 652.3 ms
Query 1 iteration 3 took 658.9 ms
Query 1 iteration 4 took 655.1 ms
Query 1 iteration 5 took 662.0 ms
Query 1 iteration 6 took 659.7 ms
Query 1 iteration 7 took 662.7 ms
Query 1 iteration 8 took 669.0 ms
Query 1 iteration 9 took 665.7 ms
Query 1 avg time: 659.19 ms
```

Master:

```
Query 1 iteration 0 took 708.8 ms
Query 1 iteration 1 took 714.5 ms
Query 1 iteration 2 took 700.4 ms
Query 1 iteration 3 took 713.7 ms
Query 1 iteration 4 took 707.5 ms
Query 1 iteration 5 took 727.8 ms
Query 1 iteration 6 took 727.9 ms
Query 1 iteration 7 took 721.3 ms
Query 1 iteration 8 took 717.3 ms
Query 1 iteration 9 took 729.4 ms
Query 1 avg time: 716.85 ms
```

Closes #9021 from Dandandan/batch_size

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Andy Grove <andygrove73@gmail.com>
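
For a sense of scale, the average times reported in the benchmark logs can be turned into speedup factors. The following is a small illustrative sketch, not part of the commit: the `speedup` helper is a hypothetical name introduced here, and the numbers are copied from the "avg time" lines in the logs.

```rust
/// Hypothetical helper (not from the commit): how many times faster
/// `new_ms` is than `baseline_ms`. Values above 1.0 mean an improvement.
fn speedup(baseline_ms: f64, new_ms: f64) -> f64 {
    baseline_ms / new_ms
}

fn main() {
    // (query, master avg ms, PR avg ms), taken from the "avg time" log lines.
    let results = [
        ("Query 12", 609.68, 222.40),
        ("Query 1", 716.85, 659.19),
    ];

    for (name, master_ms, pr_ms) in results.iter() {
        println!("{}: {:.2}x faster", name, speedup(*master_ms, *pr_ms));
    }

    // The batch size change itself: 4096 * 8 = 32768.
    assert_eq!(4096 * 8, 32768);
}
```

With these inputs, Query 12 comes out at roughly 2.74x and Query 1 at roughly 1.09x, which matches the commit's description of Query 1 as the smaller improvement.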