Jörn Horstmann authored and Neville Dipale committed 5666dcaf524 on 15 Nov 2020
ARROW-10079: [Rust] Benchmark and improve count bits

This refactors the calculation of null counts to use the bitchunk iterator and the `count_ones` intrinsic. Array creation improves by 3% to 5%. The biggest impact is on slicing an existing array: for a slice length of 2048, performance in a microbenchmark doubles.
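As a rough illustration of the idea (not the code in this commit), the sketch below counts set bits 64 at a time with `u64::count_ones` and then handles the trailing bytes and bits separately. The function names `count_set_bits` and `null_count` are hypothetical. The sketch assumes an LSB-first validity bitmap that starts at bit offset 0, whereas the real implementation also has to handle slices that begin at an arbitrary bit offset.

```rust
/// Count the set bits among the first `len_bits` bits of `bytes`.
/// Assumes LSB-first bit order and a bit offset of 0 (an assumption
/// made for this sketch only).
fn count_set_bits(bytes: &[u8], len_bits: usize) -> usize {
    assert!(len_bits <= bytes.len() * 8, "bitmap too short");

    let full_bytes = len_bits / 8;
    let mut chunks = bytes[..full_bytes].chunks_exact(8);
    let mut count = 0usize;

    // Main loop: one `count_ones` call per 64-bit word.
    for chunk in &mut chunks {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        count += u64::from_le_bytes(buf).count_ones() as usize;
    }

    // Whole bytes that did not fill a 64-bit word.
    for byte in chunks.remainder() {
        count += byte.count_ones() as usize;
    }

    // Trailing bits of a partially used last byte.
    let rem_bits = len_bits % 8;
    if rem_bits > 0 {
        let mask = (1u8 << rem_bits) - 1;
        count += (bytes[full_bytes] & mask).count_ones() as usize;
    }

    count
}

/// Null count of an array with `len` slots, where a set bit means "valid".
fn null_count(validity: &[u8], len: usize) -> usize {
    len - count_set_bits(validity, len)
}
```

For example, `null_count(&[0b0000_0101], 3)` sees two valid slots out of three and returns 1. Processing whole words keeps the hot loop free of per-bit branches, which is what gives the compiler room to use a population-count instruction or vectorize.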

Benchmark results on a Ryzen 3700U. LLVM seems to be able to vectorize the `count_ones` intrinsic, so performance on machines with better AVX units should be even higher.

```
Running /home/jhorstmann/Source/github/apache/arrow/rust/target/release/deps/array_slice-d438c5aeed9bef19
Gnuplot not found, using plotters backend
array_slice 128         time:   [150.68 ns 151.74 ns 153.14 ns]
                        change: [-11.525% -10.357% -9.3284%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

array_slice 512         time:   [158.77 ns 161.03 ns 163.62 ns]
                        change: [-25.922% -24.956% -23.945%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  5 (5.00%) high mild
  8 (8.00%) high severe

array_slice 2048        time:   [170.17 ns 171.24 ns 172.60 ns]
                        change: [-50.388% -49.865% -49.375%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe
```

```
Running /home/jhorstmann/Source/github/apache/arrow/rust/target/release/deps/array_from_vec-8a972c208e7a6334
Gnuplot not found, using plotters backend
array_from_vec 128      time:   [750.62 ns 751.69 ns 752.85 ns]
                        change: [-8.2512% -7.3658% -6.5917%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

array_from_vec 256      time:   [1.2501 us 1.2580 us 1.2676 us]
                        change: [-0.9517% +0.1364% +1.1850%] (p = 0.81 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) high mild
  4 (4.00%) high severe

array_from_vec 512      time:   [2.1603 us 2.1643 us 2.1690 us]
                        change: [-2.8122% -2.1984% -1.6379%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  10 (10.00%) high mild
  3 (3.00%) high severe

array_string_from_vec 128
                        time:   [3.2196 us 3.2288 us 3.2395 us]
                        change: [+1.4107% +1.9839% +2.5419%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  7 (7.00%) high mild
  5 (5.00%) high severe

array_string_from_vec 256
                        time:   [4.8112 us 4.8352 us 4.8685 us]
                        change: [-5.2564% -3.8312% -2.7672%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe

array_string_from_vec 512
                        time:   [8.2588 us 8.2802 us 8.3049 us]
                        change: [-0.5974% -0.2991% +0.0027%] (p = 0.05 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

struct_array_from_vec 128
                        time:   [4.5771 us 4.5947 us 4.6206 us]
                        change: [-6.6706% -5.3873% -4.2340%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  5 (5.00%) high mild
  10 (10.00%) high severe

struct_array_from_vec 256
                        time:   [6.7617 us 6.7887 us 6.8182 us]
                        change: [-4.9262% -4.2419% -3.6528%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

struct_array_from_vec 512
                        time:   [9.8926 us 9.9639 us 10.052 us]
                        change: [-5.4057% -3.9331% -2.5242%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

struct_array_from_vec 1024
                        time:   [15.812 us 15.862 us 15.922 us]
                        change: [-6.1966% -5.6634% -5.0338%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) high mild
  8 (8.00%) high severe
```

```
Running /home/jhorstmann/Source/github/apache/arrow/rust/target/release/deps/builder-26beb41d8ad91167
Gnuplot not found, using plotters backend
bench_primitive         time:   [4.8277 ms 4.8879 ms 4.9558 ms]
                        thrpt:  [807.14 MiB/s 818.35 MiB/s 828.55 MiB/s]
                 change:
                        time:   [-5.3415% -3.4401% -1.3973%] (p = 0.00 < 0.05)
                        thrpt:  [+1.4171% +3.5626% +5.6429%]
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

bench_bool              time:   [2.5176 ms 2.5230 ms 2.5292 ms]
                        thrpt:  [197.69 MiB/s 198.18 MiB/s 198.60 MiB/s]
                 change:
                        time:   [-19.412% -19.203% -18.988%] (p = 0.00 < 0.05)
                        thrpt:  [+23.438% +23.767% +24.088%]
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
```

Closes #8663 from jhorstmann/ARROW-10079-benchmark-and-improve-count-bits

Authored-by: Jörn Horstmann <git@jhorstmann.net>
Signed-off-by: Neville Dipale <nevilledips@gmail.com>