Heres, Daniel authored and Neville Dipale committed 408e5be0bc3 on 15 Dec 2020
ARROW-10810: [Rust] Improve comparison kernels performance

This PR shows that, after https://github.com/apache/arrow/pull/8842, there is still a ~2x performance difference (down from ~8x earlier) between using a builder and writing to a mutable buffer directly.
This also accounts for a ~5% difference on some DataFusion queries (when not using the `simd` feature, where the implementation doesn't use the builder). The bounds checks are also a bit expensive: some `value` functions explicitly skip them, whereas others (such as the one for strings) still perform them.
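
To illustrate the bounds-check point, here is a minimal std-only sketch. It is not the Arrow code: `eq_checked` and `eq_unchecked` are hypothetical names, and plain `f32` slices stand in for array values. Whether the checked version is actually slower depends on what the optimizer can prove.

```rust
/// Element-wise equality using ordinary indexing.
/// Each `a[i]` / `b[i]` access carries a bounds check unless the
/// optimizer manages to prove it unnecessary.
pub fn eq_checked(a: &[f32], b: &[f32]) -> Vec<bool> {
    assert_eq!(a.len(), b.len());
    (0..a.len()).map(|i| a[i] == b[i]).collect()
}

/// Same comparison with the bounds checks skipped explicitly.
pub fn eq_unchecked(a: &[f32], b: &[f32]) -> Vec<bool> {
    assert_eq!(a.len(), b.len());
    (0..a.len())
        // SAFETY: `i < a.len()`, and `a.len() == b.len()` was asserted above.
        .map(|i| unsafe { *a.get_unchecked(i) == *b.get_unchecked(i) })
        .collect()
}
```

Both functions return the same result; the second only moves the responsibility for staying in bounds from the compiler to the caller.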

I guess there will always be _some_ overhead in the builder, since it needs to do some bookkeeping, but I think it's a good idea to see how we can write kernels without losing too much performance.
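
A rough std-only sketch of the two styles, to make the bookkeeping difference concrete. This is not the Arrow API: `BitBuilder`, `lt_with_builder` and `lt_direct` are hypothetical stand-ins, and a `Vec<u8>` with least-significant-bit-first packing stands in for the boolean output buffer.

```rust
/// Hypothetical stand-in for a builder: appends one bit at a time and
/// does bookkeeping (length tracking, growth) on every call.
pub struct BitBuilder {
    bytes: Vec<u8>,
    len: usize,
}

impl BitBuilder {
    pub fn new() -> Self {
        BitBuilder { bytes: Vec::new(), len: 0 }
    }

    pub fn append(&mut self, value: bool) {
        // Grow by one byte whenever the next bit starts a new byte.
        if self.len % 8 == 0 {
            self.bytes.push(0);
        }
        if value {
            self.bytes[self.len / 8] |= 1u8 << (self.len % 8);
        }
        self.len += 1;
    }

    pub fn finish(self) -> Vec<u8> {
        self.bytes
    }
}

/// Builder-based kernel: one `append` call per element.
pub fn lt_with_builder(a: &[f32], b: &[f32]) -> Vec<u8> {
    assert_eq!(a.len(), b.len());
    let mut builder = BitBuilder::new();
    for (x, y) in a.iter().zip(b) {
        builder.append(x < y);
    }
    builder.finish()
}

/// Direct kernel: allocate the packed output once, then set bits in place.
pub fn lt_direct(a: &[f32], b: &[f32]) -> Vec<u8> {
    assert_eq!(a.len(), b.len());
    let mut out = vec![0u8; (a.len() + 7) / 8];
    for (i, (x, y)) in a.iter().zip(b).enumerate() {
        if x < y {
            out[i / 8] |= 1u8 << (i % 8);
        }
    }
    out
}
```

The two kernels produce identical bytes. The direct version knows the output size up front, so it avoids the per-element growth check and length update that the builder performs.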

FYI @jorgecarleitao

```
Gnuplot not found, using plotters backend
eq Float32              time:   [107.02 us 107.29 us 107.60 us]
                        change: [-54.994% -54.839% -54.681%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

eq scalar Float32       time:   [70.271 us 70.356 us 70.446 us]
                        change: [-48.540% -48.392% -48.258%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

neq Float32             time:   [71.580 us 71.655 us 71.732 us]
                        change: [-58.072% -58.001% -57.931%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe

neq scalar Float32      time:   [70.011 us 70.079 us 70.155 us]
                        change: [-59.055% -58.980% -58.908%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low severe
  3 (3.00%) high mild

lt Float32              time:   [70.945 us 70.991 us 71.038 us]
                        change: [-55.834% -55.757% -55.683%] (p = 0.00 < 0.05)
                        Performance has improved.

lt scalar Float32       time:   [50.708 us 50.789 us 50.882 us]
                        change: [-62.939% -62.825% -62.689%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

lt_eq Float32           time:   [106.29 us 106.40 us 106.52 us]
                        change: [-42.593% -42.470% -42.350%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

lt_eq scalar Float32    time:   [71.089 us 71.170 us 71.261 us]
                        change: [-52.021% -51.941% -51.857%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

gt Float32              time:   [71.759 us 71.939 us 72.131 us]
                        change: [-58.319% -58.190% -58.067%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

gt scalar Float32       time:   [38.748 us 38.782 us 38.821 us]
                        change: [-73.757% -73.691% -73.624%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

gt_eq Float32           time:   [102.79 us 102.87 us 102.96 us]
                        change: [-53.103% -52.953% -52.805%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  4 (4.00%) high mild
  3 (3.00%) high severe

gt_eq scalar Float32    time:   [55.034 us 55.109 us 55.201 us]
                        change: [-59.706% -59.544% -59.381%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild
```

Closes #8900 from Dandandan/comparison_kernels

Authored-by: Heres, Daniel <danielheres@gmail.com>
Signed-off-by: Neville Dipale <nevilledips@gmail.com>