Commits

Wes McKinney authored bbbbbfb1ecd
ARROW-1844: [C++] Add initial Unique benchmarks for int64, variable-length strings

I also fixed a bug this surfaced in the hash table resize (unit test coverage was not adequate).

Now we have

```
$ ./release/compute-benchmark
Run on (8 X 4200.16 MHz CPU s)
2017-11-28 18:33:53
Benchmark                                                         Time        CPU Iterations
-------------------------------------------------------------------------------------------------
BM_BuildDictionary/min_time:1.000                              1352 us    1352 us  1038  2.88639GB/s
BM_BuildStringDictionary/min_time:1.000                        3994 us    3994 us   351  75.5809MB/s
BM_UniqueInt64NoNulls/16M/50/min_time:1.000/real_time         35814 us   35816 us    39  3.49023GB/s
BM_UniqueInt64NoNulls/16M/1024/min_time:1.000/real_time      119656 us  119660 us    12  1069.73MB/s
BM_UniqueInt64NoNulls/16M/10k/min_time:1.000/real_time       174924 us  174930 us     8  731.747MB/s
BM_UniqueInt64NoNulls/16M/1024k/min_time:1.000/real_time     448425 us  448440 us     3  285.443MB/s
BM_UniqueInt64WithNulls/16M/50/min_time:1.000/real_time       49511 us   49513 us    29  2.52468GB/s
BM_UniqueInt64WithNulls/16M/1024/min_time:1.000/real_time    134519 us  134523 us    10  951.541MB/s
BM_UniqueInt64WithNulls/16M/10k/min_time:1.000/real_time     191331 us  191336 us     7  668.999MB/s
BM_UniqueInt64WithNulls/16M/1024k/min_time:1.000/real_time   533597 us  533613 us     3  239.882MB/s
BM_UniqueString10bytes/16M/50/min_time:1.000/real_time       150731 us  150736 us     9  1061.5MB/s
BM_UniqueString10bytes/16M/1024/min_time:1.000/real_time     256929 us  256938 us     5  622.739MB/s
BM_UniqueString10bytes/16M/10k/min_time:1.000/real_time      414412 us  414426 us     3  386.09MB/s
BM_UniqueString10bytes/16M/1024k/min_time:1.000/real_time   1744253 us 1744308 us     1  91.7298MB/s
BM_UniqueString100bytes/16M/50/min_time:1.000/real_time      563890 us  563909 us     2  2.77093GB/s
BM_UniqueString100bytes/16M/1024/min_time:1.000/real_time    704695 us  704720 us     2  2.21727GB/s
BM_UniqueString100bytes/16M/10k/min_time:1.000/real_time     995685 us  995721 us     2  1.56927GB/s
BM_UniqueString100bytes/16M/1024k/min_time:1.000/real_time  3584108 us 3584230 us     1  446.415MB/s
```

We can also refactor the hash table implementations without worrying too much about whether we're making things slower.

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes #1370 from wesm/ARROW-1844 and squashes the following commits:

638f1a11 [Wes McKinney] Decrease resize load factor to 0.5
2885c645 [Wes McKinney] Multiply bytes processed by state.iterations()
f7b36194 [Wes McKinney] Add initial Unique benchmarks for int64, strings