Commits


Sasha Krassovsky authored and Antoine Pitrou committed 50fab73da58
ARROW-3998: [C++] Add TPC-H Generator

This PR contains an implementation of a multithreaded TPC-H dbgen, as well as an implementation of Q1 as a google benchmark. The advantage of this dbgen approach is that it is a scan node: it generates data on the fly and streams it over. As a result, I was for instance able to run scale factor 1000 on Q1 on my desktop with only 32 GB of RAM.

I did verify results of Q1. They don't exactly match the reference results, but they are quite close and well within what I'd expect the variance to be between random number generators.

```
-----------------------------------------------------------------
Benchmark                       Time             CPU   Iterations
-----------------------------------------------------------------
BM_Tpch_Q1/SF:1         186609936 ns       268825 ns          100
BM_Tpch_Q1/SF:10       1858114140 ns       276741 ns           10
BM_Tpch_Q1/SF:100     18561088470 ns       273067 ns            1
BM_Tpch_Q1/SF:1000   186103719755 ns       289445 ns            1
```

Closes #12537 from save-buffer/sasha_tpch

Lead-authored-by: Sasha Krassovsky <krassovskysasha@gmail.com>
Co-authored-by: Jonathan Keane <jkeane@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
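
The commit message credits the low memory footprint to generating data on the fly and streaming it rather than materializing the whole table. The following is a minimal single-threaded sketch of that idea only; it is not the code from this PR and does not use the Arrow API. `Batch`, `StreamingGenerator`, `Summary`, and `Consume` are hypothetical names, and the value range and batch size are arbitrary choices for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <random>
#include <vector>

// Hypothetical batch type: a single column of synthetic values.
struct Batch {
  std::vector<std::int64_t> quantity;
};

// Hypothetical streaming source: hands out rows one batch at a time,
// so only one batch is alive at any moment.
class StreamingGenerator {
 public:
  StreamingGenerator(std::int64_t total_rows, std::int64_t batch_size,
                     std::uint64_t seed)
      : remaining_(total_rows),
        batch_size_(batch_size),
        rng_(seed),
        dist_(1, 50) {  // arbitrary value range, for illustration only
    assert(total_rows >= 0 && batch_size > 0);
  }

  // Returns the next batch, or std::nullopt once all rows were produced.
  std::optional<Batch> Next() {
    if (remaining_ == 0) return std::nullopt;
    const std::int64_t n = std::min(remaining_, batch_size_);
    Batch batch;
    batch.quantity.reserve(static_cast<std::size_t>(n));
    for (std::int64_t i = 0; i < n; ++i) {
      batch.quantity.push_back(dist_(rng_));
    }
    remaining_ -= n;
    return batch;
  }

 private:
  std::int64_t remaining_;
  std::int64_t batch_size_;
  std::mt19937_64 rng_;
  std::uniform_int_distribution<std::int64_t> dist_;
};

// Running aggregate kept by the consumer; its size is independent of
// the number of rows generated.
struct Summary {
  std::int64_t rows = 0;
  std::int64_t sum = 0;
  std::size_t max_batch_rows = 0;
};

// Drains the generator, folding each batch into the summary and then
// letting the batch go out of scope.
Summary Consume(StreamingGenerator& gen) {
  Summary s;
  while (std::optional<Batch> batch = gen.Next()) {
    s.max_batch_rows = std::max(s.max_batch_rows, batch->quantity.size());
    for (std::int64_t q : batch->quantity) {
      s.sum += q;
      ++s.rows;
    }
  }
  return s;
}
```

In this sketch, peak memory is bounded by `batch_size` rather than by the total row count, which is the property that would let a large scale factor fit on a machine with limited RAM. A multithreaded version, as described in the commit, would additionally need per-thread random state and a way to hand batches to downstream consumers.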