Commits


Antoine Pitrou authored and Korn, Uwe committed f1ef7084785
ARROW-2319: [C++] Add BufferedOutputStream class

Also add benchmarks for FileOutputStream and BufferedOutputStream. They are not ideal as they use `/dev/null`, which can lead to extravagant numbers. Once ARROW-1018 is solved, we should be able to make them slightly more realistic.

Benchmark numbers here (Ubuntu 16.04, x86-64):

```
Benchmark                                                              Time      CPU  Iterations
------------------------------------------------------------------------------------------------
BM_FileOutputStreamSmallWrites/min_time:1.000/repeats:2             1043 ns  1043 ns     1347551  316.284MB/s
BM_FileOutputStreamSmallWrites/min_time:1.000/repeats:2             1045 ns  1045 ns     1347551  315.844MB/s
BM_FileOutputStreamSmallWrites/min_time:1.000/repeats:2_mean        1044 ns  1044 ns     1347551  316.064MB/s
BM_FileOutputStreamSmallWrites/min_time:1.000/repeats:2_stddev         1 ns     1 ns           0  225.293kB/s
BM_FileOutputStreamLargeWrites/min_time:1.000/repeats:2              270 ns   270 ns     5165540  373.673GB/s
BM_FileOutputStreamLargeWrites/min_time:1.000/repeats:2              270 ns   270 ns     5165540  372.756GB/s
BM_FileOutputStreamLargeWrites/min_time:1.000/repeats:2_mean         270 ns   270 ns     5165540  373.215GB/s
BM_FileOutputStreamLargeWrites/min_time:1.000/repeats:2_stddev         0 ns     0 ns           0  469.353MB/s
BM_BufferedOutputStreamSmallWrites/min_time:1.000/repeats:2          179 ns   179 ns     7830925  1.79612GB/s
BM_BufferedOutputStreamSmallWrites/min_time:1.000/repeats:2          178 ns   178 ns     7830925  1.80651GB/s
BM_BufferedOutputStreamSmallWrites/min_time:1.000/repeats:2_mean     179 ns   179 ns     7830925  1.80132GB/s
BM_BufferedOutputStreamSmallWrites/min_time:1.000/repeats:2_stddev     1 ns     1 ns           0  5.31812MB/s
BM_BufferedOutputStreamLargeWrites/min_time:1.000/repeats:2          290 ns   290 ns     4822102  347.58GB/s
BM_BufferedOutputStreamLargeWrites/min_time:1.000/repeats:2          290 ns   290 ns     4822102  346.958GB/s
BM_BufferedOutputStreamLargeWrites/min_time:1.000/repeats:2_mean     290 ns   290 ns     4822102  347.269GB/s
BM_BufferedOutputStreamLargeWrites/min_time:1.000/repeats:2_stddev     0 ns     0 ns           0  318.542MB/s
```

The 350GB/s number for large writes most likely means the kernel completely disregards the data. Still, for small writes we see that there is a real benefit in not emitting system calls every write.

Author: Antoine Pitrou <antoine@python.org>

Closes #1903 from pitrou/ARROW-2319-buffered-output-stream and squashes the following commits:

e3b3cea1 <Antoine Pitrou> Fix lint error
8493c596 <Antoine Pitrou> Avoid calling close() to check that a fd is open (!)
cbf029ce <Antoine Pitrou> Try to fix MSVC compilation failure
2fcd45a4 <Antoine Pitrou> ARROW-2319: Add BufferedOutputStream class
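
The small-write result in the commit message comes from coalescing many small writes into fewer calls on the underlying stream. The following is a minimal sketch of that idea only, not Arrow's actual `BufferedOutputStream` implementation: `CountingSink` and `SketchBufferedWriter` are hypothetical names invented for illustration, and the sink merely counts calls in place of issuing real system calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a file descriptor: every Write() call models one
// "expensive" operation (e.g. a system call) and is counted.
struct CountingSink {
  std::string data;
  std::size_t num_writes = 0;

  void Write(const char* bytes, std::size_t n) {
    data.append(bytes, n);
    ++num_writes;
  }
};

// Illustrative buffering wrapper (an assumption, not the Arrow class):
// small writes are copied into a fixed-size buffer and forwarded in bulk.
class SketchBufferedWriter {
 public:
  SketchBufferedWriter(CountingSink* sink, std::size_t capacity)
      : sink_(sink), buffer_(capacity), size_(0) {
    assert(sink != nullptr && capacity > 0);
  }

  void Write(const char* bytes, std::size_t n) {
    if (n == 0) return;
    if (n > buffer_.size() - size_) {
      Flush();  // not enough room left: push buffered bytes out first
    }
    if (n >= buffer_.size()) {
      // Large write: the buffer is empty here, so forwarding directly
      // preserves ordering and avoids a pointless copy.
      sink_->Write(bytes, n);
      return;
    }
    std::memcpy(buffer_.data() + size_, bytes, n);
    size_ += n;
  }

  // Forward any buffered bytes to the sink in a single call.
  void Flush() {
    if (size_ > 0) {
      sink_->Write(buffer_.data(), size_);
      size_ = 0;
    }
  }

 private:
  CountingSink* sink_;
  std::vector<char> buffer_;
  std::size_t size_;  // number of valid bytes currently in buffer_
};
```

With a 64-byte buffer and 4-byte writes, sixteen writes fit in the buffer before a flush is needed, so the sink sees roughly one call per sixteen writes. The large-write path bypasses the buffer, which is consistent with the benchmark showing no gain (and a slight overhead) for large writes.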