Commits


Wes McKinney authored and Antoine Pitrou committed 1ae946c8bfe
ARROW-6910: [C++][Python] Set jemalloc default configuration to release dirty pages more aggressively back to the OS dirty_decay_ms and muzzy_decay_ms to 0 by default, add C++ / Python option to configure this

The current default behavior causes applications dealing in large datasets to hold on to a large amount of physical operating system memory. While this may improve performance in some cases, it empirically seems to be causing problems for users.

There's some discussion of this issue in some other contexts here:

https://github.com/jemalloc/jemalloc/issues/1128

Here is a test script I used to check the RSS while reading a large Parquet file (~10GB in memory) in a loop (requires downloading the file http://public-parquet-test-data.s3.amazonaws.com/big.snappy.parquet):

https://gist.github.com/wesm/c75ad3b6dcd37231aaacf56a80a5e401

This patch enables jemalloc background page reclamation and reduces the time decay from 10 seconds to 1 second so that memory is returned to the OS more aggressively.

Closes #5701 from wesm/ARROW-6910 and squashes the following commits:

8fc8aa8c1 <Antoine Pitrou> Revert "Try to fix protobuf-related clang warning"
ab67abb3a <Antoine Pitrou> Try to fix protobuf-related clang warning
929047034 <Wes McKinney> Review comments, disable PLASMA_VALGRIND in Travis
8c4d367f1 <Wes McKinney> Use background_thread:true and 1000ms decay
daa541605 <Wes McKinney> Set jemalloc dirty_decay_ms and muzzy_decay_ms to 0 by default, add function to set the values to something else

Lead-authored-by: Wes McKinney <wesm+git@apache.org>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
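
As an illustration of the Python option mentioned in the commit message, the following is a minimal sketch of adjusting the decay time at runtime. It assumes the option is exposed as `pyarrow.jemalloc_set_decay_ms`, as in released pyarrow builds; the wrapper name `set_jemalloc_decay` is hypothetical and not part of the patch. The wrapper returns False rather than failing when pyarrow is missing or was built without jemalloc.

```python
def set_jemalloc_decay(decay_ms: int) -> bool:
    """Try to set jemalloc's dirty/muzzy page decay time, in milliseconds.

    Hypothetical helper for illustration. Returns True if the setting was
    applied, False if pyarrow or its jemalloc allocator is unavailable.
    """
    if decay_ms < 0:
        raise ValueError("decay_ms must be non-negative")

    try:
        import pyarrow as pa
    except ImportError:
        return False  # pyarrow is not installed

    # Assumption: the setter is exposed at the top level of pyarrow.
    setter = getattr(pa, "jemalloc_set_decay_ms", None)
    if setter is None:
        return False  # this pyarrow version predates the option

    try:
        # 0 asks jemalloc to release unused pages as soon as possible;
        # larger values let it keep pages around for reuse.
        setter(decay_ms)
    except Exception:
        return False  # e.g. the build does not use jemalloc
    return True


if __name__ == "__main__":
    applied = set_jemalloc_decay(0)
    print("decay setting applied:", applied)
```

A value of 0 trades some allocation performance for lower resident memory; applications that repeatedly allocate large buffers may prefer a longer decay.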