Commits


Antoine Pitrou authored and Wes McKinney committed 01202ccc841
ARROW-2568: [Python] Expose thread pool size setting to Python, and deprecate "nthreads" where possible

There are two areas where `nthreads` cannot be replaced immediately by the global thread pool:

1. when converting Pandas data to Arrow table or record batch, since it uses a Python `ThreadPoolExecutor` from pure Python code (see `dataframe_to_arrays` in `pandas_compat.py`)
2. when reading or writing Parquet data, since `parquet-cpp` relies on parallelization facilities in the stable version of Arrow (see https://github.com/apache/parquet-cpp/pull/467)

Elsewhere, we add a `use_threads` boolean argument and deprecate `nthreads`.

Author: Antoine Pitrou <antoine@python.org>

Closes #2078 from pitrou/ARROW-2568 and squashes the following commits:

91187bf6 <Antoine Pitrou> Move use_threads flag into PandasOptions
a7aeed0e <Antoine Pitrou> Factor out secession predicate
f601d4e9 <Antoine Pitrou> ThreadPool::State pointer is const
4567a2c3 <Antoine Pitrou> Add a two-argument variant of ParallelFor() that uses the global CPU thread pool
d0a527ab <Antoine Pitrou> Restore single-thread path in WriteTableToBlocks()
2171e6e2 <Antoine Pitrou> On Windows, avoid shutting down the global thread pool at process exit
934e5a11 <Antoine Pitrou> Add & operator between Statuses
96397076 <Antoine Pitrou> Factor out deprecation logic
6b685406 <Antoine Pitrou> Fix MSVC warning
6b6f64a1 <Antoine Pitrou> Make ThreadPool capacity an int, not a size_t
61669755 <Antoine Pitrou> Rename CPUThreadPool() to GetCpuThreadPool()
d4eb8d47 <Antoine Pitrou> Export ThreadPool and CPUThreadPool()
6985afe6 <Antoine Pitrou> Lint
fcca62f5 <Antoine Pitrou> Emit FutureWarning (which is visible by default) rather than DeprecationWarning
172fba37 <Antoine Pitrou> Use C++ API, rather than multiprocessing, in pyarrow.{cpu_count,set_cpu_count}
50310b88 <Antoine Pitrou> Add API function to get desired ThreadPool capacity
3417325c <Antoine Pitrou> ARROW-2568: WIP
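
To illustrate the deprecation pattern this commit describes (a `use_threads` boolean replacing `nthreads`, with a `FutureWarning` emitted for the old argument), here is a minimal pure-Python sketch. The names `resolve_use_threads` and `convert` are hypothetical and are not taken from the pyarrow sources. Mapping `nthreads > 1` to `use_threads=True` is likewise an assumption about how a legacy value might reasonably be interpreted.

```python
import warnings


def resolve_use_threads(use_threads=True, nthreads=None):
    """Hypothetical helper: map a legacy `nthreads` value onto `use_threads`."""
    if nthreads is not None:
        # FutureWarning is shown by default, unlike DeprecationWarning.
        # stacklevel=3 attributes the warning to the caller of the public
        # function that invoked this helper.
        warnings.warn(
            "`nthreads` is deprecated, pass `use_threads` instead",
            FutureWarning,
            stacklevel=3,
        )
        # Assumption: any request for more than one thread means
        # "use the global thread pool".
        use_threads = nthreads > 1
    return use_threads


def convert(data, use_threads=True, nthreads=None):
    """Hypothetical stand-in for a function that accepts both arguments."""
    use_threads = resolve_use_threads(use_threads, nthreads)
    return {"rows": len(data), "use_threads": use_threads}
```

Calling `convert([1, 2, 3], nthreads=4)` under these assumptions emits one `FutureWarning` and proceeds as if `use_threads=True` had been passed. Keeping the logic in a single helper mirrors the "Factor out deprecation logic" step in the commit list, so every public entry point warns in the same way.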