Commits


Wes McKinney authored and Benjamin Kietzman committed 50a0f312b3c
ARROW-7059: [C++][Parquet] Mostly fix performance regression when reading Parquet file with many columns

This fixes a quadratic growth issue where an unordered set containing a list of all the column indices is constructed for each column. This code still seems a bit messy but we have a benchmark already to keep an eye on this (if we would actually run and track the benchmarks though...)

Without patch
Elapsed: 61.533573419999996 seconds

With patch
Elapsed: 2.484871 seconds

With 0.14.1
Elapsed: 1.6306039680000002 seconds

Closes #6181 from wesm/ARROW-7059 and squashes the following commits:

3205e98aa <Wes McKinney> Fix UBSAN failure
f5e5212bb <Wes McKinney> Use shared_ptr for included leaves to prevent N^2 construction of unordered_set

Authored-by: Wes McKinney <wesm+git@apache.org>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
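For readers unfamiliar with the pattern, the following is a minimal sketch of the idea named in the squashed commit (sharing one set of included leaves through a `shared_ptr` instead of building it once per column). It is not the Arrow code: `ColumnReaderSketch`, `BuildPerColumn`, and `BuildShared` are hypothetical names invented for illustration, and the counters exist only to make the difference in work visible.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a per-column reader; not part of the Arrow API.
struct ColumnReaderSketch {
  int column_index;
  // Read-only set of all selected leaf column indices, shared by every reader.
  std::shared_ptr<const std::unordered_set<int>> included_leaves;
};

// Quadratic variant: each of the N columns builds its own N-element set.
// Returns the total number of set elements constructed (N * N for unique indices).
std::size_t BuildPerColumn(const std::vector<int>& indices) {
  std::size_t elements_built = 0;
  for (std::size_t i = 0; i < indices.size(); ++i) {
    std::unordered_set<int> included(indices.begin(), indices.end());
    elements_built += included.size();
  }
  return elements_built;
}

// Linear variant: the set is built once and every reader holds a shared_ptr to it.
// Writes the number of set elements constructed (N for unique indices) to the out-parameter.
std::vector<ColumnReaderSketch> BuildShared(const std::vector<int>& indices,
                                            std::size_t* elements_built) {
  assert(elements_built != nullptr);
  auto included =
      std::make_shared<std::unordered_set<int>>(indices.begin(), indices.end());
  *elements_built = included->size();

  std::vector<ColumnReaderSketch> readers;
  readers.reserve(indices.size());
  for (int idx : indices) {
    // Copying the shared_ptr is O(1); the set itself is not copied.
    readers.push_back(ColumnReaderSketch{idx, included});
  }
  return readers;
}
```

With N selected columns, the first function constructs N sets of N elements each, while the second constructs a single set and hands out N pointers to it, which is the shape of the change the commit title describes.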