Commits


mwish authored and GitHub committed 60708150f1e
GH-38432: [C++][Parquet] Try to fix performance regression in the DictByteArrayDecoderImpl (#38784)

### Rationale for this change

Make some of the changes mentioned in https://github.com/apache/arrow/issues/38432

I believe this might fix https://github.com/apache/arrow/issues/38577

Problem 1: `BinaryHelper` might call both `Prepare()` and `Prepare(estimated-output-binary-length)` for data. This might be because:

1. For plain-encoded ByteArray, `len_` is similar to the data page size, so `len_` is relevant to the `Reserve` call.
2. For dictionary encoding, the data page is just an RLE-encoded page, so its `len_` might not be directly related to the output binary length.

Problem 2: `Prepare` uses `::arrow::kBinaryMemoryLimit` as the min-value; it should use `this->chunk_space_remaining_` instead.

Problem 3: `std::optional<int64_t>` is hard for some compilers to optimize.

### What changes are included in this PR?

Document the behavior of `BinaryHelper` and try to fix it.

### Are these changes tested?

No

### Are there any user-facing changes?

Regression fixes

* Closes: #38432

Lead-authored-by: mwish <maplewish117@gmail.com>
Co-authored-by: mwish <1506118561@qq.com>
Co-authored-by: Gang Wu <ustcwg@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>