Commits


Jinpeng authored and GitHub committed ec77fab2f5c
PARQUET-2316: [C++] Allow partial PreBuffer in the parquet FileReader (#36192)

### Rationale for this change

The current FileReader can only work in one of two modes: coalescing (when PreBuffer is called) and non-coalescing (when PreBuffer is not called), due to the if statement [here](https://github.com/apache/arrow/blob/main/cpp/src/parquet/file_reader.cc#L203).

Since PreBuffer essentially caches all specified column chunks, it raises memory-usage concerns for systems with a tight memory budget. In such scenarios, one may want to PreBuffer some small chunks while still being able to read the remaining chunks using BufferedInputStream.

### What changes are included in this PR?

Changes to support partial PreBuffer on a subset of column chunks, and a unit test.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

Authored-by: jp0317 <zjpzlz@gmail.com>
Signed-off-by: Gang Wu <ustcwg@gmail.com>