Commits


mwish authored and GitHub committed 951d92a7d5f
GH-38438: [C++] Dataset: Trying to fix the async bug in Parquet dataset (#38466)

### Rationale for this change

Originally reported in https://github.com/apache/arrow/issues/38438

1. When PreBuffer is enabled by default, the code in `RowGroupGenerator::FetchNext` switches to async mode. This makes the state handling more complex.
2. In `RowGroupGenerator::FetchNext`, `[this]` is captured without `shared_from_this`. This is not necessarily a problem; however, `this->executor_` may point to an invalid address if the object has already been destructed.

This patch also fixes a lifetime issue I found in CSV handling.

### What changes are included in this PR?

1. Fix the handling in `cpp/src/parquet/arrow/reader.cc` as described above
2. Fix a lifetime problem in CSV

### Are these changes tested?

I tested it locally, but I don't know how to write a unit test here. Feel free to help.

### Are there any user-facing changes?

Bugfix

* Closes: #38438

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
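
The lifetime hazard described in item 2 of the rationale can be illustrated with a minimal, self-contained sketch. It is not the code changed in this PR: `ToyExecutor`, `ToyGenerator`, and `RunLifetimeDemo` are hypothetical names, and the executor is assumed to be a simple queue that runs tasks later. The point is only that a deferred task capturing `shared_from_this()` keeps its owner alive until the task has run, whereas a raw `[this]` capture would leave a dangling pointer in the same scenario.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an executor: it queues tasks and runs them later.
struct ToyExecutor {
  std::vector<std::function<void()>> tasks;

  void Spawn(std::function<void()> task) { tasks.push_back(std::move(task)); }

  void RunAll() {
    // Move the queue into a local so the tasks (and whatever they captured)
    // are destroyed when this function returns.
    std::vector<std::function<void()>> pending = std::move(tasks);
    tasks.clear();
    for (auto& task : pending) task();
  }
};

// Hypothetical generator; it is NOT Arrow's RowGroupGenerator.
class ToyGenerator : public std::enable_shared_from_this<ToyGenerator> {
 public:
  explicit ToyGenerator(int rows_per_group) : rows_per_group_(rows_per_group) {}

  // Schedules deferred work. Capturing `self` (a shared_ptr) instead of a raw
  // `this` keeps the generator alive until the queued task has finished.
  void FetchNext(ToyExecutor& executor, int* out) {
    assert(out != nullptr);
    std::shared_ptr<ToyGenerator> self = shared_from_this();
    executor.Spawn([self, out] { *out = self->rows_per_group_; });
  }

 private:
  int rows_per_group_;
};

struct DemoResult {
  bool alive_while_pending;
  bool released_after_run;
  int fetched;
};

DemoResult RunLifetimeDemo() {
  ToyExecutor executor;
  int fetched = 0;
  std::weak_ptr<ToyGenerator> observer;
  {
    auto generator = std::make_shared<ToyGenerator>(42);
    observer = generator;
    generator->FetchNext(executor, &fetched);
  }  // The caller's reference is dropped here; the queued task still holds one.

  DemoResult result;
  result.alive_while_pending = !observer.expired();
  executor.RunAll();
  result.fetched = fetched;
  result.released_after_run = observer.expired();
  return result;
}
```

Calling `RunLifetimeDemo()` from a `main` function exercises the scenario: the generator outlives the scope that created it only for as long as the pending task needs it. With a `[this]` capture instead of `self`, the task would dereference a destroyed object once the enclosing scope ends, which is undefined behavior.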