Commits


Jeroen van Straten authored and GitHub committed 66c66d040bb
ARROW-16904: [C++] min/max not deterministic if Parquet files have multiple row groups (#13509)

The min/max aggregate compute kernels seemed to discard their state between partitions, so they would only aggregate the last partition they see (in each thread). This is the simplest change I could come up with to fix this, but honestly I'm not sure why the `local` variable even exists. It seems to me it could just be replaced with `this->state` directly, since there doesn't seem to be any failure path where `this->state` isn't updated from `local`. Am I missing something?

ETA: I tried to make a test case for this, only to find that there is already a test case for this. In that case however, it seems that the merging of the partition results is done by `Merge`'ing the result of separate `Consume` calls, rather than chaining multiple `Consume` calls. I'm not sure how to trigger the latter behavior from a normal C++ test case.

Lead-authored-by: Jeroen van Straten <jeroen.van.straten@gmail.com>
Co-authored-by: Aldrin M <octalene.dev@pm.me>
Co-authored-by: octalene <octalene.dev@pm.me>
Signed-off-by: David Li <li.davidm96@gmail.com>
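
The failure mode described in the commit message can be illustrated with a small standalone sketch. This is not Arrow's kernel code: `MinMaxState`, `BuggyMinMax`, and `FixedMinMax` are hypothetical names, and the only things taken from the message are the `Consume`/`Merge` split and the per-call `local` accumulator. It assumes the bug amounts to a `Consume` call replacing the running state with `local` instead of folding `local` into it.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

// Hypothetical, simplified stand-in for a min/max aggregation state.
struct MinMaxState {
  int min = std::numeric_limits<int>::max();
  int max = std::numeric_limits<int>::lowest();

  // Fold a single value into the state.
  void MergeOne(int value) {
    min = std::min(min, value);
    max = std::max(max, value);
  }

  // Fold another state into this one.
  MinMaxState& operator+=(const MinMaxState& other) {
    min = std::min(min, other.min);
    max = std::max(max, other.max);
    return *this;
  }
};

// Assumed shape of the bug: each Consume call overwrites the running state,
// so only the last batch seen contributes to the result.
struct BuggyMinMax {
  MinMaxState state;

  void Consume(const std::vector<int>& batch) {
    MinMaxState local;
    for (int v : batch) local.MergeOne(v);
    state = local;  // discards everything consumed before this call
  }
};

// Assumed shape of the fix: fold the per-batch result into the running
// state, so chained Consume calls accumulate.
struct FixedMinMax {
  MinMaxState state;

  void Consume(const std::vector<int>& batch) {
    MinMaxState local;
    for (int v : batch) local.MergeOne(v);
    state += local;  // keeps earlier batches
  }

  // Combine the states of two aggregators that consumed separately.
  void Merge(const FixedMinMax& other) { state += other.state; }
};
```

With batches `{1, 9}` followed by `{4, 5}`, the overwriting variant ends with min 4 and max 5, while the accumulating variant ends with min 1 and max 9. A test that calls `Consume` once per aggregator and then combines them through `Merge` never chains two `Consume` calls on the same state, so in this sketch it would pass even against the overwriting variant.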