Commits


Wes McKinney authored and GitHub committed 53752adc6b8
ARROW-16756: [C++] Introduce non-owning ArraySpan, ExecSpan data structures and refactor ScalarKernels to use them (#13364)

Parent issue: ARROW-16755. Also resolves ARROW-16819.

ArraySpan has no shared pointers at all and is much cheaper to pass around and copy, and it basically eliminates the current significant overhead associated with ExecBatch and ExecBatchIterator.

This PR isn't going to show meaningful performance gains in function or expression evaluation; that will require implementing a more streamlined expression evaluator that is based on ArraySpan. This is only an intermediate patch to try to limit the scope of work as much as possible and facilitate follow-up PRs. I have a long list of things I would like to do pretty much right away in follow-up patches.

Some notes:

* The ArraySpan retains pointers to the buffers that were used to populate it, because in many places in the existing scalar kernels we have to "go back" to a `shared_ptr<ArrayData>`.
* There are multiple places where having only `const DataType*` or `const Scalar*` would disallow the use of APIs that require either `shared_ptr<DataType>` or `shared_ptr<Scalar>`, so I added `std::enable_shared_from_this` on these classes. I don't know whether this increases the initialization cost of `make_shared<T>` (if anyone knows, please say so), but I hope that in the future we can remove `std::enable_shared_from_this`. It would be better to have `Scalar::Copy` and `DataType::Copy` methods so this isn't necessary, but rather than try to hack this into this PR, I left it for follow-on work.
* A few kernels have been refactored to always write into preallocated memory (IsIn, IndexIn, IsNull, IsValid).
* Some internal APIs were best refactored to use ArraySpan, such as ArrayBuilder::AppendArraySlice and stuff in arrow/util/int_util.h.

In the interest of getting this merged sooner rather than later, rather than trying to make everything perfect here, let's fix any glaring or serious issues that you see and leave the many other improvements for follow-up patches; otherwise any work in the scalar kernels codebase will be blocked.

Authored-by: Wes McKinney <wesm@apache.org>
Signed-off-by: Wes McKinney <wesm@apache.org>
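
The ownership difference the commit message describes can be illustrated with a small sketch. This is not the Arrow implementation: `ToyBuffer`, `ToyArrayData`, `ToyArraySpan`, `MakeSpan`, and `Sum` are hypothetical names made up for this example, and the layout is simplified to a single values buffer. It only aims to show why a view made of plain pointers is cheaper to copy than a structure holding a `shared_ptr`, and how keeping a pointer to the owning handle lets code "go back" to the owning form.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for illustration only; not the Arrow classes.
using ToyBuffer = std::vector<int32_t>;

// Owning representation: every copy updates the buffer's reference count.
struct ToyArrayData {
  std::shared_ptr<ToyBuffer> values;
  int64_t offset;
  int64_t length;
};

// Non-owning view: plain pointers and integers, so copying is a memcpy.
struct ToyArraySpan {
  const int32_t* values = nullptr;
  int64_t offset = 0;
  int64_t length = 0;
  // Pointer to the owning handle, kept so callers can rebuild a ToyArrayData.
  const std::shared_ptr<ToyBuffer>* owner = nullptr;

  // "Go back" to the owning form; this is where the refcount is bumped.
  ToyArrayData ToArrayData() const {
    ToyArrayData out;
    out.values = *owner;
    out.offset = offset;
    out.length = length;
    return out;
  }
};

static_assert(std::is_trivially_copyable<ToyArraySpan>::value,
              "the view must stay trivially copyable");

// Build a view over `data`. The view must not outlive `data`.
inline ToyArraySpan MakeSpan(const ToyArrayData& data) {
  assert(data.values != nullptr);
  assert(data.offset + data.length <=
         static_cast<int64_t>(data.values->size()));
  ToyArraySpan span;
  span.values = data.values->data();
  span.offset = data.offset;
  span.length = data.length;
  span.owner = &data.values;
  return span;
}

// A kernel-like function taking the view by value: no refcount traffic.
inline int64_t Sum(ToyArraySpan span) {
  int64_t total = 0;
  for (int64_t i = 0; i < span.length; ++i) {
    total += span.values[span.offset + i];
  }
  return total;
}
```

In this sketch, copying a `ToyArraySpan` or passing it by value never touches the reference count; only `ToArrayData()` does. The trade-off is that the caller has to guarantee the owning `ToyArrayData` outlives every view created from it.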