Commits


Li Jin authored and GitHub committed b1e85a6d0cd
GH-36672: [Python][C++] Add support for vector function UDF (#36673)

### Rationale for this change

In Arrow compute, there are four main types of functions: Scalar, Vector, ScalarAggregate and HashAggregate. Previous work added support for Scalar, ScalarAggregate (https://github.com/apache/arrow/issues/35515) and HashAggregate (https://github.com/apache/arrow/issues/36252). I think it makes sense to add support for vector functions as well, to complete all non-decomposable UDF kernel support.

Internally, we plan to extend Acero to implement a "SegmentVectorNode" which would use this API to invoke a vector function on a segment-by-segment basis. This would make it possible to use constant memory to compute things like "rank the value across all rows per segment using a Python UDF".

### What changes are included in this PR?

The change is very similar to the support for aggregate functions: it includes code to register the vector UDF, and a kernel that invokes the vector UDF on the given inputs.

### Are these changes tested?

Yes. Added a new test.

### Are there any user-facing changes?

Yes. This adds a user-facing API to register a vector function.

* Closes: #36672

Authored-by: Li Jin <ice.xelloss@gmail.com>
Signed-off-by: Li Jin <ice.xelloss@gmail.com>
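
### Illustration (not part of the PR)

The "rank per segment" use case from the rationale can be sketched in plain Python to show what distinguishes a vector function: each output value depends on the whole input batch, not only on its own row. This is a minimal sketch that does not use the Arrow or Acero APIs; the names `rank_udf` and `apply_per_segment` are hypothetical, and it assumes the rows arrive already ordered by segment key.

```python
from itertools import groupby
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple


def rank_udf(values: Sequence[float]) -> List[int]:
    """Hypothetical vector UDF: return the 1-based rank of each value.

    The output has the same length as the input, and every output value
    depends on all input values (ties are broken by input position).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def apply_per_segment(
    rows: Iterable[Tuple[str, float]],
    vector_fn: Callable[[Sequence[float]], List[int]],
) -> Iterator[int]:
    """Invoke vector_fn once per segment and yield its results in row order.

    Assumes rows are already ordered by segment key, so only one segment
    needs to be held in memory at a time.
    """
    for _key, group in groupby(rows, key=lambda row: row[0]):
        segment_values = [value for _, value in group]
        yield from vector_fn(segment_values)


rows = [("a", 30.0), ("a", 10.0), ("a", 20.0), ("b", 5.0), ("b", 7.0)]
ranks = list(apply_per_segment(rows, rank_udf))
print(ranks)  # [3, 1, 2, 1, 2]
```

Because the function is applied one segment at a time, peak memory is bounded by the largest segment rather than by the full input, which is the property the rationale describes for the planned segment-wise node.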