Commits


Alessandro Molina authored and GitHub committed e0ab40d3793
GH-41692: [Python] Improve substrait extended expressions support (#41693)

Addresses some missing features and usability issues when using PyArrow with Substrait ExtendedExpressions.

* GitHub Issue: #41692

- [x] Allow passing `BoundExpressions` for `Scanner(columns=X)` instead of a dict of expressions.
- [x] Allow passing `BoundExpressions` for `Scanner(filter=X)`, so that the user doesn't have to distinguish between `Expression` and `BoundExpressions` and can always just use `pyarrow.substrait.deserialize_expressions`.
- [x] Allow decoding `pyarrow.BoundExpressions` directly from `protobuf.Message`, thus allowing the use of substrait-python objects.
- [x] Return `memoryview` from methods encoding substrait, so that the result can be passed directly to substrait-python (or, more generally, other Python libraries) without a copy being involved.
- [x] Allow decoding messages from `memoryview`, so that the output of encoding functions can be sent back to decoding functions.
- [x] Allow encoding and decoding schemas to and from substrait.
- [x] When encoding schemas, return the extension types required for a substrait consumer to decode the schema.
- [x] Handle arrow extension types when decoding a schema.
- [x] Update docstrings and documentation.

---------

Co-authored-by: Raúl Cumplido <raulcumplido@gmail.com>