Commits


Sarah Gilmore authored and GitHub committed fe2d926ef38
GH-41803: [MATLAB] Add C Data Interface format import/export functionality for `arrow.tabular.RecordBatch` (#41817)

### Rationale for this change

This pull request adds two new APIs for importing and exporting `arrow.tabular.RecordBatch` instances using the C Data Interface format.

**Example:**

```matlab
>> T = table((1:3)', ["A"; "B"; "C"]);
>> expected = arrow.recordBatch(T)

expected =

  Arrow RecordBatch with 3 rows and 2 columns:

    Schema:

        Var1: Float64 | Var2: String

    First Row:

        1 | "A"

>> cArray = arrow.c.Array();
>> cSchema = arrow.c.Schema();

% Export the RecordBatch to C Data Interface Format
>> expected.export(cArray.Address, cSchema.Address);

% Import the RecordBatch from C Data Interface Format
>> actual = arrow.tabular.RecordBatch.import(cArray, cSchema)

actual =

  Arrow RecordBatch with 3 rows and 2 columns:

    Schema:

        Var1: Float64 | Var2: String

    First Row:

        1 | "A"

% The RecordBatch is the same after round-tripping to the C Data Interface format
>> isequal(actual, expected)

ans =

  logical

   1
```

### What changes are included in this PR?

1. Added a new method `arrow.tabular.RecordBatch.export` for exporting `RecordBatch` objects to the C Data Interface format.
2. Added a new static method `arrow.tabular.RecordBatch.import` for importing `RecordBatch` objects from the C Data Interface format.
3. Added a new internal class `arrow.c.internal.RecordBatchImporter` for importing `RecordBatch` objects from the C Data Interface format.

### Are these changes tested?

Yes.

1. Added a new test file `matlab/test/arrow/c/tRoundtripRecordBatch.m` which has basic round-trip tests for importing and exporting `RecordBatch` objects.

### Are there any user-facing changes?

Yes.

1. Two new user-facing methods were added to `arrow.tabular.RecordBatch`. The first is `arrow.tabular.RecordBatch.export(cArrowArrayAddress, cArrowSchemaAddress)`. The second is `arrow.tabular.RecordBatch.import(cArray, cSchema)`. These APIs can be used to export/import `RecordBatch` objects using the C Data Interface format.

### Future Directions

1. Add integration tests for sharing data between MATLAB/mlarrow and Python/pyarrow running in the same process using the [MATLAB interface to Python](https://www.mathworks.com/help/matlab/call-python-libraries.html).
2. Add support for the Arrow [C stream interface format](https://arrow.apache.org/docs/format/CStreamInterface.html).

### Notes

1. Thanks to @ kevingurney for the help with this feature!

* GitHub Issue: #41803

Authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Sarah Gilmore <sgilmore@mathworks.com>