Commits


Kevin Gurney authored and GitHub committed 798132ce65e
GH-37473: [MATLAB] Add support for indexing `RecordBatch` columns by `Field` name (#37475)

### Rationale for this change

Currently, `arrow.tabular.Schema` supports indexing by `Field` name. However, `arrow.tabular.RecordBatch` does not. This pull request adds the ability to index columns in a `RecordBatch` by `Field` name.

### What changes are included in this PR?

1. Added support for indexing columns in a `RecordBatch` by `Field` name via the `column` method.

**Example**

```matlab
>> recordBatch = arrow.tabular.RecordBatch.fromArrays(...
    arrow.array([1, 2, 3]), ...
    arrow.array(["A", "B", "C"]), ...
    arrow.array([true, false, true]), ...
    ColumnNames=["A", "B", "C"] ...
)

recordBatch =

A: [ 1, 2, 3 ]
B: [ "A", "B", "C" ]
C: [ true, false, true ]

>> recordBatch.column("B")

ans =

[ "A", "B", "C" ]

>> recordBatch.column("C")

ans =

[ true, false, true ]
```

2. Removed comments about vectorizing `field` method of `Schema` and `column` method of `RecordBatch`. After further consideration, we believe it would make more sense to only allow these methods to accept scalar inputs. We could revisit support for vectorization if we overload the parenthesis operator (e.g. `recordBatch(rows, columns)`) in the future to return another `RecordBatch`/`Schema` that only includes the selected columns/fields.
3. Fixed typo in `tSchema.m`.

### Are these changes tested?

Yes.

1. Added tests for indexing by column name using the `column` method to `tRecordBatch.m`.

### Are there any user-facing changes?

Yes.

1. Users can now index `RecordBatch` columns by name using the syntax `column(name)`.

### Future Directions

1. Consider overloading parentheses-based indexing on `RecordBatch` and `Schema`.

* Closes: #37473

Lead-authored-by: Kevin Gurney <kgurney@mathworks.com>
Co-authored-by: Sarah Gilmore <sgilmore@mathworks.com>
Signed-off-by: Kevin Gurney <kgurney@mathworks.com>