Commits


Fiona La authored and Sutou Kouhei committed 9c5e4b3e76b
ARROW-13185: [MATLAB] Create a single MEX gateway function which delegates to specific C++ functions

## Overview

This pull request implements a more scalable and sustainable approach to organizing the C++ functionality that needs to be exposed to MATLAB. It adds a single MEX gateway function, `mexfcn('<cpp-function-name>', <cpp-function-argument-1>, ..., <cpp-function-argument-N>)`, which delegates to specific C++ functions. To use `mexfcn`, the directory containing `mexfcn.<mex-extension>` must be added to the MATLAB path.

Advantages of this approach:

- Organization: all C++ functions that are exposed to MATLAB are registered in one location
- Reduce the complexity of managing and linking many MEX files and ensuring that they are added to the MATLAB path
- Reduce cognitive load for adding new functions
- Avoid polluting the source tree with build artifacts
- Reduce build times: building a single MEX function is faster than building potentially hundreds
- Reduce binary bloat caused by creating a separate MEX file for every MEX function
- Enable flexibility in terms of where C++ implementation files live

## Implementation

1. One MEX function, `mexfcn`, defined in `matlab/src/cpp/arrow/matlab/mex/mex_util.h`, dispatches to individual C++ implementation files. For example, to invoke the featherread functionality, which is implemented in C++, from MATLAB:

   ```matlab
   >> [variables, metadata] = mexfcn('featherread', 'featherfile.feather');
   ```

2. Functionality implemented in C++ that we want to expose in MATLAB is registered in a function map in the file `matlab/src/cpp/arrow/matlab/mex/mex_functions.h`.

3. Restructured the source tree layout and performed general code cleanup in preparation for feature implementation work:

   3.1. Split the source code into `matlab/src/matlab` and `matlab/src/cpp`.

   3.2. Made packages, namespaces, and directories consistent in naming and hierarchy to simplify navigation and header inclusion.

   3.3. Renamed the MATLAB package `mlarrow` to `arrow`, as the `ml` is superfluous.

4. Refactored `matlab/CMakeLists.txt`:

   4.1. Build a shared library, `arrow_matlab`, that contains the C++ functionality for the interface.

   4.2. macOS: explicitly add the path of `arrow` to the `rpath` of `arrow_matlab`, as paths of libraries output by imported targets are not automatically included.

   4.3. Windows:

   - Copy the shared library `arrow.dll` to the directory where the MEX function lives.
   - Add the path to the MATLAB and GTest shared libraries to the ctest `ENVIRONMENT` when building tests.
   - Specify the release version of the MSVC Runtime Libraries for all targets created in the CMake file.

5. Enabled the `install` target:

   5.1. Utilize [`CMAKE_INSTALL_PREFIX`](https://cmake.org/cmake/help/v3.0/variable/CMAKE_INSTALL_PREFIX.html) to allow users to customize the install location. The platform-specific default install location is used if the user does not specify a custom value.

   5.2. Once installed, the interface's source files and libraries are relocatable on all platforms.

   5.3. As part of the install step, add the path to the install directory to the [MATLAB Search Path](https://uk.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html) by one of the following:

   - Option 1: Call `addpath` and `savepath` to modify the `pathdef.m` file that MATLAB uses on startup. This option is on by default. However, it can only be used if CMake has the appropriate permissions to modify `pathdef.m`. This option is toggled on and off by the flag `MATLAB_ADD_INSTALL_DIR_TO_SEARCH_PATH`.
   - Option 2: Add an `addpath` command to the `startup.m` file located at the [`userpath`](https://uk.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html#:~:text=on%20Search%20Path.-,userpath%20Folder%20on%20the%20Search%20Path,-The%20userpath%20folder). This option can be used if a user does not have the permissions to modify the `pathdef.m` file. This option is off by default and is toggled on and off by the flag `MATLAB_ADD_INSTALL_DIR_TO_STARTUP_FILE`.
   - Option 3: Add the path to the install directory to the [`MATLABPATH`](https://uk.mathworks.com/help/matlab/matlab_env/add-folders-to-matlab-search-path-at-startup.html#btpajlw) environment variable. The paths listed in `MATLABPATH` are added to the MATLAB Search Path on startup.
   - The MATLAB Actions install of MATLAB used during CI does not have `userpath` set, nor do we have edit permissions for `pathdef.m`. Therefore, we use the third option and set the `MATLABPATH` environment variable in `matlab.yml`.

## Testing

Qualified the `CMakeLists.txt` changes by building and running all tests:

- On Windows 10 (Ninja and Visual Studio), macOS 11.5 (Make and Ninja), and Debian 10 (Make and Ninja)
- Configurations: build both Arrow and GTest, use provided `ARROW_HOME`, use provided `GTEST_ROOT`, use both `ARROW_HOME` and `GTEST_ROOT`.

## Future Directions

1. Investigate why the default CMake behavior does not link the test executables against the correct MSVC Runtime libraries (i.e. `ucrtbase.dll` versus `ucrtbased.dll`) when building with Ninja on Windows.
2. Add support for specifying function names and arguments as MATLAB strings to `mexfcn`. Currently, only character vectors are supported.
3. Refactor `mexfcn` to use [MATLAB Data Arrays](https://uk.mathworks.com/help/matlab/matlab-data-array.html) (MDAs) and [C++ MEX](https://uk.mathworks.com/help/matlab/cpp-mex-file-applications.html).
4. Investigate running MATLAB tests using [`matlab_add_unit_test`](https://cmake.org/cmake/help/latest/module/FindMatlab.html#command:matlab_add_unit_test).

## Notes

1. Thank you for all of your help on this pull request, @sgilmore10 and @kevingurney!
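
For readers unfamiliar with the gateway pattern described in Implementation items 1 and 2, the following standalone C++ sketch shows one way a single entry point can look up a function name in a map and delegate to the matching handler. It is an illustration under stated assumptions, not the code in this commit: it does not use the MEX API, arguments are modeled as plain strings, and `Args`, `Handler`, `FunctionMap`, `Registry`, `Dispatch`, `FeatherRead`, and `ExampleUsage` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for MEX inputs and outputs: a list of strings.
using Args = std::vector<std::string>;
using Handler = std::function<Args(const Args&)>;
using FunctionMap = std::unordered_map<std::string, Handler>;

// Toy handler. A real implementation would read a Feather file; this one
// only echoes the filename so the dispatch path can be observed.
Args FeatherRead(const Args& args) {
  if (args.size() != 1) {
    throw std::invalid_argument("featherread expects exactly one filename");
  }
  return {"variables from " + args[0], "metadata from " + args[0]};
}

// Single registration point: every exposed function is listed here once.
const FunctionMap& Registry() {
  static const FunctionMap registry = {{"featherread", FeatherRead}};
  return registry;
}

// Gateway: look up the requested name and delegate to its handler.
Args Dispatch(const std::string& name, const Args& args) {
  const auto it = Registry().find(name);
  if (it == Registry().end()) {
    throw std::invalid_argument("unknown function: " + name);
  }
  return it->second(args);
}

// Example call, mirroring mexfcn('featherread', 'featherfile.feather').
void ExampleUsage() {
  const Args out = Dispatch("featherread", {"featherfile.feather"});
  assert(out.size() == 2);  // one entry for variables, one for metadata
}
```

In this sketch, exposing another function means adding one entry to `Registry()`, which is the property the "registered in one location" advantage refers to.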

Closes #12004 from lafiona/ARROW_13185

Lead-authored-by: Fiona La <fionala@mathworks.com>
Co-authored-by: sgilmore <sgilmore@mathworks.com>
Co-authored-by: lafiona <fionala7@gmail.com>
Co-authored-by: Kevin Gurney <kgurney@mathworks.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>