Commits


Dewey Dunnington authored and Jonathan Keane committed aade058c32d
ARROW-14804: [R] import_from_c() / export_to_c() methods should accept external pointers

This PR limits the use of external pointers whose value is cast to double. It only uses the double-cast pointer when passing a pointer to Python because this is the only way I can get this to work (perhaps there is a better way that will result from this review or that could be implemented in the future).

The PR changes the implementation of the `Pointer<>` wrapper to:

- Do the right thing and use R's externalptr type internally (and accept pointers defined in this way).
- Accept pointers defined as `void*` cast to `uintptr_t` cast to `double` (for backward compatibility).
- Accept pointers defined as `integer64` (because it's what @amol- thought to do at first and somebody else might, too).
- Accept pointers defined as `raw(<pointer size>)` (because it's a way that pointers could get passed to Python without changing reticulate).
- Accept pointers defined as a character representation (as parsed by `strtoll()`) (as suggested by @pitrou).

I imagine that we don't have to use one or more of these, but thought it best to implement and test them all and use the review to eliminate any options that will never get used.

Things I'm not sure about:

- My implementations of parsing and exporting pointers (new to this!).
- The non-unwind-protected-ness of the `Pointer<>` constructor: it's written such that it's unlikely to longjmp but could possibly use `cpp11::safe[]()` to make this impossible. This doesn't seem to be used elsewhere in the Arrow R package so I didn't implement it here.
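The pointer representations listed above can be sketched in plain C++ as follows. This is a minimal illustration under stated assumptions, not the code from the PR: every function name here is hypothetical, the character form is assumed to be a base-10 string, and the double round trip is assumed to be exact only while the address fits in the 53-bit mantissa of a `double` (true for typical user-space addresses on current 64-bit platforms).

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// NOTE: all helper names below are hypothetical and for illustration only;
// they are not the functions used by the Arrow R package.

// void* -> uintptr_t -> double. Exact only while the address fits in 53 bits.
double pointer_to_double(void* ptr) {
  return static_cast<double>(reinterpret_cast<std::uintptr_t>(ptr));
}

void* pointer_from_double(double value) {
  return reinterpret_cast<void*>(static_cast<std::uintptr_t>(value));
}

// Character representation: the address written as a base-10 string
// (assumed format) and parsed back with strtoll().
std::string pointer_to_string(void* ptr) {
  return std::to_string(reinterpret_cast<std::uintptr_t>(ptr));
}

void* pointer_from_string(const std::string& text) {
  long long value = std::strtoll(text.c_str(), nullptr, 10);
  return reinterpret_cast<void*>(static_cast<std::uintptr_t>(value));
}

// Raw representation: the pointer's own bytes, sizeof(void*) of them,
// in the platform's native byte order.
void pointer_to_raw(void* ptr, unsigned char* out) {
  std::memcpy(out, &ptr, sizeof(void*));
}

void* pointer_from_raw(const unsigned char* bytes) {
  void* ptr = nullptr;
  std::memcpy(&ptr, bytes, sizeof(void*));
  return ptr;
}
```

Each pair is meant to round-trip, so `pointer_from_string(pointer_to_string(p)) == p` should hold for any valid address `p` below 2^63. The raw form is the only one that never passes through a numeric conversion, which is why it is insensitive to address magnitude but sensitive to byte order.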
Reprex that tests this from the R end of things:

``` r
# remotes::install_github("paleolimbot/arrow/r@r-pointers")
library(arrow, warn.conflicts = FALSE)
library(reticulate)

pa <- reticulate::import("pyarrow")
py <- pa$array(c(1, 2, 3))
py == Array$create(c(1, 2, 3))
#> Array
#> <bool>
#> [
#>   true,
#>   true,
#>   true
#> ]

pa <- reticulate::import("pyarrow", convert = FALSE)
r <- Array$create(c(4, 5, 6))
py <- pa$concat_arrays(list(r))
py
#> [
#>   4,
#>   5,
#>   6
#> ]
```

...and a reprex with some exploration of how reticulate deals with the Python conversion:

``` r
# remotes::install_github("paleolimbot/arrow/r@r-pointers")
library(reticulate)

# external pointers are sort of supported by reticulate
(an_external_pointer <- arrow:::allocate_arrow_schema())
#> <pointer: 0x121117f30>

# but maybe not?
r_to_py(an_external_pointer)
#> <capsule object NULL at 0x1116f3d20>

# currently we do double because this can cast to a Python integer
# which is how all the _import_from_c() and _export_to_c() methods
# are implemented
(ptr_dbl <- arrow:::external_pointer_addr_double(an_external_pointer))
#> [1] 4849762096
r_to_py(ptr_dbl)
#> 4849762096.0

# int64 definitely not implemented yet (tries to interpret bytes as REAL)
(ptr_int64 <- arrow:::external_pointer_addr_integer64(an_external_pointer))
#> integer64
#> [1] 4849762096
r_to_py(ptr_int64)
#> 2.396100842e-314

# raw is an option that we could use without a PR into reticulate
(ptr_raw <- arrow:::external_pointer_addr_raw(an_external_pointer))
#> [1] 30 7f 11 21 01 00 00 00
r_to_py(ptr_raw)
#> python.builtin.bytearray (8 bytes)
py_run_string("print(r.ptr_raw)")

# character is another option that we could use without a PR into reticulate
(ptr_chr <- arrow:::external_pointer_addr_character(an_external_pointer))
#> [1] "4849762096"
r_to_py(ptr_chr)
#> 4849762096
```

Closes #11919 from paleolimbot/r-pointers

Authored-by: Dewey Dunnington <dewey@fishandwhistle.net>
Signed-off-by: Jonathan Keane <jkeane@gmail.com>
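The `2.396100842e-314` printed for the `integer64` case in the reprex above is consistent with the 8 bytes of the integer being read as a `double` rather than converted by value, as the reprex comment suggests. A small C++ sketch of that difference follows; the function names are hypothetical, and it assumes an IEEE 754 64-bit `double`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reuse the integer's 8 bytes as a double without
// converting the value (what "interpret bytes as REAL" amounts to).
double bits_as_double(std::int64_t value) {
  static_assert(sizeof(double) == sizeof(std::int64_t),
                "this sketch assumes a 64-bit double");
  double out;
  std::memcpy(&out, &value, sizeof(out));
  return out;
}

// Hypothetical helper: ordinary numeric conversion, which keeps the
// address intact as long as it fits in 53 bits.
double value_as_double(std::int64_t value) {
  return static_cast<double>(value);
}
```

With the address from the reprex, `bits_as_double(4849762096)` gives a subnormal number of roughly 2.3961e-314, while `value_as_double(4849762096)` gives 4849762096.0, the value that can safely be turned back into a pointer.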