Commits


Dewey Dunnington authored and GitHub committed 36824d1b97f
ARROW-17054: [R] Creating an Array from an object bigger than 2^31 results in an Array of length 0 (#14929)

This was a problem on both sides: the created `Array` really did have length 0 because of a cast to `int` on the way in, and on the way out, we were returning `int` from `array->length()`, which wouldn't have been correct either.

On the way in, we were using an imported C callable from the vctrs package: the C callable version of `vctrs::vec_size()`. This C callable took care of returning the correct value for normal vectors (`length()`), for data frames (`nrow()`), and for other classed vectors whose C concept of "length" was not the value returned at the R level (i.e., "record style vectors" like POSIXlt). At the C++ conversion level, we don't handle record style vectors: they are handled via the `VctrsExtensionType`, and C++ only sees the `vec_data()` (i.e., a data.frame). Because of this, implementing our own `vec_size()` that also supports long vectors was not hard. This allowed removing the link to vctrs for now (until such time as we need to use more of the exported C API).

On the way out, we already had the concept of `r_vec_size` from an earlier PR; we had just forgotten to use it in `Array__length()`.

Before this PR:

``` r
library(arrow, warn.conflicts = FALSE)

too_big <- raw(.Machine$integer.max + 1)
too_big_array <- Array$create(too_big)
length(too_big)
#> [1] 2147483648
length(too_big_array)
#> [1] 0
```

After this PR:

``` r
library(arrow, warn.conflicts = FALSE)

too_big <- raw(.Machine$integer.max + 1)
too_big_array <- Array$create(too_big)
length(too_big)
#> [1] 2147483648
length(too_big_array)
#> [1] 2147483648
```

Authored-by: Dewey Dunnington <dewey@voltrondata.com>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
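
The size rule described in the commit message (`length()` for normal vectors, `nrow()` for data frames, and a return type wide enough for long vectors) can be illustrated with a small C++ sketch. This is not the code from the PR: `SketchObject`, `sketch_vec_size`, and `fits_in_int` are hypothetical names invented for illustration, the struct stands in for a real R object, and the check assumes a platform where `int` is 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for an R object. This is NOT the R or arrow API;
// it only models the two cases the commit message mentions.
struct SketchObject {
  bool is_data_frame;  // true for a data frame, false for a plain vector
  int64_t length;      // C-level length: elements, or columns for a data frame
  int64_t n_rows;      // row count, meaningful only for a data frame
};

// Size as seen at the R level: nrow() for data frames, length() otherwise.
// Returning a 64-bit integer keeps sizes above 2^31 - 1 intact.
int64_t sketch_vec_size(const SketchObject& x) {
  assert(x.length >= 0 && x.n_rows >= 0);
  return x.is_data_frame ? x.n_rows : x.length;
}

// True when a size can be stored in an `int` without changing its value.
bool fits_in_int(int64_t n) {
  return n >= std::numeric_limits<int>::min() &&
         n <= std::numeric_limits<int>::max();
}
```

Under the 32-bit `int` assumption, a size of 2^31 (the `too_big` case above) fails `fits_in_int`, which is why the sketch carries sizes as `int64_t` end to end rather than narrowing them at either boundary.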