Commits


Dewey Dunnington authored and GitHub committed 702dbf3982e
ARROW-17637: [R] as.Date fails going from timestamp[us] to timestamp[s] (#14935)

Before this PR:

``` r
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
#> Loading required package: timechange

# Use as_datetime() because as.POSIXct() truncates the fractional seconds
ds <- InMemoryDataset$create(data.frame(x = as_datetime('2022-05-05T00:00:01.676632')))

ds %>% mutate(date = as.Date(x)) %>% collect()
#> Error in `compute.arrow_dplyr_query()` at r/R/dplyr-collect.R:22:2:
#> ! Invalid: Casting from timestamp[us, tz=UTC] to timestamp[s, tz=UTC] would lose data: 1651708801676632
#> /Users/dewey/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec.cc:821  kernel_->exec(kernel_ctx_, input, out)
#> /Users/dewey/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec.cc:789  ExecuteSingleSpan(input, &output)
#> /Users/dewey/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec/expression.cc:608  executor->Execute( ExecBatch(std::move(arguments), all_scalar ? 1 : input.length), &listener)
#> /Users/dewey/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec/expression.cc:590  ExecuteScalarExpression(call->arguments[i], input, exec_context)
#> /Users/dewey/Desktop/rscratch/arrow/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->exec_context())
#> /Users/dewey/Desktop/rscratch/arrow/cpp/src/arrow/record_batch.cc:334  ReadNext(&batch)
#> /Users/dewey/Desktop/rscratch/arrow/cpp/src/arrow/record_batch.cc:348  ToRecordBatches()
#> Backtrace:
#>      ▆
#>   1. ├─ds %>% mutate(date = as.Date(x)) %>% collect()
#>   2. ├─dplyr::collect(.)
#>   3. └─arrow:::collect.arrow_dplyr_query(.)
#>   4.   └─arrow:::compute.arrow_dplyr_query(x) at r/R/dplyr-collect.R:22:2
#>   5.     └─base::tryCatch(...) at r/R/dplyr-collect.R:40:2
#>   6.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
#>   7.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
#>   8.           └─value[[3L]](cond)
#>   9.             └─arrow:::augment_io_error_msg(e, call, schema = schema()) at r/R/dplyr-collect.R:49:6
#>  10.               └─rlang::abort(msg, call = call) at r/R/util.R:251:2
```

<sup>Created on 2022-12-13 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>

After this PR:

``` r
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
#> Loading required package: timechange

# Use as_datetime() because as.POSIXct() truncates the fractional seconds
ds <- InMemoryDataset$create(data.frame(x = as_datetime('2022-05-05T00:00:01.676632')))

ds %>% mutate(date = as.Date(x)) %>% collect()
#>                     x       date
#> 1 2022-05-05 00:00:01 2022-05-05
```

<sup>Created on 2022-12-13 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>

Authored-by: Dewey Dunnington <dewey@voltrondata.com>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>