Dragoș Moldovan-Grünfeld authored and GitHub committed 3e0eea1244a
ARROW-14575: [R] Allow functions with `pkg::` prefixes (#13160)

This PR will allow the use of namespacing with bindings:

``` r
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)

test_df <- tibble(
  date = as.Date(c("2022-03-22", "2021-07-30", NA))
)

test_df %>%
  mutate(ddate = lubridate::as_datetime(date)) %>%
  collect()
#> # A tibble: 3 × 2
#>   date       ddate
#>   <date>     <dttm>
#> 1 2022-03-22 2022-03-22 00:00:00
#> 2 2021-07-30 2021-07-30 00:00:00
#> 3 NA         NA

test_df %>%
  arrow_table() %>%
  mutate(ddate = lubridate::as_datetime(date)) %>%
  collect()
#> # A tibble: 3 × 2
#>   date       ddate
#>   <date>     <dttm>
#> 1 2022-03-22 2022-03-22 00:00:00
#> 2 2021-07-30 2021-07-30 00:00:00
#> 3 NA         NA
```

<sup>Created on 2022-05-14 by the [reprex package](https://reprex.tidyverse.org) (v2.0.1)</sup>

The approach (option 1 from the [design doc](https://docs.google.com/document/d/1Om-vYb31b6p_u4tyl86SGW1DrtWBfksq8NYG1Seqaxg/edit#)):

- [x] add functionality to allow binding registration with the `pkg::fun()` name;
  - [x] Modify `register_binding()` to register 2 identical copies for each `pkg::fun` binding, namely `fun` and `pkg::fun`.
  - [x] Add a binding for the `::` operator, which helps with retrieving bindings from the function registry.
  - [x] Add generic unit tests for the `pkg::fun` functionality.
  - [x] Warn for a duplicated binding registration.
- [x] register `nse_funcs` requiring _indirect_ mapping
  - [x] register each binding with and without the `pkg::` prefix.
  - [x] add / update unit tests for the `nse_funcs` bindings to include at least one `pkg::fun()` call for each binding

    <details>
    <summary>unit tests for conditional bindings</summary>

    - [x] `"dplyr::coalesce"`
    - [x] `"dplyr::if_else"`
    - [x] `"base::ifelse"`
    - [x] `"dplyr::case_when"`

    </details>

    <details>
    <summary>unit tests for date/time bindings</summary>

    - [x] `"base::strptime"`
    - [x] `"base::strftime"`
    - [x] `"lubridate::format_ISO8601"`
    - [x] `"lubridate::is.Date"`
    - [x] `"lubridate::is.instant"`
    - [x] `"lubridate::is.timepoint"`
    - [x] `"lubridate::is.POSIXct"`
    - [x] `"lubridate::date"`
    - [x] `"lubridate::second"`
    - [x] `"lubridate::wday"`
    - [x] `"lubridate::week"`
    - [x] `"lubridate::month"`
    - [x] `"lubridate::am"`
    - [x] `"lubridate::pm"`
    - [x] `"lubridate::tz"`
    - [x] `"lubridate::semester"`
    - [x] `"lubridate::make_datetime"`
    - [x] `"lubridate::make_date"`
    - [x] `"base::ISOdatetime"`
    - [x] `"base::ISOdate"`
    - [x] `"base::as.Date"`
    - [x] `"lubridate::as_date"`
    - [x] `"lubridate::as_datetime"`
    - [x] `"lubridate::decimal_date"`
    - [x] `"lubridate::date_decimal"`
    - [x] `"base::difftime"`
    - [x] `"base::as.difftime"`
    - [x] `"lubridate::make_difftime"`
    - [x] `"lubridate::dminutes"`
    - [x] `"lubridate::dhours"`
    - [x] `"lubridate::ddays"`
    - [x] `"lubridate::dweeks"`
    - [x] `"lubridate::dmonths"`
    - [x] `"lubridate::dyears"`
    - [x] `"lubridate::dseconds"`
    - [x] `"lubridate::dmilliseconds"`
    - [x] `"lubridate::dmicroseconds"`
    - [x] `"lubridate::dnanoseconds"`
    - [x] `"lubridate::dpicoseconds"`
    - [x] `"lubridate::parse_date_time"`
    - [x] `"lubridate::ymd"`
    - [x] `"lubridate::ydm"`
    - [x] `"lubridate::mdy"`
    - [x] `"lubridate::myd"`
    - [x] `"lubridate::dmy"`
    - [x] `"lubridate::dym"`
    - [x] `"lubridate::ym"`
    - [x] `"lubridate::my"`
    - [x] `"lubridate::yq"`
    - [x] `"lubridate::fast_strptime"`

    </details>

    <details>
    <summary>unit tests for math bindings</summary>

    - [x] `"base::log"`
    - [x] `"base::logb"`
    - [x] `"base::pmin"`
    - [x] `"base::pmax"`
    - [x] `"base::trunc"`
    - [x] `"base::round"`
    - [x] `"base::sqrt"`
    - [x] `"base::exp"`

    </details>

    <details>
    <summary>unit tests for string bindings</summary>

    - [x] `"base::paste"`
    - [x] `"base::paste0"`
    - [x] `"stringr::str_c"`
    - [x] `"base::grepl"`
    - [x] `"stringr::str_detect"`
    - [x] `"stringr::str_like"`
    - [x] `"stringr::str_count"`
    - [x] `"base::startsWith"`
    - [x] `"base::endsWith"`
    - [x] `"stringr::str_starts"`
    - [x] `"stringr::str_ends"`
    - [x] `"base::sub"`
    - [x] `"base::gsub"`
    - [x] `"stringr::str_replace"`
    - [x] `"stringr::str_replace_all"`
    - [x] `"base::strsplit"`
    - [x] `"stringr::str_split"`
    - [x] `"base::nchar"`
    - [x] `"stringr::str_to_lower"`
    - [x] `"stringr::str_to_upper"`
    - [x] `"stringr::str_to_title"`
    - [x] `"stringr::str_trim"`
    - [x] `"base::substr"`
    - [x] `"base::substring"`
    - [x] `"stringr::str_sub"`
    - [x] `"stringr::str_pad"`

    </details>

    <details>
    <summary>unit tests for type bindings</summary>

    - [x] `"base::as.character"`
    - [x] `"base::as.double"`
    - [x] `"base::as.integer"`
    - [x] `"bit64::as.integer64"`
    - [x] `"base::as.logical"`
    - [x] `"base::as.numeric"`
    - [x] `"methods::is"`
    - [x] `"tibble::tibble"`
    - [x] `"base::data.frame"`
    - [x] `"base::is.character"`
    - [x] `"base::is.numeric"`
    - [x] `"base::is.double"`
    - [x] `"base::is.integer"`
    - [x] `"bit64::is.integer64"`
    - [x] `"base::is.logical"`
    - [x] `"base::is.factor"`
    - [x] `"base::is.list"`
    - [x] `"rlang::is_character"`
    - [x] `"rlang::is_double"`
    - [x] `"rlang::is_integer"`
    - [x] `"rlang::is_list"`
    - [x] `"rlang::is_logical"`
    - [x] `"base::is.na"`
    - [x] `"base::is.nan"`
    - [x] `"dplyr::between"`
    - [x] `"base::is.finite"`
    - [x] `"base::is.infinite"`
    - [x] `"base::format"`

    </details>

- [x] register `nse_funcs` requiring _direct_ mapping (unary and binary bindings)
  - [x] register unary bindings
  - [x] register binary bindings
  - [x] add / update unit tests for the `nse_funcs` bindings to include at least one `pkg::fun()` call for each binding

    <details>
    <summary>Unary and binary bindings unit tests</summary>

    * arithmetic functions
      - [x] `"base::abs"`
      - [x] `"base::ceiling"`
      - [x] `"base::floor"`
      - [x] `"base::log10"`
      - [x] `"base::log1p"`
      - [x] `"base::log2"`
      - [x] `"base::sign"`
    * trigonometric functions
      - [x] `"base::acos"`
      - [x] `"base::asin"`
      - [x] `"base::cos"`
      - [x] `"base::sin"`
      - [x] `"base::tan"`
    * string functions
      - [x] `"stringr::str_length"`
      - [x] `"stringi::stri_reverse"`
      - [x] `"base::tolower"`
      - [x] `"base::toupper"`
    * date and time functions
      - [x] `"lubridate::day"`
      - [x] `"lubridate::dst"`
      - [x] `"lubridate::hour"`
      - [x] `"lubridate::isoweek"`
      - [x] `"lubridate::epiweek"`
      - [x] `"lubridate::isoyear"`
      - [x] `"lubridate::epiyear"`
      - [x] `"lubridate::minute"`
      - [x] `"lubridate::quarter"`
      - [x] `"lubridate::mday"`
      - [x] `"lubridate::yday"`
      - [x] `"lubridate::year"`
      - [x] `"lubridate::leap_year"`
    * type conversion functions
      - [x] `"base::as.factor"`
    * binary functions
      - [x] `"base::strrep"`
      - [x] `"stringr::str_dup"`

    </details>

- [x] aggregating functions
  - [x] register `agg_funcs`
  - [x] add unit tests for `agg_funcs`

    <details>
    <summary>unit tests for aggregating bindings</summary>

    - [x] `"base::sum"`
    - [x] `"base::any"`
    - [x] `"base::all"`
    - [x] `"base::mean"`
    - [x] `"stats::sd"`
    - [x] `"stats::var"`
    - [x] `"stats::quantile"`
    - [x] `"stats::median"`
    - [x] `"dplyr::n_distinct"`
    - [x] `"dplyr::n"`
    - [x] `"base::min"`
    - [x] `"base::max"`

    </details>

- [x] namespace qualified bindings work inside the {dplyr} action verbs:
  - [x] `filter()`
  - [x] `mutate()`
  - [x] `transmute()`
  - [x] `group_by()`
  - [x] `summarise()`
- [x] document changes in the Writing bindings article.
  - [x] going forward we should be using `pkg::fun` when defining a binding, which will register 2 copies of the same binding.

Bindings that will not be registered with a `pkg::` prefix:

* type casting, such as `cast()` or `dictionary_encode()`, and
* operators (e.g. `"!"`, `"=="`, `"!="`, `">"`, `">="`, `"<"`, `"<="`, `"&"`, etc.)
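
The double registration and the `::` lookup described in the approach can be illustrated with a toy registry. This is only a minimal sketch under stated assumptions, not the code in this PR: `registry`, `register_binding_sketch()` and `lookup_qualified()` are hypothetical names, and the real `register_binding()` in {arrow} may differ in signature and behaviour.

``` r
# Hypothetical toy registry: a plain environment keyed by binding name.
registry <- new.env(parent = emptyenv())

# Assumed stand-in for register_binding(): stores the same function under
# both "pkg::fun" and "fun", warning when a name is already taken.
register_binding_sketch <- function(name, fun) {
  short <- sub("^.*::", "", name) # "base::nchar" -> "nchar"
  for (key in unique(c(name, short))) {
    if (exists(key, envir = registry, inherits = FALSE)) {
      warning("A \"", key, "\" binding already exists and will be overwritten.")
    }
    assign(key, fun, envir = registry)
  }
  invisible(fun)
}

# Assumed stand-in for a `::` binding: rebuild the qualified name and
# retrieve the function stored under it.
lookup_qualified <- function(pkg, fun) {
  get(paste0(pkg, "::", fun), envir = registry, inherits = FALSE)
}

register_binding_sketch("base::nchar", function(x) nchar(x))

# Both names resolve to the same registered function.
same_binding <- identical(
  get("nchar", envir = registry),
  lookup_qualified("base", "nchar")
)
```

Keeping two entries per binding means existing unprefixed calls keep working while `pkg::fun()` calls resolve through the qualified key.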
Authored-by: Dragoș Moldovan-Grünfeld <dragos.mold@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>