Commits


Neal Richardson authored and GitHub committed 03f8ae754ed
GH-41540: [R] Simplify arrow_eval() logic and bindings environments (#41537) ### Rationale for this change NSE is hard enough. I wanted to see if I could remove some layers of complexity. ### What changes are included in this PR? * There no longer are separate collections of `agg_funcs` and `nse_funcs`. Now that the aggregation functions return Expressions (https://github.com/apache/arrow/pull/41223), there's no reason to treat them separately. All bindings return Expressions now. * Both are removed and functions are just stored in `.cache$functions`. There was a note wondering why both `nse_funcs` and that needed to exist. They don't. * `arrow_mask()` no longer has an `aggregations` argument: agg functions are always present. * Because agg functions are always present, `filter` and `arrange` now have to check for whether the expressions passed to them contain aggregations--this is supported in regular dplyr but we have deferred supporting it here for now (see https://github.com/apache/arrow/pull/41350). If we decide we want to support it later, these checks are the entry points where we'd drop in the `left_join()` as in `mutate()`. * The logic of evaluating expresssions in `filter()` has been simplified. * Assorted other cleanups: `register_binding()` has two fewer arguments, for example, and the duplicate functions for referencing agg_funcs are gone. There is one more refactor I intend to pursue, and that's to rework abandon_ship and how arrow_eval does error handling, but I ~may~ will defer that to a followup. ### Are these changes tested? Yes, though I'll add some more for filter/aggregate in the followup since I'm reworking things there. ### Are there any user-facing changes? There are a couple of edge cases where the error message will change subtly. For example, if you supplied a comma-separated list of filter expressions, and more than one of them did not evaluate, previously you would be informed of all of the failures; now, we error on the first one. I don't think this is concerning. * GitHub Issue: #41540