Commits


Neal Richardson authored and Benjamin Kietzman committed 898bef89357
ARROW-9665: [R] head/tail/take for Datasets

This adds:

* `head()` and `tail()` methods for Datasets and dplyr queries. `head()` for Datasets is done in cpp and is very fast; `tail()` not so much because it has to consume the whole scan iterator to reverse the sequence and grab rows from the end.
* A `take` method for Datasets to allow selection of rows by index. This is also slow and not advised (could be done faster in cpp, left as a TODO) but was needed to complete the vroom benchmarks, which samples random rows.
* `dim()` (nrow) for dplyr queries on Table/RecordBatch is now supported using the compute functions (sum the boolean filter array)
* `collect()` gains an `as_data_frame` argument to allow you to evaluate the accumulated `select` and `filter` query but keep the result in Arrow, not an R data.frame

Closes #7913 from nealrichardson/dataset-slicing

Authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
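
The cost difference between `head()` and `tail()` noted in this commit can be illustrated with a minimal sketch. It is plain Python over a generator of row batches, not the Arrow R or C++ implementation; `scan`, `head`, `tail`, and `count_filtered` are hypothetical names chosen for illustration. It assumes only that a scan yields batches lazily: taking the first rows can stop early, taking the last rows has to exhaust the iterator, and the row count of a filtered query is the sum of its boolean filter.

```python
from collections import deque
from itertools import islice


def scan(num_batches, batch_size, consumed):
    """Hypothetical lazy scan: yields batches of rows and records each batch read."""
    for i in range(num_batches):
        consumed.append(i)  # track which batches were actually materialized
        yield list(range(i * batch_size, (i + 1) * batch_size))


def head(batches, n):
    """First n rows; stops pulling batches as soon as n rows have been seen."""
    rows = (row for batch in batches for row in batch)
    return list(islice(rows, n))


def tail(batches, n):
    """Last n rows; every batch must be consumed before the end is known."""
    rows = (row for batch in batches for row in batch)
    return list(deque(rows, maxlen=n))  # bounded buffer keeps only the last n


def count_filtered(mask):
    """Row count of a filtered query: sum the boolean filter array."""
    return sum(mask)
```

In this sketch, `head(scan(4, 3, log), 2)` reads only the first batch, while `tail(scan(4, 3, log), 2)` reads all four, which mirrors the trade-off the commit message describes.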