Commits


David Li authored and Benjamin Kietzman committed 8b4942728e7
ARROW-9697: [C++][Python][R][Dataset] Add CountRows for Scanner

This implements a CountRows method for scanner. It will ask the fragment if it can count rows using only metadata, and otherwise project away columns and count the resulting rows.

Originally, I thought we did not need a special optimization for the metadata-only case, because the Parquet reader will skip I/O and fabricate empty batches if you ask it to read no columns. However, in benchmarking, the overhead of the rest of the pipeline was still significant and so I implemented the optimization after all.

Closes #10060 from lidavidm/arrow-9697

Authored-by: David Li <li.davidm96@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
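
The two-step strategy described in the message (ask the fragment for a metadata-only count, otherwise scan with no columns and sum the batch sizes) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the Arrow implementation: the `Batch` and `Fragment` classes, the `try_count_rows` and `scan` methods, and the `count_rows` function are all hypothetical names invented for this example, and the sketch ignores filtering entirely.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class Batch:
    """Hypothetical stand-in for a record batch; only its row count matters here."""
    num_rows: int


class Fragment:
    """Hypothetical fragment (for example, one file of a dataset)."""

    def __init__(self, batch_sizes: List[int], metadata_rows: Optional[int] = None):
        self._batch_sizes = list(batch_sizes)
        self._metadata_rows = metadata_rows
        self.scans = 0  # how many times the slow path was taken

    def try_count_rows(self) -> Optional[int]:
        # Return a count if metadata alone can answer, else None.
        return self._metadata_rows

    def scan(self, columns: List[str]) -> Iterator[Batch]:
        # With columns == [] the batches carry no data, only a length.
        self.scans += 1
        for n in self._batch_sizes:
            yield Batch(num_rows=n)


def count_rows(fragments: Iterable[Fragment]) -> int:
    """Count rows, preferring the metadata-only path for each fragment."""
    total = 0
    for fragment in fragments:
        fast = fragment.try_count_rows()
        if fast is not None:
            # Fast path: no scan pipeline is started at all.
            total += fast
            continue
        # Fallback: project away every column and count the resulting rows.
        total += sum(batch.num_rows for batch in fragment.scan(columns=[]))
    return total


with_meta = Fragment([3, 4], metadata_rows=7)
no_meta = Fragment([5, 5, 2])
print(count_rows([with_meta, no_meta]))  # prints 19
```

In this sketch the fragment with metadata is never scanned, which mirrors the motivation given above: even when a zero-column read skips I/O, running the rest of the pipeline still has a cost that the metadata-only path avoids.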