Commits


François Saint-Jacques authored and Neal Richardson committed ad1fc6cdafb
ARROW-6952: [C++][Dataset] Implement predicate pushdown with ParquetFileFragment

The proposed change can be divided into 3 parts:

- Implement the `StatisticsAsScalars(Statistics& stats, Scalar* min, Scalar* max)` function to convert the min and max of a `parquet::Statistics` to `arrow::Scalar`s.
- Implement the `RowGroupStatisticsAsExpression(RowGroupMetadata& meta, Expression* out)` function to represent the RowGroup's statistics as a conjunction expression, e.g. `(a_min <= a AND a <= a_max) AND (b_min <= b AND b <= b_max) AND ...`
- Modify ParquetScanTaskIterator to skip RowGroups by checking the expression derived from the metadata against the filter expression.

Closes #5765 from fsaintjacques/ARROW-6952-dataset-parquet-predicate-pushdown and squashes the following commits:

5fd0efd6c <François Saint-Jacques> Add unit test and fix issues
04e4a45a3 <François Saint-Jacques> Review comments
0d14b8f58 <François Saint-Jacques> Expose SchemaManifest publicly
235a0d699 <François Saint-Jacques> ARROW-6952: Implement predicate pushdown with ParquetFileFragment

Authored-by: François Saint-Jacques <fsaintjacques@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
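
The row-group skipping idea described in the commit message can be illustrated with a small standalone sketch. It does not use the Arrow or Parquet APIs: `ColumnRange`, `RowGroupStats`, `RangeFilter`, `MayMatch` and `SelectRowGroups` are hypothetical names invented for this illustration, and the filter is assumed to be a simple conjunction of integer range constraints rather than a general expression.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one column's min/max statistics in a row group.
struct ColumnRange {
  int64_t min;
  int64_t max;
};

// Hypothetical stand-in for a row group's statistics: column name -> [min, max].
using RowGroupStats = std::map<std::string, ColumnRange>;

// Hypothetical filter term: lo <= column <= hi. A filter is a conjunction of terms.
struct RangeFilter {
  std::string column;
  int64_t lo;
  int64_t hi;
};

// Returns false only when the statistics prove that no row can satisfy the
// filter, i.e. some term's interval does not overlap the column's [min, max].
bool MayMatch(const RowGroupStats& stats, const std::vector<RangeFilter>& filter) {
  for (const auto& term : filter) {
    auto it = stats.find(term.column);
    if (it == stats.end()) continue;  // no statistics: nothing can be ruled out
    const ColumnRange& range = it->second;
    if (term.hi < range.min || term.lo > range.max) return false;
  }
  return true;
}

// Returns the indices of the row groups that still have to be scanned.
std::vector<int> SelectRowGroups(const std::vector<RowGroupStats>& row_groups,
                                 const std::vector<RangeFilter>& filter) {
  std::vector<int> kept;
  for (std::size_t i = 0; i < row_groups.size(); ++i) {
    if (MayMatch(row_groups[i], filter)) kept.push_back(static_cast<int>(i));
  }
  return kept;
}
```

The check is deliberately conservative: a row group is skipped only when its statistics contradict the filter, and a column without statistics never causes a skip.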