Commits


Benjamin Kietzman authored and Neal Richardson committed d2b6d1132d6
ARROW-6243: [C++][Dataset] Filter expressions

Adds the Expression class which is used to represent an arbitrarily complex filter expression. Expressions can be constructed using factory functions, for example:

```c++
and_(
  equal(field_ref("a"), scalar<int16_t>(5)),    // column 'a' is equal to 5
  greater(field_ref("b"), scalar<double>(0.0))  // column 'b' is greater than 0.0
)
```

Operator overloads are also provided, so the above could also be written as

```c++
"a"_ == int16_t(5) and "b"_ > 0.0
```

These can be executed against a single record batch (using the `arrow::compute::` kernels). Additionally, expressions may be simplified or even elided given partition information. For example, given a partition where column 'a' is equal to 5, the above query could be simplified to `"b"_ > 0.0` (since the condition on column 'a' is satisfied by the entire partition), and given a partition where column 'b' is between -1.0 and 0.0, the query simplifies to `false` (since no record in the partition will satisfy the condition on column 'b'). This can be used to support arbitrary partitioning schemes and do the least kernel work possible on each record batch.
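
The simplification rule described above can be illustrated with a small standalone sketch. This is a toy model and not the Expression class added by this commit: the names `Comparison`, `Range`, `Decide`, and `Simplify` are hypothetical, the query is restricted to a conjunction of comparisons against double literals, the partition information is a single closed range on one column, and nulls are ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: these types are illustrative assumptions, not Arrow's API.
enum class Op { kEqual, kGreater, kLess };

// A single condition of the form `column <op> value`.
struct Comparison {
  std::string column;
  Op op;
  double value;
};

// Partition information: every value of `column` lies in [min, max].
struct Range {
  std::string column;
  double min;
  double max;
};

enum class Truth { kAlways, kNever, kUnknown };

// Decide whether a comparison holds for every row, for no row, or cannot be
// decided from the partition range alone.
Truth Decide(const Comparison& c, const Range& r) {
  assert(r.min <= r.max);
  if (c.column != r.column) return Truth::kUnknown;
  switch (c.op) {
    case Op::kEqual:
      if (r.min == r.max && r.min == c.value) return Truth::kAlways;
      if (c.value < r.min || c.value > r.max) return Truth::kNever;
      return Truth::kUnknown;
    case Op::kGreater:
      if (r.min > c.value) return Truth::kAlways;
      if (r.max <= c.value) return Truth::kNever;
      return Truth::kUnknown;
    case Op::kLess:
      if (r.max < c.value) return Truth::kAlways;
      if (r.min >= c.value) return Truth::kNever;
      return Truth::kUnknown;
  }
  return Truth::kUnknown;
}

// Result of simplifying a conjunction: either the whole query is false for
// this partition, or only `remaining` still has to be evaluated per batch.
struct Simplified {
  bool always_false = false;
  std::vector<Comparison> remaining;
};

// Drop conditions the partition already guarantees; collapse to `false` as
// soon as one condition can never hold.
Simplified Simplify(const std::vector<Comparison>& conjunction,
                    const Range& partition) {
  Simplified out;
  for (const auto& c : conjunction) {
    switch (Decide(c, partition)) {
      case Truth::kNever:
        out.always_false = true;
        out.remaining.clear();
        return out;
      case Truth::kAlways:
        break;  // satisfied by the entire partition, so elide it
      case Truth::kUnknown:
        out.remaining.push_back(c);
        break;
    }
  }
  return out;
}
```

Under these assumptions, simplifying `a == 5 and b > 0.0` against `Range{"a", 5.0, 5.0}` leaves only the condition on `b`, while simplifying it against `Range{"b", -1.0, 0.0}` marks the query as always false, which mirrors the two cases in the description above.
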

Closes #5157 from bkietz/6243-Implement-basic-Filter-ex and squashes the following commits:

fda47422f <Benjamin Kietzman> give MSVC a little help to avoid instantiating impossible constructors
539c28719 <Benjamin Kietzman> rename fieldRef to field_ref, comments
9494bab85 <Benjamin Kietzman> refactor And, Or to binary
0e366bb16 <Benjamin Kietzman> rename all/any to and/or
ca155c6d9 <Benjamin Kietzman> add explicit std::move, msvc doesn't like defining operator and
7c01f7619 <Benjamin Kietzman> construct correct scalartype
3ac299ba6 <Benjamin Kietzman> use strongly typed nulls
cab17b1a3 <Benjamin Kietzman> amend doccomments
46fa3fbfe <Benjamin Kietzman> add Expression::Validate implementations
d54ca0600 <Benjamin Kietzman> Expressions evaluate to Datums
24323fc15 <Benjamin Kietzman> implement NotExpression::ToString
d7f3ed239 <Benjamin Kietzman> simplify Expression::Equals
cb569064f <Benjamin Kietzman> rename FieldRef -> Field, and_ -> all, or_ -> any, add comments to ExpressionType
5bae59004 <Benjamin Kietzman> re-enable Invert
f9e0c0887 <Benjamin Kietzman> remove unused Empty() method
e9af9354b <Benjamin Kietzman> lint fixes
58990674f <Benjamin Kietzman> break OperatorExpression into multiple classes
02d94a6f1 <Benjamin Kietzman> use explicit enumeration of comparison results
523186d88 <Benjamin Kietzman> add support for evaluation of trivial expressions, tests
60d9e088f <Benjamin Kietzman> fix factory linkage, factory fns deal in shared_ptrs exclusively
55aa45230 <Benjamin Kietzman> implement more robust null handling
3d355ff65 <Benjamin Kietzman> break up assumption logic, add function expression factories
3499fd2ec <Benjamin Kietzman> add an expression simplification test
06b25becb <Benjamin Kietzman> move expression evaluation to a free function
a67b330a1 <Benjamin Kietzman> add comments, more tests, simplify operator overloads
9282284a0 <Benjamin Kietzman> simplify filter testing
6ccd636c7 <Benjamin Kietzman> add execution of filter expressions using compute kernels
f3165784b <Benjamin Kietzman> add basic filter expressions

Authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>