Commits

Wes McKinney authored 3d435e4f8d5
PARQUET-1508: [C++] Read ByteArray data directly into arrow::BinaryBuilder and BinaryDictionaryBuilder. Refactor encoders/decoders to use cleaner virtual interfaces

This patch ended up being a bit of a bloodbath, but it sorted out a number of technical debt problems. Summary:

* Add type-specific virtual encoder interfaces such as `ByteArrayEncoder` and `ByteArrayDecoder` -- this enables adding new encoder or decoder methods without conflicting with the other types. This was very hard to do before because all types shared a common template such as `PlainDecoder<ByteArrayType>`
* Encoder and decoder implementations are now all in an `encoding.cc` compilation unit; performance should be unchanged (I will check to make sure)
* Add BYTE_ARRAY decoder methods that write into `ChunkedBinaryBuilder` or `BinaryDictionaryBuilder`. This unblocks the long-desired direct-to-categorical Parquet reads
* Altered RecordReader to decode BYTE_ARRAY values directly into `ChunkedBinaryBuilder`. More work will be required to expose DictionaryArray reads in a sane way

Along the way I've decided I want to eradicate all instances of `extern template class` from the codebase. It's insanely brittle with different visibility rules in MSVC, gcc, AND clang (no kidding, gcc and clang do different things). I'll refactor the other parts of the codebase that use them later

Author: Wes McKinney <wesm+git@apache.org>
Author: Uwe L. Korn <xhochy@users.noreply.github.com>

Closes #3492 from wesm/PARQUET-1508 and squashes the following commits:

df1bfc016 <Wes McKinney> lint
f3fadcbe3 <Uwe L. Korn> Update cpp/src/parquet/arrow/record_reader.cc
4bafc5547 <Wes McKinney> Fix public compile definition on windows
c4fcf7479 <Wes McKinney> verbose makefile
06e2c2315 <Wes McKinney> lint
daac8a686 <Wes McKinney> Delete a couple commented-out methods, add code comments about unimplemented DecodeArrowNonNull method for DictionaryBuilder
453ecbd83 <Wes McKinney> Refactor encoder and decoder classes to facilitate type-level extensibility
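
The type-specific interface idea from the summary of this commit can be illustrated with a minimal sketch. This is not the code from the patch: every class and function name below (`DecoderSketch`, `ByteArrayDecoderSketch`, `DecodeInto`, `DecodeExample`, and so on) is hypothetical, and a `std::vector<std::string>` stands in for an Arrow builder. The sketch assumes only the Parquet PLAIN layout for BYTE_ARRAY values, a 4-byte little-endian length followed by the bytes. It shows how a per-type virtual interface can gain a builder-oriented method without touching the interfaces of other physical types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// All names here are hypothetical stand-ins, not the real parquet-cpp classes.
struct ByteArrayView {
  const uint8_t* ptr;
  uint32_t len;
};

// Base interface shared by every decoder, independent of physical type.
class DecoderSketch {
 public:
  virtual ~DecoderSketch() = default;
  virtual void SetData(const uint8_t* data, int64_t size) = 0;
};

// Interface for INT32: unaffected by anything added for BYTE_ARRAY.
class Int32DecoderSketch : public DecoderSketch {
 public:
  virtual int Decode(int32_t* out, int max_values) = 0;
};

// Interface for BYTE_ARRAY: free to grow methods specific to this type.
class ByteArrayDecoderSketch : public DecoderSketch {
 public:
  virtual int Decode(ByteArrayView* out, int max_values) = 0;
  // Appends decoded values straight into a builder stand-in.
  virtual int DecodeInto(std::vector<std::string>* builder, int max_values) = 0;
};

// PLAIN decoding: each value is a 4-byte little-endian length plus payload.
class PlainByteArrayDecoderSketch final : public ByteArrayDecoderSketch {
 public:
  void SetData(const uint8_t* data, int64_t size) override {
    data_ = data;
    remaining_ = size;
  }

  int Decode(ByteArrayView* out, int max_values) override {
    int n = 0;
    ByteArrayView v;
    while (n < max_values && Next(&v)) out[n++] = v;
    return n;
  }

  int DecodeInto(std::vector<std::string>* builder, int max_values) override {
    int n = 0;
    ByteArrayView v;
    while (n < max_values && Next(&v)) {
      builder->emplace_back(reinterpret_cast<const char*>(v.ptr), v.len);
      ++n;
    }
    return n;
  }

 private:
  // Reads one value; returns false when the buffer is exhausted or truncated.
  bool Next(ByteArrayView* out) {
    if (remaining_ < 4) return false;
    const uint32_t len = static_cast<uint32_t>(data_[0]) |
                         (static_cast<uint32_t>(data_[1]) << 8) |
                         (static_cast<uint32_t>(data_[2]) << 16) |
                         (static_cast<uint32_t>(data_[3]) << 24);
    if (remaining_ - 4 < static_cast<int64_t>(len)) return false;
    out->ptr = data_ + 4;
    out->len = len;
    data_ += 4 + static_cast<int64_t>(len);
    remaining_ -= 4 + static_cast<int64_t>(len);
    return true;
  }

  const uint8_t* data_ = nullptr;
  int64_t remaining_ = 0;
};

// Decodes a small hand-built page containing the values "hi" and "abc".
std::vector<std::string> DecodeExample() {
  const uint8_t page[] = {2, 0, 0, 0, 'h', 'i', 3, 0, 0, 0, 'a', 'b', 'c'};
  PlainByteArrayDecoderSketch decoder;
  decoder.SetData(page, sizeof(page));
  std::vector<std::string> out;
  decoder.DecodeInto(&out, 10);
  return out;
}
```

Calling `DecodeExample()` from a test or a `main` function exercises the builder path. With a single shared template for all types, a method like `DecodeInto` would have to exist, or be stubbed out, for every physical type. With one interface per type it lives only where it makes sense.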