Commits


Ying Zhou authored and Antoine Pitrou committed 1dc8f94ba76
ARROW-7906: [C++] [Python] Add ORC write support

This pull request tracks the progress on adding ORC write support. The functionality is not complete yet. However, for most types, the process of populating a ColumnVectorBatch in ORC using data from an Arrow Array is already in place.

Arrow data types (arrow::Type::type) I do support:
Boolean: BOOL
Numerical: INT8, INT16, INT32, INT64, FLOAT, DOUBLE
Time-related: DATE32
Binary: BINARY, STRING, LARGE_BINARY, LARGE_STRING, FIXED_SIZE_BINARY
Nested: LIST, LARGE_LIST, FIXED_SIZE_LIST, STRUCT, MAP, DENSE_UNION, SPARSE_UNION

Arrow data types I plan to support:
Numerical: DECIMAL128
Time-related: DATE64, TIMESTAMP
Dictionary: DICTIONARY

Arrow data types I currently do NOT plan to support:
Numerical: UINT8, UINT16, UINT32, UINT64, HALF_FLOAT, DECIMAL256 (There are no corresponding types in ORC. Except for DECIMAL256, these can always be cast to larger types, but I think users should do that explicitly.)
Time-related: TIME32, TIME64, INTERVAL_MONTHS, INTERVAL_DAY_TIME, DURATION (There are no corresponding types in ORC, and it is impossible to cast them to ORC types without losing time-related information.)
Extension: EXTENSION

Closes #8648 from mathyingzhou/ARROW-7906_pyarrow_write_orc

Lead-authored-by: Ying Zhou <yingzhou474@gmail.com>
Co-authored-by: Sutou Kouhei <kou@clear-code.com>
Co-authored-by: Jorge C. Leitao <jorgecarleitao@gmail.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Heres, Daniel <danielheres@gmail.com>
Co-authored-by: Dmitry Patsura <zaets28rus@gmail.com>
Co-authored-by: Neville Dipale <nevilledips@gmail.com>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Yibo Cai <yibo.cai@arm.com>
Co-authored-by: Yordan Pavlov <yordan.pavlov@outlook.com>
Co-authored-by: mqy <meng.qingyou@gmail.com>
Co-authored-by: Kenta Murata <mrkn@mrkn.jp>
Co-authored-by: Johannes Müller <JohannesMueller@fico.com>
Co-authored-by: Mahmut Bulut <vertexclique@gmail.com>
Co-authored-by: Ryan Jennings <ryan@ryanj.net>
Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Co-authored-by: Jörn Horstmann <joern.horstmann@signavio.com>
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Co-authored-by: Matt Brubeck <mbrubeck@limpet.net>
Co-authored-by: Max Burke <max@urbanlogiq.com>
Co-authored-by: Maarten A. Breddels <maartenbreddels@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
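
The type lists in the commit message above can be sketched as a small lookup, for example to check a schema before attempting an ORC write. This is an illustrative sketch in plain Python and is not part of Arrow: the names `orc_write_status` and `fields_not_writable` are hypothetical, the schema is modelled as a simple mapping of field name to Arrow type name, and the status sets reflect only what this commit message states (later Arrow versions may differ).

```python
from typing import Dict, List

# Status sets transcribed from the commit message (assumption: they
# describe this commit only, not later Arrow releases).
SUPPORTED = {
    "BOOL",
    "INT8", "INT16", "INT32", "INT64", "FLOAT", "DOUBLE",
    "DATE32",
    "BINARY", "STRING", "LARGE_BINARY", "LARGE_STRING", "FIXED_SIZE_BINARY",
    "LIST", "LARGE_LIST", "FIXED_SIZE_LIST", "STRUCT", "MAP",
    "DENSE_UNION", "SPARSE_UNION",
}
PLANNED = {"DECIMAL128", "DATE64", "TIMESTAMP", "DICTIONARY"}
NOT_PLANNED = {
    "UINT8", "UINT16", "UINT32", "UINT64", "HALF_FLOAT", "DECIMAL256",
    "TIME32", "TIME64", "INTERVAL_MONTHS", "INTERVAL_DAY_TIME", "DURATION",
    "EXTENSION",
}


def orc_write_status(arrow_type_name: str) -> str:
    """Classify an Arrow type name (hypothetical helper, not an Arrow API)."""
    name = arrow_type_name.upper()
    if name in SUPPORTED:
        return "supported"
    if name in PLANNED:
        return "planned"
    if name in NOT_PLANNED:
        return "unsupported"
    return "unknown"


def fields_not_writable(schema: Dict[str, str]) -> List[str]:
    """Return field names whose type is not listed as supported."""
    return [
        field
        for field, type_name in schema.items()
        if orc_write_status(type_name) != "supported"
    ]


if __name__ == "__main__":
    example_schema = {"id": "INT64", "count": "UINT16", "seen": "DATE64"}
    # "count" would need an explicit cast to a signed type first;
    # "seen" is only planned in this commit.
    print(fields_not_writable(example_schema))
```

A check like this mirrors the commit's stance that unsigned integers are not cast implicitly: the caller decides whether to cast `UINT16` to a wider signed type before writing.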