Commits


Wes McKinney authored and Uwe L. Korn committed 01a67f3ff3f
ARROW-493: [C++] Permit large (length > INT32_MAX) arrays in memory

This commit relaxes the INT32_MAX length requirement for in-memory data. It does not change the Arrow memory format, nor does it permit arrays with more than INT32_MAX elements to be included in a RecordBatch message sent in the streaming or file formats.

The purpose of this change is to enable Arrow containers to do zero-copy addressing of large datasets (generally of fixed-size elements) produced by other systems. Should those systems wish to send messages to Java, they will need to break those large arrays up into smaller pieces. We can create utilities to assist in copy-free segmentation of large in-memory datasets into compatible chunk sizes (a sketch follows below). If the large data is only being used in C++-land, then there are no problems.

This is a helpful change en route to adding an `arrow::Tensor` type per ARROW-550, and probably some other things.

This also includes ARROW-584, as I wanted to be sure that I caught all the places in the codebase where there were imprecise integer conversions.

cc @pcmoritz @robertnishihara

Author: Wes McKinney <wes.mckinney@twosigma.com>

Closes #352 from wesm/ARROW-493 and squashes the following commits:

013d8cc [Wes McKinney] Fix some more compiler warnings
13c4067 [Wes McKinney] Do not pass CMAKE_CXX_FLAGS to googletest ep
dc50d80 [Wes McKinney] Fix last imprecise conversions
c8e90bc [Wes McKinney] Fix many imprecise integer conversions
6bacdf3 [Wes McKinney] Permit in-memory arrays with more than INT32_MAX elements in Array and Builder classes. Raise if large arrays used in IPC context
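To illustrate the copy-free segmentation the message describes, here is a minimal sketch built on `arrow::Array::Slice`, which shares the parent array's buffers rather than copying element data. The helper name `SegmentForIpc` is hypothetical, and the sketch uses the modern Arrow C++ API, which may differ from the API as it stood at the time of this commit.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <vector>

#include <arrow/array.h>

// Hypothetical utility: split a large in-memory array into zero-copy
// slices of at most INT32_MAX elements each, so the pieces satisfy the
// length limit for RecordBatch messages in the streaming/file formats.
std::vector<std::shared_ptr<arrow::Array>> SegmentForIpc(
    const std::shared_ptr<arrow::Array>& array,
    int64_t max_chunk_length = INT32_MAX) {
  std::vector<std::shared_ptr<arrow::Array>> chunks;
  for (int64_t offset = 0; offset < array->length();
       offset += max_chunk_length) {
    const int64_t length =
        std::min(max_chunk_length, array->length() - offset);
    // Slice shares the underlying buffers; no element data is copied.
    chunks.push_back(array->Slice(offset, length));
  }
  return chunks;
}
```

Each resulting slice could then be sent as its own RecordBatch (or gathered into an `arrow::ChunkedArray`), while the original large array remains addressable in C++ without any copying.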