Commits


Joris Van den Bossche authored and GitHub committed e2b0de29d6e
GH-41664: [C++][Python] PrettyPrint non-cpu data by copying to default CPU device (#42010)

### Rationale for this change

The various Python reprs and the C++ `PrettyPrint` functions currently segfault when passed an object that has its data on a non-CPU device. In Python, getting a segfault while displaying an object is very annoying, so we should at least make this not crash.

### What changes are included in this PR?

When we detect data on a non-CPU device passed to `PrettyPrint`, we copy the necessary part (the full Arrays for Array/RecordBatch, or the full chunks that are being printed for ChunkedArray/Table) to the default CPU device, and then use the existing print utilities as is on this copied subset.

For large data, this can be costly because it copies a lot of data (you can always avoid that by not printing the data), but for chunked data we still only copy those chunks of the full dataset that are needed to print the object.

Longer term, we should investigate whether we can copy sliced arrays to a different device (with actual pruning of the buffers while copying): https://github.com/apache/arrow/issues/43055

### Are these changes tested?

Yes

### Are there any user-facing changes?

No

* GitHub Issue: #41664

Lead-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Co-authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
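
### Illustrative sketch (not part of the change)

The chunk-level copying described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions: it does not use Arrow's API, and `Chunk`, `chunks_to_print`, and `pretty_print` are hypothetical names invented for the example. It assumes a preview that shows the first and last `window` elements, and it copies whole chunks (no buffer pruning), which is the behaviour the message describes for ChunkedArray/Table.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for one chunk of a chunked array."""
    values: Tuple[int, ...]
    device: str = "cpu"  # e.g. "cpu" or "cuda:0"

    def copy_to_cpu(self) -> "Chunk":
        # Copies the whole chunk; no slicing or buffer pruning happens here.
        return Chunk(self.values, "cpu")


def chunks_to_print(lengths: Sequence[int], window: int) -> List[int]:
    """Indices of non-empty chunks overlapping the first or last `window` elements."""
    total = sum(lengths)
    needed, start = [], 0
    for i, n in enumerate(lengths):
        end = start + n  # this chunk covers positions [start, end)
        if n > 0 and (start < window or end > total - window):
            needed.append(i)
        start = end
    return needed


def pretty_print(chunks: Sequence[Chunk], window: int = 2) -> Tuple[str, int]:
    """Return the preview text and the number of chunks copied to the CPU."""
    if window < 1:
        raise ValueError("window must be at least 1")
    needed = chunks_to_print([len(c.values) for c in chunks], window)
    copied = 0
    flat: List[object] = []
    for i in needed:
        chunk = chunks[i]
        if chunk.device != "cpu":
            chunk = chunk.copy_to_cpu()  # only chunks that will be shown are copied
            copied += 1
        flat.extend(chunk.values)
    total = sum(len(c.values) for c in chunks)
    if total > 2 * window:
        # Elide the middle once the data is longer than head + tail.
        flat = flat[:window] + ["..."] + flat[-window:]
    return "[" + ", ".join(str(v) for v in flat) + "]", copied


data = [Chunk((1, 2, 3), "cuda:0"), Chunk((4, 5, 6), "cuda:0"), Chunk((7, 8, 9), "cuda:0")]
text, copied = pretty_print(data, window=2)
print(text, copied)  # [1, 2, ..., 8, 9] 2
```

In this toy run the middle chunk is never copied, because the preview does not touch it. The first and last chunks are copied in full even though only two elements of each are shown, which is the cost that the linked follow-up issue about copying sliced arrays aims to reduce.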