Commits


Joris Van den Bossche authored and Wes McKinney committed 7f4165c4757
ARROW-2428: [Python] Support pandas ExtensionArray in Table.to_pandas conversion

Prototype for https://issues.apache.org/jira/browse/ARROW-2428

What does this PR do?

- Based on the pandas_metadata (stored when creating a Table from a pandas DataFrame), we infer which columns originally had a pandas extension dtype, and support a custom conversion (based on a `__from_arrow__` method defined on the pandas extension dtype)
- The user can also specify explicitly with the `extension_column` keyword which columns should be converted to an extension dtype

This only covers [use case 1 discussed in the issue](https://issues.apache.org/jira/browse/ARROW-2428?focusedCommentId=16914231&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16914231): automatic roundtrip for pandas DataFrames that have extension dtypes. For example, it does not yet provide a way to do this if the arrow.Table has no pandas metadata (i.e. it did not originate from a pandas DataFrame).

Closes #5512 from jorisvandenbossche/ARROW-2428-arrow-pandas-conversion and squashes the following commits:

dc8abac17 <Joris Van den Bossche> Avoid pandas_dtype check for known numpy dtypes
9572641a5 <Joris Van den Bossche> clean-up, remove extension_column kwarg in to_pandas, add docs
6f6b6f6f7 <Joris Van den Bossche> Also support arrow ExtensionTypes via to_pandas_dtype (without having pandas metadata)
e2b4b6257 <Joris Van den Bossche> ARROW-2428: Support pandas ExtensionArray in Table.to_pandas conversion

Authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Wes McKinney <wesm+git@apache.org>
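
The dispatch described in the commit message (look up the original dtype from stored metadata, then hand the column to that dtype's `__from_arrow__` method) can be sketched roughly as below. This is a toy model using only the standard library, not pyarrow's actual implementation: `ToyArrowArray`, `ToyExtensionArray`, `ToyNullableIntDtype`, `REGISTRY`, `to_pandas_like`, and the `dtype_name` metadata key are all hypothetical names invented for illustration. Only the `__from_arrow__` hook name comes from the commit message.

```python
# Toy model of metadata-driven conversion via a `__from_arrow__` hook.
# All classes, the registry, and the metadata layout are hypothetical;
# nothing here mirrors pyarrow or pandas internals.


class ToyArrowArray:
    """Stand-in for an Arrow array: a list of values, None meaning null."""

    def __init__(self, values):
        self.values = list(values)


class ToyExtensionArray:
    """Stand-in for a pandas ExtensionArray holding values plus its dtype."""

    def __init__(self, values, dtype):
        self.values = list(values)
        self.dtype = dtype


class ToyNullableIntDtype:
    """Stand-in for a pandas extension dtype that opts in to the protocol."""

    name = "ToyInt64"

    def __from_arrow__(self, array):
        # The dtype itself decides how to build its array from Arrow data.
        return ToyExtensionArray(array.values, dtype=self)


# Hypothetical registry mapping stored dtype names back to dtype objects.
REGISTRY = {ToyNullableIntDtype.name: ToyNullableIntDtype()}


def to_pandas_like(columns, pandas_metadata):
    """Convert columns, using `__from_arrow__` where the metadata allows it."""
    result = {}
    for col_meta in pandas_metadata["columns"]:
        name = col_meta["name"]
        dtype = REGISTRY.get(col_meta["dtype_name"])
        if dtype is not None and hasattr(dtype, "__from_arrow__"):
            # Column originally had an extension dtype: custom conversion.
            result[name] = dtype.__from_arrow__(columns[name])
        else:
            # Fallback: default conversion (here just a plain list).
            result[name] = list(columns[name].values)
    return result


columns = {
    "a": ToyArrowArray([1, None, 3]),
    "b": ToyArrowArray([1.5, 2.5]),
}
metadata = {
    "columns": [
        {"name": "a", "dtype_name": "ToyInt64"},
        {"name": "b", "dtype_name": "float64"},
    ]
}
converted = to_pandas_like(columns, metadata)
```

In this sketch, column `a` is rebuilt through the dtype's hook while column `b` falls through to the default path, which corresponds to the "automatic roundtrip" use case the message says the PR covers. A table without such metadata would take the fallback path for every column.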