Commits


Matthew Topol authored and Korn, Uwe committed b1d1633c90d
ARROW-2661: [Python] Adding the ability to programmatically pass hdfs configuration key/value pairs via pyarrow

https://issues.apache.org/jira/browse/ARROW-2661

Both the JNI-based libhdfs driver and libhdfs3 support `hdfsBuilderConfSetStr`, so we can use it to pass arbitrary configuration values for an hdfs connection, similar to how https://hdfs3.readthedocs.io/en/latest/hdfs.html supports passing them. I've added a param called `extra_conf` to facilitate it in pyarrow, such as:

```python
import pyarrow

# HA nameservice with two namenodes; each namenode gets its own
# rpc-address and http-address entry.
conf = {"dfs.nameservices": "nameservice1",
        "dfs.ha.namenodes.nameservice1": "namenode113,namenode188",
        "dfs.namenode.rpc-address.nameservice1.namenode113": "hostname_of_server1:8020",
        "dfs.namenode.rpc-address.nameservice1.namenode188": "hostname_of_server2:8020",
        "dfs.namenode.http-address.nameservice1.namenode113": "hostname_of_server1:50070",
        "dfs.namenode.http-address.nameservice1.namenode188": "hostname_of_server2:50070",
        "hadoop.security.authentication": "kerberos"
}
hdfs = pyarrow.hdfs.connect(host='nameservice1', driver='libhdfs3', extra_conf=conf)
```

Author: Matthew Topol <mtopol@factset.com>

Closes #2097 from zeroshade/configs and squashes the following commits:

047dd4b1 <Matthew Topol> forgot to use make format to fix the order of includes. oops
d27e3c3e <Matthew Topol> switching to unordered_map
858b44bb <Matthew Topol> missed a flake8 spot
77eeae09 <Matthew Topol> Adding the ability to programmatically pass hdfs configuration key/value pairs in the C++ and via pyarrow
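
Since the HA keys in the example follow a regular pattern, the `extra_conf` dict could also be generated rather than typed by hand. The following is only a sketch: `build_ha_conf` and its parameters are hypothetical names that are not part of pyarrow or of this change, and the default ports are simply the ones used in the example above. It builds a plain `dict` of strings and does not import pyarrow.

```python
def build_ha_conf(nameservice, namenodes, rpc_port=8020, http_port=50070,
                  kerberos=False):
    """Build an HDFS HA configuration dict (hypothetical helper).

    namenodes maps a namenode id to its hostname,
    e.g. {"namenode113": "hostname_of_server1"}.
    """
    conf = {
        "dfs.nameservices": nameservice,
        # Comma-separated namenode ids, in the order given by the caller.
        "dfs.ha.namenodes." + nameservice: ",".join(namenodes),
    }
    for nn_id, host in namenodes.items():
        suffix = "%s.%s" % (nameservice, nn_id)
        conf["dfs.namenode.rpc-address." + suffix] = "%s:%d" % (host, rpc_port)
        conf["dfs.namenode.http-address." + suffix] = "%s:%d" % (host, http_port)
    if kerberos:
        conf["hadoop.security.authentication"] = "kerberos"
    return conf


conf = build_ha_conf(
    "nameservice1",
    {"namenode113": "hostname_of_server1", "namenode188": "hostname_of_server2"},
    kerberos=True,
)
```

The resulting `conf` would then be passed as in the commit message: `pyarrow.hdfs.connect(host='nameservice1', driver='libhdfs3', extra_conf=conf)`. Generating the keys in a loop keeps each namenode id paired with its own rpc and http address entries.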