Commits


Dewey Dunnington authored and GitHub committed 5ce8d79d7ae
ARROW-17332: [R] error parsing folder path with accent ('c:/Público') in read_csv_arrow (#14930)

This PR fixes a bug that prevented some files whose names contain non-ASCII characters from being opened. The probable culprit is `normalizePath()`, which does some handling of special characters but does not mark the encoding of its output in a way that cpp11's conversion to `std::string` understands. Because most of our test environments have UTF-8 as the session encoding, this usually works by accident, and it may work by accident in a latin-1 locale too (judging mostly by the fact that our issue thread is not overflowing with complaints of unopenable files, which may or may not be a good metric).

I've added a test that converts the output of `normalizePath()` and makes sure it's marked as UTF-8. I'll try to replicate this using Docker, too, and see if there's any additional test we could add.

The change here brings arrow in line with what the vroom package does for this operation: https://github.com/tidyverse/vroom/blob/main/R/path.R#L322-L324

Authored-by: Dewey Dunnington <dewey@voltrondata.com>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
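
For readers unfamiliar with the encoding issue described above, here is a minimal sketch of the general idea in base R. It is not the code from this PR: the helper name `normalize_path_utf8()` is hypothetical, and the use of `enc2utf8()` is an assumption about how such a fix is typically written. The path is illustrative and does not need to exist.

```r
# Hypothetical helper (not arrow's actual function): normalize a path and
# make sure the result is converted to, and marked as, UTF-8 so that code
# expecting UTF-8 bytes receives them regardless of the session locale.
normalize_path_utf8 <- function(path) {
  # mustWork = FALSE: do not error or warn if the path does not exist
  normalized <- normalizePath(path, winslash = "/", mustWork = FALSE)
  # enc2utf8() converts from the declared/native encoding to UTF-8
  enc2utf8(normalized)
}

# "\u00fa" is the accented letter in 'Público', written as an escape so the
# script itself stays ASCII.
p <- normalize_path_utf8("c:/P\u00fablico/data.csv")

# Inspect the declared encoding and check that the bytes are valid UTF-8.
print(Encoding(p))
print(validUTF8(p))
```

Checking `Encoding()` on the result is roughly what the test mentioned in the commit message is described as doing; whether the problem reproduces without the conversion depends on the platform and session locale.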