Commits


Haocheng Liu authored and GitHub committed a44b5372c39
GH-41493: [C++][S3] Add a new option to check existence before CreateDir (#41822)

### Rationale for this change

I have a use case where thousands of jobs write Hive-partitioned Parquet files daily to the same bucket via the S3FS filesystem. Because a lot of keys are created at the same time, jobs frequently hit `AWS Error SLOW_DOWN. during Put Object operation: The object exceeded the rate limit for object mutation operations(create, update, and delete). Please reduce your rate request error.` throughout the day, since the code creates directories pessimistically.

### What changes are included in this PR?

Add a new S3Option to check whether the directory exists before creating it in `CreateDir`. It is disabled by default. When it is enabled, `CreateDir` first checks whether the directory exists, so the create operation is performed only when necessary. This costs more I/O calls, but it avoids hitting the cloud vendor's put-object rate limit.

### Are these changes tested?

Added test cases with the flag set to true. Off the top of my head, I don't know how to verify in these tests that the option takes effect. However, we run very similar code in our production environment and it has worked well.

### Are there any user-facing changes?

* GitHub Issue: #41493

Lead-authored-by: Haocheng Liu <lbtinglb@gmail.com>
Co-authored-by: Antoine Pitrou <pitrou@free.fr>
Signed-off-by: Antoine Pitrou <antoine@python.org>
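
The trade-off described in the commit message (more existence checks, fewer put requests) can be illustrated with a minimal sketch. This is not the Arrow implementation: `FakeObjectStore`, `SimulateJobs`, the `check_existence_first` parameter, and the example directory key are all hypothetical names made up for illustration, and the store is an in-memory stand-in that only counts requests.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical in-memory stand-in for an object store. It only counts
// requests by kind; none of these names come from Arrow or the AWS SDK.
struct FakeObjectStore {
  std::set<std::string> keys;
  std::size_t head_requests = 0;  // existence checks (reads)
  std::size_t put_requests = 0;   // object mutations (rate-limited on S3)

  bool Exists(const std::string& key) {
    ++head_requests;
    return keys.count(key) > 0;
  }

  void Put(const std::string& key) {
    ++put_requests;
    keys.insert(key);
  }
};

// Create a directory marker. With check_existence_first == false this always
// issues a put; with true it issues a put only if the marker is missing.
void CreateDir(FakeObjectStore& store, const std::string& dir,
               bool check_existence_first) {
  if (check_existence_first && store.Exists(dir)) {
    return;  // already there: skip the mutation
  }
  store.Put(dir);
}

// Simulate `jobs` writers creating the same partition directory in turn and
// return the store so callers can inspect the request counters.
FakeObjectStore SimulateJobs(std::size_t jobs, bool check_existence_first) {
  FakeObjectStore store;
  for (std::size_t i = 0; i < jobs; ++i) {
    CreateDir(store, "bucket/dataset/year=2024/", check_existence_first);
  }
  assert(store.keys.size() <= 1);  // one directory marker at most
  return store;
}
```

In this sketch, `SimulateJobs(1000, false)` issues 1000 puts and no existence checks, while `SimulateJobs(1000, true)` issues 1 put and 1000 existence checks. The jobs run sequentially here; truly concurrent writers could still race between the check and the put, so the check reduces mutation traffic rather than eliminating duplicates.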