Commits


Tom Jarosz authored and GitHub committed b522f8c5cc1
GH-39313: [Python] Fix race condition in _pandas_api#_check_import (#39314)

### Rationale for this change

See:

```
cdef inline bint _have_pandas_internal(self):
    if not self._tried_importing_pandas:
        self._check_import(raise_=False)
    return self._have_pandas
```

The method `_check_import`:
1) sets `_tried_importing_pandas` to true
2) does some things which take time...
3) sets `_have_pandas` to true (if we indeed do have pandas)

Suppose thread 1 calls `_have_pandas_internal`. If thread 1 is at step 2 while thread 2 calls `_have_pandas_internal`, `_have_pandas_internal` may incorrectly return False for thread 2, as thread 1 has set `_tried_importing_pandas` to true, but has not yet (but will) set `_have_pandas` to True. `_have_pandas_internal` will return True for thread 1.

After my fix, `_have_pandas_internal` will not return an incorrect value in the scenario described above. It would instead result in a redundant, but (I believe) harmless, invocation of `_check_import`.

### What changes are included in this PR?

Changes ordering of "trying to import pandas" and "recording that pandas import has been tried"

### Are these changes tested?

yes, see test committed

### Are there any user-facing changes?

This PR resolves a user-facing race condition https://github.com/apache/arrow/issues/39313

* Closes: #39313

Lead-authored-by: Thomas Jarosz <thomas.jarosz@c3.ai>
Co-authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
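
### Illustrative sketch of the race

The ordering problem described in the rationale can be reproduced in plain Python. This is a minimal sketch, not the pyarrow code: the class `LazyFlag`, the helper `observe`, and the `record_first` switch are hypothetical names, and a pair of `threading.Event` objects stands in for the slow import so that the interleaving is deterministic.

```python
import threading


class LazyFlag:
    """Hypothetical stand-in for the lazy pandas check (not pyarrow code)."""

    def __init__(self, record_first, slow_step):
        self.record_first = record_first  # True mimics the old ordering
        self.slow_step = slow_step        # stands in for the slow import
        self._tried = False
        self._have = False

    def _check(self):
        if self.record_first:
            # Old ordering: "tried" is recorded before the result is known.
            self._tried = True
        self.slow_step()
        self._have = True
        # New ordering: "tried" is recorded only after the result is set.
        self._tried = True

    def have(self):
        if not self._tried:
            self._check()
        return self._have


def observe(record_first):
    """Return what a second caller sees while the first is mid-import."""
    started = threading.Event()
    release = threading.Event()

    def slow_step():
        # Only the worker thread blocks; other callers pass straight through.
        if threading.current_thread() is worker:
            started.set()
            release.wait(timeout=5)

    flag = LazyFlag(record_first, slow_step)
    worker = threading.Thread(target=flag.have)
    worker.start()
    started.wait(timeout=5)   # worker is now parked in the slow step
    seen = flag.have()        # second caller, on the main thread
    release.set()
    worker.join()
    return seen


print("old ordering, second caller sees:", observe(record_first=True))
print("new ordering, second caller sees:", observe(record_first=False))
```

With the old ordering the second caller skips the check and reads the still-unset result; with the new ordering it repeats the check itself, which is the redundant but harmless invocation mentioned above.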