Commit 81d03bb

PERF: Use Arrow path in _factorize_keys for all Arrow Dtypes (#63435)
1 parent 554f9c6 commit 81d03bb

2 files changed: +3 -1 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 2 additions & 0 deletions
@@ -1091,13 +1091,15 @@ Performance improvements
 - Performance improvement in :meth:`CategoricalDtype.update_dtype` when ``dtype`` is a :class:`CategoricalDtype` with non ``None`` categories and ordered (:issue:`59647`)
 - Performance improvement in :meth:`DataFrame.__getitem__` when ``key`` is a :class:`DataFrame` with many columns (:issue:`61010`)
 - Performance improvement in :meth:`DataFrame.astype` when converting to extension floating dtypes, e.g. "Float64" (:issue:`60066`)
+- Performance improvement in :meth:`DataFrame.merge` by using Arrow-native path for all Arrow-backed dtypes (:issue:`63435`)
 - Performance improvement in :meth:`DataFrame.stack` when using ``future_stack=True`` and the DataFrame does not have a :class:`MultiIndex` (:issue:`58391`)
 - Performance improvement in :meth:`DataFrame.to_hdf` avoid unnecessary reopenings of the HDF5 file to speedup data addition to files with a very large number of groups . (:issue:`58248`)
 - Performance improvement in :meth:`DataFrame.where` when ``cond`` is a :class:`DataFrame` with many columns (:issue:`61010`)
 - Performance improvement in ``DataFrameGroupBy.__len__`` and ``SeriesGroupBy.__len__`` (:issue:`57595`)
 - Performance improvement in indexing operations for string dtypes (:issue:`56997`)
 - Performance improvement in unary methods on a :class:`RangeIndex` returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57825`)

+
 .. ---------------------------------------------------------------------------
 .. _whatsnew_300.bug_fixes:

pandas/core/reshape/merge.py

Lines changed: 1 addition & 1 deletion
@@ -2832,7 +2832,7 @@ def _factorize_keys(
         rk = ensure_int64(rk.codes)

     elif isinstance(lk, ExtensionArray) and lk.dtype == rk.dtype:
-        if (isinstance(lk.dtype, ArrowDtype) and is_string_dtype(lk.dtype)) or (
+        if isinstance(lk.dtype, ArrowDtype) or (
             isinstance(lk.dtype, StringDtype) and lk.dtype.storage == "pyarrow"
         ):
             import pyarrow as pa
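
To make the effect of the one-line change in `_factorize_keys` easier to see, here is a minimal standard-library sketch of the old and new conditions. It does not import pandas: `FakeDtype` and the two `takes_arrow_path_*` helpers are hypothetical stand-ins invented for illustration, and the boolean fields only mimic the results of the `isinstance(..., ArrowDtype)`, `is_string_dtype(...)`, and `StringDtype` storage checks shown in the diff.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeDtype:
    """Hypothetical stand-in for a pandas extension dtype (not a pandas class)."""

    label: str
    is_arrow_dtype: bool          # mimics isinstance(dtype, ArrowDtype)
    is_string: bool               # mimics is_string_dtype(dtype)
    is_pyarrow_string_dtype: bool  # mimics StringDtype with storage == "pyarrow"


def takes_arrow_path_before(dtype: FakeDtype) -> bool:
    # Old condition: ArrowDtype qualified only when it was also a string dtype.
    return (dtype.is_arrow_dtype and dtype.is_string) or dtype.is_pyarrow_string_dtype


def takes_arrow_path_after(dtype: FakeDtype) -> bool:
    # New condition: any ArrowDtype qualifies, regardless of the value type.
    return dtype.is_arrow_dtype or dtype.is_pyarrow_string_dtype


# Illustrative cases; the labels are descriptive only.
CASES = [
    FakeDtype("ArrowDtype(pa.string())", True, True, False),
    FakeDtype("ArrowDtype(pa.int64())", True, False, False),
    FakeDtype("StringDtype(storage='pyarrow')", False, True, True),
    FakeDtype("StringDtype(storage='python')", False, True, False),
    FakeDtype("Int64Dtype()", False, False, False),
]

# Dtypes that reach the Arrow branch only after the change.
newly_covered = [
    d.label
    for d in CASES
    if takes_arrow_path_after(d) and not takes_arrow_path_before(d)
]

for d in CASES:
    print(f"{d.label:32} before={takes_arrow_path_before(d)!s:5} after={takes_arrow_path_after(d)}")
```

Under these assumptions, only the non-string `ArrowDtype` case changes branch, which matches the whatsnew entry's wording of "all Arrow-backed dtypes". Note that both branches are still guarded by the enclosing `lk.dtype == rk.dtype` check, which the sketch leaves out.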
