Commit f362c7e
[SPARK-55059][PYTHON] Remove empty table workaround in toPandas
### What changes were proposed in this pull request?
Remove the SPARK-51112 workaround in `_convert_arrow_table_to_pandas()` that bypassed PyArrow's `to_pandas()` for empty tables.
### Why are the changes needed?
The workaround was added because arrow-java's `ListVector.getBufferSizeFor(0)` returned 0, causing the offset buffer to be omitted for empty nested arrays in IPC serialization, which led to a segmentation fault in PyArrow.
This has been fixed upstream in arrow-java 19.0.0 (apache/arrow-java#343), which Spark adopted in SPARK-56000 (PR #54820). The workaround is no longer necessary.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing test `test_to_pandas_for_empty_df_with_nested_array_columns` passes.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #53824 from Yicong-Huang/SPARK-55059/refactor/remove-empty-table-workaround.
Authored-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 file changed: 2 additions and 10 deletions (lines 257-266 removed, lines 257-258 added; diff content not captured in this extraction).