What happened?
Calling .values on a DataArray that has a _FillValue attribute causes a segmentation fault on Python 3.14.2 on Linux. The crash occurs in _apply_mask() (xarray/coding/variables.py) during the CF decoding step that replaces _FillValue entries with NaN.
The same code, same file, same machine works perfectly on Python 3.12 and Python 3.13.
We initially discovered this while reading Sentinel-3 OLCI satellite data (4091×4865 float32 arrays). Sub-sampled reads (e.g. var[::10, ::10].values) worked fine, but full-size reads crashed. Since small arrays didn't crash, we wrote a binary search script to find the exact threshold with synthetic data. The result: 511×511 works and 512×512 (262,144 = 2^18 elements) crashes, which may point to an internal numpy dispatch threshold.
Binary search output
Python: 3.14.2 [Clang 21.1.4 ]
Testing array sizes to find segfault threshold...
2x2 ( 4 elements, 0.0 MB) OK
10x10 ( 100 elements, 0.0 MB) OK
50x50 ( 2,500 elements, 0.0 MB) OK
100x100 ( 10,000 elements, 0.0 MB) OK
200x200 ( 40,000 elements, 0.2 MB) OK
500x500 ( 250,000 elements, 1.0 MB) OK
750x750 ( 562,500 elements, 2.1 MB) SEGFAULT
--- Binary search between 500 and 750 ---
625x625 ( 390,625 elements, 1.5 MB) SEGFAULT
562x562 ( 315,844 elements, 1.2 MB) SEGFAULT
531x531 ( 281,961 elements, 1.1 MB) SEGFAULT
515x515 ( 265,225 elements, 1.0 MB) SEGFAULT
507x507 ( 257,049 elements, 1.0 MB) OK
511x511 ( 261,121 elements, 1.0 MB) OK
513x513 ( 263,169 elements, 1.0 MB) SEGFAULT
512x512 ( 262,144 elements, 1.0 MB) SEGFAULT
Threshold: crashes at 512x512 (262,144 elements, 1.0 MB)
Last OK: 511x511 (261,121 elements, 1.0 MB)
The threshold of 2^18 elements is a power of 2, which could mean the crash hits an internal numpy buffer/dispatch boundary (SIMD strategy, ufunc buffer size, or similar). This is likely a numpy bug on Python 3.14 that xarray exposes through _apply_mask → np.where.
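A minimal sketch of how such a threshold search can be structured, for anyone who wants to repeat it. Each size is probed in a fresh subprocess so a segfault kills only the child. The names CHILD_CODE, crashes and find_threshold are illustrative, and the sketch assumes POSIX semantics (a negative return code means the child died from a signal) and that crashes are monotonic in array size.

```python
import subprocess
import sys

# Child snippet: same steps as the reproducer in this report, with the size
# taken from the command line. Runs in a separate interpreter.
CHILD_CODE = """
import sys
import numpy, xarray
n = int(sys.argv[1])
data = numpy.random.rand(n, n).astype(numpy.float32)
data[0:5, 0:5] = 65534.0
da = xarray.DataArray(data, dims=["rows", "columns"])
da.encoding["_FillValue"] = numpy.float32(65534.0)
da.to_netcdf("/tmp/test_fill.nc")
ds = xarray.open_dataset("/tmp/test_fill.nc")
ds["__xarray_dataarray_variable__"].values
"""


def crashes(size: int) -> bool:
    """Return True if the child interpreter was killed by a signal."""
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", CHILD_CODE, str(size)],
        capture_output=True,
    )
    # On POSIX, subprocess reports death-by-signal as a negative return code.
    return result.returncode < 0


def find_threshold(probe, ok: int, bad: int) -> int:
    """Bisect between a known-good size and a known-bad size.

    Returns the smallest size for which probe(size) is True, assuming
    probe is False up to some size and True from there on.
    """
    while bad - ok > 1:
        mid = (ok + bad) // 2
        if probe(mid):
            bad = mid
        else:
            ok = mid
    return bad

# Usage (spawns one interpreter per probe):
#   print(find_threshold(crashes, 500, 750))
```

Passing the probe as a parameter keeps the bisection logic testable without triggering a real crash.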
$ python -X faulthandler test_crash.py
Fatal Python error: Segmentation fault
Current thread 0x0000793fc3d60740 [python] (most recent call first):
File ".../xarray/coding/variables.py", line 132 in _apply_mask
File ".../xarray/coding/common.py", line 80 in get_duck_array
File ".../xarray/coding/common.py", line 80 in get_duck_array
File ".../xarray/core/indexing.py", line 924 in get_duck_array
File ".../xarray/core/indexing.py", line 970 in get_duck_array
File ".../xarray/core/indexing.py", line 604 in __array__
File ".../xarray/core/variable.py", line 336 in _as_array_or_item
File ".../xarray/core/variable.py", line 556 in values
File ".../xarray/core/dataarray.py", line 798 in values
File ".../test_crash.py", line 12 in main
The crash path is: DataArray.values -> Variable.values -> _as_array_or_item() -> np.asarray() -> _ElementwiseFunctionArray.get_duck_array() -> _apply_mask() -> np.where() segfaults.
_apply_mask is wired up during CFMaskCoder.decode() via functools.partial and lazy_elemwise_func whenever the variable has _FillValue or missing_value attributes. The masking is deferred until .values is accessed, at which point np.where(condition, decoded_fill_value, data) is called on the full array - and this is where it crashes on Python 3.14.
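To make the deferred-masking pattern concrete, here is a simplified, self-contained sketch. It is not xarray's actual code: apply_mask and LazyElementwise are hypothetical stand-ins for _apply_mask and the lazy wrapper, kept only to show where np.where ends up being called on the full array.

```python
from functools import partial

import numpy as np


def apply_mask(data, encoded_fill_values, decoded_fill_value, dtype):
    """Simplified stand-in for the masking step (illustrative, not xarray's code)."""
    data = np.asarray(data, dtype=dtype)
    condition = np.zeros(data.shape, dtype=bool)
    for fill_value in encoded_fill_values:
        condition |= data == fill_value
    # This is the np.where(condition, scalar, large_array) call pattern
    # that the traceback points at.
    return np.where(condition, decoded_fill_value, data)


class LazyElementwise:
    """Hypothetical wrapper that defers `func` until the array is requested."""

    def __init__(self, array, func):
        self.array = array
        self.func = func

    def get_duck_array(self):
        # Nothing is computed until this point, mirroring the .values access.
        return self.func(self.array)


raw = np.random.rand(4, 4).astype(np.float32)
raw[0, 0] = 65534.0

transform = partial(
    apply_mask,
    encoded_fill_values=[np.float32(65534.0)],
    decoded_fill_value=np.nan,
    dtype=np.float32,
)
lazy = LazyElementwise(raw, transform)   # cheap: no masking yet
decoded = lazy.get_duck_array()          # masking happens here
```

In this sketch, as in the crash path above, building the wrapper is cheap and the whole-array np.where only runs when the data is materialized.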
Cross-references
This is most likely a numpy bug exposed through xarray. A parallel issue should be opened on numpy. Related: numpy#28197 (segfault on free-threaded build).
What did you expect to happen?
.values should return a numpy array with _FillValue entries replaced by NaN, without crashing.
Minimal Complete Verifiable Example
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "xarray[complete]@git+https://github.com/pydata/xarray.git@main",
# ]
# ///
#
# This script automatically imports the development branch of xarray to check for issues.
# Please delete this header if you have _not_ tested this script with `uv run`!
import xarray as xr
xr.show_versions()
# your reproducer code ...
import numpy
size = 512 # 511 works, 512 segfaults
data = numpy.random.rand(size, size).astype(numpy.float32)
data[0:5, 0:5] = 65534.0
da = xr.DataArray(data, dims=["rows", "columns"])
da.encoding["_FillValue"] = numpy.float32(65534.0)
da.to_netcdf("/tmp/test_fill.nc")
ds = xr.open_dataset("/tmp/test_fill.nc")
var = ds["__xarray_dataarray_variable__"]
print(var.values) # segfaults on Python 3.14 Linux
Steps to reproduce
- Use Python 3.14.2 on Linux (tested on Ubuntu 24.04)
- Install xarray and h5netcdf (or netCDF4) via pip or uv
- Run the MVCE above
- Observe: 511×511 arrays work, 512×512 arrays segfault
MVCE confirmation
Relevant log output
python -X faulthandler -c "
import numpy, xarray
data = numpy.random.rand(512, 512).astype(numpy.float32)
data[0:5, 0:5] = 65534.0
da = xarray.DataArray(data, dims=['rows', 'columns'])
da.encoding['_FillValue'] = numpy.float32(65534.0)
da.to_netcdf('/tmp/test.nc')
ds = xarray.open_dataset('/tmp/test.nc')
print(ds['__xarray_dataarray_variable__'].values)
"
Fatal Python error: Segmentation fault
Current thread 0x0000789b39442740 [python] (most recent call first):
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/coding/variables.py", line 132 in _apply_mask
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/coding/common.py", line 80 in get_duck_array
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/core/indexing.py", line 924 in get_duck_array
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/core/indexing.py", line 970 in get_duck_array
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/core/indexing.py", line 604 in __array__
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/core/variable.py", line 336 in _as_array_or_item
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/core/variable.py", line 556 in values
File "/mnt/d/poly/dask_tests/.venv314/lib/python3.14/site-packages/xarray/core/dataarray.py", line 798 in values
File "<string>", line 9 in <module>
Current thread's C stack trace (most recent call first):
[1] 106458 segmentation fault (core dumped)
Anything else we need to know?
The 512×512 threshold (2^18 elements) is almost certainly a numpy internal boundary, likely NPY_BUFSIZE, a SIMD dispatch threshold, or a ufunc buffer size. The crash is in np.where(condition, scalar, large_array) called from _apply_mask. We suspect this is ultimately a numpy bug on the Python 3.14 + Linux + Clang 21 combination, but we're reporting here since xarray is where it surfaces.
It may be worth testing whether a bare np.where(arr > 0.5, np.nan, arr) on a 512×512 float32 array also segfaults on Python 3.14 on Linux. If it does, the issue should be filed on numpy directly.
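A possible xarray-free check along those lines is sketched below. We have not confirmed whether it reproduces the crash; the helper name bare_where is illustrative, and the fill value and 5×5 patch simply mirror the reproducer. Running it with python -X faulthandler gives a Python-level traceback if it does segfault.

```python
import numpy as np


def bare_where(size, fill=np.float32(65534.0)):
    """Mask fill values with NaN using only numpy (no xarray involved)."""
    arr = np.random.rand(size, size).astype(np.float32)
    arr[0:5, 0:5] = fill
    return np.where(arr == fill, np.nan, arr)


# 511 is the last size that worked through xarray, 512 the first that crashed.
for size in (511, 512):
    out = bare_where(size)
    print(size, out.dtype, int(np.isnan(out).sum()))
```

If 512 crashes here as well, xarray is out of the picture and the report belongs on the numpy tracker; if it does not, the difference between this call and the one made in _apply_mask is the next thing to narrow down.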
Environment
python: 3.14.2 (main, Jan 27 2026, 23:59:57) [Clang 21.1.4]
OS: Ubuntu 24.04 LTS
xarray==2026.2.0
numpy==2.4.2