to_geotiff drops rioxarray nodatavals and CF _FillValue silently #1582

@brendancol

Summary

to_geotiff only reads attrs['nodata'] when extracting the NoData
sentinel from a DataArray. DataArrays produced by rioxarray (the most
common GeoTIFF reader in the ecosystem) carry the sentinel under
attrs['nodatavals'] (tuple, one per band), and CF-style xarray
pipelines use attrs['_FillValue']. Both are silently dropped on
write: the file lands without a GDAL_NODATA tag and the sentinel pixels
are stored as ordinary data values.

The user has to know to rename the attr by hand before the call. This
is exactly the failure mode the metadata-propagation sweep targets:
the call succeeds, the file looks fine, but a downstream reader sees a
raster with no nodata and treats the sentinel pixels as real data.

Reproduction

import numpy as np, xarray as xr
from xrspatial.geotiff import to_geotiff, open_geotiff

arr = np.array([[1., 2., -9999.], [3., -9999., 5.]], dtype=np.float32)
da = xr.DataArray(arr, dims=['y', 'x'],
                  coords={'y': [10., 20.], 'x': [100., 110., 120.]},
                  attrs={'crs': 4326, 'nodatavals': (-9999.,)})
to_geotiff(da, '/tmp/r.tif')
rd = open_geotiff('/tmp/r.tif')
print(rd.attrs.get('nodata'))   # None
print(rd.values)                # sentinel still present, no NaN

Fix

In to_geotiff (CPU path) and write_geotiff_gpu (GPU path), when the
nodata argument is None and attrs['nodata'] is unset, fall back to
attrs['nodatavals'] (rioxarray) and then to attrs['_FillValue'] (CF).
If nodatavals is a sequence, use the first band's value.
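
A minimal sketch of that lookup order, written against a plain attrs mapping. The helper name resolve_nodata and its signature are hypothetical, not part of xrspatial. Skipping a None entry in nodatavals is also an assumption about how a band without a sentinel should be handled.

```python
def resolve_nodata(attrs, nodata=None):
    """Return the NoData sentinel to write, or None if none is declared.

    Hypothetical helper: checks the explicit argument first, then
    attrs['nodata'], attrs['nodatavals'], and attrs['_FillValue'].
    """
    # An explicit argument always wins over anything stored in attrs.
    if nodata is not None:
        return nodata

    for key in ("nodata", "nodatavals", "_FillValue"):
        value = attrs.get(key)
        if value is None:
            continue
        # nodatavals holds one value per band; use the first band's value.
        if isinstance(value, (tuple, list)):
            if not value or value[0] is None:
                continue  # empty, or first band declares no sentinel
            value = value[0]
        return value

    return None
```

Until a fallback like this exists in the writer, the same helper could serve as the by-hand workaround: da.attrs['nodata'] = resolve_nodata(da.attrs) before calling to_geotiff.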

Found by the metadata-propagation sweep (Cat 1 + Cat 4).
