TIFF to GPU memory via cog3pio backend entrypoint #81
Draft: weiji14 wants to merge 1 commit into main from cog3pio-backend (+160 −4)
Changed file `cupy_xarray/__init__.py`:

```diff
@@ -1,4 +1,5 @@
 from . import _version
-from .accessors import CupyDataArrayAccessor, CupyDatasetAccessor  # noqa
+from .accessors import CupyDataArrayAccessor, CupyDatasetAccessor  # noqa: F401
+from .cog3pio import Cog3pioBackendEntrypoint  # noqa: F401

 __version__ = _version.get_versions()["version"]
```
New file `cupy_xarray/cog3pio.py` (+100 lines):

```python
"""
`cog3pio` backend for xarray to read TIFF files directly into CuPy arrays in GPU memory.
"""

import os
from collections.abc import Iterable

import cupy as cp  # type: ignore[import-untyped]
import numpy as np
import xarray as xr
from cog3pio import CudaCogReader
from xarray.backends import BackendEntrypoint


# %%
class Cog3pioBackendEntrypoint(BackendEntrypoint):
    """
    Xarray backend to read GeoTIFF files using the 'cog3pio' engine.

    When using :py:func:`xarray.open_dataarray` with ``engine="cog3pio"``, the
    ``device_id`` parameter can be set to the CUDA GPU id to do the decoding on.

    Examples
    --------
    Read a GeoTIFF from an HTTP URL into an [xarray.DataArray][]:

    >>> import xarray as xr
    >>> # Read GeoTIFF into an xarray.DataArray
    >>> dataarray: xr.DataArray = xr.open_dataarray(
    ...     filename_or_obj="https://github.com/OSGeo/gdal/raw/v3.11.0/autotest/gcore/data/byte_zstd.tif",
    ...     engine="cog3pio",
    ...     device_id=0,  # cuda:0
    ... )
    >>> dataarray.sizes
    Frozen({'band': 1, 'y': 20, 'x': 20})
    >>> dataarray.dtype
    dtype('uint8')
    """

    description = "Use .tif files in Xarray"
    open_dataset_parameters = ("filename_or_obj", "drop_variables", "device_id")
    url = "https://github.com/weiji14/cog3pio"

    def open_dataset(  # type: ignore[override]
        self,
        filename_or_obj: str,
        *,
        drop_variables: str | Iterable[str] | None = None,
        device_id: int,
        # other backend specific keyword arguments
        # `chunks` and `cache` DO NOT go here, they are handled by xarray
        mask_and_scale=None,
    ) -> xr.Dataset:
        """
        Backend open_dataset method used by Xarray in [xarray.open_dataset][].

        Parameters
        ----------
        filename_or_obj : str
            File path or URL to a TIFF (.tif) image file that can be read by the
            nvTIFF or image-tiff backend library.
        device_id : int
            CUDA device ID on which to place the created cupy array.

        Returns
        -------
        xarray.Dataset
        """
        with cp.cuda.Stream(ptds=True):
            cog = CudaCogReader(path=filename_or_obj, device_id=device_id)
            array_: cp.ndarray = cp.from_dlpack(cog)  # 1-D array
            x_coords, y_coords = cog.xy_coords()  # TODO consider using rasterix
            height, width = (len(y_coords), len(x_coords))
            channels: int = len(array_) // (height * width)
            # TODO make API to get proper 3-D shape directly, or use cuTENSOR
            array_ = array_.reshape(height, width, channels)  # HWC
            array = array_.transpose(2, 0, 1)  # CHW

        dataarray: xr.DataArray = xr.DataArray(
            data=array,
            coords={
                "band": np.arange(channels, dtype=np.uint8),
                "y": y_coords,
                "x": x_coords,
            },
            name=None,
            attrs=None,
        )

        return dataarray.to_dataset(name="raster")

    def guess_can_open(self, filename_or_obj):
        try:
            _, ext = os.path.splitext(filename_or_obj)
        except TypeError:
            return False
        return ext in {".tif", ".tiff"}
```
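The reshape/transpose step in `open_dataset` assumes the decoder hands back a flat 1-D buffer in pixel-interleaved (HWC) order, which is then rearranged to the CHW layout that the `("band", "y", "x")` dimension order expects. A minimal NumPy sketch of that layout conversion (toy sizes, standing in for the CuPy array on the GPU):

```python
import numpy as np

# Toy 2x3 image with 4 channels, flattened the way the PR's reshape implies
# the decoder returns it: pixel-interleaved (HWC) order.
height, width, channels = 2, 3, 4
flat = np.arange(height * width * channels)

hwc = flat.reshape(height, width, channels)  # HWC, as decoded
chw = hwc.transpose(2, 0, 1)                 # CHW, matching ("band", "y", "x")

# Each band is now a contiguous (height, width) plane:
# chw[c, y, x] == hwc[y, x, c]
```

The same two calls (`reshape` then `transpose`) exist with identical semantics on `cupy.ndarray`, so the sketch mirrors the GPU code path exactly; note that `transpose` returns a strided view, not a copy.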
New test file (+34 lines):

```python
"""
Tests for xarray 'cog3pio' backend engine.
"""

import cupy as cp
import pytest
import xarray as xr

from cupy_xarray.cog3pio import Cog3pioBackendEntrypoint

cog3pio = pytest.importorskip("cog3pio")


def test_entrypoint():
    assert "cog3pio" in xr.backends.list_engines()


def test_xarray_backend_open_dataarray():
    """
    Ensure that passing engine='cog3pio' to xarray.open_dataarray works to read a
    Cloud-optimized GeoTIFF from an HTTP URL.
    """
    with xr.open_dataarray(
        filename_or_obj="https://github.com/developmentseed/titiler/raw/1.2.0/src/titiler/mosaic/tests/fixtures/TCI.tif",
        engine=Cog3pioBackendEntrypoint,
        device_id=0,
    ) as da:
        assert isinstance(da.data, cp.ndarray)
        assert da.sizes == {"band": 3, "y": 1098, "x": 1098}
        assert da.x.min() == 700010.0
        assert da.x.max() == 809710.0
        assert da.y.min() == 3490250.0
        assert da.y.max() == 3599950.0
        assert da.dtype == "uint8"
```
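`test_entrypoint` checks that `"cog3pio"` shows up in `xr.backends.list_engines()`, which relies on the entrypoint being registered in the package metadata. A hedged sketch of what that registration might look like in `pyproject.toml` (xarray's standard `"xarray.backends"` entry-point group; the module path follows the test's import of `cupy_xarray.cog3pio`):

```toml
# Hypothetical packaging snippet; not shown in this diff.
[project.entry-points."xarray.backends"]
cog3pio = "cupy_xarray.cog3pio:Cog3pioBackendEntrypoint"
```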
Review comment: How would this be handled on a multi-GPU system? You may want to load many TIFF files into a dask-cupy-xarray object where different chunks are on different GPUs. This API feels a little inflexible for that use case.
Author reply: Exactly the feedback I needed! Short answer is: I'm probably gonna change the signature of this parameter to `device_id: int | None = None`, where the default of `None` means to get the 'current device' from `cp.cuda.runtime.getDevice()`.

Longer answer is: I'm currently using `nvtiffDecoderCreateSimple()`, which uses the default memory allocator. The multi-GPU case would probably mean I need to use `nvtiffDecoderCreate` instead, which allows a custom device allocator, which I presume dask will have some way of handling. I see dask's scope as more to do with parallel compute, not I/O from a file format, so I would appreciate any advice here (the xarray <-> dask integration piece has always felt very CPU-centric to me 🙂).

Note: Alternatively, I also considered having the parameter be just `device`, taking a `cupy.cuda.Device` object. I didn't go with this option (yet) because I'd prefer to have something more cross-framework (e.g. allow `torch.cuda.device` or `tf.device`) to get the `device_id`, something touched on in data-apis/array-api#972, which proposes a `__dlpack_device__()` protocol.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review comment: This would probably be fine for a multi-GPU setup. Generally the `NVIDIA_VISIBLE_DEVICES` env var is set to a unique index for each worker (in Dask this is something `dask_cuda.LocalCUDACluster` and `dask_cuda.CUDAWorker` handle), so when a worker uses the "current device" it would be different for each worker.

Dask is just a task scheduler with some high-level collections. It doesn't matter if the task is compute, IO or anything else (is there anything else? 😅). But overall you need to think about how the high-level collection object filters down to the lower-level Dask calls.

If I have a VM with four GPUs, and I call something along the lines of `xr.open_mfdataset(filename_or_obj="mytiffs/*.tiff", engine="cog3pio")`, you want to avoid being explicit with the device, otherwise everything will end up on one device, wasting the other three.

It's true that `dask-cuda` is a separate package that adds GPU logic to Dask. But GPUs are well supported in Dask today. There may just be work to be done wiring things up to collections like xarray.