diff --git a/.claude/sweep-test-coverage-state.csv b/.claude/sweep-test-coverage-state.csv index 73a96954..f1abe453 100644 --- a/.claude/sweep-test-coverage-state.csv +++ b/.claude/sweep-test-coverage-state.csv @@ -1,3 +1,3 @@ module,last_inspected,issue,severity_max,categories_found,notes -geotiff,2026-05-15,1921,HIGH,1;2;3;4,"Pass 15 (2026-05-15): added test_write_vrt_bool_nodata_1921.py closing Cat 1 HIGH backend-parity gap on bool nodata rejection. Issue #1911 added the isinstance(nodata, (bool, np.bool_)) -> TypeError guard at to_geotiff and build_geo_tags, but the sibling writers were left unchecked: write_vrt(nodata=True) silently emits True into the VRT XML (str(True) drops the sentinel because no reader parses 'True' as numeric); write_geotiff_gpu direct call relies on the build_geo_tags defense-in-depth rather than an entry-point check, so a future refactor moving that guard would regress the GPU writer with no test coverage. 17 new tests: 4 xfail (strict=True) pinning the write_vrt fix surface (issue #1921), 1 passing pin on the current buggy str(True) emission so the fix is visible as a diff, 6 numeric/None happy-path tests on write_vrt, 4 GPU writer direct-call bool-reject tests (4 dtypes x 1 call), 1 to_geotiff(gpu=True) dispatcher thread-through. Filed issue #1921 for the write_vrt fix (sweep is test-only). Cat 1 HIGH (write_vrt backend parity bug) + Cat 1 MEDIUM (write_geotiff_gpu defense-in-depth pin). Pass 14 (2026-05-15): added test_dask_streaming_write_degenerate_2026_05_15.py closing Cat 3 HIGH and Cat 2 HIGH/MEDIUM gaps on the dask streaming write path (to_geotiff with dask-backed DataArray, #1084). test_streaming_write.py covered 100x100 with a NaN block plus a 2x2 small raster but had nothing 1-pixel-row, 1-pixel-column, all-NaN, all-Inf, or +/-Inf-mixed. The streaming tile-row segmenter (#1485) on a 1-pixel-tall raster and the streaming nodata-mask coercion on an all-NaN chunk were reachable only with a dask input and had no direct coverage; a regression on either would not surface from the eager numpy path or the write_geotiff_gpu path (pass 5 covered the GPU writer's degenerate shapes). 16 new tests, all passing: 1x1 chunk-matches-shape + nodata-attr round-trip + uint16, 1xN single chunk + chunks-split-columns + wide-segmented-by-buffer (#1485 streaming_buffer_bytes=1 forces the segmenter), Nx1 single chunk + chunks-split-rows, all-NaN with finite sentinel + all-NaN without sentinel, mixed NaN/+Inf/-Inf preserving Inf bit-exact + sentinel masking NaN only, all-+Inf and all--Inf, predictor=3 (float predictor) round-trip on float32 + float64 plus int-dtype ValueError. predictor=3 streaming coverage extends the small-chunk and int-rejection geometry around test_predictor_fp_write_1313.test_predictor3_streaming_dask (which already covers a 128x192 predictor=3 dask streaming write with a Predictor-tag assertion). Cat 3 HIGH (1x1/1xN/Nx1) + Cat 2 HIGH (all-NaN with sentinel) + Cat 2 MEDIUM (mixed-Inf, all-Inf) + Cat 4 MEDIUM (predictor=3 streaming). Pass 13 (2026-05-13): added test_size_param_validation_gpu_vrt_1776.py closing Cat 4 HIGH parameter-coverage gap on size-arg validation. Issue #1752 added tile_size validation to to_geotiff and chunks validation to read_geotiff_dask, but the matching kwargs on three sibling entry points were left unchecked: write_geotiff_gpu(tile_size=) raised ZeroDivisionError for 0, struct.error for -1, TypeError for 256.0; read_geotiff_gpu(chunks=) and read_vrt(chunks=) raised ZeroDivisionError for 0 and silently accepted negative values. Factored two shared validators (_validate_tile_size_arg, _validate_chunks_arg) and called them up front from each entry point. 34 new tests, all passing on GPU host: tile_size matrix on write_geotiff_gpu (0/-1/256.0/True/False/positive/np.int64), chunks matrix on read_geotiff_gpu and read_vrt (0/-1/(0,N)/(N,-1)/wrong-length/bool/non-int/(N,float)/positive/np.int64), dispatcher thread-through tests (open_geotiff(gpu=True, chunks=0), to_geotiff(gpu=True, tile_size=0)). Pre-existing 13 #1752 tests still pass after refactor. Filed issue #1776. Pass 12 (2026-05-12): added test_gpu_writer_overview_mode_and_compression_level_1740.py closing Cat 4 HIGH and Cat 4 MEDIUM parameter-coverage gaps. (1) write_geotiff_gpu(overview_resampling='mode') and the dedicated _block_reduce_2d_gpu mode-fallback branch (_gpu_decode.py:3051-3056) had zero direct tests; six of the seven overview_resampling modes were covered (mean/nearest by test_features, min/max/median by pass 6, cubic by test_signature_parity_1631) but mode was the odd one out -- a regression dropping the mode dispatch from _block_reduce_2d_gpu would fall through to the mean reshape branch and emit wrong overview pixels for integer rasters. (2) write_geotiff_gpu(compression_level=) documented as accepted-but-ignored had no test; the CPU writer rejects out-of-range levels with ValueError, the GPU writer is documented not to -- a regression wiring the GPU writer up to the CPU range validator would silently break every to_geotiff(gpu=True, compression_level=X) caller for in-range levels and noisily for out-of-range. 19 tests, all passing on GPU host: _block_reduce_2d_gpu(method='mode') CPU-parity on 4x4 deterministic + random 8x8 + dtype-preserved across u8/u16/i16/i32, write_geotiff_gpu(cog=True, overview_resampling='mode') end-to-end round trip, to_geotiff(gpu=True, ..., overview_resampling='mode') dispatcher thread-through, GPU-vs-CPU pixel parity on 8x8 input, write_geotiff_gpu(compression_level=) in-range matrix on zstd/deflate, out-of-range matrix (zstd=999/-5, deflate=50/0) accepted without raising + round-trip preserved, to_geotiff(gpu=True, compression_level=999) dispatcher thread-through, companion CPU rejects-OOR pin to lock the asymmetry. Mutation against the mode branch (drop the 'if method == mode' block in _block_reduce_2d_gpu) flipped 9 mode tests red. Filed issue #1740. Pass 11 (2026-05-12): added test_gpu_writer_cpu_fallback_codecs_2026_05_12.py closing a Cat 4 HIGH parameter-coverage gap on write_geotiff_gpu compression= modes for the CPU-fallback codecs (lzw, packbits, lz4, lerc, jpeg2000/j2k). Pass 7 (test_gpu_writer_compression_modes_2026_05_11) covered only none/deflate/zstd/jpeg; the remaining five codecs route through dedicated branches in gpu_compress_tiles (_gpu_decode.py:2974-3019) with CPU fallbacks (lerc_compress, jpeg2000_compress, cpu_compress) that had zero direct tests via write_geotiff_gpu. A regression in routing/tag-wiring/fallback dispatch would ship silently because the internal reader uses the same compression-tag table. 17 tests, all passing on GPU host: lzw/packbits/lz4 round-trip + compression-tag pin on uint16, lerc lossless float32 + uint16 round-trip + tag pin, jpeg2000 uint8 single-band + RGB multi-band lossless round-trip + j2k-alias parity + tag pin, GPU-vs-CPU writer pixel parity for lzw/packbits, to_geotiff(gpu=True, compression=lzw/packbits) dispatcher thread-through. Mutation against compression dispatch (swap lzw bytes to zstd; swap lerc bytes to deflate) flipped round-trip tests red. Filed issue #1706. Pass 10 (2026-05-12): added test_kwarg_behaviour_2026_05_12_v2.py closing two Cat 4 HIGH parameter-coverage gaps. (1) write_geotiff_gpu(predictor=True/2/3) had zero direct tests; the GPU writer threads predictor= through normalize_predictor and gpu_compress_tiles into five CUDA encode kernels (_predictor_encode_kernel_u8/u16/u32/u64 for predictor=2, _fp_predictor_encode_kernel for predictor=3) and a regression dropping the encode-kernel calls would ship corrupt files. (2) read_vrt(window=) had no behaviour tests (only a signature pin in test_signature_annotations_1654); the kwarg is documented and _vrt.read_vrt implements full windowed-read semantics (clip, multi-source overlap, src/dst scaling, GeoTransform origin shift on coords + attrs['transform']). 23 tests, all passing on GPU host: predictor=True/2 round-trips on u8/u16/i32 + 3-band RGB samples_per_pixel stride; predictor=3 lossless round-trip on f32 and f64; predictor=3 int-dtype ValueError (CPU/GPU parity); CPU/GPU pixel-exact parity for pred=2 u16 and pred=3 f32; read_vrt(window=) subregion + full + clamp-overflow + clamp-negative + 2x1 mosaic seam straddle + offset past seam + transform-attr origin shift + y/x coords half-pixel shift + window+band + window+chunks (dask) + window+gpu (cupy) + window+gpu+chunks (dask+cupy). Mutation against the encode dispatch flipped 7 predictor tests red. Filed issue #1690. Pass 9 (2026-05-12): added test_kwarg_behaviour_2026_05_12.py closing three Cat 4 MEDIUM parameter-coverage gaps plus one Cat 4 LOW error path. write_vrt documented kwargs (relative/crs_wkt/nodata) had a smoke-test pinning that the kwargs are accepted but no test verified the override *effect* -- a regression dropping the override branch and silently using the default-from-first-source would ship undetected. read_geotiff_gpu(dtype=) cast had zero direct tests; the eager path has TestDtypeEager and dask has TestDtypeDask but the GPU branch had no equivalent. write_geotiff_gpu(bigtiff=) threads through to _assemble_tiff(force_bigtiff=) but no test asserted the on-disk header byte switches; the CPU writer had it via test_features::test_force_bigtiff_via_public_api. write_vrt(source_files=[]) ValueError was uncovered. 26 tests, all passing on GPU host: write_vrt relative=True/False XML attribute + path inspection + parse-back round-trip, write_vrt crs_wkt= override distinct-from-default XML check, write_vrt nodata= override + default-from-source coverage, write_vrt([]) ValueError + no-file side effect, read_geotiff_gpu dtype= matrix (float64->float32, float64->float16, uint16->int32, uint16->uint8, float-to-int raise, dtype=None preserves native), open_geotiff(gpu=True, dtype=) dispatcher, read_geotiff_gpu(chunks=, dtype=) dask+GPU branch, write_geotiff_gpu bigtiff=True/False/None header verification, to_geotiff(gpu=True, bigtiff=True) dispatcher thread-through. Pass 8 (2026-05-11): added test_lz4_compression_level_2026_05_11.py closing Cat 4 MEDIUM parameter-coverage gap on compression='lz4' + compression_level=. _LEVEL_RANGES advertises lz4: (0, 16) but only deflate (1, 9) and zstd (1, 22) had direct level boundary + round-trip + reject tests. The range check is the gatekeeper -- lz4_compress silently accepts any int level -- so a regression dropping 'lz4' from _LEVEL_RANGES would ship undetected. 18 tests, all passing: round-trip at levels 0/1/9/16 (lossless), default-level no-arg path, higher-level-not-larger smoke check on compressible input, out-of-range reject at -1/-10/17/100 on eager path, valid-range message format pin (lz4 valid: 0-16), dask streaming round-trip at 0/1/8/16, dask streaming out-of-range reject at -1/17/50 (separate _LEVEL_RANGES call site). Pass 7 (2026-05-11): added test_gpu_writer_compression_modes_2026_05_11.py closing Cat 4 HIGH gap on write_geotiff_gpu compression= modes. The writer documents zstd (default, fastest GPU), deflate, jpeg, and none, but only deflate + none had round-trip tests; the default zstd and the jpeg (nvJPEG/Pillow) paths shipped without targeted coverage. 11 new tests, all passing on GPU host: zstd round-trip + default-codec pinning, jpeg round-trip on 3-band RGB uint8 + 1-band greyscale, TIFF compression-tag header check across none/deflate/zstd/jpeg, plain deflate + none round-trips outside the COG/sentinel paths, and a cross-codec lossless parity check (zstd/deflate/none agree pixel-exact). nvJPEG path was exercised live, not just the Pillow fallback. Pass 6 (2026-05-11): added test_overview_resampling_min_max_median_2026_05_11.py covering Cat 4 HIGH parameter-coverage gap on overview_resampling=min/max/median. CPU end-to-end paths were already covered by test_cog_overview_nodata_1613::test_cpu_cog_overview_aggregations_ignore_sentinel; the GPU end-to-end paths and the direct CPU+GPU block-reducer branches had no targeted tests, so a regression on those code paths would ship undetected. 26 tests, all passing on GPU host: block-reducer unit tests (finite + partial-NaN), end-to-end COG writes for both to_geotiff and write_geotiff_gpu, CPU/GPU parity for to_geotiff(gpu=True), CPU nodata-sentinel regression check, and ValueError error-path tests for unknown method names on both backends. Pass 5 (2026-05-11): added test_degenerate_shapes_backends_2026_05_11.py covering Cat 3 HIGH geometric gaps (1x1 / 1xN / Nx1 reads on dask+numpy, GPU, dask+cupy backends; 1x1 / 1xN / Nx1 writes through write_geotiff_gpu) and Cat 2 MEDIUM NaN/Inf gaps (all-NaN read on GPU + dask+cupy, Inf / -Inf reads on all non-eager backends, NaN sentinel mask on dask read path including sentinel block split across chunk boundary). 23 tests, all passing on GPU host. Prior passes still hold: pass 4 (r4) closed read_geotiff_gpu/dask name= + max_pixels= kwargs (Cat 4), pass 3 (r3) closed read_vrt GPU/dask+GPU backend dispatch (Cat 1) and dtype/name kwargs (Cat 4)." +geotiff,2026-05-15,1974,HIGH,1;2;3;4,"Pass 16 (2026-05-15): added test_max_cloud_bytes_dispatcher_silent_drop_2026_05_15.py closing Cat 4 HIGH parameter-coverage gap on the open_geotiff dispatcher's max_cloud_bytes kwarg. The kwarg was added in #1928 (eager fsspec budget) and re-ordered into the canonical reader signature by #1957, but open_geotiff only forwards it to _read_to_array on the eager non-VRT branch (__init__.py:431). The GPU branch at line 410, the dask branch at line 422, and the VRT branch at line 362 never reference the kwarg, so open_geotiff(p, max_cloud_bytes=8, gpu=True) / open_geotiff(p, max_cloud_bytes=8, chunks=N) / open_geotiff(vrt, max_cloud_bytes=8) all silently drop the budget. Same class of dispatcher-silently-drops-backend-kwarg bug fixed by #1561 / #1605 / #1685 / #1810 for other kwargs; the two sibling kwargs on_gpu_failure (line 339) and missing_sources (line 355) already raise ValueError when used on a path where they do not apply. 11 tests: 4 xfail(strict=True) pinning the fix surface (gpu, dask, vrt, dask+gpu), 3 passing pins on the current silent-drop behaviour so the fix is visible as a diff, 4 positive pins that the eager local + file-like paths accept the kwarg (docstring no-op contract). Filed issue #1974 for the dispatcher fix (sweep is test-only). Cat 4 HIGH (silent backend-kwarg drop). Pass 15 (2026-05-15): added test_write_vrt_bool_nodata_1921.py closing Cat 1 HIGH backend-parity gap on bool nodata rejection. Issue #1911 added the isinstance(nodata, (bool, np.bool_)) -> TypeError guard at to_geotiff and build_geo_tags, but the sibling writers were left unchecked: write_vrt(nodata=True) silently emits True into the VRT XML (str(True) drops the sentinel because no reader parses 'True' as numeric); write_geotiff_gpu direct call relies on the build_geo_tags defense-in-depth rather than an entry-point check, so a future refactor moving that guard would regress the GPU writer with no test coverage. 17 new tests: 4 xfail (strict=True) pinning the write_vrt fix surface (issue #1921), 1 passing pin on the current buggy str(True) emission so the fix is visible as a diff, 6 numeric/None happy-path tests on write_vrt, 4 GPU writer direct-call bool-reject tests (4 dtypes x 1 call), 1 to_geotiff(gpu=True) dispatcher thread-through. Filed issue #1921 for the write_vrt fix (sweep is test-only). Cat 1 HIGH (write_vrt backend parity bug) + Cat 1 MEDIUM (write_geotiff_gpu defense-in-depth pin). Pass 14 (2026-05-15): added test_dask_streaming_write_degenerate_2026_05_15.py closing Cat 3 HIGH and Cat 2 HIGH/MEDIUM gaps on the dask streaming write path (to_geotiff with dask-backed DataArray, #1084). test_streaming_write.py covered 100x100 with a NaN block plus a 2x2 small raster but had nothing 1-pixel-row, 1-pixel-column, all-NaN, all-Inf, or +/-Inf-mixed. The streaming tile-row segmenter (#1485) on a 1-pixel-tall raster and the streaming nodata-mask coercion on an all-NaN chunk were reachable only with a dask input and had no direct coverage; a regression on either would not surface from the eager numpy path or the write_geotiff_gpu path (pass 5 covered the GPU writer's degenerate shapes). 16 new tests, all passing: 1x1 chunk-matches-shape + nodata-attr round-trip + uint16, 1xN single chunk + chunks-split-columns + wide-segmented-by-buffer (#1485 streaming_buffer_bytes=1 forces the segmenter), Nx1 single chunk + chunks-split-rows, all-NaN with finite sentinel + all-NaN without sentinel, mixed NaN/+Inf/-Inf preserving Inf bit-exact + sentinel masking NaN only, all-+Inf and all--Inf, predictor=3 (float predictor) round-trip on float32 + float64 plus int-dtype ValueError. predictor=3 streaming coverage extends the small-chunk and int-rejection geometry around test_predictor_fp_write_1313.test_predictor3_streaming_dask (which already covers a 128x192 predictor=3 dask streaming write with a Predictor-tag assertion). Cat 3 HIGH (1x1/1xN/Nx1) + Cat 2 HIGH (all-NaN with sentinel) + Cat 2 MEDIUM (mixed-Inf, all-Inf) + Cat 4 MEDIUM (predictor=3 streaming). Pass 13 (2026-05-13): added test_size_param_validation_gpu_vrt_1776.py closing Cat 4 HIGH parameter-coverage gap on size-arg validation. Issue #1752 added tile_size validation to to_geotiff and chunks validation to read_geotiff_dask, but the matching kwargs on three sibling entry points were left unchecked: write_geotiff_gpu(tile_size=) raised ZeroDivisionError for 0, struct.error for -1, TypeError for 256.0; read_geotiff_gpu(chunks=) and read_vrt(chunks=) raised ZeroDivisionError for 0 and silently accepted negative values. Factored two shared validators (_validate_tile_size_arg, _validate_chunks_arg) and called them up front from each entry point. 34 new tests, all passing on GPU host: tile_size matrix on write_geotiff_gpu (0/-1/256.0/True/False/positive/np.int64), chunks matrix on read_geotiff_gpu and read_vrt (0/-1/(0,N)/(N,-1)/wrong-length/bool/non-int/(N,float)/positive/np.int64), dispatcher thread-through tests (open_geotiff(gpu=True, chunks=0), to_geotiff(gpu=True, tile_size=0)). Pre-existing 13 #1752 tests still pass after refactor. Filed issue #1776. Pass 12 (2026-05-12): added test_gpu_writer_overview_mode_and_compression_level_1740.py closing Cat 4 HIGH and Cat 4 MEDIUM parameter-coverage gaps. (1) write_geotiff_gpu(overview_resampling='mode') and the dedicated _block_reduce_2d_gpu mode-fallback branch (_gpu_decode.py:3051-3056) had zero direct tests; six of the seven overview_resampling modes were covered (mean/nearest by test_features, min/max/median by pass 6, cubic by test_signature_parity_1631) but mode was the odd one out -- a regression dropping the mode dispatch from _block_reduce_2d_gpu would fall through to the mean reshape branch and emit wrong overview pixels for integer rasters. (2) write_geotiff_gpu(compression_level=) documented as accepted-but-ignored had no test; the CPU writer rejects out-of-range levels with ValueError, the GPU writer is documented not to -- a regression wiring the GPU writer up to the CPU range validator would silently break every to_geotiff(gpu=True, compression_level=X) caller for in-range levels and noisily for out-of-range. 19 tests, all passing on GPU host: _block_reduce_2d_gpu(method='mode') CPU-parity on 4x4 deterministic + random 8x8 + dtype-preserved across u8/u16/i16/i32, write_geotiff_gpu(cog=True, overview_resampling='mode') end-to-end round trip, to_geotiff(gpu=True, ..., overview_resampling='mode') dispatcher thread-through, GPU-vs-CPU pixel parity on 8x8 input, write_geotiff_gpu(compression_level=) in-range matrix on zstd/deflate, out-of-range matrix (zstd=999/-5, deflate=50/0) accepted without raising + round-trip preserved, to_geotiff(gpu=True, compression_level=999) dispatcher thread-through, companion CPU rejects-OOR pin to lock the asymmetry. Mutation against the mode branch (drop the 'if method == mode' block in _block_reduce_2d_gpu) flipped 9 mode tests red. Filed issue #1740. Pass 11 (2026-05-12): added test_gpu_writer_cpu_fallback_codecs_2026_05_12.py closing a Cat 4 HIGH parameter-coverage gap on write_geotiff_gpu compression= modes for the CPU-fallback codecs (lzw, packbits, lz4, lerc, jpeg2000/j2k). Pass 7 (test_gpu_writer_compression_modes_2026_05_11) covered only none/deflate/zstd/jpeg; the remaining five codecs route through dedicated branches in gpu_compress_tiles (_gpu_decode.py:2974-3019) with CPU fallbacks (lerc_compress, jpeg2000_compress, cpu_compress) that had zero direct tests via write_geotiff_gpu. A regression in routing/tag-wiring/fallback dispatch would ship silently because the internal reader uses the same compression-tag table. 17 tests, all passing on GPU host: lzw/packbits/lz4 round-trip + compression-tag pin on uint16, lerc lossless float32 + uint16 round-trip + tag pin, jpeg2000 uint8 single-band + RGB multi-band lossless round-trip + j2k-alias parity + tag pin, GPU-vs-CPU writer pixel parity for lzw/packbits, to_geotiff(gpu=True, compression=lzw/packbits) dispatcher thread-through. Mutation against compression dispatch (swap lzw bytes to zstd; swap lerc bytes to deflate) flipped round-trip tests red. Filed issue #1706. Pass 10 (2026-05-12): added test_kwarg_behaviour_2026_05_12_v2.py closing two Cat 4 HIGH parameter-coverage gaps. (1) write_geotiff_gpu(predictor=True/2/3) had zero direct tests; the GPU writer threads predictor= through normalize_predictor and gpu_compress_tiles into five CUDA encode kernels (_predictor_encode_kernel_u8/u16/u32/u64 for predictor=2, _fp_predictor_encode_kernel for predictor=3) and a regression dropping the encode-kernel calls would ship corrupt files. (2) read_vrt(window=) had no behaviour tests (only a signature pin in test_signature_annotations_1654); the kwarg is documented and _vrt.read_vrt implements full windowed-read semantics (clip, multi-source overlap, src/dst scaling, GeoTransform origin shift on coords + attrs['transform']). 23 tests, all passing on GPU host: predictor=True/2 round-trips on u8/u16/i32 + 3-band RGB samples_per_pixel stride; predictor=3 lossless round-trip on f32 and f64; predictor=3 int-dtype ValueError (CPU/GPU parity); CPU/GPU pixel-exact parity for pred=2 u16 and pred=3 f32; read_vrt(window=) subregion + full + clamp-overflow + clamp-negative + 2x1 mosaic seam straddle + offset past seam + transform-attr origin shift + y/x coords half-pixel shift + window+band + window+chunks (dask) + window+gpu (cupy) + window+gpu+chunks (dask+cupy). Mutation against the encode dispatch flipped 7 predictor tests red. Filed issue #1690. Pass 9 (2026-05-12): added test_kwarg_behaviour_2026_05_12.py closing three Cat 4 MEDIUM parameter-coverage gaps plus one Cat 4 LOW error path. write_vrt documented kwargs (relative/crs_wkt/nodata) had a smoke-test pinning that the kwargs are accepted but no test verified the override *effect* -- a regression dropping the override branch and silently using the default-from-first-source would ship undetected. read_geotiff_gpu(dtype=) cast had zero direct tests; the eager path has TestDtypeEager and dask has TestDtypeDask but the GPU branch had no equivalent. write_geotiff_gpu(bigtiff=) threads through to _assemble_tiff(force_bigtiff=) but no test asserted the on-disk header byte switches; the CPU writer had it via test_features::test_force_bigtiff_via_public_api. write_vrt(source_files=[]) ValueError was uncovered. 26 tests, all passing on GPU host: write_vrt relative=True/False XML attribute + path inspection + parse-back round-trip, write_vrt crs_wkt= override distinct-from-default XML check, write_vrt nodata= override + default-from-source coverage, write_vrt([]) ValueError + no-file side effect, read_geotiff_gpu dtype= matrix (float64->float32, float64->float16, uint16->int32, uint16->uint8, float-to-int raise, dtype=None preserves native), open_geotiff(gpu=True, dtype=) dispatcher, read_geotiff_gpu(chunks=, dtype=) dask+GPU branch, write_geotiff_gpu bigtiff=True/False/None header verification, to_geotiff(gpu=True, bigtiff=True) dispatcher thread-through. Pass 8 (2026-05-11): added test_lz4_compression_level_2026_05_11.py closing Cat 4 MEDIUM parameter-coverage gap on compression='lz4' + compression_level=. _LEVEL_RANGES advertises lz4: (0, 16) but only deflate (1, 9) and zstd (1, 22) had direct level boundary + round-trip + reject tests. The range check is the gatekeeper -- lz4_compress silently accepts any int level -- so a regression dropping 'lz4' from _LEVEL_RANGES would ship undetected. 18 tests, all passing: round-trip at levels 0/1/9/16 (lossless), default-level no-arg path, higher-level-not-larger smoke check on compressible input, out-of-range reject at -1/-10/17/100 on eager path, valid-range message format pin (lz4 valid: 0-16), dask streaming round-trip at 0/1/8/16, dask streaming out-of-range reject at -1/17/50 (separate _LEVEL_RANGES call site). Pass 7 (2026-05-11): added test_gpu_writer_compression_modes_2026_05_11.py closing Cat 4 HIGH gap on write_geotiff_gpu compression= modes. The writer documents zstd (default, fastest GPU), deflate, jpeg, and none, but only deflate + none had round-trip tests; the default zstd and the jpeg (nvJPEG/Pillow) paths shipped without targeted coverage. 11 new tests, all passing on GPU host: zstd round-trip + default-codec pinning, jpeg round-trip on 3-band RGB uint8 + 1-band greyscale, TIFF compression-tag header check across none/deflate/zstd/jpeg, plain deflate + none round-trips outside the COG/sentinel paths, and a cross-codec lossless parity check (zstd/deflate/none agree pixel-exact). nvJPEG path was exercised live, not just the Pillow fallback. Pass 6 (2026-05-11): added test_overview_resampling_min_max_median_2026_05_11.py covering Cat 4 HIGH parameter-coverage gap on overview_resampling=min/max/median. CPU end-to-end paths were already covered by test_cog_overview_nodata_1613::test_cpu_cog_overview_aggregations_ignore_sentinel; the GPU end-to-end paths and the direct CPU+GPU block-reducer branches had no targeted tests, so a regression on those code paths would ship undetected. 26 tests, all passing on GPU host: block-reducer unit tests (finite + partial-NaN), end-to-end COG writes for both to_geotiff and write_geotiff_gpu, CPU/GPU parity for to_geotiff(gpu=True), CPU nodata-sentinel regression check, and ValueError error-path tests for unknown method names on both backends. Pass 5 (2026-05-11): added test_degenerate_shapes_backends_2026_05_11.py covering Cat 3 HIGH geometric gaps (1x1 / 1xN / Nx1 reads on dask+numpy, GPU, dask+cupy backends; 1x1 / 1xN / Nx1 writes through write_geotiff_gpu) and Cat 2 MEDIUM NaN/Inf gaps (all-NaN read on GPU + dask+cupy, Inf / -Inf reads on all non-eager backends, NaN sentinel mask on dask read path including sentinel block split across chunk boundary). 23 tests, all passing on GPU host. Prior passes still hold: pass 4 (r4) closed read_geotiff_gpu/dask name= + max_pixels= kwargs (Cat 4), pass 3 (r3) closed read_vrt GPU/dask+GPU backend dispatch (Cat 1) and dtype/name kwargs (Cat 4)." reproject,2026-05-10,,HIGH,1;4;5,"Added 39 tests: LiteCRS direct coverage, itrf_transform behaviour/roundtrip/array, itrf_frames, geoid_height numerical correctness + raster happy-path, vertical helpers (ellipsoidal<->orthometric/depth), reproject() lat/lon and latitude/longitude dim propagation. Note: _merge_arrays_cupy is imported but unused (no cupy merge dispatch in merge()); flagged as feature gap not test gap." diff --git a/xrspatial/geotiff/tests/test_max_cloud_bytes_dispatcher_silent_drop_2026_05_15.py b/xrspatial/geotiff/tests/test_max_cloud_bytes_dispatcher_silent_drop_2026_05_15.py new file mode 100644 index 00000000..2a613bf7 --- /dev/null +++ b/xrspatial/geotiff/tests/test_max_cloud_bytes_dispatcher_silent_drop_2026_05_15.py @@ -0,0 +1,267 @@ +"""Dispatcher parameter coverage for ``open_geotiff(max_cloud_bytes=...)``. + +``max_cloud_bytes`` was added in #1928 as an eager-path cloud-budget +guard, and re-ordered into the canonical reader signature in #1957 +(commit 750dc20). The kwarg is only meaningful for the eager fsspec +read path inside ``_read_to_array``: the HTTP/COG path is range-based +and the local + file-like paths skip the budget check. + +The dispatcher in ``open_geotiff`` (``xrspatial/geotiff/__init__.py``) +forwards the value to ``_read_to_array`` only on the eager non-VRT +branch. The GPU branch (``read_geotiff_gpu``), the dask branch +(``read_geotiff_dask``), and the VRT branch (``read_vrt``) all ignore +the kwarg silently -- the value is accepted at the signature but never +consumed downstream. + +That is the same class of dispatcher silently-drops-backend-kwarg bug +that issues #1561 (``overview_level`` to dask), #1605 (``band`` to GPU), +#1685 (``overview_level`` to VRT), and #1810 (``missing_sources`` to +non-VRT) fixed for the other backend-only kwargs. Pass 14 + 15 of the +test-coverage sweep closed several adjacent parameter gaps but did not +pin this one. + +The two sibling kwargs ``on_gpu_failure`` and ``missing_sources`` +already raise ``ValueError`` when used on a path where they do not +apply (the dispatcher gates them on sentinel defaults at +``__init__.py:339`` and ``:355``). ``max_cloud_bytes`` defaults to the +``_MAX_CLOUD_BYTES_SENTINEL`` and would slot into the same pattern, +but the rejection guard is missing. + +This module pins the current silent-drop behaviour. The fix surface is +expected to be either: + +* Add ValueError guards mirroring ``on_gpu_failure`` / + ``missing_sources`` (refuses the kwarg on gpu / chunks / VRT paths). +* Or thread ``max_cloud_bytes`` through every backend so it has effect + everywhere (broader change because the GPU + dask paths would need + to plumb the budget into their respective fsspec entry points). + +Either fix would flip the four ``xfail(strict=True)`` tests below from +xpass to pass after the source change. The fifth class (positive +``test_eager_*`` pins) stays green either way so the canonical eager +path keeps its current contract. + +Filed as issue #1974 (test-coverage sweep is test-only; the fix lives +in a separate PR). See test-coverage-state.csv pass 16. +""" +from __future__ import annotations + +import io +import os + +import numpy as np +import pytest +import xarray as xr + +from xrspatial.geotiff import ( + open_geotiff, + to_geotiff, + write_vrt, +) + + +def _skip_if_no_cupy_cuda(): + """Skip the calling test if cupy is unavailable or CUDA is offline.""" + import importlib.util + if importlib.util.find_spec("cupy") is None: + pytest.skip("cupy not available") + try: + import cupy + if not cupy.cuda.is_available(): + pytest.skip("CUDA unavailable on host") + except Exception: + pytest.skip("cupy import failed") + + +def _build_local_tif(tmp_path, name='src.tif'): + """Write a small valid GeoTIFF used as the dispatcher's source.""" + arr = np.arange(8 * 8, dtype=np.float32).reshape(8, 8) + da = xr.DataArray( + arr, + dims=['y', 'x'], + coords={'y': np.arange(8.0, 0, -1), 'x': np.arange(8.0)}, + attrs={ + 'crs': 4326, + 'transform': (1.0, 0, 0.0, 0, -1.0, 8.0), + }, + ) + path = str(tmp_path / name) + to_geotiff(da, path) + return path + + +def _build_vrt(tmp_path): + """Build a 1-source VRT mosaic referencing a small local GeoTIFF.""" + src = _build_local_tif(tmp_path, name='vrt_src.tif') + vrt = str(tmp_path / 'mosaic.vrt') + write_vrt(vrt, [src]) + return vrt, src + + +# --------------------------------------------------------------------- +# Positive pins: the kwarg is forwarded through the eager path. +# These stay green whether the dispatcher fix raises ValueError or +# threads the budget into every backend. +# --------------------------------------------------------------------- + +class TestEagerLocalPathAcceptsMaxCloudBytes: + """Local-file eager reads accept ``max_cloud_bytes`` as a no-op. + + The docstring on ``open_geotiff`` states the budget "Has no effect + on local file or file-like sources." A tight budget on a local + file still reads successfully. + """ + + def test_local_file_max_cloud_bytes_small_is_noop(self, tmp_path): + path = _build_local_tif(tmp_path) + # 8 bytes is far below the file size; local files skip the budget. + out = open_geotiff(path, max_cloud_bytes=8) + assert out.shape == (8, 8) + assert out.dtype == np.float32 + + def test_local_file_max_cloud_bytes_none_is_noop(self, tmp_path): + path = _build_local_tif(tmp_path) + out = open_geotiff(path, max_cloud_bytes=None) + assert out.shape == (8, 8) + + def test_local_file_max_cloud_bytes_large_is_noop(self, tmp_path): + path = _build_local_tif(tmp_path) + out = open_geotiff(path, max_cloud_bytes=10 ** 9) + assert out.shape == (8, 8) + + +class TestEagerFileLikeAcceptsMaxCloudBytes: + """File-like sources accept ``max_cloud_bytes`` (documented no-op).""" + + def test_bytesio_max_cloud_bytes_small_is_noop(self, tmp_path): + path = _build_local_tif(tmp_path) + with open(path, 'rb') as f: + buf = io.BytesIO(f.read()) + out = open_geotiff(buf, max_cloud_bytes=8) + assert out.shape == (8, 8) + + +# --------------------------------------------------------------------- +# Silent-drop pins. These mark the current buggy behaviour with +# ``xfail(strict=True)``: when the dispatcher fix lands (whichever +# direction), these flip from xpass to pass. ``strict=True`` makes +# the xpass a test failure so the diff is visible at fix time. +# --------------------------------------------------------------------- + +@pytest.mark.xfail( + strict=True, + reason=( + "open_geotiff silently drops max_cloud_bytes when gpu=True. " + "Should raise ValueError mirroring on_gpu_failure (#1810 pattern), " + "or thread the budget into the GPU fsspec entry point. " + "See test-coverage sweep pass 16." + ), +) +def test_dispatcher_gpu_path_rejects_max_cloud_bytes(tmp_path): + """``gpu=True`` with ``max_cloud_bytes=...`` should not silently drop. + + The kwarg is only consumed on the eager non-VRT path; the GPU + branch at ``__init__.py:410`` never references it. Caller has no + way to learn the budget is being ignored. + """ + path = _build_local_tif(tmp_path) + with pytest.raises(ValueError, match=r"max_cloud_bytes"): + open_geotiff(path, max_cloud_bytes=8, gpu=True) + + +@pytest.mark.xfail( + strict=True, + reason=( + "open_geotiff silently drops max_cloud_bytes when chunks=N. " + "Should raise ValueError mirroring on_gpu_failure (#1810 pattern), " + "or thread the budget into read_geotiff_dask. " + "See test-coverage sweep pass 16." + ), +) +def test_dispatcher_dask_path_rejects_max_cloud_bytes(tmp_path): + """``chunks=N`` with ``max_cloud_bytes=...`` should not silently drop. + + The kwarg is only consumed on the eager non-VRT path; the dask + branch at ``__init__.py:422`` never references it. + """ + path = _build_local_tif(tmp_path) + with pytest.raises(ValueError, match=r"max_cloud_bytes"): + open_geotiff(path, max_cloud_bytes=8, chunks=4) + + +@pytest.mark.xfail( + strict=True, + reason=( + "open_geotiff silently drops max_cloud_bytes for .vrt sources. " + "Should raise ValueError mirroring missing_sources (#1810 " + "pattern), or thread the budget into read_vrt. " + "See test-coverage sweep pass 16." + ), +) +def test_dispatcher_vrt_path_rejects_max_cloud_bytes(tmp_path): + """``.vrt`` source with ``max_cloud_bytes=...`` should not silently drop. + + The kwarg is only consumed on the eager non-VRT path; the VRT + branch at ``__init__.py:362`` never references it. + """ + vrt, _src = _build_vrt(tmp_path) + with pytest.raises(ValueError, match=r"max_cloud_bytes"): + open_geotiff(vrt, max_cloud_bytes=8) + + +@pytest.mark.xfail( + strict=True, + reason=( + "open_geotiff silently drops max_cloud_bytes when " + "gpu=True + chunks=N (dask+cupy dispatch). " + "Should raise ValueError mirroring on_gpu_failure (#1810 pattern). " + "See test-coverage sweep pass 16." + ), +) +def test_dispatcher_dask_gpu_path_rejects_max_cloud_bytes(tmp_path): + """``gpu=True + chunks=N`` should not silently drop max_cloud_bytes.""" + _skip_if_no_cupy_cuda() + path = _build_local_tif(tmp_path) + with pytest.raises(ValueError, match=r"max_cloud_bytes"): + open_geotiff(path, max_cloud_bytes=8, gpu=True, chunks=4) + + +# --------------------------------------------------------------------- +# Pinning the buggy CURRENT behaviour so the fix diff is observable. +# These tests pass today (the kwarg is silently dropped). When the +# dispatcher fix lands they will fail and must be replaced by the +# xfail tests above flipping green. They live alongside the xfails so +# the fix author sees both the "before" and "after" expectations. +# --------------------------------------------------------------------- + +# remove with #1974 +class TestCurrentSilentDropPins: + """Pin the current silent-drop behaviour. + + These tests assert that the kwarg is silently accepted today on the + non-eager paths. They are the "before" half of the fix-visibility + contract documented at module top. After the dispatcher fix the + xfail siblings above flip to pass; remove this class at that time. + """ + + def test_gpu_path_silently_accepts_today(self, tmp_path): + _skip_if_no_cupy_cuda() + path = _build_local_tif(tmp_path) + # No raise today; the kwarg is silently dropped. + out = open_geotiff(path, max_cloud_bytes=8, gpu=True) + assert out.shape == (8, 8) + + def test_dask_path_silently_accepts_today(self, tmp_path): + path = _build_local_tif(tmp_path) + # No raise today; the kwarg is silently dropped. + out = open_geotiff(path, max_cloud_bytes=8, chunks=4) + assert out.shape == (8, 8) + # Lazy result; confirm it computes. + arr = out.values + assert arr.shape == (8, 8) + + def test_vrt_path_silently_accepts_today(self, tmp_path): + vrt, _src = _build_vrt(tmp_path) + # No raise today; the kwarg is silently dropped. + out = open_geotiff(vrt, max_cloud_bytes=8) + assert out.shape == (8, 8)