Fix #2790: Add geometry deduplication to contours() #3001
[Human] still in draft
PR Review: Fix #2790 - Add geometry deduplication to contours()
Blockers (must fix before merge)
None identified.
Suggestions (should fix, not blocking)
Nits (optional improvements)
What looks good
Checklist
CI Status
Unable to check via
When raster cells exactly equal the contour level, overlapping chunk boundaries (dask) or saddle-case disambiguation (numpy) can emit duplicate polylines. Add _deduplicate_lines() to remove identical geometries before returning results, ensuring all backends produce unique contour lines. Added tests for numpy, dask, and geopandas backends to verify no duplicate geometries are returned.
When raster cells exactly equal the contour level, overlapping chunk boundaries (dask) or saddle-case disambiguation (numpy) can emit duplicate polylines. Add _deduplicate_lines() to remove identical geometries before returning results, ensuring all backends produce unique contour lines.

Changes:
- Add _deduplicate_lines() function with canonical segment comparison (direction-independent, rounded to 10 decimals)
- Move deduplication before coordinate transform (array-index space)
- Move defaultdict to top-level import in contour.py and test_contour.py
- Add plateau regression tests for numpy, dask, and geopandas backends
- Add 3 unit tests for _deduplicate_lines (exact dupes, different levels, reverse dupes)

Review notes:
- No blockers identified
- All 52 tests pass (23 skipped for optional deps: cupy, geopandas, skimage)
- Deduplication runs on array-index coordinates, consistent with DECIMALS=10 usage in _stitch_segments and _remove_duplicate_segments
- Dask backend already calls _deduplicate_by_level() internally; top-level _deduplicate_lines() is a safety net for all backends
749a2e1 to ec4b994
Closes #2790
Summary
contours() can return duplicate contour lines when raster cells exactly equal the contour level. This happens because overlapping dask chunks (da.overlap.overlap with a 1-cell halo) process the same 2x2 quads from adjacent chunks, producing duplicate segments that can stitch into identical polylines.

The existing code already guards against zero-length segments (_emit_seg, lines 217-218) and single-point polylines (_stitch_segments, line 296), but lacked a final geometry-level deduplication step.

Solution
Added a _deduplicate_lines(results) function that removes identical geometries before results are returned.

The function is called in contours() after the coordinate transform and before the return-type dispatch, ensuring all backends (numpy, cupy, dask+numpy, dask+cupy) and both output formats (list, GeoDataFrame) benefit from deduplication.
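
For reviewers who want the idea without opening the diff, here is a rough sketch of direction-independent deduplication with rounding to 10 decimals, as described in the commit message. It is not the code in this PR: the names deduplicate_lines_sketch and canonical_key are hypothetical, and it assumes results is a list of (level, coords) pairs where coords is a sequence of (y, x) points.

```python
DECIMALS = 10  # rounding precision named in the PR; the rest of this sketch is assumed


def canonical_key(level, coords, decimals=DECIMALS):
    """Return a hashable key that is equal for a polyline and its reverse."""
    # Round every vertex so tiny floating-point differences do not defeat the comparison.
    forward = tuple(
        (round(float(y), decimals), round(float(x), decimals)) for y, x in coords
    )
    backward = forward[::-1]
    # Pick the lexicographically smaller orientation as the canonical one.
    return (float(level), min(forward, backward))


def deduplicate_lines_sketch(results, decimals=DECIMALS):
    """Keep the first occurrence of each (level, geometry) pair, preserving order."""
    seen = set()
    unique = []
    for level, coords in results:
        key = canonical_key(level, coords, decimals)
        if key not in seen:
            seen.add(key)
            unique.append((level, coords))
    return unique


if __name__ == "__main__":
    line = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
    results = [
        (5.0, line),        # original
        (5.0, list(line)),  # exact duplicate
        (5.0, line[::-1]),  # same geometry, reversed direction
        (6.0, line),        # same geometry, different level: kept
    ]
    print(len(deduplicate_lines_sketch(results)))
```

The level is part of the key, so identical geometry at two different levels is kept, which matches the "different levels" unit test listed in the commit message. One limitation of this sketch: a closed ring that starts at a different vertex is treated as a different line. Whether the real function handles that case is not stated here.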