feat(tests): add HETA 1.2.0 parquet size checks and GeoJSON parity validation #640
Pull request overview
Adds end-to-end test updates for HETA 1.2.0 outputs by expanding expected result artifacts to include the new parquet polygon exports and validating parquet↔GeoJSON feature parity.
Changes:
- Extend `SPOT_0_EXPECTED_RESULT_FILES` / `SPOT_1_EXPECTED_RESULT_FILES` to include `tissue_qc`, `tissue_segmentation`, and `cell_classification` parquet outputs (now 12 expected files).
- Update GUI/CLI e2e tests to assert 12 downloaded result files instead of 9.
- Add parquet↔GeoJSON parity assertions by comparing parquet row counts to GeoJSON `features` counts for the three paired outputs.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/constants_test.py | Updates expected output file lists and byte-size tolerances to include the three new parquet outputs for both production and staging. |
| tests/aignostics/application/gui_test.py | Adjusts expected result file count to 12 and adds parquet↔GeoJSON parity validation after download. |
| tests/aignostics/application/cli_test.py | Adjusts expected result file count to 12 and adds parquet↔GeoJSON parity validation after execution/download. |

assert len(files_in_results_dir) == 12, (
    f"Expected 12 files in {results_dir}, but found {len(files_in_results_dir)}: "
bd5f44a to 1a3e050
47de64d to 4bf84bb
c65c3fc to afc60d1
afc60d1 to 8896207
def _build_minimal_wsi_input_item(gs_url: str, crc32c: str, expires_seconds: int) -> platform.InputItem:
    """Build a minimal WSI InputItem supplying only the CRC32C and image URL."""
    return platform.InputItem(
        external_id=gs_url,
        input_artifacts=[
            platform.InputArtifact(
                name="whole_slide_image",
                download_url=platform.generate_signed_url(url=gs_url, expires_seconds=expires_seconds),
                metadata={
                    "checksum_base64_crc32c": crc32c,
                    "media_type": "image/tiff",
                },
            )
        ],
    )

Codecov Report: ✅ All modified and coverable lines are covered by tests.
import pyarrow.parquet as pq

for parquet_filename, geojson_filename in parquet_geojson_pairs:
    parquet_path = results_dir / parquet_filename
    geojson_path = results_dir / geojson_filename
    parquet_row_count = pq.read_metadata(parquet_path).num_rows

@pytest.mark.stress_only
@pytest.mark.long_running
@pytest.mark.timeout(timeout=TEST_APP_STRESS_SUBMIT_AND_FIND_SUBMIT_TIMEOUT_SECONDS)
def test_platform_test_app_stress_submit() -> None:
When do these run? They could get very expensive if they run to completion. Should we consider cancelling them once the submission has been acknowledged?
]
import pyarrow.parquet as pq

for parquet_filename, geojson_filename in parquet_geojson_pairs:
Maybe too complicated, but a rough area check could be nice for the segmentation ones.
- test-app: 0.0.6 → 1.0.0 (new version uses same he-tme input schema)
- he-tme: 1.1.0 → 1.1.1 on staging
- Remove SPECIAL_APPLICATION_ID/VERSION from staging (no longer needed)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

…alization artifact

- Re-add SPECIAL_APPLICATION_ID/VERSION to staging pointing to test-app 1.0.0 so e2e_test.py imports resolve on staging
- Remove normalization:wsi input artifact from _get_spots_payload_for_special; test-app 1.0.0 only requires whole_slide_image, matching the he-tme schema

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

- Remove SPECIAL_APPLICATION_ID/VERSION from staging constants entirely
- Guard the import in e2e_test.py with try/except so staging doesn't NameError
- Add skipif(SPECIAL_APPLICATION_ID is None) to both special-app tests so they are silently skipped on staging but still run on production (0.99.0)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

Simpler than a try/except guard: staging defines SPECIAL_APPLICATION_ID and SPECIAL_APPLICATION_VERSION as None, the regular import works, and the existing skipif(SPECIAL_APPLICATION_ID is None) handles the rest.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
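The "constant is None, so the test is skipped" pattern from this commit can be illustrated without the SDK. The sketch below is an assumption-laden stand-in: it uses the standard library's `unittest.skipIf` instead of `pytest.mark.skipif`, and the constants, class, and `run_suite` helper are hypothetical names defined inline rather than imported from the real constants module.

```python
import unittest

# Hypothetical stand-ins for the per-environment constants; on staging both are None.
SPECIAL_APPLICATION_ID = None
SPECIAL_APPLICATION_VERSION = None


class SpecialApplicationTests(unittest.TestCase):
    """Illustrative test case; the real tests use pytest and live in e2e_test.py."""

    @unittest.skipIf(SPECIAL_APPLICATION_ID is None, "no special application in this environment")
    def test_special_application_submit(self) -> None:
        # Only reached in environments where SPECIAL_APPLICATION_ID is set.
        self.assertIsNotNone(SPECIAL_APPLICATION_VERSION)


def run_suite() -> unittest.TestResult:
    """Run the test case and return the collected result for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SpecialApplicationTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because the constant always exists, the import never fails, and the skip condition is evaluated once when the test is defined.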
…e-tme 1.2.0

- Replace SPOT_1 with breast cancer slide 1603ba4c (BREAST/BREAST_CANCER, 6649×6578 at 0.25 MPP); preserve old 9375e3ed data as SPOT_4
- Add VIPS 10x resolution ambiguity note for SPOT_2, SPOT_3, SPOT_4
- Bump HETA_APPLICATION_VERSION to 1.2.0, TEST_APPLICATION_VERSION to 1.0.0
- Remove SPECIAL_APPLICATION concept; restore stress tests against test-app 1.0.0
- Unify payload builders via _build_wsi_input_item / _build_minimal_wsi_input_item
- Update SPOT_1_EXPECTED_RESULT_FILES sizes from staging run 43a3bcd2
- Reduce PIPELINE_NODE_ACQUISITION_TIMEOUT_MINUTES to 25

- Use pyarrow.parquet.read_metadata() instead of pd.read_parquet() to get row count from Parquet footer without loading polygon data
- Use ijson streaming to count GeoJSON features without loading the full feature array into memory
- Replace hard-coded file counts with len(SPOT_x_EXPECTED_RESULT_FILES) to avoid drift when the constants change
- Sync qupath/gui_test.py to use len(SPOT_0_EXPECTED_RESULT_FILES) instead of the stale literal 9
- Remove unused _build_minimal_wsi_input_item dead code from e2e_test.py
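To illustrate why deriving the expected count from the constant avoids drift, here is a minimal sketch. The tuple layout, the byte sizes, and the `check_result_files` helper are assumptions made up for this example; the actual structure of `SPOT_0_EXPECTED_RESULT_FILES` is not shown in this PR excerpt.

```python
# Hypothetical, abbreviated stand-in for SPOT_0_EXPECTED_RESULT_FILES.
# Each entry: (filename, expected size in bytes, tolerance in percent); the numbers are made up.
EXPECTED_RESULT_FILES: list[tuple[str, int, int]] = [
    ("tissue_qc_geojson_polygons.json", 180_000, 10),
    ("tissue_qc_parquet_polygons.parquet", 60_000, 10),
    ("cell_classification_geojson_polygons.json", 9_000_000, 10),
    ("cell_classification_parquet_polygons.parquet", 2_500_000, 10),
]


def check_result_files(found: list[str], expected: list[tuple[str, int, int]]) -> None:
    """Fail if the downloaded files do not match the expected list by count or name."""
    expected_names = {name for name, _size, _tolerance in expected}
    # The count comes from the constant, so adding an entry updates every test at once.
    assert len(found) == len(expected), (
        f"Expected {len(expected)} files, but found {len(found)}: {sorted(found)}"
    )
    missing = expected_names - set(found)
    assert not missing, f"Missing result files: {sorted(missing)}"
```

With this shape, adding a new parquet entry to the constant changes the asserted count everywhere, and no literal such as 9 or 12 is left behind in an individual test.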
03e47d1 to 7ad6745
import pyarrow.parquet as pq

for parquet_filename, geojson_filename in parquet_geojson_pairs:
    parquet_path = results_dir / parquet_filename
    geojson_path = results_dir / geojson_filename
    parquet_row_count = pq.read_metadata(parquet_path).num_rows
    with geojson_path.open("rb") as f:
        geojson_feature_count = sum(1 for _ in ijson.items(f, "features.item"))

SPECIAL_APPLICATION_SUBMIT_AND_FIND_DEADLINE_SECONDS_ON_40 = 60 * 60 * 3  # 3 hours
SPECIAL_APPLICATION_SUBMIT_AND_FIND_SUBMIT_TIMEOUT_SECONDS = 60 * 30  # 30 minutes
SPECIAL_APPLICATION_FIND_AND_VALIDATE_TIMEOUT_SECONDS = 60 * 60  # 60 minutes
TEST_APP_STRESS_SLIDE_PER_RUN_COUNT = 100
- tissue_qc and tissue_segmentation: compare total polygon area (WKB via shapely) between parquet and GeoJSON, within 1%
- cell_classification: compare polygon count, within 1%
- Use pandas.read_parquet() instead of pyarrow.parquet directly, making the check engine-agnostic (pyarrow on 3.14, fastparquet on 3.11-3.13)
- Extract shared helper assert_parquet_geojson_parity() to conftest.py to avoid duplication between cli_test and gui_test
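The area comparison described in this commit can be sketched with the standard library alone. This is a simplified stand-in, not the helper from conftest.py: it replaces shapely with a hand-written shoelace formula, handles only GeoJSON `Polygon` geometries, and every function name below is hypothetical.

```python
def ring_area(ring: list[list[float]]) -> float:
    """Unsigned shoelace area of a closed ring [[x, y], ...] (first point equals last)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_area(geometry: dict) -> float:
    """Area of a GeoJSON Polygon: exterior ring minus any interior rings (holes)."""
    if geometry["type"] != "Polygon":
        raise ValueError(f"unsupported geometry type: {geometry['type']}")
    exterior, *holes = geometry["coordinates"]
    return ring_area(exterior) - sum(ring_area(hole) for hole in holes)


def areas_within_tolerance(area: float, reference: float, tolerance: float = 0.01) -> bool:
    """True if `area` deviates from `reference` by at most `tolerance` (a fraction, 0.01 = 1%)."""
    if reference <= 0:
        raise ValueError("reference area must be positive")
    return abs(area - reference) / reference <= tolerance
```

A relative tolerance is used rather than exact equality because the two formats may serialize coordinates with different precision, so small floating-point differences in the summed area are expected.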
SPOT_4_GS_URL = (
    "gs://aignostics-platform-ext-a4f7e9/python-sdk-tests/he-tme/slides/9375e3ed-28d2-4cf3-9fb9-8df9d11a6627.tiff"
)
SPOT_4_FILENAME = "9375e3ed-28d2-4cf3-9fb9-8df9d11a6627.tiff"
SPOT_4_CRC32C = "9l3NNQ=="
SPOT_4_FILESIZE = 14681750
SPOT_4_RESOLUTION_MPP = 0.46499982
SPOT_4_WIDTH = 3728
SPOT_4_HEIGHT = 3640

"""Assert parquet/GeoJSON output parity for the three he-tme polygon artifact pairs.

- tissue_qc and tissue_segmentation: total polygon area within 1%
- cell_classification: polygon count within 1%

)
assert result.exit_code == 0
assert "Zipped 11 files" in normalize_output(result.output)
assert "Zipped 16 files" in normalize_output(result.output)
…on test

With he-tme 1.2.0 (new version), many runs in staging are PENDING/PROCESSING, causing the client-side has_output=True filter to page through all of them before finding 20 completed runs, which exceeds the 60s timeout. Drop has_output and reduce limit to 5: item_count (total items submitted) already serves as a proxy, and the first API page returns enough runs instantly.
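The selection logic this commit relies on can be sketched in isolation. Everything here is hypothetical scaffolding: `RunSummary` stands in for whatever the platform client returns, the value of `RESULTS_PAGE_SIZE` is assumed, and `first_small_run` is not a function from the SDK.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

RESULTS_PAGE_SIZE = 20  # assumed value, for illustration only


@dataclass
class RunSummary:
    """Hypothetical minimal view of an application run."""

    run_id: str
    item_count: int


def first_small_run(runs: Iterable[RunSummary], page_size: int = RESULTS_PAGE_SIZE) -> Optional[RunSummary]:
    """Return the first run whose items fit on a single results page, or None."""
    for run in runs:
        # item_count == 0 is rejected, which is how item_count doubles as the proxy
        # described in the commit message.
        if 0 < run.item_count <= page_size:
            return run
    return None
```

Because the check stops at the first match, a small `limit` on the listing call is enough as long as the first page contains at least one qualifying run.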
# Find a run with fewer items than RESULTS_PAGE_SIZE.
# Omit has_output so the server-side filter is applied without client-side pagination:
# item_count already acts as a proxy (runs with no output show item_count=0 and fail
# the 0 < item_count <= RESULTS_PAGE_SIZE check below).
runs = Service().application_runs(

for parquet_name, geojson_name in [
    ("tissue_qc_parquet_polygons.parquet", "tissue_qc_geojson_polygons.json"),
    ("tissue_segmentation_parquet_polygons.parquet", "tissue_segmentation_geojson_polygons.json"),
]:
    parquet_area = float(
        shapely.area(
            shapely.from_wkb(
                pd.read_parquet(results_dir / parquet_name, columns=["geometry"])["geometry"].to_numpy()
            )
        ).sum()
    )
    geojson_area = 0.0
    with (results_dir / geojson_name).open("rb") as f:
        for feature in ijson.items(f, "features.item"):
            geojson_area += float(shapely.area(shapely.geometry.shape(feature["geometry"])))
    assert geojson_area > 0, f"No area computed from {geojson_name}"
    diff_pct = abs(parquet_area - geojson_area) / geojson_area
    assert diff_pct <= tolerance, (
        f"Total polygon area differs by >{tolerance * 100:.0f}% between "
        f"{parquet_name} ({parquet_area:.2f}) and {geojson_name} ({geojson_area:.2f})"
    )

parquet_count = len(pd.read_parquet(results_dir / "cell_classification_parquet_polygons.parquet", columns=[]))
with (results_dir / "cell_classification_geojson_polygons.json").open("rb") as f:
    geojson_count = sum(1 for _ in ijson.items(f, "features.item"))
delta = abs(parquet_count - geojson_count)
assert delta <= max(1, round(parquet_count * tolerance)), (
    f"Polygon count differs by >{tolerance * 100:.0f}% between "
    f"cell_classification_parquet_polygons.parquet ({parquet_count}) "
    f"and cell_classification_geojson_polygons.json ({geojson_count})"
)
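The count tolerance in the assertion above, `max(1, round(parquet_count * tolerance))`, is easier to reason about when pulled out on its own. The two function names below are hypothetical, introduced only to show the arithmetic.

```python
def allowed_count_delta(parquet_count: int, tolerance: float = 0.01) -> int:
    """Largest accepted difference in polygon count, mirroring max(1, round(n * tolerance))."""
    return max(1, round(parquet_count * tolerance))


def counts_match(parquet_count: int, geojson_count: int, tolerance: float = 0.01) -> bool:
    """True if the two counts differ by no more than the allowed delta."""
    return abs(parquet_count - geojson_count) <= allowed_count_delta(parquet_count, tolerance)
```

Note the floor of 1: for outputs with fewer than roughly 150 polygons at a 1% tolerance, the check still accepts a difference of one polygon, so it is slightly looser than 1% on small slides.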
Summary
Adds validation for the 3 new parquet outputs introduced in HETA 1.2.0 (`tissue_qc`, `tissue_segmentation`, `cell_classification`). `cell_detection` parquet outputs are intentionally excluded as they are being removed from the pipeline.
- Extends `SPOT_0_EXPECTED_RESULT_FILES` and `SPOT_1_EXPECTED_RESULT_FILES` to include the 3 new parquet entries (12 files total)
- Updates `cli_test.py` and `gui_test.py` to assert 12 result files instead of 9
- Adds a parity check: `len(pd.read_parquet(...))` must equal `len(geojson["features"])` for each paired output

Test plan