28 commits
af3cf2e
changes many files
heylf Apr 22, 2026
ac9c237
bugfix
heylf Apr 23, 2026
cb6235d
bugfix for data_process in config.vsh.yaml
heylf Apr 23, 2026
4e9b8f2
bugfix for data_process in config.vsh.yaml and script.py
heylf Apr 23, 2026
ae452af
bugfix for data_process in script.py
heylf Apr 23, 2026
30d308b
bugfix for data_process in config.vsh.yml
heylf Apr 23, 2026
bc0de4e
bugfix for data_process in script.py
heylf Apr 23, 2026
4e158a5
bugfix for data_process in script.py
heylf Apr 23, 2026
cb7d5b4
bugfix for data_process in script.py
heylf Apr 23, 2026
cd1f3d2
changing docker containter in config.vsh.yaml
heylf Apr 23, 2026
59ed4c1
changing docker containter in config.vsh.yaml
heylf Apr 23, 2026
c5dfac0
comment out output_spatial_dataset
heylf Apr 24, 2026
97c42d1
Revert "comment out output_spatial_dataset"
rcannood Apr 27, 2026
00589e7
update project config
rcannood Apr 27, 2026
9022aa3
update readme
rcannood Apr 27, 2026
13368a7
update helper scripts
rcannood Apr 27, 2026
4f275a8
update data processor
rcannood Apr 27, 2026
6e6a277
change to file_scranseq_reference.yaml
heylf Apr 27, 2026
292fe3f
change to script.py
heylf Apr 27, 2026
3a3f95e
change dataprocessor to config.vsh.yaml and script.py
heylf Apr 28, 2026
6d0d9dd
fix script
rcannood Apr 28, 2026
f9d4062
simplify yaml
rcannood Apr 28, 2026
b92f687
update partial
rcannood Apr 28, 2026
bed471e
override dataset metadata in data processor (since we're combining tw…
rcannood Apr 28, 2026
5e40206
update scripts
rcannood Apr 28, 2026
3ade168
update api
rcannood Apr 28, 2026
d4bb21b
wip fix script
rcannood Apr 28, 2026
8916afa
fix par
rcannood Apr 28, 2026
284 changes: 54 additions & 230 deletions README.md
@@ -10,7 +10,7 @@ A one sentence summary of purpose and methodology. Used for creating an
overview tables.

Repository:
[openproblems-bio/task_template](https://github.com/openproblems-bio/task_template)
[openproblems-bio/task_spatial_segmentation](https://github.com/openproblems-bio/task_spatial_segmentation)

## Description

@@ -28,34 +28,34 @@ should convince readers of the significance and relevance of your task.

## Authors & contributors

| Name | Roles | Linkedin | Twitter | Email | Github | Orcid |
|:---|:---|:---|:---|:---|:---|:---|
| John Doe | author, maintainer | johndoe | johndoe | john@doe.me | johndoe | 0000-0000-0000-0000 |
| name | roles |
|:---------|:-------------------|
| John Doe | author, maintainer |

## API

``` mermaid
flowchart TB
file_common_ist("<a href='https://github.com/openproblems-bio/task_template#file-format-common-ist-dataset'>Common iST Dataset</a>")
comp_data_processor[/"<a href='https://github.com/openproblems-bio/task_template#component-type-data-processor'>Data processor</a>"/]
file_spatial_dataset("<a href='https://github.com/openproblems-bio/task_template#file-format-raw-ist-dataset'>Raw iST Dataset</a>")
file_scrnaseq_reference("<a href='https://github.com/openproblems-bio/task_template#file-format-scrna-seq-reference'>scRNA-seq Reference</a>")
comp_control_method[/"<a href='https://github.com/openproblems-bio/task_template#component-type-control-method'>Control Method</a>"/]
comp_method[/"<a href='https://github.com/openproblems-bio/task_template#component-type-method'>Method</a>"/]
comp_metric[/"<a href='https://github.com/openproblems-bio/task_template#component-type-metric'>Metric</a>"/]
file_prediction("<a href='https://github.com/openproblems-bio/task_template#file-format-predicted-data'>Predicted data</a>")
file_score("<a href='https://github.com/openproblems-bio/task_template#file-format-score'>Score</a>")
file_common_scrnaseq("<a href='https://github.com/openproblems-bio/task_template#file-format-common-sc-dataset'>Common SC Dataset</a>")
file_common_ist("<a href='https://github.com/openproblems-bio/task_spatial_segmentation#file-format-common-ist-dataset'>Common iST Dataset</a>")
comp_data_processor[/"<a href='https://github.com/openproblems-bio/task_spatial_segmentation#component-type-data-processor'>Data processor</a>"/]
file_scrnaseq_reference("<a href='https://github.com/openproblems-bio/task_spatial_segmentation#file-format-scrna-seq-reference'>scRNA-seq Reference</a>")
file_spatial_dataset("<a href='https://github.com/openproblems-bio/task_spatial_segmentation#file-format-raw-ist-dataset'>Raw iST Dataset</a>")
comp_control_method[/"<a href='https://github.com/openproblems-bio/task_spatial_segmentation#component-type-control-method'>Control Method</a>"/]
comp_metric[/"<a href='https://github.com/openproblems-bio/task_spatial_segmentation#component-type-metric'>Metric</a>"/]
comp_method[/"<a href='https://github.com/openproblems-bio/task_spatial_segmentation#component-type-method'>Method</a>"/]
file_prediction("<a href='https://github.com/openproblems-bio/task_spatial_segmentation#file-format-predicted-data'>Predicted data</a>")
file_score("<a href='https://github.com/openproblems-bio/task_spatial_segmentation#file-format-score'>Score</a>")
file_common_scrnaseq("<a href='https://github.com/openproblems-bio/task_spatial_segmentation#file-format-common-sc-dataset'>Common SC Dataset</a>")
file_common_ist---comp_data_processor
comp_data_processor-->file_spatial_dataset
comp_data_processor-->file_scrnaseq_reference
file_spatial_dataset---comp_control_method
file_spatial_dataset---comp_method
comp_data_processor-->file_spatial_dataset
file_scrnaseq_reference---comp_control_method
file_scrnaseq_reference---comp_metric
file_spatial_dataset---comp_control_method
file_spatial_dataset---comp_method
comp_control_method-->file_prediction
comp_method-->file_prediction
comp_metric-->file_score
comp_method-->file_prediction
file_prediction---comp_metric
file_common_scrnaseq---comp_data_processor
```
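
The wiring in the flowchart can also be read as a small dependency graph. The following is a minimal sketch in Python, assuming the node identifiers from the diagram above; the `EDGES` list and the `downstream` helper are hypothetical names used only for illustration and are not part of the repository.

```python
from collections import defaultdict

# Edges transcribed from the mermaid flowchart. Both `---` and `-->` links
# are treated as "feeds into"; names are the diagram's node identifiers.
EDGES = [
    ("file_common_ist", "comp_data_processor"),
    ("file_common_scrnaseq", "comp_data_processor"),
    ("comp_data_processor", "file_spatial_dataset"),
    ("comp_data_processor", "file_scrnaseq_reference"),
    ("file_spatial_dataset", "comp_control_method"),
    ("file_spatial_dataset", "comp_method"),
    ("file_scrnaseq_reference", "comp_control_method"),
    ("file_scrnaseq_reference", "comp_metric"),
    ("comp_control_method", "file_prediction"),
    ("comp_method", "file_prediction"),
    ("file_prediction", "comp_metric"),
    ("comp_metric", "file_score"),
]


def downstream(node, edges=EDGES):
    """Return every node reachable from `node`, in sorted order."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    seen, stack = set(), [node]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return sorted(seen)
```

Calling `downstream("file_common_ist")` lists everything that depends on the common iST dataset, which is one way to see that both inputs of the data processor eventually reach `file_score`.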
@@ -76,91 +76,12 @@ Format:

<div class="small">

SpatialData object
images: 'image', 'image_3D', 'he_image'
labels: 'cell_labels', 'nucleus_labels'
points: 'transcripts'
shapes: 'cell_boundaries', 'nucleus_boundaries'
tables: 'metadata'
coordinate_systems: 'global'

</div>

Data structure:

<div class="small">

*images*

| Name | Description |
|:-----------|:------------------------------------|
| `image` | The raw image data. |
| `image_3D` | (*Optional*) The raw 3D image data. |
| `he_image` | (*Optional*) H&E image data. |

*labels*

| Name | Description |
|:-----------------|:---------------------------------------|
| `cell_labels` | (*Optional*) Cell segmentation labels. |
| `nucleus_labels` | (*Optional*) Cell segmentation labels. |

*points*

`transcripts`: Point cloud data of transcripts.

| Column | Type | Description |
|:---|:---|:---|
| `x` | `float` | x-coordinate of the point. |
| `y` | `float` | y-coordinate of the point. |
| `z` | `float` | (*Optional*) z-coordinate of the point. |
| `feature_name` | `categorical` | Name of the feature. |
| `cell_id` | `integer` | (*Optional*) Unique identifier of the cell. |
| `nucleus_id` | `integer` | (*Optional*) Unique identifier of the nucleus. |
| `cell_type` | `string` | (*Optional*) Cell type of the cell. |
| `qv` | `float` | (*Optional*) Quality value of the point. |
| `transcript_id` | `long` | Unique identifier of the transcript. |
| `overlaps_nucleus` | `boolean` | (*Optional*) Whether the point overlaps with a nucleus. |

*shapes*

`cell_boundaries`: Cell boundaries.

| Column | Type | Description |
|:-----------|:---------|:-------------------------------|
| `geometry` | `object` | Geometry of the cell boundary. |

`nucleus_boundaries`: Nucleus boundaries.

| Column | Type | Description |
|:-----------|:---------|:----------------------------------|
| `geometry` | `object` | Geometry of the nucleus boundary. |

*tables*

`metadata`: Metadata of spatial dataset.

| Slot | Type | Description |
|:---|:---|:---|
| `obs["cell_id"]` | `string` | A unique identifier for the cell. |
| `var["gene_ids"]` | `string` | Unique identifier for the gene. |
| `var["feature_types"]` | `string` | Type of the feature. |
| `obsm["spatial"]` | `double` | Spatial coordinates of the cell. |
| `uns["dataset_id"]` | `string` | A unique identifier for the dataset. |
| `uns["dataset_name"]` | `string` | A human-readable name for the dataset. |
| `uns["dataset_url"]` | `string` | Link to the original source of the dataset. |
| `uns["dataset_reference"]` | `string` | Bibtex reference of the paper in which the dataset was published. |
| `uns["dataset_summary"]` | `string` | Short description of the dataset. |
| `uns["dataset_description"]` | `string` | Long description of the dataset. |
| `uns["dataset_organism"]` | `string` | The organism of the sample in the dataset. |
| `uns["segmentation_id"]` | `string` | A unique identifier for the segmentation. |

*coordinate_systems*

| Name | Description |
|:---------|:------------------------------------|
| `global` | Coordinate system of the replicate. |

</div>

## Component type: Data processor
@@ -176,110 +97,7 @@ Arguments:
| `--input_sp` | `file` | An unprocessed spatial imaging dataset stored as a zarr file. |
| `--input_sc` | `file` | An unprocessed dataset as output by a dataset loader. |
| `--output_spatial_dataset` | `file` | (*Output*) A spatial transcriptomics dataset, preprocessed for this benchmark. |
| `--output_scrnaseq_reference` | `file` | (*Output*) A single-cell reference dataset, preprocessed for this benchmark. |

</div>

## File format: Raw iST Dataset

A spatial transcriptomics dataset, preprocessed for this benchmark.

Example file:
`resources_test/task_spatial_segmentation/mouse_brain_combined/common_ist.zarr`

Description:

This dataset contains preprocessed images, labels, points, shapes, and
tables for spatial transcriptomics data.

Format:

<div class="small">

SpatialData object
images: 'image', 'image_3D', 'he_image'
labels: 'cell_labels', 'nucleus_labels'
points: 'transcripts'
shapes: 'cell_boundaries', 'nucleus_boundaries'
tables: 'metadata'
coordinate_systems: 'global'

</div>

Data structure:

<div class="small">

*images*

| Name | Description |
|:-----------|:------------------------------------|
| `image` | The raw image data. |
| `image_3D` | (*Optional*) The raw 3D image data. |
| `he_image` | (*Optional*) H&E image data. |

*labels*

| Name | Description |
|:-----------------|:---------------------------------------|
| `cell_labels` | (*Optional*) Cell segmentation labels. |
| `nucleus_labels` | (*Optional*) Cell segmentation labels. |

*points*

`transcripts`: Point cloud data of transcripts.

| Column | Type | Description |
|:---|:---|:---|
| `x` | `float` | x-coordinate of the point. |
| `y` | `float` | y-coordinate of the point. |
| `z` | `float` | (*Optional*) z-coordinate of the point. |
| `feature_name` | `categorical` | Name of the feature. |
| `cell_id` | `integer` | (*Optional*) Unique identifier of the cell. |
| `nucleus_id` | `integer` | (*Optional*) Unique identifier of the nucleus. |
| `cell_type` | `string` | (*Optional*) Cell type of the cell. |
| `qv` | `float` | (*Optional*) Quality value of the point. |
| `transcript_id` | `long` | Unique identifier of the transcript. |
| `overlaps_nucleus` | `boolean` | (*Optional*) Whether the point overlaps with a nucleus. |

*shapes*

`cell_boundaries`: Cell boundaries.

| Column | Type | Description |
|:-----------|:---------|:-------------------------------|
| `geometry` | `object` | Geometry of the cell boundary. |

`nucleus_boundaries`: Nucleus boundaries.

| Column | Type | Description |
|:-----------|:---------|:----------------------------------|
| `geometry` | `object` | Geometry of the nucleus boundary. |

*tables*

`metadata`: Metadata of spatial dataset.

| Slot | Type | Description |
|:---|:---|:---|
| `obs["cell_id"]` | `string` | A unique identifier for the cell. |
| `var["gene_ids"]` | `string` | Unique identifier for the gene. |
| `var["feature_types"]` | `string` | Type of the feature. |
| `obsm["spatial"]` | `double` | Spatial coordinates of the cell. |
| `uns["dataset_id"]` | `string` | A unique identifier for the dataset. |
| `uns["dataset_name"]` | `string` | A human-readable name for the dataset. |
| `uns["dataset_url"]` | `string` | Link to the original source of the dataset. |
| `uns["dataset_reference"]` | `string` | Bibtex reference of the paper in which the dataset was published. |
| `uns["dataset_summary"]` | `string` | Short description of the dataset. |
| `uns["dataset_description"]` | `string` | Long description of the dataset. |
| `uns["dataset_organism"]` | `string` | The organism of the sample in the dataset. |
| `uns["segmentation_id"]` | `string` | A unique identifier for the segmentation. |

*coordinate_systems*

| Name | Description |
|:---------|:------------------------------------|
| `global` | Coordinate system of the replicate. |
| `--output_scrnaseq` | `file` | (*Output*) A single-cell reference dataset, preprocessed for this benchmark. |

</div>
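
To make the data processor interface concrete, here is a minimal sketch of a command-line parser for the arguments listed in the table above, including the output flag that this diff renames from `--output_scrnaseq_reference` to `--output_scrnaseq`. It assumes a plain `argparse` interface and that all four arguments are required; the actual component is built with Viash and may expose its parameters differently, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Parser mirroring the data processor arguments in the table above."""
    parser = argparse.ArgumentParser(
        description="Data processor (illustrative sketch)"
    )
    # Assumption: every argument is required; the README does not say so.
    parser.add_argument("--input_sp", required=True,
                        help="Unprocessed spatial imaging dataset (zarr).")
    parser.add_argument("--input_sc", required=True,
                        help="Unprocessed dataset from a dataset loader.")
    parser.add_argument("--output_spatial_dataset", required=True,
                        help="Output: preprocessed spatial dataset.")
    # Renamed in this diff from --output_scrnaseq_reference.
    parser.add_argument("--output_scrnaseq", required=True,
                        help="Output: preprocessed single-cell reference.")
    return parser
```

With this sketch, a call such as `build_parser().parse_args([...])` exposes the values as `args.input_sp`, `args.input_sc`, `args.output_spatial_dataset`, and `args.output_scrnaseq`.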

@@ -288,7 +106,7 @@ Data structure:
A single-cell reference dataset, preprocessed for this benchmark.

Example file:
`resources_test/task_spatial_segmentation/mouse_brain_combined/common_scrnaseq.h5ad`
`resources_test/task_spatial_segmentation/mouse_brain_combined/scrnaseq_reference.h5ad`

Description:

@@ -364,6 +182,30 @@ Data structure:

</div>

## File format: Raw iST Dataset

A spatial transcriptomics dataset, preprocessed for this benchmark.

Example file:
`resources_test/task_spatial_segmentation/mouse_brain_combined/spatial_dataset.zarr`

Description:

This dataset contains preprocessed images, labels, points, shapes, and
tables for spatial transcriptomics data.

Format:

<div class="small">

</div>

Data structure:

<div class="small">

</div>

## Component type: Control Method

Quality control methods for verifying the pipeline.
@@ -380,34 +222,34 @@ Arguments:

</div>

## Component type: Method
## Component type: Metric

A method.
A task template metric.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--input` | `file` | A spatial transcriptomics dataset, preprocessed for this benchmark. |
| `--output` | `file` | (*Output*) A predicted dataset as output by a method. |
| `--input_prediction` | `file` | A predicted dataset as output by a method. |
| `--input_scrnaseq_reference` | `file` | A single-cell reference dataset, preprocessed for this benchmark. |
| `--output` | `file` | (*Output*) File indicating the score of a metric. |

</div>

## Component type: Metric
## Component type: Method

A task template metric.
A method.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--input_prediction` | `file` | A predicted dataset as output by a method. |
| `--input_scrnaseq_reference` | `file` | A single-cell reference dataset, preprocessed for this benchmark. |
| `--output` | `file` | (*Output*) File indicating the score of a metric. |
| `--input` | `file` | A spatial transcriptomics dataset, preprocessed for this benchmark. |
| `--output` | `file` | (*Output*) A predicted dataset as output by a method. |

</div>

@@ -422,31 +264,12 @@ Format:

<div class="small">

SpatialData object
labels: 'segmentation'
tables: 'table'

</div>

Data structure:

<div class="small">

*labels*

| Name | Description |
|:---------------|:--------------------------|
| `segmentation` | Segmentation of the data. |

*tables*

`table`: AnnData table.

| Slot | Type | Description |
|:-----------------|:---------|:------------|
| `obs["cell_id"]` | `string` | Cell ID. |
| `obs["region"]` | `string` | Region. |

</div>

## File format: Score
@@ -562,3 +385,4 @@ Data structure:
| `uns["dataset_organism"]` | `string` | (*Optional*) The organism of the sample in the dataset. |

</div>

4 changes: 2 additions & 2 deletions _viash.yaml
@@ -11,8 +11,8 @@ license: MIT
keywords: [single-cell, openproblems, benchmark]
# Step 3: Update the `task_template` to the name of the task from step 1.
links:
issue_tracker: https://github.com/openproblems-bio/task_template/issues
repository: https://github.com/openproblems-bio/task_template
issue_tracker: https://github.com/openproblems-bio/task_spatial_segmentation/issues
repository: https://github.com/openproblems-bio/task_spatial_segmentation
docker_registry: ghcr.io
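
Since this change replaces `task_template` with `task_spatial_segmentation` in the links, a quick check for leftover template references can be sketched as follows. This is an illustrative helper, not part of the repository; the function name `find_template_refs` and the choice to match only the `openproblems-bio/task_template` slug are assumptions.

```python
import re

# Matches the old repository slug, but not longer names that merely
# start with it (for example a hypothetical task_template_extra).
TEMPLATE_REF = re.compile(r"openproblems-bio/task_template(?![\w-])")


def find_template_refs(text):
    """Return (line_number, line) pairs that still reference the template repo."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if TEMPLATE_REF.search(line)
    ]
```

Running it over the contents of `README.md` or `_viash.yaml` returns an empty list once every link points at the renamed repository.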

