Add new component: gridss/preprocess#11988

Open: imsarath wants to merge 3 commits into nf-core:master from imsarath:gridss_preprocess

imsarath commented Jun 12, 2026

Contributor

As part of the distributed-computing GRIDSS subworkflow (#4498), I added this gridss/preprocess module. The GRIDSS pre-processing step extracts multiple Picard metrics (insert size, MAPQ, CIGAR, IDSV, tag, and coverage) from an input BAM file prior to assembly and variant calling.

PR checklist

Closes #4499

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Broadcast software version numbers to topic: versions - See version_topics
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

imsarath self-assigned this Jun 12, 2026

stripPicardHeaderMd5("${workdir}/${prefix}.insert_size_metrics"),
stripPicardHeaderMd5("${workdir}/${prefix}.mapq_metrics"),
stripPicardHeaderMd5("${workdir}/${prefix}.tag_metrics"),
process.out.findAll { key, val -> key.startsWith("versions") }

Contributor

I believe that with nf-test 0.9.5, which the CI for this repo already has, this can be done simply by:

Suggested change
- process.out.findAll { key, val -> key.startsWith("versions") }
+ topics

This works provided that topics "versions" is added above.

Contributor

Or even just process.out.versions_gridss? 🤔
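
For readers comparing the two forms, here is a minimal Groovy sketch of what the key-prefix filter selects versus direct access by name. It assumes process.out behaves like a map here, as the findAll call with a key/value closure suggests. A plain map stands in for it, and the channel names and values are hypothetical placeholders, not output from this module.

```groovy
// Stand-in for process.out: a plain map with hypothetical channel names and values
def out = [
    preprocess_dir : [[id: 'test'], 'test.bam.gridss.working'],
    versions_gridss: [['GRIDSS_PREPROCESS', 'gridss', 'x.y.z']],
]

// Keep only the entries whose key starts with "versions"
def versionsOnly = out.findAll { key, val -> key.startsWith('versions') }

assert versionsOnly.keySet() == ['versions_gridss'] as Set
// Accessing the channel by name yields the same contents as the filtered entry
assert out.versions_gridss == versionsOnly['versions_gridss']
```

With a single versions channel the two forms pick out the same data. The filter only matters if several outputs share the "versions" prefix.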

Comment on lines +60 to +67
process.out.preprocess_dir.collect { meta, dir -> [meta, file(dir).list().findAll { it.startsWith(prefix) }.sort()] },
path("${workdir}/${prefix}.computesamtags.changes.tsv"),
path("${workdir}/${prefix}.coverage.blacklist.bed"),
stripPicardHeaderMd5("${workdir}/${prefix}.cigar_metrics"),
stripPicardHeaderMd5("${workdir}/${prefix}.idsv_metrics"),
stripPicardHeaderMd5("${workdir}/${prefix}.insert_size_metrics"),
stripPicardHeaderMd5("${workdir}/${prefix}.mapq_metrics"),
stripPicardHeaderMd5("${workdir}/${prefix}.tag_metrics"),

Contributor

I think lines 60-67 are actually testing the tool itself, not the module. The behaviour of the tool should be tested in the tool itself. For the Nextflow module, I would think checking the process outputs is sufficient.

Suggested change
- process.out.preprocess_dir.collect { meta, dir -> [meta, file(dir).list().findAll { it.startsWith(prefix) }.sort()] },
- path("${workdir}/${prefix}.computesamtags.changes.tsv"),
- path("${workdir}/${prefix}.coverage.blacklist.bed"),
- stripPicardHeaderMd5("${workdir}/${prefix}.cigar_metrics"),
- stripPicardHeaderMd5("${workdir}/${prefix}.idsv_metrics"),
- stripPicardHeaderMd5("${workdir}/${prefix}.insert_size_metrics"),
- stripPicardHeaderMd5("${workdir}/${prefix}.mapq_metrics"),
- stripPicardHeaderMd5("${workdir}/${prefix}.tag_metrics"),
+ process.out

Looking at the tests for cadd, for example, that seems to be the case. 🤔
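
For context on what the per-file checks in lines 60-67 do: the implementation of stripPicardHeaderMd5 is not shown in this thread. A helper with that name presumably hashes a Picard metrics file after dropping its '#' header lines, since those can carry run-specific details such as the start time. The following is a minimal Groovy sketch under that assumption. The function name md5WithoutPicardHeader and the sample file contents are hypothetical.

```groovy
import java.security.MessageDigest

// Hypothetical helper (not the PR's code): md5 of Picard metrics text,
// ignoring header lines that start with '#'.
String md5WithoutPicardHeader(String text) {
    // Drop comment lines such as "# Started on: ..." that differ between runs
    def body = text.readLines().findAll { !it.startsWith('#') }.join('\n')
    def digest = MessageDigest.getInstance('MD5').digest(body.getBytes('UTF-8'))
    // Render the 16-byte digest as 32 lowercase hex characters
    return digest.encodeHex().toString()
}

// Two made-up metrics files that differ only in a header timestamp
def runA = '## METRICS CLASS\n# Started on: Mon Jan 01\nMEDIAN_INSERT_SIZE\n300\n'
def runB = '## METRICS CLASS\n# Started on: Tue Jan 02\nMEDIAN_INSERT_SIZE\n300\n'

assert md5WithoutPicardHeader(runA) == md5WithoutPicardHeader(runB)
```

A check of this kind pins the numeric content of each metrics file, which is why the comment above classes it as testing the tool rather than the module wiring.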

versions:
  - - ${task.process}:
        type: string
        description: The process

Contributor

Suggested change
- description: The process
+ description: The name of the process

Development

Successfully merging this pull request may close these issues.

new module: gridss/preprocess

2 participants