
Migrate BAM I/O from rust-htslib to noodles #115

Merged: ewels merged 3 commits into main from cursor/noodles-migration-ae82 on Jun 16, 2026

@ewels (Member) commented Jun 16, 2026

Summary

Closes #113.

Replaces rust-htslib (htslib C bindings) with the pure-Rust noodles crate for all BAM/SAM/CRAM I/O. A compatibility shim at src/rna/bam/ preserves the existing Reader / IndexedReader / Record API so the rest of the pipeline is unchanged.
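The shim idea can be illustrated with a small, crate-free sketch. This is not the code in src/rna/bam/: the `Backend` trait, the `VecBackend` stand-in, and the three `Record` fields are hypothetical names chosen for illustration. The point is only that callers depend on a stable `Reader` / `Record` surface while the decoding backend behind it can be swapped.

```rust
use std::collections::VecDeque;

/// Backend-neutral alignment record. Hypothetical: the real shim's `Record`
/// exposes whatever accessors the pipeline already relied on.
#[derive(Debug, Clone, PartialEq)]
pub struct Record {
    pub qname: String,
    pub flag: u16,
    /// 0-based leftmost position, -1 if unmapped.
    pub pos: i64,
}

impl Record {
    /// SAM flag bit 0x4 marks an unmapped read.
    pub fn is_unmapped(&self) -> bool {
        self.flag & 0x4 != 0
    }
}

/// What a decoding backend must provide. In the PR this role is played by
/// noodles; here it is a hypothetical trait so the sketch needs no crates.
pub trait Backend {
    /// Return the next record, or `None` at end of input.
    fn next_record(&mut self) -> Option<Record>;
}

/// Facade with a stable API: callers see `Reader`, never the backend type.
pub struct Reader {
    backend: Box<dyn Backend>,
}

impl Reader {
    pub fn from_backend(backend: Box<dyn Backend>) -> Self {
        Reader { backend }
    }

    /// Iterate over all remaining records.
    pub fn records(&mut self) -> impl Iterator<Item = Record> + '_ {
        std::iter::from_fn(move || self.backend.next_record())
    }
}

/// In-memory stand-in backend, used only to exercise the facade.
pub struct VecBackend(pub VecDeque<Record>);

impl Backend for VecBackend {
    fn next_record(&mut self) -> Option<Record> {
        self.0.pop_front()
    }
}
```

With this shape, downstream code that only calls `Reader::records()` and `Record` accessors does not change when the backend does, which is the property the PR description claims for the rest of the pipeline.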

Changes

  • New src/rna/bam/ module — noodles-backed readers, records, CIGAR decoding, and test fixture writer/index helpers
  • Dependency swap — rust-htslib removed; noodles 0.111 added with bam, sam, cram, csi, bgzf, core, and fasta features
  • Samtools-identical output preserved — packed BAM sequence bytes used for CHK checksums; all existing stats/flagstat/idxstats logic unchanged
  • Build simplification — no cmake/htslib system libraries required; only g++ (preseq FFI) and libfontconfig remain
  • Docs & CI updated — website installation/credits/library pages, Dockerfile, GitHub Actions workflows, AGENTS.md, CONTRIBUTING.md, and CHANGELOG.md
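To make the "packed BAM sequence bytes" point concrete, here is a minimal sketch of BAM's 4-bit sequence packing followed by a CRC-32 over the packed bytes. The packing table comes from the SAM/BAM specification. The function names are hypothetical, and summing per-read CRC-32 values with wrapping arithmetic is an assumption about how the CHK value is accumulated, not code taken from this PR.

```rust
/// Map an IUPAC base to its 4-bit BAM code using the specification's
/// lookup string "=ACMGRSVTWYHKDBN"; unknown characters become N (15).
fn base_code(base: u8) -> u8 {
    const TABLE: &[u8; 16] = b"=ACMGRSVTWYHKDBN";
    TABLE
        .iter()
        .position(|&b| b == base.to_ascii_uppercase())
        .unwrap_or(15) as u8
}

/// Pack bases two per byte, first base in the high nibble. An odd-length
/// sequence leaves the final low nibble as zero.
pub fn pack_sequence(seq: &[u8]) -> Vec<u8> {
    seq.chunks(2)
        .map(|pair| {
            let hi = base_code(pair[0]) << 4;
            let lo = pair.get(1).map_or(0, |&b| base_code(b));
            hi | lo
        })
        .collect()
}

/// Bitwise CRC-32 (reflected polynomial 0xEDB88320, the zlib variant).
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Assumed accumulation rule: wrapping sum of per-read CRC-32 values
/// computed over the packed (not ASCII) sequence bytes.
pub fn add_read_checksum(acc: u32, seq: &[u8]) -> u32 {
    acc.wrapping_add(crc32(&pack_sequence(seq)))
}
```

For example, `pack_sequence(b"ACGT")` yields the two bytes `0x12, 0x48`. Hashing the decoded ASCII bases instead would feed different bytes into the CRC, which is why a backend that exposes decoded sequences needs the packed form to reproduce the same checksum.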

Output parity

Direct A/B comparison on tests/data/test.bam (main vs this branch) confirms bit-identical content for the samtools outputs (flagstat, idxstats, stats), dupRadar, preseq, the RSeQC text outputs, the Qualimap data files, and all PNG/SVG plots. The only differences are metadata: timestamps, output paths in comments, and the git commit hash in CITATIONS.md.
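A comparison of this kind, for text outputs only, could be sketched as follows. This is not the procedure used for the PR: the rule that lines starting with `#` are metadata is a hypothetical simplification, and both function names are invented for illustration.

```rust
/// Hypothetical rule for this sketch: '#'-prefixed lines carry metadata
/// (timestamps, paths) and are excluded from the comparison.
pub fn is_metadata(line: &str) -> bool {
    line.trim_start().starts_with('#')
}

/// Lines that must match exactly between the two runs.
fn payload_lines(text: &str) -> Vec<&str> {
    text.lines().filter(|l| !is_metadata(l)).collect()
}

/// Index (among non-metadata lines) of the first mismatch, or `None`
/// when the two outputs agree on every non-metadata line.
pub fn first_difference(old: &str, new: &str) -> Option<usize> {
    let (a, b) = (payload_lines(old), payload_lines(new));
    let shared = a.len().min(b.len());
    (0..shared)
        .find(|&i| a[i] != b[i])
        // Same prefix but different lengths: the mismatch is the first
        // line that exists in only one of the outputs.
        .or(if a.len() != b.len() { Some(shared) } else { None })
}
```

Binary outputs such as PNG plots would need a byte-level comparison instead, which this sketch does not cover.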

Testing

CXX=g++ cargo fmt --check
CXX=g++ cargo clippy -- -D warnings
CXX=g++ cargo test --release

All unit tests and 18 integration tests pass.

cursoragent and others added 3 commits June 16, 2026 11:13
Replace rust-htslib with a pure-Rust noodles backend behind a
compatibility shim at src/rna/bam/. The shim preserves the existing
Reader/IndexedReader/Record API used across dupRadar, RSeQC, preseq,
Qualimap, and samtools-compatible outputs.

Key changes:
- Add noodles-backed BAM/SAM/CRAM readers with indexed fetch support
- Preserve samtools-identical CHK checksums via packed sequence bytes
- Update build docs: no cmake/htslib system deps required
- All integration and unit tests pass in release mode

Co-authored-by: Phil Ewels <phil.ewels@seqera.io>

Replace rust-htslib build prerequisites with noodles (pure Rust) across
the website docs, Dockerfile, GitHub Actions workflows, and CHANGELOG.

Co-authored-by: Phil Ewels <phil.ewels@seqera.io>

- Remove extra blank line in main.rs (cargo fmt --check)
- Bump rust-version to 1.89 (required by noodles 0.111)
- Update MSRV CI job to match

Co-authored-by: Phil Ewels <phil.ewels@seqera.io>
@ewels ewels merged commit 9cf5121 into main Jun 16, 2026
7 checks passed
@ewels ewels deleted the cursor/noodles-migration-ae82 branch June 16, 2026 21:03
Development

Successfully merging this pull request may close these issues.

Feature request: Switch rust-htslib for Noodles
