GSoC OpenSwathWorkflow project #242
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -80,3 +80,24 @@ Tasks: | |
| 7. Provide a recommendation report for future binding strategy based on findings. | ||
|
|
||
| --- | ||
|
|
||
| ### 3) Accelerating OpenSwathWorkflow for Large-Scale In Silico Spectral Libraries | ||
| **Proposed Mentors:** Joshua Charkow | ||
| **Skills:** C++, Algorithm Optimization, Profiling | ||
| **Estimated Project Length:** 200 hours | Difficulty: Medium | ||
|
|
||
| OpenSwathWorkflow is a central component of OpenMS for Data Independent Acquisition (DIA) analysis, enabling targeted extraction and scoring of chromatographic signals using spectral libraries. While OpenSwathWorkflow performs well for conventional experimental libraries, the increasing adoption of large in silico–generated spectral libraries presents substantial computational challenges. Such libraries can contain millions of precursors, leading to increased memory usage, longer runtimes, and scalability bottlenecks in candidate selection and scoring. | ||
|
|
||
| This project aims to analyze and improve the computational performance and scalability of OpenSwathWorkflow, with a particular focus on workflows using very large in silico spectral libraries. The goal is to identify bottlenecks, redesign performance-critical components where necessary, and introduce optimizations that enable efficient processing without compromising identification quality. | ||
|
|
||
| A key deliverable of this project is a systematic performance evaluation of OpenSwathWorkflow before and after optimization. | ||
|
|
||
| Tasks: | ||
| 1. Develop a comprehensive understanding for the OpenSwathWorkflow algorithm | ||
| 1. Develop a benchmarking dataset for profiling. | ||
| 2. Profile OpenSwathWorkflow to identify computational bottlenecks. | ||
| 3. Identify algorithmic bottlenecks and propose changes. | ||
| 4. Experiment with different algorithms using inspiration from other open source DIA projects. | ||
| 5. Validate that the optimized implementation provides comparable results to the original implementation and other DIA software tools. | ||
|
|
||
| -- | ||
Comment on lines +83 to +103
Contributor
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for the project title in the file
echo "=== Searching for project title ==="
rg -n "Accelerating OpenSwathWorkflow for Large-Scale In Silico Spectral Libraries" content/en/news/GSoC2026.md
echo ""
echo "=== Searching for mentor name ==="
rg -n "Joshua Charkow" content/en/news/GSoC2026.md
echo ""
echo "=== Full file structure for context ==="
rg -n "^###|^--" content/en/news/GSoC2026.md | head -50
Repository: OpenMS/OpenMS-website
Length of output: 521
🏁 Script executed:
#!/bin/bash
# Read the specific lines to verify task numbering and other issues
sed -n '83,103p' content/en/news/GSoC2026.md | cat -n
Repository: OpenMS/OpenMS-website
Length of output: 2022
Fix task numbering, grammar, and formatting issues.
The project description does not contain duplicates. However, there are real issues to address:
🧰 Tools
🪛 LanguageTool
[style] ~91-~91: As an alternative to the over-used intensifier ‘very’, consider replacing this phrase.
(EN_WEAK_ADJECTIVE)
[grammar] ~100-~100: Use a hyphen to join words.
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents
Fix task list numbering and wording.
There’s a duplicated “1.” and a couple of phrasing issues that make the task list harder to follow. Suggest renumbering and tightening the wording.
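The duplicated list number described above can also be detected and repaired mechanically. The following is a minimal sketch, not taken from the PR or from the review tooling: the helper names `duplicated_numbers` and `renumber` are hypothetical, and the sketch assumes list items are written as `N. text` with no nesting.

```python
import re

# Matches a Markdown ordered-list item such as "1. Some task".
ITEM = re.compile(r"^(\s*)(\d+)\.\s+(.*)$")


def duplicated_numbers(lines):
    """Return (line_index, number) pairs for items that reuse a number
    already seen in the same run of list items (1-based line index)."""
    seen = set()
    duplicates = []
    for index, line in enumerate(lines, start=1):
        match = ITEM.match(line)
        if not match:
            seen.clear()  # a non-item line ends the current list
            continue
        number = int(match.group(2))
        if number in seen:
            duplicates.append((index, number))
        seen.add(number)
    return duplicates


def renumber(lines):
    """Return a copy of lines with each run of list items numbered 1, 2, 3, ..."""
    result = []
    counter = 0
    for line in lines:
        match = ITEM.match(line)
        if match:
            counter += 1
            result.append(f"{match.group(1)}{counter}. {match.group(3)}")
        else:
            counter = 0  # restart numbering for the next list
            result.append(line)
    return result


# The task list as it appears in the diff under review.
tasks = [
    "1. Develop a comprehensive understanding for the OpenSwathWorkflow algorithm",
    "1. Develop a benchmarking dataset for profiling.",
    "2. Profile OpenSwathWorkflow to identify computational bottlenecks.",
    "3. Identify algorithmic bottlenecks and propose changes.",
    "4. Experiment with different algorithms using inspiration from other open source DIA projects.",
    "5. Validate that the optimized implementation provides comparable results to the original implementation and other DIA software tools.",
]

print(duplicated_numbers(tasks))
print("\n".join(renumber(tasks)))
```

Applied to the task list in the diff, the check reports the second item as reusing number 1, and renumbering yields six consecutively numbered tasks. Wording fixes such as hyphenating "open source" would still need to be made by hand.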
🛠️ Proposed edit
🧰 Tools
🪛 LanguageTool
[grammar] ~100-~100: Use a hyphen to join words.
Context: ...rithms using inspiration from other open source DIA projects. 5. Validate that ...
(QB_NEW_EN_HYPHEN)
🤖 Prompt for AI Agents