From 687716f4bb7b0be0e5ba9dfb325d808109595de1 Mon Sep 17 00:00:00 2001
From: Joshua Charkow
Date: Thu, 29 Jan 2026 11:44:46 -0500
Subject: [PATCH 1/2] GSoC OpenSwathWorkflow project

---
 content/en/news/GSoC2026.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/content/en/news/GSoC2026.md b/content/en/news/GSoC2026.md
index fa483e6..6d70035 100644
--- a/content/en/news/GSoC2026.md
+++ b/content/en/news/GSoC2026.md
@@ -80,3 +80,24 @@ Tasks:
 7. Provide a recommendation report for future binding strategy based on findings.
 
 ---
+
+### 3) Accelerating OpenSwathWorkflow for Large-Scale In Silico Spectral Libraries
+**Proposed Mentors:** Joshua Charkow
+**Skills:** C++, Algorithm Optimization, Profiling
+**Estimated Project Length:** 200 hours | Difficulty: Medium
+
+OpenSwathWorkflow is a central component of OpenMS for Data Independent Acquisition (DIA) analysis, enabling targeted extraction and scoring of chromatographic signals using spectral libraries. While OpenSwathWorkflow performs well for conventional experimental libraries, the increasing adoption of large in silico–generated spectral libraries presents substantial computational challenges. Such libraries can contain millions of precursors, leading to increased memory usage, longer runtimes, and scalability bottlenecks in candidate selection, scoring.
+
+This project aims to analyze and improve the computational performance and scalability of OpenSwathWorkflow, with a particular focus on workflows using very large in silico spectral libraries. The goal is to identify bottlenecks, redesign performance-critical components where necessary, and introduce optimizations that enable efficient processing without compromising identification quality.
+
+A key deliverable of this project is a systematic performance evaluation of OpenSwathWorkflow before and after optimization.
+
+Tasks:
+1. Develop a comprehensive understanding for the OpenSwathWorkflow algorithm
+1. Develop a benchmarking dataset for profiling.
+2. Profile OpenSwathWorkflow to identify computational bottlenecks.
+3. Identify algorithmic bottlenecks and propose changes.
+4. Experiment with different algorithms using inspiration from other open source DIA projects.
+5. Validate that the optimized implementation provides comparable results to the original implementation and other DIA software tools.
+
+--
\ No newline at end of file

From 62d19e1b01ec87956197556b2a4dcedbd32d54c6 Mon Sep 17 00:00:00 2001
From: Joshua Charkow <47336288+jcharkow@users.noreply.github.com>
Date: Thu, 29 Jan 2026 12:33:09 -0500
Subject: [PATCH 2/2] minor: fix typo replace , with and

---
 content/en/news/GSoC2026.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/en/news/GSoC2026.md b/content/en/news/GSoC2026.md
index 6d70035..232e8c4 100644
--- a/content/en/news/GSoC2026.md
+++ b/content/en/news/GSoC2026.md
@@ -86,7 +86,7 @@ Tasks:
 **Skills:** C++, Algorithm Optimization, Profiling
 **Estimated Project Length:** 200 hours | Difficulty: Medium
 
-OpenSwathWorkflow is a central component of OpenMS for Data Independent Acquisition (DIA) analysis, enabling targeted extraction and scoring of chromatographic signals using spectral libraries. While OpenSwathWorkflow performs well for conventional experimental libraries, the increasing adoption of large in silico–generated spectral libraries presents substantial computational challenges. Such libraries can contain millions of precursors, leading to increased memory usage, longer runtimes, and scalability bottlenecks in candidate selection, scoring.
+OpenSwathWorkflow is a central component of OpenMS for Data Independent Acquisition (DIA) analysis, enabling targeted extraction and scoring of chromatographic signals using spectral libraries. While OpenSwathWorkflow performs well for conventional experimental libraries, the increasing adoption of large in silico–generated spectral libraries presents substantial computational challenges. Such libraries can contain millions of precursors, leading to increased memory usage, longer runtimes, and scalability bottlenecks in candidate selection and scoring.
 
 This project aims to analyze and improve the computational performance and scalability of OpenSwathWorkflow, with a particular focus on workflows using very large in silico spectral libraries. The goal is to identify bottlenecks, redesign performance-critical components where necessary, and introduce optimizations that enable efficient processing without compromising identification quality.
 
@@ -100,4 +100,4 @@ Tasks:
 4. Experiment with different algorithms using inspiration from other open source DIA projects.
 5. Validate that the optimized implementation provides comparable results to the original implementation and other DIA software tools.
 
---
\ No newline at end of file
+--