Add parsers to extract only relevant data for complexes table in new PDBePISA #51

Merged
Joseph-Ellaway merged 18 commits into main from PDBE-8506-complex-table-data on May 21, 2026

@Joseph-Ellaway
Member

Adds a new model and parser to convert the polished complex-data-containing JSON into a "minimal" JSON, with only data needed by the Complexes Tab on the new PDBePISA UI.

This saves the front end (FE) from having to do lots of parsing each time, and also allows us to add a "Save as CSV" or "Copy to clipboard" button anywhere we want.

I used ABC classes because I'm planning to extend this into a new set of models/parsers, so that the API need only deliver the data the new PDBePISA website needs. The existing polished JSONs and XMLs will remain unchanged, and the endpoints that fetch them can still be supplied to the user.
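For readers unfamiliar with the pattern, here is a minimal sketch of how an abstract base class could support that kind of extension. It is not the code in this PR: `PostProcessParser`, `ComplexTableParser`, and the field names `complex_id`, `size`, and `stable` are all hypothetical, chosen only to illustrate the idea.

```python
import json
from abc import ABC, abstractmethod
from typing import Any


class PostProcessParser(ABC):
    """Hypothetical base class: reduces a polished JSON payload to a minimal one."""

    def __init__(self, polished: dict[str, Any]) -> None:
        self.polished = polished

    @abstractmethod
    def parse(self) -> dict[str, Any]:
        """Return only the fields a given UI view needs."""

    def to_json(self) -> str:
        # Shared by every subclass, so each one only has to implement parse().
        return json.dumps(self.parse(), indent=2)


class ComplexTableParser(PostProcessParser):
    """Hypothetical subclass for the Complexes tab; the kept keys are invented."""

    KEEP = ("complex_id", "size", "stable")

    def parse(self) -> dict[str, Any]:
        rows = [
            {key: cplx.get(key) for key in self.KEEP}
            for cplx in self.polished.get("complexes", [])
        ]
        return {"complexes": rows}


# Example input with one field the table does not need.
polished = {
    "complexes": [
        {"complex_id": 1, "size": 4, "stable": True, "extra": "not needed"},
    ]
}
minimal = ComplexTableParser(polished).parse()
```

Under this sketch, a parser for another view would only need to subclass the base and implement `parse()`, while the polished payload itself is left untouched.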

@Joseph-Ellaway
Member Author

@mihaitodor I still need to add some tests, but wanted to get your thoughts on the implementation first.

Member Author

@Joseph-Ellaway Joseph-Ellaway left a comment

Some comments on the justification for features

Comment thread pisa_utils/models/post_process_models.py
Comment thread pisa_utils/parsers.py
Comment thread pisa_utils/post_process_parsers.py Outdated
Comment thread pisa_utils/models/data_fields.py
Contributor

@mihaitodor mihaitodor left a comment

LGTM, just a few small nits

Comment thread pisa_utils/utils.py Outdated
Comment thread pisa_utils/utils.py Outdated
Comment thread pisa_utils/post_process_parsers.py
Comment thread tests/models/test_post_process_models.py Outdated
@Joseph-Ellaway Joseph-Ellaway marked this pull request as ready for review May 20, 2026 09:54
Copilot AI review requested due to automatic review settings May 20, 2026 09:54
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a new post-processing layer that converts the existing “polished” assemblies JSON into a minimal JSON tailored for the new PDBePISA “Complexes” table, while also refactoring previously monolithic utilities into focused modules (CLI tools, file I/O helpers, and field handlers).

Changes:

  • Added post-process models + a PostProcessComplexTable parser to emit a minimal complex_table.json derived from assemblies.json.
  • Refactored pisa_utils/utils.py into cli_tools.py, file_io.py, and field_handlers.py, updating imports accordingly.
  • Added tests and expected-output fixtures for the new post-processing output.
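As a rough illustration of what a validated row in the minimal output might look like, the sketch below uses a standard-library dataclass as a stand-in for the Pydantic models the PR adds. The class name `ComplexRow`, its fields, and the validation rules are assumptions made for the example, not the actual schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ComplexRow:
    """Hypothetical row of the minimal complex table; fields are invented."""

    complex_id: int
    size: int
    stable: bool

    def __post_init__(self) -> None:
        # Reject values that a table row could never hold.
        if self.complex_id < 1:
            raise ValueError("complex_id must be positive")
        if self.size < 1:
            raise ValueError("size must be at least 1")


row = ComplexRow(complex_id=1, size=4, stable=True)
row_dict = asdict(row)  # plain dict, ready for JSON serialization
```

A real Pydantic model would express the same constraints declaratively through field definitions; the dataclass version is used here only to keep the example free of third-party imports.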

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/test_utils.py | Updates imports to match the new utility module split. |
| tests/parsers/test_post_process_parsers.py | Adds an integration-style test for the new complex-table post-processor. |
| tests/models/test_post_process_models.py | Adds model validation tests for the new post-process schema. |
| tests/data/expected_output/post_processed_jsons/3hax_assembly_multi_asmset_post_proc.json | Adds expected minimal JSON output fixture for complex-table extraction. |
| pisa_utils/utils.py | Removes the old utils module in favor of dedicated modules. |
| pisa_utils/run.py | Wires the new post-processing step into the service pipeline and updates CLI imports. |
| pisa_utils/run_pisa.py | Updates imports after moving file/config helpers to file_io.py. |
| pisa_utils/post_process_parsers.py | Adds the new post-processing parser implementation. |
| pisa_utils/parsers.py | Switches JSON saving/opening to shared file I/O helpers and moves field helpers to field_handlers.py. |
| pisa_utils/models/post_process_models.py | Adds Pydantic models for the minimal complex-table JSON. |
| pisa_utils/models/labels.py | Adds new label(s) and renames/adjusts interface-energy and interface-total label text. |
| pisa_utils/models/data_models.py | Reuses centralized Field helper definitions for complex-related fields. |
| pisa_utils/models/data_fields.py | Introduces centralized Pydantic Field helper functions for reuse across models. |
| pisa_utils/file_io.py | New shared helpers for gzip-aware open/save, XML parsing, config creation, etc. |
| pisa_utils/field_handlers.py | New module containing extracted field/identifier helpers and UniProt CIF lookup logic. |
| pisa_utils/dictionaries.py | Updates imports for read_uniprot_info after refactor. |
| pisa_utils/cli_tools.py | New module containing CLI arg parsing + validation formerly in utils.py. |
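
The "gzip-aware open/save" helpers mentioned for file_io.py could look roughly like the sketch below. The function names `open_text`, `save_json`, and `load_json` are hypothetical and are not taken from the PR; the sketch only assumes that compression is selected by a `.gz` file suffix.

```python
import gzip
import json
from pathlib import Path
from typing import Any


def open_text(path: Path, mode: str = "r"):
    """Open a text file, transparently handling a .gz suffix (hypothetical helper)."""
    if path.suffix == ".gz":
        # gzip.open needs an explicit "t" to work in text mode.
        return gzip.open(path, mode + "t", encoding="utf-8")
    return open(path, mode, encoding="utf-8")


def save_json(data: Any, path: Path) -> None:
    """Write data as JSON, compressed if the path ends in .gz."""
    with open_text(path, "w") as handle:
        json.dump(data, handle)


def load_json(path: Path) -> Any:
    """Read JSON from a plain or gzip-compressed file."""
    with open_text(path) as handle:
        return json.load(handle)
```

Routing both branches through one opener means callers such as a parser's save step do not need to know whether the target file is compressed.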

Comment thread pisa_utils/models/data_fields.py Outdated
Comment thread pisa_utils/post_process_parsers.py Outdated
Comment thread pisa_utils/models/labels.py Outdated
Comment thread pisa_utils/run.py
Comment thread pisa_utils/post_process_parsers.py
Joseph-Ellaway and others added 7 commits May 20, 2026 12:59
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@Joseph-Ellaway Joseph-Ellaway merged commit 59e5721 into main May 21, 2026
8 checks passed
@Joseph-Ellaway Joseph-Ellaway deleted the PDBE-8506-complex-table-data branch May 22, 2026 10:57
3 participants