InsightFold

InsightFold is a notebook-driven development project for turning ideas about AlphaFold Database (AFDB), PDBe, protein structures, variants, interfaces, confidence metrics, and related biological questions into reviewed, runnable, and eventually reusable notebooks.

The goal is not just to make individual notebooks, but to create a repeatable lifecycle for scientific notebook development:

capture an idea
decide whether it is worth pursuing
scope it into a product requirements document (PRD)
convert the PRD into a notebook specification
review the specification
select pinned fixtures
build the notebook
execute and validate it
review it as a scientific artifact
beta test it
decide whether to integrate, maintain, or archive it

This matters because notebooks can easily become one-off experiments with hidden assumptions, fragile dependencies, unclear scientific claims, and no obvious path to maintenance. InsightFold aims to make notebook work faster while also making it easier to review, reproduce, explain, and hand off.

What InsightFold Hopes To Achieve

InsightFold should help the project team:

move from rough biological/product ideas to structured notebook prototypes
keep notebook scope explicit before implementation starts
make scientific assumptions, formulas, thresholds, and data sources traceable
use pinned AFDB, PDBe, PDB, UniProt, or local fixtures instead of ad hoc examples
validate notebooks by restart-and-run-all execution, not only visual inspection
distinguish "the notebook runs" from "the notebook is scientifically sound and good for its users"
decide whether a notebook should graduate into AFDB/PDBe, remain a standing notebook, or be archived

Repository Layout

Key project areas:

.agents/skills/
  idea-scoping-interview/
  scoping-decision-capture/
  concept-to-prd/
  prd-to-notebook-spec/
  notebook-spec-review/
  fixture-selection/
  notebook-from-spec/
  notebook-execution-validation/
  notebook-review/

.agents/agents/
  lifecycle/
    AGENTS.md
    spec-reviewer.md
    fixture-curator.md
    notebook-builder.md
    notebook-validator.md
    notebook-reviewer.md
  homodimer/
  advisory/

prd/
specs/
notebooks/
src/
skills/          # legacy/source copy retained for comparison
agents/          # legacy/source copy retained for comparison
agent-skills/

.agents/skills/ contains reusable Codex skill instructions. Each skill has a SKILL.md file and, where useful, templates under assets/ and rationale under references/.

.agents/agents/lifecycle/ contains role definitions for agents that apply those skills at specific lifecycle stages. These are not standalone programs; they are operating instructions for how an AI agent should behave when assigned that role.

prd/, specs/, and notebooks/ are the expected artifact path from product idea to notebook implementation.

skills/, agents/, and agent-skills/ are retained as original/source material. New Codex-facing work should use .agents/.

agent-skills/insightfold-lifecycle-skill-roadmap.md is the broader lifecycle roadmap, including future skills that have not yet been implemented.
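
The layout rule described above, that each skill directory carries a SKILL.md file, is easy to check mechanically. The following is a minimal sketch, not part of the repository: the function name `skills_missing_skill_md` is hypothetical, and it assumes the skills root contains one directory per skill.

```python
from pathlib import Path
from typing import List


def skills_missing_skill_md(skills_root: Path) -> List[str]:
    """Return the names of skill directories that have no SKILL.md file."""
    return sorted(
        entry.name
        for entry in skills_root.iterdir()
        # Only directories count as skills; stray files are ignored.
        if entry.is_dir() and not (entry / "SKILL.md").is_file()
    )
```

Calling it as `skills_missing_skill_md(Path(".agents/skills"))` from the repository root would list any skill directories that still need a SKILL.md.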

Current Lifecycle Chain

The currently implemented chain is:

idea-scoping-interview
  -> scoping-decision-capture
  -> concept-to-prd
  -> prd-to-notebook-spec
  -> spec-reviewer
  -> fixture-curator
  -> notebook-builder
  -> notebook-validator
  -> notebook-reviewer

The corresponding skills are:

| Stage | Skill | Purpose |
| --- | --- | --- |
| Concept discovery | `.agents/skills/idea-scoping-interview/SKILL.md` | Interview and scope a rough notebook idea |
| Decision capture | `.agents/skills/scoping-decision-capture/SKILL.md` | Summarize decisions, tradeoffs, and open questions |
| PRD creation | `.agents/skills/concept-to-prd/SKILL.md` | Convert a scoped concept into a true PRD |
| Notebook spec | `.agents/skills/prd-to-notebook-spec/SKILL.md` | Convert the PRD into an implementation-ready notebook spec pack |
| Spec review | `.agents/skills/notebook-spec-review/SKILL.md` | Review the spec before implementation starts |
| Fixture selection | `.agents/skills/fixture-selection/SKILL.md` | Choose pinned examples, edge cases, expected outputs, and provenance |
| Notebook build | `.agents/skills/notebook-from-spec/SKILL.md` | Build the notebook from the reviewed spec and fixture manifest |
| Execution validation | `.agents/skills/notebook-execution-validation/SKILL.md` | Validate restart-and-run-all behavior, fixture outputs, dependencies, and reproducibility |
| Final notebook review | `.agents/skills/notebook-review/SKILL.md` | Review scientific quality, interpretation, pedagogy, maintainability, and lifecycle readiness |

The corresponding agent roles are:

| Agent role | File | Responsibility |
| --- | --- | --- |
| Shared lifecycle guidance | `.agents/agents/lifecycle/AGENTS.md` | Common rules for all lifecycle agents |
| Spec reviewer | `.agents/agents/lifecycle/spec-reviewer.md` | Blocks vague, incomplete, or untestable specs |
| Fixture curator | `.agents/agents/lifecycle/fixture-curator.md` | Selects fixtures and expected outputs |
| Notebook builder | `.agents/agents/lifecycle/notebook-builder.md` | Implements notebooks from reviewed specs |
| Notebook validator | `.agents/agents/lifecycle/notebook-validator.md` | Runs mechanical validation and reports evidence |
| Notebook reviewer | `.agents/agents/lifecycle/notebook-reviewer.md` | Reviews whether a notebook is ready to share, beta, graduate, or archive |
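
Each lifecycle agent role pairs with one skill, and the prompts later in this README follow the same "Act as ... and use $..." shape. The sketch below shows one way to generate such prompts consistently. The dictionary and function names are hypothetical helpers, not something the repository provides; the role and skill names are taken from the tables above.

```python
# Pairing of lifecycle agent roles with the skills they apply.
ROLE_TO_SKILL = {
    "spec-reviewer": "notebook-spec-review",
    "fixture-curator": "fixture-selection",
    "notebook-builder": "notebook-from-spec",
    "notebook-validator": "notebook-execution-validation",
    "notebook-reviewer": "notebook-review",
}


def lifecycle_prompt(role: str, task: str) -> str:
    """Build an 'Act as <role file> and use $<skill> to <task>' prompt."""
    skill = ROLE_TO_SKILL[role]  # raises KeyError for an unknown role
    return f"Act as .agents/agents/lifecycle/{role}.md and use ${skill} to {task}"
```

For example, `lifecycle_prompt("spec-reviewer", "review specs/<feature>/ for implementation readiness.")` reproduces the spec review prompt used in this README.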

Recommended Way Of Working

Use one lead thread first, then parallel subagents only after the core question is stable.

Mental model:

skill = reusable method
agent = role lens, priorities, and guardrails
spawned subagent = a separate worker that can use both

Recommended sequence in a new conversation:

1. Start with one lead role in a single thread. Use .agents/agents/advisory/scientific-product-manager.md with $idea-scoping-interview. Goal: define the user, the biological question, the expected evidence, and the rough notebook outcome.
2. Do not parallelize yet if the core sentence is still unclear. The idea should be stable enough to say: "This notebook is for X user to answer Y question using Z evidence."
3. Once that sentence is stable, spawn 3 to 5 advisory subagents in parallel. Good default panel:
   - scientific-product-manager
   - computational-structural-biologist
   - bioinformatics-data-engineer
   - evaluation-benchmarking-specialist
   - optional domain specialist such as enzymologist, wet-lab-liaison, protein-biochemist, or molecular-visualization-specialist
4. Give every subagent the same idea statement, but a different role. Each subagent should return:
   - key opportunities
   - highest-risk assumptions
   - blocking questions
   - concrete recommendations
   - whether the idea is ready for $scoping-decision-capture
5. Synthesize the panel with $scoping-decision-capture. This is the bridge from exploratory discussion to a structured concept record.
6. Only then use $concept-to-prd. The PRD should stay a true PRD. Do not push notebook section order, cell plans, traceability tables, or fixture manifests into the PRD.
7. Use $prd-to-notebook-spec after the PRD is approved. This skill owns the implementation-ready spec pack.
8. Use lifecycle agents after the spec exists. The normal order is:
   - spec-reviewer
   - fixture-curator
   - notebook-builder
   - notebook-validator
   - notebook-reviewer

When to stay in one thread:

very early ideation
clarifying the user and problem
deciding whether an idea is worth pursuing at all

When to use spawned subagents:

once the idea has enough shape for role-specific critique
when you want parallel expert views
when you want role separation instead of one blended answer

When to hold back on subagents:

if the idea is still vague
if multiple roles would ask the same basic questions
if you do not yet know which biological or technical domain matters most

How To Use The Skills From Start To Finish

Use the lifecycle as a sequence of gates. Do not jump straight from idea to notebook unless the work is intentionally exploratory.
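
The "sequence of gates" idea can be sketched in a few lines: run each gate in order and stop at the first one that reports blocking findings. This is only an illustration under stated assumptions. `run_gates` is a hypothetical name, and each gate is assumed to be a callable that returns a list of blocking findings (empty when the gate passes).

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A gate check returns its blocking findings; an empty list means "pass".
GateCheck = Callable[[], List[str]]


def run_gates(
    gates: Sequence[Tuple[str, GateCheck]],
) -> Tuple[Optional[str], List[str]]:
    """Run gates in order and stop at the first one with blocking findings.

    Returns (gate_name, findings) for the blocking gate,
    or (None, []) when every gate passes.
    """
    for name, check in gates:
        findings = check()
        if findings:
            # Later gates are deliberately not run.
            return name, findings
    return None, []
```

Stopping early means later stages never run against an artifact that has already failed review, which mirrors the rule that a notebook is not built until blocking spec findings are resolved.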

1. Start With An Idea

Write a short idea in plain language:

Idea: Build a notebook that helps users inspect whether AFDB structure confidence changes around clinically interesting variant positions.

Recommended first prompt:

Use $idea-scoping-interview.

Act as .agents/agents/advisory/scientific-product-manager.md.

I want to explore a new InsightFold notebook idea:
<idea>

Do not write a PRD yet.
Interview me until the user, the biological question, the evidence, and the rough notebook outcome are clear.

Expected outcome:

stable problem statement
target user
expected notebook output
obvious risks and unknowns
judgment about whether parallel advisory review should start

2. Run Parallel Advisory Review

After the core question is stable, spawn advisory subagents in parallel.

Recommended default panel:

.agents/agents/advisory/scientific-product-manager.md
.agents/agents/advisory/computational-structural-biologist.md
.agents/agents/advisory/bioinformatics-data-engineer.md
.agents/agents/advisory/evaluation-benchmarking-specialist.md
optional: .agents/agents/advisory/molecular-visualization-specialist.md when 3D structure views matter

Recommended prompt:

Spawn parallel subagents for advisory review.

Shared idea statement:
<stable idea statement>

Subagent 1:
Act as .agents/agents/advisory/scientific-product-manager.md and use $idea-scoping-interview.
Focus on scope, user value, PRD readiness, and lifecycle fit.

Subagent 2:
Act as .agents/agents/advisory/computational-structural-biologist.md.
Focus on biological assumptions, interpretation risk, and overclaims.

Subagent 3:
Act as .agents/agents/advisory/bioinformatics-data-engineer.md.
Focus on APIs, identifiers, provenance, fixtures, and reproducibility.

Subagent 4:
Act as .agents/agents/advisory/evaluation-benchmarking-specialist.md.
Focus on validation, expected outputs, fixture quality, and failure cases.

Have each subagent return:
- key opportunities
- highest-risk assumptions
- blocking questions
- concrete recommendations
- whether this is ready for $scoping-decision-capture

Expected outcome:

role-specific critique
surfaced blockers
better-defined scope boundaries
clear recommendation on whether to proceed
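
If the panel responses are collected in a structured form, the five return items above map naturally onto a small record type. The sketch below is hypothetical: the class and function names are illustrative, and the readiness rule (every subagent says ready and no blocking questions remain) is an assumption, not a rule stated by the skills.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PanelResponse:
    """One advisory subagent's answer, using the five return items."""
    role: str
    key_opportunities: List[str] = field(default_factory=list)
    highest_risk_assumptions: List[str] = field(default_factory=list)
    blocking_questions: List[str] = field(default_factory=list)
    concrete_recommendations: List[str] = field(default_factory=list)
    ready_for_decision_capture: bool = False


def panel_summary(responses: List[PanelResponse]) -> Dict[str, object]:
    """Collect blockers and apply the assumed readiness rule."""
    blockers = [(r.role, q) for r in responses for q in r.blocking_questions]
    all_ready = bool(responses) and all(
        r.ready_for_decision_capture for r in responses
    )
    # Assumed rule: ready only if every role agrees and nothing is blocking.
    return {"ready": all_ready and not blockers, "blockers": blockers}
```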

3. Capture The Decisions

Use:

.agents/skills/scoping-decision-capture/SKILL.md

Ask Codex:

Use $scoping-decision-capture to synthesize the ideation discussion and advisory-panel outputs into a structured decision capture.
Do not write the PRD yet.

Expected output:

core problem statement
scope decisions
user journey summary
open questions
success and stop signals

4. Convert The Concept To A PRD

Use:

.agents/skills/concept-to-prd/SKILL.md

Ask Codex something like:

Use $concept-to-prd to turn this scoped concept into an InsightFold PRD:
<decision capture or approved scoped concept>

Expected output:

prd/<feature>.md

The PRD should define the user problem, target audience, scope, non-goals, success criteria, assumptions, risks, and expected notebook artifact.

5. Convert The PRD To A Notebook Spec Pack

Use:

.agents/skills/prd-to-notebook-spec/SKILL.md

Ask Codex:

Use $prd-to-notebook-spec to convert prd/<feature>.md into a notebook spec pack.

Expected output:

specs/<feature>/
  spec-pack-overview.md
  requirements.md
  notebook-ux-contract.md
  notebook-design.md
  cell-blueprint.md
  traceability-matrix.md
  tasks.md
  validation.md
  docs-plan.md
  fixture-manifest.md
  data-contracts.md

For very small prototypes, a single consolidated spec is acceptable if it contains equivalent sections.
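
Before handing a multi-file spec pack to review, it can help to confirm that every expected file is present. A minimal sketch follows, assuming the file names listed above; the function name `missing_spec_files` is hypothetical, and the check does not apply to a single consolidated spec.

```python
from pathlib import Path
from typing import List

# File names taken from the expected spec pack listing.
SPEC_PACK_FILES = (
    "spec-pack-overview.md",
    "requirements.md",
    "notebook-ux-contract.md",
    "notebook-design.md",
    "cell-blueprint.md",
    "traceability-matrix.md",
    "tasks.md",
    "validation.md",
    "docs-plan.md",
    "fixture-manifest.md",
    "data-contracts.md",
)


def missing_spec_files(spec_dir: Path) -> List[str]:
    """Return expected spec pack files that are absent from spec_dir."""
    return [name for name in SPEC_PACK_FILES if not (spec_dir / name).is_file()]
```

An empty result only shows that the files exist. Whether their contents are implementation-ready is still a question for the spec review.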

6. Review The Spec Before Building

Use:

.agents/skills/notebook-spec-review/SKILL.md
.agents/agents/lifecycle/spec-reviewer.md

Ask Codex:

Act as .agents/agents/lifecycle/spec-reviewer.md and use $notebook-spec-review to review specs/<feature>/ for implementation readiness.

The review should decide whether implementation can start. It should flag:

missing acceptance criteria
vague tasks
weak fixtures
missing data contracts
scientific ambiguity
dependency risks
unclear validation
documentation gaps

Do not build the notebook until blocking spec review findings are resolved.

7. Select Fixtures

Use:

.agents/skills/fixture-selection/SKILL.md
.agents/agents/lifecycle/fixture-curator.md

Ask Codex:

Act as .agents/agents/lifecycle/fixture-curator.md and use $fixture-selection to create a fixture manifest for specs/<feature>/.

Expected output:

specs/<feature>/fixture-manifest.md

Good fixture manifests include:

happy-path fixture
edge-case or negative fixture where relevant
stable identifiers or local paths
source/provenance
retrieval date for network data
expected outputs
tolerances
validation checks
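
The manifest itself is a markdown file, but the checklist above can be pictured as a structured record with a few consistency checks. The sketch below is hypothetical throughout: the `FixtureEntry` fields, the `kind` labels, and the function name are illustrative assumptions, not a schema the fixture-selection skill defines.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FixtureEntry:
    identifier: str                # stable identifier or local path
    kind: str                      # assumed labels: "happy-path", "edge-case", "negative"
    source: str                    # source/provenance
    from_network: bool             # True when the data is fetched remotely
    retrieval_date: Optional[str]  # e.g. an ISO date; needed for network data
    expected_output: str
    tolerance: Optional[float] = None


def manifest_problems(entries: List[FixtureEntry]) -> List[str]:
    """Return human-readable problems found in a list of fixture entries."""
    problems = []
    if not any(e.kind == "happy-path" for e in entries):
        problems.append("no happy-path fixture")
    for e in entries:
        if not e.source:
            problems.append(f"{e.identifier}: missing source/provenance")
        if e.from_network and not e.retrieval_date:
            problems.append(f"{e.identifier}: network fixture lacks retrieval date")
    return problems
```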

8. Build The Notebook From The Spec

Use:

.agents/skills/notebook-from-spec/SKILL.md
.agents/agents/lifecycle/notebook-builder.md

Ask Codex:

Act as .agents/agents/lifecycle/notebook-builder.md and use $notebook-from-spec to implement the notebook described in specs/<feature>/.

Expected output:

notebooks/<feature>.ipynb

The notebook should:

run top-to-bottom after kernel restart
have explicit setup/import cells
use clear markdown sections
keep code cells focused
validate required fields before use
show provenance for fetched or uploaded data
include interpretation and limitations
avoid hidden local paths and hidden state
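
As an illustration of "validate required fields before use", a notebook cell might check a fetched record before any analysis touches it. This is a sketch with hypothetical helper names; the field names a real notebook requires would come from its data contracts.

```python
from typing import Iterable, List, Mapping


def missing_fields(record: Mapping[str, object], required: Iterable[str]) -> List[str]:
    """Return required field names that are absent, None, or empty strings."""
    missing = []
    for name in required:
        value = record.get(name)
        # 0 and False are legitimate values, so only None and "" count as missing.
        if value is None or value == "":
            missing.append(name)
    return missing


def require_fields(record: Mapping[str, object], required: Iterable[str]) -> None:
    """Raise ValueError with a clear message if any required field is missing."""
    missing = missing_fields(record, required)
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
```

Failing early with a named field is usually easier to debug than a `KeyError` several cells later.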

9. Validate Notebook Execution

Use:

.agents/skills/notebook-execution-validation/SKILL.md
.agents/agents/lifecycle/notebook-validator.md

Ask Codex:

Act as .agents/agents/lifecycle/notebook-validator.md and use $notebook-execution-validation to validate notebooks/<feature>.ipynb against specs/<feature>/.

Expected output:

specs/<feature>/validation-report.md

Validation should check:

restart-and-run-all execution
fixture outputs
expected warnings or failures
data contracts
dependency/runtime constraints
hidden state
unresolved TODOs in critical cells
visualization outputs
documentation presence

Execution validation only proves the notebook runs and matches its mechanical checks. It does not prove the notebook is scientifically ready.
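
One cheap mechanical signal for hidden state is the execution order recorded in the saved .ipynb file: after a clean restart-and-run-all, non-empty code cells normally carry execution counts 1, 2, 3, and so on, in document order. The sketch below reads the notebook JSON directly. It is a static heuristic under that assumption, with a hypothetical function name, and it does not replace actually re-executing the notebook.

```python
import json
from pathlib import Path
from typing import List


def execution_order_problems(notebook: dict) -> List[str]:
    """Flag code cells whose execution_count is not 1, 2, 3, ... in order."""
    problems = []
    expected = 1
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        # "source" may be stored as a string or as a list of lines.
        source = "".join(cell.get("source", []))
        if not source.strip():
            continue  # empty code cells are not executed
        count = cell.get("execution_count")
        if count != expected:
            problems.append(
                f"cell {index}: execution_count {count!r}, expected {expected}"
            )
        expected += 1
    return problems


def check_notebook_file(path: Path) -> List[str]:
    """Load a saved .ipynb file and run the execution-order check."""
    return execution_order_problems(json.loads(path.read_text(encoding="utf-8")))
```

An out-of-order or missing count suggests the saved outputs came from an interactive session rather than a clean run, so they should not be trusted as validation evidence.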

10. Review The Notebook As A Scientific Artifact

Use:

.agents/skills/notebook-review/SKILL.md
.agents/agents/lifecycle/notebook-reviewer.md

Ask Codex:

Act as .agents/agents/lifecycle/notebook-reviewer.md and use $notebook-review to review notebooks/<feature>.ipynb after execution validation.

Expected output:

specs/<feature>/notebook-review.md

The review should assess:

scientific correctness
formulas and cited sources
biological assumptions
uncertainty and limitations
reproducibility
pedagogy and documentation
visualization quality
maintainability
readiness for beta, graduation, or archive

Use a human/domain review gate when interpretation, thresholds, clinical or research-use-only (RUO) framing, or AFDB/PDBe publication risk requires judgment.

11. Move To Instrumented Beta

Current status: placeholder stage.

Placeholder skills to develop:

beta-feedback-instrumentation
usage-signal-summary
beta-release-packaging

Expected future outputs:

specs/<feature>/beta-plan.md
specs/<feature>/feedback-summary.md
specs/<feature>/beta-release-notes.md

The beta stage should answer:

who tried the notebook
whether they understood it
whether they returned to it
where they failed
what they preferred over existing workflows
what must change before graduation

12. Graduation Review

Current status: placeholder stage.

Placeholder skills to develop:

graduation-decision-brief
engineering-handoff-package
standing-notebook-maintenance-plan
archive-retrospective

Expected future outputs:

specs/<feature>/graduation-decision.md
specs/<feature>/engineering-handoff.md
specs/<feature>/maintenance-plan.md
specs/<feature>/archive-retrospective.md

Possible outcomes:

integrate into AFDB or PDBe
keep as a standing InsightFold notebook
archive with rationale and preserved state

Recommended Prompts

Full Ideation

Use $idea-scoping-interview.

Act as .agents/agents/advisory/scientific-product-manager.md.

I want to explore a new InsightFold notebook idea:
<idea>

Do not write a PRD yet.
Interview me until the user, the question, the evidence, and the rough notebook outcome are clear.

Parallel Advisory Review

Spawn parallel subagents for advisory review of this stable idea:
<stable idea statement>

Use:
- .agents/agents/advisory/scientific-product-manager.md
- .agents/agents/advisory/computational-structural-biologist.md
- .agents/agents/advisory/bioinformatics-data-engineer.md
- .agents/agents/advisory/evaluation-benchmarking-specialist.md

Have each subagent return:
- key opportunities
- highest-risk assumptions
- blocking questions
- concrete recommendations
- whether this is ready for $scoping-decision-capture

Ideation To PRD

Use $scoping-decision-capture to synthesize the ideation discussion and advisory-panel outputs.
Then use $concept-to-prd only if the concept is ready.

Full Lifecycle

Take this idea through the InsightFold lifecycle.

1. Use $idea-scoping-interview in one lead thread first.
2. When the idea is stable, spawn parallel advisory subagents.
3. Use $scoping-decision-capture to synthesize the discussion.
4. Use $concept-to-prd only if the concept is ready.
5. Use $prd-to-notebook-spec to build the spec pack.
6. Then use the lifecycle agents in order.
7. Stop at each gate if there are blocking findings.

Idea:
<idea>

From Existing Spec

Act as .agents/agents/lifecycle/fixture-curator.md and complete fixture selection for specs/<feature>/.
Then act as .agents/agents/lifecycle/notebook-builder.md and build the notebook only if fixtures and validation criteria are ready.

Validate Existing Notebook

Act as .agents/agents/lifecycle/notebook-validator.md and use $notebook-execution-validation to validate notebooks/<feature>.ipynb against specs/<feature>/.

Final Review

Act as .agents/agents/lifecycle/notebook-reviewer.md and use $notebook-review to decide whether notebooks/<feature>.ipynb is ready for beta, graduation review, continued iteration, or archive.

Development Status

Implemented lifecycle skills:

idea-scoping-interview
scoping-decision-capture
concept-to-prd
prd-to-notebook-spec
notebook-spec-review
fixture-selection
notebook-from-spec
notebook-execution-validation
notebook-review

Implemented lifecycle agents:

spec-reviewer
fixture-curator
notebook-builder
notebook-validator
notebook-reviewer

Important placeholders:

idea-capture
idea-quality-check
idea-to-triage-brief
evidence-scan
risk-and-assumption-log
acceptance-criteria-authoring
dependency-policy
structure-parsing-strategy
data-contract-validation
scientific-computation-patterns
notebook-result-interpretation
scientific-figure-style
fixture-regression-harness
dependency-runtime-audit
beta-feedback-instrumentation
usage-signal-summary
beta-release-packaging
graduation-decision-brief
engineering-handoff-package
standing-notebook-maintenance-plan
archive-retrospective

Guiding Principle

InsightFold should make notebook development faster, but speed is not the only target. The lifecycle exists so that each notebook carries enough context, evidence, validation, and review to be useful beyond the first experiment.
