Review pipeline: calibrate review frequency to step risk level #405

@sansari

Problem

The DeepWork review pipeline applies the same review intensity to every workflow step, regardless of the step's risk profile. On small or personal analyses this adds significant wall-time overhead, because the cost of each review far exceeds its value. The same intensity remains appropriate for complex, high-stakes deliverables.

This was found through session analysis of a personal-site analysis project (salmanio-analysis), which surfaced three distinct categories of wasteful review:

  • Category A — Redundant validation: DocBaker already validates its own output via validation_report.json, yet a separate deepwork:reviewer agent then re-reads the same screenshots. ~8 minutes wasted per DocBaker report.
  • Category B — Trivial step reviews: Config files, YAML decisions files, and env file generation steps get full reviewer agents despite being trivially reversible and low-risk. 22s–2m10s per occurrence.
  • Category C — Failed reviewers with no hard timeout: Reviewer agents that hit API issues return 0 tool uses after ~4 minutes each, wasting wall time and producing nothing.
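For Category C, one possible mitigation is a hard timeout around each reviewer call, so a stalled reviewer is abandoned early instead of running for ~4 minutes. The following is a minimal sketch, not DeepWork's actual API: `run_with_hard_timeout`, the `reviewer` callable, and the `timeout_s` parameter are all hypothetical names, and it assumes reviewers can be invoked as async callables.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_with_hard_timeout(
    reviewer: Callable[[], Awaitable[str]],  # hypothetical: an async reviewer call
    timeout_s: float,                        # hypothetical: hard limit in seconds
) -> Optional[str]:
    """Return the reviewer's result, or None if it exceeds the hard timeout."""
    try:
        # wait_for cancels the underlying task when the timeout expires,
        # so a stalled reviewer does not keep consuming wall time.
        return await asyncio.wait_for(reviewer(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Treat a timed-out reviewer as "no review produced".
        return None
```

A caller could treat `None` as a failed review and decide whether to retry or continue. The issue does not propose a specific timeout value, so that choice is left open here.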

Proposed approach

Proportional review: match review depth to actual step risk rather than applying the same intensity everywhere. Full reviews should be preserved on final deliverables and high-stakes analysis outputs. See sub-issues for each category.
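As a rough illustration of proportional review, the sketch below maps a step's risk level to a review depth. Everything in it is an assumption made for illustration: the `Risk` and `ReviewDepth` levels, the `Step` fields, and the `review_depth` function are hypothetical and not part of DeepWork. The intermediate `LIGHT` depth in particular is not proposed in this issue.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"        # trivially reversible: config, YAML decisions, env files
    MEDIUM = "medium"  # intermediate outputs
    HIGH = "high"      # final deliverables, high-stakes analysis outputs


class ReviewDepth(Enum):
    SKIP = "skip"
    LIGHT = "light"    # hypothetical intermediate level
    FULL = "full"


@dataclass(frozen=True)
class Step:
    name: str
    risk: Risk
    # Hypothetical flag: the step already validated its own output,
    # e.g. a validation_report.json was produced and passed.
    self_validated: bool = False


def review_depth(step: Step) -> ReviewDepth:
    """Pick a review depth proportional to the step's risk."""
    if step.risk is Risk.HIGH:
        # Full reviews are preserved on high-stakes outputs.
        return ReviewDepth.FULL
    if step.risk is Risk.LOW or step.self_validated:
        # Category B (trivial steps) and Category A (already validated).
        return ReviewDepth.SKIP
    return ReviewDepth.LIGHT
```

For example, `review_depth(Step("env_file", Risk.LOW))` yields `ReviewDepth.SKIP`, while a `Risk.HIGH` step gets `ReviewDepth.FULL` even when `self_validated` is set.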
