Skip to content

refactor(review): hoist GH+GL shared review code into src/core/review/#96

Open
factory-nizar wants to merge 6 commits into
devfrom
feat/core-review-refactor
Open

refactor(review): hoist GH+GL shared review code into src/core/review/#96
factory-nizar wants to merge 6 commits into
devfrom
feat/core-review-refactor

Conversation

@factory-nizar
Copy link
Copy Markdown
Contributor

@factory-nizar factory-nizar commented Jun 3, 2026

Summary

After Phase 1 GitLab support landed in #93 (squash-merged as 84946f2), the GitHub and GitLab review paths had several pieces of duplicated or shape-overlapping code. This PR pulls the genuinely shared parts into a new platform-agnostic src/core/review/ module and converts the platform-specific files into thin adapters / collaborators.

Changes

New shared module src/core/review/ (8 files, 566 LOC):

  • prompts/{types,candidates,validator}.ts — shared Pass 1 / Pass 2 prompt templates plus the ReviewTerminology shape.
  • artifacts/{types,write}.tsReviewArtifactPaths / ReviewArtifactContents shapes and a writeReviewArtifacts(outDir, contents, names) helper (mkdir + parallel writeFile×3).
  • tracking/{types,format}.tsReviewTrackingState / ReviewTrackingTelemetry / ReviewTrackingFields types and formatDurationMs / formatCostUsd helpers.
  • triggers/parse-command.tsparseDroidCommand(text) with the regex set for |review|security|....
    Modified adapters / collaborators:
  • src/create-prompt/templates/review-{candidates,validator}-prompt.ts — now thin adapters onto the shared templates.
  • src/create-prompt/terminology.ts (new) — GITHUB_TERMINOLOGY.
  • src/create-prompt/types.ts — re-exports the shared type.
  • src/github/utils/command-parser.ts — re-exports the core parser/types so existing imports still resolve.
  • src/gitlab/prompts/{candidates,validator,terminology}.ts — now thin.
  • src/gitlab/data/review-artifacts.ts — uses the shared writer.
  • src/gitlab/operations/tracking-note.ts — uses the shared types/formatters.

Implementation Details

Commit-by-commit:

  1. fdda9ef prompts — Pass 1 (candidate generation) and Pass 2 (validator) prompts were ~95% identical text. Extracted into src/core/review/prompts/ (shared template + ReviewTerminology shape); the GH and GL files are now ~40 LOC thin adapters that map their context onto the shared shape and supply terminology constants. ~280 LOC of literal duplicated prompt text eliminated.
  2. 20b98bc artifact writerReviewArtifactPaths + ReviewArtifactContents shapes plus a writeReviewArtifacts(outDir, contents, names) helper. GL adopts it; GH keeps its three-helper public API (pinned by tests) but re-exports the shared type via the existing ReviewArtifacts alias. The platform-specific fetch mechanics (git+gh CLI vs REST) stay separate — they're fundamentally different.
  3. f7eb40a sticky-comment contractsReviewTrackingState, ReviewTrackingTelemetry, ReviewTrackingFields types plus formatDurationMs / formatCostUsd helpers move to core. GitLab's tracking-note builder uses them; GitHub's MCP-comment-server pattern is untouched (different architecture today). Identical telemetry rendering guaranteed.
  4. e9aa6a4 command parserparseDroidCommand(text) (the regex set for |review|security|...) moves to src/core/review/triggers/parse-command.ts. GH extractCommandFromContext (which scans GH webhook payloads) keeps its location and re-exports the shared types/parser so all existing imports still resolve. When GitLab ports its trigger path, it can reuse the parser directly.

Scoping note

The initial inventory estimated 4 dedupe candidates with ~500 LOC saved. Reading the actual code revealed that only the prompts (commit 1) had real textual duplication (~280 LOC). The other three commits are smaller-bore: shared contracts (types, formatters, parser regex) rather than shared implementations, because the platform mechanics genuinely diverge:

  • Artifact fetching — GH shells out via git+gh CLI with shallow-clone unshallowing + 50MB-buffered fallback; GL uses parallel REST calls. Sharing the fetcher would be more harm than good.
  • Sticky comment — GH builds the rich body from inside the agent via an MCP tool; GL builds it from CI in TypeScript. Sharing the renderer would require unifying the architectures.
    So commits 2–4 are intentionally minimal: they share just the parts that should never drift apart between platforms (types, telemetry text, command regex), without forcing the implementations into a shared mold they don't want to fit.

Out of scope (deferred)

Testing

  • tsc --noEmit: clean
  • bun test: 445 pass / 0 fail (same baseline as dev)
  • prettier: clean
    Next: end-to-end smoke test on both platforms after merge (GH self-review on the PR, GL via the mirror at factory-components/droid-action).

Related Issues

factory-nizar and others added 4 commits June 3, 2026 12:03
…core/review/

Both Pass 1 (candidate generation) and Pass 2 (validator) prompts were
~95% identical between GitHub and GitLab — same skill invocation, same
output schema, same critical constraints. Differences were limited to
labels (PR/MR, Repo/Project), JSON meta keys (prNumber/mrIid,
baseRef/targetBranch), MCP tool names, and a handful of platform-specific
instruction lines.

Extract the shared structure into platform-agnostic builders under
src/core/review/prompts/ that take a ReviewTerminology shape, and convert
the four existing prompt files into thin adapters:

  src/core/review/prompts/
    types.ts        ReviewTerminology + ReviewPromptContext
    candidates.ts   generateCandidatesPrompt(ctx)
    validator.ts    generateValidatorPrompt(ctx)

  src/create-prompt/terminology.ts        GITHUB_TERMINOLOGY constant
  src/gitlab/prompts/terminology.ts       GITLAB_TERMINOLOGY + factory

Both platform-specific wrappers now just map their context shape onto
ReviewPromptContext and delegate. GitLab's submit_review needs the MR IID
re-asserted in the tool call, so we expose a gitlabTerminologyFor(mrIid)
helper that bakes it into the submit_review hint.

Net LOC is roughly even on this commit alone (+61), but ~280 LOC of
literal duplication is gone, and any future shared prompt (security
review/report when GitLab adds it) gets the shared scaffolding for free.

All 29 prompt-specific tests still pass on both platforms; full suite
445/445; tsc clean.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…l disk-write helper

The two platforms fetch review artifacts with fundamentally different
mechanics (GitHub shells out to git/gh CLI with shallow-clone unshallow
and a 50MB-buffered fallback; GitLab uses parallel REST calls), so the
fetchers stay platform-specific. But the on-disk shape is identical, so:

  src/core/review/artifacts/types.ts   ReviewArtifactPaths +
                                       ReviewArtifactContents shapes
  src/core/review/artifacts/write.ts   writeReviewArtifacts(outDir,
                                       contents, names) — mkdir + parallel
                                       writeFile×3

GitLab's computeReviewArtifacts now delegates the disk-write through the
shared helper. GitHub keeps its 3-helper public API (each does its own
write so tests pin those signatures); it just re-exports the shared type
via the existing ReviewArtifacts alias.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…ters to core

The two platforms render the sticky review-tracking comment with very
different mechanics today — GitHub builds it from inside the agent via
the github-comment-server MCP tool, while GitLab writes a rich state-
machine body from CI — so we can't share a single renderer. What we
*can* share is the contract:

  src/core/review/tracking/types.ts    ReviewTrackingState +
                                       ReviewTrackingTelemetry +
                                       ReviewTrackingFields
  src/core/review/tracking/format.ts   formatDurationMs(ms) +
                                       formatCostUsd(usd) — keeps the
                                       "1m 23s • $0.0042" rendering
                                       identical across platforms

GitLab's tracking-note.ts now imports those and re-exports the GitLab
aliases (TrackingNoteState, TrackingNoteTelemetry, TrackingNoteOptions)
so existing call sites stay untouched.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
…core

The actual string-matching for `@droid fill`, `@droid review`,
`@droid security`, `@droid security --full`, and bare `@droid` is
identical between platforms. Move parseDroidCommand + DroidCommand +
ParsedCommand into:

  src/core/review/triggers/parse-command.ts

GitHub's command-parser keeps extractCommandFromContext (which scans
GitHub-payload-shaped events) and re-exports the shared types/parser
so existing imports from `../utils/command-parser` stay valid.

When the GitLab trigger path is ported (currently a pending task on
the GitLab side), it can reuse parseDroidCommand directly instead of
duplicating the regex set.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@factory-droid
Copy link
Copy Markdown
Contributor

factory-droid Bot commented Jun 3, 2026

Droid finished @factory-nizar's task —— View job


@factory-droid
Copy link
Copy Markdown
Contributor

factory-droid Bot commented Jun 3, 2026

Droid finished @factory-nizar's task —— View job


Security Review

Solid refactor that centralizes shared review logic while keeping platform-specific mechanics separate. Please fix the formatDurationMs rounding edge case and align GitLab’s security-badge prompt instruction with the actual tracking-note badge/semantics to avoid confusing status output.

Comment thread src/core/review/tracking/format.ts Outdated
Comment thread src/gitlab/prompts/terminology.ts Outdated
factory-nizar and others added 2 commits June 3, 2026 14:15
…L security badge

Two pre-existing bugs surfaced by the bot review on PR #96 — both ship
on dev today, this just brings the fixes along with the refactor.

1. formatDurationMs returned "1m 60s" for ms values whose remainder
   seconds rounded up to 60 (e.g. ms=119600 → seconds=119.6 →
   minutes=1, remSec=round(59.6)=60). Round to whole seconds first, then
   split into minutes+seconds so the carry happens cleanly. Add
   regression coverage in test/core/review/tracking/format.test.ts
   covering the rollover boundary (119600/179600/239600 ms) plus the
   normal sub-second / sub-minute / multi-minute cases.

2. The GitLab security-badge URL in the prompt instruction
   (`security%20review-ran-blue`) didn't match the badge actually
   rendered by buildTrackingNoteBody (`security%20review-enabled-blue`,
   plus shields.io style params + logo). If the validator pass followed
   the prompt literally and the CI-prepended badge also fired, the MR
   could end up with two distinct security badges. Export the
   SECURITY_BADGE constant from tracking-note.ts and reference it from
   GITLAB_TERMINOLOGY.securityBadgeInstruction so the renderer is the
   single source of truth.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Three small issues caught by a thorough re-review of the refactor diff:

1. `securityBadgeInstruction` wrapped the badge markdown in single
   backticks (```![security](…)```), so an LLM literally following the
   prompt would paste the badge inside an inline-code span and produce
   `![security](...)` rendered as code instead of an image. Rephrased
   to emit the raw markdown with an explicit "no surrounding backticks
   or code fences" caveat.

2. The security-subagent prompt block interpolated
   `${t.baseRefLabel.toLowerCase()}` which produced awkward bare
   phrases ("pr base ref" / "mr target branch") instead of the
   original prompts' "base ref" / "target branch". Added a dedicated
   `baseRefShortLabel` field on ReviewTerminology and use it in the
   narrative prose; the noun-prefixed `baseRefLabel` is still used in
   the structured <context> block where it matches today's wording.

3. format.ts docstring claimed the helpers are "shared between the
   GitHub MCP comment server and the GitLab tracking-note builder" but
   the GH side doesn't consume them yet. Softened the wording to
   reflect actual usage today plus the unification intent.

Tests: 451/451 still pass; tsc clean.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant