RSM-1639: research — DLA integration into studio code#3277
Draft
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Vendors the Data Liberation Agent at a pinned git SHA into `apps/cli/ai/dla/` for runtime use by the studio code agent. Modeled on `download-agent-skills.ts`. Skips gracefully when `GH_PAT` is unset so installs keep working for contributors without access to the private upstream repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hooks `scripts/download-data-liberation-agent.ts` (added in T2) into the root `postinstall` chain right after `download-agent-skills.ts`, mirroring the existing `ts-node ./scripts/...` invocation pattern. Without `GH_PAT` the script logs a clear warning and exits 0, keeping installs working for contributors without access to the private DLA repo.

Adds the two DLA runtime deps that aren't currently workspace-hoisted:
- `fast-xml-parser@^5.7.2`: DLA's XML/RSS content parser.
- `papaparse@^5.5.3`: DLA's WooCommerce CSV importer.

Both are pinned to the latest stable major. Once DLA is vendored at a real SHA (RSM-1675 T2 TODO), these should be reconciled with DLA's own pins; verify via `apps/cli/ai/dla/package.json` after vendoring.

Per plan §T3 and `wave-1-dla-inventory.md` §4, `ink` is intentionally NOT added: DLA's `src/ui/*.tsx` Ink screens are CLI-only and not invoked by the MCP server path. The MCP server emits progress via `sendLoggingMessage` from `@modelcontextprotocol/sdk` (already in `apps/cli/dependencies`).

Adds an `it.todo(...)` placeholder in `apps/cli/ai/tests/agent.test.ts` for the missing-DLA-dir branch; T5 implements the conditional registration and turns this test on.

Verified locally: `npm install` from repo root succeeds, and the DLA fetch hits its skip-when-missing-`GH_PAT` branch and logs the expected warning without failing the install. `npm run typecheck` passes across all workspaces. `npx vitest run apps/cli/ai/tests/agent.test.ts` shows 2 passing and 1 todo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
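The skip-when-missing-`GH_PAT` behaviour described in this commit could be sketched roughly as below. This is an illustration only, not the contents of `scripts/download-data-liberation-agent.ts`: the names `resolveGithubToken`, `planDlaFetch`, and `runPostinstallStep` are hypothetical, the warning text is invented, and the `GH_TOKEN` fallback with `GH_PAT` taking precedence is an assumption.

```typescript
// Hypothetical sketch: decide whether the postinstall step fetches DLA or skips it.
type Env = Record<string, string | undefined>;

type FetchPlan =
  | { action: 'fetch'; token: string }
  | { action: 'skip'; warning: string };

// Returns the first non-empty token. Checking GH_PAT before GH_TOKEN is an assumption.
function resolveGithubToken(env: Env): string | undefined {
  const candidates = [env['GH_PAT'], env['GH_TOKEN']];
  return candidates.find((value) => typeof value === 'string' && value.trim() !== '');
}

function planDlaFetch(env: Env): FetchPlan {
  const token = resolveGithubToken(env);
  if (!token) {
    return {
      action: 'skip',
      // Illustrative wording; the real script's message is not shown in this PR.
      warning: 'GH_PAT is not set: skipping Data Liberation Agent download.',
    };
  }
  return { action: 'fetch', token };
}

// Returns the exit code for the step. A skip warns and still returns 0,
// so `npm install` keeps working for contributors without repo access.
function runPostinstallStep(
  env: Env,
  fetchDla: (token: string) => void,
  warn: (message: string) => void
): number {
  const plan = planDlaFetch(env);
  if (plan.action === 'skip') {
    warn(plan.warning);
    return 0;
  }
  fetchDla(plan.token);
  return 0;
}
```

A caller would pass `process.env`, the real download routine, and `console.warn`. Keeping the decision in a pure function makes both branches testable without network access or a token.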
Conditionally registers the Data Liberation Agent (DLA) when its tree is present at `apps/cli/ai/dla/`. The agent stays runnable without DLA (the common case for contributors without a `GH_PAT` to vendor it): registration gates on `fs.existsSync`.

When DLA is available:
- Adds `data-liberation` as a stdio MCP server, spawning `src/mcp-server.js` via `process.execPath` so the Electron-bundled Node runtime matches the host.
- Appends DLA's directory as a second local plugin entry, exposing its `/migrate` slash command surface.
- Forwards `STUDIO_WPCOM_TOKEN` (plus the broader `resolvedEnv`) so DLA tools targeting WordPress.com sites have credentials available.

Drops the `cwd` field from the stdio MCP config: the Anthropic Agent SDK's `McpStdioServerConfig` type does not declare it, and inspecting `sdk.mjs` confirms `cwd` is honored only for the host Claude Code process, not forwarded to MCP children. We pass `mcp-server.js` as an absolute path instead, which lets DLA's `import.meta.url` peer lookups resolve regardless of working directory and keeps the config strictly type-safe with no cast.

Also moves the `wpcomAccessToken` read in `commands/ai/index.ts` out from behind the `site?.remote` guard. DLA's MCP server may need the token even when the active Studio site is local (e.g. migrating a local source into a WP.com target).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
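A minimal sketch of the conditional registration described in this commit, under stated assumptions: `StdioMcpConfig` and `LocalPlugin` are local stand-ins rather than the SDK's own exported types, and `buildDlaRegistration` and its parameters are hypothetical names, not the code in `apps/cli/ai/agent.ts`.

```typescript
import { existsSync } from 'node:fs';
import * as path from 'node:path';

// Local stand-in for a stdio MCP server config. It deliberately has no `cwd` field.
interface StdioMcpConfig {
  type: 'stdio';
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Local stand-in for a local plugin entry.
interface LocalPlugin {
  type: 'local';
  path: string;
}

interface DlaRegistration {
  mcpServers: Record<string, StdioMcpConfig>;
  plugins: LocalPlugin[];
}

// Returns undefined when the vendored tree is absent, so callers can skip DLA entirely.
// `exists` is injectable so both branches can be tested without touching the disk.
function buildDlaRegistration(
  aiDir: string,
  resolvedEnv: Record<string, string>,
  wpcomToken: string | undefined,
  exists: (target: string) => boolean = existsSync
): DlaRegistration | undefined {
  const dlaDir = path.resolve(aiDir, 'dla');
  if (!exists(dlaDir)) {
    return undefined;
  }

  const env: Record<string, string> = { ...resolvedEnv };
  if (wpcomToken) {
    env['STUDIO_WPCOM_TOKEN'] = wpcomToken;
  }

  return {
    mcpServers: {
      'data-liberation': {
        type: 'stdio',
        // Reuse the Node binary that runs the host process.
        command: process.execPath,
        // An absolute script path replaces the unsupported `cwd` field.
        args: [path.join(dlaDir, 'src', 'mcp-server.js')],
        env,
      },
    },
    plugins: [{ type: 'local', path: dlaDir }],
  };
}
```

The caller would merge the returned `mcpServers` and `plugins` into its existing agent options only when the result is defined, which leaves the non-DLA path untouched.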
Implements RSM-1675 T8: per-tool permission policy for the Data Liberation
Agent's MCP tools. Without this, `permissionMode: 'auto'` would auto-approve
write-to-disk and remote-write tools alongside genuinely read-only ones.
Policy summary (in `apps/cli/ai/dla-permissions.ts`):
- Auto-approve `liberate_detect` / `_discover` / `_inspect` / `_status` /
`_verify` (read-only).
- `liberate_import` is auto-approved when `tool_input.delegate === true`
(DLA returns a manifest; Studio handles the actual import via `wp_cli`).
- `liberate_extract` / `_setup` / `_map_apis` / `_probe` / `_import`
(without `delegate: true`) ask the user once per session through the
existing `AskUserQuestion` plumbing; the answer is memoised in a
closure-scoped `Set` so the user is not re-prompted later in the session.
- Non-DLA tools pass through with `{ behavior: 'allow', updatedInput }`.
- Unrecognised DLA tools deny by default.
Wired into `query()` only when DLA is vendored — non-DLA sessions keep the
SDK's default classifier path untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
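The policy summarised in the permission-scoping commit above can be sketched as follows. This is not the code in `apps/cli/ai/dla-permissions.ts`: `classifyTool`, `buildCanUseToolSketch`, and the `AskUser` callback shape are hypothetical, the `PermissionResult` type is a local stand-in, and two details are assumptions: that DLA tools carry the `mcp__data-liberation__` prefix, and that only approvals (not refusals) are remembered in the `Set`.

```typescript
// Local stand-in for the permission result shape.
type PermissionResult =
  | { behavior: 'allow'; updatedInput: Record<string, unknown> }
  | { behavior: 'deny'; message: string };

// Hypothetical prompt callback: resolves true when the user approves.
type AskUser = (question: string) => Promise<boolean>;

type Decision = 'passthrough' | 'allow' | 'ask' | 'deny';

const DLA_PREFIX = 'mcp__data-liberation__'; // assumed tool-name prefix
const READ_ONLY = new Set([
  'liberate_detect', 'liberate_discover', 'liberate_inspect', 'liberate_status', 'liberate_verify',
]);
const ASK_ONCE = new Set([
  'liberate_extract', 'liberate_setup', 'liberate_map_apis', 'liberate_probe', 'liberate_import',
]);

// Pure classification step: no prompting, no state.
function classifyTool(toolName: string, input: Record<string, unknown>): Decision {
  if (!toolName.startsWith(DLA_PREFIX)) return 'passthrough';
  const name = toolName.slice(DLA_PREFIX.length);
  if (READ_ONLY.has(name)) return 'allow';
  if (name === 'liberate_import' && input['delegate'] === true) return 'allow';
  if (ASK_ONCE.has(name)) return 'ask';
  return 'deny'; // unrecognised DLA tools are denied by default
}

function buildCanUseToolSketch(onAskUser?: AskUser) {
  // Closure-scoped memo: lives as long as the returned callback, i.e. one session.
  const approved = new Set<string>();

  return async (toolName: string, input: Record<string, unknown>): Promise<PermissionResult> => {
    const allow: PermissionResult = { behavior: 'allow', updatedInput: input };
    const decision = classifyTool(toolName, input);

    if (decision === 'passthrough' || decision === 'allow') return allow;
    if (decision === 'deny') {
      return { behavior: 'deny', message: `Unrecognised DLA tool: ${toolName}` };
    }
    if (approved.has(toolName)) return allow;
    if (!onAskUser) {
      return { behavior: 'deny', message: `${toolName} needs approval but no prompt is available` };
    }
    const ok = await onAskUser(`Allow ${toolName} to run?`);
    if (!ok) return { behavior: 'deny', message: `User declined ${toolName}` };
    approved.add(toolName);
    return allow;
  };
}
```

Splitting classification from prompting keeps the table of tool names testable on its own, while the closure keeps the memo scoped to whichever session built the callback.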
Related issues
How AI was used in this PR
Both phases were AI-orchestrated through the `orchestrator` skill. The roster:

- […] `apps/cli/`, the Claude Agent SDK at `node_modules/@anthropic-ai/claude-agent-sdk@0.2.117`, and a shallow checkout of the private `Automattic/data-liberation-agent` repo; the lead synthesised findings into the recommendation report. Every concrete file path, line number, type signature, MCP tool name, and bundling size in `research-report.md` is sourced directly from the codebase, not from the model's training data.
- […] `plan.md`); eight implementer agents (one per `[code]` task) landed T1–T8 against the plan; a code-reviewer agent ran the full vitest workspace, typecheck, lint, and dev/prod CLI builds, and approved the implementation (`review-1.md`); two documentator agents wrote T9 and T10; this PR description and the documentation review (`doc-review-1.md`) are produced by a final doc-reviewer pass.

The orchestrator log, individual researcher findings, planning artifacts, code-review evidence, and the doc-review report all live under `issues/rsm-1639-dla-integration/` for inspection.

Proposed Changes
Research artifacts (`issues/rsm-1639-dla-integration/`)

- `research-plan.md`: research question, sub-questions, four-wave plan, findings log, evaluation against "research complete" criteria.
- `tasks/wave-1-*.md`: four task briefs assigned to wave-1 researchers (DLA inventory; Studio Code skill/MCP/slash-command plumbing; Claude Agent SDK plugin/MCP loading semantics; CLI bundling and distribution constraints).
- `findings/wave-1-*.md`: four exhaustive researcher reports, each with evidence (file paths, line numbers, manifest contents verbatim, MCP type signatures, disk sizes, release cadences).
- `research-report.md`: synthesis. Five integration approaches investigated (vendor + stdio MCP; vendor + in-process MCP; npm dep; runtime fetch; handler-only CLI spawn); head-to-head comparison; opinionated recommendation; eight open questions for the implementation phase.
- `plan.md`: atomic, ordered task plan derived from `research-report.md`. Tracks which open questions each task resolves.
- `review-1.md`: code-reviewer's verdict (approved) for T1–T8 with command outputs, per-task acceptance verification, and a list of non-blocking nits.
- `verification/`: captured stdout for each verification command in the code review.
- `doc-review-1.md`: this doc-reviewer's verdict for T9–T10.
- `PR-description.md`: this document.

Implementation (`apps/cli/`, `tools/common/ai/`, root `scripts/`, root `package.json`, root `vitest.config.ts`)

The implementation realises Approach A from `research-report.md`: vendor DLA's plugin tree under `apps/cli/ai/dla/`, load it as a second local SDK plugin alongside `apps/cli/ai/plugin/`, and boot DLA's MCP server as a stdio child-process entry alongside Studio's in-process MCP. A new `/migrate` slash command surfaces this through a thin Studio-side wrapper skill that drives DLA's `liberate_*` workflow and uses DLA's existing `delegate: true` import mode to hand artifacts back to Studio's `site_create` and `wp_cli` plumbing.

Eight
`[code]` tasks (each a single commit on the branch):

- T1: `apps/cli/vite.config.prod.ts` static-copy fix. Adds the `viteStaticCopy` block that the dev/npm configs already have for `ai/plugin`. This was a pre-existing latent bug independently surfaced by the research (`research-report.md` Open Question 1); landing it first makes T5 verifiable. Confirmed by inspecting `apps/cli/dist/cli/plugin/.claude-plugin/plugin.json` after `npm run cli:build:prod`. (Commit `14a58dda`.)
- T2: `scripts/download-data-liberation-agent.ts`. Build-time fetch script modeled on `scripts/download-agent-skills.ts`. Pinned SHA, `GH_PAT`/`GH_TOKEN` auth, graceful skip when no token, tarball download, atomic-ish staging swap, `tsc` pre-compile (no runtime `tsx` dependency), `dist-vendored/` → `src/` rename, vendored PHP under `src/lib/preview/scripts/` preserved, `.dla-pinned-sha` provenance file, `--update` / `STUDIO_REFRESH_DLA=1` opt-in. Four vitest cases. (Commit `d329634b`.)
- T3: Wires the fetch script into the root `package.json` postinstall chain after `download-agent-skills.ts`; adds `fast-xml-parser ^5.7.2` and `papaparse ^5.5.3` to `apps/cli/package.json`; adds `apps/cli/ai/dla/` to `.gitignore`. (Commit `20169f27`.)
- T4: Static-copy target for `ai/dla`. All three Vite configs (`vite.config.dev.ts`, `vite.config.npm.ts`, `vite.config.prod.ts`) gain an `existsSync`-guarded `ai/dla` target so the build still works for contributors without the vendored tree. Cross-config snapshot test in `apps/cli/tests/vite-config.test.ts`. (Commit `5785f4d5`.)
- T5: `agent.ts` plugin + MCP wiring. `apps/cli/ai/agent.ts` registers DLA conditionally on `dlaAvailable` (`fs.existsSync(path.resolve(import.meta.dirname, 'dla'))`). DLA loads as a second local plugin and as a stdio MCP server keyed under `data-liberation`, spawned with `process.execPath` against the absolute path to the pre-compiled `mcp-server.js`. `STUDIO_WPCOM_TOKEN` is forwarded explicitly; `LIBERATION_TOKEN` / `SHOPIFY_ADMIN_TOKEN` are forwarded transitively via `...resolvedEnv`. The `wpcomAccessToken` read in `apps/cli/commands/ai/index.ts` is moved out of the `site?.remote` guard so the token is available to DLA regardless of site type. Four `agent.test.ts` cases cover the conditional wiring. (Commit `4231813d`.)
- T6: `/migrate` slash command. `tools/common/ai/slash-commands.ts` appends `{ name: 'migrate', description: __('Migrate a site from a closed platform into Studio') }` to `AI_SKILL_COMMANDS`. The shared list auto-wires the chat dispatcher in `apps/cli/commands/ai/index.ts`, the autocomplete provider in `apps/cli/ai/ui.ts`, the desktop IPC dispatcher in `apps/studio/src/ipc-handlers.ts`, and the renderer composer slash hints. No Electron-side change is required because the skill-based path was chosen specifically to satisfy the existing IPC dispatcher's `AI_SKILL_COMMANDS` filter. Test in `tools/common/ai/tests/slash-commands.test.ts`. (Commit `aaa50d3d`.)
- T7: `apps/cli/ai/plugin/skills/migrate/SKILL.md`. Studio-side wrapper skill with `user-invocable: true` (with C, not the K typo present in older Studio skills), `argument-hint: <source-url>`, and a precise `allowed-tools` list scoped to the DLA tools we use plus the Studio MCP tools we drive. Body covers Steps 1–9 (identify, inspect, confirm, extract, verify, setup-with-delegate, create Studio site, import-with-delegate, wrap up) plus an explicit "What this skill does NOT do" footer documenting the deferral of headless mode (Approach E). Includes the `importWxr` blueprint shape so the model can construct it for very large WXR exports that would otherwise hit the WP-CLI 120s IPC timeout. Vitest case in `apps/cli/ai/tests/plugin-skills.test.ts`. (Commit `202302c2`.)
- T8: `canUseTool` permission scoping. `apps/cli/ai/dla-permissions.ts` exports `buildDlaCanUseTool(options)`; `agent.ts` wires it to `query()`'s `canUseTool` only when DLA is available. Read-only DLA tools (`liberate_detect`, `liberate_discover`, `liberate_inspect`, `liberate_status`, `liberate_verify`) auto-allow; `liberate_import` with `delegate: true` auto-allows; ask-once tools (`liberate_extract`, `liberate_setup`, `liberate_map_apis`, `liberate_probe`, plus `liberate_import` without `delegate`) prompt via the existing `onAskUser` plumbing and memoise per-session via a closure-scoped `Set`; default-deny when `onAskUser` is missing (with a TODO referencing OQ2 for a future non-interactive opt-in flag) and on unrecognised DLA tools. 11 vitest cases cover the policy. (Commit `e57e012f`.)

Two
`[docs]` tasks:

- T9: `apps/cli/README.md`. New "Migrate from a closed platform" section (after "Studio Code"). Documents the eight supported platforms, the `/migrate` and `/migrate <url>` invocation shapes, the agent-driven flow (inspect → extract → verify → site-create → import), the `LIBERATION_TOKEN` / `SHOPIFY_ADMIN_TOKEN` requirement for Webflow/Shopify, the `~/Studio/` landing dir, the `GH_PAT` requirement at install time, and a credit line for the Data Liberation Agent. ToC updated. (Commit `fb85ddce`.)
- T10: `docs/design-docs/cli.md`. New "Data Liberation Agent integration" architecture section. Covers vendoring (the fetch script, `tsc` pre-compile, `dist-vendored/` → `src/` rename, vendored PHP, `.dla-pinned-sha`), plugin and MCP wiring (`process.execPath`, absolute path to `mcp-server.js`, no `cwd` field on `McpStdioServerConfig`, env forwarding), the `delegate: true` handoff contract, the `canUseTool` permission policy, and the SHA-pin update cadence. Cross-links to `research-report.md` for trade-off rationale. (Commit `c18568a0`.)

Electron-side gotchas surfaced and resolved
- `vite.config.prod.ts` plugin-copy gap (was Open Question 1). Independently fixed by T1: adding the `viteStaticCopy` block for `ai/plugin` ensures the Electron-bundled `studio code` actually loads the SDK plugin tree. The gap was pre-existing and would have shipped unnoticed had DLA not forced a verification.
- Desktop IPC dispatcher (`apps/studio/src/ipc-handlers.ts:295-306`). Only forwards `AI_SKILL_COMMANDS` entries; handler-only slash commands would not appear in the desktop slash menu. The skill-based shape of `/migrate` (T6 + T7) satisfies this constraint by construction, so no Electron-side change is required.

Out of scope (consciously deferred)
- `/migrate --headless` (Approach E from `research-report.md`): documented as deferred in T7's "What this skill does NOT do" footer.
- The `user-invokable` (with K) typo in older Studio skills: orthogonal cleanup; T7 just uses the correct `user-invocable` (with C) for the new skill.
- Changes to `apps/studio/`: out of scope by construction (CLI-only PR per the research scope).
- […] `cli:package` run with DLA vendored to verify.

Testing Instructions
Reviewing the research
- Read `issues/rsm-1639-dla-integration/research-report.md` end-to-end. The Executive Summary and Recommendation sections are load-bearing.
- Open any `findings/wave-1-*.md` file: every claim cites file paths and line numbers.
- `plan.md` records which open questions each implementation task resolves.

Reviewing the implementation
The full vitest workspace passes (1474 tests across 158 files), typecheck is clean, lint is clean on every file the patches touched, and dev/prod CLI builds both succeed without DLA vendored. To reproduce locally:
Captured outputs from the code-reviewer's verification run live at `issues/rsm-1639-dla-integration/verification/review-1-*.txt` for direct inspection.

Exercising `/migrate` end-to-end

The full `/migrate` flow can only be exercised with a vendored DLA tree, which requires read access to the private `Automattic/data-liberation-agent` repo:

- Set `GH_PAT` in your environment with read access to the repo.
- Run `npm install` (the postinstall step will run `scripts/download-data-liberation-agent.ts` and populate `apps/cli/ai/dla/`).
- Run `npm run cli:build` and confirm `apps/cli/dist/cli/dla/.claude-plugin/plugin.json` is present.
- Run `node apps/cli/dist/cli/main.mjs code` and type `/migrate` (or `/migrate https://example.wixsite.com/foo`).
- The `canUseTool` policy will prompt before `liberate_extract` and memoise the answer until the session ends.

Without `GH_PAT`, the postinstall logs a warning and exits 0; `/migrate` is gated on `fs.existsSync(apps/cli/ai/dla)` at runtime in `agent.ts`, so the agent runs normally without DLA but the `/migrate` skill is unavailable. The unit-test path for T2/T5/T7/T8 covers both branches without needing DLA vendored.

Pre-merge Checklist
This PR is intentionally draft — owner has flagged it as DO NOT MERGE. The items below need to be resolved before any merge attempt.
- […] `review-1.md`.
- `npm run cli:package` run with DLA vendored. Confirms the `apps/cli/vite.config.prod.ts` plugin-copy fix lands DLA into the Electron extra-resource bundle (Open Question 1, resolved by T1, but verification with DLA vendored is still pending: the code-reviewer ran the prod build without DLA on disk).
- `GH_PAT` distribution decision (Open Question 6). Right now contributors must supply their own PAT or the install skips DLA with only a warning. Long-term resolution: push the DLA team for tagged public releases so we can move to an npm dep (Approach C from `research-report.md`). Short-term: decide whether the team uses a shared GitHub App token, per-user PATs, or a public-mirror-of-tagged-releases workflow.
- Update `DLA_PINNED_SHA` from the `'main'` placeholder. Currently `scripts/download-data-liberation-agent.ts` pins to `'main'`; before merge it must be a real commit SHA (TODO comment in the script tracks this).
- Run `studio code` with DLA vendored and confirm DLA's MCP tools surface as `mcp__data-liberation__*` exactly once (no double-registration via DLA's own `.claude-plugin/plugin.json#mcp` colliding with our `query()`-time `mcpServers` entry).
- Confirm that the `src/lib/preview/studio.ts:266-279` `rmSync` of orphan Studio site dirs does not double-manage sites Studio Code creates directly.
- […] `delegate: true` is the canonical handoff for hosts like Studio Code, and the request to consider tagged public releases so we can move to an npm dep long-term).
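The "exactly once" check on DLA's MCP tools from the checklist above could be partly automated along these lines. This is a sketch under assumptions, not part of the PR: `checkDlaRegistration` is a hypothetical helper, the list of tool names is assumed to come from wherever the running session reports its registered tools, and how a duplicate registration would actually be named (a repeated name, or the same `liberate_*` tool under a second server prefix) has not been confirmed.

```typescript
const EXPECTED_PREFIX = 'mcp__data-liberation__';

interface RegistrationReport {
  ok: boolean;
  duplicates: string[]; // short tool names exposed more than once
  unexpected: string[]; // full names exposing liberate_* outside the expected prefix
}

// Groups MCP tool names by their trailing tool name and flags any liberate_*
// tool that appears more than once or under a prefix other than the expected one.
function checkDlaRegistration(toolNames: string[]): RegistrationReport {
  const seen = new Map<string, number>();
  const unexpected: string[] = [];

  for (const fullName of toolNames) {
    if (!fullName.startsWith('mcp__')) continue;
    const separator = fullName.lastIndexOf('__');
    const shortName = fullName.slice(separator + 2);
    if (!shortName.startsWith('liberate_')) continue;

    seen.set(shortName, (seen.get(shortName) ?? 0) + 1);
    if (!fullName.startsWith(EXPECTED_PREFIX)) {
      unexpected.push(fullName);
    }
  }

  const duplicates = Array.from(seen.entries())
    .filter(([, count]) => count > 1)
    .map(([name]) => name);

  return {
    // At least one DLA tool must be present, with no duplicates and no stray prefixes.
    ok: seen.size > 0 && duplicates.length === 0 && unexpected.length === 0,
    duplicates,
    unexpected,
  };
}
```

A report with `ok: false` and a non-empty `unexpected` list would suggest DLA's own plugin manifest registered the server a second time under a different key.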