Experiment: use Static Site Importer for generated local sites#3309

Draft
chubes4 wants to merge 7 commits into trunk from bfb-mu-plugin-agent-output-draft

Conversation

@chubes4
Contributor

@chubes4 chubes4 commented Apr 30, 2026

Summary

  • Bundles Static Site Importer into Studio via the existing wp-files/ download chain. A fresh npm install fetches SSI's main zipball, extracts to wp-files/static-site-importer/, and the CLI bundle ships it. No environment variables, no separate clone, no setup steps required.
  • Reframes local site generation around a normal static HTML/CSS handoff: the agent writes tmp/static-site/index.html, then imports it with static-site-importer import-theme.
  • Updates the deterministic smoke script (apps/cli/scripts/studio-bfb-smoke.mjs) to validate the bundled flow end-to-end: SSI generates and activates a block theme with templates, parts, a front-page pattern, and zero fallback events.

Why

Studio's local site-generation path should not require agents to hand-author Gutenberg block markup or maintain brittle block-format prompt skills. Static Site Importer gives Studio a reusable pipeline:

  1. The agent writes familiar static HTML/CSS.
  2. Static Site Importer converts that HTML into a WordPress block theme.
  3. The generated site remains editable in the Site Editor.

This also supports customer-provided static HTML, migrated templates, and AI-generated prototypes as inputs instead of forcing Studio to discard or rewrite them.

Related upstream projects:

How SSI gets into the site

npm install
  └─ scripts/download-wp-server-files.ts
       └─ fetches https://github.com/chubes4/static-site-importer/archive/refs/heads/main.zip
            └─ extracts into wp-files/static-site-importer/
                 ├─ static-site-importer.php       (plugin entry)
                 ├─ vendor/autoload.php            (composer deps, committed in SSI repo)
                 └─ includes/.../cli-command.php   (registers `wp static-site-importer`)

apps/cli build
  └─ Vite copies wp-files/ into dist/cli/wp-files/

site_create / site_start
  └─ tools/common/lib/mu-plugins.ts
       └─ symlinks dist/cli/wp-files/static-site-importer/ into the per-site mu-plugins temp dir
            └─ a small loader mu-plugin requires static-site-importer.php
                 └─ SSI's CLI command registers → `wp static-site-importer ...` works
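
The site_create / site_start step above could be sketched roughly as follows. This is a minimal sketch, not the code in tools/common/lib/mu-plugins.ts: the helper name `writeSsiMuPlugin`, the symlink name, and the loader body are assumptions for illustration. Only the `staticSiteImporterPluginPath` option name and the `1-static-site-importer.php` filename come from this PR.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

interface MuPluginOptions {
	// Option name from this PR; when undefined, SSI is not loaded.
	staticSiteImporterPluginPath?: string;
}

// Hypothetical helper: symlinks the bundled plugin into the per-site
// mu-plugins directory and writes a loader that requires its entry file.
// Returns the names of the entries it created.
function writeSsiMuPlugin( muPluginsDir: string, options: MuPluginOptions ): string[] {
	const pluginPath = options.staticSiteImporterPluginPath;
	if ( ! pluginPath ) {
		return []; // No path provided: neither symlink nor loader is written.
	}
	const linkName = 'static-site-importer'; // assumed symlink name
	const loaderName = '1-static-site-importer.php';
	// The 'dir' type only matters on Windows; other platforms ignore it.
	fs.symlinkSync( pluginPath, path.join( muPluginsDir, linkName ), 'dir' );
	const loader = `<?php\nrequire_once __DIR__ . '/${ linkName }/static-site-importer.php';\n`;
	fs.writeFileSync( path.join( muPluginsDir, loaderName ), loader );
	return [ linkName, loaderName ];
}

// Usage against throwaway directories standing in for the bundle and the site.
const demoRoot = fs.mkdtempSync( path.join( os.tmpdir(), 'ssi-mu-' ) );
const bundledPlugin = path.join( demoRoot, 'wp-files', 'static-site-importer' );
const muPluginsDir = path.join( demoRoot, 'mu-plugins' );
fs.mkdirSync( bundledPlugin, { recursive: true } );
fs.mkdirSync( muPluginsDir );
fs.writeFileSync( path.join( bundledPlugin, 'static-site-importer.php' ), '<?php\n' );

const created = writeSsiMuPlugin( muPluginsDir, { staticSiteImporterPluginPath: bundledPlugin } );
console.log( created );
```

Keying on an explicit option rather than an environment variable keeps the "no path, no loader" case easy to test.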

Path resolution lives in apps/cli/lib/dependency-management/paths.ts::getStaticSiteImporterPluginPath(), which mirrors getPhpMyAdminPath() / getSqliteCommandPath() / getBlueprintsPharPath() — all bundled assets resolved against getWpFilesPath().
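
As a rough illustration of that pattern (a sketch only, not the contents of paths.ts): the real `getWpFilesPath()` presumably takes no arguments and derives the directory from the CLI bundle location. Here the bundle directory is passed in as a parameter, which is an assumption made so the example runs standalone.

```typescript
import * as path from 'node:path';

// Assumption: the bundle directory is a parameter here; the real helper
// likely resolves it from the running CLI bundle instead.
function getWpFilesPath( bundleDir: string ): string {
	return path.join( bundleDir, 'wp-files' );
}

// Every bundled asset resolves against the same wp-files root.
function getStaticSiteImporterPluginPath( bundleDir: string ): string {
	return path.join( getWpFilesPath( bundleDir ), 'static-site-importer' );
}

console.log( getStaticSiteImporterPluginPath( path.join( 'dist', 'cli' ) ) );
```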

Iteration model (draft phase)

While the SSI substrate is still rapidly evolving, the downloader pulls main rather than a tagged release. Reviewers automatically pick up upstream fixes by re-running npm install. There's no per-machine setup and no env-var contract — a fresh clone of this PR plus npm install is everything a reviewer needs.

Before this PR un-drafts, swap the URL in scripts/download-wp-server-files.ts for a pinned tag, e.g.:

-		getUrl: () =>
-			'https://github.com/chubes4/static-site-importer/archive/refs/heads/main.zip',
+		getUrl: () =>
+			'https://github.com/chubes4/static-site-importer/archive/refs/tags/v0.4.0.zip',

That locks the substrate to a known-good release for merge.
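
One way to make that swap a one-line change is to build the archive URL from a ref value. The sketch below is hypothetical (the `SsiRef` type and `ssiArchiveUrl` helper are not part of this PR); it only reproduces the two URL shapes shown in the diff above.

```typescript
// Hypothetical ref type: either a moving branch or a pinned tag.
type SsiRef = { kind: 'branch'; name: string } | { kind: 'tag'; name: string };

const SSI_REPO = 'chubes4/static-site-importer';

// Builds the GitHub archive URL for the given ref.
function ssiArchiveUrl( ref: SsiRef ): string {
	const segment = ref.kind === 'branch' ? 'heads' : 'tags';
	return `https://github.com/${ SSI_REPO }/archive/refs/${ segment }/${ ref.name }.zip`;
}

// Draft phase: track main. For merge: pin a tag (v0.4.0 is only the example used above).
console.log( ssiArchiveUrl( { kind: 'branch', name: 'main' } ) );
console.log( ssiArchiveUrl( { kind: 'tag', name: 'v0.4.0' } ) );
```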

Verification (Riad-equivalent reviewer simulation)

I created a fresh worktree off this branch in a directory that has no relationship to my local SSI development setup, with no environment variables set, and walked it through the reviewer flow:

$ git worktree add studio@bfb-reviewer-sim bfb-mu-plugin-agent-output-draft
$ cd studio@bfb-reviewer-sim
$ npm install --no-audit --no-fund
   ...
   [static-site-importer] Downloading Static Site Importer (chubes4/static-site-importer @ main) ...
   [static-site-importer] Extracting files from zip ...
   [static-site-importer] Files extracted
   ...
$ ls wp-files/static-site-importer/static-site-importer.php
   wp-files/static-site-importer/static-site-importer.php
$ ls wp-files/static-site-importer/vendor/autoload.php
   wp-files/static-site-importer/vendor/autoload.php
$ cd apps/cli && npm run build
   ✓ built in 2.43s
$ node apps/cli/scripts/studio-bfb-smoke.mjs
   {
       "importer_loaded": true,
       "bfb_loaded": true,
       "h2bc_loaded": true,
       "file:parts/header.html": true,
       "file:parts/footer.html": true,
       "file:templates/front-page.html": true,
       "file:templates/page-home.html": true,
       "file:patterns/page-home.php": true,
       "file:style.css": true,
       "file:theme.json": true,
       "front_page_id": 4,
       "home_page_id": 4,
       "active_theme": "studio-static-importer-smoke",
       "fallback_count": 0,
       "header_html_blocks": 0,
       "footer_html_blocks": 0,
       "pattern_has_blocks": true
   }
   PASS: Studio Static Site Importer smoke generated an editable block theme.
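
For readers who want to check a report like the one above programmatically, here is a minimal sketch. The pass criteria are inferred from the output shown and are assumptions, not necessarily what studio-bfb-smoke.mjs asserts; `smokeFailures` is a hypothetical helper name.

```typescript
type SmokeReport = Record< string, boolean | number | string >;

// Hypothetical checker: returns a list of human-readable failures.
function smokeFailures( report: SmokeReport ): string[] {
	const failures: string[] = [];
	for ( const key of Object.keys( report ) ) {
		// Loaded flags, generated files, and the pattern check must all be true.
		const mustBeTrue =
			key.startsWith( 'file:' ) || key.endsWith( '_loaded' ) || key === 'pattern_has_blocks';
		if ( mustBeTrue && report[ key ] !== true ) {
			failures.push( `${ key } should be true` );
		}
	}
	// No fallback events and no core/html blocks in header or footer.
	for ( const key of [ 'fallback_count', 'header_html_blocks', 'footer_html_blocks' ] ) {
		if ( report[ key ] !== 0 ) {
			failures.push( `${ key } should be 0` );
		}
	}
	return failures;
}

// A trimmed copy of the report above.
const report: SmokeReport = {
	importer_loaded: true,
	'file:parts/header.html': true,
	'file:theme.json': true,
	fallback_count: 0,
	header_html_blocks: 0,
	footer_html_blocks: 0,
	pattern_has_blocks: true,
};
console.log( smokeFailures( report ) );
```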

Tests

  • npm run typecheck (all workspaces) — clean
  • npx eslint <touched files> — clean
  • npx vitest run — 1483 passed across 157 test files, including 9 in tools/common/lib/tests/mu-plugins.test.ts (3 new SSI tests cover loader + symlink presence when the plugin path is provided, and absence when it isn't)
  • apps/cli/scripts/studio-bfb-smoke.mjs — passes end-to-end against a freshly-bundled site

Notes

AI assistance

  • AI assistance: Yes
  • Tool(s): Claude Code (Sonnet 4.5)
  • Used for: Drafted the wp-files download entry, mu-plugin contract refactor (env-var → bundled-path option), CLI entry-point wiring, smoke-script clean-up, and the new mu-plugins tests under Chris's direction and review. The original draft of this PR (commit 2f336e86) was authored separately by OpenCode/GPT-5.5; the iteration on top is Claude Code's.

@chubes4 chubes4 changed the title from "Experiment: use BFB for raw HTML site content" to "Experiment: use Static Site Importer for generated local sites" May 3, 2026
@chubes4
Contributor Author

chubes4 commented May 3, 2026

Testing it with lots of different prompts using automated benchmarks.

Site build prompts being tested: https://github.com/chubes4/homeboy-rigs/tree/main/Automattic/studio/bench/prompts/site-build

Here is a screenshot from the first passing (all valid blocks matching the frontend) generation from last night:

[Screenshot: screencapture-localhost-9180-2026-05-02-21_49_16]

@youknowriad
Contributor

I'm giving this PR a try. Quick questions though:

  • How does the block conversion work? Is it hard-coded rules? Conceptually, how does it work? My guess is that it's only core blocks, right?

  • Why not a Node.js converter instead of PHP, so the same tool could be used for both remote and local sites? We can't install mu-plugins on remote sites.

@youknowriad
Contributor

Not sure what I'm doing wrong, this is not working for me

[Screenshot: Screenshot 2026-05-04 at 9 54 06 AM]

@chubes4
Contributor Author

chubes4 commented May 4, 2026

how does the block conversion work?

My initial approach was to mirror Gutenberg's client-side rawHandler, using the core HTML API. Yes, only core blocks for now. Unsupported blocks fall back to core/html.

There is some hardcoding, but I am trying my best to make it more about generic pattern recognition.

My approach has been to run many different prompts, fix the gaps, and run again. The quality has improved significantly over time.

Even with hardcoding, there are certain patterns the AI is likely to use. I think we could account for most of them and handle edge cases as they come up.

why not a nodeJS converter instead of php

Honestly, I did not think of that. I had started this experiment a while back for a plugin that creates content at scale. The site editor angle is new.

Eventually, this could be the responsibility of Gutenberg, instead of Studio.

Not sure what I'm doing wrong

You did nothing wrong, it was me who accidentally left a local path in the original draft 🥇. Sorry about that!

chubes4 added a commit that referenced this pull request May 4, 2026
## Summary

Extends the eval-runner result JSON with two new fields that any
consumer of `EVAL_RUNNER_RESULT_FILE` can read:

- **`error: string | null`** — set when an exception is thrown inside
the message loop (auth blip, MCP crash, network error, SDK throw, etc.).
- **`timedOut: boolean`** — flipped by the timeout callback before it
calls `query.interrupt()`.

Together these address the `finalError` ask in #3262 section 1 ("First
actionable failure") and complete the failure-visibility story started
by #3273 (`firstToolError`, `toolEvents`, `phaseTimingsMs`).

## Why

Three failure modes today, three different visibility outcomes:

| Failure shape | Today | After this PR |
|---|---|---|
| Model returns `success: false` cleanly | ✅ Captured fully | Same |
| Timeout fires (`query.interrupt()` after `timeoutMs`) | ⚠️ Captured as `success: false`, **indistinguishable** from a clean model failure | ✅ `timedOut: true` distinguishes it |
| Mid-loop exception (auth, MCP, network, SDK throw) | ⚠️ Caught at `main()`'s top-level → emits stripped-down `{ success: false, error }` JSON, **losing all timings/tools/turns** | ✅ Caught inside `runEval()` → full structured result with `error` set alongside everything else |

The third row is the most consequential: today, a run that completes
meaningful work and then hits a late exception loses all the diagnostic
state that was already captured. The new `try { … } catch` keeps the
structured result intact.

## What this enables

A consumer can now answer:

- Did the eval **time out** vs. the model **gave up cleanly**? (Today:
indistinguishable. After: check `result.timedOut`.)
- Was there a **runtime exception** during the run, even if `success`
was `true`? (Today: not representable — `success: true` and the
exception path are mutually exclusive. After: both fields can coexist.)
- When an exception fires, **what tools ran before it, how long did each
phase take**? (Today: lost in the `main()`-level fallback JSON. After:
preserved in the structured result.)

This is the same shape as #3273's contribution — producer-side
observability improvements that any downstream consumer benefits from.
Studio's own `npm run eval`, future internal CI, promptfoo
configurations, and external benchmark harnesses all read the same JSON
contract.
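
A minimal sketch of how a consumer might use the two fields, assuming a result object that carries at least `success`, `error`, and `timedOut`. The `classify` helper, the outcome labels, and the precedence (timeout checked before exception) are assumptions for illustration, not part of the eval-runner.

```typescript
// Only the three fields discussed here; the real result carries more.
interface EvalResult {
	success: boolean;
	error: string | null;
	timedOut: boolean;
}

type Outcome = 'passed' | 'passed-with-error' | 'timed-out' | 'exception' | 'clean-failure';

// Hypothetical classifier for a consumer of EVAL_RUNNER_RESULT_FILE.
function classify( result: EvalResult ): Outcome {
	if ( result.timedOut ) {
		return 'timed-out';
	}
	if ( result.error !== null ) {
		// success and error can now coexist, so both cases are representable.
		return result.success ? 'passed-with-error' : 'exception';
	}
	return result.success ? 'passed' : 'clean-failure';
}

console.log( classify( { success: false, error: null, timedOut: true } ) );
console.log( classify( { success: false, error: null, timedOut: false } ) );
```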

## Diff

8 lines:
- `let error: string | null = null` and `let timedOut = false`
declarations
- Timeout callback rewritten to set `timedOut = true` before calling
`query.interrupt()`
- `try { … } catch ( caught ) { error = … }` wrapper around the message
loop
- Two new fields in the return shape

The catch only changes behavior on exception paths; the existing
successful and clean-failure paths are unchanged.
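
The effect of the catch wrapper can be sketched with a synchronous stand-in for the real async message loop. This is not the eval-runner code: `runSteps`, the step callbacks, and the simplified `toolEvents` shape are assumptions, and the `timedOut` flag is left out because it depends on the timer callback.

```typescript
interface LoopResult {
	success: boolean;
	toolEvents: string[]; // simplified stand-in for the real diagnostics
	error: string | null;
}

// Each step returns an event name and may throw (auth blip, network error, ...).
function runSteps( steps: Array< () => string > ): LoopResult {
	const toolEvents: string[] = [];
	let success = false;
	let error: string | null = null;
	try {
		for ( const step of steps ) {
			const event = step();
			if ( event === 'done' ) {
				success = true;
			} else {
				toolEvents.push( event );
			}
		}
	} catch ( caught ) {
		// Record the failure but keep everything captured so far.
		error = caught instanceof Error ? caught.message : String( caught );
	}
	return { success, toolEvents, error };
}

const partial = runSteps( [
	() => 'tool:a',
	() => 'tool:b',
	() => {
		throw new Error( 'auth blip' );
	},
] );
console.log( partial );
```

Because the catch sits inside the function that owns the state, the two tool events recorded before the throw are still returned alongside the error.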

## Origin

Originally drafted on the experimental Static Site Importer branch
(#3309) where it was being used by an out-of-tree benchmark harness.
Split out for upstream review because:

1. The change is generic — useful to any consumer of the eval-runner's
structured output, not just the SSI experiment.
2. It shouldn't ship gated on the SSI experiment merging.
3. Reviewing it on its own keeps the surface area small (8 lines vs.
~200).

The SSI draft PR will rebase on top of this once it lands.

## Tests

- `npm run typecheck` (all workspaces) — clean
- `npx eslint apps/cli/ai/eval-runner.ts` — clean

The change is small enough that it's covered by the existing eval-runner
exercise paths (`npm run eval`, etc.). If reviewers want explicit unit
tests around the new fields I'm happy to add them.

## Refs

- Issue #3262 — Eval runner should expose structured diagnostics and
benchmark metadata
- PR #3273 — CLI: add eval-runner diagnostics (the previous installment
of this story)
- PR #3309 — Experiment: use Static Site Importer for generated local
sites (where this code originally lived)

## AI assistance
- **AI assistance:** Yes
- **Tool(s):** Claude Code (Sonnet 4.5)
- **Used for:** Drafted the catch wrapper, the `timedOut` flag, and the
result-shape extension under Chris's direction. Chris reviewed the diff,
the issue framing, and the split rationale.
@chubes4 chubes4 force-pushed the bfb-mu-plugin-agent-output-draft branch from ddc9586 to 6502021 May 4, 2026 14:08
@chubes4 chubes4 changed the base branch from fix-ai-screenshot-local-site to trunk May 4, 2026 14:09
@chubes4 chubes4 force-pushed the bfb-mu-plugin-agent-output-draft branch from 6502021 to 3892194 May 4, 2026 18:03
@youknowriad
Contributor

I think this PR is trying to do too much: it changes how we transform blocks but also changes other things in the system prompt... I've also fallen into this trap on previous PRs, and it's very hard to gain confidence about whether the change you're making is the right one or not.

I would like to suggest smaller steps. For the static site import (or block transformation), which is the main thing this PR is trying to achieve, I'm wondering if there's a simple solution that avoids building a PHP transformer entirely and relies fully on the raw handler. The advantage is that it will work consistently and support any block that is available on the site. We could achieve it by using an eval in Playwright like we do today for block validation; instead of block validation, we'd be doing block transformation directly.
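
A rough sketch of what such an in-page transformation could look like, assuming the page has the block editor loaded so that `wp.blocks` is available as a global. The `PageLike` interface and `htmlToBlockMarkup` name are hypothetical; `PageLike` is a minimal structural type intended to be compatible with Playwright's `page.evaluate( fn, arg )` so the example does not depend on the playwright package.

```typescript
// Minimal structural type; a Playwright Page is expected to satisfy it.
interface PageLike {
	evaluate< R, A >( fn: ( arg: A ) => R, arg: A ): Promise< R >;
}

// Runs Gutenberg's raw handler inside the page and returns serialized block markup.
async function htmlToBlockMarkup( page: PageLike, html: string ): Promise< string > {
	return page.evaluate( ( rawHtml: string ): string => {
		// Assumption: the editor scripts have registered wp.blocks on the page.
		const { rawHandler, serialize } = ( globalThis as any ).wp.blocks;
		return serialize( rawHandler( { HTML: rawHtml } ) );
	}, html );
}
```

Since the conversion runs against whatever blocks are registered on that page, it would pick up non-core blocks without extra rules, at the cost of needing a browser session for every conversion.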

@chubes4
Contributor Author

chubes4 commented May 4, 2026

Thanks for the feedback. I will see if I can come up with a simpler approach. I do want to keep experimenting with this, because I already built the PHP transformer, and have been getting good results with it.

For the system prompt, I think we should gradually be moving design and workflow constraints out of the system prompt and into a skill. As part of this experiment, I tweaked the prompt to try and speed up generation by reducing the number of steps.

I am running some benchmarks that track the agent's font choices, color choices, and design decisions in a database over the course of many eval runs with a given system prompt. This can help us measure the impact of prompt changes on output with hard data and make sound decisions.

The idea is that eventually, Studio Code can become a general-purpose WordPress agent that is powerful enough to be the daily driver for the average WordPress developer.

@youknowriad
Contributor

I think we should gradually be moving design and workflow constraints out of the system prompt and into a skill.

I think that makes sense yeah :)

Looking forward to seeing where these experiments lead.

chubes4 added 4 commits May 5, 2026 12:55
Drops the STUDIO_STATIC_SITE_IMPORTER_PLUGIN_PATH env-var contract that
required reviewers to clone chubes4/static-site-importer separately and
point Studio at it. The PR is now self-contained: a fresh `npm install`
fetches SSI's main zipball and extracts it into wp-files/, the CLI bundle
ships it, and the mu-plugin loader symlinks from the bundle path.

- scripts/download-wp-server-files.ts: add a `static-site-importer`
  entry that pulls https://github.com/chubes4/static-site-importer/archive/refs/heads/main.zip
  and renames the extracted root into wp-files/static-site-importer/.
  Iteration-mode contract: re-running `npm install` refreshes SSI to
  the current main HEAD. Before the PR un-drafts, swap the URL for a
  pinned tag/sha to lock the substrate.
- apps/cli/lib/dependency-management/paths.ts: expose
  `getStaticSiteImporterPluginPath()` resolving against `getWpFilesPath()`.
- tools/common/lib/mu-plugins.ts: drop the env-var gate. The SSI loader
  mu-plugin and the symlink branch now key on `options.staticSiteImporterPluginPath`,
  passed in from each CLI runtime entry point.
- apps/cli/{php-server-child,playground-server-child}.ts and
  apps/cli/lib/run-wp-cli-command.ts: pass the bundled SSI path through
  to mu-plugin generation.
- apps/cli/scripts/studio-bfb-smoke.mjs: drop the env-var requirement;
  the bundled CLI already knows where SSI lives.
- tools/common/lib/tests/mu-plugins.test.ts: cover loader + symlink
  presence when an SSI path is provided, and absence when it isn't.
- Rename mu-plugin filename from 1-static-site-importer-experiment.php
  to 1-static-site-importer.php, and add the old name to the legacy
  cleanup list so existing installs scrub it.

Also bundles two unrelated prompt/UI tweaks from the same iteration:
- system-prompt.ts: drop two outdated guidance lines now superseded by
  the SSI workflow itself.
- ui.ts: welcome text "block themes" -> "WordPress sites" for accuracy.

## AI assistance
- **AI assistance:** Yes
- **Tool(s):** Claude Code (Sonnet 4.5)
- **Used for:** Drafted the wp-files download entry, mu-plugin contract
  refactor, smoke-script clean-up, and tests under Chris's direction
  and review.
The 'do not run visual screenshot loops unless the user explicitly asks'
guidance reads as anti-instruction for a behavior that is no longer part
of the workflow. Negative framing in prompts trains the model on the
behavior we're trying to avoid; it's better to omit the topic entirely.

Removes the trailing clause from the two 'Finish promptly' workflow
steps and the parenthetical from the take_screenshot tool description.
The neutral tool description (what it does) remains in both the
WordPress.com and local tool listings.

- **AI assistance:** Yes
- **Tool(s):** Claude Code (Sonnet 4.5)
- **Used for:** Mechanical removal under Chris's direction.
chubes4 added 3 commits May 5, 2026 12:56
Studio doesn't depend on homeboy, so the error path shouldn't either.
Replace the homeboy command suggestion with the Studio-native build
sequence (`npm install` at the root, `npm run build` in apps/cli)
so a reviewer running the smoke without a homeboy rig sees instructions
they can actually follow.

The script filename and the studio_bfb_unsupported_fallback_count option
key are unchanged; "BFB" there refers to Block Format Bridge (the
upstream conversion substrate), not the homeboy rig name.

## AI assistance
- **AI assistance:** Yes
- **Tool(s):** Claude Code (Sonnet 4.5)
- **Used for:** Mechanical message swap under Chris's direction.
Adding draft mu-plugin filenames to LEGACY_MU_PLUGIN_FILENAMES is
overengineering for a draft PR: there is no installed base of the
old experiment-suffixed filename to scrub, and adding the current
filename to the legacy list is conceptually wrong (it's not legacy,
it's the current shipping name).

The list is for cleaning up files older Studio versions wrote on disk;
the SSI mu-plugin has never shipped in a Studio release. If/when this
PR lands and a future rename is needed, that's the right time to add
the previous name to the legacy list.

## AI assistance
- **AI assistance:** Yes
- **Tool(s):** Claude Code (Sonnet 4.5)
- **Used for:** Mechanical removal under Chris's direction.
@chubes4 chubes4 force-pushed the bfb-mu-plugin-agent-output-draft branch from 4349fe5 to fe16ee9 May 5, 2026 16:59