feat: agent discoverability for World SDK docs #1457
Ralph-20 wants to merge 9 commits into lucas/dse-2334-world-docs
…rability
- Add `keywords` field to Fumadocs schema (source.config.ts)
- Add keywords to all 8 world/ MDX page frontmatter (method names, search terms)
- Add realistic file path comments to complete code examples (e.g., `// app/api/workflow-runs/route.ts`)
- Both changes improve agent greppability and code context
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
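To make the `keywords` change concrete, here is a minimal sketch of what a page's frontmatter might look like once the field exists. It is plain TypeScript standing in for the real schema definition: the `DocFrontmatter` type, the `hasValidKeywords` helper, the assumption that `keywords` is an optional array of non-empty strings, and the sample page values are all hypothetical and not taken from `source.config.ts`.

```typescript
// Hypothetical frontmatter shape; only the `keywords` field name comes from the commit.
interface DocFrontmatter {
  title: string;
  description?: string;
  keywords?: string[];
}

// Runtime check standing in for whatever schema library the docs site uses.
// Assumption: `keywords` is optional, and when present is a list of non-empty strings.
function hasValidKeywords(fm: { keywords?: unknown }): boolean {
  if (fm.keywords === undefined) return true;
  return (
    Array.isArray(fm.keywords) &&
    fm.keywords.every((k) => typeof k === "string" && k.length > 0)
  );
}

// Illustrative page: method names and search terms an agent might grep for.
const examplePage: DocFrontmatter = {
  title: "Runs",
  keywords: ["world.runs.list", "pagination", "cursor"],
};
```

Keeping method names in `keywords` means a plain text search for `world.runs.list` can land on the right page even when the body text phrases things differently.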
🧪 E2E Test Results
❌ Some tests failed
❌ Failed Tests
▲ Vercel Production (1 failed)
- nitro (1 failed)
🌍 Community Worlds (56 failed)
- mongodb (3 failed)
- redis (2 failed)
- turso (51 failed)
Details by Category
- ❌ ▲ Vercel Production
- ✅ 💻 Local Development
- ✅ 📦 Local Production
- ✅ 🐘 Local Postgres
- ✅ 🪟 Windows
- ❌ 🌍 Community Worlds
- ✅ 📋 Other
❌ Some E2E test jobs failed. Check the workflow run for details.
post-eval improvement — gap analysis found:
- pagination nesting ({ pagination: { cursor } } not { cursor })
- resolveData param guidance ('none' for polling, 'all' for inspection)
- runs.cancel() method missing
- name parsing returns null (needs optional chaining)
- event types taxonomy
- devalue format clarification (opaque arrays without hydration)
eval: new skill 89% vs old skill 32% vs no skill 32%
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
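The pagination and `resolveData` points from the gap analysis can be sketched as follows. This is a self-contained illustration, not the World SDK's actual typings: `ListParams`, `Page`, `RunSummary`, `listParams`, and `collectAll` are hypothetical names, and the assumption that a page returns a `cursor` that is `null` on the last page is mine.

```typescript
// Hypothetical shapes for illustration only; not the real World SDK typings.
type ResolveData = "none" | "all";

interface ListParams {
  pagination?: { cursor?: string; limit?: number };
  resolveData?: ResolveData;
}

interface Page<T> {
  data: T[];
  cursor: string | null; // assumed: null when there are no more pages
}

interface RunSummary {
  runId: string;
  status: string;
}

// Build list parameters with the cursor nested under `pagination`,
// i.e. { pagination: { cursor } } rather than a top-level { cursor }.
// 'none' is the default here because it suits polling; pass 'all' for inspection.
function listParams(cursor?: string, resolveData: ResolveData = "none"): ListParams {
  return { pagination: cursor ? { cursor } : {}, resolveData };
}

// Walk every page by feeding each returned cursor into the next call.
async function collectAll<T>(
  list: (params: ListParams) => Promise<Page<T>>,
): Promise<T[]> {
  const items: T[] = [];
  let cursor: string | undefined;
  do {
    const page = await list(listParams(cursor));
    items.push(...page.data);
    cursor = page.cursor ?? undefined;
  } while (cursor);
  return items;
}
```

If `world.runs.list()` accepts parameters of this shape, a call such as `collectAll((p) => world.runs.list(p))` would gather all runs; check the actual signature before relying on it.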
E2E Eval Update: Devalue Hydration Finding
Finding
Raw (from …
Hydrated (after …
Changes
Eval Results
3/3 tests pass, all 18 checks ✅. Hydration works correctly for:
Full findings:
…nalysis Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Static analysis + runtime eval proving SKILL.md materially helps agents with observability patterns: hydration, imports, pagination, name parsing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With/Without Skill Comparison Eval
Result: 3/8 → 8/8 (+5 delta)
Ran …
Key failures without skill
Files
Summary
Adds keywords frontmatter, file path comments, and World SDK method signatures to all 8 World SDK doc pages + the workflow skill to improve agent discoverability.
Linear: DSE-2337 (child of DSE-2334) | Depends on: PR #1456 (World SDK docs breakout)
What Changed
- source.config.ts: `keywords` added to frontmatter schema
- world/*.mdx files: `keywords` frontmatter with method names + search terms
- world/*.mdx files: file path comments on complete code examples
- skills/workflow/SKILL.md
Skill Improvements (post-eval gap analysis)
- `resolveData` parameter guidance ('none' for polling, 'all' for inspection)
- `runs.cancel()` method
- Name parsing (returns `null`; always use optional chaining)
Eval Results
Two evals confirmed the skill improvements work. Full writeup: eval/skill-e2e-findings.md
Task breadth (with/without skill, 8 binary checks)
3/8 → 8/8 (+5 delta) — without-skill agent misses hydration, imports, pagination nesting, name parsers, and I/O validation entirely.
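One of the misses above, name parsing, comes down to handling a `null` result. The sketch below shows the optional-chaining pattern with a stand-in parser. `parseQualifiedName`, `ParsedName`, `displayName`, and the `kind//path//functionName` input format are hypothetical and only illustrate the shape of the problem; they are not the SDK's `parseStepName`.

```typescript
// Hypothetical stand-in for a name parser; the real parser and the
// "kind//path//functionName" format used here are assumptions.
interface ParsedName {
  kind: string;
  path: string;
  shortName: string;
}

// Returns null for anything that does not have exactly three non-empty parts.
function parseQualifiedName(raw: string): ParsedName | null {
  const parts = raw.split("//");
  if (parts.length !== 3 || parts.some((p) => p.length === 0)) return null;
  const [kind, path, shortName] = parts;
  return { kind, path, shortName };
}

// Optional chaining plus a fallback keeps the label defined even when
// parsing fails, instead of throwing on `null.shortName`.
function displayName(raw: string): string {
  return parseQualifiedName(raw)?.shortName ?? raw;
}
```

Falling back to the raw string is one reasonable choice for a display label; a caller that needs to distinguish parse failures could check for `null` explicitly instead.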
Code quality (3-condition deep eval)
`getWorld` from `workflow/runtime`, `hydrateResourceIO`, `parseStepName`, proper pagination
Key finding: Old skill = no skill for World SDK tasks. The new skill's Observability section is what makes the difference.
Deep eval methodology
Each agent got the same task: "Build a 3-step workflow + an API route using the World SDK to list runs with pagination, get steps with hydrated I/O, calculate step duration, and parse display names."
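For the "calculate step duration" part of that task, a minimal sketch might look like this. The `StepTiming` shape and its `startedAt` / `completedAt` field names are assumptions for illustration; the real step object may name or type its timestamps differently.

```typescript
// Hypothetical step shape; the field names `startedAt` and `completedAt`
// are assumptions, not taken from the World SDK.
interface StepTiming {
  startedAt?: Date | string;
  completedAt?: Date | string;
}

// Returns elapsed milliseconds, or null if the step has not started,
// has not finished, or carries an unparseable timestamp.
function stepDurationMs(step: StepTiming): number | null {
  if (!step.startedAt || !step.completedAt) return null;
  const start = new Date(step.startedAt).getTime();
  const end = new Date(step.completedAt).getTime();
  if (Number.isNaN(start) || Number.isNaN(end)) return null;
  return end - start;
}
```

Returning `null` rather than `0` for unfinished steps lets the caller render "in progress" instead of a misleading zero duration.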
Scored on 6 criteria (0-2 each):
- `workflow/runtime` for getWorld, `workflow/observability` for hydration
- `world.runs.list()`, `world.steps.get()`, `hydrateResourceIO()`
- `resolveData` optimization, name parsing
Where new skill won decisively:
- `getWorld()` from `workflow/runtime` (others used raw HTTP or local helpers)
- `hydrateResourceIO()` + `observabilityRevivers` (others wrote naive JSON.parse fallbacks)
- `FatalError` + `RetryableError` with `retryAfter`
- `app/api/` (matching doc file path comments) vs others using `src/app/`
Test Results
Manual TODOs
- `@expect-error` comments from `observability.mdx` after PR "feat: re-export parseName + hydrators for observability DX" #1453 merges