1 change: 1 addition & 0 deletions README.md
@@ -321,6 +321,7 @@ test → troubleshoot) written for an AI to follow.
| [docs/SETUP.md](docs/SETUP.md) | Human setup guide (Windows + macOS/Linux) |
| [docs/HOW_IT_WORKS.md](docs/HOW_IT_WORKS.md) | The mechanism + reverse-engineering evidence |
| [docs/AUTO_ROUTER.md](docs/AUTO_ROUTER.md) | The Auto Router — pick the right model per task automatically |
| [docs/DIRECTIVES.md](docs/DIRECTIVES.md) | Routing directives — pin a request to a model from the prompt (per-role multi-agent workflows) |
| [docs/ADD_A_MODEL.md](docs/ADD_A_MODEL.md) | Add any backend to the `/model` menu |
| [docs/TROUBLESHOOTING.md](docs/TROUBLESHOOTING.md) | Symptom → cause → fix |

17 changes: 17 additions & 0 deletions config.example.json
@@ -18,6 +18,7 @@
"_models": "What shows in Claude Code's /model picker. id MUST start with 'claude' or 'anthropic' (others are silently dropped). display_name is the label you see. 'claude-auto' is the Auto Router (see the 'router' section below): pick it and the proxy chooses the cheapest configured backend that can handle each task.",
"models": [
{ "id": "claude-auto", "display_name": "Auto (smart routing)" },
{ "id": "claude-opus", "display_name": "Claude Opus 4.8 (real)" },
{ "id": "claude-gpt-5.5-codex", "display_name": "GPT-5.5 (Codex OAuth)" },
{ "id": "claude-minimax-m3", "display_name": "MiniMax-M3" },
{ "id": "claude-mimo", "display_name": "MiMo v2.5 Pro" },
@@ -37,6 +38,10 @@
"_": "The Auto Router. type:auto means this is NOT a real backend - the proxy asks the cheap 'classifier' model (configured in the 'router' block below) to score the candidates and routes each task to the cheapest one that clears the quality bar. Pick 'Auto (smart routing)' in /model or the selector.",
"type": "auto"
},
"claude-opus": {
"_": "Real Anthropic Claude Opus 4.8, as a first-class pick. Anthropic passthrough: no 'type' (forwards unchanged), no 'upstream' (defaults to api.anthropic.com), no 'auth' (reuses your existing Claude OAuth login - no API key needed). The id is 'claude-opus' ON PURPOSE: it must NOT be 'claude-opus-4-8', because the dynamic-workflow engine hardcodes that exact id for its background traffic and the orchestrator/worker layer remaps it onto your pick - so a route named 'claude-opus-4-8' would be ambiguous with stock traffic. A distinct id ('claude-opus') is recognized as a deliberate orchestrator/worker pick and as a routing-directive target ([[route:opus]]). The 'model' below is what's actually sent upstream. (This is also what include_stock_models can't give you on its own: real Opus as a DISTINCT orchestrator while a different/cheaper model runs the workers.)",
"model": "claude-opus-4-8"
},
"claude-gpt-5.5-codex": {
"_": "Needs `codex login` once; no API key. effort/tier via UC_CODEX_EFFORT/UC_CODEX_SERVICE_TIER.",
"type": "codex_oauth",
@@ -152,5 +157,17 @@
"card": "Highest cost here; frontier reasoning and agentic coding. Best for the hardest work: large multi-file refactors, subtle debugging, architecture and design, long autonomous/dynamic workflows, and anything requiring images. Reserve for tasks the cheaper models would likely fail."
}
]
},

"_directives": "Routing directives ('pins') - optional, OPT-IN (OFF until you set enabled:true below, or UC_DIRECTIVES=1). A request's PROMPT can FORCE a specific backend, overriding the orchestrator/worker pick AND the Auto Router. This is how an automated multi-agent workflow lands each spawned sub-agent on the right model BY ROLE: tag each agent()'s prompt with [[route:NAME]] (or @NAME / use:NAME). Names auto-derive from your model ids + display names (composer/codex/minimax/mimo... already work). No tag, two names, or an unknown name -> normal routing decides. Full guide: docs/DIRECTIVES.md; runnable plan->code->review->fix pipeline: examples/role_pipeline_workflow.js.",
"directives": {
"_enabled": "OFF by default so this never changes existing behavior. Set true to turn directives on (or UC_DIRECTIVES=1).",
"enabled": false,
"_aliases": "Optional name -> route id overrides ON TOP of the auto-derived table. Right side MUST be a route id from 'routes'. Add entries only to introduce a new name or disambiguate one (e.g. bare 'deepseek' maps to two routes and is dropped unless pinned here).",
"aliases": {},
"_planner": "Optional. When set to a route id, interactive plan-mode turns with NO explicit pin auto-route there (e.g. let your strongest model write every plan). null = disabled.",
"planner": null,
"_strip": "Remove the [[route:...]] / @name / use:name marker from the prompt before forwarding (recommended).",
"strip": true
}
}
152 changes: 152 additions & 0 deletions docs/DIRECTIVES.md
@@ -0,0 +1,152 @@
# Routing directives — pin any request to a specific model

Routing directives ("pins") let a request's **prompt** force which backend serves
it, overriding the orchestrator/worker selection **and** the [Auto Router](AUTO_ROUTER.md).
They exist for one job in particular: making an **automated multi-agent workflow**
land each spawned sub-agent on the right model **by role** — e.g.

> **opus** writes the plan → **composer** writes the code → **codex** adversarially
> reviews it → **claude** fixes what the review got right.

You don't drive that turn-by-turn. The workflow script bakes a role tag into each
`agent()` prompt; the proxy reads the tag, hard-pins that request, strips the tag,
and forwards the rest. No tag → nothing changes, the normal routing flow decides.

It's **opt-in — OFF by default** (enable with `"directives": {"enabled": true}`
in `config.json`, or `UC_DIRECTIVES=1`), so pulling this feature never changes an
existing setup until you ask for it. Once on, it's fully local and degrades
safely: an unknown name, two names in one message, or a name that maps to the
synthetic `auto` route are all ignored, so a request is never broken.

---

## How a request gets pinned (30 seconds)

On every request the proxy looks at the **latest real user turn** (tool-result-only
turns are skipped, so a sub-agent's tag stays sticky across its tool calls) and
scans for a marker, most-explicit tier first. A tier wins only if it resolves to
**exactly one** configured backend:

| Tier | Form | Example | Stripped before forwarding? |
|------|------|---------|------------------------------|
| 1. Sentinel | `[[route:NAME]]` | `[[route:codex]] review this diff` | yes |
| 2. Tag | `@NAME` · `use:NAME` · `route:NAME` · `model:NAME` | `@composer implement the parser` | yes |
| 3. Natural language **(opt-in, off by default)** | `use/have/ask/let/with/via NAME` | `please have codex review it` | no (it's prose) |

> The natural-language tier is **off by default** — enable it with
> `UC_DIRECTIVES_NL=1`. It's deliberately opt-in because ordinary prose that merely
> mentions a model name after a trigger word (e.g. "*does this work **with Claude**?*")
> would otherwise silently reroute the request. With it off, only the explicit
> sentinel/tag forms pin.
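
The "latest real user turn" rule can be sketched as follows. This is a minimal illustration, not the shim's actual code: the helper name `latestRealUserText` is hypothetical, and it assumes Anthropic Messages-style input where `content` is either a string or an array of typed blocks.

```js
// Hypothetical helper (not the shim's actual code).
// Assumes messages look like [{ role, content }], where content is a string
// or an array of blocks such as { type: 'text' } or { type: 'tool_result' }.
function latestRealUserText(messages) {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i]
    if (msg.role !== 'user') continue
    if (typeof msg.content === 'string') return msg.content
    const texts = (msg.content || [])
      .filter(block => block.type === 'text')
      .map(block => block.text)
    // A user turn with no text blocks (tool results only) is skipped, so the
    // scan falls back to the earlier turn that carried the tag.
    if (texts.length) return texts.join('\n')
  }
  return null
}

const messages = [
  { role: 'user', content: '[[route:codex]] review this diff' },
  { role: 'assistant', content: [{ type: 'tool_use', id: 't1', name: 'Read', input: {} }] },
  { role: 'user', content: [{ type: 'tool_result', tool_use_id: 't1', content: 'file body' }] },
]
console.log(latestRealUserText(messages)) // prints: [[route:codex]] review this diff
```

Because the tool-result turn is skipped, the tagged prompt is found again on every follow-up request of the same sub-agent, which is what keeps the pin sticky.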

`NAME` is resolved through an **alias table** (below). If the marker resolves to
exactly one backend, the request is pinned there, skipping both the
worker/orchestrator pick and the Auto Router. If the markers in a turn resolve to
**two or more** distinct backends (e.g. you wrote `@opus` then `@composer`), the
pin is ambiguous and ignored. If nothing resolves, the marker is ignored as well.

> Design choices: this implements **"explicit tag wins"** plus a **hard pin**. The
> tag is authoritative; natural language is only a fallback when no tag is present.
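
To make the tier order and the "exactly one backend" rule concrete, here is a minimal sketch covering the sentinel and tag tiers only. The regular expressions, the `resolvePin` name, and the sample alias map are all assumptions for illustration; the proxy's real patterns are not shown in this document. The sketch also assumes that a tier which fails to resolve to exactly one backend falls through to the next tier.

```js
// Hypothetical sketch; patterns and names are assumptions, not the shim's code.
const NAME = '[\\w-]+(?:\\.[\\w-]+)*' // letters, digits, _ and -, with inner dots (gpt-5.5)
const TIERS = [
  { tier: 'sentinel', pattern: new RegExp(`\\[\\[route:(${NAME})\\]\\]`, 'gi') },
  { tier: 'tag', pattern: new RegExp(`(?:^|\\s)(?:@|(?:use|route|model):)(${NAME})`, 'gi') },
]

function resolvePin(text, aliases) {
  for (const { tier, pattern } of TIERS) { // most explicit tier first
    const routes = new Set()
    for (const match of text.matchAll(pattern)) {
      const route = aliases.get(match[1].toLowerCase())
      if (route) routes.add(route) // unknown names contribute nothing
    }
    if (routes.size === 1) {
      const [route] = routes
      // Strip every marker of the winning tier before forwarding.
      return { tier, route, text: text.replace(pattern, '').trim() }
    }
  }
  return null // no marker, unknown name, or ambiguous: normal routing decides
}

const aliases = new Map([
  ['opus', 'claude-opus'],
  ['composer', 'claude-composer'],
  ['codex', 'claude-gpt-5.5-codex'],
])

console.log(resolvePin('[[route:codex]] review this diff', aliases))
console.log(resolvePin('@opus then @composer', aliases)) // prints: null
```

Returning `null` for every failure case keeps the caller simple: it only has to check for a pin and otherwise continue with the normal routing flow.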

Watch decisions with `UC_DIRECTIVES_LOG=1`:

```
directive pin: tier=fast claude-deepseek-v4-flash -> claude-composer
[directive] ambiguous (claude-composer, claude-gpt-5.5-codex named); ignored
```

---

## Names (the alias table)

Names **auto-derive** from your configured model ids and display names, so the
obvious ones already work with no setup:

| You type | Resolves to (in the shipped config) |
|----------|--------------------------------------|
| `opus`, `claude` | `claude-opus` |
| `composer` | `claude-composer` |
| `codex` | `claude-gpt-5.5-codex` |
| `minimax` | `claude-minimax-m3` |
| `mimo` | `claude-mimo` |
| `deepseek-v4-pro`, `deepseek-v4-flash` | the matching route |

Matching is case- and punctuation-insensitive (`GPT-5.5`, `gpt5.5`, `gpt_5_5` all
collapse to the same key). A name that would map to **two** routes is dropped as
ambiguous — use the specific id, or pin it explicitly in `aliases`. In the shipped
example two such names are dropped: bare **`deepseek`** (matches both v4-pro and
v4-flash) and **`gpt`** (matches both `claude-gpt-5.5-codex` via `gpt-5.5` *and*
`claude-ollama-cloud` via its `gpt-oss` model) — so use `codex` for GPT-5.5, not
`gpt`.
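
The normalization and the "ambiguous names are dropped" rule can be sketched like this. It is an illustration under stated assumptions: `normalize` and `buildAliasTable` are hypothetical names, and deriving candidate names by splitting the route id on hyphens is a guess at the idea, not the shim's actual derivation (which also uses display names).

```js
// Hypothetical sketch; the candidate-name rule is an assumption.
const normalize = name => name.toLowerCase().replace(/[^a-z0-9]/g, '')

function buildAliasTable(routeIds, overrides = {}) {
  const owners = new Map() // normalized name -> Set of route ids that claim it
  for (const id of routeIds) {
    const bare = id.replace(/^claude-/, '')
    for (const candidate of [id, bare, ...bare.split('-')]) {
      const key = normalize(candidate)
      if (!key) continue
      if (!owners.has(key)) owners.set(key, new Set())
      owners.get(key).add(id)
    }
  }
  const table = new Map()
  for (const [key, ids] of owners) {
    if (ids.size === 1) table.set(key, [...ids][0]) // names claimed by 2+ routes are dropped
  }
  for (const [name, id] of Object.entries(overrides)) {
    if (routeIds.includes(id)) table.set(normalize(name), id) // right side must be a route id
  }
  return table
}

const routes = ['claude-composer', 'claude-deepseek-v4-pro', 'claude-deepseek-v4-flash']
const derived = buildAliasTable(routes)
const pinned = buildAliasTable(routes, { deepseek: 'claude-deepseek-v4-pro' })

console.log(derived.has('deepseek')) // prints: false
console.log(pinned.get('deepseek')) // prints: claude-deepseek-v4-pro
console.log(derived.get(normalize('DeepSeek_V4-Flash'))) // prints: claude-deepseek-v4-flash
```

Applying the explicit `aliases` overrides last is what lets a single config entry rescue a name that auto-derivation had to drop.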

### Configure it (optional)

The `directives` block in `config.json` (already present in the shipped config):

```jsonc
"directives": {
  "enabled": true,   // OFF by default; set true to turn the feature on
  // Friendly name -> route id. The common names (opus, claude, composer, codex,
  // minimax, mimo, ...) AUTO-DERIVE from your models, so list entries here only to
  // ADD a custom name or disambiguate one. RHS must be a route id.
  "aliases": {
    "fixer": "claude-opus",                // a custom role name -> a route
    "deepseek": "claude-deepseek-v4-pro"   // disambiguate a name that maps to two routes
  },
  // Optional: interactive plan-mode turns with NO explicit pin auto-route here.
  "planner": "claude-opus",
  // Strip the marker from the prompt before forwarding (recommended).
  "strip": true
}
```

| Field | Meaning |
|-------|---------|
| `enabled` | Turn directives on/off. **Defaults to off.** An explicit `UC_DIRECTIVES` env var (1/0) overrides this either way. |
| `aliases` | `name → route id` overrides on top of the auto-derived table. The right side must be a route id in `routes`. |
| `planner` | If set, **plan-mode** turns (the interactive planning loop, detected structurally via the `ExitPlanMode` tool) with no explicit pin route here. Set `null` to disable. |
| `strip` | Remove the matched marker from the prompt before forwarding so the backend never sees it. |

### Knobs

| Env var | Default | Effect |
|---------|---------|--------|
| `UC_DIRECTIVES` | unset | `1` force-enables, `0` force-disables — overrides `directives.enabled`. Unset → follow config (default off). |
| `UC_DIRECTIVES_NL` | `0` | `1` enables the natural-language tier. **Off by default** (avoids prose like "with Claude" silently rerouting); only sentinel/tag pins count unless enabled. |
| `UC_DIRECTIVES_LOG` | `0` | `1` logs every pin / ambiguity / ignore decision. |
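
The precedence between the env var and the config flag, as described in the tables above, can be sketched as a small function. The name `directivesEnabled` is hypothetical, and the environment is passed in explicitly so the sketch stays testable.

```js
// Hypothetical helper illustrating the documented precedence:
// UC_DIRECTIVES=1 forces on, UC_DIRECTIVES=0 forces off, unset follows config.
function directivesEnabled(config, env) {
  const flag = env.UC_DIRECTIVES
  if (flag === '1') return true
  if (flag === '0') return false
  return Boolean(config.directives && config.directives.enabled)
}

console.log(directivesEnabled({ directives: { enabled: true } }, { UC_DIRECTIVES: '0' })) // prints: false
console.log(directivesEnabled({ directives: { enabled: false } }, { UC_DIRECTIVES: '1' })) // prints: true
console.log(directivesEnabled({}, {})) // prints: false
```

In a real process the second argument would typically be `process.env`.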

---

## The point: automated multi-agent workflows

Because a workflow script is something you (or the orchestrator) author, you don't
need any fuzzy "which phase is this?" inference — you state the role **in the
prompt**, deterministically:

```js
const plan = await agent(`[[route:opus]] Write a plan for: ${task}`)
const code = await agent(`[[route:composer]] Implement this plan:\n${plan}`)
const review = await agent(`[[route:codex]] Adversarially review:\n${code}`, { schema: REVIEW })
const fixed = await agent(`[[route:claude]] Fix the valid issues:\n${JSON.stringify(review)}`)
```

Each `agent()` is a separate request; the proxy pins each one independently, so
every spawned sub-agent lands exactly where you declared — regardless of which
single worker model is selected in `/model`.
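
If a workflow has many steps, one option is to keep the role-to-model mapping in a single place and build the tagged prompts with a small helper. The `withRoute` function and the `ROLES` map below are hypothetical conveniences, not part of the shim or of the Workflow API; only the `[[route:NAME]]` sentinel form comes from this document.

```js
// Hypothetical helper: prefixes a prompt with the sentinel form of a pin.
function withRoute(name, prompt) {
  if (!/^[\w.-]+$/.test(name)) throw new Error(`invalid route name: ${name}`)
  return `[[route:${name}]] ${prompt}`
}

// Assumed role names; the model names are the ones used in the example above.
const ROLES = { planner: 'opus', implementer: 'composer', reviewer: 'codex', fixer: 'claude' }

const planPrompt = withRoute(ROLES.planner, 'Write a plan for: add a --json flag')
console.log(planPrompt) // prints: [[route:opus]] Write a plan for: add a --json flag
```

Swapping the model behind a role then means editing one entry in `ROLES` rather than every prompt string.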

A complete, runnable version (with a structured review verdict and a conditional
fix step) ships at [`examples/role_pipeline_workflow.js`](../examples/role_pipeline_workflow.js).
Save it as `.claude/workflows/role-pipeline.js` to invoke by name, and pass the
task via `args`.

---

## Failure behavior (never breaks a request)

| Situation | What happens |
|-----------|--------------|
| No marker in the prompt | Normal routing (tier/worker selection, then Auto Router). |
| Name resolves to nothing | Ignored; normal routing. |
| Two+ distinct names in one turn | Ambiguous → ignored; normal routing. |
| Name maps to the `auto` route | Not a real backend → ignored. |
| Pinned backend errors at dispatch | Same handling as any other route (the pin only chooses *where*, not *how*). |
121 changes: 121 additions & 0 deletions examples/role_pipeline_workflow.js
@@ -0,0 +1,121 @@
// Role pipeline: plan (opus) -> code (composer) -> adversarial review (codex) -> fix (claude)
// ---------------------------------------------------------------------------------------------
// A ready-to-run multi-agent Workflow that lands each spawned sub-agent on a
// SPECIFIC model by role, with no turn-by-turn driving. It does this purely by
// baking a routing directive ("pin") into each agent()'s prompt:
//
// [[route:opus]] -> claude-opus (the planner)
// [[route:composer]] -> claude-composer (the implementer)
// [[route:codex]] -> claude-gpt-5.5-codex (the adversarial reviewer)
// [[route:claude]] -> claude-opus (the fixer)
//
// UltraCode-Shim's proxy sees that tag on each sub-agent request, HARD-PINS that
// request to the named backend (overriding the worker/orchestrator pick AND the
// Auto Router), strips the tag, and forwards the rest. The names resolve through
// the alias table in config.json ("directives" block) -- auto-derived from your
// model ids + display names, so composer/codex/opus already work out of the box.
// See docs/DIRECTIVES.md.
//
// Run it (from a project where you've launched Claude Code through the shim):
// - Save this as .claude/workflows/role-pipeline.js and invoke it by name, OR
// - paste it into the Workflow tool's `script` field.
// Pass the task via args, e.g. args: "Add a --json flag to the export command".

export const meta = {
  name: 'role-pipeline',
  description: 'plan (opus) -> code (composer) -> adversarial review (codex) -> fix (claude)',
  phases: [
    { title: 'Plan', detail: 'opus drafts the implementation plan' },
    { title: 'Code', detail: 'composer implements the plan' },
    { title: 'Review', detail: 'codex adversarially reviews the implementation' },
    { title: 'Fix', detail: 'claude fixes the issues the review deems valid' },
  ],
}

// Accept the task as a plain string (args: "...") or {task: "..."}.
const task = (typeof args === 'string' && args.trim())
  ? args.trim()
  : (args && typeof args.task === 'string' && args.task.trim())
    ? args.task.trim()
    : null

if (!task) {
  log('No task provided. Pass it via Workflow args, e.g. args: "Add a --json flag".')
  return { error: 'no task provided' }
}

// 1) PLAN -- pinned to opus.
phase('Plan')
const plan = await agent(
  `[[route:opus]] You are the PLANNER. Write a precise, step-by-step implementation ` +
    `plan for the task below: the files to touch, the approach, data/flow changes, and ` +
    `the edge cases that matter. Do NOT write the final code yet.\n\nTASK:\n${task}`,
  { label: 'plan:opus', phase: 'Plan' },
)

// 2) CODE -- pinned to composer.
phase('Code')
const code = await agent(
  `[[route:composer]] You are the IMPLEMENTER. Implement the plan below in full and ` +
    `produce the actual code (diffs or complete files). Follow the plan; where it is ` +
    `underspecified, make the smallest reasonable choice and note it inline.\n\n` +
    `PLAN:\n${plan}\n\nORIGINAL TASK:\n${task}`,
  { label: 'code:composer', phase: 'Code' },
)

// 3) REVIEW -- pinned to codex, structured so we can branch on the verdict.
phase('Review')
const REVIEW_SCHEMA = {
  type: 'object',
  additionalProperties: false,
  properties: {
    verdict: { type: 'string', enum: ['ship', 'fix', 'reject'] },
    issues: {
      type: 'array',
      items: {
        type: 'object',
        additionalProperties: false,
        properties: {
          severity: { type: 'string', enum: ['high', 'medium', 'low'] },
          title: { type: 'string' },
          detail: { type: 'string' },
        },
        required: ['severity', 'title', 'detail'],
      },
    },
    summary: { type: 'string' },
  },
  required: ['verdict', 'issues', 'summary'],
}
const review = await agent(
  `[[route:codex]] You are an ADVERSARIAL, SKEPTICAL reviewer. Actively try to BREAK ` +
    `the implementation below: correctness bugs, missed requirements, unhandled edge ` +
    `cases, race conditions, and security issues. Be concrete and cite specifics. Then ` +
    `return your structured verdict (ship = no real problems; fix = real issues to ` +
    `address; reject = fundamentally wrong).\n\nTASK:\n${task}\n\nIMPLEMENTATION:\n${code}`,
  { label: 'review:codex', phase: 'Review', schema: REVIEW_SCHEMA },
)

// 4) FIX -- pinned to claude, only if the review found issues worth fixing.
let fixed = null
const actionable = (review.issues || []).filter(i => i.severity === 'high' || i.severity === 'medium')
if (review.verdict !== 'ship' && actionable.length) {
  phase('Fix')
  fixed = await agent(
    `[[route:claude]] You are the FIXER. The adversarial review below flagged issues. ` +
      `Fix ONLY the ones that are genuinely correct; for any you judge a false positive, ` +
      `leave the code as-is and briefly explain why. Return the corrected implementation.\n\n` +
      `IMPLEMENTATION:\n${code}\n\nREVIEW:\n${JSON.stringify(review, null, 2)}`,
    { label: 'fix:claude', phase: 'Fix' },
  )
} else {
  log(`review verdict=${review.verdict}; no high/medium issues -> skipping fix`)
}

return {
  task,
  plan,
  implementation: fixed || code,
  review,
  fixed: Boolean(fixed),
}