
fix(think): make step.prompt structured output work across all providers (#1685) #1688

Merged
threepointone merged 2 commits into main from fix/think-step-prompt-structured-output on Jun 6, 2026

@threepointone (Contributor) commented Jun 6, 2026

Summary

Fixes #1685. ThinkWorkflow.step.prompt({ output }) failed on Workers AI with:

AiError 5023: JSON Schema mode is not supported with stream mode

Structured workflow prompts requested output via the AI SDK Output.object path, which streams a JSON Schema response_format. Workers AI (Think's default provider) rejects JSON Schema on streaming requests, so step.prompt() was effectively broken there.

What changed

Workflow-prompt turns no longer use Output.object. Instead Think injects a synthetic think_final_answer tool whose input schema is the prompt's Zod schema, instructs the model to terminate the turn by calling it, and reads the validated structured result from that tool call's input.

Because this is ordinary tool calling, it works uniformly across Workers AI, OpenAI, and Anthropic, and keeps Think's streaming engine intact (persistence, recovery, resumable streams). A structured step.prompt() is now a full agentic turn — the agent can use its own tools across multiple steps before producing the final answer.
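
The mechanism described above can be sketched in a few lines. This is a minimal illustration, not Think's actual implementation: the `ToolCall` and `Validator` types, `readStructuredResult`, and `greetingSchema` are hypothetical stand-ins (a `Validator` plays the role of a Zod schema's `parse`), and only the tool name `think_final_answer` comes from this PR.

```typescript
// Hypothetical, simplified types for illustration only.
// A Validator stands in for a Zod schema's parse(): it returns the typed
// value or throws when the input does not match.
type Validator<T> = (input: unknown) => T;

interface ToolCall {
  toolName: string;
  input: unknown;
}

const FINAL_ANSWER_TOOL = "think_final_answer";

// Read the structured result from the final-answer tool call's input.
// Returns undefined when the model never called the tool.
function readStructuredResult<T>(
  calls: ToolCall[],
  validate: Validator<T>
): T | undefined {
  const call = calls.find((c) => c.toolName === FINAL_ANSWER_TOOL);
  return call === undefined ? undefined : validate(call.input);
}

// Example validator for a { greeting: string } shape.
const greetingSchema: Validator<{ greeting: string }> = (input) => {
  if (typeof input !== "object" || input === null) {
    throw new Error("expected an object");
  }
  const greeting = (input as Record<string, unknown>).greeting;
  if (typeof greeting !== "string") {
    throw new Error("greeting must be a string");
  }
  return { greeting };
};

// A turn in which the agent used one of its own tools before answering.
const turnCalls: ToolCall[] = [
  { toolName: "write", input: { path: "note.txt", content: "hi" } },
  { toolName: FINAL_ANSWER_TOOL, input: { greeting: "hello" } },
];

const result = readStructuredResult(turnCalls, greetingSchema);
```

The point of the design is that the structured value travels as a tool-call argument, so no JSON Schema `response_format` is ever sent on the streaming request.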

Key details / edge cases handled

  • Forced tool choice. toolChoice is forced for structured turns ("required" when real tools are present, otherwise pinned to the final-answer tool). A hasToolCall stop condition alone is insufficient — some models (kimi, llama) answer in plain text and stop at step 0 without ever calling the tool.
  • Reserved name + collision guard. think_final_answer is reserved and namespaced; if a user tool already uses the name, the turn picks a collision-suffixed variant.
  • Clean transcript. The internal tool's call/result are stripped from the persisted conversation (stateless, matched by the reserved name) so the transcript and later turns never see Think's internal plumbing. Stateless matching also covers the recovery re-persist path.
  • No hook leakage. The synthetic tool is excluded from user-facing afterToolCall / extension hooks, and is always kept in activeTools on structured turns even when a caller overrides activeTools via beforeTurn.
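
The first two bullets can be sketched as follows. The `ToolChoice` shape mirrors the AI SDK's `toolChoice` values (`"required"` or a specific named tool); the helper names `pickFinalAnswerName` and `forcedToolChoice`, and the numeric `_2`, `_3` suffix format, are assumptions made for illustration, since the PR does not state how the collision suffix is formed.

```typescript
// Mirrors the two AI SDK toolChoice values this PR relies on.
type ToolChoice = "required" | { type: "tool"; toolName: string };

const RESERVED_NAME = "think_final_answer";

// Pick a final-answer tool name that does not collide with a user tool.
// The "_<n>" suffix scheme is an assumption, not Think's documented format.
function pickFinalAnswerName(userToolNames: string[]): string {
  const taken = new Set(userToolNames);
  if (!taken.has(RESERVED_NAME)) return RESERVED_NAME;
  let suffix = 2;
  while (taken.has(`${RESERVED_NAME}_${suffix}`)) suffix += 1;
  return `${RESERVED_NAME}_${suffix}`;
}

// With real tools present the model must call *some* tool each step;
// with none, it is pinned to the final-answer tool directly.
function forcedToolChoice(
  userToolNames: string[],
  finalAnswerName: string
): ToolChoice {
  return userToolNames.length > 0
    ? "required"
    : { type: "tool", toolName: finalAnswerName };
}

const withTools = forcedToolChoice(["read", "write"], RESERVED_NAME);
const withoutTools = forcedToolChoice([], RESERVED_NAME);
```

Using `"required"` rather than pinning the tool when real tools exist is what lets the agent call `write` or `read` first and still be unable to finish the turn in plain text.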

Tests

  • New cross-provider e2e harness (packages/think/src/e2e-tests/step-prompt-structured.test.ts) exercising:

    • a no-tool structured prompt, and
    • a multi-step tool-using prompt (write → read → final answer, i.e. the toolChoice: "required" path)

    on Workers AI, OpenAI, and Anthropic.

  • Unit tests: structured output via a final-answer mock, "does not persist the internal final-answer tool", and a recovered-assistant-message strip test.

Docs

  • docs/think/workflows.md gains a Behavior notes section:
    • the agent may use its tools first (allow maxSteps >= 2),
    • tool use is forced during a structured turn — do not override toolChoice from beforeTurn on a step.prompt() turn,
    • think_final_answer is reserved and stripped from the transcript.

Note

docs/think/workflows.md still needs a manual port to cloudflare/cloudflare-docs (src/content/docs/agents/...).

Verification

  • think unit suite: 551 passed (16 files)
  • step-prompt-structured e2e: 6 passed (Workers AI kimi-k2.6, OpenAI gpt-4o-mini, Anthropic claude-haiku-4-5; greeting + multi-step tool legs)
  • pnpm run check: green across 92 projects

Changeset included (@cloudflare/think patch).



…ers (#1685)

ThinkWorkflow `step.prompt({ output })` failed on Workers AI with
`AiError 5023: JSON Schema mode is not supported with stream mode`.
Structured workflow prompts requested output via the AI SDK `Output.object`
path, which streams a JSON Schema `response_format` — rejected by some
providers (notably Workers AI, which Think uses by default).

Fix
---
Workflow-prompt turns no longer use `Output.object`. Instead Think injects a
synthetic `think_final_answer` tool whose input schema is the prompt's Zod
schema, instructs the model to terminate the turn by calling it, and reads the
validated structured result from that tool call's input. This uses ordinary
tool calling, so it works uniformly across Workers AI, OpenAI, and Anthropic,
and keeps Think's streaming engine intact (persistence, recovery, resumable
streams). A structured `step.prompt()` is now a full agentic turn: the agent
can use its own tools across multiple steps before producing the final answer.

Key details
-----------
- toolChoice is forced for structured turns (`"required"` when real tools are
  present, otherwise pinned to the final-answer tool). `hasToolCall` stop
  conditions alone are insufficient — some models (kimi, llama) answer in plain
  text and stop at step 0 without ever calling the tool.
- `think_final_answer` is reserved and namespaced; if a user tool already uses
  the name, the turn picks a collision-suffixed variant.
- The internal tool's call/result are stripped from the persisted conversation
  (stateless, matched by the reserved name) so the transcript and later turns
  never see Think's internal plumbing. Stateless matching also covers the
  recovery re-persist path.
- The synthetic tool is excluded from user-facing `afterToolCall` / extension
  hooks, and is always kept in `activeTools` on structured turns even when a
  caller overrides `activeTools` via `beforeTurn`.
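
A rough sketch of the last two points, under stated assumptions: the `Part` and `Message` shapes below are hypothetical and much simpler than the real persisted format, prefix matching on the reserved name is a guess at how "stateless, matched by the reserved name" could also cover a collision-suffixed variant, and `stripInternalTool` / `ensureActive` are illustrative names only.

```typescript
// Hypothetical message-part shapes; the real persisted format may differ.
type Part =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolName: string; input: unknown }
  | { type: "tool-result"; toolName: string; output: unknown };

interface Message {
  role: "user" | "assistant" | "tool";
  parts: Part[];
}

const RESERVED_PREFIX = "think_final_answer";

// Stateless check: depends only on the part itself, so it gives the same
// answer on the normal persist path and on a recovery re-persist.
function isInternalPart(part: Part): boolean {
  return part.type !== "text" && part.toolName.startsWith(RESERVED_PREFIX);
}

// Drop the internal tool's call/result parts, then drop any message that
// ends up with no parts at all.
function stripInternalTool(messages: Message[]): Message[] {
  return messages
    .map((m) => ({ ...m, parts: m.parts.filter((p) => !isInternalPart(p)) }))
    .filter((m) => m.parts.length > 0);
}

// Keep the final-answer tool active even if a caller narrowed activeTools.
function ensureActive(activeTools: string[], finalAnswerName: string): string[] {
  return activeTools.includes(finalAnswerName)
    ? activeTools
    : [...activeTools, finalAnswerName];
}

const transcript: Message[] = [
  { role: "user", parts: [{ type: "text", text: "Greet the user." }] },
  {
    role: "assistant",
    parts: [
      { type: "text", text: "Composing a greeting." },
      { type: "tool-call", toolName: "think_final_answer", input: { greeting: "hello" } },
    ],
  },
  {
    role: "tool",
    parts: [{ type: "tool-result", toolName: "think_final_answer", output: "ok" }],
  },
];

const persisted = stripInternalTool(transcript);
```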

Tests
-----
- New cross-provider e2e harness (`step-prompt-structured.test.ts`) exercising
  both a no-tool structured prompt and a multi-step tool-using prompt
  (write -> read -> final answer) on Workers AI, OpenAI, and Anthropic.
- Unit tests: structured output via the final-answer mock, "does not persist
  the internal final-answer tool", and a recovered-assistant-message strip test.

Docs
----
- docs/think/workflows.md gains a "Behavior notes" section (agent may use tools
  first / maxSteps >= 2, tool use is forced so do not override toolChoice on
  step.prompt turns, think_final_answer is reserved + stripped).

Verified: think unit suite 551 passed; step-prompt-structured e2e 6 passed
(Workers AI kimi-k2.6, OpenAI gpt-4o-mini, Anthropic claude-haiku-4-5);
`pnpm run check` green across 92 projects.

Note: docs/think/workflows.md still needs a manual port to cloudflare-docs.

changeset-bot Bot commented Jun 6, 2026

🦋 Changeset detected

Latest commit: 1e2bf69

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @cloudflare/think | Patch |


devin-ai-integration Bot (Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.



pkg-pr-new Bot commented Jun 6, 2026


agents

npm i https://pkg.pr.new/agents@1688

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1688

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1688

hono-agents

npm i https://pkg.pr.new/hono-agents@1688

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1688

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1688

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1688

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1688

commit: 1e2bf69

step.prompt structured output works only with models that honor a forced
toolChoice while streaming (Think streams every turn). Verified on OpenAI
gpt-4o-mini, Anthropic claude-haiku-4-5, and Workers AI kimi-k2.6.

Some Workers AI models (e.g. @cf/meta/llama-3.3-70b-instruct-fp8-fast) only
honor forced tool calls on non-streaming requests and reply in plain text while
streaming — the turn then ends without a think_final_answer call and the prompt
fails with a clear error. The original #1685 AiError 5023 is gone either way;
this documents the model requirement so users pick a capable model.
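
The failure mode described in this commit message can be illustrated with a small sketch. It assumes a hypothetical `requireFinalAnswer` helper and a made-up error message; the wording of the real error is not shown in this PR.

```typescript
interface StepToolCall {
  toolName: string;
  input: unknown;
}

// Return the final-answer input, or fail loudly when the model ended the
// turn without calling the tool. The message text is illustrative only.
function requireFinalAnswer(
  calls: StepToolCall[],
  finalAnswerName: string,
  modelId: string
): unknown {
  const call = calls.find((c) => c.toolName === finalAnswerName);
  if (call === undefined) {
    throw new Error(
      `step.prompt: model "${modelId}" ended the turn without calling ${finalAnswerName}. ` +
        "Use a model that honors a forced toolChoice while streaming."
    );
  }
  return call.input;
}

// A plain-text reply produces no tool calls, so the lookup throws.
let failure = "";
try {
  requireFinalAnswer(
    [],
    "think_final_answer",
    "@cf/meta/llama-3.3-70b-instruct-fp8-fast"
  );
} catch (err) {
  failure = err instanceof Error ? err.message : String(err);
}
```

Failing explicitly here, rather than returning an empty result, is what turns a silently ignored `toolChoice` into an actionable message about model choice.
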
@threepointone threepointone merged commit 4d050c7 into main Jun 6, 2026
4 checks passed
@threepointone threepointone deleted the fix/think-step-prompt-structured-output branch June 6, 2026 01:23
@github-actions github-actions Bot mentioned this pull request Jun 6, 2026