fix(think): make step.prompt structured output work across all providers (#1685) #1688
Merged
…ers (#1685)

ThinkWorkflow `step.prompt({ output })` failed on Workers AI with `AiError 5023: JSON Schema mode is not supported with stream mode`. Structured workflow prompts requested output via the AI SDK `Output.object` path, which streams a JSON Schema `response_format` that some providers reject (notably Workers AI, which Think uses by default).

Fix
---
Workflow-prompt turns no longer use `Output.object`. Instead Think injects a synthetic `think_final_answer` tool whose input schema is the prompt's Zod schema, instructs the model to terminate the turn by calling it, and reads the validated structured result from that tool call's input. This uses ordinary tool calling, so it works uniformly across Workers AI, OpenAI, and Anthropic, and keeps Think's streaming engine intact (persistence, recovery, resumable streams). A structured `step.prompt()` is now a full agentic turn: the agent can use its own tools across multiple steps before producing the final answer.

Key details
-----------
- toolChoice is forced for structured turns (`"required"` when real tools are present, otherwise pinned to the final-answer tool). `hasToolCall` stop conditions alone are insufficient: some models (kimi, llama) answer in plain text and stop at step 0 without ever calling the tool.
- `think_final_answer` is reserved and namespaced; if a user tool already uses the name, the turn picks a collision-suffixed variant.
- The internal tool's call/result are stripped from the persisted conversation (stateless, matched by the reserved name) so the transcript and later turns never see Think's internal plumbing. Stateless matching also covers the recovery re-persist path.
- The synthetic tool is excluded from user-facing `afterToolCall` / extension hooks, and is always kept in `activeTools` on structured turns even when a caller overrides `activeTools` via `beforeTurn`.

Tests
-----
- New cross-provider e2e harness (`step-prompt-structured.test.ts`) exercising both a no-tool structured prompt and a multi-step tool-using prompt (write -> read -> final answer) on Workers AI, OpenAI, and Anthropic.
- Unit tests: structured output via the final-answer mock, "does not persist the internal final-answer tool", and a recovered-assistant-message strip test.

Docs
----
- docs/think/workflows.md gains a "Behavior notes" section (agent may use tools first / maxSteps >= 2, tool use is forced so do not override toolChoice on step.prompt turns, think_final_answer is reserved + stripped).

Verified: think unit suite 551 passed; step-prompt-structured e2e 6 passed (Workers AI kimi-k2.6, OpenAI gpt-4o-mini, Anthropic claude-haiku-4-5); `pnpm run check` green across 92 projects.

Note: docs/think/workflows.md still needs a manual port to cloudflare-docs.
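The name resolution and forced `toolChoice` rules from the commit message above can be sketched roughly as follows. This is a minimal illustration, not Think's actual code: `resolveFinalAnswerName`, `chooseToolChoice`, and the numeric suffix scheme are hypothetical, and the pinned `{ type: "tool", toolName }` shape is assumed to mirror the AI SDK's tool-choice option.

```typescript
// Hypothetical sketch; helper names and the suffix scheme are assumptions,
// not Think's internals.
const FINAL_ANSWER_TOOL = "think_final_answer";

// Assumed to mirror the AI SDK's toolChoice option.
type ToolChoice = "required" | { type: "tool"; toolName: string };

/** Pick a name for the synthetic tool that does not clash with a user tool. */
function resolveFinalAnswerName(userToolNames: string[]): string {
  const taken = new Set(userToolNames);
  if (!taken.has(FINAL_ANSWER_TOOL)) return FINAL_ANSWER_TOOL;
  // Collision: try think_final_answer_2, think_final_answer_3, ...
  let suffix = 2;
  while (taken.has(`${FINAL_ANSWER_TOOL}_${suffix}`)) suffix++;
  return `${FINAL_ANSWER_TOOL}_${suffix}`;
}

/**
 * Force tool use on a structured turn: "required" when the agent has real
 * tools (it may call them before answering), otherwise pin the final-answer
 * tool directly.
 */
function chooseToolChoice(
  userToolNames: string[],
  finalAnswerName: string
): ToolChoice {
  return userToolNames.length > 0
    ? "required"
    : { type: "tool", toolName: finalAnswerName };
}
```

With real tools present the model must call some tool on every step, so the turn is expected to end on the final-answer call rather than on plain text.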
🦋 Changeset detected
Latest commit: 1e2bf69
The changes in this PR will be included in the next version bump.
This PR includes changesets to release 1 package
agents
@cloudflare/ai-chat
@cloudflare/codemode
hono-agents
@cloudflare/shell
@cloudflare/think
@cloudflare/voice
@cloudflare/worker-bundler
`step.prompt` structured output works only with models that honor a forced `toolChoice` while streaming (Think streams every turn). Verified on OpenAI gpt-4o-mini, Anthropic claude-haiku-4-5, and Workers AI kimi-k2.6. Some Workers AI models (e.g. @cf/meta/llama-3.3-70b-instruct-fp8-fast) only honor forced tool calls on non-streaming requests and reply in plain text while streaming. In that case the turn ends without a `think_final_answer` call and the prompt fails with a clear error. The original #1685 AiError 5023 is gone either way; this documents the model requirement so users pick a capable model.
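The failure mode described above (a turn that ends without the final-answer call) might be detected along these lines. This is a sketch under stated assumptions: `readFinalAnswer`, the `ToolCall` shape, and the error wording are hypothetical, and the `validate` callback stands in for parsing with the prompt's Zod schema.

```typescript
// Hypothetical sketch; the types and names here are illustrative only.
type ToolCall = { toolName: string; input: unknown };

/**
 * Read the structured result from the final-answer tool call, or fail with
 * an explicit error when the model never made that call (for example, when
 * it ignored the forced toolChoice and replied in plain text).
 */
function readFinalAnswer<T>(
  toolCalls: ToolCall[],
  finalAnswerName: string,
  validate: (input: unknown) => T
): T {
  const call = toolCalls.find((c) => c.toolName === finalAnswerName);
  if (!call) {
    throw new Error(
      `Turn ended without a ${finalAnswerName} call; ` +
        "use a model that honors a forced toolChoice while streaming."
    );
  }
  // The tool call's input is the structured answer; validate before returning.
  return validate(call.input);
}

// Stand-in validator for a schema like { greeting: string }.
function parseGreeting(input: unknown): { greeting: string } {
  const candidate = input as { greeting?: unknown } | null;
  if (candidate && typeof candidate.greeting === "string") {
    return { greeting: candidate.greeting };
  }
  throw new Error("final answer did not match the schema");
}
```

A caller would pass the tool calls collected across the turn; an empty list, as with a plain-text reply, surfaces the explicit error instead of an undefined result.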
Summary
Fixes #1685.
`ThinkWorkflow` `step.prompt({ output })` failed on Workers AI with:
`AiError 5023: JSON Schema mode is not supported with stream mode`
Structured workflow prompts requested output via the AI SDK `Output.object` path, which streams a JSON Schema `response_format`. Workers AI (Think's default provider) rejects JSON Schema on streaming requests, so `step.prompt()` was effectively broken there.
What changed
Workflow-prompt turns no longer use `Output.object`. Instead Think injects a synthetic `think_final_answer` tool whose input schema is the prompt's Zod schema, instructs the model to terminate the turn by calling it, and reads the validated structured result from that tool call's input.
Because this is ordinary tool calling, it works uniformly across Workers AI, OpenAI, and Anthropic, and keeps Think's streaming engine intact (persistence, recovery, resumable streams). A structured `step.prompt()` is now a full agentic turn: the agent can use its own tools across multiple steps before producing the final answer.
Key details / edge cases handled
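- `toolChoice` is forced for structured turns (`"required"` when real tools are present, otherwise pinned to the final-answer tool). A `hasToolCall` stop condition alone is insufficient: some models (kimi, llama) answer in plain text and stop at step 0 without ever calling the tool.
- `think_final_answer` is reserved and namespaced; if a user tool already uses the name, the turn picks a collision-suffixed variant.
- The internal tool's call/result are stripped from the persisted conversation (stateless, matched by the reserved name), so the transcript and later turns never see Think's internal plumbing. Stateless matching also covers the recovery re-persist path.
- The synthetic tool is excluded from user-facing `afterToolCall` / extension hooks, and is always kept in `activeTools` on structured turns even when a caller overrides `activeTools` via `beforeTurn`.
Tests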
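New cross-provider e2e harness (`packages/think/src/e2e-tests/step-prompt-structured.test.ts`) exercising:
- a no-tool structured prompt
- a multi-step tool-using prompt (write -> read -> final answer; the `toolChoice: "required"` path)
on Workers AI, OpenAI, and Anthropic.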
Unit tests: structured output via a final-answer mock, "does not persist the internal final-answer tool", and a recovered-assistant-message strip test.
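The behavior that the "does not persist the internal final-answer tool" test checks could look roughly like the sketch below. The `Message` and `MessagePart` shapes and the `stripInternalTool` helper are hypothetical simplifications, not Think's persisted message format; dropping messages left empty after stripping is also an assumption.

```typescript
// Hypothetical sketch; message shapes are simplified for illustration.
type MessagePart =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolName: string; input: unknown }
  | { type: "tool-result"; toolName: string; output: unknown };

type Message = { role: "user" | "assistant" | "tool"; parts: MessagePart[] };

/**
 * Remove the internal final-answer tool's call/result parts. The function is
 * pure and matches only by tool name, so it needs no per-turn state and can
 * run on both the normal persist path and a recovery re-persist.
 */
function stripInternalTool(
  messages: Message[],
  reservedName: string = "think_final_answer"
): Message[] {
  const isInternal = (part: MessagePart): boolean =>
    part.type !== "text" && part.toolName === reservedName;

  return messages
    .map((m) => ({ ...m, parts: m.parts.filter((p) => !isInternal(p)) }))
    // Assumption: a message left with no parts is dropped entirely.
    .filter((m) => m.parts.length > 0);
}
```

Because the match is by name alone, running the function twice on the same transcript yields the same result, which suits a recovery path that may re-persist an already stripped message.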
Docs
`docs/think/workflows.md` gains a Behavior notes section:
- the agent may use tools first (`maxSteps >= 2`),
- tool use is forced, so do not override `toolChoice` from `beforeTurn` on a `step.prompt()` turn,
- `think_final_answer` is reserved and stripped from the transcript.
Note
`docs/think/workflows.md` still needs a manual port to `cloudflare/cloudflare-docs` (`src/content/docs/agents/...`).
Verification
- think unit suite: 551 passed
- `step-prompt-structured` e2e: 6 passed (Workers AI `kimi-k2.6`, OpenAI `gpt-4o-mini`, Anthropic `claude-haiku-4-5`; greeting + multi-step tool legs)
- `pnpm run check`: green across 92 projects
Changeset included (`@cloudflare/think` patch).