fix(gateway): preserve parallel tool calls in the streaming path #17
Merged
The streaming SSE path hardcoded index: 0 for every tool-call chunk in the OpenAI wire format, where parallel calls are distinguished only by that index. The openai-compatible client collapsed concurrent calls into a single index-0 slot, silently dropping all but the first. Assign a stable incrementing index per tool-call id so each call keeps a distinct slot. The non-streaming generateText path was already correct.

Also extend the mock model with a "call tools <name> <name>" seam that emits parallel tool calls, and add a regression test driving two parallel calls through streamText that asserts both survive.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
albanm added a commit that referenced this pull request on Jun 9, 2026
Reconcile the trace-storage work with main's quota refactor (#16) and streaming parallel-tool-call fix (#17):

- gateway/summary routers adopt resolveUsageIdentity/enforceQuotas from usage/enforce.ts while keeping trace recording and the streamed tool-call capture; merged the per-id toolCallIndex with streamedToolCalls
- defaultQuotas (now incl. untrusted) stays centralized in settings/service.ts; routers fall back to it, enforce.ts uses NonNullable<Settings['quotas']>
- settings.quotas/models remain optional (storeTraces form work)
- regenerated put-req validate.js and vjsf components from the merged schema

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fix the gateway's streaming (SSE) path silently dropping all but the first parallel tool call.

What changed: in the OpenAI streaming wire format, parallel tool calls are distinguished only by their `index`. The gateway hardcoded `index: 0` on every `tool-input-start` / `tool-input-delta` chunk, so the openai-compatible client collapsed concurrent calls into a single index-0 slot. Each call now gets a stable incrementing index keyed by its id. The non-streaming `generateText` path was already correct.

Also extends the mock model with a `call tools <name> <name> …` seam that emits parallel calls, and adds a regression test driving two parallel calls through `streamText` that asserts both survive.

Why: parallel tool calls were broken all along in the live gateway; the trace feature surfaced it.
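To illustrate why a constant index loses calls, here is a minimal sketch of a client that accumulates streamed tool-call deltas into slots keyed by index. The `ToolCallDelta` shape and the `accumulate` helper are hypothetical simplifications written for this note, not the openai-compatible client's actual code, and a real client may treat the merged arguments differently.

```typescript
// Hypothetical, simplified shape of one streamed tool-call delta.
interface ToolCallDelta {
  index: number;
  id?: string;
  name?: string;
  argumentsDelta?: string;
}

interface AccumulatedCall {
  id: string;
  name: string;
  arguments: string;
}

// Merge deltas into slots keyed by `index`, roughly as a streaming client would.
// The first delta seen for an index creates the slot; later ones only append arguments.
function accumulate(deltas: ToolCallDelta[]): AccumulatedCall[] {
  const slots: Record<number, AccumulatedCall> = {};
  for (const delta of deltas) {
    if (slots[delta.index] === undefined) {
      slots[delta.index] = { id: delta.id ?? '', name: delta.name ?? '', arguments: '' };
    }
    slots[delta.index].arguments += delta.argumentsDelta ?? '';
  }
  return Object.keys(slots).map((key) => slots[Number(key)]);
}

// Two parallel calls, every chunk sent with index 0 (the old behaviour).
const collapsed = accumulate([
  { index: 0, id: 'call-a', name: 'getWeather' },
  { index: 0, argumentsDelta: '{"city":"Paris"}' },
  { index: 0, id: 'call-b', name: 'getTime' },
  { index: 0, argumentsDelta: '{"tz":"UTC"}' }
]);

// The same two calls with a distinct index each.
const preserved = accumulate([
  { index: 0, id: 'call-a', name: 'getWeather' },
  { index: 0, argumentsDelta: '{"city":"Paris"}' },
  { index: 1, id: 'call-b', name: 'getTime' },
  { index: 1, argumentsDelta: '{"tz":"UTC"}' }
]);

console.log(collapsed.length, preserved.length); // prints "1 2"
```

In the collapsed case only `call-a` keeps its identity, which matches the symptom described above: the second call never gets a slot of its own.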
Regression risks:

- The mock's tool-call id changes from `mock-tool-call-id` to `mock-tool-call-id-{idx}` (test-only; no test depends on the old literal).
- The `tool-input-delta` index lookup falls back to `0` only when no matching `tool-input-start` was seen. This is a safe guard, not a live path.
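The per-id index assignment and its fallback can be sketched as follows. This is an illustrative sketch only: `createToolCallIndexer`, `start`, and `lookup` are hypothetical names, and the gateway's actual code may be structured differently.

```typescript
// Hypothetical helper: hands out a stable, incrementing index per tool-call id.
function createToolCallIndexer() {
  const indexById: Record<string, number> = {};
  let next = 0;
  const has = (id: string): boolean =>
    Object.prototype.hasOwnProperty.call(indexById, id);

  return {
    // For a tool-input-start chunk: assign the next free index the first time an id is seen.
    start(id: string): number {
      if (!has(id)) {
        indexById[id] = next++;
      }
      return indexById[id];
    },
    // For a tool-input-delta chunk: reuse the index assigned at start,
    // falling back to 0 when no matching start was seen.
    lookup(id: string): number {
      return has(id) ? indexById[id] : 0;
    }
  };
}

const indexer = createToolCallIndexer();
const first = indexer.start('mock-tool-call-id-0');
const second = indexer.start('mock-tool-call-id-1');
const repeated = indexer.start('mock-tool-call-id-0'); // same id, same index
const deltaIndex = indexer.lookup('mock-tool-call-id-1');
const unknownIndex = indexer.lookup('never-started'); // guard path

console.log(first, second, repeated, deltaIndex, unknownIndex); // prints "0 1 0 1 0"
```

Keying on the id rather than on arrival order keeps the index stable even when deltas for different calls interleave.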