DeepSeek thinking tokens not replayed on multi-turn tool calls — affects both OpenAI and Anthropic SDK paths
Related to: #31
TLDR
- This is a real DeepSeek-specific breakage in Aether tool-call conversations, not user misconfiguration.
- It reproduces on both OpenAI and Anthropic SDK paths, so it is not endpoint-specific.
- MiniMax and Qwen work in the same environment; DeepSeek fails with missing replayed thinking content.
- The extension currently does not reliably apply includeThinking to dynamic provider models in the effective request path.
- An attempted compatible-model workaround was not reliable in this environment and should not be treated as a fix.
Background
DeepSeek's API (V4 and later, including reasoner variants) enforces a strict rule: any assistant message in conversation history that contains tool_calls must also include a reasoning_content field (OpenAI SDK path) or a content[].thinking block (Anthropic SDK path). If the field required by the SDK path in use is absent, the API rejects the entire request with a 400 error.
Aether already has the correct mechanism for this — the includeThinking flag on ModelConfig — and convertAssistantMessage() in openaiHandler.ts already injects reasoning_content (or an empty placeholder) when includeThinking === true. The observed behavior indicates includeThinking is not reliably present in the effective model config during later tool-call turns, so replay injection does not consistently trigger.
This affects DeepSeek models selected through dynamic provider paths in this environment (notably opencodego).
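The replay rule described above can be sketched as a small conversion step. This is a minimal illustration, not Aether's actual convertAssistantMessage(): the types AssistantTurn and WireAssistantMessage and the function toWireMessage are hypothetical names, and only the reasoning_content field, the empty-placeholder behaviour, and the includeThinking flag come from this report.

```typescript
// Hypothetical shapes for illustration; not Aether's real types.
interface ToolCall {
  id: string;
  name: string;
  arguments: string;
}

interface AssistantTurn {
  text: string;
  thinking?: string; // reasoning captured from the earlier response, if any
  toolCalls: ToolCall[];
}

interface WireAssistantMessage {
  role: "assistant";
  content: string;
  tool_calls?: {
    id: string;
    type: "function";
    function: { name: string; arguments: string };
  }[];
  reasoning_content?: string;
}

// Converts a stored assistant turn into the OpenAI-style message that is replayed
// in history. Assumption: reasoning_content is attached only when includeThinking
// is strictly true, with an empty string as the placeholder.
function toWireMessage(
  turn: AssistantTurn,
  includeThinking: boolean | undefined
): WireAssistantMessage {
  const message: WireAssistantMessage = { role: "assistant", content: turn.text };
  if (turn.toolCalls.length > 0) {
    message.tool_calls = turn.toolCalls.map((call) => ({
      id: call.id,
      type: "function" as const,
      function: { name: call.name, arguments: call.arguments },
    }));
  }
  if (includeThinking === true) {
    message.reasoning_content = turn.thinking ?? "";
  }
  return message;
}
```

With includeThinking left undefined, the replayed message has tool_calls but no reasoning_content, which is the shape DeepSeek rejects.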
Errors Observed
OpenAI SDK path (sdkMode: openai):
Error: reasoning_content missing on assistant message containing tool_calls
or the request silently fails mid-tool-call with no content applied.
Observed request example from Copilot chat error output:
Copilot Request id: 416f2c28-cf84-4952-8aff-47c3d91cfe2d
Reason: 400 Error from provider (DeepSeek): The reasoning_content in the thinking mode must be passed back to the API.
Anthropic SDK path (sdkMode: anthropic) — from issue #31:
Anthropic API call failed: 400 {
  "error": {
    "message": "The `content[].thinking` in the thinking mode must be passed back to the API.",
    "type": "invalid_request_error",
    "code": "invalid_request_error"
  }
}
Both errors indicate the same replay contract violation. The SDK path is irrelevant — DeepSeek enforces this on both.
Failure Pattern
The failure never appears on the first request, and the exact point at which it surfaces varies, which makes it confusing to debug:
- ✅ Turn 1 (plain chat): works fine — no tool calls yet
- ✅ Turn 2 (first tool call): works — model emits thinking + tool_call
- ❌ Turn 3 (tool result → next request): fails; the assistant message from turn 2 is in history without reasoning_content/thinking, and DeepSeek rejects the request
Variations observed:
- File operation completes, diff is not applied, then error
- Error fires before any operation runs
- Error fires after a successful complete operation, on the next turn
- Affects deepseek-v4-pro and deepseek-v4-flash equally
- Does not affect MiniMax, Qwen, Kimi, GLM on the same provider (those APIs don't enforce this rule)
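The turn 3 failure can be caught locally before the request is sent. The sketch below is a hypothetical pre-flight check, not existing Aether code: HistoryMessage and findReplayViolations are illustrative names, and the check assumes that the mere presence of reasoning_content (even an empty string) is enough for the OpenAI SDK path.

```typescript
// Hypothetical, simplified history entry for the OpenAI SDK path.
interface HistoryMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  tool_calls?: unknown[];
  reasoning_content?: string;
}

// Returns the indices of assistant messages that carry tool_calls
// but have no reasoning_content field at all.
function findReplayViolations(history: HistoryMessage[]): number[] {
  const violations: number[] = [];
  history.forEach((message, index) => {
    const hasToolCalls =
      message.role === "assistant" && (message.tool_calls?.length ?? 0) > 0;
    if (hasToolCalls && message.reasoning_content === undefined) {
      violations.push(index);
    }
  });
  return violations;
}

// History as it would look on the turn 3 request: the assistant message
// from turn 2 is replayed with tool_calls but without its thinking.
const turn3History: HistoryMessage[] = [
  { role: "user", content: "Rename foo to bar in utils.ts" },
  { role: "assistant", content: "", tool_calls: [{ id: "call_1" }] },
  { role: "tool", content: "file contents" },
];

// The assistant message at index 1 is flagged.
const violations = findReplayViolations(turn3History);
```

Logging the flagged indices just before the request goes out would show whether the thinking data is lost during history serialization or was never stored.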
Root Cause: Three Missing Pieces and One Reliability Gap
1. ModelOverride is missing includeThinking — src/types/sharedTypes.ts
outputThinking is present but includeThinking is not:
// src/types/sharedTypes.ts
export interface ModelOverride {
  id: string;
  model?: string;
  maxInputTokens?: number;
  maxOutputTokens?: number;
  sdkMode?: SdkMode;
  capabilities?: { toolCalling?: boolean; imageInput?: boolean };
  baseUrl?: string;
  customHeader?: Record<string, string>;
  extraBody?: Record<string, unknown>;
  outputThinking?: boolean;
  // ❌ includeThinking is absent: this is the first breakage point
}
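This means any includeThinking: true a user sets in aether.providerOverrides[x].models[] is not part of the typed override. TypeScript does not strip the field at runtime (types are erased at compile time, so the value may still sit on the raw settings object), but because the interface does not declare it, applyProviderOverrides() cannot reference it without a type error, and nothing downstream reads it.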
2. applyProviderOverrides() never applies includeThinking — src/utils/configManager.ts
Around line 700–720, outputThinking is handled for existing models but includeThinking has no corresponding block:
// Exists for outputThinking:
if (modelOverride.outputThinking !== undefined) {
  existingModel.outputThinking = modelOverride.outputThinking;
}
// ❌ No equivalent block for includeThinking
Around line 739, the new-model branch also omits it:
...(modelOverride.outputThinking !== undefined && {
  outputThinking: modelOverride.outputThinking
})
// ❌ includeThinking not spread here either
3. No automatic includeThinking detection for DeepSeek models — src/utils/globalContextLengthManager.ts
The extension already auto-detects model families for token limits (isDeepSeekModel(), isDeepSeekV4Model(), etc.). There is no equivalent capability flag injection. For any dynamically-fetched model, includeThinking defaults to undefined (falsy), so the thinking replay is always skipped — even when the model provably requires it.
4. Reliability gap specifically on tool-call continuation turns
In this environment, the failure often appears after a successful first tool turn (read/edit planned, then next turn fails). That suggests the continuation/history serialization path is dropping required thinking replay data for DeepSeek assistant tool-call messages.
Proposed Fix
Fix 1 — Add includeThinking to ModelOverride (1 line)
src/types/sharedTypes.ts:
export interface ModelOverride {
  // ... existing fields ...
  outputThinking?: boolean;
  includeThinking?: boolean; // ← ADD THIS
}
Fix 2 — Apply it in applyProviderOverrides() (4 lines)
src/utils/configManager.ts — in the existing-model branch, after the outputThinking block:
if (modelOverride.includeThinking !== undefined) {
  existingModel.includeThinking = modelOverride.includeThinking;
}
In the new-model branch, alongside outputThinking:
...(modelOverride.includeThinking !== undefined && {
  includeThinking: modelOverride.includeThinking
})
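Fixes 1 and 2 together can be sketched as one self-contained function. This is a simplified stand-in, assuming much smaller ModelConfig and ModelOverride shapes than the real ones; applyModelOverrides is a hypothetical name and does not reproduce the actual applyProviderOverrides() in src/utils/configManager.ts.

```typescript
// Reduced stand-ins for the real interfaces; only the fields relevant here.
interface ModelConfig {
  id: string;
  outputThinking?: boolean;
  includeThinking?: boolean;
}

interface ModelOverride {
  id: string;
  outputThinking?: boolean;
  includeThinking?: boolean; // Fix 1: the field is declared on the override type
}

// Applies overrides to a copy of the model list, handling both branches:
// updating an existing model and creating a new one.
function applyModelOverrides(
  models: ModelConfig[],
  overrides: ModelOverride[]
): ModelConfig[] {
  const result: ModelConfig[] = models.map((model) => ({ ...model }));
  for (const override of overrides) {
    const existing = result.find((model) => model.id === override.id);
    if (existing) {
      if (override.outputThinking !== undefined) {
        existing.outputThinking = override.outputThinking;
      }
      // Fix 2, existing-model branch
      if (override.includeThinking !== undefined) {
        existing.includeThinking = override.includeThinking;
      }
    } else {
      result.push({
        id: override.id,
        ...(override.outputThinking !== undefined && {
          outputThinking: override.outputThinking,
        }),
        // Fix 2, new-model branch
        ...(override.includeThinking !== undefined && {
          includeThinking: override.includeThinking,
        }),
      });
    }
  }
  return result;
}
```

Checking against undefined rather than truthiness keeps an explicit includeThinking: false intact instead of treating it as unset.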
Fix 3 (Ideal / No-Config Solution) — Auto-detect DeepSeek models and force includeThinking
src/utils/globalContextLengthManager.ts or wherever capability resolution happens:
// When building effective model config for a dynamically-fetched model:
if (isDeepSeekModel(model.id) || isDeepSeekV4Model(model.id)) {
  model.includeThinking = true;
}
This would make it work out of the box for all users without any manual settings — which is the right long-term behaviour. DeepSeek always requires thinking replay when the model emits thinking tokens; there is no scenario where a DeepSeek V4 model should have includeThinking: false.
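A minimal sketch of the auto-detection step, under stated assumptions: DynamicModel, looksLikeDeepSeek and resolveIncludeThinking are hypothetical names, and the regular expression is only a stand-in for the extension's own isDeepSeekModel(), whose real matching rules are not shown in this report.

```typescript
// Hypothetical minimal shape of a dynamically fetched model entry.
interface DynamicModel {
  id: string;
  includeThinking?: boolean;
}

// Stand-in for the extension's isDeepSeekModel(); the real matcher may differ.
function looksLikeDeepSeek(modelId: string): boolean {
  return /deepseek/i.test(modelId);
}

// Fills in includeThinking for DeepSeek models when nothing was configured.
// Returns a new object and leaves the input untouched.
function resolveIncludeThinking(model: DynamicModel): DynamicModel {
  if (model.includeThinking === undefined && looksLikeDeepSeek(model.id)) {
    return { ...model, includeThinking: true };
  }
  return model;
}
```

This variant only fills the flag in when it is unset, so an explicit setting still wins. Setting it unconditionally, as proposed in Fix 3, would mean dropping the undefined check.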
Workaround Status
An attempted workaround using aether.compatibleModels with explicit DeepSeek model entries and includeThinking: true was not reliable in this environment. It should be considered an experiment, not a production workaround.
For this report, no reliable DeepSeek workaround is claimed.
Why providerOverrides Settings Don't Help
Many users will try:
"aether.providerOverrides": {
"opencodego": {
"models": [{ "id": "deepseek-v4-pro", "includeThinking": true }]
}
}
This appears to be the supported config path but has no effect due to issues 1 and 2 above. The setting is accepted by VS Code (no schema error), appears to save, and gives no feedback that it was ignored. This is a UX trap that should also be addressed — either by fixing the path or by documenting that it doesn't work.
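One way to close the feedback gap is to warn about override keys the extension does not act on. The sketch below is hypothetical: findIgnoredOverrideKeys and KNOWN_OVERRIDE_KEYS are illustrative names, and the key list is copied from the ModelOverride interface quoted earlier in this report, so it would need to track the real interface.

```typescript
// Keys taken from the ModelOverride interface quoted earlier in this report.
const KNOWN_OVERRIDE_KEYS = new Set<string>([
  "id",
  "model",
  "maxInputTokens",
  "maxOutputTokens",
  "sdkMode",
  "capabilities",
  "baseUrl",
  "customHeader",
  "extraBody",
  "outputThinking",
]);

// Returns the keys of a raw model override that no code path will read.
function findIgnoredOverrideKeys(rawOverride: Record<string, unknown>): string[] {
  return Object.keys(rawOverride).filter((key) => !KNOWN_OVERRIDE_KEYS.has(key));
}

// The settings entry from the example above.
const ignoredKeys = findIgnoredOverrideKeys({
  id: "deepseek-v4-pro",
  includeThinking: true,
});
for (const key of ignoredKeys) {
  console.warn(`providerOverrides: "${key}" is not supported and will be ignored`);
}
```

Once includeThinking is added to ModelOverride, it would also be added to the known-key set and the warning for it would disappear.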
Environment
| Field | Value |
| --- | --- |
| Aether version | 0.45.1 |
| VS Code version | 1.116.0 |
| OS | Linux |
| Provider tested | opencodego (OpenCode Zen Go) |
| SDK paths affected | Both openai and anthropic (see #31) |
| Models confirmed failing | deepseek-v4-pro, deepseek-v4-flash |
| Models confirmed working | minimax-m2.7, minimax-m2.5, qwen3.6-plus, kimi-k2.5 |