DeepSeek thinking tokens not replayed on multi-turn tool calls — affects both OpenAI and Anthropic SDK paths #36

@dkebler

Related to: #31

TLDR

  • This is a real DeepSeek-specific breakage in Aether tool-call conversations, not user misconfiguration.
  • It reproduces on both OpenAI and Anthropic SDK paths, so it is not endpoint-specific.
  • MiniMax and Qwen work in the same environment; DeepSeek fails with missing replayed thinking content.
  • The extension currently does not reliably apply includeThinking to dynamic provider models in the effective request path.
  • An attempted compatible-model workaround was not reliable in this environment and should not be treated as a fix.

Background

DeepSeek's API (V4 and later, including reasoner variants) enforces a strict rule: any assistant message in conversation history that contains tool_calls must also carry the thinking that accompanied it. On the OpenAI SDK path this is a reasoning_content field; on the Anthropic SDK path it is a content[].thinking block. If the field required for the path in use is absent, the API rejects the entire request with a 400 error.

Aether already has the correct mechanism for this — the includeThinking flag on ModelConfig — and convertAssistantMessage() in openaiHandler.ts already injects reasoning_content (or an empty placeholder) when includeThinking === true. The observed behavior indicates includeThinking is not reliably present in the effective model config during later tool-call turns, so replay injection does not consistently trigger.
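
As a rough illustration of the replay step described above, the sketch below converts a stored assistant turn into an OpenAI-style message. The types and the toAssistantMessage name are hypothetical and simplified for this report; this is not the actual convertAssistantMessage() code, only the behaviour it is described as having.

```typescript
// Hypothetical, simplified shapes; Aether's real types are likely richer.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface AssistantTurn {
  text: string;
  thinking?: string; // thinking captured from the previous response, if any
  toolCalls?: ToolCall[];
}

interface OpenAIAssistantMessage {
  role: "assistant";
  content: string | null;
  tool_calls?: ToolCall[];
  reasoning_content?: string;
}

// Illustrative stand-in for the replay step, not the real convertAssistantMessage().
function toAssistantMessage(
  turn: AssistantTurn,
  includeThinking: boolean,
): OpenAIAssistantMessage {
  const message: OpenAIAssistantMessage = {
    role: "assistant",
    content: turn.text.length > 0 ? turn.text : null,
  };
  if (turn.toolCalls && turn.toolCalls.length > 0) {
    message.tool_calls = turn.toolCalls;
  }
  if (includeThinking) {
    // Replay the captured thinking, or an empty placeholder when none was kept.
    message.reasoning_content = turn.thinking ?? "";
  }
  return message;
}
```

With includeThinking left undefined, the field is never attached, which matches the symptom reported here.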

This affects DeepSeek models selected through dynamic provider paths in this environment (notably opencodego).


Errors Observed

OpenAI SDK path (sdkMode: openai):

Error: reasoning_content missing on assistant message containing tool_calls

or the request silently fails mid-tool-call with no content applied.

Observed request example from Copilot chat error output:

Copilot Request id: 416f2c28-cf84-4952-8aff-47c3d91cfe2d
Reason: 400 Error from provider (DeepSeek): The reasoning_content in the thinking mode must be passed back to the API.

Anthropic SDK path (sdkMode: anthropic) — from issue #31:

Anthropic API call failed: 400 {
  "error": {
    "message": "The `content[].thinking` in the thinking mode must be passed back to the API.",
    "type": "invalid_request_error",
    "code": "invalid_request_error"
  }
}

Both errors indicate the same replay contract violation. The SDK path is irrelevant — DeepSeek enforces this on both.
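
For the Anthropic path, the same contract can be expressed as a check over content blocks. This is a minimal sketch with simplified block shapes (real thinking blocks carry more fields); missingThinking is a hypothetical helper name, not something in the extension.

```typescript
// Simplified content blocks; only the fields needed for the check are modelled.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

interface AnthropicStyleMessage {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// True when an assistant message has a tool_use block but no thinking block.
function missingThinking(message: AnthropicStyleMessage): boolean {
  if (message.role !== "assistant") {
    return false;
  }
  const hasToolUse = message.content.some((block) => block.type === "tool_use");
  const hasThinking = message.content.some((block) => block.type === "thinking");
  return hasToolUse && !hasThinking;
}
```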


Failure Pattern

The failure does not always surface at the same point, which makes it confusing to debug. The most common sequence is:

  • ✅ Turn 1 (plain chat): works fine — no tool calls yet
  • ✅ Turn 2 (first tool call): works — model emits thinking + tool_call
  • ❌ Turn 3 (tool result → next request): fails — assistant message from turn 2 is in history without reasoning_content/thinking, DeepSeek rejects

Variations observed:

  • File operation completes, diff is not applied, then error
  • Error fires before any operation runs
  • Error fires after a successful complete operation, on the next turn
  • Affects deepseek-v4-pro and deepseek-v4-flash equally
  • Does not affect MiniMax, Qwen, Kimi, GLM on the same provider (those APIs don't enforce this rule)
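
The turn 3 failure can be reproduced offline by inspecting the outgoing history before it is sent. The sketch below uses a simplified OpenAI-style message shape; findReplayViolations is a hypothetical diagnostic helper. It treats only an absent field as a violation, since whether DeepSeek accepts an empty placeholder was not verified here.

```typescript
interface HistoryMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: { id: string }[];
  tool_call_id?: string;
  reasoning_content?: string;
}

// Returns the indices of assistant tool-call messages that lack reasoning_content.
function findReplayViolations(history: HistoryMessage[]): number[] {
  const violations: number[] = [];
  history.forEach((message, index) => {
    const hasToolCalls =
      message.role === "assistant" && (message.tool_calls?.length ?? 0) > 0;
    if (hasToolCalls && message.reasoning_content === undefined) {
      violations.push(index);
    }
  });
  return violations;
}

// History as it might look when the turn 3 request is built.
const turn3History: HistoryMessage[] = [
  { role: "user", content: "Rename foo to bar in utils.ts" },
  { role: "assistant", content: null, tool_calls: [{ id: "call_1" }] }, // thinking dropped
  { role: "tool", content: "file contents", tool_call_id: "call_1" },
];

// Index 1 (the turn 2 assistant message) is flagged.
console.log(findReplayViolations(turn3History));
```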

Root Cause — Three Missing Pieces and One Continuation Gap

1. ModelOverride is missing includeThinking — src/types/sharedTypes.ts

outputThinking is present but includeThinking is not:

// src/types/sharedTypes.ts
export interface ModelOverride {
  id: string;
  model?: string;
  maxInputTokens?: number;
  maxOutputTokens?: number;
  sdkMode?: SdkMode;
  capabilities?: { toolCalling?: boolean; imageInput?: boolean };
  baseUrl?: string;
  customHeader?: Record<string, string>;
  extraBody?: Record<string, unknown>;
  outputThinking?: boolean;
  // ❌ includeThinking is absent — this is the first breakage point
}

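This means any includeThinking: true a user sets in aether.providerOverrides[x].models[] is never read. TypeScript interfaces are erased at compile time, so the value is still present on the parsed settings object at runtime, but because ModelOverride does not declare the field, typed code in applyProviderOverrides() cannot reference it, and the setting is silently ignored.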
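
A short illustration of what happens to the undeclared field, using a trimmed copy of the interface. The variable names are for illustration only.

```typescript
// Trimmed copy of the interface as quoted above (no includeThinking).
interface ModelOverride {
  id: string;
  outputThinking?: boolean;
}

// Settings arrive as parsed JSON, so the runtime object keeps every key the user wrote.
const rawSetting: unknown = JSON.parse(
  '{"id":"deepseek-v4-pro","includeThinking":true}',
);
const typedOverride = rawSetting as ModelOverride;

// The key survives at runtime: Object.keys() reports both "id" and "includeThinking".
const overrideKeys = Object.keys(typedOverride);

// Typed code cannot read it, though:
//   typedOverride.includeThinking
// fails to compile because the property is not declared on ModelOverride.
// Only an explicit cast reaches the value.
const leakedValue = (typedOverride as unknown as Record<string, unknown>)[
  "includeThinking"
];
```

The value is never lost; it is simply unreachable from code that respects the declared type, which is why Fix 1 is a prerequisite for Fix 2.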

2. applyProviderOverrides() never applies includeThinking — src/utils/configManager.ts

Around line 700–720, outputThinking is handled for existing models but includeThinking has no corresponding block:

// Exists for outputThinking:
if (modelOverride.outputThinking !== undefined) {
  existingModel.outputThinking = modelOverride.outputThinking;
}
// ❌ No equivalent block for includeThinking

Around line 739, the new-model branch also omits it:

...(modelOverride.outputThinking !== undefined && {
  outputThinking: modelOverride.outputThinking
})
// ❌ includeThinking not spread here either

3. No automatic includeThinking detection for DeepSeek models — src/utils/globalContextLengthManager.ts

The extension already auto-detects model families for token limits (isDeepSeekModel(), isDeepSeekV4Model(), etc.). There is no equivalent capability flag injection. For any dynamically-fetched model, includeThinking defaults to undefined (falsy), so the thinking replay is always skipped — even when the model provably requires it.

4. Reliability gap specifically on tool-call continuation turns

In this environment, the failure often appears after a successful first tool turn (read/edit planned, then next turn fails). That suggests the continuation/history serialization path is dropping required thinking replay data for DeepSeek assistant tool-call messages.


Proposed Fix

Fix 1 — Add includeThinking to ModelOverride (1 line)

src/types/sharedTypes.ts:

export interface ModelOverride {
  // ... existing fields ...
  outputThinking?: boolean;
  includeThinking?: boolean;  // ← ADD THIS
}

Fix 2 — Apply it in applyProviderOverrides() (6 lines across two branches)

src/utils/configManager.ts — in the existing-model branch, after the outputThinking block:

if (modelOverride.includeThinking !== undefined) {
  existingModel.includeThinking = modelOverride.includeThinking;
}

In the new-model branch, alongside outputThinking:

...(modelOverride.includeThinking !== undefined && {
  includeThinking: modelOverride.includeThinking
})
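
To show both branches working together, here is a reduced, self-contained sketch of the merge. The applyModelOverrides function and the trimmed types are hypothetical; the real applyProviderOverrides() handles many more fields and may be structured differently.

```typescript
interface ModelConfig {
  id: string;
  outputThinking?: boolean;
  includeThinking?: boolean;
}

type ModelOverride = Partial<ModelConfig> & { id: string };

// Hypothetical reduced merge covering only the two thinking flags.
function applyModelOverrides(
  models: ModelConfig[],
  overrides: ModelOverride[],
): ModelConfig[] {
  const result = models.map((model) => ({ ...model }));
  for (const override of overrides) {
    const existing = result.find((model) => model.id === override.id);
    if (existing) {
      // Existing-model branch: copy each flag only when the user set it.
      if (override.outputThinking !== undefined) {
        existing.outputThinking = override.outputThinking;
      }
      if (override.includeThinking !== undefined) {
        existing.includeThinking = override.includeThinking;
      }
    } else {
      // New-model branch: conditional spreads, as in the snippets above.
      result.push({
        id: override.id,
        ...(override.outputThinking !== undefined && {
          outputThinking: override.outputThinking,
        }),
        ...(override.includeThinking !== undefined && {
          includeThinking: override.includeThinking,
        }),
      });
    }
  }
  return result;
}
```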

Fix 3 (Ideal / No-Config Solution) — Auto-detect DeepSeek models and force includeThinking

src/utils/globalContextLengthManager.ts or wherever capability resolution happens:

// When building effective model config for a dynamically-fetched model:
if (isDeepSeekModel(model.id) || isDeepSeekV4Model(model.id)) {
  model.includeThinking = true;
}

This would make DeepSeek work out of the box for all users without any manual settings, which is the right long-term behaviour. Based on the errors above, DeepSeek requires thinking replay whenever the model emits thinking tokens, so there appears to be no scenario where a DeepSeek V4 model should have includeThinking: false.
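
One possible variant, if maintainers prefer not to overwrite an explicit user setting, is to treat auto-detection as a default rather than a forced value. In the sketch below, looksLikeDeepSeek is a hypothetical stand-in for the extension's isDeepSeekModel() (whose actual matching rules are not shown in this report), and withThinkingDefault is an illustrative name.

```typescript
interface ModelConfig {
  id: string;
  includeThinking?: boolean;
}

// Hypothetical stand-in for isDeepSeekModel(); the real matcher may differ.
function looksLikeDeepSeek(modelId: string): boolean {
  return /deepseek/i.test(modelId);
}

// Default includeThinking to true for DeepSeek models, leaving explicit values alone.
function withThinkingDefault(model: ModelConfig): ModelConfig {
  if (model.includeThinking !== undefined || !looksLikeDeepSeek(model.id)) {
    return model;
  }
  return { ...model, includeThinking: true };
}
```

Forcing the flag, as proposed above, is simpler; the default-only form just keeps an escape hatch if a DeepSeek-named model ever turns out not to need replay.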


Workaround Status

An attempted workaround using aether.compatibleModels with explicit DeepSeek model entries and includeThinking: true was not reliable in this environment. It should be considered an experiment, not a production workaround.

For this report, no reliable DeepSeek workaround is claimed.


Why providerOverrides Settings Don't Help

Many users will try:

"aether.providerOverrides": {
  "opencodego": {
    "models": [{ "id": "deepseek-v4-pro", "includeThinking": true }]
  }
}

This appears to be the supported config path but has no effect because of root causes 1 and 2 above. The setting is accepted by VS Code (no schema error), appears to save, and gives no feedback that it was ignored. This is a UX trap that should also be addressed, either by fixing the path or by documenting that it doesn't work.
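
One way to surface ignored settings would be to compare each override entry against the keys the extension actually understands and log the rest. This is only a sketch: the key list is copied from the ModelOverride interface quoted earlier, and findIgnoredKeys is a hypothetical helper, not an existing function.

```typescript
// Keys declared on ModelOverride as quoted in this report (assumed to be complete).
const KNOWN_OVERRIDE_KEYS = new Set<string>([
  "id",
  "model",
  "maxInputTokens",
  "maxOutputTokens",
  "sdkMode",
  "capabilities",
  "baseUrl",
  "customHeader",
  "extraBody",
  "outputThinking",
]);

// Returns the keys in a user-supplied override entry that nothing will read.
function findIgnoredKeys(entry: Record<string, unknown>): string[] {
  return Object.keys(entry).filter((key) => !KNOWN_OVERRIDE_KEYS.has(key));
}

const userEntry = { id: "deepseek-v4-pro", includeThinking: true };
const ignored = findIgnoredKeys(userEntry);
if (ignored.length > 0) {
  console.warn(`Ignored override keys for ${userEntry.id}: ${ignored.join(", ")}`);
}
```

Once Fix 1 lands, includeThinking would be added to the known set and the warning would disappear for this case.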


Environment

Aether version: 0.45.1
VS Code version: 1.116.0
OS: Linux
Provider tested: opencodego (OpenCode Zen Go)
SDK paths affected: Both openai and anthropic (see #31)
Models confirmed failing: deepseek-v4-pro, deepseek-v4-flash
Models confirmed working: minimax-m2.7, minimax-m2.5, qwen3.6-plus, kimi-k2.5
