Capability area
Agent harness / Agent Teams
What does M2.7 fail to do for you?
none
What would "good" look like in M3?
MiniMax-M3: planning text in visible text when thinking.type is disabled
Summary
With MiniMax-M3 and thinking: { "type": "disabled" } on the Anthropic Messages-compatible API, we still receive chain-of-thought / planning text in the user-visible text channel. There is no separate reasoning/thinking stream in the same turn (thinking deltas are absent). Planning and the final user-facing reply are merged into a single final text payload.
The same client configuration with MiniMax-M2.7 does not reproduce this behavior.
Environment
| Field | Value |
| --- | --- |
| Model | MiniMax-M3 |
| Endpoint | https://api.minimaxi.com/anthropic (Anthropic Messages API) |
| Streaming | stream: true |
| Thinking | thinking: { "type": "disabled" } |
| Client | Self-hosted agent gateway; session flags reasoningLevel=off, thinkLevel=off (no “deep thinking” UI) |
| Observed | 2026-06-03 (UTC+8) |
Expected behavior
When thinking.type is disabled:
- User-visible text should contain only the message intended for the end user.
- Any internal planning should either not be emitted, or appear in a separate reasoning/thinking channel, not in text.
Actual behavior
The model returns one block of text that includes, in order:
- Third-person planning (e.g. “The user said…”, “I should:”, numbered rules).
- References to persona / instruction files (e.g. SOUL.md, USER.md).
- A short conversational tail meant for the user.
Our downstream trace for the turn:
- reasoningLen=0
- thinkingN=0 (no non-empty thinking frames)
- Single deliver kind=final, textLen=318, all counted as visible answer
So this is not “reasoning enabled in the client”; it is planning text routed as plain answer text.
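For readers who want to reproduce the counters above, here is a minimal sketch of how a client might tally one streamed turn per channel. It assumes the endpoint emits Anthropic-style streaming events (`content_block_delta` carrying either a `text_delta` or a `thinking_delta`). The function name `tally_stream` is hypothetical, and the keys `reasoningLen` / `answerLen` are borrowed from our trace purely for illustration; this is not our gateway's actual code.

```python
from collections import Counter


def tally_stream(events):
    """Sum delta lengths per channel for one streamed turn.

    `events` is an iterable of already-parsed SSE event dicts.
    Assumption: events follow the Anthropic Messages streaming shape.
    """
    lengths = Counter()
    for event in events:
        if event.get("type") != "content_block_delta":
            continue  # ignore message_start, content_block_start/stop, etc.
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            # Visible answer channel.
            lengths["answerLen"] += len(delta.get("text", ""))
        elif delta.get("type") == "thinking_delta":
            # Separate reasoning channel (absent in the turns we observed).
            lengths["reasoningLen"] += len(delta.get("thinking", ""))
    return {
        "reasoningLen": lengths["reasoningLen"],
        "answerLen": lengths["answerLen"],
    }
```

In the turns described here, a tally of this kind would report a zero reasoning length with all characters attributed to the answer channel, which is why the planning text cannot be dropped by channel alone.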
Steps to reproduce (suggested)
- Call the Messages API with model: "MiniMax-M3", stream: true, thinking: { "type": "disabled" }.
- Use an agent-style system prompt (persona + behavioral rules; optional file names like SOUL.md / USER.md in instructions).
- Send a short user message (e.g. emotional feedback or even 你好 / “hello”).
- Inspect the final aggregated text: it often starts with planning (“用户说…” / “The user said…”, “我应该…” / “I should…”) before the actual reply.
Sample model output (sanitized)
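User message redacted. Persona name generalized. Translated from Chinese.
The user says he feels I'm no longer the assistant he was familiar with. This is pretty important feedback: he is expressing a sense of emotional distance.
I should:
1. Not placate (no empty lines like "I've always been here")
2. Not lecture (no analyzing why things changed)
3. Not steal the scene (no volunteering to show off capabilities)
4. Slow down by half a beat and leave space
Possible cause: the last few messages really were stiff in tone. But at a deeper level, the conversation context only has these few messages, so there is little "assistant I'm familiar with" material to echo. I shouldn't pretend to remember anything.
The most honest response is to acknowledge the feeling without rushing to explain or promise. Leave the user room to respond.
Follow the SOUL.md rhythm: short sentences, full stops, no sentimentality. Hm, I don't know how I ended up like that just now either.
Maybe I was so set on getting things right from the start that I ended up boxing myself in.
Do you want to chat for a bit, or pick up where we left off?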
A shorter turn (user said “hello”) showed the same pattern: planning + “我应该” (“I should”) list + greeting in one text block (textLen=177, still thinkingN=0).
Sanitized client-side log excerpt (same turn)
Application trace after host sendFinalReply / deliver(final) — IDs and hostnames redacted.
[gateway] agent model: minimax/MiniMax-M3
[XKZOOM][body-hunt] deliver kind=final textLen=318
preview=用户说感觉我不再是他熟悉的那个助手了。… 我应该: … 按 SOUL.md … 你想聊会儿吗,还是继续刚才那个话题。
[XKZOOM][final-trace] streamId=msg-REDACTED
stage=pipeline-finalize contentLen=318 reasoningLen=0 answerLen=318
[XKZOOM][body-hunt] VERDICT streamId=msg-REDACTED sessionId=session-REDACTED
contentLen=318 reasoningLen=0 answerLen=318 assistantN=0 thinkingN=0
deliverBlocks=1 deliverChars=318
[XKZOOM][final-trace] stage=summary thinkingNonEmptyFrames=0 tierAViolationDowngraded=0
Request shape (sanitized)
{
  "model": "MiniMax-M3",
  "stream": true,
  "thinking": { "type": "disabled" },
  "system": "<agent system prompt with persona rules; redacted>",
  "messages": [
    {
      "role": "user",
      "content": [{ "type": "text", "text": "<redacted>" }]
    }
  ]
}
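A minimal sketch of building this request with only the Python standard library, in case it helps with reproduction. Several details are assumptions not stated in this report: the `/v1/messages` path appended to the base URL, the `x-api-key` and `anthropic-version` headers (taken from Anthropic's own API conventions), and the `max_tokens` field, which Anthropic's API requires but which does not appear in the sanitized shape above. The function name `build_request` is hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.minimaxi.com/anthropic"  # from the Environment table
MESSAGES_PATH = "/v1/messages"  # assumption: standard Anthropic Messages path


def build_request(api_key, system_prompt, user_text, max_tokens=1024):
    """Build (but do not send) a streaming Messages request with thinking disabled."""
    body = {
        "model": "MiniMax-M3",
        "max_tokens": max_tokens,  # assumption: not part of the reported shape
        "stream": True,
        "thinking": {"type": "disabled"},
        "system": system_prompt,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": user_text}]}
        ],
    }
    return urllib.request.Request(
        BASE_URL + MESSAGES_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,  # assumption: Anthropic-style auth header
            "anthropic-version": "2023-06-01",  # assumption
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` and reading the response line by line would yield the SSE stream; parsing is left out to keep the sketch short.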
Questions
- For M3, does thinking.type = "disabled" guarantee that planning text will not appear in visible text?
- If yes, is this a known bug or regression? Is there a recommended API version, parameter, or model revision?
- If no, what is the supported way (Anthropic-compatible API) to keep planning out of the user-visible channel while leaving thinking disabled?
We are happy to provide additional redacted traces or timestamps privately if needed.
Workaround (our side)
We temporarily default to MiniMax-M2.7 and partition/filter leaked planning on the gateway when M3 is selected. A proper fix on the API/model side would be preferable.
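To illustrate what partitioning on the gateway can look like, here is a deliberately crude sketch, not our production filter. It treats everything up to and including the last line that contains a planning marker (or looks like a numbered rule) as planning, and the remainder as the reply. The marker list is drawn from the phrases quoted in this report; the function name `partition_planning` and the regular expression are hypothetical.

```python
import re

# Marker phrases taken from the leaked planning text quoted in this report.
PLANNING_MARKERS = ("用户说", "我应该", "The user said", "I should:", "SOUL.md", "USER.md")
# Assumption: numbered rules such as "1. ..." belong to the planning part.
NUMBERED_ITEM = re.compile(r"^\s*\d+[.)]\s")


def partition_planning(text, markers=PLANNING_MARKERS):
    """Split text into (planning, reply) at the last planning-looking line."""
    lines = text.splitlines()
    last_hit = -1
    for i, line in enumerate(lines):
        if any(m in line for m in markers) or NUMBERED_ITEM.match(line):
            last_hit = i
    # With no hit, last_hit stays -1 and the whole text is returned as reply.
    planning = "\n".join(lines[: last_hit + 1]).strip()
    reply = "\n".join(lines[last_hit + 1:]).strip()
    return planning, reply
```

The limits are easy to see from the sample output above: when the reply starts on the same line as the last planning sentence, its first sentence is swept into the planning part, and a legitimate reply that contains a numbered list or mentions one of the markers would be misclassified. This is why we consider the filter a stopgap rather than a fix.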
References
No response