fix(agent): surface error for unsupported slash commands instead of hanging#2345
fix(agent): surface error for unsupported slash commands instead of hanging#2345lost-particles wants to merge 1 commit into
Conversation
Closes PostHog#2158 When the user submits a Claude Code slash command that PostHog Code does not implement (e.g. `/plugin install slack`), the SDK consumes the input but produces no echo or `local_command_output`. The prompt loop then sits on the very first `session_state_changed → idle` notification with `promptReplayed` still false, breaks out of the switch, and goes back to `query.next()` forever — the chat shows "Discombobulating…" indefinitely with no way to recover. Detect this case specifically: when an idle arrives before any prompt replay AND the user input was a slash command we did not handle locally, emit a clear agent message ("Unsupported slash command: `/foo`…") and return `end_turn`. Regular prompts keep the existing skip behavior so background idles do not get misclassified as failures.
|
Hi maintainers, this PR targets #2158. Whenever convenient, could a maintainer take a look and let me know if the failure-mode detection (idle-before-replay + original input is a slash command we did not handle locally) is the right place to draw the line? I considered also validating slash commands in the renderer against the |
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
packages/agent/src/adapters/claude/claude-agent.ts:478
The `!isLocalOnlyCommand` guard is always `true` inside this branch. When `isLocalOnlyCommand` is `true`, `promptReplayed` is also set to `true` at line 353, so the enclosing `if (!promptReplayed)` block is never entered with a local-only command. The check is superfluous (simplicity rule: no superfluous parts), and a reader has to trace back to line 353 to convince themselves it's harmless.
```suggestion
if (commandMatch) {
```
### Issue 2 of 2
packages/agent/src/adapters/claude/claude-agent.slash-command.test.ts:79-162
**Non-parameterized tests** — the two cases share identical setup (make agent, install session, send idle, complete) and differ only in the prompt text and expected outcome. The project convention is to always prefer parameterised tests. Both scenarios could be driven by a single `it.each` table that supplies `{ prompt, expectedStopReason, expectsErrorChunk }` (or similar), which would also make it trivial to add future slash-command variants without duplicating the fixture plumbing.
Reviews (1): Last reviewed commit: "fix(agent): surface error for slash comm..." | Re-trigger Greptile |
| // and produced no output (e.g. /plugin in a non-interactive | ||
| // context). Without this branch we would loop forever waiting | ||
| // for an echo that never comes; surface a clear error instead. | ||
| if (commandMatch && !isLocalOnlyCommand) { |
There was a problem hiding this comment.
The
!isLocalOnlyCommand guard is always true inside this branch. When isLocalOnlyCommand is true, promptReplayed is also set to true at line 353, so the enclosing if (!promptReplayed) block is never entered with a local-only command. The check is superfluous (simplicity rule: no superfluous parts), and a reader has to trace back to line 353 to convince themselves it's harmless.
| if (commandMatch && !isLocalOnlyCommand) { | |
| if (commandMatch) { |
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/adapters/claude/claude-agent.ts
Line: 478
Comment:
The `!isLocalOnlyCommand` guard is always `true` inside this branch. When `isLocalOnlyCommand` is `true`, `promptReplayed` is also set to `true` at line 353, so the enclosing `if (!promptReplayed)` block is never entered with a local-only command. The check is superfluous (simplicity rule: no superfluous parts), and a reader has to trace back to line 353 to convince themselves it's harmless.
```suggestion
if (commandMatch) {
```
How can I resolve this? If you propose a fix, please make it concise.| it("emits a clear error and ends the turn when SDK silently consumes an unsupported slash command", async () => { | ||
| const { agent, client } = makeAgent(); | ||
| const query = installFakeSession(agent, "s-slash"); | ||
|
|
||
| const promptPromise = agent.prompt({ | ||
| sessionId: "s-slash", | ||
| prompt: [{ type: "text", text: "/plugin install slack" }], | ||
| }); | ||
|
|
||
| // Let the prompt loop start awaiting the first SDK message. | ||
| await new Promise((resolve) => setImmediate(resolve)); | ||
|
|
||
| // Simulate the SDK going idle without echoing the user message back | ||
| // (the failure mode reported in #2158). | ||
| const idleMessage: SDKMessage = { | ||
| type: "system", | ||
| subtype: "session_state_changed", | ||
| state: "idle", | ||
| } as unknown as SDKMessage; | ||
| query._mockHelpers.sendMessage(idleMessage); | ||
| query._mockHelpers.complete(); | ||
|
|
||
| const result = await promptPromise; | ||
|
|
||
| expect(result.stopReason).toBe("end_turn"); | ||
|
|
||
| const errorChunk = client.sessionUpdate.mock.calls.find( | ||
| ([call]) => | ||
| (call as { update?: { sessionUpdate?: string } }).update | ||
| ?.sessionUpdate === "agent_message_chunk", | ||
| ); | ||
| expect(errorChunk).toBeDefined(); | ||
| const errorText = | ||
| ( | ||
| errorChunk?.[0] as { | ||
| update: { content: { text: string } }; | ||
| } | ||
| ).update.content.text ?? ""; | ||
| expect(errorText).toContain("/plugin"); | ||
| expect(errorText.toLowerCase()).toContain("unsupported"); | ||
| }); | ||
|
|
||
| it("still skips a pre-prompt idle for non-slash-command prompts", async () => { | ||
| const { agent, client } = makeAgent(); | ||
| const query = installFakeSession(agent, "s-regular"); | ||
|
|
||
| const promptPromise = agent.prompt({ | ||
| sessionId: "s-regular", | ||
| prompt: [{ type: "text", text: "hello" }], | ||
| }); | ||
|
|
||
| await new Promise((resolve) => setImmediate(resolve)); | ||
|
|
||
| // First an unrelated idle (e.g. from a background task) before the prompt | ||
| // is replayed — should be skipped, not surfaced as an error. | ||
| query._mockHelpers.sendMessage({ | ||
| type: "system", | ||
| subtype: "session_state_changed", | ||
| state: "idle", | ||
| } as unknown as SDKMessage); | ||
|
|
||
| // Then the SDK is done with no further output. The existing loop exits via | ||
| // the "Session did not end in result" path. | ||
| query._mockHelpers.complete(); | ||
|
|
||
| await expect(promptPromise).rejects.toThrow( | ||
| /Session did not end in result/, | ||
| ); | ||
|
|
||
| // Most importantly: no agent_message_chunk with an "unsupported" error | ||
| // was emitted for the regular prompt. | ||
| const errorChunk = client.sessionUpdate.mock.calls.find(([call]) => { | ||
| const update = ( | ||
| call as { | ||
| update?: { sessionUpdate?: string; content?: { text?: string } }; | ||
| } | ||
| ).update; | ||
| return ( | ||
| update?.sessionUpdate === "agent_message_chunk" && | ||
| update?.content?.text?.toLowerCase().includes("unsupported") | ||
| ); | ||
| }); | ||
| expect(errorChunk).toBeUndefined(); | ||
| }); |
There was a problem hiding this comment.
Non-parameterized tests — the two cases share identical setup (make agent, install session, send idle, complete) and differ only in the prompt text and expected outcome. The project convention is to always prefer parameterised tests. Both scenarios could be driven by a single
it.each table that supplies { prompt, expectedStopReason, expectsErrorChunk } (or similar), which would also make it trivial to add future slash-command variants without duplicating the fixture plumbing.
Context Used: Do not attempt to comment on incorrect alphabetica... (source)
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/adapters/claude/claude-agent.slash-command.test.ts
Line: 79-162
Comment:
**Non-parameterized tests** — the two cases share identical setup (make agent, install session, send idle, complete) and differ only in the prompt text and expected outcome. The project convention is to always prefer parameterised tests. Both scenarios could be driven by a single `it.each` table that supplies `{ prompt, expectedStopReason, expectsErrorChunk }` (or similar), which would also make it trivial to add future slash-command variants without duplicating the fixture plumbing.
**Context Used:** Do not attempt to comment on incorrect alphabetica... ([source](https://app.greptile.com/review/custom-context?memory=instruction-0))
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Closes #2158
Summary
/plugin install slack), the SDK consumes the input but emits no echo orlocal_command_output. The prompt loop inClaudeAcpAgent.promptthen sits on the firstsession_state_changed → idlenotification withpromptReplayedstill false, breaks out of the switch, and goes back toquery.next()forever. The chat shows "Discombobulating…" indefinitely with no way to recover.end_turnso the renderer can clear its spinner.Behavior
/context,/heapdump,/extra-usage(local-only commands)isLocalOnlyCommand,promptReplayed=true)/plugin install slackand other SDK commands that silently consumeagent_message_chunkwith"Unsupported slash command: \/plugin`. PostHog Code does not implement this command."and returnsend_turn`idlearrives before replayTest plan
pnpm --filter @posthog/agent test src/adapters/claude/claude-agent.slash-command.test.ts— 2 new tests:idle→ emits the unsupported-command chunk and returnsend_turn.idle→ does NOT emit the chunk; falls through to the existingSession did not end in resultpath.pnpm --filter @posthog/agent test— all 446 tests pass, including the existingclaude-agent.refresh.test.ts.pnpm --filter @posthog/agent typecheckandbiome checkclean on changed files./plugin install slack, submit. The chat should show the "Unsupported slash command" message and stay responsive instead of spinning.Notes
commandMatchregex (/^(\/\S+)/) that the function already computes at line 350, so the diff is small and the matching rules stay consistent with theLOCAL_ONLY_COMMANDScheck.agent_message_chunk(the same channel local-only commands use to forward their results), so it renders inline in the conversation rather than as a toast or error banner — matching the reporter's "surface a clear error in the chat" expectation.supportedCommands()and on what each command needs at runtime). It just catches the "SDK swallowed it and produced nothing" failure mode, which is the actual user-visible bug.