Skip to content

fix(agent): surface error for unsupported slash commands instead of hanging#2345

Open
lost-particles wants to merge 1 commit into
PostHog:mainfrom
lost-particles:fix/unsupported-slash-command-hang-2158
Open

fix(agent): surface error for unsupported slash commands instead of hanging#2345
lost-particles wants to merge 1 commit into
PostHog:mainfrom
lost-particles:fix/unsupported-slash-command-hang-2158

Conversation

@lost-particles
Copy link
Copy Markdown
Contributor

Closes #2158

Summary

  • When the user submits a Claude Code slash command that PostHog Code does not implement (e.g. /plugin install slack), the SDK consumes the input but emits no echo or local_command_output. The prompt loop in ClaudeAcpAgent.prompt then sits on the first session_state_changed → idle notification with promptReplayed still false, breaks out of the switch, and goes back to query.next() forever. The chat shows "Discombobulating…" indefinitely with no way to recover.
  • This change detects the case specifically: when an idle arrives before any prompt replay AND the original user input was a slash command we did not handle locally, emit a clear agent message and return end_turn so the renderer can clear its spinner.
  • Regular prompts keep the existing skip behavior, so background idles (the original purpose of the skip) do not get misclassified as failures.

Behavior

Input Old behavior New behavior
/context, /heapdump, /extra-usage (local-only commands) Works (isLocalOnlyCommand, promptReplayed=true) Unchanged
/plugin install slack and other SDK commands that silently consume Loop hangs forever, "Discombobulating…" Emits agent_message_chunk with "Unsupported slash command: \/plugin`. PostHog Code does not implement this command."and returnsend_turn`
Regular prompt where an unrelated background idle arrives before replay Skipped, waits for prompt replay Unchanged

Test plan

  • pnpm --filter @posthog/agent test src/adapters/claude/claude-agent.slash-command.test.ts — 2 new tests:
    • Slash command + early idle → emits the unsupported-command chunk and returns end_turn.
    • Regular prompt + unrelated early idle → does NOT emit the chunk; falls through to the existing Session did not end in result path.
  • pnpm --filter @posthog/agent test — all 446 tests pass, including the existing claude-agent.refresh.test.ts.
  • pnpm --filter @posthog/agent typecheck and biome check clean on changed files.
  • Manual: open a new task in PostHog Code, type /plugin install slack, submit. The chat should show the "Unsupported slash command" message and stay responsive instead of spinning.

Notes

  • The detection reuses the existing commandMatch regex (/^(\/\S+)/) that the function already computes at line 350, so the diff is small and the matching rules stay consistent with the LOCAL_ONLY_COMMANDS check.
  • The error is emitted as a regular agent_message_chunk (the same channel local-only commands use to forward their results), so it renders inline in the conversation rather than as a toast or error banner — matching the reporter's "surface a clear error in the chat" expectation.
  • This does not enumerate which slash commands PostHog Code supports vs not (that depends on the SDK's supportedCommands() and on what each command needs at runtime). It just catches the "SDK swallowed it and produced nothing" failure mode, which is the actual user-visible bug.

Closes PostHog#2158

When the user submits a Claude Code slash command that PostHog Code does
not implement (e.g. `/plugin install slack`), the SDK consumes the input
but produces no echo or `local_command_output`. The prompt loop then sits
on the very first `session_state_changed → idle` notification with
`promptReplayed` still false, breaks out of the switch, and goes back to
`query.next()` forever — the chat shows "Discombobulating…" indefinitely
with no way to recover.

Detect this case specifically: when an idle arrives before any prompt
replay AND the user input was a slash command we did not handle locally,
emit a clear agent message ("Unsupported slash command: `/foo`…") and
return `end_turn`. Regular prompts keep the existing skip behavior so
background idles do not get misclassified as failures.
@lost-particles
Copy link
Copy Markdown
Contributor Author

Hi maintainers, this PR targets #2158. Whenever convenient, could a maintainer take a look and let me know if the failure-mode detection (idle-before-replay + original input is a slash command we did not handle locally) is the right place to draw the line? I considered also validating slash commands in the renderer against the availableCommands list before sending, but /plugin is in the SDK reported command set so that approach would not have caught the reported case. Happy to iterate on wording, scope, or placement based on your feedback. Thanks!

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 25, 2026

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/agent/src/adapters/claude/claude-agent.ts:478
The `!isLocalOnlyCommand` guard is always `true` inside this branch. When `isLocalOnlyCommand` is `true`, `promptReplayed` is also set to `true` at line 353, so the enclosing `if (!promptReplayed)` block is never entered with a local-only command. The check is superfluous (simplicity rule: no superfluous parts), and a reader has to trace back to line 353 to convince themselves it's harmless.

```suggestion
                if (commandMatch) {
```

### Issue 2 of 2
packages/agent/src/adapters/claude/claude-agent.slash-command.test.ts:79-162
**Non-parameterized tests** — the two cases share identical setup (make agent, install session, send idle, complete) and differ only in the prompt text and expected outcome. The project convention is to always prefer parameterised tests. Both scenarios could be driven by a single `it.each` table that supplies `{ prompt, expectedStopReason, expectsErrorChunk }` (or similar), which would also make it trivial to add future slash-command variants without duplicating the fixture plumbing.

Reviews (1): Last reviewed commit: "fix(agent): surface error for slash comm..." | Re-trigger Greptile

// and produced no output (e.g. /plugin in a non-interactive
// context). Without this branch we would loop forever waiting
// for an echo that never comes; surface a clear error instead.
if (commandMatch && !isLocalOnlyCommand) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The !isLocalOnlyCommand guard is always true inside this branch. When isLocalOnlyCommand is true, promptReplayed is also set to true at line 353, so the enclosing if (!promptReplayed) block is never entered with a local-only command. The check is superfluous (simplicity rule: no superfluous parts), and a reader has to trace back to line 353 to convince themselves it's harmless.

Suggested change
if (commandMatch && !isLocalOnlyCommand) {
if (commandMatch) {
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/adapters/claude/claude-agent.ts
Line: 478

Comment:
The `!isLocalOnlyCommand` guard is always `true` inside this branch. When `isLocalOnlyCommand` is `true`, `promptReplayed` is also set to `true` at line 353, so the enclosing `if (!promptReplayed)` block is never entered with a local-only command. The check is superfluous (simplicity rule: no superfluous parts), and a reader has to trace back to line 353 to convince themselves it's harmless.

```suggestion
                if (commandMatch) {
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +79 to +162
it("emits a clear error and ends the turn when SDK silently consumes an unsupported slash command", async () => {
const { agent, client } = makeAgent();
const query = installFakeSession(agent, "s-slash");

const promptPromise = agent.prompt({
sessionId: "s-slash",
prompt: [{ type: "text", text: "/plugin install slack" }],
});

// Let the prompt loop start awaiting the first SDK message.
await new Promise((resolve) => setImmediate(resolve));

// Simulate the SDK going idle without echoing the user message back
// (the failure mode reported in #2158).
const idleMessage: SDKMessage = {
type: "system",
subtype: "session_state_changed",
state: "idle",
} as unknown as SDKMessage;
query._mockHelpers.sendMessage(idleMessage);
query._mockHelpers.complete();

const result = await promptPromise;

expect(result.stopReason).toBe("end_turn");

const errorChunk = client.sessionUpdate.mock.calls.find(
([call]) =>
(call as { update?: { sessionUpdate?: string } }).update
?.sessionUpdate === "agent_message_chunk",
);
expect(errorChunk).toBeDefined();
const errorText =
(
errorChunk?.[0] as {
update: { content: { text: string } };
}
).update.content.text ?? "";
expect(errorText).toContain("/plugin");
expect(errorText.toLowerCase()).toContain("unsupported");
});

it("still skips a pre-prompt idle for non-slash-command prompts", async () => {
const { agent, client } = makeAgent();
const query = installFakeSession(agent, "s-regular");

const promptPromise = agent.prompt({
sessionId: "s-regular",
prompt: [{ type: "text", text: "hello" }],
});

await new Promise((resolve) => setImmediate(resolve));

// First an unrelated idle (e.g. from a background task) before the prompt
// is replayed — should be skipped, not surfaced as an error.
query._mockHelpers.sendMessage({
type: "system",
subtype: "session_state_changed",
state: "idle",
} as unknown as SDKMessage);

// Then the SDK is done with no further output. The existing loop exits via
// the "Session did not end in result" path.
query._mockHelpers.complete();

await expect(promptPromise).rejects.toThrow(
/Session did not end in result/,
);

// Most importantly: no agent_message_chunk with an "unsupported" error
// was emitted for the regular prompt.
const errorChunk = client.sessionUpdate.mock.calls.find(([call]) => {
const update = (
call as {
update?: { sessionUpdate?: string; content?: { text?: string } };
}
).update;
return (
update?.sessionUpdate === "agent_message_chunk" &&
update?.content?.text?.toLowerCase().includes("unsupported")
);
});
expect(errorChunk).toBeUndefined();
});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Non-parameterized tests — the two cases share identical setup (make agent, install session, send idle, complete) and differ only in the prompt text and expected outcome. The project convention is to always prefer parameterised tests. Both scenarios could be driven by a single it.each table that supplies { prompt, expectedStopReason, expectsErrorChunk } (or similar), which would also make it trivial to add future slash-command variants without duplicating the fixture plumbing.

Context Used: Do not attempt to comment on incorrect alphabetica... (source)

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/agent/src/adapters/claude/claude-agent.slash-command.test.ts
Line: 79-162

Comment:
**Non-parameterized tests** — the two cases share identical setup (make agent, install session, send idle, complete) and differ only in the prompt text and expected outcome. The project convention is to always prefer parameterised tests. Both scenarios could be driven by a single `it.each` table that supplies `{ prompt, expectedStopReason, expectsErrorChunk }` (or similar), which would also make it trivial to add future slash-command variants without duplicating the fixture plumbing.

**Context Used:** Do not attempt to comment on incorrect alphabetica... ([source](https://app.greptile.com/review/custom-context?memory=instruction-0))

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unsupported slash commands hang the chat instead of erroring

1 participant