Guard spawn agent activity labels #454
|
@itkonen Can you give me an example from eca stderr or something, so I can better understand how Codex behaves? |
|
@ericdallo There's actually nothing in the stderr. The main issue is with the user-facing labels printed out in the chat. Sometimes the label is just gibberish or exotic unicode symbols, but sometimes it's really, really long. Like here in this subagent call the label was over 13000 characters, filling multiple pages on my screen:
A secondary issue is that this long label might get passed forward in later chat turns, filling up the LLM context window. My guess is that Codex uses a small specialized model to generate the tool calls that ECA receives, and that model knows only how to write the tool call payload but easily fails at writing sensible action labels. That model might also get stuck in a loop, repeating a word or a phrase over and over again. So this PR implements minimal guardrails that limit the length of the label and prevent it from bloating further model requests. The user will still see the gibberish, but at least it will not harm the workflow. |
Normalize model-generated spawn_agent activity labels before they are displayed, stored, or replayed so pathological labels do not leak into UI or downstream context. 🤖 Generated with [eca](https://eca.dev) Co-Authored-By: eca-agent <git@eca.dev>
The PR check failed on macOS while the spawn handler had already reached completion output. Keep the regression coverage but allow more time for the end-to-end chat/prompt path and include state details if it times out. 🤖 Generated with [eca](https://eca.dev) Co-Authored-By: eca-agent <git@eca.dev>
7649d1d to 0c85a4a
|
Ah got it, yeah that's crazy! but makes sense |
|
@ericdallo I think that would only fix the UI problem. My understanding is that the misbehaving activity label would still get stored in chat history, which could later be replayed back to the Responses API as part of the tool_call arguments, filling up the model context. But I cannot read the code well enough to say for sure. |
|
Hmm, yes, I thought it would be OK for the UI, but since this goes to the LLM again and could be huge for some reason, I think it's OK to strip it |
|
I absolutely agree that this is a problem. |
|
@zikajk That's a valid point, but the weird thing is that the actual arguments in these tool calls might be completely sane, even if the activity label is gibberish. I wouldn't want to break the workflow because of the silly labels. But it might be worth thinking about whether something could be done to prevent the gibberish, like giving more context for the activity label. |
Sometimes Codex provides sub-agent activity labels that are extremely long, nonsensical, or repeat the same text over and over. This bloats the dialogue and can also pollute the tool-call history that later model turns may see. This PR adds a small guardrail to prevent that issue.
The summary below was generated by AI.
Summary
Normalize spawn_agent activity arguments by trimming, collapsing whitespace, truncating overly long labels, and omitting blank or non-string labels.
Verification
clojure -M:test --focus eca.features.tools.agent-test
clj-kondo --lint src/eca/features/tools/agent.clj src/eca/features/chat/tool_calls.clj test/eca/features/tools/agent_test.clj
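For readers who want a concrete picture of the guardrail described in the summary, here is a minimal sketch in Python. It is an illustration only, not the Clojure code from this PR: the function names `normalize_activity` and `sanitize_arguments`, the `task` key, and the 120-character cap are all assumptions made for the example, since the discussion does not state the actual limit or helper names.

```python
import re
from typing import Any, Optional

# Hypothetical cap; the PR discussion does not say what limit is used.
MAX_LABEL_CHARS = 120


def normalize_activity(value: Any, max_chars: int = MAX_LABEL_CHARS) -> Optional[str]:
    """Return a cleaned label, or None when the label should be omitted."""
    if not isinstance(value, str):
        return None  # non-string labels are omitted
    # Collapse every whitespace run to one space, then trim the ends.
    collapsed = re.sub(r"\s+", " ", value).strip()
    if not collapsed:
        return None  # blank labels are omitted
    if len(collapsed) > max_chars:
        # Reserve three characters for the "..." truncation marker.
        keep = max(0, max_chars - 3)
        return collapsed[:keep].rstrip() + "..."
    return collapsed


def sanitize_arguments(arguments: dict) -> dict:
    """Copy tool-call arguments, normalizing or dropping the activity label.

    All other arguments pass through untouched, so a gibberish label
    never blocks an otherwise valid tool call.
    """
    cleaned = {k: v for k, v in arguments.items() if k != "activity"}
    label = normalize_activity(arguments.get("activity"))
    if label is not None:
        cleaned["activity"] = label
    return cleaned
```

Applying something like `sanitize_arguments` before the arguments are displayed, stored, or replayed would keep a runaway label out of both the chat UI and later model requests, while leaving the real tool-call payload intact.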