Summary
runStreamLoop swaps this stream's events channel into the elicitation bridge before the deferred finalizeEventChannel cleanup is registered. Two early-return paths sit in that gap. On those paths the stream exits without ever restoring the bridge or closing the events channel: the observe forwarder goroutine never terminates, the external consumer waits forever on a channel that never closes, and the elicitation bridge is left pointing at a dead stream's channel.
This is pre-existing (it predates #3069 / #3070) but belongs to the same elicitation-channel lifecycle family. Found while reviewing #3070.
The gap
In pkg/runtime/loop.go, runStreamLoop:
prevElicitationCh := r.elicitation.swap(events) // bridge now points at this stream
...
agentTools, err := r.getTools(ctx, a, sessionSpan, sink, true)
if err != nil {
sink.Emit(ErrorWithCode(...))
return // ← early return #1: no finalize
}
...
if lastMsg.Role == chat.MessageRoleUser {
stop, msg, ctxMsgs := r.executeUserPromptSubmitHooks(...)
if stop {
...
r.emitHookDrivenShutdown(ctx, a, sess, msg, sink)
return // ← early return #2: no finalize
}
...
}
...
var streamReason string
defer func() { // ← finalize only registered HERE
r.finalizeEventChannel(ctx, sess, streamReason, prevElicitationCh, events)
}()
Consequences on the early-return paths
events is never closed → observe's forwarder goroutine (for event := range inner) never exits → its out channel is never closed → any consumer doing for range RunStream(...) hangs forever. Run() is such a consumer.
- The elicitation bridge keeps pointing at the dead stream's channel. A later elicitation
send targets a channel nobody drains: with the buffer full it parks forever holding the bridge RLock, which then wedges the next stream's teardown (restoreAndClose blocks on the write lock).
- For nested streams, the parent's elicitation channel is never restored.
Proposed fix
Register the cleanup defer immediately after the swap, so every exit path finalizes:
prevElicitationCh := r.elicitation.swap(events)
var streamReason string
defer func() {
r.finalizeEventChannel(ctx, sess, streamReason, prevElicitationCh, events)
}()
finalizeEventChannel is already safe for these early paths (it emits StreamStopped non-blockingly and closes last, since #3070). The streamReason declaration just moves up with the defer.
Add a regression test: a RunStream whose getTools fails must still close its events channel (consumer's for range terminates) and restore the previous elicitation channel.
Relations
Summary
runStreamLoopswaps this stream's events channel into the elicitation bridge before the deferredfinalizeEventChannelcleanup is registered. Two early-return paths sit in that gap. On those paths the stream exits without ever restoring the bridge or closing the events channel: theobserveforwarder goroutine never terminates, the external consumer waits forever on a channel that never closes, and the elicitation bridge is left pointing at a dead stream's channel.This is pre-existing (it predates #3069 / #3070) but belongs to the same elicitation-channel lifecycle family. Found while reviewing #3070.
The gap
In
pkg/runtime/loop.go,runStreamLoop:Consequences on the early-return paths
eventsis never closed →observe's forwarder goroutine (for event := range inner) never exits → itsoutchannel is never closed → any consumer doingfor range RunStream(...)hangs forever.Run()is such a consumer.sendtargets a channel nobody drains: with the buffer full it parks forever holding the bridgeRLock, which then wedges the next stream's teardown (restoreAndCloseblocks on the write lock).Proposed fix
Register the cleanup defer immediately after the swap, so every exit path finalizes:
finalizeEventChannelis already safe for these early paths (it emitsStreamStoppednon-blockingly and closes last, since #3070). ThestreamReasondeclaration just moves up with the defer.Add a regression test: a
RunStreamwhosegetToolsfails must still close its events channel (consumer'sfor rangeterminates) and restore the previous elicitation channel.Relations