Bug
Background tasks (/codex:rescue, /codex:task) can hang indefinitely when the Codex API stalls during response generation. There is no timeout mechanism to recover from this state.
Observed Behavior
- Task starts successfully, Codex reads files and runs commands (visible in log)
- After all tool calls complete, Codex enters response generation phase
- Log file stops updating — no new entries for 8+ minutes
- Task remains in status: "running" forever
- result <job-id> returns "No job found" because it only queries completed/failed/cancelled jobs
- cancel <job-id> fails because the OS process has already exited (but the job record was never updated)
Root Cause (suspected)
Two issues compound:
1. No response timeout
In tracked-jobs.mjs, runTrackedJob awaits the Codex API call without a timeout. If the API stalls during generation, the job hangs indefinitely.
2. Job state not updated on process exit
When the Codex process exits unexpectedly (or the API connection drops), the job record in the state file is never transitioned from running to failed. This leaves orphaned running jobs that:
- Can't be queried via result (which filters for completed/failed/cancelled only)
- Can't be cancelled (process already gone)
- Block task-resume-candidate from working correctly
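One possible way to detect the orphaned jobs described in point 2 is to check whether the recorded process still exists before trusting a running status. The sketch below is illustrative only: the job record shape ({ id, status, pid }) and the function names isProcessAlive and reapOrphanedJobs are assumptions, not the plugin's actual code. It relies on Node's process.kill(pid, 0), which tests for the existence of a process without signalling it.

```javascript
// Hypothetical sketch: the job record shape and helper names are assumptions.

// Returns true if a process with this pid appears to exist.
function isProcessAlive(pid) {
  try {
    process.kill(pid, 0); // signal 0: existence check only, nothing is killed
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    return err.code === "EPERM";
  }
}

// Returns a new job list where "running" jobs whose process is gone
// are transitioned to "failed". Other jobs are returned unchanged.
function reapOrphanedJobs(jobs, isAlive = isProcessAlive) {
  return jobs.map((job) => {
    if (job.status !== "running") return job;
    if (job.pid != null && isAlive(job.pid)) return job;
    return {
      ...job,
      status: "failed",
      error: "process exited without updating job state",
    };
  });
}
```

Running a pass like this before status, result, or cancel read the state file would let those commands see a failed job instead of a permanently running one.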
Expected Behavior
- Tasks should have a configurable timeout (e.g., 5 minutes of no activity)
- If the process exits without updating job state, the companion should detect orphaned jobs and mark them as failed
- result should provide partial output for timed-out or orphaned jobs
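The inactivity timeout requested above could be sketched as a watchdog that is reset on every sign of progress (for example, each log entry). This is a minimal sketch under stated assumptions: withInactivityTimeout and its touch callback are hypothetical names, and how runTrackedJob would actually report activity is not known from this report.

```javascript
// Hypothetical sketch: names and wiring are assumptions, not the plugin's API.

// Runs `run(touch)` and rejects if `touch` is not called for `idleMs`
// milliseconds. `run` should call `touch()` whenever the job makes progress.
function withInactivityTimeout(run, idleMs) {
  return new Promise((resolve, reject) => {
    let timer;
    let settled = false;

    // Settle exactly once and stop the watchdog.
    const finish = (fn, value) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      fn(value);
    };

    // Restart the idle countdown.
    const touch = () => {
      if (settled) return;
      clearTimeout(timer);
      timer = setTimeout(
        () => finish(reject, new Error(`no activity for ${idleMs} ms`)),
        idleMs
      );
    };

    touch(); // start counting immediately
    Promise.resolve()
      .then(() => run(touch))
      .then(
        (value) => finish(resolve, value),
        (err) => finish(reject, err)
      );
  });
}
```

A caller would catch the rejection, mark the job as failed, and keep whatever output was logged so far so that result can return it. Note that this only stops waiting; cancelling the underlying request or process would need to be handled separately.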
Environment
- Windows 11 Pro 10.0.26200
- Node.js (latest)
- codex-plugin-cc 1.0.1
- Claude Code (latest)
- Codex CLI 0.117.0
Reproduction
- Run a /codex:rescue task with a complex, multi-file review prompt
- Wait for Codex to complete its tool calls (file reads, greps)
- Observe the log file — after the last command completes, no further output appears
- status --json shows the task as running with increasing elapsed time
- The task never completes
Log excerpt
[06:29:06] Assistant message: Fixing the cancellation handling in only one place may not be meaningful in some cases...
[06:29:06] Running command: powershell -Command 'Get-Content Goudy....'
[06:29:07] Command completed (exit 0)
[06:29:07] Command completed (exit 0)
[06:29:07] Command completed (exit 0)
[06:29:07] Command completed (exit 0)
— (no further log entries for 8+ minutes, task stays "running") —
Note
This was observed after locally patching the spawn codex ENOENT issue (#32 / #46). The task successfully starts and runs commands, but hangs during the final response generation.