
Background task hangs indefinitely — no timeout on Codex API response generation #49

@butmasaru


Bug

Background tasks (/codex:rescue, /codex:task) can hang indefinitely when the Codex API stalls during response generation. There is no timeout mechanism to recover from this state.

Observed Behavior

  1. Task starts successfully, Codex reads files and runs commands (visible in log)
  2. After all tool calls complete, Codex enters response generation phase
  3. Log file stops updating — no new entries for 8+ minutes
  4. Task remains in status: "running" forever
  5. result <job-id> returns "No job found" because it only queries completed/failed/cancelled jobs
  6. cancel <job-id> fails because the OS process has already exited (but the job record was never updated)

Root Cause (suspected)

Two issues compound:

1. No response timeout

runTrackedJob (in tracked-jobs.mjs) awaits the Codex API call without a timeout. If the API stalls during generation, the awaited call never settles and the job hangs indefinitely.
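
One possible shape for a fix is to race the awaited call against a timer. The sketch below is an assumption about how that could look, not the plugin's actual code: `withTimeout` is a hypothetical helper and nothing in it comes from tracked-jobs.mjs.

```javascript
// Hypothetical helper; not part of codex-plugin-cc.
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms} ms`));
    }, ms);
  });
  // Whichever settles first wins; always clear the timer afterwards
  // so it does not keep the event loop alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request. A real fix would probably also need to abort the API call or kill the child process, and then mark the job as failed.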

2. Job state not updated on process exit

When the Codex process exits unexpectedly (or the API connection drops), the job record in the state file is never transitioned from running to failed. This leaves orphaned running jobs that:

  • Can't be queried via result (which filters for completed/failed/cancelled only)
  • Can't be cancelled (process already gone)
  • Block task-resume-candidate from working correctly
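
A minimal sketch of how orphaned jobs could be detected, assuming each job record stores the PID of its process. The record shape (`id`, `status`, `pid`) and both function names are hypothetical; the real state file may look different.

```javascript
// Sketch only. The job record shape ({ id, status, pid }) is assumed,
// not taken from the plugin's actual state file.

// Signal 0 delivers nothing; it only checks whether the PID exists.
function isProcessAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it.
    return err.code === "EPERM";
  }
}

// Return a new job list in which "running" jobs whose process is gone
// are marked "failed", so that they become visible to result queries.
function reconcileOrphanedJobs(jobs, isAlive = isProcessAlive) {
  return jobs.map((job) => {
    if (job.status !== "running" || isAlive(job.pid)) return job;
    return {
      ...job,
      status: "failed",
      error: "process exited without updating job state",
    };
  });
}
```

Running a pass like this before `status`, `result`, and `cancel` would let those commands see a consistent state. PIDs can be reused by the OS, so a production version would likely want an additional check such as the process start time.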

Expected Behavior

  • Tasks should have a configurable timeout (e.g., 5 minutes of no activity)
  • If the process exits without updating job state, the companion should detect orphaned jobs and mark them as failed
  • result should provide partial output for timed-out or orphaned jobs
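
The inactivity timeout from the first bullet could be sketched as a watchdog that is re-armed on every sign of life. `createIdleWatchdog` is a hypothetical name, and where `touch()` would be called (for example on each log write) is an assumption about the plugin's internals.

```javascript
// Hypothetical sketch; names are illustrative, not from codex-plugin-cc.
// Calls `onIdle` once if `touch()` is not called for `idleMs` milliseconds.
function createIdleWatchdog(idleMs, onIdle) {
  let timer = null;
  const arm = () => {
    clearTimeout(timer);
    timer = setTimeout(onIdle, idleMs);
  };
  arm();
  return {
    touch: arm,                      // call whenever the job shows activity
    stop: () => clearTimeout(timer), // call when the job finishes normally
  };
}

// Example wiring (5 minutes of no activity, as suggested above):
// const watchdog = createIdleWatchdog(5 * 60 * 1000, () => markJobFailed(jobId));
// `markJobFailed` and `jobId` are placeholders for whatever the plugin uses.
```

Unlike a fixed overall deadline, this lets long tasks keep running as long as they continue to produce output, and only fires when the log goes quiet.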

Environment

  • Windows 11 Pro 10.0.26200
  • Node.js (latest)
  • codex-plugin-cc 1.0.1
  • Claude Code (latest)
  • Codex CLI 0.117.0

Reproduction

  1. Run a /codex:rescue task with a complex, multi-file review prompt
  2. Wait for Codex to complete its tool calls (file reads, greps)
  3. Observe the log file — after the last command completes, no further output appears
  4. status --json shows the task as running with increasing elapsed time
  5. The task never completes

Log excerpt

[06:29:06] Assistant message: In some cases, fixing the cancellation handling in only one place accomplishes nothing...
[06:29:06] Running command: powershell -Command 'Get-Content Goudy....'
[06:29:07] Command completed (exit 0)
[06:29:07] Command completed (exit 0)
[06:29:07] Command completed (exit 0)
[06:29:07] Command completed (exit 0)
— (no further log entries for 8+ minutes, task stays "running") —

Note

This was observed after locally patching the spawn codex ENOENT issue (#32 / #46). The task successfully starts and runs commands, but hangs during the final response generation.
