23 commits
c4f7475
feat: add idle detection data types, marker constant, and duration pa…
malpou Mar 22, 2026
470d0ba
feat: add ITERATION_IDLE event type, STOP_MAX_IDLE reason, and IDLE_E…
malpou Mar 22, 2026
8c61c47
feat: implement idle detection in engine with backoff and max idle
malpou Mar 22, 2026
be5f760
feat: parse idle config from RALPH.md frontmatter in CLI
malpou Mar 22, 2026
874dd76
feat: render ITERATION_IDLE event in console emitter
malpou Mar 22, 2026
adb489e
docs: add idle detection to all doc surfaces
malpou Mar 22, 2026
cd1e33e
refactor: remove redundant idle integration tests from test_cli.py
malpou Mar 22, 2026
6a6e91e
refactor: parametrize idle delay tests and remove fragile timing test
malpou Mar 22, 2026
dbda556
refactor: consolidate idle event tests into single emitter test
malpou Mar 22, 2026
4fbddc1
refactor: remove dead IterationIdleData, inline idle defaults, simpli…
malpou Mar 22, 2026
6a89cb6
refactor: inline _parse_idle_duration into _validate_idle and loop ov…
malpou Mar 22, 2026
6b3b3f1
refactor: simplify _compute_idle_delay into _idle_delay and remove gu…
malpou Mar 22, 2026
61986e2
refactor: parametrize idle tests in cli, console_emitter, and run_types
malpou Mar 22, 2026
c3a5f86
docs: trim redundant idle detection explanations in cli, writing-prom…
malpou Mar 22, 2026
fe4f447
refactor: inline idle sub-field constants used only in _validate_idle
malpou Mar 22, 2026
e1d78dd
refactor: consolidate idle tests to reduce diff footprint
malpou Mar 22, 2026
9b02721
docs: trim idle detection section in writing-prompts guide
malpou Mar 22, 2026
421b559
feat: add stdout_text field to AgentResult
malpou Mar 22, 2026
cf5f54b
feat: check stdout_text as fallback for idle marker detection
malpou Mar 22, 2026
a3a84f9
test: add tests for idle-via-stdout-fallback and stdout_text population
malpou Mar 22, 2026
f39da3b
feat: add DELAY_STARTED/DELAY_ENDED events for live countdown support
malpou Mar 22, 2026
0de612f
feat: add live delay countdown renderable and handlers to console emi…
malpou Mar 22, 2026
6c155c3
test: add tests for delay countdown rendering and event handlers
malpou Mar 22, 2026
8 changes: 8 additions & 0 deletions docs/changelog.md
Expand Up @@ -8,6 +8,14 @@ keywords: ralphify changelog, release history, new features, version updates, br

All notable changes to ralphify are documented here.

## 0.2.5 — 2026-03-22

### Added

- **Idle detection with backoff** — configure via the `idle` frontmatter block (`delay`, `backoff`, `max_delay`, `max`). When an agent emits `<!-- ralph:state idle -->`, the engine applies backoff delays and optionally stops after a cumulative idle time limit.

---

## 0.2.4 — 2026-03-22

### Fixed
29 changes: 29 additions & 0 deletions docs/cli.md
Expand Up @@ -165,6 +165,7 @@ Your instructions here. Reference args with {{ args.dir }}.
| `commands` | list | no | Commands to run each iteration (each has `name` and `run`) |
| `args` | list of strings | no | Declared argument names for user arguments. Letters, digits, hyphens, and underscores only. |
| `credit` | bool | no | Append co-author trailer instruction to prompt (default: `true`) |
| `idle` | mapping | no | Idle detection config — backoff delays when agent signals idle state (see [Idle detection](#idle-detection)) |

### Commands

Expand All @@ -188,3 +189,31 @@ If a command exceeds its timeout, the process is killed and the captured output
| `{{ args.<name> }}` | Value of the named user argument |

Unmatched placeholders resolve to an empty string.

### Idle detection

When an agent emits `<!-- ralph:state idle -->` in its output, the engine applies backoff delays between iterations and optionally stops the loop after a cumulative idle time limit.

Add an `idle` block to your frontmatter:

```markdown
---
agent: claude -p --dangerously-skip-permissions
idle:
  delay: 30s
  backoff: 2
  max_delay: 5m
  max: 30m
---
```

| Field | Type | Default | Description |
|---|---|---|---|
| `delay` | duration or number | `30` (seconds) | Initial delay after the first idle iteration |
| `backoff` | number | `2.0` | Multiplier applied each consecutive idle iteration |
| `max_delay` | duration or number | `300` (5 minutes) | Maximum delay cap |
| `max` | duration or number | none | Stop the loop after this cumulative idle time |

Duration values accept numbers (seconds) or human-readable strings: `30s`, `5m`, `6h`, `1d`.

A non-idle iteration resets all idle tracking (the consecutive idle count and the cumulative idle time). When no `idle` block is present, the loop behaves as it did before idle detection was introduced.
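
As a rough illustration of the duration rule above, the sketch below normalizes a bare number or a suffixed string to seconds. `to_seconds` is a hypothetical helper written for this example, not the project's API; the regex and unit table mirror the ones this PR adds to `_frontmatter.py`, and rejecting booleans is an assumption.

```python
import re

# Hypothetical helper for illustration only: accepts a bare number (seconds)
# or a suffixed string such as "30s", "5m", "6h", "1d".
_DURATION = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([smhd])\s*$")
_SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value):
    """Normalize an idle duration value to seconds as a float."""
    # bool is a subclass of int, so reject it explicitly (an assumption).
    if isinstance(value, bool):
        raise ValueError(f"Invalid duration {value!r}")
    if isinstance(value, (int, float)):
        return float(value)
    match = _DURATION.match(str(value))
    if match is None:
        raise ValueError(f"Invalid duration {value!r}")
    return float(match.group(1)) * _SECONDS_PER_UNIT[match.group(2)]


# The frontmatter example from above, with `backoff` left out because it is
# a plain multiplier rather than a duration.
example = {"delay": "30s", "max_delay": "5m", "max": "30m"}
print({name: to_seconds(raw) for name, raw in example.items()})
# → {'delay': 30.0, 'max_delay': 300.0, 'max': 1800.0}
```

Under this reading, `delay: 30` and `delay: 30s` are equivalent.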
14 changes: 11 additions & 3 deletions docs/contributing/codebase-map.md
Expand Up @@ -81,7 +81,7 @@ The run loop communicates via structured events (`_events.py`). Each event has a

Event data uses TypedDict classes — one per event type — rather than free-form dicts. The key types:

- **`RunStartedData`** / **`RunStoppedData`** — run lifecycle (stop reason is a `StopReason` literal: `"completed"`, `"error"`, `"user_requested"`)
- **`RunStartedData`** / **`RunStoppedData`** — run lifecycle (stop reason is a `StopReason` literal: `"completed"`, `"error"`, `"user_requested"`, `"max_idle"`)
- **`IterationStartedData`** / **`IterationEndedData`** — per-iteration data (return code, duration, log path)
- **`CommandsStartedData`** / **`CommandsCompletedData`** — command execution bookends
- **`PromptAssembledData`** — prompt length after placeholder resolution
Expand Down Expand Up @@ -113,7 +113,7 @@ The CLI uses a `ConsoleEmitter` (defined in `_console_emitter.py`) that renders

1. **`engine.py`** — The core run loop. Uses `RunConfig` and `RunState` (from `_run_types.py`) and `EventEmitter`. This is where iteration logic lives.
2. **`_run_types.py`** — `RunConfig`, `RunState`, `RunStatus`, and `Command`. These are the shared data types used by the engine, CLI, and manager.
3. **`cli.py`** — All CLI commands. Validates frontmatter fields via extracted helpers (`_validate_agent`, `_validate_commands`, `_validate_credit`, `_validate_run_options`, `_validate_declared_args`), builds a `RunConfig`, and delegates to `engine.run_loop()` for the actual loop. Terminal event rendering lives in `_console_emitter.py`.
3. **`cli.py`** — All CLI commands. Validates frontmatter fields via extracted helpers (`_validate_agent`, `_validate_commands`, `_validate_credit`, `_validate_idle`, `_validate_run_options`, `_validate_declared_args`), builds a `RunConfig`, and delegates to `engine.run_loop()` for the actual loop. Terminal event rendering lives in `_console_emitter.py`.
4. **`_frontmatter.py`** — YAML frontmatter parsing. Extracts `agent`, `commands`, `args` from the RALPH.md file.
5. **`_resolver.py`** — Template placeholder logic. Small file but critical.
6. **`_skills.py`** + **`skills/`** — The skill system behind `ralph new`. `_skills.py` handles agent detection, reads bundled skill definitions from `skills/`, installs them into the agent's skill directory, and builds the command to launch the agent.
Expand All @@ -124,7 +124,7 @@ The CLI uses a `ConsoleEmitter` (defined in `_console_emitter.py`) that renders

Frontmatter parsing is in `_frontmatter.py:parse_frontmatter()`, which returns a raw dict. Each field is then validated and coerced by a dedicated helper in `cli.py` — e.g. `_validate_agent()`, `_validate_commands()`, `_validate_credit()`. Adding a new frontmatter field means adding a new validator in `cli.py` and wiring it into `_build_run_config()`.

**Field name constants** (`FIELD_AGENT`, `FIELD_COMMANDS`, `FIELD_ARGS`, `FIELD_CREDIT`, `CMD_FIELD_NAME`, `CMD_FIELD_RUN`, `CMD_FIELD_TIMEOUT`) are centralized in `_frontmatter.py`. Always import these constants instead of hardcoding strings like `"agent"` or `"commands"` — this keeps error messages, validation, and placeholder resolution in sync when fields are renamed.
**Field name constants** (`FIELD_AGENT`, `FIELD_COMMANDS`, `FIELD_ARGS`, `FIELD_CREDIT`, `FIELD_IDLE`, `CMD_FIELD_NAME`, `CMD_FIELD_RUN`, `CMD_FIELD_TIMEOUT`) are centralized in `_frontmatter.py`. Always import these constants instead of hardcoding strings like `"agent"` or `"commands"` — this keeps error messages, validation, and placeholder resolution in sync when fields are renamed.

### If you add a new CLI command...

Expand All @@ -134,6 +134,14 @@ Add it in `cli.py`. The CLI uses Typer. Update `docs/cli.md` to document the new

Events are defined in `_events.py:EventType`, with a corresponding TypedDict payload class for each type. Adding a new event type requires a new `EventType` member, a new TypedDict payload class, adding it to the `EventData` union, and handling it in `ConsoleEmitter` (`_console_emitter.py`).

### Idle detection

When an agent emits `<!-- ralph:state idle -->` (the `IDLE_STATE_MARKER` constant in `_frontmatter.py`) in its output, the engine marks the iteration as idle instead of completed. Idle behavior is configured via the `idle` frontmatter block, parsed by `_validate_idle()` in `cli.py` into an `IdleConfig` dataclass (`_run_types.py`).

The engine (`engine.py`) tracks idle state on `RunState` (`consecutive_idle`, `cumulative_idle_time`). Backoff delay is computed by `_idle_delay()`: `delay × backoff^(consecutive_idle - 1)`, capped at `max_delay`. A non-idle iteration calls `state.reset_idle()` to clear all idle tracking. When `idle.max` is set and cumulative idle time exceeds it, the loop stops with `RunStatus.IDLE_EXCEEDED`.

The `ITERATION_IDLE` event type and `STOP_MAX_IDLE` stop reason are defined in `_events.py`. The console emitter renders idle iterations with a dimmed style.
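
The backoff formula above can be sketched as follows. This is a minimal model under stated assumptions, not the engine's code: `idle_delay` and `simulate` are hypothetical names, the defaults are the documented ones (30 s, 2.0, 300 s), and cumulative idle time is assumed to be the sum of the delays only.

```python
def idle_delay(consecutive_idle, delay=30.0, backoff=2.0, max_delay=300.0):
    """Return delay * backoff^(consecutive_idle - 1), capped at max_delay."""
    return min(delay * backoff ** (consecutive_idle - 1), max_delay)


def simulate(idle_iterations, max_idle=None):
    """Walk a run of consecutive idle iterations.

    Returns the list of delays applied and the cumulative idle time.
    Assumption: the loop stops once the cumulative time exceeds max_idle.
    """
    schedule = []
    cumulative = 0.0
    for n in range(1, idle_iterations + 1):
        wait = idle_delay(n)
        schedule.append(wait)
        cumulative += wait
        if max_idle is not None and cumulative > max_idle:
            break
    return schedule, cumulative


print(simulate(6))
# → ([30.0, 60.0, 120.0, 240.0, 300.0, 300.0], 1050.0)
```

In this model, a `max` of 30 minutes (1800 s) with default settings is exceeded on the ninth consecutive idle iteration, at 1950 s of accumulated delay.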

### Credit trailer

When `credit` is `true` (the default), `engine.py:_assemble_prompt()` appends `_CREDIT_INSTRUCTION` to the prompt — a short instruction telling the agent to include a `Co-authored-by: Ralphify` trailer in git commits. Users can opt out with `credit: false` in frontmatter.
13 changes: 13 additions & 0 deletions docs/quick-reference.md
Expand Up @@ -72,6 +72,7 @@ Your instructions here. Use {{ args.dir }} for user arguments.
| `commands` | list | no | Commands to run each iteration |
| `args` | list | no | User argument names. Letters, digits, hyphens, and underscores only. |
| `credit` | bool | no | Append co-author trailer instruction to prompt (default: `true`) |
| `idle` | mapping | no | Idle detection: backoff delays when agent signals idle (see below) |

### Command fields

Expand Down Expand Up @@ -107,6 +108,18 @@ Your instructions here. Use {{ args.dir }} for user arguments.
- `--` ends flag parsing: `ralph run my-ralph -- --verbose ./src` treats `--verbose` as a positional value
- Missing args resolve to empty string

### Idle detection

```yaml
idle:
  delay: 30s      # Initial delay after first idle iteration (default: 30s)
  backoff: 2      # Multiplier per consecutive idle iteration (default: 2)
  max_delay: 5m   # Delay cap (default: 5m)
  max: 30m        # Stop loop after this cumulative idle time (optional)
```

Agent emits `<!-- ralph:state idle -->` → backoff delays kick in. Non-idle iteration resets tracking. Durations: `30s`, `5m`, `6h`, `1d`.

## The loop

Each iteration:
11 changes: 11 additions & 0 deletions docs/writing-prompts.md
Expand Up @@ -331,6 +331,17 @@ HTML comments in your RALPH.md are automatically stripped before the prompt is a

You can freely add and edit comments while the loop runs — they're stripped every iteration, so they never waste the agent's context window.

## Idle detection

If your agent signals when it has no work to do, you can avoid wasting tokens on idle iterations. Add an [`idle` block](cli.md#idle-detection) to your frontmatter and instruct the agent to emit the idle marker when there's nothing left to do:

```markdown
If all tasks are complete, output exactly:
<!-- ralph:state idle -->
```

When the agent emits `<!-- ralph:state idle -->`, the engine applies increasing backoff delays before the next iteration. A non-idle iteration resets the backoff. If cumulative idle time reaches the `max` limit, the loop stops automatically.
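
For intuition, marker detection might look like the sketch below. `signals_idle` is a hypothetical function, and plain substring matching is an assumption. The marker string matches `IDLE_STATE_MARKER` in `_frontmatter.py`, and the fallback from `result_text` to `stdout_text` follows the commit titled "check stdout_text as fallback for idle marker detection".

```python
IDLE_STATE_MARKER = "<!-- ralph:state idle -->"


def signals_idle(result_text, stdout_text=None):
    """Return True if either output contains the idle marker.

    Hypothetical sketch: checks the agent's result text first, then falls
    back to raw stdout. Either argument may be None.
    """
    for text in (result_text, stdout_text):
        if text and IDLE_STATE_MARKER in text:
            return True
    return False


print(signals_idle("All tasks done.\n<!-- ralph:state idle -->"))   # → True
print(signals_idle(None, "log line\n<!-- ralph:state idle -->\n"))  # → True
print(signals_idle("Implemented feature X."))                       # → False
```

Because a substring check of this kind is exact, the prompt should ask the agent to reproduce the marker character for character, as in the instruction above.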

## Prompt size and context windows

Keep your prompt focused. A long prompt with every possible instruction eats into the agent's context window, leaving less room for the actual codebase.
3 changes: 3 additions & 0 deletions src/ralphify/_agent.py
Expand Up @@ -51,6 +51,7 @@ class AgentResult(ProcessResult):
elapsed: float = 0.0
log_file: Path | None = None
result_text: str | None = None
stdout_text: str | None = None


@dataclass(frozen=True)
Expand Down Expand Up @@ -206,6 +207,7 @@ def _run_agent_streaming(
log_file=log_file,
result_text=stream.result_text,
timed_out=stream.timed_out,
stdout_text="".join(stream.stdout_lines),
)


Expand Down Expand Up @@ -254,6 +256,7 @@ def _run_agent_blocking(
elapsed=time.monotonic() - start,
log_file=log_file,
timed_out=timed_out,
stdout_text=ensure_str(stdout) if stdout else None,
)


42 changes: 40 additions & 2 deletions src/ralphify/_console_emitter.py
Expand Up @@ -20,7 +20,10 @@
from ralphify._events import (
LOG_ERROR,
STOP_COMPLETED,
STOP_MAX_IDLE,
CommandsCompletedData,
DelayEndedData,
DelayStartedData,
Event,
EventType,
IterationEndedData,
Expand All @@ -34,6 +37,7 @@
_ICON_SUCCESS = "✓"
_ICON_FAILURE = "✗"
_ICON_TIMEOUT = "⏱"
_ICON_IDLE = "◇"
_ICON_ARROW = "→"
_ICON_DASH = "—"

Expand All @@ -54,6 +58,20 @@ def __rich_console__(self, console: Console, options: ConsoleOptions) -> RenderR
yield text


class _DelayCountdown:
    """Rich renderable that shows a countdown timer for inter-iteration delays."""

    def __init__(self, total: float) -> None:
        self._total = total
        self._start = time.monotonic()

    def __rich_console__(self, console: Console, options: ConsoleOptions) -> RenderResult:
        elapsed = time.monotonic() - self._start
        remaining = max(0.0, self._total - elapsed)
        text = Text(f" Waiting {format_duration(remaining)}…", style="dim")
        yield text


class ConsoleEmitter:
"""Renders engine events to the Rich console."""

Expand All @@ -66,7 +84,10 @@ def __init__(self, console: Console) -> None:
EventType.ITERATION_COMPLETED: partial(self._on_iteration_ended, color="green", icon=_ICON_SUCCESS),
EventType.ITERATION_FAILED: partial(self._on_iteration_ended, color="red", icon=_ICON_FAILURE),
EventType.ITERATION_TIMED_OUT: partial(self._on_iteration_ended, color="yellow", icon=_ICON_TIMEOUT),
EventType.ITERATION_IDLE: partial(self._on_iteration_ended, color="dim", icon=_ICON_IDLE),
EventType.COMMANDS_COMPLETED: self._on_commands_completed,
EventType.DELAY_STARTED: self._on_delay_started,
EventType.DELAY_ENDED: self._on_delay_ended,
EventType.LOG_MESSAGE: self._on_log_message,
EventType.RUN_STOPPED: self._on_run_stopped,
}
Expand Down Expand Up @@ -130,6 +151,19 @@ def _on_commands_completed(self, data: CommandsCompletedData) -> None:
if count:
self._console.print(f" [bold]Commands:[/bold] {count} ran")

    def _on_delay_started(self, data: DelayStartedData) -> None:
        countdown = _DelayCountdown(data["delay"])
        self._live = Live(
            countdown,
            console=self._console,
            transient=True,
            refresh_per_second=_LIVE_REFRESH_RATE,
        )
        self._live.start()

    def _on_delay_ended(self, data: DelayEndedData) -> None:
        self._stop_live()

def _on_log_message(self, data: LogMessageData) -> None:
msg = escape_markup(data["message"])
level = data["level"]
Expand All @@ -143,7 +177,8 @@ def _on_log_message(self, data: LogMessageData) -> None:

def _on_run_stopped(self, data: RunStoppedData) -> None:
self._stop_live()
if data["reason"] != STOP_COMPLETED:
reason = data["reason"]
if reason not in (STOP_COMPLETED, STOP_MAX_IDLE):
return

total = data["total"]
Expand All @@ -161,4 +196,7 @@ def _on_run_stopped(self, data: RunStoppedData) -> None:
parts.append(f"{timed_out_count} timed out")
detail = ", ".join(parts)
self._console.print(f"\n[bold blue]──────────────────────[/bold blue]")
self._console.print(f"[bold green]Done:[/bold green] {total} iteration(s) {_ICON_DASH} {detail}")
if reason == STOP_MAX_IDLE:
self._console.print(f"[bold yellow]Stopped (idle):[/bold yellow] {total} iteration(s) {_ICON_DASH} {detail}")
else:
self._console.print(f"[bold green]Done:[/bold green] {total} iteration(s) {_ICON_DASH} {detail}")
20 changes: 18 additions & 2 deletions src/ralphify/_events.py
Expand Up @@ -19,12 +19,13 @@
LOG_INFO: LogLevel = "info"
LOG_ERROR: LogLevel = "error"

StopReason = Literal["completed", "error", "user_requested"]
StopReason = Literal["completed", "error", "user_requested", "max_idle"]
"""Valid reason strings for :class:`RunStoppedData` events."""

STOP_COMPLETED: StopReason = "completed"
STOP_ERROR: StopReason = "error"
STOP_USER_REQUESTED: StopReason = "user_requested"
STOP_MAX_IDLE: StopReason = "max_idle"


class EventType(Enum):
Expand All @@ -37,7 +38,7 @@ class EventType(Enum):

**Iteration lifecycle** — emitted once per iteration:
``ITERATION_STARTED``, ``ITERATION_COMPLETED``, ``ITERATION_FAILED``,
``ITERATION_TIMED_OUT``.
``ITERATION_TIMED_OUT``, ``ITERATION_IDLE``.

**Commands** — emitted around command execution:
``COMMANDS_STARTED``, ``COMMANDS_COMPLETED``.
Expand All @@ -63,6 +64,7 @@ class EventType(Enum):
ITERATION_COMPLETED = "iteration_completed"
ITERATION_FAILED = "iteration_failed"
ITERATION_TIMED_OUT = "iteration_timed_out"
ITERATION_IDLE = "iteration_idle"

# ── Commands ────────────────────────────────────────────────
COMMANDS_STARTED = "commands_started"
Expand All @@ -74,6 +76,10 @@ class EventType(Enum):
# ── Agent activity (live streaming) ─────────────────────────
AGENT_ACTIVITY = "agent_activity"

# ── Delay ─────────────────────────────────────────────────────
DELAY_STARTED = "delay_started"
DELAY_ENDED = "delay_ended"

# ── Other ───────────────────────────────────────────────────
LOG_MESSAGE = "log_message"

Expand Down Expand Up @@ -131,6 +137,14 @@ class AgentActivityData(TypedDict):
iteration: int


class DelayStartedData(TypedDict):
    delay: float


class DelayEndedData(TypedDict):
    pass


class LogMessageData(TypedDict):
message: str
level: LogLevel
Expand All @@ -146,6 +160,8 @@ class LogMessageData(TypedDict):
| CommandsCompletedData
| PromptAssembledData
| AgentActivityData
| DelayStartedData
| DelayEndedData
| LogMessageData
)
"""Union of all typed event data payloads."""
19 changes: 19 additions & 0 deletions src/ralphify/_frontmatter.py
Expand Up @@ -25,6 +25,7 @@
FIELD_COMMANDS = "commands"
FIELD_ARGS = "args"
FIELD_CREDIT = "credit"
FIELD_IDLE = "idle"

# Sub-field names within each command mapping.
CMD_FIELD_NAME = "name"
Expand All @@ -41,9 +42,27 @@
# Human-readable description of allowed name characters, paired with CMD_NAME_RE.
VALID_NAME_CHARS_MSG = "Names may only contain letters, digits, hyphens, and underscores."

# Marker that agents emit to signal idle state.
IDLE_STATE_MARKER = "<!-- ralph:state idle -->"

# Pre-compiled pattern to strip HTML comments from body text.
_HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

# Pattern for human-readable duration strings (e.g. "30s", "5m", "6h", "1d").
_DURATION_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([smhd])\s*$")
_DURATION_MULTIPLIERS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(value: str) -> float:
    """Parse a duration string (e.g. ``"30s"``, ``"5m"``) into seconds."""
    match = _DURATION_RE.match(value)
    if not match:
        raise ValueError(
            f"Invalid duration '{value}'. Use a number with a suffix: "
            f"s (seconds), m (minutes), h (hours), d (days). Examples: 30s, 5m, 6h."
        )
    return float(match.group(1)) * _DURATION_MULTIPLIERS[match.group(2)]


def _extract_frontmatter_block(text: str) -> tuple[str, str]:
"""Split text into raw YAML frontmatter and body at ``---`` delimiters.