| title | Ralphify Codebase Map |
|---|---|
| description | Architecture overview and module map for contributors and AI coding agents working on ralphify. |
| keywords | ralphify architecture, codebase map, module overview, engine, CLI, resolver, agent subprocess |
Quick orientation guide for anyone working on this codebase — human contributors and AI coding agents alike.
Ralphify is a CLI tool (ralph) that runs AI coding agents in autonomous loops. It reads a RALPH.md file from a ralph directory, runs commands, assembles a prompt with the output, pipes it to an agent command via stdin, waits for it to finish, then repeats. Each iteration gets a fresh context window. Progress is tracked through git commits.
The core loop is simple. The complexity lives in prompt assembly — running commands and resolving placeholders into the prompt before each iteration.
src/ralphify/ # All source code
├── __init__.py # Version detection + app entry point
├── cli.py # CLI commands (run, scaffold) — delegates to engine for the loop
├── engine.py # Core run loop orchestration with structured event emission
├── manager.py # Multi-run orchestration (concurrent runs via threads)
├── _resolver.py # Template placeholder resolution ({{ commands.* }}, {{ args.* }}, {{ ralph.* }})
├── _agent.py # Run agent subprocesses (streaming + blocking modes, log writing)
├── _run_types.py # RunConfig, RunState, RunStatus, Command — shared data types
├── _runner.py # Execute shell commands with timeout and capture output
├── _frontmatter.py # Parse YAML frontmatter from RALPH.md, marker constants
├── _console_emitter.py # Rich console renderer for run-loop events (ConsoleEmitter)
├── _events.py # Event types, emitter protocol, and BoundEmitter convenience wrapper
├── _output.py # ProcessResult base class, combine stdout+stderr, format durations
└── _brand.py # Brand color constants shared across CLI and console rendering
tests/ # Pytest tests — one test file per module
docs/ # MkDocs site (Material theme) — user-facing documentation
docs/contributing/ # Contributor documentation (this section)
.github/workflows/
├── test.yml # Run tests on push to main and PRs (Python 3.11–3.13)
├── docs.yml # Deploy docs to GitHub Pages on push to main
└── publish.yml # Publish to PyPI on release (with test gate)
The CLI entry point is cli.py:run(), which parses options, reads the ralph directory path, and delegates to engine.py:run_loop() for the actual iteration cycle. The engine emits structured events via an EventEmitter, making the same loop reusable from both the CLI and any external orchestration layer (such as manager.py).
ralph run my-ralph
│
├── cli.py:run() — parse options, print banner
│ ├── Read RALPH.md from the given directory
│ ├── Parse frontmatter (agent, commands, args)
│ └── Build RunConfig and call engine.run_loop()
│
└── engine.py:run_loop(config, state, emitter)
└── Loop:
├── Re-read RALPH.md from disk
├── Run commands → capture output
├── Resolve {{ commands.* }}, {{ args.* }}, and {{ ralph.* }} placeholders
├── Pipe assembled prompt to agent command via subprocess
├── Emit iteration events (started, completed, failed, timed_out)
├── Handle pause/resume/stop requests via RunState
└── Repeat
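The flow above can be sketched in a few lines. This is a simplified illustration, not the actual engine.py: the names run_once, sketch_loop, DEMO_AGENT, and DEMO_COMMANDS are hypothetical, the template is passed in rather than re-read from RALPH.md, and placeholders are filled with a plain str.replace instead of the single-pass resolver.

```python
import shlex
import subprocess
import sys


def run_once(template: str, commands: dict[str, str], agent: list[str]) -> int:
    """One iteration: run commands, fill placeholders, pipe the prompt to the agent."""
    prompt = template
    for name, run in commands.items():
        # Command strings are split with shlex, so no shell features apply.
        done = subprocess.run(shlex.split(run), capture_output=True, text=True)
        prompt = prompt.replace("{{ commands." + name + " }}", done.stdout + done.stderr)
    # The assembled prompt reaches the agent on stdin.
    return subprocess.run(agent, input=prompt, text=True).returncode


def sketch_loop(template: str, commands: dict[str, str], agent: list[str],
                max_iterations: int) -> list[int]:
    """Repeat run_once; every iteration starts again from the unmodified template."""
    return [run_once(template, commands, agent) for _ in range(max_iterations)]


# Demo inputs: a stand-in "agent" that exits 0 when the prompt contains "ok".
DEMO_AGENT = [sys.executable, "-c",
              "import sys; sys.exit(0 if 'ok' in sys.stdin.read() else 1)"]
DEMO_COMMANDS = {"check": shlex.join([sys.executable, "-c", "print('ok')"])}
```

Calling sketch_loop("Status: {{ commands.check }}", DEMO_COMMANDS, DEMO_AGENT, 2) runs two independent iterations and returns one agent exit code per iteration.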
The resolver (_resolver.py) handles three placeholder kinds: {{ commands.<name> }}, {{ args.<name> }}, and {{ ralph.<name> }}. Two functions:
- resolve_all(): resolves all three placeholder kinds in a single pass, so a value inserted by one kind (e.g., an arg value containing {{ commands.foo }}) is never re-processed as another kind. Used by the engine for final prompt assembly. The ralph.* placeholders (ralph.name, ralph.iteration, ralph.max_iterations) provide runtime metadata and require no frontmatter configuration.
- resolve_args(): resolves only {{ args.<name> }} placeholders. Used by the engine to expand arg references inside command run strings before executing them.
Unmatched placeholders resolve to empty string in both functions.
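To illustrate why a single pass matters, here is a minimal sketch of one-pass resolution with a regular expression. It is not the code in _resolver.py: the function name resolve_all_sketch, the nested values mapping, and the set of characters allowed in placeholder names are assumptions made for the example.

```python
import re

# Matches {{ commands.x }}, {{ args.x }}, and {{ ralph.x }} with optional inner spaces.
# The allowed name characters are an assumption for this sketch.
_PLACEHOLDER = re.compile(r"\{\{\s*(commands|args|ralph)\.([A-Za-z0-9_-]+)\s*\}\}")


def resolve_all_sketch(text: str, values: dict[str, dict[str, str]]) -> str:
    """Replace every placeholder in one regex pass; unmatched names become ''."""

    def substitute(match: re.Match) -> str:
        kind, name = match.group(1), match.group(2)
        return values.get(kind, {}).get(name, "")

    # re.sub scans the original text once and never rescans replacement text,
    # so a value that itself looks like a placeholder is left untouched.
    return _PLACEHOLDER.sub(substitute, text)


values = {
    "args": {"target": "{{ commands.tests }}"},
    "commands": {"tests": "2 passed"},
}
resolved = resolve_all_sketch(
    "A={{ args.target }} B={{ commands.tests }} C={{ args.missing }}", values
)
print(resolved)
```

The arg value that contains {{ commands.tests }} is inserted verbatim, while the unknown {{ args.missing }} collapses to an empty string.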
The run loop communicates via structured events (_events.py). Each event has a type (EventType enum), run ID, typed data payload, and UTC timestamp.
Event data uses TypedDict classes — one per event type — rather than free-form dicts. The key types:
- RunStartedData / RunStoppedData: run lifecycle (stop reason is a StopReason literal: "completed", "error", "user_requested")
- IterationStartedData / IterationEndedData: per-iteration data (return code, duration, log path)
- CommandsStartedData / CommandsCompletedData: command execution bookends
- PromptAssembledData: prompt length after placeholder resolution
- AgentActivityData: streaming agent output
- LogMessageData: info/error messages with optional traceback
All payload types are unioned as EventData.
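The shape described above (an enum for the type, a TypedDict per payload, a union over payloads, and a UTC timestamp) might look roughly like this. The two members shown and every field name (iteration, prompt_length, run_id, and so on) are illustrative assumptions, not the definitions in _events.py.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import TypedDict, Union


class EventType(Enum):
    # Only two members are shown; the values are assumed for this sketch.
    ITERATION_STARTED = "iteration_started"
    PROMPT_ASSEMBLED = "prompt_assembled"


class IterationStartedData(TypedDict):
    iteration: int


class PromptAssembledData(TypedDict):
    iteration: int
    prompt_length: int


# Union of all payload types, so type checkers can narrow on the event type.
EventData = Union[IterationStartedData, PromptAssembledData]


@dataclass
class Event:
    type: EventType
    run_id: str
    data: EventData
    # Timestamps are timezone-aware UTC, set at construction time.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


event = Event(
    EventType.PROMPT_ASSEMBLED,
    "run-1",
    PromptAssembledData(iteration=1, prompt_length=120),
)
```

Because each payload is a TypedDict, it is an ordinary dict at runtime but a type checker can flag a missing or misspelled key.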
Emitter implementations:
- EventEmitter: protocol that any listener implements (just an emit(event) method)
- NullEmitter: discards events (used in tests)
- QueueEmitter: pushes events into a queue.Queue for async consumption
- FanoutEmitter: broadcasts events to multiple emitters
- BoundEmitter: wraps any emitter with a fixed run ID, so callers don't have to pass the ID on every emit. The engine creates one per run and threads it through all internal functions.
The CLI uses a ConsoleEmitter (defined in _console_emitter.py) that renders events to the terminal with Rich formatting.
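A rough sketch of how these emitters compose is shown below. The class names come from the list above, but the constructor signatures, the BoundEmitter.emit(event_type, data) signature, and the use of plain dicts as events are assumptions made to keep the example short.

```python
import queue
from typing import Any, Protocol


class EventEmitter(Protocol):
    def emit(self, event: Any) -> None: ...


class NullEmitter:
    def emit(self, event: Any) -> None:
        pass  # Discard the event.


class QueueEmitter:
    def __init__(self) -> None:
        self.queue = queue.Queue()

    def emit(self, event: Any) -> None:
        self.queue.put(event)


class FanoutEmitter:
    def __init__(self, emitters: list[EventEmitter]) -> None:
        self._emitters = list(emitters)

    def emit(self, event: Any) -> None:
        # Forward the same event to every registered emitter, in order.
        for emitter in self._emitters:
            emitter.emit(event)


class BoundEmitter:
    """Attach a fixed run ID so callers only pass the type and payload."""

    def __init__(self, emitter: EventEmitter, run_id: str) -> None:
        self._emitter = emitter
        self._run_id = run_id

    def emit(self, event_type: str, data: dict) -> None:
        # Events are plain dicts here; the real code uses typed event objects.
        self._emitter.emit({"type": event_type, "run_id": self._run_id, "data": data})


first, second = QueueEmitter(), QueueEmitter()
bound = BoundEmitter(FanoutEmitter([first, second, NullEmitter()]), "run-1")
bound.emit("iteration_started", {"iteration": 1})
```

Both queues receive the same event with the run ID already filled in, which is the convenience BoundEmitter is described as providing.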
manager.py:RunManager orchestrates concurrent runs:
- Creates runs with unique IDs and wraps them in ManagedRun (config + state + emitter + thread)
- Starts each run in a daemon thread via engine.run_loop()
- Supports pause/resume/stop per run via RunState thread-safe control methods
- Uses FanoutEmitter to broadcast events to multiple listeners
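The thread-per-run pattern with thread-safe control flags could be sketched as follows. RunStateSketch, its request_* and wait_if_paused methods, start_run, and demo_loop are hypothetical names for illustration; only the stop_requested flag and the daemon-thread approach are taken from this document.

```python
import threading
import time
import uuid


class RunStateSketch:
    """Stop/pause flags shared safely between a controller and a worker thread."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._resume = threading.Event()
        self._resume.set()  # Start in the "not paused" state.

    def request_stop(self) -> None:
        self._stop.set()
        self._resume.set()  # Wake a paused worker so it can notice the stop.

    def request_pause(self) -> None:
        self._resume.clear()

    def request_resume(self) -> None:
        self._resume.set()

    @property
    def stop_requested(self) -> bool:
        return self._stop.is_set()

    def wait_if_paused(self) -> None:
        self._resume.wait()


def demo_loop(state: RunStateSketch) -> None:
    """Stand-in for engine.run_loop(): spin until a stop is requested."""
    while not state.stop_requested:
        state.wait_if_paused()
        time.sleep(0.01)


def start_run(loop, state: RunStateSketch) -> tuple[str, threading.Thread]:
    """Start loop(state) in a daemon thread and return (run_id, thread)."""
    run_id = uuid.uuid4().hex[:8]
    thread = threading.Thread(target=loop, args=(state,), daemon=True,
                              name=f"run-{run_id}")
    thread.start()
    return run_id, thread
```

Using threading.Event for both flags avoids explicit locks: the controller sets or clears a flag, and the worker checks it between iterations.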
- engine.py: The core run loop. Uses RunConfig and RunState (from _run_types.py) and EventEmitter. This is where iteration logic lives.
- _run_types.py: RunConfig, RunState, RunStatus, and Command. These are the shared data types used by the engine, CLI, and manager.
- cli.py: All CLI commands. Validates frontmatter fields via extracted helpers (_validate_agent, _validate_commands, _validate_credit, _validate_run_options, _validate_declared_args), builds a RunConfig, and delegates to engine.run_loop() for the actual loop. Terminal event rendering lives in _console_emitter.py.
- _frontmatter.py: YAML frontmatter parsing. Extracts agent, commands, args from the RALPH.md file.
- _resolver.py: Template placeholder logic. Small file but critical.
Frontmatter parsing is in _frontmatter.py:parse_frontmatter(), which returns a raw dict. Each field is then validated and coerced by a dedicated helper in cli.py — e.g. _validate_agent(), _validate_commands(), _validate_credit(). Adding a new frontmatter field means adding a new validator in cli.py and wiring it into _build_run_config().
Field name constants (FIELD_AGENT, FIELD_COMMANDS, FIELD_ARGS, FIELD_CREDIT, CMD_FIELD_NAME, CMD_FIELD_RUN, CMD_FIELD_TIMEOUT) are centralized in _frontmatter.py. Always import these constants instead of hardcoding strings like "agent" or "commands" — this keeps error messages, validation, and placeholder resolution in sync when fields are renamed.
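As a rough picture of the validator pattern, the sketch below checks one field and builds its error message from the shared constant. The function name validate_agent_sketch, the use of ValueError, and the message wording are assumptions; the real helpers in cli.py may report errors differently.

```python
# In the real code these constants are imported from _frontmatter.py.
FIELD_AGENT = "agent"
FIELD_COMMANDS = "commands"


def validate_agent_sketch(frontmatter: dict) -> str:
    """Return the agent command string, or raise if it is missing or blank."""
    agent = frontmatter.get(FIELD_AGENT)
    if not isinstance(agent, str) or not agent.strip():
        # The message uses the constant, so a rename stays consistent everywhere.
        raise ValueError(f"Frontmatter field '{FIELD_AGENT}' must be a non-empty string.")
    return agent.strip()


agent_command = validate_agent_sketch({FIELD_AGENT: "  my-agent --flag  "})
```

A new field would follow the same shape: one constant, one validator, and one call site where the run configuration is built.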
To add a new CLI command, add it in cli.py. The CLI uses Typer. Update docs/cli.md to document the new command.
Events are defined in _events.py:EventType, with a corresponding TypedDict payload class for each type. Adding a new event type requires a new EventType member, a new TypedDict payload class, adding it to the EventData union, and handling it in ConsoleEmitter (_console_emitter.py).
When credit is true (the default), engine.py:_assemble_prompt() appends _CREDIT_INSTRUCTION to the prompt — a short instruction telling the agent to include a Co-authored-by: Ralphify trailer in git commits. Users can opt out with credit: false in frontmatter.
_output.py defines ProcessResult, the base dataclass for subprocess results (provides returncode, timed_out, and a success property). Both _runner.py:RunResult (command execution) and _agent.py:AgentResult (agent execution) extend it. If you add a new subprocess wrapper, inherit from ProcessResult to get consistent success/timeout semantics. The module also provides ensure_str() for bytes-to-string decoding, collect_output() for combining stdout+stderr, and SUBPROCESS_TEXT_KWARGS — the shared kwargs dict used by all subprocess.Popen calls to ensure consistent encoding and stream handling.
Agent subprocesses run in their own process group (start_new_session=True on POSIX, configured via _SESSION_KWARGS in _agent.py). This lets _kill_process_group() send signals to the agent and all its children at once.
The two-stage Ctrl+C flow:
- First Ctrl+C: the engine's SIGINT handler sets RunState.stop_requested, which lets the current iteration finish gracefully.
- Second Ctrl+C: KeyboardInterrupt propagates normally and the agent process is killed.
Timeout and cancellation both use a two-step kill: SIGTERM first, then SIGKILL after _SIGTERM_GRACE_PERIOD seconds (3s). If you add a new subprocess wrapper, use _kill_process_group() and _SESSION_KWARGS to get consistent cleanup behavior.
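The two-step kill can be sketched with the standard library alone. This is a POSIX-only illustration and not the code in _agent.py: kill_group_sketch and spawn_sleeper are hypothetical names, and the child is a throwaway Python process that just sleeps.

```python
import os
import signal
import subprocess
import sys

GRACE_PERIOD = 3  # seconds, matching the grace period described above


def kill_group_sketch(proc: subprocess.Popen, grace: float = GRACE_PERIOD) -> int:
    """SIGTERM the child's process group, escalating to SIGKILL after `grace` seconds."""
    pgid = os.getpgid(proc.pid)
    os.killpg(pgid, signal.SIGTERM)
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The group ignored or outlived SIGTERM, so force termination.
        os.killpg(pgid, signal.SIGKILL)
        return proc.wait()


def spawn_sleeper() -> subprocess.Popen:
    """Start a long-running child in its own session, and so its own process group."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        start_new_session=True,
    )
```

Because the child leads its own process group, the signal reaches any grandchildren it spawned as well, while the parent process is unaffected.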
Commands in RALPH.md frontmatter are parsed with shlex.split() — no shell features. For shell features, users point the run field at a script.
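A short demonstration of what "no shell features" means in practice: shlex.split() only tokenizes, so operators such as a pipe become literal arguments. The command strings below are made-up examples.

```python
import shlex

# A pipe is not interpreted; it is passed along as a literal argument.
argv = shlex.split("pytest -x | tail -n 5")
print(argv)

# Quoting still works: the quoted part stays inside a single argument.
quoted = shlex.split('git log -1 --format="%h %s"')
print(quoted)
```

In the first case the program would receive "|", "tail", "-n", and "5" as ordinary arguments, which is why pipelines and redirects belong in a script that the run field points at.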
uv run pytest # Run all tests
uv run pytest -x # Stop on first failure
Tests are in tests/ with one file per module. All tests use temporary directories and don't require any external services.
Runtime dependencies are minimal by design:
- typer: CLI framework
- rich: Terminal formatting (used via typer's console)
- pyyaml: YAML frontmatter parsing in _frontmatter.py
Dev dependencies: pytest, mkdocs, mkdocs-material.