143 changes: 85 additions & 58 deletions README.md
@@ -9,15 +9,68 @@
<a href="https://ralphify.co/docs/"><img src="https://img.shields.io/badge/docs-ralphify.co%2Fdocs-blue" alt="Documentation"></a>
</p>

Put your AI coding agent in a `while True` loop and let it ship.
**A ralph is a directory that defines an autonomous agent loop.** It bundles a prompt, commands, and any files your agent needs. Ralphify is the CLI runtime that executes them.

Ralphify is a minimal harness for running autonomous AI coding loops, inspired by the [Ralph Wiggum technique](https://ghuntley.com/ralph/). The idea is simple: pipe a prompt to an AI coding agent, let it do one thing, commit, and repeat. Forever. Until you hit Ctrl+C.
```
grow-coverage/
├── RALPH.md # the prompt (only required file)
├── check-coverage.sh # command that runs each iteration
└── testing-conventions.md # context for the agent
```

```markdown
---
agent: claude -p --dangerously-skip-permissions
commands:
  - name: coverage
    run: ./check-coverage.sh
---

You are an autonomous coding agent working in a loop.
Each iteration, write tests for one untested module, then stop.

Follow the conventions in testing-conventions.md.

## Current coverage

{{ commands.coverage }}
```
while :; do cat RALPH.md | claude -p ; done

```bash
ralph run grow-coverage # loops until Ctrl+C
```

Ralphify wraps this pattern into a proper tool with commands, iteration tracking, and clean shutdown.
That's it. One directory. One command. The agent loops — running commands, building a fresh prompt with the latest output, and piping it to your agent. Every iteration starts with clean context and current data.

*Works with any agent CLI. Swap `claude -p` for Codex, Aider, or your own — just change the `agent` field.*

## Why loops

A single agent run can fix a bug or write a function. But the real leverage is **sustained, autonomous work** — campaigns that run for hours, chipping away at a goal one commit at a time while you do something else.

Ralph loops give you:

- **Incremental progress in small chunks.** Each iteration does one thing, tests it, and commits. Small changes are easier to review and safer to ship.
- **Fresh context every iteration.** No context window bloat. The agent starts clean each loop and sees the current state of the codebase — including everything it changed last iteration.
- **Continuous work toward a goal.** The loop keeps running until you hit Ctrl+C or it reaches the iteration limit. Walk away, come back to a pile of commits.
- **No micromanagement.** Define the goal once in the prompt, tune with commands that feed live data back in. The agent figures out what to do next.
- **The prompt is a tuning knob.** When the agent does something dumb, add a rule. Like putting up a sign: "SLIDE DOWN, DON'T JUMP."

### What people build ralphs for

| Ralph | What it does |
|---|---|
| **grow-coverage** | Write tests for untested modules, one per iteration, until coverage hits the target |
| **security-audit** | Hunt for vulnerabilities — scan, find, fix, verify, repeat |
| **clear-backlog** | Work through a TODO list or issue tracker, one task per loop |
| **write-docs** | Generate documentation for undocumented modules, one at a time |
| **improve-codebase** | Find and fix code smells, refactor patterns, modernize APIs |
| **migrate** | Incrementally migrate files from one framework or pattern to another |
| **research** | Deep-dive into a topic — gather sources, synthesize, and build a knowledge base |
| **bug-hunter** | Run the test suite, find edge cases, write regression tests |
| **perf-sweep** | Profile, find bottlenecks, optimize, benchmark, repeat |

The ralph format is intentionally simple — if you've written a skill file or a GitHub Action, you already know how it works. YAML frontmatter for config, markdown body for the prompt, `{{ commands.name }}` placeholders for live data.

## Install

@@ -36,56 +89,33 @@ Any of these gives you the `ralph` command.

## Quickstart

A ralph is a directory with a `RALPH.md` file. Scaffold one:
Scaffold a ralph and start experimenting:

```bash
ralph scaffold my-ralph
```

Then edit `my-ralph/RALPH.md`:

```markdown
---
agent: claude -p --dangerously-skip-permissions
commands:
  - name: tests
    run: uv run pytest
---

You are an autonomous coding agent working in a loop.

## Test results

{{ commands.tests }}

If any tests are failing, fix them before continuing.

## Task

Implement the next feature from the TODO list.
```

Run it:
Edit `my-ralph/RALPH.md`, then run it:

```bash
ralph run my-ralph # Starts the loop (Ctrl+C to stop)
ralph run my-ralph -n 5 # Run 5 iterations then stop
ralph run my-ralph # loops until Ctrl+C
ralph run my-ralph -n 5 # run 5 iterations then stop
```

### What `ralph run` does

Each iteration:
1. **Runs commands** — executes all commands, captures output
2. **Assembles prompt** — reads RALPH.md body, replaces `{{ commands.<name> }}` placeholders with output
3. **Pipes to agent** — executes the agent command with the assembled prompt on stdin
4. **Repeats** — goes back to step 1
1. **Runs commands** — executes all commands in the ralph, captures output
2. **Builds prompt** — reads the RALPH.md body, replaces `{{ commands.<name> }}` placeholders with fresh output
3. **Pipes to agent** — runs the agent command with the assembled prompt on stdin
4. **Repeats** — goes back to step 1 with updated data
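
The iteration steps above can be sketched in a few lines of Python. This is a minimal illustration, not Ralphify's actual implementation: the function names (`run_commands`, `build_prompt`, `run_iteration`), the placeholder regex, and the choice to leave unknown placeholders untouched are all assumptions made for the example.

```python
import re
import subprocess

# Hypothetical pattern for {{ commands.<name> }} placeholders (assumed syntax details).
PLACEHOLDER = re.compile(r"\{\{\s*commands\.([\w-]+)\s*\}\}")


def run_commands(commands, cwd="."):
    """Step 1: run every command and capture its output, keyed by name."""
    outputs = {}
    for cmd in commands:
        result = subprocess.run(
            cmd["run"], shell=True, cwd=cwd, capture_output=True, text=True
        )
        outputs[cmd["name"]] = (result.stdout + result.stderr).strip()
    return outputs


def build_prompt(body, outputs):
    """Step 2: replace placeholders with fresh output.

    Placeholders with no matching command are left as-is (an assumption).
    """
    return PLACEHOLDER.sub(lambda m: outputs.get(m.group(1), m.group(0)), body)


def run_iteration(agent, commands, body, cwd="."):
    """Steps 1-3: run commands, build the prompt, pipe it to the agent on stdin."""
    prompt = build_prompt(body, run_commands(commands, cwd))
    return subprocess.run(agent, shell=True, cwd=cwd, input=prompt, text=True).returncode


def loop(agent, commands, body, max_iterations=None):
    """Step 4: repeat until the iteration limit (or forever if none is set)."""
    done = 0
    while max_iterations is None or done < max_iterations:
        run_iteration(agent, commands, body)
        done += 1
```

Because the prompt is rebuilt from scratch on every pass, the agent never sees stale command output, which is the property the "fresh context" claim relies on.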

### What it looks like

```
$ ralph run my-ralph -n 3
$ ralph run grow-coverage -n 3

▶ Running: my-ralph
▶ Running: grow-coverage
1 command · max 3 iterations

── Iteration 1 ──
@@ -104,20 +134,9 @@ $ ralph run my-ralph -n 3
Done: 3 iterations — 2 succeeded, 1 failed
```

## The technique

The Ralph Wiggum technique works because:

- **One thing per loop.** The agent picks the most important task, implements it, tests it, and commits. Then the next iteration starts fresh.
- **Fresh context every time.** No context window bloat. Each loop starts clean and reads the current state of the codebase.
- **Progress lives in git.** Code, commits, and a plan file are the only state that persists between iterations. If something goes wrong, `git reset --hard` and run more loops.
- **The prompt is a tuning knob.** When the agent does something dumb, you add a sign. Like telling Ralph not to jump off the slide — you add "SLIDE DOWN, DON'T JUMP" to the prompt.

Read the full writeup: [Ralph Wiggum as a "software engineer"](https://ghuntley.com/ralph/)
## The ralph format

## Core concepts

A **ralph** is a directory containing a `RALPH.md` file. That's it. Everything the ralph needs lives in that directory.
A **ralph** is a directory containing a `RALPH.md` file. Everything the ralph needs lives in that directory — scripts, reference docs, test data, whatever the agent might need.

```
my-ralph/
@@ -127,7 +146,7 @@ my-ralph/
└── test-data.json # any supporting file (optional)
```

**RALPH.md** is the only file the framework reads. It has YAML frontmatter for configuration and a body that becomes the prompt:
**RALPH.md** has YAML frontmatter for configuration and a markdown body that becomes the prompt:

| Frontmatter field | Required | Description |
|---|---|---|
@@ -136,17 +155,25 @@ my-ralph/
| `args` | No | Declared argument names for `{{ args.<name> }}` placeholders |
| `credit` | No | Append co-author trailer instruction to prompt (default: `true`) |
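
To make the frontmatter/body split concrete, here is a rough sketch of how a `RALPH.md` file might be separated into its two parts. The helper name `split_ralph` is hypothetical, and the sketch deliberately stops short of parsing the YAML itself (that would need a YAML library); it only assumes the frontmatter is delimited by two lines containing `---`.

```python
def split_ralph(text):
    """Split RALPH.md text into (frontmatter, body).

    Returns an empty frontmatter string when the file does not open with
    a '---' line or the closing '---' line is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return frontmatter, body
    return "", text


example = """---
agent: claude -p
---

Fix one failing test, then stop.
"""
frontmatter, body = split_ralph(example)
print(frontmatter)  # the raw YAML text, still unparsed
print(body)         # the markdown body that becomes the prompt
```

The frontmatter string would then go to a YAML parser to read `agent`, `commands`, `args`, and `credit`, while the body is kept as the prompt template.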

**Commands** run before each iteration. Their output replaces `{{ commands.<name> }}` placeholders in the prompt. Use them for test results, git history, lint output — anything that changes between iterations.
**Commands** run before each iteration. Their output replaces `{{ commands.<name> }}` placeholders in the prompt. Use them for test results, coverage reports, git history, lint output — anything that changes between iterations.

## The technique

The Ralph Wiggum technique works because:

**No project-level configuration.** No `ralph.toml`. No config files. A ralph is fully self-contained.
- **One thing per loop.** The agent picks the most important task, implements it, tests it, and commits. Then the next iteration starts fresh.
- **Fresh context every time.** No context window bloat. Each loop starts clean and reads the current state of the codebase.
- **Progress lives in git.** Code, commits, and a plan file are the only state that persists between iterations. If something goes wrong, `git reset --hard` and run more loops.

Read the full writeup: [Ralph Wiggum as a "software engineer"](https://ghuntley.com/ralph/)

## Install ralphs from GitHub
## Share ralphs

Use [agr](https://github.com/computerlovetech/agr) to install shared ralphs:
Use [agr](https://github.com/computerlovetech/agr) to install shared ralphs from GitHub:

```bash
agr add owner/repo/my-ralph # Install a ralph from GitHub
ralph run my-ralph # Run it by name
agr add owner/repo/grow-coverage # install a ralph from GitHub
ralph run grow-coverage # run it by name
```

Ralphs installed by agr go to `.agents/ralphs/` and are automatically discovered by `ralph run`.
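
As an illustration of what name-based discovery could look like, the sketch below resolves a ralph name to a directory. The lookup order (a local directory first, then `.agents/ralphs/`) and the function name `resolve_ralph` are assumptions for the example; the README only states that ralphs under `.agents/ralphs/` are discovered automatically.

```python
from pathlib import Path


def resolve_ralph(name, root="."):
    """Return the directory for a ralph called `name`, or None if not found.

    A directory counts as a ralph only if it contains a RALPH.md file.
    The search order here is hypothetical.
    """
    root = Path(root)
    candidates = (root / name, root / ".agents" / "ralphs" / name)
    for candidate in candidates:
        if (candidate / "RALPH.md").is_file():
            return candidate
    return None
```

For example, `resolve_ralph("grow-coverage")` would return `.agents/ralphs/grow-coverage` after an `agr add`, provided no local `grow-coverage/` directory with a `RALPH.md` takes precedence.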
@@ -158,7 +185,7 @@ Full documentation at **[ralphify.co/docs](https://ralphify.co/docs/)** — gett
## Requirements

- Python 3.11+
- [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) (or any agent CLI that accepts piped input)
- An agent CLI that accepts piped input ([Claude Code](https://docs.anthropic.com/en/docs/claude-code), Codex, Aider, or your own)

## License
