This guide explains how to manually drive the prompt evolution loop using an AI coding assistant like OpenCode or Cursor.
You take the role of the human evaluator in the modify → evaluate → keep/revert loop:
- Modify: Improve the prompt in `prompt.txt`
- Evaluate: Run `python eval.py` to see the score
- Keep or Revert: Commit improvements, revert regressions
- Repeat: Iterate until the prompt is excellent
Open this repo in OpenCode or Cursor. The file you'll edit is `prompt.txt` at the repo root.
Edit `prompt.txt` to make it better at generating AI agent projects. Some ideas:
- Add more specific instructions about project structure
- Request specific libraries or patterns (LangGraph, Pydantic, etc.)
- Clarify output format requirements (markdown code blocks)
- Add constraints about code quality, testing, or documentation
After each change, run:

    python eval.py

The evaluator scores your prompt from 0–100 based on quality signals.
| Score | Action |
|---|---|
| Higher than before | Commit: `git add prompt.txt && git commit -m "Better prompt - score X"` |
| Same or lower | Revert: `git checkout prompt.txt` and try a different approach |
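The keep/revert rule in the table can be sketched as a small helper script. This is only an illustration, not part of the repo: the function names are hypothetical, and it assumes that `eval.py` prints a line containing the word "score" followed by a number (for example `Score: 72`). Adjust the parsing to whatever the evaluator actually prints.

```python
import re
import subprocess
import sys


def parse_score(output: str) -> float:
    """Extract the first number after the word 'score' (assumed output format)."""
    match = re.search(r"score\D*(\d+(?:\.\d+)?)", output, re.IGNORECASE)
    if match is None:
        raise ValueError("no score found in evaluator output")
    return float(match.group(1))


def decide(previous: float, current: float) -> str:
    """Keep only strict improvements; the same or a lower score means revert."""
    return "keep" if current > previous else "revert"


def step(previous: float) -> float:
    """Run one evaluate -> keep/revert step and return the best score so far."""
    result = subprocess.run(
        [sys.executable, "eval.py"], capture_output=True, text=True, check=True
    )
    current = parse_score(result.stdout)
    if decide(previous, current) == "keep":
        # Same commands as the "Higher than before" row of the table.
        subprocess.run(["git", "add", "prompt.txt"], check=True)
        message = f"Better prompt - score {current:g}"
        subprocess.run(["git", "commit", "-m", message], check=True)
        return current
    # Same command as the "Same or lower" row of the table.
    subprocess.run(["git", "checkout", "prompt.txt"], check=True)
    return previous
```

Calling `step(best_so_far)` after each edit to `prompt.txt` would apply the table's rule mechanically; you still decide what to change next.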
Keep going for many iterations. Because only improvements are committed, each kept change moves the score up.
For a fully automated version, use `run_evolution.sh`. It uses a genetic algorithm with mutation, crossover, and selection, so no manual editing is required.

    chmod +x run_evolution.sh
    ./run_evolution.sh

- Make small, focused changes rather than rewriting the whole prompt
- Check `results.log` to track your score history
- The best prompts are specific, structured, and mention concrete libraries and patterns
- Focus on instructions that lead to production-ready, local-first agent projects
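For readers unfamiliar with the terms mutation, crossover, and selection used to describe the automated version, here is a toy sketch of a genetic algorithm over prompts. It is not the implementation in `run_evolution.sh`: the instruction pool, the keyword list, and the `fitness` function are hypothetical stand-ins (the real loop would score candidates with `eval.py`), and a prompt is modelled simply as a list of instruction lines.

```python
import random

# Hypothetical instruction pool and keyword list, for illustration only.
POOL = [
    "Use LangGraph for agent orchestration.",
    "Validate data with Pydantic models.",
    "Return code in markdown code blocks.",
    "Include unit tests.",
    "Document setup steps in a README.",
]
KEYWORDS = ["LangGraph", "Pydantic", "markdown", "tests", "README"]


def fitness(prompt):
    """Toy stand-in for eval.py: percentage of keywords mentioned (0-100)."""
    text = " ".join(prompt)
    return 100 * sum(k in text for k in KEYWORDS) // len(KEYWORDS)


def mutate(prompt, rng):
    """Mutation: append one pool instruction that the prompt does not have yet."""
    missing = [line for line in POOL if line not in prompt]
    return prompt + [rng.choice(missing)] if missing else list(prompt)


def crossover(a, b, rng):
    """Crossover: a prefix of parent a, then parent b's lines not already included."""
    head = a[: rng.randint(0, len(a))]
    return head + [line for line in b if line not in head]


def evolve(generations=10, population_size=6, survivors=3, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(POOL)] for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep only the highest-scoring prompts as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(fitness(best))
    print("\n".join(best))
```

Because the surviving parents are carried over unchanged, the best score in the population never decreases between generations, which mirrors the commit-only-improvements rule of the manual loop.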