diff --git a/.github/workflows/autoloop.md b/.github/workflows/autoloop.md
index c9a34a6..f7010d2 100644
--- a/.github/workflows/autoloop.md
+++ b/.github/workflows/autoloop.md
@@ -300,7 +300,7 @@ Each run executes **one iteration for the single selected program**:
 
 1. Read the program file to understand the goal, targets, and evaluation method.
 2. Read the **state file** `{program-name}.md` from the repo-memory folder. This is the **single source of truth** for all program state. The file contains:
-   - **⚙️ Machine State** table: `last_run`, `best_metric`, `target_metric`, `iteration_count`, `paused`, `pause_reason`, `completed`, `completed_reason`, `consecutive_errors`, `recent_statuses`. These are machine-readable scheduling and control fields visible to both humans and the pre-step.
+   - **⚙️ Machine State** table: `last_run`, `initial_metric`, `best_metric`, `target_metric`, `iteration_count`, `paused`, `pause_reason`, `completed`, `completed_reason`, `consecutive_errors`, `recent_statuses`. These are machine-readable scheduling and control fields visible to both humans and the pre-step.
    - **🎯 Current Priorities**: Human-set guidance for the next iterations (editable by maintainers).
    - **📚 Lessons Learned**: Key findings from past iterations.
    - **🚧 Foreclosed Avenues**: Approaches definitively ruled out, with reasons.
@@ -308,6 +308,7 @@ Each run executes **one iteration for the single selected program**:
    - **📊 Iteration History**: Reverse-chronological log of all past iterations.
 
    If the state file does not yet exist, create it in the repo-memory folder using the template defined in the [Repo Memory](#repo-memory) section.
+3. Before proposing a new change, reconcile any previous iteration whose metric was left as `pending-ci`: read the successful CI run/check logs or artifacts for the PR HEAD, parse the CI-measured fitness value, and retroactively update the state file, PR body, and program issue status comment before continuing.
Use that CI-measured value exactly as the iteration metric; if it is the first accepted metric, set `Initial Metric` at the same time.
 
 ### Step 2: Analyze and Propose
 
@@ -345,11 +346,11 @@ Each run executes **one iteration for the single selected program**:
 2. Push the commit to the long-running branch.
 3. If a draft PR does not already exist for this branch, create one:
    - Title: `[Autoloop: {program-name}]`
-   - Body includes: a summary of the program goal, link to the program issue, the current best metric, and AI disclosure: `🤖 *This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.*`
-   If a draft PR already exists, update the PR body with the latest metric and a summary of the most recent accepted iteration. Add a comment to the PR summarizing the iteration: what changed, old metric, new metric, improvement delta, and a link to the actions run.
+   - Body includes: a summary of the program goal, link to the program issue, cumulative fitness (`Fitness: {best_metric} (started at {initial_metric}, {absolute_delta} / {improvement_pct}% improvement)`), and AI disclosure: `🤖 *This PR is maintained by Autoloop. Each accepted iteration adds a commit to this branch.*`
+   If a draft PR already exists, update the PR body with the latest cumulative fitness, the baseline metric, the absolute and percentage improvement, and a summary of the most recent accepted iteration. Add a comment to the PR summarizing the iteration: what changed, old metric, new metric, improvement delta, and a link to the actions run.
 4. Ensure the program issue exists (see [Program Issue](#program-issue) below) — for file-based programs that have no program issue yet (`selected_issue` is null in `/tmp/gh-aw/autoloop.json`), create one and record its number in the state file's `Issue` field.
 5.
Update the state file `{program-name}.md` in the repo-memory folder:
-   - Update the **⚙️ Machine State** table: reset `consecutive_errors` to 0, set `best_metric`, increment `iteration_count`, set `last_run` to current UTC timestamp, append `"accepted"` to `recent_statuses` (keep last 10), set `paused` to false.
+   - Update the **⚙️ Machine State** table: reset `consecutive_errors` to 0, set `best_metric`, set `initial_metric` only if it is currently missing/`—` (never overwrite it after the first accepted metric), increment `iteration_count`, set `last_run` to current UTC timestamp, append `"accepted"` to `recent_statuses` (keep last 10), set `paused` to false.
    - Prepend an entry to **📊 Iteration History** (newest first) with status ✅, metric, PR link, and a one-line summary of what changed and why it worked.
    - Update **📚 Lessons Learned** if this iteration revealed something new about the problem or what works.
    - Update **🔭 Future Directions** if this iteration opened new promising paths.
@@ -366,12 +367,13 @@ Each run executes **one iteration for the single selected program**:
 3. **Update the program issue**: edit the status comment and post a per-iteration comment on the program issue (see [Program Issue](#program-issue)).
 
 **If evaluation could not run** (build failure, missing dependencies, etc.):
-1. Discard the code changes (do not commit them to the long-running branch).
-2. Update the state file `{program-name}.md` in the repo-memory folder:
-   - Update the **⚙️ Machine State** table: increment `consecutive_errors`, increment `iteration_count`, set `last_run`, append `"error"` to `recent_statuses` (keep last 10).
+1. If CI is expected to run the same evaluation or publish the fitness value, commit and push the changes with the metric marked `pending-ci`. After CI succeeds, parse the CI-measured fitness from the run/check logs or artifacts and continue the accept flow using that value.
If the value is not available before the iteration ends, leave the iteration history metric as `pending-ci`, append `"pending-ci"` to `recent_statuses`, update the PR/issue to say the metric is awaiting CI, and reconcile it at the start of the next iteration before proposing new work.
+2. If no CI-measured metric is available or expected, discard the code changes (do not commit them to the long-running branch).
+3. Update the state file `{program-name}.md` in the repo-memory folder:
+   - Update the **⚙️ Machine State** table: increment `iteration_count`, set `last_run`, append `"error"` (or `"pending-ci"` when awaiting CI fitness) to `recent_statuses` (keep last 10). Increment `consecutive_errors` only for the `"error"` path; do not increment it while waiting for CI fitness.
    - If `consecutive_errors` reaches 3+, set `paused` to `true` and set `pause_reason` in the Machine State table, and create an issue describing the problem.
-   - Prepend an entry to **📊 Iteration History** with status ⚠️ and a brief error description.
-3. **Update the program issue**: edit the status comment and post a per-iteration comment on the program issue (see [Program Issue](#program-issue)).
+   - Prepend an entry to **📊 Iteration History** with status ⚠️ and a brief error description, or status ⏳ with metric `pending-ci` when waiting for CI fitness.
+4. **Update the program issue**: edit the status comment and post a per-iteration comment on the program issue (see [Program Issue](#program-issue)).
 
 ## Program Issue
 
@@ -409,8 +411,10 @@ Find the status comment by searching for a comment containing `