Skip to content

[FIX] Remediate Loop Counter Shared Across Failure Paths#3354

Merged
Trecek merged 10 commits into
developfrom
recipe-test-fix-loop-count-shared-across-distinct-failure-pa/3328
May 31, 2026
Merged

[FIX] Remediate Loop Counter Shared Across Failure Paths#3354
Trecek merged 10 commits into
developfrom
recipe-test-fix-loop-count-shared-across-distinct-failure-pa/3328

Conversation

@Trecek
Copy link
Copy Markdown
Collaborator

@Trecek Trecek commented May 30, 2026

Summary

Fix two gaps in the current recipe-test-fix-loop-count-shared-across-distinct-failure-pa/3328 implementation identified by /audit-impl:

  1. Rule engine fixrules_loop_counter.py uses a local bounded BFS (_find_guard_loop_body, max 4 hops) to find the cycle body containing a guard step. The plan required using _find_cycle_members from _rule_helpers.py and taking the union of all returned frozensets that contain the guard, ensuring multi-path cycles are fully captured. The local function is structurally incorrect: it early-returns on the first BFS path found and uses an arbitrary hop bound, producing incomplete full_cycle sets for guards reachable through multiple distinct cycle paths.

  2. Missing tests — Two test methods specified by the original implementation plan are absent from TestLoopBudgetSeparation in tests/recipe/test_rules_integration_predicate.py: test_test_fix_loop_not_reachable_from_merge_gate_path and test_merge_gate_assess_routes_through_merge_fix_guard.

Closes #3328

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/remediation-20260530-130256-691280/.autoskillit/temp/make-plan/remediation_loop_counter_shared_plan_2026-05-30_150300.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

Step Model count uncached output cache_read peak_ctx turns cache_write time
rectify* opus[1m] 1 114 29.2k 4.5M 151.6k 315 265.6k 26m 44s
review_approach* sonnet 1 3.8k 7.8k 203.9k 52.2k 73 40.7k 5m 6s
dry_walkthrough* opus 2 99 26.4k 1.4M 85.1k 293 109.6k 16m 28s
implement* sonnet 2 2.2k 45.7k 2.4M 163.5k 141 187.9k 15m 54s
audit_impl* sonnet 2 3.3k 32.5k 612.9k 67.8k 142 120.7k 18m 15s
make_plan* sonnet 1 1.6k 17.7k 791.8k 76.2k 81 64.1k 8m 8s
post_run_diagnostics* sonnet 1 38 15.9k 100.4k 71.1k 290 44.5k 9m 26s
prepare_pr* sonnet 1 193.5k 7.0k 433.5k 31.0k 41 60.2k 2m 17s
compose_pr* sonnet 1 163.1k 2.5k 399.6k 31.0k 29 15.6k 1m 12s
resolve_review* opus 1 36 14.0k 1.2M 85.6k 42 86.4k 10m 19s
resolve_pre_review_conflicts* opus 1 37 5.9k 1.1M 77.4k 36 61.5k 2m 26s
diagnose_ci* sonnet 1 75.3k 1.1k 213.9k 31.0k 14 15.3k 47s
resolve_ci* opus 1 43 3.8k 900.7k 73.4k 32 57.4k 5m 17s
Total 443.1k 209.5k 14.2M 163.5k 1.1M 2h 2m

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

Step LoC Changed cache_read/LoC cache_write/LoC output/LoC
rectify 0
review_approach 0
dry_walkthrough 0
implement 106 22809.4 1772.5 430.7
audit_impl 0
make_plan 0
post_run_diagnostics 0
prepare_pr 0
compose_pr 0
resolve_review 34 34758.4 2540.6 412.2
resolve_pre_review_conflicts 1489 726.1 41.3 4.0
diagnose_ci 0
resolve_ci 4 225179.5 14340.8 939.5
Total 1633 8715.0 691.7 128.3

Model Usage Breakdown

Model steps uncached output cache_read cache_write time
opus[1m] 1 114 29.2k 4.5M 265.6k 26m 44s
sonnet 8 442.8k 130.2k 5.2M 549.1k 1h 1m
opus 4 215 50.1k 4.5M 314.8k 34m 32s

Trecek and others added 9 commits May 30, 2026 16:55
…er guard after verify

Adds two new semantic rules (loop-counter-cross-path-sharing,
loop-guard-before-verify) and fixes the recipes so each failure path
uses its own counter and the guard fires after the test step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…er rule

Replace the local `_find_guard_loop_body` (bounded BFS, max 4 hops,
early-return on first path) with `_find_cycle_members` from `_rule_helpers`,
taking the union of all frozensets containing the guard step. This correctly
captures multi-path cycles and avoids incomplete `full_cycle` sets.

Remove unused `_build_yaml_successor_map`, `_find_guard_loop_body`,
`_MAX_LOOP_HOPS`, and `from collections import deque`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…path coverage

Add two test methods to TestLoopBudgetSeparation that were specified in
the original plan but absent from the implementation:

- test_test_fix_loop_not_reachable_from_merge_gate_path: BFS backward
  from check_test_fix_loop to assert it is not reachable from merge-gate
  steps (diagnose_merge_gate, merge_gate_assess, check_merge_test_fix_loop).

- test_merge_gate_assess_routes_through_merge_fix_guard: asserts
  merge_gate_assess routes to check_merge_test_fix_loop and not to
  check_test_fix_loop (skipped when merge_gate_assess is absent).

Also add _get_routing_targets helper and deque import used by the new tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…te_test step

Two fixes:

1. Rule engine: exclude external predecessors whose own predecessors are
   all inside the cycle. These are satellite retry guards (e.g.,
   commit_guard_pre_remediation), not independent entry paths.

2. Recipe routing: add merge_gate_test step to remediation, implementation,
   and implementation-groups recipes. check_merge_test_fix_loop now routes
   to merge_gate_test (not the shared test step), preventing merge-gate
   failures from entering the pre-merge fix loop via check_test_fix_loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tions

Reorder merge-gate fix loop from fix→guard→test to fix→test→guard
so the last fix attempt gets verified before the budget is exhausted.
Flow is now: merge_gate_assess → merge_gate_test → check_merge_test_fix_loop.

Update test assertions to match new routing: merge_gate_assess routes to
merge_gate_test (not check_merge_test_fix_loop directly), and the
reachability test now checks direct routing properties instead of
whole-graph backward BFS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…p-counter rule

Replace _find_cycle_members DFS union (which misses cycles through
already-visited nodes) with forward+backward BFS intersection.
This correctly finds all nodes in cycles with the guard step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ositives

Skip the cross-path-sharing check when the cycle exceeds 10 members.
Fix-retry cycles are always small (3-5 steps); large cycles (15+)
indicate general processing loops where counter sharing is intentional
(e.g., merge-prs PR iteration loop).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ounter rule

Remove unreachable `if not full_cycle: continue` guard (frozenset always
contains step_name). Rename inner loop variable `m` to `sn` in generator
comprehension to avoid shadowing the outer regex match variable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…st isolation

Replace for-loop iteration over recipe names with @pytest.mark.parametrize
in test_bundled_recipes_no_counter_sharing and
test_bundled_recipes_no_guard_before_verify for per-recipe failure attribution
under xdist parallel execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Trecek Trecek force-pushed the recipe-test-fix-loop-count-shared-across-distinct-failure-pa/3328 branch from ca55c54 to 2b0eedb Compare May 30, 2026 23:57
The rules_loop_counter.py addition brought recipe/rules/ to 35 Python
files, exceeding the previous exemption cap of 34.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Trecek Trecek added this pull request to the merge queue May 31, 2026
Merged via the queue into develop with commit 3a2f4c8 May 31, 2026
3 checks passed
@Trecek Trecek deleted the recipe-test-fix-loop-count-shared-across-distinct-failure-pa/3328 branch May 31, 2026 00:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant