[FIX] Rectify: Cross-Campaign Reaper Kills Actively Executing Dispatches #3396
Merged
Trecek merged 4 commits on May 31, 2026
…hes from reaper
Introduces a dispatch-level heartbeat file (dispatch-{id}.heartbeat) co-located
with dispatch state files so the cross-campaign reaper can detect active execution
across process boundaries. The reaper now skips identity-confirmed dispatches whose
heartbeat mtime is within the configurable grace period (default 90s), eliminating
the kill vector against actively-executing dispatches from other campaigns.
- _dispatch_reaper.py: add _is_dispatch_heartbeating() helper and heartbeat_grace_seconds
gate inside if identity_confirmed: before dry_run check; add heartbeat_grace_seconds
param to both reap_stale_dispatches() and reap_stale_dispatches_async()
- _api.py: add _dispatch_heartbeat() async context manager (write/touch/cleanup) and
nest it inside execution_marker() around dispatch_food_truck() call
- _lifespan.py: pass heartbeat_grace_seconds=90.0 to both boot call sites
- tests: add 4 new reaper heartbeat tests and 3 new dispatch marker tests;
update 3 strict mock assertions to include heartbeat_grace_seconds kwarg
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
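The reaper-side gate described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from _dispatch_reaper.py: the function name is_heartbeat_fresh, its signature, and the state_dir and now parameters are assumptions made for the example. Only the dispatch-{id}.heartbeat file name pattern, the heartbeat_grace_seconds name, and the 90s default come from the commit message.

```python
import time
from pathlib import Path
from typing import Optional


def is_heartbeat_fresh(
    state_dir: Path,
    dispatch_id: str,
    heartbeat_grace_seconds: float = 90.0,
    now: Optional[float] = None,
) -> bool:
    """Return True if the dispatch's heartbeat file was modified recently.

    Hypothetical helper: the real _is_dispatch_heartbeating() may differ.
    """
    heartbeat_path = state_dir / f"dispatch-{dispatch_id}.heartbeat"
    try:
        mtime = heartbeat_path.stat().st_mtime
    except FileNotFoundError:
        # No heartbeat file: nothing indicates active execution.
        return False
    current = time.time() if now is None else now
    # A dispatch counts as active while its last heartbeat is within the grace period.
    return (current - mtime) <= heartbeat_grace_seconds
```

A reaper would presumably call a check like this only after confirming process identity and skip the kill when it returns True. Because the check relies on file mtime alone, it works across process boundaries without any shared in-memory state.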
…test mock
Two failures:
1. fleet/_api.py used path.write_text() for heartbeat file creation, violating the test_no_direct_write_text_in_src architectural rule. Replaced with atomic_write() from core.io.
2. test_run_dispatch_heartbeat_mtime_is_fresh monkeypatched dispatch_food_truck with a mock returning None, causing AttributeError on skill_result.subtype. Mock now returns a proper SkillResult.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
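For readers unfamiliar with why a direct write_text() call might be disallowed, the usual atomic-write pattern is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that general pattern only. The name atomic_write_text and its signature are hypothetical, and the project's own atomic_write() may behave differently.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(path: Path, data: str) -> None:
    """Hypothetical stand-in for an atomic write helper (not the project's atomic_write)."""
    # Create the temp file next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step, so readers never see a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        # On any failure, remove the temp file and leave the target untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a concurrent reader sees either the old file or the complete new one, never a half-written file.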
Cross-package submodule imports are prohibited by REQ-IMP-001/002. atomic_write is re-exported from core/__init__.pyi.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ispatch_heartbeating
Aligns the private helper parameter name with its public-facing callers (reap_stale_dispatches, reap_stale_dispatches_async) for naming symmetry across the call chain.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
e2642ab to 4f82e0c
Summary
The dispatch reaper (_dispatch_reaper.py) kills actively-executing dispatches from other campaigns because it has zero awareness of execution activity. Its only guards do not distinguish "orphaned process from a crashed campaign" from "actively-executing process doing real work."
The architectural solution is a dispatch-level heartbeat sidecar (.heartbeat file co-located with dispatch state files) combined with a reaper activity gate that checks the heartbeat mtime before killing. The heartbeat file is written by _run_dispatch() via a new _dispatch_heartbeat async context manager, is touch()ed every 30s, and is deleted on normal completion.
Closes #3355
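The write/touch/cleanup lifecycle described in the summary could look roughly like the sketch below. It is an illustration under stated assumptions, not the _dispatch_heartbeat implementation: the name heartbeat_sidecar, its parameters, and the choice to leave the file in place when the body raises are all hypothetical. It also uses Path.touch() for creation for brevity, whereas the PR routes creation through atomic_write().

```python
import asyncio
import contextlib
from contextlib import asynccontextmanager
from pathlib import Path
from typing import AsyncIterator


@asynccontextmanager
async def heartbeat_sidecar(
    heartbeat_path: Path, interval_seconds: float = 30.0
) -> AsyncIterator[Path]:
    """Hypothetical sketch: keep a heartbeat file's mtime fresh while the body runs."""
    heartbeat_path.touch()  # create the sidecar before execution starts

    async def _beat() -> None:
        while True:
            await asyncio.sleep(interval_seconds)
            heartbeat_path.touch()  # refresh mtime so a reaper sees recent activity

    task = asyncio.create_task(_beat())
    try:
        yield heartbeat_path
    finally:
        # Always stop the background refresher, whether or not the body raised.
        task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await task
    # Reached only when the body exits without raising (assumed "normal completion").
    heartbeat_path.unlink(missing_ok=True)
```

In this sketch a failed dispatch leaves its heartbeat file behind, so the file simply ages past the grace period and the dispatch becomes eligible for reaping again. A refresh interval of 30s against a 90s grace period allows a couple of missed beats before the reaper treats the dispatch as inactive.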
Implementation Plan
Plan file: /home/talon/projects/autoskillit-runs/remediation-20260530-231830-883536/.autoskillit/temp/rectify/rectify_cross_campaign_reaper_immunity_2026-05-30_233000.md
🤖 Generated with Claude Code via AutoSkillit
Token Usage Summary
* Step used a non-Anthropic provider; caching behavior may differ.
Token Efficiency
Model Usage Breakdown