Skip to content

FE-743: Petri parallel execution — concurrent firing, resource pools, worktree-per-slice#149

Open
kostandinang wants to merge 2 commits into
ka/fe-738-petri-semantic-lanesfrom
ka/fe-743-petri-parallel-execution
Open

FE-743: Petri parallel execution — concurrent firing, resource pools, worktree-per-slice#149
kostandinang wants to merge 2 commits into
ka/fe-738-petri-semantic-lanesfrom
ka/fe-743-petri-parallel-execution

Conversation

@kostandinang
Copy link
Copy Markdown
Contributor

@kostandinang kostandinang commented May 22, 2026

Summary

Phase 2 of the petri orchestrator arc (umbrella H-6476). Parallel firing + shared resource pools + sandbox-per-slice — the categorical break where the Petri engine earns its complexity over proc. Decision gate passed: parallel measurably beats serial on multi-slice plans. Stacks on FE-738 (#148).

Changes

Interpreter (petri-net.ts)

Compiler (net-compiler.ts)

  • Shared pool:test-agent / pool:code-agent places replace per-slice agent places
  • agentPoolSize on RunPolicy caps global concurrency (default = slice count = unbounded)
  • Per-slice sandbox dirs via join(sandboxDir, sliceId) for all action contexts

CLI (cook-cli.ts)

  • Retired dead --engine=proc|petri flag (proc engine deleted in FE-730)
  • Added --policy=serial|parallel wired to createOrchestrator()

Naming (PR #143 review cleanup)

  • Renamed worktreeDirsandboxDir across all orchestrator files — disambiguates from git worktrees.

Verification

  • All 41 contract tests pass under both serial and parallel policies
  • Concurrency proof — maxConcurrent > 1 under parallel
  • Wall-clock benchmark — parallel < 85% of serial time
  • Pool-bound tests — agentPoolSize=1 → 1; =2 → 2; default → 3
  • Sandbox-per-slice isolation adapter test
  • npm run verify clean

Acceptance (per memory/PLAN.md)

(1)–(6) met including decision gate: wall-clock improvement measurable on a 3+ slice fixture vs serial execution.

Next

Phase 3 — petri-graph-compilation — blocked on FE-700 (intent-graph-semantics). Frontier definition added to memory/PLAN.md capturing two open constraints from PR #143 review:

  1. Declarative output arcs — current topology declares only input places; output routing lives in fire closures. HandlerDescriptor declares candidate outputs but selection is runtime. Phase 3 should move conditional routing into topology-level guard predicates + declared output arcs for formal analyzability.
  2. Token state enrichment — open question whether more metadata should move from reports into tokens as the token taxonomy gets richer (spec §3).

@cursor
Copy link
Copy Markdown

cursor Bot commented May 22, 2026

PR Summary

High Risk
Adds a new parallel Petri execution mode with greedy token-claiming concurrency and shared resource pools, which can introduce ordering/race issues and changes execution semantics. Also renames and re-scopes filesystem execution directories (worktreeDirsandboxDir) affecting where agents/tests read and write artifacts.

Overview
Implements a new Petri parallel firing policy that runs all enabled transitions concurrently (greedy token claiming + Promise.allSettled) and wires the orchestrator factory/contract suite to run under both serial and parallel.

Reworks compilation/runtime to use shared pool:test-agent / pool:code-agent token places (bounded by new RunPolicy.agentPoolSize) instead of per-slice agent tokens, and switches execution from worktreeDir to per-slice sandboxDir directories (with epic verification running against the parent sandbox).

Updates brunch cook to drop --engine in favor of --policy=serial|parallel, renames createWorktreecreateSandbox, adjusts pi action/test runner cwd usage, and refreshes plan/docs to mark FE-743 complete and define the Phase 3 petri-graph-compilation horizon item.

Reviewed by Cursor Bugbot for commit 0f5cdbf. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown
Contributor Author

kostandinang commented May 22, 2026

@kostandinang kostandinang changed the title plan: petri-parallel-execution in-progress, FE-743 assigned FE-743: Petri parallel execution — concurrent firing, resource pools, worktree-per-slice May 22, 2026
}
}

if (firstError) throw firstError;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parallel batch ignores halt flag

Medium Severity

Under parallel policy, shouldHalt (wired to ctx.halted) is only checked at the start of each loop iteration. If one transition in a concurrent batch sets ctx.halted (retry or semantic exhaustion), sibling transitions in the same Promise.allSettled batch still run to completion, unlike serial mode where the next transition is skipped.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 6ff7563. Configure here.

Comment thread src/orchestrator/src/petri-net.ts
@augmentcode
Copy link
Copy Markdown

augmentcode Bot commented May 22, 2026

🤖 Augment PR Summary

Summary: This PR completes FE-743 by adding true parallel Petri-net execution, shared agent resource pools, and per-slice worktree isolation to reduce wall-clock time on multi-slice plans.

Changes:

  • Extend FiringPolicy to serial | parallel and implement runParallel() using greedy token claiming + Promise.allSettled.
  • Add shared pool:test-agent/pool:code-agent places and RunPolicy.agentPoolSize to cap global agent concurrency.
  • Route action contexts and test runs through join(worktreeDir, sliceId) to isolate slice execution directories.
  • Update brunch cook CLI: retire --engine, add --policy=serial|parallel, default to serial.
  • Expand contract tests to cover both policies and add explicit concurrency/pool/worktree isolation checks.
  • Update memory/PLAN.md to mark petri-parallel-execution done and adjust sequencing.

Technical Notes: Parallel execution aims for serial parity while allowing bounded concurrency via shared pool tokens.

🤖 Was this summary useful? React with 👍 or 👎

Copy link
Copy Markdown

@augmentcode augmentcode Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 2 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

Comment thread src/orchestrator/src/net-compiler.ts Outdated
slice,
epic,
plan,
worktreeDir: join(input.worktreeDir, sliceId),
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worktreeDir is now set to join(input.worktreeDir, sliceId), but createWorktree() currently only creates the parent directory; pi-actions and BunTestRunner use this path as cwd and will fail if the slice subdirectory doesn’t exist. This same assumption shows up in the other join(input.worktreeDir, <sliceId>) call sites in this file as well.

Severity: high

Other Locations
  • src/orchestrator/src/net-compiler.ts:382
  • src/orchestrator/src/net-compiler.ts:421
  • src/orchestrator/src/net-compiler.ts:488

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

tracker.maxConcurrent = Math.max(tracker.maxConcurrent, active);
await new Promise((resolve) => setTimeout(resolve, 5));
const result = await handler!(ctx);
active--;
Copy link
Copy Markdown

@augmentcode augmentcode Bot May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In withConcurrencyTracking, if a wrapped handler throws/rejects, active-- won’t run, so the tracker can overcount and make concurrency assertions misleading on failure paths. It may be worth ensuring the active counter is decremented even when the wrapped handler errors.

Severity: low

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

@kostandinang kostandinang force-pushed the ka/fe-743-petri-parallel-execution branch from 6ff7563 to d4e9d95 Compare May 22, 2026 13:32
Comment thread src/orchestrator/src/net-compiler.ts
@kostandinang kostandinang force-pushed the ka/fe-743-petri-parallel-execution branch from d4e9d95 to d43222e Compare May 22, 2026 13:57
@kostandinang kostandinang force-pushed the ka/fe-738-petri-semantic-lanes branch from 8e56a7c to f959557 Compare May 22, 2026 14:14
@kostandinang kostandinang force-pushed the ka/fe-743-petri-parallel-execution branch from d43222e to 55945f9 Compare May 22, 2026 14:14
Comment thread src/orchestrator/src/net-compiler.ts
@kostandinang kostandinang force-pushed the ka/fe-743-petri-parallel-execution branch 3 times, most recently from 13953ca to 386c3cf Compare May 22, 2026 14:31
Comment thread src/orchestrator/src/net-compiler.ts
kostandinang and others added 2 commits May 22, 2026 16:58
Parallel firing policy (petri-net.ts):
- FiringPolicy = 'serial' | 'parallel'
- runParallel: greedy token claiming + Promise.allSettled concurrent fire
- Extracted isEnabled() private helper

Shared resource pools (net-compiler.ts):
- pool:test-agent / pool:code-agent replace per-slice agent places
- agentPoolSize on RunPolicy bounds global concurrency
- Worktree-per-slice: join(worktreeDir, sliceId) for all action contexts

CLI (cook-cli.ts):
- Retired dead --engine=proc|petri flag
- Added --policy=serial|parallel wired to createOrchestrator

Tests (engine-contract.test.ts):
- Parallel added to all engines arrays (serial parity)
- Concurrency proof, wall-clock benchmark, pool-bounded tests
- Extracted withConcurrencyTracking() helper
- Worktree isolation adapter test

Decision gate passed: parallel measurably beats serial on wall clock.

Co-authored-by: Amp <amp@ampcode.com>
@kostandinang kostandinang force-pushed the ka/fe-743-petri-parallel-execution branch from 386c3cf to 0f5cdbf Compare May 22, 2026 14:59
@kostandinang kostandinang force-pushed the ka/fe-738-petri-semantic-lanes branch from f959557 to 1747538 Compare May 22, 2026 14:59
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 0f5cdbf. Configure here.

epic,
plan,
sandboxDir: join(input.sandboxDir, sliceId),
reports,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slice id escapes sandbox

Medium Severity

Per-slice sandboxes use join(input.sandboxDir, slice.id) without validating slice.id. A plan id containing .. segments resolves outside the run sandbox for mkdirSync, action cwd, and test runs, so orchestration can write or execute under arbitrary filesystem paths from YAML.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 0f5cdbf. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant