Serialize expensive editor operations with leases#1184
Conversation
📝 WalkthroughWalkthroughThis pull request introduces a file-based lease mechanism to coordinate "Unity editor operations" across concurrent MCP agents. It prevents destructive operations (refresh_unity and run_tests) from executing simultaneously by acquiring exclusive leases keyed to a Unity instance and operation type, with atomic file creation, configurable TTL, expiry detection, and reentrancy support for same-owner reacquisition. ChangesEditor Operation Lease Coordination
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
docs/plans/2026-05-31-multi-agent-operation-lease-spec.md (1)
68-69: ⚡ Quick winAdd trailing newline at end of file.
The file should end with a newline character after line 68. This follows POSIX standards and is a common best practice for text files.
📝 Suggested fix
- Tool-test that `run_tests` releases the startup lease after dispatch. +🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/plans/2026-05-31-multi-agent-operation-lease-spec.md` around lines 68 - 69, The file ends without a trailing newline; add a single newline character at the end of docs/plans/2026-05-31-multi-agent-operation-lease-spec.md so the last line ("Tool-test that `run_tests` releases the startup lease after dispatch.") is terminated by a newline character to satisfy POSIX/file-format conventions.Server/src/services/tools/editor_operation_lease.py (1)
126-134: 💤 Low valueMinor race window between payload read and lease reclaim.
Between reading the payload at line 126 and unlinking at line 129, another process could reclaim the expired lease. If the original owner releases and a new owner creates in that window, the stale payload's
expires_unix_msstill appears expired, causing the unlink to delete a valid lease.The window is narrow and the impact is limited (occasional duplicate work, handled by retry semantics), but for a more robust guarantee, consider re-reading/re-checking expiry immediately before unlink or using atomic rename.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@Server/src/services/tools/editor_operation_lease.py` around lines 126 - 134, There’s a race between reading the payload with _read_payload and deleting it with path.unlink: before unlinking, re-read the file and re-check expiry with _is_lease_expired (and/or compare the freshly read payload metadata/body against the earlier `existing` to confirm it hasn’t been replaced) so you only unlink when the file is still the same expired lease; alternatively perform an atomic operation (rename/move to a tombstone name and validate) before removing to avoid deleting a newly-created valid lease — update the block that calls _read_payload, _is_lease_expired, path.unlink and _payload_to_info to perform this extra verification step (use instance_id or payload contents/timestamp to detect changes).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/plans/2026-05-31-multi-agent-operation-lease-spec.md`:
- Around line 33-61: Document that the lease supports reentrant acquisition by
the same owner (e.g., allow nested acquisition within a single tool invocation
so operations like run_tests preflight refresh can acquire the same lease held
for refresh_unity without releasing it), and add a Lease duration subsection
that lists default TTLs and that they are configurable (specify defaults for
refresh_unity and run_tests startup phase and note that expiry time in the lease
file is used to reclaim crashes); update the Design section to mention the lease
file fields (owner, operation, PID, token, start time, expiry time), reentrancy
behavior, default TTL values, and that TTLs are configurable.
---
Nitpick comments:
In `@docs/plans/2026-05-31-multi-agent-operation-lease-spec.md`:
- Around line 68-69: The file ends without a trailing newline; add a single
newline character at the end of
docs/plans/2026-05-31-multi-agent-operation-lease-spec.md so the last line
("Tool-test that `run_tests` releases the startup lease after dispatch.") is
terminated by a newline character to satisfy POSIX/file-format conventions.
In `@Server/src/services/tools/editor_operation_lease.py`:
- Around line 126-134: There’s a race between reading the payload with
_read_payload and deleting it with path.unlink: before unlinking, re-read the
file and re-check expiry with _is_lease_expired (and/or compare the freshly read
payload metadata/body against the earlier `existing` to confirm it hasn’t been
replaced) so you only unlink when the file is still the same expired lease;
alternatively perform an atomic operation (rename/move to a tombstone name and
validate) before removing to avoid deleting a newly-created valid lease — update
the block that calls _read_payload, _is_lease_expired, path.unlink and
_payload_to_info to perform this extra verification step (use instance_id or
payload contents/timestamp to detect changes).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ddad71a6-1bfb-4ad8-8e6a-700f22c7a746
📒 Files selected for processing (5)
Server/src/services/tools/editor_operation_lease.pyServer/src/services/tools/refresh_unity.pyServer/src/services/tools/run_tests.pyServer/tests/integration/test_editor_operation_lease.pydocs/plans/2026-05-31-multi-agent-operation-lease-spec.md
| ## Design | ||
|
|
||
| The Python server creates one atomic file lease per Unity instance in a shared | ||
| per-user directory. The lease file contains owner, operation, PID, token, start | ||
| time, and expiry time. | ||
|
|
||
| Acquire uses exclusive file creation so separate MCP server processes agree on | ||
| one owner. If the file exists and is not expired, the tool returns: | ||
|
|
||
| ```json | ||
| { | ||
| "success": false, | ||
| "error": "operation_busy", | ||
| "hint": "retry", | ||
| "data": { | ||
| "reason": "operation_busy", | ||
| "operation": "refresh_unity", | ||
| "owner": "session:...", | ||
| "retry_after_ms": 2000 | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| `refresh_unity` holds the lease for the whole refresh/wait operation. | ||
|
|
||
| `run_tests` holds the lease only while it performs preflight and starts the | ||
| async Unity test job. The actual test execution continues to be represented by | ||
| Unity's test state and `get_test_job`; this keeps the change small while | ||
| removing the most common race window at job startup. |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Document reentrancy support and TTL/expiry values.
The design section is missing two important aspects mentioned in the PR objectives and stack context:
-
Reentrancy support: The PR objectives state "Reentrant acquisition is allowed for the same tool invocation so nested operations can proceed without releasing an outer lease." This is a key design feature that should be documented to help readers understand how nested operations (e.g.,
run_testspreflight refreshing dirty assets) interact with the lease mechanism. -
TTL/expiry values: While line 37 mentions "expiry time" as a lease field, the design section doesn't specify default TTL values or mention that they're configurable (as indicated in the stack context). Consider adding a subsection that specifies the default lease duration for each operation type.
📝 Suggested additions to the Design section
Add after line 39:
### Reentrancy
The lease allows reentrant acquisition by the same owner within a single tool invocation. This enables nested operations (e.g., `run_tests` performing a preflight refresh) to proceed without deadlock or requiring the outer lease to be released.
### Lease duration
Each operation type has a default TTL:
- `refresh_unity`: [specify duration, e.g., 300 seconds]
- `run_tests`: [specify duration for the startup phase, e.g., 60 seconds]
Expired leases are automatically reclaimed to recover from crashed MCP server processes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/plans/2026-05-31-multi-agent-operation-lease-spec.md` around lines 33 -
61, Document that the lease supports reentrant acquisition by the same owner
(e.g., allow nested acquisition within a single tool invocation so operations
like run_tests preflight refresh can acquire the same lease held for
refresh_unity without releasing it), and add a Lease duration subsection that
lists default TTLs and that they are configurable (specify defaults for
refresh_unity and run_tests startup phase and note that expiry time in the lease
file is used to reclaim crashes); update the Design section to mention the lease
file fields (owner, operation, PID, token, start time, expiry time), reentrancy
behavior, default TTL values, and that TTLs are configurable.
Summary
Add a small cross-process operation lease for expensive editor operations that target the same Unity instance.
operation_busy+hint=retryresponses when another agent owns the operation windowrefresh_unityfor the full refresh/wait operationrun_testsstartup path while preflight and async job dispatch runrun_testspreflight can still refresh dirty assets without dropping the outer leaseWhy
This is aimed at the multi-agent workflow where two or more MCP clients point at the same Unity project. In that setup, agents can independently decide to refresh assets, request compilation, or start tests before Unity has updated its own busy state. The result is avoidable contention: duplicate refreshes, overlapping compile waits, or test-start attempts that make the project feel much slower.
The lease gives the server a small coordination layer before those expensive operations reach Unity. It does not serialize normal read-only tools, and it does not try to become a full job queue. Clients still get a retryable response, so they can back off and retry instead of piling more work onto the editor.
Validation
uv run --extra dev python -m pytest tests/integration/test_editor_operation_lease.py tests/integration/test_run_tests_async.py tests/integration/test_refresh_unity_retry_recovery.py -qgit diff --checkSummary by CodeRabbit
Release Notes
New Features
Tests
Documentation