fix(sandbox): bound quick file ops during link partitions #4066
Merged
decocms Bot pushed a commit that referenced this pull request on Jun 22, 2026
PR: #4066 fix(sandbox): bound quick file ops during link partitions
Bump type: patch
- decocms (apps/mesh/package.json): 3.43.2 -> 3.43.3
Deploy-Scope: server
tlgimenes added a commit that referenced this pull request on Jun 22, 2026
…nnect assertion) (#4069)

The Resilience Tests workflow has been red on main since the NATS tunnel transport landed (#3854) and worsened with #4066. There are two independent root causes in the sandbox↔studio WS-partition suite:

1. Baseline write/read 502s on a healthy link. #4066 bounded quick file ops at a 10s QUICK_FILE_OP_TIMEOUT_MS (sandbox-proxy.ts). The daemon gates the first quick file op on waitForFirstMounts(), which waits up to 10s (FIRST_MOUNT_WAIT_MS, entry.ts) for org-fs mounts. These containers have no FUSE, so rclone mount always fails after the full grace period, and the first write times out at exactly 10s. org-fs can't work here and this suite doesn't test it, so disable it via DISABLE_ORGFS_MOUNTS on the studio service. The daemon then never expects org-fs and skips the wait.

2. "WS restored → reconnects" can never pass. Since #3854, presence is an optimistic live /api/links/status probe; /api/links/me no longer returns a stored claim's connectedAt. The test asserted `claim.connectedAt > baseline` (undefined > undefined → always false), so it timed out on every run. Detect reconnect by presence reading online again, matching the new model.

Also folds in the NATS healthcheck hardening (image/healthcheck no longer depend on shell utilities, with a docker-compose.test.ts guard) and skips the log-replay suite outright so its heavy beforeAll never runs while its only tests are skipped.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
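The second root cause in the commit message above can be illustrated with a small sketch. It is not the suite's actual test code: the `StoredClaim` and `Presence` types, both function names, and the polling interval are hypothetical, chosen only to show why a timestamp comparison on a missing field never succeeds and what a presence-based check might look like instead.

```typescript
// Hypothetical shape: connectedAt is optional because, per the commit message,
// /api/links/me no longer returns it.
interface StoredClaim {
  connectedAt?: number;
}

// Hypothetical presence value, standing in for the /api/links/status result.
type Presence = "online" | "offline";

// Old-style check. With connectedAt missing on both sides this evaluates
// `undefined > undefined`, which is always false in JavaScript.
function reconnectedByTimestamp(claim: StoredClaim, baseline: StoredClaim): boolean {
  return (claim.connectedAt as number) > (baseline.connectedAt as number);
}

// Presence-style check: poll a probe until it reads "online" or the deadline passes.
async function waitForOnline(
  probe: () => Promise<Presence>,
  timeoutMs: number,
  intervalMs = 50,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if ((await probe()) === "online") return true;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}

// Both claims lack connectedAt, so the old assertion can never pass.
console.log(reconnectedByTimestamp({}, {})); // false
```

In a real test the `probe` argument would wrap a request to the status endpoint; here it is left as a parameter so the sketch makes no assumption about the response format.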
Summary
Test Plan
Note: full bun test was not completed in the sandbox because Docker-backed resilience setup did not progress.
Summary by cubic
Bound quick sandbox file ops to a 10s fast-fail so reads/writes don’t stall ~30s during link partitions and instead return a prompt 502. Updated resilience test timing and aligned Playwright config test with MCP_CACHE_ENABLED=true.

- QUICK_FILE_OP_TIMEOUT_MS and quickFileOpSignal; applied to read/write/mkdir/unlink/rename/glob routes to fail fast on partitions.
- MCP_CACHE_ENABLED=true in the dev server command.

Written for commit 3d7c9cf.
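The fast-fail behaviour summarised above can be sketched as follows. The PR's own `quickFileOpSignal` is not shown on this page, so this is an illustration rather than the merged code: `runQuickFileOp` and `QuickResult` are hypothetical names, the 10s value is taken from the summary, and a plain `AbortController` with a timer stands in for whatever the real helper does.

```typescript
// Value taken from the PR summary; the real constant is said to live in sandbox-proxy.ts.
const QUICK_FILE_OP_TIMEOUT_MS = 10_000;

// Hypothetical result shape: either the operation's body or a 502-style failure.
type QuickResult<T> =
  | { status: 200; body: T }
  | { status: 502; error: string };

// Hypothetical wrapper: run `op` under an abort signal and give up after `timeoutMs`.
async function runQuickFileOp<T>(
  op: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number = QUICK_FILE_OP_TIMEOUT_MS,
): Promise<QuickResult<T>> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  // Rejects when the signal aborts, so the wrapper returns promptly even if
  // `op` itself ignores the signal.
  const aborted = new Promise<never>((_, reject) => {
    controller.signal.addEventListener(
      "abort",
      () => reject(new Error("quick file op timed out")),
      { once: true },
    );
  });

  try {
    const body = await Promise.race([op(controller.signal), aborted]);
    return { status: 200, body };
  } catch (err) {
    if (controller.signal.aborted) {
      return { status: 502, error: `timed out after ${timeoutMs}ms` };
    }
    throw err; // unrelated failure: leave it to the caller
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after a fast success
  }
}
```

A route handler would pass the signal on to the upstream call (for example as the `signal` option of `fetch`), so a partitioned link produces a 502 once the timeout elapses instead of waiting on the transport's own, longer timeout.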