feat: fast worktree setup via a shared frozen deps store #23993
Open
spalladino wants to merge 11 commits into
When CACHE_LINK_DIR is set (local dev only, never on CI), cache_download extracts each tarball once into a frozen, read-only, content-addressed store and grafts absolute symlinks into out_dir instead of extracting in place. Grafting walks each path top-down, descending through real directories (tracked dirs, uninitialised submodule dirs, thawed copies) and symlinking at the first non-descendable component; it derives link roots from the paths themselves so it is correct even when the tarball omits intermediate directory entries. Concurrent extraction of the same entry is guarded by an mkdir lock. yarn-project tarballs are excluded (their contents interleave with tracked src and must stay writable). Each linked entry is appended to .deps-manifest.linked at the repo root for gc.
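The extract-once and freeze steps described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `extract_once` and the `.lock` suffix are hypothetical names, and the only mechanisms taken from the description are the mkdir lock and the read-only freeze.

```bash
#!/usr/bin/env bash
set -euo pipefail

# extract_once TARBALL ENTRY_DIR
# Hypothetical helper: extract TARBALL into ENTRY_DIR at most once,
# serialising concurrent callers with an mkdir-based lock.
extract_once() {
  local tarball=$1 entry=$2
  local lock="$entry.lock"   # assumed lock location, next to the entry

  mkdir -p "$(dirname "$entry")"

  # mkdir is atomic: exactly one caller creates the lock dir, the rest retry.
  until mkdir "$lock" 2>/dev/null; do
    sleep 0.1
  done

  # Under the lock, an existing entry means another caller already finished.
  if [ ! -d "$entry" ]; then
    mkdir "$entry"
    tar -xzf "$tarball" -C "$entry"
    chmod -R a-w "$entry"    # freeze: later writes fail instead of corrupting
  fi

  rmdir "$lock"
}
```

The sketch has no stale-lock or partial-extraction recovery: if `tar` fails, the lock directory and a half-written entry are left behind, which a real implementation would need to handle.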
scripts/worktrees.sh creates aztec-packages worktrees backed by the shared frozen deps store instead of a full multi-minute bootstrap. create makes a git worktree, copies the writable yarn-project layer (node_modules, .yarn cache, gitignored build outputs) from the source checkout, and runs each upstream component bootstrap in CACHE_LINK_DIR link mode so cached artifacts are symlinked from the store. Subcommands: create, status, thaw (replace store symlinks with writable copies), gc (mark-and-sweep the store against live worktree manifests). --frozen-only aborts on a coarse cache miss.
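A mark-and-sweep gc of the kind described above might look like the sketch below. It assumes that each `.deps-manifest.linked` holds one store entry name per line and that store entries are direct children of the store directory; neither detail is confirmed by the description, and `is_live` and `gc_store` are hypothetical names.

```bash
#!/usr/bin/env bash
set -euo pipefail

# is_live NAME WORKTREE...: succeeds if any worktree manifest lists NAME.
# Assumption: the manifest holds one entry name per line.
is_live() {
  local name=$1 wt
  shift
  for wt in "$@"; do
    if [ -f "$wt/.deps-manifest.linked" ] &&
       grep -qxF -- "$name" "$wt/.deps-manifest.linked"; then
      return 0
    fi
  done
  return 1
}

# gc_store STORE WORKTREE...: delete store entries no manifest references.
gc_store() {
  local store=$1 entry name
  shift
  for entry in "$store"/*/; do
    [ -d "$entry" ] || continue   # glob matched nothing
    name=$(basename "$entry")
    if ! is_live "$name" "$@"; then
      chmod -R u+w "$entry"       # entries are frozen, so restore write access first
      rm -rf "$entry"
      echo "swept $name"
    fi
  done
}
```

The mark phase is the manifest lookup and the sweep phase is the loop over store entries, so an entry survives as long as at least one live worktree still lists it.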
An uninitialized noir-repo submodule makes git -C noir-repo rev-parse HEAD resolve to the parent repo HEAD, corrupting the noir content hash and every downstream component hash, so all cache lookups miss.
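One way to detect this situation before hashing is to check that the directory is the top level of its own working tree. The helper below is a hedged sketch with a hypothetical name (`is_own_repo`); the PR itself may guard against the problem differently, for example by always initialising the submodule.

```bash
#!/usr/bin/env bash
set -euo pipefail

# is_own_repo DIR: succeeds only if DIR is the top level of a git working tree.
# For an uninitialized submodule directory, git walks up to the parent repo,
# so the reported top level differs from DIR and the check fails.
is_own_repo() {
  local dir=$1 top
  top=$(git -C "$dir" rev-parse --show-toplevel 2>/dev/null) || return 1
  [ "$top" = "$(cd "$dir" && pwd -P)" ]
}
```

A caller could then refuse to compute a content hash from `git -C noir-repo rev-parse HEAD` unless `is_own_repo noir-repo` succeeds.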
…ource detection
Dir-only gitignore patterns (build*/) do not match symlinks, so grafted links showed as untracked, dirtying git status and disabling content-hash caching for the whole checkout. The graft now degrades such link roots to real directories and links one level deeper until the path is ignored. worktrees.sh now derives the source checkout from the script location rather than the CWD, and refuses to run from an unbuilt source.
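The top-down graft with the degradation step can be sketched as below. `graft_path` and `is_ignored` are hypothetical names, and the sketch grafts a single relative path, whereas the real implementation works from the whole tarball listing. The default `is_ignored` uses `git check-ignore -q`, which exits 0 when the path is ignored.

```bash
#!/usr/bin/env bash
set -euo pipefail

# is_ignored PATH: overridable predicate, by default asks git.
is_ignored() {
  git -C "$(dirname "$1")" check-ignore -q "$1"
}

# graft_path STORE OUT REL: walk REL top-down under OUT, descending through
# real directories and symlinking into STORE at the first missing component.
graft_path() {
  local store=$1 out=$2 rel=$3 prefix="" part parts dst src
  IFS=/ read -ra parts <<< "$rel"
  for part in "${parts[@]}"; do
    prefix="${prefix:+$prefix/}$part"
    dst="$out/$prefix"
    src="$store/$prefix"
    if [ -L "$dst" ]; then return 0; fi   # already grafted higher up
    if [ -d "$dst" ]; then continue; fi   # real directory: descend
    if [ -e "$dst" ]; then return 0; fi   # real file: leave untouched
    ln -s "$src" "$dst"                   # absolute link if STORE is absolute
    # Keep the link if git ignores it, or if it is a leaf that cannot be
    # degraded to a directory.
    if is_ignored "$dst" || [ ! -d "$src" ]; then return 0; fi
    # Not ignored as a symlink: degrade to a real dir and link one level deeper.
    rm "$dst"
    mkdir "$dst"
  done
}
```

After a degradation only the component on the current path is linked; sibling entries of the degraded directory are picked up when their own paths are grafted.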
…strap writes
Three bootstraps write into cached outputs even on cache hits, which fails against the frozen CACHE_LINK_DIR store: bb's inject_version (skip read-only binaries), noir-contracts' stamp_dev_aztec_version (replace by rename so the store symlink becomes a real stamped copy; idempotent), and bb.js's test snapshot copy into dest (skip when read-only). Also relax worktrees.sh yp_same_state to ignore untracked files (-uno) when comparing source state.
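The replace-by-rename idea can be illustrated with a toy version. `replace_by_rename` and `stamp_dev_sketch` are hypothetical helpers, and the `sed` expression assumes the field already exists in a one-line JSON file, which is a simplification of whatever the real stamp does.

```bash
#!/usr/bin/env bash
set -euo pipefail

# replace_by_rename TARGET: write stdin to a temp file next to TARGET, then
# rename it over TARGET. If TARGET is a symlink to a file, the rename replaces
# the link itself, so the file it pointed at is never written.
replace_by_rename() {
  local target=$1 tmp
  tmp=$(mktemp "$(dirname "$target")/.tmp.XXXXXX")
  cat > "$tmp"
  mv -f "$tmp" "$target"
}

# stamp_dev_sketch FILE: toy idempotent stamp of an existing aztec_version field.
stamp_dev_sketch() {
  local file=$1
  if grep -q '"aztec_version": "dev"' "$file"; then
    return 0   # already stamped: no-op, a symlink stays a symlink
  fi
  sed 's/"aztec_version": "[^"]*"/"aztec_version": "dev"/' "$file" |
    replace_by_rename "$file"
}
```

Because the new content is staged beside the target and moved into place, the frozen store copy keeps its original bytes while the checkout ends up with a real, writable file.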
Updates the worktree-spawn skill and root CLAUDE.md so worktrees are created with the cached-store setup instead of a full bootstrap.
…tore
Node resolves imports from a module's real path, so bb.js dest and noir/packages code living in the store cannot see the checkout's node_modules and their runtime deps fail to resolve. Exclude bb.js-* and noir-packages-* tarballs from link mode alongside yarn-project-*.
6331012 to a1768e3
cache_upload returned at its CI guard before the CACHE_LOCAL_DIR save block, so artifacts built locally after a remote cache miss were never reusable by other checkouts or worktrees. Local-save now always runs when CACHE_LOCAL_DIR is set; the S3 upload remains gated on CI/S3_FORCE_UPLOAD, and the no-op case (nothing to save anywhere) skips tarring entirely.
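The resulting control flow can be summarised as a small decision function. This is only a sketch: `upload_plan` is a hypothetical name, and it assumes `CI` and `S3_FORCE_UPLOAD` are 0/1 flags, which the text implies for `CI` but does not state for `S3_FORCE_UPLOAD`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# upload_plan: print which steps cache_upload would take under the new rules.
# Assumption: CI and S3_FORCE_UPLOAD are 0/1 flags; unset means 0.
upload_plan() {
  local save_local=0 save_s3=0 plan="tar"

  if [ -n "${CACHE_LOCAL_DIR:-}" ]; then
    save_local=1          # local save no longer depends on CI
  fi
  if [ "${CI:-0}" = 1 ] || [ "${S3_FORCE_UPLOAD:-0}" = 1 ]; then
    save_s3=1             # S3 upload stays gated
  fi

  if [ "$save_local" = 0 ] && [ "$save_s3" = 0 ]; then
    echo "skip"           # nowhere to save: do not tar at all
    return 0
  fi
  if [ "$save_local" = 1 ]; then plan="$plan local"; fi
  if [ "$save_s3" = 1 ]; then plan="$plan s3"; fi
  echo "$plan"
}
```

The previous behaviour corresponds to returning before the local-save decision whenever the CI guard failed, which is why local builds never populated `CACHE_LOCAL_DIR`.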
Cached contract tarballs now already carry aztec_version "dev", so the post-cache-hit stamp fast-paths to a no-op and grafted store symlinks survive in CACHE_LINK_DIR worktrees instead of being materialized as real stamped copies. Release paths overwrite the field unconditionally, so a pre-stamped value is harmless.
Remove the hardcoded ~/Projects location and spl/ branch prefix: worktrees are created as siblings of the source checkout, the source resolves from CWD (falling back to the script's own checkout), and the default branch prefix comes from git user.initials or user.name. A name containing a slash is taken as the full branch name, and create gains --dry-run to print the resolved source/path/branch without mutating anything.
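The branch-naming rule can be sketched as follows. `resolve_branch` and `default_prefix` are hypothetical names, and joining prefix and name with a slash is an assumption based on the old spl/ prefix. How the script turns user.name into a prefix is not described, so this sketch returns it verbatim.

```bash
#!/usr/bin/env bash
set -euo pipefail

# default_prefix: prefer git user.initials, fall back to user.name.
# Assumption: the value is used as-is; the real script may normalise it.
default_prefix() {
  git config --get user.initials || git config --get user.name
}

# resolve_branch NAME PREFIX: a NAME containing a slash is the full branch
# name; otherwise the branch is PREFIX/NAME.
resolve_branch() {
  local name=$1 prefix=$2
  case "$name" in
    */*) printf '%s\n' "$name" ;;
    *)   printf '%s\n' "$prefix/$name" ;;
  esac
}
```

For example, `resolve_branch fix-cache "$(default_prefix)"` yields a prefixed branch, while `resolve_branch feat/link-mode "$(default_prefix)"` keeps the name unchanged.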
Motivation
Creating a worktree of aztec-packages today means a full ./bootstrap.sh: many minutes of compiling barretenberg, noir, contracts, and every TS workspace, even though those artifacts almost never differ between worktrees. Sharing them naively (symlinks into a sibling checkout) has two failure modes: rebuilding the main checkout silently changes artifacts under every worktree, and a stray rebuild inside a worktree silently writes back into the shared checkout.
Approach
Reuse the ci3 build cache as the source of truth, and add an immutable, content-addressed, extracted store next to the existing tarball cache:
- ci3/cache_download (opt-in via CACHE_LINK_DIR, hard-disabled on CI): a tarball is extracted once into $CACHE_LINK_DIR/<name>/, frozen with chmod -R a-w, and symlinks are grafted into the checkout from the tarball's own listing, with no hardcoded path lists. Content addressing makes entries immutable by construction; the freeze turns any accidental write through a symlink into a loud EACCES instead of silent cross-checkout corruption. Grafting degrades a link root to a real directory whenever the symlink would not be gitignored (dir-only patterns like build*/ don't match symlinks, and an untracked path flips content hashes to disabled-cache repo-wide). Tarballs whose contents are resolved as Node modules (bb.js, noir-packages) or interleave with tracked sources (yarn-project) always extract in place.
- scripts/worktrees.sh: create <name> [base-ref] makes a worktree at ~/Projects/<name>, inits the noir/noir-repo submodule (an uninitialized submodule corrupts the noir content hash and cascades to every downstream component), copies the writable yarn layer from the source checkout (node_modules, dest/, and generated outputs, enumerated dynamically from gitignore state), and runs each upstream component bootstrap in link mode. status, thaw (writable copy before rebuilding an upstream component locally), and mark-and-sweep gc (roots are per-worktree manifests) complete the lifecycle.
- bb's inject_version skips read-only binaries, noir-contracts' stamp_dev_aztec_version replaces by rename (the store symlink becomes a real stamped copy) and is now idempotent, and bb.js skips the test-snapshot copy into a read-only dest. No behavior change in CI or normal checkouts, where artifacts are writable.
Design doc with the full mechanics and pitfalls:
scripts/worktrees.md.
Verified end-to-end: a worktree created from a built checkout resolves @aztec/* modules identically to the source, runs bb/nargo through store symlinks, rejects writes into the store, rebuilds workspaces in isolation (source dest/ untouched), keeps git status clean, and passes the world-state native suite (50/50).
Changes
- cache_download gains link mode (CACHE_LINK_DIR): extract-once store, freeze, gitignore-aware symlink grafting, concurrency-safe extraction, per-checkout manifest appends.
- worktrees.sh (create/status/thaw/gc) and worktrees.md design doc.
- The worktree-spawn skill and root CLAUDE.md now create worktrees with scripts/worktrees.sh create.
Follow-up changes (after review of real bootstrap logs)
- cache_upload now saves the artifact into CACHE_LOCAL_DIR even when CI=0. Previously it returned at the CI guard before the local-save block, so artifacts built locally after a remote cache miss were never reusable, and every fresh worktree re-paid the build (~366 CPU-seconds of contract compiles in a measured run). S3 upload stays gated on CI/S3_FORCE_UPLOAD, and when there is nowhere to save, the tarring is skipped entirely. Covered by a new case in cache_local.test.sh.
- Contract artifacts are now stamped with aztec_version: "dev" before cache_upload, so cached tarballs already carry the field and the post-cache-hit stamp fast-paths to a no-op. Grafted store symlinks survive instead of being materialized as ~134MB of real stamped copies per worktree. Release paths (release_prep_package_json, release-image Dockerfile) overwrite the field unconditionally, so a pre-stamped value is harmless.
- worktrees.sh generalized for any user: worktrees are created as siblings of the source checkout (no hardcoded ~/Projects), the source checkout resolves from CWD (falling back to the script's own checkout), and the default branch prefix derives from git user.initials / user.name (no hardcoded spl/). A <name> containing a slash is taken as the full branch name, and create --dry-run prints the resolved source/path/branch without mutating anything.