feat(node): pin the per-push object delta instead of re-enumerating the whole repo#90
Conversation
Compute the per-push object delta (git rev-list --objects <new> --not <old>)
for the pin path instead of enumerating the whole repo, with a fully-peeled
ref-type guard (cat-file -t '<sha>^{}') that fails closed to a full scan on a
non-commit tip, any git error, or the GITLAWB_PIN_FORCE_FULLSCAN kill-switch.
The whole-repo lister moves here for the fallback and the reconciliation sweep.
Pin-enumeration scope only (N16); the withheld filter is unchanged.
Wire the push-delta enumeration into the IPFS and Pinata pin paths so pin work scales with push size, not repo size (N16). - U2: both pin_new_objects twins take a candidate OID set instead of scanning the whole repo internally; repo_path is retained for reading object bytes. - U4: extract the anonymous replication withheld gate (caller=None, no caller param) into a shared helper returning (announce, withheld), so the push path and the future reconciliation sweep cannot drift. The per-caller read-serve withheld call in git_upload_pack is deliberately left separate. - U3: the post-receive handler resolves the push delta once (off-thread), falls back to a full scan on a non-commit tip / git error / kill-switch, and passes the candidate set to both pin paths. Logs delta size and fallbacks. The reconciliation sweep (U5) is deferred: it depends on the history-deep blob_paths fix (#84) landing on main first.
…_delta Code review found the push candidate computation collapsed both a failed full-scan fallback and a panicked spawn_blocking task to a silent empty set, diverging from the sibling withheld path that logs and fails closed. Extract resolve_candidates_for_push into git::push_delta (it owns the PinCandidates dispatch and full-scan fallback), logging every degraded path (fallback taken, fallback failed, task panicked) instead of failing silently. Pinning nothing on a hard failure is still safe (the withheld filter always runs; the reconciliation sweep backstops durability) but is now observable. Moving the dispatch out of the HTTP handler also makes it unit-testable. Add tests: tag-of-tag-of-commit peels to a Delta, and resolve_candidates_for_push delta + non-commit-tip full-scan-fallback paths.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughIntroduces a new ChangesPush-delta pinning optimization
Sequence Diagram(s)sequenceDiagram
participant ReceivePack as git_receive_pack
participant Helper as replication_withheld_set
participant Delta as resolve_candidates_for_push
participant IPFS as ipfs_pin::pin_new_objects
participant Pinata as pinata::pin_new_objects
ReceivePack->>Helper: visibility snapshot
Helper-->>ReceivePack: (announce, withheld Option)
alt withheld is Some
ReceivePack->>Delta: new_tips / old_tips (excl. ZERO_SHA)
Delta-->>ReceivePack: candidates Vec<String>
ReceivePack->>IPFS: candidates_ipfs, withheld
IPFS-->>ReceivePack: [(sha, cid)]
ReceivePack->>Pinata: candidates_pinata, withheld
Pinata-->>ReceivePack: [(sha, cid)]
else withheld is None
ReceivePack->>ReceivePack: skip pinning (fail-closed)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/gitlawb-node/src/api/repos.rs (1)
724-741: 🚀 Performance & Scalability | 🔵 Trivial | 🏗️ Heavy liftKeep fallback candidate scans off the push response path.
resolve_candidates_for_push(...).awaitnow runs before the handler returns, but the IPFS/Pinata pinning work is already backgrounded. If the helper takes the full-scan fallback or the kill switch is enabled, a large repo scan can add user-visiblegit pushlatency; consider moving candidate resolution into the spawned replication/pinning worker and sharing the resulting candidate set there.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/api/repos.rs` around lines 724 - 741, The resolve_candidates_for_push function call is running synchronously in the handler before returning, which can add user-visible latency to the git push response if the fallback full-scan path is triggered. Move the candidate resolution logic from its current location (where it's awaited as part of the handler response) into the spawned background replication/pinning worker task, and share the computed candidate set (new_tips, old_tips, and resulting candidates) within that worker context instead of computing it on the push response path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 52-77: When the withheld_blob_oids computation fails or panics and
withheld becomes None, the code still returns the original announce value,
allowing downstream replication to proceed when it should fail closed. Modify
the logic to clear or set announce to None in cases where withheld computation
fails (distinguished from the legitimate None case when rules themselves are
None). You can track whether a failure occurred in the Some(rules) branch and
conditionally return a tuple with announce set to None when the withheld
computation fails, ensuring that the final return statement at line 77 properly
handles the fail-closed scenario for both announce and withheld.
---
Nitpick comments:
In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 724-741: The resolve_candidates_for_push function call is running
synchronously in the handler before returning, which can add user-visible
latency to the git push response if the fallback full-scan path is triggered.
Move the candidate resolution logic from its current location (where it's
awaited as part of the handler response) into the spawned background
replication/pinning worker task, and share the computed candidate set (new_tips,
old_tips, and resulting candidates) within that worker context instead of
computing it on the push response path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f8fc693f-d027-414a-8481-9abb62fa58c1
📒 Files selected for processing (5)
crates/gitlawb-node/src/api/repos.rscrates/gitlawb-node/src/git/mod.rscrates/gitlawb-node/src/git/push_delta.rscrates/gitlawb-node/src/ipfs_pin.rscrates/gitlawb-node/src/pinata.rs
A failed or panicked withheld_blob_oids walk left announce=true while withheld=None, so a push we could not vet would still gossip ref updates, notify peers, and write a permanent Arweave anchor. Force announce false in that arm so the gate fails closed on both axes.
|
On the candidate-resolution latency nitpick (repos.rs:724-741): leaving it as-is. The happy path is a single bounded |
Resolves conflicts in ipfs_pin.rs and pinata.rs against #90 (per-push delta pin). #90 centralized object enumeration in push_delta::list_all_objects (cat-file --batch-all-objects) and reworked the pin path to take a candidate OID set, so this branch's local reachable-only list_all_objects helpers and their tests are now dead duplicates — dropped both. The withholding fix (full ref scope + full-history classification + explicit detached-HEAD walk) in visibility_pack.rs is unaffected; updated its stale comments that still described the pin set as 'git rev-list --objects --all'.
Summary
Pin only the objects a push actually introduces, instead of re-enumerating the entire repository on every push. Pin cost now scales with push size rather than total repo size.
Motivation & context
After each push the IPFS and Pinata pin paths ran
git cat-file --batch-all-objectsover the whole repo and then a per-object DB lookup, so a one-commit push to a large repo walked and checked every object in history. The post-receive handler already has the push's ref updates (old/new tips) in scope, so the introduced objects are computable directly.No tracked issue for this one; it came out of a performance pass on the replication path.
Kind of change
What changed
Crate touched:
gitlawb-node.git::push_deltamodule.resolve_push_deltacomputes the per-push delta withgit rev-list --objects --no-object-names <new tips> --not <old tips>. It fails closed to a full scan (returnsFullScanRequired) on a non-commit tip, any git error, or theGITLAWB_PIN_FORCE_FULLSCANkill-switch. The non-commit guard peels fully withgit cat-file -t '<sha>^{}'(a barecat-file -treturnstagfor an annotated tag, andrev-listdoes not error on a non-commit tip, so neither can be the guard).pin_new_objectsin bothipfs_pinandpinatanow take the candidate OID set from the caller instead of scanning the repo themselves. The two identicallist_all_objectscopies collapse into one inpush_delta. The withheld filter and per-object DB dedup downstream are unchanged.caller = None) into a sharedreplication_withheld_sethelper returning(announce, withheld), so the push path and a future reconciliation sweep cannot drift. The per-caller read-serve withheld computation ingit_upload_packis deliberately left separate (it filters per requester, not for anonymous replication).resolve_candidates_for_pushowns the dispatch and logs every degraded path (fallback taken, fallback failed, task panicked) rather than silently pinning nothing.How a reviewer can verify
Before you request review
cargo test --workspacepasses locallycargo fmt --allandcargo clippy --workspace --all-targets -- -D warningsare cleanfeat(...),fix(...),docs(...)).env.exampleupdated if behavior or config changed (or N/A)Notes for reviewers
blob_pathsfix from fix(node): close under-withholding via full ref scope and full-history classification (#42) #84 landing onmainfirst. Holding it as a follow-up keeps this change to the pin-enumeration side only.--mirrorpush spawns onecat-fileper ref. Fine for ordinary pushes; batchable with a singlecat-file --batchsession if bulk pushes become common.Summary by CodeRabbit
New Features
Refactor
Tests