Two related concerns surfaced while designing GC recovery. Neither blocks recovery (the client-select + server-repoint plan works on pre-v8), but both deserve a deep-dive afterward.
1. Old buckets stuck on pre-v8 despite lazy walkable-v8 migration
Lazy migration should move a bucket to v8 on any flush/upload, yet the large, old buckets documents (6139 objects) and audio (11886 objects) are still pre-v8: their manifest page refs have cid = None, which is exactly why they could not self-heal a gc'd page offline.
Open question: is this simply "no upload/flush to those buckets since v8 shipped," or is the lazy migration not firing for them (stuck marker, gated path)? If buckets migrate only on WRITE, large read-mostly buckets stay pre-v8 indefinitely and remain non-self-healing — consider a proactive/forced migration path.
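As a starting point for the proactive path, here is a minimal sketch of how pre-v8 buckets might be detected and ordered for a forced rewrite. The `PageRef` and `BucketManifest` types and both function names are hypothetical stand-ins, not the SDK's actual types; the only detail taken from the observation above is that a pre-v8 page ref has `cid = None`.

```rust
/// Hypothetical, simplified stand-in for a manifest page reference.
/// Only the `cid` field mirrors the issue text (pre-v8 refs have `cid = None`).
#[derive(Debug, Clone)]
pub struct PageRef {
    pub key: String,
    pub cid: Option<String>,
}

/// Hypothetical per-bucket summary used to plan a proactive migration.
#[derive(Debug, Clone)]
pub struct BucketManifest {
    pub bucket: String,
    pub object_count: u64,
    pub pages: Vec<PageRef>,
}

/// Treat a manifest as pre-v8 if any page ref lacks a CID hint.
/// A manifest with no pages is not flagged.
pub fn is_pre_v8(manifest: &BucketManifest) -> bool {
    manifest.pages.iter().any(|p| p.cid.is_none())
}

/// Names of buckets that would need a forced rewrite, largest first,
/// so read-mostly buckets do not wait indefinitely for a write.
pub fn migration_candidates(manifests: &[BucketManifest]) -> Vec<String> {
    let mut pending: Vec<&BucketManifest> =
        manifests.iter().filter(|m| is_pre_v8(m)).collect();
    pending.sort_by(|a, b| b.object_count.cmp(&a.object_count));
    pending.into_iter().map(|m| m.bucket.clone()).collect()
}
```

Ordering by object count is only one possible policy; the point is that the check runs on read, not only on write.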
2. Data-loss footgun: unguarded upload to a damaged bucket
A 404 (NoSuchKey) at index_key makes the SDK create a fresh EMPTY v7 forest (crates/fula-client/src/encryption.rs ~:3376; intentional for genuinely-new buckets, tests at :12092). 5xx / connection-refused now correctly propagate (:12133, ~:12168).
Risk: if a DAMAGED bucket (object_count > 0) ever returns 404 at index_key, the empty-forest path engages → a subsequent upload+flush writes a ~1-file manifest at index_key → orphans the prior 6139/11886 object references (listing data-loss; underlying blocks stay pinned).
The gc-damaged buckets appear to return 410 (which propagates → upload fails, no overwrite), so this likely has not triggered — but it is a latent footgun.
Hardening: refuse to flush an overwrite when the loaded forest is EMPTY but the server reports object_count > 0 (a damaged-bucket guard), independent of the recovery feature's upload-gate.
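The load behaviour and the proposed guard could look roughly like the sketch below. All names (`IndexLoad`, `classify_index_status`, `FlushError`, `check_flush_allowed`) are hypothetical and do not come from encryption.rs. It models only HTTP statuses (connection-refused has no status and is not represented) and assumes the server's object_count is available to the client at flush time.

```rust
/// Hypothetical classification of the HTTP status seen when loading `index_key`.
#[derive(Debug, PartialEq, Eq)]
pub enum IndexLoad {
    /// 2xx: the index was loaded.
    Loaded,
    /// 404 / NoSuchKey: currently yields a fresh, empty forest.
    EmptyForest,
    /// Anything else (410, 5xx, ...): propagated to the caller.
    Failed(u16),
}

pub fn classify_index_status(status: u16) -> IndexLoad {
    match status {
        200..=299 => IndexLoad::Loaded,
        404 => IndexLoad::EmptyForest,
        other => IndexLoad::Failed(other),
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum FlushError {
    /// The forest was empty when loaded, but the server still reports objects.
    DamagedBucket { server_object_count: u64 },
}

/// Damaged-bucket guard: refuse to overwrite `index_key` when the forest was
/// empty at load time but the server reports `object_count > 0`.
pub fn check_flush_allowed(
    loaded_forest_was_empty: bool,
    server_object_count: u64,
) -> Result<(), FlushError> {
    if loaded_forest_was_empty && server_object_count > 0 {
        return Err(FlushError::DamagedBucket { server_object_count });
    }
    Ok(())
}
```

A genuinely new bucket (empty forest, object_count = 0) still passes, so the intentional 404 path for new buckets keeps working.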
Does NOT block recovery
The recovery plan (detect -> client walks the newest consistent manifest -> server rebuilds the index from its pins + repoints index_key/forest_manifest_cid) works on pre-v8 and gates uploads on damaged buckets (closing #2 for the damaged case). #1 (migration) and the SDK-level guard in #2 are separate hardening. Priority: after recovery.
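For orientation, the client-select step might be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: `ManifestCandidate`, its `sequence` field, and `newest_consistent` are hypothetical, and "consistent" is assumed here to mean that every page the manifest references is still readable.

```rust
use std::collections::HashSet;

/// Hypothetical manifest candidate; `sequence` is an assumed ordering field.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ManifestCandidate {
    pub sequence: u64,
    pub page_keys: Vec<String>,
}

/// Pick the newest candidate whose referenced pages are all still readable.
/// Returns `None` when no candidate is fully intact.
pub fn newest_consistent<'a>(
    candidates: &'a [ManifestCandidate],
    readable_pages: &HashSet<String>,
) -> Option<&'a ManifestCandidate> {
    candidates
        .iter()
        .filter(|c| c.page_keys.iter().all(|k| readable_pages.contains(k)))
        .max_by_key(|c| c.sequence)
}
```

Pages are keyed by name, not CID, because pre-v8 page refs carry no CID.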
Code pointers for #1: walkable_v8_writer_enabled defaults to true (crates/fula-client/src/config.rs:332), and a per-bucket migration marker exists (crates/fula-client/src/wal.rs:537; issue #10, "Walkable-v8: force-rewrite v7 buckets to v8 on first master-up load (closes lazy-migration gap)"). So any flush/upload to a bucket should emit v8 CID hints / migrate it.