fix(partitions): persist data on vsr backups, add data integrity test #3512
Open
hubcio wants to merge 2 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #3512 +/- ##
============================================
- Coverage 74.27% 74.24% -0.03%
Complexity 937 937
============================================
Files 1259 1259
Lines 125969 125917 -52
Branches 101644 101636 -8
============================================
- Hits 93558 93489 -69
+ Misses 29396 29369 -27
- Partials 3015 3059 +44
26b7423 to fa2c7f2
VSR's hash chain and checksum-keyed recovery require a committed op's on-disk bytes to match on every replica. Two defects broke that. Backups persisted 0-byte segments, because the partition commit path drained the in-memory pipeline, which only the primary fills. And the append path stamped base_timestamp from a local now(), so bytes diverged per node even once persisted.

Backups now source committable ops from the journal when the pipeline is empty, and commit_messages flushes only the committed prefix (op <= commit_max), keeping the uncommitted tail resident. A backup thus never writes uncommitted bytes to its segment and never drops the headers a later commit needs; dropping them would wedge commit_min below commit_max.

base_timestamp reuses the prepare's monotonic timestamp, stamped once by the primary and replicated. A 3-node data-integrity test gates cross-replica byte identity.
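The committed-prefix rule can be illustrated with a small sketch. This is not the PR's code: `JournaledPrepare` and `drain_committed_prefix` are hypothetical names, and a `BTreeMap` keyed by op stands in for the real journal. It only shows the invariant described above, namely that ops with `op <= commit_max` are handed to the segment writer while the uncommitted tail stays resident.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a journaled prepare (not the project's type).
#[derive(Debug, Clone, PartialEq)]
pub struct JournaledPrepare {
    pub op: u64,
    pub body: Vec<u8>,
}

/// Removes and returns, in op order, every entry with `op <= commit_max`.
/// Entries with `op > commit_max` stay in `journal`, so a later commit
/// can still find their headers.
pub fn drain_committed_prefix(
    journal: &mut BTreeMap<u64, JournaledPrepare>,
    commit_max: u64,
) -> Vec<JournaledPrepare> {
    // `split_off(&k)` returns the entries whose key is >= k.
    let tail = match commit_max.checked_add(1) {
        Some(next) => journal.split_off(&next),
        None => BTreeMap::new(), // commit_max == u64::MAX: everything is committed
    };
    // What is left in `journal` is the committed prefix; swap the tail back in.
    let committed = std::mem::replace(journal, tail);
    committed.into_values().collect()
}
```

Calling it twice with the same `commit_max` returns an empty vector the second time and leaves the tail untouched, which is the behaviour a backup needs when commit notifications repeat.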
220f3d6 to a82ddbf
A disconnect storm spawns many concurrent in-process Logout submits on shard 0, breaking two invariants that held only under serial submission.

dispatch_prepare_and_await (metadata plane) snapshotted view and commit_min before on_replicate().await and asserted them unchanged after. A sibling on_ack legitimately advances commit_min while a task is parked in the await, and a view change advances the view, tripping the debug_assert. The snapshots fed only the assert; no release-mode logic read them. Both conditions are handled downstream (a view change drops the reply_sender so the receiver resolves Canceled, and loopback acks are op-routed), so the assert and snapshots are removed.

A new primary floors its view-change pipeline rebuild at the max commit across the DoViewChange quorum. The DVC carried commit_min (locally applied), which can lag commit_max (known committed) by more than the pipeline depth: on_ack pops the committable prefix at once, then applies it across an await per entry, while concurrent submits push op ahead. The rebuild range then exceeds PIPELINE_PREPARE_QUEUE_MAX, and either the assert or the pipeline push panics the shard in release builds.

Carry commit_max instead: every replica holds op - commit_max <= pipeline depth, so the rebuilt range stays within capacity, while quorum intersection keeps the floor at or below the winner's op. The committed-but-unapplied tail is replayed by CommitJournal. Only the value written to the DoViewChange commit field changes; the wire layout is untouched.
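The capacity argument can be checked with a little arithmetic. The sketch below is illustrative only: the `DoViewChange` struct is reduced to two fields, the value `8` for `PIPELINE_PREPARE_QUEUE_MAX` is an assumed placeholder rather than the project's constant, and the winner is simplified to the message with the highest `op` (real VSR also compares log views).

```rust
/// Assumed placeholder value; the project's real constant may differ.
pub const PIPELINE_PREPARE_QUEUE_MAX: u64 = 8;

/// Hypothetical, reduced DoViewChange: only the fields the floor needs.
#[derive(Debug, Clone, Copy)]
pub struct DoViewChange {
    pub op: u64,
    /// Value carried in the commit field (commit_min before the fix,
    /// commit_max after it).
    pub commit: u64,
}

/// Number of ops the new primary would rebuild: everything above the
/// highest commit seen in the quorum, up to the winner's op.
/// Returns `None` for an empty quorum.
pub fn rebuild_len(quorum: &[DoViewChange]) -> Option<u64> {
    let floor = quorum.iter().map(|dvc| dvc.commit).max()?;
    // Simplification: treat the highest op as the winner's op.
    let winner_op = quorum.iter().map(|dvc| dvc.op).max()?;
    Some(winner_op.saturating_sub(floor))
}

/// True when the rebuilt range fits in the pipeline.
pub fn fits_pipeline(quorum: &[DoViewChange]) -> bool {
    rebuild_len(quorum).map_or(true, |len| len <= PIPELINE_PREPARE_QUEUE_MAX)
}
```

With a winner at op 20 whose commit_min is 5 and commit_max is 14, carrying commit_min yields a 15-op rebuild that overflows a depth of 8, while carrying commit_max yields 6 ops, which fits.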
a82ddbf to aff604f
VSR's hash chain and checksum-keyed recovery require a committed op's
on-disk bytes to match on every replica. Two defects broke that: the
partition commit path drained the pipeline, which only the primary
fills, so backups journaled replicated prepares but never flushed them
(0-byte segments); and append re-stamped base_timestamp from a local
now(), diverging bytes even once persisted.
commit_journal now falls back to the journal when the pipeline is
empty, so backups persist like the metadata plane. base_timestamp
reuses the prepare's monotonic timestamp, stamped once by the primary
and replicated. The 3-node data integrity test is un-ignored and gates
this.
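To see why a locally stamped base_timestamp breaks byte identity, consider a toy encoder. The layout here (8-byte base timestamp, 8-byte length, then the body) is invented for illustration and is not the project's segment format; `ReplicatedPrepare` and `encode_batch` are hypothetical names. The point is only that the timestamp written to disk comes from the replicated prepare, never from the local clock.

```rust
/// Hypothetical replicated prepare; `timestamp` is stamped once by the
/// primary and arrives unchanged on every backup.
#[derive(Debug, Clone)]
pub struct ReplicatedPrepare {
    pub op: u64,
    pub timestamp: u64,
    pub body: Vec<u8>,
}

/// Encodes a prepare into an invented on-disk layout:
/// [base_timestamp: u64 LE][body_len: u64 LE][body bytes].
/// No call to a local clock is made, so the output depends only on the
/// replicated input.
pub fn encode_batch(prepare: &ReplicatedPrepare) -> Vec<u8> {
    let mut out = Vec::with_capacity(16 + prepare.body.len());
    out.extend_from_slice(&prepare.timestamp.to_le_bytes()); // base_timestamp
    out.extend_from_slice(&(prepare.body.len() as u64).to_le_bytes());
    out.extend_from_slice(&prepare.body);
    out
}

/// Returns true when every replica produced the same bytes.
pub fn bytes_identical(per_replica: &[Vec<u8>]) -> bool {
    per_replica.windows(2).all(|pair| pair[0] == pair[1])
}
```

Three replicas encoding the same prepare produce identical output, whereas changing only the timestamp on one of them (as a local now() would) makes the comparison fail.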