spike: action-based conflict resolution prototype#6832
Draft
wjones127 wants to merge 1 commit into
Draft
Conversation
Design doc + working prototype of action-based transactions and conflict
resolution. Not for merge — published for reference.
- rust/lance-table/design/action_based_conflict_resolution.md
- rust/lance-table/src/format/transaction/actions{.rs,/} — Action trait,
ManifestMask, six action impls (RemoveFragments, AddFragments,
UpdateDeletionVector, InvalidateIndexCoverage, ChangeSchema, AddIndex),
and a conflict-resolution test suite covering update-update variants
and update-create_index orderings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
|
ACTION NEEDED The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
6 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Spike for #6448. Not for merge — published for reference when doing future work.
Includes a design doc plus a working Rust prototype of action-based transactions and conflict resolution.
What's here
rust/lance-table/design/action_based_conflict_resolution.md— design doc covering principles (criterion-based targeting, late binding, symbolic intra-txn references), the action catalog, lifecycle, worked examples, a first-pass conflict matrix (§5/§6), and a classification of every existingcheck_*_txnrule against the design (§11).rust/lance-table/src/format/transaction/actions{.rs,/}—Actiontrait,ManifestMask, transaction driver, and six action implementations:RemoveFragments,AddFragments,UpdateDeletionVector,InvalidateIndexCoverage,ChangeSchema,AddIndex.24 tests, all passing. Clippy + fmt clean.
Why this approach makes N×N rules unnecessary
The acceptance criteria on #6448 call for "a conflict matrix covering all 14 action types." The prototype's premise is that the matrix is the wrong artifact — it should be emergent, not hand-authored.
Three mechanisms, all derived from per-action
reads()andwrites()masks, cover the bulk of conflict cases:rebase. Most fragment-id / field-id collisions resolve here without any per-pair logic.IndicesCoveringFields{fields=[7]}); the mask layer evaluates it against the peer's concrete path. This is what lets aUpdate(field=7)correctly invalidate a concurrently-addedAddIndex(field=7)without either action knowing about the other.rebasecan downcast the peer to make a finer decision —UpdateDeletionVectorchecking row-level overlap against a concurrent UDV is the canonical case.Classifying the existing ~52 rules across all 14
check_*_txnmethods (see §11):UpdateConfig,AddBases, etc.). Add the action, the rule fits.For an N-action catalog, that's O(N) action declarations + O(small) special-case rebase impls, not O(N²) hand-curated cells. Adding a new action means adding its mask declarations and any necessary peer-inspection logic in
rebase; existing actions are untouched.Worked example: merge-insert primary-key bloom filter
One existing rule that looks at first like it needs transaction-level metadata:
inserted_rows_filter, the Bloom filter over primary keys a merge-insert claims as new. Two concurrent merge-inserts could both decide PK 42 doesn't exist and both insert it — the bloom detects the race.This appears transaction-level because both txns need to compare blooms. In fact it fits pattern 3 cleanly: the bloom describes the PKs in the new rows of an
AddFragments, so it lives on the action.AddFragments::rebasewalks concurrent peers, finds otherAddFragmentsactions, and:RetryRetry(can't verify)Retry(peer's PKs unknowable)No txn-level surface required. Same shape as UDV row-overlap, just with a bloom instead of a roaring bitmap.
Genuine gaps that did surface
Three structural additions the prototype doesn't yet handle:
merged_generations. This field appears on three different operation shells (Update,UpdateMemWalState,CreateIndex(MemWAL)) — same data, different wrappers. Unlike the PK bloom, it genuinely doesn't belong to any one action. Wants a TxnContext-level metadata surface parallel to action masks.TransactionError.rebaseshould returnOk | Retry | Fail(or split the error type) so the driver can drive the retry loop.The bulk of the remaining work is mechanical catalog completion (~10 more actions:
RemoveIndex,UpdateConfig,AddFields, etc.) plus enriching the trait to return the three-way verdict.Known TODOs / open questions (prototype-specific)
applyoverwrites rather than unions — real impl needs IO in rebase.AddIndexintroduces anOnNewFragments::{LeaveUncovered, Fail}policy knob that the design doc doesn't yet pin down.InvalidateIndexCoverage/AddIndexoperate on a prototype-onlyWorkingState { manifest, indices }wrapper since the realManifestdoesn't carry the index list inline.