CASSANDRA-21133 Make bin/sstableupgrade functionally on par with nodetool upgradesstables#4825
CASSANDRA-21133 Make bin/sstableupgrade functionally on par with nodetool upgradesstables#4825arvindKandpal-ksolves wants to merge 2 commits into
Conversation
…etool upgradesstables
| sstableIdGenerator.get()); | ||
| assert !newDescriptor.fileFor(Components.DATA).exists(); | ||
| Descriptor newDescriptor; | ||
| while (true) |
There was a problem hiding this comment.
this might not happen in practice but it is worth saying that there is no limit as in how many times we loop, if there is some buggy ID supplier or we exhausted the namespace of these ids and all were on disk then we would spin for ever. Currently we have sequence based generator on integer and UUID generator on ... uuid, the namespace is pretty much "infinite" in practical terms so this we will not hit it, one would have to have like ~2 billions sstables on disk to exhaust (and then it would start to overflow)
There was a problem hiding this comment.
Yeah, agreed. With ~2 billion for sequence IDs and practically infinite for UUIDs, we shouldn't ever hit an infinite loop in the real case. Thanks for pointing this out.
| } | ||
|
|
||
| @Test | ||
| public void testNewSSTableDescriptorCollision() throws Exception |
There was a problem hiding this comment.
just saying we are not testing UUID-based generator but I dont think doing that is the absolute must given how it behaves.
There was a problem hiding this comment.
if needed we can try to add , otherwise I also thought there is no absolute required.
| sstableIdGenerator.get()); | ||
| assert !newDescriptor.fileFor(Components.DATA).exists(); | ||
| Descriptor newDescriptor; | ||
| while (true) |
There was a problem hiding this comment.
I think that thread safety is just delegated to how sstableIdGenerator.get() returns the new id, right? That is AtomicInteger, hence increased atomically, so two threads calling this stuff will always have different ids to check the existence of a file against.
@arvindKandpal-ksolves could you put a comment on while(true) about this and that this loop is meant to be a defense against offline / online generation for tooling's sake instead of in-process flow?
There was a problem hiding this comment.
Yes, exactly. The atomic increment takes care of thread safety in-process. I've added a comment above the while(true) loop to explain that this is purely a defense against offline tool generation.
|
Hi @smiklosovic , I have addressed most of the points, Please take a look. |
Make bin/sstableupgrade functionally on par with nodetool upgradesstables
What this PR does / why we need it:
This PR brings functional parity between
bin/sstableupgradeandnodetool upgradesstablesby introducing the-a/--include-all-sstablesflag to the offline tool.Previously,
bin/sstableupgradewould automatically skip SSTables that were already on the latest version. This patch allows users to force a rewrite of all SSTables even if they are on the current format.Technical Details & Safety:
Adding this flag to an offline tool introduced a known edge case: If the offline tool is run while the Cassandra node is online, it consumes a new SSTable ID on disk. However, the live node's internal
sstableIdGeneratorremains unaware of this. Upon the next flush, the live node attempts to use the same ID, which previously triggered ajava.lang.AssertionErrorinColumnFamilyStore.newSSTableDescriptor().To safely resolve this without introducing regressions:
assert !newDescriptor.fileFor(Components.DATA).exists();check innewSSTableDescriptorwith a safewhile(true)loop. It checks if the generated ID already exists on disk. If a collision is detected, it logs a warning and advances the generator, self-healing the state.StandaloneUpgraderto parse and apply the-aoption consistently.testNewSSTableDescriptorCollision) inColumnFamilyStoreTestthat probes the current ID, usesSSTableIdFactoryto pre-create files for future IDs (N+1,N+2), and verifies that the loop correctly skips them and yieldsN+3.StandaloneUpgraderTest.sstableupgrade.adoc.patch by Arvind Kandpal; reviewed by TBD for CASSANDRA-21133