fix: Prevent and clean up zero-valued entries in Alpha storage maps during hotkey and coldkey swaps#2513
Conversation
|
@open-junius Could you please review this PR? Welcome to any feedbacks. |
|
change the target branch as devnet-ready, not main. |
8cd929a to
369a319
Compare
|
I changed target branch. Thanks! Ready to review |
|
please check https://github.com/opentensor/subtensor/pull/2509/changes, which also prevent some zero value insert. Could you include these storage into the PR. |
|
@open-junius Thanks for your feedback! I updated all following your feedback. Could you please review again? |
|
please merge the latest devnet-ready, will fix the e2e test. |
|
@open-junius I merged the latest devnet-ready. Could you take a look once more? |
|
@open-junius All e2e tests passed. thank you. The benchmark testing issue is not relate to my change. |
|
@open-junius I hope you had a great weekend. Any updates for me, please. |
|
Converting to I think the cleanest approach is to drop the migration from this PR and just ship the zero-value insert guards (which prevent new zeros from being created). The purge of existing zeros can be handled properly in a separate PR with the right multi-block infrastructure. I’m planning to take care of that follow-up PR too after this one. Let me know what you'd prefer. |
|
@evgeny-s @open-junius did you check my description? Should I have to update my PR? |
|
|
||
| // Set the new root claimed value. | ||
| RootClaimed::<T>::insert((netuid, hotkey, coldkey), new_root_claimed); | ||
| // Set the new root claimed value, removing if zero to avoid storage bloat. |
There was a problem hiding this comment.
The comment is saying "removing", but I don't see it here.
There was a problem hiding this comment.
But below we remove it. Fix it for consistency
I would rather keep it with the migration. Please test it first on the mainnet clone, and use multi-block migration if needed. Let's see other opinions on this regard. |
I absolute agree with you. If possible, I want to keep migration in this PR, too. Now I am testing locally. After finishing, I will share result here. Thanks @evgeny-s |
|
Updated:
|
e7ec1f3 to
3fedc8b
Compare
| const stakeAfterRemove = await getStake(api, hotkey.address, coldkey.address, netuid); | ||
|
|
||
| // Stake should be zero or very close to zero (owner minimum may be retained) | ||
| log(`Stake after full remove: ${stakeAfterRemove}`); |
There was a problem hiding this comment.
If we can't check in the code. maybe we don't need this test case. since 0 is default value, we can't tell the difference. It seems we need lower level api to check if the key exists or not.
There was a problem hiding this comment.
In my opinion, it should be enough you already verify them in rust unit test. no need to add it in e2e.
02a8034 to
83a8392
Compare
…block on_idle migration to purge existing ones
83a8392 to
2966490
Compare
|
@open-junius Thanks for your feedback. I removed e2e test. Please review again. |
|
@open-junius Could you tell me why |
You can ignore it. we need merge the main back to devnet-ready sometimes if there is any hotfix in main branch. |
…into fix/remove-zero-alpha-entries
|
@evgeny-s @open-junius could you please review again once more? I think this almost done |
|
These testing failure doesn't seem to relate to my change. Please check the log. |
|
|
I have updated the migration script more correctly: |
|
Tested on Finney mainnet clone via
No OOM or block timeout issues. |
|
@open-junius @evgeny-s I'd like you review once more ^^. It works well for me |
It is great, let me review it again. |
Thanks for collect the data, it is vaulable. |
|
|
||
| /// Called from `on_runtime_upgrade`. Schedules the cleanup by setting phase = 1 | ||
| /// if the migration hasn't run yet. This is O(1) — no iteration. | ||
| pub fn migrate_remove_zero_alpha<T: Config>() -> Weight { |
There was a problem hiding this comment.
After the actual cleanup executed in the on_idle, I am wondering if we really need the migration part.
For instance, we can set the ZeroAlphaCleanupPhase init value as 1 in storage.
then on_idle start to handle it until the value go to 0.
Or use different values for different steps, make it clearer.
What's your opinion?
There was a problem hiding this comment.
Good point. We could simplify by setting the default to 1, but the issue is: after cleanup completes and sets phase=0, a node restart or fresh sync would read the default (1) again and re-trigger cleanup. The current approach uses HasMigrationRun as a permanent guard to prevent this.
There was a problem hiding this comment.
Good point. We could simplify by setting the default to 1, but the issue is: after cleanup completes and sets phase=0, a node restart or fresh sync would read the default (1) again and re-trigger cleanup. The current approach uses
HasMigrationRunas a permanent guard to prevent this.
@open-junius what do you think about it? Still should i need to update?
Description
Prevent and clean up zero-valued entries in Alpha storage maps during hotkey and coldkey swaps.
Coldkey and hotkey swap logic was inserting zero-valued entries into
Alpha,TotalHotkeyShares,TotalHotkeyAlphaLastEpoch, andAlphaDividendsPerSubnetwithout checking if the value was zero first. These entries persist indefinitely and bloat storage - a single coldkey swap could produce ~18k zero entries, and roughly 52% of Alpha entries on-chain are zero-valued.This PR adds zero-guards to all insertion points in
swap_coldkey.rsandswap_hotkey.rs, and includes a one-time migration to purge existing zero entries from the four affected storage maps.Related Issue(s)
Type of Change
Breaking Change
No breaking changes.
Checklist
./scripts/fix_rust.shto ensure my code is formatted and linted correctlyAdditional Notes
The migration collects the full
Alphamap into memory before removing zero entries. This is intentional - mutating a Substrate storage map while iterating over it is unsafe. The other three maps are smaller and iterated directly. All four maps useHasMigrationRunto ensure idempotency.