feat: allow unlimited number of gen4 nodes #10448
Pull request overview
This PR modifies the registry canister’s node-add flow to relax max_rewardable_nodes enforcement for certain Gen4 node reward types, enabling node providers to register more cloud/on-demand nodes (while explicitly keeping type4.5 subject to the existing quota rule).
Changes:
- Bypasses the `max_rewardable_nodes` quota check when adding nodes of reward types `type4.1`–`type4.4`.
- Keeps the quota check enforced for `type4.5`.
- Adds unit tests covering the bypass behavior for `type4.1`–`type4.4` and continued enforcement for `type4.5`.
| let bypass_max_rewardable_nodes_check = matches!( | ||
| node_reward_type, | ||
| NodeRewardType::Type4dot1 | ||
| | NodeRewardType::Type4dot2 | ||
| | NodeRewardType::Type4dot3 | ||
| | NodeRewardType::Type4dot4 | ||
| ); |
We don't want to do it for type4 nodes. They were a one-off and won't be used for this feature.
| if !bypass_max_rewardable_nodes_check { | ||
| let max_rewardable_nodes_same_type = *node_operator_record | ||
| .max_rewardable_nodes | ||
| .get(&(node_reward_type.to_string())) | ||
| .ok_or(format!("{LOG_PREFIX}do_add_node: Node Operator does not have rewardable nodes for {node_reward_type}"))?; |
On the reward canister side, these aren't considered anyways. So if they exist they are skipped.
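The exemption discussed in the two comments above can be sketched in isolation. This is a minimal sketch, not the registry's code: the `NodeRewardType` enum below is a hypothetical stand-in that models only the variants named in this thread, and `as_key` and `quota_for` are illustrative helpers. It assumes the plain `type4` variant stays subject to the quota, as requested in the review.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the registry's NodeRewardType; only the
// variants mentioned in this thread are modelled.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum NodeRewardType {
    Type1dot1,
    Type4,
    Type4dot1,
    Type4dot2,
    Type4dot3,
    Type4dot4,
    Type4dot5,
}

impl NodeRewardType {
    // Illustrative key format, matching the `type4.1` style used in the PR text.
    fn as_key(self) -> &'static str {
        match self {
            NodeRewardType::Type1dot1 => "type1.1",
            NodeRewardType::Type4 => "type4",
            NodeRewardType::Type4dot1 => "type4.1",
            NodeRewardType::Type4dot2 => "type4.2",
            NodeRewardType::Type4dot3 => "type4.3",
            NodeRewardType::Type4dot4 => "type4.4",
            NodeRewardType::Type4dot5 => "type4.5",
        }
    }
}

// Only type4.1 through type4.4 skip the quota; type4 and type4.5 do not.
fn bypasses_quota(t: NodeRewardType) -> bool {
    matches!(
        t,
        NodeRewardType::Type4dot1
            | NodeRewardType::Type4dot2
            | NodeRewardType::Type4dot3
            | NodeRewardType::Type4dot4
    )
}

// Returns Ok(None) for exempt types, Ok(Some(quota)) when a quota applies,
// and Err when a non-exempt type has no entry in the operator's map.
fn quota_for(
    max_rewardable_nodes: &BTreeMap<String, u32>,
    t: NodeRewardType,
) -> Result<Option<u32>, String> {
    if bypasses_quota(t) {
        return Ok(None);
    }
    max_rewardable_nodes
        .get(t.as_key())
        .copied()
        .map(Some)
        .ok_or_else(|| format!("Node Operator does not have rewardable nodes for {}", t.as_key()))
}
```

With this shape, an exempt type never consults the map, so a stale `type4.1` entry in `max_rewardable_nodes` would simply be ignored.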
This pull request changes code owned by the Governance team. Therefore, make sure that
you have considered the following (for Governance-owned code):
1. Update `unreleased_changelog.md` (if there are behavior changes, even if they are non-breaking).
2. Are there BREAKING changes?
3. Is a data migration needed?
4. Security review?
How to Satisfy This Automatic Review
1. Go to the bottom of the pull request page.
2. Look for where it says this bot is requesting changes.
3. Click the three dots to the right.
4. Select "Dismiss review".
5. In the text entry box, respond to each of the numbered items in the previous section, declare one of the following:
   - Done.
   - $REASON_WHY_NO_NEED. E.g. for `unreleased_changelog.md`, "No canister behavior changes.", or for item 2, "Existing APIs behave as before.".
Brief Guide to "Externally Visible" Changes
"Externally visible behavior change" is very often due to some NEW canister API.
Changes to EXISTING APIs are more likely to be "breaking".
If these changes are breaking, make sure that clients know how to migrate, how to
maintain their continuity of operations.
If your changes are behind a feature flag, then, do NOT add entrie(s) to
unreleased_changelog.md in this PR! But rather, add entrie(s) later, in the PR
that enables these changes in production.
Reference(s)
For a more comprehensive checklist, see here.
✅ No security or compliance issues detected. Reviewed everything up to 8934f9f.
daniel-wong-dfinity-org-twin left a comment
Seems like this could cause space problems in the Registry canister... I mean, an alternative is just to have a very large (but finite!) limit, right??
| // For now, node providers may deploy arbitrarily many nodes of these | ||
| // types without being constrained by `max_rewardable_nodes`. Type4.5 is | ||
| // explicitly excluded from this exemption. | ||
| let bypass_max_rewardable_nodes_check = matches!( |
Do as you see fit.
Alternatively, you can just say that there must be < EXCESSIVE_NUMBER_OF_TYPE_4_NODES type 4.x nodes. That way, they do not need to be treated specially.
When it is possible to not treat something specially, it is better to treat it like everything else. Since the improvement here is marginal, I leave it up to you.
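The alternative described above, a large but finite limit so that type 4.x nodes go through the ordinary quota path, might look roughly like this. It is a sketch under stated assumptions: the review names `EXCESSIVE_NUMBER_OF_TYPE_4_NODES` but gives no value, so the number below is a placeholder, and `effective_quota` and `may_add_node` are hypothetical helpers rather than registry functions.

```rust
// Placeholder value; the review names the constant but not its magnitude.
const EXCESSIVE_NUMBER_OF_TYPE_4_NODES: u32 = 10_000;

// Exempt types get the large fixed ceiling instead of skipping the check;
// every other type keeps whatever quota the operator record configures.
fn effective_quota(reward_type: &str, configured: Option<u32>) -> Option<u32> {
    match reward_type {
        "type4.1" | "type4.2" | "type4.3" | "type4.4" => {
            Some(EXCESSIVE_NUMBER_OF_TYPE_4_NODES)
        }
        _ => configured,
    }
}

// One uniform check for all reward types, using the same predicate shape
// as the existing code (reject when quota <= current count).
fn may_add_node(reward_type: &str, quota: Option<u32>, current: u32) -> Result<(), String> {
    let quota = quota
        .ok_or_else(|| format!("Node Operator does not have rewardable nodes for {reward_type}"))?;
    if quota <= current {
        return Err(format!("quota of {quota} reached for {reward_type}"));
    }
    Ok(())
}
```

The trade-off is the one raised in the comment: no special-cased branch in the add path and a bound on Registry growth, at the cost of choosing a ceiling.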
| if max_rewardable_nodes_same_type | ||
| <= num_in_registry_same_type.saturating_sub(num_removed_same_ip_same_type) |
I know it was like this before already, but why <= ?
Suggestion:
| if max_rewardable_nodes_same_type | |
| <= num_in_registry_same_type.saturating_sub(num_removed_same_ip_same_type) | |
| let num_remaining_nodes = num_in_registry_same_type.saturating_sub(num_removed_same_ip_same_type); | |
| if num_remaining_nodes > max_rewardable_nodes_same_type { |
Note that my suggestion is not equivalent, because > is used. Seems like the original code has an off by one bug? Equal to max should be fine, right??
The most important change in my suggestion is swapping the order of operands, putting the reference value on the right hand side of the comparison operator, because we pretty much always do it that way.
The dummy variable in my suggestion is less important. It helps because multi-line if expressions should be avoided (and in this case, it can be avoided pretty easily), but this is not something that we particularly follow; hence, lower importance.
I can address this, but this is a question for @pietrodimarco-dfinity. I assume this isn't an array index but a comparison of how many nodes you have vs how many you are allowed. If you have 5 and you are allowed 5, that is fine, right?
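To make the `<=` versus `>` question concrete, the two predicates can be compared side by side. This is a sketch with hypothetical function names. Whether the original is off by one depends on when the count is taken, which the excerpt does not show: if `num_in_registry_same_type` excludes the node about to be added (an assumption), then `<=` caps the total at the quota and `>` would admit one node beyond it.

```rust
// Predicate as written in the diff: reject when quota <= remaining count.
fn rejects_original(max: u32, in_registry: u32, removed_same_ip: u32) -> bool {
    max <= in_registry.saturating_sub(removed_same_ip)
}

// Predicate from the review suggestion: reference value on the right,
// strict comparison, intermediate variable instead of a multi-line `if`.
fn rejects_suggested(max: u32, in_registry: u32, removed_same_ip: u32) -> bool {
    let num_remaining_nodes = in_registry.saturating_sub(removed_same_ip);
    num_remaining_nodes > max
}
```

The two agree everywhere except when the remaining count equals the quota. For a quota of 5 with 5 nodes already registered, `rejects_original` returns `true` and `rejects_suggested` returns `false`; with one of those 5 being replaced at the same IP, both return `false`.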
| registry | ||
| .do_add_node_(payload, node_operator_id, now_system_time()) | ||
| .unwrap_or_else(|e| { | ||
| panic!("do_add_node_ failed for {node_reward_type} on iteration {i}: {e}") |
Maybe comment here that the next test shows that for other node reward types, this would fail. Otherwise, even if this test passes, it could simply be because the panic NEVER occurs, regardless of node reward type.
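The point about pairing the tests can be illustrated with a toy model. This is not the registry's test harness: `ToyOperator`, `is_exempt`, and `first_failure` are hypothetical names, and the model keeps a single counter rather than per-type records. The idea is that the "exempt types never fail" assertion only carries weight next to an assertion that a non-exempt type does fail under the same quota.

```rust
// Toy model of one node operator with a single quota and node counter.
struct ToyOperator {
    quota: u32,
    nodes: u32,
}

fn is_exempt(reward_type: &str) -> bool {
    matches!(reward_type, "type4.1" | "type4.2" | "type4.3" | "type4.4")
}

impl ToyOperator {
    fn add_node(&mut self, reward_type: &str) -> Result<(), String> {
        // Same predicate shape as the diff: reject when quota <= current count.
        if !is_exempt(reward_type) && self.quota <= self.nodes {
            return Err(format!("quota of {} reached for {reward_type}", self.quota));
        }
        self.nodes += 1;
        Ok(())
    }
}

// Tries to add `attempts` nodes and returns the zero-based iteration of the
// first rejected add, or None if every add succeeded.
fn first_failure(reward_type: &str, quota: u32, attempts: u32) -> Option<u32> {
    let mut op = ToyOperator { quota, nodes: 0 };
    (0..attempts).find(|_| op.add_node(reward_type).is_err())
}
```

Under a quota of 2 and 10 attempts, `first_failure("type4.1", 2, 10)` is `None` while `first_failure("type4.5", 2, 10)` is `Some(2)`, so the positive result cannot be explained by the check never firing.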
| * Temporarily bypass the `max_rewardable_nodes` quota check in `add_node` for | ||
| node reward types `type4.1` through `type4.4`, allowing node providers to | ||
| register an arbitrary number of such nodes. `type4.5` is explicitly excluded | ||
| and still subject to the quota. This is a temporary measure until rewards of | ||
| `type4.5` are no longer treated as `type1.1` (see CLO-15). |
Can you maybe explain the motivation?
This pull request allows node providers to add more type4.x nodes (excluding type4.5 for now; see #10393).
This is needed to allow for provisioning on-demand nodes in the cloud and registering them with the network.