Skip to content

Fix flaky IP assignment#2706

Open
neddp wants to merge 2 commits intomainfrom
fix_flaky_ip_assignment
Open

Fix flaky IP assignment#2706
neddp wants to merge 2 commits intomainfrom
fix_flaky_ip_assignment

Conversation

@neddp
Copy link
Copy Markdown
Member

@neddp neddp commented Apr 9, 2026

What is this change about?

Follow-up fix to #2657 to eliminate residual flaky behaviour in IP allocation from reserved CIDR ranges.

PR #2657 fixed the algorithmic logic in find_next_available_ip (pruning by last address, using overlaps?, capturing blocking_ip before mutation). However, the very second test run in Concourse failed. It looks like the issue is still present under certain Ruby hash seeds:

Changes:

  • IpAddrOrCidr#eql?: now compares both to_i and prefix directly, treating objects with the same base address but different prefix lengths as distinct
  • IpAddrOrCidr#hash: now uses [to_i, prefix].hash to satisfy the contract
  • find_next_available_ip sort key: changed from ip.to_i to [ip.to_i, ip.prefix] so larger blocks (smaller prefix) always sort before the /32s they contain

Please provide contextual information.

 -#<Bosh::Director::IpAddrOrCidr:0x000076df19e9bb18
 - @ipaddr=#<IPAddr: IPv4:10.0.11.41/255.255.255.255>>
 +#<Bosh::Director::IpAddrOrCidr:0x000076df19f33760
 + @ipaddr=#<IPAddr: IPv4:10.0.11.34/255.255.255.255>>

 -#<Bosh::Director::IpAddrOrCidr:0x000076df199f91f0
 - @ipaddr=#<IPAddr: IPv4:10.0.11.36/255.255.255.255>>
 +#<Bosh::Director::IpAddrOrCidr:0x000076df199fe240
 + @ipaddr=#<IPAddr: IPv4:10.0.11.35/255.255.255.255>>

What tests have you run against this PR?

  • All existing IpAddrOrCidr unit specs pass (spec/unit/bosh/director/ip_addr_or_cidr_spec.rb)
  • All existing IpRepo unit specs pass, including the CIDR block tests added by Fix IP Allocation Bug: Reserved Range Not Detected #2657 (spec/unit/bosh/director/deployment_plan/ip_provider/ip_repo_spec.rb)
  • New unit tests added for #eql? and #hash covering:
    • Same base address + same prefix: eql? true, equal hashes, deduplicates in Set
    • Same base address + different prefix: eql? false, different hashes, coexist as distinct Set elements
    • Different base address: eql? false
    • Hash contract explicitly verified

How should this change be described in bosh release notes?

Fixed: IP allocation could intermittently assign addresses from reserved CIDR ranges due to non-deterministic Set behaviour in the IP dedup logic.

Does this PR introduce a breaking change?

No. This is a bug fix. The observable behaviour - that IPs are not allocated from reserved ranges - is unchanged; the fix makes it deterministic.

neddp added 2 commits April 9, 2026 17:18
…rved IP allocation

eql? delegated to == which loses CIDR prefix via IPAddr#coerce_other, causing
eql?/hash to disagree and making Set dedup non-deterministic. Fix both to include
prefix. Also change sort key to [to_i, prefix] so larger blocks always precede the
/32s they contain, ensuring find_next_available_ip skips entire reserved ranges.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 9, 2026

Walkthrough

The changes modify how IP addresses and CIDR blocks are compared and hashed in the IP allocation system. The IpAddrOrCidr class is updated to use structural equality based on both the IP address integer and CIDR prefix, rather than object identity. The corresponding hash function is also updated to reflect this multi-key comparison. Additionally, the sorting behavior in ip_repo.rb is updated to use the same two-key ordering when determining candidate blocking IPs, and comprehensive tests are added to verify the new equality and hashing semantics.

Suggested reviewers

  • fmoehler
  • s4heid
  • rkoster
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Fix flaky IP assignment' is concise and directly related to the main change—addressing non-deterministic behavior in IP allocation. It captures the primary objective without unnecessary detail.
Description check ✅ Passed The PR description comprehensively addresses all required template sections with detailed explanations of the changes, context, testing, and release notes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix_flaky_ip_assignment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@neddp neddp changed the title Fix flaky ip assignment Fix flaky IP assignment Apr 9, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb`:
- Around line 47-51: The equality and hash implementations in IpAddrOrCidr
ignore address family causing IPv4 and IPv6 addresses with the same
integer/prefix to collide; update the eql? method (the equality check that
currently compares `@ipaddr.to_i` and `@ipaddr.prefix`) to also compare the address
family (e.g., `@ipaddr.family` or using ipv4?/ipv6?) and change the hash method to
include the same family value (e.g., include `@ipaddr.family` in the array used to
compute the hash) so that family is part of both equality and hashing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0554b280-a5ef-4ddc-8135-62fbc2037f04

📥 Commits

Reviewing files that changed from the base of the PR and between ef36ef9 and 5041f8d.

📒 Files selected for processing (3)
  • src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb
  • src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb
  • src/bosh-director/spec/unit/bosh/director/ip_addr_or_cidr_spec.rb

Comment on lines +47 to +51
other.is_a?(IpAddrOrCidr) && @ipaddr.to_i == other.to_i && @ipaddr.prefix == other.prefix
end

def hash
@ipaddr.hash
[@ipaddr.to_i, @ipaddr.prefix].hash
Copy link
Copy Markdown

@coderabbitai coderabbitai bot Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Include address family in eql?/hash to prevent cross-family collisions.

On Line 47 and Line 51, equality/hash ignore family. Distinct values like IPv4 0.0.0.1/32 and IPv6 ::1/32 can collapse as one Set element.

💡 Proposed fix
       def eql?(other)
-        other.is_a?(IpAddrOrCidr) && `@ipaddr.to_i` == other.to_i && `@ipaddr.prefix` == other.prefix
+        other.is_a?(IpAddrOrCidr) &&
+          `@ipaddr.to_i` == other.to_i &&
+          `@ipaddr.prefix` == other.prefix &&
+          `@ipaddr.ipv4`? == other.ipv4?
       end

       def hash
-        [`@ipaddr.to_i`, `@ipaddr.prefix`].hash
+        [`@ipaddr.to_i`, `@ipaddr.prefix`, `@ipaddr.ipv4`?].hash
       end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
other.is_a?(IpAddrOrCidr) && @ipaddr.to_i == other.to_i && @ipaddr.prefix == other.prefix
end
def hash
@ipaddr.hash
[@ipaddr.to_i, @ipaddr.prefix].hash
other.is_a?(IpAddrOrCidr) &&
`@ipaddr.to_i` == other.to_i &&
`@ipaddr.prefix` == other.prefix &&
`@ipaddr.ipv4`? == other.ipv4?
end
def hash
[`@ipaddr.to_i`, `@ipaddr.prefix`, `@ipaddr.ipv4`?].hash
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb` around lines 47 - 51,
The equality and hash implementations in IpAddrOrCidr ignore address family
causing IPv4 and IPv6 addresses with the same integer/prefix to collide; update
the eql? method (the equality check that currently compares `@ipaddr.to_i` and
`@ipaddr.prefix`) to also compare the address family (e.g., `@ipaddr.family` or
using ipv4?/ipv6?) and change the hash method to include the same family value
(e.g., include `@ipaddr.family` in the array used to compute the hash) so that
family is part of both equality and hashing.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an explicit family filter in try_to_allocate_dynamic_ip and this scenario is not possible to happen.

if subnet.range.ipv6?
addresses_we_cant_allocate.delete_if { |ipaddr| ipaddr.ipv4? }
else
addresses_we_cant_allocate.delete_if { |ipaddr| ipaddr.ipv6? }
end

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neddp, you're absolutely right — I already have this noted in my long-term memory from PR #2657. The address-family filter applied in try_to_allocate_dynamic_ip (lines 93–97 of ip_repo.rb) ensures that addresses_we_cant_allocate is always homogeneous in family before eql?/hash is ever consulted, making a cross-family collision impossible in practice. I apologize for the noise — this comment was mistaken.


🧠 Learnings used
Learnt from: neddp
Repo: cloudfoundry/bosh PR: 2657
File: src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb:87-92
Timestamp: 2026-03-31T15:00:15.416Z
Learning: In `src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb`, the `try_to_allocate_dynamic_ip` method applies an explicit address-family filter on `addresses_we_cant_allocate` (lines ~93–97) before calling `find_next_available_ip`. IPv4 entries are removed when the subnet is IPv6, and IPv6 entries are removed otherwise. Therefore, `overlaps?` in `IpAddrOrCidr` is never invoked with mixed-family (IPv4 vs IPv6) inputs — do not flag the absence of a family guard in `overlaps?` as a bug.

Learnt from: aramprice
Repo: cloudfoundry/bosh PR: 2704
File: src/bosh-director/spec/unit/bosh/director/cloud_config/cloud_configs_consolidator_spec.rb:89-91
Timestamp: 2026-03-31T09:34:42.259Z
Learning: In `src/bosh-director/spec/unit/bosh/director/cloud_config/cloud_configs_consolidator_spec.rb`, the `find_by_ids` method (using `dataset.where(id: ids).all`) does not guarantee a deterministic return order. Therefore, in the `#create_from_model_ids` spec, using `contain_exactly` (instead of an ordered equality matcher like `.with([...])`) is intentional and correct — do not flag this as an issue.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would adding it still make sense because there isn't a way to control how the class will be used? That is to say even though the code isn't currently impacted by the lack of #ipv4? in the #hash method for completeness.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@github-project-automation github-project-automation bot moved this from Waiting for Changes | Open for Contribution to Pending Merge | Prioritized in Foundational Infrastructure Working Group Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Pending Merge | Prioritized

Development

Successfully merging this pull request may close these issues.

2 participants