fix: fall back to lease file when DHCPLeases API is empty#190

Open
ajmeese7 wants to merge 6 commits into fog:master from ajmeese7:fix/leasefile-fallback-for-external-dhcp

@ajmeese7 commented Apr 1, 2026

Summary

  • When a libvirt network uses an external dnsmasq (not managed by libvirt), the DHCPLeases API returns no leases, so the VM's IP address can't be discovered. This PR adds a fallback that reads the dnsmasq lease file directly from /var/lib/libvirt/dnsmasq/<network>.leases.
  • Common scenario: dnsmasq running inside a network namespace on WSL2 to work around port 67 conflicts with Windows ICS.
  • Includes hardening (error handling for file permission/race conditions, warning log on fallback) and a minitest suite for the new method.

Changes

  • lib/fog/libvirt/models/compute/server.rb
    • New ip_address_from_leasefile(net, mac) private method parses dnsmasq lease files, matches by MAC, returns IP with latest expiry
    • addresses() calls the fallback when dhcp_leases returns nil and a network exists
    • DNSMASQ_LEASE_DIR constant for the lease file directory
    • Fog::Logger.warning when the fallback path is entered (emitted only once per MAC)
    • Errno::EACCES/Errno::ENOENT rescue around file reads
    • Removed unused DOMAIN_CLEANUP_REGEXP constant
    • Retain Tempfile reference in generate_config_iso_in_dir to prevent GC from deleting the ISO file before File.size reads it (fixes flaky CI ENOENT failure)
  • minitests/server/leasefile_fallback_test.rb — 8 test cases covering happy path, expiry selection, case-insensitive MAC, missing file, malformed lines, no match, permission error, and nil network name
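
For readers unfamiliar with the lease file layout, the parsing described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: the method name `ip_from_leasefile`, the `lease_dir:` keyword, and the `LEASE_DIR` constant are hypothetical, and it assumes the usual dnsmasq line layout of `<expiry-epoch> <mac> <ip> <hostname> <client-id>`.

```ruby
# Hypothetical stand-in for the PR's DNSMASQ_LEASE_DIR constant.
LEASE_DIR = "/var/lib/libvirt/dnsmasq"

# Returns the IP of the matching lease with the latest expiry, or nil.
# Illustrative only; the PR's private method may be structured differently.
def ip_from_leasefile(network_name, mac, lease_dir: LEASE_DIR)
  return nil if network_name.nil? || mac.nil?

  path = File.join(lease_dir, "#{network_name}.leases")
  best = nil # [expiry, ip] of the best match seen so far

  File.foreach(path) do |line|
    expiry, lease_mac, ip = line.split
    next if ip.nil?                      # malformed or truncated line
    next unless lease_mac.casecmp?(mac)  # MAC comparison ignores case

    expiry = Integer(expiry, 10, exception: false)
    next if expiry.nil?                  # non-numeric expiry field

    best = [expiry, ip] if best.nil? || expiry > best[0]
  end

  best && best[1]
rescue Errno::ENOENT, Errno::EACCES
  # Lease file missing, removed mid-read, or unreadable: report no address.
  nil
end
```

Taking the directory as a keyword argument is only a convenience for testing against a temporary directory; the PR itself uses a constant.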

Test plan

  • bundle exec rake minitest passes
  • Manual verification on a WSL2 host with external dnsmasq in a network namespace
  • Verify existing public_ip_address behavior is unchanged when libvirt DHCP is active (bundle exec rake test)

The libvirt DHCPLeases API returns no results when the network was
defined without a <dhcp> block, even if an external dnsmasq is
serving DHCP on the bridge. This occurs on WSL2 with mirrored
networking where port 67 is held by Windows ICS, forcing dnsmasq
to run in an isolated network namespace outside libvirt's control.

Add ip_address_from_leasefile() as a fallback in addresses() that
reads the dnsmasq lease file directly, matching by MAC and picking
the lease with the latest expiry — mirroring the existing
dhcp_leases logic.

- Extract lease directory to DNSMASQ_LEASE_DIR constant
- Remove bare rescue on net.name (fog attribute won't raise)
- Handle Errno::EACCES/ENOENT during lease file reads
- Log warning when falling back to dnsmasq lease file
- Remove unused DOMAIN_CLEANUP_REGEXP constant
- Add minitest suite for ip_address_from_leasefile

The addresses() method is polled in a tight loop while waiting for
a VM to get an IP. The warning fired on every call, flooding logs
with dozens of identical lines. The warning is now emitted only
once per MAC.
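
One way to get once-per-MAC behaviour is to remember which keys have already triggered a warning. The sketch below is an assumption about the general technique, not the PR's implementation: the `OncePerKeyWarner` class and its `warn_once` method are hypothetical names, and the block passed to the constructor stands in for a call such as Fog::Logger.warning.

```ruby
require "set"

# Hypothetical helper: emits a message the first time a key is seen and
# suppresses it afterwards. Not taken from the PR.
class OncePerKeyWarner
  def initialize(&sink)
    @seen = Set.new
    @sink = sink        # where messages go, e.g. a logger call
    @lock = Mutex.new   # guards @seen if polled from several threads
  end

  # Returns true if the message was emitted, false if it was suppressed.
  def warn_once(key, message)
    # Set#add? returns nil when the key was already present.
    first_time = @lock.synchronize { !@seen.add?(key).nil? }
    @sink.call(message) if first_time
    first_time
  end
end

messages = []
warner = OncePerKeyWarner.new { |msg| messages << msg }

# Simulates the polling loop: only the first call reaches the sink.
3.times { warner.warn_once("52:54:00:aa:bb:cc", "falling back to dnsmasq lease file") }
```

Callers would need to normalise the key (for example with `mac.downcase`) if the same MAC can arrive in different cases.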

The Tempfile object was immediately discarded after calling .path,
making it eligible for garbage collection. When GC finalized the
object, the underlying ISO file was deleted before File.size could
read it, causing flaky ENOENT failures in CI.
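
The Tempfile lifetime issue can be illustrated with a small standalone sketch. It does not reproduce generate_config_iso_in_dir; the `build_retained_tempfile` helper and the "payload" contents are hypothetical, and the point is only that the caller keeps the Tempfile object alive for as long as the path is in use.

```ruby
require "tempfile"

# Fragile pattern (shown as a comment only): the Tempfile object becomes
# unreferenced as soon as .path returns, so its finalizer may unlink the
# file at any later GC run.
#
#   path = Tempfile.new("config").path
#   File.size(path)   # may raise Errno::ENOENT if GC ran in between

# Hypothetical helper: returns the Tempfile itself so the caller holds
# the reference while the file is needed.
def build_retained_tempfile(prefix = "config")
  file = Tempfile.new([prefix, ".iso"])
  file.write("payload")
  file.flush # make sure the bytes are on disk before anyone stats the file
  file
end

iso = build_retained_tempfile
iso_path = iso.path
iso_size = File.size(iso_path) # safe: `iso` is still referenced here

iso.close! # explicit close and unlink once the file is no longer needed
```

Explicit cleanup with `close!` makes the file's lifetime visible in the code instead of leaving it to the garbage collector.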