feat: add hardware interrupt support #1272

Open
danbugs wants to merge 37 commits into hyperlight-dev:main from nanvix:danbugs/hw-interrupts

Contributor

@danbugs danbugs commented Mar 3, 2026

Summary

Adds hardware interrupt support to Hyperlight, enabling guest OS kernels to receive timer interrupts for preemptive scheduling. Each hypervisor backend uses its native interrupt delivery mechanism:

  • KVM: In-kernel IRQ chip (PIC + IOAPIC + LAPIC) + host timer thread with irqfd for IRQ0 injection
  • MSHV: LAPIC partition flag + request_virtual_interrupt from a host timer thread
  • WHP: Software timer thread with WHvRequestInterrupt for periodic interrupt injection, using the bulk LAPIC state API for initialization

All implementations are gated behind the hw-interrupts cargo feature and have no effect on existing behavior when the feature is disabled.

Motivation

Nanvix is a microkernel that requires preemptive scheduling via timer interrupts. Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the tandr/ring branch — the upstream Notifier trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism (matching the virtio model of interrupt-driven I/O notification).

Key design decisions

Host timer thread + native injection (no in-kernel PIT/SynIC STIMER)

All three backends use the same pattern: a host-side timer thread that fires at the guest-requested period and injects IRQ0 through the platform's native mechanism (irqfd on KVM, request_virtual_interrupt on MSHV, WHvRequestInterrupt on WHP). This replaces earlier experiments with KVM's in-kernel PIT and MSHV's SynIC STIMER, which had platform-specific quirks. The unified approach is simpler, consistent across backends, and performs better in Nanvix's tests.

No PIC state machine

PIC command/data ports are handled with static responses (data ports return 0xFF = all masked, command ports return 0, PIT returns 0). The only stateful behavior is bridging the guest's PIC EOI write (port 0x20, non-specific EOI) to a LAPIC EOI when a timer is active. This eliminates the full 8259A emulation from earlier iterations.
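
As a rough illustration of the static responses and the EOI bridge described above, here is a minimal sketch. The function names `legacy_io_in` and `is_bridged_eoi` and the constants are hypothetical and are not the helpers in hw_interrupts.rs; only the port numbers and return values come from this description.

```rust
// Hypothetical constants; the port numbers are the standard 8259A PIC ports.
const PIC_MASTER_CMD: u16 = 0x20;
const PIC_MASTER_DATA: u16 = 0x21;
const PIC_SLAVE_CMD: u16 = 0xA0;
const PIC_SLAVE_DATA: u16 = 0xA1;
// OCW2 non-specific EOI command byte.
const PIC_NON_SPECIFIC_EOI: u8 = 0x20;

/// Static response for an IN on a legacy port, or `None` if the port is not
/// one of the legacy ports covered here. (Illustrative only.)
fn legacy_io_in(port: u16) -> Option<u8> {
    match port {
        // Data ports report every IRQ line as masked.
        PIC_MASTER_DATA | PIC_SLAVE_DATA => Some(0xFF),
        // Command ports read back as 0.
        PIC_MASTER_CMD | PIC_SLAVE_CMD => Some(0),
        // PIT ports (0x40-0x43) read back as 0.
        0x40..=0x43 => Some(0),
        _ => None,
    }
}

/// The only stateful rule: a non-specific EOI written to the master command
/// port is forwarded to the LAPIC, and only while a timer is active.
fn is_bridged_eoi(port: u16, value: u8, timer_active: bool) -> bool {
    timer_active && port == PIC_MASTER_CMD && value == PIC_NON_SPECIFIC_EOI
}
```

A backend would call something like `is_bridged_eoi` from its IO OUT handler and issue its own LAPIC EOI when it returns true; every other PIC write is simply accepted.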

Guest halt mechanism

The guest signals "I'm done" by writing to OutBAction::Halt (port 108) instead of using the HLT instruction. With an in-kernel LAPIC (KVM) or MSHV LAPIC, HLT is absorbed by the hypervisor to wait for the next interrupt — it never reaches userspace as a VM exit. The Halt port write always triggers a VM exit, giving Hyperlight a clean signal to stop the vCPU run loop. This Halt port is used unconditionally (not just with hw-interrupts) to simplify the guest dispatch epilogue.

PvTimerConfig port

The guest writes a 32-bit LE value (period in microseconds) to OutBAction::PvTimerConfig (port 107) to start the timer. A value of 0 disables it. This avoids platform-specific timer configuration in the guest.
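
To make the wire format concrete, here is a small sketch of how a host might decode the port 107 payload. `TimerRequest` and `decode_pv_timer_config` are hypothetical names, not the PR's actual code; the only details taken from this description are the 32-bit little-endian encoding, the microsecond unit, and 0 meaning "disable".

```rust
use std::time::Duration;

/// Hypothetical result of decoding a PvTimerConfig write.
#[derive(Debug, PartialEq)]
enum TimerRequest {
    /// The guest wrote 0: stop any running timer.
    Disable,
    /// The guest asked for a periodic tick with this period.
    Start(Duration),
}

/// Decode the IO OUT payload. Returns `None` if fewer than 4 bytes were
/// written, rather than panicking on a short slice.
fn decode_pv_timer_config(data: &[u8]) -> Option<TimerRequest> {
    let b = data.get(..4)?;
    let period_us = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    Some(if period_us == 0 {
        TimerRequest::Disable
    } else {
        TimerRequest::Start(Duration::from_micros(u64::from(period_us)))
    })
}
```

For example, a guest write of `1000u32.to_le_bytes()` decodes to a 1 ms period.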

Changes

New file: hw_interrupts.rs

Shared helpers for MSHV and WHP: TIMER_VECTOR (0x20), handle_io_in() / handle_common_io_out() for legacy port emulation, LAPIC register read/write helpers, init_lapic_registers(), and lapic_eoi().

Guest side

  • outb.rs: OutBAction::PvTimerConfig (107) and OutBAction::Halt (108) enum variants
  • exit.rs: halt() function using Halt port
  • dispatch.rs + init.rs: Replace hlt with out dx, al to port 108

KVM

  • In-kernel IRQ chip creation (PIC + IOAPIC + LAPIC)
  • irqfd registration for GSI 0 (IRQ0)
  • Host timer thread writes to EventFd at guest-requested period
  • hw-interrupts run loop: HLT re-entry, PvTimerConfig spawns timer, Halt stops timer and exits
  • PIT port writes (0x40-0x43) silently ignored (no in-kernel PIT)

MSHV

  • LAPIC partition flag (pt_flags = 1)
  • LAPIC register initialization via set_lapic
  • Host timer thread calls request_virtual_interrupt (fixed type, vector 0x20)
  • MSR intercept for IA32_APIC_BASE to prevent guest APIC disable
  • PIC EOI → LAPIC EOI bridging via shared hw_interrupts helpers
  • Arc<VmFd> for thread-safe request_virtual_interrupt calls

WHP

  • LAPIC emulation mode detection and partition setup
  • Bulk LAPIC state API (WHvGet/SetVirtualProcessorInterruptControllerState2)
  • Software timer thread with WHvRequestInterrupt
  • set_sregs APIC_BASE filtering to prevent accidental LAPIC disable
  • LAPIC EOI via bulk state API

Tests and CI

  • Unit tests for LAPIC register helpers, IO port handling, PIC EOI bridging
  • #[cfg_attr(feature = "hw-interrupts", ignore)] on tests that conflict with hw-interrupts mode (raw HLT-based tests)
  • hw-interrupts test step in dep_build_test.yml and Justfile

@danbugs danbugs force-pushed the danbugs/hw-interrupts branch from 268cf83 to ddd016c Compare March 3, 2026 23:04
@danbugs danbugs added the kind/enhancement For PRs adding features, improving functionality, docs, tests, etc. label Mar 3, 2026
@danbugs danbugs mentioned this pull request Mar 4, 2026
Member

@syntactically syntactically left a comment

Beyond this immediate use case, hardware interrupt support is a prerequisite for the ring buffer notifier mechanism planned for the tandr/ring branch — the upstream Notifier trait needs a way to signal the guest that new work is available in a virtqueue/ring buffer, and hardware interrupts are the natural delivery mechanism for that (matching the virtio model of interrupt-driven I/O notification).

I'm not sure we've made this decision yet. I believe the intention was to benchmark whether that made sense, or whether a custom ABI (say, a register flag set the next time the guest is reentered through one of the existing entrypoint stubs) ended up being faster (since it would avoid some extra trips up and down through the hv).

I'm actually a bit curious if we have similar data here as well---maybe the complexity of emulating a good fraction of an interrupt controller is worth it for the performance in the KVM case where there's extra support for it, but especially in the other cases, are we sure this is actually any better than just having a custom interface for "jump to this address every so often"? It seems like we don't really need all the complex interrupt routing, priority, etc parts of the interrupt controller---we just need the timer pulse?

Where the guest expects a legacy PIC but interrupts are actually delivered via LAPIC

Since we don't intend to actually have a PIC at any point, can we just modify the guest to get rid of this assumption when it's being built for the Hyperlight platform?

@danbugs danbugs force-pushed the danbugs/hw-interrupts branch from d0daeee to ba7dce0 Compare March 5, 2026 20:18
Contributor Author

danbugs commented Mar 5, 2026

Major rework based on the feedback. Addressing all review comments:

Custom ABI approach (syntactically's suggestion)

Adopted the "custom ABI" approach. The changes:

  1. PIC state machine eliminated (~130 lines of pic.rs deleted). Timer vector 0x20 is hardcoded — Nanvix always remaps to 0x20 via ICW2, so there's no need to emulate the PIC initialization sequence. PIC I/O ports (0x20-0x21, 0xA0-0xA1) are accepted as no-ops. The only retained PIC logic: a 3-line EOI bridge on port 0x20 write that signals end-of-interrupt to the backend.

  2. Guest requests timer via paravirtual port (PvTimerConfig, port 107). Guest writes the period in microseconds. Host spawns a timer thread that fires at that rate. No PIT countdown emulation, no channel registers.

  3. Guest signals readiness via Halt port (port 108) before cli; hlt. This ensures KVM's in-kernel LAPIC doesn't absorb the HLT exit — the outb triggers an IO exit first.

Backend simplification

  • KVM: irqfd replaces in-kernel PIT. Host timer thread writes to an EventFd → kernel injects vector via in-kernel PIC. No userspace signal handling races — irqfd is kernel-mediated.
  • MSHV: request_virtual_interrupt replaces SynIC. Direct interrupt injection per timer tick — no SynIC message pages, no auto-EOI configuration, no synthetic timer setup. Addresses the "why not just opt into specific features" concern.
  • WHP: WHvRequestInterrupt (same pattern as MSHV).

Specific review comments addressed

  • "PIC file should be in arch-specific directory" → File deleted entirely, no longer needed.
  • "Guest IOAPIC workaround should be fixed in guest" → Agreed. Guest no longer looks for IOAPIC. The PIC initialization in the guest kernel writes to PIC ports which are accepted as no-ops by the host.
  • "Where are these [SynIC] values coming from?" → SynIC code removed entirely.
  • "Do we need to support hosts without LAPIC emulation?" → WHP LAPIC emulation check retained as it's part of the setup path, but could be simplified to an error.
  • "Race condition between interrupt delivery and IDT setup" → Still present in the current code (IDT installed after guest starts). Host-side mitigation: timer doesn't start until guest explicitly requests it via PvTimerConfig port. This means no interrupts fire during init before the IDT is installed.
  • "Is xsave the same commit as in feat: use i686 layout for nanvix-unstable guests and make snapshot RWX #1271?" → Yes, it was duplicated. Dropped from both this PR and feat: use i686 layout for nanvix-unstable guests and make snapshot RWX #1271.

Testing

All existing tests pass on KVM (default features, kvm-only features, and hw-interrupts features). The hw_timer_interrupts integration test validates the full flow: guest configures timer → timer fires → guest increments counter → host verifies count > 0.

MSHV and WHP testing pending.

@danbugs danbugs force-pushed the danbugs/hw-interrupts branch 13 times, most recently from ebbf92b to 0d5942b Compare March 9, 2026 19:39
@jsturtevant jsturtevant requested a review from Copilot March 9, 2026 22:42
Contributor

Copilot AI left a comment

Pull request overview

Adds hw-interrupts-gated hardware interrupt delivery to Hyperlight so guests can receive periodic timer interrupts for preemptive scheduling, with backend-native injection paths (KVM irqfd + in-kernel IRQ chip, MSHV request_virtual_interrupt, WHP WHvRequestInterrupt).

Changes:

  • Introduces PV timer configuration + halt signaling via new OUT ports (PvTimerConfig=107, Halt=108) and updates guest exit path accordingly.
  • Implements backend-specific timer injection loops/threads for KVM/MSHV/WHP and shares LAPIC/PIC/PIT helpers via a new hw_interrupts module.
  • Adds/adjusts tests and CI/Justfile workflows to exercise hw-interrupts mode while ignoring HLT-based tests when incompatible.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| src/tests/rust_guests/simpleguest/src/main.rs | Adds a guest-side timer interrupt test (PIC remap, IDT install, IRQ handler, PV timer arm) and switches some “halt” sequences to the Halt port. |
| src/tests/rust_guests/dummyguest/src/main.rs | Updates dummy guest halt to use the Halt port (108) instead of hlt alone. |
| src/hyperlight_host/tests/integration_test.rs | Adds an integration test calling TestTimerInterrupts under hw-interrupts. |
| src/hyperlight_host/src/sandbox/outb.rs | Explicitly rejects PvTimerConfig/Halt in the sandbox outb handler (expected to be handled in hypervisor loop). |
| src/hyperlight_host/src/hypervisor/virtual_machine/whp.rs | Adds WHP LAPIC emulation setup, bulk LAPIC init, software timer injection thread, and IO-port handling for hw-interrupts. |
| src/hyperlight_host/src/hypervisor/virtual_machine/mshv.rs | Enables LAPIC partition flag, initializes LAPIC, adds timer injection thread via request_virtual_interrupt, and hw-interrupts IO-port handling. |
| src/hyperlight_host/src/hypervisor/virtual_machine/mod.rs | Exposes new hw_interrupts module (feature-gated) and updates VmExit::Retry cfg attributes. |
| src/hyperlight_host/src/hypervisor/virtual_machine/kvm.rs | Creates in-kernel IRQ chip + irqfd, spawns host timer thread, handles PvTimerConfig/Halt ports in the KVM run loop. |
| src/hyperlight_host/src/hypervisor/virtual_machine/hw_interrupts.rs | New shared helpers: legacy port behavior, LAPIC register init/EOI, and unit tests. |
| src/hyperlight_host/src/hypervisor/mod.rs | Ignores a HLT-based test under hw-interrupts. |
| src/hyperlight_host/src/hypervisor/hyperlight_vm.rs | Ignores multiple HLT-based reset tests under hw-interrupts. |
| src/hyperlight_host/Cargo.toml | Adds the hw-interrupts cargo feature to hyperlight-host. |
| src/hyperlight_guest_bin/src/arch/amd64/init.rs | Switches guest “done” signaling from hlt to OUT port 108. |
| src/hyperlight_guest_bin/src/arch/amd64/dispatch.rs | Switches guest dispatch epilogue from hlt to OUT port 108. |
| src/hyperlight_guest/src/exit.rs | Replaces the hlt-based halt with an OUT write to OutBAction::Halt. |
| src/hyperlight_common/src/outb.rs | Adds two new OutBAction variants: PvTimerConfig (107) and Halt (108). |
| Justfile | Adds hw-interrupts test/check invocations to CI-like targets. |
| .github/workflows/dep_build_test.yml | Adds a CI step to run tests with hw-interrupts enabled. |

danbugs added 23 commits March 18, 2026 23:27
The PIC state machine was eliminated in the previous experimental
commits (MSHV/WHP now hardcode vector 0x20 and no-op PIC ports).
Remove the orphaned pic.rs file and its mod declaration.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
- Remove _default_irq_handler (timer thread, not PIT, injects IRQs)
- Remove cli;hlt after outb port 108 in guest asm and exit.rs
- Revert to #[derive(Debug)] for all backends
- Revert feature_val to upstream unsafe { features.as_uint64[0] }
- Remove MSR intercept entirely from MSHV
- Extract shared hw_interrupts module (LAPIC helpers, IO port handling,
  EOI bridging) to reduce duplication between MSHV and WHP
- Replace lapic_timer_active with timer_thread.is_some()
- Error on missing LAPIC emulation in WHP instead of fallback path
- Remove lapic_emulation field from WhpVm
- Remove Nanvix mentions from comments
- Add Intel SDM references to LAPIC register writes

Signed-off-by: danbugs <danilochiarlone@gmail.com>
During evolve(), the guest init sequence halts via OutBAction::Halt,
which sets timer_stop=true. When the guest later configures a timer
via PvTimerConfig, the timer thread inherits the already-true stop
flag and exits immediately without ever firing.

Reset the flag to false right before spawning the timer thread in
both KVM and MSHV backends. WHP was not affected because it creates
a fresh AtomicBool each time.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
- KVM: data[..4] → data.get(..4) to avoid panic on short IO OUT
- KVM/MSHV: stop timer thread when period_us == 0
- WHP: check LAPIC state buffer >= 0x374 before init_lapic_registers
- hw_interrupts: add bounds checks + panic messages to write/read_lapic_u32
- init.rs/dispatch.rs: use xor eax,eax; out dx,eax; cli; hlt halt sequence
- simpleguest: guard IDT vector 0x20 against short IDT limit
- simpleguest: early return on period_us <= 0

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: Daniil Baturin <daniil.baturin@microsoft.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
The hyperlight-host clippy.toml disallows assert!/assert_eq!/assert_ne!
in non-test code. Revert to natural slice bounds checking which already
panics on out-of-bounds access.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
…thods

Split the two cfg-gated code paths in run_vcpu into
run_vcpu_hw_interrupts and run_vcpu_default helper methods on KvmVm,
keeping run_vcpu as a thin dispatcher. This makes each path easier to
read and reason about independently.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
The host reads RAX after the init halt to obtain the dispatch_function
address (stored by generic_init's return value). Similarly, the host
reads RAX after dispatch to get the guest function return value.

The previous 'xor eax, eax' before 'out dx, eax' zeroed RAX, causing
the host to store NextAction::Call(0) and subsequent guest calls to
jump to address 0 (triggering Double Fault on MSHV/WHP).

Remove the xor; the port-108 halt signal only needs the port number,
not a specific data value.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
…upts to x86_64/

- Split OutBAction enum: PvTimerConfig and Halt moved to new VmAction
  enum. VmAction ports are intercepted at the hypervisor level in
  run_vcpu and never reach the sandbox outb handler, so the split
  eliminates unreachable match arms.
- Move hw_interrupts.rs to virtual_machine/x86_64/hw_interrupts.rs
  since it contains x86-64 specific helpers (LAPIC, PIC, PIT).
- Remove halt() from hyperlight_guest::exit — all halt sites use
  inline assembly with options(noreturn) to avoid stack epilogue issues.
- Extract handle_pv_timer_config() method in KVM backend.
- Remove intermediate variable in WhpVm::new().
- Restore create_vm_with_args() comment in MSHV backend.
- Add memory fence after IDT writes in simpleguest for CoW safety.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
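
The enum split in the commit above can be pictured with a minimal sketch. This is not the definition in hyperlight_common::outb, whose exact shape may differ; the `from_port` helper is hypothetical, and only the two port numbers come from the PR description.

```rust
/// Illustrative stand-in for the VmAction enum: ports the hypervisor run
/// loop intercepts itself, so they never reach the sandbox outb handler.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u16)]
enum VmAction {
    PvTimerConfig = 107,
    Halt = 108,
}

impl VmAction {
    /// Hypothetical helper: classify an IO OUT port. `None` means the write
    /// is not a VmAction and would be handled elsewhere.
    fn from_port(port: u16) -> Option<Self> {
        match port {
            107 => Some(VmAction::PvTimerConfig),
            108 => Some(VmAction::Halt),
            _ => None,
        }
    }
}
```

With a separate enum, a run loop can match exhaustively on `VmAction` and the sandbox handler no longer needs arms for ports it can never see.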
- Add trace-level log for unexpected VcpuExit::Hlt in KVM hw-interrupts
  loop (jsturtevant thread 1)
- Fix MSHV EINTR handling: always return Cancelled so InterruptHandle::kill()
  is honoured even when the timer thread is active (jsturtevant thread 3)
- Add min/max bounds (100µs–10s) on guest-provided timer period across
  all backends to prevent runaway injection or excessive sleep
  (jsturtevant thread 4)
- Stop timer thread in WHP Halt handler before returning, matching
  KVM/MSHV behaviour (jsturtevant thread 6)
- Use MSHV_PT_BIT_LAPIC named constant instead of magic 1u64 for
  pt_flags (ludfjig nit)

Signed-off-by: danbugs <danilochiarlone@gmail.com>
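
The period bounds from the commit above could look roughly like this. The constant and function names are hypothetical; the sketch assumes out-of-range values are clamped into the 100µs–10s window rather than rejected, and leaves the 0 = disable case to the caller.

```rust
use std::time::Duration;

// Bounds quoted in the commit message: 100 µs to 10 s.
const MIN_PERIOD_US: u32 = 100;
const MAX_PERIOD_US: u32 = 10_000_000;

/// Clamp a non-zero guest-provided period (in microseconds) into the allowed
/// range, so a tiny value cannot cause runaway injection and a huge value
/// cannot park the timer thread in an excessively long sleep.
fn clamp_timer_period(period_us: u32) -> Duration {
    let clamped = period_us.clamp(MIN_PERIOD_US, MAX_PERIOD_US);
    Duration::from_micros(u64::from(clamped))
}
```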
Replace per-backend Arc<AtomicBool> + Option<JoinHandle> + manual Drop
with a shared TimerThread struct in x86_64/hw_interrupts.rs. Each
backend now passes a closure for the inject call (eventfd.write on KVM,
request_virtual_interrupt on MSHV, WHvRequestInterrupt on WHP).

This removes ~60 lines of duplicated timer management logic and
ensures consistent start/stop/drop semantics across all backends.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
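
For readers who want a picture of the shared timer described in the commit above, here is a minimal sketch using only the standard library. It is not the TimerThread in x86_64/hw_interrupts.rs; the method names and the choice of a fresh stop flag per start are assumptions. The `inject` closure stands in for the backend call (eventfd write, request_virtual_interrupt, or WHvRequestInterrupt).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Illustrative periodic timer: owns a stop flag and the worker thread.
struct TimerThread {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<()>>,
}

impl TimerThread {
    /// Spawn a thread that calls `inject` once per `period` until stopped.
    fn start<F>(period: Duration, mut inject: F) -> Self
    where
        F: FnMut() + Send + 'static,
    {
        // A fresh flag per timer, so an earlier stop request cannot leak
        // into a newly started timer.
        let stop = Arc::new(AtomicBool::new(false));
        let stop_for_thread = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            while !stop_for_thread.load(Ordering::Relaxed) {
                thread::sleep(period);
                if stop_for_thread.load(Ordering::Relaxed) {
                    break;
                }
                inject();
            }
        });
        TimerThread { stop, handle: Some(handle) }
    }

    /// Signal the thread to exit and wait for it; safe to call repeatedly.
    fn stop(&mut self) {
        self.stop.store(true, Ordering::Relaxed);
        if let Some(handle) = self.handle.take() {
            let _ = handle.join();
        }
    }
}

impl Drop for TimerThread {
    // Dropping the timer stops the thread, so no backend needs a manual Drop.
    fn drop(&mut self) {
        self.stop();
    }
}
```

Because `stop` joins the worker, no injection can happen after it returns, which is the property a Halt handler needs before leaving the run loop.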
- WhpVm::new(): partition creation sequence now appears once with
  #[cfg(feature = "hw-interrupts")] around just the LAPIC capability
  check and LAPIC-specific setup, eliminating ~20 lines of duplication.

- Replace #[cfg_attr(feature = "hw-interrupts", ignore)] with
  #[cfg(not(feature = "hw-interrupts"))] on tests that don't work
  with hw-interrupts, so they are excluded at compile time rather
  than at runtime.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
…ibility

Signed-off-by: danbugs <danilochiarlone@gmail.com>
…uration

Extract handle_pv_timer_config() into hw_interrupts.rs as a shared
function used by KVM, MSHV, and WHP backends. This eliminates
duplicated timer-config parsing/clamping/start logic.

Fix MSHV timer reconfiguration: previously MSHV only started a timer
if self.timer.is_none(), ignoring reconfiguration requests. Now all
backends consistently stop any existing timer before starting a new
one when period_us > 0.

Fix WHP do_lapic_eoi to log errors from set_lapic_state instead of
silently discarding them.

Fix MSHV request_virtual_interrupt to log errors via tracing::warn
instead of silently discarding them with let _.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
KVM: extract irqfd/eventfd setup into KvmVm::setup_irqfd(), called
from new() via Self::setup_irqfd(&vm_fd).

MSHV: extract LAPIC initialization into MshvVm::init_lapic(), called
from new() via Self::init_lapic(&vcpu_fd). Move lapic_regs_as_u8 and
lapic_regs_as_u8_mut helper functions closer to the hw-interrupts impl
block for better locality.

WHP: extract check_lapic_emulation_support() and
enable_lapic_emulation() from inline blocks in new(). Move
LAPIC_STATE_MAX_SIZE and init_lapic_bulk into the hw-interrupts impl
block since they are only used with that feature.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
With hw-interrupts enabled, VcpuExit::Hlt should never reach
userspace because the in-kernel LAPIC handles HLT internally.
Previously this was silently ignored with a trace log and continue.
Now it returns RunVcpuError::UnexpectedExit to surface the problem.

Add UnexpectedExit(String) variant to RunVcpuError for cases where
the vCPU exits in a way that shouldn't happen under normal operation.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Replace hardcoded port number 108 in dispatch_function and
pivot_stack inline assembly with a const operand referencing
hyperlight_common::outb::VmAction::Halt. This ensures the guest
assembly stays in sync with the host-side port definitions.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Signed-off-by: danbugs <danilochiarlone@gmail.com>
The data slice borrows from the KVM run buffer (self.vcpu_fd), so
calling self.handle_pv_timer_config(data) would create a conflicting
mutable borrow. Copy data to a local Vec before calling the method.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
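
The borrow conflict fixed in the commit above can be reproduced in miniature. The `Vcpu` type and its fields below are hypothetical stand-ins for the KVM vCPU wrapper, not the real types; the point is only the copy-then-call pattern.

```rust
/// Hypothetical stand-in for a vCPU wrapper whose IO data lives in a buffer
/// owned by the struct itself.
struct Vcpu {
    run_buffer: Vec<u8>,
    last_period_us: Option<u32>,
}

impl Vcpu {
    /// The returned slice borrows from `self`.
    fn io_out_data(&self) -> &[u8] {
        &self.run_buffer
    }

    /// Needs `&mut self`, so it cannot run while a slice from
    /// `io_out_data` is still alive.
    fn handle_pv_timer_config(&mut self, data: &[u8]) {
        if let Some(b) = data.get(..4) {
            self.last_period_us = Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]));
        }
    }

    fn on_io_out(&mut self) {
        // Passing `self.io_out_data()` straight into the call below would
        // be rejected by the borrow checker. Copying the bytes into a local
        // Vec ends the borrow of `self` before the mutable call.
        let data: Vec<u8> = self.io_out_data().to_vec();
        self.handle_pv_timer_config(&data);
    }
}
```

The copy is a handful of bytes per PvTimerConfig write, so its cost is negligible next to the VM exit itself.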
@danbugs danbugs force-pushed the danbugs/hw-interrupts branch from da18544 to 836cb91 Compare March 18, 2026 23:28
danbugs added 3 commits March 18, 2026 23:41
Restore the combined `use {anyhow, serde_json};` import that was
split during rebase, matching the format expected by nightly rustfmt.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Replace hardcoded literal 1 with the named constant from the
windows crate for LAPIC emulation mode.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Replace .expect() with match + tracing::warn on eventfd clone failure
in KVM handle_pv_timer_config to satisfy clippy::expect_used lint.

Restore combined `use {anyhow, serde_json};` import in error.rs that
was split during rebase.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
ludfjig previously approved these changes Mar 19, 2026
Contributor

@ludfjig ludfjig left a comment

This looks great now I think, just one more thing ;D

Remove the dedicated VcpuExit::Hlt match arm and UnexpectedExit error
variant in the hw-interrupts run loop. Unexpected HLT now falls through
to the Ok(other) catch-all, returning VmExit::Unknown which gets
converted to the existing UnexpectedVmExit in hyperlight_vm.

Signed-off-by: danbugs <danilochiarlone@gmail.com>
Labels

kind/enhancement For PRs adding features, improving functionality, docs, tests, etc.

Projects

Status: In Progress

6 participants