guide: document OpenHCL CPU scheduling model #2975
Why is this change being made?

- The cooperative executor model in OpenHCL (one affinitized thread per VP, shared between device workers and VTL0 execution) is a frequent source of confusion when debugging storage latency and device worker stalls.
- No existing documentation explains the blocking scenarios (VTL0 residence, kernel syscalls, hypervisor intercepts), the mitigations (io_uring cancel, dedicated threads for GET), or how sidecar changes the picture.

What changed?

- New Guide page: `Guide/src/reference/architecture/openhcl/cpu_scheduling.md` covering the thread model, cooperative scheduling, blocking scenarios, sidecar tradeoffs, OpenVMM comparison, and device design rules.
- Expanded rustdoc for [`VmTaskDriverBuilder::target_vp()`](https://openvmm.dev/rustdoc/linux/vmcore/vm_task/struct.VmTaskDriverBuilder.html) and [`run_on_target()`](https://openvmm.dev/rustdoc/linux/vmcore/vm_task/struct.VmTaskDriverBuilder.html): documents per-backend guarantee strength and the VP index semantics.
- Expanded rustdoc for [`VmTaskDriver::retarget_vp()`](https://openvmm.dev/rustdoc/linux/vmcore/vm_task/struct.VmTaskDriver.html): documents that in-flight IOs are not retargeted.
- Expanded comment on `vp.rs` VP-index-to-CPU assumption.

How was the change tested?

- ✅ `cargo doc --no-deps -p vmcore`: no warnings
- ✅ Guide page reviewed for cross-link validity

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
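To make the cooperative model concrete, here is a minimal, std-only sketch of why one non-yielding task delays everything else that shares its thread. It is a toy round-robin poller, not the OpenHCL executor: `worker`, `YieldNow`, `run_on_one_thread`, and `trace` are hypothetical names invented for this illustration, and the real executor wakes tasks selectively instead of re-polling all of them.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; the toy executor below simply re-polls every task.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Future that returns `Pending` once and then `Ready`: a cooperative yield point.
struct YieldNow(bool);

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            Poll::Pending
        }
    }
}

type Log = Rc<RefCell<Vec<String>>>;
type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Hypothetical worker: records each step and optionally yields between steps.
async fn worker(name: &'static str, steps: u32, yields: bool, log: Log) {
    for i in 0..steps {
        log.borrow_mut().push(format!("{name}:{i}"));
        if yields {
            YieldNow(false).await;
        }
    }
}

/// Toy stand-in for a single VP thread: polls every task once per round.
fn run_on_one_thread(mut tasks: Vec<Task>) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        // Keep only the tasks that are still pending after this poll.
        tasks.retain_mut(|task| task.as_mut().poll(&mut cx).is_pending());
    }
}

/// Runs tasks "a" and "b" on one thread and returns the order of their steps.
fn trace(first_task_yields: bool) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let a: Task = Box::pin(worker("a", 2, first_task_yields, log.clone()));
    let b: Task = Box::pin(worker("b", 2, true, log.clone()));
    run_on_one_thread(vec![a, b]);
    let out = log.borrow().clone();
    out
}

fn main() {
    // Both tasks yield: their steps interleave.
    println!("cooperative: {:?}", trace(true));
    // Task "a" never yields: "b" cannot start until "a" has finished.
    println!("hogging:     {:?}", trace(false));
}
```

When both tasks yield, their steps interleave; when task "a" never yields, task "b" makes no progress until "a" completes. That is the same shape as a device worker stalled behind a blocking call on its VP thread.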
This PR modifies files containing For more on why we check whole files, instead of just diffs, check out the Rustonomicon
Pull request overview
Adds missing documentation around OpenHCL’s cooperative per-VP executor/thread model to reduce confusion when debugging device-worker stalls and storage latency, and aligns related Rust API docs/comments with those scheduling concepts.
Changes:
- Adds a new Guide reference page documenting OpenHCL CPU scheduling/cooperative execution, blocking scenarios, mitigations, and sidecar implications.
- Expands rustdoc on `VmTaskDriverBuilder`/`VmTaskDriver` VP-targeting APIs to clarify semantics and backend-dependent guarantees.
- Clarifies the current VP-index-to-Linux-CPU simplifying assumption in Underhill VP spawning code.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| vm/vmcore/src/vm_task.rs | Updates rustdoc for run_on_target, target_vp, and retarget_vp scheduling semantics. |
| openhcl/underhill_core/src/vp.rs | Adds a comment explaining the current VP index → Linux CPU number assumption. |
| Guide/src/reference/architecture/openhcl/cpu_scheduling.md | New page describing OpenHCL’s CPU scheduling model, blocking modes, and sidecar behavior. |
| Guide/src/SUMMARY.md | Adds the new CPU scheduling page to the Guide navigation. |
Guide changes:

- Replace all 'Underhill' references with 'OpenHCL'
- Add Scope section explaining this covers the worker process
- Add links to Rust async book and tokio tutorial
- Clarify 'idle' terminology (VTL2 executor idle, not CPU idle)
- Describe multiple OpenHCL processes, scope to worker
- Explain GET worker CPU (dedicated thread, not VP thread)
- Clarify io_uring cancel scope: covers disk_blockdevice, disk_nvme eventfd via poll, but not hypervisor traps
- Explain VTL0 hypervisor trap vs VTL0 usermode for cancel
- Add sidecar VP readiness caveat to 'what runs on a VP thread'
- Add TaskControl/AsyncRun explanation and rustdoc links
- Improve timeline diagram with shading legend
- Fix OpenVMM comparison table: run_on_target is ignored by thread backend, target_vp controls dedicated thread
- Fix dangling StorVSP channels page reference
- Add rule 5: use TaskControl for worker lifecycle

Rustdoc changes:

- run_on_target: thread backend ignores it, target_vp controls
- retarget_vp: note it's backend-dependent (no-op in thread backend)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds end-user-facing Guide documentation and API rustdoc clarifying OpenHCL’s cooperative per-VP executor/thread model, common blocking scenarios, and how VmTaskDriver targeting hints behave across backends.
Changes:
- Added a new Guide page describing OpenHCL CPU scheduling/cooperative executor behavior and sidecar implications.
- Expanded rustdoc for `VmTaskDriverBuilder::{run_on_target,target_vp}` and `VmTaskDriver::retarget_vp()` to clarify backend-specific semantics.
- Expanded an in-code comment documenting the current VP-index-to-Linux-CPU assumption in OpenHCL.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| vm/vmcore/src/vm_task.rs | Adds backend-specific semantics and VP index clarifications to task-driver rustdoc. |
| openhcl/underhill_core/src/vp.rs | Documents the current assumption mapping VP index to Linux CPU number. |
| Guide/src/reference/architecture/openhcl/cpu_scheduling.md | New architecture doc explaining OpenHCL’s cooperative scheduling model, stalls, mitigations, and sidecar behavior. |
| Guide/src/SUMMARY.md | Wires the new CPU scheduling page into the Guide navigation. |
The OpenHCL threadpool guarantees CPU-affinitized execution only after the target VP is online. For sidecar VPs that haven't been onlined yet, work may run on a different CPU. Reference is_target_vp_ready/wait_target_vp_ready for callers. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
> ### What runs on a VP thread
>
> All tasks with `target_vp = N` and `run_on_target = true` run on
What's the point of target_vp if run_on_target = false?
If run_on_target is true, then the task always runs on the target VP wherever it was woken. If it's false, the task can run wherever it is woken.
In both cases, IOs issued by the task will use the target VP's io-uring (and so when the IO completes, it will be woken up on the target VP and therefore will be likely to run there, even if run_on_target is false). If no target VP is set, then the task will use the current VP's io-uring.
@jstarks: the current threadpool impl seems to dispatch this on VP0's executor. Am I looking at the right spot?
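The placement rules stated in the reply above can be written down as a small pure function. This is only a model of the described semantics, assuming the reply is accurate; `Placement` and `place` are hypothetical names and not part of `vmcore`. The untargeted case follows the reply as written (current VP's io-uring), although the follow-up question suggests the current threadpool may use VP 0 there instead.

```rust
/// Where a task runs and which VP's io-uring its IOs use (model only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Placement {
    runs_on: u32,
    io_ring: u32,
}

/// Hypothetical model of the rules described in the review thread.
///
/// * `target_vp`: the VP the task was targeted at, if any.
/// * `run_on_target`: whether the task must run on the target VP.
/// * `woken_on`: the VP on which the task happened to be woken.
fn place(target_vp: Option<u32>, run_on_target: bool, woken_on: u32) -> Placement {
    // IOs use the target VP's ring when a target is set; otherwise the ring
    // of the VP the task is currently on (as stated in the reply).
    let io_ring = target_vp.unwrap_or(woken_on);
    // Only a targeted task with run_on_target = true is moved to the target;
    // every other combination runs wherever it was woken.
    let runs_on = match (target_vp, run_on_target) {
        (Some(vp), true) => vp,
        _ => woken_on,
    };
    Placement { runs_on, io_ring }
}

fn main() {
    // Targeted and pinned: woken on VP 5, still runs on VP 2.
    println!("{:?}", place(Some(2), true, 5));
    // Targeted but not pinned: runs where woken, IO still goes to VP 2's ring.
    println!("{:?}", place(Some(2), false, 5));
    // Untargeted: everything follows the VP the task is currently on.
    println!("{:?}", place(None, false, 5));
}
```

In this model, the point of `target_vp` with `run_on_target = false` is visible in the second case: the task is free to run anywhere, but because its IO completes on the target VP's ring, it tends to be woken there.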
Pull request overview
Adds Guide and rustdoc documentation to clarify OpenHCL’s cooperative per-VP executor/threadpool model (and sidecar’s impact), aiming to reduce confusion when debugging latency and stalls.
Changes:
- Added a new Guide architecture page describing OpenHCL CPU scheduling/cooperative execution, blocking scenarios, and mitigations.
- Expanded rustdoc on `VmTaskDriverBuilder::{target_vp,run_on_target}` and `VmTaskDriver::retarget_vp()` to clarify backend-specific guarantees and retargeting semantics.
- Clarified (via comment) the current VP-index-to-Linux-CPU assumption in VP spawning.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| vm/vmcore/src/vm_task.rs | Rustdoc clarifications for VP targeting, run-on-target semantics, and retargeting behavior across backends. |
| openhcl/underhill_core/src/vp.rs | Comment documents current VP index → Linux CPU mapping assumption used for threadpool selection. |
| Guide/src/reference/architecture/openhcl/cpu_scheduling.md | New Guide page documenting OpenHCL worker thread model, cooperative scheduling, blocking cases, sidecar changes, and device design guidance. |
| Guide/src/SUMMARY.md | Adds the new CPU scheduling page to the Guide TOC under OpenHCL architecture. |
Pull request overview
Adds missing documentation for OpenHCL’s cooperative per-VP scheduling model (and related API docs) to reduce confusion when debugging device stalls and storage latency.
Changes:
- Adds a new Guide page documenting OpenHCL CPU scheduling/cooperative executor behavior, blocking scenarios, mitigations, and sidecar implications.
- Expands rustdoc for `VmTaskDriverBuilder::{run_on_target,target_vp}` and `VmTaskDriver::retarget_vp()` to clarify semantics and backend-dependent guarantees.
- Clarifies a VP-index-to-CPU assumption in `underhill_core` VP spawning code.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| vm/vmcore/src/vm_task.rs | Rustdoc updates describing VP targeting / run-on-target behavior and retargeting limitations. |
| openhcl/underhill_core/src/vp.rs | Comment clarifying the current VP-index-to-Linux-CPU assumption. |
| Guide/src/reference/architecture/openhcl/cpu_scheduling.md | New reference page explaining OpenHCL’s per-VP cooperative executor model, blocking modes, mitigations, and sidecar effects. |
| Guide/src/SUMMARY.md | Adds the new CPU Scheduling page to the Guide navigation. |
Pull request overview
Adds documentation explaining OpenHCL’s per-VP cooperative executor scheduling model and clarifies task-to-VP targeting semantics to reduce confusion when debugging latency/stalls.
Changes:
- Added a new guide page describing OpenHCL’s thread model, cooperative scheduling, blocking scenarios, sidecar behavior, and device design guidance.
- Expanded rustdoc on `VmTaskDriverBuilder` and `VmTaskDriver` VP-targeting APIs to explain backend-dependent guarantees and retargeting behavior.
- Clarified the VP-index-to-Linux-CPU assumption in `vp.rs` and linked the new guide from the docs SUMMARY.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| vm/vmcore/src/vm_task.rs | Expanded rustdoc for VP targeting, scheduling affinity hints, and retargeting semantics |
| openhcl/underhill_core/src/vp.rs | Added a clarifying comment about VP index vs Linux CPU numbering assumptions |
| Guide/src/reference/architecture/openhcl/cpu_scheduling.md | New detailed guide on cooperative scheduling, blocking scenarios, and sidecar implications |
| Guide/src/SUMMARY.md | Added the new “CPU Scheduling” page to the OpenHCL architecture section |
The existing storvsp page mentions subchannels in a bullet point but has no coverage of the negotiation sequence, worker model, CPU affinity, performance tradeoffs, or configuration.

## Changes

- New Guide page: `devices/vmbus/storvsp_channels.md` covering subchannel negotiation (mermaid sequence diagram), the one-worker-per-channel model, CPU affinity and VP targeting, IDE accelerator comparison, subchannel scaling illustrations (0 subs through 64 VPs), poll mode mechanics, the slow-disk head-of-line blocking problem, cooperative executor impact, sidecar behavior, configuration (CLI + OpenHCL VTL2 settings + guest kernel params), Hyper-V behavioral differences, and inspect output.
- Updated `storvsp.md` with cross-link to the new page.
- Updated `storage.md` with cross-link from sub-channel allocation mention.
- Updated `storage_configuration.md` with `scsi_sub_channels` fixed settings documentation.
- Updated `ide.md` with new IDE accelerator section explaining the StorVSP-backed VMBus path and why it doesn't support subchannels.
- Added Guide cross-link to `storvsp` crate-level rustdoc.

Addresses #2954, touches #2955. Can be reviewed independently from the VMBus channels (#2977) and CPU scheduling (#2975) PRs.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds missing Guide coverage for OpenHCL’s cooperative per-VP executor/thread model (and its blocking/latency implications), and tightens related API/docs around VP targeting and retargeting semantics.
Changes:
- Adds a new Guide page documenting OpenHCL CPU scheduling/cooperative executor behavior, blocking scenarios, and sidecar implications.
- Expands rustdoc for `VmTaskDriverBuilder::{run_on_target,target_vp}` and `VmTaskDriver::retarget_vp()` to clarify backend-dependent guarantees.
- Updates OpenHCL VP spawning commentary and Guide navigation to reflect VP-index/CPU-index assumptions and add the new page.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| vm/vmcore/src/vm_task.rs | Refines rustdoc around VP targeting/retargeting and run_on_target semantics. |
| openhcl/underhill_core/src/vp.rs | Adds clarification on the current VP-index-to-Linux-CPU indexing assumption. |
| Guide/src/reference/devices/vmbus/storvsp_channels.md | Shortens a “guest VP not stalled” explanation in the StorVSP channel doc. |
| Guide/src/reference/architecture/openhcl/cpu_scheduling.md | New OpenHCL CPU scheduling/cooperative executor documentation page. |
| Guide/src/SUMMARY.md | Adds the new CPU scheduling page to the Guide TOC. |
Comments suppressed due to low confidence (1)
vm/vmcore/src/vm_task.rs:170
- The `target_vp()` docs say that when unset the backend typically uses "the calling thread or a shared pool", but the OpenHCL threadpool backend currently drives untargeted IO via `driver(0)` (VP0) rather than the caller. Please tighten this wording (or explicitly call out the OpenHCL threadpool default) so it matches actual backend behavior.
/// If not set, the backend uses its default scheduling (typically the
/// calling thread or a shared pool).
> │ async executor   │  │ async executor   │  │ async executor   │
> │                  │  │                  │  │                  │
> │ • device workers │  │ • device workers │  │ • device workers │
> │ • VMBus relay    │  │ • VMBus relay    │  │ • VMBus relay    │
Is there a reason VMBus relay is called out? It just seems weird to me here. What is the intention of this box? Should it just be async tasks?
> situations can stall all tasks on a VP.
>
> ```admonish note
> In OpenVMM (not OpenHCL), device workers and VP execution run
I think the general guidance of "you should never do a blocking wait inside an async task" still applies; that's general async Rust. What's this meant to clarify here? If you block OpenVMM's async executor, you might get lucky that it's a threadpool with multiple threads, but we're still going to be very unhappy.
> │▓▓▓▓▓▓▓▓▓▓▓▓▓▓│░░░░░░░░░░░░░░░│▓▓▓▓▓▓▓▓▓▓▓▓▓▓│██████████████│
> │ VTL2 tasks   │ VTL0 guest    │ VTL2 tasks   │ kernel       │
> │              │               │              │ syscall      │
> │ storvsp,     │ ALL VTL2      │ storvsp,     │ ALL VTL2     │
I think this is subtly wrong: there are NO pending VTL2 tasks, which is why we are in the VTL0 guest. It should maybe say ALL VTL2 tasks awaited/Poll::Pending.
Because the key point here is we will never dispatch to VTL0 unless there is no pending work in VTL2.
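The invariant in this comment can be sketched as a tiny decision function: the VP thread only enters VTL0 when its queue of runnable VTL2 tasks is empty. This is a hypothetical model for illustration; `Step`, `next_step`, and the task names are invented here and do not reflect the actual VP loop in OpenHCL.

```rust
use std::collections::VecDeque;

/// What the VP thread does next in this simplified model.
#[derive(Debug, PartialEq, Eq)]
enum Step {
    /// Poll one runnable VTL2 task.
    RunVtl2(&'static str),
    /// No runnable VTL2 work is left, so return to the guest.
    EnterVtl0,
}

/// Picks the next step: VTL2 work always wins over entering VTL0.
fn next_step(runnable: &mut VecDeque<&'static str>) -> Step {
    match runnable.pop_front() {
        Some(task) => Step::RunVtl2(task),
        None => Step::EnterVtl0,
    }
}

fn main() {
    // Two VTL2 tasks have been woken and are runnable on this VP.
    let mut runnable = VecDeque::from(["storvsp", "nvme"]);
    // The thread drains the runnable queue before it enters VTL0.
    for _ in 0..3 {
        println!("{:?}", next_step(&mut runnable));
    }
}
```

Under this model, the VTL0 segment of the timeline means every VTL2 task on that VP has returned `Poll::Pending` and is waiting for a wakeup, which matches the wording suggested above.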
> For device interrupts that don't go through io_uring (e.g., the physical
> NVMe driver in `disk_nvme` receives interrupts via an eventfd), the eventfd
> is registered with io_uring as a poll operation, so it also triggers the
> cancel path.
This reads weirdly to me, it feels like it's saying "things that don't go through io_uring actually do go through io_uring", which is an odd statement to make.
Thanks! I'm not going to be able to get to this immediately, so filed #3205 to track. Thanks for the review :).
smalis-msft left a comment
One minor nit, but otherwise LGTM
The cooperative executor model in OpenHCL (one affinitized thread per VP, shared between device workers and VTL0 execution) is a frequent source of confusion when debugging storage latency and device worker stalls. No existing documentation explains the blocking scenarios, mitigations, or how sidecar changes the picture.

## Changes

- New Guide page: `architecture/openhcl/cpu_scheduling.md` covering the thread model, cooperative scheduling, blocking scenarios (VTL0 residence, kernel syscalls, hypervisor intercepts), mitigations (io_uring cancel, dedicated threads), sidecar tradeoffs, OpenVMM comparison, and device design rules.
- Expanded rustdoc for `VmTaskDriverBuilder::target_vp()` and `run_on_target()`: documents per-backend guarantee strength and VP index semantics.
- Expanded rustdoc for `VmTaskDriver::retarget_vp()`: documents that in-flight IOs are not retargeted.
- Expanded comment on `vp.rs` VP-index-to-CPU assumption.

Addresses the CPU scheduling / cooperative executor dimension of microsoft#2954 and microsoft#2955. Can be reviewed independently from the VMBus channels (microsoft#2977) and StorVSP channels (microsoft#2976) PRs.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>