arch: Change numeric serialization for CPU profiles by olivereanderson · Pull Request #79 · cyberus-technology/cloud-hypervisor

olivereanderson · 2026-02-09T19:26:42Z

This PR makes some changes to the way we serialize CPUID leaf and sub-leaves in the CPU profiles. When debugging CPU profiles I noticed that it is easier to deal with shorter hex strings for the leaves and that it would also be nice to have the sub-leaf ranges specified in terms of hex as well.

We do however want to keep serializing the adjustments as 10 byte hex strings which we do not change here (only the code).

We don't bother regenerating profiles in this PR, because there is no functional change, and we have to regenerate them in #70 anyway.

After updating the Linux kernel to 6.19.6 the second segment (segment=1) is now under the 2nd IOMMU group (which it a more logical setup) and as such the added device which is on that segment is in that second IOMMU group. The same check is made in test_vdpa_block so also test there. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

The set of extensions supported by a RISC-V system needs to be exposed to the guest - currently that is a fixed, minimal set of extensions. These extensions are not sufficient to boot Ubuntu 25.10 which now has a mininimum requirement of RVA23S64 (which is a minimum set of extensions that make sense for server use cases.) The easiest way to convey the extensions that the guest should use is to copy those that the host kernel understands (and thus includes in the /proc/cpuinfo) data. However since nested virtualisation is not currently possible - exclude the "H" (Hypervisor) extension from the list of short (single letter) extensions. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

To get that list, I've used ``` git log | grep --fixed-strings -- "-by:" | head -n 100000 | sort | less ``` on the Linux kernel's git repository. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

Suggested-by Alyssa Ross [0]. [0]: cloud-hypervisor#7471 (comment) Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

as of rust 1.90, writes to unix socket streams use send_with_flags instead of write, so it uses a sendto syscall instead of write. Signed-off-by: Matt Moriarity <matt@mattmoriarity.com>

This is required to support exclusive locking on files which is needed for safe test ID generation when using nextest (since it runs each test as a separate process.) Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

When using nextest for running tests each test is run in its own process so the old solution of using a static variable for the guest ID (used to determine the network segment) no longer works. Instead use a text file on the filesystem protected with an exclusive lock. The test process will read from it and then write back the next ID that can be used. It wraps around at the limit of u8 and skips ID 0. This function intentionally panics rather than propagate errors as it should only be called for testing purposes and there the panic handler will give a useful backtrace and cleanup. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

cargo nextest is an improved test runner that allows retries as well a reporting the times for the test runs. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

This alternative test runner supports retries and also reports how long each test takes to run. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

This is now failing on x86-64 as well after the update of the Rust version. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

Due to the fw_cfg test being disabled on MSHV this results in no tests being runnable which results in an error in nextest. Reduce that error to a warning use --no-tests=warn Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

Instead of using ad-hoc code, just write an extension to the Iterator trait that we can easily unit test. Co-authored-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP julian.stecklina@sap.com On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de> Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

Adding itertools as dependency improves the iteration code in the following significantly. With this change, we don't need a copy of the vector. Just something that can be coerced into an iterator. We also use the bit position iterator to make the code somewhat clearer. The new code is much faster, because it will not iterate over every bit, just each 1 bit in the input. The next commit will complete this optimization and have some concrete numbers. On-behalf-of: SAP julian.stecklina@sap.com Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>

This would be a good opportunity to optimize another pointless vector away, but I don't have a good way to test this at the moment. But maybe someone else gives it a shot. On-behalf-of: SAP julian.stecklina@sap.com Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>

... by passing the slice along instead. On-behalf-of: SAP julian.stecklina@sap.com Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>

... by just passing the iterator along. For large VMs this bitmap is gigantic. A 12TB VM has 384MB of dirty bitmap. With all these optimizations from the previous commits in place, we see quite the improvement when it comes to scanning the dirty bitmap. For a bitmap with 1% bits (randomly) set, dirty_log() takes: Original code: 2166ms (100.0%) New code: 382ms ( 17.6%) on my system. The sparser the dirty bitmap the faster. Scanning an empty bitmap is 100x faster. For a 5% populated bitmap we are still 3x faster. If someone wants to play with this, there is a benchmark harness here: https://github.com/blitz/chv-bitmap-bench On-behalf-of: SAP julian.stecklina@sap.com Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

A major improvement to the developer experience of clippy in Cloud Hypervisor. 1. Make `cargo clippy` just work with the same lints we use in CI 2. Simplify adding new lints Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

Bumps [crate-ci/typos](https://github.com/crate-ci/typos) from 1.39.0 to 1.39.2. - [Release notes](https://github.com/crate-ci/typos/releases) - [Changelog](https://github.com/crate-ci/typos/blob/master/CHANGELOG.md) - [Commits](crate-ci/typos@v1.39.0...v1.39.2) --- updated-dependencies: - dependency-name: crate-ci/typos dependency-version: 1.39.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>

Commit 5ec47d4 was intended to patch ebx bits 23-16 in cpuid leaf 0x1, but it was not working as expected, as in rust, operator << has a stronger precedence than & [1]. Later commit b6667f9 fixed the operator precedence clippy warning, but did not fix the actual issue. As a result, the current code is not changing ebx, ``` cpu_ebx |= ((dies_per_package as u32) * (cores_per_die as u32) * (threads_per_core as u32)) & (0xff << 16); ``` Since the total number of logical processors is generally less than 65536, the right hand side of the expression is 0 in most cases. [1] https://doc.rust-lang.org/reference/expressions.html#expression-precedence Fixes: 5ec47d4 ("arch: x86_64: enable HTT flag") Signed-off-by: Changyuan Lyu <changyuanl@google.com>

This change makes integration tests more resilient to transient download failures. This will reduce churn in our CI workflow, specifically for the Merge Queue. Examples: https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345066/job/55962570896 https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345122/job/55962570736 https://github.com/cloud-hypervisor/cloud-hypervisor/actions/runs/19545345034/job/55962570724 Signed-off-by: Bo Chen <bchen@crusoe.ai>

TL;DR: Massive quality of life improvement for devs Cloud Hypervisor uses the Cargo test framework for multiple tests: - normal unit tests - unit tests requiring special environment (the Tap device tests) - integration tests requiring a special environment This prevented the execution of `cargo test --workspace`, which results in a very poor developer experience. Although `./scripts/run_unit_tests.sh` exists, there are valid reasons why devs cannot or even don't want to use it. By adding a new `chv_testenv` rustc config, we can conditionally only activate tests when the `./scripts/` magic runs them. This improves the general developer experience by a lot. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

This better aligns with the rest of the code and makes it clearer that these tests can run "as is" in a normal hosted environments without the special test environment. Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

We introduce data structures to describe values within the registers modified by the CPUID instruction. These data structures will later be used by the upcoming CPU profile generation tool. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

We introduce CPUID definitions for Intel CPUs that will be utilized by the upcoming CPU Profile generation tool. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

We introduce CPUID definitions defined for the KVM hypervisor. These definitions will later be utilized by the upcoming CPU profile generation tool. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

We use the Intel CPUID definitions to provide more information when CPUID compatibility checks fail (when both the source and destination VM run on Intel CPUs). Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

This commit introduces a CLI for generating a CPU profile closely matching the CPU of the machine the CLI is executed on. The idea is to have a simple way to add more CPU profiles corresponding to physical CPUs. Note however that with the current setup one still needs a little bit of manual work to integrate the generated CPU profile data into cloud hypervisor itself. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

Some devices can do announcements after a live migration, e.g. a network device announces its new location in a network. With this change a device can return an announcer to do that. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>

To announce the new location in a network, virtio-net devices can send reverse ARP packets. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>

After this change switches in a network should find the new location of a VM quicker. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

x86::__cpuid is safe on Rust ≥1.94 but unsafe on older versions. This causes unused_unsafe warnings when compiling with Rust ≥1.94. However, on earlier Rust versions, the code won’t compile if the unsafe blocks are absent. Work around this by adding #[allow(unused_unsafe)] where needed to suppress the warnings. See cloud-hypervisor#7588 for more discussion. Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

Unnecessary option. Change is also on upstream [0]. [0] cloud-hypervisor#7629 Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

Adds a flake configuration that enables building Cloud Hypervisor directly from this repository using Nix. This makes it possible to deploy and test Cloud Hypervisor on NixOS systems in real environments. On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

Users with Nix and a shell with direnv integration will be able to automatically enter the Nix development shell. Forgot to add this in [0]. [0] cyberus-technology#73 On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

This is the first commit in a series of improvements `memory_copy_iterations()` and `struct MigrationState`. I verified everything in dozens of test runs with excessive logging together with three colleagues (StefanK, PascalS, SebastianE). The series is a pre-requisite for live migration statistics reporting. Previously, the initial transmission was done in a dedicated step and `memory_copy_iterations()` took only care of the delta transmission. This makes aggregating metrics unnecessarily difficult. Therefore, everything is now handled gracefully by `memory_copy_iterations()` in one single place. Further, this consolidation perfectly makes sense as all memory transmission is now streamlined in one function. I had to adapt the iteration counter: Iteration 0 : initial transmission Iteration 1..n: delta transmission ^ done inside the function Iteration n : final transmission ^ done outside the function On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

This improves the code quality of `struct MigrationState` and memory_copy_iterations(). This significantly improves maintainability of the code. Further, I've added the ability for expected downtime calculation and dirty page calculation. The new names are much more descriptive. I also removed properties that didn't make sense. These changes have undergone intense manual testing where many colleagues attended (PascalS, StefanK, SebastianE). There is currently no easy way to check that things really work as a reviewer. PS: The old struct comes from an external contributor [0]. [0] cloud-hypervisor#7033 On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

This replaces the multiple mpsc channels that the main thread created (one for each memory send thread) with a single mpsc channel. We synchronize by wrapping a mutex around the receiving end. That way the main thread can simply put all memory chunks into the channel without the need for additional synchronization. With this change, the main thread does not send memory anymore, thus we now create one additional thread. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>

This solves the race condition in the following scenario: Thread A is done working and waits at the barrier. Thread B encounters an error and sends it to the main thread. Thread A is still waiting at the barrier, and the main thread cannot abort that. With the custom gate, the main thread can simply open the gate, and all waiting threads will continue. Even if Thread A now gets the gate-message that was sent for Thread B, the gate is now open and Thread A will not block. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>

This reverts commit 3fe978e.

This reverts commit daaea60.

phip1611 · 2026-02-10T06:51:19Z

~~wrong base, you're merging this into main not gardenlinux~~

ah duplicate of #80, right?

From working with serialized CPU profile data we noticed the following: 1. It is easier to read shorter hex strings when looking up CPUID leaves in the serialized CPU profile data. 2. It is preferable to also have sub-leaf ranges hex encoded. 3. We still want to keep serializing the adjustments to 10 character hex strings. Some of the convenience functions introduced here will also be utilized in the upcoming MSR adjustments PR in order to serialize register addresses. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

olivereanderson · 2026-02-10T07:59:32Z

~~wrong base, you're merging this into main not gardenlinux~~

ah duplicate of #80, right?

Yes duplicate. Very strange that this was left open after changing the base, but perhaps I made some mistake in the browser.

rbradford and others added 30 commits November 13, 2025 14:29

misc: gitlint: allow more prefixes disabling 72 width limit

62345cd

Suggested-by Alyssa Ross [0]. [0]: cloud-hypervisor#7471 (comment) Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

misc: gitlint: python code improvements

02da2f2

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

seccomp: allow sendto for vsock thread

ec57aad

as of rust 1.90, writes to unix socket streams use send_with_flags instead of write, so it uses a sendto syscall instead of write. Signed-off-by: Matt Moriarity <matt@mattmoriarity.com>

build: Bump MSRV to 1.89.0

5051feb

This is required to support exclusive locking on files which is needed for safe test ID generation when using nextest (since it runs each test as a separate process.) Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

build: Add cargo nextest to container

2707b0f

cargo nextest is an improved test runner that allows retries as well a reporting the times for the test runs. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

tests: Use cargo nextest for integration tests

3734a13

This alternative test runner supports retries and also reports how long each test takes to run. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

tests: Disable test_snapshot_restore_with_fd()

9f046f0

This is now failing on x86-64 as well after the update of the Rust version. Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

tests: Add test timeout (10 minutes) to nextest configuration

f9076cc

Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

tests: Disable nextest failure on fw_cfg tests on MSHV

6147c4c

Due to the fw_cfg test being disabled on MSHV this results in no tests being runnable which results in an error in nextest. Reduce that error to a warning use --no-tests=warn Signed-off-by: Rob Bradford <rbradford@rivosinc.com>

virtio-devices: avoid creating a temporary vector

fc99e29

... by passing the slice along instead. On-behalf-of: SAP julian.stecklina@sap.com Signed-off-by: Julian Stecklina <julian.stecklina@cyberus-technology.de>

vm-migration: better naming + unittests

a7fa3a0

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

misc: unrelated misc code improvements

935332b

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

tests: prevent broken terminal after running cargo test -p vmm

35b91f7

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

tests: move VM test into a test module

7364fbd

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

net_util: unrelated code improvement

e160a17

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

docs: update CONTRIBUTING.md

11d17fb

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

olivereanderson and others added 23 commits January 16, 2026 11:26

arch: Intel CPUID definitions

80f7e0b

We introduce CPUID definitions for Intel CPUs that will be utilized by the upcoming CPU Profile generation tool. Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

virtio-devices: implement a PostMigrationAnnouncer for virtio-net

cbb1851

To announce the new location in a network, virtio-net devices can send reverse ARP packets. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>

devices: schedule PostMigrationAnnouncer of all devices

bde41fb

After this change switches in a network should find the new location of a VM quicker. On-behalf-of: SAP sebastian.eydam@sap.com Signed-off-by: Sebastian Eydam <sebastian.eydam@cyberus-technology.de>

vmm: latest fixes

88dd698

Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

arch: Fix clippy lints in CPU profile generation code

930fe34

Signed-off-by: Oliver Anderson <oliver.anderson@cyberus-technology.de> On-behalf-of: SAP oliver.anderson@sap.com

vmm: net code simplification

50263b5

Unnecessary option. Change is also on upstream [0]. [0] cloud-hypervisor#7629 Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de> On-behalf-of: SAP philipp.schuster@sap.com

build: ci/nix: build cloud-hypervisor

4168cb9

On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

ci: fix reuse

3781c2f

On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

misc: flake: add rustup

4a26381

On-behalf-of: SAP philipp.schuster@sap.com Signed-off-by: Philipp Schuster <philipp.schuster@cyberus-technology.de>

Revert "vmm: fix panic in http API when connecting GDB"

161cdf1

This reverts commit 3fe978e.

Revert "vmm: api: make HTTP Server (API) multithreaded"

332a4bc

This reverts commit daaea60.

olivereanderson force-pushed the cpu-profiles-change-numeric-serialization branch 2 times, most recently from db08dbf to 876c3c1 Compare February 10, 2026 07:52

olivereanderson force-pushed the cpu-profiles-change-numeric-serialization branch from 876c3c1 to a6cb0eb Compare February 10, 2026 07:58

olivereanderson closed this Feb 10, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

arch: Change numeric serialization for CPU profiles#79

arch: Change numeric serialization for CPU profiles#79
olivereanderson wants to merge 520 commits intocyberus-technology:mainfrom
olivereanderson:cpu-profiles-change-numeric-serialization

olivereanderson commented Feb 9, 2026

Uh oh!

phip1611 commented Feb 10, 2026 •

edited

Loading

Uh oh!

olivereanderson commented Feb 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

Comments

Conversation

olivereanderson commented Feb 9, 2026

Uh oh!

phip1611 commented Feb 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

olivereanderson commented Feb 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

20 participants

phip1611 commented Feb 10, 2026 •

edited

Loading