Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #5786 +/- ##
==========================================
- Coverage 83.00% 82.93% -0.07%
==========================================
Files 275 275
Lines 29383 29497 +114
==========================================
+ Hits 24388 24462 +74
- Misses 4995 5035 +40
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
0fcdd09 to
cdcf610
Compare
ApiServerAdapter has a handle_request() method for dispatching requests received from the API thread to the right handler. This method is called from inside EventManager::run() which takes &mut self as an argument. This can be problematic if we want to modify the EventManager object from inside handle_request(). Work around that by having ApiServerAdapter::process() store the request but not call handle_request(). EventManager::run() returns after handling each event. Call handle_request() after EventManager::run() in the event loop. In a subsequent patch, handle_request() will take a &mut EventManager as an argument. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
The handle_request() function will soon need access to the EventManager in order to add and remove new objects to support device hot-plugging. Pass a mutable reference of EventManager to handle_request() in preparation of that. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Pmem::alloc_region() panics if it fails to allocate the address range it needs. That is especially problematic for hot-plugging support, since it means failing to attach a device would kill the entire VM. Remove the unwrap and introduce a new error code that is propagated to the caller in case of failure. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
The EntropyDevice and PmemDevice error names are inconsistent with the error names used for all other devices all of which use the Config suffix. Rename them to EntropyConfig and PmemConfig for consistency. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Firecracker has been rejecting device attach API requests after the VM was started until now. Add hot-plugging for Block, Pmem and Net PCIe devices. This enables the relevant API calls and attaches the device to the PCIe "bus". No notification is delivered to the guest at the moment to notify it that a new device has been added. The guest has to manually rescan the bus in order to detect new devices. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
The device hotplug path creates devices after boot, when seccomp filters are already active. This requires allowing syscalls that were previously only called during boot before seccomp was installed. Add the following to the vmm thread filter: - timerfd_create: RateLimiter creates a TimerFd for each new device - ioctl(KVM_IOEVENTFD): registering ioeventfds for virtqueue notification - ioctl(TUNSETIFF): opening tap device for net hotplug - ioctl(TUNSETOFFLOAD): configuring tap offload for net hotplug - ioctl(TUNSETVNETHDRSZ): setting vnet header size for net hotplug - mmap(MAP_SHARED|MAP_NORESERVE): pmem backing file mapping - mmap(MAP_PRIVATE|MAP_NORESERVE|MAP_ANONYMOUS): pmem aligned region - mmap(MAP_SHARED|MAP_NORESERVE|MAP_FIXED): pmem file overlay mapping - mmap(MAP_SHARED|MAP_FIXED): IovDeque ring buffer for net device - memfd_create: IovDeque ring buffer for net device - fcntl(F_ADD_SEALS): IovDeque memfd sealing for net device Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add unit tests for testing the possible failure modes of hotplugging different classes of devices. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Add integration tests for block, pmem and net hotplugging. The tests require a manual PCI bus rescan at the moment since no hotplug notification mechanism is implemented at the moment. Signed-off-by: Ilias Stamatis <ilstam@amazon.com>
Manciukic
left a comment
There was a problem hiding this comment.
mostly LGTM, just a few things here and there. I'm happy to take some of the items (like extending test coverage) to a separate PR. We should also update documentation, but happy to take that on another PR.
|
|
||
|
|
||
| def test_api_put_update_post_boot(uvm_plain, io_engine): | ||
| def test_api_put_update_post_boot(uvm_plain): |
There was a problem hiding this comment.
why are we removing the Async engine? Maybe we should fix the test to actually use io_engine
There was a problem hiding this comment.
That argument was only used for the /drive PUT API request and is therefore no longer needed. I don't think further modifications to this test belong to this PR.
| Ok(()) | ||
| } | ||
|
|
||
| fn hotplug_make_block( |
There was a problem hiding this comment.
these functions are not specific to the transport, they just "happen to" be only supported by PCI. Is there a way we could decouple them from the PCI transport, and maybe deduplicate the code we have in builder.rs?
There was a problem hiding this comment.
Removing builder.rs and VmResources is very painful and a significant refactoring as previously discussed. As to where to put these hotlpug functions what is your suggestion? I had them in vmm/libs.rs previously as methods of the Vmm object.
| vm.basic_config() | ||
| vm.add_net_iface() | ||
| vm.start() | ||
|
|
There was a problem hiding this comment.
can't we use one of the helpers like uvm_any or maybe add uvm_any_pci if not available?
Ideally we'd also check before snapshot and after snapshot to verify everything works irrespective of it being snapshot/restored.
Thinking out loud, maybe it's also worth verifying devices are persisted in the snapshot/restore?
|
|
||
| let mut pmem = Pmem::new(config).map_err(PmemConfigError::from)?; | ||
| pmem.alloc_region(vm.as_ref()).unwrap(); | ||
| pmem.set_mem_region(vm.as_ref()) |
There was a problem hiding this comment.
how are failures handled? is Pmem (and the other devices) correctly cleaning up the resources they hold? This is likely going to be more of an issue for hotunplugging: iovdeque memfd, mmap-ed memory areas, etc
There was a problem hiding this comment.
I was hopping everything uses RAII, but good point, I'll double check it.
| VIRTIO_PCI_DEVICE_ID_PMEM = 0x105B | ||
|
|
||
|
|
||
| def test_hotplug_block(microvm_factory, guest_kernel_acpi, rootfs): |
There was a problem hiding this comment.
maybe we should also check the rate limiter works as expected after hotplugging as that requires registration with eventfd.
| assert lspci_final == lspci_after | ||
|
|
||
|
|
||
| def test_hotplug_no_pci(microvm_factory, guest_kernel_acpi, rootfs): |
There was a problem hiding this comment.
we could maybe add some other tests to stress that other handles are handled correctly:
- what happens if we hotplug too many devices? we should hit a limit at 32?
- what happens if device hotplug fails for device specific region? Ie non-existent tap or block file
There was a problem hiding this comment.
Good point, I will add these
Add PCI hotplug support. No hotplug notification mechanism is implemented yet, so the the guest needs to rescan the PCI "bus" manually in order to see new attachments.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.PR Checklist
tools/devtool checkbuild --allto verify that the PR passesbuild checks on all supported architectures.
tools/devtool checkstyleto verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md.Runbook for Firecracker API changes.
integration tests.
TODO.rust-vmm.