fix(snapshot): align file cursor between slots in dump_dirty#5696
fix(snapshot): align file cursor between slots in dump_dirty#5696ejc3 wants to merge 1 commit intofirecracker-microvm:mainfrom
Conversation
When taking a diff snapshot of a VM with multiple KVM memory slots (e.g., x86_64 VMs with >3GB RAM where the MMIO gap splits memory into two regions), `GuestMemorySlot::dump_dirty` did not advance the file cursor to the end of each slot after processing its dirty bitmap. If a slot ended with clean (non-dirty) pages, the cursor remained short by the size of those trailing pages. This caused all subsequent slots' data to be written at incorrect file offsets, corrupting the diff snapshot. On restore, the shifted data produced invalid virtio queue state (e.g., avail ring idx reading stale values), triggering `InvalidAvailIdx` panics. The bug only affects VMs with multiple memory slots — single-slot VMs (≤3GB on x86_64) were unaffected because there is no subsequent slot whose cursor would be misaligned. Fix: track total cursor advancement through the dump loop and seek to `slot.len() - cursor_offset` at the end of each slot, ensuring the cursor is always aligned at the exact slot boundary. Signed-off-by: EJ Ciramella <ejc3@meta.com>
|
Hello EJ, Thank you very much for reporting this. This is indeed a bug in Firecracker. I think there is a slightly simpler way to implement the fix. Also, I would rather extend the existing test_dump_dirty() rather than introducing a new one. Normally we'd share the feedback here and ask you to send a new revision, but given the importance of the bug we want to move a bit faster here in order to get a release with the fix out sooner. I've been told that Github allows me to force push to your branch associated with this PR, but personally I find this weird. Hence, I have created a separate PR: #5705 I would appreciate your reviews and feedback on the new PR. Again, thank you for reporting this. Ilias |
|
All good, thank you for doing the work to get this landed. |
|
@ejc3 thanks again for reporting the issue and proposing a fix! The release with the fix is now live: https://github.com/firecracker-microvm/firecracker/releases/tag/v1.14.2 |
Summary
Fix diff snapshot corruption for VMs with multiple KVM memory slots (x86_64 VMs with >3GB RAM).
GuestMemorySlot::dump_dirtyprocesses dirty pages using seek+write operations but did not advance the file cursor to the end of each slot. When a slot ends with clean pages, the cursor is short by those pages' size. All subsequent slots' data is then written at wrong file offsets, corrupting the snapshot.Root Cause
In
src/vmm/src/vstate/memory.rs, thedump_dirtyper-slot function tracksskip_sizefor trailing clean pages but only seeks when the next dirty page is found. If the slot ends with clean pages, no seek occurs — the file cursor remains atslot_start + last_dirty_page_endinstead ofslot_start + slot_size.For single-slot VMs (≤3GB on x86_64, below the MMIO gap), this is harmless — there's no subsequent slot to misalign. For multi-slot VMs (>3GB, where the MMIO gap at ~3GB splits memory into two KVM slots), slot 1's data is shifted backward by the size of slot 0's trailing clean pages.
Impact
InvalidAvailIdxpanics (virtio avail ring reads stale data from wrong offset)Fix
Track total cursor advancement (
cursor_offset) through the dump loop. After processing all bitmap entries, seek toslot.len() - cursor_offsetto align the cursor at the exact slot boundary.Test Plan
Added
test_dump_dirty_multi_slot_cursor_alignmentregression test:Also verified end-to-end with 4GB and 8GB VMs: