Agentic hypervisors for autonomous AI systems.
A high-performance hypervisor framework in Rust with first-class AI agent support. Type-1 bare-metal and Type-2 hosted modes.
| Category | Capabilities |
|---|---|
| Performance | Zero-copy memory, JIT compilation, hardware GPU virtualization |
| Networking | Full TCP/IP stack, TAP/TUN, gRPC/REST APIs, distributed orchestration |
| GPU | Vulkan/WebGPU, passthrough, virtual GPU, CUDA/OpenCL |
| AI-First | MCP server, scriptable API, WASM plugins, LLM tool formats |
| GUI | Desktop app, AI-driven automation, semantic control API |
| Security | FIPS-approved + post-quantum crypto, capability-based access, audit logging |
Measured with Criterion on an AMD Ryzen 9 9900X (reproduce with cargo bench):
| Benchmark | Median | What it measures |
|---|---|---|
| Agent spawn — CoW clone | ~9 ns (O(1)) | fork a sandbox from a warm baseline; constant at 1 / 16 / 64 baseline units (8.9 / 9.3 / 9.1 ns). A full copy is 206 µs → 9.3 ms and grows with size — so 64 agents cost ~one baseline, not 64. |
| CoW first-write fault | 73 ns | the per-page copy paid once, when a clone first dirties a page |
| Guest memory read / write | 27–96 ns / 9–19 ns | 64 B – 4 KiB accesses (zero-copy mapping) |
| MCP tool dispatch | 547 ns | one agent tool call (vm.list) end-to-end |
| MCP dispatch ×64 (concurrent) | 47 µs (~0.7 µs/call) | concurrent agent tool calls over the dispatch path |
| Snapshot — vCPU regs / 10 devices | 18 ns / 21 ns | serialize control registers + device state |
| Tool-schema projection (OpenAI) | 3.3 µs | render the MCP registry to an LLM tool format |
| AES-256-GCM (ring, AES-NI) | ~9–10 GiB/s | crypto throughput (see Performance) |
The defining number is O(1) agent spawn: a copy-on-write clone is ~9 ns regardless of fleet size, so 100 idle agents cost roughly one baseline's memory rather than 100 — the foundation of the agent runtime's fleet density.
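To make the O(1) claim concrete, here is a minimal sketch of copy-on-write forking in plain Rust. It is not HyperMachine's implementation: `Baseline`, `Sandbox`, and the page-overlay layout are hypothetical names chosen for illustration. A fork only bumps a reference count, and a page is copied the first time a clone writes to it.

```rust
use std::collections::HashMap;
use std::sync::Arc;

const PAGE_SIZE: usize = 4096;
type Page = [u8; PAGE_SIZE];

/// Hypothetical warm baseline: immutable pages shared by every fork.
pub struct Baseline {
    pages: Arc<Vec<Page>>,
}

/// Hypothetical sandbox: a shared baseline plus a private overlay of dirtied pages.
pub struct Sandbox {
    base: Arc<Vec<Page>>,
    dirty: HashMap<usize, Box<Page>>,
}

impl Baseline {
    pub fn new(num_pages: usize) -> Self {
        Baseline { pages: Arc::new(vec![[0u8; PAGE_SIZE]; num_pages]) }
    }

    /// O(1) fork: one reference-count increment and an empty overlay,
    /// independent of how many pages the baseline holds.
    pub fn fork(&self) -> Sandbox {
        Sandbox { base: Arc::clone(&self.pages), dirty: HashMap::new() }
    }
}

impl Sandbox {
    /// Reads prefer the private copy and fall back to the shared baseline.
    /// Panics if `addr` is outside the baseline (kept simple for the sketch).
    pub fn read(&self, addr: usize) -> u8 {
        let (page, off) = (addr / PAGE_SIZE, addr % PAGE_SIZE);
        match self.dirty.get(&page) {
            Some(private) => private[off],
            None => self.base[page][off],
        }
    }

    /// The first write to a page copies it once (the "first-write fault");
    /// later writes to the same page reuse the private copy.
    pub fn write(&mut self, addr: usize, value: u8) {
        let (page, off) = (addr / PAGE_SIZE, addr % PAGE_SIZE);
        let base = &self.base;
        let private = self.dirty.entry(page).or_insert_with(|| Box::new(base[page]));
        private[off] = value;
    }

    /// Number of pages this sandbox has privately copied so far.
    pub fn private_pages(&self) -> usize {
        self.dirty.len()
    }
}
```

In this model an idle fork holds no private pages, which is why a fleet of idle clones costs roughly one baseline of memory.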
HyperMachine is the first hypervisor designed from the ground up for AI agent workloads. Every VM is an MCP-addressable resource: agents discover capabilities via ontology endpoints, invoke typed tools (vm.create, vm.exec, gpu.reserve), and receive structured results — no shell scraping or brittle CLI wrappers. Multi-LLM tool schemas ship built-in for OpenAI, Anthropic, and Google formats.
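As a rough illustration of what projecting one tool into several LLM tool formats can look like, the sketch below renders a single tool description into an OpenAI-style and an Anthropic-style JSON object. `ToolSpec`, `wire_name`, and the dot-to-underscore mapping are assumptions made for this example, not HyperMachine's registry types, and the provider field names reflect the public formats as commonly documented, so check them against current provider documentation.

```rust
/// Hypothetical registry entry. The real registry type is not shown in this README.
pub struct ToolSpec {
    pub name: &'static str,
    pub description: &'static str,
    /// JSON Schema for the tool arguments, already serialized.
    pub parameters_json: &'static str,
}

/// Assumption: providers restrict tool names to letters, digits, '_' and '-',
/// so a dotted name such as "vm.create" is sent as "vm_create".
fn wire_name(name: &str) -> String {
    name.replace('.', "_")
}

/// OpenAI-style tool entry: {"type":"function","function":{...}}.
/// The description is inserted verbatim, so this sketch assumes it contains
/// no characters that need JSON escaping.
pub fn to_openai(tool: &ToolSpec) -> String {
    format!(
        r#"{{"type":"function","function":{{"name":"{}","description":"{}","parameters":{}}}}}"#,
        wire_name(tool.name),
        tool.description,
        tool.parameters_json
    )
}

/// Anthropic-style tool entry: the schema lives under "input_schema".
pub fn to_anthropic(tool: &ToolSpec) -> String {
    format!(
        r#"{{"name":"{}","description":"{}","input_schema":{}}}"#,
        wire_name(tool.name),
        tool.description,
        tool.parameters_json
    )
}
```

Keeping one source description and deriving each provider format from it means a schema change happens in one place.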
A built-in agent runtime turns this into a fleet service: agents spawn in O(1) as copy-on-write clones of a warm baseline (100 idle agents cost ~one baseline, not 100), run tool-calling loops over a fast MCP dispatch path, and have their sessions and memory reclaimed automatically. It is exposed over a tenant-scoped, optionally-authenticated REST API (/api/v1/agents).
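The session lifecycle described above (tenant ownership plus automatic reclamation of idle sessions) can be sketched as follows. `Fleet`, `Session`, and the method names are hypothetical and only illustrate the idea; the time source is passed in explicitly so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical session record: which tenant owns it and when it was last used.
pub struct Session {
    pub tenant: String,
    pub last_used: Instant,
}

/// Hypothetical in-memory fleet keyed by session id.
#[derive(Default)]
pub struct Fleet {
    sessions: HashMap<String, Session>,
}

impl Fleet {
    /// Register a session owned by `tenant`.
    pub fn spawn(&mut self, session_id: &str, tenant: &str, now: Instant) {
        self.sessions.insert(
            session_id.to_string(),
            Session { tenant: tenant.to_string(), last_used: now },
        );
    }

    /// Tenant-scoped listing: a tenant only sees sessions it spawned.
    /// Ids are returned sorted so the output is deterministic.
    pub fn list(&self, tenant: &str) -> Vec<&str> {
        let mut ids: Vec<&str> = self
            .sessions
            .iter()
            .filter(|(_, s)| s.tenant == tenant)
            .map(|(id, _)| id.as_str())
            .collect();
        ids.sort();
        ids
    }

    /// Drop every session idle for longer than `max_idle`; returns how many were reaped.
    pub fn reap(&mut self, now: Instant, max_idle: Duration) -> usize {
        let before = self.sessions.len();
        self.sessions
            .retain(|_, s| now.saturating_duration_since(s.last_used) <= max_idle);
        before - self.sessions.len()
    }
}
```

A periodic task calling `reap` with a fixed idle budget is one simple way to get the automatic reclamation the paragraph describes.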
A single codebase runs as both a Type-2 hosted hypervisor (KVM, WHPX, HVF) and a Type-1 bare-metal hypervisor (Intel VMX, AMD SVM) with no code duplication. The same VM definitions, device models, and API surface work in both modes — develop on your laptop, deploy bare-metal in production.
HyperMachine models GPU interconnect topology (NVLink, NVSwitch, PCIe) and makes placement decisions based on real bandwidth and latency. Capacity reservations with SLA tiers (platinum/gold/silver/bronze) prevent noisy-neighbor GPU contention. Fleet-wide GPU health monitoring tracks utilization, temperature, and ECC errors across hosts.
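One way SLA tiers can bound contention is to admit reservations in tier order against a fixed pool. The sketch below assumes exactly that policy; `SlaTier`, `Request`, and `admit` are illustrative names, and the real CapacityManager may use a different algorithm.

```rust
/// Tiers from the paragraph above. Declaration order gives Bronze < Silver < Gold < Platinum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SlaTier {
    Bronze,
    Silver,
    Gold,
    Platinum,
}

/// Hypothetical reservation request.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Request {
    pub tenant: String,
    pub gpus: u32,
    pub tier: SlaTier,
}

/// Assumed policy: serve the highest tier first and grant a request only if
/// it fits entirely in the remaining pool. The sort is stable, so arrival
/// order is preserved within a tier.
pub fn admit(mut requests: Vec<Request>, free_gpus: u32) -> Vec<Request> {
    requests.sort_by(|a, b| b.tier.cmp(&a.tier));
    let mut remaining = free_gpus;
    let mut granted = Vec::new();
    for request in requests {
        if request.gpus <= remaining {
            remaining -= request.gpus;
            granted.push(request);
        }
    }
    granted
}
```

With 8 free GPUs and requests for 2 (bronze), 6 (platinum), and 4 (gold), this policy grants the platinum request first, skips gold because only 2 GPUs remain, and then fits the bronze request.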
Alongside classical FIPS-approved algorithms (AES-GCM, RSA, ECDSA), HyperMachine ships ML-KEM (Kyber) for key encapsulation, ML-DSA (Dilithium) for digital signatures, and SLH-DSA (SPHINCS+) for hash-based signatures — all NIST-standardized, quantum-resistant, and backed by the audited pure-Rust RustCrypto implementations (not placeholders).
~238,000 lines of Rust across 13 crates with zero todo!(), unimplemented!(), or placeholder stubs. The full stack — from bare-metal boot sequence to REST middleware to GPU scheduling — is implemented in safe Rust. 4,600+ tests and cargo clippy -D warnings clean. Security advisories in the active dependency graph are triaged in deny.toml (one accepted with justification: the rsa crate's Marvin-attack timing advisory, for which no fixed release exists).
AI agents control the desktop GUI through a typed command API (gui.navigate, gui.dialog.open, gui.form.set_field) rather than pixel-based screen scraping. This is deterministic, resolution-independent, and orders of magnitude faster than vision-based approaches like screen capture automation.
The REST API ships with 28 composable middleware layers out of the box: rate limiting, circuit breakers, request replay protection, tenant isolation, geo-IP enrichment, W3C trace propagation, HMAC payload signing, schema validation, slow-request detection, maintenance mode, and more — all configurable, all tested.
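"Composable" here can be read as each layer wrapping the next handler. The following is a minimal sketch of that pattern using only the standard library; `Request`, `Response`, `Handler`, and both layers are hypothetical, and the status codes (400 for a missing tenant, 503 during maintenance) are assumptions rather than documented behavior.

```rust
/// Hypothetical, heavily simplified request and response types.
pub struct Request {
    pub tenant: Option<String>,
    pub path: String,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

/// A handler is any function from request to response.
pub type Handler = Box<dyn Fn(&Request) -> Response>;

/// Tenant-isolation layer: reject requests that carry no tenant id.
pub fn require_tenant(next: Handler) -> Handler {
    Box::new(move |req: &Request| {
        if req.tenant.is_some() {
            next(req)
        } else {
            Response { status: 400, body: "missing X-Tenant-Id".to_string() }
        }
    })
}

/// Maintenance-mode layer: short-circuit every request while enabled.
pub fn maintenance_mode(enabled: bool, next: Handler) -> Handler {
    Box::new(move |req: &Request| {
        if enabled {
            Response { status: 503, body: "maintenance".to_string() }
        } else {
            next(req)
        }
    })
}

/// Innermost handler used for the example: echoes the request path.
pub fn echo_path() -> Handler {
    Box::new(|req: &Request| Response { status: 200, body: req.path.clone() })
}
```

Layers compose by nesting, for example `maintenance_mode(false, require_tenant(echo_path()))`, so the outermost layer sees the request first and can stop it before inner layers run.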
+----------------------------------------------------------+
| AI Agent Interface |
| (MCP Server - OpenAI/Claude/Gemini) |
+----------------------------------------------------------+
| Remote API Layer |
| (gRPC - REST - WebSocket) |
+-------------+-------------+-------------+----------------+
| CPU | GPU | Network | Memory |
| Emulation | Vulkan | TAP/TUN | Management |
+-------------+-------------+-------------+----------------+
| HyperMachine Core |
| HV2 (KVM/WHPX/HVF) - HV1 (VMX/SVM bare-metal) |
+----------------------------------------------------------+
- Rust 1.87+ (stable) for Type-2 crates; nightly for Type-1 (hv1-core, hv1-boot)
- Hypervisor backend (Type-2 mode): KVM (Linux), WHPX (Windows), or HVF (macOS)
- protoc (Protocol Buffers compiler) for building gRPC components
# Build (excludes nightly-only Type-1 crates)
git clone https://github.com/nervosys/HyperMachine && cd HyperMachine
cargo build --release --workspace --exclude hv1-core --exclude hv1-boot
# Create and run a VM (Type-2 hosted mode)
hm t2 create --name myvm --cpu 4 --memory 8G --gpu
hm t2 start myvm
# Start MCP server for AI agents
hm mcp serve --api-key "your-key"

HyperMachine exposes a Model Context Protocol (MCP) server for AI agents:
# Discover available tools
curl http://localhost:8080/mcp/tools
# LLM-specific tool formats
curl http://localhost:8080/agentic/tools/openai # GPT-4o, o1, o3
curl http://localhost:8080/agentic/tools/anthropic # Claude 4, Sonnet
curl http://localhost:8080/agentic/tools/gemini # Gemini 2.5
# Execute operations
curl -X POST http://localhost:8080/mcp/call \
-H "Authorization: Bearer your-key" \
-d '{"tool": "vm.create", "arguments": {"name": "ai-sandbox", "cpu_cores": 4}}'

Spawn and operate a fleet of copy-on-write agents over HTTP. Each agent gets an MCP session plus an O(1) sandbox cloned from a warm baseline. Requests are tenant-scoped via X-Tenant-Id (an agent is owned by the tenant that spawned it), with an optional Authorization: Bearer <token>:
# Spawn a tenant-scoped agent (O(1) CoW sandbox from the warm baseline)
curl -X POST http://localhost:8080/api/v1/agents \
-H "X-Tenant-Id: acme" -H "Content-Type: application/json" \
-d '{"agent_id": "researcher", "capabilities": "operator"}'
curl http://localhost:8080/api/v1/agents -H "X-Tenant-Id: acme" # list
curl http://localhost:8080/api/v1/agents/fleet -H "X-Tenant-Id: acme" # fleet memory
curl -X POST http://localhost:8080/api/v1/agents/reap -d '{"max_idle_secs": 600}'
curl -X DELETE http://localhost:8080/api/v1/agents/<session_id> -H "X-Tenant-Id: acme"

Python SDK (planned; not yet shipped in this repository):
from hypermachine import HyperMachine
hm = HyperMachine("http://localhost:8080", api_key="your-key")
vm = hm.create_vm("sandbox", cpu=4, memory="8G", gpu=True)
vm.start()
vm.exec("echo 'Hello from AI agent'")

These compile and run today (cargo run -p <crate> --example <name>):
| Example | Crate | Shows |
|---|---|---|
| agent_mcp_workflow | hv2-agent | An agent driving a full VM lifecycle over the MCP tool surface (provision → boot → guest.exec → snapshot → resize → restore → teardown) with an audit log |
| llm_tool_schemas | hv2-agent | The MCP tool registry projected into OpenAI / Anthropic / Gemini tool-use formats |
| agent_vm_workflow | hm-cli | Tool discovery + VM lifecycle through the typed ToolExecutor and agentic ontology |
| multi_agent_orchestration | hv2-agent | Multiple role-scoped agents coordinating: exclusive VM claims, role enforcement, inter-agent messaging |
| agent_runtime | hv2-agent | End-to-end agent runtime: 100 agents spawned from one warm baseline (O(1), ~100× memory density), kept isolated, calling tools, then reclaimed |
| gpu_fabric_reservation | hv2-runtime | Publishing a GPU VM class and reserving capacity with SLA tiers via CapacityManager |
| agent_script / integrated | hv2-agent | Rhai-scripted agent decision-making and agent↔device (serial/MMIO) interaction |
cargo run -p hv2-agent --example agent_mcp_workflow

HyperMachine includes a desktop GUI with a semantic automation API for AI agents:
use hm_gui::{AutomationHandle, GuiCommand, DialogType, FormType};
// Create automation handle
let (handle, receiver) = AutomationHandle::new();
// AI agent controls the GUI
handle.open_dialog(DialogType::CreateVm)?;
handle.set_field(FormType::CreateVm, "name", "ai-sandbox")?;
handle.set_field(FormType::CreateVm, "cpus", 4)?;
handle.set_field(FormType::CreateVm, "memory_mb", 8192)?;
handle.execute(GuiCommand::SubmitDialog(DialogType::CreateVm))?;

Available GUI Tools (13 total):
| Tool | Description |
|---|---|
| gui.navigate | Navigate views (welcome, vm_details, console, settings) |
| gui.dialog.open/close/submit | Manage dialogs (create_vm, settings, about) |
| gui.vm.select | Select VM by id, name, or partial match |
| gui.vm.action | VM operations (start, stop, pause, delete, console) |
| gui.form.set_field | Set form values programmatically |
| gui.get_state | Query current GUI state |
LLM JSON Commands:
{"type":"OpenDialog","params":"create_vm"}
{"type":"SetFormField","params":{"form":"create_vm","field":"name","value":"my-vm"}}
{"type":"SubmitDialog","params":"create_vm"}

This semantic approach is superior to screen-based automation (like Anthropic Computer Use) because it is deterministic, fast, and layout-independent.
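For a client that wants to emit the LLM JSON commands listed above, a small hand-rolled serializer is enough. This is a sketch under stated assumptions: the `Command` enum below is a hypothetical stand-in, not the `GuiCommand` type from `hm_gui`, and it only handles string field values (the Rust example earlier also passes integers).

```rust
/// Hypothetical mirror of three of the JSON commands shown above.
pub enum Command {
    OpenDialog(String),
    SetFormField { form: String, field: String, value: String },
    SubmitDialog(String),
}

/// Minimal JSON string escaping: quotes, backslashes, and control characters.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl Command {
    /// Render the command in the {"type": ..., "params": ...} shape used above.
    pub fn to_json(&self) -> String {
        match self {
            Command::OpenDialog(dialog) => {
                format!(r#"{{"type":"OpenDialog","params":"{}"}}"#, escape(dialog))
            }
            Command::SetFormField { form, field, value } => format!(
                r#"{{"type":"SetFormField","params":{{"form":"{}","field":"{}","value":"{}"}}}}"#,
                escape(form),
                escape(field),
                escape(value)
            ),
            Command::SubmitDialog(dialog) => {
                format!(r#"{{"type":"SubmitDialog","params":"{}"}}"#, escape(dialog))
            }
        }
    }
}
```

Because every command is a typed value rather than a pixel coordinate, the same sequence produces the same JSON regardless of window size or theme.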
Implementations of FIPS-approved classical algorithms plus the NIST
post-quantum schemes. Classical AES-GCM/SHA come from the ring
backend, RSA from the pure-Rust rsa crate,
and the post-quantum schemes from RustCrypto
(ml-kem, ml-dsa, slh-dsa). These are validated algorithm implementations,
not a FIPS 140-3 validated module.
| Type | Algorithms |
|---|---|
| Symmetric | AES-256-GCM, SHA-256/384/512, HMAC, HKDF |
| Asymmetric | RSA-2048/3072/4096, ECDSA P-256/P-384 (see footnote 1) |
| Post-Quantum | ML-KEM (Kyber), ML-DSA (Dilithium), SLH-DSA (SPHINCS+) |
HyperMachine provides a GPU Fabric REST API for topology-aware GPU placement and fleet management:
# Query GPU topology
curl http://localhost:8080/api/v1/gpu-fabric/topology
# List fleet hosts
curl http://localhost:8080/api/v1/gpu-fabric/fleet
# Check capacity for GPU workloads
curl -X POST http://localhost:8080/api/v1/gpu-fabric/capacity/check \
-H "Content-Type: application/json" \
-d '{"gpu_count": 4, "min_vram_mb": 40960, "interconnect": "NvLink"}'
# Reserve GPU capacity
curl -X POST http://localhost:8080/api/v1/gpu-fabric/capacity/reserve \
-H "Authorization: Bearer your-key" \
-d '{"gpu_count": 8, "sla_tier": "Premium", "duration_secs": 3600}'

# Kubernetes
helm install hypermachine ./deploy/helm/hypermachine \
--set environment=production \
--set replicaCount=3
# Terraform (AWS EKS)
cd deploy/terraform && terraform apply -var="environment=production"

Crypto throughput from crypto_bench on an AMD Ryzen 9 9900X (64 KiB blocks).
AES-GCM and SHA run on the validated ring backend (AES-NI hardware
acceleration), which is enabled by default:
| Operation | Throughput |
|---|---|
| AES-256-GCM encrypt | ~9.0 GiB/s |
| AES-256-GCM decrypt | ~10.1 GiB/s |
| SHA-256 | ~2.5 GiB/s |
| SHA-512 | ~0.76 GiB/s |
| HMAC-SHA256 | ~2.3 GiB/s |
cargo bench -p hv2-core --bench crypto_bench

Numbers are hardware- and backend-dependent. AES-GCM is AES-NI accelerated; SHA throughput reflects ring's software implementation on this CPU. Run the command above to reproduce on your own hardware.
crates/
hm-cli # CLI + MCP server
hm-gui # Desktop GUI with AI automation API
hv1-arm # ARM64 EL2 hypervisor backend (127 tests)
hv1-boot # Type-1 bare-metal bootloader (nightly)
hv1-core # Type-1 bare-metal hypervisor core (nightly)
hv2-core # Core engine (CPU, memory, devices, crypto)
hv2-cpu # CPU virtualization and instruction decoding
hv2-gpu # GPU virtualization (Vulkan/WebGPU, passthrough)
hv2-net # Networking (TCP/IP stack, TAP/TUN, virtio-net)
hv2-agent # AI agent interface (MCP, WASM plugins)
hv2-api # REST/gRPC API server + GPU Fabric endpoints
hv2-cli # Standalone hypervisor CLI
hv2-runtime # Fleet runtime, scheduler, GPU observability
deploy/
k8s/ # Kubernetes manifests
helm/ # Helm charts
terraform/ # Infrastructure as code

- Getting Started: Installation, prerequisites, first VM
- API Quickstart: REST/gRPC API reference and examples
- Architecture: System design and internals
- Deployment Guide: Production deployment on Kubernetes/Terraform
- GPU Virtualization: Vulkan/WebGPU passthrough and virtual GPU
- Guest Programming Guide: Writing code for guest VMs
| Problem | Solution |
|---|---|
| KVM not available | Enable VT-x/AMD-V in BIOS; modprobe kvm_intel or kvm_amd |
| WHPX not found (Windows) | Enable "Windows Hypervisor Platform" in Windows Features |
| protoc not found | Install protobuf compiler: apt install protobuf-compiler or brew install protobuf |
| Build fails on nightly crates | Type-1 crates require nightly; use --exclude hv1-core --exclude hv1-boot |
| Permission denied on /dev/kvm | Add user to kvm group: sudo usermod -aG kvm $USER |
See CONTRIBUTING.md for development setup, coding standards, and PR process.
This project is dual-licensed:
- AGPL-3.0 (GNU Affero General Public License v3) — free and open source with strong copyleft. See LICENSE.
- Commercial License — available for use without AGPL obligations. See LICENSE-COMMERCIAL.
For commercial licensing inquiries, contact licensing@nervosys.ai.
Footnotes
1. ECDSA P-521 keys can be represented, but key generation and signing require a backend other than ring, which does not support that curve.