feat(config): nano-replica memory profile — run the replica in 512 MiB–1 GiB VMs by Dfinity-Bjoern · Pull Request #10491 · dfinity/ic

Dfinity-Bjoern · 2026-06-16T15:22:13Z

Status: DRAFT / experimental — shared for discussion, not for merge.
Branched off bjoern/dockernet (the local 4-node dev/local-net); this PR is just the nano-profile config + load-driver work on top of it.

What this is

An experiment to run the ICP replica in 512 MiB–1 GiB VMs (mainnet uses ~512 GiB), accepting a much smaller subnet: 2 execution threads, 1 query thread, a few-hundred-MB subnet state, best-effort-leaning messaging. The premise is that the 512 GiB figure is mostly worst-case capacity bounds and reservations, not steady-state resident memory — so the work is shrinking every pool/limit by ~100–1000× and fixing the few places where a single execution could OOM a tiny node.

All changes are gated to the nano profile's constants in rs/config; nothing here is meant for mainnet defaults as-is.

Config changes (`rs/config`)

execution_environment.rs: subnet memory capacity 2 TiB→512 MiB; exec threads 4→2, query threads 4→1; heap-delta capacity 140 GiB→96 MiB; guaranteed-response msg mem 15 GiB→64 MiB, best-effort 5 GiB→32 MiB; ingress history/custom sections/caches shrunk; callback soft-limit 1M→4096; memory reservation 2560→8 MiB/thread; SUBNET_MEMORY_THRESHOLD = capacity (disables storage cycle-reservation so canisters can use the full cap).
embedders.rs (the OOM-cliff fix): per-message stable dirty/accessed page limits 1–8 GiB → 32/128 MiB; sandbox count 10000→32, idle 30m→2m; rayon threads 10/8→2/2.
subnet_config.rs: MAX_HEAP_DELTA_PER_ITERATION 200 MB→64 MB (so a single round can't overshoot the 96 MiB heap-delta cap, which bounds the unreclaimable resident spike under writes); heap-delta initial reserve 32 GiB→32 MiB; max paused (DTS) execs 4→1; per-canister heap-delta rate-limit 75→32 MiB.
canister_sandbox/.../sandboxed_execution_controller.rs: decoupled max sandbox RSS from heap-delta via a 128 MiB floor (so shrinking heap delta doesn't starve sandboxes); eviction batch 1 GiB→64 MiB.
message_routing.rs: XNet stream size 10→2 MiB, max stream messages 10000→1000.
dev/local-net/prep.sh: bakes the nano hypervisor overrides into generated configs and shortens the DKG/checkpoint interval to ~50 rounds.

Two correctness fixes were found by actually running it: the DTS scheduler requires ≥2 cores ((cores-1)*100% capacity → 1 core trips an invariant), and MAX_HEAP_DELTA_PER_ITERATION must stay ≤ the heap-delta cap.

Load driver (`rs/canister_client/examples/hammer.rs`)

A self-contained stress tool driven over the public endpoint via the in-repo ic-canister-client (no dfx). Deploys universal canisters and hammers them. Modes: read (stable reads), heap/heapread (heap memory), calls (inter-canister chains), fanout (parallel calls), hybrid (reads+writes+messaging at once), plus a per-message dirty/accessed-limit probe. Run e.g.:

UNIVERSAL_CANISTER_WASM_PATH=/path/to/universal_canister.wasm \
  HAMMER_MODE=hybrid HAMMER_CANISTERS=4 cargo run -p ic-canister-client --example hammer -- http://localhost:8080

Key findings (from `dev/local-net`, container RAM hard-capped)

Idle/light load fits 512 MiB (~172–215 MiB anon). Heavy mixed load wants 1 GiB. Under a heavy hybrid storm, 512 MiB had 2 sandbox-OOM restarts (recovered, consensus never stopped); 1 GiB ran clean (0 restarts).
Non-reclaimable (anon) floor ≈ 340 MiB under load (replica + sandboxes + bounded heap delta); the rest of resident memory is reclaimable page cache of the checkpoint (read working set). The MAX_HEAP_DELTA_PER_ITERATION fix keeps anon bounded (no 200 MB overshoot).
Per-message 32 MiB stable limit verified: a single message touching >32 MiB of stable memory traps cleanly (canister-level), not an OOM-kill — the protection that makes a tiny node safe. Heap memory has no such per-execution cap and is ~2.5× more expensive to store, so stable is the right place for large state.
Inter-canister: ~200 msgs/s, latency = ~1 consensus round/hop; the reduced 64 MiB guaranteed-response cap is never hit under realistic patterns (execution-rate + ingress backpressure keep outstanding calls low) and is enforced gracefully when pushed.
Everywhere it degrades gracefully — throttle, backpressure, page-cache eviction, recover — rather than failing hard.

Not done / caveats

Structural items from the plan are not included: rejecting guaranteed-response calls at the system-API boundary, consensus/p2p pool sizing for the 4-node target, HTTP-endpoint concurrency, disabling BTC/HTTP-outcalls adapters.
Dependent-crate value-assertion tests (execution_environment, messaging, scheduler) will need expected-constant updates; the full bazel test sweep has not been run (CI-scale). Compile + ic-config unit tests + targeted bazel builds pass.

🤖 Generated with Claude Code

Scale down the replica's memory capacities, reservations and limits so it can run on a 512 MiB–1 GiB VM (down from the 512 GiB mainnet footprint), accepting a substantially reduced subnet capacity.

execution_environment.rs:
- subnet memory capacity 2 TiB -> 512 MiB, threshold -> 384 MiB
- guaranteed-response msg mem 15 GiB -> 64 MiB, best-effort 5 GiB -> 32 MiB
- ingress history 4 GiB -> 32 MiB, wasm custom sections 2 GiB -> 16 MiB
- execution threads 4 -> 1, query threads 4 -> 1
- subnet memory reservation 2560 -> 64 MiB per thread
- callback soft limit 1,000,000 -> 4,096
- subnet heap delta capacity 140 GiB -> 96 MiB
- query cache 200 -> 16 MiB, compilation cache 10 GiB -> 64 MiB

embedders.rs (OOM-cliff fix: bound a single execution's resident set):
- stable dirty/accessed page limits 1-8 GiB -> 32/128 MiB
- max dirty pages without optimization 1 GiB -> 32 MiB
- sandbox count 10,000 -> 32, idle time 30m -> 2m
- rayon compilation/page-allocator threads 10/8 -> 2/2
- query threads per canister 2 -> 1

subnet_config.rs:
- heap delta initial reserve 32 GiB -> 32 MiB (must be <= capacity)
- max paused (DTS) executions 4 -> 1
- per-canister heap delta rate limit 75 -> 32 MiB

sandboxed_execution_controller.rs:
- decouple max sandbox RSS from heap delta via a 128 MiB floor (MIN_SANDBOXES_RSS), so a tiny heap delta no longer starves sandboxes
- eviction batch 1 GiB -> 64 MiB

message_routing.rs:
- XNet stream target size 10 -> 2 MiB, max stream messages 10,000 -> 1,000

Verified: rustfmt, clippy (clean), cargo test -p ic-config (19 passed), bazel build //rs/config:config //rs/canister_sandbox:backend_lib.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
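The sandbox-RSS decoupling in this commit message can be pictured with a minimal sketch. Only `MIN_SANDBOXES_RSS` and its 128 MiB value come from the message; the helper name `max_sandboxes_rss`, and the assumption that the budget otherwise follows the heap-delta capacity one-to-one, are hypothetical and not taken from the actual controller code.

```rust
const MIB: u64 = 1024 * 1024;

/// Floor named in the commit message (128 MiB).
const MIN_SANDBOXES_RSS: u64 = 128 * MIB;

/// Hypothetical helper: the sandbox RSS budget tracks the heap-delta
/// capacity but never drops below the floor.
fn max_sandboxes_rss(heap_delta_capacity: u64) -> u64 {
    heap_delta_capacity.max(MIN_SANDBOXES_RSS)
}

fn main() {
    // Nano profile: 96 MiB heap delta would starve sandboxes without the floor.
    println!("nano budget: {} MiB", max_sandboxes_rss(96 * MIB) / MIB);
    // A large heap-delta capacity is unaffected by the floor.
    println!("large budget: {} MiB", max_sandboxes_rss(140 * 1024 * MIB) / MIB);
}
```

With a floor of this shape, shrinking the heap-delta capacity below 128 MiB no longer shrinks the sandbox budget with it.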

The DTS scheduler computes allocatable compute capacity as `(scheduler_cores - 1) * 100%` (round_schedule::compute_capacity_percent). With NUMBER_OF_EXECUTION_THREADS = 1 this is 0%, so the invariant `total_compute_allocation + 1% <= compute_capacity` fails on every round and the replica panics in the MR Batch Processor on restart. Bump to 2 (the scheduler floor). Memory cost is negligible: the extra execution thread's Wasm address space is virtual, resident usage stays bounded by the per-message dirty-page limits and the shared sandbox-RSS budget, and SUBNET_MEMORY_RESERVATION is 64 MiB x 2 = 128 MiB (< the 512 MiB subnet cap). Found by running a 4-node local-net subnet with the nano profile. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
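The arithmetic behind this panic is small enough to sketch. The formula and the invariant are the ones quoted in the commit message; the free functions below are illustrative stand-ins written for this example, not the real `round_schedule` code.

```rust
/// Allocatable compute capacity in percent: (scheduler_cores - 1) * 100.
fn compute_capacity_percent(scheduler_cores: u64) -> u64 {
    scheduler_cores.saturating_sub(1) * 100
}

/// Invariant from the commit message:
/// total_compute_allocation + 1% <= compute_capacity.
fn invariant_holds(total_compute_allocation: u64, scheduler_cores: u64) -> bool {
    total_compute_allocation + 1 <= compute_capacity_percent(scheduler_cores)
}

fn main() {
    for cores in 1..=2u64 {
        println!(
            "cores={cores} capacity={}% invariant(0% allocated)={}",
            compute_capacity_percent(cores),
            invariant_holds(0, cores)
        );
    }
}
```

Even with no compute allocation at all, one core gives 0% capacity and the `+ 1%` term cannot fit, which is why 2 is the floor.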

Standalone stress driver for a local subnet, driven over the public endpoint with the in-repo ic-canister-client Agent (no dfx needed): deploys N universal canisters via provisional_create_canister_with_cycles, then runs throughput / compute / dirty-page / memory-growth phases and reports throughput, latency and error classes.

Run:

UNIVERSAL_CANISTER_WASM_PATH=/path/to/universal_canister.wasm \
  cargo run -p ic-canister-client --example hammer -- http://localhost:8080

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Disable the storage cycle-reservation mechanism on the nano profile so canisters can freely allocate up to the subnet memory capacity:
- SUBNET_MEMORY_THRESHOLD = SUBNET_MEMORY_CAPACITY (512 MiB). When the threshold is >= capacity the subnet is never "high usage", so growth never triggers cycle reservations (whose mainnet-calibrated pricing otherwise rejects growth on a tiny subnet, hitting the reserved-cycles limit).
- SUBNET_MEMORY_RESERVATION = 8 MiB/thread (was 64), so the response-callback reservation no longer caps usable storage well below capacity.

Also bake the matching hypervisor override into dev/local-net/prep.sh so the local 4-node net inherits it across resets.

Verified on the local-net: with reservation disabled, a single message writing 24 MiB of stable memory succeeds while 48 MiB traps with "Exceeded the limit for the number of accessed pages ... limit 32768 KB" (the nano 32 MiB per-message stable limit), and the subnet keeps finalizing with no replica panic, i.e. the per-message limit, not an OOM-kill, bounds a single execution's working set.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
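Why a threshold equal to the capacity switches the mechanism off can be shown with a tiny predicate. This is a sketch under assumptions: the name `is_high_usage` and the strict `>` comparison are hypothetical, chosen to be consistent with the statement that a threshold >= capacity means the subnet is never "high usage".

```rust
const MIB: u64 = 1024 * 1024;
const SUBNET_MEMORY_CAPACITY: u64 = 512 * MIB;
// Nano profile: threshold set equal to capacity.
const SUBNET_MEMORY_THRESHOLD: u64 = SUBNET_MEMORY_CAPACITY;

/// Hypothetical predicate; the real check lives in the execution environment.
fn is_high_usage(used: u64, threshold: u64) -> bool {
    used > threshold
}

fn main() {
    // Usage can never exceed capacity, so none of these trip the threshold.
    for used_mib in [0u64, 256, 512] {
        let high = is_high_usage(used_mib * MIB, SUBNET_MEMORY_THRESHOLD);
        println!("{used_mib} MiB used -> high usage: {high}");
    }
    // With the earlier 384 MiB threshold, 400 MiB of usage would count as high.
    println!("400 MiB vs 384 MiB threshold: {}", is_high_usage(400 * MIB, 384 * MIB));
}
```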

- HAMMER_MODE=probe runs only the per-message dirty/accessed-page-limit probe (skips the throughput/compute/growth storms).
- Grow stable memory in its own committed message, then fill 24 MiB (under the 32 MiB limit, expect OK) and 48 MiB (over, expect trap), so the limit is isolated from subnet-capacity effects.
- Widen error-class output so full canister reject reasons are visible.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
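The expected probe outcomes follow directly from the 32 MiB limit. The sketch below models only the comparison; `check_stable_access` and its error text are hypothetical and do not reproduce the replica's actual trap message, apart from the "32768 KB" figure, which is simply 32 MiB expressed in KB.

```rust
const MIB: u64 = 1024 * 1024;

/// Nano per-message stable limit used by the probe (32 MiB = 32768 KB).
const STABLE_ACCESSED_LIMIT: u64 = 32 * MIB;

/// Hypothetical check: Ok if a single message stays within the limit,
/// otherwise an error standing in for the canister-level trap.
fn check_stable_access(bytes: u64) -> Result<(), String> {
    if bytes > STABLE_ACCESSED_LIMIT {
        Err(format!(
            "accessed {} KB, limit {} KB",
            bytes / 1024,
            STABLE_ACCESSED_LIMIT / 1024
        ))
    } else {
        Ok(())
    }
}

fn main() {
    // The two probe sizes from the commit message.
    for mib in [24u64, 48] {
        match check_stable_access(mib * MIB) {
            Ok(()) => println!("{mib} MiB: OK"),
            Err(e) => println!("{mib} MiB: trap ({e})"),
        }
    }
}
```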

The nano heap-delta capacity (96 MiB) is small relative to the default ~500-round checkpoint interval, so a memory-write-heavy workload fills the heap delta in a few rounds and then execution stalls until the next checkpoint flushes it (consensus keeps finalizing throughout: graceful, but execution duty-cycle collapses).

Pass --dkg-interval-length 49 to ic-prep so checkpoints happen every ~50 rounds. Measured effect under the same hammer workload:
- heap-delta round-skips during the run: ~880 -> ~150
- compute phase drains ~3x faster; execution advances in short bursts instead of multi-minute stalls.

Checkpoint cadence follows the DKG interval (CUP heights); cheap here because the nano subnet state is only a few hundred MB.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
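A rough model of the duty-cycle effect, as a sketch only: it assumes the heap delta grows by a fixed amount per active round and is flushed only at the checkpoint. The 8 MiB-per-round write rate is a made-up illustration, and the model is not expected to reproduce the measured skip counts.

```rust
/// Rounds per checkpoint interval in which execution can run before the
/// heap delta reaches capacity (simplified model, all sizes in MiB).
fn active_rounds(capacity: u64, delta_per_round: u64, interval: u64) -> u64 {
    assert!(delta_per_round > 0, "delta_per_round must be positive");
    // Ceiling division: rounds needed to fill the heap delta.
    let rounds_to_fill = (capacity + delta_per_round - 1) / delta_per_round;
    rounds_to_fill.min(interval)
}

/// Rounds per interval that are skipped while waiting for the checkpoint.
fn skipped_rounds(capacity: u64, delta_per_round: u64, interval: u64) -> u64 {
    interval - active_rounds(capacity, delta_per_round, interval)
}

fn main() {
    let capacity = 96; // nano heap-delta capacity in MiB
    let delta = 8; // hypothetical write rate in MiB per round
    for interval in [500u64, 50] {
        println!(
            "interval={interval}: active={} skipped={}",
            active_rounds(capacity, delta, interval),
            skipped_rounds(capacity, delta, interval)
        );
    }
}
```

Under this model the number of active rounds per interval is unchanged, but a shorter interval bounds each stall to a few dozen rounds instead of several hundred.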

HAMMER_MODE=read populates N canisters with large stable state, then runs read-heavy 24 MiB stable_read calls — updates on all-but-one canister and queries on the last — concurrently, plus a 48 MiB single-execution read probe to exercise the per-message/query stable accessed-page limit. storm() gains an is_query flag to drive query calls via execute_query. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

HAMMER_MODE=heap mirrors the stable-memory tests on Wasm heap memory: per-message heap-write probe (24/48/96 MiB in one message), heap-write storm (8 MiB/call), and a heap-read storm (40 MiB get_global_data reads, updates + queries). Demonstrates that heap has no per-execution dirty/accessed cap (the 32 MiB limits are stable-only): all three single-message heap writes and the 40 MiB heap reads succeed, whereas the stable equivalents trap at 32 MiB. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- read mode: cycle read offsets across the FULL populated range (not 4 fixed windows) and error-check the populate, so reads pull distinct state and all canisters are actually large.
- heapread mode: build a large per-canister heap global via append_to_global_data and query-read it (96 MiB/read). Surfaces that large heap state is ~2.5x more expensive than stable (wasm heap never shrinks + realloc on build), so 3x96 MiB heap globals OOM the 512 MiB subnet while 3x128 MiB stable fits, and that large heap reads via update OOM (the get_global_data copy grows heap).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- read storm now queries ALL canisters cycling the full populated range (clean read pressure; queries don't replicate or dirty).
- populate grows+fills in 24 MiB increments (a single 128 MiB grow can be rejected; small incremental grows reliably build the state).

Used to measure read memory/perf under a container RAM cap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
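The incremental populate can be sketched as a chunk plan. This is an illustration, not the code in hammer.rs: `populate_chunks` is a hypothetical helper, and sending one grow+fill message per returned chunk is an assumption about how the increments are issued.

```rust
/// Split a target stable-memory size into per-message increments that stay
/// below the per-message limit (sizes in MiB).
fn populate_chunks(total_mib: u32, chunk_mib: u32) -> Vec<u32> {
    assert!(chunk_mib > 0, "chunk size must be positive");
    let mut chunks = Vec::new();
    let mut remaining = total_mib;
    while remaining > 0 {
        let step = remaining.min(chunk_mib);
        chunks.push(step);
        remaining -= step;
    }
    chunks
}

fn main() {
    // 128 MiB target built from 24 MiB increments, each under the 32 MiB limit.
    let plan = populate_chunks(128, 24);
    println!("{} messages: {:?}", plan.len(), plan);
}
```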

HAMMER_MODE=calls: each ingress makes the target canister start a HAMMER_CALL_DEPTH-hop chain of update calls around the canister ring (nested via call_args().other_side), generating ~2*depth inter-canister messages per ingress. Used to stress message routing, callbacks and the guaranteed-response memory reservation under the nano profile. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…iplier) HAMMER_MODE=fanout: each ingress fires N parallel fire-and-forget update calls (no-op callbacks), leaving N outstanding inter-canister calls per in-flight ingress to stress the guaranteed-response memory reservation and callback limits. HAMMER_FANOUT_MULT repeats the fan-out so a single message issues N*mult calls (all reservations taken before any drain), which exposes the 64 MiB guaranteed-response cap (~32 simultaneous calls). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
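The "~32 simultaneous calls" figure is consistent with each outstanding guaranteed-response call reserving about 2 MiB. That per-call size is inferred here from the two numbers in the commit message (64 MiB / 32), not stated there, so treat `PER_CALL_RESERVATION` as an assumption; the helper names are likewise hypothetical.

```rust
const MIB: u64 = 1024 * 1024;

/// Nano guaranteed-response message memory cap.
const GUARANTEED_RESPONSE_CAP: u64 = 64 * MIB;
/// Assumed reservation per outstanding call (inferred, not from the source).
const PER_CALL_RESERVATION: u64 = 2 * MIB;

/// Outstanding calls that fit under the cap at once.
fn max_outstanding_calls() -> u64 {
    GUARANTEED_RESPONSE_CAP / PER_CALL_RESERVATION
}

/// Calls issued by one ingress message: N parallel calls repeated `mult` times.
fn calls_per_message(n: u64, mult: u64) -> u64 {
    n * mult
}

/// Whether a single message's fan-out fits within the cap.
fn fits_under_cap(n: u64, mult: u64) -> bool {
    calls_per_message(n, mult) <= max_outstanding_calls()
}

fn main() {
    println!("max outstanding calls: {}", max_outstanding_calls());
    for (n, mult) in [(8u64, 4u64), (8, 5)] {
        println!(
            "N={n} mult={mult}: {} calls, fits={}",
            calls_per_message(n, mult),
            fits_under_cap(n, mult)
        );
    }
}
```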

HAMMER_MODE=hybrid runs three storms concurrently over the canister pool: query reads (24 MiB stable_read), update writes (8 MiB stable_fill), and 3-hop inter-canister call chains — splitting the concurrency budget. Shows read/update path isolation and update-path contention under mixed load. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

MAX_HEAP_DELTA_PER_ITERATION was 200 MB > SUBNET_HEAP_DELTA_CAPACITY (96 MiB), so a single execution round could push the in-memory heap delta far past the cap before the next round's skip-check — a transient spike of unreclaimable (anonymous) resident memory (~200-300 MB) that threatens a 512 MiB VM under write load. Lower it to 64 MB so one round cannot overshoot the cap, tightening the anonymous-memory ceiling. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
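One way to keep this relationship from regressing is a compile-time assertion, sketched below. It uses plain `u64` instead of the repo's `NumBytes`, and it assumes `M` denotes 10^6 bytes (matching the "MB" in the commit message); neither the assertion nor `would_overshoot` is part of the PR.

```rust
const MIB: u64 = 1024 * 1024;
const M: u64 = 1_000_000; // assumption: decimal megabytes

const SUBNET_HEAP_DELTA_CAPACITY: u64 = 96 * MIB;
const MAX_HEAP_DELTA_PER_ITERATION: u64 = 64 * M;

// Fails the build if the per-iteration limit ever exceeds the capacity.
const _: () = assert!(
    MAX_HEAP_DELTA_PER_ITERATION <= SUBNET_HEAP_DELTA_CAPACITY,
    "MAX_HEAP_DELTA_PER_ITERATION must not exceed SUBNET_HEAP_DELTA_CAPACITY"
);

/// Hypothetical helper: true if one iteration alone could exceed the cap.
fn would_overshoot(per_iteration: u64, capacity: u64) -> bool {
    per_iteration > capacity
}

fn main() {
    println!(
        "old 200 MB limit overshoots: {}",
        would_overshoot(200 * M, SUBNET_HEAP_DELTA_CAPACITY)
    );
    println!(
        "new 64 MB limit overshoots: {}",
        would_overshoot(MAX_HEAP_DELTA_PER_ITERATION, SUBNET_HEAP_DELTA_CAPACITY)
    );
}
```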

Copilot

Pull request overview

Experimental “nano-replica” configuration intended to shrink replica memory footprints (targeting 512 MiB–1 GiB VMs) by aggressively reducing subnet memory caps, heap-delta limits, sandbox/resource limits, and XNet stream sizes, plus adding a standalone Rust “hammer” example to stress the local 4-node network.

Changes:

Reduce multiple replica/subnet memory and concurrency limits (heap delta, message routing streams, sandbox resources, query/execution threads).
Add hammer.rs load driver example and wire in universal canister dependency for it.
Adjust dev/local-net prep to bake nano hypervisor overrides and shorten DKG interval.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 8 comments.


| File | Description |
| --- | --- |
| rs/config/src/subnet_config.rs | Shrinks heap-delta iteration cap, reserve, paused DTS executions, and per-canister heap-delta rate limit. |
| rs/config/src/message_routing.rs | Reduces XNet stream target size and max messages per stream. |
| rs/config/src/execution_environment.rs | Cuts subnet memory/message capacities and execution/query parallelism; lowers caches and reservations. |
| rs/config/src/embedders.rs | Lowers stable-memory per-message dirty/accessed limits, sandbox counts/idle time, and compilation/page-copying parallelism. |
| rs/canister_sandbox/src/replica_controller/sandboxed_execution_controller.rs | Adds a minimum sandbox RSS floor and reduces eviction RSS batch size. |
| rs/canister_client/examples/hammer.rs | New stress-test tool for deploying universal canisters and generating mixed load patterns. |
| rs/canister_client/Cargo.toml | Adds `ic-universal-canister` as a dev-dependency for the new example. |
| dev/local-net/prep.sh | Applies nano hypervisor overrides and sets a shorter DKG interval for local-net. |
| Cargo.lock | Lockfile update for the added dev-dependency. |


+// Nano-replica profile: keep a single round's heap-delta production below the
+// SUBNET_HEAP_DELTA_CAPACITY (96 MiB) so one round cannot overshoot the cap and
+// spike unreclaimable (anonymous) resident memory. This bounds the per-round
+// dirty working set so writes stay safe on a 512 MiB - 1 GiB VM.
+const MAX_HEAP_DELTA_PER_ITERATION: NumBytes = NumBytes::new(64 * M);


 /// This specifies the threshold in bytes at which the subnet memory usage is
 /// considered to be high. If this value is greater or equal to the subnet
 /// capacity, then the subnet is never considered to have high usage.
-const SUBNET_MEMORY_THRESHOLD: NumBytes = NumBytes::new(750 * GIB);
+// Nano-replica profile: set equal to the subnet memory capacity so the subnet
+// is never considered "high usage" and the storage cycle-reservation mechanism
+// stays disabled — canisters can allocate freely up to the subnet capacity
+// without reserving cycles (reservation pricing is calibrated for mainnet and
+// would otherwise reject growth on a tiny subnet).
+const SUBNET_MEMORY_THRESHOLD: NumBytes = NumBytes::new(512 * MIB);


+        // ---- Heap-read storm (analogue of the stable READ test) ----
+        // get_global_data reads the whole 40 MiB global in one execution — more
+        // than the 32 MiB stable per-message accessed limit would ever allow.
+        let qry_cans = Arc::new(vec![canisters[canisters.len() - 1]]);


+        us.report("HEAP-READ-UPDATE (40 MiB heap read)", t.elapsed());
+        qs.report("HEAP-READ-QUERY (40 MiB heap read)", t.elapsed());


+        // Populate each canister with ~120 MiB of real stable data (written in
+        // <=24 MiB chunks to respect the 32 MiB per-message dirty limit).
+        const BIG_MIB: u32 = 128;
+        let chunk: u32 = 24 * MIB;


+    println!("\n[5/5] MEMORY-GROWTH storm: grow 16 MiB + fill per call across all canisters until rejected");
+    let grow = Arc::new(Stats::default());
+    let total_mib = Arc::new(AtomicU64::new(0));


+    for h in handles {
+        let _ = h.await;
+    }
+    grow.report("MEMORY-GROWTH", Duration::from_secs(1));


 /// The number of sandbox processes to evict in one go in order to amortize
 /// for the eviction cost. A large number could lead to the eviction
 /// of many sandboxes and increased system load. The number was chosen
 /// based on the assumption of 800 canister executions per round
 /// distributed across 4 execution cores.
 const SANDBOX_PROCESSES_TO_EVICT: usize = 200;


Bjoern Tackmann and others added 14 commits June 12, 2026 16:19

github-actions Bot added the feat label Jun 16, 2026

Dfinity-Bjoern requested a review from Copilot June 16, 2026 15:22

Copilot started reviewing on behalf of Dfinity-Bjoern June 16, 2026 15:23 View session

Copilot AI reviewed Jun 16, 2026

Dfinity-Bjoern wants to merge 14 commits into bjoern/dockernet from bjoern/nano-replica
