
reactive: zero-alloc data structures and wave-based API #8277

Draft
cristianoc wants to merge 54 commits into master from reactive-noalloc-current-mechanism

Conversation

@cristianoc
Collaborator

@cristianoc cristianoc commented Mar 5, 2026

reactive: zero-alloc data structures and wave-based API

Motivation

The reactive incremental engine maintains hash maps, sets, and nested containers (map-of-sets, map-of-maps) that are updated on every file change. Under the old Hashtbl-based implementation, each lookup returned a boxed option, each iteration allocated a closure, and inner containers in structures like contributions: (k2, (k1, v2) Hashtbl.t) Hashtbl.t were created and abandoned on key churn. In a replay of 56 sequential commits on a real codebase (hyperindex), these micro-allocations dominated GC pressure in steady state.

This PR eliminates all steady-state allocation in the reactive engine's core data path.

Design

1. ReactiveHash.Map / ReactiveHash.Set (434 + 63 LOC)

Custom open-addressing hash tables vendored from Hachis (François Pottier, Inria Paris), adapted for the reactive engine's usage patterns. Key properties:

  • Linear probing with power-of-2 capacity, void/tomb sentinels, and 82% max occupancy threshold. After tables reach steady-state capacity, clear + replace cycles allocate zero heap words.
  • Type-erased storage via Obj.t arrays — a single concrete table type backs both Map and Set, avoiding functor overhead.
  • iter_with / exists_with — iteration with an extra context argument, avoiding closure allocation on every call. This is the critical pattern: instead of Map.iter (fun k v -> ... captured_state ...), callers write Map.iter_with f state map where f is a module-level function.
  • find_maybe — returns ReactiveMaybe.t instead of option, eliminating the Some box on every lookup.
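The `iter_with` pattern above can be sketched as follows. This is a hedged illustration, not the real `ReactiveHash` implementation: a list-backed map stands in for the open-addressing table, and `add_value` is a hypothetical module-level callback. The point is that the extra `state` argument lets the callback capture nothing, so the call site allocates no closure.

```ocaml
(* Sketch of the iter_with pattern (names illustrative). A list-backed
   map stands in for the real open-addressing table. *)
module Map = struct
  type ('k, 'v) t = ('k * 'v) list ref
  let create () : ('k, 'v) t = ref []
  let replace m k v = m := (k, v) :: List.remove_assoc k !m
  (* Iteration with an explicit context argument: [f] can be a
     module-level function, so no closure is built per call. *)
  let iter_with (f : 'k -> 'v -> 's -> unit) (state : 's) (m : ('k, 'v) t) =
    List.iter (fun (k, v) -> f k v state) !m
end

(* Module-level callback: captures nothing. *)
let add_value _k v acc = acc := !acc + v

let () =
  let m = Map.create () in
  Map.replace m "a" 1;
  Map.replace m "b" 2;
  let sum = ref 0 in
  Map.iter_with add_value sum m;
  assert (!sum = 3)
```

Compare `Map.iter (fun k v -> ... sum ...)`, which would allocate a fresh closure capturing `sum` on every propagation step.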

2. ReactiveMaybe.t (17 LOC)

An unboxed optional: none is a physically unique sentinel, some v is Obj.repr v (zero allocation). is_some / unsafe_get are inline comparisons. This replaces option at every map lookup boundary and in wave payloads for remove-vs-set discrimination.
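A minimal sketch of such an unboxed optional, assuming the shape described above (a physically unique sentinel for `none`, identity reinterpretation for `some`). The real `ReactiveMaybe` may differ in details:

```ocaml
(* Unboxed optional in the spirit of ReactiveMaybe (sketch). *)
module Maybe : sig
  type 'a t
  val none : 'a t
  val some : 'a -> 'a t
  val is_some : 'a t -> bool
  val unsafe_get : 'a t -> 'a
end = struct
  type 'a t = Obj.t
  let none = Obj.repr (ref ())   (* physically unique sentinel block *)
  let some v = Obj.repr v        (* identity: zero allocation *)
  let is_some m = not (m == none)
  let unsafe_get m = Obj.obj m   (* only valid when is_some holds *)
end

let () =
  let m = Maybe.some 42 in
  assert (Maybe.is_some m);
  assert (Maybe.unsafe_get m = 42);
  assert (not (Maybe.is_some Maybe.none))
```

Unlike `Some v`, `Maybe.some v` never boxes, which is what removes the per-lookup allocation at every `find_maybe` boundary.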

3. ReactiveWave.t (31 LOC)

A fixed-capacity pair of Obj.t arrays (keys + values) with an integer length counter. Waves replace the delta variant type (Set | Remove | Batch of (k * v option) list) that previously allocated a list cons cell per entry per propagation step. clear just resets the length to 0. Waves are allocated once at combinator creation time and reused across all subsequent processing cycles.
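A rough sketch of the wave shape described above (field and function names are assumed; bounds handling is elided):

```ocaml
(* Fixed-capacity wave: two parallel Obj.t arrays plus a length
   counter. Allocated once, reused across propagation cycles. *)
module Wave = struct
  type t = {
    keys : Obj.t array;
    values : Obj.t array;
    mutable len : int;
  }
  let create capacity = {
    keys = Array.make capacity (Obj.repr ());
    values = Array.make capacity (Obj.repr ());
    len = 0;
  }
  let push w k v =
    w.keys.(w.len) <- Obj.repr k;
    w.values.(w.len) <- Obj.repr v;
    w.len <- w.len + 1
  let clear w = w.len <- 0   (* no deallocation, no allocation *)
  let iter f w =
    for i = 0 to w.len - 1 do
      f (Obj.obj w.keys.(i)) (Obj.obj w.values.(i))
    done
end

let () =
  let w = Wave.create 8 in
  Wave.push w "k1" 1;
  Wave.push w "k2" 2;
  let n = ref 0 in
  Wave.iter (fun _ v -> n := !n + v) w;
  assert (!n = 3);
  Wave.clear w;
  assert (w.Wave.len = 0)
```

Contrast with `Batch of (k * v option) list`, which allocates a cons cell (and usually a `Some` box) per entry per step.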

4. ReactivePoolMapSet (107 LOC) and ReactivePoolMapMap (102 LOC)

Pooled container-of-containers with deterministic inner-container recycling.

Problem: structures like pred_map: (k, k Set) Map and contributions: (k2, (k1, v2) Map) Map exhibit key churn — outer keys appear and disappear across incremental updates. Under the old design, each new outer key allocated a fresh inner container, and removal just dropped it for GC.

Solution: both modules maintain an internal free-list (stack of cleared inner containers). The API forces callers through lifecycle-aware operations:

  • add / replace — reuses a pooled inner container on first access to a new key, or allocates if the pool is empty (pool_miss_create event).
  • drain_key / drain_outer — iterates the inner container, then clears and returns it to the pool.
  • remove_from_set_and_recycle_if_empty / remove_from_inner_and_recycle_if_empty — removes one element; if the inner container becomes empty, clears and recycles it.

After warmup (first request in replay), the pool satisfies 100% of inner-container demands — zero allocation in steady state. Measured on hyperindex replay: 31,963 initial pool misses for sets, then 138 misses across the remaining 55 requests combined.
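The free-list lifecycle can be illustrated with the following sketch. It uses stdlib `Hashtbl` as a stand-in for `ReactiveHash` (which the PR avoids), and all names are hypothetical; only the acquire/recycle discipline mirrors the design described above.

```ocaml
(* Map-of-sets with a free-list of cleared inner containers (sketch). *)
module PoolMapSet = struct
  type 'e inner = ('e, unit) Hashtbl.t
  type ('k, 'e) t = {
    outer : ('k, 'e inner) Hashtbl.t;
    mutable pool : 'e inner list;     (* stack of recycled inner sets *)
    mutable pool_misses : int;
  }
  let create () = { outer = Hashtbl.create 16; pool = []; pool_misses = 0 }

  let acquire t =
    match t.pool with
    | s :: rest -> t.pool <- rest; s            (* reuse: zero alloc *)
    | [] ->
      t.pool_misses <- t.pool_misses + 1;       (* pool_miss_create *)
      Hashtbl.create 8

  let add t k e =
    let inner =
      match Hashtbl.find_opt t.outer k with
      | Some s -> s
      | None -> let s = acquire t in Hashtbl.replace t.outer k s; s
    in
    Hashtbl.replace inner e ()

  let remove_and_recycle_if_empty t k e =
    match Hashtbl.find_opt t.outer k with
    | None -> ()
    | Some s ->
      Hashtbl.remove s e;
      if Hashtbl.length s = 0 then begin
        Hashtbl.remove t.outer k;
        t.pool <- s :: t.pool                   (* recycle *)
      end
end

let () =
  let t = PoolMapSet.create () in
  PoolMapSet.add t "a" 1;
  assert (t.PoolMapSet.pool_misses = 1);
  PoolMapSet.remove_and_recycle_if_empty t "a" 1;
  PoolMapSet.add t "b" 2;   (* reuses the recycled inner set *)
  assert (t.PoolMapSet.pool_misses = 1)
```

Once the pool is warm, outer-key churn cycles through `acquire`/recycle without touching the allocator, which is the steady-state behavior measured in the replay.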

5. Zero-alloc combinators

All four combinators (flatMap, join, union, fixpoint) were rewritten to:

  • Use iter_with with module-level callback functions instead of closures.
  • Store scratch state in pre-allocated mutable fields on the combinator's record (affected, scratch, merge_acc, etc.).
  • Accept and emit ReactiveWave.t instead of delta lists.
  • Use ReactivePoolMapSet for provenance tracking and ReactivePoolMapMap for contribution aggregation (flatMap, join).

The fixpoint combinator additionally migrated pred_map from Map<k, Map<k, unit>> to ReactivePoolMapSet (recognizing it as semantically a map-of-set), and has_live_predecessor uses the new Set.exists_with for early-exit iteration.

6. ReactiveAllocTrace (80 LOC)

Two-level tracing controlled by RESCRIPT_REACTIVE_ALLOC_TRACE:

  • Level 1 (=1): logs allocation events only (map/set create, table resize, pool miss, pool resize).
  • Level 2 (=2): also logs operational events (drain, remove-recycle) for full lifecycle analysis.

Events are written as single-line strings to a file descriptor, with event kinds split into alloc_event_kind and op_event_kind types. The emit_alloc_kind / emit_op_kind functions check the level before writing, so level-1 tracing has zero overhead for operational events.
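A minimal sketch of the level-gating, assuming the environment variable and event split described above (writing to stderr here; the real module writes to a file descriptor, and its event constructors may differ):

```ocaml
(* Two-level tracing gated by RESCRIPT_REACTIVE_ALLOC_TRACE (sketch). *)
let level =
  match Sys.getenv_opt "RESCRIPT_REACTIVE_ALLOC_TRACE" with
  | Some "1" -> 1
  | Some "2" -> 2
  | _ -> 0

type alloc_event_kind = Map_create | Table_resize | Pool_miss | Pool_resize
type op_event_kind = Drain | Remove_recycle

let emit_alloc_kind k =
  if level >= 1 then
    prerr_endline (match k with
      | Map_create -> "alloc:map_create"
      | Table_resize -> "alloc:table_resize"
      | Pool_miss -> "alloc:pool_miss"
      | Pool_resize -> "alloc:pool_resize")

let emit_op_kind k =
  if level >= 2 then   (* level 1 skips operational events entirely *)
    prerr_endline (match k with
      | Drain -> "op:drain"
      | Remove_recycle -> "op:remove_recycle")
```

Because `emit_op_kind` checks the level before formatting anything, level-1 tracing pays only a branch for operational events.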

Migration summary

| Structure | Before | After |
| --- | --- | --- |
| All hash maps/sets | `Hashtbl.t` | `ReactiveHash.Map.t` / `Set.t` |
| Map lookups | `find_opt` → `option` | `find_maybe` → `ReactiveMaybe.t` |
| Delta propagation | `Set \| Remove \| Batch of list` | `ReactiveWave.t` |
| flatMap.provenance | ad-hoc `Map<k1, k2 list>` | `ReactivePoolMapSet` |
| flatMap.contributions | `Map<k2, Map<k1,v2>>` with manual `get_contributions` | `ReactivePoolMapMap` |
| join.contributions | same pattern | `ReactivePoolMapMap` |
| fixpoint.pred_map | `Map<k, Map<k, unit>>` | `ReactivePoolMapSet` |
| Combinator iteration | closure per call | `iter_with` + module-level functions |

Testing

53 tests across 5 test modules. The 20 allocation tests (AllocTest.ml, 642 LOC) measure Gc.stat().minor_words across warmup + measured iterations and assert:

  • words/iter = 0 for fixpoint, flatMap, join, union in steady state.
  • pool_miss_delta = 0 for both PoolMapSet and PoolMapMap churn patterns after warmup.
  • Functional correctness of drain/recycle cycles (outer cardinal, empty-inner counts).
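The measurement pattern used by the allocation tests can be sketched as below. `measure_words_per_iter` and the toy workload are illustrative helpers, not the actual `AllocTest.ml` code; the tolerance absorbs the few words that `Gc.stat` bookkeeping itself can contribute.

```ocaml
(* Warm up to reach steady-state capacity, then measure minor-heap
   words allocated per iteration (sketch of the AllocTest pattern). *)
let measure_words_per_iter ~warmup ~iters (run_once : unit -> unit) =
  for _ = 1 to warmup do run_once () done;
  let before = (Gc.stat ()).Gc.minor_words in
  for _ = 1 to iters do run_once () done;
  let after = (Gc.stat ()).Gc.minor_words in
  (after -. before) /. float_of_int iters

let () =
  (* A workload that only mutates pre-allocated state allocates
     nothing, so words/iter is (essentially) zero. *)
  let counter = ref 0 in
  let w = measure_words_per_iter ~warmup:10 ~iters:1000
            (fun () -> incr counter) in
  assert (w < 1.0)
```

Dividing by a large iteration count makes per-iteration leaks stand out even when fixed measurement overhead is nonzero.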

- Zero-alloc fixpoint, flatMap, join, union, source, scheduler
- ReactiveHash.Map/Set with ReactiveMaybe for zero-alloc lookups
- ReactivePoolMapSet for zero-alloc map-of-sets with set recycling
- ReactivePoolMapMap for zero-alloc map-of-maps with inner-map recycling
- ReactiveAllocTrace with two-level tracing (alloc-only vs alloc+ops)
- Wave-based emit API with ReactiveMaybe
- Comprehensive allocation tests

Signed-Off-By: Cristiano Calcagno <nicola.calcagno@gmail.com>
@cristianoc changed the title from reactive: zero-alloc current mechanism (squashed) to reactive: zero-alloc data structures and wave-based API on Mar 5, 2026
@pkg-pr-new

pkg-pr-new bot commented Mar 5, 2026


rescript

npm i https://pkg.pr.new/rescript@8277

@rescript/darwin-arm64

npm i https://pkg.pr.new/@rescript/darwin-arm64@8277

@rescript/darwin-x64

npm i https://pkg.pr.new/@rescript/darwin-x64@8277

@rescript/linux-arm64

npm i https://pkg.pr.new/@rescript/linux-arm64@8277

@rescript/linux-x64

npm i https://pkg.pr.new/@rescript/linux-x64@8277

@rescript/runtime

npm i https://pkg.pr.new/@rescript/runtime@8277

@rescript/win32-x64

npm i https://pkg.pr.new/@rescript/win32-x64@8277

commit: acec4fb

cristianoc and others added 27 commits March 6, 2026 12:07
Signed-off-by: Cristiano Calcagno <cristianoc@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…of_list out of fixpoint

Rename ReactiveHash module to StableHash since it is a plain hash table
with no reactive behavior. Also change edge_wave type to carry
StableList.inner instead of raw lists, pushing the unsafe list-to-stable
conversion to the boundary where data enters from external waves.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cristianoc and others added 26 commits March 8, 2026 09:04
…Set and use StableMap for fixpoint pending buffers

Rename ReactiveMap and ReactiveSet modules to StableMap and StableSet
to better reflect their role as stable-boundary-aware data structures.

Change fixpoint pending buffers (root_pending, edge_pending) from
StableHash.Map to StableMap, eliminating manual Stable.t
wrapping/unwrapping at the wave-to-pending and pending-to-wave
boundaries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make Reactive.t's iter and get use Stable.t types, matching subscribe
which already delivers Stable.t-wrapped wave values. Replace
Maybe.stable_strip/stable_wrap with Maybe.to_stable/of_stable that
reorder wrappers without hiding the stable boundary crossing. Make
Stable.unit a constant instead of a function.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e conversions

Use Maybe.to_stable (Maybe.some (Stable.int i)) and Maybe.to_stable (Maybe.some Stable.unit)
instead of Stable.unsafe_of_value (Maybe.some ...) for types known to be immediates.
Also remove maybe_int_to_stable, maybe_unit_to_stable, maybe_stable_list_to_stable helpers,
change Fixpoint.create edges type to use StableList.inner, and consolidate
stable_edge_wave_map_replace into stable_wave_map_replace.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
StableHash doesn't use the custom stable allocator, so the "Stable"
name was misleading. Revert to the original ReactiveHash name.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These modules are now backed by stable storage, so the "Pool" naming
no longer reflects their implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…iveHash

Source.tables now uses StableMap for both tbl and pending, with proper
destroy. ReactiveHash is no longer used anywhere and is removed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d ReactiveTable

ReactiveWave is allocator-backed, so rename to StableWave for
consistency. ReactiveTable had no production usage and is removed
along with its test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ment Stable policy

unsafe_to_value was misleading — reading from stable storage is not
unsafe. The real contract is linear: consume the value immediately,
don't stash it. Rename to to_linear_value and rewrite Stable.mli to
clearly explain the two boundaries (storing and reading).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliminate all Stable.unsafe_of_value calls from inner modules by threading
Stable.t types through APIs. Push the boundary to reanalyze callers.

Key changes:
- ReactiveUnion: merge signature takes 'v Stable.t -> 'v Stable.t -> 'v Stable.t
- Reactive.ml: remove pointless round-trips in Source (iter, get, pending->wave),
  rewrite apply_emit with Maybe.of_stable/to_stable
- Add Stable.unsafe_to_nonlinear_value for auditable non-linear reads
- Update all reanalyze callers to wrap/unwrap at the boundary
- Add STABLE_SAFETY.md guide documenting patterns and current status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thread Stable.t through f/merge callbacks and internal mutable state
(current_k1, merge_acc, emit_fn) to eliminate all 23 unsafe_of_value calls.
Push boundary wrapping to callers in tests and reanalyze.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n stable-safe, fix right_tbl leak

Replace emit callback with StableWave in FlatMap and Join: f now receives
a wave and pushes to it, eliminating let rec + Obj.magic + emit_fn field.
Use Maybe.none sentinels for mutable fields and Maybe.t merge accumulator.
Make ReactiveJoin.ml fully stable-safe (zero unsafe_of_value calls).
Fix pre-existing right_tbl leak in test_join_alloc_n (missing destroy).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…oc tests

Add assert (words = 0) after each allocation measurement and
assert (Allocator.live_block_count () = 0) after teardown to catch
regressions in both GC allocation and stable storage leaks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ctiveFixpoint nearly stable-safe

StableList iter/iter_with/exists/exists_with now provide 'a Stable.t to
callbacks. This eliminates stable_key (unsafe_of_value) from all
ReactiveFixpoint processing code — only 2 calls remain in debug-only
Invariants. Use unsafe_to_nonlinear_value in Invariants where values are
stored in Hashtbl/lists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… type

Rename StableList.inner to StableList.t (now just 'a list under the hood)
and remove the old type 'a t = 'a inner Stable.t. The container storing
a StableList is now responsible for the Stable.t wrapping.

Add safe to_stable/maybe_to_stable conversions that hide unsafe_of_value,
and restore of_list as a checked constructor. Add find_succs/succs_of_stable
helpers in ReactiveFixpoint for the StableMap boundary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…entation

Elements are now pre-wrapped as Stable.t at creation time via a zero-cost
%identity reinterpretation. This lets iter/exists/length delegate directly
to List.iter/List.exists/List.length with no per-element conversion.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nts, skip alloc checks when invariants enabled

Replace all Hashtbl usage in ReactiveFixpoint.Invariants with StableSet
operations to avoid OCaml heap allocation. Add inv_pre_current,
inv_scratch_a, inv_scratch_b scratch sets to the fixpoint state record.
Remove stable_key helper and output_entries_list intermediate list.

Skip allocation assertions in AllocTest when RESCRIPT_REACTIVE_FIXPOINT_ASSERT
is enabled, since invariant checks themselves allocate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliminate all OCaml heap allocation in invariant checks so tests pass
with RESCRIPT_REACTIVE_FIXPOINT_ASSERT=1 and zero words/iter.

- Move is_supported, old_successors, has_live_predecessor before
  Invariants module so callbacks can reference them directly
- Add fill_reachable_scratch: BFS taking t directly (no ref/tuple)
- Use exception for stable_set_equal (no ref/tuple)
- All invariant functions take t directly (no tuple args at call sites)
- Extract all callbacks as top-level functions (no per-call closures)
- Change alloc skip flag to RESCRIPT_REACTIVE_SKIP_ALLOC_ASSERT

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…s_not_equal exception

Move stable_set_equal and copy_stable_set from ReactiveFixpoint.Invariants
into StableSet as proper operations that can benefit from internal
implementation details (direct slot iteration, no closures/exceptions).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move inv_pre_current, inv_scratch_a, inv_scratch_b out of the fixpoint
record type. They are now created/destroyed locally in apply_list using
Maybe.t for safe optional access. Add iter_with2 to StableSet and
StableMap to pass two args without partial-application closures,
maintaining zero OCaml heap allocation even with invariants enabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a health check after each send_request to verify the server process
is still alive, and fail immediately with the server log if it crashed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…toring in C memory

StableQueue.resize: add missing Block2.resize before blit (the old code
tried to blit new_cap elements into a block of old_cap capacity) and
reset head/tail to linearized positions after the copy.

ReactiveFileCollection.process_files_batch: accumulate changes in an
OCaml list first, then Gc.full_major() to promote all values to the
major heap, then push to the C-allocated scratch_wave. This ensures
the GC-invisible C pointers target stable major-heap addresses. Uses
Stable.of_value (not unsafe_of_value) as a double-check that values
are indeed promoted.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ive pipeline

Zero-allocation change: ReactiveAnalysis.to_file_data_collection now returns
file_data Maybe.t instead of file_data option, and ReactiveMerge.create
consumes it using Maybe.of_stable/is_none/unsafe_get instead of option
pattern matching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>