
perf: add direct-mapped node cache to BTreeMap#416

Draft
sasa-tomic wants to merge 1 commit into main from perf/direct-mapped-node-cache

Conversation


@sasa-tomic sasa-tomic commented Mar 18, 2026

Summary

  • Add a 32-slot direct-mapped node cache to BTreeMap, modeled after CPU caches: O(1) lookup via (address / page_size) % 32, collision = eviction (no LRU tracking)
  • Read paths (get, contains_key, first/last_key_value) use a take+return pattern to avoid re-loading hot upper-tree nodes from stable memory
  • Write paths invalidate affected cache slots in save_node, deallocate_node, merge, and clear_new
  • Switch get() from destructive extract_entry_at (swap_remove) to non-destructive node.value() (borrows via OnceCell)
  • Remove now-unused extract_entry_at method

This subsumes all four previous caching approaches (root-only, LRU+clone, LRU+Rc, page cache) into a single design that:

  • Has ~5 instructions overhead per cache lookup (vs ~330 for the Rc LRU's linear scan)
  • Stores Node<K> directly (no Rc, no Clone, no heap allocation per cache entry)
  • Uses cache.get_mut() on write paths (zero RefCell overhead)

Expected improvement: ~15-20% for random reads, ~65% for hot-key workloads, ~0% overhead for writes.
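The direct-mapped design above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual code: the `Node` payload, `PAGE_SIZE` constant, and method names are assumptions chosen to show the slot arithmetic and the collision-equals-eviction policy.

```rust
const CACHE_SLOTS: usize = 32;
const PAGE_SIZE: u64 = 4096; // assumed page size; the real value comes from the allocator

// Stand-in for the real Node<K>; only the address matters to the cache.
struct Node {
    address: u64,
}

// Direct-mapped cache: each address maps to exactly one slot, so lookup,
// insertion, and invalidation are all O(1) with no LRU bookkeeping.
struct NodeCache {
    slots: [Option<Node>; CACHE_SLOTS],
}

impl NodeCache {
    fn new() -> Self {
        Self { slots: std::array::from_fn(|_| None) }
    }

    fn slot_index(address: u64) -> usize {
        ((address / PAGE_SIZE) % CACHE_SLOTS as u64) as usize
    }

    // Insert a node; any colliding occupant is simply evicted.
    fn put(&mut self, node: Node) {
        let i = Self::slot_index(node.address);
        self.slots[i] = Some(node);
    }

    // Called from write paths (save_node, deallocate_node, merge, clear_new)
    // so a stale copy never survives a mutation.
    fn invalidate(&mut self, address: u64) {
        let i = Self::slot_index(address);
        if matches!(&self.slots[i], Some(n) if n.address == address) {
            self.slots[i] = None;
        }
    }
}
```

Because the slot is a pure function of the address, invalidation on the write path is a single indexed compare-and-clear rather than a scan, which is where the "~0% overhead for writes" claim comes from.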

Add a 32-slot direct-mapped node cache to BTreeMap that avoids
re-loading hot nodes from stable memory. Modeled after CPU caches:
O(1) lookup via (address / page_size) % 32, collision = eviction.

Read paths (get, contains_key, first/last_key_value) use a
take+return pattern to borrow nodes from the cache without
RefCell lifetime issues. Write paths (insert, remove, split,
merge) invalidate affected cache slots.

Key changes:
- Switch get() from destructive extract_entry_at to node.value()
- Remove unused extract_entry_at method
- Change traverse() closure from Fn(&mut Node) to Fn(&Node)
- Invalidate cache in save_node, deallocate_node, merge, clear_new

Expected improvement: ~15-20% for random reads, ~65% for hot-key
workloads, ~0% overhead for writes (cache.get_mut() bypasses RefCell).
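The take+return read pattern mentioned above might look roughly like this. It is a self-contained sketch with stubbed types: `load_node` simulates a stable-memory read and, like the other names here, is illustrative rather than the crate's API. The point is that the node is moved out of the cache for the duration of the read, so no borrow of the cache is held across traversal.

```rust
const CACHE_SLOTS: usize = 32;
const PAGE_SIZE: u64 = 4096; // assumed page size

// Stand-in node with a toy payload.
struct Node {
    address: u64,
    value: u64,
}

struct NodeCache {
    slots: [Option<Node>; CACHE_SLOTS],
}

impl NodeCache {
    fn new() -> Self {
        Self { slots: std::array::from_fn(|_| None) }
    }

    fn slot_index(address: u64) -> usize {
        ((address / PAGE_SIZE) % CACHE_SLOTS as u64) as usize
    }

    // Take the node out of its slot if (and only if) the address matches.
    fn take(&mut self, address: u64) -> Option<Node> {
        let i = Self::slot_index(address);
        if self.slots[i].as_ref().map_or(false, |n| n.address == address) {
            self.slots[i].take()
        } else {
            None
        }
    }

    fn put(&mut self, node: Node) {
        let i = Self::slot_index(node.address);
        self.slots[i] = Some(node); // collision = eviction
    }
}

// Simulated stable-memory load on a cache miss.
fn load_node(address: u64) -> Node {
    Node { address, value: address * 10 }
}

// Take+return: remove the node from the cache (or load it on a miss),
// read through the closure, then put it back so the next access hits.
fn with_node<R>(cache: &mut NodeCache, address: u64, f: impl FnOnce(&Node) -> R) -> R {
    let node = cache.take(address).unwrap_or_else(|| load_node(address));
    let result = f(&node);
    cache.put(node);
    result
}
```

Moving the node out and back avoids holding a `RefCell` borrow (or any `&`-borrow of the cache) while the read path recurses, which is the lifetime issue the commit message alludes to.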
@sasa-tomic sasa-tomic requested a review from a team as a code owner March 18, 2026 17:46
@sasa-tomic sasa-tomic marked this pull request as draft March 18, 2026 17:52
