"Establishing Controllability in Mamba-2 Recurrent States"
Caution
Research Artifact Disclaimer: This repository documents exploratory research into Mamba-2 state dynamics. It is a Phase 1 technical feasibility study, not a production-ready framework or a functional memory system.
The long-term goal of ALSI is to internalize the context processing of MIT's Recursive Language Models (RLM) directly into the latent dynamics of State Space Models.
- The Dream: A model that mathematically "uploads" external facts into its recurrent state ($h_t$), allowing it to process unbounded context with zero-latency implicit recall.
- The Reality: We are currently at the infrastructure layer, proving that such steering is physically and mathematically possible.
We have established the foundational capabilities required for latent steering.
- Controllability Proof: SSM states are controllable via non-linear, off-manifold perturbations (linear methods like PCA fail).
- Functional Engine: A custom differentiable Mamba-2 implementation that bypasses stateful cache limitations.
- Token Forcing: A trained projector ($\Phi$) can force target tokens with Rank 1 accuracy.
- Coherence: Model output often becomes garbled after the forced token (The Coherence Gap).
- Semantic Encoding: We can force a specific token (e.g., "BLUE") but haven't yet proven we can inject a factual statement (e.g., "John lives in Paris").
- Memory Validation: No experiments have been conducted on long-range recall or QA tasks.
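To make the token-forcing idea concrete, the following is a minimal sketch of steering a state vector until a chosen token becomes the top-ranked prediction. It is a toy under stated assumptions: the 4-token vocabulary, the readout matrix `W`, and the helpers `force_token` and `rank_of` are hypothetical and are not taken from this repository. It optimizes a perturbation by plain gradient descent through a linear readout, so it ignores the recurrence and the non-linear dynamics that make the real problem hard, and it does not reproduce the trained projector $\Phi$.

```python
import math

def matvec(W, h):
    """Linear readout: one logit per vocabulary row of W."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def rank_of(logits, token):
    """1-based rank of `token` (1 means it is the top prediction)."""
    return 1 + sum(1 for v in logits if v > logits[token])

def force_token(W, h, target, lr=0.5, steps=200):
    """Gradient-descend a perturbation `delta` on the state so that
    the cross-entropy of `target` under softmax(W @ (h + delta)) shrinks."""
    delta = [0.0] * len(h)
    for _ in range(steps):
        steered = [a + d for a, d in zip(h, delta)]
        probs = softmax(matvec(W, steered))
        # d(cross-entropy)/d(logits) = probs - onehot(target)
        g_logits = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
        # chain rule through the linear readout: grad_state = W^T @ g_logits
        g_state = [sum(W[i][j] * g_logits[i] for i in range(len(W)))
                   for j in range(len(h))]
        delta = [d - lr * g for d, g in zip(delta, g_state)]
    return delta

# Hypothetical toy readout and state (assumptions, not repository values).
W = [[1.0, 0.2, 0.2, 0.2],
     [0.2, 1.0, 0.2, 0.2],
     [0.2, 0.2, 1.0, 0.2],
     [0.2, 0.2, 0.2, 1.0]]
h = [2.0, 0.0, 0.0, 0.0]   # unperturbed state favors token 0
target = 2

rank_before = rank_of(matvec(W, h), target)
delta = force_token(W, h, target)
rank_after = rank_of(matvec(W, [a + d for a, d in zip(h, delta)]), target)
print("rank before:", rank_before, "rank after:", rank_after)
```

In this toy the target token moves to rank 1 once the perturbation is applied. The sketch only checks the forced token itself; it says nothing about the tokens that follow, which is where the coherence issue listed above arises.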
| Phase | Milestone | Status |
|---|---|---|
| Phase 1: Token Control | Prove states are differentiable and controllable. | COMPLETE ✅ |
| Phase 2: Fact Injection | Inject semantic facts and verify recall in QA tasks. | PLANNED 🔄 |
| Phase 3: Multi-Hop Reasoning | Inject multiple compositional facts simultaneously. | CONCEPTUAL 🔮 |
| Phase 4: Continuous Memory | Rolling latent injection in long-range conversations. | CONCEPTUAL 🔮 |
| Phase 5: RLM Parity | Match/Exceed RLM performance on long-context benchmarks. | CONCEPTUAL 🔮 |
- `core/functional_mamba.py`: Differentiable Mamba-2 recurrence.
- `core/phi_t.py`: Recursive Trajectory Projector.
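As a rough illustration of what a functional recurrence means here, the sketch below writes a state space step as a pure function: the state is passed in and returned explicitly rather than held in a hidden cache, so a caller can read or perturb it between steps. This is a simplified, single-channel form with a scalar decay per step ($h_t = a_t h_{t-1} + B_t x_t$, $y_t = C_t^\top h_t$). The names `ssm_step` and `ssm_scan` are hypothetical and are not the API of `core/functional_mamba.py`, and plain Python lists stand in for the tensors an autodiff framework would use.

```python
def ssm_step(h, a, b, c, x):
    """One recurrence step with explicit state.

    h: previous state vector, a: scalar decay, b: input projection,
    c: output projection, x: scalar input. Returns (new_state, output).
    """
    h_new = [a * hi + bi * x for hi, bi in zip(h, b)]
    y = sum(ci * hi for ci, hi in zip(c, h_new))
    return h_new, y

def ssm_scan(h0, a_seq, b_seq, c_seq, x_seq):
    """Run the recurrence over a sequence, returning final state and outputs."""
    h = list(h0)
    ys = []
    for a, b, c, x in zip(a_seq, b_seq, c_seq, x_seq):
        h, y = ssm_step(h, a, b, c, x)
        ys.append(y)
    return h, ys

# Hypothetical two-step example with a 2-dimensional state.
a_seq = [0.5, 0.5]
b_seq = [[1.0, 0.0], [0.0, 1.0]]
c_seq = [[1.0, 1.0], [1.0, 1.0]]
x_seq = [2.0, 4.0]

# Same inputs, two different initial states: the second run starts from a
# perturbed state, which is only possible because the state is an argument.
h_plain, ys_plain = ssm_scan([0.0, 0.0], a_seq, b_seq, c_seq, x_seq)
h_steer, ys_steer = ssm_scan([1.0, 0.0], a_seq, b_seq, c_seq, x_seq)
print(ys_plain, ys_steer)
```

Because the initial state is an ordinary argument, the same inputs can be replayed from a perturbed state and the outputs compared directly, which is the property that latent steering experiments rely on.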
- EXECUTIVE SUMMARY: Start here for the high-level research arc.
- The Vision: The original spark and core hypothesis.
- Current State: Detailed breakdown of Phase 1 results.
- Roadmap: Immediate experiments to close the gap.
- Install dependencies: `pip install -r requirements.txt`
- Run Phase 1 validation: `python main.py --phase 1-token-control`
- View negative results: `docs/Why_Linear_Steering_Fails_in_SSMs.md`