diff --git a/.gitignore b/.gitignore
index 568074e..b1adf79 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,3 +16,4 @@ IMPLEMENTATION-PLAN.md
 DISCOVERY-SUMMARY.md
 IMPLEMENTATION-ROADMAP.md
 RESUMPTION-PROMPT.md
+CLAUDE.md.bak
diff --git a/CLAUDE.md b/CLAUDE.md
index 7326eb6..0367fc4 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,24 +1,21 @@
 # Model Colosseum
 
-## Project Overview
-A local-first Tauri 2.0 desktop app for evaluating Ollama models across three modes: Arena (model vs model debates with Elo ratings), Benchmark (custom test suites with manual + auto-judge scoring), and Sparring Ring (structured human vs AI debates with scorecards). All modes feed a unified leaderboard backed by SQLite. macOS-only, dark theme, arena/colosseum aesthetic.
+Local-first Tauri 2.x desktop app for evaluating Ollama models: Arena (model vs model debates, Elo ratings), Benchmark (custom test suites, TTFT/TPS metrics, manual + auto-judge scoring), and Sparring Ring (human vs AI debates, scorecards). All modes feed a unified leaderboard backed by SQLite. macOS-only, dark theme, arena/colosseum aesthetic.
+
+## Stack
 
-## Tech Stack
 - Runtime: Tauri 2.x (Rust backend + webview frontend)
 - Frontend: React 19 + TypeScript 5.x strict mode
 - Build: Vite 6.x with `@tauri-apps/vite-plugin`
 - Styling: Tailwind CSS 4.x (dark theme, gold/amber accents)
-- State: Zustand 5.x
-- Routing: React Router 7.x
-- Charts: Recharts 2.x
+- State: Zustand 5.x; Routing: React Router 7.x; Charts: Recharts 2.x
 - Database: SQLite via `rusqlite` 0.31+ (bundled, WAL mode)
-- HTTP: `reqwest` 0.12+ (async streaming)
-- Async: `tokio` 1.x
-- System info: `sysinfo` 0.31+
-- LLM: Ollama REST API (localhost:11434)
+- HTTP: `reqwest` 0.12+; Async: `tokio` 1.x; System info: `sysinfo` 0.31+
+- LLM: Ollama REST API (`localhost:11434`)
 
 ## Architecture
-React frontend communicates with Rust backend via Tauri IPC (`invoke` for commands, `listen` for streaming events). Rust backend owns all Ollama communication, SQLite access, and Elo calculations. Frontend is purely presentational + state management.
+
+React frontend → Tauri IPC (`invoke` / `listen`) → Rust backend. Rust owns all Ollama communication, SQLite access, and Elo calculations. Frontend is purely presentational + state management.
 
 Key modules:
 - `src-tauri/src/db.rs` — SQLite connection, migrations, schema (13 tables), seed data
@@ -29,46 +26,46 @@ Key modules:
 - `src-tauri/src/elo.rs` — Elo rating calculations (67 tests)
 - `src-tauri/src/prompts.rs` — System prompt templates (arena, formal, socratic, sparring, scorecard judge)
 
-## Development Conventions
-- TypeScript strict mode. No `any` types.
-- React: Functional components with hooks only. No class components.
-- Rust: `clippy` clean. `cargo fmt` on save.
-- File naming: `snake_case.rs` for Rust, `PascalCase.tsx` for React components, `camelCase.ts` for utilities
-- Git commits: conventional commits (`feat:`, `fix:`, `refactor:`, `chore:`)
-- All Tauri commands return `Result<T, String>` — handle errors in Rust, display in frontend
-- Database writes wrapped in explicit transactions
-- No unwrap() in production Rust code — use ? operator or proper error handling
+## Build / Test / Run
 
-## Current Phase
-**v1.0.0 — Feature Complete** (all phases done, audit remediation applied)
+```bash
+pnpm install          # install deps
+pnpm tauri dev        # dev server (hot reload)
+pnpm tauri build      # production build
+pnpm test             # runs: cd src-tauri && cargo test
+cargo clippy -- -D warnings  # lint (must pass clean)
+cargo fmt             # format on save
+```
 
-- [x] **Phase 0: Foundation** — Tauri 2.0 scaffold, SQLite (13 tables, WAL), Ollama REST client, Elo module
-- [x] **Phase 1: Arena Mode** — Debate engine (freestyle/formal/socratic), vote + Elo, leaderboard, history
-- [x] **Phase 2: Benchmark** — CRUD suites/prompts, runner with TTFT/TPS metrics, manual + auto-judge scoring, blind comparison, hardware metrics, import/export
-- [x] **Phase 3: Sparring Ring** — Human vs AI debates, 3 difficulty levels, 4-phase structure, scorecards, user Elo
-- [x] **Phase 4: Polish** — 3 debate formats, topic suggestions, settings page, blind test, animations, skeleton loading, export (Markdown/CSV/JSON)
-- [x] **Audit** — Security hardening (configurable Ollama URL, query limit caps, settings key whitelist), accessibility (ARIA attributes), error handling, 67 Rust tests
+## Conventions
+
+- TypeScript strict mode; type with `unknown` + narrowing, never `any`
+- React functional components with hooks only; no class components
+- Rust: `clippy` clean, `cargo fmt` on save; use `?` or proper error handling — no `unwrap()` in production code
+- File naming: `snake_case.rs`, `PascalCase.tsx`, `camelCase.ts`
+- Tauri commands return `Result<T, String>` — handle errors in Rust, surface to frontend
+- Database writes in explicit transactions
+- Data directory: `~/.model-colosseum/` — the only storage location (`colosseum.db` lives here)
+
+## Gotchas
+
+- Use Tauri v2 APIs only — import paths are `@tauri-apps/api` v2; v1 APIs are incompatible
+- Use `rusqlite` directly, not `tauri-plugin-sql` — needed for WAL mode, migrations, concurrent access
+- Always health-check Ollama before calling it; handle absence gracefully (`localhost:11434`)
+- Network calls to localhost Ollama only — no telemetry, no cloud, no external endpoints
+- Concurrent streaming: runs concurrent with auto sequential fallback when models > 40B combined (prevents OOM)
+- Ollama streaming: NDJSON line-by-line parsing, not SSE
+
+## Key Decisions
 
-## Key Decisions Made
 | Decision | Choice | Rationale |
 |----------|--------|-----------|
-| Concurrent streaming | Concurrent with auto sequential fallback when models > 40B combined | User wants dramatic visual. Fallback prevents OOM. |
-| Database access | rusqlite directly, not tauri-plugin-sql | More control over WAL mode, migrations, concurrent access |
-| Elo parameters | Start 1500, K=40→32→24 based on game count | Standard chess Elo with decay to stabilize ratings |
+| Concurrent streaming | Concurrent + auto sequential fallback (>40B combined) | Dramatic visual; fallback prevents OOM |
+| Database access | `rusqlite` directly, not `tauri-plugin-sql` | WAL mode, migrations, concurrent access |
+| Elo parameters | Start 1500, K=40→32→24 by game count | Standard chess Elo with decay to stabilize |
 | Benchmark scoring | 1-5 manual, 1-10 auto-judge normalized | Fast manual scoring, more granular auto-judge |
-| App modes | Arena → Benchmark → Sparring (build order) | Arena builds all shared infra, others plug in |
-| DB location | ~/.model-colosseum/colosseum.db | Standard macOS app data location |
-| Ollama streaming | NDJSON line-by-line parsing, not SSE | That's what Ollama returns |
-
-## Do NOT
-- Do not scaffold the entire project in one session — follow the phased plan strictly
-- Do not use Tauri v1 APIs or import paths — this is Tauri 2.x (`@tauri-apps/api` v2)
-- Do not use `tauri-plugin-sql` — we use `rusqlite` directly
-- Do not use `unwrap()` in Rust production code — use `?` or proper error handling
-- Do not make any network calls except to localhost Ollama (no telemetry, no cloud)
-- Do not use class components in React — hooks only
-- Do not store any data outside `~/.model-colosseum/` — single source of truth
-- Do not assume Ollama is running — always health check first and handle absence gracefully
+| DB location | `~/.model-colosseum/colosseum.db` | Standard macOS app data location |
+| Ollama streaming | NDJSON line-by-line, not SSE | That's what Ollama returns |
 
 <!-- portfolio-context:start -->
 # Portfolio Context