
Sprint 45: extract cli/commands.py + per-domain dispatchers #18

Merged

mfwolffe merged 47 commits into trunk from sprint/45-cli-extraction on May 4, 2026

Conversation

@espadonne (Contributor) commented May 3, 2026

Summary

  • 45.1 — Promote cli/commands.py (4677 LOC) into the cli/commands/ package: 22 per-command submodules + _shared.py helpers + a slim __init__.py of re-exports.
  • 45.2 — Lift seven domain dispatchers out of the CLI: metrics, synth, preference, init, show, prompt, train. Each command now builds a typed Request, calls a run_* dispatcher, and renders a typed Result (see the sketch after this list). Dotted imports throughout so test monkeypatches resolve at call time.
  • 45.3 — Three export-target dispatchers in dlm.export.entry: run_vllm_target_export, run_mlx_serve_target_export, run_llama_server_post_export. CLI is now thin glue around prepare → smoke → finalize for each runtime target.
  • 45.4 — Direct unit tests for every new dispatcher (tests/unit/{inference,train,store,export}/test_*.py), so the per-package coverage gates stay 100% without depending on indirect CLI test coverage.
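
A minimal sketch of the 45.2 command shape, with invented Request/Result fields and an assumed no-arg gather_store_view (the real dlm types aren't reproduced in this PR):

```python
import json
from dataclasses import dataclass

from dlm.store import show as _show_mod  # bind the module, not the function

@dataclass(frozen=True)
class ShowRequest:
    as_json: bool  # invented field, standing in for `dlm show --json`

@dataclass(frozen=True)
class ShowResult:
    ok: bool
    rendered: str

def run_show(request: ShowRequest) -> ShowResult:
    # Attribute lookup happens here, at call time, so a monkeypatch on
    # "dlm.store.show.gather_store_view" is visible to the dispatcher.
    view = _show_mod.gather_store_view()
    rendered = json.dumps(view) if request.as_json else str(view)
    return ShowResult(ok=True, rendered=rendered)
```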

The branch also carries a few non-Sprint-45 fixes that landed during the same window: an MLX-PEFT-adapter silent-corruption fix (f7f0450, 931f6bb, 4d133cf), probe-marker normalization across replay/synth/gate parsers, and audit-13 follow-up findings moved into the versioned docs tree.

Test plan

  • uv run pytest tests/unit/ — 4200 pass, 4 skip
  • ./scripts/coverage-gates.sh — all 16 gates at 100%
  • ./scripts/pregate.sh — clean (ruff, format, mypy, unit, advisory checks)
  • CI checks — all required gates green (macOS, Ubuntu, no-network sandbox, slow-integration label-gated as designed); slow-tests-with-llama.cpp informational
  • Manual smoke: dlm init --base smollm2-135m → scaffolded .dlm + provisioned store; dlm show (text + --json) → both render correctly through gather_store_view. dlm prompt skipped (needs a trained adapter; out of scope for a refactor smoke)

mfwolffe added 30 commits April 28, 2026 18:29
The dispatcher previously did 'from dlm.preference import build_judge'
(re-export). Tests monkeypatch the canonical 'dlm.preference.judge.build_judge'
path; using the canonical import in the dispatcher keeps function-local
attribute lookup aligned with what tests patch.
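
The Python semantics behind that fix, compressed (the module paths are the ones named above; the function body is illustrative):

```python
# Fragile: copies the function object into this namespace at import time.
# A later monkeypatch.setattr("dlm.preference.judge.build_judge", fake)
# rebinds the attribute on the judge module; this local copy never changes.
#   from dlm.preference import build_judge

# Robust: bind the module, defer the attribute lookup to call time.
from dlm.preference import judge as _judge_mod

def run_preference_step():
    # Resolved when the dispatcher runs, so the canonical-path patch is
    # seen. (build_judge's real signature isn't shown in this PR.)
    return _judge_mod.build_judge()
```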
mfwolffe added 17 commits April 28, 2026 20:08
Lifts build_backend + load + generate out of the CLI for text-only
bases. VL and audio paths still live in prompt.py CLI helpers; a
follow-up phase splits them into modality-aware dispatchers.
….dispatch:run_train

Lifts the hardware probe → manifest bootstrap → phase orchestration
sequence out of the CLI. Watch loop, RPC probe server, multi-GPU
accelerate launcher dispatch, and license interactive prompt stay
CLI-side. Dotted imports in the dispatcher keep tests' monkeypatches
on dlm.hardware.doctor and dlm.train.preference.phase_orchestrator.run_phases
visible at call time.
Two bugs combined to make `dlm prompt --backend mlx` produce
base-model behavior even with a fully-trained PEFT LoRA adapter:

1. `target_modules` from PEFT is bare (`q_proj`), but mlx-lm's
   `linear_to_lora_layers` matches `named_modules()` keys inside
   each transformer block via exact equality. The FQN within a
   block is `self_attn.q_proj`, so no keys ever matched and
   `linear_to_lora_layers` silently left the model un-wrapped.

2. PEFT and mlx-lm use different LoRA tensor layouts:
   PEFT lora_A=[r,in], lora_B=[out,r]; mlx-lm lora_a=[in,r],
   lora_b=[r,out]. mlx-lm's `model.load_weights(strict=False)`
   silently skipped the mismatched shapes, leaving zero overlay.

The user-visible failure was "trained model behaves identically
to base" — surfaced during the audit-13 follow-up Finding 04
direct-query smoke test.
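
A compressed illustration of the two fixes using numpy stand-ins; the prefix table, helper names, and key grammar are assumptions about a Llama-style layout, not the dlm code:

```python
import numpy as np

# Fix 1: expand PEFT's bare target_modules to the block-relative FQNs
# that mlx-lm's linear_to_lora_layers compares against with exact equality.
_BLOCK_FQN = {
    "q_proj": "self_attn.q_proj", "k_proj": "self_attn.k_proj",
    "v_proj": "self_attn.v_proj", "o_proj": "self_attn.o_proj",
    "gate_proj": "mlp.gate_proj", "up_proj": "mlp.up_proj",
    "down_proj": "mlp.down_proj",
}

def expand_target_modules(bare: list[str]) -> list[str]:
    return [_BLOCK_FQN[name] for name in bare]

# Fix 2: rename keys to mlx-lm's lowercase convention and transpose.
def convert_lora_tensors(peft: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    converted = {}
    for key, tensor in peft.items():
        key = key.removeprefix("base_model.model.").removesuffix(".weight")
        if ".lora_A" in key:
            converted[key.replace(".lora_A", ".lora_a")] = tensor.T  # [r,in] -> [in,r]
        elif ".lora_B" in key:
            converted[key.replace(".lora_B", ".lora_b")] = tensor.T  # [out,r] -> [r,out]
    return converted
```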
Even with the conversion fix, an unconvertible adapter (one from an
architecture whose layers don't follow the self_attn/mlp convention)
would still
fall through to base-model output silently. Add a post-load guard
that walks the model's `trainable_parameters` and raises
`MlxConversionError` when zero `lora_a`/`lora_b` parameters are
present. Surfaces the failure as a clear message pointing at
`--backend pytorch` instead of letting the trained adapter behave
identically to the base.
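
A minimal sketch of such a guard, assuming mlx's tree_flatten and an nn.Module-style model; the message wording is illustrative:

```python
from mlx.utils import tree_flatten

class MlxConversionError(RuntimeError):
    """Converted adapter produced no LoRA overlay on the model."""

def assert_lora_overlay(model) -> None:
    # After model.load_weights(..., strict=False), confirm the model
    # actually carries lora_a/lora_b parameters; otherwise the "trained"
    # adapter silently answers like the base model.
    paths = [path for path, _ in tree_flatten(model.trainable_parameters())]
    if not any(p.endswith(("lora_a", "lora_b")) for p in paths):
        raise MlxConversionError(
            "adapter loaded but no lora_a/lora_b parameters are present; "
            "the adapter could not be converted for mlx -- "
            "retry with --backend pytorch"
        )
```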
Lifts each server target's prepare → smoke → finalize chain out of the
CLI into a typed dispatcher. CLI just builds a Request, calls the
runner, and renders. Smoke failure surfaces as a populated 'smoke'
field with ok=False (and manifest_path=None), so the CLI keeps full
control of exit codes. Dotted import of dlm.export.targets keeps
existing test fixture monkeypatches visible at call time.
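
Hypothetical shapes for that contract (field names invented; the real dlm types may differ):

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

@dataclass(frozen=True)
class SmokeReport:
    ok: bool
    detail: str = ""

@dataclass(frozen=True)
class VllmTargetExportResult:
    smoke: Optional[SmokeReport]   # populated with ok=False on smoke failure
    manifest_path: Optional[Path]  # None when the smoke test failed

# The CLI stays in charge of exit codes:
#   result = run_vllm_target_export(request)
#   if result.smoke is not None and not result.smoke.ok:
#       raise SystemExit(1)
```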
Lifts the adapter-dir resolution + prepare_llama_server_export +
smoke chain out of the CLI's llama-server branch. CLI just builds a
LlamaServerPostExportRequest, calls run_llama_server_post_export, and
renders the typed result. VendoringError + ExportError still propagate
to the CLI for target-specific banner formatting.
Each new dispatcher module now has a tests/unit/ peer that drives
its branches directly, so the per-package coverage gates (store,
train, inference, export) stay at 100% without depending on CLI
tests' indirect coverage. Modules covered: dlm.inference.dispatch,
dlm.train.dispatch, dlm.store.bootstrap, dlm.store.show,
dlm.export.entry.
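
The shape of those tests, sketched against the hypothetical run_show/ShowRequest from the summary above rather than the real dlm signatures:

```python
def test_run_show_renders_json(monkeypatch):
    # Patch the canonical module attribute the dispatcher reads at call
    # time; no CliRunner or CLI plumbing involved.
    monkeypatch.setattr("dlm.store.show.gather_store_view",
                        lambda: {"adapters": [], "base": "smollm2-135m"})
    result = run_show(ShowRequest(as_json=True))
    assert result.ok
    assert '"adapters"' in result.rendered
```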
…tion

# Conflicts:
#	src/dlm/cli/commands.py
#	src/dlm/replay/store.py
The preference dispatcher uses dotted import 'from dlm.preference import
judge as _judge_mod; _judge_mod.build_judge(...)'. Tests must patch
'dlm.preference.judge.build_judge' (canonical) for late attribute lookup
to see the patch — patches on the package re-export 'dlm.preference.build_judge'
are invisible to the dispatcher. Caught by Ubuntu CI on PR #18.
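
The test-side rule in two lines (fake shapes illustrative):

```python
# Seen by the dispatcher: rebinds the attribute on the defining module,
# which _judge_mod.build_judge(...) reads at call time.
monkeypatch.setattr("dlm.preference.judge.build_judge", lambda *a, **k: object())

# Not seen: only swaps the package-level re-export, which the dispatcher
# never consults.
monkeypatch.setattr("dlm.preference.build_judge", lambda *a, **k: object())
```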
@mfwolffe mfwolffe merged commit 1c83a65 into trunk May 4, 2026
5 checks passed