
feat(voice): /voice setup check + whisper.cpp detection (slice 1) #173

Closed. oratis wants to merge 1 commit into main from feat/voice-setup-detect


oratis (Owner) commented Jun 8, 2026

Summary

First slice of /voice (local speech-to-text, M8 in docs/BEHAVIOR_PARITY.md). This is the safe, self-contained foundation: a /voice slash command that detects whether whisper.cpp + a model are installed/configured and prints actionable setup steps. No mic capture yet — recording + transcription wiring lands in follow-up slices.

Follows the decisions in docs/VOICE_INPUT.md (settings shape voice.provider / voice.binPath / voice.modelPath; fully local — no audio leaves the machine).
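To make the settings shape concrete, here is a minimal sketch of the `voice` block and a shallow check that rejects an unknown provider, in the spirit of the `validateSettingsShallow` change listed under Changes. The field names come from this description; the provider id `"whisper-cpp"`, the function name `validateVoiceShallow`, and the error wording are assumptions for illustration, not the code in this PR.

```typescript
// Field names are taken from the PR text; everything else here is hypothetical.
interface VoiceConfig {
  provider?: string;
  binPath?: string;
  modelPath?: string;
}

// Assumed provider id; the real accepted values are not shown in this PR.
const KNOWN_PROVIDERS: readonly string[] = ["whisper-cpp"];

// Returns a list of human-readable errors; an empty list means the block looks fine.
function validateVoiceShallow(voice: unknown): string[] {
  if (voice === undefined) return []; // the voice block is optional
  if (typeof voice !== "object" || voice === null) {
    return ["voice must be an object"];
  }
  const errors: string[] = [];
  const v = voice as Record<string, unknown>;

  const provider = v["provider"];
  if (provider !== undefined) {
    if (typeof provider !== "string" || KNOWN_PROVIDERS.indexOf(provider) === -1) {
      errors.push(`voice.provider: unknown provider ${JSON.stringify(provider)}`);
    }
  }
  for (const key of ["binPath", "modelPath"]) {
    const value = v[key];
    if (value !== undefined && typeof value !== "string") {
      errors.push(`voice.${key} must be a string`);
    }
  }
  return errors;
}

// Example settings fragment (paths are placeholders).
const exampleVoice: VoiceConfig = {
  provider: "whisper-cpp",
  binPath: "/usr/local/bin/whisper-cli",
  modelPath: "/home/me/.deepcode/models/whisper-base.en.bin",
};
const exampleErrors: string[] = validateVoiceShallow(exampleVoice);
```

Returning a list of errors rather than throwing keeps the check shallow and lets the caller decide how to surface them.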

The core WhisperCppProvider / parseWhisperOutput engine already existed and was tested; it just wasn't surfaced anywhere. This slice exposes it via config + a command.

Changes

Core (@deepcode/core)

  • Add VoiceConfig (provider | binPath | modelPath) to settings types, re-exported from the package root. (The JSON schema already had the voice block; the TS type was missing.)
  • New detectVoice() (voice/detect.ts): resolves the whisper binary (settings.voice.binPath, else whisper-cli / whisper on PATH) and the model (settings.voice.modelPath, else the documented default ~/.deepcode/models/whisper-base.en.bin). Never throws — missing/invalid pieces become a problems[] list. Filesystem/PATH probes are injectable for deterministic tests.
  • validateSettingsShallow now flags an unknown voice.provider.
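The resolution order described for `detectVoice()` can be sketched roughly as follows. Only the names `detectVoice`, `VoiceConfig`, the settings keys, the binary names, and the default model path come from this PR; the `VoiceProbes` and `VoiceDetection` shapes, the `ready` flag, the `safe` helper, and the problem wording are assumptions, so treat this as an illustration of the approach rather than the code in `voice/detect.ts`.

```typescript
// Sketch only: signatures and result fields below are assumed, not taken from the PR.
interface VoiceConfig {
  provider?: string;
  binPath?: string;
  modelPath?: string;
}

// Injectable probes so tests never touch the real filesystem or PATH.
interface VoiceProbes {
  fileExists(path: string): boolean;
  findOnPath(name: string): string | undefined;
}

interface VoiceDetection {
  ready: boolean;
  binPath: string | undefined;
  modelPath: string | undefined;
  problems: string[];
}

// Run a probe and turn any exception into a fallback value ("never throws").
function safe<T>(probe: () => T, fallback: T): T {
  try {
    return probe();
  } catch {
    return fallback;
  }
}

function detectVoice(
  voice: VoiceConfig | undefined,
  home: string,
  probes: VoiceProbes,
): VoiceDetection {
  const problems: string[] = [];
  const exists = (p: string): boolean => safe(() => probes.fileExists(p), false);
  const onPath = (n: string): string | undefined =>
    safe<string | undefined>(() => probes.findOnPath(n), undefined);

  // Binary: an explicit setting wins, otherwise look for whisper-cli, then whisper.
  let binPath: string | undefined;
  const configuredBin = voice?.binPath;
  if (configuredBin) {
    if (exists(configuredBin)) binPath = configuredBin;
    else problems.push(`voice.binPath does not exist: ${configuredBin}`);
  } else {
    binPath = onPath("whisper-cli") ?? onPath("whisper");
    if (!binPath) problems.push("no whisper-cli or whisper binary found on PATH");
  }

  // Model: an explicit setting wins, otherwise the documented default under home.
  const candidate = voice?.modelPath ?? `${home}/.deepcode/models/whisper-base.en.bin`;
  const modelPath = exists(candidate) ? candidate : undefined;
  if (!modelPath) problems.push(`model file not found: ${candidate}`);

  return { ready: problems.length === 0, binPath, modelPath, problems };
}
```

In a test, the probes can be plain closures over a fixed set of paths, which is what makes the detection cases deterministic.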

CLI

  • /voice reports readiness (binary + model paths) or prints setup instructions with a per-issue breakdown; /voice setup always shows install steps.
  • SessionContext gains an optional home (honors --home) for the default model-path probe; wired in the REPL.
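As a rough illustration of the two CLI behaviours above, the message selection could look like the sketch below. The `VoiceStatus` shape, the `voiceMessage` function, and the exact wording of every line are hypothetical; only the settings keys, binary names, and default model path are taken from this PR.

```typescript
// Hypothetical status shape; assumed to mirror whatever detectVoice() returns.
interface VoiceStatus {
  binPath: string | undefined;
  modelPath: string | undefined;
  problems: string[];
}

// Setup text built only from facts stated in this PR; the wording is illustrative.
const SETUP_STEPS: string[] = [
  "1. Install whisper.cpp so that whisper-cli (or whisper) is on PATH, or set voice.binPath.",
  "2. Place a model at ~/.deepcode/models/whisper-base.en.bin, or set voice.modelPath.",
];

// `/voice setup` always prints the steps; bare `/voice` reports readiness,
// or lists each problem followed by the same steps.
function voiceMessage(status: VoiceStatus, subcommand?: string): string {
  const steps = ["Voice setup:"].concat(SETUP_STEPS).join("\n");
  if (subcommand === "setup") return steps;

  if (status.problems.length === 0) {
    return [
      "Voice is ready.",
      `  binary: ${status.binPath}`,
      `  model:  ${status.modelPath}`,
    ].join("\n");
  }

  const issues = status.problems.map((p) => `  - ${p}`).join("\n");
  return `Voice is not set up yet:\n${issues}\n\n${steps}`;
}
```

Keeping the formatter a pure function of the detection result means the messaging cases can be tested without a REPL.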

Docs

  • Update the /voice row in BEHAVIOR_PARITY.md (✗ → ✓, 🔄 M8 → 🟡 setup/detection; capture noted as a follow-up). The large line churn in that file is prettier re-aligning the table column widths; git diff -w shows only the one semantic row + the separator.

Testing

  • pnpm typecheck — clean
  • core: 652 passed / 16 skipped (9 new detectVoice cases + 1 schema case)
  • cli: 148 passed (3 new /voice messaging cases)
  • pnpm lint — 0 errors; pnpm format:check — clean

Follow-up slices (not in this PR)

  1. CLI mic capture (record → whisper.cpp → insert transcript into REPL input).
  2. Desktop record button + mic permission via a Tauri command (same backend).

🤖 Generated with Claude Code

Surface the existing core whisper.cpp engine via a `/voice` slash command
and add the settings schema for it. No mic capture yet — this is the safe,
self-contained foundation per docs/VOICE_INPUT.md.

Core:
- Add VoiceConfig (provider | binPath | modelPath) to settings types,
  re-exported from @deepcode/core (the JSON schema already had the block).
- New detectVoice() (voice/detect.ts): resolves the whisper binary
  (settings.binPath, else whisper-cli/whisper on PATH) and the model
  (settings.modelPath, else ~/.deepcode/models/whisper-base.en.bin),
  never throws — missing pieces become `problems`. Injectable probes for
  deterministic tests.
- validateSettingsShallow now flags an unknown voice.provider.

CLI:
- /voice reports readiness or prints actionable setup steps (+ per-issue
  detail); `/voice setup` always shows install instructions.
- SessionContext gains an optional `home` (honors --home) for the default
  model-path probe; wired in the REPL.

Tests: 9 core detection cases, 1 schema case, 3 CLI messaging cases.
Updates the /voice BEHAVIOR_PARITY row (✗ → ✓, 🔄 → 🟡).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
oratis force-pushed the feat/voice-setup-detect branch from fe7712c to 4a61412 on June 8, 2026 06:29
oratis added a commit that referenced this pull request Jun 8, 2026
…175)

Local, on-device speech-to-text via whisper.cpp — no audio leaves the machine.

- Core: VoiceConfig (binPath/modelPath/provider/inputDevice) + detectVoice();
  existing WhisperCppProvider surfaced. detectRecorder/recordToWav (ffmpeg/sox).
- CLI: /voice records → transcribes → pre-fills the input line to edit;
  /voice setup prints install steps.
- Desktop: 🎙 composer button + Tauri voice_status/start/stop/cancel (ffmpeg,
  stdin-q graceful stop); mic entitlement + NSMicrophoneUsageDescription.
- Docs: docs/VOICE_INPUT.md + BEHAVIOR_PARITY /voice row.

Squashes the three review slices (PRs #173, #174, #175). Real-microphone
round-trip needs manual on-device verification (no audio hardware / Rust in CI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
oratis (Owner, Author) commented Jun 8, 2026

✅ Rolled into main via #175 (squash-merged — contains all three voice slices: setup/detection + CLI capture + desktop 🎙). Closing as merged.

oratis closed this Jun 8, 2026
oratis deleted the feat/voice-setup-detect branch June 8, 2026 06:43