feat(voice): /voice setup check + whisper.cpp detection (slice 1) #173
Closed
oratis wants to merge 1 commit into
This was referenced Jun 8, 2026
Surface the existing core whisper.cpp engine via a `/voice` slash command and add the settings schema for it. No mic capture yet; this is the safe, self-contained foundation per docs/VOICE_INPUT.md.

Core:
- Add VoiceConfig (provider | binPath | modelPath) to settings types, re-exported from @deepcode/core (the JSON schema already had the block).
- New detectVoice() (voice/detect.ts): resolves the whisper binary (settings.binPath, else whisper-cli/whisper on PATH) and the model (settings.modelPath, else ~/.deepcode/models/whisper-base.en.bin). Never throws: missing pieces become `problems`. Injectable probes for deterministic tests.
- validateSettingsShallow now flags an unknown voice.provider.

CLI:
- /voice reports readiness or prints actionable setup steps (+ per-issue detail); `/voice setup` always shows install instructions.
- SessionContext gains an optional `home` (honors --home) for the default model-path probe; wired in the REPL.

Tests: 9 core detection cases, 1 schema case, 3 CLI messaging cases.

Updates the /voice BEHAVIOR_PARITY row (✗ → ✓, 🔄 → 🟡).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
fe7712c to 4a61412
oratis added a commit that referenced this pull request Jun 8, 2026
…175)

Local, on-device speech-to-text via whisper.cpp; no audio leaves the machine.

- Core: VoiceConfig (binPath/modelPath/provider/inputDevice) + detectVoice(); existing WhisperCppProvider surfaced. detectRecorder/recordToWav (ffmpeg/sox).
- CLI: /voice records → transcribes → pre-fills the input line to edit; /voice setup prints install steps.
- Desktop: 🎙 composer button + Tauri voice_status/start/stop/cancel (ffmpeg, stdin-q graceful stop); mic entitlement + NSMicrophoneUsageDescription.
- Docs: docs/VOICE_INPUT.md + BEHAVIOR_PARITY /voice row.

Squashes the three review slices (PRs #173, #174, #175). Real-microphone round-trip needs manual on-device verification (no audio hardware / Rust in CI).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
✅ Rolled into |
Summary
First slice of `/voice` (local speech-to-text, M8 in `docs/BEHAVIOR_PARITY.md`). This is the safe, self-contained foundation: a `/voice` slash command that detects whether whisper.cpp and a model are installed and configured, and prints actionable setup steps. No mic capture yet; recording and transcription wiring land in follow-up slices.

Follows the decisions in `docs/VOICE_INPUT.md` (settings shape `voice.provider` / `voice.binPath` / `voice.modelPath`; fully local, no audio leaves the machine).

The core `WhisperCppProvider` / `parseWhisperOutput` engine already existed and was tested; it just wasn't surfaced anywhere. This slice exposes it via config and a command.

Changes
Core (`@deepcode/core`)
- Add `VoiceConfig` (`provider` | `binPath` | `modelPath`) to settings types, re-exported from the package root. (The JSON schema already had the `voice` block; the TS type was missing.)
- New `detectVoice()` (`voice/detect.ts`): resolves the whisper binary (`settings.voice.binPath`, else `whisper-cli` / `whisper` on PATH) and the model (`settings.voice.modelPath`, else the documented default `~/.deepcode/models/whisper-base.en.bin`). Never throws: missing or invalid pieces become a `problems[]` list. Filesystem and PATH probes are injectable for deterministic tests.
- `validateSettingsShallow` now flags an unknown `voice.provider`.

CLI
- `/voice` reports readiness (binary + model paths) or prints setup instructions with a per-issue breakdown; `/voice setup` always shows install steps.
- `SessionContext` gains an optional `home` (honors `--home`) for the default model-path probe; wired in the REPL.

Docs
- Updates the `/voice` row in `BEHAVIOR_PARITY.md` (✗ → ✓, 🔄 M8 → 🟡 setup/detection; capture noted as a follow-up). The large line churn in that file is prettier re-aligning the table column widths; `git diff -w` shows only the one semantic row + the separator.

Testing
- `pnpm typecheck`: clean
- Core tests (9 `detectVoice` cases + 1 schema case)
- CLI tests (3 `/voice` messaging cases)
- `pnpm lint`: 0 errors; `pnpm format:check`: clean

Follow-up slices (not in this PR)
🤖 Generated with Claude Code
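
For reviewers who want a concrete picture of the detection behaviour described under Core, here is a minimal sketch, not the code in `voice/detect.ts`. The `VoiceConfig` field names, the binary names, and the default model path come from the description above; the `VoiceProbes` and `VoiceStatus` shapes, the `home` parameter on `detectVoice`, the `safe` helper, and the wording of each problem string are assumptions made for illustration.

```typescript
// Hypothetical sketch of the resolution order described in this PR.
// Only the VoiceConfig field names and the default paths come from the PR text.
interface VoiceConfig {
  provider?: string;
  binPath?: string;
  modelPath?: string;
}

// Assumed shape: injectable probes so tests never touch the real filesystem or PATH.
interface VoiceProbes {
  fileExists(path: string): boolean;
  findOnPath(name: string): string | undefined;
}

// Assumed shape of the result; the PR only states that failures become a problems list.
interface VoiceStatus {
  ready: boolean;
  binPath?: string;
  modelPath?: string;
  problems: string[];
}

// Run a probe and fall back instead of throwing, so detection itself never throws.
function safe<T>(fn: () => T, fallback: T): T {
  try {
    return fn();
  } catch {
    return fallback;
  }
}

function detectVoice(
  voice: VoiceConfig | undefined,
  home: string,
  probes: VoiceProbes,
): VoiceStatus {
  const problems: string[] = [];

  // Binary: an explicit setting wins; otherwise try whisper-cli, then whisper, on PATH.
  let binPath: string | undefined;
  const configuredBin = voice?.binPath;
  if (configuredBin) {
    if (safe(() => probes.fileExists(configuredBin), false)) {
      binPath = configuredBin;
    } else {
      problems.push(`voice.binPath not found: ${configuredBin}`);
    }
  } else {
    binPath =
      safe(() => probes.findOnPath("whisper-cli"), undefined) ??
      safe(() => probes.findOnPath("whisper"), undefined);
    if (!binPath) problems.push("no whisper-cli or whisper binary on PATH");
  }

  // Model: an explicit setting wins; otherwise the documented default under the home dir.
  // Plain string concatenation keeps the sketch dependency-free.
  const candidate = voice?.modelPath ?? `${home}/.deepcode/models/whisper-base.en.bin`;
  let modelPath: string | undefined;
  if (safe(() => probes.fileExists(candidate), false)) {
    modelPath = candidate;
  } else {
    problems.push(`model not found: ${candidate}`);
  }

  return { ready: problems.length === 0, binPath, modelPath, problems };
}

// Usage with fake probes: only `whisper` is on PATH and the default model exists.
const fakeProbes: VoiceProbes = {
  fileExists: (p) => p === "/home/u/.deepcode/models/whisper-base.en.bin",
  findOnPath: (name) => (name === "whisper" ? "/usr/bin/whisper" : undefined),
};
const status = detectVoice(undefined, "/home/u", fakeProbes);
console.log(status.ready, status.binPath, status.problems.length);
```

Passing the probes in as an argument is what lets tests cover each missing-piece combination deterministically; a real caller would presumably supply filesystem-backed probes and the `--home` value.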