| 1 | +# Voice Providers — Field Cheat-Sheet |
| 2 | + |
| 3 | +The `voice` block on an assistant, or `membersOverrides.voice` on a squad, is **provider-specific**. The same conceptual field (e.g. "speed") lives at a different path depending on the provider. The Vapi platform rejects misplaced fields with a generic `property X should not exist` 400 and does not point to the correct path. This page is the lookup table. |
| 4 | + |
| 5 | +> **When a 400 says "property X should not exist":** check this page for the provider's field layout before re-pushing. The engine has no schema awareness and will accept whatever you write, then surface the error only after the push reaches the API. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## Quick lookup |
| 10 | + |
| 11 | +| Field | 11labs | Cartesia (sonic-3) | OpenAI / Azure / Rime / LMNT / Minimax / Neuphonic / SmallestAI | |
| 12 | +|-------|--------|---------------------|------------------------------------------------------------------| |
| 13 | +| Speech rate | `voice.speed` (0.7–1.2) | `voice.generationConfig.speed` (0.6–1.5) | `voice.speed` | |
| 14 | +| Stability / consistency | `voice.stability` (0.0–1.0) | — (not exposed) | — | |
| 15 | +| Voice similarity | `voice.similarityBoost` (0.0–1.0) | — | — | |
| 16 | +| SSML parsing | `voice.enableSsmlParsing: true` | (parsed natively, no flag) | varies — see provider docs | |
| 17 | +| Pronunciation dictionary | — | `voice.pronunciationDictId` | — | |
| 18 | +| Volume control | — | `voice.generationConfig.volume` (0.5–2.0) | — | |
| 19 | +| Emotion / accent (experimental) | — | `voice.experimentalControls.emotion`, `voice.experimentalControls.speed` (-1 to 1, older API) | — | |
| 20 | + |
| 21 | +--- |
| 22 | + |
| 23 | +## 11labs |
| 24 | + |
| 25 | +```yaml |
| 26 | +voice: |
| 27 | + provider: 11labs |
| 28 | + voiceId: <uuid-or-name> |
| 29 | + model: eleven_turbo_v2 # or eleven_flash_v2_5 |
| 30 | + speed: 1.05 # 0.7–1.2 |
| 31 | + stability: 0.6 # 0.0–1.0; higher = less expressive variation |
| 32 | + similarityBoost: 0.75 # 0.0–1.0; higher = closer to source voice |
| 33 | + enableSsmlParsing: true # required for `<break>`, `<flush/>`, etc. |
| 34 | +``` |
| 35 | + |
| 36 | +Common pitfalls: |
| 37 | +- `voice.generationConfig.*` — **does not exist** for 11labs. That's a Cartesia path. Push will 400. |
| 38 | +- Forgetting `enableSsmlParsing: true` — SSML tags will be spoken literally. |
| 39 | + |
| 40 | +--- |
| 41 | + |
| 42 | +## Cartesia (sonic-3) |
| 43 | + |
| 44 | +```yaml |
| 45 | +voice: |
| 46 | + provider: cartesia |
| 47 | + model: sonic-3 |
| 48 | + voiceId: <uuid> |
| 49 | + pronunciationDictId: pdict_<id> # optional but sticky — see warning below |
| 50 | + generationConfig: |
| 51 | + speed: 1.1 # 0.6–1.5 |
| 52 | + volume: 1.0 # 0.5–2.0 |
| 53 | + experimentalControls: |
| 54 | + speed: 0.0 # -1 to 1 (older API path) |
| 55 | + emotion: ["positivity:high"] |
| 56 | +``` |
| 57 | + |
| 58 | +**Forbidden at top level for Cartesia (will 400):** |
| 59 | +- `voice.speed` — use `voice.generationConfig.speed` instead. |
| 60 | +- `voice.enableSsmlParsing` — Cartesia parses SSML (`<break time='0.4s'/>`, `<speed ratio='0.9'/>`) natively from the text stream; no opt-in flag exists. |
| 61 | +- `voice.stability`, `voice.similarityBoost` — those are 11labs fields. |
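
As an illustration of the list above, here is a sketch of a Cartesia block carrying each forbidden top-level field, followed by a version that keeps only the paths this page lists for Cartesia. The values are placeholders picked for the example, not recommendations, and the exact error text the API returns for each field is not shown because it is not confirmed here.

```yaml
# Rejected (sketch): 11labs-style fields on a Cartesia voice.
voice:
  provider: cartesia
  model: sonic-3
  voiceId: <uuid>
  speed: 1.1                # misplaced: belongs under generationConfig
  enableSsmlParsing: true   # misplaced: Cartesia has no opt-in SSML flag
  stability: 0.6            # misplaced: 11labs field
  similarityBoost: 0.75     # misplaced: 11labs field
---
# Accepted layout (sketch): speed moved, the other three fields dropped.
voice:
  provider: cartesia
  model: sonic-3
  voiceId: <uuid>
  generationConfig:
    speed: 1.1              # 0.6–1.5
```

Note that three of the four fields have no Cartesia equivalent on this page, so the fix is to delete them rather than relocate them.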
| 62 | + |
| 63 | +**Pronunciation dictionary warning:** changing the `voiceId` in the Vapi dashboard's voice picker silently drops `pronunciationDictId` from the resource. If you swap the Cartesia voice via the dashboard, re-attach the dictionary on the next pull or it will be gone. Treat `(voiceId, pronunciationDictId)` as one atomic unit during edits. |
| 64 | + |
| 65 | +--- |
| 66 | + |
| 67 | +## OpenAI / Azure / Rime / LMNT / Minimax / Neuphonic / SmallestAI |
| 68 | + |
| 69 | +```yaml |
| 70 | +voice: |
| 71 | + provider: openai # or azure, rime, lmnt, minimax, neuphonic, smallestai |
| 72 | + voiceId: <provider-voice-id> |
| 73 | + model: <provider-model> # e.g. tts-1-hd for openai |
| 74 | + speed: 1.0 # top-level for these providers |
| 75 | +``` |
| 76 | + |
| 77 | +These providers expose `speed` at the top of the `voice` block. Refer to the [Vapi voice provider docs](https://docs.vapi.ai/providers/voice) for additional provider-specific fields (instructions, language hints, etc.). |
| 78 | + |
| 79 | +--- |
| 80 | + |
| 81 | +## Switching providers |
| 82 | + |
| 83 | +When migrating an assistant or squad member from Cartesia to 11labs (or vice versa), the field layout flips. If you carry over `generationConfig` from a Cartesia config to an 11labs voice, the next push will 400. Always rewrite the voice block from the target provider's template; do not patch in place. |
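
A before/after sketch of a Cartesia to 11labs migration, assuming the field layouts in the templates on this page. The voice IDs are placeholders and the 11labs values are illustrative. The point is that the 11labs block is written fresh from its template, with nothing nested under `generationConfig` or `experimentalControls` carried over.

```yaml
# Before (Cartesia)
voice:
  provider: cartesia
  model: sonic-3
  voiceId: <cartesia-uuid>
  pronunciationDictId: pdict_<id>
  generationConfig:
    speed: 1.1
    volume: 1.0
---
# After (11labs), rewritten from the 11labs template
voice:
  provider: 11labs
  voiceId: <11labs-uuid-or-name>
  model: eleven_turbo_v2
  speed: 1.1                # now top-level; 11labs range is 0.7–1.2
  enableSsmlParsing: true   # needed if the prompts contain SSML tags
```

Two things drop out in this direction: `pronunciationDictId` and `generationConfig.volume` have no 11labs path in the lookup table. The documented speed ranges also differ, so a Cartesia speed above 1.2 or below 0.7 falls outside the 11labs range and should be adjusted during the rewrite.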
| 84 | + |
| 85 | +If a customer changes the provider on the dashboard and your local YAML still has the old nesting, `pull` will overwrite it cleanly — but a subsequent `push` from a stale branch will 400. Pull first, then edit. |
| 86 | + |
| 87 | +--- |
| 88 | + |
| 89 | +## Adding a new provider |
| 90 | + |
| 91 | +If you find yourself reaching for a provider not in the table above, append a row here in the same PR. The cheat-sheet only stays useful if it grows with the platform. |