
Enhance LK Inference STT and TTS options with new parameters and models#4949

Open
russellmartin-livekit wants to merge 3 commits into main from rm/update-options
@russellmartin-livekit (Contributor)

  • Added DeepgramFluxModels to STT models.
  • Introduced DeepgramFluxOptions with various configuration parameters.
  • Updated existing TypedDicts in TTS to include additional options for Cartesia, Elevenlabs, and Rime.
  • Enhanced ElevenlabsOptions with new fields for better control over TTS behavior.

@russellmartin-livekit russellmartin-livekit self-assigned this Feb 25, 2026
@chenghao-mou chenghao-mou requested a review from a team February 25, 2026 18:42
@russellmartin-livekit russellmartin-livekit requested review from a team and removed request for a team February 25, 2026 18:42
@devin-ai-integration devin-ai-integration bot (Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.


@davidzhao davidzhao changed the title from "Enhance STT and TTS options with new parameters and models" to "Enhance LK Inference STT and TTS options with new parameters and models" Feb 26, 2026
max_turn_silence: int # default: not specified
keyterms_prompt: list[str] # default: not specified
language_detection: bool
inactivity_timeout: int # milliseconds
Member

Can we make it pythonic?

E.g., use seconds here, as a float.
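
A minimal sketch of what this suggestion could look like: the option is exposed in seconds as a `float`, and converted to integer milliseconds only at the boundary. The names `AssemblyaiOptionsSketch` and `to_wire_params` are hypothetical, and the assumption that the service still expects milliseconds is taken from the `# milliseconds` comment in the diff, not from the actual implementation.

```python
from typing import TypedDict


class AssemblyaiOptionsSketch(TypedDict, total=False):
    # Hypothetical stand-in for the options dict under review.
    inactivity_timeout: float  # seconds


def to_wire_params(opts: AssemblyaiOptionsSketch) -> dict[str, int]:
    """Convert second-based options to integer milliseconds (assumed wire unit)."""
    params: dict[str, int] = {}
    if "inactivity_timeout" in opts:
        # round() avoids truncation surprises such as 0.29 * 1000 == 290.00000000000006
        params["inactivity_timeout"] = round(opts["inactivity_timeout"] * 1000)
    return params
```

With this shape, callers write `{"inactivity_timeout": 1.5}` and never see the millisecond unit.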

Comment on lines +148 to +159
reduceLatency: bool
# Mistv2-specific
pauseBetweenBrackets: bool
phonemizeBetweenBrackets: bool
inlineSpeedAlpha: str
noTextNormalization: bool
speedAlpha: float
# Arcana-specific
repetition_penalty: float # range 1-2
temperature: float # range 0-1
top_p: float # range 0-1
max_tokens: int # range 200-5000
Member

I'm wondering if we should keep snake_case, and handle the conversions on the server side?
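
To illustrate the kind of conversion being discussed, here is a rough sketch of a snake_case to camelCase key mapping. It is written in Python only for illustration; where the conversion would actually live (client or server) is exactly the open question, and `snake_to_camel` and `to_camel_keys` are hypothetical helpers, not part of the PR.

```python
def snake_to_camel(name: str) -> str:
    """Turn 'pause_between_brackets' into 'pauseBetweenBrackets'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_camel_keys(options: dict[str, object]) -> dict[str, object]:
    """Return a copy of options with every key converted to camelCase."""
    return {snake_to_camel(key): value for key, value in options.items()}
```

Note that a blanket conversion like this would also rewrite keys such as `top_p` and `max_tokens`, which appear in snake_case in the diff above, so a real implementation would likely need an explicit per-field mapping instead.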

…tency

- Changed `inactivity_timeout` type from int to float in AssemblyaiOptions to represent seconds.
- Renamed keys in RimeOptions for consistency: `reduceLatency` to `reduce_latency`, `pauseBetweenBrackets` to `pause_between_brackets`, `phonemizeBetweenBrackets` to `phonemize_between_brackets`, `inlineSpeedAlpha` to `inline_speed_alpha`, and `noTextNormalization` to `no_text_normalization`.
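
For reference, a sketch of how the affected fields might read after this commit. Only the fields named in the commit message are shown; the rest of each TypedDict is omitted, and `total=False` is an assumption about how the options are declared.

```python
from typing import TypedDict


class AssemblyaiOptions(TypedDict, total=False):
    # Partial sketch: other fields omitted.
    inactivity_timeout: float  # seconds (previously int)


class RimeOptions(TypedDict, total=False):
    # Partial sketch: only the renamed keys are listed.
    reduce_latency: bool
    pause_between_brackets: bool
    phonemize_between_brackets: bool
    inline_speed_alpha: str
    no_text_normalization: bool


# Example usage with the snake_case keys.
rime_opts: RimeOptions = {"reduce_latency": True, "no_text_normalization": False}
stt_opts: AssemblyaiOptions = {"inactivity_timeout": 30.0}
```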