feat(google-cloud): add Google Cloud Text-to-Speech plugin by mshivam019 · Pull Request #1671 · livekit/agents-js

mshivam019 · 2026-06-01T10:46:18Z

Summary

- Adds @livekit/agents-plugin-google-cloud using the @google-cloud/text-to-speech client library
- gRPC bidirectional streaming: SynthesizeStream via TextToSpeechClient.streamingSynthesize() with sentence tokenization
- REST one-shot synthesis: ChunkedStream via TextToSpeechClient.synthesizeSpeech() with LINEAR16 WAV -> PCM extraction
- updateOptions() for runtime voice/language/gender changes
- Credentials follow the standard Google Cloud auth chain: credentials object -> keyFilename -> GOOGLE_APPLICATION_CREDENTIALS -> ADC
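
The LINEAR16 WAV -> PCM extraction step can be pictured with a minimal sketch. It assumes the response is a RIFF/WAVE buffer and walks the chunk list until it reaches `data`. The name `extractPcmFromWav` is hypothetical and the code is not taken from the plugin source.

```typescript
// Hypothetical helper for illustration; not the plugin's implementation.
const RIFF = 0x52494646; // 'RIFF'
const WAVE = 0x57415645; // 'WAVE'
const DATA = 0x64617461; // 'data'

function extractPcmFromWav(wav: Uint8Array): Uint8Array {
  const view = new DataView(wav.buffer, wav.byteOffset, wav.byteLength);

  // No RIFF/WAVE header: assume the payload is already raw PCM.
  if (
    wav.byteLength < 12 ||
    view.getUint32(0, false) !== RIFF ||
    view.getUint32(8, false) !== WAVE
  ) {
    return wav;
  }

  // Walk the chunks that follow the 12-byte RIFF header.
  let offset = 12;
  while (offset + 8 <= wav.byteLength) {
    const id = view.getUint32(offset, false); // chunk id, big-endian tag
    const size = view.getUint32(offset + 4, true); // chunk size, little-endian
    const start = offset + 8;
    if (id === DATA) {
      const end = Math.min(start + size, wav.byteLength);
      return wav.subarray(start, end);
    }
    // Chunks are padded to an even number of bytes.
    offset = start + size + (size % 2);
  }
  throw new Error('WAV buffer has no data chunk');
}
```

Scanning for the `data` chunk rather than skipping a fixed 44 bytes keeps the sketch correct when the header carries extra chunks before the audio.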

Why this plugin vs the existing `google.beta.TTS`

The existing google.beta.TTS (Gemini TTS) uses @google/genai which does not support streaming synthesis. This plugin uses the @google-cloud/text-to-speech client which supports gRPC bidirectional streaming — needed for Gemini Flash TTS models like gemini-3.1-flash-tts-preview as well as classic models (Journey, Chirp 3, Standard, WaveNet).

The LiveKit docs show "Available in: [ ] Node.js, [x] Python" — this closes the Node.js gap.

Credentials

import { TTS } from '@livekit/agents-plugin-google-cloud';

// Option 1: credentials object
const ttsFromCredentials = new TTS({
  credentials: { client_email: '...', private_key: '...' },
});

// Option 2: key file path
const ttsFromKeyFile = new TTS({
  keyFilename: '/path/to/service-account.json',
});

// Option 3: GOOGLE_APPLICATION_CREDENTIALS env var or gcloud ADC (auto-detected)
const ttsFromEnvironment = new TTS();
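
To make the precedence of the three options concrete, here is a small sketch of the lookup order described in the summary. `AuthOptions`, `AuthSource`, and `describeAuthSource` are hypothetical names for illustration. In practice the last two steps are typically resolved by Google's auth library rather than by plugin code.

```typescript
// Hypothetical option shape mirroring the options shown above.
interface AuthOptions {
  credentials?: { client_email: string; private_key: string };
  keyFilename?: string;
}

type AuthSource =
  | 'credentials'
  | 'keyFilename'
  | 'GOOGLE_APPLICATION_CREDENTIALS'
  | 'ADC';

// Report which credential source would win, following the documented chain:
// credentials object -> keyFilename -> GOOGLE_APPLICATION_CREDENTIALS -> ADC.
function describeAuthSource(
  opts: AuthOptions,
  env: Record<string, string | undefined>,
): AuthSource {
  if (opts.credentials) return 'credentials';
  if (opts.keyFilename) return 'keyFilename';
  if (env['GOOGLE_APPLICATION_CREDENTIALS']) return 'GOOGLE_APPLICATION_CREDENTIALS';
  return 'ADC';
}
```

A helper like this is mainly useful for startup logging, so a misconfigured deployment shows which source was picked.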

Streaming synthesis

- Uses TextToSpeechClient.streamingSynthesize() for gRPC bidirectional streaming
- Input text is sentence-tokenized via tokenize.basic.SentenceTokenizer
- Audio is decoded via AudioByteStream and emitted as SynthesizedAudio frames
- Abort signal handling cleans up the gRPC call via call.cancel()/call.destroy()
- AudioByteStream.flush() is called on stream end to prevent trailing audio truncation
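
The flush step matters because audio rarely arrives in exact multiples of the frame size. The sketch below shows the idea with a hypothetical `FrameBuffer` class; it is a stand-in for illustration and is not LiveKit's AudioByteStream.

```typescript
// Hypothetical byte-to-frame buffer for illustration only.
class FrameBuffer {
  private readonly frameBytes: number;
  private pending: Uint8Array = new Uint8Array(0);

  constructor(frameBytes: number) {
    if (frameBytes <= 0) throw new Error('frameBytes must be positive');
    this.frameBytes = frameBytes;
  }

  // Append bytes and return every complete frame now available.
  write(chunk: Uint8Array): Uint8Array[] {
    const merged = new Uint8Array(this.pending.length + chunk.length);
    merged.set(this.pending, 0);
    merged.set(chunk, this.pending.length);

    const frames: Uint8Array[] = [];
    let offset = 0;
    while (merged.length - offset >= this.frameBytes) {
      frames.push(merged.slice(offset, offset + this.frameBytes));
      offset += this.frameBytes;
    }
    // Whatever is left is a partial frame that waits for more data.
    this.pending = merged.slice(offset);
    return frames;
  }

  // Return the leftover partial frame (if any) and reset the buffer.
  flush(): Uint8Array[] {
    if (this.pending.length === 0) return [];
    const tail = this.pending;
    this.pending = new Uint8Array(0);
    return [tail];
  }
}
```

Without the final `flush()`, the bytes held in `pending` would never be emitted, which is the trailing truncation the list above refers to.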

Files changed

File	Change
`plugins/google-cloud/package.json`	New package: `@livekit/agents-plugin-google-cloud`
`plugins/google-cloud/src/tts.ts`	`TTS`, `SynthesizeStream`, `ChunkedStream` classes
`plugins/google-cloud/src/models.ts`	`TTSModel`, `TTSGender`, `TTSLanguage` types
`plugins/google-cloud/src/index.ts`	Plugin registration + exports
`plugins/google-cloud/tsconfig.json`	Extends root tsconfig
`plugins/google-cloud/tsup.config.ts`	Extends root tsup config
`plugins/google-cloud/api-extractor.json`	Extends shared API extractor config
`plugins/google-cloud/README.md`	Installation, auth, usage docs
`.changeset/google-cloud-tts-plugin.md`	Changeset entry
`pnpm-lock.yaml`	Lock `@google-cloud/text-to-speech` dependency

Usage

import { TTS } from '@livekit/agents-plugin-google-cloud';

// Streaming synthesis with Gemini Flash TTS
const streamingTts = new TTS({
  modelName: 'gemini-3.1-flash-tts-preview',
  voiceName: 'Zephyr',
  language: 'en-IN',
});

// Non-streaming synthesis with a Standard voice
const standardTts = new TTS({
  language: 'en-US',
  voiceName: 'en-US-Standard-H',
  streaming: false,
});

Verification

- tsc --noEmit: zero errors
- eslint src/**/*.ts: zero errors
- tsup build: CJS + ESM bundles compile cleanly

Add @livekit/agents-plugin-google-cloud using the @google-cloud/text-to-speech client library. Supports both gRPC bidirectional streaming and REST-based synthesis. The existing google.beta.TTS uses @google/genai (Gemini API) which does not support streaming. This plugin uses the Google Cloud TTS client which supports streaming with Gemini Flash TTS models like gemini-3.1-flash-tts-preview, as well as standard models (Journey, Chirp 3, Standard, WaveNet). Credentials follow the standard Google Cloud auth chain: credentials object -> keyFilename -> GOOGLE_APPLICATION_CREDENTIALS -> ADC.

- Remove queue.close() from ChunkedStream finally (base class handles retry)
- Remove tokenizer.close() from SynthesizeStream finally (breaks retry path)
- Skip toLiveKitTtsError wrapping for existing APIConnectionError/APIStatusError
- Fix voiceName type from TTSLanguage to string (semantically misleading)
- Log warning when gender overrides explicit voiceName
- Restore updateOptions method (dropped during squash)
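
The "skip wrapping" item can be sketched as a pass-through check: errors that are already framework errors are returned unchanged, and only unknown failures get wrapped. The classes below are local stand-ins with hypothetical names, not the real APIConnectionError/APIStatusError, and `toTtsError` is not the plugin's toLiveKitTtsError.

```typescript
// Local stand-ins for illustration only.
class ConnectionErrorStandIn extends Error {}

class StatusErrorStandIn extends Error {
  readonly statusCode: number;

  constructor(message: string, statusCode: number) {
    super(message);
    this.statusCode = statusCode;
  }
}

function toTtsError(err: unknown): Error {
  // Already a framework error: return it untouched so its details survive.
  if (err instanceof ConnectionErrorStandIn || err instanceof StatusErrorStandIn) {
    return err;
  }
  // Anything else gets wrapped exactly once.
  const message = err instanceof Error ? err.message : String(err);
  return new ConnectionErrorStandIn(`Google Cloud TTS request failed: ${message}`);
}
```

Returning the original object avoids nesting one error inside another, which would hide the status code from whatever decides on retries.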

changeset-bot · 2026-06-01T11:07:49Z

🦋 Changeset detected

Latest commit: cca2c86

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 35 packages

Name	Type
@livekit/agents-plugin-google-cloud	Patch
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-assemblyai	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-cerebras	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-fishaudio	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-hume	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-lemonslice	Patch
@livekit/agents-plugin-liveavatar	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-minimax	Patch
@livekit/agents-plugin-mistral	Patch
@livekit/agents-plugin-mistralai	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-perplexity	Patch
@livekit/agents-plugin-phonic	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-runway	Patch
@livekit/agents-plugin-sarvam	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugin-soniox	Patch
@livekit/agents-plugin-tavus	Patch
@livekit/agents-plugins-test	Patch
@livekit/agents-plugin-trugen	Patch
@livekit/agents-plugin-xai	Patch


Avoid wrapping existing API errors, prevent streaming audio flush after gRPC errors, and ensure streaming call/tokenizer cleanup does not interfere with retry handling. Resolve the pnpm lockfile conflict after rebasing on main.

Use the Google gax cancellable unary call for synthesizeSpeech so aborting a ChunkedStream cancels the in-flight RPC. Pass the connection timeout through CallOptions and add updateOptions warnings for gender-derived Standard voices overriding voice selection.

Reject pending streaming writes when the gRPC stream closes before drain, and destroy failed streaming calls with an error so concurrent tasks settle during cleanup. Treat DEADLINE_EXCEEDED as retryable for Google Cloud TTS errors.

Treat gax CANCELLED rejections as normal ChunkedStream aborts when the abort signal is set. Also document that Google Cloud TTS numeric provider errors are gRPC status codes with explicit retryability.
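
A sketch of how numeric gRPC status codes might map to retry decisions is shown below. The numeric values are the standard gRPC status codes. Only DEADLINE_EXCEEDED being retryable and CANCELLED being a normal abort come from the commit messages; the rest of the retryable set, and the names `isRetryable` and `isNormalAbort`, are assumptions for illustration.

```typescript
// Standard gRPC status code values (subset).
const GrpcStatus = {
  CANCELLED: 1,
  INVALID_ARGUMENT: 3,
  DEADLINE_EXCEEDED: 4,
  RESOURCE_EXHAUSTED: 8,
  INTERNAL: 13,
  UNAVAILABLE: 14,
  UNAUTHENTICATED: 16,
} as const;

// ASSUMPTION: only DEADLINE_EXCEEDED is confirmed by the commit message;
// the other entries are an illustrative guess at a reasonable retry set.
const RETRYABLE_CODES: ReadonlySet<number> = new Set<number>([
  GrpcStatus.DEADLINE_EXCEEDED,
  GrpcStatus.RESOURCE_EXHAUSTED,
  GrpcStatus.INTERNAL,
  GrpcStatus.UNAVAILABLE,
]);

function isRetryable(code: number): boolean {
  return RETRYABLE_CODES.has(code);
}

// CANCELLED counts as a normal abort only when the caller asked for it.
function isNormalAbort(code: number, aborted: boolean): boolean {
  return code === GrpcStatus.CANCELLED && aborted;
}
```

Keeping the set explicit makes the retry policy reviewable in one place instead of scattering code comparisons through the error handler.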

Keep a no-op error listener attached when destroying a failed Google Cloud streaming call with an error. Node streams may emit destroy errors asynchronously, so removing the listener immediately can still produce an unhandled error.
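
The listener behaviour can be reproduced with plain Node streams. This sketch uses a `PassThrough` in place of the gRPC streaming call, and `destroyWithError` is a hypothetical helper name.

```typescript
import { Duplex, PassThrough } from 'node:stream';

// Hypothetical helper: tear down a failed stream without risking an
// unhandled 'error' event.
function destroyWithError(stream: Duplex, error: Error): void {
  // destroy(error) emits 'error' on a later tick, so the listener must
  // still be attached at that point. It is deliberately left in place.
  stream.on('error', () => {});
  stream.destroy(error);
}

// Usage: a PassThrough stands in for the streaming call.
const failedCall = new PassThrough();
destroyWithError(failedCall, new Error('stream failed'));
```

An `'error'` event with no listener is thrown by Node, so leaving the no-op listener attached is the cheap way to keep teardown from crashing the process.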

Abort per-attempt tokenization instead of closing the shared SynthesizeStream input queue. This lets cleanup settle without poisoning the base retry path. Add abortable AsyncIterableQueue.next support for the plugin cleanup path.
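
The idea of an abortable `next` can be sketched as follows: aborting rejects only the pending read, while the queue and anything pushed to it later stay usable for the next attempt. `AbortableQueue` is a hypothetical class for illustration and is not the AsyncIterableQueue from the agents package.

```typescript
// Hypothetical queue for illustration only.
class AbortableQueue<T> {
  private items: T[] = [];
  private waiters: Array<(item: T) => void> = [];

  put(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) waiter(item);
    else this.items.push(item);
  }

  // Resolve with the next item, or reject if `signal` aborts first.
  // Aborting rejects this call only; the queue itself is not closed.
  next(signal: AbortSignal): Promise<T> {
    if (signal.aborted) return Promise.reject(new Error('aborted'));
    if (this.items.length > 0) return Promise.resolve(this.items.shift() as T);

    return new Promise<T>((resolve, reject) => {
      const waiter = (item: T): void => {
        signal.removeEventListener('abort', onAbort);
        resolve(item);
      };
      const onAbort = (): void => {
        // Drop this waiter so a later put() is not consumed by a dead read.
        this.waiters = this.waiters.filter((w) => w !== waiter);
        reject(new Error('aborted'));
      };
      signal.addEventListener('abort', onAbort, { once: true });
      this.waiters.push(waiter);
    });
  }
}
```

Each attempt would pass its own `AbortController` signal, so cancelling one attempt leaves the shared input intact for the retry.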

mshivam019 added 4 commits June 2, 2026 10:27

fix(google-cloud): harden TTS error cleanup

65d5b5c

fix(google-cloud): unblock streaming shutdown

5e9e7a3

fix(google-cloud): ignore unary TTS cancellation

bf872c8

fix(google-cloud): preserve streaming input on retry

5b8041f

fix(google-cloud): settle streaming abort cleanup

cca2c86

mshivam019 wants to merge 9 commits into livekit:main from mshivam019:feat/google-cloud-tts
