feat(d-id): add D-ID avatar plugin (#1) #1670
Open
osimhi213 wants to merge 3 commits into
* feat(d-id): add D-ID avatar plugin
Ports the Python livekit-plugins-did plugin to the Node SDK. The new
@livekit/agents-plugin-did package dispatches a D-ID v4 (expressive)
avatar worker into a LiveKit room via POST /v2/agents/{agent_id}/sessions/join,
mints a LiveKit JWT for the avatar participant (kind=agent,
publish_on_behalf=<local>), and wires the agent's audio output to a
voice.DataStreamAudioOutput targeting the avatar identity. Audio sample
rate is configurable (16k/24k/48k) via AudioConfig, matching the Python
plugin's API.
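The dispatch inputs described above can be sketched roughly as follows. This is a minimal illustration and not the plugin's code: `joinSessionUrl`, `AvatarTokenSpec`, and `describeAvatarToken` are hypothetical names, the example host and identities are made up, and the plain-object token shape is an assumption. A real implementation would mint the JWT with the LiveKit server SDK rather than build a descriptor.

```typescript
// Hypothetical sketch of the dispatch inputs; names and shapes are assumptions.

/** Build the join endpoint path named in the commit message. */
function joinSessionUrl(baseUrl: string, agentId: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${trimmed}/v2/agents/${encodeURIComponent(agentId)}/sessions/join`;
}

/** Plain-data stand-in for what the avatar participant's JWT is said to carry. */
interface AvatarTokenSpec {
  identity: string; // avatar participant identity
  kind: 'agent'; // participant kind, per the commit message
  publishOnBehalf: string; // identity of the local agent participant
  room: string; // room the avatar worker joins
}

function describeAvatarToken(
  room: string,
  localIdentity: string,
  avatarIdentity: string,
): AvatarTokenSpec {
  return { identity: avatarIdentity, kind: 'agent', publishOnBehalf: localIdentity, room };
}

// Example values only; none of these come from the PR.
const url = joinSessionUrl('https://api.example.invalid/', 'agt_123');
const spec = describeAvatarToken('demo-room', 'agent-local', 'did-avatar');
console.log(url, spec);
```

Keeping the URL and token description as pure functions makes them easy to unit test without touching the network or a LiveKit room.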
Plugin mirrors plugins/tavus structurally (closest behavioral analog):
- src/avatar.ts: AvatarSession extends voice.AvatarSession
- src/api.ts: DIDAPI with fetch + AbortSignal.timeout + intervalForRetry
- src/types.ts: AudioConfig (D-ID-specific addition)
- src/{log,index}.ts + package/tsup/api-extractor configs
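The request pattern listed for src/api.ts (fetch with a per-attempt AbortSignal.timeout and a retry interval) might look something like the sketch below. It is not the plugin's DIDAPI class: `postJsonWithRetry` and `retryDelayMs` are hypothetical names, the capped exponential backoff is a stand-in for the SDK's `intervalForRetry` helper, and the caller supplies whatever auth headers the service expects.

```typescript
// Hypothetical sketch; retryDelayMs is a stand-in for the SDK's intervalForRetry.

type FetchLike = (url: string, init?: RequestInit) => Promise<Response>;

/** Capped exponential backoff (an assumption, not the SDK's actual schedule). */
function retryDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

/** POST a JSON body, aborting each slow attempt and retrying on failure. */
async function postJsonWithRetry(
  url: string,
  body: unknown,
  headers: Record<string, string>,
  maxRetries = 3,
  timeoutMs = 10_000,
  fetchImpl: FetchLike = fetch,
): Promise<unknown> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const res = await fetchImpl(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json', ...headers },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(timeoutMs), // aborts only this attempt
      });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      return await res.json();
    } catch (err) {
      // For brevity every failure is retried; real code may skip 4xx responses.
      lastError = err;
      if (attempt < maxRetries) {
        await new Promise((resolve) => setTimeout(resolve, retryDelayMs(attempt)));
      }
    }
  }
  throw lastError;
}
```

Passing `fetchImpl` as a parameter is a design choice for testability: a test can inject a fake that fails a fixed number of times before succeeding.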
Workspace wiring:
- examples/package.json: add @livekit/agents-plugin-did dep
- examples/tsconfig.json: paths: {} so tsx resolves plugins via built
dist at runtime (otherwise __PACKAGE_VERSION__ is undefined)
- turbo.json: declare DID_API_KEY / DID_API_URL / DID_AGENT_ID
- examples/src/did_avatar.ts: realtime LLM example mirroring bey_avatar
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore(examples): drop tsconfig paths override
Restore examples/tsconfig.json to upstream main. The previously added
`paths: {}` reset (so tsx resolves @livekit/agents-* via built dist at
runtime) shouldn't be bundled with the D-ID plugin port — it's a
workspace-wide runtime concern, not plugin-specific.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore(d-id): drop checked-in api-extractor baseline
Other avatar plugins (tavus, anam, bey, hedra, lemonslice, runway,
liveavatar, trugen) don't ship etc/<name>.api.md baselines, and the
root .gitignore *.md rule explicitly excludes them. Match the
established convention.
The file is still produced locally by `pnpm api:update` for the dev
loop; it's just untracked.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test(d-id): verify super.start is called first
Mirrors plugins/anam/src/avatar.test.ts. Asserts that
AvatarSession.start delegates to voice.AvatarSession.prototype.start
before running any plugin-specific setup, so shutdown callbacks and the
audio-output replacement warning remain wired by the base class.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
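The ordering property that this test checks can be illustrated with a small stand-in. The classes below are simplified and synchronous, and their names are illustrative; the real start methods are asynchronous, and the actual test presumably observes voice.AvatarSession.prototype.start with a test-framework spy rather than a shared array.

```typescript
// Simplified, synchronous stand-ins for the real classes; names are illustrative.
const calls: string[] = [];

class BaseAvatarSession {
  start(): void {
    // The real base class is described as wiring shutdown callbacks and the
    // audio-output replacement warning at this point.
    calls.push('base.start');
  }
}

class PluginAvatarSession extends BaseAvatarSession {
  start(): void {
    super.start(); // must run before any plugin-specific setup
    calls.push('plugin.setup'); // stand-in for dispatching the avatar worker
  }
}

new PluginAvatarSession().start();

// The assertion the test cares about: the base call is recorded first.
const superCalledFirst = calls[0] === 'base.start' && calls[1] === 'plugin.setup';
console.log(superCalledFirst);
```

If a subclass forgot the `super.start()` call, `calls[0]` would be `'plugin.setup'` and the check would fail, which is the regression the test guards against.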
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
🦋 Changeset detected. Latest commit: d359991. The changes in this PR will be included in the next version bump. This PR includes changesets to release 35 packages.
Regenerated via `pnpm changeset` to follow the documented workflow. Bump type changed from minor to patch to match how upstream merges similar PRs (inworld-stt-voice-profiling, cartesia-ink-2-stt, lift-google-realtime-out-of-beta).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* upstream/main: (26 commits)
  fix(voice): align session.start recording with Python primary-session semantics (livekit#1704)
  docs: add TcpSessionTransport to Remote Sessions section (livekit#1703)
  (format): remove whitespace (livekit#1705)
  feat(realtime): add reasoning configuration for gpt-realtime-2 models (livekit#1575)
  feat(voice): support granular RecordingOptions in session.start (livekit#1702)
  fix(inference): guard agent sid header (livekit#1700)
  feat(voice): add TcpSessionTransport and updateIo session handler (livekit#1693)
  fix(amd): defer SIP listening until answer (livekit#1639)
  fix(job): close RecorderIO at session end (livekit#1682)
  docs: fix incorrect inference model file reference (livekit#1685)
  docs: update cartesia plugin capabilities for STT support (livekit#1686)
  feat(inference): add agent ID header to inference requests (livekit#1687)
  fix(soniox): exclude test files from dist build (livekit#1689)
  Version Packages (livekit#1683)
  fix(recorder): prevent close hang (livekit#1684)
  docs(cartesia): add cartesia stt to readme (livekit#1681)
  fix(inworld): harden TTS connection layer and default to inworld-tts-2 (livekit#1675)
  feat(inference): enhance schemas and models for TTS and STT (livekit#1680)
  fix(llm): make ToolOptions.abortSignal required (livekit#1678)
  chore(deps): update dependency vitest to v4.1.0 [security] (livekit#1673)
  ...
Description
New plugin @livekit/agents-plugin-did: ports the Python livekit-plugins-did to Node. Dispatches a D-ID v4 (expressive) avatar worker into a LiveKit room and routes the agent's audio to it via voice.DataStreamAudioOutput.
Changes Made
- plugins/did/: new package
- examples/src/did_avatar.ts
- examples/package.json, DID_* env vars in turbo.json, changeset
Pre-Review Checklist
- pnpm build / lint / api:check / vitest all green
Testing
- plugins/did/src/avatar.test.ts added; all tests pass
- restaurant_agent.ts / realtime_agent.ts: N/A; additive
Additional Notes
AudioConfig exposes the sample rate (16k / 24k / 48k, default 24k), matching the Python plugin.
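As a rough illustration of the sample-rate option, a config type restricted to the three supported rates with a 24k default could be sketched like this. The field name `sampleRate`, the constant `DEFAULT_SAMPLE_RATE`, and the `resolveSampleRate` helper are assumptions made for the example; the plugin's actual AudioConfig may be shaped differently.

```typescript
// Hypothetical shape; field and helper names are assumptions, not the plugin's API.

/** The three rates the PR says are supported, in Hz. */
type SupportedSampleRate = 16000 | 24000 | 48000;

interface AudioConfig {
  sampleRate?: SupportedSampleRate;
}

/** Default stated in the PR notes (24k). */
const DEFAULT_SAMPLE_RATE: SupportedSampleRate = 24000;

/** Fall back to the default when the caller does not pick a rate. */
function resolveSampleRate(config: AudioConfig = {}): SupportedSampleRate {
  return config.sampleRate ?? DEFAULT_SAMPLE_RATE;
}

console.log(resolveSampleRate()); // default rate
console.log(resolveSampleRate({ sampleRate: 48000 })); // explicit rate
```

Using a literal union rather than `number` lets the compiler reject unsupported rates such as 44100 at the call site.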