feat(d-id): add D-ID avatar plugin (#1) #1670

Open
osimhi213 wants to merge 3 commits into livekit:main from de-id:main

@osimhi213 osimhi213 commented Jun 1, 2026

Description

New plugin @livekit/agents-plugin-did — ports the Python
livekit-plugins-did to Node. Dispatches a D-ID v4 (expressive)
avatar worker into a LiveKit room and routes the agent's audio to it via voice.DataStreamAudioOutput.

const avatar = new did.AvatarSession({ agentId: process.env.DID_AGENT_ID });
await avatar.start(session, ctx.room);

Changes Made

  • plugins/did/ — new package
  • examples/src/did_avatar.ts
  • Workspace wiring: dep in examples/package.json, DID_* env vars in turbo.json, changeset

Pre-Review Checklist

Testing

  • plugins/did/src/avatar.test.ts added; all tests pass
  • restaurant_agent.ts / realtime_agent.ts: not applicable; this change is purely additive

Additional Notes

  • AudioConfig exposes the sample rate (16k / 24k / 48k, default 24k), matching the Python plugin.
  • v4 (expressive) D-ID agents only.
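
For reviewers, the sample-rate note above can be pictured roughly as follows. This is a minimal sketch, not the plugin's code: the `AudioConfig` field name `sampleRate` and the helper `resolveSampleRate` are assumptions made for illustration, and only the three allowed rates and the 24k default come from this PR.

```typescript
// Hypothetical sketch: field and helper names are illustrative assumptions,
// not the actual exports of @livekit/agents-plugin-did.
type SampleRate = 16000 | 24000 | 48000;

interface AudioConfig {
  sampleRate?: number; // assumed field name
}

const SUPPORTED_SAMPLE_RATES: SampleRate[] = [16000, 24000, 48000];
const DEFAULT_SAMPLE_RATE: SampleRate = 24000;

// Returns the configured rate, falling back to the default, and rejects
// anything outside the supported set.
function resolveSampleRate(config: AudioConfig = {}): SampleRate {
  const rate = config.sampleRate ?? DEFAULT_SAMPLE_RATE;
  if (SUPPORTED_SAMPLE_RATES.indexOf(rate as SampleRate) === -1) {
    throw new RangeError(
      `Unsupported sample rate ${rate}; expected one of ${SUPPORTED_SAMPLE_RATES.join(', ')}`,
    );
  }
  return rate as SampleRate;
}
```

Validating early like this would surface a bad rate at construction time rather than as distorted audio on the avatar side; whether the plugin validates or simply passes the value through is not stated here.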

* feat(d-id): add D-ID avatar plugin

Ports the Python livekit-plugins-did plugin to the Node SDK. The new
@livekit/agents-plugin-did package dispatches a D-ID v4 (expressive)
avatar worker into a LiveKit room via POST /v2/agents/{agent_id}/sessions/join,
mints a LiveKit JWT for the avatar participant (kind=agent,
publish_on_behalf=<local>), and wires the agent's audio output to a
voice.DataStreamAudioOutput targeting the avatar identity. Audio sample
rate is configurable (16k/24k/48k) via AudioConfig, matching the Python
plugin's API.
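
As a rough illustration of the dispatch step described above, the sketch below builds the join endpoint URL and lists the claims the avatar token is said to carry. It assumes helper names (`buildJoinUrl`, `describeAvatarToken`) and the attribute key `lk.publish_on_behalf`, none of which are confirmed by this PR text; the request body and authentication are deliberately left out because they are not described here.

```typescript
// Sketch only: helper names and the attribute key are assumptions for
// illustration, not the plugin's actual implementation.
const PUBLISH_ON_BEHALF_ATTRIBUTE = 'lk.publish_on_behalf'; // assumed key

// Builds the URL for POST /v2/agents/{agent_id}/sessions/join.
function buildJoinUrl(apiUrl: string, agentId: string): string {
  const base = apiUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${base}/v2/agents/${encodeURIComponent(agentId)}/sessions/join`;
}

interface AvatarTokenClaims {
  identity: string;
  kind: 'agent';
  room: string;
  attributes: { [key: string]: string };
}

// Describes the claims for the avatar participant's token:
// kind=agent, publishing on behalf of the local agent participant.
function describeAvatarToken(
  avatarIdentity: string,
  localIdentity: string,
  roomName: string,
): AvatarTokenClaims {
  return {
    identity: avatarIdentity,
    kind: 'agent',
    room: roomName,
    attributes: { [PUBLISH_ON_BEHALF_ATTRIBUTE]: localIdentity },
  };
}
```

In the real plugin the token would be signed with the LiveKit API key and secret; this sketch stops at the unsigned claim set so the shape is easy to review.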

Plugin mirrors plugins/tavus structurally (closest behavioral analog):
- src/avatar.ts: AvatarSession extends voice.AvatarSession
- src/api.ts: DIDAPI with fetch + AbortSignal.timeout + intervalForRetry
- src/types.ts: AudioConfig (D-ID-specific addition)
- src/{log,index}.ts + package/tsup/api-extractor configs

Workspace wiring:
- examples/package.json: add @livekit/agents-plugin-did dep
- examples/tsconfig.json: paths: {} so tsx resolves plugins via built
  dist at runtime (otherwise __PACKAGE_VERSION__ is undefined)
- turbo.json: declare DID_API_KEY / DID_API_URL / DID_AGENT_ID
- examples/src/did_avatar.ts: realtime LLM example mirroring bey_avatar

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(examples): drop tsconfig paths override

Restore examples/tsconfig.json to upstream main. The previously added
`paths: {}` reset (so tsx resolves @livekit/agents-* via built dist at
runtime) shouldn't be bundled with the D-ID plugin port — it's a
workspace-wide runtime concern, not plugin-specific.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(d-id): drop checked-in api-extractor baseline

Other avatar plugins (tavus, anam, bey, hedra, lemonslice, runway,
liveavatar, trugen) don't ship etc/<name>.api.md baselines, and the
root .gitignore *.md rule explicitly excludes them. Match the
established convention.

The file is still produced locally by `pnpm api:update` for the dev
loop; it's just untracked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(d-id): verify super.start is called first

Mirrors plugins/anam/src/avatar.test.ts. Asserts that
AvatarSession.start delegates to voice.AvatarSession.prototype.start
before running any plugin-specific setup, so shutdown callbacks and the
audio-output replacement warning remain wired by the base class.
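
The ordering this test checks can be sketched with stand-in classes. The classes below are hypothetical and synchronous for brevity (the real start is presumably async and takes the session and room); they only show the call order being asserted, not the plugin's actual setup.

```typescript
// Hypothetical stand-ins for voice.AvatarSession and the D-ID subclass.
class BaseAvatarSession {
  constructor(protected readonly calls: string[]) {}

  start(): void {
    // In the real base class this is where shutdown callbacks and the
    // audio-output replacement warning are wired.
    this.calls.push('base.start');
  }
}

class DidAvatarSession extends BaseAvatarSession {
  start(): void {
    super.start(); // must run before any plugin-specific setup
    this.calls.push('did.setup');
  }
}

// Record the order in which the two steps run.
const calls: string[] = [];
new DidAvatarSession(calls).start();
```

A test in this style fails if a refactor moves plugin setup ahead of `super.start()`, which is the regression the commit is guarding against.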

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

changeset-bot Bot commented Jun 1, 2026

🦋 Changeset detected

Latest commit: d359991

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 35 packages
Name Type
@livekit/agents-plugin-did Patch
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cartesia Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-fishaudio Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-hume Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-perplexity Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugin-soniox Patch
@livekit/agents-plugin-tavus Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch
@livekit/agents-plugins-test Patch

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

osimhi213 and others added 2 commits June 1, 2026 11:28
Regenerated via `pnpm changeset` to follow the documented workflow.
Bump type changed from minor to patch to match how upstream merges
similar PRs (inworld-stt-voice-profiling, cartesia-ink-2-stt,
lift-google-realtime-out-of-beta).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* upstream/main: (26 commits)
  fix(voice): align session.start recording with Python primary-session semantics (livekit#1704)
  docs: add TcpSessionTransport to Remote Sessions section (livekit#1703)
  (format): remove whitespace (livekit#1705)
  feat(realtime): add reasoning configuration for gpt-realtime-2 models (livekit#1575)
  feat(voice): support granular RecordingOptions in session.start (livekit#1702)
  fix(inference): guard agent sid header (livekit#1700)
  feat(voice): add TcpSessionTransport and updateIo session handler (livekit#1693)
  fix(amd): defer SIP listening until answer (livekit#1639)
  fix(job): close RecorderIO at session end (livekit#1682)
  docs: fix incorrect inference model file reference (livekit#1685)
  docs: update cartesia plugin capabilities for STT support (livekit#1686)
  feat(inference): add agent ID header to inference requests (livekit#1687)
  fix(soniox): exclude test files from dist build (livekit#1689)
  Version Packages (livekit#1683)
  fix(recorder): prevent close hang (livekit#1684)
  docs(cartesia): add cartesia stt to readme (livekit#1681)
  fix(inworld): harden TTS connection layer and default to inworld-tts-2 (livekit#1675)
  feat(inference): enhance schemas and models for TTS and STT (livekit#1680)
  fix(llm): make ToolOptions.abortSignal required (livekit#1678)
  chore(deps): update dependency vitest to v4.1.0 [security] (livekit#1673)
  ...