Support Realtime custom voice objects #3473

Open
lionel-oai wants to merge 1 commit into main from fix/realtime-custom-voice

@lionel-oai (Contributor) commented May 20, 2026

Summary

This PR fixes Realtime custom voice handling in the Agents SDK.

Realtime sessions can receive and send structured custom voice objects such as {"id": "voice_..."}, but the SDK previously typed voice settings as strings and validated inbound server events before updating response lifecycle state. If a server event such as response.created or response.done contained a structured voice object that failed validation, the SDK could skip response state updates and leave the response-create sequencer blocked. That could prevent the next response.create from being sent after tool output.
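The failure mode described above can be reproduced with a toy model. This is only a sketch: `ResponseSequencer`, `validate_event`, and `handle_validate_first` are hypothetical names, not SDK code, and `voice_example` is a placeholder ID.

```python
from dataclasses import dataclass
from typing import Any


class ValidationError(Exception):
    """Stand-in for a schema validation failure (hypothetical)."""


def validate_event(event: dict[str, Any]) -> dict[str, Any]:
    """Toy validator that, like the old typing, accepts only string voices."""
    voice = event.get("response", {}).get("voice")
    if voice is not None and not isinstance(voice, str):
        raise ValidationError("voice must be a string")
    return event


@dataclass
class ResponseSequencer:
    """Toy tracker: the next response.create may go out only when idle."""

    in_progress: bool = False

    def update(self, event_type: str) -> None:
        if event_type == "response.created":
            self.in_progress = True
        elif event_type == "response.done":
            self.in_progress = False


def handle_validate_first(seq: ResponseSequencer, event: dict[str, Any]) -> None:
    """Validate, then update lifecycle state (the problematic ordering)."""
    try:
        validate_event(event)
    except ValidationError:
        return  # the lifecycle update below is skipped
    seq.update(event["type"])


# A response is in flight, and its response.done carries a custom voice object.
seq = ResponseSequencer(in_progress=True)
done_event = {"type": "response.done", "response": {"voice": {"id": "voice_example"}}}
handle_validate_first(seq, done_event)
# seq.in_progress is still True, so the next response.create stays blocked.
```

With a plain string voice the same handler clears `in_progress` as expected, which is why the problem only surfaces with custom voices.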

The change adds typed support for custom voice objects in Realtime session settings, preserves structured voices when building outbound session.update payloads, and adds a validation fallback for inbound server events so custom voice objects do not break response lifecycle tracking.
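A minimal sketch of what the typing and the validation fallback could look like. Everything here is an assumption for illustration: `CustomVoice`, `VoiceSetting`, `build_session_update`, and `normalize_voice_for_validation` are hypothetical names, the nested `session.audio.output.voice` payload shape is assumed, and the PR's actual helpers may work differently.

```python
from typing import Any, TypedDict, Union


class CustomVoice(TypedDict):
    """Structured custom voice reference, e.g. {"id": "voice_example"}."""

    id: str


# A voice setting is either a built-in voice name or a custom voice object.
VoiceSetting = Union[str, CustomVoice]


def build_session_update(voice: VoiceSetting) -> dict[str, Any]:
    """Build an outbound payload without flattening a structured voice."""
    return {
        "type": "session.update",
        "session": {"audio": {"output": {"voice": voice}}},  # assumed shape
    }


def normalize_voice_for_validation(value: Any) -> Any:
    """Return a copy in which each {"id": ...} voice is replaced by its id string.

    The copy is only meant for a string-typed validator; the original event
    is left untouched so the structured voice is not lost.
    """
    if isinstance(value, dict):
        out: dict[str, Any] = {}
        for key, item in value.items():
            if key == "voice" and isinstance(item, dict) and isinstance(item.get("id"), str):
                out[key] = item["id"]
            else:
                out[key] = normalize_voice_for_validation(item)
        return out
    if isinstance(value, list):
        return [normalize_voice_for_validation(item) for item in value]
    return value
```

In this sketch the normalized copy is passed to validation while the original event still drives lifecycle tracking, so a custom voice cannot cause a state update to be skipped.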

Tests

  • make format
  • make lint
  • uv run pytest -q tests/realtime/test_openai_realtime.py tests/realtime/test_realtime_model_settings.py
  • uv run pytest -q tests/realtime/test_session.py -k "handoff_session_update_preserves_custom_voice or handoff_tool_handling"
  • uv run mypy src/agents/realtime/config.py src/agents/realtime/openai_realtime.py tests/realtime/test_openai_realtime.py
  • uv run pyright src/agents/realtime/config.py src/agents/realtime/openai_realtime.py tests/realtime/test_openai_realtime.py
  • uv run mypy tests/realtime/test_session.py
  • uv run pyright tests/realtime/test_session.py

The full make tests and make typecheck runs were not completed locally because optional dependency installation was blocked by a socket-firewall tunnel failure while downloading docstring-parser==0.18.0.

@lionel-oai force-pushed the fix/realtime-custom-voice branch from 9a489b8 to eed10dc on May 20, 2026 16:01
@lionel-oai force-pushed the fix/realtime-custom-voice branch from eed10dc to 20e7135 on May 20, 2026 18:42
return normalized


def _create_realtime_audio_output(audio_output_args: dict[str, Any]) -> Any:
Member

If we upgrade the openai package to openai>=2.36.0, this workaround is no longer necessary, whereas _normalize_custom_voice_for_server_event_validation is still required even with the latest version.

Can you add quick TODO comments to these internal workarounds explaining why they exist and when they can be removed?

@seratch seratch added this to the 0.17.x milestone May 20, 2026