feat:OpenAPI: gpt-realtime examples, RealtimeAudioFormats, enum update #220
Walkthrough
The OpenAPI spec updates examples to use gpt-realtime instead of gpt-4o-realtime, adds a RealtimeAudioFormats component with PCM/PCMU/PCMA variants using a type discriminator, and extends the realtime model enum to include gpt-realtime-2025-08-28. No endpoints or control flow are changed.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/libs/tryAGI.OpenAI/openapi.yaml (1)
5462-5466: Align language samples with curl by showing how to set session.model.
The curl example includes session with model: "gpt-realtime", but Go/Java/Node/Python/Ruby samples call create() without a session payload. This is confusing for users trying to pick a model.
Consider updating each snippet to include a session/model example (even a compact one-liner) to mirror curl. I can draft the exact snippets if you confirm the SDK parameter names.
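As a rough illustration of what "a session payload with a model" amounts to, here is a minimal Python sketch that only builds the JSON request body with the standard library. It deliberately does not call any SDK, since the SDK parameter names are unconfirmed above. The helper name `build_client_secret_body` is hypothetical, and the `"type": "realtime"` field is an assumption copied from the example response in this PR.

```python
import json


def build_client_secret_body(model: str = "gpt-realtime") -> str:
    """Return a JSON request body that sets session.model explicitly.

    Hypothetical helper for illustration; not part of any SDK.
    """
    body = {
        "session": {
            # Assumption: mirrors the "type" value shown in the example response.
            "type": "realtime",
            "model": model,
        }
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    print(build_client_secret_body())
```

Each language sample would carry the same two fields in whatever shape its SDK expects.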
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (23)
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI..JsonSerializerContext.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeAudioFormats.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeAudioFormatsVariant1Type.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeAudioFormatsVariant1TypeNullable.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeAudioFormatsVariant2Type.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeAudioFormatsVariant2TypeNullable.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeAudioFormatsVariant3Type.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonConverters.RealtimeAudioFormatsVariant3TypeNullable.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormats.Json.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormats.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsDiscriminator.Json.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsDiscriminator.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant1.Json.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant1.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant1Type.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant2.Json.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant2.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant2Type.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant3.Json.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant3.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeAudioFormatsVariant3Type.g.cs is excluded by !**/generated/**
- src/libs/tryAGI.OpenAI/Generated/tryAGI.OpenAI.Models.RealtimeSessionCreateRequestModel.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
src/libs/tryAGI.OpenAI/openapi.yaml (3 hunks)
🔇 Additional comments (1)
src/libs/tryAGI.OpenAI/openapi.yaml (1)
19601-19636: Verify audio.format schema wiring
I didn’t find any $ref: "#/components/schemas/RealtimeAudioFormats" under audio.input.format or audio.output.format. Please manually confirm that all audio.input.format and audio.output.format definitions reference RealtimeAudioFormats.
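The manual check requested above could be partly automated. The following is a minimal sketch, assuming the spec has already been parsed into nested Python dicts and lists (for example with a YAML parser, which is not shown here). The function name `find_unwired_formats` is hypothetical; it reports every `audio.input.format` or `audio.output.format` node that is not a `$ref` to the shared component.

```python
REF = "#/components/schemas/RealtimeAudioFormats"
TARGETS = {("audio", "input", "format"), ("audio", "output", "format")}


def find_unwired_formats(node, path=()):
    """Yield dotted paths of audio.{input,output}.format nodes lacking the $ref."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = path + (key,)
            # Ignore the "properties" keywords that OpenAPI places between levels.
            logical = tuple(part for part in here if part != "properties")
            if logical[-3:] in TARGETS:
                if not (isinstance(value, dict) and value.get("$ref") == REF):
                    yield ".".join(here)
            else:
                yield from find_unwired_formats(value, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_unwired_formats(item, path + (str(index),))
```

An empty result from `list(find_unwired_formats(spec))` would suggest that every format definition is wired to the component; any returned path points at an inline definition worth reviewing.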
| python: "from openai import OpenAI\n\nclient = OpenAI(\n api_key=\"My API Key\",\n)\nclient_secret = client.realtime.client_secrets.create()\nprint(client_secret.expires_at)" | ||
| ruby: "require \"openai\"\n\nopenai = OpenAI::Client.new(api_key: \"My API Key\")\n\nclient_secret = openai.realtime.client_secrets.create\n\nputs(client_secret)" | ||
| response: "{\n \"value\": \"ek_68af296e8e408191a1120ab6383263c2\",\n \"expires_at\": 1756310470,\n \"session\": {\n \"type\": \"realtime\",\n \"object\": \"realtime.session\",\n \"id\": \"sess_C9CiUVUzUzYIssh3ELY1d\",\n \"model\": \"gpt-4o-realtime\",\n \"output_modalities\": [\n \"audio\"\n ],\n \"instructions\": \"You are a friendly assistant.\",\n \"tools\": [],\n \"tool_choice\": \"auto\",\n \"max_output_tokens\": \"inf\",\n \"tracing\": null,\n \"truncation\": \"auto\",\n \"prompt\": null,\n \"expires_at\": 0,\n \"audio\": {\n \"input\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"transcription\": null,\n \"noise_reduction\": null,\n \"turn_detection\": {\n \"type\": \"server_vad\",\n \"threshold\": 0.5,\n \"prefix_padding_ms\": 300,\n \"silence_duration_ms\": 200,\n \"idle_timeout_ms\": null,\n \"create_response\": true,\n \"interrupt_response\": true\n }\n },\n \"output\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"voice\": \"alloy\",\n \"speed\": 1.0\n }\n },\n \"include\": null\n }\n}\n" | ||
| response: "{\n \"value\": \"ek_68af296e8e408191a1120ab6383263c2\",\n \"expires_at\": 1756310470,\n \"session\": {\n \"type\": \"realtime\",\n \"object\": \"realtime.session\",\n \"id\": \"sess_C9CiUVUzUzYIssh3ELY1d\",\n \"model\": \"gpt-realtime\",\n \"output_modalities\": [\n \"audio\"\n ],\n \"instructions\": \"You are a friendly assistant.\",\n \"tools\": [],\n \"tool_choice\": \"auto\",\n \"max_output_tokens\": \"inf\",\n \"tracing\": null,\n \"truncation\": \"auto\",\n \"prompt\": null,\n \"expires_at\": 0,\n \"audio\": {\n \"input\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"transcription\": null,\n \"noise_reduction\": null,\n \"turn_detection\": {\n \"type\": \"server_vad\",\n \"threshold\": 0.5,\n \"prefix_padding_ms\": 300,\n \"silence_duration_ms\": 200,\n \"idle_timeout_ms\": null,\n \"create_response\": true,\n \"interrupt_response\": true\n }\n },\n \"output\": {\n \"format\": {\n \"type\": \"audio/pcm\",\n \"rate\": 24000\n },\n \"voice\": \"alloy\",\n \"speed\": 1.0\n }\n },\n \"include\": null\n }\n}\n" |
🛠️ Refactor suggestion
Fix example response: inconsistent types/fields (expires_at duplication; max_output_tokens as string).
- Top-level has expires_at (epoch), but nested session.expires_at is 0. Either remove the nested field or set it consistently.
- max_output_tokens is "inf" (string). If the schema expects an integer or null, this example will mislead generators/SDKs.
Apply this minimal example fix:
- "max_output_tokens": "inf",
- "tracing": null,
+ "max_output_tokens": null,
+ "tracing": null,
@@
- "expires_at": 0,
(The nested field is deleted rather than commented out, because JSON does not allow comments.)
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around line 5468, the example response
uses inconsistent types/fields: it includes a top-level expires_at epoch but
also a nested session.expires_at set to 0, and max_output_tokens is the string
"inf". Remove the redundant nested session.expires_at (or set it to the same
epoch value as the top-level expires_at) so only one consistent expires_at
remains, and change max_output_tokens from the string "inf" to either an integer
(e.g., a large number) or null to match the schema's numeric/null expectation.
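The two inconsistencies described above can be detected mechanically. This is a minimal sketch using only the standard library; `check_example` is a hypothetical helper, and the rule that `max_output_tokens` must be an integer or null is the reviewer's conditional assumption, not something confirmed against the schema here.

```python
import json


def check_example(response_text: str) -> list[str]:
    """Return a list of inconsistencies found in an example response body."""
    data = json.loads(response_text)
    session = data.get("session", {})
    problems = []

    # A nested expires_at that disagrees with the top-level value is suspicious.
    nested = session.get("expires_at")
    if nested is not None and nested != data.get("expires_at"):
        problems.append("session.expires_at differs from top-level expires_at")

    # Assumption: the schema expects an integer or null here.
    tokens = session.get("max_output_tokens")
    if tokens is not None and (isinstance(tokens, bool) or not isinstance(tokens, int)):
        problems.append("max_output_tokens is not an integer or null")

    return problems
```

Running it over each example string in the spec would flag the current response and pass the corrected one.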
RealtimeAudioFormats:
  anyOf:
    - title: PCM audio format
      type: object
      properties:
        rate:
          enum:
            - 24000
          type: integer
          description: The sample rate of the audio. Always `24000`.
        type:
          enum:
            - audio/pcm
          type: string
          description: The audio format. Always `audio/pcm`.
      description: The PCM audio format. Only a 24kHz sample rate is supported.
    - title: PCMU audio format
      type: object
      properties:
        type:
          enum:
            - audio/pcmu
          type: string
          description: The audio format. Always `audio/pcmu`.
      description: The G.711 μ-law format.
    - title: PCMA audio format
      type: object
      properties:
        type:
          enum:
            - audio/pcma
          type: string
          description: The audio format. Always `audio/pcma`.
      description: The G.711 A-law format.
  discriminator:
    propertyName: type
🛠️ Refactor suggestion
Tighten RealtimeAudioFormats schema: require discriminators, lock shapes, prefer oneOf.
- Without required, an empty object could validate.
- For PCM, rate should be required.
- Use oneOf to ensure exactly one variant matches.
- Lock with additionalProperties: false for predictable generation.
- RealtimeAudioFormats:
-   anyOf:
+ RealtimeAudioFormats:
+   oneOf:
      - title: PCM audio format
        type: object
        properties:
          rate:
            enum:
              - 24000
            type: integer
            description: The sample rate of the audio. Always `24000`.
          type:
            enum:
              - audio/pcm
            type: string
            description: The audio format. Always `audio/pcm`.
+       required: [type, rate]
+       additionalProperties: false
        description: The PCM audio format. Only a 24kHz sample rate is supported.
      - title: PCMU audio format
        type: object
        properties:
          type:
            enum:
              - audio/pcmu
            type: string
            description: The audio format. Always `audio/pcmu`.
+       required: [type]
+       additionalProperties: false
        description: The G.711 μ-law format.
      - title: PCMA audio format
        type: object
        properties:
          type:
            enum:
              - audio/pcma
            type: string
            description: The audio format. Always `audio/pcma`.
+       required: [type]
+       additionalProperties: false
        description: The G.711 A-law format.
    discriminator:
      propertyName: type
Optional: add discriminator.mapping or split into named subschemas for stronger codegen.
Also ensure audio.input.format and audio.output.format use $ref: '#/components/schemas/RealtimeAudioFormats'.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 19601 to 19636, the
RealtimeAudioFormats schema is too permissive: make it an explicit oneOf with
named subschemas (e.g., PCM, PCMU, PCMA) or add a discriminator with mapping,
require the discriminant field, mark PCM.rate as required, set
additionalProperties: false on each variant to lock shapes, and replace any
inline definitions used for audio.input.format and audio.output.format with a
$ref to '#/components/schemas/RealtimeAudioFormats' so only one exact variant
can validate.
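To make the effect of the tightened schema concrete, here is a minimal hand-written validator in Python that mirrors the suggested rules (discriminator on `type`, required fields, no additional properties, fixed PCM rate). It is only a sketch of the intended behavior under those assumptions, not a general JSON Schema validator; `VARIANTS` and `validate_audio_format` are hypothetical names.

```python
# Required fields per variant, keyed by the discriminator value.
VARIANTS = {
    "audio/pcm": {"type", "rate"},
    "audio/pcmu": {"type"},
    "audio/pcma": {"type"},
}


def validate_audio_format(obj) -> list[str]:
    """Return validation errors for one audio format object (empty list if valid)."""
    if not isinstance(obj, dict):
        return ["format must be an object"]
    kind = obj.get("type")
    required = VARIANTS.get(kind) if isinstance(kind, str) else None
    if required is None:
        return [f"unknown or missing type: {kind!r}"]

    errors = []
    present = set(obj)
    for name in sorted(required - present):
        errors.append(f"missing required field: {name}")
    for name in sorted(present - required):
        # Corresponds to additionalProperties: false.
        errors.append(f"unexpected field: {name}")
    if kind == "audio/pcm" and "rate" in obj:
        if type(obj["rate"]) is not int or obj["rate"] != 24000:
            errors.append("rate must be the integer 24000")
    return errors
```

Under the original anyOf schema without `required`, an empty object would pass; with these rules it is rejected because the discriminator is missing.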
| - gpt-realtime | ||
| - gpt-realtime-2025-08-28 | ||
| - gpt-4o-realtime |
💡 Verification agent
🧩 Analysis chain
Model enum extension looks good; confirm availability and update scattered examples.
Adding gpt-realtime-2025-08-28 is fine; ensure it’s an active, supported model and that examples/defaults across the spec are consistent.
Additionally, scan for stale model names:
🏁 Script executed:
#!/bin/bash
# Locate all model enum/value mentions to keep examples consistent.
rg -nP '(gpt-(realtime|4o-[^\"\'\s]+))' -C2
Length of output: 226
🌐 Web query:
Is "gpt-realtime-2025-08-28" an officially listed OpenAI realtime model, and are there documented constraints (modalities, sample rates, token limits) that should be reflected in the OpenAPI spec?
💡 Result:
Short answer: No — there is an official OpenAI realtime model named exactly "gpt-realtime" (announced Aug 28, 2025), but "gpt-realtime-2025-08-28" is not the canonical model name shown by OpenAI. Use "gpt-realtime" in your spec. [1]
Documented constraints you should reflect in an OpenAPI spec (primary sources / community-confirmed behavior):
- Modalities: text and audio (settable per-session/response). [2][3]
- Audio formats / sample encodings commonly supported: pcm16, g711_ulaw, g711_alaw (and voice selection strings). Represent audio as PCM16 or the supported codec in your schema. [3][4]
- Token / context limits: Realtime models have large context ceilings but a practical max output/completion token limit of ~4,096 tokens; gpt-realtime has been reported with a 32,000 token context and 4,096 max output tokens in docs/analysis. Treat output tokens as capped and allow configuring max_response_output_tokens. [2][5]
- Session/time limits and rate considerations: Realtime sessions have session/connection limits and practical session-duration considerations (docs mention session-level controls / rate limits). Map rate limits and session timeouts in your spec. [1][2]
References:
- OpenAI announcement: "gpt-realtime" release (Aug 28, 2025). [1]
- Independent doc/analysis noting gpt-realtime name and reported limits (32k context, 4,096 output). [2]
- Realtime API/community examples showing modalities, session fields (modalities, input_audio_format, output_audio_format, max_response_output_tokens). [3]
- Third‑party Realtime API references listing supported audio formats/voice parameters. [4]
- Community threads describing the 4,096 output-token behavior and max_output_tokens parameter. [5]
If you want, I can produce a concise OpenAPI schema snippet (paths + models) that encodes modalities, audio-format enums, token-limit fields, and sensible defaults.
🏁 Script executed:
#!/bin/bash
# Scan the OpenAPI spec for occurrences of realtime and 4o model names
rg -n "gpt-realtime-2025-08-28" -C2 src/libs/tryAGI.OpenAI/openapi.yaml
rg -n "gpt-realtime" -C2 src/libs/tryAGI.OpenAI/openapi.yaml
rg -n "gpt-4o-" -C2 src/libs/tryAGI.OpenAI/openapi.yaml
Length of output: 124027
Use canonical “gpt-realtime” model name; remove date-suffixed variant.
Replace the enum entry gpt-realtime-2025-08-28 with gpt-realtime in src/libs/tryAGI.OpenAI/openapi.yaml (around lines 22448–22450) and confirm no examples reference the removed variant.
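The "confirm no examples reference the removed variant" step can be sketched as a small Python scan over the spec text. The function name `find_model_mentions` is hypothetical. It matches the name only as a whole token, so searching for `gpt-realtime` does not also report `gpt-realtime-2025-08-28`.

```python
import re


def find_model_mentions(text: str, name: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for each line mentioning `name` exactly."""
    # Lookarounds stop a match from being part of a longer model name.
    pattern = re.compile(r"(?<![\w.-])" + re.escape(name) + r"(?![\w-])")
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if pattern.search(line):
            hits.append((number, line.strip()))
    return hits
```

Reading openapi.yaml into a string and calling `find_model_mentions(text, "gpt-realtime-2025-08-28")` would list every line that still needs attention.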
🤖 Prompt for AI Agents
In src/libs/tryAGI.OpenAI/openapi.yaml around lines 22448 to 22450, replace the
enum value "gpt-realtime-2025-08-28" with the canonical name "gpt-realtime" in
the model enum, and update any schema references accordingly; after changing the
enum entry, search the file (and any YAML examples or sample requests nearby)
for any occurrences of the removed date-suffixed variant and update them to
"gpt-realtime" so no examples reference the old name.
Summary by CodeRabbit