Description
Which component is this bug for?
OpenAI Instrumentation
Description
When using OpenAI's Responses API with background polling (`responses.create(background=True)` followed by repeated `responses.retrieve()` calls), a `TypeError` is raised on the second or subsequent `retrieve()` call.
Root cause: the instrumentation has two bugs that combine to cause the crash:
- Lines 615/778: `tools=merged_tools if merged_tools else None` stores `None` when `merged_tools == []` (an empty list is falsy).
- Lines 599/761: `existing_data.get("tools", [])` returns `None` when the key exists with a `None` value, then crashes on `None + request_tools`.
This is especially problematic in distributed/serverless environments (Azure Functions, AWS Lambda, etc.) where create() and retrieve() may run in different processes. The in-memory responses cache is not shared across processes, so retrieve() starts with an empty cache, stores tools=None, and crashes on subsequent calls.
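The second bug hinges on a common `dict.get()` gotcha: the default is only used when the key is *missing*, not when it maps to `None`. A minimal sketch (the `cached` dict is illustrative, standing in for the instrumentation's cache entry):

```python
# What the cache holds after Bug 1 stored None for an empty tool list:
cached = {"tools": None}

# .get() returns the stored value because the key EXISTS,
# so the [] default is never used:
tools = cached.get("tools", [])
print(tools)  # None, not []

# Concatenating that with a list is exactly the reported crash:
try:
    tools + [{"type": "web_search"}]
except TypeError as e:
    print(e)  # unsupported operand type(s) for +: 'NoneType' and 'list'
```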
Reproduction steps
```python
from openai import OpenAI

client = OpenAI()

# Step 1: Create background response with tools
response = client.responses.create(
    model="gpt-4o",
    input="Search for recent AI news",
    tools=[{"type": "web_search"}],
    background=True,
)
response_id = response.id

# Step 2: Poll for completion (simulating a distributed environment where the cache is empty)
# First retrieve() succeeds but stores tools=None in the cache
result1 = client.responses.retrieve(response_id)

# Step 3: Second retrieve() crashes
result2 = client.responses.retrieve(response_id)  # TypeError!
```

Expected behavior

All `responses.retrieve()` calls should complete successfully and produce traced spans.
Actual Behavior with Screenshots
```
TypeError: unsupported operand type(s) for +: 'NoneType' and 'list'
```

Stack trace:

```
File ".../opentelemetry/instrumentation/openai/v1/responses_wrappers.py", line 599
    merged_tools = existing_data.get("tools", []) + request_tools
```
Python Version
3.12
Provide any additional context for the Bug.
Root Cause Analysis
Flow when the cache is empty (common in serverless/distributed systems):

1. First `retrieve()` call:

```python
existing_data = responses.get(response_id)  # → None (empty cache)
existing_data = {}  # fallback to empty dict
request_tools = get_tools_from_kwargs(kwargs)  # → [] (no tools in retrieve())
merged_tools = {}.get("tools", []) + []  # → [] + [] = []
# Bug 1: Line 615 - stores None because [] is falsy!
TracedData(tools=merged_tools if merged_tools else None, ...)
# Cache now has: responses[id] = TracedData(tools=None)
```
2. Second `retrieve()` call:

```python
existing_data = responses[id].model_dump()  # → {"tools": None, ...}
# Bug 2: Line 599 - dict.get() returns None when the key EXISTS with a None value
existing_data.get("tools", [])  # → None (NOT []!)
None + request_tools  # → TypeError!
```
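The two-call flow above can be reproduced without OpenAI at all. This is a simplified stand-in for the instrumentation's cache logic (the `responses` dict and `handle_retrieve` function are illustrative names, not the real `responses_wrappers.py` code), showing how the first call poisons the cache and the second call crashes:

```python
# Simplified model of the instrumentation's in-memory cache flow.
responses: dict[str, dict] = {}  # empty cache, as in a freshly started process


def handle_retrieve(response_id: str, request_tools: list) -> None:
    existing_data = responses.get(response_id) or {}
    # Bug 2: .get() returns the cached None instead of the [] default
    merged_tools = existing_data.get("tools", []) + request_tools
    # Bug 1: [] is falsy, so an empty merge is stored as None
    responses[response_id] = {"tools": merged_tools if merged_tools else None}


handle_retrieve("resp_123", [])  # first call: succeeds, but stores tools=None
try:
    handle_retrieve("resp_123", [])  # second call
except TypeError as e:
    print(f"second retrieve crashed: {e}")
```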
Suggested Fix

Fix Bug 1 (lines 615, 778) - don't convert an empty list to `None`:

```python
# Change from:
tools=merged_tools if merged_tools else None,
# To:
tools=merged_tools,  # Keep [] as [], don't convert to None
```

Fix Bug 2 (lines 599, 761) - handle `None` values from the cache:

```python
# Change from:
merged_tools = existing_data.get("tools", []) + request_tools
# To:
merged_tools = (existing_data.get("tools") or []) + request_tools
```

The `or []` pattern safely coerces `None` to `[]` while preserving actual tool lists.
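Applying both fixes to the same simplified cache model (again with illustrative names, not the real instrumentation code) shows that repeated polling no longer crashes, even starting from an empty cache:

```python
# Same simplified cache flow as above, with both suggested fixes applied.
responses: dict[str, dict] = {}


def handle_retrieve_fixed(response_id: str, request_tools: list) -> None:
    existing_data = responses.get(response_id) or {}
    # Fix 2: coerce a cached None back to [] before concatenating
    merged_tools = (existing_data.get("tools") or []) + request_tools
    # Fix 1: store merged_tools as-is; [] stays [], never becomes None
    responses[response_id] = {"tools": merged_tools}


handle_retrieve_fixed("resp_123", [])
handle_retrieve_fixed("resp_123", [])  # no TypeError; tools stays []
print(responses["resp_123"])  # {'tools': []}
```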
Environment

- Python: 3.12.12
- opentelemetry-instrumentation-openai: 0.50.1 (also affects 0.49.x)
- openai: 1.82.0+
- Platform: Azure Durable Functions (but affects any distributed/serverless environment)
Additional Context
The in-memory `responses: dict[str, TracedData] = {}` cache assumes all `create()` and `retrieve()` calls happen in the same process. This doesn't hold true for:
- Serverless functions (Azure Functions, AWS Lambda)
- Distributed systems with multiple workers
- Background job processors
- Any system using the `background=True` polling pattern
Have you spent some time to check if this bug has been raised before?
- I checked and didn't find a similar issue
Are you willing to submit PR?
Yes I am willing to submit a PR!