🐛 Bug Report: TypeError when using responses.retrieve() with tools=None #3523

@ingli96

Which component is this bug for?

OpenAI Instrumentation

📜 Description

When using OpenAI's Responses API with background polling (responses.create(background=True) followed by repeated responses.retrieve() calls), a TypeError is raised on the second or subsequent retrieve() call.

Root cause: The instrumentation has two bugs that combine to cause the crash:

  1. Line 615/778: tools=merged_tools if merged_tools else None stores None when merged_tools is [] (an empty list is falsy).
  2. Line 599/761: existing_data.get("tools", []) returns None when the key exists with a None value, so None + request_tools raises the TypeError.

This is especially problematic in distributed/serverless environments (Azure Functions, AWS Lambda, etc.) where create() and retrieve() may run in different processes. The in-memory responses cache is not shared across processes, so retrieve() starts with an empty cache, stores tools=None, and crashes on subsequent calls.

👟 Reproduction steps

from openai import OpenAI

client = OpenAI()

# Step 1: Create background response with tools
response = client.responses.create(
    model="gpt-4o",
    input="Search for recent AI news",
    tools=[{"type": "web_search"}],
    background=True
)
response_id = response.id

# Step 2: Poll for completion (simulating distributed environment where cache is empty)
# First retrieve() succeeds but stores tools=None in cache
result1 = client.responses.retrieve(response_id)

# Step 3: Second retrieve() crashes
result2 = client.responses.retrieve(response_id)  # TypeError!

👍 Expected behavior

All responses.retrieve() calls should complete successfully and produce traced spans.

👎 Actual Behavior with Screenshots

TypeError: unsupported operand type(s) for +: 'NoneType' and 'list'

Stack trace:

File ".../opentelemetry/instrumentation/openai/v1/responses_wrappers.py", line 599
    merged_tools = existing_data.get("tools", []) + request_tools

🤖 Python Version

3.12

📃 Provide any additional context for the Bug.

Root Cause Analysis

Flow when cache is empty (common in serverless/distributed systems):

  1. First retrieve() call:

    existing_data = responses.get(response_id)  # → None (empty cache)
    existing_data = {}                           # fallback to empty dict

    request_tools = get_tools_from_kwargs(kwargs)  # → [] (no tools in retrieve())
    merged_tools = {}.get("tools", []) + []        # → [] + [] = []

    # Bug 1: Line 615 - stores None because [] is falsy!
    TracedData(tools=merged_tools if merged_tools else None, ...)
    # Cache now has: responses[id] = TracedData(tools=None)

  2. Second retrieve() call:

    existing_data = responses[id].model_dump()     # → {"tools": None, ...}

    # Bug 2: Line 599 - dict.get() returns None when key EXISTS with None value
    existing_data.get("tools", [])                 # → None (NOT []!)

    None + request_tools                           # → TypeError!
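
The flow above can be reproduced without any network calls. The following is a minimal sketch, not the instrumentation's actual code: TracedData here is a hypothetical stdlib dataclass standing in for the real model (with asdict() in place of model_dump()), and traced_retrieve is an illustrative name for the merge-and-store logic.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class TracedData:
    """Hypothetical stand-in for the instrumentation's cached model."""
    tools: Optional[list] = None


# Stand-in for the in-memory, per-process cache.
responses: dict[str, TracedData] = {}


def traced_retrieve(response_id: str, request_tools: list) -> None:
    """Mimic the buggy merge-and-store logic described in the issue."""
    cached = responses.get(response_id)
    existing_data = asdict(cached) if cached else {}
    # Bug 2: .get() only falls back to [] when the key is missing,
    # not when the key is present with a None value.
    merged_tools = existing_data.get("tools", []) + request_tools
    # Bug 1: an empty list is falsy, so None is stored instead of [].
    responses[response_id] = TracedData(
        tools=merged_tools if merged_tools else None
    )


# First call: cache miss, succeeds, but stores tools=None.
traced_retrieve("resp_123", [])
stored_after_first = responses["resp_123"].tools

# Second call: the cached None is added to a list and raises.
try:
    traced_retrieve("resp_123", [])
    second_call_error = None
except TypeError as exc:
    second_call_error = str(exc)

print(stored_after_first)
print(second_call_error)
```

The first call leaves None in the cache, and the second call fails with the same TypeError shown in the stack trace.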

Suggested Fix

Fix Bug 1 (lines 615, 778) - Don't convert empty list to None:

# Change from:
tools=merged_tools if merged_tools else None,

# To:
tools=merged_tools,  # Keep [] as [], don't convert to None

Fix Bug 2 (lines 599, 761) - Handle None values from cache:

# Change from:
merged_tools = existing_data.get("tools", []) + request_tools

# To:
merged_tools = (existing_data.get("tools") or []) + request_tools

The or [] pattern safely coerces None to [] while preserving actual tool lists.
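
As a quick check of the second fix, here is a small sketch that isolates the merge in a helper. The name merge_tools is hypothetical and does not exist in the instrumentation. The helper applies the same `or []` coercion to the cached value and, as an extra assumption, to the request tools as well.

```python
from typing import Optional


def merge_tools(existing_data: dict, request_tools: Optional[list]) -> list:
    """Merge cached and request tools, treating a missing key or None as []."""
    return (existing_data.get("tools") or []) + (request_tools or [])


web_search = {"type": "web_search"}

# Key missing entirely (cache miss).
empty_cache = merge_tools({}, [])

# Key present with None (the case that crashed before the fix).
none_in_cache = merge_tools({"tools": None}, [])

# Real tool lists are preserved and concatenated.
cached_tools = merge_tools({"tools": [web_search]}, [])
both_sides = merge_tools({"tools": [web_search]}, [{"type": "file_search"}])
```

Because the helper always returns a list, storing its result unchanged (fix 1) keeps None out of the cache.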

Environment

  • Python: 3.12.12
  • opentelemetry-instrumentation-openai: 0.50.1 (also affects 0.49.x)
  • openai: 1.82.0+
  • Platform: Azure Durable Functions (but affects any distributed/serverless environment)

Additional Context

The in-memory responses: dict[str, TracedData] = {} cache assumes all create() and retrieve() calls happen in the same process. This doesn't hold true for:

  • Serverless functions (Azure Functions, AWS Lambda)
  • Distributed systems with multiple workers
  • Background job processors
  • Any system using background=True polling pattern

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find similar issue

Are you willing to submit PR?

Yes I am willing to submit a PR!
