Fix unbounded TypeAdapter cache causing memory leak in multi-threaded usage #2873
veeceey wants to merge 2 commits into openai:main
Conversation
… usage The `lru_cache` wrapping `pydantic.TypeAdapter` was set to `maxsize=None` (unbounded). In multi-threaded contexts, pydantic regenerates parameterized generic types with different identities on each call, so the cache grows without bound. This is especially problematic in webserver environments using `responses.parse`. Setting `maxsize=128` bounds the cache and prevents the memory leak while still providing effective caching for the most recently used types. Fixes openai#2672
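The failure mode and the fix can be sketched without pydantic at all: a bounded `functools.lru_cache` keeps memory flat even when every call arrives with a previously unseen key. The names below are illustrative stand-ins, not the actual helpers in `_models.py`:

```python
from functools import lru_cache

# Illustrative stand-in for the cached TypeAdapter factory.
# With maxsize=None, every distinct key would be retained forever;
# with maxsize=128, the least recently used entries are evicted.
@lru_cache(maxsize=128)
def make_adapter(type_key):
    return ("adapter", type_key)  # stand-in for pydantic.TypeAdapter(...)

# Simulate 1000 requests that each produce a type with a fresh identity.
for i in range(1000):
    make_adapter(f"ParsedResponseOutputMessage[{i}]")

print(make_adapter.cache_info().currsize)  # 128, not 1000
```

Every call here is a cache miss, so the bound is doing all the work: the cache never holds more than 128 entries regardless of request volume.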
Hi,
Hi @rona-sh, thank you for the feedback! You make a great point: setting `maxsize=128` is indeed redundant since that's already the default, so I'll clean that up. Regarding the caching concern, you're absolutely right that an LRU cache with unique-per-thread keys effectively becomes a write-only cache in high-concurrency scenarios. A smarter approach would be needed: for example, keying the cache on the type itself rather than on per-call identity, using a thread-local cache, or even a simple dict without eviction, since the number of distinct TypeAdapter types is typically bounded. I'd love to hear your thoughts on which approach you'd prefer. I can update the PR accordingly, or if you'd rather tackle the smarter caching in a separate PR, I'm happy to scope this one down to just the immediate fix (removing the unbounded growth) as a stopgap. Let me know!
The LRU cache with maxsize=128 was redundant (default value) and ineffective in multi-threaded environments since each thread creates unique TypeAdapter instances. Switch to threading.local() which naturally prevents memory leaks (cleaned up on thread exit) while providing actual caching benefit within each thread. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
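A minimal sketch of the thread-local pattern this commit describes (hypothetical names; the real change lives in `_models.py`):

```python
import threading

_local = threading.local()  # per-thread storage, released when the thread exits

def get_adapter(tp):
    # Lazily create a per-thread dict; no cross-thread contention,
    # and no single shared structure that can grow with thread count.
    cache = getattr(_local, "adapters", None)
    if cache is None:
        cache = _local.adapters = {}
    if tp not in cache:
        cache[tp] = ("adapter", tp)  # stand-in for pydantic.TypeAdapter(tp)
    return cache[tp]
```

Within a thread, repeated calls for the same type hit the dict, so the caching is actually effective; when a thread exits, its storage is garbage-collected, so the leak proportional to request volume cannot accumulate across short-lived threads.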
Removed the redundant `maxsize` and switched to a thread-local approach for the TypeAdapter. This avoids the memory leak while actually being useful in multi-threaded scenarios. Thanks for the thorough review!
Fixes #2672
The `lru_cache` wrapping `pydantic.TypeAdapter` in `_models.py` was configured with `maxsize=None` (unbounded). In multi-threaded environments, like a typical webserver, pydantic regenerates parameterized generic types (e.g. `ParsedResponseOutputMessage[MyClass]`) with different object identities on each call. Since each new identity misses the cache, the cache grows without limit, leaking memory proportional to request volume.

This changes the cache to `maxsize=128`, which bounds memory usage while still providing effective caching for the most commonly used types. 128 is the standard default for `functools.lru_cache` and should cover the vast majority of real-world type usage patterns.

I verified this with a multi-threaded test: after spawning 200 threads that each call `TypeAdapter`, the cache stays bounded at `maxsize=128` instead of growing indefinitely. Added a regression test that asserts the cache maxsize is not None.
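The verification and the regression test described above might look roughly like this, sketched against a stand-in cache rather than the real one in `openai._models` (function names here are hypothetical):

```python
import threading
from functools import lru_cache

@lru_cache(maxsize=128)
def build_adapter(type_key):
    return object()  # stand-in for constructing a pydantic.TypeAdapter

def test_cache_maxsize_is_not_none():
    # cache_parameters() is available on lru_cache wrappers since Python 3.9.
    assert build_adapter.cache_parameters()["maxsize"] is not None

def test_cache_stays_bounded_across_threads():
    threads = [
        threading.Thread(target=build_adapter, args=(f"Type[{i}]",))
        for i in range(200)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # 200 distinct keys were inserted, but eviction keeps the cache at <= 128.
    assert build_adapter.cache_info().currsize <= 128
```

The first test is the cheap guard against regressing to `maxsize=None`; the second exercises the multi-thread scenario from the description.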