Fix OpenAI inference through llmspy#214

Merged
bussyjd merged 1 commit into main from fix/llmspy-openai-inference
Feb 24, 2026
Conversation


@bussyjd bussyjd commented Feb 24, 2026

Summary

  • Fix OpenAI npm SDK mismatch: ConfigMap providers.json had "npm": "openai" but OpenAiProvider registers with sdk = "@ai-sdk/openai" — the mismatch caused the provider to be silently auto-disabled at startup
  • Fix stream_options causing OpenAI 500: OpenClaw sends stream_options in requests; llmspy forces stream=false but didn't strip stream_options, causing OpenAI to reject with "stream_options is only allowed when stream is enabled". Init container now patches this at deploy time
  • Bump llmspy image from 3.0.33-obol.2 to 3.0.34-obol.1
  • Harden integration tests: remove deprecated max_tokens, add requireLLMSpyProvider() guard, add error pattern detection, add Google and Z.AI inference tests

Root Causes

Bug 1: npm SDK mismatch (llm.yaml line 93)

create_provider() in llmspy matches provider.get("npm") from providers.json against provider_type.sdk from Python provider classes. Our ConfigMap override had "npm": "openai" which overwrote the package's correct "npm": "@ai-sdk/openai" during the init container merge. No match → provider not created → auto-disabled.
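The matching step can be illustrated with a minimal sketch. This is not llmspy's actual implementation; the class registry and function body are assumptions that mirror only the matching rule described above (`providers.json` "npm" field compared against the provider class's `sdk` attribute):

```python
# Hypothetical sketch of the provider-matching rule (names assumed, not
# copied from llmspy): an entry is only instantiated when its "npm" value
# exactly matches a registered provider class's sdk string.

class OpenAiProvider:
    sdk = "@ai-sdk/openai"  # what the Python provider class registers

PROVIDER_TYPES = [OpenAiProvider]

def create_provider(entry: dict):
    """Return a provider instance whose sdk matches entry['npm'], else None."""
    for provider_type in PROVIDER_TYPES:
        if entry.get("npm") == provider_type.sdk:
            return provider_type()
    return None  # no match -> provider never created -> auto-disabled

# The broken ConfigMap override vs. the corrected value:
assert create_provider({"npm": "openai"}) is None          # bug: disabled
assert create_provider({"npm": "@ai-sdk/openai"}) is not None  # fix
```

Because the failure mode is a silent `None`, the mismatch never surfaced as an error; the provider simply disappeared at startup.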

Bug 2: stream_options passthrough (llm.yaml init container)

llmspy's process_chat() forces chat["stream"] = False (it collects full responses and re-chunks for streaming clients), but didn't remove stream_options from the request body. OpenAI strictly validates that stream_options requires stream=true. The init container now copies the llms package to a writable volume, patches main.py to add chat.pop("stream_options", None), and PYTHONPATH is set to load the patched version.
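The patched behavior amounts to one extra line next to the stream override. A minimal sketch, with the function name and surrounding structure assumed (only the two chat mutations come from the description above):

```python
# Hedged sketch of the patched request sanitization (helper name is an
# assumption; only the two operations mirror the described fix).

def sanitize_chat(chat: dict) -> dict:
    chat["stream"] = False            # llmspy collects the full response
    chat.pop("stream_options", None)  # the added patch: strip the option
    return chat

# A request shaped like what OpenClaw sends:
request = {"stream": True, "stream_options": {"include_usage": True}}
assert sanitize_chat(dict(request)) == {"stream": False}
```

With `stream_options` removed, OpenAI no longer rejects the non-streaming request.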

Test plan

  • All 5 inference tests pass: Ollama, Anthropic, OpenAI, Google, Z.AI
  • All 6 skills tests pass (staging, visibility, sync, idempotent, inference, smoke)
  • MultiInstance test passes (3 concurrent deployments)
  • Verified stream_options fix via direct llmspy test (HTTP 200 with stream_options after patch)
  • Verified npm SDK fix via runtime llms.json (all 5 providers stay enabled after restart)
  • Fresh cluster deploy (purge → init → up → all 11 integration tests pass)

Two bugs prevented OpenAI models from working through the llmspy gateway:

1. The ConfigMap providers.json had `"npm": "openai"` but the OpenAI provider
   registers with `sdk = "@ai-sdk/openai"`. The mismatch caused create_provider()
   to return None, so the provider was never added to g_handlers and got
   auto-disabled at serve time.

2. OpenClaw sends `stream_options` in chat completion requests. llmspy forces
   `stream=false` (it collects the full response and re-chunks if needed) but
   didn't strip `stream_options`. OpenAI rejects the combination with
   "stream_options is only allowed when stream is enabled". The init container
   now patches main.py to add `chat.pop("stream_options", None)` after the
   stream override, with PYTHONPATH loading the patched module.
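The deploy-time patch can be sketched as a small text rewrite. Paths, the marker line, and the helper name here are assumptions for illustration; the real init container may use a different mechanism (e.g. `sed`):

```python
# Illustrative sketch of a deploy-time source patch: insert the pop right
# after the line that forces stream off, skipping if already patched.
from pathlib import Path

def patch_main(src: Path, dst: Path) -> None:
    text = src.read_text()
    marker = 'chat["stream"] = False'
    patch = '\n        chat.pop("stream_options", None)'
    if marker in text and 'chat.pop("stream_options"' not in text:
        text = text.replace(marker, marker + patch, 1)
    dst.write_text(text)
```

The patched copy lands on a writable volume, and PYTHONPATH is ordered so it shadows the read-only package copy.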

Also bumps the llmspy image from 3.0.33-obol.2 to 3.0.34-obol.1.

Integration test improvements:
- Remove max_tokens parameter (gpt-5.2 requires max_completion_tokens)
- Add requireLLMSpyProvider() to skip tests when provider is auto-disabled
- Add error pattern detection for upstream errors wrapped in 200 responses
- Add Google and Z.AI inference tests
- Add response body logging for diagnostics
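The error-pattern check can be sketched as follows; the patterns and helper name are assumptions, shown only to illustrate catching upstream errors that a gateway wraps in an HTTP 200 body:

```python
# Hedged sketch: flag response bodies that look like wrapped upstream errors
# even though the gateway returned HTTP 200 (patterns are illustrative).
import re

ERROR_PATTERNS = [
    r'"error"\s*:',                        # JSON error envelope
    r"stream_options is only allowed",     # the OpenAI rejection seen here
]

def looks_like_upstream_error(body: str) -> bool:
    return any(re.search(p, body) for p in ERROR_PATTERNS)

assert looks_like_upstream_error('{"error": {"message": "boom"}}')
assert not looks_like_upstream_error('{"choices": [{"index": 0}]}')
```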
@bussyjd bussyjd merged commit b53c8f7 into main Feb 24, 2026
6 checks passed
