docs: document WithHTTPTransportWrapper and RAG permanent error behavior #3099

Merged
dgageot merged 7 commits into main from docs/auto-update
Jun 13, 2026
Conversation

@aheritier

Contributor

Documentation updates

This PR documents two code changes merged into `main` in the last 36 hours that were not yet reflected in `/docs`.

| Commit | Source PR | What changed |
| --- | --- | --- |
| `8c28e215` | #3090 | Document `options.WithHTTPTransportWrapper` for HTTP middleware injection in Go SDK |
| `64889754` | #3091 | Document permanent error behavior in RAG indexing and reranking |

Details

Go SDK HTTP Middleware / Transport Wrappers (#3090)

PR #3090 added `options.WithHTTPTransportWrapper` — a new `Opt` that lets Go SDK embedders inject an HTTP middleware (e.g. tracing, custom auth headers, metrics) into the transport chain of all provider clients. No documentation existed for this API.

Added a new "HTTP Middleware / Transport Wrappers" section to `docs/guides/go-sdk/index.md` covering:

  • Full Go code example (custom header injection pattern)
  • The fact that `base` already carries OTel/SSE/proxy instrumentation
  • Supported providers: Anthropic, OpenAI, Gemini (GeminiAPI backend)
  • Warning callout for Bedrock and Vertex AI (unsupported; warning is logged)
  • Gateway vs. direct mode lifecycle difference (wrapper called once vs. per-request)
  • Nil-return-is-a-bug note
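
For readers who have not seen the pattern, the header-injection middleware described above can be sketched with the standard library alone. This is a minimal illustration, not the snippet added to the docs: `headerTransport`, `wrapWithHeader`, `observedHeader`, and the `X-Demo-Tenant` header are hypothetical names, and the exact function shape that `options.WithHTTPTransportWrapper` expects is not shown in this PR, so the wiring into the SDK is deliberately left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// headerTransport is a hypothetical RoundTripper that sets one header
// and delegates everything else to the wrapped base transport.
type headerTransport struct {
	base       http.RoundTripper
	key, value string
}

func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not mutate the caller's request, so clone it first.
	clone := req.Clone(req.Context())
	clone.Header.Set(t.key, t.value)
	return t.base.RoundTrip(clone)
}

// wrapWithHeader has the usual middleware shape: it receives the base
// transport and always returns a non-nil wrapper around it.
func wrapWithHeader(base http.RoundTripper) http.RoundTripper {
	if base == nil {
		base = http.DefaultTransport
	}
	return &headerTransport{base: base, key: "X-Demo-Tenant", value: "demo-value"}
}

// observedHeader sends one request through the wrapper to a local test
// server and reports the header value the server received.
func observedHeader() (string, error) {
	seen := make(chan string, 1)
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Header.Get("X-Demo-Tenant")
		w.WriteHeader(http.StatusNoContent)
	}))
	defer srv.Close()

	client := &http.Client{Transport: wrapWithHeader(http.DefaultTransport)}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return <-seen, nil
}

func main() {
	got, err := observedHeader()
	fmt.Println(got, err)
}
```

Delegating to `base` rather than to a fresh transport is what preserves whatever instrumentation the base already carries, and returning a non-nil value from the wrapper avoids the nil-return bug called out in the list.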

RAG permanent error behavior (#3091)

PR #3091 fixed a bug where a misconfigured embedding or reranking model (e.g. a typo in the model name) caused a flood of doomed requests. The new behavior — aborting indexing on the first permanent failure and permanently disabling the reranker on permanent errors — was not documented.

Added a paragraph to the "Debugging RAG" section of `docs/tools/rag/index.md` explaining:

  • What counts as a permanent error (400, 401, 404, 429)
  • Indexing: aborts on first permanent failure and surfaces the error immediately
  • Reranking: permanently disabled for the manager lifetime; transient errors (5xx/timeouts) still fall back and retry
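
The two behaviors in this list can be illustrated with a small self-contained sketch. It is not the code from #3091: `permanentError`, `errTransient`, `indexChunks`, and `reranker` are hypothetical names chosen for illustration, and how the real implementation decides that an error is permanent is not modeled here.

```go
package main

import (
	"errors"
	"fmt"
)

// permanentError is a hypothetical marker type for failures that retrying cannot fix.
type permanentError struct{ status int }

func (e *permanentError) Error() string {
	return fmt.Sprintf("permanent model error: HTTP %d", e.status)
}

// errTransient stands in for a 5xx response or a timeout.
var errTransient = errors.New("transient model error")

// indexChunks embeds chunks in order and stops at the first permanent error,
// returning how many chunks were indexed before the abort.
func indexChunks(chunks []string, embed func(string) error) (int, error) {
	indexed := 0
	for _, c := range chunks {
		err := embed(c)
		var perm *permanentError
		switch {
		case err == nil:
			indexed++
		case errors.As(err, &perm):
			// Surface the error immediately instead of issuing more doomed requests.
			return indexed, fmt.Errorf("indexing aborted: %w", err)
		default:
			// Transient failure: this sketch just skips the chunk.
		}
	}
	return indexed, nil
}

// reranker falls back to the input order on any error and disables itself
// for good after a permanent one. It is not safe for concurrent use.
type reranker struct {
	disabled bool
	call     func([]string) ([]string, error)
}

func (r *reranker) rerank(docs []string) []string {
	if r.disabled {
		return docs
	}
	out, err := r.call(docs)
	if err != nil {
		var perm *permanentError
		if errors.As(err, &perm) {
			r.disabled = true
		}
		return docs
	}
	return out
}

func main() {
	n, err := indexChunks([]string{"a", "b", "c"}, func(s string) error {
		if s == "b" {
			return &permanentError{status: 404}
		}
		return nil
	})
	fmt.Println(n, err)
}
```

In this sketch a transient reranker failure leaves `disabled` false, so the next query tries the reranker again, while a permanent failure turns every later call into a pass-through.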

PRs reviewed and found up to date

| Source PR | Reason |
| --- | --- |
| #3095 | `force_handoff`: docs included in the PR (multi-agent concepts, agent config ref, hooks ref) |
| #3086 | Gateway model discovery: TUI docs updated in the PR |
| #3092 | CHANGELOG v1.79.0: automated release update |
| #3087 | Require GPG/SSH commit signing: AGENTS.md only (contributor guide) |
| #3085 | Semgrep MCP catalog auth fix: internal catalog metadata, no user docs |
| #3084 | Previous docs update PR |

@aheritier requested a review from a team as a code owner June 13, 2026 04:05
@aheritier added the kind/docs (Documentation-only changes) label Jun 13, 2026
docker-agent

This comment was marked as resolved.

@aheritier aheritier added area/api For features/issues/fixes related to the usage of the cagent API area/rag For work/issues that have to do with the RAG features labels Jun 13, 2026
@aheritier

Contributor Author

/review

@docker-agent docker-agent left a comment

Assessment: 🟡 NEEDS ATTENTION

Documentation-only PR adding two new sections. One medium-severity accuracy issue found in the RAG permanent error documentation — HTTP 429 (rate limiting) is grouped with fatal configuration errors (bad auth, unknown model), which contradicts standard HTTP semantics and the PR's own stated motivation. A prior fix commit in the file history suggests this was already flagged once. One low-severity clarity note on the Go SDK code snippet.

Comment thread docs/tools/rag/index.md

Look for log tags: `[RAG Manager]`, `[Chunked-Embeddings Strategy]`, `[BM25 Strategy]`, `[RRF Fusion]`, `[Reranker]`.

**Permanent model errors abort early.** If the embedding model, semantic-LLM model, or reranking model returns a permanent error (HTTP 400, 401, 404, or 429 — invalid config, bad auth, unknown model, or rate limit), docker-agent treats the model configuration as invalid and stops immediately rather than retrying doomed requests:

[MEDIUM] HTTP 429 misclassified as a permanent/fatal configuration error

The paragraph groups HTTP 429 ("Too Many Requests") with 400, 401, and 404 and treats all four as evidence of an invalid model configuration ("invalid config, bad auth, unknown model, or rate limit"). However, 429 is a transient rate-limit signal, not evidence of a misconfigured model. RFC 6585 defines 429 as a rate-limiting response that may include a Retry-After header telling the client how long to wait before retrying.

As written, the text implies:

  • Indexing: aborted entirely on a 429 — user loses all partially-indexed content with no retry
  • Reranking: reranker permanently disabled for the manager's lifetime on a single 429 — user must restart the agent to restore reranking

This is doubly alarming because the PR description says PR #3091 was motivated by preventing "a flood of doomed requests" from misconfigured models (wrong model name, bad API key). A transient 429 from a correctly-configured model under load is not a "doomed" scenario.

The file history confirms the issue: commit 9448b75e is titled "fix: 429 misclassified as permanent error in RAG docs (#3099)" — suggesting a prior version of this PR already attempted to correct this, but the fix may not have made it into the current diff.

Suggestion: Remove 429 from the permanent-error list and describe it separately as a transient error that triggers backoff/retry (consistent with 5xx treatment), unless the implementation genuinely treats a 429 as a fatal permanent failure (in which case the PR description and the prior fix commit are misleading and the behaviour itself should be reconsidered).
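
The split this suggestion proposes could look roughly like the sketch below. It is an illustration of the suggested classification only, not the project's implementation: `isPermanent`, `isTransient`, and `retryAfterSeconds` are hypothetical helpers, and only the delay-in-seconds form of Retry-After is handled (the HTTP-date form falls back to the default).

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
)

// isPermanent reports whether a status points at a configuration problem
// that retrying cannot fix: invalid request, bad auth, or unknown model.
func isPermanent(status int) bool {
	switch status {
	case http.StatusBadRequest, http.StatusUnauthorized, http.StatusNotFound:
		return true
	default:
		return false
	}
}

// isTransient reports whether a status is worth retrying after a delay:
// rate limiting (429) or a server-side failure (5xx).
func isTransient(status int) bool {
	return status == http.StatusTooManyRequests || (status >= 500 && status <= 599)
}

// retryAfterSeconds parses a Retry-After value given in seconds and returns
// fallback when the value is missing, malformed, or negative.
func retryAfterSeconds(value string, fallback int) int {
	secs, err := strconv.Atoi(strings.TrimSpace(value))
	if err != nil || secs < 0 {
		return fallback
	}
	return secs
}

func main() {
	for _, s := range []int{400, 401, 404, 429, 503} {
		fmt.Printf("%d permanent=%t transient=%t\n", s, isPermanent(s), isTransient(s))
	}
	fmt.Println(retryAfterSeconds("30", 5))
}
```

Status codes outside both sets (for example 403) are left unclassified here because neither the PR description nor this review says how they should be handled.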

Comment thread docs/guides/go-sdk/index.md Outdated
},
)

client, err := openai.NewClient(ctx, modelCfg, env, wrapper)

[LOW] Code snippet references modelCfg without definition or type annotation

The snippet ends with:

client, err := openai.NewClient(ctx, modelCfg, env, wrapper)

`modelCfg` is not declared anywhere in the snippet, and its type (`*latest.ModelConfig`) is not annotated. Readers following along from the Basic Example section may not know how to obtain or construct one.

Suggestion: Either add a brief inline comment such as `// modelCfg is a *latest.ModelConfig, see the Basic Example section above`, or include a minimal declaration like:

modelCfg := &latest.ModelConfig{Model: "gpt-4o"}

The earlier `openai.NewClient` call at line 83 uses an inline struct literal, which makes its type self-evident; this snippet could follow the same pattern.

@dgageot dgageot merged commit 70a784c into main Jun 13, 2026
9 checks passed


3 participants