-
Notifications
You must be signed in to change notification settings - Fork 389
docs: document WithHTTPTransportWrapper and RAG permanent error behavior #3099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
8c28e21
6488975
9448b75
6fb8311
fde157c
67a1786
69e1447
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -183,6 +183,11 @@ $ docker agent run config.yaml --debug --log-file debug.log | |
|
|
||
| Look for log tags: `[RAG Manager]`, `[Chunked-Embeddings Strategy]`, `[BM25 Strategy]`, `[RRF Fusion]`, `[Reranker]`. | ||
|
|
||
| **Permanent model errors abort early.** If the embedding model, semantic-LLM model, or reranking model returns a permanent error (HTTP 400, 401, 404, or 429 — invalid config, bad auth, unknown model, or rate limit), docker-agent treats the model configuration as invalid and stops immediately rather than retrying doomed requests: | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [MEDIUM] HTTP 429 misclassified as a permanent/fatal configuration error The paragraph groups HTTP 429 ("Too Many Requests") together with 400, 401, and 404 and describes it as indicating an "invalid config, bad auth, or unknown model". However 429 is a transient rate-limit signal, not evidence of a misconfigured model. Standard HTTP semantics (RFC 6585) define 429 as retriable (often with a The current text says:
This is doubly alarming because the PR description says PR #3091 was motivated by preventing "a flood of doomed requests" from misconfigured models (wrong model name, bad API key). A transient 429 from a correctly-configured model under load is not a "doomed" scenario. The file history confirms the issue: commit Suggestion: Remove 429 from the permanent-error list and describe it separately as a transient error that triggers backoff/retry (consistent with 5xx treatment), unless the implementation genuinely treats a 429 as a fatal permanent failure (in which case the PR description and the prior fix commit are misleading and the behaviour itself should be reconsidered). |
||
|
|
||
| - **Indexing** — the entire indexing run is aborted after the first permanent failure (including 429). The error is surfaced in the logs so you know immediately if a model name or API key is wrong, rather than silently producing incomplete results. | ||
| - **Reranking** — a permanent error (including 429) permanently disables the reranker for the lifetime of the manager. Subsequent queries fall back to un-reranked results. Only transient errors (5xx, timeouts) fall back and retry on the next query. | ||
|
|
||
| <div class="callout callout-tip" markdown="1"> | ||
| <div class="callout-title">Examples | ||
| </div> | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[HIGH]
options.WithHTTPTransportWrapperdoes not exist in the codebaseThe documented function
options.WithHTTPTransportWrapperhas no implementation anywhere in the repository. A repo-wide search of all.gofiles returns zero results for this symbol, andpkg/model/provider/options/options.gocontains no transport-related functions — onlyWithGateway,WithStructuredOutput,WithGeneratingTitle,WithMaxTokens,WithNoThinking,WithProviders, andWithModelsDevStore.Documenting a non-existent API will mislead users into writing code that will not compile. This section should be held until the corresponding implementation is merged, or the docs should reference the correct function name if it was renamed.