From 8c28e215a0e4252477d9c1a5afdc5345503f9290 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arnaud=20H=C3=A9ritier?= Date: Sat, 13 Jun 2026 04:04:25 +0000 Subject: [PATCH 1/7] docs: document WithHTTPTransportWrapper for HTTP middleware injection (#3090) --- docs/guides/go-sdk/index.md | 40 +++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/docs/guides/go-sdk/index.md b/docs/guides/go-sdk/index.md index 2e8f8a93e..efb029f16 100644 --- a/docs/guides/go-sdk/index.md +++ b/docs/guides/go-sdk/index.md @@ -289,6 +289,46 @@ func createAgentWithBuiltinTools(llm provider.Provider) *agent.Agent { } ``` +## HTTP Middleware / Transport Wrappers + +Use `options.WithHTTPTransportWrapper` to inject HTTP middleware into the transport chain of all provider clients built by docker-agent. This is useful for request tracing, injecting custom headers, collecting metrics, or any other cross-cutting concern at the HTTP layer. + +```go +import ( + "net/http" + + "github.com/docker/docker-agent/pkg/model/provider/options" +) + +// Example: add a custom header to every outbound LLM request +wrapper := options.WithHTTPTransportWrapper( + func(base http.RoundTripper) http.RoundTripper { + return roundTripperFunc(func(req *http.Request) (*http.Response, error) { + req = req.Clone(req.Context()) + req.Header.Set("X-Request-Source", "my-app") + return base.RoundTrip(req) + }) + }, +) + +client, err := openai.NewClient(ctx, modelCfg, env, wrapper) +``` + +The wrapper receives the already-instrumented transport (OpenTelemetry, SSE decompression, Desktop proxy support) as its `base` argument, so wrapping it preserves all built-in behaviour. + +**Supported providers:** Anthropic, OpenAI, Gemini (GeminiAPI backend). Works in both direct and gateway/proxy mode. + +
+
Bedrock and Vertex AI not supported +
+

Bedrock and Vertex AI use SDK-managed transports that docker-agent cannot intercept. Passing `WithHTTPTransportWrapper` when targeting these providers has no effect; a warning is logged at startup.

+ +
+ +In **gateway mode** the wrapper is called on every LLM request because gateway clients are rebuilt each call for short-lived auth tokens. In **direct mode** it is called once at client construction. + +Returning `nil` from your wrapper function is treated as a bug: docker-agent logs a warning and preserves the original transport instead. + ## Using Different Providers ```go From 648897540af4bd40755958ba1bda088bfb4c01bb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arnaud=20H=C3=A9ritier?= Date: Sat, 13 Jun 2026 04:04:45 +0000 Subject: [PATCH 2/7] docs: document permanent error behavior in RAG indexing and reranking (#3091) --- docs/tools/rag/index.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/tools/rag/index.md b/docs/tools/rag/index.md index ccc5af3d6..ec30112e2 100644 --- a/docs/tools/rag/index.md +++ b/docs/tools/rag/index.md @@ -183,6 +183,11 @@ $ docker agent run config.yaml --debug --log-file debug.log Look for log tags: `[RAG Manager]`, `[Chunked-Embeddings Strategy]`, `[BM25 Strategy]`, `[RRF Fusion]`, `[Reranker]`. +**Permanent model errors abort early.** If the embedding model, semantic-LLM model, or reranking model returns a permanent error (HTTP 400, 401, 404, or 429), docker-agent treats the model configuration as invalid and stops immediately rather than retrying doomed requests: + +- **Indexing** — the entire indexing run is aborted after the first permanent failure. The error is surfaced in the logs so you know immediately if a model name or API key is wrong, rather than silently producing incomplete results. +- **Reranking** — a permanent error permanently disables the reranker for the lifetime of the manager. Subsequent queries fall back to un-reranked results. Transient errors (5xx, timeouts) still fall back and retry on the next query. +
Examples
From 9448b75e827732f5915a9ec288a62963375d055a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arnaud=20H=C3=A9ritier?= Date: Sat, 13 Jun 2026 04:29:29 +0000 Subject: [PATCH 3/7] docs: fix 429 misclassified as permanent error in RAG docs (#3099) --- docs/tools/rag/index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/tools/rag/index.md b/docs/tools/rag/index.md index ec30112e2..14263e307 100644 --- a/docs/tools/rag/index.md +++ b/docs/tools/rag/index.md @@ -183,10 +183,10 @@ $ docker agent run config.yaml --debug --log-file debug.log Look for log tags: `[RAG Manager]`, `[Chunked-Embeddings Strategy]`, `[BM25 Strategy]`, `[RRF Fusion]`, `[Reranker]`. -**Permanent model errors abort early.** If the embedding model, semantic-LLM model, or reranking model returns a permanent error (HTTP 400, 401, 404, or 429), docker-agent treats the model configuration as invalid and stops immediately rather than retrying doomed requests: +**Permanent model errors abort early.** If the embedding model, semantic-LLM model, or reranking model returns a permanent error (HTTP 400, 401, or 404 — invalid config, bad auth, or unknown model), docker-agent treats the model configuration as invalid and stops immediately rather than retrying doomed requests: - **Indexing** — the entire indexing run is aborted after the first permanent failure. The error is surfaced in the logs so you know immediately if a model name or API key is wrong, rather than silently producing incomplete results. -- **Reranking** — a permanent error permanently disables the reranker for the lifetime of the manager. Subsequent queries fall back to un-reranked results. Transient errors (5xx, timeouts) still fall back and retry on the next query. +- **Reranking** — a permanent error permanently disables the reranker for the lifetime of the manager. Subsequent queries fall back to un-reranked results. Transient errors (5xx, timeouts) and rate limits (429) still fall back and retry on the next query.
Examples From 6fb831117325cf84727deccf7fc1e432ee05a113 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arnaud=20H=C3=A9ritier?= Date: Sat, 13 Jun 2026 11:45:09 +0000 Subject: [PATCH 4/7] fix: correct roundTripperFunc example and 429 reranker/indexing behavior in docs --- docs/guides/go-sdk/index.md | 18 ++++++++++++------ docs/tools/rag/index.md | 6 +++--- 2 files changed, 15 insertions(+), 9 deletions(-) diff --git a/docs/guides/go-sdk/index.md b/docs/guides/go-sdk/index.md index efb029f16..9bfbf6aef 100644 --- a/docs/guides/go-sdk/index.md +++ b/docs/guides/go-sdk/index.md @@ -300,14 +300,20 @@ import ( "github.com/docker/docker-agent/pkg/model/provider/options" ) +type headerTransport struct { + base http.RoundTripper +} + +func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) { + req = req.Clone(req.Context()) + req.Header.Set("X-Request-Source", "my-app") + return t.base.RoundTrip(req) +} + // Example: add a custom header to every outbound LLM request wrapper := options.WithHTTPTransportWrapper( - func(base http.RoundTripper) http.RoundTripper { - return roundTripperFunc(func(req *http.Request) (*http.Response, error) { - req = req.Clone(req.Context()) - req.Header.Set("X-Request-Source", "my-app") - return base.RoundTrip(req) - }) + func(base http.RoundTripper) (http.RoundTripper, error) { + return &headerTransport{base: base}, nil }, ) diff --git a/docs/tools/rag/index.md b/docs/tools/rag/index.md index 14263e307..35576f9d4 100644 --- a/docs/tools/rag/index.md +++ b/docs/tools/rag/index.md @@ -183,10 +183,10 @@ $ docker agent run config.yaml --debug --log-file debug.log Look for log tags: `[RAG Manager]`, `[Chunked-Embeddings Strategy]`, `[BM25 Strategy]`, `[RRF Fusion]`, `[Reranker]`. 
-**Permanent model errors abort early.** If the embedding model, semantic-LLM model, or reranking model returns a permanent error (HTTP 400, 401, or 404 — invalid config, bad auth, or unknown model), docker-agent treats the model configuration as invalid and stops immediately rather than retrying doomed requests: +**Permanent model errors abort early.** If the embedding model, semantic-LLM model, or reranking model returns a permanent error (HTTP 400, 401, 404, or 429 — invalid config, bad auth, unknown model, or rate limit), docker-agent treats the model configuration as invalid and stops immediately rather than retrying doomed requests: -- **Indexing** — the entire indexing run is aborted after the first permanent failure. The error is surfaced in the logs so you know immediately if a model name or API key is wrong, rather than silently producing incomplete results. -- **Reranking** — a permanent error permanently disables the reranker for the lifetime of the manager. Subsequent queries fall back to un-reranked results. Transient errors (5xx, timeouts) and rate limits (429) still fall back and retry on the next query. +- **Indexing** — the entire indexing run is aborted after the first permanent failure (including 429). The error is surfaced in the logs so you know immediately if a model name or API key is wrong, rather than silently producing incomplete results. +- **Reranking** — a permanent error (including 429) permanently disables the reranker for the lifetime of the manager. Subsequent queries fall back to un-reranked results. Only transient errors (5xx, timeouts) fall back and retry on the next query.
Examples From fde157c2c117b1d1c560fbfe5aa18fcbd8412a5e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arnaud=20H=C3=A9ritier?= Date: Sat, 13 Jun 2026 11:46:25 +0000 Subject: [PATCH 5/7] fix: correct WithHTTPTransportWrapper closure signature in go-sdk docs --- docs/guides/go-sdk/index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/guides/go-sdk/index.md b/docs/guides/go-sdk/index.md index 9bfbf6aef..c86046f49 100644 --- a/docs/guides/go-sdk/index.md +++ b/docs/guides/go-sdk/index.md @@ -312,8 +312,8 @@ func (t *headerTransport) RoundTrip(req *http.Request) (*http.Response, error) { // Example: add a custom header to every outbound LLM request wrapper := options.WithHTTPTransportWrapper( - func(base http.RoundTripper) (http.RoundTripper, error) { - return &headerTransport{base: base}, nil + func(base http.RoundTripper) http.RoundTripper { + return &headerTransport{base: base} }, ) From 67a17861ea61a4ac860c83cd5784989c89554c15 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arnaud=20H=C3=A9ritier?= Date: Sat, 13 Jun 2026 16:00:52 +0000 Subject: [PATCH 6/7] docs: clarify modelCfg in transport wrapper example and add 429 note (#3099) --- docs/guides/go-sdk/index.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/guides/go-sdk/index.md b/docs/guides/go-sdk/index.md index c86046f49..39811848b 100644 --- a/docs/guides/go-sdk/index.md +++ b/docs/guides/go-sdk/index.md @@ -317,7 +317,10 @@ wrapper := options.WithHTTPTransportWrapper( }, ) -client, err := openai.NewClient(ctx, modelCfg, env, wrapper) +client, err := openai.NewClient(ctx, &latest.ModelConfig{ + Provider: "openai", + Model: "gpt-4o", +}, env, wrapper) ``` The wrapper receives the already-instrumented transport (OpenTelemetry, SSE decompression, Desktop proxy support) as its `base` argument, so wrapping it preserves all built-in behaviour. @@ -331,7 +334,7 @@ The wrapper receives the already-instrumented transport (OpenTelemetry, SSE deco
-In **gateway mode** the wrapper is called on every LLM request because gateway clients are rebuilt each call for short-lived auth tokens. In **direct mode** it is called once at client construction. +In **gateway mode** the wrapper is called on every LLM request because gateway clients are rebuilt each call for short-lived auth tokens. In **direct mode** it is called once at client construction. Rate-limit responses (HTTP 429) are classified as non-retryable by the runtime and cause the model chain to skip to the next fallback, so wrappers that track per-request outcomes will observe these as failures rather than retried calls. Returning `nil` from your wrapper function is treated as a bug: docker-agent logs a warning and preserves the original transport instead. From 69e1447df4c5e19ce639ec2c6e4c4f932776f2e1 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arnaud=20H=C3=A9ritier?= Date: Sat, 13 Jun 2026 16:03:38 +0000 Subject: [PATCH 7/7] fix: correct provider support matrix and nil-wrapper wording in go-sdk docs --- docs/guides/go-sdk/index.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/guides/go-sdk/index.md b/docs/guides/go-sdk/index.md index 39811848b..9f3afbf14 100644 --- a/docs/guides/go-sdk/index.md +++ b/docs/guides/go-sdk/index.md @@ -325,18 +325,18 @@ client, err := openai.NewClient(ctx, &latest.ModelConfig{ The wrapper receives the already-instrumented transport (OpenTelemetry, SSE decompression, Desktop proxy support) as its `base` argument, so wrapping it preserves all built-in behaviour. -**Supported providers:** Anthropic, OpenAI, Gemini (GeminiAPI backend). Works in both direct and gateway/proxy mode. +**Supported providers:** Anthropic, OpenAI, Gemini (GeminiAPI backend), Bedrock. Works in both direct and gateway/proxy mode.
-
Bedrock and Vertex AI not supported +
Vertex AI not supported
-

Bedrock and Vertex AI use SDK-managed transports that docker-agent cannot intercept. Passing `WithHTTPTransportWrapper` when targeting these providers has no effect; a warning is logged at startup.

+

Vertex AI uses an ADC-managed HTTP client that docker-agent cannot intercept. When a transport wrapper is set, docker-agent falls back to the GeminiAPI backend instead of Vertex AI — a debug message is logged.

In **gateway mode** the wrapper is called on every LLM request because gateway clients are rebuilt each call for short-lived auth tokens. In **direct mode** it is called once at client construction. Rate-limit responses (HTTP 429) are classified as non-retryable by the runtime and cause the model chain to skip to the next fallback, so wrappers that track per-request outcomes will observe these as failures rather than retried calls. -Returning `nil` from your wrapper function is treated as a bug: docker-agent logs a warning and preserves the original transport instead. +Returning `nil` from your wrapper function is not allowed; docker-agent logs a warning and keeps the original transport instead. ## Using Different Providers