feat(agent): call LiteLLM directly and authenticate the agent-service #5558
Open
bobbai00 wants to merge 4 commits into
The LiteLLM HTTP proxy in access-control-service existed so the browser could reach LiteLLM without holding the master key. The agent-service is now a trusted backend that holds the key itself, so the proxy hop is redundant.
- agent-service: build the OpenAI client against LITELLM_BASE_URL with the master key; serve the model list at GET /api/agents/models from LiteLLM.
- access-control-service: delete LiteLLMProxyResource / LiteLLMModelsResource, their auth spec, and the now-dead RolesAllowedDynamicFeature registration; drop LLMConfig and llm.conf.
- frontend: fetch models from /api/agents/models.
- routing/deploy: drop the /api/models and /api/chat proxy routes (nginx, k8s gateway, dev proxy); point the agent-service deployment at LiteLLM with the litellm-master-key secret; update the enable-LLM guide.
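For reviewers, a minimal sketch of what "call LiteLLM directly" could look like on the agent-service side. This is not the code in this PR: `LiteLLMConfig`, `liteLLMUrl`, `authHeaders`, `listModelIds`, and `FetchLike` are hypothetical names, and the sketch assumes LiteLLM answers `GET /models` in the OpenAI-compatible list shape (`{ data: [{ id }] }`) with the master key sent as a Bearer token.

```typescript
// Illustrative only: helper names are hypothetical. LITELLM_BASE_URL,
// LITELLM_MASTER_KEY and the /models path are the only parts taken from the PR.
interface LiteLLMConfig {
  baseUrl: string; // value of LITELLM_BASE_URL
  masterKey: string; // value of LITELLM_MASTER_KEY
}

// Narrow fetch-like type so the helper can be exercised with a stub.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Join base URL and path without doubling or dropping the slash.
function liteLLMUrl(cfg: LiteLLMConfig, path: string): string {
  return cfg.baseUrl.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "");
}

// The master key never leaves the backend; it is attached here, server-side.
function authHeaders(cfg: LiteLLMConfig): Record<string, string> {
  return { Authorization: `Bearer ${cfg.masterKey}` };
}

// Assumed response shape: OpenAI-compatible model list.
async function listModelIds(
  cfg: LiteLLMConfig,
  fetchImpl: FetchLike
): Promise<string[]> {
  const res = await fetchImpl(liteLLMUrl(cfg, "/models"), {
    method: "GET",
    headers: authHeaders(cfg),
  });
  if (!res.ok) {
    throw new Error(`LiteLLM returned ${res.status}`);
  }
  const body = (await res.json()) as { data?: Array<{ id: string }> };
  return (body.data ?? []).map((m) => m.id);
}
```

A handler for `GET /api/agents/models` would then return the result of `listModelIds` to the frontend, so the browser only ever sees model ids and never the key.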
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #5558 +/- ##
============================================
- Coverage 52.16% 52.14% -0.03%
+ Complexity 2482 2480 -2
============================================
Files 1067 1067
Lines 41273 41256 -17
Branches 4437 4436 -1
============================================
- Hits 21532 21514 -18
+ Misses 18479 18474 -5
- Partials 1262 1268 +6
agent-service call LiteLLM directly and remove access-control LLM proxy
…t_authz
Phase 1 authentication for the agent service (see apache#5561):
- agent-service: real HS256 JWT verification using the shared secret read from auth.conf (env AUTH_JWT_SECRET); a guard rejects unauthenticated REST (Bearer) and WebSocket (access-token query) requests. auth.conf is bundled into the image.
- access-control-service: authorize() gains an /api/agents branch that verifies the JWT and requires REGULAR/ADMIN, returning the trusted x-user-* headers (allow-all per-agent for now; per-agent ownership is deferred to apache#5302).
- gateways: nginx auth_request (single-node) and an Envoy SecurityPolicy (k8s) route /api/agents through the access-control-service; the agent-service deployment now receives AUTH_JWT_SECRET.
- frontend: attach the JWT to every agent call (Bearer on REST, access-token on the /react WebSocket).
- tests: agent-service jwt.test.ts + server.test.ts guard cases; access-control AgentAccessAuthSpec.
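To make the HS256 verification concrete, here is a rough sketch using only Node's built-in `crypto` module. It is an illustration under assumptions, not the agent-service implementation: the function names are hypothetical, the list of required claims is passed in because the commit message does not name them, and treating a missing `exp` as a rejection is an assumption.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Claims = Record<string, unknown>;

// Decode one base64url JWT segment into a JSON object.
function decodeSegment(segment: string): Claims {
  return JSON.parse(Buffer.from(segment, "base64url").toString("utf8"));
}

// HMAC-SHA256 over "<header>.<payload>" with the shared secret.
function signHS256(signingInput: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(signingInput).digest();
}

// Hypothetical verifier: throws on any failure, returns the claims otherwise.
function verifyHS256(
  token: string,
  secret: string, // e.g. the value of AUTH_JWT_SECRET
  nowSec: number,
  requiredClaims: string[] // assumption: the real claim list is not in the PR text
): Claims {
  const parts = token.split(".");
  if (parts.length !== 3) throw new Error("malformed token");
  const [h, p, s] = parts as [string, string, string];

  // Pin the algorithm: rejecting anything but HS256 blocks alg-confusion
  // tokens such as {"alg":"none"}.
  if (decodeSegment(h).alg !== "HS256") throw new Error("unexpected alg");

  // Constant-time signature comparison.
  const expected = signHS256(`${h}.${p}`, secret);
  const actual = Buffer.from(s, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error("bad signature");
  }

  const claims = decodeSegment(p);
  if (typeof claims.exp !== "number" || claims.exp <= nowSec) {
    throw new Error("expired");
  }
  for (const name of requiredClaims) {
    if (!(name in claims)) throw new Error(`missing claim: ${name}`);
  }
  return claims;
}

// Demo-only helper that mints a token so the verifier can be exercised.
function makeToken(header: Claims, claims: Claims, secret: string): string {
  const enc = (o: Claims) => Buffer.from(JSON.stringify(o)).toString("base64url");
  const input = `${enc(header)}.${enc(claims)}`;
  return `${input}.${signHS256(input, secret).toString("base64url")}`;
}
```

The four failure branches line up with the case families listed for `jwt.test.ts` (signature, expiry, alg-confusion, missing claim); the actual tests may slice them differently.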
agent-service call LiteLLM directly and remove access-control LLM proxy
@xuang7 I would like to include this in the release v1.2
What changes were proposed in this PR?
Two related changes to how the agent service reaches LiteLLM and how its endpoints are secured.
1. Remove the LiteLLM HTTP proxy. Deletes LiteLLMProxyResource / LiteLLMModelsResource from access-control-service and has the agent-service, a trusted backend that already holds the LiteLLM master key, call LiteLLM directly for both chat completions and the model list. The proxy only existed so the browser could reach LiteLLM without holding the master key; that hop is now redundant.
- agent-service: builds the OpenAI client against LITELLM_BASE_URL with the master key (LITELLM_MASTER_KEY); serves the model list at GET /api/agents/models by calling LiteLLM directly.
- access-control-service: deletes LiteLLMProxyResource / LiteLLMModelsResource, their auth spec, and the now-dead RolesAllowedDynamicFeature registration; drops LLMConfig and llm.conf.
- frontend: fetches models from /api/agents/models.
- routing/deploy: drops the /api/models and /api/chat proxy routes (nginx, k8s gateway, dev proxy); points the agent-service deployment at LiteLLM with the litellm-master-key secret; updates the enable-LLM guide.

Before: /api/models and /api/chat/completions are served by the access-control-service, which forwards to LiteLLM with the master key. (The /api/chat/* proxy route was never called by the frontend; only the agent-service used it.)

    flowchart LR
      FE[Frontend]
      AS[agent-service]
      AC["access-control-service<br/>(LiteLLM proxy)"]
      LLM[LiteLLM]
      FE -->|"GET /api/models"| AC
      AS -->|"POST /api/chat/completions"| AC
      AC -->|"forwards + adds master key"| LLM

After: the agent-service calls LiteLLM directly (it holds the master key) and serves the model list at /api/agents/models; the access-control-service is no longer in the LLM path.

    flowchart LR
      FE[Frontend]
      AS[agent-service]
      LLM[LiteLLM]
      FE -->|"GET /api/agents/models"| AS
      AS -->|"/models + /chat/completions<br/>(master key)"| LLM

2. Authenticate the agent service (Phase 1 of #5561). Now that the model list and LLM path live on the agent-service, its endpoints are authorized by the access-control-service the same way computing-unit / execution traffic already is.
- access-control-service: authorize() gains an /api/agents branch that verifies the JWT and requires REGULAR/ADMIN, returning the trusted x-user-* headers (allow-all per-agent for now; per-agent ownership is deferred to #5302, "Improvement: Treat TexeraAgents as access-controlled Resources, just like other resources (workflows, datasets and computing units)").
- agent-service: real HS256 JWT verification using the shared secret read from auth.conf (env AUTH_JWT_SECRET), as defense-in-depth. A guard rejects unauthenticated REST (Authorization: Bearer) and WebSocket (?access-token=…) requests.
- gateways: nginx auth_request (single-node) and a k8s Envoy SecurityPolicy route /api/agents through access-control; the agent-service deployment now receives AUTH_JWT_SECRET.
- frontend: attaches the JWT to every agent call (Bearer on REST, access-token on the /react WebSocket).

See #5561 for the authentication design and traffic diagram.
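As a rough illustration of the guard behaviour described above, the sketch below shows where the token comes from on each transport and how a verified role could map to a status code. All function names are hypothetical, the `/react?access-token=…` request path is used only as an example, and answering 403 for every role other than REGULAR/ADMIN is an assumption (the PR text only reports 403 for INACTIVE).

```typescript
// REST: token is carried as "Authorization: Bearer <jwt>".
function bearerToken(authorization: string | undefined): string | null {
  if (!authorization) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  return match?.[1] ?? null;
}

// WebSocket: browsers cannot set headers on the upgrade request, so the
// token travels in the "access-token" query parameter instead.
function webSocketToken(requestUrl: string): string | null {
  // requestUrl is usually path + query only, so supply a dummy base.
  const url = new URL(requestUrl, "http://localhost");
  const token = url.searchParams.get("access-token");
  return token && token.length > 0 ? token : null;
}

// role === null stands for "no token, or the token failed verification".
// Assumption: any role other than REGULAR/ADMIN is answered with 403.
function authorizeStatus(role: string | null): 200 | 401 | 403 {
  if (role === null) return 401;
  return role === "REGULAR" || role === "ADMIN" ? 200 : 403;
}
```

Keeping token extraction separate from verification lets the REST and WebSocket paths share one verifier, which fits the defense-in-depth setup where both the gateway and the agent-service check the same JWT.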
Any related issues, documentation, discussions?
Closes #5422
Implements Phase 1 of #5561 (per-agent authorization is deferred — see #5302).
How was this PR tested?
- agent-service: tsc --noEmit and 109 tests pass, including jwt.test.ts (HS256 signature / expiry / alg-confusion / missing-claim cases) and the auth-guard cases (401 on missing/invalid/expired token, healthcheck open).
- access-control-service: compiles, AgentAccessAuthSpec + AccessControlServiceRunSpec pass; scalafmt/scalafix clean.
- The Helm chart renders with the updated env/secret/routes/SecurityPolicy.
- GET /api/agents/models returns the model list and the old /api/models 404s; the nginx auth_request returns 401 (no / invalid / expired token), 403 (INACTIVE role), 200 (REGULAR/ADMIN); the agent-service's own guard rejects direct unauthenticated access; the /react WebSocket is gated.
- SecurityPolicy against a live Envoy Gateway: ext_authz returns 401 / 403 / 200 and the agent responds over the WebSocket through Envoy.
Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.8 (1M context)