feat(agent): call LiteLLM directly and authenticate the agent-service by bobbai00 · Pull Request #5558 · apache/texera

bobbai00 · 2026-06-08T04:01:26Z

What changes were proposed in this PR?

Two related changes to how the agent service reaches LiteLLM and how its endpoints are secured.

1 — Remove the LiteLLM HTTP proxy. Deletes LiteLLMProxyResource / LiteLLMModelsResource from access-control-service and has the agent-service — a trusted backend that already holds the LiteLLM master key — call LiteLLM directly for both chat completions and the model list. The proxy only existed so the browser could reach LiteLLM without holding the master key; that hop is now redundant.

agent-service: builds the OpenAI client against LITELLM_BASE_URL with the master key (LITELLM_MASTER_KEY); serves the model list at GET /api/agents/models by calling LiteLLM directly.
access-control-service: deletes the two proxy resources, their auth spec, and the now-dead RolesAllowedDynamicFeature registration; drops LLMConfig and llm.conf.
frontend: fetches models from /api/agents/models.
routing / deploy: drops the /api/models and /api/chat proxy routes (nginx, k8s gateway, dev proxy); points the agent-service deployment at LiteLLM with the litellm-master-key secret; updates the enable-LLM guide.
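
The direct call described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `FetchLike`, `buildModelsRequest`, and `listModels` are hypothetical names, and the response shape (`{ data: [{ id }] }`) assumes LiteLLM returns an OpenAI-compatible model list. Only `LITELLM_BASE_URL`, the master key, and the `/models` path come from the description.

```typescript
// Minimal fetch-like signature so the sketch has no external dependencies.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumed OpenAI-compatible shape of the model list response.
interface ModelList {
  data: { id: string }[];
}

// Build the request the agent-service would send to LiteLLM.
// baseUrl stands in for LITELLM_BASE_URL, masterKey for LITELLM_MASTER_KEY.
function buildModelsRequest(baseUrl: string, masterKey: string) {
  const url = `${baseUrl.replace(/\/+$/, "")}/models`; // tolerate a trailing slash
  return { url, headers: { Authorization: `Bearer ${masterKey}` } };
}

// Fetch the model ids; a non-2xx answer from LiteLLM is surfaced as an error
// so the caller can turn it into a failure response for GET /api/agents/models.
async function listModels(
  fetchImpl: FetchLike,
  baseUrl: string,
  masterKey: string
): Promise<string[]> {
  const { url, headers } = buildModelsRequest(baseUrl, masterKey);
  const res = await fetchImpl(url, { headers });
  if (!res.ok) {
    throw new Error(`LiteLLM /models failed with status ${res.status}`);
  }
  const body = (await res.json()) as ModelList;
  return body.data.map(model => model.id);
}
```

Injecting the fetch function keeps the master key on the server side and lets the success and LiteLLM-failure paths be tested with a fake; in a real service the platform's global `fetch` could be passed in.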

Before — /api/models and /api/chat/completions are served by the access-control-service, which forwards to LiteLLM with the master key. (The /api/chat/* proxy route was never called by the frontend — only the agent-service used it.)

flowchart LR
    FE[Frontend]
    AS[agent-service]
    AC["access-control-service<br/>(LiteLLM proxy)"]
    LLM[LiteLLM]

    FE -->|"GET /api/models"| AC
    AS -->|"POST /api/chat/completions"| AC
    AC -->|"forwards + adds master key"| LLM

After — the agent-service calls LiteLLM directly (it holds the master key) and serves the model list at /api/agents/models; the access-control-service is no longer in the LLM path.

flowchart LR
    FE[Frontend]
    AS[agent-service]
    LLM[LiteLLM]

    FE -->|"GET /api/agents/models"| AS
    AS -->|"/models + /chat/completions<br/>(master key)"| LLM

2 — Authenticate the agent service (Phase 1 of #5561). Now that the model list and LLM path live on the agent-service, its endpoints are authorized by the access-control-service the same way computing-unit / execution traffic already is.

access-control-service: authorize() gains an /api/agents branch that verifies the JWT and requires REGULAR/ADMIN, returning the trusted x-user-* headers. Any authenticated REGULAR/ADMIN user may reach any agent for now; per-agent ownership is deferred to #5302 ("Improvement: Treat TexeraAgents as access-controlled Resources, just like other resources (workflows, datasets and computing units)").
agent-service: also verifies the JWT itself — a real HS256 check using the shared secret read from auth.conf (env AUTH_JWT_SECRET) — as defense-in-depth. A guard rejects unauthenticated REST (Authorization: Bearer) and WebSocket (?access-token=…) requests.
gateways: single-node nginx auth_request and a k8s Envoy SecurityPolicy route /api/agents through access-control; the agent-service deployment now receives AUTH_JWT_SECRET.
frontend: attaches the JWT to every agent call (Bearer on REST, access-token on the /react WebSocket).
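
To make the defense-in-depth check concrete, here is a minimal sketch of HS256 verification with a shared secret, using only Node's built-in `crypto` module. It is an illustration under assumptions, not the agent-service's implementation: `verifyHs256`, `signHs256`, and `parseJsonObject` are hypothetical names, `signHs256` exists only to produce demo tokens, and requiring an `exp` claim is an assumed policy (the PR does not say which claims are mandatory).

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type Claims = Record<string, unknown>;

// Decode one base64url JWT segment into a JSON object, or null if malformed.
function parseJsonObject(segment: string): Claims | null {
  try {
    const value: unknown = JSON.parse(Buffer.from(segment, "base64url").toString("utf8"));
    return typeof value === "object" && value !== null && !Array.isArray(value)
      ? (value as Claims)
      : null;
  } catch {
    return null;
  }
}

// Demo-only helper: mint an HS256 token so the verifier can be exercised.
function signHs256(claims: Claims, secret: string): string {
  const encode = (obj: unknown) => Buffer.from(JSON.stringify(obj)).toString("base64url");
  const signingInput = `${encode({ alg: "HS256", typ: "JWT" })}.${encode(claims)}`;
  const signature = createHmac("sha256", secret).update(signingInput).digest("base64url");
  return `${signingInput}.${signature}`;
}

// Return the claims if the token is well-formed, signed with `secret` using
// HS256, and not expired; otherwise return null.
function verifyHs256(
  token: string,
  secret: string,
  nowSeconds: number = Math.floor(Date.now() / 1000)
): Claims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [headerPart, payloadPart, signaturePart] = parts as [string, string, string];

  const header = parseJsonObject(headerPart);
  const claims = parseJsonObject(payloadPart);
  if (header === null || claims === null) return null;

  // Pin the algorithm: rejects "none" and any other alg (alg-confusion guard).
  if (header["alg"] !== "HS256") return null;

  // Recompute the HMAC over "header.payload" and compare in constant time.
  const expected = createHmac("sha256", secret).update(`${headerPart}.${payloadPart}`).digest();
  const actual = Buffer.from(signaturePart, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  // Assumed policy: `exp` must be present and strictly in the future.
  const exp = claims["exp"];
  if (typeof exp !== "number" || exp <= nowSeconds) return null;

  return claims;
}
```

The algorithm is pinned before the signature is checked, so a token whose header claims `alg: "none"` never reaches the HMAC comparison. `nowSeconds` is a parameter only so that expiry can be tested deterministically.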

See #5561 for the authentication design and traffic diagram.

Any related issues, documentation, discussions?

Closes #5422

Implements Phase 1 of #5561 (per-agent authorization is deferred — see #5302).

How was this PR tested?

Unit — agent-service: tsc --noEmit and 109 tests pass, including jwt.test.ts (HS256 signature / expiry / alg-confusion / missing-claim cases) and the auth-guard cases (401 on missing/invalid/expired token, healthcheck open). access-control-service: compiles, AgentAccessAuthSpec + AccessControlServiceRunSpec pass; scalafmt/scalafix clean. The Helm chart renders with the updated env/secret/routes/SecurityPolicy.
End-to-end (docker-compose / nginx) — with both rebuilt images and a real OpenAI key: the agent responds through nginx → agent-service → LiteLLM → OpenAI; GET /api/agents/models returns the model list and the old /api/models 404s; the nginx auth_request returns 401 (no / invalid / expired token), 403 (INACTIVE role), 200 (REGULAR/ADMIN); the agent-service's own guard rejects direct unauthenticated access; the /react WebSocket is gated.
End-to-end (Kubernetes / Envoy) — deployed the rebuilt images plus the Envoy SecurityPolicy against a live Envoy Gateway: ext_authz returns 401 / 403 / 200 and the agent responds over the WebSocket through Envoy.
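
The status codes exercised above (401 for a missing, invalid, or expired token; 403 for an INACTIVE role; 200 for REGULAR/ADMIN) can be summarized as a small decision function. The sketch below is illustrative only: `extractToken`, `agentAuthStatus`, and the `Verified` type are hypothetical names, the verifier is injected as a stand-in for the real JWT check, and treating every role other than REGULAR/ADMIN as 403 is an assumption.

```typescript
// Result of verifying a token; `role` is the user role carried by the JWT.
type Verified = { ok: true; role: string } | { ok: false };

// Pull the token from "Authorization: Bearer <jwt>" (REST) or from the
// "access-token" query parameter (WebSocket upgrade). Returns null if absent.
function extractToken(authorization: string | undefined, requestUrl: string): string | null {
  const match = /^Bearer\s+(\S+)$/i.exec(authorization ?? "");
  if (match) return match[1] ?? null;

  // JWT characters are URL-safe, so no percent-decoding is attempted here.
  const query = requestUrl.split("?")[1] ?? "";
  for (const pair of query.split("&")) {
    const [key, value = ""] = pair.split("=");
    if (key === "access-token" && value !== "") return value;
  }
  return null;
}

// Map the token check and role onto the HTTP status the gateway observed.
function agentAuthStatus(
  token: string | null,
  verify: (token: string) => Verified
): 200 | 401 | 403 {
  if (token === null) return 401; // no token at all
  const result = verify(token);
  if (!result.ok) return 401; // invalid or expired token
  // Authenticated, but only REGULAR and ADMIN are authorized (e.g. INACTIVE gets 403).
  return result.role === "REGULAR" || result.role === "ADMIN" ? 200 : 403;
}
```

Keeping authentication failures (401) separate from authorization failures (403) matches what the nginx auth_request and Envoy ext_authz checks reported in the end-to-end runs.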

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.8 (1M context)

The LiteLLM HTTP proxy in access-control-service existed so the browser could reach LiteLLM without holding the master key. The agent-service is now a trusted backend that holds the key itself, so the proxy hop is redundant.

- agent-service: build the OpenAI client against LITELLM_BASE_URL with the master key; serve the model list at GET /api/agents/models from LiteLLM.
- access-control-service: delete LiteLLMProxyResource / LiteLLMModelsResource, their auth spec, and the now-dead RolesAllowedDynamicFeature registration; drop LLMConfig and llm.conf.
- frontend: fetch models from /api/agents/models.
- routing/deploy: drop the /api/models and /api/chat proxy routes (nginx, k8s gateway, dev proxy); point the agent-service deployment at LiteLLM with the litellm-master-key secret; update the enable-LLM guide.

codecov-commenter · 2026-06-08T04:02:11Z

Codecov Report

❌ Patch coverage is 76.19048% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.14%. Comparing base (75b4619) to head (e291fd5).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...d/src/app/workspace/service/agent/agent.service.ts | 15.38% | 10 Missing and 1 partial ⚠️ |
| ...exera/service/resource/AccessControlResource.scala | 92.30% | 1 Missing and 1 partial ⚠️ |
| agent-service/src/server.ts | 90.47% | 2 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #5558      +/-   ##
============================================
- Coverage     52.16%   52.14%   -0.03%     
+ Complexity     2482     2480       -2     
============================================
  Files          1067     1067              
  Lines         41273    41256      -17     
  Branches       4437     4436       -1     
============================================
- Hits          21532    21514      -18     
+ Misses        18479    18474       -5     
- Partials       1262     1268       +6

| Flag | Coverage Δ | | *Carryforward flag |
|---|---|---|---|
| access-control-service | `61.63% <92.30%> (-2.99%)` | ⬇️ | |
| agent-service | `33.82% <91.66%> (+0.06%)` | ⬆️ | Carriedforward from e9d8e17 |
| amber | `53.26% <ø> (+0.03%)` | ⬆️ | |
| computing-unit-managing-service | `1.65% <ø> (ø)` | | |
| config-service | `56.06% <ø> (ø)` | | |
| file-service | `38.32% <ø> (ø)` | | |
| frontend | `46.43% <15.38%> (-0.04%)` | ⬇️ | |
| pyamber | `90.69% <ø> (ø)` | | Carriedforward from e9d8e17 |
| python | `90.83% <ø> (ø)` | | Carriedforward from e9d8e17 |
| workflow-compiling-service | `58.69% <ø> (ø)` | | |

*This pull request uses carry forward flags. Click here to find out more.



…t_authz Phase 1 authentication for the agent service (see apache#5561):

- agent-service: real HS256 JWT verification using the shared secret read from auth.conf (env AUTH_JWT_SECRET); a guard rejects unauthenticated REST (Bearer) and WebSocket (access-token query) requests. auth.conf is bundled into the image.
- access-control-service: authorize() gains an /api/agents branch that verifies the JWT and requires REGULAR/ADMIN, returning the trusted x-user-* headers (allow-all per-agent for now; per-agent ownership is deferred to apache#5302).
- gateways: nginx auth_request (single-node) and an Envoy SecurityPolicy (k8s) route /api/agents through the access-control-service; the agent-service deployment now receives AUTH_JWT_SECRET.
- frontend: attach the JWT to every agent call (Bearer on REST, access-token on the /react WebSocket).
- tests: agent-service jwt.test.ts + server.test.ts guard cases; access-control AgentAccessAuthSpec.

bobbai00 · 2026-06-08T19:38:56Z

@xuang7 I would like to include this in the release v1.2

github-actions Bot assigned bobbai00 Jun 8, 2026

github-actions Bot added labels on Jun 8, 2026: refactor (Refactor the code), frontend (Changes related to the frontend GUI), docs (Changes related to documentation), dev, common, platform (Non-amber Scala service paths), agent-service

Bob Bai added 2 commits June 7, 2026 21:19

style(agent): format server.ts models endpoint with prettier

b8f655a

test(agent): cover GET /api/agents/models success and LiteLLM failure paths

e9d8e17

bobbai00 requested a review from Yicong-Huang June 8, 2026 04:24

bobbai00 changed the title ~~refactor(agent): call LiteLLM directly, remove access-control LLM proxy~~ refactor(agent): let agent-service call LiteLLM directly and remove access-control LLM proxy Jun 8, 2026

bobbai00 mentioned this pull request Jun 8, 2026

Authorize agent-service requests in the access-control-service (JWT authn + per-agent authz) #5561

Open

bobbai00 changed the title ~~refactor(agent): let agent-service call LiteLLM directly and remove access-control LLM proxy~~ feat(agent): call LiteLLM directly and authenticate the agent-service Jun 8, 2026

bobbai00 requested a review from xuang7 June 8, 2026 19:39

bobbai00 wants to merge 4 commits into apache:main from bobbai00:refactor/remove-litellm-proxy-access-control
