fix: add Cerebras models zai-glm-4.7 by github-actions[bot] · Pull Request #679 · braintrustdata/braintrust-proxy

github-actions · 2026-05-29T15:20:03Z

fix: add Cerebras models zai-glm-4.7

Closes #674

Source issue: #674

Summary

| Field | Value |
| --- | --- |
| Provider | cerebras |
| Primary model | zai-glm-4.7 |
| Changed models | `zai-glm-4.7` |
| Added models | `zai-glm-4.7` |
| Updated models | None |
| Verification sources | 1 2 3 |

Verified metadata

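| Model | Display name | Parent | Providers | Format | Flavor | Token limits | Pricing | Lifecycle |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| zai-glm-4.7 | Z.ai GLM 4.7 | | cerebras | openai | chat | input=131072, output=40960 | in/out=2.25/2.75 USD per 1M tokens | reasoning=true |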
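
As a rough illustration of what the pricing and limit values in this row imply, the sketch below estimates the cost of a single request. The helper name `estimate_cost_usd` and the example token counts are hypothetical and not part of this PR; only the four constants are taken from the verified metadata.

```python
# Values from the verified metadata row for zai-glm-4.7 (Cerebras-hosted).
MAX_INPUT_TOKENS = 131072
MAX_OUTPUT_TOKENS = 40960
INPUT_COST_PER_MIL_TOKENS = 2.25   # USD per 1M input tokens
OUTPUT_COST_PER_MIL_TOKENS = 2.75  # USD per 1M output tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimate the cost of one request in USD."""
    if not 0 <= input_tokens <= MAX_INPUT_TOKENS:
        raise ValueError("input_tokens outside the documented input limit")
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens outside the documented output limit")
    total = (
        input_tokens * INPUT_COST_PER_MIL_TOKENS
        + output_tokens * OUTPUT_COST_PER_MIL_TOKENS
    )
    return total / 1_000_000


# Arbitrary example: 10,000 input tokens and 2,000 output tokens.
example_cost = estimate_cost_usd(10_000, 2_000)
print(f"{example_cost:.4f} USD")
```

The range checks only mirror the two published limits; whether input and output share one context budget is not stated in this PR, so the sketch does not model it.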

Verification notes

Verification

Sources and fields verified

https://inference-docs.cerebras.ai/models/overview — verified: model ID (zai-glm-4.7), parameters (355B), speed (~1000 tok/s), Preview status
https://inference-docs.cerebras.ai/models/zai-glm-47 — verified: context window paid tier (131k), max output (40k), pricing ($2.25/$2.75 per MTok), reasoning enabled by default, tool calling, structured outputs
https://cerebras.ai/pricing — confirmed general pricing tiers exist; per-model token pricing is on the model-specific page rather than the pricing landing page

sync_models (LiteLLM) cross-check

The model cerebras/zai-glm-4.7 does not appear in the LiteLLM model_prices_and_context_window_backup.json catalog. However, a related entry exists under the zai provider (Z.AI's own hosted endpoint, not Cerebras):

zai/glm-4.7 in sync_models: max_input_tokens=200000, max_output_tokens=128000, input_cost_per_token=6e-07 ($0.60/MTok), output_cost_per_token=2.2e-06 ($2.20/MTok)
Proposed zai-glm-4.7 (Cerebras-hosted): max_input_tokens=131072, max_output_tokens=40960, input_cost_per_mil_tokens=2.25, output_cost_per_mil_tokens=2.75

All four numeric fields differ because the sync_models entry reflects Z.AI's own infrastructure limits and pricing, whereas this issue covers the Cerebras-hosted version, which has different context limits and pricing:

| Field | Proposed (Cerebras) | sync_models `zai/glm-4.7` (Z.AI) | Justification |
| --- | --- | --- | --- |
| `max_input_tokens` | 131072 | 200000 | Cerebras model page states "131k tokens" context window for paid tiers (https://inference-docs.cerebras.ai/models/zai-glm-47). Different provider, different limits. |
| `max_output_tokens` | 40960 | 128000 | Cerebras model page states "40k tokens" max output (https://inference-docs.cerebras.ai/models/zai-glm-47). Different provider, different limits. |
| `input_cost_per_mil_tokens` | 2.25 | 0.60 | Cerebras charges $2.25/MTok (https://inference-docs.cerebras.ai/models/zai-glm-47); Z.AI charges $0.60/MTok on their own platform. |
| `output_cost_per_mil_tokens` | 2.75 | 2.20 | Cerebras charges $2.75/MTok (https://inference-docs.cerebras.ai/models/zai-glm-47); Z.AI charges $2.20/MTok on their own platform. |

The Cerebras official documentation is preferred because this catalog entry is specifically for the Cerebras-hosted version of the model, which has its own independently published limits and pricing. The sync_models zai/glm-4.7 entry is for a different provider (zai) and is not applicable to the cerebras provider mapping.
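
The comparison above mixes two units: the sync_models entry stores cost per single token, while the proposed entry stores cost per million tokens. A minimal sketch of the normalization and field-by-field comparison follows. The dictionary names and the `to_per_mil` helper are hypothetical and are not the repository's actual sync code; the numbers are the ones quoted above.

```python
# sync_models entry for zai/glm-4.7 (costs are USD per single token).
sync_models_zai = {
    "max_input_tokens": 200000,
    "max_output_tokens": 128000,
    "input_cost_per_token": 6e-07,
    "output_cost_per_token": 2.2e-06,
}

# Proposed Cerebras-hosted entry (costs are USD per 1M tokens).
proposed_cerebras = {
    "max_input_tokens": 131072,
    "max_output_tokens": 40960,
    "input_cost_per_mil_tokens": 2.25,
    "output_cost_per_mil_tokens": 2.75,
}


def to_per_mil(cost_per_token: float) -> float:
    """Convert USD per token to USD per 1M tokens, rounding away float noise."""
    return round(cost_per_token * 1_000_000, 6)


# Express the sync_models entry in the same fields and units as the proposal.
normalized = {
    "max_input_tokens": sync_models_zai["max_input_tokens"],
    "max_output_tokens": sync_models_zai["max_output_tokens"],
    "input_cost_per_mil_tokens": to_per_mil(sync_models_zai["input_cost_per_token"]),
    "output_cost_per_mil_tokens": to_per_mil(sync_models_zai["output_cost_per_token"]),
}

# Map each differing field to (proposed value, sync_models value).
differences = {
    field: (proposed_cerebras[field], normalized[field])
    for field in proposed_cerebras
    if proposed_cerebras[field] != normalized[field]
}

for field, (proposed, synced) in differences.items():
    print(f"{field}: proposed={proposed} sync_models={synced}")
```

Rounding after the multiplication keeps values such as 6e-07 from surfacing as 0.6000000000000001 when they are compared or printed.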

Fields not published or not applicable

multimodal: The docs list input/output as "Text only" and document no multimodal support; omitted.
parent: Not applicable — this is a standalone model, not a dated snapshot or alias.
input_cache_read_cost_per_mil_tokens / input_cache_write_cost_per_mil_tokens: Prompt caching is mentioned as supported, but no cache-specific pricing is published; omitted.
deprecation_date: Not applicable — model is in Preview, no deprecation announced.
supported_regions: Not applicable — Cerebras is not a Vertex provider.
locations: Not applicable — Cerebras models do not use location-scoped routing.

Token limit interpretation

The Cerebras docs state "131k tokens" for context window and "40k tokens" for max output. These are interpreted as:

131k → 131,072 (128 × 1024; consistent with the existing gpt-oss-120b entry, which uses 131072)
40k → 40,960 (40 × 1024, following the same binary-k convention)

If Cerebras means literal 40,000 rather than 40,960, the difference is minor (960 tokens). The binary interpretation is used here for consistency with the existing catalog convention.
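
The arithmetic behind this interpretation can be checked with a short sketch. The helper names `decimal_k` and `binary_k` are hypothetical, introduced only for illustration. Note that 131,072 is 128 × 1024, which the docs round to "131k", whereas "40k" is read directly as 40 × 1024.

```python
def decimal_k(n: int) -> int:
    """Read 'n k' as n thousand tokens."""
    return n * 1000


def binary_k(n: int) -> int:
    """Read 'n k' as n * 1024 tokens."""
    return n * 1024


# Max output: "40k" under each reading, and the gap between them.
output_binary = binary_k(40)
output_decimal = decimal_k(40)
output_gap = output_binary - output_decimal

# Context window: the catalog value is 128 binary-k, a power of two.
context_tokens = binary_k(128)
is_power_of_two = context_tokens == 2 ** 17

print(output_binary, output_decimal, output_gap)  # 40960 40000 960
print(context_tokens, is_power_of_two)            # 131072 True
```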

sync_models vs proposed update

sync_models cross-check found differences. Official provider verification was used for the applied values, and sync_models discrepancies are listed below for review.

| Model | Field | Proposed update | sync_models | sync_models source models |
| --- | --- | --- | --- | --- |
| zai-glm-4.7 | max_input_tokens | 131072 | 128000 | cerebras/zai-glm-4.7 |
| zai-glm-4.7 | max_output_tokens | 40960 | 128000 | cerebras/zai-glm-4.7 |

vercel · 2026-05-29T15:20:08Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| ai-proxy | Ready | Preview, Comment | May 29, 2026 3:21pm |

fix: add Cerebras models zai-glm-4.7

ee5c2ad

github-actions Bot added the auto-sync label May 29, 2026

github-actions Bot requested a review from aswink May 29, 2026 15:20

github-actions Bot requested review from Alex Z (CLowbrow), Caitlin Pinn (cpinn), Erin McNulty (erin2722) and Ken Jiang (knjiang) May 29, 2026 15:20

github-actions Bot mentioned this pull request May 29, 2026

[BOT ISSUE] Cerebras: add missing zai-glm-4.7 model #674

Open

vercel Bot deployed to Preview May 29, 2026 15:21 View deployment

github-actions[bot] wants to merge 1 commit into main from chore/autofix-issue-674
