feat(auto): add Kilo Auto Balanced model #1031
Routes to Kimi K2.5 for heavy modes (plan, general, architect, orchestrator, ask, debug) and Minimax M2.5 for implementation modes (build, explore, code), offering a lower-cost alternative to Frontier.
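The mode-based routing described above could be sketched as a lookup table. This is a minimal illustration, not the PR's actual code: the model IDs are the ones given in the PR summary, but `BALANCED_MODE_TO_MODEL`, `resolveBalancedModel`, and the fallback for unknown modes are assumptions made for the example.

```typescript
// Model IDs as written in the PR summary.
const HEAVY_MODEL = "moonshotai/kimi-k2.5";
const IMPLEMENTATION_MODEL = "minimax/minimax-m2.5";

// Hypothetical table name; the PR only confirms FRONTIER_MODE_TO_MODEL for Frontier.
const BALANCED_MODE_TO_MODEL = new Map<string, string>([
  // Heavy modes
  ["plan", HEAVY_MODEL],
  ["general", HEAVY_MODEL],
  ["architect", HEAVY_MODEL],
  ["orchestrator", HEAVY_MODEL],
  ["ask", HEAVY_MODEL],
  ["debug", HEAVY_MODEL],
  // Implementation modes
  ["build", IMPLEMENTATION_MODEL],
  ["explore", IMPLEMENTATION_MODEL],
  ["code", IMPLEMENTATION_MODEL],
]);

// Assumption: modes missing from the table fall back to the implementation model.
function resolveBalancedModel(mode: string): string {
  return BALANCED_MODE_TO_MODEL.get(mode) ?? IMPLEMENTATION_MODEL;
}
```

A `Map` is used rather than a plain object so that mode strings such as `constructor` cannot accidentally match inherited properties.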
Code Review Summary
Status: 2 Issues Found | Recommendation: Address before merge
Issue Details: WARNING
Other Observations (not in diff): N/A
Files Reviewed: 3 files
Reviewed by gpt-5.4-20260305 · 448,089 tokens
The deprecatedAutoModels function was producing entries with undefined IDs for models without legacy mappings (like the new balanced model), causing the /api/openrouter/models endpoint to return 500.
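As a rough sketch of the kind of guard this fix describes, the function below only emits a deprecated entry when a legacy mapping exists. The `AutoModel` shape, the `legacyId` field, the function name `deprecatedAutoModelIds`, and the `example/...` IDs are all hypothetical; the real `deprecatedAutoModels` implementation is not shown in this conversation.

```typescript
// Hypothetical shape: legacyId is absent for models with no legacy mapping.
interface AutoModel {
  id: string;
  legacyId?: string;
}

// Collect legacy IDs, skipping models that have none, so that no entry
// with an undefined ID ever reaches the models endpoint.
function deprecatedAutoModelIds(models: AutoModel[]): string[] {
  const ids: string[] = [];
  for (const model of models) {
    if (model.legacyId !== undefined) {
      ids.push(model.legacyId);
    }
  }
  return ids;
}

const exampleModels: AutoModel[] = [
  { id: "example/auto-a", legacyId: "example/legacy-a" }, // placeholder IDs
  { id: "kilo-auto/balanced" }, // new model, no legacy mapping
];
const exampleDeprecatedIds = deprecatedAutoModelIds(exampleModels);
```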
Demo videos:
I also validated that each mode hits the correct model on the backend, and exercised both models. Flipping between modes in the same session also appears to work correctly.
Minimax M2.5 needs search_and_replace only (no apply_diff/edit_file). Apply this restriction globally since the extension can't know which underlying model handles each request.
@chrarnoldus - one item I wasn't sure about: there are different roo settings for minimax:m2.5 and Kimi:k2.5. I applied the minimax ones because I wanted to be restrictive, but would love your guidance here.
The nvidia entry was accidentally dropped during the merge. Restoring it fixes the preferredIndex values in the approval test.
* Update Kimi prices
* Use edit_file instead of apply_diff
* Use free MiniMax instead of paid
* Remove unsupported parameters
* Simplify deprecated mapping without flatMap
The PR originally used paid MiniMax. Was that intentional? I changed it to free, but we can always change it back. edit: roadmap says free
Summary
Adds a new `kilo-auto/balanced` auto model that mirrors Frontier's mode-based routing structure but uses cheaper underlying models:
* Kimi K2.5 (`moonshotai/kimi-k2.5`) for heavy modes: plan, general, architect, orchestrator, ask, debug (where Frontier uses Opus)
* Minimax M2.5 (`minimax/minimax-m2.5`) for implementation modes: build, explore, code (where Frontier uses Sonnet)
Context length and max completion tokens are derived from the minimum of both models (204,800 / 65,536). Pricing set at $2/$8 per M tokens. Added to `preferredModels` between Frontier and Free.
Verification
* `pnpm typecheck`: passes (no new errors introduced; all existing errors are in unrelated files)
Visual Changes
N/A
Reviewer Notes
* Pricing (`prompt_price`/`completion_price`) is a placeholder and may need adjustment based on actual upstream costs.
* `supports_images` is `false` since Minimax M2.5 lacks vision support, even though Kimi K2.5 does support it.
* `opencode_settings` is `undefined` since neither underlying model fits the existing families (claude, gpt, gemini, llama, mistral).
* Renamed the existing constants (`CODE_MODEL` → `FRONTIER_CODE_MODEL`, `MODE_TO_MODEL` → `FRONTIER_MODE_TO_MODEL`) for clarity now that there are two routing tables.
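The "most restrictive of both models" rule used for the context length, max completion tokens, and image support can be illustrated as follows. This is only a sketch: the `ModelLimits` shape and `combineLimits` helper are hypothetical, and the numbers are placeholders chosen for readability, not the real limits of either model.

```typescript
// Hypothetical shape; field names are not taken from the PR's code.
interface ModelLimits {
  contextLength: number;
  maxCompletionTokens: number;
  supportsImages: boolean;
}

// An auto model can only advertise what every underlying model can handle:
// numeric limits take the minimum, boolean capabilities must hold for both.
function combineLimits(a: ModelLimits, b: ModelLimits): ModelLimits {
  return {
    contextLength: Math.min(a.contextLength, b.contextLength),
    maxCompletionTokens: Math.min(a.maxCompletionTokens, b.maxCompletionTokens),
    supportsImages: a.supportsImages && b.supportsImages,
  };
}

// Placeholder values, NOT the actual limits of Kimi K2.5 or Minimax M2.5.
const visionModel: ModelLimits = { contextLength: 200_000, maxCompletionTokens: 8_000, supportsImages: true };
const textOnlyModel: ModelLimits = { contextLength: 100_000, maxCompletionTokens: 16_000, supportsImages: false };

const combinedLimits = combineLimits(visionModel, textOnlyModel);
```

Because requests in the same session may be served by either model, advertising anything above the shared minimum would risk requests that one of the two cannot accept.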