2 changes: 1 addition & 1 deletion docs/cli/configuration/settings.mdx
@@ -62,7 +62,7 @@ Choose the default AI model that powers your droid:
- **`gpt-5.2`** - OpenAI GPT-5.2
- **`haiku`** - Claude Haiku 4.5, fast and cost-effective
- **`gemini-3-pro`** - Gemini 3 Pro
-- **`droid-core`** - GLM-4.6 open-source model
+- **`droid-core`** - GLM-4.7 open-source model
- **`custom-model`** - Your own configured model via BYOK

[You can also add custom models and BYOK.](/cli/configuration/byok)
2 changes: 1 addition & 1 deletion docs/cli/droid-exec/overview.mdx
@@ -76,7 +76,7 @@ Supported models (examples):
- gpt-5.1-codex
- gpt-5.1
- gemini-3-pro-preview
-- glm-4.6
+- glm-4.7

<Note>
See the [model table](/pricing#pricing-table) for the full list of available models and their costs.
16 changes: 10 additions & 6 deletions docs/cli/user-guides/choosing-your-model.mdx
@@ -4,11 +4,11 @@ description: Balance accuracy, speed, and cost by picking the right model and re
keywords: ['model', 'models', 'llm', 'claude', 'sonnet', 'opus', 'haiku', 'gpt', 'openai', 'anthropic', 'choose model', 'switch model']
---

-Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated on Thursday, December 4th 2025.
+Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shifts. Use this guide as a snapshot of how the major options compare today, and expect to revisit it as we publish updates. This guide was last updated in February 2026.

---

-## 1 · Current stack rank (December 2025)
+## 1 · Current stack rank (February 2026)

| Rank | Model | Why we reach for it |
| ---- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
@@ -20,7 +20,7 @@ Model quality evolves quickly, and we tune the CLI defaults as the ecosystem shi
| 6 | **Claude Haiku 4.5** | Fast, cost-efficient for routine tasks and high-volume automation. |
| 7 | **Gemini 3 Pro** | Strong at mixed reasoning with Low/High settings; helpful for researchy flows with structured outputs. |
| 8 | **Gemini 3 Flash** | Fast, cheap (0.2× multiplier) with full reasoning support; great for high-volume tasks where speed matters. |
-| 9 | **Droid Core (GLM-4.6)** | Open-source, 0.25× multiplier, great for bulk automation or air-gapped environments; note: no image support. |
+| 9 | **Droid Core (GLM-4.7)** | Open-source, 0.25× multiplier, great for bulk automation or air-gapped environments; note: no image support. |

+<Note>
+If your organization has access, **Claude Opus 4.6** and **Opus 4.6 Fast Mode** may appear as additional options; Fast Mode is offered to some accounts at a promotional rate through **Monday, February 16**.
+</Note>

<Note>
We ship model updates regularly. When a new release overtakes the list above,
@@ -65,7 +69,7 @@ Tip: you can swap models mid-session with `/model` or by toggling in the setting
- **GPT-5.2**: Low / Medium / High (default: Low)
- **Gemini 3 Pro**: Low / High (default: High)
- **Gemini 3 Flash**: Minimal / Low / Medium / High (default: High)
-- **Droid Core (GLM-4.6)**: None only (default: None; no image support)
+- **Droid Core (GLM-4.7)**: None only (default: None; no image support)

Reasoning effort increases latency and cost—start low for simple work and escalate as needed. **Extra High** is only available on GPT-5.1-Codex-Max.
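
The per-model options listed above can be captured in a small lookup table. This is an illustrative sketch only, not Factory's actual validation code; the model IDs follow the CLI reference table, and only the models whose options appear in the bullets above are included.

```typescript
// Illustrative only: which reasoning efforts each model accepts,
// per the bullet list above (IDs from the CLI reference table).
const reasoningOptions: Record<string, string[]> = {
  "gpt-5.2": ["low", "medium", "high"],
  "gemini-3-pro-preview": ["low", "high"],
  "gemini-3-flash-preview": ["minimal", "low", "medium", "high"],
  "glm-4.7": ["none"],
};

// Returns false for unknown models rather than guessing their options.
function isValidEffort(model: string, effort: string): boolean {
  const allowed: string[] | undefined = reasoningOptions[model];
  return allowed ? allowed.includes(effort.toLowerCase()) : false;
}
```

For example, `isValidEffort("glm-4.7", "high")` is false, matching the note above that Droid Core supports None only.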

Expand All @@ -82,14 +86,14 @@ Factory ships with managed Anthropic and OpenAI access. If you prefer to run aga

### Open-source models

-**Droid Core (GLM-4.6)** is an open-source alternative available in the CLI. It's useful for:
+**Droid Core (GLM-4.7)** is an open-source alternative available in the CLI. It's useful for:

- **Air-gapped environments** where external API calls aren't allowed
- **Cost-sensitive projects** needing unlimited local inference
- **Privacy requirements** where code cannot leave your infrastructure
- **Experimentation** with open-source model capabilities

-**Note:** GLM-4.6 does not support image attachments. For image-based workflows, use Claude or GPT models.
+**Note:** GLM-4.7 does not support image attachments. For image-based workflows, use Claude or GPT models.

To use open-source models, you'll need to configure them via BYOK with a local inference server (like Ollama) or a hosted provider. See [BYOK documentation](/cli/configuration/byok) for setup instructions.

10 changes: 5 additions & 5 deletions docs/guides/building/droid-exec-tutorial.mdx
@@ -78,8 +78,8 @@ The Factory example uses a simple pattern: spawn `droid exec` with `--output-for
function runDroidExec(prompt: string, repoPath: string) {
const args = ["exec", "--output-format", "debug"];

-// Optional: configure model (defaults to glm-4.6)
-const model = process.env.DROID_MODEL_ID ?? "glm-4.6";
+// Optional: configure model (defaults to glm-4.7)
+const model = process.env.DROID_MODEL_ID ?? "glm-4.7";
args.push("-m", model);

// Optional: reasoning level (off|low|medium|high)
@@ -105,7 +105,7 @@ function runDroidExec(prompt: string, repoPath: string) {
- Alternative: `--output-format json` for final output only

**`-m` (model)**: Choose your AI model
-- `glm-4.6` - Fast, cheap (default)
+- `glm-4.7` - Fast, cheap (default)
- `gpt-5-codex` - Most powerful for complex code
- `claude-sonnet-4-5-20250929` - Best balance of speed and capability
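
Putting the flags above together, the argument list can be assembled like this. This sketch mirrors the tutorial's `runDroidExec` example (same `DROID_MODEL_ID` env var and `glm-4.7` fallback); it is illustrative, not part of the Factory CLI itself.

```typescript
// Build the argument list for `droid exec`, mirroring the tutorial's
// example: debug output, model from DROID_MODEL_ID with the documented
// glm-4.7 default, then the prompt itself.
function buildExecArgs(prompt: string): string[] {
  const model = process.env.DROID_MODEL_ID ?? "glm-4.7";
  return ["exec", "--output-format", "debug", "-m", model, prompt];
}
```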

@@ -311,7 +311,7 @@ The example supports environment variables:

```bash
# .env
-DROID_MODEL_ID=gpt-5-codex # Default: glm-4.6
+DROID_MODEL_ID=gpt-5-codex # Default: glm-4.7
DROID_REASONING=low # Default: low (off|low|medium|high)
PORT=4000 # Default: 4000
HOST=localhost # Default: localhost
@@ -376,7 +376,7 @@ fs.writeFileSync('./repos/site-content/page.md', markdown);
function runWithModel(prompt: string, model: string) {
return Bun.spawn([
"droid", "exec",
"-m", model, // glm-4.6, gpt-5-codex, etc.
"-m", model, // glm-4.7, gpt-5-codex, etc.
"--output-format", "debug",
prompt
], { cwd: repoPath });
6 changes: 3 additions & 3 deletions docs/guides/building/droid-vps-setup.mdx
@@ -182,15 +182,15 @@ The real power of running droid on a VPS is `droid exec` - a headless mode that
### Basic droid exec usage

```bash
-# Simple query with a fast model (GLM 4.6)
-droid exec --model glm-4.6 "Tell me a joke"
+# Simple query with a fast model (GLM 4.7)
+droid exec --model glm-4.7 "Tell me a joke"
```

### Advanced: System exploration

```bash
# Ask droid to explore your system and find specific information
-droid exec --model glm-4.6 "Explore my system and tell me where the file is that I'm serving with Nginx"
+droid exec --model glm-4.7 "Explore my system and tell me where the file is that I'm serving with Nginx"
```

Droid will:
2 changes: 1 addition & 1 deletion docs/guides/power-user/prompt-crafting.mdx
@@ -376,7 +376,7 @@ Match the model to the task:
| **Feature implementation** | Sonnet 4.5 or GPT-5.1-Codex | Medium |
| **Quick edits, formatting** | Haiku 4.5 | Off/Low |
| **Code review** | GPT-5.1-Codex-Max | High |
-| **Bulk automation** | GLM-4.6 (Droid Core) | None |
+| **Bulk automation** | GLM-4.7 (Droid Core) | None |
| **Research/analysis** | Gemini 3 Pro | High |

---
5 changes: 3 additions & 2 deletions docs/guides/power-user/token-efficiency.mdx
@@ -134,13 +134,14 @@ Different models have different cost multipliers and capabilities. Match the mod

| Model | Multiplier | Best For |
|-------|------------|----------|
-| GLM-4.6 (Droid Core) | 0.25× | Bulk automation, simple tasks |
+| GLM-4.7 (Droid Core) | 0.25× | Bulk automation, simple tasks |
| Gemini 3 Flash | 0.2× | High-volume tasks, quick processing |
| Claude Haiku 4.5 | 0.4× | Quick edits, routine work |
| GPT-5.1 / GPT-5.1-Codex | 0.5× | Implementation, debugging |
| GPT-5.2 | 0.7× | Harder implementation, deeper reasoning |
| Gemini 3 Pro | 0.8× | Research, analysis |
| Claude Sonnet 4.5 | 1.2× | Balanced quality/cost |
| Claude Opus 4.5 | 2× | Complex reasoning, architecture |
| Claude Opus 4.1 | 6× | Maximum capability (use sparingly) |
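
Read as arithmetic, the multiplier column works like this: Standard Token usage is raw tokens times the model's multiplier. The sketch below is illustrative only, not Factory's billing code; it uses model IDs from the pricing table and a subset of the multipliers above.

```typescript
// Illustrative only: Standard Token usage = raw tokens x multiplier,
// using a subset of the multipliers from the table above.
const multipliers: Record<string, number> = {
  "glm-4.7": 0.25,
  "gemini-3-flash-preview": 0.2,
  "claude-haiku-4-5-20251001": 0.4,
  "gpt-5.1": 0.5,
  "claude-sonnet-4-5-20250929": 1.2,
};

function standardTokens(modelId: string, rawTokens: number): number {
  const m = multipliers[modelId];
  if (m === undefined) throw new Error(`unknown model: ${modelId}`);
  return rawTokens * m;
}
```

So a bulk job that burns 1M raw tokens on Droid Core counts as only 250K Standard Tokens, while the same job on Sonnet 4.5 would count as 1.2M.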

### Task-Based Model Selection

6 changes: 5 additions & 1 deletion docs/pricing.mdx
@@ -24,7 +24,7 @@ Different models have different multipliers applied to calculate Standard Token

| Model | Model ID | Multiplier |
| ------------------------ | ---------------------------- | ---------- |
-| Droid Core | `glm-4.6` | 0.25× |
+| Droid Core | `glm-4.7` | 0.25× |
| Claude Haiku 4.5 | `claude-haiku-4-5-20251001` | 0.4× |
| GPT-5.1 | `gpt-5.1` | 0.5× |
| GPT-5.1-Codex | `gpt-5.1-codex` | 0.5× |
@@ -35,6 +35,10 @@ Different models have different multipliers applied to calculate Standard Token
| Claude Sonnet 4.5 | `claude-sonnet-4-5-20250929` | 1.2× |
| Claude Opus 4.5 | `claude-opus-4-5-20251101` | 2× |

+<Note>
+**Promo:** Claude Opus 4.6 **Fast Mode** is available for some accounts at a promotional rate through **Monday, February 16**.
+</Note>

## Thinking About Tokens

As a reference point, using GPT-5.1-Codex at its 0.5× multiplier alongside our typical cache ratio of 4–8× means your effective Standard Token usage goes dramatically further than raw on-demand calls. Switching to very expensive models frequently—or rotating models often enough to invalidate the cache—will lower that benefit, but most workloads see materially higher usage ceilings compared with buying capacity directly from individual model providers. Our aim is for you to run your workloads without worrying about token math; the plans are designed so common usage patterns outperform comparable direct offerings.
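
A rough back-of-envelope makes the claim concrete. This is an assumption for illustration, not Factory's actual billing formula: treat effective capacity as the token budget divided by the multiplier, then stretched by the cache ratio.

```typescript
// Back-of-envelope ONLY (an assumed model, not the real billing formula):
// effective capacity = budget / multiplier * cacheRatio.
function effectiveTokens(budget: number, multiplier: number, cacheRatio: number): number {
  return (budget / multiplier) * cacheRatio;
}
```

Under that assumption, a 1M-token budget on GPT-5.1-Codex (0.5×) with a 4× cache ratio behaves like roughly 8M tokens of raw capacity, which is the "goes dramatically further" effect described above; frequent model rotation that invalidates the cache shrinks the `cacheRatio` term.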
2 changes: 1 addition & 1 deletion docs/reference/cli-reference.mdx
@@ -108,7 +108,7 @@ droid exec --auto high "Run tests, commit, and push changes"
| `claude-haiku-4-5-20251001` | Claude Haiku 4.5 | Yes (Off/Low/Medium/High) | off |
| `gemini-3-pro-preview` | Gemini 3 Pro | Yes (Low/High) | high |
| `gemini-3-flash-preview` | Gemini 3 Flash | Yes (Minimal/Low/Medium/High) | high |
-| `glm-4.6` | Droid Core (GLM-4.6) | None only | none |
+| `glm-4.7` | Droid Core (GLM-4.7) | None only | none |

Custom models configured via [BYOK](/cli/configuration/byok) use the format: `custom:<alias>`
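
The `custom:<alias>` format is easy to construct programmatically. These helpers are illustrative, not part of the CLI; they only encode the ID format stated above.

```typescript
// Build a BYOK model ID in the documented `custom:<alias>` format.
function customModelId(alias: string): string {
  if (alias.length === 0) throw new Error("alias must be non-empty");
  return `custom:${alias}`;
}

// Check whether a model ID refers to a BYOK custom model.
function isCustomModel(modelId: string): boolean {
  return modelId.startsWith("custom:");
}
```

For example, a model registered under the (hypothetical) alias `my-ollama` would be selected with `-m custom:my-ollama`.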
