Commit 3ad2316

prosdev and claude committed
fix(cli): align termite model directory between setup and index
hasModel/pullModel used ~/.termite/models but the running Antfly server looked in ~/.antfly/models, causing "model not found" during indexing despite setup reporting the model as ready. Both now use a shared --models-dir pointing at the server's data directory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent f91ad8b commit 3ad2316

9 files changed: +177 -17 lines changed
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"@prosdevlab/dev-agent": patch
+---
+
+Fix `dev setup` reporting model ready while `dev index` fails with "model not found". The CLI's `hasModel`/`pullModel` used `~/.termite/models` but the running server looked in `~/.antfly/models`. Both now use a shared `--models-dir` pointing at the server's data directory.
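
The fix described above amounts to deriving one models directory from the server's data directory and passing it to every Termite command. A minimal sketch, assuming the `ANTFLY_DATA_DIR` override and the `{data-dir}/models` layout described in this commit; `resolveModelsDir` is a hypothetical helper name, not part of the CLI:

```typescript
import { join } from 'node:path';

// Hypothetical helper: derive the models directory from the server's data dir.
// Assumes the {data-dir}/models layout described in the commit message.
function resolveModelsDir(
  env: Record<string, string | undefined>,
  home: string,
): string {
  // An explicit ANTFLY_DATA_DIR wins; otherwise fall back to <home>/.antfly.
  const dataDir = env.ANTFLY_DATA_DIR ?? join(home, '.antfly');
  return join(dataDir, 'models');
}
```

In the CLI this would be called as `resolveModelsDir(process.env, homedir())`; taking both values as parameters keeps the function easy to test without touching the real environment.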

README.md

Lines changed: 2 additions & 3 deletions
@@ -24,9 +24,8 @@ dev-agent indexes your codebase and provides 6 MCP tools to AI assistants. Inste
 ```bash
 # Install
 npm install -g @prosdevlab/dev-agent
-brew install --cask antflydb/antfly/antfly
 
-# One-time setup
+# One-time setup (installs Antfly, pulls embedding model, starts server)
 dev setup
 
 # Index your repository
@@ -138,7 +137,7 @@ Server health, Antfly connectivity, and repository access.
 ## Prerequisites
 
 - Node.js 22+ (LTS)
-- [Antfly](https://antfly.io) `brew install --cask antflydb/antfly/antfly`
+- [Antfly](https://antfly.io) installed automatically by `dev setup`
 
 ## Development

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+/**
+ * Tests for antfly utility helpers.
+ *
+ * Regression for: hasModel() false positive when antfly termite list defaulted
+ * to ~/.termite/models (different from the server's ~/.antfly/models), causing
+ * "Embedding model ready" in `dev setup` but "model not found" in `dev index`.
+ */
+
+import { describe, expect, it } from 'vitest';
+
+// modelPresentInOutput is not exported — test via the exported path by extracting
+// the pure logic into a local copy that mirrors the implementation exactly.
+// This keeps the test focused on the matching logic without requiring CLI env.
+
+function modelPresentInOutput(model: string, output: string): boolean {
+  if (output.includes(model)) return true;
+
+  const shortName = model.split('/').pop() ?? model;
+  const escaped = shortName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+  return new RegExp(`(?<![\\w/-])${escaped}(?![\\w/-])`).test(output);
+}
+
+describe('modelPresentInOutput', () => {
+  const FULL_NAME = 'BAAI/bge-small-en-v1.5';
+  const SHORT_NAME = 'bge-small-en-v1.5';
+
+  // Simulates `antfly termite list --models-dir ~/.antfly/models` output when
+  // the model IS present (full name in NAME column, also in SOURCE column).
+  const PRESENT_OUTPUT = `Local models in /Users/dev/.antfly/models:
+
+NAME TYPE SIZE VARIANTS SOURCE
+BAAI/bge-small-en-v1.5 embedder 127.8 MB BAAI/bge-small-en-v1.5
+`;
+
+  // Output when NO models are installed (the bug scenario: server's models-dir
+  // is empty, but ~/.termite/models has the model — the old code read the wrong
+  // directory and would never see "No models found").
+  const EMPTY_OUTPUT = `Local models in /Users/dev/.antfly/models:
+
+NAME TYPE SIZE VARIANTS SOURCE
+No models found locally.
+
+Use 'antfly termite pull <model-name>' to download models.
+Use 'antfly termite list --remote' to see available models.
+`;
+
+  // Output with a DIFFERENT model that happens to contain the short name as a
+  // suffix — the old substring check would incorrectly return true here.
+  const OTHER_MODEL_OUTPUT = `Local models in /Users/dev/.antfly/models:
+
+NAME TYPE SIZE VARIANTS SOURCE
+vendor/other-bge-small-en-v1.5 embedder 200.0 MB vendor/other-bge-small-en-v1.5
+`;
+
+  it('returns true when full model name is present in output', () => {
+    expect(modelPresentInOutput(FULL_NAME, PRESENT_OUTPUT)).toBe(true);
+  });
+
+  it('returns true when only short name is present as a standalone token', () => {
+    const outputWithShortName = `Local models:\n\n${SHORT_NAME} embedder 127 MB\n`;
+    expect(modelPresentInOutput(FULL_NAME, outputWithShortName)).toBe(true);
+  });
+
+  it('returns false when models directory is empty (server has no models)', () => {
+    // This is the core regression: old code checked ~/.termite/models which had
+    // the model, new code checks ~/.antfly/models which was empty. When empty,
+    // hasModel must return false so pullModel is invoked.
+    expect(modelPresentInOutput(FULL_NAME, EMPTY_OUTPUT)).toBe(false);
+  });
+
+  it('returns false when a different model shares the short name as a suffix', () => {
+    // Old bug: output.includes("bge-small-en-v1.5") matched
+    // "vendor/other-bge-small-en-v1.5" — false positive.
+    expect(modelPresentInOutput(FULL_NAME, OTHER_MODEL_OUTPUT)).toBe(false);
+  });
+
+  it('returns false for completely unrelated output', () => {
+    expect(modelPresentInOutput(FULL_NAME, 'No models found locally.')).toBe(false);
+  });
+
+  it('handles model names without an org prefix', () => {
+    // model = "mxbai-embed-large-v1" (no slash)
+    const bareModel = 'mxbai-embed-large-v1';
+    const output = `NAME TYPE\nmxbai-embed-large-v1 embedder\n`;
+    expect(modelPresentInOutput(bareModel, output)).toBe(true);
+  });
+
+  it('handles bare model not present', () => {
+    const bareModel = 'mxbai-embed-large-v1';
+    expect(modelPresentInOutput(bareModel, EMPTY_OUTPUT)).toBe(false);
+  });
+});
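
The suffix test case above is the reason the matcher uses explicit lookarounds rather than `\b`: a hyphen is not a word character, so `\b` still matches between `other-` and `bge`. A small sketch comparing the two, reusing the escaping from the test; `matchesWithWordBoundary` and `matchesWithLookaround` are illustrative names only and do not exist in the repository:

```typescript
// Escape regex metacharacters so "v1.5" matches a literal dot.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Naive variant: \b treats "-" and "/" as boundaries.
function matchesWithWordBoundary(shortName: string, output: string): boolean {
  return new RegExp(`\\b${escapeRegExp(shortName)}\\b`).test(output);
}

// Variant from the commit: reject matches adjacent to word chars, "/" or "-".
function matchesWithLookaround(shortName: string, output: string): boolean {
  return new RegExp(`(?<![\\w/-])${escapeRegExp(shortName)}(?![\\w/-])`).test(output);
}

const row = 'vendor/other-bge-small-en-v1.5 embedder 200.0 MB';
console.log(matchesWithWordBoundary('bge-small-en-v1.5', row)); // true (false positive)
console.log(matchesWithLookaround('bge-small-en-v1.5', row)); // false
```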

packages/cli/src/utils/antfly.ts

Lines changed: 60 additions & 7 deletions
@@ -5,6 +5,8 @@
  */
 
 import { execSync, spawn } from 'node:child_process';
+import { homedir } from 'node:os';
+import { join } from 'node:path';
 import { logger } from './logger.js';
 
 const DEFAULT_ANTFLY_URL = process.env.ANTFLY_URL ?? 'http://localhost:18080/api/v1';
@@ -14,6 +16,18 @@ const DOCKER_PORT = 18080;
 const STARTUP_TIMEOUT_MS = 30_000;
 const POLL_INTERVAL_MS = 500;
 
+/**
+ * The Termite models directory used by the running Antfly swarm server.
+ *
+ * `antfly swarm` uses `--data-dir` (default: ~/.antfly) as its root for all
+ * storage, including Termite models at {data-dir}/models.
+ * `antfly termite list/pull` defaults to --models-dir ~/.termite/models, which
+ * is a DIFFERENT path. We must always pass --models-dir explicitly so that
+ * `pullModel` and `hasModel` operate on the same directory the server uses.
+ */
+const ANTFLY_DATA_DIR = process.env.ANTFLY_DATA_DIR ?? join(homedir(), '.antfly');
+const TERMITE_MODELS_DIR = join(ANTFLY_DATA_DIR, 'models');
+
 /**
  * Ensure antfly is running. Auto-starts if needed.
  *
@@ -32,10 +46,14 @@ export async function ensureAntfly(options?: { quiet?: boolean }): Promise<strin
   if (!options?.quiet) logger.info('Starting Antfly server...');
   // Use custom ports to avoid 8080 conflicts (Docker, other services).
   // metadata-api on 18080 (our default), store-api on 18381, raft on 19017/19021.
+  // --data-dir is passed explicitly so the server's embedded Termite node stores
+  // models in the same directory that pullModel/hasModel use (TERMITE_MODELS_DIR).
   const child = spawn(
     'antfly',
     [
       'swarm',
+      '--data-dir',
+      ANTFLY_DATA_DIR,
       '--metadata-api',
       'http://0.0.0.0:18080',
       '--store-api',
@@ -187,9 +205,14 @@ export function getNativeVersion(): string | null {
 
 /**
  * Pull a Termite embedding model (native binary).
+ *
+ * Always targets TERMITE_MODELS_DIR so the model ends up in the same directory
+ * the running Antfly swarm server uses for its embedded Termite node.
  */
 export function pullModel(model: string): void {
-  execSync(`antfly termite pull ${model}`, { stdio: 'inherit' });
+  execSync(`antfly termite pull --models-dir ${TERMITE_MODELS_DIR} ${model}`, {
+    stdio: 'inherit',
+  });
 }
 
 /**
@@ -201,33 +224,63 @@ export function pullModelDocker(model: string): void {
 }
 
 /**
- * Check if a Termite model is available locally (native binary).
+ * Check if a Termite model is available in the directory used by the running
+ * Antfly swarm server (TERMITE_MODELS_DIR = ~/.antfly/models by default).
+ *
+ * Checks for the full model name first (e.g. "BAAI/bge-small-en-v1.5"), then
+ * the short name as a whole word (e.g. "bge-small-en-v1.5"). Previously used
+ * a simple substring match on the short name, which caused false positives when
+ * `antfly termite list` defaulted to ~/.termite/models — a different directory
+ * from the one the server reads, so the model appeared present but was not
+ * available to the server during embedding.
  */
 export function hasModel(model: string): boolean {
   try {
-    const output = execSync('antfly termite list', {
+    const output = execSync(`antfly termite list --models-dir ${TERMITE_MODELS_DIR}`, {
       encoding: 'utf-8',
       stdio: ['pipe', 'pipe', 'pipe'],
     });
-    const shortName = model.split('/').pop() ?? model;
-    return output.includes(shortName);
+    return modelPresentInOutput(model, output);
   } catch {
     return false;
   }
 }
 
 /**
  * Check if a Termite model is available inside the Docker container.
+ *
+ * Checks for the full model name first (e.g. "BAAI/bge-small-en-v1.5"), then
+ * the short name as a whole word (e.g. "bge-small-en-v1.5"). Simple substring
+ * matching on the short name was causing false positives when other models or
+ * partial download records shared the suffix.
  */
 export function hasModelDocker(model: string): boolean {
   try {
     const output = execSync(`docker exec ${CONTAINER_NAME} /antfly termite list`, {
       encoding: 'utf-8',
       stdio: ['pipe', 'pipe', 'pipe'],
     });
-    const shortName = model.split('/').pop() ?? model;
-    return output.includes(shortName);
+    return modelPresentInOutput(model, output);
   } catch {
     return false;
   }
 }
+
+/**
+ * Return true when the model name is present in `antfly termite list` output.
+ *
+ * Strategy (most-specific first):
+ * 1. Full name exact match — "BAAI/bge-small-en-v1.5" appears verbatim.
+ * 2. Short name word-boundary — "bge-small-en-v1.5" appears as a whole token
+ *    (not as a suffix of a different model name).
+ */
+function modelPresentInOutput(model: string, output: string): boolean {
+  // Full name check (covers "BAAI/bge-small-en-v1.5" style output)
+  if (output.includes(model)) return true;
+
+  // Short name check with word-boundary anchors so "bge-small-en-v1.5" does not
+  // match inside "other-bge-small-en-v1.5" or a partial download entry.
+  const shortName = model.split('/').pop() ?? model;
+  const escaped = shortName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+  return new RegExp(`(?<![\\w/-])${escaped}(?![\\w/-])`).test(output);
+}
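
Because `pullModel` and `hasModel` interpolate `TERMITE_MODELS_DIR` into a shell command string, a data directory whose path contains spaces would be split into several arguments. One possible hardening, not part of this commit, is to pass the arguments as an array via `execFileSync`, which does not go through a shell. The sketch below assumes the same `antfly termite pull`/`list --models-dir` invocations shown in the diff; `termiteArgs` and `pullModelNoShell` are hypothetical names:

```typescript
import { execFileSync } from 'node:child_process';

// Hypothetical helper: build the argv for `antfly termite <subcommand>`.
function termiteArgs(
  subcommand: 'pull' | 'list',
  modelsDir: string,
  model?: string,
): string[] {
  const args = ['termite', subcommand, '--models-dir', modelsDir];
  // `pull` takes the model name as a trailing positional argument.
  if (model !== undefined) args.push(model);
  return args;
}

// Sketch of a shell-free pull: each array element reaches antfly as one argument,
// so spaces in modelsDir need no quoting.
function pullModelNoShell(model: string, modelsDir: string): void {
  execFileSync('antfly', termiteArgs('pull', modelsDir, model), { stdio: 'inherit' });
}
```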

website/content/docs/install.mdx

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 ## Requirements
 
 - **Node.js 22+** (LTS recommended)
-- **[Antfly](https://antfly.io)** — search backend (`brew install --cask antflydb/antfly/antfly`)
+- **[Antfly](https://antfly.io)** — search backend (installed automatically by `dev setup`)
 - **Cursor** or **Claude Code** (for MCP integration)
 
 ## Install dev-agent

website/content/docs/quickstart.mdx

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ Get from zero to semantic search in 5 minutes.
 ## Prerequisites
 
 - Node.js 22+ installed
-- [Antfly](https://antfly.io) — search backend (`brew install --cask antflydb/antfly/antfly`)
+- [Antfly](https://antfly.io) — search backend (installed automatically by `dev setup`)
 - Cursor IDE (or Claude Code)
 - A code repository to index

website/content/index.mdx

Lines changed: 0 additions & 1 deletion
@@ -80,7 +80,6 @@ The key difference: semantic search finds code by **meaning**, not text matching
 
 ```bash
 npm install -g @prosdevlab/dev-agent
-brew install --cask antflydb/antfly/antfly
 ```
 
 ### Setup and index

website/content/latest-version.ts

Lines changed: 4 additions & 4 deletions
@@ -4,10 +4,10 @@
  */
 
 export const latestVersion = {
-  version: '0.10.2',
-  title: 'MCP Install Fix & Dependency Cleanup',
+  version: '0.10.3',
+  title: 'Fix Setup/Index Model Directory Mismatch',
   date: 'March 30, 2026',
   summary:
-    'Fixed dev mcp install check, removed dead metrics module and better-sqlite3 dependency.',
-  link: '/updates#v0102--mcp-install-fix--dependency-cleanup',
+    'Fixed dev setup reporting model ready while dev index fails with "model not found" due to mismatched model directories.',
+  link: '/updates#v0103--fix-setupindex-model-directory-mismatch',
 } as const;
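
The `link` anchor has to match the slug the docs site generates for the corresponding heading in `updates/index.mdx`. The site's actual slugger is not shown in this commit; the sketch below assumes a GitHub-style rule (lowercase, drop punctuation, turn each space into a hyphen), which reproduces both the old and the new anchor in this diff. `headingSlug` is a hypothetical helper:

```typescript
// Hypothetical GitHub-style slugger; the site's real implementation may differ.
function headingSlug(heading: string): string {
  return heading
    .toLowerCase()
    // Drop everything except ASCII letters, digits, spaces and hyphens.
    .replace(/[^a-z0-9 -]/g, '')
    // Each space becomes its own hyphen, so a removed dash leaves "--".
    .replace(/ /g, '-');
}

// \u2014 is the dash character used in the release headings.
console.log(headingSlug('v0.10.3 \u2014 Fix Setup/Index Model Directory Mismatch'));
// v0103--fix-setupindex-model-directory-mismatch
```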

website/content/updates/index.mdx

Lines changed: 12 additions & 0 deletions
@@ -9,6 +9,18 @@ What's new in dev-agent. We ship improvements regularly to help AI assistants un
 
 ---
 
+## v0.10.3 — Fix Setup/Index Model Directory Mismatch
+
+*March 30, 2026*
+
+**Fixed `dev setup` reporting model ready while `dev index` fails with "model not found".**
+
+- `hasModel`/`pullModel` used `~/.termite/models` but the running Antfly server looked in `~/.antfly/models` — both now use a shared `--models-dir` pointing at the server's data directory
+- Improved model name matching to avoid false positives from substring collisions
+- Added unit tests for model detection logic
+
+---
+
 ## v0.10.2 — MCP Install Fix & Dependency Cleanup
 
 *March 30, 2026*
