
feat(workspaces): hybrid BM25+dense workspace search + CUDA prerelease CI #38

Merged
dvcdsys merged 33 commits into develop from feature/workspaces
May 13, 2026
Owner

@dvcdsys dvcdsys commented May 13, 2026

Summary

Lands the workspaces feature track on top of the new develop pre-release branch:

  • Workspace data model (PR2–PR11 history): workspace_repos, jobs, clone/index pipeline, GitHub webhooks, call-edges + eval harness, Louvain communities + workspace centroids, two-stage workspace search endpoint, CLI + skill + dashboard search dialog, name-first CLI grammar, in-dashboard add-repo flow with live progress, GitHub token scope derivation, account/org selector.
  • Hybrid search: FTS5 BM25 mirror of every indexed chunk + hybrid BM25+dense workspace search with project gate. Pre-FTS-mirror repos are flagged so the dashboard prompts a reindex.
  • Skill rewrite: cix-workspace skill rebuilt around the hybrid + 3-question workflow; trust rules + cix-workspace-investigator subagent.
  • Docs: privacy pass anonymizing examples in tests and docs; new workspaces.md guide; dashboard notes in README.
  • Search calibration: search defaults tuned, chunks/panel consistency fix, repo-name truncation in picker.
  • CI (last commit): new prerelease-server.yml workflow builds the CUDA-only image on push to develop and pushes dvcdsys/code-index:develop-cu128. ci-server.yml / ci-cli.yml now also gate PRs into develop.

Test plan

  • server vet/test/build CI passes on this PR (now gated on develop PRs too).
  • cli vet/test/build CI passes.
  • After merge into develop: prerelease workflow triggers and pushes dvcdsys/code-index:develop-cu128 to Docker Hub.
  • Pull the new tag on the RTX 3090 prod box and run a hybrid search end-to-end against a real workspace; check nvidia-smi shows GPU memory used (silent CPU fallback is a real failure mode).
  • Spot-check dashboard add-repo flow against a fresh repo.

🤖 Generated with Claude Code

dvcdsys and others added 30 commits May 11, 2026 16:57
First slice of the workspaces feature branch. Gated by
CIX_WORKSPACES_ENABLED — every new endpoint returns 503 when off, so
existing deployments are unaffected.

New tables: workspaces, github_tokens. New packages: internal/secrets
(AES-256-GCM at rest, key from CIX_SECRET_KEY / CIX_SECRET_KEYFILE /
auto-generated 0600 keyfile), internal/workspaces, internal/githubtokens.
New endpoints: full CRUD over /api/v1/workspaces and /api/v1/github-tokens
with the canonical {"detail": "..."} error envelope. Plaintext PATs are
never echoed — POST returns metadata only.
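
As a rough illustration of the at-rest scheme named above, the sketch below seals and opens a value with AES-256-GCM using only the Go standard library. The function names `seal` and `open` and the nonce-prefixed blob layout are assumptions made for this example; the actual layout used by internal/secrets is not shown in this PR.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD; a 32-byte key selects AES-256.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal (hypothetical name) returns nonce||ciphertext||tag.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Appending to nonce keeps the nonce at the front of the stored blob.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open (hypothetical name) reverses seal and fails on any tampering.
func open(key, blob []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key := make([]byte, 32) // demo key; a real key would come from CIX_SECRET_KEY or a keyfile
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	blob, err := seal(key, []byte("example-token"))
	if err != nil {
		panic(err)
	}
	plain, err := open(key, blob)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain) == "example-token")
}
```

Because GCM is authenticated, flipping any byte of the stored blob makes `open` return an error instead of corrupted plaintext.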

Dashboard gets two placeholder modules (Workspaces, GitHub Tokens) that
render the full CRUD flow against the new endpoints and self-hide behind
a "feature off" alert when the flag is false.

Subsequent PRs of feature/workspaces add workspace_repos, jobs+workers,
webhook receiver, call-graph extraction, Louvain communities, two-stage
search, and the cix:workspace skill.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Includes:
- workspaces / github_tokens schema (gated by CIX_WORKSPACES_ENABLED)
- AES-256-GCM at-rest encryption (internal/secrets)
- Full CRUD over /api/v1/workspaces and /api/v1/github-tokens
- Dashboard placeholder modules for both
- Unit + integration tests (plaintext-leak gate)
Adds the bridge from a GitHub URL to an indexed cix project. Operator
attaches a repo to a workspace via POST /workspaces/{id}/repos; the
server enqueues a clone_repo job (worker clones via go-git), then
chains an index_repo job that drives the existing 3-phase indexer
in-process against the on-disk clone.

New packages:
- internal/jobs       persistent SQLite-backed worker pool with
                      partial-unique dedupe (50 webhook bursts collapse
                      to 1 pending row), per-attempt linear backoff,
                      panic-safe handler invocation
- internal/repocloner go-git wrapper — shallow clone with PAT auth via
                      x-access-token, in-process so distroless images
                      don't need a git binary; fetch+reset on reuse
- internal/repoindexer walks the clone, batches FilePayloads, calls
                      indexer.BeginIndexing/ProcessFiles/Finish.
                      Filter prunes node_modules/.git/etc., skips
                      binaries (NUL probe) and oversized files.
- internal/workspacerepos service layer for workspace_repos rows
- internal/workspacejobs handler registration that wires the above
                      packages into the jobs queue
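
Two of the worker-pool properties listed above, panic-safe handler invocation and per-attempt linear backoff, can be sketched in a few lines. The names `runSafely` and `linearBackoff` and the 10-second step are hypothetical; the real internal/jobs package may structure this differently.

```go
package main

import (
	"fmt"
	"time"
)

// linearBackoff returns the wait before retry number attempt (1-based):
// step, 2*step, 3*step, and so on. The step size is an assumed parameter.
func linearBackoff(attempt int, step time.Duration) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	return time.Duration(attempt) * step
}

// runSafely calls handler and turns a panic into an ordinary error,
// so one misbehaving job cannot take down the worker goroutine.
func runSafely(handler func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("handler panicked: %v", r)
		}
	}()
	return handler()
}

func main() {
	err := runSafely(func() error { panic("boom") })
	fmt.Println("error:", err)
	fmt.Println("retry 3 waits:", linearBackoff(3, 10*time.Second))
}
```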

New endpoints (gated by CIX_WORKSPACES_ENABLED):
- GET    /workspaces/{id}/repos
- POST   /workspaces/{id}/repos      (returns one-shot webhook secret)
- DELETE /workspaces/{id}/repos/{repo_id}
- POST   /workspaces/{id}/repos/{repo_id}/reindex
- GET    /jobs

New env vars: CIX_WORKER_CONCURRENCY (default 2),
CIX_WORKSPACES_DATA_DIR (default <sqlite-parent>/repos), CIX_PUBLIC_URL
(used to build webhook URLs surfaced to operators).

Webhook receiver / HMAC validation lands in PR3; call graph + Louvain
communities + two-stage search in PR4–PR6.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
End-to-end pipeline from POST /workspaces/{id}/repos through cloned +
indexed project, with:
- workspace_repos + jobs schema (dedupe via partial-unique index)
- SQLite-backed worker pool (panic-safe, retry backoff)
- go-git clone wrapper (works in distroless images)
- in-process indexer driver that reuses the existing 3-phase protocol
- repo CRUD + reindex + jobs list endpoints (gated by feature flag)
- 7 HTTP integration tests, jobs unit tests, filter unit tests
…gister)

Closes the loop from a push on GitHub to an updated cix index. A new
public endpoint accepts deliveries, validates HMAC-SHA256 against the
per-row webhook_secret, and enqueues the same clone_repo job PR2
introduced — go-git's CloneOrFetch already handles the incremental
fetch+reset path, so no new job type is needed.
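
The HMAC check described above could look roughly like the following. GitHub sends the digest in the `X-Hub-Signature-256` header as `sha256=<hex>`; the function names and the way the secret is passed in are assumptions for this sketch, not the PR's actual handler.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// validSignature checks a "sha256=<hex>" header value against the raw body.
// hmac.Equal compares in constant time to avoid leaking digest prefixes.
func validSignature(secret, body []byte, header string) bool {
	const prefix = "sha256="
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

// sign produces the header value a sender would attach (demo helper).
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

func main() {
	secret := []byte("per-row-webhook-secret")
	body := []byte(`{"ref":"refs/heads/main"}`)
	fmt.Println(validSignature(secret, body, sign(secret, body)))
	fmt.Println(validSignature(secret, body, "sha256=deadbeef"))
}
```

The signature must be computed over the raw request bytes, so a handler built this way would read the body once and verify it before any JSON decoding.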

The dashboard's add-repo flow now exposes an `auto_webhook` toggle.
When true, the server uses the supplied PAT to POST /repos/.../hooks
on the operator's behalf and persists the resulting hook id. Failure
is non-fatal — the response carries `auto_registered: false` plus an
operator-facing note (e.g. "missing admin:repo_hook scope"). Manual
setup is the default and works without any extra GitHub scopes.

New package internal/githubapi: a tiny raw-HTTP client for two GitHub
endpoints (create webhook, delete webhook). Pulling go-github for just
these two calls would have added ~10MB of generated code.

New endpoints:
- POST /api/v1/webhooks/github/{repo_id}              (public; HMAC-auth)
- GET  /api/v1/workspaces/{id}/repos/{repo_id}/webhook-info

Tests cover: HMAC happy path, mismatched/missing signatures (401),
ping deliveries (200), wrong-branch pushes (ignored), burst-dedupe on
multiple deliveries collapsing to one job, public-path bypass of the
auth middleware, and the auto-register-fails-cleanly-without-public-URL
branch.

doc/WORKSPACES.md is a new operator guide — feature flags, encryption
key resolution, Cloudflare tunnel quick-start, manual + auto webhook
flows, troubleshooting.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
End-to-end webhook delivery → reindex with HMAC validation, optional
auto-register against the GitHub API, manual setup UX, and an
operator guide (doc/WORKSPACES.md) with Cloudflare tunnel walkthrough.
Approximate caller→callee graph extracted from the existing symbols +
refs tables. The result feeds Louvain community detection in PR5; the
eval harness gates that downstream work behind a precision-floor check.

Approach (refs heuristic):
- caller resolved as the narrowest function/method whose [line, end_line]
  span contains the ref's line
- callee candidates resolved by name lookup on symbols (kind ∈ function,
  method) constrained to the same project
- weight = 1 / popcount(callee_name) — so common names like init/run/handle
  contribute proportionally less to the structural signal
- popcount > 20 → name dropped (treated as noise)
- same-file bonus ×2.0, same-parent_name bonus ×1.5
- self-edges (recursion) dropped — they don't help community separation
- duplicate (caller, callee) pairs accumulate weight via map then bulk
  INSERT inside a single transaction
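
The weighting rules above can be read as a small pure function plus a map accumulator. This is a sketch under two assumptions that the commit message does not spell out: `popcount` is taken to be the number of symbols sharing the callee's name, and the same-file and same-parent bonuses are taken to multiply together. The type and function names are hypothetical.

```go
package main

import "fmt"

// edgeWeight applies the heuristic: 1/popcount, dropped above 20,
// with x2.0 for same file and x1.5 for same parent_name.
func edgeWeight(popcount int, sameFile, sameParent bool) (float64, bool) {
	if popcount < 1 || popcount > 20 {
		return 0, false // unresolved or too common: treated as noise
	}
	w := 1.0 / float64(popcount)
	if sameFile {
		w *= 2.0
	}
	if sameParent {
		w *= 1.5
	}
	return w, true
}

// edge identifies a (caller, callee) pair by symbol id.
type edge struct{ caller, callee int64 }

// accumulate sums weights for duplicate pairs and skips self-edges.
func accumulate(acc map[edge]float64, caller, callee int64, w float64) {
	if caller == callee {
		return // recursion does not help community separation
	}
	acc[edge{caller, callee}] += w
}

func main() {
	acc := map[edge]float64{}
	if w, ok := edgeWeight(4, true, false); ok {
		accumulate(acc, 1, 2, w)
		accumulate(acc, 1, 2, w) // a second ref to the same callee
	}
	accumulate(acc, 3, 3, 1.0) // self-edge, ignored
	fmt.Println(acc[edge{1, 2}], len(acc))
}
```

After accumulation, the map contents would be written with one bulk INSERT inside a single transaction, as the list above describes.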

Integration: workspacejobs.handleIndex calls callgraph.Build after a
successful FinishIndexing — non-fatal (failure logs but doesn't flip
the repo status to failed; semantic search continues to work without
the graph).

Eval harness — internal/callgraph/eval/ — runs three fixtures
(Go/Python/TypeScript) through the full chunker → persist → build path
and asserts the labeled (caller, callee) pairs all show up in
call_edges. Current results:

  go-handlers     4/4  precision 1.00
  python-pipeline 6/6  precision 1.00
  typescript-store 5/5  precision 1.00

All three comfortably above the 0.60 floor — no need to fall back to
the symbol co-occurrence graph (callgraph.SourceCoOccurrence is in the
table for future swapping). Greenlights PR5 (Louvain communities).

9 unit tests covering: single-edge happy path, popcount drop,
module-scope refs skipped, self-edges dropped, cross-file weight,
same-parent bonus, weight accumulation, idempotency, edge counting.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The structural layer that powers PR6's two-stage workspace search.
Every workspace's combined call_edges graph is partitioned into Louvain
communities; each community gets a mean-pooled, L2-normalised embedding
stored in a dedicated chromem collection (ws_{md5}_centroids).
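
For readers unfamiliar with the centroid construction, here is a minimal sketch of mean-pooling followed by L2 normalisation. The helper names echo the `meanPool` / `l2Normalise` helpers mentioned in the test notes, but the bodies are this example's own and assume all member vectors share one dimension.

```go
package main

import (
	"fmt"
	"math"
)

// meanPool averages equal-length vectors component-wise.
// It returns nil for an empty input.
func meanPool(vecs [][]float32) []float32 {
	if len(vecs) == 0 {
		return nil
	}
	out := make([]float32, len(vecs[0]))
	for _, v := range vecs {
		for i := range out {
			out[i] += v[i]
		}
	}
	n := float32(len(vecs))
	for i := range out {
		out[i] /= n
	}
	return out
}

// l2Normalise scales v to unit length; a zero vector stays zero.
func l2Normalise(v []float32) []float32 {
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x)
	}
	out := make([]float32, len(v))
	norm := math.Sqrt(sum)
	if norm == 0 {
		return out
	}
	for i, x := range v {
		out[i] = float32(float64(x) / norm)
	}
	return out
}

func main() {
	members := [][]float32{{3, 0}, {3, 8}}
	centroid := l2Normalise(meanPool(members))
	fmt.Println(centroid)
}
```

Normalising the centroid to unit length means a dot product against a unit-length query embedding equals cosine similarity.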

New package internal/communities — gonum/graph/community Louvain with
deterministic seed (Resolution=1.0, seed=42). Empty workspaces and
empty graphs are handled cleanly; output is wholesale-replaced on each
rebuild so partial failures can't leave stale state.

New tables: communities (id, workspace_id, label, size, parent_id),
community_members (community_id, project_path, symbol_id). Wholesale
delete + reinsert per rebuild inside a single transaction.

New vectorstore methods:
- CentroidCollectionName(workspaceID)
- ReplaceCentroids — drops + recreates the workspace's chromem
  collection in lock-step with the SQL rebuild
- SearchCentroids — top-K nearest-neighbor against the centroid
  collection (the stage-1 query for PR6)
- FetchProjectChunkEmbeddings — by-symbol-name lookup used during
  mean-pooling. chromem's where filter is single-equality so we make
  one query per name (bounded by community member count, typically <200).

Job pipeline:
- New type "compute_workspace_communities" with debounce key
  "communities:{workspace_id}" — burst-safe via the existing
  partial-unique index on jobs.dedupe_key.
- index_repo handler chains EnqueueComputeCommunities(workspace_id)
  with a 30s scheduled_at delay, so a wave of repos finishing
  indexing during catch-up collapses into one Louvain rebuild.

Tests: 6 unit tests covering Build (two-cluster split, empty workspace,
idempotency, cross-project tracking) + meanPool/l2Normalise helpers.
Eval gate from PR4 already cleared at 100% precision — Louvain runs
against a high-quality graph by construction.

Deferred to a future iteration (cheap to revisit):
- Recursive split for communities >50 chunks
- Small-community merging
- Overlapping community detection (BigCLAM, etc.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
GET /api/v1/workspaces/{id}/search?q=... is the user-facing payoff of
the workspaces feature. Two-stage retrieval:

  Stage 1: embed query → SearchCentroids → top N communities (default 5)
  Stage 2: for each (community, project_path), one chromem query against
           the per-project chunk collection with the user's embedding;
           filter results in-memory to members of the community by
           symbol_name; merge globally, dedupe by (project, file,
           startLine, endLine), return top K (default 20).

Why filter in-memory instead of pushing where: chromem's where clause
is single-equality only — pushing per-symbol-name filters would mean N
queries per (community, project). Stage-2 fan-out is bounded by
top_communities × #project_paths_per_community ≈ 5 × 3 = 15 queries
per workspace search, comfortably under 500ms p50.
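
The merge step at the end of stage 2 (global merge, dedupe by (project, file, startLine, endLine), top K) might be sketched as below. Keeping the higher score when the same span appears twice is an assumption, as are the type and function names.

```go
package main

import (
	"fmt"
	"sort"
)

// chunk is a simplified search hit for this sketch.
type chunk struct {
	Project, File      string
	StartLine, EndLine int
	Score              float32
}

// chunkKey is the dedupe identity named in the commit message.
type chunkKey struct {
	project, file string
	start, end    int
}

// mergeTopK merges per-query hit lists, keeps the best-scoring copy
// of each span, and returns the k highest-scoring chunks.
func mergeTopK(perQuery [][]chunk, k int) []chunk {
	best := map[chunkKey]chunk{}
	for _, hits := range perQuery {
		for _, c := range hits {
			key := chunkKey{c.Project, c.File, c.StartLine, c.EndLine}
			if prev, seen := best[key]; !seen || c.Score > prev.Score {
				best[key] = c
			}
		}
	}
	out := make([]chunk, 0, len(best))
	for _, c := range best {
		out = append(out, c)
	}
	sort.Slice(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if a.Score != b.Score {
			return a.Score > b.Score
		}
		// Deterministic tie-break so map iteration order never leaks out.
		if a.Project != b.Project {
			return a.Project < b.Project
		}
		if a.File != b.File {
			return a.File < b.File
		}
		return a.StartLine < b.StartLine
	})
	if len(out) > k {
		out = out[:k]
	}
	return out
}

func main() {
	hits := [][]chunk{
		{{"api", "a.go", 1, 10, 0.9}, {"api", "b.go", 5, 20, 0.4}},
		{{"api", "a.go", 1, 10, 0.7}, {"web", "c.ts", 2, 8, 0.8}},
	}
	for _, c := range mergeTopK(hits, 2) {
		fmt.Println(c.Project, c.File, c.Score)
	}
}
```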

Response shape (WorkspaceSearchResponse):
- status: "ok" | "communities_not_built" | "empty"
- communities: top-N centroids with score, label, project_paths
- chunks: merged ranking with project_path, file, lines, score,
  community attribution

When the workspace has no centroid index yet (e.g. just-created
workspace, debounced compute_workspace_communities hasn't fired),
the endpoint returns `status: "communities_not_built"` with empty
arrays — dashboard UI can render a hint instead of an error.

Tests: 4 HTTP integration tests covering the empty-centroid branch,
missing query parameter, unknown workspace id, and disabled feature
flag. A stub embedder lets us reach stage 1 without standing up the
llama-server sidecar in CI.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Final slice of the workspaces feature branch.

CLI (cli/):
- New parent command `cix workspace` (alias `cix ws`)
- `cix workspace list` — lists every workspace on the cix-server
- `cix workspace search <ws> <query>` — runs the two-stage search
  - --top-communities N (default 5)
  - --top-chunks K (default 20)
  - --json — raw response for piping
- Workspace identifier accepts either the opaque id or the name
  (case-insensitive); resolution is one `cix workspace list` round
  trip cached per-process.

Skill (skills/cix-workspace/SKILL.md):
- User-invocable skill with Markdown frontmatter, mirroring the `cix`
  skill's style guide.
- Trigger phrasing tuned to the use case: cross-repo questions,
  microservice flows, frontend+backend pairs.
- Explains the two-stage mental model + when to fall back to plain
  `cix search` inside a single repo.
- Troubleshooting for `communities_not_built`, empty results, 503.

Dashboard (server/dashboard/src/modules/workspaces/):
- Search icon button on every workspace row opens a dialog hosting
  the full two-stage search UI: query input → top communities list
  (label, score, member count, project_paths) → top chunks (file,
  lines, project, symbol, score, content snippet).
- Status-aware empty states: explicit message when the centroid
  index hasn't built yet ("wait ~30s after the last index_repo").

Tests pass on both server and CLI. The feature branch is now ready
to merge to main as one large PR per the user's PR strategy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…xpansion)

User-facing refactor of the workspace surface so operators (and the
agent skill) can explore before searching.

CLI grammar — name-first, manual dispatch under one `cix ws` parent:

  cix ws                          → list workspaces
  cix ws list [--verbose|--json]  → list workspaces (alternate)
  cix ws <name>                   → describe — repos + status + indexed count
  cix ws <name> list              → list repos in workspace
  cix ws <name> repos             → alias for `<name> list`
  cix ws <name> describe          → same as bare `<name>`
  cix ws <name> search <query>    → two-stage workspace search

Why manual dispatch rather than cobra subcommands: the workspace NAME
needs to sit in the first positional slot. Cobra can't recognise a
dynamic value as a command, so we use cobra.ArbitraryArgs + a small
switch inside RunE. Trade-off: no auto-completion on the name. In
exchange, the surface reads the way operators think.
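
The manual dispatch described above boils down to a switch over the positional arguments. The sketch below keeps to the standard library so it runs on its own; in the real CLI this logic would sit inside a cobra `RunE` on a command declared with `cobra.ArbitraryArgs`. The `route` type, the verb strings, and joining the remaining arguments into the query are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// route is a hypothetical description of what the CLI should do next.
type route struct {
	Verb      string // "list-workspaces", "describe", "list-repos", "search"
	Workspace string
	Query     string
}

// dispatch maps positional args (flags already stripped) to a route.
func dispatch(args []string) (route, error) {
	if len(args) == 0 || args[0] == "list" {
		return route{Verb: "list-workspaces"}, nil
	}
	name := args[0] // dynamic value in the first slot
	if len(args) == 1 {
		return route{Verb: "describe", Workspace: name}, nil
	}
	switch args[1] {
	case "describe":
		return route{Verb: "describe", Workspace: name}, nil
	case "list", "repos":
		return route{Verb: "list-repos", Workspace: name}, nil
	case "search":
		if len(args) < 3 {
			return route{}, fmt.Errorf("search requires a query")
		}
		q := strings.Join(args[2:], " ")
		return route{Verb: "search", Workspace: name, Query: q}, nil
	default:
		return route{}, fmt.Errorf("unknown verb %q", args[1])
	}
}

func main() {
	for _, args := range [][]string{
		{},
		{"acme"},
		{"acme", "repos"},
		{"acme", "search", "sell", "flow"},
	} {
		r, err := dispatch(args)
		fmt.Println(r, err)
	}
}
```

One consequence of this grammar, visible in the first branch, is that a workspace literally named `list` could not be addressed by name.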

Status badges in `describe` / verbose `list`:
  ✓ indexed   ✗ failed   … pending/cloning/indexing

Client: adds Client.ListWorkspaceRepos for the new verbs to consume.
The /workspaces/{id}/repos endpoint is already there (PR2) — this
just exposes it.

Dashboard: each workspace row is now expandable. Click the chevron
→ lazy-loads attached repos, each shown with status colour, branch,
project_path, last_indexed_at, and last_error. The Search button
on the row still opens the existing two-stage search dialog.

SKILL.md: documents the new grammar + adds a "Discovery-first
workflow" pattern at the top of Patterns. The point of the new
verbs from an agent's perspective is to know whether a workspace
is searchable before paying the search round-trip — `cix ws <name>`
tells you indexed-count and lists repos in one call.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The dashboard form was asking users to type the token's scopes by hand,
but scopes are an attribute of the PAT set on github.com — typed input
is just unverified text that can drift from what GitHub will actually
enforce. The codepath was a leftover from a deferred validation step.

Now the server validates every newly submitted PAT with GET /user and
reads the real scopes from the X-OAuth-Scopes response header. A 401
from GitHub turns into a 422 with the surfaced message, anything else
into a 502, so an invalid or unreachable token is rejected at the door
rather than persisted and discovered later. Fine-grained PATs
(github_pat_*) don't expose scopes via this header — for them Scopes
stays empty and the dashboard displays "(fine-grained or none)".
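
A rough sketch of the two pieces of logic in this validation step: splitting the comma-separated `X-OAuth-Scopes` value, and mapping GitHub's response status onto the server's own. The function names are hypothetical, and returning 200 for the success case is a simplification of whatever the real create handler responds with.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// parseScopes splits a raw X-OAuth-Scopes value such as
// "repo, admin:repo_hook". An empty header yields no scopes,
// which is what a fine-grained PAT would produce.
func parseScopes(raw string) []string {
	var scopes []string
	for _, s := range strings.Split(raw, ",") {
		if s = strings.TrimSpace(s); s != "" {
			scopes = append(scopes, s)
		}
	}
	return scopes
}

// mapUpstreamStatus translates the GET /user status: 401 becomes 422,
// any other non-200 becomes 502.
func mapUpstreamStatus(githubStatus int) int {
	switch githubStatus {
	case http.StatusOK:
		return http.StatusOK
	case http.StatusUnauthorized:
		return http.StatusUnprocessableEntity
	default:
		return http.StatusBadGateway
	}
}

func main() {
	h := http.Header{}
	h.Set("X-OAuth-Scopes", "repo, admin:repo_hook")
	fmt.Println(parseScopes(h.Get("X-OAuth-Scopes")))
	fmt.Println(mapUpstreamStatus(401), mapUpstreamStatus(503))
}
```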

The Scopes field on CreateGithubTokenRequest is marked deprecated and
ignored on the server; the dashboard's Scopes input is removed.
Existing tests are updated and a TestGithubTokens_RejectInvalidToken
case asserts the 401-path rejection.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The dashboard's workspace view was a stub that listed repos read-only;
attaching a new repo only worked via curl. This wires up the actual UX:
a card grid that mirrors the projects page on the list, a per-workspace
detail page, and a staged add-repo dialog that walks the operator
through token → repo → branch → webhook policy.

Backend changes:
  - GET /api/v1/github-tokens/{id}/repos — reveals the PAT server-side,
    fetches the repos visible to it via /user/repos with Link-header
    pagination (up to 5 pages = 500 repos), optionally filtered by ?q=.
    The plaintext never touches the wire.
  - POST /api/v1/workspaces/{id}/repos now accepts webhook_mode of
    {manual, auto, disabled}. A new workspace_repos.webhook_mode column
    records the operator's intent; the legacy auto_webhook bool remains
    derived (true iff mode = "auto") so old clients keep working.
    Existing rows are backfilled to "auto" when auto_webhook=1.
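
The capped Link-header pagination in the first backend bullet can be sketched as follows. `fetchFunc` is a hypothetical stand-in for one authenticated GitHub request, and the header parser is deliberately simple: it splits on commas and so assumes the URLs themselves contain none.

```go
package main

import (
	"fmt"
	"strings"
)

// nextLink extracts the rel="next" target from a Link header value,
// or returns "" when there is no next page.
func nextLink(header string) string {
	for _, part := range strings.Split(header, ",") {
		segs := strings.Split(part, ";")
		target := strings.TrimSpace(segs[0])
		if len(segs) < 2 || !strings.HasPrefix(target, "<") || !strings.HasSuffix(target, ">") {
			continue
		}
		for _, param := range segs[1:] {
			if strings.TrimSpace(param) == `rel="next"` {
				return target[1 : len(target)-1]
			}
		}
	}
	return ""
}

// fetchFunc returns one page of repo names plus that page's Link header.
type fetchFunc func(url string) (names []string, linkHeader string, err error)

// listRepos follows rel="next" links until they run out or maxPages is hit.
func listRepos(first string, fetch fetchFunc, maxPages int) ([]string, error) {
	var all []string
	for url, page := first, 0; url != "" && page < maxPages; page++ {
		names, link, err := fetch(url)
		if err != nil {
			return nil, err
		}
		all = append(all, names...)
		url = nextLink(link)
	}
	return all, nil
}

func main() {
	pages := map[string]string{"p1": `<p2>; rel="next"`, "p2": ""}
	fetch := func(url string) ([]string, string, error) {
		return []string{"repo-from-" + url}, pages[url], nil
	}
	repos, err := listRepos("p1", fetch, 5)
	fmt.Println(repos, err)
}
```

With a cap of 5 pages and 100 repos per page, this matches the 500-repo ceiling stated above.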

Frontend changes:
  - WorkspacesPage is a Routes shell now; list + detail are separate.
  - WorkspacesListPage renders Workspaces as cards (counts at-a-glance,
    in-progress / failed badges) — same visual language as projects.
  - WorkspaceDetailPage drives the per-workspace UX: an Add repo dialog
    with a staged form (each step unlocks the next), Reindex / Delete
    actions on each RepoCard, and background polling (3s) while any
    repo is in pending / cloning / indexing so the operator can watch
    the progress without F5. Each in-flight badge ticks an elapsed
    counter so it's visible that the job isn't silently stalled.
  - AddRepoDialog picks tokens, lists their visible repos with a
    client-side text filter, auto-fills branch from default_branch,
    and surfaces the webhook URL+secret once for manual mode.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The repo picker only surfaced what `/user/repos` returned. That endpoint
is the affiliations-aggregated view and routinely misses org repos —
SAML-protected orgs in particular only appear under `/orgs/{login}/repos`.
So a user with access to an org would see their personal account but
not the org's repos, which is exactly what was hit in testing.

Add a second selector between Token and Repository:

  - New `GET /api/v1/github-tokens/{id}/accounts` lists the PAT owner
    plus every org from `/user/orgs`. SAML-gated 403 on /user/orgs is
    swallowed so the personal account still comes through.
  - `GET /api/v1/github-tokens/{id}/repos` now accepts `?account=login`
    + `?account_type=user|org`. When set, the server hits
    `/users/{login}/repos` or `/orgs/{login}/repos` directly. When not
    set, it falls back to the original `/user/repos` aggregated view
    so existing callers keep working.

Dashboard:

  - `AddRepoDialog` loads accounts as soon as a token is picked and
    renders a Select with "(all accessible)" plus each user/org. The
    repo list refetches whenever the account changes — typing through
    the picker now shows the org's repos directly.

Tests:

  - Unit: ListAccounts (user + orgs), SAML-403 swallow, account-scoped
    repo endpoint dispatch (`/users/X` vs `/orgs/X`).
  - Integration: round-trips through the HTTP layer including the
    "no account_type with account" 422 rejection.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Long full_names like atrybulkevychglobalgames/grpc-go-kubernetes-load-
balancing-example were pushing the row past the dialog's max-width
because Tailwind's truncate only works on a flex child that also has
min-w-0. The name span had truncate but no shrink boundary, so it kept
its intrinsic width and the branch span on ml-auto ended up off-screen.

Wrap the name in min-w-0 flex-1 truncate, pin the icon and branch to
shrink-0 so the row stays inside the dialog. Added a title= attribute
on the button so hovering still surfaces the full path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
A trigram-tokenized FTS5 virtual table lives alongside chromem-go so
workspace search can pair dense vector retrieval with sparse keyword
retrieval. The sparse signal recovers two things pure-dense fan-out
loses: short-token precision (acronyms like "XYZ" get diffuse cosine
scores) and project-relevance gating (chromem returns the N nearest
vectors regardless of semantic distance, so projects that share zero
vocabulary with the query still surface as false positives at
chunk_score ~0.25).

chunks_fts can only filter by rowid; chunks_meta is the indexed shadow
that lets us delete by (project_path, file_path) and project_path
without a full FTS5 scan. The two stay consistent inside the indexer's
per-file SQL transaction, and they cascade away on project deletion
and on full-reindex wipe.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Each project now runs dense (chromem cosine) and sparse (FTS5 BM25
over the chunks_fts mirror added in the previous commit) in parallel.
Per project the two ranked lists are fused via Reciprocal Rank Fusion
(k=60). Across projects an α-blended candidacy score (with per-query
min-max normalization on both signals) plus a relative threshold
(`candidacy >= best * 0.4`) gates the result set so projects that
share no semantic and no lexical overlap with the query drop out
entirely — pure-dense fan-out leaked every workspace repo at
noise-level cosine similarity because chromem returns the N nearest
vectors regardless of how far away "nearest" actually is.
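
To make the two fusion steps concrete, the sketch below shows Reciprocal Rank Fusion with k=60 inside one project and the relative candidacy gate across projects. The blend direction (α on dense, 1-α on BM25), the value of α, and the handling of a signal with no spread are assumptions; the commit message does not state them. All names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

const rrfK = 60

// rrfFuse merges ranked ID lists: score(id) = sum of 1/(k + rank),
// with rank starting at 1 in each list.
func rrfFuse(lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / float64(rrfK+i+1)
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool {
		if scores[ids[i]] != scores[ids[j]] {
			return scores[ids[i]] > scores[ids[j]]
		}
		return ids[i] < ids[j]
	})
	return ids
}

// minMax rescales xs to [0,1]; a signal with no spread maps to all zeros.
func minMax(xs []float64) []float64 {
	out := make([]float64, len(xs))
	if len(xs) == 0 {
		return out
	}
	lo, hi := xs[0], xs[0]
	for _, x := range xs {
		if x < lo {
			lo = x
		}
		if x > hi {
			hi = x
		}
	}
	if hi == lo {
		return out
	}
	for i, x := range xs {
		out[i] = (x - lo) / (hi - lo)
	}
	return out
}

// gate returns indices of projects whose blended candidacy is at least
// 0.4 of the best one. dense and bm25 must have equal length.
func gate(dense, bm25 []float64, alpha float64) []int {
	d, b := minMax(dense), minMax(bm25)
	cand := make([]float64, len(d))
	best := 0.0
	for i := range d {
		cand[i] = alpha*d[i] + (1-alpha)*b[i]
		if cand[i] > best {
			best = cand[i]
		}
	}
	var keep []int
	for i, c := range cand {
		if c >= best*0.4 {
			keep = append(keep, i)
		}
	}
	return keep
}

func main() {
	fmt.Println(rrfFuse([]string{"a", "b", "c"}, []string{"b", "d"}))
	fmt.Println(gate([]float64{0.9, 0.5, 0.2}, []float64{10, 0, 0}, 0.5))
}
```

In the second call, the two projects with zero BM25 signal fall below the 0.4 relative threshold and drop out, which mirrors the zero-mention case this commit was written to fix.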

Live XYZ probe over 8 ACME repos: three repos with literally zero
"XYZ" mentions previously surfaced 50 chunks each at dense scores
0.17-0.27. With the gate they drop out; the chunks list is then
built by round-robin interleaving across surviving projects so each
relevant repo gets its top hit before the dominant repo's tail
entries appear.
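
The round-robin interleaving mentioned above can be illustrated with a short helper. It assumes each surviving project's hits are already sorted best-first and that projects are passed in their ranked order; the function name and the use of plain strings for chunks are simplifications.

```go
package main

import "fmt"

// roundRobin takes one hit from each project in turn (all rank-1 hits,
// then all rank-2 hits, and so on) until limit results are collected
// or every list is exhausted.
func roundRobin(perProject [][]string, limit int) []string {
	var out []string
	for depth := 0; len(out) < limit; depth++ {
		progressed := false
		for _, hits := range perProject {
			if depth >= len(hits) {
				continue
			}
			out = append(out, hits[depth])
			progressed = true
			if len(out) == limit {
				return out
			}
		}
		if !progressed {
			break // every project list is exhausted
		}
	}
	return out
}

func main() {
	perProject := [][]string{
		{"a1", "a2", "a3"}, // dominant repo
		{"b1"},
		{"c1", "c2"},
	}
	fmt.Println(roundRobin(perProject, 5))
}
```

Here `b1` and `c1` appear before `a2`, so each relevant repo surfaces its top hit before the dominant repo's tail.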

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…a reindex

Repos indexed before the chunks_fts mirror landed have file_hashes
rows (chromem populated, project marked indexed) but an empty
chunks_meta — the BM25 side of hybrid search returns nothing for
them and the algorithm degrades to pure dense for those entries.
Observable failure mode: live workspace shows the new bm25_score
field at 0.000 for every project and the result set looks
identical to the old pure-dense fan-out.

WorkspaceSearch now probes chunks_meta vs file_hashes per project
and bubbles stale repos up via a new stale_fts_repos field on the
response. The dashboard renders a banner naming the affected repos
with a reindex hint.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…workflow

Replaces the old centroid-routing playbook (which described tools we
ripped out in the workspaces refactor) with a workflow that matches
the hybrid BM25+dense algorithm: how to phrase queries so the BM25
gate fires, how to read project_score / bm25_score / dense_score,
when to spawn parallel Explore sub-agents over surviving projects,
and how to synthesize the per-repo change plan.

The skill is goal-driven: every workspace-search interaction has to
answer (1) which repos are in scope, (2) which code in those repos
is relevant, (3) what changes need to land and in what order. It
also names the "primary project" pattern — the agent is usually
cd'd into one specific repo and the user's task is rooted there;
workspace search defines the surrounding context.

Includes a worked retro on the "Add sell flow to XYZ" failure that
motivated the hybrid algorithm — pure-dense fan-out routed three
zero-mention repos as relevant on noise-level cosine similarity.

Aligns the CLI (`cix ws … search`) with the new server API: drops
the `--top-communities` flag in favour of `--top-projects`, switches
the response renderer to projects + bm25/dense breakdown, surfaces
stale_fts_repos as an inline warning.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces internal product/repo names that leaked from a real production
debugging session into test fixtures, code comments, and the
cix-workspace skill doc:

- XYZ / XYZOrder / processXYZOrderEvent → XYZ / XYZOrder / processXYZOrderEvent
- acme-backend / acme-shared / acme-models / acme-worker /
  acme-notifier / acme-directory / acme-inventory / acme-platform
  → acme-backend / acme-shared / acme-models / acme-worker /
    acme-notifier / acme-directory / acme-inventory / acme-platform
- "internal product code" → "internal product code"
- "shared-models migration", "shared data models" → generic
  shared-models / data-model phrasing
- README .cixignore example switched from api/generated/ to
  api/generated/

Working-tree-only sanitization; a follow-up history rewrite will scrub
the same strings from older commits. Tests green (chunksfts, db,
httpapi, projectconfig).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ency

Three changes to workspace and per-project search, plus targeted
anonymization of eval-derived references in adjacent comments and test
fixtures.

Server behaviour:
- Workspace search default `min_score` raised from 0 to 0.4, matching
  the per-project SemanticSearch default. Cross-project sweeps that
  need long-tail recall must now pass `min_score=0` explicitly.
- Per-project SemanticSearch default `min_score` lowered from 0.4 to
  0.2 — abstract NL queries (e.g. "end-to-end workflow lifecycle")
  used to silently return empty even when relevant chunks scored in
  [0.25, 0.35]. 0.2 keeps a light noise floor.
- Fix: workspace `chunks[]` round-robin now uses only the projects
  that survived the `top_projects` truncation. Previously a 12-project
  workspace at default `top_projects=10` could surface chunks from
  the 11th/12th project that weren't in the `projects[]` panel —
  clients had no way to look up the chunk's bm25/dense scores.

Tests added:
- TestWorkspaceSearch_ChunksOnlyFromPanelProjects — 12 surviving repos
  + top_projects=10; every chunk's project must appear in the panel.
- TestWorkspaceSearch_DefaultMinScoreIs04 — geometry calibrated so
  chunks at cos=0.3 are filtered by default and admitted at min_score=0.
- TestSemanticSearch_DefaultMinScoreIs02 — fakeEmbedder geometry
  producing a cos≈0.25 chunk that the old default would have rejected.

OpenAPI spec descriptions updated for both `min_score` defaults.

Anonymization (carried over from previous workspace-eval analysis):
adjacent comments and test fixtures that named specific repos / product
acronyms / sell-flow scenarios are replaced with neutral placeholders
(WIDGET / ping / generic repo descriptions).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…subagent

Two additions to the cix-workspace skill:

1. Ten "trust rules" for interpreting workspace_search responses,
   derived from internal calibration testing:
   - chunk.score>=0.4 trust threshold (rule 1)
   - chunk.score==0 = BM25-only literal match, not low confidence (rule 2)
   - top-1 of projects[] is correct ~70% of the time in real tasks (rule 3)
   - drop down to per-project search for depth (rule 4)
   - min_score=0 explicitly for cross-project sweeps (rule 5)
   - careful disambiguator selection — prefer meta-tokens over tech
     guesses (rule 6)
   - "change X in production" → manifests/config repo, not code repo
     (rule 7)
   - scan ranks 2-5 before reformulating (rule 8)
   - explicit min_score=0 for per-project NL drill-down (rule 9)
   - where the query words live ≠ where the change belongs (rule 10)

2. Dedicated `cix-workspace-investigator` sub-agent at
   `skills/cix-workspace/agents/cix-workspace-investigator.md`:
   - Thin read-only shell around cix search/def/refs + Read + Grep
   - Scope-isolated: one repo per spawn, no edits, no recursion
   - Methodology + output format are the main agent's call per spawn,
     not baked into the sub-agent's system prompt
   - System prompt is ~60 lines; main agent's per-spawn prompt
     handles the actual task

SKILL.md's "Sub-agent fan-out pattern" section rewritten around the
new sub-agent with a four-part prompt template (task verbatim,
project_path, seed chunks WITH the main agent's commentary, explicit
deliverable) and an anti-patterns list. The existing worked example is
preserved but rewritten without specific repo composition.

skills/README.md updated with the bundled-subagent description and
install command (additional copy into ~/.claude/agents/).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
dvcdsys and others added 3 commits May 13, 2026 17:36
New workspaces.md at the repo root — sub-document linked from
README.md. Covers everything an operator or agent needs to know about
the workspaces feature:
- What a workspace is, what's experimental about it today
- Enabling via CIX_WORKSPACES_ENABLED + supplementary env vars
- Concepts: owned vs linked repos, GitHub tokens, project path format
- Quick start: end-to-end walkthrough with curl examples
- Adding repositories (Dashboard staged dialog + REST + status
  transitions)
- GitHub tokens lifecycle, AES-256-GCM at-rest encryption, scopes
- Searching: Dashboard / `cix ws` CLI / REST endpoint with response
  shape
- Search algorithm — pipeline diagram, tunable parameters table,
  min_score semantics, hybrid BM25+dense rationale, stale-FTS handling
- Webhooks: disabled/manual/auto modes, HMAC signature, delivery
  endpoint
- Strengths and weaknesses (honest assessment)
- Configuration reference, REST API reference, troubleshooting
- Agent integration pointer to cix-workspace skill

README.md updated:
- "What you get" bullet for Workspaces (experimental) with link to
  workspaces.md
- Dashboard table gains two new rows: Workspaces and GitHub Tokens
  (both flagged experimental)
- New "Workspaces and external repositories" subsection after the
  Disabled-embeddings mode subsection, summarising the feature and
  linking to workspaces.md
- Agent Integration section adds the cix-workspace skill + bundled
  investigator subagent install snippet

Feature is marked experimental in every public-facing reference.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- New workflow `.github/workflows/prerelease-server.yml`: on push to
  `develop`, builds `server/Dockerfile.cuda` (amd64) and pushes the
  floating `dvcdsys/code-index:develop-cu128` tag. CPU image is
  intentionally skipped — pre-release stages on RTX 3090 only.
- Extend `ci-server.yml` / `ci-cli.yml` to also run on push and PR
  against `develop`, so vet/test/build gate pre-release merges the
  same way they gate main.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Commit 33da39b accidentally removed the placeholder that makes the
`//go:embed all:dist` directive in dashboard/embed.go resolve on a
fresh clone (no `make dashboard-build`). `go vet ./...` then fails
with `pattern all:dist: no matching files found`, breaking the CI
gate on every PR.

The root `.gitignore` already has a negation rule for this exact
path; restoring the file is enough.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@dvcdsys dvcdsys merged commit a2b60f2 into develop May 13, 2026
3 checks passed