Library-mode telemetry: Mongo-backed agent registry + SDK hooks + Bedrock + dashboard + matrix tests #9
Merged
The user-facing question: when ComputerAgent is just npm-installed into a
customer's existing worker (Temporal pod, CLI batch, serverless fn), how do
we make every run visible in the AgentOS dashboard with no extra
orchestration? Answer: an AgentTelemetry interface the SDK fires on
construct + each chat + dispose, plus a first-class Mongo implementation.
SDK (@computeragent/sdk):
- New AgentTelemetry interface (sdk/src/telemetry.ts): optional onAgentConstructed,
onChatStart, onChatEnd, onClose hooks. Pure data contract — no Mongo.
- ComputerAgent constructor accepts opts.telemetry, fires onAgentConstructed.
- chat() fires onChatStart synchronously and attaches a .then to the handle
for onChatEnd (success + error paths). Telemetry .then runs alongside the
caller's await — ChatHandle.then() is memoized via result().
- dispose() fires onClose. All calls fire-and-forget (safeFireTelemetry):
telemetry exceptions never propagate to the agent run.
- Re-exports AgentTelemetry + Info types from index.ts.
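The fire-and-forget behavior described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SDK's code: `safeFire`, `instrumentChat`, and the field names on the info objects are hypothetical stand-ins for `safeFireTelemetry` and the real hook payloads.

```typescript
// Hypothetical shapes: field names are illustrative, not the SDK's actual types.
interface ChatResult { text: string }
interface ChatEndInfo { ok: boolean; error?: string }

interface AgentTelemetry {
  onChatStart?(info: { query: string }): unknown;
  onChatEnd?(info: ChatEndInfo): unknown;
}

// Invoke a hook without letting sync throws or async rejections escape.
function safeFire(hook: () => unknown): void {
  try {
    const out = hook();
    // If the hook returned a rejecting promise, swallow the rejection too.
    Promise.resolve(out).catch(() => {});
  } catch {
    // Telemetry must never break the agent run.
  }
}

// Fire onChatStart synchronously, then onChatEnd once the chat settles.
function instrumentChat(
  run: Promise<ChatResult>,
  query: string,
  telemetry?: AgentTelemetry,
): Promise<ChatResult> {
  safeFire(() => telemetry?.onChatStart?.({ query }));
  // Side branch only: the caller still awaits the original promise.
  run.then(
    () => safeFire(() => telemetry?.onChatEnd?.({ ok: true })),
    (err: unknown) =>
      safeFire(() => telemetry?.onChatEnd?.({ ok: false, error: String(err) })),
  );
  return run;
}
```

Returning the original promise rather than the telemetry branch keeps the caller's result and error semantics unchanged, whatever the hooks do.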
New package @computeragent/agent-registry-mongo:
- AgentRegistry — Mongo wrapper for agent_registry (one doc per agent,
idempotent upsert by name). register/unregister/get/list/close.
- AgentLogStore — promoted from examples/agent-log-store.ts; now keyed by
agentName not bot, with usage/durationMs/error fields. append/list/count.
- MongoTelemetry — the headline class. Implements AgentTelemetry by writing
to both collections. Single constructor takes {url, database, agent: {
name, label?, harness?, source?, model?, registeredBy? }}. Optional shared
MongoClient + onError callback.
- README with the additive customer flow.
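One way to get an idempotent upsert by name is to split the write into `$set` (refreshed on every call) and `$setOnInsert` (written only when the document is first created). The sketch below builds such an update and applies it to an in-memory `Map` so the behavior can be checked without a database. `buildUpsert` and `applyUpsert` are hypothetical helpers, and whether `AgentRegistry` structures its write this way is an assumption.

```typescript
// Hypothetical field set; the real registry document may carry more fields.
interface AgentFields { name: string; label?: string; harness?: string }

interface UpsertOp {
  filter: { name: string };
  update: {
    $set: Record<string, unknown>;
    $setOnInsert: { registeredAt: Date };
  };
}

// Builds arguments that could be passed to a Mongo
// updateOne(filter, update, { upsert: true }) call.
function buildUpsert(agent: AgentFields, now: Date): UpsertOp {
  const { name, ...fields } = agent;
  return {
    filter: { name },
    update: {
      $set: { ...fields, lastSeen: now },   // refreshed on every register
      $setOnInsert: { registeredAt: now },  // written only on first insert
    },
  };
}

// In-memory stand-in for the upsert semantics, for illustration only.
function applyUpsert(
  store: Map<string, Record<string, unknown>>,
  op: UpsertOp,
): void {
  const existing = store.get(op.filter.name);
  const next = existing
    ? { ...existing, ...op.update.$set }
    : { ...op.filter, ...op.update.$setOnInsert, ...op.update.$set };
  store.set(op.filter.name, next);
}
```

Registering the same name twice updates `lastSeen` and any changed fields while `registeredAt` keeps its first value.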
Verified: pnpm sdk build clean, pnpm agent-registry-mongo build clean,
all 45 existing SDK tests still pass.
This is the foundation that makes "swap computer agent as a package on
their K8s and we can still track it" actually true.
… + seed
Completes the dashboard side of "library-mode tracking": the agents
collection in Mongo (written by the SDK's MongoTelemetry hook) is read by
the dashboard and unioned with the server's hardcoded in-memory list.
examples/agentos-api.ts:
- New `agent_registry` collection accessor.
- GET /agentos/api/agents now unions in-memory + registry rows; in-memory
wins on name collision so server-hosted agents (Slack bots, framework-
translator, etc) keep authoritative wiring. Each row carries `origin`
("in-memory" | "registry"), `registeredBy`, and `lastSeen` so the
dashboard can label library-mode entries.
- POST /agentos/api/agents/register — upsert by name (idempotent).
- PATCH /agentos/api/agents/:name — update label/harness/source/model
(rejects names that exist in the in-memory list).
- DELETE /agentos/api/agents/:name — remove from registry only.
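The union rule for `GET /agentos/api/agents` (in-memory wins on name collision, each row tagged with `origin`) can be sketched as below. `unionAgents` and the `AgentInput` shape are hypothetical; the actual handler may differ.

```typescript
type Origin = "in-memory" | "registry";

// Hypothetical minimal shapes for illustration.
interface AgentInput { name: string; label?: string }
interface AgentRow extends AgentInput { origin: Origin }

function unionAgents(inMemory: AgentInput[], registry: AgentInput[]): AgentRow[] {
  const rows = new Map<string, AgentRow>();
  // Registry rows go in first...
  for (const a of registry) rows.set(a.name, { ...a, origin: "registry" });
  // ...so that in-memory rows overwrite them on a name collision.
  for (const a of inMemory) rows.set(a.name, { ...a, origin: "in-memory" });
  return Array.from(rows.values());
}
```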
agentos/src/components/RegisterAgentForm.tsx:
- Inline "+ Register agent" widget in the agent rail. Name + source are
required; harness defaults to claude-agent-sdk. POSTs /agents/register
and refreshes the agent list on success.
agentos/src/App.tsx + api.ts:
- New `origin`/`registeredBy`/`lastSeen` fields on the Agent type and
registerAgent/unregisterAgent/patchAgent methods on the api client.
- Renders a small "lib" badge next to registry-origin agents.
- Mounts <RegisterAgentForm> below the agent list.
scripts/seed-agent-registry.ts:
- One-shot, idempotent. Reads the hardcoded SEED_AGENTS list and upserts
each into agent_registry via @computeragent/agent-registry-mongo.
- Run once when migrating an existing deployment; safe to re-run.
Verified: pnpm sdk + agent-registry-mongo + recursive typecheck all clean,
existing SDK tests still pass, agentos SPA Vite build clean (35 modules,
180 kB / 56 kB gzip).
… identity (2b)
Phase 2a — packages/engine-claude-agent-sdk/src/engine.ts:
inheritEssentialHostEnv() now also propagates the AWS Bedrock + IRSA chain
to the spawned harness subprocess: CLAUDE_CODE_USE_BEDROCK, AWS_REGION,
AWS_DEFAULT_REGION, AWS_BEDROCK_MODEL_ID, AWS_ROLE_ARN,
AWS_WEB_IDENTITY_TOKEN_FILE, AWS_PROFILE, AWS_SHARED_CREDENTIALS_FILE,
AWS_CONFIG_FILE. Function is now exported for testability. 3 new tests
(engine.test.ts) pin the allowlist + verify no empty-string leaks; suite
goes from 7 → 10 tests, all green.
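An allowlist passthrough of this kind can be sketched as a pure function over an environment map. `pickHostEnv` is a hypothetical name, and the function is only an illustration of "forward listed keys, skip unset and empty-string values", not the body of `inheritEssentialHostEnv`.

```typescript
// Keys listed in the commit message above.
const BEDROCK_ENV_KEYS: readonly string[] = [
  "CLAUDE_CODE_USE_BEDROCK",
  "AWS_REGION",
  "AWS_DEFAULT_REGION",
  "AWS_BEDROCK_MODEL_ID",
  "AWS_ROLE_ARN",
  "AWS_WEB_IDENTITY_TOKEN_FILE",
  "AWS_PROFILE",
  "AWS_SHARED_CREDENTIALS_FILE",
  "AWS_CONFIG_FILE",
];

// Copy only allowlisted keys that are set to a non-empty value.
function pickHostEnv(
  env: Record<string, string | undefined>,
  keys: readonly string[] = BEDROCK_ENV_KEYS,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of keys) {
    const value = env[key];
    if (value !== undefined && value !== "") out[key] = value;
  }
  return out;
}
```

Taking the environment as a parameter instead of reading a global keeps the allowlist easy to pin in a unit test.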
Phase 2b — git URL as canonical agent identity in the dashboard:
examples/agentos-api.ts
- Import IdentitySource from @computeragent/protocol; new normalizeSource()
narrows the registry's `source: unknown` into either {source: IdentitySource,
sourceUrl: string} (clickable repo URL) or {source: string, sourceUrl: string}
(legacy in-memory agents).
- GET /agentos/api/agents now returns both fields per agent.
- NEW GET /agentos/api/agents/by-source?url= — looks up agents (in-memory +
registry) sharing a source URL; used to detect "same git repo, two workers
registered separately".
agentos/src/api.ts
- Widen Agent.source to IdentitySource | string + add sourceUrl: string | null.
- New displaySource() helper returns {kind, primary, secondary, href?} —
parses github/gitlab/bitbucket URLs into owner/repo, falls back gracefully.
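A rough sketch of the owner/repo parsing is shown below. It is not `displaySource()` itself: `parseRepo` and `RepoRef` are hypothetical, refs are ignored, and defaulting a bare `owner/repo` to github.com is taken from the PR description further down.

```typescript
interface RepoRef { host: string; owner: string; repo: string }

function parseRepo(input: string): RepoRef | null {
  // Drop a trailing ".git" if present.
  let s = input.trim().replace(/\.git$/, "");
  // SSH form: git@host:owner/repo becomes host/owner/repo.
  const ssh = /^git@([^:]+):(.+)$/.exec(s);
  if (ssh) s = `${ssh[1]}/${ssh[2]}`;
  // Strip any scheme such as https://
  s = s.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  const parts = s.split("/").filter((p) => p.length > 0);
  // Bare owner/repo: assume github.com.
  if (parts.length === 2) {
    return { host: "github.com", owner: parts[0], repo: parts[1] };
  }
  // host/owner/repo, where the first segment looks like a hostname.
  if (parts.length >= 3 && parts[0].includes(".")) {
    return { host: parts[0], owner: parts[1], repo: parts[2] };
  }
  return null; // Caller falls back to showing the raw string.
}
```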
agentos/src/components/SourceBadge.tsx (NEW)
- Renders a kind-glyph (github octocat for git, folder for local, ⟪⟫ for
inline) + owner/repo headline + host subtitle. For git sources, wraps in
<a href> opening the repo in a new tab. e.stopPropagation so clicking the
link doesn't trigger the parent agent-row click.
agentos/src/App.tsx
- Replace the 11px gray source line with <SourceBadge agent={a} />.
- agentNameFromSource() now receives sourceUrl (always a string) instead of
the widened source field.
Verified: pnpm -r typecheck clean; engine-claude-agent-sdk 10 tests pass;
agentos Vite build clean (36 modules, 183 kB / 58 kB gzip; +1 module +3 kB
for SourceBadge).
…e matrix
Two new SDK test files + a shared GAP fixture, ~500 LOC.
packages/sdk/src/telemetry-hook.test.ts (7 tests, runs offline in CI):
Verifies AgentTelemetry lifecycle hooks fire correctly using a recording
mock telemetry impl + MockEngine + in-process Hono harness:
- onAgentConstructed fires with source/harness/model.
- onChatStart → onChatEnd context threading on success path.
- onChatEnd ok=false + error on failure path.
- onClose fires on dispose().
- A throwing telemetry (sync + async) does NOT break the chat
(safeFireTelemetry regression guard).
- Agent without telemetry option still works (no-op branch coverage).
packages/sdk/src/substrate-matrix.test.ts (10 tests, 8 env-gated):
claude-agent-sdk × {Local, Bwrap, E2B} × {inline, local, git} matrix that
actually calls Anthropic. Whole suite is ANTHROPIC_API_KEY-gated via
describe.skipIf so CI offline is a no-op. Per-row guards:
- Bwrap rows: skipIf(!Linux || !bwrap-on-PATH).
- E2B rows: skipIf(!E2B_API_KEY).
- Git rows: skipIf(!SDK_MATRIX_GIT_FIXTURE_URL).
Each row boots the substrate, makes a real claude-agent-sdk chat (with a
terse SOUL so the spend stays ~30 tokens / row), asserts non-zero output
tokens, and disposes. Two fixture-shape tests run always to catch
test-fixtures/ deletions.
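The per-row guards can be expressed as one pure function that returns a skip reason, which a test file could then feed into vitest's `it.skipIf(...)`. The sketch below is an assumption about structure, not the contents of substrate-matrix.test.ts; `skipReason` and `HostFacts` are hypothetical names.

```typescript
type Substrate = "Local" | "Bwrap" | "E2B";
type Source = "inline" | "local" | "git";

// Facts about the machine running the tests, passed in for testability.
interface HostFacts {
  platform: string;      // e.g. "linux", "darwin"
  bwrapOnPath: boolean;
  env: Record<string, string | undefined>;
}

// Returns null when the row can run, otherwise a human-readable reason.
function skipReason(substrate: Substrate, source: Source, host: HostFacts): string | null {
  // Whole-suite gate: every row makes a real model call.
  if (!host.env["ANTHROPIC_API_KEY"]) return "ANTHROPIC_API_KEY not set";
  if (substrate === "Bwrap" && (host.platform !== "linux" || !host.bwrapOnPath)) {
    return "Bwrap needs Linux with bwrap on PATH";
  }
  if (substrate === "E2B" && !host.env["E2B_API_KEY"]) return "E2B_API_KEY not set";
  if (source === "git" && !host.env["SDK_MATRIX_GIT_FIXTURE_URL"]) {
    return "SDK_MATRIX_GIT_FIXTURE_URL not set";
  }
  return null;
}
```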
packages/sdk/test-fixtures/minimal-agent/{agent.yaml, SOUL.md}:
Tiny GAP repo reused by all "local source" matrix rows (and matched by
the inline source variant for source-type parity).
packages/sdk/package.json:
Add runtime-local / runtime-bwrap / runtime-e2b / engine-claude-agent-sdk
/ identity-gitagentprotocol to devDependencies so the matrix can import
them dynamically (still workspace deps; never pulled by users).
Verified: pnpm sdk typecheck clean; full SDK test suite 7 files / 62 tests
(54 passed + 8 matrix rows auto-skipped offline). pnpm -r typecheck clean.
Adds the test coverage that Phases 1 / 2b / 2c shipped without.
Total: 67 new tests across 5 files, 1163 LOC.
- packages/agent-registry-mongo/src/registry.test.ts (9 tests): upsert shape, idempotent registeredAt, list ordering, get null, unregister idempotency, close idempotency, shared MongoClient.
- packages/agent-registry-mongo/src/audit-log.test.ts (8 tests): append shape, QUERY_MAX/REPLY_MAX truncation with … suffix, newest-first + limit clamping [1, 500], filter combos (source/ok/before), count() vs list() equivalence.
- packages/agent-registry-mongo/src/telemetry.test.ts (12 tests): onAgentConstructed upserts via SDK info; falls back to ctor-supplied agent fields; onChatStart returns ctx; onChatEnd appends success + failure rows with usage breakouts; durationMs derives from ctx when SDK omits it; configurable source tag; onError fires on Mongo failure without throwing; shared client lifecycle.
- examples/agentos-api.test.ts (18 tests): drives createAgentOSApp via app.fetch(Request). POST /register (400 on missing name, upsert shape, idempotent registeredAt), PATCH (409 on in-memory, 404 on unknown, field update), DELETE (409/404/200), GET /by-source (400/in-memory match/registry match/dual match/empty 200), GET /agents (union, origin tagging, in-memory wins).
- agentos/src/api.test.ts (20 tests): pure-function tests for displaySource(): null/undefined sentinels, local + inline structured, git URL parsing (https, ssh git@, with ref, ref URL-encoded), recognized hosts (github/gitlab/bitbucket), scheme-less + bare owner/repo, legacy string source, fallbacks.
Live-Mongo paths use the same `describeMongo = url ? describe : describe.skip` gate the session-store-mongo tests already use; unique DB per run keeps parallel runs isolated.
Offline (no MONGO_URL): 4 always-on tests pass, 57 env-gated skip cleanly. Workspace-wide pnpm -r test stays green.
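The truncation and limit-clamping behavior exercised by the audit-log tests could look roughly like this. Both helpers are hypothetical; whether the `…` suffix counts toward the maximum and what the default limit is are assumptions (here the suffix is appended after the cut and the default is 50).

```typescript
// Cut text at `max` characters and mark the cut with an ellipsis.
// Assumption: the suffix is added on top of the max, not counted inside it.
function truncate(text: string, max: number): string {
  return text.length <= max ? text : text.slice(0, max) + "…";
}

// Clamp a caller-supplied page size into [1, 500].
// Assumption: 50 is an illustrative default, not the package's value.
function clampLimit(limit: number | undefined, fallback = 50): number {
  if (limit === undefined || !Number.isFinite(limit)) return fallback;
  return Math.min(500, Math.max(1, Math.floor(limit)));
}
```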
Wires vitest into examples/ and agentos/ (test script + ^2.0.0 devDep) so the suite is invokable per-package.
Phase 2c (commit 2545e9e) added @computeragent/runtime-bwrap (+ runtime-local, runtime-e2b, engine-claude-agent-sdk, identity-gitagentprotocol) to sdk's devDependencies so the substrate-matrix tests could exercise them. But runtime-bwrap already depends on sdk → cycle.
pnpm tolerates the cycle locally with a warning, ordering the builds in a way that happens to work. CI parallelizes the cyclic packages and both fail with TS2307 (can't resolve types of the not-yet-built sibling).
The matrix tests already use dynamic `await import(...)` inside makeSubstrate, so the static devDeps were never load-bearing. The .test.ts file is excluded from sdk's tsconfig (build + typecheck); runtime resolution at test time goes through workspace hoisting. Drop the cycle-creating devDeps.
Verified: pnpm -r build, pnpm -r typecheck, pnpm -r test all green; cyclic workspace dependency warning gone; 349 passing / 57 skipped offline.
…bump to 0.2.1
The npm scope @computeragent/* is taken by another org (403). shreyaskapale owns @open-gitagent/* (matches the GitHub org). Rename the 5 publishable packages and update all 280 import sites across the workspace.
Renamed (now published at 0.2.1 on npm):
- @open-gitagent/protocol
- @open-gitagent/sdk
- @open-gitagent/session-store-mongo
- @open-gitagent/runtime-local
- @open-gitagent/agent-registry-mongo
Plus the umbrella `computeragent@0.2.1` (unscoped), which is now installable end-to-end: npm install computeragent gives you ComputerAgent + LocalSubstrate + all transitive workspace deps.
Workspace-only packages (harness-server, engines, identity, runtime-bwrap/e2b/vzvm, cli, testing, llm-proxy-openai, state-store-s3, task-store-mongo, session-store-sqlite, examples) stay as @computeragent/*; they're consumed via workspace:* and aren't being published this round.
Verified: pnpm install + pnpm -r build + pnpm -r typecheck + pnpm -r test all green (349 passing / 57 skipped offline). Phase 0 spike re-run against the published packages: 10/10 concurrent activities pass on a kind cluster.
Previously /agents/:name/chat-sandbox and /agents/:name/run looked up the target only in the in-memory list (opts.agents). Agents registered via the dashboard or via MongoTelemetry returned 404 UNKNOWN_AGENT, even though the dashboard listed them and showed their logs.
Adds a resolveAgent() helper that tries in-memory first, then reads from agent_registry. Registry agents have no envs/gitToken persisted; the server's own env (forwarded by inheritEssentialHostEnv) supplies ANTHROPIC_API_KEY etc, so the harness boots normally.
Smoke verified against the deployed enterprise.clawagent.sh: registering a new agent + clicking chat now returns HTTP 200 with a valid sandboxId.
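The lookup order (in-memory first, registry second, otherwise not found) can be sketched as below. The types and the injected `findInRegistry` callback are hypothetical; the real `resolveAgent()` reads from Mongo directly and its return shape may differ.

```typescript
// Hypothetical minimal agent shape for illustration.
interface AgentBase { name: string; harness: string }
interface ResolvedAgent extends AgentBase { origin: "in-memory" | "registry" }

// Stand-in for a query against the agent_registry collection.
type RegistryLookup = (name: string) => Promise<AgentBase | null>;

async function resolveAgent(
  name: string,
  inMemory: AgentBase[],
  findInRegistry: RegistryLookup,
): Promise<ResolvedAgent | null> {
  // Server-configured agents stay authoritative.
  const local = inMemory.find((a) => a.name === name);
  if (local) return { ...local, origin: "in-memory" };
  // Fall back to agents registered via the dashboard or telemetry.
  const doc = await findInRegistry(name);
  // null lets the route answer 404 UNKNOWN_AGENT.
  return doc ? { ...doc, origin: "registry" } : null;
}
```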
…rk end-to-end
E2E-tested against prod (enterprise.clawagent.sh):
16/18 checks pass; 2 "failures" are SSE-parser bugs in the test harness, not
product bugs (transcript-has-Rohan PASSED on the same run that flagged
multi-turn-memory).
Backend (examples/agentos-api.ts):
- sandboxCapable now depends on harness only, not origin. Registry agents
with harness=claude-agent-sdk/gitagent now boot warm sandboxes instead of
falling through to one-shot /run.
- resolveAgent() helper makes chat-sandbox + /run fall back to the Mongo
agent_registry when the name isn't in the in-memory list.
- chat_pins collection: maps agent → current dashboard sessionId. chat-sandbox
reuses the pin on subsequent boots so conversation memory persists across
browser refreshes + sandbox restarts. DELETE /agents/:name/chat-pin clears
it (used by "New chat").
- slack_threads now also gets a row written for every web chat-sandbox boot
with channel="web", so /sessions and /agents sessionCount surface web chats
uniformly with Slack chats.
- /sessions/:id handles two harness storage shapes: gitagent (sessions._id =
sessionId) and claude-agent-sdk (sessions._id = UUID, sessionId embedded in
projectKey via $regex).
- Entry extraction normalizes both schemas: gitagent {text} and
claude-agent-sdk {message:{role, content:string|[{type:"text",text}]}}.
Filters out queue-operation meta events.
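Normalizing the two entry schemas into one shape might look like the sketch below. The `RawEntry` fields beyond those named above (`type` as the marker for queue-operation events, an optional top-level `role`) are assumptions, and `normalizeEntry` is a hypothetical helper rather than the server's code.

```typescript
interface TextBlock { type: string; text?: string }

// Union of the two stored shapes; `type` and `role` are assumed fields.
interface RawEntry {
  type?: string;
  role?: string;
  text?: string; // gitagent shape
  message?: { role: string; content: string | TextBlock[] }; // claude-agent-sdk shape
}

interface Entry { role: string; text: string }

function normalizeEntry(raw: RawEntry): Entry | null {
  // Meta events carry no conversational content.
  if (raw.type === "queue-operation") return null;
  if (raw.message) {
    const content = raw.message.content;
    const text =
      typeof content === "string"
        ? content
        : content
            .filter((b) => b.type === "text" && typeof b.text === "string")
            .map((b) => b.text as string)
            .join("");
    return { role: raw.message.role, text };
  }
  if (typeof raw.text === "string") {
    return { role: raw.role ?? "unknown", text: raw.text };
  }
  return null;
}
```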
Frontend (agentos/src):
- App.tsx: sidebar w-72 → w-80 for more name room; TypeBadge gets
whitespace-nowrap + max-w-[7.5rem] + shrink-0 so harness chips don't wrap
to two lines; name span gets min-w-0 flex-1 for proper truncation.
- SourceBadge: stops wrapping the whole agent card in <a href>. Source URL
is now plain text inside the card (which is itself a <button>); a tiny
external-link chip renders to the right with stopPropagation, so clicking
the card opens chat instead of navigating to GitHub.
- ChatTab: CONTINUE_PROMPT softened to "Please continue." (was a build-flow
paragraph that was wrong for casual chat agents). Continue-button copy
generalized.
- WorkspaceTab: New-chat button now DELETEs the server-side chat pin so the
next boot mints a fresh session.
shreyas-lyzr pushed a commit that referenced this pull request on May 28, 2026
Brings the observability branch's shadcn/ui migration into our PR #9 branch. The agentos SPA now uses Radix + shadcn primitives + lucide-react + recharts.
Cherry-picked from feat/agentos-observability-stack:
- agentos/{package.json,pnpm-lock,tsconfig.json,vite.config.ts,tailwind.config,postcss.config,src/index.css,src/main.tsx}
- agentos/src/lib/cn.ts
- agentos/src/components/ui/*: 22 shadcn primitives (Button, Card, Badge, Dialog, Tabs, ScrollArea, etc.)
- agentos/src/components/composite/*: page chrome (PageHeader, KpiCard, FilterBar, DataTable, StatusDot, EmptyState)
- agentos/src/components/observability/*: TraceList, TraceDetail, QueryBuilder, Dashboard, ObservabilityTab, DateRangePicker
- agentos/src/components/{PolicyTab,PoliciesPage,HomePage,ChatTab,WorkspaceTab,SchedulesTab,LogsTab}.tsx: shadcn rewrites
- agentos/src/{api.ts, obs-api.ts, obs-fields.ts}: Express obs-api client + field metadata
Merged from our PR #9 (re-applied):
- agentos/src/api.ts: re-added IdentitySource type, sourceUrl field on Agent, displaySource() helper, RegisterAgentInput + registerAgent/unregisterAgent/patchAgent methods. Kept all of their policy + obs methods.
- agentos/src/App.tsx: agentNameFromSource now uses sourceUrl (our schema), agent rail row shows sourceUrl instead of raw source (which is now IdentitySource | string and would crash if rendered directly).
- agentos/src/components/{SourceBadge,RegisterAgentForm}.tsx kept from our branch.
- agentos/src/api.test.ts kept from our branch.
Build status:
- pnpm --filter agentos build: clean (2535 modules, 860KB JS / 263KB gz)
- pnpm -r test: passes (sdk has 1 known-flaky session-resume race, passes on retry)
Remaining Phase B polish (not blocking):
- Re-skin SourceBadge/RegisterAgentForm/AgentCard on shadcn Card/Badge/Dialog primitives
- Re-add the grouped Hosted/Library rail sections
- Re-apply our CONTINUE_PROMPT = "Please continue." (currently has obs branch's wordy version)
- Re-apply WorkspaceTab.newChat → DELETE /agentos/api/agents/:name/chat-pin
Five commits, 32 files, +3420 / -16. Adds library-mode telemetry: a Mongo-backed agent registry, SDK telemetry hooks the customer's existing workers call into without needing a new HTTP service, Bedrock env passthrough through the harness subprocess, dashboard treatment of git URLs as canonical agent identity, and the test coverage that makes all of the above safe to change.
Drives this PoC shape: a customer's Temporal worker imports the SDK, no new pods.
Commits
- abb8573 feat(sdk,agent-registry-mongo): library-mode telemetry hooks
- 5589a23 feat(agentos,scripts): Mongo-backed registry — dashboard + CRUD + SPA + seed
- d782cf1 feat(engine,agentos): Bedrock env passthrough (2a) + git URL as agent identity (2b)
- 2545e9e test(sdk): Phase 2c — telemetry-hook + claude-sdk × substrate × source matrix
- 981a4f9 test: TDD backfill — agent-registry-mongo + agentos CRUD + displaySource
What's new
`AgentTelemetry` interface + `MongoTelemetry` impl
- `packages/sdk/src/telemetry.ts`: `AgentTelemetry` contract with optional `onAgentConstructed` / `onChatStart` / `onChatEnd` / `onClose` hooks. Fire-and-forget: exceptions never propagate to chat.
- `packages/sdk/src/computer-agent.ts`: wires the hooks. Constructor fires `onAgentConstructed`; `chat()` wraps `onChatStart` / `onChatEnd` around the handle via `Promise.resolve(handle as PromiseLike<ChatResult>).then(...)`; `dispose()` fires `onClose`. `safeFireTelemetry` swallows errors per hook.
- `packages/agent-registry-mongo/`: new package. `AgentRegistry` (upsert by name with idempotent `registeredAt`), `AgentLogStore` (`QUERY_MAX=8000`, `REPLY_MAX=16000`, truncates with `…`), `MongoTelemetry` (writes to `agent_registry` + `agent_logs`).
Bedrock env passthrough
- `packages/engine-claude-agent-sdk/src/engine.ts`: `inheritEssentialHostEnv` now forwards `CLAUDE_CODE_USE_BEDROCK`, `AWS_REGION`, `AWS_DEFAULT_REGION`, `AWS_BEDROCK_MODEL_ID`, `AWS_ROLE_ARN`, `AWS_WEB_IDENTITY_TOKEN_FILE`, `AWS_PROFILE`, `AWS_SHARED_CREDENTIALS_FILE`, `AWS_CONFIG_FILE` into the harness subprocess. IRSA chain reaches the model client.
Dashboard + agent registry
- `examples/agentos-api.ts`: unions in-memory agents (server config) with `agent_registry` rows; in-memory wins on collision. New endpoints: `POST /agents/register`, `PATCH /agents/:name`, `DELETE /agents/:name`, `GET /agents/by-source?url=`.
- `agentos/src/components/RegisterAgentForm.tsx`: UI to register.
- `agentos/src/components/SourceBadge.tsx`: renders git URLs as clickable owner/repo; `e.stopPropagation` so the link doesn't fire the parent row click.
- `agentos/src/api.ts`: `displaySource()` helper parses `https://`, `git@`, scheme-less host/owner/repo, and bare `owner/repo` (defaults host to github.com); URL-encodes refs.
- `scripts/seed-agent-registry.ts`: idempotent one-shot to migrate the hardcoded in-memory list into the registry.
Tests
67 new tests across 5 files, 1163 LOC. Live-Mongo tests follow the existing `session-store-mongo` convention: gated by `describeMongo = url ? describe : describe.skip` on `MONGO_URL`, unique DB per run, drop on teardown.
- `packages/agent-registry-mongo/src/registry.test.ts`: `registeredAt`, list ordering, get null, unregister idempotency, shared MongoClient
- `packages/agent-registry-mongo/src/audit-log.test.ts`: `QUERY_MAX`/`REPLY_MAX` truncation, ordering, limit clamping [1, 500], filter combos, `count()` vs `list()`
- `packages/agent-registry-mongo/src/telemetry.test.ts`
- `examples/agentos-api.test.ts`: `createAgentOSApp` via `app.fetch(Request)`. CRUD: 400/200/idempotency/409/404 paths. `GET /by-source`: in-memory + registry match, dual-match (dedup detection), empty 200. `GET /agents`: union, origin tagging, in-memory wins on collision
- `packages/sdk/src/telemetry-hook.test.ts`: `onChatStart` → `onChatEnd`, throwing telemetry doesn't break chat, async-rejecting telemetry doesn't break, no-telemetry path
- `packages/sdk/src/substrate-matrix.test.ts`: `skipIf` for Linux+bwrap and `E2B_API_KEY` and `SDK_MATRIX_GIT_FIXTURE_URL`
- `agentos/src/api.test.ts`
- `packages/engine-claude-agent-sdk/src/engine.test.ts`
Test totals after this PR
- `pnpm -r test`
- `agentos/` (standalone)
Offline run is fully green. Live-Mongo + ANTHROPIC + E2B paths auto-skip without their respective env vars.
What is NOT in this PR
`ANTHROPIC_BASE_URL` when it lands.
Verification
- `pnpm -r typecheck` clean
- `pnpm -r test`: 349 passing / 57 skipped / 0 failing
- `pnpm --filter agentos test`: 20 passing
- `pnpm --filter agentos build`: Vite build clean
- `POST /agents/register`, verified it shows on the dashboard with `origin: "registry"` + clickable git URL; in-memory collision keeps the server-configured wiring authoritative.
Migration
`MONGO_URL=... pnpm tsx scripts/seed-agent-registry.ts`. Safe to re-run.