Skip to content

feat: CLI telemetry for all commands + crash reporting#122

Open
nicknisi wants to merge 17 commits into
mainfrom
nicknisi/telemetry
Open

feat: CLI telemetry for all commands + crash reporting#122
nicknisi wants to merge 17 commits into
mainfrom
nicknisi/telemetry

Conversation

@nicknisi
Copy link
Copy Markdown
Member

@nicknisi nicknisi commented Apr 14, 2026

Summary

Right now we only have telemetry on the installer flow. This PR adds telemetry coverage to every CLI command, plus crash reporting and a debug flag.

What changed:

  • Event type enrichment -- session.start/end now carry environment fingerprint (OS, Node version, CI detection, shell). Step and tool events get explicit startTimestamp for accurate span reconstruction on the backend. Two new event types: command and crash.

  • Command telemetry for all commands -- every yargs command (both registerSubcommand and inline handlers) now emits a command event with canonical name, duration, success/failure, and which flags were used. Alias resolution normalizes org -> organization, claim -> env.claim so metrics don't fragment.

  • Crash reporting -- global uncaughtException/unhandledRejection handlers capture unhandled crashes with sanitized stack traces (home dir stripped, truncated to 4KB). Handlers are synchronous to avoid Node's async-handler footgun.

  • Store-and-forward -- events are persisted to a PID-based temp file on process exit via writeFileSync. Next CLI invocation recovers and sends them. Handles the ~25 locations that call process.exit() directly by queuing a provisional event in the middleware before the handler runs.

  • Flush improvements -- flush() now returns a boolean (sent vs retryable), uses splice instead of clearing the whole queue (protects events queued during an in-flight fetch), drops events on 4xx (permanent failures like 401 won't accumulate), and retains on 5xx/network errors for store-forward. Commands flush in-process via the handler wrapper rather than always deferring to next invocation.

  • WORKOS_DEBUG=1 -- enables verbose debug logging for all commands (not just the installer's --debug flag). Shows telemetry event details on flush.

Design decisions worth noting:

  • Non-installer telemetry only works for JWT-authenticated users. API-key-only users don't have a token the LLM gateway guard accepts, so their events are silently dropped. Acceptable for v1.
  • The install, dashboard, and default $0 commands are excluded from command-level telemetry since they have their own session-based telemetry.
  • The crash reporter is sync-only -- it queues the event and relies on store-forward's exit handler to persist it. No async flush attempts in the crash path.

Summary by CodeRabbit

  • Documentation

    • Added telemetry guidance for command handlers, aliasing, and installer exclusions.
  • New Features

    • Verbose CLI debug mode (WORKOS_DEBUG).
    • Global crash reporting with sanitized error reporting.
    • Stable per-device identifier and store-and-forward telemetry persistence/recovery.
    • Command telemetry middleware and lifecycle helpers for accurate command metrics.
  • Improvements

    • Expanded telemetry events (command, crash), auth-mode, and env/device fingerprinting.
    • Centralized command aliases to consolidate metrics.
    • Standardized exit/termination reporting across commands.
  • Privacy

    • Stronger redaction and truncation of secrets and sensitive paths in telemetry.

Review Change Stack

nicknisi added a commit that referenced this pull request Apr 15, 2026
Closes security-audit finding #1 on PR #122 (telemetry message
sanitization). `error.message` was flowing into 4 capture sites
unsanitized, leaking homedir paths (and rarely, credentials) to the
WorkOS gateway.

- Add `sanitizeMessage()` in crash-reporter.ts: homedir strip + Bearer/
  sk_*/JWT redaction + 1KB truncation.
- Factor secret redaction into shared `redactSecrets()` used by both
  `sanitizeMessage` and `sanitizeStack` (Node echoes `.message` into the
  leading `Error.stack` line, so message-only sanitization was
  insufficient).
- Add private `extractErrorFields()` chokepoint on `Analytics`; route
  all 4 capture sites through it (`captureException`, `stepCompleted`,
  `commandExecuted`, `captureUnhandledCrash`). `replaceLastCommandEvent`
  inherits sanitization via its delegation to `commandExecuted`.
- `captureUnhandledCrash` now uses `sanitizeStack` instead of inline
  truncation, providing defense-in-depth for callers that bypass the
  crash-reporter wrapper.
- Add regression guard test (`telemetry-sanitize.spec.ts`): poisons
  every capture method with homedir + Bearer + sk_live_ + JWT, asserts
  no marker reaches the serialized queue.

Reviewed: ideation:reviewer cycle 1 PASS (0 critical, 0 high).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7afeacb0-f6f8-4a94-acc3-f8ee0634345c

📥 Commits

Reviewing files that changed from the base of the PR and between 8003788 and c7804a2.

📒 Files selected for processing (6)
  • src/bin.ts
  • src/utils/analytics.spec.ts
  • src/utils/analytics.ts
  • src/utils/command-telemetry.spec.ts
  • src/utils/command-telemetry.ts
  • src/utils/telemetry-sanitize.spec.ts
🚧 Files skipped from review as they are similar to previous changes (5)
  • src/bin.ts
  • src/utils/telemetry-sanitize.spec.ts
  • src/utils/command-telemetry.spec.ts
  • src/utils/analytics.spec.ts
  • src/utils/command-telemetry.ts

📝 Walkthrough

Walkthrough

Adds end-to-end CLI telemetry: device-id persistence, telemetry client/store-forward, analytics (command/crash events, auth-mode), command telemetry middleware and wrapper, crash sanitization/reporting, exit-code→termination mapping, CLI bootstrap wiring, and associated tests.

Changes

Telemetry Infrastructure & Command Tracking

Layer / File(s) Summary
Docs and bin bootstrap
CLAUDE.md, src/bin.ts
Adds "Telemetry Wiring for New Commands" docs; initializes debug/crash-reporter, store-forward, analytics early; inserts commandTelemetryMiddleware(rawArgs) into yargs.
Top-level command handler wrapping
src/bin.ts
Wraps multiple top-level inline .command() handlers with wrapCommandHandler() and documents wrap requirement for inline handlers.
Command telemetry utilities & tests
src/utils/command-telemetry.ts, src/utils/command-telemetry.spec.ts
Adds canonical name resolution, user-flag extraction, provisional command enqueueing, wrapCommandHandler() for replacement and termination, and coverage tests including bin.ts enforcement.
Command aliases and help-json integration
src/lib/command-aliases.ts, src/utils/help-json.ts
Adds COMMAND_ALIASES and updates --help --json command resolution to use shared aliases.
Analytics core, types, and schema
src/utils/telemetry-types.ts, src/utils/telemetry-schema.spec.ts, src/utils/analytics.ts, src/utils/analytics.spec.ts
Extends telemetry types with command/crash, adds AuthMode/TerminationReason, environment/device fingerprinting, auth-mode control, command lifecycle APIs, crash capture, and tests for event shapes and behaviors.
TelemetryClient queue, auth, flush, and store-forward
src/utils/telemetry-client.ts, src/utils/telemetry-client.spec.ts, src/utils/telemetry-store-forward.ts, src/utils/telemetry-store-forward.spec.ts
Adds claim-token auth, queue mutation APIs (replaceLastEventOfType, patchLastEventOfType, queueEvents), snapshot-based flush() returning boolean, persistToFile, installStoreForward() and recoverPendingEvents() with tests.
Sanitization and crash reporting
src/utils/crash-reporter.ts, src/utils/crash-reporter.spec.ts, src/utils/telemetry-sanitize.spec.ts
Implements sanitizeStack/sanitizeMessage (home-dir replacement, secret redaction, truncation), synchronous crash handlers, guarded reporting via analytics.captureUnhandledCrash, and sanitization tests.
Device ID persistence and tests
src/lib/device-id.ts, src/lib/device-id.spec.ts
Adds getDeviceId() with on-disk persistence at ~/.workos/device-id, in-memory caching, safe IO fallback, and tests for generation, caching, reload, invalid-file handling, and readonly filesystem fallback.
Exit codes, output, and API error handling
src/utils/exit-codes.ts, src/utils/output.ts, src/lib/api-error-handler.ts, plus tests
Adds resolveErrorCode() mapping to {reason,exit}, records termination telemetry in exit helpers, enriches API error handling with structured apiContext, updates exitWithError to use resolved exit codes and telemetry, and updates tests accordingly.
Command modules: exit/telemetry updates & tests
src/commands/*, src/commands/*/*.spec.ts
Replaces direct process.exit calls with exitWithCode / exitWithError, updates command handlers to be wrapped where needed, and updates tests to expect standardized exit behavior (e.g., cancelled → code 2).

Possibly related PRs

  • workos/cli#149: Related to migrations top-level command wiring which this PR also wraps in src/bin.ts.
  • workos/cli#142: Related changes to the top-level workos api command registration and wiring.
  • workos/cli#139: Overlaps in src/commands/login.ts authentication/termination flow changes.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.74% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main changes: adding CLI telemetry for all commands and crash reporting, which are the primary features introduced across the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch nicknisi/telemetry

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (4)
src/utils/telemetry-types.ts (1)

58-58: ⚡ Quick win

Align startTimestamp optionality with backward-compat contract.

telemetry-schema.spec.ts explicitly accepts step/agent.tool without startTimestamp, but these interfaces currently require it. Making both optional avoids contract drift.

Suggested patch
 export interface StepEvent extends TelemetryEvent {
   type: 'step';
   name: string;
-  startTimestamp: string;
+  startTimestamp?: string;
   durationMs: number;
   success: boolean;
   error?: {
@@
 export interface AgentToolEvent extends TelemetryEvent {
   type: 'agent.tool';
   toolName: string;
-  startTimestamp: string;
+  startTimestamp?: string;
   durationMs: number;
   success: boolean;
 }

Also applies to: 70-70

src/utils/telemetry-client.spec.ts (1)

60-105: ⚡ Quick win

Add coverage for claim-token auth headers.

setClaimTokenAuth() is new behavior in this PR, but there’s no test asserting x-workos-claim-token + x-workos-client-id emission (and bearer-token precedence when both exist).

Also applies to: 124-218

src/utils/telemetry-store-forward.ts (1)

1-3: ⚡ Quick win

Use async fs APIs for startup recovery path.

recoverPendingEvents() (Line 25+) is async but uses sync disk calls (Lines 27-56), which blocks startup and conflicts with the project rule for TS files.

As per coding guidelines "Avoid Node-specific sync APIs (crypto, fs sync) unless necessary".

Also applies to: 27-56

src/utils/output.ts (1)

123-124: ⚡ Quick win

Protect process termination from telemetry failures

analytics.recordTermination(...) on Line 123 can throw and prevent Line 124 from executing. Error exits should remain deterministic even when telemetry is unhealthy.

🔧 Proposed fix
   const reason = error.apiContext ? 'api_error' : codeReason;
-  analytics.recordTermination(reason, error.code, error.apiContext);
-  process.exit(exit);
+  try {
+    analytics.recordTermination(reason, error.code, error.apiContext);
+  } finally {
+    process.exit(exit);
+  }
 }

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c45afda9-659c-43a0-9646-cd08dd0cabe3

📥 Commits

Reviewing files that changed from the base of the PR and between 524c709 and 31829ec.

📒 Files selected for processing (27)
  • CLAUDE.md
  • src/bin.ts
  • src/commands/debug.ts
  • src/lib/api-error-handler.spec.ts
  • src/lib/api-error-handler.ts
  • src/lib/command-aliases.ts
  • src/lib/device-id.spec.ts
  • src/lib/device-id.ts
  • src/lib/run-with-core.ts
  • src/utils/analytics.spec.ts
  • src/utils/analytics.ts
  • src/utils/command-telemetry.spec.ts
  • src/utils/command-telemetry.ts
  • src/utils/crash-reporter.spec.ts
  • src/utils/crash-reporter.ts
  • src/utils/exit-codes.spec.ts
  • src/utils/exit-codes.ts
  • src/utils/output.spec.ts
  • src/utils/output.ts
  • src/utils/register-subcommand.ts
  • src/utils/telemetry-client.spec.ts
  • src/utils/telemetry-client.ts
  • src/utils/telemetry-sanitize.spec.ts
  • src/utils/telemetry-schema.spec.ts
  • src/utils/telemetry-store-forward.spec.ts
  • src/utils/telemetry-store-forward.ts
  • src/utils/telemetry-types.ts

Comment thread src/lib/device-id.ts Outdated
Comment thread src/lib/device-id.ts
Comment on lines +31 to +43
try {
if (fs.existsSync(filePath)) {
const raw = fs.readFileSync(filePath, 'utf8').trim();
if (UUID_REGEX.test(raw)) {
cached = raw;
return raw;
}
}

const id = crypto.randomUUID();
fs.mkdirSync(path.dirname(filePath), { recursive: true, mode: 0o700 });
fs.writeFileSync(filePath, id, { encoding: 'utf8', mode: 0o600 });
cached = id;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Replace sync fs usage in the command path.

Line 31–43 and Line 49 use synchronous fs/crypto APIs on a hot CLI path. Please move this to async (node:fs/promises) and cache a pending promise to keep call sites simple.

As per coding guidelines src/**/*.ts: “Avoid Node-specific sync APIs (crypto, fs sync) unless necessary”.

Also applies to: 49-50

Comment thread src/utils/analytics.ts
Comment on lines +30 to +34
export function extractUserFlags(rawArgs: string[]): string[] {
const passedFlags = rawArgs
.filter((arg) => arg.startsWith('--') || (arg.startsWith('-') && arg.length === 2))
.map((arg) => arg.replace(/^-+/, '').split('=')[0]);
return [...new Set(passedFlags)];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Harden flag extraction to avoid invalid telemetry flags.

Line 32/33 currently treats -- and values like -1 as flags, which can pollute command.flags.

Suggested fix
 export function extractUserFlags(rawArgs: string[]): string[] {
   const passedFlags = rawArgs
-    .filter((arg) => arg.startsWith('--') || (arg.startsWith('-') && arg.length === 2))
-    .map((arg) => arg.replace(/^-+/, '').split('=')[0]);
+    .filter((arg) => {
+      if (arg === '--') return false;
+      if (/^--[A-Za-z][\w-]*(=.*)?$/.test(arg)) return true;
+      if (/^-[A-Za-z]$/.test(arg)) return true;
+      return false;
+    })
+    .map((arg) => arg.replace(/^-+/, '').split('=')[0])
+    .filter(Boolean);
   return [...new Set(passedFlags)];
 }

Comment on lines +60 to +61
const topLevelCommand = commandParts[0] ?? '';
if (SKIP_TELEMETRY_COMMANDS.has(topLevelCommand)) return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Normalize the skip check using canonical top-level command.

Line 60/61 checks the raw command token. If a skipped command is invoked via alias, the exemption can be bypassed and telemetry still emitted.

Suggested fix
-    const topLevelCommand = commandParts[0] ?? '';
+    const rawTopLevelCommand = commandParts[0] ?? '';
+    const topLevelCommand = COMMAND_ALIASES[rawTopLevelCommand] ?? rawTopLevelCommand;
     if (SKIP_TELEMETRY_COMMANDS.has(topLevelCommand)) return;

export function sanitizeStack(stack: string | undefined): string {
if (!stack) return '';
let sanitized = stack.replaceAll(HOME, '~');
sanitized = sanitized.replace(/\/[^\s:]+\/(node_modules|dist|src)\//g, '$1/');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

find . -name "crash-reporter.ts" -type f

Repository: workos/cli

Length of output: 84


🏁 Script executed:

cat -n src/utils/crash-reporter.ts

Repository: workos/cli

Length of output: 3137


🏁 Script executed:

python3 << 'EOF'
import re

# Test the current regex against various path formats
current_pattern = r'/[^\s:]+/(node_modules|dist|src)/'
windows_pattern = r'[A-Za-z]:\\[^\s:]+\\(node_modules|dist|src)\\'

# Test cases
test_cases = [
    # POSIX paths
    "/home/user/project/node_modules/lib/index.js",
    "/var/app/dist/main.js",
    "/Users/dev/src/index.ts",
    # Windows paths
    "C:\\Users\\dev\\project\\node_modules\\lib\\index.js",
    "D:\\app\\dist\\main.js",
    "E:\\code\\src\\index.ts",
    # Stack trace examples
    "  at functionName (/home/user/project/node_modules/pkg/index.js:10:5)",
    "  at functionName (C:\\Users\\dev\\project\\node_modules\\pkg\\index.js:10:5)",
]

print("Testing CURRENT regex pattern:")
print(f"Pattern: {current_pattern}")
for test in test_cases:
    match = re.search(current_pattern, test)
    result = re.sub(current_pattern, r'\1/', test) if match else test
    print(f"  '{test[:60]}...' -> Match: {bool(match)}, Result: '{result[:60]}...'")

print("\nTesting PROPOSED Windows pattern:")
print(f"Pattern: {windows_pattern}")
for test in test_cases:
    match = re.search(windows_pattern, test)
    result = re.sub(windows_pattern, r'\1\\', test) if match else test
    print(f"  '{test[:60]}...' -> Match: {bool(match)}, Result: '{result[:60]}...'")

print("\nTesting COMBINED patterns:")
for test in test_cases:
    result = test
    result = re.sub(current_pattern, r'\1/', result)
    result = re.sub(windows_pattern, r'\1\\', result)
    print(f"  '{test[:60]}...' -> Result: '{result[:60]}...'")
EOF

Repository: workos/cli

Length of output: 2630


Windows absolute paths leak in stack trace sanitization

Line 26 only collapses POSIX-style paths (/path/to/node_modules/). Windows paths (e.g., C:\Users\...\node_modules\) bypass this regex entirely and leak full local filesystem paths into telemetry payloads.

Regex testing confirms the current pattern does not match Windows absolute paths:

  • C:\Users\dev\project\node_modules\pkg\index.js → remains unsanitized
  • /home/user/project/node_modules/pkg/index.js → correctly collapsed to node_modules/pkg/index.js
🔧 Proposed fix
 export function sanitizeStack(stack: string | undefined): string {
   if (!stack) return '';
   let sanitized = stack.replaceAll(HOME, '~');
-  sanitized = sanitized.replace(/\/[^\s:]+\/(node_modules|dist|src)\//g, '$1/');
+  sanitized = sanitized
+    .replace(/\/[^\s:]+\/(node_modules|dist|src)\//g, '$1/')
+    .replace(/[A-Za-z]:\\[^\s:]+\\(node_modules|dist|src)\\/g, '$1\\');
   sanitized = redactSecrets(sanitized);
   return sanitized.length > MAX_STACK_LENGTH
     ? sanitized.slice(0, MAX_STACK_LENGTH) + '\n...[truncated]'
     : sanitized;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sanitized = sanitized.replace(/\/[^\s:]+\/(node_modules|dist|src)\//g, '$1/');
export function sanitizeStack(stack: string | undefined): string {
if (!stack) return '';
let sanitized = stack.replaceAll(HOME, '~');
sanitized = sanitized
.replace(/\/[^\s:]+\/(node_modules|dist|src)\//g, '$1/')
.replace(/[A-Za-z]:\\[^\s:]+\\(node_modules|dist|src)\\/g, '$1\\');
sanitized = redactSecrets(sanitized);
return sanitized.length > MAX_STACK_LENGTH
? sanitized.slice(0, MAX_STACK_LENGTH) + '\n...[truncated]'
: sanitized;
}

Comment thread src/utils/exit-codes.ts Outdated
Comment on lines 72 to 77
export function exitWithCode(code: ExitCodeValue, error?: { code: string; message: string }): never {
if (error) {
outputError(error);
}
analytics.recordTermination(reasonForExitCode(code), error?.code);
process.exit(code);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Classify termination from error.code when available.

Line 76 derives reason only from numeric exit code, so all code-1 exits become validation_error. That misclassifies API failures like http_500 when they go through exitWithCode.

Suggested fix
 export function exitWithCode(code: ExitCodeValue, error?: { code: string; message: string }): never {
   if (error) {
     outputError(error);
   }
-  analytics.recordTermination(reasonForExitCode(code), error?.code);
+  const reason = error ? resolveErrorCode(error.code).reason : reasonForExitCode(code);
+  analytics.recordTermination(reason, error?.code);
   process.exit(code);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export function exitWithCode(code: ExitCodeValue, error?: { code: string; message: string }): never {
if (error) {
outputError(error);
}
analytics.recordTermination(reasonForExitCode(code), error?.code);
process.exit(code);
export function exitWithCode(code: ExitCodeValue, error?: { code: string; message: string }): never {
if (error) {
outputError(error);
}
const reason = error ? resolveErrorCode(error.code).reason : reasonForExitCode(code);
analytics.recordTermination(reason, error?.code);
process.exit(code);
}

Comment on lines +85 to 94
async flush(): Promise<boolean> {
if (this.events.length === 0) return true;
if (!this.gatewayUrl) {
debug('[Telemetry] No gateway URL configured, skipping flush');
return;
return false;
}

const payload: TelemetryRequest = { events: [...this.events] };
this.events = [];
const count = this.events.length;
const payload: TelemetryRequest = { events: this.events.slice(0, count) };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Guard against concurrent flush() calls to prevent event loss/duplication.

Line 85 currently allows overlapping flushes. Two concurrent calls can send the same snapshot twice, and later splice(0, count) can remove events queued after the first flush.

Suggested patch
 export class TelemetryClient {
   private events: TelemetryEvent[] = [];
+  private flushInFlight: Promise<boolean> | null = null;
@@
   async flush(): Promise<boolean> {
+    if (this.flushInFlight) return this.flushInFlight;
+    this.flushInFlight = this.flushInternal();
+    try {
+      return await this.flushInFlight;
+    } finally {
+      this.flushInFlight = null;
+    }
+  }
+
+  private async flushInternal(): Promise<boolean> {
     if (this.events.length === 0) return true;
@@
-  }
+  }
 }

Also applies to: 135-137, 143-144

Comment on lines +164 to +166
mkdirSync(dirname(filePath), { recursive: true });
writeFileSync(filePath, JSON.stringify(this.events), 'utf-8');
this.events = [];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist telemetry with restrictive filesystem permissions.

persistToFile() currently relies on default permissions. Since payloads can include user/device identifiers, write with 0700 dir and 0600 file modes.

Suggested patch
-      mkdirSync(dirname(filePath), { recursive: true });
-      writeFileSync(filePath, JSON.stringify(this.events), 'utf-8');
+      mkdirSync(dirname(filePath), { recursive: true, mode: 0o700 });
+      writeFileSync(filePath, JSON.stringify(this.events), { encoding: 'utf-8', mode: 0o600 });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
mkdirSync(dirname(filePath), { recursive: true });
writeFileSync(filePath, JSON.stringify(this.events), 'utf-8');
this.events = [];
mkdirSync(dirname(filePath), { recursive: true, mode: 0o700 });
writeFileSync(filePath, JSON.stringify(this.events), { encoding: 'utf-8', mode: 0o600 });
this.events = [];

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 12, 2026

Greptile Summary

This PR adds comprehensive CLI telemetry to every command: a yargs middleware queues provisional command events, wrapCommandHandler replaces them with real duration/outcome on completion, and a store-and-forward mechanism persists unsent events across process exits. Two new event types (command, crash) are added alongside global uncaughtException/unhandledRejection handlers, a stable per-device UUID, environment fingerprinting, and a WORKOS_DEBUG flag.

  • Command telemetry — middleware + wrapCommandHandler wrap every yargs command; alias normalization (orgorganization, claimenv.claim) keeps metrics consolidated; SKIP_TELEMETRY_COMMANDS excludes installer/dashboard.
  • Crash reporting — synchronous handlers queue a crash event that store-forward persists on exit; sanitizeStack/sanitizeMessage redact home dir, absolute paths, Bearer tokens, sk_* keys, and JWTs.
  • Store-and-forward — PID-keyed temp files survive unexpected exits; recovery on next invocation queues and flushes the saved events; flush() now uses splice and drops on 4xx/retains on 5xx.

Confidence Score: 4/5

Safe to merge with one fix recommended: captureException logs the raw error message before sanitization, which can expose secrets in debug output.

The telemetry wiring is solid and well-tested. One method — captureException — logs error.message to stdout before running it through the sanitizer, meaning a Bearer token or API key in an error message would appear in debug output when WORKOS_DEBUG=1 is set. This directly violates the team's rule against logging sensitive fields. The rest of the PR (store-forward, crash reporter, device ID, command wrapping) behaves correctly.

src/utils/analytics.ts — captureException debug log at line 149

Important Files Changed

Filename Overview
src/utils/analytics.ts Major expansion: added command/crash events, env fingerprinting, auth-mode tracking, and termination recording. Contains an unsanitized debug log in captureException and a new circular import with crash-reporter.ts.
src/utils/command-telemetry.ts New file: middleware and handler wrapper for per-command telemetry. err.name used as error.code in the catch path is inconsistent with structured codes used elsewhere.
src/utils/crash-reporter.ts New file: synchronous uncaughtException/unhandledRejection handlers with isCrashing guard, sanitization helpers, and home-dir redaction. Creates a new circular import with analytics.ts.
src/utils/telemetry-store-forward.ts New file: PID-keyed persist-on-exit and startup recovery. Correctly deletes files before flush so failed flushes are re-persisted via the exit handler.
src/utils/telemetry-client.ts Updated flush() to return boolean, uses splice for safe concurrent-event protection, drops on 4xx and retains on 5xx/network errors.
src/bin.ts Wires crash reporter, store-forward, and command telemetry middleware. Wraps most inline handlers with wrapCommandHandler.
src/utils/telemetry-types.ts Added CommandEvent, CrashEvent, TerminationReason, and AuthMode types; clean schema additions with no issues.
src/lib/device-id.ts New persistent device-ID module: validates UUID format before trusting stored value, falls back to session-scoped UUID on IO failure. Correctly sets restrictive file permissions (0o600).
src/lib/command-aliases.ts New single-source-of-truth alias map (org → organization, claim → env.claim) shared by telemetry and help-json.
src/utils/exit-codes.ts Added resolveErrorCode mapping and analytics.recordTermination calls before process.exit.

Sequence Diagram

sequenceDiagram
    participant bin as bin.ts
    participant mw as commandTelemetryMiddleware
    participant wrap as wrapCommandHandler
    participant ana as Analytics
    participant tc as TelemetryClient
    participant sf as StoreForward

    bin->>bin: installCrashReporter()
    bin->>bin: installStoreForward()
    bin->>ana: initForNonInstaller()
    bin->>bin: recoverPendingEvents() fire-and-forget
    bin->>mw: yargs middleware fires
    mw->>ana: setCommandStart(now)
    mw->>ana: queueProvisionalCommand(name, flags)
    ana->>tc: "queueEvent command success=true duration=0"
    bin->>wrap: handler(argv) runs
    alt success
        wrap->>ana: replaceLastCommandEvent
        wrap->>ana: recordTermination success
        wrap->>tc: flush in-process
    else process.exit path
        wrap->>ana: recordTermination reason
        wrap->>sf: exit event fires
        sf->>tc: persistToFile pending-PID.json
    else uncaught throw
        wrap->>ana: replaceLastCommandEvent false
        wrap->>ana: recordTermination crash
        wrap->>tc: flush finally
    end
Loading

Reviews (3): Last reviewed commit: "fix(telemetry): address review nits — pr..." | Re-trigger Greptile

Comment thread src/utils/output.ts
Comment thread src/bin.ts
Comment thread src/utils/command-telemetry.ts
Comment thread src/utils/telemetry-client.ts Outdated
nicknisi added a commit that referenced this pull request May 13, 2026
Closes security-audit finding #1 on PR #122 (telemetry message
sanitization). `error.message` was flowing into 4 capture sites
unsanitized, leaking homedir paths (and rarely, credentials) to the
WorkOS gateway.

- Add `sanitizeMessage()` in crash-reporter.ts: homedir strip + Bearer/
  sk_*/JWT redaction + 1KB truncation.
- Factor secret redaction into shared `redactSecrets()` used by both
  `sanitizeMessage` and `sanitizeStack` (Node echoes `.message` into the
  leading `Error.stack` line, so message-only sanitization was
  insufficient).
- Add private `extractErrorFields()` chokepoint on `Analytics`; route
  all 4 capture sites through it (`captureException`, `stepCompleted`,
  `commandExecuted`, `captureUnhandledCrash`). `replaceLastCommandEvent`
  inherits sanitization via its delegation to `commandExecuted`.
- `captureUnhandledCrash` now uses `sanitizeStack` instead of inline
  truncation, providing defense-in-depth for callers that bypass the
  crash-reporter wrapper.
- Add regression guard test (`telemetry-sanitize.spec.ts`): poisons
  every capture method with homedir + Bearer + sk_live_ + JWT, asserts
  no marker reaches the serialized queue.

Reviewed: ideation:reviewer cycle 1 PASS (0 critical, 0 high).
@nicknisi nicknisi force-pushed the nicknisi/telemetry branch from 31829ec to 0475ea8 Compare May 13, 2026 03:20
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e6b7b384-44b3-47fe-9030-070bd92bf44e

📥 Commits

Reviewing files that changed from the base of the PR and between 31829ec and 0475ea8.

📒 Files selected for processing (39)
  • CLAUDE.md
  • src/bin.ts
  • src/commands/api/index.spec.ts
  • src/commands/api/index.ts
  • src/commands/api/interactive.spec.ts
  • src/commands/api/interactive.ts
  • src/commands/debug.ts
  • src/commands/dev.ts
  • src/commands/doctor.ts
  • src/commands/emulate.ts
  • src/commands/env.ts
  • src/commands/install-skill.ts
  • src/commands/login.ts
  • src/commands/uninstall-skill.ts
  • src/lib/api-error-handler.spec.ts
  • src/lib/api-error-handler.ts
  • src/lib/command-aliases.ts
  • src/lib/device-id.spec.ts
  • src/lib/device-id.ts
  • src/lib/run-with-core.ts
  • src/utils/analytics.spec.ts
  • src/utils/analytics.ts
  • src/utils/command-telemetry.spec.ts
  • src/utils/command-telemetry.ts
  • src/utils/crash-reporter.spec.ts
  • src/utils/crash-reporter.ts
  • src/utils/exit-codes.spec.ts
  • src/utils/exit-codes.ts
  • src/utils/help-json.ts
  • src/utils/output.spec.ts
  • src/utils/output.ts
  • src/utils/register-subcommand.ts
  • src/utils/telemetry-client.spec.ts
  • src/utils/telemetry-client.ts
  • src/utils/telemetry-sanitize.spec.ts
  • src/utils/telemetry-schema.spec.ts
  • src/utils/telemetry-store-forward.spec.ts
  • src/utils/telemetry-store-forward.ts
  • src/utils/telemetry-types.ts
✅ Files skipped from review due to trivial changes (1)
  • src/commands/emulate.ts
🚧 Files skipped from review as they are similar to previous changes (22)
  • CLAUDE.md
  • src/lib/command-aliases.ts
  • src/utils/register-subcommand.ts
  • src/utils/telemetry-client.spec.ts
  • src/utils/crash-reporter.spec.ts
  • src/commands/debug.ts
  • src/utils/telemetry-schema.spec.ts
  • src/utils/telemetry-client.ts
  • src/utils/telemetry-store-forward.spec.ts
  • src/lib/device-id.spec.ts
  • src/utils/output.spec.ts
  • src/utils/telemetry-store-forward.ts
  • src/utils/command-telemetry.spec.ts
  • src/lib/api-error-handler.spec.ts
  • src/lib/device-id.ts
  • src/lib/api-error-handler.ts
  • src/utils/telemetry-sanitize.spec.ts
  • src/utils/crash-reporter.ts
  • src/utils/analytics.ts
  • src/utils/telemetry-types.ts
  • src/utils/exit-codes.ts
  • src/utils/analytics.spec.ts

Comment thread src/commands/doctor.ts
Comment on lines 37 to 41
if (!options.json) {
clack.log.error(`Doctor failed: ${error instanceof Error ? error.message : 'Unknown error'}`);
} else {
console.error(JSON.stringify({ error: error instanceof Error ? error.message : 'Unknown error' }));
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Emit structured JSON errors in doctor JSON mode.

Line 40 currently writes {"error":"..."}. JSON-mode errors should keep a stable error.code and error.message object shape.

Suggested fix
-    } else {
-      console.error(JSON.stringify({ error: error instanceof Error ? error.message : 'Unknown error' }));
-    }
+    } else {
+      console.error(
+        JSON.stringify({
+          error: {
+            code: 'doctor_failed',
+            message: error instanceof Error ? error.message : 'Unknown error',
+          },
+        }),
+      );
+    }

As per coding guidelines src/**/*.ts: Return structured JSON error objects to stderr in non-TTY mode: { "error": { "code": "...", "message": "..." } }.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!options.json) {
clack.log.error(`Doctor failed: ${error instanceof Error ? error.message : 'Unknown error'}`);
} else {
console.error(JSON.stringify({ error: error instanceof Error ? error.message : 'Unknown error' }));
}
if (!options.json) {
clack.log.error(`Doctor failed: ${error instanceof Error ? error.message : 'Unknown error'}`);
} else {
console.error(
JSON.stringify({
error: {
code: 'doctor_failed',
message: error instanceof Error ? error.message : 'Unknown error',
},
}),
);
}

Comment thread src/lib/run-with-core.ts
Comment on lines +541 to +548
const preSessionEnv = getActiveEnvironment();
if (preSessionEnv?.clientId && preSessionEnv?.apiKey) {
analytics.setAuthMode(isUnclaimedEnvironment(preSessionEnv) ? 'claim_token' : 'api_key');
} else if (getAccessToken()) {
analytics.setAuthMode('jwt');
} else if (process.env.WORKOS_API_KEY) {
analytics.setAuthMode('api_key');
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Include CLI/discovered API key in pre-session auth-mode detection.

At Line 546, pre-session classification only checks process.env.WORKOS_API_KEY. If auth comes from --api-key (or .env.local merged into augmentedOptions.apiKey), session.start can miss auth.mode = "api_key".

Suggested fix
   const preSessionEnv = getActiveEnvironment();
   if (preSessionEnv?.clientId && preSessionEnv?.apiKey) {
     analytics.setAuthMode(isUnclaimedEnvironment(preSessionEnv) ? 'claim_token' : 'api_key');
   } else if (getAccessToken()) {
     analytics.setAuthMode('jwt');
-  } else if (process.env.WORKOS_API_KEY) {
+  } else if (augmentedOptions.apiKey || process.env.WORKOS_API_KEY) {
     analytics.setAuthMode('api_key');
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const preSessionEnv = getActiveEnvironment();
if (preSessionEnv?.clientId && preSessionEnv?.apiKey) {
analytics.setAuthMode(isUnclaimedEnvironment(preSessionEnv) ? 'claim_token' : 'api_key');
} else if (getAccessToken()) {
analytics.setAuthMode('jwt');
} else if (process.env.WORKOS_API_KEY) {
analytics.setAuthMode('api_key');
}
const preSessionEnv = getActiveEnvironment();
if (preSessionEnv?.clientId && preSessionEnv?.apiKey) {
analytics.setAuthMode(isUnclaimedEnvironment(preSessionEnv) ? 'claim_token' : 'api_key');
} else if (getAccessToken()) {
analytics.setAuthMode('jwt');
} else if (augmentedOptions.apiKey || process.env.WORKOS_API_KEY) {
analytics.setAuthMode('api_key');
}

Comment thread src/utils/crash-reporter.ts
nicknisi added 16 commits May 27, 2026 16:30
…amps, and new event types

Add environment fingerprint fields (OS, Node version, CI detection, shell)
to session.start and session.end events. Add startTimestamp to step and
agent.tool events for span reconstruction. Define command and crash event
types with stub emission methods. Add discriminated union Zod schema
validation tests mirroring the API schema.
… persistence

Wire up yargs middleware that emits a provisional command event before each
handler runs, then replaces it with actual duration/success on completion.
This covers the ~25 process.exit() call sites without modifying them.

- Command telemetry middleware with canonical name resolution and flag extraction
- Crash reporter with sanitized stack traces (sync handlers, no async)
- Store-forward: persist unsent events to temp file on exit, recover on next run
- Fix flush() to retain events until HTTP success (was clearing before fetch)
- Auto-wrap handlers in registerSubcommand() (single change point)
- Shared COMMAND_ALIASES map for telemetry and help-json
- analytics.initForNonInstaller() sets gatewayUrl + JWT from stored creds
- Enable debug output for non-installer commands via env var
- Log telemetry event details (type, name, duration, attributes) on flush
- Register in debug command's env var catalog
- Wrap inline command handlers (seed, setup-org, doctor, etc.) with
  wrapCommandHandler so they report real duration/success
- Skip provisional telemetry event for install command (has own session telemetry)
- Add claim -> env.claim to canonical alias map
- Defer store-forward file deletion until after successful flush
Client errors (401, 403) are permanent failures that won't succeed on
retry. Only retain events for 5xx (transient server errors) and network
failures where store-forward retry is meaningful.
- flush() returns true (sent/dropped) or false (retryable) so callers
  can act on the result
- Use splice(0, count) instead of clearing all events, protecting
  events queued concurrently during the fetch
- wrapCommandHandler flushes in-process so events are sent immediately
  instead of always deferring to next invocation via store-forward
- Store-forward recovery deletes files after loading into memory
  (events are re-persisted by exit handler if flush fails)
- Skip provisional events for dashboard and $0 (installer entry points)
- Add 4xx drop test coverage
Add a section to CLAUDE.md explaining which commands auto-emit telemetry
(registerSubcommand) versus which need manual wrapCommandHandler wrapping
(inline top-level .command() calls). Add a pointer comment in bin.ts near
the workflow commands block.

Prevents new top-level commands from silently emitting duration=0 telemetry.
- Add workos.user_id to command and crash events (from stored credentials
  or unclaimed environment clientId) so dashboards can count unique users
- Add cli.version to command and crash events for release adoption tracking
- Support claim-token auth path on the telemetry client, so unclaimed
  environments' telemetry reaches the API (guard accepts this path too)
- Rename CrashEvent's installer.version to cli.version (crashes happen
  outside the installer too)
- initForNonInstaller() now wires up user_id and claim-token auth
Closes security-audit finding #1 on PR #122 (telemetry message
sanitization). `error.message` was flowing into 4 capture sites
unsanitized, leaking homedir paths (and rarely, credentials) to the
WorkOS gateway.

- Add `sanitizeMessage()` in crash-reporter.ts: homedir strip + Bearer/
  sk_*/JWT redaction + 1KB truncation.
- Factor secret redaction into shared `redactSecrets()` used by both
  `sanitizeMessage` and `sanitizeStack` (Node echoes `.message` into the
  leading `Error.stack` line, so message-only sanitization was
  insufficient).
- Add private `extractErrorFields()` chokepoint on `Analytics`; route
  all 4 capture sites through it (`captureException`, `stepCompleted`,
  `commandExecuted`, `captureUnhandledCrash`). `replaceLastCommandEvent`
  inherits sanitization via its delegation to `commandExecuted`.
- `captureUnhandledCrash` now uses `sanitizeStack` instead of inline
  truncation, providing defense-in-depth for callers that bypass the
  crash-reporter wrapper.
- Add regression guard test (`telemetry-sanitize.spec.ts`): poisons
  every capture method with homedir + Bearer + sk_live_ + JWT, asserts
  no marker reaches the serialized queue.

Reviewed: ideation:reviewer cycle 1 PASS (0 critical, 0 high).
Introduces two additive telemetry signals requested by the signals spec:

- device.id — persistent per-install UUID stored at ~/.workos/device-id.
  File IO failures fall through to a one-shot session UUID; never throws.
- auth.mode — 4-state enum ('jwt' | 'claim_token' | 'api_key' | 'none')
  derived during initForNonInstaller with JWT > claim_token > api_key
  priority. Installer flow sets it in run-with-core after credentials
  resolve.

Both fields are injected via getEnvFingerprint so they appear on every
event that already carries env context (session.start, session.end,
command, crash). No change to downstream gateway — existing deliveries
accept the new attrs as optional.

Implements Phase 1 of docs/ideation/telemetry-signals/spec-phase-1.md.
Gateway-side type mirroring is tracked separately in the gateway repo.
…and events

Replace the boolean `command.success` as the primary outcome dimension with
a structured `termination.reason` enum (success | cancelled | auth_required
| validation_error | api_error | crash) plus a `error.code` string.

- `exitWithError`, `exitWithCode`, `exitWithAuthRequired` now patch the
  provisional command event via `analytics.recordTermination` before
  calling `process.exit`. Previously the queued provisional event (success
  true, duration 0) was persisted as-is by store-forward, producing
  misleading dashboard data.
- `exitWithError` now honors the string error code for exit mapping:
  `auth_required` -> 4, `cancelled` -> 2, `http_*`/`not_found`/`unknown_error`
  -> 1 (api_error), everything else -> 1 (validation_error). Previously it
  always exited 1.
- New `TelemetryClient.patchLastEventOfType` helper mutates a queued event
  in place. Unlike `replaceLastEventOfType`, it preserves the event so
  multiple callers can update fields incrementally.
- `wrapCommandHandler` records `reason: 'success'` on clean completion and
  `reason: 'crash'` with `error.name` on uncaught throw.
- `command.success` remains on CommandEvent for backward-compat.
- `api.status`/`api.code`/`api.resource` fields added to CommandEvent and
  plumbed through `recordTermination` for Phase 3 consumption.

Review passed (2 medium + 1 low non-blocking; low fixed in same commit for
crash-path test symmetry).
…mmand events

Wires `createApiErrorHandler` through to the provisional command event via
an optional `apiContext` param on `exitWithError`, populating `api.status`,
`api.code`, and `api.resource` attributes on API-failure command events.

Also treats `apiContext` presence as ground truth for `api_error` termination
reason — WorkOS error codes like `rate_limited` or `validation_error` would
otherwise fall through `resolveErrorCode` to `validation_error`, hiding
legitimate API failures from api_error dashboards.

The `CommandEvent` type already had the `api.*` attribute slots (added as
forward-compat in Phase 2); only the wiring and tests are new.
- Fix duration=0 on exit-path events: middleware now sets
  analytics.commandStartTime so recordTermination can compute real
  duration when exitWithError/exitWithCode bypass the wrapper
- Remove not_found/unknown_error from ERROR_CODE_MAP to prevent local
  config misses (env.ts) from being misclassified as API errors
- Resolve auth.mode before sessionStart in installer flow so the
  session.start event carries the correct credential source
- Make recordTermination fully idempotent: clears stale api.* and
  error.code fields when called without them
- Tighten device-id UUID validation to proper RFC 4122 v4 regex
- Only override termination reason to api_error when the code-based
  classification falls to the generic validation_error fallback,
  preserving more specific reasons like auth_required
Wrap the `api` command handler with `wrapCommandHandler()` so it
reports real duration and success/failure instead of the provisional
defaults (duration=0, success=true).

Replace raw `process.exit()` calls with `exitWithCode`/`exitWithError`
across 10 command files so `recordTermination` fires before exit. This
ensures `termination.reason` and `error.code` are populated on every
command event. Long-running server handlers (dev/emulate SIGINT) are
left as-is.

Fix help-json.ts alias drift by importing from the shared
`COMMAND_ALIASES` map instead of maintaining a private copy.

Add a coverage test that scans bin.ts for inline handlers missing
`wrapCommandHandler`, catching the class of regression that let the
`api` command slip through.
@nicknisi nicknisi force-pushed the nicknisi/telemetry branch from 0475ea8 to 8003788 Compare May 27, 2026 21:34
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6ce40944-6658-4efe-aa8c-e9db5c81063a

📥 Commits

Reviewing files that changed from the base of the PR and between 0475ea8 and 8003788.

📒 Files selected for processing (39)
  • CLAUDE.md
  • src/bin.ts
  • src/commands/api/index.spec.ts
  • src/commands/api/index.ts
  • src/commands/api/interactive.spec.ts
  • src/commands/api/interactive.ts
  • src/commands/debug.ts
  • src/commands/dev.ts
  • src/commands/doctor.ts
  • src/commands/emulate.ts
  • src/commands/env.ts
  • src/commands/install-skill.ts
  • src/commands/login.ts
  • src/commands/uninstall-skill.ts
  • src/lib/api-error-handler.spec.ts
  • src/lib/api-error-handler.ts
  • src/lib/command-aliases.ts
  • src/lib/device-id.spec.ts
  • src/lib/device-id.ts
  • src/lib/run-with-core.ts
  • src/utils/analytics.spec.ts
  • src/utils/analytics.ts
  • src/utils/command-telemetry.spec.ts
  • src/utils/command-telemetry.ts
  • src/utils/crash-reporter.spec.ts
  • src/utils/crash-reporter.ts
  • src/utils/exit-codes.spec.ts
  • src/utils/exit-codes.ts
  • src/utils/help-json.ts
  • src/utils/output.spec.ts
  • src/utils/output.ts
  • src/utils/register-subcommand.ts
  • src/utils/telemetry-client.spec.ts
  • src/utils/telemetry-client.ts
  • src/utils/telemetry-sanitize.spec.ts
  • src/utils/telemetry-schema.spec.ts
  • src/utils/telemetry-store-forward.spec.ts
  • src/utils/telemetry-store-forward.ts
  • src/utils/telemetry-types.ts
✅ Files skipped from review due to trivial changes (3)
  • src/commands/api/index.spec.ts
  • CLAUDE.md
  • src/commands/api/interactive.spec.ts
🚧 Files skipped from review as they are similar to previous changes (31)
  • src/commands/debug.ts
  • src/commands/doctor.ts
  • src/commands/emulate.ts
  • src/utils/telemetry-store-forward.spec.ts
  • src/commands/uninstall-skill.ts
  • src/commands/install-skill.ts
  • src/lib/command-aliases.ts
  • src/commands/api/index.ts
  • src/utils/output.ts
  • src/utils/help-json.ts
  • src/utils/exit-codes.spec.ts
  • src/commands/env.ts
  • src/commands/login.ts
  • src/utils/command-telemetry.ts
  • src/utils/telemetry-types.ts
  • src/utils/crash-reporter.spec.ts
  • src/commands/api/interactive.ts
  • src/commands/dev.ts
  • src/utils/exit-codes.ts
  • src/lib/run-with-core.ts
  • src/utils/telemetry-schema.spec.ts
  • src/utils/telemetry-store-forward.ts
  • src/utils/crash-reporter.ts
  • src/lib/api-error-handler.ts
  • src/utils/telemetry-client.spec.ts
  • src/lib/device-id.spec.ts
  • src/utils/telemetry-client.ts
  • src/lib/api-error-handler.spec.ts
  • src/lib/device-id.ts
  • src/utils/analytics.ts
  • src/bin.ts

Comment thread src/utils/command-telemetry.spec.ts
…ze env.shell

Make `commandExecuted` private on Analytics, exposing only
`queueProvisionalCommand` for the middleware provisional-event path.
Prevents accidental rogue command events that break the swap pattern.

Record only `basename(SHELL)` in env.shell to avoid leaking homedir
paths (e.g. ~/.local/bin/fish).

Wrap `migrations` command with `wrapCommandHandler` for telemetry.
Comment thread src/utils/analytics.ts
@@ -62,18 +147,60 @@ export class Analytics {
if (!WORKOS_TELEMETRY_ENABLED) return;

debug('[Analytics] captureException:', error.message, properties);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Unsanitized error message in debug log

debug('[Analytics] captureException:', error.message, properties) logs the raw, unsanitized error.message before extractErrorFields has a chance to redact it. If an error message carries a Bearer token, sk_live_ key, or JWT — all of which are possible in auth-failure paths — that secret will appear in stdout whenever WORKOS_DEBUG=1 is set. The sanitized value is correctly stored in this.tags['error.message'] two lines later, but the debug output uses the pre-sanitized text.

Rule Used: Do not log sensitive fields like access_token, ref... (source)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant