Standardize JSONL log envelopes across AWF runtime logs #3790

Merged
lpcox merged 3 commits into main from copilot/standardize-jsonl-log-format
May 25, 2026

Conversation

Contributor

Copilot AI commented May 25, 2026

AWF JSONL outputs used inconsistent timestamp keys/formats (ts epoch vs timestamp ISO) and uneven event discrimination, which made cross-log parsing and correlation unnecessarily brittle. This change aligns runtime log records on a common envelope: timestamp (ISO-8601 ms), event, and _schema.

  • Emitter format alignment

    • audit.jsonl (Squid): switched to timestamp ISO-8601 with milliseconds, added top-level event.
    • token-usage.jsonl (api-proxy): added event: "token_usage" to persisted records.
    • otel.jsonl (api-proxy file exporter): added _schema + event: "otel_span" and standardized timestamp envelope.
    • access.jsonl (cli-proxy): renamed ts → timestamp.
  • Parser migration support

    • Updated audit JSONL parsing to accept both:
      • new timestamp (ISO string), and
      • legacy ts (epoch float),
        preserving compatibility during rollout.
  • Schema contract updates

    • Updated schemas/audit.schema.json to require timestamp (ISO ms) and event.
    • Updated schemas/token-usage.schema.json to require event (token_usage).
    • Added:
      • schemas/otel-span.schema.json
      • schemas/cli-proxy-access.schema.json
  • Spec/docs normalization

    • Updated config spec log inventory section to make timestamp / event / _schema a normative JSONL requirement.
    • Extended schema index docs to include new otel-span and cli-proxy-access schema artifacts.

Example standardized envelope shape now emitted across JSONL records:

{
  "timestamp": "2026-05-25T09:00:00.000Z",
  "event": "token_usage",
  "_schema": "token-usage/v0.23.1"
}
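
As a rough illustration of the envelope above, here is a minimal TypeScript sketch of how an emitter could stamp every record with the three common fields. The names `LogEnvelope` and `toJsonlLine` are hypothetical and do not come from the AWF codebase; the only behavior relied on is that `Date.prototype.toISOString()` produces a UTC timestamp with millisecond precision.

```typescript
// Hypothetical helper: names are illustrative, not taken from the AWF codebase.
interface LogEnvelope {
  timestamp: string; // ISO-8601 UTC with milliseconds
  event: string;     // snake_case record discriminator
  _schema: string;   // "<record-type>/v<version>"
}

function toJsonlLine(
  event: string,
  schema: string,
  fields: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  // Caller fields are spread first so they cannot overwrite the envelope keys.
  const record: LogEnvelope & Record<string, unknown> = {
    ...fields,
    timestamp: now.toISOString(),
    event,
    _schema: schema,
  };
  // One self-contained JSON object per line.
  return JSON.stringify(record) + "\n";
}

// Example: a token_usage record with a fixed clock for reproducibility.
const line = toJsonlLine(
  "token_usage",
  "token-usage/v0.23.1",
  { provider: "example" },
  new Date(Date.UTC(2026, 4, 25, 9, 0, 0)),
);
```

Spreading the caller's fields before the envelope keys is a design choice of this sketch: it keeps a stray `event` or `timestamp` in the payload from silently replacing the envelope values.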

Copilot AI changed the title [WIP] Standardize JSONL log record format with consistent timestamp and event fields Standardize JSONL log envelopes across AWF runtime logs May 25, 2026
Copilot finished work on behalf of lpcox May 25, 2026 17:17
Copilot AI requested a review from lpcox May 25, 2026 17:17
@lpcox lpcox marked this pull request as ready for review May 25, 2026 17:33
Copilot AI review requested due to automatic review settings May 25, 2026 17:33
@github-actions
Contributor

github-actions Bot commented May 25, 2026

Documentation Preview

Documentation build failed for this PR. View logs.

Built from commit 9dfeb83

@github-actions
Contributor

github-actions Bot commented May 25, 2026

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 96.53% | 96.58% | 📈 +0.05% |
| Statements | 96.37% | 96.42% | 📈 +0.05% |
| Functions | 97.99% | 97.99% | ➡️ +0.00% |
| Branches | 90.80% | 90.82% | 📈 +0.02% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/logs/log-parser.ts | 86.5% → 88.0% (+1.47%) | 85.5% → 87.0% (+1.53%) |
| src/config-writer.ts | 89.3% → 90.9% (+1.65%) | 89.3% → 90.9% (+1.65%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Contributor

Copilot AI left a comment

Pull request overview

This PR standardizes AWF’s JSONL runtime logs onto a consistent top-level envelope (timestamp in ISO-8601 ms, event, and _schema) to make downstream parsing and cross-log correlation less brittle.

Changes:

  • Updated Squid audit.jsonl emission to use timestamp (ISO-8601 ms) and an explicit event.
  • Updated log parsing/tests to accept both new (timestamp) and legacy (ts) audit records.
  • Extended/added JSON Schemas + docs for the standardized envelope across token usage, OTel spans, and CLI proxy access logs.
Show a summary per file

| File | Description |
| --- | --- |
| src/squid/config-generator.ts | Updates Squid audit_jsonl logformat to emit timestamp (ISO ms) and event. |
| src/squid-config-features.test.ts | Adjusts config generation test expectations for new audit JSONL fields. |
| src/logs/log-parser.ts | Updates audit JSONL parser to support ISO timestamp while keeping legacy ts support. |
| src/logs/log-parser.test.ts | Updates/extends tests for current and legacy audit JSONL timestamp formats. |
| containers/cli-proxy/server.js | Renames CLI proxy access log field ts → timestamp. |
| containers/api-proxy/token-tracker.schema.test.js | Updates token-usage schema tests to include/require event. |
| containers/api-proxy/token-persistence.js | Adds event to token-usage persistence record + runtime validation requirements. |
| containers/api-proxy/otel.js | Adds _schema + event to file-exported OTel span JSONL records. |
| containers/api-proxy/otel.test.js | Adds test coverage for OTel JSONL envelope fields (timestamp/event/_schema). |
| schemas/audit.schema.json | Updates audit schema to require timestamp and event (dropping ts). |
| schemas/token-usage.schema.json | Requires event: "token_usage" in token-usage records. |
| schemas/otel-span.schema.json | Adds a new schema for otel.jsonl span records. |
| schemas/cli-proxy-access.schema.json | Adds a new schema for CLI proxy access.jsonl records. |
| schemas/README.md | Updates schema index docs and examples to reflect standardized envelope + new schema artifacts. |
| docs/awf-config-spec.md | Updates spec to make standardized JSONL envelope normative and lists new schema artifacts. |

Copilot's findings

  • Files reviewed: 15/15 changed files
  • Comments generated: 6

Comment thread src/logs/log-parser.ts
Comment on lines +213 to +221
let timestamp = 0;
if (typeof obj.timestamp === 'string') {
  const parsed = Date.parse(obj.timestamp);
  if (!Number.isNaN(parsed)) {
    timestamp = parsed / 1000;
  }
} else if (typeof obj.ts === 'number') {
  timestamp = obj.ts;
}
Comment on lines 116 to 121
const required = [
  ['_schema', 'string'],
  ['timestamp', 'string'],
  ['event', 'string'],
  ['request_id', 'string'],
  ['provider', 'string'],
Comment thread schemas/audit.schema.json Outdated
    },
    "event": {
      "type": "string",
      "enum": ["http_access", "http_access_allowed", "http_access_denied"],
Comment thread docs/awf-config-spec.md
Comment on lines +908 to +913
All AWF JSONL records **MUST** include the following top-level fields:

- `timestamp` (string, required): ISO 8601 UTC with milliseconds (`YYYY-MM-DDTHH:mm:ss.SSSZ`).
- `event` (string, required): Stable snake_case record discriminator.
- `_schema` (string, required): Schema identifier in the form `<record-type>/v<version>`.
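
A minimal TypeScript sketch of how a consumer might check one JSONL line against these three requirements. The function name `envelopeViolations` and the constants are hypothetical. The `timestamp` and `event` patterns mirror the ones in the schema excerpt in this review; the `_schema` pattern is only an assumed reading of `<record-type>/v<version>` and may not match the real schemas.

```typescript
// Illustrative only: function and constant names are assumptions, not AWF code.
const TIMESTAMP_RE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z$/;
const EVENT_RE = /^[a-z0-9]+(?:_[a-z0-9]+)*$/;
// Assumed interpretation of "<record-type>/v<version>"; the real pattern may differ.
const SCHEMA_RE = /^[a-z0-9-]+\/v.+$/;

function envelopeViolations(line: string): string[] {
  let obj: unknown;
  try {
    obj = JSON.parse(line);
  } catch {
    return ["not valid JSON"];
  }
  if (typeof obj !== "object" || obj === null || Array.isArray(obj)) {
    return ["not a JSON object"];
  }
  const rec = obj as Record<string, unknown>;
  const problems: string[] = [];
  // Each envelope field must be a string that matches its pattern.
  const check = (key: string, re: RegExp): void => {
    const value = rec[key];
    if (typeof value !== "string") {
      problems.push(`${key}: missing or not a string`);
    } else if (!re.test(value)) {
      problems.push(`${key}: unexpected format`);
    }
  };
  check("timestamp", TIMESTAMP_RE);
  check("event", EVENT_RE);
  check("_schema", SCHEMA_RE);
  return problems;
}
```

An empty array means the line satisfies the envelope rules as interpreted here; a legacy record that still carries `ts` instead of `timestamp` would be reported rather than rejected outright.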

Comment on lines +6 to +20
  "type": "object",
  "required": ["timestamp", "event"],
  "additionalProperties": true,
  "properties": {
    "timestamp": {
      "type": "string",
      "format": "date-time",
      "pattern": "^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d{3}Z$",
      "description": "ISO 8601 UTC timestamp with millisecond precision."
    },
    "event": {
      "type": "string",
      "pattern": "^[a-z0-9]+(?:_[a-z0-9]+)*$",
      "description": "Stable snake_case event discriminator (e.g. exec_start, exec_done)."
    }
Comment on lines 59 to 63
   * Each line is a self-contained JSON object for easy parsing.
   */
  function accessLog(entry) {
-   const record = { ts: new Date().toISOString(), ...entry };
+   const record = { timestamp: new Date().toISOString(), ...entry };
    const line = JSON.stringify(record);

@github-actions github-actions Bot mentioned this pull request May 25, 2026

@lpcox
Collaborator

lpcox commented May 25, 2026

@copilot address review feedback

Contributor Author

Copilot AI commented May 25, 2026

@copilot address review feedback

All six review issues are addressed in the commit "fix: address review feedback on JSONL log envelope standardization":

  1. log-parser.ts: parseAuditJsonlLine now falls back to obj.ts when obj.timestamp is present but not a valid date string (Date.parse returns NaN). Added a test for this case.

  2. token-persistence.js: validateTokenUsageRecord now explicitly checks record.event === 'token_usage' and logs a token_record_schema_violation warning with expected: 'token_usage' if it differs.

  3. schemas/audit.schema.json: event changed from enum: [...] to const: "http_access" to match what the Squid audit_jsonl logformat actually emits.

  4. docs/awf-config-spec.md: Token-diag event names in the table and example block updated to snake_case (model_alias_resolution_step, model_alias_rewrite) to align with the normative snake_case requirement in §13.4.

  5. containers/api-proxy/server.js + server.models.test.js: diag() calls updated to emit model_alias_resolution_step / model_alias_rewrite; tests updated accordingly.

  6. containers/cli-proxy/server.js + schemas/cli-proxy-access.schema.json: accessLog() now includes _schema: "cli-proxy-access/v<AWF_VERSION>" in every record; schema updated to require _schema with the versioned pattern.
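
For readers following item 1, the fallback order can be sketched roughly as below. This is not the code from log-parser.ts: `resolveAuditTimestamp` is a hypothetical stand-alone function, and it assumes (as the reviewed snippet suggests) that the parser works in epoch seconds and defaults to 0 when neither field is usable.

```typescript
// Hypothetical stand-in for the timestamp handling inside parseAuditJsonlLine.
function resolveAuditTimestamp(obj: { timestamp?: unknown; ts?: unknown }): number {
  // Preferred: new-style ISO-8601 string.
  if (typeof obj.timestamp === "string") {
    const parsed = Date.parse(obj.timestamp);
    if (!Number.isNaN(parsed)) {
      return parsed / 1000; // milliseconds to epoch seconds
    }
  }
  // Reached when timestamp is absent, not a string, or unparseable:
  // fall back to the legacy epoch-float field.
  if (typeof obj.ts === "number") {
    return obj.ts;
  }
  // Assumed default, matching the `let timestamp = 0` in the reviewed snippet.
  return 0;
}
```

The difference from the originally reviewed `if / else if` chain is that an invalid `timestamp` string no longer blocks the legacy `ts` branch.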

Copilot finished work on behalf of lpcox May 25, 2026 17:47
@github-actions
Contributor

🔧 BYOK Smoke Test Results

Status: PARTIAL PASS ⚠️

Note: Running in BYOK offline mode (COPILOT_OFFLINE=true) via api-proxy sidecar.

Current PR: "Standardize JSONL log envelopes across AWF runtime logs"
Author: @Copilot | Assignees: @lpcox, @Copilot

🔑 BYOK report filed by Smoke Copilot BYOK

@github-actions
Contributor

Smoke Test Results

✅ GitHub MCP: PR #3786 "Update ghs_ token detection/redaction for stateless JWT format"
❌ File I/O: Test file not found

Status: FAIL

cc @Copilot @lpcox

📰 BREAKING: Report filed by Smoke Copilot

@github-actions
Contributor

🔬 Smoke Test: API Proxy OpenTelemetry Tracing

Status: ✅ All scenarios validated

Test Results

| Scenario | Result | Notes |
| --- | --- | --- |
| Module Loading | ✅ PASS | otel.js exports 11 functions including startRequestSpan, setTokenAttributes, endSpan, isEnabled |
| Test Suite | ✅ PASS | 33/33 tests passed in otel.test.js covering span creation, token attributes, parent context, OTLP export, file export |
| Env Var Forwarding | ✅ PASS | OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS, GITHUB_AW_OTEL_TRACE_ID, GITHUB_AW_OTEL_PARENT_SPAN_ID forwarded to api-proxy (lines 146-151) |
| Token Tracker Integration | ✅ PASS | onUsage callback exists in token-tracker-http.js:62 and invoked at line 492 in proxy-request.js to call otel.setTokenAttributes() |
| OTEL Env Tests | ✅ PASS | 5/5 OTEL-specific tests pass: auto-forwarding, not-set handling, header vars in tokens, file exporter path |

Architecture Validation

Span lifecycle: startRequestSpan() → setTokenAttributes() via onUsage callback → endSpan() via onSpanEnd callback
GenAI attributes: gen_ai.usage.* semantic conventions set on spans
Parent context: Workflow trace context propagated via GITHUB_AW_OTEL_* env vars
Export paths: OTLP/HTTP via Squid when endpoint configured, file fallback to /var/log/api-proxy/otel.jsonl
Graceful degradation: isEnabled() checks prevent errors when OTEL unconfigured


Conclusion: OTEL tracing integration is fully operational. All hook points validated, test coverage complete.

📡 OTel tracing validated by Smoke OTel Tracing

@github-actions
Contributor

Smoke Codex: FAIL
✅ Merged PRs: Update ghs_ token detection/redaction for stateless JWT format; feat: document model alias logging and wire debugTokens through config
❌ SafeInputs GH: safeinputs-gh unavailable
✅ Playwright: GitHub title verified
❌ Tavily: bridge exposed no tools/results
✅ File/Bash, Discussion, Build

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • registry.npmjs.org

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "registry.npmjs.org"

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

@github-actions
Contributor

Service Connectivity Test Results

❌ Redis: TIMEOUT/ERROR
❌ PostgreSQL pg_isready: no response
❌ PostgreSQL SELECT 1: not tested (pg_isready failed)

Overall: FAIL — services not reachable via host.docker.internal

🔌 Service connectivity validated by Smoke Services

@github-actions
Contributor

Gemini Engine Smoke Test Results

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

💎 Faceted by Smoke Gemini

@github-actions
Contributor

Chroot Runtime Version Comparison

Test results from comparing runtime versions between host and chrooted environment:

| Runtime | Host Version | Chroot Version | Match? |
| --- | --- | --- | --- |
| Python | 3.12.13 | 3.12.3 | ❌ NO |
| Node.js | v24.15.0 | v22.22.3 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall Result: Not all tests passed (1/3 matches)

Analysis

  • Go: ✅ Versions match perfectly
  • Python: Patch version mismatch (3.12.13 vs 3.12.3) - chroot using older patch release
  • Node.js: Major version mismatch (v24 vs v22) - chroot using different major version

The version mismatches indicate the chrooted environment is seeing different runtime installations than the host, which is expected behavior when using selective bind mounts rather than a full filesystem overlay.

Tested by Smoke Chroot

@github-actions
Contributor

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color All passed ✅ PASS
Go env All passed ✅ PASS
Go uuid All passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx All passed ✅ PASS
Node.js execa All passed ✅ PASS
Node.js p-limit All passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

All build and test operations completed successfully across all ecosystems.

Generated by Build Test Suite for issue #3790 · ● 14.7M ·

@lpcox lpcox merged commit 05e2089 into main May 25, 2026
67 of 70 checks passed
@lpcox lpcox deleted the copilot/standardize-jsonl-log-format branch May 25, 2026 18:04

Development

Successfully merging this pull request may close these issues.

Standardize JSONL log record format: consistent timestamp and event fields

3 participants