Skip to content

feat: aggregate CI message per commit, not per workflow run#104

Merged
SimonThormeyer merged 1 commit into
hackathon-2026from
simon/feat/aggregate-per-commit
Jun 18, 2026
Merged

feat: aggregate CI message per commit, not per workflow run#104
SimonThormeyer merged 1 commit into
hackathon-2026from
simon/feat/aggregate-per-commit

Conversation

@SimonThormeyer

Copy link
Copy Markdown
Member

No description provided.

@SimonThormeyer SimonThormeyer requested a review from a team as a code owner June 18, 2026 14:49
@claude

claude Bot commented Jun 18, 2026

Copy link
Copy Markdown

PR Review: aggregate CI message per commit, not per workflow run

Overall this is a clean, well-structured change. Aggregating CI status into a single per-commit message is a nice UX improvement over one message per workflow run, and the extraction of GitHubMessageSender.kt / CommitCiMessageFormatter reads well — small private helpers, clear naming, sensible constants instead of magic strings. A few things worth a look before merging.

🐞 Potential bug — race condition on the per-commit message

This is the main concern. Previously each workflow run had its own message keyed by workflowRunId, so concurrent webhook deliveries never contended. Now all runs for a commit collapse onto one message, and the create/edit path is a non-atomic read-modify-write across several Redis ops:

  1. updateCommitCiWorkflow(...) returns summary.messageId (read from Redis)
  2. the SDK sends a new message or edits the existing one based on that id
  3. rememberCommitCiMessageId(...) writes the new id back

GitHub commonly delivers multiple workflow_run events for the same head_sha within the same instant (several workflows completing together). Ktor serves these concurrently, so two handlers can both observe messageId == null, both call WireMessage.Text.create, and produce two duplicate aggregate messages for the same commit — after which the last rememberCommitCiMessageId wins and one message is orphaned (never edited again). Even when an id already exists, concurrent edits can interleave and post stale content.

Worth considering a per-(repo, headSha, conversation) lock or a Redis-side atomic guard (e.g. SET key val NX to elect the single creator) around the create-vs-edit decision. At minimum it's worth documenting the assumption if you believe deliveries are effectively serial in practice.

🧹 Test coverage

The two ApplicationTest cases were updated correctly and the new content assertions (Check suite status: Running/Passed, per-workflow lines) are a good addition. However, the most complex new logic is untested: GitHubWebhookManager is mocked, so updateCommitCiWorkflow / commitCiWorkflows — JSON (de)serialization of CommitCiWorkflow, set membership, case-insensitive sort, and the rewired commit_ci_keys cleanup set — have no direct coverage. A focused unit test against the manager (with an embedded/mocked Redis) would protect the aggregation and the cleanup-key bookkeeping, which is exactly the part most likely to regress silently.

Also untested: the headline aggregation rules in CommitCiMessageFormatter — e.g. mixed states (one failed + one running → Failed), all-cancelled → Completed. These are pure functions and cheap to cover.

⚠️ Minor / nits

  • Unbounded key growth. commit_ci_keys accumulates every commit's keys for a repo and is only cleared on full repo-inactivity cleanup; individual commit entries have no TTL. Same shape as the old workflow_run_messages set, so no regression, but per-commit fan-out makes it grow faster. A TTL on the commit-CI keys (commits go stale quickly) would bound this naturally.
  • Re-runs duplicate rows. Workflows are keyed by runId, so re-running a workflow yields a second entry with the same name (e.g. two Build lines). If that's undesirable, keying by workflow name rather than run id would dedupe.
  • commitCiWorkflows does N+1 Redis round-trips (smembers then a get per member). Fine at current scale; mget would collapse it if commits ever carry many workflows.
  • Dead branch in TemplateHandler. handleEvent now returns null for EVENT_WORKFLOW_RUN, but Routing already branches on that event before consulting the template handler, so the EVENT_WORKFLOW_RUN -> null arm is unreachable. Could be dropped for clarity.
  • The isValidGithubWebhookSignature extraction is a nice readability win and preserves behavior. 👍

Nice work overall — the structure and naming are a clear improvement. The race condition is the one item I'd want resolved (or explicitly ruled out) before merge.

🤖 Automated review

@SimonThormeyer SimonThormeyer force-pushed the simon/feat/aggregate-per-commit branch from 59f8be1 to 188f878 Compare June 18, 2026 14:59
@SimonThormeyer SimonThormeyer merged commit 188f878 into hackathon-2026 Jun 18, 2026
1 check passed
@SimonThormeyer SimonThormeyer deleted the simon/feat/aggregate-per-commit branch June 18, 2026 14:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants