feat(ci): restore benchmark PR comments + add reliability/chaos reporting #589
jbachorik wants to merge 22 commits into
…ting
- Fix benchmark comments: replace UPSTREAM_* dependency with DDPROF_COMMIT_BRANCH; aggregate all per-cell reports into one comment
- Add reliability/chaos PR comments via dynamic child pipeline gated on test:reliability label (detected in prepare.sh)
- Fix notifier jobs: when:always so failures are reported (not silenced)
- Forward DDPROF_COMMIT_BRANCH/SHA to child pipeline via trigger variables
- Use jq (with grep fallback) for label detection in prepare.sh
- Fix cut -f2 → cut -f2- to preserve full FAIL reason text
- Fix multiline REASON dotenv: grep -m1 + tr -d '\n'
- Fix tr '+' '_' mangling reason value; scope git insteadOf to clone
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
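The cut and grep fixes named in the commit message can be illustrated with a minimal shell sketch. The tab-separated record layout, the sample reason text, and the function name `first_fail_reason` are assumptions made for illustration; they are not taken from the actual scripts.

```shell
#!/bin/sh
# Hypothetical sketch: the record format and all names here are assumptions.

# A FAIL record whose reason text itself contains a tab.
record=$(printf 'FAIL\ttimeout after 30s\tretry 3 of 3')

# cut -f2 keeps only the second tab-separated field, so the reason is truncated.
short_reason=$(printf '%s\n' "$record" | cut -f2)

# cut -f2- keeps field 2 through the end of the line, preserving the full text.
full_reason=$(printf '%s\n' "$record" | cut -f2-)

# A dotenv file holds one KEY=value pair per line, so the value must be a
# single line: grep -m1 stops after the first matching record, and
# tr -d '\n' removes the trailing newline.
first_fail_reason() {
  grep -m1 '^FAIL' | cut -f2- | tr -d '\n'
}
```

With this shape, a line such as `echo "REASON=$(first_fail_reason < results.tsv)" >> build.env` would write exactly one line to the dotenv file (`results.tsv` is likewise a hypothetical name).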
CI Test Results
Run: #27416950375 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled
Summary: Total: 32 | Passed: 32 | Failed: 0
Updated: 2026-06-12 13:11:15 UTC
7258f4c to 484e303
…unsupported in root yml)
… cross-project not allowed)
…s, surface HTTP errors)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: eea64019c2
Reliability & Chaos Results
✅ All reliability & chaos checks passed
Pipeline: https://gitlab.ddbuild.io/DataDog/java-profiler/-/pipelines/118425729
What does this PR do?:
Restores benchmark comparison comments on PRs (broken since the CI pipeline was migrated into the repo) and adds reliability + chaos test result comments triggered by the test:reliability label.
Motivation:
Benchmark PR comments stopped posting after the pipeline migration from java-profiler-build because the old scripts depended on UPSTREAM_PROJECT_NAME/UPSTREAM_BRANCH, which are never set in the in-repo pipeline. Reviewers lost per-PR benchmark regression visibility. Separately, reliability/chaos test results were only surfaced via Slack on scheduled runs; there was no way to see them on a PR.
Additional Notes:
Key design decisions:
- post-pr-comment.sh calls removed; a new aggregation job (post-benchmarks-pr-comment) collects all comparison-baseline-vs-candidate_*.md artifacts and posts one comment.
- Reliability/chaos PR comments come from a dynamic child pipeline gated on the test:reliability label (test:reliability → generate-reliability-child-pipeline → run-reliability-tests trigger). The label is detected in prepare.sh via jq (with grep fallback) and exported as RUN_RELIABILITY=true to build.env. Scheduled runs on main are untouched.
- Notifier jobs use when: always so failures are reported (with when: on_success they would be silenced exactly when needed most).
- DDPROF_COMMIT_BRANCH is forwarded to the child pipeline via trigger: variables: since forward: pipeline_variables doesn't carry dotenv artifact variables across the child pipeline boundary.
- chaos_check.sh already has a built-in fallback to rebuild chaos.jar inline and download ddprof from Maven snapshots, so the child pipeline doesn't need chaos:build / build-artifact parent artifacts.
How to test the change?:
- … test:reliability to a PR, then verify that the reliability child pipeline triggers and that a "Reliability & Chaos" comment is posted after the jobs complete.
- … when: on_success).
For Datadog employees:
- … credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
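The label detection described under the key design decisions (jq with a grep fallback, exporting RUN_RELIABILITY=true to build.env) might look roughly like the sketch below. It is not the actual prepare.sh: the function names, the argument layout, and the assumption that the labels arrive as a JSON array of objects with a "name" field are all hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch, not the real prepare.sh. Assumes the PR labels are
# available as a JSON array of objects that each carry a "name" field.

# has_label LABEL  (JSON on stdin); succeeds when LABEL is present.
has_label() {
  _label=$1
  _json=$(cat)
  if command -v jq >/dev/null 2>&1; then
    # jq -e derives the exit status from the last output (true -> 0, false -> 1).
    printf '%s' "$_json" | jq -e --arg l "$_label" 'any(.[]; .name == $l)' >/dev/null
  else
    # Approximate fallback: drop all whitespace from both sides, then look for
    # the serialized "name":"LABEL" pair as a fixed string.
    _needle=$(printf '%s' "\"name\":\"$_label\"" | tr -d '[:space:]')
    printf '%s' "$_json" | tr -d '[:space:]' | grep -qF "$_needle"
  fi
}

# export_reliability_flag LABELS_JSON ENV_FILE
# Appends RUN_RELIABILITY=true to the dotenv file when the label is set.
export_reliability_flag() {
  if printf '%s' "$1" | has_label "test:reliability"; then
    echo "RUN_RELIABILITY=true" >>"$2"
  fi
}
```

The grep branch is deliberately cruder than the jq branch (it does not parse JSON), which is why it is only used when jq is unavailable.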