fix: add PR quick benchmark mode and timeout to resolve CI timeout issue #7

Merged
konard merged 3 commits into main from issue-6-469804e9e1db on Mar 11, 2026

Conversation

@konard
Member

@konard konard commented Mar 11, 2026

Summary

Fixes #6 — Benchmark CI job exceeded GitHub Actions' 6-hour time limit.

Root Cause

The Benchmark job had no timeout and ran with Criterion's default settings (100 samples, 5s measurement per benchmark). Each SpacetimeDB benchmark requires ~8,000 WebSocket round trips per iteration (3,000 background links × 2 ops + 1,000 benchmark links × 2 ops), which takes ~8 seconds of wall-clock time per iteration. With 100 samples, that is ~800s (~13 min) per SpacetimeDB benchmark; across 7 benchmarks it comes to ~94 minutes for SpacetimeDB alone. Cleanup overhead on top of that pushed the total past 6 hours.
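
The arithmetic above can be reproduced with a short sketch. It assumes ~1 ms per WebSocket round trip (the figure used in the commit message) and one timed iteration per Criterion sample, which is a simplification of how Criterion schedules iterations. The function names are illustrative and do not come from the repository.

```rust
// Back-of-the-envelope estimate of the SpacetimeDB benchmark runtime.
// Assumptions (not measured here): ~1 ms per WebSocket round trip and
// one timed iteration per Criterion sample.

/// Round trips for one iteration: every link costs `ops_per_link` operations.
fn round_trips_per_iteration(background_links: u64, benchmark_links: u64, ops_per_link: u64) -> u64 {
    (background_links + benchmark_links) * ops_per_link
}

/// Estimated wall-clock milliseconds for one benchmark.
fn benchmark_millis(round_trips: u64, rtt_millis: u64, samples: u64) -> u64 {
    round_trips * rtt_millis * samples
}

fn main() {
    // Full scale: 3,000 background links, 1,000 benchmark links, 2 ops each.
    let rtts = round_trips_per_iteration(3_000, 1_000, 2);
    // Criterion default of 100 samples, assumed 1 ms per round trip.
    let per_benchmark_ms = benchmark_millis(rtts, 1, 100);
    // 7 SpacetimeDB benchmarks in the suite.
    let all_ms = per_benchmark_ms * 7;

    println!("round trips per iteration: {}", rtts);
    println!("per benchmark: {} s", per_benchmark_ms / 1_000);
    println!("7 SpacetimeDB benchmarks: {} min", all_ms / 60_000);
}
```

Calling `round_trips_per_iteration(30, 10, 2)` with the reduced PR-mode link counts gives the ~80 round trips quoted in the changes below.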

Full root cause analysis with timeline, performance calculations, and CI logs:
docs/case-studies/issue-6/README.md

Changes

.github/workflows/rust-benchmark.yml

  1. New benchmark-pr job (pull_request events only, timeout-minutes: 20):

    • BENCHMARK_LINK_COUNT=10, BACKGROUND_LINK_COUNT=30 — reduces RTTs per iteration from ~8,000 to ~80
    • --sample-size 10 --warm-up-time 1 --measurement-time 2 --nresamples 1000 — fast Criterion settings
    • Expected runtime: ~3–5 minutes total for all 35 benchmarks
    • Results uploaded as artifacts (not committed to repo)
  2. Updated benchmark job (push to main only, timeout-minutes: 180):

    • Same full scale: BENCHMARK_LINK_COUNT=1000, BACKGROUND_LINK_COUNT=3000
    • Added --sample-size 20 --nresamples 10000 — reduces runtime from ~2h+ to ~30–45 minutes while still producing statistically meaningful results
    • Added timeout-minutes: 180 hard safety limit
  3. Fixed test job timeout from 360 min to 30 min (tests complete in ~3 min)
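
For context, a benchmark harness could pick up the two scale variables roughly as sketched here. This is an assumption about how the values might be consumed, not the repository's actual code: `parse_link_count` and `link_count_from_env` are hypothetical helpers, and the fallback defaults of 1,000 and 3,000 are taken from the full-scale values listed above.

```rust
use std::env;

// Hypothetical helper: parse a link count, falling back to `default`
// when the value is missing, not a number, or zero.
fn parse_link_count(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(default)
}

// Hypothetical helper: read the named environment variable and parse it.
fn link_count_from_env(name: &str, default: usize) -> usize {
    parse_link_count(env::var(name).ok().as_deref(), default)
}

fn main() {
    // Variable names come from the workflow; the defaults are assumed.
    let benchmark_links = link_count_from_env("BENCHMARK_LINK_COUNT", 1_000);
    let background_links = link_count_from_env("BACKGROUND_LINK_COUNT", 3_000);
    println!(
        "benchmark links: {}, background links: {}",
        benchmark_links, background_links
    );
}
```

Falling back to the full-scale defaults means a run without the variables behaves like the main-branch job, so the quick mode has to be requested explicitly by the PR workflow.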

docs/case-studies/issue-6/README.md — Deep case study analysis:

  • Timeline of events
  • Root cause analysis with performance calculations
  • SpacetimeDB round-trip overhead breakdown
  • Why defaults caused 6h+ runtime
  • Proposed and implemented solutions
  • Saved CI logs from the failing run

changelog.d/20260311_benchmark_timing_fix.md — Changelog entry


This PR was created automatically by the AI issue solver

Adding .gitkeep for PR creation (default mode).
This file will be removed when the task is complete.

Issue: #6
@konard konard self-assigned this Mar 11, 2026
Root cause: Criterion's default 100-sample measurement combined with SpacetimeDB's
synchronous round-trip per operation (~8000 RTTs × ~1ms = ~8s/iter) caused each of
the 7 SpacetimeDB benchmarks to run ~13 minutes (~94 minutes combined). With cleanup
overhead on top, the job ran >6h and hit GitHub Actions' hard limit.

Changes:
- Add `benchmark-pr` job for pull_request events: runs with reduced scale
  (BENCHMARK_LINK_COUNT=10, BACKGROUND_LINK_COUNT=30) and tight Criterion settings
  (--sample-size 10, --warm-up-time 1, --measurement-time 2) for ~3-5 min total.
  Results uploaded as artifacts but not committed.
- Update `benchmark` job (main branch only): use --sample-size 20 --nresamples 10000
  for ~30-45 min runtime (was potentially 6h+ with defaults). Add timeout-minutes: 180.
- Fix `test` job timeout from 360 min to 30 min (tests complete in ~3 min).
- Add case study analysis at docs/case-studies/issue-6/README.md with root cause,
  timeline, and detailed performance calculations.
- Add changelog entry.

Fixes #6.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@konard konard changed the title from "[WIP] We need to find a way to execute benchmark for debug for Pull Requests in just under 10 minutes and it should fit 3 hours in commits to main branch" to "fix: add PR quick benchmark mode and timeout to resolve CI timeout issue" on Mar 11, 2026
@konard konard marked this pull request as ready for review March 11, 2026 17:56
@konard
Member Author

konard commented Mar 11, 2026

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost estimation:

  • Public pricing estimate: $1.588971
  • Calculated by Anthropic: $4.815901 USD
  • Difference: $3.226930 (+203.08%)
  • 📎 Log file uploaded as Gist (4241KB)
  • 🔗 View complete solution draft log

The working session has now ended. Feel free to review the solution draft and add any feedback.

@konard
Member Author

konard commented Mar 11, 2026

✅ Ready to merge

This pull request is now ready to be merged:

  • All CI checks have passed
  • No merge conflicts
  • No pending changes

Monitored by hive-mind with --auto-restart-until-mergeable flag

@konard konard merged commit 76f7f87 into main Mar 11, 2026
15 checks passed

Development

Successfully merging this pull request may close these issues.

We need to find a way to execute benchmark for debug for Pull Requests in just under 10 minutes and it should fit 3 hours in commits to main branch
