
Conversation

@yug49 (Contributor) commented Dec 13, 2025

Related to #229

Hey!

This PR introduces a benchmarking system for Event Scanner, using Criterion.rs to measure the performance impact of changes to the scanner. It's currently a draft; Bencher CI integration is coming in a follow-up.


What's Included

New benches Crate Structure

benches/
├── Cargo.toml                           # Benchmark crate config
├── src/
│   └── lib.rs                           # Shared utilities (Anvil setup, contract deployment, event generation)
└── benches/
    ├── historic_scanning.rs             # Historic mode benchmarks
    └── latest_events_scanning.rs        # Latest events mode benchmarks
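
For context, the shared utilities in benches/src/lib.rs roughly take this shape (a sketch based on the description above and the snippets reviewed below; internals elided):

// benches/src/lib.rs (sketch): the real implementation spawns Anvil,
// deploys the counter contract, and emits the requested number of events.
pub struct BenchConfig {
    pub event_count: u64,
}

impl BenchConfig {
    pub fn new(event_count: u64) -> Self {
        Self { event_count }
    }
}

pub struct BenchEnvironment {
    // Anvil handle, provider, deployed contract address, etc.
}

pub async fn setup_environment(
    config: BenchConfig,
) -> Result<BenchEnvironment, Box<dyn std::error::Error>> {
    // 1. Spawn a local Anvil node.
    // 2. Deploy the bench counter contract.
    // 3. Send `config.event_count` transactions, each emitting one event.
    todo!("see benches/src/lib.rs for the actual implementation")
}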

Benchmarks Implemented

Mode            Event Counts        What It Measures
Historic        10K, 50K, 100K      Time to scan all events from block 0 to latest
Latest Events   100, 1K, 10K, 50K   Time to fetch the N most recent events from a 100K event pool
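
For orientation, here is a minimal sketch of how one of these parameterized Criterion benchmarks is wired up (illustrative only; the real benches drive the scanner against a live Anvil node rather than a stub):

use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};

fn historic_scanning_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("historic_scanning");
    for &event_count in &[10_000u64, 50_000, 100_000] {
        // Reporting elements/second makes regressions visible as throughput drops.
        group.throughput(Throughput::Elements(event_count));
        group.bench_with_input(
            BenchmarkId::new("events", event_count),
            &event_count,
            |b, &count| {
                b.iter(|| {
                    // Stub standing in for the actual historic scan.
                    std::hint::black_box(count)
                });
            },
        );
    }
    group.finish();
}

criterion_group!(benches, historic_scanning_benchmark);
criterion_main!(benches);

This parameterization is what produces IDs like historic_scanning/events/10000 in the reports below.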

Example output for a Historic run (screenshot omitted).


How Regression Testing Works

Criterion stores baseline results in target/criterion/<benchmark>/base/. On subsequent runs, it compares new measurements against this baseline and reports:

historic_scanning/events/10000
                        time:   [30.963 ms 36.506 ms 40.598 ms]
                        thrpt:  [246.32 Kelem/s 273.93 Kelem/s 322.97 Kelem/s]
                 change: [-2.12% -1.01% +0.12%] (p = 0.12 > 0.05)
                        No change in performance detected.

If a change introduces a regression, you'll see something like:

                 change: [+15.2% +18.4% +21.1%] (p = 0.00 < 0.05)
                        Performance has regressed.

This makes it easy to catch slowdowns before merging.


Running Benchmarks

# All benchmarks
cargo bench --manifest-path benches/Cargo.toml

# Specific benchmark
cargo bench --manifest-path benches/Cargo.toml --bench historic_scanning
cargo bench --manifest-path benches/Cargo.toml --bench latest_events_scanning

# Filter by event count
cargo bench --manifest-path benches/Cargo.toml -- "historic_scanning/events/10000"
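
Criterion also supports named baselines, which helps when comparing a branch against main locally (standard Criterion CLI flags):

# Save a named baseline, then compare a branch against it
cargo bench --manifest-path benches/Cargo.toml -- --save-baseline main
cargo bench --manifest-path benches/Cargo.toml -- --baseline main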

Next Steps

  • Bencher CI integration: Add GitHub Actions workflow for on-demand benchmarking with historical tracking via Bencher

@yug49 (Contributor, Author) commented Dec 17, 2025

Hey @0xNeshi,
I've made the requested changes to the Criterion part.
Please take a look; if everything's good, I can move forward with the Bencher integration.

@0xNeshi (Collaborator) commented Dec 17, 2025

All good 👍

@yug49 (Contributor, Author) commented Dec 21, 2025

GM @0xNeshi,
I have completed the integration of Bencher with the following setup:

Workflow Architecture

Three GitHub Actions workflows handle different scenarios:

  1. benchmarks.yml - Runs on push to main and manual dispatch. This workflow uploads benchmark results to Bencher and establishes the baseline for regression detection. Path filters ensure benchmarks only run when relevant code changes (src/, benches/, Cargo.toml, Cargo.lock).

  2. pr_benchmarks_run.yml - Runs benchmarks on pull requests. This workflow does not have access to secrets, making it safe for fork PRs. Results are saved as artifacts.

  3. pr_benchmarks_track.yml - Triggered after the PR benchmark run completes. This workflow downloads the artifacts and uploads them to Bencher for comparison against the base branch. It posts comparison results as comments on the PR.

Required Setup

  • BENCHER_API_TOKEN as a repository secret
  • BENCHER_PROJECT as a repository variable

How It Works

When code is merged to main, benchmarks run and upload results to establish the baseline. For pull requests, benchmarks run in a fork-safe workflow and results are compared against the baseline. The --start-point-reset flag ensures PR branches remain ephemeral and do not accumulate historical data.
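
For reference, the upload step in these workflows boils down to a bencher run invocation roughly like the following (a sketch following Bencher's fork-PR documentation; the exact flags live in the workflow files):

bencher run \
  --project "$BENCHER_PROJECT" \
  --token "$BENCHER_API_TOKEN" \
  --branch "$GITHUB_HEAD_REF" \
  --start-point "$GITHUB_BASE_REF" \
  --start-point-reset \
  --testbed ubuntu-latest \
  --adapter rust_criterion \
  "cargo bench --manifest-path benches/Cargo.toml"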

yug49 marked this pull request as ready for review on December 21, 2025 at 22:10
@0xNeshi (Collaborator) left a comment:

Excellent work, let's polish now

  with:
    egress-policy: audit

- name: Free up disk space
0xNeshi (Collaborator) commented:

Is this step required for our use case? Does event scanner benching need this additional space?

yug49 (Contributor, Author) commented:

Yes, I added this step because running the benches was taking a lot of disk space; it sometimes exceeded the GitHub runners' limit (~14 GB).

Here: (screenshot of the disk-space failure omitted)

Even after adding this step, CI was often timing out on the latest-events benches, because generating 100,000 events is a heavy operation that can exceed the job time limit.

Sometimes it was stuck at this step for over 30 minutes:

Run cargo bench --manifest-path benches/Cargo.toml --bench latest_events_scanning 2>&1 | tee latest_results.txt
    Updating crates.io index
   Compiling event-scanner v0.9.0-alpha (/home/runner/work/Event-Scanner/Event-Scanner)
   Compiling event-scanner-benches v0.9.0-alpha (/home/runner/work/Event-Scanner/Event-Scanner/benches)
    Finished `bench` profile [optimized] target(s) in 17.61s
     Running benches/latest_events_scanning.rs (target/release/deps/latest_events_scanning-c7b9440175034d98)
Gnuplot not found, using plotters backend
Setting up environment with 100000 total events...

To counter this, I reduced the total events for this case from 100k to 50k, which I think is still a reasonable number for real-world scenarios.

Rust compilation is the main disk consumer and is unaffected by event count. The cleanup step runs in under 5 seconds, which is a negligible cost: it's a cheap safety net against flaky failures.

0xNeshi (Collaborator) commented:

Alright, makes sense.

Once blocks are loaded into Anvil using a dump file, double-check whether this step becomes redundant and whether it's possible to keep the event count at 100,000.
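
(For reference, Anvil can persist and restore chain state, so the expensive event generation could happen once; the path below is illustrative:)

# Populate the chain once; state is written out on exit
anvil --dump-state benches/state/events-100k.json

# Later bench runs restore the pre-populated chain without re-sending transactions
anvil --load-state benches/state/events-100k.json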

}
}

assert_eq!(log_count, expected_count, "expected {expected_count} events, got {log_count}");
0xNeshi (Collaborator) commented:

On second thought, this validation unnecessarily affects the bench results while also being somewhat redundant: the project already has extensive integration tests with this validation.

Remove the assertions and stop tracking the log count.

Comment on lines +66 to +69
let env: BenchEnvironment = rt.block_on(async {
    let config = BenchConfig::new(event_count);
    setup_environment(config).await.expect("failed to setup benchmark environment")
});
0xNeshi (Collaborator) commented:

There's really no reason to set up a new Anvil node for each bench run; it just adds to total bench execution time.
Let's make this optimization to both benches.

Instead of setting up a new Anvil node for each event_count case, let's do the following:

  1. set up a single Anvil node with 100,000 events
  2. get the latest block number
  3. for the historic bench:
    • bench with 3 different block ranges that roughly correspond to:
      1. the first 1/10 of all blocks
      2. the first 1/2 of all blocks
      3. all blocks
  4. for latest events, leave as-is, i.e. bench the latest 10k events, then 50k, then 100k (all) events (see the sketch below)
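
A rough sketch of that shape (assumed names: latest_block_number is a hypothetical helper on BenchEnvironment, and the scan body is stubbed out):

use criterion::{BenchmarkId, Criterion};

fn historic_scanning_benchmark(c: &mut Criterion) {
    let rt = tokio::runtime::Runtime::new().expect("failed to create tokio runtime");

    // One shared environment with 100,000 events, reused by every case.
    let env = rt.block_on(async {
        setup_environment(BenchConfig::new(100_000))
            .await
            .expect("failed to setup benchmark environment")
    });
    let latest = env.latest_block_number(); // hypothetical helper

    let mut group = c.benchmark_group("historic_scanning");
    for (label, end_block) in [("tenth", latest / 10), ("half", latest / 2), ("all", latest)] {
        group.bench_function(BenchmarkId::new("blocks", label), |b| {
            b.iter(|| {
                // Scan blocks 0..=end_block against the shared Anvil node.
                std::hint::black_box(end_block)
            });
        });
    }
    group.finish();
}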

0xNeshi (Collaborator) commented:

Related #254 (comment)

}

fn historic_scanning_benchmark(c: &mut Criterion) {
    let rt = tokio::runtime::Runtime::new().expect("failed to create tokio runtime");
0xNeshi (Collaborator) commented:

Create a singleton tokio runtime that can be reused across bench runs.
Something like:

use std::sync::OnceLock;

static RUNTIME: OnceLock<tokio::runtime::Runtime> = OnceLock::new();

fn get_runtime() -> &'static tokio::runtime::Runtime {
    RUNTIME.get_or_init(|| {
        tokio::runtime::Runtime::new().expect("failed to create tokio runtime")
    })
}

fn historic_scanning_benchmark(c: &mut Criterion) {
    let rt = get_runtime();
    // ... rest of benchmark ...
}

sol! {
// Built directly with solc 0.8.30+commit.73712a01.Darwin.appleclang
#[sol(rpc, bytecode="608080604052346015576101b0908161001a8239f35b5f80fdfe6080806040526004361015610012575f80fd5b5f3560e01c90816306661abd1461016157508063a87d942c14610145578063d732d955146100ad5763e8927fbc14610048575f80fd5b346100a9575f3660031901126100a9575f5460018101809111610095576020817f7ca2ca9527391044455246730762df008a6b47bbdb5d37a890ef78394535c040925f55604051908152a1005b634e487b7160e01b5f52601160045260245ffd5b5f80fd5b346100a9575f3660031901126100a9575f548015610100575f198101908111610095576020817f53a71f16f53e57416424d0d18ccbd98504d42a6f98fe47b09772d8f357c620ce925f55604051908152a1005b60405162461bcd60e51b815260206004820152601860248201527f436f756e742063616e6e6f74206265206e6567617469766500000000000000006044820152606490fd5b346100a9575f3660031901126100a95760205f54604051908152f35b346100a9575f3660031901126100a9576020905f548152f3fea2646970667358221220471585b420a1ad0093820ff10129ec863f6df4bec186546249391fbc3cdbaa7c64736f6c634300081e0033")]
contract BenchCounter {
0xNeshi (Collaborator) commented:

Suggested change:
- contract BenchCounter {
+ contract Counter {

Align the name with the other contracts in the examples.

// - 10 samples (iterations)
// - Long measurement time to accommodate heavy loads
group.sample_size(10);
group.measurement_time(std::time::Duration::from_secs(120));
0xNeshi (Collaborator) commented:
Let's also increase the warm-up time to at least 5 seconds.
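
For reference, this is a one-liner alongside the existing group configuration (warm_up_time is a standard Criterion method):

group.warm_up_time(std::time::Duration::from_secs(5));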

yug49 and others added 4 commits December 22, 2025 19:32 (co-authored by Nenad <xinef.it@gmail.com>)
@LeoPatOZ (Collaborator) commented:
@0xNeshi what do you think about creating multiple dump files and starting anvil from that state instead of having to recreate it every time we start a bench

@0xNeshi (Collaborator) commented Dec 23, 2025

> @0xNeshi what do you think about creating multiple dump files and starting anvil from that state instead of having to recreate it every time we start a bench

Even better 👍
