Validate benchmark result shapes #953

Open
fallintoplace wants to merge 1 commit into ClickHouse:main from fallintoplace:fix-snowflake-result-validation

@fallintoplace

What changed

  • Fixed the Snowflake L result row where a comma in place of a decimal point produced four values: [1.511,1.385,1,440] is now [1.511,1.385,1.440].
  • Added a missing null row to the active Pandas c6a.metal result so it has the expected 43 query rows.
  • Added validate-results.py for the main ClickBench result layout. Active dashboard entries now fail validation if their result matrix is not 43 queries x 3 runs, or if timings are not numbers/null. Historical and inactive shape issues are reported as warnings.
  • Added a GitHub Actions workflow to run the validator on pull requests and pushes to main.
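
The shape rule described above (43 query rows, 3 timings each, every timing a number or null) can be sketched as follows. This is a minimal illustration and not the code in validate-results.py: the names `is_timing` and `shape_problems` are hypothetical, and it assumes the timings live under a top-level `result` key, as the jq check in this PR implies.

```python
EXPECTED_QUERIES = 43  # query rows per result file, per this PR
EXPECTED_RUNS = 3      # timings per query row, per this PR


def is_timing(value):
    """A timing is null (None) or a JSON number.

    bool is excluded explicitly because it subclasses int in Python.
    """
    if value is None:
        return True
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def shape_problems(result):
    """Return a list of problems; an empty list means the matrix is well-formed.

    Hypothetical helper for illustration only.
    """
    if not isinstance(result, list):
        return ["result is not a list"]
    problems = []
    if len(result) != EXPECTED_QUERIES:
        problems.append(
            f"expected {EXPECTED_QUERIES} query rows, found {len(result)}"
        )
    for index, row in enumerate(result, start=1):
        if not isinstance(row, list) or len(row) != EXPECTED_RUNS:
            problems.append(
                f"query {index}: expected {EXPECTED_RUNS} timings, found {row!r}"
            )
        elif not all(is_timing(v) for v in row):
            problems.append(f"query {index}: non-numeric timing in {row!r}")
    return problems


# Usage: a well-formed matrix, then one whose last row has four values.
good = [[1.0, 0.9, None] for _ in range(43)]
bad = good[:42] + [[1.511, 1.385, 1, 440]]
print(shape_problems(good))  # []
print(shape_problems(bad))
```

Returning a list of messages rather than raising on the first problem makes it straightforward to treat the same findings as errors for active entries and as warnings for historical ones.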

Why

Plain JSON parsing accepts a row like [1.511,1.385,1,440] as four valid numbers, but the dashboard expects three timings per query. Hot-run comparisons can then pick up a spurious 1.000-second timing instead of the intended 1.440 seconds.
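
To make the failure mode concrete, here is a small sketch using Python's standard `json` module. The `hot_run` helper is hypothetical, and it assumes the hot-run figure is the minimum of the second and third timings; the dashboard's actual logic may differ.

```python
import json


def hot_run(row):
    """Hypothetical helper: assume the hot run is the faster of runs 2 and 3."""
    return min(row[1], row[2])


broken = json.loads("[1.511,1.385,1,440]")  # comma typed instead of a decimal point
fixed = json.loads("[1.511,1.385,1.440]")

print(len(broken))      # 4: the parser raises no error, the row just has an extra value
print(hot_run(broken))  # 1: the stray integer is read as the third run
print(hot_run(fixed))   # 1.385
```

Because the broken row is still valid JSON, only a shape check on the row length can catch it.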

Validation

  • python3 -B validate-results.py
  • python3 -B validate-results.py /Users/hoangvu/Code/OSS/ClickBench
  • jq -e '(.result | length) == 43 and all(.result[]; type == "array" and length == 3 and all(.[]; . == null or type == "number"))' snowflake/results/20220701/l.json pandas/results/20260218/c6a.metal.json

The validator currently exits 0 and reports warnings for pre-existing historical or inactive files.
