Skip to content

testrunner: don't mask a parent test that fails on its own#5698

Open
pietern wants to merge 2 commits into
mainfrom
fix-integration
Open

testrunner: don't mask a parent test that fails on its own#5698
pietern wants to merge 2 commits into
mainfrom
fix-integration

Conversation

@pietern

@pietern pietern commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

The integration test runner masks failures listed in known_failures.txt. A known-failure rule names a specific subtest (e.g. TestAccept/ssh/connection), but a parent test also fails whenever one of its subtests fails, so the runner additionally allowed the parent to fail. It did this by matching a rule against the parent bidirectionally: a rule whose pattern was a descendant of the failing test also matched.

That allowance was unconditional. When the top-level TestAccept failed on its own — before any subtest ran — its failure still matched any TestAccept/... rule and was silently swallowed. The recent RequireRuff setup check (#5662) does exactly this on the integration runners (ruff isn't installed there), so TestAccept has been failing on every environment while the nightly reports green.

This moves the parent-cascade handling out of rule matching and into failure analysis:

  • ConfigRule.matches reverts to plain forward matching — a rule matches a test or its subtree, nothing about parents.
  • checkFailures allows a failure when a rule matches or a subtest of it also failed. Go reports a parent's failure only after its subtests, so the failing subtest is already recorded when the parent is evaluated; no second pass is needed.

Net effect: a parent failing because of a (known or unknown) subtest is still not double-counted, but a test that fails on its own with no failing subtest now surfaces as unexpected.

This only makes the failure visible. Installing ruff on the integration runners is handled separately.

Tests

  • TestCheckFailures exercises checkFailures end-to-end: parent-with-failing-subtest (allowed), parent-alone (the ruff case, surfaced), unlisted failure (surfaced), and fail-then-pass-on-retry (allowed).
  • TestConfigRuleMatches reverse-match cases now assert that a rule does not match a parent of the listed test.

This pull request and its description were written by Isaac.

A known-failure rule names a specific subtest, but a parent test also fails
whenever one of its subtests fails. The runner allowed the parent to fail by
matching rules bidirectionally: a rule whose pattern was a descendant of the
failing test also matched. That allowance was unconditional, so the top-level
TestAccept failing on its own (before any subtest ran) matched any TestAccept/...
rule and was silently swallowed.

Move parent-cascade handling out of rule matching and into failure analysis:
matches() reverts to plain forward matching, and a failure is allowed when a
rule matches or a subtest of it also failed. Go reports a parent's failure only
after its subtests, so the failing subtest is already recorded when the parent
is evaluated.

Co-authored-by: Isaac
@pietern pietern temporarily deployed to test-trigger-is June 24, 2026 09:35 — with GitHub Actions Inactive
@pietern pietern temporarily deployed to test-trigger-is June 24, 2026 09:35 — with GitHub Actions Inactive
@pietern pietern requested review from andrewnester and denik June 24, 2026 09:38
Comment thread tools/testrunner/main.go
if unexpectedFailures[key] {
fmt.Printf("%s %s passed on retry\n", result.Package, result.Test)
delete(unexpectedFailures, key)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can do this, although a bit hard to see what's going.

Alternative, possibly simpler is to place all the version checks into dedicated subtest so it's not just TestAccept that fails but TestAccept/min-versions

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ we should probably do it regardless, much better to see that it's TestAccept/min-versions having issues rather than TestAccept which tells nothing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can move it into a subtest for clarity and -run filtering, not as a replacement for this change.

The ancestor of a "acceptable failure" failing without the entry itself failing must never be allowed.

Made the subtest change here: #5712

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it should not. I've always thought of TestAccept as empty shell that cannot fail as everything is in subtests (I think this was the first case it did) but if it does fail, the consequences are not great, so worth extra safety.

@eng-dev-ecosystem-bot

eng-dev-ecosystem-bot commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator

Integration test report

Commit: aaead19

Run: 28129557080

Env 🟨​KNOWN 🔄​flaky 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
🟨​ aws linux 7 2 1 13 241 1024 8:52
🟨​ aws windows 7 1 13 245 1022 8:18
💚​ aws-ucws linux 8 13 333 940 5:56
💚​ aws-ucws windows 8 13 335 938 6:37
💚​ azure linux 2 15 246 1022 4:31
💚​ azure windows 2 15 248 1020 5:00
💚​ azure-ucws linux 2 15 338 936 6:28
💚​ azure-ucws windows 2 15 340 934 5:36
💚​ gcp linux 2 15 245 1024 4:33
💚​ gcp windows 2 15 247 1022 5:10
23 interesting tests: 13 SKIP, 7 KNOWN, 2 flaky, 1 RECOVERED
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🟨​ TestAccept 🟨​K 🟨​K 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🙈​ TestAccept/bundle/invariant/no_drift 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🙈​ TestAccept/bundle/resources/postgres_branches/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/replace_existing 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/update_protected 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/without_branch_id 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_projects/update_display_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/synced_database_tables/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_indexes/recreate/embedding_dimension 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/ssh/connection 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🔄​ TestSecretsPutSecretBytesValue 🔄​f ✅​p 🙈​s 🙈​s ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p
🔄​ TestSecretsPutSecretStringValue 🔄​f ✅​p 🙈​s 🙈​s ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p
💚​ TestFetchRepositoryInfoAPI_FromRepo 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
Top 5 slowest tests (at least 2 minutes):
duration env testname
3:36 gcp windows TestAccept
3:19 azure-ucws windows TestAccept
3:15 azure windows TestAccept
3:12 aws linux TestSecretsPutSecretStringValue
3:06 aws-ucws windows TestAccept

@pietern pietern temporarily deployed to test-trigger-is June 24, 2026 21:04 — with GitHub Actions Inactive
@pietern pietern temporarily deployed to test-trigger-is June 24, 2026 21:04 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants