Fix #359: configuration-aware cquery hashing#363
Merged
Conversation
Under --useCquery, BazelRule.ruleInputList() previously folded each
ConfiguredRuleInput to its bare label via `.map { it.label }`, dropping
`configurationChecksum`. Two configured graphs that shared dep labels
but differed in per-edge configuration -- a `cfg = "exec"` transition
flip, a --platforms swap, a --config=... change, or a `--define`
toggling a select() inside a dep -- collapsed to identical
ruleInputList() output, so RuleHasher's transitive walk produced
identical digests and bazel-diff returned an empty impacted set under
cquery mode.
This change folds each dep edge's `configurationChecksum` into the
rule-input string using `|` -- a character that is invalid in a Bazel
label, so the encoding can never collide with a real input and survives
the existing `List<String>` return type. The full encoded string is
what gets mixed into RuleHasher's transitive hash, so two configured
graphs differing only by per-edge configuration now produce distinct
parent digests. RuleHasher splits on the separator to recover the bare
label for `allRulesMap` / `sourceDigests` lookups and dep tracking, so
the user-visible deps in the generated JSON stay clean.
The empty-checksum fallback (a known Bazel quirk where cquery's
`transitions=lite` output sometimes omits the checksum for an edge)
returns the bare label unchanged, preserving pre-#359 behaviour for
that path.
This is a hash-format change: hashes generated by this version are not
comparable to hashes from earlier versions under --useCquery, which is
the intended behaviour -- the old hashes were missing the
configuration signal.
Tests:
* Unignore the PR #361 reproducer
(`testCqueryRuleInputListDistinguishesConfigurationChecksum_issue359`)
and drop the two `_currentBehaviourBuggy_*` companions.
* Add `testCqueryRuleInputListEncodesLabelAndChecksum_issue359` to pin
the on-the-wire encoding.
* Add `testCqueryRuleInputListEmptyChecksumFallsBackToBareLabel_issue359`
to cover the Bazel-quirk fallback.
* Add `testNonCqueryRuleInputListIgnoresConfiguredRuleInput_issue359`
to confirm the encoding does not leak into the non-cquery path.
* Add `testDecodeConfiguredRuleInputLabelRoundTrip_issue359` to pin the
round-trip behaviour RuleHasher relies on.
This is the minimal symptom-fixing slice of #359 -- the parent target's
hash now reflects per-edge configuration changes, so the empty-impacted
set under cquery is gone. Full target-determinator parity (per-(label,
configuration) digest entries, `bazel config --output=json`
enumeration, empty-checksum inheritance from the depending target) is
intentionally out of scope here; those would change the on-disk JSON
schema and warrant a separate design discussion on the issue.
Refs #359, #361
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Resolves the symptom called out in #359: under
--useCquery, two configured graphs that differ only in per-edge configuration (acfg = \"exec\"transition, a--platformsswap, a--config=...change, or a--definetoggling aselect()inside a dep) collapsed to identical hashes and bazel-diff returned an empty impacted set. The cause is incli/src/main/kotlin/com/bazel_diff/bazel/BazelRule.kt:rule.configuredRuleInputList.map { it.label }.map { it.label }discardsConfiguredRuleInput.configurationChecksum, so the per-edge configuration never makes it into the hash.The fix
configurationChecksuminto the rule-input string via a new separator|, which is not a valid character in a Bazel label ([A-Za-z0-9/._+=,@~-]*plus:between package and target). The encoded form"<label> |<configurationChecksum>"cannot collide with any real input and survives the existingList<String>return type ofBazelRule.ruleInputList.RuleHasherwrites the full encoded string into the transitive hash (so two configured graphs differing only in per-edge configuration now produce distinct parent digests) but splits on|to recover the bare label forallRulesMap/sourceDigestslookups and dep tracking — so the user-visibledepsin the generated JSON stay clean.transitions=liteoutput sometimes omits the checksum) returns the bare label unchanged, preserving pre-Configuration-aware hashing under --useCquery (parity with target-determinator) #359 behaviour for that path.This is the minimal symptom-fixing slice of #359 — the parent target's hash now reflects per-edge configuration changes, so the empty-impacted set under cquery is gone. Full target-determinator parity (per-
(label, configuration)digest entries,bazel config --output=jsonenumeration, empty-checksum inheritance from the depending target) is intentionally out of scope here: those would change the on-disk JSON schema and deserve a separate design discussion on the issue.Hash-format implication
This is a hash-format change: hashes generated by this version are not comparable to hashes from earlier versions under
--useCquery. That is the intended behaviour — the old hashes were missing the configuration signal. Non-cquery hashes are unaffected.Tests
Builds on the reproducer test merged in #361:
@Ignoreoff the reproducer (testCqueryRuleInputListDistinguishesConfigurationChecksum_issue359) and drop the two_currentBehaviourBuggy_*companions — the bug they pinned is no longer current.testCqueryRuleInputListEncodesLabelAndChecksum_issue359to pin the on-the-wire encoding so the format is observable and future changes are intentional.testCqueryRuleInputListEmptyChecksumFallsBackToBareLabel_issue359to cover the Bazel-quirk fallback.testNonCqueryRuleInputListIgnoresConfiguredRuleInput_issue359to confirm the encoding does not leak into the non-cquery path.testDecodeConfiguredRuleInputLabelRoundTrip_issue359to pin the round-trip behaviourRuleHasherrelies on (the bare-label decode is what guardsdepsfrom being polluted with|<checksum>suffixes).Test plan
bazel test //cli:BazelRuleTest //cli:BuildGraphHasherTest //cli:CalculateImpactedTargetsInteractorTest //cli:CalculateImpactedTargetsInteractorIssue335Test //cli:CalculateImpactedTargetsInteractorModuleQueryTest //cli:SourceFileHasherTest //cli:TargetHashTest //cli:DeserialiseHashesInteractorTest //cli:ModuleGraphParserTest //cli:BazelClientTest //cli:BazelTargetTest— all 11 pass locally.//cli:E2ETestruns the cquery-with-platform-swap fixture (testUseCqueryWithExternalDependencyChange/testUseCqueryWithAndroidCodeChange), which is the e2e equivalent of the reproducer. Local E2E was skipped due to a JDK-sandbox issue unrelated to this change.Refs #359, #361
🤖 Generated with Claude Code