Skip to content

Fix #359: configuration-aware cquery hashing#363

Merged
tinder-maxwellelliott merged 1 commit into
masterfrom
claude/fix-issue-359
May 22, 2026
Merged

Fix #359: configuration-aware cquery hashing#363
tinder-maxwellelliott merged 1 commit into
masterfrom
claude/fix-issue-359

Conversation

@tinder-maxwellelliott
Copy link
Copy Markdown
Collaborator

Summary

Resolves the symptom called out in #359: under --useCquery, two configured graphs that differ only in per-edge configuration (a cfg = \"exec\" transition, a --platforms swap, a --config=... change, or a --define toggling a select() inside a dep) collapsed to identical hashes and bazel-diff returned an empty impacted set. The cause is in cli/src/main/kotlin/com/bazel_diff/bazel/BazelRule.kt:

rule.configuredRuleInputList.map { it.label }

.map { it.label } discards ConfiguredRuleInput.configurationChecksum, so the per-edge configuration never makes it into the hash.

The fix

  • Fold each cquery dep edge's configurationChecksum into the rule-input string via a new separator |, which is not a valid character in a Bazel label ([A-Za-z0-9/._+=,@~-]* plus : between package and target). The encoded form "<label> |<configurationChecksum>" cannot collide with any real input and survives the existing List<String> return type of BazelRule.ruleInputList.
  • RuleHasher writes the full encoded string into the transitive hash (so two configured graphs differing only in per-edge configuration now produce distinct parent digests) but splits on | to recover the bare label for allRulesMap / sourceDigests lookups and dep tracking — so the user-visible deps in the generated JSON stay clean.
  • Empty-checksum fallback (a known Bazel quirk where cquery's transitions=lite output sometimes omits the checksum) returns the bare label unchanged, preserving pre-Configuration-aware hashing under --useCquery (parity with target-determinator) #359 behaviour for that path.

This is the minimal symptom-fixing slice of #359 — the parent target's hash now reflects per-edge configuration changes, so the empty-impacted set under cquery is gone. Full target-determinator parity (per-(label, configuration) digest entries, bazel config --output=json enumeration, empty-checksum inheritance from the depending target) is intentionally out of scope here: those would change the on-disk JSON schema and deserve a separate design discussion on the issue.

Hash-format implication

This is a hash-format change: hashes generated by this version are not comparable to hashes from earlier versions under --useCquery. That is the intended behaviour — the old hashes were missing the configuration signal. Non-cquery hashes are unaffected.

Tests

Builds on the reproducer test merged in #361:

  • Flip the @Ignore off the reproducer (testCqueryRuleInputListDistinguishesConfigurationChecksum_issue359) and drop the two _currentBehaviourBuggy_* companions — the bug they pinned is no longer current.
  • Add testCqueryRuleInputListEncodesLabelAndChecksum_issue359 to pin the on-the-wire encoding so the format is observable and future changes are intentional.
  • Add testCqueryRuleInputListEmptyChecksumFallsBackToBareLabel_issue359 to cover the Bazel-quirk fallback.
  • Add testNonCqueryRuleInputListIgnoresConfiguredRuleInput_issue359 to confirm the encoding does not leak into the non-cquery path.
  • Add testDecodeConfiguredRuleInputLabelRoundTrip_issue359 to pin the round-trip behaviour RuleHasher relies on (the bare-label decode is what guards deps from being polluted with |<checksum> suffixes).

Test plan

  • bazel test //cli:BazelRuleTest //cli:BuildGraphHasherTest //cli:CalculateImpactedTargetsInteractorTest //cli:CalculateImpactedTargetsInteractorIssue335Test //cli:CalculateImpactedTargetsInteractorModuleQueryTest //cli:SourceFileHasherTest //cli:TargetHashTest //cli:DeserialiseHashesInteractorTest //cli:ModuleGraphParserTest //cli:BazelClientTest //cli:BazelTargetTest — all 11 pass locally.
  • CI: full E2E matrix (ubuntu+macos × bazel 7/8/9) — //cli:E2ETest runs the cquery-with-platform-swap fixture (testUseCqueryWithExternalDependencyChange / testUseCqueryWithAndroidCodeChange), which is the e2e equivalent of the reproducer. Local E2E was skipped due to a JDK-sandbox issue unrelated to this change.
  • CI: 90% main-source line-coverage gate.

Refs #359, #361

🤖 Generated with Claude Code

Under --useCquery, BazelRule.ruleInputList() previously folded each
ConfiguredRuleInput to its bare label via `.map { it.label }`, dropping
`configurationChecksum`. Two configured graphs that shared dep labels
but differed in per-edge configuration -- a `cfg = "exec"` transition
flip, a --platforms swap, a --config=... change, or a `--define`
toggling a select() inside a dep -- collapsed to identical
ruleInputList() output, so RuleHasher's transitive walk produced
identical digests and bazel-diff returned an empty impacted set under
cquery mode.

This change folds each dep edge's `configurationChecksum` into the
rule-input string using `|` -- a character that is invalid in a Bazel
label, so the encoding can never collide with a real input and survives
the existing `List<String>` return type. The full encoded string is
what gets mixed into RuleHasher's transitive hash, so two configured
graphs differing only by per-edge configuration now produce distinct
parent digests. RuleHasher splits on the separator to recover the bare
label for `allRulesMap` / `sourceDigests` lookups and dep tracking, so
the user-visible deps in the generated JSON stay clean.

The empty-checksum fallback (a known Bazel quirk where cquery's
`transitions=lite` output sometimes omits the checksum for an edge)
returns the bare label unchanged, preserving pre-#359 behaviour for
that path.

This is a hash-format change: hashes generated by this version are not
comparable to hashes from earlier versions under --useCquery, which is
the intended behaviour -- the old hashes were missing the
configuration signal.

Tests:
* Unignore the PR #361 reproducer
  (`testCqueryRuleInputListDistinguishesConfigurationChecksum_issue359`)
  and drop the two `_currentBehaviourBuggy_*` companions.
* Add `testCqueryRuleInputListEncodesLabelAndChecksum_issue359` to pin
  the on-the-wire encoding.
* Add `testCqueryRuleInputListEmptyChecksumFallsBackToBareLabel_issue359`
  to cover the Bazel-quirk fallback.
* Add `testNonCqueryRuleInputListIgnoresConfiguredRuleInput_issue359`
  to confirm the encoding does not leak into the non-cquery path.
* Add `testDecodeConfiguredRuleInputLabelRoundTrip_issue359` to pin the
  round-trip behaviour RuleHasher relies on.

This is the minimal symptom-fixing slice of #359 -- the parent target's
hash now reflects per-edge configuration changes, so the empty-impacted
set under cquery is gone. Full target-determinator parity (per-(label,
configuration) digest entries, `bazel config --output=json`
enumeration, empty-checksum inheritance from the depending target) is
intentionally out of scope here; those would change the on-disk JSON
schema and warrant a separate design discussion on the issue.

Refs #359, #361

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tinder-maxwellelliott tinder-maxwellelliott merged commit 82a62ee into master May 22, 2026
15 checks passed
@tinder-maxwellelliott tinder-maxwellelliott deleted the claude/fix-issue-359 branch May 22, 2026 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant