Skip to content

Conversation

@saethlin
Copy link
Member

@saethlin saethlin commented Dec 26, 2025

This was asked for in PR review for equivalence with assert!, but we don't need likely/unlikely intrinsics here (and indeed they are a bit of an antipattern), because codegen knows about the pattern that assert! produces where only one target of a SwitchInt leads to a block terminated by a #[cold] call. All our panic entrypoints are #[cold].

assert! does not use the likely/unlikely intrinsics, maybe it did in the past.

So the intrinsic call in const_assert! should be completely unnecessary. But currently it isn't, because const_panic! stamps out a runtime wrapper for its entrypoint which is not given #[inline], citing compile times. It's not obvious to me why not using #[inline] would even be good in this case.

I think the runtime wrapper should have #[inline] or #[cold], so I'm going to do 3 perf experiments here.
Just changing const_assert!: https://perf.rust-lang.org/compare.html?start=fabece9e9491d0a3c3655dba488881968b7c5f7a&end=f6b082cfcf880ad80b33523d5047c11654d907a8&stat=instructions:u
Also adding #[inline] to const_panic!'s runtime wrapper: https://perf.rust-lang.org/compare.html?start=fabece9e9491d0a3c3655dba488881968b7c5f7a&end=f467cc7e49b74b47da0e2cde49fc1a3140e00ff6&stat=instructions:u
Also adding #[cold] to const_panic!'s runtime wrapper: https://perf.rust-lang.org/compare.html?start=2e854a9344154564259de51385e9ec9506c0f3b7&end=fd8e098558ff08ac002b05ff78e9f91ffa6f3738&stat=instructions:u

The previous comment indicated perf would be worse with #[inline] on the runtime branch, but the results in this PR show that it is more mixed, and net positive. And in fact, the only regression in serde seems to be because we created a very tiny additional amount of codegen, but pushed the size of the second CGU just high enough that we no longer merge all CGUs together. So now serde (in debug with incremental disabled) has 2 CGUs instead of 1. If we just so happened to be still below the threshold where the CGUs don't merge, adding #[inline] would have been nearly all improvements.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Dec 26, 2025
@saethlin
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

rust-bors bot added a commit that referenced this pull request Dec 26, 2025
Remove the explicit branch hint from const_panic
@rust-bors

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 26, 2025
@rust-log-analyzer

This comment has been minimized.

@rust-bors
Copy link

rust-bors bot commented Dec 26, 2025

☀️ Try build successful (CI)
Build commit: f6b082c (f6b082cfcf880ad80b33523d5047c11654d907a8, parent: fabece9e9491d0a3c3655dba488881968b7c5f7a)

@rust-timer

This comment has been minimized.

@saethlin
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rust-bors

This comment has been minimized.

rust-bors bot added a commit that referenced this pull request Dec 26, 2025
Remove the explicit branch hint from const_panic
@saethlin
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

rust-bors bot added a commit that referenced this pull request Dec 26, 2025
Remove the explicit branch hint from const_panic
@rust-bors

This comment has been minimized.

@saethlin saethlin force-pushed the no-const-assert-likely branch from 9c026c4 to ebf865b Compare December 26, 2025 04:31
@saethlin
Copy link
Member Author

@bors try

@rust-bors

This comment has been minimized.

rust-bors bot added a commit that referenced this pull request Dec 26, 2025
Remove the explicit branch hint from const_panic
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (f6b082c): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.2% [-0.2%, -0.2%] 1
Improvements ✅
(secondary)
-0.0% [-0.1%, -0.0%] 3
All ❌✅ (primary) -0.2% [-0.2%, -0.2%] 1

Max RSS (memory usage)

Results (primary -1.9%, secondary -1.9%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
2.3% [2.3%, 2.3%] 1
Regressions ❌
(secondary)
2.6% [1.8%, 3.6%] 3
Improvements ✅
(primary)
-6.1% [-6.1%, -6.1%] 1
Improvements ✅
(secondary)
-4.1% [-4.9%, -3.7%] 6
All ❌✅ (primary) -1.9% [-6.1%, 2.3%] 2

Cycles

Results (primary -2.4%, secondary 1.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
4.2% [1.8%, 7.0%] 4
Improvements ✅
(primary)
-2.4% [-2.4%, -2.4%] 1
Improvements ✅
(secondary)
-2.6% [-2.9%, -2.4%] 3
All ❌✅ (primary) -2.4% [-2.4%, -2.4%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 482.281s -> 482.922s (0.13%)
Artifact size: 392.44 MiB -> 390.42 MiB (-0.51%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 26, 2025
@rust-bors
Copy link

rust-bors bot commented Dec 26, 2025

☀️ Try build successful (CI)
Build commit: f467cc7 (f467cc7e49b74b47da0e2cde49fc1a3140e00ff6, parent: fabece9e9491d0a3c3655dba488881968b7c5f7a)

@saethlin
Copy link
Member Author

@rust-timer build f467cc7

@rust-timer

This comment has been minimized.

@saethlin
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot removed the perf-regression Performance regression. label Dec 26, 2025
@saethlin saethlin force-pushed the no-const-assert-likely branch from 96a2530 to 469cf53 Compare December 26, 2025 19:20
@rustbot
Copy link
Collaborator

rustbot commented Dec 26, 2025

tgross35 is currently at their maximum review capacity.
They may take a while to respond.

@saethlin saethlin marked this pull request as ready for review December 26, 2025 19:24
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 26, 2025
Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the context/goal here? I'm guessing it may be related to rust-lang/compiler-builtins#1046, but I don't think this is a direct fix.

One suggestion then r=me for the change itself. Cc @RalfJung in case you have anything.

View changes since this review

Comment on lines 6 to 8
pub fn test_assert(x: bool) {
core::panic::const_assert!(x, "", "",);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to put this in the existing likely_assert.rs to make it obvious that they should be acting the same. Also save a per-test time overhead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, putting codegen tests for multiple similar functions in the same file has a hidden hazard that I discovered in #148849. You need the right CHECK-DAGs in place or the test for a function which is textually earlier can skip past the end of the function and match IR in another function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that is gnarly... specifically it matches IR in a function that is otherwise ignored in the test, right? It can't match one of the functions we check with CHECK-LABEL since then it would later fail to find that label.

This seems worth at least tracking in a separate issue, and eventually we should systematically do that for all codegen tests.

@capture { $($arg: $ty = $arg),* } -> !:
#[noinline]
if const #[track_caller] #[inline] { // Inline this, to prevent codegen
if const #[track_caller] { // #[inline] on the const branch lets it skip codegen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment seems to explain why an attribute is present that's not actually present...?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed #[noinline], so I'm basically documenting why it isn't present. Maybe there just no need to document using the default behavior?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to refresh my memory of the macro... the docs seem a bit outdated; it says "The runtime function is always marked as #[inline]" but actually both functions are marked as inline by default and noinline can be used to suppress that.

This here is the only user of #[noinline] so if we don't want that any more, maybe we should just remove it?

Anyway for the comment, the one this comment is attached to indeed does not seem useful. The one a few lines down is confusing. It's unclear what "it" refers to, and also there's no panic_fmt visible here, so... what is t talking about?^^

Copy link
Member Author

@saethlin saethlin Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear what "it" refers to

I was intending the two comments to read as one sentence but of course they aren't actually one so it doesn't work.

When assert! fails, it calls a panic entrypoint,, which like all the panic entrypoint functions is #[cold]. Since the failure branch leads directly to a block which is terminated by a call to a #[cold] function, codegen produces a branch hint. This "SwitchInt arm leads to #[cold]" logic is actually how intrinsics::likely is implemented.

But in const_panic! we have this wrapper function around the runtime failure arm to make const_eval_select work. Adding #[inline] to the wrapper function means that the wrapper gets inlined away in MIR and the cold-ness of what it wraps is now visible to codegen. Which brings back the branch hint information that we lose by just removing intrinsics::likely.

@saethlin
Copy link
Member Author

saethlin commented Dec 28, 2025

I'm guessing it may be related to rust-lang/compiler-builtins#1046, but I don't think this is a direct fix.

Correct. I ran into the const_panic do_panic::runtime wrapper for the nth time while looking into that issue. Every time I see it not optimized out in immediate-abort builds, I grumble to myself then check the code and see the comment saying it is justified. In the specific context that it popped up recently I realized that the inlining should result in less code so I figured I should redo the perf experiment and see exactly what happens.

I'm working on an idea that could be a direct fix for that compiler-builtins issue, but this PR should stand on its own merits.

@saethlin saethlin force-pushed the no-const-assert-likely branch from 469cf53 to 037c39d Compare December 29, 2025 04:29
@rustbot
Copy link
Collaborator

rustbot commented Dec 29, 2025

Some changes occurred to the intrinsics. Make sure the CTFE / Miri interpreter
gets adapted for the changes, if necessary.

cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr

@rustbot
Copy link
Collaborator

rustbot commented Dec 29, 2025

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@saethlin saethlin force-pushed the no-const-assert-likely branch from 037c39d to 6126ddf Compare December 29, 2025 04:59
@saethlin saethlin force-pushed the no-const-assert-likely branch from 6126ddf to 315646a Compare December 29, 2025 15:30
@saethlin
Copy link
Member Author

@bors r=tgross35

@bors
Copy link
Collaborator

bors commented Dec 29, 2025

📌 Commit 315646a has been approved by tgross35

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 29, 2025
@bors
Copy link
Collaborator

bors commented Dec 30, 2025

⌛ Testing commit 315646a with merge d874dce...

@bors
Copy link
Collaborator

bors commented Dec 30, 2025

☀️ Test successful - checks-actions
Approved by: tgross35
Pushing d874dce to main...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 30, 2025
@bors bors merged commit d874dce into rust-lang:main Dec 30, 2025
12 checks passed
@rustbot rustbot added this to the 1.94.0 milestone Dec 30, 2025
@github-actions
Copy link
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing b082bd5 (parent) -> d874dce (this PR)

Test differences

Show 26 test diffs

26 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard d874dce1252fe409ce13e64d773046853bf5e6ca --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. dist-aarch64-msvc: 7267.6s -> 5657.4s (-22.2%)
  2. aarch64-msvc-2: 7404.7s -> 5861.9s (-20.8%)
  3. x86_64-rust-for-linux: 3134.0s -> 2639.5s (-15.8%)
  4. x86_64-gnu-tools: 3711.0s -> 3202.5s (-13.7%)
  5. pr-check-1: 2023.2s -> 1750.0s (-13.5%)
  6. i686-gnu-1: 8529.5s -> 7423.7s (-13.0%)
  7. test-various: 7531.3s -> 6634.2s (-11.9%)
  8. x86_64-gnu-gcc: 3442.3s -> 3049.0s (-11.4%)
  9. x86_64-msvc-ext2: 5615.8s -> 6225.6s (+10.9%)
  10. x86_64-gnu-llvm-20: 4900.3s -> 4397.2s (-10.3%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (d874dce): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

mean range count
Regressions ❌
(primary)
0.6% [0.5%, 0.7%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.4% [-0.5%, -0.4%] 2
Improvements ✅
(secondary)
-0.1% [-0.1%, -0.0%] 3
All ❌✅ (primary) 0.1% [-0.5%, 0.7%] 4

Max RSS (memory usage)

Results (primary -2.4%, secondary -0.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.1% [2.1%, 2.1%] 1
Improvements ✅
(primary)
-2.4% [-3.6%, -0.4%] 3
Improvements ✅
(secondary)
-3.3% [-3.3%, -3.3%] 1
All ❌✅ (primary) -2.4% [-3.6%, -0.4%] 3

Cycles

Results (primary -6.8%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
2.2% [2.0%, 2.4%] 2
Improvements ✅
(primary)
-6.8% [-6.8%, -6.8%] 1
Improvements ✅
(secondary)
-4.4% [-4.4%, -4.4%] 1
All ❌✅ (primary) -6.8% [-6.8%, -6.8%] 1

Binary size

Results (primary 0.5%, secondary 0.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

mean range count
Regressions ❌
(primary)
0.6% [0.5%, 0.7%] 3
Regressions ❌
(secondary)
0.7% [0.7%, 0.7%] 2
Improvements ✅
(primary)
-0.0% [-0.0%, -0.0%] 1
Improvements ✅
(secondary)
-0.1% [-0.1%, -0.1%] 1
All ❌✅ (primary) 0.5% [-0.0%, 0.7%] 4

Bootstrap: 482.402s -> 482.826s (0.09%)
Artifact size: 390.86 MiB -> 390.83 MiB (-0.01%)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants