-
-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Remove the explicit branch hint from const_panic #150386
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Remove the explicit branch hint from const_panic
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Remove the explicit branch hint from const_panic
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Remove the explicit branch hint from const_panic
This comment has been minimized.
This comment has been minimized.
9c026c4 to
ebf865b
Compare
|
@bors try |
This comment has been minimized.
This comment has been minimized.
Remove the explicit branch hint from const_panic
|
Finished benchmarking commit (f6b082c): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -1.9%, secondary -1.9%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -2.4%, secondary 1.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 482.281s -> 482.922s (0.13%) |
|
@rust-timer build f467cc7 |
This comment has been minimized.
This comment has been minimized.
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
96a2530 to
469cf53
Compare
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the context/goal here? I'm guessing it may be related to rust-lang/compiler-builtins#1046, but I don't think this is a direct fix.
One suggestion then r=me for the change itself. Cc @RalfJung in case you have anything.
| pub fn test_assert(x: bool) { | ||
| core::panic::const_assert!(x, "", "",); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be good to put this in the existing likely_assert.rs to make it obvious that they should be acting the same. Also save a per-test time overhead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI, putting codegen tests for multiple similar functions in the same file has a hidden hazard that I discovered in #148849. You need the right CHECK-DAGs in place or the test for a function which is textually earlier can skip past the end of the function and match IR in another function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that is gnarly... specifically it matches IR in a function that is otherwise ignored in the test, right? It can't match one of the functions we check with CHECK-LABEL since then it would later fail to find that label.
This seems worth at least tracking in a separate issue, and eventually we should systematically do that for all codegen tests.
library/core/src/panic.rs
Outdated
| @capture { $($arg: $ty = $arg),* } -> !: | ||
| #[noinline] | ||
| if const #[track_caller] #[inline] { // Inline this, to prevent codegen | ||
| if const #[track_caller] { // #[inline] on the const branch lets it skip codegen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment seems to explain why an attribute is present that's not actually present...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed #[noinline], so I'm basically documenting why it isn't present. Maybe there just no need to document using the default behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to refresh my memory of the macro... the docs seem a bit outdated; it says "The runtime function is always marked as #[inline]" but actually both functions are marked as inline by default and noinline can be used to suppress that.
This here is the only user of #[noinline] so if we don't want that any more, maybe we should just remove it?
Anyway for the comment, the one this comment is attached to indeed does not seem useful. The one a few lines down is confusing. It's unclear what "it" refers to, and also there's no panic_fmt visible here, so... what is t talking about?^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear what "it" refers to
I was intending the two comments to read as one sentence but of course they aren't actually one so it doesn't work.
When assert! fails, it calls a panic entrypoint,, which like all the panic entrypoint functions is #[cold]. Since the failure branch leads directly to a block which is terminated by a call to a #[cold] function, codegen produces a branch hint. This "SwitchInt arm leads to #[cold]" logic is actually how intrinsics::likely is implemented.
But in const_panic! we have this wrapper function around the runtime failure arm to make const_eval_select work. Adding #[inline] to the wrapper function means that the wrapper gets inlined away in MIR and the cold-ness of what it wraps is now visible to codegen. Which brings back the branch hint information that we lose by just removing intrinsics::likely.
Correct. I ran into the const_panic I'm working on an idea that could be a direct fix for that compiler-builtins issue, but this PR should stand on its own merits. |
469cf53 to
037c39d
Compare
|
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
037c39d to
6126ddf
Compare
6126ddf to
315646a
Compare
|
@bors r=tgross35 |
|
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing b082bd5 (parent) -> d874dce (this PR) Test differencesShow 26 test diffs26 doctest diffs were found. These are ignored, as they are noisy. Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard d874dce1252fe409ce13e64d773046853bf5e6ca --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (d874dce): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -2.4%, secondary -0.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -6.8%, secondary 0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary 0.5%, secondary 0.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 482.402s -> 482.826s (0.09%) |
This was asked for in PR review for equivalence with
assert!, but we don't need likely/unlikely intrinsics here (and indeed they are a bit of an antipattern), because codegen knows about the pattern thatassert!produces where only one target of a SwitchInt leads to a block terminated by a#[cold]call. All our panic entrypoints are#[cold].assert!does not use the likely/unlikely intrinsics, maybe it did in the past.So the intrinsic call in
const_assert!should be completely unnecessary. But currently it isn't, becauseconst_panic!stamps out a runtime wrapper for its entrypoint which is not given#[inline], citing compile times. It's not obvious to me why not using#[inline]would even be good in this case.I think the runtime wrapper should have
#[inline]or#[cold], so I'm going to do 3 perf experiments here.Just changing
const_assert!: https://perf.rust-lang.org/compare.html?start=fabece9e9491d0a3c3655dba488881968b7c5f7a&end=f6b082cfcf880ad80b33523d5047c11654d907a8&stat=instructions:uAlso adding
#[inline]toconst_panic!'s runtime wrapper: https://perf.rust-lang.org/compare.html?start=fabece9e9491d0a3c3655dba488881968b7c5f7a&end=f467cc7e49b74b47da0e2cde49fc1a3140e00ff6&stat=instructions:uAlso adding
#[cold]toconst_panic!'s runtime wrapper: https://perf.rust-lang.org/compare.html?start=2e854a9344154564259de51385e9ec9506c0f3b7&end=fd8e098558ff08ac002b05ff78e9f91ffa6f3738&stat=instructions:uThe previous comment indicated perf would be worse with
#[inline]on the runtime branch, but the results in this PR show that it is more mixed, and net positive. And in fact, the only regression in serde seems to be because we created a very tiny additional amount of codegen, but pushed the size of the second CGU just high enough that we no longer merge all CGUs together. So now serde (in debug with incremental disabled) has 2 CGUs instead of 1. If we just so happened to be still below the threshold where the CGUs don't merge, adding#[inline]would have been nearly all improvements.