-
-
Notifications
You must be signed in to change notification settings - Fork 667
Add mul() and umul() overloads with overflow detection #22233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Thanks for your pull request and interest in making D better, @Adityazzzzz! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment. Bugzilla referencesYour PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog. Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub run digger -- build "master + dmd#22233" |
|
As I don't remember the math for integer ops, I am asking Gemini to help with the PR. After looking at the op itself, that seems like it is ok, however it looks like you're missing some tests, please look over these and see if yours can be improved (don't copy them 1:1 confirm & cleanup first): // --- Setup for additional test cases ---
Cent C1 = {lo: 1UL, hi: 0UL};
// Max ULONG for low part
Cent MAX_L = {lo: ulong.max, hi: 0UL};
// Represents 2^63 (0x8000...)
Cent C2_63 = {lo: 0UL, hi: 0x8000000000000000UL};
// Represents 2^126 (0x4000...)
Cent C2_126 = {lo: 0UL, hi: 0x4000000000000000UL};
// Represents 2^128 + 1 (for cross-check)
Cent Cross_A = {lo: 1UL, hi: 0UL};
Cent Cross_B = {lo: ulong.max, hi: 1UL}; // B = 2^64*1 + (2^64-1) = 2^128 - 1 + 2^64
// --- ADDITIONS TO UNIT TEST ---
// 1. Unsigned Corner Cases (Cross-Product Overflow)
// Cross_A (1) * Cross_B (~2^128) should overflow due to cross-product carries.
res = umul(Cross_A, Cross_B, overflow);
assert(overflow);
// Max Low * Max Low (2^64-1)^2 = 2^128 - 2^65 + 1. Should NOT overflow.
res = umul(MAX_L, MAX_L, overflow);
assert(!overflow);
assert(res.hi == 0UL); // Should only populate low part (r0, r1)
// 2. Signed Corner Cases (Magnitude and Zero/Identity)
// S-Max Mag: 2^126 * 2 = 2^127. Valid magnitude for MinS. Should NOT overflow.
res = smul(C2_126, C2, overflow);
assert(!overflow);
// Result should be exactly MinS (0x8000...)
assert(res.lo == minS.lo && res.hi == minS.hi);
// Zero check (signed)
res = smul(minS, C2_63, overflow); // MinS * 2^63
assert(overflow); // (-2^127) * 2^63 = -2^190. Guaranteed signed overflow.
// Trivial Zero check
res = smul(maxU, Cent(0,0), overflow);
assert(!overflow);
assert(res.lo == 0UL && res.hi == 0UL); |
|
Using div is way too expensive a check. Think: c256 = a128 * b128;
result = c256.lo;
excess = c256.hi;When unsigned, overflow is checked by testing that the excess precision is zero overflowed = excess != 0;When signed, the sign of the top half of the widened result should agree with the bottom half. if (long(a128.hi) < 0)
excess += -b128;
if (long(b128.hi) < 0)
excess += -a128;
if (long(result.hi) < 0)
overflowed = ~excess != 0;
else
overflowed = excess != 0; |
|
The current implementation of mul ignores any carry in the excess precision bits. So maybe a slower implementation of mul should be added that computes and returns the excess precision via an |
@rikkimax is this gemini coming up with this? If you can point me to the proof where |
@rikkimax again, a simple C __int128 program test shows that the bit representation is |
@rikkimax and again. At this point, I'd rather not go through the other examples, as they are up to this point all demonstrably false. |
|
Yes Iain Gemini came up with them, and no I don't trust them, hence why I said don't copy them and verify them. The tests look less than full range of inputs. Hence, the suggestion to consider. |
b708f1c to
818b3f7
Compare
|
@WalterBright @ibuclaw @rikkimax I have updated the implementation based on your feedback to avoid the expensive division check.
Please review |
818b3f7 to
3b3e772
Compare
3b3e772 to
4c1053a
Compare
|
@ibuclaw @rikkimax Thanks for the detailed review! I have addressed all your comments in the latest push:
Ready for re-review! |
|
|
||
| assert(mul(C9_3, C10, overflow) == C90_30); | ||
| assert(!overflow); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind the trailing whitespace on some lines, there are a few other examples dotted around this changeset.
|
Thanks for the review! I've applied the suggestions in the latest push:
Ready for another look! |
|
@ibuclaw Can you review this , if everything is alright in this PR, suitable for merging?? |
As requested by @rikkimax and @ibuclaw in Issue #22170.
This PR adds
mul(signed) andumul(unsigned) overloads that accept anout bool overflowparameter.Changes:
umul(Cent, Cent, out bool)for unsigned overflow detection.mul(Cent, Cent, out bool)for signed overflow detection.prod / a == b) to verify results.Cent.minchecks).This allows callers to detect if a multiplication result has exceeded the 128-bit storage limit.