Update effects for acquire and release operations#8399
Conversation
One of the primary use cases of `EffectAnalyzer` is to determine whether one expression (or a sequence of expressions) can be reordered before or after another expression (or sequence of expressions). Until now, the check whether effects "invalidate" each other and would prevent reordering has been symmetrical, checking read-write, write-write, and write-read conflicts as well as symmetric conditions that would prevent reordering. But acquire load and release store operations break this symmetry. Consider an expression A containing and an acquire load (and no other effects) and an expression B containing a release store (and no other effects), where we know the accesses do not alias. Then if A is before B, it is not valid to reorder the expressions to have B before A because the acquire load cannot move forward past the release store and the release store cannot move back past the acquire load. However, if B is before A, it _is_ valid to reorder them, since memory accesses are generally allowed to move forward past acquire loads and back past release stores. Update `EffectAnalyzer` to take these allowed reorderings into account by tracking the strictest memory orders used to read or write to shared locations in the analyzed expressions. Also update it to differentiate between shared and unshared memories, structs, and arrays to account for both the fact that accesses to shared and unshared locations can never alias and the fact that accesses to unshared locations can never participate in synchronization. While we're at it, treat effects due to e.g. nullability of reference parameters and mutability of fields more uniformly across expressions. Now that the effect comparison to see whether reordering is prevented is no longer symmetric, rename it from `invalidates` to the more descriptive `orderedBefore`, where `A.orderedBefore(B)` returns true iff `A` ocurring before `B` implies that there is an ordering constraint that would prevent reordering `A` and `B` past each other. To accomodate users trying to move expressions in either direction, also add an `A.orderedAfter(B)` method for the case where `A` occurs after `B`. WIP because I need to add more testing for the effect analysis itself and for the passes to ensure they are performing effect comparisons in the correct directions.
| if (effects.hasExternalBreakTargets()) { | ||
| o << "hasExternalBreakTargets\n"; | ||
| } | ||
| o << "}"; |
There was a problem hiding this comment.
The { on line 23 will be lonely after this
src/ir/effects.h
Outdated
| parent.writeOrder = std::max(parent.writeOrder, curr->order); | ||
| } else { | ||
| parent.writesMemory = true; | ||
| } |
There was a problem hiding this comment.
can perhaps use a helper for this repeating pattern? void writesMemory(Name memory, Order order); etc.
| // We don't want this to be moved out of loops, but it doesn't otherwises | ||
| // matter much how it gets reordered. Say we transfer control as a coarse | ||
| // approximation of this. | ||
| parent.branchesOut = true; |
There was a problem hiding this comment.
This could be in an earlier PR perhaps? Just for bisection purposes later.
src/ir/effects.h
Outdated
| parent.readsSharedMemory = true; | ||
| } else { | ||
| parent.readsMemory = true; | ||
| } |
There was a problem hiding this comment.
This pattern also repeats a lot
src/ir/effects.h
Outdated
| EffectAnalyzer aEffects(passOptions, module, a); | ||
| EffectAnalyzer bEffects(passOptions, module, b); | ||
| return !aEffects.invalidates(bEffects); | ||
| return !aEffects.orderedBefore(bEffects); |
There was a problem hiding this comment.
Shouldn't this check the other direction too? orderedBefore assumes a is before b, but this function doesn't. Or is the intention for this to assume that ordering as well? The comment could be more explicit about that, then?
There was a problem hiding this comment.
Yes, this should also assume an initial ordering.
|
@kripken and @stevenfontanella, this should be ready to go. To keep the testing burden smaller, I've kept |
| // either direction. | ||
| // TODO: Update users to check order more precisely and remove this. | ||
| bool invalidates(const EffectAnalyzer& other) { | ||
| return orderedBefore(other) || orderedAfter(other); |
There was a problem hiding this comment.
This seems to now do twice the work as before - is this PR performance neutral?
| } | ||
|
|
||
| void readsMemory(Name memory, MemoryOrder order) { | ||
| if (parent.module.getMemory(memory)->shared) { |
There was a problem hiding this comment.
I wonder if it would be faster on average to check the feature before doing getMemory, which is slower. If this PR slows things down that might be worth checking, otherwise maybe not.
Ditto below for the Struct etc. looks for a field, which is likely even slower.
|
Testing the performance of -O3 with 6 runs on each of 5 different modules (2 C++, 1 Java, 1 Kotlin, 1 Dart), there was no statistically significant performance difference on 4 of the modules. There was a statistically significant 1.3% slowdown on the Java module, but that was also the module with the shortest optimization time; it was already fully optimized. Since the two C++ modules that use memories didn't show any difference, it shouldn't make a meaningful difference to check the features before checking whether the memory is shared. I did test the Java module again with feature checks, but there was no statistically significant difference (again with 6 trials). I think this is performance neutral enough :) |
One of the primary use cases of
EffectAnalyzeris to determine whetherone expression (or a sequence of expressions) can be reordered before or
after another expression (or sequence of expressions). Until now, the
check whether effects "invalidate" each other and would prevent
reordering has been symmetrical, checking read-write, write-write, and
write-read conflicts as well as symmetric conditions that would prevent
reordering. But acquire load and release store operations break this
symmetry.
Consider an expression A containing an acquire load (and no other
effects) and an expression B containing a release store (and no other
effects), where we know the accesses do not alias. Then if A is before
B, it is not valid to reorder the expressions to have B before A because
the acquire load cannot move forward past the release store and the
release store cannot move back past the acquire load. However, if B is
before A, it is valid to reorder them, since memory accesses are
generally allowed to move forward past acquire loads and back past
release stores.
Update
EffectAnalyzerto take these allowed reorderings into accountby tracking the strictest memory orders used to read or write to shared
locations in the analyzed expressions.
Now that the effect comparison to see whether reordering is prevented is
no longer symmetric, rename it from
invalidatesto the moredescriptive
orderedBefore, whereA.orderedBefore(B)returns true iffAocurring beforeBimplies that there is an ordering constraintthat would prevent reordering
AandBpast each other. To accomodateusers trying to move expressions in either direction, also add an
A.orderedAfter(B)method for the case whereAoccurs afterB.To avoid having to update all users at once, keep an
invalidatesmethodthat checks for ordering in either direction. This maintains the symmetry
many users depend on. For now, update only SimplifyLocals to use the more
precise directional methods. Add tests both for all the interesting
combinations of atomic effect ordering and for the correct use of
directionality in SimplifyLocals.