Disable GCM hoisting of memory-writing nodes for irreducible CFGs. This prevents
GCM from wrongly "hoisting" stores into descendants of their original loop. Such
an "inverted hoisting" can happen due to CFGLoop::compute_freq()'s inaccurate
estimation of frequencies for irreducible CFGs.
Extend CFG verification code by checking that memory-writing nodes are placed in
either their original loop or an ancestor.
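The placement rule checked by the extended verification can be illustrated with a small model of a loop tree. This is only a sketch under assumptions: LoopNode, isAncestorOrSelf and isLegalPlacement are hypothetical names made up for illustration and do not correspond to C2's actual classes or verification code.

```java
import java.util.Objects;

/** Hypothetical stand-in for a loop-tree node; this is NOT C2's CFGLoop. */
final class LoopNode {
    final String name;
    final LoopNode parent; // null for the root (the whole method)

    LoopNode(String name, LoopNode parent) {
        this.name = Objects.requireNonNull(name);
        this.parent = parent;
    }

    /** Returns true if this loop is 'other' itself or one of its enclosing loops. */
    boolean isAncestorOrSelf(LoopNode other) {
        for (LoopNode l = other; l != null; l = l.parent) {
            if (l == this) {
                return true;
            }
        }
        return false;
    }

    /**
     * Models the rule from the text: a memory-writing node may be placed in its
     * original loop or in an ancestor of it, but never in a descendant.
     */
    static boolean isLegalPlacement(LoopNode original, LoopNode placed) {
        return placed.isAncestorOrSelf(original);
    }
}
```

In this model, moving a store from an inner loop to its enclosing loop is accepted, while moving it from the enclosing loop into the inner loop (the "inverted hoisting" case) is rejected.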
Add tests for the reducible and irreducible cases. The former was already
handled correctly before the change (the frequency estimation model prevents
"inverted hoisting" for reducible CFGs), and is just added for coverage.
This change addresses the specific miscompilation issue in a conservative way,
for simplicity and safety. Future work includes investigating whether only the
illegal blocks can be discarded as candidates for GCM hoisting, and refining
frequency estimation for irreducible CFGs.
Reviewed-by: kvn, chagedorn
Check for nodes missed by remove_useless_nodes() only if PhaseRemoveUseless has
actually been run. This makes it possible to use -XX:-UseLoopSafepoints without
crashing trivially, although implicit assumptions in other parts of C2 about the
existence of loop safepoints might lead to more subtle failures for more complex
methods.
Reviewed-by: neliasso, thartmann, kvn
8248188: Add IntrinsicCandidate and API for Base64 decoding, add Power64LE intrinsic implementation.
This patch set encompasses the following commits:
- Adds a new intrinsic candidate to the java.util.Base64 class - decodeBlock(), and provides a flexible API for the intrinsic. The API is similar to the existing encodeBlock intrinsic.
- Adds the code in HotSpot to check and marshal the new intrinsic's arguments to the arch-specific intrinsic implementation.
- Adds a Power64LE-specific implementation of the decodeBlock intrinsic.
- Adds a JMH microbenchmark for both Base64 encoding and decoding.
- Enhances the JTReg hotspot intrinsic "TestBase64.java" regression test to more fully test both decoding and encoding.
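For orientation, the following sketch shows the public java.util.Base64 API through which decoding is normally reached. It assumes nothing about the intrinsic itself: decodeBlock() is internal to the decoder, and whether an intrinsic is actually used depends on the platform and on JIT compilation. The class name Base64RoundTrip and its helper methods are made up for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative helper (hypothetical name), using only the public Base64 API. */
final class Base64RoundTrip {
    /** Encodes raw bytes with the basic Base64 alphabet. */
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    /** Decodes a basic Base64 string back into bytes. */
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        String encoded = encode(raw);
        String decoded = new String(decode(encoded), StandardCharsets.UTF_8);
        System.out.println(encoded + " -> " + decoded);
    }
}
```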
Reviewed-by: rriggs, mdoerr, kvn
Prevent exponential number of calls to ConvI2LNode::Ideal() when AddIs are used
multiple times by other AddIs in the optimization ConvI2L(AddI(x, y)) ->
AddL(ConvI2L(x), ConvI2L(y)). This is achieved by (1) reusing existing ConvI2Ls
if possible rather than eagerly creating new ones and (2) postponing the
optimization of newly created ConvI2Ls. Remove hook node solution introduced in
8217359, since this is subsumed by (2). Use phase->is_IterGVN() rather than
can_reshape to check if ConvI2LNode::Ideal() is called within iterative GVN, for
clarity. Add regression tests that cover different shapes and sizes of AddI
subgraphs, implicitly checking (by not timing out) that there is no
combinatorial explosion.
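The counting argument behind the blow-up can be modeled outside the compiler. The sketch below is not C2 code: it assumes a Fibonacci-shaped graph in which node i (for i >= 2) is an add of nodes i-1 and i-2, so every add is used by up to two later adds. It compares how many nodes a naive recursive push-through visits with how many are visited when already-processed nodes are reused. All names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of visit counts; nodes 0 and 1 are leaves, node i >= 2 is add(i-1, i-2). */
final class PushThroughAddCount {
    /** Visits performed when every use of an add is processed from scratch. */
    static long naiveVisits(int node) {
        if (node < 2) {
            return 1;
        }
        return 1 + naiveVisits(node - 1) + naiveVisits(node - 2);
    }

    /** Visits performed when a node that was already processed is reused. */
    static long sharedVisits(int node) {
        return visit(node, new HashSet<>());
    }

    private static long visit(int node, Set<Integer> done) {
        if (!done.add(node)) {
            return 0; // already processed: reuse the existing result
        }
        if (node < 2) {
            return 1;
        }
        return 1 + visit(node - 1, done) + visit(node - 2, done);
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 30; n += 10) {
            System.out.println(n + ": naive=" + naiveVisits(n) + " shared=" + sharedVisits(n));
        }
    }
}
```

In this model the naive count grows like the Fibonacci numbers, while the shared count grows linearly with the number of nodes.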
Co-authored-by: Vladimir Ivanov <vlivanov@openjdk.org>
Reviewed-by: vlivanov, kvn
Use the code motion trace produced by TraceOptoPipelining (excluding traces of
stubs) to assert that two compilations with the same seed cause StressLCM and
StressGCM to take the same randomized decisions. Previously, the entire output
produced by PrintOptoStatistics was used instead, which has proven to be too
fragile. Also, disable inlining in both TestStressCM.java and the similar
TestStressIGVN.java to prevent flaky behavior, and run both tests for ten
different seeds to improve coverage.
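The property the tests rely on is that randomized decisions driven by the same seed are reproducible. The sketch below shows that idea with java.util.Random only. It is an analogy under assumptions, not the test code: the SeededTrace class, the trace() method and its parameters are hypothetical, and the real tests compare compiler traces obtained through HotSpot flags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Toy model: records a sequence of seeded random choices as a "trace". */
final class SeededTrace {
    /**
     * Makes 'steps' random choices among 'candidates' options, using a
     * generator initialized with 'seed', and returns the choices in order.
     */
    static List<Integer> trace(long seed, int steps, int candidates) {
        Random rng = new Random(seed);
        List<Integer> picks = new ArrayList<>(steps);
        for (int i = 0; i < steps; i++) {
            picks.add(rng.nextInt(candidates));
        }
        return picks;
    }

    public static void main(String[] args) {
        // Two runs with the same seed are expected to yield identical traces.
        for (long seed = 0; seed < 10; seed++) {
            boolean same = trace(seed, 100, 8).equals(trace(seed, 100, 8));
            System.out.println("seed " + seed + ": identical=" + same);
        }
    }
}
```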
Reviewed-by: kvn, thartmann