mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 11:13:28 +01:00
llvm-mirror/lib/Transforms/Scalar
Florian Hahn d596025713 [LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands.
D84108 exposed a bad interaction between inlining and loop-rotation
during regular LTO, which is causing notable regressions in at least
CINT2006/473.astar.

The problem boils down to: we now rotate a loop just before the vectorizer
which requires duplicating a function call in the preheader when compiling
the individual files ('prepare for LTO'). But this then prevents further
inlining of the function during LTO.
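
As a rough source-level illustration of the duplication described above (a sketch only; `keepGoing`, `sumOriginal` and `sumRotated` are hypothetical names, and the real transformation operates on LLVM IR rather than C++ source), rotation turns a top-tested loop into a guarded bottom-tested loop, so the call in the header appears twice:

```cpp
// Hypothetical callee used in the loop condition; it stands in for an
// inline-able function that is defined in another translation unit.
static bool keepGoing(int i, int n) { return i < n; }

// Original shape: the call appears once, in the loop header.
int sumOriginal(int n) {
  int sum = 0, i = 0;
  while (keepGoing(i, n)) {
    sum += i;
    ++i;
  }
  return sum;
}

// Rotated shape: the header test is copied into a guard in front of the
// loop, and the loop itself is tested at the bottom (the latch).
int sumRotated(int n) {
  int sum = 0, i = 0;
  if (keepGoing(i, n)) {       // duplicated call (guard before the loop)
    do {
      sum += i;
      ++i;
    } while (keepGoing(i, n)); // original call, now in the latch
  }
  return sum;
}
```

Both functions compute the same result; the only difference is that the rotated form contains two call sites for `keepGoing` instead of one.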

This patch tries to resolve this issue by making LoopRotate more
conservative with respect to rotating loops that have inline-able calls
during the 'prepare for LTO' stage.

I think this change intuitively improves the current situation in
general. Loop-rotate tries hard to avoid creating headers that are 'too
big'. At the moment, it assumes all inlining already happened and the
cost of duplicating a call is equal to just doing the call. But with LTO,
inlining also happens during full LTO and it is possible that a previously
duplicated call is actually a huge function which gets inlined
during LTO.
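
The more conservative check can be modelled roughly as follows. This is a toy sketch, not the code in this patch: `HeaderInst`, `shouldRotate`, `maxHeaderSize` and the cost values are all assumptions made for illustration and are not LLVM APIs.

```cpp
#include <vector>

// Toy stand-in for an instruction in a loop header (not an LLVM type).
struct HeaderInst {
  unsigned cost;          // assumed size cost of duplicating the instruction
  bool isInlineCandidate; // true for a call that could still be inlined later
};

// Returns true if this toy heuristic would allow rotating the loop.
bool shouldRotate(const std::vector<HeaderInst> &header,
                  unsigned maxHeaderSize, bool prepareForLTO) {
  unsigned size = 0;
  unsigned inlineCandidates = 0;
  for (const HeaderInst &inst : header) {
    size += inst.cost;
    if (inst.isInlineCandidate)
      ++inlineCandidates;
  }
  // Existing behaviour: refuse to duplicate headers that are too big.
  if (size > maxHeaderSize)
    return false;
  // New behaviour: when preparing for LTO, a call that may still be inlined
  // has an unknown final size, so do not duplicate it yet.
  if (prepareForLTO && inlineCandidates > 0)
    return false;
  return true;
}
```

In this model a header containing one inline candidate is rotated in a regular compile but left alone in the 'prepare for LTO' stage, which defers the decision until after LTO inlining.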

From the perspective of LV, not much should change overall. Most loops
calling user-provided functions won't get vectorized to start with
(unless we can infer that the function does not touch memory and has no
other side effects). If we do not inline the 'inline-able' call during
the LTO stage, we have merely delayed loop-rotation & vectorization. If we
inline during LTO, chances should be very high that the inlined code is
itself vectorizable or the user call was not vectorizable to start with.

There could of course be scenarios where we inline a sufficiently large
function with code not profitable to vectorize, which would have been
vectorized earlier (by scalarizing the call). But even in that case,
there probably is no big performance impact, because it should be mostly
down to the cost-model to reject vectorization in that case. And then
the version with scalarized calls should also not be beneficial. In a way,
LV should have strictly more information after inlining and make more
accurate decisions (barring cost-model issues).

There is of course plenty of room for things to go wrong unexpectedly,
so we need to keep a close eye on actual performance and address any
follow-up issues.

I took a look at the impact on statistics for
MultiSource/SPEC2000/SPEC2006. There are a few benchmarks with fewer
loops rotated, but no change to the number of loops vectorized.

Reviewed By: sanwou01

Differential Revision: https://reviews.llvm.org/D94232
2021-01-19 10:15:29 +00:00
ADCE.cpp [ADCE] Use succ_empty (NFC) 2020-11-15 19:52:59 -08:00
AlignmentFromAssumptions.cpp
AnnotationRemarks.cpp Add !annotation metadata and remarks pass. 2020-11-13 13:24:10 +00:00
BDCE.cpp
CallSiteSplitting.cpp [Support] Introduce a new InstructionCost class 2020-12-11 08:12:54 +00:00
CMakeLists.txt [ScalarizeMaskedMemIntrinsic] Move from CodeGen into Transforms 2020-12-08 12:25:58 -05:00
ConstantHoisting.cpp
ConstraintElimination.cpp [CodeGen, Transforms] Use llvm::sort (NFC) 2021-01-14 20:30:31 -08:00
CorrelatedValuePropagation.cpp [CVP] Simplify and generalize switch handling 2020-12-12 21:12:27 +01:00
DCE.cpp [DCE] Always get TargetLibraryInfo 2020-11-17 20:41:05 -08:00
DeadStoreElimination.cpp [Target, Transforms] Use contains (NFC) 2020-12-19 10:43:19 -08:00
DivRemPairs.cpp
EarlyCSE.cpp [EarlyCSE] Use m_LogicalAnd/Or matchers to handle branch conditions 2020-12-28 05:36:26 +09:00
FlattenCFGPass.cpp [NFC] Reduce include files dependency and AA header cleanup (part 2). 2020-12-17 14:04:48 +03:00
Float2Int.cpp [NFC] Reduce include files dependency and AA header cleanup (part 2). 2020-12-17 14:04:48 +03:00
GuardWidening.cpp [Transforms] Use llvm::append_range (NFC) 2020-12-27 09:57:29 -08:00
GVN.cpp [llvm] Call *(Set|Map)::erase directly (NFC) 2021-01-03 09:57:47 -08:00
GVNHoist.cpp Require chained analyses in BasicAA and AAResults to be transitive 2021-01-11 11:50:07 +01:00
GVNSink.cpp [llvm] Call *(Set|Map)::erase directly (NFC) 2021-01-03 09:57:47 -08:00
InductiveRangeCheckElimination.cpp [IRCE] Remove unused IsSigned and its accessor (NFC) 2020-12-04 21:26:12 -08:00
IndVarSimplify.cpp [Transforms] Use llvm::erase_if (NFC) 2020-12-17 19:53:10 -08:00
InferAddressSpaces.cpp [NewPM] Port infer-address-spaces 2020-12-28 19:58:12 -08:00
InstSimplifyPass.cpp
IVUsersPrinter.cpp
JumpThreading.cpp [llvm] Use *Set::contains (NFC) 2021-01-07 20:29:34 -08:00
LICM.cpp [NFC][LICM] Minor improvements to debug output 2021-01-11 18:02:49 -08:00
LoopAccessAnalysisPrinter.cpp
LoopDataPrefetch.cpp
LoopDeletion.cpp [LoopDeletion] Break backedge of outermost loops when known not taken 2021-01-10 16:02:33 -08:00
LoopDistribute.cpp [NFC] Reduce include files dependency and AA header cleanup (part 2). 2020-12-17 14:04:48 +03:00
LoopFlatten.cpp [LoopFlatten] Widen IV, support ZExt. 2020-11-23 08:57:19 +00:00
LoopFuse.cpp [Transforms] Construct SmallVector with iterator ranges (NFC) 2021-01-02 09:24:17 -08:00
LoopIdiomRecognize.cpp [LoopIdiom] 'left-shift until bittest': don't forget to check that PHI node is in loop header 2020-12-30 23:58:41 +03:00
LoopInstSimplify.cpp
LoopInterchange.cpp [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00
LoopLoadElimination.cpp [NFC] Reduce include files dependency and AA header cleanup (part 2). 2020-12-17 14:04:48 +03:00
LoopPassManager.cpp [LoopNest] Extend LPMUpdater and adaptor to handle loop-nest passes 2020-12-22 08:47:38 +08:00
LoopPredication.cpp [BPI] Improve static heuristics for "cold" paths. 2020-12-23 22:47:36 +07:00
LoopRerollPass.cpp [llvm] Use *Set::contains (NFC) 2021-01-07 20:29:34 -08:00
LoopRotation.cpp [LoopRotate] Add PrepareForLTO stage, avoid rotating with inline cands. 2021-01-19 10:15:29 +00:00
LoopSimplifyCFG.cpp [DominatorTree] Add support for mixed pre/post CFG views. 2021-01-06 14:53:09 -08:00
LoopSink.cpp Set option default for enabling memory ssa for new pass manager loop sink pass to true. 2021-01-15 09:56:44 -05:00
LoopStrengthReduce.cpp [llvm] Drop unnecessary make_range (NFC) 2021-01-09 09:25:00 -08:00
LoopUnrollAndJamPass.cpp
LoopUnrollPass.cpp [llvm] Use *Set::contains (NFC) 2021-01-07 20:29:34 -08:00
LoopUnswitch.cpp [DominatorTree] Add support for mixed pre/post CFG views. 2021-01-06 14:53:09 -08:00
LoopVersioningLICM.cpp [SCEV] Use isa<> pattern for testing for CouldNotCompute [NFC] 2020-11-24 18:47:49 -08:00
LowerAtomic.cpp
LowerConstantIntrinsics.cpp [Transforms] Use pred_empty (NFC) 2020-11-16 22:09:14 -08:00
LowerExpectIntrinsic.cpp Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" 2020-11-14 13:12:38 +03:00
LowerGuardIntrinsic.cpp
LowerMatrixIntrinsics.cpp [Utils][SimplifyCFG] Port SplitBlock() to DomTreeUpdater 2021-01-15 23:35:56 +03:00
LowerWidenableCondition.cpp
MakeGuardsExplicit.cpp
MemCpyOptimizer.cpp [Transforms] Construct SmallVector with iterator ranges (NFC) 2021-01-02 09:24:17 -08:00
MergedLoadStoreMotion.cpp
MergeICmps.cpp [Transforms] Use llvm::find_if (NFC) 2021-01-09 09:24:58 -08:00
NaryReassociate.cpp [NFC][NARY-REASSOCIATE] Restructure code to avoid isPotentiallyReassociatable 2020-12-04 16:19:43 +07:00
NewGVN.cpp [CodeGen, Transforms] Use llvm::sort (NFC) 2021-01-14 20:30:31 -08:00
PartiallyInlineLibCalls.cpp
PlaceSafepoints.cpp [Transforms] Use llvm::append_range (NFC) 2020-12-27 09:57:29 -08:00
Reassociate.cpp [Scalar] Construct SmallVector with iterator ranges (NFC) 2020-12-28 19:55:18 -08:00
Reg2Mem.cpp [Reg2Mem] add support for the new pass manager 2020-11-08 11:14:05 +00:00
RewriteStatepointsForGC.cpp [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-16 09:40:53 -08:00
Scalar.cpp [ScalarizeMaskedMemIntrin] Add new PM support 2020-12-08 17:15:22 -05:00
ScalarizeMaskedMemIntrin.cpp [ScalarizeMaskedMemIntrin] Add new PM support 2020-12-08 17:15:22 -05:00
Scalarizer.cpp [Scalarizer] Use poison as insertelement's placeholder 2021-01-04 00:35:28 +09:00
SCCP.cpp [SCCP] Handle bitcast of vector constants. 2020-11-03 12:58:39 +00:00
SeparateConstOffsetFromGEP.cpp SeparateConstOffsetFromGEP::lowerToSingleIndexGEPs - don't use dyn_cast_or_null. NFCI. 2020-12-15 17:27:25 +00:00
SimpleLoopUnswitch.cpp [NewPM] Only non-trivially loop unswitch at -O3 and for non-optsize functions 2021-01-13 14:54:49 -08:00
SimplifyCFGPass.cpp [SimplifyCFGPass] iterativelySimplifyCFG(): support lazy DomTreeUpdater 2021-01-12 02:09:47 +03:00
Sink.cpp
SpeculateAroundPHIs.cpp [Transforms] Use llvm::erase_if (NFC) 2020-12-17 19:53:10 -08:00
SpeculativeExecution.cpp [Target, Transforms] Use *Set::contains (NFC) 2021-01-08 18:39:54 -08:00
SROA.cpp [llvm] Use *::empty (NFC) 2021-01-16 09:40:55 -08:00
StraightLineStrengthReduce.cpp
StructurizeCFG.cpp static const char *const foo => const char foo[] 2020-12-01 10:33:18 -08:00
TailRecursionElimination.cpp [CSSPGO] IR intrinsic for pseudo-probe block instrumentation 2020-11-20 10:39:24 -08:00
WarnMissedTransforms.cpp [SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute 2020-12-02 13:23:43 +00:00