mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
llvm-mirror/test/Transforms
Michael Zolotukhin afd08c7313 [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the...
Summary:
...loop after the last iteration.

This is really hard to do correctly. The core problem is that we need to
model liveness through the induction PHIs from iteration to iteration in
order to get the correct results, and we need to correctly de-duplicate
the common subgraphs of instructions feeding some subset of the
induction PHIs. All of this can be driven either from a side effect at
some iteration or from the loop values used after the loop finishes.

This patch implements this by storing the forward-propagating analysis
of each instruction in a cache to recall whether it was free and whether
it has become live and thus counted toward the total unroll cost. Then,
at each sink for a value in the loop, we recursively walk back through
every value that feeds the sink, including looping back through the
iterations as needed, until we have marked the entire input graph as
live. Because we cache this, we never visit instructions more than twice
-- once when we analyze them and put them into the cache, and once when
we count their cost towards the unrolled loop. Also, because the cache
is only two bits and because we are dealing with relatively small
iteration counts, we can store all of this very densely in memory to
keep this from becoming an excessively slow analysis.
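
As a rough illustration of the caching scheme described above, here is a
minimal, self-contained sketch. It is not the code from this patch: ToyInst,
LiveCostTracker, markLive and demoLiveCost are hypothetical names, the loop
body is modeled as a plain vector instead of LLVM IR, and each cache entry
occupies a whole byte rather than two packed bits. It only shows the shape of
the idea: forward analysis records which instructions are free, and a backward
walk from each sink counts every other instruction exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy instruction; not an LLVM type.
struct ToyInst {
  unsigned Cost;                             // cost if it survives unrolling
  bool IsFree;                               // folded away by forward propagation
  std::vector<std::size_t> Operands;         // inputs from the same iteration
  std::vector<std::size_t> PrevIterOperands; // PHI-like inputs from iteration - 1
};

class LiveCostTracker {
  enum : std::uint8_t { FreeBit = 1, CountedBit = 2 };
  std::vector<ToyInst> Body;
  unsigned Iterations;
  std::vector<std::uint8_t> State; // one entry per (iteration, instruction)
  unsigned UnrolledCost = 0;

  std::uint8_t &state(unsigned Iter, std::size_t Idx) {
    return State[Iter * Body.size() + Idx];
  }

public:
  LiveCostTracker(std::vector<ToyInst> B, unsigned NumIterations)
      : Body(std::move(B)), Iterations(NumIterations),
        State(Body.size() * NumIterations, 0) {
    // Stand-in for the forward analysis: remember what simplified away.
    for (unsigned It = 0; It < Iterations; ++It)
      for (std::size_t I = 0; I < Body.size(); ++I)
        if (Body[I].IsFree)
          state(It, I) |= FreeBit;
  }

  // Called at a sink (a side effect, or a value used after the loop).
  // Walks back through everything feeding the sink, crossing iterations.
  void markLive(unsigned Iter, std::size_t Idx) {
    assert(Iter < Iterations && Idx < Body.size());
    std::vector<std::pair<unsigned, std::size_t>> Worklist;
    Worklist.push_back({Iter, Idx});
    while (!Worklist.empty()) {
      std::pair<unsigned, std::size_t> Cur = Worklist.back();
      Worklist.pop_back();
      std::uint8_t &S = state(Cur.first, Cur.second);
      if (S & (FreeBit | CountedBit))
        continue; // free, or already counted: never pay twice
      S |= CountedBit;
      const ToyInst &Inst = Body[Cur.second];
      UnrolledCost += Inst.Cost;
      for (std::size_t Op : Inst.Operands)
        Worklist.push_back({Cur.first, Op});
      if (Cur.first > 0) // iteration 0 takes its PHI inputs from outside the loop
        for (std::size_t Op : Inst.PrevIterOperands)
          Worklist.push_back({Cur.first - 1, Op});
    }
  }

  unsigned cost() const { return UnrolledCost; }
};

// Three iterations of: load; acc = acc_prev + load; an unused mul; a folded GEP.
unsigned demoLiveCost() {
  std::vector<ToyInst> Body = {
      {1, false, {}, {}},   // 0: load
      {1, false, {0}, {1}}, // 1: accumulator, fed by its previous value
      {4, false, {0}, {}},  // 2: mul that nothing uses
      {1, true, {}, {}},    // 3: address computation that folded away
  };
  LiveCostTracker Tracker(Body, 3);
  Tracker.markLive(2, 1); // the accumulator is used after the last iteration
  Tracker.markLive(2, 1); // marking the same sink again changes nothing
  return Tracker.cost();
}
```

In this toy loop only the three loads and three adds are reachable from the
sink, so the unused multiply and the folded address computation never
contribute to the cost.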

The code here is still pretty gross. I would appreciate suggestions
about better ways to factor or split this up; I've stared too long at
the algorithmic side to really have a good sense of what the design
should probably look like.

Also, it might seem like we should do all of this bottom-up, but I think
that is a red herring. Specifically, the simplification power is *much*
greater working top-down. We can forward propagate very effectively,
even across strange and interesting recurrences around the backedge.
Because we use data to propagate, this doesn't cause a state space
explosion. Doing this level of constant folding, etc., would be very
expensive to do bottom-up because it wouldn't be until the last moment
that you could collapse everything. The current solution is essentially
a top-down simplification with a bottom-up cost accounting which seems
to get the best of both worlds. It makes the simplification incremental
and powerful while leaving everything dead until we *know* it is needed.

Finally, a core property of this approach is its *monotonicity*. At all
times, the current UnrolledCost is a conservatively low estimate. This
ensures that we will never early-exit from the analysis due to exceeding
a threshold when, had we continued, the cost would have gone back
below the threshold. These kinds of bugs can cause random changes to
behavior that are incredibly hard to track down.
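
A small sketch of why monotonicity makes the early exit safe. The names here
(AnalysisResult, analyzeWithEarlyExit, LiveCostPerStep) are hypothetical and
do not come from the patch. The only assumption is that each analysis step
adds a non-negative amount to the running cost, which is what counting only
live instructions provides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result record; not part of LLVM.
struct AnalysisResult {
  bool ExceededThreshold;
  unsigned CostSeen;
  std::size_t StepsVisited;
};

// Each entry is the cost that became live at one analysis step. The entries
// are unsigned, so the running total can never decrease.
AnalysisResult analyzeWithEarlyExit(const std::vector<unsigned> &LiveCostPerStep,
                                    unsigned Threshold) {
  unsigned UnrolledCost = 0;
  for (std::size_t Step = 0; Step < LiveCostPerStep.size(); ++Step) {
    UnrolledCost += LiveCostPerStep[Step];
    // Safe to stop here: no later step can bring the total back down.
    if (UnrolledCost > Threshold)
      return {true, UnrolledCost, Step + 1};
  }
  return {false, UnrolledCost, LiveCostPerStep.size()};
}
```

If the estimate could also shrink, for example by charging an instruction up
front and refunding it once it turned out to be dead, the same early return
could reject a loop whose final cost is under the threshold.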

We could use a similar (but much simpler) technique within the inliner
as well to avoid considering speculated code in the inline cost.

Reviewers: chandlerc

Subscribers: sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11758

llvm-svn: 269388
2016-05-13 01:42:39 +00:00
..
ADCE [PR27284] Reverse the ownership between DICompileUnit and DISubprogram. 2016-04-15 15:57:41 +00:00
AddDiscriminators Revert http://reviews.llvm.org/D19926 as it breaks tests. 2016-05-05 20:47:53 +00:00
AlignmentFromAssumptions
ArgumentPromotion [ArgumentPromotion] Propagate operand bundles to promoted call sites 2016-04-29 04:56:12 +00:00
AtomicExpand ARM: use a pseudo-instruction for cmpxchg at -O0. 2016-04-18 21:48:55 +00:00
BBVectorize
BDCE
BranchFolding
CodeExtractor
CodeGenPrepare [CodeGenPrepare] Don't sink a cast past its user 2016-04-27 19:36:38 +00:00
ConstantHoisting ARM: don't try to hoist constant RHS out of a division. 2016-04-15 18:17:18 +00:00
ConstantMerge [PM] Port ConstantMerge to the new pass manager. 2016-05-05 00:51:09 +00:00
ConstProp Revert "[SCCP] Partially propagate informations when the input is not fully defined." 2016-05-11 23:06:10 +00:00
CorrelatedValuePropagation Remove extra whitespace. NFC. 2016-05-02 16:45:00 +00:00
CrossDSOCFI [cfi] Cross-DSO CFI diagnostic mode (LLVM part). 2016-01-25 23:35:03 +00:00
DCE Mark guards on true as "trivially dead" 2016-04-29 22:23:16 +00:00
DeadArgElim [DeadArgumentElimination] Propagate operand bundles to promoted call sites 2016-04-29 07:22:36 +00:00
DeadStoreElimination [DeadStoreElimination] Shorten beginning of memset overwritten by later stores 2016-04-22 19:51:29 +00:00
EarlyCSE [EarlyCSE] Simplify guard intrinsics 2016-04-29 21:52:58 +00:00
EliminateAvailableExternally [PM] Port EliminateAvailableExternally pass to the new pass manager. 2016-05-05 02:37:32 +00:00
Float2Int
ForcedFunctionAttrs
FunctionAttrs Don't IPO over functions that can be de-refined 2016-04-08 00:48:30 +00:00
FunctionImport ThinLTO: do not import function whose linkage prevents inlining. 2016-05-03 00:27:28 +00:00
GCOVProfiling DebugInfo: Remove MDString-based type references 2016-04-23 21:08:00 +00:00
GlobalDCE [GlobalDCE, Misc] Don't remove functions referenced by ifuncs 2016-05-04 00:20:48 +00:00
GlobalOpt Make "@name =" mandatory for globals in .ll files. 2016-05-10 18:22:45 +00:00
GVN [GVN] PRE of unordered loads 2016-05-06 21:43:51 +00:00
IndVarSimplify [LLVM] Remove unwanted --check-prefix=CHECK from unit tests. NFC. 2016-04-19 23:51:52 +00:00
InferFunctionAttrs [InferAttrs] Mark memset_pattern16 params nocapture. 2016-04-27 19:04:43 +00:00
Inline All llvm.deoptimize declarations must use the same calling convention 2016-05-12 01:17:38 +00:00
InstCombine [InstCombine] Fold icmp ugt/ult (udiv i32 C2, X), C1. 2016-05-10 20:22:09 +00:00
InstMerge fixed typo - CHECK-LABEL 2016-03-29 06:49:38 +00:00
InstSimplify [InstSimplify] use computeKnownBits on shift amount operands 2016-05-10 20:46:54 +00:00
Internalize PM: Port Internalize to the new pass manager 2016-04-26 20:15:52 +00:00
IPConstantProp [PM] Port Interprocedural SCCP to the new pass manager. 2016-05-05 21:05:36 +00:00
IRCE [SCEV] Try to reuse existing value during SCEV expansion 2016-02-04 01:27:38 +00:00
JumpThreading [ValueTracking] Improve isImpliedCondition when the dominating cond is false. 2016-04-25 17:23:36 +00:00
LCSSA
LICM [ValueTracking] Use guards to prove non-nullness of a value 2016-05-10 02:35:44 +00:00
LoadCombine
LoopDataPrefetch [LoopDataPrefetch] Add optimization remark 2016-05-05 00:08:15 +00:00
LoopDeletion Use all_of instead of a raw loop; NFC 2016-05-03 17:50:06 +00:00
LoopDistribute [LoopDist] Add missing RUN line in test from r268006 2016-04-29 07:16:00 +00:00
LoopIdiom [LIR] Set attributes on memset_pattern16. 2016-04-27 19:04:50 +00:00
LoopInterchange
LoopLoadElim [LLE] Check for mismatching types between the store and the load earlier 2016-03-24 17:59:26 +00:00
LoopReroll Enable loopreroll for sext of loop control only IV 2016-05-10 21:16:49 +00:00
LoopRotate LPM: Drop require<loops> from these tests, it's redundant. NFC 2016-05-10 18:28:10 +00:00
LoopSimplify [PR27284] Reverse the ownership between DICompileUnit and DISubprogram. 2016-04-15 15:57:41 +00:00
LoopSimplifyCFG LPM: Drop require<loops> from these tests, it's redundant. NFC 2016-05-10 18:28:10 +00:00
LoopStrengthReduce AMDGPU: Stop reporting an addressing mode for unknown addrspace 2016-04-29 06:25:10 +00:00
LoopUnroll [Unroll] Implement a conservative and monotonically increasing cost tracking system during the full unroll heuristic analysis that avoids counting any instruction cost until that instruction becomes "live" through a side-effect or use outside the... 2016-05-13 01:42:39 +00:00
LoopUnswitch [SimlifyCFG] Prevent passes from destroying canonical loop structure, especially for nested loops 2016-03-29 04:08:57 +00:00
LoopVectorize Revert "[VectorUtils] Query number of sign bits to allow more truncations" 2016-05-10 12:27:23 +00:00
LoopVersioning [LVers] Change CHECK_LABEL to CHECK-LABEL (underscore->dash) 2016-03-28 21:04:13 +00:00
LoopVersioningLICM [LoopVersioningLICM] Add test coverage for llvm.loop.licm_versioning.disable 2016-04-22 18:34:50 +00:00
LowerAtomic
LowerBitSets [cfi] Support explicit sections for functions in cfi-icall. 2016-04-15 22:55:38 +00:00
LowerExpectIntrinsic [LowerExpectIntrinsic] make default likely/unlikely ratio bigger 2016-04-26 22:23:38 +00:00
LowerGuardIntrinsic [LowerGuardIntrinsics] Keep track of !make.implicit metadata 2016-04-30 00:55:59 +00:00
LowerInvoke
LowerSwitch
Mem2Reg [PR27284] Reverse the ownership between DICompileUnit and DISubprogram. 2016-04-15 15:57:41 +00:00
MemCpyOpt Revert "MemCpyOpt: combine local load/store sequences into memcpy." 2016-05-10 21:49:40 +00:00
MergeFunc MergeFunctions: test alloca better 2016-04-12 00:03:26 +00:00
MetaRenamer
NameAnonFunctions Add a pass to name anonymous/nameless function 2016-04-12 21:35:28 +00:00
NaryReassociate
ObjCARC [PR27284] Reverse the ownership between DICompileUnit and DISubprogram. 2016-04-15 15:57:41 +00:00
PartiallyInlineLibCalls
PGOProfile [PM]: port IR based profUse pass to new pass manager 2016-05-10 21:59:52 +00:00
PhaseOrdering Mark that SpeculativeExecution preserves Globals Alias Analysis. 2016-05-03 08:33:26 +00:00
PlaceSafepoints [PlaceSafepoints] Clamp NoStatepoints to true 2016-01-28 21:51:14 +00:00
PreISelIntrinsicLowering Introduce llvm.load.relative intrinsic. 2016-04-22 21:18:02 +00:00
PruneEH [PruneEH] Don't try to insert a terminator after another terminator 2016-01-23 06:00:44 +00:00
Reassociate PM: Port Reassociate to the new pass manager 2016-04-26 23:39:29 +00:00
Reg2Mem
RewriteStatepointsForGC All llvm.deoptimize declarations must use the same calling convention 2016-05-12 01:17:38 +00:00
SafeStack DebugInfo: Remove MDString-based type references 2016-04-23 21:08:00 +00:00
SampleProfile Tune basic block annotation algorithm. 2016-04-26 04:59:11 +00:00
Scalarizer [PR27284] Reverse the ownership between DICompileUnit and DISubprogram. 2016-04-15 15:57:41 +00:00
ScalarRepl [PR27284] Reverse the ownership between DICompileUnit and DISubprogram. 2016-04-15 15:57:41 +00:00
SCCP [SCCP] Resolve shifts beyond the bitwidth to undef 2016-05-12 03:07:40 +00:00
SeparateConstOffsetFromGEP [ValueTracking] Remove dead code from an old experiment 2016-03-03 19:44:06 +00:00
SimplifyCFG Propagate branch metadata when some branch probability is missing. 2016-05-10 23:07:19 +00:00
Sink PM: Port SinkingPass to the new pass manager 2016-04-22 19:54:10 +00:00
SLPVectorizer [SLPVectorizer][X86] Regenerated SEXT/ZEXT cast vectorization tests 2016-05-06 22:22:18 +00:00
SpeculativeExecution Move divergent-target test into CodeGen/NVPTX because it requires an NVPTX target. 2016-04-15 01:20:52 +00:00
SROA [SROA] Function canConvertValue needs to check whether both NewTy and OldTy pointers are 2016-05-03 19:30:48 +00:00
StraightLineStrengthReduce
StripDeadPrototypes
StripSymbols Refactor stripDebugInfo(Function) to handle intrinsic 2016-05-07 04:10:52 +00:00
StructurizeCFG AMDGPU: Remove leftover ShaderType attributes in tests 2016-04-13 00:39:48 +00:00
TailCallElim Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally 2016-01-17 12:35:29 +00:00
Util [BasicAA] Treat llvm.assume as not accessing memory in getModRefBehavior(Function) 2016-04-29 17:18:28 +00:00
WholeProgramDevirt WholeProgramDevirt: introduce. 2016-02-09 22:50:34 +00:00