1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 11:33:24 +02:00
llvm-mirror/lib/Analysis
Daniel Berlin cfdd740e01 LVI: Add a per-value worklist limit to LazyValueInfo.
Summary:
LVI is now depth first, which is optimal for iteration strategy in
terms of work per call.  However, the way the results get cached means
it can still go very badly N^2 or worse right now.  The overdefined
cache is per-block, because LVI wants to try to get different results
for the same name in different blocks (IE solve the problem
PredicateInfo solves).  This means even if we discover a value is
overdefined after going very deep, it doesn't cache this information,
causing it to end up trying to rediscover it again and again.  The
same is true for values along the way.  In practice, overdefined
anywhere should mean overdefined everywhere (this is how, for example,
SCCP works).

Until we get around to reworking the overdefined cache, we need to
limit the worklist size we process.  Note that permanently reverting
the DFS strategy exploration seems the wrong strategy (temporarily
seems fine if we really want).  BFS is clearly the wrong approach, it
just gets luckier on some testcases.  It's also very hard to design
an effective throttle for BFS. For DFS, the throttle is directly related
to the depth of the CFG.  So really deep CFGs will get cutoff, smaller
ones will not. As the CFG simplifies, you get better results.
In BFS, the limit is it's related to the fan-out times average block size,
which is harder to reason about or make good choices for.

Bug being filed about the overdefined cache, but it will require major
surgery to fix it (plumbing predicateinfo through CVP or LVI).

Note: I did not make this number configurable because i'm not sure
anyone really needs to tweak this knob.  We run CVP 3 times. On the
testcases i have the slow ones happen in the middle, where CVP is
doing cleanup work other things are effective at.  Over the course of
3 runs, we don't see to have any real loss of performance.

I haven't gotten a minimized testcase yet, but just imagine in your
head a testcase where, going *up* the CFG, you have branches, one of
which leads 50000 blocks deep, and the other, to something where the
answer is overdefined immediately.  BFS would discover the overdefined
faster than DFS, but do more work to do so.  In practice, the right
answer is "once DFS discovers overdefined for a value, stop trying to
get more info about that value" (and so, DFS would normally cache the
overdefined results for every value it passed through in those 50k
blocks, and never do that work again. But it don't, because of the
naming problem)

Reviewers: chandlerc, djasper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29715

llvm-svn: 294463
2017-02-08 15:22:52 +00:00
..
AliasAnalysis.cpp [AliasAnalysis] Fences do not modify constant memory location 2017-01-20 00:21:33 +00:00
AliasAnalysisEvaluator.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
AliasAnalysisSummary.cpp Update a comment. 2016-08-25 01:29:55 +00:00
AliasAnalysisSummary.h Make some LLVM_CONSTEXPR variables const. NFC. 2016-08-25 01:05:08 +00:00
AliasSetTracker.cpp [AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory. 2016-11-07 14:11:45 +00:00
Analysis.cpp [LCSSA] Perform LCSSA verification only for the current loop nest. 2016-10-28 12:57:20 +00:00
AssumptionCache.cpp [ValueTracking] recognize a 'not' of an assumed condition as false 2017-01-17 18:15:49 +00:00
BasicAliasAnalysis.cpp Fix BasicAA incorrect assumption on GEP 2017-01-27 16:12:22 +00:00
BlockFrequencyInfo.cpp [PGO] internal option cleanups 2017-02-02 21:29:17 +00:00
BlockFrequencyInfoImpl.cpp Cleanup dump() functions. 2017-01-28 02:02:38 +00:00
BranchProbabilityInfo.cpp Retry: [BPI] Use a safer constructor to calculate branch probabilities 2016-12-17 01:02:08 +00:00
CallGraph.cpp Cleanup dump() functions. 2017-01-28 02:02:38 +00:00
CallGraphSCCPass.cpp Improve the -filter-print-funcs option to skip the banner for CGSCC pass when nothing is to be printed 2017-01-18 21:37:11 +00:00
CallPrinter.cpp [CG] Rename the DOT printing pass to actually reference "DOT". 2016-03-10 11:04:40 +00:00
CaptureTracking.cpp [CaptureTracking] Volatile operations capture their memory location 2016-05-26 17:36:22 +00:00
CFG.cpp Avoid overly large SmallPtrSet/SmallSet 2016-01-30 01:24:31 +00:00
CFGPrinter.cpp [PM] Port CFGViewer and CFGPrinter to the new Pass Manager 2016-09-15 18:35:27 +00:00
CFLAndersAliasAnalysis.cpp Apply clang-tidy's performance-unnecessary-value-param to LLVM. 2017-01-13 14:39:03 +00:00
CFLGraph.h [CFLAA] Check for pointer types in more places. 2016-07-29 01:23:45 +00:00
CFLSteensAliasAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
CGSCCPassManager.cpp Revert r293017 and fix the actual underlying issue. 2017-02-07 01:50:48 +00:00
CMakeLists.txt [PM] Separate the LoopAnalysisManager from the LoopPassManager and move 2017-01-11 09:43:56 +00:00
CodeMetrics.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
ConstantFolding.cpp [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC) 2017-01-23 23:16:46 +00:00
CostModel.cpp [X86] updating TTI costs for arithmetic instructions on X86\SLM arch. 2017-01-11 08:23:37 +00:00
Delinearization.cpp [NFC] Header cleanup 2016-04-18 09:17:29 +00:00
DemandedBits.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
DependenceAnalysis.cpp Cleanup dump() functions. 2017-01-28 02:02:38 +00:00
DivergenceAnalysis.cpp DivergenceAnalysis: Fix crash with no return blocks 2016-05-09 16:57:08 +00:00
DominanceFrontier.cpp [PM] Introduce an analysis set used to preserve all analyses over 2017-01-15 06:32:49 +00:00
DomPrinter.cpp Introduce analysis pass to compute PostDominators in the new pass manager. NFC 2016-02-25 17:54:07 +00:00
EHPersonalities.cpp [tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part 2016-11-14 21:41:13 +00:00
GlobalsModRef.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
IndirectCallPromotionAnalysis.cpp Remove another unused variable from r275216 2016-07-12 23:49:17 +00:00
InlineCost.cpp Improve PGO support for the new inliner 2017-01-20 22:44:04 +00:00
InstCount.cpp Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC) 2015-06-23 09:49:53 +00:00
InstructionSimplify.cpp [ValueTracking] emit a remark when we detect a conflicting assumption (PR31809) 2017-02-06 18:26:06 +00:00
Interval.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
IntervalPartition.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
IteratedDominanceFrontier.cpp Normalize file docs. NFC. 2016-07-21 20:52:35 +00:00
IVUsers.cpp [PM] Separate the LoopAnalysisManager from the LoopPassManager and move 2017-01-11 09:43:56 +00:00
LazyBlockFrequencyInfo.cpp [BPI] Add new LazyBPI analysis 2016-07-28 23:31:12 +00:00
LazyBranchProbabilityInfo.cpp [BPI] Add new LazyBPI analysis 2016-07-28 23:31:12 +00:00
LazyCallGraph.cpp [PM/LCG] Fix the no-asserts build after r294227. Sorry for the noise. 2017-02-06 20:59:07 +00:00
LazyValueInfo.cpp LVI: Add a per-value worklist limit to LazyValueInfo. 2017-02-08 15:22:52 +00:00
Lint.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
LLVMBuild.txt Restore "[ThinLTO] Prevent exporting of locals used/defined in module level asm" 2016-11-14 17:12:32 +00:00
Loads.cpp [JumpThread] Enhance finding partial redundant loads by continuing scanning single predecessor 2017-02-02 15:12:34 +00:00
LoopAccessAnalysis.cpp [SLP] Make sortMemAccesses explicitly return an error. NFC. 2017-02-03 19:32:50 +00:00
LoopAnalysisManager.cpp Revert r293017 and fix the actual underlying issue. 2017-02-07 01:50:48 +00:00
LoopInfo.cpp Make VerifyDomInfo and VerifyLoopInfo global variables 2017-01-24 05:52:07 +00:00
LoopPass.cpp Reverted: Track validity of pass results 2017-01-15 10:23:18 +00:00
LoopUnrollAnalyzer.cpp [LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad 2016-07-23 02:56:49 +00:00
MemDepPrinter.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
MemDerefPrinter.cpp NFC. Move isDereferenceable to Loads.h/cpp 2016-02-24 12:49:04 +00:00
MemoryBuiltins.cpp [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC) 2017-01-23 23:16:46 +00:00
MemoryDependenceAnalysis.cpp [Devirtualization] MemDep returns non-local !invariant.group dependencies 2017-01-12 11:33:58 +00:00
MemoryLocation.cpp [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC) 2017-01-23 23:16:46 +00:00
ModuleDebugInfoPrinter.cpp [IR] Remove the DIExpression field from DIGlobalVariable. 2016-12-20 02:09:43 +00:00
ModuleSummaryAnalysis.cpp Revert "[ThinLTO] Add an auto-hide feature" 2017-02-03 07:41:43 +00:00
ObjCARCAliasAnalysis.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
ObjCARCAnalysisUtils.cpp [ARC] Pull the ObjC ARC components that really serve the role of 2015-08-20 08:06:03 +00:00
ObjCARCInstKind.cpp Create llvm.addressofreturnaddress intrinsic 2016-10-12 22:13:19 +00:00
OptimizationDiagnosticInfo.cpp [LV] Also port failure remarks to new OptimizationRemarkEmitter API 2017-02-02 05:41:51 +00:00
OrderedBasicBlock.cpp [CaptureTracker] Provide an ordered basic block to PointerMayBeCapturedBefore 2015-07-31 14:31:35 +00:00
PHITransAddr.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
PostDominators.cpp [PM] Introduce an analysis set used to preserve all analyses over 2017-01-15 06:32:49 +00:00
ProfileSummaryInfo.cpp Compute summary before calling extractProfTotalWeight 2017-01-14 00:32:37 +00:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp [PM] Introduce an analysis set used to preserve all analyses over 2017-01-15 06:32:49 +00:00
RegionPass.cpp Reverted: Track validity of pass results 2017-01-15 10:23:18 +00:00
RegionPrinter.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
ScalarEvolution.cpp [SCEV] limit recursion depth and operands number in getAddExpr 2017-02-06 12:38:06 +00:00
ScalarEvolutionAliasAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
ScalarEvolutionExpander.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
ScalarEvolutionNormalization.cpp Remove emacs mode markers from .cpp files. NFC 2016-04-24 17:55:41 +00:00
ScopedNoAliasAA.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
SparsePropagation.cpp Apply clang-tidy's modernize-loop-convert to lib/Analysis. 2016-06-26 17:27:42 +00:00
StratifiedSets.h Do a sweep over move ctors and remove those that are identical to the default. 2016-10-20 12:20:28 +00:00
TargetLibraryInfo.cpp [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC) 2017-01-23 23:16:46 +00:00
TargetTransformInfo.cpp NVPTX: Refactor NVPTXInferAddressSpaces to check TTI 2017-01-30 23:02:12 +00:00
Trace.cpp Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. 2016-01-29 20:50:44 +00:00
TypeBasedAliasAnalysis.cpp [TBAA] Don't generate invalid TBAA when merging nodes 2016-12-11 20:07:25 +00:00
TypeMetadataUtils.cpp Analysis: Add appropriate const qualification to functions in TypeMetadataUtils.cpp. NFC. 2017-01-27 22:55:30 +00:00
ValueTracking.cpp [ValueTracking] emit a remark when we detect a conflicting assumption (PR31809) 2017-02-06 18:26:06 +00:00
VectorUtils.cpp [LV] Move interleaved access helper functions to VectorUtils (NFC) 2017-02-01 17:45:46 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n), however
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,

ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))

//===---------------------------------------------------------------------===//