llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 11:33:24 +02:00

Daniel Berlin cfdd740e01 LVI: Add a per-value worklist limit to LazyValueInfo.

Summary:
LVI is now depth first, which is optimal for iteration strategy in terms of work per call. However, the way the results get cached means it can still go very badly N^2 or worse right now. The overdefined cache is per-block, because LVI wants to try to get different results for the same name in different blocks (i.e., solve the problem PredicateInfo solves). This means even if we discover a value is overdefined after going very deep, it doesn't cache this information, causing it to end up trying to rediscover it again and again. The same is true for values along the way. In practice, overdefined anywhere should mean overdefined everywhere (this is how, for example, SCCP works).

Until we get around to reworking the overdefined cache, we need to limit the worklist size we process. Note that permanently reverting the DFS strategy exploration seems the wrong strategy (temporarily seems fine if we really want). BFS is clearly the wrong approach; it just gets luckier on some testcases. It's also very hard to design an effective throttle for BFS. For DFS, the throttle is directly related to the depth of the CFG. So really deep CFGs will get cut off, smaller ones will not. As the CFG simplifies, you get better results. In BFS, the limit is related to the fan-out times average block size, which is harder to reason about or make good choices for.

Bug being filed about the overdefined cache, but it will require major surgery to fix it (plumbing PredicateInfo through CVP or LVI).

Note: I did not make this number configurable because I'm not sure anyone really needs to tweak this knob. We run CVP 3 times. On the testcases I have, the slow ones happen in the middle, where CVP is doing cleanup work other things are effective at. Over the course of 3 runs, we don't seem to have any real loss of performance.

I haven't gotten a minimized testcase yet, but just imagine in your head a testcase where, going up the CFG, you have branches, one of which leads 50000 blocks deep, and the other, to something where the answer is overdefined immediately. BFS would discover the overdefined faster than DFS, but do more work to do so. In practice, the right answer is "once DFS discovers overdefined for a value, stop trying to get more info about that value" (and so, DFS would normally cache the overdefined results for every value it passed through in those 50k blocks, and never do that work again. But it doesn't, because of the naming problem).

Reviewers: chandlerc, djasper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29715
llvm-svn: 294463
2017-02-08 15:22:52 +00:00
AliasAnalysis.cpp	[AliasAnalysis] Fences do not modify constant memory location	2017-01-20 00:21:33 +00:00
AliasAnalysisEvaluator.cpp	Consistently use FunctionAnalysisManager	2016-08-09 00:28:15 +00:00
AliasAnalysisSummary.cpp	Update a comment.	2016-08-25 01:29:55 +00:00
AliasAnalysisSummary.h	Make some LLVM_CONSTEXPR variables const. NFC.	2016-08-25 01:05:08 +00:00
AliasSetTracker.cpp	[AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory.	2016-11-07 14:11:45 +00:00
Analysis.cpp	[LCSSA] Perform LCSSA verification only for the current loop nest.	2016-10-28 12:57:20 +00:00
AssumptionCache.cpp	[ValueTracking] recognize a 'not' of an assumed condition as false	2017-01-17 18:15:49 +00:00
BasicAliasAnalysis.cpp	Fix BasicAA incorrect assumption on GEP	2017-01-27 16:12:22 +00:00
BlockFrequencyInfo.cpp	[PGO] internal option cleanups	2017-02-02 21:29:17 +00:00
BlockFrequencyInfoImpl.cpp	Cleanup dump() functions.	2017-01-28 02:02:38 +00:00
BranchProbabilityInfo.cpp	Retry: [BPI] Use a safer constructor to calculate branch probabilities	2016-12-17 01:02:08 +00:00
CallGraph.cpp	Cleanup dump() functions.	2017-01-28 02:02:38 +00:00
CallGraphSCCPass.cpp	Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed	2017-01-18 21:37:11 +00:00
CallPrinter.cpp	[CG] Rename the DOT printing pass to actually reference "DOT".	2016-03-10 11:04:40 +00:00
CaptureTracking.cpp	[CaptureTracking] Volatile operations capture their memory location	2016-05-26 17:36:22 +00:00
CFG.cpp	Avoid overly large SmallPtrSet/SmallSet	2016-01-30 01:24:31 +00:00
CFGPrinter.cpp	[PM] Port CFGViewer and CFGPrinter to the new Pass Manager	2016-09-15 18:35:27 +00:00
CFLAndersAliasAnalysis.cpp	Apply clang-tidy's performance-unnecessary-value-param to LLVM.	2017-01-13 14:39:03 +00:00
CFLGraph.h	[CFLAA] Check for pointer types in more places.	2016-07-29 01:23:45 +00:00
CFLSteensAliasAnalysis.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
CGSCCPassManager.cpp	Revert r293017 and fix the actual underlying issue.	2017-02-07 01:50:48 +00:00
CMakeLists.txt	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move	2017-01-11 09:43:56 +00:00
CodeMetrics.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
ConstantFolding.cpp	[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)	2017-01-23 23:16:46 +00:00
CostModel.cpp	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch.	2017-01-11 08:23:37 +00:00
Delinearization.cpp	[NFC] Header cleanup	2016-04-18 09:17:29 +00:00
DemandedBits.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
DependenceAnalysis.cpp	Cleanup dump() functions.	2017-01-28 02:02:38 +00:00
DivergenceAnalysis.cpp	DivergenceAnalysis: Fix crash with no return blocks	2016-05-09 16:57:08 +00:00
DominanceFrontier.cpp	[PM] Introduce an analysis set used to preserve all analyses over	2017-01-15 06:32:49 +00:00
DomPrinter.cpp	Introduce analysis pass to compute PostDominators in the new pass manager. NFC	2016-02-25 17:54:07 +00:00
EHPersonalities.cpp	[tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part	2016-11-14 21:41:13 +00:00
GlobalsModRef.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
IndirectCallPromotionAnalysis.cpp	Remove another unused variable from r275216	2016-07-12 23:49:17 +00:00
InlineCost.cpp	Improve PGO support for the new inliner	2017-01-20 22:44:04 +00:00
InstCount.cpp	Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)	2015-06-23 09:49:53 +00:00
InstructionSimplify.cpp	[ValueTracking] emit a remark when we detect a conflicting assumption (PR31809)	2017-02-06 18:26:06 +00:00
Interval.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
IntervalPartition.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
IteratedDominanceFrontier.cpp	Normalize file docs. NFC.	2016-07-21 20:52:35 +00:00
IVUsers.cpp	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move	2017-01-11 09:43:56 +00:00
LazyBlockFrequencyInfo.cpp	[BPI] Add new LazyBPI analysis	2016-07-28 23:31:12 +00:00
LazyBranchProbabilityInfo.cpp	[BPI] Add new LazyBPI analysis	2016-07-28 23:31:12 +00:00
LazyCallGraph.cpp	[PM/LCG] Fix the no-asserts build after r294227. Sorry for the noise.	2017-02-06 20:59:07 +00:00
LazyValueInfo.cpp	LVI: Add a per-value worklist limit to LazyValueInfo.	2017-02-08 15:22:52 +00:00
Lint.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
LLVMBuild.txt	Restore "[ThinLTO] Prevent exporting of locals used/defined in module level asm"	2016-11-14 17:12:32 +00:00
Loads.cpp	[JumpThread] Enhance finding partial redundant loads by continuing scanning single predecessor	2017-02-02 15:12:34 +00:00
LoopAccessAnalysis.cpp	[SLP] Make sortMemAccesses explicitly return an error. NFC.	2017-02-03 19:32:50 +00:00
LoopAnalysisManager.cpp	Revert r293017 and fix the actual underlying issue.	2017-02-07 01:50:48 +00:00
LoopInfo.cpp	Make VerifyDomInfo and VerifyLoopInfo global variables	2017-01-24 05:52:07 +00:00
LoopPass.cpp	Reverted: Track validity of pass results	2017-01-15 10:23:18 +00:00
LoopUnrollAnalyzer.cpp	[LoopUnrollAnalyzer] Handle out of bounds accesses in visitLoad	2016-07-23 02:56:49 +00:00
MemDepPrinter.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
MemDerefPrinter.cpp	NFC. Move isDereferenceable to Loads.h/cpp	2016-02-24 12:49:04 +00:00
MemoryBuiltins.cpp	[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)	2017-01-23 23:16:46 +00:00
MemoryDependenceAnalysis.cpp	[Devirtualization] MemDep returns non-local !invariant.group dependencies	2017-01-12 11:33:58 +00:00
MemoryLocation.cpp	[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)	2017-01-23 23:16:46 +00:00
ModuleDebugInfoPrinter.cpp	[IR] Remove the DIExpression field from DIGlobalVariable.	2016-12-20 02:09:43 +00:00
ModuleSummaryAnalysis.cpp	Revert "[ThinLTO] Add an auto-hide feature"	2017-02-03 07:41:43 +00:00
ObjCARCAliasAnalysis.cpp	Consistently use FunctionAnalysisManager	2016-08-09 00:28:15 +00:00
ObjCARCAnalysisUtils.cpp	[ARC] Pull the ObjC ARC components that really serve the role of	2015-08-20 08:06:03 +00:00
ObjCARCInstKind.cpp	Create llvm.addressofreturnaddress intrinsic	2016-10-12 22:13:19 +00:00
OptimizationDiagnosticInfo.cpp	[LV] Also port failure remarks to new OptimizationRemarkEmitter API	2017-02-02 05:41:51 +00:00
OrderedBasicBlock.cpp	[CaptureTracker] Provide an ordered basic block to PointerMayBeCapturedBefore	2015-07-31 14:31:35 +00:00
PHITransAddr.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
PostDominators.cpp	[PM] Introduce an analysis set used to preserve all analyses over	2017-01-15 06:32:49 +00:00
ProfileSummaryInfo.cpp	Compute summary before calling extractProfTotalWeight	2017-01-14 00:32:37 +00:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp	[PM] Introduce an analysis set used to preserve all analyses over	2017-01-15 06:32:49 +00:00
RegionPass.cpp	Reverted: Track validity of pass results	2017-01-15 10:23:18 +00:00
RegionPrinter.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
ScalarEvolution.cpp	[SCEV] limit recursion depth and operands number in getAddExpr	2017-02-06 12:38:06 +00:00
ScalarEvolutionAliasAnalysis.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
ScalarEvolutionExpander.cpp	Revert @llvm.assume with operator bundles (r289755-r289757)	2016-12-19 08:22:17 +00:00
ScalarEvolutionNormalization.cpp	Remove emacs mode markers from .cpp files. NFC	2016-04-24 17:55:41 +00:00
ScopedNoAliasAA.cpp	[PM] Change the static object whose address is used to uniquely identify	2016-11-23 17:53:26 +00:00
SparsePropagation.cpp	Apply clang-tidy's modernize-loop-convert to lib/Analysis.	2016-06-26 17:27:42 +00:00
StratifiedSets.h	Do a sweep over move ctors and remove those that are identical to the default.	2016-10-20 12:20:28 +00:00
TargetLibraryInfo.cpp	[Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)	2017-01-23 23:16:46 +00:00
TargetTransformInfo.cpp	NVPTX: Refactor NVPTXInferAddressSpaces to check TTI	2017-01-30 23:02:12 +00:00
Trace.cpp	Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.	2016-01-29 20:50:44 +00:00
TypeBasedAliasAnalysis.cpp	[TBAA] Don't generate invalid TBAA when merging nodes	2016-12-11 20:07:25 +00:00
TypeMetadataUtils.cpp	Analysis: Add appropriate const qualification to functions in TypeMetadataUtils.cpp. NFC.	2017-01-27 22:55:30 +00:00
ValueTracking.cpp	[ValueTracking] emit a remark when we detect a conflicting assumption (PR31809)	2017-02-06 18:26:06 +00:00
VectorUtils.cpp	[LV] Move interleaved access helper functions to VectorUtils (NFC)	2017-02-01 17:45:46 +00:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
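
To see why the two forms agree, here is a minimal C++ sketch (not LLVM code). It steps the recurrence {1,+,3,+,2} directly and also evaluates the expanded expression literally. Two assumptions are made here that the text above does not state: the exit value is taken at iteration %n - 1 (inferred from the shape of the expanded expression), and i65 is modeled by masking an unsigned __int128, which is a GCC/Clang extension. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Value of {1,+,3,+,2} at iteration i, obtained by stepping the recurrence:
// the value starts at 1, its step starts at 3, and the step grows by 2.
// The loop runs i times, so only call this with small i.
std::uint64_t addRecBySteps(std::uint64_t i) {
  std::uint64_t value = 1, step = 3;
  for (std::uint64_t k = 0; k < i; ++k) {
    value += step;
    step += 2;
  }
  return value;
}

// The expanded exit value quoted above, evaluated literally with wrapping
// i64 arithmetic. The i65 multiply is modeled by masking a 128-bit product
// to 65 bits (assumption: unsigned __int128 is available).
std::uint64_t expandedExitValue(std::uint64_t n) {
  using u128 = unsigned __int128;
  const u128 mask65 = (u128(1) << 65) - 1;
  // (zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)
  u128 product = (u128(n - 2) * u128(n - 1)) & mask65;
  // /u 2, then trunc i65 to i64
  std::uint64_t half = static_cast<std::uint64_t>(product / 2);
  // -2 + 2 * half + 3 * %n
  return std::uint64_t(-2) + 2 * half + 3 * n;
}

// Checks that both forms equal n * n (modulo 2^64) for one small n >= 1.
void checkExitValue(std::uint64_t n) {
  assert(n >= 1);
  assert(addRecBySteps(n - 1) == n * n);
  assert(expandedExitValue(n) == n * n);
}
```

Calling checkExitValue for a range of small n exercises both forms. The extra bit in i65 is what keeps the division by 2 exact: the product of the two 64-bit operands is always even, so halving its low 65 bits and doubling again recovers the product modulo 2^64.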

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
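
The fold is valid because truncation commutes with wrapping multiplication, so the first two terms cancel modulo 2^32. A minimal C++ sketch of that arithmetic follows (not LLVM code). It assumes that undef can be modeled as a single arbitrary but fixed 64-bit value `u`, which is a simplification of LLVM's undef semantics; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models "trunc i64 ... to i32".
std::uint32_t trunc32(std::uint64_t v) { return static_cast<std::uint32_t>(v); }

// The expression ScalarEvolution forms:
// trunc(-1 * arg5) + trunc(arg5) + (-1 * trunc(u)), all wrapping.
std::uint32_t unfolded(std::uint64_t arg5, std::uint64_t u) {
  const std::uint64_t minusOne64 = ~std::uint64_t(0);
  const std::uint32_t minusOne32 = ~std::uint32_t(0);
  return trunc32(minusOne64 * arg5) + trunc32(arg5) + minusOne32 * trunc32(u);
}

// The folded form: -1 * trunc(u).
std::uint32_t folded(std::uint64_t u) {
  const std::uint32_t minusOne32 = ~std::uint32_t(0);
  return minusOne32 * trunc32(u);
}

// Checks that the two forms agree for one choice of inputs.
void checkFold(std::uint64_t arg5, std::uint64_t u) {
  assert(unfolded(arg5, u) == folded(u));
}
```

For any arg5, trunc32(-arg5) equals the 32-bit negation of trunc32(arg5), so the sum of the first two terms is zero regardless of the upper 32 bits of arg5.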

//===---------------------------------------------------------------------===//