llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 02:33:06 +01:00

David Sherwood 2d2e4a1b17	2021-07-26 10:26:06 +01:00

[Analysis] Add simple cost model for strict (in-order) reductions

I have added a new FastMathFlags parameter to getArithmeticReductionCost to indicate what type of reduction we are performing:

  1. Tree-wise. This is the typical fast-math reduction that involves continually splitting a vector up into halves and adding each half together until we get a scalar result. This is the default behaviour for integers, whereas for floating point we only do this if reassociation is allowed.

  2. Ordered. This now allows us to estimate the cost of performing a strict vector reduction by treating it as a series of scalar operations in lane order. This is the case when FP reassociation is not permitted. For scalable vectors this is more difficult because at compile time we do not know how many lanes there are, and so we use the worst case maximum vscale value.

I have also fixed getTypeBasedIntrinsicInstrCost to pass in the FastMathFlags, which meant fixing up some X86 tests where we always assumed the vector.reduce.fadd/mul intrinsics were 'fast'.

New tests have been added here:

  Analysis/CostModel/AArch64/reduce-fadd.ll
  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D105432

models	Unpack the CostEstimate feature in ML inlining models.	2021-07-02 16:57:16 +00:00
AliasAnalysis.cpp	[AA] Support callCapturesBefore() on BatchAA (NFCI)	2021-05-14 21:48:08 +02:00
AliasAnalysisEvaluator.cpp	[AA] Updates for D95543.	2021-04-15 12:22:03 +03:00
AliasAnalysisSummary.cpp
AliasAnalysisSummary.h
AliasSetTracker.cpp	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.	2021-04-09 12:54:22 +03:00
Analysis.cpp
AssumeBundleQueries.cpp	[Attributes] Replace doesAttrKindHaveArgument() (NFC)	2021-07-12 21:57:26 +02:00
AssumptionCache.cpp	Use AssumeInst in a few more places [nfc]	2021-04-06 13:18:53 -07:00
BasicAliasAnalysis.cpp	[BasicAA] Fix typo ScaleForGDC -> ScaleForGCD.	2021-07-01 09:58:38 +01:00
BlockFrequencyInfo.cpp	Internalize some cl::opt global variables or move them under namespace llvm	2021-05-07 11:15:43 -07:00
BlockFrequencyInfoImpl.cpp	[CSSPGO][NFC] Allow cl::ZeroOrMore for use-iterative-bfi-inference	2021-07-18 13:22:32 -07:00
BranchProbabilityInfo.cpp	[Analaysis, CodeGen] Remove getHotSucc (NFC)	2021-07-17 07:31:36 -07:00
CallGraph.cpp	Set IgnoreLLVMUsed to false in CallGraph::addToCallGraph()	2021-04-08 11:14:09 -07:00
CallGraphSCCPass.cpp	Internalize some cl::opt global variables or move them under namespace llvm	2021-05-07 11:15:43 -07:00
CallPrinter.cpp	Support: Stop using F_{None,Text,Append} compatibility synonyms, NFC	2021-04-30 11:00:03 -07:00
CaptureTracking.cpp	[CaptureTracking] Simplify reachability check (NFCI)	2021-05-16 16:04:10 +02:00
CFG.cpp	[CFG] Move reachable from entry checks into basic block variant	2021-05-15 15:42:02 +02:00
CFGPrinter.cpp	Use `-cfg-func-name` value as filter for `-view-cfg`, etc.	2021-06-16 23:54:51 +02:00
CFLAndersAliasAnalysis.cpp	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.	2021-04-09 12:54:22 +03:00
CFLGraph.h	[CFLGraph] Fix a crash due to missing handling of freeze	2021-03-21 02:14:13 +09:00
CFLSteensAliasAnalysis.cpp	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.	2021-04-09 12:54:22 +03:00
CGSCCPassManager.cpp	[NewPM] Bail out of devirtualization wrapper if the current SCC is invalidated	2021-07-19 15:07:30 -07:00
CMakeLists.txt	[MLGO] Use binary protobufs for improved training performance.	2021-07-19 13:59:28 -07:00
CmpInstAnalysis.cpp
CodeMetrics.cpp	Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache"	2021-02-11 12:17:38 -06:00
ConstantFolding.cpp	[ConstantFolding] Fold constrained arithmetic intrinsics	2021-07-23 14:39:51 +07:00
ConstraintSystem.cpp	[llvm] Remove redundant string initialization (NFC)	2021-01-12 21:43:46 -08:00
CostModel.cpp	[InstructionCost] Don't conflate Invalid costs with Unknown costs.	2021-03-30 09:29:42 +01:00
DDG.cpp	[Analysis] Use llvm::append_range (NFC)	2021-01-22 23:25:01 -08:00
DDGPrinter.cpp	Support: Stop using F_{None,Text,Append} compatibility synonyms, NFC	2021-04-30 11:00:03 -07:00
Delinearization.cpp	[Analysis] Use range-based for loops (NFC)	2021-02-06 11:17:10 -08:00
DemandedBits.cpp	Add getDemandedBits for uses.	2021-06-02 10:07:40 -04:00
DependenceAnalysis.cpp	[DependenceAnalysis] Guard analysis using getPointerBase().	2021-07-15 14:57:32 -07:00
DependenceGraphBuilder.cpp	[Analysis] Use llvm::append_range (NFC)	2021-01-22 23:25:01 -08:00
DevelopmentModeInlineAdvisor.cpp	[NFC][MLGO] Just use the underlying protobuf object for logging	2021-07-23 10:56:48 -07:00
DivergenceAnalysis.cpp	[Analysis] Use range-based for loops (NFC)	2021-02-22 20:17:18 -08:00
DominanceFrontier.cpp
DomPrinter.cpp
DomTreeUpdater.cpp	[NFCI][DomTreeUpdater] applyUpdates(): reserve space for updates first	2021-04-11 23:56:22 +03:00
EHPersonalities.cpp	[XCOFF] Handle the case when personality routine is an alias	2021-04-29 22:03:30 +00:00
FunctionPropertiesAnalysis.cpp	[llvm] Ensure newlines at the end of files (NFC)	2021-01-10 09:24:57 -08:00
GlobalsModRef.cpp	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.	2021-04-09 12:54:22 +03:00
GuardUtils.cpp
HeatUtils.cpp
ImportedFunctionsInliningStatistics.cpp	[Analysis] ImportedFunctionsInliningStatistics.h - add <memory> and remove unused <string> include. NFCI.	2021-04-19 16:20:56 +01:00
IndirectCallPromotionAnalysis.cpp	[SampleFDO] Another fix to prevent repeated indirect call promotion in	2021-03-04 18:44:12 -08:00
InlineAdvisor.cpp	[NFC] Use llvm::raw_string_ostream instead of std::stringstream	2021-03-12 18:43:59 +01:00
InlineCost.cpp	[llvm][Inline] Add interface to return cost-benefit stuff	2021-07-25 20:18:19 +08:00
InlineSizeEstimatorAnalysis.cpp
InstCount.cpp
InstructionPrecedenceTracking.cpp	[GVN] Properly invalidate ICF cache when we simplify a value	2021-04-08 14:01:57 -07:00
InstructionSimplify.cpp	[FPEnv][InstSimplify] Constrained FP support for NaN	2021-07-09 11:26:28 -04:00
Interval.cpp
IntervalPartition.cpp
IRSimilarityIdentifier.cpp	[IRSim] Strip out the findSimilarity call from the constructor	2021-06-11 18:41:28 -05:00
IVDescriptors.cpp	Revert "[LV] Use lookThroughAnd with logical reductions"	2021-07-21 15:16:00 +01:00
IVUsers.cpp	[IVUsers] Check LoopSimplify cache earlier (NFC)	2021-04-10 22:58:13 +02:00
LazyBlockFrequencyInfo.cpp	Make dependency between certain analysis passes transitive (reapply)	2021-05-05 15:17:55 +02:00
LazyBranchProbabilityInfo.cpp	Make dependency between certain analysis passes transitive (reapply)	2021-05-05 15:17:55 +02:00
LazyCallGraph.cpp	Allow building for release with EXPENSIVE_CHECKS	2021-06-19 17:02:11 +01:00
LazyValueInfo.cpp	[LVI] Remove recursion from getValueForCondition (NFCI)	2021-06-24 09:58:22 +09:00
LegacyDivergenceAnalysis.cpp	[NewPM] Introduce (GPU)DivergenceAnalysis in the new pass manager	2021-02-16 10:26:45 +05:30
Lint.cpp	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.	2021-04-09 12:54:22 +03:00
Loads.cpp	[CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2.	2021-04-26 16:52:33 -07:00
LoopAccessAnalysis.cpp	[LoopUtils] Fix incorrect RT check bounds of loop-invariant mem accesses	2021-07-19 19:38:24 +08:00
LoopAnalysisManager.cpp	[NewPM] Don't mark AA analyses as preserved	2021-05-18 13:49:03 -07:00
LoopCacheAnalysis.cpp	[SCEV] Add a utility for converting from "exit count" to "trip count"	2021-05-26 10:41:49 -07:00
LoopInfo.cpp	[LoopFlatten][LoopInfo] Use Loop to identify latch compare instruction	2021-07-21 10:14:18 +01:00
LoopNestAnalysis.cpp	[LoopNest] Consider loop nest with inner loop guard using outer loop	2021-05-07 16:04:18 +00:00
LoopPass.cpp
LoopUnrollAnalyzer.cpp	[unroll] Use value domain for symbolic execution based cost model	2021-05-26 08:41:25 -07:00
MemDepPrinter.cpp
MemDerefPrinter.cpp	Minor format tweak to deref analysis printer	2021-03-22 18:44:18 -07:00
MemoryBuiltins.cpp	[OpenMP] Change `__kmpc_free_shared` to include the paired allocation size	2021-07-21 20:56:21 -04:00
MemoryDependenceAnalysis.cpp	[NFC] MemoryDependenceAnalysis cleanup.	2021-05-31 18:07:55 +03:00
MemoryLocation.cpp
MemorySSA.cpp	[IR] Add BasicBlock::isEntryBlock() (NFC)	2021-05-15 12:41:58 +02:00
MemorySSAUpdater.cpp	[Analysis] Remove changeCondBranchToUnconditionalTo (NFC)	2021-07-10 17:31:43 -07:00
MLInlineAdvisor.cpp	Unpack the CostEstimate feature in ML inlining models.	2021-07-02 16:57:16 +00:00
ModuleDebugInfoPrinter.cpp
ModuleSummaryAnalysis.cpp	[Support] Don't include VirtualFileSystem.h in CommandLine.h	2021-04-21 10:19:01 -04:00
MustExecute.cpp	[MustExecute] Use ListSeparator (NFC)	2021-01-28 22:21:16 -08:00
ObjCARCAliasAnalysis.cpp	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.	2021-04-09 12:54:22 +03:00
ObjCARCAnalysisUtils.cpp
ObjCARCInstKind.cpp	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of	2021-03-04 11:22:30 -08:00
OptimizationRemarkEmitter.cpp
OverflowInstAnalysis.cpp	Fix MSan crash after 1977c53b	2021-05-02 13:44:43 +09:00
PHITransAddr.cpp
PhiValues.cpp
PostDominators.cpp
ProfileSummaryInfo.cpp	[CSSPGO][llvm-profdata] Support trimming cold context when merging profiles	2021-04-22 00:42:37 -07:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp
RegionPass.cpp
RegionPrinter.cpp
ReleaseModeModelRunner.cpp	[NFC][MLGO] Fix vector sizing	2021-07-22 13:06:00 -07:00
ReplayInlineAdvisor.cpp	[InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks	2021-01-25 15:38:57 -08:00
ScalarEvolution.cpp	Style tweaks for SCEV's computeMaxBECountForLT [NFC]	2021-07-23 17:19:45 -07:00
ScalarEvolutionAliasAnalysis.cpp	Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.	2021-07-06 12:16:05 -07:00
ScalarEvolutionDivision.cpp
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp	[NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.	2021-04-09 12:54:22 +03:00
StackLifetime.cpp
StackSafetyAnalysis.cpp	Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.	2021-07-06 12:16:05 -07:00
StratifiedSets.h
SyncDependenceAnalysis.cpp	[Analysis] Use ListSeparator (NFC)	2021-02-14 08:36:14 -08:00
SyntheticCountsUtils.cpp
TargetLibraryInfo.cpp	[OpenMP] Change `__kmpc_free_shared` to include the paired allocation size	2021-07-21 20:56:21 -04:00
TargetTransformInfo.cpp	[Analysis] Add simple cost model for strict (in-order) reductions	2021-07-26 10:26:06 +01:00
TFUtils.cpp	[NFC][MLGO] Just use the underlying protobuf object for logging	2021-07-23 10:56:48 -07:00
Trace.cpp
TypeBasedAliasAnalysis.cpp	[Metadata] Decorate methods with 'const'. NFC.	2021-07-08 14:11:14 -04:00
TypeMetadataUtils.cpp	Revert "Allow invokable sub-classes of IntrinsicInst"	2021-04-20 15:38:38 -07:00
ValueLattice.cpp
ValueLatticeUtils.cpp
ValueTracking.cpp	[SimplifyCFG] simplifyUnreachable(): erase instructions iff they are guaranteed to transfer execution to unreachable	2021-07-03 10:45:44 +03:00
VectorUtils.cpp	[NFC] Fix a few whitespace issues and typos.	2021-07-04 11:49:58 +01:00
VFABIDemangling.cpp	[llvm] Use the default value of drop_begin (NFC)	2021-01-18 10:16:36 -08:00

README.txt

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
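
As a sanity check on the claim above, here is a small Python sketch (not
LLVM code). It evaluates the add recurrence {1,+,3,+,2} with the usual
binomial-coefficient formula and compares it with a direct model of the
expanded expression. The helper names are hypothetical, and the exit
iteration k = %n - 1 is an assumption inferred from the shape of the
expanded expression, not something stated in the test file.

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1

def chrec_value(coeffs, k):
    """Evaluate the add recurrence {c0,+,c1,+,c2,...} at iteration k
    as the sum of c_j * C(k, j)."""
    return sum(c * comb(k, j) for j, c in enumerate(coeffs))

def scev_exit_value(n):
    """Model the expanded expression with i64 wraparound and an i65
    intermediate product, mirroring the zext/trunc in the README."""
    a = (n - 2) & MASK64          # (-2 + %n) in i64
    b = (n - 1) & MASK64          # (-1 + %n) in i64
    prod = (a * b) & MASK65       # zext both to i65, multiply in i65
    half = prod // 2              # /u 2
    t = half & MASK64             # trunc i65 ... to i64
    return (-2 + 2 * t + 3 * n) & MASK64

# Assumed exit iteration: k = n - 1.
for n in (1, 2, 3, 10, 1000):
    k = n - 1
    print(n, chrec_value([1, 3, 2], k), scev_exit_value(n), n * n)
```

For each n shown, all three printed values agree, which is consistent
with the (%n * %n) simplification suggested above.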

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

The first two terms cancel after truncation, so this could be folded to

(-1 * (trunc i64 undef to i32))
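
The fold relies on trunc(-x) + trunc(x) being zero in i32 arithmetic.
The Python sketch below (not LLVM code) checks that identity with
explicit masking. The helper names are hypothetical, and modelling undef
as one fixed integer u is a simplifying assumption made only for this
illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1

def trunc_i64_to_i32(v):
    """Keep the low 32 bits, as trunc i64 ... to i32 does."""
    return v & MASK32

def unfolded(arg5, u):
    """The three-term expression ScalarEvolution currently forms."""
    neg = (-arg5) & MASK64                      # (-1 * %arg5) in i64
    total = (trunc_i64_to_i32(neg)
             + trunc_i64_to_i32(arg5 & MASK64)
             - trunc_i64_to_i32(u))
    return total & MASK32                       # wrap to i32

def folded(u):
    """The proposed single-term form: (-1 * (trunc i64 u to i32))."""
    return (-trunc_i64_to_i32(u)) & MASK32

# Assumption: u stands in for undef as an arbitrary but fixed value.
for arg5 in (0, 1, 2**31, 2**32 + 5, 2**63, 2**64 - 1):
    for u in (0, 1, 12345, 2**40 + 9):
        assert unfolded(arg5, u) == folded(u)
```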

//===---------------------------------------------------------------------===//