mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 02:33:06 +01:00
llvm-mirror/lib/Analysis
David Sherwood 2d2e4a1b17 [Analysis] Add simple cost model for strict (in-order) reductions
I have added a new FastMathFlags parameter to getArithmeticReductionCost
to indicate what type of reduction we are performing:

  1. Tree-wise. This is the typical fast-math reduction that involves
  continually splitting a vector up into halves and adding each
  half together until we get a scalar result. This is the default
  behaviour for integers, whereas for floating point we only do this
  if reassociation is allowed.
  2. Ordered. This now allows us to estimate the cost of performing
  a strict vector reduction by treating it as a series of scalar
  operations in lane order. This is the case when FP reassociation
  is not permitted. For scalable vectors this is more difficult
  because at compile time we do not know how many lanes there are,
  and so we use the worst case maximum vscale value.
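
The difference between the two reduction styles can be illustrated with a
small cost sketch. This is not the code added by this commit: the function
names and the unit costs (one per shuffle, vector operation, scalar operation
and lane extract) are assumptions chosen purely for illustration, and real
targets would supply their own numbers.

```python
def tree_reduction_cost(num_lanes, shuffle_cost=1, vector_op_cost=1,
                        extract_cost=1):
    """Tree-wise: halve the vector log2(num_lanes) times, then extract lane 0.

    All cost parameters are hypothetical unit costs, not target values.
    """
    if num_lanes < 1 or num_lanes & (num_lanes - 1):
        raise ValueError("num_lanes must be a power of two")
    levels = num_lanes.bit_length() - 1  # log2(num_lanes)
    # Each level shuffles the upper half down and combines the two halves.
    return levels * (shuffle_cost + vector_op_cost) + extract_cost


def ordered_reduction_cost(num_lanes, extract_cost=1, scalar_op_cost=1):
    """Ordered: extract every lane and combine them one by one in lane order."""
    return num_lanes * (extract_cost + scalar_op_cost)


def scalable_ordered_reduction_cost(min_lanes, max_vscale, extract_cost=1,
                                    scalar_op_cost=1):
    """Scalable vectors: the lane count is min_lanes * vscale, unknown at
    compile time, so assume the worst-case (maximum) vscale."""
    return ordered_reduction_cost(min_lanes * max_vscale, extract_cost,
                                  scalar_op_cost)
```

With these made-up unit costs, an 8-lane reduction costs 7 tree-wise and 16
ordered, and a scalable vector with a minimum of 4 lanes and an assumed
maximum vscale of 16 is costed as a 64-lane ordered reduction.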

I have also fixed getTypeBasedIntrinsicInstrCost to pass in the
FastMathFlags, which meant fixing up some X86 tests where we always
assumed the vector.reduce.fadd/mul intrinsics were 'fast'.

New tests have been added here:

  Analysis/CostModel/AArch64/reduce-fadd.ll
  Analysis/CostModel/AArch64/sve-intrinsics.ll
  Transforms/LoopVectorize/AArch64/strict-fadd-cost.ll
  Transforms/LoopVectorize/AArch64/sve-strict-fadd-cost.ll

Differential Revision: https://reviews.llvm.org/D105432
2021-07-26 10:26:06 +01:00
models Unpack the CostEstimate feature in ML inlining models. 2021-07-02 16:57:16 +00:00
AliasAnalysis.cpp [AA] Support callCapturesBefore() on BatchAA (NFCI) 2021-05-14 21:48:08 +02:00
AliasAnalysisEvaluator.cpp [AA] Updates for D95543. 2021-04-15 12:22:03 +03:00
AliasAnalysisSummary.cpp
AliasAnalysisSummary.h
AliasSetTracker.cpp [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. 2021-04-09 12:54:22 +03:00
Analysis.cpp
AssumeBundleQueries.cpp [Attributes] Replace doesAttrKindHaveArgument() (NFC) 2021-07-12 21:57:26 +02:00
AssumptionCache.cpp Use AssumeInst in a few more places [nfc] 2021-04-06 13:18:53 -07:00
BasicAliasAnalysis.cpp [BasicAA] Fix typo ScaleForGDC -> ScaleForGCD. 2021-07-01 09:58:38 +01:00
BlockFrequencyInfo.cpp Internalize some cl::opt global variables or move them under namespace llvm 2021-05-07 11:15:43 -07:00
BlockFrequencyInfoImpl.cpp [CSSPGO][NFC] Allow cl::ZeroOrMore for use-iterative-bfi-inference 2021-07-18 13:22:32 -07:00
BranchProbabilityInfo.cpp [Analaysis, CodeGen] Remove getHotSucc (NFC) 2021-07-17 07:31:36 -07:00
CallGraph.cpp Set IgnoreLLVMUsed to false in CallGraph::addToCallGraph() 2021-04-08 11:14:09 -07:00
CallGraphSCCPass.cpp Internalize some cl::opt global variables or move them under namespace llvm 2021-05-07 11:15:43 -07:00
CallPrinter.cpp Support: Stop using F_{None,Text,Append} compatibility synonyms, NFC 2021-04-30 11:00:03 -07:00
CaptureTracking.cpp [CaptureTracking] Simplify reachability check (NFCI) 2021-05-16 16:04:10 +02:00
CFG.cpp [CFG] Move reachable from entry checks into basic block variant 2021-05-15 15:42:02 +02:00
CFGPrinter.cpp Use -cfg-func-name value as filter for -view-cfg, etc. 2021-06-16 23:54:51 +02:00
CFLAndersAliasAnalysis.cpp [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. 2021-04-09 12:54:22 +03:00
CFLGraph.h [CFLGraph] Fix a crash due to missing handling of freeze 2021-03-21 02:14:13 +09:00
CFLSteensAliasAnalysis.cpp [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. 2021-04-09 12:54:22 +03:00
CGSCCPassManager.cpp [NewPM] Bail out of devirtualization wrapper if the current SCC is invalidated 2021-07-19 15:07:30 -07:00
CMakeLists.txt [MLGO] Use binary protobufs for improved training performance. 2021-07-19 13:59:28 -07:00
CmpInstAnalysis.cpp
CodeMetrics.cpp Revert "[AssumptionCache] Avoid dangling llvm.assume calls in the cache" 2021-02-11 12:17:38 -06:00
ConstantFolding.cpp [ConstantFolding] Fold constrained arithmetic intrinsics 2021-07-23 14:39:51 +07:00
ConstraintSystem.cpp [llvm] Remove redundant string initialization (NFC) 2021-01-12 21:43:46 -08:00
CostModel.cpp [InstructionCost] Don't conflate Invalid costs with Unknown costs. 2021-03-30 09:29:42 +01:00
DDG.cpp [Analysis] Use llvm::append_range (NFC) 2021-01-22 23:25:01 -08:00
DDGPrinter.cpp Support: Stop using F_{None,Text,Append} compatibility synonyms, NFC 2021-04-30 11:00:03 -07:00
Delinearization.cpp [Analysis] Use range-based for loops (NFC) 2021-02-06 11:17:10 -08:00
DemandedBits.cpp Add getDemandedBits for uses. 2021-06-02 10:07:40 -04:00
DependenceAnalysis.cpp [DependenceAnalysis] Guard analysis using getPointerBase(). 2021-07-15 14:57:32 -07:00
DependenceGraphBuilder.cpp [Analysis] Use llvm::append_range (NFC) 2021-01-22 23:25:01 -08:00
DevelopmentModeInlineAdvisor.cpp [NFC][MLGO] Just use the underlying protobuf object for logging 2021-07-23 10:56:48 -07:00
DivergenceAnalysis.cpp [Analysis] Use range-based for loops (NFC) 2021-02-22 20:17:18 -08:00
DominanceFrontier.cpp
DomPrinter.cpp
DomTreeUpdater.cpp [NFCI][DomTreeUpdater] applyUpdates(): reserve space for updates first 2021-04-11 23:56:22 +03:00
EHPersonalities.cpp [XCOFF] Handle the case when personality routine is an alias 2021-04-29 22:03:30 +00:00
FunctionPropertiesAnalysis.cpp [llvm] Ensure newlines at the end of files (NFC) 2021-01-10 09:24:57 -08:00
GlobalsModRef.cpp [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. 2021-04-09 12:54:22 +03:00
GuardUtils.cpp
HeatUtils.cpp
ImportedFunctionsInliningStatistics.cpp [Analysis] ImportedFunctionsInliningStatistics.h - add <memory> and remove unused <string> include. NFCI. 2021-04-19 16:20:56 +01:00
IndirectCallPromotionAnalysis.cpp [SampleFDO] Another fix to prevent repeated indirect call promotion in 2021-03-04 18:44:12 -08:00
InlineAdvisor.cpp [NFC] Use llvm::raw_string_ostream instead of std::stringstream 2021-03-12 18:43:59 +01:00
InlineCost.cpp [llvm][Inline] Add interface to return cost-benefit stuff 2021-07-25 20:18:19 +08:00
InlineSizeEstimatorAnalysis.cpp
InstCount.cpp
InstructionPrecedenceTracking.cpp [GVN] Properly invalidate ICF cache when we simplify a value 2021-04-08 14:01:57 -07:00
InstructionSimplify.cpp [FPEnv][InstSimplify] Constrained FP support for NaN 2021-07-09 11:26:28 -04:00
Interval.cpp
IntervalPartition.cpp
IRSimilarityIdentifier.cpp [IRSim] Strip out the findSimilarity call from the constructor 2021-06-11 18:41:28 -05:00
IVDescriptors.cpp Revert "[LV] Use lookThroughAnd with logical reductions" 2021-07-21 15:16:00 +01:00
IVUsers.cpp [IVUsers] Check LoopSimplify cache earlier (NFC) 2021-04-10 22:58:13 +02:00
LazyBlockFrequencyInfo.cpp Make dependency between certain analysis passes transitive (reapply) 2021-05-05 15:17:55 +02:00
LazyBranchProbabilityInfo.cpp Make dependency between certain analysis passes transitive (reapply) 2021-05-05 15:17:55 +02:00
LazyCallGraph.cpp Allow building for release with EXPENSIVE_CHECKS 2021-06-19 17:02:11 +01:00
LazyValueInfo.cpp [LVI] Remove recursion from getValueForCondition (NFCI) 2021-06-24 09:58:22 +09:00
LegacyDivergenceAnalysis.cpp [NewPM] Introduce (GPU)DivergenceAnalysis in the new pass manager 2021-02-16 10:26:45 +05:30
Lint.cpp [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. 2021-04-09 12:54:22 +03:00
Loads.cpp [CSSPGO] Unblock optimizations with pseudo probe instrumentation part 2. 2021-04-26 16:52:33 -07:00
LoopAccessAnalysis.cpp [LoopUtils] Fix incorrect RT check bounds of loop-invariant mem accesses 2021-07-19 19:38:24 +08:00
LoopAnalysisManager.cpp [NewPM] Don't mark AA analyses as preserved 2021-05-18 13:49:03 -07:00
LoopCacheAnalysis.cpp [SCEV] Add a utility for converting from "exit count" to "trip count" 2021-05-26 10:41:49 -07:00
LoopInfo.cpp [LoopFlatten][LoopInfo] Use Loop to identify latch compare instruction 2021-07-21 10:14:18 +01:00
LoopNestAnalysis.cpp [LoopNest] Consider loop nest with inner loop guard using outer loop 2021-05-07 16:04:18 +00:00
LoopPass.cpp
LoopUnrollAnalyzer.cpp [unroll] Use value domain for symbolic execution based cost model 2021-05-26 08:41:25 -07:00
MemDepPrinter.cpp
MemDerefPrinter.cpp Minor format tweak to deref analysis printer 2021-03-22 18:44:18 -07:00
MemoryBuiltins.cpp [OpenMP] Change __kmpc_free_shared to include the paired allocation size 2021-07-21 20:56:21 -04:00
MemoryDependenceAnalysis.cpp [NFC] MemoryDependenceAnalysis cleanup. 2021-05-31 18:07:55 +03:00
MemoryLocation.cpp
MemorySSA.cpp [IR] Add BasicBlock::isEntryBlock() (NFC) 2021-05-15 12:41:58 +02:00
MemorySSAUpdater.cpp [Analysis] Remove changeCondBranchToUnconditionalTo (NFC) 2021-07-10 17:31:43 -07:00
MLInlineAdvisor.cpp Unpack the CostEstimate feature in ML inlining models. 2021-07-02 16:57:16 +00:00
ModuleDebugInfoPrinter.cpp
ModuleSummaryAnalysis.cpp [Support] Don't include VirtualFileSystem.h in CommandLine.h 2021-04-21 10:19:01 -04:00
MustExecute.cpp [MustExecute] Use ListSeparator (NFC) 2021-01-28 22:21:16 -08:00
ObjCARCAliasAnalysis.cpp [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. 2021-04-09 12:54:22 +03:00
ObjCARCAnalysisUtils.cpp
ObjCARCInstKind.cpp [ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of 2021-03-04 11:22:30 -08:00
OptimizationRemarkEmitter.cpp
OverflowInstAnalysis.cpp Fix MSan crash after 1977c53b 2021-05-02 13:44:43 +09:00
PHITransAddr.cpp
PhiValues.cpp
PostDominators.cpp
ProfileSummaryInfo.cpp [CSSPGO][llvm-profdata] Support trimming cold context when merging profiles 2021-04-22 00:42:37 -07:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp
RegionPass.cpp
RegionPrinter.cpp
ReleaseModeModelRunner.cpp [NFC][MLGO] Fix vector sizing 2021-07-22 13:06:00 -07:00
ReplayInlineAdvisor.cpp [InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks 2021-01-25 15:38:57 -08:00
ScalarEvolution.cpp Style tweaks for SCEV's computeMaxBECountForLT [NFC] 2021-07-23 17:19:45 -07:00
ScalarEvolutionAliasAnalysis.cpp Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. 2021-07-06 12:16:05 -07:00
ScalarEvolutionDivision.cpp
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset. 2021-04-09 12:54:22 +03:00
StackLifetime.cpp
StackSafetyAnalysis.cpp Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. 2021-07-06 12:16:05 -07:00
StratifiedSets.h
SyncDependenceAnalysis.cpp [Analysis] Use ListSeparator (NFC) 2021-02-14 08:36:14 -08:00
SyntheticCountsUtils.cpp
TargetLibraryInfo.cpp [OpenMP] Change __kmpc_free_shared to include the paired allocation size 2021-07-21 20:56:21 -04:00
TargetTransformInfo.cpp [Analysis] Add simple cost model for strict (in-order) reductions 2021-07-26 10:26:06 +01:00
TFUtils.cpp [NFC][MLGO] Just use the underlying protobuf object for logging 2021-07-23 10:56:48 -07:00
Trace.cpp
TypeBasedAliasAnalysis.cpp [Metadata] Decorate methods with 'const'. NFC. 2021-07-08 14:11:14 -04:00
TypeMetadataUtils.cpp Revert "Allow invokable sub-classes of IntrinsicInst" 2021-04-20 15:38:38 -07:00
ValueLattice.cpp
ValueLatticeUtils.cpp
ValueTracking.cpp [SimplifyCFG] simplifyUnreachable(): erase instructions iff they are guaranteed to transfer execution to unreachable 2021-07-03 10:45:44 +03:00
VectorUtils.cpp [NFC] Fix a few whitespace issues and typos. 2021-07-04 11:49:58 +01:00
VFABIDemangling.cpp [llvm] Use the default value of drop_begin (NFC) 2021-01-18 10:16:36 -08:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n). However,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
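
The equivalence of the two forms can be checked numerically. The sketch below
is only an illustration in Python, not ScalarEvolution code. It assumes the
loop's backedge is taken %n - 1 times (the text does not state the trip count;
this is the value for which both forms agree), and the helper names are
hypothetical. Integer widths are modelled with explicit masks.

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def addrec_at(operands, iteration):
    """Value of the add recurrence {a,+,b,+,c,...} at the given iteration,
    using the binomial form a*C(i,0) + b*C(i,1) + c*C(i,2) + ..., in i64."""
    return sum(c * comb(iteration, j) for j, c in enumerate(operands)) & MASK64


def current_exit_value(n):
    """The long expression quoted above, with i64 and i65 wrap-around."""
    a = (n - 2) & MASK64              # zext i64 (-2 + %n) to i65
    b = (n - 1) & MASK64              # zext i64 (-1 + %n) to i65
    half = ((a * b) & MASK65) // 2    # i65 multiply, then /u 2
    return (-2 + 2 * (half & MASK64) + 3 * n) & MASK64  # trunc to i64, rest in i64


def simple_exit_value(n):
    """The proposed simpler form: %n * %n in i64."""
    return (n * n) & MASK64
```

For example, with %n = 3 the recurrence takes the values 1, 4, 9 over
iterations 0, 1, 2, and both exit-value forms give 9.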

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll,
ScalarEvolution is forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
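
The fold is valid because truncation commutes with negation in modular
arithmetic, so the two %arg5 terms cancel. A minimal check in Python, with
hypothetical helper names; undef is modelled here as one arbitrary fixed
value, which is a simplification that seems adequate because it is used only
once in the expression:

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc32(value):
    """trunc i64 ... to i32: keep the low 32 bits."""
    return value & MASK32


def unfolded(arg5, undef_value):
    """The expression as ScalarEvolution currently forms it."""
    neg_arg5 = (-1 * arg5) & MASK64   # -1 * %arg5 in i64
    return (trunc32(neg_arg5) + trunc32(arg5 & MASK64)
            - trunc32(undef_value)) & MASK32


def folded(undef_value):
    """The proposed folded form: -1 * (trunc i64 undef to i32)."""
    return (-1 * trunc32(undef_value)) & MASK32
```

Both functions return the same i32 value for any choice of arg5, since
trunc32(-x) + trunc32(x) is always a multiple of 2^32.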

//===---------------------------------------------------------------------===//