mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 05:52:53 +02:00
llvm-mirror/lib/Analysis
Elena Demikhovsky ac6dc0e1f0 AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns.
The X86 target does not provide any target-specific cost calculation for interleave patterns. It uses the common target-independent calculation, which gives very high numbers. As a result, the scalar version is chosen in many cases. The situation on AVX-512 is even worse, since we have 3-src shuffles that significantly reduce the cost.

In this patch I calculate the cost on AVX-512. This makes it possible to compare the interleave pattern with gather/scatter and choose the better solution (PR31426).

* Shuffle-broadcast cost will be changed in Simon's upcoming patch.

Differential Revision: https://reviews.llvm.org/D28118

llvm-svn: 290810
2017-01-02 10:37:52 +00:00
AliasAnalysis.cpp [PM] Remove a pointless optimization. 2016-12-27 18:04:11 +00:00
AliasAnalysisEvaluator.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
AliasAnalysisSummary.cpp Update a comment. 2016-08-25 01:29:55 +00:00
AliasAnalysisSummary.h Make some LLVM_CONSTEXPR variables const. NFC. 2016-08-25 01:05:08 +00:00
AliasSetTracker.cpp [AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory. 2016-11-07 14:11:45 +00:00
Analysis.cpp [LCSSA] Perform LCSSA verification only for the current loop nest. 2016-10-28 12:57:20 +00:00
AssumptionCache.cpp Add files I seem to have dropped in my revert (r290086). 2016-12-19 08:32:13 +00:00
BasicAliasAnalysis.cpp [PM] Remove a pointless optimization. 2016-12-27 18:04:11 +00:00
BlockFrequencyInfo.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
BlockFrequencyInfoImpl.cpp [GraphTraits] Replace all NodeType usage with NodeRef 2016-08-22 21:09:30 +00:00
BranchProbabilityInfo.cpp Retry: [BPI] Use a safer constructor to calculate branch probabilities 2016-12-17 01:02:08 +00:00
CallGraph.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
CallGraphSCCPass.cpp Use StringRef in Pass/PassManager APIs (NFC) 2016-10-01 02:56:57 +00:00
CallPrinter.cpp
CaptureTracking.cpp
CFG.cpp
CFGPrinter.cpp [PM] Port CFGViewer and CFGPrinter to the new Pass Manager 2016-09-15 18:35:27 +00:00
CFLAndersAliasAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
CFLGraph.h
CFLSteensAliasAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
CGSCCPassManager.cpp [PM] Teach the CGSCC's CG update utility to more carefully invalidate 2016-12-28 10:34:50 +00:00
CMakeLists.txt Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
CodeMetrics.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
ConstantFolding.cpp [InstCombiner] Simplify lib calls to round{,f} 2016-12-26 14:29:29 +00:00
CostModel.cpp AVX-512 Loop Vectorizer: Cost calculation for interleave load/store patterns. 2017-01-02 10:37:52 +00:00
Delinearization.cpp
DemandedBits.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
DependenceAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
DivergenceAnalysis.cpp
DominanceFrontier.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
DomPrinter.cpp
EHPersonalities.cpp [tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part 2016-11-14 21:41:13 +00:00
GlobalsModRef.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
IndirectCallPromotionAnalysis.cpp
InlineCost.cpp [PM] Provide an initial, minimal port of the inliner to the new pass manager. 2016-12-20 03:15:32 +00:00
InstCount.cpp
InstructionSimplify.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
Interval.cpp
IntervalPartition.cpp
IteratedDominanceFrontier.cpp
IVUsers.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
LazyBlockFrequencyInfo.cpp
LazyBranchProbabilityInfo.cpp
LazyCallGraph.cpp [PM] Teach the CGSCC's CG update utility to more carefully invalidate 2016-12-28 10:34:50 +00:00
LazyValueInfo.cpp [LVI] Remove count/erase idiom in favor of checking result value of erase 2016-12-30 22:09:10 +00:00
Lint.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
LLVMBuild.txt Restore "[ThinLTO] Prevent exporting of locals used/defined in module level asm" 2016-11-14 17:12:32 +00:00
Loads.cpp [Loads] Fix crash in is isDereferenceableAndAlignedPointer() 2016-10-28 15:32:28 +00:00
LoopAccessAnalysis.cpp [LAA] Prevent invalid IR for loop-invariant bound in loop body 2016-12-05 21:25:03 +00:00
LoopInfo.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
LoopPass.cpp [LCSSA] Perform LCSSA verification only for the current loop nest. 2016-10-28 12:57:20 +00:00
LoopPassManager.cpp [PM] Introduce the facilities for registering cross-IR-unit dependencies 2016-12-27 08:40:39 +00:00
LoopUnrollAnalyzer.cpp
MemDepPrinter.cpp
MemDerefPrinter.cpp
MemoryBuiltins.cpp [Analysis] Ignore nobuiltin on allocsize function calls. 2016-12-27 06:32:14 +00:00
MemoryDependenceAnalysis.cpp [MemDep] Handle gep with zeros for invariant.group 2016-12-30 18:45:07 +00:00
MemoryLocation.cpp
ModuleDebugInfoPrinter.cpp [IR] Remove the DIExpression field from DIGlobalVariable. 2016-12-20 02:09:43 +00:00
ModuleSummaryAnalysis.cpp [ThinLTO] Fix "||" vs "|" mixup. 2016-12-27 17:45:09 +00:00
ObjCARCAliasAnalysis.cpp Consistently use FunctionAnalysisManager 2016-08-09 00:28:15 +00:00
ObjCARCAnalysisUtils.cpp
ObjCARCInstKind.cpp Create llvm.addressofreturnaddress intrinsic 2016-10-12 22:13:19 +00:00
OptimizationDiagnosticInfo.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
OrderedBasicBlock.cpp
PHITransAddr.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
PostDominators.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
ProfileSummaryInfo.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
PtrUseVisitor.cpp
README.txt
RegionInfo.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
RegionPass.cpp
RegionPrinter.cpp
ScalarEvolution.cpp [SCEV] Be less conservative when extending bitwidths for computing ranges. 2016-12-20 23:03:42 +00:00
ScalarEvolutionAliasAnalysis.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
ScalarEvolutionExpander.cpp Revert @llvm.assume with operator bundles (r289755-r289757) 2016-12-19 08:22:17 +00:00
ScalarEvolutionNormalization.cpp
ScopedNoAliasAA.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
SparsePropagation.cpp
StratifiedSets.h Do a sweep over move ctors and remove those that are identical to the default. 2016-10-20 12:20:28 +00:00
TargetLibraryInfo.cpp [SimplifyLibCalls] Lower fls() to llvm.ctlz(). 2016-12-15 23:45:11 +00:00
TargetTransformInfo.cpp [PM] Change the static object whose address is used to uniquely identify 2016-11-23 17:53:26 +00:00
Trace.cpp
TypeBasedAliasAnalysis.cpp [TBAA] Don't generate invalid TBAA when merging nodes 2016-12-11 20:07:25 +00:00
TypeMetadataUtils.cpp TypeMetadataUtils: Simplify; spotted by Mehdi. 2016-12-21 19:00:47 +00:00
ValueTracking.cpp Fix an issue with isGuaranteedToTransferExecutionToSuccessor 2016-12-31 22:12:34 +00:00
VectorUtils.cpp IR: Change the gep_type_iterator API to avoid always exposing the "current" type. 2016-12-02 02:24:42 +00:00

Analysis Opportunities:

//===---------------------------------------------------------------------===//

In test/Transforms/LoopStrengthReduce/quadradic-exit-value.ll, the
ScalarEvolution expression for %r is this:

  {1,+,3,+,2}<loop>

Outside the loop, this could be evaluated simply as (%n * %n); however,
ScalarEvolution currently evaluates it as

  (-2 + (2 * (trunc i65 (((zext i64 (-2 + %n) to i65) * (zext i64 (-1 + %n) to i65)) /u 2) to i64)) + (3 * %n))

In addition to being much more complicated, it involves i65 arithmetic,
which is very inefficient when expanded into code.
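
The equivalence of the two forms can be checked numerically. The following is
a minimal Python sketch, not LLVM code: it assumes the usual binomial
interpretation of an add recurrence and assumes the exit value is taken at
iteration %n - 1, which is consistent with the expanded expression above. The
helper names (eval_addrec, expanded_exit_value, simple_exit_value) are
hypothetical and exist only for this illustration.

```python
from math import comb

MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


def eval_addrec(coeffs, i):
    """Value of the add recurrence {c0,+,c1,+,c2,...} at iteration i,
    computed as sum(c_k * C(i, k)) with unbounded integers."""
    return sum(c * comb(i, k) for k, c in enumerate(coeffs))


def expanded_exit_value(n):
    """The expanded form quoted above, modeled with explicit bit widths."""
    a = (n - 2) & MASK64                 # (-2 + %n) as i64, then zext to i65
    b = (n - 1) & MASK64                 # (-1 + %n) as i64, then zext to i65
    half = ((a * b) & MASK65) // 2       # i65 multiply, then /u 2
    return (-2 + 2 * (half & MASK64) + 3 * n) & MASK64  # trunc, then i64 math


def simple_exit_value(n):
    """The proposed simpler form, (%n * %n), as an i64."""
    return (n * n) & MASK64


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 1 << 31, (1 << 63) + 5, MASK64):
        assert expanded_exit_value(n) == simple_exit_value(n)
        # Assumed exit iteration: n - 1.
        assert eval_addrec([1, 3, 2], n - 1) & MASK64 == simple_exit_value(n)
    print("expanded form matches n * n for all sampled n")
```

The sampled values include ones where the 64-bit subtraction and the product
both wrap, so the check is not limited to small %n.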

//===---------------------------------------------------------------------===//

In formatValue in test/CodeGen/X86/lsr-delayed-fold.ll, ScalarEvolution is
forming this expression:

((trunc i64 (-1 * %arg5) to i32) + (trunc i64 %arg5 to i32) + (-1 * (trunc i64 undef to i32)))

This could be folded to

(-1 * (trunc i64 undef to i32))
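
The fold is valid because truncation commutes with negation in wraparound
arithmetic, so the first two terms cancel. The following is a minimal Python
sketch of that reasoning, not LLVM code. It models undef as one fixed but
arbitrary 64-bit value u, which is an assumption made for the illustration,
and the helper names (trunc_i64_to_i32, unfolded, folded) are hypothetical.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def trunc_i64_to_i32(x):
    """Keep the low 32 bits of a 64-bit value."""
    return x & MASK32


def unfolded(arg5, u):
    """The expression ScalarEvolution forms, evaluated as an i32."""
    neg_arg5 = (-arg5) & MASK64              # (-1 * %arg5) as i64
    neg_u = (-trunc_i64_to_i32(u)) & MASK32  # (-1 * (trunc u)) as i32
    total = trunc_i64_to_i32(neg_arg5) + trunc_i64_to_i32(arg5) + neg_u
    return total & MASK32


def folded(u):
    """The proposed folded form, evaluated as an i32."""
    return (-trunc_i64_to_i32(u)) & MASK32


if __name__ == "__main__":
    samples = (0, 1, 7, MASK32, 1 << 32, (1 << 63) + 12345, MASK64)
    for arg5 in samples:
        for u in samples:
            assert unfolded(arg5, u) == folded(u)
    print("unfolded and folded forms agree on all samples")
```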

//===---------------------------------------------------------------------===//