mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00
llvm-mirror/lib
David Green 7d3e19f39a [LV] Account for tripcount when calculation vectorization profitability
The loop vectorizer will currently assume a large trip count when
calculating which of several vectorization factors is most profitable.
That is often not a terrible assumption to make, as small trip count
loops will usually have been fully unrolled. There are cases, however,
where we will try to vectorize them and, especially when folding the
tail by masking, can incorrectly choose to vectorize loops where doing
so is not beneficial, because the folded tail rounds the iteration count
up for the vectorized loop.

The motivating example here has a trip count of 5, so it performs either
5 scalar iterations or 2 vector iterations (with VF=4). At a high enough
trip count the vectorization becomes profitable, but here the rounding
up to 2 vector iterations, versus only 5 scalar iterations, makes it
unprofitable.

This adds an alternative cost calculation when we know the max trip
count and are folding tail by masking, rounding the iteration count up
to the correct number for the vector width. We still do not account for
anything like setup cost or the mixture of vector and scalar loops, but
this is at least an improvement in a few cases that we have had
reported.
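
As a rough illustration of the rounding described above, the comparison
can be sketched as follows. This is a minimal standalone sketch, not the
code from the patch: the function names, the plain integer cost type and
the per-iteration costs used in the example are all assumptions made for
illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: iterations executed by a loop of width VF when the
// tail is folded by masking, i.e. the trip count rounded up to a whole
// number of vector iterations.
uint64_t foldedTailIterations(uint64_t MaxTripCount, uint64_t VF) {
  assert(VF > 0 && "vectorization factor must be non-zero");
  return (MaxTripCount + VF - 1) / VF; // ceil(MaxTripCount / VF)
}

// Hypothetical helper: total cost of the loop body over the whole loop,
// ignoring setup cost and any mixture of vector and scalar loops.
uint64_t totalLoopCost(uint64_t CostPerIteration, uint64_t MaxTripCount,
                       uint64_t VF) {
  return CostPerIteration * foldedTailIterations(MaxTripCount, VF);
}

// Returns true if candidate A (per-iteration cost CostA at width VFA) is
// cheaper over the whole loop than candidate B, given a known maximum
// trip count.
bool isMoreProfitable(uint64_t CostA, uint64_t VFA, uint64_t CostB,
                      uint64_t VFB, uint64_t MaxTripCount) {
  return totalLoopCost(CostA, MaxTripCount, VFA) <
         totalLoopCost(CostB, MaxTripCount, VFB);
}
```

With assumed per-iteration costs of 4 for the scalar loop and 12 for the
VF=4 loop, a trip count of 5 gives 2 * 12 = 24 for the vector loop
against 5 * 4 = 20 for the scalar loop, so the vector loop is rejected.
At a trip count of 100 the same costs give 25 * 12 = 300 against 400,
and the vector loop wins.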

Differential Revision: https://reviews.llvm.org/D101726
2021-05-06 12:36:46 +01:00
..
Analysis Make dependency between certain analysis passes transitive (reapply) 2021-05-05 15:17:55 +02:00
AsmParser [Lexer] Allow LLLexer to be used as an API 2021-04-26 12:43:14 -04:00
BinaryFormat [NFC] Reordering parameters in getFile and getFileOrSTDIN 2021-03-25 09:47:49 -04:00
Bitcode [Bitcode] Ensure DIArgList in bitcode has no null or forward metadata refs 2021-04-22 12:03:33 +01:00
Bitstream
CodeGen [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics 2021-05-06 04:01:20 +01:00
DebugInfo [CodeView] Truncate Long Type Names With An MD5 Hash 2021-05-04 10:51:21 -04:00
Demangle [demangler] Initial support for the new Rust mangling scheme 2021-05-03 16:44:30 -07:00
DWARFLinker [MC] Untangle MCContext and MCObjectFileInfo 2021-05-05 10:03:02 -07:00
ExecutionEngine [ORC] Introduce C API for adding object buffers directly to an object layer. 2021-05-05 19:02:13 -07:00
Extensions
FileCheck Fix PR46880: Fail CHECK-NOT with undefined variable 2021-04-20 14:42:46 +01:00
Frontend [OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions 2021-05-03 15:42:32 -04:00
Fuzzer
FuzzMutate [NFC] Reordering parameters in getFile and getFileOrSTDIN 2021-03-25 09:47:49 -04:00
InterfaceStub
IR [SVE][LoopVectorize] Add support for scalable vectorization of first-order recurrences 2021-05-06 11:35:39 +01:00
IRReader [NFC] Reordering parameters in getFile and getFileOrSTDIN 2021-03-25 09:47:49 -04:00
LineEditor
Linker Linker: Avoid scheduling the link of a global value twice due to an alias 2021-04-28 13:22:10 -07:00
LTO [Support] Don't include VirtualFileSystem.h in CommandLine.h 2021-04-21 10:19:01 -04:00
MC [MC] Untangle MCContext and MCObjectFileInfo 2021-05-05 10:03:02 -07:00
MCA [MCA] Fix CarryOver check in the DispatchStage (PR50174). 2021-04-30 14:26:46 +01:00
Object [MC] Untangle MCContext and MCObjectFileInfo 2021-05-05 10:03:02 -07:00
ObjectYAML Reland "AMDGPU/llvm-readobj: Add missing tests for note parsing/displaying" 2021-05-02 22:56:17 -04:00
Option [clang][cli] NFC: Remove ArgList infrastructure for recording queries 2021-02-25 13:53:24 +01:00
Passes [NewPM] Invalidate AAManager after populating GlobalsAA 2021-05-03 16:37:32 -07:00
ProfileData [CSSPGO] Explicitly disallow Binary and Compact Binary profile format for CSSPGO 2021-04-26 09:10:24 -07:00
Remarks [Support] Don't include VirtualFileSystem.h in CommandLine.h 2021-04-21 10:19:01 -04:00
Support [SystemZ][z/OS] Fix return values in AutoConversion functions 2021-05-05 09:43:14 -04:00
TableGen [TableGen] Fix two bugs in 'defm' when complex 'assert' is involved. 2021-04-30 11:31:06 -04:00
Target [SystemZ] Support builtin_frame_address with packed stack without backchain. 2021-05-06 12:50:49 +02:00
Testing [SystemZ][z/OS] Add IsText Argument to GetFile and GetFileOrSTDIN 2021-04-16 10:08:36 -04:00
TextAPI [TextAPI] move source code files out of subdirectory, NFC 2021-04-05 10:24:42 -07:00
ToolDrivers [NFC] Reordering parameters in getFile and getFileOrSTDIN 2021-03-25 09:47:49 -04:00
Transforms [LV] Account for tripcount when calculation vectorization profitability 2021-05-06 12:36:46 +01:00
WindowsManifest
XRay
CMakeLists.txt