1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

489 Commits

Author SHA1 Message Date
Arnold Schwaighofer
3fa9376236 SLPVectorizer: Fix whitespace errors.
llvm-svn: 195468
2013-11-22 15:47:17 +00:00
Yi Jiang
74286d427a SLP Vectorizer: Extract cost will only be added once even if the scalar has multiple external uses.
llvm-svn: 195406
2013-11-22 01:57:02 +00:00
Arnold Schwaighofer
242935ec8c SLPVectorizer: Fix stale for Value pointer array
We are slicing an array of Value pointers and process those slices in a loop.
The problem is that we might invalidate a later slice by vectorizing a former
slice.

Use a WeakVH to track the pointer. If the pointer is deleted or RAUW'ed we can
tell.

The test case will only fail when running with libgmalloc.

radar://15498655

llvm-svn: 195162
2013-11-19 22:20:20 +00:00
Arnold Schwaighofer
3149313505 SLPVectorizer: Fix whitespace errors
llvm-svn: 195161
2013-11-19 22:20:18 +00:00
Arnold Schwaighofer
e4280ec4dd LoopVectorizer: Extend the induction variable to a larger type
In some case the loop exit count computation can overflow. Extend the type to
prevent most of those cases.

The problem is loops like:
int main ()
{
  int a = 1;
  char b = 0;
  lbl:
    a &= 4;
    b--;
    if (b) goto lbl;
  return a;
}

The backedge count is 255. The induction variable type is i8. If we add one to
255 to get the exit count we overflow to zero.

To work around this issue we extend the type of the induction variable to i32 in
the case of i8 and i16.

PR17532

llvm-svn: 195008
2013-11-18 13:14:32 +00:00
Arnold Schwaighofer
01b6f1cc9a LoopVectorizer: Use abi alignment for accesses with no alignment
When we vectorize a scalar access with no alignment specified, we have to set
the target's abi alignment of the scalar access on the vectorized access.
Using the same alignment of zero would be wrong because most targets will have a
bigger abi alignment for vector types.

This probably fixes PR17878.

llvm-svn: 194876
2013-11-15 23:09:33 +00:00
Renato Golin
ed3b88828a Move debug message in vectorizer
No functional change, just better reporting.

llvm-svn: 194388
2013-11-11 16:27:35 +00:00
Benjamin Kramer
9eaaead296 SLPVectorizer: Use properlyDominates to satisfy the irreflexivity of a strict weak ordering.
STL debug mode checks this.

llvm-svn: 194015
2013-11-04 21:34:55 +00:00
Benjamin Kramer
15ebc47438 SLPVectorizer: Add a missing pair of parens. No functionality change.
llvm-svn: 193958
2013-11-03 12:54:32 +00:00
Benjamin Kramer
f45bcf5480 SLPVectorizer: When CSEing generated gathers only scan blocks containing them.
Instead of doing a RPO traversal of the whole function remember the blocks
containing gathers (typically <= 2) and scan them in dominator-first order.

The actual CSE is still quadratic, but I'm not confident that adding a
scoped hash table here is worth it as we're only looking at the generated
instructions and not arbitrary code.

llvm-svn: 193956
2013-11-03 12:27:52 +00:00
Benjamin Kramer
ae919396b6 SLPVectorizer: Remove duplicated function.
llvm-svn: 193927
2013-11-02 14:46:27 +00:00
Benjamin Kramer
abc7baa1dc LoopVectorize: Remove quadratic behavior the local CSE.
Doing this with a hash map doesn't change behavior and avoids calling
isIdenticalTo O(n^2) times. This should probably eventually move into a utility
class shared with EarlyCSE and the limited CSE in the SLPVectorizer.

llvm-svn: 193926
2013-11-02 13:39:00 +00:00
Arnold Schwaighofer
fed0c4f8e8 LoopVectorizer: Move cse code into its own function
llvm-svn: 193895
2013-11-01 23:28:54 +00:00
Arnold Schwaighofer
fba1c74b67 LoopVectorizer: Perform redundancy elimination on induction variables
When the loop vectorizer was part of the SCC inliner pass manager gvn would
run after the loop vectorizer followed by instcombine. This way redundancy
(multiple uses) were removed and instcombine could perform scalarization on the
induction variables. Having moved the loop vectorizer to later we no longer run
any form of redundancy elimination before we perform instcombine. This caused
vectorized induction variables to survive that did not before.

On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops.

This should also help lpbench and paq8p.

I ran a Release (without Asserts) build over the test-suite and did not see any
negative impact on compile time.

radar://15339680

llvm-svn: 193891
2013-11-01 22:18:19 +00:00
Benjamin Kramer
3045156cee LoopVectorize: Look for consecutive acces in GEPs with trailing zero indices
If we have a pointer to a single-element struct we can still build wide loads
and stores to it (if there is no padding).

llvm-svn: 193860
2013-11-01 14:09:50 +00:00
Arnold Schwaighofer
5d7be45165 LoopVectorizer: If dependency checks fail try runtime checks
When a dependence check fails we can still try to vectorize loops with runtime
array bounds checks.

This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the
scalar performance on a corei7-avx.

radar://15339680

llvm-svn: 193853
2013-11-01 03:05:07 +00:00
Arnold Schwaighofer
fe8e481ef6 LoopVectorizer: Clear all member data structures in RuntimeCheck.reset()
Clear all data structures when resetting the RuntimeCheck data structure.

No test case. This was exposed by an upcomming change.

llvm-svn: 193852
2013-11-01 03:05:04 +00:00
Arnold Schwaighofer
fe80e563da ARM cost model: Account for zero cost scalar SROA instructions
By vectorizing a series of srl, or, ... instructions we have obfuscated the
intention so much that the backend does not know how to fold this code away.

radar://15336950

llvm-svn: 193573
2013-10-29 01:33:53 +00:00
Arnold Schwaighofer
6f22639253 SLPVectorizer: Use vector type for vectorized memory operations
No test case, because with the current cost model we don't see a difference.
An upcoming ARM memory cost model change will expose and test this bug.

radar://15332579

llvm-svn: 193572
2013-10-29 01:33:50 +00:00
Wan Xiaofei
f3100f24fa Quick look-up for block in loop.
This patch implements quick look-up for block in loop by maintaining a hash set for blocks.
It improves the efficiency of loop analysis a lot, the biggest improvement could be 5-6%(458.sjeng).
Below are the compilation time for our benchmark in llc before & after the patch.

Benchmark	llc - trunk		llc - patched	
401.bzip2	0.339081	100.00%	0.329657	102.86%
403.gcc		19.853966	100.00%	19.605466	101.27%
429.mcf		0.049823	100.00%	0.048451	102.83%
433.milc	0.514898	100.00%	0.510217	100.92%
444.namd	1.109328	100.00%	1.103481	100.53%
445.gobmk	4.988028	100.00%	4.929114	101.20%
456.hmmer	0.843871	100.00%	0.825865	102.18%
458.sjeng	0.754238	100.00%	0.714095	105.62%
464.h264ref	2.9668		100.00%	2.90612		102.09%
471.omnetpp	4.556533	100.00%	4.511886	100.99%
bitmnp01	0.038168	100.00%	0.0357		106.91%
idctrn01	0.037745	100.00%	0.037332	101.11%
libquake2	3.78689		100.00%	3.76209		100.66%
libquake_	2.251525	100.00%	2.234104	100.78%
linpack		0.033159	100.00%	0.032788	101.13%
matrix01	0.045319	100.00%	0.043497	104.19%
nbench		0.333161	100.00%	0.329799	101.02%
tblook01	0.017863	100.00%	0.017666	101.12%
ttsprk01	0.054337	100.00%	0.053057	102.41%

Reviewer	: Andrew Trick <atrick@apple.com>, Hal Finkel <hfinkel@anl.gov>
Approver	: Andrew Trick <atrick@apple.com>
Test		: Pass make check-all & llvm test-suite

llvm-svn: 193460
2013-10-26 03:08:02 +00:00
Hal Finkel
d554c99b37 LoopVectorizer: Don't attempt to vectorize extractelement instructions
The loop vectorizer does not currently understand how to vectorize
extractelement instructions. The existing check, which excluded all
vector-valued instructions, did not catch extractelement instructions because
it checked only the return value. As a result, vectorization would proceed,
producing illegal instructions like this:

  %58 = extractelement <2 x i32> %15, i32 0
  %59 = extractelement i32 %58, i32 0

where the second extractelement is illegal because its first operand is not a vector.

llvm-svn: 193434
2013-10-25 20:40:15 +00:00
Renato Golin
ae79e04f36 Mark vector loops as already vectorized
Make sure we mark all loops (scalar and vector) when vectorizing,
so that we don't try to vectorize them anymore. Also, set unroll
to 1, since this is what we check for on early exit.

llvm-svn: 193349
2013-10-24 14:50:51 +00:00
Matt Arsenault
fcca6dd732 Use more type helper functions
llvm-svn: 193109
2013-10-21 19:43:56 +00:00
Arnold Schwaighofer
789187ee86 SLPVectorizer: Don't vectorize volatile memory operations
radar://15231682

Reapply r192799,
  http://lab.llvm.org:8011/builders/lldb-x86_64-debian-clang/builds/8226
showed that the bot is still broken even with this out.

llvm-svn: 192820
2013-10-16 17:52:40 +00:00
Arnold Schwaighofer
7097263371 Revert "SLPVectorizer: Don't vectorize volatile memory operations"
This speculatively reverts commit 192799. It might have broken a linux buildbot.

llvm-svn: 192816
2013-10-16 17:19:40 +00:00
Arnold Schwaighofer
eebda9d6cf SLPVectorizer: Don't vectorize volatile memory operations
radar://15231682

llvm-svn: 192799
2013-10-16 16:09:00 +00:00
Benjamin Kramer
a00487e169 LoopVectorize: Properly reflect PODness in comments.
llvm-svn: 192717
2013-10-15 16:19:54 +00:00
Arnold Schwaighofer
38ec37faba SLPVectorizer: Sort PHINodes based on their opcode
Before this patch we relied on the order of phi nodes when we looked for phi
nodes of the same type. This could prevent vectorization of cases where there
was a phi node of a second type in between phi nodes of some type.

This is important for vectorization of an internal graphics kernel. On the test
suite + external on x86_64 (and on a run on armv7s) it showed no impact on
either performance or compile time.

radar://15024459

llvm-svn: 192537
2013-10-12 18:56:27 +00:00
Tobias Grosser
bc154d94d0 LoopVectorize: Add missing INITIALIZE_PASS_DEPENDENCY macros
Contributed-by:  Peter Zotov  <whitequark@whitequark.org>
llvm-svn: 192536
2013-10-12 18:29:15 +00:00
Renato Golin
ec7fe56cfa Better info when debugging vectorizer
llvm-svn: 192460
2013-10-11 16:14:39 +00:00
Arnold Schwaighofer
bfe48b104a LoopVectorize: External uses must use the last value in a reduction cycle
Otherwise, we don't perform operations that would have been performed on
the scalar version.

Fixes PR17498.

llvm-svn: 192133
2013-10-07 21:05:43 +00:00
Arnold Schwaighofer
6fab174840 SLPVectorizer: Sort inputs to commutative binary operations
Sort the operands of the other entries in the current vectorization root
according to the first entry's operands opcodes.

%conv0 = uitofp ...
%load0 = load float ...

= fmul %conv0, %load0
= fmul %load0, %conv1
= fmul %load0, %conv2

Make sure that we recursively vectorize <%conv0, %conv1, %conv2> and <%load0,
%load0, %load0>.

This makes it more likely to obtain vectorizable trees. We have to be careful
when we sort that we don't destroy 'good' existing ordering implied by source
order.

radar://15080067

llvm-svn: 191977
2013-10-04 20:39:16 +00:00
Matt Arsenault
2c180f66a9 Don't use runtime bounds check between address spaces.
Don't vectorize with a runtime check if it requires a
comparison between pointers with different address spaces.
The values can't be assumed to be directly comparable.
Previously it would create an illegal bitcast.

llvm-svn: 191862
2013-10-02 22:38:17 +00:00
Yi Jiang
09195de8f3 Apply slp vectorization on fully-vectorizable tree of height 2
llvm-svn: 191852
2013-10-02 20:20:39 +00:00
Matt Arsenault
15633246b6 Fix debug printing spacing.
Fix missing newlines, missing and extra spaces in printed messages.

llvm-svn: 191851
2013-10-02 20:04:29 +00:00
Matt Arsenault
26cc78f548 Fix comment grammar and capitalization.
llvm-svn: 191850
2013-10-02 20:04:26 +00:00
Benjamin Kramer
4458ae839d SLPVectorizer: Make store chain finding more aggressive with GetUnderlyingObject.
This recursively strips all GEPs like the existing code. It also handles bitcasts and
other operations that do not change the pointer value.

llvm-svn: 191847
2013-10-02 19:06:06 +00:00
Rafael Espindola
a279462828 Remove several unused variables.
Patch by Alp Toker.

llvm-svn: 191757
2013-10-01 13:32:03 +00:00
Matt Arsenault
a3e171a6c8 Fix code duplication
llvm-svn: 191716
2013-10-01 00:01:14 +00:00
Benjamin Kramer
7b5eaaacfd Convert manual insert point restores to the new RAII object.
llvm-svn: 191675
2013-09-30 15:40:17 +00:00
Benjamin Kramer
33abdcddb3 IRBuilder: Add RAII objects to reset insertion points or fast math flags.
Inspired by the object from the SLPVectorizer. This found a minor bug in the
debug loc restoration in the vectorizer where the location of a following
instruction was attached instead of the location from the original instruction.

llvm-svn: 191673
2013-09-30 15:39:48 +00:00
Robert Wilhelm
198f21deb3 Even more spelling fixes for "instruction".
llvm-svn: 191611
2013-09-28 13:42:22 +00:00
Robert Wilhelm
6b36431ffa Fix spelling intruction -> instruction.
llvm-svn: 191610
2013-09-28 11:46:15 +00:00
Matt Arsenault
0dc0668061 Fix SLPVectorizer using wrong address space for load/store
llvm-svn: 191564
2013-09-27 21:24:57 +00:00
Justin Bogner
db7f32982e Transforms: Use getFirstNonPHI to set the insertion point for PHIs
We were previously using getFirstInsertionPt to insert PHI
instructions when vectorizing, but getFirstInsertionPt also skips past
landingpads, causing this to generate invalid IR.

We can avoid this issue by using getFirstNonPHI instead.

llvm-svn: 191526
2013-09-27 15:30:25 +00:00
Arnold Schwaighofer
9830391a8f SLPVectorize: Put horizontal reductions feeding a store under separate flag
Put them under a separate flag for experimentation. They are more likely to
interfere with loop vectorization which happens later in the pass pipeline.

llvm-svn: 191371
2013-09-25 14:02:32 +00:00
Yi Jiang
6ba7a7b02c set the cost of tiny trees to INT_MAX in SLP vectorizer to disable vectorization on them
llvm-svn: 191314
2013-09-24 17:26:43 +00:00
Arnold Schwaighofer
b1cea2cfcc Revert "LoopVectorizer: Only allow vectorization of intrinsics."
Revert 191122 - with extra checks we are allowed to vectorize math library
function calls.

Standard library indentifiers are reserved names so functions with external
linkage must not overrided them. However, functions with internal linkage can.

Therefore, we can vectorize calls to math library functions with a check for
external linkage and matching signature. This matches what we do during
SelectionDAG building.

llvm-svn: 191206
2013-09-23 14:54:39 +00:00
Arnold Schwaighofer
4f8b0cf48b SLPVectorizer: Fix multiline comment warning
llvm-svn: 191135
2013-09-21 05:37:30 +00:00
Arnold Schwaighofer
c1f8473eb6 Reapply "SLPVectorizer: Handle more horizontal reductions (disabled)""
Reapply r191108 with a fix for a memory corruption error I introduced.  Of
course, we can't reference the scalars that we replace by vectorizing and then
call their eraseFromParent method. I only 'needed' the scalars to get the
DebugLoc. Just store the DebugLoc before actually vectorizing instead. As a nice
side effect, this also simplifies the interface between BoUpSLP and the
HorizontalReduction class to returning a value pointer (the vectorized tree
root).

radar://14607682

llvm-svn: 191123
2013-09-21 01:06:00 +00:00