1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00
Commit Graph

96 Commits

Author SHA1 Message Date
Nadav Rotem
c0528d8ea7 Put the threshold magic number in a variable.
llvm-svn: 167134
2012-10-31 16:22:16 +00:00
Nadav Rotem
6319d757df Remove enum values since they are not used anymore.
llvm-svn: 167131
2012-10-31 16:14:06 +00:00
Hal Finkel
d1fc849359 BBVectorize: Choose pair ordering to minimize shuffles
BBVectorize would, except for loads and stores, always fuse instructions
so that the first instruction (in the current source order) would always
represent the low part of the input vectors and the second instruction
would always represent the high part. This lead to too many shuffles
being produced because sometimes the opposite order produces fewer of them.

With this change, BBVectorize tracks the kind of pair connections that form
the DAG of candidate pairs, and uses that information to reorder the pairs to
avoid excess shuffles. Using this information, a future commit will be able
to add VTTI-based shuffle costs to the pair selection procedure. Importantly,
the number of remaining shuffles can now be estimated during pair selection.

There are some trivial instruction reorderings in the test cases, and one
simple additional test where we certainly want to do a reordering to
avoid an unnecessary shuffle.

llvm-svn: 167122
2012-10-31 15:17:07 +00:00
Nadav Rotem
9ab0e93cc1 LoopVectorize: Do not vectorize loops with tiny constant trip counts.
llvm-svn: 167101
2012-10-31 03:31:07 +00:00
Nadav Rotem
240ead98fd Add support for loops that don't start with Zero.
This is important for loops in the LAPACK test-suite.
These loops start at 1 because they are auto-converted from fortran.

llvm-svn: 167084
2012-10-31 00:45:26 +00:00
Nadav Rotem
e06ea2d50f Add documentation.
llvm-svn: 167055
2012-10-30 22:06:26 +00:00
Hal Finkel
6cfb988397 BBVectorize: Cache fixed-order pairs instead of recomputing pointer info.
Instead of recomputing relative pointer information just prior to fusing,
cache this information (which also needs to be computed during the
candidate-pair selection process). This cuts down on the total number of
SE queries made, and also is a necessary intermediate step on the road toward
including shuffle costs in the pair selection procedure.

No functionality change is intended.

llvm-svn: 167049
2012-10-30 20:17:37 +00:00
Hal Finkel
1c116e9ec0 BBVectorize: Fix a small bug introduced in r167042.
We need to make sure that we take the correct load/store alignment
when the inputs are flipped.

llvm-svn: 167044
2012-10-30 19:47:37 +00:00
Hal Finkel
2d3b9c41d5 BBVectorize: Simplify how input swapping is handled.
Stop propagating the FlipMemInputs variable into the routines that
create the replacement instructions. Instead, just flip the arguments
of those routines. This allows for some associated cleanup (not all
of which is done here). No functionality change is intended.

llvm-svn: 167042
2012-10-30 19:35:29 +00:00
Hal Finkel
a27a64ab3e BBVectorize: Don't make calls to SE when the result is unused.
SE was being called during the instruction-fusion process (when the result
is unreliable, and thus ignored). No functionality change is intended.

llvm-svn: 167037
2012-10-30 18:55:49 +00:00
Nadav Rotem
69e6bca813 LoopVectorize: Add support for write-only loops when the write destination is a single pointer.
Speedup SciMark by 1%

llvm-svn: 167035
2012-10-30 18:36:45 +00:00
Nadav Rotem
4fc2912062 LoopVectorize: Fix a bug in the initialization of reduction variables. AND needs to start at all-one
while XOR, and OR need to start at zero.

llvm-svn: 167032
2012-10-30 18:12:36 +00:00
Nadav Rotem
2ada2db2a2 LoopVectorizer: change debug prints: Print the module identifier when deciding to vectorize. When deciding not to vectorize do not print the called function name because it can be null.
llvm-svn: 166989
2012-10-30 00:40:39 +00:00
Nadav Rotem
0c9445eb5c LoopVectorize: Update and preserve the dominator tree info.
llvm-svn: 166970
2012-10-29 21:52:38 +00:00
Hal Finkel
82db9961eb Update BBVectorize to use the new VTTI instr. cost interfaces.
The monolithic interface for instruction costs has been split into
several functions. This is the corresponding change. No functionality
change is intended.

llvm-svn: 166865
2012-10-27 04:33:48 +00:00
Nadav Rotem
04f3086065 1. Fix a bug in getTypeConversion. When a *simple* type is split, we need to return the type of the split result.
2. Change the maximum vectorization width from 4 to 8.
3. A test for both.

llvm-svn: 166864
2012-10-27 04:11:32 +00:00
Nadav Rotem
133e437c48 Refactor the VectorTargetTransformInfo interface.
Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc.

Port the LoopVectorizer to the new API.

The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions.

llvm-svn: 166836
2012-10-26 23:49:28 +00:00
Hal Finkel
32f63f9091 Use VTTI->getNumberOfParts in BBVectorize.
This change reflects VTTI refactoring; no functionality change intended.

llvm-svn: 166752
2012-10-26 04:28:06 +00:00
Hal Finkel
a47e6ef6e6 Disable generation of pointer vectors by BBVectorize.
Once vector-of-pointer support works, then this can be reverted.

llvm-svn: 166741
2012-10-26 00:05:26 +00:00
Hal Finkel
d26d094306 BBVectorize, when using VTTI, should not form types that will be split.
This is needed so that perl's SHA can be compiled (otherwise
BBVectorize takes far too long to find its fixed point).

I'll try to come up with a reduced test case.

llvm-svn: 166738
2012-10-25 23:47:16 +00:00
Hal Finkel
e2184ac235 Begin incorporating target information into BBVectorize.
This is the first of several steps to incorporate information from the new
TargetTransformInfo infrastructure into BBVectorize. Two things are done here:

 1. Target information is used to determine if it is profitable to fuse two
    instructions. This means that the cost of the vector operation must not
    be more expensive than the cost of the two original operations. Pairs that
    are not profitable are no longer considered (because current cost information
    is incomplete, for intrinsics for example, equal-cost pairs are still
    considered).

 2. The 'cost savings' computed for the profitability check are also used to
    rank the DAGs that represent the potential vectorization plans. Specifically,
    for nodes of non-trivial depth, the cost savings is used as the node
    weight.

The next step will be to incorporate the shuffle costs into the DAG weighting;
this will give the edges of the DAG weights as well. Once that is done, when
target information is available, we should be able to dispense with the
depth heuristic.

llvm-svn: 166716
2012-10-25 21:12:23 +00:00
Nadav Rotem
f73d286571 LoopVectorize: Teach the cost model to query scalar costs as scalar types and not vectors of 1.
llvm-svn: 166715
2012-10-25 21:03:48 +00:00
Nadav Rotem
5635a9350f Add support for additional reduction variables: AND, OR, XOR.
Patch by Paul Redmond <paul.redmond@intel.com>.

llvm-svn: 166649
2012-10-25 00:08:41 +00:00
Nadav Rotem
9d7ba0ef55 Implement a basic cost model for vector and scalar instructions.
llvm-svn: 166642
2012-10-24 23:47:38 +00:00
Nadav Rotem
23bafecedf whitespace
llvm-svn: 166622
2012-10-24 20:58:40 +00:00
Nadav Rotem
05d9e80245 LoopVectorizer: Add a basic cost model which uses the VTTI interface.
llvm-svn: 166620
2012-10-24 20:36:32 +00:00
Micah Villmow
ce5e56a156 Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this!
llvm-svn: 166596
2012-10-24 17:25:11 +00:00
Micah Villmow
ae5ce80c36 Delete a directory that wasn't supposed to be checked in yet.
llvm-svn: 166591
2012-10-24 17:20:04 +00:00
Nadav Rotem
3deae09579 Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news.
PR14158.

llvm-svn: 166491
2012-10-23 18:44:18 +00:00
Nadav Rotem
302d4b678a Don't crash if the load/store pointer is not a GEP.
Fix by Shivarama Rao <Shivarama.Rao@amd.com>

llvm-svn: 166427
2012-10-22 18:27:56 +00:00
Hal Finkel
7a55058abc BBVectorize should ignore unreachable blocks.
Unreachable blocks can have invalid instructions. For example,
jump threading can produce self-referential instructions in
unreachable blocks. Also, we should not be spending time
optimizing unreachable code. Fixes PR14133.

llvm-svn: 166423
2012-10-22 18:00:55 +00:00
Nadav Rotem
ea70508da6 Rename a variable.
llvm-svn: 166410
2012-10-22 04:53:05 +00:00
Nadav Rotem
6b56385c1a Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector).
llvm-svn: 166409
2012-10-22 04:38:00 +00:00
Nadav Rotem
708e5d2fb0 Update the loop vectorizer docs.
llvm-svn: 166408
2012-10-22 03:52:53 +00:00
Anders Carlsson
d04e66ae01 Avoid an extra hash lookup when inserting a value into the widen map.
llvm-svn: 166395
2012-10-21 16:26:35 +00:00
Jakub Staszak
a477da32fc Simplify code. No functionality change.
llvm-svn: 166393
2012-10-21 15:36:03 +00:00
Jakub Staszak
ed5ec60053 Simplify code. No functionality change.
llvm-svn: 166392
2012-10-21 15:29:19 +00:00
Nadav Rotem
380fe201de Fix a bug in the vectorization of wide load/store operations.
We used a SCEV to detect that A[X] is consecutive. We assumed that X was
the induction variable. But X can be any expression that uses the induction
for example: X = i + 2;

llvm-svn: 166388
2012-10-21 06:49:10 +00:00
Nadav Rotem
825cda19d5 Add support for reduction variables that do not start at zero.
This is important for nested-loop reductions such as :

In the innermost loop, the induction variable does not start with zero:

for (i = 0 .. n)
 for (j = 0 .. m)
  sum += ...

llvm-svn: 166387
2012-10-21 05:52:51 +00:00
Nadav Rotem
5ab04af30a Document change. Describe the pass and some papers that inspired the design of the pass.
llvm-svn: 166386
2012-10-21 04:04:25 +00:00
Nadav Rotem
763abacb83 Vectorizer: fix a bug in the classification of induction/reduction phis.
llvm-svn: 166384
2012-10-21 02:38:01 +00:00
Nadav Rotem
2ee8edf34a Fix an infinite loop in the loop-vectorizer.
PR14134.

llvm-svn: 166379
2012-10-20 20:45:01 +00:00
Nadav Rotem
cdd573e703 Vectorize: teach cavVectorizeMemory to distinguish between A[i]+=x and A[B[i]]+=x.
If the pointer is consecutive then it is safe to read and write. If the pointer is non-loop-consecutive then
it is unsafe to vectorize it because we may hit an ordering issue.

llvm-svn: 166371
2012-10-20 08:26:33 +00:00
Nadav Rotem
762317ecc6 Fix a typo
llvm-svn: 166367
2012-10-20 05:03:27 +00:00
Nadav Rotem
4e013454ca Vectorizer: refactor the memory checks to a new function. No functionality change.
llvm-svn: 166366
2012-10-20 04:59:06 +00:00
Nadav Rotem
61a5b018ad LoopVectorize: Keep the IRBuilder on the stack.
llvm-svn: 166354
2012-10-19 23:27:19 +00:00
Nadav Rotem
8fe03aa4c1 Vectorizer: Add support for loop reductions.
For example:

  for (i=0; i<n; i++)
   sum += A[i] +  B[i] + i;

llvm-svn: 166351
2012-10-19 23:05:40 +00:00
Benjamin Kramer
5080891e07 LoopVectorize: Keep the IRBuilder on the stack.
No functionality change.

llvm-svn: 166274
2012-10-19 08:42:02 +00:00
Nadav Rotem
451f76acc3 vectorizer: Add support for reading and writing from the same memory location.
llvm-svn: 166255
2012-10-19 01:24:18 +00:00
Nadav Rotem
fe5d8c8c09 cleanup the comment.
llvm-svn: 166247
2012-10-18 23:21:01 +00:00