1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

17342 Commits

Author SHA1 Message Date
Nadav Rotem
5635a9350f Add support for additional reduction variables: AND, OR, XOR.
Patch by Paul Redmond <paul.redmond@intel.com>.

llvm-svn: 166649
2012-10-25 00:08:41 +00:00
Nadav Rotem
9d7ba0ef55 Implement a basic cost model for vector and scalar instructions.
llvm-svn: 166642
2012-10-24 23:47:38 +00:00
Chad Rosier
3935e5ec30 Tell llvm-mc we're using intel syntax, so we don't have to use directives.
llvm-svn: 166640
2012-10-24 23:34:38 +00:00
Chad Rosier
492e58a0f7 [ms-inline asm] Add back-end test case for r166632. Make sure we emit the
correct .s output as well as get the correct encoding by the integrated
assembler.

llvm-svn: 166638
2012-10-24 23:10:28 +00:00
Hal Finkel
39d442863c Update GVN to support vectors of pointers.
GVN will now generate ptrtoint instructions for vectors of pointers.
Fixes PR14166.

llvm-svn: 166624
2012-10-24 21:22:30 +00:00
Nadav Rotem
05d9e80245 LoopVectorizer: Add a basic cost model which uses the VTTI interface.
llvm-svn: 166620
2012-10-24 20:36:32 +00:00
Evan Cheng
f97472cdf6 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385

llvm-svn: 166613
2012-10-24 19:53:01 +00:00
Hal Finkel
2392da9d83 getSmallConstantTripMultiple should never return zero.
When the trip count is -1, getSmallConstantTripMultiple could return zero,
and this would cause runtime loop unrolling to assert. Instead of returning
zero, one is now returned (consistent with the existing overflow cases).
Fixes PR14167.

llvm-svn: 166612
2012-10-24 19:46:44 +00:00
Micah Villmow
521311700f Add in support for getIntPtrType to get the pointer type based on the address space.
This checkin also adds in some tests that utilize these paths and updates some of the
clients.

llvm-svn: 166578
2012-10-24 15:52:52 +00:00
Elena Demikhovsky
b711ef1960 Special calling conventions for Intel OpenCL built-in library.
llvm-svn: 166566
2012-10-24 14:46:16 +00:00
Duncan Sands
1f4e159b01 Add a testcase that would have noticed the typo fixed in commit 166475.
llvm-svn: 166547
2012-10-24 07:17:20 +00:00
Michael Liao
18e40965aa Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.

llvm-svn: 166546
2012-10-24 04:14:18 +00:00
Michael Liao
70bb8004bd Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.

llvm-svn: 166545
2012-10-24 04:09:32 +00:00
Akira Hatanaka
8c77adc9ce [mips] Make sure sret argument is returned in register V0.
llvm-svn: 166539
2012-10-24 02:10:54 +00:00
Rafael Espindola
8ebde25931 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

llvm-svn: 166536
2012-10-24 01:58:48 +00:00
Michael Liao
7ca099f727 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.

llvm-svn: 166504
2012-10-23 21:40:15 +00:00
Nadav Rotem
3deae09579 Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news.
PR14158.

llvm-svn: 166491
2012-10-23 18:44:18 +00:00
Bill Wendling
cc498d64b5 Ignore unreachable blocks when doing memory dependence analysis on non-local
loads. It's not really profitable and may result in GVN going into an infinite
loop when it hits constructs like this:

     %x = gep %some.type %x, ...

Found via an LTO build of LLVM.

llvm-svn: 166490
2012-10-23 18:37:11 +00:00
Michael Liao
23c890e0a8 Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Duncan Sands
6ce2ce7ed1 Transform code like this
%V = mul i64 %N, 4
 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V
into
 %t1 = getelementptr i32* %arr, i32 %N
 %t = bitcast i32* %t1 to i8*
incorporating the multiplication into the getelementptr.
This happens all the time in dragonegg, for example for
  int foo(int *A, int N) {
    return A[N];
  }
because gcc turns this into byte pointer arithmetic before it hits the plugin:
  D.1590_2 = (long unsigned int) N_1(D);
  D.1591_3 = D.1590_2 * 4;
  D.1592_5 = A_4(D) + D.1591_3;
  D.1589_6 = *D.1592_5;
  return D.1589_6;
The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr
on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the
transform fires on.

An analogous transform (with no testcases!) already existed for bitcasts of
arrays, so I rewrote it to share code with this one.

llvm-svn: 166474
2012-10-23 08:28:26 +00:00
Reed Kotler
730040c219 implement setXX patterns
llvm-svn: 166459
2012-10-23 01:35:48 +00:00
Bill Wendling
e97df2d337 When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>

llvm-svn: 166448
2012-10-22 23:30:04 +00:00
Kevin Enderby
0f6b703b72 Add support for annotated disassembly output for X86 and arm.
Per the October 12, 2012 Proposal for annotated disassembly output sent out by
Jim Grosbach this set of changes implements this for X86 and arm.  The llvm-mc
tool now has a -mdis option to produced the marked up disassembly and a couple
of small example test cases have been added.

rdar://11764962

llvm-svn: 166445
2012-10-22 22:31:46 +00:00
Nadav Rotem
302d4b678a Don't crash if the load/store pointer is not a GEP.
Fix by Shivarama Rao <Shivarama.Rao@amd.com>

llvm-svn: 166427
2012-10-22 18:27:56 +00:00
Nadav Rotem
58a0c52168 Add a testcase for the previous commit.
llvm-svn: 166425
2012-10-22 18:16:55 +00:00
Argyrios Kyrtzidis
61e2024bf0 Revert r166407 because it caused analyzer tests to crash and broke self-host bots.
llvm-svn: 166424
2012-10-22 18:16:14 +00:00
Hal Finkel
7a55058abc BBVectorize should ignore unreachable blocks.
Unreachable blocks can have invalid instructions. For example,
jump threading can produce self-referential instructions in
unreachable blocks. Also, we should not be spending time
optimizing unreachable code. Fixes PR14133.

llvm-svn: 166423
2012-10-22 18:00:55 +00:00
Nadav Rotem
6b56385c1a Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector).
llvm-svn: 166409
2012-10-22 04:38:00 +00:00
Nick Lewycky
44e5136371 Reapply r166405, teaching tailcallelim to be smarter about nocapture, with a
very small but very important bugfix:
  bool shouldExplore(Use *U) {
    Value *V = U->get();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
    [...]
should have read:
  bool shouldExplore(Use *U) {
    Value *V = U->getUser();
    if (isa<CallInst>(V) || isa<InvokeInst>(V))
Fixes PR14143!

llvm-svn: 166407
2012-10-22 03:03:52 +00:00
NAKAMURA Takumi
32507dd070 Revert r166405, "Teach TailRecursionElimination to consider 'nocapture' when deciding whether"
It broke selfhosting stage2 in several builders.

llvm-svn: 166406
2012-10-22 00:48:51 +00:00
Nick Lewycky
1a50a0e414 Teach TailRecursionElimination to consider 'nocapture' when deciding whether
calls can be marked tail.

llvm-svn: 166405
2012-10-21 23:51:22 +00:00
Hal Finkel
502fe3cc4a DataLayout should use itself when calculating the size of a vector.
This is important for vectors of pointers because only DataLayout,
not the underlying vector type, knows how to calculate the size
of the pointers in the vector. Fixes PR14138.

llvm-svn: 166401
2012-10-21 20:38:03 +00:00
Benjamin Kramer
d97f445bdf Revert r166390 "LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis."
It passes all tests, produces better results than the old code but uses the
wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it
still in tree?", you might ask. The answer is obviously: "To confuse developers."

Just swapping in the new dependency pass sends the pass manager into an infinte
loop, I'll try to figure out why tomorrow.

llvm-svn: 166399
2012-10-21 19:31:16 +00:00
Benjamin Kramer
7d87bba7c4 LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis.
Requires a lot less code and complexity on loop-idiom's side and the more
precise analysis can catch more cases, like the one I included as a test case.
This also fixes the edge-case miscompilation from PR9481. I'm not entirely
sure that all cases are handled that the old checks handled but LDA will
certainly become smarter in the future.

llvm-svn: 166390
2012-10-21 15:03:07 +00:00
Nadav Rotem
380fe201de Fix a bug in the vectorization of wide load/store operations.
We used a SCEV to detect that A[X] is consecutive. We assumed that X was
the induction variable. But X can be any expression that uses the induction
for example: X = i + 2;

llvm-svn: 166388
2012-10-21 06:49:10 +00:00
Nadav Rotem
825cda19d5 Add support for reduction variables that do not start at zero.
This is important for nested-loop reductions such as :

In the innermost loop, the induction variable does not start with zero:

for (i = 0 .. n)
 for (j = 0 .. m)
  sum += ...

llvm-svn: 166387
2012-10-21 05:52:51 +00:00
Nadav Rotem
763abacb83 Vectorizer: fix a bug in the classification of induction/reduction phis.
llvm-svn: 166384
2012-10-21 02:38:01 +00:00
Nadav Rotem
2ee8edf34a Fix an infinite loop in the loop-vectorizer.
PR14134.

llvm-svn: 166379
2012-10-20 20:45:01 +00:00
Benjamin Kramer
f3ac3fa22b InstCombine: Fix an edge case where constant icmps could sneak into ConstantFoldInstOperands and crash.
Have to refactor the ConstantFolder interface one day to define bugs like this away. Fixes PR14131.

llvm-svn: 166374
2012-10-20 08:43:52 +00:00
Nadav Rotem
cdd573e703 Vectorize: teach cavVectorizeMemory to distinguish between A[i]+=x and A[B[i]]+=x.
If the pointer is consecutive then it is safe to read and write. If the pointer is non-loop-consecutive then
it is unsafe to vectorize it because we may hit an ordering issue.

llvm-svn: 166371
2012-10-20 08:26:33 +00:00
Nadav Rotem
8fe03aa4c1 Vectorizer: Add support for loop reductions.
For example:

  for (i=0; i<n; i++)
   sum += A[i] +  B[i] + i;

llvm-svn: 166351
2012-10-19 23:05:40 +00:00
Akira Hatanaka
5439c43726 [mips] Use 64-bit registers to return an sret pointer if target ABI is N64.
llvm-svn: 166344
2012-10-19 22:11:40 +00:00
Akira Hatanaka
32235d5612 [mips] Add code to do tail call optimization.
Currently, it is enabled only if option "enable-mips-tail-calls" is given and
all of the callee's arguments are passed in registers.

llvm-svn: 166342
2012-10-19 21:47:33 +00:00
Benjamin Kramer
0edfdde403 SimplifyLibcalls: The return value of ffsll is always i32, even when the input is zero.
Fixes PR13028.

llvm-svn: 166313
2012-10-19 20:43:44 +00:00
Daniel Dunbar
9d930b0d06 tests: Stop mangling '-vg' into the triple, we don't use this currently.
- Also, lit is going to get a valgrind feature, instead.

llvm-svn: 166302
2012-10-19 20:11:56 +00:00
Shuxin Yang
3ad15929e7 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 

llvm-svn: 166300
2012-10-19 20:11:16 +00:00
Benjamin Kramer
49175bcdcd Indvars: Don't recursively delete instruction during BB iteration.
This can invalidate the iterators leading to use after frees and crashes.
Fixes PR12536.

llvm-svn: 166291
2012-10-19 17:53:54 +00:00
Michael Liao
9984a3bee3 Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86
- If INSERT_VECTOR_ELT is supported (above SSE2, either by custom
  sequence of legal insn), transform BUILD_VECTOR into SHUFFLE +
  INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few
  (so far 1) elements being inserted.

llvm-svn: 166288
2012-10-19 17:15:18 +00:00
Benjamin Kramer
77d71ab3d7 SCEVExpander: Don't crash when trying to merge two constant phis.
Just constant fold them so they can't cause any trouble. Fixes PR12627.

llvm-svn: 166286
2012-10-19 16:37:30 +00:00
Stepan Dyatkovskiy
ece4c2a9c1 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.

llvm-svn: 166273
2012-10-19 08:23:06 +00:00