1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
Commit Graph

6554 Commits

Author SHA1 Message Date
Quentin Colombet
dde058d386 Change ForceSizeOpt attribute into MinSize attribute
llvm-svn: 167020
2012-10-30 16:32:52 +00:00
Adhemerval Zanella
ac3ba40bc2 PowerPC: More support for Altivec compare operations
This patch adds more support for vector type comparisons using altivec.
It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector
types for comparison operators ==, !=, >, >=, <, and <=.

llvm-svn: 167015
2012-10-30 13:50:19 +00:00
Hans Wennborg
40eb1b4055 Use TargetTransformInfo to control switch-to-lookup table transformation
When the switch-to-lookup tables transform landed in SimplifyCFG, it
was pointed out that this could be inappropriate for some targets.
Since there was no way at the time for the pass to know anything about
the target, an awkward reverse-transform was added in CodeGenPrepare
that turned lookup tables back into switches for some targets.

This patch uses the new TargetTransformInfo to determine if a
switch should be transformed, and removes
CodeGenPrepare::ConvertLoadToSwitch.

llvm-svn: 167011
2012-10-30 11:23:25 +00:00
Reed Kotler
de0ea1027e Change mips16 delay slot jumps to non delay slot forms by default.
We will make them delay slot forms if there is something that can be
placed in the delay slot during a separate pass. Mips16 extended instructions
cannot be placed in delay slots.

llvm-svn: 166990
2012-10-30 00:54:49 +00:00
Jakub Staszak
f1cddf738b Re-commit r166971. I reverted it to quickly, when buildbots didn't have a chance
to test it with chapni's fix (-mattr=+avx).

llvm-svn: 166985
2012-10-30 00:01:57 +00:00
Jakub Staszak
ce95e4429f Revert r166971. It causes buildbot failure. To be investigated.
llvm-svn: 166979
2012-10-29 23:13:50 +00:00
NAKAMURA Takumi
d544c8f4ca llvm/test/CodeGen/X86/vec_shuffle-30.ll: Try to unbreak builds - assuming +avx.
llvm-svn: 166974
2012-10-29 22:45:18 +00:00
Jakub Staszak
ded6f21890 Allow to fold vector load if there is more than one bitcast, so in the case:
%0 = load <8 x i16>* %dest
%1 = shufflevector <8 x i16> %0, <8 x i16> %in,
      <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14>
store <8 x i16> %1, <8 x i16>* %dest

We get:
  vmovlpd (%eax), %xmm0, %xmm0

instead of:
  vmovaps (%eax), %xmm1
  vmovsd  %xmm1, %xmm0, %xmm0

No extra test-case is added. I just fixed the existing one
(also it uses FileCheck now).

llvm-svn: 166971
2012-10-29 21:56:35 +00:00
Bill Schmidt
77a8fd274b This patch solves a problem with passing varargs parameters under the PPC64
ELF ABI.

A varargs parameter consisting of a single-precision floating-point value,
or of a single-element aggregate containing a single-precision floating-point
value, must be passed in the low-order (rightmost) four bytes of the
doubleword stack slot reserved for that parameter.  If there are GPR protocol
registers remaining, the parameter must also be mirrored in the low-order
four bytes of the reserved GPR.

Prior to this patch, such parameters were being passed in the high-order
four bytes of the stack slot and the mirrored GPR.

The patch adds a new test case to verify the correct code generation.

llvm-svn: 166968
2012-10-29 21:18:16 +00:00
Reed Kotler
3859f5469e Implement patterns for extloadi8 and extloadi16
llvm-svn: 166960
2012-10-29 19:39:04 +00:00
Ulrich Weigand
445bd73056 In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.

llvm-svn: 166958
2012-10-29 18:35:49 +00:00
Chad Rosier
3b32dec25d Remove redundant test case from r166949, per Eli's suggestion.
llvm-svn: 166953
2012-10-29 18:18:26 +00:00
Chad Rosier
651ecf255c [ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is
equivalent to [expr1 + expr2].  See test cases for more examples.
rdar://12470392

llvm-svn: 166949
2012-10-29 18:01:54 +00:00
Michael Liao
d26c27ad35 Fix PR14204
- Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled.

llvm-svn: 166947
2012-10-29 17:57:12 +00:00
Jakob Stoklund Olesen
05cec5db28 Completely disallow partial copies in adjustCopiesBackFrom().
Partial copies can show up even when CoalescerPair.isPartial() returns
false. For example:

   %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31

Such a partial-partial copy is not good enough for the transformation
adjustCopiesBackFrom() needs to do.

llvm-svn: 166944
2012-10-29 17:51:52 +00:00
Ulrich Weigand
2daab9e4b4 Allow i32/i64 for 'f' constraint on PowerPC.
This fixes PR12757.

llvm-svn: 166943
2012-10-29 17:49:34 +00:00
Reed Kotler
56b8c348f2 Expand all atomic ops for mips16.
llvm-svn: 166935
2012-10-29 16:16:54 +00:00
Preston Gurd
1683865eea This patch addresses a problem with the Post RA scheduler generating an
incorrect instruction sequence due to it not being aware that an
inline assembly instruction may reference memory.

This patch fixes the problem by causing the scheduler to always assume that any
inline assembly code instruction could access memory. This is necessary because
the internal representation of the inline instruction does not include
any information about memory accesses.
 
This should fix PR13504.

llvm-svn: 166929
2012-10-29 15:01:23 +00:00
Bill Schmidt
2682b73f6d This patch adds alignment information for long double to the 64-bit PowerPC
ELF subtarget.

The existing logic is used as a fallback to avoid any changes to the Darwin
ABI.  PPC64 ELF now has two possible data layout strings: one for FreeBSD,
which requires 8-byte alignment, and a default string that requires
16-byte alignment.

I've added a test for PPC64 Linux to verify the 16-byte alignment.  If
somebody wants to add a separate test for FreeBSD, that would be great.

Note that there is a companion patch to update the alignment information
in Clang, which I am committing now as well.

llvm-svn: 166928
2012-10-29 14:59:36 +00:00
Reed Kotler
1b4c985fc8 Implement brind operator for mips16.
llvm-svn: 166903
2012-10-28 23:08:07 +00:00
Reed Kotler
16eaf8644a This patch is for the implementation of mips16 complex pattern addr16.
Previously mips16 was sharing the pattern addr which is used for mips32
and mips64. This had a number of problems:
1) Storing and loading byte and halfword quantities for mips16 has particular
problems due to the primarily non mips16 nature of SP. When we must
load/store byte/halfword stack objects in a function, we must create a mips16
alias register for SP. This functionality is tested in stchar.ll.
2) We need to have an FP register under certain conditions (such as 
dynamically sized alloca). We use mips16 register S0 for this purpose.
In this case, we also use this register when accessing frame objects so this
issue also affects the complex pattern addr16. This functionality is
tested in alloca16.ll.

The Mips16InstrInfo.td has been updated to use addr16 instead of addr.

The complex pattern C++ function for addr has been copied to addr16 and
updated to reflect the above issues.

llvm-svn: 166897
2012-10-28 06:02:37 +00:00
Jakob Stoklund Olesen
2efe8df169 Never attempt to join an early-clobber def with a regular kill.
This fixes PR14194.

llvm-svn: 166880
2012-10-27 17:41:27 +00:00
Quentin Colombet
bcd2bc3437 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
llvm-svn: 166854
2012-10-27 01:10:17 +00:00
Reed Kotler
cc41464454 Implement MipsHi for mips16
llvm-svn: 166852
2012-10-27 00:57:14 +00:00
Akira Hatanaka
a2ae72bd2f [mips] Do not tail-call optimize vararg functions or functions with byval
arguments.

This is rather conservative and should be fixed later to be more aggressive.

llvm-svn: 166851
2012-10-27 00:56:56 +00:00
Akira Hatanaka
4d26f60ca0 [mips] Make sure FuncArg doesn't advance when OrigArgIndex is the same as in the
previous iteration.

llvm-svn: 166850
2012-10-27 00:44:39 +00:00
Jakob Stoklund Olesen
4b4db880a3 Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers."
Keep the integer_insertelement test case, the new coalescer can handle
this kind of lane insertion without help from pseudo-instructions.

llvm-svn: 166835
2012-10-26 23:39:46 +00:00
Reed Kotler
8721a311b3 implement mips16 tls global addr
llvm-svn: 166827
2012-10-26 22:57:32 +00:00
Jakob Stoklund Olesen
1a8c00078d Add GPRPair Register class to ARM.
Some instructions in ARM require 2 even-odd paired GPRs. This
patch adds support for such register class.

Patch by Weiming Zhao!

llvm-svn: 166816
2012-10-26 21:29:15 +00:00
Reed Kotler
6b0e65fce7 Implement carry for subtract/add for mips16
llvm-svn: 166755
2012-10-26 04:46:26 +00:00
Reed Kotler
fd22c8bfc1 implement large (>16 bit) constant loading.
llvm-svn: 166749
2012-10-26 03:09:34 +00:00
Reed Kotler
72bddf2c9a fix test setgek.ll so that it will not give false "make check"
failure in some cases

llvm-svn: 166747
2012-10-26 01:29:42 +00:00
Reed Kotler
6ac4e4d880 implement mips16 patterns for select nodes
llvm-svn: 166721
2012-10-25 21:33:30 +00:00
Michael Liao
e5329935bd Add test for ATOM ISA SSSE3
- Remove SSE4.1 feature in other ATOM-based test cases

llvm-svn: 166699
2012-10-25 17:50:05 +00:00
Bill Schmidt
71c462aff2 This patch addresses a PPC64 ELF issue with passing parameters consisting of
structs having size 3, 5, 6, or 7.  Such a struct must be passed and received
as right-justified within its register or memory slot.  The problem is only
present for structs that are passed in registers.

Previously, as part of a patch handling all structs of size less than 8, I
added logic to rotate the incoming register so that the struct was left-
justified prior to storing the whole register.  This was incorrect because
the address of the parameter had already been adjusted earlier to point to
the right-adjusted value in the storage slot.  Essentially I had accidentally
accounted for the right-adjustment twice.

In this patch, I removed the incorrect logic and reorganized the code to make
the flow clearer.

The removal of the rotates changes the expected code generation, so test case
structsinregs.ll has been modified to reflect this.  I also added a new test
case, jaggedstructs.ll, to demonstrate that structs of these sizes can now
be properly received and passed.

I've built and tested the code on powerpc64-unknown-linux-gnu with no new
regressions.  I also ran the GCC compatibility test suite and verified that
earlier problems with these structs are now resolved, with no new regressions.

llvm-svn: 166680
2012-10-25 13:38:09 +00:00
Elena Demikhovsky
d93b6e7cd4 The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, may be it will fix the problem
llvm-svn: 166669
2012-10-25 08:38:42 +00:00
Chad Rosier
492e58a0f7 [ms-inline asm] Add back-end test case for r166632. Make sure we emit the
correct .s output as well as get the correct encoding by the integrated
assembler.

llvm-svn: 166638
2012-10-24 23:10:28 +00:00
Evan Cheng
f97472cdf6 Fix a miscompilation caused by a typo. When turning a adde with negative value
into a sbc with a positive number, the immediate should be complemented, not
negated. Also added a missing pattern for ARM codegen.

rdar://12559385

llvm-svn: 166613
2012-10-24 19:53:01 +00:00
Elena Demikhovsky
b711ef1960 Special calling conventions for Intel OpenCL built-in library.
llvm-svn: 166566
2012-10-24 14:46:16 +00:00
Michael Liao
18e40965aa Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.

llvm-svn: 166546
2012-10-24 04:14:18 +00:00
Michael Liao
70bb8004bd Add custom conversion from v2u32 to v2f32 in 32-bit mode
- As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to
  v2f32 is added to improve the efficiency of the code generated.

llvm-svn: 166545
2012-10-24 04:09:32 +00:00
Akira Hatanaka
8c77adc9ce [mips] Make sure sret argument is returned in register V0.
llvm-svn: 166539
2012-10-24 02:10:54 +00:00
Rafael Espindola
8ebde25931 Change x86_fastcallcc to require inreg markers. This allows it to known
the difference from "int x" (which should go in registers and
"struct y {int x;}" (which should not).

Clang will be updated in the next patches.

llvm-svn: 166536
2012-10-24 01:58:48 +00:00
Michael Liao
7ca099f727 Fix PR14161
- Check index being extracted to be constant 0 before simplfiying.
  Otherwise, retain the original sequence.

llvm-svn: 166504
2012-10-23 21:40:15 +00:00
Michael Liao
23c890e0a8 Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1
llvm-svn: 166486
2012-10-23 17:34:00 +00:00
Reed Kotler
730040c219 implement setXX patterns
llvm-svn: 166459
2012-10-23 01:35:48 +00:00
Bill Wendling
e97df2d337 When a block ends in an indirect branch, add its successors to the machine basic block.
The CFG of the machine function needs to know that the targets of the indirect
branch are successors to the indirect branch.
<rdar://problem/12529625>

llvm-svn: 166448
2012-10-22 23:30:04 +00:00
Akira Hatanaka
5439c43726 [mips] Use 64-bit registers to return an sret pointer if target ABI is N64.
llvm-svn: 166344
2012-10-19 22:11:40 +00:00
Akira Hatanaka
32235d5612 [mips] Add code to do tail call optimization.
Currently, it is enabled only if option "enable-mips-tail-calls" is given and
all of the callee's arguments are passed in registers.

llvm-svn: 166342
2012-10-19 21:47:33 +00:00
Shuxin Yang
3ad15929e7 This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap()
which is supposed to consistently raise SIGTRAP across all systems. In contrast,
__builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and
SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap"
functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap().

  The X86 backend is already able to handle debugtrap(). This patch is to:
  1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang).
  2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which
     make the __builtin_debugtrap() "available" to all existing ports without the hassle of
     changing their code.
  3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and
     __builtin_trap() will be expanded into the function call of the specified trap function.
    This behavior may need change in the future.

  The provided testing-case is to make sure 2) and 3) are working for ARM port, and we
already have a testing case for x86. 

llvm-svn: 166300
2012-10-19 20:11:16 +00:00