1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00
Commit Graph

3232 Commits

Author SHA1 Message Date
Nadav Rotem
2729f54295 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Rafael Espindola
194491d9b0 Add a triple to the test.
llvm-svn: 153818
2012-03-31 18:59:07 +00:00
Rafael Espindola
2da83dbcd4 Teach CodeGen's version of computeMaskedBits to understand the range metadata.
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.

llvm-svn: 153817
2012-03-31 18:14:00 +00:00
Bill Wendling
9746fa5b94 Testcase for r153710.
llvm-svn: 153711
2012-03-30 00:26:54 +00:00
Lang Hames
20f2192d37 The shuffle scheduler is only available in asserts build - make misched-new.ll
testcase require asserts.

llvm-svn: 153687
2012-03-29 21:11:47 +00:00
Lang Hames
94d892c492 Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode.
llvm-svn: 153680
2012-03-29 19:54:28 +00:00
Joel Jones
486c38b0cf For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153635
2012-03-29 05:45:48 +00:00
Joel Jones
32f97db4b2 Reverted to revision 153616 to unblock build
llvm-svn: 153623
2012-03-29 01:20:56 +00:00
Joel Jones
b4477ee31f For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better
named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode.  The comments have also been expanded.

llvm-svn: 153617
2012-03-29 00:37:47 +00:00
Eric Christopher
324ff39943 Add a test for the previous commit. Also, remove two tests that were
testing a) the wrong behavior or b) something that I'm already testing
in the new test.

llvm-svn: 153525
2012-03-27 18:35:57 +00:00
Evan Cheng
2ea4c182ae Post-ra LICM should take care not to hoist an instruction that would clobber a
register that's read by the preheader terminator.

rdar://11095580

llvm-svn: 153492
2012-03-27 01:50:58 +00:00
Eli Bendersky
3ef88c1833 Continue cleanup of LIT, getting rid of the remaining artifacts from dejagnu
* Removed test/lib/llvm.exp - it is no longer needed 
* Deleted the dg.exp reading code from test/lit.cfg. There are no dg.exp files
  left in the test suite so this code is no longer required. test/lit.cfg is
  now much shorter and clearer 
* Removed a lot of duplicate code in lit.local.cfg files that need access to
  the root configuration, by adding a "root" attribute to the TestingConfig
  object. This attribute is dynamically computed to provide the same
  information as was previously provided by the custom getRoot functions. 
* Documented the config.root attribute in docs/CommandGuide/lit.pod

llvm-svn: 153408
2012-03-25 09:02:19 +00:00
Andrew Trick
121e3e153c Remove -enable-lsr-nested in time for 3.1.
Tests cases have been removed but attached to open PR12330.

llvm-svn: 153286
2012-03-22 22:42:45 +00:00
Andrew Trick
f298457c8c misched: tag a few XFAILs that I plan to fix
llvm-svn: 153222
2012-03-21 22:31:31 +00:00
Chad Rosier
17f25ea47b [avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmobdqu to
vextractf128 with 128-bit mem dest.

Combines

	vextractf128 $0, %ymm0, %xmm0
	vmovaps %xmm0, (%rdi)

to

    vextractf128 $0, %ymm0, (%rdi)

rdar://11082570

llvm-svn: 153139
2012-03-20 21:43:40 +00:00
Chad Rosier
ffd2dbd676 [avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove
whitespace from test case.  No functional change intended.

llvm-svn: 153103
2012-03-20 18:24:55 +00:00
Chad Rosier
c667f10cbb Fix test.
llvm-svn: 153095
2012-03-20 17:20:46 +00:00
Chad Rosier
143f33dc92 [avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads.
This results in things such as

	vmovups	16(%rdi), %xmm0
	vinsertf128	$1, %xmm0, %ymm0, %ymm0

to be combined to

    vinsertf128	$1, 16(%rdi), %ymm0, %ymm0

rdar://11076953

llvm-svn: 153092
2012-03-20 17:08:51 +00:00
Bill Wendling
338ddac8f7 It's possible to have a constant expression who's size is quite big (e.g.,
i128). In that case, we may not be able to print out the MCExpr as an
expression. For instance, we could have an MCExpr like this:

    0xBEEF0000BEEF0000 | (0xBEEF0000BEEF0000 << 64)

The MCExpr printer handles sizes up to 64-bits, but this expression would
require 128-bits. In this situation, try to evaluate the constant expression and
emit that as the value into 64-bit chunks.
<rdar://problem/11070338>

llvm-svn: 153081
2012-03-20 08:56:43 +00:00
Preston Gurd
d1ae391210 This patch adds X86 instruction itineraries for non-pseudo opcodes in
X86InstrCompiler.td.
 
It also adds –mcpu-generic to the legalize-shift-64.ll test so the test
will pass if run on an Intel Atom CPU, which would otherwise
produce an instruction schedule which differs from that which the test expects.

llvm-svn: 153033
2012-03-19 14:10:12 +00:00
Nadav Rotem
8cf9105f96 When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations.
llvm-svn: 152784
2012-03-15 08:49:06 +00:00
Chad Rosier
bd3e55d39c [avx] Add patterns for VINSERTF128rm.
This results in things such as

	vmovaps	-96(%rbx), %xmm1
	vinsertf128	$1, %xmm1, %ymm0, %ymm0

to be combined to
         
	vinsertf128	$1, -96(%rbx), %ymm0, %ymm0

rdar://10643481

llvm-svn: 152762
2012-03-15 00:45:30 +00:00
Chad Rosier
a10cf5e1b9 Fix a regression from r147481.
Original commit message from r147481:
DAGCombine for transforming 128->256 casts into a vmovaps, rather
then a vxorps + vinsertf128 pair if the original vector came from a load.

Fix:
Unaligned loads need to generate a vmovups.
rdar://10974078

llvm-svn: 152366
2012-03-09 02:00:48 +00:00
Jakob Stoklund Olesen
9fed2852ff Remove a test case that no longer makes sense.
This was testing the handling of sub-register coalescing followed by
remat.  The original problem was caused by the extra <imp-def> operands
added by sub-register coalescing.  Those <imp-def> operands are not
added any longer, and the test case passes even when the original patch
is reverted.

llvm-svn: 152040
2012-03-05 19:10:13 +00:00
Bill Wendling
88f55b45b2 Do trivial CSE of dead BBs during codegen preparation.
Some BBs can become dead after codegen preparation. If we delete them here, it
could help enable tail-call optimizations later on.
<rdar://problem/10256573>

llvm-svn: 152002
2012-03-04 10:46:01 +00:00
Chad Rosier
c6fad847e9 Prevent obscure and incorrect tail-call optimization.
In this instance we are generating the tail-call during legalizeDAG.  The 2nd
floor call can't be a tail call because it clobbers %xmm1, which is defined by
the first floor call.  The first floor call can't be a tail-call because it's
not in the tail position.  The only reasonable way I could think to fix this
in a target-independent manner was to check for glue logic on the copy reg.

rdar://10930395

llvm-svn: 151877
2012-03-02 02:50:46 +00:00
Preston Gurd
29cb4871db Trivial change to make the test use Use –mcpu=generic,
so that the test will not fail when run on an Intel Atom
processor, due to the Atom scheduler producing an instruction sequence that is
different from that which is normally expected.

llvm-svn: 151832
2012-03-01 19:57:20 +00:00
Chad Rosier
3bdd700004 Revert r151816 as Jim has the appropriate fix.
llvm-svn: 151818
2012-03-01 17:41:19 +00:00
Chad Rosier
89b9ecae75 Fix testcases from r151807.
llvm-svn: 151816
2012-03-01 17:31:30 +00:00
Jim Grosbach
e41072bbb4 Add missing triple for tests.
Make darwin bots happier.

llvm-svn: 151813
2012-03-01 17:30:32 +00:00
James Molloy
1038b57cac Fix a codegen fault in which log2 or exp2 could be dead-code eliminated even though they could have sideeffects.
Only allow log2/exp2 to be converted to an intrinsic if they are declared "readnone".

llvm-svn: 151807
2012-03-01 14:32:18 +00:00
Lang Hames
6cd018d0bc Don't redundantly copy implicit operands when rematerializing.
While we're at it - don't copy vreg implicit operands while rematerializing.
This fixes PR12138.

llvm-svn: 151779
2012-03-01 00:41:17 +00:00
Benjamin Kramer
1b5aa9f5cd LegalizeIntegerTypes: Reorder operations in the "big shift by small amount" optimization, making the lives of later passes easier.
llvm-svn: 151722
2012-02-29 13:27:00 +00:00
Benjamin Kramer
daa291f4fd LegalizeIntegerTypes: Reenable the large shift with small amount optimization.
To avoid problems with zero shifts when getting the bits that move between words
we use a trick: first shift the by amount-1, then do another shift by one. When
amount is 0 (and size 32) we first shift by 31, then by one, instead of by 32.

Also fix a latent bug that emitted the low and high words in the wrong order
when shifting right.

Fixes PR12113.

llvm-svn: 151637
2012-02-28 17:58:00 +00:00
Nadav Rotem
75b36e6716 Fix a bug in the code that builds SDNodes from vector GEPs.
When the GEP index is a vector of pointers, the code that calculated the size
of the element started from the vector type, and not the contained pointer type.
As a result, instead of looking at the data element pointed by the vector, this
code used the size of the vector. This works for 32bit members (on 32bit
systems), but not for other types. Added code to peel the vector type and
added a test.

llvm-svn: 151626
2012-02-28 11:54:05 +00:00
Preston Gurd
81931b8e3b test commit.
llvm-svn: 151588
2012-02-27 23:31:51 +00:00
NAKAMURA Takumi
17b6271b41 Target/X86: Fix assertion failures and warnings caused by r151382 _ftol2 lowering for i386-*-win32 targets. Patch by Joe Groff.
[Joe Groff] Hi everyone. My previous patch applied as r151382 had a few problems:
Clang raised a warning, and X86 LowerOperation would assert out for
fptoui f64 to i32 because it improperly lowered to an illegal
BUILD_PAIR. Here's a patch that addresses these issues. Let me know if
any other changes are necessary. Thanks.

llvm-svn: 151432
2012-02-25 03:37:25 +00:00
Michael J. Spencer
d2f0ce2674 Add WIN_FTOL_* psudo-instructions to model the unique calling convention
used by the Win32 _ftol2 runtime function. Patch by Joe Groff!

llvm-svn: 151382
2012-02-24 19:01:22 +00:00
NAKAMURA Takumi
d8b4183963 test/CodeGen/X86/2012-02-23-mmx-inlineasm.ll: Fixup to add -march=x86.
-mcpu does not choose arch automatically, on non-x86 hosts.

llvm-svn: 151362
2012-02-24 13:29:50 +00:00
Pete Cooper
135769381b Turn avx insert intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove duplicate patterns for selecting the intrinsics
llvm-svn: 151342
2012-02-24 03:51:49 +00:00
Bill Wendling
1a35321235 Allow an integer to be converted into an MMX type when it's used in an inline
asm.
<rdar://problem/10106006>

llvm-svn: 151303
2012-02-23 23:25:25 +00:00
Anton Korobeynikov
fb863cd279 Fix to make sure that a comdat group gets generated correctly for a static member
of instantiated C++ templates.

Patch by Kristof Beyls!

llvm-svn: 151250
2012-02-23 10:36:04 +00:00
Daniel Dunbar
cac06bf0c6 MC: Fix the MCNullStreamer which was broken in r147763.
llvm-svn: 151213
2012-02-22 23:49:50 +00:00
Michael J. Spencer
24f6d49962 Properly emit _fltused with FastISel. Refactor to share code with SDAG.
Patch by Joe Groff!

llvm-svn: 151183
2012-02-22 19:06:13 +00:00
Aaron Ballman
a76a5b7265 Adding support for Microsoft's thiscall calling convention. LLVM side of the patch.
llvm-svn: 151123
2012-02-22 03:04:40 +00:00
NAKAMURA Takumi
0fac05f8e2 test/CodeGen/X86/2012-02-20-MachineCPBug.ll: Fix on generic(non-x86) hosts to add -mattr=+sse.
llvm-svn: 151053
2012-02-21 11:56:42 +00:00
Evan Cheng
3bffc22fc2 Fix machine-cp by having it to check sub-register indicies. e.g.
ecx = mov eax
al  = mov ch
The second copy is not a nop because the sub-indices of ecx,ch is not the
same of that of eax/al.

Re-enabled machine-cp.
PR11940

llvm-svn: 151002
2012-02-20 23:28:17 +00:00
Eric Christopher
c2e76f573d Testcase for the previous commit.
llvm-svn: 150852
2012-02-18 00:05:45 +00:00
David Chisnall
86b0f069d6 It turns out that putting an 8-byte symbol in a 4-byte section makes Solaris ld sulk. GNU ld is perfectly happy with it, which is worrying for a whole other set of reasons...
Thanks to Anton, Duncan and Rafael for helping me track this down.
Pointy hat to Rafael for introducing the bug in the first place.

llvm-svn: 150811
2012-02-17 16:05:50 +00:00
Bill Wendling
c137296347 Use –mcpu=generic, so that the test will not fail when run on an Intel Atom
processor, due to the Atom scheduler producing an instruction sequence that is
different from that which is expected.
Patch by Michael Spencer!

llvm-svn: 150736
2012-02-16 22:42:48 +00:00