Commit Graph

1517 Commits

Author SHA1 Message Date
Dan Gohman
c6c669cc1e Teach the x86 backend to eliminate "test" instructions by using the EFLAGS
result from add, sub, inc, and dec instructions in simple cases.
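
(A hypothetical illustration, not from the commit message: since the subtraction
already sets EFLAGS, the explicit test is redundant.)

Before:
	subl	%ecx, %eax
	testl	%eax, %eax
	jne	LBB1_2

After:
	subl	%ecx, %eax
	jne	LBB1_2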

llvm-svn: 66004
2009-03-04 02:33:24 +00:00
Evan Cheng
db402a7a49 Fix PR3701. 1. The X86 target renamed the eflags register to flags; this matches what llvm-gcc generates, so codegen knows the flags register is being clobbered by inline asm. 2. The BURR scheduler should also check whether inline asm nodes can clobber "live" physical registers; previously it was only checking target nodes with implicit defs.
llvm-svn: 65996
2009-03-04 01:41:49 +00:00
Bill Wendling
93eeea0493 The DAG combiner was performing a BT combine. The BT combine had a value of -1,
so it changed it into a 31 via the TLO.ShrinkDemandedConstant() call. Then it
would go through the DAG combiner again. This time it had a value of 31, which
was turned into a -1 by TLI.SimplifyDemandedBits(). This would ping-pong
forever.

Teach the TLO.ShrinkDemandedConstant() call not to lower a value if the demanded
value is an XOR of all ones.
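
Schematically (a simplified sketch of the cycle, for a 32-bit value whose
demanded bits are the low five):

	(xor X, -1)  --TLO.ShrinkDemandedConstant-->  (xor X, 31)
	(xor X, 31)  --TLI.SimplifyDemandedBits--->   (xor X, -1)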

llvm-svn: 65985
2009-03-04 00:18:06 +00:00
Nate Begeman
6b41d33726 Fix a problem with DAGCombine on 64b targets where folding
extracts + build_vector into a shuffle would fail, because the
type of the new build_vector would not be legal.  Try harder to
create a legal build_vector type.  Note: this will be totally 
irrelevant once vector_shuffle no longer takes a build_vector for
shuffle mask.

New:
_foo:
	xorps	%xmm0, %xmm0
	xorps	%xmm1, %xmm1
	subps	%xmm1, %xmm1
	mulps	%xmm0, %xmm1
	addps	%xmm0, %xmm1
	movaps	%xmm1, 0

Old:
_foo:
	xorps	%xmm0, %xmm0
	movss	%xmm0, %xmm1
	xorps	%xmm2, %xmm2
	unpcklps	%xmm1, %xmm2
	pshufd	$80, %xmm1, %xmm1
	unpcklps	%xmm1, %xmm2
	pslldq	$16, %xmm2
	pshufd	$57, %xmm2, %xmm1
	subps	%xmm0, %xmm1
	mulps	%xmm0, %xmm1
	addps	%xmm0, %xmm1
	movaps	%xmm1, 0

llvm-svn: 65791
2009-03-01 23:44:07 +00:00
Evan Cheng
276f9b02c5 Minor optimization:
Look for situations like this:
%reg1024<def> = MOV r1
%reg1025<def> = MOV r0
%reg1026<def> = ADD %reg1024, %reg1025
r0            = MOV %reg1026
Commute the ADD to hopefully eliminate an otherwise unavoidable copy.
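
After commuting (same notation; this is the intended outcome, not output
pasted from a compiler):
%reg1026<def> = ADD %reg1025, %reg1024
r0            = MOV %reg1026
Since the two-address result is now tied to %reg1025, whose value came from
r0 and dies at the ADD, the coalescer can assign %reg1026 to r0 and delete
the final copy.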

llvm-svn: 65752
2009-03-01 02:03:43 +00:00
Evan Cheng
9c3ce7905e Last commit accidentally deleted this code.
llvm-svn: 65679
2009-02-28 06:02:14 +00:00
Rafael Espindola
880e63bf01 Refactor TLS code and add some tests. The tests and expected results are:
pic |  declaration | linkage  | visibility | tests                   | TLS model

!pic |  declaration | external | default    | tls1.ll     tls2.ll     | local exec
 pic |  declaration | external | default    | tls1-pic.ll tls2-pic.ll | general dynamic
!pic | !declaration | external | default    | tls3.ll     tls4.ll     | initial exec
 pic | !declaration | external | default    | tls3-pic.ll tls4-pic.ll | general dynamic

!pic |  declaration | external | hidden     | tls7.ll     tls8.ll     | local exec
 pic |  declaration | external | hidden     | X                       | local dynamic
!pic | !declaration | external | hidden     | tls9.ll     tls10.ll    | local exec
 pic | !declaration | external | hidden     | X                       | local dynamic

!pic |  declaration | internal | default    | tls5.ll     tls6.ll     | local exec
 pic |  declaration | internal | default    | X                       | local dynamic

The ones marked with an X have not been implemented since local dynamic is not implemented.
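
For orientation (typical x86-64 sequences for these models, not taken from
the tests themselves):

local exec:
	movq	%fs:0, %rax
	leaq	x@TPOFF(%rax), %rax

initial exec:
	movq	x@GOTTPOFF(%rip), %rcx
	movq	%fs:0, %rax
	addq	%rcx, %rax

general dynamic:
	leaq	x@TLSGD(%rip), %rdi
	call	__tls_get_addr@PLT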

llvm-svn: 65632
2009-02-27 13:37:18 +00:00
Evan Cheng
257e065df6 Make sure this test passes on linux-ppc.
llvm-svn: 65600
2009-02-27 00:51:50 +00:00
Evan Cheng
0ca6e3dba5 MachineLICM CSE should match destination register classes; avoid hoisting implicit_def's.
llvm-svn: 65592
2009-02-27 00:02:22 +00:00
Evan Cheng
4014a9a5b8 ADDS{D|S}rr_Int and MULS{D|S}rr_Int are not commutable. The users of these intrinsics expect that the high bits will not be modified.
llvm-svn: 65499
2009-02-26 03:12:02 +00:00
Evan Cheng
86fc9440db The last commit was overly conservative. It's OK to reuse a value that's already marked livein.
llvm-svn: 65498
2009-02-26 03:02:21 +00:00
Evan Cheng
ec34226c2b Revert BuildVectorSDNode related patches: 65426, 65427, and 65296.
llvm-svn: 65482
2009-02-25 22:49:59 +00:00
Dan Gohman
3766eeea36 Fast-isel can't do TLS yet, so it should fall back to SDISel
if it sees TLS addresses.

llvm-svn: 65341
2009-02-23 22:03:08 +00:00
Dan Gohman
c1229a3a86 Use the -stack-alignment option instead of a target triple
to avoid dynamic stack realignment.

llvm-svn: 65319
2009-02-23 16:34:46 +00:00
Evan Cheng
dd139e795c Only v1i16 (i.e. __m64) is returned via RAX / RDX.
llvm-svn: 65313
2009-02-23 09:03:22 +00:00
Nate Begeman
7d9e99560f Make this test use the darwin target triple, to avoid stack traffic on linux.
llvm-svn: 65312
2009-02-23 09:03:06 +00:00
Nate Begeman
e0093d2501 Generate better code for v8i16 shuffles on SSE2
Generate better code for v16i8 shuffles on SSE2 (avoids stack)
Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it takes fewer uops.
Document the shuffle matching logic and add some FIXMEs for further
  cleanup later.
New tests that exercise the above.

Examples:

New:
_shuf2:
	pextrw	$7, %xmm0, %eax
	punpcklqdq	%xmm1, %xmm0
	pshuflw	$128, %xmm0, %xmm0
	pinsrw	$2, %eax, %xmm0

Old:
_shuf2:
	pextrw	$2, %xmm0, %eax
	pextrw	$7, %xmm0, %ecx
	pinsrw	$2, %ecx, %xmm0
	pinsrw	$3, %eax, %xmm0
	movd	%xmm1, %eax
	pinsrw	$4, %eax, %xmm0
	ret

=========

New:
_shuf4:
	punpcklqdq	%xmm1, %xmm0
	pshufb	LCPI1_0, %xmm0

Old:
_shuf4:
	pextrw	$3, %xmm0, %eax
	movsd	%xmm1, %xmm0
	pextrw	$3, %xmm1, %ecx
	pinsrw	$4, %ecx, %xmm0
	pinsrw	$5, %eax, %xmm0

========

New:
_shuf1:
	pushl	%ebx
	pushl	%edi
	pushl	%esi
	pextrw	$1, %xmm0, %eax
	rolw	$8, %ax
	movd	%xmm0, %ecx
	rolw	$8, %cx
	pextrw	$5, %xmm0, %edx
	pextrw	$4, %xmm0, %esi
	pextrw	$3, %xmm0, %edi
	pextrw	$2, %xmm0, %ebx
	movaps	%xmm0, %xmm1
	pinsrw	$0, %ecx, %xmm1
	pinsrw	$1, %eax, %xmm1
	rolw	$8, %bx
	pinsrw	$2, %ebx, %xmm1
	rolw	$8, %di
	pinsrw	$3, %edi, %xmm1
	rolw	$8, %si
	pinsrw	$4, %esi, %xmm1
	rolw	$8, %dx
	pinsrw	$5, %edx, %xmm1
	pextrw	$7, %xmm0, %eax
	rolw	$8, %ax
	movaps	%xmm1, %xmm0
	pinsrw	$7, %eax, %xmm0
	popl	%esi
	popl	%edi
	popl	%ebx
	ret

Old:
_shuf1:
	subl	$252, %esp
	movaps	%xmm0, (%esp)
	movaps	%xmm0, 16(%esp)
	movaps	%xmm0, 32(%esp)
	movaps	%xmm0, 48(%esp)
	movaps	%xmm0, 64(%esp)
	movaps	%xmm0, 80(%esp)
	movaps	%xmm0, 96(%esp)
	movaps	%xmm0, 224(%esp)
	movaps	%xmm0, 208(%esp)
	movaps	%xmm0, 192(%esp)
	movaps	%xmm0, 176(%esp)
	movaps	%xmm0, 160(%esp)
	movaps	%xmm0, 144(%esp)
	movaps	%xmm0, 128(%esp)
	movaps	%xmm0, 112(%esp)
	movzbl	14(%esp), %eax
	movd	%eax, %xmm1
	movzbl	22(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm1, %xmm2
	movzbl	42(%esp), %eax
	movd	%eax, %xmm1
	movzbl	50(%esp), %eax
	movd	%eax, %xmm3
	punpcklbw	%xmm1, %xmm3
	punpcklbw	%xmm2, %xmm3
	movzbl	77(%esp), %eax
	movd	%eax, %xmm1
	movzbl	84(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm1, %xmm2
	movzbl	104(%esp), %eax
	movd	%eax, %xmm1
	punpcklbw	%xmm1, %xmm0
	punpcklbw	%xmm2, %xmm0
	movaps	%xmm0, %xmm1
	punpcklbw	%xmm3, %xmm1
	movzbl	127(%esp), %eax
	movd	%eax, %xmm0
	movzbl	135(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm0, %xmm2
	movzbl	155(%esp), %eax
	movd	%eax, %xmm0
	movzbl	163(%esp), %eax
	movd	%eax, %xmm3
	punpcklbw	%xmm0, %xmm3
	punpcklbw	%xmm2, %xmm3
	movzbl	188(%esp), %eax
	movd	%eax, %xmm0
	movzbl	197(%esp), %eax
	movd	%eax, %xmm2
	punpcklbw	%xmm0, %xmm2
	movzbl	217(%esp), %eax
	movd	%eax, %xmm4
	movzbl	225(%esp), %eax
	movd	%eax, %xmm0
	punpcklbw	%xmm4, %xmm0
	punpcklbw	%xmm2, %xmm0
	punpcklbw	%xmm3, %xmm0
	punpcklbw	%xmm1, %xmm0
	addl	$252, %esp
	ret

llvm-svn: 65311
2009-02-23 08:49:38 +00:00
Scott Michel
3f8637305f Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR
instruction. The class also consolidates the code for detecting constant
splats that's shared across PowerPC and the CellSPU backends (and might be
useful for other backends.) Also introduces SelectionDAG::getBUILD_VECTOR() for
generating new BUILD_VECTOR nodes.

llvm-svn: 65296
2009-02-22 23:36:09 +00:00
Richard Pennington
72dc7c621e bug 3610: Test case.
llvm-svn: 65287
2009-02-22 15:54:44 +00:00
Evan Cheng
41687ff389 If a use operand is marked isKill, don't forget to add kill to its live interval as well.
llvm-svn: 65279
2009-02-22 08:35:56 +00:00
Evan Cheng
4385f393f7 Be bug-compatible with gcc by returning MMX values in RAX.
llvm-svn: 65274
2009-02-22 08:05:12 +00:00
Anton Korobeynikov
5df82e3e25 Drop a bunch of half-working stuff in the ext_weak linkage support.
Now we're using one gross, but quite robust hack :) (previous ones
did not work, for example, when an ext_weak symbol was used deep inside
a constant expression in the initializer).

The proper fix of this problem will require some quite huge asmprinter
changes, and that's why it was postponed. This fixes PR3629 by the way :)

llvm-svn: 65230
2009-02-21 11:53:32 +00:00
Evan Cheng
d4dc06f55a If a two-address def is dead, the instruction does not define other registers, and it doesn't produce side effects, just delete the instruction.
llvm-svn: 65218
2009-02-21 03:14:25 +00:00
Evan Cheng
56b43045f6 Teach LSR sink to sink the immediate portion of the common expression back into uses if it fits in the address modes of all the uses.
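
A made-up example of the idea: a shared +4 no longer needs its own add when
every use can absorb it as a displacement.

Before:
	addl	$4, %esi
LBB1_1:
	movl	(%esi,%ecx,4), %eax

After:
LBB1_1:
	movl	4(%esi,%ecx,4), %eax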
llvm-svn: 65215
2009-02-21 02:06:47 +00:00
Evan Cheng
c2541a4450 Fix strange logic in CollectIVUsers used to determine whether all uses are
addresses, part 1. This fixes an obvious logic bug: previously, if the only
in-loop use was a PHI, it would return AllUsesAreAddresses as true.

llvm-svn: 65178
2009-02-20 22:16:49 +00:00
Evan Cheng
c40c3e28f7 Support return of MMX values in 64-bit mode.
llvm-svn: 65152
2009-02-20 20:43:02 +00:00
Owen Anderson
8d12de1c25 Fix a crash in the pre-alloc splitter exposed by recent codegen changes.
llvm-svn: 65121
2009-02-20 10:02:23 +00:00
Chris Lattner
c25ea86424 make these tests pass when run on a G5.
llvm-svn: 65117
2009-02-20 07:10:11 +00:00
Dan Gohman
4e8fc41d48 Implement "superhero" strength reduction, or full strength
reduction of address calculations down to basic pointer arithmetic.
This is currently off by default, as it needs a few other features
before it becomes generally useful. And even when enabled, full
strength reduction is only performed when it doesn't increase
register pressure, and when several other conditions are true.

This also factors a bunch of existing LSR code out of
StrengthReduceStridedIVUsers into separate functions, and tidies
up IV insertion. This actually decreases register pressure even
in non-superhero mode. The change in iv-users-in-other-loops.ll
is an example of this; there are two more adds because there are
two fewer leas, and there is less spilling.
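
A contrived sketch of full strength reduction: the pointer itself becomes the
induction variable, so no per-iteration address computation remains (%esi here
stands for a precomputed end pointer).

Before:
LBB1_1:
	movl	(%edi,%ecx,4), %eax
	incl	%ecx
	cmpl	%edx, %ecx
	jne	LBB1_1

After:
LBB1_1:
	movl	(%edi), %eax
	addl	$4, %edi
	cmpl	%esi, %edi
	jne	LBB1_1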

llvm-svn: 65108
2009-02-20 04:17:46 +00:00
Evan Cheng
bd63a1f40d A GV with a null initializer shouldn't go to BSS if it's meant for a mergeable strings section. Currently this only checks for Darwin. Someone else please check whether it should apply to other targets as well.
llvm-svn: 64877
2009-02-18 02:19:52 +00:00
Evan Cheng
2e0cf05ad2 A couple of places where reused use operands should be marked kill. This is exposed by recent availability fallthrough changes.
llvm-svn: 64745
2009-02-17 06:41:03 +00:00
Dan Gohman
3d93bc5654 Change these tests to use regular loads instead of llvm.x86.sse2.loadu.dq.
Enhance instcombine to use the preferred field of
GetOrEnforceKnownAlignment in more cases, so that regular IR operations are
optimized in the same way that the intrinsics currently are.

llvm-svn: 64623
2009-02-16 00:44:23 +00:00
Evan Cheng
9428fd271c Fix PR3522. It's not safe to sink into landing pad BB's.
llvm-svn: 64582
2009-02-15 08:36:12 +00:00
Evan Cheng
9041a71923 Teach x86 target -soft-float.
llvm-svn: 64496
2009-02-13 22:36:38 +00:00
Dan Gohman
aec5be6b01 Fix the code that checks whether a SCEVAddRecExpr Start contains an
addrec in a different loop so that it checks the value being added to
the accumulated Start value, not the Start value before it has
the new value added to it. This prevents LSR from going crazy
on the included testcase. Dale, please review.

llvm-svn: 64440
2009-02-13 03:58:31 +00:00
Dan Gohman
3ade7d2346 Fix LSR's IV sorting function to explicitly sort by bitwidth
after sorting by stride value. This prevents it from missing
IV reuse opportunities in a host-sensitive manner.

llvm-svn: 64415
2009-02-13 00:26:43 +00:00
Dale Johannesen
47321cf01f Arrange to print constants that match "n" and "i" constraints
in inline asm as signed (what gcc does).  Add partial support
for x86-specific "e" and "Z" constraints, with appropriate
signedness for printing.

llvm-svn: 64400
2009-02-12 20:58:09 +00:00
Chris Lattner
1174b80823 fix the X86 backend to just drop llvm.declare nodes for VLAs instead of
leaving them in the DAG and then getting selection errors.  This is a 
fix for PR3538.

llvm-svn: 64382
2009-02-12 17:33:11 +00:00
Chris Lattner
2a13562296 add PR
llvm-svn: 64377
2009-02-12 17:04:57 +00:00
Evan Cheng
91c27ec711 It's (currently) not safe to keep certain physical registers live across basic blocks, e.g. x86 fp stack registers.
llvm-svn: 64374
2009-02-12 10:32:17 +00:00
Evan Cheng
19a1ba3680 Replace one of the BURR scheduling heuristics with something more sensible. Now calcMaxScratches simply computes the number of true data dependencies. This actually improves a couple of tests in the dejagnu suite, as well as many tests in the llvm nightly test suite.
llvm-svn: 64369
2009-02-12 08:59:45 +00:00
Chris Lattner
e5ec807aaf fix PR3537: if resetting bbi back to the start of a block, we need to
forget about already inserted expressions.

llvm-svn: 64362
2009-02-12 06:56:08 +00:00
Chris Lattner
6b7bab07d3 rename test to avoid messing with tab completion of dates.
llvm-svn: 64361
2009-02-12 06:54:55 +00:00
Evan Cheng
a34255f910 Remove a bogus assertion. It's possible a live-in available value is used by a previous instruction.
llvm-svn: 64339
2009-02-11 23:41:57 +00:00
Dan Gohman
d252329703 Don't use special heuristics for nodes with no data predecessors
unless they actually have data successors, and likewise for nodes
with no data successors unless they actually have data predecessors.

llvm-svn: 64327
2009-02-11 21:29:39 +00:00
Dale Johannesen
f367ef04af Make a transformation added in 63266 a bit less aggressive.
It was transforming (x&y)==y to (x&y)!=0 in the case where
y is variable and known to have at most one bit set (e.g. z&1).
This is not correct; the expressions are not equivalent when y==0
(then (x&y)==y holds but (x&y)!=0 does not).
I believe this patch salvages what can be salvaged, including
all the cases in bt.ll.  Dan, please review.
Fixes gcc.c-torture/execute/20040709-[12].c

llvm-svn: 64314
2009-02-11 19:19:41 +00:00
Evan Cheng
cfa084930b Implement PR3495: local spiller optimization. The local spiller can now keep availability information across BB boundaries. It visits BBs in depth-first order. After visiting a BB, if it finds a successor with a single predecessor, it visits that successor next without clearing the availability information. This allows the successor to omit reloads or change them into copies.
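
A contrived illustration: if BB2's only predecessor is BB1, a value reloaded
in BB1 is still available on entry to BB2, so the second reload can go away.

Before:
BB1:
	movl	16(%esp), %eax
	...
BB2:                         # BB1 is its only predecessor
	movl	16(%esp), %eax   # redundant reload

After:
BB2:
	...                      # %eax still holds the value; reload omitted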
llvm-svn: 64298
2009-02-11 08:24:21 +00:00
Evan Cheng
cdb35e3f0f Handle llvm.x86.sse2.maskmov.dqu in 64-bit.
llvm-svn: 64240
2009-02-10 22:06:28 +00:00
Evan Cheng
5be1afd928 Fix PR3457: Ignore control successors when looking for closest scheduled successor. A control successor doesn't read result(s) produced by the scheduling unit being evaluated.
llvm-svn: 64210
2009-02-10 08:30:11 +00:00
Evan Cheng
3b84024598 Implement FpSET_ST1_*.
llvm-svn: 64186
2009-02-09 23:32:07 +00:00