1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

1566 Commits

Author SHA1 Message Date
Duncan Sands
8572084279 Tweak the fix for PR3784: be less sensitive about just
how invokes are set up.  The fix could be disturbed by
register copies coming after the EH_LABEL, and also didn't
behave quite right when it was the invoke result that
was used in a phi node.  Also (see new testcase) fix
another phi elimination bug while there: register copies
in the landing pad need to come after the EH_LABEL, because
that's where execution branches to when unwinding.  If they
come before the EH_LABEL then they will never be executed...
Also tweak the original testcase so it doesn't use a no-longer
existing counter.
The accumulated phi elimination changes fix two of seven Ada
testsuite failures that turned up after landing pad critical
edge splitting was turned off.  So there's probably more to come.

llvm-svn: 67049
2009-03-16 19:58:38 +00:00
Scott Michel
2e2bccf754 CellSPU:
Incorporate Tilmann's 128-bit operation patch. Evidently, it gets the
llvm-gcc bootstrap a bit further along.

llvm-svn: 67048
2009-03-16 18:47:25 +00:00
Dan Gohman
c938e0ab02 Add a testcase that covers a wide variety of ABI isel cases.
llvm-svn: 67003
2009-03-14 02:35:10 +00:00
Dan Gohman
fd6debff99 Use %rip-relative addressing on x86-64 whenever practical, as
it has a smaller encoding than absolute addressing.

llvm-svn: 67002
2009-03-14 02:33:41 +00:00
Dan Gohman
5c4cd59ce4 Add a few more ptrtoint/inttoptr cast tests.
llvm-svn: 66989
2009-03-13 23:54:51 +00:00
Dan Gohman
fa0a3504ba Improve FastISel's handling of truncates to i1, and implement
ptrtoint and inttoptr in X86FastISel. These casts aren't always
handled in the generic FastISel code because X86 sometimes needs
custom code to do truncation and zero-extension.

llvm-svn: 66988
2009-03-13 23:53:06 +00:00
Evan Cheng
cda58e565f Fix PR3784: If the source of a phi comes from a bb ended with an invoke, make sure the copy is inserted before the try range (unless it's used as an input to the invoke, then insert it after the last use), not at the end of the bb.
Also re-apply r66140 which was disabled as a workaround.

llvm-svn: 66976
2009-03-13 22:59:14 +00:00
Dan Gohman
790659c0d6 Fix FastISel's assumption that i1 values are always zero-extended
by inserting explicit zero extensions where necessary. Included
is a testcase where SelectionDAG produces a virtual register
holding an i1 value which FastISel previously mistakenly assumed
to be zero-extended.

llvm-svn: 66941
2009-03-13 20:42:20 +00:00
Rafael Espindola
ff17d02271 Improve sext and zext of TLS variables.
llvm-svn: 66922
2009-03-13 18:37:06 +00:00
Evan Cheng
f9951d1557 Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues.
1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants.
2. MachineConstantPool alignment field is also a log2 value.
3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values.
4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries.
5. Asm printer uses expensive data structure multimap to track constant pool entries by sections.
6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic.


Solutions:
1. ConstantPoolSDNode alignment field is changed to keep non-log2 value.
2. MachineConstantPool alignment field is also changed to keep non-log2 value.
3. Functions that create ConstantPool nodes are passing in non-log2 alignments.
4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT.
5. Asm printer uses cheaper data structure to group constant pool entries.
6. Asm printer compute entry offsets after grouping is done.
7. Change JIT code to compute entry offsets on the fly.

llvm-svn: 66875
2009-03-13 07:51:59 +00:00
Chris Lattner
cbbdd230dd generalize the previous code to use the full generality of LEA
for i32/i64 expressions (we could also do i16 on cpus where
i16 lea is fast, but I didn't add this).  On the example, we now
generate:

_test:
	movl	4(%esp), %eax
	cmpl	$42, (%eax)
	setl	%al
	movzbl	%al, %eax
	leal	4(%eax,%eax,8), %eax
	ret

instead of:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	movl	$4, %ecx
	movl	$13, %eax
	cmovg	%ecx, %eax
	ret

llvm-svn: 66869
2009-03-13 05:53:31 +00:00
Chris Lattner
878d951f8f optimize the case of cond ? 42 : 41 and friends. This compiles the
example to:

_test:
	movl	4(%esp), %eax
	cmpl	$41, (%eax)
	setg	%al
	movzbl	%al, %eax
	orl	$4294967294, %eax
	ret

instead of:

        movl    4(%esp), %eax
        cmpl    $41, (%eax)
	movl	$4294967294, %ecx
	movl	$4294967295, %eax
	cmova	%ecx, %eax
	ret

which is smaller in code size and faster. rdar://6668608

llvm-svn: 66868
2009-03-13 05:22:11 +00:00
Dan Gohman
37d843c129 Enhance address-mode folding of ISD::ADD to handle cases where the
operands can't both be fully folded at the same time. For example,
in the included testcase, a global variable is being added with
an add of two values. The global variable wants RIP-relative
addressing, so it can't share the address with another base
register, but it's still possible to fold the initial add.

llvm-svn: 66865
2009-03-13 02:25:09 +00:00
Evan Cheng
27be5272b4 Add this test back.
llvm-svn: 66838
2009-03-12 23:01:35 +00:00
Duncan Sands
968939fca2 Revert commit 66140 since it caused several failures
in the Ada testcase.  Reverting this only covers up
the real problem, which is a nasty conceptual difficulty
in the phi elimination pass: when eliminating phi nodes
in landing pads, the register copies need to come before
the invoke, not at the end of the basic block which is
too late...  See PR3784.

llvm-svn: 66826
2009-03-12 21:13:42 +00:00
Evan Cheng
e04a84ed9f Typo.
llvm-svn: 66797
2009-03-12 17:07:39 +00:00
Evan Cheng
45aa89fa9a Fix test after Chris' select changes.
llvm-svn: 66795
2009-03-12 16:10:08 +00:00
Chris Lattner
26a971c4ec Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))"
related transformations out of target-specific dag combine into the
ARM backend.  These were added by Evan in r37685 with no testcases
and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll).

Add some simple X86-specific (for now) DAG combines that turn things
like cond ? 8 : 0  -> (zext(cond) << 3).  This happens frequently
with the recently added cp constant select optimization, but is a
very general xform.  For example, we now compile the second example
in const-select.ll to:

_test:
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        seta    %al
        movzbl  %al, %eax
        movl    4(%esp), %ecx
        movsbl  (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl    4(%esp), %eax
        leal    4(%eax), %ecx
        movsd   LCPI2_0, %xmm0
        ucomisd 8(%esp), %xmm0
        cmovbe  %eax, %ecx
        movsbl  (%ecx), %eax
        ret

This passes multisource and dejagnu.

llvm-svn: 66779
2009-03-12 06:52:53 +00:00
Evan Cheng
46e903d2f6 On x86, if the only use of a i64 load is a i64 store, generate a pair of double load and store instead.
llvm-svn: 66776
2009-03-12 05:59:15 +00:00
Chris Lattner
a5368fd283 add no-unwind, remove duplicate run line.
llvm-svn: 66775
2009-03-12 05:56:37 +00:00
Chris Lattner
cb52d81eea add nounwinds
llvm-svn: 66773
2009-03-12 05:35:33 +00:00
Dan Gohman
d30e108f0e Revert r66024. The JIT encoding for CALLpcrel32 is wrong -- see PR3773, and the
assembly text output uses an indirect call ("call *") instead of a direct call.

llvm-svn: 66735
2009-03-11 23:01:47 +00:00
Rafael Espindola
a8fe373200 optimize i8 and i16 tls values.
llvm-svn: 66725
2009-03-11 22:40:04 +00:00
Evan Cheng
042e05cb31 My last coalescer fix introduced a subtler one. It's aborting a commuting optimization too late and left the live intervals to be out of sync with instructions. This fixes 8b10b.
llvm-svn: 66715
2009-03-11 22:18:44 +00:00
Mon P Wang
287e422039 For yonah, fix a vector shuffle case for v16i8 where we didn't properly clear some bits.
llvm-svn: 66684
2009-03-11 18:47:57 +00:00
Mon P Wang
2867737ad2 Fixed a v8i16 shuffle case that should generate a pshufb instead of a pshuflw/hw.
llvm-svn: 66645
2009-03-11 06:35:11 +00:00
Chris Lattner
a0c4a99ec2 reapply my previous patch (r66358) with a tweak to set the
alignment of the generated constant pool entry to the
desired alignment of a type.  If we don't do this, we end up
trying to do movsd from 4-byte alignment memory.  This fixes
450.soplex and 456.hmmer.

llvm-svn: 66641
2009-03-11 05:08:08 +00:00
Evan Cheng
264173da40 Two coalescer fixes in one.
1. Use the same value# to represent unknown values being merged into sub-registers.
2. When coalescer commute an instruction and the destination is a physical register, update its sub-registers by merging in the extended ranges.

llvm-svn: 66610
2009-03-11 00:03:21 +00:00
Bill Wendling
5af061a657 Readd test, but XFAIL it.
llvm-svn: 66581
2009-03-10 21:31:00 +00:00
Evan Cheng
535dbf4ffd Revert 66358 for now. It's breaking povray, 450.soplex, and 456.hmmer on x86 / Darwin.
llvm-svn: 66574
2009-03-10 20:47:18 +00:00
Bill Wendling
4ddb112b2f Add radar number.
llvm-svn: 66534
2009-03-10 06:53:54 +00:00
Chris Lattner
03060f6d50 wire up support for emitting "special" values from inline asm
format strings with the standard ${:foo} syntax.

llvm-svn: 66527
2009-03-10 05:37:13 +00:00
Chris Lattner
93662edfb8 Fix PR3763 by using proper APInt methods instead of uint64_t's.
llvm-svn: 66434
2009-03-09 20:22:18 +00:00
Evan Cheng
3c9a084a1b ARM isLegalAddressImmediate should check if type is a simple type now that optimizer can create values of funky scalar types.
llvm-svn: 66429
2009-03-09 19:15:00 +00:00
Evan Cheng
8b47b553f9 Yet another case where the spiller marked two uses of the same register on the same instruction as kill. This fixes PR3706.
llvm-svn: 66428
2009-03-09 19:00:05 +00:00
Evan Cheng
67ccd79e39 Recognize triplets starting with armv5-, armv6- etc. And set the ARM arch version accordingly.
llvm-svn: 66365
2009-03-08 04:02:49 +00:00
Evan Cheng
483ece4d2e If a MI uses the same register more than once, only mark one of them as 'kill'.
llvm-svn: 66363
2009-03-08 03:58:35 +00:00
Chris Lattner
578b634f56 implement an optimization to codegen c ? 1.0 : 2.0 as load { 2.0, 1.0 } + c*4.
For 2009-03-07-FPConstSelect.ll we now produce:

_f:
	xorl	%eax, %eax
	testl	%edi, %edi
	movl	$4, %ecx
	cmovne	%rax, %rcx
	leaq	LCPI1_0(%rip), %rax
	movss	(%rcx,%rax), %xmm0
	ret

previously we produced:

_f:
	subl	$4, %esp
	cmpl	$0, 8(%esp)
	movss	LCPI1_0, %xmm0
	je	LBB1_2	## entry
LBB1_1:	## entry
	movss	LCPI1_1, %xmm0
LBB1_2:	## entry
	movss	%xmm0, (%esp)
	flds	(%esp)
	addl	$4, %esp
	ret

on PPC the code also improves to:

_f:
	cntlzw r2, r3
	srwi r2, r2, 5
	li r3, lo16(LCPI1_0)
	slwi r2, r2, 2
	addis r3, r3, ha16(LCPI1_0)
	lfsx f1, r3, r2
	blr 

from:

_f:
	li r2, lo16(LCPI1_1)
	cmplwi cr0, r3, 0
	addis r2, r2, ha16(LCPI1_1)
	beq cr0, LBB1_2	; entry
LBB1_1:	; entry
	li r2, lo16(LCPI1_0)
	addis r2, r2, ha16(LCPI1_0)
LBB1_2:	; entry
	lfs f1, 0(r2)
	blr 

This also improves the existing pic-cpool case from:

foo:
	subl	$12, %esp
	call	.Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
	popl	%eax
	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
	cmpl	$0, 16(%esp)
	movsd	.LCPI1_0@GOTOFF(%eax), %xmm0
	je	.LBB1_2	# entry
.LBB1_1:	# entry
	movsd	.LCPI1_1@GOTOFF(%eax), %xmm0
.LBB1_2:	# entry
	movsd	%xmm0, (%esp)
	fldl	(%esp)
	addl	$12, %esp
	ret

to:

foo:
	call	.Lllvm$1.$piclabel
.Lllvm$1.$piclabel:
	popl	%eax
	addl	$_GLOBAL_OFFSET_TABLE_ + [.-.Lllvm$1.$piclabel], %eax
	xorl	%ecx, %ecx
	cmpl	$0, 4(%esp)
	movl	$8, %edx
	cmovne	%ecx, %edx
	fldl	.LCPI1_0@GOTOFF(%eax,%edx)
	ret

This triggers a few dozen times in spec FP 2000.

llvm-svn: 66358
2009-03-08 01:51:30 +00:00
Dan Gohman
b9c32f1aca Arithmetic instructions don't set EFLAGS bits OF and CF bits
the same say the "test" instruction does in overflow cases,
so eliminating the test is only safe when those bits aren't
needed, as is the case for COND_E and COND_NE, or if it
can be proven that no overflow will occur. For now, just
restrict the optimization to COND_E and COND_NE and don't
do any overflow analysis.

llvm-svn: 66318
2009-03-07 01:58:32 +00:00
Dan Gohman
42049dafd7 Fix ScheduleDAGRRList::CopyAndMoveSuccessors' handling of nodes
with multiple chain operands. This can occur when the scheduler
has added chain operands to a node that already has a chain
operand, in order to handle physical register dependencies.

This fixes an llvm-gcc bootstrap failure on x86-64 introduced
in r66058.

llvm-svn: 66240
2009-03-06 02:23:01 +00:00
Dan Gohman
f6f684b206 Fix the "test" optimization to recognize "dec" as an add of
negative one, as subtracts of immediates are canonicalized
to adds.

llvm-svn: 66180
2009-03-05 19:32:48 +00:00
Dan Gohman
5e88bf4c00 Make this test more thorough. Not only should there be no %esi,
there should be no spilling of anything.

llvm-svn: 66179
2009-03-05 19:31:32 +00:00
Evan Cheng
24c138a1cd Do not split edges to EH landing pads. It will cause code size explosion.
llvm-svn: 66140
2009-03-05 06:31:26 +00:00
Dan Gohman
31fb085c2e Re-apply 66008, now that the unfoldMemoryOperand bug is fixed.
llvm-svn: 66058
2009-03-04 19:44:21 +00:00
Owen Anderson
fb7f64ea0d Add a restore folder, which shaves a dozen or so machineinstrs off oggenc. Update a testcase to check this.
llvm-svn: 66029
2009-03-04 08:52:31 +00:00
Evan Cheng
7d9019d0f3 Fix PR3666: isel calls to constant addresses.
llvm-svn: 66024
2009-03-04 06:48:53 +00:00
Eli Friedman
1bed40b86a PR3686: make the legalizer handle bitcast from i80 to x86 long double.
llvm-svn: 66021
2009-03-04 06:23:34 +00:00
Dan Gohman
6831e2c2a6 Revert r66004 for now; it's causing a variety of test failures.
llvm-svn: 66008
2009-03-04 03:54:19 +00:00
Evan Cheng
32eef2f73f Rename test.
llvm-svn: 66006
2009-03-04 02:47:25 +00:00
Dan Gohman
c6c669cc1e Teach the x86 backend to eliminate "test" instructions by using the EFLAGS
result from add, sub, inc, and dec instructions in simple cases.

llvm-svn: 66004
2009-03-04 02:33:24 +00:00