1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

5259 Commits

Author SHA1 Message Date
Chris Lattner
7099f3c400 fix a subtle volatile handling bug.
llvm-svn: 50428
2008-04-29 17:13:43 +00:00
Chris Lattner
b3972afe89 new testcase for PR2094. The inline asms should not pin allocas to the
stack anymore.

llvm-svn: 50397
2008-04-29 05:53:29 +00:00
Chris Lattner
51fe8415da don't delete the last store to an alloca if the store is volatile.
llvm-svn: 50390
2008-04-29 04:58:38 +00:00
Chris Lattner
0f63b8fecc make the vector conversion magic handle multiple results.
We now compile test2/test3 to:

_test2:
	## InlineAsm Start
	set %xmm0, %xmm1
	## InlineAsm End
	addps	%xmm1, %xmm0
	ret
_test3:
	## InlineAsm Start
	set %xmm0, %xmm1
	## InlineAsm End
	paddd	%xmm1, %xmm0
	ret

as expected.

llvm-svn: 50389
2008-04-29 04:48:56 +00:00
Chris Lattner
e75d09711d add support for multiple return values in inline asm. This is a step
towards PR2094.  It now compiles the attached .ll file to:

_sad16_sse2:
	movslq	%ecx, %rax
	## InlineAsm Start
	%ecx %rdx %rax %rax %r8d %rdx %rsi
	## InlineAsm End
	## InlineAsm Start
	set %eax
	## InlineAsm End
	ret

which is pretty decent for a 3 output, 4 input asm.

llvm-svn: 50386
2008-04-29 04:29:54 +00:00
Evan Cheng
381b094a2b Another extract_subreg coalescing bug.
e.g.
vr1024<2> extract_subreg vr1025, 2
If vr1024 do not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64.

llvm-svn: 50385
2008-04-29 01:41:44 +00:00
Evan Cheng
8696eb18c1 Add -march=x86.
llvm-svn: 50380
2008-04-28 23:31:41 +00:00
Dan Gohman
fa033a0fbd Update and_ops.ll according to the recent dagcombiner changes.
Add a new test, and_ops_more.ll, which is XFAIL'd, to
record the parts of and_ops.ll that were affected by this
change.

llvm-svn: 50379
2008-04-28 23:26:22 +00:00
Evan Cheng
3167c716f2 Test case.
llvm-svn: 50377
2008-04-28 22:14:34 +00:00
Dan Gohman
9e4db7f0bd Fix DSE to not eliminate volatile loads with no uses.
llvm-svn: 50370
2008-04-28 19:51:27 +00:00
Dan Gohman
1b7238e6e4 Teach InstCombine's ComputeMaskedBits what SelectionDAG's
ComputeMaskedBits knows about cttz, ctlz, and ctpop. Teach
SelectionDAG's ComputeMaskedBits what InstCombine's knows
about SRem. And teach them both some things about high bits
in Mul, UDiv, URem, and Sub. This allows instcombine and
dagcombine to eliminate sign-extension operations in
several new cases.

llvm-svn: 50358
2008-04-28 17:02:21 +00:00
Chris Lattner
ede7e89144 Fix PR2256, yet another miscompilation in simplifycfg of i
multiple return values.

Bill, please pull this into Tak.

llvm-svn: 50332
2008-04-28 00:19:07 +00:00
Chris Lattner
39a4281deb Implement a signficant optimization for inline asm:
When choosing between constraints with multiple options,
like "ir", test to see if we can use the 'i' constraint and
go with that if possible.  This produces more optimal ASM in
all cases (sparing a register and an instruction to load it),
and fixes inline asm like this:

void test () {
  asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14));
}

Previously we would dump "42" into a memory location (which
is ok for the 'm' constraint) which would cause a problem
because the 'c' modifier is not valid on memory operands.

Isn't it great how inline asm turns 'missed optimization'
into 'compile failed'??

Incidentally, this was the todo in 
PowerPC/2007-04-24-InlineAsm-I-Modifier.ll

Please do NOT pull this into Tak.

llvm-svn: 50315
2008-04-27 00:37:18 +00:00
Chris Lattner
2798e42a9f When SRoA'ing a global variable, make sure the new globals get the
appropriate alignment.  This fixes a miscompilation of 252.eon on
x86-64 (rdar://5891920).

Bill, please pull this into Tak.

llvm-svn: 50308
2008-04-26 07:40:11 +00:00
Nate Begeman
1723f6af2b Feedback from chris
llvm-svn: 50305
2008-04-25 21:47:35 +00:00
Nate Begeman
acd7e1c464 Add a testcase for the recent "handle variable vector insert elt in mem" patch
llvm-svn: 50303
2008-04-25 21:26:59 +00:00
Evan Cheng
db1497fa77 Update tests.
llvm-svn: 50293
2008-04-25 20:13:47 +00:00
Evan Cheng
11f101a800 Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers.
llvm-svn: 50289
2008-04-25 19:11:04 +00:00
Dan Gohman
c4b6768db4 Remove the code from CodeGenPrepare that moved getresult instructions
to the block that defines their operands. This doesn't work in the
case that the operand is an invoke, because invoke is a terminator
and must be the last instruction in a block.

Replace it with support in SelectionDAGISel for copying struct values
into sequences of virtual registers.

llvm-svn: 50279
2008-04-25 18:27:55 +00:00
Chris Lattner
fb539f4468 new testcase
llvm-svn: 50274
2008-04-25 18:11:06 +00:00
Anton Korobeynikov
dad01d86e0 Update test
llvm-svn: 50272
2008-04-25 17:54:21 +00:00
Nick Lewycky
1f831c0f57 Remove 'unwinds to' support from mainline. This patch undoes r47802 r47989
r48047 r48084 r48085 r48086 r48088 r48096 r48099 r48109 and r48123.

llvm-svn: 50265
2008-04-25 16:53:59 +00:00
Evan Cheng
39ae78cadb MMX argument passing fixes:
On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2].                                                                                                                                      
On Darwin / Linux x86-32, v1i64 values are passed in memory.                                                                                                                                                    
On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7].                                                                                                                                     
On Darwin x86-64, v1i64 values are passed in 64-bit GPRs.

llvm-svn: 50257
2008-04-25 07:56:45 +00:00
Chris Lattner
8c9f6c929a Loosen up an assertion to allow intrinsics. I really have no
idea what this code (findNonImmUse) does, so I'm only guessing 
that this is the right thing.  It would be really really nice
if this had comments and perhaps switched to SmallPtrSet
(hint hint) :)

This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c

llvm-svn: 50252
2008-04-25 05:13:01 +00:00
Chris Lattner
1a6268f776 Don't infininitely thread branches when a threaded edge
goes back to the block, e.g.:

  Threading edge through bool from 'bb37.us.thread3829' to 'bb37.us' with cost: 1, across block:

bb37.us:		; preds = %bb37.us.thread3829, %bb37.us, %bb33
	%D1361.1.us = phi i32 [ %tmp36, %bb33 ], [ %D1361.1.us, %bb37.us ], [ 0, %bb37.us.thread3829 ]		; <i32> [#uses=2]
	%tmp39.us = icmp eq i32 %D1361.1.us, 0		; <i1> [#uses=1]
	br i1 %tmp39.us, label %bb37.us, label %bb42.us

llvm-svn: 50251
2008-04-25 04:12:29 +00:00
Evan Cheng
484060ba4a Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero.
llvm-svn: 50239
2008-04-25 00:26:43 +00:00
Evan Cheng
906911f9e5 New test.
llvm-svn: 50229
2008-04-24 20:01:58 +00:00
Devang Patel
fed5cd5fe7 Add EXTRA_OPTIONS on the llvmgxx command line.
llvm-svn: 50217
2008-04-24 17:59:03 +00:00
Devang Patel
7ff3d5b65b Add EXTRA_OPTIONS on the llvmgcc command line.
llvm-svn: 50216
2008-04-24 17:54:25 +00:00
Chris Lattner
be35a0c224 Split some code out of the main SimplifyCFG loop into its own function.
Fix said code to handle merging return instructions together correctly
when handling multiple return values.

llvm-svn: 50199
2008-04-24 00:01:19 +00:00
Anton Korobeynikov
d6ec8965f7 Fix tests due to llvm2cpp move to llc target
llvm-svn: 50191
2008-04-23 22:41:53 +00:00
Dan Gohman
afa475f207 Add support to codegen for getresult instructions with undef operands.
llvm-svn: 50180
2008-04-23 20:21:29 +00:00
Anton Korobeynikov
244a615291 Disable stack realignment for these tests
llvm-svn: 50172
2008-04-23 18:25:44 +00:00
Anton Korobeynikov
1898ce20e2 Fix test becase ABI stack alignment dropped to 'normal' value
llvm-svn: 50171
2008-04-23 18:25:16 +00:00
Anton Korobeynikov
15c5a2ce26 Fix test, instruction count is valid only if stack is not realigned
llvm-svn: 50170
2008-04-23 18:24:48 +00:00
Chris Lattner
721ea7ca10 Rewrite multiple return value handling in SCCP. Before, the -sccp pass
would turn every getresult instruction into undef.  This helps with
rdar://5778210

llvm-svn: 50140
2008-04-23 05:38:20 +00:00
Chris Lattner
0dd624d232 remove this testcase. It isn't testing loop rotate, it is testing all
of -std-compile-opts and is now failing because other passes are generating
IR that looks different to input of loop rotate.  Devang, please 
introduce a testcase that only runs loop rotate.

llvm-svn: 50136
2008-04-23 05:36:04 +00:00
Chris Lattner
5a4e46d886 returning an empty multiple return list is not valid.
llvm-svn: 50135
2008-04-23 05:29:14 +00:00
Chris Lattner
be858fc296 make this test more interesting.
llvm-svn: 50128
2008-04-23 03:49:32 +00:00
Chris Lattner
d059ac2e32 distill down the essense of this test.
llvm-svn: 50125
2008-04-23 03:03:42 +00:00
Dale Johannesen
547a55caf1 new test
llvm-svn: 50123
2008-04-23 01:22:22 +00:00
Evan Cheng
680839e258 Don't do: "(X & 4) >> 1 == 2 --> (X & 4) == 4" if there are more than one uses of the shift result.
llvm-svn: 50118
2008-04-23 00:38:06 +00:00
Chris Lattner
e304ae5621 Start doing the significantly useful part of jump threading: handle cases
where a comparison has a phi input and that phi is a constant.  For example,
stuff like:

  Threading edge through bool from 'bb2149' to 'bb2231' with cost: 1, across block:
bb2237:		; preds = %bb2231, %bb2149
	%tmp2328.rle = phi i32 [ %tmp2232, %bb2231 ], [ %tmp2232439, %bb2149 ]		; <i32> [#uses=2]
	%done.0 = phi i32 [ %done.2, %bb2231 ], [ 0, %bb2149 ]		; <i32> [#uses=1]
	%tmp2239 = icmp eq i32 %done.0, 0		; <i1> [#uses=1]
	br i1 %tmp2239, label %bb2231, label %bb2327

or

bb38.i298:		; preds = %bb33.i295, %bb1693
	%tmp39.i296.rle = phi %struct.ibox* [ null, %bb1693 ], [ %tmp39.i296.rle1109, %bb33.i295 ]		; <%struct.ibox*> [#uses=2]
	%minspan.1.i291.reg2mem.1 = phi i32 [ 32000, %bb1693 ], [ %minspan.0.i288, %bb33.i295 ]		; <i32> [#uses=1]
	%tmp40.i297 = icmp eq %struct.ibox* %tmp39.i296.rle, null		; <i1> [#uses=1]
	br i1 %tmp40.i297, label %implfeeds.exit311, label %bb43.i301

This triggers thousands of times in spec.

llvm-svn: 50110
2008-04-22 21:40:39 +00:00
Chris Lattner
c59cf9c8da Dig through multiple levels of AND to thread jumps if needed.
llvm-svn: 50106
2008-04-22 20:46:09 +00:00
Chris Lattner
dcbc6443ae Teach jump threading to thread through blocks like:
br (and X, phi(Y, Z, false)), label L1, label L2

This triggers once on 252.eon and 6 times on 176.gcc.  Blocks 
in question often look like this:

bb262:		; preds = %bb261, %bb248
	%iftmp.251.0 = phi i1 [ true, %bb261 ], [ false, %bb248 ]		; <i1> [#uses=4]
	%tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null		; <i1> [#uses=1]
	%bothcond = or i1 %iftmp.251.0, %tmp270		; <i1> [#uses=1]
	br i1 %bothcond, label %bb288, label %bb273

In this case, it is clear that it doesn't matter if tmp.0.i is null when coming from bb261.  When coming from bb248, it is all that matters.


Another random example:

check_asm_operands.exit:		; preds = %check_asm_operands.exit.thr_comm, %bb30.i, %bb12.i, %bb6.i413
	%tmp.0.i420 = phi i1 [ true, %bb6.i413 ], [ true, %bb12.i ], [ true, %bb30.i ], [ false, %check_asm_operands.exit.thr_comm ; <i1> [#uses=1]
	call void @llvm.stackrestore( i8* %savedstack ) nounwind 
	%tmp4389 = icmp eq i32 %added_sets_1.0, 0		; <i1> [#uses=1]
	%tmp4394 = icmp eq i32 %added_sets_2.0, 0		; <i1> [#uses=1]
	%bothcond80 = and i1 %tmp4389, %tmp4394		; <i1> [#uses=1]
	%bothcond81 = and i1 %bothcond80, %tmp.0.i420		; <i1> [#uses=1]
	br i1 %bothcond81, label %bb4398, label %bb4397

Here is the case from 252.eon:

bb290.i.i:		; preds = %bb23.i57.i.i, %bb8.i39.i.i, %bb100.i.i, %bb100.i.i, %bb85.i.i110
	%myEOF.1.i.i = phi i1 [ true, %bb100.i.i ], [ true, %bb100.i.i ], [ true, %bb85.i.i110 ], [ true, %bb8.i39.i.i ], [ false, %bb23.i57.i.i ]		; <i1> [#uses=2]
	%i.4.i.i = phi i32 [ %i.1.i.i, %bb85.i.i110 ], [ %i.0.i.i, %bb100.i.i ], [ %i.0.i.i, %bb100.i.i ], [ %i.3.i.i, %bb8.i39.i.i ], [ %i.3.i.i, %bb23.i57.i.i ]		; <i32> [#uses=3]
	%tmp292.i.i = load i8* %tmp16.i.i100, align 1		; <i8> [#uses=1]
	%tmp293.not.i.i = icmp ne i8 %tmp292.i.i, 0		; <i1> [#uses=1]
	%bothcond.i.i = and i1 %tmp293.not.i.i, %myEOF.1.i.i		; <i1> [#uses=1]
	br i1 %bothcond.i.i, label %bb202.i.i, label %bb301.i.i
  Factoring out 3 common predecessors.

On the path from any blocks other than bb23.i57.i.i, the load and compare 
are dead.

llvm-svn: 50096
2008-04-22 07:05:46 +00:00
Chris Lattner
4638234905 add a basic testcase.
llvm-svn: 50093
2008-04-22 06:35:14 +00:00
Nick Lewycky
1b583954ad Start removing 'unwinds to' support from mainline in preparation for 2.3.
llvm-svn: 50086
2008-04-22 05:16:02 +00:00
Chris Lattner
14be19cf1e optimize "p != gep p, ..." better. This allows us to compile
getelementptr-seteq.ll into:

define i1 @test(i64 %X, %S* %P) {
	%C = icmp eq i64 %X, -1		; <i1> [#uses=1]
	ret i1 %C
}

instead of:

define i1 @test(i64 %X, %S* %P) {
	%A.idx.mask = and i64 %X, 4611686018427387903		; <i64> [#uses=1]
	%C = icmp eq i64 %A.idx.mask, 4611686018427387903		; <i1> [#uses=1]
	ret i1 %C
}

And fixes the second half of PR2235.  This speeds up the insertion sort
case by 45%, from 1.12s to 0.77s.  In practice, this will significantly
speed up for loops structured like:

for (double *P = Base + N; P != Base; --P)
  ...

Which happens frequently for C++ iterators.

llvm-svn: 50079
2008-04-22 02:53:33 +00:00
Dan Gohman
93b5be1824 Implement an x86-64 ABI detail of passing structs by hidden first
argument. The x86-64 ABI requires the incoming value of %rdi to
be copied to %rax on exit from a function that is returning a
large C struct.

Also, add a README-X86-64 entry detailing the missed optimization
opportunity and proposing an alternative approach.

llvm-svn: 50075
2008-04-21 23:59:07 +00:00
Duncan Sands
717a1e09aa Make these structs larger to ensure that they
are returned by struct return.

llvm-svn: 50038
2008-04-21 08:17:05 +00:00
Duncan Sands
cfb3631483 Make the struct bigger, to ensure it is returned
by struct return.

llvm-svn: 50037
2008-04-21 08:12:03 +00:00
Owen Anderson
bc6046416f Refactor memcpyopt based on Chris' suggestions. Consolidate several functions
and simplify code that was fallout from the separation of memcpyopt and gvn.

llvm-svn: 50034
2008-04-21 07:45:10 +00:00
Chris Lattner
2c5b96fbee A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2.
llvm-svn: 49986
2008-04-20 05:52:46 +00:00
Chris Lattner
8503e2d236 Not all x86-64 machines have sse3 apparently.
llvm-svn: 49985
2008-04-20 05:47:56 +00:00
Chris Lattner
a9d8d647ca rename *.llx -> *.ll, last batch.
llvm-svn: 49971
2008-04-19 22:32:52 +00:00
Chris Lattner
c310b1f1f3 rename *.llx -> *.ll
llvm-svn: 49970
2008-04-19 22:29:10 +00:00
Chris Lattner
63bd1df323 rename *.llx -> *.ll
llvm-svn: 49969
2008-04-19 22:26:29 +00:00
Chris Lattner
8cde1e71f0 Implement PR2206.
llvm-svn: 49967
2008-04-19 22:17:26 +00:00
Chris Lattner
1303e72c66 refactor handling of symbolic constant folding, picking up
a few new cases( see Integer/a1.ll), but not anything that
would happen in practice.

llvm-svn: 49965
2008-04-19 21:58:19 +00:00
Evan Cheng
f583b3feb6 64-bit atomic operations.
llvm-svn: 49949
2008-04-19 02:30:38 +00:00
Dan Gohman
ac2fac937c Teach llvm-as to accept function types with multiple return types.
llvm-svn: 49945
2008-04-19 00:24:39 +00:00
Evan Cheng
073659986f Be more careful with insert_subreg and extract_subreg where either source or destination operand has already been coalesced with another register that's defined by a insert_subreg or extract_subreg.
llvm-svn: 49843
2008-04-17 07:58:04 +00:00
Owen Anderson
cd1b9c4b43 Make GVN able to remove unnecessary calls to read-only functions again.
llvm-svn: 49842
2008-04-17 05:36:50 +00:00
Evan Cheng
7c2c3333ca Fix a sub-register indice propagation bug.
llvm-svn: 49832
2008-04-17 00:06:42 +00:00
Evan Cheng
e2e899b5c2 Don't forget about sub-register indices when rematting instructions.
llvm-svn: 49830
2008-04-16 23:44:44 +00:00
Evan Cheng
44a0a0c8ee After reading memory that's already freed.
llvm-svn: 49810
2008-04-16 20:24:25 +00:00
Evan Cheng
4b16ea6247 Really test what's intended.
llvm-svn: 49802
2008-04-16 18:21:55 +00:00
Evan Cheng
6d05ce493b Rewrite LiveVariable liveness computation. The new implementation is much simplified. It eliminated the nasty recursive routines and removed the partial def / use bookkeeping. There is also potential for performance improvement by replacing the conservative handling of partial physical register definitions. The code is currently disabled until live interval analysis is taught of the name scheme.
This patch also fixed a couple of nasty corner cases.

llvm-svn: 49784
2008-04-16 09:46:40 +00:00
Owen Anderson
64fc7a4268 XFAIL this test for the moment. The real solution is to prevent ADCE
from transforming loops and adding a separate loop pass for removing
loops with know trip counts.  Until that happens, ADCE is miscompiling this code.

llvm-svn: 49769
2008-04-16 04:25:42 +00:00
Dan Gohman
be8f2b452b Add support for the form of the SSE41 extractps instruction that
puts its result in a 32-bit GPR.

llvm-svn: 49762
2008-04-16 02:32:24 +00:00
Dan Gohman
cf79877623 Recreate the size SDNode instead of reusing the old one in the x86
memcpy lowering code; this ensures that the size node has the desired
result type. This fixes a regression from r49572 with @llvm.memcpy.i64
on x86-32.

llvm-svn: 49761
2008-04-16 01:32:32 +00:00
Dan Gohman
7d27552962 Add movd instructions to move from MMX registers
to 64-bit GPR registers on x86-64.

llvm-svn: 49757
2008-04-15 23:55:07 +00:00
Dale Johannesen
45e14f7753 Don't assume a tail call can't reference a byval
argument to the outer function, this isn't correct.

llvm-svn: 49731
2008-04-15 17:41:34 +00:00
Dan Gohman
3b99b3c807 Treat EntryToken nodes as "passive" so that they aren't added to the
ScheduleDAG; they don't correspond to any actual instructions so they
don't need to be scheduled.

This fixes a bug where the EntryToken was being scheduled multiple
times in some cases, though it ended up not causing any trouble because 
EntryToken doesn't expand into anything. With this fixed the schedulers
reliably schedule the expected number of units, so we can check this
with an assertion.

This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it
ends up getting scheduled differently in a trivial way, though it was
enough to fool the prcontext+grep that the test does.

llvm-svn: 49701
2008-04-15 01:22:18 +00:00
Dan Gohman
cce2b42edc Upgrade these tests for the current intrinsic prototypes.
llvm-svn: 49669
2008-04-14 18:19:18 +00:00
Dale Johannesen
d9a9c746d8 Remove -unwind-tables-optional everywhere, since
this is now the default.

llvm-svn: 49667
2008-04-14 17:56:54 +00:00
Owen Anderson
a6d1d8dec2 The functionality being tested was removed because it was horribly unsafe.
llvm-svn: 49610
2008-04-13 09:51:06 +00:00
Arnold Schwaighofer
82af0e6a43 This patch corrects the handling of byval arguments for tailcall
optimized x86-64 (and x86) calls so that they work (... at least for
my test cases).

Should fix the following problems:

Problem 1: When i introduced the optimized handling of arguments for
tail called functions (using a sequence of copyto/copyfrom virtual
registers instead of always lowering to top of the stack) i did not
handle byval arguments correctly e.g they did not work at all :).

Problem 2: On x86-64 after the arguments of the tail called function
are moved to their registers (which include ESI/RSI etc), tail call
optimization performs byval lowering which causes xSI,xDI, xCX
registers to be overwritten. This is handled in this patch by moving
the arguments to virtual registers first and after the byval lowering
the arguments are moved from those virtual registers back to
RSI/RDI/RCX.

llvm-svn: 49584
2008-04-12 18:11:06 +00:00
Dan Gohman
15edbf989f Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal
on any current target and aren't optimized in DAGCombiner. Instead
of using intermediate nodes, expand the operations, choosing between
simple loads/stores, target-specific code, and library calls,
immediately.

Previously, the code to emit optimized code for these operations
was only used at initial SelectionDAG construction time; now it is
used at all times. This fixes some cases where rep;movs was being
used for small copies where simple loads/stores would be better.

This also cleans up code that checks for alignments less than 4;
let the targets make that decision instead of doing it in
target-independent code. This allows x86 to use rep;movs in
low-alignment cases.

Also, this fixes a bug that resulted in the use of rep;stos for
memsets of 0 with non-constant memory size when the alignment was
at least 4. It's better to use the library in this case, which
can be significantly faster when the size is large.

This also preserves more SourceValue information when memory
intrinsics are lowered into simple loads/stores.

llvm-svn: 49572
2008-04-12 04:36:06 +00:00
Dan Gohman
41f9d24d52 Fix a bug that prevented x86-64 from using rep.movsq for
8-byte-aligned data.

llvm-svn: 49571
2008-04-12 02:35:39 +00:00
Evan Cheng
6e52146f16 If a PHI node has a single implicit_def source, replace it with an implicit_def instead of a copy.
llvm-svn: 49543
2008-04-11 17:54:45 +00:00
Owen Anderson
15e930588a Add testcase for PR2213.
llvm-svn: 49517
2008-04-11 05:13:32 +00:00
Evan Cheng
56ca7e285a New test.
llvm-svn: 49514
2008-04-10 23:49:09 +00:00
Dan Gohman
318d9a6605 Teach InstCombine's ComputeMaskedBits to handle pointer expressions
in addition to integer expressions. Rewrite GetOrEnforceKnownAlignment
as a ComputeMaskedBits problem, moving all of its special alignment
knowledge to ComputeMaskedBits as low-zero-bits knowledge.

Also, teach ComputeMaskedBits a few basic things about Mul and PHI
instructions.

This improves ComputeMaskedBits-based simplifications in a few cases,
but more noticeably it significantly improves instcombine's alignment
detection for loads, stores, and memory intrinsics.

llvm-svn: 49492
2008-04-10 18:43:06 +00:00
Evan Cheng
6f164e3814 A copy instruction may use a register multiple times on some targets. Change them all.
llvm-svn: 49491
2008-04-10 18:38:47 +00:00
Chris Lattner
3b289289a7 Fix the x86-64 side of PR2108 by adding a v2f64 version of
MOVZQI2PQIrr.  This would be better handled as a dag combine 
(with the goal of eliminating the bitconvert) but I don't know
how to do that safely.  Thoughts welcome.

llvm-svn: 49463
2008-04-10 05:13:43 +00:00
Evan Cheng
1803e20a62 Teach branch folding pass about implicit_def instructions. Unfortunately we can't just eliminate them since register scavenger expects every register use to be defined. However, we can delete them when there are no intra-block uses. Carefully removing some implicit def's which enable more blocks to be optimized away.
llvm-svn: 49461
2008-04-10 02:32:10 +00:00
Evan Cheng
def576f9e6 - More aggressively coalescing away copies whose source is defined by an implicit_def.
- Added insert_subreg coalescing support.

llvm-svn: 49448
2008-04-09 20:57:25 +00:00
Chris Lattner
be01a5f699 Generalize getUnaryFloatFunction to handle any FP unary function, automatically
figuring out the suffix to use.  implement pow(2,x) -> exp2(x).

llvm-svn: 49437
2008-04-09 17:48:11 +00:00
Chris Lattner
5d0cbe7d22 remove capital letter from test name.
llvm-svn: 49436
2008-04-09 17:46:36 +00:00
Owen Anderson
ca7e0e21f3 Factor a bunch of functionality related to memcpy and memset transforms out of
GVN and into its own pass.

llvm-svn: 49419
2008-04-09 08:23:16 +00:00
Evan Cheng
f35cc57821 Missed a hasInterval check.
llvm-svn: 49415
2008-04-09 01:30:15 +00:00
Chris Lattner
976ea8990e many cleanups to the pow optimizer. Allow it to handle powf,
add support for  pow(x, 2.0) -> x*x.

llvm-svn: 49411
2008-04-09 00:07:45 +00:00
Duncan Sands
b430cf3b7c Check that bodies and calls but not declarations
are marked nounwind when compiling without
-fexceptions.

llvm-svn: 49393
2008-04-08 19:31:52 +00:00
Dale Johannesen
5ac0a0ed21 Rename -disable-required-unwind-tables to -unwind-tables-optional.
llvm-svn: 49391
2008-04-08 18:10:08 +00:00
Gabor Greif
80acb912a9 merge r48768 from branches/ggreif/parallelized-test
llvm-svn: 49382
2008-04-08 15:22:41 +00:00
Dale Johannesen
576a7685f2 Missed one.
llvm-svn: 49365
2008-04-08 00:14:59 +00:00
Dale Johannesen
3f992b224e Add -disable-required-unwind-tables to tests
that need it (usually, grepping for some string
found in unwind info)

llvm-svn: 49364
2008-04-08 00:14:17 +00:00
Duncan Sands
79af9d68ec Testcase for pr2169.
llvm-svn: 49344
2008-04-07 17:03:16 +00:00
Evan Cheng
6c58f2397d Fix test.
llvm-svn: 49343
2008-04-07 17:02:18 +00:00
Chris Lattner
f88214caca fix this testcase to pass and remove a duplicate instance of itself.
llvm-svn: 49281
2008-04-06 21:39:17 +00:00
Torok Edwin
34e6889671 Prefer to expand mask for xor to -1, so we have a chance to turn it into a not.
If it cannot be expanded, it will keep the old behaviour and try to shrink the constant.
Part of enhancement for PR2191.

llvm-svn: 49280
2008-04-06 21:23:02 +00:00
Evan Cheng
d7d1c94e67 1. IMPLICIT_DEF can *re-define* any register.
2. Coalescer can now create an interesting situation where a register def can
   reaches itself without being killed.

llvm-svn: 49246
2008-04-05 01:27:09 +00:00
Evan Cheng
4d7b2ab16f Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps.
llvm-svn: 49244
2008-04-05 00:30:36 +00:00
Evan Cheng
f045d86660 New test case.
llvm-svn: 49190
2008-04-03 21:25:03 +00:00
Dale Johannesen
ebfa6edc65 Testcase for EH with functions whose names are stripped.
llvm-svn: 49111
2008-04-02 20:16:41 +00:00
Dan Gohman
168b2b1300 Speculatively micro-optimize memory-zeroing calls on Darwin 10.
llvm-svn: 49048
2008-04-01 20:38:36 +00:00
Evan Cheng
c2f298f318 More soft fp fixes.
llvm-svn: 49016
2008-04-01 02:18:22 +00:00
Evan Cheng
a38ae9c502 Unbreak ARM / Thumb soft FP support.
llvm-svn: 49012
2008-04-01 01:50:16 +00:00
Dale Johannesen
d9a5b77269 Mark functions in some tests as 'nounwind'. Generating
EH info for these functions causes the tests to fail for
random reasons (e.g. looking for 'or' or counting lines
with asm-printer; labels count as lines.)

llvm-svn: 49003
2008-03-31 23:20:09 +00:00
Evan Cheng
a3ce7b4c76 It's not safe to fold a load from GV stub or constantpool into a two-address use.
llvm-svn: 49002
2008-03-31 23:19:51 +00:00
Dan Gohman
f223eaafcd Fix a DAGCombiner optimization to respect volatile qualification.
llvm-svn: 48994
2008-03-31 20:32:52 +00:00
Chris Lattner
12cecbbb25 add a testcase for forming memset from noncontiguous stores.
llvm-svn: 48938
2008-03-29 04:51:35 +00:00
Dan Gohman
227e702cae Fix a tokenfactor node to use the load chain rather than the
load value. This fixes PR2177.

llvm-svn: 48932
2008-03-28 23:45:16 +00:00
Devang Patel
0951d2a8d3 add another testcase
llvm-svn: 48881
2008-03-27 17:13:55 +00:00
Devang Patel
a04c63181f New test case.
llvm-svn: 48858
2008-03-27 01:51:31 +00:00
Evan Cheng
6cbce6b602 Fix a memory bug: increment an iterator of a deleted machine instr.
llvm-svn: 48853
2008-03-27 01:27:25 +00:00
Erick Tryzelaar
0efea4df76 Expose ExecutionEngine::getTargetData() to c and ocaml bindings.
llvm-svn: 48851
2008-03-27 00:27:14 +00:00
Evan Cheng
6fc37c8f25 One more coalescer fix wrt deadness propagation.
llvm-svn: 48837
2008-03-26 20:15:49 +00:00
Evan Cheng
8d222d6221 Avoid commuting a def MI in order to coalesce a copy instruction away if any use of the same val# is a copy instruction that has already been coalesced.
llvm-svn: 48833
2008-03-26 19:03:01 +00:00
Dale Johannesen
8c1e95810f Use ## for comment delimiter on darwin x86-32, so
llvm's output .s files will go through gcc -std=c99
without triggering preprocesser errors.  Approach
suggested by Daveed Vandevoorde.

llvm-svn: 48808
2008-03-25 23:29:30 +00:00
Evan Cheng
8cb64d8e8b Handle a special case xor undef, undef -> 0. Technically this should be transformed to undef. But this is such a common idiom (misuse) we are going to handle it.
llvm-svn: 48792
2008-03-25 20:08:07 +00:00
Evan Cheng
563b265f37 Handle a special case xor undef, undef -> 0. Technically this should be transformed to undef. But this is such a common idiom (misuse) we are going to handle it.
llvm-svn: 48791
2008-03-25 20:07:13 +00:00
Dan Gohman
58ad056286 Add CMP32mr and friends to the load-unfolding table. Among
other things, this allows the scheduler to unfold a load operand
in the 2008-01-08-SchedulerCrash.ll testcase, so it now successfully
clones the comparison to avoid a pushf+popf.

llvm-svn: 48777
2008-03-25 16:53:19 +00:00
Gordon Henriksen
2d762e28e9 Tests for the instruction iterator bindings.
llvm-svn: 48775
2008-03-25 16:35:08 +00:00
Tanya Lattner
b6a27ed83f Byebye llvm-upgrade!
llvm-svn: 48762
2008-03-25 04:26:08 +00:00
Evan Cheng
7c1dcd8371 lastRegisterUse() should ignore identity copies. Those will be erased.
llvm-svn: 48759
2008-03-25 02:02:19 +00:00
Devang Patel
a7084b048f check struct layout
llvm-svn: 48758
2008-03-25 00:47:49 +00:00
Bill Wendling
2097b72649 Use the bit size of the operand instead of the hard-coded 32 to generate the
mask.

llvm-svn: 48750
2008-03-24 23:16:37 +00:00
Evan Cheng
dbdf48276a - SSE4.1 extractfps extracts a f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teaches lowering code to use it only when the only use is a store instruction.
llvm-svn: 48746
2008-03-24 21:52:23 +00:00
Devang Patel
425514c509 Add incoming value from header only if phi node has any use inside the loop.
llvm-svn: 48738
2008-03-24 20:16:14 +00:00
Devang Patel
25068296ec Fix test name.
llvm-svn: 48733
2008-03-24 18:08:07 +00:00
Chris Lattner
97e4d98c2d apparently tclsh doesn't lex like bash. Weird.
llvm-svn: 48732
2008-03-24 17:41:57 +00:00
Chris Lattner
3a6d3372f5 pass the option so this test tests the right thing.
llvm-svn: 48731
2008-03-24 17:36:38 +00:00
Devang Patel
9548f89eaf Add new test.
llvm-svn: 48730
2008-03-24 17:16:39 +00:00
Devang Patel
4ca45ebdf4 Remove incorrect comment.
llvm-svn: 48728
2008-03-24 16:58:20 +00:00
Dan Gohman
b9c5e6258f APIntify SelectionDAG's EXTRACT_ELEMENT code.
llvm-svn: 48726
2008-03-24 16:38:05 +00:00
Evan Cheng
1d63708523 Transform (zext (or (icmp), (icmp))) to (or (zext (cimp), (zext icmp))) if at least one of the (zext icmp) can be transformed to eliminate an icmp.
llvm-svn: 48715
2008-03-24 00:21:34 +00:00
Gordon Henriksen
52f3a08237 Objective Caml bindings for basic block, function, global, and arg iterators.
llvm-svn: 48711
2008-03-23 22:21:29 +00:00
Bill Wendling
f607f27320 New testcase.
llvm-svn: 48697
2008-03-22 22:27:01 +00:00
Owen Anderson
2f91173e40 Use normal naming convention for test.
llvm-svn: 48693
2008-03-22 21:08:33 +00:00
Anton Korobeynikov
25a0157827 Add testcase for prev. commit. Minor fixes
llvm-svn: 48686
2008-03-22 08:37:05 +00:00
Anton Korobeynikov
06f3b7f4ee Support chained aliases for LLVM IR printing. This fixes PR2145
llvm-svn: 48684
2008-03-22 08:17:17 +00:00
Chris Lattner
16f62d36e8 implement an initial hack at a straight-line store -> memset optimization.
This fires dozens of times across spec and multisource, but I don't know
if it actually speeds stuff up.  Hopefully the testers will show something
nice :)

llvm-svn: 48680
2008-03-22 05:37:16 +00:00
Evan Cheng
874aee2eec Teach DAG combiner to commute commutable binary nodes in order to achieve sdisel CSE.
llvm-svn: 48673
2008-03-22 01:55:50 +00:00
Dan Gohman
59aeac6320 Handle getresult instructions in different basic blocks
from their aggregate operands by moving the getresult
instructions.

llvm-svn: 48657
2008-03-21 21:01:32 +00:00
Duncan Sands
8e40ac013e Testcase for PR2160.
llvm-svn: 48655
2008-03-21 20:22:11 +00:00
Chris Lattner
8a4fa95cae Add support for calls that return two FP values in
ST(0)/ST(1).

llvm-svn: 48634
2008-03-21 06:38:26 +00:00
Chris Lattner
933d0d318b disable a bogus assertion.
llvm-svn: 48633
2008-03-21 06:01:05 +00:00
Chris Lattner
260473f983 Enable support for returning two long-double values in ST(0)/ST(1).
This allows us to compile fp-stack-2results.ll into:

_test:
	fldz
	fld1
	ret

which returns 1 in ST(0) and 0 in ST(1).  This is needed for x86-64
_Complex long double.

llvm-svn: 48632
2008-03-21 05:57:20 +00:00