llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Chris Lattner	d696c25db5	This readme entry is done, testcase here: CodeGen/X86/zero-remat.ll llvm-svn: 47106	2008-02-14 05:39:46 +00:00
Evan Cheng	cbbcb3144d	Fix test. llvm-svn: 47102	2008-02-14 01:32:53 +00:00
Duncan Sands	2e9661573f	Teach LegalizeTypes how to expand and promote CTLZ, CTTZ and CTPOP. The expansion code differs from that in LegalizeDAG in that it chooses to take the CTLZ/CTTZ count from the Hi/Lo part depending on whether the Hi/Lo value is zero, not on whether CTLZ/CTTZ of Hi/Lo returned 32 (or whatever the width of the type is) for it. I made this change because the optimizers may well know that Hi/Lo is zero and exploit it. The promotion code for CTTZ also differs from that in LegalizeDAG: it uses an "or" to get the right result when the original value is zero, rather than using a compare and select. This also means the value doesn't need to be zero extended. llvm-svn: 47075	2008-02-13 18:01:53 +00:00
Chris Lattner	a30946c576	In SDISel, for targets that support FORMAL_ARGUMENTS nodes, lower this node as soon as we create it in SDISel. Previously we would lower it in legalize. The problem with this is that it only exposes the argument loads implied by FORMAL_ARGUMENTs after legalize, so that only dag combine 2 can hack on them. This causes us to miss some optimizations because datatype expansion also happens here. Exposing the loads early allows us to do optimizations on them. For example we now compile arg-cast.ll to: _foo: movl $2147483647, %eax andl 8(%esp), %eax ret where we previously produced: _foo: subl $12, %esp movsd 16(%esp), %xmm0 movsd %xmm0, (%esp) movl $2147483647, %eax andl 4(%esp), %eax addl $12, %esp ret It might also make sense to do this for ISD::CALL nodes, which have implicit stores on many targets. llvm-svn: 47054	2008-02-13 07:39:09 +00:00
Nate Begeman	cfd9883301	Add testcase for recent legalizer change llvm-svn: 47049	2008-02-13 06:48:40 +00:00
Evan Cheng	68a88c1f52	New tests. llvm-svn: 47047	2008-02-13 03:23:53 +00:00
Evan Cheng	0d2efb485d	Don't mask the isel bug. llvm-svn: 47018	2008-02-12 19:11:29 +00:00
Evan Cheng	6c7520f922	This test assumes no SSE4.1. llvm-svn: 47017	2008-02-12 19:11:08 +00:00
Evan Cheng	1ab096a313	Fix some test cases. llvm-svn: 46998	2008-02-12 07:22:46 +00:00
Evan Cheng	19f684ed72	Determine whether a spill kills the register it's spilling before insertion rather than trying to undo the kill marker afterwards. llvm-svn: 46953	2008-02-11 08:30:52 +00:00
Dale Johannesen	304406f01c	Alignment of struct containing vectors depends on whether SSE is present, on Darwin anyway. Make it explicit. llvm-svn: 46909	2008-02-09 19:04:25 +00:00
Evan Cheng	90f03a0b88	It's not always safe to fold movsd into xorpd, etc. Check the alignment of the load address first to make sure it's 16 byte aligned. llvm-svn: 46893	2008-02-08 21:20:40 +00:00
Evan Cheng	b2bc19ee5b	Added missing entries in X86 load / store folding tables. llvm-svn: 46866	2008-02-08 00:12:56 +00:00
Evan Cheng	a377b2bbd1	Fix a x86-64 codegen deficiency. Allow gv + offset when using rip addressing mode. Before: _main: subq $8, %rsp leaq _X(%rip), %rax movsd 8(%rax), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Now: _main: subq $8, %rsp movsd _X+8(%rip), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Notice there is another idiotic codegen issue that needs to be fixed asap: xorl %ecx, %ecx movl %ecx, %eax llvm-svn: 46850	2008-02-07 08:53:49 +00:00
Evan Cheng	6b03a1aeb9	It's PR1925, not PR1609. llvm-svn: 46825	2008-02-06 22:07:17 +00:00
Evan Cheng	2091d9a2e8	Fix a number of local register allocator issues: PR1609. llvm-svn: 46821	2008-02-06 19:16:53 +00:00
Evan Cheng	851d353eb8	Fix PR1975: dag isel emitter produces patterns that isel wrong flag result. llvm-svn: 46776	2008-02-05 22:50:29 +00:00
Evan Cheng	69d5e0fc0f	If a vr is already marked alive in a bb, then it has PHI uses that are visited earlier, then it is not killed in the def block (i.e. not dead). llvm-svn: 46763	2008-02-05 20:04:18 +00:00
Duncan Sands	b65b2462c8	Crashes LegalizeTypes with "Do not know how to expand the result of this operator!" (node: ctlz). llvm-svn: 46713	2008-02-04 18:07:02 +00:00
Duncan Sands	123f86e781	Crashes LegalizeTypes with "Do not know how to split this operator's operand" (node: extract_subvector). llvm-svn: 46712	2008-02-04 18:05:42 +00:00
Chris Lattner	cad0478491	remove target triple to make this test more "generic" llvm-svn: 46711	2008-02-04 18:02:37 +00:00
Duncan Sands	36a938c4fb	Crashed the new type legalizer. Not likely to catch any bugs in the future since to get the crash you also need hacked in fake libcall support (which creates odd but legal trees), but since adding it doesn't hurt... Thanks to Chris for this ultimately reduced version. llvm-svn: 46706	2008-02-04 09:40:27 +00:00
Lauro Ramos Venancio	563e0a3ea3	CBackend: Implement unaligned load/store. llvm-svn: 46646	2008-02-01 21:25:59 +00:00
Chris Lattner	35f063e37c	Add target triples to these so they don't fail on linux. llvm-svn: 46496	2008-01-29 06:26:07 +00:00
Scott Michel	dc780aeb57	Overhaul Cell SPU's addressing mode internals so that there are now only two addressing mode nodes, SPUaform and SPUindirect (vice the three previous ones, SPUaform, SPUdform and SPUxform). This improves code somewhat because we now avoid using reg+reg addressing when it can be avoided. It also simplifies the address selection logic, which was the main point for doing this. Also, for various global variables that would be loaded using SPU's A-form addressing, prefer D-form offs[reg] addressing, keeping the base in a register if the variable is used more than once. llvm-svn: 46483	2008-01-29 02:16:57 +00:00
Chris Lattner	26a8116f49	Update this test. Due to dag combiner improvements, we now compile f7/f11 to: _f7: eor r0, r0, #2, 2 @ -2147483648 bx lr _f11: bic r0, r0, #2, 2 @ -2147483648 bx lr instead of: _f7: fmsr s0, r0 fnegs s0, s0 fmrs r0, s0 bx lr _f11: fmsr s0, r0 fabss s0, s0 fmrs r0, s0 bx lr llvm-svn: 46423	2008-01-27 23:26:37 +00:00
Chris Lattner	2ab1fd3824	Implement some dag combines that allow doing fneg/fabs/fcopysign in integer registers if used by a bitconvert or using a bitconvert. This allows us to avoid constant pool loads and use cheaper integer instructions when the values come from or end up in integer regs anyway. For example, we now compile CodeGen/X86/fp-in-intregs.ll to: _test1: movl $2147483648, %eax xorl 4(%esp), %eax ret _test2: movl $1065353216, %eax orl 4(%esp), %eax andl $3212836864, %eax ret Instead of: _test1: movss 4(%esp), %xmm0 xorps LCPI2_0, %xmm0 movd %xmm0, %eax ret _test2: movss 4(%esp), %xmm0 andps LCPI3_0, %xmm0 movss LCPI3_1, %xmm1 andps LCPI3_2, %xmm1 orps %xmm0, %xmm1 movd %xmm1, %eax ret bitconverts can happen due to various calling conventions that require fp values to passed in integer regs in some cases, e.g. when returning a complex. llvm-svn: 46414	2008-01-27 17:42:27 +00:00
Chris Lattner	e66aea6532	New test to verify that "merging 4 loads into a vec load" continues to work and continues to infer alignment info. llvm-svn: 46403	2008-01-26 20:06:45 +00:00
Chris Lattner	682346a7b0	Infer alignment of loads and increase their alignment when we can tell they are from the stack. This allows us to compile stack-align.ll to: _test: movsd LCPI1_0, %xmm0 movapd %xmm0, %xmm1 * andpd 4(%esp), %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret instead of: _test: movsd LCPI1_0, %xmm0 movsd 4(%esp), %xmm1 ** andpd %xmm0, %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret llvm-svn: 46401	2008-01-26 19:45:50 +00:00
Chris Lattner	f0c3240135	remove a useless xfailed test. llvm-svn: 46400	2008-01-26 19:35:46 +00:00
Bill Wendling	7b83688c73	If there's no instructions being emitted on X86 for a function, emit a nop. Emit the nop directly for PPC. llvm-svn: 46398	2008-01-26 09:03:52 +00:00
Bill Wendling	26fb9335f5	Need to convert to LLVM code and not C. llvm-svn: 46397	2008-01-26 06:56:08 +00:00
Bill Wendling	3e622b88b6	Rename the .c to .ll llvm-svn: 46396	2008-01-26 06:53:40 +00:00
Bill Wendling	7151e8d92c	Move testcase to the code gen directory. llvm-svn: 46395	2008-01-26 06:53:06 +00:00
Chris Lattner	53a98f46fd	Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to delete a node even if it was not dead in some cases. Instead, just add it to the worklist. Also, make sure to use the CombineTo methods, as it was doing things that were unsafe: the top level combine loop could touch dangling memory. This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll llvm-svn: 46384	2008-01-26 01:09:19 +00:00
Chris Lattner	79076fdf2a	Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows us to compile: double test(double X) { return copysign(0.0, X); } into: _test: andpd LCPI1_0(%rip), %xmm0 ret instead of: _test: pxor %xmm1, %xmm1 andpd LCPI1_0(%rip), %xmm1 movapd %xmm0, %xmm2 andpd LCPI1_1(%rip), %xmm2 movapd %xmm1, %xmm0 orpd %xmm2, %xmm0 ret llvm-svn: 46344	2008-01-25 05:46:26 +00:00
Chris Lattner	16a8f126d3	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Chris Lattner	214c11ee6f	take these with a pr # llvm-svn: 46303	2008-01-24 06:35:44 +00:00
Evan Cheng	91089e6d66	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Evan Cheng	d436c2e724	SSE varargs arguments are passed in memory. llvm-svn: 46262	2008-01-22 23:26:53 +00:00
Dale Johannesen	7807e86260	Implement flt_rounds for PowerPC. llvm-svn: 46174	2008-01-18 19:55:37 +00:00
Chris Lattner	49fd213770	remove extraneous &&'s from tests, as Scott is apparently not going to. llvm-svn: 46173	2008-01-18 19:53:43 +00:00
Dale Johannesen	b2d9e41233	Test is correct again for the moment. llvm-svn: 46172	2008-01-18 19:53:31 +00:00
Chris Lattner	febc7ea9bf	Fix a latent bug exposed by my truncstore patch. We compiled stfiwx-2.ll to: _test: fctiwz f0, f1 stfiwx f0, 0, r4 blr instead of: _test: fctiwz f0, f1 stfd f0, -8(r1) nop nop lwz r2, -4(r1) stb r2, 0(r4) blr The former is not correct (stores 4 bytes, not 1). llvm-svn: 46161	2008-01-18 16:54:56 +00:00
Scott Michel	506e61bad1	Forward progress: crtbegin.c now compiles successfully! Fixed CellSPU's A-form (local store) address mode, so that all globals, externals, constant pool and jump table symbols are now wrapped within a SPUISD::AFormAddr pseudo-instruction. This now identifies all local store memory addresses, although it requires a bit of legerdemain during instruction selection to properly select loads to and stores from local store, properly generating "LQA" instructions. Also added mul_ops.ll test harness for exercising integer multiplication. llvm-svn: 46142	2008-01-17 20:38:41 +00:00
Chris Lattner	41717f6989	This commit changes: 1. Legalize now always promotes truncstore of i1 to i8. 2. Remove patterns and gunk related to truncstore i1 from targets. 3. Rename the StoreXAction stuff to TruncStoreAction in TLI. 4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions. 5. Mark a wide variety of invalid truncstores as such in various targets, e.g. X86 currently doesn't support truncstore of any of its integer types. 6. Add legalize support for truncstores with invalid value input types. 7. Add a dag combine transform to turn store(truncate) into truncstore when safe. The later allows us to compile CodeGen/X86/storetrunc-fp.ll to: _foo: fldt 20(%esp) fldt 4(%esp) faddp %st(1) movl 36(%esp), %eax fstps (%eax) ret instead of: _foo: subl $4, %esp fldt 24(%esp) fldt 8(%esp) faddp %st(1) fstps (%esp) movl 40(%esp), %eax movss (%esp), %xmm0 movss %xmm0, (%eax) addl $4, %esp ret llvm-svn: 46140	2008-01-17 19:59:44 +00:00
Chris Lattner	adb8aeaf6a	new testcase. llvm-svn: 46139	2008-01-17 19:47:23 +00:00
Chris Lattner	ee20bcd396	add testcase that has been sitting in my tree for awhile. llvm-svn: 46124	2008-01-17 06:54:09 +00:00
Evan Cheng	8633da0707	When a live virtual register is being clobbered by an implicit def, it is spilled and the spill is its kill. However, if the local allocator has determined the register has not been modified (possible when its value was reloaded), it would not issue a restore. In that case, mark the last use of the virtual register as kill. llvm-svn: 46111	2008-01-17 02:08:17 +00:00
Evan Cheng	5be34d811c	Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. It's not safe to use the two value CombineTo variant to combine away a dead load. e.g. v1, chain2 = load chain1, loc v2, chain3 = load chain2, loc v3 = add v2, c Now we replace use of v1 with undef, use of chain2 with chain1. ReplaceAllUsesWith() will iterate through uses of the first load and update operands: v1, chain2 = load chain1, loc v2, chain3 = load chain1, loc v3 = add v2, c Now the second load is the same as the first load, SelectionDAG cse will ensure the use of second load is replaced with the first load. v1, chain2 = load chain1, loc v3 = add v1, c Then v1 is replaced with undef and bad things happen. llvm-svn: 46099	2008-01-16 23:11:54 +00:00

1 2 3 4 5 ...

661 Commits