llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Evan Cheng	a377b2bbd1	Fix a x86-64 codegen deficiency. Allow gv + offset when using rip addressing mode. Before: _main: subq $8, %rsp leaq _X(%rip), %rax movsd 8(%rax), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Now: _main: subq $8, %rsp movsd _X+8(%rip), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Notice there is another idiotic codegen issue that needs to be fixed asap: xorl %ecx, %ecx movl %ecx, %eax llvm-svn: 46850	2008-02-07 08:53:49 +00:00
Evan Cheng	6b03a1aeb9	It's PR1925, not PR1609. llvm-svn: 46825	2008-02-06 22:07:17 +00:00
Evan Cheng	2091d9a2e8	Fix a number of local register allocator issues: PR1609. llvm-svn: 46821	2008-02-06 19:16:53 +00:00
Evan Cheng	851d353eb8	Fix PR1975: dag isel emitter produces patterns that isel wrong flag result. llvm-svn: 46776	2008-02-05 22:50:29 +00:00
Evan Cheng	69d5e0fc0f	If a vr is already marked alive in a bb, then it has PHI uses that are visited earlier, then it is not killed in the def block (i.e. not dead). llvm-svn: 46763	2008-02-05 20:04:18 +00:00
Duncan Sands	b65b2462c8	Crashes LegalizeTypes with "Do not know how to expand the result of this operator!" (node: ctlz). llvm-svn: 46713	2008-02-04 18:07:02 +00:00
Duncan Sands	123f86e781	Crashes LegalizeTypes with "Do not know how to split this operator's operand" (node: extract_subvector). llvm-svn: 46712	2008-02-04 18:05:42 +00:00
Chris Lattner	cad0478491	remove target triple to make this test more "generic" llvm-svn: 46711	2008-02-04 18:02:37 +00:00
Duncan Sands	36a938c4fb	Crashed the new type legalizer. Not likely to catch any bugs in the future since to get the crash you also need hacked in fake libcall support (which creates odd but legal trees), but since adding it doesn't hurt... Thanks to Chris for this ultimately reduced version. llvm-svn: 46706	2008-02-04 09:40:27 +00:00
Lauro Ramos Venancio	563e0a3ea3	CBackend: Implement unaligned load/store. llvm-svn: 46646	2008-02-01 21:25:59 +00:00
Chris Lattner	35f063e37c	Add target triples to these so they don't fail on linux. llvm-svn: 46496	2008-01-29 06:26:07 +00:00
Scott Michel	dc780aeb57	Overhaul Cell SPU's addressing mode internals so that there are now only two addressing mode nodes, SPUaform and SPUindirect (vice the three previous ones, SPUaform, SPUdform and SPUxform). This improves code somewhat because we now avoid using reg+reg addressing when it can be avoided. It also simplifies the address selection logic, which was the main point for doing this. Also, for various global variables that would be loaded using SPU's A-form addressing, prefer D-form offs[reg] addressing, keeping the base in a register if the variable is used more than once. llvm-svn: 46483	2008-01-29 02:16:57 +00:00
Chris Lattner	26a8116f49	Update this test. Due to dag combiner improvements, we now compile f7/f11 to: _f7: eor r0, r0, #2, 2 @ -2147483648 bx lr _f11: bic r0, r0, #2, 2 @ -2147483648 bx lr instead of: _f7: fmsr s0, r0 fnegs s0, s0 fmrs r0, s0 bx lr _f11: fmsr s0, r0 fabss s0, s0 fmrs r0, s0 bx lr llvm-svn: 46423	2008-01-27 23:26:37 +00:00
Chris Lattner	2ab1fd3824	Implement some dag combines that allow doing fneg/fabs/fcopysign in integer registers if used by a bitconvert or using a bitconvert. This allows us to avoid constant pool loads and use cheaper integer instructions when the values come from or end up in integer regs anyway. For example, we now compile CodeGen/X86/fp-in-intregs.ll to: _test1: movl $2147483648, %eax xorl 4(%esp), %eax ret _test2: movl $1065353216, %eax orl 4(%esp), %eax andl $3212836864, %eax ret Instead of: _test1: movss 4(%esp), %xmm0 xorps LCPI2_0, %xmm0 movd %xmm0, %eax ret _test2: movss 4(%esp), %xmm0 andps LCPI3_0, %xmm0 movss LCPI3_1, %xmm1 andps LCPI3_2, %xmm1 orps %xmm0, %xmm1 movd %xmm1, %eax ret bitconverts can happen due to various calling conventions that require fp values to passed in integer regs in some cases, e.g. when returning a complex. llvm-svn: 46414	2008-01-27 17:42:27 +00:00
Chris Lattner	e66aea6532	New test to verify that "merging 4 loads into a vec load" continues to work and continues to infer alignment info. llvm-svn: 46403	2008-01-26 20:06:45 +00:00
Chris Lattner	682346a7b0	Infer alignment of loads and increase their alignment when we can tell they are from the stack. This allows us to compile stack-align.ll to: _test: movsd LCPI1_0, %xmm0 movapd %xmm0, %xmm1 * andpd 4(%esp), %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret instead of: _test: movsd LCPI1_0, %xmm0 movsd 4(%esp), %xmm1 ** andpd %xmm0, %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret llvm-svn: 46401	2008-01-26 19:45:50 +00:00
Chris Lattner	f0c3240135	remove a useless xfailed test. llvm-svn: 46400	2008-01-26 19:35:46 +00:00
Bill Wendling	7b83688c73	If there's no instructions being emitted on X86 for a function, emit a nop. Emit the nop directly for PPC. llvm-svn: 46398	2008-01-26 09:03:52 +00:00
Bill Wendling	26fb9335f5	Need to convert to LLVM code and not C. llvm-svn: 46397	2008-01-26 06:56:08 +00:00
Bill Wendling	3e622b88b6	Rename the .c to .ll llvm-svn: 46396	2008-01-26 06:53:40 +00:00
Bill Wendling	7151e8d92c	Move testcase to the code gen directory. llvm-svn: 46395	2008-01-26 06:53:06 +00:00
Chris Lattner	53a98f46fd	Fix some bugs in SimplifyNodeWithTwoResults where it would call deletenode to delete a node even if it was not dead in some cases. Instead, just add it to the worklist. Also, make sure to use the CombineTo methods, as it was doing things that were unsafe: the top level combine loop could touch dangling memory. This fixes CodeGen/Generic/2008-01-25-dag-combine-mul.ll llvm-svn: 46384	2008-01-26 01:09:19 +00:00
Chris Lattner	79076fdf2a	Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows us to compile: double test(double X) { return copysign(0.0, X); } into: _test: andpd LCPI1_0(%rip), %xmm0 ret instead of: _test: pxor %xmm1, %xmm1 andpd LCPI1_0(%rip), %xmm1 movapd %xmm0, %xmm2 andpd LCPI1_1(%rip), %xmm2 movapd %xmm1, %xmm0 orpd %xmm2, %xmm0 ret llvm-svn: 46344	2008-01-25 05:46:26 +00:00
Chris Lattner	16a8f126d3	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Chris Lattner	214c11ee6f	take these with a pr # llvm-svn: 46303	2008-01-24 06:35:44 +00:00
Evan Cheng	91089e6d66	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Evan Cheng	d436c2e724	SSE varargs arguments are passed in memory. llvm-svn: 46262	2008-01-22 23:26:53 +00:00
Dale Johannesen	7807e86260	Implement flt_rounds for PowerPC. llvm-svn: 46174	2008-01-18 19:55:37 +00:00
Chris Lattner	49fd213770	remove extraneous &&'s from tests, as Scott is apparently not going to. llvm-svn: 46173	2008-01-18 19:53:43 +00:00
Dale Johannesen	b2d9e41233	Test is correct again for the moment. llvm-svn: 46172	2008-01-18 19:53:31 +00:00
Chris Lattner	febc7ea9bf	Fix a latent bug exposed by my truncstore patch. We compiled stfiwx-2.ll to: _test: fctiwz f0, f1 stfiwx f0, 0, r4 blr instead of: _test: fctiwz f0, f1 stfd f0, -8(r1) nop nop lwz r2, -4(r1) stb r2, 0(r4) blr The former is not correct (stores 4 bytes, not 1). llvm-svn: 46161	2008-01-18 16:54:56 +00:00
Scott Michel	506e61bad1	Forward progress: crtbegin.c now compiles successfully! Fixed CellSPU's A-form (local store) address mode, so that all globals, externals, constant pool and jump table symbols are now wrapped within a SPUISD::AFormAddr pseudo-instruction. This now identifies all local store memory addresses, although it requires a bit of legerdemain during instruction selection to properly select loads to and stores from local store, properly generating "LQA" instructions. Also added mul_ops.ll test harness for exercising integer multiplication. llvm-svn: 46142	2008-01-17 20:38:41 +00:00
Chris Lattner	41717f6989	This commit changes: 1. Legalize now always promotes truncstore of i1 to i8. 2. Remove patterns and gunk related to truncstore i1 from targets. 3. Rename the StoreXAction stuff to TruncStoreAction in TLI. 4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions. 5. Mark a wide variety of invalid truncstores as such in various targets, e.g. X86 currently doesn't support truncstore of any of its integer types. 6. Add legalize support for truncstores with invalid value input types. 7. Add a dag combine transform to turn store(truncate) into truncstore when safe. The later allows us to compile CodeGen/X86/storetrunc-fp.ll to: _foo: fldt 20(%esp) fldt 4(%esp) faddp %st(1) movl 36(%esp), %eax fstps (%eax) ret instead of: _foo: subl $4, %esp fldt 24(%esp) fldt 8(%esp) faddp %st(1) fstps (%esp) movl 40(%esp), %eax movss (%esp), %xmm0 movss %xmm0, (%eax) addl $4, %esp ret llvm-svn: 46140	2008-01-17 19:59:44 +00:00
Chris Lattner	adb8aeaf6a	new testcase. llvm-svn: 46139	2008-01-17 19:47:23 +00:00
Chris Lattner	ee20bcd396	add testcase that has been sitting in my tree for awhile. llvm-svn: 46124	2008-01-17 06:54:09 +00:00
Evan Cheng	8633da0707	When a live virtual register is being clobbered by an implicit def, it is spilled and the spill is its kill. However, if the local allocator has determined the register has not been modified (possible when its value was reloaded), it would not issue a restore. In that case, mark the last use of the virtual register as kill. llvm-svn: 46111	2008-01-17 02:08:17 +00:00
Evan Cheng	5be34d811c	Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. It's not safe to use the two value CombineTo variant to combine away a dead load. e.g. v1, chain2 = load chain1, loc v2, chain3 = load chain2, loc v3 = add v2, c Now we replace use of v1 with undef, use of chain2 with chain1. ReplaceAllUsesWith() will iterate through uses of the first load and update operands: v1, chain2 = load chain1, loc v2, chain3 = load chain1, loc v3 = add v2, c Now the second load is the same as the first load, SelectionDAG cse will ensure the use of second load is replaced with the first load. v1, chain2 = load chain1, loc v3 = add v1, c Then v1 is replaced with undef and bad things happen. llvm-svn: 46099	2008-01-16 23:11:54 +00:00
Duncan Sands	78e448d8b4	Trampoline support for x86-64. This looks like it should work, but I have no machine to test it on. Committed because it will at least cause no harm, and maybe someone can test it for me! llvm-svn: 46098	2008-01-16 22:55:25 +00:00
Chris Lattner	3c9f208ca8	add testcase for regression llvm-svn: 46073	2008-01-16 18:03:52 +00:00
Chris Lattner	109f0e56f5	make sure to use a cpu that has sse. llvm-svn: 46060	2008-01-16 06:32:02 +00:00
Chris Lattner	41e1fd13b2	My previous commit had an incomplete message, it should have been: make the 'fp return in ST(0)' optimization smart enough to look through token factor nodes. THis allows us to compile testcases like CodeGen/X86/fp-stack-retcopy.ll into: _carg: subl $12, %esp call L_foo$stub fstpl (%esp) fldl (%esp) addl $12, %esp ret instead of: _carg: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret Still not optimal, but much better and this is a trivial patch. Fixing the rest requires invasive surgery that is is not llvm 2.2 material. llvm-svn: 46054	2008-01-16 05:56:59 +00:00
Chris Lattner	afd4056065	verify x86 generates ud2 for llvm.trap llvm-svn: 46023	2008-01-15 22:22:02 +00:00
Chris Lattner	4d3944c554	new testcase for llvm.trap. llvm-svn: 46020	2008-01-15 22:17:26 +00:00
Scott Michel	5afa19350b	More CellSPU refinements: - struct_2.ll: Completely unaligned load/store testing - call_indirect.ll, struct_1.ll: Add test lines to exercise X-form [$reg($reg)] addressing At this point, loads and stores should be under control (he says in an optimistic tone of voice.) llvm-svn: 45882	2008-01-11 21:01:19 +00:00
Dale Johannesen	8ca78844b0	Disable for now. llvm-svn: 45881	2008-01-11 20:47:33 +00:00
Scott Michel	1e9496e4d4	More CellSPU refinement and progress: - Cleaned up custom load/store logic, common code is now shared [see note below], cleaned up address modes - More test cases: various intrinsics, structure element access (load/store test), updated target data strings, indirect function calls. Note: This patch contains a refactoring of the LoadSDNode and StoreSDNode structures: they now share a common base class, LSBaseSDNode, that provides an interface to their common functionality. There is some hackery to access the proper operand depending on the derived class; otherwise, to do a proper job would require finding and rearranging the SDOperands sent to StoreSDNode's constructor. The current refactor errs on the side of being conservatively and backwardly compatible while providing functionality that reduces redundant code for targets where loads and stores are custom-lowered. llvm-svn: 45851	2008-01-11 02:53:15 +00:00
Duncan Sands	2c89976416	Output sinl for a long double FSIN node, not sin. Likewise fix up a bunch of other libcalls. While there I remove NEG_F32 and NEG_F64 since they are not used anywhere. This fixes 9 Ada ACATS failures. llvm-svn: 45833	2008-01-10 10:28:30 +00:00
Evan Cheng	0747381b13	Codegen improvement has reduced one spill. llvm-svn: 45814	2008-01-10 02:54:40 +00:00
Chris Lattner	cce1483bcf	new testcase for PR1845 llvm-svn: 45795	2008-01-10 00:30:38 +00:00
Evan Cheng	ba0214a6cb	Special copy SUnit's do not have SDNode's. llvm-svn: 45787	2008-01-09 23:01:55 +00:00

1 2 3 4 5 ...

648 Commits