llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 22:42:52 +01:00

Author	SHA1	Message	Date
Chris Lattner	79076fdf2a	Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows us to compile: double test(double X) { return copysign(0.0, X); } into: _test: andpd LCPI1_0(%rip), %xmm0 ret instead of: _test: pxor %xmm1, %xmm1 andpd LCPI1_0(%rip), %xmm1 movapd %xmm0, %xmm2 andpd LCPI1_1(%rip), %xmm2 movapd %xmm1, %xmm0 orpd %xmm2, %xmm0 ret llvm-svn: 46344	2008-01-25 05:46:26 +00:00
Chris Lattner	16a8f126d3	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Chris Lattner	214c11ee6f	take these with a pr # llvm-svn: 46303	2008-01-24 06:35:44 +00:00
Evan Cheng	91089e6d66	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Evan Cheng	d436c2e724	SSE varargs arguments are passed in memory. llvm-svn: 46262	2008-01-22 23:26:53 +00:00
Dale Johannesen	7807e86260	Implement flt_rounds for PowerPC. llvm-svn: 46174	2008-01-18 19:55:37 +00:00
Chris Lattner	49fd213770	remove extraneous &&'s from tests, as Scott is apparently not going to. llvm-svn: 46173	2008-01-18 19:53:43 +00:00
Dale Johannesen	b2d9e41233	Test is correct again for the moment. llvm-svn: 46172	2008-01-18 19:53:31 +00:00
Chris Lattner	febc7ea9bf	Fix a latent bug exposed by my truncstore patch. We compiled stfiwx-2.ll to: _test: fctiwz f0, f1 stfiwx f0, 0, r4 blr instead of: _test: fctiwz f0, f1 stfd f0, -8(r1) nop nop lwz r2, -4(r1) stb r2, 0(r4) blr The former is not correct (stores 4 bytes, not 1). llvm-svn: 46161	2008-01-18 16:54:56 +00:00
Scott Michel	506e61bad1	Forward progress: crtbegin.c now compiles successfully! Fixed CellSPU's A-form (local store) address mode, so that all globals, externals, constant pool and jump table symbols are now wrapped within a SPUISD::AFormAddr pseudo-instruction. This now identifies all local store memory addresses, although it requires a bit of legerdemain during instruction selection to properly select loads to and stores from local store, properly generating "LQA" instructions. Also added mul_ops.ll test harness for exercising integer multiplication. llvm-svn: 46142	2008-01-17 20:38:41 +00:00
Chris Lattner	41717f6989	This commit changes: 1. Legalize now always promotes truncstore of i1 to i8. 2. Remove patterns and gunk related to truncstore i1 from targets. 3. Rename the StoreXAction stuff to TruncStoreAction in TLI. 4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions. 5. Mark a wide variety of invalid truncstores as such in various targets, e.g. X86 currently doesn't support truncstore of any of its integer types. 6. Add legalize support for truncstores with invalid value input types. 7. Add a dag combine transform to turn store(truncate) into truncstore when safe. The later allows us to compile CodeGen/X86/storetrunc-fp.ll to: _foo: fldt 20(%esp) fldt 4(%esp) faddp %st(1) movl 36(%esp), %eax fstps (%eax) ret instead of: _foo: subl $4, %esp fldt 24(%esp) fldt 8(%esp) faddp %st(1) fstps (%esp) movl 40(%esp), %eax movss (%esp), %xmm0 movss %xmm0, (%eax) addl $4, %esp ret llvm-svn: 46140	2008-01-17 19:59:44 +00:00
Chris Lattner	adb8aeaf6a	new testcase. llvm-svn: 46139	2008-01-17 19:47:23 +00:00
Chris Lattner	ee20bcd396	add testcase that has been sitting in my tree for awhile. llvm-svn: 46124	2008-01-17 06:54:09 +00:00
Evan Cheng	8633da0707	When a live virtual register is being clobbered by an implicit def, it is spilled and the spill is its kill. However, if the local allocator has determined the register has not been modified (possible when its value was reloaded), it would not issue a restore. In that case, mark the last use of the virtual register as kill. llvm-svn: 46111	2008-01-17 02:08:17 +00:00
Evan Cheng	5be34d811c	Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. It's not safe to use the two value CombineTo variant to combine away a dead load. e.g. v1, chain2 = load chain1, loc v2, chain3 = load chain2, loc v3 = add v2, c Now we replace use of v1 with undef, use of chain2 with chain1. ReplaceAllUsesWith() will iterate through uses of the first load and update operands: v1, chain2 = load chain1, loc v2, chain3 = load chain1, loc v3 = add v2, c Now the second load is the same as the first load, SelectionDAG cse will ensure the use of second load is replaced with the first load. v1, chain2 = load chain1, loc v3 = add v1, c Then v1 is replaced with undef and bad things happen. llvm-svn: 46099	2008-01-16 23:11:54 +00:00
Duncan Sands	78e448d8b4	Trampoline support for x86-64. This looks like it should work, but I have no machine to test it on. Committed because it will at least cause no harm, and maybe someone can test it for me! llvm-svn: 46098	2008-01-16 22:55:25 +00:00
Chris Lattner	3c9f208ca8	add testcase for regression llvm-svn: 46073	2008-01-16 18:03:52 +00:00
Chris Lattner	109f0e56f5	make sure to use a cpu that has sse. llvm-svn: 46060	2008-01-16 06:32:02 +00:00
Chris Lattner	41e1fd13b2	My previous commit had an incomplete message, it should have been: make the 'fp return in ST(0)' optimization smart enough to look through token factor nodes. THis allows us to compile testcases like CodeGen/X86/fp-stack-retcopy.ll into: _carg: subl $12, %esp call L_foo$stub fstpl (%esp) fldl (%esp) addl $12, %esp ret instead of: _carg: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret Still not optimal, but much better and this is a trivial patch. Fixing the rest requires invasive surgery that is is not llvm 2.2 material. llvm-svn: 46054	2008-01-16 05:56:59 +00:00
Chris Lattner	afd4056065	verify x86 generates ud2 for llvm.trap llvm-svn: 46023	2008-01-15 22:22:02 +00:00
Chris Lattner	4d3944c554	new testcase for llvm.trap. llvm-svn: 46020	2008-01-15 22:17:26 +00:00
Scott Michel	5afa19350b	More CellSPU refinements: - struct_2.ll: Completely unaligned load/store testing - call_indirect.ll, struct_1.ll: Add test lines to exercise X-form [$reg($reg)] addressing At this point, loads and stores should be under control (he says in an optimistic tone of voice.) llvm-svn: 45882	2008-01-11 21:01:19 +00:00
Dale Johannesen	8ca78844b0	Disable for now. llvm-svn: 45881	2008-01-11 20:47:33 +00:00
Scott Michel	1e9496e4d4	More CellSPU refinement and progress: - Cleaned up custom load/store logic, common code is now shared [see note below], cleaned up address modes - More test cases: various intrinsics, structure element access (load/store test), updated target data strings, indirect function calls. Note: This patch contains a refactoring of the LoadSDNode and StoreSDNode structures: they now share a common base class, LSBaseSDNode, that provides an interface to their common functionality. There is some hackery to access the proper operand depending on the derived class; otherwise, to do a proper job would require finding and rearranging the SDOperands sent to StoreSDNode's constructor. The current refactor errs on the side of being conservatively and backwardly compatible while providing functionality that reduces redundant code for targets where loads and stores are custom-lowered. llvm-svn: 45851	2008-01-11 02:53:15 +00:00
Duncan Sands	2c89976416	Output sinl for a long double FSIN node, not sin. Likewise fix up a bunch of other libcalls. While there I remove NEG_F32 and NEG_F64 since they are not used anywhere. This fixes 9 Ada ACATS failures. llvm-svn: 45833	2008-01-10 10:28:30 +00:00
Evan Cheng	0747381b13	Codegen improvement has reduced one spill. llvm-svn: 45814	2008-01-10 02:54:40 +00:00
Chris Lattner	cce1483bcf	new testcase for PR1845 llvm-svn: 45795	2008-01-10 00:30:38 +00:00
Evan Cheng	ba0214a6cb	Special copy SUnit's do not have SDNode's. llvm-svn: 45787	2008-01-09 23:01:55 +00:00
Evan Cheng	f91cfb435f	Fix sse2.psrl.w and sse2.psrl.q definitions. llvm-svn: 45772	2008-01-09 02:16:44 +00:00
Chris Lattner	c93ad7d569	Make load->store deletion a bit smarter. This allows us to compile this: void test(long long P) { P ^= 1; } into just: _test: movl 4(%esp), %eax xorl $1, (%eax) ret instead of code like this: _test: movl 4(%esp), %ecx xorl $1, (%ecx) movl 4(%ecx), %edx movl %edx, 4(%ecx) ret llvm-svn: 45762	2008-01-08 23:08:06 +00:00
Duncan Sands	b3b1ae18ab	Crashes llc when using Chris's new legalization logic. llvm-svn: 45758	2008-01-08 21:51:53 +00:00
Chris Lattner	aeab9aefb3	remove darwin/i386 t-t llvm-svn: 45743	2008-01-08 06:52:51 +00:00
Chris Lattner	cafc567fb7	Finally implement correct ordered comparisons for PPC, even though the code generated is not wonderful. This turns a miscompilation into a code quality bug (noted in the ppc readme). This fixes PR642, which is over 2 years old (!). Nate, please review this. llvm-svn: 45742	2008-01-08 06:46:30 +00:00
Nate Begeman	98dba4b0ce	Update test to catch recent x86 insert regression and improvements llvm-svn: 45705	2008-01-07 17:49:23 +00:00
Gordon Henriksen	edbfece273	Setting GlobalDirective in TargetAsmInfo by default rather than providing a misleading facility. It's used once in the MIPS backend and hardcoded as "\t.globl\t" everywhere else. llvm-svn: 45676	2008-01-07 02:31:11 +00:00
Gordon Henriksen	db4f51e1b9	With this patch, the LowerGC transformation becomes the ShadowStackCollector, which additionally has reduced overhead with no sacrifice in portability. Considering a function @fun with 8 loop-local roots, ShadowStackCollector introduces the following overhead (x86): ; shadowstack prologue movl L_llvm_gc_root_chain$non_lazy_ptr, %eax movl (%eax), %ecx movl $___gc_fun, 20(%esp) movl $0, 24(%esp) movl $0, 28(%esp) movl $0, 32(%esp) movl $0, 36(%esp) movl $0, 40(%esp) movl $0, 44(%esp) movl $0, 48(%esp) movl $0, 52(%esp) movl %ecx, 16(%esp) leal 16(%esp), %ecx movl %ecx, (%eax) ; shadowstack loop overhead (none) ; shadowstack epilogue movl 48(%esp), %edx movl %edx, (%ecx) ; shadowstack metadata .align 3 ___gc_fun: # __gc_fun .long 8 .space 4 In comparison to LowerGC: ; lowergc prologue movl L_llvm_gc_root_chain$non_lazy_ptr, %eax movl (%eax), %ecx movl %ecx, 48(%esp) movl $8, 52(%esp) movl $0, 60(%esp) movl $0, 56(%esp) movl $0, 68(%esp) movl $0, 64(%esp) movl $0, 76(%esp) movl $0, 72(%esp) movl $0, 84(%esp) movl $0, 80(%esp) movl $0, 92(%esp) movl $0, 88(%esp) movl $0, 100(%esp) movl $0, 96(%esp) movl $0, 108(%esp) movl $0, 104(%esp) movl $0, 116(%esp) movl $0, 112(%esp) ; lowergc loop overhead leal 44(%esp), %eax movl %eax, 56(%esp) leal 40(%esp), %eax movl %eax, 64(%esp) leal 36(%esp), %eax movl %eax, 72(%esp) leal 32(%esp), %eax movl %eax, 80(%esp) leal 28(%esp), %eax movl %eax, 88(%esp) leal 24(%esp), %eax movl %eax, 96(%esp) leal 20(%esp), %eax movl %eax, 104(%esp) leal 16(%esp), %eax movl %eax, 112(%esp) ; lowergc epilogue movl 48(%esp), %edx movl %edx, (%ecx) ; lowergc metadata (none) llvm-svn: 45670	2008-01-07 01:30:53 +00:00
Chris Lattner	7d567adef9	fix this to use a valid triple. llvm-svn: 45509	2008-01-02 22:21:45 +00:00
Chris Lattner	fbd8cc03c8	verify that aligned common support doesn't break. llvm-svn: 45495	2008-01-02 19:48:24 +00:00
Duncan Sands	8a4882564a	Fix PR1833 - eh.exception and eh.selector return two values, which means doing extra legalization work. It would be easier to get this kind of thing right if there was some documentation... llvm-svn: 45472	2007-12-31 18:35:50 +00:00
Chris Lattner	d55e743cfe	One readme entry is done, one is really easy (Evan, want to investigate eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn may be done (if shufps is better than pinsw, Evan, please review), and we already know about LICM of simple instructions. llvm-svn: 45407	2007-12-29 19:31:47 +00:00
Chris Lattner	ed55329cc9	upgrade this test llvm-svn: 45406	2007-12-29 19:24:06 +00:00
Chris Lattner	cd147e5596	Fold comparisons against a constant nan, and optimize ORD/UNORD comparisons with a constant. This allows us to compile isnan to: _foo: fcmpu cr7, f1, f1 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr instead of: LCPI1_0: ; float .space 4 _foo: lis r2, ha16(LCPI1_0) lfs f0, lo16(LCPI1_0)(r2) fcmpu cr7, f1, f0 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr llvm-svn: 45405	2007-12-29 08:37:08 +00:00
Chris Lattner	b36a4a7a84	this xform is implemented. llvm-svn: 45404	2007-12-29 08:19:39 +00:00
Chris Lattner	f8e408b7b1	Codegen: as: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstps (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret instead of: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstpl (%esi) cvtsd2ss (%esi), %xmm0 movss %xmm0, (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret llvm-svn: 45401	2007-12-29 06:57:38 +00:00
Chris Lattner	e3515220d2	avoid going through a stack slot to convert from fpstack to xmm reg if we are just going to store it back anyway. This improves things like: double foo(); void bar(double P) { P = foo(); } llvm-svn: 45399	2007-12-29 06:41:28 +00:00
Chris Lattner	a432f12b76	one fewer uncond branch with my codegenprepare hack for single-mbb backedges. llvm-svn: 45360	2007-12-26 17:23:47 +00:00
Gordon Henriksen	e8226d70a9	Tests for changes made in r45356, where IPO optimizations would drop collector algorithms. llvm-svn: 45357	2007-12-26 02:47:37 +00:00
Gordon Henriksen	c0a3899bbf	GC poses hazards to the inliner. Consider: define void @f() { ... call i32 @g() ... } define void @g() { ... } The hazards are: - @f and @g have GC, but they differ GC. Inlining is invalid. This may never occur. - @f has no GC, but @g does. g's GC must be propagated to @f. The other scenarios are safe: - @f and @g have the same GC. - @f and @g have no GC. - @g has no GC. This patch adds inliner checks for the former two scenarios. llvm-svn: 45351	2007-12-25 03:10:07 +00:00
Gordon Henriksen	a9f4ed4070	Noting and enforcing that GC intrinsics are valid only within a function with GC. This will catch the error when the inliner inlines a function with GC into a caller with no GC. llvm-svn: 45350	2007-12-25 02:31:26 +00:00
Gordon Henriksen	44841db057	Adjusting verification of "llvm.gc*" intrinsic prototypes to match LangRef. llvm-svn: 45349	2007-12-25 02:02:10 +00:00
Evan Cheng	18c39c03a7	Remove xfail. This is fixed. llvm-svn: 45254	2007-12-20 02:25:21 +00:00
Scott Michel	5cbdbd26a8	More working CellSPU tests: - vec_const.ll: Vector constant loads - immed64.ll: i64, f64 constant loads llvm-svn: 45242	2007-12-20 00:44:13 +00:00
Scott Michel	83ac96e27d	CellSPU testcase, extract_elt.ll: extract vector element. llvm-svn: 45219	2007-12-19 21:17:42 +00:00
Scott Michel	686bbd9b19	More working CellSPU test cases: - call.ll: Function call - ctpop.ll: Count population - dp_farith.ll: DP arithmetic - eqv.ll: Equivalence primitives - fcmp.ll: SP comparisons - fdiv.ll: SP division - fneg-fabs.ll: SP negation, aboslute value - int2fp.ll: Integer -> SP conversion - rotate_ops.ll: Rotation primitives - select_bits.ll: (a & c) \| (b & ~c) bit selection - shift_ops.ll: Shift primitives - sp_farith.ll: SP arithmentic llvm-svn: 45217	2007-12-19 20:50:49 +00:00
Scott Michel	6cb9f6d20c	Two more test cases: or_ops.ll (arithmetic or operations) and vecinsert.ll (vector insertions) llvm-svn: 45216	2007-12-19 20:15:47 +00:00
Scott Michel	d4d96bb6f6	Add new immed16.ll test case, fix CellSPU errata to make test case work. llvm-svn: 45196	2007-12-19 07:35:06 +00:00
Evan Cheng	8824950e8f	Fix PR1872: SrcValue and SrcValueOffset should not be used to compute load / store node id. llvm-svn: 45167	2007-12-18 19:38:14 +00:00
Evan Cheng	36bfae49e3	FIX for PR1799: When a load is unfolded from an instruction, check if it is a new node. If not, do not create a new SUnit. llvm-svn: 45157	2007-12-18 08:42:10 +00:00
Scott Michel	94d7a2f1f2	i32 immediate constant test case for CellSPU llvm-svn: 45134	2007-12-17 23:45:52 +00:00
Scott Michel	4f980e1acd	- Restore some i8 functionality in CellSPU - New test case: nand.ll llvm-svn: 45130	2007-12-17 22:32:34 +00:00
Duncan Sands	3a0d757bd5	Make invokes of inline asm legal. Teach codegen how to lower them (with no attempt made to be efficient, since they should only occur for unoptimized code). llvm-svn: 45108	2007-12-17 18:08:19 +00:00
Evan Cheng	1d95b669b6	Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs. llvm-svn: 45058	2007-12-15 03:00:47 +00:00
Scott Michel	307f334014	Start committing working test cases for CellSPU. llvm-svn: 45050	2007-12-15 00:38:50 +00:00
Evan Cheng	6909ff8c4b	Fix ctlz and cttz. llvm definition requires them to return number of bits in of the src type when value is zero. llvm-svn: 45029	2007-12-14 08:30:15 +00:00
Evan Cheng	51cf86ded0	Implement ctlz and cttz with bsr and bsf. llvm-svn: 45024	2007-12-14 02:13:44 +00:00
Evan Cheng	a152909956	Be extra careful with extension use optimation. Now turned on by default. llvm-svn: 44981	2007-12-13 03:32:53 +00:00
Evan Cheng	343929c773	Fold some and + shift in x86 addressing mode. llvm-svn: 44970	2007-12-13 00:43:27 +00:00
Evan Cheng	64a1febf9a	Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled. llvm-svn: 44960	2007-12-12 23:12:09 +00:00
Dan Gohman	0075ea1f5f	Allow vector integer constants to be created with SelectionDAG::getConstant, in the same way as vector floating-point constants. This allows the legalize expansion code for @llvm.ctpop and friends to be usable with vector types. llvm-svn: 44954	2007-12-12 22:21:26 +00:00
Evan Cheng	ad3e7f3286	Use shuffles to implement insert_vector_elt for i32, i64, f32, and f64. llvm-svn: 44929	2007-12-12 07:55:34 +00:00
Evan Cheng	af6ba4dfd4	Add a test case for -optimize-ext-uses. llvm-svn: 44928	2007-12-12 07:54:08 +00:00
Evan Cheng	d36d69fe92	Lower a build_vector with all constants into a constpool load unless it can be done with a move to low part. llvm-svn: 44921	2007-12-12 06:45:40 +00:00
Evan Cheng	f6c2838f36	- Improved v8i16 shuffle lowering. It now uses pshuflw and pshufhw as much as possible before resorting to pextrw and pinsrw. - Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles. - Improves (i16 extract_vector_element 0) codegen by recognizing (i32 extract_vector_element 0) does not require a pextrw. llvm-svn: 44836	2007-12-11 01:46:18 +00:00
Christopher Lamb	5c577eb543	Improve branch folding by recgonizing that explict successor relationships impact the value of fall-through choices. llvm-svn: 44785	2007-12-10 07:24:06 +00:00
Gordon Henriksen	5d201e0bcc	Adding a collector name attribute to Function in the IR. These methods are new to Function: bool hasCollector() const; const std::string &getCollector() const; void setCollector(const std::string &); void clearCollector(); The assembly representation is as such: define void @f() gc "shadow-stack" { ... The implementation uses an on-the-side table to map Functions to collector names, such that there is no overhead. A StringPool is further used to unique collector names, which are extremely likely to be unique per process. llvm-svn: 44769	2007-12-10 03:18:06 +00:00
Gordon Henriksen	64016be9ea	Upgrading this test to 2.0 .ll syntax. llvm-svn: 44738	2007-12-09 15:03:01 +00:00
Chris Lattner	e93a775a4d	Fix a significant code quality regression I introduced on PPC64 quite a while ago. We now produce: _foo: mflr r0 std r0, 16(r1) ld r2, 16(r1) std r2, 0(r3) ld r0, 16(r1) mtlr r0 blr instead of: _foo: mflr r0 std r0, 16(r1) lis r0, 0 ori r0, r0, 16 ldx r2, r1, r0 std r2, 0(r3) ld r0, 16(r1) mtlr r0 blr for: void foo(void *X) { X = __builtin_return_address(0); } on ppc64. llvm-svn: 44701	2007-12-08 07:04:58 +00:00
Chris Lattner	e16166b78d	implement __builtin_return_addr(0) on ppc. llvm-svn: 44700	2007-12-08 06:59:59 +00:00
Evan Cheng	34c7b35135	Much improved v8i16 shuffles. (Step 1). llvm-svn: 44676	2007-12-07 08:07:39 +00:00
Evan Cheng	8d1f8b2f27	New test case. llvm-svn: 44672	2007-12-07 01:48:46 +00:00
Evan Cheng	cab253ba13	Fix a bogus test case. llvm-svn: 44668	2007-12-06 22:12:45 +00:00
Evan Cheng	d53f72dfb1	Turning simple splitting on. Start testing new coalescer heuristics as new llcbeta. llvm-svn: 44660	2007-12-06 08:54:31 +00:00
Chris Lattner	64a1a9f502	third time around: instead of disabling this completely, only disable it if we don't know it will be obviously profitable. Still fixme, but less so. :) llvm-svn: 44658	2007-12-06 07:47:55 +00:00
Chris Lattner	bb5fb18af8	Actually, disable this code for now. More analysis and improvements to the X86 backend are needed before this should be enabled by default. llvm-svn: 44657	2007-12-06 07:44:31 +00:00
Chris Lattner	c467b49c96	implement a readme entry, compiling the code into: _foo: movl $12, %eax andl 4(%esp), %eax movl _array(%eax), %eax ret instead of: _foo: movl 4(%esp), %eax shrl $2, %eax andl $3, %eax movl _array(,%eax,4), %eax ret As it turns out, this triggers all the time, in a wide variety of situations, for example, I see diffs like this in various programs: - movl 8(%eax), %eax - shll $2, %eax - andl $1020, %eax - movl (%esi,%eax), %eax + movzbl 8(%eax), %eax + movl (%esi,%eax,4), %eax - shll $2, %edx - andl $1020, %edx - movl (%edi,%edx), %edx + andl $255, %edx + movl (%edi,%edx,4), %edx Unfortunately, I also see stuff like this, which can be fixed in the X86 backend: - andl $85, %ebx - addl _bit_count(,%ebx,4), %ebp + shll $2, %ebx + andl $340, %ebx + addl _bit_count(%ebx), %ebp llvm-svn: 44656	2007-12-06 07:33:36 +00:00
Chris Lattner	1de4db0446	fix this when run on non x86 hosts. llvm-svn: 44645	2007-12-06 01:05:52 +00:00
Evan Cheng	1d289d0146	Fix for PR1831: if all defs of an interval are re-materializable, then it's a preferred spill candiate. llvm-svn: 44644	2007-12-06 00:01:56 +00:00
Evan Cheng	79e8b92dc3	Allow some reloads to be folded in multi-use cases. Specifically testl r, r -> cmpl [mem], 0. llvm-svn: 44479	2007-12-01 02:07:52 +00:00
Evan Cheng	90c548af8e	Do not fold reload into an instruction with multiple uses. It issues one extra load. llvm-svn: 44467	2007-11-30 21:23:43 +00:00
Evan Cheng	1aa45b56a3	Update tests. llvm-svn: 44435	2007-11-29 10:03:54 +00:00
Chris Lattner	331852dd02	upgrade this test llvm-svn: 44405	2007-11-28 18:22:12 +00:00
Chris Lattner	f35bff85c5	xfail a test llvm-svn: 44395	2007-11-28 05:37:13 +00:00
Chris Lattner	a9dc7d650b	update this test after the fmrrd fix llvm-svn: 44393	2007-11-28 05:27:07 +00:00
Tanya Lattner	c33660d278	Fix bug in regression tests that ignored stderr output in RUN lines. Updated tests and fixed broken run lines. XFAILed 3 arm regressions (will file bugs) llvm-svn: 44389	2007-11-28 04:57:00 +00:00
Chris Lattner	98fe074d3d	commit testcase I forgot to svn add. llvm-svn: 44383	2007-11-27 22:43:37 +00:00
Chris Lattner	5e0cabc90e	Fix a crash on invalid code due to memcpy lowering. llvm-svn: 44378	2007-11-27 22:14:42 +00:00
Andrew Lenharth	6e449dc482	something wrong with this opt llvm-svn: 44370	2007-11-27 18:31:30 +00:00
Dan Gohman	d12155d8c8	Remove unnecessary && from the RUN lines of this test. llvm-svn: 44342	2007-11-27 00:03:38 +00:00
Dan Gohman	a9f8208852	Don't lower srem/urem X%C to X-X/C*C unless the division is actually optimized. This avoids creating illegal divisions when the combiner is running after legalize; this fixes PR1815. Also, it produces better code in the included testcase by avoiding the subtract and multiply when the division isn't optimized. llvm-svn: 44341	2007-11-26 23:46:11 +00:00
Chris Lattner	be0c5a0500	Fix a long standing deficiency in the X86 backend: we would sometimes emit "zero" and "all one" vectors multiple times, for example: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 pcmpeqd %mm0, %mm0 movq %mm0, _M2 ret instead of: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 movq %mm0, _M2 ret This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type. This ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal. This patch makes the following changes: 1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types. 2) The now-dead patterns are removed from the SSE/MMX .td files. 3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them. 4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which simplifies the code actually. 5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that is legal, instead of generating one that is illegal and expecting a later legalize pass to clean it up. 6) isZeroShuffle is generalized to handle bitcast of zeros. 7) several other minor tweaks. This patch is definite goodness, but has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen. llvm-svn: 44310	2007-11-25 00:24:49 +00:00
Chris Lattner	6304b1e16d	upgrade this test llvm-svn: 44298	2007-11-24 05:39:29 +00:00
Duncan Sands	7a8a7099b1	Fix a bug in which node A is replaced by node B, but later node A gets back into the DAG again because it was hiding in one of the node maps: make sure that node replacement happens in those maps too. llvm-svn: 44263	2007-11-21 16:43:19 +00:00
Chris Lattner	7672c08059	Testcase for PR1811 llvm-svn: 44244	2007-11-19 21:43:22 +00:00
Dan Gohman	0f62120b01	Add support in SplitVectorOp for remainder operators. llvm-svn: 44233	2007-11-19 15:15:03 +00:00
Chris Lattner	bef568f3f8	fix bogus test that the more strict lexer is finding. llvm-svn: 44216	2007-11-18 18:26:45 +00:00
Evan Cheng	121c50d5e3	Typo. llvm-svn: 44196	2007-11-16 23:55:08 +00:00
Dale Johannesen	f2dcb50351	Testcase from PR 1508 (although its's somewhat orthogonal to the main problem there) llvm-svn: 44194	2007-11-16 23:16:35 +00:00
Evan Cheng	c19506f69d	Fix a thinko in post-allocation coalescer. llvm-svn: 44166	2007-11-15 08:13:29 +00:00
Anton Korobeynikov	58298cb9cc	Fix PIC jump table codegen on x86-32/linux. In fact, such thing should be applied to all targets uses GOT-relative offsets for PIC (Alpha?) llvm-svn: 44108	2007-11-14 09:18:41 +00:00
Arnold Schwaighofer	64ad6fa1fa	Update tailcall code to include inline attribute operand for memcpy. llvm-svn: 43978	2007-11-10 10:48:01 +00:00
Evan Cheng	ea1474bdf3	Fix tests. llvm-svn: 43961	2007-11-09 20:46:00 +00:00
Lauro Ramos Venancio	d8f2190c19	[ARM] Implement __builtin_thread_pointer. llvm-svn: 43892	2007-11-08 17:20:05 +00:00
Evan Cheng	d9bab93a44	If both parts of smul_lohi, etc. are used, don't simplify. If only one part is used, try simplify it. llvm-svn: 43888	2007-11-08 09:25:29 +00:00
Evan Cheng	3764ad2bac	Add pseudo dependency to force two-address instruction to be scheduled after other uses. There was a overly restricted check that prevented some obvious cases. llvm-svn: 43762	2007-11-06 08:44:59 +00:00
Dan Gohman	6255ce9f5d	Add support for vector remainder operations. llvm-svn: 43744	2007-11-05 23:35:22 +00:00
Dale Johannesen	1f70f86c7a	Make labels work in asm blocks; allow labels as parameters. Rename ValueRefList to ParamList in AsmParser, since its only use is for parameters. llvm-svn: 43734	2007-11-05 21:20:28 +00:00
Lauro Ramos Venancio	f5081ba980	[ARM] Fix code generation for: static __thread struct { int a; int b; } teste = {0, 0}; llvm-svn: 43722	2007-11-05 18:33:37 +00:00
Evan Cheng	28c61e33a4	Skip over deleted val#'s. llvm-svn: 43700	2007-11-05 06:46:45 +00:00
Evan Cheng	e5eac2c5ac	Handle cases where a register and one of its super-register are both marked as defined on the same instruction. This fixes PR1767. llvm-svn: 43699	2007-11-05 03:11:55 +00:00
Evan Cheng	bc39e175c4	Fix test case. Chris didn't do make check. :-) llvm-svn: 43698	2007-11-05 03:04:26 +00:00
Evan Cheng	947c271e37	Doh. PR1187 -> PR1766. llvm-svn: 43693	2007-11-05 01:00:44 +00:00
Evan Cheng	13d79ab67a	Fix PR1187. llvm-svn: 43692	2007-11-05 00:59:10 +00:00
Chris Lattner	8fac63c8b5	Fix PR1761 by not printing (rip) suffix when in -static mode. Evan, please review this. llvm-svn: 43680	2007-11-04 19:23:28 +00:00
Chris Lattner	67cd357fb8	Fix PR1763 by allowing the 'q' constraint to work with 64-bit regs on x86-64. llvm-svn: 43669	2007-11-04 06:51:12 +00:00
Evan Cheng	1771f6da9c	There are times when the coalescer would not coalesce away a copy but the copy can be eliminated by the allocator is the destination and source targets the same register. The most common case is when the source and destination registers are in different class. For example, on x86 mov32to32_ targets GR32_ which contains a subset of the registers in GR32. The allocator can do 2 things: 1. Set the preferred allocation for the destination of a copy to that of its source. 2. After allocation is done, change the allocation of a copy destination (if legal) so the copy can be eliminated. This eliminates 443 extra moves from 403.gcc. llvm-svn: 43662	2007-11-03 07:20:12 +00:00
Evan Cheng	8d473f667d	Add run line. llvm-svn: 43645	2007-11-02 17:36:58 +00:00
Evan Cheng	65a07e73e2	One more extract_subreg coalescing bug. llvm-svn: 43644	2007-11-02 17:35:08 +00:00
Evan Cheng	b50cc64eb0	Missing a getNumOperands check. llvm-svn: 43630	2007-11-02 01:26:22 +00:00
Dale Johannesen	c125d9b4e8	Test that expand_vector_elt(v2i64) works in 32-bit mode. llvm-svn: 43598	2007-11-01 02:38:24 +00:00
Evan Cheng	5e058e94b5	It's not safe to tell SplitCriticalEdge to merge identical edges. It may delete the phi instruction that's being processed. llvm-svn: 43524	2007-10-30 22:27:26 +00:00
Evan Cheng	633cd3e84d	- Bug fixes. - Allow icmp rewrite using an iv / stride of a smaller integer type. llvm-svn: 43480	2007-10-29 22:07:18 +00:00
Dan Gohman	02b8beff5f	Fix a DAGCombiner abort on a bitcast from a scalar to a vector. llvm-svn: 43470	2007-10-29 20:44:42 +00:00
Evan Cheng	5fe81cf64e	Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) transformation. Previously, it's restricted by ensuring the number of load uses is one. Now the restriction is loosened up by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465	2007-10-29 19:58:20 +00:00
Chris Lattner	1503362624	Add support for the x86-64 'q' regigster modifier, and add support for the b/h/w/k/q inline asm memory modifiers, which are just ignored. This fixes PR1748 and CodeGen/X86/2007-10-28-inlineasm-q-modifier.ll llvm-svn: 43430	2007-10-29 03:09:07 +00:00
Bill Wendling	18b6321020	On second thought. Remove this as it should never be generated in the first place. llvm-svn: 43400	2007-10-26 20:34:37 +00:00
Bill Wendling	8d329ff809	- Remove the hacky code that forces a memcpy. Alignment is taken care of in the FE. - Explicitly pass in the alignment of the load & store. - XFAIL 2007-10-23-UnalignedMemcpy.ll because llc has a bug that crashes on unaligned pointers. llvm-svn: 43398	2007-10-26 20:24:42 +00:00
Evan Cheng	53696b7e9f	Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free. e.g. Turns this loop: LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx movw %dx, %si LBB1_2: # bb movl L_X$non_lazy_ptr, %edi movw %si, (%edi) movl L_Y$non_lazy_ptr, %edi movw %dx, (%edi) addw $4, %dx incw %si incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb into LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx LBB1_2: # bb movl L_X$non_lazy_ptr, %esi movw %cx, (%esi) movl L_Y$non_lazy_ptr, %esi movw %dx, (%esi) addw $4, %dx incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb llvm-svn: 43375	2007-10-26 01:56:11 +00:00
Evan Cheng	66cbf54030	If a loop termination compare instruction is the only use of its stride, and the compaison is against a constant value, try eliminate the stride by moving the compare instruction to another stride and change its constant operand accordingly. e.g. loop: ... v1 = v1 + 3 v2 = v2 + 1 if (v2 < 10) goto loop => loop: ... v1 = v1 + 3 if (v1 < 30) goto loop llvm-svn: 43336	2007-10-25 09:11:16 +00:00
Dale Johannesen	402c11966a	This was failing on Darwin, which defaults to PIC; no lea was generated. I think this follows the intent. llvm-svn: 43312	2007-10-24 20:58:14 +00:00
Bill Wendling	a420d660c8	If there's an unaligned memcpy to/from the stack, don't lower it. Just call the memcpy library function instead. llvm-svn: 43270	2007-10-23 23:32:40 +00:00
Evan Cheng	0590c75f18	Temporary solution: added a different set of BCTRL_Macho / BCTRL_ELF with right callee-saved defs set for ppc64. llvm-svn: 43248	2007-10-23 06:42:42 +00:00
Evan Cheng	252d9ddb4d	Fix memcpy lowering when addresses are 4-byte aligned but size is not multiple of 4. llvm-svn: 43234	2007-10-22 22:11:27 +00:00
Dan Gohman	df1f166e4a	Strength reduction improvements. - Avoid attempting stride-reuse in the case that there are users that aren't addresses. In that case, there will be places where the multiplications won't be folded away, so it's better to try to strength-reduce them. - Several SSE intrinsics have operands that strength-reduction can treat as addresses. The previous item makes this more visible, as any non-address use of an IV can inhibit stride-reuse. - Make ValidStride aware of whether there's likely to be a base register in the address computation. This prevents it from thinking that things like stride 9 are valid on x86 when the base register is already occupied. Also, XFAIL the 2007-08-10-LEA16Use32.ll test; the new logic to avoid stride-reuse elimintes the LEA in the loop, so the test is no longer testing what it was intended to test. llvm-svn: 43231	2007-10-22 20:40:42 +00:00
Dan Gohman	76e104c8ad	Fix the folding of multiplication into addresses on x86, which was broken by the recent {U,S}MUL_LOHI changes. llvm-svn: 43230	2007-10-22 20:22:24 +00:00
Evan Cheng	85eb733eff	Use ptr type in the immediate field of a BxA instruction so we don't end up selecting 32-bit call instruction for ppc64. llvm-svn: 43228	2007-10-22 19:46:19 +00:00
Evan Cheng	2d53c3f15e	New test case. llvm-svn: 43193	2007-10-19 22:05:00 +00:00
Rafael Espindola	2b5b200b9f	Test byval with a 8 bit aligned struct llvm-svn: 43173	2007-10-19 11:29:21 +00:00
Rafael Espindola	d8d4372845	Add support for byval function whose argument is not 32 bit aligned. To do this it is necessary to add a "always inline" argument to the memcpy node. For completeness I have also added this node to memmove and memset. I have also added getMem* functions, because the extra argument makes it cumbersome to use getNode and because I get confused by it :-) llvm-svn: 43172	2007-10-19 10:41:11 +00:00
Bill Wendling	32c9cd9e94	Pointer arithmetic should be done with the index the same size as the pointer. llvm-svn: 43120	2007-10-18 08:32:37 +00:00
Evan Cheng	f6d1c7be14	Really fix PR1734. Carefully track which register uses are sub-register uses by traversing inverse register coalescing map. llvm-svn: 43118	2007-10-18 07:49:59 +00:00
Dan Gohman	2903f7fc26	Add support for ISD::SELECT in SplitVectorOp. llvm-svn: 43072	2007-10-17 14:48:28 +00:00
Evan Cheng	524c0e6c3d	Yet another test case for extract_subreg coalescing crash. llvm-svn: 43063	2007-10-17 02:15:06 +00:00
Evan Cheng	09fa6ed483	Fix PR1734. llvm-svn: 43035	2007-10-16 19:29:47 +00:00
Dale Johannesen	dd254c4efa	New test for svn rev 43033, radar 5538745. llvm-svn: 43034	2007-10-16 18:10:14 +00:00
Chris Lattner	45d9c7aa07	Fix a bug handling frame references in ppc inline asm when the frame offset doesn't fit into 16 bits. llvm-svn: 43032	2007-10-16 18:00:18 +00:00
Evan Cheng	f5bcd3d737	LowerFP_TO_SINT must not create a stack object if it's not needed. llvm-svn: 43004	2007-10-15 20:11:21 +00:00
Dan Gohman	2dc6099def	Reapply the fix in 42908 for this file. This changes the function names from "test" to "foo" so that they don't match the grep -i ST. llvm-svn: 43001	2007-10-15 19:22:17 +00:00
Evan Cheng	43887d3714	Fix PR1729: watch out for val# with no def. llvm-svn: 42996	2007-10-15 18:33:50 +00:00
Tanya Lattner	3a64752342	Fix run line. llvm-svn: 42990	2007-10-15 16:35:13 +00:00
Evan Cheng	850c8739dc	New test case. llvm-svn: 42963	2007-10-14 10:15:03 +00:00
Evan Cheng	33df6a6bed	Revert 42908 for now. llvm-svn: 42960	2007-10-14 05:57:21 +00:00
Chris Lattner	a7666b08ad	new testcase llvm-svn: 42953	2007-10-13 06:56:18 +00:00
Evan Cheng	f5ed18f7d3	Fix test case. llvm-svn: 42949	2007-10-13 03:14:06 +00:00
Evan Cheng	6101e4ffdf	New tests. llvm-svn: 42948	2007-10-13 03:10:54 +00:00
Dan Gohman	c96f2809ca	Fix this test to not depend on the assembly output containing something that includes the string "st". This probably fixes the regression on Darwin. llvm-svn: 42932	2007-10-12 20:42:14 +00:00
Dan Gohman	a75e4a62e6	Change the names used for internal labels to use the current function symbol name instead of a codegen-assigned function number. Thanks Evan! :-) llvm-svn: 42908	2007-10-12 14:53:36 +00:00
Evan Cheng	51791564b0	Doh. llvm-svn: 42901	2007-10-12 09:10:27 +00:00
Evan Cheng	947b4a6c3d	EXTRACT_SUBREG test case. llvm-svn: 42900	2007-10-12 09:03:31 +00:00
Arnold Schwaighofer	f1e49dd41d	Added missing -march=x86 flag. llvm-svn: 42893	2007-10-12 07:49:48 +00:00
Dan Gohman	ab5c3ed0d1	Add intrinsics for sin, cos, and pow. These use llvm_anyfloat_ty, and so may be overloaded with vector types. And add a testcase for codegen for these. llvm-svn: 42885	2007-10-12 00:01:22 +00:00
Dan Gohman	9a70be46f1	Add an explicit target triple to make this test behave as expected on non-Apple hosts. And use the count script instead of wc + grep. llvm-svn: 42878	2007-10-11 23:04:36 +00:00
Arnold Schwaighofer	d47210011e	Added tail call optimization to the x86 back end. It can be enabled by passing -tailcallopt to llc. The optimization is performed if the following conditions are satisfied: * caller/callee are fastcc * elf/pic is disabled OR elf/pic enabled + callee is in module + callee has visibility protected or hidden llvm-svn: 42870	2007-10-11 19:40:01 +00:00
Dan Gohman	708e76e663	These two tests now require only two multiply instructions, instead of four. llvm-svn: 42784	2007-10-09 15:39:37 +00:00
Evan Cheng	25b65542d9	Update test. llvm-svn: 42775	2007-10-08 22:20:32 +00:00
Dan Gohman	9da3bddf43	These two tests now require only three multiply instructions, instead of four. llvm-svn: 42765	2007-10-08 20:48:12 +00:00
Dale Johannesen	b600202c68	Make test work on non-x86 hosts. llvm-svn: 42671	2007-10-06 01:22:39 +00:00
Evan Cheng	0a642eaa62	Test case for 3-address conversion. llvm-svn: 42664	2007-10-05 23:33:09 +00:00
Evan Cheng	dc467c6323	Enable convertToThreeAddress for X86 by default. llvm-svn: 42655	2007-10-05 22:31:10 +00:00
Dale Johannesen	c7b51b678d	First round of ppc long double. call/return and basic arithmetic works. Rename RTLIB long double functions to distinguish different flavors of long double; the lib functions have different names, alas. llvm-svn: 42644	2007-10-05 20:04:43 +00:00
Evan Cheng	6fd2606ff5	New test case. llvm-svn: 42628	2007-10-05 01:44:22 +00:00
Evan Cheng	1d3c836933	-pre-RA-sched=none, simple, simple-noitin are gone. llvm-svn: 42505	2007-10-01 22:17:20 +00:00
Dan Gohman	02f80006f8	Teach SplitVectorOp how to split INSERT_VECTOR_ELT. llvm-svn: 42457	2007-09-28 23:53:40 +00:00
Rafael Espindola	01b306e575	Refactor the memcpy lowering for the x86 target. The only generated code difference is that now we call memcpy when the size of the array is unknown. This matches GCC behavior and is better since the run time value can be arbitrarily large. llvm-svn: 42433	2007-09-28 12:53:01 +00:00
Dale Johannesen	e61886cee4	Add sqrt and powi intrinsics for long double. llvm-svn: 42423	2007-09-28 01:08:20 +00:00
Dale Johannesen	57b6470e8b	Modernize fabs.ll, add long double. Add tests for direct codegen of fsin/fcos. llvm-svn: 42369	2007-09-26 21:12:10 +00:00
Dan Gohman	1bb346f9f1	When both x/y and x%y are needed (x and y both scalar integer), compute both results with a single div or idiv instruction. This uses new X86ISD nodes for DIV and IDIV which are introduced during the legalize phase so that the SelectionDAG's CSE can automatically eliminate redundant computations. llvm-svn: 42308	2007-09-25 18:23:27 +00:00
Dale Johannesen	fe773726a9	Some tests for APFloat conversions. llvm-svn: 42303	2007-09-25 17:50:55 +00:00
Evan Cheng	dab211c87b	Forgot to check in the changes. Fix test case so it doesn't break with any scheduling changes. llvm-svn: 42302	2007-09-25 17:47:38 +00:00
Dan Gohman	dd675a5064	Use the correct result value type instead of using getValueType(0) in ExpandEXTRACT_VECTOR_ELT and SplitVectorOp. This fixes an abort in the included testcase. llvm-svn: 42264	2007-09-24 15:54:53 +00:00
Dan Gohman	39313e0fa3	Fix a typo in a comment. llvm-svn: 42263	2007-09-24 15:50:11 +00:00
Dale Johannesen	a928e8b8b2	Implementation of +sse -sse2 has changed; add -sse to preserve intent of this test. llvm-svn: 42247	2007-09-23 14:58:14 +00:00
Rafael Espindola	11ee0898b9	Don't add a default STACK_ALIGN (use the generic ABI alignment) Implement calls to functions with byval arguments on X86 llvm-svn: 42192	2007-09-21 15:50:22 +00:00
Evan Cheng	66bccf335b	Disable if-conversion for this test. llvm-svn: 42170	2007-09-20 18:06:22 +00:00
Evan Cheng	ae2dbcb25f	-enable-arm-if-conversion is gone. llvm-svn: 42169	2007-09-20 18:03:23 +00:00
Dan Gohman	eb622df2ef	Fix several more entries in the x86 reload/remat folding tables. llvm-svn: 42162	2007-09-20 14:17:21 +00:00
Evan Cheng	28a7839505	Clean up. llvm-svn: 42112	2007-09-18 22:56:31 +00:00
Evan Cheng	2716b97b13	Fix a bogus splat xform: shuffle <undef, undef, x, undef>, <undef, undef, undef, undef>, <2, 2, 2, 2> != <undef, undef, x, undef> llvm-svn: 42111	2007-09-18 21:54:37 +00:00
Bill Wendling	803e0d9970	Objective-C was generating EH frame info like this: "_-[NSString(local) isNullOrNil]".eh = 0 .no_dead_strip "_-[NSString(local) isNullOrNil]".eh The ".eh" should be inside the quotes. llvm-svn: 42074	2007-09-18 01:47:22 +00:00
Gordon Henriksen	fa3a5915b1	Fix for PR1633: Verifier doesn't fully verify GC intrinsics LLVM now enforces the following prototypes for the write barriers: <ty>* @llvm.gcread(<ty2>, <ty>) void @llvm.gcwrite(<ty>, <ty2>, <ty>*) And for @llvm.gcroot, the first stack slot is verified to be an alloca or a bitcast of an alloca. Fixes test/CodeGen/Generic/GC/lower_gcroot.ll, which violated these. llvm-svn: 42051	2007-09-17 20:30:04 +00:00
Dan Gohman	1aeaeec570	Emit integer x<1 as x<=0, as comparisons with zero (now includeing 64-bit) can use test instead of cmp with an immediate. llvm-svn: 42026	2007-09-17 14:49:27 +00:00
Dan Gohman	b9449c9118	Use "test reg,reg" in place of "cmp reg,0" for 64-bit operands. This was previously only done for 32-bit and smaller operands. llvm-svn: 42024	2007-09-17 14:35:24 +00:00
Dan Gohman	27ab14af9b	Add explicit triples to avoid default behavior that varies by host. llvm-svn: 41959	2007-09-14 20:37:18 +00:00
Rafael Espindola	5d8b225881	Add support for functions with byval arguments on x86 llvm-svn: 41953	2007-09-14 15:48:13 +00:00
Evan Cheng	0d738fff6d	Fixed a typo that's causing a missing kill marker. llvm-svn: 41893	2007-09-12 23:02:04 +00:00
Evan Cheng	b9a6798216	Sometimes a MI can define a register as well as defining a super-register at the same time. Do not mark the "smaller" def as dead. llvm-svn: 41871	2007-09-11 22:34:47 +00:00
Chris Lattner	5fd9bec89d	this is not infinite recursion. llvm-svn: 41806	2007-09-10 21:16:23 +00:00
Dale Johannesen	9dfdc452d9	Implement misaligned FP loads and stores. llvm-svn: 41786	2007-09-08 19:29:23 +00:00
Bill Wendling	04a6163921	Add missing index versions of instructions to the map. llvm-svn: 41776	2007-09-07 22:01:02 +00:00
Dan Gohman	3bc1bc2590	Avoid storing and reloading zeros and other constants from stack slots by flagging the associated instructions as being trivially rematerializable. llvm-svn: 41775	2007-09-07 21:32:51 +00:00
Rafael Espindola	8c57e70f93	Add support for having different alignment for objects on call frames. The x86-64 ABI states that objects passed on the stack have 8 byte alignment. Implement that. llvm-svn: 41768	2007-09-07 14:52:14 +00:00
Anton Korobeynikov	899c0c9c8d	Split eh.select / eh.typeid.for intrinsics into i32/i64 versions. This is needed, because they just "mark" register liveins and we let frontend solve type issue, not lowering code :) llvm-svn: 41763	2007-09-07 11:39:35 +00:00
Anton Korobeynikov	0e3789f07a	Proper handle case, when aliasee is external weak symbol referenced only by alias itself. Also, fix a case, when target doesn't have weak symbols supported. llvm-svn: 41746	2007-09-06 17:21:48 +00:00
Evan Cheng	896c1ed385	Fix a bug in X86InstrInfo::convertToThreeAddress that caused it to codegen: leal (,%rcx,8), %rcx It should be leal (,%rcx,8), %ecx llvm-svn: 41735	2007-09-06 00:14:41 +00:00
Dale Johannesen	f9ca7b6094	Change all floating constants that are not exactly representable to use hex format. llvm-svn: 41722	2007-09-05 17:50:36 +00:00
Duncan Sands	2e32997f97	Testcases for PR1628. llvm-svn: 41719	2007-09-05 11:53:04 +00:00
Bill Wendling	13549db795	Add the 64-bit versions of the DS* Altivec instructions. llvm-svn: 41717	2007-09-05 04:05:20 +00:00
Evan Cheng	bb21883dd3	Fix for PR1632. EHSELECTION always produces a i32 value. llvm-svn: 41712	2007-09-04 20:39:26 +00:00
Evan Cheng	02c6081f2d	Fix for PR1613: added 64-bit rotate left PPC instructions and patterns. llvm-svn: 41711	2007-09-04 20:20:29 +00:00
Evan Cheng	9c7cff8e62	Fix a gcroot lowering bug. llvm-svn: 41668	2007-09-01 02:00:51 +00:00
Rafael Espindola	4ddaad4de0	Initial support for calling functions with byval arguments on x86-64 llvm-svn: 41643	2007-08-31 15:06:30 +00:00
Evan Cheng	064691b876	Update test case to reflect Dale's change. llvm-svn: 41639	2007-08-31 06:29:32 +00:00
Tanya Lattner	0b279ff814	Do not run on darwin. llvm-svn: 41608	2007-08-30 16:07:20 +00:00
Evan Cheng	cb317912b2	Added support to fold X86 load / store instructions. This allow rematerialized loads to be folded into their uses. llvm-svn: 41599	2007-08-30 05:54:07 +00:00
Dan Gohman	c6e88b9bc2	Add explicit triples to avoid default behavior that varies by host. llvm-svn: 41510	2007-08-27 20:54:48 +00:00
Duncan Sands	454200b3bf	Remove this test as it is too hard to fix after the latest EH changes, and in any case it is hard to imagine how the original bug could be reintroduced. llvm-svn: 41497	2007-08-27 17:08:14 +00:00
Duncan Sands	50af87bf0b	Now that we don't output cleanups by default, the action offset needs to be adjusted in this test. llvm-svn: 41490	2007-08-27 16:30:05 +00:00
Dan Gohman	2e7e251f24	If the source and destination pointers in an llvm.memmove are known to not alias each other, it can be translated as an llvm.memcpy. llvm-svn: 41489	2007-08-27 16:26:13 +00:00
Rafael Espindola	3d52fe3ef3	call libc memcpy/memset if array size is bigger then threshold. Coping 100MB array (after a warmup) shows that glibc 2.6.1 implementation on x86-64 (core 2) is 30% faster (from 0.270917s to 0.188079s) llvm-svn: 41479	2007-08-27 10:18:20 +00:00
Chris Lattner	093144e147	Allow target constants to be illegal types. The target should know how to handle them. This fixes test/CodeGen/Generic/asm-large-immediate.ll llvm-svn: 41388	2007-08-25 01:00:22 +00:00
Andrew Lenharth	12f7bd64b4	update test to check that codegen works with llvm.used in llvm.metadata section llvm-svn: 41289	2007-08-22 19:36:31 +00:00
Evan Cheng	703bafa177	Test dag xform: Fold C ? 0 : 1 to ~C or zext(~C) or trunc(~C) llvm-svn: 41164	2007-08-18 06:11:57 +00:00
Evan Cheng	97e0a167ec	New test. Make sure dynamic_stackalloc size is rounded up. llvm-svn: 41135	2007-08-16 23:52:23 +00:00
Evan Cheng	625e712911	Update test: dynamic_stackalloc size must be rounded to ensure stack ptr be left in a valid state. llvm-svn: 41134	2007-08-16 23:51:28 +00:00
Rafael Espindola	817adb6532	add byval test llvm-svn: 41123	2007-08-16 13:09:02 +00:00
Lauro Ramos Venancio	9f9e5b3971	Implement FPOWI ExpandOp. Fix PR1287. llvm-svn: 41112	2007-08-15 22:13:27 +00:00
Evan Cheng	46b797bbed	Test case for PR1609. llvm-svn: 41110	2007-08-15 20:30:10 +00:00
Dan Gohman	74e688fce1	This test used "wc \| grep ..."; convert it to use the count script. llvm-svn: 41101	2007-08-15 13:55:47 +00:00
Dan Gohman	34263074cb	Convert tests using "grep -c ... \| grep ..." to use the count script. llvm-svn: 41100	2007-08-15 13:49:33 +00:00
Dan Gohman	10c3892fde	Delete extraneous uses of wc -l. llvm-svn: 41099	2007-08-15 13:45:35 +00:00
Dan Gohman	f259c770df	Convert another test to use the count script. This one didn't fit the regex used to convert all the others because the first '\|' was on a separate line. llvm-svn: 41098	2007-08-15 13:42:36 +00:00
Dan Gohman	794fa1f8f7	Convert tests using "\| wc -l \| grep ..." to use the count script. llvm-svn: 41097	2007-08-15 13:36:28 +00:00
Evan Cheng	92df220df4	New test. llvm-svn: 41087	2007-08-14 23:34:50 +00:00
Evan Cheng	e10e5e71b4	Test case for PR1596. llvm-svn: 41085	2007-08-14 23:21:10 +00:00
Chris Lattner	603e77e54e	tcl seems to hate \|& for some reason. llvm-svn: 41073	2007-08-14 16:19:35 +00:00
Chris Lattner	a76ba56608	switch this to use fastcc to avoid fpstack traffic on x86-32. Switch to using the count script instead of wc -l llvm-svn: 41072	2007-08-14 16:14:10 +00:00
Evan Cheng	5c28086ce6	Update test case. A spill should now be deleted. llvm-svn: 41070	2007-08-14 09:16:00 +00:00
Evan Cheng	fb29461720	Spiller reuse test case. llvm-svn: 41068	2007-08-14 05:51:03 +00:00
Evan Cheng	e79599dc2d	Now capable of rematerializing coalesced live intervals. llvm-svn: 41061	2007-08-13 23:54:16 +00:00
Dan Gohman	2390ff5060	When x86 addresses matching exceeds its recursion limit, check to see if the base register is already occupied before assuming it can be used. This fixes bogus code generation in the accompanying testcase. llvm-svn: 41049	2007-08-13 20:03:06 +00:00
Chris Lattner	7dfec1ee54	Fix PR1607 llvm-svn: 41048	2007-08-13 18:42:37 +00:00

... 3 4 5 6 7 ...

826 Commits