llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 06:22:51 +01:00

Author	SHA1	Message	Date
Anton Korobeynikov	a9ff11ea5f	Use target triple in tests, not 'realign-stack=0' option. Per request. llvm-svn: 50778	2008-05-06 23:09:29 +00:00
Evan Cheng	a84ed06284	Fix PR2287. Darwin passes mmx values in register in 64-mode, not Linux. llvm-svn: 50716	2008-05-06 07:23:50 +00:00
Mon P Wang	84a269e023	Added addition atomic instrinsics and, or, xor, min, and max. llvm-svn: 50663	2008-05-05 19:05:59 +00:00
Chris Lattner	ca94848f66	no need for eh info llvm-svn: 50658	2008-05-05 18:24:33 +00:00
Dan Gohman	c860d9c77c	Add AsmPrinter support for emitting a directive to declare that the code being generated does not require an executable stack. Also, add target-specific code to make use of this on Linux on x86. llvm-svn: 50634	2008-05-05 00:28:39 +00:00
Evan Cheng	a7747df955	Select vector shift with non-immediate i32 shift amount operand by first moving the operand into the right register. llvm-svn: 50619	2008-05-04 09:15:50 +00:00
Evan Cheng	c1c2adbfc6	Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allow us to simplify the horribly complicated matching code. llvm-svn: 50601	2008-05-03 00:52:09 +00:00
Chris Lattner	8cc3e89b87	specify an arch for non-x86 hosts. llvm-svn: 50576	2008-05-02 15:11:58 +00:00
Chris Lattner	e9bbe8e6b6	don't randomly miscompile seto/setuo just because we are in ffastmath mode. This fixes rdar://5902801, a miscompilation of gcc.dg/builtins-8.c. Bill, please pull this into Tak. llvm-svn: 50523	2008-05-01 07:26:11 +00:00
Arnold Schwaighofer	29a654857a	Really commit the test checking the argument lowering behaviour on x86-64 :). llvm-svn: 50478	2008-04-30 09:19:47 +00:00
Chris Lattner	0f63b8fecc	make the vector conversion magic handle multiple results. We now compile test2/test3 to: _test2: ## InlineAsm Start set %xmm0, %xmm1 ## InlineAsm End addps %xmm1, %xmm0 ret _test3: ## InlineAsm Start set %xmm0, %xmm1 ## InlineAsm End paddd %xmm1, %xmm0 ret as expected. llvm-svn: 50389	2008-04-29 04:48:56 +00:00
Chris Lattner	e75d09711d	add support for multiple return values in inline asm. This is a step towards PR2094. It now compiles the attached .ll file to: _sad16_sse2: movslq %ecx, %rax ## InlineAsm Start %ecx %rdx %rax %rax %r8d %rdx %rsi ## InlineAsm End ## InlineAsm Start set %eax ## InlineAsm End ret which is pretty decent for a 3 output, 4 input asm. llvm-svn: 50386	2008-04-29 04:29:54 +00:00
Evan Cheng	381b094a2b	Another extract_subreg coalescing bug. e.g. vr1024<2> extract_subreg vr1025, 2 If vr1024 do not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64. llvm-svn: 50385	2008-04-29 01:41:44 +00:00
Evan Cheng	8696eb18c1	Add -march=x86. llvm-svn: 50380	2008-04-28 23:31:41 +00:00
Evan Cheng	3167c716f2	Test case. llvm-svn: 50377	2008-04-28 22:14:34 +00:00
Chris Lattner	39a4281deb	Implement a signficant optimization for inline asm: When choosing between constraints with multiple options, like "ir", test to see if we can use the 'i' constraint and go with that if possible. This produces more optimal ASM in all cases (sparing a register and an instruction to load it), and fixes inline asm like this: void test () { asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14)); } Previously we would dump "42" into a memory location (which is ok for the 'm' constraint) which would cause a problem because the 'c' modifier is not valid on memory operands. Isn't it great how inline asm turns 'missed optimization' into 'compile failed'?? Incidentally, this was the todo in PowerPC/2007-04-24-InlineAsm-I-Modifier.ll Please do NOT pull this into Tak. llvm-svn: 50315	2008-04-27 00:37:18 +00:00
Nate Begeman	1723f6af2b	Feedback from chris llvm-svn: 50305	2008-04-25 21:47:35 +00:00
Nate Begeman	acd7e1c464	Add a testcase for the recent "handle variable vector insert elt in mem" patch llvm-svn: 50303	2008-04-25 21:26:59 +00:00
Evan Cheng	db1497fa77	Update tests. llvm-svn: 50293	2008-04-25 20:13:47 +00:00
Evan Cheng	11f101a800	Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers. llvm-svn: 50289	2008-04-25 19:11:04 +00:00
Evan Cheng	39ae78cadb	MMX argument passing fixes: On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2]. On Darwin / Linux x86-32, v1i64 values are passed in memory. On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7]. On Darwin x86-64, v1i64 values are passed in 64-bit GPRs. llvm-svn: 50257	2008-04-25 07:56:45 +00:00
Chris Lattner	8c9f6c929a	Loosen up an assertion to allow intrinsics. I really have no idea what this code (findNonImmUse) does, so I'm only guessing that this is the right thing. It would be really really nice if this had comments and perhaps switched to SmallPtrSet (hint hint) :) This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c llvm-svn: 50252	2008-04-25 05:13:01 +00:00
Evan Cheng	484060ba4a	Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero. llvm-svn: 50239	2008-04-25 00:26:43 +00:00
Anton Korobeynikov	244a615291	Disable stack realignment for these tests llvm-svn: 50172	2008-04-23 18:25:44 +00:00
Anton Korobeynikov	1898ce20e2	Fix test becase ABI stack alignment dropped to 'normal' value llvm-svn: 50171	2008-04-23 18:25:16 +00:00
Anton Korobeynikov	15c5a2ce26	Fix test, instruction count is valid only if stack is not realigned llvm-svn: 50170	2008-04-23 18:24:48 +00:00
Dan Gohman	93b5be1824	Implement an x86-64 ABI detail of passing structs by hidden first argument. The x86-64 ABI requires the incoming value of %rdi to be copied to %rax on exit from a function that is returning a large C struct. Also, add a README-X86-64 entry detailing the missed optimization opportunity and proposing an alternative approach. llvm-svn: 50075	2008-04-21 23:59:07 +00:00
Chris Lattner	2c5b96fbee	A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2. llvm-svn: 49986	2008-04-20 05:52:46 +00:00
Chris Lattner	8503e2d236	Not all x86-64 machines have sse3 apparently. llvm-svn: 49985	2008-04-20 05:47:56 +00:00
Chris Lattner	c310b1f1f3	rename .llx -> .ll llvm-svn: 49970	2008-04-19 22:29:10 +00:00
Evan Cheng	073659986f	Be more careful with insert_subreg and extract_subreg where either source or destination operand has already been coalesced with another register that's defined by a insert_subreg or extract_subreg. llvm-svn: 49843	2008-04-17 07:58:04 +00:00
Evan Cheng	7c2c3333ca	Fix a sub-register indice propagation bug. llvm-svn: 49832	2008-04-17 00:06:42 +00:00
Evan Cheng	e2e899b5c2	Don't forget about sub-register indices when rematting instructions. llvm-svn: 49830	2008-04-16 23:44:44 +00:00
Evan Cheng	4b16ea6247	Really test what's intended. llvm-svn: 49802	2008-04-16 18:21:55 +00:00
Evan Cheng	6d05ce493b	Rewrite LiveVariable liveness computation. The new implementation is much simplified. It eliminated the nasty recursive routines and removed the partial def / use bookkeeping. There is also potential for performance improvement by replacing the conservative handling of partial physical register definitions. The code is currently disabled until live interval analysis is taught of the name scheme. This patch also fixed a couple of nasty corner cases. llvm-svn: 49784	2008-04-16 09:46:40 +00:00
Dan Gohman	be8f2b452b	Add support for the form of the SSE41 extractps instruction that puts its result in a 32-bit GPR. llvm-svn: 49762	2008-04-16 02:32:24 +00:00
Dan Gohman	cf79877623	Recreate the size SDNode instead of reusing the old one in the x86 memcpy lowering code; this ensures that the size node has the desired result type. This fixes a regression from r49572 with @llvm.memcpy.i64 on x86-32. llvm-svn: 49761	2008-04-16 01:32:32 +00:00
Dan Gohman	7d27552962	Add movd instructions to move from MMX registers to 64-bit GPR registers on x86-64. llvm-svn: 49757	2008-04-15 23:55:07 +00:00
Dan Gohman	3b99b3c807	Treat EntryToken nodes as "passive" so that they aren't added to the ScheduleDAG; they don't correspond to any actual instructions so they don't need to be scheduled. This fixes a bug where the EntryToken was being scheduled multiple times in some cases, though it ended up not causing any trouble because EntryToken doesn't expand into anything. With this fixed the schedulers reliably schedule the expected number of units, so we can check this with an assertion. This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it ends up getting scheduled differently in a trivial way, though it was enough to fool the prcontext+grep that the test does. llvm-svn: 49701	2008-04-15 01:22:18 +00:00
Dale Johannesen	d9a9c746d8	Remove -unwind-tables-optional everywhere, since this is now the default. llvm-svn: 49667	2008-04-14 17:56:54 +00:00
Arnold Schwaighofer	82af0e6a43	This patch corrects the handling of byval arguments for tailcall optimized x86-64 (and x86) calls so that they work (... at least for my test cases). Should fix the following problems: Problem 1: When i introduced the optimized handling of arguments for tail called functions (using a sequence of copyto/copyfrom virtual registers instead of always lowering to top of the stack) i did not handle byval arguments correctly e.g they did not work at all :). Problem 2: On x86-64 after the arguments of the tail called function are moved to their registers (which include ESI/RSI etc), tail call optimization performs byval lowering which causes xSI,xDI, xCX registers to be overwritten. This is handled in this patch by moving the arguments to virtual registers first and after the byval lowering the arguments are moved from those virtual registers back to RSI/RDI/RCX. llvm-svn: 49584	2008-04-12 18:11:06 +00:00
Dan Gohman	15edbf989f	Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal on any current target and aren't optimized in DAGCombiner. Instead of using intermediate nodes, expand the operations, choosing between simple loads/stores, target-specific code, and library calls, immediately. Previously, the code to emit optimized code for these operations was only used at initial SelectionDAG construction time; now it is used at all times. This fixes some cases where rep;movs was being used for small copies where simple loads/stores would be better. This also cleans up code that checks for alignments less than 4; let the targets make that decision instead of doing it in target-independent code. This allows x86 to use rep;movs in low-alignment cases. Also, this fixes a bug that resulted in the use of rep;stos for memsets of 0 with non-constant memory size when the alignment was at least 4. It's better to use the library in this case, which can be significantly faster when the size is large. This also preserves more SourceValue information when memory intrinsics are lowered into simple loads/stores. llvm-svn: 49572	2008-04-12 04:36:06 +00:00
Dan Gohman	41f9d24d52	Fix a bug that prevented x86-64 from using rep.movsq for 8-byte-aligned data. llvm-svn: 49571	2008-04-12 02:35:39 +00:00
Chris Lattner	3b289289a7	Fix the x86-64 side of PR2108 by adding a v2f64 version of MOVZQI2PQIrr. This would be better handled as a dag combine (with the goal of eliminating the bitconvert) but I don't know how to do that safely. Thoughts welcome. llvm-svn: 49463	2008-04-10 05:13:43 +00:00
Evan Cheng	1803e20a62	Teach branch folding pass about implicit_def instructions. Unfortunately we can't just eliminate them since register scavenger expects every register use to be defined. However, we can delete them when there are no intra-block uses. Carefully removing some implicit def's which enable more blocks to be optimized away. llvm-svn: 49461	2008-04-10 02:32:10 +00:00
Evan Cheng	def576f9e6	- More aggressively coalescing away copies whose source is defined by an implicit_def. - Added insert_subreg coalescing support. llvm-svn: 49448	2008-04-09 20:57:25 +00:00
Evan Cheng	f35cc57821	Missed a hasInterval check. llvm-svn: 49415	2008-04-09 01:30:15 +00:00
Dale Johannesen	5ac0a0ed21	Rename -disable-required-unwind-tables to -unwind-tables-optional. llvm-svn: 49391	2008-04-08 18:10:08 +00:00
Dale Johannesen	3f992b224e	Add -disable-required-unwind-tables to tests that need it (usually, grepping for some string found in unwind info) llvm-svn: 49364	2008-04-08 00:14:17 +00:00
Evan Cheng	6c58f2397d	Fix test. llvm-svn: 49343	2008-04-07 17:02:18 +00:00
Chris Lattner	f88214caca	fix this testcase to pass and remove a duplicate instance of itself. llvm-svn: 49281	2008-04-06 21:39:17 +00:00
Torok Edwin	34e6889671	Prefer to expand mask for xor to -1, so we have a chance to turn it into a not. If it cannot be expanded, it will keep the old behaviour and try to shrink the constant. Part of enhancement for PR2191. llvm-svn: 49280	2008-04-06 21:23:02 +00:00
Evan Cheng	4d7b2ab16f	Favors pshufd over shufps when shuffling elements from one vector. pshufd is faster than shufps. llvm-svn: 49244	2008-04-05 00:30:36 +00:00
Evan Cheng	f045d86660	New test case. llvm-svn: 49190	2008-04-03 21:25:03 +00:00
Dale Johannesen	ebfa6edc65	Testcase for EH with functions whose names are stripped. llvm-svn: 49111	2008-04-02 20:16:41 +00:00
Dan Gohman	168b2b1300	Speculatively micro-optimize memory-zeroing calls on Darwin 10. llvm-svn: 49048	2008-04-01 20:38:36 +00:00
Dale Johannesen	d9a5b77269	Mark functions in some tests as 'nounwind'. Generating EH info for these functions causes the tests to fail for random reasons (e.g. looking for 'or' or counting lines with asm-printer; labels count as lines.) llvm-svn: 49003	2008-03-31 23:20:09 +00:00
Evan Cheng	a3ce7b4c76	It's not safe to fold a load from GV stub or constantpool into a two-address use. llvm-svn: 49002	2008-03-31 23:19:51 +00:00
Dan Gohman	f223eaafcd	Fix a DAGCombiner optimization to respect volatile qualification. llvm-svn: 48994	2008-03-31 20:32:52 +00:00
Dan Gohman	227e702cae	Fix a tokenfactor node to use the load chain rather than the load value. This fixes PR2177. llvm-svn: 48932	2008-03-28 23:45:16 +00:00
Evan Cheng	6cbce6b602	Fix a memory bug: increment an iterator of a deleted machine instr. llvm-svn: 48853	2008-03-27 01:27:25 +00:00
Evan Cheng	8d222d6221	Avoid commuting a def MI in order to coalesce a copy instruction away if any use of the same val# is a copy instruction that has already been coalesced. llvm-svn: 48833	2008-03-26 19:03:01 +00:00
Dale Johannesen	8c1e95810f	Use ## for comment delimiter on darwin x86-32, so llvm's output .s files will go through gcc -std=c99 without triggering preprocesser errors. Approach suggested by Daveed Vandevoorde. llvm-svn: 48808	2008-03-25 23:29:30 +00:00
Evan Cheng	8cb64d8e8b	Handle a special case xor undef, undef -> 0. Technically this should be transformed to undef. But this is such a common idiom (misuse) we are going to handle it. llvm-svn: 48792	2008-03-25 20:08:07 +00:00
Dan Gohman	58ad056286	Add CMP32mr and friends to the load-unfolding table. Among other things, this allows the scheduler to unfold a load operand in the 2008-01-08-SchedulerCrash.ll testcase, so it now successfully clones the comparison to avoid a pushf+popf. llvm-svn: 48777	2008-03-25 16:53:19 +00:00
Tanya Lattner	b6a27ed83f	Byebye llvm-upgrade! llvm-svn: 48762	2008-03-25 04:26:08 +00:00
Evan Cheng	dbdf48276a	- SSE4.1 extractfps extracts a f32 into a gr32 register. Very useful! Not. Fix the instruction specification and teaches lowering code to use it only when the only use is a store instruction. llvm-svn: 48746	2008-03-24 21:52:23 +00:00
Dan Gohman	b9c5e6258f	APIntify SelectionDAG's EXTRACT_ELEMENT code. llvm-svn: 48726	2008-03-24 16:38:05 +00:00
Evan Cheng	874aee2eec	Teach DAG combiner to commute commutable binary nodes in order to achieve sdisel CSE. llvm-svn: 48673	2008-03-22 01:55:50 +00:00
Dan Gohman	59aeac6320	Handle getresult instructions in different basic blocks from their aggregate operands by moving the getresult instructions. llvm-svn: 48657	2008-03-21 21:01:32 +00:00
Chris Lattner	8a4fa95cae	Add support for calls that return two FP values in ST(0)/ST(1). llvm-svn: 48634	2008-03-21 06:38:26 +00:00
Chris Lattner	933d0d318b	disable a bogus assertion. llvm-svn: 48633	2008-03-21 06:01:05 +00:00
Chris Lattner	260473f983	Enable support for returning two long-double values in ST(0)/ST(1). This allows us to compile fp-stack-2results.ll into: _test: fldz fld1 ret which returns 1 in ST(0) and 0 in ST(1). This is needed for x86-64 _Complex long double. llvm-svn: 48632	2008-03-21 05:57:20 +00:00
Evan Cheng	4ae9fee64c	Undo 48570. Correctly match mmx shift instructions with an immediate operand. llvm-svn: 48627	2008-03-21 00:40:09 +00:00
Evan Cheng	8ecb189245	Fix this xform: (sra (shl X, m), result_size) -> (sign_extend (trunc (shl X, result_size - n - m))) llvm-svn: 48578	2008-03-20 02:18:41 +00:00
Evan Cheng	6f729b2820	Add intrinsics to match mmx shift builtin's with immediate operand. llvm-svn: 48569	2008-03-19 23:38:52 +00:00
Christopher Lamb	958b0494c3	Fix X86's isTruncateFree to not claim that truncate to i1 is free. This fixes Bill's testcase that failed for r48491. llvm-svn: 48542	2008-03-19 08:30:06 +00:00
Evan Cheng	e9aa507edc	Fixed a coalescer bug caused by a typo. llvm-svn: 48526	2008-03-19 02:26:36 +00:00
Evan Cheng	3d9309c11d	Fix live variables issues: 1. If part of a register is re-defined, an implicit kill and an implicit def are added to denote read / mod / write. However, this should only be necessary if the register is actually read later. This is a performance issue. 2. If a sub-register is being defined, and it doesn't have a previous use, do not add a implicit kill to the last use of a super-register: = EAX, AX<imp-use,kill> ... AX = In this case, EAX is live but AX is killed, this is wrong and will cause the coalescer to do bad things. llvm-svn: 48521	2008-03-19 00:52:20 +00:00
Evan Cheng	5ac87b837e	Fix a x86-64 isel lowering bug that's been around forever. A x86-64 varargs function implicitly reads X86::AL, don't clobber it! llvm-svn: 48515	2008-03-18 23:36:35 +00:00
Bill Wendling	c8f3fc7c3d	It might be nice to have this run as x86 on non-x86 platforms... llvm-svn: 48511	2008-03-18 22:38:22 +00:00
Bill Wendling	5ea2aec3ac	Temporarily revert r48491. It's breaking test/CodeGen/X86/xorl.ll. llvm-svn: 48510	2008-03-18 22:29:51 +00:00
Christopher Lamb	1d70509b55	Target independent DAG transform to use truncate for field extraction + sign extend on targets where this is profitable. Passes nightly on x86-64. llvm-svn: 48491	2008-03-18 16:46:39 +00:00
Chris Lattner	bb335409c2	ensure we continue matching x86-64 rotates. llvm-svn: 48437	2008-03-17 01:35:03 +00:00
Evan Cheng	3612a7ed30	Fix PR2138. Apparently any modification to a std::multimap (including remove entries for a different key) can invalidate multimap iterators. llvm-svn: 48371	2008-03-14 20:44:01 +00:00
Evan Cheng	53c3dd0267	New test case. llvm-svn: 48338	2008-03-13 08:05:02 +00:00
Evan Cheng	a76bb6e64e	A test case I forgot to check in. llvm-svn: 48335	2008-03-13 06:42:46 +00:00
Evan Cheng	0b8b1647dd	TwoAddressInstructionPass enhancement. After it converts a two address instruction into a 3-address one, sink it past the instruction that kills the read-mod-write register if its definition is used past the kill. This reduces the number of live register by one. llvm-svn: 48333	2008-03-13 06:37:55 +00:00
Evan Cheng	620fd19798	Experimental scheduler change to schedule / coalesce the copies added for function livein's. Take 2008-03-10-RegAllocInfLoop.ll, the schedule looks like this after these copies are inserted: entry: 0x12049d0, LLVM BB @0x1201fd0, ID#0: Live Ins: %EAX %EDX %ECX %reg1031<def> = MOVPC32r 0 %reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def> %reg1028<def> = MOV32rr %EAX %reg1029<def> = MOV32rr %EDX %reg1030<def> = MOV32rr %ECX %reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x1201910 + 0] %reg1025<def> = MOV32rr %reg1029 %reg1026<def> = MOV32rr %reg1030 %reg1024<def> = MOV32rr %reg1028 The copies unnecessarily increase register pressure and it will end up requiring a physical register to be spilled. With -schedule-livein-copies: entry: 0x12049d0, LLVM BB @0x1201fa0, ID#0: Live Ins: %EAX %EDX %ECX %reg1031<def> = MOVPC32r 0 %reg1032<def> = ADD32ri %reg1031, <es:_GLOBAL_OFFSET_TABLE_>, %EFLAGS<imp-def> %reg1024<def> = MOV32rr %EAX %reg1025<def> = MOV32rr %EDX %reg1026<def> = MOV32rr %ECX %reg1027<def> = MOV8rm %reg0, 1, %reg0, 0, Mem:LD(1,1) [0x12018e0 + 0] Much better! llvm-svn: 48307	2008-03-12 22:19:41 +00:00
Dan Gohman	2ec41788ab	Fix this test on hosts that don't have sse2. llvm-svn: 48296	2008-03-12 20:40:51 +00:00
Dan Gohman	2a124430e9	Make this test x86-specific for now; targets that don't use the automated CallingConv code to handle return values typically don't support multiple return values. llvm-svn: 48265	2008-03-12 00:25:14 +00:00
Anton Korobeynikov	d2fd135594	Testcase for PR2137 llvm-svn: 48258	2008-03-11 22:43:42 +00:00
Anton Korobeynikov	efa9405b94	Update testcase for recent aliases change llvm-svn: 48250	2008-03-11 21:42:20 +00:00
Dan Gohman	f9f25bd41b	Add a test to ensure that all-ones vectors are materialized with pcmpeqd. llvm-svn: 48247	2008-03-11 21:37:00 +00:00
Dan Gohman	1fece90de9	Use the correct value for InSignBit. llvm-svn: 48245	2008-03-11 21:29:43 +00:00
Chris Lattner	fd2c24af72	Implement basic support for the 'f' register class constraint. This basically works, but probably won't if you mix it with 't' or 'u' yet. llvm-svn: 48243	2008-03-11 19:50:13 +00:00
Evan Cheng	af1c76846d	When the register allocator runs out of registers, spill a physical register around the def's and use's of the interval being allocated to make it possible for the interval to target a register and spill it right away and restore a register for uses. This likely generates terrible code but is before than aborting. llvm-svn: 48218	2008-03-11 07:19:34 +00:00
Chris Lattner	f0684bfd16	Don't emit FP_REG_KILL into a block that just returns. Nothing can be live out of the block anyway, so it isn't needed. llvm-svn: 48192	2008-03-10 23:34:12 +00:00
Dan Gohman	47137eba06	Fix mul expansion to check the correct number of bits for zero extension when checking if an unsigned multiply is safe. llvm-svn: 48171	2008-03-10 20:42:19 +00:00
Dale Johannesen	62a0b6a79b	These tests don't work unless SSE2 is active. Judging from the checking comments this is intentional, so add the flag (makes them pass on non-x86 host). llvm-svn: 48157	2008-03-10 17:33:57 +00:00
Dale Johannesen	c9ecee85c4	There is no "-mattr=+sse1" flag; fix test for non-x86 hosts. llvm-svn: 48156	2008-03-10 17:13:37 +00:00
Evan Cheng	02b66c3a32	- Fix a subtle bug in RemoveCopyByCommutingDef. ALR is the live range where the source is defined; BLR is the live range which is defined by the copy. If ALR and BLR overlaps and end of BLR extends beyond end of ALR, e.g. A = or A, B ... B = A ... C = A<kill> ... = B then do not add kills of A to the newly created B interval. - Also fix some kill info update bug. llvm-svn: 48141	2008-03-10 08:11:32 +00:00
Evan Cheng	3c0ddc999f	Avoid creating BUILD_VECTOR of all zero elements of "non-normalized" type (e.g. v8i16 on x86) after legalizer. Instruction selection does not expect to see them. In all likelihood this can only be an issue in a bugpoint reduced test case. llvm-svn: 48136	2008-03-10 07:19:13 +00:00
Chris Lattner	b6bfedbcfd	teach X86InstrInfo::copyRegToReg how to copy into ST(0) from an RFP register class. Teach ScheduleDAG how to handle CopyToReg with different src/dst reg classes. This allows us to compile trivial inline asms that expect stuff on the top of x87-fp stack. llvm-svn: 48107	2008-03-09 09:15:31 +00:00
Chris Lattner	8d0203478f	Add ScheduleDAG support for copytoreg where the src/dst register are in different register classes, e.g. copy of ST(0) to RFP*. This gets some really trivial inline asm working that plops things on the top of stack (PR879) llvm-svn: 48105	2008-03-09 08:49:15 +00:00
Chris Lattner	b9a4c86fbf	reduce this testcase more llvm-svn: 48092	2008-03-09 06:57:21 +00:00
Chris Lattner	b628208161	Finish implementing a readme entry: when inserting an i64 variable into a vector of zeros or undef, and when the top part is obviously zero, we can just use movd + shuffle. This allows us to compile vec_set-B.ll into: _test3: movl $1234567, %eax andl 4(%esp), %eax movd %eax, %xmm0 ret instead of: _test3: subl $28, %esp movl $1234567, %eax andl 32(%esp), %eax movl %eax, (%esp) movl $0, 4(%esp) movq (%esp), %xmm0 addl $28, %esp ret llvm-svn: 48090	2008-03-09 05:42:06 +00:00
Chris Lattner	17f68a3075	Implement a readme entry, compiling #include <xmmintrin.h> __m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);} into: movl $1, %eax movd %eax, %xmm0 ret instead of a constant pool load. llvm-svn: 48063	2008-03-09 01:05:04 +00:00
Chris Lattner	24031c9426	make this test harder llvm-svn: 48061	2008-03-09 00:30:06 +00:00
Chris Lattner	7173d3bd70	Teach SD some vector identities, allowing us to compile vec_set-9 into: _test3: movd %rdi, %xmm1 #IMPLICIT_DEF %xmm0 punpcklqdq %xmm1, %xmm0 ret instead of: _test3: #IMPLICIT_DEF %rax movd %rax, %xmm0 movd %rdi, %xmm1 punpcklqdq %xmm1, %xmm0 ret This is still not ideal. There is no reason to two xmm regs. llvm-svn: 48058	2008-03-08 23:43:36 +00:00
Evan Cheng	dba1dfe962	Implement x86 support for @llvm.prefetch. It corresponds to prefetcht{0\|1\|2} and prefetchnta instructions. llvm-svn: 48042	2008-03-08 00:58:38 +00:00
Chris Lattner	aa81dc7d21	mark frem as expand for all legal fp types on x86, regardless of whether we're using SSE or not. This fixes PR2122. llvm-svn: 48006	2008-03-07 06:36:32 +00:00
Chris Lattner	a9fcb187af	Generalize FP constant shrinking optimization to apply to any vt except ppc long double. This allows us to shrink constant pool entries for x86 long double constants, which in turn allows us to use flds/fldl instead of fldt. llvm-svn: 47938	2008-03-05 06:48:13 +00:00
Evan Cheng	e0b3c221ab	Add a target lowering hook to control whether it's worthwhile to compress fp constant. For x86, if sse2 is available, it's not a good idea since cvtss2sd is slower than a movsd load and it prevents load folding. On x87, it's important to shrink fp constant since fldt is very expensive. llvm-svn: 47931	2008-03-05 01:30:59 +00:00
Evan Cheng	14f556a6d7	Really fix the test. llvm-svn: 47882	2008-03-04 08:01:56 +00:00
Evan Cheng	7a67175fcc	Fix broken test. llvm-svn: 47881	2008-03-04 07:59:13 +00:00
Evan Cheng	3123d6ced3	Add PR1501 test case. llvm-svn: 47874	2008-03-04 00:47:45 +00:00
Chris Lattner	299977b5ca	Evan implemented these. llvm-svn: 47828	2008-03-02 18:05:14 +00:00
Evan Cheng	e1d3e0958b	Set to default: x86 no longer fold and into test if it has more than one use. llvm-svn: 47711	2008-02-28 07:46:38 +00:00
Evan Cheng	da92e34fe3	Fix a bug in dead spill slot elimination. llvm-svn: 47687	2008-02-27 19:57:11 +00:00
Chris Lattner	e51c23341d	actually run llc, thanks Dan :) llvm-svn: 47677	2008-02-27 17:46:54 +00:00
Evan Cheng	295ae42ede	Don't track max alignment during stack object allocations since they can be deleted later. Let PEI compute it. llvm-svn: 47668	2008-02-27 10:04:56 +00:00
Chris Lattner	1f46cc2345	Make X86TargetLowering::LowerSINT_TO_FP return without creating a dead stack slot and store if the SINT_TO_FP is actually legal. This allows us to compile: double a(double b) {return (unsigned)b;} to: _a: cvttsd2siq %xmm0, %rax movl %eax, %eax cvtsi2sdq %rax, %xmm0 ret instead of: _a: subq $8, %rsp cvttsd2siq %xmm0, %rax movl %eax, %eax cvtsi2sdq %rax, %xmm0 addq $8, %rsp ret crazy. llvm-svn: 47660	2008-02-27 05:57:41 +00:00
Chris Lattner	bc686e546a	Compile x86-64-and-mask.ll into: _test: movl %edi, %eax ret instead of: _test: movl $4294967295, %ecx movq %rdi, %rax andq %rcx, %rax ret It would be great to write this as a Pat pattern that used subregs instead of a 'pseudo' instruction, but I don't know how to do that in td files. llvm-svn: 47658	2008-02-27 05:47:54 +00:00
Evan Cheng	7553230e3a	Spiller now remove unused spill slots. llvm-svn: 47657	2008-02-27 03:04:06 +00:00
Evan Cheng	701b6a1dc3	Enable -coalescer-commute-instrs by default. llvm-svn: 47623	2008-02-26 20:40:22 +00:00
Dan Gohman	8a8f3fe7e0	Avoid aborting on invalid shift counts. llvm-svn: 47612	2008-02-26 18:50:50 +00:00
Eli Friedman	1f2cabfbcf	Fix for pr2093: direct operands aren't necessarily addresses, so don't try to simplify them. llvm-svn: 47610	2008-02-26 18:37:49 +00:00
Evan Cheng	8e99554e84	This is possible: vr1 = extract_subreg vr2, 3 ... vr3 = extract_subreg vr1, 2 The end result is vr3 is equal to vr2 with subidx 2. llvm-svn: 47592	2008-02-26 08:03:41 +00:00
Evan Cheng	6366bbf577	Fix PR2076. CodeGenPrepare now sinks address computation for inline asm memory operands into inline asm block. llvm-svn: 47589	2008-02-26 02:42:37 +00:00
Evan Cheng	d299f09bc5	Rematerialization logic was overly conservative when it comes to loads from fixed stack slots. llvm-svn: 47529	2008-02-23 03:38:34 +00:00
Evan Cheng	6782480bd1	Update test. llvm-svn: 47527	2008-02-23 02:57:25 +00:00
Evan Cheng	4e9d5f1ead	Remat of pic loads are now on by default. llvm-svn: 47525	2008-02-23 02:08:30 +00:00
Evan Cheng	7c3a8d0056	Really. Why doesn't every arch support MMX? llvm-svn: 47513	2008-02-23 00:56:14 +00:00
Evan Cheng	3b35d2a86c	Test case for PR2082. llvm-svn: 47501	2008-02-22 20:38:49 +00:00
Evan Cheng	1b417c4d84	Allow re-materialization of pic load (controlled by -remat-pic-load for now). llvm-svn: 47476	2008-02-22 09:25:47 +00:00
Chris Lattner	a64d4179d4	copy mmx values from/to memory with GPRs on x86-32 instead of with mmx registers. This horribleness is apparently done by gcc to avoid having to insert emms in places that really should have it. This is the second half of rdar://5741668. llvm-svn: 47474	2008-02-22 05:18:04 +00:00
Chris Lattner	e70bc39d74	Start using GPR's to copy around mmx value instead of mmx regs. GCC apparently does this, and code depends on not having to do emms when this happens. This is x86-64 only so far, second half should handle x86-32. rdar://5741668 llvm-svn: 47470	2008-02-22 02:09:43 +00:00
Chris Lattner	4f87f1c087	Treat clobber operands like early clobbers: if we have any, we force sdisel to do all regalloc for an asm. This leads to gross but correct codegen. This fixes the rest of PR2078. llvm-svn: 47454	2008-02-21 19:43:13 +00:00
Tanya Lattner	8116db05a6	Remove llvm-upgrade and update tests. llvm-svn: 47432	2008-02-21 07:42:26 +00:00
Chris Lattner	99b5a37d39	Fix a (harmless) but where vregs were added to the used reg lists for inline asms. Fix PR2078 by marking aliases of registers used when a register is marked used. This prevents EAX from being allocated when AX is listed in the clobber set for the asm. llvm-svn: 47426	2008-02-21 04:55:52 +00:00
Evan Cheng	33ee06fa48	XFAIL this for now. llvm-svn: 47355	2008-02-20 02:38:58 +00:00
Chris Lattner	aaafe47a55	this test requires sse2 llvm-svn: 47331	2008-02-19 18:07:46 +00:00
Chris Lattner	3a4ac3a69e	Don't fold and's into test instructions if they have multiple uses. This compiles test-nofold.ll into: _test: movl $15, %ecx andl 4(%esp), %ecx testl %ecx, %ecx movl $42, %eax cmove %ecx, %eax ret instead of: _test: movl 4(%esp), %eax movl %eax, %ecx andl $15, %ecx testl $15, %eax movl $42, %eax cmove %ecx, %eax ret llvm-svn: 47330	2008-02-19 17:37:35 +00:00
Chris Lattner	67f2a6c009	rename tests to avoid a test- prefix when they aren't related to the test instruction. llvm-svn: 47329	2008-02-19 17:33:52 +00:00
Nick Lewycky	69457748ab	Don't spew stats to stderr. llvm-svn: 47308	2008-02-19 03:11:47 +00:00
Nick Lewycky	0560401b2e	Fix up the run line for this new test. llc: for the -info-output-file option: requires a value! llvm-svn: 47306	2008-02-19 02:58:36 +00:00
Evan Cheng	de4579d0b3	New test. llvm-svn: 47302	2008-02-19 02:09:58 +00:00
Evan Cheng	bb577266bf	- When DAG combiner is folding a bit convert into a BUILD_VECTOR, it should check if it's essentially a SCALAR_TO_VECTOR. Avoid turning (v8i16) <10, u, u, u> to <10, 0, u, u, u, u, u, u>. Instead, simply convert it to a SCALAR_TO_VECTOR of the proper type. - X86 now normalize SCALAR_TO_VECTOR to (BIT_CONVERT (v4i32 SCALAR_TO_VECTOR)). Get rid of X86ISD::S2VEC. llvm-svn: 47290	2008-02-18 23:04:32 +00:00
Dan Gohman	70b9b2f77f	Don't mark scalar integer multiplication as Expand on x86, since x86 has plain one-result scalar integer multiplication instructions. This avoids expanding such instructions into MUL_LOHI sequences that must be special-cased at isel time, and avoids the problem with that code that provented memory operands from being folded. This fixes PR1874, addressesing the most common case. The uncommon cases of optimizing multiply-high operations will require work in DAGCombiner. llvm-svn: 47277	2008-02-18 17:55:26 +00:00
Andrew Lenharth	c178981b85	llvm.memory.barrier, and impl for x86 and alpha llvm-svn: 47204	2008-02-16 01:24:58 +00:00
Evan Cheng	94742fb5d4	This test is not interesting. llvm-svn: 47189	2008-02-15 23:06:21 +00:00
Chris Lattner	9c24f3ec37	Fix a miscompilation from Dan's recent apintification. llvm-svn: 47128	2008-02-14 18:48:56 +00:00
Chris Lattner	d696c25db5	This readme entry is done, testcase here: CodeGen/X86/zero-remat.ll llvm-svn: 47106	2008-02-14 05:39:46 +00:00
Evan Cheng	cbbcb3144d	Fix test. llvm-svn: 47102	2008-02-14 01:32:53 +00:00
Chris Lattner	a30946c576	In SDISel, for targets that support FORMAL_ARGUMENTS nodes, lower this node as soon as we create it in SDISel. Previously we would lower it in legalize. The problem with this is that it only exposes the argument loads implied by FORMAL_ARGUMENTs after legalize, so that only dag combine 2 can hack on them. This causes us to miss some optimizations because datatype expansion also happens here. Exposing the loads early allows us to do optimizations on them. For example we now compile arg-cast.ll to: _foo: movl $2147483647, %eax andl 8(%esp), %eax ret where we previously produced: _foo: subl $12, %esp movsd 16(%esp), %xmm0 movsd %xmm0, (%esp) movl $2147483647, %eax andl 4(%esp), %eax addl $12, %esp ret It might also make sense to do this for ISD::CALL nodes, which have implicit stores on many targets. llvm-svn: 47054	2008-02-13 07:39:09 +00:00
Evan Cheng	68a88c1f52	New tests. llvm-svn: 47047	2008-02-13 03:23:53 +00:00
Evan Cheng	0d2efb485d	Don't mask the isel bug. llvm-svn: 47018	2008-02-12 19:11:29 +00:00
Evan Cheng	6c7520f922	This test assumes no SSE4.1. llvm-svn: 47017	2008-02-12 19:11:08 +00:00
Evan Cheng	1ab096a313	Fix some test cases. llvm-svn: 46998	2008-02-12 07:22:46 +00:00
Dale Johannesen	304406f01c	Alignment of struct containing vectors depends on whether SSE is present, on Darwin anyway. Make it explicit. llvm-svn: 46909	2008-02-09 19:04:25 +00:00
Evan Cheng	90f03a0b88	It's not always safe to fold movsd into xorpd, etc. Check the alignment of the load address first to make sure it's 16 byte aligned. llvm-svn: 46893	2008-02-08 21:20:40 +00:00
Evan Cheng	b2bc19ee5b	Added missing entries in X86 load / store folding tables. llvm-svn: 46866	2008-02-08 00:12:56 +00:00
Evan Cheng	a377b2bbd1	Fix a x86-64 codegen deficiency. Allow gv + offset when using rip addressing mode. Before: _main: subq $8, %rsp leaq _X(%rip), %rax movsd 8(%rax), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Now: _main: subq $8, %rsp movsd _X+8(%rip), %xmm1 movss _X(%rip), %xmm0 call _t xorl %ecx, %ecx movl %ecx, %eax addq $8, %rsp ret Notice there is another idiotic codegen issue that needs to be fixed asap: xorl %ecx, %ecx movl %ecx, %eax llvm-svn: 46850	2008-02-07 08:53:49 +00:00
Evan Cheng	851d353eb8	Fix PR1975: dag isel emitter produces patterns that isel wrong flag result. llvm-svn: 46776	2008-02-05 22:50:29 +00:00
Chris Lattner	35f063e37c	Add target triples to these so they don't fail on linux. llvm-svn: 46496	2008-01-29 06:26:07 +00:00
Chris Lattner	2ab1fd3824	Implement some dag combines that allow doing fneg/fabs/fcopysign in integer registers if used by a bitconvert or using a bitconvert. This allows us to avoid constant pool loads and use cheaper integer instructions when the values come from or end up in integer regs anyway. For example, we now compile CodeGen/X86/fp-in-intregs.ll to: _test1: movl $2147483648, %eax xorl 4(%esp), %eax ret _test2: movl $1065353216, %eax orl 4(%esp), %eax andl $3212836864, %eax ret Instead of: _test1: movss 4(%esp), %xmm0 xorps LCPI2_0, %xmm0 movd %xmm0, %eax ret _test2: movss 4(%esp), %xmm0 andps LCPI3_0, %xmm0 movss LCPI3_1, %xmm1 andps LCPI3_2, %xmm1 orps %xmm0, %xmm1 movd %xmm1, %eax ret bitconverts can happen due to various calling conventions that require fp values to passed in integer regs in some cases, e.g. when returning a complex. llvm-svn: 46414	2008-01-27 17:42:27 +00:00
Chris Lattner	e66aea6532	New test to verify that "merging 4 loads into a vec load" continues to work and continues to infer alignment info. llvm-svn: 46403	2008-01-26 20:06:45 +00:00
Chris Lattner	682346a7b0	Infer alignment of loads and increase their alignment when we can tell they are from the stack. This allows us to compile stack-align.ll to: _test: movsd LCPI1_0, %xmm0 movapd %xmm0, %xmm1 * andpd 4(%esp), %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret instead of: _test: movsd LCPI1_0, %xmm0 movsd 4(%esp), %xmm1 ** andpd %xmm0, %xmm1 andpd _G, %xmm0 addsd %xmm1, %xmm0 movl 20(%esp), %eax movsd %xmm0, (%eax) ret llvm-svn: 46401	2008-01-26 19:45:50 +00:00
Chris Lattner	f0c3240135	remove a useless xfailed test. llvm-svn: 46400	2008-01-26 19:35:46 +00:00
Bill Wendling	7b83688c73	If there's no instructions being emitted on X86 for a function, emit a nop. Emit the nop directly for PPC. llvm-svn: 46398	2008-01-26 09:03:52 +00:00
Chris Lattner	79076fdf2a	Add target-specific dag combines for FAND(x,0) and FOR(x,0). This allows us to compile: double test(double X) { return copysign(0.0, X); } into: _test: andpd LCPI1_0(%rip), %xmm0 ret instead of: _test: pxor %xmm1, %xmm1 andpd LCPI1_0(%rip), %xmm1 movapd %xmm0, %xmm2 andpd LCPI1_1(%rip), %xmm2 movapd %xmm1, %xmm0 orpd %xmm2, %xmm0 ret llvm-svn: 46344	2008-01-25 05:46:26 +00:00
Chris Lattner	16a8f126d3	Significantly simplify and improve handling of FP function results on x86-32. This case returns the value in ST(0) and then has to convert it to an SSE register. This causes significant codegen ugliness in some cases. For example in the trivial fp-stack-direct-ret.ll testcase we used to generate: _bar: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret because we move the result of foo() into an XMM register, then have to move it back for the return of bar. Instead of hacking ever-more special cases into the call result lowering code we take a much simpler approach: on x86-32, fp return is modeled as always returning into an f80 register which is then truncated to f32 or f64 as needed. Similarly for a result, we model it as an extension to f80 + return. This exposes the truncate and extensions to the dag combiner, allowing target independent code to hack on them, eliminating them in this case. This gives us this code for the example above: _bar: subl $12, %esp call L_foo$stub addl $12, %esp ret The nasty aspect of this is that these conversions are not legal, but we want the second pass of dag combiner (post-legalize) to be able to hack on them. To handle this, we lie to legalize and say they are legal, then custom expand them on entry to the isel pass (PreprocessForFPConvert). This is gross, but less gross than the code it is replacing :) This also allows us to generate better code in several other cases. For example on fp-stack-ret-conv.ll, we now generate: _test: subl $12, %esp call L_foo$stub fstps 8(%esp) movl 16(%esp), %eax cvtss2sd 8(%esp), %xmm0 movsd %xmm0, (%eax) addl $12, %esp ret where before we produced (incidentally, the old bad code is identical to what gcc produces): _test: subl $12, %esp call L_foo$stub fstpl (%esp) cvtsd2ss (%esp), %xmm0 cvtss2sd %xmm0, %xmm0 movl 16(%esp), %eax movsd %xmm0, (%eax) addl $12, %esp ret Note that we generate slightly worse code on pr1505b.ll due to a scheduling deficiency that is unrelated to this patch. llvm-svn: 46307	2008-01-24 08:07:48 +00:00
Chris Lattner	214c11ee6f	take these with a pr # llvm-svn: 46303	2008-01-24 06:35:44 +00:00
Evan Cheng	91089e6d66	Let each target decide byval alignment. For X86, it's 4-byte unless the aggregare contains SSE vector(s). For x86-64, it's max of 8 or alignment of the type. llvm-svn: 46286	2008-01-23 23:17:41 +00:00
Evan Cheng	d436c2e724	SSE varargs arguments are passed in memory. llvm-svn: 46262	2008-01-22 23:26:53 +00:00
Dale Johannesen	b2d9e41233	Test is correct again for the moment. llvm-svn: 46172	2008-01-18 19:53:31 +00:00
Chris Lattner	41717f6989	This commit changes: 1. Legalize now always promotes truncstore of i1 to i8. 2. Remove patterns and gunk related to truncstore i1 from targets. 3. Rename the StoreXAction stuff to TruncStoreAction in TLI. 4. Make the TLI TruncStoreAction table a 2d table to handle from/to conversions. 5. Mark a wide variety of invalid truncstores as such in various targets, e.g. X86 currently doesn't support truncstore of any of its integer types. 6. Add legalize support for truncstores with invalid value input types. 7. Add a dag combine transform to turn store(truncate) into truncstore when safe. The later allows us to compile CodeGen/X86/storetrunc-fp.ll to: _foo: fldt 20(%esp) fldt 4(%esp) faddp %st(1) movl 36(%esp), %eax fstps (%eax) ret instead of: _foo: subl $4, %esp fldt 24(%esp) fldt 8(%esp) faddp %st(1) fstps (%esp) movl 40(%esp), %eax movss (%esp), %xmm0 movss %xmm0, (%eax) addl $4, %esp ret llvm-svn: 46140	2008-01-17 19:59:44 +00:00
Evan Cheng	8633da0707	When a live virtual register is being clobbered by an implicit def, it is spilled and the spill is its kill. However, if the local allocator has determined the register has not been modified (possible when its value was reloaded), it would not issue a restore. In that case, mark the last use of the virtual register as kill. llvm-svn: 46111	2008-01-17 02:08:17 +00:00
Evan Cheng	5be34d811c	Fixes a nasty dag combiner bug that causes a bunch of tests to fail at -O0. It's not safe to use the two value CombineTo variant to combine away a dead load. e.g. v1, chain2 = load chain1, loc v2, chain3 = load chain2, loc v3 = add v2, c Now we replace use of v1 with undef, use of chain2 with chain1. ReplaceAllUsesWith() will iterate through uses of the first load and update operands: v1, chain2 = load chain1, loc v2, chain3 = load chain1, loc v3 = add v2, c Now the second load is the same as the first load, SelectionDAG cse will ensure the use of second load is replaced with the first load. v1, chain2 = load chain1, loc v3 = add v1, c Then v1 is replaced with undef and bad things happen. llvm-svn: 46099	2008-01-16 23:11:54 +00:00
Duncan Sands	78e448d8b4	Trampoline support for x86-64. This looks like it should work, but I have no machine to test it on. Committed because it will at least cause no harm, and maybe someone can test it for me! llvm-svn: 46098	2008-01-16 22:55:25 +00:00
Chris Lattner	109f0e56f5	make sure to use a cpu that has sse. llvm-svn: 46060	2008-01-16 06:32:02 +00:00
Chris Lattner	41e1fd13b2	My previous commit had an incomplete message, it should have been: make the 'fp return in ST(0)' optimization smart enough to look through token factor nodes. THis allows us to compile testcases like CodeGen/X86/fp-stack-retcopy.ll into: _carg: subl $12, %esp call L_foo$stub fstpl (%esp) fldl (%esp) addl $12, %esp ret instead of: _carg: subl $28, %esp call L_foo$stub fstpl 16(%esp) movsd 16(%esp), %xmm0 movsd %xmm0, 8(%esp) fldl 8(%esp) addl $28, %esp ret Still not optimal, but much better and this is a trivial patch. Fixing the rest requires invasive surgery that is is not llvm 2.2 material. llvm-svn: 46054	2008-01-16 05:56:59 +00:00
Chris Lattner	afd4056065	verify x86 generates ud2 for llvm.trap llvm-svn: 46023	2008-01-15 22:22:02 +00:00
Dale Johannesen	8ca78844b0	Disable for now. llvm-svn: 45881	2008-01-11 20:47:33 +00:00
Duncan Sands	2c89976416	Output sinl for a long double FSIN node, not sin. Likewise fix up a bunch of other libcalls. While there I remove NEG_F32 and NEG_F64 since they are not used anywhere. This fixes 9 Ada ACATS failures. llvm-svn: 45833	2008-01-10 10:28:30 +00:00
Evan Cheng	0747381b13	Codegen improvement has reduced one spill. llvm-svn: 45814	2008-01-10 02:54:40 +00:00
Evan Cheng	ba0214a6cb	Special copy SUnit's do not have SDNode's. llvm-svn: 45787	2008-01-09 23:01:55 +00:00
Evan Cheng	f91cfb435f	Fix sse2.psrl.w and sse2.psrl.q definitions. llvm-svn: 45772	2008-01-09 02:16:44 +00:00
Chris Lattner	c93ad7d569	Make load->store deletion a bit smarter. This allows us to compile this: void test(long long P) { P ^= 1; } into just: _test: movl 4(%esp), %eax xorl $1, (%eax) ret instead of code like this: _test: movl 4(%esp), %ecx xorl $1, (%ecx) movl 4(%ecx), %edx movl %edx, 4(%ecx) ret llvm-svn: 45762	2008-01-08 23:08:06 +00:00
Duncan Sands	b3b1ae18ab	Crashes llc when using Chris's new legalization logic. llvm-svn: 45758	2008-01-08 21:51:53 +00:00
Nate Begeman	98dba4b0ce	Update test to catch recent x86 insert regression and improvements llvm-svn: 45705	2008-01-07 17:49:23 +00:00
Chris Lattner	7d567adef9	fix this to use a valid triple. llvm-svn: 45509	2008-01-02 22:21:45 +00:00
Chris Lattner	fbd8cc03c8	verify that aligned common support doesn't break. llvm-svn: 45495	2008-01-02 19:48:24 +00:00
Chris Lattner	d55e743cfe	One readme entry is done, one is really easy (Evan, want to investigate eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn may be done (if shufps is better than pinsw, Evan, please review), and we already know about LICM of simple instructions. llvm-svn: 45407	2007-12-29 19:31:47 +00:00
Chris Lattner	ed55329cc9	upgrade this test llvm-svn: 45406	2007-12-29 19:24:06 +00:00
Chris Lattner	cd147e5596	Fold comparisons against a constant nan, and optimize ORD/UNORD comparisons with a constant. This allows us to compile isnan to: _foo: fcmpu cr7, f1, f1 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr instead of: LCPI1_0: ; float .space 4 _foo: lis r2, ha16(LCPI1_0) lfs f0, lo16(LCPI1_0)(r2) fcmpu cr7, f1, f0 mfcr r2 rlwinm r3, r2, 0, 31, 31 blr llvm-svn: 45405	2007-12-29 08:37:08 +00:00
Chris Lattner	b36a4a7a84	this xform is implemented. llvm-svn: 45404	2007-12-29 08:19:39 +00:00
Chris Lattner	f8e408b7b1	Codegen: as: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstps (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret instead of: _bar: pushl %esi subl $8, %esp movl 16(%esp), %esi call L_foo$stub fstpl (%esi) cvtsd2ss (%esi), %xmm0 movss %xmm0, (%esi) addl $8, %esp popl %esi #FP_REG_KILL ret llvm-svn: 45401	2007-12-29 06:57:38 +00:00
Chris Lattner	e3515220d2	avoid going through a stack slot to convert from fpstack to xmm reg if we are just going to store it back anyway. This improves things like: double foo(); void bar(double P) { P = foo(); } llvm-svn: 45399	2007-12-29 06:41:28 +00:00
Chris Lattner	a432f12b76	one fewer uncond branch with my codegenprepare hack for single-mbb backedges. llvm-svn: 45360	2007-12-26 17:23:47 +00:00
Evan Cheng	8824950e8f	Fix PR1872: SrcValue and SrcValueOffset should not be used to compute load / store node id. llvm-svn: 45167	2007-12-18 19:38:14 +00:00
Evan Cheng	36bfae49e3	FIX for PR1799: When a load is unfolded from an instruction, check if it is a new node. If not, do not create a new SUnit. llvm-svn: 45157	2007-12-18 08:42:10 +00:00
Evan Cheng	1d95b669b6	Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs. llvm-svn: 45058	2007-12-15 03:00:47 +00:00
Evan Cheng	6909ff8c4b	Fix ctlz and cttz. llvm definition requires them to return number of bits in of the src type when value is zero. llvm-svn: 45029	2007-12-14 08:30:15 +00:00
Evan Cheng	51cf86ded0	Implement ctlz and cttz with bsr and bsf. llvm-svn: 45024	2007-12-14 02:13:44 +00:00
Evan Cheng	a152909956	Be extra careful with extension use optimation. Now turned on by default. llvm-svn: 44981	2007-12-13 03:32:53 +00:00
Evan Cheng	343929c773	Fold some and + shift in x86 addressing mode. llvm-svn: 44970	2007-12-13 00:43:27 +00:00
Evan Cheng	64a1febf9a	Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled. llvm-svn: 44960	2007-12-12 23:12:09 +00:00
Dan Gohman	0075ea1f5f	Allow vector integer constants to be created with SelectionDAG::getConstant, in the same way as vector floating-point constants. This allows the legalize expansion code for @llvm.ctpop and friends to be usable with vector types. llvm-svn: 44954	2007-12-12 22:21:26 +00:00
Evan Cheng	ad3e7f3286	Use shuffles to implement insert_vector_elt for i32, i64, f32, and f64. llvm-svn: 44929	2007-12-12 07:55:34 +00:00
Evan Cheng	af6ba4dfd4	Add a test case for -optimize-ext-uses. llvm-svn: 44928	2007-12-12 07:54:08 +00:00
Evan Cheng	d36d69fe92	Lower a build_vector with all constants into a constpool load unless it can be done with a move to low part. llvm-svn: 44921	2007-12-12 06:45:40 +00:00
Evan Cheng	f6c2838f36	- Improved v8i16 shuffle lowering. It now uses pshuflw and pshufhw as much as possible before resorting to pextrw and pinsrw. - Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles. - Improves (i16 extract_vector_element 0) codegen by recognizing (i32 extract_vector_element 0) does not require a pextrw. llvm-svn: 44836	2007-12-11 01:46:18 +00:00
Christopher Lamb	5c577eb543	Improve branch folding by recgonizing that explict successor relationships impact the value of fall-through choices. llvm-svn: 44785	2007-12-10 07:24:06 +00:00
Evan Cheng	34c7b35135	Much improved v8i16 shuffles. (Step 1). llvm-svn: 44676	2007-12-07 08:07:39 +00:00
Evan Cheng	8d1f8b2f27	New test case. llvm-svn: 44672	2007-12-07 01:48:46 +00:00
Evan Cheng	cab253ba13	Fix a bogus test case. llvm-svn: 44668	2007-12-06 22:12:45 +00:00
Evan Cheng	d53f72dfb1	Turning simple splitting on. Start testing new coalescer heuristics as new llcbeta. llvm-svn: 44660	2007-12-06 08:54:31 +00:00
Chris Lattner	64a1a9f502	third time around: instead of disabling this completely, only disable it if we don't know it will be obviously profitable. Still fixme, but less so. :) llvm-svn: 44658	2007-12-06 07:47:55 +00:00
Chris Lattner	bb5fb18af8	Actually, disable this code for now. More analysis and improvements to the X86 backend are needed before this should be enabled by default. llvm-svn: 44657	2007-12-06 07:44:31 +00:00
Chris Lattner	c467b49c96	implement a readme entry, compiling the code into: _foo: movl $12, %eax andl 4(%esp), %eax movl _array(%eax), %eax ret instead of: _foo: movl 4(%esp), %eax shrl $2, %eax andl $3, %eax movl _array(,%eax,4), %eax ret As it turns out, this triggers all the time, in a wide variety of situations, for example, I see diffs like this in various programs: - movl 8(%eax), %eax - shll $2, %eax - andl $1020, %eax - movl (%esi,%eax), %eax + movzbl 8(%eax), %eax + movl (%esi,%eax,4), %eax - shll $2, %edx - andl $1020, %edx - movl (%edi,%edx), %edx + andl $255, %edx + movl (%edi,%edx,4), %edx Unfortunately, I also see stuff like this, which can be fixed in the X86 backend: - andl $85, %ebx - addl _bit_count(,%ebx,4), %ebp + shll $2, %ebx + andl $340, %ebx + addl _bit_count(%ebx), %ebp llvm-svn: 44656	2007-12-06 07:33:36 +00:00
Chris Lattner	1de4db0446	fix this when run on non x86 hosts. llvm-svn: 44645	2007-12-06 01:05:52 +00:00
Evan Cheng	79e8b92dc3	Allow some reloads to be folded in multi-use cases. Specifically testl r, r -> cmpl [mem], 0. llvm-svn: 44479	2007-12-01 02:07:52 +00:00
Evan Cheng	90c548af8e	Do not fold reload into an instruction with multiple uses. It issues one extra load. llvm-svn: 44467	2007-11-30 21:23:43 +00:00
Dan Gohman	d12155d8c8	Remove unnecessary && from the RUN lines of this test. llvm-svn: 44342	2007-11-27 00:03:38 +00:00
Dan Gohman	a9f8208852	Don't lower srem/urem X%C to X-X/C*C unless the division is actually optimized. This avoids creating illegal divisions when the combiner is running after legalize; this fixes PR1815. Also, it produces better code in the included testcase by avoiding the subtract and multiply when the division isn't optimized. llvm-svn: 44341	2007-11-26 23:46:11 +00:00
Chris Lattner	be0c5a0500	Fix a long standing deficiency in the X86 backend: we would sometimes emit "zero" and "all one" vectors multiple times, for example: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 pcmpeqd %mm0, %mm0 movq %mm0, _M2 ret instead of: _test2: pcmpeqd %mm0, %mm0 movq %mm0, _M1 movq %mm0, _M2 ret This patch fixes this by always arranging for zero/one vectors to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be any random type. This ensures they get trivially CSE'd on the dag. This fix is also important for LegalizeDAGTypes, as it gets unhappy when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when 'i64' isn't legal. This patch makes the following changes: 1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into their canonical types. 2) The now-dead patterns are removed from the SSE/MMX .td files. 3) All the patterns in the .td file that referred to immAllOnesV or immAllZerosV in the wrong form now use *_bc to match them with a bitcast wrapped around them. 4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle bitcast'd zero vectors, which simplifies the code actually. 5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that is legal, instead of generating one that is illegal and expecting a later legalize pass to clean it up. 6) isZeroShuffle is generalized to handle bitcast of zeros. 7) several other minor tweaks. This patch is definite goodness, but has the potential to cause random code quality regressions. Please be on the lookout for these and let me know if they happen. llvm-svn: 44310	2007-11-25 00:24:49 +00:00
Chris Lattner	6304b1e16d	upgrade this test llvm-svn: 44298	2007-11-24 05:39:29 +00:00
Dan Gohman	0f62120b01	Add support in SplitVectorOp for remainder operators. llvm-svn: 44233	2007-11-19 15:15:03 +00:00
Chris Lattner	bef568f3f8	fix bogus test that the more strict lexer is finding. llvm-svn: 44216	2007-11-18 18:26:45 +00:00
Evan Cheng	121c50d5e3	Typo. llvm-svn: 44196	2007-11-16 23:55:08 +00:00
Evan Cheng	c19506f69d	Fix a thinko in post-allocation coalescer. llvm-svn: 44166	2007-11-15 08:13:29 +00:00
Anton Korobeynikov	58298cb9cc	Fix PIC jump table codegen on x86-32/linux. In fact, such thing should be applied to all targets uses GOT-relative offsets for PIC (Alpha?) llvm-svn: 44108	2007-11-14 09:18:41 +00:00
Arnold Schwaighofer	64ad6fa1fa	Update tailcall code to include inline attribute operand for memcpy. llvm-svn: 43978	2007-11-10 10:48:01 +00:00
Evan Cheng	ea1474bdf3	Fix tests. llvm-svn: 43961	2007-11-09 20:46:00 +00:00
Evan Cheng	d9bab93a44	If both parts of smul_lohi, etc. are used, don't simplify. If only one part is used, try simplify it. llvm-svn: 43888	2007-11-08 09:25:29 +00:00
Evan Cheng	3764ad2bac	Add pseudo dependency to force two-address instruction to be scheduled after other uses. There was a overly restricted check that prevented some obvious cases. llvm-svn: 43762	2007-11-06 08:44:59 +00:00
Dan Gohman	6255ce9f5d	Add support for vector remainder operations. llvm-svn: 43744	2007-11-05 23:35:22 +00:00
Dale Johannesen	1f70f86c7a	Make labels work in asm blocks; allow labels as parameters. Rename ValueRefList to ParamList in AsmParser, since its only use is for parameters. llvm-svn: 43734	2007-11-05 21:20:28 +00:00
Evan Cheng	e5eac2c5ac	Handle cases where a register and one of its super-register are both marked as defined on the same instruction. This fixes PR1767. llvm-svn: 43699	2007-11-05 03:11:55 +00:00
Evan Cheng	bc39e175c4	Fix test case. Chris didn't do make check. :-) llvm-svn: 43698	2007-11-05 03:04:26 +00:00
Evan Cheng	947c271e37	Doh. PR1187 -> PR1766. llvm-svn: 43693	2007-11-05 01:00:44 +00:00
Evan Cheng	13d79ab67a	Fix PR1187. llvm-svn: 43692	2007-11-05 00:59:10 +00:00
Chris Lattner	8fac63c8b5	Fix PR1761 by not printing (rip) suffix when in -static mode. Evan, please review this. llvm-svn: 43680	2007-11-04 19:23:28 +00:00
Chris Lattner	67cd357fb8	Fix PR1763 by allowing the 'q' constraint to work with 64-bit regs on x86-64. llvm-svn: 43669	2007-11-04 06:51:12 +00:00
Evan Cheng	1771f6da9c	There are times when the coalescer would not coalesce away a copy but the copy can be eliminated by the allocator is the destination and source targets the same register. The most common case is when the source and destination registers are in different class. For example, on x86 mov32to32_ targets GR32_ which contains a subset of the registers in GR32. The allocator can do 2 things: 1. Set the preferred allocation for the destination of a copy to that of its source. 2. After allocation is done, change the allocation of a copy destination (if legal) so the copy can be eliminated. This eliminates 443 extra moves from 403.gcc. llvm-svn: 43662	2007-11-03 07:20:12 +00:00
Evan Cheng	8d473f667d	Add run line. llvm-svn: 43645	2007-11-02 17:36:58 +00:00
Evan Cheng	65a07e73e2	One more extract_subreg coalescing bug. llvm-svn: 43644	2007-11-02 17:35:08 +00:00
Evan Cheng	b50cc64eb0	Missing a getNumOperands check. llvm-svn: 43630	2007-11-02 01:26:22 +00:00
Dale Johannesen	c125d9b4e8	Test that expand_vector_elt(v2i64) works in 32-bit mode. llvm-svn: 43598	2007-11-01 02:38:24 +00:00
Evan Cheng	5e058e94b5	It's not safe to tell SplitCriticalEdge to merge identical edges. It may delete the phi instruction that's being processed. llvm-svn: 43524	2007-10-30 22:27:26 +00:00
Evan Cheng	633cd3e84d	- Bug fixes. - Allow icmp rewrite using an iv / stride of a smaller integer type. llvm-svn: 43480	2007-10-29 22:07:18 +00:00
Dan Gohman	02b8beff5f	Fix a DAGCombiner abort on a bitcast from a scalar to a vector. llvm-svn: 43470	2007-10-29 20:44:42 +00:00
Evan Cheng	5fe81cf64e	Enable more fold (sext (load x)) -> (sext (truncate (sextload x))) transformation. Previously, it's restricted by ensuring the number of load uses is one. Now the restriction is loosened up by allowing setcc uses to be "extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq). llvm-svn: 43465	2007-10-29 19:58:20 +00:00
Chris Lattner	1503362624	Add support for the x86-64 'q' regigster modifier, and add support for the b/h/w/k/q inline asm memory modifiers, which are just ignored. This fixes PR1748 and CodeGen/X86/2007-10-28-inlineasm-q-modifier.ll llvm-svn: 43430	2007-10-29 03:09:07 +00:00
Evan Cheng	53696b7e9f	Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free. e.g. Turns this loop: LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx movw %dx, %si LBB1_2: # bb movl L_X$non_lazy_ptr, %edi movw %si, (%edi) movl L_Y$non_lazy_ptr, %edi movw %dx, (%edi) addw $4, %dx incw %si incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb into LBB1_1: # entry.bb_crit_edge xorl %ecx, %ecx xorw %dx, %dx LBB1_2: # bb movl L_X$non_lazy_ptr, %esi movw %cx, (%esi) movl L_Y$non_lazy_ptr, %esi movw %dx, (%esi) addw $4, %dx incl %ecx cmpl %eax, %ecx jne LBB1_2 # bb llvm-svn: 43375	2007-10-26 01:56:11 +00:00
Evan Cheng	66cbf54030	If a loop termination compare instruction is the only use of its stride, and the compaison is against a constant value, try eliminate the stride by moving the compare instruction to another stride and change its constant operand accordingly. e.g. loop: ... v1 = v1 + 3 v2 = v2 + 1 if (v2 < 10) goto loop => loop: ... v1 = v1 + 3 if (v1 < 30) goto loop llvm-svn: 43336	2007-10-25 09:11:16 +00:00
Dale Johannesen	402c11966a	This was failing on Darwin, which defaults to PIC; no lea was generated. I think this follows the intent. llvm-svn: 43312	2007-10-24 20:58:14 +00:00
Dan Gohman	df1f166e4a	Strength reduction improvements. - Avoid attempting stride-reuse in the case that there are users that aren't addresses. In that case, there will be places where the multiplications won't be folded away, so it's better to try to strength-reduce them. - Several SSE intrinsics have operands that strength-reduction can treat as addresses. The previous item makes this more visible, as any non-address use of an IV can inhibit stride-reuse. - Make ValidStride aware of whether there's likely to be a base register in the address computation. This prevents it from thinking that things like stride 9 are valid on x86 when the base register is already occupied. Also, XFAIL the 2007-08-10-LEA16Use32.ll test; the new logic to avoid stride-reuse elimintes the LEA in the loop, so the test is no longer testing what it was intended to test. llvm-svn: 43231	2007-10-22 20:40:42 +00:00
Dan Gohman	76e104c8ad	Fix the folding of multiplication into addresses on x86, which was broken by the recent {U,S}MUL_LOHI changes. llvm-svn: 43230	2007-10-22 20:22:24 +00:00
Evan Cheng	2d53c3f15e	New test case. llvm-svn: 43193	2007-10-19 22:05:00 +00:00
Rafael Espindola	2b5b200b9f	Test byval with a 8 bit aligned struct llvm-svn: 43173	2007-10-19 11:29:21 +00:00
Rafael Espindola	d8d4372845	Add support for byval function whose argument is not 32 bit aligned. To do this it is necessary to add a "always inline" argument to the memcpy node. For completeness I have also added this node to memmove and memset. I have also added getMem* functions, because the extra argument makes it cumbersome to use getNode and because I get confused by it :-) llvm-svn: 43172	2007-10-19 10:41:11 +00:00
Evan Cheng	f6d1c7be14	Really fix PR1734. Carefully track which register uses are sub-register uses by traversing inverse register coalescing map. llvm-svn: 43118	2007-10-18 07:49:59 +00:00
Dan Gohman	2903f7fc26	Add support for ISD::SELECT in SplitVectorOp. llvm-svn: 43072	2007-10-17 14:48:28 +00:00
Evan Cheng	524c0e6c3d	Yet another test case for extract_subreg coalescing crash. llvm-svn: 43063	2007-10-17 02:15:06 +00:00
Evan Cheng	09fa6ed483	Fix PR1734. llvm-svn: 43035	2007-10-16 19:29:47 +00:00
Dale Johannesen	dd254c4efa	New test for svn rev 43033, radar 5538745. llvm-svn: 43034	2007-10-16 18:10:14 +00:00
Evan Cheng	f5bcd3d737	LowerFP_TO_SINT must not create a stack object if it's not needed. llvm-svn: 43004	2007-10-15 20:11:21 +00:00
Dan Gohman	2dc6099def	Reapply the fix in 42908 for this file. This changes the function names from "test" to "foo" so that they don't match the grep -i ST. llvm-svn: 43001	2007-10-15 19:22:17 +00:00
Evan Cheng	43887d3714	Fix PR1729: watch out for val# with no def. llvm-svn: 42996	2007-10-15 18:33:50 +00:00
Tanya Lattner	3a64752342	Fix run line. llvm-svn: 42990	2007-10-15 16:35:13 +00:00
Evan Cheng	850c8739dc	New test case. llvm-svn: 42963	2007-10-14 10:15:03 +00:00
Evan Cheng	33df6a6bed	Revert 42908 for now. llvm-svn: 42960	2007-10-14 05:57:21 +00:00
Evan Cheng	f5ed18f7d3	Fix test case. llvm-svn: 42949	2007-10-13 03:14:06 +00:00
Evan Cheng	6101e4ffdf	New tests. llvm-svn: 42948	2007-10-13 03:10:54 +00:00
Dan Gohman	c96f2809ca	Fix this test to not depend on the assembly output containing something that includes the string "st". This probably fixes the regression on Darwin. llvm-svn: 42932	2007-10-12 20:42:14 +00:00
Dan Gohman	a75e4a62e6	Change the names used for internal labels to use the current function symbol name instead of a codegen-assigned function number. Thanks Evan! :-) llvm-svn: 42908	2007-10-12 14:53:36 +00:00
Evan Cheng	51791564b0	Doh. llvm-svn: 42901	2007-10-12 09:10:27 +00:00
Evan Cheng	947b4a6c3d	EXTRACT_SUBREG test case. llvm-svn: 42900	2007-10-12 09:03:31 +00:00
Arnold Schwaighofer	f1e49dd41d	Added missing -march=x86 flag. llvm-svn: 42893	2007-10-12 07:49:48 +00:00
Dan Gohman	ab5c3ed0d1	Add intrinsics for sin, cos, and pow. These use llvm_anyfloat_ty, and so may be overloaded with vector types. And add a testcase for codegen for these. llvm-svn: 42885	2007-10-12 00:01:22 +00:00
Dan Gohman	9a70be46f1	Add an explicit target triple to make this test behave as expected on non-Apple hosts. And use the count script instead of wc + grep. llvm-svn: 42878	2007-10-11 23:04:36 +00:00
Arnold Schwaighofer	d47210011e	Added tail call optimization to the x86 back end. It can be enabled by passing -tailcallopt to llc. The optimization is performed if the following conditions are satisfied: * caller/callee are fastcc * elf/pic is disabled OR elf/pic enabled + callee is in module + callee has visibility protected or hidden llvm-svn: 42870	2007-10-11 19:40:01 +00:00
Dan Gohman	708e76e663	These two tests now require only two multiply instructions, instead of four. llvm-svn: 42784	2007-10-09 15:39:37 +00:00
Evan Cheng	25b65542d9	Update test. llvm-svn: 42775	2007-10-08 22:20:32 +00:00
Dan Gohman	9da3bddf43	These two tests now require only three multiply instructions, instead of four. llvm-svn: 42765	2007-10-08 20:48:12 +00:00
Dale Johannesen	b600202c68	Make test work on non-x86 hosts. llvm-svn: 42671	2007-10-06 01:22:39 +00:00
Evan Cheng	0a642eaa62	Test case for 3-address conversion. llvm-svn: 42664	2007-10-05 23:33:09 +00:00
Evan Cheng	dc467c6323	Enable convertToThreeAddress for X86 by default. llvm-svn: 42655	2007-10-05 22:31:10 +00:00
Evan Cheng	6fd2606ff5	New test case. llvm-svn: 42628	2007-10-05 01:44:22 +00:00
Evan Cheng	1d3c836933	-pre-RA-sched=none, simple, simple-noitin are gone. llvm-svn: 42505	2007-10-01 22:17:20 +00:00
Dan Gohman	02f80006f8	Teach SplitVectorOp how to split INSERT_VECTOR_ELT. llvm-svn: 42457	2007-09-28 23:53:40 +00:00
Rafael Espindola	01b306e575	Refactor the memcpy lowering for the x86 target. The only generated code difference is that now we call memcpy when the size of the array is unknown. This matches GCC behavior and is better since the run time value can be arbitrarily large. llvm-svn: 42433	2007-09-28 12:53:01 +00:00
Dale Johannesen	e61886cee4	Add sqrt and powi intrinsics for long double. llvm-svn: 42423	2007-09-28 01:08:20 +00:00
Dale Johannesen	57b6470e8b	Modernize fabs.ll, add long double. Add tests for direct codegen of fsin/fcos. llvm-svn: 42369	2007-09-26 21:12:10 +00:00
Dan Gohman	1bb346f9f1	When both x/y and x%y are needed (x and y both scalar integer), compute both results with a single div or idiv instruction. This uses new X86ISD nodes for DIV and IDIV which are introduced during the legalize phase so that the SelectionDAG's CSE can automatically eliminate redundant computations. llvm-svn: 42308	2007-09-25 18:23:27 +00:00
Dale Johannesen	fe773726a9	Some tests for APFloat conversions. llvm-svn: 42303	2007-09-25 17:50:55 +00:00
Evan Cheng	dab211c87b	Forgot to check in the changes. Fix test case so it doesn't break with any scheduling changes. llvm-svn: 42302	2007-09-25 17:47:38 +00:00

... 4 5 6 7 8 ...

750 Commits