llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
Dan Gohman	5a9c2a3434	Implement CBE support for first-class structs and array values, and insertvalue and extractvalue instructions. First-class array values are not trivial because C doesn't support them. The approach I took here is to wrap all arrays in structs. Feedback is welcome. The 2007-01-15-NamedArrayType.ll test needed to be modified because it has a "not grep" for a string that now exists, because array types now have associated struct types, and those struct types have names. llvm-svn: 51881	2008-06-02 21:30:49 +00:00
Dan Gohman	385b7d76ed	Fix the position of MemOperands in nodes that use variadic_ops in DAGISelEmitter output. This bug was recently uncovered by the addition of patterns for CALL32m and CALL64m, which are nodes that now have both MemOperands and variadic_ops. This bug was especially visible with PIC in various configurations, because the new patterns are matching the indirect call code used in many PIC configurations. llvm-svn: 51877	2008-06-02 17:40:38 +00:00
Wojciech Matyjewicz	06e4c8a420	Fixes PR2395. Looking for a constant in a GEP tail (when the first GEP is longer than the second one) should stop after finding one. Added break instruction guarantees it. It also changes difference between offsets to absolute value of this difference in the condition. llvm-svn: 51875	2008-06-02 17:26:12 +00:00
Owen Anderson	7700de3137	Fix two issues that Eli Friedman pointed out, where would misoptimized code like: char a[200]; init(a, a+200); OR int a[200]; char* b = (char)a; char c = (char*)a; foo(b, c); llvm-svn: 51850	2008-06-01 22:26:26 +00:00
Owen Anderson	d194f76cb4	Test for PR2401 llvm-svn: 51849	2008-06-01 21:55:55 +00:00
Duncan Sands	d14212a3e1	When simplifying a call to a bitcast function, tighten up the conditions for performing the transform when only the function declaration is available: no longer allow turning i32 into i64 for example. Only allow changing between pointer types, and between pointer types and integers of the same size. For return values ptr -> intptr was already allowed; I added ptr -> ptr and intptr -> ptr while there. As shown by a recent objc testcase, changing the way parameters/return values are passed can be fatal when calling code written in assembler that directly manipulates call arguments and return values unless the transform has no impact on the way they are passed at the codegen level. While it is possible to imagine an ABI that treats integers of pointer size differently to pointers, I don't think LLVM supports any so the transform should now be safe while still being useful. llvm-svn: 51834	2008-06-01 07:38:42 +00:00
Chris Lattner	da1e2c8fa3	update this patch to handle an extraneous &1. This should be pulled into the 2.3 release branch. llvm-svn: 51824	2008-05-31 19:50:53 +00:00
Nick Lewycky	1bcd80adf7	Peer through sext/zext when looking for not(cmp). llvm-svn: 51819	2008-05-31 19:01:33 +00:00
Nick Lewycky	b30afdb62b	Add more i1 optimizations. add, sub, mul, s/udiv on i1 are now simplified away. llvm-svn: 51817	2008-05-31 17:59:52 +00:00
Nick Lewycky	cdcdcddc85	Adding i1 is always Xor. llvm-svn: 51816	2008-05-31 17:10:28 +00:00
Chris Lattner	43a47ddd89	Fix the CBE's handling of instructions whose result is an i1. Previously, we did not truncate the value down to i1 with (x&1). This caused a problem when the computation of x was nontrivial, for example, "add i1 1, 1" would return 2 instead of 0. This makes the testcase compile into: ... llvm_cbe_t = (((llvm_cbe_r == 0u) + (llvm_cbe_r == 0u))&1); llvm_cbe_u = (((unsigned int )(bool )llvm_cbe_t)); ... instead of: ... llvm_cbe_t = ((llvm_cbe_r == 0u) + (llvm_cbe_r == 0u)); llvm_cbe_u = (((unsigned int )(bool )llvm_cbe_t)); ... This fixes a miscompilation of mediabench/adpcm/rawdaudio/rawdaudio and 403.gcc with the CBE, regressions from LLVM 2.2. Tanya, please pull this into the release branch. llvm-svn: 51813	2008-05-31 09:23:55 +00:00
Dan Gohman	ac5c3382fe	IR, bitcode reader, bitcode writer, and asmparser changes to insertvalue and extractvalue to use constant indices instead of Value* indices. And begin updating LangRef.html. There's definately more to come here, but I'm checking this basic support in now to make it available to people who are interested. llvm-svn: 51806	2008-05-31 00:58:22 +00:00
Mikhail Glushenkov	9db02580c5	Fix the -opt switch and add a test case for it. llvm-svn: 51784	2008-05-30 19:56:27 +00:00
Mikhail Glushenkov	4d71bea2c9	Fix: 'sink' handling was broken. llvm-svn: 51750	2008-05-30 06:23:29 +00:00
Nick Lewycky	a02482cfaa	Unbreak this test. llvm-svn: 51726	2008-05-30 05:02:37 +00:00
Dan Gohman	aa8fcd5657	Add patterns for CALL32m and CALL64m. They aren't matched in most cases due to an isel deficiency already noted in lib/Target/X86/README.txt, but they can be matched in this fold-call.ll testcase, for example. This is interesting mainly because it exposes a tricky tblgen bug; tblgen was incorrectly computing the starting index for variable_ops in the case of a complex pattern. llvm-svn: 51706	2008-05-29 21:50:34 +00:00
Dan Gohman	e256337a1a	Expand small memmovs using inline code. Set the X86 threshold for expanding memmove to a more plausible value, now that it's actually being used. llvm-svn: 51696	2008-05-29 19:42:22 +00:00
Anton Korobeynikov	eb3cd5e822	For PR1338: Rename test dirs llvm-svn: 51695	2008-05-29 19:17:15 +00:00
Owen Anderson	2a0090d9bc	Move these tests into the proper directory. llvm-svn: 51685	2008-05-29 16:30:29 +00:00
Owen Anderson	bd3940abc7	Replace the old ADCE implementation with a new one that more simply solves the one case that ADCE catches that normal DCE doesn't: non-induction variable loop computations. This implementation handles this problem without using postdominators. llvm-svn: 51668	2008-05-29 08:45:13 +00:00
Evan Cheng	04c0915a2f	Implement vector shift up / down and insert zero with ps{rl}lq / ps{rl}ldq. llvm-svn: 51667	2008-05-29 08:22:04 +00:00
Evan Cheng	f2e38956ff	Add nounwind. llvm-svn: 51665	2008-05-29 07:09:24 +00:00
Evan Cheng	cd45b11bc1	Fix PR2289: vr defined by multiple implicit_def as result of coalescing. llvm-svn: 51648	2008-05-28 17:40:10 +00:00
Evan Cheng	591b57edd6	Teach local register allocator to deal with landing pad MBB's. llvm-svn: 51647	2008-05-28 17:22:32 +00:00
Chris Lattner	7a7da4f9c3	Implement PR2370: memmove(x,x,size) -> noop. llvm-svn: 51636	2008-05-28 05:30:41 +00:00
Dan Gohman	568685ffa7	Specify a target so that this tests tests what it's intended to test. llvm-svn: 51600	2008-05-27 17:55:57 +00:00
Dan Gohman	3ba9d77adb	Make this test independent of the target-triple; the stack alignment is specifically what this test depends on. llvm-svn: 51599	2008-05-27 17:44:23 +00:00
Nick Lewycky	0ba4adf4ef	Whoops -- forgot PR reference on this test. llvm-svn: 51569	2008-05-26 20:23:33 +00:00
Nick Lewycky	c096899392	The Linux ABI emits an extra "movl %esp, %ebp" in function prologue and sometimes a "mov %ebp, %esp" in the epilogue. Force these tests that rely on counting 'mov' to use i686-apple-darwin8.8.0 where they were written. llvm-svn: 51568	2008-05-26 20:18:56 +00:00
Nick Lewycky	7116ad5a18	Use {} instead of "" in RUN lines. llvm-svn: 51561	2008-05-26 01:27:08 +00:00
Nick Lewycky	f24743a6bb	Don't treat values as signed when looking at loop steppings in HowForToNonZero. llvm-svn: 51560	2008-05-25 23:43:32 +00:00
Nick Lewycky	744dad8004	"ret (constexpr)" can't be folded into a Constant. Add a method to Analysis/ConstantFolding to fold ConstantExpr's, then make instcombine use it to try to use targetdata to fold constant expressions on void instructions. Also extend the icmp(inttoptr, inttoptr) folding to handle the case where int size != ptr size. llvm-svn: 51559	2008-05-25 20:56:15 +00:00
Chris Lattner	3def8b4e53	Fix a serious brain-o. Obviously no-one reviewed my patch :( This fixes PR2359 llvm-svn: 51536	2008-05-24 04:06:28 +00:00
Chris Lattner	bde5fd685d	Fix PR2358 by resolving calls with undef arguments to overdefined. llvm-svn: 51535	2008-05-24 03:59:33 +00:00
Evan Cheng	e5e0b4660d	Eliminate x86.sse2.punpckh.qdq and x86.sse2.punpckl.qdq. llvm-svn: 51533	2008-05-24 02:56:30 +00:00
Evan Cheng	564238c841	Eliminate x86.sse2.movs.d, x86.sse2.shuf.pd, x86.sse2.unpckh.pd, and x86.sse2.unpckl.pd intrinsics. These will be lowered into shuffles. llvm-svn: 51531	2008-05-24 02:14:05 +00:00
Evan Cheng	e9c1c96f7b	New loadl_pd and loadh_pd tests. llvm-svn: 51525	2008-05-24 00:10:02 +00:00
Evan Cheng	365e0f3932	Autoupgrade x86.sse2.loadh.pd and x86.sse2.loadl.pd. llvm-svn: 51523	2008-05-24 00:08:39 +00:00
Dan Gohman	abbe3d47ab	Don't silently truncate array extents to 32 bits. llvm-svn: 51505	2008-05-23 21:40:55 +00:00
Evan Cheng	4f660778f0	Use movlps / movhps to modify low / high half of 16-byet memory location. llvm-svn: 51501	2008-05-23 21:23:16 +00:00
Dan Gohman	2412469191	Remove lingering references to .llx and .tr in the tests. llvm-svn: 51500	2008-05-23 21:15:35 +00:00
Dan Gohman	6cc0b4f262	Use PMULDQ for v2i64 multiplies when SSE4.1 is available. And add load-folding table entries for PMULDQ and PMULLD. llvm-svn: 51489	2008-05-23 17:49:40 +00:00
Matthijs Kooijman	cf417144f6	Restucture a part of the SimplifyCFG pass and include a testcase. The SimplifyCFG pass looks at basic blocks that contain only phi nodes, followed by an unconditional branch. In a lot of cases, such a block (BB) can be merged into their successor (Succ). This merging is performed by TryToSimplifyUncondBranchFromEmptyBlock. It does this by taking all phi nodes in the succesor block Succ and expanding them to include the predecessors of BB. Furthermore, any phi nodes in BB are moved to Succ and expanded to include the predecessors of Succ as well. Before attempting this merge, CanPropagatePredecessorsForPHIs checks to see if all phi nodes can be properly merged. All functional changes are made to this function, only comments were updated in TryToSimplifyUncondBranchFromEmptyBlock. In the original code, CanPropagatePredecessorsForPHIs looks quite convoluted and more like stack of checks added to handle different kinds of situations than a comprehensive check. In particular the first check in the function did some value checking for the case that BB and Succ have a common predecessor, while the last check in the function simply rejected all cases where BB and Succ have a common predecessor. The first check was still useful in the case that BB did not contain any phi nodes at all, though, so it was not completely useless. Now, CanPropagatePredecessorsForPHIs is restructured to to look a lot more similar to the code that actually performs the merge. Both functions now look at the same phi nodes in about the same order. Any conflicts (phi nodes with different values for the same source) that could arise from merging or moving phi nodes are detected. If no conflicts are found, the merge can happen. Apart from only restructuring the checks, two main changes in functionality happened. Firstly, the old code rejected blocks with common predecessors in most cases. The new code performs some extra checks so common predecessors can be handled in a lot of cases. Wherever common predecessors still pose problems, the blocks are left untouched. Secondly, the old code rejected the merge when values (phi nodes) from BB were used in any other place than Succ. However, it does not seem that there is any situation that would require this check. Even more, this can be proven. Consider that BB is a block containing of a single phi node "%a" and a branch to Succ. Now, since the definition of %a will dominate all of its uses, BB will dominate all blocks that use %a. Furthermore, since the branch from BB to Succ is unconditional, Succ will also dominate all uses of %a. Now, assume that one predecessor of Succ is not dominated by BB (and thus not dominated by Succ). Since at least one use of %a (but in reality all of them) is reachable from Succ, you could end up at a use of %a without passing through it's definition in BB (by coming from X through Succ). This is a contradiction, meaning that our original assumption is wrong. Thus, all predecessors of Succ must also be dominated by BB (and thus also by Succ). This means that moving the phi node %a from BB to Succ does not pose any problems when the two blocks are merged, and any use checks are not needed. llvm-svn: 51478	2008-05-23 09:09:41 +00:00
Nick Lewycky	6a16ace643	Constant integer vectors may also be negated. llvm-svn: 51476	2008-05-23 04:54:45 +00:00
Nick Lewycky	bd2da8098d	Revert X + X --> X * 2 optz'n which pessimizes heavily on x86. llvm-svn: 51474	2008-05-23 04:34:58 +00:00
Nick Lewycky	427209006f	Implement X + X for vectors. llvm-svn: 51472	2008-05-23 04:14:51 +00:00
Nick Lewycky	e62259c369	Fix a recently added optimization to not crash on vectors. llvm-svn: 51471	2008-05-23 03:26:47 +00:00
Dan Gohman	67e1a58e22	Generalize the new code in instcombine's ComputeNumSignBits for handling and/or to handle more cases (such as this add-sitofp.ll testcase), and port it to selectiondag's ComputeNumSignBits. llvm-svn: 51469	2008-05-23 02:28:01 +00:00
Dan Gohman	c7007dd0dc	Make structs and arrays first-class types, and add assembly and bitcode support for the extractvalue and insertvalue instructions and constant expressions. Note that this does not yet include CodeGen support. llvm-svn: 51468	2008-05-23 01:55:30 +00:00
Evan Cheng	097e95b1f7	Bug: rcpps can only folds a load if the address is 16-byte aligned. Fixed many 'ps' load folding patterns in X86InstrSSE.td which are missing the proper alignment checks. Also fixed some 80 col. violations. llvm-svn: 51462	2008-05-23 00:37:07 +00:00
Evan Cheng	dc3a3d3a2c	Add a couple of test cases. llvm-svn: 51441	2008-05-22 21:19:19 +00:00
Evan Cheng	d1373cd497	Add missing patterns. llvm-svn: 51435	2008-05-22 18:56:56 +00:00
Chris Lattner	6a45cf9dd6	Add support for multiple-return values in inline asm. This should get inline asm working as well as it did previously with the CBE with the new MRV support for inline asm. llvm-svn: 51420	2008-05-22 06:19:37 +00:00
Chris Lattner	477239c56d	testcase for PR2267 llvm-svn: 51408	2008-05-22 04:45:22 +00:00
Evan Cheng	8e02953de8	Fix PR2343. An interesting coalescer bug. BB1: vr1025 = copy vr1024 .. BB2: vr1024 = op = op vr1025 <loop eventually branch back to BB1> Even though vr1025 is copied from vr1024, it's not safe to coalesced them since live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop. llvm-svn: 51394	2008-05-21 22:34:12 +00:00
Gabor Greif	4a39cea7e7	resurrect lost tests by renaming them to not end with .tr llvm-svn: 51375	2008-05-21 14:48:24 +00:00
Gabor Greif	b03785f0cd	Eliminate questionable syntax for stdin redirection. This probably also speeds things up a bit. llvm-svn: 51357	2008-05-20 22:07:21 +00:00
Chris Lattner	821dc30131	Fix PR2346 by marking vaarg as volatile so that licm doesn't try to hoist them. llvm-svn: 51356	2008-05-20 22:05:28 +00:00
Dan Gohman	7d78d53d2a	Oops, commit the version of this test that actually works. llvm-svn: 51351	2008-05-20 21:19:36 +00:00
Dan Gohman	b48d4a75f6	Port SelectionDAG's ComputeNumSignBits-using code to instcombine, now that instcombine also has ComputeNumSignBits. llvm-svn: 51350	2008-05-20 21:01:12 +00:00
Gabor Greif	807c2df887	sabre brings to my attention that the 'tr' suffix is also obsolete llvm-svn: 51349	2008-05-20 21:00:03 +00:00
Gabor Greif	d8a4dbb5da	Rename the last test with .llx extension to .ll, resolve duplicate test by renaming to isnan2. Now that no test has llx ending there is no need to search for them from dg.exp too. llvm-svn: 51328	2008-05-20 19:52:04 +00:00
Evan Cheng	55e3957c96	More local spiller complexity! If local spiller optimization turns some instruction into an identity copy, it will be removed. If the output register happens to be dead (and source is obviously killed), transfer the kill / dead information to last use / def in the same MBB. llvm-svn: 51306	2008-05-20 08:13:21 +00:00
Evan Cheng	408425f0e0	Don't spill dead def. llvm-svn: 51305	2008-05-20 08:10:37 +00:00
Chris Lattner	b387fd90fc	Teach instcombine 4 new xforms: (add (sext x), cst) --> (sext (add x, cst')) (add (sext x), (sext y)) --> (sext (add int x, y)) (add double (sitofp x), fpcst) --> (sitofp (add int x, intcst)) (add double (sitofp x), (sitofp y)) --> (sitofp (add int x, y)) This generally reduces conversions. For example MiBench/telecomm-gsm gets these simplifications: HACK2: %tmp67.i142.i.i = sext i16 %tmp6.i141.i.i to i32 ; <i32> [#uses=1] %tmp23.i139.i.i = sext i16 %tmp2.i138.i.i to i32 ; <i32> [#uses=1] %tmp8.i143.i.i = add i32 %tmp67.i142.i.i, %tmp23.i139.i.i ; <i32> [#uses=3] HACK2: %tmp67.i121.i.i = sext i16 %tmp6.i120.i.i to i32 ; <i32> [#uses=1] %tmp23.i118.i.i = sext i16 %tmp2.i117.i.i to i32 ; <i32> [#uses=1] %tmp8.i122.i.i = add i32 %tmp67.i121.i.i, %tmp23.i118.i.i ; <i32> [#uses=3] HACK2: %tmp67.i.i190.i = sext i16 %tmp6.i.i189.i to i32 ; <i32> [#uses=1] %tmp23.i.i187.i = sext i16 %tmp2.i.i186.i to i32 ; <i32> [#uses=1] %tmp8.i.i191.i = add i32 %tmp67.i.i190.i, %tmp23.i.i187.i ; <i32> [#uses=3] HACK2: %tmp67.i173.i.i.i = sext i16 %tmp6.i172.i.i.i to i32 ; <i32> [#uses=1] %tmp23.i170.i.i.i = sext i16 %tmp2.i169.i.i.i to i32 ; <i32> [#uses=1] %tmp8.i174.i.i.i = add i32 %tmp67.i173.i.i.i, %tmp23.i170.i.i.i ; <i32> [#uses=3] HACK2: %tmp67.i152.i.i.i = sext i16 %tmp6.i151.i.i.i to i32 ; <i32> [#uses=1] %tmp23.i149.i.i.i = sext i16 %tmp2.i148.i.i.i to i32 ; <i32> [#uses=1] %tmp8.i153.i.i.i = add i32 %tmp67.i152.i.i.i, %tmp23.i149.i.i.i ; <i32> [#uses=3] HACK2: %tmp67.i.i.i.i = sext i16 %tmp6.i.i.i.i to i32 ; <i32> [#uses=1] %tmp23.i.i5.i.i = sext i16 %tmp2.i.i.i.i to i32 ; <i32> [#uses=1] %tmp8.i.i7.i.i = add i32 %tmp67.i.i.i.i, %tmp23.i.i5.i.i ; <i32> [#uses=3] This also fixes a bug in ComputeNumSignBits handling select and makes it more aggressive with and/or. llvm-svn: 51302	2008-05-20 05:46:13 +00:00
Dan Gohman	7681889f75	Run vortex-bug as x86-64, which is what the original bug was triggered on. llvm-svn: 51289	2008-05-20 00:54:39 +00:00
Devang Patel	9f385d71c2	Do not erase induction variable increment if it is used outside the loop. llvm-svn: 51280	2008-05-19 22:23:55 +00:00
Chris Lattner	63c384df1e	convert fptosi(sitofp x) -> x if the fp value has enough bits in its mantissa to accurately represent the integer. This triggers 9 times in 471.omnetpp, though 8 of those seem to be inlined from the same place. llvm-svn: 51271	2008-05-19 20:25:04 +00:00
Chris Lattner	1435b94f62	Fold FP comparisons where one operand is converted from an integer type and the other operand is a constant into integer comparisons. This happens surprisingly frequently (e.g. 10 times in 471.omnetpp), which are things like this: %tmp8283 = sitofp i32 %tmp82 to double %tmp1013 = fcmp ult double %tmp8283, 0.0 Clearly comparing tmp82 against i32 0 is cheaper here. this also triggers 8 times in gobmk, including this one: %tmp375376 = sitofp i32 %tmp375 to double %tmp377 = fcmp ogt double %tmp375376, 8.150000e+01 which is comparing an integer against 81.5 :). llvm-svn: 51268	2008-05-19 20:18:56 +00:00
Chris Lattner	510a6b249c	be more aggressive about transforming add -> or when the operands have no intersecting bits. This triggers all over the place, for example in lencode, with adds of stuff like: %tmp580 = mul i32 %tmp579, 2 %tmp582 = and i32 %b8, 1 and %tmp28 = shl i32 %abs.i, 1 %sign.0 = select i1 %tmp23, i32 1, i32 0 and %tmp344 = shl i32 %tmp343, 2 %tmp346 = and i32 %tmp96, 3 etc. llvm-svn: 51263	2008-05-19 20:01:56 +00:00
Duncan Sands	1c11cba7ec	Check that always_inline functions are inlined whether or not -funit-at-a-time is used (C++ uses it, C doesn't) - it was working before only when not doing unit-at-a-time. llvm-svn: 51258	2008-05-19 16:44:44 +00:00
Duncan Sands	7b84c36791	Fix PR2341 - when the length is 4 use an i32 not an i16! Cleaned up trailing whitespace while there. llvm-svn: 51240	2008-05-19 09:27:24 +00:00
Chris Lattner	8c0f0a0e6c	Fix PR2339 llvm-svn: 51226	2008-05-18 04:11:26 +00:00
Chris Lattner	8871489ae7	remove empty file? llvm-svn: 51225	2008-05-18 04:10:18 +00:00
Nick Lewycky	46e3a168c0	Revert constant-folding change that will miscompile in some cases. llvm-svn: 51223	2008-05-17 19:00:05 +00:00
Nick Lewycky	1df40102a9	Constant fold inttoptr and ptrtoint. llvm-svn: 51216	2008-05-17 09:03:26 +00:00
Evan Cheng	76aaaf62e8	Fix test. llvm-svn: 51191	2008-05-16 17:08:51 +00:00
Owen Anderson	55b78bc887	Move this test from ADCE to loop deletion, where it is more appropriate. llvm-svn: 51181	2008-05-16 04:34:19 +00:00
Owen Anderson	79a25ff8ec	Use loop deletion instead of ADCE in these tests. llvm-svn: 51180	2008-05-16 04:33:37 +00:00
Owen Anderson	d282184d18	Use loop deletion instead of ADCE for removing loops. llvm-svn: 51178	2008-05-16 04:27:38 +00:00
Owen Anderson	3e607df2f8	Fix this test. It was testing broken behavior in that it required ADCE to eliminate a potentially infinite loop, which is undesirable. Instead, test the LICM behavior that we're really interested in. llvm-svn: 51177	2008-05-16 04:25:09 +00:00
Chris Lattner	00e8e1e258	implement PR2328. llvm-svn: 51176	2008-05-16 02:59:42 +00:00
Dale Johannesen	4e46c5601d	Use common where we mean common, not weak. llvm-svn: 51173	2008-05-16 00:52:30 +00:00
Dan Gohman	fe7f6bc9ce	Revert the change from r51157 in test/Verifier/2002-11-05-GetelementptrPointers.ll, which was incorrect. Instead, fix getIndexedType to not follow pointer types, as PointerType is a subclass of CompositeType. llvm-svn: 51171	2008-05-16 00:16:32 +00:00
Dan Gohman	2da4145cd8	Fix a bug in LoopStrengthReduce that caused it to emit IR with use-before-def. The problem comes up in code with multiple PHIs where one PHI is being rewritten in terms of the other, but the other needs to be casted first. LLVM rules requre the cast instruction to be inserted after any PHI instructions, but when instructions were inserted to replace the second PHI value with a function of the first, they were ended up going before the cast instruction. Avoid this problem by remembering the location of the cast instruction, when one is needed, and inserting the expansion of the new value after it. This fixes a bug that surfaced in 255.vortex on x86-64 when instcombine was removed from the middle of the loop optimization passes. llvm-svn: 51169	2008-05-15 23:26:57 +00:00
Dale Johannesen	247e20c532	Remove the S92 code, which really has nothing to do with what the test is testing; makes it pass again on ppc32. llvm-svn: 51167	2008-05-15 22:23:54 +00:00
Dale Johannesen	f464bece2d	Evan has implemented this on ppc, so run the test there. llvm-svn: 51166	2008-05-15 22:22:37 +00:00
Dan Gohman	821bf58428	IR support for extractvalue and insertvalue instructions. Also, begin moving toward making structs and arrays first-class types. llvm-svn: 51157	2008-05-15 19:50:34 +00:00
Bill Wendling	c1d9f9604b	Situations can arise when you have a function called that returns a 'void', but is bitcast to return a floating point value. The result of the instruction may not be used by the program afterwards, and LLVM will happily remove all instructions except the call. But, on some platforms, if a value is returned as a floating point, it may need to be removed from the stack (like x87). Thus, we can't get rid of the bitcast even if there isn't a use of the value. llvm-svn: 51134	2008-05-14 22:45:20 +00:00
Devang Patel	047ba6df54	Simplify internalize pass. Add test case. Patch by Matthijs Kooijman! llvm-svn: 51114	2008-05-14 20:01:01 +00:00
Dan Gohman	cd29e1fa60	When bit-twiddling CondCode values for integer comparisons produces SETOEQ, is it does with (SETEQ & SETULE), map it to SETEQ. llvm-svn: 51112	2008-05-14 18:17:09 +00:00
Tanya Lattner	ecd5a9390a	Check if llvm-gcc is available before running tests. Patch by Matthijs Kooijman! llvm-svn: 51108	2008-05-14 16:32:44 +00:00
Duncan Sands	d1ee8534e8	Make this test pass on x86-32 linux. llvm-svn: 51099	2008-05-14 09:46:01 +00:00
Dale Johannesen	676a1d026b	Fix for PR 2323, infinite loop in tail dup. llvm-svn: 51063	2008-05-13 20:06:43 +00:00
Evan Cheng	9e15622879	Instead of a vector load, shuffle and then extract an element. Load the element from address with an offset. pshufd $1, (%rdi), %xmm0 movd %xmm0, %eax => movl 4(%rdi), %eax llvm-svn: 51026	2008-05-13 08:35:03 +00:00
Owen Anderson	f67c06279b	Add a testcase for non-local CSE of read-only calls. llvm-svn: 51025	2008-05-13 08:17:44 +00:00
Evan Cheng	e4ee4c2870	On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16. llvm-svn: 51019	2008-05-13 00:54:02 +00:00
Evan Cheng	fcbdc8bd6e	Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other. llvm-svn: 51008	2008-05-12 23:04:07 +00:00
Dale Johannesen	b54491d31a	New test for tail merging llvm-svn: 51007	2008-05-12 22:59:44 +00:00
Mikhail Glushenkov	18508c5df4	Filter option names to escape symbols not allowed as C++ identifiers. Makes it possible to use options with names like "Wa,". Also fixes the -Wall option handling as a side-effect. llvm-svn: 50973	2008-05-12 16:33:06 +00:00
Duncan Sands	45b1810980	Testcase for PR2264. llvm-svn: 50965	2008-05-12 13:01:19 +00:00
Duncan Sands	15622620d3	Testcase for PR2303. llvm-svn: 50951	2008-05-10 16:43:10 +00:00
Evan Cheng	c19c639ad7	When transforming a vector_shuffle to a load, the base address must not be an undef. llvm-svn: 50940	2008-05-10 06:46:49 +00:00
Evan Cheng	fc4b8e1d96	Add nounwind. llvm-svn: 50931	2008-05-10 02:22:25 +00:00
Evan Cheng	cf4d5567d5	If all sources of a PHI node are defined by an implicit_def, just emit an implicit_def instead of a copy. llvm-svn: 50927	2008-05-10 00:17:50 +00:00
Evan Cheng	2adea48f7e	Add a pattern to do move the low element of a v4f32 and zero extend the rest. llvm-svn: 50922	2008-05-09 23:37:55 +00:00
Evan Cheng	3493e43afd	Handle a few more cases of folding load i64 into xmm and zero top bits. Note, some of the code will be moved into target independent part of DAG combiner in a subsequent patch. llvm-svn: 50918	2008-05-09 21:53:03 +00:00
Evan Cheng	a0688bf1cb	Simplify test. llvm-svn: 50911	2008-05-09 19:56:32 +00:00
Chris Lattner	02ca137915	Implement PR2298. This transforms: ~x < ~y --> y < x -x == -y --> x == y llvm-svn: 50882	2008-05-09 05:19:28 +00:00
Evan Cheng	f824b47188	Use movq to move low half of XMM register and zero-extend the rest. llvm-svn: 50874	2008-05-08 22:35:02 +00:00
Chris Lattner	4c1ef3628b	More than just loads can read from memory: readonly calls like strlen also need to be checked for memory modifying instructions before we can sink them. THis fixes the second half of PR2297. llvm-svn: 50860	2008-05-08 17:37:37 +00:00
Chris Lattner	cba8b4c7e8	Make instcombine's DSE respect loads as well as stores. It is not safe to delete the first store in: store x -> p load p store y -> p This is for PR2297. llvm-svn: 50859	2008-05-08 17:20:30 +00:00
Chris Lattner	9bf499a8b9	new testcase. llvm-svn: 50841	2008-05-08 04:55:51 +00:00
Evan Cheng	f97e716511	Handle vector move / load which zero the destination register top bits (i.e. movd, movq, movss (addr), movsd (addr)) with X86 specific dag combine. llvm-svn: 50838	2008-05-08 00:57:18 +00:00
Evan Cheng	7ff000c175	Add nounwind. llvm-svn: 50837	2008-05-07 22:59:08 +00:00
Evan Cheng	c86c035346	Yet another nasty spiller bug. %ecx = op store %cl<kill>, (addr) (addr) = op %al It's not safe to unfold the last operand and eliminate store even though %cl is marked kill. It's a sub-register use which means one of its super-register(s) may be used below. llvm-svn: 50794	2008-05-07 00:49:28 +00:00
Dan Gohman	6ea87fa437	Fix a bug in the ComputeMaskedBits logic for multiply. llvm-svn: 50793	2008-05-07 00:35:55 +00:00
Bill Wendling	3cb8ee3c80	Removing. llvm-svn: 50786	2008-05-06 23:56:22 +00:00
Anton Korobeynikov	a9ff11ea5f	Use target triple in tests, not 'realign-stack=0' option. Per request. llvm-svn: 50778	2008-05-06 23:09:29 +00:00
Owen Anderson	2dccdcf2f2	Testcase for r50770. llvm-svn: 50771	2008-05-06 21:01:34 +00:00
Mikhail Glushenkov	5a403195b3	Move test files around a bit - fixes the reported number of test cases. llvm-svn: 50761	2008-05-06 18:16:20 +00:00
Mikhail Glushenkov	3534026221	Use edge weights to choose the right linker based on input language names. llvm-svn: 50759	2008-05-06 18:15:12 +00:00
Mikhail Glushenkov	78aa308f84	Add a --linker command-line option, make all tests pass. llvm-svn: 50755	2008-05-06 18:13:00 +00:00
Mikhail Glushenkov	ffc7ee62c0	Add two (currently failing) tests. llvm-svn: 50752	2008-05-06 18:11:21 +00:00
Mikhail Glushenkov	dc50e8c5fa	Take object file as input and handle files with the same name correctly. llvm-svn: 50749	2008-05-06 18:10:20 +00:00
Mikhail Glushenkov	40b652c238	First small tests for llvmc2. llvm-svn: 50734	2008-05-06 17:24:54 +00:00
Duncan Sands	c96ff82e3e	Testcase for PR2292. llvm-svn: 50718	2008-05-06 14:56:40 +00:00
Evan Cheng	a84ed06284	Fix PR2287. Darwin passes mmx values in register in 64-mode, not Linux. llvm-svn: 50716	2008-05-06 07:23:50 +00:00
Dan Gohman	faf9df7227	Correct the value of LowBits in srem and urem handling in ComputeMaskedBits. llvm-svn: 50692	2008-05-06 00:51:48 +00:00
Chris Lattner	50d16c2939	Fix a crash when threading a block that includes a MRV call result. DemoteRegToStack doesn't work with MRVs yet, because it relies on the ability to load/store things. This fixes PR2285. llvm-svn: 50667	2008-05-05 20:21:22 +00:00
Mon P Wang	84a269e023	Added addition atomic instrinsics and, or, xor, min, and max. llvm-svn: 50663	2008-05-05 19:05:59 +00:00
Chris Lattner	ca94848f66	no need for eh info llvm-svn: 50658	2008-05-05 18:24:33 +00:00
Dan Gohman	c860d9c77c	Add AsmPrinter support for emitting a directive to declare that the code being generated does not require an executable stack. Also, add target-specific code to make use of this on Linux on x86. llvm-svn: 50634	2008-05-05 00:28:39 +00:00
Owen Anderson	611b415d12	Fix PR1098 by correcting the postdominators analysis. Patch by Florian Brandner. llvm-svn: 50628	2008-05-04 21:07:35 +00:00
Evan Cheng	a7747df955	Select vector shift with non-immediate i32 shift amount operand by first moving the operand into the right register. llvm-svn: 50619	2008-05-04 09:15:50 +00:00
Evan Cheng	c1c2adbfc6	Add separate intrinsics for MMX / SSE shifts with i32 integer operands. This allow us to simplify the horribly complicated matching code. llvm-svn: 50601	2008-05-03 00:52:09 +00:00
Chris Lattner	5346b6b0a7	verify builtin optimization works like gcc. llvm-svn: 50594	2008-05-02 22:07:34 +00:00
Dan Gohman	27156711ef	Fix a mistake in the computation of leading zeros for udiv. llvm-svn: 50591	2008-05-02 21:30:02 +00:00
Chris Lattner	96467cc665	strength reduce exp2 into ldexp, rdar://5852514 llvm-svn: 50586	2008-05-02 18:43:35 +00:00
Chris Lattner	8cc3e89b87	specify an arch for non-x86 hosts. llvm-svn: 50576	2008-05-02 15:11:58 +00:00
Dan Gohman	04e2b94842	Update old-style syntax in some "not grep" tests. llvm-svn: 50560	2008-05-01 23:50:07 +00:00
Dale Johannesen	4ab8b00dfa	New test for bug fixed in 50545. llvm-svn: 50548	2008-05-01 22:50:14 +00:00
Dan Gohman	793c9fed45	Fix an overaggressive SimplifyDemandedBits optimization on urem. This fixes the 254.gap regression on x86 and the 403.gcc regression on x86-64. llvm-svn: 50537	2008-05-01 19:13:24 +00:00
Bill Wendling	81b5245cec	Adding testcase. llvm-svn: 50536	2008-05-01 18:41:09 +00:00
Chris Lattner	e9bbe8e6b6	don't randomly miscompile seto/setuo just because we are in ffastmath mode. This fixes rdar://5902801, a miscompilation of gcc.dg/builtins-8.c. Bill, please pull this into Tak. llvm-svn: 50523	2008-05-01 07:26:11 +00:00
Chris Lattner	9d73228708	fix typo llvm-svn: 50519	2008-05-01 06:16:48 +00:00
Chris Lattner	9a678d6f55	instcombine does memset optzns. llvm-svn: 50518	2008-05-01 06:16:38 +00:00
Chris Lattner	926efd9174	simplifylibcalls doesn't optimize llvm.memmove, instcombine does. llvm-svn: 50517	2008-05-01 06:14:24 +00:00
Chris Lattner	d4bf588b85	move some tests from libcall optimizer suite. llvm-svn: 50516	2008-05-01 06:13:48 +00:00
Arnold Schwaighofer	29a654857a	Really commit the test checking the argument lowering behaviour on x86-64 :). llvm-svn: 50478	2008-04-30 09:19:47 +00:00
Arnold Schwaighofer	f58a35e2ec	Tail call optimization improvements: Move platform independent code (lowering of possibly overwritten arguments, check for tail call optimization eligibility) from target X86ISelectionLowering.cpp to TargetLowering.h and SelectionDAGISel.cpp. Initial PowerPC tail call implementation: Support ppc32 implemented and tested (passes my tests and test-suite llvm-test). Support ppc64 implemented and half tested (passes my tests). On ppc tail call optimization is performed if caller and callee are fastcc call is a tail call (in tail call position, call followed by ret) no variable argument lists or byval arguments option -tailcallopt is enabled Supported: * non pic tail calls on linux/darwin * module-local tail calls on linux(PIC/GOT)/darwin(PIC) * inter-module tail calls on darwin(PIC) If constraints are not met a normal call will be emitted. A test checking the argument lowering behaviour on x86-64 was added. llvm-svn: 50477	2008-04-30 09:16:33 +00:00
Owen Anderson	f8c80ca156	Move this test to LoopDeletion, where it now passes. llvm-svn: 50474	2008-04-30 07:17:22 +00:00
Chris Lattner	15195e00ee	move lowering of llvm.memset -> store from simplify libcalls to instcombine. llvm-svn: 50472	2008-04-30 06:39:11 +00:00
Chris Lattner	ce01263bff	no reason for simplifylibcalls to simplify intrinsics, instcombine does a fine job. llvm-svn: 50470	2008-04-30 06:12:15 +00:00
Chris Lattner	a62a4d407a	remove redundant check. llvm-svn: 50469	2008-04-30 06:06:37 +00:00
Owen Anderson	2caa79ae70	Fix a bug in memcpyopt where the memcpy-memcpy transform was never being applied because we were checking for it in the wrong order. This caused a miscompilation because the return slot optimization assumes that the call it is dealing with is NOT a memcpy. llvm-svn: 50444	2008-04-29 21:26:06 +00:00
Chris Lattner	5bd55b0885	don't eliminate load from volatile value on paths where the load is dead. This fixes the second half of PR2262 llvm-svn: 50430	2008-04-29 17:28:22 +00:00
Chris Lattner	4b5d48a3f0	make this test reduced and valid llvm-svn: 50429	2008-04-29 17:25:32 +00:00
Chris Lattner	7099f3c400	fix a subtle volatile handling bug. llvm-svn: 50428	2008-04-29 17:13:43 +00:00
Chris Lattner	b3972afe89	new testcase for PR2094. The inline asms should not pin allocas to the stack anymore. llvm-svn: 50397	2008-04-29 05:53:29 +00:00
Chris Lattner	51fe8415da	don't delete the last store to an alloca if the store is volatile. llvm-svn: 50390	2008-04-29 04:58:38 +00:00
Chris Lattner	0f63b8fecc	make the vector conversion magic handle multiple results. We now compile test2/test3 to: _test2: ## InlineAsm Start set %xmm0, %xmm1 ## InlineAsm End addps %xmm1, %xmm0 ret _test3: ## InlineAsm Start set %xmm0, %xmm1 ## InlineAsm End paddd %xmm1, %xmm0 ret as expected. llvm-svn: 50389	2008-04-29 04:48:56 +00:00
Chris Lattner	e75d09711d	add support for multiple return values in inline asm. This is a step towards PR2094. It now compiles the attached .ll file to: _sad16_sse2: movslq %ecx, %rax ## InlineAsm Start %ecx %rdx %rax %rax %r8d %rdx %rsi ## InlineAsm End ## InlineAsm Start set %eax ## InlineAsm End ret which is pretty decent for a 3 output, 4 input asm. llvm-svn: 50386	2008-04-29 04:29:54 +00:00
Evan Cheng	381b094a2b	Another extract_subreg coalescing bug. e.g. vr1024<2> extract_subreg vr1025, 2 If vr1024 do not have the same register class as vr1025, it's not safe to coalesce this away. For example, vr1024 might be a GPR32 while vr1025 might be a GPR64. llvm-svn: 50385	2008-04-29 01:41:44 +00:00
Evan Cheng	8696eb18c1	Add -march=x86. llvm-svn: 50380	2008-04-28 23:31:41 +00:00
Dan Gohman	fa033a0fbd	Update and_ops.ll according to the recent dagcombiner changes. Add a new test, and_ops_more.ll, which is XFAIL'd, to record the parts of and_ops.ll that were affected by this change. llvm-svn: 50379	2008-04-28 23:26:22 +00:00
Evan Cheng	3167c716f2	Test case. llvm-svn: 50377	2008-04-28 22:14:34 +00:00
Dan Gohman	9e4db7f0bd	Fix DSE to not eliminate volatile loads with no uses. llvm-svn: 50370	2008-04-28 19:51:27 +00:00
Dan Gohman	1b7238e6e4	Teach InstCombine's ComputeMaskedBits what SelectionDAG's ComputeMaskedBits knows about cttz, ctlz, and ctpop. Teach SelectionDAG's ComputeMaskedBits what InstCombine's knows about SRem. And teach them both some things about high bits in Mul, UDiv, URem, and Sub. This allows instcombine and dagcombine to eliminate sign-extension operations in several new cases. llvm-svn: 50358	2008-04-28 17:02:21 +00:00
Chris Lattner	ede7e89144	Fix PR2256, yet another miscompilation in simplifycfg of i multiple return values. Bill, please pull this into Tak. llvm-svn: 50332	2008-04-28 00:19:07 +00:00
Chris Lattner	39a4281deb	Implement a signficant optimization for inline asm: When choosing between constraints with multiple options, like "ir", test to see if we can use the 'i' constraint and go with that if possible. This produces more optimal ASM in all cases (sparing a register and an instruction to load it), and fixes inline asm like this: void test () { asm volatile (" %c0 %1 " : : "imr" (42), "imr"(14)); } Previously we would dump "42" into a memory location (which is ok for the 'm' constraint) which would cause a problem because the 'c' modifier is not valid on memory operands. Isn't it great how inline asm turns 'missed optimization' into 'compile failed'?? Incidentally, this was the todo in PowerPC/2007-04-24-InlineAsm-I-Modifier.ll Please do NOT pull this into Tak. llvm-svn: 50315	2008-04-27 00:37:18 +00:00
Chris Lattner	2798e42a9f	When SRoA'ing a global variable, make sure the new globals get the appropriate alignment. This fixes a miscompilation of 252.eon on x86-64 (rdar://5891920). Bill, please pull this into Tak. llvm-svn: 50308	2008-04-26 07:40:11 +00:00
Nate Begeman	1723f6af2b	Feedback from chris llvm-svn: 50305	2008-04-25 21:47:35 +00:00
Nate Begeman	acd7e1c464	Add a testcase for the recent "handle variable vector insert elt in mem" patch llvm-svn: 50303	2008-04-25 21:26:59 +00:00
Evan Cheng	db1497fa77	Update tests. llvm-svn: 50293	2008-04-25 20:13:47 +00:00
Evan Cheng	11f101a800	Special handling for MMX values being passed in either GPR64 or lower 64-bits of XMM registers. llvm-svn: 50289	2008-04-25 19:11:04 +00:00
Dan Gohman	c4b6768db4	Remove the code from CodeGenPrepare that moved getresult instructions to the block that defines their operands. This doesn't work in the case that the operand is an invoke, because invoke is a terminator and must be the last instruction in a block. Replace it with support in SelectionDAGISel for copying struct values into sequences of virtual registers. llvm-svn: 50279	2008-04-25 18:27:55 +00:00
Chris Lattner	fb539f4468	new testcase llvm-svn: 50274	2008-04-25 18:11:06 +00:00
Anton Korobeynikov	dad01d86e0	Update test llvm-svn: 50272	2008-04-25 17:54:21 +00:00
Nick Lewycky	1f831c0f57	Remove 'unwinds to' support from mainline. This patch undoes r47802 r47989 r48047 r48084 r48085 r48086 r48088 r48096 r48099 r48109 and r48123. llvm-svn: 50265	2008-04-25 16:53:59 +00:00
Evan Cheng	39ae78cadb	MMX argument passing fixes: On Darwin / Linux x86-32, v8i8, v4i16, v2i32 values are passed in MM[0-2]. On Darwin / Linux x86-32, v1i64 values are passed in memory. On Darwin x86-64, v8i8, v4i16, v2i32 values are passed in XMM[0-7]. On Darwin x86-64, v1i64 values are passed in 64-bit GPRs. llvm-svn: 50257	2008-04-25 07:56:45 +00:00
Chris Lattner	8c9f6c929a	Loosen up an assertion to allow intrinsics. I really have no idea what this code (findNonImmUse) does, so I'm only guessing that this is the right thing. It would be really really nice if this had comments and perhaps switched to SmallPtrSet (hint hint) :) This fixes rdar://5886601, a crash on gcc.target/i386/sse4_1-pblendw.c llvm-svn: 50252	2008-04-25 05:13:01 +00:00
Chris Lattner	1a6268f776	Don't infininitely thread branches when a threaded edge goes back to the block, e.g.: Threading edge through bool from 'bb37.us.thread3829' to 'bb37.us' with cost: 1, across block: bb37.us: ; preds = %bb37.us.thread3829, %bb37.us, %bb33 %D1361.1.us = phi i32 [ %tmp36, %bb33 ], [ %D1361.1.us, %bb37.us ], [ 0, %bb37.us.thread3829 ] ; <i32> [#uses=2] %tmp39.us = icmp eq i32 %D1361.1.us, 0 ; <i1> [#uses=1] br i1 %tmp39.us, label %bb37.us, label %bb42.us llvm-svn: 50251	2008-04-25 04:12:29 +00:00
Evan Cheng	484060ba4a	Fix bug in x86 memcpy / memset lowering. If there are trailing bytes not handled by rep instructions, a new memcpy / memset is introduced for them. However, since source / destination addresses are already adjusted, their offsets should be zero. llvm-svn: 50239	2008-04-25 00:26:43 +00:00
Evan Cheng	906911f9e5	New test. llvm-svn: 50229	2008-04-24 20:01:58 +00:00
Devang Patel	fed5cd5fe7	Add EXTRA_OPTIONS on the llvmgxx command line. llvm-svn: 50217	2008-04-24 17:59:03 +00:00
Devang Patel	7ff3d5b65b	Add EXTRA_OPTIONS on the llvmgcc command line. llvm-svn: 50216	2008-04-24 17:54:25 +00:00
Chris Lattner	be35a0c224	Split some code out of the main SimplifyCFG loop into its own function. Fix said code to handle merging return instructions together correctly when handling multiple return values. llvm-svn: 50199	2008-04-24 00:01:19 +00:00
Anton Korobeynikov	d6ec8965f7	Fix tests due to llvm2cpp move to llc target llvm-svn: 50191	2008-04-23 22:41:53 +00:00
Dan Gohman	afa475f207	Add support to codegen for getresult instructions with undef operands. llvm-svn: 50180	2008-04-23 20:21:29 +00:00
Anton Korobeynikov	244a615291	Disable stack realignment for these tests llvm-svn: 50172	2008-04-23 18:25:44 +00:00
Anton Korobeynikov	1898ce20e2	Fix test becase ABI stack alignment dropped to 'normal' value llvm-svn: 50171	2008-04-23 18:25:16 +00:00
Anton Korobeynikov	15c5a2ce26	Fix test, instruction count is valid only if stack is not realigned llvm-svn: 50170	2008-04-23 18:24:48 +00:00
Chris Lattner	721ea7ca10	Rewrite multiple return value handling in SCCP. Before, the -sccp pass would turn every getresult instruction into undef. This helps with rdar://5778210 llvm-svn: 50140	2008-04-23 05:38:20 +00:00
Chris Lattner	0dd624d232	remove this testcase. It isn't testing loop rotate, it is testing all of -std-compile-opts and is now failing because other passes are generating IR that looks different to input of loop rotate. Devang, please introduce a testcase that only runs loop rotate. llvm-svn: 50136	2008-04-23 05:36:04 +00:00
Chris Lattner	5a4e46d886	returning an empty multiple return list is not valid. llvm-svn: 50135	2008-04-23 05:29:14 +00:00
Chris Lattner	be858fc296	make this test more interesting. llvm-svn: 50128	2008-04-23 03:49:32 +00:00
Chris Lattner	d059ac2e32	distill down the essense of this test. llvm-svn: 50125	2008-04-23 03:03:42 +00:00
Dale Johannesen	547a55caf1	new test llvm-svn: 50123	2008-04-23 01:22:22 +00:00
Evan Cheng	680839e258	Don't do: "(X & 4) >> 1 == 2 --> (X & 4) == 4" if there are more than one uses of the shift result. llvm-svn: 50118	2008-04-23 00:38:06 +00:00
Chris Lattner	e304ae5621	Start doing the significantly useful part of jump threading: handle cases where a comparison has a phi input and that phi is a constant. For example, stuff like: Threading edge through bool from 'bb2149' to 'bb2231' with cost: 1, across block: bb2237: ; preds = %bb2231, %bb2149 %tmp2328.rle = phi i32 [ %tmp2232, %bb2231 ], [ %tmp2232439, %bb2149 ] ; <i32> [#uses=2] %done.0 = phi i32 [ %done.2, %bb2231 ], [ 0, %bb2149 ] ; <i32> [#uses=1] %tmp2239 = icmp eq i32 %done.0, 0 ; <i1> [#uses=1] br i1 %tmp2239, label %bb2231, label %bb2327 or bb38.i298: ; preds = %bb33.i295, %bb1693 %tmp39.i296.rle = phi %struct.ibox* [ null, %bb1693 ], [ %tmp39.i296.rle1109, %bb33.i295 ] ; <%struct.ibox> [#uses=2] %minspan.1.i291.reg2mem.1 = phi i32 [ 32000, %bb1693 ], [ %minspan.0.i288, %bb33.i295 ] ; <i32> [#uses=1] %tmp40.i297 = icmp eq %struct.ibox %tmp39.i296.rle, null ; <i1> [#uses=1] br i1 %tmp40.i297, label %implfeeds.exit311, label %bb43.i301 This triggers thousands of times in spec. llvm-svn: 50110	2008-04-22 21:40:39 +00:00
Chris Lattner	c59cf9c8da	Dig through multiple levels of AND to thread jumps if needed. llvm-svn: 50106	2008-04-22 20:46:09 +00:00
Chris Lattner	dcbc6443ae	Teach jump threading to thread through blocks like: br (and X, phi(Y, Z, false)), label L1, label L2 This triggers once on 252.eon and 6 times on 176.gcc. Blocks in question often look like this: bb262: ; preds = %bb261, %bb248 %iftmp.251.0 = phi i1 [ true, %bb261 ], [ false, %bb248 ] ; <i1> [#uses=4] %tmp270 = icmp eq %struct.rtx_def* %tmp.0.i, null ; <i1> [#uses=1] %bothcond = or i1 %iftmp.251.0, %tmp270 ; <i1> [#uses=1] br i1 %bothcond, label %bb288, label %bb273 In this case, it is clear that it doesn't matter if tmp.0.i is null when coming from bb261. When coming from bb248, it is all that matters. Another random example: check_asm_operands.exit: ; preds = %check_asm_operands.exit.thr_comm, %bb30.i, %bb12.i, %bb6.i413 %tmp.0.i420 = phi i1 [ true, %bb6.i413 ], [ true, %bb12.i ], [ true, %bb30.i ], [ false, %check_asm_operands.exit.thr_comm ; <i1> [#uses=1] call void @llvm.stackrestore( i8* %savedstack ) nounwind %tmp4389 = icmp eq i32 %added_sets_1.0, 0 ; <i1> [#uses=1] %tmp4394 = icmp eq i32 %added_sets_2.0, 0 ; <i1> [#uses=1] %bothcond80 = and i1 %tmp4389, %tmp4394 ; <i1> [#uses=1] %bothcond81 = and i1 %bothcond80, %tmp.0.i420 ; <i1> [#uses=1] br i1 %bothcond81, label %bb4398, label %bb4397 Here is the case from 252.eon: bb290.i.i: ; preds = %bb23.i57.i.i, %bb8.i39.i.i, %bb100.i.i, %bb100.i.i, %bb85.i.i110 %myEOF.1.i.i = phi i1 [ true, %bb100.i.i ], [ true, %bb100.i.i ], [ true, %bb85.i.i110 ], [ true, %bb8.i39.i.i ], [ false, %bb23.i57.i.i ] ; <i1> [#uses=2] %i.4.i.i = phi i32 [ %i.1.i.i, %bb85.i.i110 ], [ %i.0.i.i, %bb100.i.i ], [ %i.0.i.i, %bb100.i.i ], [ %i.3.i.i, %bb8.i39.i.i ], [ %i.3.i.i, %bb23.i57.i.i ] ; <i32> [#uses=3] %tmp292.i.i = load i8* %tmp16.i.i100, align 1 ; <i8> [#uses=1] %tmp293.not.i.i = icmp ne i8 %tmp292.i.i, 0 ; <i1> [#uses=1] %bothcond.i.i = and i1 %tmp293.not.i.i, %myEOF.1.i.i ; <i1> [#uses=1] br i1 %bothcond.i.i, label %bb202.i.i, label %bb301.i.i Factoring out 3 common predecessors. On the path from any blocks other than bb23.i57.i.i, the load and compare are dead. llvm-svn: 50096	2008-04-22 07:05:46 +00:00
Chris Lattner	4638234905	add a basic testcase. llvm-svn: 50093	2008-04-22 06:35:14 +00:00
Nick Lewycky	1b583954ad	Start removing 'unwinds to' support from mainline in preparation for 2.3. llvm-svn: 50086	2008-04-22 05:16:02 +00:00
Chris Lattner	14be19cf1e	optimize "p != gep p, ..." better. This allows us to compile getelementptr-seteq.ll into: define i1 @test(i64 %X, %S* %P) { %C = icmp eq i64 %X, -1 ; <i1> [#uses=1] ret i1 %C } instead of: define i1 @test(i64 %X, %S* %P) { %A.idx.mask = and i64 %X, 4611686018427387903 ; <i64> [#uses=1] %C = icmp eq i64 %A.idx.mask, 4611686018427387903 ; <i1> [#uses=1] ret i1 %C } And fixes the second half of PR2235. This speeds up the insertion sort case by 45%, from 1.12s to 0.77s. In practice, this will significantly speed up for loops structured like: for (double *P = Base + N; P != Base; --P) ... Which happens frequently for C++ iterators. llvm-svn: 50079	2008-04-22 02:53:33 +00:00
Dan Gohman	93b5be1824	Implement an x86-64 ABI detail of passing structs by hidden first argument. The x86-64 ABI requires the incoming value of %rdi to be copied to %rax on exit from a function that is returning a large C struct. Also, add a README-X86-64 entry detailing the missed optimization opportunity and proposing an alternative approach. llvm-svn: 50075	2008-04-21 23:59:07 +00:00
Duncan Sands	717a1e09aa	Make these structs larger to ensure that they are returned by struct return. llvm-svn: 50038	2008-04-21 08:17:05 +00:00
Duncan Sands	cfb3631483	Make the struct bigger, to ensure it is returned by struct return. llvm-svn: 50037	2008-04-21 08:12:03 +00:00
Owen Anderson	bc6046416f	Refactor memcpyopt based on Chris' suggestions. Consolidate several functions and simplify code that was fallout from the separation of memcpyopt and gvn. llvm-svn: 50034	2008-04-21 07:45:10 +00:00
Chris Lattner	2c5b96fbee	A better fix for my previous patch, MOVZQI2PQIrr just requires SSE2. llvm-svn: 49986	2008-04-20 05:52:46 +00:00
Chris Lattner	8503e2d236	Not all x86-64 machines have sse3 apparently. llvm-svn: 49985	2008-04-20 05:47:56 +00:00
Chris Lattner	a9d8d647ca	rename .llx -> .ll, last batch. llvm-svn: 49971	2008-04-19 22:32:52 +00:00
Chris Lattner	c310b1f1f3	rename .llx -> .ll llvm-svn: 49970	2008-04-19 22:29:10 +00:00
Chris Lattner	63bd1df323	rename .llx -> .ll llvm-svn: 49969	2008-04-19 22:26:29 +00:00
Chris Lattner	8cde1e71f0	Implement PR2206. llvm-svn: 49967	2008-04-19 22:17:26 +00:00
Chris Lattner	1303e72c66	refactor handling of symbolic constant folding, picking up a few new cases( see Integer/a1.ll), but not anything that would happen in practice. llvm-svn: 49965	2008-04-19 21:58:19 +00:00
Evan Cheng	f583b3feb6	64-bit atomic operations. llvm-svn: 49949	2008-04-19 02:30:38 +00:00
Dan Gohman	ac2fac937c	Teach llvm-as to accept function types with multiple return types. llvm-svn: 49945	2008-04-19 00:24:39 +00:00
Evan Cheng	073659986f	Be more careful with insert_subreg and extract_subreg where either source or destination operand has already been coalesced with another register that's defined by a insert_subreg or extract_subreg. llvm-svn: 49843	2008-04-17 07:58:04 +00:00
Owen Anderson	cd1b9c4b43	Make GVN able to remove unnecessary calls to read-only functions again. llvm-svn: 49842	2008-04-17 05:36:50 +00:00
Evan Cheng	7c2c3333ca	Fix a sub-register indice propagation bug. llvm-svn: 49832	2008-04-17 00:06:42 +00:00
Evan Cheng	e2e899b5c2	Don't forget about sub-register indices when rematting instructions. llvm-svn: 49830	2008-04-16 23:44:44 +00:00
Evan Cheng	44a0a0c8ee	After reading memory that's already freed. llvm-svn: 49810	2008-04-16 20:24:25 +00:00
Evan Cheng	4b16ea6247	Really test what's intended. llvm-svn: 49802	2008-04-16 18:21:55 +00:00
Evan Cheng	6d05ce493b	Rewrite LiveVariable liveness computation. The new implementation is much simplified. It eliminated the nasty recursive routines and removed the partial def / use bookkeeping. There is also potential for performance improvement by replacing the conservative handling of partial physical register definitions. The code is currently disabled until live interval analysis is taught of the name scheme. This patch also fixed a couple of nasty corner cases. llvm-svn: 49784	2008-04-16 09:46:40 +00:00
Owen Anderson	64fc7a4268	XFAIL this test for the moment. The real solution is to prevent ADCE from transforming loops and adding a separate loop pass for removing loops with know trip counts. Until that happens, ADCE is miscompiling this code. llvm-svn: 49769	2008-04-16 04:25:42 +00:00
Dan Gohman	be8f2b452b	Add support for the form of the SSE41 extractps instruction that puts its result in a 32-bit GPR. llvm-svn: 49762	2008-04-16 02:32:24 +00:00
Dan Gohman	cf79877623	Recreate the size SDNode instead of reusing the old one in the x86 memcpy lowering code; this ensures that the size node has the desired result type. This fixes a regression from r49572 with @llvm.memcpy.i64 on x86-32. llvm-svn: 49761	2008-04-16 01:32:32 +00:00
Dan Gohman	7d27552962	Add movd instructions to move from MMX registers to 64-bit GPR registers on x86-64. llvm-svn: 49757	2008-04-15 23:55:07 +00:00
Dale Johannesen	45e14f7753	Don't assume a tail call can't reference a byval argument to the outer function, this isn't correct. llvm-svn: 49731	2008-04-15 17:41:34 +00:00
Dan Gohman	3b99b3c807	Treat EntryToken nodes as "passive" so that they aren't added to the ScheduleDAG; they don't correspond to any actual instructions so they don't need to be scheduled. This fixes a bug where the EntryToken was being scheduled multiple times in some cases, though it ended up not causing any trouble because EntryToken doesn't expand into anything. With this fixed the schedulers reliably schedule the expected number of units, so we can check this with an assertion. This requires a tweak to test/CodeGen/X86/loop-hoist.ll because it ends up getting scheduled differently in a trivial way, though it was enough to fool the prcontext+grep that the test does. llvm-svn: 49701	2008-04-15 01:22:18 +00:00
Dan Gohman	cce2b42edc	Upgrade these tests for the current intrinsic prototypes. llvm-svn: 49669	2008-04-14 18:19:18 +00:00
Dale Johannesen	d9a9c746d8	Remove -unwind-tables-optional everywhere, since this is now the default. llvm-svn: 49667	2008-04-14 17:56:54 +00:00
Owen Anderson	a6d1d8dec2	The functionality being tested was removed because it was horribly unsafe. llvm-svn: 49610	2008-04-13 09:51:06 +00:00
Arnold Schwaighofer	82af0e6a43	This patch corrects the handling of byval arguments for tailcall optimized x86-64 (and x86) calls so that they work (... at least for my test cases). Should fix the following problems: Problem 1: When i introduced the optimized handling of arguments for tail called functions (using a sequence of copyto/copyfrom virtual registers instead of always lowering to top of the stack) i did not handle byval arguments correctly e.g they did not work at all :). Problem 2: On x86-64 after the arguments of the tail called function are moved to their registers (which include ESI/RSI etc), tail call optimization performs byval lowering which causes xSI,xDI, xCX registers to be overwritten. This is handled in this patch by moving the arguments to virtual registers first and after the byval lowering the arguments are moved from those virtual registers back to RSI/RDI/RCX. llvm-svn: 49584	2008-04-12 18:11:06 +00:00
Dan Gohman	15edbf989f	Drop ISD::MEMSET, ISD::MEMMOVE, and ISD::MEMCPY, which are not Legal on any current target and aren't optimized in DAGCombiner. Instead of using intermediate nodes, expand the operations, choosing between simple loads/stores, target-specific code, and library calls, immediately. Previously, the code to emit optimized code for these operations was only used at initial SelectionDAG construction time; now it is used at all times. This fixes some cases where rep;movs was being used for small copies where simple loads/stores would be better. This also cleans up code that checks for alignments less than 4; let the targets make that decision instead of doing it in target-independent code. This allows x86 to use rep;movs in low-alignment cases. Also, this fixes a bug that resulted in the use of rep;stos for memsets of 0 with non-constant memory size when the alignment was at least 4. It's better to use the library in this case, which can be significantly faster when the size is large. This also preserves more SourceValue information when memory intrinsics are lowered into simple loads/stores. llvm-svn: 49572	2008-04-12 04:36:06 +00:00
Dan Gohman	41f9d24d52	Fix a bug that prevented x86-64 from using rep.movsq for 8-byte-aligned data. llvm-svn: 49571	2008-04-12 02:35:39 +00:00
Evan Cheng	6e52146f16	If a PHI node has a single implicit_def source, replace it with an implicit_def instead of a copy. llvm-svn: 49543	2008-04-11 17:54:45 +00:00
Owen Anderson	15e930588a	Add testcase for PR2213. llvm-svn: 49517	2008-04-11 05:13:32 +00:00
Evan Cheng	56ca7e285a	New test. llvm-svn: 49514	2008-04-10 23:49:09 +00:00
Dan Gohman	318d9a6605	Teach InstCombine's ComputeMaskedBits to handle pointer expressions in addition to integer expressions. Rewrite GetOrEnforceKnownAlignment as a ComputeMaskedBits problem, moving all of its special alignment knowledge to ComputeMaskedBits as low-zero-bits knowledge. Also, teach ComputeMaskedBits a few basic things about Mul and PHI instructions. This improves ComputeMaskedBits-based simplifications in a few cases, but more noticeably it significantly improves instcombine's alignment detection for loads, stores, and memory intrinsics. llvm-svn: 49492	2008-04-10 18:43:06 +00:00
Evan Cheng	6f164e3814	A copy instruction may use a register multiple times on some targets. Change them all. llvm-svn: 49491	2008-04-10 18:38:47 +00:00
Chris Lattner	3b289289a7	Fix the x86-64 side of PR2108 by adding a v2f64 version of MOVZQI2PQIrr. This would be better handled as a dag combine (with the goal of eliminating the bitconvert) but I don't know how to do that safely. Thoughts welcome. llvm-svn: 49463	2008-04-10 05:13:43 +00:00
Evan Cheng	1803e20a62	Teach branch folding pass about implicit_def instructions. Unfortunately we can't just eliminate them since register scavenger expects every register use to be defined. However, we can delete them when there are no intra-block uses. Carefully removing some implicit def's which enable more blocks to be optimized away. llvm-svn: 49461	2008-04-10 02:32:10 +00:00
Evan Cheng	def576f9e6	- More aggressively coalescing away copies whose source is defined by an implicit_def. - Added insert_subreg coalescing support. llvm-svn: 49448	2008-04-09 20:57:25 +00:00
Chris Lattner	be01a5f699	Generalize getUnaryFloatFunction to handle any FP unary function, automatically figuring out the suffix to use. implement pow(2,x) -> exp2(x). llvm-svn: 49437	2008-04-09 17:48:11 +00:00
Chris Lattner	5d0cbe7d22	remove capital letter from test name. llvm-svn: 49436	2008-04-09 17:46:36 +00:00
Owen Anderson	ca7e0e21f3	Factor a bunch of functionality related to memcpy and memset transforms out of GVN and into its own pass. llvm-svn: 49419	2008-04-09 08:23:16 +00:00
Evan Cheng	f35cc57821	Missed a hasInterval check. llvm-svn: 49415	2008-04-09 01:30:15 +00:00

... 3 4 5 6 7 ...

5517 Commits